key: cord-008293-5cwb5g3h authors: shaw, shyh-yu; laursen, richard a.; lees, marjorie b. title: analogous amino acid sequences in myelin proteolipid and viral proteins date: 1986-10-27 journal: febs lett doi: 10.1016/0014-5793(86)81502-2 sha: doc_id: 8293 cord_uid: 5cwb5g3h computer analysis of the intrinsic membrane protein, myelin proteolipid, shows strong sequence similarities between the putative extramembrane segments of the proteolipid protein and a number of viral proteins, several of which infect humans. these similarities are even more striking than those reported previously between viral proteins and the encephalitogenic myelin basic protein (mbp). these findings, along with other reports of molecular mimicry by viruses, suggest that immunological cross‐reactions between virus‐induced antibodies or t‐cells and analogous antigenic determinants (epitopes) in myelin proteolipid could be involved in the pathophysiology of multiple sclerosis or post‐infectious demyelinating syndromes. current evidence suggests that the etiology of multiple sclerosis and other demyelinating diseases involves a combination of viral and autoimmune factors [1] . a particularly well studied model for multiple sclerosis is eae which can be induced by immunization of laboratory animals with cns tissue or with mbp, one of the major myelin proteins [2] . recently it has been reported that mbp shows homology with certain viral proteins [3] . furthermore, immunization with peptides having regions common to mbp and hepatitis b virus dna polymerase showed histological eae in rabbits [4] . however, it has long been recognized that components other than mbp contribute to the encephalitogenic response to cns tissue [5] . 
abbreviations: mbp, myelin basic protein; eae, experimental allergic encephalomyelitis; cns, central nervous system; mlv, murine leukemia virus; aids, acquired immune deficiency syndrome; htlv-iii/lav, human t-lymphotropic retrovirus.

the most abundant protein of cns myelin is myelin proteolipid protein, which is embedded in the myelin membrane. its amino acid sequence has been reported recently [6, 7], and two different models have been proposed for its conformation in the myelin membrane [8, 9]. immunization of rabbits with the proteolipid apoprotein leads to the development of a chronic, progressive or relapsing form of eae [10]. however, the epitopes in myelin proteolipid which cause the encephalitogenic response are not known. in the present study we compared decapeptide sequences from proposed [8, 9] extramembrane hydrophilic regions of myelin proteolipid with various viral proteins by computer analysis, with the aim of searching for similarities, such as those found between mbp and viral proteins, that might shed light on the origin of demyelinating diseases.

decapeptides were chosen for comparison with the extramembrane segments of the proteolipid protein, the rationale being that decapeptides are more than sufficient to induce an immune response [11] and that epitopes of membrane proteins are generally located at extramembrane hydrophilic regions. comparisons and statistical scorings were made using the database (release 6.8 of 28 august 1985, containing 3309 sequences) and programs of the protein identification resource of the national biomedical research foundation. the search program compared every decapeptide in bovine brain myelin proteolipid with every decapeptide in the database. similarity scores were calculated using the mutation data matrix [12] program.
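the all-against-all decapeptide comparison described above can be sketched as follows. this is a minimal illustration, not the protein identification resource search program: it scores windows by counting identical residues instead of using the mutation data matrix, and the function names, the `min_score` cutoff and the toy sequences are assumptions made for the example.

```python
def windows(sequence, size=10):
    """Yield (1-based start, window) for every window of `size` residues."""
    for i in range(len(sequence) - size + 1):
        yield i + 1, sequence[i:i + size]


def identity_score(a, b):
    """Count positions at which two equal-length peptides are identical.

    Stand-in for a substitution-matrix score such as the mutation data matrix.
    """
    return sum(x == y for x, y in zip(a, b))


def search(query, database, size=10, min_score=6):
    """Compare every query window with every window of every database entry.

    Returns hits as (score, query_start, entry_name, entry_start),
    highest score first.
    """
    hits = []
    for q_start, q_win in windows(query, size):
        for name, target in database.items():
            for t_start, t_win in windows(target, size):
                score = identity_score(q_win, t_win)
                if score >= min_score:
                    hits.append((score, q_start, name, t_start))
    return sorted(hits, key=lambda h: (-h[0], h[1], h[2], h[3]))


# Hypothetical toy sequences, not real proteolipid or viral sequences.
TOY_QUERY = "ACDEFGHIKLMNPQ"
TOY_DATABASE = {"toy_viral_protein": "WWACDEFGHIKLWW"}
TOY_HITS = search(TOY_QUERY, TOY_DATABASE)
```

a 14-residue query yields five decapeptide windows, so the toy search makes 5 x 5 window comparisons; a real database search scales the same way with the number of windows in the database.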
for each proteolipid decapeptide, 709253 comparisons were made and the highest scoring segments and their standard deviations were printed out. we selected those viral decapeptides, similar to extramembrane hydrophilic segments of myelin proteolipid, with scores higher than 30 that also exceeded the average by 5 standard deviations; i.e., the probability that the match is due to random chance is < 3 × 10⁻⁷. a vax 11/750 computer was used for searching and calculations.

sequence alignments with scores of 30 or greater were observed between proteolipid and 249 viral decapeptides, excluding identities seen with proteins from similar strains of virus. these include decapeptides from a variety of types of viruses, such as adenoviruses and epstein-barr, influenza and measles viruses. only those with a score of 38 or greater, or which show at least 60% identity, are shown in table 1. the degree of similarity between proteolipid and viral proteins is significantly greater than that observed [3] between mbp and viral proteins. furthermore, although only 10% of the proteins in the protein sequence database are viral proteins, they account for over 30% of the similarities. it is interesting and perhaps significant that many of the strong similarities lie between residues 135 and 150 in the proteolipid. this region is predicted (shaw, s.-y., unpublished) to have a strong tendency to form an amphipathic helix (hydrophobic on one side and polar on the other), and thus might be regarded as a prime candidate for an antigenic site. likewise, many similarities are found with the n-terminus of the proteolipid, a region which has also been predicted to be an amphipathic helix [8].

table 1. comparison of myelin proteolipid decapeptide sequences with those of certain viral proteins. all of the peptides shown have alignment scores of 38 or greater and/or have at least 6 identities. amino acids which are identical with the corresponding residues in the proteolipid are underlined.

the two proposed models [8, 9] for the organization of the proteolipid in the myelin membrane are similar in that they depict the protein as threading through the lipid bilayer several times with the polar segments external to the bilayer; they differ, however, in the sidedness of some of the extramembrane segments. while both models put residues 1-10 and 220-235 on the extracellular face of the membrane, the model of laursen et al. [8] puts residues 35-60 on the extracellular face and 90-150 on the cytoplasmic face, whereas the model of stoffel et al. [9] predicts the opposite orientation for the latter two segments. regardless of which (if either) model is correct, similarities between viral proteins and proposed extracellular and cytoplasmic segments of the proteolipid are seen.

it has been proposed [3, 4] that eae-like diseases can arise when virus-evoked antibodies and/or sensitized t-cells cross-react with homologous amino acid sequences (epitopes) in mbp. however, since mbp is located entirely on the cytoplasmic face of the membrane and is therefore within the oligodendroglial cell, it would be expected to be relatively inaccessible to immune surveillance. on the other hand, the proteolipid has potential antigenic sites not only on the cytoplasmic face but also on the outer surface of the membrane. recognition of these sites by antiviral antibodies or sensitized cells could induce a cascade of immunological events leading to cell destruction. in the process, mbp would be released, which might result in further inflammatory consequences.

the similarities between mbp and virus proteins have been discussed [3, 4] previously in terms of 'homology', which implies a common ancestor for the proteins. when comparing myelin proteins with viral proteins, it is difficult to make a compelling case for homology, since the similarities fall off after a few residues.
while it is not uncommon to encounter five or six identical residues in a stretch of ten (see table 1), extended regions of similar sequence are encountered only rarely. for example, although a 55% identity is seen between the first 20 residues of the proteolipid and the late 100 kda protein of adenovirus 2 or 5, i.e.,

[sequence alignment: proteolipid (1-20) with adenovirus (561-580)]

the similarity does not extend further. it seems more likely that the virus and myelin proteins are 'analogous', i.e., that similar portions arose by convergent evolution to a common sequence or secondary structure. this idea is consistent with the well-known propensity of viruses to undergo rapid mutation, particularly of their surface antigenic sites [13].

the tryptophan-containing region of mbp has long been recognized as being highly encephalitogenic [2, 5]. in particular, the nonapeptide, phe-ser-trp-gly-ala-glu-gly-gln-lys, has been shown to be highly effective in inducing eae in guinea pigs, and studies with peptide analogs indicate that the essential features of an encephalitogenic determinant are x-x-trp-x-x-x-x-gln/asn-lys/arg. the proteolipid contains two regions that have a trp and a lys residue separated by 5 residues, but otherwise they are not remarkably similar to the encephalitogenic mbp peptide. it is interesting, however, that one of the tryptophans (trp-144) occurs in a region that shows several similarities with viral proteins (table 1). in the course of our analysis, we also found a strong similarity, not previously reported [3], between the encephalitogenic mbp protein and the gag-pol polyprotein of murine leukemia virus. it would not be surprising if antibodies against the mlv peptide were found to cross-react with mbp.

there are now at least two reports of mimicry of normal human proteins by viruses: mbp by hepatitis b virus polymerase [4] and α-thymosin by the aids virus htlv-iii/lav [14]. in addition, pruijn et al. [15] have reported that sera from patients suffering from autoimmune diseases can inhibit adenovirus dna replication, suggesting a similar phenomenon. the similarities between viral proteins and myelin proteolipid may represent yet another case. in this instance, it seems likely that portions of the protein which are potential antigenic determinants are located on the outer surface of the cell membrane, making it more understandable how the cell membrane could come under immunological attack.

viral infections of the nervous system
experimental allergic encephalomyelitis: a useful model for multiple sclerosis
hoppe-seyler's
proc. natl. acad. sci. usa
proc. natl. acad. sci. usa

this work was supported by grants from the national institutes of health (ns-13649 and ns-16945) and national science foundation (dmb 85-03940).

key: cord-024989-0o6agnrc authors: li, qihao; peng, wen; ou, yu title: prediction and analysis of key protein structures of 2019-ncov date: 2020-05-12 journal: nan doi: 10.2217/fvl-2020-0020 sha: doc_id: 24989 cord_uid: 0o6agnrc

aim: the purpose of this study was to predict and analyze the structure and function of 2019-novel coronavirus (ncov) key proteins. materials & methods: we obtained the structure and sequence of proteins from related databases and studied them through multiple sequence alignment, homology modeling, sequence analysis, virtual screening, reverse mutation, protein structure overlap and surface property analysis. results & conclusion: we found no significant changes in envelope protein, membrane protein, nucleocapsid protein and key proteases in open reading frame 1ab. we predicted the protein structures and performed molecular dynamics simulations. based on the surface properties of spike protein and docking results with angiotensin-converting enzyme 2, we believe that the binding ability of spike protein to angiotensin-converting enzyme 2 may be similar to that of sars.
these studies will help us in fighting 2019-ncov.

we searched the complete genomes of wuhan seafood market pneumonia virus (2019-ncov) and other bat cov in ncbi (ncbi ids are as follows: 2019-ncov: nc_045512.2; bat sars cov ratg13: mn996532.1; bat sars cov rs672/2006: fj588686.1; bat sars cov zc45: mg772933.1; bat sars cov bj01: ay278488.2; bat sars cov exon1: fj882956.1; and bat sars cov gz02: ay390556.1), and downloaded the amino acid sequence of each key protein. multiple sequence alignments were performed using clustal omega at the european bioinformatics institute. clustal omega is a new multiple sequence alignment program that uses seeded guide trees and hidden markov model profile-profile techniques to generate alignments between three or more sequences.

predicting key protein structures using homology modeling

after obtaining the amino acid sequence of each key protein in ncbi, the spatial structure of each protein was predicted using the swiss-model homology modeling method. we entered the amino acid sequence and selected as a template the experimentally determined protein structure whose sequence had the highest homology to the input sequence. the designated template protein data bank (pdb) ids for each key protein are 3cl hydrolase (id: 2z9j), e protein (id: 5x29), plpro (id: 5tl6) and s protein (id: 6acd and 6acc). when the sequence homology of a template reached more than 75%, the predicted structure obtained was considered highly reliable.

using discovery studio 2016 software, we conducted a molecular dynamics simulation of the predicted key protein structures of the virus. the protein molecules were placed in a physiological saline solvent environment. during minimization and molecular dynamics simulation, the particle-mesh ewald method was used to treat long-range electrostatic interactions. all chemical bonds involving hydrogen atoms were constrained using the shake algorithm.
the standard dynamics cascade process includes five stages: minimization, minimization 2, heating, equilibration and production. minimization and minimization 2 use 2000 steps of the steepest descent method and 2000 steps of the conjugate gradient method, respectively. the heating step slowly raises the temperature to 300 k. next, the equilibration step is simulated under the constant pressure and temperature (npt) ensemble. simulation time is set to 200 picoseconds (ps), and 100 conformations are obtained in the production stage. trajectory analysis was carried out on the files generated by the molecular dynamics simulation. in order to evaluate the structure of the simulation system, we calculated the root mean square deviation.

a full-sequence tripeptide library containing 8000 peptides was constructed, and the plpro-predicted structure obtained by homology modeling was used to virtually screen the peptide library. specifically, protonate 3d was used to protonate the structure of the protein, energy minimize was then used to minimize the energy of the added hydrogens, the catalytic site in the active pocket was specified as the virtual screening target, and finally the best-scoring tripeptide was obtained.

overlapping alignment of protein spatial structure

by overlapping the predicted 2019-ncov s protein structure with the template bat sars cov s protein (pdb id: 6acc) structure, we found that there was a spatial structural difference in the s protein between 2019-ncov and previous bat sars covs. we used swiss pdb viewer (spdbv) to open the predicted s protein structures of 2019-ncov (yellow) and bat sars cov (blue) at the same time, and the s protein of bat sars cov was used as the template. according to the sequence alignment results, the identical amino acid sequence from ala 930 to gln 1040 was designated as the overlap position. fit molecules was used for intelligent overlap, and finally the overlap result was analyzed.
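two of the quantities in the methods above are easy to make concrete: the size of a full tripeptide library (20 standard amino acids at each of 3 positions gives 20³ = 8000 peptides) and the root mean square deviation between two conformations. the sketch below is illustrative only. it assumes the two coordinate sets are already superimposed and listed in the same atom order, and it does not reproduce the discovery studio or virtual screening workflow. all names are hypothetical.

```python
import math
from itertools import product

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def tripeptide_library():
    """Enumerate every tripeptide over the 20 standard amino acids."""
    return ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered coordinate sets.

    Each argument is a sequence of (x, y, z) tuples. No superposition is
    performed here; the structures are assumed to be aligned already.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


LIBRARY = tripeptide_library()
```

`LIBRARY` contains "VVN", the one-letter form of the val-val-asn tripeptide reported in the results.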
back-mutation study of the receptor binding domain of s protein

according to the sequence alignment of s protein and the docking results of s protein with ace2, there are two obvious changes in the receptor binding domain (rbd) of s protein: first, three of the six amino acids that interact with ace2 are mutated; second, a large number of prolines in the proline-rich region of the rbd are replaced. therefore, we reversed the mutated amino acids or replaced them with other amino acids, then docked with ace2, and used pdbepisa of the european bioinformatics institute to analyze the docking results.

we compared the homology of five key 2019-ncov proteins with those of other bat sars covs in this section [13]. the five key proteins include open reading frame 1ab (orf1ab) in the nonstructural region and four structural proteins: s protein, e protein, m protein and n protein. the results show that the four main proteins of 2019-ncov have the highest homology with those of bat sars cov ratg13 and bat sars cov zc45. e protein of 2019-ncov has 100% homology with that of both bat sars covs (figure 1a), and the identity of orf1ab, m and n proteins between 2019-ncov and bat sars covs is 94.27% or more (figure 1b-d). the comparison of the s protein showed that it also has the highest homology with bat sars cov ratg13 and bat sars cov zc45, at 97.71 and 81.85%, respectively (figure 1e). overall, 2019-ncov appears to be derived from bat sars cov, with the highest degree of homology to bat sars cov ratg13, which is consistent with the results of existing studies [14], and the degree of agreement with other sars covs is also greater than 75%, suggesting that we can use the homology modeling method to predict 2019-ncov protein structure, which can be used for virtual screening, molecular docking and drug design [15, 16] for accelerating the development of anti-2019-ncov drugs.
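the percentage figures above come from pairwise comparison of aligned sequences. a minimal sketch of that calculation is given below, assuming two rows of an existing alignment with "-" as the gap character. the denominator used here (all columns except those where both rows are gaps) is one convention among several, so values from clustal omega or other tools may differ slightly. the function name and example rows are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two rows of a sequence alignment.

    Columns where both rows carry a gap are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical aligned rows: 4 identical columns out of 6.
EXAMPLE_IDENTITY = percent_identity("ACDE-G", "ACDQFG")

# The 75% cut-off is the template threshold quoted in the text.
USABLE_AS_TEMPLATE = EXAMPLE_IDENTITY > 75.0
```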
key protein structures of 2019-ncov are predicted with homology modeling

we selected existing crystal structures from the pdb that are more than 75% consistent with 2019-ncov's plpro, 3cl hydrolase, and structural proteins s and e, and used swiss-model's homology modeling method to predict the structure of each protein of 2019-ncov [17], as shown in figure 2a-d. among them, 3cl protease is the main protein processing enzyme of cov, which is essential for virus replication and proliferation (figure 2a) [18]. e protein is an integral membrane protein, consisting of a highly hydrophobic n-terminus (the transmembrane region of e protein) and a c-terminus that extends into the body of the virus (figure 2b) [19]. the s protein is the main protein that interacts with host cells on the viral coat (figure 2c). the protein produced by plpro digestion is necessary for the virus, because it can activate the synthesis of viral mrna (figure 2d) [20]. we separately predicted the structure of these key proteins. ramachandran plot and profile-3d were used to evaluate the quality of the predicted structures (supplementary figure 1), and by using two different s protein templates, we predicted two different conformational states of the s protein during its interaction with ace2. this is consistent with the research by wrapp et al. [21]. at the same time, we conducted a molecular dynamics simulation, and the root mean square deviation trajectory of each conformation was basically stable, as shown in figure 2a-d. the structures of these proteins are common targets for drug development. taking the plpro structure obtained by homology modeling as a target, and taking peptides, which have low toxicity and are favorable for rapid clinical development, as an example [22], the full-sequence library of tripeptides was screened for antiviral drugs. finally, the tripeptide with amino acid sequence val-val-asn (tp8), with strong binding ability to plpro, was obtained.
the results show that tp8 can contact and form hydrogen bonds with the catalytic sites his 272 and asp 286 of the plpro (figure 2e).

differential key protein structure analysis of 2019-ncov

although some amino acids were inserted at two positions of nsp3 in orf1ab [23], the insertion sites were in the nsp3b and nsp3c regions, which are mainly related to the binding reaction of nucleic acids. because the insertion sites are not in the nsp3d region that contains plpro (figure 3a), the inserted sequence has little effect on the structure and function of plpro. however, the two transmembrane domains contained in nsp3 are localized on nsp3b and nsp3c [24], so the insertions may affect the localization of the nsp3 protein on the endoplasmic reticulum membrane [25]. the results of s protein sequence comparison showed the largest differences among all proteins. in the rbd site [26], three of the six key amino acid residues that interact with ace2 have been changed. pro 470, tyr 484 and thr 487 are converted to glu, gln and asn, respectively (figure 3b). it is worth mentioning that the 470th amino acid was changed from a nonpolar to a polar amino acid. by analyzing the surface properties of the rbd structure of 2019-ncov and bat sars cov (pdb id: 6acc), we found that, after mutation, the rbd region of 2019-ncov was more densely polar than that of bat sars cov (figure 3c). at the same time, four insert boxes (ibs; 1-4) were inserted into the n-terminus and s2 region of s protein in 2019-ncov (figure 3b). we selected the regions of the 2019-ncov s protein with a low degree of homology and compared them with the s protein of bat sars cov [27]. it was found that the insertion of ib3 increased the lateral expansion area of the s1 portion of the 2019-ncov s protein, and a loop structure is extended at the overlap with the bat sars cov.
the insertion of ib4 also adds a loop structure to the envelope region of s2 (figure 3d), and the loop structure of proteins is often closely related to the structure and function of proteins [28, 29].

back-mutating mutant amino acids to study the functional change of rbd of s protein

in order to study the effect of the changes in interacting amino acids in the rbd, the 2019-ncov region that binds ace2, we mutated the three changed amino acid residues (glu 470, gln 484 and asn 487) within the rbd structure back to the original amino acids. the new predicted structure was then used to analyze the interaction between rbd and ace2. we found that, in addition to the original hydrogen bonds, arg 170 of ace2 and thr 486 of rbd formed a new hydrogen bond. gln 81 of ace2 forms hydrogen bonds with tyr 484 while also forming hydrogen bonds with tyr 435 of rbd (figure 4a & b). this suggests that the mutation of three amino acid residues in rbd may weaken the 2019-ncov interaction with ace2. at the same time, we found that, compared with ordinary bat sars cov, four of the five prolines (pro 458, pro 461, pro 465, pro 468 and pro 470) of 2019-ncov rbd were replaced with other amino acids (figure 3b), so we changed the four amino acid residues back to the original proline. the interaction between rbd and ace2 was also analyzed to study the impact of this change on 2019-ncov, but the results showed that the replacement of prolines has little effect on the interaction between 2019-ncov and ace2 (figure 4a & c).

through homology alignment, we elaborated the sequence differences of each key protein between 2019-ncov and other bat sars covs, and analyzed whether the new sequence changes in 2019-ncov affected the function of each key protein. it was found that the sequence and protein structure of the structural proteins e, m and n of 2019-ncov are basically consistent with those of bat sars cov.
considering that the structure determines the function, we believe that these three proteins should not be mutated. although the orf1ab region has two large changes in the sequence, these changed positions are in nsp3b and nsp3c, not in the plpro and 3cl hydrolase regions. therefore, it has little effect on the two proteins that play a key role in the virus replication process. at the same time, we give examples of its application in the screening of peptide drugs after predicting the structure of plpro. among all proteins, the s protein has the largest variation, and most of the changes are located in the s1 region that interacts with ace2. we also predicted two possible conformational changes of the s protein. they are similar to the changes of bat sars cov in the process of binding with ace2, which suggests that the interaction mechanism between 2019-ncov and ace2 may be the same as bat sars cov. based on the amino acid sequence and protein structure alignment, we found that the periphery of the s1 region is more extended than the general bat sars covs. as the most direct structure with the outside world, this may eventually affect its binding to the receptor or its adsorption to objects. the three amino acids that interact with ace2 are altered in the rbd of 2019-ncov. by analyzing the surface properties of the protein, it was found that this change made the region more polar. in order to further study the effect of changed amino acids on the rbd, we back mutated these three amino acids and found that the mutated rbd structure has a stronger effect on ace2. because the rbd is more polar, and the number of hydrogen bonds it interacts with ace2 is reduced, the strength of 2019-ncov binding to ace2 may be similar to common bat sars cov. the results presented in this manuscript demonstrate that the e, m and n protein of 2019-ncov are not significantly different compared with the original bat sars covs. 
the new fragment inserted in orf1ab has no effect on plpro and 3cl hydrolase. our predicted protein structure is highly reliable and can be used for drug development. we took the low-toxicity peptide library as an example, and successfully screened a tripeptide as a potential drug. for s proteins with large sequence differences, changes of key amino acid residues in rbd reduce the number of hydrogen bonds binding to ace2. however, due to the extension of the loop structure in s1 and the increase of polarity, this may make the binding ability of 2019-ncov and ace2 similar to that of bat sars cov.

from the appearance of the first 2019-ncov infected patient to now, due to the convenience of modern transportation, its global impact has far exceeded that of the sars that occurred in guangdong, china in 2003. cov is an ssrna virus; owing to its instability, multiple virus mutations have been found in the past few years. however, because the experimental structure analysis of viral proteins requires time, it is difficult to meet the urgent need for drug development in emergencies. although the virus mutates, it has a high degree of homology with the original bat sars covs. the structure obtained by homology modeling has the potential to replace the crystal structure, and this will help to speed up drug development and facilitate structural and functional analysis of mutant viral proteins. it will play an important role in the future fight against different mutant viruses.

• we elaborated the sequence and structure differences in each key protein of 2019-ncov and other bat sars coronaviruses (covs). we found no significant changes in envelope proteins, membrane proteins, nucleocapsid proteins and key proteases in open reading frame 1ab.
• we used the method of homology modeling to predict the structure of each key protein, then used molecular dynamics simulation to further process the predicted protein structure.
on the basis of predicting a key protein structure, we also predicted two different state changes of s protein structure when it interacts with ace2, and gave an example of the application of papain-like protease structure in peptide drug screening.
• we analyzed whether the new sequence changed in 2019-ncov affected the function of each key protein. s protein has the largest variation among all proteins, and we used a virtual back-mutation method to study whether the new amino acid mutation of receptor binding domain of s protein in 2019-ncov has an effect on the interaction between s protein and ace2. through a series of analyses, combined with the docking and simulation of s protein, we believe that the binding ability and mechanism of action between 2019-ncov and ace2 may be similar to that of bat sars covs.
• this study combines bioinformatics tools and previous relevant experimental studies on the basis of viral sequences. it can overcome the problem of limited time and lack of experiments in the early stage of the disease, and has a good theoretical and practical basis.
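the virtual back-mutation summarized above amounts to substituting residues at numbered positions of a sequence before re-modeling and re-docking. the sketch below covers only that substitution step. the three back-mutations (glu 470 to pro, gln 484 to tyr, asn 487 to thr) are taken from the text, while the function name, the `first_residue` numbering parameter and the toy fragment are assumptions made for illustration; the fragment is not the real rbd sequence.

```python
def apply_mutations(sequence, mutations, first_residue=1):
    """Return `sequence` with point substitutions applied.

    `mutations` is a list of (current_residue, position, new_residue) with
    positions in the numbering of the full protein; `first_residue` is the
    number of the first residue present in `sequence`.
    """
    residues = list(sequence)
    for current, position, replacement in mutations:
        index = position - first_residue
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[index] != current:
            raise ValueError(
                f"expected {current} at {position}, found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


# Back-mutations named in the text: Glu470->Pro, Gln484->Tyr, Asn487->Thr.
BACK_MUTATIONS = [("E", 470, "P"), ("Q", 484, "Y"), ("N", 487, "T")]

# Hypothetical 21-residue fragment numbered 468-488 (not the real RBD).
TOY_FRAGMENT = "GGE" + "A" * 13 + "Q" + "AA" + "N" + "G"
TOY_BACK_MUTATED = apply_mutations(TOY_FRAGMENT, BACK_MUTATIONS, first_residue=468)
```

checking the current residue before substituting guards against numbering mismatches between the alignment and the modeled structure, which would otherwise mutate the wrong position without any error.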
financial & competing interests disclosure

the authors have no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. this includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties. no writing assistance was utilized in the production of this manuscript.

bat-origin coronaviruses expand their host range to pigs
update: severe respiratory illness associated with a novel coronavirus-worldwide
potential of large 'first generation' human-to-human transmission of 2019-ncov
novel wuhan (2019-ncov) coronavirus
updated understanding of the outbreak of 2019 novel coronavirus (2019-ncov) in wuhan
the fight against the 2019-ncov outbreak: an arduous march has just begun
note from the editors: novel coronavirus (2019-ncov)
the genome sequence of the sars-associated coronavirus
epidemiology, genetic recombination and pathogenesis of coronaviruses (epub ahead of print)
cryo-electron microscopy structures of the sars-cov spike glycoprotein reveal a prerequisite conformational state for receptor binding
pre-fusion structure of a human coronavirus spike protein
structure, function and evolution of coronavirus spike proteins
clustal omega for making accurate alignments of many protein sequences
discovery of a novel coronavirus associated with the recent pneumonia outbreak in humans and its potential bat origin
protein structure homology modeling using swiss-model workspace
the swiss-model repository: new features and functionalities
swiss-model: an automated protein homology-modeling server
potential broad spectrum inhibitors of the coronavirus 3clpro: a virtual screening and structure-based drug design study
coronavirus envelope protein: current knowledge
the sars-coronavirus papain-like protease: structure, function and inhibition by designed antiviral compounds
cryo-em structure of the 2019-ncov spike in the prefusion conformation
progress in regulatory peptide research
nsp3 of coronaviruses: structures and functions of a large multi-domain protein
nuclear magnetic resonance structure shows that the severe acute respiratory syndrome coronavirus-unique domain contains a macrodomain fold
sars coronavirus replicase proteins in pathogenesis
structure of sars coronavirus spike receptor-binding domain complexed with receptor
swiss-pdb viewer (deep view)
modeling of loops in protein structures
ab initio construction of all-atom loop conformations

key: cord-007092-ukqvhzws authors: themsakul, sirintra; suebwongsa, namfon; mayo, baltasar; panya, marutpong; lulitanond, viraphong title: secretion of m2e:hbc fusion protein by lactobacillus casei using cwh signal peptide date: 2016-09-08 journal: fems microbiol lett doi: 10.1093/femsle/fnw209 sha: doc_id: 7092 cord_uid: ukqvhzws

the ability to serve as a delivery vehicle for various interesting biomolecules makes lactic acid bacteria (lab) very useful in several applications. in the medical field, recombinant lab expressing pathogenic antigens at different cellular locations have been used to elicit both mucosal and systemic immune responses. expression–secretion vectors (esvs) with a signal peptide (sp) are pivotal for protein expression and secretion. in this study, the genome sequence of lactobacillus casei atcc334 was explored for new sps using bioinformatics tools. three new sps of the proteins cwh, sura and sp6565 were identified and used to construct an esv based on our escherichia coli–l. casei shuttle vector, prceid-lc13.9. functional testing of these constructs with the green fluorescent protein (gfp) gene showed that they could secrete the gfp. the construct with cwhsp showed the highest gfp secretion.
consequently, cwhsp was selected to develop an esv construct carrying a synthetic gene encoding the extracellular domain of the matrix 2 protein fused with the hepatitis b core antigen (m2e:hbc). this esv was shown to efficiently express and secrete the m2e:hbc fusion protein. the identified sps and the developed esvs can be exploited for expression and secretion of homologous and heterologous proteins in l. casei. lactic acid bacteria (lab) are a heterogeneous group of bacteria that are used in various biotechnological applications. with respect to human and animal health, many lab are being used as mucosal delivery vehicles for therapeutic and prophylactic molecules because of their gras (generally regarded as safe) status (eijsink et al. 2002) , their probiotic properties and the ease with which they can be engineered to express heterologous proteins (ouwehand et al. 2002) to elicit an effective immune response (lee et al. 2006 ). up to now, a definite conclusion on the best cellular location for optimal immunization, i.e. intracellular, cell wall anchored or secreted, has yet to be reported. this seems to depend on parameters such as the bacterial type, the amount and localization of the antigen, the route of administration, etc. (wells and mercenier 2008) . many expression and secretion systems for lab to deliver protective antigens have already been developed (kruger et al. 2002; pusch et al. 2005) . such systems require the use of expression-secretion vectors (esvs) with a signal peptide (sp) to translocate the protein after synthesis across the bacterial cell membrane. several sps have been identified and incorporated into esvs (hols et al. 1997; hazebrouck, pothelune and azevedo 2007) . the secretion efficiency of different sps seems to be host specific. for this reason, homologous sps are thought to drive protein secretion more efficiently than heterologous sps (mathiesen et al. 2008) . 
the availability of the complete genome sequences of many lactobacillus casei strains (makarova et al. 2006) makes it easy to search for new and efficient sps that can be used in the construction of new esvs for this species and possibly for other lactobacilli. in this study, putative sps were selected from the genome of l. casei atcc334 (accession no. cc000423.1) using bioinformatics analysis. the selected sps were tested for their ability to drive secretion of green fluorescent protein (gfp), used as a reporter, from esvs based on the escherichia coli-l. casei shuttle vectors previously developed in our group (panya et al. 2012; suebwongsa et al. 2013). the sp with the highest efficiency was then used to drive the secretion of an m2e:hbc fusion protein encoded by a synthetic gene. m2e, the extracellular domain of the matrix 2 protein, is a highly conserved component of the influenza a virus that has been incorporated as an antigen into many influenza vaccine formulations. the immunogenicity of m2e has been reported to be enhanced when fused with the hepatitis b core antigen (hbc) (de filette et al. 2005). the final construct, plc-cwh:m2e:hbc, successfully expressed and secreted the fused antigenic protein. the bacterial strains, plasmid vectors and oligonucleotide primers used in this study are listed in table 1. lactobacillus strains were grown statically in de man rogosa and sharpe (mrs) medium (labm, heywood, lancashire, uk) at 37 °c. escherichia coli dh5α was cultured in luria-bertani (lb) broth (labm) at 37 °c with shaking. solid media were prepared by adding 15 g l−1 bacteriological agar to the corresponding broth. when required, appropriate antibiotics (sigma-aldrich, st louis, mo, usa) were added to the media as follows: erythromycin for l. casei at 2.5 μg ml−1, and ampicillin for e. coli at 100 μg ml−1. all dna manipulation procedures were performed as described by sambrook and russell (2001). plasmids from e.
coli were isolated and purified using the hiyield plasmid mini kit (rbc bioscience, new taipei city, taiwan). plasmids from l. casei were extracted as described by o'sullivan and klaenhammer (1993). genomic dna from l. casei was extracted using the genelute bacterial genomic dna kit (sigma-aldrich). pcr amplification was performed using taq dna polymerase (invitrogen, carlsbad, ca, usa). amplicons were purified from agarose gels using the hiyield gel pcr dna fragments extraction kit (rbc bioscience). the nucleotide sequences of cell surface-associated proteins of l. casei atcc334 were retrieved from the ncbi microbial genomes database (http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi). the signalp 4.1 program (http://www.cbs.dtu.dk/services/signalp/) was used to predict the putative signal peptides. the likely location of the proteins was predicted with the programs psortb v 3.0 (http://www.psort.org/psortb/), subloc v 1.0 (http://www.bioinfo.tsinghua.edu.cn/subloc/pro predict.htm) and gpos-mploc (http://www.csbio.sjtu.edu.cn/bioinf/gpos-multi/). dna sequences encoding sps of cell surface-associated proteins with a clear sp cleavage site and an extracellular location were selected for the construction of esvs. to determine whether the three selected sps (i.e. cwhsp, surasp and sp6565) could drive heterologous protein secretion in l. casei, an esv based on each of these was constructed. sp strength was analyzed using the gfpuv-encoding gene as a reporter. sequences encoding the three sps were amplified from genomic dna of l. casei atcc334 under the following conditions: one cycle of 95 °c for 3 min; 30 cycles of 95 °c for 30 s, 60 °c for 30 s, 72 °c for 14 s; and 72 °c for 3 min. the lengths of the amplicons, cwhsp, surasp and sp6565, were 129 bp, 105 bp and 126 bp, respectively.
the gfpuv gene fragment, recovered as a kpni/spei-digested fragment from pgfpuv, was cloned into pcwh, psura and psp6565 to generate pcwh:gfpuv, psur:gfpuv and psp6565:gfpuv, respectively. the cwh:gfpuv fusion gene was obtained by double digestion of pcwh:gfpuv with hindiii/spei and cloned into hindiii/spei-digested pldh-pro1, a vector containing the constitutive promoter (p ldh) and ribosome binding site (rbs ldh) of the lactate dehydrogenase gene (genbank accession no. d12591.1), resulting in pldh:cwh:gfpuv. using the same procedure, pldh:sur:gfpuv and pldh:sp6565:gfpuv were also obtained. the ldh:cwh:gfpuv dna fragment, obtained by aatii/spei digestion of pldh:cwh:gfpuv, was cloned into aatii/spei-digested prceid-lc13.9, resulting in plc-cwh:gfpuv. the same strategy was followed to obtain the recombinant esvs containing gfpuv with sursp and sp6565, designated plc-sur:gfpuv and plc-sp6565:gfpuv. figure 1 shows the construction diagram of plc-cwh:gfpuv; a similar procedure was used for the generation of pldh:sur:gfpuv and pldh:sp6565:gfpuv. all three recombinant plasmids were verified by dna sequencing (data not shown) and were independently electrotransformed into l. casei rceid02. the preparation of competent cells and the electroporation protocol were as described elsewhere (chassy and flickinger 1987). to determine the secretion efficiency, the fluorescence intensity of culture supernatants derived from recombinant l. casei rceid02 harboring plc-cwh:gfpuv, plc-sur:gfpuv and plc-sp6565:gfpuv was determined in triplicate. cultures of the above three recombinant strains were incubated at 37 °c for 20 h with shaking at 200 rpm until an optical density of 3.0 at 600 nm (od600) was reached. supernatants of the cultures were harvested and the fluorescence intensity was measured with a fluorometer (cary eclipse, victoria, australia), as previously described (wu and chung 2006). as a negative control, the culture supernatant of l.
casei rceid02 harboring the expression vector without signal peptide (plc-gfpuv) was used. transformants containing plc-cwh:m2e:hbc were cultured at 37 °c for 20 h with shaking at 200 rpm until an od600 of 3.0 was reached. the supernatant and cell pellet were collected. the supernatant was concentrated 10-fold using centricons with a molecular weight cut-off of 10 kda (pall life sciences, ny, usa). the total cell extract and concentrated supernatant were separated by 12% sodium dodecyl sulphate-polyacrylamide gel electrophoresis (sds-page) followed by western blotting. the m2e:hbc fusion protein on the membrane was detected with mouse anti-m2 monoclonal antibody (abcam, cambridge, uk) at a dilution of 1:1000. horseradish peroxidase-conjugated goat anti-mouse igg (abcam) was used as a secondary antibody at a dilution of 1:10 000. the signal was developed with a chemiluminescent substrate reagent (thermo fisher scientific, rockford, il, usa) and recorded with a digital camera (imagequant las 4000, uppsala, sweden). total cell extracts from the transformants with plc-cwh:m2e:hbc and those with plc-m2e:hbc were used as positive controls for m2e:hbc expression. supernatants from the transformants with the plc-m2e:hbc construct and those with prceid-lc13.9 were used as negative controls. seventy-seven deduced amino acid sequences of l. casei atcc334 genes encoding cell surface-associated proteins were retrieved from the ncbi database and analyzed with the signalp 4.1 program to predict likely sp sequences (data not shown). the program generates a parameter called the d score, which is used to discriminate signal peptide from non-signal peptide sequences. it also indicates the presumptive location of the signal peptide cleavage site. six proteins had a d score above the cutoff value of 0.450 and were therefore predicted to carry sp sequences. the d scores of these proteins and the likely cleavage sites are summarized in table 2.
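the d-score screen described above amounts to a threshold filter over the predictor output. the following is a minimal python sketch, assuming the signalp 4.1 results have already been parsed into records; the record type, the protein identifiers and the scores are hypothetical placeholders and are not the values of table 2. only the 0.450 cutoff is taken from the text.

```python
from typing import List, NamedTuple

D_CUTOFF = 0.450  # cutoff value stated in the text


class Prediction(NamedTuple):
    protein_id: str      # hypothetical identifier, for illustration only
    d_score: float       # d score reported by the predictor
    cleavage_after: int  # last residue of the predicted signal peptide


def select_candidates(predictions: List[Prediction],
                      cutoff: float = D_CUTOFF) -> List[Prediction]:
    """keep predictions whose d score is above the cutoff, best first."""
    kept = [p for p in predictions if p.d_score > cutoff]
    return sorted(kept, key=lambda p: p.d_score, reverse=True)


# hypothetical input, not data from the study
example = [
    Prediction("protA", 0.81, 43),
    Prediction("protB", 0.32, 27),
    Prediction("protC", 0.45, 35),
    Prediction("protD", 0.62, 42),
]

for p in select_candidates(example):
    print(p.protein_id, p.d_score, p.cleavage_after)
```

in this sketch a score exactly equal to the cutoff is rejected, because the text speaks of scores above the cutoff; whether the predictor itself treats the boundary this way is an assumption.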
the cellular location of each of the six proteins was predicted using three different programs. five, three and one of the six proteins were predicted to be extracellular by the gpos-mploc, psortb and subloc programs, respectively. based on this bioinformatics analysis, the sps of the proteins encoded by the orfs yp_805583.1, yp_805328.1 and yp_806565.1 were selected. these were designated cwhsp, sursp and sp6565, respectively. to determine the secretion efficiency of the selected sequences, esvs based on each sp with the gfpuv gene as a reporter were constructed and designated plc-cwh:gfpuv, plc-sur:gfpuv and plc-sp6565:gfpuv, respectively. these esvs were independently transformed into l. casei rceid02, resulting in the recombinant strains rceid02:cwhsp, rceid02:sursp and rceid02:sp6565. the fluorescence intensity of culture supernatants of these strains, and that of l. casei rceid02 plc-gfpuv (used as a negative control), was determined. the highest fluorescence intensity value was shown by cwhsp (164.80), followed by surasp (118.80) and sp6565 (113.49) (fig. 3). cwhsp was therefore selected for constructing the esv containing the m2e:hbc fusion gene, i.e. plc-cwh:m2e:hbc. western blot analysis of the supernatant from l. casei cells containing plc-cwh:m2e:hbc showed a prominent band with an apparent molecular weight of around 27.6 kda (fig. 4, lane 6), which presumably corresponded to the expected fusion protein. the supernatant from an intermediate construct lacking cwhsp showed no band (fig. 4, lane 4). the m2e:hbc fusion protein was detected in total cell extracts of l. casei carrying either plc-m2e:hbc (fig. 4, lane 3) or plc-cwh:m2e:hbc (fig. 4, lane 5). no protein band was detected in either the total cell extract (fig. 4, lane 1) or the supernatant (fig. 4, lane 2) of the negative control strain. engineered lab strains constitute bacterial systems that may serve as alternative mucosal delivery vehicles for a variety of biomolecules (wells and mercenier 2008).
recombinant lab can be used as vaccine vehicles to elicit both secretory iga and systemic antibody responses against the delivered antigen (neutra and kozlowski 2006). in this study we constructed recombinant l. casei cells expressing m2e:hbc in a secreted form with the use of novel sps derived from the l. casei genome. several sps derived from cell surface-associated proteins and secreted proteins of lab have been identified and used to construct distinct esvs (wu and chung 2006; mathiesen et al. 2008). of these, the sp of usp45 (usp45sp) from lactococcus lactis (daniel et al. 2011) is the most widely used heterologous sp for protein secretion in lab species (dieye et al. 2001; mathiesen et al. 2008). previous studies in gram-positive bacteria have shown that the secretion efficiency depends not only on the sp but also on the protein to be secreted and on the bacterial host (dieye et al. 2001; mathiesen et al. 2008). nonetheless, studies on sp functionality for heterologous protein secretion in lactobacillus plantarum wcfs1 found that homologous sps have similar or higher secretion efficiencies than those of heterologous origin (mathiesen et al. 2008). aiming to express and secrete m2e:hbc in l. casei, we screened for homologous sps using a bioinformatics approach. in our hands, the best sp predictor is signalp 4.1 (petersen et al. 2011). this program has been successfully used to select efficient sps from the protein pool of l. plantarum wcfs1 (mathiesen et al. 2008). the subloc, psortb and gpos-mploc programs were also used to predict the localization of the proteins produced. using these programs, three putative sps (cwhsp, surasp and sp6565) were selected from cell surface-associated proteins of l. casei atcc334 and used for construction of esvs with gfp as reporter protein. the subloc program predicted cwhsp and sp6565 to be periplasmic, possibly because this program uses a dataset based on that of reinhardt and hubbard (1998).
this dataset contains only a small number of sequences from the extracellular and periplasmic groups for training the neural network, which may result in a less accurate prediction. based on these esv constructs, it was found that all three sps can function for gfp secretion in l. casei rceid02, and that the construct using cwhsp provides the highest gfp secretion. previous studies found that an efficient signal peptide in gram-positive bacteria generally contains the consensus sequence val-x-ala↓ala or ala-x-ala↓ala as the sp cleavage site (nielsen et al. 1997; mathiesen et al. 2008). in this study, all three selected sps have the consensus motif val-x-ala at positions −3 to −1 relative to the putative cleavage site. however, none of the selected sps contained ala at position +1. in addition, the most efficient sp, cwhsp, harbors ser at position +1, which suggests that ala at this position is not required. cwhsp was further used to construct an esv containing m2e:hbc. recombinant l. casei carrying this construct expressed and secreted m2e:hbc successfully, demonstrating the usefulness of bioinformatics to predict and select novel sps for l. casei. however, this approach does not guarantee that the sps selected will function, since protein secretion depends on the sp itself, the genetic background of the host strain and the target protein. in addition, secretion efficiency may be affected by the extent to which protein production levels and rates are adapted to the capacity of the translocation machinery, and by the influence of the sp sequence on mrna stability and translation efficiency (mathiesen et al. 2008). for these reasons, to maximize the production of a secreted protein in lactobacillus, and before optimization and balancing of other factors, it might be worthwhile selecting an optimal sp for each individual protein.
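the cleavage-site comparison made above (val-x-ala or ala-x-ala at positions −3 to −1, and the residue at +1) can be expressed as a short check. the sketch below is only illustrative: the function name and the example precursor sequence are hypothetical and are not the real cwhsp, surasp or sp6565 sequences, and the predicted sp length is assumed to come from a tool such as signalp.

```python
def cleavage_site_report(precursor: str, sp_length: int) -> dict:
    """describe positions -3..-1 and +1 around a predicted sp cleavage site.

    precursor: one-letter amino acid sequence of the full precursor protein.
    sp_length: number of residues in the predicted signal peptide; cleavage
               is assumed to occur directly after this residue.
    """
    if sp_length < 3 or sp_length >= len(precursor):
        raise ValueError("sp_length must leave residues -3..-1 and +1 in range")
    seq = precursor.upper()
    minus3 = seq[sp_length - 3]
    minus1 = seq[sp_length - 1]
    plus1 = seq[sp_length]
    return {
        "minus3_to_minus1": seq[sp_length - 3:sp_length],
        "plus1": plus1,
        # val-x-ala or ala-x-ala immediately upstream of the cleavage site
        "axa_like": minus3 in ("V", "A") and minus1 == "A",
        "ala_at_plus1": plus1 == "A",
    }


# hypothetical precursor: 13-residue signal peptide ending in v-q-a, then ser
report = cleavage_site_report("MKKLLSAVAAVQASDT", sp_length=13)
print(report)
```

with this hypothetical input the motif at −3 to −1 is val-x-ala and position +1 is ser, the same pattern the text reports for cwhsp.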
a major drawback of the current influenza a virus vaccines is that they confer only short-term, subtype-specific protection. thus, keeping vaccines up-to-date with influenza a antigens requires constant monitoring of the subtypes of the circulating viruses (johansson and brett 2007). this inconvenience has led to much discussion about the development of a universal influenza a vaccine, able to provide protection against all virus subtypes (du, zhou and jiang 2010). such a vaccine would require identification of a viral component conserved through all subtypes of influenza a virus for eliciting a broad-spectrum immunity (stanekova and vareckova 2010). the extracellular domain of the m2 protein (m2e) of the influenza virus is one such conserved component (ebrahimi and tebianian 2011). however, this is a short peptide of only 24 amino-acid residues. in this study, the m2e-encoding gene was fused with the gene coding for the hepatitis b core antigen (hbc) protein in order to increase its immunogenicity. in a previous study, strong immunogenicity and full protection were obtained in mice after either intraperitoneal or intranasal administration of the m2e:hbc fusion protein (de filette et al. 2005). considering its gras status, the use of l. casei as an antigen delivery vehicle could be of great advantage for the expression of the m2e:hbc fusion protein. in conclusion, we used a bioinformatics approach to select potential sps from the l. casei atcc334 genome. the three selected sps, those of the cwh, sura and 6565 proteins, were shown to direct secretion of gfp in l. casei. cwhsp, which showed the highest secretion efficiency, was then used to develop an esv construct expressing and secreting the fused m2e:hbc protein. this sp and the esv developed may further serve for the expression and secretion of a variety of homologous and heterologous proteins in l. casei and possibly in other lab species.
transformation of lactobacillus casei by electroporation recombinant lactic acid bacteria as mucosal biotherapeutic agents universal influenza a vaccine: optimization of m2-based constructs design of a protein-targeting system for lactic acid bacteria research and development of universal influenza vaccines influenza a viruses: why focusing on m2e-based universal vaccines production of class ii bacteriocins by lactic acid bacteria; an example of biological warfare and communication efficient production and secretion of bovine β-lactoglobulin by lactobacillus casei. microb efficient secretion of the model antigen m6-gp41e in lactobacillus plantarum ncimb 8826 changing perspective on immunization against influenza in situ delivery of passive immunity by lactobacilli producing single-chain antibodies mucosal immunization with surfacedisplayed severe acute respiratory syndrome coronavirus spike protein on lactobacillus casei induces neutralizing antibodies in mice comparative genomics of the lactic acid bacteria heterologous protein secretion by lactobacillus plantarum using homologous signal peptides mucosal vaccines: the promise and the challenge identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites high-and low-copy-number lactococcus shuttle cloning vectors with features for clone screening probiotics: an overview of beneficial effects sequencing and analysis of three plasmids from lactobacillus casei tistr1341 and development of plasmid-derived escherichia coli-l. 
casei shuttle vectors signalp 4.0:discriminating signal peptides from transmembrane regions bioengineering lactic acid bacteria to secrete the hiv-1 virucide cyanovirin using neural networks for prediction of the subcellular location of proteins molecular cloning: a laboratory manual conserved epitopes of influenza a virus inducing protective immunity and their prospects for universal vaccine development cloning and expression of a codon-optimized gene encoding the influenza a virus nucleocapsid protein in lactobacillus casei coli host strains significantly affect the quality of small scale plasmid dna preparations used for sequencing mucosal delivery of therapeutic and prophylactic molecules using lactic acid bacteria green fluorescent protein is a reliable reporter for screening signal peptides functional in lactobacillus reuteri we would like to acknowledge prof. david blair for editing the manuscript via the kku publication clinic. this study was supported by the higher education research promotion and national research university project of thailand, office of the higher education commission, thailand and by the invitation research grant (i57301) from the faculty of medicine, khon kaen university. none declared. key: cord-004673-c8qcjve9 authors: faaberg, k. s.; plagemann, p. g. w. title: membrane association of the c-terminal half of the open reading frame 1a protein of lactate dehydrogenase-elevating virus date: 1996 journal: arch virol doi: 10.1007/bf01718835 sha: doc_id: 4673 cord_uid: c8qcjve9 orf 1a of lactate dehydrogenase-elevating virus, strain p (ldv-p), encodes a protein of 2206 amino acids. eisenberg hydrophobic moment analysis of the protein predicted the presence of eleven transmembrane segments in the c-terminal half of the molecule (amino acids 980–1852) that flank the serine protease domain. 
cdnas encoding orf 1a protein segments encompassing transmembrane segments 5 to 11 and its amphipathic c-terminal end as well as the n-terminal 80 amino acids of the downstream orf 1b protein were transcribed and the transcripts in vitro translated in the absence and presence of microsomal membranes. the synthesis of the protein products with putative transmembrane segments was enhanced by the presence of the microsomal membranes and the proteins became membrane associated. when synthesized in the absence of membranes they were recovered in the supernatant upon ultracentrifugation of the translation reaction mixtures, whereas they were recovered in the membrane pellet when synthesized in the presence of membranes. furthermore, the latter proteins were not released from the membranes by disruption of the membrane vesicles in carbonate buffer, ph 11.5, and large portions of the proteins were resistant to digestion by trypsin, chymotrypsin and proteinase k. no n-glycosylation was observed and only little, if any, processing of the protein by the putative serine protease. the results indicate that the c-terminal half of the orf 1a protein represents a non-glycosylated integral membrane protein. potential modes of synthesis and function of the protein are discussed. in addition, the results showed that the synthesis of the orf 1a protein was generally terminated at its termination codon, but that read-through into the orf 1b gene occurred with low frequency.
lactate dehydrogenase-elevating virus (ldv) belongs to a new group of positive-strand rna viruses, presently classified as genus arterivirus [3] , which also includes equine arteritis virus (eav), simian hemorrhagic fever virus, and porcine reproductive and respiratory syndrome virus (prrsv) [22, 23] .
expression of the viral genomes of 12-15 kb in infected cells is via the formation of 3'-coterminal nested sets of 6 or 7 subgenomic mrnas. the major structural proteins of these viruses are translated from subgenomic mrnas 5, 6, and 7. the 5' three-quarters of the viral genome encodes two large proteins, 1a (1727-2396 amino acids) and 1b (1410-1448 amino acids), which are translated from genomic rna. the orf 1b protein is expressed via a frameshift mechanism involving a slippery sequence and a pseudoknot [8]. the orf 1b proteins of these viruses possess several common functional motifs, i.e. replicase, helicase, and zinc finger motifs, and probably represent the rna replicases of these viruses. the orf 1a protein of eav possesses an n-terminal papain-like cysteine proteinase (pcp) that autocatalytically cleaves off the n-terminal 29-kda end of the protein (nonstructural protein-1; nsp-1) downstream of the catalytic pcp residues [26]. the orf 1a proteins of ldv and prrsv possess two pcps. both are functionally active in cleaving off n-terminal products of about 21 kda (nsp-1α) and about 26 kda (nsp-1β), respectively [7] (see fig. 1). in the case of eav, a 61 kda product (nsp-2) is then removed by another cysteine proteinase (cp) [7]. in addition, the orf 1a proteins of these viruses possess a serine proteinase (sp) motif in the c-terminal half of the molecule (see fig. 1), and the sp has been suggested to be responsible for the further cleavage of the protein into functional units [7, 26]. the function of the orf 1a proteins is unknown. however, hydrophobic moment analyses of the orf 1a proteins of ldv, eav, and prrsv have identified a similar set of 11 or 12 potential transmembrane segments in their c-terminal segments of about 1200 amino acids [20], which flank the sp motif, with 4 to 7 segments on either side (see fig. 1). the predicted structure is unique among virus-encoded proteins and implies some specific function of the orf 1a proteins in arterivirus replication.
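the hydrophobic moment analysis referred to above scores a sequence window by its mean hydrophobicity and by the periodicity of hydrophobicity along an assumed helix. the sketch below uses the textbook formulation of the moment, with 100° between successive residues of an α-helix; it is not necessarily the exact procedure or window size used in [20]. the scale values are those commonly tabulated for the eisenberg consensus scale and should be checked against the original tabulation before any real use; the window width of 21 residues is an assumption.

```python
import math

# eisenberg consensus hydrophobicity scale as commonly tabulated
# (assumption: verify against the original publication before relying on it)
EISENBERG = {
    "A": 0.62, "R": -2.53, "N": -0.78, "D": -0.90, "C": 0.29,
    "Q": -0.85, "E": -0.74, "G": 0.48, "H": -0.40, "I": 1.38,
    "L": 1.06, "K": -1.50, "M": 0.64, "F": 1.19, "P": 0.12,
    "S": -0.18, "T": -0.05, "W": 0.81, "Y": 0.26, "V": 1.08,
}


def mean_hydrophobicity(window: str) -> float:
    """average scale value over the window."""
    return sum(EISENBERG[aa] for aa in window) / len(window)


def hydrophobic_moment(window: str, delta_deg: float = 100.0) -> float:
    """mean hydrophobic moment per residue for a given turn angle."""
    delta = math.radians(delta_deg)
    s = sum(EISENBERG[aa] * math.sin(i * delta) for i, aa in enumerate(window))
    c = sum(EISENBERG[aa] * math.cos(i * delta) for i, aa in enumerate(window))
    return math.hypot(s, c) / len(window)


def scan(sequence: str, width: int = 21):
    """slide a window along the sequence: (start, mean h, mean moment)."""
    seq = sequence.upper()
    return [
        (i, mean_hydrophobicity(seq[i:i + width]),
         hydrophobic_moment(seq[i:i + width]))
        for i in range(len(seq) - width + 1)
    ]
```

in this formulation a window with high mean hydrophobicity and low moment is a candidate transmembrane segment, whereas a high moment indicates an amphipathic helix; the thresholds that separate the classes are not given in the text and are left out here.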
the only other proteins with 11 or more transmembrane segments are membrane-associated transport proteins [2, 6, 15, 19]. in the present study we provide strong evidence that the segment of the ldv orf 1a protein with transmembrane segments 5-11 (see fig. 1) becomes intimately associated with endoplasmic reticulum (er) membranes during synthesis and that none of its potential n-glycosylation sites becomes glycosylated. our approach has been to in vitro translate mrnas representing portions of ldv orf 1a in the absence and presence of microsomal membranes and then to examine the membrane association of the protein products and their susceptibility to degradation by proteinases. at the same time we have examined the efficiency of termination of the orf 1a protein and of frameshift/readthrough at the slippery sequence at the end of ldv-p orf 1a (6763gcu uua aac uguuga...; slippery sequence, uuuaaac, and orf 1a termination codon, uga; [20]). plasmids pbsk-a69, pbks+183-1 (g3), and pbks+4-6 carry overlapping segments of the 3' end of ldv-p orf 1a (see fig. 1; nt 4202-4934, 4805-5945 and 5833-7043, respectively; genbank accession number u15146) and have been generated in previous studies [20]. cdna 4-6 also encompasses the 5'-terminal end of orf 1b including the putative pseudoknot that is postulated to play a role in frame-shifting between orf 1a and 1b. cdna 4-6 was excised from its bluescript vector with restriction endonucleases bamhi and aatii, incubated with klenow dna polymerase in the presence of deoxynucleotide triphosphates to obtain blunt ends and ligated into the msci site of the pcite-1 vector (novagen, madison, wi), which supplied an aug initiation codon 12 nucleotides upstream of the orf 1a segment in pc4.6. the dnas of pbsk-a69, pbks+183-1 and pc4.6 were linearized and transcribed with t7 rna polymerase or, in the case of pbsk-a69, with t3 rna polymerase as described previously [11].
the integrity of the rna was verified by agarose gel electrophoresis (see later, fig. 3a). the transcribed rnas were in vitro translated in a rabbit reticulocyte lysate system in the absence and presence of canine pancreatic microsomal membranes as also described previously [10, 11]. for investigating their association with microsomal membranes, the products of translation were incubated and centrifuged under different buffer conditions [1-5, 13, 21]. the translation reaction mixtures were diluted about 100-fold with tris-buffered sucrose (250 mm sucrose, 25 mm tris-hcl, ph 7.5) or 100 mm sodium carbonate, ph 11.5. the mixtures were incubated on ice for 1 h and then centrifuged for 1 h at about 100 000 x g at 4 °c. when incubated in isotonic sucrose buffer, the microsomal membranes remain closed vesicles and all proteins associated with the membrane and those located within the vesicles will be located in the membrane pellet (p), whereas proteins synthesized on free ribosomes will be recovered in the supernatant (s). in contrast, when translation products are incubated in 100 mm na+-carbonate at ph 11.5, the membrane vesicles are disrupted and only integral membrane proteins remain associated with the pelleted membranes, whereas secreted glycoproteins and peripheral proteins are recovered in the supernatant [13]. after centrifugation, the proteins in the supernatant (s) were precipitated with trichloroacetic acid, washed with acetone, and air dried. the pelleted material (p) was washed once in sterile water. the s and p proteins were suspended in reducing sample buffer and analyzed by tricine sodium dodecylsulfate polyacrylamide gel electrophoresis (sds-page) [25] using 16.5%t: 3%c or 12.5%t: 3%c gels (t refers to total percentage of acrylamide and bisacrylamide monomers and c refers to the percentage of crosslinker relative to t [17]) as described previously [11].
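the interpretation of the two centrifugation conditions described above can be written as a small decision rule. the sketch below encodes only the reasoning stated in the text (pellet in isotonic sucrose means membrane-associated or enclosed in vesicles; pellet after carbonate at ph 11.5 means integral); the function name and the category labels are illustrative choices, and a real experiment would of course work with partial distributions rather than yes/no calls.

```python
def classify_fractionation(pellet_in_sucrose: bool,
                           pellet_in_carbonate: bool) -> str:
    """interpret where a translation product is recovered.

    pellet_in_sucrose:   product found mainly in the membrane pellet after
                         centrifugation in tris-buffered sucrose, ph 7.5.
    pellet_in_carbonate: product found mainly in the pellet after treatment
                         with sodium carbonate, ph 11.5.
    """
    if pellet_in_sucrose and pellet_in_carbonate:
        # stays with the membranes even after the vesicles are disrupted
        return "integral"
    if pellet_in_sucrose and not pellet_in_carbonate:
        # released by carbonate: peripheral, or enclosed within the vesicles
        return "peripheral_or_lumenal"
    if not pellet_in_sucrose and not pellet_in_carbonate:
        # never pelleted with the membranes
        return "soluble"
    # pelleted only after carbonate treatment: not expected from the scheme
    return "inconsistent"


# pattern reported in the text for products made with membranes present
print(classify_fractionation(True, True))
```

under this rule, a product that pellets under both conditions is classified as integral, and a product found in the supernatant under both conditions is classified as soluble.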
only the orf 1a segment present in each transcript could yield a protein of significant size since the other two reading frames in each transcript did not encode any protein longer than 34 amino acids. the main product of the a69 transcript, whether translated in the absence or presence of microsomal membranes, was a protein of about 20-22 kda (fig. 2a, lanes 1 and 4). however, the presence of membranes enhanced translation of the a69 transcript and the product was almost exclusively recovered in the membrane pellet and thus associated with membranes, whereas the protein synthesized in the absence of membranes was mainly recovered in the s fraction. incubation in carbonate buffer, ph 11.5, did not cause a significant release of the protein from the membranes (fig. 2b, lanes 2 and 5). thus the protein product became integrated into the membranes during membrane-associated synthesis. translation must have been initiated at one of the three aug codons of orf 1a located close together at the 5' end of the transcript, but which one is not clear. none of the three augs is in a favorable context for translation [18], having c rather than a purine at the -3 position relative to the a (+1) of the aug codon, but the same is the case for the initiation codons of most of orfs 2-7, which function efficiently and specifically in translation initiation [11]. the termination codon was provided by the vector so that the six terminal amino acids in the protein product were derived from vector sequences. initiation at one of the three aug codons would yield proteins with 246, 234 or 205 amino acids, respectively. another minor product of about 45 kda was consistently produced from the a69 transcript with properties similar to those of the protein of about 22 kda. its nature is unknown. the results for the translation of the 183-1 transcript, which also encodes a protein sequence with potential transmembrane segments, were similar to those described for the a69 transcript.
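the sizes expected from the three candidate initiation codons can be compared with the observed band by a rough mass estimate. the sketch below assumes an average residue mass of 110 da, a common rule of thumb that is not taken from the source; a sequence-based calculation would be more accurate, and apparent sizes on sds-page need not match calculated masses.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb assumption, not from the source


def approx_mass_kda(n_residues: int,
                    avg_residue_mass_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """approximate protein mass in kda from its residue count."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return n_residues * avg_residue_mass_da / 1000.0


# residue counts given in the text for initiation at the three aug codons
candidates = {"first aug": 246, "second aug": 234, "third aug": 205}

for label, n in candidates.items():
    print(f"{label}: {n} residues, about {approx_mass_kda(n):.1f} kda")
```

the estimate only ranks the candidates by size; it cannot by itself decide which aug codon was used.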
the main product of about 36 kda was recovered in the s fraction when synthesized in the absence of membranes, whereas it was recovered in the p fraction when synthesized in the presence of membranes (fig. 2a, lanes 5 and 8). incubation in carbonate buffer, ph 11.5, did not cause a significant dissociation of the protein from the membranes (fig. 2b, lanes 8 and 9). translation was markedly enhanced by the presence of membranes. since the molecular weight of the product was the same whether the protein was synthesized in the absence or in the presence of membranes, none of the three potential n-glycosylation sites in the predicted protein (see fig. 1) became glycosylated. in contrast, we have shown previously [11, 12] that under the same experimental conditions the orf 2-5 glycoproteins of ldv-p become n-glycosylated in vitro when synthesized in the presence of membranes. there were also membrane-associated proteins of about 28, 24, and 16 kda synthesized from the 183-1 transcript (fig. 2a, lane 8). since the putative transmembrane segments fall in the middle of the expected product (see fig. 1), the smaller products could have resulted from premature termination or initiation at more downstream aug codons. the former was probably the case, since there is only one additional in-frame aug codon that could yield a product with putative membrane segments and approximately the size (22 kda) of one of the smaller products and it is in very unfavorable context for initiation.
fig. 2. analysis of the membrane association of the in vitro translated proteins a69, 183-1, and 4-6. the transcripts of the appropriate plasmids were in vitro translated in the absence and presence of pancreatic membranes (- or +, respectively). the reaction mixtures were further incubated in tris-sucrose, ph 7.5 (a), or carbonate buffer, ph 11.5 (b) and centrifuged. the proteins in the supernatant (s) and the pellet (p) were analyzed by tricine sds-page using 16.5%t: 3%c gels.
processing by the serine proteinase (sp) could also not play a role in generating the smaller products, since the potential 183-1 product does not contain the complete sp motif. the primary findings concerning the a69 and 183-1 proteins were that they are, as predicted by their hydrophobicity profiles, integral membrane proteins and that they are not n-glycosylated. as also predicted by the hydrophobic moment analyses (fig. 1), the pc4.6 product did not become membrane-associated. a transcript of pc4.6 could potentially yield two protein products if a frame-shift occurred at the slippery sequence at nt 6765-6771 (see earlier and fig. 1). one protein would be composed of 321 amino acids (35.2 kda): 4 n-terminal amino acids encoded by the vector plus the 317 amino acid long c-terminal end of the orf 1a protein. the second protein would possess an additional 80 amino acids (about 10 kda): 76 representing the n-terminal end of the orf 1b protein plus 4 c-terminal amino acids encoded by the vector. both products were generated. since the 44 kda product possessed twice as many met residues as the 35 kda protein, namely six, the level of radioactivity in the two proteins indicates that the read-through product was synthesized in considerably lower amounts than the terminated orf 1a protein (fig. 2a, lane 9). since significant amounts of a 10 kda protein were not produced, little, if any, cleavage of the 10 kda orf 1b protein segment from the 44 kda read-through product occurred. the presence of membranes had no significant effect on the translation of the pc4.6 transcript, and the products were recovered in the soluble supernatant fraction whether synthesized in the presence or absence of membranes (fig. 2a, lanes 9-12). the predicted catalytic triad of the sp of the ldv-p orf 1a protein consists of his-1551, asp-1576 and ser-1646 [14, 20]. his-1551 and asp-1576 are present in the c-terminal end of the a69 protein. ser-1646 is encoded in cdna 183-1.
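the residue counts quoted above for the two pc4.6 products can be checked with a few lines of arithmetic. the sketch below is only an illustration: the average residue mass of 110 da is a common rule of thumb and an assumption here, not a value taken from the paper, and the function name is hypothetical.

```python
# Rough size check for the two predicted pc4.6 translation products.
# ASSUMPTION: 0.110 kDa per residue is a rule-of-thumb average,
# not a value reported in the paper.
AVG_RESIDUE_KDA = 0.110


def approx_mass_kda(n_residues: int) -> float:
    """Estimate protein mass in kDa from the number of residues."""
    return n_residues * AVG_RESIDUE_KDA


# Residue counts as stated in the text.
vector_n_terminal = 4      # N-terminal residues encoded by the vector
orf1a_c_terminal = 317     # C-terminal end of the ORF 1a protein
orf1b_n_terminal = 76      # N-terminal end of the ORF 1b protein
vector_c_terminal = 4      # C-terminal residues encoded by the vector

terminated = vector_n_terminal + orf1a_c_terminal    # product without frame-shift
extension = orf1b_n_terminal + vector_c_terminal     # extra residues after frame-shift
readthrough = terminated + extension                 # frame-shifted product

print(terminated, round(approx_mass_kda(terminated), 1))
print(readthrough, round(approx_mass_kda(readthrough), 1))
```

with these assumptions the terminated product (321 residues) comes out near the quoted 35 kda and the read-through product (401 residues) near the observed 44 kda.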
in order to examine the functionality of the sp and to further examine the membrane association of the orf 1a protein, we joined a69 and 183-1 at a styi site in their overlapping segment (see fig. 1) and in vitro transcribed/translated the construct. a69 was excised from pbsk-a69 using restriction endonucleases styi and ncoi, releasing a segment of 711 nt (nt 4215-4926). pbks+183-1 was cut with styi and the a69 segment ligated to it, resulting in a clone which represented ldv nts 4779-4926 followed by ldv nts 4215-5945. this clone was then digested with bamhi and ncoi (removing ldv nt 4779-4926 from the 5' end), blunt ended and ligated. the resulting construct, pbks+a69/183-1, possessed the 5'-terminal aug codons of a69 and potentially encoded a protein of 592 amino acids (63.7 kda): orf 1a amino acids 1354-1930 plus 15 c-terminal amino acids encoded by the vector. it encompassed the sp motif and putative transmembrane segments 5-11 (see fig. 1). transcription of the construct yielded a transcript of the appropriate size (fig. 3a). translation of the transcript yielded in some experiments a protein of the expected size of about 64 kda (see fig. 4), but in many others the main product had a size of about 44 kda whether or not membranes were present (fig. 3b, lanes 1 and 4). the reason for this difference is not known. the 44-kda protein could have resulted from processing of the 64-kda protein, but, if this was the case, processing must have occurred very rapidly, since a time course experiment showed that maximum amounts of product were produced by 30 min of incubation.

fig. 3. agarose gel electrophoretic analysis of the transcripts of pbks+a69/183-1, pbsk-a69, pbks+183-1 and pc4.6 (a) and in vitro synthesis of protein a69/183-1 and analysis of its membrane association (b). the a69/183-1 proteins synthesized in the absence and presence of microsomal membranes were analyzed as described in the legend to fig. 2.
the protein profile changed little, if at all, during the 15-180 min period of translation (data not shown) and there was no indication of the formation of an 18-20 kda protein that would have been expected to be formed in such a processing step (figs. 3b and 4). only a minor product of 27 to 29 kda was consistently produced (figs. 3b and 4). furthermore, the formation of the protein products was the same when all proteinase inhibitors (1 µg pepstatin a, 2 µg leupeptin and 2 µg aprotinin per ml, and 50 µm phenylmethylsulfonyl fluoride) were omitted from the translation reaction mixture (data not shown). thus very little, if any, processing by the sp was observed. regardless, when synthesized in the absence of membranes the protein products were recovered in the supernatant fraction, whereas when synthesized in the presence of pancreatic membranes they were recovered in the membrane fraction, whether the reaction products were incubated in the buffered sucrose solution or in carbonate buffer, ph 11.5, prior to centrifugation (fig. 3b, lanes 1 and 4 and 5 and 8, respectively). the results clearly indicate that the primary protein products become membrane associated when synthesized in the presence of membranes. furthermore, the lack of increase in size of the primary products when synthesized in the presence of membranes indicates that the proteins were not significantly glycosylated at any of the three potential glycosylation sites. in order to further investigate the membrane association of the a69/183-1 protein, we determined its proteinase resistance after synthesis in the presence of microsomal membranes. large portions of the protein seemed to be protected by membrane association from digestion by trypsin, chymotrypsin, or proteinase k. the main digestion product was about 18 kda, but there were lower amounts of products with apparent molecular weights of about 29 kda and 13-14 kda (fig. 4, lanes 2-4).

k.s. faaberg and p.g.w. plagemann

fig. 4. sensitivity of the membrane-associated a69/183-1 proteins to proteinase digestion. the products synthesized in vitro in the presence of membranes were incubated in tris-sucrose, ph 7.5 (a), or carbonate, ph 11.5 (b), and then digested for 1 h on ice with trypsin, chymotrypsin or proteinase k (each at a concentration of 0.1 µg/ml), or without them (control), before analysis by tricine sds-page using a 16.5% t, 3% c gel.

the same products were generated by proteinase digestion after incubation of the primary product in carbonate buffer, ph 11.5 (fig. 4b). the lower recovery of some products after the carbonate treatment does not indicate additional protein digestion, since a variable loss was also frequently observed with non-proteinase-treated proteins after carbonate, ph 11.5, treatment, apparently resulting from sticking of the proteins to test tube walls [11]. the smallest proteinase digestion product varied in size depending on the proteinase used (fig. 4); it was smallest after proteinase k digestion. the proteinase digestion products, except for the 13-14-kda proteins, were larger than the segments making up the transmembrane segments 5-7 (about 11 kda) or 8-11 (about 15 kda). this finding indicates that portions of the a69/183-1 protein in addition to the transmembrane segments became proteinase resistant as a result of membrane association of the proteins. similar findings have been reported for coronavirus m proteins; protein segments upstream and downstream of their transmembrane segments seem to be resistant to chymotrypsin and proteinase k attack [4, 24]. the same applies to the m/vp-2 and vp-3p proteins of ldv [11]. in contrast, the ldv proteins when synthesized in vitro in the absence of membranes were completely digested by the three proteinases under the same experimental conditions [11].
the authenticity of the in vitro synthesized proteins as ldv proteins was confirmed by immunoprecipitation of the products by anti-ldv antibodies. both the about 44 kda and the 27-29 kda products of the a69/183-1 rna were precipitated by incubation with plasma from 5-month ldv-infected mice (imp) but not by normal mouse plasma (nmp) or by monoclonal antibodies to ldv n/vp-1 or vp-3 (fig. 5, lanes 1-5). the same applied to the about 44 kda and 35 kda products of the 4.6 rna (fig. 5, lanes 6-10).

fig. 5a, b. immunoprecipitation of in vitro synthesized products of the a69/183-1 transcript [...] mouse plasma (imp), anti-vp-1 mab c350201.7 or anti-vp-3p mab 159-12 as described previously [11].

the results are also of interest in that they demonstrate that mice mount antibody responses to the c-terminal half of the orf 1a protein. in conclusion, the present study shows that at least a large portion of the c-terminal half of the ldv orf 1a protein represents an integral membrane protein(s). the in vitro synthesis of the portion of the orf 1a protein with the putative transmembrane segments is enhanced by the presence of microsomal membranes, and the protein(s) synthesized in the presence of the membranes becomes inserted into the membrane in a form that is not released by disruption of the membrane vesicles in a carbonate buffer, ph 11.5, and that is largely protected from proteinase digestion. much larger portions of the protein seem to become proteinase resistant than strictly the portions making up the putative transmembrane segments, suggesting intimate association with the membrane. the membrane association of a large portion of the orf 1a protein suggests that the orf 1a protein, or at least its c-terminal half, is synthesized in vivo on rough er. one scenario suggests that the synthesis of the orf 1a protein starts on free ribosomes and that the n-terminal end is processed autocatalytically by the two paps, releasing products of 22 and 26 kda (fig. 1; nsp-1α and 1β, respectively) [7]. this might be followed by the autocatalytic release of another protein (nsp-2) by the action of a cysteine protease [7]. one of the putative transmembrane segments then could function as a processed or uncleaved signal peptide, so that the synthesis of the remainder of the orf 1a protein occurs on membrane vesicles, however, in a manner that prevents n-glycosylation. in contrast, all the n-glycosylation sites in the ectodomains of the orf 2 and orf 5 glycoproteins become glycosylated during membrane-associated in vitro synthesis, and the oligosaccharide chains seem to become processed during traverse through golgi membranes in vivo [11, 12]. nothing is known about the functions of the orf 1a protein. however, the integration of at least a large portion of the protein into er membranes suggests one possibility, namely that this process results in the formation of the unique double-membrane vesicles that are invariably associated with the replication of all arteriviruses regardless of the nature of the host cell [23, 27, 28]. electron micrographs suggest that these double-membrane vesicles originate from rough er by protrusion and detachment [28]. their formation is a very early event in ldv replication in macrophages, and free nucleocapsids first appear among these vesicles before budding into single-membrane cisternae located generally next to golgi membranes [23, 27]. these findings, combined with the uniqueness of these structures, suggest that these double-membrane vesicles may play an important role in virus replication. the possibility that comes to mind is a role in the synthesis of viral genomic and subgenomic mrnas, since viral rna synthesis in general seems to be membrane associated.
the function of membranes and their associated proteins in viral nucleic acid synthesis is not known, but they might supply some organizational component to the replication complex and/or facilitate rna folding as suggested for rna chaperones [16]. in the case of the arteriviruses, the orf 1b replicase protein is not an integral membrane protein and probably supplies the catalytic rna replicase functions, since it possesses helicase, replicase, and zinc binding motifs [22]. our results as well as those for eav [8] suggest that the orf 1b protein is synthesized in much smaller amounts than the orf 1a protein, which is typical for ribosomal frame shifting [1]. our results are consistent with the hypothesis that the orf 1a and 1b proteins provide structural and catalytic functions, respectively.

[1] ribosomal frame shifting on viral rnas
[2] structure and function of cytochrome c oxidase
[3] revision of the taxonomy of the coronavirus, torovirus and arterivirus genera
[4] coronavirus ibv glycopolypeptides: locational studies using proteases and saponin, a membrane permeabilizer
[5] new variation on the translocation of proteins during early biogenesis of apolipoprotein b
[6] structure and function of mammalian facilitated sugar transporters
[7] processing and evolution of the n-terminal region of the arterivirus replicase orf 1a protein: identification of two cysteine proteases
[8] equine arteritis virus is not a togavirus but belongs to a coronavirus-like superfamily
[9] analysis of membrane and surface protein sequences with the hydrophobic moment plot
[10] disulfide bonds between two envelope proteins of lactate dehydrogenase-elevating virus are essential for viral infectivity
[11] the envelope proteins of lactate dehydrogenase-elevating virus and their membrane topography
[12] open reading frame 3 of lactate dehydrogenase-elevating virus encodes a soluble, non-structural highly glycosylated and antigenic protein
[13] isolation of intracellular membranes by means of sodium carbonate treatment: application to endoplasmic reticulum
[14] complete genomic sequence and phylogenetic analysis of the lactate dehydrogenase-elevating virus (ldv)
[15] the multidrug transporter, a double-edged sword
[16] rna chaperones and the rna folding problem
[17] "molecular-sieve" chromatography of proteins on columns of cross-linked polyacrylamide
[18] the scanning model for translation: an update
[19] the high affinity na+/glucose cotransporter
[20] sequence of the genome of lactate dehydrogenase-elevating virus: heterogeneity between strains p and c
[21] a former amino terminal signal sequence engineered to an internal location directs translocation of both flanking protein domains
[22] lactate dehydrogenase-elevating virus and related viruses
[23] lactate dehydrogenase-elevating virus, equine arteritis virus, and simian hemorrhagic fever virus: a new group of positive-strand rna viruses
[24] assembly in vitro of a spanning membrane protein of the endoplasmic reticulum: the e1 glycoprotein of coronavirus mouse hepatitis virus a59
[25] tricine-sodium dodecyl sulfate-polyacrylamide gel electrophoresis for the separation of proteins in the range from 1 to 100 kda
[26] proteolytic processing of the replicase orf 1a protein of equine arteritis virus
[27] replication of lactate dehydrogenase-elevating virus in macrophages. 1. evidence for cytocidal replication
[28] electron microscopic studies on the morphogenesis of prrsv in infected cells - comparative studies

we thank brent langley for competent secretarial help. during part of the work ksf was supported by usphs training grant ca 09138. received december 8, 1995

key: cord-002711-b7mlt19n authors: jacomin, anne-claire; samavedam, siva; charles, hannah; nezis, ioannis p.
title: ilir@viral: a web resource for lir motif-containing proteins in viruses date: 2017-08-14 journal: autophagy doi: 10.1080/15548627.2017.1356978 sha: doc_id: 2711 cord_uid: b7mlt19n macroautophagy/autophagy has been shown to mediate the selective lysosomal degradation of pathogenic bacteria and viruses (xenophagy), and to contribute to the activation of innate and adaptative immune responses. autophagy can serve as an antiviral defense mechanism but also as a proviral process during infection. atg8-family proteins play a central role in the autophagy process due to their ability to interact with components of the autophagy machinery as well as selective autophagy receptors and adaptor proteins. such interactions are usually mediated through lc3-interacting region (lir) motifs. so far, only one viral protein has been experimentally shown to have a functional lir motif, leaving open a vast field for investigation. here, we have developed the ilir@viral database (http://ilir.uk/virus/) as a freely accessible web resource listing all the putative canonical lir motifs identified in viral proteins. additionally, we used a curated text-mining analysis of the literature to identify novel putative lir motif-containing proteins (lircps) in viruses. we anticipate that ilir@viral will assist with elucidating the full complement of lircps in viruses. autophagy is a multistep process that consists of the isolation of cytoplasmic components into double-membrane vesicles, called autophagosomes, that shuttle to lysosomes, which serve as end-point degradative organelles. it is a catabolic mechanism that enables the removal of damaged or excess cellular organelles and proteins, thereby contributing to the maintenance of cell homeostasis and survival. 1 the autophagic machinery is highly conserved from unicellular eukaryotes to metazoans. among the proteins that take part in this process, the atg8-family proteins play a central role. 
2 indeed, these proteins are involved in the elongation and maturation of the autophagosome and its fusion with lysosomes. 1, 3 phosphatidylethanolamine-conjugated atg8-family proteins reside on autophagosomal membranes where they can contribute to the recruitment of other core autophagy machinery proteins essential for the effective course of the autophagy process. 1, [4] [5] [6] [7] although originally considered to be a nonselective bulk degradation mechanism, a growing body of evidence over the past decade suggests that autophagy is much more selective than initially appreciated. selective targeting of cellular components to autophagosomes for degradation relies on the existence of selective autophagy receptors able to recognize and tether cargos toward nascent autophagosomes. 8, 9 examples of selective autophagy include aggrephagy, mitophagy, lipophagy, and xenophagy. [10] [11] [12] [13] the interaction between selective autophagy receptors and atg8-family proteins is essential for the proper steering of the cargo for degradation. these receptors typically contain an lc3-interacting region (lir, also known as lrs, aim or gim; the latter correspond to lc3 recognition sequence, atg8-interacting motif and gabarap interaction motif, respectively) critical for the binding to atg8-family proteins. 7, [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] the lir motif consists of a short amino acid sequence with a core motif originally described as w/f/y-x-x-i/l/v (where 'x' represents any amino acid, and referred to as wxxl hereafter). 7, 18 this sequence has lately been relaxed and extended to 6 amino acids, with [ilv] in the final position, to integrate most of the experimentally verified lirs; the residues in positions 3 and 6 are the most crucial ones for the interaction with atg8-family proteins. [26] [27] [28] besides its role in cellular homeostasis, autophagy is also involved in the innate immune response against pathogens.
13, 29, 30 recent years have seen an outburst of studies on autophagy and viral infections. autophagy may exert a variety of antiviral functions, including the degradation of viral components (known as virophagy), the activation of innate immunity by the delivery of viral nucleic acids to the toll-like receptors and of adaptive immunity through the major histocompatibility complex ii (mhc-ii/hla class ii), or the control of the production of reactive oxygen species (ros). [31] [32] [33] [34] [35] [36] [37] [38] [39] [40] [41] [42] [43] however, to be successful, viruses have evolved mechanisms to evade host defense. several viruses have thus developed strategies to use the autophagy machinery or even thrive in the autophagosomes and promote their replication, spread, and survival. [44] [45] [46] [47] other viruses susceptible to autophagy have evolved mechanisms to counteract autophagy activation by expressing proteins that interfere with the cellular machinery, essentially inhibiting the autophagosome-lysosome fusion or interfering with early stages of autophagy activation. [48] [49] [50] [51] [52] [53] [54] [55] [56] a few proteins from viruses infecting mammals and plants that interfere with the host autophagy process have been shown to associate with atg8-family proteins. 52, [57] [58] [59] [60] [61] [62] yet, only one lir-dependent interaction has been reported. 62 we have developed a database, ilir@viral (http://ilir.uk/virus/), that organizes information on the presence of lir motifs in viral proteins. additionally, a curated text-mining analysis of the literature permitted us to predict functional lir motifs in viral proteins that have already been shown to associate with atg8-family members. the ilir@viral database is a web resource freely available to the academic community at http://ilir.uk/virus/. various functionalities are accessible under specific menus.
the 'classification' menu gives access to the complete list of putative lir motifs in viral proteins. two virus taxonomic systems have been used: the nomenclature used by the international committee on taxonomy of viruses (ictv) to name the species, genus and families of each of the viruses cited in the database, and the baltimore classification system that groups viruses depending on their genome and kind of replication (dsdna, dsrna, ssdna, ssrna and reverse-transcribing viruses) (see methods). 63, 64 for each specific family or group of viruses, the data are presented in a table containing (i) the clickable uniprotkb accession number of the protein, (ii) the information related to the lir-motif (position, sequence, pssm score, if the pattern is recognized as an xlir or wxxl motif and the presence of the motif in an intrinsically disordered region (anchor)), (iii) the name of the protein and (iv) the name of the species. for some genera, no putative lir-containing protein (lircp) could be identified; for accuracy in the classification, these are listed but appear in red (fig. 1) . the 'search' menu allows the user to look in the database for a specific protein or virus order, family, subfamily, genus, species or common name. uniprot identifiers can also be used in the search function. the blast (basic local alignment search tool) menu offers the user to search the database using a protein sequence as query against the reviewed set of viral proteins from uniprot database. we have used position-specific iterative (psi) blast to search against the database. 65 this search can be run against any of the viral classification systems described above. by default, the blast search runs against all the data available in the database with a default e-value 0.01; nevertheless, the user has the possibility to run the blast search against a specific category (baltimore classification) or order taxonomic rank (ictv classification), and define a different e-value. 
the results page displays the subject sequences from the database that match the query sequence. the lir patterns are highlighted with red asterisks. the menus 'bibliography' and 'help' provide users with additional information. finally, the 'links' menu gives access to other ilir web resources, including the ilir search tool and the ilir database for eukaryote model organisms. 27, 28

analysis of the content of the database

we have used the ilir web resource (https://ilir.warwick.ac.uk) to identify lircps in viruses (see methods for details). 27, 28 out of 16,609 reviewed viral sequences available from uniprot across 2569 individual viral species, we found that 15,589 contain either xlir or wxxl motifs. 6376 proteins contain xlir motifs whereas 15,460 contain wxxl motifs. 6247 proteins contain both xlir and wxxl motifs, whereas 129 proteins contain only xlir motifs (without containing wxxl patterns) and 9213 proteins contain only wxxl motifs (without xlir patterns) (table s1).

figure 1. screenshot of the ilir@viral database 'classification' menu. example for the ictv classification system. the genera for which no lircps were found appear in red.

we found a correlation between the total number of putative lir motifs identified in a family and the number of sequences (fig. 2a). on average, we found 8.3 lir motifs per sequence. the fact that viral sequences often refer to polyproteins instead of individual proteins is possibly an explanation for the high proportion of patterns identified. the ilir web resource can make the distinction between xlir and wxxl patterns. 27, 28 we noticed that a vast majority of the motifs identified in viral proteins correspond to the wxxl pattern (fig. 2b and table s1). the identification of putative lir motifs has been done for all the reviewed sequences for viral proteins. among the sequences sorted as putative lir-containing proteins, 1517 sequences belong to 188 species of bacteriophages (table s2).
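the protein counts reported above are mutually consistent, which can be confirmed with simple set arithmetic. the short python sketch below only re-adds the published numbers; the variable names are illustrative.

```python
# Consistency check of the reported LIRCP counts (numbers taken from the text).
both = 6247        # proteins with xLIR and WxxL motifs
only_xlir = 129    # proteins with xLIR motifs only
only_wxxl = 9213   # proteins with WxxL motifs only
reviewed = 16609   # reviewed viral sequences screened

with_xlir = both + only_xlir              # proteins carrying at least one xLIR
with_wxxl = both + only_wxxl              # proteins carrying at least one WxxL
any_motif = both + only_xlir + only_wxxl  # proteins carrying either motif type

fraction_with_motif = any_motif / reviewed

print(with_xlir, with_wxxl, any_motif)
print(f"{fraction_with_motif:.1%} of reviewed sequences carry a putative motif")
```

the three sums reproduce the totals quoted in the text (6376, 15,460 and 15,589), and roughly 94% of the reviewed sequences carry at least one putative motif.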
however, it is likely that these bacteriophage sequences correspond to false-positive hits, as bacteria do not have an autophagy process. using a hypergeometric test (see supplementary information), we compared the enrichment fold of proteins containing lir motifs (lircps) in bacteriophages with that in viruses infecting eukaryotes. we noticed that the enrichment fold for both xlir and wxxl patterns in lircps was higher in viruses infecting eukaryotes than in bacteriophages (table s2), which is in line with the fact that autophagy has been reported only in eukaryotes. we have previously identified all the verified and putative lir motifs in eukaryotic model organisms. 28 we also compared the enrichment of putative lir motifs in viruses infecting some of the model organisms (human, mouse, rat, and chicken) (table s3). two hypergeometric tests have been conducted to compare the enrichment fold of lircps in viruses against the host: one for proteins containing any combination of lir patterns (i.e., either xlir or wxxl, or both kinds) and another considering only the proteins containing at least one xlir motif (i.e., xlir patterns alone or along with one or more wxxl patterns). we observed that when all the possible combinations of lir patterns are taken into account, there is an enrichment of putative lir motifs in viruses compared with the host organism for all 4 model organisms tested. however, there is an enrichment of xlir patterns in the host compared with the infecting viruses. the lir motifs can be divided into 3 subtypes depending on the residue at the first position: w-, f- and y-types. 8 it has been shown that the f-type lir motif of mammalian ulk1 and atg13 has a preference for gabarap proteins, thus suggesting that the subtype of the lir motifs could be related to a specificity toward atg8-family proteins. 7 additionally, f-type and y-type lir motifs are mostly contained in selective autophagy adaptor proteins.
8 we thus analyzed the distribution of w-, f- and y-types of wxxl and xlir patterns at the viral order and family levels (table s4 and fig. 2c, d). we observed that 45% and 38% of the putative lir motifs matching the wxxl pattern are of f-type and y-type, respectively. w-type motifs are the least represented, with about 17% of the patterns (fig. 2c). a similar distribution could be observed for the putative xlir motifs, with a higher variability across families, probably due to the low representation of xlir motifs compared with wxxl patterns (fig. 2d). to assess the trustworthiness of our in silico screening for lircps in viruses, we compared our data with the already published viral proteins listed on the web resource viralzone as modulators of autophagy. 66 viralzone classifies 180 entries related to the activation of the host autophagy, and 163 entries linked to the inhibition of host autophagy. we found all these entries in our database. among viruses that inhibit autophagy, only 2 proteins have been shown to interact directly with mammalian atg8-family proteins: viral infectivity factor (vif) from hiv-1 binds to all atg8-family proteins, and matrix protein 2 (m2) from influenza was shown to bind to lc3.

figure 2. [...] (see also table s4). (d) distribution of the w-, f- and y-type of xlir patterns across the different families (see also table s4).

yet, influenza m2 protein is the only one that contains a lir motif that has been experimentally validated. 52, 62 other viral proteins listed in viralzone as being related to inhibition of the autophagy process are the neurovirulence factor icp34.5 and rna-binding protein us11 from human herpesviruses, the protein nef from hiv-1, and the protein trs1 from human cytomegalovirus. 49, 61, 62, 67 negative regulation of autophagy by icp34.5 and trs1 proteins depends on their ability to interact with becn1/beclin 1, 68-70 while us11 function has been linked to the protein kinase eif2ak2/pkr.
71 putative, or functional, lir motifs could be identified for all these proteins using the ilir web resource, except for the rna-binding protein us11. the protein m2 from influenza a virus is necessary and sufficient for the inhibition of the autophagic degradation of the virus by blocking the fusion between the autophagosomes and lysosomes. 49 these results were further confirmed and extended by beale and colleagues, who showed that the cytoplasmic tail of m2 interacts in a lir-dependent manner with lc3 and promotes the relocalization of lc3 at the plasma membrane. 62 we have identified a wxxl motif at positions 89 to 94 that has the highest pssm score (8 to 9) and corresponds to the one experimentally verified (fvsi). 27, 72 it is an f-type lir motif, and the fact that m2 protein has been shown to block the autophagosome-lysosome fusion suggests that it may act as an adaptor protein. we were also able to identify one xlir motif in the sequence of the accessory viral protein nef of the virus hiv-1 group m subtype b (strain 89.6). nef colocalizes with lc3 and becn1, and contributes to the inhibition of autophagic maturation, thus protecting the virus from elimination by autophagy. 61, 67 the proteins trs1 from human cytomegalovirus and neurovirulence factor icp34.5 from human herpesvirus (hsv) both interact with becn1 via a specific becn1-binding domain. this interaction is required for the inhibition of autophagosome maturation and fusion with the lysosomes. [68] [69] [70] 73 while ilir could detect several wxxl motifs in the trs1 sequence, a single one, in an intrinsically disordered region, was identified for icp34.5; its sequence (64-rqwlhv-69) is quite well conserved among 4 strains of hsv-1 and one strain of hsv-2. however, to date, there is no evidence of association between icp34.5 and lc3 proteins. we observed that 7% of the reviewed sequences that contain at least one putative lir motif correspond to the genome polyprotein from various viruses.
because polyproteins are processed co- and post-translationally by both host and viral proteases, we ran a systematic pubmed search with the terms 'name of the virus + autophagy', followed by 'name of the virus + lc3' to restrict the results as necessary. finally, we looked for papers (excluding reviews and commentaries) that specifically mention proteins derived from the processing of the viral genome polyprotein. our literature searching strategy pinpointed several nonstructural viral proteins; one of those was the protein ns1 from dengue viruses. studies have shown that ns1 protein from dengue virus type 2 (denv-2) colocalizes with lc3 and that denv-2 particles and autophagosomes travel together during viral infection. 58, 74 in contrast to denv-2, ns1 protein from denv-3 displays a low level of colocalization with lc3. 75 sequence alignment of ns1 proteins from denv-2 and denv-3 showed that they are highly conserved. however, checking for the presence of lir motifs revealed a discrepancy between them. we observed that denv-2 ns1 has an xlir motif (asfiev) that is not recognized in any denv-3 ns1 sequence due to the substitution of f by l, as well as an additional wxxl motif with a pssm score of 12 (rawnsl) that is absent in denv-3 (sl to vw) (fig. s1). it is possible that the absence of either the xlir or wxxl motifs (or both) in denv-3 is responsible for its lower affinity to lc3. other nonstructural proteins from different viruses interact with lc3 proteins. for instance, the nonstructural protein ns5a from hepatitis c virus colocalizes and can be coimmunoprecipitated with lc3 proteins when ectopically expressed in various hepatoma cell lines. 59, 76, 77 also, the viral peptide 2bc and the protein 3a encoded by the genome polyprotein from poliovirus type 1 interact with lc3-ii. 60, 78 all these proteins contain wxxl motifs. finally, we were able to identify proteins from zika virus that have only recently been related to autophagy.
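the effect of the denv-2 to denv-3 substitutions on motif recognition can be illustrated with a small pattern check. the sketch below is not the ilir algorithm: it only tests the two positions the text calls crucial (an aromatic w/f/y at position 3 and i/l/v at position 6 of a 6-residue window) and ignores the stricter xlir pattern and the pssm scoring. the denv-3 peptides are reconstructed here by applying the substitutions named in the text (f to l, sl to vw) and are therefore illustrative, as are the function and variable names.

```python
import re

# Relaxed 6-residue core pattern: x-x-[W/F/Y]-x-x-[I/L/V].
# ASSUMPTION: this encodes only the two crucial positions described in the
# text; it is not the xLIR regular expression or PSSM used by iLIR.
CORE_LIR = re.compile(r"(?=([A-Z]{2}[WFY][A-Z]{2}[ILV]))")


def find_core_lir(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start, 6-mer) for every overlapping core match."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in CORE_LIR.finditer(seq)]


peptides = {
    "denv-2 asfiev": "ASFIEV",
    "denv-3 asliev (f to l, reconstructed)": "ASLIEV",
    "denv-2 rawnsl": "RAWNSL",
    "denv-3 rawnvw (sl to vw, reconstructed)": "RAWNVW",
}

for label, peptide in peptides.items():
    print(label, find_core_lir(peptide))
```

under these assumptions both denv-2 peptides match, while each reconstructed denv-3 peptide fails at one of the two crucial positions: the aromatic residue is lost in asliev, and the final hydrophobic residue is lost in rawnvw.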
independent studies have shown that zika virus activates autophagy and that the formation of autophagosomes is crucial to the replication of the virus. [79] [80] [81] it appears that the nonstructural proteins ns4a and ns4b are responsible for the induction of autophagy in infected cells by inhibiting akt-mtor signaling, and both of them contain 3 wxxl motifs. 80, 82 very little is known about the relation between zika virus infection and autophagy modulation. we have found several proteins encoded by the zika genome polyprotein that contain lir motifs and that could be good candidates for experimental validation. autophagy is an evolutionarily conserved and highly regulated intracellular catabolic mechanism that is essential for maintaining homeostasis and coping with nutrient starvation. it is increasingly appreciated that autophagy can be highly selective, and that xenophagy, the selective autophagy of pathogens, is an important aspect of the immune response, which protects against infection. a vast array of viruses are associated with autophagy, and we have found several viral proteins containing putative lir motifs that are thought to interact with the autophagic machinery via atg8-family proteins. a continued research effort to better understand how these viral proteins interact with the autophagic machinery may provide therapeutic strategies and ultimately lead to the discovery of novel pharmacological agents to fight viral infections. protein sequences of all reviewed viral proteins were downloaded from the uniprot database, available at: http://www.uniprot.org/ [accessed 20 september 2016]. xlir and wxxl patterns for these proteins were identified using the approach suggested previously. 27, 28 for a given protein, information related to the start and end of the lir pattern, the actual lir sequence, the pssm (position-specific scoring matrix) score, similar lircps and the presence or absence of xlir and wxxl motifs in intrinsically disordered regions was obtained.
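The pattern-based part of the motif search described above can be sketched in a few lines of Python. This is only an illustration, not the ilir implementation: the two regular expressions are assumptions (the text does not spell them out), the function name `find_motifs` is hypothetical, and no pssm scoring or disorder prediction is attempted. The sketch reports 1-based start and end positions, in the same style as 64-rqwlhv-69.

```python
import re

# ASSUMPTION: six-residue patterns for the two motif classes; the text above
# names the classes (xlir, wxxl) but does not give the expressions themselves.
XLIR = re.compile(r"[ADEFGLPRSK][DEGMSTV][WFY][DEILQTV][ADEFHIKLMPSTV][ILV]")
WXXL = re.compile(r"..[WFY]..[LIV]")


def find_motifs(sequence, pattern):
    """Return (start, end, motif) for every six-residue window matching pattern.

    Positions are 1-based and inclusive. A sliding window is used so that
    overlapping motifs are all reported.
    """
    seq = sequence.upper()
    hits = []
    for i in range(len(seq) - 5):
        window = seq[i:i + 6]
        if pattern.fullmatch(window):
            hits.append((i + 1, i + 6, window))
    return hits


# Hexapeptides quoted in the text for dengue virus ns1.
print(find_motifs("ASFIEV", XLIR))  # denv-2 xlir motif
print(find_motifs("ASLIEV", XLIR))  # same window after the f to l substitution
print(find_motifs("RAWNSL", WXXL))  # denv-2 wxxl motif
print(find_motifs("RAWNVW", WXXL))  # same window after the sl to vw substitution
```

Under these assumed patterns, the two denv-2 hexapeptides match and the two substituted windows do not, which is the discrepancy the text reports between denv-2 and denv-3.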
the taxonomic lineage obtained from uniprot for all the reviewed viral proteins corresponds to the baltimore classification system. to convert between the baltimore and ictv classification systems, we matched the organism name for each reviewed sequence from uniprot with the organism names obtained from the ictv master species list 2015 v1, available at: https://talk.ictvonline.org/files/master-species-lists/m/msl/5945 [accessed 20 september 2016]. the differences between the classification systems were identified through a battery of sql queries. the details of these are attached in tables s5, s6 and s7. levenshtein distance (also known as edit distance) was used to compare the species names that belong to a particular genus and family between the two classification systems. levenshtein distance is defined as the minimal number of character substitutions, insertions or deletions required to transform one string into another. the levenshtein distance is symmetric and satisfies 0 ≤ d(x, y) ≤ max(|x|, |y|). here, 'x' and 'y' are 2 strings and d(x, y) is the distance between 'x' and 'y', defined as the minimal cost of the operations that transform 'x' into 'y'. the complexity of the algorithm is O(m × n), where n and m are the lengths of the 2 strings. 83 we used the perl extension for approximate matching (search.cpan.org, (2016). string-approx-3.27. retrieved from: http://search.cpan.org/cpan/authors/id/j/jh/jhi/string-approx-3.27.tar.gz). the closer the value of the levenshtein distance is to zero, the closer the species names are. using this approach, we could reliably justify the ictv classification of 14055 out of 15589 proteins loaded into the database. to share the information beyond our mysql (v5.6.33) dbms (database management system), we built a website using html, css, javascript and php (v5. 6 no potential conflicts of interest were disclosed.
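The name matching described above can be illustrated with a short, dependency-free Python sketch of the levenshtein distance. It is not the perl string-approx module used in the study; the function names `levenshtein` and `closest_name` are hypothetical, and the case-insensitive comparison is an assumption made for the example.

```python
def levenshtein(x, y):
    """Minimal number of single-character insertions, deletions or
    substitutions needed to turn x into y (O(m * n) time, O(min(m, n)) space)."""
    if len(x) < len(y):
        x, y = y, x  # the distance is symmetric, so keep the shorter string in the inner loop
    previous = list(range(len(y) + 1))  # distances from the empty prefix of x
    for i, cx in enumerate(x, start=1):
        current = [i]
        for j, cy in enumerate(y, start=1):
            current.append(min(
                previous[j] + 1,               # delete cx
                current[j - 1] + 1,            # insert cy
                previous[j - 1] + (cx != cy),  # substitute (free when equal)
            ))
        previous = current
    return previous[-1]


def closest_name(query, candidates):
    """Return (candidate, distance) for the candidate closest to query.

    ASSUMPTION: names are compared case-insensitively.
    """
    q = query.lower()
    return min(((c, levenshtein(q, c.lower())) for c in candidates),
               key=lambda pair: pair[1])


print(levenshtein("kitten", "sitting"))
print(closest_name("zika virus", ["Dengue virus", "Zika virus"]))
```

A distance of zero means the two names are identical; in a workflow like the one described, larger distances would flag names that need manual checking.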
the machinery of macroautophagy network organization of the human autophagy system atg8 family lc3/gabarap proteins are crucial for autophagosome-lysosome fusion but not autophagosome formation during pink1/parkin mitophagy and starvation role of the mammalian atg8/lc3 family in autophagy: differential and compensatory roles in the spatiotemporal regulation of autophagy lc3 and gate-16/gabarap subfamilies are both essential yet act differently in autophagosome biogenesis plekhm1 regulates autophagosome-lysosome fusion through hops complex and lc3/gabarap proteins atg8 family proteins act as scaffolds for assembly of the ulk complex: sequence requirements for lc3-interacting region (lir) motifs the lir motif -crucial for selective autophagy the lc3 interactome at a glance aggrephagy: selective disposal of protein aggregates by macroautophagy mechanisms of mitophagy lipophagy: selective catabolism designed for lipids the role of 'eat-me' signals and autophagy cargo receptors in innate immunity p62/sqstm1 binds directly to atg8/lc3 to facilitate degradation of ubiquitinated protein aggregates by autophagy structural basis for sorting mechanism of p62 in selective autophagy the autophagy-related protein kinase atg1 interacts with the ubiquitin-like protein atg8 via the atg8 family interacting motif to facilitate autophagosome formation structural basis of target recognition by atg8/lc3 during selective autophagy atg8-family interacting motif crucial for selective autophagy the structure of atg4b-lc3 complex reveals the mechanism of lc3 processing and delipidation during autophagy a role for nbr1 in autophagosomal degradation of ubiquitinated substrates nix is a selective autophagy receptor for mitochondrial clearance the tbk1 adaptor and autophagy receptor ndp52 restricts the proliferation of ubiquitin-coated bacteria nix directly binds to gabarap: a possible crosstalk between apoptosis and autophagy phosphorylation of the autophagy receptor optineurin restricts 
salmonella growth structural and functional analysis of the gabarap interaction motif hfaim: a reliable bioinformatics approach for in silico genome-wide identification of autophagy-associated atg8-interacting motifs in various organisms ilir: a web resource for prediction of atg8-family interacting proteins ilir database: a web resource for lir motif-containing proteins in eukaryotes autophagy in antiviral innate immunity autophagy in infection, inflammation and immunity autophagy protects against sindbis virus infection of the central nervous system image-based genome-wide sirna screen identifies selective autophagy factors pkr-dependent autophagic degradation of herpes simplex virus type 1 autophagydependent viral recognition by plasmacytoid dendritic cells replication-independent activation of human plasmacytoid dendritic cells by the paramyxovirus sv5 requires tlr7 and autophagy pathways production of interferon alpha by human immunodeficiency virus type 1 in human plasmacytoid dendritic cells is dependent on induction of autophagy antigen-loading compartments for major histocompatibility complex class ii molecules continuously receive input from autophagosomes autophagy enhances the presentation of endogenous viral antigens on mhc class i molecules during hsv-1 infection absence of autophagy results in reactive oxygen species-dependent amplification of rlr signaling autophagy facilitates ifngamma-induced jak2-stat1 activation and cellular inflammation hepatitis c virus upregulates beclin1 for induction of autophagy and activates mtor signaling peroxisomal protein pex13 functions in selective autophagy fam134b, the selective autophagy receptor for endoplasmic reticulum turnover, inhibits replication of ebola virus strains makona and mayinga sustained autophagy contributes to measles virus infectivity autophagy and the effects of its inhibition on varicella-zoster virus glycoprotein biosynthesis and infectivity host cell autophagy modulates early stages of 
adenovirus infections in airway epithelial cells the autophagy elongation complex (atg5-12/16l1) positively regulates hcv replication and is required for wild-type membranous web formation herpes simplex virus gamma34.5 interferes with autophagosome maturation and antigen presentation in dendritic cells matrix protein 2 of influenza a virus blocks autophagosome fusion with lysosomes coronavirus membrane-associated papain-like proteases induce autophagy through interacting with beclin1 to negatively regulate antiviral innate immunity human immunodeficiency virus type-1 infection inhibits autophagy hiv-1 viral infectivity factor interacts with microtubule-associated protein light chain 3 and inhibits autophagy irgm is a common target of rna viruses that subvert the autophagy network multi-layered control of galectin-8 mediated autophagy during adenovirus cell entry through a conserved ppxy motif in the viral capsid foot-and-mouth disease virus infection suppresses autophagy and nf-κb antiviral responses via degradation of atg5-atg12 by 3cpro berlioz-torrent c. lc3c contributes to vpu-mediated antagonism of bst2/tetherin restriction on hiv-1 release through a non-canonical autophagy pathway autophagy functions as an antiviral mechanism against geminiviruses in plants co-localization of constituents of the dengue virus translation and replication machinery with amphisomes replication of hepatitis c virus rna on autophagosomal membranes modification of cellular autophagy protein lc3 by poliovirus autophagy pathway intersects with hiv-1 biosynthesis and regulates viral yields in macrophages a lc3-interacting motif in the influenza a virus m2 protein is required to subvert autophagy and maintain virion stability taxonomy of viruses., king amq. 
virus taxonomy: classification and nomenclature of viruses: ninth report of the international committee on taxonomy of viruses expression of animal virus genomes gapped blast and psi-blast: a new generation of protein database search programs viralzone: a knowledge resource to understand virus diversity human immunodeficiency virus type 1 nef inhibits autophagy through transcription factor eb sequestration hsv-1 icp34.5 confers neurovirulence by targeting the beclin 1 autophagy protein the human cytomegalovirus protein trs1 inhibits autophagy via its interaction with beclin 1 analysis of the role of autophagy inhibition by two complementary human cytomegalovirus becn1/beclin 1-binding proteins the herpes simplex virus 1 us11 protein inhibits autophagy through its interaction with the protein kinase pkr analysis of the native conformation of the lir/ aim motif in the atg8/lc3/gabarap-binding proteins herpes simplex virus type i induces an incomplete autophagic response in human neuroblastoma cells single-virus tracking approach to reveal the interaction of dengue virus with autophagy during the early stage of infection a role for autophagolysosomes in dengue virus 3 production in hepg2 cells interferoninducible protein scotin interferes with hcv replication through the autolysosomal degradation of ns5a a functional role for ns5atp9 in the induction of hcv ns5a-mediated autophagy subversion of cellular autophagosomal machinery by rna viruses biology of zika virus infection in human skin cells zika virus ns4a and ns4b proteins deregulate akt-mtor signaling in human fetal neural stem cells to inhibit neurogenesis and induce autophagy zika virus infection induces mitosis abnormalities and apoptotic cell death of human neural progenitor cells anchor: web server for predicting protein binding regions in disordered proteins a guided tour to approximate string matching we would like to thank professor andrew easton (university of warwick) and professor keith leppard 
(university of warwick) for helpful discussions. this work is supported by bbsrc grants bb/l006324/1 and bb/p007856/1 awarded to i.p.n. key: cord-001866-s5otdtwq authors: mandal, nakul; lewis, geoffrey p.; fisher, steven k.; heegaard, steffen; prause, jan u.; la cour, morten; vorum, henrik; honoré, bent title: proteomic analysis of the vitreous following experimental retinal detachment in rabbits date: 2015-11-18 journal: j ophthalmol doi: 10.1155/2015/583040 sha: doc_id: 1866 cord_uid: s5otdtwq purpose. the pathogenesis of rhegmatogenous retinal detachment (rrd) remains incompletely understood, with no clinically effective treatment for potentially severe complications such as photoreceptor cell death and proliferative vitreoretinopathy. here we investigate the protein profile of the vitreous following experimental retinal detachment using a comparative proteomic based approach. materials and methods. retinal detachment was created in the right eyes of six new zealand red pigmented rabbits. sham surgery was undertaken in five other rabbits that were used as controls. after seven days the eyes were enucleated and the vitreous was removed. the vitreous samples were evaluated with two-dimensional polyacrylamide gel electrophoresis and the differentially expressed proteins were identified with tandem mass spectrometry. results. ten protein spots were found to be at least twofold differentially expressed when comparing the vitreous samples of the sham and retinal detachment surgery groups. protein spots that were upregulated in the vitreous following retinal detachment were identified as albumin fragments, and those downregulated were found to be peroxiredoxin 2, collagen-iα1 fragment, and α-1-antiproteinase f. conclusions. proteomic investigation of the rabbit vitreous has identified a set of proteins that help further our understanding of the pathogenesis of rhegmatogenous retinal detachment and its complications. 
rhegmatogenous retinal detachment (rrd) is characterized by the accumulation of subretinal fluid between the neurosensory retina and retinal pigment epithelium following the formation of a retinal break [1] . the pathogenesis of rrd is complex and incompletely understood, involving age-related and/or inherited structural and molecular changes of the vitreous extracellular matrix and vitreoretinal interface, and the process of posterior vitreous detachment [2] . the annual incidence of the condition has been estimated at 12.05 per 100,000, and although primary surgical reattachment is successful in the great majority of cases, photoreceptor cell death, growth factors as a result of rrd significantly contributes to the pathogenesis of pvr, though the basic cause as well as a clinically effective therapeutic approach for this condition remains elusive [5, 6] . proteomics studies proteins on a large scale in pursuit of a global and integrated view of disease processes at the protein level, which may potentially lead to the identification of novel biomarkers and therapeutic targets useful in clinical practice [7] [8] [9] [10] [11] . rrd would likely be associated with alterations in the proteomic profiles of both the retina and vitreous. indeed, we initially undertook the first such retinal study from which a number of potentially important proteins were identified [12] . the present study extends the proteomic investigation to the vitreous of this rabbit model of retinal detachment, building upon previous such analyses of human vitreous [13] [14] [15] [16] , in order to add further knowledge of the underlying pathophysiology [10, 17] . inferior retinal detachment was created in the right eyes of six new zealand red pigmented rabbits. the eyes were normal with no evidence of disease on examination. combined injections of xylazine (6.7 mg/kg) and ketamine (33.3 mg/kg) were administered intramuscularly to induce anesthesia and analgesia. 
the pupils were dilated with topical drops of atropine and tropicamide (1% solutions). a pipette tip, with an external diameter of approximately 100 μm, was inserted into the eye through a pars plana incision. sodium hyaluronate (healon, 0.25% in a balanced salt solution; pharmacia, piscataway, nj) was infused via a glass pipette between the neurosensory retina and retinal pigment epithelium. healon was necessary to prevent spontaneous retinal reattachment, and 0.25% is the most dilute solution that maintains the detachment for extended periods. approximately 50% of the retina beneath the medullary rays, which included the central retina, was detached (figure 1). sham surgery was performed in the right eyes of five other rabbits that were used as controls, which involved surgical entry of the vitreous cavity without disruption of the retina. scleral incisions were closed with 8-0 nylon suture. seven days postoperatively the animals were euthanized by the administration of sodium pentobarbital (120 mg/kg; butler schein, dublin, oh) and the eyes enucleated. after removal of the cornea and lens, the associated vitreous of the sham and detached retinas was extracted and immediately snap-frozen in liquid nitrogen within separate vials. there was no gross evidence of blood or other contamination of the vitreous samples at the time of tissue harvesting. the vitreous samples were stored at −80 °c until further use. all of the animal experiments undertaken in this study were in accordance with the standards of the national institutes of health animal care and use committee protocols, the arvo statement for the use of animals in ophthalmic and vision research, and the guidelines of the animal resource center, university of california, santa barbara. the rabbit vitreous samples were homogenized and dissolved in a lysis buffer containing 9 m urea, 2% (v/v) triton x-100, 2% (v/v) immobilized ph gradient (ipg) buffer (ph 3-10 nonlinear), and 2% (w/v) dithiothreitol (dtt).
the total protein content in each vitreous sample was determined with the non-interfering protein assay (calbiochem, san diego, ca). the protein samples were stored at −80 °c until further use. the extracted proteins were first fractionated by isoelectric focusing (ief) using ph 3-10 nonlinear 18 cm ipg strips (ge healthcare, chalfont st. giles, buckinghamshire, uk). the ipg strips were rehydrated for 20 h at room temperature in 200 μl lysis buffer each containing 20 μg protein from individual vitreous samples and 150 μl rehydration buffer (8 m urea, 2% (w/v) 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate (chaps), 0.3% (w/v) dtt, and 2% (v/v) ipg buffer), using the immobiline drystrip reswelling tray (ge healthcare). the ief was undertaken on a multiphor ii electrophoresis system (ge healthcare) at 500 v for 5 h and 3500 v in two steps for 5 h and 9.5 h in a gradient mode at 17 °c with the use of a multitemp iii thermostatic circulator (ge healthcare). before the second-dimension sodium dodecyl sulfate (sds) polyacrylamide gel electrophoresis (page), the ipg strips were equilibrated firstly for 10 min with gentle agitation in 20 ml of equilibration solution (0.6% (w/v) tris-hcl, ph 6.8, 6 m urea, 30% (v/v) glycerol, 1% (w/v) sds, and 0.05% (w/v) dtt) and secondly using 4.5% (w/v) iodoacetamide and bromophenol blue. the ipg strips were then transferred to 12% polyacrylamide gels for electrophoresis, which was performed at a maximum voltage of 50 v for approximately 20 h to separate the proteins vertically on the basis of molecular mass. staining. the two-dimensional (2d) gels were silver stained using a protocol optimized for protein identification with mass spectrometry [18]. in brief, the gels were fixed overnight in 50% (v/v) ethanol, 12% (v/v) acetic acid, and 0.0185% (v/v) formaldehyde. the gels were washed 3 times for 20 min in 35% (v/v) ethanol and pretreated for 1 min in 0.02% (w/v) na2s2o3·5h2o.
they were then rinsed in water and stained for 20 min in 0.2% (w/v) agno3 and 0.028% (v/v) formaldehyde. following further rinsing with water, development was undertaken for approximately 3 min in 6% (w/v) na2co3, 0.0185% (v/v) formaldehyde, and 0.0004% (w/v) na2s2o3·5h2o. the development was arrested in a fixative solution of 40% (v/v) ethanol and 12% (v/v) acetic acid. the 2d gels were then dried between cellophane sheets and sealed in plastic envelopes. silver stained 2d gels were scanned on a gs-710 calibrated imaging densitometer (bio-rad, hercules, ca) using the quantity one program (bio-rad), and the pdquest software (bio-rad) was used to define, quantify, and match the protein spots on each of the 2d gels. all well-defined protein spots that were at least twofold (mann-whitney test, p < 0.05) differentially expressed between the sham and retinal detachment vitreous groups were selected for identification with nano liquid chromatography-electrospray ionization tandem mass spectrometry (lc-ms/ms). the 2d gels were removed from their plastic envelopes and rehydrated in water. the selected protein spots were carefully excised from the gels with a scalpel and subjected to in-gel digestion with trypsin gold (mass spectrometry grade; promega, madison, wi). the peptide samples that were obtained were analyzed by lc-ms/ms as previously described [12]. in brief, peptides generated by trypsin digestion were separated on an inert nano-lc system (lc packings, san francisco, ca) connected to a q-tof premier mass spectrometer (waters, milford, ma). the masslynx 4 sp4 (waters) was used to obtain spectra and the raw data were processed using proteinlynx global server 2.1 (waters). the processed data were used to search the total part of the swiss-prot database using the online version of the mascot ms/ms ions search facility (matrix science, ltd.).
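The spot-selection rule stated above (at least twofold change together with a significant mann-whitney test) can be sketched as follows. This is an illustration under stated assumptions, not the pdquest procedure: fold change is taken here as the ratio of group mean intensities, intensities are assumed to be positive, the p-value is an exact two-sided permutation p-value of the rank sum (feasible for groups of five and six gels), and the intensity values in the example are made up.

```python
from itertools import combinations
from statistics import mean


def mann_whitney_exact(a, b):
    """Exact two-sided p-value for the rank-sum (mann-whitney) test.

    Ties receive midranks; the null distribution is obtained by enumerating
    every way of assigning len(a) of the pooled ranks to the first group.
    """
    pooled = sorted(a + b)
    rank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    ranks = [rank[v] for v in a + b]
    n_a = len(a)
    expected = n_a * (len(ranks) + 1) / 2      # rank sum of group a under the null
    deviation = abs(sum(ranks[:n_a]) - expected)
    extreme = total = 0
    for subset in combinations(range(len(ranks)), n_a):
        total += 1
        if abs(sum(ranks[k] for k in subset) - expected) >= deviation - 1e-9:
            extreme += 1
    return extreme / total


def is_differential(sham, detached, fold=2.0, alpha=0.05):
    """True when a spot changes at least `fold`-fold (up or down) with p < alpha.

    ASSUMPTION: fold change is the ratio of group means of positive intensities.
    """
    ratio = mean(detached) / mean(sham)
    changed = ratio >= fold or ratio <= 1.0 / fold
    return changed and mann_whitney_exact(sham, detached) < alpha


# Hypothetical spot intensities for 5 sham and 6 detachment gels.
sham = [10, 11, 12, 13, 14]
detached = [30, 31, 32, 33, 34, 35]
print(mann_whitney_exact(sham, detached), is_differential(sham, detached))
```

With group sizes of five and six there are only 462 possible rank assignments, so exhaustive enumeration is cheap and avoids relying on a normal approximation.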
the search was undertaken with doubly and triply charged ions with up to two missed cleavages, a peptide tolerance of 50 ppm, one variable modification, carbamidomethyl-c, and a ms/ms tolerance of 0.05 da. contaminating peptides such as trypsin, keratin, bovine serum albumin, and all peptides originating from previous samples were disregarded. at least one "bold red" (matrix science ltd., http://www.matrixscience.com/) peptide match was required in the search for protein hits. individual peptide ions scores above approximately 36 indicated identity or extensive homology giving a less than 5% probability that the observed match was a random event. all peptides for the protein hits are reported (table 1). blotting. in each case three micrograms of vitreous sample protein was separated on novex 10-20% gradient tris-glycine polyacrylamide gels (invitrogen corporation, carlsbad, ca) and subsequently transferred to nitrocellulose hybond-c extra membranes (ge healthcare). the membranes were blocked overnight with 5% skimmed milk in 80 mm na2hpo4, 20 mm nah2po4, 100 mm nacl, and 0.05% tween 20 buffer, ph 7.5. membranes were incubated with anti-albumin (genway biotech, ca, usa; 1 : 5000) and anti-peroxiredoxin 2 (abcam, cambridge, uk; 1 : 200). no suitable antibodies were commercially available for the rabbit f isoform of α-1-antiproteinase or the rabbit collagen-iα1 fragment that was identified with lc-ms/ms. following washing, the membranes were further incubated with appropriate horseradish peroxidase-conjugated secondary antibodies: p0163 sheep and p0260 mouse (both 1 : 1000; dako, glostrup, denmark). proteins were visualized with the enhanced chemiluminescence system (ge healthcare) and imaging system (fujifilm las-3000, tokyo, japan). up to approximately 340 protein spots were clearly resolved on each of the 11 2d gels.
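To make the two mass tolerances quoted in the search parameters above concrete (50 ppm for peptides, 0.05 da for ms/ms fragments), the following sketch shows how a relative (ppm) and an absolute (da) tolerance translate into accept or reject decisions. The function names and the example masses are hypothetical, and the code does not reproduce how the mascot search engine applies its tolerances internally.

```python
def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def peptide_within_tolerance(observed, theoretical, tol_ppm=50.0):
    """Relative tolerance: the allowed window scales with the peptide mass."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


def fragment_within_tolerance(observed, theoretical, tol_da=0.05):
    """Absolute tolerance: the allowed window is the same at every mass."""
    return abs(observed - theoretical) <= tol_da


# Hypothetical peptide with a theoretical mass of 1500.00 da:
# 50 ppm corresponds to a window of about +/- 0.075 da at this mass.
print(ppm_error(1500.05, 1500.00))                 # about 33 ppm, accepted
print(peptide_within_tolerance(1500.05, 1500.00))
print(peptide_within_tolerance(1500.10, 1500.00))  # about 67 ppm, rejected
```

The same 0.05 da deviation that passes a 50 ppm filter at 1500 da would fail it at 800 da, which is why relative tolerances are usually quoted for precursor masses.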
ten protein spots were found to be significantly and at least twofold differentially expressed between the sham and detachment vitreous groups (figure 2). three protein spots were upregulated and seven spots were downregulated. of the three upregulated protein spots, two were identified as fragments of albumin (spots 5104 and 6101), whilst spot 6205 could not be identified. four of the seven downregulated protein spots were identified as a fragment of collagen-iα1 (spot 0503), α-1-antiproteinase f (spots 0703 and 1707), and peroxiredoxin 2 (spot 0705). protein spots 0102, 0815, and 1302 could not be identified (figure 2; table 1). western blotting developed with anti-albumin showed a heavy band at approximately 60 kda, which is likely to represent the full length protein, whilst multiple bands below this suggest the presence of several fragments, some of which may correspond with those identified with the 2d-page analysis (figure 3, left). peroxiredoxin 2 has a deduced molecular mass of approximately 22 kda. however, spot 0705 containing peroxiredoxin 2 migrates with a molecular mass around 60 kda with 2d-page (figure 2), and this size was verified by western blot analysis (figure 3, right). though a single and specific band was achieved with anti-peroxiredoxin 2, western blotting could not be reliably used for quantification due to a weak signal near the detection limit and variable background reaction. analysis with 2d-page revealed fragments of albumin to be upregulated in the vitreous following retinal detachment. albumin is the most abundant protein in plasma, aqueous, and vitreous humor, where in the latter it constitutes around 60-70% of total protein [19] [20] [21] . serum proteins such as albumin are present in the aqueous and vitreous humor at a relatively lower level compared to the vascular circulation from where they may have in part originated [21, 22] .
western blot analysis showed an intense band at approximately 60 kda corresponding with the full length albumin protein, with multiple lower molecular mass bands that are likely its fragments. increase of albumin and its fragments may signify increased proteolysis and the passage of albumin into the vitreous. indeed, the breakdown of the blood-retinal barrier that occurs with retinal detachment has also been implicated in the increase of other such proteins in the vitreous [12, [23] [24] [25] [26] [27] [28] . it is also possible that albumin in the vitreous may arise from de novo synthesis in the retina, similar to the reported increased gene and protein expression of albumin in the corneal epithelium during wound healing [29, 30] . extraocular albumin is known to have diverse and important functions, which include maintenance of colloid osmotic pressure, transport of biomolecules, and inactivation of toxins through intermolecular binding [31, 32] . albumin can also act as an antioxidant by scavenging reactive oxygen species and sequestration of metal ions and has anti-inflammatory and apoptotic regulatory abilities [32] [33] [34] [35] . vitreal albumin has been proposed to transport long chain fatty acids into the lens for biosynthesis of lenticular lipids [21, 31, 36] . indeed, albumin is likely to have many such important roles in the eye, which requires further investigation. in the present study we observed peroxiredoxin 2 to have a molecular mass above 60 kda using both 2d-page and western blot analyses. however, the predicted molecular mass of the peroxiredoxin family of proteins is approximately 22 kda-31 kda. this variation may represent the well-studied property of these proteins to undergo oligomerization, which can be promoted by a number of factors including overoxidation of cysteine residues of peroxiredoxin [37] . 
although the present experiments were conducted in standard reducing conditions that aim to break cysteine bonds, we obtained a band well above 20 kda. this is in keeping with another study, which also showed some peroxiredoxin 2 western blot bands appearing at molecular mass much higher than 20 kda that was suggested to result from oligomerization or posttranslational modification [38] . our finding could also represent a novel alternative splicing variant of peroxiredoxin 2, as reported for peroxiredoxin 5 [39] . the peroxiredoxins are a group of ubiquitous antioxidant proteins that currently comprise six members in mammals [40] . these proteins are primarily found at high levels intracellularly, mainly within the cytosol, but are also present in the mitochondria, peroxisomes, and nuclei, and they may be exported [37] . furthermore, presence of peroxiredoxin 2 has been shown in plasma, not only as a result of hemolysis but also possibly by secretion from the t lymphocytes [41, 42] . these multifunction enzymes act as antioxidants by using redox active cysteines for the reduction and degradation of hydrogen peroxide, peroxynitrite, and organic hydroperoxides [37, 43] . oxidative stress is thought to result from an imbalance between reactive oxygen species production and antioxidant ability and is recognized to be an important factor in the pathogenesis of a number of age-related and neurodegenerative diseases, which include age-related cataract, age-related macular degeneration, glaucoma, diabetic retinopathy, retinal detachment, and pvr [44] [45] [46] [47] . indeed, the present study showed a decrease in the vitreal levels of peroxiredoxin 2 following retinal detachment. this may be in keeping with reported reductions in the levels of other members of the antioxidant defense system such as glutathione and ascorbic acid both in vitreal and in blood samples of patients suffering from pvr [44, 45] .
furthermore, apart from their role as antioxidants, the peroxiredoxins can affect a diverse range of biological processes that include cellular proliferation, differentiation, and apoptosis by influencing signal transduction pathways that employ hydrogen peroxide as a secondary messenger [43, 48] . recent studies on tears from patients with glaucoma have also identified peroxiredoxin 1 as having a possible involvement in inflammation [49, 50] . indeed, peroxiredoxin 2 and other members of this family of proteins are liable to have a significant role in the pathophysiology of retinal detachment. a fragment of collagen-iα1 was identified in the vitreous of the rabbit; however, type i collagen has not previously been identified as a natural component of the mammalian vitreous and is rather known to be a constituent of early pvr membranes [51] [52] [53] and retinal blood vessels [54, 55] . a mixture of type ii, ix, and v/xi hybrid collagen fibrils, which are separated out mainly by water and ions attracted to hyaluronan, characterizes the vitreous body [56] . collagen, possibly with the aid of adhesive-like intermediate molecules, may provide the basis of vitreoretinal adhesion by connecting the vitreous with the retinal inner limiting membrane (ilm). this attachment is extremely strong in the vitreous base since the fibrils pass through the ilm to merge into underlying collagen networks and crypts [2, 56, 57] . collagen is also a significant component of both epiretinal and subretinal pvr membranes [53, 58] , and type i collagen is recognized to be a principal constituent during their early development [51] [52] [53] . the presence of collagen in the subretinal space, a place normally devoid of this protein, suggests that certain cells, particularly the rpe and müller cells associated with membranes, are able to synthesize collagen under certain pathological conditions such as retinal detachment and pvr [58] [59] [60] [61] .
however, the present analysis suggests that a collagen-iα1 fragment is present in sham vitreous, and that its concentration decreases following retinal detachment, which may indicate perturbed proteolytic activity. matrix metalloproteinases (mmp) and other proteolytic enzymes that are able to degrade and remodel vitreal collagen have been found to be increased with rrd and pvr [62] [63] [64] [65] , which could be in keeping with the decrease in α-1-antiproteinase shown in the present study. further studies will be necessary to confirm the source and nature of collagen-iα1 in the vitreous and the possible mechanisms of collagen fragmentation that may be an important feature of vitreous liquefaction and rrd [2, 62, 66] . alpha-1-antiproteinase. 2d-page showed α-1-antiproteinase (also called α-1-antitrypsin or α-1-proteinase inhibitor) at two closely positioned spots, which were largely in keeping with their predicted molecular mass but differed in their charge. currently, four isoforms of α-1-antiproteinase have been identified in the rabbit, termed f, s1, s2, and e, which is a similar picture to the multiple variants identified in humans [67, 68] . alpha-1-antiproteinase is an acute phase protein and archetypal member of the superfamily of serine protease inhibitors (serpin), which are involved in a wide range of biological processes that includes inflammation, angiogenesis, blood coagulation, ecm remodeling, and tumor suppression [69] . this protein has the ability to inhibit a large number of serine proteases, though its principal target is neutrophil elastase [68] . indeed, α-1-antiproteinase originally received much attention because its deficiency increases the risk of a variety of clinical conditions, such as chronic obstructive pulmonary disease, which can result from unrestrained elastase activity. we found the f isoform of rabbit α-1-antiproteinase to be downregulated in the vitreous following retinal detachment.
the f isoform of α-1-antiproteinase is the only one of the rabbit isoforms so far identified that has been shown to have the oxidizable methionine residue site that is present in human α-1-antiproteinase [67]. the oxidation of methionine to methionine sulfoxide, which can occur during episodes of inflammation as a result of oxygen free radicals secreted by leucocytes, has an inhibiting effect upon α-1-antiproteinase function. this process is thought to enhance the ability of proteinases such as elastase to locally degrade tissue debris that occurs at sites of inflammation [67, 68]. alpha-1-antiproteinase is primarily produced in the liver and circulated to the rest of the body tissues via the blood; however, extrahepatic sites of its synthesis have been identified, which include blood monocytes, alveolar macrophages, bronchial and gastrointestinal epithelial cells, and the cornea [70-73]. the protein has also been localized to the tear film, aqueous humor, and vitreous; in the vitreous, a phosphorylated form of α-1-antiproteinase has been suggested as a potential biomarker of idiopathic macular hole and rhegmatogenous retinal detachment [73-75]. it has been postulated that one of the main functions of corneal α-1-antiproteinase is to protect against the damaging effects of neutrophil elastase produced during corneal inflammation [73], and it may be expected that a similar role, in addition to others, is applicable to vitreal α-1-antiproteinase, though this requires further investigation. this proteomic investigation of the rabbit vitreous has identified a set of proteins that assist our understanding of the pathogenesis of rhegmatogenous retinal detachment and its complications. further studies will be necessary to clarify the role of these proteins.
certain proteins, such as those of low abundance and at the extremes of molecular mass, together with membrane proteins, can be difficult to resolve and detect using the 2d-page technique. therefore, complementary proteomic methods such as gel-free mass spectrometry should be considered in future work in order to help address these limitations. the authors alone are responsible for the content and writing of the paper.
recent trends in the management of rhegmatogenous retinal detachment
pathogenesis of rhegmatogenous retinal detachment: predisposing anatomy and cell biology
the epidemiology and socioeconomic associations of retinal detachment in scotland: a two-year prospective population-based study
cellular effects of detachment and reattachment on the neural retina and the retinal pigment epithelium
epithelial-mesenchymal transition and proliferation of retinal pigment epithelial cells initiated upon loss of cell-cell contact
a novel strategy to develop therapeutic approaches to prevent proliferative vitreoretinopathy
ocular proteomics with emphasis on two-dimensional gel electrophoresis and mass spectrometry
proteomics of uveal melanomas suggests hsp-27 as a possible surrogate marker of chromosome 3 loss
proteomic analysis of human vitreous associated with idiopathic epiretinal membrane
analytical platforms in vitreoretinal proteomics
dye-free porcine model of experimental branch retinal vein occlusion: a suitable approach for retinal proteomics
protein changes in the retina following experimental retinal detachment in rabbits
proteome profiling of vitreoretinal diseases by cluster analysis
vitreous proteomic analysis of proliferative vitreoretinopathy
elucidation of the pathogenic mechanism of rhegmatogenous retinal detachment with proliferative vitreoretinopathy by proteomic analysis
proteomic analyses of the vitreous humour
translational vitreous proteomics
improved silver staining protocols for high sensitivity protein identification using matrix-assisted laser desorption/ionization-time of flight analysis
components of the fibrinolytic system in the vitreous body in patients with vitreoretinal disorders
proteome analysis of human vitreous proteins
role of albumin as a fatty acid carrier for biosynthesis of lens lipids
shifting the paradigm of the blood-aqueous barrier
blood-aqueous barrier breakdown associated with rhegmatogenous retinal detachment
risk factors for proliferative vitreoretinopathy
levels of pentosidine in the vitreous of eyes with proliferative diabetic retinopathy, proliferative vitreoretinopathy and retinal detachment
elevation of vitreous leptin in diabetic retinopathy and retinal detachment
immunoglobulins in paired specimens of vitreous and subretinal fluids from patients with rhegmatogenous retinal detachment
experimental retinal detachment. ix. aqueous, vitreous, and subretinal protein concentrations
elevated albumin in retinas of monkeys with experimental glaucoma
changes in albumin precursor and heat shock protein 70 expression and their potential role in response to corneal epithelial wound repair
all about albumin: biochemistry, genetics, and medical applications
albumin: biochemical properties and therapeutic potential
the anti-apoptotic activity of albumin for endothelium is mediated by a partially cryptic protein domain and reduced by inhibitors of g-coupled protein and pi-3 kinase, but is independent of radical scavenging or bound lipid
mitochondria are the major targets in albumin-induced apoptosis in proximal tubule cells
albumin is a major serum survival factor for renal tubular cells and macrophages through scavenging of ros
in vivo passage of albumin from the aqueous humor into the lens
structure, mechanism and regulation of peroxiredoxins
peroxiredoxin 2 and peroxidase enzymatic activity of mammalian spermatozoa
peptides with dual binding specificity for hla-a2 and hla-e are encoded by alternatively spliced isoforms of the antioxidant enzyme peroxiredoxin 5
cloning of bovine peroxiredoxins-gene expression in bovine tissues and amino acid sequence comparison with rat, mouse and primate peroxiredoxins
plasma proteome of severe acute respiratory syndrome analyzed by two-dimensional gel electrophoresis and mass spectrometry
advances in our understanding of peroxiredoxin, a multifunctional, mammalian redox protein
peroxiredoxins, oxidative stress, and cell proliferation
interleukin-8, nitric oxide and glutathione status in proliferative vitreoretinopathy and proliferative diabetic retinopathy
determination of ascorbic acid in human vitreous humor by high-performance liquid chromatography with uv detection
vitreous levels of oxidative stress biomarkers and the radical-scavenger α1-microglobulin/a1m in human rhegmatogenous retinal detachment
activation of signaling pathways and stress-response genes in an experimental model of retinal detachment
peroxiredoxin evolution and the regulation of hydrogen peroxide signaling
differential protein expression in tears of patients with primary open angle and pseudoexfoliative glaucoma
shotgun proteomics reveals specific modulated protein patterns in tears of patients with primary open angle glaucoma naïve to therapy
proliferative vitreoretinopathy: pathobiology, surgical management, and adjunctive treatment
promotion of adhesion and migration of rpe cells to provisional extracellular matrices by tnf-α
proliferative vitreoretinopathy membranes. an immunohistochemical study
collagens and collagen-related matrix components in the human and mouse eye
retinal microvessel extracellular matrix: an immunofluorescent study
adult vitreous structure and postnatal changes
structure and composition of the inner limiting membrane of the retina. sem on frozen resin-cracked and enzyme-digested retinas of macacca mulatta
collagen formation by periretinal cellular membranes
matrix and the retinal pigment epithelium in proliferative retinal disease
variation in epiretinal membrane components with clinical duration of the proliferative tissue
human retinal müller cells synthesize collagens of the vitreous and vitreoretinal interface in vitro
enzymatic breakdown of type ii collagen in the human vitreous
correlation of the extent and duration of rhegmatogenous retinal detachment with the expression of matrix metalloproteinases in the vitreous
a prospective study of matrix metalloproteinases in proliferative vitreoretinopathy
correlation of matrix metalloproteinase levels with the grade of proliferative vitreoretinopathy in the subretinal fluid and vitreous during rhegmatogenous retinal detachment
age-related liquefaction of the human vitreous body: lm and tem evaluation of the role of proteoglycans and collagen
various forms of rabbit plasma α-1-antiproteinase
mammalian α1-antitrypsins: comparative biochemistry and genetics of the major plasma serpin
the serpins are an expanding superfamily of structurally similar but functionally diverse proteins. evolution, mechanism of inhibition, novel functions, and a revised nomenclature
expression of the alpha 1-proteinase inhibitor gene in human monocytes and macrophages
biosynthesis of α1-proteinase inhibitor by human lung-derived epithelial cells
regulation of α1-proteinase inhibitor release by proinflammatory cytokines in human intestinal epithelial cells
corneal synthesis of α1-proteinase inhibitor (α1-antitrypsin)
alpha-1-antitrypsin in aqueous humour from patients with corneal allograft rejection
identification of phosphotyrosyl proteins in vitreous humours of patients with vitreoretinal diseases by sodium dodecyl sulphate-polyacrylamide gel electrophoresis/western blotting/matrix-assisted laser desorption time-of-flight mass spectrometry
the authors appreciate the excellent technical support pro the authors report no conflict of interests.
key: cord-013348-lsksys56 authors: goto, keiko; yamaoka, yutaro; khatun, hajera; miyakawa, kei; nishi, mayuko; nagata, noriko; yanaoka, toshikazu; kimura, hirokazu; ryo, akihide title: development of monoclonal antibodies and antigen-capture elisa for human parechovirus type 3 date: 2020-09-19 journal: microorganisms doi: 10.3390/microorganisms8091437 sha: doc_id: 13348 cord_uid: lsksys56 human parechovirus type 3 (hpev3) is an etiologic agent of respiratory diseases, meningitis, and sepsis-like illness in both infants and adults. monoclonal antibodies (mabs) can be a promising diagnostic tool for antigenic diseases such as virus infection, as they offer a high specificity toward a specific viral antigen. however, to date, there is no specific mab available for the diagnosis of hpev3 infection. in this study, we developed and characterized mabs specific for hpev3 capsid protein vp0. we used cell-free, wheat germ-synthesized viral vp0 protein for immunizing balb/c mice to generate hybridomas.
from the resultant hybridoma clones, we selected nine clones producing mabs reactive to the hpev3-vp0 antigen, based on enzyme-linked immunosorbent assay (elisa). epitope mapping showed that these mabs recognized three distinct domains in hpev3 vp0. six mabs recognized hpev3 specifically and the other three mabs showed cross-reactivity with other hpevs. using the hpev3-specific mabs, we then developed an elisa for viral antigen detection that could be reliably used for laboratory diagnosis of hpev3. this elisa system exhibited no cross-reactivity with other related viruses. our newly developed mabs would, thus, provide a useful set of tools for future research and ensure hpev3-specific diagnosis. human parechoviruses (hpevs) belong to the parechovirus genus of the picornaviridae family [1, 2]. the first two serotypes (hpev1 and hpev2) were initially isolated in 1956 from children with diarrhea, and were assigned to the enterovirus genus. however, it became evident that they were genetically distinct from the enteroviruses, and they were reclassified into the parechovirus genus in 1999. at present, 19 hpevs have been reported; they are categorized on the basis of the nucleic acid sequence of the vp1 gene rather than by the serotype classification used for enteroviruses. hpev3 was first identified in japan, where it was isolated from a stool sample provided by a 1-year-old infant who was experiencing fever, gastritis-like symptoms, and transient lower extremity paralysis [3]. hpev types 1 to 8 are the most commonly identified strains; among them, hpev types 1, 3, and 6 account for the majority of infectious strains worldwide [4]. infection with hpevs is associated with a broad spectrum of clinical manifestations, ranging from respiratory symptoms and mild gastrointestinal illness to sepsis-like diseases, meningitis, and encephalitis in children [5].
while most hpevs cause mild symptoms in children between 1 and 5 years of age, human parechovirus 3 (hpev3) is clinically the most important genotype, owing to its association with severe disease in infants under 3 months of age [6-8]. hpev3 infection in infants can trigger a sepsis-like dysregulated host response involving the central nervous system [9-12]. in cases of acute meningitis or encephalitis, patients might develop abnormal white matter lesions and neurological sequelae, and even death might occur [13-17]. apnea can occur in children regardless of encephalitis [17]. hpev3 is known to cause myalgia and myositis in adult patients, and a similar pattern is also sporadically seen among pediatric patients [18, 19]. an epidemic of hpev3 occurs every 3 to 4 years in japan. as respiratory disease or meningitis cases due to hpev3 are not subject to notifiable disease surveillance in japan, the actual number of patients is not known [1, 6, 20-23]. an appropriate diagnostic tool for hpev3 detection could rule out other infectious etiologies and avoid the unnecessary antibiotic use that typically follows because hpev3 causes septic shock-like symptoms. for these reasons, establishing a method to detect hpev3 is vital in healthcare settings. importantly, hpev3 epidemics occur in summer, concurrently with enteroviral infections. thus, it is essential to develop a detection method that does not cross-react with enterovirus. hpev3 is a small, non-enveloped virus with a single-stranded, positive-sense rna genome of approximately 7.3 kb [24, 25]. the hpev3 virion is composed of 60 copies of three structural proteins (vp0, vp1, and vp3) that fit together to form a 28-nm-diameter icosahedral shell around the viral genome [26].
the genome encodes a single polyprotein that, during infection, is subsequently cleaved into all essential capsid components and non-structural proteins [24]. vp0 is an important protein for stabilizing the surface of the viral capsid, and the assembly of hpev is controlled by multiple interactions of the genome with the capsid, through conserved amino acids in vp1 and vp3 [25]. although rt-pcr-based diagnostic tests targeting the 5′-utr of the hpev3 genome have been developed for hpev3 detection in clinical samples, there is currently no diagnostic method for detecting the viral antigens. recently, chen et al. generated polyclonal antibodies against hpev3 vp0 and proposed an immunofluorescence-based diagnostic assay [27]. however, this method requires virus isolation by cell culture and takes several weeks to identify the viral genotype/serotype. abed et al. developed a serological enzyme-linked immunosorbent assay (elisa), using a synthetic peptide from the vp0 protein of hpevs [28]. although it can provide a definitive diagnosis, serological testing requires paired serum samples from the acute and recovery phases, which makes it unsuitable for immediate diagnosis as point-of-care testing (poct). to develop a rapid and effective diagnostic strategy, there is an urgent need to produce highly specific monoclonal antibodies (mabs) toward hpev3 antigens. in this study, we sought to generate mabs specific to the capsid protein vp0 of hpev3. to immunize mice, we prepared the viral vp0 antigen using the wheat germ cell-free system, which has the advantage of producing properly folded functional proteins [29, 30]. as a result, we obtained nine mab clones for characterization, and thereafter generated an elisa system that specifically detects the hpev3 vp0 antigen. complementary dna encoding hpev3-vp0 (genbank no. ab084913) was used to generate the expression vector for antigen production with the wheat germ cell-free system.
the hpev3-vp0 open reading frame was amplified by pcr, using the corresponding primer pairs. the amplified fragment was cloned into vector peu-e01-his-tev-mcs-n2 (cellfree sciences, yokohama, japan), using restriction enzymes xhoi and spei. in vitro wheat germ cell-free protein synthesis was carried out as previously described [29, 30] . for cell-free protein synthesis, wepro7240h wheat extract (cellfree sciences, yokohama, japan) was used in the bilayer translation reaction, as previously described. synthesized proteins were confirmed by immunoblotting. the his-hpev3-vp0 (full length, 1-298) protein was synthesized using a proteomist xe robotic protein synthesizer (cellfree sciences, yokohama, japan) for mouse immunization. the cell-free translation reaction mixture was separated into soluble and insoluble fractions by centrifugation at 18,000× g for 15 min. the soluble fraction was mixed with ni-sepharose high performance beads (ge healthcare, waukesha, wi, usa) in the presence of 20 mm imidazole. the beads were washed thrice with a washing buffer [20 mm tris-hcl (ph 7.5), 500 mm nacl] containing 40 mm imidazole. his-hpev3-vp0 was then eluted in another washing buffer containing 500 mm imidazole. amicon ultra centrifugal filters (millipore, bedford, ma, usa) were used to concentrate the purified his-hpev3-vp0. protein concentration was determined using the bradford method, with bovine serum albumin (bsa) as a protein standard. immunization of balb/c mice and generation of anti-hpev3-vp0 mab-producing hybridomas were carried out as previously described [29, 30] . briefly, his-tagged full-length hpev3-vp0 protein was injected into the footpad of the balb/c mice, using keyhole limpet hemocyanin as an adjuvant. four weeks later, the spleen cells were isolated and fused to the myeloma cell line, sp2/o, using polyethylene glycol 1500 (peg 1500). 
monoclonal antibodies in the hybridoma culture supernatant were tested using elisa with his-tagged recombinant hpev3-vp0 protein. isotype determination was performed using the isostrip mouse monoclonal antibody isotyping kit (roche diagnostics, basel, switzerland), following the manufacturer's instructions. vero cells were grown in dmem containing 10% fbs. hpev3 was provided by dr. masaki takahashi (iwate prefectural institute of public health). hpev3 was propagated in vero cells and quantified by qrt-pcr. the primer and probe sequences were as follows: 5′-gtaacaswwgcctctgggsccaaaag-3′ (forward primer), 5′-ggccccwgrtcagatccayagt-3′ (reverse primer), and 5′-vic-cctrygggtacctycwgggcatccttc-bhq-3′ (probe). wheat germ-synthesized recombinant his-tagged hpev3-vp0 proteins were separated by 10% sds-page in running buffer (250 mm glycine, 25 mm tris, 0.1% sds). the separated proteins were transferred to a pvdf membrane (millipore). the membranes were washed with blotting buffer tbst (tbs containing 0.05% tween 20), and then blocked for 1 h at room temperature in 5% non-fat powdered milk in tbst. thereafter, the membranes were incubated overnight at 4 °c with hybridoma supernatant (1:50 dilution in tbst). next, after washing thrice in tbst, the membranes were incubated for 1 h at room temperature with anti-mouse igg-hrp secondary antibody (1:10,000 dilution in tbst). finally, after washing thrice in tbst, the target protein was detected with the immobilon western chemiluminescence detection system (ge healthcare) using fluor chem fc2 (alpha innotech corp. tokyo, japan). for epitope mapping, we prepared deletion mutants of hpev3-vp0 using pcr mutagenesis with template vector peu-his-hpev3-vp0, followed by wheat germ cell-free protein synthesis. the proteins were analyzed by immunoblotting using our generated mabs.
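the qrt-pcr primers and probe given above contain iupac ambiguity codes (s, w, r, y), so each oligo is in practice a pool of closely related sequences. the following python sketch is an illustration only and is not part of the original protocol: it counts and lists the unambiguous sequences each degenerate oligo encodes. it assumes the standard iupac nucleotide code table and treats the space printed inside the forward primer as an extraction artifact; the function names are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at",
    "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg",
    "n": "acgt",
}


def degeneracy(oligo):
    """Return how many unambiguous sequences a degenerate oligo encodes."""
    count = 1
    for base in oligo.lower():
        count *= len(IUPAC[base])
    return count


def expand(oligo):
    """Return every unambiguous sequence encoded by a degenerate oligo."""
    pools = [IUPAC[base] for base in oligo.lower()]
    return ["".join(bases) for bases in product(*pools)]


# Sequences as listed in the methods (forward primer written without the space).
forward = "gtaacaswwgcctctgggsccaaaag"
reverse = "ggccccwgrtcagatccayagt"
probe = "cctrygggtacctycwgggcatccttc"

for name, oligo in (("forward", forward), ("reverse", reverse), ("probe", probe)):
    print(f"{name}: length {len(oligo)}, degeneracy {degeneracy(oligo)}")
```

calling `expand(reverse)` returns the individual reverse-primer variants, which can be useful when checking a candidate sequence against the pool by exact string matching.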
to examine the amino acid variability among the vp0 proteins of other hpev genotypes, vp0 sequences for hpev1-8, 14, and 17-19 were retrieved from genbank and aligned with the hpev3-vp0 sequence using the mega software. kd values were determined by bio-layer interferometry (bli) using octet red96 (fortebio, usa). anti-mouse igg capture biosensor tips (amc, fortebio) were loaded with 20 µg/ml of mab #8 or #39 for 5 min, in pbs containing 0.1% bsa and 0.01% tween 20. the association of recombinant vp0 at concentrations of 50, 25, 12, and 5 nm for #8 mab, and 14, 7, 3.5, and 1.75 nm for #39 mab, was measured for 5 min, followed by a 10-min dissociation phase. all measurements were corrected for baseline drift by subtracting a reference well. the operating temperature was maintained at 30 °c. data were analyzed using a 1:1 binding model with global fitting algorithms in the fortebio data analysis software. each mab was diluted in 50 mm carbonate buffer (ph 9.6) to a concentration of 10 µg/ml, and then added to an elisa plate (agc techno glass, shizuoka, japan). to immobilize the antibodies, the plate was incubated overnight at 4 °c. wells were blocked with pbs containing 2% (w/v) skim milk for 1 h at room temperature (rt). after three washes with pbs containing 0.05% (v/v) tween-20 (pbs-t), 100 µl of antigen protein (8 ng/ml) diluted with pbs-t or blank (pbs-t alone) was added, and the mixture was incubated for 60 min at rt. after three washes with pbs-t, 100 µl of each mab, conjugated with horseradish peroxidase (hrp), was added into each well and incubated for 60 min at rt. antibody labeling was performed using the peroxidase labeling kit-nh2 (dojindo laboratories, kumamoto, japan). after three washes with pbs-t, 100 µl of abts substrate solution (kirkegaard and perry laboratories, washington, dc, usa) was added and the mixture was incubated for 30 min at rt.
absorbance at 405/490 nm was measured on a glomax discover system (promega), and the signal-to-noise ratio (s/n) was calculated. for generation of mabs, we produced n-terminal his-tagged full-length vp0 protein of hpev3 as an antigen. as a result, hpev3-vp0 protein was produced with high aqueous solubility (figure 1a). the protein was subsequently purified using ni-sepharose beads, followed by elution with imidazole. balb/c mice were then immunized with the purified protein. four weeks after immunization, the mouse splenocytes were isolated and fused with myeloma cells to produce hybridomas. as a result, 48 stable hybridomas were generated and designated as #1 to #48. among these 48 clones, nine were selected (#3, #6, #8, #12, #27, #30, #34, #39, and #41), based on their reactivity in elisa to the target antigens, vp0 proteins derived from hpev1 and hpev3 (figure 1b). isotype analysis revealed that #3 mab belongs to the igg1, kappa isotype, #6 and #12 mabs belong to the igg2a, kappa isotype, while the others belong to the igg2b, kappa isotype (figure 1c). to identify the antibody-binding sites within the antigen, we next performed epitope mapping. for epitope mapping, we produced five deletion mutants of vp0 and performed immunoblotting analysis. we found that our newly developed mabs recognized three distinct domains in hpev3 vp0: #30 and #34 mabs bind to amino acids (aa) 68-121, #3, #8, and #27 mabs bind to 127-172 aa, and the remaining four mabs (#6, #12, #39, and #41) bind to 225-289 aa within the c-terminal region of the vp0 protein of hpev3 (figure 2a). we further created deletion mutants for a more precise epitope determination for these mabs, and found that #30 and #34 mabs bind to 82-95 aa, #3, #8, and #27 mabs bind to 133-159 aa, and #6, #12, #39, and #41 mabs bind to 275-289 aa (figure 2b). we next examined whether the antigenic epitopes were located on the surface of hpev3-vp0.
the ucsf chimera software revealed that, except for #30 and #34, the binding regions of all mabs were located on the molecular surface of vp0 protein (figure 2c). the binding region of mabs #30 and #34 was relatively conserved among the analyzed hpevs (figure 2d).
we next investigated the specificity of our newly developed mabs. for this, we created vp0 proteins encoded by the six different hpev genotypes and vp4-vp2 proteins derived from enteroviruses 71 and d68. immunoblotting analysis showed that six mabs (#3, #6, #8, #12, #27, and #39) react specifically with hpev3 vp0, while three mabs (#30, #34, and #41) showed some cross-reactivity to other hpevs (figure 3). no mabs showed cross-reactivity to enterovirus vp4-vp2 proteins.
sandwich elisa systems with highly specific matched antibody pairs are commonly used to detect and quantify viral antigens in immunoassay. hence, we next determined the optimal pair of mabs for antigen-capture using elisa, by evaluating all possible combinations of immobilized and labeled mabs. among the 36 possible pair combinations, only the #8 and #39 mab pair represented a combination of antibodies specific for hpev3-vp0 (figure 4a). therefore, we selected this combination for further analysis.
to characterize the equilibrium dissociation constant (kd) for the binding of the selected antibodies to the target vp0 antigen, we used the bli octet assay system. the kd values for mabs #8 and #39 were calculated as < 1 × 10^−12 and 3.72 × 10^−10, respectively, suggesting that both mabs have high binding affinity for hpev3-vp0 (figure 4b). we next performed antigen-capture elisa with recombinant vp0 protein and virions released into the cell-culture supernatant of the hpev3-infected cells. using the optimal antibody pair (#8 and #39 mabs) identified above, we determined the detection threshold for antigen recognition by antigen-capture elisa. our results revealed that our system was highly sensitive to the recombinant antigen, capable of detecting the protein at a concentration of 3 ng/ml (figure 4c, left). in parallel, we investigated the detection limit of elisa for the hpev3 virion. this elisa system could detect heat-treated hpev3, but not non-heated virions, and its detection limit was 1 × 10^9 copies/ml (figure 4c, right). we also found that this elisa system exhibited no cross-reactivity with enteroviruses (figure 4d).
hpev3 is increasingly being highlighted as a potentially severe viral infection in neonates and young infants.
therefore, there is an urgent need to develop assays for early diagnosis of hpev3 infection in order to reduce inappropriate antimicrobial use, unnecessary investigations, and prolonged hospitalization. early diagnosis is also likely to prompt follow-up for potential complications in infants who are severely affected [1]. in this context, a real-time pcr-based molecular test to detect virus from patients was recently developed [31]. however, pcr tests are extremely sensitive and need extensive controls, whereas antigen detection by mabs has the advantage of relatively easy sample handling and less stringent procedures. specific mabs could then be used to develop a rapid test such as elisa. in this study, we sought to generate specific mabs and develop an elisa test for the detection of hpev3 vp0 antigen. the hpev3 genome encodes three structural proteins, namely vp0, vp3, and vp1, which assemble to create the virus particle [10]. among these, vp0 was identified as an antigenic determinant and it might be relatively more useful for diagnostic purposes, due to a higher level of sequence conservation [32]. furthermore, it possesses high immunogenicity [7]. therefore, the selection of vp0 as an antigen is both practical and reasonable. in addition, mab quality depends largely on the preparation of high-quality antigen. here, we used a wheat germ cell-free protein production system for synthesizing recombinant vp0 proteins, as this system produces properly folded, soluble, and biologically active native proteins similar to those expressed in mammalian cells [29, 30, 33]. in our current study, we produced nine different mabs that recognized hpev3-vp0 as an antigen. we then performed epitope mapping of our generated mabs, as identification of the epitope is a key step in the characterization of monoclonal antibodies [34].
based on the epitope analysis, the mabs were able to recognize three different areas of hpev3-vp0 and specify 12-14 aa length epitopes within the hpev3-vp0. interestingly, two mab clones (#30 and #34) exhibiting cross-reactivity to vp0 proteins of other hpevs bound a distinct epitope (82-95 aa), which partially overlapped with the recognition site for the polyclonal antibody (79-99 aa) created by chen et al. [27]. nevertheless, we obtained hpev3-specific mabs that recognized the 133-159 aa and 275-289 aa sites of vp0, which have not been reported previously. moreover, we also developed an elisa for detecting hpev3 antigen using two of our newly generated hpev3-specific mabs (#8 and #39), as mab-based elisa is highly specific and sensitive for viral antigen detection [35]. our octet assay suggested that both mabs show a high binding affinity to the full-length hpev3-vp0 recombinant protein. vp0 protein can be folded into the correct native structure and is likely to form a capsid-like structure via oligomerization. in this situation, #8 mab could bind various sites on polymerized vp0 proteins, as a result of an allosteric effect or "avidity", owing to which the kd value of #8 mab was calculated to be much lower than that of #39 mab. our newly developed elisa system requires heat treatment of hpev3 to detect vp0 antigen. based on previous studies [7], the binding area of mab #39 was estimated to localize to the interface between vp0 and vp3. on the other hand, three-dimensional modeling of viral particles revealed that the antigen-recognizing sites of mabs #8 and #39 in vp0 protein are proximally located. moreover, a previous study showed that glu285 (in the epitope region of #39) and ser28 (in the vp0 protein of hpev3) are linked by a hydrogen bond [24]. thus, a possible explanation for our observation is that heat treatment helps antigen retrieval at the interface between vp0 and vp3, resulting in efficient access of mabs to the epitopes.
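the affinity comparison between mabs #8 and #39 discussed above rests on a 1:1 binding fit of bli traces, in which kd is the ratio of the dissociation and association rate constants. the python sketch below illustrates that textbook model only; it is not the fortebio fitting routine. the rate constants are hypothetical values chosen so that their ratio equals the 3.72 × 10^−10 reported for mab #39, molar units are assumed, and the function names are invented for this example. the concentrations and the 5-min and 10-min phases follow the methods.

```python
import math


def k_d(k_on, k_off):
    """Equilibrium dissociation constant (M) from the two rate constants."""
    return k_off / k_on


def association(t, conc, k_on, k_off, r_max=1.0):
    """Sensor response t seconds into association at analyte conc (M)."""
    k_obs = k_on * conc + k_off          # observed rate constant (1/s)
    r_eq = r_max * k_on * conc / k_obs   # plateau response at this conc
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation(t, r0, k_off):
    """Sensor response t seconds after the tip is moved into buffer."""
    return r0 * math.exp(-k_off * t)


# Hypothetical rate constants (NOT measured values); only their ratio
# is tied to the reported K_D of mAb #39.
K_ON = 1.0e5      # 1/(M*s), assumed
K_OFF = 3.72e-5   # 1/s, assumed

for nanomolar in (14, 7, 3.5, 1.75):      # analyte series used for mAb #39
    conc = nanomolar * 1e-9
    r_end = association(300, conc, K_ON, K_OFF)   # 5 min association
    r_left = dissociation(600, r_end, K_OFF)      # 10 min dissociation
    print(f"{nanomolar:>5} nM: bound {r_end:.4f}, remaining {r_left:.4f}")

print("K_D (M):", k_d(K_ON, K_OFF))
```

in this model, a very slow off-rate leaves the signal almost unchanged over a 10-min dissociation phase, so a fit can only place an upper bound on kd; multivalent binding to oligomerized antigen would slow the apparent off-rate in the same way, which is one possible reading of the value reported for mab #8.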
according to a previous study [36], antibody-binding capacity increased when glycosylation of the capsid protein was removed. based on this finding, we pretreated hpev3 to remove its o-linked and n-linked glycans. however, there was no obvious effect on antibody recognition in our sandwich elisa (data not shown), indicating that glycosylation might not affect antigenicity for mab recognition. the detection limits of our newly developed sandwich elisa were 3 ng/ml for recombinant vp0 protein and 1 × 10^9 copies/ml for viral particles. assuming that one viral particle contains a single copy of the viral genome, and given that there are 60 copies of vp0 protein per viral particle, a viral particle detection limit of 1 × 10^9 copies/ml is presumed to correspond to a vp0 protein detection limit of about 3 ng/ml. therefore, we conclude that the detection sensitivity for the recombinant vp0 protein and for virus particles was almost equivalent, suggesting the feasibility of our method for use in clinical settings. other than elisa, several antigen-detecting tests based on antigen-antibody interaction are now available. for instance, multi-array technology using electrochemiluminescence immunoassay (eclia) and chemiluminescent enzyme immunoassay (cleia) can provide several hundred times higher sensitivity than conventional elisa [37]. another example is an influenza-testing kit that uses a highly sensitive immunochromatographic detection method based on silver amplification [38]. the limited sensitivity of our assay might be overcome by combining these technologies with our monoclonal antibodies against hpev3 vp0 antigen. furthermore, we could potentially improve the detection sensitivity by altering the secondary antibody to recognize the poly-hrp complex [39] or by adding other antibodies that target vp1 or vp3 proteins.
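the correspondence between the two detection limits can be checked with a short calculation. the sketch below is a minimal illustration, not the authors' method: it assumes a vp0 molar mass of roughly 30 kda, a value that is not stated in the text and is used here only for illustration, together with the 60 copies of vp0 per particle given above. the function name and its parameters are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def vp0_mass_ng_per_ml(particles_per_ml,
                       vp0_copies_per_particle=60,
                       vp0_molar_mass_g_per_mol=30_000.0):
    """Convert a particle concentration into an equivalent VP0 mass concentration.

    vp0_molar_mass_g_per_mol is an ASSUMED value (about 30 kDa); it is not
    taken from the paper and should be replaced with a measured figure.
    """
    vp0_molecules = particles_per_ml * vp0_copies_per_particle
    moles = vp0_molecules / AVOGADRO
    grams = moles * vp0_molar_mass_g_per_mol
    return grams * 1e9  # grams -> nanograms


# Viral-particle detection limit reported in the text: 1 x 10^9 copies/ml.
estimate = vp0_mass_ng_per_ml(1e9)
print(f"estimated vp0 concentration: {estimate:.2f} ng/ml")
```

under these assumptions, 1 × 10^9 particles/ml corresponds to about 3 ng/ml of vp0, in line with the detection limit for the recombinant protein; a different molar mass would scale the estimate proportionally.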
in summary, we used the wheat germ cell-free protein production system to synthesize the hpev3 vp0 protein and produced mabs that specifically detect hpev3 but not other hpevs. we further explored the utility of these mabs in various immunological applications. to the best of our knowledge, the hpev3 vp0 antibodies and the elisa-based viral detection system reported here are the first of their kind. with the implementation of more sophisticated applications, our newly developed mabs could be useful for the further development of diagnostic methods for hpev3 infection.
human parechovirus: an increasingly recognized cause of sepsis-like illness in young infants detection and characterization of a novel human parechovirus genotype in thailand isolation and identification of a novel human parechovirus identification and molecular characterization of the first complete genome sequence of human parechovirus type 15 enterovirus and parechovirus infection in children: a brief overview sepsis-like disease in infants due to human parechovirus type 3 during an outbreak in australia strain-dependent neutralization reveals antigenic variation of human parechovirus 3 human parechovirus 3 causing sepsis-like illness in children from midwestern united states human parechovirus 3 and neonatal infections human parechovirus-3 infection: emerging pathogen in neonatal sepsis prevalence and characteristics of human parechovirus and enterovirus infection in febrile infants comparison of diagnostic clinical samples and environmental sampling for enterovirus and parechovirus surveillance in scotland infant deaths associated with human parechovirus infection in wisconsin human parechovirus causes encephalitis with white matter injury in neonates lyall, h. 
characteristics and outcomes of human parechovirus infection in infants human parechovirus type 3 infection: an emerging infection in neonates and young infants epidemic myalgia and myositis associated with human parechovirus type 3 infections occur not only in adults but also in children: findings in yamagata epidemic myalgia associated with human parechovirus type 3 infection the human parechoviruses: an overview epidemic myalgia associated with human parechovirus type 3 infection among adults occurs during an outbreak among children: findings from yamagata severe epidemic myalgia with an elevated level of serum interleukin-6 caused by human parechovirus type 3: a case report and brief review of the literature a 2.8-angstrom-resolution cryo-electron microscopy structure of human parechovirus 3 in complex with fab from a neutralizing antibody multiple capsid-stabilizing interactions revealed in a high-resolution structure of an emerging picornavirus causing neonatal sepsis genomic rna folding mediates assembly of human parechovirus parechovirus a detection by a comprehensive approach in a development of a serological assay based on a synthetic peptide selected from the vp0 capsid protein for detection of human parechoviruses development of monoclonal antibody and diagnostic test for middle east respiratory syndrome coronavirus using cell-free synthesized nucleocapsid antigen production and characterization of monoclonal antibodies specific for major capsid vp1 protein of trichodysplasia spinulosa-associated polyomavirus clinical relevance of positive human parechovirus type 1 and 3 pcr in stool samples strategies to improve detection and management of human parechovirus infection in young infants wheat germ cell-free system-based production of hemagglutinin-neuraminidase glycoprotein of human parainfluenza virus type 3 for generation and characterization of monoclonal antibody monoclonal antibody-based capture elisa in the diagnosis of previous dengue infection 
identification of two n-linked glycosylation sites within the core of the simian immunodeficiency virus glycoprotein whose removal enhances sensitivity to soluble cd4 measurement of sirolimus concentrations in human blood using an automated electrochemiluminescence immunoassay (eclia): a multicenter evaluation development of a highly sensitive immunochromatographic detection kit for h5 influenza virus hemagglutinin using silver amplification improving the sensitivity of traditional western blotting via streptavidin containing poly-horseradish peroxidase (polyhrp)
we thank naohito nozaki and satoko matsunaga for antibody production and technical assistance, and masaki takahashi (iwate prefectural institute of public health) for providing reagents.
conflicts of interest: y.y. is a current employee of kanto chemical co., inc. the remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
key: cord-023647-dlqs8ay9 authors: nan title: sequences and topology date: 2003-03-21 journal: curr opin struct biol doi: 10.1016/0959-440x(91)90051-t sha: doc_id: 23647 cord_uid: dlqs8ay9 nan
key: cord-002973-bkr4ndl2 authors: seifi, morteza; walter, michael a. 
title: accurate prediction of functional, structural, and stability changes in pitx2 mutations using in silico bioinformatics algorithms date: 2018-04-17 journal: plos one doi: 10.1371/journal.pone.0195971 sha: doc_id: 2973 cord_uid: bkr4ndl2 mutations in pitx2 have been implicated in several genetic disorders, particularly axenfeld-rieger syndrome. in order to determine the most reliable bioinformatics tools to assess the likely pathogenicity of pitx2 variants, the results of bioinformatics predictions were compared to the impact of variants on pitx2 structure and function. the mutpred, provean, and pmut bioinformatic tools were found to have the highest performance in predicting the pathogenicity effects of all 18 characterized missense variants in pitx2, all with sensitivity and specificity >93%. applying these three programs to assess the likely pathogenicity of 13 previously uncharacterized pitx2 missense variants predicted 12/13 variants as deleterious; the exception, a30v, was predicted to be benign by all three programs. molecular modeling of the pitx2 homeodomain predicts that of the 31 known pitx2 variants, l54q, f58l, v83f, v83l, w86c, w86s, and r91p alter pitx2’s structure. in contrast, the remaining 24 variants are not predicted to change pitx2’s structure. the results of molecular modeling, performed on all the pitx2 missense mutations located in the homeodomain, were compared with the findings of eight protein stability programs. cupsat was found to be the most reliable in predicting the effect of missense mutations on pitx2 stability. our results showed that for pitx2, and likely other members of this homeodomain transcription factor family, mutpred, provean, pmut, molecular modeling, and cupsat can reliably be used to predict the pathogenicity of pitx2 missense variants. 
paired-like homeodomain transcription factor 2 (pitx2, refseq nm_000325.5, mim# 601542) is located at 4q25 and is expressed in the developing eye, brain, pituitary, lungs, heart, and gut [1] . mutations in human pitx2 or the forkhead box transcription factor c1 (foxc1; 6p25, refseq nm_001453.2, mim# 601090) underlie the autosomal dominant disorder called axenfeld-rieger syndrome (ars; mim# 602482) [2] [3] [4] [5] . ars is a fully penetrant, but clinically and genetically heterogeneous, disorder characterized by developmental anomalies involving both ocular and non-ocular structures [6] . to date, 87 pitx2 mutations have been identified, including deletions, insertions, splice-site mutations, and coding region frameshift, nonsense and missense mutations [7] [8] [9] [10] [11] [12] [13] . identifying new disease-associated variants is becoming increasingly important for genetic testing and it is leading to a significant change in the scale and sensitivity of molecular genetic analysis [14] . one of the most frequent approaches for detecting novel variants in target genes is direct gene sequencing. however, due to the increasing number of newly identified missense variants, it is often difficult to interpret the pathogenicity of these variants, as not all mutations alter protein function, and the ones that do may have different functional impacts in disease [15, 16] . thus, prior to detailed analyses, novel variants cannot be easily classified as either deleterious or neutral, because their functional and phenotypic consequences are unknown. therefore, further research should be conducted to validate the genetic diagnosis when a novel missense variant is discovered. preferably, in vitro characterization of novel variants should be undertaken; however, due to limited facilities, it is often not practicable to experimentally verify the impact of a large number of mutations on protein function [17] . 
another robust approach to substantiate the pathogenicity is using animal models by generating the homologous mutation that recapitulates the human phenotype; but, similar to in vitro studies, these are time-consuming, labor-intensive, difficult and expensive, making this approach unfeasible to experimentally determine the pathogenicity effects of all novel identified variants [18] . to circumvent the above mentioned limitations and to provide fast and efficient methods for predicting the functional effect of nonsynonymous variants on protein stability, structure, and function, several computational tools have been developed [19] [20] [21] . protein stability and structure are key factors affecting function, activity, and regulation of proteins. conformational changes are necessary for many proteins' function and disease-causing variants can impair protein folding and stability. missense variants are also capable of impairing protein structure, likely by affecting protein folding, protein-protein interaction, solubility or stability of protein molecules. the structural effect of mutational changes can be examined in silico on the basis of three-dimensional structure, multiple alignments of homologous sequences, and molecular dynamics [22] [23] [24] . therefore, analysing sequence data in silico first and detecting a small number of predicted deleterious mutations for further experimental characterization is a key factor in today's genetic and genomic studies. in general, bioinformatics prediction methods obtain information on amino acid conservation through alignment with homologous and distantly related sequences. the most common criteria considered in many bioinformatics programs for predicting the functional effect of an amino acid substitution are amino acid sequence conservation across multiple species, physicochemical properties of the amino acids involved, database annotations, and potential protein structural changes [23, 25, 26] . 
as mentioned above, resources for in vitro and in vivo functional analysis of novel variants are constrained in most clinical laboratories. therefore, identifying and reporting novel variants that are likely to be pathogenic often requires accurate prediction using computational tools. in a previous study, we examined the effect of foxc1 variants on protein structure and function by combining laboratory experiments and in silico techniques. our results showed that integration of different algorithms with in vitro functional characterization serves as a reliable means of prioritizing, and then functionally analyzing, candidate foxc1 variants [27] . unlike most previous studies, which focused on using only polyphen and sift to predict the pathogenicity of missense mutations, here we investigated the predictive value of sift, polyphen and nine other prediction tools by comparing their predictions to in vitro functional data for pitx2 variants. the bioinformatics programs found to be most reliable were then used to predict the likely consequences of 13 functionally uncharacterized pitx2 variants. we also performed molecular modeling on all the pitx2 missense mutations located in the homeodomain and compared the results with the findings of protein stability algorithms to identify the most reliable tools for predicting the effect of missense mutations on pitx2 stability. to the best of our knowledge, this is the first study that incorporates the results of functional studies in conjunction with bioinformatics approaches for predicting the pathogenicity of mutations in the pitx2 gene. lists of pitx2 missense variants were assembled from the previous literature and from searches of clinvar [28] , the human gene mutation database (hgmd) [29], the genome aggregation database (gnomad), and the single nucleotide polymorphism database (dbsnp). 
this study found 47 pitx2 missense variants, 31 of which were described in the literature as being associated with ars or coronary artery disease (cad), while the remaining 16 variants were considered benign (fig 1) . eighteen of the 31 variants were classified as pathogenic based on functional studies utilizing site-directed mutagenesis, expression studies, and other functional analysis ( table 1) . thirteen of the 31 variants were described as associated with ars and cad in the absence of functional analyses of pitx2 structure or function. sixteen snps with population allele frequencies > 0.0005 were identified from gnomad and clinvar. based upon the allele frequency (approximately 10-fold greater than the disease frequency of ars), these have been considered benign polymorphisms. nucleotide numbering of the mutations herein indicates cdna numbering with +1 as the a of the atg translation initiation codon in the ncbi reference sequence nm_000325.5, while the amino acid positions are based on the corresponding ncbi reference sequence np_000316.2. this study is a retrospective case report that does not require ethics committee approval at our institution. all patients' mutations and phenotypes were obtained from previously published studies. the prediction programs described below were used to analyse 18 functionally characterised pitx2 missense variants plus 13 additional, functionally uncharacterized pitx2 missense variants. the sift program provides functional predictions for coding variants, based on the degree of conservation of amino acid residues in sequence alignments derived from closely related sequences, collected by the psi-blast algorithm [60]. the polyphen-2 (polymorphism phenotyping-2) server predicts the possible effect of an amino acid change on the structure and function of a protein using several sources of information such as straightforward physical and comparative considerations [61] . 
panther-psep is a new application that analyses the length of time a given amino acid has been conserved in the lineage leading to the protein of interest. there is a direct association between the conservation time and the likelihood of functional impact [62] . mutpred is a free web-based application that utilizes a random forest algorithm with data based upon the probabilities of loss or gain of properties relating to many protein structures and dynamics, predicted functional properties, and amino acid sequence and evolutionary information [52] . mutationtaster is a tool that combines information derived from various biomedical databases and uses established analysis programs. unlike sift or polyphen-2, which work on dna level, mutationtaster processes substitutions of single amino acids [...]. please see table 2 for more information on the prediction tools used in this study. the nmr structure of the homeodomain of pitx2 complexed with a taatcc dna binding site (pdb: 2lkx) was analyzed by the swiss-model server (http://www.expasy.org/spdbv/ ; provided in the public domain by the swiss institute of bioinformatics, geneva, switzerland). model structures of wild-type and mutants were created in swiss-pdb viewer and investigated using the anolea server (http://melolab.org/anolea). for structure predictions of pitx2, the sequence in fasta format was obtained from the ncbi database (np_001191327.1). eight different protein stability programs (duet, sdm, mcsm, i-mutant3.0, mupro, iptree-stab, cupsat, and istable) were used to predict the effects of missense mutations on the stability of pitx2 protein. duet is a web server that uses an integrated computational approach to predict the effect of missense mutations on protein stability [66] . duet calculation is based on complementary data regarding the mutation including secondary structure [67] and a pharmacophore vector [68] . sdm, a computational method, has been demonstrated as the most appropriate method to use along with many other programs. 
sdm assesses which amino acid substitutions occurring in a specific structural environment are tolerated within a family of homologous proteins of defined three-dimensional structure, and converts them into substitution probability tables [69] . mcsm relies on a graph-based signature concept and predicts not only the effect of single-point mutations on protein stability, but also on protein-protein and protein-nucleic acid binding [70] . i-mutant3.0 is a neural-network-based web server that automatically predicts protein stability changes upon single point mutations based on either protein sequence or protein structure. i-mutant3.0 can predict the severity of the effect of a mutation on the stability of the folded protein [71] . mupro is a set of machine learning programs that calculates protein stability alterations based on primary sequence information, particularly where the tertiary structure is unknown, overcoming a major restriction of previous methods that are based on the tertiary structure [72] . iptree-stab is a web service that provides two main functions: discriminating the stability change of a protein upon single amino acid substitution and predicting its numerical stability value [73] . cupsat uses protein environment specific mean force potentials (through solvent accessibility and secondary structure specificity) to analyse and predict protein stability changes upon point mutations [74] . istable, a combined predictor, was designed by using sequence information and prediction data from various element predictors. istable is available with two different input types: structural and sequential [75] . please see table 3 for more information on the stability predictors used in this study. previous analyses of missense variations in different human diseases predicted that the stability margin without any immediate effect on protein fitness is 1-3 kcal/mol [77] [78] [79] . 
mutations that reduce the protein stability by >2 kcal/mol contribute to severe disease phenotypes [80, 81] . therefore, in this study, all variations were classified as predicted to be neutral (-1.5 < δδg < 1.5 kcal/mol), stabilizing (δδg > 1.5 kcal/mol) or destabilizing (δδg < -1.5 kcal/mol). the protein sequence and/or protein structure with mutational position and amino acid residue of 18 previously functionally characterized pathogenic pitx2 missense variants, plus 16 snps with a population frequency higher than 0.05% (thus considered benign polymorphisms), were used to test the predictive value of eleven common bioinformatics prediction programs: sift, polyphen-2, panther-psep, mutpred, mutationtaster, provean, pmut, fathmm, nssnpanalyzer, align gv-gd, and revel (table 4 and table 5 ). to evaluate the performance of the programs, seven measures (sensitivity, specificity, accuracy, precision, positive predictive value (ppv), negative predictive value (npv), and matthews correlation coefficient (mcc)) were calculated by comparing the results of all programs with previously generated functional data. for pitx2, mutpred, provean, and pmut were the most reliable of the bioinformatics tools in predicting the pathogenicity effects of all 18 functionally characterized missense variants, with sensitivity and specificity of > 93% (fig 2) . revel ranked next, with a sensitivity of 94.44% and a specificity of 87.50%. sift had good sensitivity (72.22%) but low specificity (43.75%). although polyphen-2, mutationtaster, panther-psep, fathmm, and align gv-gd exhibited over 83% sensitivity, they were unable to identify the benign polymorphisms, showing specificity of 37.50%, 6.25%, 43.75%, 6.25%, and 6.25%, respectively. the predictive value of nssnpanalyzer was similar to that of sift, with sensitivity and specificity of 66.67% and 43.75%, respectively. 
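the performance measures listed above can be illustrated with a minimal sketch. the function name `classification_metrics` is hypothetical, and the confusion-matrix counts used in the usage lines are not taken from the paper's tables: they are inferred from the reported percentages, assuming 18 pathogenic and 16 benign variants (for example, 94.44% sensitivity corresponds to 17 of 18, and 87.50% specificity to 14 of 16).

```python
from math import sqrt


def classification_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification measures from confusion counts.

    tp: pathogenic variants predicted deleterious
    fn: pathogenic variants predicted benign
    tn: benign variants predicted benign
    fp: benign variants predicted deleterious
    """
    total = tp + fn + tn + fp
    # Guard against empty denominators (e.g. a tool that never predicts "benign").
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
        "ppv": ppv,  # identical to precision
        "npv": npv,
        "mcc": mcc,
    }


# Counts inferred from the reported percentages (an assumption, see lead-in).
revel = classification_metrics(tp=17, fn=1, tn=14, fp=2)
sift = classification_metrics(tp=13, fn=5, tn=7, fp=9)
print(f"revel sensitivity: {revel['sensitivity']:.2%}")
print(f"sift specificity:  {sift['specificity']:.2%}")
```

with only 18 and 16 variants per class, each misclassified variant shifts sensitivity by about 5.6 percentage points and specificity by 6.25 points, which is why the reported values fall on such a coarse grid.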
the most reliable programs found in this study's analyses (mutpred, provean, and pmut) were then used to predict the likely pathogenicity of 13 pitx2 missense variants for which functional testing has not been performed (table 6) . interestingly, the a30v variant was unanimously predicted as benign by all three programs. the remaining 12 pitx2 variants were predicted to be disease-associated mutations by all programs. molecular models for the homeodomain of wild-type and variant-containing pitx2 proteins were designed using threading algorithms to assess impairment of pitx2 structure by missense variants. three functionally characterised variants, n100d, l105v, and n108t, were excluded from these molecular modeling analyses since they are not located in the homeodomain, which is the only portion of pitx2 with a known structure. wild-type amino acids were changed to variant residues to determine putative structural effects of the remaining 15 functionally analysed pitx2 variants through anolea mean force potential calculations. the molecular modeling identified three mutations (l54q, v83l, and r91p) as being at high risk of changing the structure of pitx2, particularly in the h1, h2, and h3 subdomains (fig 3) . the r91p variant was predicted to grossly disrupt the non-local amino acid side chain contacts. similar, although less profound, effects were predicted when l54 and v83 were altered to glutamine and leucine, respectively. in contrast, the remaining twelve amino acid variants showed no predicted substantial alteration of pairwise interactions, indicating that these missense variants are predicted to have minor or no effects on pitx2's structure (s1 fig). molecular modeling was also performed on the nine functionally uncharacterised pitx2 missense mutations located in the homeodomain. 
four mutations (f58l, v83f, w86c, w86s) were predicted to change the structure of pitx2 (fig 4) , while the remaining five variants (r62h, p64l, p64r, r69c, and r90p) were predicted to have minor or no impact on pitx2's structure (s2 fig). to assess the performance of eight different stability predictor programs (duet, sdm, mcsm, i-mutant3.0, mupro, iptree-stab, cupsat, and istable) in predicting the effect of missense mutations on pitx2 protein stability, the change in protein stability (δδg) was computed for all 24 pitx2 homeodomain variants (15 functionally characterised and 9 functionally uncharacterised mutations) (table 7) . of these eight programs, cupsat was the most consistent with the results of our molecular modeling, identifying 5 of the 7 mutations that were predicted to be destabilizing by molecular modeling (v83l, v83f, w86s, w86c, and r91p). sdm also showed high consistency with the results of our molecular modeling, detecting 4 of these 7 destabilizing mutations (l54q, r91p, w86s, and w86c). duet, mcsm, and i-mutant3.0 each identified 3, and iptree-stab detected 2, of the 7 destabilizing mutations detected by molecular modeling. mupro and istable were unable to identify any of the 7 destabilizing mutations predicted by molecular modeling. although in silico programs are not a substitute for wet-lab experiments, they can provide a supportive role in the experimental validation of disease-associated alleles and can help further diagnostic strategies by prioritizing the most likely pathogenic novel variants. while many tools are available for assessing the functional significance of variants, determining the reliability of prediction results is challenging. 
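the stability comparison above can be sketched in a few lines. the classification thresholds (±1.5 kcal/mol) and the variant sets for molecular modeling, cupsat and sdm are taken from the text; the function names `classify_ddg` and `agreement` are hypothetical, the δδg values in the usage lines are invented for illustration and are not results from table 7, and treating values of exactly ±1.5 as neutral is an assumption, since the text does not specify the boundary case.

```python
NEUTRAL_BAND = 1.5  # kcal/mol, threshold stated in the methods


def classify_ddg(ddg, band=NEUTRAL_BAND):
    """Classify a predicted stability change (negative = destabilizing)."""
    if ddg > band:
        return "stabilizing"
    if ddg < -band:
        return "destabilizing"
    return "neutral"  # includes exactly +/- band (assumption)


# Variants predicted by molecular modeling to alter pitx2 structure (from the text).
MODELING_HITS = {"l54q", "f58l", "v83f", "v83l", "w86c", "w86s", "r91p"}

# Destabilizing calls that the text reports as shared with molecular modeling.
REPORTED_HITS = {
    "cupsat": {"v83l", "v83f", "w86s", "w86c", "r91p"},
    "sdm": {"l54q", "r91p", "w86s", "w86c"},
}


def agreement(tool_hits, reference=MODELING_HITS):
    """Return (number of shared destabilizing calls, size of the reference set)."""
    return len(tool_hits & reference), len(reference)


# Illustrative ddG values only (hypothetical, not taken from table 7).
for value in (-2.3, 0.4, 1.8):
    print(value, classify_ddg(value))

for tool, hits in REPORTED_HITS.items():
    shared, total = agreement(hits)
    print(f"{tool}: {shared} of {total}")
```

set intersection keeps the comparison independent of the order in which each tool lists its variants, and the same `agreement` call would work for any other predictor once its destabilizing calls are collected into a set.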
in this context, the current study investigated the combination of experimental findings, molecular modeling, in silico mutation prediction programs, and stability prediction software to assess the pathogenicity of pitx2 missense variants. in silico methods that correctly identify deleterious variants do not necessarily work well for benign predictions. the methods determined by this study to be preferred for analyses of pitx2 variants were those best able to distinguish both pathogenic and benign variants, thus yielding the highest accuracy. our results showed that the mutpred, provean, and pmut tools were the most accurate in predicting pathogenicity of pitx2 missense variants (fig 2) . the sensitivity and specificity of these three tools in recognizing pitx2 disease-causing variants were over 93%, indicating the strong performance of these programs in identifying as pathogenic only pitx2 variants with significant functional defects. after these three tools, revel showed the highest sensitivity and specificity, 94.44% and 87.50%, respectively. sift showed good sensitivity (72.22%) but low specificity (43.75%). polyphen-2, mutationtaster, panther-psep, fathmm, and align gv-gd demonstrated > 83% sensitivity, but they were unable to identify the benign polymorphisms, showing specificity of 37.50%, 6.25%, 43.75%, 6.25%, and 6.25%, respectively. the predictive value of nssnpanalyzer was similar to that of sift, with sensitivity and specificity of 66.67% and 43.75%, respectively. our results showed, therefore, that mutpred, provean, and pmut can be utilized with high confidence to test whether or not a pitx2 missense variant is likely to be deleterious. interestingly, mutpred was the only in silico program that ranked in the top three programs in identifying both pathogenic and benign pitx2 and foxc1 variants [27] . 
a likely explanation for mutpred's high ranking is that it evaluates the most factors in making assessments. however, since the number of variants available for testing in this study was small, a larger dataset is needed to confirm that our results are reproducible and generally applicable. the three programs that were found to be the most reliable (mutpred, provean, and pmut) were then used to assess the likely pathogenicity of thirteen pitx2 missense variants for which functional analyses have not been performed, but which have been associated with ars or cad (table 6 ). our results showed that mutpred, provean, and pmut predicted 12/13 of the variants to be pathogenic. the a30v variant was scored as non-pathogenic/benign by all three programs. while it is possible that a30v is an example of a false negative for all three programs, it is more likely that this variant is benign. functional testing of the a30v variant is needed to determine which of these possibilities is accurate. various intramolecular interactions are involved in stabilizing the folded state of a protein, including hydrophobic, electrostatic, and hydrogen-bonding interactions [95] [96] [97] [98] . the stability of a protein is a key factor in its proper functionality. in fact, up to 80% of mendelian disease-causing mutations in protein coding regions are predicted to act by altering protein stability [99] . in recent years, due to the availability of high-throughput array-based genotyping methods [100] and next generation sequencing platforms [101, 102] , a large number of snps has been reported. however, the association of missense variants with protein stability has often been difficult to predict. fortunately, recent advances in computational prediction of protein stability offer potential insight into this question. we used two parallel prediction methods to investigate the possible effects of missense variants on pitx2 protein structure and stability. 
knowledge of a protein's 3d structure can be used to predict the functionality of the protein and the possible impact of variants on protein conformation and structure. we thus first used molecular modelling analyses to assess and compare the total energy difference between the native and mutated modeled structures of pitx2 proteins. the results predicted that while most pitx2 variants did not dramatically affect the protein tertiary structure, seven variants (l54q, f58l, v83f, v83l, w86c, w86s, and r91p) altered the total energy level in comparison with the native structure, suggesting that these amino acid substitutions changed the structure of the pitx2 protein. molecular modeling of the pitx2 homeodomain predicted that these variants impair the energy required to maintain the proper folding of helices 1-3 and cause global destabilization of the structure of pitx2. the affected amino acids are either invariant (e.g., w86) or highly conserved in the approximately 300 homeobox proteins analyzed, consistent with a pivotal role of these residues in the homeodomain [103] [104] [105] . they are tightly packed hydrophobic amino acids responsible for holding the helices of the pitx2 homeodomain together, supporting our molecular modeling prediction that mutations of these amino acids disrupt pitx2 structure. for f58l, v83f, and v83l, the native wild-type residues and the introduced mutant residues differ in size, probably causing loss of hydrophobic interactions in the core of the protein, particularly involving helices 1-3. for l54q, w86c, w86s, and r91p, the wild-type residues and the mutant residues differ in both size and charge, likely disturbing the local structure of the protein and thereby altering protein structure and function. residues v83, w86, and r91 are located within the third helix, which is specifically responsible for binding the major groove of the dna [106] . 
thus, the prediction that these mutations impair the capacity of this helix to interact with dna is consistent with this knowledge and with previous functional characterizations that showed reduced dna-binding capacities of the v83l and r91p mutant pitx2 proteins [5, 107] . consistent with bioinformatics predictions of deleterious effects of mutation of w86, mutations of the neighboring amino acids (r84w and k88e) have been shown to decrease the ability of the mutant proteins to interact with dna [39, 108] . residues l54 and f58 are located in helix 1 of the homeodomain, which is responsible for contacting the minor groove of the dna. molecular modeling of l54q is consistent with the suggestion that mutations in these highly conserved residues in helix 1 of the homeodomain might disturb the dna-protein binding affinity. our prediction is supported by the fact that changing the leucine to a glutamine (l54q) disrupts the dna-protein complex, indicating the necessity of leucine at position 54 for pitx2 binding ability [109] . thus, consistent with our recent studies on foxc1 protein [110] , the results of molecular modeling of pitx2 are strongly consistent with the functional characterization of pitx2 missense variants. the results from our molecular modeling analysis were also compared to the predictions of eight stability predictor methods (duet, sdm, mcsm, i-mutant3.0, mupro, iptree-stab, cupsat, and istable). based on our analyses, it appears that cupsat performs the best of the eight methods evaluated here in predicting the effect of missense mutations on pitx2 protein stability, with sdm, duet, mcsm, and i-mutant3.0 performing less well, consistent with the results of previous studies [111, 112] . our results indicate that further studies are required to improve δδg predictions, especially for buried amino acids. 
in this study, for the first time, we evaluated the impact of missense variants on pitx2 stability, structure and function by integrating stability prediction algorithms, bioinformatics mutation prediction tools, and molecular modeling. our results showed that mutpred, provean, pmut, molecular modeling, and cupsat are reliable methods to assess pitx family missense variants in the absence of laboratory experiments. however, for our analyses, it must be noted that we used sixteen snps as non-pathogenic control variants to investigate the performance of prediction programs. although we considered snps with a population frequency of >0.05% as benign, we cannot formally exclude that these snps might have undocumented pathogenic effects on pitx2. in addition, while the prediction methods used in this study are not gene-specific, generalization of the performance of these programs to other human genes may be inappropriate without additional study. when assessing the pathogenicity of missense variants, it is necessary to be cautious about relying solely on in silico programs without wet-lab experiments. according to standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the american college of medical genetics and genomics (acmg) and the association for molecular pathology, in silico predictions only serve as one supporting factor, whereas functional tests are frequently needed to assess the pathogenicity of missense variants. in particular, as per clinical guidelines for the interpretation of single substitution variants, the output of computational tools should be interpreted in the light of functional study results, population frequency data and segregation in affected families. evidence that rieger syndrome maps to 4q25 or 4q27 expanding the spectrum of foxc1 and pitx2 mutations and copy number changes in patients with anterior segment malformations. 
investigative ophthalmology & visual science a novel mutation in the pitx2 gene in a family with axenfeld-rieger syndrome novel mutations of foxc1 and pitx2 in patients with axenfeld-rieger malformations phenotypic variability and asymmetry of rieger syndrome associated with pitx2 mutations. investigative ophthalmology & visual science single nucleotide polymorphisms in clinical genetic testing: the characterization of the clinical significance of genetic variants and their application in clinical research for brca1. mutation research/fundamental and molecular mechanisms of mutagenesis the role of functional data in interpreting the effects of genetic variation determining the pathogenicity of genetic variants associated with cardiac channelopathies. scientific reports structure based thermostability prediction models for protein single point mutations with machine learning tools using sift and polyphen to predict loss-of-function and gain-offunction mutations. genetic testing and molecular biomarkers stability effects of mutations and protein evolvability prediction of protein stability upon point mutations predicting the functional effect of amino acid substitutions and indels molecular mechanisms of disease-causing missense mutations integrating population variation and protein structural analysis to improve clinical interpretation of missense variation: application to the wd40 domain. 
human molecular genetics prediction of the phenotypic effects of non-synonymous single nucleotide polymorphisms using structural and evolutionary information iterative sequence/secondary structure search for protein homologs: comparison with amino acid sequence alignments and application to fold recognition in genome databases identification of related proteins with weak sequence identity using secondary structure information comparison of bioinformatics prediction, molecular modeling, and functional analyses of foxc1 mutations in patients with axenfeld-rieger syndrome clinvar: public archive of relationships among sequence variation and human phenotype mcsm: predicting the effects of mutations in proteins using graph-based signatures sdm-a server for predicting effects of mutations on protein stability and malfunction sdm: a server for predicting effects of mutations on protein stability mcsm: predicting the effects of mutations in proteins using graph-based signatures predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information prediction of protein stability changes for single-site mutations using support vector machines iptree-stab: interpretable decision tree based method for predicting protein stability changes upon mutations cupsat: prediction of protein stability upon point mutations istable: off-the-shelf predictor integration for predicting protein stability changes iptree-stab: interpretable decision tree based method for predicting protein stability changes upon mutations stability effects of mutations and protein evolvability correlation of levels of folded recombinant p53 in escherichia coli with thermodynamic stability in vitro investigating the effects of mutations on protein aggregation in the cell using model proteins to quantify the effects of pathogenic mutations in ig-like proteins systematically perturbed folding patterns of amyotrophic lateral 
sclerosis (als)-associated sod1 mutants de novo mutations in histone-modifying genes in congenital heart disease prevalence and spectrum of pitx2c mutations associated with familial atrial fibrillation identification of four new pitx2 gene mutations in patients with axenfeld-rieger syndrome. molecular vision expanding the spectrum of foxc1 and pitx2 mutations and copy number changes in patients with anterior segment malformations. investigative ophthalmology & visual science mutation in pitx2 is associated with ring dermoid of the cornea novel mutations of foxc1 and pitx2 in patients with axenfeld-rieger malformations journal of oral pathology & medicine: official publication of the international association of oral pathologists and the american academy of oral pathology dental and craniofacial anomalies associated with axenfeld-rieger syndrome with pitx2 mutation. case reports in medicine a novel pitx2 mutation causing iris hypoplasia. human genome variation pitx2 and foxc1 spectrum of mutations in ocular syndromes a novel pitx2 mutation in a chinese family with axenfeld-rieger syndrome. molecular vision. emory university a novel pitx2 mutation and a polymorphism in a 5-generation family with axenfeld-rieger anomaly and coexisting fuchs' endothelial dystrophy mutation analysis of the genes associated with anterior segment dysgenesis, microcornea and microphthalmia in 257 patients with glaucoma influence of hydrophobic and electrostatic residues on sars-coronavirus s2 protein stability: insights into mechanisms of general viral fusion and inhibitor design influence of hydrophobic and electrostatic residues on sars-coronavirus s2 protein stability: insights into mechanisms of general viral fusion and inhibitor design protein science: a publication of the protein society mechanisms of protein stabilization and prevention of protein aggregation by glycerol snps, protein structure, and disease whole-genome genotyping. 
the authors would like to thank the members of the walter laboratory for critical reading of the manuscript and for helpful comments.

key: cord-004181-exbs3tz7 authors: pumchan, ansaya; krobthong, sucheewin; roytrakul, sittiruk; sawatdichaikul, orathai; kondo, hidehiro; hirono, ikuo; areechon, nontawith; unajak, sasimanas title: novel chimeric multiepitope vaccine for streptococcosis disease in nile tilapia (oreochromis niloticus linn.) date: 2020-01-17 journal: sci rep doi: 10.1038/s41598-019-57283-0 sha: doc_id: 4181 cord_uid: exbs3tz7 streptococcus agalactiae is a causative agent of streptococcosis disease in various fish species, including nile tilapia (oreochromis niloticus linn.). vaccination is an effective disease prevention and control method, but limitations remain for protecting against catastrophic mortality of fish infected with different strains of streptococci. immunoproteomics analysis of s.
agalactiae was used to identify antigenic proteins and construct a chimeric multiepitope vaccine. epitopes from five antigenic proteins were shuffled in five helices of a flavodoxin backbone, and in silico analysis predicted a suitable rna and protein structure for protein expression. 45f2 and 42e2 were identified as the best candidates for a chimeric multiepitope vaccine. recombinant plasmids were constructed to produce a recombinant protein vaccine and dna vaccine system. overexpressed proteins were determined to be 30 kda and 25 kda in the e. coli and tk1 systems, respectively. the efficacy of the chimeric multiepitope construct as a recombinant protein vaccine and dna vaccine was evaluated in nile tilapia, followed by s. agalactiae challenge at 1 × 10^7 cfu/ml. relative percentage survival (rps) and cumulative mortality were recorded at approximately 57–76% and 17–30%, respectively. these chimeric multiepitope vaccines should be applied in streptococcosis disease control and developed into a multivalent vaccine to control multiple diseases.

immunogenic protein characterization. proteins bound to a s. agalactiae antibody were eluted from protein a agarose and divided into two fractions. the first fraction was subjected to 4-20% gradient sds-page to observe the protein features and compare the protein profile from serotypes ia and iii. the second fraction was subjected to lc-ms/ms mass spectrometry to identify the immunogenic proteins. the protein profile from the immunoprecipitation on 1d-sds-page demonstrated that the major protein (approximately 55 kda) corresponded to rabbit immunoglobulin. however, several bacterial proteins could not be bound to rabbit immunoglobulin and were removed through the flow-through fraction (ft), whereas the protein that specifically bound to the anti-s. agalactiae antibody could be detected in the eluted fraction (fig. 1).

comparative immunoproteomics analysis of s.
agalactiae serotypes ia and iii was determined by lc-ms/ms and assessed by a venn diagram (supplementary fig. 1). one hundred proteins were matched and identified between serotype ia and serotype iii via in-house protein databases, resulting in 79 shared proteins between serotype ia and serotype iii. the protein expression levels of the 79 common proteins were determined by hierarchical clustering (hcl). two groups of immunogenic proteins were demonstrated based on their abundance, and 37 proteins were overexpressed in serotype iii, whereas there was a lower abundance of 39 immunogenic proteins in serotype iii than in serotype ia (fig. 2). regarding specific antigen-antibody interactions, 10 and 11 proteins were uniquely identified in serotypes ia and iii, respectively (supplementary figs. 1, 2).

linear b-cell epitope prediction and chimeric vaccine design. the epitopes of immunogenic proteins were predicted by the bcpreds server based on b cell epitopes to be used in chimeric multiepitope vaccine construction. in this study, epitope prediction was applied not only to immunogenic proteins from the immunoproteomics analysis but also to other subunit vaccine candidates, and the predicted epitopes were combined to produce a chimeric multiepitope vaccine. the amino acid sequences of the c-β protein (bac), surface protein rib (rib), lpxtg cell wall anchor domain-containing protein (spb1), surface immunogenic protein (sip), and cell surface protein [...]

[figure caption: heat map with hierarchical clustering (hcl) of normalized protein abundance reveals the 79 differentially expressed immunogenic proteins. relative intensities range from the highest protein abundance (red) to the lowest protein abundance (green).]

[...] alternatively throughout the backbone likely provides potential bioactivity 22 .
because of its α/β fold structure, flavodoxin from escherichia coli [pdb accession code: 3chy] 23 was utilized as a linker to combine the epitope fragments from five antigenic proteins. predicted epitopes were randomly displayed on the α-helix structure of flavodoxin, generating 1,440 designed models due to the variance of 6 epitopes of bac and of 2 epitopes of sip protein. after joining, protein conformation was examined by molecular modeling with 7,200 constructs. i-tasser modeling and stereochemical quality assessment showed that 45f2 and 42e2 had appropriate predicted tertiary structures, with c-scores between −5 and 2. 45f2 and 42e2 also demonstrated the highest proportion of residues in the allowed regions of the ramachandran plot. the 45f2 multiepitope model had 90.2%, 8.5%, and 0.7% of residues located in the most favored, allowed, and disallowed regions, respectively. meanwhile, the corresponding ramachandran plot regions for the 42e2 designed model comprised 83.0%, 11.6%, and 0.7% of the residues, respectively (supplementary fig. 3). the epitope arrangements in 45f2 and 42e2 were represented in a 3d structure of chimeric proteins, showing that all chosen epitopes were exposed to the protein surface. the five epitopes were displayed as α-helical layers surrounding 3chy linkers, which appeared as five-stranded parallel β-sheets at the structure's center, with the order 21345 (fig. 3).

codon optimization of chimeric multiepitope vaccines and plasmid construction. the ectopic expression of bacterial protein in fish cells may not be achieved because codon utilization differs from that of the bacterial system. therefore, codon optimization of the chimeric multiepitope vaccine was performed with geneart™'s gene optimization according to iso 9001 standards (registration no. 1210024212) to apply the codon bias of oreochromis niloticus. the gc content was well optimized within the ideal range of 30% to 70%.
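as a rough illustration of two sequence-level checks that accompany codon optimization, gc content and the codon adaptation index (cai), the python sketch below computes both for a short coding sequence. the codon counts, the two synonymous groups and the example sequence are hypothetical placeholders, not the nile tilapia codon usage table or the 45f2/42e2 sequences, and the function names are illustrative. it uses the standard definition of cai as the geometric mean of relative codon weights and does not reproduce the proprietary geneart procedure.

```python
import math

def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * sum(1 for base in seq if base in "GC") / len(seq)

def relative_adaptiveness(codon_counts, synonymous_groups):
    """Weight of each codon = its count / count of the most used synonym."""
    weights = {}
    for codons in synonymous_groups:
        best = max(codon_counts[c] for c in codons)
        for c in codons:
            weights[c] = codon_counts[c] / best
    return weights

def cai(seq: str, weights) -> float:
    """Geometric mean of codon weights (codons without a weight are skipped)."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    return math.exp(sum(logs) / len(logs))

# Hypothetical codon counts for two amino acids (Lys and Glu) only.
toy_counts = {"AAA": 20, "AAG": 80, "GAA": 30, "GAG": 70}
toy_groups = [("AAA", "AAG"), ("GAA", "GAG")]
toy_weights = relative_adaptiveness(toy_counts, toy_groups)

toy_seq = "AAGGAGAAAGAG"  # Lys-Glu-Lys-Glu, one rare codon (AAA)
print(round(gc_content(toy_seq), 2))        # about 41.67
print(round(cai(toy_seq, toy_weights), 4))  # about 0.7071
```

a real table would cover all sense codons, and codons with a zero count would need special handling before taking logarithms.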
moreover, negative cis-acting sites, including internal tata-boxes, chi-sites and ribosomal sites; at-rich or gc-rich sequence stretches; rna instability motifs; repeat sequences; rna secondary structures; and splice donor and acceptor sites in higher eukaryotes, were successfully removed from these chimeric multiepitope dna vaccine sequences. the best two predicted chimeric multiepitope vaccines were designated 45f2 and 42e2. the codon adaptation index (cai) values of 45f2 and 42e2 relative to nile tilapia codon usage were 0.92 and 0.93, respectively. the codon quality distribution of 45f2 and 42e2 showed that 77% and 78% of the codons within the dna sequences, respectively, fell in the 90-100 quality range (supplementary fig. 4a-d). the average gc content of both chimeric multiepitope vaccines was 56% (supplementary fig. 4e,f). single-stranded rna-folding prediction revealed the minimum free energy (mfe) secondary structure of 45f2 and 42e2 (supplementary fig. 5).

the protparam server reported a theoretical pi of 4.1 and a molecular mass of 20 kda for 45f2 and 42e2. the total numbers of negatively (asp and glu) and positively (arg and lys) charged amino acid residues were 33 and 13 for 45f2 and 31 and 12 for 42e2, respectively. the estimated half-life of both chimeric multiepitope constructs was approximately 30 h in mammalian reticulocytes (in vitro), more than 20 h in yeast (in vivo), and over 10 h in e. coli (in vivo). 45f2 showed aliphatic index and grand average of hydropathicity values of 65.32 and −0.401, respectively, whereas 42e2 showed values of 67.93 and −0.296, respectively. the 45f2 and 42e2 proteins were predicted to be stable, with instability indexes of 31.31 and 25.53, respectively (values below 40 indicate a stable protein).
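for readers who want to see how the physicochemical indices above are defined, the following python sketch computes the aliphatic index, the grand average of hydropathicity (gravy) and the charged-residue counts for a protein sequence. it assumes the standard definitions (ikai's aliphatic index with coefficients 2.9 and 3.9, and the kyte-doolittle hydropathy scale); the example sequence is a hypothetical ten-residue peptide, not 45f2 or 42e2, and the function names are illustrative rather than part of protparam.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mole_percent(seq: str, residues: str) -> float:
    """Percentage of positions occupied by any of the given residues."""
    return 100.0 * sum(seq.count(r) for r in residues) / len(seq)

def aliphatic_index(seq: str) -> float:
    """Ala% + 2.9 * Val% + 3.9 * (Ile% + Leu%)."""
    return (mole_percent(seq, "A")
            + 2.9 * mole_percent(seq, "V")
            + 3.9 * mole_percent(seq, "IL"))

def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)

def charged_counts(seq: str):
    """Return (negative, positive) counts: Asp+Glu and Arg+Lys."""
    negative = sum(seq.count(r) for r in "DE")
    positive = sum(seq.count(r) for r in "RK")
    return negative, positive

toy_peptide = "MKDEAVLIAG"  # hypothetical sequence for illustration only
print(round(aliphatic_index(toy_peptide), 2))  # about 127.0
print(round(gravy(toy_peptide), 3))            # about 0.67
print(charged_counts(toy_peptide))             # (2, 1)
```

the instability index is omitted because it needs a 400-entry dipeptide weight table; a negative gravy value, as reported for both constructs, indicates an overall hydrophilic protein.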
antigenicity of the 45f2 and 42e2 chimeric multiepitope vaccines was predicted as 0.7538 and 0.7424, respectively, at a threshold of 0.4 for the bacterial model, consistent with the antigenpro server predictions of 0.936 and 0.923, respectively. these results indicate that both vaccine candidates have high potential antigenic properties. conformational b cell epitopes from the 3d protein structure computed by the discotope server demonstrated 11 b cell epitope residues in both 45f2 and 42e2 at a −3.7 threshold (table 2) 24 . interestingly, the number of epitopes was reduced when computed at the −2.5 and −1.0 thresholds, with 45f2 showing 6 and 3 b cell epitope residue regions, respectively, while 42e2 contained only 2 and 1 b cell epitope residue regions, respectively (table 2).

recombinant plasmids harboring 42e2 and 45f2 were constructed, namely, pet28a(+)_42e2 or _45f2 and pcdna3.1(+)_42e2 or _45f2, which were used to determine the recombinant chimeric multiepitope vaccine expression (fig. 4). chimeric multiepitope protein expression was tested in a bacterial expression system and a fish cell (tk-1) culture expression system. these results demonstrated that both chimeric multiepitope proteins could be expressed in both systems, with the expression detectable within 3 h in e. coli (30 kda) and within 7 days post-transfection in tk-1 cells (25 kda) (fig. 5). the larger size of the chimeric multiepitope proteins in the e. coli expression system resulted from an additional tag at the n-terminus, which was contained in the pet28 expression vector.

vaccine efficacy. after vaccination, fish were challenged with s. agalactiae, and infected fish showed clinical signs of streptococcosis disease, such as swirling swimming, opaque eye, exophthalmia and abscess. these moribund fish were collected, and bacteria were re-isolated, showing that they were infected with s. agalactiae serotype iii (supplementary fig. s7).
dna vaccine efficacy testing showed that fish immunized with either 45f2 or 42e2 had cumulative mortality rates of 16.67 ± 5.77% and 16.67 ± 15.27%, respectively, which were not significantly different from those of the fkc-vaccinated fish (p > 0.05). however, in the control group [empty vector; pcdna3.1(+)], 70.00 ± 10.00% mortality was observed at 21 days post-challenge (fig. 6a). the recombinant chimeric multiepitope protein vaccination showed that 45f2 and 42e2 produced cumulative mortality rates of 30.00 ± 10.00% and 26.67 ± 5.77%, respectively, which were significantly lower than those of the negative control group, at 70% (p < 0.05) (fig. 6a). the 45f2 and 42e2 dna vaccines demonstrated similar patterns of rps, with 76.19 ± 8.24% and 76.19 ± 21.82%, respectively, which were not significantly different from those of the fkc-immunized fish (76.19 ± 21.82%). however, they were significantly higher (p < 0.05) than those of the recombinant protein vaccines, which showed 61.90 ± 8.24% and 57.14 ± 14.28% for 42e2 and 45f2, respectively (fig. 6b).

immune response. to determine the immune response, dot blot analysis of serum prepared from 42e2- or 45f2-vaccinated fish was used. the dna vaccine gradually activated the production of fish antibodies from the 1st to the 4th week. the pattern of antibody response differed from that for the recombinant protein vaccine, in which antibody production was significantly elevated and highest in the 2nd-3rd week and dropped sharply in the 4th week. the highest induction was observed in fkc-immunized fish (fig. 7). dot blot analysis of vaccinated fish sera against whole cell lysate of s. agalactiae serotype ia and iii demonstrated that fish vaccinated with recombinant protein vaccine 42e2 and 45f2 showed cross-reactivity to whole cell lysate of s. agalactiae serotype ia and iii (supplementary fig. s8).
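the relationship between the cumulative mortality and rps figures in the vaccine efficacy results can be checked with a short calculation. the python sketch below assumes the commonly used definition rps = (1 − mortality in vaccinated group / mortality in control group) × 100; the text cites a reference for the calculation without restating the formula, so this definition is an assumption, although it reproduces the reported mean values. the function name and dictionary keys are illustrative.

```python
def relative_percentage_survival(vaccinated_mortality: float,
                                 control_mortality: float) -> float:
    """RPS (%) from cumulative mortality percentages of two groups."""
    if control_mortality <= 0:
        raise ValueError("control mortality must be positive")
    return (1.0 - vaccinated_mortality / control_mortality) * 100.0

CONTROL = 70.00  # mean cumulative mortality (%) of the empty-vector group

# Mean cumulative mortalities (%) reported in the text.
reported_mortality = {
    "45f2 dna": 16.67,
    "42e2 dna": 16.67,
    "45f2 protein": 30.00,
    "42e2 protein": 26.67,
}

for group, mortality in reported_mortality.items():
    rps = relative_percentage_survival(mortality, CONTROL)
    print(f"{group}: rps = {rps:.2f}%")
```

with these inputs the sketch gives 76.19% for both dna vaccines, 57.14% for the 45f2 protein vaccine and 61.90% for the 42e2 protein vaccine, matching the means in the text. the standard deviations cannot be recovered this way because they depend on the per-tank values.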
for the reverse vaccinology approach, computational analysis using a variety of bioinformatics tools is robust and beneficial when identifying appropriate vaccine candidates 18 . bacterial genomics and proteomics analysis indeed help researchers analyze proteins, short domains, and pathogenic epitopes that provide high immunogenicity and high antigenicity for multimeric vaccine development 25 . therefore, immunoproteomics should be applied as a preliminary process to screen antigenic proteins and minimize potential candidates for vaccine development 21 . several immunogenic proteins in this study were described previously, such as c5a peptidase and laminin-binding surface protein (lmb), which are cell surface proteins that have an important function in chemoattractant activities and are proteins promoting invasion of group b streptococcus (gbs) 26, 27 . however, the current immunoproteomics analysis from this study identified new immunoreactive proteins, such as bacteriocin transport accessory protein, dihydrofolate reductase, ssu ribosomal protein s8p, transposase tnpa, 1,4-alpha-glucan, cell wall surface anchor family protein, and the gtp-binding protein era. as expected, most of these are cell surface proteins, which are suggested to be associated with bacterial virulence 27, 28 . subsequently, the identified immunoreactive proteins may be used in further vaccine development. multiepitope vaccines are an interesting issue since constructed vaccines designed by in silico analysis may elicit cellular immunity and provide effective responses 25, 29 . it is known that immunodominant b cells could strongly induce both cellular and humoral immunity; thus, evaluation of b cell epitopes was performed to identify potent epitopes before integrating them to produce a multiepitope vaccine. moreover, this vaccine type is more efficient than whole antigens for controlling staphylococcus spp.
infections 20, 30 .

[table 2. predicted conformational b-cell epitopes from 3d structure of designed chimeric multiepitope vaccines using discotope 2.0 server.]

from the present study, linear b cell epitope prediction was assessed and identified 11 potent epitopes from 5 common immunogenic and virulence proteins that were present in serotypes ia and iii. previous studies showed that one of the chosen proteins, sip, is highly conserved among gbs isolates and confers cross-protective immunity against gbs infections 11, 13, 14, 31 . prediction of candidate antigenic proteins can be used to select the bacterial strains that carry antigenic genes, as well as to determine high expression levels in the target host and the accessibility of particular antigens in host organisms 16 . therefore, these selected immunogenic proteins might be suitable for consideration in a rational vaccine design. rational chimeric multiepitope vaccine design was achieved by randomly combining epitopes from 5 immunogenic proteins and conjugating them with core structures of flavodoxin (β-1-5-3chy) to produce a secondary structure with α/β folding. in addition to the α/β-type folding of flavodoxin, it was also useful to construct our chimeric multiepitope vaccine by forcing the 5 chosen epitope segments to fit within 5 α-helix loops and protrude out of the 3d-folded structure since that configuration benefits protein solubility by exposure to water molecules 23 . additionally, this linker may promote the solubility of the constructed vaccine and help enhance the recognition of the vaccine by the host's immune system, which contributes to vaccine efficacy. 45f2 and 42e2 presented the most favored region of protein folding, with only 0.7% of residues in the disallowed region, which is acceptable since fewer than 2% of residues should fall in the disallowed region 32 .
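one way to see how the 1,440 designed models and 7,200 constructs reported in the results could arise from this random combination is to count arrangements. the python sketch below assumes that the five proteins are placed in the five helix positions in every possible order (5! = 120), with one epitope chosen per protein from 6 bac candidates, 2 sip candidates and 1 candidate for each remaining protein (11 epitopes in total). this decomposition is an assumption that reproduces the reported totals; the text does not state it explicitly. the factor of five models per design is likewise a guess, and the epitope labels and the name of the fifth protein are placeholders.

```python
from itertools import permutations, product

# Candidate epitopes per protein; labels are placeholders, not real epitopes.
EPITOPES = {
    "bac": [f"bac_e{i}" for i in range(1, 7)],   # 6 candidates
    "sip": ["sip_e1", "sip_e2"],                 # 2 candidates
    "rib": ["rib_e1"],
    "spb1": ["spb1_e1"],
    "protein5": ["protein5_e1"],                 # hypothetical name
}

def enumerate_designs(epitopes):
    """List every ordering of the proteins with every epitope choice."""
    designs = []
    for order in permutations(epitopes):              # 5! protein orders
        choices = [epitopes[name] for name in order]
        for combo in product(*choices):               # one epitope per protein
            designs.append(combo)
    return designs

designs = enumerate_designs(EPITOPES)
MODELS_PER_DESIGN = 5  # assumed number of structural models kept per design

print(len(designs))                       # 1440
print(len(designs) * MODELS_PER_DESIGN)   # 7200
```

each tuple in `designs` lists the epitope occupying helix positions 1 to 5, so a design such as 45f2 or 42e2 would correspond to one entry of this enumeration under the stated assumption.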
it is suggested that in silico analysis could design a chimeric multiepitope vaccine that could probably manifest effective properties 33,34 . to achieve a high level of protein expression in nile tilapia, codon optimization was conducted to improve the transcription and translation capability by removing all possible cis-acting sequence motifs, which may have a negative impact on protein expression. both proteins had a cai > 0.8 and a codon with frequent distribution (cfd) > 30%, which are acceptable for high expression in the target organism 18, 21 . the gc content of 45f2 and 42e2 was optimized between 30-70% and had a suitable thermodynamic ensemble free energy, which allowed rna folding and thermodynamic stability 35, 36 . the overall points suggested that the modeled 45f2 chimeric multiepitope vaccine was clearly the best candidate vaccine. numerous effective single-serotype gbs vaccines have been reported, including vaccines for controlling streptococcosis in tilapia 10, 11, 13, 14, 17, 37, 38 . however, it is known that single-serotype whole-cell inactivated vaccines have limitations during cross-prevention against different serotypes. for instance, a s. iniae vaccine (serotype i) could not protect atlantic salmon from infection by s. iniae (serotype ii) 39 . meanwhile, mixed-serotype vaccines (serotypes iv and vii) could promote antiserum levels and enhance the survival rate of newborn pups against streptococcal infection 40 . although formalin-killed vaccines generally provide highly protective effects compared with those of subunit vaccines and dna vaccines, the subunit and dna vaccines may replace the original formalin-killed vaccines or inactivated vaccines due to their promising efficacy, which are similar to those of inactivated vaccines, and longer shelf life 18, 33 . evidence suggests that dna and subunit vaccines can efficiently trigger the immune system and promote protective efficacy, with an rps value greater than 50% 11, 13, 14, 17 . 
nevertheless, these vaccines have limitations, such as their mass production costs, and they may require various optimizations to obtain the highest stable storage conditions 18, 41 . regarding this idea, a chimeric multiepitope vaccine composed of different epitopes from different proteins common in both serotypes ia and iii was generated to achieve broad protection against different serotypes and increase their stability. interestingly, the designed chimeric multiepitope dna vaccine and protein vaccine exhibited effective prevention in nile tilapia against s. agalactiae, with efficacy similar to that of the whole-cell inactivated vaccine. this evidence supports the strategy of rational vaccine design through b cell recognition using in silico analysis. importantly, immunoproteomics analysis could assist the preliminary determination of suitable immunogenic proteins for vaccine development due to the distinct antigenic determinants that can mediate dissimilar immune responses. the criterion in immunogenic protein selection for vaccine development has focused on the ability of a particular protein to induce an immune response. among 79 identified proteins shared in both serotypes, in addition to providing the highest bcpred scores (table 1) , the 5 proteins chosen were also reported as virulence proteins and used as vaccine candidates for streptococcosis disease prevention 13 . for example, c-β protein (bac) can lead to antibody production through fc region binding of human iga 42 . sip protein has been shown to mediate protection against streptococcal infection 11, 13, 14 . additionally, the chosen immunogenic protein should be conserved among streptococcus spp., so it would be suitable for application in cross-reactive prevention among s. agalactiae serotypes 11, 43 . moreover, it should be mentioned that peptide vaccines or epitopes with only 30 amino acid residues may trigger immune responses through binding directly to mhc-i or mhc-ii molecules. 
these molecules localize to nonprofessional antigen-presenting cells. vaccines containing proteins with longer amino acid sequences can enhance the presentation of epitopes to dendritic cells due to t cell induction 25, 44, 45 . herein, the comparative efficacy of both the 45f2 and 42e2 dna and recombinant protein vaccines indicated that the dna vaccine provided a higher efficacy than the recombinant protein vaccine. this result suggests that the dna vaccine can prolong the activation of the immune response by triggering both humoral and cellular immune responses 46, 47 . moreover, the clearance rate of the recombinant protein vaccine in the host system may be faster than that of the dna vaccine. this difference implies that the dna vaccine can enter the host cell to produce chimeric multiepitope protein, with that protein existing in the host system for longer than the recombinant protein vaccine, thus enhancing its bioavailability. taken together, these data indicate that the antigen combination has shown promise for streptococcosis disease control in nile tilapia. this research demonstrated a novel platform for rational vaccine design based on chimeric vaccine development that used flavodoxin with a tim-barrel structure as a template. our chimeric protein backbone is suitable for presenting epitopes to be recognized by the host immune system. with 5 epitopes, it could activate antibody production and demonstrated promising protection against bacterial disease similar to that of a whole-cell inactivated vaccine. this platform will promote the production of multivalent vaccines to control multiple diseases and for other applications in the future. experimental fish, bacterial strain and antibody. all male s. agalactiae-free nile tilapia (oreochromis niloticus linn.) were obtained from a commercial gap farm in thailand. the experiments were conducted in accordance with guidelines approved by the national research council of thailand. 
the experimental fish were anesthetized with clove oil to minimize stress during vaccination and challenge testing. s. agalactiae serotypes ia and iii were cultured as described previously 7 . s. agalactiae serotype iii was used for polyclonal antibody (pab) production, which was kindly provided by prof. ikuo hirono, tumsat, japan. antibody against igm of nile tilapia was kindly provided by assist. prof. eakapol wangkahart. mahasarakham university, thailand. immunoproteomics analysis. s. agalactiae was grown in bhi broth at 30 °c with agitation until reaching exponential phase. bacterial cells were collected by centrifugation, lysed in 100 µl of lysis buffer [tris-buffered saline (tbs) with 1% tween-20 and 0.01% lysozyme] and incubated at 50 °c for 20 min following sonication on ice. protein a agarose beads (cell signaling, usa) were added to the bacterial protein lysate, and nonspecific proteins were removed by 10 min of centrifugation at 10,000 × g at 4 °c. clarified supernatant was supplemented with 5% glycerol and then with a pab specific to s. agalactiae serotype iii (1:500 dilution). then, 30 µl of protein a agarose beads were added to separate bound immunogenic proteins, and the bound proteins were separated by acetone precipitation [1:5 (v/v)]. precipitated proteins were solubilized in 20 mm tris-hcl with 0.5% sds, and a lowry assay was used to measure the protein concentration. the protein profile was assessed by fractionating 25 µg of protein on a nupage 4-12% bis-tris protein gel (thermofisher, usa). 3 µg of immunogenic protein was mixed with a lysis buffer (0.1% rapidgest sf in 20 mm ammonium bicarbonate) and 5 mm dtt in 10 mm ammonium bicarbonate at 60 °c for 3 h. this step was followed by incubation with 15 mm iodoacetamide (iaa) in 10 mm ammonium bicarbonate at room temperature for 45 min in the dark. 
the protein solution was cleaned up by a zeba spin desalting column before digestion with 50 ng of sequencing-grade trypsin (promega, germany) at 37 °c for 6 h. tryptic peptides were dried at 44 °c under a vacuum and then protonated with 0.1% formic acid in lc water before injection into the lc-ms/ms system. the tryptic peptides' immunoproteomics profiles were analyzed using an ultimate™ 3000 nano/capillary lc system (dionex, uk) and hybrid quadrupole q-tof impact ii™ (bruker daltonics gmbh, germany) equipped with a nano-captivespray ion source. first, 500 nl of extracted peptide was subjected to a trapping column (thermo scientific, pepmap100, c18, 300 μm i.d. × 5 mm) through a full loop injection before being resolved in an analytical column (pepswift c18 nano column, 100 μm i.d. × 15 cm) at 60 °c. the linear gradient method was used to elute peptides with mobile phase a (0.1% formic acid in water) and mobile phase b (0.1% formic acid in 80% acetonitrile) at a 0.35 µl/min constant flow rate into the mass spectrometer. electrospray ionization was conducted at 1.6 kv using captivespray. mass spectra (ms) and ms/ms spectra were fully acquired in positive ion mode (compass 1.9 for otofseries software, bruker daltonics). mass accuracy was assessed using positive detection mode after internal calibration with sodium trifluoroacetate (na-tfa) within 1.6 ppm. raw lc-ms/ms spectra were converted into the mzxml data format using compassxport version 3.0.9.2 (bruker daltonics gmbh, germany). the mzxml files of the lc-ms/ms datasets for label-free quantification of peptides were evaluated based on the ms profile by maxquant software.

chimeric multiepitope vaccine design. the linear b cell epitope was predicted by bcpred 48 . the scop and cath databases were used to design an appropriate chimeric multiepitope vaccine structure 49 (an: wp_000913277).
a 3d structure was rendered by i-tasser (iterative threading assembly refinement) using the qualifying c-score value as a confidence score 32 . to refine the tertiary structure, the derived i-tasser results in the pdb files were prepared using the galaxyrefine server, which performed a repeated structure perturbation, and the best structural relaxation candidates were chosen 19 . moreover, to obtain the best chimeric multiepitope vaccine candidates, the residue stereochemical quality of all the refined chimeric multiepitope models was assessed and validated by the procheck program v.3.5.4 to generate ramachandran plots 50 .

codon optimization. amino acid sequences were reverse-translated to nucleotide sequences using nile tilapia codon usage (oreochromis niloticus [gbvrt]: 113). the codon adaptation index (cai) of the designed vaccine candidates' nucleotide sequences was analyzed by the optimizer program (http://genomes.urv.es/optimizer/) and combined with geneart™'s gene optimization process (thermo fisher scientific, usa). the secondary structure of the single-stranded rna folding and free energy of the thermodynamic ensemble were calculated by the rnafold web server 51 . the optimized dna sequence was synthesized by geneart® gene synthesis (thermo [...]

to verify the ectopic expression of the chimeric multiepitope dna vaccine, pcdna3.1(+)_42e2 or _45f2 was transfected into tk1 (tilapia kidney 1) tilapia cells using effectene transfection reagent (qiagen, germany). the transfected fish cell cultures were maintained with leibovitz's l-15 media containing 10% fbs and penicillin-streptomycin at 25 °c, and dna vaccine expression was determined after 1 week. recombinant chimeric multiepitope protein was purified by ni-nta agarose beads (qiagen) with a gradient concentration buffer of imidazole ranging from 5 mm to 500 mm.
subsequently, the gel filtration chromatography method was performed by fast protein liquid chromatography (fplc) incorporated with a hiprep 16/60&26/60 sephacryl s-300 high-resolution column (ge healthcare, usa) using a 1 × pbs buffer with a 1 ml/min flow rate. recombinant protein detection was confirmed by sds-page analysis and western blot analysis using an anti-his tag antibody (recombinant protein vaccine) or an anti-flag (rabbit igg) (dna vaccine) and anti-rabbit antibody conjugated to ap (alkaline phosphatase). vaccine efficacy analysis. to evaluate vaccine performance, nile tilapia (o. niloticus) were immunized with chimeric multiepitope vaccines (recombinant protein and dna vaccines), followed by bacterial challenge. a total of 6 experimental groups, namely, 1) the 45f2 recombinant protein vaccine, 2) 42e2 recombinant protein vaccine, 3) 45f2 dna vaccine, 4) 42e2 dna vaccine, 5) formalin-killed (fkc) s. agalactiae vaccine 58 , and 6) pcdna3.1(+) [empty vector], were conducted in triplicate. before vaccination, 25 streptococcosis-free nile tilapia (60 ± 5 g) were transferred into 18 glass aquarium tanks containing 30 l of water for one week. after a week of acclimatization, fish were vaccinated according to above mentioned groups. all fish were maintained under running and aerated water at 30 ± 3 °c and fed with commercial pellet feed twice a day. for the chimeric multiepitope protein vaccination, purified 45f2 and 42e2 proteins were mixed with montanide isa 763 (seppic, france) in a 7:3 ratio prior to intraperitoneal injection with 200 µg of protein per fish. for the chimeric multiepitope dna vaccine, plasmid dna of 45f2 and 42e2 were purified by ultracentrifugation using a cscl gradient 59 and dissolved in te buffer (ph 8.0) to obtain a concentration of 0.1 µg/µl. the dna vaccine was applied to the fish with 10 µg of dna through intramuscular injection. fkc and pcdna3.1(+) were used as positive and negative controls, respectively. 
the schedule of vaccine efficacy analysis and immune response analysis is shown in supplementary fig. 9 . for the immune response analysis, blood was drawn from the caudal vein to separate serum for the immunoblotting assay, and those fish were transferred to a separate tank. the analysis was performed every week, using 3 fish in each treatment from the 1st week to the 4th week. one month after vaccination, 10 vaccinated fish in each treatment group were taken from among the remaining fish for serum collection and anesthetized with eugenol before challenge with s. agalactiae (serotype iii) at 1 × 10^7 cfu/ml through ip administration. mortality and clinical signs of infected tilapia were recorded daily for 3 weeks. the brain, head kidney, and liver were collected from moribund fish for bacterial isolation and identification 7 . cumulative mortality and relative percentage survival (rps) were calculated 60 . a one-way analysis of variance (anova) was used for statistical analysis and p < 0.05 was considered significant.

to detect the antibody response after immunization, antibody production was evaluated through dot blot analysis using the minifold® i dot blot system (ge healthcare, germany). briefly, 20 µl of purified 42e2 or 45f2 protein, or a whole-cell lysate of s. agalactiae (10 µg/ml), was spotted on a nitrocellulose membrane and blocked with blocking solution (0.1% bsa in tbst) before adding 10 µl of serum from the treatment groups described above. then, the membrane was probed with a primary antibody (anti-igm at 1:5,000) for 1.5 h, followed by washing 3 times with tbst buffer and 45 min of incubation with an anti-mouse igg hrp-linked ab (1:10,000). subsequently, the signal was detected with a chemidoc™ imaging system (bio-rad) after adding a substrate reagent (perkinelmer, usa). the integrated density of the dot blot was analyzed by imagej (version 1.x) 61 .
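the one-way anova named in the statistical analysis above can be illustrated with a small calculation of the f statistic. the python sketch below is a generic textbook implementation and does not reproduce the authors' analysis. the three groups of triplicate mortality percentages are hypothetical values chosen only to be on a similar scale to the reported means; they are not the study's per-tank data, and the function name is illustrative.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample lists."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)

    # Between-group sum of squares: group size times squared mean offset.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: spread of values around their group mean.
    ss_within = sum(
        sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical triplicate cumulative mortalities (%) for three treatments.
hypothetical_groups = [
    [10.0, 20.0, 20.0],   # e.g. a vaccinated group
    [20.0, 30.0, 40.0],   # e.g. another vaccinated group
    [60.0, 70.0, 80.0],   # e.g. a negative control group
]

f_stat, df_b, df_w = one_way_anova_f(hypothetical_groups)
print(round(f_stat, 3), df_b, df_w)  # about 29.714 with 2 and 6 degrees of freedom
```

the p value follows from the f distribution with the returned degrees of freedom. with scipy installed, `scipy.stats.f_oneway` returns both the statistic and the p value. a post hoc test would still be needed to say which pairs of groups differ.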
five different piscidins from nile tilapia, oreochromis niloticus: analysis of their expression and biological functions prevention and control of viral disease in aquaculture development of a quantitative pcr assay for monitoring streptococcus agalactiae colonization and tissue tropism in experimentally infected tilapia parasites and diseases streptococcus agalactiae serotype distribution and antimicrobial susceptibility in pregnant women in gabon microevolution of streptococcus agalactiae st-261 from australia indicates dissemination via imported tilapia and ongoing adaption to marine hosts or environment molecular serotyping, virulence gene profiling and pathogenicity of streptococcus agalactiae isolated from tilapia farms in thailand by multiplex pcr a microwave-irradiated s. agalactiae vaccine provides partial protection against experimental challenge in nile tilapia oreochromis niloticus efficacy of an experimentally inactivated s. agalactiae vaccine in nile tilapia (oreochromis niloticus) reared in brazil development of live attenuated streptococcus agalactiae vaccine for tilapia via continuous passage in vitro a recombinant truncated surface immunogenic protein (tsip) plus adjuvant fia confers active protection against group b streptococcus infection in tilapia development and efficacy of feed-based recombinant vaccine encoding the cell wall surface anchor family protein of s. agalactiae against streptococcosis in oreochromis sp safety and immunogenicity of an oral dna vaccine encoding sip of streptococcus agalactiae from nile tilapia oreochromis niloticus delivered by live attenuated salmonella typhimurium protective efficacy of cationic-plga microspheres loaded with dna vaccine encoding the sip gene of s. agalactiae in tilapia assembly and role of pili in group b streptococci dentification of universal group b streptococcus vaccine by multiple genome screen protection of nile tilapia (oreochromis niloticus l.) 
against streptococcus agalactiae following immunization with recombinant fbsa and alpha-enolase designing an efficient multi-epitope peptide vaccine against vibrio cholera via combined immunoinformatics and protein interaction based approaches immunoinformatics analysis and in silico designing of a novel multi-epitope peptide vaccine against staphylococcus aureus multiple b-cell epitope vaccine induces a staphylococcus enterotoxin b-specific lgg 1 protective response against mrsa infection designing of complex multi-epitope peptide vaccine based on omps of klebsiella pneumonia: an in silico approach evaluation of β-amino acid replacements in protein loops: effects on conformational stability and structure crystal structure of escherichia coli chey refined at 1.7-a resolution reliable b cell epitope predictions: impacts of method development and improved benchmarking a novel multi-epitope peptide vaccine against cancer: an in silico approach structure of the streptococcal cell wall c5a peptidase enhanced expression of lmb gene encoding laminin-binding protein in streptococcus agalactiae strains harboring is 1548 in scpb-lmb intergenic region adhesion, invasion and evasion: the many functions of the surface proteins of staphylococcus aureus multi-epitope vaccines: a promising strategy against tumors and viral infections novel targeted immunotherapy approaches for staphylococcal infection identification of group b streptococcal sip protein, which elicits cross-protective immunity the i-tasser suite: protein structure and function prediction designing a novel multi-epitope peptide vaccine against pathogenic shigella spp. 
based immunoinformatics approaches novel immunoinformatics approaches to design multi-epitope subunit vaccine for malaria by investigating anopheles salivary protein exploring dengue genome to construct a multi-epitope based subunit vaccine by utilizing immunoinformatics approach to battle against dengue infection on the normalization of the minimum free energy of rnas by sequence length efficacy of feed-based adjuvant vaccine against streptococcus agalactiae in oreochromis spp growth, immune responses and protection of nile tilapia oreochromis niloticus immunized with formalinkilled streptococcus agalactiae serotype ia and iii. vaccines recovery of streptococcus iniae from diseased fish previously vaccinated with a streptococcus vaccine conjugate vaccines against group b streptococcus types iv and vii. the journal of infectious diseases development of streptococcus agalactiae vaccines for tilapia streptococcal β protein has separate binding sites for human factor h and iga-fc identification of group b streptococcal sip protein, which elicits cross-protective immunity immunotherapy of established (pre)malignant disease by synthetic long peptide vaccines a web resource for designing subunit vaccine against major pathogenic species of bacteria prospects for control of emergimg infectious dyseases with plasmid dna vaccines dna vaccines: ready for prime time? predicting linear b-cell epitopes using string kernels scope: structural classification of proteins extended, integrating scop and astral data and classification of new structures aqua and procheck-nmr: programs for checking the quality of protein structures solved by nmr viennarna package 2.0. 
algorithms for molecular biology dna synthesis and biological security prediction of n-glycosylation sites in human proteins precision mapping of the human o-galnac glycoproteome through simplecell technology protein identification and analysis tools on the vaxijen: a server for prediction of protective antigens, tumor antigens and subunit vaccines high-throughput prediction of protein antigenicity using protein microarray data first report of streptococcus agalactiae isolated from oreochromis niloticus in piura, peru: molecular identification and histopathological lesions dna extraction from 0.22 µm sterivex filters and cesium chloride density gradient centrifugation potency and efficacy test of a vaccine in addition with adjuvant against koi herpesvirus in koi (cypronus carpio) nih image to imagej: 25 years of image analysis the authors declare no competing interests. supplementary information is available for this paper at https://doi.org/10.1038/s41598-019-57283-0.correspondence and requests for materials should be addressed to s.u. publisher's note springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.open access this article is licensed under a creative commons attribution 4.0 international license, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the creative commons license, and indicate if changes were made. the images or other third party material in this article are included in the article's creative commons license, unless indicated otherwise in a credit line to the material. if material is not included in the article's creative commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. 
to view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. key: cord-000182-ni6iyzdn authors: he, zhisong; zhang, jian; shi, xiao-he; hu, le-le; kong, xiangyin; cai, yu-dong; chou, kuo-chen title: predicting drug-target interaction networks based on functional groups and biological features date: 2010-03-11 journal: plos one doi: 10.1371/journal.pone.0009603 sha: doc_id: 182 cord_uid: ni6iyzdn background: study of drug-target interaction networks is an important topic for drug development. it is both time-consuming and costly to determine compound-protein interactions or potential drug-target interactions by experiments alone. as a complement, the in silico prediction methods can provide us with very useful information in a timely manner. methods/principal findings: to realize this, drug compounds are encoded with functional groups and proteins encoded by biological features including biochemical and physicochemical properties. the optimal feature selection procedures are adopted by means of the mrmr (maximum relevance minimum redundancy) method. instead of classifying the proteins as a whole family, target proteins are divided into four groups: enzymes, ion channels, g-proteincoupled receptors and nuclear receptors. thus, four independent predictors are established using the nearest neighbor algorithm as their operation engine, with each to predict the interactions between drugs and one of the four protein groups. as a result, the overall success rates by the jackknife cross-validation tests achieved with the four predictors are 85.48%, 80.78%, 78.49%, and 85.66%, respectively. conclusion/significance: our results indicate that the network prediction system thus established is quite promising and encouraging. identification of drug-target interaction networks is an essential step in the drug discovery pipeline [1] . 
the emergence of molecular medicine and the completion of the human genome project provide more opportunity to discover unknown target proteins of drugs. many efforts have been made to discover new drugs in the past few years. however, the number of new drug approvals remains quite low (around only 30 per year). this is partially because many compounds or drug candidates have to be withdrawn owing to unacceptable toxicity. such failures have wasted a lot of money. it would be beneficial to develop computational methods for predicting the sensitivity and toxicity before a drug candidate was synthesized [2, 3, 4] . however, a number of problems need to be overcome in order to find out the exact effects of a drug. firstly, drugs could have numerous effects including positive and negative effects, and it is hard to find out and elucidate the possible effects; secondly, different people would have completely different responses to a drug even though the same gene products are only slightly different [5, 6, 7, 8] ; thirdly, it is very hard to trace the drug effects since the biological interaction pathways are extremely complicated in human beings. therefore, it would be very helpful for drug development if the interactions between drugs and target proteins could be predicted more accurately and the underlying mechanisms could be better understood. several computational approaches have been developed for analyzing and predicting drug-protein interactions. the most commonly used are docking simulations [9, 10, 11, 12] , literature text mining [13] , and combining chemical structure, genomic sequence, and 3d structure information [14] , among others (see, e.g., [15, 16, 17] ). machine learning and data mining methods have been widely used in the computational biology and bioinformatics area. 
many researchers have made considerable efforts to develop useful algorithms and software to investigate various drug-related biological problems, such as hiv protease cleavage site prediction [18, 19], identification of gpcr (g protein-coupled receptors) type [20, 21], protein signal peptide prediction [22], protein subcellular location prediction [23, 24, 25], analysis of specificity of galnac-transferase protein [26], identification of protease type [27, 28], membrane protein type prediction [29, 30, 31, 32], and a series of relevant webserver predictors as summarized in a recent review [33]. here we propose a predictor for drug-target interactions based on the nearest neighbor algorithm [34]. since biochemical and physicochemical features [35] are important for characterizing proteins, in this study they are used to represent proteins as done by many previous investigators (see, e.g., [36, 37, 38]). to improve the predictor's performance, the minimum redundancy maximum relevance (mrmr) algorithm [39] is used to rank the features. meanwhile, incremental feature selection and forward feature selection are applied for feature selection. the protein targets for drugs are divided into enzymes, ion channels [40, 41, 42, 43], gpcrs [44, 45], and nuclear receptors [14] in this study. finally, four predictors for predicting the interactions of drugs with each of the four protein families are developed in hopes that they can help provide useful information for drug design. in addition to the dataset used by yamanishi et al. [14], information about drug compounds and genes can be obtained from kegg [46, 47] by the ftp operations: ftp://ftp.genome.jp/pub/kegg/ligand/drug/drug for the drugs, and ftp://ftp.genome.jp/pub/kegg/genes/fasta/gene.pep for the genes. 
after excluding the drug-target pairs that lack experimental information, we finally obtained a total of 4,797 drug-target pairs, of which 2,719 were for enzymes, 1,372 for ion channels, 630 for gpcrs, and 82 for nuclear receptors. all these datasets were used as the positive datasets in the current study. the corresponding negative datasets were derived from the above positive datasets via the following steps: (1) separate the pairs in the above positive dataset into single drugs and proteins; (2) re-couple these singles into pairs in such a way that none of them occurs in the corresponding positive dataset; (3) randomly pick negative pairs thus formed until their number is twice that of the positive pairs. the drug-target benchmark datasets thus obtained for enzymes, ion channels, gpcrs, and nuclear receptors are given in online supporting information s1, s2, s3, and s4, respectively. representing drugs with chemical functional groups composition. the number of drugs is extremely large. however, most of them are small organic molecules composed of a few fixed small structures, called functional groups. since functional groups usually represent the characteristics of a compound as well as its reaction mechanism with other molecules, features derived from its functional groups could be very effective in characterizing a drug. moreover, the number of common functional groups is quite small, and hence it is possible to use the functional group composition to uniquely represent a drug [48] . 
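the three-step construction of the negative datasets described above can be sketched as follows. this is a minimal illustration rather than the authors' code: the function name `build_negative_pairs`, the `ratio` and `seed` parameters, and the toy identifiers are assumptions introduced here, and all non-positive combinations are enumerated before sampling, which is only practical for moderately sized datasets.

```python
import random

def build_negative_pairs(positive_pairs, ratio=2, seed=0):
    """Re-couple drugs and targets into pairs that are absent from the positive set."""
    positives = set(positive_pairs)
    # Step 1: split the positive pairs into single drugs and single proteins.
    drugs = sorted({drug for drug, _ in positives})
    targets = sorted({target for _, target in positives})
    # Step 2: re-couple them, keeping only combinations not seen as positives.
    candidates = [(d, t) for d in drugs for t in targets if (d, t) not in positives]
    # Step 3: randomly pick `ratio` times as many negatives as positives.
    wanted = ratio * len(positives)
    if wanted > len(candidates):
        raise ValueError("not enough non-interacting combinations to sample from")
    return random.Random(seed).sample(candidates, wanted)

# Toy identifiers (hypothetical, not taken from KEGG).
positives = [("D1", "T1"), ("D2", "T2"), ("D3", "T3")]
negatives = build_negative_pairs(positives)
```

fixing the seed keeps the sampled negative set reproducible between runs; the source does not state whether the original sampling was seeded.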
a number of functional groups are available in nature, and we selected the following 28 common groups for the current study: (1) alcohol, (2) aldehyde, (3) amide, (4) amine, (5) hydroxamic acid, (6) phosphorus, (7) carboxylate, (8) methyl, (9) ester, (10) ether, (11) imine, (12) ketone, (13) nitro, (14) halogen, (15) thiol, (16) sulfonic acid, (17) sulfone, (18) sulfonamide, (19) sulfoxide, (20) sulfide, (21) ar_5c_ring, (22) ar_6c_ring, (23) non_ar_5c_ring, (24) non_ar_6c_ring, (25) hetero ar_6_ring, (26) hetero non_ar_5_ring, (27) hetero non_ar_6_ring, and (28) hetero ar_5_ring. thus, following the same treatment as in [23] , a drug compound can now be formulated as a 28-d (dimensional) vector given below: $\mathbf{D} = [g_1, g_2, \ldots, g_{28}]^{\mathsf{T}}$, where $g_i$ ($i = 1, 2, \ldots, 28$) is the occurrence frequency of the i-th functional group in the drug d, and t is the matrix transpose operator. representing target proteins with pseudo amino acid composition by incorporating biochemical and physicochemical features. now the problem is how to effectively represent a target protein. two kinds of representations are generally used in this regard: the sequential representation and the non-sequential representation. the most typical sequential representation for a protein sample is its entire amino acid sequence, which can contain the most complete information of a protein. to deal with this model, the sequence-similarity-search-based tools, such as blast [49] , are usually used to find the desired results. unfortunately, this kind of approach fails to work when the query protein does not have significant homology to the proteins in the training dataset. thus, various non-sequential representations or discrete models were proposed. the simplest discrete model was based on the amino acid composition (aac) (see, e.g., [50] ). however, if using the aac model to represent a protein, all its sequence-order information will be lost. 
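the loss of sequence-order information in the aac model can be seen in a short sketch. the helper name `aac` and the alphabet ordering are assumptions made for illustration; the example sequence is the one used later in the text, and the second sequence is simply the same residues sorted alphabetically.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (ordering is an arbitrary choice).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def aac(sequence):
    """Return the 20-d amino acid composition (fraction of each residue type)."""
    counts = Counter(sequence.upper())
    length = len(sequence)
    return [counts[a] / length for a in AMINO_ACIDS]

seq = "DMAEIMSDKPQAGML"
reordered = "".join(sorted(seq))  # same residues, different order

# Both sequences map to the same 20-d vector, so the order cannot be recovered.
same_vector = aac(seq) == aac(reordered)
```

because the two vectors are identical, any predictor that sees only the aac cannot distinguish the two sequences, which is the limitation that motivates the pseudo amino acid composition.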
to avoid completely losing the sequence-order information, the pseudo amino acid composition (pseaac) was proposed [36] to represent the sample of a protein. the pseaac can be used to represent a protein sequence with a discrete model yet without completely losing its sequence-order information. for further information about pseaac, see http://en.wikipedia.org/wiki/pseudo_amino_acid_composition. ever since the concept of pseaac was introduced, it has been widely used to study various problems in proteins and protein-related systems (see, e.g., [37, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66] ). meanwhile, many different forms of discrete models were also proposed (see, e.g., [20, 30, 32, 51, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82] ). however, regardless of how different these models are, they all belong to different forms of pseaac, as elucidated in a recent comprehensive review [83] . here, we propose a different pseaac to represent drug-targeted proteins in terms of their biochemical and physicochemical features [84] . six different types of features were considered: (1) hydrophobicity, (2) polarizability, (3) polarity, (4) secondary structure, (5) normalized van der waals volume, and (6) solvent accessibility. each amino acid residue in a protein sequence can be represented by a set of different states according to its features. for instance, its hydrophobicity feature can be marked by one of the following three states: ''polar'', ''neutral'', or ''hydrophobic'' [85] ; its solvent accessibility feature by one of the two: ''buried'' or ''exposed to solvent'', as predicted by predacc [35] ; its secondary structure feature by one of the three: ''helix'', ''sheet'', or ''coil'', as predicted by the method in [86] ; and so forth. 
thus, a protein sequence can be translated to a series of codes according to the biochemical and physicochemical properties of its constituent amino acid residues. for example, if using ''p'', ''n'' and ''h'' to represent the three states of hydrophobicity: ''polar'', ''neutral'', and ''hydrophobic'', the protein sequence ''dmaeimsdkpqagml'' can be translated to ''phnphhnppnpnnhh'' according to the codes of the hydrophobic property feature. the encoded sequences thus obtained would have different lengths for proteins of different sizes, which is difficult for the prediction engine to handle. to turn the feature-encoded sequence into a vector with a fixed number of dimensions, three properties of a sequence were used: composition (c), transition (t), and distribution (d). c represents the global composition of each letter in the sequence; t, the frequency of a code letter changing from one to another; d, the distribution pattern of the code letters along the sequence, measuring the percentage of the sequence length within which the first, 25%, 50%, 75%, and 100% of the amino acids of each code letter is located. take the above hydrophobic property sequence as an example: its c feature is 5/15 = 33.3% for all of p, h, and n, while the t feature is 2/10 = 20%, 3/10 = 30% and 5/10 = 50% for the changes between h and p, n and h, n and p, respectively. the measurement of feature d is a little more complicated. for the letter h, the first, 25%, 50%, 75% and 100% of hs in the sequence are located at positions 2, 5, 6, 14, and 15. thus its d feature is (2/15 = 13.3%, 5/15 = 33.3%, 6/15 = 40.0%, 14/15 = 93.3%, 15/15 = 100%). the d features of p and n are obtained in the same way, so a sequence encoded with three states yields 3 (c) + 3 (t) + 15 (d) components, with a total of 21 components. likewise, for the sequences encoded by the other four biochemical properties, each is also corresponding to 21 components. but for the sequence encoded by the solvent accessibility with only two states (''buried'' or ''exposed to solvent''), the encoded sequence is corresponding to only 14 components. 
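the composition, transition and distribution features of the worked example can be reproduced with a small sketch that operates directly on the encoded string. the function name `ctd` is hypothetical. two conventions are inferred from the example rather than stated in the text: transitions are divided by the number of letter changes (10 here), and the 25%, 50% and 75% marks use the ceiling of the fractional count, which is what yields positions 2, 5, 6, 14 and 15 for h.

```python
import math

def ctd(encoded, letters):
    """Composition, transition and distribution of a state-encoded sequence."""
    n = len(encoded)
    # Composition: fraction of the sequence occupied by each code letter.
    comp = {a: encoded.count(a) / n for a in letters}
    # Transition: share of letter changes that occur between each pair of letters.
    changes = [p for p in zip(encoded, encoded[1:]) if p[0] != p[1]]
    trans = {}
    for i, a in enumerate(letters):
        for b in letters[i + 1:]:
            k = sum(1 for p in changes if set(p) == {a, b})
            trans[a + b] = k / len(changes) if changes else 0.0
    # Distribution: relative positions of the first, 25%, 50%, 75% and 100%
    # occurrence of each letter (ceiling rule, inferred from the worked example).
    dist = {}
    for a in letters:
        pos = [i + 1 for i, c in enumerate(encoded) if c == a]
        marks = [1] + [math.ceil(q * len(pos)) for q in (0.25, 0.5, 0.75, 1.0)]
        dist[a] = [pos[m - 1] / n for m in marks] if pos else [0.0] * 5
    return comp, trans, dist

comp, trans, dist = ctd("phnphhnppnpnnhh", "phn")
n_components = len(comp) + len(trans) + sum(len(v) for v in dist.values())
```

for the example string this gives a composition of 5/15 for every letter, transitions of 2/10, 3/10 and 5/10, and 21 components in total, matching the figures quoted above.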
finally, by adding the 20 components of aac [87] into the vector concerned, the total number of components thus obtained for a given protein is 5 × 21 + 20 + 14 = 139; i.e., the protein can be formulated as a 139-d vector given by $\mathbf{P} = [p_1, p_2, \ldots, p_{139}]^{\mathsf{T}}$, where $p_i$ ($i = 1, 2, \ldots, 139$) is the i-th component of the protein p. of the 139 components, 119 are derived according to the codes of the above six biochemical and physicochemical features, and 20 are the aac components of p. with all samples represented by a feature vector, it is now possible for us to construct our predictor using the machine learning approach. the nn (nearest neighbor) algorithm is quite popular in the pattern recognition community owing to its good performance and ease of use. according to the nn rule [88] , the query sample should be assigned to the subset represented by its nearest neighbor. in this study, if the drug-target pair with the shortest distance is a positive sample, meaning that they can interact with each other, the sample for test is seen as a positive drug-target pair. otherwise, the test sample is seen as a negative one. there are many different definitions to measure the ''nearness'' for the nn algorithm, such as euclidean distance, hamming distance [89] , and mahalanobis distance [50, 90, 91] . in the current study, the following equation was adopted to measure the nearness between samples $\mathbf{v}_x$ and $\mathbf{v}_y$: $D(\mathbf{v}_x, \mathbf{v}_y) = 1 - \frac{\mathbf{v}_x \cdot \mathbf{v}_y}{\|\mathbf{v}_x\| \, \|\mathbf{v}_y\|}$, where $\mathbf{v}_x \cdot \mathbf{v}_y$ is the dot product of the two vectors, and $\|\mathbf{v}_x\|$ and $\|\mathbf{v}_y\|$ their moduli, respectively. when $\mathbf{v}_x \equiv \mathbf{v}_y$ we have $D(\mathbf{v}_x, \mathbf{v}_y) = 0$, indicating the ''distance'' between these two sample vectors is zero and hence they have perfect or 100% similarity. after constructing the drug-target interaction predictor, we have to evaluate its performance. in statistical prediction, the following three cross-validation methods are often used to examine a predictor for its effectiveness in practical application: independent dataset test, subsampling (k-fold cross-validation) test, and jackknife test [92] . 
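a nearest neighbor predictor and its jackknife evaluation can be sketched in a few lines. this is an illustrative sketch, not the authors' implementation: it assumes the nearness measure is one minus the cosine of the angle between two feature vectors, which agrees with the description in terms of the dot product, the moduli, and a zero distance for identical vectors. the function names and the toy data are hypothetical, and all vectors are assumed to be non-zero.

```python
import math

def distance(vx, vy):
    """One minus the cosine similarity of two non-zero vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(vx, vy))
    norm = math.sqrt(sum(a * a for a in vx)) * math.sqrt(sum(b * b for b in vy))
    return 1.0 - dot / norm

def nn_predict(query, train_vectors, train_labels):
    """Assign the query the label of its nearest training sample."""
    best = min(range(len(train_vectors)),
               key=lambda i: distance(query, train_vectors[i]))
    return train_labels[best]

def jackknife_accuracy(vectors, labels):
    """Leave each sample out in turn and predict it from all the others."""
    hits = 0
    for i in range(len(vectors)):
        rest_v = vectors[:i] + vectors[i + 1:]
        rest_l = labels[:i] + labels[i + 1:]
        hits += nn_predict(vectors[i], rest_v, rest_l) == labels[i]
    return hits / len(vectors)

# Toy data: label 1 marks an interacting pair and label 2 a non-interacting one.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [1, 1, 2, 2]
accuracy = jackknife_accuracy(vectors, labels)
```

because every sample is left out exactly once, the jackknife result for a given dataset does not depend on a random split, which is the property the text highlights.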
however, as elucidated by [24] and demonstrated by eq.50 in [93] , among the three cross-validation methods, the jackknife test is deemed the most objective that can always yield a unique result for a given benchmark dataset, and hence has been increasingly used and widely recognized by investigators to examine the accuracy of various predictors (see, e.g. [51, 53, 54, 55, 56, 57, 59, 62, 63, 64, 94, 95, 96] ). accordingly, in this study the jackknife cross-validation was adopted to calculate the success prediction rates as well. although we have constructed the drug-target predictor based on the original feature set described above, it is possible to improve its performance with a better feature set. apparently, not every feature in the feature set is equally relevant to the drug-target interaction. what is more, features may not be independent of each other. the ''bad'' features will have a negative impact on the accuracy and efficiency of the predictor, so a feature selection process can be used to construct a more compact and effective feature set. the first step is using maximum relevance minimum redundancy (mrmr) [36] to do feature evaluation. maximum relevance minimum redundancy (mrmr) [39] was first developed for analysis of microarray data. it ranks each feature according to its relevance to the target and redundancy to other features. the better a feature is deemed to be, the higher the rank it will be assigned. mutual information (mi), denoted by i, indicates the dependence of two features and is used to quantify the relevance and redundancy. mi is defined as follows: $I(x, y) = \iint p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)} \, dx \, dy$, where $p(x, y)$ is the joint probabilistic density of x and y, and $p(x)$ and $p(y)$ are their marginal probabilistic densities. based on mi, we can quantify relevance (d) and redundancy (r) as: $D = I(f_{\mathrm{candidate}}, c)$ and $R = \frac{1}{m} \sum_{f_i \in \Omega_s} I(f_{\mathrm{candidate}}, f_i)$, where $f_{\mathrm{candidate}}$ is the feature to be calculated, $c$ is the target variable, and the sum runs over the $m$ features $f_i$ of the already-selected feature set $\Omega_s$. 
by combining the above two equations to maximize relevance and minimize redundancy, the following mrmr function is constructed: $\max_{f_j \in \Omega_t} \left[ I(f_j, c) - \frac{1}{m} \sum_{f_i \in \Omega_s} I(f_j, f_i) \right]$ ($j = 1, 2, \ldots, n$), where $\Omega_s$ and $\Omega_t$ are the already-selected feature set and to-be-selected feature set, respectively, and $m$ and $n$ are the sizes of these two feature sets, respectively. the earlier a feature is selected, the better it would be thought of. finally, we can get an ordered feature list with a rank for every feature to indicate its importance in the feature set. in our study, the mrmr program is obtained from: http://research.janelia.org/peng/proj/mrmr/index.htm. to calculate mi, the joint probabilistic density and the marginal probabilistic densities of the two vectors were used. a parameter t is introduced here to deal with these variables. suppose mean to be the average value of one feature in all samples, and std to be the standard deviation, the feature of each sample would be classified into one of the three groups according to the boundaries: mean ± (t · std). in our study, t was assigned to be 1. as mentioned above, the importance of each feature is rated according to its rank in the mrmr analysis. the next step is to determine which features should be selected as the optimal feature set for our drug-target predictor. here the ifs (incremental feature selection) procedure is used to solve the problem. each feature in the mrmr feature list was added one by one, and $N$ different feature sets are obtained if the total feature number is $N$, while the i-th feature set is: $S_i = \{f_1, f_2, \ldots, f_i\}$ ($1 \le i \le N$). based on each of the $N$ feature sets, an nn algorithm predictor was constructed and tested with the jackknife cross-validation test. with all the $N$ overall accuracy rates calculated, we could draw an ifs curve with the index i as the x-axis and the corresponding overall accuracy rate as the y-axis. thus, $S_{\mathrm{opt}} = \{f_1, f_2, \ldots, f_n\}$ is regarded as the optimal feature set if the curve reaches its peak where the value of its x-axis is $n$ ($n \le N$). 
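the mrmr ranking step described above can be sketched as follows. this is not the program referenced in the text, only a small illustration under stated assumptions: each feature is discretized into three groups with boundaries at mean ± t·std (population standard deviation assumed), mutual information is computed from empirical frequencies, and features are picked greedily by relevance minus mean redundancy to the features already selected. all function names are hypothetical.

```python
import math
from collections import Counter

def discretize(column, t=1.0):
    """Map each value to group 0, 1 or 2 using the boundaries mean -/+ t*std."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
    lo, hi = mean - t * std, mean + t * std
    return [0 if x < lo else 2 if x > hi else 1 for x in column]

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def mrmr_rank(columns, target, t=1.0):
    """Return feature indices ordered from first selected to last selected."""
    cols = [discretize(c, t) for c in columns]
    relevance = [mutual_information(c, target) for c in cols]
    selected, remaining = [], list(range(len(cols)))
    while remaining:
        def score(j):
            if not selected:
                return relevance[j]
            redundancy = sum(mutual_information(cols[j], cols[i])
                             for i in selected) / len(selected)
            return relevance[j] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy features: f1 is a rescaled copy of f0, f2 is weaker but less redundant.
f0 = [-2, -2, 0, 0, 0, 0, 2, 2]
f1 = [-4, -4, 0, 0, 0, 0, 4, 4]
f2 = [-2, 0, 0, 0, 0, 0, 0, 2]
target = [1, 1, 2, 2, 2, 2, 1, 1]
ranking = mrmr_rank([f0, f1, f2], target)
```

in this toy case the duplicated feature is ranked last even though its relevance equals that of the first feature, which illustrates how the redundancy term penalizes features that carry no new information.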
because four independent predictors are needed for the four different classes of drug-target pairs, the ifs analysis procedure will be processed four times, once for each specific predictor. to refine feature selection, the ffs (forward feature selection) procedure based on the result of ifs was used. ffs is a feature selection method based on ifs results which tries every feature in the candidate feature set and, in each round, adds the feature that achieves the highest prediction accuracy to the already-selected feature set. suppose the ifs curve reaches its peak at the x-axis value $k$; the initial ffs-selected feature set was constructed as: $S_{\mathrm{FFS}} = \{f_1, f_2, \ldots, f_k\}$. more features in the ffs-to-be-selected feature set would be added into the ffs-selected feature set one by one. the ffs-to-be-selected feature set with m features covers the features with mrmr ranks between k+1 and k+1+m, where m is a user-defined positive integer smaller than $N - k$, with $N$ being the size of the original feature set. in each round of ffs, each feature in the ffs-to-be-selected feature set would be taken out and added to the ffs-selected feature set. each predictor based on each new ffs-selected feature set would be tested, and the feature set that obtained the highest overall accuracy rate would be used as the new ffs-selected feature set. this process would be run m times, until the ffs-to-be-selected feature set becomes a null set. an ffs curve similar to the ifs curve could be drawn with the x-axis as the index and the y-axis as the overall accuracy rate. in this study, ffs was run for each of the four benchmark datasets based on the corresponding ifs result. m for all these processes was set to 50, while k for each ffs was set to be the index of the point with the first maximum value (i.e. the maximum point with the smallest index) in the corresponding ifs curve. to improve performance of the predictor of drug-target interaction, a feature selection process was carried out. 
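the ifs and ffs procedures can be sketched independently of the classifier by passing in a scoring function. in this hedged sketch, `evaluate` stands for any function that returns the overall accuracy of a predictor restricted to a given feature list (for instance a jackknife-tested nearest neighbor predictor); the toy `evaluate` below is made up purely to exercise the code. the candidate set is taken as the m features that follow the first k in the ranked list, which is an assumption about the intended range.

```python
def ifs(ranked, evaluate):
    """Accuracy of the nested sets ranked[:1], ranked[:2], ..., ranked[:N]."""
    return [evaluate(ranked[:i]) for i in range(1, len(ranked) + 1)]

def first_peak(curve):
    """1-based index of the first maximum of an IFS curve."""
    return curve.index(max(curve)) + 1

def ffs(ranked, k, m, evaluate):
    """Greedy forward selection starting from the first k ranked features."""
    selected = list(ranked[:k])
    candidates = list(ranked[k:k + m])
    curve = []
    while candidates:
        # Try every remaining candidate and keep the one giving the best accuracy.
        best = max(candidates, key=lambda f: evaluate(selected + [f]))
        selected.append(best)
        candidates.remove(best)
        curve.append((list(selected), evaluate(selected)))
    # The peak of the FFS curve gives the final feature set.
    return max(curve, key=lambda item: item[1])

# Toy scoring function (hypothetical): features 0 and 3 help, every feature costs 0.4.
def evaluate(features):
    return len({0, 3} & set(features)) - 0.4 * len(features)

ranked = [0, 1, 2, 3]
k = first_peak(ifs(ranked, evaluate))
best_set, best_score = ffs(ranked, k, 3, evaluate)
```

in the toy run ifs stops at the first feature because the useful feature 3 is ranked last, while ffs recovers it by testing each candidate in turn; this is the kind of refinement the ffs step is meant to provide.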
the first step of feature selection is feature evaluation. in this study, mrmr was used to evaluate every feature in original feature set. listed in online supporting information s5 are two kinds of outputs: the first one is the maxrel list which shows ranks of features for their relevance to the target; the second is mrmr list showing the mrmr ranks according to the feature order satisfying eq. 3. in this study, only the mrmr list was used as the results of feature evaluation. since there are four groups of samples, mrmr was run four times with each for one of them. with the four mrmr lists, ifs was processed for each of the four sample groups, generating four ifs curves. based on these results, we set k in ffs to be 16, 15, 14 and 19 for the data of enzymes, ion channels, gpcrs and nuclear receptors, respectively. each of these figures is the index of the point of the first maximum value in the corresponding ifs curve. shown in fig. 1 are the four ifs curves with their corresponding ffs curves. the peaks of the four ffs curves finally reach the overall success rates of 85.48% with 32 features, 80.78% with 37 features, 78.49% with 30 features, and 85.66% with 32 features for enzyme group, ion channel group, gpcr group and nuclear receptor groups, respectively. features selected by mrmr+ffs for the four different groups are quite different from each other, showing the intrinsic differences between them. although there are more features for target than those for drug in the original feature set, more drug features were selected, showing the important role of drugs. many of the selected target features are for protein secondary structure, especially for enzyme group (half of selected target features are for this). all types of features are selected in at least one group, showing that all biochemical and physicochemical features have their irreplaceable positions in drug-target interaction process. 
for the details of the optimal feature-set outputs by ffs for the four benchmark datasets, see the online supporting information s6. for the specificity and promiscuity, we divided the drug-protein interactions into four groups according to the targets of drugs: enzymes, ion channels, gpcrs, and nuclear receptors. we used all the known drugs and target proteins in the gold standard data as training data to predict the potential interactions between all human proteins annotated as members of the four classes in kegg genes and all compounds in kegg ligands. enzyme recognition is the primary event involved in the interaction of proteins with other proteins and with small molecules such as metabolites and therapeutics. predicting drug-enzyme interactions has direct application for completing genome annotations, finding enzymes for synthetic chemistry, and predicting drug specificity, promiscuity and pharmacology. it is suggested that the secondary structure information plays the major role in determining drug-enzyme interaction activity. for example, cytochrome p450 (cyp) induction-mediated interaction is one of the major concerns in clinical practice and for the pharmaceutical industry [97] . induction of cyp1a enzymes with a specific structure-stable state may activate some xenobiotics to their reactive metabolites, leading to toxicity [98, 99] . amino acid composition and hydrophobicity also contribute considerably to these interactions. an insertion/deletion (i/d) polymorphism of the angiotensin i-converting enzyme (ace) has an influence on the antihypertensive response, particularly when using ace inhibitors (acei) [100] , indicating that the amino acid composition does contribute to the interactions. hydrophobicity plays a role in determining the coefficients of drug-enzyme interaction energy with the application to drug screening as well as in silico target protein screening [101, 102] . 
the g-protein coupled receptor (gpcr) superfamily, which comprises an estimated 600-1,000 members, is another of the largest known classes of molecular targets, with a variety of physiological activities and proven therapeutic value [103] . they are integral membrane proteins sharing a common global topology that consists of seven transmembrane alpha helices, an intracellular c-terminal, an extracellular n-terminal, three intracellular loops and three extracellular loops [33, 44] . it is suggested that secondary structure and polarity play a major role in determining drug-gpcr interaction activity. small secondary structures such as helices and loops are identified as entities potentially involved in stabilizing interactions with ligands [33] . these motifs were situated mainly in the apical region of transmembrane segments and included a few extracellular residues [104] . crystal structures of engineered human beta 2-adrenergic receptors (ars) in complex with an inverse agonist ligand, carazolol, provide three-dimensional snapshots of an important g protein-coupled receptor (gpcr) with a beta-sheet structure and forms part of the chromophore-binding site [105] . glida provides interaction data between gpcrs and their ligands, along with chemical information on the ligands, as well as biological information regarding gpcrs [106] . some of the features reflect physical interactions that are responsible for the structural stability of the transmembrane, the formation of extensive networks of interhelical h-bonds and sulfur-aromatic clusters that are spatially organized as ''polarity'', and the close packing of side-chains throughout the transmembrane domain. when more experimental 3d structures become available for gpcrs in the future, this will help build reliable models for a wider range of gpcrs that would be suitable for docking studies. joint use of ligand-based chemogenomic and docking approaches would certainly improve the prediction. 
ion channels are a large superfamily of membrane proteins that pass ions across membranes, are critical to diverse physiological functions in both excitable and nonexcitable cells, and underlie many diseases. as a result, they are an important target class that has proven to be highly ''druggable''. according to our analysis, secondary structure and polarity play the major role in determining drug-ion channel interaction activity. secondary structure controls the membrane potential and interrogates ion channels in different conformational states. drug-ion channel interaction requires a gated state in which the channel can switch conformation between a closed and an open state [42, 43] . simulations on model nanopores reveal that a narrow hydrophobic region can form a functionally closed gate in the channel and can be opened by either a small increase in pore radius or an increase in polarity [107, 108] . nowadays, intense research is being conducted to develop new drugs acting selectively on ion channel subtypes and aimed at understanding the intimate drug-channel interaction [109] . nuclear receptors (nrs) are ligand-activated transcription factors that regulate the activation of a variety of important target genes, and they are among the most important drug targets in terms of potential therapeutic application. according to our results, secondary structure and polarizability play the major role in determining drug-nr interactions. the conserved motif of the nr is typically described as three stacked alpha-helical sheets. the helices that make up the ''front'' and ''back'' sheets are aligned parallel to one another. the helices in the middle sheet run across the two outer sheets and only occupy the space in the upper portion of the domain. the space in the lower part of the domain is relatively void of protein, and for most nrs, this creates an internal cavity for small-molecule ligands [110] . 
hydrogen bonds with polarizability activity play a crucial role in protein-drug interactions (see, e.g., [11] ). our approaches and the results thus obtained could be used to demonstrate how nuclear hormone receptors form a network of direct interactions, and how this network increases in complexity to describe the interactions with target genes as well as small molecules known to bind a receptor, enzyme, or transporter. a comprehensive drug-target interaction network system has been established that contains four classifiers for predicting the druggable interaction of compounds with enzymes, ion channels, gpcrs, and nuclear receptors, respectively. it is anticipated that the network predictor system may become a very useful tool for drug development. in particular, it may help us find new or potential drug-target interactions. online supporting information s1: the benchmark dataset for the drug-target enzyme interaction system. it contains 8,157 gene-drug pair samples, of which 2,719 are positive and 5,438 negative. the 1st column of the table indicates the nature of samples with 1 for positive and 2 for negative; the 2nd column shows the code of target gene; and the 3rd column shows the code of drug. 
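as a hedged illustration of the three-column layout described for supporting information s1, the following python sketch reads such a table and counts positive and negative pairs. the whitespace delimiter, the function names `parse_benchmark` and `label_counts`, and the example gene and drug codes are assumptions made for illustration; they are not taken from the supporting file itself.

```python
from collections import Counter


def parse_benchmark(lines):
    """Parse rows of 'label gene_code drug_code' into (is_positive, gene, drug).

    Assumes whitespace-separated columns; label 1 = positive, 2 = negative.
    Rows that do not match this layout (e.g. a header) are skipped.
    """
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) != 3 or fields[0] not in ("1", "2"):
            continue
        label, gene, drug = fields
        pairs.append((label == "1", gene, drug))
    return pairs


def label_counts(pairs):
    """Return (number of positive samples, number of negative samples)."""
    counts = Counter("positive" if is_pos else "negative" for is_pos, _, _ in pairs)
    return counts["positive"], counts["negative"]


# Hypothetical rows; the identifiers are placeholders, not real S1 entries.
example_rows = [
    "label gene drug",
    "1 hsa:0001 D00001",
    "2 hsa:0002 D00001",
    "2 hsa:0001 D00002",
]
pairs = parse_benchmark(example_rows)
print(label_counts(pairs))
```

applied to the full s1 table, the two counts should match the 2,719 positive and 5,438 negative samples stated above, which gives a quick consistency check on the file.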
all the detailed information for the genes and drugs listed here can be found in a guide to drug discovery: target selection in drug discovery predicting human safety: screening and computational approaches assessment of chemical libraries for their druggability review: progress in computational approach to drug development against sars 3d structure modeling of cytochrome p450 2c19 and its implication for personalized drug design molecular modeling of two cyp2c19 snps and its implications for personalized drug design review: pharmacogenomics and personalized use of drugs review: structure of cytochrome p450s and personalized drug structure-based maximal affinity model predicts small-molecule druggability a fast flexible docking method using an incremental construction algorithm review: structural bioinformatics and its impact to biomedical science binding mechanism of coronavirus main proteinase with ligands and its implication to drug design against sars a probabilistic model for mining implicit 'chemical compound-gene' relations from literature prediction of drug-target interaction networks from the integration of chemical and genomic spaces statistical prediction of protein chemical interactions based on chemical structure and mass spectrometry data integrating statistical predictions and experimental verifications for enhancing protein-chemical interaction predictions in virtual screening alignment-free prediction of a drug-target complex network based on parameters of drug connectivity and protein sequence of receptors a vectorized sequence-coupling model for predicting hiv protease cleavage sites in proteins review: prediction of hiv protease cleavage sites in proteins gpcr-ca: a cellular automaton image approach for predicting g-protein-coupled receptor functional classes gpcr-gia: a web-server for identifying gprotein coupled receptors and their families with grey incidence analysis signal-cf: a subsite-coupled and window-fusing approach for predicting 
signal peptides using functional domain composition and support vector machines for prediction of protein subcellular location cell-ploc: a package of web-servers for predicting subcellular localization of proteins in various organisms euk-mploc: a fusion classifier for large-scale eukaryotic protein subcellular location prediction by incorporating multiple sites a sequence-coupled vector-projection model for predicting the specificity of galnac-transferase protident: a web server for identifying proteases and their types by fusing functional domain and sequential evolution information prediction of protease types in a hybridization space prediction of membrane protein types and subcellular locations low-frequency fourier spectrum for predicting membrane protein types memtype-2l: a web server for predicting membrane proteins and their types by incorporating evolution information through pse-pssm support vector machines for predicting membrane protein types by using functional domain composition review: recent advances in developing web-servers for predicting protein attributes a k-nearest neighbor classification rule based on dempster-shafer theory predacc: prediction of solvent accessibility prediction of protein cellular attributes using pseudo amino acid composition using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes digital coding of amino acids based on hydrophobic index feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy insights from modelling three-dimensional structures of the human potassium and sodium channels the structure of phospholamban pentamer reveals a channel-like architecture in membranes mechanism of drug inhibition and drug resistance of influenza a m2 channel structure and mechanism of the m2 proton channel of influenza a virus prediction of g-protein-coupled receptor classes coupling interaction between thromboxane a2 receptor and alpha-13 subunit of 
guanine nucleotide-binding protein ligand: chemical database for enzyme reactions from genomics to chemical genomics: new developments in kegg predicting networking couples for metabolic pathways of evaluating the statistical significance of multiple distinct local alignments a novel approach to predicting protein structural classes in a (20-1)-d amino acid composition space prediction of protein secondary structure content by using the concept of chou's pseudo amino acid composition and support vector machine use of fuzzy clustering technique and matrices to classify amino acids and its impact to chou's pseudo amino acid composition using the concept of chou's pseudo amino acid composition to predict apoptosis proteins subcellular location: an approach by approximate entropy predicting protein subcellular location using chou's pseudo amino acid composition and improved hybrid approach the modified mahalanobis discriminant for predicting outer membrane proteins by using chou's pseudo amino acid composition predicting subcellular localization of mycobacterial proteins by using chou's pseudo amino acid composition prediction of subcellular localization of apoptosis protein using chou's pseudo amino acid composition prediction of g-protein-coupled receptor classes based on the concept of chou's pseudo amino acid composition: an approach from discrete wavelet transform using the augmented chou's pseudo amino acid composition for predicting protein submitochondria locations based on auto covariance approach predicting the cofactors of oxidoreductases based on amino acid composition distribution and chou's amphiphilic pseudo amino acid composition predicting lipase types by improved chou's pseudo-amino acid composition using chou's amphiphilic pseudoamino acid composition and support vector machine for prediction of enzyme subfamily classes using chou's pseudo amino acid composition to predict subcellular localization of apoptosis proteins: an approach with immune 
genetic algorithm-based ensemble classifier prediction of cell wall lytic enzymes using chou's amphiphilic pseudo amino acid composition medicinal chemistry and bioinformatics -current trends in drugs discovery with networks topological indices proteomics, networks, and connectivity indices application of pseudo amino acid composition for predicting protein subcellular location: stochastic signal processing approach weighted-support vector machines for predicting membrane protein types based on pseudo amino acid composition slle for predicting membrane protein types using pseudo amino acid composition to predict protein structural classes: approached with complexity measure factor using pseudo amino acid composition to predict protein subcellular location: approached with lyapunov index, bessel function, and chebyshev filter using cellular automata to generate image representation for biological sequences using cellular automata images and pseudo amino acid composition to predict protein subcellular location using pseudo amino acid composition to predict transmembrane regions in protein: cellular automata and lempel-ziv complexity using pseudo amino acid composition to predict protein structural class: approached by incorporating 400 dipeptide components predicting protein structural classes with pseudo amino acid composition: an approach using geometric moments of cellular automaton image using grey dynamic modeling and pseudo amino acid composition to predict protein structural classes predicting membrane protein type by functional domain composition and pseudo amino acid composition hum-ploc: a novel ensemble classifier for predicting human protein subcellular localization large-scale plant protein subcellular location prediction predicting membrane protein types by the llda algorithm predicting eukaryotic protein subcellular location by fusing optimized evidence-theoretic k-nearest neighbor classifiers pseudo amino acid composition and its applications in 
bioinformatics, proteomics and system biology recognition of a protein fold in the context of the structural classification of proteins (scop) classification the classification and origins of protein folding patterns seventy-five percent accuracy in protein secondary structure prediction predicting protein folding types by distance functions that make allowances for amino acid interactions a fuzzy k-nearest neighbours algorithm discriminant analysis; chapter 12 multivariate analysis of variance on the generalized distance in statistics this reference also presents a brief biography of mahalanobis who was a man of great originality and who made considerable contributions to statistics review: prediction of protein structural classes review: recent progresses in protein subcellular location prediction an intriguing controversy over protein structural class prediction some insights into protein structural class prediction subcellular location prediction of apoptosis proteins cyp induction-mediated drug interactions: in vitro assessment and clinical implications cyp1a1: friend or foe? 
inhibition and induction of human cytochrome p450 enzymes: current status angiotensin i-converting enzyme gene polymorphism and drug response genome scale enzymemetabolite and drug-target interaction predictions using the signature molecular descriptor svm-prot: web-based support vector machine software for functional classification of a protein from its primary sequence molecular tinkering of g protein-coupled receptors: an evolutionary success critical role for the second extracellular loop in the binding of both orthosteric and allosteric g protein-coupled receptor ligands structural basis for ligand binding and specificity in adrenergic receptors: implications for gpcr-targeted drug discovery glida: gpcr-ligand database for chemical genomics drug discovery-database and tools update investigation into adamantane-based m2 inhibitors with fb-qsar an in-depth analysis of the biological functional studies based on the nmr m2 channel structure of influenza a virus ion channel pharmacology the nuclear receptor superfamily and drug discovery key: cord-002739-7t1o19kn authors: yu, xiaobo; song, lusheng; petritis, brianne; bian, xiaofang; wang, haoyu; viloria, jennifer; park, jin; bui, hoang; li, han; wang, jie; liu, lei; yang, liuhui; duan, hu; mcmurray, david n.; achkar, jacqueline m.; magee, mitch; qiu, ji; labaer, joshua title: multiplexed nucleic acid programmable protein arrays date: 2017-09-20 journal: theranostics doi: 10.7150/thno.20151 sha: doc_id: 2739 cord_uid: 7t1o19kn rationale: cell-free protein microarrays display naturally-folded proteins based on just-in-time in situ synthesis, and have made important contributions to basic and translational research. however, the risk of spot-to-spot cross-talk from protein diffusion during expression has limited the feature density of these arrays. 
methods: in this work, we developed the multiplexed nucleic acid programmable protein array (m-nappa), which significantly increases the number of displayed proteins by multiplexing as many as five different gene plasmids within a printed spot. results: even when proteins of different sizes were displayed within the same feature, they were readily detected using protein-specific antibodies. protein-protein interactions and serological antibody assays using human viral proteome microarrays demonstrated that comparable hits were detected by m-nappa and non-multiplexed nappa arrays. an ultra-high density proteome microarray displaying > 16k proteins on a single microscope slide was produced by combining m-nappa with a photolithography-based silicon nano-well platform. finally, four new tuberculosis-related antigens in guinea pigs vaccinated with bacillus calmette-guerin (bcg) were identified with m-nappa and validated with elisa. conclusion: all data demonstrate that multiplexing features on a protein microarray offer a cost-effective fabrication approach and have the potential to facilitate high throughput translational research. protein microarrays display individual proteins at high density on a chemically-modified slide that can be tested simultaneously with high sensitivity, high specificity, and low reagent consumption. they have been widely applied in basic and translational research, such as protein interaction studies, immune profiling, vaccine development, biomarker discovery and clinical diagnostics, etc. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] . for example, zhang et al. used a human protein microarray to better understand how arsenic, which is used in chemotherapy, disrupts cancer signaling pathways and, further, to identify potential targets of novel therapeutic treatments. of the 16,368 proteins that were screened, 360 arsenic binding proteins were identified, which may be novel targets for cancer treatment [7] . anderson et al. 
used protein microarrays to discover a 28-autoantibody biomarker signature of early stage breast cancer with a sensitivity and specificity of 80.8% and 61.6%, respectively [13] . by combining those autoantibodies with several protein biomarkers, provista diagnostics developed the first protein-based blood test for early breast cancer detection called videssa® breast [14] . ayoglu et al. screened sera from multiple sclerosis (ms) patients using protein microarrays containing 11,520 purified protein fragments and then validated those results using bead-based arrays [15] . the arrays indicated that anoctamin 2 autoantibodies and the ms-associated hla complex drb1*15 allele were strongly associated. additional experiments showed that anoctamin 2 aggregates near and inside lesions within human ms brain tissue [15] . protein microarrays can be classified into two different types, purified or cell-free, based on whether the proteins are produced in vivo or in vitro, respectively [16] . purifying proteins is labor-intensive, requires method optimization and multiple manipulations, exhibits highly variable yields of different proteins, and may not result in naturally-folded or functional mammalian products due to expression in non-mammalian systems (e.g., e. coli, yeast). cell-free protein microarrays overcome these challenges by depositing rna or dna on the slide surface and rapidly expressing them just before an experiment (~2 h) through the use of various cell-free expression systems (e.g., lysate from wheat germ, insect cells, rabbit reticulocyte and human cells). compared to purified protein microarrays, cell-free protein microarrays are more likely to produce naturally-folded mammalian proteins due to the decreased sample manipulation and use of enhanced cell extracts with native chaperone proteins. 
moreover, the use of nucleic acids vastly simplifies the production of custom arrays since any protein can be produced as long as the gene-of-interest is synthesized; for example, arrays can be produced that represent a specific proteome or signaling pathway [17] [18] [19] [20] . a primary disadvantage of planar-based cell-free protein microarrays is the diffusion of mrna or expressed proteins during in vitro transcription and translation (ivtt), which can then be captured by neighboring features (i.e., cross-talk). thus, the closer the features are to each other, the higher the cross-talk [21] [22] [23] [24] . planar-based cell-free protein microarrays include the protein in situ array (pisa) [22] , dna array to protein array (dapa) [25] , nucleic acid programmable protein array (nappa) [18, [26] [27] [28] , and in situ puromycin-capture array [29] . dapa, nappa and puromycin-capture arrays employ a probe (e.g., ni-nta or anti-tag antibody) on a microarray surface that captures the expressed recombinant proteins in situ during ivtt. of the cell-free approaches, nappa has achieved the highest densities, with ~ 2,300 plasmids per slide (where the distance between neighboring spots is 625 µm) and cross-talk of less than 2%. however, cross-talk is increased when the feature spacing is reduced to 375 µm [21] . with ~ 2,300 plasmids per slide, five nappa slides are needed to screen a proteome-scale array with over 10,000 genes [18, 30] . therefore, an increase in spot density would reduce the amount of labor, time, reagents, and cost needed for large-scale proteome analyses like target discovery and validation experiments. to address this issue, angenendt et al. printed cdna and expressed the proteins in nanowells using piezoelectric dispensers [31] . takulapalli et al. demonstrated the fabrication of high-density cell-free protein arrays by combining photolithographically-etched silicon nanowells (n=8,000/slide), nappa, and a piezo-inkjet printer [21] . 
here we utilized a different strategy to produce high density arrays that does not require any specialized equipment or substrates. we developed the multiplexed nucleic acid programmable protein array (m-nappa) method by combining as many as five different dna plasmids within one spot, which increases the number of displayed proteins per microarray by five-fold. we first demonstrate that multiplexed proteins are displayed on m-nappa using protein-specific antibodies. second, we compare the ability of m-nappa with non-multiplexed nappa to detect different protein-protein interactions and the serological antibody reactivity against 646 viral proteins. next, we show the feasibility of m-nappa in performing high throughput screening for immune-dominant tuberculosis (tb) antigens through the use of an ultra-high density m-nappa tb proteome array containing four subarrays with 4,045 tb open reading frames (orfs) on one slide. using m-nappa tb protein microarrays, four new immune-dominant antigens in the sera of bcg-vaccinated guinea pigs were identified, which were then validated using elisa. finally, we propose a high throughput target discovery and verification pipeline based on the m-nappa approach. all sera samples were collected with written informed consent with the approval of institutional review boards (irb) at university of florida (gainesville, fl), arizona state university (tempe, az) and albert einstein college of medicine (bronx, ny). detailed sample information was provided in our previous work [17, 20] . the sera from guinea pig tb models vaccinated with bcg were kindly provided by dr. david n. mcmurray from texas a&m college of medicine [32] . all experiments using clinical sera samples were executed according to the declaration of helsinki. a mathematical model was built based on a two-step analysis process. the first round of screening would use multiplexed plasmids with the primary objective of identifying potential protein "hits." 
the second round would be non-multiplexed, in which each multiplexed "hit" from the first round would be printed separately, with the primary goals of validating and identifying specific individual hits. the total number of printed spots (n) needed for the combined two-round screening of 10k proteins was determined by the number of plasmids printed per spot and the anticipated hit rates (i.e., percentage of displayed proteins that will be identified as significant in the study). the probability p of an individual protein being a true hit can be estimated from previous studies of a similar nature (e.g., antibody biomarkers). the following equation assumes that the hit status of each protein follows a bernoulli distribution with probability p and that its corresponding plasmid is randomly multiplexed, where the number of different plasmids per spot is k. the number of spots needed in the first round is $10{,}000/k$; the probability that a multiplexed spot would be a hit (i.e., containing at least one immune-dominant protein) is $1-(1-p)^{k}$; and each multiplexed hit is re-printed as k separate spots in the second round, so that $n = \frac{10{,}000}{k} + 10{,}000\,[1-(1-p)^{k}]$. the optimal level of multiplexing of k different plasmids per spot results in the smallest n. all human and viral orf plasmids were obtained from dnasu (https://dnasu.org/), and transferred into a t7-based mammalian expression vector, pant7-cgst, as previously described [18, 30, 33] . purified dna plasmids were prepared by our automated dna factory robot as previously described [18, 30, 33] , and were normalized to 1,200 ng/μl, such that multiplexed plasmids contributed equally to the final concentration. in other words, a plasmid in a five-multiplexed spot would represent 240 ng/μl. five different plasmids, each containing a different gene-of-interest, were mixed with a master printing mixture containing bsa (sigma), bs3 cross-linker (thermo fisher scientific, il) and polyclonal α-gst antibody (thermo fisher scientific, il) [26] , and subsequently incubated at 4 °c for 2 h. m-nappa and nappa were printed by the nappa protein array core (http://nappaproteinarray.org/) according to published protocols [18, 30, 33] . 
the quality of printed plasmid dna on m-nappa and nappa was determined using picogreen dna staining [26] . each m-nappa microarray was blocked with superblock solution (pierce, rockford, il) for 1 h at 23 °c, briefly washed with water, centrifuged at 1,000 rpm for 3 min to dry, and covered with a hybridization chamber (grace biolabs, or). the array was then incubated with 160 μl of human in vitro transcription & translation (ivtt) solution containing human hela cell lysate, accessory proteins, reaction mixture, and nuclease-free water (thermo fisher scientific, il) for 1.5 h at 30 °c and 0.5 h at 15 °c to express the gst-tagged proteins-of-interest. the gst-tagged proteins were displayed on the slide surface via the polyclonal α-gst antibody that was included in the printing mixture. then, the resulting protein microarray was incubated with 5% (w/v) milk in 1× pbs with 0.2% (v/v) tween-20 (pbst) for 1 h at 23 °c, followed by three brief washes with pbst. the protein specific antibodies were diluted with 5% milk-pbst at 1:50 or 1:100, respectively, and incubated with the protein microarray for 16 h at 4 °c followed by a 1 h incubation at 23 °c with an alexa fluor 555 labeled secondary antibody (jackson immunoresearch laboratories, pa). after washing three times with pbst, the m-nappa slides were briefly rinsed with water and dried by centrifugation (2,000 rpm, 2 min). the arrays were scanned by a tecan scanner (männedorf, switzerland). after proteins were expressed on m-nappa, the resulting protein arrays were blocked with blocking buffer (1× pbs, 1% tween 20 and 1% bsa, ph 7.4) for 1 h at 4 °c. in parallel, the query proteins (e.g., rb1, jun, fos, lida) fused to a halotag were produced by incubating 90 ng of dna in 180 µl human cell-free expression system (thermo fisher scientific, il) for 2 h at 30 °c. 
to screen protein-protein interactions, the protein array was incubated with unpurified rb1-halo protein in human hela lysate for 16 h at 4 °c, and then washed with cold washing buffer (pbs, 5 mm mgcl2, 0.5% tween 20, 1% bsa and 0.5% dtt, ph 7.4 at 4 °c) three times to remove unbound molecules. the arrays were consecutively incubated with a chicken anti-halo tag antibody (genetel, wi) and alexa fluor 555 goat anti-chicken secondary antibody (jackson immunoresearch laboratories, pa) for 2 h at 4 °c. arrays were washed and dried with brief centrifugation at 1,000 rpm for 1 min, and scanned as described above. the protein binding signal was quantified using array-pro analyzer (media cybernetics) software as previously reported [20, 33] . for a given experiment, a tab-separated file with the interaction information was generated and loaded into the cytoscape software [34] with an attribute file that contained signal intensities of features on m-nappa and nappa. in figures 4 and 5, proteins within a multiplexed m-nappa feature and its five corresponding non-multiplexed proteins on nappa were displayed as connecting large and small nodes, respectively, with color gradients depicting signal intensities. after proteins were expressed on m-nappa, the arrays were blocked with 5% milk-pbst for 1 h and then incubated with sera at 1:300 dilution in 5% milk-pbst for 16 h at 4 °c. after washing three times with pbst, the resulting arrays were incubated with alexa fluor 555 labeled anti-human igg antibody (jackson immunoresearch laboratories, pa) for 1 h at 23 °c. the slides were washed with pbst, briefly rinsed with water, and dried by centrifugation (2,000 rpm, 2 min). the fluorescent scanning was performed using a tecan scanner (männedorf, switzerland). the antibody binding event was quantified by fluorescence signal intensity using array-pro analyzer (media cybernetics) software as previously reported [20, 33] . 
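the tab-separated interaction file and the accompanying attribute file described above for network display could be assembled along the following lines. this is a minimal python sketch, not the authors' script: the column names, the "contains" edge label, the function name `build_network_tables`, and the signal values are hypothetical, and the exact layout used in the study is not given in the text.

```python
import csv
import io


def build_network_tables(features):
    """Build tab-separated edge and node-attribute tables as strings.

    features maps a multiplexed spot id to a tuple:
    (m-NAPPA signal of the spot, {protein name: NAPPA signal}).
    Each spot becomes a hub node linked to the proteins it contains.
    """
    edges = io.StringIO()
    attrs = io.StringIO()
    edge_writer = csv.writer(edges, delimiter="\t", lineterminator="\n")
    attr_writer = csv.writer(attrs, delimiter="\t", lineterminator="\n")
    edge_writer.writerow(["source", "interaction", "target"])
    attr_writer.writerow(["node", "array", "signal"])
    for spot_id, (spot_signal, proteins) in features.items():
        attr_writer.writerow([spot_id, "m-nappa", spot_signal])
        for protein, signal in proteins.items():
            edge_writer.writerow([spot_id, "contains", protein])
            attr_writer.writerow([protein, "nappa", signal])
    return edges.getvalue(), attrs.getvalue()


# Hypothetical feature with two member proteins and made-up intensities.
example = {"spot_1": (5200.0, {"e2f1": 4800.0, "jun": 300.0})}
edge_table, attr_table = build_network_tables(example)
print(edge_table)
print(attr_table)
```

keeping the multiplexed spot as a hub node with one edge per member protein mirrors the large-node and small-node layout described for figures 4 and 5, and the signal column can then be mapped to a color gradient in the network viewer.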
protein microarrays have been used in functional protein and antibody biomarker studies to screen for target(s)-of-interest, which are generally rare in the tested protein population ( figure 1b) . for example, the median hit rates (± standard deviation, sd) of studies employing protein microarrays in the past five years for screening protein function and autoantibody biomarkers (table s1) were 0.49% ± 1.23% and 1.02% ± 4.46%, respectively ( figure 1b) . since false positives are not uncommon during initial screens, all initial candidates require an independent verification step performed using different samples [5, 7, 8, 19, [35] [36] [37] . considering that a two-step approach for target discovery and verification often uses hundreds to thousands of samples, the cost of such studies using full-scale arrays can be inhibitory. to decrease the cost of high throughput screening experiments, we hypothesized that the plasmid cdna encoding for different proteins could be multiplexed (by combining m different plasmids) within each feature to create a high-density array, m-nappa ( figure 1a) . this multiplexed array could be implemented during the initial functional screen, testing entire proteomes (p proteins) using only a fraction of the features (p/m). multiplexed hits identified during the screening step could then be de-convoluted in the subsequent verification step using the standard, non-multiplexed nappa array where each feature displays only one protein (i.e., m=1). the objectives of the second step would be to identify which proteins were responsible for the positive multiplexed signal and to verify whether the hits were real. this approach exploits the high flexibility of cell-free microarrays, in which arrays can be customized by simply re-arraying individual plasmids encoding for the multiplexed features-of-interest. the schematic illustration of how m-nappa arrays are processed is shown in figure 1a . 
using a standard pin-based arrayer, each spot on m-nappa contains plasmids encoding for different proteins-of-interest with the same fusion tag. the genes are then transcribed and translated into recombinant proteins in two hours using a cell-free expression system, and captured to the slide surface in situ via a fusion tag antibody. the optimal number of proteins to multiplex depends upon several factors, including hit frequency, cost, array space, and number of proteins. as the frequency of hits in the screen increases, more proteins will need to be tested as individual features during the verification step. taken to the extreme, if one protein per multiplexed feature were a hit (hit rate = 1/m), all multiplexed features would require deconvolution, making the multiplexing approach impractical. however, such a high hit rate is not reflected by data collected by numerous studies; for example, the hit rate was < 5% in most of our previous nappa-based screening studies with 10k human genes ( figure 1b) . we generated a mathematical model (materials and methods) to find the optimal m that would take into consideration array space and the cost of screening and verifying hits using our 10k protein human collection at different hit rates. in figure 1c , the x-axis represents the number of genes per spot (m) while the y-axis represents the number of spots or proteins that are needed for the two-step screening process. notably, when the hit rate is < 5% for 10k proteins, a relatively small number of spots would be needed for the entire study (screening + verification) with 5 proteins multiplexed per feature in the initial screen, thus representing a good compromise between the number of initial features screened and the subsequent number of features that would be needed for deconvolution and verification. 
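the trade-off described above can be sketched numerically. the python example below assumes that the two-round total is n = P/k + P·[1 − (1 − p)^k], where P is the number of proteins, k the number of plasmids per spot (written m in this section), and p the per-protein hit rate. this form is one reading of the model described in materials and methods rather than a quotation of the authors' equation, and the function names are illustrative.

```python
def total_spots(n_proteins, k, p):
    """Spots needed for screening plus deconvolution under the assumed model.

    Round 1 prints n_proteins / k multiplexed spots.
    A spot is a hit with probability 1 - (1 - p)**k (at least one true hit).
    Round 2 re-prints every protein of every hit spot individually (k spots each).
    """
    first_round = n_proteins / k
    p_spot_hit = 1.0 - (1.0 - p) ** k
    second_round = first_round * p_spot_hit * k
    return first_round + second_round


def optimal_k(n_proteins, p, k_max=20):
    """Multiplexing level in 1..k_max that minimizes the total spot count."""
    return min(range(1, k_max + 1), key=lambda k: total_spots(n_proteins, k, p))


# Assumed inputs: a 10,000-protein collection at a few illustrative hit rates.
for hit_rate in (0.01, 0.05, 0.20):
    k_best = optimal_k(10_000, hit_rate)
    print(hit_rate, k_best, round(total_spots(10_000, k_best, hit_rate)))
```

under these assumptions a 5% hit rate gives a minimum at k = 5, and lower hit rates shift the minimum toward larger k, which is consistent with five plasmids per feature being a reasonable compromise when the hit rate is below 5%. the sketch ignores practical limits on multiplexing, such as the dilution of each plasmid within a spot.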
to assess the difference in transcriptional/translational efficiency as well as display competition between large and small proteins within one feature, we multiplexed proteins of varying molecular weights (mw; 20 -124 kda) covering 80% of the size range in our human protein collection ( figure s2 ) on m-nappa. as indicated in figure 1 , we prepared nappa and m-nappa slides in parallel where nappa had only one plasmid per spot and m-nappa multiplexed five plasmids per spot. after ivtt, the protein arrays were probed with eight antibodies that bound targets ranging in size from 20 to 106 kda (methods and figure s1 ). these antibodies were specific to ia-2 (106 kda), gad2 (65 kda), clusterin (52 kda), p53 (44 kda), fos (40 kda), pp2a (36 kda), sfn (28 kda) and bcl2l2 (20 kda). we compared the protein display between the two array types using groups of five proteins with either similar (figure 2a and figure 2b ) or varied molecular weights (figure 2a and figure 2c) , and then calculated the signal ratio of m-nappa to nappa. in both cases, all of the antibodies readily detected their corresponding antigens. for the spots with similarly-sized proteins (36 kda to 85 kda), the signal ratio of m-nappa to nappa was 0.78±0.44. for the spots containing five proteins covering a wide range of molecular weight, from 29 kda to 106 kda (figure 2c) , the binding signal ratio of m-nappa to nappa was 1.03±0.75 ( figure s3) . thus, multiplexing proteins of similar size did not confer any advantage over random multiplexing. to further demonstrate that there were no biases in the expression of different proteins produced from mixed plasmids, five-plasmid mixtures containing various combinations of seven different genes (abl1, ia-2, gad2, jun, rhou, bcl2l2 and mt3033) were co-expressed in ivtt solution and analyzed via western blot. despite a wide range of protein sizes, all proteins were expressed at similar amounts in their relevant combinations ( figure s4) . 
these data indicate that, although each plasmid in m-nappa is present at one-fifth the amount present in standard nappa, there was no significant difference in protein display levels between the arrays (p-value = 0.36, paired sample t-test). in addition, background signals that resulted from non-specific antibody binding were comparable between the platforms, demonstrating that multiplexing does not result in an accumulation of background signal that could contribute to the identification of false positives ( figure s5) . therefore, we randomly mixed different gene plasmids in the following m-nappa studies. to demonstrate that the signal intensity for m-nappa was reproducible, we also tested the spot-to-spot, zone-to-zone and slide-to-slide variations by printing 80 gene plasmids on different locations across the m-nappa and nappa slides. protein display was then examined with anti-gst antibody staining (materials and methods). the coefficients of variation (cvs) for spot-to-spot, zone-to-zone and slide-to-slide were 3.64±3.27%, 7.57±3.41% and 7.27±4.00% for m-nappa, respectively, and 7.63±10.58%, 12.13±7.56% and 13.25±9.42% for nappa, respectively (table s4) . we purified 646 viral orf plasmids from ~23 viruses, normalized their concentrations to 1,200 ng/µl, and printed viral nappa and m-nappa arrays in duplicate [20] . analyses of the deposited dna and displayed protein levels indicate that most viral dna plasmids were successfully printed, expressed, and captured onto the microarrays in a reproducible manner (figure 3a) . for example, plasmid dna deposition across technical replicates of nappa and m-nappa had correlations (r) of 0.95 and 0.96, respectively. the protein display correlations (r) across technical replicates of nappa and m-nappa were 0.90 and 0.93, respectively ( figure 3b, figure s6 ). "non-spots" containing printing buffer alone without plasmid dna were used as negative controls. 
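for readers who want to reproduce the kind of variation statistic reported above, the following is a minimal sketch of a coefficient of variation calculation. it assumes the cv is the sample standard deviation divided by the mean, expressed as a percentage; the paper does not state which standard deviation estimator was used, and the function name `cv_percent` and the example signals are hypothetical.

```python
from statistics import mean, stdev


def cv_percent(signals):
    """Coefficient of variation in percent: 100 * sample SD / mean.

    Assumption: the sample (n - 1) standard deviation is used.
    """
    return 100.0 * stdev(signals) / mean(signals)


if __name__ == "__main__":
    # Hypothetical replicate signals for one gene printed at three locations.
    replicate_signals = [90.0, 100.0, 110.0]
    print(f"cv = {cv_percent(replicate_signals):.2f}%")
```

a per-gene cv computed this way can then be averaged across all printed genes to give a summary in the mean±sd form used in the text.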
94% and 93% of the spots on nappa and m-nappa viral arrays, respectively, produced signal that was at least two sds above the average signal intensity of these "non-spots" (figure s7 ). together with figure 2 , the results indicate that the majority of viral proteins can be displayed on m-nappa arrays. in addition, we compared the s/b (signal to background) ratios between direct fluorescence and tyramide signal amplification (tsa) using a fluorophore-linked or hrp-conjugated anti-p53 antibody, respectively. using the signal from "non-spots" as background, we found that the s/b ratio of fluorescence detection using an antibody with a directly-conjugated fluorophore, the dylight649 rabbit anti-mouse igg, was higher (s/b ratio = 431±38) than the tsa method with the hrp-labeled goat anti-mouse igg (s/b ratio = 323±18). thus, directly-conjugated fluorescent secondary antibodies were used for the following assays ( figure s8 ). to determine whether nappa and m-nappa detect similar protein-protein interactions, both arrays were programmed to display proteins that are known to interact with the tumor suppressor protein rb1. the arrays were then probed with a rb1 query protein fused to halotag, and interactions were detected using an anti-halotag antibody. the rb1-halotag query protein bound to several targets (red arrow) on nappa and m-nappa arrays; the query also bound to diffused targets outside of each spot, which appear as a "ring" around each feature (figure 4a) . in figure 4b , we used a flower pattern diagram to depict the multiplexed proteins on m-nappa (large central circle) and the deconvoluted individual proteins on nappa (five small connecting circles) (materials and methods). the blue gradient within the spot indicates target binding to the rb1 query protein, whereas reactivity to the "ring" [20, 33] is indicated by a red circle around the spot. 
using custom-defined criteria, where the target-to-"non-spot" signal ratio is > 2 and the ring score is > 3, we found that 5 and 6 hits were identified on nappa and m-nappa, respectively, out of the 30 possible candidate target proteins ( table s2, table s3 ). five of the 6 hits on m-nappa were e1a, hpv11-e7, hpv16-e7, hpv18-e7 and hpv33-e7, which agrees with previous studies [38] [39] [40] . the sixth hit on m-nappa was not detected with nappa, thus suggesting that the hit may be a false positive. to further examine the utility of m-nappa to test protein-protein interactions, additional interactions were analyzed with 35 displayed proteins on nappa and m-nappa using halotagged-jun, -fos, and -lida queries. jun, fos and lida bound to their expected interaction partners (i.e., fos, jun and three rab family proteins, respectively) on both m-nappa and nappa arrays (figure s9) . regarding the protein interactions that were identified, the spot-to-spot and zone-to-zone cvs were 5.65±2.69% and 5.75±3.86% for nappa, respectively, and 2.55±2.56% and 3.11±3.46% for m-nappa, respectively (table s5) . these results indicate that m-nappa can be used for preliminary high throughput (ht) screening of novel protein-protein interactions. the screen can then be followed by a verification step using deconvoluted spots via nappa to identify the specific proteins that are involved. to test whether m-nappa can be used to detect proteomic serological response, we screened ten serum samples from patients with type 1 diabetes that had been previously characterized using nappa arrays [20] . a dozen hits were observed with m-nappa and nappa (figure 5) . forty-nine of the 53 antigens (92.5%) identified by nappa were also detected by m-nappa. four antigens, however, were detected with only one platform (i.e., two with nappa, two with m-nappa). these uncommon discrepancies may be due to variations in surface chemistry, plasmid concentration, printing or array processing. 
since the multiplex concept to increase feature density was successful in detecting protein-protein interactions and serological antibody responses on planar microarrays, we wanted to determine whether m-nappa could also be applied to a nano-well microarray platform. we previously increased feature number by printing plasmids into photolithography-etched silicon nano-wells to create a high-density nappa (hd-nappa) platform [41] . hd-nappa can have as many as 10k features per slide, and has successfully detected antiviral antibodies in autoimmune diseases with 761 different proteins displayed on the array in quadruplicate. these tiny wells hold only 1200 pl and use only 0.12 ng of plasmid dna. we applied the multiplex concept to hd-nappa using a mixture of plasmids encoding for ia-2, gad2 and p53 proteins. we then detected their expression and display using specific antibodies; all of these proteins were readily detectable when printed as a three-plexed mixture ( figure s10) . we then multiplexed 4,045 tuberculosis (tb) orfs [32] onto hd-nappa microarrays as four separate subarrays using three gene plasmids per well (m=3), resulting in an m-hd-nappa microarray displaying > 16k proteins on a single slide. this lower multiplicity was based on the mathematical model ( figure 1c ) that took into account that the high number of conserved proteins in endemic, non-pathogenic mycobacterial species results in a higher hit rate (~10% [37] ). over 95% of the spots generated a signal that was at least 10 sds above the background, which indicates that the vast majority of proteins were well-expressed and displayed (figure 6a) , with a correlation of r = 0.90 across technical replicates (figure 6a-c) . antibody reactivity from tb patient sera was observed with m-hd-nappa (figure 6d) . the technical reproducibility of these immune-dominant antigens across different m-nappa arrays using the same sera was very high, with a correlation of r = 0.98 (figure 6d-f) . 
all immune-dominant antigens identified with m-hd-nappa screening were then deconvoluted in the verification step using single protein nappa ( figure 6g ) and validated with rapid-elisa as previously described [19] (figure 6h ). we screened the sera from guinea pigs immunized with bacillus calmette-guerin (bcg), a tb vaccine, using m-hd-nappa tb proteome microarrays. the aim of this experiment was to identify potential protective antibodies induced with bcg. the representative fluorescence images are shown in figure 7a . compared to the control mock sera pool using pbs buffer (n=5), four features on m-nappa arrays showed increased signals with the bcg samples (n=4) (figure 7b) . to deconvolute and validate those targets, we repeated the serological assay for those candidate proteins, along with two non-responsive control proteins (rv2077a and rv2682c), using rapid-elisa and the individual sera from the guinea pigs. the antibody levels of four antigens (rv3405c, rv1078, rv2853 and rv0928) in bcg-vaccinated guinea pigs were significantly higher than that of the pbs control with a p-value <0.01 ( figure 7c, figure s11 ). according to the tuberculist database (http://tuberculist.epfl.ch/), these proteins are involved in regulation, cell wall and cell processes, and are considered to be in the proline-glutamic acid / proline-protein-glutamic acid (pe/ppe) protein families ( figure 7d ). a primary advantage of cell-free protein microarrays is that the arrays have a long shelf life. we compared the protein expression of m-nappa tb arrays immediately after printing and then again after 6 months of storage at room temperature in a nitrogen atmosphere. a gst-tagged protein, detected with an anti-gst antibody, was considered to be displayed if it had a signal that was two sds above the signal of the "non-spots." 
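the display criterion stated above can be sketched as a simple cutoff calculation. this is a minimal illustration that assumes the cutoff is the mean "non-spot" signal plus two sample standard deviations; the function names `display_cutoff` and `fraction_displayed` and the example intensities are hypothetical and are not taken from the study data.

```python
from statistics import mean, stdev


def display_cutoff(non_spot_signals, n_sd=2.0):
    """Cutoff = mean of the negative-control spots + n_sd * their sample SD."""
    return mean(non_spot_signals) + n_sd * stdev(non_spot_signals)


def fraction_displayed(spot_signals, non_spot_signals, n_sd=2.0):
    """Fraction of spots whose anti-GST signal reaches the cutoff."""
    cutoff = display_cutoff(non_spot_signals, n_sd)
    displayed = sum(1 for s in spot_signals if s >= cutoff)
    return displayed / len(spot_signals)


if __name__ == "__main__":
    # Hypothetical intensities: buffer-only controls and four protein spots.
    non_spots = [90.0, 100.0, 110.0]
    spots = [50.0, 115.0, 130.0, 500.0]
    print("cutoff:", display_cutoff(non_spots))
    print("fraction displayed:", fraction_displayed(spots, non_spots))
```

the same function with `n_sd=10.0` would correspond to the stricter ten-sd criterion used for the nano-well arrays.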
over 99% of the proteins were displayed on new m-nappa arrays; this number, as well as the anti-gst signal intensity, did not change even after 6 months of storage ( figure s12 ). nappa has been widely applied in protein-protein interactions, post-translational modifications (ptms), antibody epitope mapping and discovery of (auto)antibody biomarkers for a variety of human diseases, including markers that are currently being used in the clinic for the detection of breast cancer [13, 14, 18, 20, 32, 36, [42] [43] [44] . due to mrna and protein diffusion during ivtt, the number of features per planar microscope slide has been limited to ~2,300 to minimize cross-talk to neighboring spots. the feature density limit has thus required that multiple slides be used to study large proteomes. here, we developed a new strategy, m-nappa, that increases the number of proteins that can be tested per slide multiple-fold. by combining five different plasmids within one feature, >10k proteins can be printed on one microscope slide for ht, low cost analyses when compared to studies using one-plasmid-per-feature arrays. the multiplexed hits that are identified with m-nappa can then be deconvoluted during the subsequent verification step (figure 8) . first, we constructed a mathematical model to determine the optimal level of multiplexing, which considers the number of proteins, cost, array size, and hit rate to predict the number of arrays that would be needed for a two-step screening and verification study. a survey of ht unbiased target screening studies that used protein microarrays, both in the literature and our own results using nappa, revealed that hits are rare (typically <5%) (figure 1b) . 
for 10k proteins and a hit rate of 5%, the mathematical model indicated that multiplexing 5 proteins per spot (figure 1c) would provide a good balance of maximizing the number of features, minimizing the number of arrays, and yielding the minimum overall workload when compared to using non-multiplexed arrays for both the screening and verification steps. second, we demonstrated that there was nothing inherent to printing plasmids as a five-plasmid mixture that prevented their expressed proteins from routine detection regardless of protein size. however, display levels of large proteins (> 65 kd) were decreased by 60±0.1% for gad2 and 63±11% for ia-2 ( figure 2c ). this may be because the plasmids were mixed equally together based on their masses, a requirement imposed by the printing chemistry; this would result in a lower molarity of large plasmids (i.e., large proteins) relative to small plasmids in the printing mixture. another possible reason is that larger proteins are produced more slowly than smaller proteins due to their longer mrna sequences. third, we showed that m-nappa can be used in protein-protein interaction and serological screening studies. the results from m-nappa agreed strongly with those observed with non-multiplexed nappa (figure 3-5, table s2 and s3, figure s9 ). these data indicate that m-nappa presents a labor-and cost-effective strategy to initially screen for hits. fourth, we further increased the feature density by applying this method to our previously-published nano-well platform [21] . with m-hd-nappa, the entire tb proteome containing 4,045 genes was successfully printed on a nano-well array in quadruplicate [20] (figure 6 ). this generates the highest density nano-well protein microarray to date and increases the previously demonstrated content by more than five-fold [20] . our data indicate that the multiplexing strategy has great potential value for use with different microarray platforms (figure 6 and figure 7 ) [45] . 
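the equal-mass mixing argument above can be made concrete with a short calculation. the sketch below assumes an average mass of about 650 g/mol per base pair of double-stranded dna, which is a common approximation; the plasmid lengths and the 100 ng mass are hypothetical examples rather than values from this study, and the function name `picomoles` is illustrative.

```python
# Approximate average molar mass of one base pair of double-stranded DNA.
AVG_BP_MASS_G_PER_MOL = 650.0


def picomoles(mass_ng: float, plasmid_bp: int) -> float:
    """Convert a plasmid mass in ng to pmol.

    ng -> g is a factor of 1e-9 and mol -> pmol is 1e12, so the net
    factor applied to mass / molar mass is 1e3.
    """
    return mass_ng * 1e3 / (plasmid_bp * AVG_BP_MASS_G_PER_MOL)


if __name__ == "__main__":
    # Hypothetical plasmids: a short insert versus a long insert in the same vector.
    small_plasmid_bp = 5_000
    large_plasmid_bp = 8_000
    mass_ng = 100.0  # equal mass of each plasmid in the printing mix
    small = picomoles(mass_ng, small_plasmid_bp)
    large = picomoles(mass_ng, large_plasmid_bp)
    print(f"small: {small:.4f} pmol, large: {large:.4f} pmol")
    print(f"large/small molar ratio: {large / small:.3f}")
```

at equal mass, the molar ratio is simply the inverse ratio of plasmid lengths, so the longer plasmid contributes fewer template copies per spot. this accounts for only part of the reduced display of large proteins reported above and says nothing about translation rate.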
finally, we evaluated the reproducibility of m-nappa arrays for protein array preparation and protein-protein interactions. we found m-nappa can be reproducibly fabricated with spot-to-spot, zone-to-zone and slide-to-slide cvs that are similar to those obtained with nappa ( table s4) . the spot-to-spot and zone-to-zone cvs for protein-protein interactions were also similar between the two array platforms (table s5) . while the correlations within and between different m-nappa slides were good (i.e., r = 0.93 for both) (figure 3 and figure s6 ), with some size adaptation, the reproducibility could eventually be further improved with the use of automation equipment like the hs 4800 pro hybridization station (tecan trading ag; männedorf, switzerland). in some ways, m-nappa resembles "natural protein" microarrays that print unpurified or partially fractionated proteins from lysates of human cells, tissues or body fluids, but in a much more controlled manner. each feature of a natural protein microarray typically represents a mixture of unknown proteins. thus, responsive hits on natural protein arrays require a challenging and time-consuming process to determine the identity of the protein responsible for the response. this may require further purification, identification by mass spectrometry and additional response testing of recombinant proteins [46, 47] . in the case of m-nappa, the identities of the proteins in each mix are known in advance and the plasmids encoding for each protein are available for secondary testing. m-nappa would be useful in unbiased ht screening studies, such as protein-protein interactions, protein-dna interactions, discovery of drug binding targets as well as (auto)antibody biomarkers for a variety of human diseases. however, it should be noted that there are situations in which using a non-multiplexed array format would be more appropriate. 
for example, nappa should be used when investigating protein functions or when the number of proteins to be screened is low. additional attributes of m-nappa should be considered as well. large, multiplexed proteins (> 65 kda) on m-nappa are displayed at a lower level (37 -40%) than their non-multiplexed counterparts ( figure 2c ). this issue could be resolved by increasing plasmid dna concentration before printing or reducing multiplicity per spot. alternatively, since large proteins represent a small fraction of the proteome, a hybrid array containing multiplexed spots with plasmids encoding for proteins with low to moderate mws and non-multiplexed spots for large proteins (> 65 kda) could be employed. in addition, ptms that occur during cell-free protein expression may affect the protein display or activity on m-nappa arrays [42] . we have observed that the human expression system contains the ability to phosphorylate some proteins (data not shown); other types of ptms (e.g., glycosylation, acetylation) by the expression system are not well known or reported. in our studies, ptms did not appear to affect protein expression, protein-protein interactions, or the identification of serological antigens on m-nappa when compared to nappa (figure 2, figures 4-7 and figure s7 ). we developed a method that multiplexes five different proteins within the same feature, called m-nappa, which significantly increases array density while decreasing experimental time and cost. although we used this approach with nappa and hd-nappa, the same concept could be applied toward other microarray technologies or platforms. our results show that m-nappa identified hits in protein interaction and serum screening studies, thus highlighting its potential to be employed in high throughput proteomics studies. supplementary table s1-s5 and supplementary figures s1-s12. 
http://www.thno.org/v07p4057s1.pdf protein analysis on a proteomic scale protein microarray technology severe acute respiratory syndrome diagnostics using a coronavirus protein microarray a burkholderia pseudomallei protein microarray reveals serodiagnostic and cross-reactive antigens protein acetylation microarray reveals that nua4 controls key metabolic target regulating gluconeogenesis protein microarrays for personalized medicine systematic identification of arsenic-binding proteins reveals that hexokinase-2 is inhibited by arsenic identification of serum biomarkers for gastric cancer diagnosis using a human proteome microarray aagatlas 1.0: a human autoantigen database advancing translational research with next-generation protein microarrays nappa as a real new method for protein microarray generation emerging protein array technologies for proteomics protein microarray signature of autoantibody biomarkers for the early detection of breast cancer integration of serum protein biomarker and tumor associated autoantibody expression data increases the ability of a blood-based proteomic assay to identify breast cancer anoctamin 2 identified as an autoimmune target in multiple sclerosis advancing translational research with next generation protein microarrays identification of antibody targets for tuberculosis serology using high-density nucleic acid programmable protein arrays high-throughput identification of proteins with ampylation using self-assembled human protein (nappa) microarrays plasma autoantibodies associated with basal-like breast cancers immunoproteomic profiling of anti-viral antibodies in new-onset type 1 diabetes using protein arrays high density diffusion-free nanowell arrays single step generation of protein arrays from dna by cell-free expression and in situ immobilisation (pisa method) optimised 'on demand' protein arraying from dna by cell free expression with the 'dna to protein array' (dapa) technology cell free expression put on the spot: 
advances in repeatable protein arraying from dna (dapa) printing protein arrays from dna arrays next-generation high-density self-assembling functional protein arrays self-assembling protein microarrays mapping transcription factor interactome networks using halotag protein arrays protein chip fabrication by capture of nascent polypeptides copper-catalyzed azide-alkyne cycloaddition (click chemistry)-based detection of global pathogen-host ampylation on self-assembled human protein microarrays cell-free protein expression and functional assay in nanowell chip format mycobacterial membrane vesicles administered systemically in mice induce a protective immune response to surface compartments of mycobacterium tuberculosis host-pathogen interaction profiling using self-assembling human protein arrays cytoscape: a software environment for integrated models of biomolecular interaction networks mapping transcription factor interactome networks using halotag protein arrays autoantibody signature for the serologic detection of ovarian cancer dynamic antibody responses to the mycobacterium tuberculosis proteome systematic identification of interactions between host cell proteins and e7 oncoproteins from diverse human papillomaviruses functional studies of e7 proteins from different hpv types adenovirus e1a, simian virus 40 tumor antigen, and human papillomavirus e7 protein share the capacity to disrupt the interaction between transcription factor e2f and the retinoblastoma gene product antiviral antibody profiling by high-density protein arrays a contra capture protein array platform for studying post-translationally modified auto-antigenomes exploration of panviral proteome: high-throughput cloning and functional implications in virus-host interactions serological autoantibody profiling of type 1 diabetes by protein arrays mycobacterium tuberculosis proteome microarray for global studies of protein function and immunogenicity the identification of phosphoglycerate kinase-1 
and histone h4 autoantibodies in pancreatic cancer patient serum using a natural protein microarray development of natural protein microarrays for diagnosing cancer based on an antibody response to tumor antigens we thank dr. mark atkinson (university of florida) for providing serum samples. this work was supported by the national natural science the authors have declared that no competing interest exists. key: cord-012682-7goljir4 authors: yuan, meng; song, zi-han; ying, mei-dan; zhu, hong; he, qiao-jun; yang, bo; cao, ji title: n-myristoylation: from cell biology to translational medicine date: 2020-03-18 journal: acta pharmacol sin doi: 10.1038/s41401-020-0388-4 sha: doc_id: 12682 cord_uid: 7goljir4 various lipids and lipid metabolites are bound to and modify the proteins in eukaryotic cells, a process known as ‘protein lipidation’. there are four major types of protein lipidation, i.e. myristoylation, palmitoylation, prenylation, and glycosylphosphatidylinositol anchoring. n-myristoylation refers to the attachment of the 14-carbon fatty acid myristate to the n-terminal glycine of proteins by n-myristoyltransferases (nmt) and affects their physiology, such as plasma membrane targeting, subcellular tracking and localization, thereby influencing the function of proteins. as more novel pathogenic n-myristoylated proteins are identified, n-myristoylation will attract great attention in various human diseases including infectious diseases, parasitic diseases, and cancers. in this review, we summarize the current understanding of n-myristoylation in physiological processes and discuss the implications reported so far of crosstalk between n-myristoylation and other protein modifications. furthermore, we mention several well-studied nmt inhibitors mainly in infectious diseases and cancers and generalize the relation of nmt and cancer progression by browsing clinical databases. 
this review also aims to highlight the further investigation into the dynamic crosstalk of n-myristoylation in physiological processes as well as the potential application of protein n-myristoylation in translational medicine. lipids are one of the principal components that structure the cell membrane and provide the barrier required for cells to survive. in addition, various lipids and lipid metabolites are also generated for modifying proteins in eukaryotic cells in a process known as 'protein lipidation'. generally, there are four major types of protein lipidation: myristoylation, palmitoylation, prenylation, and glycosylphosphatidylinositol (gpi) anchoring. these lipid modifications have been defined by different functional properties that are classified according to the characteristics of lipid attachment, the covalent bond, the specific sequence on the protein and the enzymes involved [1, 2] . these lipidation distinctions impact the charge, hydrophobicity, and other aspects of targeted-protein chemistry, resulting in marked differences in the physiology of the targeted protein, such as its conformation, trafficking, localization, and binding affinity for cofactors. therefore, when lipid metabolism is deregulated, protein lipidation may contribute to various diseases. this review primarily summarizes the current knowledge of n-myristoylation with updated study results and discusses the strategy of using n-myristoylation in the treatment of diseases. n-myristoylation consists of the addition of the 14-carbon fatty acid, myristate, to the n-terminal glycine residue of a protein via a covalent amide bond. in rare cases, including those of ras gtpases and tnf [3, 4] , myristic acid is attached to a lysine residue [5] through an amide bond, a process named lysine myristoylation. previously, it was observed that the myristate attaches to the nascent peptide in the first 10 min of translation during protein biosynthesis. 
therefore, n-myristoylation is considered a cotranslational modification with the most accurate step occurring after the removal of the methionine initiator by methionine aminopeptidase (metap) (fig. 1a, b) . furthermore, it is recognized that n-myristoylation can also occur posttranslationally on an internal glycine exposed by caspase cleavage during apoptosis (fig. 1c) . for many protein signaling systems, the n-myristoyl moiety represents an essential feature and contributes to numerous effects, including changing protein stability, influencing protein-protein interactions, and enhancing subcellular targeting to organelles or the plasma membrane [6] [7] [8] [9] [10] . two nmt isozymes during the multiple enzymatic steps of n-myristoylation, the process of myristate transfer is completed by nmt, which is classified as a member of the gcn5-related n-acetyltransferase (gnat) superfamily. some well-studied structural and biochemical investigations of yeast and human nmts revealed an ordered bi bi kinase mechanism (fig. 1d ) that involves a structural transformation of nmt1 during nmt catalysis. in fact, nmt is ubiquitous in eukaryotes, whereas no prokaryotes possess it. typically, lower eukaryotes (e.g., s. cerevisiae and drosophila) have only one isoform of nmt. however, for some mammalian species, including humans and mice, two isozymes have been identified, referred to as nmt1 and nmt2. these isozymes are encoded by distinct genes but share approximately 77% peptide sequence identity with unique substrate specificities in the n-terminus, suggesting a distinct physiologic role of each isozyme in mammals. additionally, each isozyme has a conserved sequence in the catalytic domain of divergent species, implicating an essential role for each gene family throughout evolution. in humans, there are four isoforms of nmt1 and two isoforms of nmt2, which are translated from splice variants of mrna with different reading frames. 
the main differences among the nmt isoforms are found in the n-termini. although the n-terminal structure of these nmts does not contribute to the construct of the kinase pocket, some investigations have characterized an n-terminal truncation that increases kinase activity without affecting enzyme stability [11] . in addition, the isoforms of these nmts may have specific effects on their intracellular localization and substrate selectivity [12, 13] , suggesting that nmts are involved in diverse physiological processes. physiological roles of nmts unequivocally, nmts play essential roles in the survival and cell proliferation of diverse species [14, 15] . some evidence suggests a principal role of nmt1 in the embryonic development of mice. in normal mice, the expression level of nmt1 is similar to that of nmt2 in a wide variety of tissues, but it is higher than the nmt2 levels in embryos. knocking out nmt1 severely impaired the differentiation ability of embryonic stem cells. homozygous nmt1 −/− -knockout mice were not born alive, and induction of nmt2 activity alone was unable to elicit the survival of heterozygous nmt1 +/− mice. in addition, it was reported that the subcellular localization and catalytic activity of both nmt1 and nmt2 were altered during apoptosis. nmt1 was transported to the cytosol from ribosomes and membranes following caspase-8- and caspase-3-mediated nmt1 cleavage, and 40% of the nmt1 activity was eliminated 8 h after the induction of apoptosis. however, the relocalization of the cytosolic fraction to the membrane and reduced activity of nmt2 were also found under the same conditions. in addition, the depletion of nmt2 caused a 2.3-fold increase in the apoptosis rate compared to the apoptosis rate upon depletion of nmt1 [16] . this evidence led to the speculation that nmt1 may be responsible for ribosome-based cotranslational n-myristoylation, while nmt2 may be the major contributor to apoptosis-related posttranslational n-myristoylation. 
in summary, the widespread and conserved presence of nmt1 in different species emphasizes its importance in basic physiological processes, such as embryonic development, while nmt2 appears only in higher-level organisms, suggesting that it is necessary for sophisticated physiological processes. in the future, in-depth biochemical and pharmacological research is expected to improve the understanding of the unique roles of nmt1 and nmt2 as they apply to the clinic. protein demyristoylation some reports have revealed the demyristoylation ability of some proteins. human sirtuin 2 (sirt2), which is a member of the lysine deacetylase sirtuin protein family, was found to exhibit more efficient demyristoylase activity than deacetylase activity [17] . the crystal structure implied that, in complex with a thiomyristoyl peptide, sirt2 has a dominant hydrophobic pocket that can adopt a myristoyl group. the hydrophobic acyl pocket of sirt2 resembles that of sirt6, which has been previously demonstrated to possess efficient fatty acid deacylase activity [18] . in addition, the two pockets are different in certain aspects, suggesting differential adoption of fatty acid chains. moreover, the other homologs, sirt1 and sirt3, each have a hydrophobic acyl pocket very similar to that of sirt2, hinting at analogous demyristoylation activities among them. the shigella virulence factor ipaj was identified as an irreversible demyristoylase [19] that cleaves the peptide bond between n-myristoylated gly-2 and asn-3 in some n-myristoylated proteins such as human adp-ribosylation factor 1p (arf1) and c-src. this irreversible demyristoylation mechanism provides a new approach to exploring the functional effects of n-myristoylated proteins in human health and diseases. 
in most cases, n-myristoylation on a protein is irreversible, indicating that the myristoyl motif may orient the protein toward a specific destiny, as if it is pressing a button that will irrevocably affect the dynamics of the protein and its subsequent pathway. therefore, it is reasonable to study the interactions among the factors of n-myristoylation and those of biological signaling pathways to understand the significant role of n-myristoylation. although n-myristoylation is irreversible, it cannot shield the myristoylated protein from cross talk. in contrast, cross talk is regarded as a means of interfering with n-myristoylation functions. it has been proposed that one protein modification might initiate the signaling that leads to the addition or removal of a second protein modification or the binding of another protein, suggesting that cross talk between protein modification components may serve as an important bypass of regulating protein functions. for example, both methylation and phosphorylation are able to trigger acetylation of histones [20] . here, while introducing the physiological functions of n-myristoylation, we also delineate the cross talk of n-myristoylation components with signaling constituents in light of well-established studies to explore the robust role of n-myristoylation in cell biology. dynamic structural changes in membrane anchoring and intracellular trafficking one of the major functions of n-myristoylation is to facilitate protein binding in membranes. in fact, peitzsch and mclaughlin established a tenet stating that the myristoyl motif is insufficient for the stable anchorage of a protein to a lipid bilayer [21] . therefore, a second signal, comprising a group of hydrophobic residues, positively charged amino acids or another lipid moiety, is required for stable membrane attachment. in one scenario called the 'ligand-dependent switch' (fig. 
2a), the conformation of a protein is changed upon ligand binding, exposing the myristoyl motif, which attaches to a component in the lipid bilayer. for example, the gtp-myristoyl switch facilitates the membrane interaction of arf [22, 23]. the exposed myristoyl motif and the basic hydrophobic residues in the n-terminus facilitate the interaction of arf1-gtp with the membrane. the second scenario refers to a cluster of positively charged amino acids that are associated with a cofactor, such as calcium (fig. 2b), or are phosphorylated (fig. 2c); the former accumulates positive charge to strengthen membrane binding, while the latter attenuates the positive charge to weaken membrane binding and cause membrane dissociation. the binding of two calcium ions to ef-hand motifs in the recoverin protein facilitates the exposure of a myristoyl group from a hydrophobic cavity to solvent (fig. 2b) [24]. another example is the myristoylated alanine-rich c kinase substrate (marcks) protein. the phosphorylation of serine residues contributes to its membrane dissociation, since the phosphate moiety reduces the positive charge (fig. 2c) [25]. gaffarogullari et al. [26] proposed a novel myristoyl/phosphorylation switch in a pka-c model of membrane attachment. pka-c maintains a conformational equilibrium between a myr-in state and a myr-out state, in which the myristoyl group is tucked into the hydrophobic pocket of pka-c or extruded from it, respectively. in the myr-out state, the exposed myristoyl group inserts into the lipid bilayer and facilitates pka-c binding to the membrane. therefore, it was proposed that a large population of pka-c maintains the myr-in state distal to the membrane and that pka-c shifts to the myr-out state in the proximity of the plasma membrane or when phosphorylated at ser10, independent of the regulatory subunit.
the palmitoyl moiety, which consists of a 16-carbon fatty chain, also induces signal transduction (fig. 2d). it was observed that h-ras requires both a myristoyl group and a palmitoyl group to target the membrane and stimulate kinase activity, whereas only n-myristoylated h-ras acts as a substrate for palmitoyltransferases and therefore is palmitoylated [27, 28]. the n-myristoylation-negative mutants of cdpk2 [29] and fibroblast growth factor receptor substrate 2α (frs2α) [30] do not incorporate palmitate and therefore do not attach to the plasma membrane. moreover, proteins with myristoyl moieties show a tendency for inclusion in the membrane fraction. for example, it was reported that n-myristoylation and palmitoylation appear to have opposing roles and different membrane lipid microdomain preferences in the membrane interactions of the g protein αi1 (gαi1) monomer, which are likely due to the conformational differences in the presence of different fatty acids [31]. the gαi1 protein that is n-myristoylated but not palmitoylated preferentially anchors to lamellar-prone membrane domains with a net negative charge. in contrast, the gαi1 protein that is both n-myristoylated and palmitoylated preferentially localizes to raft-like lamellar membranes without negative charges. n-myristoylated akt1 (myrakt1) exhibits a distinct substrate preference that is not exhibited by the cytosolic or the membrane fractions that do not include lipid rafts, indicating that akt1 for which oncogenicity is conferred by n-myristoylation is enriched in lipid rafts [7]. in addition, the evidence that simvastatin or cholesterol depletion preferentially ablates the phosphorylation of myrakt1 in lipid rafts suggests that myrakt1 is modified in a cholesterol-sensitive manner (fig. 2e). it is well known that most organelles in cells, such as the endoplasmic reticulum (er), mitochondrion and endosome, are membrane-bound and that these membrane interactions produce different effects.
in addition, some proteins must undergo n-myristoylation for subcellular trafficking and localization. some mitochondria-related proteins, such as tomm40, samm50 [32] and clpabp [33], were demonstrated to require n-myristoylation to bind to the mitochondrial outer membrane. in one mechanism, the hydrophobic myristoyl group increases the membrane dependence of the protein and thereby reduces cytoplasmic shuttling. for example, in mice, starch-binding domain-containing protein 1 (stbd1) was observed to be a transmembrane resident protein, and nonmyristoylated stbd1 was shuttled with ease between the er and mitochondria [34]. ring finger protein 11 (rnf11) colocalizes with both early and recycling endosomes. while rnf11 requires n-myristoylation and s-palmitoylation for membrane binding, n-myristoylation plays a greater role [35]. the removal of the myristoyl group results in protein diffusion. in another mechanism, the hydrophobicity of the myristoyl moiety increases protein flexibility, facilitating its shuttling through hydrophobic areas. in a yeast model [36], the proteasome subunit rpt2 is cotranslationally n-myristoylated. loss of rpt2 n-myristoylation causes abnormal dynamic nucleus-cytoplasm translocation of proteasomes and the aberrant accumulation of ubiquitinated proteins (fig. 2f).

requirements for protein assembly and protein-protein interaction

many proteins require assembly for maturity and function, and some evidence indicates that n-myristoylation drives the aggregation of proteins in some cases [37]. spassov et al. reported that src dimerization is mediated by the myristoylated n-terminal region of one partner interacting with the hydrophobic kinase domain of its counterpart [38, 39]. y419 autophosphorylation and dimerization are codependent, and together they activate src kinases (fig. 3a). the functions of src kinases are disrupted by interfering with dimerization, emphasizing the importance of n-myristoylation to src activity.
in addition, myristoyl-group-driven aggregation in lipid bilayers is also observed for h-ras [37]. some protein-protein interactions require a myristoyl group. the brain-specific protein cap-23/nap22 binds calmodulin (cam) with high affinity even though it lacks any canonical cam-binding domain. a crystal structure of the ca2+/cam-cap-23/nap22 complex showed that the myristoyl group of cap-23/nap22 passes through a hydrophobic tunnel created by two hydrophobic pockets at the termini of cam, implying the direct involvement of the myristoyl group in this interaction. further, the interaction with calmodulin induces the phosphorylation of cap-23/nap22 [40]. in hiv infections, both nef and gag are myristoylated by the host cell nmt to execute their proper functions, and both proteins are essential for infection. nef has many virulence functions that attenuate the immune system recognition of infected cells and enhance infectivity and viral replication [12], and gag is a precursor protein for structural components of the viral capsid. because the virus lacks its own nmt, both proteins depend on the host cell nmt, and the gag-gag interaction triggers an entropic switch toward a myristoyl-exposed conformation, providing the impetus for protein assembly (fig. 3b). in contrast, in beta-retroviruses, such as mouse mammary tumor virus (mmtv), the oligomerized matrix (ma) domain, which contains the n-terminal residue of gag, adopts a myristoyl-sequestering conformation [41, 42]. moreover, the results from screens of the interaction of host factors with the hiv-1 ma domain showed that heme oxygenase 2 (ho-2) specifically binds the myristoyl moiety of gag via a hydrophobic channel adjacent to its heme-binding pocket, which inhibits virus production.
in addition, ho-2 binds n-myristoylated tram, an adaptor protein for toll-like receptor 4 (tlr4) [43], which inhibits the tram-dependent lipopolysaccharide (lps)-tlr4 pathway. these findings suggest that ho-2 is a novel cellular myristate-binding protein that negatively regulates both virus replication and host inflammatory responses [44]. recent studies have revealed that a glycine positioned at the n-terminus can act as a potent degron, indicating that n-myristoylation may contribute to the removal of proteolytic cleavage products. timms et al. [45] identified two cul2 cullin-ring e3 ubiquitin ligase complexes, cul2(zyg11b) and cul2(zer1), both of which recognize n-terminal glycine degrons and target proteins that have failed n-myristoylation for proteasomal degradation; these complexes presumably play important roles in the quality control of protein n-myristoylation. nonmyristoylated c-src has enhanced stability compared to that of soluble n-myristoylated c-src [46]. belda-palazon et al. [47] found that rglg1, an e3 ligase in arabidopsis, is n-myristoylated by nmt1. n-myristoylation facilitates the attachment of rglg1 to the plasma membrane. the phytohormone abscisic acid (aba) induces the degradation of pp2ca, which is predominantly localized in the nucleus, through rglg1/5 e3 ligases. this degradation was found to result from aba-induced downregulation of nmt1, which hindered the n-myristoylation of rglg1 and promoted its translocation to the nucleus, where it interacted with pp2ca, increasing pp2ca degradation. in addition, in aspergillus nidulans [48], the swof gene was found to encode an nmt. the enhanced activity of the 26s proteasome and the accumulation of ubiquitinated substrates in the n-myristoylation-deficient swof mutant, compared with the activity and accumulation in the wild-type strain, resulted in impaired cell morphogenesis.
alterations to signaling pathways

n-myristoylation influences downstream kinase activity directly or indirectly, usually by the mechanisms described above, through ras and src. c-abl is a nonreceptor tyrosine kinase related to the src family. a 'myristoyl/phosphotyrosine' switch has been identified in the regulation of the kinase activity of c-abl [49]. n-myristoylation locks the protein into an autoinhibitory conformation when the sh2 domain docks to the kinase domain. in contrast, in c-src, myristoylation has the opposite and unexpected effect: c-src is induced into a conformation optimal for kinase activity. a series of studies have unequivocally demonstrated that n-myristoylation of the β-subunit is a prerequisite for the initiation of ampk signaling in response to amp [26, 50, 51]. therefore, in the case of nmt deficiency or upon cell treatment with an nmt inhibitor, suppressed n-myristoylation diminished the extent of α-thr172 phosphorylation of ampk and abolished its activation, thereby causing multiple morbid physiological outcomes. moreover, the myristoyl switch regulated by amp may affect the selectivity of the substrates and serve as a gatekeeper for transducing signals of metabolic stress [52]. recently, it was suggested that pka-c n-myristoylation may account for the enriched kinase activity at the membrane and the preferential phosphorylation of membrane substrates. these findings indicate an important role for n-myristoylation in normal synaptic function and plasticity during normal pka regulation [53, 54]. in another pattern of results from recent studies, several proteins were found to phosphorylate nmt, resulting in the activation of nmt. rajala et al. [55] verified that lyn and fyn are n-myristoylated by nmt before they phosphorylate nmt. in addition, it has been hypothesized that the interaction between nmt1 and the lyn tyrosine kinase proceeds in a phosphorylation-dependent manner [56].
manipulation of substrates during apoptosis

apoptosis, or programmed cell death, is the tightly controlled physiological process for eliminating unnecessary or redundant cells in eukaryotic organisms. in addition, dysfunctional apoptosis is one of the characteristics of malignant cells. apoptosis is orchestrated by a signaling cascade of proteases called caspases (cysteine-aspartic proteases), whose cleavage of substrates leads to the exposure of novel n-termini susceptible to various posttranslational modifications. since the initial discovery of myristoylated bid, it has been revealed that multiple proteins are modified posttranslationally by n-myristoylation upon caspase cleavage. using a myristoylated protein-labeling method in jurkat t cells and mcf-7 cells, at least 15 and 7 posttranslationally n-myristoylated proteins were discovered, respectively, during apoptosis [57]. further characterization of these proteins revealed that caspase-truncated (ct)-gelsolin and ct-pkc have antiapoptotic roles while ct-bid and ct-pak2 have proapoptotic roles upon posttranslational n-myristoylation, suggesting that posttranslational n-myristoylation plays an important role in maintaining a balance between cell death and survival. moreover, it has been found that nmt1 and nmt2 are substrates of different caspases. furthermore, n-myristoylation may be involved in the sphingolipid biosynthesis pathway, which includes ceramide synthesis and is involved in the induction of apoptosis. in the last step of ceramide biosynthesis, the introduction of a trans Δ4-double bond into the carbon chain of dihydroceramide is specifically catalyzed by dihydroceramide Δ4-desaturase 1 (des1), which is n-myristoylated by nmt1 [58]. myristic acid, unlike other saturated fatty acids, upregulates des1 activity, possibly because of the accumulation of n-myristoylated des1. n-myristoylation causes a proportion of des1 to be translocated from the er to the mitochondrial outer membrane, leading to an increase in ceramide levels.
since the production of ceramide can induce apoptosis by inducing mitochondrial dysfunction [59], the myristic acid-associated increase in des1 activity can enhance apoptosis induction in cells, which shows the importance of n-myristoylation in the process of apoptosis. n-myristoylated lancl2, which belongs to the eukaryotic lanc-like protein family, is known as the testis-specific adriamycin sensitivity protein (tasp). one biochemical study identified a phosphatidylinositol phosphate (pip)-binding site in the n-terminus of lancl2. in addition, both n-myristoylation and pip binding are crucial for lancl2 binding to the membrane. it is possible that the overexpression of lancl2 increases cell sensitivity to adriamycin, which is dependent on both n-myristoylation and membrane association [60]. in summary, we can conclude that a myristoyl group, as a hydrophobic motif, attaches stably to its substrate, changing the conformation and increasing the hydrophobicity of the protein. further, it affects protein localization and the ease with which a protein binds to its partners. however, the limited hydrophobicity of the 14-carbon myristoyl group leaves it open to the influence of its surroundings. the hydrophilic group can ease protein binding, and the hydrophobic group can stabilize the state of the n-myristoylated protein. n-myristoylation thus appears to be necessary but not sufficient for the downstream functions of a protein. accumulating evidence has demonstrated that n-myristoylation is an evolutionarily conserved lipidation that is essential for cell viability in different organisms. currently, interventions in protein n-myristoylation are principally achieved through nmt inhibitors. therefore, nmt inhibitors are well suited for use against diseases caused by proliferating cells or pathogens, such as malignancies and infectious diseases.
in fact, the inhibitors developed so far for nmt-related diseases are broadly designated as antifungal and antiviral agents. typically, these inhibitors can be divided into three categories [56]: (1) myristoyl-coa and myristate derivatives. although the myristoyl-coa binding sites in human nmt and fungal nmt are highly conserved, their peptide-binding sites are divergent, which provides an explanation for the selectivity of inhibitors that do not induce adverse toxicity in humans. (2) histidine analogs. in the most conserved region of the catalytic domain, eeveh (residues 289-293), histidine is critical for the myristoyl-coa transfer [61]. (3) myristoyl peptide derivatives. as previously mentioned, lipid metabolism disorders can affect protein lipidation [1]. various saturated and unsaturated fatty acids have been evaluated as potent inhibitors of human nmt1. indeed, the development of nmt inhibitors is not limited to infectious diseases, because these inhibitors also hold promise as immunodeficiency and cancer treatments. as mentioned previously, nmt1 has been characterized as the principal enzyme in early embryonic development. further investigation was focused on the role of nmt1 in myelopoiesis, through which myeloid blood cell types mature. bone marrow-derived macrophages (bmdms) from nmt1+/− mice have defective morphology compared with that of mature bmdms from wild-type mice. during bmdm maturation, nmt activity increased during the initial period and then decreased for the remaining time due to differential nmt expression. although the nmt activity in the bmdms of nmt1+/− mice followed a trend similar to that of the bmdms in the wild-type mice, the nmt expression levels were reduced to approximately one-half in the mutant mice. this report established an essential role for nmt1 during monocyte/macrophage differentiation.
studies are continuously uncovering valuable evidence not only for nmt1 but also for nmt2 in both normal and malignant hematopoiesis. a bioinformatic database of gene expression in different cancer cell lines revealed decreased nmt2 expression and preserved nmt1 expression in some hematological cancer cell lines, such as burkitt lymphoma, diffuse large b-cell lymphoma and acute myeloid leukemia (aml) cell lines. a study performed by stubbins et al. suggested a correlation between nmt2 and aml. nmt1 and nmt2 protein levels in the marrow aspirates or peripheral blood of aml patients were assayed by fluorescence-activated cell counting with intracytoplasmic staining and expressed as the mean fluorescent intensity (mfi). the evidence showed that the nmt2 mfi was higher in the lymphocytes and lower in the monocytes, suggesting that the regulation of nmt2 protein levels may influence early lymphoid/myeloid lineage commitment. further, the overall trend revealed by a survival analysis showed that higher nmt2 mfi values portended a worse prognosis for aml patients, suggesting a role for nmt2 as a novel prognostic biomarker for intermediate-risk aml [62]. in terms of adaptive immunity, it has been reported that n-myristoylation is indispensable for t cell development [63, 64]. a comparative analysis of thymuses showed that mice deficient in nmt1 or nmt2 had a reduced medullary volume, which was 30% and 25% lower, respectively, than it was in wild-type mice, and mice with double nmt1 and nmt2 deficiency had a much greater medullary volume reduction (83%). these findings suggest nonredundant roles for both nmts in maintaining the development of thymocytes, since the thymus has high nmt activity levels [65]. in agreement with these results, it was demonstrated that
t-cell receptor (tcr) signaling is disrupted in mice with t-cell lineage-specific nmt1 and nmt2 deficiencies, resulting in retarded thymocyte development. these outcomes may be attributable to the mislocalization of n-myristoylation-deficient src family tyrosine kinases, such as lck or fyn [66]. for example, n-myristoylation contributes to the membrane targeting of fyn and facilitates its binding to the ζ chain of the tcr. non-n-myristoylated lck remains in the cytoplasm and is unable to facilitate tcr signaling. however, the regulation of n-myristoylation or nmt activity in these processes during adaptive immunity has not been well established to date. to gain an in-depth understanding of the regulation of immune responses, further investigation into the specific roles of n-myristoylation, which may involve nmt as a potential target of immune modulation, is warranted.

parasitic and other infectious diseases

it is known that some n-myristoylated proteins in small rna viruses and retroviruses are essential for virus assembly during viral replication or for the production of infectious viral particles, suggesting that n-myristoylation may be related to the survival and propagation of pathogens. in addition, some pathogens need to utilize host cellular machinery to replicate within host cells because they lack their own nmts. in immunosuppressed patients, cryptococcus neoformans can easily cause chronic meningitis; however, it is unable to survive at 37°c if it harbors a mutant nmt with reduced activity. some disease-causing parasitic protozoa, such as plasmodium falciparum (malaria), leishmania donovani (leishmaniasis), and trypanosoma brucei (african sleeping sickness), retain the nmt necessary for their survival. given that the peptide-binding pocket of nmt is not strictly conserved across species, the search for species-specific nmt inhibitors focused on the binding pocket is worthwhile [67].
current anti-infectious agents, however, have some drawbacks, such as drug resistance, poor oral bioavailability, and renal toxicity. a group of studies have suggested nmt as a novel target for anti-infective drugs. according to their respective skeletal structures, nmt inhibitors are classified into four categories (table 1). among these inhibitors, the benzofurans and benzothiazoles have high species selectivity [68]. anti-infective drugs targeting nmt have certain advantages. first, myristoyl-coa is found at very low concentrations, approximately 5 nm, in mammalian cells. in addition, the strict hydrophobic structure of the fatty carbon chains reduces the likelihood that incompatible fatty acid groups will be recognized by the nmt, even in cases where the palmitic acid concentration is higher than that of myristic acid in vitro [69]. these physiological phenomena suggest that a small amount of inhibitor can have a good inhibitory effect. second, nmt inhibitors can be designed for high selectivity because of the significant differences in substrate specificity between the human and parasitic enzymes. moreover, many fungi and parasites must use the nmt of the host to synthesize essential proteins for their own reproduction.

potential targets of cancer treatments

given that altered nmt expression is observed in many types of cancer tissues and because many n-myristoylated proteins are involved in signaling processes that regulate cell proliferation, growth and death, it has been proposed that n-myristoylation or nmts can be considered as therapeutic targets for cancer. the premise for their use is based on a thorough understanding of the abnormal regulation of n-myristoylation in carcinogenesis.
for instance, given that n-myristoylation can facilitate src-mediated prostate tumorigenesis, the myristoyl-coa analog b13 and its derivative lcl204 [68, 70] have been identified as inhibitors of nmt1 enzymatic activity and blockers of src n-myristoylation; they act by competing for the myristoyl-coa binding site, which provides a promising approach to inhibiting src family kinase-mediated oncogenic activity and offers preclinical support for the use of protein n-myristoylation inhibitors in treating cancer [71]. an organopalladium compound, tris(dibenzylideneacetone) dipalladium (tris dba), was identified as a novel human nmt1 inhibitor that not only blocks the enzymatic activity of nmt1 but also reduces its expression. it showed potent antiproliferative activity against melanoma cells by inhibiting several proliferation-related signaling pathway proteins, including mapk, akt, and stat-3 [72]. the most successful examples of myristoylation-related anticancer inhibitors are allosteric abl tkis such as abl001 [73-75], gnf2 and gnf5 [76] (table 1). these allosteric inhibitors selectively target the myristoyl-binding pocket in the c-lobe of the abl kinase domain and not the atp-binding pocket. moreover, these inhibitors increase the sensitivity of a bcr-abl (t315i) mutant to atp-competitive tkis. a series of phase i clinical trials of abl001 combined with tkis (nilotinib, imatinib, and dasatinib) have been ongoing for cml and ph+ all patients (https://clinicaltrials.gov/ct2/show/nct02081378). some studies have pointed to nmt as a potential target in cancer cells. the cancer genome atlas (tcga) reports a group of genomic alterations affecting nmt1 and nmt2 in several cancers. although somatic mutations are more common than genomic amplification in nmt1 and nmt2, the roles of these mutations in regulating pathological mechanisms remain to be determined (https://www.cbioportal.org/).
moreover, patients with diseases such as liver cancer, cervical cancer, and lung cancer and high expression levels of nmt1 or nmt2 are more likely to have a poor prognosis, as indicated consistently in some reports. in addition, it is noteworthy that a cohort with renal cell carcinoma with either high expression of nmt1 or low expression of nmt2 was found to have worse overall survival, suggesting unique roles for these proteins in kidney cancer. however, no evidence has shown that their catalytic activity is linked to tumor progression in kidney cancer (http://kmplot.com/analysis/). in addition, several nmt activators that enhance nmt enzyme activity have been identified. l-histidine and d-histidine can activate human nmt in a concentration-dependent manner. this finding suggests that the two isomers may be involved in myristoyl-coa transfer by interacting with his-293 of nmt. naf45 was identified as an nmt activator factor in bovine brain [77]. the reconstitution of naf45 and nmt likely resulted in an open conformation in which the active site is expanded. a novel strategy for nmt activation is to enhance nmt specificity for myristoyl-coa substrates. on the basis that nmt can bind the abundant palmitoyl-coa but uses only the rare myristoyl-coa for acylation of a substrate protein, soupene et al. [78] determined that the acyl-coa-binding domain-containing protein 6 (acbd6) stimulates the activity of nmt2. acbd6 can block the access of palmitoyl-coa to the nmt2-binding site and enhance its catalytic activity, which requires interaction between nmt2 and acbd6. a mutant acbd6 unable to interact with nmt2 or with deficient ligand binding at its own n-terminus does not stimulate nmt2 activity. in this review article, we outlined advanced studies of n-myristoylation, focusing on the role of protein n-myristoylation in physiology and pathology. the important role of n-myristoylation underlies the early stages of protein maturation.
after the protein is folded in the golgi apparatus or the er, cotranslational n-myristoylation is likely to influence the transport and localization of the protein, which can affect its biological function. the insufficient nmt activity in pathogens and the variations between species underlie the selective lethality of nmt inhibitors in infectious diseases for which either no valid drug is available or for which available drugs induce drug resistance. breakthroughs with nmt inhibitors are most likely to come in tumor treatment. abl001 continues to be tested in cml and ph+ all patients in phase i trials, which suggests a strategy focused on the n-myristoylation of oncoproteins. furthermore, posttranslational n-myristoylation in the apoptotic process suggests the participation of nmts, specifically nmt2, in cell death. the function of the n-myristoylated protein in the apoptotic process, whether pathogenic or physiologically normal, can further indicate the orientation of the treatment strategy for targeting nmt. the knowledge of other protein modifications, such as ubiquitination or palmitoylation, which involve multimember enzyme families, can inform many specific enzyme-substrate studies and the design of selective inhibitors. in addition, scientists need more precise approaches for analyses of nmts and n-myristoylation. traditional methods based on radioactively labeled probes to detect n-myristoylation are insensitive and time consuming, and the technical difficulties in obtaining three-dimensional structures of myristate-attached proteins need to be overcome. further, bioinformatics on n-myristoylation and nmts is still in its infancy; however, as databases are integrated, these data provide many clues linked to other biological fields.
regardless of the technical difficulties, it is worth exploiting novel nmt inhibitors as single agents and exploring the potential drug synergies that might improve multiple clinical applications and enhance therapeutic efficacy, reverse drug resistance, or extend the therapeutic index for drugs already used in the clinic. it is likely that two approaches [79] can be used to reveal the prospects of nmt inhibitors for oncology therapy in the future: (1) identify currently unknown sensitivities of certain cancer types by widely screening cancer cell line panels and (2) discover the essential sensitivity or resistance mechanisms in resistant cell lines and wild-type cell lines by proteomic analyses of nmt substrate profiles and proteomic changes. as our knowledge of the biochemistry and cell biology of n-myristoylation continues to grow, more substrate proteins will be identified, and scientists will continue to deduce the effects of this lipid attachment on protein structure and function. 
protein lipidation in cell signaling and diseases: function, regulation, and therapeutic opportunities protein lipidation sirt2 and lysine fatty acylation regulate the transforming activity of k-ras4a hdac11 regulates type i interferon signaling through defatty-acylation of shmt2 protein lysine acylation and cysteine succination by intermediates of energy metabolism protein myristoylation in health and disease cholesterol sensitivity of endogenous and myristoylated akt myristoylated naked2 antagonizes wnt-beta-catenin activity by degrading dishevelled-1 at the plasma membrane myristoylation of the dual-specificity phosphatase c-jun n-terminal kinase (jnk) stimulatory phosphatase 1 is necessary for its activation of jnk signaling and apoptosis inhibition of enterovirus vp4 myristoylation is a potential antiviral strategy for hand, foot and mouth disease n-terminal region of the catalytic domain of human nmyristoyltransferase 1 acts as an inhibitory module hiv-1 production is specifically associated with human nmt1 long form in human human n-myristoyltransferase amino-terminal domain involved in targeting the enzyme to the ribosomal subcellular fraction fatty acids regulate germline sex determination through acs-4-dependent myristoylation comparison of myristoyl-coa: protein n-myristoyltransferases from three pathogenic fungi: cryptococcus neoformans, histoplasma capsulatum, and candida albicans two n-myristoyltransferase isozymes play unique roles in protein myristoylation, proliferation, and apoptosis efficient demyristoylase activity of sirt2 revealed by kinetic and structural studies sirt6 regulates ras-related protein r-ras2 by lysine defatty-acylation nmyristoyltransferase 1 is essential in early mouse development protein kinase c coordinates histone h3 phosphorylation and acetylation molecular determinants of the myristoyl-electrostatic switch of marcks structural basis for activation of arf gtpase: mechanisms of guanine nucleotide exchange and gtp-myristoyl 
this work was supported by grants from the national natural science foundation of china (81872885 to j.c.), zhejiang provincial natural science foundation (y18h310005 to j.c.), and the talent project of zhejiang association for science and technology (no. 2018ycgc002 to j.c.). my and jc conceived and designed the review article. my, zhs, and jc collected the related research articles contributed to the paper. my, mdy, hz, by, qjh, and jc made amendments to the paper. competing interests: the authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as potential competing interests.
key: cord-033010-o5kiadfm authors: durojaye, olanrewaju ayodeji; mushiana, talifhani; uzoeto, henrietta onyinye; cosmas, samuel; udowo, victor malachy; osotuyi, abayomi gaius; ibiang, glory omini; gonlepa, miapeh kous title: potential therapeutic target identification in the novel 2019 coronavirus: insight from homology modeling and blind docking study date: 2020-10-02 journal: egypt j med hum genet doi: 10.1186/s43042-020-00081-5 sha: doc_id: 33010 cord_uid: o5kiadfm background: the 2019-ncov, a novel coronavirus, is a positive-sense single-stranded rna virus. it is infectious to humans and is the cause of the ongoing coronavirus outbreak, which has been declared a public health emergency of international concern. the coronavirus main proteinase, also known as the 3c-like protease (3clpro), is essential in all coronaviruses because of the role it plays in viral replication and in the proteolytic processing of the viral polyproteins. the cytotoxic effect that results from sustained viral replication and polyprotein processing can be greatly reduced by inhibiting the activity of the viral main proteinase. this makes the 3c-like protease of the coronavirus a promising target for therapeutic agents against the viral infection. results: this study describes the detailed computational process by which the 2019-ncov main proteinase coding sequence was mapped out from the full viral genome and translated, and by which the resulting amino acid sequence was used to model the 3d structure of the protein. comparative physicochemical studies were carried out on the resulting target protein and its template, while selected hiv protease inhibitors were docked against the protein binding sites, which contained no co-crystallized ligand.
conclusion: in line with the results of this study, which are consistent with other scientific findings on coronaviruses, we recommend the administration of the selected hiv protease inhibitors as first-line therapeutic agents for the treatment of the current coronavirus epidemic. the first cluster of pneumonia cases of unknown origin was identified in early december 2019 in the city of wuhan, hubei province, china [1]. a novel betacoronavirus, currently referred to as the 2019 novel coronavirus [2], was identified after high-throughput sequencing of the viral genome, which closely resembles that of the severe acute respiratory syndrome coronavirus (sars-cov) [3]. the 2019-ncov is the seventh member of the enveloped rna coronavirus family (subgenus sarbecovirus, orthocoronavirinae) [3], and there is accumulating evidence from family settings and hospitals that the virus is most likely transmitted from person to person [4]. the 2019-ncov has also recently been declared by the world health organization a public health emergency of international concern [5]. as of the 5th of february 2020, over 24,000 cases have been confirmed and documented by laboratories around the world [6], while more than 28,000 such cases had been documented in china through laboratory confirmation as of the 6th of february 2020 [7]. despite the rapid global spread of the virus, the clinical characteristics peculiar to the 2019-ncov acute respiratory disease (ard) remain largely unclear [8]. over 8000 infections and 900 deaths were recorded worldwide before the severe acute respiratory syndrome wave was successfully contained in the summer of 2003, the disease itself having been a major public health concern worldwide [9, 10]. the infection that caused this large number of deaths was linked to a new coronavirus known as the sars coronavirus (sars-cov).
coronaviruses are positive-stranded rna viruses and they possess the largest known viral rna genomes. the first major step in containing the sars-cov-linked infection was to successfully sequence the viral genome, the organization of which was found to be similar to that of other coronaviruses [11]. the crystal structures of the main proteinase from both the transmissible gastroenteritis virus (tgev) and the human coronavirus (hcov 229e) have been determined; they show that the enzyme exists as a dimer in which the individual protomers are oriented perpendicular to each other. each of the protomers is made up of three catalytic domains [12]. the first and second domains of the protomers have a two-β-barrel fold that can be likened to one of the folds in the chymotrypsin-like serine proteinases. domain iii has five α-helices, which are linked to the second domain by a long loop. each protomer has its own substrate-binding region, which is positioned in the cleft between the first and second domains. dimerization of the protein is thought to be a function of the third domain [13]. the main proteinase of the sars cov is a cysteine proteinase with a cysteine-histidine catalytic dyad in its active site. the sars cov main proteinase is highly conserved across the genome sequences of all sars coronaviruses, and it is highly homologous to the main proteinase of other coronaviruses. because the crystal structures of the different coronavirus main proteinases are highly similar, and because almost all of the amino acid side chains involved in formation of the dimeric state are conserved, it was proposed that the dimer might be the only biologically functional form of the coronavirus main proteinase [14]. more recently, chen et al.
in a study involving molecular dynamics simulations and enzyme activity measurements from a hybrid enzyme showed that the proteinase is active only in its dimeric state [15]. recent studies, based on a homology model of the coronavirus main proteinase built on tgev as well as on the solved crystal structure, have docked substrate analogs and virtually screened natural products, a collection of synthetic compounds, and approved antiviral therapeutic agents to evaluate inhibition of the coronavirus main proteinase [16]. several compounds were identified in these studies as inhibitors of the viral proteinase. they include l-700,417, an hiv-1 protease inhibitor; calanolide a and nevirapine, both reverse transcriptase inhibitors; glycovir, an α-glucosidase inhibitor; sabadinine, a natural product; and ribavirin, a general antiviral agent [17]. ribavirin was shown to exhibit antiviral activity against the sars coronavirus in vitro at cytotoxic concentrations. at the start of the first sars outbreak, ribavirin was administered as a first line of defense, both as a monotherapy and in combination with corticosteroids or the hiv protease inhibitor kaletra [18]. in a very recent study reported by cao et al., a total of 199 patients with laboratory-confirmed sars-cov-2 infection underwent a controlled, randomized, open-label trial in which 100 patients were assigned to the standard care group and 99 patients to the lopinavir-ritonavir group. 48.4% of the patients in the lopinavir-ritonavir group (46 patients) and 49.5% of the patients in the standard care group (49 patients) exhibited serious adverse events between randomization and the 28th day.
the adverse events included acute respiratory distress syndrome (ards), acute kidney injury, severe anemia, acute gastritis, hemorrhage of the lower digestive tract, pneumothorax, unconsciousness, sepsis, and acute heart failure, among others. in addition, patients in the lopinavir-ritonavir group specifically exhibited gastrointestinal adverse events, including diarrhea, vomiting, and nausea [19]. our current study took advantage of the availability of the sars cov main proteinase amino acid sequence to map out the nucleotide coding region for the same protein in the 2019-ncov. two selected hiv protease inhibitors (lopinavir and ritonavir) were then targeted at the catalytic site of the protein 3d structure, which was modeled using already available templates. the predicted activity of the drug candidates was validated by targeting them against a recently crystallized 3d structure of the enzyme, which has been made available for download in the protein data bank. lopinavir is an antiretroviral protease inhibitor used in combination with ritonavir in the therapy and prevention of human immunodeficiency virus (hiv) infection and the acquired immunodeficiency syndrome (aids). it plays a role as an antiviral drug and an hiv protease inhibitor. it is a member of amphetamines and a dicarboxylic acid diamide (fig. 1). the complete genome of the isolated wuhan seafood market pneumonia virus (2019-ncov) was downloaded from the genbank database under accession number mn908947.3. the nucleotide sequence of the full genome was copied out in fasta format. the genbank sequence database is an open-access, annotated collection of all publicly available nucleotide sequences and their translated protein segments. this database is designed and managed by the national center for biotechnology information (ncbi) as part of the international nucleotide sequence database collaboration (insdc) [20].
nucleotides 10055 to 10972 of the 2019-ncov genome were selected as the sequence of interest. translation of the nucleotide sequence of interest in the 2019-ncov and back-translation of the sars cov main proteinase amino acid sequence were achieved with the emboss transeq and backtranseq tools, respectively [21]. transeq reads one or more nucleotide sequences and writes the corresponding translated protein sequence to file, while backtranseq makes use of a codon usage table, which gives the usage frequency of each codon for every amino acid [22]. for each amino acid in the input sequence, the most frequently occurring codon is used in the output nucleotide sequence. the amino acid sequence generated by the transeq translation of the nucleotide sequence of interest contained no stop codons and as such was used directly for protein homology modeling without the need for any deletion. two sets of sequence alignments were carried out in this study. the first was the alignment between the translated nucleotide sequence of the 2019-ncov genome, which was used for the homology modeling of the protein, and the amino acid sequence of the sars cov main proteinase, while the second was between the back-translated sars cov main proteinase nucleotide sequence and the 2019-ncov full genome. the latter was used in mapping out the protein coding sequence in the 2019-ncov full genome. these alignments were carried out using the clustal omega software package. clustal omega can read nucleotide and amino acid sequence inputs in formats such as a2m/fasta, clustal, msf, phylip, selex, stockholm, and vienna [23]. template search with blast and hhblits was performed against the swiss-model template library (smtl). the target sequence was searched with blast against the primary amino acid sequences contained in the smtl. a total of 120 templates were found.
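the translation step described in this section can be illustrated with a minimal python sketch. it is not emboss transeq and does not reproduce its options; it only assumes the standard genetic code, frame 1, and 1-based inclusive genome coordinates. the function names `extract_region` and `translate` are hypothetical and were chosen for illustration.

```python
from itertools import product

# standard genetic code (ncbi translation table 1), codons enumerated in TCAG order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def extract_region(genome: str, start: int, end: int) -> str:
    """Return the 1-based, inclusive region [start, end] of a nucleotide string."""
    return genome[start - 1:end]


def translate(nucleotides: str) -> str:
    """Translate in frame 1; '*' marks a stop codon, 'X' an unrecognized codon."""
    seq = nucleotides.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


# coordinates quoted in the text for the region of interest
START, END = 10055, 10972
REGION_LENGTH = END - START + 1          # number of nucleotides in the region
EXPECTED_RESIDUES = REGION_LENGTH // 3   # number of complete codons
```

with the full genome loaded as a string, `translate(extract_region(genome, START, END))` would give the protein sequence, and checking that it contains no `*` corresponds to the statement that no stop codons were present. the region spans 918 nucleotides, which is consistent with the 306 residues reported for the target protein.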
an initial hhblits profile was built using the procedure outlined in remmert et al. [24], followed by 1 iteration of hhblits against nr20. the obtained profile was then searched against all profiles of the smtl. a total of 192 templates were found. models were built based on the target-template alignment using promod3. coordinates that are conserved between the target and the template were copied from the template to the model. insertions and deletions were remodeled using a fragment library. side chains were then rebuilt and, finally, the geometry of the resulting model was regularized using a force field [25]. to estimate the quality of the protein structure model, we used qmean (qualitative model energy analysis), a composite scoring function that describes the main aspects of protein structural geometry and can derive both global (i.e., for the entire structure) and local (i.e., per residue) absolute quality estimates on the basis of a single model [26]. an appreciable number of alternative models were produced, and their scores formed the basis on which the final model was selected. the qmean score was thus used to select the most reliable model, against which the consensus structural scores were calculated. molprobity (version 4.4) was used as the structure-validation tool to produce a broad-spectrum evaluation of the quality of the target protein at both the global and local levels. it derives much of its sensitivity and power from optimized hydrogen placement and all-atom contact analysis, complemented by updated versions of covalent-geometry and torsion-angle criteria [27]. the torsion angles of the individual residues of the target protein were assessed using the ramachandran plot. this is a plot of the torsional angles [phi (φ) and psi (ψ)] of the amino acid residues making up a peptide.
in the order of sequence, the torsion angle of c(i-1), n(i), ca(i), c(i) is φ while the torsion angle of n(i), ca(i), c(i), n(i+1) is ψ. the values of φ were plotted on the x-axis while the values of ψ were plotted on the y-axis [28]. plotting the torsional angles in this way shows graphically which combinations of angles are allowed. the quaternary structure annotation of the template was employed to model the target sequence in its oligomeric state. the methodology proposed by bertoni et al. [29] is based on a supervised machine learning algorithm, support vector machines (svm), which combines interface conservation, structural clustering, and other template features to produce a quaternary structure quality estimate (qsqe). the qsqe score is a number between 0 and 1 that reflects the expected accuracy of the inter-chain contacts for a model built from a given template and alignment. a higher score indicates a more reliable result. the qsqe complements the gmqe score, which estimates the accuracy of the 3d structure of the resulting model. the 3d structural homology modeling of the 2019-ncov genome translated segment was followed by a structural comparison with the sars cov main proteinase 3d structure (pdb: 1uj1). this was achieved using ucsf chimera, a highly extensible tool for interactive analysis and visualization of molecular structures and related data, including docking results, supramolecular assemblies, density maps, sequence alignments, trajectories, and conformational ensembles [30]. high-quality animation videos were also generated. the amino acid constituents of the target protein secondary structures were colored and visualized in 3d using the pymol molecular visualizer, which uses the opengl extension wrangler library (glew) and freeglut.
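the backbone torsion angles plotted on a ramachandran plot are dihedral angles defined by four consecutive atoms. the following python sketch shows how such an angle can be computed from cartesian coordinates; it assumes the standard convention in which φ of residue i is defined by c(i-1), n(i), ca(i), c(i) and ψ by n(i), ca(i), c(i), n(i+1). it is an illustration only, not the code used by molprobity or swiss-model, and the helper names are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points (range -180 to 180)."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    """phi of residue i from C(i-1), N(i), CA(i), C(i)."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """psi of residue i from N(i), CA(i), C(i), N(i+1)."""
    return dihedral(n, ca, c, n_next)
```

each residue with both neighbors present contributes one (φ, ψ) point; the first and last residues of a chain lack one of the two angles.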
the py in pymol refers to python, the programming language in which the software was written [31]. the percentage composition of each component of the secondary structure was calculated using the chou and fasman secondary structure prediction (cfssp) server. this server predicts regions of secondary structure, such as alpha helices, beta sheets, and turns, from an amino acid input sequence. the secondary structure prediction output is displayed in a linear sequential graphical view according to the occurrence probability of each secondary structure component. the methodology implemented in cfssp is the chou-fasman algorithm, which is based on analyses of the relative frequencies of each amino acid residue in alpha helices, beta sheets, and loops, derived from known protein structures solved by x-ray crystallography [32]. the expasy server calculates protein physicochemical parameters as one of its sub-functions, basically for the identification of proteins [33]. we used the protparam tool to calculate various physicochemical parameters of the model and template proteins for comparison purposes. the calculated parameters include the molecular weight, theoretical isoelectric point, amino acid composition, extinction coefficient, instability index, etc. the evolutionary relationship was inferred using the maximum likelihood method based on the jtt matrix-based model [34]. the bootstrap consensus tree inferred from a thousand replicates was taken to represent the evolutionary history of the analyzed taxa. branches corresponding to partitions reproduced in less than 50% of the bootstrap replicates were automatically collapsed. the percentage of replicate trees in which the associated taxa clustered together in the bootstrap test of a thousand replicates is displayed next to each branch.
initial trees for the search were obtained automatically by applying the neighbor-joining and bionj algorithms to a matrix of pairwise distances estimated using a jtt model, and then selecting the topology with the superior log likelihood value. the phylogenetic analysis was carried out on 12 amino acid sequences with close identity. the complete dataset contained a total of 306 positions. the whole analysis was conducted using the molecular evolutionary genetics analysis (mega) software (version 7) [35].
ligand preparation and molecular docking protocol
2d structures of the experimental ligands were viewed in the pubchem repository and sketched using the chemaxon software [36]. the sketched structures were saved as mrv files, which were converted into smiles strings with openbabel. the compounds prepared as ligands were docked against each of the prepared protein receptors using autodock vina [37]. blind docking analysis was performed at extra precision mode with minimized ligand structures. after a successful docking, a file containing all the poses generated by autodock vina, along with their binding affinities and rmsd scores, was generated. in the vina output log file, the first pose was considered the best because it has a stronger binding affinity than the other poses and an rmsd value of zero. the polar interactions and binding orientation at the active site of the proteins were viewed in pymol, and the docking scores for each ligand screened against each receptor protein were recorded. the same docking protocol was performed against the sars-cov main proteinase 3d structure downloaded from the protein data bank with the pdb identifier 6m2n. the outputs obtained were visualized, compared, and documented for validation purposes.
the full genome of the 2019-ncov (https://www.ncbi.nlm.nih.gov/nuccore/mn908947.3?report=fasta) consists of 29903 nucleotides, but for the purpose of this study, nucleotides 10055 to 10972 were considered in locating the protein of interest. the direct translation of this segment of nucleotides produced a sequence of 306 amino acids (fig. 2). no deletion of any form was needed to reach this amino acid count, as the nucleotide sequence of interest contained no stop codons. as depicted in fig. 3, few structural differences were noticed. the amino acid sequences making up these non-conserved regions are shown in fig. 4. notwithstanding, a 96% identity was observed between both sequences, showing that the conserved domains were predominant. figure 4 represents the amino acid sequence alignment between the target and the template protein, where positions marked with an asterisk (*) are fully conserved, positions marked with a colon (:) indicate conservation between amino acid residues with similar properties, and positions marked with a period (.) indicate conservation between amino acids with less similar properties. the amino acid sequence of the sars coronavirus main proteinase was back-translated to generate the corresponding nucleotide sequence, which was then aligned with the 2019-ncov full genome. this was carried out to map out the region of the 2019-ncov full genome where the proteinase coding sequence is located. as depicted in fig. 5, the target protein coding sequence is located between nucleotides 10055 and 10972 of the viral genome. the outcome of a qmean analysis is anchored on the composite scoring function, which calculates several features regarding the structure of the target protein.
the estimated absolute quality of the model is expressed in terms of how well the model score agrees with the values expected from a set of high-resolution experimental structures. the global score can be either qmean4 or qmean6. qmean4 is a linear combination of four statistical potential terms, while qmean6 additionally uses two agreement terms that evaluate the consistency of structural features with sequence-based predictions. both qmean4 and qmean6 originally range from 0 to 1, with 1 being the best score, and are by default transformed into z-scores (table 1) so that they can be related to what would be expected from high-resolution x-ray structures. the local scores are likewise a linear combination of the four statistical potential terms and the agreement terms, evaluated on a per-residue basis. the local scores are also estimated in the range of 0 to 1, where 1 is the best score (fig. 6). when compared to the set of non-redundant protein structures, the qmean z-score of the target protein, as shown in fig. 7, was 0. models located in the dark zone of the graph have |z-score| < 1, while models in the regions outside the dark zone have either 1 < |z-score| < 2 or |z-score| > 2. good models are often located in the dark zone. whenever such values are found, they result in some strain in the polypeptide chain, and in such cases the stability of the structure will depend greatly on additional interactions; the conformation may nevertheless be conserved in a protein family for its structural significance. another exception to the principle of clustering in the α- and β-regions can be seen in the right-hand plot of fig. 8, where only the torsion angles for glycine are displayed on the ramachandran plot.
glycine has no side chain, which gives the polypeptide chain flexibility and makes otherwise forbidden rotation angles accessible. for this reason, glycine is more concentrated in loop regions, where sharp bends can occur in the polypeptide. glycine is also highly conserved in protein families, as the presence of turns at specific positions is a characteristic feature of particular structural folds. the comparative physicochemical parameters of the template and target proteins were computed by protparam from the amino acid sequences of the individual proteins. no additional information was required about the proteins under consideration, and each of the complete sequences was analyzed. the amino acid sequence of the target protein has not been deposited in the swiss-prot database. for this reason, the standard single-letter amino acid sequences of both proteins were entered into the text field to compute the physicochemical properties shown in tables 3, 4. the two hiv protease inhibitors (lopinavir and ritonavir), when targeted at the modeled 2019-ncov catalytic site, gave significant inhibition attributes; hence, the in silico study was planned through molecular docking analysis with autodock vina. the binding orientation of the drugs in the protein active site, as viewed in the pymol molecular visualizer (fig. 11), showed an induced-fit binding conformation. the same compounds were targeted against the active site of the downloaded 3d structure of the sars-cov main proteinase (pdb 6m2n) for comparison purposes (fig. 12). the active site residues as visualized in pymol are shown in fig. 13. the binding of lopinavir to the target protein, which produced the best binding score, was used as the predictive model. residues at a distance of < 5 angstroms from the bound ligand were assumed to form
fig. 13 the combined view of the 3d structural comparison between the modeled target protein and the downloaded pdb structure of the viral protein (left column) and their primary sequence alignment (right column). the target protein is colored in grey while its protein data bank equivalent is colored in red. the high structural similarity between the two proteins was validated through their sequence alignment, which produced a 99.34% sequence identity score.
fig. 8 depicted here are two ramachandran plots. the plot on the left-hand side shows the general torsion angles for all the residues in the target protein, while the plot on the right-hand side is specific for the glycine residues of the protein.
fig. 9 the target protein secondary structures with bound lopinavir. at the top is the secondary structure visualization in pymol, with regions making up the alpha helix, beta sheets, and loops shown in light blue, purple, and brown, respectively. below is the prediction by cfssp, where the red, green, yellow, and blue lines depict regions of the helices, sheets, turns, and coils (loops), respectively. the predicted secondary structure composition shows a high proportion of alpha helix and beta sheet, occupying 45 and 47% of the total residues, respectively, with loops occupying 8%.
homology modeling, a computational method for modeling the 3d structure of proteins that is also regarded as comparative modeling, constructs atomic models based on known structures, that is, structures that have been determined experimentally and that share more than 40% sequence homology with the target. the underlying principle is that two proteins with high similarity in their amino acid sequences are likely to have similar three-dimensional structures. when one of the proteins has an already determined 3d structure, the structure of the unknown one can be modeled on it with a high degree of confidence.
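the sequence identity figures quoted in this paper (96% between target and template, 99.34% against the deposited structure, and the 40% threshold for template selection) are percentages of aligned positions that carry the same residue. a minimal python sketch of that calculation is given below. it assumes two pre-aligned sequences of equal length with `-` as the gap character and skips any column containing a gap; alignment tools differ in how they treat gaps and which length they use as the denominator, so this is one common convention rather than the one used by clustal omega or swiss-model. the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of gap-free aligned columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # columns with a gap are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared
```

for a gap-free alignment of 306 residues, each mismatched position lowers the identity by 100/306, or roughly 0.33 percentage points.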
in homology modeling, alpha carbon positions are modeled more accurately than side chains and loop regions, which are mostly inaccurate. as regards template selection, homologous proteins with determined structures are searched for in the protein data bank (pdb); to be considered for selection, templates must have, alongside a minimum of 40% identity with the target sequence, the highest possible resolution and appropriate cofactors [29]. in this study, the target protein was modeled using the sars coronavirus main proteinase as template. this selection was based on its high resolution and its identity with the target protein, which is as high as 96%. qualitative model energy analysis (qmean) is a composite scoring function that describes protein structures on the basis of major geometrical aspects. the qmean scoring function calculates the global quality of models as a linear combination of six structural descriptors, four of which are statistical potentials of mean force. local geometry is analyzed by a torsion angle potential over three consecutive amino acids. in protein structure prediction, final models are often selected after the production of a considerable number of alternative models; hence, the prediction relies on a scoring function that identifies the best structural model within a collection of alternatives. two distance-dependent interaction potentials are used to assess long-range interactions based on c_β atoms and all atoms, respectively. a solvation potential describes the burial status of the amino acid residues, and the remaining two terms reflect the agreement between the predicted and calculated secondary structure and solvent accessibility [38].
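the z-scores used to report qmean results express how far a model score lies from the mean of a reference set of experimental structures, in units of the standard deviation of that set. the python sketch below illustrates only this normalization and the zones used in the comparison plot. the reference scores passed in are hypothetical inputs; the actual qmean reference set and its normalization procedure are not reproduced here, and the function names are chosen for illustration.

```python
from statistics import mean, stdev


def z_score(model_score: float, reference_scores: list) -> float:
    """Number of standard deviations by which a model score differs from the reference mean."""
    mu = mean(reference_scores)
    sigma = stdev(reference_scores)  # sample standard deviation
    return (model_score - mu) / sigma


def zone(z: float) -> str:
    """Classify a z-score into the three zones of the comparison plot."""
    magnitude = abs(z)
    if magnitude < 1:
        return "|z| < 1"
    if magnitude < 2:
        return "1 <= |z| < 2"
    return "|z| >= 2"
```

a z-score near 0 means that the model scores about as well as the average reference structure, which is how the value reported for the target protein is interpreted in the text.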
the resultant target protein can be considered a good model, as the z-scores of the c_β interaction energy, all-atom pairwise energy, solvation energy, and torsion angle energy are − 0.35, − 0.65, − 0.77, and 0.36, respectively, as shown in table 1. the quality of a model can be compared to high-resolution reference structures obtained by x-ray crystallography through the z-score, where 0 is the average z-score value for a good model [26]. according to benkert et al., the qmean z-score provides an estimate of the degree of nativeness of the structural features observed in a model, and indicates whether the model is of a quality comparable to experimental structures [26]. our study shows that the z-score of the target is "0", as indicated in fig. 6, and such a score is an indication of a relatively good model, as it matches the average z-score for a good model. the properties of the predicted model determine the molprobity scores. work initially done on all-atom contact analysis has shown that proteins possess exquisitely well-packed structures, with favorable van der waals interactions and minimal overlaps between atoms that do not form hydrogen bonds [39]. unfavorable steric clashes correlate strongly with poor data quality, and occur at a rate near zero in the ordered regions of high-resolution crystal structures. therefore, a low clash score indicates a very good model, as shown by the clash score of the target protein modeled for this study (table 2). in addition to the clash score, the details of protein conformation are remarkably relaxed, such as staggered χ angles and even staggered methyl groups [40].
forces applied to a given local motif in an environment predominantly made up of folded protein interior can produce a locally strained conformation, but significant strain is kept near the functionally needed minimum by evolution, on the presumption that the stability of proteins is too marginal to tolerate more. in updates to the traditional validation measures, rigorously quality-filtered crystal structures have been compiled, filtered by homology, resolution, and overall validation score at file level, and by b-factor and sometimes all-atom steric clashes at residue level. the resulting multi-dimensional distributions, after adequate smoothing, are used to score how "protein-like" each local conformation is in relation to known structures, either for backbone ramachandran values or for side chain rotamers [41]. at high resolution, rotamer outliers account for < 1% of residues, general-case ramachandran outliers for < 0.05%, and ramachandran favored conformations for 98%. in this regard, the molprobity score (mpscore) is defined as mpscore = 0.426 × ln(1 + clashscore) + 0.33 × ln(1 + max(0, rota_out − 1)) + 0.25 × ln(1 + max(0, rama_iffy − 2)) + 0.5, where clashscore is the number of unfavorable all-atom steric overlaps ≥ 0.4 å per 1000 atoms [38]. rota_out is the percentage of side chain conformations classed as rotamer outliers, from those side chains that can be evaluated, while rama_iffy is the percentage of backbone ramachandran conformations outside the favored region, from those residues that can be evaluated. the coefficients are derived from a log-linear fit to crystallographic resolution on a filtered set of pdb structures, so that the mpscore of a model is the resolution at which each of its scores would be the expected value; thus, lower mpscores indicate a better model.
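the mpscore definition can be checked with a short calculation. the following is a minimal python sketch, not the molprobity implementation: it reads the two max terms as rota_out − 1 and rama_iffy − 2, and it assumes that rama_iffy can be taken as 100 minus the ramachandran favored percentage. the function name and variable names are illustrative only; the input values are the clash score, rotamer outlier and ramachandran favored figures reported for the target protein in this study.

```python
import math


def molprobity_score(clashscore, rota_out, rama_iffy):
    """Combine the three validation terms into a single mpscore.

    clashscore: steric overlaps >= 0.4 angstrom per 1000 atoms
    rota_out:   percentage of rotamer outliers
    rama_iffy:  percentage of residues outside the ramachandran favored region
    """
    return (0.426 * math.log(1 + clashscore)
            + 0.33 * math.log(1 + max(0.0, rota_out - 1))
            + 0.25 * math.log(1 + max(0.0, rama_iffy - 2))
            + 0.5)


# values reported for the target protein in the text
clashscore = 2.06
rota_out = 5.21
rama_favored = 95.66
# assumption: every residue not in the favored region counts towards rama_iffy
rama_iffy = 100 - rama_favored

score = molprobity_score(clashscore, rota_out, rama_iffy)
print(f"{score:.2f}")  # 1.82
```

under these assumptions the three reported percentages reproduce the stated molprobity score, which suggests the rotamer and ramachandran terms were read as intended.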
with a clash score of 2.06, 95.66% of residues in the ramachandran favored region, and ramachandran outliers and rotamer outliers of 0.83% and 5.21%, respectively, we arrived at a molprobity score of 1.82, which is low enough to indicate a good-quality model of our experimental protein. the repetitive conformation of amino acid residues is the basis for the repetitive nature of the secondary structures, hence the repeating values of φ and ψ. the ranges of φ and ψ can be used to distinguish the different secondary structural elements, as the φ and ψ values of each secondary structure element map to their respective regions on the ramachandran plot. on the ramachandran plot, α-helices cluster about average values of φ = − 57° and ψ = − 47°, while twisted beta sheets cluster about average values of φ = − 130° and ψ = + 140° [42]. the core region (green in fig. 8) on the plot has the most favorable combinations of φ and ψ values and has the highest number of dots. the figure also shows, in the upper right quadrant, a small third core region. this is known as the allowed region and can be found in the areas of the core regions or might not be associated with the core region. it has fewer data points than the core regions. the other areas on the plot are regarded as disallowed. since glycine has only one hydrogen atom as side chain, steric hindrance is not as likely to occur as φ and ψ are rotated through a series of values. the glycine residues having φ and ψ values of + 55° and − 116°, respectively [43], do not exhibit steric hindrance and can for that reason be positioned in a region that is disallowed for other residues, as shown in the right hand side plot in fig. 8. the extinction coefficient is an indication of the intensity of light absorbed by a protein at a specific wavelength.
the importance of estimating this coefficient is to monitor a protein undergoing purification in a spectrophotometer. woody in his experiment [44] has shown the possibility of estimating a protein's molar extinction coefficient from knowledge of its amino acid composition, which has been presented in table 3. the extinction coefficient of the proteins (both the template and the target proteins) was calculated using the equation: e(prot) = numb(tyr) × ext(tyr) + numb(trp) × ext(trp) + numb(cystine) × ext(cystine). the absorbance (optical density) was calculated using the following formula: absorb(prot) = e(prot) / molecular_weight. for these equations to be valid, the following conditions must be met: ph 6.5, 6.0 m guanidium hydrochloride, 0.02 m phosphate buffer. the n-terminal residue identity of a protein is an important factor in the determination of its stability in vivo and also plays a major role in the proteolytic degradation process mediated by ubiquitin [45]. β-galactosidase proteins with different n-terminal amino acids were designed through site-directed mutagenesis, and the designed proteins have strikingly different half-lives in vivo, ranging from over a hundred hours to less than 2 min, depending on the nature of the amino terminal residue and on the experimental model (yeast in vivo; mammalian reticulocytes in vitro; e. coli in vivo). the individual amino acid residues can thus be ordered with respect to the half-lives they confer when located at the protein's amino terminus [46]. this is referred to as the "n-end rule", on which the estimated half-lives of both the template and target proteins were based. the instability index provides an estimate of the protein's stability in a test tube. statistical analysis of 32 stable and 12 unstable proteins has shown [47] that there are specific dipeptides with significantly varied occurrence in the unstable proteins as compared with those in the stable proteins.
the authors of this method have assigned a weight value of instability to each of the 400 different dipeptides (diwv). the computation of a protein's instability index (ii) is thus possible using these weight values, and is defined as: ii = (10 / l) × Σ diwv(x[i]x[i + 1]), summed over i = 1 to l − 1, where l is the sequence length and diwv(x[i]x[i + 1]) is the instability weight value for the dipeptide starting from position i. a protein that exhibits an instability index value less than 40 can be predicted as a stable protein while an instability index value that exceeds the 40 threshold is an indication that the protein may be unstable. the comparative instability index values for the template and target proteins were 29.67 and 27.65 (table 4), respectively, showing both are stable proteins.
table 3 amino acid composition table for both template and target protein (amino acid residues in one letter codes)
template 17 12 19 17 12 14 9 26 8 12 30 11 10 16 13 16 26 3 11 24
target 17 11 21 17 12 14 9 26 7 11 29 11 10 17 13 16 24 3 11 27
the relative volume occupied by aliphatic side chains (alanine, valine, isoleucine, and leucine) of a protein is known as its aliphatic index. it may be regarded as a positive factor for increased thermostability of globular proteins. the aliphatic index of the experimental proteins was calculated according to the following formula [48]: aliphatic index = x(ala) + a × x(val) + b × (x(ile) + x(leu)), where x(ala), x(val), x(ile), and x(leu) are the mole percent (100 × mole fraction) of alanine, valine, isoleucine, and leucine. the coefficients "a" and "b" are the relative volume of the valine side chain (a = 2.9) and of leu/ile side chains (b = 3.9) to the alanine side chain. the calculated aliphatic index for the experimental proteins shows that the thermostability of the target protein is slightly higher than that of the template. the most common secondary structures are the alpha helices and beta sheets although the beta turns and omega loops also occur.
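the instability index and aliphatic index described above can be sketched in a few lines of python. this is an illustrative sketch under stated assumptions, not the tool used in the study. it assumes the usual definitions, ii = (10 / l) × the sum of diwv over consecutive dipeptides, and aliphatic index = x(ala) + a × x(val) + b × (x(ile) + x(leu)). the alanine, valine, isoleucine and leucine counts are read from table 3 on the assumption that its columns follow the alphabetical three-letter ordering (ala, arg, asn, asp, cys, gln, glu, gly, his, ile, leu, lys, met, phe, pro, ser, thr, trp, tyr, val), which the extracted table does not state. the dipeptide weights in `toy_diwv` are made-up placeholders, not the published diwv values, and treating a missing dipeptide as weight 0 is a simplification.

```python
def aliphatic_index(n_ala, n_val, n_ile, n_leu, length, a=2.9, b=3.9):
    """Aliphatic index from residue counts and sequence length."""
    to_mole_percent = 100.0 / length  # count -> mole percent
    x_ala = n_ala * to_mole_percent
    x_val = n_val * to_mole_percent
    x_ile = n_ile * to_mole_percent
    x_leu = n_leu * to_mole_percent
    return x_ala + a * x_val + b * (x_ile + x_leu)


def instability_index(sequence, diwv):
    """Instability index from a sequence and a dipeptide weight table.

    diwv maps two-letter dipeptides to weights; dipeptides that are
    absent from the table are treated as weight 0 (an assumption).
    """
    length = len(sequence)
    total = sum(diwv.get(sequence[i:i + 2], 0.0) for i in range(length - 1))
    return 10.0 / length * total


def is_predicted_stable(ii, threshold=40.0):
    """Values below the threshold are predicted stable."""
    return ii < threshold


# counts inferred from table 3 (column order is an assumption, see above)
template_ai = aliphatic_index(n_ala=17, n_val=24, n_ile=12, n_leu=30, length=306)
target_ai = aliphatic_index(n_ala=17, n_val=27, n_ile=11, n_leu=29, length=306)
print(f"template {template_ai:.2f}, target {target_ai:.2f}")

# toy weights, purely hypothetical, to show the mechanics of the sum
toy_diwv = {"AC": 1.0, "CA": 2.0}
toy_ii = instability_index("ACAC", toy_diwv)  # dipeptides: AC, CA, AC
print(toy_ii, is_predicted_stable(toy_ii))
```

with these inferred counts the target scores slightly higher than the template, in line with the comparison reported in the text. the instability index values of 29.67 and 27.65 cannot be reproduced here because the full sequences and the 400 published weights are not part of this section.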
elements of the secondary structures spontaneously form as an intermediate before their folding into the corresponding three-dimensional tertiary structures [49] . the stability and how robust the α-helices are to mutations in natural proteins have been shown in previous studies. they have also been shown to be more designable than the beta sheets; thus, designing a functional all-α helix protein is likely to be easier than designing proteins with both α helix and strands, and this has recently been confirmed experimentally [50] . the template and target proteins both have a total of 306 amino acid residues (table 4 ) with the composition of individual residues shown in table 3 . as shown in fig. 9 , the target protein which shares a structural homology with the template (fig. 3 and the animation video) is predominantly occupied by residues forming alpha helix and beta sheets, with very low percentage of the residues forming loops. the stability of these two proteins is revealed in their physiochemical characteristics which can therefore be linked to the high percentage of residues forming alpha helix. the ultimate goal of genome analysis is to understand the biology of organisms in both evolutionary and functional terms, and this involves the combination of different data from various sources [51] . for the purpose of this study, we compared our protein of interest to similar proteins in the ncbi database to predict the evolutionary relationships between homologous proteins represented in the genomes of each divergent species. this makes the amino acid sequence alignment the most suitable form of alignment for the phylogenetic tree construction. organisms with common ancestors were positioned in the same monophyletic group in the tree, and the same node where the protein of interest (the 2019-ncov main proteinase) is positioned also houses the non-structural polyprotein of the 1ab bat sars-like coronavirus. 
this shows that the two viral proteins share a common source with a shorter divergence period. bootstrapping gives a level of confidence for evolutionary predictions. a bootstrap value of one hundred indicates very high confidence in the positioning of a node in the topology, while nodes with lower values are more likely to have arisen by chance than to reflect the real tree topology [52]. the bootstrap value of the above-mentioned viral proteins, which is exactly 100, is therefore a very high level of statistical support for their positioning at the nodes in the branched part of the tree. the length of the branches is a representation of genetic distance. it is also a measure of the time since the viral proteins diverged: the greater the branch length, the more likely it is that a longer period of time has passed since divergence from the most closely related protein [53]. the tw9 and tjf strains of the sars coronavirus orf1a polyprotein and replicase, respectively, are the most distantly related, based on their branch length, and as such can be regarded as the out-group in the tree. structure-based drug discovery is the easiest molecular docking methodology as it screens a variety of listed ligands (compounds) in a chemical library by "docking" them against proteins of known structure, which in this study is the modeled 3d structure of the 2019-ncov main proteinase, and shows the binding affinity details alongside the binding conformation of ligands in the enzyme active site [54]. ligand docking can be specific, that is, focusing only on the predicted binding sites of the protein of interest, or can be blind docking, where the entire area of the protein is covered. most docking tool applications focus on the predicted primary binding site of ligands; however, in some cases, the information about the target protein binding site is missing. blind docking is known to be an unbiased molecular docking approach as it scans the protein structure in order to locate the ideal binding site of ligands [55].
the autodock-based blind docking approach was introduced in this study to search the entire surface of the target and template protein for binding sites while optimizing the conformation of peptides simultaneously. for this reason, it was necessary to set up our docking parameters to search the entire surface of the modeled main proteinase of the 2019-ncov. this was achieved using the autogrid to create a very large grid map (center 77 å × − 10 å × 15 å and size 30 å × 60 å × 35 å) with the maximum number of points in each dimension in order to cover the whole protein. we observed a partial overlap in the docking pose of lopinavir to the active site of both template and target protein as compared to the conspicuous difference observed in the binding orientation of ritonavir to the protein active sites. these differential poses can be viewed distinctively in the attached animation video. a keen view of the binding orientation of the two drug candidates to the 2019-ncov virus main proteinase active site (fig. 11) is also consistent with the proposed induced fit binding model. in a comparative docking study, the same drug candidates (lopinavir and ritonavir) were docked against the active site of the pdb downloaded version of the viral main proteinase. the docking grid for this purpose was set with precision as the solved pdb structure of the virus included a cocrystalized ligand at the enzyme active site (center -32 å × − 65 å × 42 å and size 25 å × 30 å × 25 å) and experimental ligands bind to this site with precision and variation in poses (fig. 12). the binding energy results showed a difference of − 0.3 kcal/mol upon the binding of lopinavir to the template and the pdb 3d structure of the enzyme (pdb 6m2n), and a difference of − 0.5 kcal/mol between the pdb 3d structure of the enzyme and the target protein (table 5 and 6). the same comparative study was repeated for the binding of ritonavir and a difference of − 0.1 and − 1.0 kcal/mol was observed upon the binding of drug to the template and target proteins, respectively, in comparison with the binding to the downloaded 3d structure of the enzyme from the pdb. the observed consistency in the binding energy of the drug candidates can also serve as a reference to the validity and quality of the modeled protein, which has exhibited a high sequence and structural similarity with the downloaded 3d structure from the protein data bank (fig. 13).
table 5 here, the docking results of lopinavir and ritonavir against the template and target protein are shown. the binding of ritonavir to the template protein produced the highest number of inter model hydrogen bonds while the binding of lopinavir to the target protein formed polar interaction with three residues at the active site as compared to the two formed by the other interactions.
table 6 the amino acid residues involved in polar interaction, the number of inter-model hydrogen bonds and the docking score of lopinavir and ritonavir upon binding to the 3d pdb download of the sars-cov main proteinase (pdb 6m2n).
in an effort to make available potent therapeutic agents against the fast rising 2019 novel coronavirus epidemic, we identified from the viral genome the coding region and modeled the main proteinase of the virus coupled with the evaluation of the efficacy of existing hiv protease inhibitors by targeting the protein active site using a blind docking approach. our study has shown that lopinavir displays a broader spectrum inhibition against both the sars coronavirus and 2019-ncov main proteinase as compared to the inhibition profile of ritonavir.
the modeled 3d structure of the enzyme has also provided interesting insights regarding the binding orientation of the experimental drugs and possible interactions at the protein active site. the conclusion from the study of cao et al. as previously discussed however has shown that the administration of the lopinavir-ritonavir therapy might elicit additional health concerns as a result of the extreme adverse events exhibited by the experimental subjects for the purpose of their study. it was also observed that the drugs showed no increased benefit when compared with the standard supportive care. in view of this findings, we therefore suggest a drug modification approach aimed at avoiding the health concerns posed by the lopinavir-ritonavir combined therapy while retaining their proteinase inhibitory activity. supplementary information accompanies this paper at https://doi.org/10. 1186/s43042-020-00081-5. additional file 1. supplementary information to this article can be found online at https://www.rcsb.org/structure/6m2n clinical features of patients with 2019 novel coronavirus in wuhan genomic characterization and epidemiology of 2019 novel coronavirus: implications of virus origins and receptor binding a novel coronavirus from patients with pneumonia in china a familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster importation and human-tohuman transmission of a novel coronavirus in vietnam national health commission of the people's republic of china transmission of 2019-ncov infection from an asymptomatic contact in germany alert, verification and public health management of sars in the post-outbreak period coronavirus in severe acute respiratory syndrome (sars) a novel coronavirus and sars crystal structures of the main peptidase from the sars coronavirus inhibited by a substrate-like aza-peptide epoxide dissection study on the sars 3c-like protease reveals the critical 
role of the extra domain in dimerization of the enzyme: defining the extra domain as a new target for design of highly-specific protease inhibitors 3c-like proteinase from sars coronavirus catalyzes substrate hydrolysis by a general base mechanism only one protomer is active in the dimer of sars 3c-like proteinase biosynthesis, purification, and substrate specificity of severe acute respiratory syndrome coronavirus 3c-like proteinase a trial of lopinavir-ritonavir in adults hospitalized with severe covid-19 emboss: the european molecular biology open software suite srs, an indexing and retrieval tool for flat file data libraries issues in bioinformatics benchmarking: the case study of multiple sequence alignment hhblits: lightning-fast iterative protein sequence searching by hmm-hmm alignment the swiss-prot protein knowledgebase and its supplement trembl in 2003 toward the estimation of the absolute quality of individual protein structure models molprobity: more and better reference data for improved all-atom structure validation chapter 2: protein composition and structure modeling protein quaternary structure of homo-and hetero-oligomers beyond binary interactions by homology ucsf chimera-a visualization system for exploratory research and analysis fasman gd (1974) prediction of protein conformation protein identification and analysis tools on the expasy server the rapid generation of mutation data matrices from protein sequences mega7: molecular evolutionary genetics analysis version 7.0 for bigger datasets chemoinformatics: theory, practice, & products critical assessment of the automated autodock as a new docking tool for virtual screening critical assessment of methods of protein structure prediction (casp) round 6 visualizing and quantifying molecular goodnessof-fit: small-probe contact dots with explicit hydrogen atoms a test of enhancing model accuracy in high-throughput crystallography the penultimate rotamer library protein geometry database: a flexible 
engine to explore backbone conformations and their relationships to covalent geometry circular dichroism spectrum of peptides in the poly(pro)ii conformation calculation of protein extinction coefficients from amino acid sequence data universality and structure of the n-end rule the n-end rule in bacteria correlation between stability of a protein and its dipeptide composition: a novel approach for predicting in vivo stability of a protein from its primary sequence thermostability and aliphatic index of globular proteins alpha helices are more robust to mutations than beta strands global analysis of protein folding using massively parallel design, synthesis, and testing time of the deepest root for polymorphism in human mitochondrial dna intraspecific nucleotide sequence differences in the major noncoding region of human mitochondrial dna limitation of the evolutionary parsimony method of phylogenetic analysis efficient docking of peptides to proteins without prior knowledge of the binding site molecular recognition and docking algorithms we appreciate the leadership of the laboratory of cellular dynamics (lcd), university of science and technology of china, for the all-around support and academic advisory role. we also acknowledge the strong support from the ustc office of international cooperation all through the challenging period of the coronavirus epidemic. the authors received no funding for this project from any organization. ethics approval and consent to participate not applicable the authors declare that they have no competing interests. key: cord-016442-3su3x6ed authors: aiking, harry; de boer, joop; vereijken, johan m. title: transition feasibility and implications for stakeholders date: 2006 journal: sustainable protein production and consumption: pigs or peas? 
doi: 10.1007/1-4020-4842-4_7 sha: doc_id: 16442 cord_uid: 3su3x6ed nan role of citizens who support more stringent environmental standards might be greater than their economic role as consumers when they make trade-offs between the prices of pork and npfs. third, the ecological approach argued that for developing a coherent set of indicators no framework turned out to be available, which (1) links pressures to environmental consequences, and (2) focuses on how each of the indicators can contribute to the problem solving logic required to answer questions of environmental sustainability. rather, the existing frameworks take a one-dimensional perspective. therefore, causal networks were recommended, describing how individual indicators are interrelated, thus providing a more complete, holistic picture of what is happening in the environment. in short, process knowledge provides insight into cause-andeffect networks and is put forward to guide the selection of appropriate indicator sets. together, irrespective of the approach, a clear consensus was provided (1) that npfs are environmentally more sustainable than pork and (2) that the most important impacts are due to tampering with (a) land use and the cycling of (b) water, (c) carbon and (d) nitrogen, leading to biodiversity loss, climate change, eutrophication and acidification (chapter 2). given the convergent results concerning the relative advantages of npfs, the question emerges whether this information will be enough for policymakers who want to choose between alternative protein chains. as we shall see in greater detail later, policymakers should try to avoid suboptimal solutions and that makes it important for them to start with a comparative analysis of a large number of options. presently, the environmental assessment tool is limited to the agricultural production phase. 
after extension to the rest of the production and consumption chain the method should be generalised beyond comparing just protein food options. in view of the increasing relevance of linkages between transitions, it should be extended sufficiently to become a tool to assess the environmental sustainability of transitions in general. because a large part of the environmental impact of the pork chain takes place outside western europe, it is important to also look at the global dimension. reducing this environmental impact in developing countries of asia and south america by shifting to a more plant protein centred consumption pattern in western europe may have important economic repercussions in those developing countries. in turn, this may or may not cause negative environmental impacts there, depending on what alternative livelihood strategies will be developed to compensate reduced feed crop exports to western europe. so north-south issues clearly require further study. six projects aimed at assessing aspects of the technological feasibility of pea-derived npfs were performed in profetas. two of them were directed at aspects concerning primary production of peas and two at aspects concerning processing of peas to npfs. in addition, in one project a tool is being developed to optimise the npf chain. last but not least, in one project the options for the non-protein fractions were studied. in the breeding project it was shown that between different varieties there is a large variation in protein composition. this variability can be exploited in classical breeding programmes to obtain pea varieties that are optimised towards npf production. furthermore, a tool has been developed for genetic modification of pea and other legumes. its application will enable selective removal of certain storage proteins (which affect texture formation) and of certain enzymes (which affect flavour compounds) in order to improve the quality of the raw material. 
as an alternative, genetic modification may be used in r&d for screening the effects of alterations in protein and flavour composition. such an approach will speed up classical breeding while field crops are kept gm free. in the cultivation project, a crop growth model has been developed that predicts yield of peas and pea protein as a function of genotype and environmental variables (such as solar radiation, temperature and rainfall). the model is generic; it can be extended to any arable crop. the model can be used to optimise production (e.g. with respect to resource use efficiency such as water) or to define the characteristics a variety should have for a given environment. an important insight resulting from this project is that using varieties with a high protein content cannot increase the protein yield per hectare. to achieve the latter, other traits (e.g. plant architecture to sustain leaf area duration during the grain filling period) should also be changed. furthermore, attention was drawn to agronomic aspects, such as resistance to lodging and soil-borne diseases, which indicate that peas may not be the best choice to function as starting material for npf production. the texture formation studies yielded new insights into a process that is of prime importance in texture formation, i.e. heat-induced gelling behaviour of pea proteins. this behaviour was shown to be affected by the protein composition and by the rate of cooling during processing. this implies that a range of textures can be produced (a) by varying the protein composition by selecting appropriate pea varieties and (b) by varying the processing parameters. this provides basic tools to food technologists to direct the texture of pea-based npfs towards consumer demands. comparative studies using soy and pea proteins led to the insight that despite similarities between proteins on the molecular level, their gelling behaviour may be different.
on the one hand, this could mean that a large range of textures can be made using a limited set of proteins. on the other hand, this observation likely sets limits to the interchangeability of proteins. protein-flavour interactions studies using model volatile flavour compounds provided information on the amounts of flavour compounds associated with the pea proteins under various conditions. they also showed that the interactions with flavour compounds differ per type of pea protein. so again, varying protein composition or process parameters offers possibilities to tune the flavour of npfs. furthermore, it was shown that saponins, not only present in peas but also in numerous other legumes (including soy) are responsible for a bitter taste. this bitterness may be reduced by heating, thus providing a technological tool to adjust taste. it was also found that in pea protein preparations "off-flavours" seem to be present that are derived from fat oxidation. similar "off-flavours" have been found in other plant protein preparations too; very well known examples are soy protein preparations. the project on chain design will not be completed before 2006. tools developed in this project are expected to be useful in the design of food chains to optimise these chains either towards a single goal (such as product quality, costs, or environmental load), or even towards multiple goals. preliminary results are that the cost price of npfs could be less than or at least comparable to that of pork. comparison of the environmental load by using exergy analysis showed that an npf chain is more efficient than a pork chain only if the non-protein fractions of pea seeds (mainly starch) are put to use. options for the non-protein fractions were investigated in a project added later to study the feasibility of plant-derived npfs from a different perspective. a tool was developed to assess different protein crops on the potential uses of the non-protein fractions. 
only bulk fractions were distinguished and evaluated on their suitability to be used as food, feed, stock and biofuel. the conclusion was drawn that -at present -oil crops may be at an advantage, because the oil fractions have better application perspectives than other non-protein fractions. interestingly, a transition from intensive meat production to npf production from any crop would lead to a shortage of soy oil for food applications, rather than to a surplus of this non-protein fraction. if it is assumed that all intensive livestock farming disappears and only grass-fed and other extensive meat production would remain, then this shift -irrespective of the crop (pea or soy) -would release an enormous area (over 300 million ha) of highly productive land for other uses. for example, if used to grow biomass, this would be sufficient to cover approximately 25% of the current world energy demand. the research on technological feasibility has focussed primarily on basic problems and on issues that needed to be solved before products could be designed. with respect to processing some important insights and tools are still missing. for instance, neither have texturisation processes been studied, nor have products (or product concepts) actually been made. such research is indispensable to allow a consumer-oriented development of npfs. at present it is still unknown whether more traditional texturing processes such as extrusion can deliver the range of textures desired by consumers. it may well be that novel techniques have to be deployed and further developed, such as techniques based on phase separation. furthermore, in actual npfs other components (e.g. hydrocolloids, fats) will be present. no information is available about the effects of such components on the texture and flavour of the npfs. with respect to flavour, stakeholders from industry have indicated that starting materials for npf production should preferably be free from "off-flavours". 
the prevention of formation of such compounds or ways to remove them should be investigated. a lot of knowledge on this subject is already available from research on soy proteins. missing insights with respect to the use of the newly developed protocol to genetically modify peas include, among others, its effectiveness and efficiency as well as its effects on cultivation. these effects not only include those relevant for npf processing (e.g. effects of protein composition on texture and flavour) but also those relevant for primary production. questions have to be answered regarding the effects of the modification on for example (a) germination power of the seeds, (b) viability of the modified peas in the field and (c) an eventual yield penalty for total protein content because of the modification. furthermore, the applicability of this new protocol to modify crops other than peas needs to be further explored. to exploit the interesting possibilities offered by the crop growth model, it has to be validated and to be extended to crops other than peas. then it can be a powerful tool in identifying differences among various crops in resource use efficiency for e.g. biomass and protein production, hence in contributing to a sustainable primary production of raw materials for npfs. in summary, quite a lot of research is still required to actually develop and produce npfs that meet consumer preferences. in addition to the technological issues discussed above, the project on chain design will most likely result not just in answering questions but also in raising others. furthermore, options for the non-protein fractions should be detailed by means of scenario studies, with particular attention for the projected shortage of soy oil for food applications. finally, the issue of the crops to be used for npf production is still open. as argued in section 6.2, even in europe peas may not be the preferred crop for npf production. more research concerning crop choice is required. 
in addition to sustainability - which is an evident societal preference - societal desirability was studied in profetas with respect to (1) the behaviour and preferences of consumers and (2) the food-related political and economic developments of the next decades. the underlying notion of this work was that the desirability of a diet shift is in proportion to its fitting in with the behavioural patterns of current and future generations. in addition, it was argued that a lack of fit would create less serious problems if mitigating measures can be taken. based on a long-term view on behaviour, the potential for a diet shift in relation to socio-cultural changes was examined. a newly developed analytical framework sorts insights on influences on behaviour into a logical order. its cascade-like structure embodies the view that a long-term development will create opportunities for food choices that match its general direction, whereas it will put constraints on others. the analysis of long-term changes indicated that there is a favourable socio-cultural context for decisions that make consumers and producers less dependent on meat proteins. one of the most salient results of this work, however, is the contrast between, on the one hand, a series of impressive changes in dietary choices over the last few centuries and particularly over the last few decades and, on the other hand, the observation that an individual will not easily change his or her food preferences from one day to the next. this contrast underlines the value of the analytical framework in combination with small-scale consumer research. by using currently available meat substitutes as a model to analyse consumer behaviour and food choices, it was possible to develop several tools that may guide npf product development. it appears that the currently available meat substitutes will not become popular without additional measures. 
analysis of consumers and consumption behaviour with respect to meat and meat substitutes provided insights such as (a) non-vegetarian consumers of meat substitutes are not impressed by environmental arguments, (b) only a small group of consumers is open to completely new products due to so-called "neophobia", (c) consumers would like more information on usage and preparation of meat substitutes and on ingredients used. consumer sensory preferences were analysed and their translation into product characteristics was attempted, yielding insights such as (a) attention should be paid to satiating properties of npfs, which are related to protein content, (b) most people want soft, smooth or crispy meat substitutes to have a seasoned, meat-like flavour and a brown colour, (c) meat substitutes should have the same place in the dish as meat. other than originally anticipated, the latter results clearly show that people want npfs to have meat-like characteristics. they confirm the notion that people will habitually look for what is familiar when they are trying to make sense of something, such as an invitation to try a product. the retrospective character of sense making can explain why non-vegetarian consumers keep relying on distinctions drawn in the past and why they evaluate meat substitutes by using meat-based criteria. the selection environment in which npfs have to survive is partly dependent on their positioning in the market, for example as cheap (bulk) proteins or as quality products (specialties). the prospects of the new products may also depend on linkages with other issues, such as increasing meat prices or health promotion. these prospects are not only dependent on consumer responses in a potential usage situation, but also on processes at the level of markets and public institutions. 
given the fact that technology development takes place in a changing and malleable world, the stakeholders of a new technology may opt to monitor and influence its selection environment. profetas applied some advanced tools such as policy analysis and econometric modelling, as well as a skilled examination of the rules of international institutions with regard to novel foods. in a political analysis it was investigated how politics and public policy affect the possibilities of a diet shift. as expected, it appeared that governmental policy does not have many direct influences on food choices (i.e. the proportion of meat vs. plant proteins in the nation's diet), but mostly indirect influences. these influences demonstrate that production and consumption are, at least partially, facilitated by a political "infrastructure", implicitly favouring certain products or production processes over others. to uncover indirect influences, it was analysed how current food practices have developed in the netherlands since 1850 (i.e. when the physician g.j. mulder brought up the issue of proteins and the societal effects of the lack of proteins in most people's diets). given the current sustainability-related issues, the political infrastructure still seems to favour the option of more protein production and consumption, although that option is not actively promoted by governmental action. from a sustainability perspective, however, it may be desirable to have a political infrastructure that favours a more divergent food system with different approaches to food. this would mean, for example, that the option of producing more plant protein does not necessarily get more emphasis than the option of simply "eating less meat". 
two complementary econometric analyses studied agricultural production and consumption and the patterns exhibited (by adaptations to the existing gemat model), and institutional arrangements affecting agricultural production and trade (adapting the gtap model), respectively. it was shown that a protein transition results in a decrease of environmental pressure on land under all scenarios (gtap) and that the potential is even larger (gemat). though crucially important to the feasibility of a transition, interestingly, the two models employed disagree on future meat price development. from gtap a price decrease may be inferred, but from gemat a price increase. the underlying cause of this disagreement is the implicit assumption, deeply embedded in the models, to what extent agricultural efficiency will continue to increase in the future. these results show the added value of employing two complementary models. a study of institutional aspects provided the insight that even though international institutions (such as the eu and the wto) may provide both incentives and barriers for the introduction, marketing and promotion of npfs, on balance the barriers exceed the incentives. since these barriers (primarily intended to resist protectionist practices in international trade) cannot be circumvented, a successful introduction and marketing of npfs should be taking them into proper account. in short, it is easier to start from already authorised foods and ingredients than to start from foods and ingredients that still need to be authorised, especially in europe if they are gm crop derived. for the promotion of npfs not too much should be expected from traditional government instruments such as taxes (on meat) and subsidies (on npfs). subsidies have already lost their appeal in most eu countries for purely domestic reasons, plus they are heavily restricted by eu regulations concerning state aid and the single market. 
if npfs are to become a success, it should primarily be through private, commercial means and action. international institutions can protect and support commercial interests, however, through the international protection of intellectual property rights. each of the tools has provided relevant information about opportunities and barriers for npfs, but, under the present conditions of uncertainty, none could lead to conclusive answers. for example, one of the drawbacks of consumer research is that the currently available meat substitutes are, in fact, sold in a niche market and that they are almost twice as expensive as the cheapest meats. in order to improve the relevance of consumer research (sensory research, in particular), actual pea-derived npf products -in the form of cheap protein products or as quality products -are indispensable. since such products are not available yet, they should be crafted for this purpose. these products could also be used in assessing the relation between sensory properties of npfs and their physical characteristics, a very difficult field of research, but of great importance for consumer-oriented product development. evidently, it is important for product developers to realize that "the average consumer" does not exist. subsequently, it is important to realize that different npfs will have to be developed to fulfil the needs of different consumer groups. all of the tools may seriously be influenced by the use of pre-conceptions in thinking about the future. the role of pre-conceptions is particularly great when questions have to be answered about the social desirability of policy options. it was shown that pre-conceptions lead to hidden assumptions that blur the arguments for or against an option. for instance, one of the preconceptions defined a diet shift primarily as an opportunity to develop products with a larger profit margin than meat. 
in contrast, it may be more sensible to develop plant protein ingredients that can serve as a low price alternative to meat protein ingredients. another example is the implicit assumption mentioned above that meat prices will continue to decrease in the future, in consequence of continuing agricultural efficiency. probably, policymakers may show wisdom by expecting that the meat prices will not decrease but increase as a result of changes in the world market and agroecological constraints on production. in fact, it seems inevitable that the present growth of spending power in china will put the world market prices of meat and feed under pressure. in addition, it should be emphasized that not too much should be expected of traditional government instruments such as taxes and subsidies. although the latter are effective tools, their application is fraught with political difficulties. although the various tools to analyse the social desirability of a diet shift have not identified strong arguments against it, a critical reflection on the results may lead to the conclusion that the corresponding profetas hypothesis deserves some more stringent tests than the arguments that have been described. in addition to the limitations mentioned above, it should be noted that the present analyses might have overstated the importance of agriculture to the political and economic processes in modern society. other linkages, in particular to non-food issues, may have been overlooked or underestimated. it is clear that health issues - which have not been directly studied in profetas - require further attention. for example, it has emerged recently that intensive production of poultry and pigs in close proximity with people may play an important role in the adaptation of originally poultry-specific viruses via pigs to human beings as suitable hosts (pilcher, 2004; chen et al., 2004). under such conditions - primarily extant in south east asia - new viruses are frequently spawned. 
it seems more and more likely that recent incidents such as with sars and avian influenza are correlated with the intensive meat production there, and that the frequency of such incidents will continue to rise in parallel in the future. apart from animal welfare issues, this health aspect, in itself, seems a good reason to reduce intensive meat production in general, and that of poultry and pigs in particular, at least under conditions such as in south east asia, which is globally an important producing and exporting area. from profetas the conclusion emerges that there are several sound reasons that support the triple hypothesis. that is, a shift in the western diet from meat proteins to plant-derived protein products appears to be (1) environmentally more sustainable than present trends, (2) technologically feasible, and (3) socially desirable. interestingly, the citizen generally seems to consider sustainability to be socially desirable, but the consumer does not like the taste of present meat substitutes. nevertheless, the main evidence is in support of a transition to make food production and consumption more sustainable. given the aims of profetas, the programme has not included specialized transitions research. nevertheless, many insights have been generated on the protein transition and on its linkages with the energy and freshwater transitions. these insights are described below. first, it has become clear that the protein transition is a necessity at the global level. without it, food production and sustainability are on a collision course. from time to time, there will be signals, alarming reports or dramatic events, which may contribute to the opening of a policy window. if the window opens, however, the market system will not simply guarantee that the most sustainable alternative technology will win. market failures, but also government failures, may result in the selection of a suboptimal solution. 
second, study of the technological feasibility has revealed a number of options, but also some gaps in the required knowledge, such as the preferred crop for npf production. although many criteria for such a crop have been established (such as low fertiliser requirement, high protein yield, and preferably already part of the established food system), a more conclusive choice cannot be made yet, for criteria in other areas (such as concerning the non-protein fraction) have not yet fully materialised. presently, the top choice in europe may be rapeseed or pea, but in asia it will probably be soy. third is the insight that the protein, energy and freshwater transitions are inextricably intertwined and that all parts of the crop or seed - protein and non-protein - should be considered one combined chain. although the former, in particular, is a novel insight, both the former and the latter views fit in nicely with the present trend towards a biobased economy, aiming to derive food and feed, chemicals and other non-food materials, and energy from plant crops in the most efficient way. fourth, there is support from various actors, although this depends on linkages with other issues. feedback from dutch governmental actors and ngos revealed enthusiasm for a protein transition; however, they are not inclined to initiate one themselves. in contrast, they do feel committed towards an energy transition. for most western consumers, sustainability or the environment is not an incentive for food choice; health, however, is. due to a number of meat crises and other food scares, health and animal welfare are valued as increasingly important issues by western consumers. in fact, sales of meat substitutes are increasing every year and they are being bought and eaten by non-vegetarian consumers. 
in conclusion, several trends in different areas (protein, biofuel, water saving), on different geographic levels (local to global; western to developing countries) and concerning different actors (consumer, government, industry, ngos) have been identified by profetas, which - taken together - may lead towards a protein transition. a major step is raising the awareness that all these trends are linked up and cannot be seen in isolation. so a major achievement of profetas is the insight to propose bringing together all these different actors, all with their own agendas and multiform aims. this is, we believe, the true definition of a societal transition. for pragmatic reasons, in profetas many boundary conditions were assumed, such as the focus on the western consumer and the focus on peas. at this stage, it seems timely to start focusing the attention on how - and where - a protein transition may be achieved in a dynamic, multiform society. alternative options such as deriving proteins from algae should be investigated, as well as the feasibility of reducing the present protein overconsumption in western countries (the "eating less protein" option). more direct research on transitions seems in order. the latter requires better insight into the role of explicit as well as implicit pre-conceptions in thinking about the future. in this respect, the use of multi-method strategies appears to be indispensable. multi-method strategies may not only validate each other or specify complementary aspects of a complex phenomenon, but they may also show interesting divergences. profetas has shown several divergences that are extremely important for further research into transitions. some examples are: the divergence between pre-conceptions that see a diet shift primarily as an opportunity to develop products with a larger profit margin than meat and pre-conceptions about developing plant based products that can serve as a low price alternative to meat protein ingredients. 
the contrast in the studies of consumer behaviour between, on the one hand, a series of impressive changes in dietary choices over the last few centuries and decades and, on the other hand, the observation that an individual will not easily change his or her food preferences from one day to the next. the assumption that meat prices will continue to decrease in the future, in consequence of continuing agricultural efficiency, versus the assumption that meat prices will increase as a result of changes in the world market and agro-ecological constraints on production. in spite of historical trends, it seems inevitable that the strong increase in meat consumption in countries such as china will put the world market prices of meat and feed under pressure. further research may be necessary into historical transitions, as well as into future general trends in global society. with respect to the latter, developing a number of contrasting visions of a potential global future is often considered helpful. in order to arrive at a robust transition strategy, these visions can be confronted with desirable options for protein food development. such an approach should not be too general, for example, by focussing on aggregated parameters such as the growth rate of the world economy. more specific factors, such as global public health, should be an explicit part of the picture. so should geographic and cultural inhomogeneities, and taking these into account clearly requires international cooperation. for the purpose of dutch and european policy making, it can be argued that protein sustainability is a medium to long-term issue on a global scale. in the not too distant future, intensive meat production worldwide should be discontinued, or at least strongly reduced. 
in practice, the notion of an approaching collision between current food production and sustainability means that imminent disturbances of the protein production chain will let themselves be known to policymakers through all kinds of signals. obviously, some of the weak signals could be a gradual rise of meat and feed prices. stronger signals may include human and animal health incidents, such as with sars and avian influenza, which are probably correlated with intensive meat production. against the background of irregular signals that vary in strength, governmental policymakers may have to manage at least three emerging goals: the first goal is to detect and interpret the signals correctly and to attribute them without delay to the relevant disturbances of the protein production chain. the bse case has shown that a delay might have disastrous consequences on the controllability of the whole policy making process. the second goal is to minimize any negative effects that the disturbances may have on society. for instance, it may be wise to be prepared for a large-scale vaccination campaign. the third goal is to prevent the opportunity to take action getting lost as a result of counteracting processes that favour suboptimal solutions. that is, solutions that are suboptimal from the perspective of long-term sustainability objectives, not aimed at the root of the problem, such as temporary bans on certain types of meat. the prevention of suboptimal solutions may require that governmental policymakers take action to correct market failures, for example, where private actors do not take full responsibility for the societal consequences of their activities. however, suboptimal solutions may also result from government failures, such as subsidies given to the wrong group or at the wrong moment. 
moreover, the various linkages between the protein transition and other transitions on different levels of scale will seriously complicate any attempt to apply straightforward forms of strategic planning. the least a robust policy should entail is an open mind to problem solving. for governmental policymakers it would be wise to avoid fixation on one particular technological option. in contrast, this might involve a decision to actively stimulate a range of divergent potential solutions to be developed, for example, by supporting research in the pre-competitive stage of technology development. this may especially be necessary where private parties have difficulties in making sense of the linkages between transitions, i.e. protein, energy and water. this approach would entail: slighting the mental barriers between thinking about food, energy and water policy, or about natural resources in general. trying to identify links with specific other policies, such as with the fight against obesity, the aims for renewable energy (such as the eu biomass action plan) and the aims to promote organic farming. considering alternatives such as npfs as sustainable successors to animal products (such as pork) with regard to export of products and know-how and considering incentives to r&d in that area. in combination with an open-minded approach, it is extremely important that governmental policymakers pay attention to the selection environment in which newly developed npfs have to survive. governmental policymakers can do a lot to prevent suboptimal solutions by monitoring and influencing the selection environment npfs are facing. some points in need of attention are the following: initiating removal of national, eu and international (wto) institutional barriers to the introduction of npfs on the market. increasing the transparency of the food chain in order to enable citizens and consumers to make sound choices. 
continuing and internationally promoting "green" thinking by acknowledging the role of plants for improving sustainability. in fact, the presently emerging trend towards a "biobased economy" is a first step toward the latter. in addition to deriving materials and energy from plant crops in combination, by the same token, food can be added to the list because crops are the ultimate renewables, degradable and all, and particularly sustainable if little fertiliser is required (such as with nitrogen-fixing protein crops). european (eu) policymakers are in much the same boat as their dutch counterparts, where they have to deal with global environmental changes. the goals for biofuels and organic farming are easily linked to the protein transition. in addition, recovering the presently lacking self-sufficiency in protein rich feed crops may be an incentive for eu policy towards striving for sustainability by means of a protein transition. this may be even more so in view of the expected increase in demands (and consequently, competition) for protein rich feed crops by industrialising countries such as china. an additional incentive may be provided by the potential savings on freshwater use by agriculture, given the rising need expressed by the world water forum. for industrial policymakers an important question will be whether or not they want to be among the first movers in the market of new protein products. a company's decision on this topic will be governed by strategic and political circumstances at the time the options are contemplated, such as the ripeness of an issue. these circumstances depend on its own capabilities, its position in the industry, the economic situation of this industry and the industry's public image. 
the ripeness of an issue will in particular be influenced by the technological state-of-the-art (including innovative power), by the durability of the issue (a trend or a hype), by public opinions (including campaigns that emphasize the "ills" of an industry), and by linkages that improve the market success of a particular product segment. in view of the many potential linkages between a protein transition and other transitions, it should be emphasized that developing alternatives to an established technology may require many innovations and that these will not occur automatically as the outcome of a linear process. the currently predominating policy of multinational companies seems to be rather risk-averse: they do r&d themselves only on a moderate scale and, when deemed necessary, strategically buy emerging small companies with innovative products. the progress of small companies may be highly dependent on the relational context in which an innovation is located, such as the other innovations on which it builds, the status of the actors that own competing innovations, and the underlying community of experts involved in its elaboration. in addition, many chance factors may affect the content and the timing of innovation decisions. accordingly, the prospects of npfs may depend on many initiatives. as a start, they may entail: implementing "green" thinking in policies towards stepwise innovation and stepwise improving sustainability. slighting the mental barriers between thinking about raw materials, energy and water. integral thinking is a cornerstone of the emerging "biobased economy". considering plant-derived alternatives as sustainable successors to animal products with regard to export of products and of know-how, and increasing r&d in that area. this way of thinking coincides very well with parts of the mission statements of many companies: to create innovative products that promote a sustainable future. 
next to contributing to a sustainable future, npfs could also meet another important issue for industry: making profit. preliminary profetas calculations indicate that cost price of npfs could be lower or at least comparable to that of pork. furthermore, meat prices are expected to increase. this increase will be higher than the increase in the cost of plant proteins because of the low conversion factor of plant protein to meat protein. hence the gap between the costs of npfs and meat will become bigger, thereby increasing the profit margin. furthermore, npfs are directed towards a growing market. among others, the increase will evidently depend on the extent to which the products meet consumer demands and preferences, not only with respect to sensory properties but also to other issues such as health and animal welfare. this will be discussed in section 7.8. those industrial policymakers who want to be among the first movers on the npf market may start thinking about questions such as: which markets are the most interesting options, specialty or bulk, which products and how to position the products on the market? what are the preferences of the different consumer groups and which ones can be translated to product characteristics? what techniques are required to produce the desired products? which proteins or, even more basic, which crops seem feasible where? what chains should be developed and optimised, and with whom? this sounds partly like revisiting technological projects of profetas and indeed it is, but now on the level of actual and competitive products, rather than on the level profetas was directed at, i.e. the development of a toolkit on a basic, pre-competitive level. however, to answer the questions asked above in a science-based way and to provide a sound base for future npf development, further pre-competitive research on technological issues is still required. a number of subjects for such research have been discussed for example in section 7.3. 
in addition to technological issues, questions should be addressed such as: cooperation with what other parties is useful (e.g. for optimal use of the crop)? how may a protein transition spread across the world (e.g. which countries are best suited for introducing and testing npfs)? what may be the side effects of a protein transition (e.g. with respect to the north-south divide)? what trends in other areas (transport, technologies, lifestyles) may be relevant? for smaller companies it may pay to closely watch the emerging trend towards a "biobased economy" and start thinking about strategic alliances. as long as innovation is not "directed" by larger companies (cf. transition "management" as proposed by the dutch government), niches for innovative products will open, be it plant-derived protein foods or products made from the non-protein fraction of crops, products for saving water and/or energy, or others. the consequences of a protein transition will affect many stakeholders, such as consumers, retailers, farmers and ngos. without addressing each of these groups individually, the present section intends to sketch the most pertinent implications. what consumers should ideally do in the context of a protein transition may boil down to (a) eating one third less protein (the average dutch over-consumption), (b) replacing one third by plant-derived proteins, and (c) replacing the remaining third by extensively produced meat (such as most beef and lamb). although in the netherlands intensively produced pork converts much food industry waste in a sustainable way, globally this is not a representative example. theoretically, the proposed threefold option has both environmental and health benefits. in practice, few consumers will be convinced immediately, but they may do so in the long run. at present, the environmental benefits may not appeal to many consumers. 
the well-known activist storyline of "global nature" under threat and in need of protection from a global community has become too simplistic. modern westerners do not tend to think in terms of one big environment that is the same for everyone. they want credible solutions that give them the feeling that they are "doing the right thing." in contrast, the fact that many people are no longer aware of the animal origin of meat indicates that there is an increasing indifference toward the origins of proteins. at first sight, this seems to open possibilities for npfs. some producers might be tempted to change the protein chain in a way that does not have to be noticed by the people who are eating it, which may create a substantial shift from meat to plant protein foods without much consumer involvement. on second thought, however, a low-involvement approach may not be the optimal strategy to pursue more sustainable food choices. if people are no longer aware of meat's animal origin, they will also be less inclined to pay attention to animal welfare. this may have negative consequences for attempts to stimulate sustainable agriculture by promoting high quality meat from well-treated animals or by simply eating less meat. an additional reason to not opt for a low-involvement approach refers to the societal value conflicts that are expected in many technology-related areas. in europe, the recent example of genetically modified food has shown that a low-involvement approach may backfire if people get the impression that they are part of a "hidden" transition. therefore, it is essential that all the people concerned are mindful of any transitions of food production methods. one of the ways to involve people is a discussion on personal health aspects, which have many links to the protein transition. 
in contrast to plant-derived diets, meaty diets generally contain more saturated fats (associated with heart and coronary disease) and are sometimes associated with overconsumption of calories, leading to obesity. npfs may not only exert a beneficial effect on health indirectly, via these relationships, but maybe also directly. that plant proteins may provide such an effect is indicated, for example, by the claim approved by the american food and drug administration: "diets low in saturated fat and cholesterol that include 25 grams of soy protein a day may reduce risk of heart disease" (fda claim 21 cfr 101.82, october 1999). proteins from crops other than soy seem to exert the same protective effect, according to the fda, though this protective effect does not go unchallenged. last but not least, npfs are complex foods for which the amount of calories can be set via the choice of ingredients, which may be an important tool in view of the increase in obesity. admittedly, certain plant-derived foods may generate more allergies than meat, but this affects a minority of people compared to the stifling incidence of heart and coronary disease, let alone the downright epidemic incidence of obesity (800 million people afflicted worldwide). public health aspects, such as food safety, add to personal health aspects perceived by consumers. due to recent meat crises, both with chemical contaminants (such as dioxins, antibiotics, growth hormones) and pathogenic microbes (such as foot-and-mouth disease, avian influenza), european consumers are rather keen on food safety. other aspects, such as the proposed relationship between intensive pork and poultry production in south east asia and the increasing outbreaks of avian influenza, may not be topics consumers think about when they are shopping for food, but reminders of these issues may gradually induce "green" thinking by acknowledging the role of plants for improving their quality of life. 
the preceding chapters and sections have once again made it crystal clear how large the environmental burden of meat production is. conservatively estimated, it requires a 3-10 fold larger agricultural area and energy input and produces 3-10 fold more eutrophication than the production of plant protein. not only does meat production bring about over 60 fold more acidification, but it also appropriates 30-40 fold more of the dwindling freshwater resources. in addition, it produces a lot more pollution by pesticides, heavy metals and antibiotics. as an entirely novel finding, profetas has clearly demonstrated that the protein transition is coupled inseparably to two other societal transitions, namely those towards sustainable energy production and towards sustainable water use (section 6.3). the freshwater link is evident from the resource difference indicated above. the energy production link primarily regards the release of agricultural area, which is freed for biomass production. furthermore, the considerable proportion (60-80%) of non-protein biomass released as a byproduct during npf production may be utilised very efficiently for energy production. this is an evident case of win-win-win, with a triple environmental gain due to combined savings on protein, energy and water (section 6.3). concerning technological feasibility many questions have been answered, in primary production, processing and chain development alike. generic as well as dedicated tools have been developed or are under development. among the generic tools are the crop growth model (which still has to be validated) and the model for chain design (which is still under development). applied knowledge concerning flavour-binding properties of pea proteins should be rated among the specialised tools. as might be expected, in addition to answers, many new questions have been raised as well (see chapter 3). 
these concern, among others, the way in which process and ingredient selection may fulfil consumer wishes for npfs within certain target groups (sections 3.2, 3.3 and 6.4). this puts forward three major challenges: how to obtain a wide range of textures by using plant proteins, how these textures are affected by ingredients other than proteins and how to obtain the desired flavour? next to questions regarding processing, also questions with respect to primary production remain to be answered. these questions concern among others the extent to which protein composition may be manipulated without affecting the viability of the plant in the field, and which crops are the most suitable to yield raw materials with the desired specifications (section 6.2). another major issue is the extent to which new developments in molecular biology and breeding may affect the suitability of crops for npf production. from the sustainability perspective, the societal desirability has been established beyond doubt, particularly concerning resource and pollution aspects (land, water and energy uses, and their implications for biodiversity loss and climate change). furthermore, increased availability of plant protein based foods will undoubtedly make an important contribution to food protein security, which is presently under increasing pressure from world-wide increasing meat consumption. doubtlessly, such a transition will have a significant impact on north-south relationships and the poverty issue in the world (section 6.5). concerning societal desirability it has also become apparent that different actors can hold different interests and opinions here (section 6.4). it has been clearly established that the consumer is the player who holds the key to a short-term protein transition and that western consumers currently rate health above sustainability. 
for consumer-oriented product development, much more interaction is necessary between product developers and various user groups, such as consumers leaning towards health, convenience or culinary traditions. although profetas was originally designed from the western perspective, it has become clear that in developing countries the incentives to achieve a protein transition are not just different (section 6.6), but generally much stronger (section 6.5). in china, for example, meat production generates pressure by its inability to meet the national demand on top of the pressure generated by severe local pollution. crops tailored to climatic (section 6.2) and cultural (section 6.6) characteristics are available. it can be concluded that, although the developed countries are primarily responsible for the unbalanced meat consumption referred to earlier, it is primarily the developing countries that are confronted with the effects (see section 2.5). the latter, therefore, may experience stronger and more direct incentives to strive for a protein transition. however, the required technological expertise may be available in developed countries mainly. at any rate, if mineral oil and meat prices continue to rise, particularly in developing countries there will be opportunities for the onset of a combined protein plus biomass transition. since the majority of the developing countries are particularly short of freshwater resources, the additional implicit water conservation will be considered an important bonus. in profetas, neither transitions, nor their feasibility have been studied directly. rather, insights and tools have been developed to facilitate a potential transition from meat protein to plant-derived protein products in the near future. these have been summarised in the present chapter. 
for the reasons outlined above we feel that a protein transition, reducing intensive livestock farming and cultivation of feed crops, is not just beneficial to the environment, but also more sustainable, most certainly socially desirable, and in the long run inevitable. it is as yet unclear whether npfs will be able to replace meat, to what extent, on what scale internationally, and how rapidly. when the window of opportunity opens, however, it will be crucial to have the technology available for more sustainable alternatives. facilitating a transition does not seem easy. however, it is interesting to note that, while exclusively working on an approach to make protein production and consumption more sustainable, the profetas research community has yielded an integrated solution for various global problems far beyond its original scope. in addition to promising much more sustainable protein production (directly contributing to biodiversity and resource conservation and probably indirectly to animal welfare and human health), profetas simultaneously indicated realistic options to produce a significant amount of biomass for sustainable energy production, and to save an immense volume of freshwater. this suggests that integral thinking combined with an even further extension of the disciplines involved (including health aspects, in particular) may be a promising avenue to meet the foreseeable challenges of the next few decades. 
key: cord-252304-lwiulri7 authors: fragnoud, romain; flamand, marie; reynier, frederic; buchy, philippe; duong, vasna; pachot, alexandre; paranhos-baccala, glaucia; bedin, frederic title: differential proteomic analysis of virus-enriched fractions obtained from plasma pools of patients with dengue fever or severe dengue date: 2015-11-14 journal: bmc infect dis doi: 10.1186/s12879-015-1271-7 sha: doc_id: 252304 cord_uid: lwiulri7 background: dengue is the most widespread mosquito-borne viral disease of public health concern. in some patients, endothelial cell and platelet dysfunction lead to life-threatening hemorrhagic dengue fever or dengue shock syndrome. prognostication of disease severity is urgently required to improve patient management. the pathogenesis of severe dengue has not been fully elucidated, and the role of host proteins associated with viral particles has received little exploration. methods: the proteomes of virion-enriched fractions purified from plasma pools of patients with dengue fever or severe dengue were compared. virions were purified by ultracentrifugation combined with a water-insoluble polyelectrolyte-based technique. following in-gel hydrolysis, peptides were analyzed by nano-liquid chromatography coupled to ion trap mass spectrometry and identified using data libraries. results: both dengue fever and severe dengue viral-enriched fractions contained identifiable viral envelope proteins and host cellular proteins. canonical pathway analysis revealed the identified host proteins are mainly involved in the coagulation cascade, complement pathway or acute phase response signaling pathway. some host proteins were over- or under-represented in plasma from patients with severe dengue compared to patients with dengue fever. 
elisas were used to validate differential expression of a selection of identified host proteins in individual plasma samples of patients with dengue fever compared to patients with severe dengue. among 22 host proteins tested, two could differentiate between dengue fever and severe dengue in two independent cohorts (olfactomedin-4: area under the curve (auc), 0.958; and platelet factor-4: auc, 0.836). conclusion: a novel technique of virion-enrichment from plasma has allowed the identification of two host proteins that have prognostic value for classifying patients with acute dengue who are more likely to develop severe dengue. the impact of these host proteins on pathogenicity and disease outcome is discussed. electronic supplementary material: the online version of this article (doi:10.1186/s12879-015-1271-7) contains supplementary material, which is available to authorized users. infection by one of the four serotypes of the dengue virus (dv), a member of the flaviviridae family, can cause a wide spectrum of clinical manifestations. although the majority of symptomatic patients develop a febrile illness known as dengue fever (df) with nonspecific symptoms such as headache, fever or myalgia, around 10 % of patients develop a more severe form of disease, severe dengue (sd), that may include plasma leakage, severe hemorrhage and organ failure [1] . the dv genome is a positive single-stranded rna molecule encoding a polyprotein that is processed into three structural proteins (the capsid protein c, the membrane protein m, and the envelope protein e) and seven non-structural (ns) proteins involved in replication and pathogenicity [2] . dv enters target cells via receptor-mediated endocytosis and traffics via the endosome, where the acidic environment triggers fusion of viral and host cell membranes. 
once within the target cell, the virus manipulates the host cell membrane to create an optimal environment for the assembly of its replication complex and subsequent rna amplification [3] . virion assembly occurs on the surface of the endoplasmic reticulum (er), followed by budding of an immature particle into the er lumen. the immature virion is then transported to the trans-golgi network, matured via proteolytic cleavage, and finally released by exocytosis into the extracellular medium. no specific antiviral treatment against dv currently exists. the available therapies are symptomatic and are administered to control the clinical manifestations. the ability to diagnose dv infection at an early stage and successful prognosis of the resulting complications of dengue are urgently required to improve the management of patients. it is possible that the identification of proteins specifically present in plasma before the onset of severe symptoms may ultimately lead to the discovery of new prognostic biomarkers. despite an incomplete understanding of the mechanisms of pathogenicity, several hypotheses have been formulated to explain the disease process in patients infected with dv. however, the lack of animal models capable of reproducing the features of human disease has hampered the identification of reliable parameters and indicators to explain or predict the development of sd. a number of host immune components, especially antibodies, are associated with the pathogenicity of viral infections. such mechanisms include antibody-dependent enhancement of a secondary infection or cross-reactivity with proteins such as endothelial cell or coagulation proteins [4] [5] [6] [7] [8] . other immune components, including memory t-cells, innate immunity effectors and complement factors have been shown to modulate the outcomes [9, 10] . the dv non-structural protein 1 (ns1) may also play a major pathogenic role, as it interacts with host complement proteins [11, 12] . 
numerous studies have investigated associations between altered levels of circulating cytokines/chemokines and dengue severity [13, 14] . alternatively, as there is a strong biological rationale for investigating markers implicated in vascular pathologies, many markers such as the soluble intercellular adhesion molecule-1 (sicam-1), the soluble vascular cell adhesion molecule-1 (vcam-1), the e-selectin or the thrombomodulin have been identified and seemed to correlate with disease severity [15] . unfortunately, no clear consensus has emerged from these studies. predicting outcome in dengue remains challenging, and the search for robust markers remains crucial. previous studies have demonstrated patients with sd have higher viremia than patients with df [9, 16, 17] . additionally, reports from cuba and australia have suggested that circulating dv may become more virulent through passage in successive hosts during an epidemic [18] [19] [20] [21] . as the virus cycle and the virus pathogenicity are strongly linked with the host metabolism, it is assumed that host proteins interacting with virions are a reflection of pathological status of the patient. therefore, in the present study, a dv-enrichment procedure combining sucrose gradient ultracentrifugation and a polymer-based technique was developed for differential proteomic analysis of plasma pools from patients with df and sd by liquid chromatography coupled with tandem mass spectrometry (lc-ms/ms). the objective was to identify the viral proteins and the host proteins incorporated into virions, and also the host proteins that interact/are associated with virions. the identified markers were validated by quantitative elisas using samples from dengue patients from two different geographical regions. the possible role of the co-identified host proteins in disease pathogenicity and the potential of these proteins as fingerprints of disease severity in patients infected with dv are discussed. 
plasma samples were provided by the universidad industrial de santander, bucaramanga, colombia and the institut pasteur, phnom-penh, cambodia. samples were collected from dengue patients as part of a retrospective (colombia) or prospective (cambodia) study. both studies were reviewed and approved by the local medical ethics committees (universidad industrial de santander, colombia; national ethics committee, cambodia) and performed in compliance with the ethical standards set out by the declaration of helsinki. all patient plasma samples were anonymized after a physical examination and obtaining informed consent. dr villar-centeno (universidad industrial de santander, bucaramanga, colombia) and dr philippe buchy (institut pasteur, phnom-penh, cambodia) granted the authors permission to use the samples. all samples were collected between the onset of symptoms and defervescence. the cases were classified as primary or secondary infections by the physician based on hemagglutination assays performed on different dv serotypes and on japanese encephalitis virus. the serotype and the copy number were determined by real-time quantitative rt-pcr (qrt-pcr). dv-negative plasma specimens from healthy donors were obtained from the french national blood bank (etablissement français du sang, lyon, france). dengue samples were tested for the presence of the viral ns1 protein using the platelia™ elisa (biorad, marnes-la-coquette, france) following the manufacturer's instructions. rna was extracted from the plasma using the qiaamp viral rna kit (qiagen, hilden, germany). viremia was measured using a qrt-pcr kit (primerdesign, southampton, uk) according to the manufacturer's instructions. qrt-pcr was performed in a final volume of 20 μl, containing 5 μl of extracted rna, 10 μl of 2x precision onestep™ qrt-pcr mastermix, and 1 μl of dengue primer/probe mix. 
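the reaction volumes listed above can be checked with a few lines of arithmetic. the sketch below is only an illustration: the three component volumes and the 20 μl final volume come from the text, whereas the function names are hypothetical and the text does not state what makes up the remaining volume, so the code reports it without naming it.

```python
# Per-reaction volumes stated in the text (microlitres).
FINAL_VOLUME_UL = 20.0
COMPONENTS_UL = {
    "extracted RNA": 5.0,
    "2x one-step qRT-PCR master mix": 10.0,
    "dengue primer/probe mix": 1.0,
}


def remaining_volume(final_ul, components):
    """Volume still to be filled once the listed components are added.

    The source does not say what fills this volume, so it is left unnamed.
    """
    return final_ul - sum(components.values())


def final_concentration_x(stock_x, volume_ul, final_ul):
    """Final 'x' concentration of a stock reagent after dilution."""
    return stock_x * volume_ul / final_ul


if __name__ == "__main__":
    print("unassigned volume (ul):", remaining_volume(FINAL_VOLUME_UL, COMPONENTS_UL))
    print("master mix final concentration (x):",
          final_concentration_x(2.0, 10.0, FINAL_VOLUME_UL))
```

the listed components add up to 16 μl, so 4 μl of the final volume is not accounted for in the description, and the 2x master mix ends up at 1x in the reaction.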
assays were carried out using a lightcycler® 1.2 (roche applied science, bâle, switzerland) using the onestep amplification protocol recommended by the manufacturer. hepg2 cells (atcc hb-8065) were cultivated at 37°c in 5 % co2 in dmem supplemented with 10 % decomplemented fetal calf serum, 5 × 10^4 iu penicillin, 50 mg streptomycin and 10 mm l-glutamine (invitrogen, paisley, uk). the cells were infected as previously described [22] with a serotype 3 dv (dv3, strain d78-878 thailand) graciously provided by dr v. barban (sanofi-pasteur, lyon, france). sub-confluent hepg2 cell cultures (approx. 10^7 cells/75 cm2 flask) were incubated with virus diluted in serum-free culture medium at various multiplicities of infection (mois) for 90 min, the supernatant was removed, the cells were washed once with pbs (invitrogen) and 10 ml of fresh complete medium was added to the cells. after 5 days of culture, the supernatant was harvested and clarified by centrifugation at 10,000 g for 5 min at 4°c.

denaturing polyacrylamide gel electrophoresis, western blotting and silver-staining

following denaturation in sds sample-buffer (novex invitrogen, paisley, uk) at 37°c and denaturing polyacrylamide gel electrophoresis (page) on 4-12 % polyacrylamide gels in sds-mops buffer (nupage invitrogen), samples were electro-transferred to pvdf membranes (millipore, billerica, ma, usa) in 10 % caps-10 % methanol buffer, blocked in tbs-0.1 % tween 20-5 % skimmed dried milk (régilait, macon, france), and incubated with anti-e monoclonal antibody (diluted to 1 μg/ml; biomerieux, marcy-l'etoile, france) followed by horseradish peroxidase-labeled conjugate (diluted to 0.1 μg/ml; p.a.r.i.s., compiegne, france) for 1 h each at room temperature (rt). after washing with tbs-0.1 % tween 20, the proteins were revealed using the supersignal west dura kit (thermo scientific, rockford, il, usa) and imaged using the versadoc™ imaging system (biorad, hercules, ca, usa). 
alternatively, the gels were silver-stained after electrophoresis using the silverxpress kit (life technologies, paisley, uk). densitometry analysis was carried out using quantity-one software (biorad). five milliliters of pre-clarified plasma pools (5000 g, 10 min, 4°c; cf. table 1) or 5 ml of pre-clarified cell culture supernatant were centrifuged for 2 h at 200,000 g using a beckman sw41 rotor in an optima l90 ultracentrifuge (beckman, fullerton, ca, usa). after centrifugation, the pellet was dissolved in a small volume of cold pbs (euromedex, souffelweyersheim, france), loaded on a discontinuous sucrose gradient consisting of 5 ml of 60 % sucrose in pbs (w/w) and 5 ml of 20 % sucrose in pbs (w/w), and centrifuged for 2 h at 200,000 g. the fraction containing virions, located at the interface of the two sucrose solutions, was collected, diluted ten-fold in cold pbs and centrifuged for 2 h at 200,000 g, and the pellet was resuspended in 200 μl of cold pbs. all centrifugation steps were performed at 4°c. ultracentrifugation was complemented by additional purification steps using viraffinity™ (biotechsupportgroup, monmouth, usa), a water-insoluble elastomeric polyelectrolyte developed for the capture and recovery of viruses. 100 μl of viraffinity™ were added to the resuspended pellet obtained after ultracentrifugation. the mixture was incubated for 5 min at rt and centrifuged at 1000 g for 10 min. the supernatant was discarded and the pellet was rinsed three times with mn buffer. finally, the viral fraction was separated from the polymer by heating for 5 min at 70°c in sds-buffer (novex invitrogen). after a final centrifugation step at 1000 g, the supernatant was harvested and stored at −80°c for immunoblotting and lc-ms/ms analysis. the following experiments were conducted by the edyp laboratory (edyp-service, cea grenoble, france). the virion-enriched preparation was loaded on a 10 % polyacrylamide gel and electrophoresed until all proteins entered the gel. 
the band containing the proteins was manually excised, washed three times in buffer containing 50 % acetonitrile and dried using 100 % acetonitrile. proteins with a mascot score higher than 40 (p < 0.05) were selected for further analysis [24] . the mascot score is a measure of the reliability of identification: the higher the score, the better the identification. in order to eliminate false matches and incorrect protein identification, consecutive searches against the concatenated swiss-prot and trembl_decoy databases (versions 57.6 and 40.6 decoy database, respectively; homo sapiens taxonomy, 164,620 entries) were performed for each sample using mascot 2.3 software (matrix science). irma software [25] was used to filter the results to achieve a false positive rate lower than 1 %. in-gel digestion and lc-ms/ms analysis were performed twice for each sample. ingenuity pathways analysis (ipa) software (ingenuity systems, redwood city, ca, usa) was used to investigate the interactions among all of the host proteins identified. interactive pathways were generated to observe the potential direct and indirect relationships among the proteins that were differentially expressed in the df and sd samples. commercial elisas (uscn, wuhan, china) targeting potential severity markers were used to measure in duplicate the levels of the proteins in individual df and sd plasma specimens, using the protocols recommended by the manufacturer. for each test, a standard curve was established by serial dilution of the calibrator provided in the kit to determine the protein concentration. the optical density values were determined at 450 nm using a microplate reader (eon; biotek, winooski, vt, usa). statistical analysis (mann-whitney u test, chi-square test) and receiver operating characteristic (roc) curve analysis were performed using graphpad prism v.4.03 software (graphpad software, san diego, ca, usa). 
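the decoy-database filtering mentioned above can be illustrated with a minimal sketch. it assumes the common concatenated target-decoy convention, in which the false discovery rate above a score cut-off is estimated as the number of decoy hits divided by the number of target hits. this is not a description of how mascot or the filtering software used by the authors work internally; the function names and the example scores are hypothetical.

```python
def fdr_at_threshold(hits, threshold):
    """Estimate the FDR among hits scoring >= threshold as decoys / targets.

    `hits` is a list of (score, is_decoy) pairs from a search against a
    concatenated target + decoy database.
    """
    targets = sum(1 for score, is_decoy in hits if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in hits if score >= threshold and is_decoy)
    if targets == 0:
        # No target hit survives: the estimate is undefined unless nothing survives.
        return float("inf") if decoys else 0.0
    return decoys / targets


def lowest_threshold_for_fdr(hits, max_fdr=0.01):
    """Lowest observed score whose cut-off keeps the estimated FDR <= max_fdr."""
    for threshold in sorted({score for score, _ in hits}):
        if fdr_at_threshold(hits, threshold) <= max_fdr:
            return threshold
    return None


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    example_hits = [(80, False), (75, False), (60, False), (45, True),
                    (42, False), (30, True), (25, False)]
    print(fdr_at_threshold(example_hits, 40))
    print(lowest_threshold_for_fdr(example_hits, max_fdr=0.01))
```

in this toy data set a cut-off of 40 still lets one decoy through against four targets, so the cut-off has to rise to 60 before the estimated rate falls below 1 %.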
the mann-whitney and the chi-square tests were used to examine differences in demographic and clinical characteristics between patients and to assess potential confounding variables. comparisons of continuous variables were performed using the mann-whitney u test, which can be applied to unknown distributions for two small sets of observations (n < 30). comparisons of proportions were performed using the chi-square test. p < 0.05 was considered significant [26] . plasma samples obtained from patients with df or sd were pooled to create two samples ( table 1) ; all of the plasma samples used for this step came from colombian patients who had secondary dv serotype 2 or 3 infections, and were collected between the onset of symptoms and the development of severe symptoms [27] . classification of the infections as secondary infections and the degree of severity were based on review of the patients' medical records. in these medical records, the estimation of the severity was based on the who criteria of 1997 [1]. df and dhf grade i (dengue haemorrhagic fever, minor haemorrhages) were considered as classic dengue (not severe). dhf grade iii/iv (dss, dengue shock syndrome) were considered as severe dengue. each medical record compiled details on platelet and blood cell counts, transaminase levels, and the presence/absence of warning signs (persistent vomiting, abdominal pain…), haemorrhagic signs (petechiae, ecchymosis, epistaxis…) or shock signs (cold extremities, cyanosis…) that helped the physician classify the patients. the proportion of samples from male patients was higher for the df pool (75 %) than the sd pool (43 %). the age of the patients (mean, 30 years) and the number of days the samples were collected after the onset of symptoms (mean, approximately 3 days) were similar between the two groups. no comorbidity was reported for any patient in either group. each of the individual plasma samples was confirmed as ns1-positive. 
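the rank-based comparison and the roc analysis named in the statistical methods are closely related: the area under the roc curve equals the mann-whitney u statistic divided by the number of patient pairs. the sketch below shows that relationship in plain python. it is not the graphpad prism procedure used by the authors, it does not compute a p-value, and the marker concentrations are invented purely for illustration.

```python
def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: count of (a, b) pairs with a > b, ties counting 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def auc_from_u(group_a, group_b):
    """Area under the ROC curve, i.e. P(a > b) + 0.5 * P(a == b)."""
    return mann_whitney_u(group_a, group_b) / (len(group_a) * len(group_b))


if __name__ == "__main__":
    # Hypothetical marker concentrations (arbitrary units), not data from the study.
    sd_values = [12.0, 15.5, 9.8, 20.1]
    df_values = [5.2, 9.8, 7.4, 11.0]
    print("U =", mann_whitney_u(sd_values, df_values))
    print("AUC =", auc_from_u(sd_values, df_values))
```

an auc of 0.5 means the marker does not separate the two groups, while values close to 1 (such as those reported in the abstract for olfactomedin-4 and platelet factor-4) mean that most severe cases have higher values than most non-severe cases.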
the viremia of each pool was estimated using a commercial qrt-pcr kit. all patients were positive for dv, indicating all samples were collected during the acute phase of disease. the average number of rna copies per ml was 4.05 × 10^6 and 4.13 × 10^7 for the samples in the df and the sd pools, respectively. a technique based on ultracentrifugation (step 1) followed by concentration using a commercial water-insoluble elastomeric polyelectrolyte specially engineered for the capture and recovery of viruses (viraffinity™; step 2) was created to obtain a fraction of plasma enriched with dv particles. initially, this technique was developed using the cell culture supernatant of hepg2 cells harvested five days after infection with dv at a moi of 1 or 10. the samples obtained after each step of the purification process (step 1 and step 1 + 2) were analyzed by western blotting using an anti-e monoclonal antibody (fig. 1a) . a positive signal corresponding to monomeric (60 kda) and dimeric e protein (120 kda) was observed in the dv-infected samples, and became more intense after the viraffinity™ step (fig. 1a, step 1 + 2). the purity of the step 1 + 2 sample was assessed by electrophoresis using denaturing page and silver-staining (fig. 1b) . an intense protein band of approximately 66 kda, corresponding to bovine serum albumin (bsa), was observed in the sample before purification ("not purified" lanes); however, this band was almost absent from the final purified sample ("step 1 + 2" lane). only a small number of bands were detected in the purified sample. densitometry indicated that the protein complexity of the purified samples was reduced by roughly 350-fold compared to the unpurified samples.

fig. 1 characterization of the viral-enriched fraction of dv-infected hepg2 supernatant or plasma obtained from patients with dengue. (a) after infection with dv at a moi of 1 or 10, hepg2 cell culture supernatants were purified by ultracentrifugation (step 1) followed by the use of viraffinity™ (step 1 + 2). (b) silver-staining assessment of the protein complexity of dv-infected hepg2 cell culture supernatant before and after purification; not diluted (lane "not purified"), 1/10 diluted (lane "not purified 1/10") dv-infected cell supernatant before purification or the same sample as shown in lane 1 after purification (lane "step 1 + 2") were separated by denaturing polyacrylamide gel electrophoresis and silver-stained. (c) electron microscopy of dv-infected hepg2 culture supernatant after ultracentrifugation. grids were negatively stained using uranyl acetate. bars: 50 nm, 200 nm. (d) representative western blotting analysis of plasma pools obtained from patients with dengue after purification by ultracentrifugation and viraffinity™ using an anti-e monoclonal antibody. mw: molecular mass

the presence of intact viral particles was also confirmed by electron microscopy, which was performed by the ezus laboratory (université claude bernard, villeurbanne, france). after negative staining, virions could be visualized in the purified samples obtained from the supernatant of cells infected at a moi of 10 (fig. 1c) . the viral membrane, including the e spikes, was visible. the spikes observed on the virion surface are probably due to the slightly basic ph (ph 7.5) of the pbs-buffered solutions used during centrifugation and electron microscopy. virions tended to aggregate in clusters of 10-20 viral particles as seen in images taken at a lower magnification. little background signal was observed around the virion clusters. following the two-step purification process, the moi 10 sample was also analyzed by nano-lc ms/ms. all structural viral proteins were identified ( table 2 ). the e protein peptides were the most abundantly represented (21 peptides; 15 single peptides). 
the sequence coverage for the e protein represented 44 % of the entire protein. the protein m, which is part of the external viral layer, along with the e protein, was identified three times (3 single peptides) with coverage of more than 58 % of the protein sequence. the pr peptide and capsid were also identified (one peptide each). in conclusion, combining ultracentrifugation with viraffinity polymer yielded a fraction with a considerably reduced protein complexity that contained a large quantity of enveloped virions. this technique was then applied to the pooled plasma samples obtained from the colombian patients (table 1) . after purification, the preparations were analyzed by western blotting using a monoclonal antibody directed against the e viral protein (fig. 1d , sd pool as an example). strong signals corresponding to monomeric (60 kda) and dimeric e protein (120 kda) were observed. the e protein was not detected in purified plasma obtained from healthy individuals, which was tested as a mock-control. to identify the proteins present in the purified samples and to compare their composition, nano-lc-ms/ms was conducted on the purified df and sd plasma pools. this experiment was also performed on the purified mock-control sample to determine the host background. the same quantity of total protein was analyzed for each sample. these experiments were performed independently twice. both viral and host peptides were identified in the purified pooled plasma samples from the patients with df and sd. one peptide corresponding to the viral envelope protein was detected twice in the sd sample; this peptide (gwgngcgllfgk) was also identified in the purified dv-infected hepg2 supernatant (table 2) . after removal of the mock-control background, the remaining peptides identified in the df and sd samples were analyzed. 
in order to consolidate the results obtained by nano-lc ms/ms, only proteins for which the variance of the average number of peptides was lower than 25 % were selected [28]; 188 host proteins met this criterion. the sd/df peptide ratio was calculated for these 188 proteins (see additional file 1). the highest sd/df ratios were obtained for c1s esterase (sd/df peptide ratio = 4.14) and the vitamin k-dependent protein s (ratio = 4). the lowest ratio was obtained for beta-1 spectrin (ratio = 0.13). six proteins were only identified in the sd sample; consequently, the sd/df ratio could not be calculated for these proteins. ingenuity pathway analysis (ipa) was conducted to further elucidate the specific pathways associated with the identified host proteins. the specific location and function of each pathway was attributed by ipa for 174 of the 188 proteins; the most prominently represented pathways are illustrated in fig. 2. the ipa diagram shows that the most significant p-values, i.e. the highest −log(p-value) scores (lp), were attributed to acute phase response signaling (lp = 31), the lxr/rxr activation pathway (lp = 18.4), the complement system (lp = 16.6), the coagulation system (lp = 16.2) and the extrinsic prothrombin activation pathway (lp = 12.6). the p-values indicate the likelihood that the focus genes in a network are found in these pathways by random chance alone. although represented to a lesser degree, three pathways related to host proteins involved in the dv cell cycle were also identified: clathrin-mediated endocytosis signaling (lp = 10.3), caveolar-mediated endocytosis signaling (lp = 9.2) and the virus entry via endocytic pathway (lp = 6.05). other identified pathways, including integrin (lp = 6.96) and paxillin signaling (lp = 6.75), correspond to proteins involved in cell-matrix interactions and cell-to-cell communication. figure 2 also illustrates the ratio of identified proteins for each canonical pathway.
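
the two quantities reported for each canonical pathway, the −log(p-value) and the ratio, can be illustrated with a minimal sketch. it assumes a right-tailed fisher's exact test computed from the hypergeometric distribution and a base-10 logarithm, which may differ from what the ipa software does internally; the function names and every count below are hypothetical and are not taken from the study.

```python
from math import comb, log10


def right_tailed_fisher(k: int, n: int, K: int, N: int) -> float:
    """Probability of seeing at least k pathway members by chance.

    N: size of the reference protein set (assumed)
    K: number of reference proteins annotated to the pathway
    n: number of proteins in the analysed list
    k: number of analysed proteins that fall in the pathway
    """
    total = comb(N, n)
    upper = min(n, K)
    # Sum the hypergeometric probabilities for overlaps k, k+1, ..., upper.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def pathway_ratio(k: int, K: int) -> float:
    """Proteins from the list found in the pathway / proteins making up the pathway."""
    return k / K


# Hypothetical counts, for illustration only.
N = 20000  # assumed size of the reference set
n = 174    # proteins mapped by the pathway tool
K = 40     # assumed number of proteins in one pathway
k = 10     # assumed overlap between the list and that pathway

p = right_tailed_fisher(k, n, K, N)
lp = -log10(p)  # larger lp means a smaller, more significant p-value
print(f"ratio = {pathway_ratio(k, K):.2f}, p = {p:.3g}, lp = {lp:.1f}")
```

in this sketch a larger lp corresponds to a smaller p-value, so the pathways with the tallest bars are those least likely to be represented by chance alone.
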
higher proportions of proteins mapped to the complement system (ratio = 0.34), coagulation system (ratio = 0.32) and extrinsic prothrombin activation pathway (ratio = 0.4) than the other pathways (ratios between 0.16 and 0.05). ipa software was also used to investigate possible interactions among all of the identified proteins and to assess the representation of the identified proteins in acute phase response signaling, the complement system and the coagulation system with prothrombin activation (see additional file 2). twenty of the 50 (40 %) secreted proteins identified by ipa as part of the acute phase response signaling pathway, which is activated in macrophages and endothelial cells upon infection and inflammation, were over-represented in the pooled sd plasma sample compared to the pooled df plasma sample. thirteen complement component proteins were also over-represented in sd plasma, with high numbers of peptides identified for c1r and c1s in particular. complement factor c8 and complement factor b (cfb), which are involved in the alternative complement pathway, were only identified in the pooled sd plasma sample. the c9 protein was enriched in the pooled df plasma sample.

fig. 2 ingenuity pathway analysis for host proteins identified in the viral-enriched plasma fraction of patients with dengue. pathway classification according to canonical pathways was performed using ipa software. the x-axis represents the pathways identified. the y-axis (left) shows the −log of the p-value calculated using fisher's exact test. the ratio (y-axis, right) represented by the line is calculated as follows: number of proteins in a given pathway that meet the cutoff criteria divided by total number of proteins that make up that pathway.

the extrinsic and intrinsic prothrombin activation pathways are part of the coagulation system. the majority of the proteins involved in the fibrinogen/fibrin cascade (i.e.
10 proteins out of 11) were over-represented in the sd sample compared to the df sample. overall, the network analysis also provided evidence of strong links between these three pathways. the coagulation factor 2 (f2) and the c3 and c5 proteins are located at the interface of the coagulation system and the complement system (see additional file 3). to validate the mass spectrometry data, the levels of selected host proteins were assessed by quantitative elisa both in the virus-enriched fraction and in individual plasma samples from df or sd patients. for the elisa, proteins with a sd/df ratio higher than 1.3 or lower than 0.78 were selected (see additional file 1); other criteria, such as the number of peptides identified in the sd sample and the availability of a commercial elisa kit, were also considered. some proteins, such as ribosomal protein p2 (accession number p05387; ratio = 0.6) and histone h4 (accession number p62805; ratio = 0.4), were not deemed relevant enough for the study and were consequently not tested. the proteins readily found at high concentrations in plasma and known to be frequent contaminants in proteomic experiments, such as the immunoglobulin heavy chain (accession number p01779; ratio = 2.4), were not tested either. finally, a non-exhaustive list of 22 proteins to assay was established. the list comprised ceruloplasmin (cp), vitamin k-dependent protein s, complement factor properdin, antithrombin iii, secretory component p85, complement factor c1r (c1r), complement factor c1s, angiotensin, factor 2, anti-factor viii, serum amyloid p-component, olfactomedin-4 (olfm4), thrombospondin, gelsolin, platelet factor 4 (pf4), complement factor c1q, complement factor b, moesin and complement factor c8, together with multimerin-1, apolipoprotein b-100 and von willebrand factor. these 22 proteins were first quantified in the virus-enriched fraction of the individual plasma samples obtained from df or sd patients of the colombian cohort.
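
the ratio-based part of this selection can be sketched as follows. this is only an illustration under stated assumptions: the function names are hypothetical, the peptide counts passed to `sd_df_ratio` are invented, and the ratios in `reported_ratios` are the few values quoted in the text (the complete list is in additional file 1). the other criteria mentioned above (number of peptides in the sd sample, kit availability, biological relevance, known contaminants) were applied by the authors and are not modelled here.

```python
from typing import Optional

HIGH_CUTOFF = 1.3   # sd/df ratio above which a protein is retained
LOW_CUTOFF = 0.78   # sd/df ratio below which a protein is retained


def sd_df_ratio(sd_peptides: float, df_peptides: float) -> Optional[float]:
    """Return the sd/df peptide ratio, or None when no df peptide was found."""
    if df_peptides == 0:
        # Proteins identified only in the sd sample have no defined ratio.
        return None
    return sd_peptides / df_peptides


def passes_ratio_cutoff(ratio: Optional[float]) -> bool:
    """True when the ratio lies outside the [LOW_CUTOFF, HIGH_CUTOFF] band."""
    if ratio is None:
        return False
    return ratio > HIGH_CUTOFF or ratio < LOW_CUTOFF


# Ratios quoted in the text; all of them lie outside the band.
reported_ratios = {
    "c1s esterase": 4.14,
    "vitamin k-dependent protein s": 4.0,
    "beta-1 spectrin": 0.13,
    "ribosomal protein p2": 0.6,
    "histone h4": 0.4,
    "immunoglobulin heavy chain": 2.4,
}

candidates = sorted(
    name for name, ratio in reported_ratios.items() if passes_ratio_cutoff(ratio)
)
print(candidates)
print(passes_ratio_cutoff(sd_df_ratio(5, 5)))  # hypothetical protein with equal counts
```

all six quoted proteins pass the ratio filter, yet three of them (ribosomal protein p2, histone h4 and the immunoglobulin heavy chain) were excluded afterwards, which shows that the ratio was a first screen rather than the final selection rule.
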
the elisa signal levels for these samples were close to background levels (purified mock-control sample); the results were not interpretable. to investigate the potential interaction of these proteins with the virion, fresh plasma was incubated for various times with purified viral particles. the change of protein concentration before and after the incubation was assessed by elisa. this experiment was performed for 4 proteins out of 22 (olfm4, cp, c1r, pf4) and showed no significant signal change for these proteins under any of the conditions tested (data not shown). the virus preparation is probably not concentrated enough to induce a significant variation in host protein concentration. furthermore, the plasma concentration of the four proteins is too high to be significantly affected by the interactions with the virus coated on the microplates. the elisas were then performed on the whole plasma samples obtained from colombian patients (table 3, colombian patients). all patients had secondary infections associated with various dv serotypes (dv1, dv2 and dv3); all samples were collected between onset and defervescence (between days 2 and 5).

table 3 (footnote) comorbidity: 0 (0 %) in each group. iu: international units; ns: not significant; na: not applicable. chi-square or mann-whitney tests were used to analyze the differences between groups.

the male/female sex ratio and age were similar between the groups of patients with df or sd. the samples were confirmed positive both for dv by qrt-pcr and for the viral ns1 protein using a commercial elisa (ns1 platelia™). no comorbidity was recorded in any patient. among the 22 proteins tested, only cp, c1r, olfm4 and pf4 had different plasma concentrations between df and sd patients. the average concentration of c1r was higher for patients with sd than for patients with df (p = 0.049). moreover, the variance between the two groups was significantly different (p = 0.019).
the protein concentration of olfm4 tended to be higher for sd patients (p = 0.051). for cp, the difference between the two groups was significant (p = 0.045). the largest difference was observed for pf4, with a higher concentration for patients with df (p < 0.001; fig. 3). to further confirm the relevance of c1r, cp, olfm4 and pf4 as potential markers to differentiate df and sd, plasma samples from patients with acute dengue from cambodia, another dengue-endemic area, were tested. as for the colombian patients, the cambodian patients were classified by the local physician using the who classification of 1997. for the present study, patients classified df or dhf grade i were considered as classic dengue. patients classified dhf grade iii and iv were considered as severe dengue. clinical data were collected and included details on the platelet count, transaminase levels, blood cell count, biochemistry (cholesterol, triglycerides, bilirubin…), ultrasound imaging, hemorrhagic signs (petechiae, ecchymosis, epistaxis…), the presence or absence of warning signs (persistent vomiting, abdominal pain…), plasma leakage and shock signs (cold extremities, cyanosis…). the main characteristics of the cambodian patients are detailed in table 3 (cambodian patients). the mean age of patients from cambodia (about 8 years old) was lower than that of the colombian patients. the male/female ratios of the df and sd groups were similar. all cambodian patients were infected with serotype 1 viruses. the plasma samples were collected approximately 3 days after the onset of symptoms (acute samples) and also just before discharge from the hospital (discharge samples).
in contrast to the samples obtained from the colombian patients, elisas showed that the levels of c1r and cp were not significantly different between df and sd patients in the cambodian cohort (see additional file 4); however, concentrations of olfm4 (p < 0.0001) and pf4 (p < 0.0001) during the acute phase were significantly different between df and sd plasma (fig. 4). the olfm4 concentration in df patients was lower than in sd patients. in contrast, pf4 concentrations were higher in df patients than in sd patients, and sd patients showed no significant difference with healthy controls (n = 8). for these two markers, at the time of discharge, concentrations tended to decline back to the levels observed in controls. receiver operating characteristic (roc) curves compare sensitivity versus specificity across a range of values. roc curves are used to assess the discriminatory ability of each marker. area under the roc curve (auc) is another measure of test performance. the auc quantifies the overall ability of the test to discriminate between individuals with df and those with sd. a perfect test (zero false positives and zero false negatives) has an auc of 1.00. roc curve analysis was used to determine aucs and specificity/sensitivity values for olfm4 and pf4 as prognostic biomarkers of disease severity in cambodian patients with acute dengue. the roc curves indicated good discrimination between df and sd, as the aucs were 0.958 for olfm4 and 0.836 for pf4. the highest specificity/sensitivity values obtained using this model are summarized in table 4. for a sensitivity of 96 %, the specificity exceeded 62 % (pf4) or 86 % (olfm4).

fig. 3 assessment of the protein concentrations of c1r, cp, olfm4 and pf4 in the individual plasma samples of colombian patients using specific quantitative elisas. each value corresponds to the mean of two independent experiments, each performed in duplicate. *: 0.01 < p < 0.05; **: 0.005 < p < 0.01; ***: p < 0.005
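
the relationship between the auc and the sensitivity/specificity pairs can be made concrete with a short sketch. it assumes the standard rank-based definition of the auc (the probability that a randomly chosen sd value exceeds a randomly chosen df value, which equals the mann-whitney u statistic divided by the number of sd/df pairs); the function names and the marker values are hypothetical and do not come from the cohort data.

```python
from typing import Sequence, Tuple


def auc_from_pairs(positives: Sequence[float], negatives: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count for one half, as in the Mann-Whitney U statistic.
    """
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(
    positives: Sequence[float], negatives: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Classify a value as positive (sd) when it is >= threshold."""
    true_pos = sum(1 for value in positives if value >= threshold)
    true_neg = sum(1 for value in negatives if value < threshold)
    return true_pos / len(positives), true_neg / len(negatives)


# Hypothetical marker concentrations (arbitrary units), higher in sd than in df.
sd_values = [9.1, 7.4, 8.2, 6.5, 5.0]
df_values = [3.2, 4.1, 5.6, 2.8, 4.9]

auc = auc_from_pairs(sd_values, df_values)
sens, spec = sensitivity_specificity(sd_values, df_values, threshold=5.0)
print(f"auc = {auc:.2f}, sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

for a marker that is lower in sd than in df, as reported here for pf4, the same functions can be applied after negating the values so that larger numbers again point towards sd. sweeping the threshold over all observed values yields the full set of sensitivity/specificity pairs from which a table of cut-offs can be built.
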
when the sensitivity reached 100 %, the specificity dropped to 56.25 % (pf4) or 82.6 % (olfm4). dengue viral infections are prevalent in tropical and subtropical areas, and are associated with substantial morbidity and mortality. the pathogenesis of dengue remains unclear. various proteomic approaches have been used to characterize host-protein changes during dv infection and identify prognostic biomarkers of disease severity; several studies have been conducted on cells infected in vitro [29] [30] [31] . studies on plasma specimens have identified a number of candidate biomarkers [32] [33] [34] . as the dynamic range for plasma proteins is large (over 10 orders of magnitude) and the most abundant proteins (0.1 % of the total number) constitute up to 95 % of the plasmatic protein mass, identification of biomarkers present at low concentrations is a major challenge in proteomic studies dealing with plasma. however, proteomic techniques are constantly being improved to assess low abundance plasma proteins [35] . viruses facilitate their replication and propagation by subverting host cellular pathways and processes, and constantly adapt to and modulate their host's environment. enveloped viruses are able to incorporate numerous host proteins, both into virus particles as well as the host-derived viral envelope. as the genomes of rna viruses only encode a small number of proteins, they must rely on host proteins during an infection. interactions between the virus and host may have unforeseen consequences, depending on viremia level, the host's genetic background and other relevant factors [36] . host proteins can be incorporated into virions either randomly, by being present at the site of budding, or specifically, as a result of interacting with viral proteins. 
the functional significance of the host proteins that associate with viral particles remains to be thoroughly investigated; however, it is very likely that such interactions contribute to pathogenicity if they disturb the metabolism of the host cell [37] . for instance, the incorporation of cellular proteins such as integrins or hla class ii proteins can affect the ability of hiv-1 to infect host cells and contributes to immunopathogenesis [38, 39] . viroproteomic analyses have already been conducted on dna viruses [40] and rna viruses, such as retroviruses [39, 41] , paramyxoviruses [42] , coronaviruses [43] and flaviviruses [44] ; these studies have proven that many cellular proteins are incorporated into the virion or associate with the viral membrane. usually, prior to ms analysis, protease hydrolysis is combined with ultracentrifugation to remove host proteins that may co-purify with virions. in the present study, as we aimed to characterize the host proteins that interact with viral membrane proteins and particles, enzymatic hydrolysis of the proteins outside of the virion could not be performed. therefore, a new technique combining ultracentrifugation with water-insoluble polyelectrolyte-based enrichment was developed to enable proteomic characterization of a fraction of human plasma enriched with virus particles and depleted of the host proteins predominantly expressed in plasma. previous studies measuring viremia have demonstrated that patients with sd have higher viremia than patients with df [9, 16, 17] . higher viremia may lead to biosynthesis of virions with an altered host protein composition, which may reflect an increased level of cellular stress, or subtle changes in the assembly pathway. these host proteins may be packaged into the virus particle along with the viral components or incorporated into the viral membrane. alternatively, they can be simply co-purified along with virions. 
such proteins may potentially be fingerprints of the virus assembly pathway and may also play a role in viral pathogenicity. nano lc-ms/ms was used in this study to analyze the protein composition of the virion-enriched samples purified from the plasma pools. some proteins had a higher average number of peptides in the sd sample than in the df and control samples, suggesting the presence of greater amounts of these proteins associated with viruses purified from the plasma of patients with sd; conversely, other proteins were mainly identified in the samples purified from the df plasma pool. interestingly, the identified host proteins belong to three main pathways: the complement system, the coagulation system and the acute phase response signaling pathway. the complement system plays an important role in both innate and adaptive immune responses, is a first line of defense against infectious agents, and modulates b- and t-cell responses. in flavivirus infections, excessively activated complement proteins have been reported to induce a deleterious, exacerbated inflammatory response [45] [46] [47]. recently, it has been shown that the level of complement proteins positively correlates with the severity of dengue in indonesian patients [48]. coagulation is the process by which the blood changes from a liquid to a gel. coagulation is required to stop blood loss and enable the subsequent repair of damaged vessels. hemorrhage is one of the major symptoms of sd, and is probably caused by vasculopathy and a deficiency in coagulation and fibrinolysis. the normal vascular endothelium produces inhibitors of coagulation and fibrinolysis. hemostasis can be impaired if the endothelium is stimulated by excessive levels of cytokines or by a pathogen, which may in turn result in thrombosis and bleeding [49, 50].
the acute phase response corresponds to the inflammatory response observed in response to an infection, tissue injury or an immunological disorder [51], and is mainly characterized by increased levels of inflammatory factors and a change in the protein composition of plasma. interestingly, these changes can inhibit complement activation [52]. interconnections between the coagulation process and complement cascade have been reported by many authors [53, 54]. deregulation of one or both systems can result in the clinical manifestations of diseases with inflammatory complications. in a previous study, differentially expressed genes associated with the immune response were identified by microarray analysis of peripheral blood mononuclear cells isolated from colombian children with dengue fever or dengue hemorrhagic fever. these results indicated that complement and numerous cytokines are deregulated in patients with dengue hemorrhagic fever. such changes may enhance disease severity by disturbing coagulation and inducing endothelial cell damage [55]. in another recent study, sera from indian patients with df were compared with those of healthy controls using 2d-dige coupled with maldi-tof/tof ms. the authors reported that dv infection led to altered expression of multiple serum proteins involved in complement cascades, blood coagulation and acute phase response signaling, providing further clues regarding the pathogenesis and host immune response to dv infection [56]. the complement system, coagulation system and acute phase response signaling pathway are strongly interconnected and share proteins whose expression is modulated during an infection. they are all related to the inflammatory process and the innate immune response, and act upstream of activation of the adaptive immune response. deregulated expression of these proteins could induce deleterious effects in endothelial cells [57].
in our hands, none of the seven complement proteins tested by elisa showed a difference in concentration levels between sd and df acute samples, minimizing the potential role of these proteins in dengue pathogenesis, at least in the early phase of the disease. several authors have reported proteomic analysis of samples from patients with dengue using different approaches [29, 30, 32]. to our knowledge, this study is the first assessment of the proteins that are differentially present in a virus-enriched fraction purified from plasma specimens obtained from patients during the acute stage of dv infection. we were unable to confirm that proteins identified by lc-ms/ms were directly associated with the viral particles. neither elisas conducted on virions purified from plasma nor electron microscopy using specific gold-labeled antibodies yielded reliable results. consequently, we cannot discard the possibility that the host proteins identified in this study may be contaminants that co-purify reproducibly with viral particles. elisas carried out on individual plasma specimens from the colombian cohort indicated trends towards higher levels of cp, c1r and olfm4 in patients with sd compared to patients with df. cp, an acute phase protein, is a ferroxidase involved in iron metabolism. elevated levels of cp are generally observed during inflammation. c1r is a complement protein that interacts with c1s at the beginning of the complement cascade. olfm4 is an anti-apoptotic factor that promotes tumor growth and facilitates cell adhesion, probably via interacting with cell surface lectins and cadherin [58]. the associations between these proteins and disease severity in dengue remain to be explored. interestingly, a recent paper reported olfm4 as a marker of disease severity in respiratory syncytial virus infection in children [59].
elisas also showed that the concentration of pf4 was higher in the plasma samples of colombian patients with df than in those of patients with sd. pf4 is a small cytokine released by platelets that promotes blood coagulation by moderating the effects of heparin-like molecules. pf4 also plays a role in wound repair and inflammation. alterations in platelet function have been associated with plasma leakage, which is one of the major features of severe disease in patients with dengue [60]. recently, it has been demonstrated that dv replicates and produces infectious virus in platelets [61]. in 1989, srichaikul t. et al. demonstrated that the level of pf4 increases during the acute phase in children with dhf, with or without shock. it is difficult to compare these results with the results presented here because there is no strict correspondence between the patient classification used by srichaikul t. et al. and the classification of 1997 used in the present work [62]. interestingly, among the six proteins involved in coagulation and having a sd/df peptide ratio higher than 1.3 or lower than 0.78 (see additional file 1), pf4 is the only one that was confirmed by elisa. pf4 is significantly less concentrated in the plasma of patients with acute sd. the trends observed for olfm4 and pf4 by elisas for the colombian patients were also confirmed in the cambodian patients. by testing different time points during the course of the disease, we showed that these markers changed over time and tended, during recovery, to approach the protein concentrations found in healthy individuals. pf4 was overabundant in patients with acute df, but remained surprisingly low in sd patients, with levels remaining similar to those of control individuals. it has previously been proposed that pf4 interacts with the vasculature and is involved in thrombus formation at sites of vascular injury [63]. murine studies suggest that pf4 may have an overall salutary effect in sepsis [64].
therefore, a basal level of pf4 expression during the acute phase of dv infection could be a useful marker of subsequent progression to severe dengue. the severity of dengue is modulated by multiple risk factors such as the age, genetic background and nutritional status of the host, as well as the genotype and serotype of the virus. these factors could explain the differences in the concentrations of specific markers observed between the plasma samples from the colombian and cambodian patients. for example, sd in south-east asia is mainly observed in children, whereas adults are predominantly affected in south america [65]. as a consequence, results should be extrapolated with caution to different geographic areas or different demographic groups. in conclusion, we describe the development of a novel technique for dv enrichment from complex biological fluids, based on centrifugation combined with a water-insoluble polyelectrolyte-based technique, for subsequent proteomic analyses. we found no evidence that the identified host proteins are specifically associated with virions. however, this purification technique enables the analysis of a plasma fraction enriched in virions and from which the most abundant plasma proteins were removed. the host proteins characterized in this study may potentially reflect how dv infection disturbs the function of the cellular proteome. in this regard, further studies are required to assess the prognostic value of host proteins associated with inflammation, complement cascade and coagulation for disease severity by analysis of additional biological samples from patients infected with dv. analysis of a selection of the identified proteins using elisas identified two host proteins, olfm4 and pf4, which had significant prognostic value for classifying patients with dengue who were likely to develop sd.
further prospective studies are warranted to confirm and validate the prognostic value of olfm4 and pf4 as potential biomarkers of disease severity in larger cohorts of patients from a variety of dengue-endemic areas around the globe. complement factor 1r; ci: confidence interval; cp: ceruloplasmin; df: dengue fever; dige: difference gel electrophoresis; dv: dengue virus; elisa: enzyme linked immunosorbent assay; er: endoplasmic reticulum; iu: international unit; lc-ms/ms: liquid chromatography coupled to mass spectrometry chain; mes: 2-(n-morpholino)ethanesulfonic acid; moi: multiplicity of infection; na: not applicable; ns: non-structural references 1. world health organization. dengue: guidelines for diagnosis, treatment, prevention and control: new edition. geneva: world health organization recent advances in deciphering viral and host determinants of dengue virus replication and pathogenesis modification of intracellular membrane structures for virus replication the dengue virus nonstructural-1 protein (ns1) generates antibodies to common epitopes on human blood clotting, integrin/adhesin proteins and binds to human endothelial cells: potential implications in haemorrhagic fever pathogenesis molecular mimicry between virus and host and its implications for dengue disease pathogenesis complement and dengue haemorrhagic fever/shock syndrome expression of cytokine, chemokine, and adhesion molecules during endothelial cell activation induced by antibodies against dengue virus nonstructural protein 1 vascular leakage in severe dengue virus infections: a potential role for the nonstructural viral protein ns1 and complement differential gene expression changes in children with severe dengue virus infections antagonism of the complement component c4 by flavivirus nonstructural protein ns1 binding of flavivirus nonstructural protein ns1 to c4b binding protein modulates complement activation cytokine expression profile of dengue patients at different phases of illness 
we would like to thank dr. y. coute data supporting the findings and materials are available upon request to frederic bedin, biomerieux sa, chemin de l'orme, 69280 marcy l'etoile (france). additional file 1: virus-enriched fraction proteome change in purified plasma pools obtained from acute dengue patients. authors' contributions rf, mf, fr and fb carried out all the experiments and performed the statistical analysis. rf and fb drafted the manuscript. ap, fr and gb participated in the design of the study. rf, fb, gb participated to draft the manuscript. dv and pb collected cambodian samples and managed the patient data and ethics. all authors read and approved the final manuscript. 
key: cord-020097-eh5deunk authors: nan title: cumulative author index for 2006 (volumes 115–122) date: 2006-10-27 journal: virus res doi: 10.1016/s0168-1702(06)00318-2 sha: doc_id: 20097 cord_uid: eh5deunk nan domingues, h.g., see spilki, f.r. (116) distribution and heterogeneity of small ruminant lentivirus envelope subtypes in naturally infected french sheep p1-and vpg-transgenic plants show similar resistance to potato virus a and may compromise long distance movement of the virus in plant sections expressing rna silencing-based resistance identification of novel foot-and-mouth disease virus specific t-cell epitopes in c/c and d/d haplotype miniature swine modulation of pkr activity in cells infected by bovine viral diarrhea virus genetic analysis of the function of the plum pox virus ci rna helicase in virus movement nidovirales: evolving the largest rna virus genome characterization of epstein-barr virus type i variants based on linked polymorphism among ebna3a, -3b, and -3c genes phloem specific promoter from a satellite associated with a dna virus various 30 and 69 bp deletion variants of the epstein-barr virus lmp1 may arise by homologous recombination in nasopharyngeal carcinoma of tunisian patients complete genome analysis of rflp 184 isolates of porcine reproductive and respiratory syndrome virus hepatitis c virus e2 links soluble human cd81 and sr-b1 protein specific binding of heat shock protein 70 with hn-protein inhibits the hn-protein assembly in sendai virus-infected vero cells protecting crops from non-persistently aphid-transmitted viruses: a review on the use of barrier plants as a management tool ppv long-distance movement is occasionally permitted in resistant apricot hosts evolutionary genomics of nucleo-cytoplasmic large dna viruses heat shock enhances the susceptibility of bhk cells to rotavirus infection through the facilitation of entry and post-entry virus replication steps adenoviral vectors-how to use them in cancer gene therapy 
inhibition of anatid herpes virus-1 replication by small interfering rnas in cell culture system prevalence of mutations in hepatitis c virus core protein associated with alteration of nf-b activation molecular epidemiological study of arctic rabies virus isolates from greenland and comparison with isolates from throughout the arctic and baltic regions asian prunus viruses: new related members of the family flexiviridae in prunus germplasm of asian origin the impact of the use of col-1492, a nonoxynol-9 vaginal gel, on the presence of cervical human papillomavirus in female sex workers microglial cells initiate vigorous yet non-protective immune responses during hsv-1 brain infection evolution of orf5 of spanish porcine reproductive and respiratory syndrome virus strains from identification of an interferon antagonist protein encoded by segment 7 of infectious salmon anaemia virus topics in herpesvirus genomics and evolution comparative analysis of genome sequences of three isolates of orf virus reveals unexpected sequence variation activities of membrane bound phosphatases, transaminases and mitochondrial enzymes in white spot syndrome virus infected tissues of fenneropenaeus indicus white spot syndrome virus infection decreases the activity of antioxidant enzymes in fenneropenaeus indicus phosphorylation and dephosphorylation events that regulate viral mrna translation identification of two amino acid residues on ebola virus glycoprotein 1 critical for cell entry sindbis virus infection of two model insect cell systems-a comparative study the amino-terminal residue of glycoprotein b is critical for neutralization of bovine herpesvirus an endornavirus from a hypovirulent strain of the violet root rot fungus importance of the extracellular and cytoplasmic/transmembrane domains of the haemagglutinin protein of rinderpest virus for recovery of viable virus from cdna copies envelope gene capture and insect retrovirus evolution: the relationship between errantivirus 
and baculovirus envelope proteins multiclonal pattern of jaagsiekte sheep retrovirus integration sites in ovine pulmonary adenocarcinoma venezuelan equine encephalitis virus complex-specific monoclonal antibody provides broad protection, in murine models, against airborne challenge with viruses from serogroups i, ii and iii patterns of sequence evolution at epitopes for host antibodies and cytotoxic t-lymphocytes in human immunodeficiency virus type phylogenetic analysis of the gag region encoding the matrix protein of small ruminant lentiviruses: comparative analysis and molecular epidemiological applications gene expression array analyses predict increased proto-oncogene expression in mmtv induced mammary tumors evolutionary genomics of archaeal viruses: unique viral genomes in the third domain of life hiv-l and the microrna-guided silencing pathway: an intricate and multifaceted encounter a modified viral satellite dna-based gene silencing vector is effective in association with heterologous begomoviruses tatabinding protein and tbp-associated factors during herpes simplex virus type 1 infection: localization at viral dna replication sites immunohistochemical examination of the role of fas ligand and lymphocytes in the pathogenesis of human liver yellow fever complete sequence and organization of the human adenovirus serotype 46 genome molecular characterization and phylogenetic study of maedi visna and caprine arthritis encephalitis viral sequences in sheep lymphocytopathogenic activity in vitro correlates with high virulence in vivo for bvdv type 2 strains: criteria for a third biotype of bvdv inactivation of white spot syndrome virus (wssv) by normal rabbit serum: implications for the role of the envelope protein vp28 in wssv infection of shrimp low prevalence of primary antiretroviral resistance mutations and predominance of hiv-1 clade c at polymerase gene in newly diagnosed individuals from south brazil characterization of a highly virulent feline 
calicivirus and attenuation of this virus interaction of the hepatitis b virus protein hbx with the human transcription regulatory protein p120e4f in vitro translation reinitiation and leaky scanning in plant viruses phylogenetic analysis of recent isolates of classical swine fever virus from colombia inhibition of severe acute respiratory syndrome-associated coronavirus (sars-cov) infectivity by peptides analogous to the viral spike protein rabies virus-induced apoptosis involves caspase-dependent and caspase-independent pathways translational control during virus infection epstein-barr virus immunossuppression of innate immunity mediated by phagocytes binding of shrimp cellular proteins to taura syndrome viral capsid proteins vp1 collaborative study to evaluate a new elisa test to monitor the effectiveness of rabies vaccination in domestic carnivores efficient inhibition of hepatitis b virus replication by small interfering rnas targeted to the viral x gene in mice preparation and characterization of a novel monoclonal antibody specific to severe acute respiratory syndrome-coronavirus nucleocapsid protein the kinetics of proinflammatory cytokines in murine peritoneal macrophages infected with envelope protein-glycosylated or non-glycosylated west nile virus members of adenovirus species b utilize cd80 and cd86 as cellular attachment receptors exogenous nitric oxide inhibits crimean congo hemorrhagic fever virus structural and antigenic analysis of the yellow head virus nucleocapsid protein p20 efficient expression of the 15-kda form of infectious pancreatic necrosis virus vp5 by suppression of a uga codon phylogenetic relationships of brazilian bovine respiratory syncytial virus isolates and molecular homology modeling of attachment glycoprotein genetic manipulation of two fowlpox virus late transcriptional regulatory elements influences their ability to direct expression of foreign genes a new rna virus found in black tiger shrimp penaeus monodon from thailand 
conformational maturation of the nucleoprotein synthesized in influenza c virus-infected cells hiv-1-mediated syncytium formation promotes cell-to-cell transfer of tax protein and htlv-i gene expression lack of a mechanism for faithful partition and maintenance of the kshv genome studies on the activity of a bidirectional promoter of mungbean yellow mosaic india virus by agroinfiltration expression, purification, and in vitro activity of an arterivirus main proteinase genetic characterization of the capra hircus papillomavirus: a novel close-to-root artiodactyl papillomavirus transient expression of human papillomavirus type 16 l1 protein in nicotiana benthamiana using an infectious tobamovirus vector a deletion and point mutation study of the human papillomavirus type 16 major capsid gene hearsnpv orf83 encodes a late, nonstructural protein with an active chitin-binding domain molecular epidemiology of bluetongue virus in northern colorado amino acid sequence of the amur tiger prion protein involvement of endoplasmic reticulum in hepatitis b virus replication systemic antiviral silencing in plants sequencing and comparative analysis of a pig bovine viral diarrhea virus genome esirnas inhibit hepatitis b virus replication in mice model more efficiently than synthesized sirnas chronological and geographical variations in the small rna segment of the teratogenic akabane virus hcv ns2 protein inhibits cell proliferation and induces cell cycle arrest in the s-phase in mammalian cells through down-regulation of cyclin a expression pepper mild mottle virus pathogenicity determinants and cross protection effect of attenuated mutants in pepper hepatitis e virus genotyping based on full-length genome and partial genomic regions location and phylogenetic analysis of the region immediately upstream of the granulin gene of the clostera anachoreta granulovirus molecular characterization of rabies virus isolates in china during proteolytic cleavage and shedding of the bovine 
prion protein in two cell culture systems antigenic structure analysis of glycosylated protein 3 of porcine reproductive and respiratory syndrome virus generation of virus-like particles consisting of the major capsid protein vp1 of goose hemorrhagic polyomavirus and their application in serological tests amino acid changes in the recombinant dengue 3 envelope domain iii determine its antigenicity and immunogenicity in mice 
key: cord-004435-l66ost6q authors: oli, angus nnamdi; obialor, wilson okechukwu; ifeanyichukwu, martins ositadimma; odimegwu, damian chukwu; okoyeh, jude nnaemeka; emechebe, george ogonna; adejumo, samson adedeji; ibeanu, gordon c title: immunoinformatics and vaccine development: an overview date: 2020-02-26 journal: immunotargets ther doi: 10.2147/itt.s241064 sha: doc_id: 4435 cord_uid: l66ost6q the use of vaccines has resulted in a remarkable improvement in global health. it has saved many lives, reduced treatment costs and raised the quality of animal and human lives. current traditional vaccines were developed empirically, with either vague or no knowledge of how they modulate the immune system. even in the face of potential advances in vaccine design, immune-related concerns (as seen with specific vulnerable populations, cases of emerging/re-emerging infectious disease, pathogens with complex lifecycles and antigenic variability, the need for personalized vaccinations, and concerns about vaccines' immunological safety, specifically the likelihood that a vaccine triggers non-antigen-specific responses that may cause autoimmunity and vaccine allergy) are being raised. these concerns have driven immunologists toward research into a better approach to vaccine design that takes these challenges into account. currently, immunoinformatics has paved the way for a better understanding of some infectious disease pathogenesis, diagnosis, immune system response and computational vaccinology. 
the importance of immunoinformatics in the study of infectious diseases is diverse in terms of the computational approaches used, but is united by common qualities related to the host–pathogen relationship. bioinformatics methods are also used to assign functions to uncharacterized genes, which can be targeted as candidates in vaccine design, and can offer a better approach toward the inclusion of pregnant women in vaccine trials and programs. the essence of this review is to give insight into the need to focus on novel computational, experimental and computation-driven experimental approaches for studying host–pathogen interactions, thus making a case for their use in vaccine development. vaccination has been undeniably very helpful in promoting a healthy global population. it has saved many lives, reduced healthcare costs and raised man's quality of life. 1 it greatly reduces disease burden, disability and death. however, newly emerging and re-emerging infectious diseases (erid), infectious agents with complex lifecycles and antigenic variability, and the need for personalized vaccination present additional challenges in vaccine development. 2, 3 for many pathogens (especially emerging ones and those with antigenic variability), the genomes are known but the immune correlates of protection remain unclear. 1, 4 these are some of the reasons why vaccine development for erid and multi-lifecycle pathogenic diseases is a tall order. serendipitous discoveries in immunology, coupled with knowledge of bioinformatics tools for epitope prediction, have resulted in the emergence of a new pattern of vaccine design. 5, 6 the art and science of efficient and comprehensive extraction and analysis of data deposited in relevant databases is now increasingly essential in immunology research. 
7 even with this capacity (efficient information extraction), challenges remain in the application of bioinformatics to immunology, including structure and/or function analysis and immune process analyses concerning the specificity of immune interactions. fortunately, although research in immunology is experimentally costly and very intensive, colossal amounts of data are usually generated. such data can only be analyzed with high precision and speed using bioinformatics tools. for instance, genome sequencing and in vitro t-cell confirmation can be done in a few months, as opposed to the years required by conventional vaccine design. 8 also, computational immunological methods drastically reduce both the time and labor needed for epitope screening. 5, 9 with computational immunology techniques, it is possible to discover vaccine candidate epitopes simply by scanning the protein sequences of a pathogen of interest. 5 many of these proteins are yet to be isolated or even cloned. being pathogen-specific and unique, they are ready candidates for vaccine constructs. this review describes the need to use immunoinformatics-based techniques to unveil vital determinants of immunity made available in genome sequence databases and to design vaccines. also, this review gives insight into the need to focus on novel computational, experimental and computation-driven experimental approaches for studying host-pathogen interactions, thus making a case for their use in vaccine development. this review will further show the need for new approaches to effective drug or vaccine design so as to combat the antigenic variability of some pathogens. the process of generating vaccine-induced immunity is somewhat challenging in immunology. current conventional vaccines were developed empirically, when there was vague or no knowledge of how vaccines activate the immune system. 
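as a rough illustration of what "scanning the protein sequences" of a pathogen can mean in practice, the sketch below slides a fixed-length window over a protein and keeps only the peptides that do not occur verbatim in a set of host proteins. it is a minimal sketch, not the method of any tool named in this review: the function names, the 9-residue window and the toy sequences are all assumptions made for illustration, and exact string matching is a much cruder test of pathogen specificity than a real similarity search.

```python
def peptides(sequence, k=9):
    """Yield (start, peptide) for every window of length k in a protein sequence."""
    for start in range(len(sequence) - k + 1):
        yield start, sequence[start:start + k]


def pathogen_specific_peptides(pathogen_protein, host_proteins, k=9):
    """Return the windows of the pathogen protein that never occur verbatim in the host set."""
    # Collect every k-mer found in any host protein (hypothetical host set).
    host_kmers = {pep for protein in host_proteins for _, pep in peptides(protein, k)}
    return [(start, pep) for start, pep in peptides(pathogen_protein, k)
            if pep not in host_kmers]


# Toy sequences invented for this example; they are not real proteins.
pathogen = "MKTLLVAGAVSSQ"
host = ["AAAMKTLLVAGAAA"]
candidates = pathogen_specific_peptides(pathogen, host, k=9)
for start, pep in candidates:
    print(start, pep)
```

in a realistic setting the surviving windows would only be a starting list; each would still need a binding prediction and experimental confirmation before being treated as a candidate epitope.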
a lot of research [10] [11] [12] has been geared toward understanding this challenge, but its complexity requires a different kind of approach. 13 an approach that accommodates the many factors affecting vaccine development, such as pathogen antigenic variability, the emergence of infectious diseases and human genetic variation, is the goal of immunoinformatics (figure 1). activation of the immune system involves, among many processes, induction of immune memory. the strength of this induction determines the efficacy of a vaccine. hence, vaccine efficacy in the long run is influenced by the determinants of immunological memory stimulation, by persisting antibodies and by the type of immune memory cells induced. 14 the primary vaccine-mediated immunological effectors (table 1) are mainly the antibodies (from b lymphocytes) 15, 16 and sometimes cd8+ and cd4+ t cells. these antibodies bind specifically to a particular kind of toxin or pathogen. vaccines and most antigens evoke humoral as well as cell-mediated immune responses. 17 vaccines that mediate immune responses through both systems (b and t cell responses) are said to be more effective. although b cells are regarded as the primary vaccine immune effectors, t cells induce immune memory cells and high-affinity antibodies. studies in reverse vaccinology and immunomics have also shown t cells to be prime immune effectors, following the discovery of novel vaccine targets with epimatrix. [18] [19] [20] this change of immune target has led to successful advances in vaccine design. even in the face of potential advances in vaccine design, immune-related questions are now focused on specific vulnerable populations such as the young, the elderly and the immunocompromised. 
21, 22 these concerns have propelled a better understanding of the efficacy of current vaccines in these vulnerable populations and have also paved the way for the application of new approaches that take into consideration population differences and better targets that can generate optimum immune induction [23] [24] [25] with the exception of type ii t-cell-independent (ti-2) antigens (i.e., polysaccharide antigens). antigens that can provoke both b lymphocyte and t lymphocyte responses stimulate the germinal centers, causing highly efficient antigen-specific b-cell multiplication and eventual differentiation into antibody-forming plasma cells and memory b cells. all existing protein and dna antigens induce immunological memory b cells, unlike type ii t-cell-independent (ti-2) antigens (i.e., polysaccharide antigens). these polysaccharide antigens do not generate memory b cells but can induce long-lasting humoral immunity even when recall responses are lacking. 26 vaccine efficacy may be short term 27 if only the b cells are activated. the traditional approach for developing vaccines against infectious disease threats has shifted to include other vaccine design techniques, such as cloning and expressing major surface antigens, 28 although this has frequently resulted in vaccines with poor immunogenicity that require strong adjuvantation. 29 this approach is particularly likely to be less specific for pathogens with complex lifecycles (e.g., parasites) or very high mutability (e.g., rna viruses). these pathogens do not depend on one route for their virulence or pathogenesis in humans; thus, to alter this process, the aim should be to increase the specificity of the vaccine and not just its effectiveness, as seen in the current conventional vaccines. 
28, 29 vaccines for several neglected tropical diseases are in various stages of development, 30 thanks to large drug companies that have continued to demonstrate a willingness to invest money in research and development on diseases plaguing developing nations. 16, [30] [31] [32] it is very pertinent to invest in researches that have an interest in vaccine specificity on the pathogen antigens than totally 33 computational vaccinology may now be applied to screen these genomes for possible vaccine targets. with these tools, many proteins of virulence interest can be sequenced and the most essential gene of interest modeled for a potential vaccine candidate specific for that pathogen. immunoinformatics is the way forward in the identification of vaccine candidates for these tropical erid, for pathogens with varying antigens and for individualized therapy. immunology studies produce data in colossal quantities. also, with proteomics and genomics projects and extensive screening of pathogens and/or pathogen-host interactions, it has become increasingly necessary to store, manage and analyze these data, hence the birth of immunoinformatics. immunoinformatics deals with the computational techniques and resources used to study immune functions. statistical, computational, mathematical and biological knowledge and tools are applied in immunoinformatics in order to accurately and specifically store and analyze data concerning the immune system and its functions. to handle evidence diversity, immunoinformatics uses tools that cut across several aspects of bioinformatics, such as the creation and management of databases, 34, 35 the definition and use of both structural and functional signatures and the formation and application of predictive tools. [35] [36] [37] these strategies can synergize toward a better understanding of the immune system of both man and animals and the fight against some less predictable pathogenesis. 
the complex nature of the vertebrate immune system, the variable nature of pathogens and environmental antigens, and the multi-regulatory pathways involved show that colossal quantities of data will be needed to unveil how the human immune system works. conventionally, not much can be achieved, given the complexity of the immune system and of virulent antigens, but with the application of computational vaccinology, research on vaccine design has been made easier, more accurate and more specific. applying immunoinformatics in disease study (table 2) requires knowledge of disease pathogenesis, immune system dynamics and computational vaccinology, as well as painstaking database searches, sequence comparison, structural modeling and motif analysis. 35, 38 these methods can go a long way in analyzing the pathogenesis of a disease and identifying vaccine candidates. in order to help understand complex pathogen-related processes, computational models have been developed for viral, 46, 47 bacterial, 48 parasitic 49 and fungal pathogens. 50 the bioinformatics tools (table 3) are used to identify possible epitopes for vaccine formulation. each tool can screen protein sequences and identify aggregates of mhc binding and supertype motifs for possible use in epitope-based vaccine development and for use among human populations with genetic variability. there are several databases (table 4) that can provide a wide range of information for all forms of immunological studies. data generated from the studies are further organized and stored in the databases (table 4) to provide a means for the development and advancement of immunological research. a tour of these databases should stimulate interest in the vaccinology of emerging and re-surging disorders attributable to pathogens, including cancer. 
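the idea of identifying "aggregates of mhc binding motifs" can be pictured with a small density search: given the start positions of predicted binding cores along a protein, group those that fall within a short stretch and keep the groups with enough hits. the sketch below is only a toy under stated assumptions. it is not the algorithm of any tool in table 3; the function name, the greedy grouping rule and the example positions are hypothetical, and the default thresholds (9-residue cores, regions of at most 25 residues, at least 4 hits) are simply borrowed from the cluster sizes quoted in this review.

```python
def dense_regions(hit_starts, core_len=9, max_span=25, min_hits=4):
    """Group predicted binding cores into regions of at most max_span residues.

    hit_starts: 0-based start positions of predicted core_len-mers; a position may
    repeat when several alleles are predicted to bind the same core.
    Returns (region_start, region_end_exclusive, n_hits) for each region that
    contains at least min_hits hits.
    """
    starts = sorted(hit_starts)
    regions = []
    i = 0
    while i < len(starts):
        j = i
        # Extend the region while the next core still ends within max_span
        # residues of the region's first residue.
        while j + 1 < len(starts) and starts[j + 1] + core_len - starts[i] <= max_span:
            j += 1
        n_hits = j - i + 1
        if n_hits >= min_hits:
            regions.append((starts[i], starts[j] + core_len, n_hits))
            i = j + 1  # continue after the accepted region
        else:
            i += 1     # too sparse: try again from the next hit
    return regions


# Hypothetical predicted core positions along one protein.
hits = [3, 5, 8, 10, 14, 60, 90, 92]
regions = dense_regions(hits)
print(regions)
```

the positions would normally come from a binding predictor run over many alleles; here they are typed in by hand, so the output says nothing about any real protein.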
emerging infections (eis) include infections that are entirely new in a population or that may have existed before in the population but are now gaining rapid and continued spread and/or a wide geographical range. re-emerging or re-surging infections are infections that were previously of historical relevance but are now quickly becoming relevant again because of increasing incidence or increasing geographical and/or human host range, while emerging infections are infections that were not originally observed in man. 66 several factors, such as human behavioral changes, environmental changes, host/intermediate factors, animal-human switching and microbial genetic changes, affect infectious disease emergence and spread. 67 these factors interact to promote the evolution of pathogens into new ecosystems, where they infect, spread and thrive in their new hosts. 68 the overall consequences of these are continued infectious disease emergence and re-emergence, epidemics and public health challenges. 
table 3 (bioinformatics tools) 
epimatrix: an in-silico product of epivax developed for predicting and identifying the immunogenicity of therapeutic proteins and epitopes. it is also used to re-design proteins and to design t-cell vaccines. 51 
conservatrix: has been applied in comparing strings from different strains of the same pathogen and for pathogen identification. the configuration of conservatrix allows for amino acid replacement at unusual positions. highly conserved t-cell epitopes in variable genomes, such as those of some viruses, are amenable to the algorithm. 51,52 
clustimer: potential t-cell epitopes usually aggregate in specific immunogenic consensus sequence (ics) regions as clusters of 9-25 amino acids with 4-40 binding motifs, rather than being randomly distributed throughout protein sequences. in combination with epimatrix, the clustimer algorithm may be used to identify peptides with epimatrix immunogenicity cluster scores ≥ +10. such peptides are usually immunogenic and tend to make promising vaccine candidates. 
blastimer: using the blastimer program, one may also choose to automatically blast "putative epitopes against the human sequence database at genbank". blasting screens out epitopes that raise possible autoimmunity and cross-reactivity questions and locates the epitopes that can safely be used in developing human or animal vaccines. blastimer can also blast sequences against pdb, swissprot, pir, prf and non-redundant genbank cds translations. 
vaccine cad: this algorithm evaluates junctional epitopes for possible immunogenicity and inserts "spacers and breakers into the design of any string-of-beads construct". 
emerging infections and multi-antibiotic-resistant strains of pathogenic bacteria usually surge from one geographic location, from where they spread to other places due to immigration of people. 67, 69 most emerging infections originate from a specific population and can spread to a new population or become so selectively advantaged that new strains of the pathogen emerge. 67, 70, 71 also, there could be microbial traffic, in which an infectious agent transfers from animals to humans or spreads from isolated groups to new populations. 67, 71, 72 several factors, including ecology, are known to be associated with infectious disease outbreaks. these factors bring man into close contact with a natural disease reservoir/host. 70 with an increasing world population and poor infection control, the emergence of infections and increased microbial populations are certain. human population growth will only increase the spread of infection across populations. table 5 lists the re-emerging infections and current emerging diseases put forward during the who 2018 annual review. 
73 the review noted that these infections, if not well controlled, can cause disease outbreaks, bioterrorism and similar occurrences requiring urgent public health attention and that, given the dearth of efficacious medicines or vaccines, there is a compelling demand for continuous as well as accelerated research and development in those areas. advances in genomics, proteomics, immunomics, vaccinomics and nanotechnology are being continually exploited in diagnostics, therapeutics and rational drug and vaccine development. these advances have also served in the control of the aforementioned emergences. 74, 75 knowledge of an emerging pathogen's genome, protein make-up and pathogen-immune system interactions, together with research into possible therapeutics, will go a long way in directing the optimum path to containing the spread of infection and controlling potential re-emergence or emergence in a different population. approaches for direct and computer-based structure determination, 76 prediction of protein-protein interactions, and bioinformatics tools now exist and are used in the modern-day development of drugs and biologics. 77 vaccine development has been sped up by advances in the knowledge of the human immune system. research on the traditional targets of vaccines (adaptive immune responses) and on the less specific, fast-acting innate immune responses provides clear evidence of this advance. [78] [79] [80] as our understanding of the interplay between innate and adaptive immunity increases, reasons and opportunities for more effective vaccine adjuvants will open up. this can be a step forward in solving a critical world health challenge per population. 
following the conventional approach to vaccine design, not much can be achieved, but when the knowledge of immunoinformatics is applied, population safety and disease control can be achieved through sequencing of the pathogen's genome, leading to optimum new vaccine design or the development of a novel vaccine for the infection. antigenic variability is an important mechanism that pathogens use to evade host immunity. the surface proteins of pathogens are normally variable. this helps them to escape recognition by the immune system. a successful infectious agent presents to the host immune system information that differs from that of its virulence. pathogenic organisms have organized systems for escaping destruction by the immune system of their hosts. for instance, toxoplasma invades and appropriates host cells, thereby circumventing phagocytosis, and then spreads within the host to establish infection. 81 vertebrates, for their part, are endowed with an immune system robust enough to efficiently and effectively surmount non-self attacks. yet the more elaborate the host's immune system becomes, the better the organisms become at evading immune effector cells. antigenic variation refers to a pathogen's ability to modify its surface proteins such that it can circumvent the host's immunological attacks. it involves several mechanisms, including phase variation of surface proteins, shifting and drifting of surface protein antigens and/or any other form of alteration of antigenic proteins. 82 antigenic variation plays significant roles in the pathogenicity of microorganisms through evasion of host immune responses and establishment of re-infection. when a pathogen alters its surface antigens, it can evade the host's adaptive immunity and so re-establish infection. the immune system must then struggle to generate new immunoglobulins against the new antigen. 
certain bacteria, such as neisseria gonorrhoeae, neisseria meningitidis, mycoplasma and species of the genus streptococcus, show antigenic diversity. 83 among eukaryotic pathogens, antigenic variation is shown by trypanosoma brucei and plasmodium falciparum. 81, 84 another vital cause of antigenic variation in bacteria is horizontal gene transfer (more important than point mutation) through plasmid acquisition and bacteriophage-mediated transduction. virulence genes are normally acquired by non-virulent organisms via these routes. once this occurs, the new bacteria may quickly become established and cause a fresh epidemic outbreak. species of the genus neisseria are champions of rapid surface antigen change among bacterial pathogens. pathogenic forms exhibit a degree of phenotypic variability not found in the commensal species. the pathogenic forms are implicated in sexually transmitted disease (std) and meningitis, and they employ a striking variety of antigenic variability mechanisms:
• they can recombine their pilin genes in a manner similar to the way eukaryotes recombine their own genes, so that they can express variable surface proteins. 85
• some cell-surface proteins, and enzymes that synthesize bacterial cell-surface carbohydrates, are expressed variably. this results from replication slippage errors, and their repair, at simple tandem nucleotide repeats of di-, tri- or tetra-nucleotides. 86
• neisseria is able to take up environmental dna and incorporate it into its genome. 83
these mechanisms help explain why a broadly effective vaccine against neisseria infections has proved difficult to develop. neisseria may be considered an extreme example. however, many other bacterial pathogens, such as streptococcus and mycoplasma, tend to use one or more of these techniques to promote their antigenic variation.
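the effect of replication slippage at simple tandem repeats can be illustrated with a small sketch. it assumes, purely for illustration, that the repeat tract starts in frame inside a coding region and that the gene is "on" only while the sequence downstream of the tract stays in frame; the function names and the example repeat units are hypothetical and are not taken from the cited studies.

```python
def downstream_frame(unit: str, copies: int) -> int:
    """Reading-frame offset (0, 1 or 2) of the codons that follow the repeat tract."""
    if copies < 0:
        raise ValueError("copies must be non-negative")
    return (len(unit) * copies) % 3


def expression_state(unit: str, copies: int) -> str:
    """'on' if the downstream sequence stays in frame, otherwise 'off' (toy assumption)."""
    return "on" if downstream_frame(unit, copies) == 0 else "off"


def slippage_neighbours(unit: str, copies: int) -> dict[int, str]:
    """Expression state after losing one repeat unit, no change, or gaining one unit."""
    candidates = (copies - 1, copies, copies + 1)
    return {n: expression_state(unit, n) for n in candidates if n >= 0}


if __name__ == "__main__":
    # Hypothetical di-, tri- and tetra-nucleotide repeat units.
    for unit in ("CT", "CAG", "CTCT"):
        print(unit, slippage_neighbours(unit, 6))
```

in this toy model, gaining or losing a single di- or tetra-nucleotide unit shifts the reading frame and switches expression, whereas a tri-nucleotide unit changes only the length of the tract and leaves the frame intact.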
additionally, there are reports that dna-related defects are much more strongly associated with bacterial pathogens isolated from symptomatic patients than with samples of the same bacterial species isolated from environmental sources. [87] [88] [89]

pneumococcus

streptococcus pneumoniae, a gram-positive coccus that causes otitis media, bacteremia and pneumonia, is a public health concern, causing morbidity and mortality in adults and children. 90 two forms of vaccine (polysaccharide and conjugate vaccines) are currently marketed for the prevention of pneumococcal infections. while the polysaccharide vaccines are intended for the adult population, conjugate vaccines have an immunogenic non-pneumococcal protein conjugated to the pneumococcal polysaccharides for enhanced immunogenicity in children. it is not yet known whether these vaccines can evoke complete immunity against the infection. the polysaccharide capsule is a major virulence factor of the bacterium. many capsule types have been identified, and these form the basis of antigenic serotyping of the pathogen. 91, 92 current pneumococcal vaccines are combinations of capsular (polysaccharide) antigens from the serotypes most common in a particular population. over 100 different serotypes are currently known, but not all are covered by the available vaccines. 92 the discovery of one or more antigens common to all serotypes could yield a more broadly effective vaccine. knowledge of the genome of the organism and of its different strains has opened a different approach to the search for a pneumococcal vaccine, and this could help resolve many of the concerns about the current vaccines. with this knowledge, many methods are being tried to determine whether they can support an effective vaccine design that accommodates all the serotypes of the organism. the search for an antigen common to all the serotypes can be pursued with genomics and immunoinformatics.
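as a minimal sketch of what searching for an antigen common to all serotypes can mean computationally, the example below intersects the sets of fixed-length peptides (k-mers) found in a protein from each strain. the sequences are invented toy strings, the function names are hypothetical, and real pipelines would add alignment, surface-exposure and immunogenicity filters that are not shown here.

```python
def kmers(sequence: str, k: int) -> set[str]:
    """All overlapping peptides of length k in a protein sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def conserved_kmers(sequences: dict[str, str], k: int = 9) -> set[str]:
    """Peptides of length k that occur in every supplied sequence."""
    shared: set[str] | None = None
    for sequence in sequences.values():
        current = kmers(sequence, k)
        shared = current if shared is None else shared & current
    return shared or set()


if __name__ == "__main__":
    # Invented toy sequences standing in for one protein from three serotypes.
    toy_proteins = {
        "serotype_a": "MKTLLVAGSPEQWRTY",
        "serotype_b": "AAKTLLVAGSPEQGGG",
        "serotype_c": "KTLLVAGSPEQWPPP",
    }
    print(sorted(conserved_kmers(toy_proteins, k=9)))
```

exact matching is the strictest possible definition of "common"; a tolerance for conservative substitutions would need a scoring matrix and is left out of this sketch.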
the introduction of genomic and computational technologies has given new direction to the study of bacterial pathogenesis and vaccine design. 93, 94

plasmodium

plasmodium falciparum completes its life cycle in two hosts: humans and mosquitoes. the human host's erythrocytes and hepatocytes usually display modified parasite proteins called plasmodium falciparum erythrocyte membrane protein 1 (pfemp1) and plasmodium falciparum hepatocyte membrane protein 1 (pfhmp1), respectively. these proteins help the parasite evade destruction by the host immune system. 95, 96 the pfemp1 proteins were identified as the prime ligands responsible for cytoadherence and rosetting. 97 they cause infected red blood cells (rbcs) to sequester in host tissues, thus helping the parasite to circumvent clearance by the host's spleen. 98 by adhering to the endothelium, the membrane proteins also shield infected host cells from destruction by the spleen. fortunately for the host, if a pfemp1 protein is expressed for a long while, it comes under attack by naturally acquired immunity. 98, 99 in defence against this, the parasite has expanded the var genes coding for pfemp1 into a polymorphic family of more than 50 members per haploid genome. antigenic switching works well here, in that members of the polymorphic family (also called an antigenic-variant-protein family) can be interchanged but cannot be expressed at the same time. in this way, only one particular protein is expressed at the surface of the infected rbc at any given time. 97, 100 when studying antigenic-variant-protein families, it is pertinent to understand whether grouping them into a single family reveals any meaningful antigenic activity. studies have tried to understand the "languages" of the antigenic variants of pfemp1 proteins. 97, 101, 102 they sought to characterize the binding properties of pfemp1 proteins or to understand the correlation between sequence motifs and infection severity.
the vardb database is a repository for protein sequences involved in antigenic variation and their associated functionalities. 103 antigenic variant data obtained from several pathogens can be regrouped into a unified database, enabling research on several multicopy gene families to be accessed and compared quickly in one place. the updated vardb database contains close to 10,000 dna sequences, several protein translations, and tens of infectious diseases and pathogens with their gene families. with pacbio, a novel sequencing-based approach, the different pfemp1 proteins can be sequenced and the related sequences used as potential vaccine targets. 104, 105

trypanosoma

for many pathogens, antigenic variability occurs during the pathogenesis of infection and enables them to escape destruction by host antibodies. for instance, some eukaryotic parasites resort to genetic assortment and rearrangement, thereby changing their surface antigens. a ready example is trypanosoma brucei, the causative organism of sleeping sickness. trypanosoma brucei replicates extracellularly in the bloodstream of its host, but at maturity it crosses the blood-brain barrier to cause several fatal complications. during replication in the bloodstream, the parasites are subjected to humoral as well as cellular immune responses. the parasite evades the host defenses by encasing itself in a homogeneous coat of a glycoprotein called the variant surface glycoprotein (vsg). 106, 107 at initial invasion the protein coat tends to protect the microbe from the immune system, but on constant exposure the coat is identified as foreign matter, at which point the immune effectors can launch an attack against it. in a given trypanosoma brucei, diverse vsg proteins are encoded by more than a thousand genes in the parasite's genome. unfortunately for the host, the expression of these genes is mutually exclusive.
switching of the expressed vsg gene is normally due to genetic rearrangements that cause new alleles to be copied into the sites of expression. some trypanosomes with these altered vsg genes evade humoral immunity and multiply, thereby causing re-infection and chronic recurring infections. 107

influenza is a viral infectious disease caused by infection with any of three types of rna virus, namely influenza types a, b and c. current vaccines contain two type a strains and one type b strain and induce strong antibody responses to neuraminidase and the surface glycoprotein hemagglutinin. these vaccines, however, cannot effectively protect against newly emerging viruses that arise through antigenic shift and drift. 108, 109 antigenic drift results in minor changes in an antigenic site, while antigenic shift results in a new virus subtype. hemagglutinin and neuraminidase are the two surface glycoproteins that dictate the antigenic properties of the viruses. inside the host, defined host proteases break peptide bonds in the hemagglutinin molecule to form the hemagglutinin 1 and 2 subunits. the chemical character of the amino acids at the cleavage site (for example, how lipophobic they are) influences the virulence of the virus. 110 the surface glycoprotein can be regarded as an antigen and hence can serve as a target for the immune system. if it is sequenced and a site common to the varying proteins is identified using the immunoinformatics approach, a potent vaccine could be developed that accommodates the antigenic drift and shift of the virus. influenza viruses are able to thrive for a long while in a given human population. 111, 112 the virus has a high mutation rate, so that a once effective vaccine can easily lose efficacy. antigenic variability is only one piece of evidence of phenotypic variation in the biology of the influenza virus.
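the idea of identifying a site common to the varying proteins can be sketched as a scan for fully conserved columns in a set of already aligned sequences. the sketch assumes the alignment has been produced elsewhere (gaps written as "-"), the sequences are invented toy strings rather than real hemagglutinin data, and the function names are hypothetical.

```python
def conserved_positions(aligned: list[str]) -> list[int]:
    """0-based alignment columns where every sequence carries the same residue."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    reference = aligned[0]
    return [
        i for i in range(length)
        if reference[i] != "-" and all(seq[i] == reference[i] for seq in aligned)
    ]


def conserved_runs(aligned: list[str], min_len: int = 3) -> list[tuple[int, int, str]]:
    """Contiguous conserved stretches as (start, end_inclusive, residues)."""
    runs = []
    start = previous = None
    for pos in conserved_positions(aligned):
        if start is None:
            start = previous = pos
        elif pos == previous + 1:
            previous = pos
        else:
            if previous - start + 1 >= min_len:
                runs.append((start, previous, aligned[0][start:previous + 1]))
            start = previous = pos
    if start is not None and previous - start + 1 >= min_len:
        runs.append((start, previous, aligned[0][start:previous + 1]))
    return runs


if __name__ == "__main__":
    toy_alignment = ["MKAILVVLLY", "MKTILVALLY", "MKAILVVLLH"]  # invented strains
    print(conserved_runs(toy_alignment, min_len=3))
```

positions that drift between strains break the runs, so only stretches that are identical in every strain are reported as candidate common sites.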
the use of immunoinformatics in vaccine development has accelerated the design of multiepitope vaccine constructs, which have begun to address, and should continue to address, the challenges posed by pathogens with variable antigens. previous vaccines developed by conventional approaches consist of several proteins or a whole pathogen. this constitutes an unwarranted antigenic load and increases the chances of inducing allergy. peptide-based vaccines surmount these challenges. they are made from short peptide fragments capable of eliciting highly specific immune responses and allow precision targeting; where antigenic peptides vary, they can be combined into multiepitope constructs, which has been made feasible by advances in computational biology. [113] [114] [115] [116] vaccines for pathogens with immune escape potential can basically be constructed using most, if not all, of their immunogenic peptides. 116, 117 multiepitope vaccines enjoy the following advantages over single-epitope and classical vaccines: a) they are an assemblage of several epitopes obtained from distinct protein targets/antigens of an intended infection; b) the multiple t-cell receptors (tcrs) in the vaccine recipient can easily recognize vaccines with multiple hla epitopes; c) they can be easily adjuvanted to improve their immunogenicity; d) they can activate antibody-mediated and cell-mediated immunological responses because of their overlapping helper t lymphocyte (htl), cd8+ t-cell and b-cell epitopes; and e) unwanted protein antigens are excluded from such constructs, thereby reducing the chances of untoward effects and/or immune responses likely to cause disease. [118] [119] [120] [121] [122] [123] a vaccine with these qualities could therefore improve the chances of combating infections such as streptococcus pneumoniae and hiv infections.
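a multiepitope construct of the kind described above is, at the sequence level, a concatenation of selected epitopes separated by short linker peptides. the sketch below shows one way to assemble such a string. the linkers (aay, gpgpg, kk and eaaak) are ones commonly reported in the immunoinformatics literature and are an assumption here, not a recommendation from this review; the epitope and adjuvant sequences in the usage example are invented placeholders.

```python
# Linkers commonly reported in the literature; treated here as assumptions.
LINKERS = {"ctl": "AAY", "htl": "GPGPG", "bcell": "KK"}
ADJUVANT_LINKER = "EAAAK"
BLOCK_ORDER = ("ctl", "htl", "bcell")


def build_construct(epitopes: dict[str, list[str]], adjuvant: str = "") -> str:
    """Join epitopes of each class with that class's linker, then join the blocks.

    Blocks follow BLOCK_ORDER; each later block is attached with its own linker,
    and an optional adjuvant sequence is attached up front with ADJUVANT_LINKER.
    """
    blocks = []
    for kind in BLOCK_ORDER:
        peptides = epitopes.get(kind, [])
        if peptides:
            blocks.append((kind, LINKERS[kind].join(peptides)))

    construct = adjuvant + ADJUVANT_LINKER if adjuvant and blocks else adjuvant
    for index, (kind, block) in enumerate(blocks):
        if index > 0:
            construct += LINKERS[kind]
        construct += block
    return construct


if __name__ == "__main__":
    # Placeholder peptides for illustration only.
    toy_epitopes = {
        "ctl": ["KLMNPQRST", "STVWYACDE"],
        "htl": ["DEFGHIKLMNPQRST"],
    }
    print(build_construct(toy_epitopes, adjuvant="MAD"))
```

the order of the blocks and the choice of linkers both affect the junctions between epitopes, so in practice several arrangements would be compared with the property-prediction tools discussed in this section.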
immunoinformatics can be employed in the docking of single and multiepitope vaccines and subsequently to predict their physicochemical, allergenic and antigenic properties. this approach has drawn on diverse tools and databases for the analysis of ligands with their targets and has greatly helped to predict the binding scores of antigenic peptides with immune proteins such as hla. peptide and hla allele modeling can be done by designing the 3d structures of the epitopes using pepfold3 (an online server), 124 retrieving from the protein data bank (pdb) the x-ray crystallographic structures of the hla alleles that occur most frequently in human populations (hla-drb1*01:01, hla-drb1*15:01 and hla-a*02:01), and then removing previously bound ligands. the following steps describe how to construct a single or multiepitope-based vaccine and predict its properties:
• molecular docking analysis: to determine the interaction pattern of the screened epitopes with the hla alleles by employing cluspro v.2 (a protein-protein docking web server). this server performs the task by energy minimization and by calculating both the binding energy scores of the docked complex and its electrostatic/shape complementarity.
• target-protein comparative modeling and associated structure validation: the amino acid sequence of the target protein (e.g., tlr-9) can be retrieved from uniprot and its tertiary structure modeled with raptorx and i-tasser (online comparative modeling tools). the server constructs a 3d model (mathematical representation) of the target protein using hierarchical algorithms. 118

personalized vaccine refers to vaccines "targeted" toward an optimized outcome: immunogenicity is maximized, while the risk of vaccine failure, reactogenicity and side effects is minimized.
personalized vaccines are developed in the following cases:
• at the individual level, vaccines are developed to take account of haplotypes and polymorphisms, knowing that these can retard the formation of a protective immune response or point to the risk of an adverse vaccine reaction.
• when it is clear that females produce a higher antibody titre against a particular vaccine antigen than do their male counterparts.
• where it is clear that a particular race or ethnic group has a higher or lower immune response to a particular vaccine antigen.
personalized vaccines arise from known complex interactions between host, environmental, genetic and other factors that may influence vaccine immune responses. the associations between immune response gene polymorphisms and variations in immune responses to a particular gene must be pinpointed when it is clear that a particular drug either suppresses or augments the transcription of an immune response gene. 127, 128 this could help in designing vaccines or vaccine adjuvants that circumvent restrictions due to immunological differences arising from varying genetic compositions. 129, 130 personalized vaccines stem from our understanding of how, within the human leukocyte antigen (hla) system, also referred to as the major histocompatibility complex (mhc), t cells are able to recognize peptides of pathogenic origin. 131, 132 hla molecules enjoy the double advantage of having stable polymorphisms and being fully characterized. 133 these advantages make them good candidates for personalized vaccine design. hla polymorphism, although stable, is complex. for instance, more than 12,000 alleles of hla class i molecules and more than 4000 alleles of class ii molecules have been identified among human populations.
133, 134 hla class i and ii molecules are heterodimeric, comprising α and β chains, with three highly variable extracellular domains (α1, α2 and α3) followed by transmembrane and intracytoplasmic domains that are less variable. 133, 135 hla genes contain eight exons. exon 1 encodes the leader peptide; exons 2, 3 and 4 encode the α1, α2 and α3 extracellular domains, respectively, for mhc class i, or the α1, β1 and α3 domains, respectively, for mhc class ii; exon 5 encodes the transmembrane anchor; exons 6 and 7 encode the cytoplasmic tail; and exon 8 encodes the 3ʹ-untranslated region. 135 most of the polymorphism in the class i and class ii genes is found in the α1 and α2 domains (class i) and in the α1 and β1 domains (class ii). 133 mhc i and ii bind peptides and present them to t cells. t-cell responses to viral pathogens differ from one patient to another, basically because patients express different hla (mhc) alleles, which determine the viral amino acid sequences presented to the t cells. 136, 137 it is most likely that during an infection diverse epitopes are presented to the t cells, owing to the many forms of hla alleles and because each person expresses six hla class i and six hla class ii alleles. 138 the peptides that bind a given hla (mhc) molecule are now mostly predicted by prediction servers on the basis of particular binding motifs and anchor residues. 139, 140 anchor residues are known amino acids located at defined positions in the peptide chain that are peculiar to each mhc molecule. 141, 142 databases of peptide motifs and of mhc ligands can be accessed through web-based prediction servers, including those of the netmhc family.
143, 144 in another example, analysis of the immunoproteome sequences of lassa fever virus (lasv) and other viruses was used to identify the most immunogenic protein, to predict t-cell and b-cell epitopes, and to identify target sequences and binding sites. 145, 146 the ssnlykgvy peptide sequence at aa41-49 of glycoprotein 1 (produced by the l segment) was the best candidate epitope for the induction of humoral as well as cell-mediated immunity in a lassa fever vaccine construct. seventeen hla-i and 16 hla-ii molecules have been documented in sizable african populations, and their combination with the ssnlykgvy peptide sequence may prove useful in lassa fever virus endemic areas. 145 this approach could strongly improve individualized vaccination and help combat emerging infections. the hla region is suspected to contribute, to a large extent, to genetic propensity to infections and to differences in expected immune responses to vaccines. 132, 147 studies show that females exhibit stronger immune responses to immunization than males. 148, 149 there are differential antibody responses to rubella and measles viral proteins between males and females, and both hormonal and genetic differences may influence these immune responses. 148, 150, 151 practical issues may stand in the way of achieving personalized vaccinology. having to use different vaccines for different persons because of their personal genetic composition requires more time and labor during the vaccination process. also, screening for individual factors for targeted vaccination can significantly increase vaccination cost. despite these challenges, personalized vaccination is a new-age approach to achieving optimum immunization that takes into consideration individual immune differences in a particular population, and it marks a new dawn for vaccine development. personalized vaccine development is strongly improved by vaccinomics.
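the motif-based screening described earlier in this section, in which candidate peptides are checked for anchor residues at defined positions, can be illustrated with a short sketch. it slides a 9-residue window along a protein and keeps the windows whose anchor positions match a motif, reporting positions in the same 1-based "aa start-end" style used for the lassa epitope. the motif used (l or m at position 2, v or l at position 9) is a simplified textbook description of hla-a*02:01 binders and is an assumption of this example; the protein is an invented toy string, and real servers such as those of the netmhc family rely on trained models rather than a fixed motif.

```python
# Simplified anchor motif: 1-based peptide position -> allowed residues (assumption).
ANCHOR_MOTIF = {2: set("LM"), 9: set("VL")}


def scan_for_motif(
    protein: str,
    motif: dict[int, set[str]] = ANCHOR_MOTIF,
    length: int = 9,
) -> list[tuple[int, int, str]]:
    """Return (start, end, peptide) for every window matching all anchor positions.

    start and end are 1-based and inclusive, e.g. (41, 49, ...) for aa41-49.
    """
    hits = []
    for start in range(len(protein) - length + 1):
        peptide = protein[start:start + length]
        if all(peptide[pos - 1] in allowed for pos, allowed in motif.items()):
            hits.append((start + 1, start + length, peptide))
    return hits


if __name__ == "__main__":
    toy_protein = "QKLAAAAAAVRSMTTTTTTLG"  # invented sequence for illustration
    for start, end, peptide in scan_for_motif(toy_protein):
        print(f"aa{start}-{end}: {peptide}")
```

a match only means that the anchor positions are compatible with the motif; binding and immunogenicity still have to be confirmed by dedicated predictors and by experiment.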
the field of vaccinomics looks at how immune response gene polymorphisms affect the cell-mediated, humoral and innate immune responses to vaccine antigens at the population and individual levels. "vaccinomics" encompasses both immunogenomics and immunogenetics as they concern immune responses to vaccine antigens. 152 the fields of personalized vaccinology and vaccinomics were products of phase i of the international hapmap project and of the human genome project. modern molecular assay techniques permitting high-throughput detection of variation at the gene level, in particular linkage disequilibrium maps and single nucleotide polymorphisms (snps), also played significant roles in the development of personalized vaccinology and vaccinomics. it has also been shown that polymorphisms in vital immune response genes can bring about differing immune responses to biopharmaceutical products, including vaccines. [152] [153] [154] newer, accurate, cheap and reproducible sequencing technologies, validated databases containing genotype-phenotype data, and statistical and bioinformatics tools are needed to analyze and interpret data that will improve the quantifiability and predictability of adverse and immune responses to vaccines. 155 this information will enhance clinical practice and accelerate rational and directed vaccine development. safe vaccines are a critical requirement for any immunization program. 156 conventional vaccination has been targeted at all groups and individuals but has fallen short in enrolling pregnant women into vaccination programs because of presumed fetal and maternal harms. 157, 158 evidence on the safety of vaccination in pregnancy is very limited because pregnant women were usually excluded from participating in vaccine trials. 159 pregnancy can alter maternal as well as fetal immunological responses.
160 it is pertinent to explore the research opportunities presented by advanced vaccine design approaches such as immunoinformatics (multiepitope vaccine docking) by studying human immune system functions and responses specific to pregnant women and their unborn children. 157 according to a report 161 from the dominican republic of congo, during the 2016-2017 zika virus outbreaks, over a thousand pregnant women were suspected of being infected with the virus, and a sizable number were in their first trimester. the report further stated that fetal loss occurred in approximately one-tenth of the pregnancies and that there were up to 3 cases of fetuses with head circumferences smaller than normal. the widespread morbidity during the epidemic showed that zika virus infection adversely affects pregnancy outcome. 160, 161 currently, there is no proof that pregnancy predisposes to ebola virus infection in comparison with the non-pregnant population, but there is some evidence suggesting that pregnancy worsens the disease prognosis, including fetal loss. evidence also showed that the virus can cross the placental barrier to establish infection in the unborn child. 162 designing and implementing vaccine trials and programs that enroll pregnant women as well as prospective pregnant women is imperative in order to protect this group and ensure good vaccine uptake by them during infection outbreaks and epidemics. 157, 163 these recommendations can inform investigations that use immunoinformatic tools to determine the immunogenic responses relevant to safe vaccine development for pregnant and prospective pregnant women. maternal immunization offers palpable benefits in several ways. some vaccines are primarily administered to shield pregnant women from morbid conditions and/or death, including fetal death.
164, 165 pregnant women stand the risk of being exposed to virulent pathogens and may be at a higher risk of morbidity and/or mortality when compared to the general population. 166 there has been an explosion of new immunological data (table 4) due to an increase in research to understand the immune system pathways in infectious disease pathogenesis, and the application of bioinformatics has led to a better exposition of the importance of the immune system through immunoinformatics. with knowledge of the immune system and a cost-effective, specific and effective approach such as immunoinformatics, the concerns raised by emerging and re-surging diseases caused by pathogenic organisms, the antigenic variability and complex life cycles of pathogens, and the need for personalized vaccination can be addressed at the molecular level. the future of immunological research is sharpened by the ability to make discoveries in biologics (e.g., vaccines) more effectively and efficiently. this will mean reduction and better targeting of wet laboratory experiments and will only be possible if wet laboratory experimentation is combined with bioinformatics techniques.
• immunoinformatics depends on experimental science (wet laboratory) to produce raw data for analysis. the predictions are not formal proofs of any concepts. they do not replace the traditional experimental research methods of actually testing hypotheses.
• the quality of immunoinformatics predictions depends on the quality of the data and the sophistication of the algorithms being used. sequence data from high-throughput analysis often contain errors. if the sequences are wrong, or the annotations incorrect, the results of the downstream analysis will be misleading as well. 167
immunotargets and therapy 2020:9
time for t? immunoinformatics addresses vaccine design for neglected tropical and emerging infectious diseases vaccinology in the third millennium: scientific and social challenges antigenic variability: obstacles on the road to vaccines against traditionally difficult targets complex immune correlates of protection in hiv-1 vaccine efficacy trials immunoinformatics approach to design a novel epitope-based oral vaccine against helicobacter pylori immunoinformatics approach for multiepitopes vaccine prediction against glycoprotein b of avian infectious laryngotracheitis virus transcriptome and proteome: the rise of omics data and their integration in biomedical sciences neoantigen vaccine: an emerging tumor immunotherapy designing of cd8 + and cd8 + -overlapped cd4 + epitope vaccine by targeting late and early proteins of human papillomavirus immunization with a recombinant antigen composed of conserved blocks from tsa56 provides broad genotype protection against scrub typhus vaccine mediated protection against zika virus-induced congenital disease vaccination with sporozoites: models and correlates of protection novel approaches for the design, delivery and administration of vaccine technologies vaccination to gain humoral immune memory humoral and cell-mediated immune responses after a booster dose of hbv vaccine in hiv-infected children, adolescents and young adults an evaluation of the cold chain technology in south-east, nigeria using immunogenicity study on the measles vaccines a review of the immunological
mechanisms following mucosal vaccination of finfish reverse vaccinology 2.0: human immunology instructs vaccine antigen design large screen approaches to identify novel malaria vaccine candidates reverse vaccinology and subtractive genomics reveal new therapeutic targets against mycoplasma pneumoniae: a causative agent of pneumonia evolution of the immune system in humans from infancy to old age the twilight of immunity: emerging concepts in aging of the immune system pneumococcal vaccination strategies. an update and perspective progress and challenges in tb vaccine development vaccines for the elderly: current use and future challenges remembrance of things past: long-term b cell memory after infection and vaccination global practices of meningococcal vaccine use and impact on invasive disease new vaccine technologies to combat outbreak situations what are the most powerful immunogen design vaccine strategies? reverse vaccinology 2.0 shows great promise the new malaria vaccine program for african children is promising but still quite limited. quartz africa merck's ebola vaccine helps combat deadly outbreak in the congo as the virus spreads. biotech and pharma innovation for the 'bottom 100 million': eliminating neglected tropical diseases in the americas emerging and neglected infectious diseases: insights, advances, and challenges exploitation of reverse vaccinology and immunoinformatics as promising platform for genome-wide screening of new effective vaccine candidates against plasmodium falciparum the use of databases, data mining and immunoinformatics in vaccinology: where are we? 
systems vaccinology and big data in the vaccine development chain biochemical functional predictions for protein structures of unknown or uncertain function reverse vaccinology: the pathway from genomes and epitope predictions to tailored recombinant vaccines universal genotyping for tuberculosis prevention programs: a 5-year comparison with on-request genotyping integrating whole-genome sequencing data into quantitative risk assessment of foodborne antimicrobial resistance: a review of opportunities and challenges bioinformatics and drug discovery new technologies in predicting, preventing and controlling emerging infectious diseases how genomics can be used to understand host susceptibility to enteric infection, aiding in the development of vaccines and immunotherapeutic interventions in silico analysis of epitope-based vaccine candidates against hepatitis b virus polymerase protein advances in designing and developing vaccines, drugs and therapeutic approaches to counter human papilloma virus computational approaches and challenges to developing universal influenza vaccines. vaccines (basel) experimental and computational analyses reveal that environmental restrictions shape hiv-1 spread in 3d cultures dynamic computational model of symptomatic bacteremia to inform bacterial separation treatment requirements a model of plasmodium vivax concealment based on plasmodium cynomolgi infections in macaca mulatta a computational modeling approach predicts interaction of the antifungal protein afp from aspergillus giganteus with fungal membranes via its γ-core motif. 
while this review has not been funded directly by them, we gratefully acknowledge the drug design laboratory of faculty of pharmaceutical sciences, nnamdi azikiwe university, nigeria, and drug discovery africa. all the needed data are included in this manuscript. all authors contributed to data analysis, drafting or revising the article, gave final approval of the version to be published, and agree to be accountable for all aspects of the work. the authors report no funding and no conflicts of interest in this work. key: cord-136540-2h2braww authors: buehler, markus j. 
title: liquified protein vibrations, classification and cross-paradigm de novo image generation using deep neural networks date: 2020-04-16 journal: nan doi: nan sha: doc_id: 136540 cord_uid: 2h2braww in recent work we reported the vibrational spectrum of more than 100,000 known protein structures, and a self-consistent sonification method to render the spectrum in the audible range of frequencies (extreme mechanics letters, 2019). here we present a method to transform these molecular vibrations into materialized vibrations of thin water films using acoustic actuators, leading to complex patterns of surface waves, and to use the resulting macroscopic images in further processing with deep convolutional neural networks. specifically, the patterns of water surface waves for each protein structure are used to build training sets for neural networks, which aim to classify and further process the patterns. once trained, the neural network model is capable of discerning different proteins solely by analyzing the macroscopic surface wave patterns in the water film. not only can the method distinguish different types of proteins (e.g. alpha-helix vs hybrids of alpha-helices and beta-sheets), but it is also capable of determining different folding states of the same protein, or the binding events of proteins to ligands. using the deepdream algorithm, instances of key features of the deep neural network can be made visible in a range of images, allowing us to explore the inner workings of protein surface wave pattern neural networks, as well as the creation of new images by finding and highlighting features of protein molecular spectra in a range of photographic input. the integration of the water-focused realization of cymatics with neural networks, and especially generative methods, offers a new direction to realize materiomusical "inceptionism" as a possible direction in nano-inspired art. 
the method could have applications for detecting different protein structures, the effect of mutations, or uses in medical imaging and diagnostics, with broad impact in nano-to-macro transitions. proteins are the basic building blocks of life. they form materials as diverse as silk, cells, and hair, and also serve other functions, from enzymes to drugs to pathogens like viruses [1] [2] [3] [4] [5] [6]. while we cannot see small nanoscopic objects like proteins or other molecules, a common feature is their continuous motion, or vibration, that can be understood as an overlay of fundamental normal modes each consisting of harmonic waves. in recent work [1] we reported the vibrational spectrum of more than 100,000 known protein structures, and a self-consistent sonification method to render their complex vibrational spectrum in the audible range of frequencies. the sonification work has also been used to train generative neural networks to facilitate the design of de novo proteins using machine learning [2, 7]. this article focuses on a different perspective and reports a distinct, complementary and translational approach, in which we transform these molecular vibrations into vibrations of thin water films using acoustic actuators, leading to visual images of complex materialized patterns of surface waves. this approach, which follows the pioneering concept developed by chladni [8] and is similar to the cymatics method [9] [10] [11] [12] [13] [14], offers a distinct means to assess the molecular details of protein vibrations in the form of macroscopic vibrations visible to our eyes. in addition to potential applications in outreach, the physical manifestation of molecular vibrations at the macroscale provides a novel avenue to render sound visible, providing an alternative method to conventional musical notation [11] and novel approaches to interact with sound using senses other than our ears [15] [16] [17] [18]. 
moreover, the use of artificial intelligence provides an exciting avenue to classify, process and understand, or augment images generated by sound. we will explore some of these augmentation concepts in this paper, by highlighting key features of images as detected by distinct layers in deep neural networks. these connections between sound, materialization and images can offer many avenues for future research, enabled by significant progress made in recent years in computer vision [2] [19] [20] [21] [22] [23] [24] [25]. figure 1a shows the overall flow of the research reported here. the approach includes the calculation of molecular vibrational spectra transposed to audible frequencies and made audible (see earlier work for details [1, 6]), which are then used to excite a thin film of water. images of surface wave patterns are collected, building a training set for a machine learning model. the predictive power of the classifier model is validated, showing that the model can correctly determine the protein structure solely based on images of the surface wave patterns. figure 1b depicts an overview of eight distinct protein structures (chosen to reflect different complexity and size in their molecular structure) that are investigated in this paper. sonification is a method to translate data structures into audible signals, which has been explored widely as a means to better understand scientific data in a range of application areas including spider webs, proteins, and other systems [26] [27] [28] [29] [30] [31] [32] [33] [34]. here, we utilize the protein synthesizer published in earlier work [1] and generate various audio signals that are fed into an actuator attached to a petri dish with a thin layer of water. all protein codes are expressed as protein data bank (pdb) [35] identifiers, and the structures can be downloaded and further explored at www.rcsb.org. the experimental setup is shown in figure 2. the system consists of an actuator attached to a petri dish. 
the actuator (dayton audio daex25 sound exciter) is driven by an analog amplifier, which receives the analog audio signal directly from a digital-to-analog (d/a) audio interface in a laptop computer. a 2,100 lumens led light source is used for illumination during image capturing (winplus, led folding worklight). the actuator is fixed permanently to a solid and immobile substrate with high mass. the petri dish has been spray-painted white for a clear optical signal (unless indicated otherwise; some images were generated with a reflective aluminum foil background for better contrast). water is added to the petri dish to a height of 1 mm. the higher the water level, the more actuation energy is needed, so we pick a level that provides significant surface wave generation at the available actuation energy. a camera is mounted at a fixed location for consistent images. the digital audio workstation (daw) ableton live [36] and max/msp [37] patches, as described in earlier work [1], are utilized here to generate the audio signals; these provide implementations of the protein synthesizer. a few simple, pure sine waveforms are generated as well, for comparison. images are taken in the form of a continuous video, recorded at 60 fps and 1920x1080 resolution with a samsung galaxy s10+ 5g camera (the camera combines a primary 12 mp camera and variable-aperture lens with a 16 mp ultra-wide-angle component). the video is stored, and subsequently individual frames are extracted using python code. the images are cropped consistently so as to focus on the surface wave structures inside the petri dish, without showing a boundary. each image has a size of 585x788 after such processing, and before being fed into the neural network. no additional image processing is conducted. figure 3b shows an example of the surface wave structures that emerge in the experiment, as used in the deep neural network training. 
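the frame extraction and cropping steps described above can be illustrated with a minimal sketch. it is not the code used in the study: the text does not say where the crop window sits or how many frames are kept, so the centered window, the reading of 585x788 as width x height, and the `every_nth` sampling step are all assumptions made for illustration.

```python
def center_crop_box(frame_w, frame_h, crop_w, crop_h):
    """Return (left, top, right, bottom) of a crop window centered in the frame.

    Centering is an assumption; the study only states that frames are
    cropped consistently to the region inside the petri dish.
    """
    if crop_w > frame_w or crop_h > frame_h:
        raise ValueError("crop window is larger than the frame")
    left = (frame_w - crop_w) // 2
    top = (frame_h - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


def sampled_frame_indices(duration_s, fps, every_nth):
    """Indices of the frames kept when every n-th frame of a video is extracted."""
    total_frames = int(duration_s * fps)
    return list(range(0, total_frames, every_nth))


# 1920x1080 video at 60 fps, cropped to 585x788 (assumed to be width x height).
box = center_crop_box(1920, 1080, 585, 788)
# hypothetical sampling: a 10 s clip, keeping every 6th frame.
frames = sampled_frame_indices(duration_s=10, fps=60, every_nth=6)
```

the same crop box would be applied to every extracted frame, so that all images passed to the network show the same region of the dish.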
the majority of images are used for training (80%), and 20% for validation. to test the prediction of the model, new data from additional imaging of the experiment is used. two neural networks are considered in this work. the first is an existing resnet neural network model as reported in [38] that has been pre-trained against a large dataset (https://tfhub.dev/tensorflow/resnet_50/feature_vector/1). this tensorflow hub model uses the implementation of resnet with 50 layers, and contains a pre-trained instance of the network, arranged to get feature vectors from images. we use transfer learning to retrain all parameters in the neural network, by adding a dense layer after the feature layer to add a new set of classifications to reflect the distinct audio sources of molecular spectra. the resnet model features 23,587,789 parameters [38]. the second is a custom designed neural network with far fewer parameters (consisting of alternating convolutional and pooling layers), as a comparative approach. figure 4 depicts a summary of this neural network design. the model consists of multiple sequences of convolutional and pooling layers, following a standard image classifier design. the custom designed neural network features a total of 945,165 parameters, and 20 layers. we use deepdream [39] to generate novel images by activating select layers in the deep neural network, based on the model trained against the water surface images. this method is an approach to visualize the patterns learned by a neural network, and results in novel images. it can be understood as similar to interpreting structures in clouds, for instance, where we try to read shapes and forms into abstract patterns. in the case of the neural network, patterns it sees in a given image are interpreted and then accentuated. the method works by processing an image through a neural network, then calculating the gradient of the image with respect to the activations of a set of layers. 
the image is then adapted to enhance the patterns seen by the neural network. in processing the images, we explore different layers for the activation algorithm, each resulting in distinct features being highlighted in the image (e.g. internal convolutional layer vs. internal pooling layers). the purpose of this computation is to explore the inner structures and features learned by the neural network, and make them visible. we collect a number of images, around 1,000 for each case, and feed the labeled images into the neural network training process. a total of 13 labels is used, reflecting 13 distinct frequency spectra corresponding to the individual protein structures considered here. the audio signals of a consistent, stationary spectrum corresponding to the molecular vibrational signature of each of these structures are fed into the actuator. the cases included in the training set are:
- flat: no actuation (i.e., no audio signal)
- pure sine waves: 3 distinct frequencies (sine33 = 50 hz, sine43 = 98 hz, sine65 = 349 hz)
- various protein structures, including protein data bank ids: 6vsb, 107m, 5xdj, 6m17, 6m18 (and others).
figure 1b depicts an overview of all proteins considered in this study, where the first row represents relatively simple structures (added complexity from left (short alpha-helix) to right (more complex alpha-helical fold)). the lower row features more complex protein structures, including two spike proteins from the pathogen of covid-19 (two distinct molecular states, 6vxx and 6vsb), as well as the covid-19 spike protein bound (6m17) and unbound (6m18) to the human ace2 receptor. figure 5 depicts the results of the training process for both neural network formulations. both cases show reasonable convergence. the pre-trained resnet case (figure 5a) converges faster and in a more stable manner, however. this makes sense given that the model starts from pre-trained parameters. 
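the dataset assembly described in this section (around 1,000 labeled images per case, with 80% used for training and 20% for validation) could be organized along the following lines. this is a sketch under stated assumptions rather than the procedure used in the study: the per-label split, the fixed random seed, the file names and the four example labels are all hypothetical.

```python
import random


def split_by_label(files_by_label, train_fraction=0.8, seed=0):
    """Split image files into training and validation sets, label by label.

    Splitting within each label keeps the 80/20 ratio for every case;
    whether the study split per label or over the pooled set is not stated.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, validation = [], []
    for label, files in sorted(files_by_label.items()):
        shuffled = list(files)
        rng.shuffle(shuffled)
        cut = int(round(train_fraction * len(shuffled)))
        train += [(name, label) for name in shuffled[:cut]]
        validation += [(name, label) for name in shuffled[cut:]]
    return train, validation


# hypothetical file names for four of the cases, 1,000 frames each.
demo = {
    label: [f"{label}_{i:04d}.png" for i in range(1000)]
    for label in ("flat", "sine33", "6m17", "6m18")
}
train_set, val_set = split_by_label(demo)
```

images used to test the trained models would come from separate recordings, as the text specifies, and would not pass through this split at all.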
figure 6 shows results of a classification test suite conducted with the two trained models, depicting averaged values for a series of unknown, new images, fed to the neural network. the radar plot shows the score for each of the 13 labels (each on a scale from 0 to 1.00). the data shows that both models predict the cases very well, and without error. it can also be seen that the score for the correct prediction is quite pronounced, showing that the classifier works very well and can distinctly classify the correct protein structure at the origin of the acoustic actuation. this can be verified by the fact that the highest score in the plot is close to 1.00, and that there is a distinct peak for each of the cases. going into more detail, we analyze a few specific cases. for instance, the orange color represents the scores for protein 1akg. the highest score, in both models, is for the label 1akg, which indicates correct classification. for 3tnu, the highest score is 3tnu, and so on. notably, both models are capable of predicting the correct labels, solely from images of surface waves, for all cases studied. it is remarkable that the method works for a variety of protein structures with distinct frequency spectra, from complex to simple, including the pure sine wave cases at different frequencies. it also renders correct predictions for proteins with distinct geometric dimensions. for instance, 1akg is a very small protein that features only a simple fold and a short alpha-helix segment. 3tnu is a small alpha-helix fragment without any folding. in contrast, 6vsb is a very large protein. other interesting cases are 6m17 and 6m18. the protein 6m17 shows only minute distinctions from 6m18 in terms of its vibrational spectrum, as is shown in figure 7. yet, the model can distinguish the cases very well. figure 10 shows a similar experiment, this time applied to the same image, but considering the result for using individual layers to generate features. 
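the averaged-score evaluation behind the radar plot in figure 6 can be sketched as follows: per-image score vectors are averaged label by label, and the label with the highest mean score is taken as the prediction. the score values below are invented for illustration and are not results from the study; the three-label list is likewise a hypothetical stand-in for the 13 labels.

```python
def average_scores(score_rows):
    """Average per-image score vectors column by column (one column per label)."""
    n_images = len(score_rows)
    return [sum(column) / n_images for column in zip(*score_rows)]


def predict_label(labels, score_rows):
    """Return the label with the highest averaged score, plus the averaged scores."""
    mean = average_scores(score_rows)
    best = max(range(len(labels)), key=mean.__getitem__)
    return labels[best], mean


# hypothetical classifier outputs for three new images of the same case.
labels = ["flat", "1akg", "3tnu"]
rows = [
    [0.02, 0.95, 0.03],
    [0.01, 0.97, 0.02],
    [0.03, 0.93, 0.04],
]
label, mean = predict_label(labels, rows)
```

a pronounced peak, i.e. one averaged score close to 1.00 with the rest near zero, corresponds to the distinct classification the text describes.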
the collection of example images visualizes features seen by the internal layers of the deep neural network, by selecting different layers. figure 11 shows an analysis of images with deepdream, using the inception deep neural network [19]. unlike in the cases discussed above, here the neural network model used is taken "as trained" against a large number of conventional images (and has not been trained against the dataset reported here). it is clear that the conventional image classification model features very distinct patterns, reflecting the characteristic features learned in a conventional image classifier. it offers interesting insights into what image features are seen by a model like inception in the abstract representation of surface wave patterns induced by proteins. the new experimental-computational method reported here provides a new way to analyze and visualize molecular vibrations expressed in sound. while the method could in principle be used to predict visuals of conventional audio signals (e.g. specific musical instruments, songs, varied classes of music, etc.), the focus in this paper was on determining the protein structure solely from the pattern of surface waves. while cymatics, or the general attempt to turn sound into images, is a long-explored concept (dating back to the original works in the late 1700s [8, 9]), here we report for the first time a method to visualize molecular vibrations in protein structures in that way, and offer a path to integrate it with artificial intelligence methods. the integration of the water-focused rendering of cymatics with neural networks, and especially generative methods, offers a new direction to realize materiomusical "inceptionism" [39] as a possible future direction in nano-inspired art. our method shows that even minute changes in the molecular vibrational spectra (e.g. 6m17 vs 6m18, as shown in figure 6) can be detected through this method, as well as different states of proteins (e.g. 
different molecular states of the covid-19 virus spike protein, 6vxx vs 6vsb). since we now have the vibrational spectra of all known protein structures available [1], the approach reported here could potentially be used to build a very large dataset to render visual macroscopic representations of all known proteins. while this is beyond the scope of the present study, which focused on the establishment of the basic methodology, such nano-to-macro translations could find applications in a variety of fields. further work could also be done by providing detailed fluid-structure-interaction models of how the specific patterns emerge. such research may provide a theoretical basis to complement the experimental research done here. in terms of limitations, the method reported here can only classify protein structures that the neural network has been trained against. however, future work could seek to generalize the method. another limitation is the dependence of the images on the experimental setup. it is likely that the patterns of surface waves depend on the particular geometries of the experimental setup, such as the water level, the size of the container, the power output of the actuators and so on. more broadly, looking ahead, the method can find applications as a means to accentuate images, find patterns in other biomaterials, or to use the approach in detecting patterns across manifestations of domains. figure 1: a, overall flow of this research, ranging from the molecular vibrational spectra transposed to audible frequencies, and used to excite a thin film of water using an actuator. images of surface wave patterns are collected, building a training set for a machine learning model. the predictive power of the classifier model is validated, showing that the model can correctly determine the protein structure solely based on images of the surface wave patterns. b, overview of the various proteins studied in this paper. 
the first row represents relatively simple structures (added complexity from left (short alpha-helix) to right (more complex alpha-helical fold)). the lower row features more complex protein structures, including two spike proteins from coronaviruses (6vxx and 6vsb of covid-19), as well as the covid-19 spike protein bound (6m17) and unbound (6m18) to the human ace2 receptor. figure 2: experimental setup used in this study. a thin water film in a petri dish is excited by an actuator, resulting in surface waves, as can be seen in the right column labelled "top". the resulting structures are captured using a high-resolution camera, and collected as a training set for machine learning. figure 6: results of the classifier as applied to a new data set that wasn't included in training or validation. the data is plotted in a radar plot (1.0 = the neural network is capable of correctly labeling the image, 0.0 reflects worst performance). we plot averaged scores, providing average values over classifications for multiple images. the data shows that both neural networks are capable of identifying different protein structures, as well as the test cases of sine waves (at different frequencies) and silence. for instance, the orange color represents the scores for 1akg. the highest score, in both models, is for the label 1akg, which indicates correct classification. for 3tnu, the highest score is 3tnu, and so on. both models are capable of predicting the correct labels, solely from images of surface waves, for all cases studied. figure 7: in spite of the small differences in the structure and the frequency spectrum as depicted in the figure, the machine learning model is capable of easily distinguishing the patterns generated from these audio signals. a, normal mode frequencies of 6m17 and 6m18 over the mode number. b, spectrogram of the resulting audio signal, using the method reported in [1]. 
an application of deepdream [39] to generate novel images by activating select layers in the deep neural network, based on the model trained against the water surface images. this approach offers a new way to design generative art from molecular vibrations. the top image is the original, the bottom the processed image. this example shows how the method can be a powerful way to better understand features seen and learned in the neural network, which can be used to accentuate key elements in images. figure 11: analysis of images with deepdream, using the inception deep neural network [19] (note, this model is taken "as is", and has not been trained against the dataset reported here). it is clear that the conventional image classification model features very distinct patterns, reflecting the characteristic features learned in a conventional image classifier. the left column shows the processing resulting from a mountain landscape (same as in figure 9, right column) for comparison. the right column shows the water surface wave pattern as source image. analysis of the vibrational and sound spectrum of over 100,000 protein structures and application in sonification a self-consistent sonification method to translate amino acid sequences into musical compositions and application in protein design using ai reoccurring patterns in hierarchical protein materials and music: the power of analogies materials by design -a perspective from atoms to structures mater. impuls. 
natur, acatec nanomechanical sonification of the 2019-ncov coronavirus spike protein through a materiomusical approach sonification based de novo protein design using artificial intelligence, structure prediction and analysis using molecular modeling entdeckungen über die theorie des klanges cymatics for the cloaking of flexural vibrations in a structured plate experimental study of cymatics, wcse 2012 -int cymasense: a real-time 3d cymatics-based sound visualisation tool discovering an uncanny world: cymatics software and the journey to the creation of knowledge within the field of contemporary cymatics nanoscale self-organization using standing surface acoustic waves "cymatics" of selenium and tellurium films deposited in vacuum on vibrating substrates interacting with 3d reactive widgets for musical performance imaging soundscapes: identifying cognitive associations between auditory and visual dimensions synesthesia and cross-modality in contemporary audiovisuals deep residual learning for image recognition accelerated search and design of stretchable graphene kirigami using machine learning human-level control through deep reinforcement learning bioinspired hierarchical composite design using machine learning: simulation, additive manufacturing, and experiment convolutional lstm network: a machine learning approach for precipitation nowcasting gradient-based learning applied to document recognition world cloud: a prototype data choralification of text documents sonification report: status of the field and research agenda sublime frequencies: the construction of sublime listening experiences in the sonification of scientific data research set to music: the climate symphony and other sonifications of ice core, radar, dna, seismic and solar wind data a systematic review of mapping strategies for the sonification of physical quantities tuning the instrument: sonic properties in the spider's web decoding the locational information in the orb web vibrations of 
araneus diadematus and zygiella x-notata tangled web imaging and analysis of a three-dimensional spider web architecture a series of pdb related databases for everyday needs deep residual learning for image recognition inceptionism: going deeper into neural networks acknowledgements: this work was supported by mit cast via a grant from the mellon foundation, with additional support from onr (grant # n00014-16-1-2333) and nih u01 eb014976. key: cord-018437-yjvwa1ot authors: mitchell, michael title: taxonomy date: 2013-08-26 journal: viruses and the lung doi: 10.1007/978-3-642-40605-8_3 sha: doc_id: 18437 cord_uid: yjvwa1ot this chapter addresses the classification and taxonomy of viruses with special attention to viruses that show pneumotropic properties. information provided in this chapter supplements that provided in other chapters in parts ii–v of this volume that discuss individual viral pathogens. taxonomy may be defined as a logical discipline for the identification and classification of biological entities based on objective, measurable characteristics of relevant entities. useful taxonomic systems should be broadly applicable across diverse types of biological groups. they should also be flexible, so that new data from technological advances may be integrated into the classification scheme. primary goals of systemic taxonomy, regardless of biological discipline, include the following:
• establishing groups (taxa) that reflect varying degrees of evolutionary relatedness among the different biological entities studied
• establishing criteria for assignment of known or unknown clinical isolates to a given group
• establishing a clear and unequivocal nomenclature
the origins of biological taxonomy are firmly rooted in botany and zoology. early taxonomic systems relied on gross characteristics, like biological niche, internal and external morphology, reproductive strategies and compatibilities, and fossil records. 
the seminal works of the swedish botanist carl linnaeus used a hierarchical scheme to represent biological relatedness and established the simplified binomial system of nomenclature that serves as the basis for modern classification systems. the modern scientific classification in biology is designed to describe all biological entities within a hierarchy consisting of the following taxa: domain, kingdom, phylum, class, order, family, genus, and species. a basic assumption underlying such a hierarchy is that all biological entities have evolved from a single common cellular life-form. different biological entities have evolved as a result of accumulated changes in dna that have provided survival advantages in different ecological niches. species may be classified on the basis of phylogenetic and evolutionary relatedness: members of a given species are the most closely related, different species within a single genus are more closely related to each other than to a species within a different genus, and so on. newer technologies like microscopy, improved biochemical and physiological analysis, and advanced protein and molecular analytical methods have resulted in an enormous expansion of characteristics that may be studied for the classification of biological entities and validation of taxonomic systems (woese et al. 1990). there are a number of excellent texts that discuss the clinical and laboratory aspects of virus biology (knipe and howley 2007; richman et al. 2009; versalovic 2012).

though viruses are certainly "biological entities," they are fundamentally different from the cellular life-forms classified by previous taxonomic schemes. viruses have no autonomous metabolic or replicative ability; they are completely dependent on cellular life-forms. however, within their biological milieu, viruses do replicate and evolve, and they are composed of the same types of organic macromolecules as are cellular life-forms.
because of their intimate relationship with cellular life-forms, it seems legitimate to integrate the schemes for classification of viruses with the schemes used for biological classification of cellular life-forms (lefkowitz 2012). initially, various features, like host range, cross-immunity, clinical disease, and pathologic features, were used to classify viruses. technological advances have led to more detailed and integrated classification, taxonomy, and phylogenetic characterization (evolutionary relatedness) of viruses. sophisticated nucleic acid sequence analysis has emerged as a powerful tool for virus classification and phylogenetic determination, in spite of some limitations (holmes 2008; mccormack and clewley 2002; zanotto et al. 1996).

a robust system for classification of viruses developed by david baltimore has gained wide acceptance (baltimore 1971). classification is based on the genomic nucleic acid used by the virus (dna or rna), strandedness (single or double stranded), and method of replication. the system has been used to define seven classes of viruses:
class i: double-stranded dna (dsdna)
class ii: single-stranded dna (ssdna)
class iii: double-stranded rna (dsrna)
class iv: positive-sense single-stranded rna
class v: negative-sense single-stranded rna
class vi: single-stranded rna replicating through a dna intermediate (reverse transcription)
class vii: double-stranded dna replicating through an rna intermediate (reverse transcription)

the primary classification of viruses is into species. a virus species is defined as a polythetic class of viruses that constitute a replicating lineage and occupy a specific ecological niche (international committee on taxonomy of viruses 2002). in polythetic classifications, group members share a number of characteristics, but no single characteristic is necessary or sufficient to define members of the group. higher-level taxa are monothetic, i.e., there are characteristics that are necessary and sufficient to define members of the class. it is important to note that not all viruses can be assigned through all taxonomic levels. virus species may be assigned to a genus or remain unassigned.
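the idea that a virus need not be assigned at every taxonomic level can be illustrated with a small sketch. the record type `Lineage`, its field names, and the `RANKS` tuple are hypothetical and chosen only for illustration; the two example lineages use assignments stated elsewhere in this chapter (adenoviridae is not assigned to an order, filoviridae belongs to mononegavirales), and `None` simply means that no assignment is recorded at that rank.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Rank names from highest to lowest; a hypothetical layout for illustration only.
RANKS = ("order", "family", "subfamily", "genus", "species")


@dataclass(frozen=True)
class Lineage:
    """A virus species with optional assignments at the higher ranks."""
    species: str
    genus: Optional[str] = None
    subfamily: Optional[str] = None
    family: Optional[str] = None
    order: Optional[str] = None

    def assigned(self) -> Dict[str, str]:
        """Return the ranks that carry an assignment, highest rank first."""
        return {r: getattr(self, r) for r in RANKS if getattr(self, r) is not None}

    def unassigned(self) -> List[str]:
        """Return the ranks with no recorded assignment, highest rank first."""
        return [r for r in RANKS if getattr(self, r) is None]


# Example lineages taken from the family descriptions in this chapter.
adenovirus_c = Lineage(
    species="Human adenovirus C",
    genus="Mastadenovirus",
    family="Adenoviridae",  # family not assigned to an order
)
zaire_ebolavirus = Lineage(
    species="Zaire ebolavirus",
    genus="Ebolavirus",
    family="Filoviridae",
    order="Mononegavirales",
)

if __name__ == "__main__":
    for lineage in (adenovirus_c, zaire_ebolavirus):
        print(lineage.species, "| unassigned ranks:", lineage.unassigned())
```

optional fields keep "unassigned" distinct from "unknown to the program" only by convention; a fuller model would need an explicit marker for ranks that were deliberately left unassigned.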
similarly, a genus may be assigned to a family or subfamily, or remain unassigned, and so on up the taxonomic hierarchy. each genus has a type species. the type species is the virus that necessitated the creation of the genus; it is always linked to the genus. in the most recent publication (2012), the international committee on taxonomy of viruses (ictv) recognized 7 orders, 96 families, 22 subfamilies, 420 genera, and 2,618 species. important characteristics used by the ictv to define and classify viruses within these taxa include the following:
• susceptible host range: most viruses have a restricted range of hosts which they are able to infect.
• virus structure: the viral genome is surrounded by a protective shell of proteins called a capsid. the capsid may also enclose proteins, like reverse transcriptase or proteins required for organization of the nucleocapsid. a nucleocapsid refers to the viral nucleic acid surrounded by an intact capsid. the nucleocapsids of certain viruses are also surrounded by an envelope of host-derived membranes. the complete virus particle is referred to as a virion. icosahedral capsids are very common; these quasi-spherical shells are composed of 20 identical equilateral triangles with 30 edges and 12 vertices. icosahedral capsids are very efficient geometrically (internal volume versus protein content) and genetically (many small sides require fewer and smaller genes to code for capsid proteins). the nucleocapsid proteins of some viruses, like the influenza viruses, form helical tubes with the nucleic acid incorporated directly into the helical structure. the nucleocapsids of some viruses are surrounded by envelopes composed of lipid bilayers and host- or viral-encoded proteins. envelopes are typically acquired by budding of the nucleocapsid through a virally modified portion of a specific host-cell membrane (plasma, endoplasmic reticulum, golgi, nucleus). the shape of the virus nucleocapsid or intact virion is usually determined by electron microscopy.
the shape and dimensions of the nucleocapsid and intact virion, and the presence or absence of an envelope, are useful characteristics for classifying viruses.
• genome: the viral genome is either dna or rna; the nucleic acids may be single or double stranded. the genome size may be expressed in terms of kilobases (kb) for single-stranded genomes or kilobase pairs (kbp) for double-stranded genomes. the sequence of genes of positive-sense ssrna may be directly translated by the host into viral proteins. the sequence of negative-sense ssrna is complementary to the coding sequence for translation, so mrna must be synthesized by rna polymerase, typically carried within the virion, before translation into viral proteins. the sequence of positive-sense ssdna is the same as that of the mrna coding for viral proteins; negative-sense ssdna is complementary to mrna and may be transcribed into mrna for viral protein synthesis. ambisense single-stranded nucleic acids use both positive-sense and negative-sense sequences. the viral nucleic acid may be linear or circular; the nucleic acid may be in the form of a single molecule or broken into two or more segments. in addition to the type of nucleic acid, the size of the viral genome, measured in number of bases or base pairs, is an important characteristic used for classification.
• nucleic acid sequence analysis: the analysis of specific viral nucleic acid sequences is increasingly used as a powerful tool for taxonomic assignment and assessment of evolutionary relatedness. the utility is greatest for related groups of viruses (lauber and gorbalenya 2012a, b), but has been challenging for more divergent groups of viruses. sequence analysis alone has not provided a reliable single criterion on which all viruses may be classified. construction of a universal phylogenetic tree for viruses, as has been proposed for cellular life-forms, may not be possible for viruses.
it is not clear that all viruses emerged from a single progenitor virus; there is evidence for multiple, independent origins of existing viruses. phylogenetic analysis using nucleic acid sequences is further complicated by recombination, reassortment, incorporation of host nucleic acid sequences, and other factors (domingo 2007; holmes 2011). currently, expert consensus, considering laboratory, phenotypic, clinical, and other characteristics, remains the most accurate and robust method for the classification and taxonomic assignment of viruses.

note that the formal names assigned at all taxonomic levels are italicized, while the common names, which are often used clinically, are not italicized. the viruses that have been associated with human infections are shown in table 3.1. among the families of viruses able to infect humans and other vertebrate hosts, there are many species that target and cause disease in the lung. these viruses are commonly transmitted by the airborne route between an infected host and a new susceptible host. characteristics of viruses that directly or indirectly cause pulmonary disease are discussed in this section.

adenoviridae: adenoviruses are pathogenic for humans and other vertebrate species. a structural protein at each of the 12 vertices of the icosahedral nucleocapsid anchors a rodlike projection with a terminal knob, which interacts with specific host surface receptor molecules and which confers the hemagglutination pattern and tissue tropism for the different groups of adenoviruses. the genome encodes ~40 genes (davison et al. 2003a), including common genes and species-specific genes. genes are grouped into early, delayed early, and late transcribed genes. the genome contains inverted repeat sequences at both ends. sequences of both dna strands are transcribed to mrna; mrna splicing is used for expression of many adenovirus genes. the family adenoviridae has not been assigned to an order.
within this family, there are five genera. the seven species that cause human infection are human adenovirus a, b, c, d, e, f, and g, all within the mastadenovirus genus; there are 57 accepted serotypes (buckwalter et al. 2012). endemic respiratory infections are most commonly caused by serotypes of human adenovirus c (the type species of the genus); most epidemic respiratory infections are caused by serotypes within the species human adenovirus b and human adenovirus e.

arenaviridae: arenaviruses may cause several hemorrhagic fever syndromes. specific rodents are the reservoir for each arenavirus; human disease is incidental and is usually transmitted by infectious aerosols. viruses of this family are enveloped; evenly spaced glycoprotein complexes (a tetramer of viral gp2 with viral gp1 ionically bound as a globular head) are attached to the envelope, giving complete virions a studded spherical morphology. complete virions are ~100 nm in diameter, but show significant pleomorphism (range, 60-300 nm). the genome is divided into two segments which are complexed with nucleoproteins (peters 2009). complementary sequences at the 3′ and 5′ ends of each segment result in the formation of two circular nucleocapsids. arenaviruses use both negative-sense and ambisense coding strategies. host ribosomes are often incorporated within the envelope of complete virions. this family of viruses is not assigned to an order. there is one genus, arenavirus, with 25 species that fall into two complexes on the basis of serologic and genetic relatedness. the old world, or african, species include lassa virus (lassa fever) and lujo virus. the new world species include guanarito virus (venezuelan hf), junín virus (argentine hf), and machupo virus (bolivian hf). the type species of the genus arenavirus is lymphocytic choriomeningitis virus.

bunyaviridae: bunyaviruses may cause several hemorrhagic fever syndromes.
coronaviridae: transmembrane proteins produce blunt projections from the surface of coronaviruses, resulting in a "crown-like" appearance on electron microscopic studies; virions are 100-160 nm in diameter. expression of the coronavirus genome is unusual and includes production of polyproteins, discontinuous synthesis, overlapping reading frames, ribosomal frame shifting, and post-translational proteolytic processing (marra et al. 2003; rota et al. 2003; theil et al. 2003). the major structural proteins, spike glycoprotein (s), membrane glycoprotein (m), nucleocapsid phosphoprotein (n), and envelope protein (e), are present in all coronaviruses; the hemagglutinin-esterase glycoprotein (he) is present only in some betacoronaviruses. nonstructural proteins are encoded in 5-10 unique or overlapping reading frames (lai et al. 2007). the human coronaviruses are assigned to the order nidovirales, family coronaviridae, and subfamily coronavirinae. there are four genera and three serological groups. relevant viruses include human coronavirus 229e and human coronavirus nl63 of the genus alphacoronavirus (antigenic group i), and human coronavirus hku1, betacoronavirus 1, and severe acute respiratory syndrome-related coronavirus of the genus betacoronavirus (antigenic group ii).

filoviridae: filoviruses may cause several hemorrhagic fever syndromes. the filoviruses have a unique threadlike morphology. the helical nucleocapsids are surrounded by an envelope studded by spikes formed by a single type of glycoprotein (gp). the genome consists of a single segment of negative-sense ssrna that encodes seven proteins (kuhn et al. 2010). the presence of gene overlap for several genes is an unusual feature of filoviruses. in ebolaviruses, the surface glycoprotein is encoded by two adjacent reading frames. a truncated version (sgp), which lacks the hydrophobic anchor, results from translation of the upstream reading frame only.
this protein is secreted from cells and may serve as a decoy for the host's immunological response. the full-length gp is formed only when the rna polymerase misreads a poly-u editing site between the reading frames. the full-length gp is inserted, as homotrimers, into the host membranes that will form the virion envelope. a helical nucleocapsid is formed by association of the ssrna with nucleoproteins. the nucleocapsid is ~50 nm in diameter, with a central axial space ~20 nm in diameter. the nucleocapsid is attached to the envelope by matrix protein. the complete virions are ~80 nm in diameter, but the virion length may vary from 800 to 10,000 nm. the family filoviridae is assigned to the order mononegavirales. there are two genera within the family: ebolavirus and marburgvirus. there are five ebolavirus species, including sudan ebolavirus and zaire ebolavirus (the type species). the genus marburgvirus consists of one species, marburg marburgvirus. humans and nonhuman primates are susceptible to ebolavirus and marburgvirus infection; the host reservoirs for these viruses are unknown. humans may be infected sporadically by presumed contact with the host species or by direct contact with virus-containing body fluids taken from acutely infected humans or nonhuman primates. nosocomial and laboratory-acquired infections are well described.

flaviviridae: flaviviruses may cause several hemorrhagic fever syndromes. hepatitis c virus is also a flavivirus species. flaviviruses are surrounded by an envelope studded with dimers of viral e glycoprotein and m protein, which give the mature virion a herringbone appearance with icosahedral symmetry. the genome consists of a single segment of positive-sense ssrna (chambers et al. 1990; osatomi and sumiyoshi 1990). cyclization of the genome, through hybridization of rna sequences of the 5′ and 3′ ends of the genome, may be required for mrna synthesis (alvarez et al. 2005).
there is a long open reading frame that codes for three structural proteins at the 5′ end; downstream of this region are genes for seven nonstructural proteins (thurner et al. 2004). the positive-sense genome is directly translated into a large polyprotein, which undergoes intra- and post-translational cleavage. strain evolution and clinical diversity have been driven by a high rate of mutation at replication and through molecular recombination. the nucleocapsid is formed by interaction of genomic rna with capsid proteins. the complete virion has a spherical morphology approximately 50 nm in diameter. this family of viruses is not assigned to an order. there are four genera within the family flaviviridae. within the genus flavivirus, there are 53 species, including dengue virus (simmons et al. 2012).

hepatitis c virus (hcv) is the type species of the genus hepacivirus in the family flaviviridae. the physical properties of hcv have not been as well defined as those of other flaviviruses because there is no efficient method for in vitro replication of hcv. virion morphology is consistent with other flaviviruses; complete, enveloped virions have a diameter of 55-65 nm. the single segment of positive-sense ssrna is ~9.6 kb in length (hijikata et al. 1991). a single open reading frame is flanked by highly conserved regions at the 5′ and 3′ ends. cap-independent protein synthesis, typical of flavivirus species, is initiated at an internal ribosomal entry site (ires) within the 5′ untranslated region. this results in synthesis of a polyprotein that undergoes cleavage and further processing during and after translation. a unique and highly conserved sequence upstream of the ires interacts with liver-specific microrna and is required for efficient replication. circulating hcv is associated with host ldl/vldl, which may play a role in delivery of virions to hepatocytes.
the error-prone rna polymerase and high replication rate of hcv have resulted in great genetic diversity and heterogeneity of clinical isolates. hcv isolates can be grouped by genotypic analysis into six groups and many subgroups. there are differences with respect to responses to antiviral therapy among the genotypes, but intrinsic virulence is similar. the vast majority of strains in the united states are genotypes 1a, 1b, and 2, whereas central african strains are almost exclusively genotype 4.

hemorrhagic fever (hf) syndromes: viral hemorrhagic fever syndromes may be caused by many species of viruses from four different families: arenaviridae, bunyaviridae, flaviviridae, and filoviridae; all are single-stranded rna viruses. see the discussions above for specific information related to these virus families. typical symptoms of viral hemorrhagic fever infection include fever, malaise, hypotension, and coagulation defects. with the exception of dengue virus, the hf viral agents are maintained in nonhuman vertebrate hosts; humans are incidental, dead-end hosts. in dengue, human infection is maintained through a mosquito vector. the epidemiologic distribution of disease reflects the geographic range of the reservoir host. hf viruses primarily infect dendritic cells, macrophages, and monocytes, which are present in virtually all tissues and organ systems; parenchymal cells may also be susceptible to infection, depending on the virus. infected cells release mediators that result in markedly increased vascular permeability, compromising the function of critical organ systems. suppression of the cellular type i interferon response is a significant contributor to pathogenesis (habjan et al. 2008).

hepadnaviridae: in the family hepadnaviridae, there are two genera, avihepadnavirus (two species) and orthohepadnavirus (four species); hepatitis b virus (hbv), the type species of orthohepadnavirus, is the only human pathogen in the family.
the family hepadnaviridae is not assigned to an order. eight distinct hbv genotypes (a-h) and subtypes can be recognized on the basis of antigenic or sequence variation. the genotypes show geographic and ethnic variability; the hbv genotype influences the severity and outcome of disease (garfein et al. 2004; lin and kao 2008). the complete, enveloped hbv virion (dane particle) is 42-47 nm in diameter. the icosahedral nucleocapsid (~28 nm in diameter) of the virion contains a single molecule of partially double-stranded dna with a dna-dependent polymerase covalently linked to the 5′ end of the complete dna strand, hepatitis b e antigen (hbeag), and hepatitis b core antigen (hbcag). the nucleocapsid is surrounded by an envelope derived from host-cell membrane and viral envelope proteins, including hepatitis b surface antigen. the genome of hbv is a circular, partially double-stranded dna molecule which is replicated by a unique process of reverse transcription of an rna intermediate. the minus dna strand runs the entire length of the hbv genome; the plus strand covers only about two-thirds of the genome. the genome is replicated by synthesis of a full-length ssrna transcript (pre-genomic rna), followed by dsdna synthesis by reverse transcription of the ssrna by viral-encoded reverse transcriptase/dna polymerase. the mrnas for all viral proteins are also transcribed from the minus dna strand. there are four overlapping open reading frames, all read in the same direction (liang 2009).

herpesviridae: the herpesvirus species associated with human infections (hsv-1, hsv-2, cmv, ebv, vzv, hhv-6, hhv-7, and hhv-8) belong to the family herpesviridae within the order herpesvirales. there are four subfamilies of the herpesviridae: alphaherpesvirinae (5 genera), betaherpesvirinae (4 genera), gammaherpesvirinae (4 genera), and a single genus in an unassigned subfamily. specific human herpesviruses are discussed in the sections below. the herpesviruses are double-stranded dna viruses.
the icosahedral capsid (~100 nm diameter) is surrounded by an envelope studded by a variety of short glycoproteins. the nucleocapsid is a dense toroid complex with an outer diameter ~70 nm and inner diameter ~18 nm. an irregular "tegument" fills the space between the envelope and capsid. depending on the thickness of the tegument layer, complete virions range in size from ~125 to >250 nm. the size and organization of the dsdna genome vary among the species causing human disease (mcgeoch et al. 2006). the genomes of human herpesviruses include unique sequences and repeated sequences. though the genomes are linear in virions, they circularize in the nucleus of infected cells, which is mediated through repeat sequences at both ends of the dsdna genome. for hhv-6 and hhv-7 (class a genome), a large unique sequence region is flanked by a region that is repeated at both ends of the linear strand of dsdna. the genomes of ebv and the kaposi's sarcoma-associated herpesvirus (class c genome) have smaller left and right terminal repeat sequences, while repeat sequences r1 to r4 divide the unique sequence nucleic acid into four discrete regions. for vzv (class d genome), a large terminal sequence is inverted and inserted into the genome, resulting in a large unique sequence region (ul) and a small unique sequence region (us). hsv-1, hsv-2, and cmv (class e genomes) are the most complex. there are repeat sequence regions at both ends of the linear dsdna molecule. the unique sequence dsdna is divided into ul and us regions by a sequence composed of juxtaposed copies of the terminal repeat sequences inserted in an inverted orientation. typical of dsdna viruses, a large number of proteins are produced by various herpesviruses. the organization of the coding regions is complex, with 3′ and 5′ reading frames, gene overlap, spliced genes, and intron regions. forty genes are conserved among the α-, β-, and γ-herpesviruses.
these core genes are divided among seven gene blocks (albà et al. 2001); within each block the order and polarity of genes are conserved, including genes for gene regulation, nucleotide metabolism, dna replication, virion maturation, envelope glycoprotein synthesis, and capsid, fusion, and tegument protein synthesis. diseases caused by human herpesviruses range from systemic to localized infection of virtually all organ systems, although the host-cell range and typical disease characteristics vary by species. a characteristic of herpesvirus infections is latency, which is commonly associated with reactivation and symptomatic infections (e.g., shingles). while active infection with herpesviruses results in the destruction of the infected host cell, latently infected cells remain viable. in latently infected cells, the viral genome forms circularized molecules within the host nucleus with limited expression of viral genes.
• hhv-6 and hhv-7: the genome (gompels et al. 1995) of these viruses has the simplest organization and lowest % g+c content compared to the other herpesviruses. reading frames are present on each strand of the dsdna. core herpesvirus proteins are clustered near the center of the strands, while species-specific genes are located toward the ends of the strands (braun et al. 1997). hhv-6 and hhv-7 are assigned to the genus roseolovirus in the subfamily betaherpesvirinae, family herpesviridae, and order herpesvirales. there are two distinct hhv-6 species: human herpesvirus 6a (the roseolovirus type species) and human herpesvirus 6b. hhv-6b is the agent of exanthem subitum. there is a single hhv-7 species, human herpesvirus 7, which is also a cause of exanthem subitum. t-lymphocytes are the primary target cell of hhv-6 and hhv-7 viruses.
• kaposi's sarcoma-associated herpesvirus: the complete virions of kaposi's sarcoma-associated herpesvirus (kshv) have a diameter ~100 nm.
in addition to virus-specific proteins, the tegument also carries viral mrnas, probably the result of passive incorporation during the cytoplasmic envelopment process (bechtel et al. 2005). the envelopes of complete virions bear kshv-specific glycoproteins. the genome (~170 kbp) has class c organization (russo et al. 1996) typical of gammaherpesvirinae. the conserved herpesvirus genes are clustered in four blocks; kshv-specific genes are typically distributed in the regions outside and between these blocks (renne et al. 1996). the kshv species designation is human herpesvirus 8, which is assigned to the genus rhadinovirus within the subfamily gammaherpesvirinae. the virus has tropism for b-lymphocytes and is implicated in all forms of kaposi's sarcoma. four clades, a-d, with distinctive geographical distributions, have been identified by genotypic analysis; the a and c clades cluster together and are most typical for isolates from europe and the united states.
• varicella-zoster virus (vzv): the dense core of vzv is enclosed in an icosahedral capsid (80-120 nm diameter), which is surrounded by an amorphous tegument. the envelope may be derived from multiple types of host-cell membranes during transit from the nucleus through the cytoplasm; specific viral-encoded glycoproteins are embedded in the envelope of the complete virions, which may be spherical or pleomorphic (180-200 nm in diameter). vzv has a class d dsdna genome (~125 kbp) (clarke et al. 1995; davison 1984), resulting in production of two isomeric genomic forms by infected cells through inversion of the us region (ecker and hyman 1982). the genome encodes more than 70 proteins. the organization includes grouping of several genes into single transcription units, genes with overlapping reading frames, and spliced segments (davison and scott 1986). the species designation for vzv is human herpesvirus 3.
it is the type species of the genus varicellovirus within the subfamily alphaherpesvirinae. there is only a single serotype of vzv. for epidemiologic purposes, vzv isolates may be genotyped on the basis of minor differences in dna sequence; different genotypes may be classified as european, japanese, or mosaic (loparev et al. 2004). the host range of vzv is restricted to cells of humans or other primates; in humans, vzv has tropism for t-lymphocytes and establishes latent infection in the cells of the dorsal root ganglia.

orthomyxoviridae: influenza viruses belong to the family orthomyxoviridae. they are pleomorphic; viruses may be spherical (~100 nm diameter) or filamentous. complete virions are surrounded by an envelope derived from the host cytoplasmic membrane. viral hemagglutinin and neuraminidase proteins are embedded in the envelope, resulting in characteristic 10-14 nm spikes projecting from the surface of virions. in addition to the ha and na proteins, m2 protein is embedded into the envelope of influenza a viruses; nb and bm2 proteins are embedded into the envelopes of influenza b viruses. the matrix protein (m1) is located just below the envelope. the nucleocapsid is composed of viral rna and nonstructural proteins, including ribonucleoproteins and polymerases. the genome of influenza viruses is composed of negative-sense ssrna. all viral rna synthesis occurs in the nucleus of the host cell. the a and b influenza virus genomes are composed of eight segments, while the influenza c virus genome consists of seven segments (hayden and palese 2009). the segments range in size from ~900 to 2,300 nucleotides in length. each segment codes for one or more viral proteins (mccauley et al. 1983). the 3′ and 5′ ends of each segment contain noncoding, regulatory regions (fujii et al. 2005).
the three largest segments code for various components of the rna polymerase; the pb1 segment of influenza a virus has a second open reading frame that encodes the pro-apoptotic protein pb1-f2. in influenza types a and b, the fourth and sixth segments encode the surface hemagglutinin (ha) and neuraminidase (na) glycoproteins, respectively. the influenza a surface protein m2 is encoded by the seventh segment; influenza b surface protein nb is encoded by the sixth segment, while bm2 is encoded by the seventh segment. the fifth segment of both a and b influenza viruses encodes the rna-binding nucleoprotein (np). the matrix protein m1 is encoded by the seventh segment of both viruses. the eighth and smallest rna segment of influenza a and b viruses encodes ns1, a multifunctional protein with interferon antagonistic properties, and the nep/ns2 protein, which is involved in transport of vrnps across the nuclear membrane of the host cell.

the names of clinical human influenza isolates include the species of origin, isolation location, number of the isolate, and year of isolation; names of influenza a virus isolates also include the hemagglutinin (h1 to h16) and neuraminidase (n1 to n9) subtypes (atmar and lindstrom 2012). for example, a/california/7/2009 (h1n1), a/victoria/361/2011 (h3n2), and b/wisconsin/1/2010 viruses were recommended for the 2012-2013 seasonal influenza vaccine. large outbreaks have only occurred with hemagglutinin subtypes h1, h2, and h3 and neuraminidase subtypes n1 and n2. antigenic drift and antigenic shift contribute to reinfection with influenza viruses (taubenberger and kash 2010). antigenic drift is caused by a gradual accumulation of point mutations in hemagglutinin and neuraminidase genes, which result in minor antigenic changes in these proteins.
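the isolate naming convention described above can be sketched as a small parser. this is a minimal illustration under stated assumptions, not an official nomenclature tool: it handles only the short human-isolate form shown in the examples (type/location/isolate number/year, with an optional subtype in parentheses), it does not handle names that carry an extra host field, and the names `Isolate`, `parse_isolate`, and `_NAME` are hypothetical. the subtype ranges (h1 to h16, n1 to n9) are those given in the text.

```python
import re
from typing import NamedTuple, Optional


class Isolate(NamedTuple):
    virus_type: str          # influenza type: "A", "B", or "C"
    location: str            # place of isolation
    isolate_number: int      # laboratory isolate number
    year: int                # year of isolation
    ha: Optional[int] = None  # hemagglutinin subtype (influenza A only)
    na: Optional[int] = None  # neuraminidase subtype (influenza A only)


# type/location/number/year, optionally followed by "(HxNy)"
_NAME = re.compile(
    r"^(?P<type>[ABC])/(?P<location>[^/]+)/(?P<number>\d+)/(?P<year>\d{4})"
    r"(?:\s*\(H(?P<ha>\d{1,2})N(?P<na>\d)\))?$",
    re.IGNORECASE,
)


def parse_isolate(name: str) -> Isolate:
    """Split an isolate name such as 'A/California/7/2009 (H1N1)' into fields."""
    m = _NAME.match(name.strip())
    if m is None:
        raise ValueError(f"unrecognized isolate name: {name!r}")
    virus_type = m["type"].upper()
    ha = int(m["ha"]) if m["ha"] else None
    na = int(m["na"]) if m["na"] else None
    if ha is not None:
        # Subtypes are part of the name only for influenza A isolates.
        if virus_type != "A":
            raise ValueError("HA/NA subtype given for a non-A isolate")
        if not (1 <= ha <= 16 and 1 <= na <= 9):
            raise ValueError("subtype outside the H1-H16 / N1-N9 range")
    return Isolate(virus_type, m["location"].strip(),
                   int(m["number"]), int(m["year"]), ha, na)


if __name__ == "__main__":
    for text in ("A/California/7/2009 (H1N1)",
                 "A/Victoria/361/2011 (H3N2)",
                 "B/Wisconsin/1/2010"):
        print(parse_isolate(text))
```

keeping the subtype as two optional integers makes it easy to filter isolates by hemagglutinin or neuraminidase subtype without re-parsing the name.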
antigenic shift is caused either by a virus created through reassortment of influenza virus rna segments during coinfection of a host, usually with a human influenza virus and an avian or swine influenza virus, or by introduction of a nonhuman influenza virus strain into human populations after mutation during infection of a host species creates a new isolate permissive for interspecies transmission.

papillomaviridae: the papillomaviruses (pvs) represent a large (and growing) family of viruses that currently includes 30 different genera and 69 species; the taxonomy has undergone significant reorganization in recent years (bravo et al. 2010). the oncogenic potential of human papillomaviruses is well established. pvs are non-enveloped; virions are icosahedral with diameters of 50-55 nm. the capsid contains two structural proteins, l1, the most abundant viral protein, and l2. the pv genome consists of a single molecule of circularized dsdna (zheng and baker 2006). the open reading frames for all viral genes are located on only one of the dna strands, and transcription proceeds in a single direction. there are eight early (e) open reading frames that encode regulatory proteins that control viral metabolism and dna synthesis. the e6 proteins of high-risk hpv types have anti-apoptotic effects and interfere with p53 regulatory function in infected host cells (howley et al. 1990). two late (l) reading frames encode the structural proteins l1 and l2. epithelial cells of a wide variety of vertebrate hosts are susceptible to papillomavirus infection, but the different host species are only susceptible to species-specific viruses. papillomaviruses have been classified according to susceptible host species and the type of disease produced, but comparison of sequence differences of the l1 reading frame has provided a more detailed description of papillomavirus phylogeny (de villiers et al. 2004). the family papillomaviridae is not assigned to an order.
human pathogens are clustered within five papillomavirus genera.
paramyxoviridae: the paramyxoviruses are enveloped (host cytoplasmic membrane) with an unsegmented negative-sense ssrna genome (15-19 kb). the viral rna serves as the template for synthesis of mrna and of antigenomic (positive-sense) rna, which in turn serves as the template for the new negative-sense genomic rna of progeny virions. there are six to ten genes; genes for the six major proteins are linked in the following 3′ to 5′ order: nucleocapsid (n) → phosphoprotein (p) → matrix (m) → fusion (f) → hemagglutinin/neuraminidase (hn) → large polymerase (l). there is an untranslated leader sequence at the 3′ end and an untranslated trailer sequence at the 5′ end. the genes are separated by untranslated sequences and do not overlap, with the exception of the m and l genes of human metapneumovirus. transcription is initiated at the 3′ end and proceeds directly through to the 5′ end. because the rna polymerase is unstable and may detach at the untranslated regions between genes, there is a gradient in the concentration of gene products from 3′ to 5′. in different species, other proteins are produced by additional small genes, mrna editing, or overlapping reading frames within the p gene. the v and c proteins regulate viral rna transcription and also interfere with host interferon signaling and other aspects of the immune response to paramyxovirus infection (andrejeva et al. 2004; durbin et al. 1999; swedan et al. 2009). formation of the nucleocapsid core is constrained by a required association of one n protein with every six genomic nucleotides (kolakofsky et al. 2005; skiadopoulos et al. 2003). the resulting helical structure has a diameter of 18 nm with a 4 nm central core. p proteins (a polymerase cofactor) are attached to this rigid rod and serve as attachment sites for l proteins; together they provide the enzymatic activity for rna synthesis.
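two quantitative ideas in the paragraph above lend themselves to a short sketch: the one-n-protein-per-six-nucleotides constraint (the "rule of six") and the 3′ to 5′ gradient in gene products. the python below is a toy model under stated assumptions, not a published method: the function names are hypothetical, the detachment probability `p_detach` is an arbitrary illustrative parameter assumed to be the same at every gene junction, and the genome lengths used in any call are made up.

```python
from typing import Dict, List

# 3' to 5' order of the six major genes, as listed in the text.
GENE_ORDER: List[str] = ["N", "P", "M", "F", "HN", "L"]


def obeys_rule_of_six(genome_length_nt: int) -> bool:
    """One N protein covers six nucleotides, so a fully encapsidated
    genome is expected to have a length divisible by six."""
    return genome_length_nt > 0 and genome_length_nt % 6 == 0


def relative_transcript_levels(genes: List[str], p_detach: float) -> Dict[str, float]:
    """Toy gradient model: the polymerase starts at the 3' end and, at each
    gene junction, detaches with probability p_detach (assumed constant).
    The level of the i-th gene relative to the first is (1 - p_detach) ** i."""
    if not 0.0 <= p_detach < 1.0:
        raise ValueError("p_detach must be in [0, 1)")
    return {gene: (1.0 - p_detach) ** i for i, gene in enumerate(genes)}
```

with any nonzero `p_detach`, n transcripts are the most abundant and l transcripts the least, which is the qualitative pattern the text describes; the actual attenuation at each junction is not given in the source.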
this core structure, rather than free genomic rna, serves as the template for mrna and antigenomic rna synthesis. the paramyxovirus m proteins surround and organize the nucleocapsid and interact with the cytoplasmic tails of transmembrane envelope proteins. the envelope, formed from modified host-cell plasma membrane, is studded with viral protein complexes, including hn proteins, which mediate virion attachment to target cells, and f protein, which mediates ph-independent fusion of the viral envelope and the cell cytoplasmic membrane. the paramyxoviridae are one of the four families within the order mononegavirales and include significant and frequent pathogens of humans and animals. there are two subfamilies: the paramyxovirinae and the pneumovirinae. there are seven genera and thirty-one species in the subfamily paramyxovirinae and two genera and five species in the subfamily pneumovirinae.
• henipaviruses: in the subfamily paramyxovirinae, there are two species within the genus henipavirus: hendra virus (hev) and nipah virus (niv); hev is the type species. henipavirus virions are pleomorphic (spherical to helical forms). electron microscopy of hendra virus shows a "double-fringe" appearance due to short and long surface projections (hyatt et al. 2001). complete virions range in size from 40 to 1,900 nm in longest dimension. the genome (~18 kb) includes genes typical of paramyxoviridae. long untranslated sequences are attached to the 3′ end of five of the six genes, resulting in the larger genome size of henipaviruses compared to other paramyxoviruses (eaton et al. 2007; wang et al. 2000). the p gene also codes for v and w proteins by mrna editing and for c protein by a shifted reading frame. henipaviruses are assigned to the family paramyxoviridae and subfamily paramyxovirinae. henipavirus infections are zoonotic; fruit bats are the presumed reservoir for hendra virus infections, while fruit bats (johara et al.
2001) includes two open reading frames; the function of m2-1 protein is undefined, while m2-2 protein is a regulator of viral transcription. there is no gene for hemagglutinin-neuraminidase; the product of the g gene serves as the major attachment protein. metapneumoviruses are members of the subfamily pneumovirinae. human metapneumovirus is a species in the genus metapneumovirus. there is a single hmpv serotype, with two antigenic subtypes, a and b.
• human respiratory syncytial viruses: the envelope of human respiratory syncytial virus (hrsv) is studded with three viral glycoproteins: g protein (the major attachment protein), f protein, and sh protein (a small hydrophobic protein). complete virions are pleomorphic (spherical to filamentous forms). the nucleocapsid diameter, 12-15 nm, is smaller than is typical for other paramyxoviruses (hall 2001). the hrsv genome is ~15 kb and includes ten genes (collins and wertz 1983). the first eight reading frames are nonoverlapping; the last two genes, m2 and l, overlap by 62 nucleotides. in addition to n, p, and l proteins, m2-1 protein, a transcription factor, is associated with the nucleocapsid. the overall organization of the hrsv genome is similar to that of other paramyxoviruses. in addition to the typical genes, the hrsv genome includes genes ns1 and ns2 (nonstructural proteins that interfere with interferon induction and signaling), sh, and m2-1 and m2-2 (nonstructural proteins involved in regulation of transcription). there is no hn gene; attachment is mediated by the g gene product. human respiratory syncytial virus is the type species of the genus pneumovirus in the subfamily pneumovirinae. there is a single hrsv serotype, with two antigenic subtypes, a and b. (henrickson 2003). the genome of human parainfluenza viruses is ~15 kb in length, with an organization and six reading frames (n, p, m, f, hn, l) typical of the paramyxoviridae (karron and collins 2007). there are no overlapping reading frames.
accessory proteins, c (hpiv1 and 3), v (hpiv2 and 4), and d (hpiv3), are produced by mrna editing of the p gene. n proteins are tightly bound to viral and antigenomic rna; p and l proteins are also bound to the nucleocapsid, forming functional complexes for rna polymerization and processing. human parainfluenza viruses are assigned to two genera in the subfamily paramyxovirinae. (berns 1990). virions are stable in the environment and thought to transmit infection by attachment to specific receptors of actively dividing cells. the parvovirus genome is composed of unsegmented ssdna (cotmore and tattersall 1984; shade et al. 1986; zhi et al. 2004). complete virions of different species may contain negative-sense dna or both negative- and positive-sense dna in various proportions. there are two major reading frames: one encoding capsid proteins and the other encoding nonstructural proteins. noncoding sequences at the 3′ and 5′ ends include complementary sequences that form hairpin structures, which serve to regulate nucleic acid synthesis (deiss et al. 1990). various host-cell molecules mediate attachment and infection by parvoviruses. erythrocyte p antigen is the major receptor for human parvovirus b19. viruses are taken up by endocytosis, followed by transport into the host-cell nucleus. viral dna replication depends on host-cell polymerases during the s phase of host-cell replication. human infections are caused by parvovirus b19 and bocavirus (schildgen et al. 2008; vicente et al. 2007). the family parvoviridae is not assigned to an order; there are two subfamilies, the densovirinae and the parvovirinae. there are five genera in the subfamily parvovirinae: amdovirus (1 species), bocavirus (2 species, including the type species bovine parvovirus), dependovirus (12 species, including adeno-associated viruses), erythrovirus (4 species, including the type species human parvovirus b19), and parvovirus (12 species).
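the terminal hairpins described above for the parvovirus genome arise because one stretch of the single strand is the reverse complement of another and can fold back to pair with it. the python sketch below shows that check on a made-up toy sequence; the function names and the `stem_length` parameter are hypothetical, and real parvovirus termini are longer and contain mismatches that this simple exact-match test ignores.

```python
# Watson-Crick complements for DNA, upper and lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def can_form_hairpin(seq: str, stem_length: int) -> bool:
    """True if the first stem_length bases can pair exactly with the last
    stem_length bases, leaving whatever lies between them as the loop."""
    if stem_length <= 0 or 2 * stem_length > len(seq):
        return False
    five_prime_arm = seq[:stem_length].upper()
    three_prime_arm = seq[-stem_length:].upper()
    return five_prime_arm == reverse_complement(three_prime_arm)
```

for the toy sequence `"ACGGTC" + "TTTT" + "GACCGT"`, the two six-base arms are reverse complements of each other, so `can_form_hairpin(..., 6)` is true with a four-base loop.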
picornaviridae: infections of the respiratory tract and other organ systems by enteroviruses and parechoviruses are well described. enteroviruses were initially classified on the basis of clinical disease and epidemiology, suckling mouse inoculation, replication in cell culture, electron microscopic studies, physical properties, and the vast range of specific antigenic differences. the major subgroups were poliovirus, coxsackievirus (a and b), and echovirus. a characteristic of these viruses is their relative stability in acidic media and nonionic detergents. translation of the positive-sense ssrna genome is regulated by a 5′ non-translated region (lindberg and polacek 2000) that is covalently linked to protein vpg (virion protein, genome linked); the short 3′ noncoding region is polyadenylated. translation results in synthesis of a single polyprotein, which is cleaved into functional proteins by post-translational processing (nicklin et al. 1987; pallansch and roos 2007). there are three functional regions delimited by ribosomal entry sites. the p1 region codes for capsid proteins, while regions p2 and p3 code for nonstructural proteins. capsid proteins vp1, vp2, and vp3 are exposed externally and account for the serological diversity of the viruses. with the advent of molecular phylogenetic analysis, the enteroviruses have been reclassified by the ictv. enteroviruses are in the order picornavirales, family picornaviridae, and genus enterovirus. the enteroviruses have been assigned to 12 species, including human enterovirus a (17 serotypes, including coxsackieviruses and enteroviruses), human enterovirus b (56 serotypes, including coxsackieviruses, echoviruses, and enteroviruses), human enterovirus c (the type species; 16 serotypes, including coxsackieviruses, all human polioviruses, and enteroviruses), and human enterovirus d (3 enterovirus serotypes).
in addition to the enteroviruses, the genus enterovirus also includes 3 rhinovirus species, human rhinovirus a, b, and c, with more than 100 serotypes. also within the family picornaviridae is the genus parechovirus. human parechovirus is the type species for the genus. there are 14 parechovirus serotypes.
polyomaviridae: polyomaviruses may infect a variety of primate and non-primate vertebrate host species; the oncogenic potential of polyomaviruses is well established (white and khalili 2004). sialic acid and/or gangliosides on host-cell membranes serve as receptors for attachment of human polyomaviruses. though these molecules are widespread on human cells, tropism is restricted. respiratory epithelial cells and cells of lymphoid origin are the likely targets for initial infection, followed by hematogenous spread to target organs. the virions are non-enveloped; the icosahedral capsids (40-45 nm diameter) are composed of three proteins (vp1, vp2, and vp3), which enclose the circular dsdna genome (~5 kbp). the genome is divided into three regions. the early region encodes proteins involved in viral processes that occur prior to dna replication, including t (tumor) antigens (benjamin 2001). the late region encodes proteins involved in processes that primarily occur after dna replication. the early and late regions do not overlap and are transcribed from opposite strands of the viral dna and in opposite directions. a number of viral proteins are encoded as a result of alternative splicing and other post-transcriptional modifications of mrna. polyomaviruses are members of the family polyomaviridae, which is not assigned to an order. there is 1 genus, polyomavirus, and 13 species, including the human pathogens bk polyomavirus and jc polyomavirus and simian virus 40 (type species).
retroviridae: the retroviruses are a unique group of viruses, including human immunodeficiency virus types 1 and 2 and human t-cell leukemia virus type 1; they may infect a wide range of vertebrate host species. the human immunodeficiency viruses and human t-cell leukemia virus 1 are able to cause disease in humans. these rna viruses have a unique replication cycle with a "reverse flow" of genetic information from rna to dna: viral rna is reverse transcribed into a dsdna copy of the viral genome, which is integrated into the host-cell genome. integration of the proviral dna allows the viruses to establish persistent, presumably lifelong, infection. another consequence of insertion of the viral dna is functional mutation of the host genome at the site of insertion, which may alter the host gene or the regulation of a gene's expression; the oncogenic potential of retroviral infection is well described in humans and other vertebrate host species. the electron microscopic morphology of retroviruses shows a dense nucleocapsid core (cylindrical or cone shaped) (chrystie and almeida 1988; gelderblom et al. 1989). viruses are functionally diploid: the core includes two copies of the positive-sense ssrna genome, which are closely complexed with viral nucleoproteins. the sequences of the two ssrna molecules may differ because of errors in transcription of new genomic ssrna molecules during replication. the core also includes several functional viral proteins, including reverse transcriptase, integrase, and protease. the core is surrounded by capsid proteins; the nucleocapsid is surrounded by viral matrix protein. complete virions are surrounded by an envelope derived from virus-modified host-cell cytoplasmic membranes; the envelope is studded with viral glycoproteins. the transmembrane protein extends from the matrix layer through the lipid bilayer to the external surface.
the receptor-binding complex is anchored to the external portion of the transmembrane protein. mature virions are spherical (~100 nm diameter). the ssrna genomes of retroviruses resemble host-cell mrna. a repeat sequence is present at both ends of the ssrna; the 5′ end is capped and the 3′ end is polyadenylated. the order of sequences from the 5′ end to the 3′ end is cap → repeat sequence → unique sequence (u5) → the initiation site for minus-strand dna synthesis → gag gene → pol gene → env gene → the initiation site for plus-strand dna synthesis → unique sequence (u3) → repeat sequence → poly(a) sequence. after entry into the cytoplasm of a susceptible cell, double-stranded dna is synthesized by reverse transcription of both copies of the retroviral ssrna. the virus-encoded dna is transported into the nucleus, after which it is integrated into the host's genomic dna. the process of forming new virions is initiated by transcription of the proviral dna. the processed viral rna is exported into the cytoplasm, and genes for precursor viral proteins are translated. virions are assembled at the cytoplasmic membrane and then released by budding; final virion maturation occurs by extracellular processing of viral proteins. a characteristic of retroviruses is the high mutation rate and marked genomic heterogeneity of isolates. the major factors that contribute to this phenomenon include (1) error-prone reverse transcription, without proofreading correction, of the infecting virus genome; (2) recombination between the two genomic ssrna strands during reverse transcription; and (3) the very high-level production of progeny viruses from infected cells. retroviruses are not assigned to a taxonomic order. the family retroviridae has two subfamilies. the orthoretrovirinae includes six genera, including deltaretrovirus and lentivirus. htlv-1 is assigned the species name primate t-lymphotropic virus 1 in the deltaretrovirus genus.
hiv-1 and hiv-2 are named human immunodeficiency virus 1 (type species) and human immunodeficiency virus 2, respectively, in the genus lentivirus (clavel et al. 1986b).
• human immunodeficiency viruses: the human immunodeficiency viruses have a conical core surrounded by an envelope derived from virus-modified host-cell cytoplasmic membrane. binding and entry of hiv into susceptible cells requires several specific receptors: cd4 (present on host helper t cells, cd4+ macrophages, and some dendritic cells) plus chemokine receptors, including ccr5 and cxcr4 (klatzman et al. 1984; simmons et al. 1998). the biological properties of hiv-1 isolates depend on the chemokine coreceptor(s) used by the virus (berger et al. 1998). isolates that exclusively use cxcr4 are t-cell tropic, with rapid replication and syncytium formation. isolates that use ccr5 exclusively are tropic to macrophages, replicate more slowly, and do not induce syncytium formation. isolates that can use either cxcr4 or ccr5 have intermediate phenotypes. the gag, pro, pol, and env genes are translated from full-length mrna transcripts of the proviral genome: gag and env in one reading frame and pro and pol from a second reading frame. in addition, several genes are transcribed from overlapping or unique reading frames, including several spliced gene products. human immunodeficiency virus types 1 and 2 evolved from simian viruses (gao et al. 1999; peeters et al. 1989; daniel et al. 1985; marx et al. 1991). these viruses may be distinguished by a number of characteristics, including clinical disease, specific antigens, and gene sequences (clavel et al. 1986a). hiv-1 isolates may be further characterized into genetic groups and subtypes or clades (wainberg 2004).
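the coreceptor-dependent phenotypes described above for hiv-1 isolates amount to a three-way lookup, sketched below in python. this is only a restatement of the text as code: the names `Phenotype` and `phenotype_from_coreceptors` are hypothetical, and the labels returned are shorthand for the descriptions in the text, not a standard nomenclature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phenotype:
    tropism: str
    replication: str
    syncytium_formation: str


def phenotype_from_coreceptors(uses_cxcr4: bool, uses_ccr5: bool) -> Phenotype:
    """Map chemokine coreceptor usage to the phenotype described in the text."""
    if uses_cxcr4 and uses_ccr5:
        # Isolates able to use either coreceptor are described as intermediate.
        return Phenotype("intermediate", "intermediate", "intermediate")
    if uses_cxcr4:
        return Phenotype("t-cell", "rapid", "yes")
    if uses_ccr5:
        return Phenotype("macrophage", "slow", "no")
    raise ValueError("an isolate must use at least one coreceptor")
```

in practice coreceptor usage is determined experimentally or predicted from env sequence; the function above only encodes the qualitative classification once that usage is known.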
most hiv-1 isolates are in the m (main) group, which has a number of well-defined subgroups and recombinant forms with heterogeneous global distribution; clade b viruses are the predominant isolates in north america and europe (hemelaar et al. 2006; osmanov et al. 2002). group o (outlier) strains have mainly been isolated or acquired in western africa. group n (non-m, non-o) and recombinant forms are also most commonly isolated from western africa.
• human t-cell leukemia virus type 1: mature htlv virions have a spherical core, symmetrically placed within the envelope. the host-cell receptor is glut-1, a surface glucose transport molecule (manel et al. 2003). the gag, pro, pol, and env genes are translated from full-length mrna transcripts of the proviral genome: gag in one reading frame, pro and env from a second, and pol from a third reading frame. in addition, several spliced genes are transcribed from overlapping reading frames.
recent and continuing progress in developing and using standardized and widely accepted methods for biological and taxonomic classification of viral pathogens has improved the medical response to viral illnesses. at a very basic level, these systems allow clinicians and scientists to communicate effectively and ensure the comparability of data generated by clinical or basic scientific studies. further, accurate and standardized data are critical for understanding issues related to transmission, prevention, and treatment of viral illnesses. establishing phylogenetic similarity to known viral pathogens may allow clinicians to anticipate the clinical behavior of new and emerging viral pathogens, as may be seen when virus mutation results in acquisition of new pathogenic mechanisms, such as changes to antigens associated with evasion of the immune response of the host species or changes that allow a viral pathogen to jump from one species into new, susceptible species.
as analytical tools improve, even more informative data relevant to clinical and pathologic characteristics of viral pathogens is anticipated.
genomewide function conservation and phylogeny in the herpesviridae
long-range rna-rna interactions circularize the dengue virus genome
the v proteins of paramyxoviruses bind the ifn-inducible rna helicase, mda-5, and inhibit its activation of the ifn-beta promoter
influenza viruses, chap 81. in: versalovic j (ed) manual of clinical microbiology
dna sequence and expression of the b95-8 epstein-barr virus genome
expression of animal virus genomes
rnas in the virion of kaposi's sarcoma-associated herpesvirus
polyoma virus: old findings and new challenges
a new classification for hiv-1
human herpesvirus 6
the clinical importance of understanding the evolution of papillomaviruses
real-time qualitative pcr for 57 human adenovirus types from multiple specimen sources
flavivirus genome organization, expression, and replication
the morphology of human immunodeficiency virus (hiv) by negative staining
configuration and terminal sequences of the simian varicella virus genome
isolation of a new human retrovirus from west african patients with aids
molecular cloning and polymorphism of the human immune deficiency virus type 2
cdna cloning and transcriptional mapping of nine polyadenylylated rnas encoded by the genome of human respiratory syncytial virus
characterization and molecular cloning of a human parvovirus genome
isolation of t-cell tropic htlv-iii-like retrovirus from macaques
structure of the genome termini of varicella-zoster virus
the complete dna sequence of varicella-zoster virus
genetic content and evolution of adenoviruses
the human cytomegalovirus genome revisited: comparison with the chimpanzee cytomegalovirus genome
classification of papillomaviruses
cloning of the human parvovirus b19 genome and structural analysis of its palindromic termini
the genome sequence of herpes simplex virus type 2
et al (1986) transcriptional map of the measles virus genome
functional profiling of a human cytomegalovirus genome
mutations in the c, d, and v open reading frames of human parainfluenza virus type 3 attenuate replication in rodents and primates
henipaviruses, chap 45
virus taxonomy: one step forward, two steps back
varicella-zoster virus dna exists as two isomers
importance of both the coding and the segment-specific noncoding regions of the influenza type a virus ns segment for its efficient incorporation into virions
origin of hiv-1 in the chimpanzee pan troglodytes troglodytes
factors associated with fulminant liver failure during an outbreak among injection drug users with acute hepatitis b
morphogenesis and morphology of hiv. structure-function relations
the dna sequence of human herpesvirus-6: structure, coding content, and genome evolution
processing of genome 5′ termini as a strategy of negative-strand rna viruses to avoid rig-i-dependent interferon induction
respiratory syncytial virus and parainfluenza virus
influenza virus, chap 42
global and regional distribution of hiv-1 genetic subtypes and recombinants in
parainfluenza viruses
gene mapping of the putative structural region of the hepatitis c virus genome by in vitro processing analysis
evolutionary history and phylogeography of human viruses
what does virus evolution tell us about virus origins?
association of human papillomavirus types 16 and 18 e6 proteins with p53
ultrastructure of hendra virus and nipah virus within cultured cells and host animals
the international code of virus classification and nomenclature
nipah virus infection in bats (order chiroptera) in peninsular malaysia
t-lymphocyte t4 molecule behaves as the receptor for human retrovirus lav
paramyxovirus mrna editing, the "rule of six" and error catastrophe: a hypothesis
proposal for a revised taxonomy of the family filoviridae: classification, names of taxa and viruses, and virus abbreviations
partitioning the genetic diversity of a virus family: approach and evaluation through a case study of picornaviruses
toward genetics-based virus taxonomy: comparative analysis of a genetics-based classification and the taxonomy of picornaviruses
taxonomy and classification of viruses, chap 75. in: versalovic j (ed) manual of clinical microbiology
hepatitis b: the virus and disease
hepatitis b viral factors and clinical outcomes of chronic hepatitis b
molecular analysis of the prototype coxsackievirus b5 genome
global identification of three major genotypes of varicella-zoster virus: longitudinal clustering and strategies for genotyping
the ubiquitous glucose transporter glut-1 is a receptor for htlv
the genome sequence of the sars-associated coronavirus
isolation of a simian immunodeficiency virus related to human immunodeficiency virus type 2 from a west african pet sooty mangabey
structure and function of the influenza virus genome
the application of molecular phylogenetics to the analysis of viral genome diversity and evolution
topics in herpesvirus genomics and evolution
bunyaviridae: bunyaviruses, phleboviruses, nairoviruses, and hantaviruses, chap 43
site-specific inversion sequences of the herpes simplex virus genome: domain and structural features
poliovirus polypeptide precursors: expression in vitro and processing by exogenous 3c and 2a proteinases
complete nucleotide sequence of dengue type 3 virus genome rna
estimated global distribution and regional spread of hiv-1 genetic subtypes in the year 2000
enteroviruses: polioviruses, coxsackieviruses, echoviruses, and newer enteroviruses, chapter 25
isolation and partial characterization of an hiv-related virus occurring naturally in chimpanzees in gabon
arenaviruses, chap 44
the size and conformation of kaposi's sarcoma-associated herpesvirus (human herpesvirus 8) dna in infected cells and virions
characterization of a novel coronavirus associated with severe acute respiratory syndrome
nucleotide sequence of the kaposi sarcoma-associated herpesvirus (hhv8)
human bocavirus: passenger or pathogen in acute respiratory tract infections?
nucleotide sequence and genome organization of human parvovirus b19 isolated from the serum of a child during aplastic crisis
cxcr4 as a functional coreceptor for human immunodeficiency virus type 1 infection of primary macrophages
the genome length of human parainfluenza virus type 2 follows the rule of six, and recombinant viruses recovered from non-poly-hexameric-length antigenomic cdnas contain a biased distribution of correcting mutations
respiratory syncytial virus nonstructural proteins decrease levels of multiple members of the cellular interferon pathways
influenza virus evolution, host adaptation, and pandemic formation
mechanisms and enzymes involved in sars coronavirus genome expression
conserved rna secondary structures in flaviviridae genomes
analysis of the genomic sequence of a human metapneumovirus
manual of clinical microbiology
human bocavirus, a respiratory and enteric virus
hiv-1 subtype distribution and the problem of drug resistance
the exceptionally large genome of hendra virus: support for the creation of a new genus within the family paramyxoviridae
polyomaviruses and human cancer: molecular mechanisms underlying patterns of tumorigenesis
towards a natural system of organisms: proposal for the domains archaea, bacteria and eucarya
a reevaluation of the higher taxonomy of viruses based on rna polymerases
papillomavirus genome structure, expression, and post-transcriptional regulation
construction and sequencing of an infectious clone of the human parvovirus b19
key: cord-014864-0d682m0n authors: nan title: biomedical vignette date: 2008-10-26 journal: j biomed sci doi: 10.1007/s11373-008-9279-2 sha: doc_id: 14864 cord_uid: 0d682m0n nan
connective tissue growth factor (ctgf) was initially discovered in 1991 as a secreted protein in the conditioned media of cultured human umbilical vascular endothelial cells [1]. ctgf is a member of the ccn family of secreted, matrix-associated proteins encoded by immediate early genes that play various roles in angiogenesis and tumor growth [2]. although papers and reviews on ctgf have been published, this review [3] focuses on the functional role of ctgf in cancer progression. the influence of ctgf expression on the behavior and progression of various cancer cells, as well as its regulation of various types of protein signals and their mechanisms, is highlighted. although ctgf expression seems to be associated with progression of many kinds of cancers, its expression may have tumor-suppressive effects in a few cases, such as lung adenocarcinoma cells, colorectal cancer cells, and oral squamous carcinoma cells.
cgcgh: a tool for molecular karyotyping using dna microarray-based comparative genomic hybridization (array-cgh)
to analyze the rare events of single-copy dna aberrations, a matlab-based array-cgh analysis program, chang gung comparative genomic hybridization (cgcgh), was employed to survey chromosomal amplifications and deletions in fetal aneuploidies or cancer tissues [4]. the analyzed chromosomal data are displayed in a graphic interface, and cgcgh allows users to launch a corresponding g-banding ideogram.
in 15 karyotyped samples, the cgcgh program outperformed other programs, and it supported data generated from cdna microarrays, spotted oligonucleotide microarrays, and affymetrix human mapping arrays.
a computational screen for c/d box snornas in the human genomic region associated with prader-willi and angelman syndromes
to identify snornas (small nucleolar rnas) in the deleted human chromosomal region 15q11-q13 related to prader-willi syndrome (pws) and angelman syndrome (as), computational scanning and screening and a novel hybridization energy test were used. three previously unknown methylation snornas targeting ribosomal 18s and 28s rnas and two snornas targeting serotonin receptor 2c mrna were identified [5]. the application of the present method to the pws/as region of the human genome identified 11 snornas, three of which pass the hybridization test. these snornas require experimental confirmation. the present hybridization test can be combined with motif scanning and automated for genome-wide studies.
interactions between m protein and other structural proteins of severe acute respiratory syndrome coronavirus
the study by hsieh et al. [6] in the current issue focuses on the protein-protein interactions that regulate sars coronavirus assembly. this was performed by co-localization studies in cells that express structural viral proteins either individually or combined. changes in localization induced by the presence of other virus components indicate direct protein-protein interactions. these experiments indicate that the sars m protein plays a pivotal role in virus assembly, similar to findings with other coronaviruses (vennema et al., reference 14 in the paper). a model is presented to explain how the m protein interacts with multiple other structural proteins (s, e, and nc) to facilitate virion assembly.
severe acute respiratory syndrome coronavirus nucleocapsid protein confers ability to efficiently produce virus-like particles when substituted for the human immunodeficiency virus nucleocapsid domain
the assembly of virus particles is a complex but ill-understood process. some viruses, like the human immunodeficiency virus type 1 (hiv-1), require only a single protein for the assembly of virus-like particles (vlps) (gheysen, reference 14 in the paper). the situation is more complex for the sars coronavirus, which requires at least two structural proteins for vlp formation. the current study by wang et al. [7] asked whether the sars nucleocapsid (n) protein and fragments thereof can functionally replace the nucleocapsid domain of the self-assembling hiv-1 gag protein. using this novel chimeric vlp approach, the authors were able to map more precisely the sars n domain that facilitates self-association and vlp formation.
host factors, such as iron concentration at the site of infection, may influence the virulence of pseudomonas aeruginosa and hence the outcome of infection. here, mittal et al. [8] reported that the depletion of iron in the growth medium led to increased adherence of p. aeruginosa to uroepithelial cells and decreased phagocytosis of the bacteria. furthermore, p. aeruginosa grown in iron-depleted medium showed increased renal bacterial load and tissue pathology in the challenged mice. the current study may provide insight into the pathogenesis of p. aeruginosa and facilitate the development of a preventive approach against p. aeruginosa-induced urinary tract infections.
in vertebrate retinas, glycinergic synapses regulate glutamate transmission in both synaptic layers, the inner plexiform layer (ipl) and the outer plexiform layer (opl). glycinergic synapses in the inner retina are known to inhibit glutamate transmission, shaping the light-evoked response in ganglion cells [9, 10].
however, little is known about the function of glycinergic synapses in the outer retina. in this issue, shen et al. [11] reported that glycine depolarized rods and activated voltage-gated ca2+ channels in the neurons, resulting in facilitation of glutamate release in photoreceptors and an increase of the spontaneous excitatory postsynaptic currents in off-bipolar cells. furthermore, these authors reported that inhibition of the cl- uptake transporter nkcc1 effectively eliminated glycine-evoked depolarization in rods, suggesting that nkcc1 maintains a high cl- level in rods, which is responsible for glycine-induced depolarization. this finding is quite significant since it reveals a new function of glycine in retinal synaptic transmission.
dose-finding study with nicotine as a proconvulsant agent in ptz-induced seizure model in mice
nicotine exerts an agonistic effect on neuronal nicotinic acetylcholine receptors and is reported to enhance release of the excitatory neurotransmitter glutamate into the synapses [12]. pentylenetetrazole (ptz), on the other hand, produces seizures in small rodents [13] and is believed to act by antagonizing the inhibitory gabaergic [14] neurotransmission in the central nervous system. in this issue, medhi et al. [15] reported that pretreatment with a subthreshold dose of nicotine significantly decreased the cd50 value for ptz. sodium valproate, but not topiramate, significantly inhibited ptz-induced seizures. a nonconvulsive dose of nicotine significantly antagonized the protective efficacy of sodium valproate against ptz-induced seizures. these findings are quite significant since they bear clinical relevance, particularly amongst epileptic smokers who may show failure of efficacy of antiseizure agents and present with breakthrough seizure attacks due to nicotine.
benzyl alcohol inhibits n-methyl-d-aspartate receptor-mediated neurotoxicity and calcium accumulation in cultured rat cortical neurons
benzyl alcohol has been used as a preservative in some small multiple-dose vials of bacteriostatic sodium chloride or water for injection. however, there is concern that excipients such as benzyl alcohol may cause adverse reactions to neurons in premature infants [16, 17]. in this issue, takadera and ohyashiki [18] reported that benzyl alcohol inhibited nmda-induced cytotoxicity. furthermore, these authors showed that the protective effect of benzyl alcohol on nmda-induced toxicity is due to its effect in reducing nmda receptor-mediated calcium accumulation, indicating that benzyl alcohol inhibits nmda receptor activity. this finding is significant since it shows a potential beneficial effect of benzyl alcohol on mature neurons against glutamate-induced neurotoxicity, although it may have an adverse effect on immature neurons.
bone morphogenetic proteins (bmps) are the most potent class of all osteoinductive proteins, and bmp-2 has already been clinically applied to accelerate bone regeneration in both fracture and spinal fusion [19]. bone marrow-derived mesenchymal stem cells (bmmscs) have been shown to be able to differentiate in vitro toward osteogenic lineages when treated with established lineage-specific factors [20]. kim et al. [21] studied whether a combination of the undifferentiated bmmscs and bmp-2 delivered via heparin-conjugated plga nanoparticles (hcpn) would extensively regenerate bone in vivo. in the in vivo testing, the undifferentiated bmmscs with bmp-2-loaded hcpn induced far more extensive bone formation, indicating the feasibility of extensive in vivo bone regeneration by transplantation of undifferentiated bmmscs and bmp-2 delivery via hcpn.
previous studies indicated that t-box genes play an essential role in cell specification and morphogenesis [22, 23].
in vertebrates, tbx5 plays a critical role in cardiac and upper limb development. expression of tbx5 at the appropriate dose, time and position is important for normal cardiac development. zebrafish is a well-established model for studying the development of vertebrates. it is an ideal model and has been receiving attention as a human disease model, because the species fertilizes embryos externally, the fetus develops rapidly, and it manages to survive without cardiac function for days. thus, zebrafish was used as the model organism to facilitate our investigation on the causal relationship between tbx5 and cardiac myogenesis during cardiac looping. results demonstrated that in zebrafish, injection of tbx5 morpholino antisense rna caused changes of heart conformation, defect of heart looping, pericardium effusion, dropsy of ventral position and decreased heart rate. in conclusion, this study showed that deficiency of tbx5 might perturb the progress of cardiac looping, as well as the formation of atrium and ventricle, possibly through down-regulating cardiac myogenesis genes such as amhc, vmhc and cmlc2 [24].
the discovery of the naturally occurring cardiac nonfunction (c) animal strain in ambystoma mexicanum (axolotl) provides a valuable animal model to study cardiomyocyte differentiation. it was shown that this recessive mutation, in homozygous animals, causes incomplete differentiation of the embryonic heart at heartbeat initiation stages [25] due to a lack of organized myofibril formation in homozygous mutant hearts [26, 27]. in this issue, jia et al. [28] reported cloning of a peptide cdna (n1) from an anterior-endoderm-conditioned-medium rna library. furthermore, the authors have shown a dramatic decrease of expression of n1 mrna in mutant (c/c) embryos, indicating that the n1 gene is involved in heart development.
these findings are quite significant since revealing the underlying molecular mechanisms of heart development will be an important step in finding cures for heart diseases.
immunohistochemical assessment of cyclic guanosine monophosphate (cgmp) and soluble guanylate cyclase (sgc) within the rostral ventrolateral medulla
the rostral ventrolateral medulla (rvlm) in the caudal medulla oblongata is a major site for generation of neurogenic vasomotor tone. nitric oxide (no) in the rvlm plays an important role in central cardiovascular regulation by regulating the sympathetic vasomotor outflow [29, 30], and cgmp-dependent no signaling in the rvlm is impaired in hypertension [31]. to date, no studies have described the immunohistochemical localization of neurons capable of expressing cgmp within the rvlm. in this communication, powers-martin et al. [32] sought to identify the cellular targets for no in the rvlm by visualizing the anatomical relationship of cgmp with the tyrosine hydroxylase (th) or phenylethanolamine n-methyltransferase (pnmt) cell group. double label immunohistochemistry for cgmp-immunoreactivity (ir) and th or pnmt neurons failed to reveal cgmp-ir neurons in the rvlm of either normotensive wistar-kyoto rats or the spontaneously hypertensive rats. in addition, soluble guanylate cyclase (sgc)-ir was found throughout neurons of the rvlm, but did not co-localize with pnmt- or th-ir neurons. these results indicate that within the rvlm, cgmp is not detectable using immunohistochemistry in the basal state, and this raises the hypothesis that functional network inputs, such as the sympathetic baroreflex pathway, are required to drive a sgc/cgmp cascade in the rvlm.
in vivo gm-csf promoter-based assay for drug screening
in drug discovery research, in vitro cell-based screening systems are well established as methods for evaluation of candidate lead compounds.
for example, in vitro assays of nf-κb [33] and cox-2 [34], two examples of drug targets, are employed to develop therapeutic strategies to counter inflammation. since the regulation of immuno-modifiers is highly dependent on three-dimensional microenvironments, an in vivo assay can more accurately evaluate the effects of drugs on the expression of key cytokines. su et al. [35] devised an in vivo, transgenic, human cytokine (e.g. gm-csf) gene promoter assay using defined epidermal skin cells as test tissue. test compounds were topically applied to mouse skin before or after gene gun transfection, using a cytokine gene promoter-driven luciferase reporter. croton oil, an inflammation inducer, induced a six-fold increase in transgenic gm-csf promoter activity in skin epidermis, and this effect was drastically inhibited by the phytocompound shikonin. these results demonstrate that this in vivo transgenic promoter assay system is cytokine gene-specific, and highly responsive to pro-inflammatory stimuli.
arecoline and the 30-100 kda fraction of areca nut extract differentially regulate mtor and respectively induce apoptosis and autophagy: a pilot study
in taiwan, chewing of betel quid causes oral cancer, which ranks sixth among cancer killers. betel quid consists of areca nut (an), lime and inflorescence of piper betel. it was demonstrated previously that extracts of an (ane), but not those of lime or inflorescence of piper betel, induced rounding of cell morphology and nuclear shrinkage in several carcinoma cell lines. in the present study [36], the molecular weight of the active principle was found to be between 30 and 100 kda; this fraction induced nuclear shrinkage, clearance of the cytoplasm, cleavage of lc3-1q and appearance of autophagic vacuoles and acid vesicles. on the other hand, arecoline (are) triggered caspase-3 activation, peri-nuclear condensation and micronucleation. furthermore, ane 30-100 k, but not are, inhibited the phosphorylation of mammalian target of rapamycin (mtor) at ser2448.
this indicates that different constituents of an induce apoptosis or autophagy in an oral cancer cell line.
at present, high-resolution gas chromatography with high-resolution mass spectrometry (hrgc/ms) is recognized as the most efficient way of determining dioxin compounds [37]. however, the procedures of this detection system are not easy to carry out. the fluorescence resonance energy transfer (fret) technique can precisely evaluate interaction between two molecules in cells, and the use of cyan-fluorescent protein (cfp) as an energy donor and yellow-fluorescent protein (yfp) as an energy acceptor has been reported as the most suitable combination for fret signal detection [38]. lin et al. [39] therefore established a fret-based dioxin-detection assay. aryl hydrocarbon receptor (ahr) and ahr nuclear translocator (arnt) constructs fused to cyan fluorescent protein (cfp) and yellow fluorescent protein (yfp), respectively, were transiently cotransfected into cells of the rat hepatoma line h4iiec3. no fret signals were detected in ahr-cfp- and arnt-yfp-transfected cells. however, dioxin treatment upregulated fret signals in these transfected cells in a dose-dependent manner, indicating the potential of the fret technique in the detection of dioxin-like compounds.
[1] connective tissue growth factor: a cysteine-rich mitogen secreted by human vascular endothelial cells is related to the src-induced immediate early gene product cef-10
[2] nov (nephroblastoma overexpressed) and the ccn family of genes: structural and functional issues
[3] connective tissue growth factor (ctgf) and cancer progression
[4] cgcgh: a tool for molecular karyotyping using dna microarray-based comparative genomic hybridization (array-cgh)
[5] a computational screen for c/d box snornas in the human genomic region associated with prader-willi and angelman syndromes
[6] interactions between m protein and other structural proteins of severe, acute respiratory syndrome-associated coronavirus
[7] severe acute respiratory syndrome coronavirus nucleocapsid protein confers ability to efficiently produce virus-like particles when substituted for the human immunodeficiency virus nucleocapsid domain
[8] iron dictates the virulence of pseudomonas aeruginosa in urinary tract infections
[9] synaptic mechanisms that shape visual signaling at the inner retina
[10] glycinergic synaptic inputs to bipolar cells in the salamander retina
[11] glycine input induces the synaptic facilitation in salamander rod photoreceptors
[12] nicotine enhancement of fast excitatory synaptic transmission in cns by presynaptic receptors
[13] laboratory evaluation of antiepileptic drugs. review of laboratory methods
[14] the gaba postsynaptic membrane receptor-ionophore complex
[15] dose-finding study with nicotine as a proconvulsant agent in ptz-induced seizure model in mice
[16] fatal benzyl alcohol poisoning in a neonatal intensive care unit
[17] the gasping syndrome and benzyl alcohol poisoning
[18] benzyl alcohol inhibits n-methyl-d-aspartate receptor-mediated neurotoxicity and calcium accumulation in cultured rat cortical neurons
[19] bmp-2 evaluation in surgery for tibial trauma-allograft (bestt-all) study group (2006) recombinant human bmp-2 and allograft compared with autogenous bone graft for reconstruction of diaphyseal tibial fractures with cortical defects. a randomized, controlled trial
[20] epidermal growth factor as a candidate for ex vivo expansion of bone marrow-derived mesenchymal stem cells
[21] enhancement of ectopic bone formation by bone morphogenetic protein-2 delivery using heparin-conjugated plga nanoparticles with transplantation of bone marrow-derived mesenchymal stem cells
[22] t-box genes: what they do and how they do it
[23] t-targets: clues to understanding the functions of t-box proteins
[24] cascade effect of cardiac myogenesis gene expression during cardiac looping in tbx5 knockdown zebrafish embryos
[25] genetic and experimental studies on a mutant gene (c) determining absence of heart action in embryos of the mexican axolotl (ambystoma mexicanum)
[26] morphology of developing heart in cardiac lethal mutant mexican axolotls
[27] immunofluorescent confocal analysis of tropomyosin in developing hearts of normal and cardiac mutant axolotls
[28] a novel protein involved in heart development in ambystoma mexicanum is localized in endoplasmic reticulum
[29] differential cardiovascular responses to blockade of nnos or inos in rostral ventrolateral medulla of the rat
[30] the nnos/cgmp signal transducing system is involved in the cardiovascular responses induced by activation of nmda receptors in the rostral ventrolateral medulla of cats
[31] pressor and sympathoexcitatory effects of nitric oxide in the rostral ventrolateral medulla
[32] immunohistochemical assessment of cyclic guanosine monophosphate (cgmp) and soluble guanylate cyclase (sgc) within the rostral ventrolateral medulla
[33] nf-κb: a key role in inflammatory diseases
[34] development and use of a gene promoter-based screen to identify novel inhibitors of cyclooxygenase-2 transcription
[35] immunomodulatory effects of phytocompounds characterized by in vivo transgenic human gm-csf promoter activity in skin tissues
[36] arecoline and the 30-100 kda fraction of areca nut extract differentially regulate mtor and respectively induce apoptosis and autophagy: a pilot study
[37] a hybrid hrgc/ms/ms method for the characterization of tetrachlorinated-p-dioxins in environmental samples
[38] reducing the environmental sensitivity of yellow fluorescent protein. mechanism and applications
[39] establishment of a fluorescence resonance energy transfer-based bioassay for detecting dioxin-like compounds
key: cord-010681-tmpxs9og authors: dondapati, srujan kumar; stech, marlitt; zemella, anne; kubick, stefan title: cell-free protein synthesis: a promising option for future drug development date: 2020-03-20 journal: biodrugs doi: 10.1007/s40259-020-00417-y sha: doc_id: 10681 cord_uid: tmpxs9og proteins are the main source of drug targets and some of them possess therapeutic potential themselves. among them, membrane proteins constitute approximately 50% of the major drug targets. in the drug discovery pipeline, rapid methods for producing different classes of proteins in a simple manner with high quality are important for structural and functional analysis. cell-free systems are emerging as an attractive alternative for the production of proteins due to their flexible nature without any cell membrane constraints. in a bioproduction context, open systems based on cell lysates derived from different sources, and with batch-to-batch consistency, have acted as a catalyst for cell-free synthesis of target proteins. most importantly, proteins can be processed for downstream applications like purification and functional analysis without the necessity of transfection, selection, and expansion of clones. in the last 5 years, there has been an increased availability of new cell-free lysates derived from multiple organisms, and their use for the synthesis of a diverse range of proteins. despite this progress, major challenges still exist in terms of scalability, cost effectiveness, protein folding, and functionality. in this review, we present an overview of different cell-free systems derived from diverse sources and their application in the production of a wide spectrum of proteins. 
further, this article discusses some recent progress in cell-free systems derived from chinese hamster ovary and sf21 lysates containing endogenous translocationally active microsomes for the synthesis of membrane proteins. we particularly highlight the usage of internal ribosomal entry site sequences for more efficient protein production, and also the significance of site-specific incorporation of non-canonical amino acids for labeling applications and creation of antibody drug conjugates using cell-free systems. we also discuss strategies to overcome the major challenges involved in commercializing cell-free platforms from a laboratory level for future drug development. proteins whose functionality is not well characterized form a large percentage of entries in many of the currently available biological databases, including the protein data bank (pdb), and there is a constantly growing demand for reliable and fast synthesis and characterization methods. when it comes to drug discovery, proteins are key components as they can have therapeutic potential themselves (e.g., antibodies, coagulation factors, hormones, growth factors, enzymes, and antimicrobial peptides), but also because they could serve as drug targets for diverse diseases (as ion channels, receptors, enzymes, and transporters, for example) [1] [2] [3] [4] [5] [6] [7] . a large proportion of approved pharmaceutical drugs target human proteins. beyond that, protein-based therapeutics, such as antibody-drug conjugates, represent a significant percentage of total drug molecules currently approved. they are poised to grow further with increased gene expression technology, improved protein engineering, and refined bioinformatics tools. 
some proteins are very difficult to express in traditional cell-based systems and this can hamper our ability to define the mechanism of action and structure-function relationship of the individual protein, knowledge which aids the development of drugs targeting these proteins [1] [2] [3] [4] . generally, to exploit and fine-tune the structural and functional characteristics of a protein, it needs to be expressed and purified with high quality by using recombinant expression technology. traditionally, escherichia coli-based systems were widely used for the production of recombinant proteins due to simplicity in preparation and operation, and cost effectiveness. as a result, extensive research and standardization have been carried out over many years using e. coli-based expression systems, resulting in their often-cited utilization as a state-of-the-art protein expression system [8] . for complex therapeutic proteins, membrane proteins (mps) originating from humans, and virus-like proteins (vlps), mammalian expression systems fulfill all the requirements like post-translational modifications (ptms), cofactors, and chaperones for correct folding and efficient production. however, batch-to-batch variation in cell culture may be a source of process variation. additionally, overexpression of mps might be toxic for the cultivated cells, resulting in cell death or truncated and misfolded proteins [9] . ideally, synthesized proteins are functionally folded and exhibit appropriate ptms. due to the lack of extensive research and low yields in recombinant protein expression, many mps are not yet crystallized, thus limiting the computer-aided drug discovery efforts. due to the growing demand for the production of protein biologics and drug discovery targeting proteins, alternative strategies for protein synthesis should be developed. 
new expression technologies where proteins can be expressed in a simple way and which allow high throughput screening of different reaction conditions, different genes, and different supplements in a cost-effective manner are extremely important for future drug development. in this review, we give an overview of recent advances in cell-free (cf) synthesis platforms and their diverse applications. additionally, we focus on human and therapeutic proteins produced by different types of cf systems and how these cf protein synthesis (cfps) methods can further play a prominent role in future drug development. cfps systems use crude cell extracts prepared from cells of choice by lysis followed by many steps of washing to remove the cell debris and genomic dna [10, 11] . these cell extracts can be stored at −80 °c for years and can be used by thawing just before the reaction. such extracts contain all the principal components necessary for transcription and translation, such as aminoacyl-trna synthetase (aas), ribosomes, and factors necessary for elongation, initiation, and transcription. protein synthesis can be realized by combining cell extracts with necessary substrates like amino acids, energy substrates, dna, cofactors, salts, and nucleotides. depending on the biochemical properties of the protein and its end application, the appropriate cf system can be selected. cfps is a fast protein production system since it does not require transfection or cell culture and lacks cell viability constraints. due to their openness, cfps platforms offer additional advantages when compared with cell-based expression methods. a comparative analysis of cf and cell-based approaches is shown in table 1 . for complex proteins, eukaryotic cf systems are ideal as they contain the endogenous microsomes derived from the endoplasmic reticulum (er), enabling co-translational translocation of proteins and er-based ptms [10, 11, 18, 19] . 
there has been a constant improvement in the quality of lysate preparation, system optimization, linear template-based protein synthesis, and reduction of process costs, which has led to the preparation of cost-effective systems suitable for commercial purposes. a general scheme of cf protein production is depicted in fig. 1 . cfps platforms are of either prokaryotic or eukaryotic origin. among the prokaryotic cf systems, extracts based on e. coli are regularly used and are available commercially for cfps of a diverse range of proteins. very recently, cf systems based on bacillus subtilis [32] , pseudomonas putida [33] , streptomyces [34] , and vibrio [12] have been optimized well at the laboratory level due to the ease of preparation of cf lysates. a wide range of detailed protocols is currently available for the preparation of e. coli-based lysates. among the eukaryotic cf systems, extracts based on rabbit reticulocyte lysate (rrl), wheat germ, insect spodoptera frugiperda 21 (sf21), chinese hamster ovary (cho), and cultured human cells are regularly used. an increasing number of eukaryotic cf systems have so far reached technical maturity and become commercially available. 
table 1 comparative analysis of cf and cell-based approaches
feature | cf systems | cell-based systems | references
... | ... | requires dna template to be cloned in a plasmid | [15]
toxic proteins | ideal choice for the synthesis of most of the toxic proteins due to high toxic tolerance of cf systems | toxic proteins may be difficult to synthesize | [16]
membrane proteins | suitable for a wide range of mps of different sizes | mp overexpression can lead to cell toxicity and death | [17] [18] [19]
membrane protein solubilization | to solubilize mps, supplements can be added directly to the reaction mixture in the form of nanodiscs, detergents or liposomes (prokaryotic systems and wheat germ systems) or by using endogenous microsomes or proteoliposomes (eukaryotic systems) | not possible to add supplements externally during translation; mps have to be purified and reconstituted into liposomes or detergents for functional analysis | [9, 20]
reliability | most of the reports related to cfps are currently limited to the research and laboratory level, but progress towards the drug discovery pipeline has been made recently | most reliable and state of the art for protein production and drug discovery purposes, and approved by drug authorities | [21]
functional characterization | compared with cell-based systems, standardized biophysical, biochemical assays are limited, but progress has been made recently | a wide range of standardized biophysical and biochemical techniques are available for proteins synthesized by cell-based systems | [22]
protein yields and downstream applications | yields range from µg/ml (complex proteins) to several mg/ml (cytosolic proteins and few mps) with more complex proteins; downstream applications are simple and protein can be purified and reconstituted immediately after synthesis | yields can be very high, in the range of mg/ml; downstream applications are possible but need to additionally lyse the cells for mps | [11, 21, 22]
post-translational modifications (ptms) | ptms possible (mostly in eukaryotic cf systems with translationally active microsomes); limited ptms in prokaryotic and eukaryotic lysates lacking endogenous microsomes; o-glycosylation not possible | all ptms are possible including o-glycosylation | [13]
incorporation of non-standard amino acids | ideal choice for the incorporation of single and multiple non-canonical amino acids | difficult to incorporate non-canonical amino acids due to cell membrane barrier and cytotoxic effects | [23, 24]
scale of reaction | volume ranging from few µl (chip-based and batch-based in an eppendorf tube) to 100 l reaction (in a fermenter) | typical reactions require a minimum of 5 ml; there are exceptional cases where it is performed in 60 µl spots | [25] [26] [27] [28]
flexibility | completely open system and easy to manipulate the reaction conditions with lack of cell membrane constraints | completely closed system and difficult to manipulate | [22]
automation | cf systems can be automated with high throughput screening of multiple templates, starting in an elisa plate format | generally difficult to automate due to the requirement of larger volumes and aseptic techniques | [29]
point of care production of biologics | lyophilized cf lysates are suitable for the production of therapeutic proteins next to the emergency settings | very difficult due to its time-consuming process and requirements of large infrastructure including manufacturing facilities, transport, and cold storage facilities | [30, 31]
recently, several eukaryotic cf extracts based on tobacco [35], leishmania [36], neurospora [37], yeast cells [38], and human blood cells [39] were characterized and optimized for a limited number of proteins at the laboratory level. there is a growing trend in the development of novel cf platforms for taking advantage of the genetic tools available in the literature and the abundant literature available on the in vivo expression of proteins. prokaryotic cf systems based on e. coli are most commonly used for protein production towards drug development due to their simplicity and a vast literature available on the utilization of these cells. 
protein synthesis starts with crude cell extracts prepared from e. coli cells that contain the translation machinery along with all the essential components required for translation. a modified and reconstituted cf synthesis system known as the pure system (protein synthesis using recombinant elements), where all the components of the translation machinery are purified and added individually along with the dna template to produce the protein, has been reported [40] . this is a highly controlled system compared with crude extract methods. a major advantage of the pure system is that protein factors participating in the initiation, elongation, and termination of the protein synthesis process are identified and can be adapted individually to the cf system's requirements. although the naturally occurring ptm machinery is not available in the e. coli lysates, recently proteins with n-glycosylation were synthesized by using e. coli extracts enriched with glycosylation components, including oligosaccharyltransferases (osts) and lipid-linked oligosaccharides (llos) [41] . using release factor (rf1) deficient e. coli lysates, proteins were phosphorylated by incorporation of non-canonical amino acids, which will be addressed in a later part of this review [42] . due to a constantly growing demand for more complex proteins of pharmaceutical value, cf systems based on eukaryotic lysates have been developed to produce high-quality proteins. cf systems based on wheat germ lysates (wgl) are among the most popular eukaryotic platforms due to their capacity to produce eukaryotic proteins with high yields [43] . cfps based on wgl have been used frequently for the discovery of novel vaccine candidates as well as for producing several proteins of high quality for structural analysis. despite the high yields and quality of the lysate, this system does not offer all the ptms like glycosylation and does not support the solubilization of complex mps [13] . 
in the case of wheat germ and rrl, there are no translationally active endogenous microsomes present in the system. in the case of rrl, exogenous microsomes are typically supplied from the canine pancreas for protein translation [13, 44] . it is quite laborious and difficult to enrich rrls with heterologous microsomes. cf systems derived from cultured insect (sf21) cells represent the most popular eukaryotic-based approach for synthesizing a wide variety of proteins. sf21 lysates contain translationally active endogenous er membranes, thereby supporting the signal peptide-mediated translocation of proteins across the membrane, and further provide functions such as signal peptide cleavage, post-translational modifications like n-glycosylation, and lipid modification [13, 14, 45] . cho cell-based expression is well established and is approved by the fda for the large-scale synthesis of several biologics because it provides human-compatible ptms. nearly 70% of the approved mammalian therapeutic proteins are currently expressed in cho cells. however, these cells have limitations when it comes to difficult-to-express proteins, such as overexpressed complex mps, toxic proteins, and multi-subunit proteins, as discussed above. cf systems based on cho lysates are evolving as an alternative strategy for the expression of difficult-to-express proteins [13, 17-19, 46] . apart from many general advantages of cf systems, cho-based cf systems retain most of the features of cho cells while being more flexible due to the lack of cell membrane boundaries. cho-based lysates harbor endogenous microsomal vesicles enabling translocation of transmembrane proteins and secretory proteins. furthermore, ptms of de novo synthesized mps, such as glycosylation, are possible using cho lysate. thus, using cho cell lysate for cfps has a potential value and enables new opportunities, in particular, the high-yield production of pharmaceutically relevant mps [13, 17-19] . 
there is a significant increase in the number of publications based on cho lysates for cfps. table 2 compares different cf systems and their advantages and limitations with some selected examples.
fig. 1 general scheme depicting the overall process of cell-free protein production. aatrna aminoacyl-trna, aas aminoacyl-trna synthetase, atp adenosine triphosphate, ef elongation factor, gsh glutathione, gssg glutathione-disulfide, gtp guanosine-5'-triphosphate, if initiation factor, ires internal ribosome entry site, mp membrane protein, ncaa non-canonical amino acid, pdi protein disulfide isomerase, peg polyethylene glycol, ptm post-translational modification, r ribosomes, t-rna transfer rna, tf transcription factor, utr untranslated region, vlp virus like particle
table 2: chinese hamster ovary (cho)
advantages: mimic the cho cell-based production; ptms (n-glycosylation, disulfide bridging, and lipidation); suitable for a wide range of eukaryotic and complex proteins; presence of translational active endogenous microsomes [45]; high yields in cecf mode; endotoxin free lysates used for point-of-care testing [30]
limitations: low yields especially in the batch mode [58]; cost ineffective and difficult to establish unlike e. coli-based system
selected examples: streptokinase (cecf): 500 µg/ml [59]; human tlr9 receptor [18] (3-h batch): 21 µg/ml, (48-h cecf): 900 µg/ml; hegfr [46] (batch): 40 µg/ml, (cecf): 800 µg/ml
cfps can be performed in different formats. the batch-based format is the most commonly used method both in the prokaryotic and eukaryotic systems. this method is relatively fast and cheap, and synthesis can be performed within 1.5-3 hours depending on the system. e. coli-based systems can provide protein yields ranging from 100 µg/ml to 2-3 mg/ml. although the yields from batch-based eukaryotic systems are comparatively low, mps are automatically incorporated into microsomal membranes and the functionality can be addressed immediately after the synthesis [62] . 
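the batch and cecf yields listed for cho lysate in table 2 can be turned into a fold increase with simple arithmetic. the following is a minimal sketch in python, using only the two proteins for which both a batch and a cecf value are given; the dictionary layout and the function name `fold_increase` are illustrative assumptions, and the tlr9 values compare a 3-h batch reaction with a 48-h cecf reaction, so the ratio reflects both format and reaction time.

```python
# Yields in µg/ml for CHO lysate, copied from the table 2 entries in the text.
# The data structure and names below are illustrative, not from the source.
yields_ug_per_ml = {
    "human tlr9 receptor": {"batch": 21.0, "cecf": 900.0},  # 3-h batch vs 48-h cecf
    "hegfr": {"batch": 40.0, "cecf": 800.0},
}


def fold_increase(batch, cecf):
    """Return the ratio of the CECF yield to the batch yield."""
    if batch <= 0:
        raise ValueError("batch yield must be positive")
    return cecf / batch


# Compute the CECF-over-batch ratio for each protein.
fold_by_protein = {
    name: fold_increase(values["batch"], values["cecf"])
    for name, values in yields_ug_per_ml.items()
}

for name, fold in fold_by_protein.items():
    print(f"{name}: {fold:.1f}-fold higher yield in cecf than in batch")
```

streptokinase is left out of the sketch because only a cecf yield is listed for it.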
for researchers who would like to further scale up the protein yields via batch-based eukaryotic systems, a repetitive batch-based synthesis format has been proposed where the microsomes incorporating the mp of interest generated in an initial synthesis reaction can be added to a fresh cf synthesis reaction that has been depleted of its microsomes [19, 45]. another popular cf synthesis format that has been used for a rapid increase in the protein yields is the so-called continuous exchange cell-free synthesis platform (cecf). in this format, a semi-permeable dialysis membrane separates the reaction chamber from a feed chamber. the feed chamber supplies fresh reaction components to the reaction chamber and, in exchange, the inhibitory components accumulated during the reaction are removed [14, 17, 18, 46]. typically, the cecf format prolongs the reaction time and increases the protein yields. until now, the cecf format has been used to increase the protein yield several-fold, and it is widely used across cf platforms (table 2). this section highlights some of the key parameters that might influence the protein production using cf lysates. designing synthetic dna and sequence manipulation for cf synthesis by adding regulatory elements plays a significant role in high-yield protein production. in eukaryotic cf systems, initiation factors (ifs) in particular limit the initiation of protein synthesis, thereby leading to low protein yields. one alternative is to use internal ribosome entry site (ires) elements found in the 5′-untranslated region (5′utr) of different viral genomes upstream of the start codon for cap-independent translation initiation [14, 18, 62]. ires elements from three different viral sources were compared for their translational efficiency in sf21, cho, and human leukemia k562 cf lysates. the ires from the cricket paralysis virus (crpv) typically increased protein yields by a factor of 3-5 [62].
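the element order described above (promoter, then a 5′utr element such as an ires, then the coding sequence) can be sketched as a simple string assembly. this is only a minimal illustration under stated assumptions: the promoter string is the commonly quoted t7 promoter core sequence and should be checked against the actual vector, the 5′utr element is a hypothetical placeholder of n bases and not a real ires, and the function names are invented for this example.

```python
# Commonly quoted T7 promoter core sequence; verify against your own vector.
T7_PROMOTER = "TAATACGACTCACTATAGGG"
# Hypothetical placeholder standing in for a real IRES-type 5'UTR element.
UTR_ELEMENT_PLACEHOLDER = "N" * 30
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_orf(orf: str) -> str:
    """Return the upper-cased ORF after basic sanity checks."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    if not orf.startswith("ATG"):
        raise ValueError("ORF must start with ATG")
    if orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must end with a stop codon")
    return orf


def assemble_template(promoter: str, utr_element: str, orf: str) -> str:
    """Concatenate promoter, 5'UTR element, and ORF in 5'-to-3' order."""
    return promoter + utr_element + check_orf(orf)


if __name__ == "__main__":
    # Toy ORF (Met-Ala-Lys-Gly-stop), purely illustrative.
    template = assemble_template(T7_PROMOTER, UTR_ELEMENT_PLACEHOLDER,
                                 "ATGGCTAAAGGTTAA")
    print(len(template), template)
```

the atg check reflects the conventional start codon only; designs that deliberately replace the initiator codon would need that check relaxed.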
by inserting the crpv-ires into the corresponding vector upstream of the epidermal growth factor receptor (egfr) gene, and using the cecf reaction format, egfr yields were increased more than 100-fold compared with the batch reaction format without crpv-ires [14]. additionally, replacement of the initiator codon (atg) with a gcu codon in combination with the crpv-ires resulted in a further improvement of protein expression levels in cho and k562 cf systems [62]. the vector backbone also plays an important role in cfps. a detailed study comparing commercially available vectors harboring the luciferase gene in combination with crpv-ires showed that a change in the vector backbone can increase the protein yield 5-fold [19]. species-independent translational sequences (sits) are another group of synthetic 5′utrs capable of initiating cap-independent translation in multiple prokaryotic and eukaryotic cf systems [66]. typically, polymerase chain reaction (pcr) products are generated with sits downstream of the t7 promoter and upstream of the start codon atg [66]. the 3′ hairpin region of the sits increases the residence time of the preinitiation complex in the vicinity of the start codon [66]. using l. tarentolae cfps in the presence of genes encoding 58 rab encoding variable fragments in combination with a universal sits, nearly a full complement of human rab gtpases were produced with a yield of around 30 µg/ml [36]. similarly, egfp with a yield of around 300 µg/ml [66], and an active multisubunit enzyme, heterodimeric farnesyl transferase (ftase), were synthesized using the l. tarentolae cfps [36]. codon optimization is another important parameter that plays a crucial role in increasing the expression yields of proteins. codon optimization has been shown to influence the translation efficiency of several proteins [71]. by taking advantage of the cf lysates derived from n. crassa and s.
cerevisiae, transcription and translation reactions were uncoupled for ribosome profiling, which provided strong biochemical evidence that codon optimization enhances the rate of translational elongation, thereby affecting the ribosome traffic on the mrna [72]. codon optimization usually improves protein yields, but it was also shown that faster translation rates might negatively affect the folding and function of the individual protein [72, 73]. this problem often cannot be solved even by altering the trna population in the case of cfps. the addition of anti-spliced leader oligonucleotide to l. tarentolae cell extracts suppressed the translation of endogenous l. tarentolae mrnas, thus increasing the translation efficiency of exogenously supplied mrna [65]. using the er-specific signal sequence of honeybee melittin (melittin signal peptide) instead of the native signal peptide increased the translocation of synthesized proteins such as wnt proteins, single-chain antibody variable fragments, and the htlr9-ectodomain into microsomes in the case of sf21 and cho-based cf systems [18, 59, 74, 75]. iterative optimization processes are required to develop high-yield cfps. factors that influence both protein quality and quantity include reaction temperature, reaction time, plasmid concentration, salt concentration, t7 polymerase, and other supplements. the influence of these factors on the synthesis rates is also protein specific. very recently, cfps of human toll-like receptor protein (htlr9) in cho-based lysates has been reported by using a cecf method with high yields of around 0.9 mg/ml. by increasing the temperature from 27 to 30 °c, the protein yields were increased by almost 50%. stable monitoring and maintenance of ph throughout the entire cf reaction, along with a sufficient adenosine triphosphate (atp) supply, are essential for efficient, maximum-yield protein production.
the ph can be controlled throughout the cf reaction by using amino acid decarboxylase [76]. supplementation of chaperones influences the functional folding of many proteins. supplementation of chaperones such as groes/el and dnak/dnaj/grpe in prokaryotic cf systems was used to increase the yield and solubility of colicin m from 16 to 100%, resulting in enhanced cell-killing activity [77]. li et al. demonstrated that by using cfps based on wheat germ extracts, expression of j-domain containing chaperone proteins (dnajb12 and dnajb14) along with potassium channels plays a critical role in the folding, stabilization, and tetramerization of k+ channels [78]. ion concentrations (potassium and magnesium) in the cf reaction have a significant effect on protein production. in the case of cho-based cecf reactions, an increase in the magnesium ion (mg2+) concentration from 3.9 to 22.5 mm led to a 3.9-fold increase in egfr yield [46]. for efficient regeneration of atp, several methods have been developed in cf systems. in prokaryotic systems, compounds like phosphoenolpyruvate (pep), glucose + glutamate decarboxylase, glucose-6-phosphate, fructose-1,6-bisphosphate, acetyl phosphate, maltodextrin, and creatine phosphate are widely used as energy sources [79]. in eukaryotic cf systems, a combination of creatine phosphate and creatine kinase is typically used for energy regeneration. apart from these, phosphoglycerate (b. subtilis) and polyphosphate are used in cf systems [80, 81]. cf systems have evolved over the last decade from their use as a prototype method in research laboratories to commercial and large-scale applications. in this section, the utility of cf systems in mp synthesis, antibody production, vaccine development, protein labeling, and antimicrobial peptide synthesis is addressed. mps are structurally and functionally diverse, and constitute 30% of the proteins encoded in the human genome.
drugs targeting mps such as ion channels, transporters, and g-protein coupled receptors (gpcrs) represent 12 out of the top 20 global revenues in the pharmaceutical industry [3] . due to the presence of transmembrane domains, ranging from 1 to 24, these proteins are highly hydrophobic and are very challenging to express by traditional cell-based systems. expression of human proteins in heterologous cellular hosts is very much limited due to the difference in their lipid composition, which can prevent the mps from attaining maximum functionality [9] . synthesis of mps by cell-based methods often leads to cytotoxicity, aggregation, and improper folding [9] . to analyze mp functionality, the protein needs to be folded properly and in the appropriate hydrophobic environment. cf systems derived from prokaryotic, as well as eukaryotic lysates lacking endogenous microsomes, require specific supplements for the solubilization of mps in the form of detergents, nanodiscs, or liposomes. non-ionic and zwitterionic detergents are commonly used as supplements in the majority of cfps reactions for the solubilization of mps during their production. detergent-solubilized mps can be either used directly for functional analysis or may be reconstituted into liposomes by mixing with artificial lipids followed by detergent removal [9, 82, 83] . alternatively, nanodiscs (nds) and liposome-based reconstitution are detergent-free strategies where nds and liposomes, prepared and characterized externally, could be supplemented directly into the reaction mixture for the reconstitution [9, 84] . a detailed review of the cf synthesis of mps and the usage of solubilization supplements for isolation and functional analysis is presented in the literature [9] . some of the advantages of nds are easy purification and flexibility in using different lipids and membrane scaffold proteins for creating different sizes, and their availability as monodisperse and homogenous nds. 
nonetheless, nds have their limitations, particularly when working with a protein whose functionality depends on its orientation and also working with transporter proteins. liposome-based reconstitution covers the limitations of the nds, but the separation of liposomes after the cf reaction is quite challenging and often suffers from disruption due to osmotic instability. further, such passive reconstitution strategies do not offer the advantages of post-translational modifications within native membranes and are limited for mps whose function does not depend on active translocon-based translation. cf systems derived from eukaryotic lysates equipped with endogenous microsomes (e.g., sf21, cho, cultured human cells, and tobacco-by2) satisfy all the necessary requirements for proper folding of mps. the microsomes offer a native environment and intact translocon machinery for a proper embedment and folding of mps [13, 18, 19, 45, 46, 51, 59] . there are continuing efforts in analyzing the functionality of microsomal reconstituted mps, indicated by a good number of publications reporting on this reconstitution strategy, which should help the pharmaceutical industry to develop more dynamic drug screening assays involving mps [9, 46, 83] . here we present recent works on ion channels, transporters, and gpcrs, which constitute more than 40% of the major drug targets in the pharma industry [85] . ion channels constitute approximately 19% of all currently existing human drug targets and play a crucial role in diverse physiological processes involving cell excitability, neuronal transmission, metabolism, sensory transduction, cognition, and electrolyte homeostasis. transporters mediate the translocation of a variety of substrates across biological membranes [86] . the solute carrier (slc) family is the largest class of transporters and is implicated in metabolic conditions and diseases, and in the transport of drugs. 
these proteins typically have 9-12 transmembrane domains and are difficult to express by traditional methods [2, 9, 50, 87] . slc transporters are an emerging drug target class and the molecular target of several approved inhibitor drugs [2] . despite this, these classes of proteins remain largely unexplored in recent years due to the high costs involved and lack of proper expression methods [9, 53] . table 3 highlights some of the selected publications using cf methods for synthesis, reconstitution, and functional analysis of ion channels and transporters. the most widely used method of reconstitution for functional analysis is detergent-based reconstitution into liposomes or passive integration of mps into liposomes and nds. the majority of the functional assays were performed with plbe in the case of ion channels and substrate uptake assays in the case of transporters. gpcrs transduce extracellular stimuli to the inside of cells, after activation by a variety of different molecules such as neurotransmitters, hormones, odorants, and peptides, thereby triggering several signal transduction cascades. the involvement of gpcrs in almost all processes in living cells has resulted in significant pharmaceutical interest in this protein class, and the development of robust and high-throughput-suitable assays for the discovery of novel ligands and drugs targeting these proteins. in principle, the screening of ligands can be performed in whole-cell assays by measuring a downstream signaling event, or in cf assays, which are decoupled from the living organism. usually, these decoupled methods are preferable for high-throughput screenings, as they are easy to handle and therefore amenable for automation and downsizing. these parameters can be well combined with cfps. an automated cf synthesis procedure for the production of different mps is already reported [97] . 
this procedure might be further expanded for the parallel analysis and identification of molecules that target different gpcrs. to date, only a few studies have analyzed in detail the activity of receptors produced by cf systems. the main reason for this is that there are limited well established activity assays. this section addresses possible activity assays that might be transferred to cf systems in the future. radioligand binding assays, the gold standard for identifying binding molecules, are already adapted for gpcrs that have been synthesized in eukaryotic and prokaryotic cf systems, and demonstrate similar binding affinities in comparison with in vivo produced gpcrs [20, [98] [99] [100] . alternatively, fluorescently labeled ligands can be analyzed by an optical read-out system using eukaryotic cf systems harboring endogenous membrane structures [101] . nevertheless, for these systems, radiolabeled or fluorescently labeled ligands are required, thereby limiting the analysis mainly to gpcrs with already known ligands. in addition, simple ligand binding assays usually do not differentiate between an agonistic and an antagonistic effect of the bound substance. in this context, measuring downstream signaling to distinguish between an activating and inhibitory ligand is preferable. one possible method of choice is the receptor-mediated coupling of g proteins [102] . this early event immediately follows after gpcr activation and is detected by the binding of [ 35 s]gtpγs to gα subunits. this method is not yet established in cf systems but might be transferable assuming the presence of gα proteins in the eukaryotic lysate. alternatively, the gα proteins can be additionally co-synthesized to the target gpcr or directly applied to the reaction based on the open nature of cf systems. after gpcr activation, gtp binding and hydrolysis should be detectable. 
in addition to ligand binding and g protein coupling, intra-and intermolecular interactions can be visualized by förster and bioluminescence resonance energy transfer techniques. different sensor models are known in living cells [103] . the monitoring of intermolecular interactions can be performed as well in cf systems using the already established in vivo models. one model includes the tagging of the c-terminus of a gpcr of interest to a fluorophore (gfp/yfp) and fusing a binding partner such as β-arrestin to luciferase or a second fluorophore [104] . upon activation of the gpcr, β-arrestin binds to the receptor and both tags are in close proximity, resulting in a measurable energy transfer. this model requires active g protein-coupled receptor kinases for the phosphorylation of the c terminus of the synthesized gpcr to get recognized by β-arrestins. this requirement has to be analyzed in detail in the individual cf systems. the second known in vivo model visualized intramolecular changes after agonist and antagonist binding by introducing fluorophores into the third extracellular loop and the c terminus of different gpcrs. upon activation, the distance between both fluorophores changes and an alteration in the energy transfer can be measured [105, 106] . initial experiments to transfer these energy transfer-based sensors were recently performed in cf systems [107] . both models can be applied to high-throughput analyses. in summary, the successful cf synthesis of a variety of gpcrs has been demonstrated in recent years and a transfer of these gpcr production systems to a drug discovery format in a high-throughput manner has recently started. in the near future, we might see novel technologies for ligand screenings, thereby utilizing the advantage of the automatization and downsizing capacity of cf systems. 
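the distance dependence that the energy transfer sensors described above rely on can be made concrete with the standard förster relation, e = 1 / (1 + (r / r0)^6), where r is the donor-acceptor distance and r0 is the förster radius at which the transfer efficiency is 50%. the sketch below is a generic illustration and not a model of any specific sensor from the cited studies; the r0 value of 5 nm and the example distances are assumptions chosen only to show the trend.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Förster transfer efficiency E = 1 / (1 + (r / R0)**6)."""
    if distance_nm <= 0 or forster_radius_nm <= 0:
        raise ValueError("distances must be positive")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


if __name__ == "__main__":
    r0 = 5.0  # assumed Förster radius in nm, for illustration only
    for r in (4.0, 5.0, 6.0):  # hypothetical donor-acceptor distances in nm
        print(f"r = {r:.1f} nm -> E = {fret_efficiency(r, r0):.3f}")
```

because of the sixth-power dependence, a change of about one nanometer around r0 shifts the efficiency substantially, which is why small conformational changes after ligand binding can produce a measurable signal.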
the gold standard for synthesis, development, and production of antibody-based drugs (based on full-length antibodies) is mammalian cell culture-based expression systems. although cultivation of mammalian cells is well established and widely used, the development of monoclonal antibodies (mabs) and antibody-drug conjugates (adcs) remains time-consuming and challenging. thus, methods for high-throughput screenings, especially in the early-stage evaluation of antibody candidates, are valuable. in view of this, the use of the cf technology constitutes a promising strategy to shorten the time from antibody discovery to production. using cf technology, antibodies can be produced in a flexible scale within a couple of hours. besides the synthesis of individual antibodies, cf technology can be used to display libraries of antibodies. in contrast to phage and yeast display, in vitro cf systems such as ribosome and mrna display are open, and thus result in higher library sizes. in theory, the size of the library is only limited by the quantity of supplemented mrna/dna, the volume of the cf reaction, and the number of ribosomes within the system, resulting in library sizes of ~10^12 to 10^15 per ml of cf reaction [108]. in comparison, phage and yeast display exhibit library diversities of ~10^6 to 10^10. selection technologies such as ribosome display [109], mrna display [110], and cis display [111] have been developed based on reticulocyte lysate [110] and e. coli cf systems [109]. these systems focused on smaller antibody fragments because their functionality does not rely on the assembly of multiple polypeptide chains. nonetheless, recently two groups have succeeded in developing completely cf display technologies that allow the selection of fab fragments [112]. the challenge to assemble the heavy and light polypeptide chain (hc/lc) of the fab fragment was approached in different ways. while sumida et al.
succeeded by combining mrna display based on two mrna sub-libraries, one encoding hc, the other one encoding lc, with in vitro compartmentalization pcr to link and then amplify hc and lc gene pairs [112], stafford et al. developed a ribosome display method where they displayed only one of the two fab chains, while the other one was not presented in display format [113]. successful synthesis of different antibody formats, including single-chain variable fragments (scfvs), fab fragments, as well as complete iggs, has already been shown in e. coli [114, 115], sf21 [10, 116], reticulocyte [110], wheat germ [117], and cho cf systems [46, 75, 118]. furthermore, the upscaling of cf reactions to the liter-scale [25, 115] as well as downscaling [119] and high-throughput applications [120] have been demonstrated. in addition, advances in bioorthogonal reaction chemistries have paved the way to expand the possibilities for adc development. the site-specific introduction of non-canonical amino acids into a genetically engineered sequence can be used to create site-specifically labeled adcs [121]. currently, seven adcs are approved for therapy. to date, all of these adcs have been generated by coupling of mabs to the cytotoxic linker-payload via surface-exposed lysines, or partial disulfide reduction and conjugation to free cysteines, which typically results in a controlled but heterogeneous adc population with varying numbers and positions of drug molecules attached to the mab [122]. homogeneous adc populations can be achieved by introducing the payload at one or more defined positions. by developing a bioorthogonal trna/synthetase pair, zimmerman et al. have shown that the optimized non-canonical amino acid para-azidomethyl-l-phenylalanine (pamf) can be site-specifically incorporated into the tumor-specific, her2-binding igg trastuzumab [123].
subsequently, the cytotoxic linker payload dbco-peg-monomethyl auristatin f (dbco-peg-mmaf) was conjugated to pamf via strain-promoted azide-alkyne cycloaddition (spaac) copper-free click chemistry. in the context of dual-functioning molecules, bispecific antibodies have also emerged as promising anti-cancer agents. one of the advantages of these proteins is their capability to target two different epitopes simultaneously, thereby increasing target engagement where mono-specific antibodies might fail [124]. due to their open design, cfps reactions can easily be manipulated, for example by varying the template ratios and concentrations of hc and lc. xu et al. showed the successful assembly of bispecific 'knobs-into-holes' antibodies in multiple scaffolds by using an e. coli-based cf expression platform [125]. taken together, antibody evolution, selection, and engineering can dramatically benefit from the technological advances in the field of cfps. (1) novel display technologies based on cf methods enable the in vitro evolution of multimeric proteins and allow for more sophisticated protein engineering. (2) due to the very short time frame from synthesis to functional testing, cf systems can accelerate antibody construct evaluation by repetitive (one after another) and/or parallel screening. (3) the introduction of non-canonical amino acids expands the chemical repertoire and thus the possibilities to modify and improve antibody-based therapeutics. advanced labeling technologies allow for a very fast qualitative analysis of drug-to-antibody ratio (dar), linker, linker/position, drug, drug/position (research application), and allow full control of the adc design (commercial application). cf systems are becoming a potential option for synthesizing vaccine antigens. most of the vaccine antigens produced by cf systems to date have used e. coli and wgl. recent progress on eukaryotic cf systems may offer additional advantages.
in this context, eukaryotic cf systems are endotoxin free and lack the complex plasma membrane, which makes protein purification simple. some of the antigens synthesized by using cf systems are highlighted in table 4. they are able to induce a strong immune response in experimental animals and could serve as a proof of concept for future vaccine development. using recent advances in cfps technology, a freeze-dried, cell-free (fd-cf) expression system was created based on e. coli cf lysates [31]. using this fd-cf technique, diphtheria toxoid antigen variants (dt5 and dt6) were produced following rehydration with water, and the function of the synthesized proteins was verified by administration in mice and measurement of the immune response [31]. the fd-cf method could enable the production of on-demand, point-of-care biologics requiring just the simple addition of water for activation and synthesis. recently, cf-based expression has proven successful in producing difficult-to-express proteins like major outer membrane protein (mmomp) of chlamydia spp., a major vaccine antigen. using e. coli-based cfps, mmomp was synthesized in a native trimeric form in the presence of nanolipoproteins (nlps) with a yield of around 1.5 mg/ml. when injected into mice in the presence of an adjuvant, the protein elicited an enhanced humoral immune response [126]. this method of synthesizing and simultaneously incorporating antigens into nlps using a cf approach is a promising method for future vaccine development. conjugate vaccines are one of the safest and most effective biologics [127]. bioconjugate vaccines are produced using protein glycan coupling technology (pgct). however, pgct has its own limitations such as time-consuming in vivo processes. additionally, fda-approved carrier proteins, such as toxins derived from clostridium tetani and corynebacterium diphtheriae, have not yet been demonstrated to be compatible with an e. coli-based production process.
relevant ptms are often difficult to synthesize in e. coli cf systems. meanwhile, there have been further advances in using bacterial glycoengineering combined with cfps for producing bioconjugate vaccines. this cf glycoprotein synthesis (cfgps) used glycooptimized e. coli extracts integrating both n-linked glycosylation and protein synthesis. using cfgps, two bioconjugate vaccines were synthesized against f. tularensis and e. coli o78 [146, 147]. besides posttranslational modification, the assembly of macromolecular structures in cf systems is highly ambitious. virus-like particles (vlps), for example, are nanoscale structures that are formed from the self-assembly of viral proteins without the viral genome responsible for the infection. usually, vlps mimic the capsid structure of the real virus. vlp antigens are vaccine candidates for several diseases [148]. one of the vaccine candidates, which is currently in clinical trials, contains vlp antigens addressing noroviruses responsible for gastroenteritis in humans [149]. cf synthesized vlps were structurally confirmed by electron microscopy [150]. using e. coli cf systems, antimicrobial colicins (colicin m, la, e1, and e2) have been synthesized with high yields (around 300 µg/ml) and solubility. the synthesized colicins are able to effectively kill the target cells without any purification [151]. antimicrobial peptides (amps) are another class of defense molecules that have a wide spectrum of targets; for example, bacteria, viruses, fungi, parasites, and cancer. amps are evolving as alternatives to antibiotics [152]. using lyophilized e. coli cf lysates, ten different amps have been synthesized successfully and the functionality of

table 4 (excerpt)
plasmodium falciparum, wheat germ: interaction of pfmsa180 with cd47 was confirmed by erythrocyte binding assay [133]. antigen-specific igg responses to lsa3-c were profiled by an alphascreen assay [134]. western blotting and elisa confirmed the interaction of purified recombinant msp11 with human sera [135]. immunization of purified pfron12 with freund's complete adjuvant into japanese white rabbits generated pfron12 antisera [136]. rabbits immunized with expressed pfripr produced specific antibodies [137]. anti-exp1 antibodies were generated by immunization of recombinant exp1 [138].
wheat germ proteoliposomal engineered cldn-5 antigens induced anti-cldn5-5-ecr antibodies in mice [145]
αcd19-id small diabody (db) molecule containing both a b-cell-targeting moiety (anti-cd19) and a lymphoma id, 38c13-scfv idiotype-specific single-chain variable fragment of the immunoglobulin from the 38c13 mouse b-cell lymphoma, cldn-5 claudin-5

when non-canonical amino acids (ncaas) are incorporated into proteins, novel functional, structural, and imaging properties can be generated. this synthetic biology application is fast emerging and has wide applications such as incorporating precise ptms and adding novel functions to proteins [23, 24, 42, 45]. by taking advantage of the openness of the cfps, one can add the machinery responsible for the co-translational incorporation of ncaas directly to the standard reaction components. one possible method to incorporate ncaas is to use precharged trnas harboring the ncaa. one of the most commonly used methods is the amber suppression technology using an orthogonal pair of aminoacyl-trna synthetase/trna (o-trna/aars pairs from distinct organisms), which functions independent of endogenous aarss and trnas in the host and is used to direct the incorporation of ncaas to specific positions such as the amber stop codon (uag). after incorporation of an ncaa with a reactive group, bioorthogonal click reactions can be performed to conjugate a molecule of interest.
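on the template side, amber suppression starts with placing an in-frame amber codon (uag in mrna, tag in the dna template) at the position where the ncaa should be incorporated. the sketch below shows that single step under stated assumptions: the function names and the toy open reading frame are hypothetical, residue positions are counted from 1 at the start codon, and nothing here models the orthogonal trna/synthetase pair itself.

```python
AMBER = "TAG"  # DNA-level amber codon (UAG in the mRNA)


def split_codons(orf: str) -> list:
    """Split an in-frame ORF into codons."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return [orf[i:i + 3] for i in range(0, len(orf), 3)]


def introduce_amber(orf: str, residue: int) -> str:
    """Replace the codon at a 1-based residue position with TAG.

    The start codon (position 1) and the terminal stop codon are protected.
    """
    codons = split_codons(orf)
    if not 2 <= residue <= len(codons) - 1:
        raise ValueError("position must lie between start and stop codon")
    codons[residue - 1] = AMBER
    return "".join(codons)


def amber_positions(orf: str) -> list:
    """Return 1-based positions of all in-frame TAG codons."""
    return [i + 1 for i, c in enumerate(split_codons(orf)) if c == AMBER]


if __name__ == "__main__":
    toy_orf = "ATGGCTAAAGGTTAA"  # hypothetical ORF: Met-Ala-Lys-Gly-stop
    mutant = introduce_amber(toy_orf, 3)
    print(mutant, amber_positions(mutant))
```

in this toy case the regular stop codon is taa, so the only in-frame tag is the engineered one. a template whose natural stop codon is already tag would be read through by the suppressor trna as well, which is one reason to check `amber_positions` before synthesis.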
the most general bioorthogonal click reactions for conjugating molecular probes or polymers are the copper-catalyzed azide-alkyne cycloaddition (cuaac), staudinger-ligation, photo click cycloaddition, strain-promoted azide-alkyne cycloaddition (spaac) and inverse electron-demand diels-alder cycloadditions (iedda + spiedac). using e. coli-based cf systems, cui et al. showed the incorporation of two fluorescent labels, bodipy fluorophore and tamra-dibo, by using a precharged trna + orthogonal system for fret measurements [153]. using sf21-based cf systems, quast et al. demonstrated the incorporation of p-azido-l-phenylalanine at defined amber positions in parallel in the two subunits of the human egfr protein dimer. later, the azido group of the incorporated azfs was coupled by photoaffinity cross-linking using a bis-combo linker to create a stable synthetic dimer of egfr [14]. the dimerized protein shows autophosphorylation in the presence of tyrosine kinase. in general, release factor 1 (rf1) competes with orthogonal ncaa-trna for the amber codon, which results in truncated products along with successfully suppressed products. therefore, cf lysates derived from genetically modified e. coli lacking rf1 can be used to enhance the incorporation efficiency of ncaas. using the orthogonal system and e. coli-based cfps, human mek1 kinase with ptms was synthesized up to milligram quantities by site-specific, co-translational incorporation of phosphoserine at specific positions [154]. various polyethylene glycol (peg) moieties have been widely used to decorate therapeutic proteins. the peg moiety usually offers high stability and extends the half-life of proteins while in circulation inside the body. the food and drug administration (fda) has recognized peg moieties as safe due to their structural flexibility, hydrophilicity, and minimal toxicity, and several pegylated drugs have been approved by the fda.
using sf21-based cf systems, a site-specific pegylated human epo was produced and characterized by autoradiography [45]. apart from the amber suppression strategy, there are other strategies like frameshift suppression, sense codon reassignment, and unnatural base pairing. a detailed review of prominent methods for the incorporation of ncaas into proteins using cfps has been recently published [23]. a wide range of commercial cf systems is available on the market based on lysates derived from diverse sources. in addition, a few companies provide services for cf synthesis of proteins. some of the products derived from the cf systems based on e. coli lysates are already in clinical trials, such as adcs targeting cd74 and folate receptor alpha highly expressed in myeloma and cancer cells (sutro biopharma, inc, usa). table 5 lists commercial systems currently available for the cf production of proteins. evolving cf systems from a laboratory level to a robust production platform is necessary to fulfill their potential. prior to full realization of cf systems as emerging tools for drug discovery and evaluation, several factors need to be addressed, like synthesis of high-quality functional protein with proper folding and ptm, cost of production, scalability, and safety issues. a more detailed understanding of the components in the cf lysates will substantially improve the quality and stability of the extract preparation. the quantity of the protein depends on the translation efficiency of the cf system. the most important factors that influence the protein yields are quality of the cell lysate, reaction conditions, and template optimization as addressed in section 3.2. to increase the translation efficiency, further efforts are required to increase the quality of lysate production. this can be achieved by using genetic engineering tools to remove the factors responsible for nucleic acid degradation, ribosome inactivation, and protein degradation. brodiazhenko et al.
showed that genomic disruption of genes encoding ribosome-inactivating factors (hpf in b. subtilis and stm1 in s. cerevisiae) has improved the activities of bacterial and yeast translation systems [54] . in this context, advanced engineering tools like crispr cas could help to improve the translation efficiency of the cf systems [155] . activation and enrichment of translation-relevant factors could also increase translation efficiency [63] . when it comes to eukaryotic cfps platforms, translocation through microsomes currently remains a black box. optimizing the efficiency of coupling translation and translocation needs to be addressed. the most important issue with cf systems, especially when working with cecf systems, is to maintain the balance between the amount of protein synthesized and the stability and quality of the protein. although cecf has been capable of producing 0.6-1 mg protein per ml, especially with the mammalian expression systems, only a small fraction of the produced protein was subject to detailed functional analysis [15, 155] . this is one of the reasons why the functional assays are limited to binding assays (gpcr, tlr, antibody), plbe (ion channels), and colorimetric assays (enzymes). by optimizing the redox conditions, the problem of ab translocation into the lumen of microsomes is addressed already [75, 155] . however, when it comes to the synthesis of complex transmembrane proteins in mammalian systems, the insertion efficiency might be already saturated at the low synthesis rates due to restrictions on the level of the translocon's functionality. a more detailed analysis of lipid composition and proteins constituting the microsomes present in the insect, cho, and human-derived lysates will help to improve the quality of synthesized membrane proteins. one could use alternative supplements like nanodiscs or liposomes reconstituted from microsomal membranes to support mp integration [156] . 
intense efforts on designing novel and improved mammalian cf systems should be maintained as the majority of the drug targets are related to complex eukaryotic proteins. optimizing cf reactions in order to decrease protein aggregation during the purification processes and increasing the quality of the protein purification, especially when using the cecf method, is strongly required. another point to address in the field of cfps is to decrease the costs of production, especially in the preparation of cf lysates and the individual reaction components. substantial costs arise from the usage of phosphorylated energy systems, cofactors, nucleotides, amino acids, and dna. alternative energy regeneration systems are available in the place of phosphorylated substrates (e.g., glucose, maltodextrin, etc.) for sustainable atp regeneration throughout the synthesis reaction [157] [158] [159] . use of nucleoside monophosphates instead of nucleoside triphosphates as the nucleotide source in the cf systems could be another cost-effective parameter [159] . avoiding the use of exogenous trnas and cyclic amps and reducing the concentration of amino acids and nucleotides are some of the cost-effective parameters one could optimize during protein synthesis. additionally, new high-cell-density cultivation strategies and improvement in the quality of cell lines by genetic engineering could help to produce cost-effective high-quality cf systems. costs can also be decreased by engineering and optimization of eukaryotic lysates to extend the lifetime of these systems, thereby increasing the yield of the produced protein. there has been considerable progress in point-of-care production devices for on-demand synthesis of small quantities of therapeutic proteins using cho lysates and e. coli lysates through on-site good manufacturing practice (gmp) [30] . 
this type of miniaturized device could be useful for quick testing of proteins and thus help in treating common and rare diseases, and cfps could help solve the challenges associated with in vivo expression. due to the open nature of the cf systems, proteins can be modified with chemically synthesized glycans by bioconjugate chemistries. this will help to increase the quality and therapeutic efficiency of the synthesized proteins. there is an exponential increase in the number of publications from the last 5 years using cf lysates for producing a wide range of proteins [160] . due to the increased awareness of the biosynthetic potential of the cf systems, protocols becoming simpler, improvement in the lysate quality, and its applicability in the preparation of a diverse range of proteins, there will be unexpected outcomes in the field of protein production towards future drug development. funding: this research is supported by the european regional development fund (efre) and the german ministry of education and research (bmbf, no. 031b0078a). open access: this article is licensed under a creative commons attribution-noncommercial 4.0 international license, which permits any non-commercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the creative commons licence, and indicate if changes were made. the images or other third party material in this article are included in the article's creative commons licence, unless indicated otherwise in a credit line to the material. if material is not included in the article's creative commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. to view a copy of this licence, visit http://creativecommons.org/licenses/by-nc/4.0/. 
ion channels in drug discovery and safety pharmacology slc transporters as therapeutic targets: emerging opportunities unexplored therapeutic opportunities in the human genome properties of protein drug target classes antibodies and venom peptides: new modalities for ion channels prospects for pharmacological targeting of pseudokinases advances in development of antimicrobial peptidomimetics as potential drugs. molecules new tools for recombinant protein production in escherichia coli: a 5-year update membrane protein synthesis in cell-free systems: from bio-mimetic systems to bio-membranes cell-free protein synthesis for producing difficult-to-express proteins a user's guide to cell-free protein synthesis establishing a high-yielding cell-free protein synthesis platform derived from vibrio natriegens cell-free protein synthesis: pros and cons of prokaryotic and eukaryotic systems high-yield cell-free synthesis of human egfr by iresmediated protein translation in a continuous exchange cellfree reaction format cellfree synthesis of functional and endotoxin-free antibody fab fragments by translocation into microsomes cell-free synthesis and characterization of a novel cytotoxic pierisin-like protein from the cabbage butterfly pieris rapae improving the recombinant human erythropoietin glycosylation using microsome supplementation in cho cell-free system cell-free synthesis of human toll-like receptor 9 (tlr9): optimization of synthesis conditions and functional analysis cell-free systems based on cho cell lysates: optimization strategies, synthesis of "difficultto-express" proteins and future perspectives systematic optimization of cell-free synthesized human endothelin b receptor folding cell-free protein synthesis: applications come of age cell-free synthetic biology: engineering in an open world cotranslational incorporation of non-standard amino acids using cell-free protein synthesis cell-free protein synthesis from genomically recoded bacteria enables multisite 
incorporation of noncanonical amino acids microscale to manufacturing scale-up of cell-free cytokine production-a new approach for shortening protein production development timelines on-chip automation of cell-free protein synthesis: new opportunities due to a novel reaction mode highthroughput transfection of interfering rna into a 3d cell-culture chip microarray platform affords improved product analysis in mammalian cell growth studies automated production of functional membrane proteins using eukaryotic cell-free translation systems point-of-care production of therapeutic proteins of good-manufacturing-practice quality on-demand biomolecular manufacturing development of a bacillus subtilis cell-free transcription-translation system for prototyping regulatory elements development of a pseudomonas putida cell-free protein synthesis platform for rapid screening of gene regulatory elements establishing a high yielding streptomyces-based cell-free protein synthesis system tobacco by-2 cellfree lysate: an alternative and highly-productive plant-based in vitro translation system leishmania cell-free protein expression system the cell free protein synthesis system from the model filamentous fungus neurospora crassa a combined cell-free transcription-translation system from saccharomyces cerevisiae for rapid and robust protein synthe rapid recombinant protein expression in cell-free extracts from human blood the pure system for the cell-free synthesis of membrane proteins single-pot glycoprotein biosynthesis using a cell-free transcription-translation system enriched with glycosylation machinery the application of cell-free protein synthesis in genetic code expansion for post-translational modifications wheat germ systems for cell-free protein expression cell-free expression and functional reconstitution of homo-oligomeric alpha7 nicotinic acetylcholine receptors into planar lipid bilayers cell-free protein synthesis as a novel tool for directed glycoengineering of 
active erythropoietin high-yield production of "difficult-to-express" proteins in a continuous exchange cell-free system based on performance benchmarking of four cell-free protein expression systems production and stabilization of the trimeric influenza hemagglutinin stem domain for potentially broadly protective influenza vaccines cell-free expression of a functional pore-only sodium channel high-level cell-free production of the malarial lactate transporter pffnt as a basis for crystallization trials and directional transport studies cell-free production of pore forming toxins: functional analysis of thermostable direct hemolysin from vibrio parahaemolyticus functional characterization of cell-free expressed kv1.3 channel using a voltage-sensitive fluorescent dye crystallization of the membrane protein hvdac1 produced in cell-free system elimination of ribosome inactivating factors improves the efficiency of bacillus subtilis and saccharomyces cerevisiae cell-free translation systems cell-free protein synthesis from fast-growing vibrio natriegens membrane assembly of the functional kcsa potassium channel in a vesicle-based eukaryotic cellfree translation system cell-free synthesis of functional human epidermal growth factor receptor: investigation of ligand-independent dimerization in sf21 microsomal membranes using non-canonical amino acids accelerating the production of druggable targets: eukaryotic cell-free systems come into focus cell-free production of a therapeutic protein: expression, purification, and characterization of recombinant streptokinase using a cho lysate functional g-protein-coupled receptor (gpcr) synthesis: the pharmacological analysis of human histamine h1 receptor (hrh1) synthesized by a wheat germ cell-free protein synthesis system combined with asolectin glycerosomes accelerated pharmaceutical protein development with integrated cell free expression, purification, and bioconjugation ires-mediated translation of membrane proteins and 
glycoproteins in eukaryotic cell-free systems an efficient mammalian cell-free translation system supplemented with translation factors a technique to increase protein yield in a rabbit reticulocyte lysate translation system host-regulated hepatitis b virus capsid assembly in a mammalian cell-free system species-independent translational leaders facilitate cell-free expression cell-free expression, purification and immunoreactivity assessment of recombinant fasciola hepatica saposin-like protein-2 exploiting leishmania tarentolae cell-free extracts for the synthesis of human solute carriers an optimized yeast cell-free system: sufficient for translation of human papillomavirus 58 l1 mrna and assembly of virus-like particles a cell-free expression and purification process for rapid production of protein biologics synthesis of membrane proteins in eukaryotic cell-free systems codon usage influences the local rate of translation elongation to regulate co-translational protein folding nonoptimal codon usage influences protein structure in intrinsically disordered regions cell-free eukaryotic systems for the production, engineering, and modification of scfv antibody fragments cell-free synthesis of functional antibodies using a coupled in vitro transcription-translation system based on method for cell-free protein synthesis involved with ph control with amino acid decarboxylase optimizing cell-free protein synthesis for increased yield and activity of colicins tetrameric assembly of k(+) channels requires er-located chaperone proteins energy systems for atp regeneration in cell-free protein synthesis reactions recent advances in development of cell-free protein synthesis systems for fast and efficient production of recombinant proteins effects of atp regeneration systems on the yields and solubilities of cell-free synthesized proteins combining in vitro folding with cell free protein synthesis for membrane protein expression membrane protein production in escherichia 
coli cell-free lysates reconstitution and functional characterization of ion channels from nanodiscs in lipid bilayers challenges in the development of functional assays of membrane proteins a comprehensive map of molecular drug targets a cell-free translocation system using extracts of cultured insect cells to yield functional membrane proteins in vitro synthesis of a major facilitator transporter for specific active transport across droplet interface bilayers functional reconstitution of cell-free synthesized purified kv channels the sensorless pore module of voltage-gated k + channel family 7 embodies the target site for the anticonvulsant retigabine cell free expression and functional reconstitution of eukaryotic drug transporters coupled cell-free synthesis and lipid vesicle insertion of a functional oligomeric channel mscl mscl does not need the insertase yidc for insertion in vitro enhanced functional expression of aquaporin z via fusion of in situ cleavable leader peptides in escherichia coli cell-free system green fluorescent protein changes the conductance of connexin 43 (cx43) hemichannels reconstituted in planar lipid bilayers a small viral potassium ion channel with an inherent inward rectification incorporation of adenine nucleotide transporter, ant1p, into proteoliposomes facilitates atp translocation and activation of encapsulated luciferase synthesis and site-directed fluorescence labeling of azido proteins using eukaryotic cell-free orthogonal translation systems modulation of g-protein coupled receptor sample quality by modified cell-free expression protocols: a case study of the human endothelin a receptor production of g protein-coupled receptors in an insect-based cell-free system cellfree synthesis of a functional g protein-coupled receptor complexed with nanometer scale bilayer discs qualifying a eukaryotic cell-free system for fluorescence based gpcr analyses principles: extending the utility of [35s]gtp gamma s binding assays optical 
probes based on g protein-coupled receptors - added work or added value? the role of beta-arrestins in the termination and transduction of g-protein-coupled receptor signals a flash-based fret approach to determine g protein-coupled receptor activation in living cells intramolecular and intermolecular fret sensors for gpcrs-monitoring conformational changes and beyond a combined cell-free protein synthesis and fluorescence-based approach to investigate gpcr binding properties in vitro selection and evolution of functional proteins by using ribosome display antibody-ribosome-mrna (arm) complexes as efficient selection particles for in vitro display and evolution of antibody combining sites cis display: in vitro selection of peptides from libraries of protein-dna complexes antibodies to watch in 2019 in vitro selection of fab fragments by mrna display and gene-linking emulsion pcr in vitro fab display: a cell-free system for igg discovery engineering toward a bacterial "endoplasmic reticulum" for the rapid expression of immunoglobulin proteins aglycosylated antibodies and antibody fragments produced in a scalable in vitro transcription-translation system production of functional antibody fragments in a vesicle-based eukaryotic cell-free translation system efficient synthesis of a disulfide-containing protein through a batch cell-free system from wheat germ development of a cho-based cell-free platform for synthesis of active monoclonal antibodies sealable femtoliter chamber arrays for cell-free biology high-throughput screening of biomolecules using cell-free gene expression systems synthesis of site-specific antibody-drug conjugates using unnatural amino acids location matters: site of conjugation modulates stability and pharmacokinetics of antibody drug conjugates production of site-specific antibody-drug conjugates using optimized non-natural amino acids in a cell-free expression system dual targeting strategies with bispecific antibodies production of bispecific 
antibodies in "knobs-into-holes" using a cell-free expression system cell-free production of a functional oligomeric form of a chlamydia major outer-membrane protein (momp) for vaccine development protein carriers of conjugate vaccines: characteristics, development, and clinical trials efficacy of a potential trivalent vaccine based on hc fragments of botulinum toxins a, b, and e produced in a cellfree expression system hela based cell free expression systems for expression of plasmodium rhoptry proteins ralp1 is a rhoptry neck erythrocytebinding protein of plasmodium falciparum merozoites and a potential blood-stage vaccine candidate antigen a vaccine directed to b cells and produced by cell-free protein synthesis generates potent antilymphoma immunity cellfree production of scfv fusion proteins: an efficient approach for personalized lymphoma vaccines pfmsa180 is a novel plasmodium falciparum vaccine antigen that interacts with human erythrocyte integrin associated protein (cd47) pv1, a novel plasmodium falciparum merozoite dense granule protein, interacts with exported protein in infected erythrocytes anti-msp11 igg inhibits plasmodium falciparum merozoite invasion into erythrocytes in vitro antibodies against a plasmodium falciparum ron12 inhibit merozoite invasion into erythrocytes identification of plasmodium falciparum reticulocyte binding protein homologue 5-interacting protein, pfripr, as a highly conserved blood-stage malaria vaccine candidate. 
vaccine plasmodium falciparum exported protein 1 is localized to dense granules in merozoites wheat germ cell-free system-based production of malaria proteins for discovery of novel vaccine candidates immunological characterization of plasmodium vivax pv32, a novel predicted gpi-anchored merozoite surface protein assessing sequence plasticity of a virus-like nanoparticle by evolution toward a versatile scaffold for vaccines and drug delivery malaria derived glycosylphosphatidylinositol anchor enhances anti-pfs25 functional antibodies that block malaria transmission immunogenicity of glycosylphosphatidylinositolanchored micronemal antigen in natural plasmodium vivax exposure wheat germ cell-free system-based production of hemagglutinin-neuraminidase glycoprotein of human parainfluenza virus type 3 for generation and characterization of monoclonal antibody engineered membrane protein antigens successfully induce antibodies against extracellular regions of claudin-5 toy building set method for rapid in vitro synthesis of bioconjugate vaccines via recombinant production of n-glycosylated proteins in prokaryotic cell lysates immunodrugs: therapeutic vlpbased vaccines for chronic diseases immunogenicity of takedas bivalent viruslike particle (vlp) norovirus vaccine (nov) candidate in children from 6 months up to 4 years of age cell-free protein synthesis of norovirus virus-like particles rapid production and characterization of antimicrobial colicins using escherichia coli-based cell-free protein synthesis application of antimicrobial peptides of the innate immune system in combination with conventional antibiotics-a novel way to combat antibiotic resistance? 
front cell infect microbiol combining sense and nonsense codon reassignment for site-selective protein modification with unnatural amino acids robust production of recombinant phosphoproteins using cell-free protein synthesis biosynthesis of membrane dependent proteins in insect cell lysates: identification of limiting parameters for folding and processing cell-free synthesis of membrane proteins: tailored cell models out of microsomes methods for activating natural energy metabolism for improving yeast cellfree protein synthesis: 15/639 cell-free gene expression: an expanded repertoire of applications an economical method for cell-free protein synthesis using glucose and nucleoside monophosphates cell-free biosensors for biomedical applications key: cord-000012-p56v8wi1 authors: bigot, yves; samain, sylvie; augé-gouillou, corinne; federici, brian a title: molecular evidence for the evolution of ichnoviruses from ascoviruses by symbiogenesis date: 2008-09-18 journal: bmc evol biol doi: 10.1186/1471-2148-8-253 sha: doc_id: 12 cord_uid: p56v8wi1 background: female endoparasitic ichneumonid wasps inject virus-like particles into their caterpillar hosts to suppress immunity. these particles are classified as ichnovirus virions and resemble ascovirus virions, which are also transmitted by parasitic wasps and attack caterpillars. ascoviruses replicate dna and produce virions. polydnavirus dna consists of wasp dna replicated by the wasp from its genome, which also directs particle synthesis. structural similarities between ascovirus and ichnovirus particles and the biology of their transmission suggest that ichnoviruses evolved from ascoviruses, although molecular evidence for this hypothesis is lacking. results: here we show that a family of unique pox-d5 ntpase proteins in the glypta fumiferanae ichnovirus are related to three diadromus pulchellus ascovirus proteins encoded by orfs 90, 91 and 93. 
a new alignment technique also shows that two proteins from a related ichnovirus are orthologs of other ascovirus virion proteins. conclusion: our results provide molecular evidence supporting the origin of ichnoviruses from ascoviruses by lateral transfer of ascoviral genes into ichneumonid wasp genomes, perhaps the first example of symbiogenesis between large dna viruses and eukaryotic organisms. we also discuss the limits of this evidence through complementary studies, which revealed that passive lateral transfer of viral genes among polydnaviral, bacterial, and wasp genomes may have occurred repeatedly through an intimate coupling of both recombination and replication of viral genomes during evolution. the impact of passive lateral transfers on evolutionary relationships between polydnaviruses and viruses with large double-stranded genomes is considered in the context of the theory of symbiogenesis. approximately two-thirds of these wasps are endoparasites, meaning that the larval stages develop within the body cavity of their hosts, typically other insects. among the most successful of these endoparasitic wasps are those that use lepidopteran larvae as hosts. owing to the economic importance of these insects and the utility of their wasp parasites as biological control agents, the ability of these parasites to develop within lepidopteran hosts without triggering an intense immune response has been the subject of numerous studies over the past forty years. early studies of the mediterranean flour moth, ephestia kuehniella, parasitized by the ichneumonid, venturia canescens, showed that eggs of this species are coated with particles that resemble virions [2] [3] [4] and contain surface proteins that mimic host proteins, thus keeping the eggs and larvae from being recognized as foreign material by their host. these particles lack dna, and thus are not considered virions [5] . 
with respect to both species number and mechanisms that lead to successful parasitism, endoparasitic wasps are known to inject secretions at oviposition, but only a few lineages use viruses or virus-like particles (vlps) to evade or to suppress host defences. in the family ichneumonidae, for example, four types of host defence suppression mediated by the injection of fluids or suspensions are known that lead to successful parasitism. 1) fluid injected with eggs bypasses host defences without the aid of viruses or vlps [6] . 2) wasps inject a virus that replicates in both the wasp and lepidopteran host. one example is the wasp diadromus pulchellus, which injects an ascovirus, dpav4 [7] , into host pupae to circumvent the host defence response. 3) the wasp injects vlps capable of molecular mimicry and/or direct defence suppression. 4) the wasp injects polydnavirus particles that contain genes coding for proteins that interfere with host defence responses. the last mechanism is by far the best-studied type of direct immune suppression by ichneumonid wasps, and occurs in many species belonging to the genera campoletis, hyposoter and tranosema (ichneumonidae, campopleginae), and glypta (ichneumonidae, banchinae) [8] . in these cases, female wasps inject eggs along with ichnovirus particles into their hosts. similarly, in certain lineages of endoparasitic braconid wasps, other types of immunosuppressive particles containing dna occur in the fluid injected along with eggs [9] (for a review, see [10] ). once in the host, ichneumonid and braconid particles enter host nuclei and their dna is transcribed, producing proteins that selectively suppress various steps in the host defence response. 
as a result of this unusual biology, these particles were described as symbiotic viruses belonging to a new viral family, polydnaviridae [10] [11] [12] . since the 1970s, it was assumed that the dna in the polydnavirus particles, as with all other viruses, encoded typical enzymes and proteins for viral replication and virion assembly and structure. however, several recent genomic studies have shown that only a small number of the genes vectored into lepidopteran hosts, less than 2%, have homologs in other viruses. most viral dna is noncoding, except that which codes for wasp proteins involved in suppression of immune pathways, such as phenoloxidase activation and the toll pathways [8, 13, 14] . even before these genomic studies, it was suggested that these particles were more similar to organelles than viruses [15] . the similarities between particle structure and virions of known types of complex dna insect viruses are striking, and suggest these immunosuppressive particles originated by symbiogenesis between viruses and endoparasitic wasps, the same evolutionary process by which mitochondria and plastids originated from symbiotic bacteria [16] . for example, most braconid wasps produce enveloped bacilliform particles classified as bracoviruses, and these resemble baculovirus and nudivirus virions [10, 15] . similarly, ichneumonid wasps produce enveloped spindle-shaped particles classified as ichnoviruses that resemble virions of ascoviruses, viruses lethal to lepidopterans, which, interestingly, are vectored by endoparasitic wasps [15] . it must also be noted that ichnoviruses resemble other true virus particles that are structurally very similar to virions of ascoviruses, but which remain unclassified because of the lack of information about their genomes [17] [18] [19] [20] [21] . 
however, ascoviruses and ichnoviruses display very different genome properties; similar genomic differences occur between bracoviruses and baculoviruses or nudiviruses, suggesting that convergent evolution led to the origin of the different polydnavirus types from at least two different types of viruses. in ascoviruses, the genome consists of a single circular dna molecule ranging from 119 to 180 kbp in size [7] . phylogenetic analyses of several viral genes have revealed that ascoviruses are closely related to iridoviruses [22] , and likely evolved from them. in contrast, the genome of ichnoviruses is composed of multiple circular dna molecules (25 to 105) representing a total size of 250 to 300 kbp, all of which are replicated from the wasp chromosomes. the ichnovirus proviral genome is specifically excised and amplified in several segments in the female calyx cells, the only wasp tissue in which ichnovirus virogenesis occurs. after assembly, these particles are secreted into the female genital tract. once injected into the host, the ichnovirus genome does not replicate, and does not lead to the production of a new virus generation. the third characteristic of ichnoviruses is that most of the genes borne by the particles are not related to viral genes. among the 7 annotated ichnovirus gene families, there are four (rep, prrp, n, and trv) for which no homology with known eukaryotic (or prokaryotic) proteins has been detected and for which no function has been proposed. among the remaining three (cys, ank and inx), cys-motif proteins have no clear homologs among eukaryotic (or prokaryotic) proteins, although the "cysteine knot" that they form is a folding domain found in many proteins, but not one that is necessarily related to eukaryotic host immune systems [10, 14] . however, some protein domains and their putative functions suggest that they might be related to regulatory components of eukaryotic host defence systems that are not sufficiently elucidated. 
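as a rough illustration of the contrast in genome organization described above, the quoted ranges for ichnoviruses (25 to 105 circular segments, 250 to 300 kbp in total) can be turned into bounds on the mean segment size and set against the single 119 to 180 kbp ascovirus molecule. this is only a back-of-the-envelope sketch: the function name is hypothetical, and pairing the extremes of the two ranges is an assumption made here, not something reported in the text.

```python
from typing import Tuple


def mean_segment_size_bounds(
    total_kbp: Tuple[float, float],
    n_segments: Tuple[int, int],
) -> Tuple[float, float]:
    """Return (lowest, highest) possible mean segment size in kbp.

    Assumption: any total size in the range may be combined with any
    segment count in the range, so the extremes are paired directly.
    """
    low_total, high_total = total_kbp
    few_segments, many_segments = n_segments
    smallest_mean = low_total / many_segments  # small genome, many segments
    largest_mean = high_total / few_segments   # large genome, few segments
    return smallest_mean, largest_mean


# Ranges quoted in the text for ichnovirus genomes.
low, high = mean_segment_size_bounds((250.0, 300.0), (25, 105))
print(f"mean ichnovirus segment: {low:.2f} to {high:.2f} kbp")
print("ascovirus genome: one molecule of 119 to 180 kbp")
```

under these assumptions, even the largest possible mean segment is an order of magnitude smaller than an ascovirus genome, which is consistent with the point that the two genome types are organized very differently.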
although the resemblance of the polydnavirus virions to those of conventional insect viruses suggests that the former evolved from the latter, to date no molecular evidence supports this hypothesis. in the case of ascoviruses and ichnoviruses, well-conserved genes found among the three ascoviruses sequenced so far (sfav1a [23] , tnav2c [24] , and hvav3e [25] ) are not found in ichnovirus genomes. as noted above, the principal reason for this is that the genomes of the latter viruses appear to contain mainly wasp genes, not viral genes. this highlights the need for new and alternative types of sequence data obtained from pertinent biological systems. in this regard, dpav4 has features that could provide important insights. indeed, it is the only ascovirus known to replicate in both its wasp and caterpillar hosts. it is transmitted vertically from wasp to caterpillars to suppress the defence response of the latter host, thereby enabling parasite development [26, 27] . moreover, in males and females of d. pulchellus, the dpav4 genome resides in the nuclei of all host cells, providing a possible example of what may have been an intermediate stage in the symbiogenesis that led to the evolutionary origin of ichnoviruses. we recently sequenced the dpav4 genome, and a combination of our analysis of this genome and recent data from new types of ichnoviruses, as well as new software programs that elucidate protein relationships based on structural analysis, have enabled us to detect phylogenetic relationships between proteins encoded by open reading frames of dpav4 and the glypta fumiferanae (gfiv) and campoletis sonorensis (csiv) ichnoviruses. in support of the symbiogenesis hypothesis for the origin of ichnoviruses, our data and analyses suggest two independent symbiogenic events, in agreement with what was previously proposed [28] . the first led to the ichnoviruses in the banchinae lineage. this hypothesis is based on the occurrence of a gene cluster present in gfiv and dpav4. 
the second symbiogenic event led to ichnoviruses in the campopleginae wasp lineage. this hypothesis is based on relationships of the major capsid proteins among csiv, ascoviruses and iridoviruses. extending our investigations to proteins encoded by open reading frames of certain ascoviruses and bracoviruses, hosts and bacteria, in the light of recent analyses about the involvement of the replication machinery of virus groups related to ascoviruses in lateral gene transfer [29] , we discuss the robustness and the limits of the molecular evidence supporting an ascovirus origin for ichnovirus lineages. the dpav4 genome sequenced by genoscope (france) is 119,334-bp in length. its organization, gene content and evolutionary characteristics will be detailed in a separate publication (manuscript in preparation; additional file 1). however, blast results obtained with several orfs in the dpav4 genome provide evidence that certain ichnovirus orfs have their closest relatives in an ascovirus genome. specifically, we identified a 13-kbp region that contains a cluster of three genes ( fig. 1 , orf90, 91 and 93; additional files 1 and 2) that have close homologs in a gfiv gene family composed of seven members [28] . all contain a domain similar to a conserved domain found in the pox-d5 family of ntpases. to date, this pox-d5 domain has been identified as a ntp binding domain of about 250 amino acid residues found only in viral proteins encoded by poxvirus, iridovirus, ascovirus and mimivirus genomes. these genes seem to be specific to gfiv, as they are absent in the three sequenced genomes of other ichnoviruses, namely csiv, tranosema rostrales ichnovirus (triv), and hyposoter fugitivus ichnovirus (hfiv). more specifically, in dpav4, orf90 encodes a protein of 925 amino acid residues that is 40% similar from position 140 to 925 to a protein of 972 amino acid residues encoded by the orf1 contained in the segment c20 in the gfiv genome (fig. 2) . 
these two proteins can therefore be considered putative orthologs. the 480 c-terminal residues of this dpav4 protein are also 42% similar to the cterminal domain of the protein homologs encoded by the orf1 of the d1 and d4 gfiv segments, 36% similar to the n-terminal and the c-terminal domains of the protein encoded by the orfs 184r and 128l of the iridovirus civ and lcdv, and 30% similar with those encoded by orfs 119, 99 and 78 in the ascovirus genomes of hvav3e, sfav1a and tnav2c, respectively. overall, this indicates that this dpav4 protein is more closely related to that of gfiv than to those found in other ascovirus and iridovirus genomes currently available in databases. orf091 encodes a protein of 161 amino acid residues similar only with the c-terminal domain of three proteins encoded by the orfs 1, 1 and 3, contained, respectively, in gfiv segments d1, d4 and d3. in contrast, orf93 is closer to iridovirus and ascovirus genes than to gfiv genes. this protein of 849 amino acid residues is 43% similar over all its length to civ orf184r orthologs in all iridoviral and ascoviral genomes and is only 36% similar over 350 amino acid residues to the c-terminal domain of the gfiv protein homologs encoded by the orf1, 2, 1, 1, 1 and 1 in, respectively, the c20, c21, d1, d2, d3 and d4 segments of this virus. analysis of the genes surrounding the dpav4 orf-90-91-93 cluster confirms that this virus has an ascovirus origin since this region contains orfs that are close homologs of genes in iridovirus and ascovirus genomes. upstream from the orf-90-91-93 cluster, an orf encoding the dna-dependent rna polymerase 1 subunit c is present, which is an ortholog of the iridoviral civ orf176r and the ascoviral sfav1a orf008. downstream from this cluster, there are two genes, absent in known ascoviral genomes, but similar to the iridoviral civ orf115l and civ orf132l. these two genes encode, respectively, a chromosomal replication initiation protein and zinc finger protein. 
in between them, a gene encoding a small protein is present that is similar to that encoded by the orf069l of the iridovirus civ, and which corresponds to the ali-like protein also found in entomopoxviruses [30] . since the three dpav4 genes have relatives in all ascovirus and iridovirus genomes sequenced so far, their presence in the dpav4 genome cannot result from a lateral transfer that occurred from an ichnovirus genome related gfiv to dpav4. thus, as these dpav4 genes are the closest relatives of the pox-d5 gene family present in gfiv identified so far, they could be considered a landmark of the symbiogenic ascovirus origin of the ichnovirus lineage to which this polydnavirus belongs. an alternative explanation is that the presence of dpav4-like genes in the genome of gfiv resulted from a lateral transfer from viral genomes closely related to those of gfiv and dpav4. indeed, this might have happened when a glypta wasp was infected by an ancestral virus related to dpav4. nevertheless, the symbiogenic origin of gfiv from ascoviruses is also supported by morphological features of its virions [28] , which, aside from similarities in shape, also show reticulations on their surface in negatively stained preparations, a characteristic of the virions of all ascovirus species examined to date [7] . because ascovirus virions and ichnovirus particles display structural similarities, we developed an approach to search for homologs of virion structural proteins in ichnoviruses. these approaches were initiated in 2000 and recently finalized, but some of the conclusions have been published [14] . to date, only two virion proteins from the campoletis sonorensis ichnovirus (csiv) have been characterized [31, 32] . the first is the p44 (acc n° aad01199), a structural protein that appears to be located as a layer between the out envelope and nucleocapsid, and the second, p12, a capsid protein (acc n° af004367). 
presently, there are more than one hundred ascoviral or iridoviral mcp sequences in databases. blast searches using these sequences failed to detect any similarities between csiv virion proteins and ascoviral or iridoviral mcps, or any other proteins [33]. to evaluate the possibility that homology between ichnovirus and ascovirus virion proteins may simply not be detectable by conventional blastp searches, we used a different method, wapam (weighted automata pattern matching; [34]). the models were designed on the basis of a previous study [22] demonstrating that mcp encoded by ascovirus, iridovirus, phycodnavirus and asfarvirus genomes are related, and all contain 7 conserved domains separated by hinges of very variable size. we investigated these conserved domains further using hydrophobic cluster analysis (hca, [35]). this analysis revealed that most conservation occurred at the level of hydrophobic residues, as expected for structural proteins (additional file 3a and 3b). the size variability of the hinges between conserved domains and the conservation of hydrophobic residues might explain why blast searches using iridoviral and ascoviral mcp sequences have limited ability to detect mcp orthologs in phycodnavirus and asfarvirus genomes.
figure 1 map of the 13-kbp region of the dpav4 genome (embl acc. n° cu469068 and cu467486) that contains the gene cluster with direct homologs in the genome of the glypta fumiferanae ichnovirus.
figure 2 amino acid sequence comparison resulting from a blast search done with the dpav4 orf90 as a query, and the best hit corresponding to the protein encoded by the orf1 of the ichnovirus segment gfv-c20 (subject; genbank acc. n° yp_001029423).
we designed two syntactic models (see materials and methods), which together were able to specifically align all mcp sequences of the four virus families. importantly, wapam aligned the csiv ichnovirus p44 structural protein with both models. complementary structural and hca analyses confirmed the presence of the seven conserved domains in this csiv structural protein (fig. 3a and additional file 3c). in addition to the above analysis, ten syntactic models were developed using proteins conserved in the three sequenced ascovirus species (sfav1a, tnav2c, and hvav3a) and twelve iridoviruses [36].
figure 3 [...] 1 and 4, typed in black), dpav4 (lanes 2 and 5, typed in blue) and sfav1a (lanes 3 and 6, typed in purple). conserved positions among the amino acid sequence of csiv and those of dpav4 and sfav1a are highlighted in grey. secondary structures in the three sfav1a orf061 orthologs were calculated with the network protein sequence analysis at http://npsa-pbil.ibcp.fr/ and the statistical relevance of the secondary structures was evaluated with psipred at http://bioinf.cs.ucl.ac.uk/psipred/. c, e and h in lanes 4 to 6 indicate, for each amino acid, that it is involved in a coiled, β-sheet or α-helix structure, respectively. using default parameters of psipred, upper case letters indicate that the predicted secondary structure is statistically significant in psipred results. significant secondary structures are highlighted in yellow. in (a), the comparisons were limited to three of the seven conserved domains (additional file 3a, b and 3c), namely domains 2, 5 and 7. indeed, classical in silico methods appeared to be inappropriate to predict statistically significant secondary structures in conserved structural proteins rich in β strands, such as iridovirus and ascovirus mcp. in contrast, a complete and coherent domain comparison was obtained by hca profiles (fig. s3b, c).
none of these
, developed from small proteins encoded by the dpav4 orf041, sfav1a orf061, hvav3a orf74, and tnav2c orf118 in the ascovirus genomes, and iridovirus civ orf347l and mimivirus miv orf096r genomes, respectively. importantly, these proteins have orthologs in vertebrate iridoviruses, phycodnaviruses, and asfarvirus. in sfav1a, the peptide encoded by orf061 is one of the virion components. in ascoviruses, iridoviruses, phycodnaviruses, and the asfarvirus, they have been annotated as thioredoxines, proteins that play a role in initiating viral infection [37] [38] [39] . database mining with our model revealed four hits with csiv sequences (acc n°. m80623, s47226, af236017, af362508) each a homolog orf of sfav1a orf061. in fact, these sequences correspond to several variants of a single region contained in the b segment of the csiv genome. to date, these have not been annotated in the final csiv genome, probably because they overlap a recombination site. hca analyses confirmed that the hydrophobic cores were conserved ( fig. 3b and additional file 3d and 3e). the chromosomal locations of genes encoding these two csiv proteins, i.e., p44 and p12, were also consistent with the symbiogenesis hypothesis. in fact, the orf encoding p44 is not found in proviral dna. it is notable that no orfs encoding orthologs of p44 or other structural proteins such as mcps are found in any of the other three ichnovirus genomes sequenced -triv, gfiv, hfiv [8, 14] . therefore, this indicates that the orthologs of ichnovirus mcps and other virion structural proteins are also probably located in the genomes of these wasps, i.e., not in proviral dna. in contrast to this, we found that the gene encoding the csiv ortholog of sfav1a orf061 is located within the proviral dna. whether ortholog proteins are similarly involved in the triv, gfiv and hfiv biology, their genes are not found in proviral dna, since no matches were detected in their viral genomes. 
the phylogenetic analysis performed previously on p44 and the sfav1a orf061 orthologs [15] indicated that they have an ancestor close to that of the ascoviruses and iridoviruses. as in the case of genes encoding pox-d5 family of ntpases in all ascoviruses, iridoviruses, and gfiv, genes encoding virion proteins cannot result from a horizontal transfer from a campoplegine or banchine ichnovirus genome to all ascovirus, iridovirus, phycodnaviruses and asfarvirus genomes. as the ascovirus genes encoding the two virion proteins investigated here are the closest relatives of virion proteins in csiv, they can be considered a landmark reflecting the symbiogenic origin of the two ichnovirus lineages from ascoviruses closely related to dpav4. in fact, the difficulty encountered in elucidating their sequence relationships can be explained by a combination of the marked transition from ascovirus to ichnovirus, and the significant selection constraints that resulted as the latter virus type evolved from the former. analysis of available ascovirus, iridovirus and ichnovirus genomes provides some of the first molecular support for the hypothesis that ichnoviruses evolved from ascoviruses by symbiogenesis. however, examining genes shared only by ascovirus, iridovirus and ichnovirus genomes likely limits the sources of genes that contributed to the evolution and complexity of these viruses, especially of the role of lateral gene transfer. relevant to this is the recent finding that an important part of the mimivirus and phycodnavirus genomes had a bacterial origin [28] . obviously, this did not lead to the conclusion that these viruses had a bacterial origin. the cytoplasmic environment in which these viruses replicate is rich in bacterial dna because their amobae and unicellular algae hosts feed on bacteria that they digest in their cytoplasm. 
thus, it has been proposed [28] that lateral transfers of bacterial dna within these viral genomes were driven by intimate coupling of recombination and viral genome replication. indeed, replication of these viruses is similar to that of bacteriophage t4. this mode of replication has been called recombination-primed replication. it permits integration of dna molecules with sequence homology as short as 12-bp [28, 40] . the replication machinery used by ascoviruses, iridoviruses, mimiviruses, phycodnaviruses, and other nucleocytoplasmic large dna viruses (ncldv) [41, 42] is common to all of them, despite differences in the specifics of replication in each virus family. it can therefore be expected that recombination-primed replication occurred repeatedly during evolution of both these viruses and the genome of their eukaryotic hosts. in an eukaryotic cellular environment in which bacteria, chromosomes, ncldv viruses and non-ncldvs (such as baculoviruses) intimately cohabit temporarily or permanently, recombination-primed replication is able to allow reciprocal passive lateral transfers between viral genomes, host chromosomes, and bacterial dna. under these conditions, lateral transfers are considered passive since they just result from the intimate environment and not from an active mechanism dedicated to genetic exchanges. in ascoviruses and iridoviruses, the occurrence of such lateral transfers is supported by blastp searches that detected the presence of orfs whose closest relatives have their origin within eukaryotic genomes (e.g., for dpav4, in additional data 1, orfs 029, 049, 077, 080, 083, 118), bacterial genomes (e.g., for dpav4, in additional data 1, orfs 056, 057, 059, 112, 115 and119) or viruses belonging to other ncldv and non-ncldv families (e.g., for dpav4, in additional data 1, orfs 007, 037, 062, 068). 
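The 12-bp figure quoted above can be made concrete with a small sketch. The Python below is illustrative only and was not part of the analysis reported here: it lists the 12-mers shared by two DNA sequences, i.e. stretches of exact homology that would in principle be long enough to prime recombination. The function name `shared_kmers` and the toy sequences are hypothetical, and the sketch compares one strand only (reverse complements are ignored).

```python
def shared_kmers(seq_a: str, seq_b: str, k: int = 12) -> dict:
    """Return {kmer: (starts_in_a, starts_in_b)} for every k-mer found in both sequences.

    Positions are 0-based. Only the given strand is compared.
    """
    def index(seq: str) -> dict:
        positions = {}
        seq = seq.lower()
        for i in range(len(seq) - k + 1):
            positions.setdefault(seq[i:i + k], []).append(i)
        return positions

    idx_a, idx_b = index(seq_a), index(seq_b)
    return {kmer: (idx_a[kmer], idx_b[kmer]) for kmer in idx_a.keys() & idx_b.keys()}


# toy sequences (hypothetical): both contain the 12-mer "acgtacgttgca"
viral = "ttttacgtacgttgcaggg"
bacterial = "ccacgtacgttgcat"
hits = shared_kmers(viral, bacterial)
```

A dictionary index keeps the comparison linear in sequence length, which matters when one of the inputs is a whole viral genome; real searches would normally rely on blast or a dedicated k-mer tool rather than this sketch.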
the transmission of ascoviruses is unusual in that they are poorly infectious per os and appear to be transmitted among lepidopteran hosts by parasite wasp vectors at oviposition [7, 43] . the genome of the ascoviruses can be replicated in presence of polydnavirus dna either within the reproductive tissues of female wasps or within the body of the parasitized hosts infected by both polydnavirus and ascovirus. consequently, integrated sequences of ascovirus origin can be expected within wasp and polydnavirus genomes. reciprocally, sequences of polydnavirus origin may have been integrated in ascovirus genomes, whatever the wasp origin, ichneumonid or braconid. one gene family related to a bacterial family of n-acetyl-l-glutamate 5-phosphotransferase (acc. n° of the closest bacterial relatives yp_001354925, cam32558, zp_00944224, zp_02006449), identified only within the sfav1a, hvav3e and tnav2c genomes, supports this conclusion. it has been found in the genome of a bracovirus, cotesia congregata bracovirus (ccbv [13] ; fig. 4 ). since this gene is absent in the genome of microplitis demolitor bv, a related bracovirus [8] , it is difficult to infer the direction of the lateral transfer between the common ancestors of the three ascoviruses and of the wasp c. congregata. however, they unambiguously indicate that there was at least one lateral transfer for this gene between the common ancestor of ascoviruses and the parasitic wasp. since iridoviruses, like ascoviruses and other virus species [44, 45] , are, in some cases, vectored by parasitic wasps, databases were mined using all the available ichnovirus virus proteins as queries. we found no significant relationships between csiv, hfiv and triv genomes and genomes of their putative closest relatives ncldv and non-ncldv relatives. this indicates that passive lateral gene transfers from virus to eukaryotes that are successfully spread and maintained in ichnovirus genomes remain rare events. 
one case of such lateral transfer was described in the ccbv genome. in this genome, aside from the presence of cardinal endogenous eukaryotic retrotransposons and polintons that transposed in the chromosomal dna of the proviral form of ccbv [46] [47] [48], two genes encoding acmnpv p94-related proteins, which have their closest relatives among granuloviruses (xcgv), were found. this suggests that ccbv contained at least two cases of lateral transfers between a non-ncldv and a bracovirus. our results provide another source of evidence that passive lateral gene transfers have occurred regularly during evolution from bacteria to viruses and eukaryotes, and between viruses and eukaryotes [49] [50] [51] [52]. even if the pox-d5 ntpase genes in the gfiv genome, and the mcp and sfav061-like genes in the csiv genome, indicate that they have an ascovirus origin, they provide only limited evidence supporting an ascovirus origin of ichnoviruses. indeed, their sequence conservation and biological characteristics suggest that there were repeated lateral transfers during evolution between ascoviruses and wasp genomes, including the proviral ichnovirus loci. this raises an important issue about the role of lateral transfers during co-evolution of the ncldvs and non-ncldvs, ichnovirus, wasp and parasitized host. indeed, genetic materials of various origins have been exchanged and maintained during co-evolution. this therefore suggests that ichnoviruses might be chimeric entities partly resulting from sev[...]
symbiogenesis was first proposed as an evolutionary mechanism when it became widely recognized that mitochondria and plastids originated from free-living prokaryotes [7]. the genomes of the endosymbiotic cyanobacteria and proteobacteria, respectively, at the origin of chloroplasts and mitochondria have evolved by reduction of several orders of magnitude to the approximate size of plasmids. concurrently, nuclear genomes have been the recipients of plastid genomes.
this relocation of the genes encoding most proteins of the endosymbiotic bacteria to the host nucleus is the ultimate step of this evolutionary process, so-called endosymbiogenesis [7, 53] . recent studies of plants have revealed a constant deluge of dna from organelles to the nucleus since the origin of organelles [54] . this allows the host cell to have the genetic control on its organelles, in a relationship that is closer to enslavement or domestication than to a symbiosis or a mutualism in which the organelles would recover benefits from their contribution to the eukaryotic cell well-being. to date, this deluge of dna is considered to correspond to passive lateral transfers that result from the interactions between the life cycle of the organelle and nuclear replication. numerous cases of symbiogenesis between endocellular bacteria and a wide variety of eukaryotic hosts have been characterized. however, recent work has demonstrated that this evolutionary process was not restricted to bacteria. it also occurred between endocellular eukaryotes such as unicellular algae and fungal endophyte in plants [55, 56] . endosymbiogenesis was also proposed as the evolutionary mechanism that allowed some invertebrate viruses with a large double-stranded dna genome related to the nudiviruses and the ascoviruses [22] , to have led, respectively, to the origin of bracoviruses and ichnoviruses, which are currently recognized as forming two genera within the family polydnaviridae. although presently there is no definitive evidence ruling out the hypothesis that the resemblance between ichnovirus and ascovirus virions is only an evolutionary convergence, the genomic differences between ascovirus and ichnoviruses are in good agreement with the symbiogenetic hypothesis. 
indeed, they match an evolutionary scenario of endosymbiogenesis during which, from a single integration event of symbiotic virus genome, viral genes were lost and/or translocated from the provirus to other chromosomal regions (fig. 5 ). in parallel, host genes of interest for the wasp parasitoid were integrated and diversified by selection and gene duplication in the proviral dna. in this scenario, the more ancient symbiogenesis, the rarer the traces of genes from viral origin in the ichnovirus genome would be. this constitutes a constraint that dramatically limits the possibility to investigate the evolutionary links between ascovirus and ichnovirus. results of our analyses demonstrate that the situation is also complicated by the fact that lateral gene transfers unrelated to the origin of ichnoviruses cause important misleading background noise. moreover, the scenario in figure 5 is close to a previously proposed version [57] , but is not consistent with results presented here, nor with recently accumulated knowledge on dna transfer from organelles into the nucleus. since endocellular environments favour lateral transfers between virus and wasp nucleus, it can be proposed that genes of virus origin that are involved in the ichnovirus biology were passively integrated in one or several loci, step by step over time, alone or through transfers of gene clusters, or even the entire viral genome. since parasitoid wasps are able to vector different viruses [44, 45] , this second scenario opens the exciting possibility that virus genes involved in the ichnovirus biology might correspond to a gene patchwork resulting from transfers from viruses belonging to different ncldv and non-nclvd families. because of the background noise due to lateral gene transfers found in these systems, elucidating the origins of ichnoviruses will be very time-consuming, requiring new accurate experimental approaches to generate more robust evidence. 
sequencing wasp genomes to identify proteins of viral origin that are components of virions and involved in their assembly may well contribute to our understanding of how ichnoviruses and bracoviruses evolved from other insect dna viruses.
searches for similarities were mainly developed using facilities of blast programs at two websites, http://www.ncbi.nlm.nih.gov/blast/blast.cgi and http://genoweb.univ-rennes1.fr/serveur-gpo/outils.php3?id_rubrique=47. for dpav4 genes having their origin within eukaryotic, bacterial or virus genomes belonging to ncldv and non-ncldv families, the closest gene was located using the distance trees supplied with each blast search at the ncbi website. construction of syntactic models: conserved amino acid blocks and positions described previously [15, 22] and with new data sets were verified or determined using meme at http://meme.sdsc.edu/meme/meme.html. in the first step, we used motifs resulting from meme to make mast minings in databases at http://meme.sdsc.edu/meme/mast.html. since meme motifs depend significantly on the data set used to calculate them, this approach did not enable an exhaustive detection of homologs among ascoviruses, iridoviruses, phycodnaviruses, mimiviruses and asfarviruses, and the detection sensitivity was ultimately very similar to that obtained with blast. to reach our detection objectives, we therefore constructed syntactic models that only included the most conserved positions and their variable spacing using wapam at the website http://genoweb.univ-rennes1.fr/serveur-gpo/outils_acces.php3?id_syndic=185&lang=en. these models were refined empirically until they allowed an exhaustive detection in refseq-protein and genbank databases of the homologs among ascoviruses, iridoviruses, phycodnaviruses, mimiviruses and asfarviruses. the procedure was repeated until only exact matches with the syntactic model were detected.
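The idea of a syntactic model, a few strongly conserved positions separated by spacers of variable length, can be sketched with an ordinary regular expression. This is a simplified stand-in and not wapam itself, which relies on weighted automata; the pattern `MODEL`, the function `scan` and the toy sequences below are all hypothetical and do not reproduce the models used in this study.

```python
import re

# hypothetical model: three conserved blocks separated by hinges of variable length.
# [LIVM] accepts any one of the listed residues;
# .{2,40}? and .{5,60}? are hinges of 2-40 and 5-60 arbitrary residues.
MODEL = re.compile(r"G[LIVM]P.{2,40}?W[FY]D.{5,60}?C..C")


def scan(sequences: dict) -> dict:
    """Return {name: (start, end)} for each sequence containing an exact model match."""
    hits = {}
    for name, seq in sequences.items():
        match = MODEL.search(seq.upper())
        if match:
            hits[name] = match.span()
    return hits


# toy protein sequences (hypothetical)
toy = {
    "protein_a": "MKGLPAAAAWFDSSSSSSCAACKK",  # all blocks present, hinges within bounds
    "protein_b": "MKGLPAAAAWFDSSCAACKK",      # second hinge shorter than 5 residues
    "protein_c": "MKAAAAAAAAAAAAAAAAAA",      # no conserved block
}
result = scan(toy)
```

As in the procedure described above, a sequence either matches the model exactly or is not reported, so hits carry no graded score and still need confirmation by an independent method.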
whatever the results obtained with wapam, they required confirmation by other approaches. here, we used psipred result comparison for regions with scores over 7 and hca analyses for regions having scores lower than 7 with psipred. this simplified the statistical treatment of the results obtained with wapam, since all exact matches have a significance or score of 100%. syntactic [...]
figure 5 hypothetical mechanism for the integration and evolution of ascovirus genomes in endoparasitic wasps. schematic representation of the three-step process of symbiogenesis, and dna rearrangements that putatively occurred in the germ line of the wasp ancestors in the banchinae and campopleginae lineages, from the integration of an ascoviral genome to the proviral ichnoviral genome. sequences that originate from the ascovirus are in blue, those of the wasp host and its chromosomes are in pink. genes of ascoviral origin are surrounded by a thin black or white line, depending on their final chromosomal location. two solutions can account for the final chromosomal organisation of the proviral ichnovirus genome, monolocus or multilocus, since this question is not fully understood in either wasp lineage. more complex alternatives to this three-step process might also be proposed and would involve, for example, the complete de novo creation of a mono- or multilocus proviral genome from the recruitment by recombination or transposition of ascoviral and host genes located elsewhere in the wasp chromosomes. this model for the chromosomal organization of proviral dna in polydnaviruses is consistent with data recently published [58].
immune surface of eggs of a parasitic insect the resistance of insect parasitoids to the defense reactions of their hosts an insect glycoprotein: a study of the particles responsible for the resistance of a parasitoid's egg to the defence reactions of its insect host role of virus-like particles in parasitoid-host interaction of insects venom from the endoparasitic wasp pimpla hypochondriaca adversely affects the morphology, viability, and immune function of hemocytes from larvae of the tomato moth, lacanobia oleracea characteristics of pathogenic and mutualistic relationships of ascoviruses in field populations of parasitoid wasps polydnavirus genomes reflect their dual roles as mutualists and pathogens particles containing dna associated with the oocyte of an insect parasitoid family polydnaviridae. in virus taxonomy. eighth report of the international commitee on taxonomy of viruses edited by: fauquet cm virus in aparasitoid wasp: suppression of the cellular immune response in the parasitoid's host polydnaviridae -a proposed family of insect viruses with segmented, doublestranded, circular dna genomes genome sequence of a polydnavirus: insights into symbiotic virus evolution shared and species-specific features among ichnovirus genomes origin and evolution of polydnaviruses by symbiogenesis of insect dna viruses in endoparasitic wasps symbiosis in cell evolution hyenoptera: formicidae) from brazil the ultrastructure of microorganisms in the tissues of casenaria infesta (hymenoptera: ichneumonidae) apparent replication of an unusual viruslike particle in both parasitoid wasp and its host an unusual virus from the parasitic wasp cotesia melanoscela. 
virology viruslike particles in the ovaries of microctonus aethiopoides loan (hymenoptera: braconidae), a parasitoid of adult weevils (coleoptera: curculionidae) evidence for the evolution of ascoviruses from iridoviruses genomic sequence of spodoptera frugiperda ascovirus 1a, an enveloped, double-stranded dna insect virus that manipulates apoptosis for viral reproduction sequence and organization of the trichoplusia ni ascovirus 2c (ascoviridae) genome. virology sequenceand organization of the heliothis virescens ascovirus genome biological and molecular features of the relationships between diadromus pulchellus ascovirus, a parasitoid hymenopteran wasp (diadromus pulchellus) and its lepidopteran host, acrolepiopsis assectella dpav-4, on thehemocytic encapsulation response and capsule melanization of the leek-moth pupa, acrolepiopsis assectella genomic and morphological features of a banchine polydnavirus: comparison with bracoviruses and ichnoviruses i am what i eat and i eat what i am: acquisition of bacterial genes by giant viruses the genome of melanoplus sanguinipes entomopoxvirus cloning and expression of a gene encoding a campoletis sonorensis polydnavirus structural protein a gene encoding a polydnavirus structural polypeptide is not encapsidated what does structure tell us about virus evolution? 
cluster of re-configurable nodes for scanning large genomic banks deciphering protein sequence information through hydrophobic cluster analysis (hca): current status and perspectives comparative genomic analysis of the family iridoviridae: reannotating and defining the core set of iridovirus genes the thioredoxin system in retroviral infection and apoptosis mimivirus giant particles incorporate a large fraction of anonymous and unique gene products cell entry by enveloped viruses: redox considerations for hiv and sars-coronavirus genetic recombination of the dna plant virus pbcv-1 in a chlorella alga common origin of four diverse families of large eukaryotic dna viruses evolutionary genomics of nucleo-cytoplasmic large dna viruses effects of the nonoccluded virus of spodoptera frugiperda (lepidoptera: noctuidae) on the development of a parasitoid parasitoid-mediated transmission of an iridescent virus non-poly-dna viruses, their parasitic wasp, and hosts the few virus-like genes of cotesia congragata self-synthesizing dna transposons in eukaryotes marvericks, a novel class of giant transposable elements widespread in eukaryotes and related to dna viruses evolution of viruses by acquisition of cellular rna or dna nucleotide sequences and genes: an introduction microbialgenes in the human genome: lateral transfer or gene loss? science are there bugs in our genome? science express genome-wide survey for genes horizontally transferred from cellular organisms to baculoviruses morphogenesis by symbiogenesis endosymbiotic gene transfer: organelle genomes forge eukaryotic chromosomes a cryptic intracellular green alga in ginkgo biloba: ribosomal dna markers reveal worldwide distribution forest succession suppressed by an introduced plant-fungal symbiosis unfolding the evolutionary story of polydnaviruses structure and evolution of a proviral locus of glyptapanteles indiensis bracovirus this research was funded by grants from the c.n.r.s. 
(pics n°204343), the genoscope, the a.n.r. project in bioinformatics modulome, the ministère de l'education nationale, de yb is the leader of all aspects of the research on the biology, genomics, and evolution of dpav4. ss coordinated the sequencing, assembly, and sequence quality control of the dpav4 genome. cag participated in the bioinformatics analysis of the dpav4 genome development of the manuscript. baf contributed original concepts regarding the evolutionary origins and role of polydnaviruses in endoparasitoid biology, provided virological expertise to optimize data interpretation, and participated in writing the manuscript. predicted orfs in dpav4 genome. key: cord-021772-5v4gor2v authors: levine, gwendolyn j.; cook, jennifer r. title: cerebrospinal fluid and central nervous system cytology date: 2019-05-31 journal: cowell and tyler's diagnostic cytology and hematology of the dog and cat doi: 10.1016/b978-0-323-53314-0.00014-6 sha: doc_id: 21772 cord_uid: 5v4gor2v nan analysis of csf is an important adjunctive diagnostic tool in the workup of patients with cns disease and must be interpreted within the context of the patient's history, clinical signs, clinicopathological data, imaging studies, and other ancillary diagnostics. rarely is csf solely used to provide an etiological diagnosis (exceptions include cytological visualization of infectious agents or overtly neoplastic cells), but analysis may significantly narrow the field of pathophysiological differentials, guiding further diagnostic and therapeutic options. csf analysis is most sensitive in detecting inflammatory disease. 3 positive findings in csf tend to be more diagnostically helpful compared with negative findings but are often nonspecific because many different diseases may cause a common csf pathology (e.g., neutrophilic pleocytosis). 
4 occasionally, the magnitude of change within the csf may be as instructive as the character of the change (e.g., a marked increase in protein concentration raising diagnostic concern for feline infectious peritonitis, marked neutrophilic pleocytosis raising diagnostic concern for steroid-responsive meningitis-arteritis in a young dog in pain). 4 more frequently, however, specific disease etiologies will present with csf changes of variable character and magnitude. csf that falls within laboratory reference intervals should never be used to rule out a differential diagnosis because negative findings may represent early or mild disease, disease suppressed or masked by therapeutic intervention, or a disease process that does not present within the particular area of the extracellular space being sampled. csf analysis may or may not correlate with imaging studies; a retrospective study of 92 cats receiving magnetic resonance imaging (mri) for spinal signs showed that abnormal csf was not a predictor for abnormal mri. 5 in another study, approximately 25% of dogs with intracranial signs and inflammatory csf had normal brain mri results. 6 the conventionally accepted theory of csf secretion and transport is based on the concept of active transport of ions within the ventricular ependymal cells and choroid plexi, subsequent passive flow of fluid, and circulation and drainage of csf into dural venous sinuses. these ideas have recently come under scrutiny as potentially simplistic and inconsistent with the past 100 years of experimental evidence. 7 analysis of past experiments, coupled with new data, supports a "global production" hypothesis-that instead of exclusive formation within the ventricles, csf is continually created and reabsorbed diffusely by cerebral capillaries that have slight variances in hydrostatic and osmotic pressure. canine studies have documented csf production within the ventricular system and the sas. 
2 a study of 158 dogs with focal, noninflammatory disease showed that in cases of spinal lesions, csf was more likely to be abnormal if collected from the lumbar cistern, that is, caudal to the lesion. 12 this observation may be explained by presupposing cranial to caudal flow of csf, but the traditionally held theory of csf flow has recently been contested. 7 in canines, csf collected from the cerebromedullary cistern generally has lower microprotein concentrations compared with samples collected from the lumbar cistern. 13 blood contamination may be more pronounced in lumbar collection, as the desired subarachnoid space is more difficult to enter and yields a smaller volume of fluid that tends to flow more slowly. 9, 10 moreover, hemodilution may contribute to increased measured protein concentration. 13 rare instances of csf contamination with hematopoietic precursors have only been reported from lumbar sites. 14 a low, but potentially catastrophic, risk for puncturing the cervical spinal cord or caudal brainstem exists during cerebromedullary collection. because the spinal cord length is variable, spinal cord puncture is a possibility during lumbar collection, but it is associated with less severe adverse effects compared with injury following cisternal puncture. in a case series of four accidental cisternal parenchymal punctures (documented by using mri), three of the four patients suffered neurological decompensation and subsequently had to be euthanized. 15 the following equipment should be assembled: anesthesia and monitoring equipment, clippers, aseptic preparation materials for the skin, sterile gloves, and a spinal needle with stylet. for cerebromedullary cistern collection in dogs weighing less than 25 kg and for cats, a 22-gauge, 1.5-inch spinal needle is usually adequate, and a 22-gauge, 2.5-inch spinal needle is recommended for dogs weighing greater than 25 kg. 
for lumbar puncture, a 22-gauge spinal needle up to 6 inches long may be required for obese or extremely large patients. if available, fluoroscopic equipment may aid in the acquisition of cisternal or lumbar csf. at the authors' institution, fluoroscopy is often used before cisternal csf acquisition in toy-breed dogs to exclude the possibility of subclinical atlantoaxial subluxation. the anesthetized patient is placed in lateral recumbency (it is generally easier for a right-handed clinician to have the patient in right-lateral recumbency, and vice versa), with the neck and back flush to the edge of a sturdy table. for collection from the cerebromedullary cistern, the neck is flexed such that the dorsum of the muzzle is 90 degrees to the long axis of the body (if needed, stabilizing the endotracheal tube to prevent kinking and deflating the cuff to prevent tracheal trauma), and the snout is propped up slightly, if necessary, to keep it parallel with the table and not angulated from the sagittal plane. 10 a wide area (3-5 cm) around the atlanto-occipital joint (beyond atlas wings and axis spinous process and to the external occipital protuberance) is shaved and aseptically prepared, and landmarks are palpated with a gloved, nondominant hand. 9 the needle is inserted at the intersection of two imaginary perpendicular lines that run (1) along the dorsal midline (dividing the patient sagittally) from the occipital protuberance to the cranial spinous process of the axis (c2) and (2) across the craniolateral aspects of the wings of the atlas (c1) (dividing the patient craniocaudally). for lumbar collection, the pelvic limbs are brought forward into full flexion, and the needle is inserted cranial and parallel to the dorsal spinous process of l6 for dogs and l7 for cats, advancing the needle until the ventral aspect of the vertebral canal is encountered; the needle is then retracted slightly and csf is collected from the ventral sas. 
10, 16 the pelvic limbs may be kicked or may twitch slightly during collection because of irritation of the cauda equina or spinal cord parenchyma. for either location, once landmarks are palpated, the needle is held stably with the dominant hand and very slowly advanced, stylet in place. the heel of the dominant hand may be supported against the table. for cisternal collection, it is important to advance the needle toward the point of the nose without angulation. the stylet is removed with the nondominant hand every 2 to 3 mm to check for fluid within the needle hub, waiting a few seconds. it is common to feel a decrease in resistance to forward needle movement once the thecal space is entered. if bone is hit or frank hemorrhage is observed from the needle, it should be withdrawn slowly and collection reattempted. 9 if clear or slightly blood-tinged fluid is observed, advancement of the needle is stopped, and open tubes are placed directly under the needle hub to collect freely falling drops. csf is collected passively and should not be aspirated. there are no significant objective data regarding the maximal amount of csf that may be collected in dogs. several authors claim that it is safe to collect 0.2 milliliters (ml) of csf per kilogram of body weight (1 ml/5 kg); in other species much higher volumes of csf per body weight are acquired standardly. 17 in general, 0.5 to 1 ml of csf is adequate for routine diagnostic tests, including cell counts, protein concentration, and cytological analysis. larger volumes are necessary for additional diagnostics (cultures, titers, polymerase chain reaction [pcr], flow cytometry, protein electrophoresis, etc.). two sets of tubes should be readied and ideally handled by an assistant. an ethylenediaminetetraacetic acid (edta)-treated (purple-top) tube is used for cell counts, flow cytometry, and pcr testing for organisms, and plain (red-top) tubes are used for protein concentration, culture, or immunologic assays. 
10 some sources indicate that plain tubes are recommended, as edta could increase protein concentration. if csf analysis will occur rapidly (within 1 hour), collection into a plain tube is adequate, whereas preservation of cells may be improved with collection into edta if analysis will be delayed. if low volume is present, priority is given to the edta tube. if csf appears red, then iatrogenic hemorrhage (puncture of a dural vessel) or actual cns hemorrhage has occurred. in this instance, the first few drops are allowed to collect into the first set of tubes, and the second set of tubes are reserved for the latter portion of the sample, as iatrogenic hemorrhage tends to clear over time. if the hemorrhage does clear, a decision may be made about discarding the first set of tubes or keeping them for ancillary testing not affected by the hemorrhage. after collection, the needle is withdrawn without the stylet, and the csf within the needle is allowed to drip into one of the tubes or is placed in an additional plain tube and saved for culture. as with other clinicopathological and cytological samples, evaluation of a fresh specimen is preferred to minimize cellular degradation, to which csf is particularly vulnerable because of its relatively low protein concentration. sample degradation will affect cell differential count to a greater extent than the total nucleated cell count or the protein concentration. 18 a study of 30 canine csf samples with pleocytosis concluded that delay of analysis up to 8 hours was unlikely to alter interpretation, especially in samples with protein concentrations above 50 milligrams per deciliter (mg/dl). 18 preservative should be added to low protein samples unless analysis is to be completed within 60 minutes (see next section), and a dilutional effect must then be factored into cell counts. 
18 samples to be shipped to a reference laboratory overnight should be kept at refrigeration temperature and shipped with ice packs for analysis within 48 hours. 9, 16 the reference laboratory should be prenotified to ensure prompt analysis. if analysis is likely to be delayed by more than 1 hour and the csf sample has a protein concentration less than 50 mg/dl, one of the following may be added as a protein source to maintain cellular integrity: (1) hetastarch (add 1:1 volume), (2) fetal calf serum (3.7 g/dl protein; add 20% by volume), or (3) autologous plasma or serum (fresh or frozen; 11% by volume ≡ one drop from 25-gauge needle (approximately 0.03 ml) mixed into 0.25 ml csf). 10, 19, 20 the sample should be labeled with the protein source and amount added to the sample. one study demonstrated better preservation of mononuclear cells in canine samples when fetal calf serum was used instead of hetastarch. 18 all samples should be refrigerated at 4°c to minimize cellular degradation. a hemocytometer may be employed in practice to count nucleated cells and erythrocytes. both sides of the cover-slipped hemocytometer are loaded with unstained csf, which is then placed in a humidified container for 10 to 15 minutes to allow cells to settle on the glass. because the fluid is unstained, the microscope condenser is lowered to improve contrast. erythrocytes and nucleated cells are differentiated by size, refraction, granularity, and smoothness of plasma membrane. 21 some laboratories stain csf samples with new methylene blue (nmb), as leukocytes will take up stain, whereas erythrocytes remain unstained, making differentiation of leukocytes (specifically small lymphocytes) and erythrocytes easier (fig. 14.1) . 22 a small volume of csf is drawn into a capillary tube coated with nmb or a tube that has a small volume of nmb followed by an air pocket. 
22 the tube containing nmb and csf is gently rocked back and forth, allowing the cells to take up some stain without diluting the csf with a volume of nmb. 22 the hemocytometer is then loaded, and each population is counted and totals are calculated, as follows: neubauer chamber: (1) both areas of large nine squares are counted, and the average of the number of leukocytes and erythrocytes is found; (2) the average is multiplied by 9 to get the cells per microliter (cells/μl). 10 the advia 120 (siemens medical solution, fernwald, germany) hematology instrument has been validated for analyzing canine csf samples and shows excellent correlation with manual methods used in dogs with increased total cell counts (pleocytosis), but the instrument may overestimate the cell count in samples without pleocytoses and has not been validated for the identification of eosinophils. 23 the automated differential count is also more accurate at higher cell numbers and thus should be compared with a traditional manual differential. the advia 2120 hematology analyzer displayed satisfactory agreement with the standard hemocytometer method. 24 validation experiments using 67 canine samples showed a sensitivity of 100% and specificity of 89% for accurately identifying samples with pleocytosis when manual counting was considered the gold standard (>5 cells/μl). 24 the instrument tended to be less accurate at lower (within reference interval) nucleated cell counts. 24 erythrocytes may be a source of interference, as a red blood cell (rbc) count of 250 cells/μl was shown to elevate the nucleated cell count. 24 with regard to differential cell count, the instrument performed better in the presence of pleocytosis, whereas monocytes were overcounted at lower nucleated cell counts. 24 automated cell counts thus should not replace a manual differential but may be used as another level of quality control. automated instruments cannot recognize altered cell types, such as atypical neoplastic cells. 
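as a rough illustration of how sensitivity and specificity figures such as those quoted above are derived, the sketch below computes both from a 2 x 2 comparison of an automated result against manual counting as the reference method. the function name and the counts are hypothetical and are not the data from the validation study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) as fractions between 0 and 1.

    tp / fn: reference-positive samples called positive / negative by the analyzer.
    tn / fp: reference-negative samples called negative / positive by the analyzer.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative sample")
    sensitivity = tp / (tp + fn)   # proportion of true pleocytoses detected
    specificity = tn / (tn + fp)   # proportion of normal samples correctly called normal
    return sensitivity, specificity


# hypothetical counts, chosen only for illustration
sens, spec = sensitivity_specificity(tp=18, fn=2, tn=45, fp=5)
print(f"sensitivity {sens:.0%}, specificity {spec:.0%}")
```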
measurement of csf specific gravity is not considered to be helpful because of low sensitivity for detecting abnormalities. 12 csf microprotein may be semiquantitatively measured by using urine dipsticks that detect albumin. this assay has a lower detection limit of 100 mg/dl; therefore, it has low sensitivity for mild to moderate csf protein concentration elevations (30 mg/dl to 100 mg/dl). false-positive or false-negative reactions may occur if the dipstick reads at trace or 1+, but this method is useful if other techniques are not available. 11 reference laboratories measure csf microprotein with methods similar to, but more sensitive than, those used for serum protein: the trichloroacetic acid method, the ponceau s red dye-binding method, or the coomassie brilliant blue method. 22 csf globulin production is typically screened for with the pandy reaction. in this test, a few drops of csf are added to 1 ml of 10% carbolic acid solution, and the resulting turbidity is graded 0 to 4+. any pandy score above zero is considered elevated. globulin concentration below 50 mg/dl will be undetectable with either test. 10, 21 protein electrophoresis and immunoelectrophoresis may be performed on csf and serum for maximum fractionation. 25 the utility of protein electrophoresis or immunoelectrophoresis of csf lies in discriminating altered blood-brain barrier (bbb) permeability from increased localized production of immunoglobulin, which may be suggestive of (but not specific for) a disease entity for which an electrophoretic pattern has been established. cytological analysis is a critical component of csf evaluation because the differential count (percentages) of cells may be abnormal, even if the total nucleated count is within reference interval. cytology also enables examination for neoplastic cells, infectious agents, and evidence of prior hemorrhage. 
it may also serve as a quality control point, allowing for correlation between observed cellularity and the total count generated by a hemocytometer or an automated analyzer. because of its low cellularity, csf must be concentrated before cytological smear preparation. use of an in-house sedimentation chamber (sörnäs procedure) may be very useful and preserves cell-free fluid for ancillary testing. 10 this technique will recover approximately 60% of total cells, which is sufficient for analysis. 16 a syringe barrel (with the tip and needle aseptically removed with a scalpel blade) is turned upside down and the smooth, top side is placed in warm petroleum jelly and then onto a clean slide. once a seal has formed, fresh csf (at least 0.5 ml) is placed in the syringe and allowed to sit for 30 minutes. 16, 21 then, the supernatant is aspirated carefully with a pipette so as not to disturb the bottom layer contacting the slide. the syringe barrel is removed, and any excess csf is carefully absorbed with a small piece of filter paper or paper towel. the slide is completely and rapidly air-dried without heat (inadequate drying results in cellular distortion), excess petroleum jelly removed with a scalpel blade, and the slide is stained with routine romanowsky stains (e.g., diff-quik). if csf is sent to a reference laboratory, a cytological slide will likely be prepared using cytocentrifugation (500-1000 revolutions per minute [rpm] for 5-10 minutes, either onto a slide coated with albumin or with the addition of 0.05 ml of 30% albumin for improved cell capture) for maximal concentration of nucleated cells onto one slide. 16 cytocentrifuged cytology may show excellent cellular detail, but the preparation may enlarge cells slightly and create an artifactual foamy or vacuolated appearance. 16 slides are air-dried and stained with conventional romanowsky stains. multiple cytospin preparations may be made to yield 200 intact nucleated cells for classification. 
as it is rare for etiologic agents to localize only within the cns, all cases of suspected infection may be aided diagnostically by fine-needle aspiration (fna) cytology, biopsy with histopathology, culture of nonneural lesions, or all of these. 21 bacterial culture and sensitivity testing of csf is recommended for most cases of neutrophilic pleocytosis, given the appropriate clinical index of suspicion for a septic lesion. even when organisms are visualized on csf cytology, speciation and susceptibility testing may help guide prognostic and treatment decisions. alternatively, bacterial or fungal culture may be negative regardless of cytological observation of organisms. 10, 20 it must be remembered that bacterial cns infection is highly uncommon in dogs and cats compared with other domestic animal species. 26 advanced techniques for neurological disease diagnosis are expanding rapidly. enzyme-linked immunosorbent assay (elisa)-based assays for antibody detection and pcr-based assays for nucleic acid detection of several medically important microbes have been developed for use on csf and may be instructive in the diagnosis of viral, rickettsial, protozoal, or fungal diseases. 20 a large canine study that included a subset of 16 dogs with neoplastic or inflammatory disease showed that csf titer provided diagnosis in 25% of cases. 3 antibody assays should be interpreted cautiously because the presence of antibody may indicate prior exposure or vaccination rather than active infection. moreover, compromise to the bbb in states of inflammation may translate to the presence of antibodies within the csf without local production. occasionally cross-reactive antibodies may be present that do not represent presence of the disease agent under assessment. similarly, specimens for pcr should be submitted to a laboratory with strict quality control to minimize false-negative and false-positive results. 
poor collection technique may result in false-positive results, especially for bacterial species that are ubiquitous in the environment. 27 as with other aspects of csf analysis, a negative pcr result does not definitively rule out the presence of a pathogen because of the sampling limitation of a small portion of the extracellular space. 20 csf contains glucose, electrolytes, neurotransmitters, and enzymes, but these substances are not measured routinely, although this measurement represents a rapidly expanding area of research in the effort to give clinicians better tools for diagnosing patients and determining prognoses. csf enzymes originate from the bloodstream, the cns, or cells within csf. 10 one study of 34 cats with noninflammatory cns disease showed that measurement of csf activities of lactate dehydrogenase (ldh), aspartate aminotransferase (ast), and creatine kinase (ck) were not diagnostically sensitive but may be useful in detection of acute injury. 28 multiple studies have correlated elevations in csf ck activity with poor prognosis in dogs with neurological disease or spinal cord injury. 29, 30 immunoassays for vascular endothelial growth factor (vegf) and s-100 calcium-binding protein have shown elevations of both molecules in the csf of experimentally induced hypothyroid dogs, suggesting endothelial and glial contribution to increased bbb permeability in this population. 31 myelin basic protein (mbp) has been found to be elevated in lumbar csf in dogs with degenerative myelopathy, supporting the conclusion that it is a demyelinating lesion. 32 mbp concentration is elevated in the csf of dogs affected by intervertebral disk herniation (ivdh) and has been found to be an independent predictor of poor prognosis. 
33 beta-2-microglobulin, a major histocompatibility complex i (mhc-i)-associated molecule, has been assayed by using elisa and found to be elevated in the csf of dogs with ivdh and inflammatory disease and also positively correlated with normal total nucleated cell count (tncc). 34 the amino acids tryptophan and glutamine have been found to be elevated in the csf of dogs with portosystemic shunts because of abnormal ammonia metabolism. 35 one study found increased oxytocin in the csf of dogs with spinal cord compression, where it is believed to have an analgesic effect. 36 gamma-aminobutyric acid (gaba) and glutamate neurotransmitter concentrations have been measured in dogs with epilepsy. 37 normal csf is clear and colorless, with few cellular elements and a protein concentration approximately 200 to 300 times less than that of plasma or serum. red or yellowish coloration indicates prior lesional hemorrhage or iatrogenic hemorrhage during collection. in the latter case, a pellet of rbcs will be present after centrifugation. true xanthochromia (yellowish color of hemoglobin breakdown products) that does not clear on centrifugation, cytological evidence of erythrophagia, or both indicate prior hemorrhage into the subarachnoid space. 20 increased bilirubin leakage into the sas or high concentrations of csf protein (>100-150 mg/dl) may cause xanthochromia. 21 increased turbidity of the sample may be caused by an increased number of cells present (>400 rbcs/μl or >200 nucleated cells/μl) but is usually not affected by mild changes. 10, 11 cell counts. tncc is fewer than 5 cells/μl in the dog and fewer than 8 cells/μl in the cat, and elevation above this range is termed pleocytosis. 10 grading of pleocytosis is somewhat subjective: in one reference, "mild" was defined as 6 to 50 cells/μl; "moderate" as 51 to 1000 cells/μl; and "marked" as more than 1000 cells/μl. 
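the grading scheme quoted above can be written as a small lookup. this is only a sketch: the function name is hypothetical, and treating a count exactly at the species limit (5 cells/μl in the dog, 8 cells/μl in the cat) as "none" is an assumption, because the text does not say how boundary values are handled.

```python
# upper tncc limits (cells/ul) taken from the text; boundary handling is an assumption
TNCC_LIMIT = {"dog": 5, "cat": 8}


def grade_pleocytosis(tncc_per_ul, species="dog"):
    """Grade a total nucleated cell count using the one-reference scheme in the text."""
    limit = TNCC_LIMIT[species]
    if tncc_per_ul <= limit:
        return "none"       # at or below the species limit: not graded as pleocytosis
    if tncc_per_ul <= 50:
        return "mild"       # up to 50 cells/ul
    if tncc_per_ul <= 1000:
        return "moderate"   # 51 to 1000 cells/ul
    return "marked"         # more than 1000 cells/ul


for count in (3, 6, 75, 3672):
    print(count, grade_pleocytosis(count, "dog"))
```

because the grading is described as somewhat subjective, the cut-offs are best kept in one place so that a laboratory's own scheme can be substituted.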
4 depending on laboratory-specific reference intervals, normal protein concentration is usually less than 25 to 30 mg/dl for cisternal csf and less than 45 mg/dl for lumbar csf. 10, 20 approximately 80% to 95% of csf protein is albumin, and 5% to 12% of csf total protein comprises gammaglobulins. 2 eighty percent of csf protein is transferred from plasma, with the remainder produced within the cns. the latter population includes molecules also produced by other organs and proteins unique to the csf that may potentially be used as markers of cns tissue damage. experimental evidence and earlier literature support a gradient of increasing protein concentration from cranial to caudal within the subarachnoid space, which has been attributed to slower flow and greater blood-csf permeability caudally. 12 normal csf is acellular or contains small numbers of small lymphocytes (figs. 14.2 and 14.3) and large mononuclear cells (macrophages, ependymal lining cells, meningothelial lining cells, choroid plexus cells) (figs. 14.4 and 14.5). large mononuclear cells may be vacuolated and contain phagocytized material ( fig. 14.6) . a low frequency of nondegenerate neutrophils (<25%), which are usually indicative of blood contamination during collection, may be present. 38 a study of 359 samples of canine csf found a 7.5% incidence of meningeal, choroid plexus, ependymal, endothelial cells, or all of these. 39 no correlation existed between the presence of these cells and the presence of pleocytosis, elevated protein concentration, or the primary disease etiology. 39 thus it is postulated that the presence of these cells is an artifact of collection and should not be overinterpreted. the authors recommended the term "surface epithelial cells" for the combined grouping (which cannot be distinguished cytologically), although not all of these cells (meningeal, endothelial) are of epithelial origin. 
39 occasionally, anucleate superficial squamous epithelial cells may be seen; these may be caused by contamination from the skin (fig. 14.7). occasionally, small amounts of granular, foamy extracellular material are present and are consistent with myelin or myelin-like material, which will stain positively with luxol fast blue stain. this material may consist of myelin fragments, which are generated from demyelination, or may consist of myelin figures (a nonspecific term for layered phospholipids exfoliated from damaged cells). 40 the two cannot be distinguished with light microscopy. the significance of this material remains unclear because it may be observed in samples from patients with no discernible cause. a study of 98 canine cerebromedullary and lumbar csf samples showed a 20% incidence of myelin-like material, with a higher percentage in samples from the lumbar cistern or from small dogs (<10 kg). 41 the presence of the material was not correlated with case outcome. 41 similarly, in a study of 61 cavalier king charles spaniels with chiari-like malformations, myelin-like material was observed in 57% of lumbar csf collections and 12% of cerebromedullary collections. 42 thus myelin-like material may be a procedural artifact or may be consistent with a demyelinating (e.g., canine distemper virus, degenerative myelopathy) or potentially necrotizing disorder (e.g., ivdh, other spinal trauma, or a necrotic neoplasm). 40, 41 normal csf should not contain erythrocytes, but hemodilution is a common occurrence. varying reports on the effect of blood contamination on tncc, leukocyte differential, and protein concentration have been published. 
[43] [44] [45] [46] deciding whether increased tncc or protein concentration is the result of hemodilution alone or a significant change concurrent with hemodilution necessarily remains, to an extent, a subjective assessment and must be critically evaluated in light of the magnitude of csf findings along with the other pertinent facts of the case. correction formulas for csf parameters in the face of hemodilution (e.g., adding 1 nucleated cell/μl per 100 or 500 rbcs/μl) are unreliable. 45, 46 in a recent study of 106 canine csf samples without pleocytosis (tncc <5/μl) but containing at least 500 rbcs/μl, the mean percentage of neutrophils (45.2% versus 5.7%), percentage of samples with eosinophils present (36.8% versus 6.8%), and mean protein concentration (40 mg/dl versus 26 mg/dl) were found to be significantly increased in the samples with blood contamination when compared with controls. 47 significant rbc contamination warrants repeat sampling, if possible. marked hemorrhage or evidence of prior hemorrhage (erythrophagocytosis, xanthochromia, hemosiderin-laden macrophages) may be useful in the diagnosis of cns trauma, which may be accompanied by neutrophilic to mixed cell pleocytosis and mild increase in protein concentration. 4 elevated protein concentration in csf (>30 mg/dl) may occur with or without pleocytosis, and in the absence of pleocytosis is termed albuminocytological dissociation (acd). high protein concentration may be the result of leakage of plasma or cellular proteins across the bbb, localized production of immunoglobulin, localized tissue damage or necrosis, decreased clearance of protein into the venous sinuses, obstruction of csf circulation, or all of the above. as such, it is a nonspecific change that indicates cns damage or hyperproteinemic disease and is consistent with disease of any etiology (e.g., trauma, metabolic, infectious, inflammatory, degenerative, or neoplastic). 
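the definition of albuminocytological dissociation given above (elevated protein without pleocytosis) can be sketched as a simple check. the function name and default arguments are hypothetical; the 30 mg/dl protein cut-off and the canine tncc limit of 5 cells/μl are taken from the text, and a laboratory-specific or lumbar reference limit would be passed in instead where it applies.

```python
def is_albuminocytological_dissociation(protein_mg_dl, tncc_per_ul,
                                        protein_limit_mg_dl=30.0,
                                        tncc_limit_per_ul=5):
    """Return True when protein is elevated and no pleocytosis is present."""
    protein_elevated = protein_mg_dl > protein_limit_mg_dl
    pleocytosis = tncc_per_ul > tncc_limit_per_ul
    return protein_elevated and not pleocytosis


# elevated protein with a normal cell count meets the definition
print(is_albuminocytological_dissociation(protein_mg_dl=62, tncc_per_ul=2))
# elevated protein with pleocytosis does not
print(is_albuminocytological_dissociation(protein_mg_dl=62, tncc_per_ul=40))
```

a positive result remains a nonspecific finding and says nothing about etiology.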
caution should be exercised when diagnosing acd if the sample is hemodiluted (>500 rbcs/μl). 47 as is true for pleocytosis, inflammation of the meninges and superficial regions of parenchyma will result in greater csf protein elevations than for lesions that are more remote from the sas. occasionally, an abnormal leukocyte differential (shifted from mononuclear predominance to neutrophil predominance) without pleocytosis occurs. this may only be detected if cytological analysis (after sedimentation or cytocentrifugation of csf) is performed. increased percentages of neutrophils may occur in early or mild inflammatory disease, noninflammatory cns disease, disease that is remote from the sas or sampling site, or in cases of hemodilution. an increased proportion of neutrophils is present when neutrophils comprise greater than 25% of all nucleated cells, and increased percentage of eosinophils occurs when eosinophils comprise greater than 1% of the differential. 10 when present (with or without increased tncc), neutrophils should be evaluated for toxic change, degenerative change, and intracellular organisms or other inclusions ( fig. 14.8 ). increased percentage of neutrophils without pleocytosis has been associated with healthy dogs, blood contamination, degenerative disk disease, neoplasia, cerebrovascular accident, fracture, cns aspergillosis, and fibrocartilaginous embolism (fce). 10, 42, 48 a study of 61 cavalier king charles spaniels with chiari-like malformation documented that those with syringomyelia were more likely to have an increased percentage of neutrophils, but it was not reported whether this subpopulation also had a concurrent pleocytosis. 42 in another study, cats with cns neoplasia had increased percentage of neutrophils or lymphocytes without a pleocytosis. 28 although not a classic pattern, infectious or inflammatory disease should not be ruled out if increased neutrophils are visualized without pleocytosis. 
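the two differential cut-offs stated above (neutrophils greater than 25% and eosinophils greater than 1% of nucleated cells) can be applied to a manual differential as in the sketch below. the function name, the dictionary keys, and the example counts are hypothetical.

```python
def differential_flags(counts):
    """Flag increased neutrophil and eosinophil percentages in a manual differential.

    counts maps a cell type name to the number of cells classified as that type.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no nucleated cells were classified")
    neutrophil_pct = 100.0 * counts.get("neutrophils", 0) / total
    eosinophil_pct = 100.0 * counts.get("eosinophils", 0) / total
    return {
        "neutrophil_pct": neutrophil_pct,
        "eosinophil_pct": eosinophil_pct,
        "increased_neutrophils": neutrophil_pct > 25.0,  # greater than 25% of nucleated cells
        "increased_eosinophils": eosinophil_pct > 1.0,   # greater than 1% of the differential
    }


# hypothetical 200-cell differential
example = {"neutrophils": 60, "lymphocytes": 100, "large_mononuclear": 37, "eosinophils": 3}
print(differential_flags(example))
```

the flags describe the differential only; they apply whether or not the tncc is increased.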
increased percentage of eosinophils has been reported in parasitic and protozoal diseases, such as neospora caninum infection. 38 one cat with eosinophilic meningoencephalitis (eme) of unknown etiology had an increased percentage of eosinophils and lymphocytes without pleocytosis. 49 the specific diseases mentioned in the next section on various categories of pleocytosis are a survey of the current literature and meant to be a helpful starting point in the generation of particular differential diagnoses. thus disease entities are listed in the section under which they are most commonly present, but it is important to note that for all disease entities, variability in the nature and the magnitude of pleocytosis may emerge in a particular patient at a particular point in time. wherever possible, other categories of pleocytosis that have been reported for a disease have been mentioned. generally, pleocytoses are defined by the cell type that comprises 70% or more of the nucleated cell population. if all cell types are 50% or less, the pleocytosis is classified as a mixed cell pleocytosis. and if, for example, lymphocytes are greater than 50% but less than 70%, some pathologists will classify the pleocytosis as mixed cell, lymphocyte predominant. a pleocytosis will be classified as eosinophilic if eosinophils compose at least 10% to 20% of the nucleated cell population. 11 bacterial meningoencephalomyelitis. bacterial infections of the cns are unusual and represent a small portion of neutrophilic pleocytoses. typically, this pleocytosis is severe (could be over 1000 cells/μl), neutrophilic, and accompanied by significantly elevated protein concentration, but the cell population may change to mononuclear during the course of treatment. 
10, 20, 50 rare instances of brain abscessation secondary to sepsis (which may be a sequela of iatrogenic immunosuppression) may result in marked neutrophilic pleocytosis, markedly elevated protein concentration, visualization of bacterial organisms (see fig. 14.8) , and abnormal mri findings. 51 staphylococcus intermedius was cultured from the csf of a dog presenting with a retrobulbar abscess and neurological signs. 52 the csf showed a moderate neutrophilic pleocytosis (75 cells/μl) and borderline elevation in protein concentration (30 mg/dl). 52 local extension of severe otitis interna resulting in meningoencephalitis and ventriculitis in a dog has been reported. 53 this patient exhibited a severe neutrophilic pleocytosis (3672 cells/μl) and protein elevation (>400 mg/dl). 53 pasteurella multocida meningoencephalomyelitis in a kitten was characterized by marked neutrophilic pleocytosis (981 cells/μl) with mild protein elevation (31 mg/dl) and rare extracellular and intracellular bacterial rods. 54 bacterial culture and susceptibility testing are recommended but may yield false-negative results if organisms are not circulating in the extracellular space or if prior antibiotic therapy had been given. serology and csf-pcr (using organism-specific or universal bacterial [ub] pcr) are recommended. 27, 54 cryptococcosis in dogs. cryptococcus spp. are a large genus of systemic dimorphic fungi with a predilection for cns tissue, which is infected hematogenously or via direct penetration of the cribriform plate. only two species at this time are medically important: (1) cryptococcus neoformans (var. neoformans and var. grubii) and (2) cryptococcus gattii. in a recent study of 31 dogs with cryptococcosis, 68% had cns infection, with neurological signs being the most common reason for presentation. 
55 dogs and cats with cryptococcosis typically have pleocytoses and elevated protein concentrations, but pleocytoses may be variably neutrophilic, eosinophilic, mononuclear, or mixed. in a recent study of 15 dogs with cns cryptococcosis, organisms were found in 11 of 15 csf samples (figs. 14.9 and 14.10). 56 all affected dogs had pleocytoses that were mixed to mononuclear, whereas cats tended to have neutrophilic pleocytoses. 56 of the samples, 11 of 12 also had increased protein concentrations (mean 494 mg/dl), which were significantly higher than in cats in the same study (mean 45 mg/ dl). 56 capsular antigen latex agglutination testing on serum or csf is highly sensitive and specific and is recommended if cryptococcosis is suspected but organisms are not visualized cytologically. 57 this test may yield negative results if disease is present but localized (i.e., within the respiratory tract), so appropriate clinical signs should guide testing. culture of csf may also be helpful and may distinguish c. neoformans from c. gattii with the use of selective media. the finding of inflammatory foci on mri may be supportive of the presence of fungal disease; cryptococcosis may result in mass lesions, meningitis, or pseudocyst formation. cryptococcosis is the most common systemic fungal disease of cats and is believed to infect the cns less frequently than in the dog. a recent study found that 42% of 62 cats with cryptococcosis had cns infection, but respiratory signs were still a more common reason for presentation. 55 mild to marked neutrophilic or mononuclear pleocytosis may occur, with variable and occasionally normal protein concentrations. 4 a study of cats with cns cryptococcosis showed organisms in 9 of 11 of the csf samples, and a majority of cases (9 of 10) had neutrophilic pleocytosis and increased protein concentration (8 of 10). 56 eosinophilic pleocytosis may also occur. 
capsular antigen latex agglutination testing on serum or csf is recommended for confirmation of cryptococcus spp. infection, with rare false-negative reactions if disease is highly localized. fungus that has been visualized in canine csf and may be extracellular or within leukocytes. 58 ehrlichiosis. neutrophilic pleocytosis has been reported in cases of granulocytic ehrlichia spp. in dogs (fig. 14.11). 61 neurological signs are uncommon in this disease, and affected dogs may display features ranging from ataxia to seizures. feline infectious peritonitis (fip) has been traditionally linked to marked csf changes, but the current literature paints a somewhat more varied picture. one study of natural fip infection showed neutrophilic pleocytosis (as defined by >50% neutrophils) in the majority (7 of 11) of cases, with fewer cases of mononuclear (3 of 11; as defined by >80% mononuclear cells) and mixed cell (1 of 11) pleocytosis, all of variable severity. 4 most cases (7 of 9) also had differing degrees of elevated protein concentrations. 4 diagnosis was confirmed by histopathology or suggested by elevated feline coronavirus antibody titers and reduced albumin-to-globulin ratios in both serum and body cavity effusions. 4 a slightly older study of 16 csf samples (natural and experimental infections) showed pleocytosis in 2 of 16 cases (neutrophilic and lymphocytic) and elevated protein concentration in 4 of 16 cases. 62 in a larger study of 67 cats with fip or non-fip disease, incidence of pleocytosis was highest in the neurological fip group, but 20% of these patients did not have a pleocytosis. 63 additionally, protein concentrations were variably elevated and not statistically different in fip compared with non-fip neurological disease. 63 another study of 12 cats with cns fip showed 8 of 12 with unspecified pleocytosis and 3 of 12 with elevated protein concentration. 
64 in cats with cns disease, sensitivity of feline coronavirus (fecov) immunoglobulin g (igg) in csf for the diagnosis of fip was 60%, and specificity was 93%, with a positive predictive value of 75% and a negative predictive value of 87% (fip prevalence in this population was 25.6%). 63 definitive diagnosis of this disease remains challenging, with virus identification (pcr or immunohistochemistry) accompanied by pyogranulomatous inflammation in tissues being the gold standard. hypergammaglobulinemia, elevated serum α 1 -acid glycoprotein (agp), mri abnormalities (typically involving the ventricular lining and meninges), and positive feline coronavirus igg titer or pcr from serum, tissue, or csf are supportive but not specifically diagnostic, and negative findings do not rule out disease. 20, 63, 65 toxoplasmosis in cats. cats are the definitive hosts for toxoplasma gondii and may be subclinically infected; thus, diagnostics should only be performed on patients with appropriate clinical signs. cats typically present with mild neutrophilic or mononuclear pleocytosis and normal to mildly elevated protein concentration, but marked protein elevation may occur. 4 mild lymphocytic pleocytosis is also reported. 65 diagnosis may be confirmed by direct visualization of organisms in csf, aspirates of other inflammatory foci, histopathology of affected tissues, or fecal examination. serology must be interpreted cautiously because igg may remain elevated for up to 6 years after exposure. therefore, paired serum igm-igg titers, indicating acute exposure, or documentation of rising serum igg titers are more useful, but the latter is difficult to document in the advanced state of disease. 65, 66 spinal epidural empyema in dogs. epidural empyema is an uncommon disease in dogs, resulting from pyogenic infection in the epidural space. one study showed 4 of 5 dogs with neutrophilic pleocytosis of variable magnitude (11-342 cells/μl). 
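as an aside on the fecov igg figures quoted earlier for cats with cns disease, the reported predictive values follow directly from the stated sensitivity, specificity, and prevalence. the sketch below is a worked check using the standard bayes relationship; the function name is hypothetical, and the inputs are the rounded values from the text (60%, 93%, and 25.6%).

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (ppv, npv) for a test applied to a population with the given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    ppv, npv = predictive_values(0.60, 0.93, 0.256)
    print(f"ppv = {ppv:.0%}, npv = {npv:.0%}")  # ppv = 75%, npv = 87%
```

the same function shows how the predictive values would shift in a population with a different fip prevalence, which is why the prevalence is reported alongside them.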
67 no organisms were visualized on any of the samples. 67 except for one case with a lumbar csf protein concentration of 726 mg/dl, protein elevations were modest. 67 three csf samples were cultured with no growth, and two dogs for which follow-up csf was obtained showed resolution of pleocytosis. 67 these results are not surprising, as the dura likely provides a barrier to prevent infection extending from the epidural space to the subarachnoid space. reported in a young cat with a marked neutrophilic pleocytosis with intracellular and extracellular merozoites observed on csf cytology. 68 diagnosis was confirmed with decreasing paired serologic titers, and speciation to the level of sarcocystis dasypi or sarcocystis neurona was conducted with pcr from blood. 68 a case of systemic acanthamoeba spp. infection in a young boxer, diagnosed post mortem, had antemortem csf with marked neutrophilic pleocytosis (4956 cells/μl), marked increase in protein concentration (259 mg/ dl), and subnormal csf iga concentration (33 mg/dl; reference interval 35-270 mg/dl). 69 postmortem pcr for the organism was positive on extraneural tissue but not on csf or spinal cord. 69 the patient had been deliberately immunosuppressed on the basis of a preponderance of evidence of steroid-responsive meningitis arteritis at initial presentation and thus may have been infected either before or opportunistically after treatment. 69 another case report of canine cerebellar balamuthia mandrillaris infection (diagnosed post mortem with immunohistochemistry) displayed a marked neutrophilic pleocytosis (234 cells/μl), but other cases with lymphocytic pleocytosis have been reported. 70 because of tissue encystment, it is suggested that extraneural tissue be used for immunohistochemistry or pcr for antemortem confirmation of amoebic infection; pcr of csf may be diagnostic but is not widely available. 
69, 70 two dogs with aberrant spinal migration of spirocerca lupi nematodes had moderate to marked neutrophilic to mixed or eosinophilic pleocytoses (800 cells/μl with 91% neutrophils; 180 cells/μl with 60% neutrophils, 30% eosinophils). 71 steroid-responsive meningitis arteritis. steroid-responsive meningitis arteritis (srma) is presumptively an immune-mediated disease of mainly young, medium- and large-breed dogs: beagles, boxers, bernese mountain dogs, weimaraners, and nova scotia duck tolling retrievers are overrepresented. 20 csf analysis is important in diagnosis and typically features a moderate to marked neutrophilic pleocytosis (a left shift may be present) and markedly elevated protein concentration. chronically, pleocytosis may change to a more mononuclear or mixed population (fig. 14.12) and may become mild or even fall into reference intervals. 72 a study of 20 affected dogs showed neutrophilic pleocytosis in 12 of 20 cases and mononuclear pleocytosis in 8 of 20 cases. 72 concurrent elevations of serum and csf iga titers (elevated igg and igm fractions may be present), serum concentration of c-reactive protein (crp), or serum α 2 -macroglobulin is diagnostically supportive but not specific. 20, 50 increases in iga have been linked to a t-helper 2 (th2)-dominated immune response driven by elevated interleukin-4 (il-4) and decreased il-2 and interferon-gamma (ifn-γ). 73 serum amyloid a (saa), serum agp, and serum haptoglobin may also be elevated. 74 another study of 36 dogs with srma reported statistically significant elevations of csf and serum crp, but not serum α 2 -macroglobulin, in dogs with srma compared with other neurological diseases. 75 in a study of 20 dogs, serum crp was positively correlated with csf tncc. 72 additionally, serum haptoglobin and serum and csf iga remained increased throughout successful treatment, indicating that these parameters are more useful for diagnosis than for monitoring therapy. 
72 serum and csf concentrations of crp and saa have been documented to fall significantly during treatment, and repeat measurement of serum crp or saa may be used to guide therapy and predict relapse, which is less invasive and more sensitive than repeat csf sampling. 72, 74, 75 rare cases have been documented in cats with marked mononuclear or mixed pleocytosis and mild to moderate protein concentration elevations. 4 intervertebral disk herniation. csf from patients with ivdh may be extremely variable; data indicate that csf findings correlate with location of sampling, disk herniation location, chronicity of the lesion, and severity of spinal cord injury. bearing this in mind, it is no surprise that some reports in the literature state that neutrophilic, lymphocytic, mixed, and mononuclear pleocytoses are most common in dogs with ivdh. 12, 30, 76 a study of 423 cases of ivdh showed 51% with pleocytosis, of which 31% were neutrophilic, 41% were lymphocytic, 20% were mixed, and 7.4% were mononuclear. 76 of all cases, 71% had elevated protein concentrations. 76 interestingly, a larger number of cases of lymphocytic pleocytosis were observed in the samples analyzed more than 7 days after onset of clinical signs. 76 the magnitude of pleocytosis, in general, was also shown to decrease with increasing time between clinical onset and sampling, and this observation has been corroborated by other studies. 12, 76 prior treatment with corticosteroids was observed to reduce the number of observed lymphocytes in csf. 76 the authors also found a higher incidence of pleocytosis in thoracolumbar disease (61%) compared with cervical disease (23%), but this may have been caused by exclusive sampling of lumbar csf closer to the lesion. 76 ivdh is rare in cats and has been reported to feature mild mixed cell pleocytosis and elevated protein concentration. 4 patients typically present with nonpainful, progressive, asymmetrical neurological signs. 
as only histopathology is confirmatory, it is a multimodal diagnosis of exclusion. a study of 32 dogs with presumptive fce, based on history, clinical signs, imaging, and outcome, showed 53% with normal csf, 25% with acd, and 19% with mild to moderate pleocytosis (7-84 cells/μl; median 12/μl). 77 pleocytoses were neutrophilic or mixed. 77 one study of 36 confirmed cases in dogs showed that 64% had normal csf and the remainder displayed mild changes. 78 another study looking at five dogs suggested that pleocytosis may be marked, up to 529 cells/μl. 3 fce is much less common in cats. in general, the disease process and clinical signs are similar to those in dogs, with the exception that the disease presents in cats in middle or older age, usually with cervical spinal cord signs. a case series of five cats showed csf ranging from normal to marked neutrophilic pleocytosis with moderately elevated protein concentration and variable correlation to clinical outcome. 79 the case with the most severe csf changes had extensive myelomalacia at necropsy. 79 it was suggested in this study that csf is more likely to be abnormal if collected closer to the lesion and that mri is helpful for localization and in supporting the diagnosis. 77, 79 thiamine deficiency in cats. thiamine deficiency is a rare nutritional disorder of patients fed noncommercial, misformulated commercial, or irradiated diets. two case reports showed increased percentage of neutrophils or mild neutrophilic pleocytosis, presumptively from cerebrocortical necrosis. 4 diagnosis is based on history, response to treatment, mri features compatible with the disease (cortical and brainstem hyperintensities), or histopathology. 
80 spaniels with chiari-like malformation showed that 40% of dogs with concurrent syringomyelia and cisternal csf sampling had mild (up to 15 cells/μl) pleocytoses and increased percentages of neutrophils compared with the subpopulation without syringomyelia, but it was not specifically documented whether pleocytoses were, in fact, neutrophilic or mixed with an increased percentage of neutrophils. 42 a positive correlation was also seen to exist between tncc and syrinx size. 42 neoplasia. it is important to perform csf in neurology patients with suspected neoplasia, as definitive diagnosis may be achieved if neoplastic cells are directly observed via cytology. inflammatory pleocytoses or elevated protein concentrations are common in patients with cancer, tend to be mild to moderate in magnitude, and may represent paraneoplastic inflammation, compromise of the bbb, lesional necrosis, or all of these. 28 normal csf is also a common finding in cases of neoplasia. moreover, in the absence of overtly neoplastic cells, no defined patterns connect specific tumors with specific types of inflammatory pleocytoses. neutrophilic pleocytosis of unspecified magnitude was found in the csf of 2 of 11 cats with spinal lymphoma and in 3 of 7 cats with nonlymphoma spinal neoplasia (astrocytoma or osteosarcoma). 81 additionally, the remaining four cats with nonlymphoma spinal tumors (meningioma, peripheral nerve sheath tumor, plasma cell tumor) had either normal csf or acd of unspecified magnitude. 81 metastatic tumors to the cns should also be considered in a patient with neurological signs. eosinophilic meningoencephalitis of dogs. eme is an idiopathic diagnosis of exclusion that is typically steroid responsive and is postulated to be triggered by an underlying hypersensitivity, allergy, or self-limiting infection. the disease may be overrepresented in rottweilers and golden retrievers. 
82 a study of 23 dogs with eosinophilic pleocytosis (defined by >20% eosinophils) showed 16 cases of idiopathic eme, 4 cases of infectious disease (c. neoformans, n. caninum, baylisascaris procyonis), and 3 cases of ivdh. 83 the magnitude of pleocytosis or the percentage of eosinophils could not be used to distinguish infectious versus eme cases, although ivdh cases tended to have milder pleocytoses (<84 cells/μl). 83 in about half the eme cases, mri showed abnormal findings. 83 peripheral eosinophilia may or may not be present. highly suggestive of protozoal (toxoplasmosis, neosporosis), fungal (cryptococcosis), parasitic (including cuterebra spp., dirofilariasis), and algal (protothecosis) infections and also rarely in cases of canine distemper and rabies viruses. 10, 84 eosinophils have also been found in cases of granulomatous meningoencephalomyelitis (gme). 85 eosinophilic pleocytosis has been documented in bacterial encephalitis as well. 21 gondii infection tend to have neurological or neuromuscular signs. case reports are sporadic; documentation of mild acd (58 mg/dl) and also a report of mild lymphocytic or eosinophilic pleocytosis (35 cells/μl) with an elevated protein concentration of 77 mg/dl exist in the literature. 86 it is important to rule out other potential causes of the neurological signs because immunocompetent dogs tend to clear subclinical infections, and therefore paired serum igm-igg titers or sequential serum igg titers are preferable to a single serum igg titer. to the author's knowledge, no data on the life span of canine igg antibodies exist. reports in the literature are conflicting with regard to the cross-reactivity of t. gondii antibodies to other agents, such as n. caninum. 20,86 pcr testing for toxoplasma in serum, tissue, or csf is diagnostic. 86, 87 rabies. pleocytoses may be lymphocytic and of varying severity. ancillary antemortem diagnostics include viral pcr on saliva or csf and the saliva antigen latex agglutination test. 
in a study of 15 dogs under quarantine for suspected natural infection (subsequently confirmed positive), 13 of 15 were saliva-pcr positive, and 4 of 15 (27%) were csf-pcr positive. 88 all animals with positive results on csf were also positive on saliva, and interestingly 100% correlation was seen between positive csf-pcr and the dull clinical presentation (all aggressive clinical presentations were csf-pcr negative). 88 negative testing should never exclude diagnosis because viral load is highest within salivary glands and brain parenchyma. 88 frequently made by exclusion when coupled with appropriate clinical signs. csf and mri findings are variable and may be normal in the acute stage of disease before inflammation has peaked. 89 in a study of 32 dogs with noninflammatory distemper, half (15 of 32) had normal csf. 85 a study of eight dogs with natural infection (confirmed by cns tissue-pcr and histopathology) showed lymphocytic pleocytosis in all samples and normal protein concentrations. 90 another case (confirmed by tissue-pcr and csf-pcr) in a 7-month-old dog displayed marked (554 cells/μl) lymphocytic pleocytosis and a normal protein concentration. 89 because this is a demyelinating disease, myelin-like material, which is amorphous, granular, pink, foamy, and stains positively with luxol fast blue, may be present. 40 pcr testing of csf, serum, urine, epithelial or tonsillar tissue is available, and immunohistochemistry on biopsy specimens of nasal mucosa, haired skin, or footpad is 88% to 96% sensitive for detection of viral antigen. 91 is present, it is likely to be lymphocytic. pleocytoses are typically mild to moderate, but severe lymphocytic pleocytoses have been reported. 89 main differential diagnoses include other viral diseases, gme, or chronic bacterial infection. extraneural signs related to the gastrointestinal or the respiratory system, if present, may be helpful in distinguishing this disease from gme. 
89 in a study comparing four dogs with chronic cdv, six dogs with acute cdv, and controls, dogs with chronic cdv had markedly elevated csf igg concentration. 92 the igg region was polyclonal, including a population of neutralizing antibodies for cdv. 92 fungus acquired through inhalation, and most cases in the united states are observed in the southwestern region of the country. signs tend to involve respiratory or skeletal systems, and cns involvement is rare. one dog had a mild to moderate lymphocytic pleocytosis. 57 complement fixation (detecting igg) or tube precipitation (detecting igm), or agar-gel immunodiffusion serological testing is recommended for confirmation. necrotizing meningoencephalitis. meningoencephalitis has been subcategorized as necrotizing meningoencephalitis (nme) and necrotizing leukoencephalitis (nle) on the basis of histopathological appearance. both nle and nme are believed to have an immune-mediated basis, and recent data support that in pugs with nme, canine leukocyte antigen gene aberrations exist. 93 meningoencephalitis is rapidly progressive and affects a variety of generally young to middle-aged toy-breed dogs, including the pug, shih tzu, papillon, maltese, chihuahua, yorkshire terrier, french bulldog, pekingese, west highland white terrier, boston terrier, japanese spitz, and miniature pinscher breeds. 20, 94 a study of csf from 14 pugs with nme showed 12 of 14 with pleocytoses of varying severity (mean 120 cells/μl). 95 of these dogs, 66% had a lymphocytic pleocytosis, 17% had a mononuclear pleocytosis, and 17% had a mixed cell pleocytosis (fig. 14. 13) 95 ; 11 of 14 dogs had elevated protein concentrations (mean 88.4 mg/dl). 95 another study of three dogs showed one with acd and two with moderate to marked (40-220 cells/μl) neutrophilic to lymphocytic pleocytosis. 94 mri findings may help support a diagnosis, but only histopathological examination of lesions provides definitive proof. other. 
four cats with ischemic encephalopathy had mild (<10 cells/μl) lymphocytic pleocytosis. 28 another study of feline ischemic encephalopathy reported one cat with normal csf and another with mononuclear to mixed pleocytosis (26 cells/μl). 96 a report of two cats with cerebrovascular disease (infarction or stroke) showed one with mononuclear pleocytosis and the other with acd. 97 cerebrovascular disease was correlated in several other cases (without csf data) to hepatic lipidosis or fip. 97 other. a case report of a pug with a mild mononuclear pleocytosis (8 cells/μl), mild elevation in protein concentration (89 mg/dl), evidence of hemorrhage, and direct visualization of angiostrongylus vasorum helminth larvae is found in the literature. 102 eosinophilia was not observed within the csf or peripheral blood. 102 the organism is endemic in europe and canada among foxes and canids, mainly causing respiratory signs or coagulopathy; neurological signs are typically caused by hemorrhage. 102 two dogs with paraparesis and pyogranulomatous lumbar masses (one intradural, one extradural) had lumbar csf with mild mixed cell pleocytosis (lymphocytes and nondegenerate neutrophils) or lymphocytic pleocytosis. 103 these patients were serologically pcr positive for bartonella vinsonii subsp. berkhoffi and presented with a nodular dermatosis. 103 a dog with neurological signs and hepatozoon canis infection showed marked lymphocytic pleocytosis (243 cells/μl) with mildly elevated protein concentration (37 mg/dl). 104 organisms were not visualized in csf but were found on cytology of peripheral blood, lymph node, bone marrow, and bony lesions. serology and positive pcr from a bone marrow sample were diagnostic. 104 intrathecal contrast administration. contrast media or pharmacological agents, such as epidural anesthetics, may introduce preanalytical error into csf samples, artificially raising tncc and protein concentrations. 
105 in a study of 17 healthy dogs given either iopamidol or metrizamide for emg, same-day csf sampling showed that 8 of 17 developed a mild to moderate mononuclear to mixed mononuclear or neutrophilic pleocytosis (6 of 8 were from iopamidol). 106 in the same study, 3 of 17 (all metrizamide) developed mild protein elevation, but mean protein concentration for both groups stayed within the reference interval. 106 in dogs given metrizamide, 7 of 8 had an increased pandy score after emg, which was considered a false-positive result because of the contrast agent. 106 these data, plus histopathology from the same population, showed that the contrast agents caused low-grade leptomeningeal inflammation with no statistical difference between the two agents studied. 106 another similar study over 30 days showed that post-emg csf changes reversed after approximately 5 days. 107 granulomatous meningoencephalitis. gme is a progressive immune-mediated disease that is overrepresented in females, toy-breed dogs, and terriers. 20 it is a diagnosis of exclusion and has clinical presentations and mri findings that may be similar to various infectious and neoplastic diseases. csf may be unaffected or may display a mononuclear to mixed pleocytosis and protein concentration elevations, both of varying severity (figs. 14.14 and 14.15). in a study of 188 csf samples from dogs with inflammatory neurological diseases, marked pleocytosis (>1000 cells/μl) was found in cases of srma, bacterial encephalitis, or gme. 85 pleocytosis may also be lymphocytic or neutrophilic. 10 csf protein electrophoresis may be helpful, as several cases have been shown with increased β-globulin and gamma-globulin fractions. 108 most of the diseases described previously in this chapter may manifest as mixed cell pleocytoses, depending on the time interval between disease onset and csf sampling, disease severity, and previous treatment administered. 
a mixed cell pleocytosis would be expected to occur during transition between different phases of the inflammatory response, where certain cells may predominate at specific times after injury. is typically rare and may involve chorioretinitis or focal cerebral granuloma in the cat. 109 a study of two dogs with systemic blastomycosis and neurologic signs showed mild mixed cell pleocytosis (8 cells/μl, mononuclear predominant; and 15 cells/μl, lymphocytic predominant). 110 using csf cytology or culture to diagnose the organism may be unrewarding. agar-gel immunodiffusion serologic testing has high sensitivity and specificity for canine antibodies and is recommended if appropriate clinical signs (respiratory signs or lymphadenopathy) are present. 57 agar-gel immunodiffusion testing is less sensitive (25%-33%) in the cat, as indicated by a limited number of reports. 57 urine antigen enzyme immunoassay (eia) has good sensitivity for dogs and has been used successfully on at least one cat. 109 eia may also be performed on csf. cytology of nasal, pulmonary, or dermal lesions is more likely to yield direct visualization of organisms. lymphoma. lymphocytic pleocytosis of inflammatory origin may be difficult to distinguish from lymphoma exfoliating into the csf (fig. 14.16 ). the size of lymphocytes and morphological atypia may be helpful, although these may be challenging to differentiate from artifactual morphological changes secondary to cytospin preparation. cats with neoplasia may have lymphocytic pleocytoses (suggestive of lymphoma), mild to moderate mononuclear to mixed cell pleocytoses (suggestive of nonlymphoma tumors), or normal csf. one study examined six cases of feline cns or multifocal lymphoma, which displayed pleocytoses of variable magnitude, absent to mildly elevated protein concentrations, and neoplastic cells visualized in 5 of 6 of the csf samples. 
4 in this study, eight cats with cns signs that were ultimately diagnosed with nonlymphoma tumors (e.g., meningioma, carcinoma, nerve sheath tumor) had mild csf protein elevations and either normal tncc (1 of 8) or mild to moderate mononuclear or mixed cell pleocytosis (7 of 8). 4 another study of 11 cats with spinal lymphoma showed neoplastic cells visualized in one case and hemodilution, acd, or neutrophilic pleocytosis in the remainder of cases. 81 a case report of feline multiple myeloma involving lumbar vertebrae and associated soft tissues exhibited cisternal csf with an elevated protein concentration of 290 mg/dl and mild pleocytosis (8 cells/μl) consisting of a majority of neoplastic plasma cells. 111 diagnosis was further confirmed by abnormal urine protein electrophoresis and bone marrow aspiration. 111 histiocytic malignancies. malignant histiocytosis or histiocytic sarcoma tumor cells in canine csf have been documented in two recent case reports; csf cytology displayed marked mononuclear pleocytoses (>500 cells/μl) and mild to moderately elevated protein concentrations (<135 mg/dl). 112, 113 tumor cells phenotypically resembled macrophages, displayed multiple criteria of malignancy, and reacted positively to cd1c on immunocytochemistry, compatible with interstitial dendritic cell origin. 112, 113 necropsy was confirmatory and found no evidence of neoplasia outside of the cns. 112, 113 a case report of a gliomatosis cerebri (gc) neoplasm in a middle-aged poodle showed csf with a mild lymphocytic pleocytosis (20 cells/μl) and protein concentration elevation. 114 on histopathology, lymphocyte-like perivascular cuffing and meningitis were noted. other case studies of canine gc have reported normal csf or mild acd. 
115 in a study of 56 dogs with intracranial meningioma, in which csf analysis was performed, 29% had normal csf, 45% had acd, and 27% had pleocytosis (2 of 3 of these neutrophilic pleocytosis; 1 of 3 unspecified), with the overall incidence of neutrophilic pleocytosis at 18%. 116 in this study, a positive correlation existed between elevated tncc and anatomical localization of the lesion to the caudal (versus middle or rostral) portion of the cranial fossa, and no association between pleocytosis and necrosis within the lesion was found. 116 these findings contradict prior reports of a high percentage of abnormal csf findings in meningioma, and the authors reported that concurrent glucocorticoid therapy in some of the patients may have negatively biased the data. 11, 116 a study of 26 dogs with spinal meningioma showed no cases with exfoliating tumor cells, 62% with mild pleocytosis up to 47 cells/μl (mean 11 cells/μl), and normal or variably elevated protein concentrations up to 836 mg/dl (mean 212 mg/dl). 117 both cisternal and lumbar csf samples were evaluated in this study and not found to be significantly different. 117 interestingly, tumors of the lumbar region displayed higher mean tncc and protein concentrations compared with tumors of the cervical area (24 versus 4 cells/μl and 158 versus 98 mg/dl, respectively), which the authors postulated may be reflective of a higher number of lumbar csf samples with proximity to the lesion. 117 other neoplasms. a case report of canine csf with 240 cells/μl was characterized by atypical neoplastic round cells that were confirmed on immunocytochemistry and immunohistochemistry to be from a metastatic mammary carcinoma. 118 inflammatory cells were of low numbers and were of a mixed population. 118 a study of csf from 25 dogs with choroid plexus tumors showed direct observation of tumor cells in 47% of the cases of carcinoma. 
119 mild to moderate mixed-cell pleocytosis was present in all cases of papilloma and in half of the carcinomas; when pleocytosis was present, no difference in magnitude existed between benign and malignant tumors. 119 all cases had elevated protein concentrations, with median concentration for carcinoma being significantly higher (108 mg/dl) than median concentration for papilloma (34 mg/dl). 119 a cutoff protein concentration of 80 mg/dl yielded a sensitivity of 67% and a specificity of 100% for detection of choroid plexus carcinomas. 119 another case report of canine choroid plexus carcinoma had a mononuclear pleocytosis of 165 cells/μl, mildly elevated protein concentration of 30 mg/dl, and numerous tumor cells visualized. 120 a rise in the availability of stereotactic brain biopsy has facilitated increased cytological assessments of cns lesions. this technique offers several advantages, although significant equipment investment and time to perfect techniques are required. stereotactic biopsy often offers application accuracy for targeting lesions that approximate 3 mm or less in all directions. in one study, diagnostic accuracy of stereotactic biopsy specimens submitted for histopathology (i.e., agreement with specimens obtained via open approaches) exceeded 90%. 121 in experienced hands, stereotactic biopsy is believed to be a relatively low-morbidity procedure. cytological interpretation of brain biopsy specimens acquired via stereotaxy or open approaches may be challenging and does require a tumor that exfoliates well, a surgeon willing to provide multiple samples, and a cytologist with expertise in this area. 122 a study of 42 canine and feline cases of biopsy- or necropsy-confirmed cns lesions showed squash-prep smear cytology to have 76% sensitivity in accurately determining diagnosis, with an additional 14% of cases having partial correlation between cytology and histopathology.
for the remaining 10% of cases, cytological interpretation did not correlate with final diagnosis. 123 cytological interpretation of cns lesions may be very difficult, and biopsy with histopathological examination is recommended to confirm all diagnoses. it is important for cytological samples to be prepared in the same manner each time to avoid introducing additional cytological variations that the pathologist has to read through. some authors recommend wet-fixation of tissues followed by staining with hematoxylin and eosin (h&e). 122 at the authors' institution, cns cytological samples are air-dried and stained with diff-quik or a modified wright stain. the reader is referred elsewhere for a complete discussion of normal cns cytology. 124 clinical imaging findings and signalment should be considered carefully and may help the pathologist to formulate a list of potential differential diagnoses. it must be kept in mind that primary tumors may metastasize to the cns, and these should be included in the differential diagnoses, where appropriate. meningiomas are composed of neoplastic cells arising from the meningothelial cells of the leptomeninges of the cns. 125 these tumors are the most common primary cns tumors of dogs and cats. 126 histologically, these neoplasms are classified into at least nine subtypes based on appearance, and some tumors may be characterized by more than one pattern (fig. 14.17). 125 cytologically, smears are often characterized by spindle-shaped cells draped around vessels and arranged in large whorling structures (fig. 14.18). some cells may contain nuclei that display intranuclear cytoplasmic pseudoinclusions, but this is not a feature reliably seen on a majority of tumors (fig. 14.19). 127 as a whole, this group represents the second most common primary cns neoplasm seen in dogs and cats. 126 glial tumors are more common than meningiomas in brachycephalic breeds. 126 glial tumors arise from the supporting cells of the cns.
astrocytomas are found most frequently in the cerebral hemispheres, although they have been reported to occur in various locations throughout the cns. 125 astrocytomas arise from transformed astrocytes and are characterized cytologically by high cellularity, a high degree of nuclear pleomorphism, and fibrillar cytoplasmic processes. 125 tumor cells will stain positively for glial fibrillary acidic protein (gfap). 125 oligodendrogliomas are derived from transformed oligodendrocytes and are found within the gray or white matter of the cns, with the highest incidence in the cerebral hemispheres. 125 cytological preparations are characterized by large numbers of blood vessels surrounded by neoplastic cells (fig. 14.20). 125 neoplastic oligodendrocytes have small amounts of eosinophilic cytoplasm surrounding uniformly round nuclei. 125 ependymomas are derived from the ependymal lining cells found on the surface of the ventricular system of the brain and central canal of the spinal cord. 125 these tumors are rare and are found most often in the lateral ventricles. 125 cytologically, smears are characterized by neoplastic cells palisading around branching vascular structures. 125 cells are cuboidal to columnar in shape with high nuclear-to-cytoplasmic (n:c) ratios and eccentrically placed nuclei. 125 choroid plexus tumors arise from the modified ependymal lining cells that contribute to the production of csf. they are more common in dogs than in cats. 125 papillomas and carcinomas have a very similar cytological appearance and may only be reliably differentiated on the basis of histopathological examination. 123 cytological preparations contain polygonal cells arranged in rafts, columns, or papillary projections around capillary structures (fig. 14.21). 125 medulloblastoma arises within the cerebellum and is a type of primitive neuroectodermal tumor derived from a germinal neuroepithelial cell.
125 cytologically, preparations are highly cellular, composed of individual round cells that are large in size and have moderate to high n:c ratios. the appearance of these cells is reminiscent of large lymphocytes or histiocytes (fig. 14.22 ). nephroblastoma is a unique tumor arising in the spinal cord of young dogs (under age 4 years), usually between the t10 and l2 spinal cord segments. 125 the cytological appearance of this tumor has been described in a recent report and is characterized by three populations of cells: (1) high n:c ratio blastemal cells, veterinary neuroanatomy and clinical neurology the function, composition and analysis of cerebrospinal fluid in companion animals: part i-function and composition cerebrospinal fluid analysis and magnetic resonance imaging in the diagnosis of neurologic disease in dogs: a retrospective study inflammatory cerebrospinal fluid analysis in cats: clinical diagnosis and outcome clinical and magnetic resonance imaging findings in 92 cats with clinical signs of spinal cord disease magnetic resonance imaging findings in 25 dogs with inflammatory cerebrospinal fluid the formation of cerebrospinal fluid: nearly a hundred years of interpretations and misinterpretations cerebrospinal fluid collection, examination, and interpretation in dogs and cats cerebellomedullary cerebrospinal fluid collection in the dog the function, composition and analysis of cerebrospinal fluid in companion animals: part ii-analysis cerebrospinal fluid analysis analysis of cerebrospinal fluid from the cerebellomedullary and lumbar cisterns of dogs with focal neurologic disease: 145 cases (1985-1987) comparison of total white blood cell count and total protein content of lumbar and cisternal cerebrospinal fluid of healthy dogs bone marrow contamination of canine cerebrospinal fluid iatrogenic brainstem injury during cerebellomedullary cistern puncture collecting, processing, and preparing cerebrospinal fluid in dogs and cats canine neurology: 
diagnosis and treatment. saunders effects of time, initial composition, and stabilizing agents on the results of canine cerebrospinal fluid analysis analysis of cerebrospinal fluid from dogs and cats after 24 and 48 hours of storage conventional and molecular diagnostic testing for the acute neurologic patient cerebrospinal fluid analysis in the dog: methodology and interpretation cerebrospinal fluid analysis evaluation of the advia 120 for analysis of canine cerebrospinal fluid automated flow cytometric cell count and differentiation of canine cerebrospinal fluid cells using the advia 2120 high resolution protein electrophoresis of 100 paired canine cerebrospinal fluid and serum bacterial meningoencephalomyelitis in dogs: a retrospective study of 23 cases (1990-1999) a case of canine streptococcal meningoencephalitis diagnosed using universal bacterial polymerase chain reaction assay clinical, cerebrospinal fluid, and histological data from thirty-four cats with primary noninflammatory disease of the central nervous system critical evaluation of creatine phosphokinase in cerebrospinal fluid of dogs with neurologic disease associations between cerebrospinal fluid biomarkers and long-term neurologic outcome in dogs with acute intervertebral disk herniation blood-brain-barrier disruption in chronic canine hypothyroidism measurement of myelin basic protein in the cerebrospinal fluid of dogs with degenerative myelopathy cerebrospinal fluid myelin basic protein as a prognostic biomarker in dogs with thoracolumber intervertebral disk herniation beta-2-microglobulin levels in the cerebrospinal fluid of normal dogs and dogs with neurological disease cerebrospinal fluid glutamine, tryptophan, and tryptophan metabolite concentrations in dogs with portosystemic shunts oxytocin content of the cerebrospinal fluid of dogs and its relationship to pain induced by spinal cord compression cerebrospinal fluid gammaaminobutyric acid and glutamate values in dogs with epilepsy 
cerebrospinal fluid analysis significance of surface epithelial cells in canine cerebrospinal fluid and relationship to central nervous system disease cerebrospinal fluid from a 6-yearold dog with severe neck pain prevalence and significance of extracellular myelin-like material in canine cerebrospinal fluid evaluation of cerebrospinal fluid in cavalier king charles spaniel dogs diagnosed with chiarilike malformation with or without concurrent syringomyelia effects of iatrogenic blood contamination on results of cerebrospinal fluid analysis in clinically normal dogs and dogs with neurologic disease reference intervals for feline cerebrospinal fluid: cell counts and cytologic features differences in total protein concentration, nucleated cell count, and red blood cell count among sequential samples of cerebrospinal fluid from horses effects of blood contamination on cerebrospinal fluid analysis cytologic interpretation of canine cerebrospinal fluid samples with low total nucleated cell concentration, with and without blood contamination clinical features and magnetic resonance imaging findings in 7 dogs with central nervous system aspergillosis clinical, cerebrospinal fluid, and histological data from twenty-seven cats with primary inflammatory disease of the central nervous system inflammatory diseases of the spine in small animals brain abscess and bacterial endocarditis in a kerry blue terrier with a history of immune-mediated thrombocytopenia central nervous system infection with staphylococcus intermedius secondary to retrobulbar abscessation in a dog cerebral ventriculitis associated with otogenic meningoencephalitis in a dog meningoencephalomyelitis caused by pasteurella multocida in a cat clinical features and epidemiology of cryptococcosis in cats and dogs in california: 93 cases (1988-2010) clinical signs, imaging features, neuropathology, and outcome in cats and dogs with central nervous system cryptococcosis from california fungal infections of the 
central nervous system in the dog and cat disseminated histoplasmosis in cats: 12 cases (1981-1986) treatment of thoracolumbar spinal cord compression associated with histoplasma capsulatum infection in a cat clinicopathologic and diagnostic imaging characteristics of systemic aspergillosis in 30 dogs ehrlichiosis in a dog with seizures and nonregenerative anemia diagnostic features of clinical neurologic feline infectious peritonitis use of anti-coronavirus antibody testing of cerebrospinal fluid for diagnosis of feline infectious peritonitis involving the central nervous system in cats use of albumin quotient and igg index to differentiate blood-vs brain-derived proteins in the cerebrospinal fluid of cats with feline infectious peritonitis the cat with neurological manifestations of systemic disease. key conditions impacting on the cns feline toxoplasmosis: interpretation of diagnostic test results spinal epidural empyema in seven dogs sarcocystis sp. encephalomyelitis in a cat multisystemic infection with an acanthamoeba sp. 
in a dog another case of canine amoebic meningoencephalitis-the challenges of reaching a rapid diagnosis spinal intramedullary aberrant spirocerca lupi migration in 3 dogs steroid responsive meningitisarteritis: a prospective study of potential disease markers, prednisolone treatment, and long-term outcome in 20 dogs pathogenetic factors for excessive iga production: th2-dominated immune response in canine steroidresponsive meningitis-arteritis the role of acute phase proteins in diagnosis and management of steroid-responsive meningitis arteritis in dogs concentrations of acutephase proteins in dogs with steroid responsive meningitis-arteritis lumbar cerebrospinal fluid in dogs with type i intervertebral disc herniation magnetic resonance imaging findings and clinical associations in 52 dogs with suspected ischemic myelopathy fibrocartilaginous embolism in dogs fibrocartilaginous embolic myelopathy in five cats reversible encephalopathy secondary to thiamine deficiency in 3 cats ingesting commercial diets tumors affecting the spinal cord of cats: 85 cases (1980-2005) clinical and clinicopathological features of non-suppurative meningoencephalitis in young greyhounds in ireland cerebrospinal fluid eosinophilia in dogs what is your diagnosis? cerebrospinal fluid from a dog. 
eosinophilic pleocytosis due to protothecosis diagnosis of inflammatory and infectious diseases of the central nervous system in dogs: a retrospective study emergency presentations of 4 dogs with suspected neurologic toxoplasmosis use of a multiplex polymerase chain reaction assay in the antemortem diagnosis of toxoplasmosis and neosporosis in the central nervous system of cats and dogs realtime pcr analysis of dog cerebrospinal fluid and saliva samples for ante-mortem diagnosis of rabies cerebrospinal fluid from a 7-month-old dog with seizure-like episodes clinicopathological findings in dogs with distemper encephalomyelitis presented without characteristic signs of the disease immunohistochemical detection of canine distemper virus in haired skin, nasal mucosa, and footpad epithelium: a method for antemortem diagnosis of infection production of immunoglobulin g and increased antiviral antibody in cerebrospinal fluid of dogs with delayed-onset canine distemper viral encephalitis necrotizing meningoencephalitis of pug dogs associates with dog leukocyte antigen class ii and resembles acute variant forms of multiple sclerosis necrotizing meningoencephalitis in five chihuahua dogs epidemiology of necrotizing meningoencephalitis in pug dogs cerebrospinal cuterebriasis in cats and its association with feline ischemic encephalopathy feline cerebrovascular disease: clinical and histopathologic findings in 16 cats clinical characterization of a familial degenerative myelopathy in pembroke welsh corgi dogs detection of neospora caninum tachyzoites in canine cerebrospinal fluid detection of neospora caninum tachyzoites in cerebrospinal fluid of a dog following prednisone and cyclosporine therapy necrotizing cerebellitis and cerebellar atrophy caused by neospora caninum infection: magnetic resonance imaging and clinicopathologic findings in seven dogs angiostrongylus vasorum causing meningitis and detection of parasite larvae in the cerebrospinal fluid of a pug dog 
bartonella-associated meningoradiculoneuritis and dermatitis or panniculitis in 3 dogs hepatozoonosis in a dog with skeletal involvement and meningoencephalomyelitis cerebrospinal fluid response following metrizamide myelography in normal dogs: effects of routine myelography and postmyelographic removal of contrast medium cerebrospinal fluid changes after iopamidol and metrizamide myelography in clinically normal dogs transient leakage across the blood-cerebrospinal fluid barrier after intrathecal metrizamide administration to dogs cerebrospinal fluid protein electrophoresis: a clinical evaluation of a previously reported diagnostic technique cerebral blastomyces dermatitidis infection in a cat clinical and magnetic resonance imaging features of central nervous system blastomycosis in 4 dogs multiple myeloma with central nervous system involvement in a cat antemortem diagnosis of localized central nervous system histiocytic sarcoma in 2 dogs cerebrospinal fluid from a 10-yearold dog with a single seizure episode oligodendroglial gliomatosis cerebri in a poodle gliomatosis cerebri in six dogs characteristics of cisternal cerebrospinal fluid associated with intracranial meningiomas in dogs: 56 cases (1985-2004) canine intraspinal meningiomas: imaging features, histopathologic classification, and long-term outcome in 34 dogs neoplastic pleocytosis in a dog with metastatic mammary carcinoma and meningeal carcinomatosis choroid plexus tumors in 56 dogs (1985-2007) choroid plexus carcinoma cells in the cerebrospinal fluid of a staffordshire bull terrier ct-guided brain biopsy using a modified pelorus mark iii stereotactic system: experience with 50 dogs primary canine and feline nervous system tumors: intraoperative diagnosis using the smear technique squash-prep cytology in the diagnosis of canine and feline nervous system lesions: a study of 42 cases canine and feline cytology-e-book: a color atlas and interpretation guide tumors in domestic animals what is your 
diagnosis? intracranial mass in a dog a true "triphasic" pattern: thoracolumbar spinal tumor in a young dog key: cord-017866-h5ttoo0z authors: bowman, grant r.; cowan, andrew t.; turkewitz, aaron p. title: biogenesis of dense-core secretory granules date: 2010-05-27 journal: trafficking inside cells doi: 10.1007/978-0-387-93877-6_10 sha: doc_id: 17866 cord_uid: h5ttoo0z dense core granules (dcgs) are vesicular organelles derived from outbound traffic through the eukaryotic secretory pathway. as dcgs are formed, the secretory pathway can also give rise to other types of vesicles, such as those bound for endosomes, lysosomes, and the cell surface. dcgs differ from these other vesicular carriers in both content and function, storing highly concentrated ‘cores’ of condensed cargo in vesicles that are stably maintained within the cell until a specific extracellular stimulus causes their fusion with the plasma membrane. these unique features are imparted by the activities of membrane and lumenal proteins that are specifically delivered to the vesicles during synthesis. this chapter will describe the dcg biogenesis pathway, beginning with the sorting of dcg proteins from proteins that are destined for other types of vesicle carriers. in the trans-golgi network (tgn), sorting occurs as dcg proteins aggregate, causing physical separation from non-dcg proteins. recent work addresses the nature of interactions that produce these aggregates, as well as potentially important interactions with membranes and membrane proteins. dcg proteins are released from the tgn in vesicles called immature secretory granules (isgs). the mechanism of isg formation is largely unclear but is not believed to rely on the assembly of vesicle coats like those observed in other secretory pathways. the required cytosolic factors are now beginning to be identified using in vitro systems with purified cellular components. 
isg transformation into a mature fusion-competent, stimulus-dependent dcg occurs as endoproteolytic processing of many dcg proteins causes continued condensation of the lumenal contents. at the same time, proteins that fail to be incorporated into the condensing core are removed by a coat-mediated budding mechanism, which also serves to remove excess membrane and membrane proteins from the maturing vesicle. this chapter will summarize the work leading to our current view of granule synthesis, and will discuss questions that need to be addressed in order to gain a more complete understanding of the pathway. in eukaryotes, newly-synthesized proteins destined for secretion are first transferred from the cytoplasm to the lumen of the endoplasmic reticulum, and then progress through the golgi apparatus to the trans-golgi network (tgn). at the tgn, the choice of secretory pathways broadens. one route, which appears to be present in all cells, is constitutive in the sense that secretion does not depend on extracellular signals. such secretion involves the budding of vesicles or tubular elements from the tgn and their subsequent transport to and fusion with the plasma membrane, and is essential for cell growth since, among other functions, it provides new material to expand the cell surface. 1 in addition to a constitutive route, many cells maintain a secretory mode that is adapted for the tight coupling of protein release to extracellular stimuli. for such regulated exocytosis, the vesicles that carry newly-synthesized protein from the tgn accumulate in the cytoplasm until specific extracellular events trigger their fusion with the plasma membrane, resulting in the release of vesicle contents. the vesicles involved are called dense-core granules (dcgs), the name reflecting the fact that the contents are so highly condensed that they form a large electron-dense plug in the vesicle lumen.
a large amount of protein, as well as other molecular cargo, is thus efficiently stored in vesicular reservoirs and later released on demand. this pathway therefore permits larger and more rapid secretory responses than can be generated via constitutive secretion. classical dcgs in endocrine, exocrine and neuroendocrine cells are responsible for storage of a wide array of signaling molecules (e.g., peptide hormones) and secreted enzymes, and related vesicles are found in metazoan cells of other lineages as well as in numerous unicellular organisms. the secreted proteins and macromolecules serve a vast range of functions, from tissue coordination in metazoans to cyst formation in protists. regulated secretion also depends upon mechanisms for controlling the timely release of dcg contents, and this is accomplished by regulating the fusion of the vesicle membrane with the plasma membrane. much of the progress in understanding the mechanisms that mediate this step has been preceded or aided by studies of synaptic vesicles (reviewed in ref 3), which undergo regulated fusion with the plasma membrane, but differ from dcgs in their biogenesis and acquisition of contents. comparable work in dcg secretion has shown that many of the molecular components involved in regulating exocytosis and achieving membrane fusion are shared by these two vesicle types. 2 in addition to proteins that appear to be specific for regulated fusion with the plasma membrane, the mechanisms include factors, such as snares and rab proteins, which are members of families of proteins that are of central importance to vesicular trafficking at multiple stages in the eukaryotic secretory pathway. thus, regulated exocytosis appears to be accomplished by the coupling of a regulatory mechanism to a universal core of membrane trafficking machinery. although many of the protein components have been identified, a more complete understanding of the process remains an important goal for ongoing research.
the mechanistic studies of regulated membrane fusion are too extensive to be included in this chapter, but have been covered in many reviews, 4-8 and above. figure 1. at least 3 pathways diverge at the tgn in neuronal, endocrine and exocrine cells. a) a subset of proteins are destined for dense core secretory granules for release via regulated exocytosis; these are found as aggregates in distended areas of the cisterna. b) proteins destined for constitutive secretion are transported via vesicles or tubules. c) proteins destined for lysosomes are concentrated, via the mannose-6-phosphate receptor, into clathrin coated pits and vesicles. darkly-shaded squares and circles represent proteins that tend to coaggregate under tgn conditions, and that are subsequently stored in dcgs. lightly-shaded forms represent proteins that primarily exit the tgn via other pathways. arrowheads represent proteins that are ligands for the mannose-6-phosphate receptor. crescents represent proteins that are found in a relatively even distribution throughout the lumen. the extent to which some constitutively secreted proteins may be concentrated within specific regions of individual cisternae is not incorporated into this model. this chapter will instead focus on a part of the pathway that precedes regulated exocytosis, namely the synthesis steps that lead to the formation of dcgs, beginning at the tgn. the tgn is a complex compartment that gives rise not only to the regulated and constitutively released classes of secretory vesicles but also to vesicles that carry hydrolytic enzymes to lysosomes. 9-11 therefore, we seek an understanding of the signals that guide outbound proteins in a 3-way tgn sorting problem (fig. 1). dcg protein sorting also continues in a post-tgn compartment, where additional factors come into play. this pathway has been the subject of numerous valuable reviews, 12-19 with a particularly thorough treatment by arvan and castle.
20 the anatomy of dcg formation a number of important insights into the pathway of dcg formation have come from electron microscopy, providing a context for molecular and genetic studies. first, the fact that dcgs appear dense implies the existence of mechanisms to drive a degree of macromolecular aggregation that is unusual within the secretory pathway. many lines of research have led to the conclusion that protein sorting and concentration are intimately linked in this pathway, both relating to the self-aggregating tendency of dcg proteins that will be discussed below. rambourg and colleagues have investigated the localization of protein aggregates, using serial thin sections to reconstruct the golgi apparatus during granule formation. -27 in cells producing mucous-containing dcgs, the cis and medial golgi appear as flat cisternae, and secretory proteins are evenly distributed in their lumina. in contrast, cisternae in trans regions are marked by multiple perforations and are dilated in regions that accumulate aggregates of secretory material. those dilations grow progressively larger in the more distal regions, while the nondilated portions take on a tubular appearance. at the trans-most cisterna, the dilated regions with their concentrated secretory cargo appear to exist as independent bodies, separate from a residual network of tubular membranes. several points were established or reinforced by these images. the first is that the visible concentration of dcg proteins begins within golgi cisternae. a second point is that the tgn, the vesicle donor, appears to be undergoing large-scale changes itself. the images also indicate that the vesicles do not bud conventionally in the manner that is well-established for coat (e.g., clathrin)-mediated steps, since no coats are seen. electron microscopy also suggested that the aggregates undergo progressive changes, and are therefore likely to be dynamic in nature.
in pancreatic cells that are synthesizing insulin-storage granules, the proteinaceous cores seen in golgi dilations appeared less dense than the cores of insulin granules in the cell cytoplasm. 28 since the latter are derived from the former, this implied that proteins reorganize during an organellar maturation process. an important conclusion is that dcg formation should be considered as a multi-step process that plays out in sequential compartments. an early phase occurs at the trans face of the golgi and results in the production of vesicles bearing concentrated secretory proteins. these are called immature secretory granules (isgs). 29 subsequently those vesicles are remodeled, as reflected morphologically by cargo condensation and biochemically by changes in protein composition, to become mature dcgs. 30, 31 for simplicity, we will refer to the first process as budding, and the second as maturation. central issues to be considered in this chapter are the mechanisms responsible for protein sorting during those successive steps. again for simplicity, we will largely confine our discussion to the dcgs found in neuronal and endocrine cells. many of the same mechanisms are likely to apply to other classes of dcgs, among which are those in hematopoietic cells; for example, see references 32 and 33. genetics frequently offers a natural complement to morphological studies for developing an overview of a pathway. unfortunately, a weakness in current approaches to analyzing dcg biogenesis is the absence of developed genetic models, although several systems show promise. no human diseases are known to stem from an inability to synthesize neuroendocrine dcgs. presumably, strong defects in dcg formation would result in embryonic lethality in a complex multicellular organism, since such a defect would preclude regulated secretion of many peptides involved in tissue coordination.
however, this has not prevented the generation of regulated exocytosis mutants in more simple systems, such as drosophila. in flies, the null phenotype of a gene called dcaps (calcium-activated protein for secretion) is embryonic lethal, but analysis of the larva has shown that the gene product is necessary for dcg exocytosis, 34 as predicted from earlier work in mammalian chromaffin cells. 35, 36 although mutations affecting earlier stages in the pathway (i.e., dcg synthesis) have not been characterized in this organism, the characterization of caps mutants in this system provides hope that the earlier steps will be accessible by further mutational analysis. c. elegans offers another potentially useful system for the genetic analysis of dcg synthesis, and several mutations affecting regulated exocytosis have been identified in this organism (reviewed in ref. 37). currently, the only examples of dcg synthesis mutants are found in single-cell systems: the unicellular ciliates tetrahymena thermophila and paramecium tetraurelia, in which the mutations were chemically induced, and spontaneously-arising clones of the rat pheochromocytoma line pc12. 38-46 the viability of these mutants substantiates the idea that regulated exocytosis, unlike constitutive secretion, is not involved in basal cell growth. that is, dcgs are essential for organismal survival in metazoans, but not for individual cell viability. in the pc12 lines, some mutations appear to disrupt the transcription of numerous granule protein genes. 45, 46 in the ciliate mutants, which appear to be due to single recessive alleles, the cargo genes are still expressed though no granules are synthesized.
in one tetrahymena line, normal granule cargo appears to be shunted to the constitutive secretory pathway. 47 this phenotype indicates that dcg cargo proteins are not sufficient to direct granule formation, a result which was particularly interesting in the context of experiments in which mammalian dcg cargo proteins were expressed in tissue culture cells that do not normally make dcgs. 48-51 such cells make vesicles with dense cores, presumably because cargo proteins expressed in nonspecialized cells can induce the formation of their own carriers from the tgn. these results implied that the capacity to make dcgs was inherent in the basic organization of the golgi/tgn since it could also occur in such nonspecialized cells. since this capacity appears to have been lost in the tetrahymena mutant, the defect in that line may point to an aspect of golgi/tgn function that is critical for regulated but not constitutive secretion. the full relevance of the ciliate or pc12 cell mutants to dcg biosynthesis will only be known when the mutations themselves have been identified. such genetic approaches provide an unbiased method for the identification of novel genes, and may prove critical in broadening our understanding of the granule synthesis pathway. although many of the dcg cargo proteins themselves have been cloned and characterized, much less is known about the mechanisms that control protein sorting and condensation. genetic systems may help to identify the regulatory factors that are involved in these processes. protein sorting takes place in the tgn and during maturation. in each case, a single compartment gives rise to multiple pathways, and the challenge is in understanding how dcg proteins, both in the lumen and the membrane, are cosorted from a larger cohort that includes proteins destined for other pathways. the relevant contributions of tgn vs. isg sorting are likely to be cell-type specific and are generally difficult to quantify experimentally.
however, the mechanisms for controlling sorting at both stages may be fundamentally similar. in particular, the considerations that arise from protein aggregation are relevant for both compartments. a long-standing issue is whether the primary mode of dcg protein sorting is active or passive. the model of active sorting was initially inspired by the paradigm of sorting to lysosomes, in which sorting derives from recognition of a set of soluble lumenal proteins by a transmembrane receptor. extending this to dcg biogenesis, the model posited that a subset of proteins have positive sorting signals for inclusion in isgs.52,53 in this scheme, proteins in the tgn lumen that lack targeting signals are presumed to follow an alternative, default pathway of constitutive secretion. this model has been called "sorting for entry" (fig. la). an alternative model posits that newly synthesized proteins can be targeted to isgs by default, even in the absence of specific targeting signals, if the flux of bulk membrane traffic toward isgs is greater than that to constitutive or lysosomal carriers. this may indeed be the case for cells that are highly committed to regulated exocytosis.54,55 in this case, the major sorting events occur in the isg, which becomes a functional extension of the tgn. proteins that are retained as isgs undergo maturation end up as the contents of mature granules. nongranule proteins can be selectively withdrawn from isgs during this period, and this model is termed "sorting by retention" (fig. 2b). in evaluating either model, the sorting of dcg proteins cannot be considered in precisely the same terms that apply in other pathways, because the tendency of such proteins to self-aggregate facilitates a unique mode of targeting. among other things, it allows a large group of proteins to be sorted together in a single step.
one implication is that sorting receptors, if present, could presumably function at concentrations that are dramatically sub-stoichiometric to their dcg protein ligands. furthermore, such receptors would only have to recognize some subset of dcg proteins, since the remainder could be sorted indirectly via aggregation. in fact, no receptor has ever been unambiguously identified in this pathway. this does not by itself eliminate a "sorting for entry" model, because a second unusual feature of many dcg proteins is a tendency to bind to membranes. this has implications for sorting that will be discussed in a later section. many isolated dcg proteins will self-associate under in vitro conditions believed to approximate the tgn; namely, a slightly acidic ph and high calcium concentration relative to earlier compartments in the secretory pathway.56-58 this can serve as a mechanism for sorting because it is selective: proteins that are constitutively secreted tend to remain soluble under conditions that promote dcg protein aggregation. this first sorting step can therefore be imagined as the evolutionary version of ammonium sulfate precipitation, with the collective behavior based on the proteins' individual biophysical properties, for example their surface charge. while the ability of individual proteins to aggregate is variable,59 mixtures of proteins may show cooperativity in vitro, thereby increasing the efficiency of the step (fig. 3a).60 efficient protein aggregation might be expected to show concentration-dependence, and indeed isolated dcg proteins only self-associate above a threshold concentration.57 this in turn suggests that minor constituents of dcgs may depend for their efficient sorting on coassociation with more abundant species, whose concentrations must be sufficiently high to drive their independent self-aggregation.
the sorting efficiency of individual proteins can be experimentally measured as the fraction that is stored in dcgs as opposed to being mistargeted to the constitutive pathway. as expected from coassociation models, the sorting efficiency of a protein may vary widely between different cell lines. one would also predict that the sorting efficiency of a protein could be boosted by increasing the expression level of other proteins with which it coaggregates, particularly those which are most abundant. chief among the abundant metazoan dcg proteins are the chromogranin/secretogranins, a group of proteins with shared physical characteristics despite their very limited sequence similarity.61,62 indeed, the overexpression of chromogranin b (cgb) in the att-20 neuroendocrine cell line increased the sorting efficiency of a second dcg protein, pro-opiomelanocortin (pomc).63 nonetheless, it is inherently difficult to test the proposition that self- or coaggregation is a primary sorting determinant using conventional structure-function analysis, since aggregation is thought to be directed by gross biophysical properties of dcg proteins, and there are no clear "aggregation signals" at the amino acid sequence level. however, recent studies have shown that sorting efficiency can be increased by providing an artificial aggregation signal. heterologous expression of a 6his-tagged secretory protein enhanced the aggregation and dcg storage, in a calcium-dependent fashion, of cga.64,65 the authors speculate that the tag functions as an "aggregation chaperone" by providing a local site for the binding of divalent cations, thereby nucleating the aggregation process. curiously, the tagged protein itself was not stably incorporated into the aggregates, suggesting that dcg proteins in their aggregated form interact more strongly with other dcg proteins than with the his-tagged peptide. whether endogenous proteins have similar nucleation-promoting properties remains to be determined.
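the definition of sorting efficiency given above (the fraction of a protein stored in dcgs rather than mistargeted to the constitutive pathway) can be written out as a short calculation. the sketch below is illustrative only: the function name, the two measured pools, and the example numbers are assumptions made for the illustration, not values or methods taken from the cited studies.

```python
def sorting_efficiency(stored: float, constitutive: float) -> float:
    """Fraction of a protein recovered in the stored (DCG) pool.

    `stored` and `constitutive` are the measured amounts of the same protein
    in the regulated (granule) pool and the constitutively secreted pool,
    in any consistent unit. Both names are hypothetical.
    """
    if stored < 0 or constitutive < 0:
        raise ValueError("amounts must be non-negative")
    total = stored + constitutive
    if total == 0:
        raise ValueError("no protein detected in either pool")
    return stored / total


# Hypothetical measurements: the same cargo protein with and without
# overexpression of an abundant coaggregating partner.
baseline = sorting_efficiency(stored=40.0, constitutive=60.0)
boosted = sorting_efficiency(stored=70.0, constitutive=30.0)
print(f"baseline: {baseline:.2f}, partner overexpressed: {boosted:.2f}")
```

because the measure is a ratio of pools, it is comparable across cell lines only if both pools are quantified in the same way in each line.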
identifying the role of any single protein or protein domain in dcg sorting is complicated by the high degree of cooperativity that is hypothesized to exist within dcg protein aggregates. colomer et al took advantage of the observation that two exocrine dcg proteins, amylase and gp2, do not coaggregate with neuroendocrine dcg proteins in solution,66 to study the sorting of dcg proteins in the absence of coaggregation. when expressed in the neuroendocrine cells, the exocrine proteins were not stored in dcgs but instead secreted constitutively.67 in similar experiments, an endothelial dcg protein, von willebrand factor, was expressed in neuroendocrine att-20 cells.49 this resulted, however, not in the constitutive secretion of von willebrand factor but instead in the formation of two morphologically distinct classes of granules. one contained endogenous chromogranins, while the other contained von willebrand factor. a possibility is that two sets of proteins aggregate independently in the tgn, which could be determined by a number of factors. for example, the two sets could precipitate at relatively distinct ph and/or calcium concentrations and thus be spatially or temporally separated. specific aggregate formation can also arise from conventional protein-protein interactions. in pituitary and pancreatic islet cells, for example, efficient sorting of cga to dcgs depends on its association with secretogranin iii, and an essential targeting sequence in cga has been determined by gene truncation.68 cga sorting in pc12 cells also depends on a specific sequence in the protein, which overlaps with, but is not identical to, the region that is required in pituitary cells.69 this difference suggests that cga may be interacting with a different partner in pc12 cells, and indeed these cells do not express secretogranin iii. one possibility is that different surfaces of a cga domain can interact specifically with a range of partners, like a good host at a cocktail party.
in summary, the data indicate that the aggregation of a particular protein depends on a number of factors, including its attraction to other potential binding partners within the aggregate, and the physiologic qualities of the lumenal environment, such as ph and calcium concentration, which affect the strengths of those interactions. the expression of proteins that are differentially sensitive to lumenal conditions or that form exclusive sets of protein-protein interactions can potentially result in the formation of multiple distinct aggregates in the same tgn compartment, each comprised of different proteins. these mechanisms could underlie the natural ability of some cell types to produce more than one class of dcgs, as is observed in aplysia bag cell neurons, bovine pituitary cells, and some protozoa.70-72 though the model of sorting-by-aggregation is well established, the actual nature of the molecular interactions within such aggregates is difficult to define. the process of aggregation must be reversible so that the contents can be released into solution following exocytosis, and moreover, it must be dynamic enough to permit the reorganization of their constituents during maturation.73 the latter is particularly clear in pancreatic β-cells, in which the insulin-containing dcgs exhibit a crystalline ultrastructure, observed by electron microscopy, that is not found in isgs. in comparison to the production of insulin crystals, which involves the assembly of a single protein, the formation of dcg ultrastructure in protozoa may be significantly more complex. in these cells, the lumen of mature dcgs is filled by a crystalline core that consists of multiple varieties of proteins.74,75 indeed, the localization of different proteins within the cores of paramecium dcgs has revealed that the crystals contain at least two distinct layers, each with a different set of protein components.
76 images of isgs reveal that the components of the two layers are interspersed in this compartment, indicating that the layers are formed during a subsequent reorganization phase. thus, there is a significant amount of reorganization that must occur during crystal assembly. overall, the term "aggregation" may be misleading insofar as it suggests a phenomenon based on "stickiness", as for example for misfolded proteins in the endoplasmic reticulum.77 instead, the interactions that occur between individual proteins in an aggregate may be transient and weak, stimulating formation of aggregates in the tgn due to stabilizing effects provided by multivalent interactions while also allowing for reorganization of the proteins during crystallization, as in figure 3. some of the nonspecific, low-affinity interactions that occur in aggregates are likely to be mediated by the effects of calcium and ph in charge neutralization, leading to intermolecular interactions of acidic proteins by coordinate association with calcium ions (fig. 3a). it is noteworthy that the chromogranins/secretogranins contain a preponderance of acidic amino acids, which endow these proteins with the capacity to bind large numbers of calcium ions with low affinity.61 acidic calcium-binding proteins also form the core of some protist dcgs, though they show little overall sequence homology with mammalian proteins.78,79 an attractive explanation for the similarities is that they reflect a common aggregation-based dcg synthesis mechanism between protozoa and multicellular organisms, and that the amino acid sequences have evolved under similar constraints. following dcg synthesis, the regulated secretion of dcg cargo proteins is dependent on mechanisms that bring the vesicles to the cell surface and control their fusion with the plasma membrane.
these steps depend on the activity of dcg membrane proteins, for example those that interact with cytoskeleton-based motors for intracellular transport80 and those that mediate regulated exocytosis.4 it follows that the aggregation of core proteins during dcg synthesis cannot by itself be sufficient to form functional dcgs, and that there must be specific, though not necessarily direct, interactions between the lumenal proteins and the membrane constituents in order to ensure efficient sorting of these proteins to the same vesicles. these interactions have been difficult to detect, although some possible examples are discussed in a later section. what is clear is that many lumenal proteins can themselves associate with membranes in unconventional ways. however, the nature and the functional significance of those associations are largely unsettled. five to ten percent of cgb adheres tightly, in a calcium- and ph-sensitive manner, to membranes.81 whether this fraction is in dynamic equilibrium with the remaining ~90% is not known, but there is no known chemical difference between the two cohorts. the membrane binding of cgb is associated with an n-terminal domain defined by a disulfide-anchored loop, which is sufficient to confer membrane association when linked to an otherwise soluble protein.82 importantly, the chimeric protein was sorted to dcgs in spite of the fact that it did not appear to aggregate, suggesting that the n-terminal domain constitutes an independent targeting signal. that same domain may promote homodimerization at neutral ph, implying that it may mediate different interactions in sequential secretory compartments.83 cgb, as discussed earlier, also shows a strong tendency to aggregate in a controlled fashion.
the coexistence in a single protein of domains that facilitate both protein-membrane binding and homo- or heterotypic protein-protein aggregation offers the potential to generate cooperative networks with physiologically useful properties (fig. 3b). first, the total concentration of dcg proteins needed to reach the aggregation threshold in the tgn may be reduced for any proteins that interact with the membrane, since the local concentration may be increased depending on local membrane geometry. secondly, the avidity of a cgb aggregate for the membrane will be greater than that of a monomer, since multiple n-terminal domains are available for independent membrane binding. validation of this came from an extension of the experiments with cgb chimeras outlined above. while a single n-terminal cgb domain was able to direct sorting to dcgs, efficient sorting only occurred when two such domains were present.82 this suggests that the membrane affinity of a single domain may be only marginally sufficient, but is more than adequate if two or more such domains are linked, as would be the case in a cgb aggregate. in a nonconventional sense, cgb could be considered as a dcg sorting receptor: a membrane-associated protein that is itself targeted to dcgs, and that can potentially cotransport any proteins with which it associates. a similar argument has been made for the enzyme carboxypeptidase e (cpe), which is targeted to dcgs by a c-terminal amphipathic alpha helical domain.84,85 in addition to acting as an enzyme to modify dcg cargo, cpe can also bind a subset of dcg proteins, for example the hormone precursor pro-opiomelanocortin (pomc).
86 the cpe recognition site involved is different from the enzymatic cleft,87 and binding may be important for efficient sorting of pomc, a conclusion based on experiments with cpe knockout mice and with cpe-deficient cell lines.86,88 cpe has been called a receptor for pomc and perhaps for other cargo proteins, though use of the term "receptor" has remained contentious since cpe can also aggregate with pomc, chromogranins, and other cargo proteins in a conventional ca2+- and ph-dependent fashion.89,90 membrane association of cgb and cpe may be a property that has arisen convergently in these proteins, albeit by different mechanisms, reflecting the importance of this activity in dcg cargo sorting. an n-terminal disulfide-bonded loop such as that found in cgb is found in several dcg proteins, including pomc and chromogranin a (cga), though the homology does not extend beyond the structural level,84 and evidence to date suggests that its role in sorting may be protein-specific. as in cgb, the n-terminal disulfide loop domain in pomc is both necessary and sufficient for sorting, but surprisingly, it appears to interact with the membrane indirectly, through interaction with cpe.84 the disulfide loop in chromogranin a was not necessary for the sorting of this protein in pc12 cells,69 and instead an interior domain is essential for sorting in these cells, via interaction with membrane-associated secretogranin iii.68,91 these studies find no evidence for a conserved dcg targeting signal, but they do indicate that specific protein-protein interactions can be important for efficient sorting of lumenal cargo. precisely how cgb, cpe, secretogranin iii, and other ostensibly soluble lumenal proteins associate with membranes is not resolved. there is some evidence that they associate preferentially with cholesterol-rich membranes, so-called lipid rafts. 91,
92 consistent with this, depletion of cholesterol from tissue culture cells decreased the sorting efficiency of both cpe and cgb, though it is difficult to distinguish direct from indirect effects in such experiments.19,93 in addition, because both constitutive and regulated secretion were inhibited by cholesterol withdrawal, the results do not demonstrate a specific role for cholesterol in dcg formation. the experimental limitations notwithstanding, these data suggest that the association of cpe and cgb with specific membrane sub-domains could be an important aspect of sorting. if lipid rafts are indeed involved in this pathway, it could add another level of complexity to the cooperative mechanisms that may pertain (fig. 4). interestingly, cgb is also differentially sorted between the apical and basolateral pathways in polarized epithelial cells, which do not make dcgs, and this also requires signals within the n-terminal domain.94 this may suggest a similarity in sorting mechanisms used in epithelial and regulated secretory cells.

figure 4. selective association of dcg proteins with lipid rafts in the tgn. implications of such association include the following possibilities: 1. independent association of proteins with a single raft would promote protein-protein aggregation. 2. protein aggregates could stabilize rafts with which they associate. large aggregates could lead to formation of extensive rafts. in principle, this process could be sufficient to generate dcgs with a highly biased lipid composition, which is indeed observed.19 the thickened, patterned regions of the cisternal membrane represent putative lipid subdomains.

a final complication in dissecting dcg sorting signals is that the requirement for the disulfide loop in cgb depends on cell type. disulfide bond reduction led to the constitutive secretion of newly synthesized cgb in pc12 cells.
95 as expected, this treatment did not affect the sorting of secretogranin ii, a protein that undergoes aggregation but does not contain cysteine residues. however, in gh4c1 cells, the same treatment did not perturb the sorting of cgb.96 similarly, cga sorting also appears to exhibit cell-type specificity: a c-terminal truncation was correctly sorted in pc12 but not in gh4c1 cells, and an n-terminal region, which does not contain a disulfide loop, was important for sorting in pc12 cells.69 thus, the sorting requirements for cga, cgb, and granule proteins more generally, may depend on the cell type, specifically because the efficiency of any protein's sorting will depend on the available interacting partners. in some cases, a protein's interacting partner could be a membrane raft, whereas in other cases, the same protein may be delivered to dcgs by virtue of its ability to aggregate with other lumenal cargo proteins. our current understanding of signals involved in dcg membrane protein targeting is relatively primitive. in principle, membrane proteins could be targeted by signals in their lumenal, transmembrane and/or cytoplasmic domains; however, the characterization of such signals has not been straightforward. a significant obstacle has been the fact that relatively few membrane proteins have been identified that are exclusively localized to granules. 20 phogrin (phosphatase homolog in granules of insulinoma) localizes to dcgs in a range of neuronal and endocrine tissues.98 it is a transmembrane protein with an n-terminal lumenal domain and c-terminal cytoplasmic domain, and is synthesized with a large n-terminal proregion that is later cleaved within isgs. either the pro-domain or the lumenal domain of the processed protein can be independently stored in dcgs, indicating that each contains signals sufficient for targeting.
99 one possibility is that these, and by implication the full-length phogrin as well, can be sorted by associating with the condensing core of granule cargo in the tgn. this may also be true for two dcg membrane proteins of the anterior pituitary and adrenal medulla, peptidylglycine α-amidating monooxygenase (pam) and dopamine β-hydroxylase.60 in these cases, there is physiological evidence that the lumenal domains can sort independently of the transmembrane or cytosolic domains, since both the soluble forms and the transmembrane forms occur naturally in dcgs.100 nonetheless, efficient storage of the transmembrane form of pam also requires signals within the cytoplasmic tail.101 the idea that sorting of transmembrane proteins in dcgs involves cytosolic signals is also supported by analyses of vamp2, a widely distributed dcg v-snare,4 and p-selectin, a protein of platelets and endothelial cells.102 the sorting of vamp2 to insulin-containing dcgs is impaired by a point mutation in the cytosolic portion of the protein, and the incorrectly sorted mutant protein is unable to support regulated exocytosis in the absence of wildtype vamp2.103 analysis of p-selectin targeting is complicated by the fact that it can be found in more than one intracellular compartment, suggesting that it contains hierarchical targeting signals.104 in addition, the dcgs of platelets and endothelial cells share some properties with lysosomes, and mechanisms involved in their biogenesis may differ from those in neuronal and endocrine cells.102,105-107 nonetheless, p-selectin expressed heterologously in the neuroendocrine cell line att-20 was targeted to dcgs, and this depended on a tyrosine-containing motif in the cytoplasmic domain.108,109 the same motif is important in the endogenous endothelial cell context, indicating that the targeting mechanisms may be similar.
the tyrosine-based motif suggests that this protein can interact with a coat-associated adaptor, and indeed a functional role for ap-3 in the sorting of p-selectin to dcgs has been suggested,110 but no identified coats are involved in the formation of isgs in the tgn. one possibility is that conventional adaptor/coat-mediated sorting of p-selectin occurs at a step distinct from the known budding and maturation steps in dcg biogenesis; a second is that adaptors may have noncanonical roles unrelated to coat recruitment.111 the studies of p-selectin have revealed clear evidence that a cytosolic signal can be important for the sorting of transmembrane proteins to dcgs. further analysis of the targeting mechanism will likely be an important topic in future research, as it represents an activity that is topologically distinct from the relatively well characterized aggregation-based sorting events in the lumen. in an intriguing set of experiments, cutler and colleagues found evidence that the function of this cytosolic sorting determinant can be coupled to the expression of a lumenal dcg protein. when the lumenal dcg protein von willebrand factor (vwf) was coexpressed with p-selectin in neuroendocrine att-20 cells, the vwf was stored in vesicles that were distinct from dcgs containing endogenously expressed cgb,112,113 a finding that is consistent with previous results.114 the novel finding was that p-selectin was preferentially targeted to the vwf-containing vesicles, indicating that vwf and p-selectin, which are normally expressed in platelets and endothelial cells, could be cosorted in a cell type in which they are heterologously expressed (fig. 5). there was no indication, however, of a direct interaction between the two proteins, and the sorting of p-selectin in this context was instead dependent on the same tyrosine-containing cytoplasmic motif that had previously been shown to be necessary for targeting to dcgs.
the targeting of one class of membrane proteins, those linked via a gpi-anchor, cannot depend on cytosolic signals, since anchors of this type do not penetrate the cytoplasmic membrane leaflet.115 for gp-2, the major membrane protein of zymogen granules in pancreatic acinar cells, sorting may occur via a coaggregation mechanism. its lumenal domain has been found to associate with a lectin (zg16p), sulphated matrix proteoglycans, and syncollin, the last a lumenal protein that may itself interact with the membrane.116 these proteins have been postulated to form a membrane-associated matrix that could serve as a sorting intermediary between the membrane and the zymogen core contents.117 this is a variation on the model described for cgb and cpe as sorting receptors, and suggests by analogy that gp-2 or syncollin might serve as the membrane anchor for the zymogen core. however, dcg assembly is normal in the absence of either protein, indicating that neither is playing a unique role in that regard.118,119 in summary, the relatively limited evidence to date suggests that a mechanism similar to that involved in cargo protein condensation is involved in the sorting of some, but not all, membrane proteins with lumenal dcg contents. in principle, indirect interactions between membrane and core proteins may be equally important in the cosorting of membrane and lumenal cargo. if lumenal proteins like cpe preferentially insert into membrane sub-domains based on their lipid composition, then any membrane proteins that independently partition into the same sub-domains would be cosorted.
in support of this hypothesis, recent studies have suggested that prohormone convertases 1 and 2, which are responsible for proteolytic cleavage of lumenal proteins during granule maturation, are sorted to isgs by virtue of c-terminal membrane raft-associated tails, which are by themselves necessary and sufficient for targeting to dcgs.120 additionally, cytoplasmic signals on some transmembrane proteins appear to play important roles in sorting, but the mechanisms are unknown. the canonical mechanism of vesicle budding, as for example that involved in the emergence of lysosome-bound carriers from the tgn, involves transmembrane receptors, adaptors, and coat proteins. since there is no evidence that transmembrane proteins or coat proteins are relevant in isg budding, other mechanisms are likely to apply. there has been some progress in reconstituting this process using cell-free systems, though the field has generally suffered from a lack of in vivo models, for example a well developed genetic system with mutations that affect this step in the pathway. the general approach has been to start with labeled dcg protein in the tgn of permeabilized cells or in golgi-enriched fractions, then measure the transfer of the label from the relatively large and pelletable golgi membranes to nonpelletable vesicles, using medium speed centrifugation to separate the two pools. the appearance of label in smaller vesicles is taken as an indication of cargo transfer to isgs via vesicle budding. since little is known about dcg biogenesis, it is important to note that "budding" as defined by this assay may include a large number of steps, including the establishment of golgi/tgn microdomains, and the release of previously budded, weakly associated vesicles. thus, the results of these experiments could depend on the nature of the starting material.
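the readout of the cell-free assay described above, label moving from pelletable golgi membranes to nonpelletable vesicles, reduces to a ratio of two pools. the following is a minimal sketch under stated assumptions: the function names, the use of a paired control reaction, and the example counts are hypothetical and are not drawn from any of the cited protocols.

```python
def released_fraction(supernatant: float, pellet: float) -> float:
    """Fraction of labeled cargo recovered in the nonpelletable pool.

    `supernatant` and `pellet` are label counts (any consistent unit)
    measured after the medium-speed centrifugation step.
    """
    if supernatant < 0 or pellet < 0:
        raise ValueError("counts must be non-negative")
    total = supernatant + pellet
    if total == 0:
        raise ValueError("no label recovered")
    return supernatant / total


def budding_signal(sample: tuple, control: tuple) -> float:
    """Released fraction in a complete reaction minus that in a control.

    Each argument is a (supernatant, pellet) pair. Subtracting a control
    is an assumption of this sketch, meant to discount label that is
    released independently of the reaction conditions being tested.
    """
    return released_fraction(*sample) - released_fraction(*control)


# Hypothetical counts for a complete reaction and a paired control.
signal = budding_signal(sample=(30.0, 70.0), control=(5.0, 95.0))
print(f"budding signal: {signal:.2f}")
```

as the text cautions, a number of this kind reports only the redistribution of label; it does not by itself show that the released vesicles are bona fide isgs.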
in addition, it has not yet been rigorously demonstrated in any system that the released vesicles are bona fide isgs, for example by testing whether they are competent to fuse with their appropriate target membrane. the reconstituted budding reactions utilize atp, as expected, and most but not all require a cytosol extract.31,121-124 the small gtpase arf is required, although the targets for this regulatory protein are not yet clear. one potentially relevant arf target is phospholipase d (pld), the binding of which to the membrane can enhance isg budding.125 pld converts phosphatidylcholine to phosphatidic acid, perhaps thereby effecting a change in membrane curvature.126 this idea is appealing because, in the absence of coat proteins, the membrane curvature required for isg budding must be induced by other mechanisms.127 in addition, the indirect products of pld activity may recruit additional effectors to the budding site, including the unconventional gtpase dynamin-2.128 dynamin mediates membrane scission events, such as pinching off vesicle buds. however, pld does not stimulate budding in all reconstituted systems; the differences may reflect the variety of ways in which donor fractions are prepared. there is evidence that kinases and phosphatases, heterotrimeric g-proteins, and a phosphatidylinositol transfer protein (pitp) are involved in isg budding, but the enzymatic substrates have not been established.129-133 an important unanswered question is whether any of these activities, the majority of which are as yet unidentified at the molecular level, is specifically required for the formation of dcgs and not other membrane carriers. pld, for example, has been implicated in tgn tubulation, but the downstream effectors, as for dcg budding, are unknown.134 the noncanonical gtpase dynamin has been implicated in dcg budding as well as in constitutive secretion.128,135

figure legend (continued): the reaction depends upon ap-1, which can be recruited by the cytoplasmic tails of furin, the mannose-6-phosphate receptor, and other membrane proteins. the mannose-6-phosphate receptor can in turn bind any soluble lysosomal enzymes in the isg lumen, so these will also be withdrawn in the budding vesicles. other soluble proteins may also be included based on random partitioning, but the aggregated dcg proteins will be excluded. at the end of this process, the mature secretory granule is no longer a budding donor compartment, perhaps because it no longer contains membrane proteins that can recruit ap-1 (represented by stars).

one possibility is that these activities are only indirectly involved in dcg synthesis. according to the "sorting by exclusion" model (fig. 6a), isgs are created by a passive process, as aggregation prevents dcg cargo from entering into outbound vesicles and tubules that bear lysosomal or constitutively secreted proteins. instead of being actively budded from the tgn, aggregated proteins would be enriched in a separate subset of relatively large membrane carriers, while non-dcg proteins are removed from the compartment by active coat-dependent processes. thus, the cytosolic components identified as isg budding factors by in vitro reconstitution assays may really be parts of the mechanisms for other secretory pathways. according to this model, the so-called sorting receptors need only act as membrane tethers in associating the lumenal aggregates with membrane rafts. as there is no need to transport this material to a new compartment, the receptors do not recruit cytosolic coat proteins for vesicle budding, as the traditional membrane receptor proteins do in sorting proteins to other pathways. an alternative to the "sorting by exclusion" model proposes that isg budding is indeed an active process, and that the same mechanism is also involved in driving the budding and tubulation of constitutive secretory carriers from the tgn.
although the two pathways give rise to vesicles of vastly different sizes, it is possible that the difference is caused by the cargo proteins (large aggregates versus soluble material) and is not a reflection of different cytosolic budding machinery. the formation of constitutive secretory carriers, like isg budding, differs from clathrin-dependent transport at the tgn in that it is not associated with the appearance of vesicle coats. there are similarities between the budding of constitutive and regulated vesicles at the molecular level as well: in addition to rab proteins, constitutive traffic has been shown to rely on the activity of dynamin-2,128 protein kinase d,136 and heterotrimeric g-proteins,130 factors which may also be associated with isg budding (above). cholesterol depletion has been shown to inhibit both pathways;93 however, it is difficult to know whether the treatment has a direct effect on both pathways, or whether the inhibition of one pathway could inhibit the second via some indirect mechanism. thorough testing of this model requires experiments that avoid this problem. the only concrete indication that there are dcg-specific budding factors is that, at least in one reconstituted system, the cytosol requirement cannot be met by an extract from hela cells, which do not make dcgs.137 one possibility that is compatible with both sorting models is that specific cytosolic proteins are involved in establishing or facilitating golgi subdomains in which dcg proteins condense. the structural and functional analysis of the tgn is at a very early stage, but the existence of sub-domains is consistent with the observed nonuniform protein distribution within a single cisterna, as well as with live imaging of heterogeneous budding structures.94,138 however, the cisternal dilations involved in isg budding do not necessarily reflect the active maintenance of sub-domains.
a simpler view is that the cisternae are passively stretched around the forming granule protein cores, like the bulges in a pancake around blueberries. future work with in vitro systems may provide molecular identification of activities that are required for isg budding, but the question of whether cells have machinery that is specifically used for this purpose will need to be addressed by other types of analyses. if the budding mechanism is specific to isgs, and not indirectly required for isg production, as in the "sorting by exclusion" model, the prediction is that knocking out individual components would inhibit isg formation without also inhibiting the exit of lysosomal or constitutive proteins from the tgn. depending on cell type, isgs may be as important a locus of protein sorting as the tgn. sorting at this level involves the budding of vesicles from isg membranes, resulting in the remodeling of membrane and lumenal contents by selective withdrawal (fig. 6b). this targeted removal involves clathrin-coat recruitment to isgs via the ap-1 adaptor, which is recruited by membrane proteins in an arf-dependent, bfa-inhibitable step. 139 proteins known to be withdrawn from isgs include the cargo protease furin and the mannose-6-phosphate receptor, both of which can interact directly with ap-1. 140,143 the mannose-6-phosphate receptor can bind any lysosomal enzymes that may have been incorrectly sorted upon exit from the tgn, and this step therefore leads to selective withdrawal of some lumenal proteins by classical receptor-based sorting. 144 mature dcgs do not support ccv formation, the simplest explanation for which is that isgs become progressively depleted of proteins that act in the recruitment of ap-1. consistent with this, myristoylated arf1 binds to isgs but not mature granules in vitro. 139 recent evidence suggests that the full cohort of arfs and adaptors present on isgs includes arf1, 5 and 6, and ap-1 and -3.
145 these may all be present on a uniform population of vesicles, or may reflect heterogeneity within isgs. 143 coated vesicles budding from the isg will also withdraw any soluble proteins that randomly partition by diffusion into the vesicle lumen during budding. however, aggregates of proteins that are condensing in the isg are too large to fit into the buds, and are therefore selectively retained. 146 the efficiency of this separation is increased by the tendency of nonaggregating proteins to be concentrated at the periphery of the vesicle lumen, as they are excluded from the dense core forming in the center of the isg. as a result, the soluble proteins accumulate in a place where they can readily enter the vesicles that are budding from the membrane. these may include proteins that randomly partition at the tgn into budding isgs, but will also include some soluble products of dcg proprotein processing. the best characterized of these is derived from proinsulin, which is processed into a, b, and c peptides. 147 the first two are disulfide linked, and crystallize to form the granule core. the c peptide is soluble and is largely excluded from the core, and is selectively withdrawn. 148,149 a collateral consequence of isg maturation is the generation of a set of coated vesicles bearing newly synthesized proteins, some of which have undergone processing by isg-specific enzymes. at least in some cell types, these can deliver their cargo to the plasma membrane, probably via an endosomal intermediate. 150 this has been called "constitutive-like" secretion: constitutive-like in that it is independent of extracellular stimulation, but with kinetics that are slower than those of true constitutive secretion. in pancreatic β-cells, the c peptide that is withdrawn from isgs is secreted via this route.
the model that describes the progressive enrichment of granule cargo during isg maturation has been given the name "sorting by retention" and essentially posits that sorting in isgs can be based on a protein's ability to aggregate, rather than depending on specific targeting signals. the concepts are like those of the "sorting by exclusion" model that may apply at the tgn, and the similarity in models may be a reflection of similar molecular mechanisms in vivo. thus, isgs may simply be a functional extension of the tgn, which becomes progressively enriched in dcg contents as nonaggregating proteins are actively removed during maturation. in this view, there may not be any mechanistic differences between coat-mediated sorting at the tgn versus the isgs, though the material that is included in the budding vesicles could change as the compartment matures. alternatively, modification of the coat-mediated sorting machinery may be required in order to facilitate sorting from a compartment that is progressively changing. for example, such modifications may be necessary for the trafficking of proteins that are allowed to enter isgs but are not stored in mature dcgs, such as proteases (see "structural maturation of isgs" section), or for adapting to differences in membrane composition between the tgn and isgs. indirect evidence in support of this possibility comes from the study of the membrane lipid component phosphatidylinositol-4-phosphate (pi-4-p) and its derivatives. in the tgn, these molecules play important modulating roles, including the recruitment of ap-1/clathrin coat proteins for vesicle budding.
151 the levels of pi-4-p in the tgn are affected by the activity of pi-4 kinase, which is stimulated by myristoylated arf1-gtp, a part of the coat formation machinery. 152,153 interestingly, isgs have been found to contain a pi-4-k activity that is not stimulated by arf1-gtp. 1 the tgn has two different pi-4 kinases (ii and iii), and it is possible that isgs only recruit one of these. 152,153 coat recruitment at the tgn vs. isgs may also be differentially regulated by modification of the vesicle cargo, since the binding of ap-1 to the cytoplasmic tails of both furin and the mannose-6-phosphate receptor is stimulated following their phosphorylation by casein kinase ii. 141,1 in this regard, a very interesting observation is that newly-budded isgs are rapidly transported to the cell periphery, at least in some cell types, and therefore primarily inhabit a different cellular microenvironment from the tgn. 155 this may be relevant for differential regulation of similar activities at the tgn vs. isgs, for example if receptors in isgs are selectively modified. although the data are not yet conclusive, the emerging view of sorting from isgs is that it is directed by the core elements of a "flexible" ap-1/clathrin-dependent sorting mechanism that is differentially controlled at the isgs versus the tgn. the model holds that the sorting events of isg maturation are not mediated by a unique vesicle trafficking mechanism, but are instead accomplished by pathway-specific modifications of machinery that is common to all cell types. a similar phenomenon may occur at an earlier stage of the pathway, where the coat-independent machinery that drives the formation of constitutive carriers from the tgn may be adapted for the budding of isgs, as discussed in the "mechanisms of immature secretory granule (isg) budding" section.
this apparent mechanistic conservation may explain the ability of fibroblast cells, which do not normally make dcgs, to make dense-cored vesicles when expressing heterologous chromogranin genes or von willebrand factor. 48,50,51 however, these observations do not preclude the possibility that specialized dcg-producing cells express proteins that specifically modify parts of the conserved cellular trafficking machinery to enhance dcg synthesis. the cores of newly-budded isgs appear less electron-opaque than those in mature dcgs, and are also lower in buoyant density, 0 indicating that granule cargo becomes increasingly condensed during granule maturation. this is one reflection of the larger remodeling of protein and lipid constituents during the maturation process, which includes the selective withdrawal of components that are present in immature, but not mature, granules. this overall process serves important structural functions. the tighter packing offers increasingly efficient storage, and not simply because more material can be contained in a fixed vesicle volume. protein condensation overcomes an energetic barrier that is posed by a vesicle filled with concentrated soluble macromolecules, which is hyperosmolar when compared to cytosol. maintaining such a vesicle would require constant pumping of osmolytes to counter vesicle swelling, an expensive cellular proposition. within dcgs, aggregated proteins are no longer solvated, and are therefore osmotically inert. the progressive condensation during maturation parallels, and is likely to be controlled by, changes in the lumenal environment. 1 in neuroendocrine cells, the tgn is acidified to ph 6.4 by vacuolar atpases. 157 these are also present in the isg membrane, with the result that the isg continues to acidify. 158-160 at the same time, there is an increase in calcium that, along with other cations, 161 is important for charge neutralization of the largely acidic core proteins.
this calcium may be cotransported from the endoplasmic reticulum with calcium-binding dcg cargo proteins, or imported via isg membrane ion exchangers. 162 the ionic changes can trigger changes in dcg protein conformations or interactions. for example, cgb forms homo-oligomers under the conditions found in isgs. 56,163 the functional significance is as yet unknown, but these are presumably based on contacts different from those involved in aggregative sorting. one well-established consequence of isg acidification, in combination with increased ca2+, is the activation of proteases that are specifically localized to dcgs. the contents of neuronal and endocrine dcgs are largely synthesized as proteins that are proteolytically processed to generate bioactive peptides, the species that are eventually released during exocytosis. 164 proteolytic processing involves a variety of enzymes including amino- and carboxypeptidases, and a family of serine proteases called prohormone convertases. 165-168 members of this family are differentially active over a range of proton and calcium concentrations, and may thus act sequentially on their substrates during isg maturation, in a cell type-dependent fashion (davidson, 1988; laslop, 1998; goodge, 2000). 169 though isgs are considered to be the major compartment of proprotein processing, in some cell types processing may begin in the tgn, and moore and colleagues have begun to resolve the requirements for isg budding from those required for the onset of processing. 137,170,171 in their cell-free system, the onset of processing precedes budding. both require hydrolyzable gtp, but at two distinct concentrations. this difference suggested a model in which the former requires arf, while the latter depends upon a heterotrimeric g-protein.
in addition to generating mature peptides, proprotein processing may drive the physical reorganization of the core, in cases where mature peptides can pack more tightly than the precursors. the best example of this is found in β-cell granules, in which mature insulin but not proinsulin can assemble into hexagonal crystals, simply because processing relieves a packing constraint 147,172,173 (fig. 7). the control of assembly via proteolytic processing is strongly reminiscent of mechanisms involved in viral capsid formation. 174 the process of dcg maturation, which includes the generation of active peptides by proteolytic processing and the condensation of cargo into a densely packed, osmotically inert form, serves to increase the efficiency of the regulated secretory pathway in several ways. first, the condensation of material allows great quantities of protein to be stored in the vesicles, with the consequence that a small number of exocytic events can generate a relatively large secretory response. second, proteolytic processing in isgs allows the cell to combine multiple dcg peptides into a single proprotein, thereby linking the sorting of these proteins at earlier stages of the pathway. in neuroendocrine isgs, for example, the chromogranin proteins are cleaved into multiple biologically active peptides with different postexocytic functions. 62,175 furthermore, limiting the site of proteolytic processing to isgs may provide a failsafe mechanism, ensuring that the active forms of the proteins are only found in a compartment that is under direct control of the regulated secretory pathway, therefore leaving any incorrectly sorted proteins as uncleaved precursors. the remodeling of the membrane attending the budding of clathrin-coated vesicles does not simply serve to remove proteins that may have been incorrectly targeted at the tgn. rather, it also underlies differences in the activity of isgs and mature granules.
this was suggested by the observation that isgs and mature granules differ dramatically with regard to exocytosis: whereas mature dcgs undergo efficient exocytic fusion with the plasma membrane in a stimulus-dependent fashion, isgs exhibit an increased tendency to fuse with the plasma membrane in the absence of stimulation. in att20 cells, unregulated release of dcgs from isgs proceeds for 2-3 hours after isg budding from the tgn. 176 these isgs contain two snares, vamp4 and synaptotagmin iv (syt iv), which are withdrawn during maturation in a brefeldin a-inhibitable step. i 9,171,177 during the same period, the maturing granules become responsive to exocytic stimuli, a process also blocked by bfa. that the two phenomena may be linked is suggested by the observation that overexpression of syt iv itself decreased the responsiveness of maturing granules to secretory stimuli. 171 syt iv is thought to act as a negative regulator of calcium-induced exocytosis, 178 and the withdrawal of this inhibitory factor from isgs may foster maturation. a recent study showed that the removal of vamp4 from isgs depends upon interactions with ap-1 and the coat protein pacs-1, 179 thereby providing genetic confirmation of, and molecular detail for, this model. however, a complication of this model is that syt iv is thought to inhibit membrane fusion by forming inactive heterodimers with synaptotagmin i, and the mechanism by which the heterodimers are separated and syt iv is selectively removed from the isgs is unknown. another functional characteristic that may, in some cell types, distinguish isgs from dcgs is that isgs can undergo homotypic fusion, a reaction that has been more extensively characterized in vitro than in vivo. 177,180,181 the specific function of this reaction is not clear. in some systems homotypic-like fusion might allow for the synthesis of specialized dcg cores in which the contents are not randomly distributed.
in pseudomicrothorax dubius, two kinds of isgs, containing morphologically distinguishable cargo, fuse during the process of assembling a complex core structure. 18 more generally, consolidation could potentially define the size of the granules, which in many systems appears to be controlled. 183 disruption of the gene encoding rab3d, an exocrine granule-associated small gtpase, resulted in a doubling of mature granule volume, and one possibility is that rab3d acts as a negative regulator of homotypic fusion. 184 at some level, membrane remodeling must account for the difference in the fusogenic behavior of isgs vs. mature granules, and attention has focused on the snares, due to their importance in regulating membrane fusion. isgs from pc12 cells contain syntaxin 6, which must be present on both donor and acceptor membranes for efficient homotypic fusion in vitro. syntaxin 6 is also present in clathrin-coated vesicles which bud from the isg membrane, 142 consistent with the idea that it is selectively removed during maturation via several likely ap-1 binding sites in its cytoplasmic domain. 140 as isg maturation appears to involve the removal of specific factors via the budding of clathrin-coated vesicles, it is possible that more thorough analyses of the target proteins and their interacting partners will help to uncover isg-specific machinery that regulates clathrin-dependent sorting in this compartment. more broadly, the identification of the molecules that define the functional maturity of dcgs by their presence or absence in the vesicle will provide insights into the nature of organelle identity, a topic that is central to an understanding of the general principles of vesicular traffic. finally, recent evidence hints at aspects of granule maturation that have not previously been recognized.
functional maturation of secretory granules may extend beyond the period of morphological change, based on the observation that the distribution and fusogenic activity of granules may change with vesicle age. 185 the majority of the work on dcg synthesis has focused on the sorting of the lumenal content proteins in the tgn and isgs. these studies have, for the most part, supported the nonspecific aggregation-based model for sorting that was proposed by chanat and huttner in 1991. 57 not surprisingly, studies of many granule cargo proteins in multiple systems have revealed some cases that are possible exceptions to this general rule, where specific protein-protein interactions are required for the sorting of a particular protein, as discussed in the "protein sorting into isgs" section of this chapter. overall, the precise requirements for the sorting of any particular protein are likely to be both context dependent (which other granule cargo proteins are being expressed, and in what quantities) and cell type dependent (protein aggregation is sensitive to physiological properties of the lumen, such as calcium concentration and ph, which may vary between cell types), though it is likely that the general principles of aggregation-based sorting apply in all cells that produce dcgs. further analysis of the specific sorting requirements for individual proteins may lead to a greater knowledge of the details of aggregation-based sorting, but the next leap forward in our understanding of the system will more likely come from experimental approaches that expand beyond the level of individual proteins and consider the dcg synthesis pathway more broadly. for example, cargo protein aggregation is known to be sensitive to lumenal calcium concentration and ph levels, but the mechanisms that control these physiologic parameters have not been elucidated.
secondly, how are granule cargo proteins sorted to the same destination as other proteins that are essential for dcg function, such as membrane fusion machinery? the answers to these questions may be learned from studies in genetic systems, such as c. elegans, drosophila, and ciliated protozoans, which offer promising avenues for further experimentation. these organisms have recently been used to identify elements of the regulated exocytosis machinery, 186,187 and similar studies could uncover genes that are involved in vesicle synthesis. another major gap in our understanding of the granule synthesis pathway is the extent of its functional relationship with other branches of the secretory pathway. two decades ago, dcg formation was considered to be one of a small number of distinct, post-tgn secretory pathways. this carried the assumption that vesicles bound for constitutive or regulated exocytosis, or toward lysosomes, would rely on distinct mechanisms for their biogenesis. that view now seems, paradoxically, to have been both too simple and too complex. it was too simple because post-tgn traffic cannot be neatly divided into three branches: for example, what was called the constitutive pathway may in fact consist of multiple branches. 188,189 this was initially established for apical vs. basolateral targeting in polarized epithelia, but there is evidence in other cell types as well. furthermore, the mechanisms for dcg formation are not easily separated from those that are directly involved in other pathways, implying that the secretory pathway cannot be divided into distinct, independently functioning branches. for example, ap-1 dependent sorting of proteins to the lysosomal pathway is associated with isg maturation, and may also be part of the driving force for the "sorting by exclusion" of dcg contents in the tgn (see "protein sorting in isgs" section).
at the same time, the fact that the dcg synthesis pathway and lysosomal pathway use some of the same machinery argues that the historical view of distinct mechanisms was too complex. similarly, the historical view that constitutive and regulated secretory carriers are fundamentally different may also be incorrect. the idea that constitutive traffic is based on small vesicles is being modified by the recognition that tgn tubulation may be as, if not more, important in this pathway, at least in some cell types (references in ref 190). thus coat-mediated vesicle formation may be the exception rather than the rule for anterograde traffic to the plasma membrane, and the formation of constitutive and regulated secretory carriers may share common mechanisms. in the extreme, the mechanisms may be mostly conserved, and the end products depend upon the behavior of the vesicle cargo. addressing these issues directly will require identification of factors required for isg budding and tgn tubulation. while progress has recently been made toward the latter, details regarding the former are extremely limited. success in this may depend on further exploitation of cell-free systems, strengthened by development of new genetic models.
secretion and cell-surface growth are blocked in a temperaturesensitive mutant of saccharomyces cerevisiae secretory granule exocytosis the synaptic vesicle cycle snares and snare regulators in membrane fusion and exocytosis calcium sensors in regulated exocytosis principles of exoeytosis and membrane fusion regulated exocytosis and snare function (review) the trans-most cisternae of the golgi complex: a compartment for sorting of secretory and plasma membrane proteins sorting within the regulated secretory pathway occurs in the trans-golgi network sorting of progeny coronavirus from condensed secretory proteins at the exit from the trans-golgi network of att20 cells intracellular aspects of the process of protein synthesis pathways of protein secretion in eukaryotes biogenesis of secretory granules in the trans-golgi network of neuroendocrine and endocrine cells constitutive and regulated secretion of proteins biogenesis of constitutive secretory vesicles, secretory granules and synaptic vesicles protein hormone storage in secretory granules: mechanisms for concentration and sorting lumenal protein multimerization in the distal secretory pathway/secretory granules secretory granule biogenesis: rafting to the snare sorting and storage during secretory granule biogenesis: looking backward and looking forward formation of secretion granules in the golgi apparatus of pancreatic acinar cells of the rat trans-golgi network (tgn) of different cell rypes: three-dimensional structural characteristics and variabiliry transport of casein submicelles and formation of secretion granules in the golgi apparatus of epithelial cells of the lactating mammary gland of the rat modulation of the golgi apparatus in stimulated and nonstimulated prolactin cells of female rats formation of secretory granules in the golgi apparatus of prolactin cells in the rat pituitary gland: a stereoscopic study three-dimensional electron microscopy: structure of the golgi apparatus formation of 
secretion granules in the golgi apparatus of plasma cells in the rat nonconverted, amino acid analog-modified proinsulin stays in a golgi-derived clarhrin-coated membrane compartment biogenesis of secretory granules. implications arising from the immature secretory granule in the regulated pathway of secretion intermediates in the constitutive and regulated secretory pathways released in vitro from semi-intact cells tooze jet ai. characterization of the immature secretory granule, an intermediate in granule biogenesis structural requirements for targeting of surfactant protein b (sp-b) to secretory granules in vitro and in vivo procathepsin l self-association as a mechanism for selective secretion drosophila caps is an essential gene that regulates dense-core vesicle release and synaptic vesicle fusion ca(2+)-dependenr activator protein for secretion is critical for the fusion of dense-core vesicles with the membrane in calf adrenal chromaffin cells a novel 145 kd brain cyrosolic protein reconstitutes ca 2+ -regulated secretion in permeable neuroendocrine cells the synaptic vesicle cycle: exocytosis and endocytosis in drosophila and c e1egans isolation and ultrastructural characterization of secretory mutants of tetrahymena thermophila protein secretion in tetrahymena thermophila: characterization of the secretory mutant strain sb281 genetic characterization of tetrahymena thermophila mutants unable to secrete capsules maturation of dense core granules in wild rype and mutant tetrahymena thermophila mutational analysis of regulated exocytosis in tetrahymena mutations affecting the trichocysts in paramecium aurelia. 
i morphology and description of the mutants evidence for defects in membrane traffic in paramecium secretory mutants unable to produce functional storage granules overall lack of regulated secretion in a pci 2 variant cell clone a pci 2 variant lacking regulated secretory organelles: aberrant protein targeting and evidence for a factor inhibiting neuroendocrine gene expression analysis of a mutant exhibiting conditional sorting ro dense core secretory granules in tetrahymena thermophila biogenesis of von willebrand factor-containing organelles in heterologous rransfected cv-i cells induction of specific storage organelles by von willebrand facror propolypeptide chromogranin a, an «on/off. switch controlling dense-core secretory granule biogenesis chromogranin b-induced secretory granule biogenesis: comparison with the similar role of chromogranin a molecular sorting in the secretory pathway protein secretion: puzzling receptors proteins synthesized and secreted during rat pancreatic development phasic release of newly synthesized secretory proteins in the unstimulated rat exocrine pancreas ph-and ca2+-dependent aggregation properry of secretory vesicle matrix proteins and the potential role of chromogranins a and b in secretory vesicle biogenesis milieu-induced, selective aggregation of regulated secretory proteins in the trans-golgi network signal-mediated sorting to the regulated pathway of protein secretion 2+)-induced conformational change and aggregation of chromogranin b. 
comparison with chromogranin a and implication in secretory vesicle biogenesis secretory granule content proteins and the luminal domains of granule membrane prot eins aggregate in vitro at mildly acidic ph the granin (chromagranin/secretogranin) family the chromogranins: th eir roles in secretion from neuroendocrine cells and as markers for neuroendocrine neoplasia chro mogranin b (secretogranin i) promotes sorting to the regulated secretory pathway of processing intermediates derived from a peptide hormone precursor gorr suo aggregation chaperones enhance aggregation and storage of secretory proteins in endocrine cells in vitro aggregation of the regulated secretory protein chromogranin a reconstitution in vitro of the ph-dependent aggregation of pancreatic zymogens en route to the secretory granule: implication of gp-2 exocrine granule specific packaging signals are present in the polypeptide moiery of the pancreatic granule membrane protein gp2 and in amylase: implications for protein targeting to secretory granules identification of a chromogranin a domain that mediates binding to secretogranin iii and targeting to secretory granules in pituitary cells and pancreatic beta-cells identification of a novel sorting determinant for the regulated pathway in the secretory protein chromogranin a sorting of three secretory proteins to distinct secretory granules in acidophilic cells of cow anterior pituitary multiple neuropept ides derived from a common precursor are differentially packaged and transported structure et ultrastructure de lacrymaria olor (o.f.m. 1786) cocrystallization of proinsulin and insulin vayssie let ai. a large multigenic family codes for the polypeptides of the crystalline trichocyst matrix in paramecium protein secretion in tetrahymena thermophila. 
characterization of the major proteinaceous secretory proteins growth and form of secretory granules involves stepwise assembly but not differential sorting of a family of secretory proteins in paramecium quality control in the endoplasmic reticulum proteolytic processing and ca 2+-binding activity of dense-core vesicle polypeptides in tetrahymena identification and characterization of a novel secretory granule calcium-binding protein from the early branching eukaryote giardia lamblia myosin va facilitates the distribution of secretory granules in the f-actin rich cortex of pc12 cells chromogranin b (secretogranin i) a secretory protein of the regulated pathway, is also present in a tightly membrane-associated form in pc12 cells the disulfide-bonded loop of chromogranin b mediates membrane binding and directs sorting from the trans-golgi network to secretory granules the disulfide-bonded loop of chrornogranins, which is essential for sorting to secretory granules, mediates hornodimerizarion identification of the sorting signal motif within pro-opiomelanocortin for the regulated secretory pathway carboxypeptidase e, a prohormone sorting receptor, is anchored to secretory granules via a c-terminal transmembrane insertion carboxypeptidase e is a re~lated secretory pathway sorting receptor: genetic obliteration leads to endocrine disorders in cpe at mice identification of a novel prohormone sorting signal-binding site on carboxypeptidase e, a regulated secretory pathway-sorting receptor depletion of carboxypeptidase e, a regulated secretory pathway sorting receptor, causes misrouting and constitut ive secretion of proinsulin and proenkephalin, but not chromogranin a proinsulin targeting to the regulated pathway is not impaired in carboxypeptidase e-deficient cpefat/cpefat mice carboxypeptidase e, a peripheral membrane protein implicated in the targeting of hormones to secretory granules, coaggregates with granule content proteins at acidic ph secretogranin iii binds to 
key: cord-016652-x8t3lf1x authors: matthews, david; emmott, edward; hiscox, julian title: viruses and the nucleolus date: 2011-05-23 journal: the nucleolus doi: 10.1007/978-1-4614-0514-6_14 sha: doc_id: 16652 cord_uid: x8t3lf1x the nucleolus is a dynamic sub-nuclear structure integral to the function of a eukaryotic cell. some of its major roles involve ribosome subunit biogenesis, rna processing, cell cycle control and responding to cellular stress, such as infection. our understanding of the relationship between viruses and the nucleolus has moved from a phenomenological approach describing protein localisation to functional studies involving genetic analysis and proteomic approaches.
these advances have provided fundamental insights as to how and why the nucleolus is targeted by many different viruses both to usurp normal functioning and to recruit nucleolar proteins to facilitate virus replication. this knowledge has been exploited for therapeutic strategies involving targeted inhibition of virus replication and live-attenuated recombinant vaccines. [table 14.1 (partially recovered; surviving entries only): lee et al. (2003); lutz and kedinger (1996); tollefson et al. (2007); matthews and russell (1998); asfaviridae: lower et al. (1995). many of these proteins have been shown to interact with nucleolar proteins.] the reason why rna viruses, and positive-strand rna viruses in particular, interact with the nucleolus when the site of genome replication is in the cytoplasm is less intuitive. in this latter case, viral proteins that are normally required in the cytoplasm must transit through the nuclear pore complex both to and from the nucleus. this process is crucial for virus biology because if the viral proteins that are required for cytoplasmic functions such as rna synthesis and encapsidation are sequestered in the nucleolus or nucleus, then progeny virus production will be affected, as has been revealed by inhibitor and genetic studies (lee et al. 2006; tijms et al. 2002). viruses may interact with the nucleolus to usurp host cell functions and recruit nucleolar proteins to facilitate virus replication. investigating the interactions between viruses and the nucleolus may facilitate the design of novel anti-viral therapies both in terms of recombinant vaccines (pei et al. 2008) and molecular intervention (rossi et al. 2007), and also contribute to a more detailed understanding of the cell biology of the nucleolus. for many years our understanding of the interaction of viruses and the nucleolus was phenomenological and focused on identifying viral proteins that localised to this structure, their mechanisms of trafficking and potential interaction with nucleolar proteins (e.g. see table 14.1).
however, recent research capitalising on advances in proteomics, viral genetics and cellular imaging techniques is beginning to increase our understanding of the mechanisms viruses use to subvert host cell nucleoli and facilitate virus biology. new data are now emerging that support the view that many viruses interact with the nucleus and nucleolus, particularly to facilitate virus replication. one of the best-studied viruses in terms of interactions with the nucleolus is hiv-1, which is described in detail in chap. 17. hiv has clearly defined cytoplasmic and nuclear replication stages: the viral capsid contains two copies of positive-sense rna, which are reverse transcribed in the cytoplasm and then trafficked to the nucleus, where ultimately the new genome is transcribed and trafficked back to the cytoplasm. part of the reason for the interaction of hiv-1 with the nucleolus is the trafficking of intronless mrna from the nucleus into the cytoplasm (michienzi et al. 2000). this is a property shared with herpesviruses and indicates that different viruses have evolved similar strategies involving subversion of nucleolar function for the benefit of virus biology (boyne and whitehouse 2006). in the case of hiv-1, this knowledge has also led to the design and implementation of effective genetic therapies against the virus (unwalla et al. 2008). a large number of viruses with dna genomes have been shown to interact with the nucleolus, and this perhaps is not surprising as most dna viruses replicate in the nucleus. a genome-wide screen of three distinct herpesviruses, herpes simplex virus 1 (hsv-1), cytomegalovirus (cmv) and epstein-barr virus (ebv), has shown that at least 12 herpesvirus-encoded proteins specifically localise to the nucleolus (salsman et al. 2008), and these are implicated in many aspects of the herpesvirus life cycle.
therefore, a number of proteomic studies are currently being undertaken to study changes, in a global context, within the nucleolar proteome during virus infections, and are discussed later (lam et al. 2010). several different herpesvirus proteins have been shown to cause the redistribution of nucleolar proteins and hence disruption of the nucleolus. these include the herpes simplex virus 1 major tegument structural protein vp22 (lopez et al. 2008), and the us11 (xing et al. 2010) and ul24 proteins (bertrand and pearson 2008; lymberopoulos and pearson 2007). such disruption in many cases may have a direct effect on nucleolar function. a significant area of virus biology that has been investigated is the role of viral proteins that traffic through the nucleolus. for example, a number of hiv proteins that traffic through the nucleolus have been implicated in virus mrna processing (dundr et al. 1995). similar observations have also been made in herpesviruses (boyne and whitehouse 2006, 2009; leenadevi and dalziel 2009). initial studies utilising the prototype gamma-2 herpesvirus, herpesvirus saimiri (hvs), demonstrated that the hvs nucleolar trafficking orf57 protein induces nucleolar redistribution of the host cell human trex proteins, which are involved in mrna nuclear export (boyne and whitehouse 2006). intriguingly, ablating orf57 nucleolar trafficking led to a failure of orf57-mediated viral mrna nuclear export (boyne and whitehouse 2006). the precise role of this nucleolar sequestration is yet to be determined, but possible effects on viral mrna/protein processing and viral ribonucleoprotein particle assembly are currently being investigated. this property may also be conserved in other orf57 homologues, as recent analysis has shown that the orf57 protein from kaposi's sarcoma-associated herpesvirus (kshv) also dynamically traffics through the nucleolus (boyne et al. 2008b).
moreover, upon rapid disorganisation of the nucleolus, a reduction is observed in virus mrna nuclear export (boyne and whitehouse 2009). the formation of an orf57-mediated export-competent ribonucleoprotein particle within the nucleolus may also have implications for the translation of viral mrnas. for example, it has recently been demonstrated that the cellular nucleo-cytoplasmic shuttle protein, pym, which is involved in translation enhancement, is redistributed to the nucleolus in the presence of the kshv orf57 protein (boyne et al. 2010). this interaction effectively enhances the translation of the predominantly intronless transcripts made by kshv, and draws parallels with potential translation enhancement of positive-strand rna virus genomes through their interaction with the nucleolus (discussed later). a second area of virus replication where nucleolar proteins are sequestered involves the replication of the virus dna genome. for example, we (matthews) and others have observed that the nucleolar antigens upstream binding factor (ubf) and nucleophosmin (b23.1) are both sequestered into adenovirus dna replication centres where they promote viral dna replication (hindley et al. 2007; lawrence et al. 2006; okuwaki et al. 2001). similarly, in hsv-1 infected cells, a number of nucleolar proteins including nucleolin and ubf are recruited into viral dna replication centres. these are specific sites where replication and encapsidation of the hsv-1 genome occur. evidence suggests that sequestration of ubf is essential for viral dna replication, as overexpression of a tagged version of ubf acts in a dominant-negative manner, inhibiting virus dna replication (stow et al. 2009). moreover, depletion of nucleolin results in reduced virus gene expression and infectious virion production (calle et al. 2008; sagou et al. 2010). in addition to enhancing virus replication, nucleolar proteins are redistributed to alter cellular pathways during infection.
for example, the nucleolar-targeted hsv-1 us11 protein has been shown to interact with homeodomain-interacting protein kinase 2 (hipk2), which plays a role in p53-mediated cellular apoptosis and hypoxic response (calzado et al. 2009) and also participates in the regulation of the cell cycle (calzado et al. 2007). this interaction alters the sub-cellular localisation of hipk2 and protects against hipk2-mediated cell cycle arrest (giraud et al. 2004). in contrast, the cellular protein, protein interacting with the carboxyl terminus-1 (pict-1), can sequester the virally encoded apoptosis suppressor protein, ks-bcl-2, from the mitochondria into the nucleolus to down-regulate its anti-apoptotic activity (kalt et al. 2010). this is a potentially interesting interplay between two sub-cellular structures involved in the viral stress response (olson 2009), and may be more common and widespread. for example, bacterial infection has been shown to disrupt the nucleolus through regulating mitochondrial dysfunction (dean et al. 2010). although many rna virus proteins have been shown to localise to the nucleolus, most attention has focused on viral capsid proteins. these are proteins that associate with the viral genome for encapsidation and assembly of new virus particles. these proteins may also modulate replication (and transcription, where appropriate) of the viral genome. increasingly, capsid proteins have also been shown to have a number of roles in modulating host cell signalling pathways and functions. these capsid proteins are referred to as capsid, nucleoproteins or nucleocapsid proteins, depending on the virus. in many cases, they are phosphorylated (chen et al. 2005), which can modulate activity (spencer et al. 2008). many examples of these proteins have been shown to localise to the nucleolus both when over-expressed and also in infected cells.
these include proteins from positive-strand animal and plant rna viruses, including the coronavirus nucleocapsid protein (chen et al. 2002; hiscox et al. 2001; wurm et al. 2001), the arterivirus nucleocapsid protein (rowland et al. 1999), the alphavirus capsid protein (jakob 1994) and non-structural protein nsp2 (rikkonen et al. 1992, 1994) and the umbravirus orf3 protein (ryabov et al. 2004). capsid proteins from negative-strand rna viruses also localise to the nucleolus. these include a number of different influenza virus proteins, whose localisation is strain dependent (emmott et al. 2010c; han et al. 2010; melen et al. 2007; volmer et al. 2010). for many years the study of this localisation has been phenomenological, and viral capsid and rna-binding proteins might simply localise to the nucleolus because they diffuse through the nuclear pore complex and associate with compartments in the nucleus that have a high rna content, the nucleolus in particular because it is transcriptionally active. in this case, sub-cellular localisation to the nucleolus would have no physiological consequence for the virus or the cell. however, rna virus replication is error prone and selection pressure might apply to such a fortuitous localisation (given the approximately 4,500 or more nucleolar proteins and their diverse roles (ahmad et al. 2009)), with the concomitant effect that the virus could select for changes that ultimately disrupt nucleolar function and/or recruit nucleolar proteins to aid virus replication. there is a potential correlation between the nucleolar localisation of a viral protein and the loss of an essential nucleolar function. the molecular mechanisms responsible for this effect are unknown, but the displacement and re-localisation of nucleolar proteins by viral proteins could increase or decrease the nucleolar, nuclear and/or cytoplasmic pool of these proteins.
certainly, the accumulation of viral proteins in the nucleolus could potentially cause volume exclusion or crowding effects, which have been proposed to play a fundamental role in the formation of nuclear compartments including the nucleolus, and can be addressed by proteomic strategies. therefore, disruption of nucleolar architecture and function might be common in virus-infected cells if viral proteins target the nucleolus or a stage of the virus lifecycle disrupts nucleolar proteins. for example, poliovirus infection results in the selective redistribution of nucleolin from the nucleolus to the cytoplasm (waggoner and sarnow 1998) and inactivation of ubf, which shuts off rna polymerase i transcription in the host cell. the infection of cells with ibv has been shown to disrupt nucleolar architecture (dove et al. 2006b) and cause arrest of the cell cycle in the g2/m phase and failure of cytokinesis (dove et al. 2006a). the ibv and arterivirus nucleocapsid proteins associate with nucleolin and fibrillarin, respectively. similarly, the hiv-1 rev protein has been shown to localise to the dfc and gc, and over-expression of rev protein alters the nucleolar architecture and is associated with the accumulation of nucleophosmin (dundr et al. 1995). many different virus proteins localise to the nucleolus (table 14.1). however, predicting viral (and cellular) nucleolar targeting signals has historically been problematic and only recently has bioinformatic software been developed to facilitate this (scott et al. 2011). nucleolar trafficking might occur because viral proteins that are trafficked to the nucleolus contain motifs that resemble host nucleolar targeting signals, that is, a form of molecular mimicry. the discovery of specific nucleolar trafficking signals in viral proteins has indicated a functional mechanism behind this observed localisation (lee et al. 2003; reed et al. 2006).
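motif-based prediction of trafficking signals can be illustrated with the pat4 and pat7 nuclear localisation patterns reported for the prrsv nucleocapsid protein. the sketch below is only a simplified illustration and is not the software of scott et al. (2011). it assumes psort-style definitions that are not given in this chapter: pat4 as four consecutive residues that are all k/r, or three k/r plus one h or p, and pat7 as a p followed within three residues by a four-residue window containing at least three k/r. the function names and the test peptides are hypothetical.

```python
# simplified scan for pat4 / pat7 nuclear localisation patterns.
# assumption: psort-style pattern definitions (see lead-in); not taken from this chapter.
BASIC = set("KR")


def find_pat4(seq):
    """Return start indices of 4-residue windows that are all K/R,
    or three K/R plus a single H or P."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 3):
        window = seq[i:i + 4]
        others = [res for res in window if res not in BASIC]
        if not others or (len(others) == 1 and others[0] in "HP"):
            hits.append(i)
    return hits


def find_pat7(seq):
    """Return indices of P residues that are followed, within three residues,
    by a 4-residue window containing at least three K/R."""
    seq = seq.upper()
    hits = []
    for i, res in enumerate(seq):
        if res != "P":
            continue
        for offset in (1, 2, 3):
            window = seq[i + offset:i + offset + 4]
            if len(window) == 4 and sum(r in BASIC for r in window) >= 3:
                hits.append(i)
                break
    return hits


# hypothetical peptides, invented for illustration only
pat4_hits = find_pat4("AAKKPKAA")
pat7_hits = find_pat7("AAPGKKNKAA")
```

a hit from such a scan only flags a candidate signal; as the mutational studies discussed in this chapter show, function has to be confirmed experimentally.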
analysis of the different nucleolar trafficking signals identified in viral proteins using dynamic live-cell imaging has certainly demonstrated that different proteins can confer differential trafficking rates and localisation patterns (emmott et al. 2008). this is very similar to cellular nucleolar proteins (lechertier et al. 2007). in some virus proteins, both nlss and nucleolar targeting signals act in concert to direct a protein to the nucleolus. the arterivirus porcine reproductive and respiratory syndrome virus (prrsv) nucleocapsid protein localises to the nucleolus and has been shown to contain two potential nlss, a pat4 and a downstream pat7 motif (rowland et al. 1999). analysis revealed that a 31 amino acid sequence incorporating the pat7 motif could direct the nucleocapsid protein to both the nucleus and nucleolus. the protein also contains a predicted nes, presumably to allow the protein to traffic back into the cytoplasm to contribute to viral function in this compartment. this is common among related proteins. for example, in the avian coronavirus nucleocapsid protein an eight amino acid sequence is necessary and sufficient to target the protein to the nucleolus (reed et al. 2006), and the protein also contains an nes (reed et al. 2007). intriguingly, genetic analysis (lee et al. 2006), dynamic live-cell imaging (you et al. 2008) and use of trafficking inhibitors (tijms et al. 2002) indicate a requirement for these positive-sense rna virus capsid proteins to localise to the nucleolus as soon as they are translated, prior to their involvement in virus replication or assembly. this may be related to subversion of host cell function, protein modification (e.g. phosphorylation) or recruitment of nucleolar proteins. viral proteins might also traffic to the nucleolus through association with cellular nucleolar proteins.
for example, the hepatitis delta antigen has been shown to contain a nucleolar targeting signal that also corresponded to a site that promoted binding to nucleolin (lee et al. 1998). mutating this region prevented both nucleolin binding to the delta antigen and nucleolar trafficking. by implication, this relates nucleolin binding to nucleolar trafficking (lee et al. 1998). certainly, interaction between nucleophosmin and hepatitis delta antigens can modulate viral replication (huang et al. 2001), and more recently combined proteomic-rnai screens have revealed many other nucleolar proteins that can be associated with this viral protein (cao et al. 2009). trafficking and accumulation of viral proteins to and from the nucleolus, similar to cellular proteins, may also be cell cycle related. for example, the coronavirus nucleocapsid protein localises preferentially to the nucleolus in the g2 phase of the cell cycle (cawood et al. 2007), whereas the human cytomegalovirus protein ul83 does so in the g1 phase (arcangeletti et al. 2011). again these trafficking profiles may be related to the interaction with cellular nucleolar proteins (emmott and hiscox 2009). semliki forest virus non-structural protein nsp2 can localise to the nucleolus (peranen et al. 1990; rikkonen et al. 1992, 1994) and disruption of this localisation through a single amino acid change results in a reduction in neurovirulence (fazakerley et al. 2002). such in vitro data have also been backed up by persuasive in vivo data. mutation of the arterivirus nucleocapsid protein pat7 nls motif in the context of a full-length clone revealed that this sequence could have a key role in virus pathogenesis in vivo, as animals infected with mutant viruses had shorter viraemia than those infected with wild-type viruses (lee et al. 2006; pei et al. 2008).
interestingly, reversions occurred in the mutated nucleocapsid gene sequence and although the amino acid sequence of the pat7 motif was altered, its function was not; this new signal was defined as a pat8 motif (lee et al. 2006). the clear implication of this groundbreaking work is that nucleolar trafficking of a viral protein is functionally relevant, and the work illustrates the potential of exploiting this knowledge for the generation of growth-attenuated recombinant vaccines (pei et al. 2008; reed et al. 2006, 2007). similarly, point mutations in the japanese encephalitis virus (jev) core protein that abolished nuclear and nucleolar localisation resulted in recombinant viruses with impaired replication in mammalian cells, compared to wild-type virus (mori et al. 2005; tsuda et al. 2006). curiously, replication of recombinant viruses was not impaired in insect cells, suggesting that this could be related to differences in nucleolar architecture and proteomes between these cell types (thiry and lafontaine 2005). the jev core protein has been shown to interact with nucleophosmin and is translocated from the nucleolus to the cytoplasm. flaviviruses in general (jev, dengue virus and west nile virus) appear to have a part-nuclear stage to the synthesis of viral rna, and several components of the viral replicase together with newly synthesised rna have been found in the nucleus of infected cells (uchil et al. 2006). one intriguing question that has yet to be elucidated is how such viral rna traffics from the nucleus to the cytoplasm. most cellular mrnas are spliced and it is part of the splicing process that signals nuclear export. certain dna viruses, such as herpesvirus saimiri, produce intronless mrna and these viruses have evolved specific viral proteins (such as herpesvirus saimiri orf57 (boyne et al. 2008a)), which interact with the cellular mrna export machinery (e.g.
the mrna processing and export factor aly) to traffic viral mrna from the nucleus to the cytoplasm (boyne et al. 2008b, 2010; boyne and whitehouse 2006), and a similar process might be required by rna viruses. for example, tomato bushy stunt virus (tbsv) redistributes aly from the nucleus to the cytoplasm, and this might be a way the virus mediates host cell protein synthesis (uhrig et al. 2004). in plants, rna silencing, a host defence mechanism, targets virus rnas for degradation in a sequence-specific manner, and viruses use several mechanisms to counteract this system (canto et al. 2006). tbsv encodes a protein, p19, which interferes with this pathway. however, aly might transport p19 from the cytoplasm to the nucleus or nucleolus and disrupt its silencing suppression activity. nucleolin has also been shown to be involved in the trafficking of herpes simplex virus type 1 nucleocapsids from the nucleus to the cytoplasm (sagou et al. 2010), drawing parallels with the involvement of nucleolar proteins in the movement of plant viruses (kim et al. 2007a, b). different plant virus proteins involved in long-distance phloem-associated movement of virus particles or with roles in binding to the rna virus genomes localise to the nucleolus and other sub-nuclear structures (kim et al. 2007b; ryabov et al. 2004). this may be mediated by association with nuclear proteins, as is the case with fibrillarin and the orf3 protein of plant umbraviruses (kim et al. 2007a). hijacking the nucleolus is not exclusive to plant viruses and may also occur with mammalian viruses. similar to the plant rhabdovirus maize fine streak virus (mfsv), whose nucleocapsid and phosphoproteins localise to the nucleolus (tsai et al.
2005), the animal negative-stranded rna virus borna disease virus has been reported to use the nucleolus as a site for genome replication, and its rna-binding protein has the appropriate trafficking signals for import into and export from the nucleus (pyper et al. 1998). the hepatitis delta virus genome also has differential synthesis in the nucleus, with rna being transcribed in the nucleolus (huang et al. 2001); this is similar to the potato spindle tuber viroid, where rnas of opposite polarity are sequestered in different nuclear compartments, with the positive-sense rna being transported to the nucleolus. again, localisation to different sub-nuclear structures may have different roles in the virus life cycle (li et al. 2006). an intriguing recent discovery has been made showing that adeno-associated virus (aav) encodes an additional protein called assembly-activating protein (aap) that localises to the nucleolus and promotes assembly of the viral capsid (sonntag et al. 2010). given the limited genomes and coding capacities of rna viruses, recruitment of cellular proteins with defined functions in rna metabolism would be a logical step to facilitate rna virus infection. as nucleolar proteins have many crucial functions in cellular rna biosynthesis, processing and translation, it comes as no surprise that nucleolar proteins are incorporated into the replication and/or translation complexes formed by rna viruses. given that some nucleolar proteins have many different functions, the same nucleolar protein might be used by a virus for different aspects of the replication pathway. studies suggest that the human rhinovirus 3c protease (3cpro) precursors, 3cd' and/or 3cd, localise in the nucleoli of infected cells early in infection and inhibit cellular rna transcription via proteolytic mechanisms (amineva et al. 2004).
this general property is not restricted to human rhinovirus and, in terms of the inhibition of cellular translation, has also been described for encephalomyocarditis virus (aminev et al. 2003a, b), again suggesting roles in translational regulation. given the many roles of the nucleolus in the life cycle of the cell, including as stress sensor (boulon et al. 2010; mayer and grummt 2005), it would seem reasonable that comprehensive unbiased analysis of the nucleolar proteome would yield interesting data, particularly in providing clues as to what cellular nucleolar functions may be altered by virus infection and what mechanisms the nucleolus may use to respond to this. how the nucleolar proteome changes in response to virus infection has been investigated using stable isotope labelling with amino acids in cell culture (silac) coupled to lc-ms/ms and bioinformatics (fig. 14.1). these studies, led by our laboratories, have analysed purified nucleoli and the nucleus, and have directly stemmed from the pioneering work of the lamond laboratory in analysing purified nucleoli using quantitative proteomics (andersen et al. 2005; munday et al. 2010). [fig. 14.1 caption: diagram of a "classic" silac experiment. this technology allows high-throughput quantitative proteomics and has been readily applied to the nucleolus, especially when coupled with dynamic live-cell imaging (andersen et al. 2005). the ability to simultaneously compare up to three different conditions through selection of the appropriate isotope label has enabled the recent studies of how the nucleolar proteome changes in virus-infected cells (emmott et al. 2010a; emmott et al. 2010b; emmott et al. 2010c; hiscox et al. 2010; lam et al. 2010).] overall, our data indicate that only a small proportion of nucleolar proteins change in abundance in virus-infected cells, and these tend to be virus-specific.
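the summary statistic used in such comparisons, the proportion of identified proteins whose abundance changes by at least twofold in either direction, can be sketched as follows. this is a minimal illustration and not the analysis pipeline of the cited studies; the function name fraction_changed, the protein names and the infected/mock ratios are all invented for the example.

```python
import math


def fraction_changed(ratios, fold=2.0):
    """Return (fraction, names) for proteins whose infected/mock abundance
    ratio changes by at least `fold` in either direction."""
    if not ratios:
        raise ValueError("no ratios supplied")
    threshold = math.log2(fold)
    # working in log2 space treats a 2-fold increase and a 2-fold decrease symmetrically
    changed = sorted(
        name for name, ratio in ratios.items()
        if abs(math.log2(ratio)) >= threshold
    )
    return len(changed) / len(ratios), changed


# hypothetical silac ratios (infected / mock), invented for illustration only
example_ratios = {
    "protein_a": 1.05,
    "protein_b": 0.48,  # more than 2-fold decrease
    "protein_c": 2.60,  # more than 2-fold increase
    "protein_d": 0.95,
    "protein_e": 1.30,
}

fraction, changed = fraction_changed(example_ratios)
```

taking the logarithm of each ratio before thresholding is a common design choice because it places increases and decreases on the same scale.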
for example, in adenovirus-infected cells just 7% of proteins identified show a twofold or greater change, compared to almost a third of nucleolar antigens showing a greater than twofold change when cells are treated with actd, which inhibits rrna synthesis (lam et al. 2010). what is notable is that direct comparison between the adenovirus dataset and the actd dataset shows no clear correlation (lam et al. 2010), further supporting the case that adenovirus induces effects on the nucleolus distinct from those of a generalised, non-specific shutdown of nucleolar function. this fits well with a previous observation that adenovirus infection does not affect rrna synthesis even 36 h post-infection (lawrence et al. 2006). these results were initially surprising given the number of different viral proteins that can localise to this structure and how they interact with nucleolar proteins. this suggests that the nucleolar proteome and architecture are resilient during early stages of infection but may become disrupted as more and more damage accumulates inside cells because of virus activity, as clearly evidenced in live-cell imaging experiments (bertrand and pearson 2008; dove et al. 2006b). coupling quantitative proteomic analysis of the nucleolus and deep sequencing throughout infection in time-course experiments of lytic, latent, acute and persistent viruses would reveal valuable insights into the response of the nucleolus to virus infection. likewise, being able to move from studying cell culture-adapted laboratory strains to clinical isolates replicating in primary cells would yield more biologically relevant information, particularly with regard to the severity of disease and nucleolar changes. these technologies could also be applied to large-scale analysis of viral proteins that traffic to the nucleolus and the cellular nucleolar proteins that they associate with (e.g. using silac and egfp-traps (trinkle-mulcahy et al.
2008)), thus generating and integrating interactome networks with the nucleolar proteome during infection. nopdb: nucleolar proteome database-2008 update encephalomyocarditis viral protein 2a localizes to nucleoli and inhibits cap-dependent mrna translation encephalomyocarditis virus (emcv) proteins 2a and 3bcd localize to nuclei and inhibit cellular mrna transcription but not rrna transcription rhinovirus 3 c protease precursors 3cd and 3cd' localize to the nuclei of infected cells nucleolar proteome dynamics human cytomegalovirus proteins pp65 and iep72 are targeted to distinct compartments in nuclei and nuclear matrices of infected human embryo fibroblasts cell-cycle-dependent localization of human cytomegalovirus ul83 phosphoprotein in the nucleolus and modulation of viral gene expression in human embryo fibroblasts in vitro np9 protein of human endogenous retrovirus k interacts with ligand of numb protein x functional role of px open reading frame ii of human t-lymphotropic virus type 1 in maintenance of viral loads in vivo a temporal study of the expression of the capsid, cytoplasmic inclusion and nuclear inclusion proteins of tobacco etch potyvirus in infected plants visualization of the interaction between the precursors of vpg, the viral protein linked to the genome of turnip mosaic virus, and the translation eukaryotic initiation factor iso 4e in planta the conserved n-terminal domain of herpes simplex virus 1 ul24 protein is sufficient to induce the spatial redistribution of nucleolin m148r and m149r are two virulence factors for myxoma virus pathogenesis in the european rabbit the nucleolus under stress nucleolar trafficking is essential for nuclear export of intronless herpesvirus mrna nucleolar disruption impairs kaposi's sarcoma-associated herpesvirus orf57-mediated nuclear export of intronless viral mrnas herpesvirus saimiri orf57: a post-transcriptional regulatory protein recruitment of the complete htrex complex is required for kaposi's 
sarcoma-associated herpesvirus intronless mrna nuclear export and virus replication kaposi's sarcoma-associated herpesvirus orf57 protein interacts with pym to enhance translation of viral intronless mrnas nucleolin is required for an efficient herpes simplex virus type 1 infection hipk2: a versatile switchboard regulating the transcription machinery and cell death from top to bottom: the two faces of hipk2 for regulation of the hypoxic response nuclear localization of nucleocapsid-like particles and hcv core protein in hepatocytes of a chronically hcv-infected patient a single amino acid change in the nuclear localization sequence of the nsp2 protein affects the neurovirulence of semliki forest virus analysis of the subcellular localization of the proteins rep, rep' and cap of porcine circovirus type 1 bovine immunodeficiency virus tat gene: cloning of two distinct cdnas and identification, characterization, and immunolocalization of the tat gene products human t-cell leukemia virus type i p30 nuclear/nucleolar retention is mediated through interactions with rna and a constituent of the 60 s ribosomal subunit us11 of herpes simplex virus type 1 interacts with hipk2 and antagonizes hipk2-induced cell growth arrest nuclear and nucleolar localization of an african swine fever virus protein, i14l, that is similar to the herpes simplex virus-encoded virulence factor icp34.5 the nucleoprotein and the viral rna of infectious salmon anemia virus (isav) are localized in the nucleolus of infected cells the bovine immunodeficiency virus rev protein: identification of a novel lentiviral bipartite nuclear localization signal harboring an atypical spacer sequence cucumber mosaic virus 2b protein subcellular targets and interactions: their significance to rna silencing suppressor activity involvement of the nucleolus in replication of human viruses identification of nucleolus localization signal of betanodavirus ggnnv protein alpha characterization of the nuclear and nucleolar 
localization signals of bovine herpesvirus-1 infected cell protein 27 new regulatory mechanisms for the intracellular localization and trafficking of influenza a virus ns1 protein revealed by comparative analysis of a/pr/8/34 and a/sydney/5/97 imaging of viroids in nuclei from tomato leaf tissue by in situ hybridization and confocal laser scanning microscopy distinctions between bovine herpesvirus 1 and herpes simplex virus type 1 vp22 tegument protein subcellular associations nucleolar localization of potato leafroll virus capsid proteins relationship between adenovirus dna replication proteins and nucleolar proteins b23.1 and b23.2 direct interaction between nucleolin and hepatitis c virus ns5b brief review: the nucleolus -a gateway to viral infection? the interaction of animal cytoplasmic rna viruses with the nucleus to facilitate replication rna viruses: hijacking the dynamic nucleolus the coronavirus infectious bronchitis virus nucleoprotein localizes to the nucleolus nucleolar proteomics and viral infection the hbz-sp1 isoform of human t-cell leukemia virus type i represses junb activity by sequestration into nuclear bodies nucleolar localization of mouse mammary tumor virus proteins in t-cell lymphomas identification and characterization of the ul24 gene product of herpes simplex virus type 2 the nucleolar phosphoprotein b23 interacts with hepatitis delta antigens and modulates the hepatitis delta virus rna replication expression and processing of a small nucleolar rna from the epstein-barr virus genome a novel, mouse mammary tumor virus encoded protein with rev-like properties nucleolar accumulation of semliki forest virus nucleocapsid c protein: influence of metabolic status, cytoskeleton and receptors gltscr2/pict-1, a putative tumor suppressor gene product, induces the nucleolar targeting of the kaposi's sarcoma-associated herpesvirus ks-bcl-2 protein interaction of a plant virus-encoded protein with the major nucleolar protein fibrillarin is required 
for systemic virus infection cajal bodies and the nucleolus are required for a plant virus systemic infection electron microscopy of ribonucleic acid in nuclear particulate aggregates of hepatitis d using nuclease-gold complexes functional similarity of hiv-i rev and htlv-i rex proteins: identification of a new nucleolar-targeting signal in rev protein nucleo-cytoplasmic redistribution of the htlv-i rex protein: alterations by coexpression of the htlv-i p21x protein proteomics analysis of the nucleolus in adenovirus-infected cells immunocytology shows the presence of tobacco etch virus p3 protein in nuclear inclusions nucleolar protein upstream binding factor is sequestered into adenovirus dna replication centres during infection without affecting rna polymerase i location or ablating rrna synthesis a b23-interacting sequence as a tool to visualize protein interactions in a cellular context the nucleolin binding activity of hepatitis delta antigen is associated with nucleolus targeting adenovirus core protein vii contains distinct sequences that mediate targeting to the nucleus and nucleolus, and colocalization with human chromosomes precursor of human adenovirus core polypeptide mu targets the nucleolus and modulates the expression of e2 proteins mutations within the nuclear localization signal of the porcine reproductive and respiratory syndrome virus nucleocapsid protein attenuate virus replication the alcelaphine herpesvirus-1 orf 57 encodes a nuclear shuttling protein functional interaction and colocalization of the herpes simplex virus 1 major regulatory protein icp4 with eap, a nucleolar-ribosomal protein rna-templated replication of hepatitis delta virus: genomic and antigenomic rnas associate with different nuclear bodies capsid protein of cucumber mosaic virus accumulates in the nuclei and at the periphery of the nucleoli in infected cells nucleolar and nuclear localization properties of a herpesvirus bzip oncoprotein, meq the major tegument structural 
protein vp22 targets areas of dispersed nucleolin and marginalized chromatin during productive herpes simplex virus 1 infection identification of a rev-related protein by analysis of spliced transcripts of the human endogenous retroviruses htdv/ herv-k properties of the adenovirus iva2 gene product, an effector of late-phase-dependent activation of the major late promoter involvement of ul24 in herpes-simplex-virus-1-induced dispersal of nucleolin relocalization of upstream binding factor to viral replication compartments is ul24 independent and follows the onset of herpes simplex virus 1 dna synthesis involvement of the ul24 protein in herpes simplex virus 1-induced dispersal of b23 and in nuclear egress the products of gene us11 of herpes simplex virus type 1 are dna-binding and localize to the nucleoli of infected cells stable expression of hepatitis delta virus antigen in a eukaryotic cell line adenovirus core protein v is delivered by the invading virus to the nucleus of the infected cell and later in infection is associated with nucleoli cellular stress and nucleolar function identification of nuclear and nucleolar localization signals in the herpes simplex virus regulatory protein icp27 nuclear and nucleolar targeting of influenza a virus ns1 protein: striking differences between different virus subtypes karyophilic properties of semliki forest virus nucleocapsid protein ribozyme-mediated inhibition of hiv 1 suggests nucleolar trafficking of hiv-1 rna localization and importance of the adenovirus e4orf4 protein during lytic infection the lactate dehydrogenase-elevating virus capsid protein is a nuclear-cytoplasmic protein the protein icp0 of herpes simplex virus type 1 is targeted to nucleoli of infected cells nuclear localization of japanese encephalitis virus core protein enhances viral replication quantitative proteomic analysis of a549 cells infected with human respiratory syncytial virus functional domain structure of human t-cell leukemia virus type 2 
rex nucleolar localization of human hepatitis b virus capsid protein nucleolar targeting signal of human t-cell leukemia virus type i rex-encoded protein is essential for cytoplasmic accumulation of unspliced viral mrna identification of nucleophosmin/b23, an acidic nucleolar protein, as a stimulatory factor for in vitro replication of adenovirus dna complexed with viral basic core proteins induction of apoptosis by viruses: what role does the nucleolus play? inhibition of human immunodeficiency virus type 1 and type 2 tat function by transdominant tat protein localized to both the nucleus and cytoplasm nuclear entry and nucleolar localization of the newcastle disease virus (ndv) matrix protein occur early in infection and do not require other ndv proteins functional mapping of the porcine reproductive and respiratory syndrome virus capsid protein nuclear localization signal and its pathogenic association nuclear localization of semliki forest virus-specific nonstructural protein nsp2 the nucleolus is the site of borna disease virus rna transcription and replication the complex subcellular distribution of satellite panicum mosaic virus capsid protein reflects its multifunctional role during infection control of nuclear and nucleolar localization of nuclear inclusion protein a of picorna-like potato virus a in nicotiana species proapoptotic effect of hepatitis c virus core protein in transiently transfected cells is enhanced by nuclear localization and is dependent on pkr activation delineation and modelling of a nucleolar retention signal in the coronavirus nucleocapsid protein characterization of the nuclear export signal in the coronavirus infectious bronchitis virus nucleocapsid protein nuclear and nucleolar targeting signals of semliki forest virus nonstructural protein nsp2 nuclear targeting of semliki forest virus nsp2 intracellular transport of the murine leukemia virus during acute infection of nih 3t3 cells: nuclear import of nucleocapsid protein and 
integrase functional analysis of proteins involved in movement of the monopartite begomovirus, tomato yellow leaf curl virus genetic therapies against hiv nucleolar-cytoplasmic shuttling of prrsv nucleocapsid protein: a simple case of molecular mimicry or the complex regulation by nuclear import, nucleolar localization and nuclear export signal sequences the localisation of porcine reproductive and respiratory syndrome virus nucleocapsid protein to the nucleolus of infected cells and identification of a potential nucleolar localization signal sequence peptide domains involved in the localization of the porcine reproductive and respiratory syndrome virus nucleocapsid protein to the nucleolus structural and functional characterization of human immunodeficiency virus tat protein human endogenous retrovirus herv-k(hml-2) encodes a stable signal peptide with biological properties distinct from rec intracellular location of two groundnut rosette umbravirus proteins delivered by pvx and tmv vectors identification of a nuclear localization signal and nuclear export signal of the umbraviral long-distance rna movement protein nucleolin is required for efficient nuclear egress of herpes simplex virus type 1 nucleocapsids genome-wide screen of three herpesviruses for protein subcellular localization and alteration of pml nuclear bodies identification of the caprine arthritis encephalitis virus rev protein and its cis-acting rev-responsive element the rev protein of visna virus is localized to the nucleus of infected cells pnac: a protein nucleolar association classifier characterization of signals that dictate nuclear/nucleolar and cytoplasmic shuttling of the capsid protein of tomato leaf curl java virus associated with dna beta satellite sequence requirements for nucleolar localization of human t cell leukemia virus type i px protein, which regulates viral rna processing a viral assembly factor promotes aav2 capsid formation in the nucleolus role of phosphorylation clusters 
in the biology of the coronavirus infectious bronchitis virus nucleocapsid protein upstream-binding factor is sequestered into herpes simplex virus type 1 replication compartments nucleolin associates with the human cytomegalovirus dna polymerase accessory subunit ul44 and is necessary for efficient viral replication reversible nucleolar translocation of epstein-barr virus-encoded ebna-5 and hsp70 proteins after exposure to heat shock or cell density congestion birth of a nucleolus: the evolution of nucleolar compartments nuclear localization of non-structural protein 1 and nucleocapsid protein of equine arteritis virus identification of a new human adenovirus protein encoded by a novel late l-strand transcription unit identifying specific protein interaction partners using quantitative mass spectrometry and bead proteomes complete genome sequence and in planta subcellular localization of maize fine streak virus proteins nucleolar protein b23 interacts with japanese encephalitis virus core protein and participates in viral replication nuclear localization of flavivirus rna synthesis in infected cells relocalization of nuclear aly proteins to the cytoplasm by the tomato bushy stunt virus p19 pathogenicity protein use of a u16 snorna-containing ribozyme library to identify ribozyme targets in hiv-1 avian reovirus sigmaa localizes to the nucleolus and enters the nucleus by a nonclassical energy-and carrier-independent pathway nucleolar localization of influenza a ns1: striking differences between mammalian and avian cells viral ribonucleoprotein complex formation and nucleolarcytoplasmic relocalization of nucleolin in poliovirus-infected cells interactions of minute virus of mice and adenovirus with host nucleoli intracellular localization and determination of a nuclear localization signal of the core protein of dengue virus proteins c and ns4b of the flavivirus kunjin translocate independently into the nucleus subcellular compartmentalization of adeno-associated 
virus type 2 assembly the n-terminal domain of pmtv tgb1 movement protein is required for nucleolar localization, microtubule association, and long-distance movement localisation to the nucleolus is a common feature of coronavirus nucleoproteins and the protein may disrupt host cell division molecular anatomy of subcellular localization of hsv-1 tegument protein us11 in living cells nucleolar localization of the ul3 protein of herpes simplex virus type 2 colocalization and interaction of the porcine arterivirus nucleocapsid protein with the small nucleolar rna-associated protein fibrillarin subcellular localization of the severe acute respiratory syndrome coronavirus nucleocapsid protein a model for the dynamic nuclear/ nucleolar/cytoplasmic trafficking of the porcine reproductive and respiratory syndrome virus (prrsv) nucleocapsid protein based on live cell imaging g0/g1 arrest and apoptosis induced by sars-cov 3b protein in transfected cells intracellular localization of the ul31 protein of herpes simplex virus type 2 acknowledgements dam and jah would like to acknowledge their co-workers and collaborators over the years for developing viral interactions with the nucleolus. dam's research on the nucleolus is supported by the wellcome trust and jah's by the bbsrc and a leverhulme trust research fellowship. ee is supported by a bbsrc astbury dtg studentship. key: cord-023740-g84fa45m authors: oldstone, michael b.a.; schwimmbeck, peter; dyrberg, thomas; fujinami, robert title: mimicry by virus of host molecules: implications for autoimmune disease date: 2014-06-27 journal: progress in immunology doi: 10.1016/b978-0-12-174685-8.50079-2 sha: doc_id: 23740 cord_uid: g84fa45m molecular mimicry defines the shared identity of molecules from disparate genes or proteins. thus, although their origins are as separate as a virus and the self-determinant of a human or lower animal, two molecules' linear amino acid sequences or their conformational fits may be shared. 
such molecular homologies between proteins occur frequently and likely play roles in the processing of viral proteins inside cells. the homologies shared between viruses and host cytoskeletal proteins likely indicate that shared determinants on cell linker proteins guided viral proteins along highways and stop points inside cells. most importantly, these unexpected cross-reactivities have broad and major implications for understanding autoimmune disease. molecular mimicry is detected either by using humoral or cellular immune components that cross-react with two presumably unrelated protein structures, or by computer searches to match descriptions of proteins in storage banks. the use of both these approaches to define molecular mimicry and establish its potential role in autoimmune disease is the topic of this chapter. by molecular mimicry we mean the shared identity of molecules from disparate genes or proteins. thus, although their origins are as separate as a virus and the self-determinant of a human or lower animal, two molecules' linear amino acid sequences or their conformational fits may be shared. because comparisons at the nucleic acid level can give false hybridization signals, for example from guanine-cytosine-rich sequences or from introns that are destined to be spliced away, we have focused on molecular mimicry at the protein level. such molecular homologies between proteins occur frequently and have broad and major implications for understanding autoimmune disease. further, such homologies likely play roles in the processing of viral proteins inside cells. this realization came from repeated observations of homologies shared between viruses and host cytoskeletal proteins. this phenomenon led dales et al. (1983) to hypothesize that shared determinants on cell linker proteins guided viral proteins along highways and stop points inside cells.
additionally, these unexpected cross-reactivities attendant to mimicry warrant cautious use of reagents in diagnostic virology, microbiology laboratories, and synthetic vaccines, even though these materials originated from hybridomas or from animals immunized with predetermined (peptide) amino acid sequences. molecular mimicry is detected either by using humoral or cellular immune components that cross-react with two presumably unrelated protein structures, or by computer searches to match descriptions of proteins in storage banks. the use of both these approaches to define molecular mimicry and establish its potential role in autoimmune disease is presented in this chapter. several independent reports appeared in the early 1980s noting molecular mimicry between the viral antigen sv40-t and host cell proteins (lane and hoeffler, 1980), between measles virus phosphoprotein and the cytoskeleton component keratin, and between a herpes simplex glycoprotein of 140k and an epitope on keratin separate from that recognized by the measles virus phosphoprotein. to better determine the frequency of molecular mimicry, srinivasappa and his colleagues at the nih acquired from many laboratories over 600 monoclonal antibodies raised against viral polypeptides (srinivasappa et al., 1986). these investigators then mapped the incidence of cross-reactivity of these monoclonals with host proteins expressed on a large panel of normal tissues (table i). in the analysis were antibodies against 11 different viruses including dna and rna viruses known to cause human infection from the herpes virus group, vaccinia virus, myxoviruses, paramyxoviruses, arenaviruses, flaviviruses, alphaviruses, rhabdoviruses, and coronaviruses. the results indicated that over 3.5% of such monoclonals cross-reacted with host cell determinants expressed on uninfected tissues.
these determinants occurred at a single site or in widely diverse places located in a wide panel of cells found in the nervous system, endocrine system, immune system, gut, heart, and muscle (fig. 1). these and our other observations indicated that certain monoclonal antibodies stained restricted subsets of neurons in selected areas of the nervous system or unique subsets of lymphocytes within a defined functional lymphocyte class. the above data indicate common cross-reactivity at the monoclonal level between viral protein and host self-proteins. in this situation, antibodies to hormones, lymphocyte subsets, or cells of the nervous system, etc., can develop as a consequence of virus infection, with all the inherent potential for participating in disease.

note to table i: listed are the analyses of over 600 monoclonal antibodies done by srinivasappa et al. (1986). monoclonal antibodies against 11 different viruses including dna and rna viruses known to cause human infection from the herpes virus group, vaccinia virus, myxoviruses, paramyxoviruses, arenaviruses, flaviviruses, alphaviruses, rhabdoviruses, and coronaviruses cross-react with host cell determinants expressed on uninfected tissues. twenty-two such monoclonal antibodies, or 3.5%, provide evidence of molecular mimicry.

the cross-reactivity between viruses and particular tissues offers some interesting insights into the association of virus infections with specific diseases. for example, coxsackie b virus has been found in individuals with myocarditis, an inflammatory disease of the heart muscle. one of the monoclonal antibodies described by srinivasappa et al. (1986) was directed against the neutralizing domain of coxsackie virus but also interacted with heart muscle. equally intriguing was the recent finding by fujinami and powell (1986) of a link between theiler's virus and the demyelinating disease it causes.
in this instance, a monoclonal antibody directed against the major neutralizing domain of theiler's virus also reacted with galactocerebroside, the main component on the surface of oligodendrocytes. because oligodendrocytes are cells that make the myelin lamella that wraps around axons, their destruction leads to demyelination. interestingly, inoculation of this monoclonal antibody into the sciatic nerve results in several fingerprints of demyelination. these and other examples (reviewed in oldstone and notkins, 1986) suggest a mechanism whereby immune reactants directed against a viral or microbial component may cross-react with a host component and generate autoimmune disease.

fig. 1 (see table i). note that these monoclonal antiviral antibodies cross-react with one or more groups of uninfected cells representative of the nervous system, endocrine system, immune system, gut, heart, and muscle. see srinivasappa et al. (1986) for experimental data.

a second immunopathologic sequela associated with molecular mimicry is the formation and trapping of immune complexes (dales et al., 1983; oldstone and notkins, 1986). in this instance, antibodies induced against proteins of the infecting virus, but cross-reactive with host proteins such as cytoskeletal or other self-proteins released into fluids during normal cell turnover or during the enhanced turnover and lysis occurring in virus infection, form complexes with antigen in the circulation. these complexes may become trapped in vessels with fenestrated endothelial linings such as the renal glomeruli, small arteries and capillaries, and the choroid plexus. here, they accumulate to set in motion the events of immune complex disease. such events may well account for the high incidence of antiself (myosin, actin, smooth muscle, nucleic acids, etc.) antibodies produced during virus infection and the formation of immune complexes during such infections (reviewed in oldstone and notkins, 1986).
the analysis discussed above indicates that 3-4% of monoclonal antibodies generated against specific virus determinants also bind to host "self"-determinants. other experiments have established that a minimum of five to six peptides is required for the induction of monoclonal antibodies (wilson et al., 1984). since, on the basis of antibody cross-reactivity, many viruses share antigenic sites with normal host cell components, the next step was to look for cross-reactive capability in eliciting autoimmunity and related disease. this was approached by using a computer-assisted search of the dayhoff data bank. after examining the 2511 amino acid sequences, comprising 470,158 residues, deposited in the protein data bank to look for overlapping peptides, homologies were noted among 2469 hexomers, 186 septomers, and 17 octomers; however, these may not have been in tandem. since the probability that six residues in a row are identical between two proteins, if each of the 20 amino acids occurs at a random and equal frequency, is 1 in 20^6, or one in 64 million, it is unlikely that such a homology would occur by chance. to provide evidence that molecular mimicry could cause autoimmunity we chose to study the encephalitogenic site of myelin basic protein and the disease experimental allergic encephalomyelitis (eae). the entire amino acid sequence of myelin basic protein is known, and its encephalitogenic sites have been mapped in several animal species (hashim, 1978). computer-assisted analysis located several viral proteins in the dayhoff files that have significant homology with the encephalitogenic site of myelin basic protein (hashim, 1978); these are the nucleoprotein and hemagglutinin of influenza virus, coat protein of polyoma virus, core protein of adenovirus, polyprotein of poliomyelitis virus, ec-lf2 protein of epstein-barr virus, rabies virus glycoprotein, measles virus nucleoprotein, and hepatitis b virus polymerase.
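the chance estimate quoted above (six identical residues in a row, 20 equally frequent amino acids, one in 64 million) and the idea of a computer search for shared hexamers can be illustrated with a minimal sketch. it is not the search program used by the authors: the function names and the two toy one-letter sequences are hypothetical, and the uniform, independent residue model is the same simplifying assumption made in the text.

```python
from fractions import Fraction

AMINO_ACIDS = 20  # number of standard amino acids assumed equally likely


def chance_of_identical_run(length: int, alphabet: int = AMINO_ACIDS) -> Fraction:
    """Probability that `length` aligned positions all match by chance,
    assuming independent residues drawn uniformly from `alphabet` types."""
    return Fraction(1, alphabet ** length)


def shared_kmers(seq_a: str, seq_b: str, k: int = 6) -> set:
    """Return every length-k substring that occurs in both sequences."""
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return kmers_a & kmers_b


if __name__ == "__main__":
    # 20**6 = 64,000,000, matching the "one in 64 million" figure.
    print(chance_of_identical_run(6))
    # Toy sequences (hypothetical, for illustration only).
    print(shared_kmers("AAYGSLPQKK", "CCYGSLPQEE", k=6))
```

real residue frequencies are not uniform, so the true chance of a shared hexamer is somewhat higher than this idealized figure; the sketch only reproduces the arithmetic stated in the text.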
however, of the banked sequences, the myelin basic protein encephalitogenic site in the rabbit fit best with hepatitis b virus polymerase (hbvp):

residues 66-75, encephalitogenic site, rabbit myelin basic protein: thr-thr-his-tyr-gly-ser-leu-pro-gln-lys
residues 589-598, hbvp: ile-gly-cys-tyr-gly-ser-leu-pro-gln-glu

as seen in fig. 2, immune responses, both humoral and cellular, generated in rabbits inoculated with the dexomer viral peptide reacted with myelin basic protein. further, inoculation of the hbvp peptide into rabbits caused perivascular infiltration localized to the central nervous system, reminiscent of the disease induced by inoculation of either whole myelin basic protein or the encephalitogenic site of myelin basic protein (fig. 2). this outcome provided the first evidence for the potential of molecular mimicry to cause both autoimmune responses and autoimmune disease (fujinami and oldstone, 1985). conceptually, molecular mimicry can produce autoimmunity when virus and host determinants are sufficiently similar to induce a cross-reactive response, yet different enough to break immunologic tolerance. such principles for molecular mimicry undoubtedly follow those mapped for the induction and breaking of tolerance at both the b and t lymphocyte levels by heterologous serum protein (weigle, 1980). with current technology allowing cloning and rapid sequencing of genes, and utilizing nucleic acid sequences of the open reading frame to determine the protein sequences encoded by a gene, more data on viral polypeptides will soon be available. such information with respect to homologies between microbial agents and the acetylcholine receptor, insulin receptor, and/or encephalitogenic receptor will likely show similarities and should improve understanding of the postinfectious encephalopathies and demyelination following virus infections, potential causes of myasthenia gravis and perhaps endocrine disorders such as diabetes.
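the two decapeptides quoted above (rabbit myelin basic protein residues 66-75 and hbvp residues 589-598) can be compared position by position. the following is a minimal sketch with hypothetical helper names; it only counts exact identities in the ungapped alignment given in the text and does not attempt any similarity scoring.

```python
# Sequences exactly as quoted in the text, in three-letter code.
MBP_RABBIT_66_75 = "thr-thr-his-tyr-gly-ser-leu-pro-gln-lys".split("-")
HBVP_589_598 = "ile-gly-cys-tyr-gly-ser-leu-pro-gln-glu".split("-")


def identical_positions(a, b):
    """Return the 0-based indices at which two equal-length peptides match."""
    if len(a) != len(b):
        raise ValueError("peptides must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x == y]


def longest_identical_run(a, b):
    """Length of the longest stretch of consecutive identical positions."""
    best = current = 0
    for x, y in zip(a, b):
        current = current + 1 if x == y else 0
        best = max(best, current)
    return best


if __name__ == "__main__":
    matches = identical_positions(MBP_RABBIT_66_75, HBVP_589_598)
    print(len(matches), "of", len(MBP_RABBIT_66_75), "positions identical")
    print("shared stretch:", "-".join(MBP_RABBIT_66_75[i] for i in matches))
    print("longest run:", longest_identical_run(MBP_RABBIT_66_75, HBVP_589_598))
```

under this simple count the two peptides share the contiguous stretch tyr-gly-ser-leu-pro-gln, that is, six consecutive identical residues out of ten.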
important will be the recognition that unless homology and the subsequent immunologic cross-reactivity involve a host protein that precipitates disease, e.g., the restricted encephalitogenic site of myelin rather than multiple other sites of myelin basic protein, disease will be unlikely to follow despite an autoimmune response. we have uncovered several other interesting examples of molecular mimicry, and these are currently under study in our laboratory. included are mimicry between viruses and the α-chain of the human acetylcholine receptor (fig. 3) and between a number of microbial agents and the human major histocompatibility (mhc) marker hla b.27 (fig. 4). in the first instance, 10% of serum samples tested from myasthenic patients bound to the selected amino acid sequences for the α-chain of the acetylcholine receptor depicted in fig. 3. affinity purification of such antibodies is underway, and their ability to cause depolarization of the receptor is being evaluated. similarly, sera from hla b.27 patients with and without ankylosing spondylitis are being studied for their cross-reactivity with amino acid sequences from several microbial agents, depicted in fig. 4. other interesting molecular mimicries under study are those between myelin and aids virus and between sequences representing other mhc and viruses. the most likely mechanism by which molecular mimicry would cause disease is by eliciting an immune response against a determinant shared between the host and the virus to bring forth a tissue-specific immune response, presumably capable of destroying cells and eventually the tissue. the probable mechanism is the generation of cytotoxic cross-reactive effector lymphocytes and/or antibodies that recognize "self-determinants" localized on target cells. interestingly, the induction of cross-reactivity would not require a replicating agent, and the immunologically mediated injury could occur after removal of the immunogen: a hit-and-run event.
by such a mechanism, the microbial infection that initiates the autoimmune phenomenon need not be present at the time overt disease develops. the likely picture would be that the virus responsible for inducing a cross-reactive immune response is cleared initially, but the components of that immunity continue to assault host elements. this cycle continues as the autoimmune response itself leads to tissue injury that, in turn, releases more self-antigen, thereby inducing more antibodies, and so on. such a sequence of events would render the isolation or identification of an initiating infectious agent difficult or impossible. indeed, such events likely occur with the encephalopathies that follow measles, mumps, vaccinia, or herpes zoster virus infections of humans. in such postinfectious diseases, the infected host develops encephalitis, frequently associated with such other symptoms of autoimmune disease as rash and pain in the joints and skeletal muscles. recovery of the viral agent at this time is exceedingly rare, although the agent has been recovered easily several days earlier.
figure caption (fragment): when the viral amino acid sequences were inoculated into rabbits and then tested for their binding to the α-chain of the acetylcholine receptor, note that herpes simplex virus (hsv) glycoprotein d residues 286 to 293 showed a higher degree of cross-reactivity than those of polyoma virus, which showed a greater degree of sequence homology. specificity of binding was checked by quantitative blocking experiments. analysis of sera from over 40 patients with myasthenia gravis indicated that their antibodies bound to the acetylcholine receptor sequence shown. see dyrberg and oldstone (1986) for details.
this link between molecular mimicry and host proteins is further supported by studies showing that, after several types of acute virus infections, mononuclear cells from peripheral blood or cerebral spinal fluid proliferate in response to host antigens, one of which is myelin basic protein. further, several clonal populations of lymphocytes have been harvested from the central nervous system fluid of patients with encephalitis, and these lymphocytes proliferate when incubated with the infecting virus, its antigens, or nervous system antigens. these and other issues are likely to be studied increasingly to provide insights as to both potential etiologic agents and pathogenic mechanisms responsible for a variety of autoimmune disorders of man. proc. natl. acad. sei concepts in viral pathogenesis this is publication number 4393-imm from the department of immunology, scripps clinic and research foundation, la jolla, ca 92037. this work was supported in part by usphs grants ai-07007 and ag-04342 and nms jf2009 from the national multiple sclerosis society. p.s. is the recipient of a fellowship from the deutsche forschungsgemeinschaft (dfg). r.s.f. is a harry weaver scholar of the national multiple sclerosis society. key: cord-003070-6oca1mrm authors: shen, wen-jun; cui, wenjuan; chen, danze; zhang, jieming; xu, jianzhen title: rpirls: quantitative predictions of rna interacting with any protein of known sequence date: 2018-02-28 journal: molecules doi: 10.3390/molecules23030540 sha: doc_id: 3070 cord_uid: 6oca1mrm rna-protein interactions (rpis) have critical roles in numerous fundamental biological processes, such as post-transcriptional gene regulation, viral assembly, cellular defence and protein synthesis. as the number of available rna-protein binding experimental data has increased rapidly due to high-throughput sequencing methods, it is now possible to measure and understand rna-protein interactions by computational methods. 
in this study, we integrate a sequence-based derived kernel with regularized least squares to perform prediction. the derived kernel exploits the contextual information around an amino acid or a nucleic acid as well as the repetitive conserved motif information. we propose a novel machine learning method, called rpirls, to predict the interaction between any rna and protein of known sequences. for the rpirls classifier, each protein sequence comprises up to 20 diverse amino acids, but for the rpirls-7g classifier, each protein sequence is represented using a 7-letter reduced alphabet based on physicochemical properties. we evaluated both methods on a number of benchmark data sets and compared their performances with two recently developed state-of-the-art methods, rpi-pred and ipminer. on the non-redundant benchmark test sets extracted from the pridb, the rpirls method outperformed rpi-pred and ipminer in terms of accuracy, specificity and sensitivity. further, rpirls achieved an accuracy of 92% on the prediction of lncrna-protein interactions. the proposed method can also be extended to construct rna-protein interaction networks. the rpirls web server is freely available at http://bmc.med.stu.edu.cn/rpirls. the interactions of proteins with other proteins, peptides, dnas and rnas govern most essential molecular functions. rna-protein interactions (rpis) have a critical influence on post-transcriptional gene regulation [1] [2] [3] , viral assembly [4] [5] [6] , cellular defence [7] , protein synthesis [8, 9] and various other fundamental biological processes [10, 11] . a significant portion of transcripts are long non-coding rnas (lncrnas), which are not translated into proteins and are longer than 200 nucleotides [12] . lncrnas normally function with their interacting proteins [13] .
for instance, the lncrna hotair regulated the hoxd locus in trans by interacting with pcg proteins [14] ; several lncrnas were shown to be able to interact with auf1, a protein linked to aging and cancer [15] ; lncrnas binding to jarid2 protein were essential for the recruitment of prc2 to the chromatin [16] ; lncrna gas5 inhibited hepatitis c virus replication by decoying hcv ns3 protein [17] . hence, the study of rpis is essential for understanding their functions. compared to those of protein-protein interactions and dna-protein interactions, current knowledge regarding rna-protein interactions, especially lncrna-protein interactions, is still limited. in this study, we propose a novel machine learning method, which we call rna-protein interaction prediction based on regularized least squares (rpirls), to quantitatively predict potential rna-protein interactions. the experimental determination of rpis remains expensive and time-consuming [18] [19] [20] , but fortunately, the accumulated rpi experimental data facilitate the development of computational models for rpi prediction [21] [22] [23] . in 2011, pancaldi and bähler [24] introduced a computational approach for rbp (rna binding protein)-mrna interaction prediction. they employed support vector machines (svms) and random forests (rfs) based on more than 100 physical and functional features of rpis, including gene ontology, chromosomal position, gene and protein physical properties, protein localization, experimental translation, mrna properties, predicted protein structure, utr properties and genetic interactions. bellucci et al. [25] proposed a method called catrapid for the prediction of protein-lncrna interactions. they evaluated the interaction propensities of protein-rna pairs based on their physicochemical properties, including secondary structure, hydrogen bonding and van der waals interactions. muppirala et al. [26] developed a method called rpiseq, which predicted rpis solely based on primary sequences.
the rpiseq method still employed svms and rfs but exploited different features. they represented each sequence of proteins and rnas as the normalized frequencies of the corresponding 3-mer and 4-mer, respectively. in 2013, based on the same feature vectors presented in muppirala et al., wang et al. [27] first reduced the dimensionality of feature vectors, and then performed the rpis prediction by using naive bayes classifier which assumed the independence of attributes and by using extended naive bayes classifier which considered the correlation between attributes. lu et al. [28] integrated the information on the secondary structure, hydrogen bonding propensities and van der waals of lncrnas and proteins with fisher's linear discriminant model. in 2015, suresh et al. [29] developed a method called rpi-pred to predict rpis by considering the high-order 3d protein and rna structure information. in 2016, pan et al. [30] proposed a new method named ipminer that integrated deep learning with stacked ensembling to improve the prediction performance of ncrna-protein interactions. in this paper, we classified rna-protein pairs as interacting or non-interacting by integrating derived kernel with regularized least squares (rls) [31] . the motivation is to relate the sequence information of proteins and rnas to their biological functions, i.e., interactions. our method attempted to extract discriminant subsequence features from amino acid sequences and nucleotide sequences. the derived kernel measures the similarity between two biological sequences by capturing nucleic acid or amino acid compositions and repetitive sequence patterns. we used regularized least squares in learning as the computations performed by rls algorithms can be expressed using just inner products, hence allowing efficient implementation of kernel-based learning, in addition the rls algorithms often perform comparable to the best batch classifiers [32] . 
since the dimensionality of the feature space increases exponentially with the template size, we set an upper limit on the template size for computational tractability. in addition, we categorized the 20 amino acids into several groups based on their physicochemical properties [33] [34] [35] ; the reduced alphabet representation of the protein sequence allows a larger template size and also decreases the dimensionality of the feature space. we considered the derived kernel with a two-layer architecture, hence two template sets, denoted $T_P$ and $T_R$, needed to be constructed for protein and rna, respectively. here each template set consisted of all possible substrings of the same length. the template set $T_P$ for amino acid sequences was composed of substrings of $k$ contiguous amino acids, while the template set $T_R$ for nucleic acid sequences was composed of substrings of $l$ contiguous nucleic acids. in order to extract discriminant subsequence features from amino acid sequences and nucleotide sequences, we explored the effect on rpi prediction of a range of choices for the template sizes of protein and rna. the training set rpi2662 was used to determine these parameters. in our case, the protein and rna template sizes were chosen from $k \in \{1, 2, \dots, 4\}$ and $l \in \{1, 2, \dots, 8\}$ for rpirls, and from $k \in \{1, 2, \dots, 6\}$ and $l \in \{1, 2, \dots, 8\}$ for rpirls-7g. for each combination of protein template size and rna template size, the combined kernel $\hat{K}_2^{DK}$ was integrated with rls to predict rna-protein interactions. ten-fold stratified cross-validation has been reported to be the best method for model selection in large-scale experiments [36] ; therefore, on the data set rpi2662, we tuned the parameter $\lambda$ by ten-fold stratified cross-validation over the candidate set $\{\lambda = e^{n} : n = -15, \dots, 15\}$. the data set rpi2662 was divided into ten mutually exclusive folds such that the mean response of each fold was approximately equal.
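the search space described above can be sketched in a few lines of python. this is only an illustrative sketch, not the authors' code: the function names `lambda_grid` and `stratified_folds`, the round-robin fold assignment and the random seed are assumptions made for the example. it enumerates the rpirls template-size grid and the candidate values $\lambda = e^{n}$, and builds ten stratified folds for a balanced set of 2662 labels.

```python
import itertools
import math
import random
from collections import defaultdict


def lambda_grid(lo=-15, hi=15):
    """Candidate regularization values lambda = e^n for n = lo, ..., hi."""
    return [math.exp(n) for n in range(lo, hi + 1)]


def stratified_folds(labels, n_folds=10, seed=0):
    """Split sample indices into n_folds disjoint folds with similar class balance."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for members in by_class.values():
        rng.shuffle(members)
        # Deal the members of each class round-robin across the folds.
        for pos, idx in enumerate(members):
            folds[pos % n_folds].append(idx)
    return folds


# Template sizes explored for rpirls: protein k in 1..4, rna l in 1..8.
template_grid = list(itertools.product(range(1, 5), range(1, 9)))
grid = lambda_grid()

# Balanced toy labels mirroring rpi2662: 1331 interacting, 1331 non-interacting.
labels = [1] * 1331 + [-1] * 1331
folds = stratified_folds(labels)
```

because each class is dealt out separately, the mean label of every fold is the same, which matches the requirement that the mean response of each fold be approximately equal.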
in each test we merged 9 parts of the samples as the training set and left the other part as the test set. the parameter $\lambda$ was chosen by leave-one-out cross-validation on the training set. for rpirls, in all ten sets, $\lambda = e^{-2}$ uniformly achieved the best performance on the training data. tables 1 and 2 show the performance of the proposed method in terms of auc and accuracy, respectively, with different combinations of the parameters k and l. the experimental results showed that when the protein template size k = 2 and the rna template size l = 5, the model performed best, with an auc score of 0.926 and an accuracy of 0.830. the other measurements of specificity (sp) and sensitivity (se) were 0.771 and 0.890, respectively. for rpirls-7g, $\lambda = e^{-3}$ uniformly performed best in all ten sets. table 3 shows that the method achieved the best prediction accuracy of 0.823 when the protein template size k = 3 and the rna template size l = 4. the other measurements (auc, sp and se) were observed as 0.902, 0.761 and 0.884, respectively. the computational results showed that the rpirls classifier outperformed the rpirls-7g classifier in terms of various performance measurements, indicating that the diversity of amino acids in a sequence is important for the prediction of rpis.
notes to table 1 (auc; table body not recovered): the performance of predicting rpis was evaluated by using 10-fold stratified cross-validation on the rpi2662 data set. different combinations of parameters k and l were evaluated. k stands for the template size of amino acid sequences; l stands for the template size of nucleic acid sequences. the best auc in the table is marked in bold.
notes to table 2 (accuracy; table body not recovered): k stands for the template size of amino acid sequences; l stands for the template size of nucleic acid sequences. the best accuracy in the table is marked in bold.
in order to evaluate the reliability and robustness of rpirls and rpirls-7g, we compared them with two other state-of-the-art methods, rpi-pred and ipminer. the rpi2241 and rpi369 data sets were evaluated after removing rpis that overlapped with the training data. both the rpirls and rpirls-7g classifiers were trained on the rpi2662 data set and tested on the rpi2241 and rpi369 data sets. as shown in tables 4 and 5, rpirls outperformed the rpirls-7g, rpi-pred and ipminer methods on both data sets. for the rpi369 data set, as shown in table 4, the performance of the rpirls method was 0.85, 0.92, 0.84 and 0.86 for predictive accuracy, auc, specificity and sensitivity, respectively. in contrast, the predictive accuracies of the rpi-pred and ipminer methods were just 0.49 and 0.5, respectively, much lower than that of rpirls. the remaining measurements (specificity and sensitivity) were observed as 0.34 and 0.63, respectively, for rpi-pred, and 0.52 and 0.48, respectively, for ipminer. the rpirls method outperformed rpi-pred and ipminer in terms of accuracy, specificity and sensitivity on the rpi369 data set. similar results were observed on the rpi2241 data set in table 5. the specificity of the rpi-pred and ipminer methods was just 0.38 and 0.20, respectively, indicating that there was a positive bias in their predictions.
table 3 (table body not recovered). predictive performance of rpirls-7g in terms of accuracy on the rpi2662 training data set over varying template sizes. the performance of predicting rpis was evaluated by using 10-fold stratified cross-validation on the rpi2662 data set. different combinations of parameters k and l were evaluated. k stands for the template size of amino acid sequences; l stands for the template size of nucleic acid sequences. the best accuracy in the table is marked in bold.
a low specificity increases the labor, cost, and time needed to perform the required experimental tests, but our rpirls method achieved both reasonable specificity and sensitivity. furthermore, we evaluated the rpirls classifier on large-scale rna-protein pairs in the currently available rpintdb data base. the rpirls method correctly predicted 35980 out of 43010 rpis, reaching a predictive accuracy of 84%. to explore the effectiveness of the proposed method in predicting ncrna-protein interactions, a large-scale ncrna-protein interaction data set (which we called nrpi13153) was retrieved from the npinter data base [37] . we trained rpirls and rpirls-7g on the rpi2662 data set and tested them on nrpi13153. table 6 shows the prediction results compared with the rpi-pred classifier on the nrpi13153 data set. the ipminer method showed a significantly positive bias in predicting ncrna-protein interactions, so we did not include it in the comparison. the predictive accuracies for the different organisms were computed separately. among the six organisms, our method rpirls performed best for homo sapiens and saccharomyces cerevisiae, rpi-pred performed best for drosophila melanogaster, escherichia coli and mus musculus, and both methods obtained the same predictive accuracy for caenorhabditis elegans. rpirls outperformed rpirls-7g over all six organisms. for the 13153 ncrna-protein pairs, the rpirls method achieved an accuracy of 91%, compared to 76% for rpirls-7g and 88% for the rpi-pred method. we further tested rpirls and rpirls-7g on the lnrpi12114 data set, which was a subset of the nrpi13153 data set and consisted of only lncrna-protein interactions (lncrpis). our model achieved an overall accuracy of 92%, compared to 77% for rpirls-7g and 89% for the rpi-pred classifier, as shown in table 7. the predictive performance of rpirls was improved in 5 out of 6 organisms compared with its performance on the nrpi13153 data set.
the results indicated the effectiveness of our method in predicting lncrna-protein interactions using only the primary sequences of proteins and rnas. predicting lncrna-protein interaction networks is useful for exploring the molecular mechanisms that are regulated by lncrnas [38, 39] . in this experiment, we evaluated the performance of rpirls in building lncrpi networks and further compared its performance with rpi-pred. on the basis of the data in npinter, we analyzed the results for four organisms, i.e., caenorhabditis elegans, drosophila melanogaster, escherichia coli and saccharomyces cerevisiae, consisting of 4, 61, 78 and 437 lncrpis, respectively. for caenorhabditis elegans, the rpirls method correctly identified all 4 lncrpis (blue edges), while the rpi-pred method correctly identified 3 out of 4. as shown in figure 1, rpi-pred made an incorrect prediction for the pair n342950-g5ebf5 (red edges). in figure 2, rpi-pred correctly predicted all 61 lncrpis, whereas rpirls missed 7 out of 61 lncrpis for drosophila melanogaster. six of these seven incorrect predictions were observed between two proteins, p49963 and q9vss2 (yellow rectangles), and three signal recognition particle (srp) rnas, n5330, n5333 and n389 (green ellipses), which form the srp rna-protein interactions. for escherichia coli, the rpirls classifier made many more errors than the rpi-pred method, with predictive accuracies of 45% vs. 86%. the performance of rpirls for escherichia coli was much poorer than that for the other five organisms. in order to analyze why rpirls had relatively poor performance on escherichia coli, we compared the amino acid composition of escherichia coli with that of the other five organisms. as shown in figure 3, we found that escherichia coli had relatively higher observed frequencies of alanine and valine as well as a much lower content of serine compared with the other five organisms.
the amino acid composition bias in escherichia coli probably led to the poor results. as shown in figure 4, among 43 incorrectly predicted pairs, 39 rpis corresponded to 7 protein hubs, i.e., p0a6h1, p0afz3, p0ag67, p0ce47, p0ce48, p21499 and p77398, each of which appears as a yellow rectangle node and was shown to interact with six transfer-messenger rnas (tmrnas), i.e., n3828, n1877, n5000, n435, n329 and n4292 (green ellipses). for saccharomyces cerevisiae, as shown in figure 5, among 27 incorrectly predicted pairs, 10 rpis were involved in 3 protein hubs (p57743, q03338 and q06819), in which each protein interacted with 4 small nuclear rnas (snrnas: n4610, n6134, n4606 and n6136), and another 13 rpis corresponded to 7 protein hubs (p15646, p47083, p53941, q04217, q04500, q08492 and q12136), each of which interacted with three small nucleolar rnas (snornas: n5819, n4618 and n6159). the rpirls classifier correctly identified 410 out of 437 rpis, achieving a high accuracy of 94%, compared with 87% (379 out of 437 pairs correctly predicted) for rpi-pred. in this work, we illustrated the effectiveness and reliability of rpirls in predicting rpis for eukaryotic organisms in networks which comprised a variety of protein hubs and rna hubs. mammalian cells contain more than 1000 different proteins interacting with rna [3] . normally, any individual rna can interact with multiple proteins [11, 40] . conversely, most proteins are capable of interacting with multiple rnas [41] . given the number of rnas and rna-binding proteins, the number of possible rpis is enormous. high-throughput sequencing methods have accumulated a huge amount of rna-protein binding experimental data and opened new possibilities to measure and understand rna-protein interactions by computational methods. most of the previous computational works on rpis focus on the prediction of rna-binding proteins or rna-binding residues in a protein sequence [34, 42-44] .
to our knowledge, very limited works have been developed to predict the specific associations between rnas and proteins, which play a critical role in post-transcriptional gene regulatory networks. complex networks of rpis mediate post-transcriptional gene regulation and therefore prediction of rpis helps us to gain insight into regulatory networks [45, 46] . the work presented here provided a computational method, called rpirls, to classify rna-protein pairs as interacting or non-interacting by integrating a sequence-based derived kernel with regularized least squares. the derived kernel exploited the contextual information around an amino acid or a nucleic acid as well as the repetitive conserved motif information. our results demonstrated that only the sequence structures of rnas and proteins provide sufficient information to accurately predict rna-protein interactions, especially long non-coding rna-protein interactions. specifically, the rpirls classifier considered each protein sequence comprising up to 20 diverse amino acids, while the rpirls-7g classifier encoded each protein sequence by using the 7-letter reduced alphabets according to amino acid physiochemical properties. the computational results showed that the rpirls classifier was superior to the rpirls-7g classifier in reliability and effectiveness, indicating that the diversity of amino acids at a sequence has critical impact on the function of rna-binding proteins. on two non-redundant benchmark data sets extracted from the pridb, the rpirls method outperformed rpi-pred and ipminer in terms of accuracy, specificity and sensitivity. compared with rpi-pred and ipminer, the rpirls method obtained a reasonable sensitivity at a lower false positive rate. further, rpirls achieved an accuracy of 92% compared to 89% for rpi-pred on the prediction of lncrna-protein interactions. 
the rpirls method can be extended to construct rna-protein interaction networks and therefore helps us to gain insights into post-transcriptional gene regulation. the good performance of the proposed method may be due to several factors. firstly, the use of similarity scores is a significant conceptual change in protein/rna evaluation, quantifying the overall similarity between proteins, rnas and their interactions. combining kernels by tensor product over the set of rna-protein pairs allows information to be shared across the rna-binding proteins, which considerably improved the prediction, especially in the case of rnas with few known binding proteins. secondly, we have found that contiguous k-mer frequencies alone captured rich statistical information on the repetitive conserved motifs of rna-protein pairs, and the diversity of amino acids in a sequence also contributed to the observed improvement, in contrast to rpi-pred, which applied only 1-letter frequencies for both protein and rna. finally, a kernel works as a measure of similarity and supports the application of powerful machine learning algorithms such as regularized least squares, which we used in this paper. the rls method enables us to efficiently search for an optimized parameter λ at essentially no additional cost [31] . further, our model was trained on a large data set which contained 2662 rna-protein pairs, and yielded more robust results. in contrast, ipminer, which combines deep learning with stacked ensembling, had many more model parameters to fit but was trained on a small data set of just 488 rna-protein pairs, and thus showed a significantly positive bias in predicting ncrna-protein interactions. the main disadvantage of the proposed method is that it is purely data-driven, in the sense that it relies solely on information derived from amino acid sequences and nucleic acid sequences, and thus does not consider higher-order structural information of protein and rna.
however, this may also be seen as an advantage, since the method can make predictions for any rna-protein pair of known sequences. the increase in the number of protein-rna complexes in the protein data bank [47] has opened possibilities for researchers to develop secondary data bases and to gain valuable insight into the structure and function of these complexes. the protein-rna interface data base (pridb) v2.0 [48] identifies interfacial residues in rna-protein complexes and also calculates atomic distances between interfacial residues. rb344 and rb1179 are two precalculated data sets in the pridb, which consist of 344 and 1179 rna-binding protein chains, respectively. after combining the rb344 and rb1179 data sets, we obtained a total of 1750 experimentally validated non-redundant rna-protein pairs, in which at least one atom from the rna and one atom from the protein were no more than 4 Å apart. next, we removed redundant rna-protein pairs, i.e., the same protein chains interacting with the same rna chains. further, we removed those rna-protein pairs with amino acid sequence length < 25 or nucleic acid sequence length < 15. finally, we obtained a positive sample set which consisted of 1331 experimentally validated rna-protein pairs. so far, no definite negative samples of rna-protein interactions are available. to construct a balanced negative sample set ("rna-protein non-interacting pairs"), we randomly permuted the proteins in the positive sample set while keeping the rnas fixed. we repeated the permutation process until no negative pair appeared in the positive sample set. as a result, the training set, called rpi2662, was composed of 1331 rna-protein interacting pairs and 1331 rna-protein non-interacting pairs. several data sets were employed to evaluate the performance of the proposed methods. our rpirls method was first evaluated using two popular non-redundant data sets of rpis studied in [27] .
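the construction of the negative training set described above (permute the proteins, keep the rnas fixed, repeat until no permuted pair is a known positive) can be sketched as follows. this is a minimal illustration under stated assumptions, not the authors' implementation: the function name `make_negative_pairs`, the `max_tries` safeguard, the seed and the toy identifiers are all hypothetical.

```python
import random


def make_negative_pairs(positive_pairs, seed=0, max_tries=1000):
    """Build non-interacting pairs by permuting proteins against fixed RNAs.

    The whole permutation is redrawn until none of the resulting
    (rna, protein) pairs appears in the positive set.
    """
    rng = random.Random(seed)
    positives = set(positive_pairs)
    rnas = [rna for rna, _ in positive_pairs]
    proteins = [protein for _, protein in positive_pairs]
    for _ in range(max_tries):
        shuffled = proteins[:]
        rng.shuffle(shuffled)
        candidate = list(zip(rnas, shuffled))
        if not any(pair in positives for pair in candidate):
            return candidate
    raise RuntimeError("no valid permutation found within max_tries")


# Toy positive set with hypothetical identifiers.
positive = [("r1", "p1"), ("r2", "p2"), ("r3", "p3"), ("r4", "p4")]
negative = make_negative_pairs(positive)
```

the resulting negative set has the same size as the positive set and reuses exactly the same rnas and proteins, so the combined training set is balanced by construction.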
the rpi2241 data set consisted of 2241 experimentally validated rna-protein pairs extracted from the pridb data base. the rpi369 data set was obtained by eliminating all rpi pairs with ribosomal proteins or ribosomal rnas from the rpi2241 data set. to avoid overlap between the training and testing data sets, those rpis overlapping the training data were removed, leaving the rpi2241 data set with 1832 rpis and the rpi369 data set with 204 rpis. their corresponding negative pairs were generated by following the same steps used to generate the negative training pairs. next, we tested the performance of the rpirls method on a large-scale data set extracted from the rna-protein interaction data base (rpintdb) (http://pridb.gdcb.iastate.edu/rpiseq/download.php). this data set consisted of 43,010 experimentally validated rpis from several sources, including the rpidb, the npinter data base and high-throughput experiments published in the literature. the fourth data set, which we called nrpi13153, was extracted from the npinter data base. the nrpi13153 data set consisted of 13,153 experimentally validated ncrna-protein pairs from six model organisms, i.e., caenorhabditis elegans, drosophila melanogaster, escherichia coli, homo sapiens, mus musculus and saccharomyces cerevisiae. we constructed the fifth data set, called lnrpi12114, by extracting only lncrna-protein pairs from the npinter data base. this data set contains 12,114 experimentally validated lncrna-protein pairs. in this paper, we proposed two classifiers for predicting rpis based on different representations of protein sequences. for the rpirls classifier, each amino acid sequence comprised up to 20 different amino acids. for the rpirls-7g classifier, we adopted the same amino acid classification approach as [26, 29] . the derived kernel was proposed by smale et al. [49] for images, inspired by the neuroscience of the visual cortex. in what follows, we briefly describe the construction of the derived kernel on sequences.
suppose $A$ is a finite set called the alphabet. in the work here $A$ is the set of 20 amino acids (for rpirls), the 7-letter reduced alphabet (for rpirls-7g) or the 4 nucleic acids. let $A^1 = A$ and define $A^{k+1} = A^k \times A$ recursively for any $k \in \mathbb{N}$. we say $s$ is a string if $s \in \bigcup_{k=1}^{\infty} A^k$, and $s = (s_1, \dots, s_k)$ is a k-mer (i.e., a sequence of length $k$) if $s \in A^k$ for some $k \in \mathbb{N}$ with $s_i \in A$. the process of computing the derived kernel mainly includes three steps as below:
step 1. set an initial kernel $\hat{K}_1$ at the first layer. here the initial kernel is defined as:
$$\hat{K}_1(x, y) = \begin{cases} 1, & x = y, \\ 0, & x \neq y, \end{cases}$$
where $x, y \in A^k$. $x = (x_1, \dots, x_k)$ and $y = (y_1, \dots, y_k)$ are substrings of the same length $k$, and $x = y$ if and only if $x_i = y_i$ for $i = 1, \dots, k$.
step 2. let $f = (f_1, \dots, f_n)$ and denote by $|f|$ the length of $f$, so here $|f| = n$. then define the second layer neural response of $f$ at $t$:
$$N_2(f)(t) = \frac{1}{|H_1|} \sum_{h \in H_1} \hat{K}_1(f \circ h, t), \qquad t \in T_1,$$
where $T_1$ is the template set at the first layer; here we consider all possible substrings of length $k$ making up the template set $T_1$, so $T_1 = A^k$. $H_1$ is the transformation set at the first layer.
step 3. compute the second layer derived kernel by normalizing the inner product of two neural responses:
$$K_2(f, g) = \langle N_2(f), N_2(g) \rangle_{L^2(T_1)},$$
where $\langle \cdot, \cdot \rangle_{L^2(T_1)}$ denotes the $L^2$ inner product with respect to the uniform measure $\frac{1}{|T_1|} \sum_{t \in T_1} \delta_t$, where $|T_1|$ is the cardinality of the template set $T_1$ and $\delta_t$ is the dirac measure; with correlation normalization:
$$\hat{K}_2(f, g) = \frac{K_2(f, g)}{\sqrt{K_2(f, f)\, K_2(g, g)}}.$$
this process can continue if appropriate higher level templates are defined. at each layer, (local) derived kernels are built by recursively pooling over previously defined local kernels. here, for the 2-layer derived kernel, pooling is accomplished by taking an average over a set of transformations, which amounts to calculating the frequency with which a template occurs in a sequence. in this paper we deal with inner product kernels, which satisfy the mercer condition and are known to be instances of reproducing kernels.
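the three steps above can be illustrated with a short python sketch. it assumes, as the pooling description suggests, that the transformations are the translations selecting each length-k window of a sequence, so that the neural response is simply the vector of k-mer frequencies; the function names `neural_response` and `derived_kernel` are hypothetical, and this is not the authors' implementation. under this assumption the normalized second-layer kernel reduces to the cosine similarity of two k-mer frequency profiles, since the constant factor of the uniform measure cancels in the normalization.

```python
import math
from collections import Counter


def neural_response(seq, k):
    """Frequency of each k-mer template among the length-k windows of seq."""
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        raise ValueError("sequence is shorter than the template size")
    counts = Counter(seq[i:i + k] for i in range(n_windows))
    return {kmer: count / n_windows for kmer, count in counts.items()}


def derived_kernel(f, g, k):
    """Normalized inner product of the k-mer frequency profiles of f and g."""
    nf = neural_response(f, k)
    ng = neural_response(g, k)
    # Templates absent from a sequence have zero response and drop out.
    dot = sum(value * ng.get(kmer, 0.0) for kmer, value in nf.items())
    norm_f = math.sqrt(sum(value * value for value in nf.values()))
    norm_g = math.sqrt(sum(value * value for value in ng.values()))
    return dot / (norm_f * norm_g)


same = derived_kernel("ACGUACGU", "ACGUACGU", 2)
disjoint = derived_kernel("AAAA", "CCCC", 2)
partial = derived_kernel("AAAA", "AACC", 2)
```

in this toy example `partial` equals $1/\sqrt{3}$: "AAAA" has a single 2-mer (AA), while "AACC" splits its three windows evenly among AA, AC and CC.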
next, with correlation normalization, $\hat{K}$ is also a reproducing kernel and $\hat{K}(x, x) = 1$ for all $x \in X$. the kernel function is symmetric (i.e., $K(f, g) = K(g, f)$) and non-negative (i.e., $K(f, f) \geq 0$), therefore it can be interpreted as a measure of similarity. we first apply the kernel to the set $R$ of nucleic acid sequences and denote it by $\hat{K}_2^{R}$, then apply the kernel to the set $P$ of amino acid sequences and denote it by $\hat{K}_2^{P}$, and lastly combine the two kernels in a natural way by tensor product for the set of rna-protein pairs. the reproducing kernel for two rna-protein pairs $(r, p), (r', p') \in R \times P$ is defined by:
$$\hat{K}_2^{DK}((r, p), (r', p')) = \hat{K}_2^{R}(r, r')\, \hat{K}_2^{P}(p, p').$$
since both $\hat{K}_2^{R}(r, r')$ and $\hat{K}_2^{P}(p, p')$ are positive definite kernels, $\hat{K}_2^{DK}((r, p), (r', p'))$ is a positive definite kernel too [50] . after combining the kernel with a kernel-based supervised learning algorithm, we can predict rpis for any rna-protein pair with known primary sequences. the rls algorithm is one of the most widely used models for regression. let $K$ be a kernel on a finite set $X$. write $\mathcal{H}_K$ to denote the inner product space of functions on $X$ defined by $K$. suppose $\bar{z} = \{(x_i, y_i)\}_{i=1}^{m}$ is a sample set (called the training set) with $x_i \in X$ and $y_i \in \mathbb{R}$ for each $i$. the rls can be written as follows:
$$\bar{f} = \arg\min_{f \in \mathcal{H}_K} \left\{ \frac{1}{m} \sum_{i=1}^{m} (f(x_i) - y_i)^2 + \lambda \|f\|_K^2 \right\}.$$
we integrated rls with the combined kernel $K = \hat{K}_2^{DK}$, hence the main construction is to compute $\bar{f}$ for this kernel. herein, we aim to develop a novel method to distinguish rna-protein interaction pairs from non-interacting rna-protein pairs. therefore, for the binary classification case with $y_i \in \{-1, 1\}$ for each $i$, if $\bar{f} \leq 0$, the predicted class is $-1$ (denotes non-interaction), otherwise it is $1$ (denotes interaction). one important step of rls is to find a "good" value of the regularization parameter $\lambda > 0$ in equation (7). it was selected from a candidate set $\Lambda$ by leave-one-out cross-validation [51] on the training data.
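to make the combination of the tensor-product kernel with rls concrete, the following python sketch trains a kernel rls classifier on a handful of made-up rna-protein pairs. several things here are assumptions for illustration only: the regularized objective is taken in the common form with a $1/m$ data term, whose minimizer is $f(x) = \sum_i c_i K(x, x_i)$ with $(K + \lambda m I)c = y$; the k-mer cosine kernel stands in for the derived kernel; and all function names, template sizes and toy sequences are hypothetical. a small gaussian elimination routine is used so that the example needs only the standard library.

```python
import math
from collections import Counter


def kmer_kernel(a, b, k):
    """Cosine similarity of k-mer count profiles (stand-in for the derived kernel)."""
    ca = Counter(a[i:i + k] for i in range(len(a) - k + 1))
    cb = Counter(b[i:i + k] for i in range(len(b) - k + 1))
    dot = sum(v * cb.get(t, 0) for t, v in ca.items())
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb)


def pair_kernel(x, y, k_rna=2, k_prot=1):
    """Tensor-product kernel on (rna, protein) pairs."""
    (r1, p1), (r2, p2) = x, y
    return kmer_kernel(r1, r2, k_rna) * kmer_kernel(p1, p2, k_prot)


def solve(a, b):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rls_fit(pairs, labels, lam):
    """Coefficients c solving (K + lam * m * I) c = y."""
    n = len(pairs)
    gram = [[pair_kernel(pairs[i], pairs[j]) for j in range(n)] for i in range(n)]
    for i in range(n):
        gram[i][i] += lam * n
    return solve(gram, [float(y) for y in labels])


def rls_predict(pairs, coeffs, query):
    """Predict 1 (interaction) if the score is positive, otherwise -1."""
    score = sum(c * pair_kernel(query, x) for c, x in zip(coeffs, pairs))
    return 1 if score > 0 else -1


# Made-up toy pairs: two "interacting" and two "non-interacting".
train = [("AAAA", "KKKK"), ("AAAU", "KKRK"), ("GGGG", "DDDD"), ("GGGC", "DDED")]
labels = [1, 1, -1, -1]
coeffs = rls_fit(train, labels, lam=math.exp(-2))
pred_pos = rls_predict(train, coeffs, ("AAAA", "KKRR"))
pred_neg = rls_predict(train, coeffs, ("GGGG", "DDEE"))
```

because the product kernel is zero whenever either the rna or the protein profiles share no k-mer, the two toy classes decouple here and each query is scored only against training pairs resembling it in both components.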
we never used the testing data for parameter selection, since doing so would risk over-fitting. the sensitivity (se) and specificity (sp) are used to measure the ability to identify positive and negative instances, respectively. they are defined by

$$\mathrm{se} = \frac{TP}{TP + FN}, \qquad \mathrm{sp} = \frac{TN}{TN + FP},$$

where tp, tn, fp and fn denote the numbers of true positives, true negatives, false positives and false negatives. the accuracy, which is used to measure the prediction quality, is defined by

$$\mathrm{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}.$$

the auc (area under the receiver operating characteristic curve) is further employed to measure the predictive performance; it is 1 for perfect prediction and 0.5 for random prediction.

rna regulons: coordination of post-transcriptional events rpicool: a tool for in silico rna-protein interaction detection using random forest a census of human rna-binding proteins sequence-specific interaction of r17 coat protein with its ribonucleic acid binding site rna-rna and rna-protein interactions in coronavirus replication and transcription diverse roles of host rna binding proteins in rna virus replication rna-protein interactions in human health and disease the three-dimensional structure of the ribosome and its components ribosomal protein structures: insights into the architecture, machinery and evolution of the ribosome emerging roles of rna and rna-binding protein network in cancer cells rna processing and its regulation: global insights into biological networks long noncoding rnas in interaction with rna binding proteins in hepatocellular carcinoma long noncoding rnas: functional surprises from the rna world functional demarcation of active and silent chromatin domains in human hox loci by noncoding rnas par-clip analysis uncovers auf1 impact on target rna fate and genome integrity jarid2 is implicated in the initial xist-induced targeting of prc2 to the inactive x chromosome long non-coding rna gas5 inhibited hepatitis c virus replication by binding viral ns3 protein rip-chip: the isolation and identification of mrnas, micrornas and protein components of ribonucleoprotein complexes from cell extracts 
hits-clip yields genome-wide insights into brain alternative rna processing transcriptome-wide identification of rna-binding protein and microrna target sites by par-clip protein-rna interactions: structural analysis and functional classes advances in rip-chip analysis: rna-binding protein immunoprecipitation-microarray profiling quantitative analysis of rna-protein interactions on a massively parallel array reveals biophysical and evolutionary landscapes in silico characterization and prediction of global protein-mrna interactions in yeast predicting protein associations with long noncoding rnas predicting rna-protein interactions using only sequence information de novo prediction of rna-protein interactions from sequence information computational prediction of associations between long non-coding rnas and proteins predicting ncrna-protein interaction using sequence and structural information ipminer: hidden ncrna-protein interaction sequential pattern mining with stacked autoencoder for accurate computational prediction applications of regularized least squares to pattern classification simulations of the dynamics at an rna-protein interface prediction of rna-binding proteins from primary sequence by a support vector machine approach prediction of rna binding sites in proteins from amino acid sequence a study of cross-validation and bootstrap for accuracy estimation and model selection the noncoding rnas and protein related biomacromolecules interaction database molecular mechanisms of long noncoding rnas function of lncrnas and approaches to lncrna-protein interactions principles and properties of eukaryotic mrnps transcriptome-wide analysis of protein-rna interactions using high-throughput sequencing a neural network method for identification of rna-interacting residues in protein a server for the computational prediction of rna-binding residues in protein sequences prediction of protein-rna binding sites by a random forest method with combined features 
dissecting the expression dynamics of rna-binding proteins in posttranscriptional regulatory networks deciphering the role of rna-binding proteins in the post-transcriptional control of gene expression the protein data bank pridb: a protein-rna interface database introduction to the peptide binding problem of computational immunology: new results. found generalized cross-validation as a method for choosing a good ridge parameter this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license author contributions: w.s. and j.x. conceived and designed the experiments; w.s., w.c. and j.z. performed the experiments; w.s. and w.c. analyzed the data; w.s. and d.c. contributed web tools; w.s. and j.x. wrote the paper. the authors declare no conflict of interest. key: cord-000003-ejv2xln0 authors: crouch, erika c title: surfactant protein-d and pulmonary host defense date: 2000-08-25 journal: respir res doi: 10.1186/rr19 sha: doc_id: 3 cord_uid: ejv2xln0 surfactant protein-d (sp-d) participates in the innate response to inhaled microorganisms and organic antigens, and contributes to immune and inflammatory regulation within the lung. sp-d is synthesized and secreted by alveolar and bronchiolar epithelial cells, but is also expressed by epithelial cells lining various exocrine ducts and the mucosa of the gastrointestinal and genitourinary tracts. sp-d, a collagenous calcium-dependent lectin (or collectin), binds to surface glycoconjugates expressed by a wide variety of microorganisms, and to oligosaccharides associated with the surface of various complex organic antigens. sp-d also specifically interacts with glycoconjugates and other molecules expressed on the surface of macrophages, neutrophils, and lymphocytes. in addition, sp-d binds to specific surfactant-associated lipids and can influence the organization of lipid mixtures containing phosphatidylinositol in vitro. 
consistent with these diverse in vitro activities is the observation that sp-d-deficient transgenic mice show abnormal accumulations of surfactant lipids, and respond abnormally to challenge with respiratory viruses and bacterial lipopolysaccharides. the phenotype of macrophages isolated from the lungs of sp-d-deficient mice is altered, and there is circumstantial evidence that abnormal oxidant metabolism and/or increased metalloproteinase expression contributes to the development of emphysema. the expression of sp-d is increased in response to many forms of lung injury, and deficient accumulation of appropriately oligomerized sp-d might contribute to the pathogenesis of a variety of human lung diseases.

surfactant protein-d (sp-d) is a member of the collagenous subfamily of calcium-dependent lectins (collectins) that includes pulmonary surfactant protein a (sp-a) and the serum mannose-binding lectin [1] [2] [3]. collectins interact with a wide variety of microorganisms, lipids, and organic particulate antigens, and can modulate the function of immune effector cells and their responses to these ligands. this article reviews what is currently known about the sites of production, structure, function, and regulated expression of sp-d. emphasis will be placed on functional attributes, known ligand interactions, and structure-function relationships believed to be important for host defense. for additional information on sp-a and other members of the collectin family, the reader is referred to other recent reviews [4] [5] [6].

sp-d is synthesized and secreted into the airspaces of the lung by the respiratory epithelium [1]. at the alveolar level, sp-d is constitutively synthesized and secreted by alveolar type ii cells. more proximally in the lung, sp-d is secreted by a subset of bronchiolar epithelial cells, the non-ciliated clara cells. 
because sp-d is stored within the secretory granules of clara cells [7, 8], it seems likely that sp-d is subject to regulated secretion via granule exocytosis at this level of the respiratory tract. in some species, sp-d is also synthesized by epithelial cells and/or submucosal glands associated with the bronchi and trachea [9]. although many alveolar macrophages show strong cytoplasmic and/or membrane labeling with antibody against sp-d, they do not contain detectable sp-d message. the lung seems to be the major site of sp-d production. however, there is increasing evidence for extrapulmonary sites of expression as assessed with monoclonal or affinity-purified antibodies, reverse-transcriptase-mediated pcr (rt-pcr), and/or hybridization assays of tissues from humans and other large mammals [10 • ,11-14] (summarized in table 1). it is difficult to entirely exclude crossreactions or amplification of related sequences; however, localization to many of these sites in human tissues was confirmed by using monoclonal antibodies in combination with rt-pcr with sequencing of the amplified products [10 • ]. non-pulmonary expression seems to be largely restricted to cells lining epithelial surfaces or ducts and certain glandular epithelial cells that are in direct or indirect continuity with the environment. notable exceptions to this generalization might include heart, brain, pancreatic islets, and testicular leydig cells. sp-d has also been identified in amnionic epithelial cells by immunohistochemistry [15]; however, it is unclear whether this is synthesized locally or derived from the lung by way of the amniotic fluid. interestingly, in many of these sites sp-d microscopically co-localizes with gp-340, an sp-d binding protein and putative sp-d receptor [10 • ]. sites of extrapulmonary expression have also been described in small mammals. 
in the rat, sp-d message was identified in rna extracted from skin and blood vessel [16], and both protein and message were identified in gastric mucosa [17] and mesentery [13]. using rt-pcr, sp-d message has also been identified in mouse stomach, heart, and kidney [14].

sp-d (43 kda, reduced) consists of at least four discrete structural domains: a short n-terminal domain; a relatively long collagenous domain; a short amphipathic connecting peptide or coiled-coil neck domain; and a c-terminal, c-type lectin carbohydrate recognition domain (crd). each molecule consists of trimeric subunits (3 × 43 kda), which associate at their n-termini (fig. 1). although most preparations of sp-d contain a predominance of dodecamers (that is, four trimeric subunits), the proportions of various oligomers vary between species. for example, rat lavage and recombinant rat sp-d are almost exclusively assembled as dodecamers (four trimers), whereas recombinant human sp-d is secreted as trimers, dodecamers and higher-order multimers [18]. sp-d isolated from the lavage of some patients with alveolar proteinosis consists predominantly of higher-order multimers, which can contain up to 32 (or more) trimeric subunits (fig. 1). recent crystallographic and mutagenesis studies suggest that the structural determinants of saccharide binding are similar to those originally described for mannose-binding lectin [19,20,21 • ,22 • ]. at least two bound calcium ions and two intrachain disulfide crosslinks stabilize the required tertiary structure, and glu321 and asn323 within the crd participate in glucose/mannose type recognition. interactions with at least one glycolipid ligand, phosphatidylinositol (pi), require the participation of the c-terminal end of the protein [23, 24]. a trimeric cluster of crds is necessary for high-affinity binding to carbohydrate ligands [21 • ,25]. 
the crystal structure of human sp-d suggests that the spatial distribution of crds within a trimeric subunit permits simultaneous and cooperative interactions with two or three glycoconjugates displayed on the surface of a particulate ligand [21 • ]. furthermore, solid-phase binding studies have shown that monomeric crds have an approximately 10-fold lower binding affinity for multivalent ligands than trimeric crds. crystallographic studies of human sp-d further suggest that the spatial organization of crds within a trimer is stabilized by interactions of the c-terminal sequence with the trimeric neck domain [21 • ,26]. interestingly, the three crds show a deviation from threefold symmetry, suggesting some flexibility of the crds in relation to the neck. thus, the dependence of the binding of pi on the c-terminal sequence could reflect conformational effects, rather than the direct participation of this sequence in ligand interactions.

the length of the collagen domain of sp-d is highly conserved, and the domain lacks interruptions in the repeating gly-x-y sequence (in which x and y are different amino acids). as for other collagenous proteins, this domain is enriched in imino acids and contains hydroxyproline. unlike sp-a, sp-d also contains hydroxylysine. although the collagen domain of rat, human, bovine, and mouse sp-d lacks cysteine residues, cdna sequencing has identified a codon for cysteine within the collagen domain of pig sp-d [27 • ]; this suggests the possibility of alternative patterns of chain association and oligomeric assembly for pig sp-d. the first translated exon of sp-d contains a highly conserved and unusually hydrophilic gly-x-y sequence that shows little homology with the remainder of the collagen sequence. the functional significance of this region is unknown. however, it has been suggested that this region contributes to oligomer assembly or mediates interactions with cellular receptors. 
the collagen domain determines the maximal spatial separation of trimeric, c-terminal lectin domains within sp-d molecules, but might also contribute to normal oligomeric assembly and secretion. for example, deletion of the entire collagen domain of rat sp-d results in the secretion of trimers rather than dodecamers [28] . in addition, 2,2-dipyridyl, an inhibitor of prolyl hydroxylation that interferes with the formation of a stable collagen helix, causes the intracellular accumulation of 43 kda monomers and dimers [29] . in any case, the complete conservation of the number of gly-x-y triplets suggests that the spatial separation of trimeric crds is critical for normal sp-d function. the n-terminal peptide of the mature protein contains two conserved cysteine residues at positions 15 and 20. these residues participate in interchain disulfide crosslinks that stabilize the trimer, as well as the n-terminal association of four or more trimeric subunits. stable oligomerization of trimeric subunits permits cooperative or bridging interactions between spatially separated binding sites on the same surface or on different particles. the process of forming interchain disulfide bonds is complex, and appropriate crosslinking of the n-terminal domains might be rate limiting for secretion [30] . subcellular fractionation studies suggest that interchain bonds form initially between the three chains of a trimeric subunit. subsequent rearrangements within the rough endoplasmic reticulum might allow the covalent crosslinking of a single chain from one subunit and two crosslinked chains of another, with the associated elimination of free thiol groups. mutant proteins that contain unpaired n-terminal cysteine residues are not secreted. however, it is unclear whether this results from abnormalities in disulfide bonding itself, or the failure to stabilize the required n-terminal conformation. 
the collagen domain contains hydroxylysyl-derived glycosides and a single n-linked oligosaccharide. in most species (human, rat, mouse, and cow) the site of n-linked glycosylation is located near the n-terminal end of the collagenous domain. recently, it was shown that pig sp-d has an additional potential site of n-linked glycosylation within the crd [27 • ]. although rat and human lung lavage sp-d seem to be sialylated, as suggested by charge heterogeneity and cleavage with highly purified neuraminidase, preparations of human amniotic fluid and bovine lavage sp-d recovered from amniotic fluid showed predominantly complex type biantennary structures and no sialic acid [31] . a variant form of sp-d (50 kda) has been identified in lavage from a subset of human lavage samples; this protein shows o-linked glycosylation of threonyl residues within the n-terminal peptide domain [32 • ]. at present, the functional significance of these sugars is not known. the presence of o-linked glycosylation within the n-terminal domain might be predicted to interfere with normal dodecamer assembly. in this regard, the o-glycosylated 50 kda form of human sp-d is recovered as trimeric subunits or smaller species. as for many glycoproteins, the functional role of the attached carbohydrate is unknown. mutational analysis has shown that the n-linked sugar on rat sp-d is not required for secretion, for dodecamer formation, or for interactions with a variety of microorganisms [29,33]. consistent with its designation as a 'mannose-type' c-type lectin, sp-d preferentially binds to simple and complex saccharides containing mannose, glucose, or inositol [34, 35] . sp-d also interacts with specific constituents of pulmonary surfactant including pi [36-38] and glucosylceramide [39] . binding to glucosylceramide involves interactions of the carbohydrate-binding sequences of the crd with the glucosyl moiety. 
however, the interaction of sp-d with pi involves interactions with the lipid, as well as crd-dependent interactions with the inositol moiety [24, 40] . microorganisms are surfaced with a diverse and complex array of polysaccharides and glycoconjugates, and most classes of microorganism contain one or more sugars recognized by sp-d. however, the outcome of this interaction depends on the specific organism and can be modified by the conditions of microbial growth. the potential consequences of this interaction include the following: varying degrees of lectin-dependent aggregation (namely, microbial agglutination), enhanced binding of microorganisms or microbial aggregates to their 'receptors' on host cells, phagocyte activation, and opsonic enhancement of phagocytosis and killing, potentially involving one or more cellular receptors for sp-d. binding to organisms in suspension is often -but not always -accompanied by some degree of aggregation. sp-d binds to purified lipopolysaccharide (lps) isolated from a variety of gram-negative organisms [35, 41] . in addition, lps is the major cell wall component that is labeled on lectin blotting of outer membranes isolated from escherichia coli [41] . the latter interactions involve the recognition of the core oligosaccharide domain, which contains glucose and heptose [41] . sp-d interacts preferentially with purified lps molecules characterized by short or absent o-antigens and preferentially agglutinates bacterial strains expressing a predominance of rough (o-antigen-deficient) lps [41, 44] . although the core oligosaccharide domain of lps constitutes the major ligand for sp-d on at least some gram-negative bacteria, the mechanism of interaction with this group of microorganisms is probably heterogeneous. sp-d binds to some smooth, unencapsulated strains of gram-negative bacteria by immunofluorescence. 
the mechanism is uncertain; the quantity or quality of binding differs from that observed for rough strains and does not necessarily result in agglutination. lps molecules on the surfaces of bacteria show heterogeneity in the extent of maturation, so it is possible that this interaction is mediated by a subpopulation of lps with deficient o-antigens and that the density of binding sites is too low for high-affinity binding. the recognition of the surface glycoconjugates on gram-negative bacteria by sp-d depends not only on the expression of lectin-specific residues by a given strain or species, but also on the accessibility of these residues [1, 45]. for example, sp-d binds inefficiently to the core region of lps of encapsulated klebsiella, but efficiently agglutinates the corresponding unencapsulated phase variants. interactions of sp-d with the core oligosaccharides of gram-negative organisms are also influenced by the number of repeating saccharide units associated with the terminal o-antigen of the lps [41,44]. other potential ligands include the o-antigen domain of lps, certain capsular polysaccharides, and membrane-associated glycoproteins. in this regard, sp-d can bind to di-mannose containing o-antigens expressed by a subset of klebsiella serotypes (i ofek, h sahly and ec crouch, unpublished data). although other c-type lectins, specifically sp-a and the mannose receptor, can interact with specific capsular polysaccharides [46], a specific interaction of sp-d with capsular glycoconjugates or exopolysaccharides has not been described. the mechanism of interaction with gram-positive organisms has not been elucidated. lipoteichoic acids, which are the major glycolipids associated with the gram-positive cell wall, do not detectably compete with lps for binding to sp-d (i ofek, a mesika, m kalina, y keisari, d chang, d mcgregor and ec crouch, manuscript submitted). 
in preliminary studies we observed that binding was competed only partly with maltose and/or edta, raising the possibility that binding might be more complex than for some gram-negative organisms.

however, similar effects were observed when the neutrophils were preincubated with sp-d, and there was only a slight enhancement of uptake when bacteria were incubated with human sp-d and washed before their addition to neutrophils. notably, the extent of binding and internalization was dependent on the extent of multimerization, with human sp-d multimers demonstrating the highest potency. differences in cell type, the extent of sp-d multimerization, or differences in size or organization of bacterial aggregates could account for some of the apparent inconsistencies. although lps mediates the binding of sp-d to at least some gram-negative bacteria, sp-d can also bind to spe

in the latter study the authors suggested that fungal aggregation inhibits phagocytosis. interestingly, sp-d binding directly inhibited fungal growth and decreased the outgrowth of pseudohyphae, the invasive form of the fungus, in the absence of phagocytic cells [57]. it is possible that these effects are also secondary to agglutination, possibly as a result of nutrient deprivation. purified rat and human sp-d inhibit the infectivity and hemagglutination activity of influenza

sp-d can interact with host cells, both directly and indirectly. as indicated above, sp-d can enhance the phagocytosis and killing of certain microorganisms and enhance the oxidant response to microbial binding. however, at present there is only one study that suggests that the enhancement of phagocytosis by sp-d might involve the participation of an opsonic receptor. furthermore, the enhanced uptake of iav seems to be mediated by viral aggregation, with enhanced interactions of the virus with its natural receptors on the host cell. 
in any case, sp-d can interact directly with host cells, and in some cases can influence their behavior. sp-d is chemotactic and haptotactic for neutrophils and certain mononuclear phagocytes [59 • ,67-69] and can elicit directional actin polymerization in alveolar macrophages [69] . in this regard, sp-d is considerably more potent than sp-a. early studies with natural proteins isolated from silicotic animals reported directed effects on the oxidant metabolism of isolated alveolar macrophages [70] . however, such effects can probably be attributed to endotoxin contamination and/or aggregation. purified dodecamers do not significantly increase the production of nitric oxide [71] or of proinflammatory cytokines such as tumor necrosis factor-α (y kesari, h wang, a mesika, e crouch and i ofek, unpublished data). interestingly, purified sp-d has been reported to increase the production of several metalloproteinases in the absence of a significant effect on proinflammatory cytokine production [72] . despite the ability of sp-d to modulate a variety of cellular functions, little is currently known about potential cellular receptors for this protein. compartments [73] , but it is unclear whether the uptake is receptor dependent and whether sp-d is being internalized in association with specific ligands. there are at least two classes of binding to host cells: crd-dependent and crd-independent. some studies have demonstrated crd-dependent binding to phagocytes that can be inhibited with edta or competing saccharides, both in vitro and in vivo. as indicated above, the ability of sp-d to elicit the chemotaxis of neutrophilic and monocytic cells depends on the lectin activity of sp-d [68] . in addition, kuan and coworkers reported that extracting formaldehyde-fixed alveolar macrophages with detergents largely eliminates the binding of purified sp-d, suggesting a membrane-associated ligand or glycolipid receptor [73] . 
dong and wright have extended these findings and suggest that pi can contribute to sp-d binding by alveolar macrophages [74] . it is of interest that sp-d can bind to recombinant scd14 through interactions with n-linked oligosaccharides [51 • ]. given that the membrane-associated form of cd14 is widely expressed on host cells, it is possible that cd14 can serve as a binding site on macrophages and other cell types. the phagocytic uptake of certain bacteria by neutrophils is also inhibited by calcium chelation or competing sugars [42]; however, this could result from the inhibition of microbial agglutination rather than lectin-dependent interactions with the phagocyte. wang et al suggested that sp-d can bind to lymphocytic cells by a lectin-dependent mechanism [75 •• ] . in this regard, it is interesting to note that glucosylceramide, a ligand for sp-d in vitro, is one of the most abundant neutral glycolipids expressed by lymphoid cells. reid and co-workers were the first to present evidence for lectin-independent binding [76] . these and other studies suggested that binding does not involve known c1q or collectin receptors. the only putative receptor protein, gp-340, is a widely expressed member of the scavenger receptor superfamily [77,78 • ]. it binds to the crd of sp-d in a calcium-dependent manner that does not require the lectin activity of sp-d. although the protein has been immunolocalized to alveolar macrophage membranes and distributes together with sp-d in many different human tissues [10 • ,77], it has not yet been shown to mediate the binding of sp-d to these cells or to participate in signal transduction events. the cdnas isolated from lung have not shown a membrane-spanning region [77] , and the protein is abundant as a soluble component in bal. given that gp-340 is a highly multimerized protein that contains numerous potential ligand binding domains (fig. 
1b), it is possible that the protein cooperates with sp-d in the neutralization or clearance of certain ligands rather than specifically mediating the interactions of sp-d with host cells. wright and co-workers have demonstrated the binding of sp-d to isolated type ii pneumocytes. the mechanism seems distinct from the binding to macrophages [79 • ]. the binding was dependent on concentration, time, and temperature and required calcium; it was not sensitive to protease treatment or to pi-phospholipase c. although the internalized sp-d was degraded or recycled to lamellar bodies, sp-d binding did not alter the uptake of surfactant lipids.

sp-d has demonstrated comparatively few direct effects on the metabolism of host cells, at least in situations where self-aggregation and endotoxin contamination have been excluded. one possible explanation is that modulation of cellular function requires the prior interaction of sp-d with a ligand. this would have numerous potential physiological advantages, because the presence of 'active' protein might be restricted to sites of microbial or antigenic deposition. the binding of complex, multivalent, particulate antigens to two or more crds could markedly alter the conformation of sp-d molecules, with respect to the spatial orientation of the arms in relation to the n-terminal crosslinking domain and/or with respect to the spatial orientation of the crds within a given trimeric subunit. thus, the 'charging' of sp-d with a particulate ligand could lead to local or distant conformational changes that expose 'cryptic' binding sites for cellular receptors. there is some preliminary evidence consistent with the notion that the interaction of sp-d with a ligand alters its capacity to activate host cells.

table 3 and discussed below. different multimeric forms of sp-d can be isolated from proteinosis lavage [32 • ] and are produced by chinese hamster ovary k1 cells transfected with human sp-d cdna [18]. 
as described previously, the effects of sp-d on the neutrophil response to influenza virus are highly dependent on the ability of sp-d to agglutinate the viral particles, and the agglutination activity is directly correlated with the extent of multimerization. trimers can bind to the virus but have little capacity to modulate neutrophil interactions. by contrast, highly multimerized proteins show greater activity than dodecamers [81]. given these observations, factors that favor enhanced oligomerization or lead to the accumulation of trimeric subunits might influence sp-d function. for example, the liberation of active trimers by a hypothetical microbial protease could lead to the accumulation of molecules that might inhibit the aggregation-dependent activities of sp-d. in contrast, recombinant trimeric crds can stimulate chemotaxis [67] and decrease viral infectivity [65 • ]. although higher-order oligomers of sp-d can self-aggregate and precipitate in the presence of calcium in vitro, the functional consequences are not known. the lectin activity of sp-a is decreased after the nitric oxide-dependent nitration of tyrosine residues [82], and nitration decreases the ability of sp-a to enhance the adherence of pneumocystis to alveolar macrophages [83]. however, similar findings have not yet been reported for sp-d. conditions of mildly acidic ph, as might be found in endocytic compartments, are predicted to disrupt the lectin-dependent activities of sp-d [34]. proteolytic degradation remains an important possibility. however, sp-d is highly resistant to degradation by a wide variety of neutral proteases in vitro, and degradation products have not yet been shown to accumulate under pathological conditions in vivo. glucose concentrations at levels encountered in diabetes can interfere with sp-d's ability to interact with specific strains of iav or other microorganisms in vitro [84 • ]. 
many microorganisms release cell wall polysaccharides or glycoconjugates, which might interfere with the binding of collectins to the same or other organisms. in this regard, sp-d recovered from rats after the instillation of lps into the airway shows decreased lectin activity, which is attributed to occupancy of the crd with lps [49 • ]. it seems reasonable to speculate that some organisms might compete with other organisms for binding to sp-d. such a situation could conceivably predispose to secondary infections. lastly, the potential inhibitory effects of competing saccharide ligands present important methodological considerations for experiments using carbohydrate-containing cell culture medium or buffers.

it is difficult to predict the functions of sp-d within the airspace. other lectins with overlapping specificity are also present. although the levels of mannose-binding lectin are probably low in the absence of increased vascular permeability, sp-a and the macrophage mannose receptor could conceivably interact with the same ligands in the distal airways and alveoli. such interactions could lead to antagonistic or cooperative effects. furthermore, we have little knowledge regarding the microanatomic distribution of these molecules in specific circumstances in vivo. although most sp-a is probably associated with the insoluble phase of the alveolar lining material, and the macrophage mannose receptor is membrane-associated, the distribution might be altered in the setting of lung injury.

models of sp-d deficiency show no detectable anatomical or physiological abnormalities at birth. however, the animals gradually develop a patchy, subpleural alveolar lipidosis with associated type ii cell hypertrophy, the accumulation of enlarged and foamy macrophages, and an apparent expansion of peribronchial lymphoid tissue [85 • ,86 • ]. 
interestingly, the mice eventually develop distal-acinar emphysema and areas of subpleural fibrosis, which could reflect a continuing inflammatory reaction associated with abnormal oxidant metabolism and metalloproteinase activity [87 • ]. by contrast, sp-a-deficient mice (-/-) show essentially normal respiratory function and surfactant lipid metabolism [88, 89] but numerous apparent host defense abnormalities [90] . the capacity of sp-d to bind to specific strains of influenza a in vitro is highly correlated with the capacity of the virus to proliferate in mice in vivo [62] . specifically, strains with more oligosaccharide attachments on the ha are preferentially neutralized by sp-d in vitro and show decreased proliferation in mice. because the administration of mannan together with the virus increased the replication of iav in the lung, the involvement of a mannose-type, c-type lectin was implicated. sp-d-sensitive iav strains also replicate to higher titers in the lungs of diabetic mice than in nondiabetic controls [84 • ]. replication of the virus is positively correlated with blood glucose level, and decreases in response to insulin treatment. significantly, blood glucose levels comparable to those measured in the diabetic mice were sufficient to inhibit the interaction of sp-d with these viral strains in vitro. pr-8, a strain that does not interact with sp-d but does interact with sp-a, replicated to the same extent in diabetic and control mice. sp-d levels increase in association with certain infections. for example, sp-d levels, but not the levels of serum mannose-binding lectin, increase markedly after iav infection [62] . impressive increases in sp-d have also been observed in murine models of pneumocystis carinii [91] and p. aeruginosa infection [92] . sp-d-deficient mice have not yet been extensively characterized with respect to host defense function. 
however, they show decreased viral clearance and enhanced inflammation after challenge with respiratory syncytial virus [93] and iav (am levine, personal communication). in addition, they show increased inflammation, increased oxidant production, and decreased macrophage phagocytosis in response to intratracheally instilled group b streptococcus and haemophilus influenzae (am levine, personal communication). although the overexpression of wild-type sp-d in type ii pneumocytes of the sp-d-deficient mice can prevent the lipidosis and inflammatory changes [94], the ability of overexpressed wild-type sp-d or exogenous sp-d to ameliorate these host defense abnormalities has not yet been described. the coexisting pulmonary abnormalities also complicate the interpretation of challenge models. for example, macrophage activation might enhance killing and offset any decrease that results more directly from sp-d deficiency. sp-d deficiency modifies the host response to instilled lps with decreased lung injury and inflammatory cell recruitment [50]. molecules that can bind to potential antigens and deliver them to macrophages and other antigen-presenting cells might contribute to the development of acquired immunity. in this regard, a few published observations suggest possible roles in the development of humoral and/or cellular immunity in response to microorganisms or complex organic antigens. for example, sp-d can decrease interleukin-2-dependent t-lymphocyte proliferation [95 • ]. interestingly, single-arm mutants were at least as potent as intact dodecamers in mediating this effect. sp-d also binds to oligosaccharides associated with dust mite allergen [96 • ], and can inhibit the binding of specific ige to these allergens, possibly through direct, crd-dependent binding to lymphocytes [96 • ].
thus, alterations in the level of sp-d (or the state of oligomerization) might influence the development of immunological responses and contribute to the pathogenesis of asthma and other hypersensitivity disorders. there are other potential interplays between humoral immunity and collectins with regard to antimicrobial host defense. for example, increased glycosylation of iav coat proteins, an adaptation that is believed to help the virus to evade antibody-mediated neutralization, is associated with increased reactivity with sp-d and other collectins [62]. thus, the relative potential importance of antibody and collectin-mediated host defenses might be influenced by subtle variations in the structure of the microbial surface. there is little recent information on the developmental regulation of sp-d expression. in general, sp-d increases rapidly late in gestation [97-100]. the production of sp-d increases during the culture of fetal lung explants, and expression can be increased with glucocorticoids [98, 100, 101]. the exposure of fetal rats to glucocorticoids in vivo leads to precocious expression with increased numbers of sp-d-expressing cells and increased cellular levels of sp-d message [98, 101, 102]. although sp-d is produced constitutively within the lung, protein accumulation and gene expression are inducible and increases in sp-d expression have been observed in a number of disease states or models (tables 4 and 5). in general, the synthesis and secretion of sp-d increase in association with lung injury and activation of the respiratory epithelium [1]. for example, levels of sp-d mrna and sp-d accumulation are increased within 24-72 h after intratracheal instillation of lps [103 • ], and sp-d expression by alveolar and bronchiolar epithelial cells increases after exposure of rats to 95% o2 for 12 h [104].
keratinocyte growth factor (kgf) increases sp-d expression and protein production in association with pneumocyte hyperplasia and after injury caused by bleomycin [105] . in addition, the levels of sp-d can increase markedly in response to the overexpression of certain cytokines, such as interleukin-4, or in response to microbial challenge [91, 92] . studies of the upstream regulatory region of the sp-d gene have demonstrated increased promoter activity in the presence of glucocorticoids, which is consistent with the findings in vivo and in lung organ culture [106] . however, no functional glucocorticoid response elements have been identified, and the effects of dexamethasone seem to be secondary and involve the effects of other transregulatory molecules. the activity of the human sp-d promoter is dependent on a conserved activator protein-1 (ap-1) element (-109) that binds to members of the fos and jun families of transcriptional factors [107] . in addition, the promoter contains multiple functional binding sites for ccaat-enhancer-binding protein (c/ebp) transcription factors. mutagenesis experiments suggest that these are required for basal and stimulated promoter activity, and promoter activity is markedly increased in h441 cells after co-transfection with c/ebpβ cdna (yc he and e crouch, unpublished data). the importance of the conserved ap-1 element and the presence of multiple binding sites for c/ebp transcription factors is consistent with the observed modulation of sp-d expression in the setting of tissue injury. sp-d promoter activity is not dependent on the binding of thyroid transcription factor 1 (ttf-1) [107] . however, promoter activity is dependent on two interacting forkhead binding sites, upstream and downstream of the ap-1 element; these sites bind to hepatic nuclear factor-3α and apparently other forkhead box proteins in h441 lung adenocarcinoma nuclear extracts [107] . 
initial comparison of genomic and cdna sequence suggested the existence of genetic polymorphisms in the sp-d coding sequence, including one in the n-terminal propeptide domain (thr11 compared with met11 in the mature protein) and three additional differences within the collagen domain at positions 102, 160, and 186 [108]. the latter substitutions are conservative to the extent that they are not expected to disrupt the collagen helix. floros and co-workers have recently confirmed the existence of polymorphisms at positions 11 and 160 of the mature protein [109]. the potential biological significance, if any, is not known. interestingly, the 50 kda variant of sp-d showed o-linked glycosylation of thr11 [32 • ], suggesting that this polymorphism might be associated with altered glycosylation. interestingly, the 50 kda variant was recovered as trimeric subunits, raising the possibility that differences in the glycosylation of residue 11, which is immediately n-terminal to cys15, could influence multimerization and the capacity of sp-d to participate in bridging interactions.
table 5. increased sp-d accumulation or expression in animal models
model | species | reference
silicosis | rat | [118]
hyperoxia | rat | [104]
endotoxin (lps) | rat | [103]
challenge with p. aeruginosa | mouse | [92]
challenge with iav | mouse | [62]
challenge with pneumocystis carinii | scid mouse | [91]
challenge with pneumocystis carinii | rat | [119]
overexpression of interleukin-4 | mouse | [120]
scid, severe combined immunodeficiency.
there is increasing evidence that sp-d interacts specifically with a wide variety of respiratory pathogens, modulates the leukocyte response to these organisms, and participates in aspects of pulmonary immune and inflammatory regulation (table 6). sp-d can influence the activity of phagocytes through crd-dependent and crd-independent interactions. at least some of the effects of sp-d result from aggregation with enhanced binding of the agglutinated ligand to their natural 'receptors'.
although the lung is the major site of sp-d expression, it is likely that the protein has more generalized roles in host defense and the acute response to infection and tissue injury.
collectins and pulmonary host defense
immunomodulatory functions of surfactant
lung surfactant proteins involved in innate immunity
interactions of surfactant protein a with pathogens
the role of collectins in host defense
structural aspects of collectins and receptors for collectins
immunocytochemical localization of surfactant protein d (sp-d) in type ii cells, clara cells, and alveolar macrophages of rat lung
surfactant protein d: subcellular localization in nonciliated bronchiolar epithelial cells
localization and developmental expression of surfactant proteins d and a in the respiratory tract of the mouse
• localization of lung surfactant protein d (sp-d) on mucosal surfaces in human tissues. this recent study was the first to systematically investigate the extrapulmonary expression of sp-d in human tissues.
immunolocalization of sp-d in human secretory tissues
surfactant protein a and d expression in the porcine eustachian tube
expression of hydrophilic surfactant proteins by mesentery cells in rat and man
mouse surfactant protein-d.
cdna cloning, characterization, and gene localization to chromosome 14
surfactant proteins a (sp-a) and d (sp-d): levels in human amniotic fluid and localization in the fetal membranes
antiviral activity of bovine collectins against rotaviruses
recombinant sp-d carbohydrate recognition domain is a chemoattractant for human neutrophils
interactions of pulmonary surfactant protein d (sp-d) with human blood leukocytes
surfactant proteins a and d specifically stimulate directed actin-based responses in alveolar macrophages
rat surfactant protein d enhances the production of oxygen radicals by rat alveolar macrophages
effects of endotoxin on surfactant protein a and d stimulation of no production by alveolar macrophages
induction of matrix metalloproteinase biosynthesis in human alveolar macrophages exposed to surfactant protein d (sp-d): possible roles in pulmonary host defense
binding of surfactant protein d (sp-d) to membrane glycolipids on alveolar macrophages
•• inhibitory effect of pulmonary surfactant proteins a and d on allergen-induced lymphocyte proliferation and histamine release in children with asthma
surfactant protein d binding to alveolar macrophages
cloning of gp-340, a putative opsonin receptor for lung surfactant protein d
isolation and characterization of a new member of the scavenger receptor superfamily, glycoprotein-340 (gp-340), as a lung surfactant protein-d binding molecule. gp340 was the first protein shown to bind to sp-d in a crd-independent manner. although it can be found on the surface of alveolar macrophages, it remains uncertain whether it can function as a cellular receptor.
• binding and uptake of surfactant protein d by freshly isolated rat alveolar type ii cells. this study describes the direct binding of sp-d to type ii pneumocytes.
the mechanism appears distinct from that observed for alveolar macrophages.
this study describes specific ultrastructural alterations in the organization of phospholipid mixtures containing phosphatidylinositol. similar tubular structures are found in type ii cell cultures.
collectins and pulmonary innate immunity
nitration of surfactant protein a results in decreased ability to aggregate lipids
nitrated sp-a does not enhance adherence of pneumocystis carinii to alveolar macrophages
• increased susceptibility of diabetic mice to influenza virus infection: compromise of collectin-mediated host defence of the lung by glucose? is uncontrolled diabetes mellitus associated with defective collectin function?
surfactant protein-d regulates surfactant phospholipid homeostasis in vivo. this paper and the following paper by botas and coworkers [86 • ] describe the phenotype of sp-d deficient transgenic mice.
altered surfactant homeostasis and alveolar type ii cell morphology in mice lacking surfactant protein d
increased metalloproteinase activity, oxidant production, and emphysema in surfactant protein d gene-inactivated mice
altered surfactant function and structure in sp-a gene targeted mice
surfactant metabolism in surfactant protein a-deficient mice
surfactant protein a (sp-a) gene targeted mice
p.
carinii induces selective alterations in component expression and biophysical activity of lung surfactant
interleukin-4 enhances pulmonary clearance of pseudomonas aeruginosa
surfactant protein-d modulates lung inflammation with respiratory syncytial virus infection in vivo
pulmonary-specific expression of sp-d corrects pulmonary lipid accumulation in sp-d gene-targeted mice
• recombinant rat surfactant-associated protein d inhibits human t lymphocyte proliferation and il-2 production. this paper was the first to describe direct, inhibitory effects of sp-d on the stimulated proliferation of blood lymphocytes.
• interaction of human lung surfactant proteins a and d with mite (dermatophagoides pteronyssinus) allergens. this study was the first to suggest that interactions of sp-d with glycoconjugates on particulate allergens might influence the immune response.
developmental expression of pulmonary surfactant protein d (sp-d)
modulation of surfactant protein d expression by glucocorticoids in fetal rat lung
ontogeny of surfactant apoprotein d, sp-d, in the rat lung
regulation of surfactant protein d in human fetal lung
regulation of surfactant protein d expression by glucocorticoids in vitro and in vivo
pre- and postnatal stimulation of pulmonary surfactant protein d by in vivo dexamethasone treatment of rats
• surfactant proteins a and d increase in response to intratracheal lipopolysaccharide. this important paper suggests that sp-a and sp-d participate in the acute response to lung injury.
brief exposure to 95% oxygen alters surfactant protein d and mrna in adult rat alveolar and bronchiolar epithelium
kgf increases sp-a and sp-d mrna levels and secretion in cultured rat alveolar type ii cells
characterization of the human surfactant protein d promoter: transcriptional regulation of sp-d gene expression by glucocorticoids
proximal promotor of the surfactant protein d (sp-d) gene: regulatory role of ap-1, forkhead box, and gt-box binding proteins
genomic organization of
human surfactant protein d (sp-d). sp-d is encoded on chromosome 10q22.2-23.1
novel, non-radioactive, simple and multiplex pcr-crflp methods for genotyping human sp-a and sp-d marker alleles
recognition of klebsiella pneumoniae by pulmonary c-type lectins
deficient hydrophilic lung surfactant proteins a and d with normal surfactant phospholipid molecular species in cystic fibrosis
serial changes in surfactant-associated proteins in lung and serum before and after onset of ards
decreased contents of surfactant proteins a and d in bal fluids of healthy smokers
surfactant proteins a and d in premature baboons with chronic lung injury (bronchopulmonary dysplasia). evidence for an inhibition of secretion
deficiencies in lung surfactant proteins a and d are associated with lung infection in very premature neonatal baboons
pulmonary surfactant protein d in sera and bronchoalveolar lavage fluids
surfactant protein-a concentration in bronchoalveolar lavage fluids of patients with pulmonary alveolar proteinosis
surfactant protein d. increased accumulation in silica-induced pulmonary lipoproteinosis
surfactant protein-d mediates aggregation of pneumocystis carinii
il-4 increases surfactant and regulates metabolism in vivo
this paper was the first to definitively demonstrate a classical opsonic activity of sp-d. the studies further suggest the existence of an opsonic receptor on alveolar macrophages.
key: cord-022235-6ircruag authors: pugsley, anthony p. title: later stages in the eukaryotic secretory pathway date: 2012-12-02 journal: protein targeting doi: 10.1016/b978-0-12-566770-8.50009-4 sha: doc_id: 22235 cord_uid: 6ircruag
the secretory pathway of eukaryotic cells comprises a succession of compartments, the secretory organelles, through which proteins pass en route to their final destinations. although each secretory organelle has its own special characteristics, a number of basic features are common to all of them.
some proteins pass more or less unimpeded through the entire length of the secretory pathway, whereas others are retained in secretory organelles. this raises two of the fundamental questions which will be addressed in this chapter, namely, what determines whether a protein will be taken out of circulation at a particular step in the pathway, and how tight is the separation between secretory organelles? ultrastructure and cell fractionation studies indicate that there is little or no physical link between the rer and the golgi. furthermore, specific proteins are known to reside in individual compartments of the golgi apparatus. as we shall see, the physical separation of golgi enzymes is also indicated by the succession of posttranslational modifications to which secretory proteins are subjected, by in situ immunocytochemistry, and by the separation of golgi-derived vesicles containing different enzymes. how then do secretory proteins move between these compartments, and is the secretory pathway a continuous gradient of secretory organelles or are they functionally and structurally independent? a feature common to all stages of the secretory route beyond the rer is that proteins move between secretory organelles in specific classes of transport vesicles. consequently, soluble secretory proteins never come into direct contact with the outer face of the organelle to which they are being targeted and therefore can play no direct part in sorting. integral membrane proteins usually have segments exposed on the cytoplasmic face of the transport vesicle which could be recognized by receptors on the target organelle.
microinjected antibodies recognizing the c-terminal, cytoplasmic tails of plasma membrane proteins can prevent their transport to the cell surface (25, 590), but this is probably due to antibody-induced changes in protein conformation which make the protein incompetent for transport along the secretory pathway, rather than to inhibition of receptor-secretory protein interactions. in this chapter, we will consider three different ways in which sorting of secretory proteins might occur:
(i) all secretory proteins have signals which target them successively through the secretory organelles and then on to their final target; some proteins remain in different organelles because they lack the signal necessary for targeting to the next organelle in the pathway.
(ii) secretory proteins only have sorting signals for the last stage in the secretory pathway; some proteins have retention signals which prevent them joining the bulk flow through the secretory pathway, thereby causing them to be retained in specific organelles.
(iii) as (ii), except that secretory organelle proteins pass through the secretory pathway in the bulk flow and are recycled via signal-dependent counterflow.
if either of the last two models is correct, then molecules devoid of retention or sorting signals should travel through the secretory pathway as part of the bulk flow. the rate at which molecules are transported out of the cell will thus be determined by the rate at which they diffuse to sites within the rer and golgi where transport vesicles are formed and depart en route to the next compartment in the chain. this flow rate has recently been measured with the n-glycosylation acceptor tripeptide asn-tyr-thr (section v.b.1). if this tripeptide contains a radioiodinated tyr residue, its progress through the secretory pathway, which it seems to enter by diffusing across the rerm, can be followed by chromatography and autoradiography. wieland et al.
(1190) found that the tripeptide was glycosylated in the er (this prevented it from leaking back into the cytosol), then trimmed by mannosidases located in the golgi (section v.d), and finally secreted into the medium. tripeptide was detected in the medium after 5-10 min, depending on the cell type, which is considerably faster than the time required for the transport of most secretory proteins through the secretory pathway. thus, bulk flow is very efficient, implying that diffusion through the lumen of the er and golgi can occur relatively easily and that there is massive, vectorial movement of vesicles between the organelles and the cell surface. the way in which this result influences our understanding of protein sorting in the secretory pathway is discussed in following sections. soluble and integral membrane secretory proteins fold once they have crossed the rerm; they are also often covalently modified. the following sections deal with the types of modifications which occur in the er and their role, if any, in protein sorting. palmitoylation and phosphatidylinositol modification of secretory proteins are discussed in section iii.f.4.c; only their role in protein transit through the secretory route will be discussed here. most secretory proteins are nitrogen (n)-glycosylated in the rer. because we are primarily concerned with the effects of glycosylation on protein targeting, only general details of the glycosylation reactions will be given here. a. the sequence of reactions shown in fig. v.1, which is common to both yeasts and higher eukaryotes, results in the addition of a (glucose)3-(mannose)9-(n-acetylglucosamine)2 complex onto asn residues in the sequence asn-x-ser or thr (the acceptor peptide, in which x can be any amino acid except possibly proline or aspartate). isolated tripeptides can act as acceptors only when the two extremities are blocked. longer acceptor peptides are more rapidly glycosylated (577). as shown in fig.
v.1, the entire complex is assembled before it is attached to the asn residue in the lumen of the rer. at least some sugars cross the rerm as intermediates complexed with the long-chain lipid dolichol, whereas others may cross the rerm as nucleotide 5'-diphosphate-sugar complexes. however, it is by no means certain that the complex is assembled entirely within the lumen of the rer. for example, there is no evidence that gdp-mannose can be transported across the rerm, leading to the proposal that the five mannose residues are added via gdp-mannose donors on the cytoplasmic face of the rerm, whereas later modifications, and possibly earlier modifications, occur in the lumen. this presupposes that the dolichol-pp-(nagn)2 complex can "flip" from the lumen to the cytoplasmic face and then back again once the five mannose residues have been added [see (452) for further discussion of the topology of glycosylation reactions in the rer]. the preferred donor lipid-carbohydrate complex is the complete complex shown in fig. v.1 (1129). however, truncated versions lacking glucose and even mannose residues can act as donors in vitro, and under-glycosylated complexes can act as donors in vivo in yeasts (598) and protozoa (577). variations in the mannose content of transferred oligosaccharides have also been reported (577). the transfer of the first n-acetylglucosamine (nagn) residue from the udp complex to the dolichol is inhibited by the antibiotic tunicamycin, which therefore blocks all n-glycosylation. this provides a valuable technique for studying the role of n-glycosylation in protein traffic. the yeast (s. cerevisiae) gene (alg7) coding for the rerm-associated, tunicamycin-sensitive enzyme (udp-nagn : dolichol-p-transferase) was cloned by virtue of its ability to rescue tunicamycin-treated cells when present on multiple copy number plasmids (931). null mutations in alg7 are lethal.
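the acceptor rule described above (asn-x-ser or asn-x-thr, with x possibly excluding proline or aspartate) can be sketched as a simple sequence scan. the python snippet below is a minimal illustration and not a predictor of actual glycosylation, since the presence of the motif does not guarantee that a site is used. the function name `find_acceptor_sites`, its `excluded_x` parameter, and the example peptide are hypothetical names and values introduced only for this sketch.

```python
def find_acceptor_sites(sequence, excluded_x="P"):
    """Return 1-based positions of Asn residues in Asn-X-Ser/Thr motifs.

    `excluded_x` lists residues not allowed at position X. The default
    excludes only proline; pass "PD" to also exclude aspartate. Both
    exclusions are treated as assumptions, because the text describes
    them as tentative ("except possibly proline or aspartate").
    """
    seq = sequence.upper()
    sites = []
    for i in range(len(seq) - 2):
        is_asn = seq[i] == "N"
        x_allowed = seq[i + 1] not in excluded_x
        ser_or_thr = seq[i + 2] in "ST"
        if is_asn and x_allowed and ser_or_thr:
            sites.append(i + 1)  # report the Asn position, counting from 1
    return sites


# hypothetical peptide used only to exercise the function
example = "MNGTANPSNDTQ"
print(find_acceptor_sites(example))                   # proline excluded at X
print(find_acceptor_sites(example, excluded_x="PD"))  # proline and aspartate excluded
```

a scan of this kind only lists candidate sites; whether a given asn is modified in the cell also depends on factors such as accessibility in the folding polypeptide.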
mutants affected in other stages in the yeast glycosylation pathway have been isolated by mannose suicide selection (section ii.f.1.b). only mutations blocking the earliest stages in the pathway are lethal; incomplete dolichol-linked oligosaccharides containing a minimum of four mannose residues allow normal growth, presumably because they can be attached to secretory proteins if the full-length lipid-oligosaccharide complex is absent (see above) (598). the complete oligosaccharide chain is transferred onto the asn acceptor site in the lumen of the rer. n-glycosylation is generally thought to occur as the polypeptide is being threaded through the rerm, while it is still in its unfolded state. however, some acceptor sites are not glycosylated. this may be because they rapidly become inaccessible within the structure of the folded polypeptide, although other explanations are possible (577, 968). an unexplained anomaly is that some er membrane glycoproteins have glycosylated residues exposed on the cytoplasmic face of the er (1). one explanation could be that additional nagn transferases are present on the cytoplasmic face of the smooth er, where these glycoproteins are predominantly located. the initially homogeneous oligosaccharide is processed immediately following its attachment to the polypeptide chain, initially by the removal of glucose residues by one or more glucosidases. glucose trimming of glycosylated vesicular stomatitis virus (vsv) g protein has been reported to occur cotranslationally (27). further processing steps are different in yeasts and in animal cells. in s. cerevisiae cells, a single α1-2-linked mannose residue is replaced by α1-3 mannose. a mutation in the gls1 gene coding for α1-2 glucosidase does not affect removal of the mannose residue or subsequent chain elongation, whereas removal of the ninth α1-2-linked mannose residue may be essential for outer chain elongation, which occurs in the golgi (section v.d.1) (598).
further processing of animal cell glycoproteins also occurs in the golgi, although further mannose trimming of resident er proteins and reglycosylation may both occur in the er (577, 847). c. ser and thr residues on secretory proteins of yeasts and possibly other fungi can be o-glycosylated in the er (429). the process is less well characterized than n-glycosylation but seems to involve the direct transfer of mannose residues from dolichol-1-mannose onto acceptor amino acids. studies with acceptor oligopeptides suggest that no particular sequence is required around the modification site (54, 621). further processing of the mannose residue occurs in the golgi apparatus (429) (section v.d.2). yeast secretory proteins may be both n- and o-glycosylated. most studies indicate that o-glycosylation occurs exclusively in the golgi apparatus (section v.d.2). secretory proteins apparently fold spontaneously as they are extruded across the rer membrane (194, 1222). however, disulfide bridge formation is probably catalyzed by protein disulfide isomerase (pdi), which is loosely associated with the rerm (325, 326). the formation of soluble protein complexes occurs shortly after synthesis (604), but trimerization of vsv g protein and influenza virus hemagglutinin (ha), both of which are integral membrane proteins, occurs only 7-10 min after synthesis (193, 194, 364, 589) and involves the selection of polypeptides from a random pool of prefolded monomers rather than from a restricted pool of monomers synthesized on a single polysome (108). kreis and lodish (589) suggest that this delay may be due to the segregation of pdi in a late "compartment" of the er. according to the papers discussed above, oligomerization of g and ha occurs just before they leave the er, 5-10 min after synthesis. this view is challenged, however, by yewdell et al.
(1222), who found that monoclonal antibodies specific for oligomerized ha reacted only with proteins in the golgi apparatus, and not with ha in the er (see next section). a possible explanation for this ambiguity is that ha monomers fold and trimerize in the er and are then transported to the golgi apparatus, where further modifications alter the conformation of the ha trimers to produce the antigenic sites recognized by the antibody used by yewdell et al. different proteins transit through the secretory pathway together, but the rates at which they are secreted may vary considerably (335, 343, 615, 1221). the site at which the secretion lag is most prominent seems to be transit from the rer to the golgi (646). it has been suggested that this delay might reflect the need for secretory proteins to interact with specific receptors in the rer for transport to the golgi and that carbohydrates might form part of the recognition signal (320, 647, 648). alternatively, proteins may be retained in the er until they have folded correctly; different folding kinetics may result in different retention times in the er (589, 1192). indeed, exit from the er of one of the proteins studied by fitting and kabat (320) coincided with a partial proteolysis event, and polypeptides were "selected" for exit from a random pool of "new" and "old" proteins. furthermore, studies discussed in the preceding section showed that oligomerization of virus-encoded membrane proteins occurs just before they are transported to the golgi. if secretory proteins are indeed retained in the er until they are folded and oligomerized in the correct way, then exit-incompetent proteins should either remain indefinitely in the er or be secreted at very much reduced rates.
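the suggestion that folding kinetics set er retention times can be illustrated with a toy two-step model, in which a newly made protein first folds (rate constant `k_fold`) and only then leaves the er (rate constant `k_exit`), both treated as first-order steps. this is a sketch under stated assumptions, not a model taken from the studies cited: the function name and every rate constant below are hypothetical values chosen only to show how a slower folding step delays export.

```python
import math


def fraction_exported(t, k_fold, k_exit):
    """Fraction of a protein cohort that has folded and then left the ER by time t.

    Assumes two sequential, irreversible first-order steps:
    unfolded -> folded (k_fold), folded -> exported (k_exit).
    Rate constants are in 1/min and are hypothetical.
    """
    if t <= 0:
        return 0.0
    if math.isclose(k_fold, k_exit):
        k = k_fold
        # limiting form of the two-step solution when both rates are equal
        return 1.0 - math.exp(-k * t) * (1.0 + k * t)
    remaining = (k_exit * math.exp(-k_fold * t)
                 - k_fold * math.exp(-k_exit * t)) / (k_exit - k_fold)
    return 1.0 - remaining


# hypothetical comparison: same exit step, different folding rates
for label, k_fold in (("fast folder", 1.0), ("slow folder", 0.05)):
    exported = fraction_exported(10.0, k_fold, k_exit=0.2)
    print(f"{label}: {exported:.2f} exported after 10 min")
```

in this sketch the two proteins share the same exit step, so any difference in export reflects the folding step alone, which is the point the retention-until-folded hypothesis makes.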
this could explain why genetically altered or hybrid proteins are sometimes translocated into the rer without difficulty and yet do not transit further through the secretory pathway (104, 281, 364), and why incomplete glycosylation or glucose trimming and failure to process signal peptides sometimes affect secretion kinetics (96, 577, 649, 1045). even minor sequence changes have been reported to affect protein conformation and exit from the er (1192). what happens to incorrectly folded proteins in the er? results from numerous studies indicate that they stay in the er (604, 681) or are degraded either in the er (642a) or in the lysosome (723). these proteins do not seem to precipitate in large aggregates. instead, a specific, major er protein seems to bind to some incorrectly folded proteins, thereby preventing their exit from the er. haas and wabl (413) first detected this protein (bip) complexed with immunoglobulin heavy chains synthesized in the absence of light chains (heavy and light chains are only transported to the golgi as complexes) (97), and it was subsequently found to be complexed with incorrectly oligomerized or monomeric ha (194, 364), nonglycosylated invertase (in vitro in dog pancreatic microsomes) and incorrectly folded prolactin (557), and recombinant human factor vii in chinese hamster ovary cells (561). it was not detected in association with aggregated vsv g protein (250) or with incorrectly folded (mutant) class i histocompatibility antigen (1192). the dissociation of secretory protein-bip complexes requires atp (758), which might explain part of the atp requirement for protein movement along the secretory pathway, but because atp is usually present in cell lysates, some bip complexes (such as g-bip) may dissociate during extraction. bip has also been shown to associate with nascent polypeptides as they enter the lumen of the er (557).
two interesting features of bip are that its synthesis is stimulated by glucose deprivation and that it is structurally related to a heat shock protein (758,857). glucose starvation is likely to reduce glycosylation, thereby increasing the proportion of incorrectly folded secretory proteins in the er. studies have confirmed the idea that the extent of glycosylation can affect the association between secretory proteins and bip, as well as their rate of secretion (251). the accumulation of misfolded (mutant) proteins in the er also increases bip synthesis (583), possibly because the cell senses that its store of bip has been sequestered into protein complexes. thus, synthesis of bip may be increased according to requirements (857), but studies on the "proofreading" or quality control role of bip are still at an early stage. a different, cytoplasmic quality control protein may be responsible for proofreading cytoplasmic domains in transmembrane proteins to ensure that they too are correctly folded and oligomerized (406a).

protein conformation, rather than any specific structural feature such as glycosylated residues, is thus likely to determine whether a protein is competent to leave the er. this feature is illustrated by studies on the effects of glycosylation on vsv g protein. machamer et al. (672) used site-directed mutagenesis to replace the glycosylated asn residues. nonglycosylated g protein was transported from the er to the golgi but did not reach the cell surface. when only one of the asn residues was deleted, the protein reached the cell surface; thus g protein with one glycosylated asn residue can be transported through the entire secretory pathway. g protein export became temperature-sensitive when new glycosylation sites were added (671). kotwal et al.
(581) noted that some natural g protein variants have only one oligosaccharide whereas others have none at all and suggested that compensatory changes in primary structure may allow nonglycosylated protein to fold into a secretion-competent conformation. thus, glycosylation is probably needed for correct protein folding but is not directly implicated in the formation of secretion competence signals. at least one secreted protein, ovalbumin, is not glycosylated.

the er contains a large number of proteins, some of which it shares with the contiguous nuclear membrane (117). they include proteins involved in secretory protein translocation through the rerm, signal peptidase, ribophorins, bip, enzymes involved in lipid synthesis and in protein glycosylation, and cytochrome p450. the characterized proteins are similar to other secretory proteins: they often have signal peptides (198,630,1089), are glycosylated, and may be located in the lumen or in the rerm. rothman (965) has argued that it may be physically impossible for the er to prevent the escape of endogenous proteins, and particularly membrane proteins, to the cis golgi. he cited two observations in support of the idea that these proteins could leave the rer as part of the bulk flow through the secretory pathway, and then recycle from the golgi to the er:
(i) some er proteins are detected in significant amounts in vesicles derived from the cis golgi when cells are fractionated (102).
(ii) bulk lipid flow out of the er necessitates efficient recycling, possibly from the cis golgi, which could provide "carriers" for the recycling of er proteins.
most er proteins are, however, almost completely excluded from the golgi (663,697). furthermore, er proteins are not terminally glycosylated (117,1089), which means that they do not reach the medial or trans golgi (section v.e).
mannose residues on some er proteins are not trimmed beyond the stage catalyzed by er mannosidase (117,954,1089), but a lysosomal protein carrying an er retention signal (see below) is phosphorylated by nagn phosphotransferase, indicating that it reaches the early cis golgi (858). significantly, the phosphate groups are not modified by nagn phosphodiesterase, which may be located in a different, er-distal golgi compartment (table v.1). warren (1166) has proposed that er proteins are salvaged from an intermediate compartment, the so-called transitional element (section v.c), located between the er and the cis golgi. recycling of "escaped" endogenous er proteins, rather than receptor-mediated retention in the er, is the currently favored model for the specific accumulation of proteins in the er (151). it is not clear whether soluble and er membrane proteins are both subject to the same retention mechanism, but studies on the rates of diffusion of membrane proteins in the er indicate that rerm proteins are more restricted than bip and that the mobility of bip devoid of its er retention-salvage signal (see below) cannot be distinguished from that of normal bip (151).

several studies have sought to determine the nature of the er retention-salvage signal(s). the rotaviral type ii er membrane protein vp7 was found to be secreted when the two potential n-terminal transmembrane segments were deleted (884), suggesting that retention-salvage depended on a membrane anchor domain. however, subsequent studies showed that the entire n-terminus was absent from mature vp7 (1084) and thus that some other feature must account for vp7 retention in the er (883c).
although deletion of the c-terminal transmembrane and cytoplasmic tails of the type i er membrane protein e19 of adenovirus also causes the protein to be secreted, deletion of the last eight residues (fidekkmp) of the c-terminal, cytoplasmic tail alone causes the protein to appear on the cell surface, suggesting that the c-terminus contains the er retention-salvage signal. furthermore, cell-surface interleukin 2 receptor protein β chain was converted into a resident er protein following fusion of fidekkmp to its cytoplasmic c-terminus (832). the cytoplasmically exposed signal could be recognized by "salvage receptors" in the transitional element.

these studies on the e19 protein are reminiscent of earlier, seminal work on the soluble er protein bip which resulted from the observation that three soluble proteins in the lumen of the er, including bip, had the same four residues, lys-asp-glu-leu (kdel), at their c-termini. munro and pelham (759) found that if this sequence was deleted or extended, bip was secreted into the medium. this suggests that kdel is the er retention signal and that it must be at the extreme c-terminus, which is presumably the last segment of the polypeptide to fold and may therefore be exposed on the surface of the protein. in a complementary study, dna coding for kdel was fused to the end of a cdna clone coding for secreted lysozyme. the resulting hybrid protein was retained in a perinuclear region probably corresponding to the er (759).

further studies are required to determine whether kdel is the "universal" er retention signal. preliminary studies suggest that some mammalian er proteins have the sequence rdel at the c-terminus, whereas yeast er proteins have the sequence hdel (858a). hdel or kdel is presumably recognized either by an endogenous er membrane protein, which could anchor soluble proteins in the er, or, more likely, by the recycling receptor located in the transitional element or cis golgi.
indeed, a putative hdel receptor gene (erd1) has recently been identified in yeast cells (858,858a). it is not clear what triggers the release of bip from its receptor once it is recycled to the er.

β-glucuronidase is a typical lysosomal protein; small but significant amounts of it, however, are retained in the er through a specific interaction with an endogenous er protein, the esterase egasyn. studies by medda et al. (704) indicate that the β-glucuronidase-egasyn interaction is blocked by inhibitors of esterase activity, leading to β-glucuronidase secretion (rather than sorting to the lysosome). whether this is a physiologically significant mechanism for retaining proteins in the er remains to be determined. only about 10% of egasyn is complexed with β-glucuronidase (666), so perhaps it also binds to other resident er proteins. it would be interesting to see whether egasyn itself has a c-terminal kdel-like sequence.

as discussed above, secretory proteins do not move to the cis golgi until they attain an exit-competent state, and some proteins are specifically retained in the er, probably by recycling of escaped proteins. both normal secretory proteins (1223) and exit-incompetent proteins have been reported to accumulate in a specific region of the smooth er, the vacuolar transitional element (79,996), from which vesicles either migrate to and fuse with the cis golgi (41) or coalesce to form new golgi cisternae (797). specific membrane proteins [e.g., the product of the sec12 gene in yeasts (767a)] may be required to form this specialized domain of the er and hence directly or indirectly assist vesicle formation and protein transport to the golgi. secretory pathway "shuttle" vesicles are difficult to isolate because (533) also showed that atp depletion caused secretory proteins to accumulate in transitional elements, but this study did not distinguish between energy requirements for protein folding and for vesicular transport from the transitional element to the golgi.
a different in vitro assay was developed by haselbeck and schekman (428) to study protein movement from the er to the golgi in yeast cell extracts. their assay uses donor er vesicles derived from a strain carrying the sec18ts mutation, which blocks protein movement from the er, and the mnn1 mutation, which prevents terminal (golgi) mannosylation of secretory proteins. invertase accumulated in the er of cells grown at the restrictive temperature was mannosylated after transfer to recipient golgi from wild-type cells. the transfer efficiency was low, however, possibly because invertase accumulated at the restrictive temperature did not become exit-competent at the permissive temperature in vitro, or because recipient golgi were saturated. as in the mammalian system, protein transfer was atp-dependent and required soluble cofactors including sec18 protein (269b) as well as proteins on the surface of the recipient golgi. a similar but more efficient reconstitution system using gently lysed yeast cells has recently been developed (35a). results obtained with this system indicate the probable requirement for gtp in er to golgi traffic.

secretory proteins are subjected to further chemical modification as they transit through the golgi. these processes are discussed in the following sections (except palmitoylation, which was covered in section iii.f.4.c); their significance with regard to compartmentalization of the golgi apparatus and their role in protein targeting will be discussed in later sections of this chapter.

in section v.b.1, we saw that the basic oligosaccharide core on asn residues is trimmed by glucosidases and mannosidases before secretory proteins leave the er. following their arrival in the golgi, mannose residues on secretory proteins may be processed in one of two ways, depending on whether they are targeted to the lysosome. soluble lysosomal proteins carry phosphorylated mannose residues which function as lysosomal sorting signals (section v.g.5).
they undergo specific mannose phosphorylation catalyzed by one or two n-acetylglucosaminylphosphotransferases and n-acetylglucosamine-1-phosphodiester α-n-acetylglucosaminidase (phosphodiesterase) in different cis golgi compartments (614a). sequential action of these enzymes results in the transfer of n-acetylglucosamine-1-phosphate from udp-nagn to any one of five mannose residues in the oligosaccharide core, followed shortly afterwards by the removal of the n-acetylglucosamine to expose the phosphomannosyl group (577). goldberg and kornfeld (379) found three partially phosphorylated peptides in β-glucuronidase, indicating that lysosomal enzymes may not be uniformly phosphorylated.

lysosomal enzymes presumably contain signals which are recognized by the phosphorylating enzymes (919). deglycosylated (endoglycosidase h-treated) lysosomal cathepsin d is not a substrate for the phosphorylation reaction in vitro, but it inhibits phosphorylation of intact lysosomal enzymes. proteolytic fragments of glycosylated cathepsin d are also not phosphorylated, and do not inhibit phosphorylation when they are dephosphorylated, indicating that the "phosphorylation signal" is probably not a linear sequence of amino acids (612). frog oocytes are also able to recognize the phosphorylation signal of human cathepsin d (301). renin is closely related to cathepsin d and yet is secreted by mammalian cells, presumably because it lacks the phosphorylation signal. however, renin produced in oocytes is phosphorylated, remains intracellular, and is degraded (presumably in the lysosome) (302). this suggests that renin has a phosphorylation signal which is recognized by amphibian but not by mammalian phosphorylating enzymes.

outer chains on lysosomal and nonlysosomal secretory proteins of complex eukaryotes may be further trimmed by golgi mannosidase i to leave five mannose residues on the core oligosaccharide.
further modifications involve the addition of an n-acetylglucosamine residue by n-acetylglucosaminyltransferase i, further removal of two mannose residues by golgi mannosidase ii, fucosylation of the innermost n-acetylglucosamine by fucosyltransferase, and the addition of galactose, n-acetylglucosamine, and sialic acid residues by appropriate transferases (fig. v.2). oligosaccharides of both lysosomal and nonlysosomal proteins may be further modified by sulfation of mannose and n-acetylglucosamine residues and o-acetylation of sialic acid residues (577).

outer chain modification in yeasts is markedly different. golgi mannosyltransferases extend the basic (man)8 core oligosaccharide to produce large mannan structures typical of many yeast mannoproteins, which may carry as many as 150 mannose residues (598). the single o-linked mannose residue on yeast glycoproteins (section v.b.1.c) may be further modified by the addition of up to four more mannoses transferred from gdp-mannose (598,1107 blocked in sec mutants, which prevent secretory proteins from reaching the golgi (429).

in complex eukaryotes, hydroxyl groups of ser or thr residues are o-glycosylated by enzymes thought to be located exclusively in the golgi (418,786). acceptor octapeptides can be o-glycosylated. n-acetylgalactosamine is the primary sugar in o-linked oligosaccharides; further residues of galactose, sialic acid, fucose, n-acetylgalactosamine, and n-acetylglucosamine can then be added. lipid intermediates are not involved in o-glycosylation, and the reaction is not inhibited by tunicamycin (418), but it is not known whether other sugar transferases are involved in both n- and o-glycosylation.

glycosylated and unglycosylated secretory proteins are major substrates for tyrosine sulfation (501). tyrosylprotein sulfotransferase is enriched in golgi-derived membrane fractions and has its active site oriented toward the golgi lumen (619).
the sulfate donor is 3'-phosphoadenosine 5'-phosphosulfate. almost all tyrosine-sulfated proteins have an acidic residue, a glycine or proline residue, and no cysteines, basic residues, extended secondary structure, or n-glycosylation sites close to the modified tyrosine, but there does not appear to be a strict consensus sequence around the modification site (474,619). inhibition of tyrosine sulfation is reported to retard the exit of a secretory protein from the tgn (331b).

many secretory proteins undergo secondary proteolytic processing following removal of the signal peptide. the best characterized examples of this class of processed secretory proteins are α factor and killer toxin of yeasts. the α-factor peptide is present four times in the pro-α-factor protein which reaches the golgi apparatus. this probably represents a way of reducing "shipping costs" since α factor itself may be too small to be efficiently transported to the rerm and it would be inefficient to pad the polypeptide out with redundant sequences (602). julius et al. (545) found that pro-α-factor processing was blocked by sec mutations, which prevented protein movement through the golgi, and that mature α factor was normally present in secretory vesicles en route to the cell surface. thus, processing occurs in the golgi. some viral coat proteins such as influenza virus ha are also proteolytically processed in the trans-golgi network (tgn) or trans golgi. processing of mammalian prohormones, which is similar to that of pro-α-factor, occurs in secretory granules budding from the trans golgi [see section v.g.4.c and (1003)].

the four 13-residue-long α-factor peptides in pro-α-factor were found to be separated by 6-8 residues (lys-arg-glu-ala-asp-ala-glu-asp) (602). this suggests that at least one processing protease has trypsin- or chymotrypsin-like activity. killer toxins are processed at similar sites (245,1074).
in fact, the enzyme which performs this initial processing step, the product of the kex2 gene, is a ca2+-dependent thiol protease which cleaves between basic residues. the enzyme is inhibited by α1-antitrypsin and can correctly process mammalian proalbumin (53). strains mutated at kex2 secrete unprocessed pro-α-factor (546). the product of the ste13 gene, a membrane-associated aminopeptidase (544), processes the n-terminal tetrapeptide, and further processing of the c-terminus is performed by kex1-encoded carboxypeptidase (245). other secretory proteins produced by different species of yeasts may also be processed by one or more of these enzymes (695b).

results from a number of experimental approaches (summarized below) show quite conclusively that individual golgi cisternae are separate, biochemically and functionally distinct entities organized according to a very strict pattern and that secretory proteins progress in a synchronized wave from one end of the stack of cisternae to the other en route to their final destinations. according to these data, golgi stacks must contain at least three distinct cisternae (generally referred to as cis, medial, and trans according to their orientation with respect to the er). heterogeneity within these "domains" indicates that the actual number of cisternae is probably higher than three (see table v.1, fig. v.3) (299).

cells normally have one or a very limited number of golgi stacks. the stacks break up into clusters of many vesicles during mitosis, apparently to ensure equal partitioning of golgi components to daughter cells, although the number of golgi clusters produced is far in excess of that required for this purpose. golgi breakdown presumably involves membrane fission, so each cluster of vesicles contains cis, medial, and trans components (662). protein secretion is usually shut down during mitosis or meiosis, but the yeast s.
cerevisiae continues to process and secrete invertase during mitosis, implying that the golgi fragments remain active (685).

in general, results from different analyses give a coherent picture of the cis to trans organization of golgi cisternae (see table v.1), on the basis of which a simple map can be drawn (fig. v.3). the segregation of modifying enzymes presumably allows prosthetic groups to be added or removed according to a strict sequence and prevents competition between processing enzymes which could act on the same substrate (266). table v.1 is five, but the actual number of golgi cisternae may be higher than that shown. nagn, n-acetylglucosamine; tgn, trans-golgi network.

early studies showed that one or two cis (er-proximal) golgi cisternae were preferentially stained during prolonged exposure to osmium [presumably due to strong reducing conditions in these cisternae (333)]. subsequent histological and cytochemical tests showed that many enzymes involved in protein glycosylation and other reactions, as well as some proteins with unknown function, were not evenly distributed through the golgi stack but were compartmentalized in one or two cisternae (table v.1). the techniques employed include immunocytological detection of proteins using specific antibodies, enzymatic cytochemical reagents for specific enzymes, and the detection of lectin-specific sugar residues on terminally modified glycoproteins in transit through the golgi (table v.1). as noted by farquhar (299), some studies with different cell lines give conflicting results, suggesting that the organization of the golgi stack may vary depending on cell type and function.

intermediates in the secretion pathway can be detected by pulse-chase experiments in which oligosaccharides or other prosthetic groups are labeled by the metabolic incorporation of radioactive precursors.
the sequence in which processing occurs can thus be determined and correlated with the location of processing enzymes in the golgi cisternae.

golgi enzymes involved in n- and o-glycosylation, sulfation, and palmitoylation of secretory proteins can be separated by density gradient centrifugation of golgi-derived vesicles, which apparently differ in density due to differences in cardiolipin content (817). enzymes identified in cis golgi by kinetic and histological or cytochemical tests are generally found in heavier golgi membrane fractions, and density is now widely used to define the approximate location of golgi proteins in the stack (table v.1).

the ionophore monensin slows or arrests intra-golgi transport and inhibits late (trans) golgi functions, and can thus be used to distinguish between early and late golgi processing events. monensin also causes secretory proteins to accumulate in medial or late golgi compartments, which become distended and vacuolated, aiding their identification by cytochemical methods (400,905,1108). however, as discussed by dunphy and rothman (266), studies on the effects of monensin in different cell types often give conflicting results and do not necessarily indicate the precise site of secretory protein accumulation or processing.

oligosaccharides on secretory proteins are only cleaved by endoglycosidase h before they are processed by nagn transferase i and mannosidase ii. these enzymes are probably located in medial golgi cisternae. thus, endoglycosidase h sensitivity provides a simple test for determining whether secretory proteins have reached these cisternae (267).

rothman et al. (969,970) demonstrated that pulse-labeled vsv g protein present in the golgi of a cell lacking a particular modification enzyme can be modified upon fusion with a cell producing the modifying enzyme.
the golgi cisternae from donor and recipient cells did not fuse, and the g protein, rather than the modifying enzyme, was transferred from one cisterna to the other. these observations have implications for the mechanisms of protein movement between cisternae (section v.f) but, like in vivo kinetic experiments, may also indicate the sequence in which prosthetic groups are added or removed (

having established that secretory proteins are progressively processed as they migrate through the golgi stacks, we now turn our attention to how the proteins themselves migrate between cisternae, and how golgi proteins can be specifically retained within specific cisternae.

of all of the models explaining intra-golgi movement of secretory proteins (300), only that invoking shuttle vesicles fits the experimental data showing that golgi cisternae are distinct entities. numerous protein-coated vesicles are produced by golgi stacks in vitro under conditions which favor the intra-golgi movement of secretory proteins (see below) (40,820). these uniformly sized vesicles were shown to contain at least one secretory protein, vsv g protein, indicating that they could be bona fide intra-golgi shuttle vesicles (820). similar vesicles were also found around the golgi apparatus in situ (996,1088). furthermore, different secretory proteins destined to different locations and exported or secreted at different rates were present in the same vesicles (1088), which agrees with the idea that secretory proteins are not segregated from each other during intermediate stages in the secretory pathway. the protein coat on these vesicles is distinct from the clathrin-containing coats present on endocytic and lysosomal vesicles and on budding secretory granules (see section v.g.6). the only regions of the golgi apparatus which have protein coats are those from which the golgi shuttle vesicles bud.
a number of different conditions prevent intra-golgi movement of secretory proteins in vivo and in vitro. the ionophore monensin (section v.e.4) probably prevents movement through the trans cisternae by disrupting a proton gradient maintained by an atp-dependent proton pump present in golgi membranes (48), thus causing the ph of the normally acidic trans golgi cisternae to rise. saraste and hedman (994) demonstrated that migration between different golgi cisternae was blocked at different critical temperatures and suggested that this might be caused either by atp depletion or by changes in membrane fluidity. atp may be required to maintain the acidic ph of trans cisternae, and vesicle fission and fusion, which must occur at the cisternal membranes, can probably only occur when membrane lipids are in a "fluid" state above the phase transition temperature.

atp and soluble cytosolic factors including a 74-kda, n-ethylmaleimide-sensitive protein (93a) are required in both early and late stages of intra-golgi movement in vitro (39,40,685a,966) and in permeabilized cells (62). intriguingly, soluble yeast cell extracts can replace endogenous cytoplasmic components in a mammalian cell-derived assay system for intra-golgi movement of proteins (268), raising the possibility of using extracts from yeast sec mutants blocked at different stages in the secretory pathway to define the role of the corresponding wild-type gene products in intra-golgi transport. indirect evidence based on the effects of gtp analogs suggests that gtp is also needed for intra-golgi movement of secretory proteins (710), and studies with antibodies against a yeast gtp binding protein indicate that a similar protein is apparently located in the golgi apparatus in multicellular eukaryotes (1030). the significance of this observation is not understood, but gtp and gtp binding proteins (g proteins) may be involved in maintaining vectorial movement of shuttle vesicles or in vesicle recycling (see below).
surface components on golgi membranes are also required for intra-golgi transport (3,39,40). a receptor may recognize the protein coat on the golgi transport vesicles.

balch et al. (40) considered that the migration of secretory proteins from one cisterna to another depends on three separate events:
(i) secretory proteins are primed to make them competent or available for transport. priming probably involves the migration of secretory proteins to regions of the cisternae where budding occurs, principally at the outer rims. this step might depend on interactions between the secretory protein and receptors migrating to the budding areas or may rely on free diffusion within the cisternal membrane or lumen. segregation may depend on signals generated by processing enzymes in individual golgi cisternae, but most experiments in which processing inhibitors have been used do not support this idea and show instead that processing is nonessential for secretory protein targeting. some mutationally altered secretory proteins, however, transit normally through the early stages of the secretory pathway and yet are not transported through the golgi (344,1040). perhaps secretory proteins must fold into a particular conformation to be competent for movement between golgi cisternae. the accumulation of such abnormal proteins in the golgi may cause it to become distended, but this does not drastically affect intra-golgi movement and secretion of other secretory proteins (344).
(ii) vesicles move from one cisterna to another. vesicle migration between golgi cisternae may be either vectorial or random. vectorial movement is more compatible with traditional views on the strict sequence of events in secretory protein processing and is supported by the observation that the likelihood of forward transfer is at least five times greater than that of lateral movement in golgi of fused cells (969,970).
this implies that there are receptors on the surfaces of golgi cisternae which recognize vesicles budding from the preceding cisterna in the chain. nonetheless, the topology in the golgi complex is well maintained, possibly because the cytoskeleton prevents cisternae from coming into direct contact. another puzzling feature of vesicular intra-golgi transport is that small transport vesicles would be expected to diffuse away from the golgi complex. the cytoskeleton may restrict the movement of golgi vesicles and may even play a more positive role in directing vesicles between cisternae. however, although movement along microtubules has been well documented for large organelles (1136) and may be important in the sorting of some proteins leaving the golgi complex (section v.g.3), there is no evidence that it could play more than a minor role in the movement of vesicles over the very short distances which separate golgi cisternae (139). furthermore, cell fusion studies by rothman et al. (969) show that inter-golgi movement of vsv g protein can occur, indicating either fusion of the two golgi complexes or, more likely, that movement between golgi cisternae is dissociative; i.e., the vesicles do indeed diffuse into the cytoplasm.
(iii) vesicle fusion with the membrane of the acceptor cisterna and release of vesicle contents into the lumen of the cisterna occurs.

this model for intra-golgi transport fits many experimental observations, but further studies are required to define clearly the steps involved. for example, atp may be required for vesicle fission and fusion, as well as for reducing the ph in trans golgi compartments, but this has not been proven, and the nature and role of some of the cytosolic components required in in vitro assays have yet to be determined. further work is needed to define how proteins find their way to budding regions of the golgi membrane and what triggers vesicle formation and vesicle fusion with acceptor membranes.
how are proteins specifically retained in individual golgi cisternae? we saw earlier that receptor-dependent recycling of er proteins from the cis golgi or an intermediate compartment could explain how these proteins remain almost entirely in the er (section v.b.4). golgi residents probably also have signals which are recognized by some kind of receptor. indeed, coronavirus e1 membrane glycoprotein appears to have a golgi retention signal in one of its transmembrane domains (670). however, recycling of escaped proteins is a far less attractive model for explaining how golgi proteins are retained than it is for the case of er proteins. one possibility, proposed by pfeffer and rothman (872), is that the membranes of golgi cisternae have two domains: a fluid domain, close to the budding rims, and an immobile phase, in which endogenous membrane proteins are anchored. a protein would require a signal to associate with a receptor in the immobile phase or to become anchored to it, whereas all other proteins would enter the mobile membrane phase or the bulk phase of the cisternal lumen. pfeffer and rothman point out that this model reduces the number of proteins which need to have signals for routing through or retention in the golgi, because fewer proteins are retained in the golgi than transit through it. cytoskeletal structures, including possibly the cytoplasmic matrix, which "glues" the cisternae together, were proposed to limit movement in the immobile phase. an alternative idea, also considered by pfeffer and rothman, is that endogenous golgi proteins interact to form patches which, by virtue of their size, are too large to fit into transport vesicles.

secretory proteins transit through the golgi apparatus and arrive in the trans cisternae together. the trans golgi compartment is therefore the point at which the different branches of the secretory pathway diverge.
the following sections deal with the site at which sorting occurs, the ways in which proteins are sorted, and what happens when vesicles carrying secretory proteins arrive at their destinations. as its name suggests, the trans-golgi network (tgn, also called golgi endoplasmic reticular lysosomes or gerl) is the most distal compartment of the golgi apparatus (relative to the rer). it differs from the golgi cisternae in that it has a distended, reticular appearance rather than that of a flattened dish. early morphological studies suggested that the tgn was a reticular adjunct of the golgi specifically involved in lysosome biosynthesis [lysosomal enzymes were originally thought to bypass the golgi (398)]. the tgn is now known to be distinct from endosomes or lysosomes. endocytosed horseradish peroxidase does not accumulate in the tgn (401), although some endocytosed proteins may be recycled to the cell surface via the trans golgi and especially via the tgn (319) (see chapter viii). the tgn probably contains tyrosinyl sulfotransferase (33), acid phosphatase, sialyltransferase, and galactosyltransferase (365,957) (table v.1, fig. v.3), although tgn-derived vesicles are difficult to distinguish from those derived from the trans cisternae. the tgn also marks the site at which assembled clathrin, one of the proteins which coat some secretory and endocytic vesicles, appears along the secretory pathway (see section v.g.6). the tgn is the most acidic golgi compartment, although the ph is almost certainly not as low as in secretory granules (823) or in lysosomal sorting vesicles, in which low ph causes the dissociation of lysosomal proteins from the mannose-6-phosphate receptor (577) (see section v.g.5). furthermore, influenza virus hemagglutinin, which is activated at low ph, reaches the cell surface as a nonactivated form, indicating that it does not spend an appreciable period (more than 2 min) in a compartment with a ph of less than 6 (107).
different secretory proteins accumulate together in the tgn (1088,1123), which can become further distended when the load of secretory or lysosomal proteins increases (398). transport of secretory proteins from the tgn is blocked at 20°c, and numerous protein-coated and naked vesicles accumulate as buds on the surface of the tgn (401). different types of vesicles seem to be involved in sorting secretory proteins into different branches of the secretory pathway. thus, different classes of soluble secretory proteins may be segregated into different domains of the tgn according to their interaction with specific receptors (fig. v.4). although this model does not explain how receptor proteins segregate into these domains, it does serve to illustrate the possible mechanisms involved in receptor-dependent segregation and packaging of secretory proteins discussed in the following sections. [fig. v.4 legend: note that constitutive default sorting is not receptor-mediated and that sorting into the regulated secretory branch of the pathway may result from protein aggregation and granule formation rather than receptor interactions. tgn, trans-golgi network; l, lysosome; pl, prelysosome.] default sorting is the final stage in the secretion of proteins which lack specific sorting (lysosomal, vacuolar, or polarity) signals and which are not accumulated within specific secretory storage granules of the regulated branch of the secretory pathway, i.e., those which are constitutively secreted (fig. v.4). (note, however, that some constitutively secreted proteins carry sorting signals.) immunocytochemical studies show that proteins secreted by the default pathway accumulate in secretory vesicles which do not have protein coats (401,821). plasma membrane proteins are transported to the cell surface in the same vesicles as secreted proteins (114, 462, 1088). (1204) (see section ii. f.
3) required atp and was sensitive to trypsin, implying that at least one cytoplasmic or vesicle-cytoplasmic membrane surface protein is involved. atp might be needed for fusion between the vesicle and cytoplasmic membrane. saccharomyces cerevisiae strains carrying a temperature-sensitive mutation in the sec4 gene accumulate large numbers of secretory vesicles. these were purified by walworth and novick (1165), who found them to contain three dominant proteins (110 kda, 40-45 kda, and 18 kda) together with invertase, the secretory marker protein. these three proteins were made during the period in which the vesicles accumulated, i.e., after secretion had been shut down at the nonpermissive temperature. the 110-kda protein was the most abundant protein in the lumen of the secretory vesicles, whereas the other two proteins were membrane-associated and had cytoplasmic domains which could interact with the cytoskeleton, the cytoplasmic membrane, or cytoplasmic components (1165). it should be noted, however, that exocytosis in s. cerevisiae might be considered as polarized rather than default sorting because secretory vesicles are directed towards the growing bud rather than being randomly distributed over the entire cell surface (section v.g.2 and fig. ii.5). the sec4 gene was cloned and sequenced by salminen and novick (987), who found it to be homologous to gtp-binding ras regulatory proteins of higher eukaryotes. whether this is significant for the role of sec4 in secretion is unclear, especially since it is not known whether gtp is required for exocytosis in yeasts. recent studies show that sec4 protein binds gtp (391). overexpression of sec4 suppresses the effects of mutations in three other sec genes, but strains carrying sec4ts mutations rapidly become secretion-defective at the nonpermissive temperature, suggesting a direct role in secretion.
the cytoplasmic and secretory vesicle membrane-associated sec4 product does not itself appear to be a secretory protein because its predicted primary sequence does not include a potential secretory routing signal. one possibility raised by salminen and novick is that the c-terminal cysteine residue of sec4 is acylated, as are the gtp-binding ras proteins, and that the acyl groups anchor the polypeptide in the membrane, but this could not be confirmed directly. another gtp-binding protein, ypt1 (which is 50% homologous to sec4), was detected close to the bud as well as in ill-defined structures (possibly the er and golgi) of s. cerevisiae cells (1030). mutations in the ypt1 gene affect several stages in the secretory pathway, but ypt1 protein probably plays an indirect role in protein transport as a result of its role in ca2+ regulation (1014). the plasma membrane of epithelial cells is divided into two distinct domains. the apical surface, which may have microvilli, is oriented toward the outside (e.g., lumen of the intestine), whereas the basolateral surface is on the inside, facing the basolateral surface of other cells or resting on an extracellular matrix of basal lamina (fig. v.5). the two membrane domains have distinctly different lipid contents in their outer leaflets [the apical membrane has a higher glycolipid and cholesterol content, and a lower phosphatidylcholine content than the basolateral membrane (1050)], although the lipid contents of their inner leaflets may be identical (inset to fig. v.5). this implies that only the outer leaflets of the two membranes are separated by tight junctions, morphologically distinct structures rich in nonbilayer lipids and containing unique proteins which form the junction between apical and basolateral surfaces and probably prevent the movement of outer leaflet lipids and proteins between them, as well as acting as ion gates between adjacent cells (fig. v.5) (177, 260, 409, 706-708, 1050).
there may also be differences in basal and lateral membrane composition. cell-cell interactions are required for efficient formation of the basolateral surface of polarized cells, but not for sorting to the apical zone (1143). adjacent cells may be held together by desmosomes. the major breakthrough in studies on protein sorting in polarized cells (878). in general, the steady-state distribution of polarized membrane proteins indicates that sorting of basolateral protein is highly efficient (>97% fidelity) (873), whereas apical proteins may be found in significant amounts in the basolateral membrane. however, the surface area of the apical membrane is usually much smaller than that of the basolateral membrane, which means that the fidelity of apical targeting is actually quite high (about 88%) (873). basolateral and apical sorting seem to represent two distinct pathways, but most polarized cells also secrete some proteins from both surfaces, implying that neither basolateral nor apical sorting is a default pathway (390,574). furthermore, lysosomal proteins are secreted by the default pathway from both apical and basolateral surfaces when lysosomal sorting is blocked (140). this may not be the case in caco-2 cells, in which high-level basolateral secretion of normally nonpolarized lipoproteins implies that basolateral sorting predominates over all other sorting pathways, including the default pathway (136,930a). do polarized cells sort and direct proteins directly to their target membranes, or are they all first randomly targeted to both domains, or specifically to one or other domain, and then transcytosed? although some studies designed to answer this question have been conducted with endogenous proteins, the most detailed results come from studies on the basolaterally targeted vsv g protein and the apically targeted influenza virus hemagglutinin (ha) in mdck cells.
furthermore, it is through derivatives of these proteins that most recent studies have attempted to identify apical and basolateral targeting signals (see next section). the results of pulse-chase experiments combined with a trypsin sensitivity assay and tests with anti-ha antibodies applied to the basal surface of mdck cells led matlin and simons (695) to conclude that ha was targeted directly to the apical surface. conversely, pfeffer et al. (873) showed that pulse-labeled g protein went directly to the basolateral membrane, where it could be detected with specific monoclonal antibodies. another approach to studying the sorting of ha and g proteins is to "freeze" them in the golgi apparatus by cooling the cells to 20°c (which specifically blocks protein exit from the tgn) or to use cells infected with viruses carrying temperature-sensitive mutations in the ha or g genes at the restrictive temperature. the bulk movement of these proteins can thus be followed by immunocytochemistry when the cells are restored to the permissive temperature. rindler et al. (929) found that gts protein was transported directly from the golgi to the adjacent lateral cell surface, whereas wild-type ha accumulated at 20°c was sorted directly to the region of the apical surface closest to the tgn. similarly, pfeffer et al. (873) found that g protein accumulated in the tgn went to the basolateral membrane 67 times faster than to the apical surface when the cells were warmed to 37°c. these studies are compatible with the idea that apical and basolateral proteins go directly to their respective target membranes. different results were reported by bartles et al. (50), who found that endogenous apical proteins in hepatocytes appeared first in the basolateral membrane and were subsequently redistributed to the apical surface by transcytosis (section viii.a.5.b).
whether this result indicates the existence of totally different mechanisms for sorting apical proteins in kidney and liver cells could be tested by expressing hepatocyte genes for apical proteins in mdck cells. sorting of polarized secreted and membrane proteins is assumed to depend on sorting signals (probably signal patches) [see section i.d.2 and (88)] present in the sorted proteins and on their interaction with receptors. sorting presumably occurs in the tgn, where at least one set of cognate receptors is probably located. to date, the primary approach used to locate sorting signals has been to introduce major sequence alterations, including the deletion of cytoplasmic or transmembrane domains from the ha, g, or other polarized viral glycoproteins, or to create hybrids between them. even these relatively unsophisticated studies give conflicting results. for example, mcqueen et al. (702) and roth et al. (959) found that either deleting the transmembranous and cytoplasmic tail of ha (to make a soluble, secreted protein) or replacing them with the corresponding regions of g did not affect apical targeting and concluded that the apical sorting signal was located in the n-terminal, extracytoplasmic domain of ha. gonzalez et al. (387) reported, however, that truncated ha was secreted by the default pathway and concluded that the sorting signal was in the c-terminal transmembranous or 10-amino-acid cytoplasmic domains. there is also disagreement concerning the location of the g protein basolateral sorting signal, which, according to the deletion and ha gene fusion studies of puddington et al. (835) and gonzalez et al. (387), is located in the 29-amino-acid, c-terminal cytoplasmic domain. stephens and compans (1080) mcqueen et al. (703) have suggested that the origin of these conflicting results may be the fact that recombinant genes are not stably expressed in mdck cells.
their studies (702,703) were carried out shortly after infection by recombinant viruses and are, they claim, more likely to reflect the true sorting pathway of the recombinant gene product. other groups may in fact be studying the sorting of hybrid or truncated proteins which have been further modified by genetic selection for more stable cell lines. another argument in favor of the idea that sorting signals are in the extracytoplasmic domain is that this part of the protein is initially localized in the lumen of the tgn, where sorting is presumably initiated, and could therefore bind to hypothetical sorting receptors which segregate sorted proteins from those destined for default export or secretion. these receptors may themselves have cytoplasmic domains recognized by receptors on the basolateral or apical surfaces. furthermore, such a system could handle both secreted and exported (membrane) proteins. however, there is some evidence, based on the effect of nh4cl (or ph) on basolateral sorting of laminin and heparan sulfate proteoglycan, that soluble and membrane proteins may be sorted by different mechanisms (142). clearly, more work remains to be done before we can identify polarity sorting signals with any precision. one of the difficulties may be that these signals are probably not linear sequences of amino acids and are therefore less amenable to gene fusion studies, which have been so useful in identifying routing signals. even very subtle conformational changes may alter sorting signals and render them nonfunctional; this could explain why prevention of glycosylation with tunicamycin may cause normally polarity-sorted proteins to enter the default pathway (1135). another aspect of the work on polarity sorting signals which needs to be pursued is the identity of receptors and the stage at which sorted proteins dissociate from them. sorting receptors are presumably recycled.
little is known about how the vesicles are targeted to different regions of the plasma membrane. one possibility is that sorting of lipids (fig. v.5) is involved (709,1050a). alternatively, sorting vesicles could interact with components of the cytoskeleton, along which they are driven by atp-dependent motors attached to microtubules; further receptor-ligand interactions might complete the sorting process when vesicles arrive at the plasma membrane (1136). although there is no evidence that microtubules are required for default sorting of secretory proteins (1136), rindler et al. (930) have reported that colchicine and other microtubule-disorganizing agents abolished specific apical sorting of ha and caused influenza virus to bud from both cell surfaces in polarized cells. salas et al. (986) obtained the opposite result, however, and neither group observed any effect on basolateral targeting of vsv g protein. rogalski (944), however, found that agents which caused microtubule disassembly caused random sorting of g protein. thus, the role of the cytoskeleton in the targeting of polarity-sorted secretory proteins in complex eukaryotes remains unclear. a temperature-sensitive mutation in the s. cerevisiae actin gene results in abnormal exocytosis and bud formation at the nonpermissive temperature, suggesting that actin filaments may direct secretory vesicles to the growing bud in yeast cells (793). there is evidence to show that the cytoskeleton may be important in maintaining polarized distribution in complex eukaryotic cells (410). a polarized atpase has recently been shown to bind directly to ankyrin, a protein which is known to link an integral erythrocyte membrane protein to spectrin and actin of the erythrocyte cytoskeleton (770). thus, ankyrin might link the atpase to microfilaments and thereby maintain its polarized distribution.
two features distinguish regulated exocytosis from other branches of the secretory pathway: (i) protein release only occurs when the cells are stimulated by secretagogues (e.g., cyclic amp or ca2+). (ii) proteins accumulate within the cell before their release. proteins accumulate in a special class of secretory vesicles called secretory granules, wherein protein concentrations reach such high levels that they become electron-dense and can be clearly identified by electron microscopy. the regulated secretory pathway is restricted to certain cell types (e.g., endocrine and exocrine cells, mast cells), and only certain types of proteins (examples include hormones, albumin, and some degradative enzymes including proteases and lipases) enter into the pathway. another feature of the regulated secretory pathway is that proteins may be proteolytically processed before release or as they are released from the cell, although this feature is also found in some constitutively secreted proteins. regulated and constitutive branches of the secretory pathway can coexist in the same cell (395, 567, 1109), implying that proteins destined for storage in secretory granules must be sorted from other secretory proteins. electron microscopic studies show that both classes of secreted proteins transit through the golgi cisternae together and are segregated in the tgn, where proteins destined for the regulated branch of the pathway condense into specific areas coated with the protein complex clathrin (see figs. v.4 and v.6; see also section v.g.6) (824,1122). the clathrin coat is subsequently removed as the condensing granules bud from the tgn and mature (818) (see fig. v.6). these areas probably represent the sites at which secretory granules mature and are released from the tgn. it should be noted, however, that some proteins which are normally secreted by the regulated pathway may "escape" and be secreted "constitutively" (31).
it remains to be determined whether this is due to incorrect sorting or to low-level, secretagogue-independent secretion from storage granules, as suggested by studies by von zastrow and castle (1231). burgess and kelly (136) propose that this "spillover" secretion may be due to inefficient sorting in the specific cell lines tested, since rhodes and halban (921) observed much more efficient sorting into the regulated pathway. efficient segregation of proteins into regulated and constitutive branches of the secretory pathway implies that the former have sorting signals (see section v.g.4.b for an alternative explanation). the existence of these signals was suggested by moore and kelly (731), who transfected a pituitary tumor cell line with a hybrid gene comprising the 5' end of the vsv g protein gene and the 3' end of the gene for human growth hormone. the hybrid protein was diverted into the regulated secretory pathway, indicating that the constitutive pathway had been bypassed due to sorting signals present in the growth hormone part of the hybrid. there are no obvious sequence similarities between proteins secreted by the regulated pathway, implying that the sorting signal is probably a patch signal rather than any identifiable linear stretch of amino acids. several proteins secreted via the regulated pathway are proteolytically processed prior to their release from the cell (see below). this raises the possibility that the sorting signal could reside in that part of the secretory polypeptide which is eventually cleaved off. this possibility was tested by burgess et al. (135), who found that deleting dna coding for the propeptide part of trypsinogen had only a minor effect on the targeting of the enzyme into secretory granules. thus, they concluded, there must be at least one sorting signal in the mature part of the trypsinogen. one surprising feature of regulated pathway sorting is that the putative sorting signal seems to be universal. moore et al.
(732), for example, found that human proinsulin was packaged into secretory granules in mouse adrenocorticotropic hormone (acth)-secreting cells, that it was correctly processed (see below), and that its release from these cells was stimulated by the same secretagogues that stimulated acth release. similar results were obtained in studies on the expression of human kidney renin dna in the same cell line (337). fibroblast l cells, however, which do not have the regulated pathway, secreted (unprocessed) proinsulin via the constitutive pathway (732). receptors localized in specific domains of the tgn membrane may segregate proteins into the regulated branch of the secretory pathway, but this has not been proven, and other factors may be important. for example, treatment with the weak base chloroquine causes acth to be secreted by the constitutive pathway, indicating that low ph is required for sorting into the regulated pathway (733). low ph in condensing secretory granules may dissociate proteins from their receptors, which can then be reused (but see section v.g.4.b). another feature of secretory granules which appears to have been largely overlooked is that proteases involved in post-tgn processing of secretory proteins must also carry sorting signals. it remains to be seen whether the membrane content of secretory granules differs significantly from that of vesicles of the constitutive branch of the secretory pathway. in certain exocrine cells, the concentration of a secretory protein in secretory granules can be as much as 200 times that in the rer (988). the dense core of aggregated protein is sometimes seen to be separated from the membrane of the secretory granule (136) and may remain intact when the membrane is removed (1230) or upon exocytosis (19, 1122).
although receptor-mediated sorting into specific regions of the tgn may assist the condensation process, proteins secreted by the regulated pathway may aggregate spontaneously and be packaged into secretory granules when they reach a critical size. thus, the packaging of g protein-growth hormone hybrids into secretory granules discussed above (731) may be due to the presence of "aggregation" sequences in the growth hormone segment of the polypeptide, rather than to the presence of a specific sorting signal. the low ph of the condensing granule (19,824) may be important for protein aggregation, since aggregates dissociate at high ph (534). pfeffer and rothman (872) suggest that this could explain the failure of chloroquine-treated cells to package secretory proteins into secretory granules (see above). secretory granules of endocrine (hormone-secreting) cells remain acidic during maturation, but those of exocrine (enzyme-secreting) cells return to neutral ph as they mature (534). atp is also needed for secretory granule formation. there are conflicting views as to whether different secretory proteins cosegregate and coaggregate into the same secretory granule. detailed studies by fumagalli and zanini (341) revealed that bovine growth hormone and prolactin could be present either in different aggregates in the same secretory granule or in mixed aggregates in the same granule or in pure aggregates in different granules. the ratios of the three types of granules varied from animal to animal. similar results were reported by mroz and lechene (744), who showed that the enzyme content of individual secretory granules derived from single cells from the same gland can vary enormously. the simplest interpretation of these data seems to be that segregation is a random process and that the formation of mixed or pure aggregates depends on the local concentration of the respective proteins and on their preference for forming homo- rather than heteroaggregates.
packaging of proteins into different secretory granules, however, might permit their release to be stimulated by different secretagogues, but there is only limited evidence for such a phenomenon at the level of individual cells (11,136,317a). many of the proteins secreted by the regulated branch of the secretory pathway are proteolytically processed and activated, usually in secretory granules. processing often involves the proteolytic removal of an n-terminal propeptide and can be mimicked by exogenous proteases such as trypsin (336). alternatively, short spacer peptides may be removed from polyprotein precursors (187). in the latter cases, cells in different tissues can process the precursors to give different "mature" forms. this is the case for prosomatostatin, which is processed to a 28-amino-acid form by cells in the gut and to a 14-amino-acid form by brain and pancreatic cells (790). islet tissue from angler fish pancreas contains at least two proteases which process prosomatostatin to give products of different lengths. one of these proteases can also process proinsulin (674). other examples of processing of heterologous secretory proteins (188, 444, 732) indicate the existence of only a limited number of processing proteases, which may also be found in some cell types which do not have a regulated secretory pathway (444, 1168). when secretory proteins which would normally use the regulated pathway are produced in cells which do not have the regulated pathway, however, they are constitutively secreted as unprocessed (pro-) forms (889). the site of proinsulin processing was determined cytochemically by orci et al. (816, 825), using monoclonal antibodies specific for mature insulin. fully processed insulin was first detected in clathrin-coated vesicles budding from the tgn, and subsequently in naked granules.
processing was coincident with condensation and acidification and was inhibited at higher ph, indicating that the two proinsulin-processing proteases have low ph optima (821). therefore, this and other proteolytic processing steps probably occur in a late "compartment" of the tgn. although propeptides may not play a role in protein sorting, they may prevent enzymes such as proteases from folding into active conformations prior to their release, thereby protecting secretory granules from endoproteolytic attack. proinsulin and other prohormones may be loosely membrane-associated (789,819). in these cases, bridge sequences may contribute to the patch signal which shunts proteins into budding secretory granules. relatively little appears to be known about the events which accompany protein release from secretory granules. the general view seems to be that secretagogues directly, or more likely indirectly, stimulate fusion between the granule membrane and the cytoplasmic membrane, resulting in the release of granule contents to the outside of the cell. secretagogues bind to specific cell-surface receptors and promote ca2+ influx, which seems to be intimately related to the fusion event (756). gtp-binding proteins also seem to be involved in generating the signal which stimulates secretion (138), and microtubules may play a minor role in directing storage granules to the cell surface (112). metalloprotease inhibitors and dipeptide protease substrates inhibit exocytosis, suggesting that proteolytic cleavage of a membrane protein may be essential for exocytosis. mundy and strittmatter (756), who found that metalloprotease activity was highest in the plasma membrane, propose that proteolysis may unmask the active site on a fusogenic membrane protein. breckenridge and almers (121) have recently studied exocytosis-associated changes in membrane capacitance in a mouse mast cell mutant with enlarged secretory granules.
small fluctuations in capacitance preceded larger increases, which themselves preceded granule swelling and the release of a fluorescent tracer dye from the lumen of the granule. the large increase in capacitance probably results from productive fusion between the plasma and granule membranes (312), whereas capacitance "flutters" may represent nonproductive membrane association. the most plausible interpretation of these data is that release of secretory granule contents is preceded by the formation of a narrow channel, the fusion pore (121), between the two membranes, leading eventually to the opening of the granule membrane to the outside of the cell and the subsequent swelling and dissociation of the granule contents and their release as soluble proteins. lysosomes (in animal cells) and vacuoles (in plant and fungal cells) contain most of the cells' degradative enzymes, which function not only in general "housekeeping" but also in the degradation of endocytosed material (chapter viii). the following sections review the evidence for lysosomal and vacuolar sorting signals, special features of the sorting pathways, and differences between the lysosomal and vacuolar routes. specific sorting of soluble lysosomal enzymes is determined by mannose-6-phosphate (m6p) residues on n-linked core oligosaccharides (section v.d.1.a). two receptors have been identified. the major m6p receptor (275 kda, formerly called the 215-kda receptor) was detected predominantly in the cis golgi compartment (131), leading to the proposal that the lysosomal pathway diverged from the main secretory pathway at the cis end of the golgi stack rather than in the tgn. this idea is incompatible with the observation that some lysosomal proteins are terminally processed by enzymes located in medial and trans golgi compartments.
other cytological studies indicate that m6p receptors are located in the tgn, as well as throughout the golgi stack (208), in coated vesicles and the plasma membrane (365,366) (see below for explanation), and in a golgi-proximal vesicular structure (402), but not in lysosomes. thus, lysosomal enzymes probably do transit through the tgn, possibly already complexed with their receptor, although farquhar (299) argues strongly in favor of multiple lysosomal sorting pathways and in particular for sorting from the cis golgi in certain cell lines. thus, the site of accumulation of m6p receptors along the secretory pathway may have little relevance for lysosomal enzyme sorting. two m6p receptors (275 kda and 46 kda) have been identified and characterized. the receptor activity of the 275-kda protein, but not that of the 46-kda protein, is cation-independent, and the 46-kda receptor recognizes only phosphomonoesters whereas the 275-kda protein also binds methylphosphomannosyl residues (461). the 275-kda protein appears to be present in most mammalian cell lines tested so far (982), but the distribution of the 46-kda receptor has not been determined. some mutant cell lines lack the 275-kda m6p receptor yet still target lysosomal enzymes normally, which suggests that both receptors are involved in lysosomal sorting (461). studies with such cell lines revealed another difference between the two receptors, however. although part of the cellular pool of both receptors is located on the cell surface, only cells with the 275-kda protein can endocytose secreted lysosomal proteins (1077). this "defect" in the 46-kda protein appears to be due to a failure to bind the ligands, since antibodies against the 46-kda protein are endocytosed normally (see chapter viii for more details on endocytosis). thus, the 46-kda protein seems to be specifically involved in the sorting of endogenous lysosomal enzymes. the genes for both m6p receptors have been cloned and sequenced.
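the differences between the two m6p receptors listed above can be collected in a small lookup structure. the following is a minimal sketch, assuming hypothetical names (`M6PReceptor`, `sorts_endogenous_enzymes`, `recaptures_secreted_enzymes`) chosen only for illustration; the field values restate the properties given in the text and add nothing to them.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class M6PReceptor:
    size_kda: int
    cation_independent: bool            # binding does not require cations
    binds_methylphosphomannosyl: bool   # recognizes more than phosphomonoesters
    endocytoses_secreted_enzymes: bool  # can recapture enzymes from the medium


# Properties as summarized in the text for the two receptors.
R275 = M6PReceptor(275, cation_independent=True,
                   binds_methylphosphomannosyl=True,
                   endocytoses_secreted_enzymes=True)
R46 = M6PReceptor(46, cation_independent=False,
                  binds_methylphosphomannosyl=False,
                  endocytoses_secreted_enzymes=False)


def sorts_endogenous_enzymes(receptors: Iterable[M6PReceptor]) -> bool:
    """Either receptor appears sufficient for sorting endogenous enzymes."""
    return len(list(receptors)) > 0


def recaptures_secreted_enzymes(receptors: Iterable[M6PReceptor]) -> bool:
    """Only a receptor that endocytoses ligand can retrieve secreted enzymes."""
    return any(r.endocytoses_secreted_enzymes for r in receptors)


if __name__ == "__main__":
    mutant = [R46]  # a cell line lacking the 275-kDa receptor
    print(sorts_endogenous_enzymes(mutant), recaptures_secreted_enzymes(mutant))
```

the example models a mutant line that lacks the 275-kda receptor: it still sorts its own enzymes but cannot retrieve secreted ones, which is the distinction drawn from the cell-line studies above.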
although the predicted sequences of the two gene products are generally different, the two proteins have a region of moderate sequence similarity in their lumenal domains (645,826) which dahms et al. (208) propose could be the m6p binding domain. the sorting of the lysosomal enzymes cathepsin c and cathepsin d was studied directly by lemansky et al. (626), who found that lysosomal enzyme precursors occurred only in coated vesicles. proteolytically matured forms were found in lysosomes. schulze-lohoff et al. (1022) also observed the transient accumulation of one of these enzymes in coated vesicles, which, they proposed, are specifically involved in the sorting of lysosomal proteins from the secretory pathway. lemansky et al. (626) devised procedures which allowed vesicles derived from the secretory pathway to be separated from those derived from the endocytotic pathway. they found that both classes of vesicles contained precursor forms of cathepsin. these studies have two profound implications: (i) the fact that vesicles involved in direct sorting of cathepsins to the lysosome contain clathrin implies that they had passed through the tgn, the first site along the secretory pathway at which clathrin is detected (section v.g.i). (ii) the fact that some lysosomal enzymes are "fished" out of the surrounding medium and retargeted to the lysosome implies that some lysosomal enzymes are incorrectly sorted, probably into the constitutive secretory pathway (see also section v.d.1.a). higher levels of incorrect sorting occur in nh4cl-treated cells, probably because low ph is required to dissociate m6p from its receptor in the prelysosome. another interesting observation concerning the m6p receptor is that it specifically binds to one of the components (the 100-kda accessory protein) of the clathrin cage which coats the sorting vesicles (854) (see below).
this may have particular relevance for the sorting of lysosomal enzymes because different classes of clathrin-coated vesicles appear to have different types of accessory proteins (855). coated vesicles almost certainly do not transport proteins directly into lysosomes. instead, the vesicles are targeted to endosome-like reticular organelles (prelysosomes or secondary endosomes), where the receptor probably dissociates and recycles back to the golgi cisternae. this organelle is also the site to which endocytosed lysosomal proteins are targeted (134,402) (fig. v.6 and section viii.a.4.b). a different class of vesicles may complete the transport of lysosomal enzymes once they have dissociated from their receptor, but von figura and hasilik (313) consider it more likely that there is a gradual transition from tubular prelysosomes to lysosomes proper. although m6p is undoubtedly the major, and in some cells the only, sorting signal on lysosomal enzymes, some lysosomal proteins do not have m6p residues. owada and neufeld (829) found that a human liver cell line devoid of n-acetylglucosamine-1-phosphotransferase [and therefore unable to phosphorylate mannose residues (see section v.d.i.a)] still targeted some lysosomal enzymes correctly with, at most, only slightly reduced efficiency. these cells may have a completely m6p-independent system for sorting lysosomal enzymes. all lysosomal membrane proteins are also devoid of m6p residues. therefore, some lysosomal enzymes may have membrane-associated intermediates which are sorted to the lysosome together with authentic lysosomal membrane proteins. barriocanal et al. (49) used immunocytochemistry to follow the fate of three lysosomal membrane proteins which they detected in lysosomes, the golgi apparatus, and coated and uncoated vesicles in the region of the tgn. they found that oligosaccharide modifications were not required for lysosomal targeting (although they may be required to protect against proteolysis).
thus the sorting pathway for these proteins remains to be determined; they could be sorted completely independently of lysosomal enzymes or could be colocalized to the same coated vesicles by an m6p-independent receptor and then segregated from recycling vesicle membrane components in the prelysosome. green et al. (397) have recently found that newly synthesized lysosomal membrane proteins appear in lysosomes with the same kinetics as newly synthesized plasma membrane proteins appear at the cell surface, making it unlikely that the former pass via the plasma membrane en route to the lysosome. although vacuoles are the functional equivalents of lysosomes in animal cells, the sorting of vacuolar enzymes is completely independent of m6p receptors. this was demonstrated most simply by the fact that tunicamycin treatment did not affect vacuolar protein targeting (572,1025). however, at least one vacuolar protein does have phosphorylated mannose residues. most of the work on the sorting of vacuolar proteins has concentrated on the identification of sorting signals in yeast vacuolar proteins. early studies showed that vacuolar proteases were proteolytically processed in two distinct stages, the first of which corresponded to the removal of a signal peptide (711). the second processing step is catalyzed by a vacuolar protease, proteinase a, which removes an additional n-terminal polypeptide segment, the propeptide, from other vacuolar enzymes. proteinase a is also autoactivated by the same mechanism (15,1206). the second processing step is blocked by certain sec mutations, which cause secretory proteins to accumulate in the rer or golgi apparatus, whereas mutations which affect the final stage of the secretory pathway from the golgi to the cell surface do not affect vacuolar protein sorting or processing (1082). this implies that the golgi is the site of sorting of vacuolar and secreted or plasma membrane proteins in yeast cells.
gene fusion studies conclusively demonstrated that propeptides are vacuolar sorting signals. bankaitis et al. (44) and johnson et al. (541) found that at most 50 n-terminal residues of preprocarboxypeptidase y (cpy), including the 20-residue propeptide, could target the normally secreted enzyme invertase into the vacuole. similar results were obtained with proteinase a-invertase hybrids (572). valls et al. (1137) subsequently found that mutations affecting the sequence of the procpy propeptide caused the enzyme to be secreted in an inactive form. there appears to be no sequence similarity between the propeptides of different yeast vacuolar enzymes, even though genetic studies described below suggest that they are sorted into the vacuole via a common pathway. part of the propeptide may be required to maintain vacuolar enzymes in an inactive form until they reach the vacuole and may also maintain the precursors in a competent conformation for transport through the secretory pathway. the overproduction of vacuolar proteinase a (972) and of cpy-invertase hybrids (44) causes them to be secreted into the medium, suggesting that some component of the vacuolar sorting pathway (e.g., a receptor) had been saturated. these observations led to the development of techniques for selecting mutants which secreted cpy or cpy-invertase without overproduction (section ii.f.l.b). mutations in over 50 different genes (called vpl or vpt) have been identified (44,939a,971). the extent of the sorting defect varied in different mutants: some of them, for example, did not affect the targeting of proteinase a, and none of them affected the sorting of the vacuolar membrane protein α-mannosidase, which is presumably sorted to the vacuole by an alternate pathway (971).
it is unlikely that any of the mutations affected protein retention in the vacuole because vacuolar enzymes were not terminally processed, and the kinetics of cpy secretion were comparable to those of a normally secreted protein. the characterization of the vpt or vpl gene products and their localization in the cell could provide revealing insights into the mechanisms of vacuolar protein targeting, but at present we can only speculate on their roles. obvious candidates are the propeptide receptor, vacuolar or golgi atpases which might produce a low ph environment necessary for sorting or receptor dissociation, or proteins involved in vesicle fission and fusion. much less work has been done on plant vacuolar proteins. tague and chrispeels (1100) found that the plant vacuolar storage protein phytohemagglutinin was targeted mainly to the vacuole when its structural gene was expressed in yeast cells. this protein does not have a cleavable propeptide (it does have a signal peptide, which was at least partially processed in yeast cells), which means that the plant vacuolar sorting signal which is recognized by the yeast vacuolar sorting pathway is located in the mature part of the phytohemagglutinin polypeptide. mrna coding for a second plant storage protein, globulin p, has been microinjected into frog oocytes, which secreted the protein into the medium (51). this confirms that plant vacuolar proteins are bona fide secretory proteins and that plant cells have a special branch of the secretory pathway which shunts storage proteins into vacuoles.
as we have seen, protein-coated vesicles are involved in transporting secretory proteins at various stages of the secretory pathway. some vesicles (e.g., lysosomal sorting vesicles and immature secretory granules) are coated with a protein complex called clathrin, which also coats endocytotic vesicles (chapter viii).
other secretory vesicles (e.g., those mediating protein transport through the golgi) have a different type of protein coat (797). clathrin, which is composed of equimolar amounts of heavy and light chains, forms a three-layered cage which envelops vesicles in a shell with fibrous interconnections, which give it mechanical strength and stability. the vesicle membrane is thought to have receptors which anchor the clathrin cage to the surface via a number of ancillary assembly proteins which probably act as bridges (1148, 1149). different assembly proteins are found in different classes of clathrin-coated, tgn-derived, and endocytic vesicles, suggesting that they might contribute to their respective specificity for particular membrane targets (7). the clathrin coat probably prevents intimate contact between fusing membranes and must thus be removed to allow fusion to occur. this may explain, for example, the disappearance of the clathrin coat from maturing secretory granules (section v.g.4.b) and the presence of lysosomal protein precursors in naked as well as coated vesicles. vesicles coated with other proteins are also presumably uncoated to allow fusion to occur. the uncoating of clathrin-coated vesicles is mediated by the atp-dependent cytosolic "uncoating" protein, which remains attached to the released clathrin (1010). soluble clathrin retains its typical triskelion conformation. uncoating protein is a member of a highly conserved group of stress proteins and may be related to hsp70 heat shock proteins involved in other stages in secretory protein transport and in mitochondrial protein import (sections iii.c.3 and vlb.5). the significance of this observation remains to be determined (855,967), but one possibility is that the atpase (hsp70) is required to activate an uncoating enzyme which is already present in the clathrin complex.
the action of uncoating protein must be triggered in some way to prevent it from destroying protein coats on immature vesicles or on coated buds, but the nature of the signal remains to be determined. the requirement for atp for uncoating activity could in part explain the observed requirement for the same nucleotide during the transport of proteins between different stages of the secretory pathway if a similar activity is required to remove other, nonclathrin coats. attempts to determine the role of clathrin in protein targeting in yeast cells have given ambiguous results. yeasts are known to have clathrin, and coated vesicles have been observed, but it is not known whether clathrin coats vesicles involved in secretory protein targeting (1165). the gene for the clathrin heavy chain was independently cloned by two groups who used it to inactivate the chromosomal gene to study the effect of the absence of clathrin on cell growth and protein secretion. payne and schekman (852) and payne et al. (853,853a) reported that their mutants grew somewhat more slowly than wild-type cells, secreted invertase at a slightly reduced rate, and were partially defective in prepro-α-factor processing. the mutants accumulated unusual vacuoles, vesicles, and golgi-derived structures. these results suggested that the absence of clathrin did not completely impair plasma membrane growth and protein secretion but that there was nevertheless a reduced rate of transit of secretory proteins through the later stages of the secretory pathway. different results were obtained using exactly the same approach by lemmon and jones (627). they found that cells lacking the clathrin heavy chain were not viable unless they also carried a suppressor mutation. even with this mutation, the cells grew slowly, were larger and rounder, had an unusual granular appearance, tended to aggregate in liquid culture, and were polyploid.
these results suggest that the absence of clathrin is highly detrimental to yeast cells, making it difficult to determine whether clathrin plays a specific role in protein secretion in this organism.
the secretory pathway is almost certainly the route by which the vast majority of secreted and plasma membrane proteins are exported by eukaryotic cells. however, there is increasing evidence that some plasma membrane and secretory organelle proteins may reach their final destinations directly rather than via the secretory pathway. examples of this class of proteins include ras-like gtp binding proteins such as sec4 (987) and ras2 (232) of the yeast s. cerevisiae and similar proteins from complex eukaryotes (1194), mating pheromones in yeast and some fungi (726, 886, 983), capsid proteins of picornaviruses (851), and src proteins of rous sarcoma and other transducing viruses (1029). most if not all of these proteins are fatty acylated. two different amino acids seem to be modified: n-terminal glycines, which are substrates for myristoyl coa protein n-myristoyltransferase (1124), and c-terminal cysteines in ras-like proteins (232). these cys residues are reported to be palmitoylated, but a farnesyl residue has been found in basidiomycete pheromones (983). the absence of the fatty-acylated amino acid disrupts membrane association of ras and src proteins (232, 555, 1029, 1194), indicating that fatty acids probably anchor these proteins in their respective membranes. it remains to be determined how fatty-acylated proteins actually cross the plasma membrane (as in the case of the fungal pheromones) or what determines their specificity for certain membranes. a further example of a secreted protein which does not have a secretory routing signal is interleukin 1, but very little appears to be known about how this protein crosses the plasma membrane.
there is general agreement concerning the events which lead to the sorting of secretory proteins into different terminal branches of the secretory pathway, as illustrated in fig. v.6, but we are clearly a long way from understanding exactly what directs proteins to their specific targets. patch signals are undoubtedly necessary for sorting soluble proteins (other than lysosomal enzymes) into the various branches of the secretory pathway, but these will be difficult to identify by gene fusion techniques. at present, we have no clear idea of the extent to which the sorting vesicles have different membrane contents, but it seems probable that specific groups of membrane proteins (lysosomal, secretory granule, apical, and basolateral) accumulate at different sites in the membrane of the tgn from which sorting vesicles bud. this specialization is presumably also determined by protein-protein interactions, but the possibility that other interactions (e.g., protein-lipid) might be involved should not be overlooked. another interesting observation is that atp seems to be required at almost every stage in the secretory pathway. atp has been proposed to act in a variety of ways, including acidification of secretory organelles and vesicles, phosphorylation of receptors or ligands, protein folding and "proofreading," activation of cytoskeletal motors, and vesicle uncoating activity. there is increasing evidence that gtp and gtp binding proteins (g proteins) are also involved at several stages in the secretory pathway. gtp binding proteins are also known to be involved in the generation of other intracellular signals, such as the activation or inactivation of adenylate cyclase, the activation of cyclic gmp phosphodiesterase, and the control of phospholipase c action (769). by analogy, gtp binding proteins may act as signals or to activate receptors or ligands to ensure vectorial transport through the secretory pathway (109).
key: cord-018969-0zrnfaad authors: giese, matthias title: types of recombinant vaccines date: 2015-09-24 journal: introduction to molecular vaccinology doi: 10.1007/978-3-319-25832-4_9 sha: doc_id: 18969 cord_uid: 0zrnfaad the original scientific strategy behind vaccinology has historically been to "isolate, inactivate, and inject," first invoked by louis pasteur. new vaccines and vaccination strategies are being developed including the use of attenuated live mycobacteria, recombinant microorganisms, and subunits, prime-boost strategies based on the successive administration of a certain mycobacterial antigen under two different vaccine vectors, and dna vaccines [1]. the various types of vaccines differ in eliciting an immune response. live attenuated vaccines (lavs) mimic a natural infection without being virulent and trigger the activation of the innate immune system through pamps. following injections, lavs rapidly disseminate throughout the vascular network to the draining lymph nodes. therefore, the route of application of lavs does not specifically influence the immune response. lavs also don't need an adjuvant; they possess a natural intrinsic adjuvancy. safety concerns exist because of the replication competence and the possibility of recombination with a wild type. non-live vaccines, inactivated and most recombinant vaccines, whether containing proteins or carbohydrates (−conjugates), are less effective.
in the absence of replication, vaccine-induced immune reactions remain more limited, and therefore the route of vaccination influences the efficacy and the duration of the immune reaction. non-live vaccines induce a lower antibody response and generally no cytotoxic t lymphocyte activation. compared to lavs, all non-live vaccines are regarded as biologically safe (fig. 9.2). dna vaccines entail the direct, in situ inoculation of dna-based eukaryotic expression vectors that encode the sequence of a pathogenic protein antigen. the constructed plasmids are grown in bacteria such as e. coli and highly purified via chromatographic methods. lps contamination of plasmids has to be prevented because of the immunotoxic properties of natural lps. after purification the circular double-stranded dna plasmids are ready for vaccination. the de novo production of the encoded antigens in the host results in the elicitation of both the antibody and the cellular response by activating cytotoxic t lymphocytes (ctls). vaccine proteins made by the host are natural proteins and contain important posttranslational modifications such as the correct glycosylation. but like subunit vaccines, dna vaccines must be adjuvanted. naked dna does not work. the unique advantage of dna vaccines is their ability to mimic the effects of live attenuated vaccines without the risk associated with the administration of infectious, albeit attenuated, material. dna vaccines are able to stimulate a complete humoral and cellular immune response. peptide fragments are processed via the endogenous pathway, resulting in the presentation of antigen on the cell surface by mhc class i molecules. plasmid dna is very stable, even outside a cold chain. therefore, the storage, transportation, and distribution of dna vaccines are more practical and also cheaper [2]. mostly all plasmid dna constructs (fig.
9.3) used for vaccination share five main characteristics:
• strong promoter/enhancer sequence for driving the incorporated foreign gene
• convenient cloning site for insertion of foreign genes
• origin of replication for initiation of plasmid replication
• polyadenylation/termination sequence for production of mature mrna
• resistance/antibiotic marker for selection
• immunomodulators, e.g., cpgs, interleukins, ubiquitin, etc. (on the same plasmid or on extra plasmids)
uptake of plasmid dna. some biological barriers have to be overcome by dna vaccines on the way to the cell nucleus, where the plasmid dna is transcribed into mrna. after delivery of plasmid dna to the target tissue, e.g., skeletal muscle or skin, numerous tissue nucleases attack and degrade a large amount of the applied dna. the extracellular matrix, with its collagen and hyaluronic acid, also influences the passage from the application site to the cell membrane. only a small portion (1 % estimated) of the still intact plasmid dna will cross the cell membrane by phagocytosis or pinocytosis. inside the cell the route toward the nucleus is also lined with exo- and endonucleases, so that probably only 0.1 % (estimated) is successfully and actively transported through the nuclear pore complex (npc). small particles (<~40 kda) are able to pass through the npc by passive diffusion; larger particles need the support of carrier proteins for efficient passage through the complex. because of this enormous loss of plasmid dna (up to 99.9 %), various tools were developed to protect the plasmid dna and thus increase the efficacy, such as encapsulation into liposomes or binding of dna to dendrimers. figure 9.4 illustrates the passage of plasmid dna from the extracellular matrix (ecm) to the nucleus.
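the loss estimates quoted above can be checked with a few lines of arithmetic. the following is only a sketch: the starting number of plasmid copies is a hypothetical value chosen for illustration, and the 0.1 % figure is read here as a fraction of the administered dose (an assumption, but one consistent with the stated overall loss of up to 99.9 %).

```python
# Illustrative arithmetic for the plasmid delivery estimates quoted in the text.
# Assumptions (not from the source): the dose size below is hypothetical, and
# the 0.1 % nuclear figure is interpreted as a fraction of the administered dose.

DOSE_COPIES = 1_000_000          # hypothetical number of plasmid copies applied
FRACTION_CROSSING_MEMBRANE = 0.01    # ~1 % cross the cell membrane (estimate in text)
FRACTION_REACHING_NUCLEUS = 0.001    # ~0.1 % reach the nucleus (estimate in text)


def delivery_funnel(dose_copies, crossing_membrane, reaching_nucleus):
    """Return copy numbers and derived fractions for each delivery stage."""
    in_cytoplasm = dose_copies * crossing_membrane
    in_nucleus = dose_copies * reaching_nucleus
    return {
        "in_cytoplasm": in_cytoplasm,
        "in_nucleus": in_nucleus,
        # share of cytoplasmic plasmid that still makes it into the nucleus
        "cytoplasm_to_nucleus": reaching_nucleus / crossing_membrane,
        # overall loss between application site and nucleus
        "overall_loss": 1.0 - reaching_nucleus,
    }


result = delivery_funnel(
    DOSE_COPIES, FRACTION_CROSSING_MEMBRANE, FRACTION_REACHING_NUCLEUS
)
for stage, value in result.items():
    print(f"{stage}: {value:.4g}")
```

under these assumptions, roughly one in ten of the plasmid copies that enter the cytoplasm would reach the nucleus, which is why protective formulations such as liposomes target the extracellular and cytoplasmic steps.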
whereas in human medicine clinical trials with dna vaccines are still ongoing without any registered product on the market, the first approved dna vaccines for veterinary medicine have been available since 2005 and are discussed now. the first veterinary dna vaccines were developed for horses (davis b.s., 2001 for wnv [3]; giese m., 2002 for eav [4]). today the number of clinical trials worldwide with veterinary dna vaccines is too large to survey, and probably all species are covered. canine malignant melanoma (cmm) typically begins in the mouth or around the toes and can spread within the body to the heart, lungs, intestines, and other organs. cmm is known for being one of the most aggressive and deadly cancers in dogs. cmm is most commonly seen in golden retrievers, scottish terriers, dachshunds, labradors, and poodles (fig. 9.5). metastases of the tumors are very often found in distant parts of the body. the overall biology of cmm is similar to the biology of human melanoma. however, melanomas in dogs show diverse biologic behaviors depending on breed and a variety of other factors. standard treatments such as surgery, radiation, and chemotherapy are the common tools to fight canine malignant melanoma. these traditional tools have afforded minimal to modest stage-dependent clinical benefits. xenogeneic dna vaccine. the plasmid dna contains a cdna for human tyrosinase (hutyr), a tumor antigen (ta). this is a non-mutated differentiation antigen specific to melanoma. tyrosinase is a glycoprotein essential in the process of melanin synthesis (fig. 9.6). like other tas, tyrosinase is overexpressed in tumor cells and is therefore an ideal target in cancer therapy. normally there is no strong immune reaction against the body's own protein.
but immunization of dogs with xenogeneic hutyr cdna can break the immune tolerance against this self tumor differentiation antigen and induce antibody and cytotoxic t cell responses against melanoma cells [5]. tyrosinase is highly conserved from dog to mouse to man. radiotherapy in cases with positive surgical margins or positive regional lymph nodes. one dose contains 102 μg dna given in a volume of 0.4 ml by the transdermal route via a needle-free vaccination device. booster immunizations were given at 6-month intervals. in march 2007 the drug manufacturer received a conditional license for oncept from the usda and a full license in 2010. the results of the xenogeneic immunization of dogs with hutyr cdna as an adjunct therapy for cmm demonstrate a significant increase of survival time compared to the control group. none of the dogs developed systemic adverse reactions; no toxicity was seen. the overall safety of this dna vaccine is confirmed. this vaccine development represents a tremendous milestone in dna science and technology. virus. west nile virus (wnv) is a mosquito-borne member of the family flaviviridae, genus flavivirus, and was first identified in 1937 in uganda, africa. it is a positive-sense, single-stranded rna virus, (+)ssrna, of about 11 kb that encodes a single polyprotein with seven nonstructural proteins and three structural proteins. the rna strand is held within a nucleocapsid. wnv replicates in the cytoplasm of infected cells. wnv is a zoonotic virus. the primary reservoir is birds, which contribute significantly to spreading the infection across countries and continents. more than 170 different species are described as carriers of this virus. wnv is spread from bird to bird by mosquitoes when they bite, or take a blood meal from, birds that are infected with the virus. birds of some species become ill and die; others show no clinical signs and survive.
mosquitoes are also capable of spreading the virus to horses, dogs, cats, mice, alligators, and many other animals, as well as to humans. one-third of all horses bitten by carrier mosquitoes develop the disease and die or are so affected that euthanasia is required. the incubation period ranges from 3 to 14 days. horses that do become ill vary in symptoms: muscle trembling, skin twitching, ataxia, sleepiness, dullness, and listlessness. wnv may cross the placenta from mother to gestating foal. horses cannot spread the disease to humans. as in horses, wnv produces varying outcomes in humans: fever, headache, chills, diaphoresis, weakness, swollen lymph nodes, drowsiness, and pain in the joints, comparable to symptoms of influenza. more severe neuroinvasive infection includes meningitis and encephalitis. wnv-dna vaccine. the surface envelope protein e is the main target for the antibody response. there are more than 180 copies of the e protein in a mature wnv virion. the e protein mediates the interaction with the cell surface and the fusion between viral and cellular membranes. the premembrane protein prm is cleaved during viral maturation into a smaller membrane m peptide. the expression of prm and e protein in cells results in the formation of virus-like particles (vlps). these vlps share many of the antigenic and structural properties of fully mature viruses and are of special interest for vaccine development (fig. 9.7). the final expression plasmid for immunization of horses contains the human cytomegalovirus early gene promoter, signal sequences from japanese virus, and a fusion gene of
part of the fishing industry is aquaculture, also known as aquafarming; it can be contrasted with commercial fishing, which is the harvesting of wild fish. aquaculture involves cultivating freshwater and saltwater fish and other populations (shrimp, oyster) under controlled conditions. salmon is one of the main food-producing fish in the world.
a dna vaccine for fish must be safe not only for the animal but especially for the fish consumer. salmon is the major economic contributor to the world production of farmed fish, representing over us$1 billion annually in the united states. salmon farming is also very big in norway, scotland, canada, and chile and is the source for most salmon consumed in the united states and europe. like all other animals, fish are also threatened by viruses, bacteria, and parasites. one major problem for salmon is the infectious hematopoietic necrosis (ihn) virus [6]. virus. infectious hematopoietic necrosis (ihn) virus is a common viral pathogen of both wild and farmed salmonids, in particular pacific salmonids, rainbow trout, and atlantic salmon. ihn virus is enzootic to the pacific northwest; however, it has varying effects on different pacific salmonids. it is a negative-sense single-stranded, (−)ssrna virus that is a member of the rhabdoviridae family, genus novirhabdovirus. the rna genome is 11,133 nucleotides long and contains leader (l) and trailer (t) sequences at its 3′-end and 5′-end, respectively. the coding regions are the n, p, m, g, nv, and l genes. g encodes the surface glycoprotein, the so-called spikes, the main target for the immune response. transmission. ihnv is transmitted following shedding of the virus in the feces, urine, sexual fluids, and external mucus and by direct contact or close contact with surrounding contaminated water. the virus gains entry into fish at the base of the fins. salmon are carnivorous and are currently fed a meal produced from catching other wild fish and other marine organisms, a permanent source of possible infections with ihnv. clinical signs of infection with ihnv include anemia, skin darkening, bulging of the eyes, fading of the gills, and abdominal distension. infected fish commonly hemorrhage in several areas, like the mouth, the pectoral fins, muscles near the anus, and the yolk sac of fry.
diseased fish weaken, eventually floating on the surface of the water. necrosis is common in the kidney and spleen and sometimes in the liver. mortality rates in older fish (2-3 kg) tend to range from 10 to 20 %; in smolts the mortality rate often exceeds 85 %. the average cumulative mortality following an outbreak is estimated at 47 %. ihnv-dna vaccine. the antigen is the viral surface glycoprotein (g), capable of eliciting neutralizing antibody and the production of a protective immune response. the g gene was cloned into a eukaryotic expression vector by insertion of an immediate-early promoter and a polyadenylation signal. the special feature of this vaccine is that it is prepared as a two-component vaccine within a single vaccine, on one plasmid or more. the second component is a portion of the nucleic acid sequence encoding a second peptide, derived from a fish pathogen other than the said rhabdovirus, resulting in a fusion. this second pathogen can be any fish pathogen, e.g., isav, ipnv, iridovirus, nnv, spdv, svcv, vhsv, koi herpes virus, and more. the rationale behind this is that the presence of the ihnv g protein boosts the immune response to the second protein, resulting in a protective effect against infection by this fish pathogen. the vaccine is given intramuscularly with a dosage of only 10 μg in 50 μl on the left dorsal flank, in the area just below the dorsal fin [7, 8]. this first dna fish vaccine was licensed in 2005 in canada by the veterinary biologics section (vbs), animal health and production division, canadian food inspection agency (cfia) and is also used now in studies in norway.
there are many environmental stressors and diseases which influence and seriously threaten the life of european honey bees, apis mellifera. the european honey bee is professionally managed worldwide for honey production and pollination.
the bee was imported to the united states 400 years ago with the first european settlers and called "white man's fly" by native americans. first reported in the united states, a mysterious so-called colony collapse disorder (ccd) reduced the bee colonies there by between 50 and 90 %; it was first observed during the winters of 1995-1996 and 2000-2001 and has continued without interruption up to now. a similar situation exists in europe. about 20,000-60,000 bees live in a colony. the first description originated from the 1950s. in the early twentieth century, the colony losses were known in england as "isle of wight disease," and the americans called this phenomenon "disappearing disease" in the 1960s, whereas these colony losses in france in the late 1990s were called "mysterious bee losses." where have all the bees gone? economic value. the huge loss of honey bees as pollinators has a dramatic impact on agricultural pollination. about 130 crops, nuts, fruits, and vegetables are pollinated by a. mellifera, with an overall value of more than $15 billion in the united states and more than €14 billion for the eu in 2005. a bee colony produces some 1 kg/2.205 lb honey per day. in return, these bees have to pollinate 10-15 million flowers. one should keep in mind that besides european honey bees, wild insects, among them 30,000 species of wild bees, also have a very great impact on pollination and seem to be more efficient in pollination than managed honey bees [9]. industrial farming also threatens the natural biotope of wild insect pollinators. ecologic value. the total global economic value of honey bee pollination was calculated in 2005 at more than €150 billion or $202 billion. the food and agriculture organization (fao) of the united nations estimates that there are 65 million managed honey bee colonies worldwide. beyond professional agriculture, honey bees are irreplaceable for biodiversity.
this organism appeared during evolution with the first flowering plants and has existed for 100 million years, as described in chap. 2, fig. 2.10. after swine and cattle, bees are the third most important farm animal in europe and north america and since 2007 have been formally listed as farm animals in switzerland. therefore, ccd is not only an economic but also an essential global ecological problem which urgently must be solved. "the bee is more than honey." what is causing ccd? the colony collapse disorder of recent years seems to differ from past outbreaks: the worker bees disappear instead of dying in place, leaving behind the queen and young bees. high levels of bacteria, viruses, and fungi are measured in the gut of the remaining bees. collapses can occur within 2 days. a complex problem. different theories are discussed about what is causing ccd: pesticide contamination, hotly debated as interfering with the nervous system and affecting the foraging behavior of bees, leading them to abandon their hives; fungal diseases such as nosema spp., known for large bee losses in spain; monocultures or genetically modified crops; electrosmog (radio waves) caused by cell phones, said to destroy the bee's compass; and the rigors of travelling in trucks from crop to crop in the usa. from february until december, professional us beekeepers travel with their colonies through the country, and the bees must relocate up to 15 times. in europe the bee colonies begin their winter rest around september. climate change and the temperature sensitivity of bees are also discussed as having an impact on crop pollination. ccd is likely caused by a combination of factors [10, 11]. varroa destructor. in all ccd cases, however, an overload of bloodsucking varroa mites is detectable, and varroa is currently considered the major threat to apiculture. the infection and disease is called varroosis.
varroa destructor is an ectoparasite, has a reddish-brown flat shape, and is 1-1.8 mm long and 1.5-2 mm wide, with eight legs. v. destructor infests worker bees, drones, and their brood. the mite develops inside the brood cells. varroa is a real colossus compared to the size of bees, as can be seen in fig. 9.1. varroa mites belong to the class arachnida, subclass acari. about 50,000 species of mites alone have been described. some mites prefer carbohydrates as food, such as meal or crops. house dust mites feed on flakes of shed human skin. varroa mites prefer fresh "blood", i.e., the hemolymph of bees, and can take up 0.1 mg/0.0000002205 lb within 2 h. varroa is carried into the hives piggyback on worker bees. the female mite enters brood cells, preferentially drone cells. once the cell is capped, varroa lays eggs on the larvae. the development from egg to adult takes 7 days. bee larvae and mites hatch at about the same time and the newborn varroa mites spread to other bees [12, 13]. the lifetime of summer mites is 3-6 weeks, whereas fall mites can live for several months. varroa can only reproduce in honey bees and thus is considered harmless to other insects. varroa is more than a disease. it is a global pest having devastating effects on bees (fig. 9.8). varroa as vector. varroa should not be considered an isolated agent of the disease. the mortality of adult bees and their brood must be considered in the context of secondary viral infections. at least 18 different viruses are able to infect honey bees, mostly ssrna viruses. eight viruses are known to be associated with varroa mites: acute bee paralysis virus (abpv), black queen cell virus (bqcv), chronic bee paralysis virus (cbpv), deformed wing virus (dwv), kashmir bee virus (kbv), sacbrood bee virus (sbv), cloudy wing virus (cwv), and slow bee paralysis virus (sbpv) [14-17]. varroa control.
a number of natural and synthetic chemicals are commercially available for the control of varroa infestations. the first compounds were bromopropylate, fluvalinate, or other pyrethroid insecticides. to make a long story short, varroa mites became resistant not just to one product of a given chemical class but to the entire class with its several related synthetic products. the use of natural products, such as formic acid, mineral oil, or thymol, is also only partially and temporarily effective and shows adverse effects [18]. there is no successful chemical treatment. mites will quickly develop resistance to all chemicals. the immune system of insects. the basic difference between insect and vertebrate immunity is that insects lack the highly specific antigen response of the acquired immune system. nevertheless, in 400 million years of evolution, insects developed a powerful defense strategy against bacteria, fungi, viruses, and parasites. protected only by this "primitive" immune system, insects were so successful that they colonized all terrestrial ecosystems. insect innate immunity shows many similarities to vertebrate, including human, innate immunity, is multifaceted, and involves both humoral and cellular components [19]. most insights into insect immunity come from drosophila melanogaster research. the key mechanisms are also observed in honey bees. the humoral and systemic response to bacterial and fungal infections is mediated by antimicrobial peptides (amps). circulating receptors sense a danger signal and activate the toll pathway, whereas membrane-bound receptors activate the imd pathway. both pathways lead to the translocation of nf-κb-like transcription factors and the production of amps. nf-κb response elements can be detected in the promoter region of the diptericin gene.
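the routing of the humoral response described above can be summarized as a small lookup. this is only an illustrative sketch: the names `PATHWAY_BY_RECEPTOR` and `humoral_response` are hypothetical, and nothing beyond the receptor-to-pathway-to-output relationships stated in the text is modeled (no kinetics, no regulation).

```python
# Illustrative sketch only: encodes the receptor -> pathway -> output routing
# stated in the text. All names are hypothetical; no kinetics are modeled.
PATHWAY_BY_RECEPTOR = {
    "circulating": "toll",     # circulating receptors sense a danger signal
    "membrane-bound": "imd",   # membrane-bound receptors
}


def humoral_response(receptor_type: str) -> dict:
    """Return the pathway and downstream steps for a receptor class."""
    if receptor_type not in PATHWAY_BY_RECEPTOR:
        raise ValueError(f"unknown receptor type: {receptor_type!r}")
    return {
        "pathway": PATHWAY_BY_RECEPTOR[receptor_type],
        # both pathways converge on the same downstream steps
        "transcription_factor": "nf-kb-like",
        "output": "antimicrobial peptides (amps)",
    }


if __name__ == "__main__":
    for receptor in PATHWAY_BY_RECEPTOR:
        print(receptor, "->", humoral_response(receptor))
```

the point of the sketch is the convergence: the two receptor classes differ only in the pathway they trigger, while the transcription factor step and the amp output are shared.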
the cellular immune response is mediated by specialized blood cells, the hemocytes: plasmatocytes, crystal cells, and lamellocytes [20]. plasmatocytes represent 95 % of hemocytes. they express phagocytic receptors and patrol through the body, clear microorganisms and cell debris, and signal infections to the fat bodies. the bee genome was completely sequenced in 2006 [21]. the bee dna vaccine. an expression plasmid was constructed with a cmv promoter. surprisingly, no bee or other insect-specific promoter was needed to drive the expression of the protein. the enhanced green fluorescent protein (egfp) was chosen as reporter gene and inserted into the multiple cloning site, together with an sv40 enhancer element. the plasmid construct was produced in e. coli and highly purified by standard techniques. european honey bees (apis mellifera) were obtained from local beekeepers and kept under lab conditions. varroa mites were collected from infested bees. oral vaccination with the egfp plasmid was performed by feeding the bees a mixed solution of sugar and plasmid dna (vaccine sugar). standard sugar solutions made by the beekeeper are the normal food for winter bees. results. over 10 days after onset of feeding, we measured the expression of egfp by immunofluorescence and western blot analysis with egfp antibodies. between days 3 and 10, a clear egfp signal was detected in the thorax and especially in the malpighian tubules. control bees fed with dna lacking the reporter did not show any signal. in parallel, control experiments with transformed e. coli were done to test whether egfp could be expressed in gut bacteria instead of bee cells. no egfp signal was detected in transformed bacteria. most surprisingly, we found the egfp signal after 5 days in varroa mites sucking hemolymph of bees that had been fed the vaccine sugar solution, and no signal in control mites from infested control bees.
feeding of plasmid dna results in expression of a reporter gene in different bee tissues over a period of several days, and finally varroa takes up this protein via bloodsucking. the bee blood is not carried by arteries and veins but flows freely around the body. no egfp signals were detected either in the honey stomach or in the feces. figure 9.9 illustrates the egfp passage through the bee body and toward the varroa mite. we started with the simple idea that the biochemistry in eukaryotic cells remains the same, irrespective of the organism. the difference lies in the configuration of the immune system. that means an insect can successfully fight parasites and infections but with different weapons: no t cells, no b cells, and consequently no antibodies and no memory. we are able to stimulate targeted immune genes of bees and measure an insect-typical immune response. a standard plasmid dna vaccine, first developed for horses, bridges the evolution from fish to insects to mammals. no other vaccine type is able to do this job. how fascinating biology is! a protein subunit is based on a single protein molecule and is able to stimulate a humoral immune response, but usually not a cellular response. after phagocytosis, proteins are degraded by acid-dependent proteases in endosomes (endosomal or exogenous pathway), resulting in mhc ii presentation of the antigenic peptides. a peptide is one form of a subunit. carbohydrates are also used as subunits, with a poor and age-dependent immunogenicity. carbohydrate antigens induce a t cell-independent b cell response as discussed in chap. 6. therefore carbohydrates are mainly linked to a protein (conjugation) to enhance the immune reaction, as discussed here with the hib conjugate vaccine. conjugate vaccines. the polyribosylribitol phosphate (prp) capsule of haemophilus influenzae type b (hib) is a major virulence factor for the organism.
prp is a t cell-independent antigen characterized by, e.g., induction of a poor antibody response in infants and children less than 18 months old and the inability to induce a booster response. polysaccharide vaccines based on prp alone were developed in the 1970s. by covalent linkage of prp to t cell-dependent protein antigens, a conjugate vaccine was created to overcome the t cell-independent characteristics of prp. at present three different licensed protein carriers are linked to prp:
• hboc: diphtheria crm197 protein, mutant corynebacterium; linkage: no spacer
• hbomp: outer membrane protein (omp), neisseria meningitidis; linkage: spacer
• prp-d: diphtheria toxoid (d); linkage: spacer
these hib conjugate vaccines differ by protein carrier, polysaccharide size, and method of chemical conjugation, including use of a spacer between the prp and protein carrier. a standard chemical conjugation between a polysaccharide and a protein is illustrated in fig. 9.10. subunit vaccines, while offering greater safety, are intrinsically poorly immunogenic, and strong adjuvants are essential to boost the activation of immune responses. serotype variability is dictated by modifications of the o-antigen portion of lps. o antigens vary in the number of oligosaccharide unit repeats, the types and distribution of carbohydrates, and the intra- and intermolecular linkages [22]. in s. flexneri, the genes for o-antigen modification are encoded in the bacterial chromosome. in contrast, s. sonnei, which shows no serotype variability, expresses plasmid-encoded o-antigen modification enzymes. the o antigen is one of the major immunogenic components of shigella and is a virulence factor, in part because it masks exposure of the type three secretion apparatus [23]. the inclusion of conserved proteins in vaccine compounds potentially solves the issue of serotype specificity, thus allowing the generation of a highly desirable pan-shigella vaccine.
in addition, recombinant proteins usually have increased safety profiles. another important factor in shigella epidemiology that prompts vaccine development is the increasing frequency of antibiotic-resistant strains. antibiotic resistance is continually rising for this pathogen [24]. shigella spp. as causative agent of shigellosis. first defined as a causative agent of bacillary dysentery by shiga in japan, shigella is a gram-negative bacillus that is noncapsulated and nonmotile. diagnosis is generally based on symptoms [25] since bloody, mucoid stools are indicative of shigella infections. however, because several diarrheal infections caused by other microorganisms share these symptoms (enteroinvasive e. coli and campylobacter, among others), the analysis of symptoms alone is insufficient for an accurate diagnosis. therefore, clinical diagnosis must be complemented with microbiological isolation from culture. shigella invasion and pathogenesis. shigella is transmitted through the fecal-oral route by consumption of contaminated food and water. following ingestion, the acid-tolerant shigella passes through the stomach and small intestine into the large intestine [26] (fig. 9.11). here, the bacteria are taken up by m cells, transcytosed to the basolateral face of the colonic epithelium, and presented to resident macrophages, wherein ipab of the type three secretion system (t3ss) induces apoptosis by caspase 1 activation, thereby escaping killing by the macrophage [27]. shigella then invades epithelial cells using its t3ss to create a translocation pore in the host cell membrane and to initiate an orchestrated flow of effectors into the host cell cytoplasm, inducing actin rearrangements that ultimately result in uptake of bacteria. once inside, shigella quickly escapes its vacuole, replicates, and moves about the cytoplasm via actin-based motility.
in a t3ss-dependent manner, shigella then forms a protrusion into a neighboring uninfected cell, with the resulting vacuole being quickly lysed to complete the process of intercellular spread. the genes associated with the t3ss are encoded on a 220-kb plasmid which is highly conserved among the shigella species. at the heart of the t3ss is the type three secretion apparatus (t3sa), which is composed of a basal body similar to that of flagellar systems and an extracellular needle [28]. invasion plasmid antigen d (ipad) is a 37 kda protein that forms a pentameric ring at the tip of the needle. it controls secretion of effector proteins and is the environmental sensor for mobilization of ipab to the t3sa tip complex. ipab is a 64 kda translocator that forms a ring atop the ipad ring and is responsible for host cell contact. this contact is required for mobilization of ipac to the needle tip and formation of a complete unidirectional conduit from the bacterial cytoplasm to the host cell cytoplasm. the initiation of inflammation and invasion processes occurs exclusively at the basolateral side of host cells, highlighting the importance of the previous steps of macrophage subversion in shigella colonization of the gut. animal models. shigellosis is strictly a human disease. while the basis of this restriction is unknown, it complicates the investigation of shigella pathogenesis. however, several animal models have been developed to study the pathogenesis of shigella, the resulting immune response against shigella antigens, and the protective efficacy of candidate vaccines against shigellosis [31]. nonhuman primate (nhp) models: nhp models (rhesus and cynomolgus monkeys) have been used to define the ability of vaccines to elicit immune responses and protection [32]. the main advantage of this model is that shigella is able to colonize the large intestine and produce the symptoms seen in human infection.
o antigen/proteosome: o antigen represents the variable portion of shigella lps (fig. 9.12). administration of lps or o antigen alone in animal models is not enough to elicit immune responses, making them ineffective immunogens. to overcome this limitation, these molecules have been used in conjunction with different proteins as carriers. several variants of lps/o-antigen mixtures have been developed and characterized. one of these protein combination approaches uses s. flexneri and s. sonnei lps complexed with neisseria meningitidis outer membrane protein proteosomes [33, 34]. lps is extracted from s. flexneri or s. sonnei by hot phenol extraction and mixed with detergent-extracted outer membrane proteins from n. meningitidis. the complex is then separated from free lps present in the mixture by gel filtration chromatography. the concept behind this vaccine is that the proteins present in the n. meningitidis proteosome are able to act as carriers for t cell stimulation, thus allowing the recognition of lps. shigella outer membrane vesicles. outer membrane vesicles (omvs) are particles composed of lps, proteins, and nucleic acids. in a proposed vaccine formulation, these particles were purified from liquid cultures of s. boydii by centrifugation with subsequent filtering (fig. 9.13). the precise identity and amount of the proteins included in this preparation are not currently known, although the presence of proteins having the same mass as ipab, ipac, and ipad suggests that it includes these proteins. when these omvs are administered orally to mice, antibodies are generated against omv lysates. this vaccine has the advantage of heterologous protection (as shown by challenge against strains from each shigella serogroup) and the absence of adjuvant dependency.
in addition, immunity can be passively transferred to offspring, suggesting that the protective mechanism involves antibodies and raising the possibility that this vaccine can be used in infants, who are the main target population for a shigella vaccine. the use of live, fully virulent shigella during the formulation process, the presence of lps, and lot-to-lot consistency are possible downsides of this preparation.
fig. 9.12 depiction of lps/o-antigen-based vaccines. lps (a) extracted from shigella flexneri or sonnei is admixed with protein preparations from n. meningitidis, used as a carrier, to give the vaccine compound (b); administered orally and intranasally to mice and guinea pigs [33], this complex is associated with serum igg and mucosal iga in intestines and lungs. o antigen purified from lps is delivered in combination with exoprotein a from p. aeruginosa (c). finally, o antigen from different shigella serogroups is combined with ribosomes from shigella and is depicted in (d)
invaplex. another vaccine candidate that uses t3ss proteins and lps as part of the formulation is the invaplex [35]. these complexes are obtained by aqueous extraction followed by ion exchange chromatography (fig. 9.13). the precise composition of these extracts has not been completely characterized but includes lps, ipab, and ipac [36]. these complexes are able to elicit igg and iga responses against ipa proteins as well as lps in both mice and guinea pigs. in addition, they are protective against the shigella species/serotype used for extract generation [37] in the mouse and guinea pig challenge models. two phase one studies have been performed using the invaplex vaccine in adult volunteers [38] and showed no major side effects after delivery of intranasal doses of up to 690 μg. the highest dose employed in these studies generated an asc response to lps in 58 % of the volunteers.
an advantage of this approach is that, other than the invaplex itself, no additional adjuvants need to be administered. a drawback of this vaccine is its challenging production process, which includes cultures of virulent shigella as well as the presence of bacterial lps products in the intermediate steps and final formulation. another possible caveat is the uniformity of protein composition in these complexes across manufacturing lots. finally, this vaccine was not designed to protect against multiple serotypes. a solution for this possible drawback, however, is the generation of formulations that include invaplex complexes generated from more than one serotype, although this further complicates an already difficult manufacturing process. this would allow the generation of vaccine formulations specific for the serotypes prevalent in a particular region. a vaccine candidate that targets conserved shigella virulence proteins includes some of the t3ss ipa proteins (fig. 9.14). recombinant ipab and ipad can be expressed in e. coli at high levels. ipad is then easily purified from the e. coli cytosol, while ipab must be purified as a complex with its cognate chaperone ipgc. the chaperone is needed to maintain the hydrophobic ipab in a soluble state and to protect ipab from proteolytic degradation. ipab can then be further purified after separation from ipgc in low concentrations of detergent. analyses have indicated that ipab is greater than 90 % pure following this scheme. in its final formulation, this ipa-based vaccine also contains a double mutant of heat-labile enterotoxin from e. coli (dmlt) [39] as an adjuvant. the mechanism of protection for this vaccine has not yet been worked out. nevertheless, it was tested in the mouse lethal pulmonary model [40], where it exhibited over 90 % homologous protection (against s. flexneri) and greater than 60 % heterologous protection (using s. sonnei during the challenge experiments).
igg and mucosal iga were generated after intranasal administration along with antigen-specific ifn-γ-secreting cells. ompa. a 34-kda outer membrane protein (omp) was purified from s. flexneri 2a using ion exchange chromatography. incubation of macrophages with this 34-kda protein induced the production of nitric oxide and increased production of il-12 and tnf-α. this protein was delivered parenterally five times in rabbits, giving protection against challenge by s. flexneri in the rabbit cecal ligation model [41]. subsequent work using a recombinant protein purified by affinity chromatography identified this 34-kda omp as ompa, part of a family of immunomodulating proteins present in numerous gram-negative bacteria. this protein showed high protective efficacy in the mouse lethal pulmonary model [42], where it elicited serum igg and mucosal iga. ticks are widely distributed throughout the world, affecting 80 % of the world's cattle population [43]. the economic importance of ticks and tick-borne diseases (tbds) has been estimated by a number of studies; however, these estimates most likely underestimate the real impact of these arthropod vectors and their transmitted diseases. tick feeding has devastating effects including disease transmission, paralysis, toxicosis, and secondary infections of the tick-feeding site [44]. the effect of ticks and tick-borne diseases is particularly pronounced in the livestock sector, where it is repeatedly rated highly for its impact on the livelihood of farmers, particularly in countries of the south which are heavily dependent on agricultural production. there are six genera of ixodid ticks of importance, namely, amblyomma, dermacentor, haemaphysalis, hyalomma, rhipicephalus, and ixodes. historically, tick and tick-borne disease control has focused on keeping ticks at tolerable levels through acaricide use and on treatment of disease with appropriate drugs.
in some cases acaricide-based tick control is the only method of reducing tick populations without sacrificing productivity [45]. acaricides are commercially available in a number of formulations that are applied either directly onto livestock or in dipping vats through which multiple animals can be passed at regular time intervals. acaricide application relies heavily on correct formulation and administration to be effective. a large number of chemical compounds have been found to be effective against ticks including arsenic (introduced ~1893), ddt (~1946), cyclodienes and toxaphene (~1947), organophosphates-carbamate group (~1955), formamides (~1975), and macrocyclic lactones (~1981). the potency and usefulness of many of the abovementioned compounds are gradually eroding, with resistance developing in many tick species of rhipicephalus, amblyomma, and hyalomma. multiple acaricide-resistant tick stocks have been identified, limiting or entirely excluding the use of many acaricides [46]. in addition to resistance, chemical control through
the guiding principle for anti-tick vaccination stems from early studies conducted on acquired host resistance to tick infestations. repeated exposure of hosts to ticks or tick organ homogenates induced resistance to tick re-infestation. while the degree of resistance may vary between different tick and host species, evidence strongly suggests that natural resistance against tick infestation develops based on adaptive immune response mechanisms [47]. ticks feeding from hosts vaccinated with tick components take up effector molecules during feeding that mediate deleterious effects on the ticks. this effect manifests as reduction of feeding time, tick mortality (during or after feeding), reduced engorgement weights, and reduced reproductive capacity of adult females. eggs laid from ticks fed on vaccinated hosts may also show reduced hatching rates.
the overall result culminates in reduction of tick populations and tick-borne diseases. many of the anti-tick vaccine targets have been identified using conventional immune-screening techniques. immunization of vertebrate hosts with tick homogenates or purified tick extracts generates immune sera. these sera are used to screen for tick antigens detected by the host. the identification of tick proteins essential for tick survival is a useful method for more targeted antigen discovery, which is made increasingly possible as information is gathered on tick biology. with the availability of genome sequences for a number of tick species, the number of candidates for discovery is expanding through reverse vaccinology. the use of other techniques such as rna interference (rnai) has been useful in confirming the importance of anti-tick vaccine candidates and is likely to play a role in future anti-tick vaccine antigen discovery [48]. exposed or concealed antigens. anti-tick vaccine candidates have been classified into two categories: exposed or concealed antigens. exposed antigens are secreted in tick saliva during attachment and feeding on a host while concealed antigens are normally hidden from the host immune response. molecular mimicry by ticks of host components has been observed, and vaccination may induce host sensitivity and autoimmune reactions when exposed antigens are used [49]. one advantage of using exposed antigens is that natural boosting occurs through tick feeding. mechanistically, vaccination with exposed antigens is thought to induce a focal hostile environment unsupportive for tick attachment and feeding. concealed antigens do not come into contact with the host immune response during natural tick feeding. although often contained within the thoracic cavity of the tick, some salivary gland proteins can be characterized as concealed if they are not secreted into the tick-feeding site.
one difficulty in the development of concealed anti-tick vaccines is that the antigen must be accessible to the induced humoral vaccine response. this often limits the number of candidates to those coming into prolonged and direct contact with the blood meal or to cases where the humoral response can be transported over the gut barrier into the hemolymph [50-52]. the second limitation of concealed antigens relates to natural boosting of the immune response. as the antigens do not come into contact with the immune response within the host, sufficiently high antibody levels must be induced through repeated vaccination. as the blood meal acts as the carrier for the effector immune responses, the anti-tick effect can take place over a longer period of time compared to exposed antigens. this effect may even extend beyond the mere feeding period into the inactive stages where digestion and molting/egg laying take place. bm86 anti-tick vaccine. the bm86-based anti-tick vaccine remains the only anti-tick vaccine commercially produced and has become the benchmark for future anti-tick vaccine development and evaluation. the gut-associated bm86 glycoprotein was first identified in r. microplus, although homologues in other tick species have since been identified [53-55]. the biological function of bm86 remains unknown, although it is thought to play a role in the digestion of the blood meal [56]. in r. microplus, expression of bm86 is increased during embryogenesis, reaching the highest level in unfed larvae. expression decreases during feeding and molting, with the lowest levels of expression detected during the resting stages of the tick. bm86 has a translated coding sequence of 650 amino acids and a size of 71.7 kda. the protein contains four potential n-linked glycosylation sites and a leader peptide suggesting transport to the cell surface. localization studies have shown the molecule is located predominantly on the microvilli of gut epithelial cells.
a single c-terminal transmembrane sequence is present in the unprocessed protein, which is replaced by a glycosylphosphatidylinositol anchor in the mature protein. the protein also contains multiple predicted egf repeats rich in cysteine residues. vaccination has been performed mostly with the whole molecule, and protective epitopes for bm86 have not been well determined. the site of one protective b cell epitope has been defined, and additional epitopes are likely to exist. overlapping cross-reactive epitopes have been found between bm86 and the r. decoloratus homologue, bd86. vaccine efficacy is directly related to anti-bm86 antibody titer, and the ability to control tick populations is directly related to achieving a strong antibody response. substantial animal-to-animal variation has been observed in the ability to generate anti-bm86 antibody titers, which is likely related to the mhc class ii haplotypes expressed. antibodies to bm86 and components of the cattle complement system are taken up during the blood meal. antibody binding results in lysis of the gut epithelial cells, culminating in impaired blood meal digestion. strong antibody responses may induce tick mortality due to blood leakage from the gut into the hemolymph, and ticks may turn reddish instead of gray. the development of the antibody response in cattle [57] after immunization with rbm86 is demonstrated in fig. 9.19. recombinant expression of bm86 has been attempted in several expression systems including escherichia coli, aspergillus nidulans, aspergillus niger, and pichia pastoris. vaccine trials showed that bm86 vaccination targeted mainly the adult stage of r. microplus, particularly the number of adult females fully engorging and post-engorgement mortality. reproductive capacity of adult r. microplus females was affected in terms of egg-laying capacity and hatching of eggs [58].
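the effects listed above (fewer adult females fully engorging, reduced egg laying, reduced hatching) act on successive steps of the tick life cycle, so they compound multiplicatively rather than add. a minimal sketch, assuming the overall efficacy is expressed as 100 × [1 − (crt × cro × crf)], a form commonly used in the tick vaccine literature but not given in this chapter; the function name and the input ratios below are hypothetical and chosen only for illustration.

```python
def tick_vaccine_efficacy(crt: float, cro: float, crf: float) -> float:
    """Overall efficacy (%) from three vaccinated/control ratios.

    crt: adult female ticks completing engorgement (vaccinated / control)
    cro: egg mass laid per female, i.e. oviposition (vaccinated / control)
    crf: egg hatching, i.e. fertility (vaccinated / control)

    Assumed formula (not from the chapter): 100 * (1 - crt * cro * crf).
    """
    for name, value in (("crt", crt), ("cro", cro), ("crf", crf)):
        if value < 0:
            raise ValueError(f"{name} must be non-negative")
    return 100.0 * (1.0 - crt * cro * crf)


if __name__ == "__main__":
    # Hypothetical ratios for illustration only, not data from the chapter.
    print(round(tick_vaccine_efficacy(0.5, 0.8, 0.9), 1))
```

a ratio of 1.0 means no difference from the control group, so three ratios of 1.0 give an efficacy of zero; moderate reductions at each step combine into a larger overall effect on the next tick generation.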
under field conditions, vaccination of cattle reduced tick numbers by 56 % within a single generation and reduced reproductive capacity by 72 %. the negative effect of tick feeding on live weight was also reversed: vaccinated animals showed an average live weight gain of 18.6 kg over a 6-month period. extensive field trials in cuba, brazil, argentina, and mexico showed between 55 and 100 % control of r. microplus ticks within a 36-week period [59]. importantly, complete control of acaricide-resistant ticks could be accomplished by integrating bm86 vaccination with acaricide use [52], showing that integrated control systems are effective in controlling tick populations. vaccination also decreased the amount of acaricide required to control tick populations and prolonged the interval between cattle dippings. bm86 vaccination has been extensively evaluated for its ability to control other tick species. almost complete cross-protection against rhipicephalus annulatus has been reported [60]. significant protection against hyalomma anatolicum, h. dromedarii, and r. decoloratus has been observed; however, no cross-protection was seen against r. appendiculatus or amblyomma variegatum.

genetically attenuated microorganisms, both viruses and bacteria, can be engineered to deliver recombinant heterologous antigens to stimulate the host immune system. some experimental vector systems are summarized in
work with vectors classified as bsl-1 does not require biosafety program approval. work with vectors classified as bsl-2 or higher requires approval by the local biosafety committee.

safety concerns. as demonstrated for adenovirus 5 (ad5), following i.m. injection, the vector persisted mainly near the injection site and in draining lymph nodes for up to 6 months. low levels of integration into chromosomal dna were observed, with a calculated mutation rate of 2 × 10⁻⁷ mutations per cell.
the spontaneous mutation rate of a cell is 2 × 10⁻⁶ and is therefore tenfold higher. ad5 is classified as biosafety level 2 (bsl-2). live vectors are able to stimulate mucosal as well as systemic humoral and cellular immunity. a severe drawback of vector technology is that, once used, the vector cannot be effectively used in the same patient again because it will be recognized by antibodies, so repeated booster immunization will fail. preexisting immunity against the vector could likewise render the vaccination ineffective. a heterologous prime-boost and vector priming, as described in chap. 2, could circumvent this barrier.

lactic acid bacteria have generally recognized as safe (gras) status and have been developed over the past decade as potent adjuvants for mucosal delivery of vaccines. a platform technology based on lactobacillus plantarum (fig. 9.21) was developed to deliver antigens against plague, the disease caused by yersinia pestis, an aerobic, nonmotile, gram-negative bacillus belonging to the family enterobacteriaceae, which is transmitted to humans via fleabite or via aerosol droplet, causing bubonic or pneumonic plague, respectively [61]. most human plague cases present as one of three primary forms: bubonic, septicemic, or pneumonic. secondary plague septicemia, pneumonia, and meningitis are the most common complications. the pathogenicity of y. pestis results from its impressive ability to overcome the defenses of the mammalian host and to overwhelm it with massive growth. plague is enzootic in rodents in africa, asia, south america, and north america. y. pestis is transmitted from host to host by fleas via blood feeding, through consumption or handling of infectious host tissues, or through inhalation of infectious materials. y. pestis infects an astonishingly broad range of mammals and uses rats, squirrels, mice, prairie dogs, marmots, or gerbils as reservoirs and several arthropod vectors for transmission [62] (fig. 9.20). humans acquire this zoonotic infection via an atypical bite from animal fleas, sometimes prompted by an animal's death from plague, after which the flea seeks a new source of blood. most infected fleas come from the domestic black rat rattus rattus or the brown sewer rat rattus norvegicus. y. pestis cells spread from the site of the infected fleabite to the regional lymph nodes, grow to high numbers causing the formation of a bubo, and spill into the bloodstream, where bacteria are removed in the liver and spleen. growth continues in these organs, spreads to others, and causes septicemia. fleas feeding on septicemic animals complete the infection cycle. in humans, bubonic plague can develop into an infection of the lung (secondary pneumonic plague) that can lead to aerosol transmission (primary pneumonic plague) [63]. multiple antibiotic-resistant strains of y. pestis occur naturally, and resistant strains can also be easily bioengineered. thus, plague is a category a bioterrorism agent in need of novel strategies for its prevention.

bubonic plague is the classic form of the disease. patients usually develop symptoms of fever, headache, chills, and swollen, extremely tender lymph nodes (buboes) within 2-6 days of contact with the organism, either by fleabite or by exposure of open wounds to infected materials. primary septicemic plague is generally defined as occurring in a patient with positive blood cultures but no palpable lymphadenopathy. patients are febrile, and most have chills, headache, malaise, and gastrointestinal disturbances. primary pneumonic plague is a rare but deadly form of the disease that is spread via respiratory droplets through close contact (2-5 ft) with an infected individual. it progresses rapidly from a febrile flu-like illness to an overwhelming pneumonia with coughing and the production of bloody sputum. the incubation period for primary pneumonic plague is between 1 and 3 days.
in general, patients who develop secondary plague pneumonia have a high fatality rate.

fig. 9.20 schematic of the plague cycle with small mammals as hosts and fleas as vectors. arrows represent connections affected by climate, color-coded by the most influential climate variable on each link (i.e., precipitation, temperature, and other variables that depend indirectly on them, such as soil characteristics and soil moisture). grey rectangles somewhat arbitrarily delimit the epizootic, enzootic, and zoonotic cycles. note that despite their location at the far end of the cycle, humans often provide the only available information on plague dynamics

the laboratory diagnosis of plague is based on bacteriological and/or serological evidence [64]. samples for analysis can include blood, bubo aspirates, sputum, cerebrospinal fluid in patients with plague meningitis, and scrapings from skin lesions, if present. staining techniques such as the gram, giemsa, wright, or wayson stain can provide supportive but not presumptive or confirmatory evidence of a plague infection.

lactic acid bacteria (lab) are a group of gram-positive, nonpathogenic, non-sporulating bacteria that include species of lactobacillus (fig. 9.21), lactococcus, leuconostoc, pediococcus, and streptococcus. they have limited biosynthetic abilities and require preformed amino acids, b vitamins, purines, pyrimidines, and a sugar as a carbon and energy source. these nutritional requirements restrict their habitats to those in which the required compounds are abundant. thus, these highly specialized bacteria occupy a range of niches including milk, plant surfaces, and the oral cavity, gastrointestinal tract, and vagina of vertebrates [65]. lab have been consumed by humans for centuries in fermented foods and have an extraordinary safety profile. these intrinsic advantages make lab excellent delivery vectors for novel preventive and therapeutic molecules for humans.
a number of studies of oral vaccines generated from genetically engineered pathogenic or commensal bacteria have been reported [66, 67]. live attenuated pathogenic bacteria, such as derivatives of mycobacterium, salmonella, and bordetella spp., are currently the most popular live delivery vectors. they are particularly well adapted to interact with mucosal surfaces because they have specialized machinery to initiate the infection process. the major disadvantages of live vaccines include inadequate attenuation and the potential to revert to virulence. lactic acid bacteria-based vaccines act like live attenuated vaccines but without this safety concern. lab have generally recognized as safe (gras) status and thus are not likely to cause harm. the production of a desired antigen by lab can occur in three different cellular locations:
• intracellular, which allows the protein to escape harsh external environmental conditions (such as gastric juices in the stomach) but requires cellular lysis for protein release and delivery
• extracellular, which allows the release of the protein into the external medium, resulting in direct interaction with the environment (food product or the digestive tract)
• cell wall anchored, which combines the advantages of the other two locations (i.e., interaction between the cell wall-anchored protein and the environment, in addition to protection from proteolytic degradation)
in this context, several studies have compared the production of different antigens in lab, using all three locations, and evaluated the subsequent immunological impact. these studies demonstrated that the highest immune response was obtained with cell wall-anchored antigens exposed on the surface of lab. therefore, most recent lab vaccination studies have selected surface exposure of the antigen of interest rather than intra- or extracellular production [68].
dendritic cells (dcs) play a central role in bridging the innate immune system with the adaptive immune system. dcs are found throughout the body and are especially common at mucosal surfaces. with only a single layer of epithelial cells separating the external from the internal world amid the constant need for particle exchange, intestinal dcs play a key role in maintaining intestinal homeostasis as well as governing protective immune responses against invading pathogens. to avoid activation of self-reactive t cells and to limit unnecessary responses, such as those against commensal flora, dcs can imprint tolerance onto t cells (fig. 9.22). immature-type dcs are enriched underneath the epithelium of mucosal inductive sites and are poised to capture antigens. they extend protrusions between epithelial cells, enabling direct sampling of luminal antigens [69]. through upregulation of mhc and co-stimulatory molecules, matured dcs convert into highly efficient antigen-presenting cells. successful antigen presentation to cd4+ t cells requires recognition of cognate peptide in the context of mhc class ii molecules, whereas epitopes presented on mhc class i molecules stimulate antigen-specific cd8+ t cells. when antigen uptake occurs, these dcs change their phenotype by expressing higher levels of co-stimulatory molecules and move to t cell areas of inductive sites for antigen presentation. thus, dcs and their derived cytokines play key roles in the induction of antigen-specific effector th cell responses. in this regard, targeting mucosal dcs is an effective strategy to induce mucosal and systemic immune responses.

lab persistence. the ability of some lab to persist in the gastrointestinal tract may be critically important for the effectiveness of lab-based vaccines. a comparison of a persisting lab strain, l. plantarum, with a nonpersisting lab strain, l. lactis, identified l. plantarum as more effective at eliciting antigen-specific immunity, suggesting that persistence promotes immunogenicity [70]. furthermore, it has been shown that particular lactobacillus species induce critical inflammatory cytokines and the activation and maturation of dendritic cells [71]. it has also been shown that immature dcs efficiently capture lactobacillus species and that these bacteria activate human dcs, resulting in the production of pro-inflammatory cytokines such as il-12, increased proliferation of cd4+ and cd8+ cells, and a t cell response skewed toward a th1 pathway believed to be involved in effective clearance of microbial pathogens [72, 73] (fig. 9.23). evidence suggests that the peptidoglycan layer of some lab promotes natural immuno-adjuvanticity [74], and antigen localization on the cell wall makes the antigen more accessible to the immune system than intracellular or secreted proteins. leader peptides mark proteins for translocation across the cytoplasmic membrane, and lipid modification is of major importance both for anchoring exported proteins to the membrane and for protein function [75]. it has been shown that lipidation at the first amino acid of the mature borrelia burgdorferi ospa protein is essential to induce an immune response via tlr2 [76]. the leader peptide of ospa targets the protein to the cell envelope of lactobacillus, and cys17 is recognized by the l. plantarum cell wall-sorting machinery, which lipidates the protein and anchors it to the cell envelope. the end result is a delivery system that exerts a potent adjuvant effect [77].

virus-like particles (vlps) that mimic the antigenic architecture of authentic virions, however, can be produced in insect, mammalian, and plant cells by the expression of the capsid protein.
the particulate nature of vlps and the high-density presentation of viral structural proteins on their surface make them a premier vaccine platform with superior safety, immunogenicity, and manufacturability. therefore, this chapter focuses on the development of effective nov vaccines based on vlps of capsid proteins. the expression and structure of nov vlps, especially vlps of norwalk virus, the prototype nov, are extensively discussed. the ability of nov vlps to stimulate potent systemic and mucosal anti-nov immunity through oral and intranasal delivery in mice is presented.

gastroenteritis (ge) is a worldwide health problem that affects people of all ages. as its name implies, ge is characterized by inflammation of the gastrointestinal tract and is often associated with symptoms of diarrhea, nausea, vomiting, and abdominal cramping and pain. ge and its associated diarrheal diseases remain one of the top causes of death in the world, especially in developing countries and in young children, with an estimated death toll of four to six million per year [78]. ge can be caused by a variety of pathogens including viruses, bacteria, and parasites and by ingestion of noninfectious toxins or medications, with viruses as the most common offending agents. norovirus (nov) and rotavirus are the most common causes of viral ge, while adenovirus, astrovirus, coronavirus, and parechovirus are also known to cause ge in humans. novs are a group of genetically diverse rna viruses that belong to the genus norovirus in the caliciviridae family [79]. they were first discovered and characterized through their prototype virus, the norwalk virus (nv), in 1972 [80]. studies of nv revealed that novs are non-enveloped viruses with an rna genome surrounded by a round capsid protein shell approximately 38 nm in diameter. novs are divided into 5 genogroups and 29 clusters, with 8 clusters in genogroup i (gi), 17 in gii, 2 in giii, and 1 each in giv and gv.
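the genogroup breakdown above can be held in a small lookup table and checked against the stated totals. this is only an illustrative sketch; the dictionary name and layout are hypothetical, and the counts are the ones quoted in the text.

```python
# Cluster counts per norovirus genogroup, as quoted in the text.
# The variable names are illustrative, not an established nomenclature API.
CLUSTERS_PER_GENOGROUP = {"GI": 8, "GII": 17, "GIII": 2, "GIV": 1, "GV": 1}

N_GENOGROUPS_REPORTED = 5     # from the text
TOTAL_CLUSTERS_REPORTED = 29  # from the text

total_clusters = sum(CLUSTERS_PER_GENOGROUP.values())

# Share of all clusters contributed by each genogroup.
cluster_share = {
    genogroup: count / total_clusters
    for genogroup, count in CLUSTERS_PER_GENOGROUP.items()
}

print(f"{len(CLUSTERS_PER_GENOGROUP)} genogroups, {total_clusters} clusters")
for genogroup, share in cluster_share.items():
    print(f"{genogroup}: {share:.0%} of clusters")
```

the per-genogroup counts sum to the reported 29 clusters, and gii alone accounts for more than half of them.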
among the five genogroups, gi and giv strains are found to infect humans exclusively and gii strains are found in both humans and pigs, while giii and gv strains are animal viruses that infect cattle and murine species, respectively [81]. currently, strains in cluster 4 of gii (gii.4) are the most prevalent novs in the human population [82]. the genome of novs, which was first characterized in nv, is a single-stranded positive-sense rna of 7.5-7.7 kb with three open reading frames (orfs) and a poly a tail at its 3′ end [83] (fig. 9.24). orf1 encodes a polyprotein that is processed by the viral protease 3clpro into the rna-dependent rna polymerase and approximately five other nonstructural proteins including p48, the nucleoside triphosphatase, p22, vpg, and 3clpro. the two structural proteins, the major (vp1) and minor (vp2) capsid proteins, are encoded by orf2 and orf3, respectively [84]. structural analysis of nov has revealed that each viral capsid is composed of 90 dimers of vp1 in a t = 3 icosahedral symmetry. vp1 folds into two domains: a shell (s) domain that is responsible for initiating capsid assembly and for icosahedral contacts, and a protruding (p) domain, containing the p1 and p2 subdomains, that enhances the stability of the capsid by providing intermolecular contacts between vp1 dimers. studies of nv also indicate that the vp2 protein enhances the expression level of vp1 and stabilizes vp1 in the viral capsid. novs are highly contagious and spread rapidly, and outbreaks commonly occur in settings where people share common food and water sources or are in close physical proximity, such as cruise ships, schools, military units, nursing homes, daycare centers, hospitals, restaurants, and catered events. the life cycle of nov is not fully understood because of the lack of an in vitro cell culture system and a small animal model of infection.
the failure of nov replication in mammalian cell cultures is not due to a lack of host factors to support intracellular expression of nov rna. instead, the problem may lie in the steps of viral binding to cellular receptors or virus entry into cells [85].

while serum antibodies to nov can be readily detected, this method has little clinical relevance because of antibody cross-reactivity. since there is no culture system available for nov, the detection of virus in stool samples has become the preferred method of diagnosis. traditionally, nov infection was diagnosed by detecting the virus by immune transmission electron microscopy (tem). tem offers the advantage of direct visualization of any potentially responsible virus particles in stool samples. however, it has the disadvantage of requiring sophisticated and expensive equipment and highly specialized technicians. several enzyme-linked immunosorbent assays (elisas) that detect nov antigens were later developed for nov diagnosis. studies have shown that nov antigen-detecting elisas have high specificity (94-96 %) but poor sensitivity (40-60 %), most likely because of the antigenic diversity of nov strains [86]. like elisa-based assays, rt-pcr is rapid and robust: it can process large numbers of samples simultaneously, and results can be obtained within a working day. however, it requires rna extraction from fecal samples and needs expensive equipment and skilled operators. therefore, rt-pcr is more labor intensive and less economical than elisas. overall, tem, elisa, and rt-pcr-based methods all have their advantages and challenges. the three methods detect different components of the virus and are therefore complementary to each other.

immunology of nov infection. immunological knowledge of nov comes mostly from human challenge studies and natural outbreaks, owing to the lack of small animal models.
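the practical meaning of the elisa figures quoted above (high specificity, poor sensitivity) becomes clearer when they are converted into predictive values. the sketch below applies the standard bayes-rule definitions; the midpoint values and, in particular, the prevalence among tested samples are assumptions chosen for illustration, not data from the chapter.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Midpoints of the ranges quoted in the text (94-96 % and 40-60 %).
ELISA_SPECIFICITY = 0.95
ELISA_SENSITIVITY = 0.50
# ASSUMPTION: 30 % of tested stool samples truly contain NoV (hypothetical).
ASSUMED_PREVALENCE = 0.30

ppv, npv = predictive_values(ELISA_SENSITIVITY, ELISA_SPECIFICITY,
                             ASSUMED_PREVALENCE)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

with these assumed inputs a positive result is fairly reliable, whereas a negative result leaves a substantial chance of a missed infection, which is consistent with the text's view that the methods complement one another.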
observations of repeat infections in adults suggest that long-term immunity against these viruses is limited. however, other studies showed that close to 50 % of genetically susceptible subjects were not infected by nov challenge, which supports the possibility of long-term immunity [87].

the lack of a tissue culture system also impedes the development of vaccines against nov. fortunately, the discovery that expressed vp1 spontaneously assembles into virus-like particles (vlps) that are morphologically and antigenically similar to the native viruses has facilitated vaccine development. vlps combine the best traits of whole-virus and subunit antigens for vaccine development. because they lack a viral nucleic acid genome, vlps are noninfectious and therefore safer than inactivated or attenuated virus. importantly, vlps can induce potent cellular and humoral immune responses without adjuvants and are more effective vaccines than other subunit antigens because their architectures mimic infectious viruses. vlps can be produced by recombinant technology in heterologous expression systems that do not need to support viral replication. this is particularly important for nov because no culture system has been developed to support the growth of these viruses. studies have demonstrated that viruses and the corresponding vlps have a particle size ideal for dc and macrophage uptake to initiate antigen processing [88]. thus, the particulate nature of vlps favors their targeting to relevant apcs for optimal induction of t cell-mediated immune responses. vlps can also be presented efficiently to b cells and induce strong antibody responses. as with live viruses, the quasicrystalline surface of vlps, with its arrays of repetitive epitopes, presents a prime target that vertebrate b cells have evolved to specifically recognize.
this recognition triggers the cross-linking of surface membrane-associated immunoglobulins (ig) on b cells and leads to their proliferation and migration, t helper-cell activation, antibody production and secretion, and the generation of memory b cells. thus, vlps can directly activate b cells at much lower concentrations than other subunit antigens and induce high-titer and durable b cell responses in the absence of adjuvants. these inherent advantages of vlps have made them one of the most successful recombinant vaccine platforms. for example, five vlp-based vaccines for hepatitis b virus

characterization of nov vlps. vlps of novs were first produced in insect cells using baculovirus vectors [89] and then in plants using tobamovirus and geminivirus vectors and in mammalian cells using the venezuelan equine encephalitis (vee) replicon system [90]. these studies demonstrated that expression of the major capsid protein vp1 alone can drive the self-assembly of vlps that morphologically and antigenically resemble native virus particles (fig. 9.25). vlps generated by all three expression systems are similar to each other. the structure of nov vlps is exemplified by the vlp of the nv capsid protein (nvcp). studies of insect cell-baculovirus-derived nvcp vlps by cryo-electron microscopy and x-ray crystallography reveal that the nv capsid is a 38 nm icosahedral arrangement of 180 copies of the 58 kda capsid protein vp1, organized into 90 dimers in a t = 3 symmetry. while all dimers are formed from two identical nvcp monomers, two different dimer configurations are required to correctly form the complete assembled capsid [91]. as in native nv particles, the nvcp also folds into two distinct domains in vlps, with the s domain forming the inner core of the shell and the p domain protruding out from the capsid [92].
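the copy numbers quoted for the nv capsid follow from the standard rule that an icosahedral capsid with triangulation number t contains 60 × t subunits. the short sketch below reproduces the 180 copies and 90 dimers and adds an approximate protein mass for the vp1 shell; the mass estimate is an illustration that ignores vp2 and any packaged nucleic acid, and the function name is hypothetical.

```python
def subunits_in_icosahedral_capsid(t_number: int) -> int:
    """Number of capsid protein copies for triangulation number T (60 * T)."""
    return 60 * t_number


T_NUMBER = 3       # from the text
VP1_MASS_KDA = 58  # from the text

vp1_copies = subunits_in_icosahedral_capsid(T_NUMBER)
vp1_dimers = vp1_copies // 2

# Approximate mass of the VP1 shell alone, in megadaltons.
# ASSUMPTION: VP2 and nucleic acid are ignored (reasonable for a VP1-only VLP).
shell_mass_mda = vp1_copies * VP1_MASS_KDA / 1000.0

print(f"T = {T_NUMBER}: {vp1_copies} VP1 copies, {vp1_dimers} dimers, "
      f"about {shell_mass_mda:.2f} MDa of protein")
```

the result matches the 180 copies and 90 dimers reported from the structural studies and puts the vp1 shell at roughly ten megadaltons.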
similarly, the p2 subdomain is the most surface-exposed region in nv vlps and may contain the hbga and neutralizing antibody-binding sites and the determinants of strain specificity [93, 94]. the similarity between vlps of nv and those of other novs, including gii.4 viruses, has been demonstrated [95].

insect cell-baculovirus vector-produced nvcp vlps. it was shown that four oral doses of as little as 5 μg nvcp vlps without any adjuvant triggered a serum nv-specific igg response in the majority (8/11) of vlp-fed icd1 outbred mice [90]. a systemic igg response was observed after two oral doses, and the highest titer was induced by four doses of 200 μg vlps. moreover, mice in the 200 μg dosage group developed nv-specific intestinal iga at a level of up to 0.1 % of total iga. inclusion of the mucosal adjuvant cholera toxin (ct) did not significantly change the number of positive responders for serum igg or intestinal iga but significantly enhanced the amplitude of the serum igg response, especially at higher doses of vlps. thus, the nvcp vlp is clearly a potent oral immunogen and can induce both systemic and gut mucosal antibody responses.

nanovaccines: gas infections
group a streptococcus (streptococcus pyogenes) (gas) is an important human mucosal pathogen that is responsible for a wide spectrum of diseases with varying clinical manifestations and severity [96, 97]. pharyngitis (strep throat) is a common minor complication of gas infection but, when left untreated, can lead to life-threatening diseases including the autoimmune sequelae rheumatic fever (rf) and rheumatic heart disease (rhd). rhd results in permanent damage to the heart tissues and valves. gas infections cause >500,000 deaths each year, mostly in developing countries and in indigenous populations within developed nations, where poor socioeconomic conditions and overcrowding contribute to the high rates of gas diseases. in developing countries, rf is the leading cause of heart disease among children [6].
there is currently no vaccine available to prevent infection with gas and consequently prevent gas diseases. a successful mucosal gas vaccine would need to stimulate the appropriate humoral and cellular immunity for protection against gas infection (fig. 9.26). this is especially difficult because of a lack of human-compatible mucosal vaccine adjuvants, which are essential to boost immune responses. researchers have therefore focused mainly on parenteral gas vaccine delivery approaches, for which suitable adjuvants are available, designed to provide protection against systemic infection via the induction of opsonic igg antibodies.

fig. 9.26 gas vaccination approaches. gas that breaches the physical barrier of the mucosal epithelium of the nasal-associated lymphoid tissue, functionally analogous to human tonsils, is purported to be transported to the underlying lymphoid tissue via association with membranous (m) cells. cells of the innate immune system sense gas and produce cytokines and chemokines to contain the infection to the mucosa. gas antigens are delivered to antigen-presenting cells such as dcs and b cells. iga-committed b cells are activated and initiate antigen-specific iga responses. dcs play a fundamental role in the development of immunity to gas and present antigen to t cells to induce a th17 response that, along with iga, is integral to mucosal defense against pharyngeal gas colonization. mucosal vaccination is designed to mimic these responses and effectively clear gas from the mucosal surface upon infection to prevent gas colonization and carriage. gas that escapes the host's defense mechanisms can disseminate into the lymphatics and blood, leading to systemic infection. mucosal vaccination is also able to induce a systemic immune response characterized by the induction of opsonic igg antibodies, which destroy the pathogen by opsonophagocytosis. parenteral gas vaccination induces serum igg but is not able to induce mucosal immunity

antigen. an effective gas vaccine needs to have broad antigenic reach, because of the many different gas strains (>150 different m types) circulating in a population, and should not induce immune responses that are potentially cross-reactive with self tissue proteins. the gas m protein is the major protective antigen and an ideal target for vaccine development; however, it contains heart tissue cross-reactive epitopes, particularly in the conserved region [98]. evidence suggests that cross-reactive t cells in particular play a pivotal role in the pathogenesis of rhd. the m protein is an α-helical coiled-coil surface protein consisting of a hypervariable amino-terminal region and a highly conserved (>98 % sequence identity) carboxy-terminal c-repeat region [99] (fig. 9.27). functionally, the m protein is important in preventing bacterial clearance by complement-mediated phagocytosis, which limits host defense mechanisms. previous studies indicate that protective immunity to gas can be evoked by opsonic antibodies to serotypic epitopes at the amino-terminal region that are m-type specific [100].

nanovaccine. combined vaccine/adjuvant delivery systems offer the potential of mucosal vaccine delivery. for example, the lipid-core-peptide (lcp) system is a novel, synthetic, self-adjuvanted vaccine delivery system that incorporates the adjuvant (prr agonist), carrier, and antigenic peptides of a vaccine into a single molecular entity (fig. 9.28). this system has previously been shown to efficiently deliver gas vaccines and induce immunity [101]. evidence suggests that the adjuvant activity of lcp involves the induction of dc activation.

preclinical developments. three approaches are currently being investigated in the development of a subunit gas vaccine based on the m protein:
1. multivalent gas vaccine.
a multivalent approach employs a combination of amino-terminal protein fragments representing different m types and is designed to target the prevalent gas strains in a population. using this approach, a recombinant multivalent gas vaccine containing m protein peptides from 26 different gas serotypes prevalent in north america was demonstrated to evoke opsonic antibodies in animals [102]. according to epidemiological data, the 26-valent vaccine would cover the majority of pharyngitis and invasive gas diseases, including rf, invasive fasciitis, and toxic shock syndrome. recently, a new 30-valent gas vaccine was shown to be immunogenic in rabbits and evoked opsonic antibodies against "non-vaccine" serotypes [103], potentially creating a vaccine with much broader coverage. this type of vaccine is population specific and therefore may not be effective universally. it may also need to be redesigned periodically to reflect changes in the epidemiology of gas infections.

fig. 9.28 the lipid-core-peptide (lcp) system. the lcp system contains an adjuvant component (a lipid core made of lipoamino acids) and a polylysine carrier onto which peptide epitopes are attached. the example shows a gas vaccine candidate containing j8 and an amino-terminal serotypic epitope called 8830. the adjuvant has three 2-amino-dodecanoic (n = 9) lipoamino acids separated by glycine amino acid spacers

2. j8 vaccine. a gas vaccine that employs peptide epitopes from the conserved c-repeat region of the m protein is the second approach and has, in theory, the potential for greater coverage of m types. immunization of mice with a c-region peptide gas vaccine candidate called j8, conjugated to the carrier protein diphtheria toxoid (dt) and co-delivered with an appropriate adjuvant, led to protection against systemic and mucosal gas infection [104] (fig. 9.29). j8 also elicited protective immunity against gas when linked to lipopeptides [105, 106].
other studies have shown that intranasal immunization of mice with c-region peptides conjugated to the experimental mucosal adjuvant cholera toxin b subunit (ctb) evoked protective immunity against gas at the mucosal level [107]. however, ctb could possibly enter olfactory regions of the central nervous system and cause neuronal damage following intranasal delivery [108] and is therefore not suitable for human use. vector delivery approaches have included expressing the c-repeat region on vaccinia virus [109], the commensal bacterium lactococcus lactis [110], or streptococcus gordonii [111].

3. j14. the third approach, a combination vaccine, uses both serotypic and conserved m protein peptide epitopes. initially, a heteropolymer gas vaccine construct was synthesized by free radical-induced polymerization of acryloyl peptides to combine seven serotypic epitopes and a highly conserved c-region peptide epitope called j14 [112]. the m types targeted in the heteropolymer represented gas infections prevalent in the northern territory of australia, a region highly endemic for gas. immunization of mice with the heteropolymer demonstrated excellent immunogenicity and protection against homologous and heterologous gas strains, indicating its potential to provide broad coverage. however, batch-to-batch variation led to altered immune responses, which limited its applicability for human use.

fig. 9.29 preclinical evaluation of gas vaccine candidates. intranasal immunization of mice with the j8-dt vaccine candidate led to significantly greater survival after intranasal challenge with gas versus control groups, but the mucosal adjuvant ctb was essential for protection (a). a multiepitope lcp-based gas vaccine candidate elicited protective immunity against mucosal gas infection even in the absence of ctb (b)
the vaccine also required the addition of an adjuvant to be effective, further limiting its use as a mucosal vaccine due to a lack of safe and effective mucosal adjuvants. later, multiepitope gas vaccine candidates were synthesized based on the lcp system that induced highly opsonic antibodies following parenteral delivery to mice [113], as well as protection against mucosal gas infection following intranasal immunization [114] (fig. 9.29).

safety and efficacy. the main concern when using large regions of the m protein in a gas vaccine is the potential for inducing an autoimmune response due to immunological cross-reactivity with host proteins. it is therefore important to identify protective antigenic determinants and to separate the biologically relevant epitopes from those that are host tissue cross-reactive and potentially harmful. epitope mapping studies were used to identify the conserved gas vaccine candidate j8, which contains a conformational protective b cell epitope and was designed to lack a human heart cross-reactive t cell epitope [115, 116].

tb vaccines - state of the art and progresses dna-antiviral vaccines: new developments and approaches - a review west nile virus recombinant dna vaccine protects mouse and horse from virus challenge and expresses in vitro a noninfectious recombinant antigen that can be used in enzyme-linked immunosorbent assays stable and long-lasting immune response in horses after dna vaccination against equine arteritis virus vaccination with human tyrosinase dna induces antibody responses in dogs with advanced melanoma infectious haematopoietic necrosis epidemic (2001 to 2003) in farmed atlantic salmon salmo salar in british columbia naked dna vaccination of atlantic salmon salmo salar against ihnv efficacy of an infectious hematopoietic necrosis (ihn) virus dna vaccine in chinook oncorhynchus tshawytscha and sockeye o. 
nerka salmon wild pollinators enhance fruit set of crops regardless of honey bee abundance a metagenomic survey of microbes in honey bee colony collapse disorder colony collapse disorder: a descriptive study the reproductive program of female varroa destructor mites is triggered by its host, apis mellifera brood cell size of apis mellifera modifies the reproductive behavior of varroa destructor the transmission of deformed wing virus between honeybees (apis mellifera l.) by the ectoparasitic mite varroa jacobsoni oud detection of acute bee paralysis virus and black queen cell virus from honeybees by reverse transcriptase pcr varroa destructor is an effective vector of israeli acute paralysis virus in the honeybee, apis mellifera prevalence and transmission of honeybee viruses age-related changes in the behavioural response of honeybees to apiguard(r), a thymol-based treatment used to control the mite varroa destructor the immune response of drosophila drosophila hemopoiesis and cellular immunity c: insights into social insects from the genome of the honeybee apis mellifera structure and genetics of shigella o antigens optimization of virulence functions through glucosylation of shigella lps growing antimicrobial resistance of shigella isolates role of m cells in initial antigen uptake and in ulcer formation in the rabbit intestinal loop model of shigellosis ipab mediates macrophage apoptosis induced by shigella flexneri molecular pathogenesis of shigella spp.: controlling host cell signaling, invasion, and death by type iii secretion mucosal lymphoid infiltrate dominates colonic pathological changes in murine experimental shigellosis not available development of an improved animal model of shigellosis in the adult rabbit by colonic infection with shigella flexneri 2a a challenge model for shigella dysenteriae 1 in cynomolgus monkeys (macaca fascicularis) immunogenicity and efficacy of oral or intranasal shigella flexneri 2a and shigella sonnei 
proteosome-lipopolysaccharide vaccines in animal models enhancement of anti-shigella lipopolysaccharide (lps) response by addition of the cholera toxin b subunit to oral and intranasal proteosome-shigella flexneri 2a lps vaccines isolation and characterization of a shigella flexneri invasin complex subunit vaccine immunogenicity and efficacy of highly purified invasin complex vaccine from shigella flexneri 2a development and evaluation of a shigella flexneri 2a and s. sonnei bivalent invasin complex (invaplex) vaccine safety and immunogenicity of an intranasal shigella flexneri 2a invaplex 50 vaccine characterization of a mutant escherichia coli heat-labile toxin, lt(r192g/l211a), as a safe and effective oral adjuvant broadly protective shigella vaccine based on type iii secretion apparatus proteins purification and characterization of an immunogenic outer membrane protein of shigella flexneri 2a outer membrane protein a (ompa) of shigella flexneri 2a, induces protective immune response in a mouse model global aspects of the management and control of ticks of veterinary importance the global importance of ticks chemical control of ticks on cattle and the resistance of these parasites to acaricides factors that influence the prevalence of acaricide resistance and tick-borne diseases the molecular revolution in the development of vaccines against ectoparasites rna interference for the study and genetic manipulation of ticks proteins in the saliva of the ixodida (ticks): pharmacological features and biological significance comparison of the proteins in salivary glands, saliva and haemolymph of rhipicephalus appendiculatus female ticks during feeding immunoglobulin-binding proteins in ticks: new target for vaccine development against a blood-feeding parasite amblyomma americanum: specific uptake of immunoglobulins into tick hemolymph during feeding synthetic vaccine (sbm7462) against the cattle tick rhipicephalus (boophilus) microplus: preservation of 
immunogenic determinants in different strains from south america cloning, expression and immunoprotective efficacy of rhaa86, the homologue of the bm86 tick vaccine antigen, from hyalomma anatolicum anatolicum differential transcription of two highly divergent gut-expressed bm86 antigen gene homologues in the tick rhipicephalus appendiculatus (acari: ixodida) cloning and expression of a protective antigen from the cattle tick boophilus microplus bovine immunoprotection against rhipicephalus (boophilus) microplus with recombinant bm86-campo grande antigen tick vaccines and the transmission of tick-borne pathogens vaccination against ticks (boophilus spp.): the experience with the bm86-based vaccine gavac immunity against boophilus annulatus induced by the bm86 (tick-gard) vaccine developing live vaccines against plague yersinia pestis--etiologic agent of plague yersinia: strategies that thwart immune defenses plague. in: crc handbook series in zoonoses. section a. bacterial, rickettsial, chlamydial and mycotic diseases mucosal delivery of therapeutic and prophylactic molecules using lactic acid bacteria the induction of hiv gag-specific cd8+ t cells in the spleen and gut-associated lymphoid tissue by parenteral or mucosal immunization with recombinant listeria monocytogenes hiv gag mucosal immunization with surface-displayed severe acute respiratory syndrome coronavirus spike protein on lactobacillus casei induces neutralizing antibodies in mice lactococci and lactobacilli as mucosal delivery vectors for therapeutic proteins and dna vaccines dendritic cells express tight junction proteins and penetrate gut epithelial monolayers to sample bacteria protection against tetanus toxin after intragastric administration of two recombinant lactic acid bacteria: impact of strain viability and in vivo persistence lactobacilli differentially modulate expression of cytokines and maturation surface markers in murine dendritic cells lactobacilli as natural enhancer of cellular 
immune response lactobacilli activate human dendritic cells that skew t cells toward t helper 1 polarization instruments for oral disease-intervention strategies: recombinant lactobacillus casei expressing tetanus toxin fragment c for vaccination or myelin proteins for oral tolerance induction in multiple sclerosis surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope treponema pallidum and borrelia burgdorferi lipoproteins and synthetic lipopeptides activate monocytic cells via a cd14-dependent pathway distinct from that used by lipopolysaccharide immune response to lactobacillus plantarum expressing borrelia burgdorferi ospa is modulated by the lipid modification of the antigen a review of viral gastroenteritis noroviruses: a comprehensive review biological properties of norwalk agent of acute infectious nonbacterial gastroenteritis norovirus classification and proposed strain nomenclature mechanisms of gii.4 norovirus persistence in human populations sequence and genomic organization of norwalk virus norwalk virus genome cloning and characterization methods for the detection and characterisation of noroviruses associated with outbreaks of gastroenteritis: outbreaks occurring in the north-west of england during two norovirus seasons european multicenter evaluation of commercial enzyme immunoassays human susceptibility and resistance to norwalk virus infection t cell-independent type i antibody response against b cell epitopes expressed repetitively on recombinant virus particles three-dimensional structure of baculovirus-expressed norwalk virus capsids expression and self-assembly of norwalk virus capsid protein from venezuelan equine encephalitis virus replicons conformational stability and disassembly of norwalk virus like particles: effect of ph and temperature structural studies of recombinant norwalk capsids 
e. coli-expressed recombinant norovirus capsid proteins maintain authentic antigenicity and receptor binding capability c-terminal arginine cluster is essential for receptor binding of norovirus capsid protein structural basis for the recognition of blood group trisaccharides by norovirus pathogenesis of group a streptococcal infections group a streptococcal infections and acute rheumatic fever multiple cross reactive epitopes of streptococcal m proteins streptococcal m protein: molecular design and biological behavior localization of protective epitopes of the amino terminus of type 5 streptococcal m protein towards the development of a broadly protective group a streptococcal vaccine based on the lipid-core peptide system immunogenicity of a 26-valent group a streptococcal vaccine new 30-valent m protein-based vaccine evokes cross-opsonic antibodies against non-vaccine serotypes of group a streptococci protection against group a streptococcus by immunization with j8-diphtheria toxoid: contribution of j8- and diphtheria toxoid-specific antibodies to protection intranasal vaccination with a lipopeptide containing a minimal, conformationally constrained conserved peptide, a universal t-cell epitope and a self-adjuvanting lipid protects mice from streptococcus pyogenes and reduces throat carriage immunisation of mice with a lipid core peptide construct containing a conserved region determinant of group a streptococcal m protein elicits heterologous opsonic antibodies in the absence of adjuvant epitopes of group a streptococcal m protein that evoke cross-protective local immune responses cutting edge: the mucosal adjuvant cholera toxin redirects vaccine proteins into olfactory tissues protection against streptococcal pharyngeal colonization with a vaccinia: m protein recombinant mucosal vaccine made from live, recombinant lactococcus lactis protects mice against pharyngeal infection with streptococcus pyogenes clinical and microbiological responses of volunteers to 
combined intranasal and oral inoculation with a streptococcus gordonii carrier strain intended for future use as a group a streptococcus vaccine new multi-determinant strategy for a group a streptococcal vaccine designed for the australian aboriginal population immunization with a tetraepitopic lipid core peptide vaccine construct induces broadly protective immune responses against group a streptococcus intranasal administration is an effective mucosal vaccine delivery route for self-adjuvanting lipid core peptides targeting the group a streptococcal m protein mapping a conserved conformational epitope from the m protein of group a streptococci mapping the minimal murine t cell and b cell epitopes within a peptide vaccine candidate from the conserved region of the m protein of group a streptococcus

key: cord-000708-iuo2cw23 authors: lippé, roger title: deciphering novel host–herpesvirus interactions by virion proteomics date: 2012-05-28 journal: front microbiol doi: 10.3389/fmicb.2012.00181 sha: doc_id: 708 cord_uid: iuo2cw23

over the years, a vast array of information concerning the interactions of viruses with their hosts has been collected. however, recent advances in proteomics and other systems biology techniques suggest these interactions are far more complex than anticipated. one particularly interesting and novel aspect is the analysis of cellular proteins incorporated into mature virions. though sometimes considered purification contaminants in the past, their repeated detection by different laboratories suggests that a number of these proteins are bona fide viral components, some of which likely contribute to the viral life cycle. the present mini review focuses on cellular proteins detected in herpesviruses. it highlights the common cellular functions of these proteins and their potential implications for host–pathogen interactions, and discusses technical limitations, the need for complementary methods, and potential future research avenues. 
over the last decades, many host-pathogen interactions have been characterized using genetic, biochemical, and microscopy approaches. these discoveries relied on mutants, pharmacological reagents, immunoprecipitations, immunofluorescence, electron microscopy, cell fractionation, and western blotting, to name a few of the methods employed. these approaches provided much valuable information but, given their typical focus on individual molecules, likely revealed only a small portion of the proteins involved. other methods such as high-throughput two-hybrid and genetic screens, nucleic acid arrays, rna interference, and proteomics are now proving essential tools to tackle the complexity of these interactions. the main advantages of mass spectrometry, for instance, are that it is a fast, sensitive, and potentially quantitative approach to identify putative novel players, particularly when coupled to efficient purification schemes. already, proteomics has revealed how viruses modulate the expression of host proteins (rassmann et al., 2006; sun et al., 2008; tong et al., 2008; antrobus et al., 2009; pastorino et al., 2009; thanthrige-don et al., 2009; zandi et al., 2009; zhang et al., 2009, 2010; coombs et al., 2010; emmott et al., 2010; lu et al., 2010, 2012; munday et al., 2010; bartel et al., 2011; lietzen et al., 2011; ramirez-boo et al., 2011; chou et al., 2012). a relatively new and interesting field is the characterization of host-pathogen interactions within mature purified virions. as reviewed on several occasions, a number of studies have reported the presence of individual cellular proteins in viral particles (bernhard et al., 2005; maxwell and frappier, 2007; viswanathan and fruh, 2007; friedel and haas, 2011; zheng et al., 2011). 
this includes vaccinia virus (krauss et al., 2002) , influenza virus (shaw et al., 2008) , hiv (gurer et al., 2002; cantin et al., 2005; ott, 2008) , vesicular stomatitis virus (moerdyk-schauwecker et al., 2009) , and several herpesviruses (see below). though these cellular components have often been considered purification contaminants, the presence of similar proteins in both related and unrelated viruses suggests that some of them may be biologically relevant. the identification of virion-associated host proteins could thus lead to the discovery of novel therapeutic tools against viruses. the present review focuses on their identification and putative roles with respect to the proteomics of herpesviruses. thus far, the protein composition of eight different herpesvirions has been studied by mass spectrometry. these studies include the alphaherpesvirinae herpes simplex virus type 1 (hsv-1) and pseudorabies virus (prv; loret et al., 2008; kramer et al., 2011) , the betaherpesvirinae human and murine cytomegaloviruses (hcmv and mcmv, respectively; kattenhorn et al., 2004; varnum et al., 2004) and the gammaherpesvirinae kaposi sarcoma herpesvirus (kshv), gamma herpesvirus 68 (γhv68), epstein-barr virus (ebv), and alcelaphine (bortz et al., 2003; johannsen et al., 2004; bechtel et al., 2005; zhu et al., 2005; dry et al., 2008) . interestingly, host proteins were detected in all herpesvirions analyzed so far, as summarized in table 1 . for instance, our laboratory previously reported the protein composition of mature extracellular hsv-1 viral particles and identified as many as 49 cellular proteins (loret et al., 2008) . similarly, studies focusing on prv and ebv reported up to 48 and 43 cellular proteins, respectively (johannsen et al., 2004; kramer et al., 2011) . meanwhile, varnum et al. (2004) found as many as 70 different host proteins in extracellular hcmv virions. 
while fewer cellular proteins were reported for other viral particles, it is clear that herpesviruses can potentially incorporate many proteins from their hosts. moreover, of the 173 different proteins detected in herpesvirions, nine protein groups are present in at least four distinct herpesvirions. these include 14-3-3, actin, annexins, cofilin, translation factors, gapdh, heat shock proteins, pyruvate kinase m2, and various rab gtpases. these results indicate, first, that it is common for herpesviruses to incorporate cellular proteins into their viral particles and, second, that different viruses share similar host proteins. most excitingly, they also suggest that these host proteins may play common roles throughout the herpesviral family. this defines an interesting and novel set of host-pathogen interactions taking place within the virus itself, rather than the cell. it is tempting to speculate that some viruses might have a higher capacity to steal cellular proteins because of their size and symmetry. herpesviruses are indeed large viruses containing a layer called the tegument between their capsids and envelopes that could accommodate non-viral proteins. though some host proteins may be incorporated into virions at random, others may instead be selected to ensure the optimal replication of the viruses that carry them. bioinformatics resources such as kegg, gene ontology, or david are useful tools to get an overview of the functional interplay of proteins (ashburner et al., 2000; huang da et al., 2009; kanehisa et al., 2010). as pointed out by friedel and haas (2011), complex statistical tools are available to quantitatively evaluate the implication of proteins in various processes but these are beyond the scope of the present review. 
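the tally mentioned above (protein groups present in at least four distinct herpesvirions) amounts to a simple count across proteomes. the following is a minimal sketch of that kind of count, not the procedure used in the review: the virion labels and the group memberships are invented for illustration and do not reproduce table 1, and the function name `shared_protein_groups` is hypothetical.

```python
from collections import Counter

# Hypothetical, abbreviated proteomes: virion label -> set of host protein groups.
# The memberships below are made up for illustration; they are NOT Table 1.
virion_host_proteins = {
    "virion_a": {"actin", "annexin", "14-3-3", "rab gtpase"},
    "virion_b": {"actin", "annexin", "14-3-3", "vamp3"},
    "virion_c": {"actin", "annexin", "gapdh"},
    "virion_d": {"actin", "annexin", "14-3-3", "gapdh"},
    "virion_e": {"actin", "gapdh"},
}


def shared_protein_groups(proteomes, min_virions=4):
    """Return {protein group: number of virions} for groups found in at least
    `min_virions` distinct virions."""
    counts = Counter()
    for groups in proteomes.values():
        # Each proteome is a set, so a group is counted once per virion.
        counts.update(groups)
    return {group: n for group, n in counts.items() if n >= min_virions}


shared = shared_protein_groups(virion_host_proteins, min_virions=4)
for group, n in sorted(shared.items()):
    print(f"{group}: detected in {n} virions")
```

storing each proteome as a set keeps a protein group from being counted twice for the same virus, which matters when several isoforms or family members map to one group.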
here an analysis of the proteins identified in herpesvirions was instead performed with the ingenuity pathways analysis database (ingenuity ® systems), which contains all the known physical and functional links among cellular proteins and defines their most significant functions. that analysis indicates that many of the cellular proteins found in herpesvirions normally modulate trafficking, cell proliferation, cell death, cell migration, cell metabolism, or the cytoskeleton (figure 1 , upper pie chart). though subtle differences between family members are noticeable when looking at individual viruses, similar functions are found (figure 1, other charts) . immune-related molecules are also important constituents for several viruses, including hsv-1, kshv, γhv68, alcelaphine, and mcmv. altogether, this provides an overall picture whereby herpesviruses, not surprisingly, modulate all of the important aspects of the cell but where each virus might deploy its energies slightly differently. the main surprise is that so many cellular proteins are detected within assembled viral particles, which raises an important question as to their biological significance and mode of action. the overall picture that several important cellular functions might be modulated by the host proteins incorporated into viral particles is intriguing. this clever strategy is consistent with the parasitic nature of all viruses, including herpesviruses, which would presumably gain some replication advantage from stealing cellular modulators rather than coding for them in their own genomes. the most critical question is the benefit for the viruses to incorporate these cellular proteins in their assembled particles, particularly since these proteins also exist in the cells. while this is open to discussion, one possibility is that some of the incorporated cellular proteins may be remnants of the final capsid envelopment process. 
alternatively, this may allow the prompt action of some of these proteins immediately upon viral entry. this could jumpstart the expression and/or duplication of the viral genome, as is the case for the herpesviral vhs, vp16, icp0, and icp4 proteins that are present in virions (lam et al., 1996; everett, 2000; halford and schaffer, 2001; ellison et al., 2005; hancock et al., 2006; loret et al., 2008; sarma et al., 2008; loret and lippe, 2012). other potential early sites of action are the process of viral entry itself, intracellular capsid transport, import of the viral genome through the nuclear pore, or immune modulation, all common steps among herpesviruses. whatever the case might be, the question remains as to why the cellular pool of these proteins would not suffice. several options may be considered. first, it may be that the virions incorporate specific isoforms, splice variants, or post-translationally modified proteins that could have properties or functions distinct from those of their cellular counterparts. second, the incorporation of a host protein from one cell type might permit the infection of a different cell type that does not express that protein. for example, alphaherpesviruses initially infect mucosal cells and could acquire host proteins that are beneficial to infect dormant neuronal cells. finally, the host proteins might be in complex with viral proteins, and it is those complexes that actively promote the infection. these possibilities are of course speculative at this point and need to be explored.

one aspect where the incorporation of host proteins in mature virions might be beneficial is intracellular trafficking. work by numerous laboratories demonstrated that the transport machinery used to move cellular proteins is also employed by viruses (simons and warren, 1984; lodish et al., 2000; sollner, 2004; greber and way, 2006; mercer et al., 2010). 
this is essential for their proteins and particles to reach their final destination, for example, the site of viral replication, assembly, and/or envelopment. along with snare proteins, rab and arf gtpases are master regulators of molecular trafficking throughout the cell (sollner and rothman, 1996; zerial and mcbride, 2001; mizuno-yamasaki et al., 2012). so far, vamp3, a snare, has been identified in prv virions (kramer et al., 2011), but it may only be a matter of time until other snares are discovered in other members of the herpes family. this is relevant as another snare was reported to facilitate the envelopment of mcmv capsids (cepeda and fraile-ramos, 2011). in contrast, a great number of rab proteins have been identified in herpesvirions, particularly hsv-1 and prv (table 1). one stimulating option is that these proteins regulate the movement of viral capsids in the cell, which could justify their incorporation in the viral particles. as rab and arf proteins collectively modulate several intracellular transport steps, it is anticipated that they may be involved in various stages of the infection. for instance, rab1, which is present in hsv-1 extracellular virions (loret et al., 2008), and rab43 were recently demonstrated to modulate the final envelopment of the virus (zenner et al., 2011). similarly, rab6, found in hsv-1 and prv (loret et al., 2008; kramer et al., 2011), is also necessary for the efficient assembly of the related hcmv (indran and britt, 2011). it will now be of interest to determine if the virion-associated pool of these gtpases actively participates in the viral life cycle. interestingly, several rab proteins have been implicated in autophagosome formation and maturation (chua et al., 2011). 
while it is difficult to envision how virion-incorporated rab proteins could play a role at that stage, they might instead be incorporated into the virions as a consequence of their involvement in autophagosome formation and concomitant viral envelopment. given the vast impact of rab proteins on the cell, it will be a major challenge to decipher all their roles in the life cycle of herpesviruses, particularly for the pool present in mature virions.

molecular trafficking is not only dependent on snares, rab, and arf proteins, it is also intimately linked to the cytoskeleton. it is thus not surprising that herpesviruses devote some of their resources toward regulating this central cellular machinery. for instance, herpesviruses significantly reorganize both cellular and nuclear actin as well as microtubules (norrild et al., 1986; avitabile et al., 1995; sharma-walia et al., 2004; simpson-holley et al., 2005; de regge et al., 2006; saksena et al., 2006).

figure 1 | the proteins from table 1 were analyzed with the ingenuity database to define their putative functions in the context of an infection. to this end, the ingenuity database was queried with the protein accession numbers (or gi numbers). for the purpose of this figure, all known functions associated with these proteins were exported to microsoft excel and regrouped. in the top pie chart, the cellular proteins found in all the herpesvirions were analyzed collectively, while the other pie charts depict the host proteins incorporated into each virus. since each protein can be associated with multiple functions in the database, the raw counts would exceed the original number of proteins analyzed; the results of those analyses are therefore expressed as relative values. the percentages represent the proportion of proteins falling into a given category, with the total of each pie chart being 100%. a graphical legend of the categories is provided at the bottom right corner of the figure.
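the normalization described in the caption of figure 1 can be illustrated with a short calculation. this is only a sketch under stated assumptions: the protein names and category assignments are placeholders rather than output from the ingenuity database, and `category_percentages` is a hypothetical helper, not part of any published pipeline.

```python
from collections import Counter

# Placeholder annotations: protein -> functional categories it is linked to.
# These assignments are invented and are not taken from the Ingenuity database.
protein_functions = {
    "protein_1": ["trafficking", "cytoskeleton"],
    "protein_2": ["trafficking"],
    "protein_3": ["cell death", "trafficking"],
    "protein_4": ["cytoskeleton"],
}


def category_percentages(annotations):
    """Convert per-category counts into percentages that sum to 100."""
    counts = Counter(
        category
        for categories in annotations.values()
        for category in set(categories)  # ignore duplicate labels on one protein
    )
    total = sum(counts.values())  # can exceed the number of proteins
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


shares = category_percentages(protein_functions)
for category, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{category}: {pct:.1f}%")
```

in this toy input, four proteins yield six category assignments, which is why raw counts cannot be read as protein numbers and each chart is instead scaled to 100%.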
herpesviruses also travel along microtubules during both entry and egress and interact with several cellular molecular motors (sodeik et al., 1997; smith et al., 2001; dohner et al., 2002; marozin et al., 2004; lee et al., 2006; wolfstein et al., 2006; radtke et al., 2010) as well as cortical and nuclear actin filaments (forest et al., 2005; feierbach et al., 2006; roberts and baines, 2011). furthermore, some members incorporate tubulin or actin-related components in their viral particles (table 1; wong and chen, 1998; grunewald et al., 2003). actin has been reported to compensate for the loss of various viral tegument proteins in prv (del rio et al., 2005; michael et al., 2006) and may thus act as an abundant filling agent, so its significance in herpesviral particles remains enigmatic. similarly, the relevance of the intermediate filament components vimentin and keratins in some herpesvirions (table 1) is difficult to assess given that these filaments are not as well characterized as other cytoskeletal elements. they may nevertheless be important for herpesviruses, particularly since not all of them are among the common skin or hair contaminants often detected in mass spectrometry (hertel, 2011).

viruses tend to monopolize their host's expression apparatus, including protein translation, for their own purposes (bushell and sarnow, 2002). for example, the prototypic hsv-1 icp27 viral protein regulates all aspects of mrnas including transcription, splicing, nuclear export, and translation for the benefit of the virus (rice and knipe, 1988; sekulovich et al., 1988; sandri-goldin and mendoza, 1992; smith et al., 1992; hardwicke and sandri-goldin, 1994; hardy and sandri-goldin, 1994; brown et al., 1995; soliman et al., 1997; chen et al., 2002; lindberg and kreivi, 2002; ellison et al., 2005; larralde et al., 2006; fontaine-rodriguez and knipe, 2008). 
as these cellular functions are highly regulated, the inclusion in virions of ddx3x, a multifunctional rna helicase that also regulates transcription, nuclear export, and translation and that is used by several viruses (schroder, 2010, 2011), may be relevant. its incorporation into mature virions could thus accelerate viral gene expression in the early stages of the infection. similarly, the presence of translation initiation or elongation factors in virions (table 1) may also jumpstart gene expression in favor of the viruses.

interestingly, hsv-1 does not require cells to be in the s-phase and even arrests the cell cycle at the g1/s transition step (shadan et al., 1994; song et al., 2000), which partly explains why it can grow in non-dividing neurons. while the precise mechanism of this arrest is unclear, it is known that the viral icp0 protein and the vp16 cellular partner hcf modulate the cell cycle (hobbs and deluca, 1999; lomonte and everett, 1999; piluso et al., 2002). moreover, icp0 interacts with the host cyclin d3 (kawaguchi et al., 1997). however, it was recently reported that stress, rather than the cell cycle per se, may be a critical feature (bringhurst and schaffer, 2006). clearly, the interaction of herpesviruses with the cell proliferation apparatus is complex and likely involves several host and viral proteins. identifying novel players that might be incorporated into mature virions may thus be very useful to clarify this process.

an interesting scenario is the possible regulation of apoptosis by host proteins loaded onto viral particles. apoptosis is regulated both negatively and positively by several viruses (teodoro and branton, 1997; goodkin et al., 2004), presumably to ensure their survival at the early stages of the infection and their efficient release later on. 
conceptually, the presence of anti-apoptotic proteins in herpes particles might thus provide a means to quickly evade death upon entry, while the presence of pro-apoptotic proteins on newly assembled/enveloped viral particles may trigger or stimulate their extracellular release. only further work will resolve this open question.

several factors generally contribute to variation among proteomic studies. first, the preparation of the samples (e.g., in-gel trypsin digestion versus liquid digestion and chromatography) may lead to the detection of different populations of tryptic peptides. moreover, the sensitivity of the mass spectrometers and the abundance of the proteins in the samples also impact peptide detection. the relative abundance of a peptide is itself influenced by the complexity of the samples, where some proteins may evade identification. finally, each protein differs in its properties (ionization, resolution in sds-page gels), which will be reflected in its detection. this includes snares, which are transmembrane proteins resistant to sds extraction (yang et al., 1999; kubista et al., 2004). it is thus likely that some of the proteins in table 1 are present in more viral particles than reported and that additional proteins are indeed incorporated in herpesvirions. aspects more specific to herpesviruses include the purification schemes employed to enrich the viral particles, which will directly influence the purity of the samples and hence the potential detection of contaminants. one important caveat is that some host proteins may simply stick to the large viral particles. another is the presence of common contaminants such as some hair- or skin-associated keratins or, as mentioned above, actin, which may simply fill the virions. however, even potential contaminants cannot simply be discarded since actin and even some keratins may indeed participate in viral life cycles. 
moreover, the relative abundance of all the cellular proteins within the cell is unknown, so it is not possible to rule out potential contaminants on the sole basis of abundance. it is thus critical to orthogonally validate all proteomics hits. various tools are available to define the biological relevance of host proteins identified in viral particles, including western blotting, immuno-electron microscopy or functional screens. one powerful method is rna interference. however, given the dual presence of the host proteins within the viral particles and the cell itself, this becomes a challenging task. rna interference also has its own caveats (false positives and negatives). another common step is the expression of dominant positive or negative mutants. in all cases, one major difficulty is that the host proteins may be essential for the cells and their depletion may lead to cytotoxicity, so proper controls are needed. in addition, the host proteins might be essential for the virus within the cells but only accessory within the virions. consequently, depletion of a protein may have limited impact on the virus, since it is complemented by the other pool of that protein in the virus or the cell. only a small reduction or stimulation in viral yields may thus result. in such cases, it may be necessary to produce the virus on cells that lack these proteins to see if this makes a difference. one should also consider animal models, since tissue culture-based screens may miss important players, for instance modulators of the immune system or virulence factors. clearly, multiple experimental strategies are needed to ultimately ensure the biological significance of the host proteins found in viral particles. the identification of host proteins in viral particles and of their functions is an important step toward the elucidation of novel host-pathogen interactions. in the case of herpesvirions, this is well under way with eight different family members analyzed so far.
one main task is to sort biologically relevant cellular proteins from sticky contaminants. the orthogonal validation of the host proteins found in herpesvirions using biologically relevant assays is thus critical. as pointed out above, it will be necessary to analyze all these proteins in the background of two pools, one cellular and one virion-associated, which are likely to complement one another. an interesting possibility is that some isoforms or specific posttranslationally modified host proteins may be loaded into the capsids. thus a detailed analysis of the host proteins present in viral particles will be important and a potential way to distinguish them from their cell-associated counterparts. another issue is the expected variation among cell types. in that respect, it would be most interesting to compare the cellular protein content of hsv-1 produced in neurons with that of virions produced on other cell types. finally, the mechanisms by which all these host proteins are recruited to the viral particles will also need to be explored. thus the proteomics of viral particles is only the beginning of the adventure, which should prove most exciting yet challenging. i am indebted to the canadian institutes of health research (grant # mop82921) for funding our proteomics research. i also wish to thank kerstin radtke for excellent suggestions and daniel henaff for critical reading of the manuscript.
proteomic analysis of cells in the early stages of herpes simplex virus type-1 infection reveals widespread changes in the host cell proteome
gene ontology: tool for the unification of biology
redistribution of microtubules and golgi apparatus in herpes simplex virus-infected cells and their role in viral exocytosis
proteome analysis of vaccinia virus ihd-w-infected hek 293 cells with 2-dimensional gel electrophoresis and maldi-psd-tof ms of on solid phase support n-terminally sulfonated peptides
host and viral proteins in the virion of kaposi's sarcoma-associated herpesvirus
new insights into viral structure and virus-cell interactions through proteomics
identification of proteins associated with murine gammaherpesvirus 68 virions
cellular stress rather than stage of the cell cycle enhances the replication and plating efficiencies of herpes simplex virus type 1 icp0- viruses
herpes simplex virus trans-regulatory protein icp27 stabilizes and binds to 3' ends of labile mrna
hijacking the translation apparatus by rna viruses
plunder and stowaways: incorporation of cellular proteins by enveloped viruses
a role for the snare protein syntaxin 3 in human cytomegalovirus morphogenesis
icp27 interacts with the rna export factor aly/ref to direct herpes simplex virus type 1 intronless mrnas to the tap export pathway
an overview of the vaccinia virus infectome: a survey of the proteins of the poxvirus-infected cell
involvement of members of the rab family and related small gtpases in autophagosome formation and maturation
quantitative proteomic analyses of influenza virus-infected cultured human lung cells
alpha-herpesvirus glycoprotein d interaction with sensory neurons triggers formation of varicosities that serve as virus exit sites
actin is a component of the compensation mechanism in pseudorabies virus virions lacking the major tegument protein vp22
function of dynein and dynactin in herpes simplex virus capsid transport
proteomic analysis of pathogenic and attenuated alcelaphine herpesvirus 1
control of vp16 translation by the herpes simplex virus type 1 immediate-early protein icp27
quantitative proteomics using stable isotope labeling with amino acids in cell culture reveals changes in the cytoplasmic, nuclear, and nucleolar proteomes in vero cells infected with the coronavirus infectious bronchitis virus
icp0, a regulator of herpes simplex virus during lytic and latent infection
alpha-herpesvirus infection induces the formation of nuclear actin filaments
herpes simplex virus icp27 increases translation of a subset of viral late mrnas
active intranuclear movement of herpesvirus capsids
virus-host interactomes and global models of virus-infected cells
herpes simplex virus infection and apoptosis
a superhighway to virus infection
three-dimensional structure of herpes simplex virus from cryoelectron tomography
specific incorporation of heat shock protein 70 family members into primate lentiviral virions
icp0 is required for efficient reactivation of herpes simplex virus type 1 from neuronal latency
herpes simplex virus regulatory proteins vp16 and icp0 counteract an innate intranuclear barrier to viral gene expression
the herpes simplex virus regulatory protein icp27 contributes to the decrease in cellular mrna levels during infection
herpes simplex virus inhibits host cell splicing, and regulatory protein icp27 is required for this effect
herpesviruses and intermediate filaments: close encounters with the third type
perturbation of cell cycle progression and cellular gene expression as a function of herpes simplex virus icp0
systematic and integrative analysis of large gene lists using david bioinformatics resources
a role for the small gtpase rab6 in assembly of human cytomegalovirus
proteins of purified epstein-barr virus
kegg for representation and analysis of molecular networks involving diseases and drugs
identification of proteins associated with murine cytomegalovirus virions
herpes simplex virus 1 alpha regulatory protein icp0 interacts with and stabilizes the cell cycle regulator cyclin d3
proteomic characterization of pseudorabies virus extracellular virions
an investigation of incorporation of cellular antigens into vaccinia virus particles
evidence for structural and functional diversity among sds-resistant snare complexes in neuroendocrine cells
herpes simplex virus vp16 rescues viral mrna from destruction by the virion host shutoff function
direct stimulation of translation by the multifunctional herpesvirus icp27 protein
reconstitution of herpes simplex virus microtubule-dependent trafficking in vitro
quantitative subcellular proteome and secretome profiling of influenza a virus-infected human primary macrophages
splicing inhibition at the level of spliceosome assembly in the presence of herpes simplex virus protein icp27
viruses: structure, function, and uses
herpes simplex virus type 1 immediate-early protein vmw110 inhibits progression of cells through mitosis and from g(1) into s phase of the cell cycle
comprehensive characterization of extracellular herpes simplex virus type 1 virions
biochemical analysis of icp0, icp4, ul7 and ul23 incorporated into extracellular herpes simplex type 1 virions
two-dimensional liquid chromatography-tandem mass spectrometry coupled with isobaric tags for relative and absolute quantification (itraq) labeling approach revealed first proteome profiles of pulmonary alveolar macrophages infected with porcine reproductive and respiratory syndrome virus
proteomic analysis of the host response in the bursa of fabricius of chickens infected with marek's disease virus
herpes simplex virus type 1 infection of polarized epithelial cells requires microtubules and access to receptors present at cell-cell contact sites
virus entry by endocytosis
composition of pseudorabies virus particles lacking tegument protein us3, ul47, or ul49 or envelope glycoprotein e
gtpase networks in membrane traffic
analysis of virion associated host proteins in vesicular stomatitis virus using a proteomics approach
quantitative proteomic analysis of a549 cells infected with human respiratory syncytial virus subgroup b using silac coupled to lc-ms/ms
organization of cytoskeleton elements during herpes simplex virus type 1 infection of human fibroblasts: an immunofluorescence study
cellular proteins detected in hiv-1
isolation and preliminary characterization of herpes simplex virus 1 primary enveloped virions from the perinuclear space
host cell factor-1 interacts with and antagonizes transactivation by the cell cycle regulatory factor miz-1
plus- and minus-end directed microtubule motors bind simultaneously to herpes simplex virus capsids using different inner tegument structures
quantitative proteomics by 2-de, 16o/18o labelling and linear ion trap mass spectrometry analysis of lymph nodes from piglets inoculated by porcine circovirus type 2
proteome alterations in human host cells infected with coxsackievirus b3
gene-specific transactivation by herpes simplex virus type 1 alpha protein icp27
actin in herpesvirus infection
herpes simplex virus type 1 accumulation, envelopment, and exit in growth cones and varicosities in mid-distal regions of axons
a herpesvirus regulatory protein appears to act posttranscriptionally by affecting mrna processing
small interfering rnas that deplete the cellular translation factor eif4h impede mrna degradation by the virion host shutoff protein of herpes simplex virus
human dead-box protein 3 has multiple functions in gene regulation and cell cycle control and is a prime target for viral manipulation
viruses and the human dead-box helicase ddx3: inhibition or exploitation?
the herpes simplex virus type 1 alpha protein icp27 can act as a trans-repressor or a trans-activator in combination with icp4 and icp0
n-butyrate, a cell cycle blocker, inhibits the replication of polyomaviruses and papillomaviruses but not that of adenoviruses and herpesviruses
kaposi's sarcoma-associated herpesvirus/human herpesvirus 8 envelope glycoprotein gb induces the integrin-dependent focal adhesion kinase-src-phosphatidylinositol 3-kinase-rho gtpase signal pathways and cytoskeletal rearrangements
cellular proteins in influenza virus particles
semliki forest virus: a probe for membrane traffic in the animal cell
identification and functional evaluation of cellular and viral factors involved in the alteration of nuclear architecture during herpes simplex virus 1 infection
herpesviruses use bidirectional fast-axonal transport to spread in sensory neurons
evidence that the herpes simplex virus immediate early protein icp27 acts post-transcriptionally during infection to regulate gene expression
microtubule-mediated transport of incoming herpes simplex virus 1 capsids to the nucleus
shuttling of the herpes simplex virus type 1 regulatory protein icp27 between the nucleus and cytoplasm mediates the expression of late proteins
intracellular and viral membrane fusion: a uniting mechanism
molecular machinery mediating vesicle budding, docking and fusion
herpes simplex virus infection blocks events in the g1 phase of the cell cycle
proteomic alteration of pk-15 cells after infection by classical swine fever virus
regulation of apoptosis by viral gene products
analyses of the spleen proteome of chickens infected with marek's disease virus
proteomic analysis of cellular protein alterations using a hepatitis b virus-producing cellular model
identification of proteins in human cytomegalovirus (hcmv) particles: the hcmv proteome
viral proteomics: global evaluation of viruses and their interaction with the host
the inner tegument promotes herpes simplex virus capsid motility along microtubules in vitro
evidence for the internal location of actin in the pseudorabies virion
snare interactions are not selective. implications for membrane fusion specificity
proteomics analysis of bhk-21 cells infected with a fixed strain of rabies virus
analysis of rab gtpase-activating proteins indicates that rab1a/b and rab43 are important for herpes simplex virus 1 secondary envelopment
rab proteins as membrane organizers
proteomic analysis of pbmcs: characterization of potential hiv-associated proteins
differential proteome analysis of host cells infected with porcine circovirus type 2
mass spectrometry based proteomic studies on viruses and hosts - a review
virion proteins of kaposi's sarcoma-associated herpesvirus
the author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
key: cord-014901-d9szap94 authors: permyakova, n. v.; uvarova, e. a.; deineko, e. v. title: state of research in the field of the creation of plant vaccines for veterinary use date: 2015-01-04 journal: russ j plant physiol doi: 10.1134/s1021443715010100 sha: doc_id: 14901 cord_uid: d9szap94 transgenic plants as an alternative of costly systems of recombinant immunogenic protein expression are the source for the production of cheap and highly efficient biotherapeuticals of new generation, including plant vaccines. in the present review, possibilities of plant system application for the production of recombinant proteins for veterinary use are considered, the history of the “edible vaccine” concept is briefly summarized, advantages and disadvantages of various plant systems for the expression of recombinant immunogenic proteins are discussed. the list of recombinant plant vaccines for veterinary use, which are at different stages of clinical trials, is presented. for many thousands of years plants have served humanity as a source of medicinal substances. 
however, only at the turn of the 21st century, with the appearance of dna technologies, did it become possible to modify plant genomes and to create new types of plants (transgenic plants) that are capable of synthesizing and accumulating in their tissues recombinant proteins from various heterologous systems. to date, transgenic plants have been created in which the nuclear and chloroplast genomes were transformed with genes encoding heterologous proteins that are important in the treatment of various diseases: antigens of infectious agents, antibodies, immunomodulators, etc. [1, 2]. of principal importance for the development of this work was the creation of the "edible vaccine" concept, the essence of which is the use of genetically modified plants containing protein antigens of infectious agents for oral delivery of the relevant antigens to the mucosa of the gastrointestinal tract of warm-blooded animals. vaccination, which is based on programming the specific mechanisms that protect warm-blooded animals against pathogens, is the most efficient method of combating infectious diseases, which often result in mass mortality. in agriculture, there is no alternative to livestock vaccination, because there are no antiviral drugs suitable for wide use in animal husbandry. animal vaccination also indirectly affects human health, because the use of vaccines significantly reduces the amount of pharmaceuticals in the food chain. as a rule, animal immune mechanisms are activated by the direct introduction of infectious agents or their components. at present, most vaccines in use are preparations based on inactivated agents. although these vaccines are highly immunogenic, they are not without serious shortcomings. among these disadvantages are the increased sensitivity of the organism to them, the large load on the immune system, the reactogenicity of the vaccines (side effects), their toxicity, etc.
the application of molecular biology and genetic engineering methods has opened wide prospects for manufacturing new-generation vaccines whose immunogenic components can be biological molecules or their fragments. dna fragments or proteins of the cell envelopes of the infectious agent can serve as immunogenic components. when the gene encoding the envelope protein of an infectious agent is transferred into the genome of another organism, for example a plant, the cells of that plant will synthesize a protein antigen capable of inducing resistance to this agent. thus, introducing into the organism not the whole pathogen but only a part of it, which is not capable of inducing infection, provides the effect of vaccination. vaccines account for one fourth of the total pharmaceutical market of drugs for veterinary use, and it is constantly expanding [3]. the preparation of medicinal substances for the production of veterinary products is based on various approaches, including biotechnology using genetically modified (transgenic) organisms; expression systems such as bacteria, yeast, and insect and mammalian cells are used for these purposes. the application of genetically modified plants with genes encoding pharmaceutical proteins inserted in the genome opens new prospects for obtaining recombinant proteins, including plant vaccines [4]. this review is devoted to the analysis of the possibilities of producing recombinant immunogenic proteins for veterinary use on the basis of plant expression systems, and to the history of the concept of "edible vaccines" for animal immunization. of plant vaccines the idea of using plant cells for the synthesis and accumulation of recombinant protein antigens was first successfully realized in 1992 by c. arntzen and his colleagues [5].
this team of researchers demonstrated not only that the surface antigen of hepatitis b virus (hbsag) can accumulate in transgenic tobacco plants but also that it is capable of self-assembling there into virus-like particles. the virus-like particles isolated from plant tissues were identical to the hbsag particles of the industrial recombinant vaccine obtained in the yeast expression system and also to virus-like particles from the blood plasma of patients infected with hepatitis b virus. thus, it became obvious that genetically modified plants producing and accumulating protein antigens of various infectious agents can be used for oral delivery of the corresponding antigens to the mucosa of the gastrointestinal tract of warm-blooded animals, i.e., as "edible vaccines." the next important step in the development of the "edible vaccine" concept on the basis of genetically modified plants was the creation of transgenic plants producing the heat-labile enterotoxin of escherichia coli [6, 7] and the b subunit of the cholera toxin [8]. the heat-labile toxin of e. coli consists of two parts: lt-a (enzyme) and lt-b (pentamer of receptor-binding polypeptides). lt-b binds to receptors on the surface of membranes of epithelial cells of the mammalian small intestine and transports lt-a into the intestinal cells, where it induces changes in cell metabolism and cell dehydration. when the two parts of the heat-labile enterotoxin are separated, the appearance of the lt-b protein complex on the surface of epitheliocytes stimulates a strong immune response of the intestinal mucosa without the appearance of any disease signs. this feature was the basis for the research of c. arntzen's team [6] on the creation of a plant vaccine providing resistance to the toxins of enterotoxigenic e. coli. the authors established that lt-b synthesized in transgenic tobacco and potato plants and lt-b isolated from e. coli induced similar immune responses when delivered orally to mice. later, the lt-b sequence was optimized for expression in plant cells and transferred into the potato genome [7]. in potato tubers, the protein assembled correctly into oligomers and accumulated in amounts sufficient to induce an immune response upon oral delivery to the organism. clinical trials of the "candidate" plant lt-b vaccine established that the consumption by volunteers of raw potato tubers containing 0.3-10 mg of lt-b resulted in the formation of serum and mucosal immune responses with high antibody titers [9]. the initial concept of the "edible vaccine" was heavily criticized by researchers who believed that the recombinant protein would be destroyed in the aggressive medium of the gastrointestinal tract. however, it was later established experimentally that the recombinant b subunit of the cholera toxin fused with green fluorescent protein (gfp), protected by the cellulose wall of the plant cell, is capable upon oral delivery of passing through the gastrointestinal tract and reaching the antigen-containing cells of the mouse intestine [10]. the results obtained confirmed the possibility of using plants synthesizing protein antigens of various infectious agents for oral delivery of antigens to the mucosa of the gastrointestinal tract and experimentally confirmed the validity of the "edible vaccine" concept. it became evident that genetically modified plants can be used as a raw material for the creation of plant vaccines, and that separate plant parts (fruits, roots, berries, leaves, etc.) can be used directly as food without preliminary heat treatment. response formation in the process of evolution, mammals developed secondary lymphoid mucosal tissue capable of absorbing antigens, processing them, and using them to induce the mucosal response. it was established that in this case both cellular and humoral immunity are formed.
it is important that adaptive mucosal immunity can distinguish usual food and symbiotic antigens from infectious agents [11]. the scheme of the mechanism of mucosal immune response formation is presented in fig. 1. in the digestive tract, which is one of the pathways by which a variety of pathogens penetrate the organism, the main associated lymphoid tissue is the peyer's patches. peyer's patches are inductive sites of the intestine that contain the dome, the underlying follicle (b zones with a germinal center), and the interfollicular region containing t cells. the surface of the dome is covered by a specialized follicle-associated epithelium containing microfold cells (m cells), which are able to absorb and transport antigens from the intestinal lumen. after successful capture, the antigen is partially cleaved and enters dendritic cells (see fig. 1, step 1). dendritic cells are especially important in the initiation of adaptive immune responses, since they migrate to the lymph nodes (see fig. 1, step 2), where their mediators act in the development of various subpopulations of t-helper cells from naive t lymphocytes; they can also interact with b lymphocytes (see fig. 1, step 5). the activated b and t cells leave the peyer's patches and penetrate into the circulatory system. mature t-helper cells then return to the mucosal surface to function as effectors (see fig. 1, step 3). t-helper 17 cells, which express interleukin 17 (il-17), increase the expression of the receptors of polymeric immunoglobulin (pig) and the secretion of antigen-specific iga (fig. 1, step 4). the subsequent generation of mature plasma cells producing iga leads to the induction of antigen-specific protection of local and distal mucosal surfaces.
since the mucosal immune response has a generalized nature, oral (i.e., mucosal) vaccination induces not only an immune response of the mucous membranes but also an overall immune response of the organism [11-14]. it has been shown that vaccination via the surface of the mucous membrane can specifically activate the immune response to infection without the development of disease-related processes such as inflammation or toxicity [14, 15]. of plant vaccines in comparison with traditional expression systems, plant systems are attractive to researchers in many ways, primarily due to the absence of any risk of plant cell infection with animal pathogens, viruses, prions, etc. plants are capable of synthesizing most recombinant antigens with the same posttranslational modifications as in animal cells [16]. plant vaccines can play an especially important role in the protection of animals against diarrheal diseases and diseases whose infectious agents penetrate the organism through the mucosal tissues. modern techniques of genetic engineering make it possible to selectively direct the recombinant proteins expressed in plant cells to various plant organs (seeds, tubers, fruits, etc.) [17]. this possibility greatly simplifies the large-scale production of plant vaccines and reduces their cost. according to experts, the final price of a product (recombinant protein) produced in a plant expression system will be much lower than the price of a similar protein produced, for example, in mammalian cell culture [18]. since recombinant proteins can be accumulated in storage organs or seeds, they can be maintained without any changes or loss of biological activity for a long time (months and years). it was established that the recombinant cholera toxin b subunit remained stable in transgenic rice grains for at least 18 months when the grains were stored at room temperature [19].
recombinant protein antigens remained stable in rice grains for three years and provided for the formation of protective immunity in mice against the cholera agent or against enterotoxigenic e. coli [20]; in soybean seeds and soy milk, antigen stability was preserved for four years [21]. thus, grains of transgenic plants can be transported to their final destination without additional freezing or treatment, and this ensures the retention of the recombinant protein's activity, its stability, and the constancy of dosage. a somewhat different picture is observed when lyophilization is used as a method of conserving protein antigens synthesized by plant cells. it was established that the recombinant protein of the norovirus envelope retained its immunogenicity in tissues of both lyophilized and air-dried tomato fruits [22], but the tested samples differed in immunogenicity. the immunogenicity of this recombinant protein in air-dried tomato fruits was somewhat higher than in lyophilized fruits. similar results were obtained in experiments on the lyophilization of potato tubers synthesizing the norovirus envelope protein: the immunogenicity of lyophilized tubers was lower than that of fresh tubers [22]. however, additional studies are required to settle this question. despite these advantages, plant vaccines are not without disadvantages. one of them is the difference between plants and animals in protein posttranslational modifications, e.g., in the glycosylation of the recombinant protein [23]. it is known that more than half of the proteins synthesized by eukaryotic cells are glycosylated and more than a third of currently applied biopharmaceuticals are glycoproteins [24]. although the activity of most proteins does not depend on glycosylation, in some cases it may be critical.
specific features of protein glycoforms can affect their folding, stability, transport, functional activity, and immunogenicity. examples of biopharmaceuticals whose functional activity depends on the specific glycoform are erythropoietin, antibodies, blood antigens, and some interferons and hormones [24]. a scheme of complex n-glycan formation in plants and animals (humans, for example) is shown in fig. 2. the most significant difference in glycosylation is that in plants β-1,2-xylose is attached to the core mannose residue and α-1,3-fucose to the n-acetylglucosamine residue of the core glycan. in human cells, xylose is not used at all in glycosylation, and the proximal fucose residue is attached to glycans through an α-1,6 bond. it was established that the sugar residues attached during posttranslational modification of recombinant proteins in plant cells are themselves capable of exhibiting immunogenicity. in approximately one quarter of patients with allergy symptoms, ige antibodies specific for complex glycans that include xylose or fucose were found [25]. differences in glycosylation during posttranslational modification of recombinant proteins in plant and mammalian cells can be eliminated by genetic engineering techniques. this was successfully demonstrated in the moss physcomitrella patens, in which the genes encoding the enzymes β-1,2-xylosyltransferase and α-1,3-fucosyltransferase, responsible for xylosylation and fucosylation of proteins, respectively, were switched off by the "knock down" method [25]. it is known that when a foreign gene is inserted into the genome of a transgenic plant, the level of its expression depends on the site of insertion, which determines the level of protein (antigen in particular) accumulation in plant tissues. in this connection, one of the disadvantages of plant vaccines is the difficulty of standardizing their dosage.
it was at this stage that the "edible vaccine" concept underwent substantial revision, since oral delivery of the recombinant antigen with raw plant material proved untenable because of the variability of the recombinant protein content. according to the researchers involved in the development of "candidate" plant vaccines, the problem of antigen dosage in plant tissues can be successfully solved by introducing additional treatment of the plant material: its refinement (to equalize the concentration of the antigen), drying, or lyophilization. it is also necessary to introduce an additional stage associated with the development of rapid methods for determining and monitoring the dosage of the recombinant antigen. after appropriate preparation, plant vaccines can be encapsulated, tableted, and used in practice under corresponding medical supervision [14, 26]. thus, the studies performed suggest that plant vaccines based on genetically modified plants are capable of inducing protective immunity and open new opportunities for the creation of low-cost and easy-to-handle vaccines against infectious diseases of animals. it is important that the highly effective methods developed to date for cultivating agro-economically important plant species, as well as the seed production system for a particular crop, make plants attractive as "biofactories" for manufacturing cheap recombinant proteins for medical purposes. roplasts). among plant expression systems with stable integration of the transgene into the nuclear genome, a separate group includes cultivated duckweed, microalgae, and in vitro cell culture systems (cell suspensions, cultures of "hairy roots", moss protonemas), which are cultivated in a completely closed environment (bioreactors).
the transient expression system, in which the target gene is introduced into plant cells and expressed for a short period of time (several days) without being integrated into the genome, is also promising. each of these expression systems has its advantages and disadvantages; their main features are considered below.
the most widely used system of heterologous gene expression, in particular for plant vaccine production, is transgenic plants with stable integration of the transgene into the nuclear genome. creating such plants involves transferring foreign genes into the genome of plant cells using agrobacterium tumefaciens or bioballistics and then regenerating transformed plants from these cells. once inserted into the nuclear genome, the transgene becomes a resident part of it, is stably expressed, and is maintained in subsequent generations.
the use of this expression system for producing recombinant proteins, including plant vaccines, is still hampered by relatively low levels of transgene expression, as a rule less than 1% of total soluble protein (tsp), and by the variability of expression among organs and tissues of a single plant and among different plants. most often, this variability is due to the random nature of transgene insertion into the nuclear genome (the position effect) and may be associated with partial or complete transgene inactivation [27, 28]. experts believe that plant expression systems are economically beneficial for producing plant vaccines when the expression level of the target gene allows the recombinant protein to accumulate to at least 1% of tsp [29]. many ways to increase the level of foreign gene expression in transgenic plants have been developed; they are discussed in detail in reviews [17, 30, 31].
among them are optimization of the codon composition of the target sequence, the use of strong promoters, the addition of introns or of regions that bind the nuclear matrix (sar), and others. the search for tissue- and organ-specific promoters, i.e., promoters that drive target gene expression in particular plant tissues or organs, is of special interest. for example, promoters that direct transcription of the target gene predominantly in seeds can increase the yield of the target protein several-fold or even by an order of magnitude: up to 13-14% of tsp in rice grains and up to 25% of tsp in tobacco seeds [17]. importantly, seeds contain fewer proteases and much less water than leaves; therefore, the recombinant protein is better preserved in them.
although transgenic plants with stable transgene expression from the nuclear genome are the most widely used, it is precisely this expression system that society treats with caution. one concern is the possible transfer of transgenes from cross-pollinated plants into the genomes of wild relatives when biotechnological crops are grown in the open field. to address this problem, researchers have developed various agricultural techniques, as well as male sterility in transgenic plants, to prevent unwanted cross-pollination between cultivated and wild species. examples of immunogenic proteins produced in transgenic plants with stable transgene integration into the nuclear genome are presented in table 1.
chloroplasts are the most attractive among plant expression systems. genetically modified plants with stable transgene integration into the chloroplast genome are called transplastomic plants. the specific organization of chloroplasts allows a high dosage of the foreign gene to be achieved in transplastomic plants, which provides for efficient production of the target protein.
the record recombinant protein yield was achieved in transplastomic tobacco plants carrying the bacteriophage lysin gene plygbs, which encodes a hydrolase of the bacterial cell wall; its accumulation in the leaves reached 70% of tsp [32]. although negative physiological changes were observed in some plants with such a high level of foreign gene expression, their development was usually normal. the problems of adaptation of transplastomic plants to high levels of foreign gene expression are discussed in the review by bally et al. [33].
transgenes are delivered to chloroplasts by bioballistics (gene gun) and integrate into the chloroplast genome via homologous recombination [34]. the advantage of the chloroplast expression system is the absence of the position effect observed when transgenes are inserted at random in the nuclear genome. in plastids, there is no transgene silencing (inactivation); therefore, transgene expression is stably maintained in subsequent generations. owing to the prokaryotic organization of the chloroplast genome, several genes can be co-expressed within a single operon [35]. an important feature of plastids is that they are inherited maternally and are usually absent from pollen. therefore, unlike conventional transgenic plants, transplastomic plants are safe for the environment because the uncontrolled spread of the transgene to other plants is prevented [36, 37].
it should be noted that posttranslational modifications of proteins, e.g., assembly of multimeric proteins, formation of disulfide bridges, and lipid modifications, occur successfully in chloroplasts. it has been established that recombinant proteins synthesized in chloroplasts do not differ from native ones in their functional activity [38, 39]. chloroplasts are also attractive as a recombinant protein expression system because they are closed compartments that retain products which would be toxic to plant cells if released into the cytosol, such as the b subunit of cholera toxin [38] and trehalose in tobacco cells [40].
on the basis of the works cited above, transplastomic plants can be regarded as the most promising system for efficient production of protein antigens. however, the main disadvantage of this expression system for plant vaccine manufacturing is that chloroplasts cannot glycosylate proteins. at present, the creation of transplastomic plants also faces technological problems: most plant species lack an efficient regeneration system, and there is no efficient system of transgene delivery to the chloroplast genome that gives a high percentage of transformed plants.
table note: tmv, tobacco mosaic virus; camv, cauliflower mosaic virus; tsp, total soluble protein; (+), an immune response was detected after oral immunization; (++), an immune response was detected after oral immunization and the animals did not die after virus infection. the species of immunized animals are indicated; the route of antigen delivery is indicated where it was not oral; the number of surviving infected animals is indicated where survival was less than 100%.
suspension cell cultures derived from genetically modified plants attract researchers' attention as promising systems for biopharmaceutical production. such cultures can be obtained from loose callus tissue induced from genetically modified explants or by co-culturing a cell suspension with a. tumefaciens.
after growth characteristics have been assessed and cell lines screened to identify promising lines capable of accumulating large amounts of recombinant protein, such lines can be cultivated in bioreactors to obtain the target protein.
an attractive feature of cell cultures as expression systems for recombinant protein production, compared with whole plants, is that a cell culture can be made uniform in growth characteristics, cell size, and cell type. moreover, the cells are grown under strictly controlled conditions, so product accumulation does not depend on seasonal weather changes and the product can be obtained continuously in bioreactors. inserting nucleotide sequences encoding a signal peptide into the construct allows the protein to be secreted into the intercellular space, so that the target protein can be isolated directly from the culture medium. adding recombinant protein stabilizers to the suspension culture increases the yield of the target protein [41].
in their capabilities for producing therapeutic recombinant proteins, plant cell cultures are comparable with the conventionally used mammalian cell cultures, such as chinese hamster ovary cells. however, unlike mammalian suspension cultures, they are not infected by animal pathogens. to date, there are many examples of plant cell cultures with a recombinant protein yield of more than 10 mg/l, which is the threshold value for starting commercial manufacture of a product [42].
an example of commercially successful production of veterinary vaccine products is the concert™ system developed by dow agrosciences (united states) and patented as an effective and safe system for producing vaccine proteins in plant cells cultured in a bioreactor.
the first plant vaccine, against newcastle disease virus of birds and obtained in a tobacco cell culture, was approved for use by the united states department of agriculture (usda) in 2006. the disadvantages of this expression system are the still insufficiently high yield of the target recombinant protein and the instability of foreign genes in cultured plant cells due to epigenetic silencing of transgene transcription [43]. the advantages, disadvantages, and specific features of recombinant protein production in plant cell cultures are discussed in reviews [42, 44].
there are many examples of the successful use of aquatic plants such as duckweed, of unicellular algae, and of mosses, which are cultivated in bioreactors in the same way as plant cell suspensions, for recombinant protein expression.
duckweed attracts the attention of researchers as a potentially highly efficient system for recombinant protein expression because of its rapid biomass accumulation: its biomass can double in 24-48 h. genetically modified duckweed producing biopharmaceutical proteins can be fed to animals raw or dried. duckweed is a monocotyledonous angiosperm; agrobacterial transformation and bioballistics are used to transfer foreign genes into its genome. there are examples of genetically modified duckweed accumulating recombinant protein at up to 25% of tsp, as assessed by gfp accumulation [45]. sequencing of the duckweed chloroplast genome is nearing completion, which opens up prospects for a significant increase in the yield of recombinant proteins.
systems for biopharmaceutical production in genetically modified microalgae are being actively developed. algae combine the advantages of bacteria (rapid growth and simplicity of cultivation) and of higher plants (the capacity for posttranslational modification and photosynthesis).
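to give a feel for what the 24-48 h doubling time quoted above for duckweed means in practice, the following python sketch computes the fold increase in biomass under idealized, unrestricted exponential growth. the one-week cultivation window and the function name are assumptions chosen for illustration; real cultures are limited by light, nutrients, and surface area, so these figures are upper bounds rather than expected yields.

```python
def fold_increase(elapsed_h: float, doubling_time_h: float) -> float:
    """Fold increase in biomass after elapsed_h hours of ideal exponential growth."""
    if doubling_time_h <= 0:
        raise ValueError("doubling time must be positive")
    return 2.0 ** (elapsed_h / doubling_time_h)


week_h = 7 * 24  # hypothetical one-week cultivation window (assumption)

# The two ends of the doubling-time range given in the text for duckweed.
for doubling_time_h in (24, 48):
    growth = fold_increase(week_h, doubling_time_h)
    print(f"doubling every {doubling_time_h} h: {growth:.1f}-fold in one week")
# doubling every 24 h: 128.0-fold in one week
# doubling every 48 h: 11.3-fold in one week
```

the spread between the two results shows how strongly the achievable biomass depends on where in the 24-48 h range a given culture actually falls.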
chlamydomonas reinhardtii is the most promising among algae: it has a short biomass doubling time (about 10 h), it is readily amenable to nuclear and chloroplast transformation, and it can be grown under photoautotrophic conditions or with acetate added as a carbon source. nuclear transformants usually give a rather low yield of protein product; therefore, recombinant protein production in this alga is based on transformation of the chloroplast, which occupies about 40% of the cell volume [46]. the nuclear and chloroplast genomes of c. reinhardtii have been sequenced, which substantially simplifies genetic engineering manipulations. known examples of protein antigens produced in c. reinhardtii chloroplasts are the b subunit of cholera toxin fused with the coat protein of foot-and-mouth disease virus [47] or with the d2 fibronectin-binding domain of staphylococcus aureus [48], as well as protein 28 of the cryptokaryosis virus (a shrimp disease) [49] and the e2 protein of classical swine fever virus [50].
the green moss physcomitrella patens is the only bryophyte whose genome has been completely sequenced to date and for which transformation approaches have been developed. a peculiarity of this moss is that at the haploid juvenile gametophyte stage (protonema) it is morphologically similar to filamentous algae and is fairly easy to cultivate in a bioreactor [25, 42]. under certain culture conditions, the moss can remain at the protonema stage indefinitely. this moss species is attractive as a system for recombinant protein expression because, unlike in higher plants, fragments of foreign dna can be integrated into its genome by homologous recombination, which substantially reduces the likelihood of inactivation of the transferred gene. p. patens cells are capable of posttranslational modification of proteins of eukaryotic origin.
since the moss is haploid at the gametophyte stage, the functioning of individual genes can be modified; in particular, moss lines that glycosylate recombinant proteins in a mammalian-like manner have been obtained [25]. the company greenovation (germany) is developing a technology for producing biopharmaceutical proteins in p. patens grown in bioreactors. the moss-based expression system for biopharmaceuticals is not part of the food chain and is characterized by a high degree of biosafety.
in contrast to the systems described above, which rely on stable expression of foreign genes integrated into the nuclear or chloroplast genome, in transient expression the target proteins are synthesized in the plant cell for a relatively short time (several days) without insertion of the gene into the plant genome. at present, the following approaches are used for transient gene expression in plants: gene delivery by agrobacterium, plant virus vectors, and magnifection [51-53]. tobacco mosaic virus (tmv), potato virus x, alfalfa mosaic virus, and cowpea mosaic virus are used as virus vectors [52]. the availability of infectious cdna clones, the small size of the viral genomes, the short time required to express a target gene, and the high level of expression make this system highly attractive.
the rapid development of this expression system led to substantial modifications of the first vectors, the gene insertion (full virus) vectors, which are recombinant viruses that behave like the wild-type virus but can express additional genes. the next step was the creation of "disarmed" (deconstructed) vectors, which lack a number of the original viral genes, and gene replacement vectors, in which some of the viral genes are replaced by foreign genes [51].
viral vectors have several substantial disadvantages: a tendency to lose the foreign insert as the virus spreads through the plant and a potential environmental risk associated with the presence of infectious recombinant viral particles.
launch vectors are an alternative to recombinant plant viruses; the cdna of the virus is delivered to plants within the t-dna region of an agrobacterial ti plasmid. first, primary transcription of the t-dna occurs in the nucleus; the viral rna is then released into the cytoplasm, where it is amplified and translated [52].
by 2005, the agroinfiltration system based on plant viruses and agrobacterial binary plasmids had been upgraded and named magnifection [54]. in magnifection, several agrobacterial strains carrying different parts of the tmv genome are used simultaneously. after agrobacterial transfer into the plant cell nucleus, the separate parts of the viral genome are assembled in the plant into a fully functional viral replicon [51]. substantial modification of the viral genome, including numerous point mutations that remove potential splicing sites, insertion of introns, and removal of the coat protein gene, yielded a highly efficient system capable of synthesizing recombinant protein at up to 5 g/kg of fresh tissue, which is more than 50% of tsp [51].
one disadvantage of the transient system is that the recombinant protein must be isolated and purified immediately after it accumulates in the plant because, unlike seeds and fruits, plant leaves and stems cannot be stored for long. transient expression systems are quite promising when a small amount of protein must be produced rapidly. experiments with transient expression in plants are conducted indoors, which reduces biosafety risks almost to zero.
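the magnifection yield quoted above (up to 5 g/kg of fresh tissue, more than 50% of tsp) can be put next to the 1% of tsp level that the text gives as the threshold of economic benefit. the python sketch below does this arithmetic; it assumes, purely for illustration, that the tsp content of the tissue is the same in both cases, and the function names are hypothetical.

```python
def implied_tsp_upper_bound(yield_g_per_kg: float, min_fraction_of_tsp: float) -> float:
    """If a yield is at least min_fraction_of_tsp of TSP, TSP is at most yield / fraction."""
    if not 0 < min_fraction_of_tsp <= 1:
        raise ValueError("fraction of TSP must be in (0, 1]")
    return yield_g_per_kg / min_fraction_of_tsp


def yield_at_fraction(tsp_g_per_kg: float, fraction_of_tsp: float) -> float:
    """Recombinant protein yield (g/kg fresh tissue) at a given fraction of TSP."""
    return tsp_g_per_kg * fraction_of_tsp


magnifection_yield = 5.0  # g/kg fresh tissue, figure from the text
tsp_max = implied_tsp_upper_bound(magnifection_yield, 0.5)  # "more than 50% of tsp"
threshold_yield = yield_at_fraction(tsp_max, 0.01)          # 1% of tsp threshold

print(f"implied tsp: at most {tsp_max:.1f} g/kg fresh tissue")
print(f"1% of tsp at that level: {threshold_yield:.2f} g/kg fresh tissue")
print(f"magnifection yield / threshold: {magnifection_yield / threshold_yield:.0f}x")
```

under this same-tissue assumption, the reported transient yield is roughly fifty times the 1% of tsp level, which is consistent with the text's description of magnifection as a highly efficient system.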
examples of immunogenic proteins for veterinary use produced with transient expression systems are presented in table 1. transient expression systems for the production of recombinant proteins are described in more detail in reviews [1, 41, 51, 52, 54].
"candidate" plant vaccines for veterinary use
table 1 presents examples of the use of various expression systems to produce "candidate" plant vaccines for veterinary use. the main specific immunogenic proteins synthesized in the respective diseases (structural proteins, hemagglutinins, glycoproteins) are usually used as antigens. agrobacterial transformation is still the most commonly used method of delivering transgene constructs into plant cells. in some cases, the level of target protein expression was rather high [55-58], especially in transient expression systems [59-61], and suitable for product commercialization.
in all experiments with "candidate" plant vaccines, the formation of a specific immune response was demonstrated in vivo, and in most experiments the immunogenic proteins were delivered to the animals orally. "candidate" plant vaccines were usually tested on mice, but in approximately a quarter of the studies they were tested on the animals susceptible to the disease. the s protein of porcine transmissible gastroenteritis virus synthesized in transgenic maize [55, 62, 63] or tobacco [64] and given to pigs as a food supplement ensured 100% survival of the animals after infection [62]. the vp60 protein of rabbit hemorrhagic disease virus synthesized in potato [65] and other plants (tobacco, pea, rape) [66] and delivered orally conferred protective immunity: all rabbits survived infection with this virus [66]. the effect of the plant recombinant antigen was comparable with that of commercially used vaccines.
the use of plant vaccines to vaccinate wild animals with edible baits (e.g., a vaccine against rabies) will increase the proportion of wild animal populations that are immune to the rabies virus.
at the end of the 20th century, the potential to reduce the cost of biopharmaceutical production by using genetically modified plants prompted more than twenty biotechnology companies to initiate commercial programs. as table 1 shows, many biological products of plant origin have been developed and expressed in different plant species and plant cell cultures. for a variety of reasons, including society's still skeptical attitude toward the biosafety of genetically modified plants, many of these projects have not progressed beyond laboratory tests. at the moment, three companies operate on the biotechnology market for veterinary preparations, two from the united states and one from canada (table 2). in the united states, dow agrosciences has presented a plant-produced recombinant hn protein of newcastle disease virus (approved by the usda) and a mixture of antiviral vaccines at the first stage of clinical trials. the second american company, at thomas jefferson university, has developed a plant anti-rabies vaccine (phase 1 completed). the canadian company guardian biosciences has presented a plant vaccine against chicken coccidiosis at the second phase of clinical trials.
the production and wide distribution of biopharmaceuticals are hampered by a number of circumstances. the first is related to biosafety: the cultivation of genetically modified plants in the field can lead to the accidental introduction of foreign genes into crops grown for human consumption.
therefore, most companies producing biopharmaceuticals have focused on plant species that are absent from the human and animal food chains and on growing genetically modified plants under conditions that prevent their cross-pollination with other crops. the second difficulty is the need to treat the plant material to remove various undesired compounds, such as lignin, proteases, phenolic compounds, and pigments, especially in plant species that are not consumed as food. all this requires additional studies. the third circumstance is that not all aspects of maintaining and growing plants that produce biopharmaceuticals have yet been settled at the legislative level. because of the ambiguity and vagueness of the existing legislation in this area, large biopharmaceutical and biotechnology companies are reluctant to invest in the development of technological lines and research programs, which significantly slows the development of the industry.
a significant problem in developing vaccines for veterinary use, especially those used in agriculture, is the need to minimize the price of the final product. the vaccine should be inexpensive for entrepreneurs engaged in commercial animal breeding and fully subsidized if it is intended for mass immunization and for preventing the spread of disease in underdeveloped regions. as a result, the potential income of manufacturers of animal vaccines is much lower than that of manufacturers of human vaccines. for example, in 2007 the market for the vaccine against human papillomavirus was estimated at more than 1 billion dollars, whereas the markets for the most popular animal vaccines (against foot-and-mouth disease of cattle and against mycoplasma hyopneumoniae in pigs) together amounted to only 10-20% of this amount [3].
animal vaccines are cheaper and their market is smaller; therefore, investment in their development is substantially lower than investment in the production of human vaccines, although the complexity and diversity of both hosts and pathogens are much greater for animal vaccines.
expression systems based on plant cells still have limited application or are used primarily in a few laboratories. nevertheless, biopharming (the biotechnological production of various substances for medicine) in plants has attracted the attention of researchers and manufacturers in developed countries. the first biopharmaceuticals of plant origin, such as antibodies (anti-hbsag, required to purify the hepatitis b vaccine) and therapeutic and dietary proteins ("intrinsic factor", required in vitamin b12 deficiency, and gastric lipase), have entered the market, which is an excellent illustration of this progress [67].
although the number of biopharmaceutical proteins expressed in plant cells is now enormous, many questions remain unresolved. methods for quantifying and purifying the target recombinant protein have not yet been developed for most products. the problems of transgene silencing and of increasing the expression of target protein genes are still at the research stage. an important task that has yet to be solved is achieving a stable level of expression across different batches of plant raw material. few studies have assessed the stability of recombinant proteins after harvest, processing, and storage. all of these problems require additional expenditure on further research.
given the financial constraints, one of the main obstacles for the leading research groups working on the development and production of plant vaccines is compliance with the official regulations governing the use of oral medications.
to date, purified vaccines and therapeutic proteins of plant origin must meet the same standards for production, biosafety, purification, storage, dosage, etc. as any other recombinant proteins for medical purposes. nevertheless, despite these difficulties, the first biopharmaceuticals of plant origin have passed all the necessary tests and been approved for use by the relevant authorities. some new products with specific advantages over similar products obtained in mammalian cell cultures have been developed. companies such as sembiosys genetics inc. (calgary, canada), medicago inc. (quebec, canada), protalix biotherapeutics (karmiel, israel), and orf genetics (iceland) have shown that the production of purified plant proteins that are quite competitive in today's market can be established quickly. progress has been made in forming the legal framework for the cultivation of transgenic plants and the testing and use of plant biopharmaceuticals. several production processes based on transgenic plants have already received gmp (good manufacturing practice) certification, and manufacturers' interest in this field of biotechnology has begun to grow again.
this work was performed within the framework of project vi.62.1.5 (no. 01201280334), "development and improvement of genetic constructs to optimize the expression of target genes and the production of recombinant proteins for medical purposes in transgenic plants and animals".
plant produced vaccines: promise and reality
evolution of plant made pharmaceuticals
current status of veterinary vaccines
clinical trials fuel the promise of plant derived vaccines
expression of hepatitis b surface antigen in transgenic plants
oral immunization with a recombinant bacterial antigen produced in transgenic plants
edible vaccine protects mice against escherichia coli heat labile enterotoxin (lt): potatoes expressing a synthetic lt b gene
efficacy of food plant based oral cholera toxin b subunit vaccine
immunogenicity in humans of a recombinant bacterial antigen delivered in a transgenic potato
receptor mediated oral delivery of a bioencapsulated green fluorescent protein expressed in transgenic chloroplasts into the mouse circulatory system
delivery of plant made vaccines and therapeutics
defending the mucosa: adjuvant and carrier formulations for mucosal immunity
induction of secretory immunity and memory at mucosal surfaces, vaccine
oral delivery of human biopharmaceuticals, autoantigens and vaccine antigens bioencapsulated in plant cells
the mucosal immune response to plant derived vaccines
posttranslational modification of therapeutic proteins in plants
seed based expression systems for plant molecular farming
the economic potential of plant made pharmaceuticals in the manufacture of biologic pharmaceuticals
rice based mucosal vaccine as a global strategy for cold chain and needle free vaccination
secretory iga mediated protection against v. cholerae and heat labile enterotoxin producing enterotoxigenic escherichia coli by rice based vaccine
stability of a soybean seed derived vaccine antigen following long term storage, processing and transport in the absence of a cold chain
tomato is a highly effective vehicle for expression and oral immunization with norwalk virus capsid protein
production of plant made pharmaceuticals: from plant host to functional protein
post translational modifications in the context of therapeutic proteins
current achievements in the production of complex biopharmaceuticals with moss bioreactors
low dose oral immunization with lyophilized tissue of herbicide resistant lettuce expressing hepatitis b surface antigen for prototype plant derived vaccine tablet formulation
rna mediated chromatin based silencing in plants
transcriptional gene silencing in plants
plant based production of biopharmaceuticals
production of heterologous proteins in plants: strategies for optimal expression
expression of heterologous genes in plant systems: new possibilities
exhaustion of the chloroplast protein synthesis capacity by massive expression of a highly stable protein antibiotic
metabolic adaptation in transplastomic plants massively accumulating recombinant proteins
transplastomic plants
overexpression of the bt cry2aa2 operon in chloroplasts leads to formation of insecticidal crystals
chloroplast vector systems for biotechnology applications
determining the transgene containment level provided by chloroplast transformation
expression of the native cholera toxin b subunit gene and assembly as functional oligomers in transgenic tobacco chloroplasts
chloroplast expression of his tagged gus fusions: a general strategy to overproduce and purify foreign proteins using transplastomic plants as bioreactors
accumulation of trehalose within transgenic chloroplasts confers drought tolerance
plants as bioreactors for the production of vaccine antigens
towards high yield production of pharmaceutical proteins with plant cell suspension cultures
position effects and epigenetic silencing of plant transgenes
bioreactor systems for in vitro production of foreign proteins using plant cell cultures
high expression of transgene protein in spirodela
micro algae come of age as a platform for recombinant protein production
foot and mouth disease virus vp1 protein fused with cholera toxin b subunit expressed in chlamydomonas reinhardtii chloroplast
heat stable oral alga based vaccine protects mice from staphylococcus aureus infection
factors effecting expression of vaccines in microalgae
recombination and expression of classical swine fever virus (csfv) structural protein e2 gene in chlamydomonas reinhardtii chroloplasts
viral vectors for the expression of proteins in plants
agrobacterium mediated transient expression as an approach to production of recombinant proteins in plants
a novel two component tobacco mosaic virus based vector system for high level expression of multiple therapeutic proteins including a human monoclonal antibody in plants
magnifection - a new platform for expressing recombinant vaccines in plants
plant based vaccines: unique advantages
expression of the newcastle disease virus fusion protein in transgenic maize and immunological studies
multimerization of peptide antigens for production of stable immunogens in transgenic plants
generation and immunogenicity of japanese encephalitis virus envelope protein expressed in transgenic rice
induction of protective immunity in swine by recombinant bamboo mosaic virus expressing foot and mouth disease virus epitopes
in planta production of two peptides of the classical swine fever virus (csfv) e2 glycoprotein fused to the coat protein of potato virus x
expression in plants and immunogenicity of plant virus based experimental rabies vaccine
delivery of subunit vaccines in maize seed
a corn based delivery system for animal vaccines: an oral transmissible gastroenteritis virus vaccine boosts lactogenic immunity in swine
immunogenicity of porcine transmissible gastroenteritis virus spike protein expressed in plants
oral immunization using tuber extracts from transgenic potato plants expressing rabbit hemorrhagic disease virus capsid protein
pea derived vaccines demonstrate high immunogenicity and protection in rabbits against rabbit haemorrhagic disease virus
plant made pharmaceuticals: leading products and production platforms
production of immunogenic vp6 protein of bovine group a rotavirus in transgenic potato plants
rotavirus vp6 expressed by pvx vectors in nicotiana benthamiana coats pvx rods and also assembles into viruslike particles
expression of rotavirus capsid protein vp6 in transgenic potato and its oral immunogenicity in mice
protective lactogenic immunity conferred by an edible peptide vaccine to bovine rotavirus produced in transgenic plants
bovine herpes virus gd protein produced in plants using a recombinant tobacco mosaic virus (tmv) vector possesses authentic antigenicity
induction of a protective antibody response to foot and mouth disease virus in mice following oral or parenteral immunization with alfalfa transgenic plants expressing the viral structural protein vp1
induction of a protective antibody response to fmdv in mice following oral immunization with transgenic stylosanthes spp. as a feedstuff additive
expression of hemagglutinin protein of rinderpest virus in transgenic tobacco and immunogenicity of plant derived protein in a mouse model
systemic and oral immunogenicity of hemagglutinin protein of rinderpest virus expressed by transgenic peanut plants in a mouse model
expression of hemagglutinin protein of rinderpest virus in transgenic pigeon pea
oral immunogenicity of the plant derived spike protein from swine transmissible gastroenteritis coronavirus
cloning and sequence analysis of the korean strain of spike gene of porcine epidemic diarrhea virus and expression of its neutralizing epitope in plants
successful oral prime immunization with vp60 from rabbit haemorrhagic disease virus produced in transgenic plants using different fusion strategies
mucosal and systemic immunization elicited by newcastle disease virus (ndv) transgenic plants as antigens
expression of the fusion glycoprotein of newcastle disease virus in transgenic rice and its immunogenicity in mice
expression of immunogenic s1 glycoprotein of infectious bronchitis virus in transgenic potatoes
transient expression of the ectodomain of matrix protein 2 (m2e) of avian russian
immunization with plant expressed hemagglutinin protects chickens from lethal highly pathogenic avian influenza virus h5n1 challenge infection
immunogenicity study of plant made oral subunit vaccine against porcine reproductive and respiratory syndrome virus (prrsv)
expression of the rabies virus glycoprotein in transgenic tomatoes
immunization against rabies with plant derived antigen
development of an edible rabies vaccine in maize using the vnukovo strain
induction of a protective immune response to rabies virus in sheep after oral immunization with transgenic maize, expressing the rabies virus glycoprotein
expression of the rabies virus nucleoprotein in plants at high levels and evaluation of immune responses in mice
expression of rabies virus g protein in carrots (daucus carota)
key:
cord-023770-ymxapsv6 authors: nan title: closteroviridae date: 2011-11-23 journal: virus taxonomy doi: 10.1016/b978-0-12-384684-6.00085-9 sha: doc_id: 23770 cord_uid: ymxapsv6 this chapter focuses on the family closteroviridae, whose member genera are closterovirus, ampelovirus, and crinivirus. the virions are helically constructed filaments with a pitch of the primary helix in the range of 3.4–3.8 nm, containing about 10 protein subunits per turn of the helix and showing a central hole of 3–4 nm. the very flexuous and open structure of the particles is the most conspicuous trait of members of the family. the virions have a diameter of about 12 nm and their length ranges from 650 nm in the case of species with a fragmented genome to over 2000 nm in the case of species with a monopartite genome. the fragility of virions and a tendency to end-to-end aggregation contribute to the fact that a range of lengths is often given for single viruses. the virions of several species are degraded by cscl and are unstable in high salt concentrations, resist moderately high temperatures and organic solvents, but are sensitive to rnase and chelation. regardless of the genome type, monopartite or fragmented, virions contain a single molecule of linear, positive sense, single stranded rna, constituting 5–6% of the particle weight. the structural proteins of most members of the family consist of a major cp and of a diverged copy of it denoted minor cp (cpm), with a size ranging from 22 to 46 kda (cp) and 23 to 80 kda (cpm). the members of the family have one of the largest genomes among plant viruses because of sequence duplication and acquisition of nonviral coding sequences, such as protease and hsp70 protein, via rna recombination. recombination may also explain differences in genome organization between genera and members of the same genus. genome organization, i.e. the number and relative position of the orfs, differs between the genera and/or individual viral species.
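the helical parameters quoted above (pitch of 3.4–3.8 nm, about 10 subunits per turn, particle lengths of 650 to over 2000 nm) allow a rough estimate of how many protein subunits a particle contains. the sketch below is a back-of-the-envelope calculation, not a measurement: it assumes that the whole particle length is built from turns of constant pitch and does not distinguish cp from cpm. the function name and its parameters are illustrative.

```python
def estimate_subunits(length_nm, pitch_nm, subunits_per_turn=10.0):
    """Rough number of protein subunits in a helical filament.

    Assumes a uniform helix: turns = length / pitch, and a fixed
    number of subunits in every turn.
    """
    if length_nm <= 0 or pitch_nm <= 0:
        raise ValueError("length and pitch must be positive")
    turns = length_nm / pitch_nm
    return turns * subunits_per_turn


if __name__ == "__main__":
    # Length and pitch extremes taken from the text above.
    for length_nm in (650, 2000):
        for pitch_nm in (3.4, 3.8):
            n = estimate_subunits(length_nm, pitch_nm)
            print(f"{length_nm} nm, pitch {pitch_nm} nm: ~{n:.0f} subunits")
```

under these assumptions the estimate scales linearly with particle length, so the spread in reported lengths translates directly into a spread in subunit counts.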
however, the complex orf1a-orf1b invariably encodes the replication-related proteins, with methyltransferase (mtr), helicase (hel), and rna-dependent rna polymerase (rdrp) conserved domains. downstream orfs, which encode in the 5′→3′ direction a 6 kda small hydrophobic protein, the hsp70h, the ~60 kda protein, the cp and cpm, form a five-gene module which is conserved, with few modifications, among most members of the family analysed so far. the hsp70h and the ~60 kda proteins are integral virion components present in all the sequenced members of the family. the functions postulated for hsp70h are: mediation of cell-to-cell movement through plasmodesmata, involvement in the assembly of multisubunit complexes for genome replication and/or subgenomic rna synthesis, and assembly of virus particles. the ~60 kda protein is required for incorporation of both hsp70h and cpm into virion heads. the duplication of the capsid protein gene seems to be the only example of such a condition among viruses with elongated particles. in general, capsid proteins and their homologs (cpm) show a significant degree of sequence conservation and the duplicate copies probably retain the general spatial folding and some crucial properties of the cps. notable exceptions are a group of ampeloviruses with the smallest genomes in the family [e.g. grapevine leafroll-associated virus 4 (glrav-4), glrav-5, glrav-6, glrav-9, pineapple mealybug wilt-associated virus 1 (pmwav-1) and pmwav-3], which do not appear to possess cpm. the genome expression strategy is based on: (i) proteolytic processing of the polyprotein encoded by orf1a; (ii) a +1 ribosomal frameshift for the expression of the rdrp domain encoded by orf1b, a mechanism not found in other (+)rna plant viruses; (iii) expression of the downstream orfs via the formation of a nested set of 3′ co-terminal sub-genomic rnas (sgrnas).
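point (ii) of the expression strategy, the +1 ribosomal frameshift, can be illustrated with a toy example. the sketch below is not a model of any closterovirus sequence: the rna string, the slip position and the function name are all made up for illustration. it reads codons in frame until it reaches a hypothetical slip site, skips one nucleotide, and carries on in the new frame, which lets reading continue past a stop codon that would otherwise end translation.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def read_codons(rna, slip_at=None):
    """Return the codons read from position 0 up to the first stop codon.

    If slip_at is given, the reader skips one nucleotide on reaching that
    position (a +1 shift) and continues in the new reading frame.
    """
    codons = []
    i = 0
    while i + 3 <= len(rna):
        if slip_at is not None and i == slip_at:
            i += 1          # +1 frameshift: move one nucleotide downstream
            slip_at = None  # the slip happens only once
            continue
        codon = rna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
        i += 3
    return codons


# Hypothetical 19-nt sequence; read in frame, it stops at UAG after two codons.
TOY_RNA = "AUGGCUUAGCCAAGGUUAA"

if __name__ == "__main__":
    print(read_codons(TOY_RNA))             # in-frame reading only
    print(read_codons(TOY_RNA, slip_at=6))  # reading with a +1 slip at index 6
```

in this toy case the in-frame reading yields two codons, whereas the shifted reading yields five before meeting a stop codon in the new frame, which is the general idea behind expressing a downstream domain as an extension of the upstream product.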
the dsrna patterns are very complex and variable among species, reflecting the different numbers and sizes of the orfs present in individual genomes and, in some cases, the existence of defective rnas. replication occurs in the cytoplasm, possibly in association with endoplasmic reticulum-derived membranous vesicles and vesiculated mitochondria. from an evolutionary point of view, closteroviruses represent a monophyletic virus lineage that might have evolved from a smaller filamentous virus when higher plants differentiated. this progenitor virus, thought to be composed of three genes encoding replication-associated proteins, a protein (p6) with affinity for cell membranes, and a single coat protein, acquired the hsp70h and a ~60 kda protein derived from a fusion of two domains: an n-terminal domain of unknown evolutionary provenance and a duplicated capsid protein domain. under the pressure of further modular evolutionary events, i.e. duplication of the coat protein gene, acquisition of diverse suppressors of rna silencing and of additional genes acquired via horizontal gene transfer (e.g. papain-like cysteine proteinase and alkb domains), this family ancestor gave rise to the progenitors of the three extant genera of the family. one of these genera (crinivirus) differentiated further by splitting its genome into two or three components. virion proteins are moderately antigenic. most virus species within genera are serologically unrelated or distantly related to one another. no intergeneric serological relationship has been detected. the natural and experimental host ranges of individual virus species are usually restricted, except for a few members of the genus crinivirus. disease symptoms are of the yellowing type (i.e. stunting, rolling, yellowing or reddening of the leaves, small and late ripening fruits), or pitting and/or grooving of the woody cylinder of woody hosts.
infection is systemic, but usually limited to … transmission is semi-persistent regardless of the type of vector. geographical distribution varies from restricted to widespread, depending on the virus species, most of which occur in temperate or subtropical regions. virions are usually found in the phloem (sieve tubes, companion cells, phloem parenchyma), occasionally in the mesophyll and epidermis. ultrastructural modifications arise by membrane proliferation, degeneration and vesiculation of mitochondria, and formation of inclusion bodies. these are made up of aggregates of virions or membranous vesicles, or a combination of the two. virions accumulate in conspicuous cross-banded fibrous masses or, more typically, in more or less loose bundles intermingled with single or clustered membranous vesicles. inclusions of this type are one of the hallmarks of the family. the vesicles contain a fibrillar network and derive either from the endoplasmic reticulum, or from peripheral vesiculation of mitochondria. traits that largely characterize the family and that are the basis of the current classification include natural transmission by aphids, mealybugs or whiteflies in a semi-persistent manner, with experimental transmission by mechanical inoculation very difficult or not possible (see table 1). genus closterovirus. the genus comprises species with particle length above 1200 nm and a monopartite rna genome, 14.5-19.3 kb in size, in which cpm is located upstream of the cp gene. natural transmission is by aphids. particle morphology largely conforms to that of other members of the family. virions are of one size, ranging from 1350 to 2000 nm in length. ctv also has smaller-than-full-length particles that may encapsidate subgenomic or multiple species of defective rnas (d-rna) containing all of the cis-acting sequences required for replication. sgrnas may be involved in the construction of recombinant d-rnas.
according to the species, infectivity is inactivated at temperatures between 40 and 55 °c, is retained for 1 to 4 days at room temperature, up to 1 year in frozen sap, 2 years in dried leaf material, 5 years in lyophilized preparations stored at 20 °c, and is destroyed at ph lower than 6. the a260/a280 ratio is around 1.20, but some members [byv, carnation necrotic fleck virus (cnfv), burdock yellows virus (buyv)] lack tryptophan, which results in a higher ratio (1.4-1.8) for the virions. s20,w ranges from 130 (byv) to 140 (ctv), and buoyant density is 1.33 g cm⁻³ in cscl (byv and ctv) and 1.257 g cm⁻³ in cs2so4 (ctv). virions contain a single molecule of linear, positive sense, single stranded rna from 14.5 to 19.3 kb in size. multiple double stranded rna (dsrna) species occur in infected tissues, the largest of which is usually the replicative form of the entire genome. sgrnas generate a range of smaller dsrnas. with ctv, the presence of d-rna makes the dsrna pattern of virus isolates more complex than that of other members of the genus. sequenced members of the genus closterovirus show three types of genome organization exemplified by byv (figure 2), ctv (figure 3) and bysv: (i) byv contains eight orfs flanked by 5′ and 3′ utrs of 107 and 181 nt, respectively; (ii) ctv has 12 orfs and utrs of 107 nt at the 5′ end and 275 nt at the 3′ end. it differs from the byv genome in having two papain-like protease domains in orf1a, an extra 5′ proximal orf (orf2) encoding a 33 kda product with no similarity to any other protein in databases, and two extra 3′ proximal orfs (orf9 and orf11); (iii) bysv has 10 orfs and a 3′ utr 241 nt in size, a length intermediate between that of the byv and ctv utrs. a further difference from the byv genome lies in the presence of an extra orf (orf2) encoding a 30 kda polypeptide with no similarity to any other protein in databases. this orf is located downstream of orf1b, i.e. in the same position as the unrelated ctv orf2.
thus, the organization of the bysv genome is intermediate between that of byv and ctv, suggesting that these three viruses might represent three distinct stages in closterovirus evolution. non-structural proteins common to all members of the genus are: (i) a large polypeptide (over 300 kda) containing the conserved domains of papain-like protease (p-pro), methyltransferase (mtr), and helicase (hel); (ii) a ~50 kda protein with all sequence motifs of viral rdrp of the "alpha-like" supergroup of positive-strand rna viruses; (iii) a 6 kda hydrophobic protein with membrane-binding properties; (iv) the homolog of the cellular hsp70 heat-shock proteins (hsp70h); (v) a 55-64 kda product, referred to as the ~60 kda protein. some of the structural and non-structural proteins function as suppressors of the rna silencing plant defence machinery. for instance, cp, p20 and p23 proteins of ctv have suppressor activity, much the same as the homologs of p21 of bysv, byv, and glrav-2. ctv p23 is a unique protein in the family and has a nucleolar localization. silencing suppressors contribute to the accumulation of virus particles and are important determinants of pathogenesis. no serological relationships have been reported among different virus species of the genus. monoclonal antibodies have been produced to byv, ctv and glrav-2 and polyclonal antisera have been raised to byv, ctv and cylv from fusion proteins obtained in bacterial expression systems. polyclonal antisera have been raised to normal capsid proteins of byv, bysv, glrav-2 and buyv. most of the members of the genus infect herbaceous hosts (weeds, vegetable and flower crops) or shrubs … the criteria demarcating species in the genus are: … genus ampelovirus. the genus comprises species with particles 1400-2000 nm long and a monopartite genome 13.0-18.5 kb in size, transmitted by pseudococcid mealybugs and soft scale insects. particle morphology largely conforms to that of other members of the family.
information is very limited, except for the size of cp and cpm, as deduced from sequence data. virions contain a single molecule of linear, positive sense, single stranded rna from 13.0 to 18.5 kb in size. multiple double stranded rna (dsrna) species occur in infected tissues, the largest of which is usually the replicative form of the entire genome. smaller dsrnas are thought to be replicative forms of subgenomic rnas. the genus ampelovirus shows a wide variation in genome size and organization. at one extreme there are grapevine leafroll-associated virus 1 (glrav-1) and glrav-3, which has the largest genome of all (18,498 nt). glrav-3 has 12 orfs, coding for the replication-related proteins (orfs 1a and 1b), two small hydrophobic proteins (6 kda), the hsp70h, the ~60 kda protein, cp, cpm and five additional proteins 21, 20, 20, 4 and 7 kda in size, respectively (figure 4). the 5′ utr and 3′ utr are 737 and 277 nt in size, respectively. glrav-1 differs from all other members of the genus in encoding two copies of the cpm. at the other extreme there is a group of viruses infecting grapevine [e.g. grapevine leafroll-associated viruses 5 and 9 (glrav-5 and -9)] and pineapple [e.g. pineapple mealybug wilt-associated viruses 1 and 3 (pmwav-1 and -3)]. all these viruses have a genome made up of seven orfs and lack the cpm. pmwav-1, a representative of this group, has a genome 13,071 nt in size, beginning with a 535 nt utr at the 5′ end, followed by the orfs expressing, respectively, the replication-related proteins, a 6 kda hydrophobic protein, the hsp70h, the ~60 kda protein, the cp and a 24 kda protein. a utr 132 nt in size terminates the genome. replication occurs in the cytoplasm, likely in association with membranous vesicles, derived either from the endoplasmic reticulum or from peripheral vesiculation and disruption of mitochondria (glrav-1, glrav-3).
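the genome sizes and utr lengths quoted above for glrav-3 and pmwav-1 can be combined to see how much of each genome lies between the two utrs. the sketch below is simple arithmetic on the figures given in the text; it assumes that everything between the 5′ and 3′ utrs can be treated as one internal region (orfs plus any intergenic stretches), and the class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Genome:
    name: str
    length_nt: int  # total genome length
    utr5_nt: int    # length of the 5' UTR
    utr3_nt: int    # length of the 3' UTR

    def internal_nt(self) -> int:
        """Nucleotides lying between the 5' and 3' UTRs."""
        return self.length_nt - self.utr5_nt - self.utr3_nt


# Values as quoted in the text for the two extremes of the genus.
GENOMES = [
    Genome("glrav-3", 18498, 737, 277),
    Genome("pmwav-1", 13071, 535, 132),
]

if __name__ == "__main__":
    for g in GENOMES:
        share = g.internal_nt() / g.length_nt
        print(f"{g.name}: {g.internal_nt()} nt between the utrs ({share:.1%})")
```

on these figures the difference of more than 5 kb between the two genomes sits almost entirely in the region between the utrs, consistent with the different numbers of orfs reported for the two viruses.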
structural and non-structural proteins are similar in type and function to those reported for the genus closterovirus. polyclonal antisera and monoclonal antibodies have been raised to most of the members of the genus. a recombinant single-chain variable fragment antibody was synthesized to glrav-3. glrav-1 and glrav-3 are distantly serologically related based on cross-reactivity to a monoclonal antibody to glrav-1. glrav-4, -5, -6 and -9 show serological interrelations when tested with polyclonal antisera or monoclonal antibodies. grapevine leafroll-associated virus 7 (glrav-7), an unassigned member of the family, is also distantly related to the four above species. the three pineapple mealybug wilt-associated viruses are serologically unrelated to one another. the majority of extant ampelovirus species are recorded from woody hosts (grapevine, plum, fig) and pineapple. according to the host, they induce rolling, yellowing and reddening of the leaves (grapevine), stem pitting (plum), wilting or symptomless infections (pineapple). natural vectors are mealybugs, which transmit in a semi-persistent manner. the range of vectors varies for individual viruses from rather wide to restricted. for instance, glrav-1 is transmitted by species of several genera of pseudococcid mealybugs (heliococcus, phenacoccus, pseudococcus) and soft scale insects (pulvinaria, neopulvinaria and parthenolecanium); glrav-3 by pseudococcid mealybugs (planococcus, pseudococcus, heliococcus, phenacoccus) and soft scale insects (pulvinaria, neopulvinaria, parthenolecanium, coccus, saissetia, parasaissetia and ceroplastes), whereas glrav-5 is transmitted by pseudococcus, planococcus and ceroplastes spp. vectors of pineapple ampeloviruses are two species of the genus dysmicoccus, and lchv-2 is transmitted by phenacoccus aceris. none of the viruses is transmitted through seed or mechanically. all persist in plant parts used for propagation and are disseminated with them over long distances.
geographical distribution is very wide. the criteria demarcating species in the genus are: particle size; size of cp, as determined by deduced amino acid sequence data; serological specificity using discriminatory monoclonal or polyclonal antibodies; genome structure and organization (number and relative location and size of the orfs); amino acid sequence of relevant gene products (polymerase, cp, hsp70h) differing by more than 25%; vector species and specificity; magnitude and specificity of natural and experimental host range; cytopathological features (i.e., aspect of inclusion bodies and origin of cytoplasmic vesicles). figure 4: genome organization of grapevine leafroll-associated virus 3, showing the relative position of the orfs and their expression products. pro, papain-like protease; mtr, methyltransferase; hel, helicase; rdrp, rna-dependent rna polymerase; hsp70h; ~60 kda protein; cpm, minor capsid protein; cp, capsid protein. genus crinivirus. the genus comprises species transmitted by whiteflies. virions usually have two modal lengths (650-850 and 700-900 nm) and a bipartite genome, but potato yellow vein virus (pyvv) has a tripartite genome. particle morphology largely conforms to that of other members of the family. information is very limited, except for the size of cp and cpm, as deduced from sequence data. virions contain a single molecule of linear, positive sense, single stranded rna with size ranging from 7801 to 9127 nt (rna-1) and from 7903 to 8530 nt (rna-2) in species with a bipartite genome. the rna size of pyvv, the only species with a tripartite genome, is 8035 nt (rna-1), 5339 nt (rna-2) and 3892 nt (rna-3). the genome of most criniviruses (e.g. liyv) is divided between two linear, positive sense, single stranded rnas totalling 15.6-17.9 kb in size (figure 5), but pyvv possesses a tripartite genome. all molecules are needed for infectivity and are separately encapsidated. rna-1 of liyv contains three orfs, i.e.
the orf1a-orf1b complex plus a 3′-most orf coding for a 32 kda protein with no similarity to any protein in databases. this orf is similar in size and location to orf2 of ctv and bysv but the respective expression products are not related. rna-1 has 5′ and 3′ utrs of 97 and 219 nucleotides, respectively. as with other members of the family, the orf1a-orf1b complex codes for the replication-related proteins including the rna-dependent rna polymerase (rdrp). rna-2 has seven orfs flanked by a 5′ utr of 326 nt and a 3′ utr of 187 nt. rna-2 contains the five-gene module which, however, differs from that of members of the genus closterovirus by the insertion of an extra gene (orf4) upstream of the cp gene. as to pyvv: (i) rna-1 (8035 nt in size) is composed of three orfs, i.e. the orf1a-orf1b complex and a 7 kda hydrophobic protein containing a potential transmembrane helix; (ii) rna-2 (5339 nt in size) comprises five predicted orfs that encode, in order, the hsp70h; a 7 kda protein similar to a comparable protein of cucurbit yellow stunting disorder virus (cysdv); the ~60 kda protein; a 9.8 kda product with no significant similarity to any other sequence in databases; and the 28.2 kda putative cp; (iii) rna-3 (3892 nt) has three potential orfs coding for a protein 4 kda in size with no counterpart among other proteins in the family and no significant sequence homology in databases; the 77.5 kda cpm; and a 26.4 kda protein present in other members of the genus. in all criniviruses, the order of the cp and cpm orfs is reversed compared to that of species in the genus closterovirus. sweet potato chlorotic stunt virus (spcsv) and tomato chlorosis virus (tocv) have a particularly large cpm (75-80 kda), compared to liyv (53 kda). replication occurs in the cytoplasm, likely in association with membranous vesicles, derived from the endoplasmic reticulum or from vesiculated mitochondria (cysdv).
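the segment lengths given for pyvv can be summed and set against the 15.6-17.9 kb total quoted for criniviruses with a bipartite genome. the sketch below is only a sanity check on the figures as printed in this chapter; the constant and function names are illustrative, and no claim is made that the two-segment range was meant to apply to the tripartite species.

```python
# Segment lengths of pyvv, in nucleotides, as quoted in the text.
PYVV_SEGMENTS_NT = {"rna-1": 8035, "rna-2": 5339, "rna-3": 3892}

# Combined size range quoted for the two rnas of bipartite criniviruses (kb).
BIPARTITE_TOTAL_RANGE_KB = (15.6, 17.9)


def total_kb(segments):
    """Combined length of all genome segments, in kilobases."""
    return sum(segments.values()) / 1000.0


def within(value, bounds):
    """True if value lies inside the closed interval given by bounds."""
    low, high = bounds
    return low <= value <= high


if __name__ == "__main__":
    total = total_kb(PYVV_SEGMENTS_NT)
    verdict = within(total, BIPARTITE_TOTAL_RANGE_KB)
    print(f"pyvv total: {total:.3f} kb; inside bipartite range: {verdict}")
```

on these numbers the three pyvv segments together are comparable in size to the two-segment genomes, so the extra segment reflects a different partitioning of the genome and not a markedly larger one.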
structural and non-structural proteins are similar in type and function to those reported for the genus closterovirus. both genomic rnas of tocv encode rna silencing suppressors, e.g. the p22 protein in rna-1 and cp and cpm in rna-2. the p25 protein of cysdv, the viral rnase iii and the p22 protein encoded by a few isolates of spcsv also have suppressor activity. monoclonal antibodies have been produced to proteins of spcsv. antisera have been raised against structural and non-structural proteins produced as fusion proteins in bacterial expression systems (spcsv and liyv). phylogenetic relationships within the family are depicted in figure 6. virions of some of the genera of the families alphaflexiviridae (allexivirus) and betaflexiviridae (capillovirus, trichovirus, vitivirus, citrivirus and foveavirus) have the same particle morphology as those of the family closteroviridae. however, the sequence of the cp of members of this family has little similarity with that of cps of viruses in the above genera, and major differences exist in genome size and organization, and in strategy of expression. replication-associated proteins (rdrp, mtr and hel) contain signature sequences homologous to those of other taxa of the "alpha-like" supergroup of ssrna viruses, the closest affinity being with the families bromoviridae and virgaviridae. the replication strategy, based on polyprotein processing, translational frameshifting and multiple sgrna generation, closely resembles that of viruses in the families coronaviridae and … "picorna-like" supergroup of polymerases. hence, the transcriptional strategy of members of the family closteroviridae follows the mechanism of other "alpha-like" viruses, and is dissimilar from the discontinuous, leader-primed transcription of coronaviruses and arteriviruses. derivation of names: ampelo: from greek ampelos, "grapevine".
criniviruses infect primarily herbaceous hosts, in which they induce extensive chlorosis to yellow discoloration of the leaves, often accompanied by stunting. they are transmitted semi-persistently by whiteflies of the genera trialeurodes and bemisia. persistence and specificity of transmission by their respective vectors have been used as characters for species differentiation. thus, the viruses of group 1 [pyvv, blackberry yellow vein-associated virus (byvav), beet pseudoyellows virus (bpyv) and strawberry pallidosis-associated virus (spav)] are transmitted by t. vaporariorum, viruses of group 2 [tocv, spcsv, cysdv and bean yellow disorder virus (bydv)] by b. tabaci, whereas one of the viruses of group 3 is transmitted by b. tabaci (liyv) and the other by t. vaporariorum (ticv). these groups were identified by comparative phylogenetic analyses of rdrp amino acid sequences. none of the viruses is transmitted through seed or mechanically. geographical distribution varies from restricted (e.g. byvav) to very wide. some emerging viruses (e.g. cysdv, ticv and tocv) are being increasingly recorded from a number of european, american and asiatic countries. the membranous vesicles with a fibrillar content derive from the endoplasmic reticulum and/or from vesiculated mitochondria (cysdv).
the criteria demarcating species in the genus are: … martelli, g.p., agranovsky, a.a., bar-joseph, m., boscia, d., candresse, t., coutts, r.h.a., dolja, v.v., hu, j.s., jelkmann, w., karasev, a.v., martin, r.r., minafra, a., namba, s. and vetten, h.j. key: cord-017326-1caeui30 authors: seay, montrell; dinesh-kumar, savithramma; levine, beth title: digesting oneself and digesting microbes: autophagy as a host response to viral infection date: 2005 journal: modulation of host gene expression and innate immunity by viruses doi: 10.1007/1-4020-3242-0_11 sha: doc_id: 17326 cord_uid: 1caeui30 although research in this area is still in a stage of infancy, it seems likely that the lysosomal degradation pathway of autophagy plays an evolutionarily conserved role in antiviral immunity. the interferon-inducible, antiviral pkr signaling pathway positively regulates autophagy, and both mammalian and plant autophagy genes restrict viral replication and protect against virus-induced cell death. given this role of autophagy in innate immunity, it is not surprising that viruses have evolved numerous strategies to inhibit host autophagy. different viral gene products can either modulate autophagy regulatory signals or directly interact with components of the autophagy execution machinery. moreover, certain rna viruses have managed to “co-opt” the autophagy pathway, selectively utilizing certain components of the dynamic membrane rearrangement system to promote their own replication inside the host cytoplasm. in addition to this newly emerging role of autophagy in innate immunity, autophagy plays an important role in many other fundamental biological processes, including tissue homeostasis, differentiation and development, cell growth control, and the prevention of aging.
accordingly, the inhibition of host autophagy by viral gene products has important implications not only for understanding mechanisms of immune evasion, but also for understanding novel mechanisms of viral pathogenesis. it will be interesting to dissect the role of viral inhibition of autophagy in acute, persistent, and latent viral replication, as well as in the pathogenesis of cancer and other medical diseases. the cellular pathway of autophagy is as ancient as the origins of eukaryotic life. derived from the greek and meaning to eat ("phagy") oneself ("auto"), the term autophagy refers to a lysosomal pathway of self-digestion, involving dynamic membrane rearrangement to sequester cargo for delivery to the lysosome, where the sequestered material is degraded and recycled. for decades, it has been known that autophagy is the primary intracellular catabolic mechanism for the degradation and recycling of long-lived cellular proteins and organelles. for decades, it has also been known that the recycling function of autophagy is an important adaptive response to nutrient deprivation and other forms of environmental stress. however, only recently have we discovered that autophagy may also be an important mechanism for the degradation of intracellular pathogens and that autophagy may also be important in cellular protection against the stress of microbial infection. not surprisingly, we have also recently learned that some successful intracellular pathogens have devised strategies either to block host autophagy or to subvert the host autophagic process to foster their own replication. in this chapter, we will review recent progress in understanding the interrelationships between viruses, autophagy, and innate immunity (figures 1 & 2). before discussing the interrelationships between viruses, autophagy, and innate immunity, we will provide a brief overview of the molecular and cell biology of autophagy.
while this subject has been covered extensively in a recent book and numerous recent review articles 2-5 , we will highlight the aspects of this subject that may have particular relevance for viral infections. the process of autophagy was first described more than forty years ago; however, for many decades our understanding of autophagy was based largely on morphological observations from electron microscopy (reviewed in 6 ). the field has expanded considerably within the last 15 years after the cloning and molecular characterization of the yeast autophagy (atg)-related genes (reviewed in 7 ). the analysis of sequenced genomes of higher eukaryotes has identified atg homologues in mammals, c. elegans, drosophila, dictyostelium, and plants, and many of these genes have been shown to be essential for autophagy function in higher eukaryotes (reviewed in 2 ) (table 1). in addition to the identification of the autophagy genes, significant progress has been made in the past decade in understanding some of the signaling events that regulate autophagy (reviewed in 8, 9 ). interestingly, some of the signaling molecules that regulate autophagy, as well as some of the autophagy genes, play a role in the host antiviral innate immune response (table 1, figure 2). the initial step of autophagy is the formation and elongation of the isolation membrane. the isolation membrane invaginates and sequesters cytoplasmic constituents including mitochondria, endoplasmic reticulum (er) and ribosomes, and the edges of the membrane fuse with each other to form a double-membrane structure called an autophagosome. the outer membrane of the autophagosome fuses to the lysosome/vacuole with subsequent delivery of the inner vesicle or autophagic body into the lumen of the degradative compartment.
the source of the autophagosomal membrane is still unclear, but presently, it is thought that the preautophagosomal structure (pas) acts as the site of vesicle formation during autophagy [10] [11] [12] . the pas is thought to form de novo, but the source of the vesicle membrane is not known. it seems likely that the "typical" autophagosomes observed during viral infection that contain a mix of virions and self-cytoplasmic constituents, originate from the pas. however, it is not yet known whether the pas also serves as the site of vesicle formation for the formation of "atypical" autophagic-like double-membrane vesicles that function as replication sites for certain rna viruses (e.g. poliovirus, mouse hepatitis virus, equine arterivirus) (reviewed in 13 ). more likely, these double-membrane vesicles arise directly from the endoplasmic reticulum (er) 13 . autophagosomes are lipid-rich, protein-poor vesicles that vary in size and membrane thickness depending on the organism and cell type. the composition and abundance of proteins sequestered within autophagosomes reflects the relative composition and abundance of proteins in the surrounding cytoplasm 14 . this observation has led to the concept that autophagosomes indiscriminately sequester cytoplasmic content. however, in yeast, there are well-established pathways of specific autophagy, including the biosynthetic cytoplasm-to-vacuole targeting pathway and pexophagy (reviewed in 15 ), and in mammalian cells, mitochondria-specific autophagy has been reported 16 . although molecular determinants of cargo recognition have been identified in yeast pathways of specific autophagy, virtually nothing is known about the specificity of cargo recognition in higher eukaryotes. in circumstances where there is degradation of viruses observed inside "typical" autophagosomes that also contain cellular constituents, the sequestration step may lack specificity. 
however, in circumstances where viruses utilize components of the autophagic machinery for the formation of "autophagic-like" double-membrane structures that exclusively contain viral constituents, the sequestration step is likely to have exquisite specificity. the identification of the viral and cellular determinants of this specificity will be an important advance in understanding the cell biology of these types of rna virus infections and may eventually lead to the identification of novel antiviral therapeutic targets. autophagy is tightly regulated by nutritional, hormonal, and other environmental cues. it occurs as a cellular response to extracellular stimuli (e.g. nutrient starvation, hypoxia, overcrowding, high temperature, hormonal or chemotherapeutic treatment), and intracellular stimuli (e.g. accumulation of damaged, superfluous or unwanted organelles, accumulation of misfolded proteins, invasion of microorganisms). although it is not yet known whether different stimuli act through parallel, convergent, or divergent pathways to trigger autophagy, significant progress has been made within the past decade in identifying different signaling molecules that function in the positive (e.g. eif2 kinases, class iii pi-3 kinases, pten, death-associated protein kinases) or negative (e.g. tor, insulin-like growth factor signals, class i pi-3 kinase, rho/ras family of gtpases) regulation of autophagy (reviewed in 8, 9 ) . the identification of a role for these signaling molecules in autophagy regulation has implications for understanding antiviral immunity, and more speculatively, generates hypotheses about novel principles of virus-host interactions. the recently defined evolutionarily conserved role of the eif2 kinase signaling pathway in autophagy induction suggests that autophagy regulation may contribute to the antiviral function of the interferon-inducible eif2 kinase, pkr. 
pkr and other eif2 kinases induce a general translational arrest by phosphorylating the serine 51 residue of eif2 (reviewed in 17). genetic studies in yeast and mammalian cells have also shown that the eif2 kinase signaling pathway is required for starvation- and herpes simplex virus-induced autophagy 18. while further analyses are required to dissect the relative contributions of autophagy induction vs. translational arrest in mediating the antiviral effects of pkr, these findings link a new cellular function (i.e. autophagy) with interferon signaling. although the eif2 kinase signaling pathway is the only autophagy regulatory signaling pathway defined so far that has known antiviral functions, it is interesting to note that most autophagy regulatory signals play a role in other important cellular processes, including cell growth control, cell death, and aging. some of these effects may be the consequence of divergent downstream targets of these regulatory signals and some of these effects may be directly mediated through autophagy. as will be discussed below, given the evidence for a role of autophagy in innate immunity, it is likely that viruses have evolved different strategies to antagonize host autophagy, which, at least in the case of herpes simplex virus (see sections 4.1 and 5), include the targeting of upstream autophagy regulatory signals 18. therefore, it is tempting to speculate that some of the effects of viruses on cell growth control and cell death may be either direct or indirect consequences of the evolutionary pressure that viruses face to modulate host autophagy. as one example, the insulin-like/class i pi-3k/akt signaling pathway inhibits autophagy 19, 20, promotes oncogenesis (reviewed in 21), and decreases lifespan (most likely through autophagy-inhibitory effects) 20.
certain retroviruses have recruited the catalytic subunit of pi3-k and its downstream target akt, and these viral gene products function as oncoproteins (reviewed in 22). the emerging link between these signaling molecules and autophagy inhibition raises the interesting hypothesis that the initial acquisition of these molecules by viruses was perhaps related to the selective advantage of autophagy inhibition in viral growth. in view of recent evidence supporting a role of autophagy in tumor suppression [23] [24] [25] [26], the presence of these genes in retroviral genomes could contribute to oncogenesis, at least in part, through inhibition of autophagy signaling, as well as through modulation of other downstream pathways. the atg genes encode proteins important for responding to upstream signaling pathways as well as proteins needed for the generation, maturation, and recycling of autophagosomes (reviewed in 4, 5, 15, 27). the atg proteins can be grouped into four functional groups: a protein kinase cascade important for responding to upstream signals, a lipid kinase signaling complex important for vesicle nucleation, ubiquitin-like conjugation pathways important for vesicle expansion, and a recycling pathway important for the disassembly of atg protein complexes from matured autophagosomes. the role of some of the atg proteins, including ones that act in the lipid kinase signaling complex and in the ubiquitin-like conjugation pathways, has been studied in plant and mammalian viral infections (see table 1). autophagy is a dynamic process that is tightly regulated by protein kinases and phosphatases. one of the first atg genes identified in yeast, atg1, encodes a serine/threonine kinase 28. the atg1 kinase maintains a weak interaction with a hyperphosphorylated atg protein, atg13, in nutrient-rich conditions. upon starvation or stress, atg13 is dephosphorylated, resulting in a tighter association with atg1 29.
atg13 binding is essential for autophagy since atg13 mutants unable to bind to atg1 are completely defective in autophagy 29. atg17, which is thought to play a role in atg1 activation, also interacts with atg1 29. downstream targets of atg1 have not been identified, although atg1 interacts with other proteins independently of its kinase activity 30. the upstream kinase, tor (target of rapamycin), indirectly or directly results in atg13 hyperphosphorylation, which is one presumptive mechanism by which tor kinase inhibits autophagy. of note, the atg1 component of the yeast autophagy induction complex plays a conserved role in autophagy in higher eukaryotes. however, the role of atg1 in antiviral immunity has not yet been evaluated.
lipid kinase signaling
vps34, a phosphatidylinositol-3 kinase (pi3-k), phosphorylates the 3' hydroxyl group of the inositol ring of phosphoinositides 31. although there is only one pi3-k in yeast, there are three classes of pi3-k in higher eukaryotes; class iii pi3-k has been shown to be analogous to yeast vps34. the importance of class iii pi3-k signaling in autophagy has been demonstrated pharmacologically and genetically. the nucleotide derivative, 3-methyladenine, inhibits class iii pi3-k activity and is widely used to inhibit autophagosome formation in mammalian cells 32, 33. a null mutation in vps34 causes defects in autophagosome formation in yeast 34, 35 and microinjection of an inhibitory anti-vps34 antibody blocks autophagy in cultured mammalian cells 36. vps34 functions through association with other atg proteins in a large complex that includes vps15, atg6/vps30 and atg14 35. this complex is thought to be important in vesicle nucleation by mediating the localization of other atg proteins at the pas 10, 35. vps34 and atg6/vps30 are conserved in higher eukaryotes.
importantly, the mammalian (beclin 1) and plant (beclin 1) homologues of yeast atg6/vps30 have been the most extensively studied atg genes in viral infections. as will be discussed in more detail below, mammalian beclin 1 restricts viral replication, protects against virus-induced cell death, and is a target of inhibition by different virally-encoded gene products [37] [38] [39]. furthermore, both plant beclin 1 and its binding partner, class iii pi3-k/vps34, prevent the spread of programmed cell death during the plant antiviral hypersensitive response 40. thus, the lipid kinase complex plays an evolutionarily conserved role in antiviral innate immunity. autophagic vesicle expansion and completion involve conjugation machinery analogous to the ubiquitin conjugation needed for proteasome-mediated protein degradation. autophagy utilizes an e1-like enzyme (atg7) and two e2-like enzymes (atg10 and atg3) that facilitate the conjugation, activation, and localization of different ubiquitin-like modifiers (atg12 and atg8). the conjugation modification of atg proteins is necessary for the formation of an autophagosome of appropriate size and shape 41. however, the precise molecular functions of the conjugation reactions are not known and remain a critical unanswered question in autophagy research. the first conjugation system involves the lipidation of atg8, a ubiquitin-like protein whose close mammalian homologues have three-dimensional structures very similar to ubiquitin [42] [43] [44] [45]. both atg8 and the mammalian homologue lc3 are cleaved post-translationally by the cysteine endopeptidase atg4 46, 47. the cleavage of atg8/lc3 is essential for conjugation and further maturation of the autophagosomes 48. in yeast and mammalian systems, the cleaved atg8/lc3 is immediately activated by atg7, an e1-like enzyme; transferred to atg3, an e2-like enzyme; and finally conjugated to the lipid molecule phosphatidylethanolamine (pe) 42, 47, 49, 50.
the second ubiquitin-like reaction is the conjugation of atg12 to atg5. atg12 is a ubiquitin-like protein that is activated by atg7 (e1-like enzyme), transferred to atg10 (e2-like enzyme), and subsequently conjugated to atg5 through an isopeptide bond 41, 49. the conjugation of atg5 to atg12 is necessary for autophagosome formation but not necessary for localization to the pas 10. almost all of the components of the autophagy machinery that participate in the protein conjugation systems have orthologs in at least some higher eukaryotes. however, only mammalian atg5 has been studied in the context of its role in viral infections. the contrasting phenotypes of atg5 null cells infected with two different rna viruses illustrate two distinct mechanisms by which viruses interact with the autophagic machinery. the murine coronavirus, mouse hepatitis virus, which replicates in association with double-membrane vesicles, has severely impaired growth in atg5 null embryonic stem (es) cells 51, suggesting that atg5 is required for the formation of coronavirus replication complexes. in contrast, the prototype alphavirus, sindbis virus, replicates to higher titers in atg5 null murine embryonic fibroblasts (mefs) than in wild-type controls 38, suggesting that the autophagic machinery functions to restrict sindbis virus replication. in yeast, atg proteins that act at the stage of vesicle formation are not associated with the completed autophagosome, with the exception of atg8. this suggests that atg proteins are retrieved at some point prior to, or upon, vesicle completion, and then reutilized in the generation of new autophagosomes. the process of recycling requires the action of atg2 and atg18, which allow the recycling of atg9, the only transmembrane protein that is part of the autophagic machinery 52, 53. atg9 and atg18 have orthologues in higher eukaryotes, but their function in antiviral responses has not been studied.
the unique association of atg8/lc3 with the mature autophagosome has led to an important technical advance in autophagy research. atg8/lc3 is presently the most widely used and reliable marker for labeling autophagosomes 10, 11, 20, 50, 54, and with the recent availability of transgenic mice that express gfp-tagged lc3 54, it is now possible to study autophagy induction in vivo during viral infections. in addition to its emerging role in innate immunity, autophagy plays a role in diverse other biological processes, including survival during starvation, differentiation and development, tissue homoeostasis, aging, cell growth control, and certain forms of programmed cell death. these biological functions of autophagy have been reviewed in detail elsewhere 2, 55, 56. in this section, however, we will briefly discuss selected biological functions of autophagy that have relevance either to understanding the mechanisms by which autophagy protects cells against virus infections (figure 1) or to understanding the potential consequences for the host of viral evasion of autophagy (figure 3). perhaps the primordial function of autophagy is its ability to recycle nutrients and help sustain life during periods of starvation. several decades ago, starvation was noted to be a potent inducer of autophagy in rodent liver (reviewed in 57), leading to the hypothesis that autophagy is an adaptive response to starvation. following the identification of the conserved autophagy genes, genetic studies in different species have confirmed that autophagy genes are required for the maintenance of eukaryotic life in the face of limited environmental nutrient supply. this principle was first demonstrated in yeast: all atg gene mutant yeasts grow normally in nutrient-rich conditions but, unlike wild-type yeasts, die rapidly during carbon or nitrogen starvation 28.
similarly, dictyostelium discoideum that lack atg genes also grow normally in the presence of their food, nonpathogenic bacteria, but die rapidly when subjected to starvation 58, 59. during nitrogen or carbon starvation, atg7 and atg9 mutant plants display two phenotypes, enhanced chlorosis (yellowing of leaves due to a loss of chlorophyll) and accelerated leaf senescence, that are thought to result from a defective ability to mobilize nutrients through autophagic delivery 60, 61. in addition, mammalian cells lacking the autophagy genes atg5 or beclin 1 also undergo accelerated cell death in response to starvation as compared to their wild-type counterparts 62. the pro-survival function of autophagy during starvation is thought to be directly related to its ability to recycle nutrients to generate a sufficient pool of amino acids required for the synthesis of essential proteins. while the eif2 kinase signaling pathway shuts off general translation during starvation, at least in yeast, gcn2 signaling simultaneously stimulates the transcription of essential starvation response genes, including autophagy genes 63, 64. thus, this signaling pathway provides a coordinated method to effectively generate new amino acids by autophagy and redirect the host cell synthetic machinery to use its limited amino acid supply specifically for the synthesis of essential starvation response proteins. although a downstream transcription factor like yeast gcn4 (which is downstream of yeast eif2 and transcriptionally transactivates autophagy genes) has not yet been identified for mammalian pkr signaling initiated during virus infection, it seems likely that there are functionally homologous molecules that direct virus-infected cells to mount an adaptive and selective transcriptional and translational response during virus infection.
even in the absence of this postulated arm of pkr signaling, the mere recycling of nutrients in virus-infected cells would be predicted to have a beneficial function for the host. although few studies have compared cellular amino acid pools during nutrient starvation and virus infection, acute viral replication involves the parasitism of not only the host cell's translational machinery, but also the host cell's translational building blocks. therefore, it seems likely that acute viral replication induces what can be thought of as a state of "pseudostarvation". according to this model, the prediction is that the nutrient recycling function of autophagy plays a similar protective function during viral infection as it plays during nutrient deprivation. differentiation and development both require cells to undergo significant phenotypic changes and must entail a mechanism for the breakdown and recycling of obsolete cellular components. genetic studies have revealed an essential role for components of the autophagic machinery in differentiation and developmental processes in several different organisms, including sporulation in yeast, multicellular development in dictyostelium, dauer development in c. elegans, and embryonic development in mice (reviewed in 2). in addition, the mammalian autophagy gene, beclin 1, appears to play a role in epithelial cell differentiation, since the mammary glands in beclin 1 heterozygous-deficient mice display striking morphological abnormalities 23. since viral gene products can inhibit the autophagy function of beclin 1 (see below) and potentially other autophagy proteins 65, it is possible that autophagy blockade represents a mechanism by which viruses can affect cellular differentiation. for example, the bcl-2-like bhrf1 protein encoded by ebv binds to beclin 1 39, blocks its autophagy function 39, and also perturbs epithelial cell differentiation 66.
further studies are needed to determine the role of beclin 1 binding in the perturbation of epithelial cell differentiation by bhrf1, as well as to investigate the effects of other viral inhibitors of autophagy on cellular differentiation and multicellular development. many different viruses, including retroviruses, gammaherpesviruses, papillomaviruses, and hepatitis viruses, are oncogenic. studies of the mechanisms of viral oncogenesis have largely focused on the ability of viruses to alter mitogenic signaling, cell cycle regulation, and/or apoptosis. however, new evidence is emerging that autophagy plays a role in tumor suppression and that autophagy is antagonized by gene products encoded by certain oncogenic viruses. accordingly, it will be important to evaluate the role of viral inhibition of autophagy in viral oncogenesis. normal cell growth requires a well-coordinated balance between the cell's biosynthetic machinery (e.g. protein synthesis and organelle biogenesis) and its degradative processes (e.g. protein degradation and organelle turnover). in the 1970s, it was first proposed that protein catabolism through autophagy is a major determinant of cell growth 67, 68. according to this model, both cell mass and the rate of cell growth reflect a balance between the amount of protein synthesized and the amount of autophagic protein degradation. although this model has received little attention in recent years, interest in the role of autophagy in cell growth control has reemerged in light of new biochemical and genetic links between autophagy and the negative regulation of tumorigenesis. as stated above in section 2.2, several different oncogenic signaling molecules, including members of the insulin signaling pathway (e.g. class i pi-3k, akt) and members of the rho and ras family of gtpases, negatively regulate autophagy in mammalian cells, and the pten tumor suppressor positively regulates autophagy (reviewed in 69).
furthermore, the autophagy inhibitor, tor, is an important positive regulator of cell growth in diverse organisms, and the tor inhibitor, rapamycin, has promising anti-tumor effects in human clinical trials (reviewed in 70). oncogenic viruses have developed multiple strategies to activate autophagy-inhibitory signaling pathways. these strategies include encoding viral oncoproteins that represent activated forms of the corresponding cellular proto-oncogene (reviewed in 22) or upregulating rho/ras or class i pi-3k/akt/tor signaling through alternative mechanisms [71] [72] [73] [74] [75] [76] [77]. components of the autophagic machinery may also play a direct role in tumor suppression. the beclin 1 gene is monoallelically deleted in a high percentage of cases of human breast, ovarian, and prostate cancer (reviewed in 78) and has tumor suppressor function in cultured mammary carcinoma cells 79, 80. heterozygous disruption of beclin 1 in mice increases the frequency of spontaneous tumorigenesis (including papillary lung carcinomas, b cell lymphomas, and hepatocellular carcinomas) and accelerates the development of hepatitis b virus-induced pre-malignant lesions 23, 24. in addition, atg5 null es cells are more tumorigenic in mice than their wild-type counterparts and result in teratomas that are less well-differentiated 81. together, these findings lead to the concept that autophagy genes may represent a novel class of tumor suppressor genes and that genetic disruption of autophagy may represent a novel mechanism of tumorigenesis. as will be discussed in more detail below, two different classes of viral gene products have been identified thus far that bind to beclin 1 and inhibit its autophagy function: the alphaherpesvirus-encoded neurovirulence protein, hsv-1 icp34.5, and the gammaherpesvirus-encoded bcl-2-like proteins, kshv vbcl-2 and ebv bhrf1 38, 39.
the gammaherpesviruses are oncogenic viruses that are etiologically linked to a variety of different malignancies, including lymphoma, nasopharyngeal carcinoma, and kaposi's sarcoma. at present, the precise role of gammaherpesvirus bcl-2-like proteins in viral oncogenesis is uncertain. nonetheless, given the well-defined role of cellular bcl-2 in oncogenesis and the emerging evidence that beclin 1 is a tumor suppressor protein, it will be important to evaluate whether viral bcl-2 antagonism of beclin 1 function plays a role in gammaherpesvirus oncogenesis. of note, preliminary data indicate that kshv may also encode other gene products that interact with other components of the autophagic machinery 65. thus, oncogenic gammaherpesviruses may have multiple mechanisms to disarm host autophagy. it will be of interest to determine whether other oncogenic dna viruses, especially human papillomavirus, also directly inhibit the host autophagic machinery. in many tissues in the adult organism (especially post-mitotic cells), protein and organelle turnover by autophagy plays an essential cellular homeostatic or housekeeping function, removing damaged or unwanted organelles and proteins. for many decades, it has been presumed that this homeostatic function of autophagy represents an anti-aging mechanism, perhaps by reducing reactive oxygen species and other toxic intracellular substances that contribute to genotoxic stress (reviewed in 82). the conserved effects of protein caloric restriction (a dietary inducer of autophagy) on lifespan extension have provided further fuel for this concept (reviewed in 83). recent genetic studies, especially those performed in c. elegans, provide more direct evidence for a role of both autophagy regulatory signals and components of the autophagic machinery in anti-aging pathways. loss-of-function mutations in the autophagy-inhibitory insulin-like signaling pathway extend lifespan (reviewed in 84), and inactivation of the c.
elegans ortholog of the yeast autophagy gene, atg6/vps30, blocks this lifespan extension 20. while the precise mechanisms by which autophagy extends lifespan are unknown, one theory is that autophagy selectively removes damaged mitochondria, resulting in decreased levels of intracellular reactive oxygen species and cellular protection against oxidative damage. viral infections, as well as the inflammatory response to viral infections, can damage mitochondria and/or increase the intracellular generation of reactive oxygen species, and these effects may contribute to viral pathogenesis. for example, the mitochondrial damage that occurs in hiv infection (even in the absence of antiretroviral treatment) is thought to be a major contributory factor to the metabolic abnormalities and cardiomyopathy that occur in patients with aids (reviewed in 85). as another example, in a transgenic mouse model of hepatitis c virus (hcv) infection, oxidative stress in the absence of inflammation has been implicated in hcv-associated hepatocarcinogenesis 86. similarly, studies in transgenic mice and cultured cells indicate that pre-s1/s2 mutant hepatitis b virus surface antigens, which accumulate in late stages of hbv infection, cause oxidative stress and dna damage 87. therefore, it is possible that the mechanisms by which autophagy functions as an anti-aging pathway may be relevant to potential roles that autophagy may play in protecting cells against adverse sequelae of oxidative stress during virus infection. diseases associated with an accumulation of misfolded and aggregated proteins, including neurodegenerative disorders and α1-anti-trypsin liver disease, are associated with an increase in the accumulation of autophagic vacuoles (reviewed in 56). in these diseases, it has been argued that autophagy plays both a protective role (i.e. by removing protein aggregates and damaged mitochondria) and a pathologic role (i.e.
by promoting liver dysfunction in α1-anti-trypsin deficiency through excessive mitochondrial autophagy 88 or by promoting autophagic cell death). although both of these roles may be operative in different diseases or even in different facets of a single disease, recent studies provide compelling evidence that autophagy plays a protective role against the toxic effects associated with protein aggregation. for example, mutant α-synuclein (associated with early onset parkinson's disease), and aggregate-prone proteins with polyglutamine expansions (associated with huntington's disease) are targeted for autophagic degradation 88, 89. rapamycin, which stimulates autophagy, not only enhances the clearance of aggregate-prone proteins but also reduces the appearance of the aggregates and the cell death associated with expression of mutant huntington's proteins 89. furthermore, induction of autophagy with rapamycin protects against neurodegeneration in both fly and mouse models of huntington's disease 90. recent advances have also been made in understanding the mechanisms by which autophagy is induced in response to misfolded protein aggregates. in cell models, transgenic mice, and samples from human brains of patients with huntington's disease, mtor is sequestered into polyglutamine aggregates. this sequestration impairs its kinase activity, leading to induction of autophagy 90. although it has not yet been evaluated, it is likely that the accumulation of misfolded protein aggregates also induces autophagy through activation of the er stress response, which is mediated by the eif2 kinase, pkr-like er resident kinase (perk) 91, 92, since other stress stimuli (e.g. starvation and virus infection) that activate other eif2 kinases (e.g. gcn2 and pkr) induce autophagy through this same signaling pathway 18.
these observations are potentially relevant to understanding the role of autophagy in protection against virus-induced diseases in which protein misfolding and er stress are thought to play pathogenetic roles. similar to genetic neurodegenerative disorders, there is increasing evidence that murine retrovirus-associated spongiform-like neuronal degeneration is also associated with protein misfolding and er stress. for example, viral envelope proteins from avirulent strains are processed normally and fail to induce er stress, whereas envelope proteins from neurovirulent strains are misfolded and activate er stress response pathways [93] [94] [95]. in addition, it has been proposed that the mechanism by which pre-s mutant hbv surface antigens promote oxidative stress and dna damage is through the accumulation of misfolded mutant proteins and activation of er stress 87, 96. thus, based on recent studies with non-viral associated neurodegenerative disorders, the prediction is that autophagy induction might be beneficial in attenuating diseases associated with misfolded viral proteins, such as retrovirus-associated spongiform encephalopathy and hepatitis b virus-induced liver damage. as a corollary, the possibility that these viruses might possess mechanisms to evade host autophagy could be an exacerbating factor in the pathogenesis of these infections. an interesting question is whether viral protein aggregates trigger autophagy by mechanisms that are similar to those involved in autophagy induction initiated by cellular protein aggregates. different viral glycoproteins are known to activate the er stress-related eif2 kinase, perk 97, 98, although a role for perk in autophagy induction has not yet been formally demonstrated. it is completely unknown, however, whether viral protein aggregates, like polyglutamine aggregates in huntington's disease, sequester and thereby inactivate the autophagy-inhibitory kinase, mtor.
if so, this would represent a highly novel mechanism by which viruses trigger intracellular innate immune responses. autophagy is emerging as a newly described mechanism of antiviral innate immunity that is targeted by viral virulence gene products. although there are not yet many published articles in this area, there are several observations that support this concept. first, during herpes simplex virus infection, the interferon-inducible antiviral pkr signaling pathway regulates the autophagic degradation of cellular and viral components 18, 99. second, mammalian autophagy execution genes, including beclin 1 and atg5, regulate sindbis virus replication and sindbis virus-induced cell death 37, 38. third, plant autophagy execution genes, including beclin 1, class iii pi3-k/vps34, atg3, and atg7, restrict tobacco mosaic virus replication and limit the spread of cell death during the innate immune response 40. in this section, each of these observations will be described in more detail. the interferon-inducible dsrna-dependent protein kinase r (pkr) plays an important role in innate immunity against viral infections. pkr activation leads to phosphorylation of the α subunit of eukaryotic initiation factor 2 (eif2α) and a subsequent shutdown of host and viral protein synthesis and viral replication (reviewed in 100). to avoid this translational shutdown, many viruses have evolved different strategies to antagonize pkr function. these include interference with the dsrna-mediated activation of pkr or pkr dimerization; blockade of the kinase catalytic site or pkr-substrate interactions; alterations in the levels of pkr; direct regulation of eif2α phosphorylation; and effects on components downstream of eif2α (reviewed in 100, 101). the importance of viral antagonism of pkr function in viral pathogenesis has been most clearly demonstrated using a herpes simplex virus type 1 (hsv-1) model system 102, 103.
the hsv-1 neurovirulence protein, icp34.5, binds to protein phosphatase 1 and causes it to dephosphorylate eif2α, thereby negating the activity of pkr 104, 105. a neuroattenuated hsv-1 mutant lacking icp34.5 exhibits wild-type replication and virulence in mice genetically lacking pkr 103, demonstrating that the icp34.5 gene product mediates neurovirulence by antagonizing pkr-dependent functions. in addition to regulating host translation during viral infection, the pkr signaling pathway also regulates the autophagic degradation of host proteins 18. as mentioned above, molecules in the yeast eif2 kinase signaling pathway (e.g. the eif2 kinase, gcn2, the eif2α ser51 residue, and the transcriptional transactivator, gcn4) are required for nitrogen starvation-induced autophagy. interestingly, the autophagy defect of gcn2 null yeast can be rescued by mammalian pkr transformation 18. direct evidence that viruses can induce pkr-dependent autophagy has been provided by studies done with herpes simplex virus type 1 (hsv-1) infection in genetically engineered mefs 18. an hsv-1 mutant virus lacking the icp34.5 inhibitor of pkr signaling (termed hsv-1 Δ34.5), but not wild-type hsv-1, is able to induce the autophagic breakdown of long-lived cellular proteins in wild-type mefs. however, hsv-1 Δ34.5 infection is not able to induce the autophagic breakdown of long-lived cellular proteins in mefs lacking pkr or with a nonphosphorylatable mutation in ser-51 of eif2α. these findings indicate that pkr-dependent signaling events regulate the autophagic breakdown of host proteins during viral infection and that this function of pkr is antagonized by the hsv-1 icp34.5 neurovirulence gene product. as discussed in section 3.1, the breakdown of cellular proteins may help protect host cells against the effects of "pseudostarvation" induced by viral infection.
more recent studies indicate that pkr-dependent signaling also regulates the breakdown of viral components during hsv-1 infection 99. ultrastructural analyses of wild-type and pkr-deficient mefs and sympathetic neurons infected with wild-type hsv-1 and hsv-1Δ34.5 demonstrate that hsv-1 is degraded in autophagosomes by a pkr-dependent process. in wild-type cells infected with wild-type hsv-1, the majority of intracytoplasmic virions are either randomly dispersed in the cytoplasm or contained within "viral vacuoles," a structure thought to represent an important intermediate in the egress of hsv-1 from the nucleus out of the cell 106. in contrast, in wild-type cells infected with hsv-1Δ34.5, most cytoplasmic virions are localized within autophagosomes that contain a mix of different cytoplasmic constituents (figure 4a). pkr-deficient mefs or pkr-deficient neurons infected with hsv-1Δ34.5 have very few autophagosomes, and appear similar to wild-type cells infected with wild-type hsv-1, with randomly dispersed intracytoplasmic virions and numerous viral vacuoles. together, these observations demonstrate that hsv-1 is degraded by autophagy, that hsv-1 icp34.5 antagonizes the cellular autophagic degradation of hsv-1, and that this process requires pkr.
recent biochemical analyses confirm that pkr signaling and hsv-1 icp34.5 regulate viral protein degradation 99. hsv-1 protein degradation is significantly accelerated in wild-type mefs infected with hsv-1Δ34.5 as compared to wild-type mefs infected with wild-type hsv-1, indicating that icp34.5 delays viral protein degradation. however, in autophagy-deficient pkr-/- mefs or eif2α s51a mutant mefs, the rate of hsv-1 protein degradation is similar in hsv-1- and hsv-1Δ34.5-infected cells, indicating that hsv-1 protein degradation is positively regulated by the pkr signaling pathway.
thus, the eif2α kinase-dependent autophagy signaling pathway not only regulates the degradation of long-lived cellular proteins but also regulates the degradation of viral proteins. accordingly, it seems logical to speculate that pkr-dependent autophagic degradation of viruses inhibits viral replication and is an antiviral defense mechanism. however, the relative contributions of the effects of pkr on viral protein synthesis and the effects of pkr on viral protein degradation in the regulation of hsv-1 replication have not yet been assessed. for this purpose, it will be necessary to selectively inhibit the autophagic protein degradation machinery and/or to have hsv-1 mutant viruses that selectively block specific downstream functions regulated by pkr. it will also be important to determine whether pkr-dependent autophagy degrades and inhibits the replication of viruses other than hsv-1. preliminary observations indicate that pkr-dependent autophagy does lead to the degradation of another neurotropic virus, sindbis virus, an enveloped, positive-strand rna virus in the alphavirus genus 107 (figure 4b).
the role of pkr in antiviral innate immunity is well established, but pkr regulates many different cellular processes, and it is not yet known exactly what role autophagy induction plays in the antiviral effects of pkr. however, the concept that autophagy is important in innate immunity is more directly supported by studies involving components of the mammalian and plant autophagic machinery. the first identified mammalian autophagy gene product, beclin 1, was isolated in a yeast two-hybrid screen, in the context of studies of the mechanism by which the antiapoptotic protein, bcl-2, protects mice against lethal sindbis virus encephalitis 37.
similar to the neuroprotective effects of bcl-2 108, enforced neuronal expression of wild-type beclin 1 in a recombinant chimeric sindbis virus vector reduces sindbis virus replication, reduces sindbis virus-induced apoptosis, and protects mice against lethal sindbis virus encephalitis 37. mutations in the bcl-2-binding domain of beclin 1 and mutations in other regions of beclin 1 that block its autophagy function also block its protective effects during sindbis virus infection 37, 62. thus, it appears that both the interaction with bcl-2 and the autophagy function may be required for the antiviral effects of beclin 1. however, further studies are required to define the precise mechanism by which beclin 1 inhibits viral replication and virus-induced apoptosis and to identify the precise role of bcl-2-beclin 1 interactions in these processes.
preliminary studies with beclin 1 null es cells and atg5 null mefs indicate a role for these two endogenous autophagy genes in innate immunity against sindbis virus infection. sindbis virus replicates to higher titers and results in accelerated death in beclin 1 null es cells as compared to wild-type control es cells and in atg5 null mefs as compared to wild-type control mefs 38. in the case of sindbis virus infection, it is not known whether the acceleration of virus-induced death in beclin 1 null or atg5 null cells is a result of increased viral replication or of independent effects of atg gene deficiency on cell death. however, studies comparing hsv-1 infection in wild-type and beclin 1-/- es cells suggest that beclin 1 can protect against virus-induced cell death in the absence of inhibitory effects on viral replication 38. together, the studies of sindbis virus infection in neurons overexpressing beclin 1 or in cultured cells lacking beclin 1 or atg5 demonstrate a role for mammalian autophagy genes both in restricting viral replication and in protecting against virus-induced cell death.
it will be important to examine whether other autophagy genes have a similar antiviral function and whether autophagy genes also protect against other types of virus infections.
the mechanisms by which autophagy genes exert protective effects in sindbis virus infection are not yet known. presumably, the autophagic breakdown of viral components leads to decreased viral yields. however, it is also possible that autophagy leads to the breakdown of cellular components required for viral replication. as noted above, the protective effects against cell death may be secondary to inhibitory effects on viral replication. alternatively, the protective effects may relate to the nutrient recycling functions or "damage control" functions of autophagy, or in the case of beclin 1, to interactions between autophagy proteins and antiapoptotic pathways. it is also possible that autophagy may protect against cell death by degrading specific viral proteins (e.g. the sindbis virus e1 and e2 envelope glycoproteins 109) that are involved in triggering the apoptotic pathway.
the protective effects of beclin 1 and atg5 on virus-induced cell death are consistent with the "pro-survival" effects of autophagy during nutrient starvation and other forms of environmental stress. it is not yet clear how to reconcile these pro-survival effects with the view that autophagy represents an alternative form of non-apoptotic programmed cell death (reviewed in 2, 25, 110). while the primary basis for this view has been morphologic correlations between the presence of autophagic vacuoles and dying cells (reviewed in 2, 111), recent genetic experiments establish a more direct role for autophagy genes in certain types of programmed cell death. rnai against mammalian atg7 and beclin 1 blocks cell death in fibroblast and macrophage cell lines treated with the caspase inhibitor, zvad 112, and rnai against atg6, atg7, and atg12 blocks salivary gland destruction during drosophila development 113.
thus, the relationship between autophagy, cell survival, and cell death is quite complex and likely varies according to the cell type and the specific physiological or pathophysiological setting. it remains to be determined whether autophagy genes primarily play a protective role in preventing cell death during virus infection, or whether they also participate directly in cell death that is induced by certain viruses.
the plant homologue of beclin 1 also functions in antiviral host defense in plants. similar to mammalian beclin 1, plant beclin 1 restricts viral replication; tobacco mosaic virus replication is increased in beclin 1-silenced tobacco plants as compared to vector-treated control plants 40. however, in contrast to mammalian beclin 1, which prevents the death of sindbis virus- and hsv-1-infected cells, plant beclin 1 plays an interesting role in preventing the death of uninfected cells 40.
in plants, the innate immune response during virus infection is characterized by a hypersensitive response, which is a programmed cell death response that occurs around the infected areas (reviewed in 114-116). this hypersensitive response limits virus spread and confers pathogen resistance. it is triggered by a pathogen-encoded avirulence protein, which is recognized by a specific plant cognate resistance protein, termed an r protein. in plants lacking r proteins, there is uncontrolled virus spread and pathogen sensitivity.
a tobacco mosaic virus infection model has recently been used to study the role of autophagy genes in plant innate immunity. during tobacco mosaic virus infection, the hypersensitive response is triggered by the tobacco mosaic virus protein, tmv p50, which is the helicase domain of the viral replicase 117. tmv p50 is recognized by an r protein (called the n protein) of n. benthamiana, which is composed of a toll/interleukin 1 receptor domain, a nucleotide binding domain, and a leucine-rich repeat domain 118.
therefore, in tobacco plants containing the n protein (n+/+), there is local cell death in cells that are either infected with tmv or that express the tmv p50 protein, but no systemic illness is observed.
beclin 1 silencing in n+/+ tobacco plants reveals a striking role for this autophagy gene in limiting the spread of cell death during the hypersensitive response 40. during tmv infection of these plants, cell death begins as discrete and defined foci but continues to spread beyond the site of tmv infection until there is death of the entire inoculated leaf and other uninoculated leaves. a similar spreading cell death phenotype is seen with local expression of the tmv p50 protein, suggesting that the cell death occurs in response to a specific signal triggered by the pathogen-encoded avirulence protein, and is not due to increased tmv replication or altered virus movement. in addition, in plants that lack the n gene, beclin 1 silencing does not lead to cell death after tmv infection. moreover, beclin 1 silencing also results in spreading cell death during the hypersensitive response triggered by bacterially-encoded pathogen avirulence proteins. together, these observations demonstrate that the spreading cell death phenotype in beclin 1-silenced plants depends on r gene-mediated innate immune responses and that beclin 1 is an important negative regulator of cell death during the plant innate immune response.
a similar role for other autophagy genes in limiting the spread of programmed cell death during tmv infection has also been observed. as discussed in section 2.3.3, pi3-k/vps34 is a protein that physically interacts with atg6/beclin 1 in yeast and mammals and is essential for proper autophagosome formation. interestingly, silencing of the plant class iii pi3-k/vps34 in n+/+ plants results in a spreading cell death phenotype during tmv infection that is similar to that observed with beclin 1 silencing.
as discussed in section 2.3.3, yeast and mammalian atg3 and atg7 are essential for conjugation reactions needed for autophagosome formation, and silencing of the plant homologues of these genes also results in a spreading cell death phenotype after tmv infection. thus, multiple different autophagy genes, including those that act in the vesicle nucleation stage (e.g. beclin 1, class iii pi3-k/vps34) and those that act in the vesicle expansion stage (e.g. atg3 and atg7), are necessary to prevent the spread of cell death during the plant innate immune response.
although plant autophagy genes protect uninfected cells against death, whereas mammalian autophagy genes protect infected cells against death during virus infection, the plant data nonetheless further support a "pro-survival", rather than "pro-death", function of autophagy genes during viral infections. at present, it is not yet clear how autophagy genes protect uninfected cells against death during the plant hypersensitive response. one possibility is that the absence of autophagy genes in uninfected cells somehow modifies the r gene-mediated signal transduction pathway in a way that instructs uninfected cells to die. an alternative, perhaps more likely, possibility is that the absence of autophagy genes in uninfected cells renders them more susceptible to pro-death signals emitted from infected cells. regardless of the mechanism, this newly defined role for autophagy genes in preventing the spread of cell death during plant innate immunity has significant implications for understanding the role of autophagy in systemic protection against viral infections. an important question is whether autophagy genes play a similar role during animal virus infections.
the evolutionarily conserved function of both mammalian and plant autophagy genes in restricting viral replication and/or protecting against cell death suggests an essential role for autophagy in innate immunity.
this concept is further supported by recent observations indicating that the herpes simplex virus neurovirulence protein, icp34.5, possesses multiple mechanisms to disarm host autophagy. it can both antagonize the pkr signaling pathway required for autophagy induction and inhibit the function of one component of the autophagic machinery, beclin 1. as discussed in section 4.1, icp34.5 blocks pkr-dependent, eif2α ser-51-dependent autophagic degradation of cellular and viral components in hsv-1-infected mefs and neurons 99. one predicted mechanism by which icp34.5 blocks pkr-dependent autophagy is through its known ability to promote the dephosphorylation of eif2α via interactions with pp1α 105. however, new evidence also suggests a second potential mechanism. roizman et al. isolated the mammalian autophagy protein, beclin 1, in a yeast two-hybrid screen using icp34.5 as a bait 119. subsequent studies have shown that icp34.5 directly interacts with beclin 1 in mammalian cells and inhibits the ability of beclin 1 to rescue autophagy in autophagy-defective atg6 null yeast and in autophagy-defective human mcf7 breast carcinoma cells 38. since icp34.5 binds to beclin 1 via a domain that is distinct from its pp1α-binding domain, it should be possible to construct hsv-1 viruses containing mutations in icp34.5 that help differentiate between the role of pp1α-binding (and eif2α phosphorylation) and the role of beclin 1-binding in hsv-1 icp34.5-mediated neurovirulence.
besides hsv-1 icp34.5, there are numerous other viral proteins or rnas that suppress pkr signaling through a variety of different mechanisms (reviewed in 100, 101). for example, vaccinia virus e3, influenza virus ns1, hsv-1 us11, reovirus σ3, and rotavirus nsp3 are double-stranded (ds) rna-binding proteins that prevent pkr activation. adenovirus vai rnas and hiv tar rnas bind to dsrna substrates and inhibit pkr.
hepatitis c virus ns5a protein inhibits the dimerization of pkr, and influenza virus recruits a cellular protein, p58ipk, that directly interacts with pkr and inhibits its dimerization. the vaccinia virus k3l, hepatitis c virus e2, and hiv tat proteins act as pseudosubstrates of pkr. as yet, the role of these other viral rnas and proteins in autophagy inhibition has not been investigated. however, given the evolutionarily conserved requirement for an intact eif2α kinase signaling pathway in autophagy induction, the prediction is that these other viral inhibitors of pkr, like hsv-1 icp34.5, also function as antagonists of host autophagy. further studies are needed to test this prediction and to study the role of this predicted antagonism of host autophagy in the pathogenesis of diseases caused by these other important viral pathogens that encode putative autophagy inhibitors.
not only may other viruses antagonize the autophagy function of pkr, but other viral gene products may also antagonize the autophagy function of specific mammalian atg genes. beclin 1 was originally isolated in a yeast two-hybrid screen with the cellular anti-apoptotic protein, bcl-2 37. subsequently, beclin 1 has also been shown to interact with viral bcl-2-like proteins encoded by different gammaherpesviruses, including ebv-encoded bhrf1, kshv-encoded v-bcl-2, and murine γhv68-encoded m11 39. like icp34.5, these viral proteins can also inhibit the autophagy function of beclin 1 in yeast and mammalian assays. in addition, preliminary evidence indicates that other kshv-encoded proteins may interact with other specific atg proteins 65.
an as-yet unexplored area is whether viruses also inhibit autophagy by activating autophagy inhibitory signaling pathways. as noted in sections 2.2 and 3.2, the class i pi3-k/akt signaling pathway negatively regulates autophagy in both mammalian cells and c. elegans 19, 20, 120, and several different viruses activate this pathway.
certain oncogenic retroviruses encode the catalytic subunit of pi3-k and akt (reviewed in 22). in addition, the ebv latent membrane proteins, lmp1 and 2a, the hepatitis b virus protein, hbx, the kaposi's sarcoma virus protein, k1, and the hepatitis c virus protein, ns5a, all activate the pi3-k/akt signaling pathway 72-77. presumably, such activation plays a role in autophagy inhibition, although this has not yet been formally tested.
while further studies are required to define more precisely the interactions between viral gene products and autophagy regulatory signals and autophagy proteins, there is accumulating evidence that viruses do target multiple different steps of the host autophagy pathway. this observation strongly suggests an evolutionary advantage for viruses to inhibit host autophagy, and by extrapolation, a beneficial role for host autophagy in defense against viral infections.
some viruses appear to have even further outsmarted host autophagy. rather than merely devising strategies to block host autophagy, certain positive-strand rna viruses have figured out ways to "co-opt" elements of the autophagy pathway to promote their own replication. this subject has been recently reviewed in detail elsewhere 13 and will therefore only be briefly summarized in this section.
as early as 1965, electron microscopic studies of poliovirus-infected cells demonstrated the presence of large numbers of membranous vesicles that were postulated to develop by an autophagic-like mechanism 121. more recently, work by kirkegaard et al. has extended these findings to show that poliovirus replication complexes are associated with double-membrane vesicles that resemble autophagosomes, in that they (1) have similar double membrane-bound morphology; (2) have low buoyant density; and (3) label with the autophagosome marker, gfp-lc3, and the lysosome marker, lamp1 13, 122.
unlike classical autophagosomes, these autophagic-like vesicles do not appear to have a destructive role or mature into degradative compartments. in support of this, treatment with the autophagy inducers, rapamycin or tamoxifen, increases, rather than decreases, poliovirus growth 13. furthermore, these double-membrane vesicles are also different from classical autophagosomes in that they contain sec13 and sec31, components of the anterograde transport system that bud from the er 123. therefore, it is possible that poliovirus-induced vesicles arise from an alternate source rather than the pas, but still share some of the same characteristics of classical autophagosomes (e.g. gfp-lc3 and lamp1 labeling, augmentation with rapamycin treatment). similar to the replication vacuoles that are associated with certain intracellular bacterial pathogens (e.g. legionella pneumophila) 124, the poliovirus-induced double-membrane vesicles likely originate from the er 123. furthermore, these poliovirus-induced vesicles seem to have a different function from autophagosomes (i.e. they are pro-replicative, rather than degradative, compartments). these observations suggest that poliovirus may promote its own replication by inducing dynamic membrane rearrangements that share certain features of the autophagy pathway (e.g. formation of sequestering double-membrane vesicles, presence of overlapping markers) but avoid other unwanted features of the autophagy pathway (e.g. maturation into degradative compartments). of note, specific poliovirus proteins, including 2bc and 3a, have been identified that are sufficient for the induction of these "autophagic-like" double-membrane bound vesicles 122. however, the mechanisms by which these proteins induce the formation of such vesicles are not yet known.
a recent study with the coronavirus, murine hepatitis virus (mhv), has provided more direct evidence that components of the autophagic machinery can be utilized for rna virus replication 51. mhv replication complexes localize to double-membrane vesicles (that are also thought to arise from the er) (figure 5a) and they co-localize with certain autophagy proteins, including lc3 and atg12. in mhv-infected atg5-/- es cells, double-membrane vesicles are not detected (figure 5b), and viral replication is dramatically reduced. these observations provide the first genetic demonstration that proteins necessary for autophagic vacuole formation are also required for maximal levels of viral replication. thus, mhv, and potentially other viruses that replicate in association with double-membrane vesicles (e.g. poliovirus, equine arterivirus), utilize components of cellular autophagy to foster their own growth. presumably, the atg protein conjugation system (involving atg5) that plays a role in autophagic vesicle expansion and completion also plays a role in the formation of double-membrane vesicles involved in viral replication. it is not yet known whether the entire autophagic machinery or only selective components of the autophagic machinery are used for the formation of double-membrane vesicles that are associated with viral replication complexes.
these observations with poliovirus and mhv represent two examples of how viruses can "subvert" elements of the host autophagy pathway to promote their own intracellular growth. in these infections, rna replication complexes are observed in association with "autophagic-like" double-membrane vesicles but not in association with degradative autophagosomes.
it is not clear whether this represents (1) fundamental differences in the host pathways leading to the formation of "autophagic-like" double-membrane vesicles and classical degradative autophagosomes; (2) the diversion of the autophagic machinery away from the formation of classical degradative autophagosomes and towards the formation of "autophagic-like" double-membrane vesicles; or (3) specific viral mechanisms to antagonize the maturation of "autophagic-like" double-membrane vesicles into mature degradative autophagosomes. interestingly, however, mhv infection does lead to the induction of atg5-dependent long-lived cellular protein degradation, ruling out the hypothesis that the autophagic machinery is entirely diverted to form membranes required for viral replication complexes. perhaps mhv possesses as-yet undefined mechanisms to shield its replication complexes from autophagic degradation.
although research in this area is still in its infancy, it seems likely that the lysosomal degradation pathway of autophagy plays an evolutionarily conserved role in antiviral immunity. the interferon-inducible, antiviral pkr signaling pathway positively regulates autophagy, and both mammalian and plant autophagy genes restrict viral replication and protect against virus-induced cell death. given this role of autophagy in innate immunity, it is not surprising that viruses have evolved numerous strategies to inhibit host autophagy. different viral gene products can either modulate autophagy regulatory signals or directly interact with components of the autophagy execution machinery. moreover, certain rna viruses have managed to "co-opt" the autophagy pathway, selectively utilizing certain components of the dynamic membrane rearrangement system to promote their own replication inside the host cytoplasm.
in addition to this newly emerging role of autophagy in innate immunity, autophagy plays an important role in many other fundamental biological processes, including tissue homeostasis, differentiation and development, cell growth control, and the prevention of aging. accordingly, the inhibition of host autophagy by viral gene products has important implications not only for understanding mechanisms of immune evasion, but also for understanding novel mechanisms of viral pathogenesis. it will be interesting to dissect the role of viral inhibition of autophagy in acute, persistent, and latent viral replication, as well as in the pathogenesis of cancer and other medical diseases.
development by self-digestion: molecular mechanisms and biological functions of autophagy
development by self-digestion: molecular mechanisms and biological functions of autophagy
autophagy: a regulated bulk degradation process inside cells
the molecular mechanism of autophagy
functions of lysosomes
a unified nomenclature for yeast autophagy-related genes
diversity of signaling controls of macroautophagy in mammalian cells
the pre-autophagosomal structure organized by concerted functions of apg genes is essential for autophagosome formation
convergence of multiple autophagy and cytoplasm to vacuole targeting components to a perivacuolar membrane compartment prior to de novo vesicle formation
yeast autophagosomes: de novo formation of a membrane structure
cellular autophagy: surrender, avoidance and subversion by microorganisms
nonselective autophagy of cytosolic enzymes by isolated rat hepatocytes
autophagy as a regulated pathway of cellular degradation
the mitochondrial permeability transition initiates autophagy in rat hepatocytes
pkr: a sentinel kinase for cellular stress
regulation of starvation- and virus-induced autophagy by the eif2α kinase signaling pathway
distinct classes of phosphatidylinositol 3'-kinases are involved in signaling pathways that control macroautophagy in ht-29 cells
autophagy genes are essential for dauer development and lifespan extension in c. elegans
oncogenic kinase signalling
promotion of tumorigenesis by heterozygous disruption of the beclin 1 gene
beclin 1, an autophagy gene essential for early embryonic development, is a haploinsufficient tumor suppressor
autophagy as a cell death and tumor suppressor mechanism
defective autophagy leads to cancer
molecular dissection of autophagy: two ubiquitin-like systems
isolation and characterization of autophagy-defective mutants of saccharomyces cerevisiae
tor-mediated induction of autophagy via an apg1 protein kinase complex
chemical genetic analysis of apg1 reveals a non-kinase role in the induction of autophagy
phosphatidylinositol 3-kinase encoded by yeast vps34 gene essential for protein sorting
3-methyladenine: specific inhibitor of autophagic/lysosomal protein degradation in isolated rat hepatocytes
the phosphatidylinositol 3-kinase inhibitors wortmannin and ly294002 inhibit autophagy in isolated rat hepatocytes
the hansenula polymorpha pdd1 gene product, essential for the selective degradation of peroxisomes, is a homologue of saccharomyces cerevisiae vps34p
two distinct vps34 phosphatidylinositol 3-kinase complexes function in autophagy and carboxypeptidase y sorting in saccharomyces cerevisiae
inhibition of autophagy in mitotic animal cells
protection against fatal sindbis virus encephalitis by beclin, a novel bcl-2-interacting protein
unpublished data
unpublished data
autophagy genes are essential for limiting the spread of programmed cell death associated with plant innate immunity
a protein conjugation system essential for autophagy
a ubiquitin-like system mediates protein lipidation
the x-ray crystal structure and putative ligand-derived peptide binding properties of gamma-aminobutyric acid receptor type a receptor-associated protein
solution structure of human gaba(a) receptor-associated protein gabarap: implications for biological function and its regulation
the crystal structure of microtubule-associated protein light chain 3, a mammalian homologue of saccharomyces cerevisiae atg8
formation process of autophagosome is traced with apg8/aut7p in yeast
hsatg4b/hsapg4b/autophagin-1 cleaves the carboxyl termini of three human atg8 homologues and delipidates lc3- and gabarap-phospholipid conjugates
the reversible modification regulates the membrane-binding state of apg8/aut7 essential for autophagy and the cytoplasm to vacuole targeting pathway
apg7p/cvt2p: a novel protein-activating enzyme essential for autophagy
lc3, a mammalian homologue of yeast apg8p, is localized in autophagosome membranes after processing
coronavirus replication complex formation utilizes components of cellular autophagy
apg9/cvt7p is an integral membrane protein required for transport vesicle formation in the cvt and autophagy pathways
the atg1-atg13 complex regulates atg9 and atg23 retrieval transport from the preautophagosomal structure
in vivo analysis of autophagy in response to nutrient starvation using transgenic mice expressing a fluorescent autophagy marker
autophagy in health and disease: a double-edged sword
functions of lysosomes
macroautophagy is required for multicellular development of the social amoeba dictyostelium discoideum
dictyostelium macroautophagy mutants vary in the severity of their developmental defects
leaf senescence and starvation-induced chlorosis are accelerated by the disruption of an arabidopsis autophagy gene
the apg8/12-activating enzyme apg7 is required for proper nutrient recycling and senescence in arabidopsis thaliana
unpublished data
translation initiation: adept at adapting
transcriptional profiling shows that gcn4p is a master regulator of gene expression during amino acid starvation in yeast
bhrf1, a viral homologue of the bcl-2 oncogene, disturbs epithelial cell differentiation
reduced rates of proteolysis in transformed cells
role of the vacuolar apparatus in augmented protein degradation in cultured fibroblasts
inhibitors of mammalian target of rapamycin as novel antitumor agents: from bench to clinic
human papillomavirus type 6b virus-like particles are able to activate the ras-map kinase pathway and induce cell proliferation
epstein-barr virus lmp2a transforms epithelial cells, inhibits cell differentiation, and activates akt
latent membrane protein 2a inhibits transforming growth factor-beta 1-induced apoptosis through the phosphatidylinositol 3-kinase/akt pathway
differential signaling pathways are activated in the epstein-barr virus-associated malignancies nasopharyngeal carcinoma and hodgkin lymphoma
hepatitis b viral hbx induces matrix metalloproteinase-9 gene expression through activation of erk and pi-3k/akt pathways: involvement of invasive potential
the k1 protein of kaposi's sarcoma-associated herpesvirus activates the akt signaling pathway
the hepatitis c virus ns5a protein activates a phosphoinositide 3-kinase-dependent survival signaling cascade
cloning and genomic organization of beclin 1, a candidate tumor suppressor gene on chromosome 17q21
induction of autophagy and inhibition of tumorigenesis by beclin 1
beclin 1 contains a leucine-rich nuclear export signal that is required for its autophagy and tumor suppressor function
unpublished data
autophagy and aging--when "all you can eat" is yourself
the anti-ageing effects of caloric restriction may involve stimulation of macroautophagy and lysosomal degradation, and can be intensified pharmacologically
genetic pathways that regulate ageing in model organisms
mitochondria and hiv infection: the first decade
oxidative stress in the absence of inflammation in a mouse model for hepatitis c virus-associated carcinogenesis
pre-s mutant surface antigens in chronic hepatitis b virus infection induce oxidative stress and dna damage
mitochondrial autophagy and injury in the liver in α1-antitrypsin deficiency
aggregate-prone proteins with polyglutamine and polyalanine expansions are degraded by autophagy
inhibition of mtor induces autophagy and reduces toxicity of polyglutamine expansions in fly and mouse models off huntington disease identification and characterization of pancreatic eukaryotic initiation factor 2 alpha-subunit kinase, pek, involved in translational control protein translation and folding are coupled by an endoplasmic-reticulum-resident kinase endoplasmic reticulum stress is a determinant of retrovirus-induced spongiform neurodegeneration activation of endoplasmic reticulum stress signaling pathway is associated with neuronal degeneratio in momulv-ts1-induced spongiform encephalopathy endoplasmic reticulum (er) stress induced by a neurovirulent mouse retrovirus is associated with prolonged bip binding and retention of a viral protein in the er different types of ground glass hepatocytes in chronic hepatitis b virus infection contain specific pre-s mutants that may induce endoplasmic reticulum stress replication of a cytopathic strain of bovine viral diarrhea virus activates perk and induces endoplasmic reticulum stress-mediated apoptosis of mdbk cells possible involvment of both endoplasmic reticulum-and mitochondriadependent pathways in momulv-ts1-induced apoptosis in astrocytes pkr-dependent autophagic degradation of herpes simplex virus type 1 molecular mechanisms of interferon resistance mediated by viral-directed inhibition of pkr, the interferon-induced protein kinase com: maneuvering the internetworks of viral neuropathogenesis and evasion of the host defense mapping of herpes simplex virus-1 neurovirulence to g134.5, a gene nonessential for growth in culture specific phenotypic restoration of an attenuated virus by knockout of a hostresistance gene association of a m(r) 90,000 phosphoprotein with protein kinase pkr in cells exhibiting enhanced phosphorylation of translation initiation factor eif-2 alpha and premature shutoff of protein synthesis after infection with gamma 134.5-mutants of herpes simplex virus 1 the (1)34.5 protein of herpes 
simplex virus 1 complexes with protein phosphatase 1 to dephosphorylate the subunit of the eukaryotic translation initiation factor 2 and preculde the shutoff of protein synthesis by double-stranded rna-activated protein kinase electron microscopic observations on the development of herpes simplex virus unpublished data bcl-2 protects mice against fatal alphavirus encephalitis the transmembrane domains of sindbis virus envelope glycoproteins induce cell death the autophagosomal-lysosomal compartment in programmed cell death regulation of an atg7-beclin 1 program of autophagic cell death by caspase 8 american society for cell biology 44th annual meeting plant pathogens and integrated defence responses to infection plant innate immunity -direct and indirect recognition of general and specific pathogen-associated molecules controlled cell death, plant survival and development the helicase domain of the tmv replicase proteins induces the nmediated defense response in tobacco the product of the tobacco mosaic virus resistance gene n: similarity to toll and the interleukin-1 receptor the tumor suppressor pten positively regulates macroautophagy by inhibiting the phosphatidylinositol 3-kinase/protein kinase b pathway electron microscopic study of the formation of poliovirus remodeling the endoplasmic reticulum by poliovirus infection and by individual viral proteins: an autophagy-like origin for virus-induced vesicles cellular coopii proteins are involved in production of the vesicles that form the poliovirus replication complex apg1p, a novel protein kinase required for the autophagic process in saccharomyces cerevisiae apg14p and apg6/vps30p form a protein complex essential for autophagy in the yeast, saccharomyces cerevisiae the drosophila homolog of aut1 is essential for autophagy and development human apg3p/aut1p homologue is an authentic e2 enzyme for multiple substrates, gate-16, gabarap, and map-lc3, and facilitates the conjugation of hapg12p to hapg5p the loss of 
drosophila apg4/aut2 function modifies the phenotypes of cut and notch signaling pathway pathway mutants a single protease, apg4b, is specific for the autophagy-related ubiquitin-like proteins gate-16, map1-lc3, gabarap, and apg8l dissection of autophagosome formation using apg5deficient mouse embryonic stem cells a new protein conjugation system in human. the counterpart of the yeast apg12p conjugation system essential for autophagy structural and functional analyses of apg5, a gene involved in autophagy in yeast the human homolog of saccharomyces cerevisiae apg7p is a protein-activating enzyme for multiple substrates including human apg12p, gate-16, gabarap, and map-lc3 glucose-induced autophagy of peroxisomes in pichia pastoris requires a unique e1-like protein mouse apg10 as an apg12conjugating enzyme: analysis by the conugation-mediated yeast two-hybrid method apg10p, a novel protein-conjugating enzyme essential for autophagy in yeast murine apg12p has a substrate preference for murine apg7p over three apg8p homologs apg16p is required for the function of the apg12p-apg5p conjugate in the yeast autophagy pathway cvt18/gas12 is required for cytoplasm-to-vacuole transport, pexophagy, and autophagy in saccharomyces cerevisiae and pichia pastoris the work done in the authors' laboratories was supported by nih ro1 grants ai51367 and ai44157 to b.l, and an nsf plant genome grant dbi-0116076 and nih ro1 grant gm62625 to s.p.d-k. 
key: cord-003297-fewy8y4a authors: wang, ming-yang; liang, jing-wei; mohamed olounfeh, kamara; sun, qi; zhao, nan; meng, fan-hao title: a comprehensive in silico method to study the qstr of the aconitine alkaloids for designing novel drugs date: 2018-09-18 journal: molecules doi: 10.3390/molecules23092385 sha: doc_id: 3297 cord_uid: fewy8y4a a combined in silico method was developed to predict potential protein targets that are involved in cardiotoxicity induced by aconitine alkaloids and to study the quantitative structure–toxicity relationship (qstr) of these compounds. for the prediction research, a protein-protein interaction (ppi) network was built by extracting information about protein interactions connected with aconitine cardiotoxicity from nearly a decade of literature and the string database. the software cytoscape and the pharmmapper server were utilized to screen for essential proteins in the constructed network. the calcium-calmodulin-dependent protein kinase ii alpha (camk2a) and gamma (camk2g) were identified as potential targets. to obtain deeper insight into the relationship between the toxicity and the structure of aconitine alkaloids, the present study utilized qsar models built in sybyl software that show internal robustness and high external predictivity. the molecular dynamics simulation carried out here has demonstrated that aconitine alkaloids bind stably to the receptor camk2g. in conclusion, this comprehensive method will serve as a tool for guiding structural modification of the aconitine alkaloids and will lead to better insight into the cardiotoxicity induced by compounds that have structures similar to their derivatives. the rhizomes and roots of aconitine species, a genus of the family ranunculaceae, are commonly used in the treatment of various illnesses such as collapse, syncope, rheumatic fever, joint pain, gastroenteritis, diarrhea, edema, bronchial asthma, and tumors. 
they are also involved in the management of endocrinal disorders such as irregular menstruation [1, 2] . however, the usefulness of this aconitine species component is intermingled with toxicity after it is administered to a patient. so far, a few articles have reported the misuse of aconitine medicinals, emphasizing that such misuse can result in severe cardio- and neurotoxicity [3] [4] [5] [6] [7] . our past research showed that the aconitine component is the main active ingredient in this species' root and rhizome and is responsible for both therapeutic and toxic effects [8] . this medicinal has been tested for anticancer and dermatological activities. it was shown to slow tumor growth and to cure serious cases of dermatosis. it was also found to have an effect on postoperative analgesia [9] [10] [11] [12] . however, a previous safety study has revealed that aconitine toxicity is responsible for its restriction in clinical settings. further studies are needed to explain the cause of aconitine toxicity as well as to show whether the toxicity outweighs its usefulness. a combined network analysis and in silico study was once performed to obtain insight into the relationship between aconitine alkaloid toxicity and the aconitine structure, and it was found that the cardiotoxicity of aconitine is the primary cause of patient death. aconitine poisoning has been linked to several pivotal proteins such as the ryanodine receptor (ryr1 and ryr2), the gap junction α-1 protein (gja1), and the sodium-calcium exchanger (slc8a1) [9] [10] [11] [12] . however, among all existing studies of the aconitine medicinal, none has reported in detail the specific binding target protein linked to its toxicity. 
protein-protein interactions (ppis) participate in many metabolic processes occurring in living organisms, such as cellular communication, immunological response, and gene expression control [13, 14] . a systematic description of these interactions aids in the elucidation of interrelationships among targets. the targeting of ppis with small-molecule compounds is becoming an essential step in a mechanism study [14] . the present study was designed and undertaken to identify the critical protein that can affect the cardiotoxicity of aconitine alkaloids. a ppi network built from the string database describes highly specific physical contacts established between protein molecules, and its interactions stem from computational prediction, knowledge transfer between organisms, and interactions aggregated from other databases [15] . the analysis of the ppi network is based on nodes and edges and is always performed via cluster analysis and centrality measurements [16, 17] . in cluster analysis, highly interconnected nodes and protein target nodes are divided and used to form sub-graphs. the reliability of the ppi network is identified by the content of each sub-graph [18] . the centrality measurements reflect the quantitative relationship between the protein targets and their weight in the network [18] . hence, a ppi network with protein targets related to aconitine alkaloid cardiotoxicity should enable us to find the most relevant protein for aconitine toxicity and to understand the mechanism at the network level. in our research, the evaluation and visualization analysis of essential proteins related to cardiotoxicity in ppis were performed with the clusterone and cytonca plugins in cytoscape 3.5, and potential protein targets were then sought by combining these results with the integrated pharmacophore matching technology built into the pharmmapper platform. 
structural modification of a familiar natural product, active compound, or clinical drug is an efficient method for designing a novel drug. the main purpose of the structural modification is to reduce the toxicity of the target compound while enhancing the utility of the drug [19] . the identification of the structure-function relationship is an essential step in drug discovery and design, and the determination of the 3d protein structure is the key step in identifying the internal interactions in the ligand-receptor complexes. x-ray crystallography and nmr were the only accepted techniques for determining the 3d protein structure. although the 3d structures obtained by these two powerful techniques are accurate and reliable, the techniques are time-consuming and costly [20] [21] [22] [23] [24] . with the rapid development of structural bioinformatics and computer-aided drug design (cadd) techniques in the last decade, computational structures are becoming increasingly reliable. the application of structural bioinformatics and cadd techniques can improve the efficiency of this process [25] [26] [27] [28] [29] [30] [31] [32] [33] [34] . the ligand-based quantitative structure-toxicity relationship (qstr) and receptor-based docking technology are regarded as effective and useful tools in the analysis of structure-function relationships [35] [36] [37] [38] . the contour maps around aconitine alkaloids generated by comparative molecular field analysis (comfa) and comparative molecular similarity index analysis (comsia) were combined with the interactions between ligand substituents and amino acids obtained from docking results to gain insight into the relationship between the structure of aconitine alkaloids and their toxicity. scoring functions were used to evaluate the docking result. the value-of-fit score in moe software reflects the binding stability and affinity of the ligand-receptor complexes. 
when screening for the most likely target of cardiotoxicity, the experimental data were combined with the value-of-fit score by the ndcg (normalized discounted cumulative gain). the likelihood that a protein is a target of cardiotoxicity corresponds to how consistent its fit-score ranking is with the experimental data. since the pioneering paper entitled "the biological functions of low-frequency phonons" [39] was published in 1977, many investigations of biomacromolecules from a dynamic point of view have occurred. these studies have suggested that low-frequency (or terahertz frequency) collective motions do exist in proteins and dna [40] [41] [42] [43] [44] . furthermore, many important biological functions in proteins and dna and their dynamic mechanisms, such as cooperative effects [45] , the intercalation of drugs into dna [42] , and the assembly of microtubules [46] , have been revealed by studying the low-frequency internal motions, as summarized in a comprehensive review [40] . some scientists have even applied this kind of low-frequency internal motion to medical treatments [47, 48] . investigation of the internal motion in biomacromolecules and its biological functions is deemed a "genuinely new frontier in biological physics," as announced in the mission of some biotech companies (see, e.g., vermont photonics). in addition to the static structural information of the ligand-receptor complex, dynamical information should also be considered in the process of drug discovery [49, 50] . finally, molecular dynamics was carried out to verify the binding affinity and stability of aconitine alkaloids and the most likely target. the present study may be instrumental in our future studies of the synergism and attenuation of aconitine alkaloids and of the exploitation of their clinical application potential. a flowchart of procedures in our study is shown in figure 1 .
figure 1. the whole framework of the comprehensive in silico method for screening potential targets and studying the quantitative structure-toxicity relationship (qstr).
the 33 compounds were aligned by superimposition on the common moiety, with compound 6 as the template. 
the statistical parameters for database alignment (q², r², f, and see) are shown in table 1 . the comfa model with the optimal number of 6 components presented a q² of 0.624, an r² of 0.966, an f of 124.127, and an see of 0.043, and the contributions of the steric and electrostatic fields were 0.621 and 0.379, respectively. the comsia model with the optimal number of 4 components presented a q² of 0.719, an r² of 0.901, an f of 157.458, and an see of 0.116, and the contributions of the steric, electrostatic, hydrophobic, hydrogen bond acceptor, and hydrogen bond donor fields were 0.120, 0.204, 0.327, 0.216, and 0.133, respectively. the statistical results indicate that the comfa and comsia qstr models of the aconitine alkaloids under the database alignment have adequate predictability. experimental and predicted pld50 values of both the training set and test set are shown in figure 2 , and the comfa ( figure 2a ) and comsia ( figure 2b ) models gave correlation coefficient (r²) values of 0.9698 and 0.977, respectively, which demonstrated the internal robustness and high external predictivity of the qstr models. residuals vs. leverage williams plots of the aconitine qstr models are shown in figure 3a ,b. all values of standardized residuals fall between 3σ and −3σ, and the values of leverage are less than h*, so the two models demonstrate good extensibility and predictability. 
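the williams-plot criterion described above can be sketched in a few lines of python. this is a minimal illustration rather than the sybyl procedure used in the study: it assumes a single-descriptor least-squares model, takes the warning leverage as h* = 3(p + 1)/n (a common qsar convention that the text does not spell out), and standardizes each residual by the residual standard deviation. the function names are hypothetical.

```python
import math


def williams_points(x, y):
    """Leverage and standardized residual of every compound for a
    one-descriptor least-squares fit y = a + b*x, plus the warning
    leverage h* (assumed here to be 3 * (p + 1) / n)."""
    n = len(x)
    p = 1  # number of descriptors in this toy model
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard deviation with n - p - 1 degrees of freedom.
    sigma = math.sqrt(sum(r * r for r in residuals) / (n - p - 1))
    # Diagonal of the hat matrix for simple linear regression.
    leverages = [1.0 / n + (xi - x_mean) ** 2 / sxx for xi in x]
    h_star = 3.0 * (p + 1) / n
    points = [(h, r / sigma) for h, r in zip(leverages, residuals)]
    return points, h_star


def within_domain(leverage, std_residual, h_star, limit=3.0):
    """True if a compound lies inside the box |residual| < 3 sigma, h < h*."""
    return leverage < h_star and abs(std_residual) < limit
```

a compound that falls outside the box is either a response outlier (large standardized residual) or structurally influential (leverage above h*); the text reports that no compound in either model does.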
under mesh (medical subject headings), a total of 491 articles were retrieved (261 from web of science and the others from pubmed). after selecting cardiotoxicity-related articles and excluding duplicates, 274 articles were used to extract the correlative proteins and pathways for building a ppi network in the string server. the correlative proteins or pathways are shown in table 2 . all proteins were taken as input in the string database to find their direct and functional partners [51] , and the proteins and their partners were then imported into cytoscape 3.5 to generate the ppi network with 148 nodes and 872 edges ( figure 4 ).
table 2 . proteins related to aconitine alkaloid-induced cardiotoxicity extracted from 274 articles.
protein    classification                               frequency
ryr2       ryanodine receptor 2                         19
ryr1       ryanodine receptor 1                         15
gja1       gap junction α-1 protein (connexin43)        13
slc8a1     sodium/calcium exchanger 1                   11
atp2a1     calcium transporting atpase fast twitch 1    9
kcnh2      potassium voltage-gated channel h2           7
scn3a      sodium voltage-gated channel type 3          3
scn2a      sodium voltage-gated channel type 2          3
scn8a      sodium voltage-gated channel type 8          2
scn1a      sodium voltage-gated channel type 1          2
scn4a      sodium voltage-gated channel type 4          1
kcnj3      potassium inwardly-rectifying channel j3     1
in screening for the essential proteins in the ppi network, three centrality measurements (subgraph centrality, betweenness centrality, and closeness centrality) in cytonca were utilized to evaluate the weight of nodes. after removing the central node "ac," the centrality measurements of 147 nodes were calculated by cytonca and documented in table s1 . the nodes in the top 10% of each of the three centrality measurements are painted with a different color in figure 4a . to screen for nodes with high values of all three centrality measures, nodes with three colors were overlapped and merged into sub-networks in figure 4b . in the sub-networks, the voltage-gated calcium and sodium channels accounted for a large proportion, which is consistent with our results from clustering the network (clusters 1, 2, and 9). 
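the top-10% screening step above can be sketched as follows. this is an illustration only, not the cytonca implementation: closeness centrality is computed here by breadth-first search on an unweighted graph, the other two measures are assumed to be supplied as precomputed score tables, and "overlapped" is read as the intersection of the three top-10% sets. all function names are hypothetical.

```python
import math
from collections import deque


def closeness_centrality(adj):
    """Closeness of each node in an undirected, unweighted graph given as
    {node: set_of_neighbours}: reachable nodes / sum of shortest paths."""
    scores = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            node = queue.popleft()
            for neighbour in adj[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores


def top_fraction(scores, fraction=0.10):
    """Nodes whose score ranks in the top `fraction` (at least one node)."""
    k = max(1, math.ceil(fraction * len(scores)))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def shared_top_nodes(score_tables, fraction=0.10):
    """Nodes that rank in the top `fraction` of every centrality table."""
    tops = [top_fraction(table, fraction) for table in score_tables]
    return set.intersection(*tops)
```

passing the subgraph, betweenness, and closeness tables to `shared_top_nodes` returns the candidate nodes that would carry all three colors in a plot such as figure 4a.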
all proteins in the sub-networks were used to screen the results of the pharmmapper server for the potential target of cardiotoxicity induced by aconitine alkaloids ( figure 5a ,b). in the meantime, 2v7o (camk2g) and 2vz6 (camk2a) were identified as the potential targets with higher fit scores. all compounds were docked into three potential targets. the values of ndcg are shown in table 3 . 
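how an ndcg value compares a fit-score ranking with the experimental ranking can be sketched as below. the paper does not state how relevance grades were assigned, so this sketch makes an assumption: the compound ranked first experimentally receives relevance n, the last receives 1, and the standard log2 position discount is used. it illustrates the metric and is not expected to reproduce the values in table 3 exactly; the function name is hypothetical.

```python
import math


def ndcg(experimental_rank, predicted_rank):
    """NDCG of a predicted ranking against an experimental ranking.

    Both arguments map compound id -> rank (1 = most toxic / best fit).
    """
    n = len(experimental_rank)
    # Assumed relevance grade: n for the top experimental compound, 1 for the last.
    relevance = {c: n - r + 1 for c, r in experimental_rank.items()}

    def dcg(order):
        # Discounted cumulative gain with the usual log2(position + 1) discount.
        return sum(relevance[c] / math.log2(pos + 1)
                   for pos, c in enumerate(order, start=1))

    predicted_order = sorted(predicted_rank, key=predicted_rank.get)
    ideal_order = sorted(experimental_rank, key=experimental_rank.get)
    return dcg(predicted_order) / dcg(ideal_order)
```

under this definition a predicted ranking identical to the experimental one scores 1, and the score decreases as highly toxic compounds are pushed down the predicted list, which is why a higher ndcg marks the target whose docking ranks agree better with experiment.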
the docking study (ndcg of 0.9122 for 2v7o and 0.8503 for 2vz6; the detailed docking result is shown in table s2 ) shows that the docking result for 2v7o is more consistent with the experimental pld50, so the protein 2v7o was utilized for the ligand interaction analysis.
table 3 . ranking results by experimental and predicted pld50 and fit score.
compound   experimental pld50   fit score (2v7o)   fit score (2vz6)
6          1                    3                  3
20         2                    1                  12
12         3                    4                  9
1          4                    2                  4
11         5                    7                  2
14         6                    8                  13
16         7                    5                  6
7          8                    17                 15
8          9                    10                 11
27         10                   23                 17
13         11                   12                 19
15         12                   11                 5
32         13                   18                 18
5          14                   22                 8
33         15                   13                 29
21         16                   15                 1
25         17                   9                  20
22         18                   25                 25
17         19                   20                 16
28         20                   24                 30
9          21                   16                 32
29         22                   32                 14
2          23                   30                 24
30         24                   31                 26
18         25                   21                 27
10         26                   26                 21
23         27                   29                 31
31         28                   33                 7
26         29                   14                 23
4          30                   28                 33
3          31                   6                  10
19         32                   27                 28
24         33                   19                 22
ndcg       1                    0.9122             0.8503
the 3d-qstr contour maps were utilized to visualize the information on the comfa and comsia model properties in three-dimensional space. these maps use characteristics of compounds that are crucial for activity and display the regions around molecules where the variance of activities is expected based on physicochemical property changes in molecules [52] . the analysis of favorable and unfavorable regions of the steric, electrostatic, hydrophobic, hbd, and hba atom fields helps to clarify the relationship between the aconitine alkaloids' toxic activity and their structure. steric and electrostatic contour maps of the comfa qstr model are shown in figure 6a ,b, respectively. hydrophobic, hbd, and hba contour maps of the comsia qstr model are shown in figure 6c -e. compound 6 has the most toxic activity, so it was chosen as the reference structure for the generation of the comfa and comsia contour maps. in the case of the comfa study, the steric contour map around compound 6 is shown in figure 6a . 
the yellow regions near the r2, r7, and r6 substituents of the molecule indicate that these positions are not ideal for sterically bulky functional groups. therefore, compounds 19, 24, and 26 (with pld50 values of 1.17, 0.84, and 1.82, respectively), which carry sterically bulky esterified moieties at positions r2 and r7, were less toxic than compounds 6 and 20 (with pld50 values of 5.00 and 4.95), which are substituted by a small hydroxyl group, and compound 3 (with a pld50 value of 1.44) has less toxic activity due to the esterified moiety at r6. the green regions, sterically favorable ... the comfa electrostatic contour map is shown in figure 6b . the blue region near the r2 and r7 substitutions indicates that electropositive groups at these positions favor toxicity. this is supported by the fact that the compounds with hydroxy in these two positions had higher pld50 values than the compounds with acetoxy or no substituents. the red region surrounding the molecular scaffold was not distinct, which suggests no clear connection between electronegative groups and the toxicity. the comsia hydrophobic contour map is shown in figure 6c . the white regions around r2, r6, and r7 indicate that hydrophobic groups are unfavorable for the toxicity, so the esterification of the hydrophilic hydroxyl or dehydroxylation decreases the toxicity, which is consistent with the steric and electrostatic contour maps. the yellow contour near r12 indicates that the hydrophilic hydroxy is unfavorable to the toxicity, which can be validated by the fact that aconitine alkaloids with hydroxy substituents at r12 (compound 10, with a pld50 ... the comsia contour map of hbd is shown in figure 6d . the cyan regions at r2, r6, and r7 represent a favorable condition for the hbd atom, which is consistent with the fact that the compounds with hydroxy in this region show potent toxicity. a purple region was found near r12, which indicates that the hbd atom (hydroxyl) in this region has an adverse effect on toxicity. the hba contour map is shown in figure 6e . the magenta region around the r1 substitution indicates that this position is favorable to the hba atom, so compounds 13, 15, 32, and 33 with the hba atom in the r1 substitution exhibit more potent toxicity (with pld50 values of 3.52, 3.30, 3.16, and 2.84) than compounds with methoxymethyl substituents (compounds 19, 24, and 26 with pld50 values of 1.17, 0.84, and 1.82). the red contours, where hba atoms are unfavorable for the toxicity, are positioned around r2 and r6. these contours are well validated by the lower pld50 values of compounds with carbonyl in these substitutions. the ppi network of aconitine alkaloid cardiotoxicity was divided into nine clusters using clusterone. statistical parameters are shown in figure 5 . six clusters, namely clusters 1, 3, 4, 5, 7, and 9, which possess quality scores higher than 0.5, a density higher than 0.45, and a p-value less than 0.05, were selected for further analysis ( figure 7 ). clusters 1, 4, and 7 consisted of proteins mainly involved in the effects of various calcium, potassium, and sodium channels. 
cluster 1 mainly consisted of three channel types related to the cardiotoxicity of aconitine alkaloids, cluster 4 contained calcium and sodium channels and some channel exchangers (such as ryr1 and ryr2), and cluster 7 mainly consisted of various potassium channels. all of these findings are consistent with previous research about the arrhythmogenic properties of the toxicity of aconitine alkaloids: aconitine binds to ion channels and affects their open state, and thus the corresponding ion influx into the cytosol [53] [54] [55] . the channel exchangers play a crucial role in maintaining ion transport and homeostasis inside and outside of the cell. cluster 9 contained some regulatory proteins that can activate or repress the ion channels through the protein expression level. atp2a1, ryr2, ryr1, cacna1c, cacna1d, and cacna1s mediate the release of calcium, thereby playing a key role in triggering cardiac muscle contraction and maintaining calcium homeostasis [56, 57] . aconitine may cause aberrant channel activation and lead to cardiac arrhythmia. clusters 3 and 5 consisted of camp-dependent protein kinase (capk), cgmp-dependent protein kinase (cgpk), and guanine nucleotide binding protein (g protein). whether the cardiotoxicity induced by aconitine alkaloids is linked to capk, cgpk, and g proteins has not been fully studied; however, some studies have shown that the cardiotoxicity-related protein kcnj3 (potassium inwardly-rectifying channel) is controlled by g proteins and that the cardiac sodium/calcium exchanger is regulated by capk and cgpk [58, 59] . the result of clusterone indicated that the constructed network is consistent with existing studies and that the network can be used to screen essential proteins with the cytonca plugin. 
the protein 2v7o, which belongs to the camkii (calcium/calmodulin (ca 2+ /cam)-dependent serine/threonine kinases ii) isozyme protein family, plays a central role in cellular signaling by transmitting ca 2+ signals. the camkii enzymes transmit calcium ion (ca 2+ ) signals released inside the cell by regulating signal transduction pathways through phosphorylation. ca 2+ first binds to the small regulatory protein cam, and this ca 2+ /cam complex then binds to and activates the kinase, which then phosphorylates other proteins such as the ryanodine receptor and the sodium/calcium exchanger. thus, these proteins are related to the cardiotoxicity induced by aconitine alkaloids [60] [61] [62] . excessive activity of camkii has been observed in some structural heart diseases and arrhythmias [63] , and past findings demonstrate neuroprotection in neuronal cultures treated with inhibitors of camkii immediately prior to excitotoxic activation of camkii [64] . the acute cardiotoxicity of the aconitine alkaloids is possibly related to this target. based on the analysis of the ppi network above, camkii was selected as the potential target for further molecular docking and dynamic simulation. the docking result for 2v7o is shown in figure 8a . compound 20 has the highest fit score, so it was selected as the template for conformational analysis. the mechanisms of camkii activation and inactivation are shown in figure 8b . compound 20 affects the normal energy metabolism of the myocardial cell by binding in the atp-competitive site (figure 8c ). the inactive state of camkii is regulated by cask-mediated t306/t307 phosphorylation, and this state can be inhibited by the binding of compound 20 in the atp-competitive site. 
such binding moves camkii toward the ca 2+ /cam-dependent active state, which is reached through structural rearrangement of the inhibitory helix caused by ca 2+ /cam binding and the subsequent autophosphorylation of t287 [65] ; this induces excessive activity of camkii and a dynamic imbalance of calcium ions in the myocardial cell, eventually leading to heart disease and arrhythmias.
information about the binding pocket of a receptor for its ligand is very important for drug design, particularly for conducting mutagenesis studies [28] . as has been reported in the past [66] , the binding pocket of a protein receptor to a ligand is usually defined by those residues that have at least one heavy atom within a distance of 5 å from a heavy atom of the ligand. such a criterion was originally used to define the binding pocket of atp in the cdk5-nck5a complex [20] , which later proved very useful in identifying functional domains and stimulating the relevant truncation experiments. 
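as an illustration of the 5 å criterion described above, the following minimal sketch selects pocket residues from atomic coordinates. the data layout (tuples of residue label, element, and xyz coordinates), the function name `pocket_residues`, and the toy coordinates are assumptions made for this example; they are not taken from the original workflow.

```python
import math

def pocket_residues(protein_atoms, ligand_atoms, cutoff=5.0):
    """Return residues with at least one heavy atom within `cutoff` angstroms
    of any heavy atom of the ligand.

    protein_atoms: iterable of (residue_id, element, (x, y, z))  # hypothetical layout
    ligand_atoms:  iterable of (element, (x, y, z))              # hypothetical layout
    """
    ligand_heavy = [xyz for element, xyz in ligand_atoms if element != "H"]
    pocket = set()
    for residue_id, element, xyz in protein_atoms:
        if element == "H":
            continue  # hydrogens are not heavy atoms
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_heavy):
            pocket.add(residue_id)
    return sorted(pocket)

# toy coordinates, invented for illustration only
protein = [
    ("asp157", "O", (1.0, 0.0, 0.0)),
    ("lys43", "N", (3.0, 3.0, 0.0)),
    ("glu140", "O", (9.0, 0.0, 0.0)),
    ("ser26", "H", (0.5, 0.0, 0.0)),
]
ligand = [("C", (0.0, 0.0, 0.0)), ("H", (8.5, 0.0, 0.0))]
print(pocket_residues(protein, ligand))
```

in practice the coordinates would be parsed from a structure file; the brute-force distance loop is kept here only for clarity.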
a similar approach has also been used to define the binding pockets of many other receptor-ligand interactions important for drug design [30, 31, 33, [67] [68] [69] [70] . information about the binding pocket of camkii for the aconitine alkaloids will serve as a guideline for designing drugs with similar scaffolds, particularly for conducting mutagenesis studies. in figure 8a , the four compounds with the top fit scores (compounds 1, 6, 12, and 20) formed similar significant interactions with amino acid residues around the atp-competitive binding pocket. the four compounds formed many van der waals interactions within the noncompetitive inhibitor pocket through amino acid residues such as asp157, lys43, glu140, lys22, and leu143. the ligand-receptor interaction showed that the hydroxy in r2 formed a side chain donor interaction with asp157. in addition, the hydroxy groups in r6 and r7 formed side chain acceptor interactions with glu140 and ser26, respectively (see the docking results of compounds 6 and 12 in figure 8a ). these results correspond to the comfa and comsia contour maps: small electropositive and hydrophilic groups at r2, r6, and r7 enhance toxicity to some extent. the phenyl group in r9 formed aromatic interactions with leu20, leu142, and phe90, while the small hydroxyl group did not form any interaction with asp91, which demonstrates that a bulky phenyl group is crucial to this binding pattern and to toxicity. this agrees with the comfa steric contour map, where r9 was favorable for sterically bulky groups. the methoxymethyl at r1 formed a backbone acceptor interaction with lys43, which corresponds to the comsia hba contour map, where r1 was favorable for the hba atom. 
compound 20 docked into 2v7o, and the atp-competitive pocket was painted green; the t287, t307, and t308 phosphorylation sites were painted green, orange, and yellow, respectively; the inhibitory helix was painted red.
the result of the md simulation is shown in figure 9 . the red plot represents the rmsd values of the docked protein. the rmsd reached 2.41 å at 1.4 ns and then remained between 2 and 2.5 å for the rest of the 5 ns simulation. the average rmsd was 2.06 å. the md simulation demonstrated that the ligand was stabilized in the active site. finally, we combined the ligand-based 3d-qstr analysis with the structure-based molecular docking study to identify the necessary moiety related to the cardiotoxicity mechanism of the aconitine alkaloids (figure 10 ).
figure 10 . crucial requirement of cardiotoxicity mechanism was obtained from the ligand-based 3d-qstr and structure-based molecular docking study.
to build the ppi network of aconitine alkaloids, literature from 1 january 2007 to 31 february 2017 was retrieved from pubmed (http://pubmed.cn/) and web of science (http://www.isiknowledge.com/) with the mesh words "aconitine" and "toxicity" and without language restriction. all documents about cardiotoxicity caused by aconitine alkaloids were collected. the proteins related to aconitine alkaloid cardiotoxicity reported in this decade were gathered and taken as the input proteins for the string (https://string-db.org/) database [51, 71] , which was used to search for related proteins or pathways that had been reported. finally, all the proteins and their partners were recorded in excel in order to import the information and build a ppi network in cytoscape software. cytoscape is a free, open-source, java application for visualizing molecular networks and integrating them with gene expression profiles [71, 72] . plugins are available for network and molecular profiling analyses, new layouts, additional file format support, making connections with databases, and searching within large networks [71] . clusterone (clustering with overlapping neighborhood expansion) of cytoscape was utilized to cluster the ppi network into overlapping sub-graphs of highly interconnected nodes. clusterone is a plugin for detecting and clustering potentially overlapping protein complexes from ppi data. the quality of a group was assessed by the number of sub-graphs, p-values, and density. a cluster was discarded when the number of sub-graphs was smaller than 3, the density was less than 0.45, the quality was less than 0.5, or the p-value was above 0.05 [73] . 
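the cluster selection rule can be sketched as a simple filter. this is a minimal illustration that follows the thresholds reported in the results (quality above 0.5, density above 0.45, p-value below 0.05); the `Cluster` record, the reading of the size threshold of 3 as a node count, and all numeric values are hypothetical and are not clusterone output.

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    name: str
    size: int        # threshold of 3 from the text, read here as node count (assumption)
    density: float
    quality: float
    p_value: float

def keep_cluster(c, min_size=3, min_density=0.45, min_quality=0.5, max_p=0.05):
    """A cluster is kept only if it passes every threshold."""
    return (c.size >= min_size
            and c.density > min_density
            and c.quality > min_quality
            and c.p_value < max_p)

# invented values for illustration only
clusters = [
    Cluster("cluster 1", 12, 0.61, 0.72, 0.001),
    Cluster("cluster 2", 5, 0.30, 0.66, 0.010),   # fails on density
    Cluster("cluster 3", 8, 0.52, 0.58, 0.020),
    Cluster("cluster 6", 4, 0.70, 0.80, 0.200),   # fails on p-value
]
selected = [c.name for c in clusters if keep_cluster(c)]
print(selected)
```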
the clustering results of clusterone are instrumental in assessing the reliability of the ppi network with respect to aconitine alkaloids' cardiotoxicity. cytonca is a plugin in cytoscape integrating calculation, evaluation, and visualization analysis for multiple centrality measures. there are eight centrality measurements provided by cytonca: betweenness, closeness, degree, eigenvector, local average connectivity-based, network, subgraph, and information centrality [74] . the primary purpose of the centrality analysis was to confirm the essential proteins in the pre-built ppi network. three centrality measurements in the cytonca plugin (subgraph centrality, betweenness centrality, and closeness centrality) were used for evaluating and screening the essential proteins in the merged target network. the subgraph centrality characterizes the participation of each node in all subgraphs in a network. smaller subgraphs are given more weight than larger ones, which makes this measurement an appropriate one for characterizing network properties. the subgraph centrality of node u can be calculated by [75]
$$C_s(u) = \sum_{l=0}^{\infty} \frac{\mu_l(u)}{l!} = \sum_{v=1}^{n} \left(v_v^{u}\right)^2 e^{\lambda_v}$$
where $\mu_l(u)$ is the uth diagonal entry of the lth power of the weighted adjacency matrix $A$ of the network, $v_1, v_2, \ldots, v_n$ is an orthonormal basis of $\mathbb{R}^n$ composed of eigenvectors of $A$ associated with the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, and $v_v^{u}$ is the uth component of $v_v$ [75] . the betweenness centrality represents the degree to which nodes stand between each other. it was devised as a general measure of centrality and is applicable to a wide range of problems in network theory, including problems related to social networks, biology, transport, and scientific cooperation. 
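the subgraph centrality described above can be approximated directly from its power-series definition. the sketch below is a plain-python illustration, not the cytonca implementation; the function name, the truncation at 30 terms, and the example graphs are assumptions made for this example.

```python
import math

def subgraph_centrality(adj, terms=30):
    """Approximate sum_{l=0}^{terms} (A^l)_{uu} / l! for every node u.

    adj: square adjacency matrix as a list of lists.
    """
    n = len(adj)
    # start from A^0, the identity matrix
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    scores = [0.0] * n
    for l in range(terms + 1):
        for u in range(n):
            scores[u] += power[u][u] / math.factorial(l)
        # next power: A^(l+1) = A^l * A
        power = [[sum(power[i][k] * adj[k][j] for k in range(n))
                  for j in range(n)] for i in range(n)]
    return scores

# two nodes joined by one edge: closed walks exist only for even lengths,
# so each node scores sum 1/(2k)! = cosh(1)
pair = [[0, 1], [1, 0]]
# three-node path: the middle node takes part in more closed walks
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(subgraph_centrality(pair))
print(subgraph_centrality(path))
```

for larger networks an eigendecomposition of the adjacency matrix, as in the second form of the formula, would be the more efficient route.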
the betweenness centrality of a node u can be calculated by [76]
$$C_b(u) = \sum_{s \neq u \neq t} \frac{\rho(s, u, t)}{\rho(s, t)}$$
where $\rho(s, t)$ is the total number of shortest paths from node s to node t, and $\rho(s, u, t)$ is the number of those paths that pass through u. closeness centrality of a node is a measure of centrality in a network, calculated as the reciprocal of the sum of the lengths of the shortest paths between the node and all other nodes in the graph. thus, the more central a node is, the closer it is to all other nodes. the closeness centrality of a node u can be calculated by [77]
$$C_c(u) = \frac{|N_u| - 1}{\sum_{v} \mathrm{dist}(u, v)}$$
where $|N_u|$ is the number of node u's neighbors, and $\mathrm{dist}(u, v)$ is the distance of the shortest path from node u to node v.
pharmmapper serves as a valuable tool for identifying potential targets for a novel synthetic compound, a newly isolated natural product, a compound with known biological activity, or an existing drug [78] . of all the aconitine alkaloids in this research, compounds 6, 12, and 20 exhibited the most toxic activity and were used for the potential target prediction. the mol2 files of the three compounds were submitted to the pharmmapper server. the parameters generate conformers and maximum generated conformations were set to on and 300, respectively. other parameters used default values. finally, the results of clusterone and pharmmapper were combined to select the potential targets for the following docking study [78] .
comparative molecular field analysis (comfa) and comparative molecular similarity index analysis (comsia) are efficient tools in ligand-based drug design and are in use for contour map generation and identification of favorable and unfavorable regions in a moiety [52, 79] . the comfa consists of a steric and electrostatic contour map of molecules that are correlated with toxic activity, while the comsia consists of hydrophobic field, hydrogen bond donor (hbd)/hydrogen bond acceptor (hba) [80] , and steric/electrostatic fields that are correlated with toxic activity. 
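returning to the betweenness and closeness centralities defined earlier in this section, both can be sketched for a small unweighted, undirected graph with breadth-first search. this is a minimal illustration rather than the cytonca code. two choices here are assumptions: betweenness counts each unordered pair (s, t) once, and closeness divides the number of other reachable nodes by the sum of their shortest-path distances.

```python
from collections import deque

def bfs(graph, source):
    """Shortest-path distances and shortest-path counts from `source`."""
    dist, count = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:            # first time this node is reached
                dist[nb] = dist[node] + 1
                count[nb] = 0
                queue.append(nb)
            if dist[nb] == dist[node] + 1:
                count[nb] += count[node]  # node lies on a shortest path to nb
    return dist, count

def betweenness(graph, u):
    """Sum over unordered pairs (s, t) of the fraction of shortest paths via u."""
    dist_u, count_u = bfs(graph, u)
    others = [x for x in graph if x != u]
    total = 0.0
    for i, s in enumerate(others):
        dist_s, count_s = bfs(graph, s)
        for t in others[i + 1:]:
            if t not in dist_s or u not in dist_s or t not in dist_u:
                continue                  # pair not connected through u
            if dist_s[u] + dist_u[t] == dist_s[t]:
                total += count_s[u] * count_u[t] / count_s[t]
    return total

def closeness(graph, u):
    """Other reachable nodes divided by the sum of their distances from u."""
    dist, _ = bfs(graph, u)
    lengths = [d for node, d in dist.items() if node != u]
    return len(lengths) / sum(lengths) if lengths else 0.0

# toy graphs, invented for illustration only
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
diamond = {"s": ["x", "y"], "x": ["s", "t"], "y": ["s", "t"], "t": ["x", "y"]}
print(betweenness(path, "b"), closeness(path, "b"), betweenness(diamond, "x"))
```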
the comfa and comsia have been utilized to generate a 3d-qstr model [81] . all molecular modeling and the generation of the 3d-qstr models were performed with sybyl x2.0. aconitine alkaloids with ld 50 values in mice, listed in table 4 , were extracted from recent literature [70] . the ld 50 values of all aconitine alkaloids were converted into pld 50 (−log ld 50 ) values. these pld 50 values were used as the dependent variable, while comfa and comsia descriptors were used as independent variables. the sketch function of sybyl x2.0 was utilized to draw the structures, and charges were calculated by the gasteiger-huckel method. additionally, the tripos force field was utilized for energy minimization of these aconitine alkaloid molecules [81] . the 31 molecules were divided into training and test sets in a ratio of 3:1. the division was done so that both datasets are balanced and consist of both active and less active molecules [81] . the reliability of the 3d-qstr model depends on the database molecular alignment. the most toxic aconitine alkaloid (compound 6) was selected as the template molecule, and the tetradecahydro-2h-3,6,12-(epiethane [1,1,2] triyl)-7,9-methanonaphtho [2,3-b] azocine was selected as the common moiety. pls (partial least squares) techniques relate the field descriptors to the activity values [80] and yield statistics such as leave-one-out (loo) values, the optimal number of components, the standard error of estimation (see), the cross-validated coefficient (q 2 ), and the conventional coefficient (r 2 ). these statistics are pivotal in the evaluation of the 3d-qstr model and can be worked out with the pls method [81] . the model is said to be good when the q 2 value is more than 0.5 and the r 2 value is more than 0.6. the q 2 and r 2 values reflect a model's soundness. the best model has the highest q 2 and r 2 values, the lowest see, and an optimal number of components [80, 82, 83] . 
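to make the roles of r 2 and q 2 concrete, the sketch below computes both for a one-descriptor linear model. ordinary least squares is used here as a stand-in for pls on comfa/comsia fields, which is a deliberate simplification; the function names and the data are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def r_squared(xs, ys):
    """Conventional (non-cross-validated) coefficient of determination."""
    slope, intercept = fit_line(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

def q_squared_loo(xs, ys):
    """Leave-one-out q2 = 1 - PRESS / SS_tot: each point is predicted by a
    model fitted to all the other points."""
    my = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - press / ss_tot

# hypothetical descriptor values and activities
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [1.1, 1.9, 3.2, 3.9, 5.1]
r2, q2 = r_squared(descriptor, activity), q_squared_loo(descriptor, activity)
print(r2 > 0.6 and q2 > 0.5)  # acceptance thresholds quoted in the text
```

because every left-out point is predicted by a model that never saw it, q 2 cannot exceed r 2 for a least-squares fit, which is why it is the stricter of the two criteria.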
in the case of comfa and comsia analysis, the values of the optimal number of components, see, and q 2 can be worked out by loo validation, with use sampls turned on and components set to 5, while in the process of calculating r 2 , use sampls was turned off and the column filtering was set to 2.0 kcal mol −1 in order to speed up the calculation without sacrificing information content [81] [82] [83] [84] . the components were then set to 6 and 4 for comfa and comsia, respectively, which were the optimal numbers of components calculated by performing a sampls run. see and r 2 were utilized to assess the non-cross-validated model. the applicability domain (ad) of the topomer comfa and comsia model was confirmed by the williams plot of residuals vs. leverage. the leverage of a query chemical is proportional to its mahalanobis distance from the centroid of the training set [85, 86] . the leverages are calculated for a given dataset x by obtaining the leverage matrix (h) with the equation below:
$$H = X \left(X^{T} X\right)^{-1} X^{T}$$
where $X$ is the model matrix and $X^{T}$ is its transpose. the plot of standardized residuals vs. leverage values was drawn, and compounds with standardized residuals greater than three standard deviation units (±3σ) were considered as outliers [85] . the critical leverage value is taken as 3p/n, where p is the number of model variables plus one, and n is the number of objects used to calculate the model. a leverage h > 3p/n means that the predicted response is not acceptable [85] [86] [87] .
the molecular operating environment (moe) is a computer-aided drug design (cadd) software program that incorporates the functions of qsar, molecular docking, molecular dynamics, adme (absorption, distribution, metabolism, and excretion), and homology modeling. all of these functions are regarded as useful instruments in the field of drug discovery and biochemistry. the molecular docking and dynamics were performed in moe2016 software to assess the stability and affinity between the ligands and predicted targets [88, 89] . 
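for the applicability-domain check described above, the leverage of each compound is the corresponding diagonal element of the leverage matrix. the sketch below uses the closed form of that diagonal for a model with an intercept and a single descriptor, h_i = 1/n + (x_i − mean)² / Σ(x_j − mean)², which is an assumption made to keep the example free of matrix code; the descriptor values are invented.

```python
def leverages(xs):
    """Diagonal of X (X^T X)^-1 X^T for the model matrix X = [1, x]."""
    n = len(xs)
    mean = sum(xs) / n
    sxx = sum((x - mean) ** 2 for x in xs)
    return [1 / n + (x - mean) ** 2 / sxx for x in xs]

def outside_domain(xs, n_variables=1):
    """Flag compounds whose leverage exceeds the critical value 3p/n,
    with p = number of model variables plus one."""
    p = n_variables + 1
    critical = 3 * p / len(xs)
    return [h > critical for h in leverages(xs)]

# hypothetical descriptor values; the last compound lies far from the rest
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 30.0]
print(outside_domain(values))
```

the leverages always sum to p, so a threshold of 3p/n marks compounds with three times the average leverage.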
the docking process involves the prediction of ligand conformation and orientation within a targeted binding site. docking analysis is an important step in the docking process. it has been widely used to study reasonable binding modes and to obtain information on interactions between amino acids in active protein sites and ligands. the molecular docking analysis was carried out to determine the toxicity-related moiety of aconitine alkaloids through the ligand-amino-acid interaction function in moe2015. the pdb files of 2v7o and 2vz6 were downloaded from the pdb (protein data bank) database (https://www.rcsb.org/), and the mol2 files of the compounds were taken from the sybyl software used in the qstr research. the structure preparation function in moe was used to minimize the energy and optimize the structure of the protein skeleton. based on the london dg score and induced fit refinement, all compounds were docked into the active site of every potential target, taking the score values as the scoring function [90] . the dcg (discounted cumulative gain) algorithm was utilized to examine the consistency between the ranking by pld 50 and the ranking from our research (fit scores of the docking study). the relevance term in the dcg formula is the pld 50 value, and the idcg (ideal dcg) is the dcg of the pld 50 values in ideal order. the closer the normalized discounted cumulative gain (ndcg) value is to 1, the better the consistency [91] .
preliminary md simulations for the model protein were performed using the program namd (nanoscale molecular dynamics program, v 2.9), and all files were generated using visual molecular dynamics (vmd). namd is a freely available software designed for high-performance simulation of large biomolecular systems [92] . during the md simulation, minimization and equilibration of the original and docked proteins were performed in a water box of 15 å3 size. 
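returning to the ranking comparison described above, the ndcg can be sketched as follows. the exact dcg formula used in the study is not preserved in this text, so the common linear-gain form, relevance divided by log2(rank + 1), is assumed here; the pld 50 values and their ordering are invented for illustration.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain with linear gain (assumed variant)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(pld50_in_docking_order):
    """pld50 values listed in the order given by the docking fit scores,
    normalised by the DCG of the same values sorted from high to low."""
    ideal = sorted(pld50_in_docking_order, reverse=True)
    return dcg(pld50_in_docking_order) / dcg(ideal)

# hypothetical pld50 values, ordered by a hypothetical docking ranking
perfect = [3.5, 3.3, 2.8, 1.2]     # docking order matches the toxicity order
swapped = [3.3, 3.5, 1.2, 2.8]     # two neighbouring pairs exchanged
print(ndcg(perfect), ndcg(swapped))
```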
a charmm 22 force field file was applied for energy minimization and equilibration with gasteiger-huckel charges using boltzmann initial velocity [93, 94] . integrator parameters also included 2 fs/step for all rigid bonds; nonbonded frequencies were set to 1 å and full electrostatic evaluations to 2 å, with 10 steps for each cycle [93] . the particle mesh ewald method was used for electrostatic interactions of the simulation system under periodic boundary conditions, with grid dimensions of 1.0 å [94] . the pressure was maintained at 101.325 kpa using the langevin piston and the temperature was controlled at 310 k using langevin dynamics. covalent bonds between hydrogen and heavy atoms were constrained using the shake/rattle algorithm. finally, 5 ns md simulations for the original and docked proteins were carried out for comparing and verifying the binding affinity and stability of the ligand-receptor complex.
a method combining network analysis and in silico methods was used to elucidate the qstr and toxic mechanisms of aconitine alkaloids. the 3d-qstr was built in sybyl with internal robustness and high external predictivity, enabling identification of pivotal molecule moieties related to toxicity in aconitine alkaloids. the comfa model had q 2 , r 2 , optimum component, and correlation coefficient (r 2 ) values of 0.624, 0.966, 6, and 0.9698, respectively, and the comsia model had q 2 , r 2 , optimum component, and correlation coefficient (r 2 ) values of 0.719, 0.901, 4, and 0.9770. the network was built with cytoscape software and the string database, and cluster analysis demonstrated its reliability. the 2v7o and 2vz6 proteins were identified as potential targets with the cytonca plugin and the pharmmapper server, and the interactions between the aconitine alkaloids and key amino acids were examined in the docking study. the results of the docking study are consistent with the experimental pld 50 values. 
the md simulation indicated that aconitine alkaloids exhibit potent binding affinity and stability to the receptor camk2g. finally, we incorporate pivotal molecule moieties and ligand-receptor interactions to realize the qstr of aconitine alkaloids. this research serves as a guideline for studies of toxicity, including neuro-, reproductive, and embryo-toxicity. with a deep understanding of the relationship between toxicity and structure of aconitine alkaloids, subsequent structural modification of aconitine alkaloids can be carried out to enhance their efficacy and to reduce their toxic side effects. based on such research, aconitine alkaloids can bring us closer to medical and clinical applications. in addition, as pointed out in past research [95] , user-friendly and publicly accessible web servers represent the future direction of reporting various important computational analyses and findings [96] [97] [98] [99] [100] [101] [102] [103] [104] [105] [106] [107] [108] [109] . they have significantly enhanced the impacts of computational biology on medical science [110, 111] . the research in this paper will serve as a foundation for constructing web servers for qstr studies and target identifications of compounds. immunomodulating agents of plant origin. 
i: preliminary screening chinese drugs plant origin aconitine poisoning: a global perspective ventricular tachycardia after ingestion of ayurveda herbal antidiarrheal medication containing aconitum fatal accidental aconitine poisoning following ingestion of chinese herbal medicine: a report of two cases five cases of aconite poisoning: toxicokinetics of aconitines a case of fatal aconitine poisoning by monkshood ingestion determination of aconitine and hypaconitine in gucixiaotong ye by capillary electrophoresis with field-amplified sample injection a clinical study in epidural injection with lappaconitine compound for post-operative analgesia therapeutic effects of il-12 combined with benzoylmesaconine, a non-toxic aconitine-hydrolysate, against herpes simplex virus type 1 infection in mice following thermal injury aconitine: a potential novel treatment for systemic lupus erythematosus aconitine-containing agent enhances antitumor activity of dichloroacetate against ehrlich carcinoma complex discovery from weighted ppi networks prediction and analysis of the protein interactome in pseudomonas aeruginosa to enable network-based drug target selection the string database in 2017: quality-controlled protein-protein association networks, made broadly accessible identification of functional modules in a ppi network by clique percolation clustering united complex centrality for identification of essential proteins from ppi networks the ppi network and cluster one analysis to explain the mechanism of bladder cancer the progress of novel drug delivery systems mitochondrial uncoupling protein 2 structure determined by nmr molecular fragment searching structural basis for membrane anchoring of hiv-1 envelope spike unusual architecture of the p7 channel from hepatitis c virus architecture of the mitochondrial calcium uniporter structure and mechanism of the m2 proton channel of influenza a virus computer-aided drug design using sesquiterpene lactones as sources of new 
structures with potential activity against infectious neglected diseases successful in silico discovery of novel nonsteroidal ligands for human sex hormone binding globulin in silico discovery of novel ligands for antimicrobial lipopeptides for computer-aided drug design structural bioinformatics and its impact to biomedical science coupling interaction between thromboxane a2 receptor and alpha-13 subunit of guanine nucleotide-binding protein prediction of the tertiary structure and substrate binding site of caspase-8 study of drug resistance of chicken influenza a virus (h5n1) from homology-modeled 3d structures of neuraminidases insights from investigating the interaction of oseltamivir (tamiflu) with neuraminidase of the 2009 h1n1 swine flu virus prediction of the tertiary structure of a caspase-9/inhibitor complex design novel dual agonists for treating type-2 diabetes by targeting peroxisome proliferator-activated receptors with core hopping approach heuristic molecular lipophilicity potential (hmlp): a 2d-qsar study to ladh of molecular family pyrazole and derivatives fragment-based quantitative structure-activity relationship (fb-qsar) for fragment-based drug design investigation into adamantane-based m2 inhibitors with fb-qsar hp-lattice qsar for dynein proteins: experimental proteomics (2d-electrophoresis, mass spectrometry) and theoretic study of a leishmania infantum sequence the biological functions of low-frequency phonons: 2. 
cooperative effects low-frequency collective motion in biomacromolecules and its biological functions quasi-continuum models of twist-like and accordion-like low-frequency motions in dna collective motion in dna and its role in drug intercalation biophysical aspects of neutron scattering from vibrational modes of proteins biological functions of soliton and extra electron motion in dna structure low-frequency resonance and cooperativity of hemoglobin solitary wave dynamics as a mechanism for explaining the internal motion during microtubule growth designed electromagnetic pulsed therapy: clinical applications steps to the clinic with elf emf molecular dynamics study of the connection between flap closing and binding of fullerene-based inhibitors of the hiv-1 protease molecular dynamics studies on the interactions of ptp1b with inhibitors: from the first phosphate-binding site to the second one the cambridge structural database: a quarter of a million crystal structures and rising molecular similarity indices in a comparative analysis (comsia) of drug molecules to correlate and predict their biological activity single channel analysis of aconitine blockade of calcium channels in rat myocardiocytes conversion of the sodium channel activator aconitine into a potent alpha 7-selective nicotinic ligand aconitine blocks herg and kv1.5 potassium channels inactivation of ca 2+ release channels (ryanodine receptors ryr1 and ryr2) with rapid steps in [ca 2+ ] and voltage targeted disruption of the atp2a1 gene encoding the sarco(endo)plasmic reticulum ca 2+ atpase isoform 1 (serca1) impairs diaphragm function and is lethal in neonatal mice cyclic gmp-dependent protein kinase activity in rat pulmonary microvascular endothelial cells different g proteins mediate somatostatin-induced inward rectifier k + currents in murine brain and endocrine cells cardiac myocyte calcium transport in phospholamban knockout mouse: relaxation and endogenous camkii effects inhibition of camkii 
phosphorylation of ryr2 prevents induction of atrial fibrillation in fkbp12.6 knock-out mice regulation of ca 2+ and electrical alternans in cardiac myocytes: role of camkii and repolarizing currents the role of calmodulin kinase ii in myocardial physiology and disease excitotoxic neuroprotection and vulnerability with camkii inhibition structure of the camkiiδ/calmodulin complex reveals the molecular mechanism of camkii kinase activation a model of the complex between cyclin-dependent kinase 5 and the activation domain of neuronal cdk5 activator binding mechanism of coronavirus main proteinase with ligands and its implication to drug design against sars an in-depth analysis of the biological functional studies based on the nmr m2 channel structure of influenza a virus molecular therapeutic target for type-2 diabetes novel inhibitor design for hemagglutinin against h1n1 influenza virus by core hopping method the string database in 2011: functional interaction networks of proteins, globally integrated and scored cytoscape: a software environment for integrated models of biomolecular interaction networks detecting overlapping protein complexes in protein-protein interaction networks cytonca: a cytoscape plugin for centrality analysis and evaluation of protein interaction networks subgraph centrality and clustering in complex hyper-networks ranking closeness centrality for large-scale social networks enhancing the enrichment of pharmacophore-based target prediction for the polypharmacological profiles of drugs comparative molecular field analysis (comfa). 1. effect of shape on binding of steroids to carrier proteins sample-distance partial least squares: pls optimized for many variables, with application to comfa a qsar analysis of toxicity of aconitum alkaloids recent advances in qsar and their applications in predicting the activities of chemical molecules, peptides and proteins for drug design unified qsar approach to antimicrobials. 4. 
multi-target qsar modeling and comparative multi-distance study of the giant components of antiviral drug-drug complex networks comfa qsar models of camptothecin analogues based on the distinctive sar features of combined abc, cd and e ring substitutions applicability domain for qsar models: where theory meets reality comparison of different approaches to define the applicability domain of qsar models molecular docking and qsar analysis of naphthyridone derivatives as atad2 bromodomain inhibitors: application of comfa, ls-svm, and rbf neural network concise applications of molecular modeling software-moe medicinal chemistry and the molecular operating environment (moe): application of qsar and molecular docking to drug discovery qsar models of cytochrome p450 enzyme 1a2 inhibitors using comfa, comsia and hqsar estimating a ranked list of human hereditary diseases for clinical phenotypes by using weighted bipartite network biomolecular simulation on thousands processors molecular dynamics and docking investigations of several zoanthamine-type marine alkaloids as matrix metaloproteinase-1 inhibitors salts influence cathechins and flavonoids encapsulation in liposomes: a molecular dynamics investigation review: recent advances in developing web-servers for predicting protein attributes irna-ai: identifying the adenosine to inosine editing sites in rna sequences iss-psednc: identifying splicing sites using pseudo dinucleotide composition irna-pseu: identifying rna pseudouridine sites ploc-mplant: predict subcellular localization of multi-location plant proteins by incorporating the optimal go information into general pseaac ploc-mhum: predict subcellular localization of multi-location human proteins via general pseaac to winnow out the crucial go information iatc-misf: a multi-label classifier for predicting the classes of anatomical therapeutic chemicals psuc-lys: predict lysine succinylation sites in proteins with pseaac and ensemble random forest approach 
irnam5c-psednc: identifying rna 5-methylcytosine sites by incorporating physical-chemical properties into pseudo dinucleotide composition ikcr-pseens: identify lysine crotonylation sites in histone proteins with pseudo components and ensemble classifier iacp: a sequence-based tool for identifying anticancer peptides ploc-meuk: predict subcellular localization of multi-label eukaryotic proteins by extracting the key go information into general pseaac iatc-mhyb: a hybrid multi-label classifier for predicting the classification of anatomical therapeutic chemicals ihsp-pseraaac: identifying the heat shock protein families using pseudo reduced amino acid alphabet composition irna-psecoll: identifying the occurrence sites of different rna modifications by incorporating collective effects of nucleotides into pseknc impacts of bioinformatics to medicinal chemistry an unprecedented revolution in medicinal chemistry driven by the progress of biological science this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license key: cord-001244-qdld7hdc authors: fan, yue-nong; xiao, xuan; min, jian-liang; chou, kuo-chen title: inr-drug: predicting the interaction of drugs with nuclear receptors in cellular networking date: 2014-03-19 journal: int j mol sci doi: 10.3390/ijms15034915 sha: doc_id: 1244 cord_uid: qdld7hdc nuclear receptors (nrs) are closely associated with various major diseases such as cancer, diabetes, inflammatory disease, and osteoporosis. therefore, nrs have become a frequent target for drug development. during the process of developing drugs against these diseases by targeting nrs, we are often facing a problem: given a nr and chemical compound, can we identify whether they are really in interaction with each other in a cell? to address this problem, a predictor called “inr-drug” was developed. 
in the predictor, the drug compound concerned was formulated by a 256-d (dimensional) vector derived from its molecular fingerprint, and the nr by a 500-d vector formed by incorporating its sequential evolution information and physicochemical features into the general form of pseudo amino acid composition, and the prediction engine was operated by the svm (support vector machine) algorithm. compared with the existing prediction methods in this area, inr-drug not only can yield a higher success rate, but is also featured by a user-friendly web-server established at http://www.jci-bioinfo.cn/inr-drug/, which is particularly useful for most experimental scientists to obtain their desired data in a timely manner. it is anticipated that the inr-drug server may become a useful high throughput tool for both basic research and drug development, and that the current approach may be easily extended to study the interactions of drug with other targets as well. with the ability to directly bind to dna ( figure 1 ) and regulate the expression of adjacent genes, nuclear receptors (nrs) are a class of ligand-inducible transcription factors. they regulate various biological processes, such as homeostasis, differentiation, embryonic development, and organ physiology [1] [2] [3] . the nr superfamily has been classified into seven families: nr0 (knirps or dax like) [4, 5] ; nr1 (thyroid hormone like), nr2 (hnf4-like), nr3 (estrogen like), nr4 (nerve growth factor ib-like), nr5 (fushi tarazu-f1 like), and nr6 (germ cell nuclear factor like). since they are involved in almost all aspects of human physiology and are implicated in many major diseases such as cancer, diabetes and osteoporosis, nuclear receptors have become major drug targets [6, 7] , along with g protein-coupled receptors (gpcrs) [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] , ion channels [18] [19] [20] , and kinase proteins [21] [22] [23] [24] . 
identification of drug-target interactions is one of the most important steps for the new medicine development [25, 26] . the method usually adopted in this step is molecular docking simulation [27] [28] [29] [30] [31] [32] [33] [34] [35] [36] [37] [38] [39] [40] [41] [42] [43] . however, to make molecular docking study feasible, a reliable 3d (three dimensional) structure of the target protein is the prerequisite condition. although x-ray crystallography is a powerful tool in determining protein 3d structures, it is time-consuming and expensive. particularly, not all proteins can be successfully crystallized. for example, membrane proteins are very difficult to crystallize and most of them will not dissolve in normal solvents. therefore, so far very few membrane protein 3d structures have been determined. although nmr (nuclear magnetic resonance) is indeed a very powerful tool in determining the 3d structures of membrane proteins as indicated by a series of recent publications (see, e.g., [44] [45] [46] [47] [48] [49] [50] [51] and a review article [20] ), it is also time-consuming and costly. to acquire the 3d structural information in a timely manner, one has to resort to various structural bioinformatics tools (see, e.g., [37] ), particularly the homologous modeling approach as utilized for a series of protein receptors urgently needed during the process of drug development [19, [52] [53] [54] [55] [56] [57] . unfortunately, the number of dependable templates for developing high quality 3d structures by means of homology modeling is very limited [37] . to overcome the aforementioned problems, it would be of help to develop a computational method for predicting the interactions of drugs with nuclear receptors in cellular networking based on the sequences information of the latter. 
the results thus obtained can be used to exclude in advance the compounds predicted not to interact with the nuclear receptors, so as to avoid wasting time and money on those unpromising compounds [58]. actually, based on the functional groups and biological features, a powerful method was developed recently [59] for this purpose. however, further development in this regard is definitely needed for the following reasons: (a) he et al. [59] did not provide a publicly accessible web-server for their method, and hence its practical application value is quite limited, particularly for the majority of experimental scientists; (b) the prediction quality can be further enhanced by incorporating some key features into the formulation of nr-drug (nuclear receptor and drug) samples via the general form of pseudo amino acid composition [60]. the present study was initiated in an attempt to develop a new method for predicting the interaction of drugs with nuclear receptors by addressing these two points. as demonstrated by a series of recent publications [10, 18, 61-70] and summarized in a comprehensive review [60], to establish a really effective statistical predictor for a biomedical system, we need to consider the following steps: (a) select or construct a valid benchmark dataset to train and test the predictor; (b) represent the statistical samples with an effective formulation that can truly reflect their intrinsic correlation with the object to be predicted; (c) introduce or develop a powerful algorithm or engine to operate the prediction; (d) properly perform cross-validation tests to objectively evaluate the anticipated accuracy of the predictor; (e) establish a user-friendly web-server for the predictor that is accessible to the public. below, let us elaborate how to deal with these steps. the data used in the current study were collected from kegg (kyoto encyclopedia of genes and genomes) [71] at http://www.kegg.jp/kegg/.
kegg is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem, from molecular-level information, especially large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies. here, the benchmark dataset can be formulated as $\mathbb{S} = \mathbb{S}^{+} \cup \mathbb{S}^{-}$ (1), where $\mathbb{S}^{+}$ is the positive subset that consists of the interactive drug-nr pairs only, $\mathbb{S}^{-}$ is the negative subset that contains the non-interactive drug-nr pairs only, and the symbol $\cup$ represents the union in set theory. the so-called "interactive" pair here means the pair whose two counterparts are interacting with each other in the drug-target networks as defined in the kegg database [71]; while the "non-interactive" pair means that its two counterparts are not interacting with each other in the drug-target networks. the positive dataset $\mathbb{S}^{+}$ contains 86 drug-nr pairs, which were taken from he et al. [59]. the negative dataset $\mathbb{S}^{-}$ contains 172 non-interactive drug-nr pairs, which were derived according to the following procedures: (a) separating each of the pairs in $\mathbb{S}^{+}$ into single drug and nr; (b) re-coupling each of the single drugs with each of the single nrs into pairs in a way that none of them occurred in $\mathbb{S}^{+}$; (c) randomly picking the pairs thus formed until reaching the number two times as many as the pairs in $\mathbb{S}^{+}$. the 86 interactive drug-nr pairs and 172 non-interactive drug-nr pairs are given in supplementary information s1, from which we can see that the 86 + 172 = 258 pairs in the current benchmark dataset are actually formed by 25 different nrs and 53 different compounds. since each of the samples in the current network system contains a drug (compound) and a nr (protein), the following procedures were taken to represent the drug-nr pair sample.
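steps (a) to (c) of the negative-set construction above can be sketched in python as follows. this is only an illustrative sketch: the function name `build_negative_set`, the fixed random seed, and the toy identifiers are assumptions made here, not details taken from the study.

```python
import random
from itertools import product

def build_negative_set(positive_pairs, ratio=2, seed=0):
    """Sample non-interactive (drug, nr) pairs by re-coupling the drugs and
    nrs seen in the positive set, skipping any pair that is itself positive."""
    positives = set(positive_pairs)
    # (a) separate every positive pair into its single drug and single nr
    drugs = sorted({drug for drug, _ in positives})
    receptors = sorted({nr for _, nr in positives})
    # (b) re-couple each drug with each nr, keeping only pairs absent from the positive set
    candidates = [pair for pair in product(drugs, receptors) if pair not in positives]
    # (c) randomly pick `ratio` times as many pairs as there are positive pairs
    wanted = ratio * len(positives)
    if wanted > len(candidates):
        raise ValueError("not enough re-coupled pairs to sample from")
    return random.Random(seed).sample(candidates, wanted)

# toy data: these identifiers are made up for illustration
toy_positive = [("D1", "NR1"), ("D2", "NR2"), ("D3", "NR3")]
toy_negative = build_negative_set(toy_positive)
```

with the 53 compounds and 25 nrs reported above, at most 53 × 25 = 1325 couplings are possible, so removing the 86 positive pairs leaves up to 1239 candidates from which the 172 negative pairs would be drawn.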
first, for the drug part in the current benchmark dataset, we can use a 256-d vector to formulate it as given by $\mathbf{D} = [d_1 \; d_2 \; \cdots \; d_i \; \cdots \; d_{256}]^{\mathrm{T}}$ (2), where $\mathbf{D}$ represents the vector for a drug compound, and $d_i$ its i-th ($i = 1, 2, \ldots, 256$) component that can be derived by following the "2d molecular fingerprint procedure" as elaborated in [10]. the 53 molecular fingerprint vectors thus obtained for the 53 drugs in $\mathbb{S}$ are, respectively, given in supplementary information s2. the protein sequences of the 25 different nrs in $\mathbb{S}$ are listed in supplementary information s3. suppose the sequence of a nuclear receptor protein $\mathbf{P}$ with $L$ residues is generally expressed by $\mathbf{P} = \mathrm{R}_1 \mathrm{R}_2 \mathrm{R}_3 \cdots \mathrm{R}_L$ (3), where $\mathrm{R}_1$ represents the 1st residue of the protein sequence $\mathbf{P}$, $\mathrm{R}_2$ the 2nd residue, and so forth. now the problem is how to effectively represent the sequence of equation (3) with a non-sequential or discrete model [72]. this is because all the existing operation engines, such as covariance discriminant (cd) [17, 65, 73-79], neural network [80-82], support vector machine (svm) [62-64, 83], random forest [84, 85], conditional random field [66], nearest neighbor (nn) [86, 87], k-nearest neighbor (knn) [88-90], oet-knn [91-94], and fuzzy k-nearest neighbor [10, 12, 18, 69, 95], can only handle vector but not sequence samples. however, a vector defined in a discrete model may completely lose all the sequence-order information and hence limit the quality of prediction. facing such a dilemma, can we find an approach to partially incorporate the sequence-order effects? actually, one of the most challenging problems in computational biology is how to formulate a biological sequence with a discrete model or a vector, yet still keep considerable sequence order information. to avoid completely losing the sequence-order information for proteins, the pseudo amino acid composition [96, 97] or chou's pseaac [98] was proposed.
ever since the concept of pseaac was proposed in 2001 [96], it has penetrated into almost all the areas of computational proteomics, such as predicting anticancer peptides [99], predicting protein subcellular location [100-106], predicting membrane protein types [107, 108], predicting protein submitochondria locations [109-112], predicting gaba(a) receptor proteins [113], predicting enzyme subfamily classes [114], predicting antibacterial peptides [115], predicting supersecondary structure [116], predicting bacterial virulent proteins [117], predicting protein structural class [118], predicting the cofactors of oxidoreductases [119], predicting metalloproteinase family [120], identifying cysteine s-nitrosylation sites in proteins [66], identifying bacterial secreted proteins [121], identifying antibacterial peptides [115], identifying allergenic proteins [122], identifying protein quaternary structural attributes [123, 124], identifying risk type of human papillomaviruses [125], identifying cyclin proteins [126], identifying gpcrs and their types [15, 16], discriminating outer membrane proteins [127], classifying amino acids [128], detecting remote homologous proteins [129], among many others (see a long list of papers cited in the references section of [60]). moreover, the concept of pseaac was further extended to represent the feature vectors of nucleotides [65], as well as other biological samples (see, e.g., [130-132]). because it has been widely and increasingly used, two powerful software packages, called "pseaac-builder" [133] and "propy" [134], were recently established for generating various special chou's pseudo-amino acid compositions, in addition to the web-server "pseaac" [135] built in 2008.
according to a comprehensive review [60], the general form of pseaac for a protein sequence $\mathbf{P}$ is formulated by $\mathbf{P} = [\psi_1 \; \psi_2 \; \cdots \; \psi_u \; \cdots \; \psi_\Omega]^{\mathrm{T}}$ (4), where the subscript $\Omega$ is an integer, and its value as well as the components $\psi_u$ ($u = 1, 2, \ldots, \Omega$) will depend on how to extract the desired information from the amino acid sequence of $\mathbf{P}$ (cf. equation (3)). below, let us describe how to extract useful information to define the components of pseaac for the nr samples concerned. first, many earlier studies (see, e.g., [136-141]) have indicated that the amino acid composition (aac) of a protein plays an important role in determining its attributes. the aac contains 20 components with each representing the occurrence frequency of one of the 20 native amino acids in the protein concerned. thus, such 20 aac components were used here to define the first 20 elements in equation (4); i.e., $\psi_i = f_i^{(1)}$ ($i = 1, 2, \ldots, 20$) (5), where $f_i^{(1)}$ is the normalized occurrence frequency of the i-th type native amino acid in the nuclear receptor concerned. since aac did not contain any sequence order information, the following steps were taken to make up this shortcoming. to avoid completely losing the local or short-range sequence order information, we considered the approach of dipeptide composition. it contained 20 × 20 = 400 components [142]. such 400 components were used to define the next 400 elements in equation (4); i.e., $\psi_{20+j} = f_j^{(2)}$ ($j = 1, 2, \ldots, 400$) (6), where $f_j^{(2)}$ is the normalized occurrence frequency of the j-th dipeptide in the nuclear receptor concerned. to incorporate the global or long-range sequence order information, let us consider the following approach. according to molecular evolution, all biological sequences have developed starting out from a very limited number of ancestral samples.
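the composition components of equations (5) and (6) can be illustrated with a short python sketch. the normalization used here (dividing residue counts by the sequence length and dipeptide counts by the number of adjacent residue pairs) and the alphabetical ordering of the 20 one-letter codes are assumptions made for illustration; the function name `composition_features` is hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 native amino acids, one-letter codes
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs

def composition_features(sequence):
    """Return the 20 amino acid frequencies followed by the 400 dipeptide frequencies."""
    seq = sequence.upper()
    n_res = len(seq)
    n_pairs = n_res - 1
    if n_pairs < 1:
        raise ValueError("sequence needs at least two residues")
    # first 20 components: occurrence frequency of each amino acid type
    aac = [seq.count(a) / n_res for a in AMINO_ACIDS]
    # next 400 components: occurrence frequency of each overlapping dipeptide
    pair_counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(n_pairs):
        pair = seq[i:i + 2]
        if pair in pair_counts:  # pairs containing non-standard residues are skipped
            pair_counts[pair] += 1
    dpc = [pair_counts[d] / n_pairs for d in DIPEPTIDES]
    return aac + dpc

features = composition_features("MKTAYIAKQR")  # toy sequence, not a real nr
```

the returned list has 20 + 400 = 420 entries, matching the first 420 elements of equation (4).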
driven by various evolutionary forces such as mutation, recombination, gene conversion, genetic drift, and selection, they have undergone many changes including changes of single residues, insertions and deletions of several residues [143], gene doubling, and gene fusion. with the accumulation of these changes over a long period of time, many original similarities between initial and resultant amino acid sequences gradually fade out, but the corresponding proteins may still share many common attributes [37], such as having basically the same biological function and residing at the same subcellular location [144, 145]. to extract the sequential evolution information and use it to define the components of equation (4), the pssm (position specific scoring matrix) was used as described below. according to schaffer [146], the sequence evolution information of a nuclear receptor protein $\mathbf{P}$ with $L$ amino acid residues can be expressed by a $20 \times L$ matrix, as given by equation (7). the scores in equation (7) were generated by using psi-blast [147] to search the uniprotkb/swiss-prot database (the universal protein resource (uniprot); http://www.uniprot.org/) through three iterations with 0.001 as the e-value cutoff for multiple sequence alignment against the sequence of the nuclear receptor concerned.
to scale every element in equation (7) from its original score range into the interval [0, 1], we performed a conversion through the standard sigmoid function, which maps an original score $s$ to $1/(1 + e^{-s})$ (8). we then extracted the useful information from equation (8) to define the next 20 components of equation (4) (equation (10)). moreover, we used the grey system model approach as elaborated in [68] to further define the next 60 components of equation (4) (equation (12), with $j = 1, 2, \ldots, 20$). in that equation, $w_1$, $w_2$, and $w_3$ are weight factors, which were all set to 1 in the current study, and $f_j^{(1)}$ has the same meaning as in equation (5). combining equations (5), (6), (10) and (12), we found that the total number of the components obtained via the current approach for the pseaac of equation (4) is $\Omega = 20 + 400 + 20 + 60 = 500$ (17), and each of the 500 components is given by equation (18). since the elements in equations (2) and (4) are well defined, we can now formulate the drug-nr pair by combining the two equations as given by $\mathbf{G} = \mathbf{D} \oplus \mathbf{P} = [d_1 \; \cdots \; d_{256} \; \psi_1 \; \cdots \; \psi_{500}]^{\mathrm{T}}$ (19), where $\mathbf{G}$ represents the drug-nr pair, $\oplus$ the orthogonal sum, and the 256 + 500 = 756 components are defined by equations (2) and (18). for the sake of convenience, let us use $x_i$ ($i = 1, 2, \ldots, 756$) to represent the 756 components in equation (19); i.e., $\mathbf{G} = [x_1 \; x_2 \; \cdots \; x_i \; \cdots \; x_{756}]^{\mathrm{T}}$ (20). to optimize the prediction quality with a time-saving approach, similar to the treatment in [148-150], let us convert equation (20) to $x_i \Leftarrow (x_i - \langle x \rangle)/\mathrm{SD}(x)$ (21), where the symbol $\langle\,\rangle$ means taking the average of the quantity therein, and SD means the corresponding standard deviation. in this study, the svm (support vector machine) was used as the operation engine. svm has been widely used in the realm of bioinformatics (see, e.g., [62-64, 151-154]). the basic idea of svm is to transform the data into a high dimensional feature space, and then determine the optimal separating hyperplane using a kernel function. for a brief formulation of svm and how it works, see the papers [155, 156]; for more details about svm, see a monograph [157].
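a minimal python sketch of the scaling and merging steps described above is given below. it assumes that the average and standard deviation in equation (21) are taken over the 756 components of a single drug-nr sample and that the population standard deviation is meant; the source text does not settle either point, and the function names are hypothetical.

```python
import math
from statistics import mean, pstdev

def sigmoid(score):
    """Standard sigmoid, mapping any real score into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))

def combine_and_standardize(drug_vector, nr_vector):
    """Concatenate the drug and nr descriptors, then centre and scale the components."""
    x = list(drug_vector) + list(nr_vector)  # 256 + 500 = 756 components in the paper
    avg = mean(x)
    sd = pstdev(x)  # population standard deviation (an assumption of this sketch)
    if sd == 0:
        raise ValueError("all components are identical; cannot standardize")
    return [(xi - avg) / sd for xi in x]

# toy vectors, far shorter than the real 256-d and 500-d descriptors
z = combine_and_standardize([1, 0, 1], [0.2, 0.4])
```

after the conversion the components of `z` have zero mean and unit standard deviation, which keeps the differently scaled fingerprint and pseaac parts comparable.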
in this study, the libsvm package [158], which can be downloaded from http://www.csie.ntu.edu.tw/~cjlin/libsvm/, was used as an implementation of svm, and the popular radial basis function (rbf) was taken as the kernel function. for the current svm classifier, there were two uncertain parameters: the penalty parameter $C$ and the kernel parameter $\gamma$. the method of how to determine the two parameters will be given later. the predictor obtained via the aforementioned procedure is called inr-drug, where "i" means identify, and "nr-drug" means the interaction between nuclear receptor and drug compound. to provide an intuitive overall picture, a flowchart is provided in figure 2 to show the process of how the predictor works in identifying the interactions between nuclear receptors and drug compounds. to provide a more intuitive and easier-to-understand method to measure the prediction quality, the following set of metrics based on the formulation used by chou [159-161] in predicting signal peptides was adopted. according to chou's formulation, the sensitivity, specificity, overall accuracy, and matthews correlation coefficient can be respectively expressed as [62, 65-67] $$\mathrm{Sn} = 1 - \frac{N^{+}_{-}}{N^{+}}, \quad \mathrm{Sp} = 1 - \frac{N^{-}_{+}}{N^{-}}, \quad \mathrm{Acc} = 1 - \frac{N^{+}_{-} + N^{-}_{+}}{N^{+} + N^{-}}, \quad \mathrm{MCC} = \frac{1 - \left( \frac{N^{+}_{-}}{N^{+}} + \frac{N^{-}_{+}}{N^{-}} \right)}{\sqrt{\left( 1 + \frac{N^{-}_{+} - N^{+}_{-}}{N^{+}} \right) \left( 1 + \frac{N^{+}_{-} - N^{-}_{+}}{N^{-}} \right)}} \qquad (23)$$ where $N^{+}$ is the total number of the interactive nr-drug pairs investigated while $N^{+}_{-}$ is the number of the interactive nr-drug pairs incorrectly predicted as the non-interactive nr-drug pairs; $N^{-}$ is the total number of the non-interactive nr-drug pairs investigated while $N^{-}_{+}$ is the number of the non-interactive nr-drug pairs incorrectly predicted as the interactive nr-drug pairs. according to equation (23) we can easily see the following. when $N^{+}_{-} = 0$, meaning none of the interactive nr-drug pairs was mispredicted to be a non-interactive nr-drug pair, we have the sensitivity Sn = 1; while $N^{+}_{-} = N^{+}$, meaning that all the interactive nr-drug pairs were mispredicted to be the non-interactive nr-drug pairs, we have the sensitivity Sn = 0.
likewise, when $N^{-}_{+} = 0$, meaning none of the non-interactive nr-drug pairs was mispredicted, we have the specificity Sp = 1; while $N^{-}_{+} = N^{-}$, meaning that all the non-interactive nr-drug pairs were mispredicted, we have Sp = 0. when $N^{+}_{-} = N^{-}_{+} = 0$, meaning that no pair of either kind was mispredicted, we have Acc = 1 and MCC = 1; when $N^{+}_{-} = N^{+}$ and $N^{-}_{+} = N^{-}$, meaning that every pair was mispredicted, we have Acc = 0 and MCC = -1, meaning total disagreement between prediction and observation; whereas when $N^{+}_{-} = N^{+}/2$ and $N^{-}_{+} = N^{-}/2$, we have Acc = 0.5 and MCC = 0, meaning no better than a random guess. as we can see from the above discussion, it is much more intuitive and easier to understand when using equation (23) to examine a predictor for its four metrics, particularly for its matthews correlation coefficient. it is instructive to point out that the metrics as defined in equation (23) are valid for single-label systems; for multi-label systems, a set of more complicated metrics should be used as given in [162]. how to properly test a predictor for its anticipated success rates is very important for its development as well as its potential application value. generally speaking, the following three cross-validation methods are often used to examine the quality of a predictor and its effectiveness in practical application: independent dataset test, subsampling or k-fold (such as five-fold, seven-fold, or 10-fold) crossover test, and jackknife test [163]. however, as elaborated by a penetrating analysis in [164], considerable arbitrariness exists in the independent dataset test. also, as demonstrated in [165], the subsampling (or k-fold crossover validation) test cannot avoid arbitrariness either. only the jackknife test is the least arbitrary, in that it always yields a unique result for a given benchmark dataset [73, 74, 156, 166-168]. therefore, the jackknife test has been widely recognized and increasingly utilized by investigators to examine the quality of various predictors (see, e.g., [14, 15, 68, 99, 106, 107, 124, 169, 170]). accordingly, in this study the jackknife test was also adopted to evaluate the accuracy of the current predictor. as mentioned above, the svm operation engine contains two uncertain parameters $C$ and $\gamma$. to find their optimal values, a 2-d grid search was conducted by the jackknife test on the benchmark dataset $\mathbb{S}$.
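the jackknife-based 2-d grid search can be sketched as below. libsvm is not used here: `rbf_vote` is a deliberately simple stand-in classifier (a kernel-weighted vote that ignores the penalty parameter) so that the example runs with the python standard library alone. the exponent ranges, the use of overall accuracy as the selection criterion, and the toy data are likewise assumptions of this sketch.

```python
import math
from itertools import product

def rbf_vote(train_x, train_y, query, c, gamma):
    """Stand-in classifier (NOT an svm): rbf-kernel-weighted vote over training samples.
    `c` is accepted only to mirror an svm-like signature and is ignored."""
    score = 0.0
    for x, y in zip(train_x, train_y):
        dist2 = sum((a - b) ** 2 for a, b in zip(x, query))
        score += y * math.exp(-gamma * dist2)  # labels are +1 or -1
    return 1 if score >= 0 else -1

def jackknife_accuracy(samples, labels, classify, c, gamma):
    """Leave each sample out in turn, train on the rest, and test on the held-out one."""
    hits = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        hits += classify(train_x, train_y, samples[i], c, gamma) == labels[i]
    return hits / len(samples)

def grid_search(samples, labels, classify, c_exponents, gamma_exponents):
    """Return (best_accuracy, best_c, best_gamma) over a grid of powers of two."""
    best = None
    for ce, ge in product(c_exponents, gamma_exponents):
        c, gamma = 2.0 ** ce, 2.0 ** ge
        acc = jackknife_accuracy(samples, labels, classify, c, gamma)
        if best is None or acc > best[0]:
            best = (acc, c, gamma)
    return best

# toy data: two well-separated clusters labelled +1 and -1
toy_x = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
toy_y = [1, 1, 1, -1, -1, -1]
best_acc, best_c, best_gamma = grid_search(toy_x, toy_y, rbf_vote, range(-1, 4), range(-9, 2))
```

because every sample is held out exactly once, the jackknife score for a given parameter pair is unique for a given dataset, which is the property the text relies on when comparing grid points.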
the results thus obtained are shown in figure 3, from which it can be seen that the inr-drug predictor reaches its optimal status when $C = 2^{3}$ and $\gamma = 2^{-9}$. the corresponding rates for the four metrics (cf. equation (23)) are given in table 1, where, to facilitate comparison, the overall accuracy Acc reported by he et al. [59] on the same benchmark dataset is also given, although no results were reported by them for Sn, Sp and MCC. it can be observed from the table that the overall accuracy obtained by inr-drug is remarkably higher than that of he et al. [59], and that the rates achieved by inr-drug for the other three metrics are also quite high. these facts indicate that the current predictor not only can yield higher overall prediction accuracy but also is quite stable with low false prediction rates. as mentioned above (section 3.2), the jackknife test is the most objective method for examining the quality of a predictor. however, as a demonstration to show how to practically use the current predictor, we took 41 nr-drug pairs from the study by yamanishi et al. [171] that had been confirmed by experiments as interactive pairs. for such an independent dataset, 34 were correctly identified by inr-drug as interactive pairs, i.e., Sn = 34/41 = 82.93%, which is quite consistent with the rate of 79.07% achieved by the predictor on the benchmark dataset via the jackknife test as reported in table 1. it is anticipated that the inr-drug predictor developed in this paper may become a useful high throughput tool for both basic research and drug development, and that the current approach may be easily extended to study the interactions of drug with other targets as well. since user-friendly and publicly accessible web-servers represent the future direction for developing practically more useful predictors [98, 172], a publicly accessible web-server for inr-drug was established.
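the four metrics of equation (23) reduce to simple arithmetic on the counts of mispredicted pairs, as the following python sketch shows. the function and argument names are hypothetical, and returning 0.0 for the matthews coefficient when its denominator vanishes is a convention chosen here rather than part of the original formulation.

```python
import math

def sensitivity(n_pos, n_pos_missed):
    """Fraction of interactive pairs that were predicted as interactive."""
    return 1 - n_pos_missed / n_pos

def specificity(n_neg, n_neg_missed):
    """Fraction of non-interactive pairs that were predicted as non-interactive."""
    return 1 - n_neg_missed / n_neg

def accuracy(n_pos, n_pos_missed, n_neg, n_neg_missed):
    """Fraction of all pairs that were predicted correctly."""
    return 1 - (n_pos_missed + n_neg_missed) / (n_pos + n_neg)

def mcc(n_pos, n_pos_missed, n_neg, n_neg_missed):
    """Matthews correlation coefficient written in terms of misprediction counts."""
    numerator = 1 - (n_pos_missed / n_pos + n_neg_missed / n_neg)
    denominator = math.sqrt((1 + (n_neg_missed - n_pos_missed) / n_pos)
                            * (1 + (n_pos_missed - n_neg_missed) / n_neg))
    # when every sample receives the same predicted label the ratio is 0/0;
    # returning 0.0 in that case is a convention chosen for this sketch
    return numerator / denominator if denominator else 0.0

# independent test quoted in the text: 34 of 41 interactive pairs recovered
independent_sn = sensitivity(41, 41 - 34)
```

here `independent_sn` equals 34/41, i.e. roughly 0.829, and the limiting cases behave as expected: no mispredictions give an mcc of 1, mispredicting every pair gives -1, and mispredicting half of each class gives 0.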
for the convenience of the vast majority of biologists and pharmaceutical scientists, here is a step-by-step guide showing how users can easily get the desired result from the inr-drug web-server without needing to follow the complicated mathematical equations, which are presented in this paper for the integrity of the predictor's development.
step 1. open the web server at http://www.jci-bioinfo.cn/inr-drug/ and you will see the top page of the predictor on your computer screen, as shown in figure 4. click on the read me button to see a brief introduction to the inr-drug predictor and the caveats when using it.
step 2. either type or copy/paste the query nr-drug pairs into the input box at the center of figure 4. each query pair consists of two parts: one is the nuclear receptor sequence, and the other is the drug. the nr sequence should be in fasta format, while the drug should be given as its kegg code beginning with the symbol #. examples of the query pair input and the corresponding output can be seen by clicking on the example button right above the input box.
step 3. click on the submit button to see the predicted result. for example, if you use the three query pairs in the example window as the input, after clicking the submit button, you will see on your screen that the "hsa:2099" nr and the "d00066" drug are an interactive pair, and that the "hsa:2908" nr and the "d00088" drug are also an interactive pair, but that the "hsa:5468" nr and the "d00279" drug are not an interactive pair. all these results are fully consistent with the experimental observations. it takes about 3 minutes before each of these results is shown on the screen; of course, the more query pairs there are, the more time is usually needed.
step 4. click on the citation button to find the relevant paper that documents the detailed development and algorithm of inr-drug.
step 5.
click on the data button to download the benchmark dataset used to train and test the inr-drug predictor.
step 6. the program code is also available by clicking the download button on the lower panel of figure 4.
nuclear receptors in cell life and death nuclear receptors nuclear receptors: current concepts and future challenges the nuclear receptor superfamily non-steroid nuclear receptors: what are genetic studies telling us their role in renal life? nuclear receptor drug discovery nuclear receptors and drug disposition gene regulation a web-server for identifying g-protein coupled receptors and their families with grey incidence analysis bioinformatical analysis of g-protein-coupled receptors igpcr-drug: a web server for predicting interaction between gpcrs and drugs in cellular networking a cellular automaton image approach for predicting g-protein-coupled receptor functional classes gpcr-2l: predicting g protein-coupled receptors and their types by hybridizing two different modes of pseudo amino acid compositions prediction of g-protein-coupled receptor classes in low homology using chou's pseudo amino acid composition with approximate entropy and hydrophobicity patterns prediction of g-protein-coupled receptor classes based on the concept of chou's pseudo amino acid composition: an approach from discrete wavelet transform using ensemble svm to identify human gpcrs n-linked glycosylation sites based on the general form of chou's pseaac identifying gpcrs and their types with chou's pseudo amino acid composition: an approach from multi-scale energy representation and position specific scoring matrix prediction of g-protein-coupled receptor classes identify the channel-drug interaction in cellular networking with pseaac and molecular fingerprints insights from modelling three-dimensional structures of the human potassium and sodium channels influenza m2 proton channels a model of the complex between cyclin-dependent kinase 5 (cdk5) and the activation domain of
neuronal cdk5 activator rapid and accurate structure determination of coiled-coil domains using nmr dipolar couplings: application to cgmp-dependent protein kinase ialpha the three-dimensional structure of the cgmp-dependent protein kinase i-α leucine zipper domain and its interaction with the myosin binding subunit determination of the packing mode of the coiled-coil domain of cgmp-dependent protein kinase ialpha in solution using charge-predicted dipolar couplings a guide to drug discovery: target selection in drug discovery target discovery a fast flexible docking method using an incremental construction algorithm binding mechanism of coronavirus main proteinase with ligands and its implication to drug design against sars nmr studies on how the binding complex of polyisoprenol recognition sequence peptides and polyisoprenols can modulate membrane structure review: progress in computational approach to drug development against sars molecular modelling and chemical modification for finding peptide inhibitor against sars cov mpro an in-depth analysis of the biological functional studies based on the nmr m2 channel structure of influenza a virus energetic analysis of the two controversial drug binding sites of the m2 proton channel in influenza a virus investigation into adamantane-based m2 inhibitors with fb-qsar designing inhibitors of m2 proton channel against h1n1 swine influenza virus insights from investigating the interaction of oseltamivir (tamiflu) with neuraminidase of the 2009 h1n1 swine flu virus review: structural bioinformatics and its impact to biomedical science identification of proteins interacting with human sp110 during the process of viral infections docking and molecular dynamics study on the inhibitory activity of novel inhibitors on epidermal growth factor receptor (egfr) novel inhibitor design for hemagglutinin against h1n1 influenza virus by core hopping method design novel dual agonists for treating type-2 diabetes by targeting peroxisome 
the authors would like to express their gratitude to the three anonymous reviewers, whose constructive comments are very helpful for strengthening the presentation of the paper. the authors declare no conflict of interest. key: cord-253987-83h861lp authors: tada, takuya; fan, chen; kaur, ramanjit; stapleford, kenneth a.; gristick, harry; nimigean, crina; landau, nathaniel r. 
title: a soluble ace2 microbody protein fused to a single immunoglobulin fc domain is a potent inhibitor of sars-cov-2 infection in cell culture date: 2020-09-17 journal: biorxiv doi: 10.1101/2020.09.16.300319 sha: doc_id: 253987 cord_uid: 83h861lp soluble forms of ace2 have recently been shown to inhibit sars-cov-2 infection. we report on an improved soluble ace2, termed a “microbody”, in which the ace2 ectodomain is fused to fc domain 3 of the immunoglobulin heavy chain. the protein is smaller than previously described ace2-ig fc fusion proteins and contains an h345a mutation in the ace2 catalytic active site that inactivates the enzyme without reducing its affinity for the sars-cov-2 spike. the disulfide-bonded ace2 microbody protein inhibited entry of lentiviral sars-cov-2 spike protein pseudotyped virus and live sars-cov-2 with a potency 10-fold higher than unmodified soluble ace2 and was active after initial virus binding to the cell. the ace2 microbody inhibited the entry of ace2-specific β coronaviruses and viruses with the high infectivity variant d614g spike. the ace2 microbody may be a valuable therapeutic for covid-19 that is active against sars-cov-2 variants and future coronaviruses that may arise. as the severe acute respiratory syndrome coronavirus 2 (sars-cov-2) continues to spread worldwide, there is an urgent need for preventative vaccines and improved therapeutics for treatment of covid-19. the development of therapeutic agents that block specific steps of the coronavirus replication cycle will be highly valuable both for treatment and prophylaxis. coronavirus replication consists of attachment, uncoating, replication, translation, assembly and release, all of which are potential drug targets. virus entry is a particularly advantageous target because, as the first step in virus replication, blocking it spares target cells from becoming infected, and because drugs that block entry do not need to be cell permeable as the targets are externally exposed. 
in sars-cov-2 entry, the virus attaches to the target cell through the interaction of the spike glycoprotein (s) with its receptor, the angiotensin-converting enzyme 2 (ace2) (li, 2015; li et al., 2005; li et al., 2003), a plasma membrane protein carboxypeptidase that degrades angiotensin ii to angiotensin-(1-7) [ang-(1-7)], a vasodilator that promotes sodium transport in the regulation of cardiac function and blood pressure (kuba et al., 2010; riordan, 2003; tikellis and thomas, 2012). ace2 binding triggers s protein-mediated fusion of the viral envelope with the cell plasma membrane or intracellular endosomal membranes. the s protein is synthesized as a single polypeptide that is cleaved by the cellular protease furin into s1 and s2 subunits in the endoplasmic reticulum and then further processed by tmprss2 on target cells (glowacka et al., 2011; hoffmann et al., 2020; matsuyama et al., 2010; shulla et al., 2011). the s1 subunit contains the receptor binding domain (rbd) which binds to ace2 while s2 mediates virus-cell fusion (belouzard et al., 2012; fehr and perlman, 2015; heald-sargent and gallagher, 2012; li et al., 2006; shang et al., 2020). cells that express ace2 are potential targets of the virus. these include cells in the lungs, arteries, heart, kidney, and intestines (harmer et al., 2002; ksiazek et al., 2003; leung et al., 2003). the use of soluble receptors to prevent virus entry by competitively binding to viral envelope glycoproteins was first explored for hiv-1 with soluble cd4. in early studies, a soluble form of cd4 deleted for the transmembrane and cytoplasmic domains was found to block virus entry in vitro (daar et al., 1990; haim et al., 2009; orloff et al., 1993; schenten et al., 1999; sullivan et al., 1998). fusion of the protein to an immunoglobulin fc region, termed an "immunoadhesin", increased the avidity for gp120 by dimerizing the protein and served to increase the half-life of the protein in vivo. 
an enhanced soluble cd4-ig containing a peptide derived from the hiv-1 coreceptor ccr5 was found to potently block infection and to protect rhesus macaques from infection (chiang et al., 2012). the soluble receptor approach to blocking virus entry has been recently applied to sars-cov-2 through the use of recombinant human soluble ace2 protein (hrsace2) (kuba et al., 2005; monteil et al., 2020; wysocki et al., 2010) or hrsace2-igg, which combines soluble ace2 and the fc region of the human immunoglobulin g (igg) (case et al., 2020; lei et al., 2020; qian and hu, 2020), which were shown to inhibit sars-cov and sars-cov-2 entry in a mouse model. in phase 1 and phase 2 clinical trials (haschke et al., 2013; khan et al., 2017), the protein showed partial antiviral activity but a short half-life. addition of the fc region increased the half-life of the protein in vivo. a potential concern with the addition of the ig fc region is the possibility of enhancement, similar to what occurs with antibody-dependent enhancement in which anti-spike protein antibody attaches to fc receptors on immune cells, facilitating infection rather than preventing it (eroshenko et al., 2020). we report here on a soluble human ace2 "microbody" in which the ace2 ectodomain is fused to domain 3 of immunoglobulin g heavy chain fc region (igg-ch3) (maute et al., 2015). the igg-ch3 fc domain served to dimerize the protein, increasing its affinity for the sars-cov-2 s and decreasing the molecular mass of the protein. the ace2 microbody did not bind to cell surface fc receptor, reducing any possibility of infection enhancement. mutation of the active site h345 to alanine in the ace2 microbody protein, a mutation that has been shown to inactivate ace2 catalytic activity (guy et al., 2005), did not decrease its affinity for the s protein. the dimeric ace2 microbody had about 10-fold higher antiviral activity than soluble ace2, which was also a dimer, and a higher affinity for virion binding. 
the ace2 microbody blocked virus entry into ace2.293t cells that over-expressed ace2 as well as all of the cell-lines tested and was fully active against the d614g variant spike protein and a panel of β coronavirus spike proteins. as a means to study sars-cov-2 entry, we developed an assay based on sars-cov-2 s protein pseudotyped lentiviral reporter viruses. the viruses package a lentiviral vector genome that encodes nanoluciferase and gfp separated by a p2a self-processing peptide, providing a convenient means to titer the virus, and the ability to use two different assays to measure infection. to pseudotype the virions, we constructed expression vectors for the full-length sars-cov-2 s and for a δ19 variant deleted for the carboxy-terminal 19 amino acids that removes a reported endoplasmic reticulum retention sequence that blocks transit of the s protein to the cell surface (giroglou et al., 2004) (figure 1a). the vectors were constructed with or without a carboxy-terminal hemagglutinin (ha) epitope tag. pseudotyped viruses were produced in 293t cells cotransfected with the dual nanoluciferase/gfp reporter lentiviral vector plenti.gfp.nluc, the gag/pol expression vector pmdl and an expression vector for the full-length s protein, the δ19 s protein or vesicular stomatitis virus g protein (vsv-g), or without an envelope glycoprotein expression vector. immunoblot analysis showed that full-length and δ19 s proteins were expressed and processed into the cleaved s2 protein (s1 is not visible as it lacks an epitope tag). analysis of the virions showed that the δ19 s protein was packaged into virions at >20-fold higher levels than the full-length protein (figure 1b). this difference was not the result of differences in virion production as similar amounts of virion p24 were present in the cell supernatant. 
analysis of the transfected 293t cells by flow cytometry showed a minor increase in the amount of cell surface δ19 s protein as compared to full-length (figure 1c), suggesting that deletion of the endoplasmic reticulum retention signal was not the primary cause of the increased virion packaging of the δ19 s protein, which may instead result from an inhibitory effect of the s protein cytoplasmic tail on virion incorporation. as a suitable target cell-line, we established a clonal, stably transfected 293t cell-line that expressed high levels of ace2 (figures 1d and s1). a comparison of the infectivity of the viruses on ace2.293t cells showed that the δ19 s protein pseudotype was about 2.5-fold more infectious than the full-length s protein pseudotype (figure 1e). the ha-tag had no effect on infectivity and the nevirapine control demonstrated that the luciferase activity was the result of bona fide infection and not carried-over luciferase in the virus-containing supernatant. to determine the cell-type tropism of the pseudotyped virus, we tested several standard laboratory cell-lines for susceptibility to infection by the δ19 s protein pseudotyped virus. the vsv-g pseudotype, which has very high infectivity on most cell-types, was tested for comparison and virus lacking a glycoprotein was included to control for potential receptor-independent virus uptake. the results showed high infectivity of the δ19 s protein pseudotyped virus on ace2.293t cells, intermediate infectivity on 293t, vero, vero e6, a549, ace2.a549, caco2 and huh7 and low infectivity on a549, chme3, bhk and u937 (figure 1f). analysis by flow cytometry of cell surface ace2 levels showed high level expression on ace2.293t, intermediate level expression on ace2.a549 and low to undetectable levels on a549, caco2 and huh7 (figure s1). the low level of ace2 expression on cells such as vero and caco2 suggests that virus can use very small amounts of the receptor for entry. 
moreover, the pseudotyped virus is a highly sensitive means with which to detect virus entry. soluble ace2 and ace2-fc fusions have been shown to inhibit sars-cov-2 infection (case et al., 2020; kuba et al., 2010; lei et al., 2020; monteil et al., 2020; qian and hu, 2020; wysocki et al., 2010). to increase the effectiveness of soluble ace2 and improve therapeutic potential, we generated an ace2-"microbody" in which the ace2 ectodomain was fused to a single igg ch3 domain of the igg fc region (figure 2a). this domain contains the disulfide bonding cysteine residues of the igg fc that are required to dimerize the protein, which would serve to increase the ace2 microbody avidity for the s protein. to prevent potential unwanted effects of the protein on blood pressure due to the catalytic activity of ace2, we mutated h345, one of the key active site histidine residues of ace2, to alanine, a mutation that has been shown to block catalytic activity (guy et al., 2005). h345 lies underneath the s protein interaction site so was not predicted to interfere with s protein binding (figure 2b). for comparison, we constructed a vector encoding soluble ace2 without the igg ch3. the proteins were produced in transfected 293t cells and purified to homogeneity by ni-nta agarose affinity chromatography followed by size exclusion chromatography (figure s2). the oligomerization state of the proteins was analyzed by sds-page under nonreducing and reducing conditions. under reducing conditions, the ace2 and ace2.h345a microbody proteins and soluble ace2 ran at 130 kda and 120 kda, respectively, consistent with their calculated molecular mass (figures 2c and s2). under nonreducing conditions, the ace2 microbody and ace2.h345a microbody proteins ran at 250 kda, consistent with dimers, while the soluble ace2 ran as a monomer with a mass of 120 kda (figure 2c). 
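the dimer assignment from the gel can be checked with simple arithmetic: dividing the band position under nonreducing conditions by the band position under reducing conditions gives the apparent number of disulfide-linked chains. the sketch below only restates the band sizes quoted in the text; the function name and the dictionary layout are illustrative assumptions and are not part of the study.

```python
def apparent_subunits(nonreduced_kda, reduced_kda):
    """ratio of the nonreducing band mass to the reducing (single chain) band mass."""
    return nonreduced_kda / reduced_kda


# band positions quoted in the text, in kda: (nonreducing, reducing)
bands = {
    "ace2 microbody": (250.0, 130.0),
    "soluble ace2": (120.0, 120.0),
}

for name, (nonreduced, reduced) in bands.items():
    ratio = apparent_subunits(nonreduced, reduced)
    print(f"{name}: ratio {ratio:.2f}, nearest whole number {round(ratio)}")
```

a ratio near 2 is what a disulfide-bonded dimer would give on a denaturing gel, while a ratio near 1 indicates that any dimer present is not held together by disulfide bonds.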
analysis of the proteins by size-exclusion chromatography coupled with multi-angle light scattering (sec-mals) under nondenaturing conditions showed all three proteins to have a molecular mass consistent with dimers (figure 2d). the mass of the ace2 and ace2.h345a microbody proteins was 218 kda and 230 kda, respectively, while soluble ace2 was 180 kda. taken together, the results suggest that the ace2 microbody proteins are disulfide-bonded dimers while soluble ace2 is a nondisulfide-bonded dimer. to compare the relative ability of the soluble ace2 proteins to bind to virions that display the s protein, we established a virion pull-down assay. ni-nta beads were incubated with a serial dilution of the carboxy-terminal his-tagged soluble ace2 proteins. unbound ace2 protein was removed and the beads were then incubated with a fixed amount of lentiviral pseudotyped virions. free virions were removed and the bound virions were quantified by immunoblot analysis for virion p24 capsid protein. to confirm that virus binding to the beads was specific for the bead-bound ace2, control virions lacking the spike were tested. the results showed that s protein pseudotyped virions bound to the beads while virions that lacked the s protein failed to bind, confirming that the binding was specific (figure 3a). in addition, a high titer human serum from a recovered individual blocked binding of the virions to the bead-bound ace2 microbody (figure s5). analysis of the soluble ace2 and ace2 microbody proteins that had bound to the beads showed that similar amounts of each protein had bound (figure 3b). immunoblot analysis of virion binding to the bead-bound soluble ace2 proteins showed that the wild-type and h345a microbody proteins both bound to virions more efficiently than soluble ace2 (figure 3c) and that the ace2.h345a microbody bound more virions than the wild-type microbody protein. this was unexpected as h345 does not lie in the interaction surface with the s protein. 
to determine the relative antiviral activity of soluble ace2 and the ace2 microbody proteins, we tested their ability to block infection by sars-cov-2 δ19 s protein pseudotyped gfp/luciferase reporter virus. a fixed amount of pseudotyped reporter virus was incubated with the ace2 proteins and then used to infect ace2.293t cells. after 2 days, luciferase activity and the number of gfp+ cells in the infected cultures were analyzed. for comparison, a high-titer recovered patient serum with a neutralizing titer of 1:330 (figure s5) was also tested. the results showed that soluble ace2 had moderate inhibitory activity with an ec50 of 1.24 µg/ml. the ace2 microbody was significantly more potent, with an ec50 of 0.36 µg/ml, and the ace2.h345a microbody protein was somewhat more potent than the wild-type ace2 microbody with an ec50 of 0.15 µg/ml (figure 4a). inhibition of infection by the soluble ace2 proteins was comparable to recovered patient serum although it is not possible to directly compare the two inhibitors as the mass amount of anti-s protein antibody in the serum is not known. to confirm the results, we analyzed the infected cells by flow cytometry to determine the number of gfp+ cells. the inhibition curves were similar to the luciferase curves, confirming that the ace2 proteins had decreased the number of cells infected and did not simply reduce expression of the reporter protein (figure 4b, top). representative images of the gfp+ cells provide visual confirmation of the results (figure 4b, below). the inhibitory activity of the soluble ace2 proteins was specific for the sars-cov-2 s protein as they did not inhibit vsv-g pseudotyped virus (figure 4c). the ace2 microbody was somewhat more active when tested on untransfected 293t that express low levels of ace2 (figure 4d). 
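the text reports ec50 values but does not state how they were derived from the dose-response curves. as a minimal sketch, assuming that an ec50 is read off as the concentration at which the reporter signal falls to 50% of the untreated control, the value can be estimated by interpolating on a log-concentration axis between the two doses that bracket 50%. the function names, the dilution series and the percent infection values below are all made up for illustration and are not data from the study; a four-parameter logistic fit would be a common alternative.

```python
import math


def percent_infection(signal, untreated, background=0.0):
    """reporter signal as a percentage of the untreated-virus control."""
    return 100.0 * (signal - background) / (untreated - background)


def ec50_by_interpolation(concentrations, infection_pct):
    """estimate the concentration that gives 50% infection.

    interpolates linearly on log10(concentration) between the two
    neighbouring doses that bracket 50%. returns None if the curve
    never crosses 50%.
    """
    points = sorted(zip(concentrations, infection_pct))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if (y_lo - 50.0) * (y_hi - 50.0) <= 0 and y_lo != y_hi:
            frac = (y_lo - 50.0) / (y_lo - y_hi)
            log_ec50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ec50
    return None


# made-up two-fold dilution series (µg/ml) and made-up percent infection values
doses = [0.0625, 0.125, 0.25, 0.5, 1.0, 2.0]
infection = [95.0, 85.0, 70.0, 40.0, 20.0, 8.0]
print(ec50_by_interpolation(doses, infection))
```

interpolating on the log scale matches the way serial dilutions are spaced, so the estimate does not depend on the units chosen for concentration.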
to determine the ability of the ace2 microbody proteins to block the replication of live sars-cov-2, we used the replication-competent sars-cov-2, icsars-cov-2mng, that encodes an mneongreen reporter gene in orf7 (xie et al., 2020). serially diluted ace2 microbody proteins were incubated with the virus and the mixture was then used to infect ace2.293t cells. the results showed that 1-0.125 µg of ace2 microbody protein blocked live virus replication (figure 4e). soluble ace2 was less active; 1 µg of the protein had a 50% antiviral effect and the activity was lost with 0.5 µg. the antiviral activity of ace2 proteins against live virus was similar to pseudotyped virus, except that in the live virus assay, the wild-type and h345a microbodies were of similar potency. in the experiments described above, the proteins were incubated with virus prior to infection. to determine whether they would be active when added at later time points, the ace2 proteins were tested in an "escape from inhibition" assay in which the soluble ace2 and ace2 microbody proteins were added to cells at the same time as virus or up to 6 hours post-infection. the results showed that addition of the microbody together with the virus (t0) blocked the infection by 80%. addition of the microbody 30 minutes post-infection maintained most of the antiviral effect, and even 2 hours post-infection the inhibitor blocked 55% of the infection. at 4 hours post-infection, the ace2 microbody retained its blocking activity at 10 µg/ml but was less active with decreasing amounts of inhibitor (figure 5a). these results suggest that the ace2 microbody is highly efficient at neutralizing the virus when present before the virus has had a chance to bind to the cell and that it maintains its ability to block infection when added together with the cells and even 2 hours after the virus has been exposed to cells, a time at which most of the virus has not yet bound to the cell. 
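the percentages quoted for the escape from inhibition assay can be read as the reduction in reporter signal relative to an untreated infection. the sketch below shows that calculation under the assumption that a background reading from uninfected cells is subtracted first; the function name and all luciferase readings are hypothetical, with the readings chosen only so that the 0 hour and 2 hour examples land near the 80% and 55% figures given in the text.

```python
def percent_inhibition(treated, untreated, background=0.0):
    """percent reduction in reporter signal relative to untreated infection."""
    infected_fraction = (treated - background) / (untreated - background)
    return 100.0 * (1.0 - infected_fraction)


# hypothetical luciferase readings (arbitrary units),
# keyed by the time of inhibitor addition in hours after virus
untreated_signal = 10000.0
background_signal = 100.0
treated_by_hour = {0.0: 2080.0, 0.5: 3070.0, 2.0: 4555.0}

for hour, signal in sorted(treated_by_hour.items()):
    inhibition = percent_inhibition(signal, untreated_signal, background_signal)
    print(f"inhibitor added at {hour} h: {inhibition:.0f}% inhibition")
```

plotting these values against the time of addition gives the kind of escape curve described above, where inhibition falls as more virus has engaged the cell before the inhibitor arrives.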
to determine whether the ace2 microbody could prevent virus entry once the virus bound to the cell, the virus was prebound by incubating it with cells for 1 hour at 22°c, the unbound virus was removed and the ace2 microbody was added at increasing time points. the results showed that removal of the unbound virus after 1 hour incubation resulted in less infection as compared to when the virus was incubated with the cells for 4 hours, indicating that only a fraction of the virus had bound to cells. however, virus that was bound could be blocked by the ace2 microbody for another 30 minutes post-binding (figure 5b). the ability to block entry of the cell-bound virus suggests that virus binding results from a small number of spike molecules binding to ace2. over the next 30-60 minutes, additional spike:ace2 interactions form, escaping the ability of the ace2 microbody to block virus entry. the results demonstrate that the ace2 microbody is a highly potent inhibitor of free virus and maintains its antiviral activity against virus newly bound to the cell. sars-cov-2 containing a d614g point mutation in the s protein has been found to be circulating in the human population with increasing prevalence (daniloski et al., 2020; eaaswarkhanth et al., 2020; korber et al., 2020; zhang et al., 2020). the d614g mutation was found to decrease shedding of the spike protein from the virus and to cause the spike to assume a fusion-ready conformation, resulting in increased infectivity and most likely contributing to its increasing prevalence. to determine the ability of the soluble ace2 proteins to block entry of virus with the d614g s protein, we introduced the mutation into the δ19 s protein expression vector and generated pseudotyped reporter viruses (figure 6a). 
analysis of the infectivity of the d614g and wild-type pseudotyped viruses on the panel of cell-lines showed that the mutation increased the infectivity of virus 2-4-fold on 293t, ace2.293t, vero and veroe6 cells, consistent with previous reports (daniloski et al., 2020; yurkovetskiy et al., 2020; zhang et al., 2020). infectivity of the mutated virus was also increased in a549, ace2.a549 and caco2 although the overall infectivity of these cells was low (figure 6b). to determine the ability of the soluble ace2 proteins to neutralize the virus with the variant s protein, serial dilutions of the soluble ace2 proteins were tested for their ability to block wild-type and d614g s pseudotyped virus. the results showed that soluble ace2 had moderate antiviral activity against wild-type virus, while the wild-type and h345a microbody proteins were more potent (figure 6c). the ace2.h345a microbody was somewhat more active at low concentrations than the wild-type protein. to test the relative binding affinity of the soluble ace2 proteins for wild-type and d614g mutated spike, we tested the pseudotyped virions in the ace2-virus binding assay (figure 6d). the results showed that virus with the d614g s bound efficiently to soluble ace2. the results demonstrate the broad activity of the ace2 microbody. we report the development of a soluble form of ace2 in which the ectodomain of ace2 is fused to a single domain of the igg heavy chain fc. the domain renders the protein smaller than those fused to the full-length fc yet retains the cysteine residues required for dimerization and the ability to increase the in vivo half-life (maute et al., 2015). the microbody protein was shown to be a disulfide-bonded dimer, in contrast to soluble ace2 lacking the fc domain, which was dimeric but not disulfide-bonded. although both proteins are dimeric, the ace2 microbody had about 10-fold more antiviral activity than soluble ace2 and bound to virions with a >4-fold increased affinity. 
while high affinity anti-spike rbd monoclonal antibodies that potently inhibit sars-cov-2 infection will be of great value in the treatment of covid-19, the soluble receptor proteins have advantageous features. the ace2 microbody is of fully human origin so should be relatively non-immunogenic. in addition, it is expected to be broadly active against mutated variant spike proteins that may arise in the human population. the microbody was fully active against virus with the d614g s protein, a variant of increasing prevalence with increased infectivity in vitro (figure 6) and was highly active against ace2-specific s proteins from other β coronaviruses. it has been previously shown that a recombinant ace2-fc fusion had a major effect on blood pressure in a mouse model (liu et al., 2018). it was therefore important to inactivate ace2 carboxypeptidase activity in the microbody to decrease unwanted effects on blood pressure associated with its use therapeutically. the h345a mutation alters one of the histidines essential for ace2 catalytic activity yet did not impair antiviral activity against sars-cov-2 or other β coronavirus spike proteins. in some of our analyses, the ace2.h345a microbody appeared to be more active than the wild-type protein although the significance of this difference was unclear as the two proteins had similar activity in the live virus replication assay. escape from inhibition studies provided insight into the kinetics of virus infection and the mechanism of inhibition by the soluble receptors. pretreatment of virus with the ace2 microbody potently neutralized the virus as did simultaneous addition of virus and microbody to cells. furthermore, the protein retained its ability to prevent infection even when added to the culture at times after addition of virus, blocking infection by about 50% when added 1 hour after virus addition. the ace2 microbody was partially active even on virus that had already attached to the cell. 
when virus was pre-bound for 2 hours, a time at which about 10% of the infectious virus had bound the cell, the ace2 microbody retained the ability to prevent infection of about 50% of the bound virus ( figure 5a ). taken together, the experiments suggest a series of events in which the virus binds to cells over a period of about 4 hours. during this time, the ace2 microbody is highly efficient, neutralizing nearly all of the free virus. once the virus binds to the cell, the ace2 microbody retains its ability to block infection for about 30 min, suggesting that binding is initially mediated by a small number of s proteins and that over 2 hours, additional s proteins are recruited to interact with target cell ace2, a period during which the ace2 microbody remains able to block the viral fusion reaction. once a sufficient number of s protein:ace2 interactions have formed, the virus escapes neutralization. it was surprising that the ace2 microbody had more antiviral activity than soluble ace2 as both proteins are dimeric. in addition, the ace2 microbody protein showed somewhat better binding to virions than soluble ace2. the reasons for these differences are not clear. it is possible that the disulfide bonds of the ace2 microbody stabilize the dimer or that they position the individual monomers in a more favorable conformation to bind to the individual subunits of the s protein trimer. it is worth noting that in most of the experiments, we used ace2.293t cells that overexpress ace2 compared to the cell-lines tested. on untransfected 293t cells that express barely detectable levels of ace2, the antiviral activity of the microbody protein was increased, suggesting that the antiviral activity of the ace2 microbody may be under-estimated by the use of ace2 overexpressing cells. recent reports have described similar soluble ace2 proteins. 
recently, soluble ace2-related inhibitors including rhace2 were shown to partially block infection (case et al., 2020; lei et al., 2020; monteil et al., 2020), although the proteins had a short half-life (< 2 hours in mice) (wysocki et al., 2010), limiting their clinical usefulness. in contrast, a dimeric rhesus ace2-fc fusion protein had a plasma half-life in mice greater than 1 week (liu et al., 2018). the half-life of the ace2 microbody in vivo has not yet been tested, but the protein retained antiviral activity for several days in tissue culture, significantly longer than soluble ace2 (figure s3). the phenomenon of antibody-dependent enhancement is caused by the interaction of the fc domain of non-neutralizing antibody with the fc receptor on cells, which then serves to promote rather than inhibit virus infection. a similar phenomenon is possible with receptor-fc fusion proteins by interaction with fc receptor on cells. because the ace2 microbody contained only a single fc domain, it was not expected to interact with fc receptor. to test whether this was the case, we tested the ace2 microbody in an enhancement assay using u937 cells, which express fc receptors. the ace2 microbody protein did not detectably bind to cells that express the fc γ receptor and the cells did not become infected, suggesting that this mechanism is not likely to play a role in vivo (figure s4 and not shown). pseudotyped viruses have been highly useful for studies of sars-cov-2 entry. vectors for producing sars-cov-2 lentiviral pseudotypes have been developed by several laboratories (crawford et al., 2020; nie et al., 2020; ou et al., 2020; schmidt et al., 2020; shang et al., 2020; xia et al., 2020). the vectors we report here produce pseudotyped lentiviral viruses with very high infectivity. 
the high infectivity of the pseudotypes produced is due in part to efficient expression of a codon-optimized δ19 s protein and the efficient virion incorporation that results from the cytoplasmic tail truncation. the δ19 s protein was present at only slightly higher levels on the cell surface than the full-length protein, suggesting that this small increase does not fully account for the large increase in virion incorporation. a possible explanation is that the full-length cytoplasmic tail sterically hinders virion incorporation by conflicting with the underlying viral matrix protein and that the deletion removes the conflict. also contributing to high viral titers is the use of a separate gag/pol packaging vector and lentiviral transfer vector, as opposed to a lentiviral proviral dna encoding gag/pol and the reporter gene, a strategy that resulted in higher reporter gene expression as shown in a direct comparison (not shown). moreover, the dual luciferase/gfp reporter allows for measurement of infectious virus titer by flow cytometry and for the high sensitivity of the nanoluciferase read-out. the lentiviral pseudotypes are highly useful for rapidly titering neutralizing antibody in patient serum. in a study of over 100 sera from recovering patients, we found the pseudotype assay results to be highly correlated with those of a live virus neutralization assay (submitted). a feature of soluble receptors is that, because the virus spike protein needs to conserve receptor binding affinity to maintain transmissibility, they should maintain their ability to neutralize s protein variants. sars-cov-2 s variants have been found to be circulating in the human population and it is likely that others are yet to emerge, some of which may be less sensitive to neutralization by the therapeutic monoclonal antibodies currently under development. 
the recently identified sars-cov-2 variant encoding the d614g s protein has been found to be spreading with increased frequency in the human population (daniloski et al., 2020; eaaswarkhanth et al., 2020; korber et al., 2020; zhang et al., 2020). the d614g s protein was found to be more resistant to shedding from the virion and to adopt a conformation that favors ace2-binding and is in a more fusion-competent state (yurkovetskiy et al., 2020; zhang et al., 2020). we confirmed the increased infectivity of virions and find that the d614g s protein has a higher affinity for ace2 as measured in a virion binding assay. nevertheless, the ace2 microbody maintained its ability to neutralize d614g s protein pseudotyped virus. the ability of the ace2 microbody to neutralize diverse β-coronaviruses suggests that it may also be able to neutralize novel ace2-using coronaviruses that may be transferred to the human population in the future. the microbody protein could serve as an off-the-shelf reagent that could be rapidly deployed. the dual gfp/nanoluciferase lentiviral vector plenti.gfp.nluc was generated by overlap extension pcr. a dna fragment encoding gfp was amplified with a forward primer containing a bamh-i site and a reverse primer encoding the p2a sequence. the nanoluciferase gene (nluc) was amplified with a forward primer encoding the p2a motif and a reverse primer containing a 3'-sal-i site. the amplicons were mixed and amplified with the external primers. the fused amplicon was cleaved with bamh-i and sal-i and cloned into plenti.cmv.gfp.puro (addgene plasmid #17448, provided by eric campeau and paul kaufman) (campeau et al., 2009). the sars-cov-2 s expression vector pccov2.s was chemically synthesized as dna fragments a and b encoding codon-optimized 5' and 3' halves, respectively, of the s gene of wuhan-hu-1/2019 sars-cov-2 isolate (table s2 and the amplicon was then cloned into the kpn-i and xho-i sites of pcdna6. 
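the overlap extension step described above can be illustrated in silico. the following is a minimal python sketch, assuming two amplicons whose ends share an identical overlap (here a short placeholder standing in for the p2a-encoding region). the function name, the minimum-overlap parameter and the toy sequences are illustrative assumptions, not the primers or sequences used in this study.

```python
def fuse_by_overlap(upstream: str, downstream: str, min_overlap: int = 8) -> str:
    """Join two fragments whose 3' and 5' ends share an identical overlap.

    Hypothetical helper: mimics the product of overlap extension pcr
    when the end of `upstream` equals the start of `downstream`.
    """
    up, down = upstream.lower(), downstream.lower()
    max_len = min(len(up), len(down))
    # Try the longest candidate overlap first, down to the minimum allowed.
    for k in range(max_len, min_overlap - 1, -1):
        if up[-k:] == down[:k]:
            # Keep the shared region once and append the rest of downstream.
            return up + down[k:]
    raise ValueError("no overlap of the required length was found")


# Toy sequences (hypothetical, for illustration only).
linker = "ggaagcggagct"                    # placeholder for the shared region
gfp_amplicon = "atggtgagcaag" + linker     # reverse primer adds the overlap
nluc_amplicon = linker + "atggtcttcaca"    # forward primer adds the overlap

fused = fuse_by_overlap(gfp_amplicon, nluc_amplicon)
print(fused)
```

in the wet-lab procedure the fused product is then amplified with the external primers; the sketch only shows why the two primers must encode the same shared sequence.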
the ace2 microbody expression vector pcace2-microbody was generated by overlap extension pcr that fused the extracellular domain of ace2 with human immunoglobulin g heavy chain fc domain 3 using a forward primer containing a kpn-i site and a reverse primer containing an 8xhis-tag and xho-i site. the amplicon was cloned into the kpn-i and xho-i sites of pcdna6. expression vector pcace2.h345a-microbody, which expresses the ace2.h345a microbody, was generated by overlap extension pcr using primers that overlapped the mutation. full-length cdna sequence, primer sequences and amino acid sequences are shown in tables s1-3. control and recovered patient sera were collected from patients through the nyu vaccine center with written consent under i.r.b. approval (irb 20-00595 and irb 18-02037) and were deidentified. vero e6, caco2, a549, ace2 a549, bhk, huh7 293t, vero and chme3 cells were sars-cov-2 s protein pseudotyped lentiviral stocks were produced by cotransfecting virus stocks were titered on 293t by flow cytometry and for luciferase activity. the p24 concentration was measured and the virus was used at a concentration of 1.0 µg/ml. to test the inhibitory activity of soluble receptors and convalescent sera, 50 µl serially diluted inhibitor or convalescent patient serum was incubated for 30 min at room temperature with 5 µl pseudotyped reporter virus (approximately 5 x 10^6 cps luciferase activity/µl) at a moi of 0.1 in a volume of 100 µl. the mixture was added to ace2.293t cells in a 96 well tissue culture dish containing 1 x 10^4 cells/well. after 2 days, the culture medium was removed and 50 µl nano-glo luciferase substrate (promega) and 50 µl medium were added to each well. the supernatant (70 µl) was transferred to a microtiter plate and the luminescence was read in an envision 2103 microplate luminometer (perkinelmer). alternatively, the gfp+ cells were quantified by flow cytometry with pacific blue viability dye to exclude dead cells (biolegend). 
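the read-out of the neutralization assay above can be reduced to a percent-infection curve and a half-maximal inhibitory concentration. the following python sketch assumes luciferase counts normalized to a no-inhibitor control and log-linear interpolation between the two dilutions that bracket 50% infection. the function names and the example numbers are hypothetical and are not data from this study; a four-parameter curve fit would normally be preferred over interpolation.

```python
import math


def serial_dilutions(start, fold, n):
    """Return n concentrations, starting at `start`, each diluted `fold`-fold."""
    return [start / fold**i for i in range(n)]


def percent_infection(counts, no_inhibitor_counts, background=0.0):
    """Normalize luciferase counts to the no-inhibitor control (100%)."""
    span = no_inhibitor_counts - background
    return [100.0 * (c - background) / span for c in counts]


def ic50_by_interpolation(concs, percents):
    """Estimate the ic50 by log-linear interpolation across the 50% crossing.

    Returns None if the curve never crosses 50% infection.
    """
    pairs = sorted(zip(concs, percents))  # ascending concentration
    for (c_lo, p_lo), (c_hi, p_hi) in zip(pairs, pairs[1:]):
        if p_lo >= 50.0 >= p_hi and p_lo != p_hi:
            frac = (p_lo - 50.0) / (p_lo - p_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical example: 2-fold dilution series of an inhibitor (µg/ml)
# and made-up luciferase counts (cps) for each dilution.
concs = serial_dilutions(10.0, 2, 6)
counts = [5.0e4, 3.5e5, 6.5e5, 8.0e5, 9.5e5, 1.0e6]
percents = percent_infection(counts, no_inhibitor_counts=1.0e6)
ic50 = ic50_by_interpolation(concs, percents)
print(percents, ic50)
```

interpolating on a log scale matches the geometric spacing of a serial dilution; with only a few points per curve the estimate is coarse and should be read as a screening value.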
293f cells (thermo fisher) at a density of 2.5 x 10^6 cells/ml were transfected with microbody expression vector plasmid dna using polyethyleneimine (polysciences, inc) at a 1:3 plasmid:pei ratio. the cells were then cultured at 30°c and at 12 hours posttransfection 10 mm sodium butyrate was added. after 4 days, the supernatant culture medium was collected, filtered and adjusted to ph 8.0. the medium was passed over a the absolute molecular masses of the purified protein complexes were determined by sec/mals. the proteins were injected onto a superdex 200 10/300 gl gel-filtration chromatography column equilibrated in sample buffer that was connected to a dawn heleos ii 18-angle light-scattering detector (wyatt technology), a dynamic light-scattering detector (dynapro nanostar; wyatt technology) and an optilab t-rex refractive index detector (wyatt technology). the data were collected at 25°c at a flow rate of 0.5 ml/minute every second. the molecular mass of each protein was determined by analysis with astra 6 software. 293t cells were transfected by lipofection with 4 µg pcace2-microbody. at 72 hours posttransfection, 0.5 ml of culture supernatant was incubated with nickel-nitrilotriacetic acid-agarose beads (qiagen). the beads were washed, and bound protein was eluted with laemmli loading buffer. the proteins were analyzed on an immunoblot probed with mouse anti-6xhis antibody (invitrogen) and horseradish peroxidase (hrp)-conjugated goat anti-mouse igg secondary antibody (sigma-aldrich). the proteins were visualized using luminescent substrate and scanned on a li-cor biosciences fc imaging system (li-cor biotechnology). ratios were calculated as the his (spike) signal intensity divided by the p24 signal intensity for an identical exposure of the blot. transfected cells were lysed in buffer containing 50 mm hepes, 150 mm kcl, 2 mm edta, 0.5% np-40, and protease inhibitor cocktail. 
protein concentration in the lysates was measured by bicinchoninic protein assay and the lysates (40 µg) were separated by sds-page. the proteins were transferred to polyvinylidene difluoride membranes and probed with anti-ha mab (covance), mouse anti-his mab (invitrogen) and anti-gapdh mab (life technologies) followed by goat anti-mouse hrp-conjugated second antibody (sigma). the blots were visualized using luminescent substrate (millipore) on a li-cor biosciences fc imaging system. soluble ace2 proteins (10 µg) were mixed with 20 µl nickel beads for 1 hour at 4°c. unbound protein was removed by washing the beads with pbs. the beads were resuspended in pbs and mixed with 40 µl pseudotyped lentiviral virions. after 1 h incubation at 4°c, the beads were washed with pbs, resuspended in reducing laemmli loading buffer and heated to 90°c. the eluted proteins were separated by sds-page and analyzed on an immunoblot probed with anti-p24 antibody (ag3.0) followed by goat anti-mouse hrp-conjugated second antibody. mneongreen sars-cov-2 (xie et al., cell host and microbe 2020) was obtained from the world reference center for emerging viruses and arboviruses at the university of texas medical branch. the virus was passaged once on vero e6 cells (atcc crl-1586), clarified by low-speed centrifugation, aliquoted, and stored at -80°c. the infectious virus titer was determined by plaque assay on vero e6 cells after staining with crystal violet. virus neutralization was determined as previously described (xie et al., biorxiv 2020). ace2.293t cells were seeded in a 96-well plate (1 x 10^4/well). the next day, mneongreen sars-cov-2 (moi = 0.5) was mixed 1:1 with serially 2-fold diluted soluble ace2 protein in dmem/2% fbs and incubated for 1 hour at 37°c. the virus:protein mixture was then added to the ace2.293t cells and incubated for 24 hours at 37°c in 5% co2. 
the cells were fixed with 4% paraformaldehyde, stained with dapi and the mneongreen+ cells were counted on a cellinsight cx5 platform high content microscope (thermo fisher). all experiments were performed in technical duplicates or triplicates and data were analyzed using graphpad prism (version 7.0e). statistical significance was determined by the two-tailed, unpaired t test. significance was based on two-sided testing and attributed to p < 0.05. confidence intervals are shown as the mean ± sd or sem (*p ≤ 0.05, **p ≤ 0.01, ***p ≤ 0.001, ****p ≤ 0.0001). mechanisms of coronavirus cell entry mediated by the viral spike protein a versatile viral system for expression and depletion of proteins in mammalian cells neutralizing antibody and soluble ace2 inhibition of a replication-competent vsv-sars-cov-2 and a clinical isolate of sars enhanced recognition and neutralization of hiv-1 by antibody-derived ccr5-mimetic peptide variants protocol and reagents for pseudotyping lentiviral particles with sars-cov-2 spike protein for neutralization assays the d614g mutation in sars-cov spike increases transduction of multiple human cell types. biorxiv could the d614g substitution in the sars-cov-2 spike (s) protein be associated with higher covid-19 mortality? 
implications of antibody-dependent enhancement of infection for sars-cov-2 countermeasures coronaviruses: an overview of their replication and pathogenesis retroviral vectors pseudotyped with severe acute respiratory syndrome coronavirus s protein evidence that tmprss2 activates the severe acute respiratory syndrome coronavirus spike protein for membrane fusion and reduces viral control by the humoral immune response identification of critical active-site residues in angiotensin-converting enzyme-2 (ace2) by site-directed mutagenesis soluble cd4 and cd4-mimetic compounds inhibit hiv-1 infection by induction of a short-lived activated state quantitative mrna expression profiling of ace 2, a novel homologue of angiotensin converting enzyme pharmacokinetics and pharmacodynamics of recombinant human angiotensin-converting enzyme 2 in healthy human subjects ready, set, fuse! the coronavirus spike protein and acquisition of fusion competence sars-cov-2 cell entry depends on ace2 and tmprss2 and is blocked by a clinically proven protease inhibitor a pilot clinical trial of recombinant human angiotensin-converting enzyme 2 in acute respiratory distress syndrome tracking changes in sars-cov-2 spike: evidence that d614g increases infectivity of the covid-19 virus a novel coronavirus associated with severe acute respiratory syndrome trilogy of ace2: a peptidase in the renin-angiotensin system, a sars receptor, and a partner for amino acid transporters a crucial role of angiotensin converting enzyme 2 (ace2) in sars coronavirus-induced lung injury structure of the sars-cov-2 spike receptor-binding domain bound to the ace2 receptor neutralization of sars-cov-2 spike pseudotyped virus by recombinant ace2-ig functional assessment of cell entry and receptor usage for lineage b beta-coronaviruses enteric involvement of severe acute respiratory syndrome-associated coronavirus infection receptor recognition mechanisms of coronaviruses: a decade of structural studies structure of sars 
coronavirus spike receptor-binding domain complexed with receptor insights from the association of sars-cov sprotein with its receptor, ace2 angiotensin-converting enzyme 2 is a functional receptor for the sars coronavirus novel ace2-fc chimeric fusion provides long-lasting hypertension control and organ protection in mouse models of systemic renin angiotensin system activation efficient activation of the severe acute respiratory syndrome coronavirus spike protein by the transmembrane protease tmprss2 engineering high-affinity pd-1 variants for optimized immunotherapy and immuno-pet imaging infections in engineered human tissues using clinical-grade soluble human ace2 establishment and validation of a pseudovirus neutralization assay for sars-cov-2 two mechanisms of soluble cd4 (scd4)-mediated inhibition of human immunodeficiency virus type 1 (hiv-1) infectivity and their relation to primary hiv-1 isolates with reduced sensitivity to scd4 characterization of spike glycoprotein of sars-cov-2 on virus entry and its immune cross-reactivity with sars-cov ucsf chimera--a visualization system for exploratory research and analysis ig-like ace2 protein therapeutics: a revival in development during the covid-19 pandemic angiotensin-i-converting enzyme and its relatives effects of soluble cd4 on simian immunodeficiency virus infection of cd4-positive and cd4-negative cells measuring sars-cov-2 neutralizing antibody activity using pseudotyped and chimeric viruses cell entry mechanisms of sars-cov-2 a transmembrane serine protease is linked to the severe acute respiratory syndrome coronavirus receptor and activates virus entry determinants of human immunodeficiency virus type 1 envelope glycoprotein activation by soluble cd4 and monoclonal antibodies angiotensin-converting enzyme 2 (ace2) is a key modulator of the renin angiotensin system in health and disease targeting the degradation of angiotensin ii with recombinant angiotensin-converting enzyme 2: prevention of 
angiotensin ii-dependent hypertension inhibition of sars-cov-2 (previously 2019-ncov) infection by a highly potent pan-coronavirus fusion inhibitor targeting its spike protein that harbors a high capacity to mediate membrane fusion an infectious cdna clone of sars-cov-2 sars-cov-2 spike protein variant d614g increases infectivity and retains sensitivity to antibodies that target the receptor binding domain the d614g mutation in the sars-cov-2 spike protein reduces we thank lin xinhua and hanna nazeeh (nyulh) and benjamin tenoever for cell-lines. atgtcaagctcttcctggctccttctcagccttgttgctgtaactgctgctcagtccaccattgaggaacaggccaagacatttt tggacaagtttaaccacgaagccgaagacctgttctatcaaagttcacttgcttcttggaattataacaccaatattactgaaga gaatgtccaaaacatgaataatgctggggacaaatggtctgcctttttaaaggaacagtccacacttgcccaaatgtatccacta caagaaattcagaatctcacagtcaagcttcagctgcaggctcttcagcaaaatgggtcttcagtgctctcagaagacaagagc aaacggttgaacacaattctaaatacaatgagcaccatctacagtactggaaaagtttgtaacccagataatccacaagaatgct tattacttgaaccaggtttgaatgaaataatggcaaacagtttagactacaatgagaggctctgggcttgggaaagctggagat ctgaggtcggcaagcagctgaggccattatatgaagagtatgtggtcttgaaaaatgagatggcaagagcaaatcattatgagg actatggggattattggagaggagactatgaagtaaatggggtagatggctatgactacagccgcggccagttgattgaagat gtggaacatacctttgaagagattaaaccattatatgaacatcttcatgcctatgtgagggcaaagttgatgaatgcctatcctt cctatatcagtccaattggatgcctccctgctcatttgcttggtgatatgtggggtagattttggacaaatctgtactctttgac agttccctttggacagaaaccaaacatagatgttactgatgcaatggtggaccaggcctgggatgcacagagaatattcaagga ggccgagaagttctttgtatctgttggtcttcctaatatgactcaaggattctgggaaaattccatgctaacggacccaggaaat gttcagaaagcagtctgccatcccacagcttgggacctggggaagggcgacttcaggatccttatgtgcacaaaggtgacaat ggacgacttcctgacagctcatcatgagatggggcatatccagtatgatatggcatatgctgcacaaccttttctgctaagaaa tggagctaatgaaggattccatgaagctgttggggaaatcatgtcactttctgcagccacacctaagcatttaaaatccattggt cttctgtcacccgattttcaagaagacaatgaaacagaaataaacttcctgctcaaacaagcactcacgattgttgggactctgc 
catttacttacatgttagagaagtggaggtggatggtctttaaaggggaaattcccaaagaccagtggatgaaaaagtggtggg agatgaagcgagagatagttggggtggtggaacctgtgccccatgatgaaacatactgtgaccccgcatctctgttccatgttt ctaatgattactcattcattcgatattacacaaggaccctttaccaattccagtttcaagaagcactttgtcaagcagctaaacat gaaggccctctgcacaaatgtgacatctcaaactctacagaagctggacagaaactgttcaatatgctgaggcttggaaaatca gaaccctggaccctagcattggaaaatgttgtaggagcaaagaacatgaatgtaaggccactgctcaactactttgagccctta tttacctggctgaaagaccagaacaagaattcttttgtgggatggagtaccgactggagtccatatgcagaccaaagcatcaaa gtgaggataagcctaaaatcagctcttggagataaagcatatgaatggaacgacaatgaaatgtacctgttccgatcatctgttg catatgctatgaggcagtactttttaaaagtaaaaaatcagatgattctttttggggaggaggatgtgcgagtggctaatttgaa accaagaatctcctttaatttctttgtcactgcacctaaaaatgtgtctgatatcattcctagaactgaagttgaaaaggccatca ggatgtcccggagccgtatcaatgatgctttccgtctgaatgacaacagcctagagtttctggggatacagccaacacttggac ctcctaaccagccccctgtttccatatggctgattgtttttggagttgtgatgggagtgatagtggttggcattgtcatcctgat cttcactgggatcagagatcggaagaagaaaaataaagcaagaagtggagaaaatccttatgcctccatcgatattagcaaagg agaaaataatccaggattccaaaacactgatgatgttcagacctccttttag key: cord-017775-qohf9pxp authors: loa, chien chang; wu, ching ching; lin, tsang long title: recombinant turkey coronavirus nucleocapsid protein expressed in escherichia coli date: 2015-09-10 journal: animal coronaviruses doi: 10.1007/978-1-4939-3414-0_4 sha: doc_id: 17775 cord_uid: qohf9pxp expression and purification of turkey coronavirus (tcov) nucleocapsid (n) protein from a prokaryotic expression system as histidine-tagged fusion protein are presented in this chapter. expression of histidine-tagged fusion n protein with a molecular mass of 57 kda is induced with isopropyl β-d-1-thiogalactopyranoside (iptg). the expressed n protein inclusion body is extracted and purified by chromatography on nickel-agarose column to near homogeneity. the protein recovery can be 10 mg from 100 ml of bacterial culture. 
the purified n protein is a superior source of tcov antigen for antibody-capture elisa for detection of antibodies to tcov. turkey coronavirus (tcov) is the cause of an acute and highly contagious enteric disease affecting turkeys of all ages. the disease is severe in 1- to 4-week-old turkey poults [1]. turkey flocks that recover from natural or experimental coronaviral enteritis may develop lifelong immunity [2]. tcov has been recognized as an important pathogen of young turkeys. tcov infection causes significant economic losses in the turkey industry due to poor feed conversion and uneven growth. outbreaks of tcov enteritis in turkey poults remain a threat to the turkey industry. in order to rapidly diagnose and effectively control turkey coronaviral enteritis, development of an antibody-capture enzyme-linked immunosorbent assay (elisa) for detecting antibodies to tcov is essential. development of elisa for detection of tcov infection requires large amounts of tcov antigen. molecular cloning and expression of tcov n protein were carried out for preparation of large quantities of highly purified viral proteins. coronavirus is an enveloped and positive-stranded rna virus that possesses three major structural proteins including a predominant phosphorylated nucleocapsid (n) protein, peplomeric glycoprotein, and spike (s) protein that makes up the large surface projections of the virion, and membrane protein (m) [3, 4]. the n protein is abundantly produced in coronavirus-infected cells and is highly immunogenic. the n protein binds to the viral genomic rna and composes the structural feature of helical nucleocapsid. the n protein is a preferred choice for developing a group-specific serologic assay because of highly conserved sequence and antigenicity. 
the nucleocapsid proteins of various rna viruses, such as mumps, rabies, vesicular stomatitis, measles, newcastle disease, and infectious bronchitis (ibv) viruses, have been used as the coating antigens in diagnostic elisa [5-10]. prokaryotic expression is an economic and convenient system to prepare large amounts of pure recombinant protein. in addition, the antigenic integrity of n protein expressed in a prokaryotic system is expected to be maintained due to the lack of glycosylation. this chapter describes expression and purification of tcov n protein with a prokaryotic system for preparation of a large quantity of highly purified viral protein, which can be used as coating antigen for antibody-capture elisa for serologic diagnosis of tcov infection [11, 12]. 13. add 2 μl of the above rt mixture to the pcr amplification reaction (100 μl) with primers nf and nr. a mix of taq and pfu at 10:1 is recommended to maintain pcr fidelity (table 1). 2. the nuclease reagent benzonase is added at 1 μl (25 units) for every ml of bugbuster reagent used. 4. ampicillin antibiotic marker is on the expression vector ptriex and chloramphenicol antibiotic marker is on plasmid placi in the expression host strain tuner cells. carbenicillin is recommended in place of ampicillin for better stability to ph changes throughout the bacterial cultures. 5. the binding capacity of 1.25 ml of his-bind resin is 10 mg of target protein per column. as for any affinity chromatography, the best purity of target protein is achieved when the amount of protein extract is near the binding capacity. 6. the purpose of 6 m urea is to improve resolution of the sticky inclusion bodies. the presence of 6 m urea does not affect binding of his-bind resin to target n protein. 7. the suggested ratio of rnapure reagent to sample is 10:1. excess amount of rnapure reagent has no negative impact. 
the lower ratio (5:1) in this step is intended to obtain higher concentration of viral rna in the final supernatants. if the upper aqueous phase after centrifugation at step 3 is more than half of the total volume, there is not enough rnapure reagent added. the appropriate reagent amount may be adjusted. chloroform is applied at 150 μl for every milliliter of lysate. 8. the sample mixture with chloroform at this step can be stored at −70 °c or even lower temperature before proceeding to the next step. 9. optional: inverting the tube for 5-10 min for air-drying of rna pellet is a helpful tip to completely remove any residual ethanol that may interfere with the following rt reaction. 10. it is critical to make sure that the jellylike rna pellet is completely dissolved into solution by repeat pipetting. 11. the synthesized cdna in the rt reaction can be stored at −20 °c or even lower temperature until used. 12. pcr product may be purified. the vector must be gel purified due to the long digested fragment size above 30 bp. 13. the molar ratio between vector and insert is suggested at 1:2 to 1:5. the volumes in this step are illustrated for initial exploration. the concentration of digested vector and insert can be estimated by od 260 or agarose gel electrophoresis with known amount of dna of similar size in adjacent wells. the ligation reaction mixture can be stored at 4 °c until used for transformation or at −20 °c for longer term. 14. after plating, the leftover transformation mix can be stored at 4 °c for further plating in the following days at different amount if needed. 15. the starter culture can be prepared from a fresh colony on a plate or directly from a glycerol storage stock. an od around 0.5 represents a culture at log phase when the cells are at the best condition to expand and for protein expression. 16. this usually takes about 2-3 h to reach the od range. 
the higher the od of starter culture in the previous step, the shorter the time to reach this od range. 17. the centrifuge tubes should be weighed before and after collection of cell pellet for estimation of wet pellet amount and the volume of bugbuster to be applied in the next step. frozen storage of cell pellets may improve the extraction efficiency of bugbuster reagents through the freeze/thaw cycle. 18. it is important to completely resuspend the cell pellets for the best results of bugbuster extraction. higher volume of bugbuster reagent does not have adverse effect. roughly 10-20 ml of bugbuster reagent should be enough for cell pellets collected from 100 ml of culture. bugbuster reagent can be added directly to frozen cell pellets. there is no need to wait for the temperature to return to room temperature. protease inhibitors may be added at this step but are usually not necessary. 19. it is critical but somewhat difficult to completely dissolve the sticky inclusion bodies. repeat pipetting up and down until the protein solution is homogeneous. any undissolved particles will clog the his-bind column and affect the purification process. it is advisable to centrifuge the dissolved inclusion body protein solution at 5000-10,000 × g for 10-15 min at 4 °c for clarification before application to the column. 20. eluate may be collected in fractions such as 0.5 or 1 ml each fraction. 21. the presence of 6 m urea is compatible with the protein assay reagent. the assay range can be adjusted for protein concentrations from low μg/ml to 1 mg/ml with different assay format. the protein concentration of the target n protein eluate as obtained following this process is about 1-2 mg/ml. the presence of 6 m urea has no adverse effect on plate coating for elisa performance. 
given the coating concentration of n protein at 20 μg/ml, the eluate is usually diluted in coating buffer at least 1:10 to reduce the urea content to less than 600 mm and, subsequently, further diminish any possible effect on elisa performance. accordingly, the purified n protein eluate can be directly applied to the elisa assay for detection of antibodies to tcov. coronaviral enteritis of turkeys (blue comb disease) immunity to transmissible coronaviral enteritis of turkeys (blue comb) identification of the structural proteins of turkey enteric coronavirus coronavirus immunogens immunoglobulin class and immunoglobulin g subclass enzyme-linked immunosorbent assays compared with microneutralisation assay for sero-diagnosis of mumps infection and determination of immunity rabies diagnostic reagents prepared from a rabies n gene recombinant expressed in baculovirus baculovirus expression of the nucleocapsid gene of measles virus and utility of the recombinant protein in diagnostic enzyme immunoassays immunological characterization of the vsv nucleocapsid (n) protein expressed by recombinant baculovirus in spodoptera exigua larva: use in differential diagnosis between vaccinated and infected animals a diagnostic immunoassay for newcastle disease virus based on the nucleocapsid protein expressed by a recombinant baculovirus recombinant nucleocapsid protein is potentially an inexpensive, effective serodiagnostic reagent for infectious bronchitis virus expression and purification of turkey coronavirus nucleocapsid protein in escherichia coli recombinant nucleocapsid protein-based enzyme-linked immunosorbent assay for detection of antibody to turkey coronavirus the protocol "recombinant turkey coronavirus nucleocapsid protein expressed in escherichia coli" detailed in this chapter had been successfully carried out in the authors' studies on characterization and immunology of turkey coronaviral enteritis. 
those studies were in part financially supported by usda, north carolina poultry federation, and/or indiana department of agriculture and technically assisted by drs. tom brien and david hermes, mr. tom hooper, and ms. donna schrader for clinical and diagnostic investigation, virus isolation and propagation, and animal experimentation. key: cord-018647-bveks6t1 authors: butnariu, monica; butu, alina title: plant nanobionics: application of nanobiosensors in plant biology date: 2019-10-01 journal: plant nanobionics doi: 10.1007/978-3-030-16379-2_12 sha: doc_id: 18647 cord_uid: bveks6t1 nanobiosensors (nbss) are a class of chemical sensors which are sensitive to a physical or chemical stimulus (heat, acidity, metabolism transformations) that conveys information about vital processes. nbss detect physiological signals and convert them into standardized signals, often electrical, to be quantified from analog to digital. nbss are classified according to the transducer element (electrochemical, piezoelectric, optical, and thermal) in accordance with biorecognition principle (enzyme recognition, affinity immunoassay, whole sensors, dna). nbss have varied forms, depending on the degree of interpretation of natural processes in plants. plant nanobionics uses mathematical models based on qualitative and less quantitative records. nbss can give information about endogenous concentrations or endogenous fluxes of signaling molecules (phytohormones). the properties of nbss are temporal and spatial resolution, the ability of being used without significantly interfering with the system. nbss with the best properties are the optically genetically coded nbss, but each nbs needs specific development efforts. nbs technologies using antibodies as a recognition domain are generic and tend to be more invasive, and there are examples of their use in plant nanobionics. 
through opportunities that develop along with technologies, we hope that more and more nbss will become available for plant nanobionics. the main advantages of nbss are short analysis time, low-cost tests and portability, real-time measurements, and remote control. plants are a fascinating research topic if we relate to environmental stress. because they are physically stuck in specific spots, the plants have to handle in that site, regardless of the environmental conditions. moving to another place is not an option. but what plants can do is to modify the internal "environment," and plants are "true masters" of manipulating their metabolism to deal with environmental disturbances. this feature is one of the reasons why plants are useful in various research; we can rely on them as "sensitive indicators" of environmental changes, even in completely new environments. in the absence of normal conditions, plants cannot use the classical pathways of metabolism, so they need to identify other solutions. this is what happens when plants adjust their metabolism for regulating gene activation, thus producing more or less proteins that are useful or not in the new environmental conditions. the different parts of the plant come with their own genetic regulation strategies. a number of genes that are involved in the creation and remodeling of cell walls are activated differently in growing plants. other genes with a role in identifying light, which are normally active on the leaves, are active at the root level. in leaves, many genes associated with transmission of hormonal information are suppressed, and genes associated with insect protection are more active. these trends are also seen in the (higher) number of proteins involved in message transmission, cell wall metabolism, and plant protection. 
these patterns of gene and protein functioning indicate that in unfavorable conditions, the plants respond by weakening the cell walls and creating new ways to sense the environment. it is possible to monitor changes in the genes in real time by labelling certain proteins with fluorescent elements. plants modified with fluorescent proteins can give useful information about how they respond to the environment. these modified plants act as biological sensors (nbss). specialized cameras and microscopes allow us to monitor how the plant uses these fluorescent proteins (hamers et al. 2014). a chemical or biological nbs operates on the principle of signal emission (electrical, such as a voltage, or photonic) in response to a chemical reaction involving a chemical or biological receptor, r (macrocyclic ligand, antibody, enzyme), that binds to a specific target molecule in the sample under study, the analyte, a. signal transmission is achieved by coupling with a transducer, t, which interfaces the nbs processes with the processing unit and converts them into a measurable signal. analysis of signals in plant nanobionics aims at processing signals recorded by measurements in order to extract the maximum of useful information for diagnostics. these devices are mostly used in genetic engineering in agriculture, where it is necessary to know the mechanisms of reaction and the affinity of enzymes and microorganisms for different substrates of interest and signaling molecules. a biochip is a device that contains a structure of individual sensory elements (nbss) interconnected by functions and recognition specifications, integrated on a chip. the number of nbss on a chip may be of the order of 10^6 units. at this degree of integration, a set of distinct tests can be performed extremely quickly and efficiently. in contrast to microchips, biochips are not purely electronic structures (they contain different electronic structures coupled to nanobiosensory elements).
each nbs can be thought of as a "microreactor" that performs a specific chemical reaction with an analyte. nbss from biochips can be designed to detect a wide variety of analytes, including dna, antibodies, proteins, and biomolecules. the advantages are multiple: sensors can be produced in batches or sequences that can be assembled in parallel or serial, providing a high manufacturing yield; sensors can be assembled on very small areas with reduced distances between them; 3d structures can be generated providing high signals besides 2d structures; any type of biochemical reaction may be incorporated; nbss can be produced separately and subsequently assembled according to the specificity and nature of the application. the important features regardless of industry or technology are selectivity, sensitivity, and stability in the design of sensory systems integrated with structures and arrangements of sensory elements (wan salim et al. 2013) . one of the integrated systems including rotational aseptic sampling is a robotic fluid and reusable electrodes formed by ink-jet printing injection system. the system contains an enzyme electrode with immobilized gox in gel, and the detection of hydrogen peroxide is carried out on a rhodinised carbon electrode (rh coating). although the enzyme electrode has stability and efficiency characteristics, the problem of automated sample monitoring and sampling in an integrated system requires multiparameter optimization due to reciprocal interferences. there is a requirement on specific domains (environment and genetic engineering) of highly performing integrated systems that work in in vivo conditions, such as dialysis, the use of biointerfaces, evanescent techniques, and atomic force microscopy to grasp in the depth of the biological phenomena (identifying and understanding the interaction of proteins). 
the in vivo exploitation of detection systems for both glucose and lactate was confirmed by the efficacy of using phospholipid copolymers and improving hemocompatibility. immunosensors provide an example for the development of integrated systems where microseparations, chromatographic methods, and electrochemical couplings with optical detectors are incorporated, which ultimately lead to a miniaturized system. there are examples where the level of integration and miniaturization becomes more pronounced (dna-nanotube or biochips-biointerfaces). nbss are expected to be widely used in plant nanobionics where physiological/biochemical parameters should be identified. advanced ink-jet technology has developed methods for analyzing nanoliter fractions on a three-dimensional nbs surface at a speed of 6 m/s. it is expected to produce one million nbss per cm² using photolithography, contact printing or self-assembling techniques, and adsorption/desorption under the laser beam that allows "writing" proteins on the surface to be analyzed with great precision. laser techniques such as maple (matrix assisted pulsed laser evaporation) or dw (direct writing), used for immobilizing biological materials on substrates, are in the laboratory stage but have prospects for use as molecular imprinting methods (potocký et al. 2014). whether nbss are used individually, in integrated systems, or in arrays, all are characterized by parameters such as sensitivity and detection limit for a range of analytes. trace detection of various analytes (indicators, additives, contaminants) with sufficient sensitivity and safety is the basic criterion for an nbs to be used. the detection limit in the laboratory is pushed to a single atom when atomic force microscopy is used. thus, the enzymatic electrodes, studied and continuously perfected, rely on workarounds such as concentrating the analyte of interest, which leads to major design and miniaturization difficulties.
nbss for phenol vapors have been reported in which phenoloxidase was immobilized on a glycerol gel over an array of interdigitated electrodes. phenol vapors are directly partitioned into the gel and oxidized to quinone. the signal was improved by redox amplification of the quinone/catechol couple to obtain a reasonable sensitivity, resulting in a detection limit of 30 ppb phenol. this principle can be extended to other carbon compounds down to ppt (parts per trillion) limits. dna structures have been studied as potential receptors. sandwich structures of liquid crystal dispersions and dna-polycation complexes have been studied with relatively good success in identifying different analytes. the polycation, whose role is to maintain the structural integrity of the dna, and the dna-protamine complexes allow detection of hydrolysis by the enzyme trypsin down to a detection limit of 10⁻¹⁴ m. elimination of the polycation leads to an increase in the distance between the two dna strands, resulting in the appearance of an intense band in the circular dichroism spectrum due to the texture modification (espinoza et al. 2014). from a practical standpoint, the disadvantage of biological receptors is their inherent instability. different strategies have been approached to improve longevity and preserve the structure of biological receptors. immobilization in matrices by the sol-gel technique for glucose nbss is one of the strategies; fluorescence indicators are used: hexahydrate chloride (2,2′-bipyridyl) ruthenium (ii) and 1-hydroxypyrene-3,6,8-trisulfonic acid. in addition to the improvement in the optical properties of the gel, the stability of the gox enzyme has also improved. other examples are monooxygenase used in hydrocarbon detection; detection of organic halides with metalloporphyrins; and detection of carbon tetrachloride, haloalkanes (perchloroethylene), and insecticides (ddt) (kazakova et al. 2013).
improving the selectivity of an nbs can be approached on two levels: through direct interfacing of the transducer with the biological receptor to reduce interference, and through new receptors with improved affinity or new affinity capabilities. selectivity is a key parameter that determines the performance of an nbs. pyrroloquinoline is used as a mediator in a glucose oxidase enzyme electrode for the measurement of glucose in the raw or elaborated cavity. alternatively, the electrocatalytic detection of the reaction products resulting from the enzymatic reactions can be improved by chemically modified electrodes such as rhodinised electrodes or hexacyanoferrate modified carbon electrodes. prussian blue is used to modify the electrode surface for amperometric detection of hydrogen peroxide at both oxidation and reduction potentials for the enzyme electrode in lactate and glucose detection. one solution is to connect the redox centers of the enzyme to the electrode via a molecular wire that performs the electron transfer (enzymes linked by molecular wires), but efforts have focused on mediators immobilized on different polymer chains. molecular wires are regarded as intermediates in long-range electron transfer, consisting of two pyridine groups linked by thiophenes with different lengths. such wires can be used in conjunction with self-assembly techniques to produce an isolated electrode that transfers electrons along predetermined molecular pathways (jones et al. 2013). an ideal nbs is a device that will detect an "analyte," the species under analysis, which is present in a given sample. most samples also contain other species which may interfere with the nbs response. the nbs must therefore have a specific selectivity to identify the target analyte. it is necessary to design nbss with selectivity for an analyte and with the ability to discriminate the interferences produced by the other components in the analyzed sample.
specific identification and selectivity capacity are the key components of molecular recognition. molecular recognition is accomplished by the sensor component, a host molecule (host-chemoreceptor) that binds selectively to the "guest," the target molecule that needs to be identified. for each "host-guest" system, there is a specific chemical reaction from the multitude of possible reaction channels. once the host-guest response has been identified, the host molecule is immobilized/incorporated into nbss, typically on a membrane in contact with the transducer or on a contact electrode. finally, a way to signal that the binding/recognition event has occurred (transduction) has to be found (rodríguez-sevilla et al. 2014). one of the requirements for molecular recognition is the existence of groups or centers with specific reactivity in the host molecule that can enclose or bind ions, atoms, molecules, and biomolecules. all living organisms use enzymes, which are proteins that contain "pockets," active centers, designed to recognize a specific analyte. this means that only a specific analyte is able to enter the enzyme pocket. enzymes can be used in nbss as host receptors with molecular recognition capability but are unstable. to design host molecules that can be used in an nbs, the following criteria are considered: the host molecule must be stable under the conditions in which it is to be used, must be able to selectively bind the analyte in the sample, must be able to be immobilized in a film/membrane that is in contact with the sample, must signal that a host-guest binding response has occurred, and must ideally release the analyte after detection so the host is free to be reused. biological receptors include antibodies, membrane receptors, signaling molecules, enzymes, ribosomes, lectins, phytohormones, etc. they bind analytes using "lock-and-key" molecular recognition mechanisms (identification-immobilization).
biological receptors are not practical solutions for many applications because the specificity, sensitivity, and stability cannot be optimized. artificial receptors are immobilization environments that can be optimized by molecular design for any type of application. synthetic receptor design and synthesis are based on tools developed by proteomics and genetic engineering, producing recognition components that can respond to the occurrence and identification of metabolic deficiencies in plant nanobionics. there are platforms and arrays of artificial receptors based on combinatorial mathematical techniques, interface biology, and surface chemistry. they have induced the development of various artificial receptor environments with rapid and diversified selection for any target analyte. the current technique for producing synthetic receptors is called cara (combinatorial array of receptor analysis) (fang et al. 2015). supramolecular chemistry has developed a wide range of synthetic macrocycles. the most common feature of the macrocycle classes is that they contain cavities that act as host pockets for guest molecules. the selectivity of the hosts can be set in "read mode" by varying the size of the preformed cavities. 12-crown-4 has a small cavity ideal for the binding of small ions such as li+, while 18-crown-6 has a large cavity that fits better with larger ions such as k+. the size of the cavities is important for the selectivity of the host, but the question remains what attracts an ion or molecule into a preformed cavity and which factors stabilize the host-guest complex. in enzymes, weak noncovalent interactions (hydrogen bonding, electrostatics, dipole-dipole, van der waals, etc.) are used to hold the guest in the enzyme pocket (these interactions stabilize the host-guest complex). macrocycles contain polar functionalities, capable of interacting with guests via hydrogen bonding, electrostatic interactions, and dipole-dipole interactions.
it is desirable that bonding in the cavity is not too strong, because it is important that the analyte, the guest, is released from the host after it has been detected and measured. crown ethers and calixarenes are ideal for bonding metal cations, based on the size of their cavity, but also on the high density of electrons present on the oxygen atoms in the cavity. the base compound selectively binds li+ in preference to other metal cations; the modified version of the base macrocycle has a good selectivity for na+. by synthetic modification it is possible to increase the capacity of the host cavity, and new functionalities can be introduced that will favor the binding of specific molecules and ions (da silva et al. 2013). other modified calixarenes, which demonstrate this principle, are the tetraphosphine oxides of calix[4]arene. by changing the binding groups on the same calix[4]arene template from esters to phosphine oxides, the selectivity changes from na+ to ca2+. by increasing the number of repeat units in the esters and phosphine oxides to six, the cavity increases, and the selectivity changes in favor of the larger cs+ and pb2+ cations, respectively. some host compounds have been developed for the selective detection of low molecular weight compounds. an example involves the use of the tetra (s-propanol) calix[4]arene containing four lateral chiral moieties for selective differentiation of the phenylalanyl enantiomers. other techniques of supramolecular chemistry may be involved in the synthesis of synthetic receptors that simulate the properties of enzymes. the basic structures that can be modified are porphyrins, semiconductor polymers of the tetrathiafulvalene (ttf) class, and ppvs. modified polysaccharides can be considered as other templates. a linear archetype is the polyanilines, which contain the two types of redox states (meyer et al. 2007).
among the multiple nbs classifications, bioaffinity has a range of applications, and antigen-antibody interactions (ag-ab) play a role and are considered to be an instrument in the development of molecular recognition principles. in vivo ag-ab interactions are reversible. factors that condition the ag-ab interaction are the structural complementarity between the antigenic determinant and the antibody combining site; this is the exclusive factor of the specificity of the reaction; the structural complementarity involves the conformational adaptation of the two reacting groups and was conceived in structural terms on the key-lock principle; the chemical complementarity of the reaction groups is the consequence of structural complementarity and signifies the entry into action of intermolecular forces that stabilize and consolidate the interaction of the two groups. the formation of intermolecular bonds requires the existence of atomic groups sufficiently close to the two molecules. the distance between them is inversely proportional to the degree of complementarity. although structural complementarity is not strictly obligatory, higher spatial matching is more conducive to interaction. it is expressed by the congruent of contact surfaces that provide intermolecular attraction forces that stabilize the complex. the ag-ab interaction involves the following types of noncovalent bonds: h bonds, electrostatic forces, van der waals linkages, and hydrophobic bonds. all are nonspecific forces of low value and their nature makes the reaction reversible. h bonds form when two atoms share an atomic h nucleus (one proton). the common proton is found between two atoms of n or o or between n and o. the h nucleus is covalently bonded to one of the two atoms (n or o). the h bond has a binding energy of 3-7 kcal/mol. intermolecular forces are involved in ag-ab complex formation. the action of these forces requires close contact between the two reaction groups. 
the h bonds result from the formation of an h bridge between two nearby atoms. the electrostatic forces are due to the attraction of the ionic groups with opposite charges located at the periphery of the two protein chains. the van der waals forces result from the interaction between different electron clouds, represented in the form of oscillating dipoles. the van der waals interactions, the weakest interaction forces, are active at very small distances between the reaction groups. the binding energy is 1-2 kcal/mol. van der waals interactions are not based on a permanent separation of electrical charges but on their fluctuations induced by the proximity of molecules. at intermolecular distance, instantaneous electric fields are formed, with a polarizing effect on neighboring molecules. between the nearby atoms, there is a mutual attraction force arising from the fluctuating charge distribution that one dipole induces in the neighboring dipole (dispersion forces). their intensity depends on the distance between the groups involved and is inversely proportional to the seventh power of the distance. their value is optimal at 1-2 å. hydrophobic bonds, which can contribute half of the ag-ab binding force, are produced by the association of nonpolar and hydrophobic groups, whereby water molecules are excluded. the optimum distance between the reactive groups varies with the type of bond. the electrostatic forces (coulomb or ionic) are the result of the attraction between atoms or groups of atoms with opposite electric charge located on the two reacting groups: between a cation (na+) and an anion (cl−) or between coo− and nh3+ (agrawal et al. 2012). the binding energy of these forces is significant at very small (less than 100 å) distances between the reaction groups. exact juxtaposition of ions favors the action of these forces. the binding energy is 5 kcal/mol and varies inversely with the square of the distance between the two reaction groups (1/d²).
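the distance dependences quoted above can be compared numerically. the following is a minimal sketch, assuming pure inverse power laws (1/d² for the electrostatic term and 1/d⁷ for the dispersion term, as stated in the text) and ignoring all prefactors; the function name `relative_strength` and the example distances are illustrative assumptions, not values from the source.

```python
def relative_strength(d, d_ref, exponent):
    """Ratio of an interaction at distance d to its value at d_ref,
    assuming a pure inverse power law 1/d**exponent (an idealization)."""
    if d <= 0 or d_ref <= 0:
        raise ValueError("distances must be positive")
    return (d_ref / d) ** exponent


# Hypothetical example: the separation doubles from 1 to 2 angstroms.
electrostatic = relative_strength(2.0, 1.0, 2)  # 1/d^2 term
dispersion = relative_strength(2.0, 1.0, 7)     # 1/d^7 term

print(electrostatic)  # 0.25
print(dispersion)     # 0.0078125
```

under these assumptions, doubling the separation leaves a quarter of the electrostatic term but less than 1% of the dispersion term, which illustrates why the van der waals contribution requires very close contact between the reaction groups.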
hydrophobic (or apolar) linkages occur between nonpolar (nonionized) groups in aqueous solutions and are the consequence of the tendency to exclude the ordered network of water molecules between the antigen and antibody molecules. these linkages are favored by amino acids with apolar groups that tend to associate, reducing the number of water molecules in their vicinity. by removing the water molecules between the reaction groups, the distance between the active sites decreases considerably, and the value of the stabilizing forces increases. taken each on their own, spatial complementarity or intermolecular forces are not sufficient to form stable relationships. for the stability of the ag-ab interaction, both conditions are required. the higher the binding energy of the reactants, the more stable the ag-ab complexes. the interaction of the antigen- and antibody-reactive groups is defined by two parameters: the affinity and avidity of the antibodies. measurement of antibody affinity can be achieved by dialysis at equilibrium. the ag-ab interaction is reversible. within the dialysis bag, the hapten is partially free and partially bound to the antibodies, depending on the affinity of the antibodies. only the free hapten can diffuse through the membrane of the dialysis bag, and its external concentration will equal the concentration of the free hapten inside the bag (kersten and feilner 2007). measurement of the concentration of the hapten in the dialysis bag allows for the calculation of the amount of antibody-bound hapten. the constant renewal of the buffer results in total dissociation and loss of hapten from the dialysis bag, which indicates the reversible nature of ag-ab binding. affinity of antibodies measures the binding force between an antigenic determinant and the complementary binding site of a specific antibody. affinity is the result of attraction and repulsion forces that mediate the interaction of the two reactants.
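the equilibrium dialysis calculation described above can be sketched as follows. this is a minimal illustration, not the authors' procedure: it assumes that the free hapten concentration inside the bag equals the measured outside concentration (as stated in the text), that the total concentration of antibody binding sites is known, and that binding follows the simple law of mass action for identical, independent sites. the function names and the numbers in the example are hypothetical.

```python
def bound_hapten(total_inside, free_outside):
    """Antibody-bound hapten inside the bag: total hapten inside minus
    free hapten, the latter taken equal to the outside concentration."""
    bound = total_inside - free_outside
    if bound < 0:
        raise ValueError("total inside cannot be below the free concentration")
    return bound


def association_constant(total_inside, free_outside, total_sites):
    """Association constant K = [bound] / ([free hapten] * [free sites]),
    assuming identical, independent binding sites (an idealization)."""
    bound = bound_hapten(total_inside, free_outside)
    free_sites = total_sites - bound
    if free_sites <= 0 or free_outside <= 0:
        raise ValueError("need free sites and free hapten at equilibrium")
    return bound / (free_outside * free_sites)


# Hypothetical measurement, all concentrations in micromolar:
# 3.0 total hapten inside, 1.0 free hapten outside, 4.0 binding sites.
print(bound_hapten(3.0, 1.0))               # 2.0
print(association_constant(3.0, 1.0, 4.0))  # 1.0 (per micromolar)
```

a larger association constant under these assumptions corresponds to a higher antibody affinity, since more hapten is bound at the same free concentration.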
the strength of these interactions is measured in the reaction between a monovalent antigen (hapten) and specific antibodies. a high affinity interaction requires perfectly complementary structures, while imperfect complementarity of the reaction groups causes a low affinity, since the attraction forces are active only at very small distances and are diminished by the repulsion forces. complexes formed by antibodies with high affinity are rapidly eliminated from the circulation without adverse effects on renal function. the ag-ab interaction is permanently characterized by the formation and breaking of various types of intermolecular bonds. in vivo, probably all ag-ab reactions are reversible, but secondary in vitro reactions (agglutination, precipitation), under the conditions of reagent balance, are irreversible (sakamoto et al. 2018). it is essential that the host-guest binding event (the receptor-analyte interaction, r-a) is detected. it is thus necessary to have a way of identifying the signal from the receptor-analyte interaction and transducing it to the outside to be processed. the component that does this is generally called a transducer. the transducer must be in contact with the receptor or with the membrane in which the receptor is immobilized. electrode and interfacial interactions are determinant in capturing the signal from an r-a interaction and transforming it into an electrical or photonic signal. there are several ways of identifying the r-a event, collecting the signal, and transducing it as an external signal. the way the signals are identified and transduced defines the type of nbs. in electrochemical nbss, the receptor is immobilized on an electrochemical transducer that measures a current (amperometric method) or a voltage (potentiometric method) between two electrodes. if r is immobilized on an optical component, the result is an optical nbs (optical fiber, fluorescence, absorption, surface plasmon resonance (spr)).
for detection with most electrochemical nbss, the membrane containing the host molecule must be placed on the surface of an electrode, which gives an electrochemical response to the binding of the guest. the approach works well when the target analytes are charged species such as metal cations. neutral molecules cannot readily be detected by electrochemical transduction, so optical detection methods have been successfully used instead. a chiral calixarene host contains fluorescent naphthyl units. upon binding to the guest, fluorescence is attenuated as a result of interaction between the naphthyl and phenyl groups in the host and analyte. the fluorescence attenuation is proportional to the concentration of the analyte. optical methods are also used because they offer greater sensitivity than electrochemical techniques. in another host compound, no fluorescence is observed in the absence of the guest analyte because the pyrenyl substituent can come in contact with the adjacent nitrophenyl substituent (the fluorescence attenuation occurs due to their interaction). however, in the presence of na+ ions, fluorescence is observed, because the na+ ion enters the cavity and binds to the oxygen in the phenoxy and carbonyl groups in the host. this binding induces a more rigid conformation that moves the nitrophenyl groups away from the pyrene and so prevents fluorescence attenuation (gaggeri et al. 2012). an nbs is a bioelectronic analysis system that combines a transducer with a biological component with which it is in a specific interdependence. nbss use biological systems with different levels of recognition of the substances to be determined. the first step in this interaction is the formation of the specific complex of the biologically active, immobilized substance r (the receptor, the substrate with the sensitive biological component) with the analyte a (often defined as the chemical signal).
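the proportionality between fluorescence attenuation and analyte concentration mentioned above is what makes calibration possible. the following is a minimal sketch, assuming a strictly linear relation over the calibrated range and an ordinary least-squares fit to a few standards; the function names and the calibration values are hypothetical and are not taken from the source.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired calibration points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("calibration concentrations must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_attenuation(attenuation, slope, intercept):
    """Invert the calibration line to estimate analyte concentration."""
    if slope == 0:
        raise ValueError("flat calibration line cannot be inverted")
    return (attenuation - intercept) / slope


# Hypothetical standards: concentration (arbitrary units) versus the
# fractional decrease in fluorescence measured for each standard.
standards = [0.0, 1.0, 2.0, 4.0]
attenuations = [0.0, 0.1, 0.2, 0.4]

slope, intercept = fit_line(standards, attenuations)
unknown = concentration_from_attenuation(0.3, slope, intercept)
print(round(unknown, 6))
```

the inversion is only meaningful inside the calibrated range; outside it, the linear assumption would need to be checked against additional standards.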
table 12.2 summarizes specific patterns of nbss in relation to the nature of the receptor and the chemical/biochemical signal. there are two general classes of nbss: those based on a bioaffinity response between r and a, which alters the distribution of electrical charges and can be measured with specific transducers, and those based on consumption of the substrate in a specific reaction. the biological constituent of the molecular recognition element (r) is represented by various active species that can be enzymes or enzymatic systems, antibodies (ab) or antigens (ag), receptors, populations of bacteria or eukaryotic cells, tissue fragments, and sometimes even signaling molecules. analytes or substances that can be analyzed (a) are glucose or other sugars, amino acids, alcohols, lipids, and nucleotides. they can be identified by their specific interaction, or their concentration can be measured by various methods. both r and a represent distinct molecular species with high macromolecular specialization (antibodies, antigens, enzymes, receptors, etc.) or are complex systems (cells, tissues, etc.) (kersten and feilner 2007). according to the active biological component, they can be grouped as follows:
• enzymatic nbss: enzymes are proteins characterized by their catalytic function. modified substrate molecules lead to oxidation, reduction, and hydrolysis reactions that can be measured by enzymatic nbss. enzymatic nbss produce a linear response depending on the substrate concentration.
• immunosensors: antibodies are glycoproteins produced by the immune system when an external substance, an antigen, is involved. it is theoretically possible to produce antibodies without identifying an antigen. an immunosensor is a high sensitivity nbs. the principle of operation is based on the ag-ab interaction of molecular recognition.
• nbss with receptors: the regularity of biological processes is ensured by high sensitivity molecular processes based on the specialization of structural proteins (receptors) capable of recognizing a number of physiological signals. this is the case for neurotransmitters, whose action is mediated through the presence of receptors in the plasma membrane, in sites or cell targets. activation of the biologically active site is via the ion channels. the acetylcholine receptor is the first known receptor in neurotransmitting phenomena.
• nbss based on cells or tissues: measurement of molecular species is not limited to interaction with the compounds to be analyzed; the transformations that occur can be measured as resulting products. it is desirable to operate with cell populations whose major metabolic pathways are known. a relevant example is the l-arginine nbs, which combines populations of bacterial cells of streptococcus faecium with an ammonia electrode. arginine is metabolized by the microorganisms. it is difficult to obtain complex reactions outside the cellular structures. similar to the use of cell populations as sensitive elements, fragments or parts of plant tissues may be used. the advantage is greater because there is no extra effort to keep the cells viable in a natural arrangement. for adenosine nbss, a tissue biosensitive element has been proposed. for dopamine nbss, specialists have focused on the pulp of banana fruit, considering that it has remarkable biocatalytic properties.
• nbss with redox proteins: the redox proteins are involved in biochemical processes such as cellular respiration and the photosynthesis reaction (kersten and feilner 2007).
catalytic nbss use enzymes, microorganisms, or cells to catalyze a reaction with a target substance. affinity nbss use antibodies, receptors, and nucleic acids that bind to a target substance. reactions are quantified by electrochemical, optical, evanescence transducers, etc.
the main types of known redox proteins are cytochromes, containing iron in the prosthetic group, and cytochrome c is involved in the transfer of electrons in mitochondria; ferredoxins contain iron and sulfur ions in dimeric combinations in chloroplast ferredoxin (2fe-2s) and tetrameric combinations in bacterial ferredoxin, 2(2fe-2s), involved in photosynthesis and transfer of fixed nitrogen ions, respectively; blue proteins contain copper linked to the smallest cysteine residue involved in a tetrahedral structure such as plastocyanin and azurin that mediates electron transfer in photosynthesis and possibly in nitrite reduction; flavoproteins, containing a prosthetic group and an organic conjugate, are involved in the transfer of proteins such as flavotoxins (agrawal et al. 2012). these proteins play a role in nature due to the location of the redox centers on their surface. the subtle architecture of the molecules gives them selectivity and specificity in their interaction with other proteins or enzymes, as in the cytochrome c structure. the porphyrin iron (heme) is located at the center of the molecule and is well covered or hidden, being exposed to solvents in a small proportion of 0.06% of the total molecular surface. from those presented above and from table 12.2, we can see that nbss can be classified into two groups according to the biological component. the protein has a positive potential of +9 mv due to an excess of basic lysine residues. there is a 324 debye dipole moment, which produces an imbalance in the spatial distribution of the acidic chain balance. a number of lysine residues are distributed around the solvent-exposed part of the heme, which interacts with the redox proteins (nelsen et al. 1990). nbss are classified in three generations. in first-generation sensors, the biocatalyst is attached to the surface of the membrane, and then this arrangement is fixed to the surface of the transducer.
the adsorption or covalent attachment of the biologically active component to the surface of the transducer allows the elimination of the semipermeable membrane, which is the second generation. the direct linking of the biocatalyst to the electronic device that translates and amplifies the signal, such as the compact transistor, is the basis of the third-generation nbs miniaturization. depending on the nature of the immobilization and the interaction between the three components, the a-r membrane contact with the electrodes to the transducer, and the processes in the nbss have evolved over a generation. first, the specificity and selectivity are dominated by the biological component and are directly related to its nature: enzymes, antibodies, and microorganisms. the specificity derives from the binding of the analyte to the biological component used as the receptor. at the base of this dominant sequence is the a-r biochemical reaction and the collision process between a and r. second, the transport of the analyte to and through the surface considered to r is also an important factor. this process is related to the transport of a physical size through typical diffusion, migration, and convection mechanisms. third, the nbs signal is dependent on the a-r reaction assumed to be at a constant speed. transient states and biochemical pseudo-equilibrium conditions are dominated by the reaction kinetics, the nature of the transport, which in turn is coupled with the immobilized a-r interface substrate reactions. even in the case of a real equilibrium, the reaction speed near the steady state will be important in determining the response time. the kinetics of these processes require additional conditions (agrawal et al. 2012) . 
new types of nanoscale materials with different levels of biocompatibility, a new generation of biocompound cells based on a better understanding of metabolism, and the manipulation of information stored at the molecular level have all led to a generation of nbss with a high level of integration. molecular information initially stored in the basic molecular components can be expressed directly at a higher level, called "supramolecular," where interactions between molecules follow preestablished algorithms, leading to adaptive, functional, and intelligent materials. materials are built on conceptual, supramolecular, and combinatorial principles. separation, storage, and detection techniques are developed using "biomimetic" membranes that function according to biological models or precise physicochemical principles. electrochemical nbss with dna arose from the need in medical diagnosis to determine segments of a dna sequence quickly and accurately. results from genetics, molecular biology, and nanotechnology have led to one of the most accurate detection methods: electrochemical nbss with dna (combining the principle of the isfet nbs with molecular wires made from nanotubes). the operating principle consists of collecting the signal between two electrodes, a working electrode and a reference electrode; auxiliary electrodes, where present, have their own specified role. the sensing mechanism consists of a change in the current-voltage (i-v) characteristic in the presence of a target molecule. carbon nanotubes (cnts) are exceptionally well suited to the working electrode, offering a high electron transfer rate and excellent spatial resolution. the target in a dna nbs is an unknown dna sequence (or oligonucleotide) attached by functionalization to the carboxyl or amino groups of the cnt. researchers have reported increasing plants' ability to capture light energy by 30% by implanting carbon nanotubes into chloroplasts, the plant organelles where photosynthesis takes place.
they managed to modify plants so as to detect nitric oxide by implanting another type of carbon nanotubes. "the plants are suitable for the role of a technological platform. they heal themselves, they are durable, they resist the harsh environments and they have their own sources of energy and water." the transformation of plants to photon devices, with own energy, such as explosive detectors or chemical weapons, is expected (panchal and upadhyay 2014) . external or surface electrodes are metal electrodes that contact the nbss' bioactive component either directly (dry or solid electrodes) or through an electrolyte solution (liquid electrodes). solid or dry electrodes are made in silver, platinum, gold, and nickel. internal electrodes are made of thin wires made of durable metal: stainless steel, platinum, and tungsten. the active part of the electrode can be covered with a metallic conductor layer (gold, silver) and the inactive one with an insulating layer (polymer/thin film). nbss' active contact surfaces are large in size compared to cell sizes and are used for extracellular recordings. microelectrodes are internal electrodes, but they are built to measure potentials in direct contact with the receiver in nbss. the contact surface with r is micronized. microelectrodes can be solid, compounds which can be achieved by depositing a conductive layer (platinum, gold) on a glass support having a particularly thin peak, and another constructive variant consists of inserting a metallic or carbon fiber conductor into an epoxy resin support mixed with a conductive paste; they are used in cellular samples and consist of a glass pipette having a micronized tip filled with an electrolytic solution containing potassium chloride. in the electrolyte solution, a conducting wire is immersed to pick up the electrical potential (wu et al. 2014 ). it was thought that it is not possible to transfer electrons directly between the electrode and the proteins due to their distortion. 
several practical observations led to the conclusion that the protein is irreversibly adsorbed and denatured on contact with the electrode, which blocks access to the active center of the heme. modifying the gold electrode by surface adsorption of 4,4′-bipyridyl changed the surface configuration of the electrode for interaction with cytochrome c. 4,4′-bipyridyl is not electroactive in the relevant potential region and therefore does not act as a mediator. the electrochemistry was made possible by the quasi-reversible binding of cytochrome c to the gold electrode modified with 4,4′-bipyridyl, through hydrogen bonds between the lysine residues and the pyridyl nitrogen on the modified electrode surface. the transient protein-electrode complex allows rapid electron transfer, which proceeds according to the following scheme: diffusion of cytochrome c to the electrode, binding of the protein to the surface, electron transfer, and desorption of the protein. following this work, more than 60 surface modifications were found to allow the electrochemistry of proteins at the gold electrode. they use a bifunctional reagent (x-y), in which the x group binds to the electrode through n, p, or s and the y group binds the protein; these "represent also examples of patterns developed" (agrawal et al. 2012). the electrochemistry of proteins has been extended to the carbon electrode. pyrolytic graphite, vitreous carbon, and mesocarbon are structures in which graphene planes are stacked in an abab hexagonal arrangement or disordered in various turbostratic forms. the basal graphene plane is hydrophobic, but existing or induced defects lead to free c-c bonds, and oxidation increases the number of c-o linkages. the direct electrochemistry of positively charged proteins can therefore be performed at the edges of the graphite planes of the carbon electrode.
the direct transfer of electrons to negatively charged proteins, such as plastocyanin, at the graphite electrode (edge planes) can be aided by mn, ca, or cr cations complexed with amino compounds, such as cr(am6)3+, used as promoters of the reaction. in this context, the promoters are redox-inactive species in solution that nevertheless allow the transfer of electrons to the redox proteins. microminiaturized electrodes have specific advantages, among them improved polarization and better contact with the biological material at the active sites. specificity and selectivity depend on the biological receptor component of the nbs and its affinity for the analyte. affinity is a specific feature of enzymes, antibodies, and receptors and underlies their functions in living organisms. affinity is based on the chemical coupling between a component and its complementary partner. in the case of high-affinity components, the diffusion process is rapid, leading to the formation of an ag-ab type of complex. the association reaction specific to molecular recognition is characterized by a first-order rate constant. in nbs measurements it is essential to hold the concentration of one component constant and to treat the others as variables. the results of the electrochemistry of proteins have been extended to amino acids and peptides (barroso et al. 2015). the addition of enzymes to solutions containing substrate molecules is the essential condition for enzyme-catalyzed reactions. extracting from enzymology the information needed to develop nbss such as the enzyme electrode is an extremely difficult task; some of the enzyme properties necessary to describe enzymatic nbss are outlined here with reference to the literature. consider a simple reaction in which a single substrate s combines with enzyme e to form the intermediate enzyme-substrate complex es. this unstable complex undergoes a further reaction that yields product p.
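the single-substrate scheme just described (e + s forming es, which then yields p) can be sketched numerically. the following is a minimal illustration that assumes the standard steady-state (michaelis-menten) treatment of this scheme; the rate-constant names k1, k_minus1, and k2, the function names, and all numerical values are hypothetical choices made for the example and do not come from the source.

```python
def michaelis_menten_parameters(k1, k_minus1, k2, e_total):
    """Return (km, vmax) for the scheme E + S <-> ES -> E + P.

    k1       : rate constant for formation of ES from E and S (assumed name)
    k_minus1 : rate constant for dissociation of ES back into E and S
    k2       : rate constant for decomposition of ES into E and P
    e_total  : total enzyme concentration
    """
    if k1 <= 0 or k_minus1 < 0 or k2 < 0 or e_total < 0:
        raise ValueError("k1 must be positive; other inputs must be non-negative")
    km = (k_minus1 + k2) / k1   # Michaelis constant
    vmax = k2 * e_total         # maximum rate, set by the enzyme concentration
    return km, vmax


def michaelis_menten_rate(s, km, vmax):
    """Steady-state reaction rate v = vmax * s / (km + s)."""
    if s < 0 or km <= 0:
        raise ValueError("s must be non-negative and km positive")
    return vmax * s / (km + s)


# Illustrative (hypothetical) values, arbitrary consistent units.
km, vmax = michaelis_menten_parameters(k1=1.0, k_minus1=1.5, k2=0.5, e_total=20.0)
for s in (0.02, 2.0, 200.0):
    print(f"s = {s:g}  v = {michaelis_menten_rate(s, km, vmax):.4f}")
```

with these values, the rate grows almost linearly with s when s is much smaller than km, equals half of vmax when s equals km, and approaches vmax when s is much larger than km.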
at steady state, the formation and consumption rates of the complex are equal. at the moment e and s are mixed, the system is out of balance: the concentration of the complex is zero, and its rate of formation is much higher than its rate of consumption. as the reaction proceeds, the concentration of es increases, and with it the rate of disappearance of the complex relative to its rate of formation. initially, the excess of substrate ties up the enzyme; during the course of the reaction, the enzyme is continuously regenerated and the system reaches its steady state. the analysis of these reactions leads to two important conclusions:
• at a low concentration of the substrate, the rate of the reaction is proportional to the substrate concentration and inversely proportional to the michaelis constant, i.e., the ratio of the rate constants for disappearance of the complex (dissociation into the initial reactants plus decomposition into products) to the rate constant for its formation.
• at a high concentration of the substrate, the maximum rate is limited by the enzyme concentration.
thus, the two regimes correspond to the two processes that can control the overall reaction rate (stein et al. 2011). when the reaction takes place in a homogeneous solution at a uniform rate, the same throughout the medium, it is necessary to consider the change in the concentration of the components over time. three mechanisms of mass transport occur in solution: diffusion, convection, and migration. an electrochemical nbs is an autonomous, self-contained device capable of providing specific quantitative or semiquantitative analytical information using, as the molecular recognition element, a biochemical receptor (biological identification element) that is in direct spatial contact with an electrochemical transducer. electrochemical nbss are distinguished only by the nature of the transducer, regardless of the nature of the biological component, according to the classification in table 12.3.
because it can be calibrated repeatedly, an electrochemical nbs is distinguished from a bioanalytical system, which requires additional processing steps such as the addition of reagent. nbss intended for a single measurement, which cannot monitor the analyte concentration continuously and cannot be rapidly and reproducibly regenerated, are defined as "disposable." nbss are classified according to the mechanism that confers their biological specificity or according to the mode of physicochemical signal transduction (the transducer) (barroso et al. 2015). the biological recognition element is based on a catalyzed chemical reaction or on an equilibrium reaction with macromolecules that have been isolated, synthesized, or are present in their original biological environment. in the case of reversible reactions, the steady state can be reached when there is no further net consumption of the analyte by the immobilized biocomplexing agent incorporated into the nbs. in addition to the quantitative determination of analytes, nbss are also used to detect and quantify microorganisms: the receptors are bacteria, yeasts, or oligonucleotides coupled with electrochemical, piezoelectric, optical, or calorimetric transducers. electrochemical nbss can be classified according to the analytes or reactions they monitor: direct monitoring of the analyte concentration or of the reactions that produce or consume it and, alternatively, indirect monitoring of an inhibitor or activator of the biological recognition element (the biochemical receptor). performance criteria include calibration characteristics (sensitivity, linearity, operational concentration range, limits of detection and of quantitative determination), selectivity, steady-state and transient response times, reproducibility, lifetime, and stability (fang et al. 2015). the notion of recognition is used in nbss, or nanobiosensory systems, by analogy with the sensory systems of plants.
senses such as smell or taste rely on systems that contain a recognition receptor cell coupled with neurotransmitter signal-processing pathways. such phenomena also occur in biochemosensors but at a much simpler level than the complexity of molecular recognition in living systems (barroso et al. 2015). examples of single or multiple signal transfer, limited to the main biochemosensors, are shown in table 12.3. for the receptor types shown in table 12.3, different electrodes and measurement methods can be selected from table 12.4 to form an electrochemical nbs. nbss are classified by the recognition element (table 12.3) or by the transduction mode (table 12.4). irrespective of the type of classification, an nbs should be treated as a single microsystem, with the biological recognition element in direct contact with the transducer element. an electrochemical nbs is an nbs with an electrochemical transducer (table 12.4). it is considered to be a chemically modified electrode (cme): the electric conductor may be a metal, a semiconductor, or an ionic conductive material coated with a biochemical or bioactive film. nbss may also use several other, non-electrochemical types of transducer: (a) piezoelectric nbss; (b) saw nbss, which measure surface acoustic waves in a resonance circuit (shear and surface acoustic waves); (c) thermometric nbss (the active element is coupled with a thermistor); (d) optical nbss, which use optical phenomena (planar waveguide, optical fiber, surface plasmon resonance). spr nbss use the analyte-receptor interaction immobilized on a metal film deposited on an optical prism and measure the variation of the refractive index caused by changes induced in the electrical charge of the metal; this signal is passed from the interaction process to the external electronic measuring system.
an electrochemical nbs is an integrated transducer microsystem capable of providing selective, quantitative, or semiquantitative analytical information using a biological identification element. it can be used to monitor biological and nonbiological elements. chemical sensors that incorporate nonbiological components as receptors, even when used to monitor biological processes (ph or oxygen sensors), are not nbss. the clark electrode is nonetheless of importance in nbs measurements. similarly, physical sensors used in biological environments, such as those measuring pressure, are not considered nbss (jacoby et al. 2015). according to the terminology set out in tables 12.3 and 12.4, electrochemical nbss can be classified by their biological specificity, by the mechanism or mode of signal transduction, or by a combination of the two. they can be amperometric, potentiometric, field-effect (fet), conductometric (measuring electrical conductance), or impedimetric. alternatively, a name such as "amperometric enzymatic nbs" specifies the nature of both the receptor and the transducer. the first nbss to be studied were enzyme sensors and immunosensors (fang et al. 2015). biocatalytic nbss are based on a reaction catalyzed by biomacromolecules that are present in their original biological medium, have been previously isolated, or have been produced synthetically. the substrate is consumed by the biocatalyst immobilized in the nbs, and the reaction is monitored by an integrated detector (transducer) that measures the stationary or transient states or the final reaction product. the commonly used types of biocatalyst are enzymes (single enzymes or enzyme complexes), which are the most widely used recognition systems; whole cells and microorganisms (bacteria, fungi, eukaryotic cells, or yeasts), cell organelles, or cell components (cell walls, mitochondria); and sections of plant or animal tissue. nbss with biocatalytic recognition elements have been the best known and most studied since the pioneering work of clark and lyons.
one or more analytes, commonly called substrates s and s′, react in the presence of one or more enzymes, cells, etc., to produce the products p and p′ (fang et al. 2015). there are four strategies whereby the associated transducer can monitor the consumption of the analyte s through the biocatalytic reaction:
• detection of the consumption of the cosubstrate s′, for example oxygen depleted by a reaction layer containing an oxidase, bacteria, or yeast. the measured signal is the decrease in cosubstrate concentration compared to its initial value.
• recycling of the reaction product p, such as hydrogen peroxide, h+, co2, or nh3, produced by oxidoreductases, hydrolases, lyases, etc. the signal from the transducer increases accordingly.
• detection of the state of the active center of the biocatalyst (redox center, cofactor, or prosthetic group) as it evolves in the presence of substrate s, by using an immobilized mediator that reacts quickly with the biocatalyst and is easily detected by the transducer. various ferrocene derivatives, as well as the tetrathiafulvalene-tetracyanoquinodimethane (ttf+ tcnq-) organic salt, quinones, quinoid dyes, and ru or os complexes in polymeric matrices, can be used as mediators.
• direct electron transfer between the redox active site of the enzyme and the electrochemical transducer.
the third strategy eliminates, partially or totally, the dependence of the nbs response on the concentration of the cosubstrate s′ and decreases the influence of interfering species. an alternative to the use of mediators is to restrict the substrate concentration within the reaction layer by means of a suitable outer membrane whose permeability favors the transport of the cosubstrate. when several enzymes are immobilized within the same reaction layer, the performance and capabilities of nbss can be improved.
three possibilities are commonly used: • some enzymes facilitate biological identification by sequentially converting the products of the enzyme reaction series into an electroactive final form: this way allows for a wide range of nbs analysis. • the enzymatic complex, applied in series, can regenerate the cosubstrate of the first enzyme and amplify the nbs output signal by regenerating another cosubstrate of the first enzyme. • the parallel enzyme complex improves selectivity of nbss by lowering the local concentration of electrochemical interfering substrate: this sequence is an alternative to the use of a permselective membrane or a sequential method (e.g., interpretation of an output signal generated by an nbs and a reference nbs without biorecognition element). the operation of nbss is based on the interaction between the analyte and the macromolecules or organized molecular assemblies. upon reaching the balance, there is no consumption of analyte by the biocomplex agent immobilized in the substrate. the response to the biocomplex analyte-reagent reaction is monitored by an integrator detector. in some cases, the biocomplex reaction is self-monitored by a complementary biocatalytic reaction. the integrator detector monitors stationary or transient states. antibody-antigen interactions, the most relevant examples of nbss using biocomplex receptors, are based on immunochemical reactions, e.g., binding an antigen (ag) to the characteristic antibody (ab). complexes formed by ab-ag can be detected provided that other nonspecific reactions are minimized, for each determination of ag corresponds to a certain ab that must be isolated, purified, etc. some recent studies have analyzed direct monitoring of ag-ab formation using ion-selective field effect transistors (isfets). increasing the sensitivity of immunological nbss is achieved by adding specific enzymes to ag-ab couples, but this requires additional synthesis steps. 
as the binding strength or affinity constant varies widely, these systems operate irreversibly (disposable nbss) or are coupled to flow injection analysis (fia) systems; then ab can be regenerated from dissociation of the complex with agents such as glycine-hcl to ph 2.5 (kurien et al. 2010 ). recently, they have been used as molecular recognition systems in conductometric analysis, isfet, or optical nbss with receptors with ion channels, membranes, or protein structures. a transporting protein, lactose-permease (lp), can be incorporated into a liposomal bilayer that permits protonic carbohydrate transport with a stoichiometric ratio of 1:1. this mechanism was identified through the ph-dependent fluorescence of a fluorophore immobilized in liposomes. liposomes with lp were incorporated into a lipid bilayer deposited on a ph-sensitive isfet. preliminary results show that this modified isfet is capable of irreversibly detecting lactose from an fia system. protein receptor nbss have been recently discovered. binding of analytes, here called agonists, to immobilized receptor proteins is monitored by changing the flow of ions through these channels. glutamate, as an agonist, can be determined in the presence of other agonists that can interfere with the determination of na + or ca 2+ streams using conductometric method or ion-selective electrodes. due to the dependence of the ionic channel on the nature of the linkages, it produces an independence toward the enzyme nature in order to achieve the desired sensitivity. two methods have been approached. the first refers to oligonucleotide duplex interleaving during the formation of the double helix structure of the dna of a molecule that is electrically active. the second method is the direct detection of guanine which is electroactive. some of these nbss cannot operate through analytical-sensing membrane separation membranes. 
the sensitive layer often has to be in contact with the biological environment where the analytes are located (fang et al. 2015). nbss have been developed for the indirect monitoring of organic pesticides or inorganic compounds (heavy metals, fluorides, cyanides) that inhibit the biocatalytic properties of the enzymes used in the construction of the nbss; such devices are often irreversible. as in immunosensors, the initial biological activity can only be regenerated by chemical treatment, so these devices are not part of the reconditioned or reusable nbs class. their application potential is to give a warning rather than to monitor a specific analyte accurately (they are considered disposable). nbss for cyanide, based on the inhibition of cytochrome c oxidase, are regenerated by washing with a phosphate buffer at ph 6.3 (armstrong and beckett 2011). since the development of the enzymatic glucose nbs, in which glucose oxidase was immobilized between two membranes, an extensive literature has emerged on techniques for immobilizing biological receptors. enzymes, antibodies, cells, or tissues with high biological activity can be immobilized in a thin film on the surface of a transducer by a variety of methods. the following procedures for the immobilization of biological receptors are used:
• entrapment behind a membrane: an enzyme solution or a cell or tissue suspension is confined between an analyte-permeable membrane and the measuring electrode (electrochemical detector).
• entrapment of biological receptors in a polymeric matrix such as polyacrylonitrile, agar gel, polyurethane (pu), or polyvinyl alcohol (pva), or in redox hydrogels with redox centers such as [os(bpy)2cl]+/2+.
• entrapment of biological receptors within self-assembled monolayers (sams) or bilayer lipid membranes (blms).
• covalent binding of receptors to membranes or surfaces activated by bifunctional groups or spacers: glutaraldehyde, carbodiimide, sams, avidin-biotin, silanization.
• modification of the entire electrode structure (modified carbon paste with enzymes or graphite in epoxy resins). receptors are modified either alone or mixed with other proteins, such as bovine serum albumin (bsa), either directly on the surface of the transducer or in the polymer membrane. reactivated membranes can be used directly to immobilize enzymes or antibodies without chemical modification. the covalent binding and cross-linking are more difficult than immobilization or the retaining of receptors on the membrane. in the case of microsensor structures where the membrane is directly deposited on the transducer, the covalent bonding is safer and more stable (muñoz et al. 2008 ). besides reactive layers or membranes with immobilized receptors, many nbss, those for clinical or biological applications, incorporate one or more internal or external separation auxiliary membranes with three important functions: the barrier, the outer diffusion barrier for the substrate, and the biocompatible surfaces. for any nbs built on the principle of molecular recognition, it is important to characterize it by its response, which is related to operating parameters and limiting reaction speeds. accuracy, precision, sensitivity, and reproducibility are basic criteria for estimating nbs performance. these parameters are in direct relation to the reaction mechanisms, the transport phenomena, and the kinetics of the processes in the volume at the interface. most criteria have been developed for enzymatic nbss, being the most studied in the literature. in the case of immunosensors, the key element is the ability to capture the surface, i.e., the number of surface molecules that is active. one method of checking this parameter is to measure the specific activity, meaning the ratio of the number of active molecules to the number of immobilized molecules. 
this estimation is dependent on the immobilization mode (molecular orientation, number of attachment points or active sites), and the ratio ranges between 0.15 and 0.3, rarely reaching the unit. capture capability becomes important when the surface decreases as in microfluidic applications. a problem encountered in immunosensors is that of regenerating the surface without significant loss of activity. there was a lack of rigor in the performance criteria (affi et al. 2016) . the response signal is corrected for background noise, the reference concentration is usually estimated in mol/l although this high value is never used when measuring ranges refer to 10 −10 mol/l, and currently sensitivities of the order nmol/l and pmol/l have been reached. transient response is important for dynamic assay analysis and sampling techniques but is less significant in continuous monitoring. the transient response is estimated by the slope (dr/dt) max after the addition of the analyte in the measuring cell. one evaluation method is to introduce nbss into an fia system for sequential sample analysis in a specified hydrodynamic regime. the sensitivity and linear range of measurement of stationary concentration are determined through graphical representation. this method is more concise than the current calibration curves used to plot the response corrected to the baseline based on its concentration or logarithm. parameters are estimated in the linear response range of nbss. any electrochemical nbs has a superior linearity of the response. this limit is directly related to the biocatalytic or biocomplex properties of the biochemical or biological receptor. more in the case of nbss with enzymes, this limit is significantly influenced by membranes and immobilized substrates where the diffusion barriers and secondary kinetics play a role. 
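the estimate of the transient response from the maximum slope (dr/dt) max mentioned above can be illustrated with a short sketch. it assumes that the response has already been corrected for the baseline and sampled at known times, and it approximates the derivative by finite differences between consecutive samples; the function name and the sample data are hypothetical.

```python
def max_transient_slope(times, responses):
    """Return the largest finite-difference slope dR/dt between consecutive samples.

    times     : strictly increasing sampling times
    responses : baseline-corrected sensor responses at those times
    """
    if len(times) != len(responses) or len(times) < 2:
        raise ValueError("need two sequences of equal length with at least two samples")
    best = None
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        slope = (responses[i] - responses[i - 1]) / dt
        if best is None or slope > best:
            best = slope
    return best


# Hypothetical transient recorded after adding the analyte to the measuring cell.
t = [0.0, 1.0, 2.0, 3.0, 4.0]        # time, s
r = [0.0, 0.5, 2.5, 3.5, 3.75]       # response, arbitrary units
print(max_transient_slope(t, r))     # steepest rise between two consecutive samples
```

finite differences amplify noise, so in practice a real signal would probably need smoothing before the slope is taken; that step is left out here to keep the sketch short.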
the local concentration of the analyte in the reaction layer may be two orders of magnitude lower than in the bulk of the solution (michelini and roda 2012). enzyme kinetics are described by the michaelis-menten mechanism and expressed by the parameters km and vm. for the kinetics of the enzyme in the solution phase, km is usually determined from the lineweaver-burk graphical representation. for any electrochemical nbs, the number of standards used and the way the sample matrix is simulated or duplicated should be stated, and the procedures must be specified for each type of nbs according to its application. these points are especially important for disposable nbss based on immunoaffinity or inhibition reactions. sensitivity is the slope of the calibration curve and should not be confused with the limit of detection (lod), which is defined relative to the baseline or noise signal. the working concentration range is bounded by the lower and upper detection limits (fang et al. 2015). selectivity and reliability are determined as for any other amperometric or potentiometric sensor. they depend on the choice of receptor and transducer. most enzymes are specific, but there are also nonselective enzyme classes, such as alcohol oxidases, sugar oxidases, peroxidases, laccases, tyrosinases, ceruloplasmin, alcohol dehydrogenases, glucose dehydrogenases, nadh dehydrogenases, etc. they have been used to develop nbss for the determination of phenols in the environment or for monitoring food quality. on the other hand, oxygen electrodes, ph electrodes, and isfets have a pronounced selectivity, whereas metal electrodes are sensitive to many substances. their selectivity may change when these transducers are associated with receptors. the enfet, being ph-sensitive, responds to the buffer and to protonation, but its selectivity is not altered.
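the distinction drawn above between sensitivity (the slope of the calibration curve) and the limit of detection can be made concrete with a small sketch. it fits a straight line to calibration data by ordinary least squares and then estimates the lod as three times the standard deviation of blank readings divided by the slope; this factor of three is a common convention assumed here, not something stated in the source, and the function names and data are hypothetical.

```python
from statistics import mean, stdev


def calibration_line(concentrations, responses):
    """Least-squares slope (sensitivity) and intercept of response vs. concentration."""
    if len(concentrations) != len(responses) or len(concentrations) < 2:
        raise ValueError("need at least two paired calibration points")
    cx, cy = mean(concentrations), mean(responses)
    sxx = sum((x - cx) ** 2 for x in concentrations)
    if sxx == 0:
        raise ValueError("concentrations must not all be equal")
    sxy = sum((x - cx) * (y - cy) for x, y in zip(concentrations, responses))
    slope = sxy / sxx
    intercept = cy - slope * cx
    return slope, intercept


def detection_limit(blank_responses, slope, k=3.0):
    """LOD in concentration units as k * sd(blank) / slope (k = 3 is an assumed convention)."""
    if slope <= 0:
        raise ValueError("slope must be positive")
    return k * stdev(blank_responses) / slope


# Hypothetical calibration data within the linear range of the sensor.
conc = [0.0, 1.0, 2.0, 3.0, 4.0]          # e.g. nmol/l
resp = [0.1, 2.1, 4.1, 6.1, 8.1]          # e.g. nA
blanks = [0.08, 0.10, 0.12]               # repeated readings without analyte

sensitivity, offset = calibration_line(conc, resp)
print("sensitivity:", sensitivity, "offset:", offset)
print("lod:", detection_limit(blanks, sensitivity))
```

the two quantities answer different questions: the slope says how strongly the signal changes per unit concentration, while the lod says how small a concentration can be told apart from the noise of the blank.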
when the transducer responds to well-identified interfering substances, such as ascorbate or urate in glucose nbss based on hydrogen peroxide detection, these side effects may be restricted by the use of outer or inner permselective membranes. alternatively, a differential arrangement can be designed that combines an nbs with a sensor lacking the biological receptor. the reliability of nbss in operation depends on the selectivity, reproducibility, and accuracy of the measurements (heyl et al. 2012). clark studied the electrochemical reduction of oxygen gas at a platinum metal electrode; the platinum electrode used for the electrochemical detection of oxygen is known as the clark electrode. the electrode has an organic membrane covering the electrolyte layer and two metal electrodes. oxygen diffuses through the membrane and is electrochemically reduced at the cathode. a fixed voltage, at which the oxygen reduction reaction takes place, is applied between the cathode and the anode. temperature greatly influences reaction rate and solubility. this is a polarographic electrode used to measure the concentration of oxygen in body fluids or gases. the sample is in contact with a membrane (polypropylene or teflon) through which the oxygen diffuses into a measuring chamber containing 50% saturated potassium chloride solution. inside the chamber are two electrodes: a reference electrode, ag/agcl, and a platinum electrode sealed in glass. the electrical current at the polarization potential of -600 mv is proportional to the oxygen concentration in the solution. with reverse polarization at +600 mv, hydrogen measurements can be made. the reactions are very sensitive to temperature, which should be held constant to within ±0.1 °c. the electrode is calibrated using mixtures of the two gases, oxygen and hydrogen, at known concentrations.
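because the current of the clark electrode is proportional to the oxygen concentration, its calibration against known gas mixtures can be sketched as a two-point linear calibration. the sketch below assumes an ideal linear response with a small residual current in an oxygen-free gas; the function name, the residual-current term, and the numbers are hypothetical and serve only as an illustration.

```python
def make_two_point_calibration(i_zero, i_ref, c_ref):
    """Build a function converting electrode current into oxygen concentration.

    i_zero : current measured in an oxygen-free gas (residual current, assumed)
    i_ref  : current measured in a reference gas of known oxygen content
    c_ref  : oxygen content of the reference gas (e.g. percent by volume)
    """
    if i_ref == i_zero:
        raise ValueError("reference and zero currents must differ")
    gain = c_ref / (i_ref - i_zero)   # concentration units per unit current

    def to_concentration(current):
        # Linear response assumed: concentration is proportional to (current - i_zero).
        return gain * (current - i_zero)

    return to_concentration


# Hypothetical calibration: 0.5 nA in oxygen-free gas, 20.5 nA in a 20.9% oxygen mixture.
convert = make_two_point_calibration(i_zero=0.5, i_ref=20.5, c_ref=20.9)
print(convert(10.5))   # oxygen content corresponding to a 10.5 nA reading
```

since the response depends strongly on temperature, a calibration of this kind would only hold at the temperature at which the two reference points were recorded.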
the oxygen electrode, or clark electrode, has proven itself as an analyzer of raw or developed gas in chemistry analyses in the clinical laboratory and in medical care, whether ambulatory or intensive (an enzyme on the surface of the platinum electrode reacts with oxygen). the enzymes are held against the electrode surface by a closed membrane, an arrangement that can be regarded as the simplest model of nbss. the change in oxygen concentration was proportional to the glucose concentration. it was the first nbs built, and it greatly advanced laboratory analysis. oxygen diffuses through the membrane and is electrolytically reduced at the cathode. the higher the partial pressure of oxygen, the more oxygen diffuses per unit time. temperature sensors attached to the sample and the membrane make it possible to compensate for the temperature dependence of the diffusion rate and solubility. the measuring instruments record cathode current, sample temperature, membrane temperature, barometric pressure, and salinity. with this information one can calculate the oxygen contained in the sample, either in parts per million (ppm) or in percent of oxygen saturation. the geometric configuration of the clark electrode is of great importance. in particular, the thickness of the electrolyte layer between the cathode and the membrane must be kept within a certain limit to ensure linearity and to reduce the drift current. calibration of a polarographic system is essential. proportionality between current and oxygen concentration must be ensured, with errors below 1% (the nature of the biological sample and the air parameters are essential here). air is a gas mixture with a constant oxygen content of about 20.9%; when it is in contact with water, the amount of oxygen dissolved depends on several factors: the time allowed for oxygen dissolution, homogeneity of the water solution, water temperature, air pressure, salts contained in water, and other water-soluble substances that are oxygen-consuming.
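since the cathode current of a clark electrode is taken to be proportional to the oxygen concentration, the calibration step described above can be illustrated with a two-point linear calibration. the sketch below is an assumption-laden example rather than an instrument procedure: the function name and the current values are hypothetical, the two reference points are taken to be an oxygen-free medium and an air-saturated medium, and temperature, barometric pressure, and salinity are assumed to be the same during calibration and measurement, so no compensation is applied.

```python
def make_two_point_calibration(current_zero, current_air):
    """Return a function that maps cathode current to percent of air saturation.

    current_zero: current measured in an oxygen-free medium (0 % saturation)
    current_air:  current measured in an air-saturated medium (100 % saturation)
    Both values are hypothetical inputs; units only need to be consistent.
    """
    span = current_air - current_zero
    if span == 0:
        raise ValueError("the two calibration currents must differ")

    def percent_saturation(current):
        # Linear interpolation between the two calibration points.
        return 100.0 * (current - current_zero) / span

    return percent_saturation


# Hypothetical calibration currents in microamperes.
to_saturation = make_two_point_calibration(current_zero=0.02, current_air=1.02)
reading = to_saturation(0.52)
print(f"{reading:.1f} % of air saturation")
```

a reading outside the 0 to 100% interval is not rejected by this sketch; in practice it would point to drift or to a change in temperature or pressure since the calibration was made.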
oxygen contained in water is determinant for biological and chemical processes, so measurement of dissolved oxygen in water is important to find the partial pressure of dissolved oxygen; it must be saturated in pure water at a certain temperature (wolfbeis 2015) . the enzymatic electrode (in some references known as the enzyme electrode) is a combination of an electrochemical probe of any type (amperometric, potentiometric, or conductometric) with a thin layer (10-200 microns) of immobilized enzyme. in these devices the function of the enzyme is to provide selectivity in virtue of its biological affinity for a substrate of molecules. an enzyme can catalyze a reaction of a given substrate for a specific isomer from a plurality of substrates with different isomers. typically, the degree of advancement of an enzymatic reaction (directly related to the concentration of the analyte) is monitored by the rate of product formation or the disappearance of a reagent. if the product or reagent is electrically active, then the response can be directly monitored by amperometry, i.e., the variation of the current for a given applied potential. the main considerations are: does the enzyme contain active redox groups? are the biochemical reaction products electroactive? is one of the substrates or cofactors electrically active? what is the speed and response time? what is the final application of nbss? if the enzyme does not contain redox groups, then nbss are limited to measuring the product or substrate consumption by their reaction to the transduction electrode. the electrical current is directly related to the analyte concentration. nbss are based on electrochemical response due to h 2 o 2 . most common enzymes used in the design of enzyme electrodes are those that contain redox groups that change their redox state during the biochemical reaction. redox enzymes are oxidases and dehydrogenases, pyrroloquinoline quinone (pqq). 
they act by oxidizing the substrate, accepting electrons in the process and thereby passing into a reduced state. these enzymes return to the oxidized, active state by transferring electrons to molecular oxygen, which yields hydrogen peroxide (h 2 o 2 ). since both oxygen and peroxide are electrochemically active, the reaction can be followed either by the reduction of the cosubstrate (oxygen) or by the oxidation of the reaction product (peroxide). the method based on the reduction of oxygen at the o 2 electrode is one of the simplest but suffers from several disadvantages, namely slow response, miniaturization difficulties, low accuracy, and poor reproducibility. measurements based on peroxide oxidation overcome these difficulties and are currently the most popular method. mediator systems: a major limitation of the hydrogen peroxide detection system described above is the high operating potential (about 0.8 v against the ag/agcl reference electrode) required for oxidation of peroxide, which leads to increased interference. the use of mediators (molecules that can carry electrons between the enzyme redox center and the electrode) can minimize this inconvenience. depending on the nature of the mediator, the applied potential may be reduced below the level at which species such as ascorbate, urate, and paracetamol interfere. a large number of compounds are able to act as mediators in the enzyme electrode. of these, the most popular are the metal complexes. representative organometallic complexes are ferrocenes and their derivatives, because they have suitable redox potentials that are independent of ph. bienzymatic systems: recent work has focused on direct electron-transfer communication between enzymes and electrodes. one success in the field is the enzyme horseradish peroxidase (hrp), which catalyzes the reduction of hydrogen peroxide by a number of organic compounds.
when the enzyme is immobilized on the electrode, the need for the organic reducer is prevented by the electrode itself providing the reducing equivalences. the coupling of peroxidase with an enzyme oxidase allows for the construction of bienzymatic systems through which the peroxide produced by the oxidase is detected by the electrode-peroxidase system which operates at lower potentials relative to a simple platinum electrode. this is where the minimization of active species interferences results from (frederickson matika and loake 2014). optical fiber as a nanobiosensor can be placed in the surface or inside the plant to directly measure parameters. nbss with optical fiber are proposed to be used in many and rapid medical determinations, and its applications are continuously expanding. it can be attached inside a hollow-like tubular instrument, serving to dilate a hole or channel, and inserted into the tissue, performing a minimal monitoring where it is needed. nbss with optical fiber are nontoxic, chemically inert, and can be successfully used inside the plant. it can be associated with plant monitoring equipment. it's easy to handle with negligible weight. the evolution of fiber-optic nbss is based on multiple performance and biocompatibility. biocompatibility is the first step in the plant's comfort; nbss should not affect the physiological parameters of the plant, but its functionality must not be compromised by plant disassimilation products. fiber-optic nbss can be classified as extrinsic, fiber acts as a way for signal and intrinsic, interactions occur in the fiber itself. there are two types of fobs: minimally invasive nbss that are introduced into the cavities of the plants and invasive nbss that are introduced into the organs or in wood conductive tissue (liu et al. 2015) . in the last decade, optical fiber is a product that is widely used in all the cutting-edge fields of advanced science and technology. 
given the ease with which it can be manipulated, unlimited sterilization possibilities, and reduced costs, it can be estimated that this product will increasingly gain market. the following applications are known to have used fiber-optic nbss: in epidermis and vascular tissues, for analysis of raw sebaceous elements, saturation in oxygen, raw sewage gas analysis, sap ph; in plant breeding monitoring; easy ph determination with a microabsorbent indicator and ph-modulator, acid-alkaline; in vegetal tissues, when it is intended to monitor the temperature, or to diagnose small and very small injuries that are difficult to reach; in epidermis for can test the quality and integrity of the layers, so, small lesions can be detected, can be used to stimulate tissues, a fobs based on oxygen demand (bod) can be used; in the stem can identify very small injuries that are inappropriate. another possible application is to appreciate the color or integrity of the vascular tissues. optic-fiber nbss can now monitor electrolytes from raw or elaborated sap as well potassium, sodium, and calcium. it takes the form of a tubular instrument, able to expand an optic-fiber channel (0.5 mm diameter) that can be inserted into vascular tissues. it can measure the gas concentration and the ph of the raw or elaborated sap and oxygen saturation also. the materials that make up the chemical transducers are ionophores that can be reversibly attached to the electrolyte by a molecular separator (spacer) and fluorophore, respectively. the degree of fluorescence, through excitation with electroluminescent diodes in purple, is modulated by ionophores proportional to the analyte concentration. nbss are used either extracorporeally in the external raw or elaborated sap gas analysis circuit or intracorporeally for continuous blood gas monitoring in critical situations. 
the chemical parameter capable of monitoring cell state is ph, because lactic acid, formed when cell tissue dies, produces a decrease in ph. any drop in the ph of the raw or elaborated sap below 7.4 indicates cell death (mclamore et al. 2010). achievements in this domain include invasive ph nbss that determine the state of the cell. these nbss are composed of a fluorescent dye encapsulated in a gel matrix (polyacrylamide) attached to the end of the optical fiber. the dye has an acidic form with emission centered on 580 nm and an alkaline form with emission centered on 680 nm. the balance between the two forms is ph sensitive. excitation occurs at 533 nm for both forms. the emissions are separated directly by optical filters, and the sensitivity is 0.05 ph units, finer than the accepted clinical standard (0.1 ph). a bacterial disorder is a multifactorial condition that is characterized by demineralization of the inorganic portion and destruction of the organic substance. each bacterial perturbation has a pathogenic species as its etiological agent. the bacterial load of the raw or elaborated sap is about 10^9 per ml. therefore, raw or elaborated sap can be considered a selective medium for bacterial growth. a significant correlation has been demonstrated between the number of pathogens in the raw or elaborated sap and the prevalence of the associated disorders. a simple method has been adapted for detecting and counting the pathogen; it uses a device consisting of a special support carrying an agar culture medium that contains 20% sucrose. the medium is inoculated with raw or elaborated sap, and the growth density of the pathogen is assessed after incubation for 48 hours at 37 °c. next comes the morphological identification of distinct colonies on selective and nonselective agar, based on the distinct cell form visible under the light microscope. the technique has some disadvantages: bacterial growth takes time, and additional laboratory facilities are required.
to monitor pathogen activity in the sap, a fobs was developed that monitors the pathogen-mediated sucrose reactions through a photosensitive indicator immobilized in a porous glass coating. the surface of the optical fiber core is treated or coated in a porous glass film using the sol-gel technique (kozan et al. 2007) . spectroscopic analysis showed that there are two phases in light absorption at 597 nm over a duration of 120 min: between 0-60 min and 60-120 min. the investigation shows the potential of nbss in monitoring plant activity. the sol-gel technique is used to immobilize the photosensitive indicator, and it is simpler than compared with the principle of selective medium which takes time and is laborious. criteria at the base of the experiment: pathogens are partially anaerobic with opti-mal growth at 37 °c; glucosyltransferases and fructosyltransferases from pathogens catalyze the synthesis of water-insoluble glucan and fructanic polymers in sucrose to form lactic acid found in acidic sap; pathogenic agents in the sap synthesize both extracellular and intracellular polysaccharides from sucrose; extracellular polysaccharides help the adhesion of bacteria to the surface, while intracellular polysaccharides are stored for bacterial energy; polysaccharides, intracellularly, help the bacterium to continue fermentation even when there is no exogenous form of food. acid tolerance of pathogens causes their activity to continue even at a low ph; a ph indicator, photosensitive, produces a characteristic color, according to a color gradient, depending on the ph of the raw or elaborate sap and is used in the fobs construction (miranda et al. 2011) . on the basis of these considerations, an experimental assembly was designed and performed with a double beam uv spectrometer in which optical fiber was introduced instead of a cuvette. in the reference well, the blue bromophenol buffer solution was used for different ph values, from 4 to 7. 
the initial experiment helped determine the characteristic wavelength and peak of the buffered indicator for the different ph values induced in the sap by bacterial activity. uv spectroscopic analysis of the bromophenol blue solution at ph 4 and 7 showed a slightly prominent peak at 590 nm; the peak intensity decreases from ph 7 to 4. comparison with literature data showed good agreement: in the medium containing sap, sucrose, and bromophenol blue, the absorption remained stable at 590 nm over time intervals of 15 min, 30 min, 1 h, and 24 h. each set of etched and processed fibers requires independent calibration because of the inhomogeneities resulting from the optic-fiber preparation steps. the fobs proves to be a rapid quantitative test of pathogen activity in raw or elaborated sap. this test can also be adapted to other plant nanobionics applications where bacterial activity is involved in cellular or tissue destruction. in developing the method of forming a biosensor from an optical fiber for the observation and detection of the pathogen, the experiments of this study covered the phase of biochemical recognition of the signal and the phase of spectroscopic analysis. the idea of nanobionic plants evolved from the effort to create self-healing solar cells based on plant cells. the next step was to try to enhance photosynthesis in chloroplasts isolated from plants, for use in solar cells. chloroplasts host everything needed for photosynthesis, which takes place in two steps. in the first step, pigments such as chlorophyll absorb light, which excites electrons that flow through the thylakoid membranes of the chloroplast. the plant captures this electrical energy and uses it to fuel the second stage of photosynthesis, creating glucose. chloroplasts can still carry out these reactions after they have been removed from the plants, but after a few hours they break down because light and oxygen destroy their photosynthetic proteins.
plants, unlike extracted chloroplasts, can normally repair this damage. to prolong chloroplast productivity, researchers attached cerium oxide nanoparticles to them. these particles are powerful antioxidants that remove oxygen radicals and other highly reactive molecules produced by light and oxygen, protecting the chloroplasts from decomposition. the researchers introduced the nanoparticles into chloroplasts using a new technique called leep (lipid exchange envelope penetration). wrapping the particles in polyacrylic acid, a highly charged molecule, allowed them to penetrate the hydrophobic lipid membranes surrounding the chloroplasts. in these chloroplasts, the level of molecular decomposition decreased considerably. using the same technique, the researchers introduced semiconducting carbon nanotubes coated with negatively charged dna into chloroplasts. plants use only about one tenth of the available sunlight, but the carbon nanotubes functioned as artificial antennas that allowed the chloroplasts to capture wavelengths they do not normally use, such as green, ultraviolet, and near infrared. when carbon nanotubes functioned as prosthetic photoabsorbers, photosynthetic activity, measured by the electron flow through the thylakoids, was 49% greater than in isolated chloroplasts without attached nanotubes. when cerium oxide was combined with the carbon nanotubes, the chloroplasts remained active for a few more hours (nikolelis et al. 2008). the researchers then turned to living plants and used a technique called vascular infusion to deliver nanoparticles into arabidopsis thaliana, a small flowering plant. using this method, the researchers applied a nanoparticle solution to the lower side of the leaf, where it penetrated through the stomata, the pores that normally let carbon dioxide in and oxygen out. in these plants, the nanotubes penetrated into the chloroplasts and increased the photosynthetic electron flow by about 30%.
it is also to be discovered how these percentages influence the sugar production of plants. scientists have been able to transform the arabidopsis thaliana plant into a chemical nbs by implanting nanotubes that detect nitric oxide, a pollutant produced by combustion (hines et al. 2015) . nbss have been created on the basis of carbon nanotubes for several chemicals, including hydrogen peroxide, trinitrotoluene, and sarin gas. when molecules attach to polymers encased in nanotubes, the fluorescence of these nanotubes is altered. carbon nanotubes can be used to create nbss that detect real-time particle free radicals or signal the presence of molecules with a very low level of concentration and too difficult to detect. this is a tremendous demonstration of how nanotechnology can be combined with synthetic biology to modify and enhance the functions of living organisms of plants. the way that nanoparticles arrange themselves can be used to enhance plant photosynthetic capacity, being used as nbss and stress reducers. by adapting nbss to targets, researchers hope to develop plants that could be used to monitor environmental pollution, pesticides, fungal infections, or exposure to bacterial toxins. attempts to incorporate electron nanomaterials into plants, such as graphene, are currently being made. researchers have long tried to find the best way to transmit information in a timely manner, focusing on electronics and mechanics of nbss for their tasks. nbss used for agricultural purposes are not new. recombinant dna technology now offers the possibility of obtaining new biological insecticides that preserve the benefits of "classic" biological control agents, plus some new features. these technologies are not accessible to all possible users, especially if they are poor, and furthermore, they have also generated a series of public debates about their usefulness and effects on organisms other than the target or the environment. 
obtaining pest-resistant plants is perhaps the most spectacular field of genetic engineering applied to plants, since it allowed the regeneration of plants containing genes of bacterial origin that provide protection against harmful insects. this ensures, on the one hand, the obtaining of richer harvests and, on the other hand, the reduction of farmers' costs for pesticides (lukács et al. 2006; prasad et al. 2014 prasad et al. , 2017 . a series of new genes for resistance to insect attack, transferable to plants (genes encoding δ-endotoxin production from b. thuringiensis) has been discovered; genes for the synthesis of enzymes or enzyme inhibitors; plant genes encoding the synthesis of specific lectins; genes that cause induction of synthesis of plant compounds such as phytoalexins. the development of cloning methodology was based on the observation that there is a group of gram-positive bacteria belonging to b. thuringiensis species that produce a toxin called δ-endotoxin or crystalline protein capable of killing a wide range of insects (coleoptera, lepidoptera, diptera), depending on the bacterial strain. of great interest is strain b. thuringiensis var. tenebrionis that synthesizes an effective δ-endotoxin against the colorado beetle. the genes involved in the synthesis of this protein are localized, on most bacterial strains, to large plasmids (75 kb), the production of the toxin occurring during sporulation. it has been shown that crystalline protein (δ-endotoxin) is normally expressed as a large inactive protoxin, which undergoes proteolytic processing in the intestine of the sensitive insect, becoming an active toxin. it recognizes the specific receptors in the intestinal cells and blocks the functions of these cells, leading to the death of the insects. studies on genes that encode inhibitory proteins produced by b. 
thuringiensis led to their grouping into four types based on target species specificity and nucleotide sequence: type i cry genes encode 130 kda proteins specific for lepidopteran larvae; type ii cry genes encode 70 kda proteins active on dipteran and lepidopteran larvae; type iii cry genes encode 70 kda proteins with specific activity on coleopteran larvae; and type iv cry genes encode inhibitory proteins for dipteran larvae. the number of identified genes coding for δ-endotoxin-like proteins is high: 140 genes specific for lepidopteran, coleopteran, and dipteran insects (genfa et al. 2005). crystalline protein genes have been obtained from several strains of b. thuringiensis by genetic amplification (pcr). because expression of the whole gene for the crystal protein was found to be very poor in transformed plant cells, a modified gene was created, containing only the n-terminal portion of the protein (amino acids 1-645). to increase gene expression in plants, the natural at-rich sequence coding for amino acids 1-415 was replaced by a synthetic gc-rich sequence containing the codons preferred by plant cells. these recombinant genes were introduced into ti plasmid-derived vectors (binary vectors containing the duplicated camv promoter, which increases transcription fivefold, and selection marker genes for resistance to antibiotics or to the herbicide phosphinothricin) and transferred to plant cells by a. tumefaciens containing disarmed ti plasmids. the size of the recombinant plasmids obtained ranges between 5000 and 10,000 bp, depending on the size of the bacterial gene and the promoter sequence integrated into the vector. recombinant bacterial strains were then used to infect test plants (potato, tobacco, cotton). selection was first performed according to the vector-borne selection markers (antibiotic resistance, gus test, herbicide resistance, etc.), and finally the regenerated plants were subjected to insect attack (takakusagi et al. 2013).
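the replacement of an at-rich coding sequence by a gc-rich synthetic one, as described above, relies on the degeneracy of the genetic code: synonymous codons can be exchanged without changing the encoded protein. the sketch below illustrates only that principle. the dna fragment is hypothetical, and the rule used here (pick, for each amino acid, the synonymous codon with the most g and c bases) is a simplifying assumption; it is not the procedure of the cited work, which refers to the codons preferred by plant cells and would therefore need a host codon-usage table.

```python
from itertools import product

# Standard genetic code, built with the bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def gc_count(seq):
    """Number of G and C bases in a sequence."""
    return sum(base in "GC" for base in seq)


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    return gc_count(seq) / len(seq)


def codons(seq):
    """Split a coding sequence (length a multiple of 3) into codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[codon] for codon in codons(seq))


# For each amino acid keep the synonymous codon with the highest G+C count
# (ties resolved by alphabetical order of the codons).
GC_RICH = {}
for codon, aa in sorted(CODON_TABLE.items()):
    if aa not in GC_RICH or gc_count(codon) > gc_count(GC_RICH[aa]):
        GC_RICH[aa] = codon


def gc_enrich(seq):
    """Rewrite a coding sequence with the most GC-rich synonymous codons."""
    return "".join(GC_RICH[CODON_TABLE[codon]] for codon in codons(seq))


# Hypothetical AT-rich fragment, used only to exercise the functions.
at_rich = "ATGAAATTAGAAACTTTATTTAAAGAT"
gc_rich = gc_enrich(at_rich)
print(translate(at_rich) == translate(gc_rich))
print(round(gc_fraction(at_rich), 2), round(gc_fraction(gc_rich), 2))
```

the protein sequence is unchanged by construction, while the g+c content of the fragment rises; a realistic design would also weigh codon frequencies in the host plant and avoid sequence motifs that destabilize the transcript.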
regenerated plants have shown resistance to attack by insect pests, the specific character being maintained and expressed in experiments in the field. the first transgenic plant that manages the insect attack resistance belongs to the nicotiana tabacum species, expressing the whole or truncated cry 1a gene, cloned under the control of a constitutive promoter, so that the inhibitory protein represents 0.02% of the total plant protein (leaf). there were obtained cotton plants into which the modified cry 1a (c) gene, cloned under the control of the camv 35s promoter or under the control of a promoter and a sequence for a chloroplast transit peptide isolated from arabidopsis so that the expression level of the gene of interest led to a high level of toxin: 0.1% of total protein, 1%, respectively. another variant of cloning the bacterial toxin gene was that of using genetic elements that ensure expression of the gene of interest exclusively in the green portions of the plant (the promoter derived from the gene for pepc) or pollen (by using a gene derived promoter for a calcium dependent protein kinase (cdpk). a similar methodology has been used to transform rice plants, and by cloning a synthetic cry iii gene, they have obtained tobacco and potato plants resistant to the attack of colorado beetle (vigneux et al. 2007) . a modified 1a (b) modified gene was used for cloning under the control of the camv promoter and obtained sugarcane plants with resistance to diatraea saccharalis larvae. given the practical significance of plant resistance to harmful insects, research has been extended to other plant species, producing eggplant plants resistant to the attack of coleoptera, broccoli with resistance to certain lepidopteran species, maize with resistance to b. fusca, etc., as well as a number of advances in leguminous plants. 
toxicity studies conducted on plants expressing the gene for δ-endotoxin have shown that the existence of the transgene does not alter the normal features of the plants, except resistance to insect attack. in addition to δ-endotoxin produced by b. thuringiensis strains, other bacterial species have also been identified that produce insecticidal proteins (liu et al. 2011) . this is the case for b. cereus strains producing two vip1 and vip2 proteins with effect on insects, their mode of action being similar to δ-endotoxin. expression by plants of enzymes such as chitinase, cholesterol oxidase, lipoxygenase, phenol oxidase, peroxidase, or isopentenyl transferase (ipt) could be an alternative to using the δ-endotoxin gene. of the enzymes that can provide plant protection against insect attack, a particular place is occupied by chitinases, enzymes that act on chitin, a basic component of insect coatings. tobacco plants expressing genes for chitinase isolated from insects or beans exhibit increased resistance to lepidopterans. it has been observed that by cloning the isolated chitinase gene from the s. marcescens bacterium, a synergistic effect of endocytinase produced by plants containing the endotoxin gene in addition to s. littoralis larvae has been revealed. another enzyme of bacterial origin that exhibits insecticidal action is cholesterol oxidase. the introduction and expression of cho a gene for cholesterol oxidase from streptomyces sp. in tobacco plants have led to increased plant resistance to a. grandis larvae. the use for cloning the gene for the enzyme bacterial isopentenyl transferase (ipt) (involved in cytokine biosynthesis) by fusion with the protease inhibitor ii (pi-iik) gene promoter determined the production of n. plumbaginifolia lepidopteran-resistant plants (m. sexta or m. persicae). another embodiment was that of introducing into the plant cells the cpt2 gene that encodes a trypsin inhibitory protein isolated from vicia faba. 
this protein has antimetabolic effect, protecting plants from the attack of pests. similarly, other genes encoding protease inhibitors (kunitz trypsin inhibitor, bowman-birk proteinase inhibitor) or lectins have been cloned into different hosts, and encoded proteins may be true "defense guns" for the plants that contain them. it is known that insects, such as lepidopterans, depend on serine proteases (trypsin, chymotrypsin, or elastase), these being the first enzymes to digest (alvarez-fernandez 2010). a series of genes encoding different inhibitors for serine proteases have been isolated from various sources (plants, microorganisms) and cloned into various plant species, such as m. sativa, tobacco, corn, etc.; the plants obtained showed increased resistance to various insect pests compared to normal non-gm plants. in some cases, it has been noted that the insertion of additional genes to plants for protease inhibitors to join endogenous genes causes an increased level of pathogen resistance of transformed plants. however, the use of protease inhibitors for controlling insect pests requires thorough studies of plant and insect interactions because it has been observed that for some inhibitors such as serine proteases, the insecticide effect also manifests on useful insects (e.g., on bees). insect carbohydrate metabolism is another target for inhibitory agents tested in numerous studies. genes for different α-amylase inhibitors (wheat and beans) were isolated and characterized; after cloning the isolated gene from wheat in tobacco plants, an increase in lepidopteran larval mortality of up to 40% was observed. in the case of cloning the α-amylase inhibitor gene from beans in pea plants under the control of the pha 1 gene promoter, an increased expression of the foreign gene in the seeds was obtained, which resulted in a higher resistance to callosobruchus sp. (ramirez and spears 2014) . 
vegetable lectins are a special group of glycoproteins that have protective functions against a number of harmful organisms. studies on these glycoproteins have shown that they produce strong effects on the development of various types of insects. the first example of plants containing genes for nonspecific lectins that show toxicity to pests is tobacco plants in which the lectin genes from the pea have been cloned. however, many of the vegetable lectins also have a toxic effect on mammals, which limits their use as agents to increase pest resistance. special attention has been given by many specialists to lectin specific for mangosteen from guinea pigs and concanavalin a: the genes for these lectins have been cloned into different plant species (tobacco, tomato, potato, sugarcane, rice, wheat), and the heterologous proteins synthesized by them have determined a reduced sensitivity to the attack of harmful insects (lepidopteran, aphids, coleopteran larvae). the results suggest that the use of plants containing insecticide genes (such as for lecithin) together with integrated management represents promising possibilities for controlling pests from many plant species, including wheat or rice (richard et al. 2014 ). contrary to the remarkable results achieved so far, the genes used to transform crop plants are either too specific or only partially effective on the targets of insect pests. to use plants as true "weapons" for pest control, it would be necessary to have genes at their level to determine the synthesis of compounds with different actions on the same target. the researchers are relatively recent and aim mainly to combine genes for a b. thuringiensis δ-endotoxin with another inhibitory gene in the same host gene: for example, the gene for the v. faba trypsin inhibitor (cpti) or for serine proteases and the protein gene in the potato virus y coating. another interesting approach is that which introduced the cry1a (c) gene into a p. 
fluorescens strain able to colonize the sugarcane by means of two plasmids, pder405 and pkt240, in which the gene is found in 13 and 28 copies, respectively. testing of the recombinant bacterial strains on sugarcane-specific insect pests revealed greater resistance in plants treated with the respective bacteria than in untreated plants. although pest-resistant plants have been obtained for several plant species, fewer results have been reported for cereals, vegetables, and oleaginous plants (ibrahim et al. 2008). plant resistance to various diseases (caused by microbial, viral, and nematode phytopathogens) has long been studied, and a relatively large number of resistance genes have been identified. although it was thought that endogenous resistance genes would provide a durable effect in the corresponding plants, this proved true in very few cases. in the case of potato, the control program for certain diseases, such as the rot caused by phytophthora infestans, had to be abandoned because the resistance obtained by transferring 11 resistance genes through crosses with the solanum demissum species proved to be short-lived. identifying resistance genes in the genome of different plant species and transferring them to other crop plants is extremely difficult and time-consuming if traditional methods (intra- and interspecific hybridizations) are used. the process is considerably accelerated by the use of molecular markers generated by restriction fragment length polymorphism (rflp), random amplified polymorphic dna (rapd), simple sequence repeat (ssr), or single nucleotide polymorphism (snp) techniques. the application of molecular markers allowed the isolation of nearly 20 resistance genes (r-genes) from genetically well-characterized plant species and showed that many of them are grouped into specific chromosomal regions (they form clusters).
some of these resistance genes have been transferred to plant species other than their species of origin through molecular cloning techniques, with the help of specific vectors that allow the transfer of large fragments, revealing that r-genes transferred in this way act synergistically and provide lasting resistance to some diseases. as already mentioned, few r-genes have been shown to provide lasting control of plant diseases. this is the case for the pepper bs2 and rice xa21 genes, which provide resistance to different phytopathogenic strains of x. campestris or x. oryzae in species into which the genes have been cloned (e.g., tomato). the resistance conferred by these genes is due to the recognition of proteins produced by phytopathogenic bacteria or fungi, which triggers a hypersensitive response in the plant and necrosis of the affected tissues. another example of a long-acting r-gene is the recessive barley mlo gene, which makes the plants carrying it resistant to all e. graminis strains through the accumulation in plant cells of a phenolic antifungal compound named p-coumaroylhydroxyagmatine. it is expected that this finding will be used to suppress, by the antisense technique, the dominant mlo gene in wheat or other plant species susceptible to erysiphe sp. in vitro modification of r-genes followed by their introduction into new hosts provides new possibilities for resistance. this is the case for the tomato avrpto gene which, after cloning under the control of a strong promoter such as the camv 35s promoter, makes transformed tomato plants resistant to p. syringae pv. tomato strains and to unrelated pathogens such as x. campestris and c. fulvum. researchers' efforts to obtain resistance systems applicable to a larger number of plant species are not limited to r-genes but also include systemic acquired resistance (sar) mechanisms.
several genes involved in sar have been isolated and characterized; among them, the npr1 gene, which encodes a transcriptional regulator, is a key gene in this system. overexpression of this gene increases the resistance of arabidopsis thaliana or rice plants to a wide variety of pathogens (knecht et al. 2010). an interesting behavior has been observed for the myb1 gene, which is induced by tmv infection of resistant tobacco plants and encodes a transcription factor that binds to the promoter of a pathogenesis-related gene (the pr1a gene). modifying the expression level of the myb1 gene in tobacco plants increases resistance to viral infection (with tmv) as well as to the pathogenic fungus r. solani (raymond et al. 2007). another recently cloned gene, pad4, isolated from arabidopsis thaliana, is also proving extremely interesting for the development of broad-spectrum resistance: overexpression of the gene in plants increases resistance to phytopathogens. numerous biotech companies and universities have begun to assess the performance of plants that express antifungal proteins in both in vitro and field experiments. the plants examined include those carrying r-genes or genes involved in sar, as well as genes encoding glucose oxidase (go) from aspergillus sp., defensins, h2o2-generating enzymes, glucanases, or chitinases. although potato plants carrying the fungal gene for glucose oxidase showed increased resistance to phytopathogens in the laboratory, the results were inconclusive in the field. for other genes, such as those for chitinase and intracellular α-1,3-glucanase, overexpression in tomato plants resulted in significant resistance to fusarium oxysporum (werner et al. 2016).
companies have created nbss that farmers can use to monitor conditions such as air pollution and soil humidity.
however, given that plants naturally react to external stimuli and changes, and so are themselves very good nbss, they can be used instead. this is the idea behind the latest nbss initiative, called advanced plant technologies. the idea is that, through genetic manipulation, researchers will be able to create self-sustaining plants that act as a kind of nbs for detecting chemical substances, pathogens, etc. this is not the first time that plants have been proposed as nbss, but earlier approaches drew on resources that the plants needed to survive, which reduced their resistance. under the new approach the nbss sustain themselves, which means they can work for longer in isolated areas. in the future, plants could be used to detect when a biological attack takes place. in addition, because they are plants, they can be placed anywhere without anyone suspecting that they might be nbss. through such examples, nature teaches us to optimize by exploring diversity. in this context, integrated nanoscale systems (nbss), energy sources (biofuel cells) that use plant metabolism, manipulators for nanoscale surgery, and drug reservoirs embedded in intelligent polymers are being explored. all of these are virus-sized. the proliferation of these types of nbss leads to a large number of applications, and combining them will in the future lead to microminiaturization, versatility, and functionality. plant species show great genetic diversity, and wild species constitute a large genetic reservoir from which genes of practical importance can be obtained. plant genetic engineering research is of great theoretical significance, advancing knowledge of how the genes of these organisms act, the effects of phytohormones on plant development, gene inactivation mechanisms, etc.
by applying molecular biology techniques, useful information can be obtained on the genomes of plants used for breeding, the localization of genes of interest, the degree of relationship between different species, etc. as far as practical applications are concerned, a number of significant results have been obtained so far, some of which have already been applied: virus-resistant plants, pest-resistant plants, herbicide-resistant plants, plants of horticultural interest (ornamental plants with new phenotypes, plants producing fruits resistant to softening), plants capable of synthesizing secondary metabolites in increased amounts, and plants producing "edible" antibodies; the list could continue.
numerical modeling of the dynamic response of a bioluminescent bacterial biosensor rapid detection of cadmium-resistant plant growth promotory rhizobacteria: a perspective of elisa and qcm-based immunosensor patented applications of gene silencing in plants: manipulation of traits and phytopathogen resistance experimental and modelling data contradict the idea of respiratory down-regulation in plant tissues at an internal [o2] substantially above the critical oxygen pressure for cytochrome oxidase 3d-nanostructured au electrodes for the event-specific detection of mon810 transgenic maize development of biosensor for phenolic compounds containing ppo in β-cyclodextrin modified support and iridium nanoparticles detection of glycoalkaloids using disposable biosensors based on genetically modified enzymes proteomic dissection of plant responses to various pathogens redox regulation in plant immune function chiral flavanones from amygdalus lycioides spach: structural elucidation and identification of tnf alpha inhibitors by bioactivity-guided fractionation the screening and isolation of an effective anti-endotoxin monomer from radix paeoniae rubra using affinity biosensor technology development of fret biosensors for mammalian and plant
systems application of genetically engineered microbial whole-cell biosensors for combined chemosensing properties, functions and evolution of cytokinin receptors tracking transience: a method for dynamic monitoring of biological events in arabidopsis thaliana biosensors the influence of different nutrient levels on insect-induced plant volatiles in bt and conventional oilseed rape plants assessment of respiration in isolated plant mitochondria using clark-type electrodes in vivo biochemistry: applications for small molecule biosensors in plant biology chemosensors and biosensors based on polyelectrolyte microcapsules containing fluorescent dyes and enzymes generation of plant protein microarrays and investigation of antigenantibody interactions expression of bvglp-1 encoding a germin-like protein from sugar beet in arabidopsis thaliana leads to resistance against phytopathogenic fungi biosensing hydrogen peroxide utilizing carbon paste electrodes containing peroxidases naturally immobilized on coconut (cocos nucifera l.) 
fibers heat-solubilized curry spice curcumin inhibits antibody-antigen interaction in in vitro studies: a possible therapy to alleviate autoimmune disorders plant growth enhancement and associated physiological responses are coregulated by ethylene and gibberellin in response to harpin protein hpa1 portable optical aptasensor for rapid detection of mycotoxin with a reversible ligand-grafted biosensing surface effect of bt broccoli and resistant genotype of plutella xylostella (lepidoptera: plutellidae) on development and host acceptance of the parasitoid diadegma insulare (hymenoptera: ichneumonidae) measurement of the optical parameters of purple membrane and plant light-harvesting complex films with optical waveguide lightmode spectroscopy self-referencing optrodes for measuring spatially resolved, real-time metabolic oxygen flux in plant systems redox-sensitive gfp in arabidopsis thaliana is a quantitative biosensor for the redox potential of the cellular glutathione redox buffer staying alive: new perspectives on cell immobilization for biosensing purposes colorimetric bacteria sensing using a supramolecular enzyme-nanoparticle biosensor versatile strategy for the synthesis of biotin-labelled glycans, their immobilization to establish a bioactive surface and interaction studies with a lectin on a biochip complex regulation of the immunoglobulin mu heavy-chain gene enhancer: microb, a new determinant of enhancer function development of an electrochemical biosensor for the rapid detection of naphthalene acetic acid in fruits by using air stable lipid films with incorporated auxin-binding protein 1 receptor boron nitride nanotube-based biosensing of various bacterium/ viruses: continuum modelling-based simulation approach live-cell imaging of phosphatidic acid dynamics in pollen tubes visualized by spo20p-derived biosensor nanotechnology in sustainable agriculture: recent developments, challenges, and perspectives. 
front microbiol 8:1014 nanotechnology in sustainable agriculture: present concerns and future aspects stem nematode counteracts plant resistance of aphids in alfalfa, medicago sativa host plant and population determine the fitness costs of resistance to bacillus thuringiensis fine mapping of co-x, an anthracnose resistance gene to a highly virulent strain of colletotrichum lindemuthianum in common bean electrochemical quantification of the antioxidant capacity of medicinal plants using biosensors enzyme-linked immunosorbent assay for the quantitative/qualitative analysis of plant secondary metabolites kinetic models for detection of toxicity in a microbial fuel cell based biosensor mapping a disordered portion of the brz2001-binding site on a plant monooxygenase, dwarf4, using a quartz-crystal microbalance biosensor-based t7 phage display the xaxab genes encoding a new apoptotic toxin from the insect pathogen xenorhabdus nematophila are present in plant and human pathogens multi-analyte biochip (mab) based on allsolid-state ion-selective electrodes (assise) for physiological research belowground communication: impacts of volatile organic compounds (vocs) from soil fungi on other soil-inhabiting organisms luminescent sensing and imaging of oxygen: fierce competition to the clark electrode comparative quantification of oxygen release by wetland plants: electrode technique and oxygen consumption model key: cord-252147-bvtchcbt authors: domingo-espín, joan; unzueta, ugutz; saccardo, paolo; rodríguez-carmona, escarlata; corchero, josé luís; vázquez, esther; ferrer-miralles, neus title: engineered biological entities for drug delivery and gene therapy: protein nanoparticles date: 2011-11-15 journal: prog mol biol transl sci doi: 10.1016/b978-0-12-416020-0.00006-1 sha: doc_id: 252147 cord_uid: bvtchcbt the development of genetic engineering techniques has speeded up the growth of the biotechnological industry, resulting in a significant increase in the number of 
recombinant protein products on the market. the deep knowledge of protein function, structure, and biological interactions, and the possibility of designing new polypeptides with desired biological activities, have been the main factors behind the increase in intensive research and in preclinical and clinical approaches. consequently, new biological entities have been obtained with added value for innovative medicines, such as increased stability, improved targeting, and reduced toxicity, among others. proteins are complex nanoparticles with sizes ranging from a few nanometers to a few hundred nanometers when complex supramolecular interactions occur, as for example in viral capsids. however, even though protein production is a delicate process that requires sophisticated analytical methods, and negative secondary effects such as immune and inflammatory reactions have been detected in some cases, the great potential of biodegradable and tunable protein nanoparticles indicates that protein-based biotechnological products are expected to increase in the years to come.
the design of new chemical entities (nce) for diagnosis and treatment of human diseases has relied on the discovery of active chemical drugs from a diverse library of compounds or from naturally occurring molecules. 1, 2 further chemical modifications improve pharmacokinetic properties to obtain a final product with a known mechanism of action and decreased toxicity. 3 nonetheless, the final products of such approaches show low specificity for their target molecules: they interact with many other molecules and accumulate in some tissues, disturbing the homeostasis of the system. in some cases the adverse effects of drug administration outweigh the pharmacological effect, and even though the drug's specific action on the target molecule improves the patient's state, the treatment has to be withheld or discontinued. 4 in fact, although a steady increase in the number of launched nce has been observed in recent years, the question arises whether this classical approach has already exhausted the discovery of innovative molecules. 5 on the other hand, macromolecular new biological entities (nbe) have been used to supplement cellular deficiencies or to inhibit cellular pathways, exploiting their relatively specific mode of action. proteins and peptides were first obtained from their natural sources and then, after the development of genetic engineering techniques in the late 1970s, produced as recombinant versions.
however, the delivery of biological entities is sometimes hampered by their short half-life in the bloodstream owing to nonspecific degradation, resulting in an expensive and ineffective process. nevertheless, some solutions have already been explored for biopharmaceuticals to increase solubility and stability and to reduce immunogenicity, including posttranslational modifications such as glycosylation and covalent conjugation of polyethylene glycol. 6 thus, one of the main objectives in the use of drugs (either nce or nbe) is to optimize the delivery system so as to reduce the pharmacological dose, with a concomitant reduction in toxicity and cost. in that scenario, new delivery approaches have been implemented using biological interactions such as antigen-antibody binding (immunoliposomes) 7 or more sophisticated interactions, including the binding between the nutrient concentrator sparc (secreted protein acidic and rich in cysteine) and albumin in the treatment of some types of cancer (abraxane). 8, 9 proteins can then be used for their targeting qualities as molecular delivery vehicles, both for the specific delivery of drugs or nucleic acids in gene therapy approaches and by themselves as therapeutic molecules. one of the interesting characteristics of proteins is their ability to form complexes driven by intermolecular interactions that can be as sophisticated and structurally perfect as viral capsids. in addition, through genetic engineering, recombinant proteins can be tuned to include additional properties that optimize drug delivery and nucleic acid delivery in gene therapy. in this chapter, the main available strategies for developing protein-based nanovehicles or biopharmaceuticals will be described. in this context, several parameters will be defined, such as proper formulation, stability, immunogenicity, and delivery to the correct cell type and cell compartment.
modular protein engineering, virus-like particles (vlps), and other self-assembling entities are envisioned as tunable novel protein nanoparticles able to combine many properties desirable for the correct delivery of drugs and nucleic acids. finally, some successful examples of protein nanoparticles on the market will be described, in addition to protein products currently in clinical trials and under preclinical research, in order to envision which types of protein nanoparticles will soon be available on the market.
with the therapeutic molecule to generate a vehicle capable of being transported in the blood if a systemic administration is needed and retaining a significant stability before reaching the target cell. 10, 11 in addition, the biological system poses specific barriers that have to be overcome, such as membranes (cytoplasmic, endocytic, and nuclear), degradation (protease degradation induced by acid denaturation in lysosomes, cytosolic proteasomes, and nucleases), cytosolic transport, and nuclear entry if necessary. 12, 13 for central nervous system therapies, the blood-brain barrier (bbb) represents the main bottleneck, and a specific strategy has to be designed for it. 14 furthermore, the therapeutic complex has to be flexible enough to release the therapeutic molecule in the specific cell compartment. several protein motifs have been described that overcome each of the processes described above, so that a modular multifunctional protein can be generated that includes the modules necessary to achieve its goal. for a rational construction of the multifunctional vector, each step toward its final goal has to be carefully taken into account (table i). the dna/rna condensation or drug interaction with the protein vector is a critical step in the formulation of protein nanoparticles for gene therapy.
the nucleic acid or drug has to remain attached to the vector during the whole transport process through the body and the cell until it can be released in the desired location within the target cell. highly positively charged peptides containing a large number of arginines, or polylysines, have been used to promote electrostatic interactions, since nucleic acids are highly negatively charged molecules. 15-22 natural dna-condensing proteins such as nuclear histones or protamines can also be used to bind nucleic acids. 22-25 protamine, the protein that replaces histones during spermatogenesis, is a sperm chromatin component and, just as histones do, it has a very high dna condensation ability that protects nucleic acids from cytosolic endonucleases. 23, 26 in addition, as soon as the complex reaches the cell nucleus, protamine is degraded by chromatin-remodeling proteins, releasing the transported dna and allowing its expression. 15, 23 in contrast, polycationic dna condensation modules such as polylysines and polyarginines, even though they can show higher dna condensation ability depending on the polycationic chain length, usually show lower dna-releasing ability, which interferes with the accessibility of cellular transcription factors and with dna expression capacity. 15 all the dna condensation modules described above interact nonspecifically with any dna they are incubated with. however, there are proteins such as gal4 that recognize specific dna sequences 27-29 and that make it possible to bind and condense specific dna sequences in the final vector. 30 in many cases, the multifunctional protein vector is administered in vivo by the systemic route in order to travel in the blood and reach the target cells. this exposes the vector to all blood components, making it susceptible to degradation.
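the electrostatic argument for polycationic condensation modules made above can be illustrated with a rough charge count. the python sketch below is illustrative only: it assumes that lysine and arginine carry +1 and that aspartate and glutamate carry -1 at neutral ph, ignores histidine and the termini, and counts two phosphates per base pair. the function names and the decalysine example are hypothetical and do not come from the chapter.

```python
# Illustrative sketch: crude charge bookkeeping for a DNA-condensing peptide.
POSITIVE = {"K", "R"}  # lysine, arginine: treated as +1 (assumption)
NEGATIVE = {"D", "E"}  # aspartate, glutamate: treated as -1 (assumption)


def net_charge(peptide: str) -> int:
    """Approximate net charge near pH 7, ignoring histidine and the termini."""
    seq = peptide.upper()
    positives = sum(aa in POSITIVE for aa in seq)
    negatives = sum(aa in NEGATIVE for aa in seq)
    return positives - negatives


def charge_ratio(peptide: str, n_peptides: int, dna_bp: int) -> float:
    """Net peptide charges per DNA phosphate (two phosphates per base pair)."""
    return net_charge(peptide) * n_peptides / (2 * dna_bp)


if __name__ == "__main__":
    decalysine = "K" * 10  # hypothetical condensation module
    print(net_charge(decalysine))
    print(charge_ratio(decalysine, n_peptides=1000, dna_bp=5000))
```

a ratio near or above 1 means the peptides bring at least as many positive charges as the dna has phosphates; whether a given ratio actually condenses or releases dna well is an experimental question that this count does not answer.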
thus, it is essential that the vector remain in the blood long enough to reach the target cells. it has also been described that naked dna has an estimated half-life in blood of minutes, 10 so protein nanovehicles in gene therapy are intended, among other things, to protect nucleic acids from degradation. one important factor when the vector is exposed to the blood is that it can be recognized by components of the immune system and elicit an immune response against the vector. thus, it is also very important to make the vector as weakly antigenic as possible so that it is not degraded and is not toxic to the organism. 32 peptide uptake or internalization requires a prior step: binding of the protein to the cell surface. this attachment can be either specific or nonspecific, but in all cases internalization has to be promoted. 33 positively charged peptides usually bind the cell surface by nonspecific electrostatic interactions with negatively charged cell surface proteoglycans. this kind of peptide can be used in the multifunctional protein if specific targeting is not required. 33 cell-penetrating peptides (cpps) have been widely described as nonspecific cell-binding and internalization peptides 34-46 (see also the chapter "peptide nanoparticles for oligonucleotide delivery" by lehto et al. in this volume). however, specific interactions can be obtained by incorporating cell receptor ligands if cell or tissue targeting is required for the therapeutic action. moreover, some of those ligand-receptor interactions promote internalization of the ligand-receptor complex. many peptides have been described in the literature as receptor-specific ligands, so any of them can be added to the multifunctional protein to confer cell specificity.
47-62 the most natural specific ligands that can also be used for cell targeting are monoclonal antibodies. 32, 63-65 in addition, if no specific peptides are available for an intended target, new specific binding peptides can be found by using phage display 66 or combinatorial chemistry. 67
2. endosomal escape
several internalization pathways are possible depending on the vector properties, 27, 33 including endocytosis (clathrin/caveolae-mediated, clathrin/caveolae-independent), macropinocytosis, and non-endocytic pathways. it is known that more than one internalization pathway can operate at the same time, but peptide-based vectors usually use endocytic pathways. 68 moreover, it seems that proteins that interact with a specific cellular receptor are internalized by the clathrin-mediated endocytic pathway. 33 most of the endosomal vesicles generated converge on late endosomes, which eventually fuse with cellular lysosomes. 15, 33 if it remains in the endosomes, the multifunctional protein will be degraded, so the internalized protein must be released into the cytoplasm to escape degradation. several peptides that promote endosomal escape have been described; they can be classified into two types depending on their escape mechanism: fusogenic peptides and histidine-rich peptides. 36 the fusogenic peptides are small peptides that have hydrophobic amino acids (aa-s) interspersed at constant intervals with negatively charged aa-s.
12, 19, 39, 40, 45, 46, 69-80 thus, when early endosomes become late endosomes, their low ph induces a conformational change in the peptide, which adopts an amphipathic alpha-helix structure able to fuse with the endosomal membrane, leading to pore formation and the release of the endosomal content into the cytoplasm. 36 the histidine-rich peptides are small peptides with a high histidine content whose endosomolytic activity is mediated by a mechanism called the "proton sponge". 12, 22, 80-83 when the endosomal ph drops in late stages, the imidazole groups of the histidines are protonated and attract endosomal cl⁻ ions, buffering against the proton pump. the endosomes then rupture through osmotic swelling and the endosomal content is released into the cytoplasm. 36 further details are given in the chapter "peptide nanoparticles for oligonucleotide delivery" by lehto et al. in this volume.
once the protein has reached the cytosol, it can be degraded by cellular proteases or by the cellular proteasome system. 84 it is important to avoid this process, especially if the protein has to reach the nucleus. if the final target of the nanoparticle is the cytoplasm, it has to remain there at least long enough to perform its therapeutic action. several peptide proteasome inhibitors have been described that prevent this type of protein degradation. adding these peptides to the final protein vector makes it possible to protect it and enhance its cytoplasmic stability. epstein-barr virus nuclear antigen 1 (ebna1) contains a proteasome inhibitor consisting of glycine-alanine repeats able to prevent proteasomal proteolysis. it has been shown that a minimum of 4 aa-s of gly-ala repeats is necessary to achieve such protective activity. 85-87 if the protein vector is carrying nucleic acids (dna or rna), degradation by cytosolic endonucleases has to be taken into account, so it is also very important to protect the nucleic acid in order to maintain its integrity. some dna/rna-condensing peptides such as protamines also protect the dna against cytoplasmic endonucleases and enhance its stability, as described above. 15
the cytoplasm is a very crowded and compartmentalized environment in which organelles and the cytoskeleton hinder the free diffusion of macromolecules such as protein vectors. however, cytoskeleton elements such as microtubules are used by endosomes and other cytosolic macromolecules for intracytosolic mobility. 33 dyneins have been described as being capable of carrying those macromolecules and endosomes along the microtubules in retrograde transport toward the nucleus. some small peptides that are able to bind dyneins have been identified. they can be added to the multifunctional protein vector to mediate intracytosolic mobility toward the nucleus. 36 several dynein-binding proteins have been identified in viruses that use this transport system. by comparing those protein sequences, a consensus peptide sequence (kstqt) that binds to the dynein lc8 light chain has been identified. 88
molecules smaller than 45 kda/10-30 nm are able to enter the nucleus by passive diffusion. however, macromolecules larger than 45 kda/10-30 nm generally require active transport through the nuclear pore system. this transport mechanism generally requires a specific targeting signal peptide named the nuclear localization signal (nls). these signal peptides are usually rich in basic aa-s, which are recognized by cellular importins and actively transported through the nuclear pore. 15, 89 both monopartite and bipartite nls sequences, that is, nls peptides with one or two recognized sequences, respectively, have been described. 12 these peptide sequences can be added to the final multifunctional protein if nuclear localization is required in order to express the carried dna. it has been reported that a single nls sequence is sufficient to transport the vector to the nucleus and that a large number of nls sequences can inhibit its activity. 90 among the most widely used nls peptides are fragments derived from aa-s 111-135 of the simian virus 40 (sv40) large tumor antigen (t-ag). other nls sequences can be found in gal4, protamines, or tat. 23, 36, 37, 77, 91-102
when the transported dna reaches the nucleus, it has to be released in order to be accessible to the nuclear transcription factors and achieve the desired expression level. this aspect has to be taken into account when designing the multifunctional protein vector. once the dna has been released in the nucleus, its expression level has to be controlled according to the therapeutic action being promoted. when the goal is to kill a cell, as in cancer therapies, uncontrolled dna expression levels would not be a problem. however, when a specific protein expression level is required, achieving good control is very important. 13 some expression systems have been developed that can be pharmacologically regulated by oral drug formulation. 103 cell-specific promoters and enhancers can also be used to confer high cell specificity on the therapy. 104, 105
d. ways to get over the bbb
the bbb is a tight barrier that only allows lipophilic molecules smaller than 400 da to cross it. however, some human proteins such as insulin, transferrin, insulin-like growth factor, or leptin are able to cross it by receptor-mediated transport.
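the size rule for nuclear entry quoted earlier (passive diffusion below about 45 kda, active import with an nls above it) can be turned into a quick design check. the following python sketch is a rough aid under stated assumptions, not a method from the chapter: it estimates mass from sequence length using an average of 110 da per residue (a common rule of thumb), and it uses the classical sv40 large t-antigen motif pkkkrkv as an example nls. all function names are hypothetical.

```python
# Illustrative sketch: does a construct need an NLS, and how many does it carry?
NUCLEAR_PASSIVE_LIMIT_KDA = 45.0  # threshold quoted in the text
AVG_RESIDUE_MASS_DA = 110.0       # rule-of-thumb average (assumption)
SV40_NLS = "PKKKRKV"              # classical SV40 T-ag NLS, used as an example motif


def approx_mass_kda(sequence: str) -> float:
    """Rough molecular mass estimated from the number of residues."""
    return len(sequence) * AVG_RESIDUE_MASS_DA / 1000.0


def needs_active_import(sequence: str) -> bool:
    """True when the estimated mass exceeds the passive-diffusion limit."""
    return approx_mass_kda(sequence) > NUCLEAR_PASSIVE_LIMIT_KDA


def count_nls(sequence: str, motif: str = SV40_NLS) -> int:
    """Number of non-overlapping occurrences of the motif in the sequence."""
    return sequence.upper().count(motif.upper())


if __name__ == "__main__":
    construct = "M" + "A" * 450 + SV40_NLS  # hypothetical 458-residue vector
    print(approx_mass_kda(construct), needs_active_import(construct))
    print(count_nls(construct))
```

because the text notes that one nls is sufficient and that many copies can be inhibitory, a count like this could serve as a simple sanity check on a modular design before it is built.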
thus, the most important factor limiting central nervous system-targeting therapeutics is the presence of the bbb, 106 and finding a way to cross it is the main challenge. some peptides have been described that are able to reach the brain by crossing the bbb. moreover, it has been shown that they can be associated with another molecule and carry it across the barrier. thus, they could be interesting candidates for inclusion in multifunctional vectors if central nervous system targeting is required. 14, 56, 107, 108 antibodies that bind transferrin and insulin receptors and cross the bbb efficiently have also been described. they can be conjugated to large molecules, allowing their translocation to the central nervous system. 63, 64, 109-111
synthesis, and rational design
the development of genetic engineering techniques has expanded the natural repertoire of proteins, making it possible to design new proteins with desired functions. there are three main strategies for constructing engineered proteins: (a) directed evolution, (b) de novo protein design, and (c) rational design. directed evolution has developed quickly to become a method of choice for protein engineers seeking to create enzymes with desired properties for all kinds of processes. over the past decade, this technique has become a routine part of the biochemist's molecular toolbox, as emphasized by the increasing number of publications on the subject. 112 in nature, evolution and the creation of new functionalities are achieved by mutagenesis, recombination, and survival of the fittest. directed evolution mimics this through iterative cycles of producing mutants and finding the mutant with the desired properties. mutations can be introduced at specific places using site-directed mutagenesis or throughout the gene by random mutagenesis.
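the iterative mutate-and-screen cycle just described can be sketched as a toy simulation. this python example is only a caricature under loud assumptions: the "screen" scores similarity to a known target sequence, which a real experiment never has, and each mutant carries a single random substitution. the names `mutate`, `fitness`, and `directed_evolution` are hypothetical.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def mutate(seq: str, rng: random.Random) -> str:
    """Random mutagenesis: substitute one randomly chosen position."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(ALPHABET) + seq[pos + 1:]


def fitness(seq: str, target: str) -> int:
    """Toy screen: number of positions matching a hypothetical optimum."""
    return sum(a == b for a, b in zip(seq, target))


def directed_evolution(start: str, target: str, rounds: int = 200,
                       library_size: int = 50, seed: int = 0) -> str:
    """Repeat: build a mutant library, screen it, keep the best variant."""
    rng = random.Random(seed)
    best = start
    for _ in range(rounds):
        library = [mutate(best, rng) for _ in range(library_size)]
        candidate = max(library, key=lambda s: fitness(s, target))
        # Keep the candidate only if the screen does not score it lower.
        if fitness(candidate, target) >= fitness(best, target):
            best = candidate
    return best


if __name__ == "__main__":
    evolved = directed_evolution("AAAAAAAA", "PKKKRKVG")
    print(evolved, fitness(evolved, "PKKKRKVG"))
```

the design choice worth noting is that selection acts only on the screened score, so the quality of the outcome depends entirely on how well the screen reflects the property that is actually wanted.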
several mutagenesis techniques have been developed in order to avoid codon bias. 113, 114 the first technique used to mimic evolution was dna shuffling. 115 this method is based on the mixing and subsequent joining of small, related dna fragments to form a complete new gene. in shuffling, the recombination frequency depends on the degree of homology. a high level of recombination is important to obtain all possible combinations of mutations. since recombination can be biased, the problems that arose from shuffling in the early years have been tackled by novel strategies, each with its own advantages and disadvantages. 112 the products obtained by these methods have to be screened for the desired qualities, and not all of them can be screened easily. de novo protein design offers the broadest possibility for new structures. it is based on in silico searches for amino acid sequences that are compatible with a three-dimensional protein backbone template. several research groups have applied in silico methods to design the hydrophobic cores of proteins, with the novel sequences being validated experimentally. 116 in silico protein design has made it possible to introduce novel functions on templates originally lacking them, to modify existing functions, and to increase protein stability or specificity. research in the field is intense and its potential is considerable. 117 so far there have been numerous examples of full sequences designed ''from scratch'' that were experimentally confirmed to fold into the target three-dimensional structures. 118 the zinc-finger protein designed by dahiyat and mayo 119 was the first to be obtained by this method.
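the iterative cycle of directed evolution described above (produce a library of mutants, screen it, keep the best variant, repeat) can be illustrated with a minimal python sketch. it is a toy model under stated assumptions, not a protocol from the cited work: the screen is a hypothetical fitness function that counts matches to an arbitrary target sequence, and the mutation rate, library size, and number of rounds are illustrative values only.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def fitness(seq, target):
    """Toy screen (assumption): number of positions matching a target sequence."""
    return sum(a == b for a, b in zip(seq, target))


def mutate(seq, rng, rate=0.1):
    """Random mutagenesis: each position is redrawn with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else aa for aa in seq
    )


def directed_evolution(parent, target, rounds=30, library_size=50, seed=0):
    """Iterate mutagenesis and screening, keeping the best variant each round."""
    rng = random.Random(seed)
    history = [fitness(parent, target)]
    for _ in range(rounds):
        # Produce a library of mutants from the current parent.
        library = [mutate(parent, rng) for _ in range(library_size)]
        # Keep the parent in the pool so the best score never decreases.
        library.append(parent)
        # Screen the library and select the fittest variant as the next parent.
        parent = max(library, key=lambda s: fitness(s, target))
        history.append(fitness(parent, target))
    return parent, history


if __name__ == "__main__":
    # Both sequences are hypothetical and carry no biological meaning.
    best, history = directed_evolution("A" * 10, "MKTAYIAKQR")
    print(best, history[0], history[-1])
```

in a real campaign the screen is an experimental assay rather than a known target, which is why, as noted above, not every library can be screened easily.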
rational design of proteins is based on the modification or insertion of selected amino acids or domains in a polypeptide chain backbone to obtain proteins with new or altered biological functions. this strategy requires detailed knowledge of the structure and function of the backbone protein in order to make the desired changes. it generally has the advantage of being inexpensive and technically feasible. however, a major drawback of this approach is that detailed structural knowledge of a protein is often unavailable, or it can be extremely difficult to predict the effects of various mutations. modular engineering enables, by using simple recombinant dna techniques, the construction of chimeric polypeptides in which selected domains, potentially from different origins, provide the required activities. a balanced combination and spatial distribution of such partner elements has generated promising prototypes, able to deliver expressible dna or molecules to tissue culture but also to specific cell types in whole organisms. 120 modular fusion proteins that combine distinct functions required for cell type-specific uptake and intracellular delivery of dna or drugs present an attractive approach for the development of self-assembling vectors for targeted gene or drug delivery. 121 one of the first examples was described by uherek et al., who combined a cell-specific targeting module (an antibody fragment specific for the tumor-associated erbb2 antigen), a dna-binding domain (gal4), and a translocation domain for endosomal escape. 121 in this context, many strategies for the construction of safer vehicles are being explored and the number of nonviral prototype vectors for gene and drug delivery is noticeably increasing. here, the common steps that such an approach might follow are presented (fig. 1).
when designing a new protein for drug or gene delivery there are many critical aspects, namely (a) design of the vehicle itself, required functions, stability, etc.; (b) production of the protein, suitable expression system, purification procedure, scaling up process, etc.; (c) characterization of the vehicle by physicochemical and functional tests; and finally (d) the administration route and regulatory guidance for biological products. although these aspects belong to different disciplines, they have to be considered together. here, the major requirements of a modular protein for gene and drug delivery are presented. to enhance the physicochemical stability of the cargo molecules and their resistance to nuclease/protease-mediated degradation, protein vehicles should ideally exhibit, like their natural counterparts (viruses), nucleic-acid binding and condensing properties. 27 such abilities are, in general, conferred by cationic segments of the main scaffold molecules that interact with nucleic acids, mainly through electrostatic interactions. in addition, such complexes need to efficiently release the nucleic acid in the nucleus (if the cargo is a therapeutic gene), for which endosomal escape is required. such functions have been found in peptides from many natural molecules, and these peptides are suitable for functionalizing protein vehicles. the ability to bind a particular cell type with high specificity is especially significant in systemic delivery, in which appropriate biodistribution and tissue targeting are essential. 122 for nuclear targeting, only naked short nucleic acids can enter the nucleus of nondividing cells by free diffusion through the nuclear pore. large molecules require active transport mediated by nlss, which are often found in viral proteins. because the molecular mass of plasmid dna varies from 2 to 10 mda, dna that is to be expressed, and essentially any macromolecular complex for nucleic acid delivery, requires nlss.
123 the roles and types of functional peptide modules used for all these purposes will be discussed in depth in the following sections.
in vivo experiments
finally, which protein or peptide is better for a given cargo has to be determined empirically, and only a few rules can be taken literally. 38, 124
c. production of protein nanoparticles
some steps in the production of a protein-based vehicle after molecular cloning, such as protein production and protein purification, 125 might be experimentally labor intensive with a variable success rate. for that reason, when small proteins are needed, solid-phase peptide synthesis 44 offers a more reliable route. however, the classical procedure of biological production allows the process to be scaled up in most cases and permits the production of larger polypeptides and full-length proteins. generally, in protein nanoparticle approaches, the protein is composed of different modules, either from natural sources, such as the cell-penetrating peptide transactivator of transcription (tat) derived from the tat of the human immunodeficiency virus (hiv), 126 or artificial sequences not present in any organism, such as the polylysine dna-condensing sequence. 127 once it has been defined which modules will be part of the protein, it is important to define their order in the final construct. it has been demonstrated by boekle and coworkers, using melittin conjugated to polyethylenimine (pei), that the lytic activity can change depending on the side of the linkage (c- or n-terminus). some other modules need to be in a specific position to function correctly. 128 when producing a protein for gene or drug delivery, it is important to know the origin of its domains in order to choose the most suitable expression system for its production.
for instance, if any module naturally carries a posttranslational modification that is essential for its biological function, the expression system chosen will have to be able to reproduce the same crucial modification. the main biological production systems for protein drugs are described below. escherichia coli is the most widely used prokaryotic organism for the expression of recombinant proteins. 129 the use of this host is relatively simple and inexpensive. 130 added advantages include its short doubling time, growth to high cell densities, ease of cultivation, and high yields of the recombinant product. however, since it lacks fundamental prerequisites for efficient secretion, recombinant proteins manufactured in e. coli systems are mainly produced as inclusion bodies. 125, 131 moreover, posttranslational modifications are not achieved with this system. there are many examples of proteins for gene delivery produced in e. coli with proven efficiency. 132, 133 like e. coli, yeasts can be grown cheaply and rapidly and are amenable to high-cell-density fermentations. besides possessing complex posttranslational modification pathways, they offer the advantage of being neither pyrogenic nor pathogenic and are able to secrete more efficiently. species established in industrial production procedures are saccharomyces cerevisiae, kluyveromyces lactis, pichia pastoris, and hansenula polymorpha. s. cerevisiae is the best genetically characterized eukaryotic organism among them and is still the prevalent yeast species in pharmaceutical production processes. 131 in spite of their physiologically advantageous properties and natively high expression and secretion capacity, the usefulness of yeasts might reach a limit in some cases, particularly when the pharmacological activity of the product is impaired by the glycosylation pattern.
in such cases, either a postsynthetic chemical modification or the use of more highly developed organisms has to be considered. most examples of nanoparticles produced in yeast are vlps. 134 animal cell expression systems show the highest similarity to human cells regarding the pattern and capacity of posttranslational modifications and the codon bias. however, their culture is more complicated and costlier and usually yields lower product titers. among the known systems, insect cells infected by baculovirus vectors have become popular since they are considered to be more stress-resistant, easier to handle, and more productive than mammalian systems and are thus frequently employed for high-throughput protein expression. for commercial application, scale-up related questions have to be solved. [135] [136] [137] mammalian systems such as chinese hamster ovary (cho) cells and baby hamster kidney (bhk) cells are preferred in pharmaceutical production processes. these systems are genetically more stable, easier to transform and handle in scale-up processes, grow faster in adherent and submerged cultures, and are more similar to human cells and more consistent in their complete spectrum of modification. 138 in some cases, mammalian cell systems can be the only choice for the preparation of correctly modified proteins. peptides, being complex and unique molecules with regard to their chemical and physical properties, can be produced synthetically by the solid-phase method. 139, 140 this technology can be used to avoid problems related to biological production. general advantages of synthetic peptides are that they are very stable compounds, solid-phase chemistry produces highly standardized peptides, and the crucial polycation component is provided by a ''natural'' polycation, thus minimizing toxicity.
141 however, some disadvantages of synthetic peptides have been reported, such as the difficulty of synthesizing long and well-folded oligopeptides or peptides with multiple cysteine, methionine, arginine, and tryptophan residues, owing to technical limitations or production cost. 141 when working with protein nanoparticles, it is very important to characterize them physically and functionally in order to understand their behavior. the size and charge of protein/cargo particles are crucial properties that influence rates of diffusion, binding to polyanionic components of connective tissues, traversal of anatomical barriers, binding of serum proteins, attachment to cells, and mechanisms of endocytosis, among other factors. stability in physiological salt solutions is a key issue for in vivo delivery, as salt is found everywhere in the body. 141 mixing a multivalent polycation and dna results in electrostatic binding of both molecules, with charge neutralization of the dna and formation of a particle termed a conjugate. charge neutralization can be easily seen by gel retardation assays and particle formation by dynamic light scattering (dls). dls is a good method to detect particle formation but not to quantify the relative number of particles of different sizes. 142 to visualize particles, many groups have used transmission electron microscopy (tem) 15, 143 with good results, while others have used a fluid particle image analyzer (fpia) to photograph individual particles in physiological solutions. 58 the net charge of protein/cargo particles is an important variable. generally, optimal gene delivery to cell lines requires a net positive charge but, as stated previously, this has to be determined empirically. one of the best ways to determine the net charge is to calculate the zeta potential from the measured electrophoretic mobility of the particles.
144 despite the fact that physical characterization is a key element, understanding and testing the functionality and pharmacokinetics of a gene or drug is the most important part of its development process. most of the initial tests are done using cell lines in in vitro experiments using reporter genes, rna, or drugs. 145, 146 quantifying the percentage of transfected cells or drug-induced changes is a very valuable tool to evaluate nanoparticle performance in both nuclear and cytoplasmic delivery, respectively. in addition, in vitro experiments may be designed to select a candidate for the in vivo experiments from a group of possible therapy vectors. the quantitative kinetics of particle binding, the molecular basis of particle interactions with target cell membranes, the efficiency of particle internalization, and endosomal escape are all poorly understood. 141 interaction of particles with plasma membranes prior to protein internalization can be either unspecific or specific. untargeted delivery normally is the consequence of electrostatic interactions between anionic ligands in the cell surface and cationic components of the vehicle. on the other hand, targeted delivery to specific membrane molecules is a more sophisticated approach. it aims to improve cell specificity and efficiency, by directing to molecules, only expressed or overexpressed in a particular cell type, that initiate internalization by endocytosis. targeting moieties include many types of molecules and is discussed afterwards. internalization of particles, its mechanisms, and kinetics are not well known and most studies about nanoparticle delivery do not focus on this aspect. there are several endocytic pathways each initiated by different ligands. 147 enhancing the delivery by addition of chloroquine, a synthetic molecule used primarily for the prophylaxis and treatment of malaria that disrupts endosomes, 148 is an accepted parameter to demonstrate endosomal localization of particles. 
endosomal escape is the area most intensively investigated but is poorly understood. an important practical point to note is that some of the reagents used can be toxic. 141 to enhance this step, anionic fusogenic peptides can be used. these peptides fuse to membranes in an acid-dependent manner, causing their disruption. 149 in gene delivery approaches, translocation of dna expression plasmids into the cell nucleus involves an active, energy-dependent process through the nuclear pore complex. 150 dna injected directly into the cytosol is usually, but not always, poorly transferred to the nucleus, 150, 151 and because of that, proteins carrying cationic nuclear-localizing sequences (such as that of sv40 large t antigen) have been widely used to overcome this step. 143
iv. natural self-assembling protein nanoparticles: vlps
ideal drug delivery and gene therapy vehicles must have some desired features such as appropriate packaging size for their cargo, target cell specificity, safe and efficient cargo delivery, and protection against, or the capability to escape, immune recognition. moreover, these vehicles must avoid inflammatory toxicity and rapid clearance. 152 in this context, viral vectors have been exploited as one of the vehicles of choice. viruses are nano-sized (15-400 nm) supramolecular nucleoprotein-based entities, covered or not with a lipid bilayer (enveloped/nonenveloped viruses), that combine, in relatively simple structures, outstanding properties and functions that are relevant to drug and gene delivery. viruses are able to recognize and interact specifically with cells by receptor-mediated binding, internalize, escape from endosomes, and uncoat and release nucleic acids in different cellular compartments. they are also capable of transcribing and translating their viral proteins to self-assemble into new infectious virus particles and exit the host cell.
120, [153] [154] [155] despite all these relevant properties of viral vectors or of other emerging vehicles in drug and gene delivery such as cationic liposomes, their therapeutic use presents some limitations and risks because of the complexity of production, limited packaging capacity, insertional mutagenesis and gene inactivation, low probability of integration, reduced efficacy of repeat administration or reduced expression over time, unfavorable immunological recognition or strong immune response against vehicle and transgene, inflammatory toxicity, and rapid clearance. 120, 152 in this context, virus capsids or vlps, produced from recombinant capsid proteins but lacking the viral genome, have noticeably emerged as a safer alternative to viral vectors.
a. structure of protein self-assembled nanovehicles
vlps are classically described as self-assembling, nonreplicative and nonpathogenic, highly organized supramolecular multiprotein nanoparticles (coats) (ranging from 20 to 100 nm) that can be formed by the spontaneous self-assembly of one or more viral structural capsid proteins. it has been described that the self-assembly of the structural viral proteins for vlp formation involves both spontaneous assembly, under favorable experimental conditions, and the requirement of scaffold proteins as catalysts. 156, 157 therefore, vlps are considered protein ''coats'', ''shells'', or ''boxes'' that lack the viral genome but still conserve the structure, morphology, and some properties of viruses. some of these properties, such as cellular tropism and uptake, intracellular trafficking, membrane translocation, and transfer of nucleic acids or molecules across the cytoplasmic, endosomal, and nuclear membranes, are important for drug delivery and gene therapy. 120, 153, 155, [158] [159] [160] usually, the degree of similarity between vlps and their viruses depends on the number of proteins incorporated into the constructs.
161, 162 since the first description in 1983 of the viral dna packaging into mouse polyomavirus (mpyv) vlps and its transduction in vitro, 163 vlps of different viruses such as papillomaviruses, [164] [165] [166] hepatitis b, c, and e viruses, [167] [168] [169] polyomaviruses, 163 vlps offer structural and dynamic features and functions that make them appealing bionanomaterials to be exploited in biomedicine as drug and gene delivery vehicles; these are discussed in detail below. on the one hand, viral coat proteins have the ability to spontaneously self-assemble, which ensures the formation of highly organized, regular, repetitive, structurally stable particles of very low morphological polydispersity that can be used as scaffolds for bioimaging and the synthesis of bionanomaterials and as nanocarriers in drug and gene therapy. 186 in addition, homogeneity of particle size and composition is a desired production factor when developing therapeutic molecules. the overexpression of structural viral proteins in a convenient expression system renders recombinant proteins capable of being folded and assembled into discrete organized nanoparticles with a defined size corresponding to the natural capsid geometry. [187] [188] [189] moreover, even though vlps are structurally stable particles, some biochemical and structural studies have observed that viral capsids and bacteriophages may show structurally dynamic properties, varying in shape, size, or arrangement of the coat proteins in response to different factors such as ph. [190] [191] [192] [193] on the other hand, vlps are considered biologically safe nanostructures since they are not infectious (they lack the viral genome) and do not replicate, representing a safer alternative to viral vectors. 160, [194] [195] [196] [197] however, they can elicit immune and inflammatory responses, especially when repeated administration is needed.
152 it also has to be noted that, when used in vaccination, vlps could show excellent adjuvant properties and the majority of vlps stimulate strong cellular and humoral immune responses as direct immunogens. 198 it has been suggested that recombinant vlps derived from infection of insect cells with baculovirus, or even those derived from prokaryotic systems, could be contaminated with residual components of these host cells, and that those impurities contribute to the adjuvant properties. 153 one interesting property of vlps is that viral coat proteins present an enormous elasticity and adaptability to be modified chemically and/or by protein genetic engineering 154, 160, 199 to incorporate multiple directed functionalities for use in biomedical applications such as drug delivery or gene therapy. it has been recently reviewed that chemically and/or genetically modified vlps, including cpmv, ccmv, ms2, m13 bacteriophages, and other virus-based nanoparticles, 155,186 could maintain their structural integrity and improve their physical stability 154 and, moreover, that these modifications could also confer desired cell-targeting properties on the nanovehicle. [153] [154] [155] 186, 200, 201 vlps can be successfully engineered with spatial precision to incorporate (attached or genetically displayed on the surface) tissue-specific targeting ligands such as epidermal growth factor (egfr) and antibodies, or other molecules such as oligonucleotides, peptides, gold and other metals, target proteins, carbohydrates, polymers, fluorophores, quantum dots, drugs, or small molecules. 152, 154, 155 moreover, one of the potential benefits of such modifications is that the specific geometric arrangement confers precise recognition patterns.
200, 201 furthermore, the accessibility of the materials carried within the particle and the ability to include and release nucleic acids, small molecules, and unusual cargoes with appropriate charge are another outstanding feature and key advantage of vlps that has also made them excellent vessels for gene and drug delivery. 152, 195 as described above, vlps can be used as empty nanocarriers to transport molecules chemically attached to their surface or can be loaded ex vivo with therapeutic molecules such as drugs, dnas, mrnas, sirnas, oligonucleotides, quantum dots, magnetic nanoparticles, or proteins. 155, 157, 160 vlps of different papillomaviruses and polyomaviruses have been widely characterized and used for directed delivery in biomedical applications. 132, 165, 173, 174, 194, 202 osmotic shock and in vitro self-assembly of vlp subunits in the presence of the cargo have been the two main strategies used to package nucleic acids or other small molecules. it has to be taken into account that some attachment of the cargo to the vlp surface can occur. 195 besides, the diversity of natural tropism, including liver for hepatitis b vlps, spleen for some papillomavirus and polyomavirus vlps, antigen-presenting cells for certain papillomavirus vlps, and glial cells for human polyomavirus jc (jcv) vlps, among others, 152 is one of the key advantages offered by vlps, providing a wide spectrum of specific targeting and distribution profiles depending on the intended application. although each vlp has its own characteristic receptors, entry pathway, and intracellular trafficking, it has been demonstrated that the tropism of vlps can be customized by modifying the residues identified as ligands of the cellular receptor on the vlp surface or even by varying the delivery route. 155, 189, 203 another key advantage of vlps is that they can be easily produced in a wide range of hosts and expression systems, each with its own requirements.
162 in the past years, there has been an increasing need to improve and optimize efficient large-scale production systems, process control and monitoring, and up-and down-streaming processes. 153, 157, 159, 204 production of vlps usually involves transfection of the cell host expression system of choice with a plasmid encoding one or more viral structural proteins, further and rigorous purification for the removal of immunogenic cellular contaminants, and quality control of the produced vlp and encapsulation of the cargo ex vivo before administration. 152, 158 the most frequent and convenient expression systems, adaptable to large-scale processes are (1) yeast cells 176 205, 206 , (4) bacteria 204,207 , (5) green plants infected with modified viruses 208, 209 , and (6) cell-free systems. 163, 204 the preparative and large-scale manufacture of vlps in some of these hosts has been reviewed by pattenden et al. and can be classified into two main methods of bioprocessing: in vivo and in vitro systems. 157 in addition, the capability of in vitro dissociation and reassociation of vlps contribute to the application of easy and more accurate purification methods than those of viral vectors. 152, 157 furthermore, depending on the expression system, the resulting vlp might be significantly different even though expressing the same viral proteins. thus, a broad spectrum of vlps could be customized depending on the vlp type, the number of proteins needed for vlp assembling, and the targeted final application. 158, 210 as described above, vlps have great potential as nanocarriers in drug and gene delivery. 
at the same time, although there is an increasing flow of developments in this area, these vehicles also present some limitations that should be addressed and taken into account, such as residual cellular components, variable yield of functional vlps after the disassembly/reassembly process, immunostimulation and unsuitability for repeated administration, tolerance to the transgene, ineffective loading of therapeutic molecules, and low transfection rates. 152 due to their versatile nanoparticulate structure and morphology, and their nonreplicative and noninfectious nature combined with their natural immunogenic properties and ease of production, vlps have principally emerged as an excellent alternative to attenuated viruses for vaccination. 152 182 and ebola virus. 215, 216 although vlp-based vaccines have been primarily developed for use against the corresponding virus, in the last decades genetic engineering or chemical modifications have been applied in order to generate chimeric vlps. thus, on the one hand, short heterologous peptide epitopes or full proteins that are unable to form vlps or that are unsafe for vaccination have commonly been presented on surface-exposed loops or fused to exposed n- or c-termini of structural viral capsid proteins on vlps. 154, 161, 210 different hpv, 217-219 hbv, 220,221 parvovirus, 222, 223 and chimeric polyoma vlps have been engineered 170, 175 and tested for different applications including vaccination against viral or bacterial diseases, against virus-induced tumors, and more recently, for immunotherapy of nonviral cancer. 161, 210 on the other hand, chemical bioconjugation for covalent coupling of protein epitopes and small molecules to lysine, cysteine, or tyrosine residues of vlp surfaces has been applied in viral or cancer vaccines. 200 chackerian et al.
have demonstrated the efficient induction of protective autoantibodies using self-antigen conjugation to hpv vlps. 224 it is important to point out that vlps can also be engineered to incorporate heterologous cell-specific ligands to cell receptors, thus altering their cellular tropism. 154, 155, 186, 201 this great convertibility and flexibility of vlps to be modified (chemically and/or genetically), their high stability, natural and diverse tropism, their nanocontainer properties, and their ability to enter the cell and incorporate, bind, and deliver nucleic acids and small molecules have positioned vlps as appealing entities not only for vaccination applications but also for a broad spectrum of other diverse and emerging applications in nanomedicine and nanotechnology such as immunotherapy against cancer, 210,225 gene therapy delivery of therapeutic genes into specific cells, 161, 165, 171, 184, 226, 227 and targeted delivery of drugs and small molecules using vlps as nanocarriers. 174, 196 although there is no commercial vlp vector for gene therapy, since the initial work in 1970 on uncoating polyoma pseudovirus in mouse embryo cells as a gene delivery vector 228 and the establishment in 1983 of the viral dna packaging into mpyv vlps and its transduction in vitro, 163 different vlps such as hbv and hepatitis e virus, 229 hpv and polyomavirus nanoparticles 172, 178, 229 have been modified toward the specific delivery of therapeutic genes and proteins to different target cells, organs, and tissues in vitro and in vivo by systemic injection 229 or oral administration. 230 for example, recombinant vp1-based polyomavirus vlps can encapsulate exogenous dna in vitro and deliver it via cell surface sialic acid residues to human brain cells and fetal kidney epithelial cells.
178 furthermore, vlps have recently emerged as novel nanocarriers or nanocontainers to store unnatural cargos and to deliver modified oligonucleotides, 154 synthetic small interfering rnas, and plasmids expressing short hairpin rnas as therapy to downregulate gene expression. 171, 231 in this context, chou et al. have recently described the use of jcv vlps as an efficient vector for delivering rnai in vitro using murine macrophage raw 264.7 cells and in vivo using balb/c mice, silencing the il-10 cytokine gene without significant cytotoxicity, for systemic lupus erythematosus gene therapy. 171 one of the key aspects in targeted gene and drug delivery is cell-specific delivery. it is important to point out that vlps are tunable nanoparticles that can also be chemically or genetically engineered to modify their natural cellular tropism in order to diversify the range of therapeutic applications in targeted gene or drug delivery. 154, 201 some effective approaches to modify the natural cellular tropism include: (1) genetic engineering of vlp chimeras incorporating heterologous cell-specific short peptides that contain recognition sites of target cell receptors. 232 in this context, polyoma and papillomavirus, with solved atomic structures of their major structural capsid proteins, have been extensively used to obtain chimeric vlps as delivery vector systems. 165, 233 however, this approach has some bioprocessing limitations such as low production levels as a consequence of vlp modification, alterations of the size and properties of the vlps that could affect the structural interactions and conformations needed for vlp assembly, disassembly, and packaging, and low transduction efficiencies. 157 (2) chemical bioconjugation of purified vlps with epitope-containing peptides 234, 235 or a wide range of small molecules conferring cell-specific targeting such as transferrins, folic acid, or other targeting molecules.
as an example, cpmv vlps have been successfully conjugated with tfn using ''click'' chemistry 236 and with nhs-ester-derivatized folic acid, and both conjugates were shown to be internalized into hela cells and kb cells, respectively. 183, 184 (3) high-throughput library and directed evolution methods, an approach that has been recently used to engineer viral vectors with the desired tropism properties. 237 (4) pseudotyping, which consists of replacing the envelope protein of one virus species by the envelope protein of another virus species. 238 (5) modification of the delivery route of the vlps. it has been shown that the levels of expression of β-galactosidase in heart, lung, kidney, spleen, liver, and brain differ depending on the delivery route of polyomavirus vp1 vlps. 203 the great accessibility and reactivity shown by vlps, as well as their ability to serve as nanocarriers, which made them suitable for gene therapy, have also been applied to targeted drug delivery. 195 genetic modification and/or chemical functionalization of exposed amino acid residues on the capsid surface in order to attach small molecules, such as markers or bioactive molecules, is one of the most common approaches applied to targeted drug delivery. 174, 239 as an example, canine parvovirus (cpv) vlps produced in a baculovirus expression system and exhibiting natural tropism to transferrin receptors (tfrs) were chemically modified on accessible lysines of the capsid surface with fluorescent dye molecules and delivered to tumor cells. derivatization of cpv-vlps did not interfere with binding and internalization into tumor cells. 183, 184 one limitation of vlps in gene therapy is the low efficiency of gene transduction due to inefficient dna packaging. however, a recent study presented a novel in vivo dna packaging of jcv vlps in e. coli that effectively reduced human colon carcinoma volume in a nude mouse model.
in this study, the exogenous plasmid dna was transformed into the jcv vp1 expressing e. coli. the packaging of the second plasmid occurs simultaneously with the in vivo assembly of the jcv vlp. even though it is still not clear how the plasmid dna molecules are encapsidated in the vlp, the authors showed that gene transduction efficiency by their in vivo packaging system was about 80%, in contrast to the 1-2% gene transduction efficiency achieved by the in vitro osmotic shock system. 226 in addition, the administration of exogenous proteins may induce an immune response, reducing therapy effectiveness or causing undesirable secondary effects, although the immunological response to protein nanoparticles can be modulated. 240 spontaneous protein self-assembly to form ordered oligomers is a common event in biology. it can prove advantageous in terms of genome-size minimization, formation of large structures, stabilization of complexes, and inclusion of functional features. 241 it has been widely documented that cellular oligomeric proteins as well as viral capsids are stabilized by several weak noncovalent interactions such as hydrophobic interactions, electrostatic energy, and van der waals forces. [242] [243] [244] these interactions result in a complex quaternary structure described by three symmetry point groups named cyclic (cn), dihedral (dm), and cubic (t, o, i). 245, 246 the development of computational techniques to predict protein-protein interactions using solved 3d protein structures makes it possible to predict and/or strengthen experimental data through in silico approaches. 247 furthermore, their use opens up the possibility of designing proteins that display not only specific biological functions but also interesting intermolecular interactions, to obtain increased multivalency in the resulting complexes.
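the relation between the symmetry point groups named above and the size of an assembly can be illustrated with a minimal python sketch. it assumes the simplest case, which is not stated in the text: one identical subunit per asymmetric unit, so that the subunit count equals the order of the rotational point group (n for cn, 2n for dn, and 12, 24, and 60 for t, o, and i). the function name `subunits_in_point_group` is hypothetical and chosen only for illustration.

```python
import re

# Orders of the rotational cubic point groups (proteins are chiral, so
# mirror operations are excluded): tetrahedral, octahedral, icosahedral.
CUBIC_ORDERS = {"T": 12, "O": 24, "I": 60}


def subunits_in_point_group(symbol: str) -> int:
    """Return the subunit count of a homo-oligomer with the given symmetry.

    Assumption (illustrative only): exactly one subunit occupies each
    asymmetric unit, so the count equals the order of the point group.
    Accepted symbols: "Cn", "Dn" (n a positive integer), "T", "O", "I".
    """
    symbol = symbol.strip().upper()
    if symbol in CUBIC_ORDERS:
        return CUBIC_ORDERS[symbol]
    match = re.fullmatch(r"([CD])(\d+)", symbol)
    if match is None:
        raise ValueError(f"unrecognized point group symbol: {symbol!r}")
    kind, n = match.group(1), int(match.group(2))
    if n < 1:
        raise ValueError("rotation order n must be at least 1")
    # Cyclic groups have n rotations; dihedral groups add n twofold axes.
    return n if kind == "C" else 2 * n


if __name__ == "__main__":
    for group in ("C5", "D3", "T", "O", "I"):
        print(group, subunits_in_point_group(group))
```

under this assumption, octahedral (o) symmetry gives 24 subunits and icosahedral (i) symmetry gives 60; assemblies with several subunits per asymmetric unit would be integer multiples of these values.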
moreover, it should be considered that not only whole proteins can self-assemble into smart nanoparticles; oligopeptides are also capable of forming organized structures. many applications are possible due to the enormous number of different combinations and features that can be exploited with peptides. 248, 249 furthermore, protein-protein interactions are not the only parameters involved in particle formation; nucleic acid-peptide interactions, salt concentration, order of mixing, and the ratio between nucleic acid and protein can also strongly influence the condensation process. 250, 251 due to their natural tendency to self-assemble into highly ordered structures, viruses provide a wide variety of scaffold proteins which are used as gene/drug carriers. among them, vlps have been reviewed in the previous section. however, simple prokaryotic proteins can also be utilized as carriers for gene delivery. for example, heat shock proteins (hsp) from the hyperthermophilic archaeon methanococcus jannaschii can assemble into a small structure of 24 subunits having octahedral symmetry. these 12 nm structures are stable at high temperature, up to 70 °c, and over a wide range of ph. residue modifications are tolerated, allowing the specific attachment of small molecules. 186, 252 in bacteria, bacterial microcompartments (bmcs) have been described; these are intracellular organelles consisting of enzymes encapsulated within polyhedral, protein-only shells, somewhat similar to viral capsids. bmcs are composed of a few thousand copies of a few repeated protein species (including one or more enzymes involved in specific metabolic pathways), with sizes of around 100-150 nm in cross section. the general role of bmcs is to confine toxic or volatile metabolic intermediates, while allowing enzyme substrates, products, and cofactors to pass.
the first described bmc, the carboxysome, was isolated in the early 1970s 253, 254 and has been found to contain both co2-fixing ribulose bisphosphate carboxylase/oxygenase (rubisco) 253, 254 and carbonic anhydrase [255] [256] [257] enzymes. the function of carboxysomes is to enhance autotrophic co2 fixation at low co2 levels. other bmcs were later identified in cyanobacteria and some chemoautotrophic bacteria. in addition, bmc proteins were later found to be encoded in the propanediol utilization operon (pdu) of the heterotroph salmonella 258 and by an operon for metabolizing ethanolamine (eut) in enteric bacterial species, including salmonella and escherichia. 259 salmonella enterica forms a polyhedral organelle during growth on 1,2-propanediol (1,2-pd) as a sole carbon and energy source, but not during growth on other carbon sources. 260, 261 the function of the pdu organelles is to minimize the harmful effects of a toxic intermediate of 1,2-pd degradation (propionaldehyde). [261] [262] [263] other studies have shown that a polyhedral organelle is involved in ethanolamine utilization (eut) by s. enterica. 259 the function of the eut microcompartment is to metabolize ethanolamine without allowing the release of acetaldehyde into the cytosol, thereby minimizing the potentially toxic effects of excess aldehyde in the bacterial cytosol [264] [265] [266] and also preventing volatile acetaldehyde from diffusing across the cell membrane. 267 so far, about 1700 proteins containing bmc domains have been identified, covering at least 10 different bacterial phyla. the typical bmc protein consists of approximately 90 amino acids, with an alpha/beta fold pattern. 268, 269 some individual bmc proteins self-assemble to form hexamers, which further assemble side by side to form the flat facets of the shell.
268, 270, 271 the formation of icosahedral, closed shells from such flat layers was elucidated in part by structural studies in carboxysomes: some bmc proteins assemble to form pentamers, which are located at and form the vertices of the icosahedral shell. 270 mechanisms directing enzyme encapsulation within protein-based bmcs have been studied in recent years. it has been described that, in some carboxysomes, the protein ccmm is used as a scaffold to form interactions with both shell proteins and enzymes, 272, 273 through a ccmm c-terminal region with homology to the small subunit of rubisco. 274 other studies revealed that pdu shells can self-assemble without needing interior enzymes 275 and that carboxysomes can self-assemble in vivo when rubisco has been deleted. 276 regarding properties of the encapsulated enzymes, in the pdu bmc some of the internal enzymes are encapsulated by means of specific n-terminal targeting sequences. 275, 277 along these lines, sutter and colleagues 278 described a conserved c-terminal amino acid sequence that mediates the physical interaction of an iron-dependent peroxidase (dyp) or a protein closely related to ferritin (flp) with a specific type of bmc (encapsulins). in another example, an icosahedral enzyme complex, lumazine synthase (aals) from bacillus subtilis and aquifex aeolicus, was engineered to encapsulate target molecules by means of charge complementarity and can also be modified to give different characteristics to the assembled structure. 279 moreover, enzymatic subunits, like e2 of pyruvate dehydrogenase from bacillus stearothermophilus, can be modified to be used in gene delivery. e2 peptides naturally form a dodecahedron of 60 subunits, 24 nm in diameter, which can be modified to accommodate drug-like molecules. the assembly and disassembly of these structures can be modulated by changing the ph of the experimental environment. these nanoparticles can also be functionalized with antigens for vaccine development.
281, 282 according to these results, specific targeting sequences could be of use in biotechnological applications to package proteins inside the stable self-assembled icosahedral shell of bmcs, offering appealing opportunities to manipulate such nanocages in the laboratory and fill them with therapeutic molecules. the simplicity of this system makes it very attractive for engineering studies that, mimicking nature, design new applications in biotechnology, providing a new, intriguing platform of microbial origin for drug delivery. bovine serum albumin (bsa) is able to form microspheres after sonochemical treatment in aqueous medium. the chemical effects of ultrasound radiation, coupled with an anticancer drug such as taxol (paclitaxel), led to the assembly of a spherical carrier with an average diameter of 120 nm. bsa particles resulting from s-s bonds, due to ho2 radical formation, are able to release the encapsulated taxol in cancer tissue, with better results than treatment with taxol alone. this drug for breast cancer treatment is commercially available. 283, 284 small cationic peptides can also lead to self-assembling particles. among others, arginine-rich cationic peptides are widely known as good tools for gene delivery. for example, purified r9-tailored gfp in solution has been described to form nanodisk particles 20 nm in diameter. this structure has been shown to be induced by the nine-arginine tails and is able to bind and condense dna. these nanodisks are also able to deliver dna toward the nucleus, where the reporter gene is expressed. 285 on the other hand, the expression of recombinant proteins above physiological rates can cause the cellular quality control system to malfunction, leading to self-organizing, pseudo-spherical protein aggregates known as inclusion bodies. these mechanically stable nanoparticles, ranging from 50 to 500 nm in diameter, were considered for a long time as undesired by-products.
recently, it has become clearer that they are suitable for medical approaches when utilized as a scaffold surface to promote cellular proliferation. [286] [287] [288] one of the most difficult goals in foreign gene delivery is to reach the nucleus. one approach to overcome this obstacle is to fuse an nls at a nonessential position of a dna-binding protein. this type of modification has been described for a tetracycline repressor protein (tetr) fused with an sv40 nls. the affinity and specificity of tetr-nls for the teto dna sequence are exploited to form spontaneous protein-dna complexes which enhance dna transport into the nucleus and the subsequent expression of foreign genes, combining the distinctive characteristics of each fusion component. 289 there is still a tremendous gap between the progress made in protein-based nanoparticle research for drug delivery and clinical reality. hundreds of publications in basic research describe the combination of two or more functional elements in a single protein nanoparticle, by which the delivery of a carried drug is enhanced. these agents act by improving critical steps in the drug delivery process, such as increasing the systemic stability or tissue specificity, favoring internalization, endosomal escape, and entry into the nucleus, or transporting therapeutic material through the bbb, in in vitro and in vivo studies. besides the human recombinant therapeutic proteins currently on the market (or functional segments of them), there are also some fusion proteins approved for clinical use (most incorporating an antibody fragment or a ligand to enhance cell specificity). unfortunately, no gene therapy trials have so far used full protein carriers in vivo, but rather peptide-functionalized vehicles.
a further bottleneck between research and clinical application is that the us fda/european medicines agency (emea) only approves human proteins, to avoid the risk of an immune response that could not only affect the effectiveness of the nanoparticle but also endanger patients' health. another critical factor is the administration route, along which the protein may be degraded before arriving at the target; this problem could be solved or minimized by the use of protein d-isomers, pegylation, or the design of protecting groups for labile sites. despite the current situation described above, there are many good examples of multifunctional modular proteins that, when carrying therapeutic material, can improve the prognosis in vivo in animal models of different diseases. these examples are reviewed below, along with those few protein nanoparticles that are currently on the market or in clinical trials. albumin is a natural protein transporter of hydrophobic molecules throughout plasma that has been approved by the fda to reversibly bind water-insoluble anticancer agents, as is the case of albumin-bound (nab) paclitaxel, abraxane. this albumin-nab technology-based drug is in use in patients with metastatic breast cancer who have failed combination therapy, and it is the first protein nanoparticle approved by the fda. albumin potentiates paclitaxel concentration within the tumor by increasing paclitaxel endothelial transcytosis through caveolae formation. a further contribution comes from the fact that tumors secrete an albumin-binding protein, sparc (also called bm-40), to attract and keep albumin-bound nutrients inside the tumor cell. 290 the albumin-paclitaxel complex was not formally considered a nanoparticle in the united states (due to an average size of 130 nm) but only so in europe. apart from whole recombinant therapeutic proteins currently being commercialized, there are also some examples of vehicles formed by chimerical proteins with target ligands already on the market.
dab389il-2 (denileukin diftitox or ontak) is a fusion of the diphtheria toxin catalytic and translocation domains, for lethal effect, and interleukin-2 (il-2), to gain cell specificity, used in the treatment of persistent or recurrent t-cell lymphoma. belatacept (bms-224818) is a ctla4-ig fusion protein formed by the cytotoxic t-lymphocyte-associated antigen 4 joined to an immunoglobulin g1 fc fragment, developed by bristol-myers squibb. etanercept (enbrel) fuses the tumor necrosis factor receptor (tnfr), which specifically binds and inhibits tnf activity, to an immunoglobulin g1 fc, to prevent inflammation mediated by tnf in autoimmune diseases like arthritis and psoriasis. on the other hand, fusion proteins which include an antihuman epidermal growth factor receptor 2 (her2) monoclonal antibody that binds tumor cell surfaces, among them the so-called ''trastuzumab'' (commercialized as herceptin by roche), associated to dm-1, an antimitotic drug, are aimed at improving the treatment of breast cancer. finally, vlps, that is, empty viral entities formed by the self-assembly of a viral capsid protein, are the only true protein nanoparticles (architectonically speaking) which are currently used in clinical practice. the hbsag recombinant protein of hbv expressed in yeast and the capsid l1 recombinant protein of hpv (types 6, 11, 16, and 18), currently administered as vaccines, tend to spontaneously form vlps that elicit t and b immune responses. recently, there have been preclinical and clinical trials to test the safety and efficacy of vlp vaccines against chikungunya 291 and seasonal influenza virus (http://www.medpagetoday.com/meetingcoverage//icaac/22129), respectively. influenza vlp vaccines have been shown to provide complete protection against the 2009 h1n1 pandemic flu, 292 within a record preparation time when compared to 9 months for traditional vaccines. the use of vlps as a delivery system for drugs or nucleic acids in gene therapy is still under investigation.
194 drugs and proteins may be transformed through pegylation, a process that can assist them in overcoming some of the potential problems that delay the adoption of protein nanoparticles for clinical use. the covalent attachment of peg can reduce immunogenicity and antigenicity by hiding the particle from the immune system, can increase the circulating time by reducing renal clearance, and can also improve the water solubility of a hydrophobic particle. the use of pegylation has been approved for commercial use by the fda and emea, and some examples of pegylated protein products are adagen (peg-bovine adenosine deaminase), the first pegylated protein approved by the fda in 1990, pegasys (peg-interferon alpha), and oncaspar (peg-l-asparaginase). the majority of protein nanoparticles studied in clinical trials (http://clinicaltrials.gov) are fusion proteins composed of a therapeutic protein/peptide and a target cell-specific ligand. an example is alt-801, a biologic compound composed of il-2 genetically fused to a humanized soluble t-cell receptor directed against the p53-derived antigen. the clinical trials evaluated whether directing il-2 activity, using alt-801, to the patient's tumor sites that overexpress p53 results in clinical benefits (nct01029873, nct00496860). another ligand joined to il-2 is l19, a single chain fragment variable (scfv) directed against the ed-b domain of fibronectin, one of the most important markers for neoangiogenesis; the fusion constitutes a tumor-targeted immunocytokine. l19-il-2 is in a phase i/ii study for patients with solid tumors and renal cell carcinoma (rcc) (nct01058538). l19 has also been fused to tnfα with the intention of targeting tnfα directly to tumor tissues, resulting in high and sustained intralesional bioactive tnfα concentrations.
l19tnfα is under clinical trial using isolated inferior limb perfusion (ilp) together with the standard treatment with melphalan, 10 mg/l limb volume, in subjects affected by stage iii/iv limb melanoma (nct01213732). ngr-htnf is another bifunctional protein, which combines a tumor-homing peptide (ngr) that selectively binds to aminopeptidase n/cd13, highly expressed on tumor blood vessels, thus affecting tumor vascular permeability, and htnf, with direct anticancer activity. ngr-htnf is undergoing 14 clinical trials as a single agent to treat different cancers, as well as in combination with chemotherapy agents. another strategy to direct a therapeutic protein to the target cell is through fusion to a growth factor receptor ligand. an example is tp-38, a recombinant chimerical protein composed of the egfr binding ligand (tgf-α) and a genetically engineered form of the pseudomonas exotoxin, pe-38, to treat recurrent grade iv malignant brain tumors (nct00071539). many clinical trials are based on a therapeutic protein fused to a targeting antibody, as is the case of apc8015. this drug stimulates the immune system and stops cancer cells from growing through the combination of biological therapies with bevacizumab, an already approved monoclonal antibody that locates tumor cells and kills them in a specific way (nct00849290). there are also many putative protein drugs against cancer which include anti-integrin antibodies (e.g., cilengitide and imgn388), sometimes in combination with classical therapies. a recently developed tool, the nanobodies or single domain antibodies, 293 has several advantages: small size (only 12-15 kda), which lowers the possibility of triggering an immune response, safety in clinical trials (nct01020383), and ease of conjugation to different kinds of compounds.
all these features make nanobodies competent drugs against different diseases, and they have been tested in vivo as bifunctional proteins associated to a prodrug, proving very efficient in mouse cancer xenografts. 294 even though cpps are very useful tools to deliver drugs and in gene therapy (see the chapter ''peptide nanoparticles for oligonucleotide delivery'' by lehto et al. in this volume), their toxicity and endosomal entrapment slow their inclusion for systemic delivery in clinical trials. nevertheless, there are a few examples of their use to prevent undesirable cell proliferation in coronary artery bypass grafts, as is the case of a cpp (r-ahx-r)4ahxb-pmo conjugate targeted to human c-myc, to be applied ex vivo. the trial, in phase ii, was completed in 2009 (nct00451256). another case is psorban, a product patented for the treatment of psoriasis based on a cyclosporine-polyarginine conjugate for local application, which circumvents the specificity problem of intravenous (i.v.) application. it is in clinical trial phase iii, but not yet on the market. finally, kai-9803, a pkcδ inhibitor peptide conjugated to tat to function as an intravenous drug for the treatment of acute myocardial infarction, is currently in a phase 2b clinical trial (nct00785954, kai pharmaceuticals). there are many proteins, often organized as nanoparticles, that, when associated to a drug, therapeutic protein, peptide, or nucleic acid, increase the therapeutic efficacy over that of the cargo alone in the treatment of various diseases. some of them have proved effective in animal models; these are discussed in more detail in this section, with relevant examples listed in table ii.
these nanoparticles may simply be (a) a cpp to promote nonspecific internalization, 295-300 (b) a peptide to confer cargo specificity by binding a receptor distinctive of a cell type, including scfvs or peptides obtained by phage display, 301 and (c) a mixture of both, 302 since, as observed in several studies, the cpp does not reduce ligand specificity and increases nanoparticle potency. [303] [304] [305] more complex, multifunctional vehicles include endosomal escape peptides that enhance the therapeutic potency of the complex, or other domains that allow their selective activation in certain contexts. 306, 307 apart from the cases listed in table ii, the spectrum of additional examples of multidomain protein nanoparticles tested in vivo is wide, and a considerable proportion of them include cpps, mainly tat and polyarginines. a classical tat fusion protein is the transducible d-isomer ri-tatp53c′ fusion protein, which activates p53 protein in cancer cells, but not in normal cells. ri-tatp53c′ treatment in terminal peritoneal carcinomatosis and peritoneal lymphoma preclinical models results in significant increases in life span (higher than sixfold) and full recovery from the disease. 308 there are also several in vivo studies using tat-fused therapeutic proteins which have proven effective in treating tumors [309] [310] [311] and cerebral ischemia 312, 313 when applied intraperitoneally (i.p.). regarding polyarginines, kumar and colleagues have presented two different models in which a bifunctional peptide formed by nine arginines (9r) and a specific ligand constitutes an effective sirna vehicle. in the first model, a chimerical peptide derived from rabies virus glycoprotein (to confer neuronal specificity) fused to nine d-arginines (rvg-9r) was able to transport sirna across the bbb and silence specific gene expression in the brain when applied intravenously.
56 in the second model, a cd7-specific single-chain antibody was conjugated to an oligo-9-arginine peptide (scfvcd7-9r) for t cell-specific antiviral sirna delivery in humanized mice reconstituted with human lymphocytes. in hiv-infected humanized mice, this treatment controlled viral replication and prevented the disease-associated cd4 t cell loss. moreover, it effectively suppressed viremia in infected mice. 314 some other examples of polyarginines in tumor models are 9-d-arginines fused to a tumor-suppressor peptide, which stopped tumor growth in hepatocellular carcinoma-bearing mice when applied intraperitoneally, and also cholesteryl oligoarginines carrying vegf sirna, which inhibited tumor growth in colon adenocarcinoma after local application. 315 another bbb-crossing peptide is g7, which is able to transport nanoparticles loaded with loperamide. 107 in general, the partner fusion peptide can confer specificity instead of penetrability, as is the case of an egfr fab fragment associated to liposomes that contain an anticancer drug, which increases the efficiency of the anticancer effect in egf overexpressing xenograft tumors 316 ; in addition, rgd-4c-doxorubicin in human breast xenografts increases efficacy and diminishes toxicity. 317 in many conjugates, the therapeutic peptide of the chimerical protein is a toxin. anthrax lethal toxin has been modified to be activated by metalloproteases, and it has proved to be effective against human xenografted tumors such as melanoma, lung, and colorectal cancer. 318 anthrax toxin has also been associated to antibodies or growth factors for lethal effects specifically on cancer cells. 319 the specific cytotoxicity desired to treat a tumor might derive from a tissue factor, which promotes clotting to restrict blood supply in tumor vessels, fused to peptides that provide specificity, like v-cam antibodies, fibronectin, and integrin ligands.
320 in some cases, drug activity may decrease when the drug is conjugated to a carrier protein, although if the entry of the drug is favored, the overall balance of activity can be much more favorable. 321 on the other hand, the use of a noncovalently bound drug carrier could avoid interfering with the activity of the drug. an important issue for a preclinical study to be considered for a clinical trial is the administration route. in in vivo experiments, most protein nanoparticles are administered by local or intraperitoneal injection, avoiding systemic spreading and clearance in the vascular system, in a way very similar to in vitro experiments. the fda and emea, on the other hand, will preferentially approve i.v. and oral administrations rather than intraperitoneal or local injections, except for very accessible tissues. another relevant issue is the number of active domains to be included in a therapeutic protein carrier, which seems to affect the functionality of the construct. for example, the cpp neutralization of a ligand may depend on the cpp/ligand ratio in the vehicle. 322 it has also been observed that the integrin binding power of rgd-containing motifs increases with the number of rgd domains over the monomer, up to a maximum of four moieties. 323 another example is the enhancement of tat activity when it is attached to molecules that form tetramers, such as beta-galactosidase 108 and p53. 324 some multidomain protein carriers allow drug entry only into selected target cells by tailored smart selective mechanisms. 325 for instance, cpps neutralized by polyanions are activated and enter the cells when they are released by metalloproteases 326 or by lowering the ph, 327 both situations being very common in tumors. cpp-morpholino oligomer (pmo) nanoparticles have also shown their effectiveness in treating viral infections by inhibiting viral replication, as demonstrated with the carrier (r-ahx-r)4ahxb-pmo administered i.v.
in animal models infected with picornaviruses, i.p. in mice infected with coronaviruses and flaviviruses, and the carrier r9f2c-pmo administered also i.p. in mice infected with ebola virus. furthermore, it has also been shown in some of these studies that the efficacy of the treatment is dependent on the incorporation of arginine-rich peptides in the nanoparticle. 328 a good example of how a cpp can improve the internalization of a therapeutic protein is the case of insulin. the instability and low absorption of insulin in the digestive tract prevent its oral administration, even though this would be very convenient for a drug administered daily. in recent studies, noncovalent conjugation of insulin to different cpps enhanced its absorption without toxic intestinal effects, l-penetratin being the most efficient insulin carrier. 329 among the protein nanoparticles tested in vivo, it is worth making special mention of the trojan horses generated in pardridge's laboratory to cross the bbb. the strategy is to fuse, within a chimerical peptide, the therapeutic protein which has to reach the cns to a monoclonal antibody against the human insulin receptor (hirmab). this trojan horse is very potent in humans and primates, and has proven effective in transporting β-glucuronidase, α-l-iduronidase, gdnf, abeta amyloid peptides, paraoxonase, etc., with potential benefits in diseases like mucopolysaccharidosis type vii, hurler syndrome, parkinson's disease, alzheimer's disease, and organophosphate toxicity, respectively. 330 there are also promising results when protein nanoparticles have been tested as carriers for gene therapy in vivo, some examples being listed in table ii.
in this regard, modular proteins generated by insertional mutagenesis of β-galactosidase and condensing the sod gene are able to protect neurons against ischemic injury 133 ; a bifunctional galactosylated polylysine is able to conjugate plasmid dna and to differentially promote expression in hepatocytes that display the asialoglycoprotein receptor 331 ; and a suicide multidomain protein particle formed by herpes simplex virus thymidine kinase (hsv-tk) conjugated to transferrin (tf) by a biotin-streptavidin bridge, administered i.v. in nude mice with massive k562 metastases, was able to reduce tumor size and to increase mouse survival. 332 in this chapter, proteins and peptides have been presented as potent biotechnological tools for the development of new biocompatible biological entities that can be used as therapeutic agents by themselves or as nanovehicles for the delivery of associated drugs. proteins are nanostructures that can form complex high-order entities such as vlps, resulting in appropriate cages for the internalization of therapeutic molecules. in addition, the design of modular proteins displaying selected functions has been made possible by in silico approaches together with the feasibility of recombinant protein production. this approach has demonstrated the versatility of such molecules in the generation of novel delivery nanovehicles, opening up the possibility of new functional combinations to enhance the specific interaction with the target tissue. such tunable specificity in the delivery of drugs, nucleic acids, or other proteins is one of the main properties that make multifunctional proteins appealing as more rational delivery vehicles. the presence on the market of such complex entities, which started with the approval of insulin for the treatment of diabetes, has been increasing over the past years, and this tendency is expected to continue.
in fact, there are some products in clinical trials that will probably end up being approved, and some more are being explored in preclinical experiments and might enter clinical trials.
identifying actives from hts data sets: practical approaches for the selection of an appropriate hts data-processing method and quality control review
natural products in the process of finding new drug candidates
when analoging is not enough: scaffold discovery in medicinal chemistry
dose-toxicity models in oncology
is declining innovation in the pharmaceutical industry a myth?
the impact of pegylation on biological therapies
anticancer activity of celastrol in combination with erbb2-targeted therapeutics for treatment of erbb2-overexpressing breast cancers
soon-shiong p. sparc expression correlates with tumor response to albumin-bound paclitaxel in head and neck cancer patients
improved effectiveness of nanoparticle albumin-bound (nab) paclitaxel versus polysorbate-based docetaxel in multiple xenografts as a function of her2 and sparc status
pharmacokinetics of plasmid dna in the rat
instability, stabilization, and formulation of liquid protein pharmaceuticals
peptide-guided gene delivery
artificial viruses: a nanotechnological approach to gene delivery
approaches to transport therapeutic drugs across the blood-brain barrier to treat brain diseases
multifunctional protein nanocarriers for targeted nuclear gene delivery in nondividing cells
synthetic and natural polycations for gene therapy: state of the art and new perspectives
structure-activity relationships of poly(l-lysines): effects of pegylation and molecular shape on physicochemical and biological properties in gene delivery
systemic circulation of poly(l-lysine)/dna vectors is influenced by polycation molecular weight and type of dna: differential circulation in mice and rats and the implications for human gene therapy
a novel dna-peptide complex for efficient gene transfer and expression in mammalian cells
branched cationic peptides for gene delivery: role of type and number of cationic residues in formation and in vitro activity of dna polyplexes
comparative gene transfer efficiency of low molecular weight polylysine dna-condensing peptides
low molecular weight disulfide cross-linking peptides as nonviral gene delivery carriers
protamine sulfate enhances lipid-mediated gene transfer
protamine-induced condensation and decondensation of the same dna molecule
evaluation of nuclear transfer and transcription of plasmid dna condensed with protamine by microinjection: the use of a nuclear transfer score
the protamine family of sperm nuclear proteins
membrane-active peptides for non-viral gene therapy: making the safest easier
enhancement of msh receptor- and gal4-mediated gene transfer by switching the nuclear import pathway
target cell-specific dna transfer mediated by a chimeric multidomain protein: novel non-viral gene delivery system
a multi-domain protein system based on the hc fragment of tetanus toxin for targeting dna to neuronal cells
refined solution structure of the dna-binding domain of gal4 and use of 3j(113cd,1h) in structure determination
immune responses to gene therapy vectors: influence on vector function and effector mechanisms
peptide-assisted traffic engineering for nonviral gene therapy
cell-penetrating peptides: a reevaluation of the mechanism of cellular uptake
cell penetrating peptides: overview and applications to the delivery of oligonucleotides
modular protein engineering in emerging cancer therapies
oligomers of the arginine-rich motif of the hiv-1 tat protein are capable of transferring plasmid dna into cells
tat-mediated delivery of heterologous proteins into cells
tat peptide-mediated cellular delivery: back to basics
a truncated hiv-1 tat protein basic domain rapidly translocates through the plasma membrane and accumulates in the cell nucleus
cellular uptake [correction of utake] of the tat peptide: an endocytosis mechanism following ionic interactions
the design, synthesis, and evaluation of molecules that enable or enhance cellular uptake: peptoid molecular transporters
delivery of short interfering rna using endosomolytic cell-penetrating peptides
conjugate for efficient delivery of short interfering rna (sirna) into mammalian cells
cell penetration by transportan
cellular translocation of proteins by transportan
integrin-mediated vectors for gene transfer and therapy
inhibition of tumor growth by rgd peptide-directed delivery of truncated tissue factor to the tumor vasculature
hiv coreceptor downregulation as antiviral principle: sdf-1alpha-dependent internalization of the chemokine receptor cxcr4 contributes to inhibition of hiv replication
cxcr4, inhibitors and mechanisms of action
the transferrin receptor part ii: targeted delivery of therapeutic agents into cancer cells
improved gene delivery into neuroglial cells using a fiber-modified adenovirus vector
systemic genetic transfer of p21waf-1 and gm-csf utilizing of a novel oligopeptide-based egf receptor targeting polyplex
specific systemic nonviral gene delivery to human hepatocellular carcinoma xenografts in scid mice
a new n-acetylgalactosamine containing peptide as a targeting vehicle for mammalian hepatocytes via asialoglycoprotein receptor endocytosis
transvascular delivery of small interfering rna to the central nervous system
a novel peptide, plaeidgielty, for the targeting of alpha9beta1-integrins
a synthetic peptide vector system for optimal gene delivery to corneal endothelium
secretin-mediated gene delivery, a specific targeting mechanism with potential for treatment of biliary and pancreatic disease in cystic fibrosis
a synthetic peptide containing loop 4 of nerve growth factor for targeted gene delivery
neurotensin-spdp-poly-l-lysine conjugate: a nonviral vector for targeted gene delivery to neural cells
identification of peptides that target the endothelial cell-specific lox-1 receptor
selective transport of an anti-transferrin
receptor antibody through the blood-brain barrier in vivo humanization of anti-human insulin receptor antibody for drug targeting across the human blood-brain barrier anti-gad antibody targeted non-viral gene delivery to islet beta cells novel challenges in exploring peptide ligands and corresponding tissue-specific endothelial receptors from combinatorial chemistry to cancer-targeting peptides cell surface adherence and endocytosis of protein transduction domains influenza virus hemagglutinin ha-2 n-terminal fusogenic peptides augment gene transfer by transferrin-polylysine-dna complexes: toward a synthetic virus-like gene-transfer vehicle the influence of endosomedisruptive peptides on gene transfer using synthetic virus-like gene transfer systems ph-dependent bilayer destabilization by an amphipathic peptide mechanism of leakage of phospholipid vesicle contents induced by the peptide gala association of a ph-sensitive peptide with membrane vesicles: role of amino acid sequence gala: a designed synthetic ph-responsive amphipathic peptide with applications in drug and gene delivery design, synthesis, and characterization of a cationic peptide that binds to nucleic acids and permeabilizes bilayers new basic membrane-destabilizing peptides for plasmid-based gene delivery in vitro and in vivo melittin enables efficient vesicular escape and enhanced nuclear access of nonviral gene delivery vectors the third helix of the antennapedia homeodomain translocates through biological membranes trojan peptides: the penetratin system for intracellular delivery histidine-rich peptides and polymers for nucleic acids delivery membrane permeabilization and efficient gene transfer by a peptide containing several histidines histidine containing peptides and polypeptides as nucleic acid vectors characterization of the gene transfer process mediated by histidine-rich peptides the proteasome metabolizes peptide-mediated nonviral gene delivery systems inhibition of ubiquitin-dependent 
proteolysis by a synthetic glycine-alanine repeat peptide that mimics an inhibitory viral sequence a minimal glycine-alanine repeat prevents the interaction of ubiquitinated i kappab alpha with the proteasome: a new mechanism for selective inhibition of proteolysis cis-inhibition of proteasomal degradation by viral repeats: impact of length and amino acid composition recognition of novel viral sequences that associate with the dynein light chain lc8 identified through a pepscan technique mechanisms of nuclear protein import gene delivery: a single nuclear localization signal peptide is sufficient to carry dna to the cell nucleus epstein-barr virus nuclear antigen 1 forms a complex with the nuclear transporter karyopherin alpha2 identification of the human c-myc protein nuclear translocation signal a chimeric fusion protein containing transforming growth factor-alpha mediates gene transfer via binding to the egf receptor the requirement of h1 histones for a heterodimeric nuclear import receptor core histones and linker histones are imported into the nucleus by different pathways nuclear targeting peptide scaffolds for lipofection of nondividing mammalian cells the carboxyl 35 amino acids of sv40 vp3 are essential for its nuclear accumulation pentapeptide nuclear localization signal in adenovirus e1a identification of domains involved in nuclear uptake and histone binding of protein n1 of xenopus laevis competition between nuclear localization and secretory signals determines the subcellular fate of a single cug-initiated form of fgf3 the human poly(adpribose) polymerase nuclear localization signal is a bipartite element functionally separate from dna binding and catalytic activity two interdependent basic domains in nucleoplasmin nuclear targeting sequence: identification of a class of bipartite nuclear targeting sequence long-term pharmacologically regulated expression of erythropoietin in primates following aav-mediated gene transfer rapid promoter analysis in 
developing mouse brain and genetic labeling of young neurons by doublecortin-dsred-express liverrestricted expression of the canine factor viii gene facilitates prevention of inhibitor formation in factor viii-deficient mice blood-brain barrier delivery targeting the central nervous system: in vivo experiments with peptide-derivatized nanoparticles loaded with loperamide and rhodamine-123 in vivo protein transduction: delivery of a biologically active protein into the mouse genetic engineering, expression, and activity of a fusion protein of a human neurotrophin and a molecular trojan horse for delivery across the human blood-brain barrier genetic engineering of a lysosomal enzyme fusion protein for targeted delivery across the human blood-brain barrier gdnf fusion protein for targeted-drug delivery across the human blood-brain barrier directed evolution: selecting today's biocatalysts novel methods for directed evolution of enzymes: quality, not quantity chemical and biochemical strategies for the randomization of protein encoding dna sequences: library construction methods for directed evolution rapid evolution of a protein in vitro by dna shuffling protein design automation advances in protein structure prediction and de novo protein design: a review solution structure and dynamics of a de novo designed three-helix bundle protein de novo protein design: fully automated sequence selection modular protein engineering for non-viral gene therapy a modular dna carrier protein based on the structure of diphtheria toxin mediates target cell-specific gene delivery gene therapy progress and prospects: non-viral gene therapy by systemic delivery using nuclear targeting signals to enhance non-viral gene transfer delivery of bioactive molecules into the cell: the trojan horse approach synthesis of cell-penetrating peptides and their application in neurobiology complexes of plasmid dna with basic domain 47-57 of the hiv-1 tat protein are transferred to mammalian cells by 
endocytosis-mediated pathways molecular organization of protein-dna complexes for cell-targeted dna delivery role of molecular chaperones in inclusion body formation recombinant protein expression in escherichia coli advanced genetic strategies for recombinant protein expression in escherichia coli recombinant expression systems in the pharmaceutical industry the major capsid protein, vp1, of human jc virus expressed in escherichia coli is able to self-assemble into a capsid-like particle and deliver exogenous dna into human kidney cells neuroprotection from nmda excitotoxic lesion by cu/zn superoxide dismutase gene delivery to the postnatal rat brain by a modular protein vector yeast cells allow high-level expression and formation of polyomavirus-like particles mammalian cell culture systems for recombinant protein production insect cell culture for industrial production of recombinant proteins large-scale mammalian cell culture glycosylation of human recombinant gonadotrophins: characterization and batch-to-batch consistency efficient gene delivery to primary neuron cultures using a synthetic peptide vector system cell-penetrating dna-binding protein as a safe and efficient naked dna delivery carrier in vitro and in vivo synthetic peptides as non-viral dna vectors exploration of peptide motifs for potent non-viral gene delivery highly selective for dividing cells engineering nuclear localization signals in modular protein vehicles for gene therapy the role of surface charge on the uptake and biocompatibility of hydroxyapatite nanoparticles with osteoblast cells rotavirus-like particles: a novel nanocarrier for the gut efficient accommodation of recombinant, foot-andmouth disease virus rgd peptides to cell-surface integrins virus entry: open sesame high efficiency polyoma dna transfection of chloroquine treated cells plasticity of influenza haemagglutinin fusion peptides and their interaction with lipid bilayers a nuclear localization signal can enhance both the 
nuclear transport and expression of 1 kb dna efficacy of a peptidebased gene delivery system depends on mitotic activity biological gene delivery vehicles: beyond viral vectors virus-like particles-universal molecular toolboxes virus engineering: functionalization and stabilization viruses and their uses in nanotechnology scaffolding proteins and their role in viral assembly towards the preparative and large-scale precision manufacture of virus-like particles virus-sized vaccine delivery systems virus-like particles in vaccine development advances in the development of virus-like particles as tools in medicine and nanoscience vaccination, immune and gene therapy based on viruslike particles against viral infections and cancer virus-like particles as a vaccine delivery system: myths and facts gene transfer by polyoma-like particles assembled in a cell-free system papillomavirus-like particles induce acute activation of dendritic cells papillomavirus virus-like particles as vehicles for the delivery of epitopes or genes virus-like particle vaccines and adjuvants: the hpv paradigm a novel system for efficient gene transfer into primary human hepatocytes via cell-permeable hepatitis b virus-like particle essential elements of the capsid protein for self-assembly into empty virus-like particles of hepatitis e virus recombinant hepatitis c virus-like particles expressed by baculovirus: utility in cell-binding and antibody detection assays murine pneumotropic virus chimeric her2/neu virus-like particles as prophylactic and therapeutic vaccines against her2/neu expressing tumors in vitro and in vivo targeted delivery of il-10 interfering rna by jc virus-like particles cell-type specific targeting and gene expression using a variant of polyoma vp1 virus-like particles molecular cloning and expression of major structural protein vp1 of the human polyomavirus jc virus: formation of virus-like particles useful for immunological and therapeutic studies packaging of small 
molecules into vp1-virus-like particles of the human polyomavirus jc virus chimeric polyomavirus-derived virus-like particles: the immunogenicity of an inserted peptide applied without adjuvant to mice depends on its insertion site and its flanking linker sequence generation of recombinant virus-like particles of human and non-human polyomaviruses in yeast saccharomyces cerevisiae hamster polyomavirusderived virus-like particles are able to transfer in vitro encapsidated plasmid dna to mammalian cells virus-like gene transfer into cells mediated by polyoma virus pseudocapsids immunity against both polyomavirus vp1 and a transgene product induced following intranasal delivery of vp1 pseudocapsid-dna complexes virus-like particles: designing an effective aids vaccine lentivirus-based virus-like particles as a new protein delivery tool parenteral administration of rf 8-2/6/7 rotavirus-like particles in a one-dose regimen induce protective immunity in mice canine parvovirus-like particles, a novel nanomaterial for tumor targeting tumor targeting using canine parvovirus nanoparticles norwalk virus-like particles as vaccines virus-based nanoparticles (vnps): platform technologies for diagnostic imaging quantitative characterization of virus-like particles by asymmetrical flow field flow fractionation, electrospray differential mobility analysis, and transmission electron microscopy high-resolution structure of a polyomavirus vp1-oligosaccharide complex: implications for assembly and receptor binding the polyomaviridae: contributions of virus structure to our understanding of virus receptors and infectious entry structures of virus and virus-like particles fabrication of novel biomaterials through molecular self-assembly bacteriophage capsids: tough nanoshells with complex elastic properties maturation of a tetravirus capsid alters the dynamic properties and creates a metastable complex the use of virus-like particles for gene transfer virus-like particles as vaccines and 
vessels for the delivery of small molecules recombinant virus like particles as drug delivery system virus-like particles as immunogens virus-like particles: passport to immune recognition manipulation of the mechanical properties of a virus by protein engineering viruses as building blocks for materials and devices adaptations of nanoscale viruses and other protein cages for medical applications blocking oncogenic ras signaling for cancer therapy assessment of cell type specific gene transfer of polyoma virus like particles presenting a tumor specific antibody fv fragment microbial production of virus-like particle vaccine protein at gram-per-litre levels recombinant baculoviruses as mammalian cell gene-delivery vectors production of core and virus-like particles with baculovirus infected insect cells production of fmdv virus-like particles by a sumo fusion protein approach in escherichia coli virus-like particles production in green plants an efficient plant viral expression system generating orally immunogenic norwalk virus-like particles immunotherapeutic polyoma and human papilloma virus-like particles human hepatitis b vaccine from recombinant yeast evaluation of hbs, hbc, and frcp virus-like particles for expression of human papillomavirus 16 e7 oncoprotein epitopes advances in methods for the production, purification, and characterization of hiv-1 gag-env pseudovirion vaccines nanotechnology in vaccine delivery protection against lethal challenge by ebola virus-like particles produced in insect cells ebola virus-like particles protect from lethal ebola virus infection activation of dendritic cells and induction of t cell responses by hpv 16 l1/e7 chimeric virus-like particles are enhanced by cpg odn or sorbitol vaccination trial with hpv16 l1e7 chimeric virus-like particles in women suffering from high grade cervical intraepithelial neoplasia (cin 2/3) hpv16 l1e7 chimeric virus-like particles induce specific hla-restricted t cells in humans after in vitro 
vaccination chimeric virus-like particles for the delivery of an inserted conserved influenza a-specific ctl epitope chimeric hepatitis b virus core particles carrying an epitope of anthrax protective antigen induce protective immunity against bacillus anthracis parvovirus b19 empty capsids as antigen carriers for presentation of antigenic determinants of dengue 2 virus recombinant virallike particles of parvovirus b19 as antigen carriers of anthrax protective antigen conjugation of a self-antigen to papillomavirus-like particles allows for efficient induction of protective autoantibodies a vaccine against nicotine for smoking cessation: a randomized controlled trial efficient gene transfer using the human jc virus-like particle that inhibits human colon adenocarcinoma growth in a nude mouse model a top-down approach for construction of hybrid polymer-virus gene delivery vectors dna and gene therapy: uncoating of polyoma pseudovirus in mouse embryo cells nanoparticles for the delivery of genes and drugs to human hepatocytes dna vaccineencapsulated virus-like particles derived from an orally transmissible virus stimulate mucosal and systemic immune responses by oral administration efficient delivery of rna interference effectors via in vitro-packaged sv40 pseudovirions reengineering a receptor footprint of adeno-associated virus enables selective and systemic gene transfer to muscle an investigation into the use of human papillomavirus type 16 virus-like particles as a delivery vector system for foreign proteins: n-and c-terminal fusion of gfp to the l1 and l2 capsid proteins coupling of antibodies via protein z on modified polyoma virus-like particles conjugation of an antibody fv fragment to a virus coat protein: cell-specific targeting of recombinant polyoma-virus-like particles accelerated bioorthogonal conjugation: a practical method for the ligation of diverse functional molecules to a polyvalent virus scaffold molecular engineering of viral gene delivery 
vehicles filovirus-pseudotyped lentiviral vector can efficiently and stably transduce airway epithelia in vivo nuclear entry mechanism of the human polyomavirus jc virus-like particle: role of importins and the nuclear pore complex hybrid virus-polymer materials. 1. synthesis and properties of peg-decorated cowpea mosaic virus the power of two: protein dimerization in biology bottom-up design of biomimetic assemblies measuring the forces that control protein interactions recent progress in understanding hydrophobic interactions binding mechanisms in supramolecular complexes structural symmetry and protein function modeling experimental design for proteomics smart and genetically engineered biomaterials and drug delivery systems molecular designer self-assembling peptides nanoparticulate architecture of protein-based artificial viruses is supported by protein-dna interactions giant dna molecules exhibit on/off switching of transcriptional activity through conformational transition the small heat shock protein cage from methanococcus jannaschii is a versatile nanoscale platform for genetic and chemical modification comparative ultrastructure of the thiobacilli functional organelles in prokaryotes: polyhedral inclusions (carboxysomes) of thiobacillus neapolitanus association of carbonic anhydrase activity with carboxysomes isolated from the cyanobacterium synechococcus pcc7942 a novel evolutionary lineage of carbonic anhydrase (epsilon class) is a component of the carboxysome shell isolation of a putative carboxysomal carbonic anhydrase gene from the cyanobacterium synechococcus pcc7942 the control region of the pdu/cob regulon in salmonella typhimurium the 17-gene ethanolamine (eut) operon of salmonella typhimurium encodes five homologues of carboxysome shell proteins the propanediol utilization (pdu) operon of salmonella enterica serovar typhimurium lt2 includes genes necessary for formation of polyhedral organelles involved in coenzyme b(12)-dependent 1, 
2-propanediol degradation pdua is a shell protein of polyhedral organelles involved in coenzyme b(12)-dependent degradation of 1,2-propanediol in salmonella enterica serovar typhimurium lt2 protein content of polyhedral organelles involved in coenzyme b12-dependent degradation of 1,2-propanediol in salmonella enterica serovar typhimurium lt2 pdup is a coenzyme-a-acylating propionaldehyde dehydrogenase associated with the polyhedral bodies involved in b12-dependent 1,2-propanediol degradation by salmonella enterica serovar typhimurium lt2 dna polymerase i function is required for the utilization of ethanolamine, 1,2-propanediol, and propionate by salmonella typhimurium lt2 glutathione is required for maximal transcription of the cobalamin biosynthetic and 1,2-propanediol utilization (cob/pdu) regulon and for the catabolism of ethanolamine, 1,2-propanediol, and propionate in salmonella typhimurium lt2 microcompartments for b12-dependent 1,2-propanediol degradation provide protection from dna and cellular damage by a reactive metabolic intermediate conserving a volatile metabolite: a role for carboxysome-like organelles in salmonella enterica protein structures forming the shell of primitive bacterial organelles bacterial microcompartment organelles: protein shell structure and evolution atomic-level models of the bacterial carboxysome shell structure and mechanisms of a protein-based organelle in escherichia coli a multiprotein bicarbonate dehydration complex essential to carboxysome function in cyanobacteria analysis of carboxysomes from synechococcus pcc7942 reveals multiple rubisco complexes with carboxysomal proteins ccmm and ccaa analysis of a genomic dna region from the cyanobacterium synechococcus sp. 
strain pcc7942 involved in carboxysome assembly and function synthesis of empty bacterial microcompartments, directed organelle protein incorporation, and evidence of filament-associated organelle movement halothiobacillus neapolitanus carboxysomes sequester heterologous and chimeric rubisco species short n-terminal sequences package proteins into bacterial microcompartments structural basis of enzyme encapsulation into a bacterial nanocompartment a simple tagging system for protein encapsulation multiple assembly states of lumazine synthase: a model relating catalytic function and molecular assembly thermostability and molecular encapsulation within an engineered caged protein scaffold ph-triggered disassembly in a caged protein complex characterization and activity of sonochemically-prepared bsa microspheres containing taxol-an anticancer drug paclitaxel-clusters coated with hyaluronan as selective tumor-targeted nanovectors protein nanodisk assembling and intracellular trafficking powered by an arginine-rich (r9) peptide the nanoscale properties of bacterial inclusion bodies and their effect on mammalian cell proliferation surface cell growth engineering assisted by a novel bacterial nanomaterial nanostructured bacterial materials for innovative medicines development of a selfassembling nuclear targeting vector system based on the tetracycline repressor protein unraveling mysteries of the multifunctional protein sparc a virus-like particle vaccine for epidemic chikungunya virus protects nonhuman primates against infection recombinant h1n1 viruslike particle vaccine elicits protective immunity in ferrets against the 2009 pandemic h1n1 influenza virus properties, production, and applications of camelid singledomain antibody fragments efficient cancer therapy with a nanobody-based conjugate effect of cell-based intercellular delivery of transcription factor gata4 on ischemic cardiomyopathy morpholino oligomer-mediated exon skipping averts the onset of dystrophic 
pathology in the mdx mouse overcoming multidrug resistance of small-molecule therapeutics through conjugation with releasable octaarginine transporters in vivo delivery of the caveolin-1 scaffolding domain inhibits nitric oxide synthesis and reduces inflammation a non-covalent peptide-based carrier for in vivo delivery of dna mimics targeting cyclin b1 through peptide-based delivery of sirna prevents tumour growth antibody mediated in vivo delivery of small interfering rnas via cell-surface receptors selective inhibition of erbb2-overexpressing breast cancer in vivo by a novel tat-based erbb2-targeting signal transducers and activators of transcription 3-blocking peptide design of a tumor-homing cell-penetrating peptide a tumor-homing peptide with a targeting specificity related to lymphatic vessels penetratin improves tumor retention of single-chain antibodies: a novel step toward optimization of radioimmunotherapy of solid tumors killing hiv-infected cells by transduction with an hiv protease-activated caspase-3 protein development of elastin-like polypeptide for thermally targeted delivery of doxorubicin treatment of terminal peritoneal carcinomatosis by a transducible p53-activating peptide antitumor effect of tat-oxygen-dependent degradation-caspase-3 fusion protein specifically stabilized and activated in hypoxic tumor cells the 104-123 amino acid sequence of the beta-domain of von hippel-lindau gene product is sufficient to inhibit renal tumor growth and invasion dendritic cells transduced with protein antigens induce cytotoxic lymphocytes and elicit antitumor immunity protein kinase c delta mediates cerebral reperfusion injury in vivo in vivo delivery of a bcl-xl fusion protein containing the tat protein transduction domain protects against ischemic brain injury and neuronal apoptosis t cell-specific sirna delivery suppresses hiv-1 infection in humanized mice cholesteryl oligoarginine delivering vascular endothelial growth factor sirna effectively inhibits 
tumor growth in colon adenocarcinoma epidermal growth factor receptor-targeted immunoliposomes significantly enhance the efficacy of multiple anticancer drugs in vivo cancer treatment by targeted drug delivery to tumor vasculature in a mouse model matrix metalloproteinase-activated anthrax lethal toxin demonstrates high potency in targeting tumor vasculature anthrax fusion protein therapy of cancer comparison of three different targeted tissue factor fusion proteins for inducing tumor vessel thrombosis overcoming methotrexate resistance in breast cancer tumour cells by the use of a new cellpenetrating peptide tumor cell retention of antibody fab fragments is enhanced by an attached hiv tat protein-derived peptide improved targeting of the alpha(v)beta (3) integrin by multimerisation of rgd peptides probing the impact of valency on the routing of arginine-rich peptides into eukaryotic cells cell-penetrating and cell-targeting peptides in drug delivery tumor imaging by means of proteolytic activation of cell-penetrating peptides tat peptide-based micelle system for potential active targeting of anti-cancer agents to acidic solid tumors cell penetrating peptide conjugates of steric block oligonucleotides usefulness of cell-penetrating peptides to improve intestinal insulin absorption biopharmaceutical drug targeting to the brain gene transfer in vivo: sustained expression and regulation of genes introduced into the liver by receptor-targeted uptake in vivo gene delivery to tumor cells by transferrin-streptavidin-dna conjugate the authors appreciate the financial support received through grants bfu2010-17450 from micinn, ps0900165 from fiss, and 2009sgr-108 from agaur. 
the authors also acknowledge the support of the ciber de bioingeniería, biomateriales y nanomedicina (ciber-bbn), an initiative funded by the vi national r&d&i plan 2008-2011, iniciativa ingenio 2010, consolider program, ciber actions and financed by the instituto de salud carlos iii with assistance from the european regional development fund.
key: cord-002100-dt5zvebj authors: he, yonghua; schmidt, monica a.; erwin, christopher; guo, jun; sun, raphael; pendarvis, ken; warner, brad w.; herman, eliot m. title: transgenic soybean production of bioactive human epidermal growth factor (egf) date: 2016-06-17 journal: plos one doi: 10.1371/journal.pone.0157034 sha: doc_id: 2100 cord_uid: dt5zvebj necrotizing enterocolitis (nec) is a devastating condition of premature infants that results from the gut microbiome invading immature intestinal tissues. the resulting life-threatening disease is frequently treated by surgical removal of diseased and dead tissue. epidermal growth factor (egf), typically found in bodily fluids such as amniotic fluid, saliva and mother’s breast milk, is an intestinotrophic growth factor and may reduce the onset of nec in premature infants. we have produced human egf in soybean seeds at biologically relevant levels and demonstrated that its activity is comparable to that of commercially available egf. transgenic soybean seeds were produced that express a seed-specific, codon-optimized gene encoding the human egf protein with an er signal tag added at the n-terminus. seven independent lines were grown to homozygosity and found to accumulate between 6.7 ± 3.1 and 129.0 ± 36.7 μg egf/g of dry soybean seed. proteomic and immunoblot analyses indicate that the inserted egf is the same as the human egf protein. 
phosphorylation and immunohistochemical assays on the egf receptor in hela cells indicate that the egf protein produced in soybean seed is bioactive and comparable to commercially available human egf. this work demonstrates the feasibility of using soybean seeds as a biofactory to produce therapeutic agents in a soymilk delivery platform. each year in the united states, more than 530,000 babies, approximately 12% of total births, are born before 37 full weeks of gestation [1]. premature birth is a growing health issue: its rate has increased by 36 percent since the early 1980s. one of the major problems associated with prematurity is the development of a condition known as neonatal necrotizing enterocolitis (nec) [2]. this is observed clinically as the abrupt development of bloody diarrhea, abdominal swelling, and tenderness in a premature infant who is otherwise doing well [3]. current treatment often requires surgical removal of the damaged and dead intestine, and often results in death (about 40%) or, if the infant survives, in significant lifelong problems [3] [4] [5]. although the direct cause of nec is not known, the most significant contributing factor is premature birth. post-partum establishment of an abnormal gut microbiome creates the opportunity for bacterial invasion into the gut due to immature intercellular junctions of the intestinal mucosa [6, 7]. experimental and clinical evidence suggests that prematurity and nec are associated with deficient endogenous production of epidermal growth factor (egf), which is necessary for normal intestinal development and repair [8, 9]. egf is a critical growth factor found in multiple fluids that bathe the developing intestine, including amniotic fluid, fetal urine, breast milk, bile, and saliva [2, 10, 11]. in the amniotic fluid, the concentration of egf increases as gestation progresses [12]. 
egf levels in mother's milk are highest in the first days after parturition, and milk from mothers of extremely pre-term neonates contains 50-80% more egf than milk from mothers of full-term infants [13]. human studies have demonstrated that egf is resistant to proteolytic degradation across a range of gastric ph [14]. while egf is produced to some extent in duodenal brunner's glands and kidney, the vast majority of egf is produced in the salivary glands [15]. exogenous infusion of egf in utero has been shown to accelerate the maturation of intestinal enzyme activity as well as stimulate intestinal growth [16, 17]. the importance of egf to gut development is highlighted by the fact that knockout of the egf receptor in some mouse strains results in death due to a bloody diarrhea that is remarkably similar to human nec [18]. transgenic mice engineered to overexpress egf in the intestine displayed a number of beneficial effects after a small bowel resection, including increased body weight and villus height, compared to nontransgenic mice [19]. conversely, inhibition of egf receptors impairs intestinal adaptation following a small bowel resection [20]. a prospective, multi-center trial demonstrated that infants fed regular formula (not containing growth factors) were 6 to 10 times more likely to develop nec than infants fed breast milk [21]. while a large number of biologically active peptides and growth factors have been identified in breast milk, egf is one of the major peptides present in significant concentrations [22]. the concentration of egf in milk is inversely proportional to the gestational age of the infant; the more premature the infant, the more egf is present in the breast milk [13]. this may be a compensatory response to the premature removal of the fetus from the egf-rich amniotic fluid. 
administration of exogenous egf has been shown in several animal models of nec to significantly reduce the severity of intestinal injury [23, 24]. the proactive treatment of infants at nec risk with egf supplementation could therefore accelerate intestinal maturation, thus preventing the development of nec. if the proactive egf feeding strategy is effective at inducing the maturation of the neonate intestinal tract, then this simple approach may mitigate the development of nec, which carries high costs in medical resources, pain, and possible life-long debilitation, and which proves fatal for about 40% of affected infants [25] [26] [27]. to accomplish such a proactive therapeutic approach, adapting infant formula for egf delivery would be simple and economic and would mimic the delivery of egf in mother's milk. the need and the potential delivery route make infant formula containing egf a good model for food-sourced plant biotechnology. soybean-derived formula accounts for a significant fraction of the total infant formula market. soybean milk and derived products are a potential food-therapy delivery platform that could include a variety of medically necessary products, including drugs such as egf and possibly oral vaccines [28, 29]. the economy of production and simple conversion into therapeutic materials with a long shelf life make soybean biotech products a potentially desirable commodity for use in cost-sensitive scaled applications. addressing the devastating disease of nec through a simple proactive treatment protocol is an excellent platform to explore the potential of soybean-produced therapeutics. here we report the accumulation of human egf (hegf) in genetically engineered soybean seeds and show that the recombinant egf is indistinguishable from authentic human egf and is bioactive at stimulating egf receptor (egfr) activity.
epidermal growth factor protein from humans was produced in soybean seeds by constructing a plant gene expression cassette that involved a synthetic codon-optimized egf nucleotide sequence (protein sequence from genbank accession ccq43157). this 162 bp open reading frame was placed in-frame behind a 20-amino acid endoplasmic reticulum (er) signal sequence from the arabidopsis chitinase gene [30, 31]. the er-directed egf-encoding open reading frame was developmentally regulated by the strong seed-specific storage protein glycinin regulatory elements [31]. the entire seed-specific cassette to direct egf production was placed in a vector containing the hygromycin resistance gene under the strong constitutive expression of the potato ubiquitin 3 regulatory elements as previously described [31] [32] [33]. the resulting plasmid pgly::shegf was sequenced using a glycinin promoter primer (5' tcattcaccttcctctcttc 3') to ensure the egf open reading frame was placed correctly between the regulatory elements. somatic soybean (glycine max l. merrill cv jack (wild type)) embryos were transformed via biolistics using 30 mg/l hygromycin b selection and regenerated as previously described [34]. embryos from resistant lines were analyzed by genomic pcr to confirm the presence of the inserted hygromycin cassette. genomic dna was isolated by cetyl trimethyl ammonium bromide (ctab) extraction and amplified with primers specific to the hygromycin gene (hygf 5' ctcactattcctttgccctc 3' and hygr 5' ctgacctattgcatctcccg 3') under the following conditions: 150 ng genomic dna in a 25 μl total reaction containing 200 nm primers and 3 u taq polymerase (neb), with an initial denaturation at 95°c for 4 min, then 45 cycles of 95°c for 30 s, 55°c for 45 s and 72°c for 90 s, followed by a final extension at 72°c for 7 min. dry seeds from two successive generations of pcr-positive plants were analyzed by elisa for the expression of egf protein until all 7 lines were confirmed to be homozygous.
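the genomic pcr program above can be written down as a small data structure, which makes it easy to check the total hold time and basic primer properties. the following is a minimal sketch in python, not the authors' workflow: the cycling values and primer sequences are taken from the text, while the function names are illustrative and ramp times between temperature steps are ignored (an assumption).

```python
# Sketch of the genomic PCR program described in the text.
# Assumption: only programmed hold times are counted; thermocycler ramp
# times between temperature steps are ignored.

INITIAL_DENATURATION_S = 4 * 60          # 95 C for 4 min
CYCLES = 45
CYCLE_STEPS_S = {                        # one amplification cycle
    "denature_95C": 30,
    "anneal_55C": 45,
    "extend_72C": 90,
}
FINAL_EXTENSION_S = 7 * 60               # 72 C for 7 min

# Hygromycin-specific primers as given in the text (5' to 3').
PRIMERS = {
    "hygF": "ctcactattcctttgccctc",
    "hygR": "ctgacctattgcatctcccg",
}


def program_seconds() -> int:
    """Total programmed hold time of the PCR run, in seconds."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    seq = primer.lower()
    return sum(base in "gc" for base in seq) / len(seq)


if __name__ == "__main__":
    total = program_seconds()
    print(f"programmed hold time: {total} s ({total / 60:.2f} min)")
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC fraction {gc_fraction(seq):.2f}")
```

with 45 cycles the programmed hold time alone is a little over two hours, which is worth knowing when scheduling many lines for screening.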
egf transgenic soybean plants along with nontransgenic control wild type cultivar plants were grown side by side in a greenhouse at 25°c under 16 h daylight with 1000 μm -2 /s. total soluble protein was extracted from dry seeds of two homozygous egf lines and a nontransgenic control by repeated acetone washes followed by acetone precipitation, with the protein pellet dissolved in water. proteins with a molecular weight of 10 kda and under were isolated by separately passing each extract through an amicon ultra centrifugal filter (merck, kenilworth nj). the samples were each suspended in sample buffer (50 mm tris-hcl, ph 6.8, 2% (w/v) sds, 0.7 m β-mercaptoethanol, 0.1% (w/v) bromophenol blue and 10% (v/v) glycerol) and then denatured for 5 min at 95°c. protein content was determined by bradford assay [35]. a 15% sds-page gel was used to separate 30 μg protein for each of the three samples: negative control wild type, and lines 4 and 5 of egf transgenic soybean dry seeds. commercially available human egf (gibco, life technologies, united kingdom) was used at 0.5 μg as a positive control. the gel was electroblotted onto immobilon p transfer membrane (millipore, bedford ma) and blocked with 3% milk solution in tbs for at least 1 hr. the primary antibody was a commercially available anti-egf (calbiochem, san diego ca) used at a 1:100 ratio in 3% bsa-tbs buffer overnight at room temperature. after 3 washes of 15 min each with tbs buffer, the blot was incubated with a 1:10,000 ratio in tbs of secondary antibody, anti-rabbit igg fab-specific alkaline phosphatase conjugate (sigma, st. louis mo). after 3 washes, the presence of the egf protein was detected by using a color substrate (bcip/nbt: final concentrations 0.02% (w/v) 5-bromo-4-chloro-3-indolyl phosphate and 0.03% (w/v) nitro blue tetrazolium in 70% (v/v) dimethylformamide) (kpl, gaithersburg ma).
total soluble protein was extracted from dry soybean seeds as described previously [31, 32] from all 7 lines of pgly::shegf transgenic plants along with nontransgenic seeds as a negative control. egf was quantitated by a commercially available human egf elisa assay (quantikine elisa kit from r&d systems, minneapolis mn) according to the manufacturer's instructions. the provided positive control was used to create a standard curve in order to determine the amount of egf in each soybean protein extract. each homozygous egf transgenic line was assayed with three biological replicates and results are displayed as mean ± standard error.
total soluble proteins were extracted, quantitated and suspended in sample loading buffer as previously described [31, 32]. approximately 30 μg of protein extract from dry seeds of 4 homozygous egf lines was separated on a 4-20% gradient sds-page gel (biorad, hercules ca) along with extract from a nontransgenic seed. the gel was subsequently stained with 0.1%
total soluble protein was extracted from 3 biological egf transgenic soybean dry seed samples, lines 4, 5 and 6. as described above, proteins with molecular weights lower than 10 kda were concentrated using an amicon ultra centrifugal filter (merck, kenilworth nj). non-transgenic seeds were used as a negative control and 5 μg commercially available egf (as above in the immunoblot section) was the positive control. protein was precipitated by adjusting the solution to 20% (v/v) trichloroacetic acid and allowing it to sit at 4°c overnight. precipitated proteins were pelleted by centrifugation, washed twice with acetone and then dried using vacuum centrifugation. the commercial egf was not filtered or precipitated, only dried. dried pellets were rehydrated with the addition of 10 μl 100 mm dithiothreitol in 100 mm ammonium bicarbonate and placed at 85°c for 5 minutes to reduce disulphide bonds.
samples were then alkylated with the addition of 10 μl iodoacetamide in 100 mm ammonium bromide and placed at room temperature in the dark for 30 minutes. two μg trypsin in 200 μl 100 mm ammonium bromide was added to each sample, and samples were placed at 37°c overnight for enzymatic digestion. after trypsin digestion, samples were desalted using a peptide reverse phase microtrap (michrom bioresources, auburn ca), dried and ultimately resuspended in 2 μl of 2% (v/v) acetonitrile, 0.1% (v/v) formic acid. separation of peptides was performed using a dionex u3000 splitless nanoflow hplc system operated at 333 nl/minute using a gradient from 2-50% acetonitrile over 60 minutes, followed by a 15 minute wash with 95% acetonitrile and a 15 minute equilibration with 2% acetonitrile. the c18 column, an in-house prepared 75 μm by 15 cm reverse phase column packed with halo 2.7 μm, 90å c18 material (mac-mod analytical, chadds ford pa), was located in the ion source just before a silica emitter. a potential of 2100 volts was applied using a liquid junction between the column and emitter. a thermo ltq velos pro mass spectrometer using a nanospray flex ion source was used to analyze the eluate from the u3000. scan parameters for the ltq velos pro were one ms scan followed by 10 ms/ms scans of the 5 most intense peaks. ms/ms scans were performed in pairs, a cid fragmentation scan followed by an hcd fragmentation scan of the same precursor m/z. dynamic exclusion was enabled with a mass exclusion time of 3 min and a repeat count of 1 within 30 sec of initial m/z measurement. spectra were collected over the entirety of each 90 minute chromatography run. raw mass spectra were converted to mgf format using msconvert, part of the proteowizard software library [36]. x!tandem 2013.09.01.1 [37] and omssa [38] algorithms were employed via the university of arizona high performance computing center to perform spectrum matching. precursor and fragment mass tolerance were set to 0.2 daltons for both omssa and x!tandem.
trypsin cleavage rules were used for both algorithms with up to 2 missed cleavages. the amino acid modification search consisted of single and double oxidation of methionine, oxidation of proline, n-terminal acetylation, carbamidomethylation of cysteine, deamidation of asparagine and glutamine and phosphorylation of serine, threonine, and tyrosine. x!tandem xml and omssa xml results were filtered using perl to remove any peptide matches with an evalue > 0.05 as well as proteins identified by a single peptide sequence. the protein fasta database for glycine max was downloaded on august 5, 2015 from ncbi refseq, with the addition of the egf amino acid sequence. a randomized version of the glycine max fasta was concatenated to the original as a way to assess dataset quality. the mass spectrometry proteomics data have been deposited to the proteomexchange consortium (http://proteomecentral.proteomexchange.org) via the pride partner repository [39] with the dataset identifier pxd003326 and doi 10.6019/pxd003326.
hela cells (obtained from american type culture collection) were cultured in minimum essential media (mem) supplemented with 10% fetal bovine serum (fbs), 100 units/ml penicillin, and 100 μg/ml streptomycin. for the western blotting assay, cells grown in 6-well plates were kept in serum free mem media for 24 hours. cells were then either kept in serum free medium (control) or stimulated with soy milk alone, soy egf or commercial recombinant human egf for different time periods as indicated. cells were lysed by directly adding 1× sds sample buffer (50 mm tris-hcl, ph 6.8, 10% glycerol, 2% sds and 5% β-me) to the cells after washing 3 times with 1× pbs. egf bioactivity was determined via egfr phosphorylation and downstream akt phosphorylation. total egfr was also measured since egfr is known to undergo internalization when stimulated with egf.
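the post-search filtering described in the mass spectrometry methods above (dropping peptide matches with an evalue > 0.05, dropping proteins supported by a single peptide sequence, and using the concatenated randomized database as a quality check) can be sketched as follows. the authors report doing this in perl on x!tandem and omssa xml output; the python below is only an illustration of the logic under stated assumptions. the record layout, the function names and the "DECOY_" prefix for randomized entries are hypothetical and do not come from the source.

```python
from collections import defaultdict
from typing import NamedTuple


class PeptideMatch(NamedTuple):
    protein: str    # accession; randomized entries carry a hypothetical "DECOY_" prefix
    peptide: str    # matched peptide sequence
    evalue: float   # expectation value reported by the search engine


EVALUE_CUTOFF = 0.05     # from the text: matches with evalue > 0.05 are removed
DECOY_PREFIX = "DECOY_"  # assumption: how randomized sequences are labeled


def filter_matches(matches):
    """Apply the two filters named in the text, in order."""
    # 1. Remove peptide matches above the e-value cutoff.
    kept = [m for m in matches if m.evalue <= EVALUE_CUTOFF]
    # 2. Remove proteins identified by only one distinct peptide sequence.
    peptides_by_protein = defaultdict(set)
    for m in kept:
        peptides_by_protein[m.protein].add(m.peptide)
    return [m for m in kept if len(peptides_by_protein[m.protein]) >= 2]


def decoy_fraction(matches):
    """Share of matches that hit randomized sequences (a rough quality indicator)."""
    if not matches:
        return 0.0
    decoys = sum(m.protein.startswith(DECOY_PREFIX) for m in matches)
    return decoys / len(matches)
```

a low decoy fraction after filtering suggests that few of the retained matches are random hits; how the authors actually summarized this check is not stated in the text.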
antibodies used in western blots were anti-p-egfr (tyr1068) (#2234, cell signaling technology), anti-total egfr (#06-847, millipore), anti-p-akt (#4060, cell signaling technology) and anti-lamin b1 (#13435, cell signaling technology) [40]. for the immunocytochemistry assay, cells were grown on coverslips in 6-well plates and kept in serum free media for 24 hours before stimulation. cells were then either kept in serum free media (control) or stimulated with human or soy egf for 6 hours. cells were washed with pbs and fixed with 4% formalin. egfr was labeled using anti-egfr antibody (#4267, cell signaling technology) and detected with alexa fluor 594 goat anti-rabbit igg (#a11012, life technologies). cell nuclei were visualized using mounting medium with dapi (#h-1200, vectashield).
to produce hegf in soybean, a strong soybean seed-specific promoter and terminator were used to regulate expression of a synthetic soybean codon-optimized hegf (shegf) gene that included an n-terminal 60 nucleotide er-signal sequence (fig 1a). in the engineering strategy for hegf expression in soybean, the prepro portions of hegf were eliminated so that only the final recombinant hegf product would be produced. to facilitate the co-translational transfer of the egf into the er lumen for disulfide bond formation, a plant signal sequence was added so that the hegf would be synthesized as a pre-hegf. the gly::shegf construct was used for biolistic transformation of soybean somatic embryo cells as outlined in [31] [32] [33] [34]. embryos were selected in liquid culture by hygromycin b and individual regenerated lines were separated, propagated, and induced to form cotyledonary embryos. the cotyledonary embryos were evaluated for hegf production using an egf-specific elisa that indicated variation in heterologous protein production (data not shown). the most promising egf-expressing lines were moved forward for regeneration by desiccation and subsequent germination.
the initial t0 generation egf transgenic plants were grown in the greenhouse and further selected by genomic pcr for an additional 2-3 generations. additionally, each generation of seeds produced by the selected lines was assayed for hegf content by elisa. the hegf content of each line in seeds representative of the homozygous population is shown in fig 1b. the lines varied in hegf content but seeds within each line had a narrow range of hegf accumulation. the egf transgenic line 5 produced in excess of 100 μg hegf per gm dry seed weight, a level calculated to be much in excess of potential therapeutic requirements. by comparison, yeast strains have been used as an expression system for both human egf [41] and mouse egf [42], with the highest levels produced being from a multicopy insert pichia pastoris clone secreting 49 μg egf/ml. in both the mouse and human egf yeast production systems, truncated versions of the egf were detected. the hegf soybeans and nontransgenic soybeans were evaluated to determine the biochemical authenticity of the soybean-produced egf protein. using 1d sds/page and parallel immunoblots probed with anti-egf, the soluble low molecular weight (<10 kda) seed proteins and the mr of the soybean-produced hegf were evaluated. the total protein polypeptide profile of the hegf-expressing lines appeared to be identical to that of the standard parental control (fig 2). immunoblots of the 1d sds/page probed with anti-egf showed no immunoreactive band in the nontransgenic soybean seed control and recognized a 6 kda mr band in the hegf-expressing lines 5 and 4. the soybean-produced hegf has the same apparent mr as authentic recombinant hegf fractionated in an adjacent lane (fig 3). to further assess the soybean-synthesized hegf, the seed lysates were enriched in low mr total proteins and concentrated. the crude low mr proteins were reduced, alkylated, and cleaved with trypsin prior to analysis by mass spectrometry.
the resulting data was queried with the hegf sequence and exact matches for peptides encompassing the majority of the sequence of the complete mature hegf protein were obtained (fig 4). together the data show that transgenic soybeans successfully produced and accumulated hegf that has the correct mr, is immunoreactive with antibodies directed at authentic egf in both elisa and immunoblot assays, and yields mass spectrometry fragments of which the majority match the human egf sequence. the delivery of any biopharma product in the context of compositionally complex food presents the potential that the components of the food may act to modulate bioactivity. plant-source foods in particular pose problems because plant tissues often possess a wide range of intrinsic biologically active components including proteins and natural products. the natural products of food could mask or enhance the effects of an expressed biopharma product. to evaluate the potential of egf activity in soymilk delivery, commercial recombinant human egf (rhegf) was added as a supplement to soymilk and the intrinsic activity of the egf was tested with a hela cell assay. fig 5 shows the effects of soymilk on the display of the egf receptor (egfr) on hela cells and the effect of commercial rhegf supplement to soymilk. soymilk does not modify the display of egfr on hela cells, showing that soymilk alone is biologically inactive. the binding of egf to egfr results in the decrease of displayed egfr as it is internalized into the hela cells. hela cells treated with commercially available rhegf-supplemented soymilk display the same decrease in egfr as cells treated with rhegf in media without soymilk. parallel time-course experiments show that the effect of rhegf binding to egfr is rapid, with a reduction of displayed egfr occurring within 5 min of treatment and continuing out to at least 30 min (data not shown).
together these assays show that soymilk has no apparent negative bioactivity with respect to either the binding of commercial rhegf to the hela cell egfr or the viability of the hela cells over the course of the assay. to assess the bioactivity of soybean-produced hegf, samples were prepared from both shegf transgenic soybean lines and nontransgenic controls and were used to stimulate hela cells to induce egfr internalization, degradation and phosphorylation. in the results shown in fig 5, soybean-produced hegf induces internalization, degradation and phosphorylation of egfr that is indistinguishable from the bioactivity of commercial rhegf delivered in control samples. in contrast, samples prepared from control nontransgenic soybeans exhibited no apparent bioactivity, showing that the degradation and phosphorylation of egfr is the result of egf binding, either by commercial rhegf added to the media or by the hegf produced by the transgenic soybeans. together these results show that nontransgenic soybean seeds have no intrinsic egf-mimic activity able to induce egfr degradation or phosphorylation, while soybeans producing hegf have identical activity in comparison to commercial rhegf.
soybean produced egf displayed comparable bioactivity to commercially available egf. panel a. soybean produced hegf induces a rapid phosphorylation of hela cell egfr. serum free media (sf) and sf media with soymilk alone do not induce egfr phosphorylation and degradation. soymilk from seeds producing shegf added at different concentrations (0.1, 0.05, 0.025 μg/ml) induced concentration-dependent egfr degradation comparable to the effect of rhegf. serum free media and serum free media with nontransgenic soybean soymilk (negative controls) showed no effect on inducing pegfr. in contrast, soymilk from shegf soybeans given at different concentrations (0.1, 0.05, 0.025 μg/ml) induced pegfr comparable to control rhegf. pakt indicates the functional activation of egfr.
lamin b1 was used as a loading control. panel b. exogenous commercial rhegf and shegf induce an internalization and degradation of egfr in hela cells, shown as a decrease in abundance assayed by immunoblot. the results shown demonstrate that soymilk alone has no intrinsic bioactivity with respect to egfr abundance. the rhegf is not degraded in soymilk over 24 hours, having the same bioactivity as control recombinant rhegf. -ctrl: sf media alone. soy egf and rhegf are at 0.1 μg/ml. lamin b1 was used as a loading control. panel c. shown is an immunohistochemical assay of hela cells showing that shegf induces internalization of the egfr comparable to that from control rhegf. in c, the cells were first treated with soy egf or human egf for 6 hours, fixed and then immunostained with egfr antibody overnight. egfr shows red staining while the nucleus was stained by dapi and shows blue staining.
in developing a food-based delivery platform for biopharma it is important to address the question of whether there are significant collateral consequences in seed composition resulting from the genetic modification. ideally a consumption plant biotechnology platform, such as soymilk, should be fully equivalent to the standard type other than the intended modification. seeds in general, including soybeans, possess an inventory of bioactive proteins and small molecules that will affect the metabolism of consumers in both advantageous and disadvantageous ways. for soybeans some of the relevant molecules are allergens, anti-metabolite proteins, and small molecules, especially isoflavones. to test for potential collateral changes in composition in the hegf-producing soybeans, the shegf transgenic and nontransgenic control soybeans were analyzed by non-targeted proteomics and metabolomics. the significant proteins identified include various well-documented allergens and anti-metabolite proteins.
a comparison of standard soybeans with hegf-producing soybean lines showed that, aside from the targeted production of hegf, there was no significant difference (p = .01) between nontransgenic control and shegf transgenic soybeans for any other proteins of concern. this data is available in the pride partner repository with the dataset identifier pxd003326 and doi 10.6019/pxd003326. non-targeted small molecule metabolomics was used to conduct a parallel analysis of the nontransgenic and hegf soybeans. again there were insignificant differences between nontransgenic soybean seeds and the shegf transgenic seeds (fig 6), with one notable exception. soybean highly regulates sulfur availability and its allocation into protein. from a nutritional perspective soybean is considered a somewhat sulfur-deficient crop. there have been a number of biotechnology experiments to increase sulfur content by either modifying assimilation and biosynthesis pathways leading to methionine or over-expressing high-methionine proteins such as maize zeins. modifying sulfur by pathway or competition has an effect on sulfur-responsive proteins including the bowman-birk trypsin inhibitor (bbi) and the beta chain of the storage protein conglycinin. egf is a high sulfur content protein that broadly mimics bbi as a small globular protein synthesized by the er and presumptively competing for sulfur amino acid-charged trna. expressing hegf in soybean has an effect on metabolites involved in sulfur amino acid metabolism that is consistent with producing a protein of egf's composition. a complete dataset of all metabolite abundances of the standard and hegf-expressing lines is available as an on-line spreadsheet (s1 table). among the assayed molecules of particular note is the soybean molecule genistein, an isoflavone that has been shown to affect the activity of tyrosine phosphatase in the signal cascade associated with egf signaling [43] [44] [45] [46].
genistein levels were determined to be the same in both the nontransgenic and hegf-expressing soybean lines. this too demonstrates that the expression of hegf in soybeans does not produce any incidental collateral consequences of concern for its potential therapeutic use.
since the inception of plant biotechnology its potential use for biopharma applications has been assessed [47] [48] [49] [50]. several different plant organs have been proposed as production platforms for food/feed delivery systems [51] [52] [53]. for many vaccine applications fruit are a highly advantageous delivery system, providing a broadly accepted platform for even the most recalcitrant consumer (for example, [54, 55]). although fruit are perhaps one of the best delivery systems from the perspective of point of delivery, fruit also pose logistical issues: ripened fruit are palatable for only a relatively short time, so producing, distributing, and using a biopharma product requires a tightly coordinated effort that could be challenging for deployment in mass quantities. an alternative is to develop a biopharma platform that is broadly acceptable for food and feed delivery but can be lightly processed to preserve bioactivity and can be massively scaled to maximize the distribution potential of the product. soybean is a potentially useful biopharma platform that could have broad application in both food and feed end uses [29, 30, 56, 57]. soybean has been demonstrated as a platform to produce heterologous proteins at levels that far exceed those typically needed for biopharma [31].
fig 6. relative proportion of non-targeted metabolites detected in soybean seeds shown as amount in egf transgenic compared to nontransgenic (wt). complete list of non-targeted metabolites quantitated in samples in s1 table. doi:10.1371/journal.pone.0157034.g006
soybeans can be used to produce both soymilk and formula for potential delivery to human infants or children as well as for production animals such as swine and calves. soybean can also be used to produce protein concentrates for inclusion in industrial food and feed or, more simply, protein aggregates such as tofu. soybean production is efficient and economic and can be massively scaled if needed. recently developed technology makes it feasible to increase the amount of recombinant protein product by silencing and exchange with a storage protein(s) [31, 58, 59]. as a platform, soybean is an industrial crop, with the vast majority of its total production being directed toward products including processed food, protein used as animal feed, and its oil for food, feed, fuel, and chemical feedstock uses. many of the goals of further enhancing and modifying soybeans are directed at improving its utilization for industry products rather than direct food use. as a biopharma platform to produce soymilk-derived products, soybean seeds can be stored for years in anticipation of future needs while retaining the potential to be rapidly processed into formula/milk or tofu using adaptations of traditional technology in use for over a millennium. soybeans, like many other seeds, produce an array of intrinsic small globular proteins with secondary disulfide bonds that accumulate at relatively high levels (>1% of total proteins). soybean in particular accumulates the bowman-birk trypsin inhibitor, which is 8.5 kda with 3 intra-chain disulfide bonds [60]. this suggests that soybean seeds are well suited as a potential bioreactor to produce and store proteins like egf, a 6.9 kda protein with 3 intra-chain disulfide bonds paralleling intrinsic seed proteins. in a predecessor experiment a mutant inactive bbi was expressed in transgenic soybeans, showing that alternate small proteins can be expressed in soybean [60].
expression of a construct encoding shegf regulated by the soybean seed storage protein promoter results in the accumulation of hegf at > 100 μg/gm of dry soybean seed, a level many fold over the estimated therapeutic requirement of 50 μg/kg weight of treated individual [61]. soybean-produced hegf appears to be completely comparable to authentic hegf in its mr, immunoreactivity with specific antibodies, correspondence of fragment sequence in mass spectrometry assay, and bioactivity in inducing the internalization, degradation and phosphorylation of egfr. together the results shown demonstrate that soybean seeds will produce hegf at proto-therapeutic levels and that the hegf derived from these seeds is bioactive for egf activity in a model hela cell assay.
the expression of hegf in soybean has little collateral impact on seed composition
soybeans have been used as an expression platform for a wide variety of heterologous proteins with investigative as well as potential food/feed and biopharma goals [33, [62] [63] [64] [65] [66]. prior biopharma expressions have included prototype expression of vaccine models [67] as well as proinsulin and fibroblast and human growth factor [68, 69]. in this study potential collateral changes in prototype product mature soybean seeds resulting from hegf expression were evaluated by non-targeted proteomics and metabolomics to assess both large and small molecules. these assessments showed that there was no significant difference in the seed proteome of the egf transgenics compared to nontransgenics. this is a pertinent result as soybeans are regulated in the us under falcpa (the food allergen labeling and consumer protection act of 2004) and unintended alterations of any of the known seed allergens or anti-nutritional proteins can be of concern. similarly, the non-targeted metabolomics of the soybean seeds showed no significant alteration of the small molecule profile in response to hegf accumulation.
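the yield and dose figures quoted at the start of this part of the discussion (more than 100 μg hegf per gram of dry seed, against an estimated requirement of 50 μg per kg body weight [61]) imply that very little seed is needed per dose. a minimal python sketch of that arithmetic follows. the two constants come from the text; the 1.5 kg body weight is a hypothetical example value, and the calculation assumes that all seed hegf is recovered in the delivered product, which the source does not claim.

```python
# Back-of-the-envelope dose arithmetic using the two figures from the text.
# Assumptions (not from the source): complete recovery of seed hEGF in the
# delivered product, and the example body weight used below.

HEGF_UG_PER_G_SEED = 100.0   # line 5 accumulates > 100 ug hEGF per g dry seed
DOSE_UG_PER_KG = 50.0        # estimated therapeutic requirement per kg body weight


def seed_mass_for_dose_g(body_weight_kg: float) -> float:
    """Grams of dry seed containing one estimated dose for the given body weight."""
    dose_ug = DOSE_UG_PER_KG * body_weight_kg
    return dose_ug / HEGF_UG_PER_G_SEED


if __name__ == "__main__":
    example_weight_kg = 1.5  # hypothetical low-birth-weight infant
    grams = seed_mass_for_dose_g(example_weight_kg)
    print(f"{example_weight_kg} kg -> about {grams:.2f} g dry seed per dose")
```

because the accumulation figure is a lower bound (> 100 μg/g), the seed mass computed this way is an upper estimate under the stated assumptions.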
among the molecules assessed, the lack of change in genistein is among the most significant, as this isoflavonoid has been shown to have activity with tyrosine phosphatase that is in the signal cascade of animal and human cells that includes egf/egfr signaling [43, 44, 70]. in the hela cell assessments there was no synergistic effect of standard soybean milk and authentic egf on egfr activity, indicating that the identical genistein concentration in the standard and hegf-expressing soybeans is below the threshold of effect in the assays conducted. the one significant alteration in the small molecule profile was in methionine-related metabolism. egf is a sulfur-rich protein containing three disulfide bonds that has some general resemblance to the soybean bowman-birk inhibitor. soybean is a relatively sulfur-deficient crop and much effort has been made to increase its sulfur amino acid content by either the co-expression of sulfur-rich proteins such as zeins [71] or by increasing the sulfur flux by altering the sulfur amino acid pathways [72] [73] [74]. these studies have shown that within limits the increase of a sulfur sink, such as expressing a high-sulfur content protein, will collaterally induce modest increases in sulfur amino acid source. the increases in sulfur amino acid metabolites accompanying hegf expression in soybean are in accord with these prior experiments. together the results of the non-targeted proteome and metabolome assessments show that converting soybean into a prototype biopharma delivery platform for hegf does not result in any adverse alterations of the soybean seed's composition.
soybeans could be used to produce biopharma products that might be delivered as milk or formula. as a test of this concept human epidermal growth factor (hegf) has been produced in soybeans to potentially address the devastating disease of neonatal necrotizing enterocolitis. this is a disease of premature infants of low birth weight.
these infants have underdeveloped organs including the intestinal tract. the resulting gangrenous infection is treated by emergency surgery to remove dead portions of the intestinal tract, a procedure that even under the most optimistic circumstances has a high mortality rate and a high cost of treatment. an alternative approach is to proactively treat infants at risk immediately post-partum to attempt to improve the integrity and maturity of the lining epithelial cells. the bioactivity results with model hela cells show that hegf can be produced and accumulated in soybean seeds and, as crude soy-milk lysate, is capable of stimulating a response from the egf receptor (egfr) that occurs on epidermal surfaces such as the intestinal tract. soybean-produced hegf has other potential applications in cosmetics, burn and injury treatment, and stimulating improved adaptation of the bowel to massive intestinal loss.
supporting information s1 table. non-targeted metabolome set.
births: final data for 2010 role of human milk in extremely low birth weight infants' risk of necrotizing enterocolitis or death necrotizing enterocolitis: the search for a unifying pathogenic theory leading to prevention neonatal necrotizing enterocolitis: a nine year experience: ii outcome assessment long-term survival and parenteral nutrition dependence in adult patients with the short bowel syndrome 16s rrna gene-based analysis of fecal microbiota from preterm infants with and without necrotizing enterocolitis fecal microbiota in premature infants prior to necrotizing enterocolitis establishment of the gut microbiota in western infants bacterial colonization and gut development in preterm neonates epidermal growth factor: biology and mechanism of action early human milk feeding is associated with a lower risk of necrotizing enterocolitis in very low birth weight infants epidermal growth factor (egf) concentrations in amniotic fluid and maternal urine during pregnancy increased epidermal growth factor levels in human milk of
mothers with extremely premature infants human epidermal growth factor: isolation and characterization and biological properties immunocytochemical localization of human epidermal growth factor/urogastrone in several human tissues transforming growth factor alpha and epidermal growth factor in protection and healing of gastric mucosal injury epidermal growth factor enhances intestinal adaptation after massive small bowel resection epithelial immaturity and multiorgan failure in mice lacking epidermal growth factor receptor intestinal overexpression of egf in transgenic mice enhances adaptation after small bowel resection selective inhibition of the epidermal growth factor receptor impairs intestinal adaptation after small bowel resection breast milk and neonatal necrotizing enterocolitis human milk for the premature infant epidermal growth factor reduces the development of necrotizing enterocolitis in a neonatal rat model intestinal barrier failure during experimental necrotizing enterocolitis: protective effect of egf treatment short bowel syndrome management of the short bowel syndrome in the pediatric population pediatric shortbowel syndrome: the cost of comprehensive care plant made pharmaceuticals: from edible vaccines to ebola therapeutics towards using biotechnology to modify soybean seeds as protein bioreactors. 
in, recent advancements in plant expression in crop plants production of escherichia coli heat labile toxin (lt) b subunit in soybean seed and analysis of its immunogenicity as an oral vaccine the collateral protein compensation mechanism can be exploited to enhance foreign protein accumulation in soybean seeds a rnai knockdown of soybean 24 kda oleosin results in the formation of micro-oil bodies that aggregate to form large complexes of oil bodies and er containing caleosin transgenic soybean seeds accumulating β-carotene exhibit the collateral enhancements of high oleate and high protein content traits towards normalization of soybean somatic embryo maturation a rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein dye binding proteowizard: open source software for rapid proteomics tools development tandem: matching proteins with tandem mass spectra open mass spectrometry search algorithm retinoblastoma protein (prb), but not p107 or p130, is required for maintenance of enterocyte quiescence and differentiation in small intestine the proteomics identifications (pride) database and associated tools: status in 2013 characterization of recombinant human epidermal growth factor produced in yeast production of mouse epidermal growth factor in yeast: high level secretion using pichia pastoris strains containing multiple gene copies genistein, a tyrosine kinase inhibitor, reduces egf-induced egf receptor internalization and degradation in human hepatoma hepg2 cells genistein alters growth factor signaling in transgenic prostate model (tramp) genistein enhances relaxation of the spontaneously hypertensive rat aorta by transactivation of epidermal growth factor receptor following binding to membrane estrogen receptors-α and activation of g protein-coupled, endothelial nitric acid synthase-dependent pathway genistein increases epidermal growth factor receptor signaling and promotes tumor progression in 
advanced human prostate cancer the production of recombinant pharmaceutical proteins in plants sowing the seeds of success: pharmaceutical proteins from plants recombinant pharmaceuticals from plants: the plant endomembrane system as bioreactor commercialization of biopharmaceutical and bioindustrial proteins from plants efficacy of a food plant-based oral cholera toxin β subunit vaccine oral immunization with hepatitis β surface antigen expressed in transgenic plants immunogenicity of recombinant lt-b delivered orally to humans in transgenic corn expression of hepatitis b surface antigen (hbsag) gene in transgenic banana (musa sp) severe acute respiratory syndrome (sars) s protein production in plants: development of recombinant vaccine expression and immunogenicity of an escherichia coli k99 fimbriae subunit antigen in soybean protein expression systems: why soybean seeds?" in soybean-molecular aspects of breeding cosuppression of the α-subunits of β-conglycinin in transgenic soybean seeds induces the formation of endoplasmic reticulum-derived protein bodies silencing of soybean seed storage proteins results in a rebalanced protein composition preserving seed protein content without major collateral changes in the metabolome and transcriptome reduction of protease inhibitor activity by expression of a mutant bowman-birk gene in soybean seed epidermal growth factor augments adaptation following small bowel resection: optimal dosage, route, and timing of administration expression of functional recombinant human growth hormone in transgenic soybean seeds processing and localization of bovine β-casein expressed in transgenic soybean seeds under control of a soybean lectin expression cassette embryo-specific silencing of a transporter reduces phytic acid content of maize and soybean seeds metabolically engineered oilseed crops with enhanced seed tocopherol an alternative to fish oils: metabolic engineering of oilseed crops to produce omega 3 long chain polyunsaturated 
fatty acids correct targeting of proinsulin in protein storage vacuoles of transgenic soybean seeds high-level expression of basic fibroblast growth factor in transgenic soybean seeds and characterization of its biological activity expression of correctly processed human growth hormone in seeds of transgenic tobacco plants genistein analogues: effects on epidermal growth factor receptor tyrosine kinase and on stress-activated pathways increased sulfur amino acids in soybean plants overexpressing the maize 15 kd zein protein enhanced levels of methionine and cysteine in transgenic alfalfa (medicago sativa l.) plants over-expressing the arabidopsis cystathionine γ-synthase gene transgenic soybean plants overexpressing o-acetylserine sulfhydrylase accumulate enhanced levels of cysteine and bowman-birk protease inhibitor in seeds effects of proteome rebalancing and sulfur nutrition on the accumulation of methionine rich δ-zein in transgenic soybeans front key: cord-009792-e2vvi8qo authors: pandit, sb; balaji, s.; srinivasan, n. title: structural and functional characterization of gene products encoded in the human genome by homology detection date: 2008-01-03 journal: iubmb life doi: 10.1080/15216540400006105 sha: doc_id: 9792 cord_uid: e2vvi8qo availability of the human genome data has enabled the exploration of a huge amount of biological information encoded in it. there are extensive ongoing experimental efforts to understand the biological functions of the gene products encoded in the human genome. however, computational analysis can aid immensely in the interpretation of biological function by associating known functional/structural domains to the human proteins. in this article we have discussed the implications of such associations. the association of structural domains to human proteins could help in prioritizing the targets for structure determination in the structural genomics initiatives. 
the protein kinase family is one of the most frequently occurring protein domain families in the human proteome while p‐loop hydrolase, which comprises many gtpases and atpases, is a highly represented superfamily. using the superfamily relationships between families of unknown and known structures we could increase structural information content of the human genome by about 5%. we could also make new associations of domain families to 33 human proteins that are potentially linked to genetically inherited diseases. iubmb life, 56: 317‐331, 2004 an important milestone in the history of humankind was reached at the turn of the 21st century with the release of the draft version of the human genome data (1, 2) . this monumental effort provides us with an opportunity to better understand various biological processes and, possibly, cognition. the functional characterization of the gene products encoded in the human genome could provide insights into such complex biological processes. there are various experimental endeavors to understand the functions of gene products encoded in the human genome. in addition to these experimental efforts, computational analyses of human proteins could form an important step in the functional inference of the genome data. the computational approaches for the prediction of functional features of proteins encoded in genomes rely on establishing relationships to homologues that have been experimentally studied. there have been several attempts, using various sophisticated homology search tools, to assign functions to gene products encoded in various proteomes (see for example references 3 -8) . such functional predictions could be used as a guide to direct the relatively time consuming, more difficult and expensive experimental methods for exploring protein functions. furthermore, functional inferences of human proteins that are implicated in diseases could provide valuable insights into the molecular basis of human diseases. 
such an understanding could aid the identification of effective drug targets and the rational design of lead compounds to combat the diseases. the most commonly used computational approach for genome-wide association of functions to proteins is the identification of well-characterized homologues using sequence-based search procedures such as blast (9) and fasta (10) . however, search procedures based on pairwise sequence alignment are unlikely to identify related proteins with low sequence similarity. such distantly related proteins can often be identified with the use of three-dimensional (3-d) structural information (11) , as structure is conserved better than sequence during evolution (12, 13) . thus, use of structural information could potentially enhance the functional assignments (14 -17) . moreover, structure prediction with relevant biochemical motifs can provide more detailed functional insights than sequence comparisons alone (18 -20) . the search methods for such an analysis could be improved by the use of multiple sequence alignments of the homologues in a family, which can indicate structurally/functionally important positions. the information in these multiple sequence alignments can be converted into position specific scoring matrices (pssm), usually referred to as profiles (21) , or into a probabilistic model called the hidden markov model (hmm) (22) . the use of profile-based search methods is known to improve the sensitivity of detection of remotely related homologues (23 -28) . hence, the use of structure- and profile-based methods would enable detection of remote homologues and thus enrich the functional assignments. some of the commonly used profile-based search methods include psi-blast (25) , impala (29) , rps-blast and hidden markov model (hmmer2)-based (30) procedures. 
these methods have been shown to detect remote and subtle similarities (31, 32) between proteins that were previously possible only by structure comparison procedures, which obviously demand the knowledge of 3-d structures. although the profile-based annotation methods are among the widely used procedures to detect remote similarities, there are other procedures such as genthreader (33) and environment-based profiles (34) , which are fold recognition methods for assigning the structural domains to amino acid sequences. furthermore, the comparison of proteins across various genomes could also aid in enhancement of the assignment of functions to the proteins (34 -41) . the comparative genomics methods are based on the functional characterization of the gene products by detecting the orthologous proteins in the closely related organisms where the experimental functions of the proteins have been proposed. however, the effectiveness of this approach is dependent on at least two factors: (1) ability to identify homologues of a given protein in other organisms. (2) the extent of divergence of amino acid sequences of the homologues across the organisms and its implication on the similarity of functions between the two proteins. in order to understand the biological function of the human proteins we have associated functional or structural domains using sensitive profile-matching procedures. association of functional domain would provide clues to biochemical role of the protein. the structural domain association could provide enhanced abilities to assign function and also provide the molecular basis of action of proteins. furthermore, we have enhanced the structural information content of human genome by relating families apparently with unknown structures to known structural families, as in the supfam database, which was developed by us earlier (26, 27) . 
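the profile idea described in the introduction, in which a multiple sequence alignment of family members is converted into a position specific scoring matrix (pssm), can be illustrated with a minimal sketch. this is not the procedure used by psi-blast, impala or rps-blast, which apply sequence weighting and substitution-matrix-based pseudocounts; it assumes a uniform background frequency, a flat pseudocount, and a hypothetical toy alignment (`toy_alignment`), and the function names `build_pssm` and `score` are illustrative only.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Build a log2-odds PSSM from equal-length aligned sequences.

    Assumptions (for illustration only): uniform background frequency
    of 1/20, a flat pseudocount per residue, gap characters ignored.
    Returns one dict per alignment column, mapping residue -> score.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        # Count only standard residues observed in this column.
        counts = Counter(s[col] for s in alignment if s[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score(pssm, segment):
    """Sum the column scores of an ungapped segment against the profile."""
    return sum(column[aa] for column, aa in zip(pssm, segment))


# Hypothetical toy alignment, not taken from any real protein family.
toy_alignment = ["GKSTL", "GKSSL", "GKTTI", "GRSTL"]
pssm = build_pssm(toy_alignment)
print(score(pssm, "GKSTL"), score(pssm, "WWWWW"))
```

a segment resembling the aligned family scores higher than an unrelated one because conserved columns contribute positive log-odds values, which is the property that lets profile searches pick up remote homologues that pairwise comparison can miss.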
the functional domain information was taken from the pfam database (http://www.sanger.ac.uk/software/pfam) (42) and the structural domain information from the pali database (http://pauling.mbu.iisc.ernet.in/~pali) (43, 44) . pali contains structure-based sequence alignments and phylogeny of proteins in the families derived from the scop database (45) , which is a hierarchical structural classification database. the profiles for pfam and pali families have been generated as described by pandit et al. (26) . these profiles were searched using rps-blast. we have also used an hmm-based search procedure against the hmm libraries of pfam to associate functional domains to human proteins. profiles of transmembrane sequences have been associated with human proteins in order to predict membrane localisation of the proteins. using sensitive profile-matching procedures, we could make a comprehensive compilation of functional/structural domains for the gene products encoded in the human genome. similar attempts have been made in the past by muller et al. (38) . they used psi-blast for much of their analysis, apart from impala, to assign structural/functional domains. in their approach psi-blast was run against the nonredundant database of protein sequences augmented with scop domain sequences. in the present review we discuss the current status of large-scale function association, using various computational procedures, for the gene products encoded in the human genome. furthermore, human proteins potentially involved in diseases have been specifically analyzed by associating functional/structural domains to these protein sequences. the amino acid sequences of the open reading frames (orfs) that correspond to the gene products encoded in the human genome have been obtained from the ensembl database (46) (release 22.34d.1, http://www.ensembl.org). the total number of gene products predicted in this release is 29,031. 
the omim database (47) is a comprehensive collection of genes and genetic disorders in humans. in our analysis the protein sequences corresponding to the entries in the omim database have been derived from the swissprot database (48) . it is also possible to obtain details of genes involved in genetic disorders through the genelink table provided at the ensembl database (46) . the genelink table indicates the association of human proteins to omim identifiers. we were able to associate 6257 unique protein sequences, each having one or more references, to the omim database. the number of omim entries referenced in swissprot is 6770, as of the march 2002 release of the swissprot database. a possible reason for the difference in the numbers of genes could be that more than one genetic disorder entry in the omim database is associated with a given protein. in the data set used by muller et al. (38) there were 5856 proteins linked to omim database entries. we have used the profile-matching method rps-blast, which matches a sequence to sequence profiles obtained from structural (pali, release 2.2) and functional domain (pfam, version 10.0) families. we used a stringent e-value cut-off of 3 × 10^-5 in our search methods to ensure reliability of the domain association. this e-value cut-off has been extrapolated from the one reported by schaffer et al. (1999) (29) and is also based on benchmarking (n. s. mhatre, b. anand and n. srinivasan, unpublished results) using the database of structure-based sequence alignments of similarly folded proteins. we have used an hmmer-based procedure against pfam hmm profiles with an e-value cut-off of 10^-2 to extract reliable domain associations. subsequent to functional/structural domain identification, all the sequences were subjected to tmhmm2.0 (49) in order to assign transmembrane helical regions. 
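the thresholding step in the search protocol above can be sketched as follows. this is only an illustration of filtering domain hits by a method-specific e-value cut-off, reading the cut-offs as 3 × 10^-5 for rps-blast and 10^-2 for hmmer; the `DomainHit` record, the example hits and the choice of an inclusive comparison are assumptions made for the sketch and do not reproduce the output format of either program.

```python
from dataclasses import dataclass

# Cut-offs as read from the methods text; treating them as inclusive
# upper bounds is an assumption of this sketch.
EVALUE_CUTOFFS = {"rpsblast": 3e-5, "hmmer": 1e-2}


@dataclass(frozen=True)
class DomainHit:
    """Hypothetical record for one sequence-to-profile match."""
    protein_id: str
    family: str
    method: str   # "rpsblast" or "hmmer"
    evalue: float


def reliable_hits(hits):
    """Keep only hits whose e-value passes the cut-off for their method."""
    return [h for h in hits if h.evalue <= EVALUE_CUTOFFS[h.method]]


# Made-up example hits; identifiers and values are illustrative only.
example_hits = [
    DomainHit("P1", "pkinase", "rpsblast", 1e-20),
    DomainHit("P2", "SH3", "rpsblast", 1e-3),   # fails the rps-blast cut-off
    DomainHit("P2", "SH3", "hmmer", 1e-3),      # passes the hmmer cut-off
    DomainHit("P3", "PH", "hmmer", 0.5),        # fails the hmmer cut-off
]
kept = reliable_hits(example_hits)
for hit in kept:
    print(hit.protein_id, hit.family, hit.method, hit.evalue)
```

keeping the cut-off per method in one table makes it explicit that the same hit can be accepted by one search procedure and rejected by another, which is why the two procedures are applied side by side.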
the functional and structural domain assignments for the human proteome, along with those for other organisms, are made publicly available at: http://hodgkin.mbu.iisc.ernet.in/~human. the following sections discuss some of the interesting results derived by analysing this large dataset of human proteins. we could assign a total of 52,297 functional/structural domains to 21,835 (75%) human proteins out of 29,031 proteins encoded in the human genome. we also surveyed for transmembrane regions in human proteins, since this would suggest putative localization of these proteins to the membrane and hence could aid in function prediction. using tmhmm (49) we could identify transmembrane regions in 6777 gene products. of these, 5424 are found to be present in combination with extracellular or intracellular functional/structural domains. the total number of residues covered in structural/functional domain and transmembrane region assignments is about 42% of the proteome. these functional/structural domain assignments would indicate probable biochemical functions for the assigned proteins, which could be useful for biological function prediction. a total of 7196 human proteins with no domain assignment, and hence with no function or cellular localization information, could form an interesting set for experimental exploration of their properties and biological roles. the association of gene products with structure can give valuable insights, since structural information provides molecular details of the function of a protein. the structural domain assignment will also help in prioritizing targets for structural genomics consortia by indicating gene products with no structural predictions. 
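the coverage figures reported at the start of this section can be cross-checked with simple arithmetic. the sketch below uses only the counts quoted in the text; the variable names are illustrative, and the derived values (for example the number of transmembrane proteins without any other assigned domain) are computed here and are not stated by the authors.

```python
# Counts quoted in the text.
total_proteins = 29_031
proteins_with_domains = 21_835
assigned_domains = 52_297
tm_proteins = 6_777
tm_with_other_domains = 5_424

# Proteins left without any domain assignment.
unassigned = total_proteins - proteins_with_domains

# Fraction of the proteome with at least one assigned domain.
coverage_percent = 100.0 * proteins_with_domains / total_proteins

# Derived here, not reported in the text: transmembrane proteins with
# no accompanying functional/structural domain, and the mean number of
# domains per protein that received an assignment.
tm_only = tm_proteins - tm_with_other_domains
domains_per_protein = assigned_domains / proteins_with_domains

print(unassigned)                      # 7196
print(round(coverage_percent, 1))      # 75.2
print(tm_only)                         # 1353
print(round(domains_per_protein, 2))
```

the first two values agree with the 7196 unassigned proteins and the 75% coverage quoted in the text.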
with a view to enhancing the structural information available for the human genome, we have used the structural information in pali profiles, which are generated using structure-dependent sequence alignments of a large number of protein domain families, since the incorporation of 3-d structural information could aid in effective detection of remotely related proteins. using pali profiles alone, we could associate an additional 1191 structural domains to 1076 human proteins that are remotely related. furthermore, we tried relating families with unknown structures to known structural families as in the supfam database, which was developed by us earlier (26, 27) , in order to enhance information on the structural content of the human proteome. the supfam database relates two or more homologous protein families, of either known or unknown structure, using profiles derived from structure-based sequence alignments. integrating the relationships derived in supfam we could provide structural information for an additional ~5% of domain families (fig. 1 ). these family assignments would increase known structural content in the genome. a total of 2669 pfam families are assigned in the human genome, of which 1195 pfam families have structural information documented in pfam. out of the remaining 1474 pfam families, apparently with no structural information, 129 families could be related to a family of known structure in supfam. there are now 1324 families (~50%) with structural information known directly or indirectly through relationships present in supfam (fig. 1 ). these 1324 unique families with structural information are present in 40,947 domains, and hence would provide further insights into their functions. a total of 52,082 functional domains could be assigned to the human proteins. these assigned functional domains belong to 2669 sequence/functional families of the pfam database (42) . we have surveyed for the most commonly occurring pfam families in the human genome. fig. 
2 shows the most frequently occurring functional domain families in the human genome. the most frequently occurring family is the zf-c2h2 family, which is a classical zinc-finger domain with very short length (typically 25 residues). identification of such a family with short motifs using bioinformatics tools could be unreliable. hence, we did not consider them in our analysis. the other most frequently occurring globular protein family is protein kinase. it was previously shown that protein kinases occur with typical and atypical combination of domain families in the gene products encoded in human genome. these kinase domain-containing proteins are involved in a wide variety of biological roles (50) . among the other most frequently occurring pfam families, the majority are involved in or in part responsible for protein-protein interactions (immunoglobulin, ankyrin repeat, tpr domains), cell attachment adhesion function (fibronectin, collagen, cadherin domains), signalling function (ph, sh3, c2 domains), nucleic acid binding function (zf-c2h2, homoeobox, rrm domains). a considerable number of human proteins are characterized by short lengths, although they match significantly with protein domain families which are typically much longer. there are 129 functional families, associated in 1144 human proteins, apparently with no structural information but could be associated to distantly related families of known structures using relationship described in supfam. these 1144 proteins have 1184 number of domains. most of these 129 families correspond to enzymes. from the structural genomics perspective, structure association for 129 pfam families meant that clues about structure and function could be extended for 1144 proteins. some of the pfam protein families are known to be characteristic of prokaryotic organisms or viruses only. 
however, members of some of these families could be identified in the human genome in the current analysis, and these families are referred to as atypical families. we could associate 16 bacterial specific families and 18 viral specific families to 41 and 188 human proteins respectively. the bacterial and viral specific families identified in human, along with the associated gene products in the human genome, are listed in table 1a and 1b respectively. the complete list of proteins with the region of pfam domain assignment is made available at http://hodgkin.mbu.iisc.ernet.in/~human. the assigned domain families include, for example, cobalamin biosynthesis protein, minor capsid protein and bacteriophage lambda head decoration protein. such functions have not been shown before to be present in humans. there are two possible explanations for the occurrence of the bacterial and viral specific families in the human genome. the first explanation is that a superfamily relationship exists between the assigned bacterial or viral domain families and the corresponding eukaryotic domain families, as the sequence similarity of these domains with human proteins is low but significant. these regions in the human proteins could have diverged significantly, and sequence data corresponding to these families in other eukaryotes are currently lacking. an alternative possibility is horizontal gene transfer of these bacterial/viral specific families to humans. we surveyed the human genome for the occurrence of specific pfam families, which are known to be present only in eukaryotes. such eukaryote specific families are known to be involved in specific functions in eukaryotes. out of 2229 eukaryote specific pfam families, we could not associate 1054 eukaryotic specific families to the human proteome. further, we assessed reasons for the absence of these eukaryotic specific families in the human genome. most of these families are organism or lineage specific. 
some of them have no known functions and other, as mentioned below, are involved in functions not required or present in humans, hence not identified in the human proteome. the probable reasons for absence of eukaryotic specific pfam families in human genome are: (1) class of toxin families (which also includes snake and scorpion toxins) (2) families that are unique to the plant kingdom, like seed storage class of proteins, potato inhibitor and plant disease resistance response protein. the zf-c2h2 has been excluded from this histogram due to low reliability in assignments of short domains. protein srb, c. elegans sre g protein-coupled chemoreceptor and c. elegans srg family integral membrane protein. this analysis showed that human proteome has eukaryotic specific families (1175) which are involved in eukaryotes-like functions. however, absence of some of the eukaryotic specific families could be explained from the observations that such biochemical functions are undesirable for human or they are highly specific to lower eukaryotes. the pfam families, without known 3-d structure, could be clustered into sequence superfamilies and such superfamily relationships are documented in the supfam database. in the current release of supfam, 96 of the 3904 pfam families, with no structural information, could be clustered into 39 new potential superfamilies. it is expected that members of all the families in each new potential superfamily would share the same fold and might have gross similarity in their functional properties. these relationships could help in prioritizing the target for structural genomics, since the 3-d structural determination of one of the representative member in each superfamily would result in 39 structures that can serve as framework models. using these sequence superfamilies information we could identify 18 of the 39 sequence superfamilies in the human genome. 
the list of these new potential superfamilies with their constituent families identified in the human genome is given in table 2 . the 18 sequence superfamilies identified in the human genome consist of 25 pfam families, with no known 3-d structure for any of their members. there are 371 domains belonging to these 18 new potential superfamilies that could be assigned to 367 unique gene products in the human genome. hence, an experimental structure for 18 domains or proteins, one from each of these superfamilies, could provide templates for interpreting the functions of other members in the superfamily. this results in a substantial reduction (from 371 to 18) in the number of 3-d structures that need to be determined experimentally in order to get clues about their functions. these superfamilies may be considered as priority targets for structural genomics initiatives in order to improve the coverage of structural information for the human proteins. the nature of the superfamily relationships for some of the new potential sequence superfamilies that are identified in the human genome is discussed further. this superfamily consists of three families, namely the patched domain, acr_tran and secd_secf domain families. the acr_tran family is an integral membrane protein family whose members are known to be involved in drug resistance in bacteria (51) . the other family in this superfamily, patched domain, is a receptor for the morphogen sonic hedgehog and transduces hedgehog signals (52) . the secd_secf family consists of various prokaryotic secd and secf protein export membrane proteins (53) . we could identify 16 human proteins with the patched family domain assigned. the functional and structural elucidation of the other two families could be extended to the patched domain because of the superfamily relationships. this superfamily has four families clustered together, viz. sugar_tr, oatp_c, duf791 and duf894. 
the sugar_tr family is single-polypeptide capable only of transporting small solutes, such as sugar, in response to chemiosmotic ion gradients and lies in uniporter-symporter-antiporter family (54) . oatp_c is eukaryotic organic-anion-transporting polypeptides that vary in tissue distribution and substrate specificity (55) . the functions of dufs (domains of unknown function) are not known. we could associate sugar_tr and oatp_c domains to 85 and 16 gene products respectively. this superfamily constitutes methyltransf_4 and trna_u5-meth_tr pfam families. both families have methyltransferase activity, however, the trna_u5-meth_tr family is involved in methylation of t-rnas (56) . we could identify 1 and 5 human homologues of methyltransf_4 and trna_u5-meth_tr respectively. the gde_c and duf608 pfam families could be clustered in this superfamily. the gde_c family is glycogen branching enzyme and has glucosidase activity (57) . we could identify gde_c and duf608 in 3 and 2 human proteins respectively. from, this relationship it could be suggested that duf608 might have glucosidase-like activity. the 3-d information provides precise molecular details about the function of the protein. the association of gene products encoded in human genome to 3-d structures would assist in providing further insights into their function. the databases of protein structures in which domains with similar 3-d architecture are grouped together could be used for such structural analysis. we have used pali database derived from scop for the present analysis. scop classifies protein domain having high sequence and structural similarity into families. the families are grouped in superfamilies when they share similar functional features and have an evolutionary common ancestor. superfamilies are grouped in fold when major secondary structures are topologically equivalent with similar topological connectivity. 
the assignment of structural domains to the proteins would aid in the investigation of the preponderance of superfamilies and fold in the human genome. using the various search procedures we could associate 38,017 structural domains to 16,459 human proteins either directly or by using the sequence superfamily relationships as described in supfam. further, we classified these domain assignments at the level of fold or superfamilies to understand the most commonly used function present in human genome. we analyzed the most commonly occurring superfamilies in human proteome. the figure 3 shows the top few superfamilies along with their extent of representation in the human genome. this distribution of superfamilies is similar to the one obtained by muller et al. (38) . the most commonly occurring superfamily is c2h2 zinc finger, followed by immunoglobulin domain. because of the short length of c2h2 zinc-finger domain and associated low complexity region, there is bias in identification of these domains. hence, all the gene products having this domain might not have zinc-finger like function and we excluded them from our present analysis. p-loop containing nucleotide triphosphate hydrolases domain is the next most represented superfamily and it is involved in many different critical biological functions such as cell growth, differentiation, repair and modification of dna, transcription, etc. this superfamily comprises various atpases and gtpases that are essential for cell survival. for example gtpases include elongation factors, ga subunit of the heterotrimeric g-proteins that are absolutely critical in major cellular processes. the other superfamilies among frequently occurring superfamily are involved in various functions in the cell as cellular signalling (protein kinase-like, ph domainlike), cell adhesion (cadherin, fibronectin type iii), nucleic acid binding function (rna-binding domain). 
interestingly, the 'family a g protein-coupled receptor-like' superfamily, which consists of many receptors, is another of the most populous superfamilies. the complete list of structural superfamilies that occur in the human genome, with their respective frequencies of occurrence, is provided at http://hodgkin.mbu.iisc.ernet.in/~human. figure 4 shows the population distribution of a few of the most populated folds that occur in the human proteome. figure 5 shows the 3-d folding patterns in the most populated folds. the c2h2 and c2hc zinc finger is the most frequently occurring fold in the human genome. for the reasons mentioned before, we have excluded the c2h2 and c2hc zinc finger from this analysis. the ferredoxin-like fold has the highest number of superfamilies in the human proteome as well as in scop. however, only 16 of the 36 currently known superfamilies in the ferredoxin-like fold occur in the human proteome. the ribonuclease h-like motif fold has six of its seven currently known superfamilies in the human proteome. except for the superfamily of hypothetical protein mth1175 from methanobacterium, all other superfamilies of the ribonuclease h-like motif occur in the human proteome. these superfamilies are actin-like atpase domain, creatinase/prolidase n-terminal domain, ribonuclease h-like, translational machinery components, dna repair protein muts domain ii and methylated dna-protein cysteine methyltransferase domain. this could be expected as the nucleic acid binding/related superfamilies are highly represented in the human proteome. the complete list of protein structure folds that occur in the human genome, with their respective frequencies of occurrence, is provided at http://hodgkin.mbu.iisc.ernet.in/~human. 
the sequence to profile matching procedure described in the methods section resulted in the association of at least one functional domain family in pfam database to the 4864 proteins of swissprot database linked to omim entries (77.8% of the total of 6257 proteins in the omim database). the remaining 1393 disease-related proteins could not be associated to any functional or structural domain family. hence these proteins could be high priority targets in structural genomics to provide further insights into the molecular basis of the function of these proteins. these 4864 proteins contain 8431 functional and structural domains from 1288 pfam families. it is important to note that 6491 domains out of 8431 domains could be linked to 802 pfam families with known structural information. in terms of the amino acids coverage in these domains about 51% of the amino acids in the proteins are in the functional or structural domain (58) assigned regions in these 4864 proteins. figure 6 shows the distribution of the domains in the top 15 most populous families in the proteins, these families contain 2551 domains which is about 30% of all assigned domains. protein kinase is the most frequently occurring domain family in the human disease proteins. among the top 15 most populous families 14 have known structural information. the most populous structural superfamily that is assigned to the proteins is p-loop containing nucleotide triphosphate hydrolases and this has 361 domains in it. the largest representations in the p-loop superfamily come from the domain families like ras, helicase_c, and dead. the list of highly populated superfamilies has much in common with the analogous list generated by muller et al. (38) . much of these highly represented superfamilies are associated with regulatory roles in development, differentiation and proliferation. 
further analysis revealed that there are 33 proteins that have been assigned additional functional domains apart from previously assigned functional domains. these 33 proteins are listed in table 3 . these newly assigned domains may play a significant role in furthering our understanding of overall functions of these proteins. using various methods of domain association we could associate at least one domain to about 75% of gene products in the human genome. interestingly, the assignments of remote homologues to the human proteins revealed the occurrence of some of the viral and bacterial specific proteins in the human genome. among most commonly occurring functional family, protein kinases is one of the most frequently occurring domains, and the p-loop containing nucleotide triphosphate hydrolases is the one of the most represented superfamily. the assignment of 1184 domains to families with apparently no structural information to structural families would aid in the prioritization of targets for structural genomics of human genome. the assignment of new domains in addition to previously identified domains to the proteins possibly linked to genetically inherited human diseases could form a basis for the experimental verification of the roles of these domains as well as the molecular basis of disease. 
initial sequencing and analysis of the human genome the sequence of the human genome predicting function: from genes to genomes and back structural assignments to the mycoplasma genitalium proteins show extesive gene duplication and domain rearrangements the cath extended protein-family database providing structural annotations for genome sequences trends in protein evolution inferred from sequence and structure analysis studying genomes through the aeons: protein families, pseudogenes and proteome evolution predicting protein function by genomic context: quantitative evaluation and qualitative inferences basic local alignment search tool improved tools for biological sequence comparison distant homology recognition using structural classification of proteins the relation between the divergence of sequence and structure in proteins protein evolution. how far can sequences diverge? how representative are the known structures of the proteins in a complete genome? a comprehensive structural census homology-based fold predictions for mycoplasma genitalium proteins the relationship between protein structure and function a comprehensive survey with application to the yeast genome enhanced genome annotation using structural profiles in the program 3d-pssm predicting structures for genome proteins from protein structure to function genomic-scale comparison of sequence-and structure-based methods of function prediction: does structure provide additional insight? profile analysis: detection of distantly related proteins hidden markov models in computational biology. 
applications to protein modeling fold and function predictions for mycoplasma genitalium proteins applying motif and profile searches gapped blast and psi-blast: a new generation of protein database search programs supfam-a database of potential protein superfamily relationships derived by comparing sequence-based and structure-based families: implications for structural genomics and function annotation in genomes supfam: a database of sequence superfamilies of protein domains enhanced functional and structural domain assignments using remote similarity detection procedures for proteins encoded in the genome of mycobacterium tuberculosis impala: matching a protein sequence against a collection of psi-blast-constructed position-specific score matrices profile hidden markov models benchmarking psi-blast in genome annotation identification of related proteins on family, superfamily and fold level genthreader: an efficient and reliable protein fold recognition method for genomic sequences a method to identify protein sequences that fold into a known three-dimensional structure comparative genome and proteome analysis of anopheles gambiae and drosophila melanogaster the identification of functional modules from the genomic association of genes genecensus: genome comparisons in terms of metabolic pathway activity and protein family sharing structural characterization of the human proteome protein fold recognition using sequence profiles and its application in structural genomics functional and structural genomics using pedant genome sequences and great expectations the pfam protein families database pali-a database of phylogeny and alignment of homologous protein structures integration of related sequences with protein three-dimensional structural families in an updated version of pali database scop: a structural classification of proteins database for the investigation of sequences and structures the ensembl genome database project omim passes the 1000-disease-gene mark 
the swiss-prot protein sequence database and its supplement trembl in 2000 a hidden markov model for predicting transmembrane helices in protein sequences the repertoire of protein kinases encoded in the draft version of the human genome: atypical variations and uncommon domain combinations acrab efflux pump plays a major role in the antibiotic resistance phenotype of escherichia coli multiple-antibiotic-resistance (mar) mutants the drosophila patched gene encodes a putative membrane protein required for segmental patterning secd and secf are required for the proton electrochemical gradient stimulation of preprotein translocation major facilitator superfamily. microbiol molecular identification and characterization of novel members of the human organic anion transporter (oatp) family dual function of the trna (m(5)u54) methyltransferase in trna maturation identification of the catalytic residues of bifunctional glycogen debranching enzyme the human serum paraoxonase/arylesterase gene (pon1) is one member of a multigene family setor: hardware lighted three-dimensional solid model representations of macromolecules this research is supported by the award of senior fellowship to n.s. by the wellcome trust, london as well as by the computational genomics initiative supported by the department of biotechnology, new delhi. s.b. and s.b.p. are supported by the wellcome trust, london and csir, new delhi respectively. key: cord-001435-ebl8yc92 authors: hoppe, sebastian; bier, frank f.; von nickisch-rosenegk, markus title: identification of antigenic proteins of the nosocomial pathogen klebsiella pneumoniae date: 2014-10-21 journal: plos one doi: 10.1371/journal.pone.0110703 sha: doc_id: 1435 cord_uid: ebl8yc92 the continuous expansion of nosocomial infections around the globe has become a precarious situation. 
key challenges include mounting dissemination of multiple resistances to antibiotics, the easy transmission and the growing mortality rates of hospital-acquired bacterial diseases. thus, new ways to rapidly detect these infections are vital. consequently, researchers around the globe pursue innovative approaches for point-of-care devices. in many cases the specific interaction of an antigen and a corresponding antibody is pivotal. however, the knowledge about suitable antigens is lacking. the aim of this study was to identify novel antigens as specific diagnostic markers. additionally, these proteins might be aptly used for the generation of vaccines to improve current treatment options. hence, a cdna-based expression library was constructed and screened via microarrays to detect novel antigens of klebsiella pneumoniae, a prominent agent of nosocomial infections well-known for its extensive antibiotics resistance, especially by extended-spectrum beta-lactamases (esbl). after screening 1536 clones, 14 previously unknown immunogenic proteins were identified. subsequently, each protein was expressed in full-length and its immunodominant character examined by elisa and microarray analyses. consequently, six proteins were selected for epitope mapping and three thereof possessed linear epitopes. after specificity analysis, homology survey and 3d structural modelling, one epitope sequence gavvalsttfa of kpn_00363, an ion channel protein, was identified harboring specificity for k. pneumoniae. the remaining epitopes showed ambiguous results regarding the specificity for k. pneumoniae. the approach adopted herein has been successfully utilized to discover novel antigens of campylobacter jejuni and salmonella enterica antigens before. now, we have transferred this knowledge to the key nosocomial agent, k. pneumoniae. 
by identifying several novel antigens and their linear epitope sites, we have paved the way for crucial future research and applications including the design of point-of-care devices, vaccine development and serological screenings for a highly relevant nosocomial pathogen. klebsiella pneumoniae is a gram-negative, facultatively anaerobic, rod-shaped bacterium belonging to the family of enterobacteriaceae. it is a non-motile, lactose-fermenting organism, which has been known to cause severe lung damage if aspirated. other clinical symptoms common with klebsiella pneumoniae infections encompass urinary tract infections (uti) and wound infection potentially causing bacteremia and septicemia [1]. in recent years it has become one of the most persistent nosocomial agents, especially due to the increasing distribution of multiple resistances to antibiotics. the most prominent group of k. pneumoniae harboring a broad resistance spectrum incorporates the extended-spectrum beta-lactamase (esbl) expressing strains. due to their outstanding clinical relevance and occurrence as agents of nosocomial infections, it is highly desirable to rapidly detect the presence of these organisms and to find suitable measures to effectively counter any infection in the early stages [2]. while numerous dna-based typing methods exist [3], these are often laborious and time-consuming. in contrast, user-friendly point-of-care devices applying antigen-antibody interactions would allow for a quick and reliable detection [4]. nevertheless, the knowledge about suitable antigens to be incorporated into such a device is scarce. thus, we have utilized a method to quickly assess novel immunogenic proteins of k. pneumoniae, which might serve as potential targets for a diagnostic tool. recently, we have successfully employed this approach to unveil immunogenic proteins for both campylobacter jejuni [5] and salmonella enterica [6]. 
concisely, prokaryotic cdna libraries are created, fusion proteins expressed and these constructs covalently attached to microarray surfaces via the use of a halotag (promega). subsequently, the microarrays are screened using polyclonal antibodies reactive to the donor species of the cdna. therefore, this approach enables a broad and reliable screening, while reducing cross-reactivity and background to a minimum [7] due to the highly selective and covalent binding of halotag to its specific ligand [8] . moreover, the high specificity of said interaction renders excessive protein purification steps normally encountered in microarray-based screening applications [9] obsolete. thus, it is a faster and more direct approach as spotting combines both the deposition of samples and enables the immediate purification by a simple washing step. in connection with the above screening approach, cdna derived expression libraries were generated to express a vast number of proteins from k. pneumoniae. these libraries offer the advantage of a smaller sample size as compared to genomic libraries. this is mainly due to genomic libraries encompassing highly truncated dna fragments as well as dna representing regions that do not encode proteins in the original organism. contrastingly, cdna libraries generated via the in-fusion smarter directional cdna library construction kit (clontech) have been known to lead to longer fragments and possess a high abundance of full-length clones [10] . thus, the overall number of clones required for screening is substantially reduced. still, prokaryotic cdna libraries display one main disadvantage. as bacterial mrna rarely contains a poly(a)-tail [11, 12] , isolation of the mrna from other rna species is tedious. while some methods exist to isolate the mrna prior to reverse transcription [13, 14] , we rather chose to normalize the cdna afterwards. consequently, the entire rna of k. pneumoniae was used for reverse transcription. 
next, normalization was performed using a duplex-specific nuclease [15]. this treatment has been shown to effectively reduce the highly abundant rrna-derived cdna portions without introducing a bias, thus shifting the overall composition of the cdna in favour of the mrna-derived molecules [16]. in addition, ligation-independent cloning and electroporation were employed to enhance cloning efficiency [17]. in this work, we have screened 1536 clones to detect the presence of previously unknown immunogenic proteins. in summary, we identified 14 proteins that have not been described as immunogenic before. after further analyses and epitope mapping of several promising candidates, three proteins (a channel receptor, a putative transport protein and a hypothetical protein) revealed linear antigenic sites with varying specificity. our results offer the potential to be used for a wide array of applications including the generation of monoclonal antibodies that might be used in diagnostic examination. furthermore, several of the identified antigens or parts thereof might be suitable for vaccine development, either used in passive or active immunization. additionally, many virulence-associated factors harbor some immunogenic potential. thus, identification of novel immunogenic proteins might elucidate proteins involved in the pathogenicity and virulence of k. pneumoniae. consequently, this advances the understanding of this pathogen and illuminates new approaches to counter infections. the mean rin value of the rna used for cdna generation and library construction was 7.3 ± 1.3 (n = 5). after successful normalization, the cdna was cloned to create an expression library. of this library, 1536 different clones were screened using halolink slides. 
the known immunogenic proteins [18], outer membrane protein 1a ompf (uniprot/swiss-prot: a6t721) and outer membrane protein 3a ompa (a6t751), were used as positive references, whereas both dihydroorotase pyrc (a6t7d6) and glyceraldehyde-3-phosphate dehydrogenase gapa (a6t8l2) served as negative references. after screening, the signal intensities of each sample were compared to the references and grouped accordingly. three distinct groups were established. group i represents samples exceeding the intensities of both positive reference proteins, group ii encompasses samples ranging between the intensities of ompa and ompf, and group iii comprises those samples that, while showing higher intensities than the negative controls, are below both ompa and ompf. here, approximately 25% of samples belong to group i, while groups ii and iii contain 7% and 14%, respectively. the remaining samples, 54%, fall into the same range as the negative protein references. consequently, 192 clones, or 12.5% of the 1536 clones screened, were selected for sequencing. these clones were all taken from group i. after sequencing, artificial fragments and known antigens were discarded to reveal potentially novel immunogenic proteins; see table s1 for a list of the initial identification via sequencing. some of the inserts were too heavily truncated to reliably identify the corresponding gene. moreover, in some cases subcloning the initially identified genes in full length failed even after numerous attempts. therefore, those genes were removed from further characterization, as the translated peptide fragments were too short to be significant. despite these limitations, 14 potentially novel immunogenic proteins were expressed in full length and used for further characterization. the 14 antigen candidates are summarized in table 1 including their locus tag, protein name, length and size in kda. 
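the grouping rule described above can be sketched in a few lines. this is a minimal illustration, not the authors' analysis code: the function name, the handling of values that fall exactly on a reference intensity, and the use of a single `negative_max` cut-off (the higher of the two negative references) are all assumptions.

```python
def classify_signal(intensity, ompa, ompf, negative_max):
    """Assign a clone to group I, II, III or 'negative'.

    ompa / ompf: intensities of the two positive references.
    negative_max: highest intensity among the negative references
                  (hypothetical cut-off, not stated in the text).
    """
    low_ref, high_ref = sorted((ompa, ompf))
    if intensity > high_ref:       # above both positive references
        return "I"
    if intensity >= low_ref:       # between the two positive references
        return "II"
    if intensity > negative_max:   # above negatives, below both positives
        return "III"
    return "negative"


# Fraction of the screened library that was sent for sequencing.
sequenced_fraction = 192 / 1536
```

with the counts quoted in the text, `sequenced_fraction` evaluates to 0.125, matching the stated 12.5%.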
the difference in immunodominant behaviour was assessed by ten independent microarray and elisa analyses employing two different antibodies. table 1 lists the resulting mean q values and corresponding errors (n = 10). a q value above one represents higher signal intensities than ompa, the positive reference used. the highest mean q value was obtained for kpn_02199, a coa-linked acetaldehyde dehydrogenase and iron-dependent alcohol dehydrogenase; pyruvate-formate-lyase deactivase, with 2.16 ± 0.39. however, as this protein is highly conserved within all bacteria, it was not considered for further analyses regarding specific antigenic sites. the same holds true for kpn_03668, the 50s ribosomal protein l11 methyltransferase. although a q value of 1.70 ± 0.38 was attained, the highly conserved nature of this protein renders it unsuitable for a diagnostic approach. in contrast, kpn_00466 and kpn_01584, two hypothetical proteins, were selected for future investigations as little is known about the function of these proteins. their mean q values were 1.15 ± 0.22 and 1.90 ± 0.36, respectively. moreover, kpn_01584 shows some homology to a known superantigen of yersinia pseudotuberculosis according to the ncbi protein cluster database [19]. thus, the potential utilization of these proteins within a diagnostic tool seems plausible. in addition, kpn_01100, a histidine triad protein, kpn_02202, glucose-1-phosphate uridylyltransferase, kpn_00363, a nucleoside channel and receptor of phage t6 and colicin k, and kpn_00459, a putative transport protein, were selected for further investigations via epitope mapping. while kpn_01100 and kpn_02202 revealed mean q values above one with 1.31 ± 0.27 and 1.69 ± 0.56, respectively, kpn_00459 and kpn_00363 failed to reach this level. rather, the q values attained were 0.83 ± 0.27 and 0.77 ± 0.14. 
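the q value is described only as an intensity normalized to the positive reference ompa, so its exact formula is not given here. the following is a minimal sketch, assuming that q is the per-replicate ratio of sample intensity to ompa intensity and that the reported error is the sample standard deviation; both are assumptions, and the function names are hypothetical.

```python
from statistics import mean, stdev


def q_values(sample_intensities, ompa_intensities):
    """Per-replicate q values: sample signal divided by the OmpA signal
    measured in the same replicate (assumed definition)."""
    if len(sample_intensities) != len(ompa_intensities):
        raise ValueError("need one OmpA reading per sample reading")
    return [s / r for s, r in zip(sample_intensities, ompa_intensities)]


def summarize(qs):
    """Mean q value and sample standard deviation over the replicates."""
    if len(qs) < 2:
        raise ValueError("at least two replicates are required")
    return mean(qs), stdev(qs)


def exceeds_reference(mean_q):
    """A mean q above one means a stronger signal than OmpA."""
    return mean_q > 1.0
```

under this reading, a protein such as kpn_00363 (mean q of 0.77) would return `False` from `exceeds_reference`, while kpn_02199 (mean q of 2.16) would return `True`.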
while the q values of the latter two proteins are lower compared to some of the other proteins identified, their functional descriptions and membrane association render them highly attractive in a diagnostic context, as they might be more easily accessible in a whole cell detection approach than cytoplasmic proteins. epitope mapping revealed the potential presence of linear epitopes within three of the six proteins investigated, namely kpn_00363, kpn_00459 and kpn_00466. for kpn_00363, seven distinct regions were identified with intensities above 1000 a.u., see figure 1. these comprised the peptides 2 and 3, 16-18, 29-30, 45-47, 50-51, 60-63 and 69-71. the highest mean value for these peptides was obtained for peptide 16 with more than 10000 a.u. the positive reference, i.e. rabbit igg, reached a mean value of more than 30000 a.u., whereas the negative control mbp showed intensities of less than 500 a.u. as adjacent peptides are identical in all but 4 amino acids, a consensus can easily be derived from two or more neighboring peptides. as figure 2 reveals, only the first two peptides, llaagavvalsttfa and gavvalsttfaagaa, showed some specificity during specificity control assays. here, the arrays were incubated with additional antibodies reactive to different bacterial species, namely campylobacter jejuni, staphylococcus aureus, e. coli and s. enterica. all remaining peptides display similar intensities with these antibodies as with the original k. pneumoniae antibodies. however, for the first two peptides a significant difference is observed. the mean value for peptide 2 was approximately 6500 a.u. with k. pneumoniae antibodies and dropped to less than 400 a.u. with the other antibodies. a similar trend is discernible for peptide 3, where a drop from 1100 a.u. to less than zero is visible. thus, the consensus sequence gavvalsttfa is likely a suitable linear epitope featuring specificity for k. pneumoniae. 
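the mapping logic described above (overlapping peptides, runs of neighboring peptides above 1000 a.u., and a consensus taken from adjacent peptides) can be sketched as follows. this is an illustration under stated assumptions: the peptide length of 15 and the offset of 4 are inferred from the two quoted peptides, the requirement of at least two neighboring peptides is an assumption, and all function names are hypothetical.

```python
def overlapping_peptides(sequence, length=15, offset=4):
    """Tile a protein sequence into overlapping peptides."""
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, offset)]


def regions_above(intensities, threshold=1000.0, min_length=2):
    """Find runs of consecutive peptide numbers whose mean intensity
    exceeds the threshold. `intensities` maps peptide number -> a.u."""
    hits = sorted(n for n, value in intensities.items() if value > threshold)
    regions, run = [], []
    for n in hits:
        if run and n != run[-1] + 1:      # gap: close the current run
            if len(run) >= min_length:
                regions.append((run[0], run[-1]))
            run = []
        run.append(n)
    if len(run) >= min_length:
        regions.append((run[0], run[-1]))
    return regions


def consensus(peptide_a, peptide_b):
    """Longest contiguous stretch shared by two neighboring peptides."""
    best = ""
    for i in range(len(peptide_a)):
        for j in range(i + len(best) + 1, len(peptide_a) + 1):
            if peptide_a[i:j] in peptide_b:
                best = peptide_a[i:j]
            else:
                break                      # longer extensions cannot match
    return best
```

applied to the two peptides quoted in the text, `consensus("llaagavvalsttfa", "gavvalsttfaagaa")` yields the eleven-residue stretch gavvalsttfa. the isolated high signal of a single peptide without strong neighbors would be ignored by `regions_above` with `min_length=2`.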
for the putative transport protein, kpn_00459, three regions scored intensities above 1000 a.u. including peptides 50 and 51, 59 and 60 as well as 79-81, see figure 3 . the positive reference again reached a mean value above 30000 a.u., while the negative control levelled out at less than 500 a.u. after specificity assays, the peptides 59 and 60 revealed predominantly specific binding by the k. pneumoniae antibody as the mean values dropped from approximately 5800 and 2800 a.u. respectively, to less than zero for all other antibodies tested, see figure 4 . thus, a consensus for this epitope can be derived to giafgavelfd. contrastingly, the remaining peptides revealed equal intensities independent of the antibody used. finally, for the third protein, kpn_00466, one pair of adjacent peptides, namely 11 and 12, achieved mean values of approximately 2000 a.u. within close proximity to the positive reference, as depicted in figure 5 . additionally, peptide 16 displayed the highest overall mean value with more than 7000 a.u., however neither of the two neighboring peptides (15 and 17) attained values of any significance. thus, the presence of a linear epitope in that region is rather unlikely. besides, as specificity assays illustrated, none of the three peptides showed specific interaction to the k. pneumoniae antibodies alone, see figure 6 . rather, signal intensities with antibodies reactive to c. jejuni fell into the same scope. therefore, nonspecific binding to these peptides is probable. the remaining three proteins under investigation failed to disclose any linear peptide region with significant signal intensities to assume linear epitopes to exist. for a better understanding of the suitability of the identified linear epitopes, structural modelling was employed using the swiss model automated mode. for kpn_00363 a model was constructed based on the crystal structure of the bacterial nucleoside transporter tsx of e. coli. 
however, the derived model only spans residues 31 to 294 and as such does not contain the derived consensus sequence of the linear epitope gavvalsttfa. nevertheless, the model displays a beta barrel structure typical of outer membrane-spanning transport proteins, see figure 7 for details. the first residues of the derived model, starting at position 31, are marked in orange and present a coiled region outside of the beta barrel. therefore, the likelihood of the gavvalsttfa region residing within the barrel is slim. rather, an extension of the truncated coil seems plausible. consequently, the potential accessibility of the identified linear epitope appears high. furthermore, the predicted 3d structure of the first part of kpn_00459 is displayed in figure 8. in contrast to kpn_00363, two models were devised for kpn_00459. however, only one encompasses the identified linear epitope giafgavelfd, which is highlighted in orange and is situated as part of an alpha helix and an adjacent loop. as the sequence is not enclosed by other residues or structures, good accessibility ought to be provided. in order to predict the potential specificity of the epitopes, homology analyses were performed. whereas giafgavelfd of kpn_00459 exhibits broad homology, with all residues identical in closely related species (see figure s3: homology of kpn_00459), gavvalsttfa of kpn_00363 features four residues likely specific for k. pneumoniae within this particular sequence. the variance of the latter epitope's sequence in species closely related to k. pneumoniae is summarized in figure 9. specifically, the following residues have been replaced: valine by leucine (position 4), both threonine residues by serine residues (positions 8 and 9) as well as alanine by threonine (position 11). the influence of sequence variations on the binding capacity of the identified epitope gavvalsttfa was subsequently examined by performing an alanine scan. 
its results are summarized in figure 10 . the original consensus epitope sequence shows a mean intensity of approximately 1000 a.u. alterations of the first, second, fifth, eighth and ninth residue lead to a significant drop of signal intensities. consequently, signals of less than 300 a.u., in close proximity to the negative control mbp with 50 a.u. are obtained. contrastingly, by altering residues three or six of the consensus sequence gavvalsttfa, i.e. replacing the first valine or leucine by alanine, the resulting mean signal intensities are significantly increased to more than 1700 a.u. for replacing valine and surpassing 5600 a.u. after changing leucine to alanine. additionally, specificity assays showed no significant signal intensity for antibodies reactive to closely related species, i.e. e. coli and s. enterica. incubation with antibodies reactive to either of those two bacterial species resulted in signal intensities in the neighbourhood of the negative control independent of sequence alterations, see figure 11 . this was also true for gavlalsssft and saalaltssft. the screening of a cdna based expression library of k. pneumoniae has successfully identified a number of novel, previously unknown immunogenic proteins. this is well in accordance to previous achievements of this method, which we have employed for the identification of both c. jejuni [5] and s. enterica [6] antigens. consequently, in this current study we aimed to detect suitable proteins for the specific identification of k. pneumoniae. furthermore, identification of specific antigens might improve treatment opportunities including the development of suitable vaccines [20] . finally, detecting proteins with yet unknown functional and structural information to be antigenic might be an indication of their potential involvement in the pathogenic nature of an organism. 
thus, these proteins might be suitable access points for future investigations to improve the understanding of the underlying pathogenicity and virulence of k. pneumoniae. of the 14 proteins, only a subset was selected for further analysis using epitope mapping. the rationale for these selections was based on three distinct features: the resulting q value, homology, and functional and structural properties. while the q value mirrors a normalized intensity compared to a positive reference, it does not fully account for differences in expression levels, misfolding and other factors which might have had an influence on the binding reaction and thus on the overall intensity. still, it makes it possible to rank the proteins, and a significantly high q value increases the likelihood that a given protein is immunogenic. however, proteins with outstanding q values were not chosen for epitope mapping by default. rather, the known homology and corresponding distribution of each protein was carefully evaluated. thus, proteins like kpn_02199 and kpn_03668, despite scoring high q values, were excluded from epitope mapping as they display a broad spectrum of homologous proteins across bacterial species. finally, the intrinsic properties of each protein, if available, were closely scrutinized. proteins with hypothetical character were predominantly chosen, as information on them is limited. in addition, all types of membrane-associated proteins deserved a closer look, as these types of proteins offer a more direct route and accessibility, which might be of utmost importance in a future rapid point-of-care device detecting whole organisms. after epitope mapping, three of the six proteins under investigation revealed sites with potential linear epitopes. still, the remaining proteins displayed no such sequences. this is, however, in accordance with the general prevalence of structural epitopes in comparison with linear epitopes in nature. 
in fact, approximately 90% of all epitopes are conformational rather than sequential [21, 22]. despite these potential shortcomings of the linear epitope mapping method used, a number of intriguing linear epitopes have been identified. notably, three proteins, kpn_00363, kpn_00459 and kpn_00466, harbored sites with potential linear epitopes. further careful examination, however, excluded kpn_00466 from future applications as the identified epitope sequences were nonspecific for k. pneumoniae, which might be due to the conserved character of the protein within the family enterobacteriaceae. still, kpn_00466 is a membrane protein and upcoming investigations might help to elucidate applications using this protein either for prevention of k. pneumoniae infections or detection thereof. in contrast, the other two proteins displaying linear epitopes, kpn_00363 and kpn_00459, indicated some specificity with the antibodies tested and two linear consensus sequences could be derived, gavvalsttfa and giafgavelfd, respectively. while the former sequence is present in kpn_00363, an ion channel protein tsx, which is conserved among enterobacteriaceae, the latter sequence is part of kpn_00459, a hypothetical protein with high similarity to a cation:proton antiport protein and conserved among proteobacteria. consequently, the homology of giafgavelfd (kpn_00459) is high throughout different bacterial species, which renders the sequence unsuitable for specific k. pneumoniae detection and other applications. in contrast, gavvalsttfa (kpn_00363) displays alterations within this sequence for bacterial species other than k. pneumoniae. small as these alterations are, they appear to have a crucial effect on the binding of antibodies to the target sequence and thus benefit specificity. this is well in line with the experimental results, which indicate no binding of antibodies reactive to other bacteria to this sequence or any of its alterations. 
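the sequence alterations mentioned above can be listed position by position with a short comparison. this is a minimal sketch; the function name is hypothetical, and pairing gavvalsttfa with the variant gavlalsssft (one of the two altered sequences quoted in the results) is an assumption, chosen because it reproduces the four substitutions described for closely related species.

```python
def substitutions(reference, variant):
    """List (position, reference residue, variant residue) for every
    mismatch between two equally long peptides (1-based positions)."""
    if len(reference) != len(variant):
        raise ValueError("peptides must have the same length")
    return [(pos, ref, var)
            for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
            if ref != var]


differences = substitutions("gavvalsttfa", "gavlalsssft")
for pos, ref, var in differences:
    print(f"position {pos}: {ref} -> {var}")
```

for this pair the comparison reports changes at positions 4, 8, 9 and 11, in agreement with the substitutions listed in the homology analysis.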
moreover, alanine scanning revealed a number of residues to be paramount for antibody binding. replacing the glycine (position 1), alanine (position 2), alanine (position 5) or either threonine (positions 8 and 9) leads to a significant loss of antibody binding, observable as a dramatic reduction in signal intensity. in contrast, replacing either the first valine (position 3) or the leucine (position 6) with an alanine residue results in a significant increase in signal intensity, potentially hinting at improved antibody binding. the reason for these effects remains unclear; however, it is not caused by a simple change in secondary structure. this is apparent as both aavvalsttfa and gavvaasttfa, two sequences at opposite ends regarding signal intensity, are predicted by emboss garnier to consist solely of an alpha-helix, whereas the original sequence is predicted to contain a beta-strand (positions 1 to 7) and a helical part (positions 8 to 11). furthermore, as the original sequence consists entirely of uncharged amino acids, alterations via alanine or glycine do not change the overall charge of the peptide. consequently, the differences in signal intensity cannot be attributed to the introduction or removal of charged residues. finally, changes in hydrophobicity, although present, are minimal and may not suffice to explain the observed characteristics. still, one has to bear in mind that the accuracy of prediction-based tools is limited, especially for emboss garnier with an accuracy in the range of 65%. consequently, some of the predicted secondary structures might differ. despite some uncertainty as to the mechanistic cause of the behaviour of the altered peptides, one fact remains obvious. none of the peptides showed any significant binding to antibodies not raised against k. pneumoniae. therefore, after alanine scanning, gavvalsttfa remains a suitable candidate for specific k. pneumoniae detection or treatment. 
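the set of peptides used in such a scan can be generated mechanically. the sketch below is an illustration only: it assumes that every residue is replaced by alanine in turn and that native alanine residues are replaced by glycine instead, which is consistent with the remark about alterations via alanine or glycine but is not spelled out in the text. the function name is hypothetical.

```python
def alanine_scan(peptide, fallback="g"):
    """Return (position, variant) pairs with one residue replaced each.

    Every residue is swapped for alanine; positions that already hold
    alanine are swapped for the fallback residue (assumed: glycine).
    Positions are 1-based.
    """
    variants = []
    for i, residue in enumerate(peptide):
        replacement = fallback if residue == "a" else "a"
        variants.append((i + 1, peptide[:i] + replacement + peptide[i + 1:]))
    return variants


scan = dict(alanine_scan("gavvalsttfa"))
```

under these assumptions, `scan[1]` and `scan[6]` reproduce the two variants named in the text, aavvalsttfa and gavvaasttfa.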
still, accessibility of the epitope is paramount to a quick diagnostic tool. thus, modelling of the 3d structure of the protein was performed. unfortunately, the 3d modelling only succeeded for the part of the protein that does not contain said linear epitope sequence. nevertheless, the pronounced beta barrel structure of the channel protein is visible, and the linear epitope sequence, albeit absent from the model, is likely to be an extension of the freely accessible n-terminal region outside the barrel structure. thus, the epitope might be readily accessible to an antibody without the need to enter the cell or channel. in bacteria, membrane proteins harboring a beta barrel like structure have been shown to be found exclusively in the outer membrane [23]. moreover, the mobile coils on the extracellular sides of these membrane proteins are pivotal for their function or interaction with other molecules. this renders the identified sequence an intriguing target for antibody-based detection. additionally, channel proteins harboring beta barrel like structures have been shown to be immunodominant in other bacterial species such as salmonella, haemophilus influenzae, e. coli, neisseria meningitidis, shigella dysenteriae and chlamydia trachomatis [24-29]. in contrast, the model for kpn_00459 encompasses the identified consensus sequence of the linear epitope giafgavelfd, which is part of a loop between two alpha helices. the abundance of alpha helices suggests that the protein spans the inner membrane [23]. in this topological design, helices are mostly located within the membrane, notably as transmembrane domains, whereas loops are located either on the cytoplasmic or periplasmic side of the cell. when considering prediction-based methods, such as s_tmhmm for topological domains [30, 31] or emboss antigenic [32, 33], part of giafgavelfd is assumed to be extracellular.
combined with the high flexibility and degrees of freedom of random coil structures, the likelihood of good accessibility is high, rendering the sequence a potentially attractive target for whole cell detection despite its lack of specificity. furthermore, these findings support the 3d model and underline the accuracy of the specified structure. another key aspect in determining the accuracy of the 3d model prediction is the so-called z-score [34]. the z-score is -4.285 for the model of kpn_00363 and -8.895 for the model of kpn_00459. although the values are significantly below zero, this does not necessarily indicate models of poor accuracy. in fact, low z-scores are often obtained if the protein under investigation is membrane-associated. this is mainly due to the inverse physicochemical properties of membrane proteins in comparison to soluble ones. hence, the low z-scores are more likely induced by this effect than caused by insufficient accuracy of the models. in conclusion, we have successfully identified several novel antigens of k. pneumoniae and identified three proteins potentially harboring linear epitopes. subsequently, we identified two sequences displaying specificity during experimental investigations; however, one of these is doubtful, as homology analysis has revealed it to be highly conserved among a broad spectrum of bacterial species. still, gavvalsttfa of kpn_00363 proved specific experimentally, and four residues within the eleven amino acid sequence were shown to occur predominantly in k. pneumoniae. thus, the likelihood for this linear epitope to be specific is high. this assumption was confirmed by alanine scanning, which revealed a number of residues pivotal for antibody binding. moreover, neither e. coli nor s. enterica antibodies were able to bind to any of the sequences, original or modified.
subsequent investigations might help to deepen insight into the suitability of this peptide for diagnostic and therapeutic applications. to this end, monoclonal antibodies ought to be generated for affinity investigations via biacore and to determine binding kinetics. furthermore, monoclonal antibodies could be used within a potential diagnostic tool and, after validation, ought to be tested with whole bacteria. if these antibodies are able to specifically detect intact k. pneumoniae cells, they might well be suited for integration into a point-of-care device. in a different approach, the identified epitope sequence could easily be produced in large quantity. this peptide might serve some role in serological screenings, especially if it proves to be immunodominant. in that case, antibodies against this epitope might be present in a plethora of patient sera. finally, all proteins identified here might be suitable candidates for vaccine development independent of the existence of a linear epitope, as structural epitopes might well be present and antigenicity ensured. nevertheless, additional in-depth analysis is required to determine a number of the key aspects of vaccine development prior to use.
as a donor of rna, the fully sequenced strain k. pneumoniae mgh78578 was grown on solid trypticase-soy-agar (tsa) for 24 h at 37 °c under aerobic conditions. for rna isolation, a liquid culture was prepared by inoculating 10 ml of brain-heart-infusion broth (bhi) with a single colony and incubating overnight at 37 °c, 140 rpm. this overnight culture was used to inoculate a flask containing 100 ml fresh bhi medium. the cells were harvested 6 h after inoculation.
for initial screening a rabbit polyclonal igg antibody to k. pneumoniae (acris ap00792pu-n) was used. for further microarray analyses of a subset of candidate proteins, elisa and epitope mapping, this antibody was used as well as a rabbit polyclonal igg antibody to k. pneumoniae (abcam ab20947).
the antibodies were generated with k. pneumoniae atcc 43816 serving as an immunogen. specificity assays were performed employing rabbit polyclonal igg antibodies to c. jejuni (acris ap24002pu-n), s. aureus (fitzgerald 20c-cr1274rp), e. coli (abcam ab137967) and s. enterica (abcam ab35156). detection was achieved using secondary antibodies: goat polyclonal antibody to rabbit igg conjugated with chromeo-546 (abcam ab60317) for a fluorescent readout and antibody conjugated with horseradish peroxidase (abcam ab6721) for a colorimetric readout were applied where appropriate.
the cells were harvested by centrifugation (2000 × g, 10 min) and the resulting supernatant discarded. the pellets were resuspended in fresh bhi medium. for stabilisation of the rna, 1 ml of rnaprotect bacteria reagent (qiagen) was added to 0.5 ml of bacterial suspension and processed according to the manufacturer's instructions. lysis was performed with 200 µl of lysis buffer (30 mm tris-cl, 1 mm edta, 15 mg/ml lysozyme, .12 mau proteinase k) by pipetting and vortexing for 10 s. after incubation, 700 µl buffer rlt and 500 µl 96% ethanol were added and the lysate applied to rneasy bacteria mini kit spin columns (qiagen) for rna isolation following the manufacturer's instructions. during this procedure an on-column dnase digest was performed using rnase-free dnase i solution (qiagen) according to the manufacturer's instructions. the isolated total rna was eluted in 50 µl of rnase-free water and its concentration and purity analyzed by nanodrop (peqlab) measurements. the quality of isolated rna was assessed using the rna 6000 pico kit and bioanalyzer 2100 (agilent). the total rna was diluted to a working concentration of 200-500 pg/µl. the analysis was performed following the manufacturer's instructions and the rna integrity number (rin) calculated by the 2100 expert software (agilent).
the rin is defined to fall into a range of 0 to 10, with a higher score indicating more intact rna, whereas lower numbers are associated with degraded rna molecules [35]. in order to use bacterial mrna as a substrate in cdna synthesis, polyadenylation was mandatory. the tailing was achieved using the poly(a) polymerase tailing kit (epicentre) following the alternate protocol offered by the manufacturer. briefly, up to 10 µg of total rna were combined with 2 µl poly(a) polymerase reaction buffer, 2 µl 10 mm atp, 0.5 µl riboguard rnase inhibitor and 1 µl poly(a) polymerase (4 u) in a total reaction volume of 20 µl. the reaction was incubated for 20 min at 37 °c, terminated by the addition of 1 µl 0.5 m edta and purified by rneasy mini kit (qiagen) following the manufacturer's instructions. yield and purity were determined by nanodrop measurements.
for cdna synthesis the in-fusion smarter directional cdna library construction kit (clontech) was used according to the manufacturer's instructions with slight modifications. 3.5 µl of total, polyadenylated rna were mixed with 1 µl of 3' in-fusion smarter cds primer, heated first for 3 min at 72 °c and then incubated for an additional 2 min at 42 °c. after addition of 5.5 µl mastermix (2 µl 5x first strand buffer, 0.25 µl 100 mm dtt, 1 µl 10 mm dntps, 1 µl 12 µm smarter v oligonucleotide, 0.25 µl rnase inhibitor and 1 µl smartscribe reverse transcriptase) the tubes were incubated for 90 min at 42 °c. the reaction was terminated at 68 °c for 10 min. for second strand cdna synthesis two 2 µl aliquots of the first strand reaction were used in long distance pcr using phusion polymerase (finnzymes). each pcr reaction was composed as follows: 2 µl first-strand reaction, 70 µl rnase-free water, 20 µl 5x phusion hf buffer and 2 µl each of dntp mix (10 mm), 5' pcr primer ii a (12 µm), 3' in-fusion smarter pcr primer (12 µm) and phusion polymerase, with a total reaction volume of 100 µl. the pcr reactions were subjected to a cycling program with 98 °c for 1 min as initial denaturation followed by 15 cycles of 10 s denaturation at 98 °c, 30 s of primer annealing at 65 °c and 6 min extension at 72 °c. for improved pcr results, optimization was performed as follows: 30 µl of the 15 cycle experimental tube were transferred to a separate pcr tube, cycling commenced and aliquots of 5 µl each were collected after 15, 18, 21, 24 and 27 cycles in total. the different cycles were compared by gel electrophoresis and the experimental tubes subjected to additional cycles if necessary. finally, pcr reactions were purified using the qiaquick pcr purification kit (qiagen). the purity and yield of each reaction were analyzed by nanodrop measurements.
normalization of double-stranded cdna was achieved with the trimmer-2 cdna normalization kit (evrogen) to reduce the number of cdna molecules derived from rrnas. briefly, 12 µl of cdna (approx. 100 ng/µl) were mixed with 4 µl of 4x hybridization buffer. for the trimming reaction 4 µl of this mixture were distributed to each of four different pcr tubes and overlaid with a drop of pcr-grade mineral oil. after centrifugation (13000 × g, 2 min), the tubes were incubated for 2 min at 98 °c followed by 5 h at 68 °c.
figure 6. specificity binding analysis of epitope peptides of kpn_00466. bar chart representing the mean relative fluorescence intensities (n = 10) of each peptide potentially harboring a linear epitope site after incubation with polyclonal antibodies reactive to k. pneumoniae (green) and c. jejuni (orange). none of the peptides shows a distinctly specific interaction; rather, signal intensities are in the same range for each peptide independent of the antibody used. this indicates that mainly non-specific binding occurs. doi:10.1371/journal.pone.0110703.g006
next, pre-heated (68 °c) duplex-specific nuclease (dsn) master buffer was added to each tube and incubation continued for 10 min. dsn was added to the first three tubes in decreasing concentrations (1 u/µl, 0.5 u/µl and 0.25 u/µl), with the fourth tube receiving dsn storage buffer and no enzyme as a control reaction. incubation was continued for 25 min at 68 °c. after addition of 5 µl dsn stop solution and subsequent incubation for 5 min at 68 °c, the tubes were placed on ice. the chilled reaction was diluted by addition of 25 µl sterile, rnase-free water. for amplification of normalized cdna, 1 µl of each reaction was used as template in pcr. each pcr reaction contained 1 µl of template from the normalization reaction, 33 µl nuclease-free water, 10 µl 5x phusion hf buffer, 1 µl 10 mm dntp mix (neb), 2 µl of each primer, 5' pcr primer ii a (12 µm) and 3' in-fusion smarter pcr primer (12 µm), and 1 µl phusion polymerase. the pcr was performed with initial denaturation at 98 °c for 1 min and seven cycles of denaturation at 98 °c for 10 s, primer annealing at 65 °c for 30 s and extension at 72 °c for 3 min. for optimization, the control tube was subjected to 7, 9, 11, and 13 cycles with 12 µl aliquots taken every two cycles. the optimization samples were analyzed by gel electrophoresis (1% agarose, tae, 100 v) and the optimal cycle number determined. the remaining three tubes were subjected to 9 + x cycles, with x being the difference between the optimized cycle number and the originally performed seven cycles. after the second pcr, the experimental reactions were compared to the optimal control reaction using gel electrophoresis as above. reactions showing a successful normalization were combined and used in a third pcr reaction. after the final pcr, the reactions were purified by qiaquick pcr purification kit.
a major feature of the given model is the prominent beta barrel structure that originates from the abundance of beta strands. this is a typical feature of transport and channel proteins spanning the outer bacterial membrane. in contrast, the identified linear epitope gavvalsttfa is located at the very beginning of the protein and thus not included in the given model. however, it is likely an extension of the truncated n-terminal region marked in orange. doi:10.1371/journal.pone.0110703.g007
figure 8. 3d model of predicted structure of kpn_00459. the model was predicted using the automated mode of the swiss model application by expasy (university basel). as a template the crystal structure of a na(+)/h(+) antiporter nhaa of e. coli was used. the resulting model spans residues 12 to 390 of the full-length protein and was subsequently colored using the chimera 1.7 software. coils are depicted in light green, beta strands in purple and alpha helices in blue. the potential linear epitope giafgavelfd is highlighted in orange. it comprises part of an alpha helix, a connective coil and the start of the next alpha helix. doi:10.1371/journal.pone.0110703.g008
in-fusion cloning and cloning vector
for cloning, pfn18a (promega) was used as a vector, as it features an n-terminally encoded halotag fusion protein, which allows for specific and covalent binding to a unique ligand, thus reducing background and minimizing cross-reactivity in immunoassays with halolink microarrays harboring the ligand on their surface. first, the vector needed to be linearized to be used with the in-fusion cloning technology. this was achieved by reverse pcr using ifs 18a for (5' ttgataccactgcttttccatggcgatcgcgttatc 3') and ifs 18a rev (5' tctcatcgtaccccgtgtttaaacgaattcgggctcg 3'). each reaction contained 2 µl each of 1:10 diluted pfn18a (10 ng/µl) and the two primers, 10 µl 5x phusion hf buffer, 1 µl 10 mm dntps, 0.5 µl phusion polymerase and 32.5 µl nuclease-free water to reach a total reaction volume of 50 µl. the pcr was run using a 25 cycle two-step program with 98 °c denaturation for 10 s and 4 min extension at 72 °c.
after completion, 2 µl of dpni (20 u/µl) were added to the reaction and incubated at 37 °c for 1 h. the presence of a single band was checked by gel electrophoresis and the remaining reaction purified by qiaquick pcr purification kit. cloning of normalized cdna and linearized pfn18a vector was performed following the manufacturer's instructions within the in-fusion smarter directional cdna library construction kit (clontech).
electroporation
2 µl of the cloning reaction were mixed with 25 µl of electrocompetent acella e. coli cells (mobitec), a bl21 derivative, and electroporated in 1 mm cuvettes using the easyject plus electroporator (peqlab). conditions for electroporation were as follows: voltage = 1400 v, capacity = 25 µf, resistance = 200 Ω and a pulse duration of 5 ms. the electroporated cells were added to 970 µl of super optimal broth with catabolite repression (soc) and incubated at 37 °c for 1 h with shaking at 250 rpm. afterwards, 150 µl of the transformation reaction were plated on lysogeny broth (lb) agar containing ampicillin. for each reaction at least two plates were prepared and incubated at 37 °c for 16 h. a total of 1536 clones were selected and transferred to 1.3 ml u96 deepwell plates (nunc) containing 0.8 ml lb-amp. the plates were incubated overnight at 37 °c, 130 rpm. on the next day, the deepwell plates were centrifuged, the supernatant discarded and the pellets resuspended in 370 µl of lb-amp. a new set of u96 deepwell plates was prepared with 850 µl of fresh lb-amp and inoculated with 100 µl each from the resuspended overnight cultures. the remaining 270 µl of resuspended overnight culture were mixed with 30 µl of sterile-filtered dmso and stored at -80 °c.
figure 9. homology of linear epitope sequence gavvalsttfa of kpn_00363. the sequence derived from k. pneumoniae mgh 78578 was used as a reference. identical residues are marked by dots, gaps by a horizontal dash and differences by the single letter amino acid code. seven of nine k. pneumoniae strains show identical epitope sequences, while two strains display changes in two residues. threonine replaces alanine at position 11, a change observed not only in these two strains but in almost all other bacteria within the list. additionally, threonine at position 9 is substituted by either serine or phenylalanine. bacteria other than k. pneumoniae show a number of additional amino acid substitutions, most prominently leucine for valine at position 4 and serine for threonine at position 8. in s. enterica the changes become more pronounced: glycine at position 1 is replaced by serine, valine at position 3 by alanine and serine at position 7 by threonine. in some rare cases, other residues have also been substituted, e.g. valine replaces alanine at position 2 in shigella dysenteriae. doi:10.1371/journal.pone.0110703.g009
figure 10. alanine scan of gavvalsttfa of kpn_00363. box-whisker plot (n = 12) of gavvalsttfa after alanine/glycine scanning. the box comprises 50% of the data, while the whiskers enclose 98%. the median is represented by a small horizontal line and the mean by a small rectangle. rabbit igg served as a positive reference, whereas mbp was used as a negative control. if alanine was present in the original sequence it was replaced by glycine; otherwise each amino acid was stepwise replaced by alanine. additionally, gavlalsssft and saalaltssft were included as they resemble sequences present in e. coli and s. enterica. switching glycine (position 1), alanine (position 2 or 5) or threonine (position 8 or 9) to alanine or glycine results in a significant drop in signal intensities to levels at or below the negative control. in contrast, substituting valine (position 3) or leucine (position 6) by alanine leads to an increase in signal intensities to 1700 a.u. and more than 5600 a.u., respectively. note the different axis scales before and after the axis break at 2000 a.u. doi:10.1371/journal.pone.0110703.g010
the newly inoculated plates were incubated for 6 h at 37 °c, 130 rpm. afterwards, the temperature was reduced to 20 °c, incubation continued for 1 h and protein expression induced by addition of 2 µl of 0.5 m isopropyl β-d-1-thiogalactopyranoside (iptg). incubation continued overnight at 20 °c, 130 rpm. the cells were harvested by centrifugation (2500 × g, 10 min), the supernatant discarded and the pellets frozen at -20 °c. after 15 min the pellets were resuspended in 180 µl of easylyse bacterial protein extraction solution (epicentre) and incubated for 5 min at room temperature. dnase i was mixed with dnase reaction buffer (10 mm tris-hcl, 2.5 mm mgcl2, 10 mm cacl2), added to the reaction and incubation was continued for 10 min at 37 °c. the plates were centrifuged for 3 min at 2500 × g to collect cell debris. for each sample 10 µl of lysate were transferred to 384 microtiterplates (genetixx), which were used as reservoirs for the spotting procedure.
the samples were spotted onto halolink slides (promega) using the qarray2 microarray spotter (molecular devices). 384 different samples were spotted per slide with three replicate slides per screening. in total 1536 samples were screened on 12 slides (n = 3). each sample was spotted in quadruplicate with controls in two identical sets of eighteen 10 × 10 subarrays each (total number of spots per slide 3600). the controls used included ht-ompa and ht-ompf as positive reference proteins, as these have been described as immunodominant before. as specificity controls ht-argc and ht-pyrc were used, representing proteins without known immunodominant behaviour, for which binding of the polyclonal antibodies is not expected. in addition, two different e. coli strains, acella electrocompetent cells and krx single-step competent cells (promega), were spotted as further controls. as these two lack proteins expressed with a halotag, they were used as negative controls. after spotting of the samples, the slides were incubated for 1 h at room temperature in a humidity chamber. next, slides were washed with pbs+0.05% igepal ca-630 (pbsi, sigma aldrich) and dried by nitrogen flow. the 2 well proplate module (grace biolabs) was attached to each slide. the top chamber was filled with 1.5 ml of rabbit polyclonal antibody to k. pneumoniae (acris, 2 µg/ml) in pbs. the bottom chamber was incubated with pbs only. after 2 h of incubation at room temperature with gentle rocking, both chambers of each slide were washed three times with 2 ml of pbsi. secondary antibody (goat polyclonal to rabbit igg conjugated with chromeo-546, abcam, 5 µg/ml) in pbs was applied to each chamber and the slides were incubated at room temperature for 2 h in the dark under gentle rocking. finally, slides were washed three times with pbsi, the proplate modules removed and the slides dried by nitrogen flow.
figure 11. specificity assay of gavvalsttfa and derivatives. the bar chart represents the mean signal intensities (n = 12) of gavvalsttfa and several modified peptides with single amino acid replacements incubated with antibodies reactive to k. pneumoniae (green), e. coli (orange) or s. enterica (purple). the sequences on the left represent the original epitope and modified versions displaying an increase in signal intensity for k. pneumoniae antibodies. in contrast, sequences on the right harbor modifications causing a significant drop in intensity for k. pneumoniae. rabbit igg is used as a positive reference and mbp serves as a negative control. none of the sequences tested displayed any significant signal intensity above the negative control when incubated with either e. coli or s. enterica antibodies. doi:10.1371/journal.pone.0110703.g011
the slides were scanned on an axon genepix 4200a laser scanner (molecular devices) with the following settings: 532 nm laser, pmt gain 400, 40% laser power, lines to average 1, 10 µm resolution and standard green emission filter at 575 nm. the raw data sets of all the microarray analyses in this publication have been deposited in ncbi's gene expression omnibus [36] and are accessible through geo series accession numbers gse52536, gse52537, gse52538, and gse60588. the median fluorescence intensity of each spot corrected by the local background (median f532 - b532) was used. further, relative fluorescence intensity (rfi) was calculated by subtracting the signals of the bottom chamber from the raw data signals of the top chamber to account for non-specific binding of secondary antibodies. for screening of expression libraries we used the contrast method with either argc or pyrc as specificity control to determine the contrast via the formula: with $\mathrm{RFI}_{\mathrm{control}}$ the median of all rfis of the control used.
clones showing strong signals in microarray screening were selected to be sequenced. sequencing was performed externally by lgc genomics using ht7f (5' acatcggcccgggtctgaatc 3') and flxr (5' cttcctttcgggctttgttag 3') primers. after sequencing and identification of potentially immunodominant proteins, primers were designed to generate full-length clones for each identified gene; see table s2 for a list of the primers used. cloning was performed as mentioned above with slight modifications. the pfn18a vector was linearized using the following primer set: 18a if linear for (5' gtttaaacgaattcgggctc 3') and 18a if linear rev (5' ggcgatcgcgttatcgctctg 3') with pcr conditions as mentioned before. protein expression, lysis, and spotting of full-length proteins were performed as described above. the slides were incubated for 1 h at room temperature in a humidity chamber.
for incubation with antibodies, 3 well or 16 well proplate modules (grace biolabs) were attached to the halolink slides. processing of the slides was similar to the original screening; however, several different antibodies were used, see section antibodies.
for testing of immunodominant characteristics with elisa, the crude lysate was first purified using halolink magnetic beads (promega) following the manufacturer's instructions. the proteins of interest were subsequently cleaved off by digestion with protev protease (promega) and the concentration was determined by nanodrop measurements. the samples were diluted to a total protein content of 20 µg/ml in pbs and 50 µl of each sample were added to maxisorb plates (nunc). each sample was analyzed at least in triplicate. the elisa plate was covered with a lid and incubated overnight at 4 °c in a humidity chamber. after five washing steps, each with pbs+0.05% tween-20 (pbst), the plates were blocked using 200 µl 5% non fat dried milk in pbs per well for 2 h. afterwards, plates were washed three times with pbst. 100 µl of primary antibody solution (c = 4 µg/ml) in pbs containing 1% non fat dried milk were applied to each well using the respective antibody, or pbs for controls. the plates were incubated for 2 h at room temperature and washed four times with pbst. next, 100 µl of conjugated secondary antibody (goat polyclonal to rabbit igg conjugated with horseradish peroxidase, abcam ab6721, c = 20 ng/ml) were added to each well and incubation continued for 1 h. finally, plates were again washed four times with pbst and 100 µl 3,3',5,5'-tetramethylbenzidine (tmb, sigma-aldrich) were added to each well for detection. after 30 min of incubation at room temperature in the dark, the reaction was stopped by applying 100 µl of 2 m h2so4 to each well. the optical density of each well was measured using the omega fluostar (bmg labtech) at a wavelength of 450 nm.
primers were designed using primer3 [37] within geneious pro 5.6.5 [38]. the sequenced inserts were identified by blast [39]. peptide secondary structures were predicted using the emboss garnier [40] algorithm and transmembrane regions by tmhmm 2.0 [30, 31]. antigenic sites were predicted by emboss antigenic [32, 33]. data evaluation was performed with originpro 8g (originlab) and microsoft excel. 3-dimensional structure predictions were performed using the swiss model automated mode [41-45] and pdb files were visualized and analyzed with the ucsf chimera package [46]. chimera is developed by the resource for biocomputing, visualization, and informatics at the university of california, san francisco (supported by nigms p41-gm103311).
analysis of full-length proteins was achieved by combining the results from elisa and microarray data. first, the rfi of each sample was calculated. next, a normalized rfi was generated by dividing the rfi of each sample by the median rfi of all the samples within an area of interest, i.e. an incubation compartment. from these normalized rfis a median and standard deviation were calculated. if the median normalized rfi of the positive control was below the median normalized rfi of any of the negative references, taking the standard deviations into account, the test was considered invalid. if the test passed the above criterion, the q values were calculated as follows: with $\mathrm{RFI}_{\mathrm{sample}}$ the median of normalized rfis of the sample and $\mathrm{RFI}_{\mathrm{pos.control}}$ the median of the normalized rfis of the positive control, ompa or ompf. the resulting error was calculated by gaussian error propagation. finally, incorporating all valid tests, the mean q value was determined along with its resulting error following gaussian error propagation, see table 1.
several proteins were chosen for epitope mapping.
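the rfi normalization and validity check described for the analysis of full-length proteins can be sketched as follows. this is an illustrative reading of the text, not the authors' evaluation script: pooling all replicate values of a compartment to obtain the compartment median, and interpreting "taking the standard deviations into account" as non-overlapping one-standard-deviation intervals, are both assumptions, and all sample names and numbers below are invented. the q value itself is not computed because its formula is not given in the text.

```python
from statistics import median, stdev


def normalize_rfi(rfi_by_sample: dict[str, list[float]]) -> dict[str, list[float]]:
    """Divide every RFI by the median RFI of the whole compartment.

    Assumption: the compartment median is taken over all pooled replicates.
    """
    pooled = [value for values in rfi_by_sample.values() for value in values]
    compartment_median = median(pooled)
    if compartment_median == 0:
        raise ValueError("compartment median is zero; cannot normalize")
    return {
        name: [value / compartment_median for value in values]
        for name, values in rfi_by_sample.items()
    }


def summarize(normalized: dict[str, list[float]]) -> dict[str, tuple[float, float]]:
    """Return (median, standard deviation) of the normalized RFIs per sample."""
    return {name: (median(v), stdev(v)) for name, v in normalized.items()}


def positive_control_passes(summary, positive: str, negatives: list[str]) -> bool:
    """Assumed criterion: positive median minus SD exceeds each negative median plus SD."""
    pos_median, pos_sd = summary[positive]
    return all(
        pos_median - pos_sd > summary[name][0] + summary[name][1]
        for name in negatives
    )


if __name__ == "__main__":
    # Hypothetical RFIs for one incubation compartment (three replicates each).
    rfi = {
        "OmpA": [900.0, 1000.0, 1100.0],   # positive control
        "ArgC": [90.0, 100.0, 110.0],      # negative reference
        "clone": [400.0, 500.0, 600.0],    # sample of interest
    }
    summary = summarize(normalize_rfi(rfi))
    print(summary)
    print("valid test:", positive_control_passes(summary, "OmpA", ["ArgC"]))
```

normalizing by the compartment median makes signals from different incubation compartments comparable before they are related to the positive control.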
these were the proteins encoded by kpn_00363, kpn_00459, kpn_00466, kpn_01100, kpn_01584, and kpn_02202. the proteins were divided in silico into 15-mer oligopeptides with an overlap of 11 amino acids. the synthesis and coupling to microarray slides was performed externally by jpt peptide technologies gmbh. each peptide sequence was applied 9 times to one slide. the slides were used with the proplate 3-well chamber system (grace), allowing for incubation with different antibodies. first, the slides were blocked with superblock blocking buffer (thermo fisher) for 2 h and washed five times with pbs+0.05% tween-20. primary antibodies were then applied and incubated overnight at 4 °c with mild rocking. after a further wash, secondary antibodies were applied for 2 h in the dark and, after a final washing procedure, the slides were dried and scanned as above. two different primary antibodies to k. pneumoniae were tested. the bottom chamber was always used as a control chamber, incubated only with secondary antibody.
the peptide gavvalsttfa and 11 modifications thereof, created by substituting one amino acid by alanine/glycine, were synthesized by jpt peptide technologies gmbh. these peptides, in combination with two peptides showing closely related sequences, gavlalsssft and saalaltssfte, were applied 9 times to slides. the incubation procedure was performed as described above for epitope mapping.
the expression of the desired halotag fusion proteins was checked by sds-page. after lysis of cells, 2 µl of each protein extract were mixed with 1 µl of 10 µm halotag alexa 488 ligand. after addition of 7 µl 1x tbs (100 mm tris, 150 mm nacl, ph 7.6) the reaction was incubated at room temperature for 30 minutes. 2 µl of each reaction were removed, mixed with 8 µl of 5x loading buffer (fermentas) and 1 µl dtt and heated for 5 min at 70 °c. the separation was performed on a mini-protean tgx gel (biorad, any kd, 15 wells) in a protean ii xi cell chamber (biorad) for 30 min at 200 v.
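returning to the epitope mapping design described above, the in silico division of a protein into 15-mer peptides overlapping by 11 residues corresponds to a sliding window with a step of four residues. the sketch below is a minimal illustration under assumptions: the text does not state how the c-terminal remainder was handled, so anchoring one final peptide at the c-terminus is a guess, and the example sequence is an arbitrary placeholder rather than one of the kpn proteins.

```python
def tile_peptides(sequence: str, length: int = 15, overlap: int = 11) -> list[str]:
    """Split a protein sequence into overlapping peptides of fixed length."""
    step = length - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than peptide length")
    if len(sequence) <= length:
        return [sequence]
    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, step))
    # Assumption: add one C-terminally anchored peptide so no residue is left out.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [sequence[s:s + length] for s in starts]


if __name__ == "__main__":
    # Placeholder 25-residue sequence, for illustration only.
    demo = "ACDEFGHIKLMNPQRSTVWYACDEF"
    for index, peptide in enumerate(tile_peptides(demo), start=1):
        print(index, peptide)
```

with these parameters, consecutive peptides share eleven residues, so any linear epitope of up to twelve residues is contained completely in at least one peptide.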
as a size reference, benchmark fluorescent protein standard (life technologies) was used. fluorescence was measured in a fla-5100 (fujifilm) with excitation at 473 nm.
figure s3. homology of giafgavelfd of kpn_00459. the sequence derived from k. pneumoniae mgh78578 was used as a reference. the 100 best matches after blast analysis are shown in the figure, with dots indicating identical residues. for differentiation of the sequences, the ncbi accession number of the parent protein is given, followed by the strain designation if available. only three e. coli strains, in lines 63, 97 and 98, feature a valine residue at the second position instead of the consensus isoleucine. consequently, this sequence is highly conserved within the enterobacteriaceae including e. coli, klebsiella, salmonella, and enterobacter among others. (pdf)
table s1. list of 192 sequenced clones after screening. the clones were sequenced by lgc genomics using ht7f and flxr primers. clones that were not successfully sequenced are indicated by ''-''; clones carrying inserts too short to be reliably mapped to a gene are marked as truncated. additionally, a few inserts were detected that derived from primer concatamers. these are displayed as ''artificial''. the remaining clones are indicated by the corresponding locus tag and protein name. several clones apparently carry identical inserts, which is especially obvious for kpn_01805 or kpn_02668. these were discarded from further analysis as the mapped inserts are very short and might have an artificial origin. inserts highly unlikely to yield new immunogenic proteins, such as previously described antigens (e.g. ompa) and other molecules like trna and rrna, were excluded from further analysis.
table s2. primers used in this study. each primer is given with a name, its sequence in 5' to 3' direction and the target gene or vector. for each target f represents forward and r the reverse primer.
the primers were used for cloning in the in-fusion smarter directional cdna library construction kit. (xls)
the k. pneumoniae strain mgh 78578 was a kind gift of the group of s. bereswill (department of microbiology and hygiene, charité - university medicine berlin, berlin, germany). sh is greatly indebted to martina obry for her assistance during expression library construction and clone isolation. the authors would like to thank simone aubele for technical assistance. we also gratefully acknowledge michaela schellhase for microarray printing. conceived and designed the experiments: sh mvnr. performed the experiments: sh. analyzed the data: sh ffb mvnr. contributed reagents/materials/analysis tools: sh. wrote the paper: sh ffb mvnr.
key: cord-020757-q4ivezyq authors: saikumar, pothana; kar, rekha title: apoptosis and cell death: relevance to lung date: 2010-05-21 journal: molecular pathology of lung diseases doi: 10.1007/978-0-387-72430-0_4 sha: doc_id: 20757 cord_uid: q4ivezyq in multicellular organisms, cell death plays an important role in development, morphogenesis, control of cell numbers, and removal of infected, mutated, or damaged cells. the term apoptosis was first coined in 1972 by kerr et al. 1 to describe the morphologic features of a type of cell death that is distinct from necrosis and is today considered to represent programmed cell death. in fact, the evidence that a genetic program existed for physiologic cell death came from the developmental studies of the nematode caenorhabditis elegans. 2 as time has progressed, however, apoptotic cell death has been shown to occur in many cell types under a variety of physiologic and pathologic conditions. cells dying by apoptosis exhibit several characteristic morphologic features that include cell shrinkage, nuclear condensation, membrane blebbing, nuclear and cellular fragmentation into membrane-bound apoptotic bodies, and eventual phagocytosis of the fragmented cell (figure 4.1).
cell death is central to the normal development of multicellular organisms during embryogenesis and maintenance of tissue homeostasis in adults. 3 during development, sculpting of body parts is achieved through selective cell death, which imparts appropriate shape and creates required cavities in particular organs. in adults, cell death balances cell division as a homeostatic mechanism regulating constancy of tissue mass. deletion of injured cells because of disease, genetic defects, aging, or exposure to toxins is also achieved by apoptosis. in essence, apoptotic cell death has important biologic roles not only in development and homeostasis but also in the pathogenesis of several disease processes. dysregulation of apoptosis is found in a wide spectrum of human diseases, including cancer, autoimmune diseases, neurodegenerative diseases, ischemic diseases, viral infections, 4 and lung diseases. 5 our knowledge of cell death and the mechanisms of its regulation increased dramatically in the past two decades with the discovery
nevertheless, necrosis has been shown to occur in cells having defects in apoptotic machinery or upon inhibition of apoptosis, 7 and this form of cell death is emerging as an important therapeutic tool for cancer treatment. 8
autophagy
autophagy, which is also referred to as type ii programmed cell death, is characterized by sequestration of cytoplasm and organelles in double or multimembrane structures called autophagic vesicles, followed by degradation of the contents of these vesicles by the cell's own lysosomal system (see figure 4.1). the precise role of autophagy in cell death or survival is not clearly understood.
autophagy has long been regarded as a cell survival mechanism whereby cells eliminate long-lived proteins and organelles. in this regard, it is argued that autophagy may help cancer cells survive under nutrient-limiting and low-oxygen conditions and against ionizing radiation. 9,10 however, recent observations that there is decreased autophagy during experimental carcinogenesis and heterozygous disruption of an autophagy gene, beclin 1 (bcn1), in cancer cells 11, 12 suggest that breakdown of autophagic machinery may contribute to development of cancer. other interesting studies have shed some light on the relationship between autophagy and apoptosis. these investigations have shown prevention of caspase inhibitor z-vad-induced cell death in mouse l929 cells by rna interference directed against autophagy genes atg7 and bcn1 13 and protection of bax−/−, bak−/− murine embryonic fibroblasts against staurosporine- or etoposide-induced cell death by rna interference against autophagy genes atg5 and bcn1. 14 however, both of these studies were done in cells whose apoptotic pathways had been compromised. thus, it remains to be seen whether cells with intact apoptotic machinery can also die by autophagy and whether apoptosis-competent cells lacking autophagy genes will be resistant to different death stimuli.
figure 4.1. necrosis: there is early membrane damage with eventual loss of plasma membrane integrity and leakage of cytosol into extracellular space. despite early clumping, the nuclear chromatin undergoes lysis (karyolysis). apoptosis: cells die by type i programmed cell death (also called apoptosis); they are shrunken and develop blebs containing dense cytoplasm. membrane integrity is not lost until after cell death. nuclear chromatin undergoes striking condensation and fragmentation. the cytoplasm becomes divided to form apoptotic bodies containing organelles and/or nuclear debris. terminally, apoptotic cells and fragments are engulfed by phagocytes or surrounding cells. autophagy: cells die by type ii programmed cell death, which is characterized by the accumulation of autophagic vesicles (autophagosomes and autophagolysosomes). one feature that distinguishes apoptosis from autophagic cell death is the source of the lysosomal enzymes used for most of the dying-cell degradation. apoptotic cells use phagocytic cell lysosomes for this process, whereas cells with autophagic morphology use the endogenous lysosomal machinery of dying cells. paraptosis: cells die by type iii programmed cell death, which is characterized by extensive cytoplasmic vacuolization and swelling and clumping of mitochondria, along with absence of nuclear fragmentation, membrane blebbing, or apoptotic body formation. autoschizis: in this form of cell death, the cell membrane forms cuts or schisms that allow the cytoplasm to leak out. the cell shrinks to about one-third of its original size, and the nucleus and organelles remain surrounded by a tiny ribbon of cytoplasm. after further excisions of cytoplasm, the nuclei exhibit nucleolar segregation and chromatin decondensation followed by nuclear karyorrhexis and karyolysis.
paraptosis has recently been described as a form of cell death characterized by extensive cytoplasmic vacuolation (see figure 4.1) caused by swelling of mitochondria and endoplasmic reticulum. this form of cell death does not involve caspase activation and is not inhibited by caspase inhibitors, but it is inhibited by the inhibitors of transcription and translation, actinomycin d and cycloheximide, respectively, 15 suggesting a requirement for new protein synthesis. the tumor necrosis factor receptor family member taj/troy and the insulin-like growth factor i receptor have been shown to trigger paraptosis. 16 paraptosis appears to be mediated by mitogen-activated protein kinases and inhibited by aip1/alix, a protein interacting with the calcium-binding death-related protein alg-2.
16
autoschizis
autoschizis is a recently described type of cell death that differs from apoptosis and necrosis and is induced by oxidative stress. 17 in this type of death, cells lose cytoplasm by self-morsellation or self-excision (see figure 4.1). autoschizis usually affects contiguous groups of cells both in vitro and in vivo but can also occasionally affect scattered individual cells trapped in subcapsular sinuses of lymph nodes. 18 the nuclear envelope and pores remain intact while the cytoplasm is reduced to a narrow rim surrounding the nucleus. the chromatin marginates along the nuclear membrane, and mitochondria and other organelles around the nucleus aggregate as a result of cytoskeletal damage and condensation of the cytosol. interestingly, the rough endoplasmic reticulum is preserved until the late stages of autoschizis, in which cells fragment and the nucleolus becomes condensed and breaks into smaller fragments. 19 eventually, the nuclear envelope and the remaining organelles dissipate with cell demise.
genetic studies in the nematode worm c. elegans led to the characterization of apoptosis. activation of specific death genes during the development of this worm results in death of exactly 131 cells, leaving 959 cells intact. 2 further studies revealed that apoptosis can be divided into three successive stages: (1) commitment phase, in which death is initiated by specific extracellular or intracellular signals; (2) execution phase; and (3) clean-up phase, in which dead cells are removed by other cells with eventual degradation of the dead cells in the lysosomes of phagocytic cells. 20 the apoptotic machinery is conserved through evolution from worm to human. 21 in c. elegans, execution of apoptosis is mediated by ced-3 and ced-4 proteins. commitment to a death signal results in the activation of ced-3 by ced-4 binding. the ced-9 protein prevents activation of ced-3 by binding to ced-4.
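the ced-3/ced-4/ced-9 relationship just described can be written down as a toy boolean rule. this is only a qualitative sketch of the logic stated in the text (ced-4 activates ced-3 on a death signal; ced-9 blocks this by binding ced-4); the function and parameter names are hypothetical, and no kinetics or dosage effects are modeled. the cell count is simple arithmetic on the two numbers given above.

```python
def ced3_is_activated(death_signal: bool,
                      ced4_present: bool,
                      ced9_bound_to_ced4: bool) -> bool:
    """Toy rule: CED-3 is activated only when a death signal is present,
    CED-4 is available, and CED-9 is not sequestering CED-4."""
    return death_signal and ced4_present and not ced9_bound_to_ced4


# Cells generated during development = cells that die + cells that remain.
CELLS_DYING = 131
CELLS_REMAINING = 959
CELLS_GENERATED = CELLS_DYING + CELLS_REMAINING  # 1090

if __name__ == "__main__":
    print(ced3_is_activated(True, True, False))   # death signal, free CED-4
    print(ced3_is_activated(True, True, True))    # CED-9 sequesters CED-4
    print(CELLS_GENERATED)
```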
22, 23
mechanisms of apoptosis
caspases
studies over the past decade have indicated that two distinct apoptotic pathways are followed in mammalian systems: the extrinsic or death receptor pathway and the intrinsic or mitochondrial pathway. the executioners in both intrinsic and extrinsic pathways of cell death are the caspases, 24 which are cysteine proteases with specificity to cleave their substrates after aspartic acid residues. the central role of caspases in apoptosis is underscored by the observation that apoptosis and all classic changes associated with apoptosis can be blocked by inhibition of caspase activity. to date, 12 mammalian caspases (caspase-1 to -10, caspase-14, and mouse caspase-12) have been identified. 25 caspase-13 was later found to represent a bovine homolog, and caspase-11 appears to be a murine homolog, of human caspases-4 and -5, respectively. caspases are normally produced as inactive zymogens containing an n-terminal prodomain followed by a large and a small subunit that constitute the catalytic core of the protease. they have been categorized into two distinct classes: initiator and effector caspases. the upstream initiator caspases contain long n-terminal prodomains and one of the two characteristic protein-protein interaction motifs: the death effector domain (ded; caspase-8 and -10) and the caspase activation and recruitment domain (card; caspase-1, -2, -4, -5, -9, and -12). the downstream effector caspases (caspase-3, -6, and -7) are characterized by the presence of a short prodomain. apart from the structural differences, a prominent difference between initiator and effector caspases is their basal state. both the zymogen and the activated forms of effector caspases exist as constitutive homodimers, whereas initiator caspase-9 exists predominantly as a monomer both before and after proteolytic processing. 26 initiator caspase-8 has been reported to exist in an equilibrium between monomers and homodimers.
27 although the initiator caspases are capable of autocatalytic activation, the activation of effector caspases requires formation of oligomeric complexes with their adapter proteins and often intrachain cleavage within the initiator caspase. caspases have also been divided into three categories based on substrate specificity. 28 group i members (caspase-1, -4, and -5) have a substrate specificity for the wehd sequence with high promiscuity; group ii members (caspase-2, -3, and -7 and ced-3) prefer the dexd sequence and have an absolute requirement for aspartate (d) at p4; and members of group iii (caspase-6, -8, and -9 and the "aspase" granzyme b) have a preference for (i/l/v)exd sequences. several reports have suggested a role for group i members in inflammation and that of group ii and iii members in apoptotic signaling events. the extrinsic pathway involves binding of death ligands such as tumor necrosis factor-α (tnf-α), cd95 ligand (fas ligand), and tnf-related apoptosis-inducing ligand (trail) to their cognate cell surface receptors tnfr1, cd95/fas, trail-r1, trail-r2, and the dr series of receptors, 29 resulting in the activation of initiator caspase-8 (also known as fadd-homologous ice/ced-3-like protease or flice) and subsequent activation of effector caspase-3 (figure 4.2). 30 the cytoplasmic domains of death receptors contain the "death domain," which plays a crucial role in transmitting the signal from the cell's surface to intracellular signaling molecules. binding of the ligands to their cognate receptors results in receptor trimerization and recruitment of adapter proteins to the cell membrane, which involves homophilic interactions between death domains of the receptors and the adapter proteins. the adapter protein for the receptors tnfr1 and dr3 is tnfr-associated death domain protein (tradd) 31 and that for fas, trail-r1, trail-r2, and dr4 is fas-associated death domain protein (fadd).
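the three substrate-specificity groups of caspases described earlier in this passage (wehd, dexd, and (i/l/v)exd, written p4 to p1 with x as any residue) can be expressed as simple pattern matches. the sketch below is illustrative only: the dictionary and function names are hypothetical, and strict matching ignores the "high promiscuity" noted for group i as well as any real-world overlap between groups.

```python
import re

# Tetrapeptide preferences as stated in the text (P4-P3-P2-P1); "." = any residue.
GROUP_MOTIFS = {
    "group I (caspase-1, -4, -5)": re.compile(r"^WEHD$"),
    "group II (caspase-2, -3, -7, CED-3)": re.compile(r"^DE.D$"),
    "group III (caspase-6, -8, -9, granzyme B)": re.compile(r"^[ILV]E.D$"),
}


def matching_groups(tetrapeptide: str) -> list[str]:
    """Return the names of all groups whose preferred motif matches."""
    peptide = tetrapeptide.upper()
    return [name for name, pattern in GROUP_MOTIFS.items()
            if pattern.match(peptide)]


if __name__ == "__main__":
    for site in ("WEHD", "DEVD", "IETD", "LEHD", "AAAA"):
        print(site, matching_groups(site))
```

note that every motif ends in aspartate at p1, consistent with the statement that caspases cleave after aspartic acid residues.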
32 the receptor/ligand and fadd complex in turn recruits caspase-8 to the activated receptor, resulting in the formation of the death-inducing signaling complex (disc) and subsequent activation of caspase-8 through oligomerization and self-cleavage. depending on the cell type and/or apoptotic stimulus, caspase-8 can also be activated by caspase-6. 33 activated caspase-8 then activates effector caspase-3. in some cell types, cleavage of caspase-3 by caspase-8 also requires a mitochondrial amplification loop involving cleavage of the proapoptotic protein bid by caspase-8 and its translocation to the mitochondrial membrane, triggering the release of apoptogenic proteins from mitochondria into cytosol (see figure 4.2). in these cell types, overexpression of bcl-2 and bcl-xl can block cd95-induced apoptosis. 34 tumor necrosis factor-α is produced by t cells and activated macrophages in response to infection. although tnf-α-mediated signaling can be propagated through either tnfr1 or tnfr2 receptors, the majority of biologic functions are initiated by tnfr1. 35 binding of tnf-α to tnfr1 causes release of the inhibitory protein silencer of death domain protein (sodd) from tnfr1, which enables recruitment of the adapter protein tradd. signaling induced by activation of tnfr1 or dr3 diverges at the level of tradd. in one pathway, nuclear translocation of the transcription factor nuclear factor-κb (nf-κb) and activation of c-jun n-terminal kinase (jnk) are initiated, which results in the induction of a number of proinflammatory and immunomodulatory genes. 36 in another pathway, tnf-α signaling is coupled to fas signaling events through interaction of tradd with fadd. 37 the tnfr1-tradd complex can alternatively engage traf2 protein, resulting in activation of transcription factor c-jun, which is involved in survival signaling.
furthermore, binding of receptor-interacting protein to tnfr1 through tradd results in activation of transcription factor nf-κb, which suppresses apoptosis through transcriptional upregulation of antiapoptotic molecules such as traf1, traf2, ciap1, ciap2, and flip. the flice-associated huge protein was identified as a ced-4 homolog interacting with the ded of caspase-8 and was shown to modulate fas-mediated activation of caspase-8. 38 another class of protein, flip (flice inhibitory protein), was shown to block fas-induced and tnf-α-induced disc formation and subsequent activation of caspase-8. 39 cytotoxic t cells play a major role in vertebrate defense against viral infection. 40 they induce cell death in infected cells to prevent viral multiplication and spread of infection. 41 cytotoxic t cells can kill their targets either by activating the fas ligand/fas pathway or by injecting granzyme b, a serine protease, into target cells. cytotoxic t cells carry fas ligand on their surface but also carry granules containing the channel-forming protein perforin and granzyme b. upon recognizing the infected cells, the lymphocytes bind and secrete granules onto the surface of infected cells. perforin then assembles into transmembrane channels to allow the entry of granzyme b into the target cell. upon entry, granzyme b, which cleaves after aspartate residues in proteins ("aspase"), activates one or more of the apoptotic proteases (caspase-2, -3, -7, -8, and -10) to trigger the proteolytic death cascade (see figure 4.2). fas ligand/fas and perforin/granzyme b systems are the main apoptotic machinery that regulates homeostasis in immune cell populations. cells can respond to various stressful stimuli and metabolic disturbances by triggering apoptosis.
figure 4.2. the tnf-α/tnfr1 complex can also elicit an antiapoptotic response by recruiting traf2, which results in nf-κb-mediated upregulation of antiapoptotic genes. in cytotoxic t lymphocyte-induced death, granzyme b, which enters the cell through membrane channels formed by the protein perforin, activates caspases by cleaving them directly or indirectly. intracellular pathways: lack of survival stimuli (withdrawal of growth factor, hypoxia, genotoxic substances, etc.) is thought to generate apoptotic signals through ill-defined mechanisms, which lead to translocation of proapoptotic proteins such as bax to the outer mitochondrial membrane. in some cases, transcription mediated by p53 may be required to induce proteins such as bax. translocated bax undergoes conformational changes in the outer membrane to form oligomeric structures (pores) that leak cytochrome c from mitochondria into the cytosol. formation of a ternary complex of cytochrome c, the adapter protein apaf-1, and the initiator caspase-9 results in the activation of caspase-9 followed by sequential activation of effector caspase(s) such as caspase-3 and others. the action of caspases, endonucleases, and possibly other enzymes leads to cellular disintegration. for example, the endonuclease cad (caspase-activated dnase) becomes activated when it is released from its inhibitor icad upon cleavage of icad by an effector caspase. antiapoptotic proteins such as bcl-2 and bcl-xl inhibit the membrane-permeabilizing effects of bax and other proapoptotic proteins. cross-talk between extra- and intracellular pathways occurs through caspase-8-mediated bid cleavage, which yields a 15 kda protein that migrates to mitochondria and releases cytochrome c, thereby setting in motion events that lead to apoptosis via caspase-9.
drugs, toxins, heat, radiation, hypoxia, and viral infections are some of the stimuli known to activate death pathways.
cell death, however, is not necessarily inevitable after exposure to these agents, and the mechanisms determining the outcome of the injury are a topic of active interest. the current consensus appears to be that it is the intensity and the duration of the stimulus that determine the outcome. the stimulus must go beyond a threshold to commit cells to apoptosis. although the exact mechanism used by each stimulus may be unique and different, a few broad patterns can be identified. for example, agents that damage dna, such as ionizing radiation and certain xenobiotics, lead to activation of p53-mediated mechanisms that commit cells to apoptosis, at least in part through transcriptional upregulation of proapoptotic proteins. 42 other stresses induce increased activity of stress-activated protein kinases, which result ultimately in apoptotic commitment. 43 these different mechanisms converge in the activation of caspases. a cascade of caspases plays the central executioner role by cleaving various mammalian cytosolic and nuclear proteins that play roles in cell division, maintenance of cytoskeletal structure, dna replication and repair, rna splicing, and other cellular processes. this proteolytic carnage produces the characteristic morphologic changes of apoptosis. once the caspase cascade is initiated, the process of cell death has crossed the point of no return. the roles of various caspases in apoptotic pathways and their relative importance for animal development have been examined in genetic studies involving knockout of different caspase genes. a caspase-1 (interleukin [il]-1β converting enzyme [ice]) knockout study suggested that ice plays an important role in inflammation by activating cytokines such as il-1β and il-18. however, caspase-1 was not required to mediate apoptosis under normal circumstances and did not have a major role during development.
44 surprisingly, ischemic brain injury was significantly reduced in caspase-1 knockout mice compared with wild-type mice, 45 suggesting that inflammation may contribute to ischemic injury. caspase-3 deficiency leads to impaired brain development and premature death. also, functional caspase-3 is required for some typical hallmarks of apoptosis such as formation of apoptotic bodies, chromatin condensation, and dna fragmentation in many cell types. 46 lack of caspase-8 results in the death of embryos at day 11 with abnormal formation of the heart, 47 suggesting that caspase-8 is required for cell death during mammalian development. in support of this finding, knockout of fadd, which is required for caspase-8 activation, resulted in fetal death with signs of abdominal hemorrhage and cardiac failure. 48 moreover, caspase-8-deficient cells did not die in response to signals from members of the tnf receptor family. 47 however, cells lacking either fadd or caspase-8, which are resistant to tnf-α-mediated or cd95-mediated death, are susceptible to chemotherapeutic drugs, serum deprivation, ceramide, γ-irradiation, and dexamethasone-induced killing. 48 in contrast, caspase-9 has a key role in apoptosis induced by intracellular activators, particularly those that cause dna damage. deletion of caspase-9 resulted in perinatal lethality, apoptotic failure in developing neurons, enlarged brains, and craniofacial abnormalities. 49 in caspase-9-deficient cells, caspase-3 was not activated, suggesting that caspase-9 is upstream of caspase-3 in the apoptotic cascade. as a consequence, caspase-9-deficient cells are resistant to dexamethasone or irradiation, whereas they retain their sensitivity to tnf-α-induced or cd95-induced death 49 because of the presence of caspase-8, the initiator caspase involved in death receptor signaling that can also activate caspase-3.
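the reasoning behind these knockout phenotypes can be illustrated with a small reachability check on a simplified pathway graph. this is a deliberately reduced sketch: the node names and the graph itself are illustrative assumptions drawn from the pathways described in the text, the bid-mediated cross-talk and caspase-independent routes are omitted, and a knockout is modeled simply as removal of a node.

```python
from collections import deque

# Simplified signaling graph: each node lists what it activates.
PATHWAY = {
    "death_receptor_ligand": ["fadd"],       # TNF-alpha / CD95 ligand
    "fadd": ["caspase-8"],
    "caspase-8": ["caspase-3"],
    "dna_damage": ["cytochrome_c_release"],  # irradiation, drugs
    "cytochrome_c_release": ["caspase-9"],
    "caspase-9": ["caspase-3"],
    "caspase-3": [],
}


def reaches_caspase3(stimulus: str, knocked_out: frozenset = frozenset()) -> bool:
    """Breadth-first search: can the stimulus still activate caspase-3
    when the knocked-out nodes are removed from the graph?"""
    if stimulus in knocked_out:
        return False
    seen = {stimulus}
    queue = deque([stimulus])
    while queue:
        node = queue.popleft()
        if node == "caspase-3":
            return True
        for target in PATHWAY.get(node, []):
            if target not in seen and target not in knocked_out:
                seen.add(target)
                queue.append(target)
    return False


if __name__ == "__main__":
    for ko in (frozenset(), frozenset({"caspase-8"}), frozenset({"caspase-9"})):
        label = ", ".join(sorted(ko)) or "wild type"
        print(label,
              "| death receptor:", reaches_caspase3("death_receptor_ligand", ko),
              "| dna damage:", reaches_caspase3("dna_damage", ko))
```

in this toy graph, removing caspase-9 blocks only the dna-damage route and removing caspase-8 or fadd blocks only the death-receptor route, mirroring the selective resistance described above.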
overall, these observations support the idea that different death signaling pathways converge on downstream effector caspases (see figure 4.2). indeed, caspase-3 is regarded as one of the key executioner molecules activated by apoptotic stimuli originating either at receptors for exogenous molecules or within cells through the action of drugs, toxins, or radiation. in c. elegans, biochemical and genetic studies have indicated a role for ced-4 upstream of ced-3. 50 upon receiving death commitment signals, ced-4 binds to pro-ced-3 and releases active ced-3. 50 however, when overexpressed, ced-9 can inhibit the activation of pro-ced-3 by binding to ced-4 and sequestering it away from pro-ced-3. therefore, ced-3 and ced-4 are involved in activation of apoptosis, and ced-9 inhibits apoptosis. after the discovery of caspases as ced-3 homologs, a search for activators and inhibitors analogous to ced-4 and ced-9 led to the discovery of diverse mammalian regulators of apoptosis. the plethora of these molecules and their functional diversity allowed them to be classified into four broad categories: (1) adapter proteins, (2) the bcl-2 family of regulators, (3) inhibitors of apoptosis (iaps), and (4) other regulators. as stated earlier, two major pathways of apoptosis, involving either the initiator caspase-8 or the initiator caspase-9 (see figure 4.2), have been recognized. signaling by death receptors (cd95, tnfr1) occurs through a well-defined process of recruitment of caspase-8 to the death receptor by adapter proteins such as fadd. recruitment occurs through interactions between the death domains that are present on both receptor and adapter proteins. receptor-bound fadd then recruits caspase-8 through interactions between deds common to both caspase-8 and fadd, forming a disc. in the disc, caspase-8 activation occurs through oligomerization and autocatalysis. activated caspase-8 then activates downstream caspase-3, culminating in apoptosis.
the inhibitory protein flip was shown to block fas-induced and tnf-α-induced disc formation and subsequent activation of caspase-8. 39 of particular interest is cellular flip, which stimulates caspase-8 activation at physiologically relevant levels and inhibits apoptosis upon high ectopic expression. 51 cellular flip contains two deds that can compete with caspase-8 for recruitment to the disc. this limits the degree of association of caspase-8 with fadd and thus limits activation of the caspase cascade. it also forms a heterodimer with caspase-8 and caspase-10 through interactions between both the deds and the caspase-like domains of the proteins, thus activating both caspase-8 and caspase-10. 52 apoptotic protease activating factor-1 (apaf-1), a ced-4 homolog in mammalian cells, affects the activation of initiator caspase-9. 53 this factor binds to procaspase-9 in the presence of cytochrome c and 2′-deoxyadenosine 5′-triphosphate (datp) or adenosine triphosphate (atp) and activates this protease, which in turn activates a downstream cascade of proteases (see figure 4.2). 54 by and large, apaf-1 deficiency is embryonically lethal and the embryos exhibit brain abnormalities similar to those seen in caspase-9 knockout mice. 55 these genetic findings support the idea that apaf-1 is coupled to caspase-9 in the death pathway. unlike ced-4 in nematodes, apaf-1 requires the binding of atp and cytochrome c to activate procaspase-9. the multiple wd40 repeats in the c-terminal end of apaf-1 have a regulatory role in the activation of caspase-9. 56 the ced-9 homolog in mammals is the bcl-2 protein. bcl-2 was first discovered in b-cell lymphoma as a protooncogene. overexpression of bcl-2 was shown to offer protection against a variety of death stimuli. 57 the bcl-2 protein family includes both antiapoptotic (bcl-2, bcl-xl, bcl-w, mcl-1, nr13, and a1/bfl-1) and proapoptotic proteins (bax, bak, bok, diva, bcl-xs, bik, bim, hrk, nip3, nix, bad, and bid).
58 these proteins are characterized by the presence of bcl-2 homology (bh) domains: bh1, bh2, bh3, and bh4 (figure 4.3). the proapoptotic members have two subfamilies: a multidomain and a bh3-only group (see figure 4.3). the relative ratio of pro- and antiapoptotic proteins determines the sensitivity of cells to various apoptotic stimuli. the best-studied proapoptotic members are bax and bid. exposure to various apoptotic stimuli leads to translocation of bax from the cytosol to the mitochondrial membrane. 59 bax oligomerizes on the mitochondrial membrane along with another proapoptotic protein, bak, leading to the release of cytochrome c from the mitochondria into the cytosol. 60 other proapoptotic proteins, mainly the bh3-only proteins, are thought to aid in bax-bak oligomerization on the mitochondrial membrane. the antiapoptotic bcl-2 family members are known to block bax-bak oligomerization on the mitochondrial membrane and subsequent release of cytochrome c into the cytosol. 60, 61 after release from the mitochondria, cytochrome c is known to interact with the wd40 repeats of the adapter protein apaf-1, resulting in the formation of the apoptosome complex. seven molecules of apaf-1, interacting through their n-terminal caspase activation and recruitment domain, form the central hub region of the symmetric wheel-like structure, the apoptosome. binding of atp/datp to apaf-1 triggers the formation of the apoptosome, which subsequently recruits procaspase-9 into the apoptosome complex, resulting in its activation. 62 activated caspase-9 then activates executioner caspases, such as caspase-3 and caspase-7, eventually leading to programmed cell death. the iaps, first discovered in baculoviruses and then in insects and drosophila, inhibit activated caspases by directly binding to the active enzymes. 63 these proteins contain one or more baculovirus inhibitor of apoptosis repeat domains, which are responsible for the caspase inhibitory activity.
64 to date, eight mammalian iaps have been identified. they include x-linked iap (xiap), c-iap1, c-iap2, melanoma iap (ml-iap)/livin, iap-like protein-2 (ilp-2), neuronal apoptosis-inhibitory protein (naip), bruce/apollon, and survivin. in mammals, caspase-3, -7, and -9 are inhibited by iaps. 62 there are reports suggesting aberrant expression of iaps in many cancer tissues. for example, c-iap1 is overexpressed in esophageal squamous cell carcinoma, 65 the c-iap2 locus is translocated in mucosa-associated lymphoid tissue lymphoma, 66 and survivin has been shown to be upregulated in many cancer cells. 67 the caspase inhibitory activity of iaps is inhibited by proteins containing an iap-binding tetrapeptide motif. 62 the founding member of this family is smac/diablo, which is released from the mitochondrial intermembrane space into the cytosol during apoptosis. in the cytosol, it interacts with several iaps and inhibits their function. another mitochondrial protein, omi/htra2, is also known to antagonize xiap-mediated inhibition of caspase-9 at high concentrations. 68 a serine protease, omi/htra2 can proteolytically cleave and inactivate iap proteins and thus is considered to be a more potent suppressor of iaps than smac. 69 it has been reported that the heat shock proteins hsp90, hsp70, and hsp27 can inhibit caspase activation by cytochrome c by interacting either with apaf-1 or with other players in the pathway. 70-72 a high-throughput screen identified a compound called petcm (α-[trichloromethyl]-4-pyridineethanol) as a caspase-3 activator. further work with petcm revealed its involvement in apoptosome regulation. 73 this pathway also includes the oncoprotein prothymosin-α and the tumor suppressor putative hla-dr-associated proteins. the putative hla-dr-associated proteins were shown to promote caspase-9 activation after apoptosome formation, whereas prothymosin-α inhibited caspase-9 activation by inhibiting apoptosome formation.
in an apoptotic cell, the regulatory, structural, and housekeeping proteins are the main targets of the caspases. the regulatory proteins mitogen-activated protein/extracellular signal-regulated kinase kinase-1, p21-activated kinase-2, and mst-1 are activated upon cleavage by caspases. 74 caspase-mediated protein hydrolysis inactivates other proteins, including focal adhesion kinase, phosphatidylinositol-3 kinase, akt, raf-1, iaps, and inhibitors of caspase-activated dnase (icad). cleavage by caspases also converts the antiapoptotic protein bcl-2 into a bax-like proapoptotic protein. there are many structural protein targets of caspases, which include nuclear lamins, actin, and regulatory proteins such as spectrin, gelsolin, and fodrins. 75 degradation of nuclear dna into internucleosomal chromatin fragments is one of the hallmarks of apoptotic cell death that occurs in response to various apoptotic stimuli in a wide variety of cells. a specific dnase, cad (caspase-activated dnase), that cleaves chromosomal dna in a caspase-dependent manner, is synthesized with the help of icad. in proliferating cells, cad is always found to be associated with icad in the cytosol. when cells are undergoing apoptosis, caspases (particularly caspase-3) cleave icad to release cad and allow its translocation to the nucleus to cleave chromosomal dna. thus, cells that are icad deficient or that express a caspase-resistant icad mutant do not exhibit dna fragmentation during apoptosis. apoptosis plays a critical role in the postnatal lung. 76 regulated removal of inflammatory cells by apoptosis helps in the resolution of inflammation in the lung. 77 recent evidence also supports a role for apoptosis in the remodeling of lung tissue after acute lung injury 78 and in the pathogenesis of chronic pulmonary hypertension, 79 idiopathic pulmonary fibrosis, and chronic obstructive pulmonary disease.
80, 81

acute lung injury/acute respiratory distress syndrome

acute lung injury, which clinically manifests itself as the acute respiratory distress syndrome (ards), involves disruption of the alveolar epithelium and endothelium, increased vascular permeability, and edema. two main hypotheses link the pathogenesis of ards to apoptosis, namely, the "neutrophilic hypothesis" and the "epithelial hypothesis." these two hypotheses are not mutually exclusive, and both could play important roles in the pathogenesis of ards. the neutrophilic hypothesis suggests that neutrophil apoptosis plays an important role in the resolution of inflammation and that the inhibition of neutrophil apoptosis or the inhibition of clearance of apoptotic neutrophils is deleterious in ards. 82, 83 studies in humans showed that bronchoalveolar lavage fluids from patients with early ards inhibit the rate at which neutrophils develop apoptosis in vitro. 84 the inhibitory effect of bronchoalveolar lavage fluids on neutrophil apoptosis is mediated by granulocyte/macrophage colony-stimulating factor, and possibly by il-8 and il-2. 85, 86 a membrane surface molecule, cd44, has been shown to play an important role in the clearance of apoptotic cells in vivo and in vitro. 87 in a model of bleomycin-induced lung injury, cd44-deficient mice failed to clear apoptotic neutrophils, which was associated with worsened inflammation and increased mortality. 87 activation of phagocytic cells inhibits production of proinflammatory cytokines, including il-1β, il-8, il-10, granulocyte/macrophage colony-stimulating factor, and tnf-α and increases release of anti-inflammatory mediators such as transforming growth factor-β, prostaglandin e2, and platelet-activating factor. 88, 89 the net effects of these changes could favor resolution of inflammation.
the epithelial hypothesis suggests that the apoptotic death of alveolar epithelial cells, in response to soluble mediators such as fas ligand, contributes to the prominent alveolar epithelial injury characteristic of ards. several lines of evidence suggest a role for the fas/fas ligand system in epithelial cell apoptosis. 90 fas is expressed on alveolar and airway epithelial cells, 91, 92 and its expression increases in response to inflammatory mediators such as lipopolysaccharide. fas-mediated lung cell apoptosis is modulated by surfactant protein a, which inhibits apoptosis in vivo. 93

chronic obstructive pulmonary disease

chronic obstructive pulmonary disease, caused primarily by smoking, generally refers to chronic bronchitis and emphysema. several factors, including protease/antiprotease imbalance, oxidative stress, cigarette smoke-derived toxins, and inflammation mediated by neutrophils, macrophages, and cd8+ t cells, have been shown to contribute to the disease process. furthermore, matrix metalloproteinase 94 and vascular endothelial growth factor receptor inhibition, 95, 96 but not fas/fas ligand, have been shown to play a role in the development of emphysema.

asthma

allergic asthma is characterized by intermittent or persistent bronchoconstriction and has been linked to airway remodeling and chronic inflammation, with increased numbers of eosinophils, cd4+ t cells, and mast cells. although at present a role for apoptosis in asthma is not confirmed, studies ex vivo have shown reduced apoptosis of circulating peripheral cd4+ t cells and eosinophils in asthma, which might contribute to inflammation. corticosteroids used to reduce inflammation in asthma have been shown to induce eosinophil apoptosis. 97

pulmonary fibrosis is characterized by epithelial damage, fibroblast proliferation, and deposition of collagen.
although the mechanism of alveolar epithelial cell apoptosis in pulmonary fibrosis is not known, several reports have suggested that the fas pathway, 98 the angiotensin pathway, 99 activated t cell-derived perforin, 100 il-13 stimulation, 101 and transforming growth factor-β1 activation 102 play critical roles.

because insufficient apoptosis is often associated with tumorigenesis, modulation of apoptotic and antiapoptotic targets seems to be an attractive approach to cancer therapy. lung cancers can be divided into small cell lung cancers (sclcs) and non-small cell lung cancers (nsclcs). 103 the sclcs are relatively more sensitive to anticancer drugs and irradiation than are the nsclcs, 104 but the molecular basis for this difference is not clearly known. evaluation of apoptosis-associated substances has shown that caspase-8, fas, and fas ligand are often downregulated in sclcs but not in nsclcs. 105 an investigation of the basis for these differences revealed that there were no differences in the levels of bax and bcl-xl, but the expression of bcl-2 was found to be significantly higher in sclc than in nsclc cell lines. the observation that in some cases bcl-2 can be converted into a proapoptotic bax-like death molecule may offer an explanation for the paradoxic expression of bcl-2 in sclc. 106 the lack of expression of procaspase-1, -4, -8, and -10 107 reported in sclc suggests that these caspases probably do not contribute to spontaneous apoptosis in these cells. the apoptosis regulators apaf-1 and procaspase-3 are overexpressed and are functional in nsclc cell lines. in both types of lung cancer, apoptotic stimuli result in cytochrome c release and activation of caspase-9 and caspase-3, but only sclc cell lines showed a relocalization of caspase-3 into the nucleus 108 ; this suggests that the resistance of nsclc cell lines is probably due to defective relocalization of caspase-3.
the expression of caspase-9 and caspase-7 in nsclcs was found to be similar to that in normal lung tissue. 109 however, these cell lines express casp9b, a splice variant of caspase-9 that acts as an apoptosis inhibitor. in vitro, chemotherapy-resistant nsclc cell lines exhibit decreased caspase-9 and caspase-3 expression, 110 which suggests an inhibition of apoptosis induction via apoptosome formation in nsclc. additionally, both nsclc and sclc cells express high and almost equal levels of survivin. 107 the resistant nsclc cells showed higher expression of c-iap2, and the radiosensitive sclc cells exhibited increased expression of xiap. 111 these results suggest no correlation between the level of expression of the iaps and the difference in radiosensitivity between nsclc and sclc cells.

cell death has become an area of intense interest and investigation in science and medicine because of the recognition that cell death, in general, and apoptosis, in particular, are important features of many biologic processes. involvement of many genes in the death process suggests that cell death is a complex phenomenon with many redundant mechanisms to ensure definitiveness. the realization that defective cell death plays a central role in the pathogenesis of diseases has stimulated work on therapies targeted to these processes, and this work will undoubtedly continue in the future.

apoptosis: a basic biological phenomenon with wide-ranging implications in tissue kinetics genetic control of programmed cell death in the nematode c.
elegans programmed cell death in animal development apoptosis: definition, mechanisms, and relevance to disease apoptosis as a therapeutic target for the treatment of lung disease four deaths and a funeral: from caspases to alternative mechanisms dual signaling of the fas receptor: initiation of both apoptotic and necrotic cell death pathways alkylating dna damage stimulates a regulated form of necrotic cell death a novel response of cancer cells to radiation involves autophagy and formation of acidic vesicles autophagy: in sickness and in health tissue protein turnover during liver carcinogenesis reduced autophagic activity in primary rat hepatocellular carcinoma and ascites hepatoma cells regulation of an atg7-beclin 1 program of autophagic cell death by caspase-8 role of bcl-2 family proteins in a non-apoptotic programmed cell death dependent on autophagy genes an alternative, nonapoptotic form of programmed cell death paraptosis: mediation by map kinases and inhibition by aip-1/alix autoschizis: a novel cell death inhibition of the development of metastases by dietary vitamin c:k3 combination autoschizis: a new form of cell death for human ovarian carcinoma cells following ascorbate/menadione treatment. nuclear and dna degradation the molecular biology of apoptosis evolutionary conservation of a genetic pathway of programmed cell death interaction between the c. elegans cell-death regulators ced-9 and ced-4 interaction and regulation of the caenorhabditis elegans death protease ced-3 by ced-4 and ced-9 caspases: enemies within vital functions for lethal caspases mechanism of xiap-mediated inhibition of caspase-9 insights into the regulatory mechanism for caspase-8 activation a combinatorial approach defines specificities of members of the caspase family and granzyme b.
functional relationships established for key mediators of apoptosis signalling by cd95 and tnf receptors: not only life and death apoptosis control by death and decoy receptors the tnf receptor 1-associated protein tradd signals cell death and nf-kappa b activation fadd, a novel death domain-containing protein, interacts with the death domain of fas and initiates apoptosis caspase-6 is the direct activator of caspase-8 in the cytochrome c-induced apoptosis pathway: absolute requirement for removal of caspase-6 prodomain two cd95 (apo-1/fas) signaling pathways induction of cell death by tumour necrosis factor (tnf) receptor 2, cd40 and cd30: a role for tnf-r1 activation by endogenous membrane-anchored tnf tumor necrosis factor (tnf) receptor 1 signaling downstream of tnf receptor-associated factor 2. nuclear factor kappab (nfkappab)-inducing kinase requirement for activation of activating protein 1 and nfkappab but not of c-jun n-terminal kinase/stress-activated protein kinase involvement of mach, a novel mort1/fadd-interacting protease, in fas/apo-1- and tnf receptor-induced cell death the ced-4-homologous protein flash is involved in fas-mediated activation of caspase-8 during apoptosis viral flice-inhibitory proteins (flips) prevent apoptosis induced by death receptors memory and distribution of virus-specific cytotoxic t lymphocytes (ctls) and ctl precursors after rotavirus infection fas-dependent cd4+ cytotoxic t-cell-mediated pathogenesis during virus infection transcriptional regulation during p21waf1/cip1-induced apoptosis in human ovarian cancer cells activation of c-jun nh2-terminal kinase/stress-activated protein kinase (jnk/sapk) is critical for hypoxia-induced apoptosis of human malignant melanoma characterization of mice deficient in interleukin-1 beta converting enzyme reduced ischemic brain injury in interleukin-1 beta converting enzyme-deficient mice caspase-3 is required for dna fragmentation and morphological changes associated with apoptosis
targeted disruption of the mouse caspase 8 gene ablates cell death induction by the tnf receptors, fas/apo1, and dr3 and is lethal prenatally fadd: essential for embryo development and signaling from some, but not all, inducers of apoptosis reduced apoptosis and cytochrome c-mediated caspase activation in mice lacking caspase 9 the ins and outs of programmed cell death during c. elegans development c-flip(l) is a dual function regulator for caspase-8 activation and cd95-mediated apoptosis the flip side of flip apaf-1, a human protein homologous to c. elegans ced-4, participates in cytochrome c-dependent activation of caspase-3 an apaf-1.cytochrome c multimeric complex is a functional apoptosome that activates procaspase-9 apaf1 (ced-4 homolog) regulates programmed cell death in mammalian development autoactivation of procaspase-9 by apaf-1-mediated oligomerization bcl-2 inhibits death of central neural cells induced by multiple agents bcl-2 family proteins role of hypoxia-induced bax translocation and cytochrome c release in reoxygenation injury association of bax and bak homo-oligomers in mitochondria.
bax requirement for bak reorganization and cytochrome c release bcl-2 prevents bax oligomerization in the mitochondrial outer membrane mechanisms of caspase activation and inhibition during apoptosis diablo promotes apoptosis by removing miha/xiap from processed caspase 9 iap family proteins-suppressors of apoptosis identification of ciap1 as a candidate target gene within an amplicon at 11q22 in esophageal squamous cell carcinomas the apoptosis inhibitor gene api2 and a novel 18q gene, mlt, are recurrently rearranged in the t(11;18)(q21;q21) associated with mucosa-associated lymphoid tissue lymphomas a novel anti-apoptosis gene, survivin, expressed in cancer and lymphoma a serine protease, htra2, is released from the mitochondria and interacts with xiap, inducing cell death omi/htra2 catalytic cleavage of inhibitor of apoptosis (iap) irreversibly inactivates iaps and facilitates caspase activity in apoptosis hsp27 functions as a negative regulator of cytochrome c-dependent activation of procaspase-3 heat-shock protein 70 inhibits apoptosis by preventing recruitment of procaspase-9 to the apaf-1 apoptosome negative regulation of cytochrome c-mediated oligomerization of apaf-1 and activation of procaspase-9 by heat shock protein 90 distinctive roles of phap proteins and prothymosin-alpha in a death regulatory pathway caspase-dependent cleavage of signaling proteins during apoptosis.
a turn-off mechanism for anti-apoptotic signals caspase-mediated proteolysis during apoptosis: insights from apoptotic neutrophils programmed cell death contributes to postnatal lung development granulocyte apoptosis and its role in the resolution and control of lung inflammation apoptosis is a major pathway responsible for the resolution of type ii pneumocytes in acute lung injury mechanisms of structural remodeling in chronic pulmonary hypertension induction of apoptosis and pulmonary fibrosis in mice in response to ligation of fas antigen essential roles of the fas-fas ligand pathway in the development of pulmonary fibrosis granulocyte apoptosis and the control of inflammation macrophage engulfment of apoptotic neutrophils contributes to the resolution of acute pulmonary inflammation in vivo modulation of neutrophil apoptosis by granulocyte colony-stimulating factor and granulocyte/macrophage colony-stimulating factor during the course of acute respiratory distress syndrome g-csf and il-8 but not gm-csf correlate with severity of pulmonary neutrophilia in acute respiratory distress syndrome interleukin-2 involvement in early acute respiratory distress syndrome: relationship with polymorphonuclear neutrophil apoptosis and patient survival resolution of lung inflammation by cd44 macrophages that have ingested apoptotic cells in vitro inhibit proinflammatory cytokine production through autocrine/paracrine mechanisms involving tgf-beta, pge2, and paf phosphatidylserine-dependent ingestion of apoptotic cells promotes tgf-beta1 secretion and the resolution of inflammation recombinant human fas ligand induces alveolar epithelial cell apoptosis and lung injury in rabbits fas expression in pulmonary alveolar type ii cells expression of fas (cd95) and fasl (cd95l) in human airway epithelium natural protection from apoptosis by surfactant protein a in type ii pneumocytes upregulation of gelatinases a and b, collagenases 1 and 2, and increased parenchymal cell death
in copd inhibition of vegf receptors causes lung cell apoptosis and emphysema oxidative stress and apoptosis interact and cause emphysema due to vascular endothelial growth factor receptor blockade glucocorticoid-induced apoptosis in human eosinophils: mechanisms of action increased circulating levels of soluble fas ligand are correlated with disease activity in patients with fibrosing lung diseases bleomycin-induced apoptosis of alveolar epithelial cells requires angiotensin synthesis de novo the perforin mediated apoptotic pathway in lung injury and fibrosis interleukin-13 induces tissue fibrosis by selectively stimulating and activating transforming growth factor beta(1) early growth response gene 1-mediated apoptosis is essential for transforming growth factor beta1-induced pulmonary fibrosis united states lung carcinoma incidence trends: declining for most histologic types among males, increasing among females progress in understanding the molecular pathogenesis of human lung cancer loss of expression of death-inducing signaling complex (disc) components in lung cancer cell lines and the influence of myc amplification conversion of bcl-2 to a bax-like death effector by caspases differences in expression of pro-caspases in small cell and non-small cell lung carcinoma defective caspase-3 relocalization in non-small cell lung carcinoma increased expression of apaf-1 and procaspase-3 and the functionality of intrinsic apoptosis apparatus in non-small cell lung carcinoma rescue of death receptor and mitochondrial apoptosis signaling in resistant human nsclc in vivo expression of inhibitor of apoptosis proteins in small- and non-small-cell lung carcinoma cells

key: cord-002179-v8lpw4r7 authors: viktorovskaya, olga v.; greco, todd m.; cristea, ileana m.; thompson, sunnie r.
title: identification of rna binding proteins associated with dengue virus rna in infected cells reveals temporally distinct host factor requirements date: 2016-08-24 journal: plos negl trop dis doi: 10.1371/journal.pntd.0004921 sha: doc_id: 2179 cord_uid: v8lpw4r7 background: there are currently no vaccines or antivirals available for dengue virus infection, which can cause dengue hemorrhagic fever and death. a better understanding of the host-pathogen interaction is required to develop effective therapies to treat denv. in particular, very little is known about how cellular rna binding proteins interact with viral rnas. rnas within cells are not naked; rather they are coated with proteins that affect localization, stability, translation and (for viruses) replication. methodology/principal findings: seventy-nine novel rna binding proteins for dengue virus (denv) were identified by cross-linking proteins to dengue viral rna during a live infection in human cells. these cellular proteins were specific and distinct from those previously identified for poliovirus, suggesting a specialized role for these factors in denv amplification. knockdown of these proteins demonstrated their function as viral host factors, with evidence for some factors acting early and others acting late in infection. their requirement by denv for efficient amplification is likely specific, since protein knockdown did not impair the cell fitness for viral amplification of an unrelated virus. the protein abundances of these host factors were not significantly altered during denv infection, suggesting their interaction with denv rna was due to specific recruitment mechanisms. however, at the global proteome level, denv altered the abundances of proteins in particular classes, including transporter proteins, which were downregulated, and proteins in the ubiquitin proteasome pathway, which were upregulated.
conclusions/significance: the method for identification of host factors described here is robust and broadly applicable to all rna viruses, providing an avenue to determine the conserved or distinct mechanisms through which diverse viruses manage the viral rna within cells. this study significantly increases the number of cellular factors known to interact with denv and reveals how denv modulates and usurps cellular proteins for efficient amplification. dengue is a mosquito-borne viral disease that affects 50-100 million people annually; infection is either asymptomatic or results in a flu-like dengue fever. however, tens of thousands of people develop the more severe, and sometimes fatal, dengue hemorrhagic fever/shock syndrome (dhf/dss) [1]. denv is found in most tropical and many subtropical areas, with more than 125 countries being endemic for denv [2]. there is no approved vaccine or antiviral therapeutic available for this life-threatening disease. given the seriousness of infection, the expanding geographical range of denv, and the limitations in the existing measures of control and prevention, there is a pressing need to better understand the biology and pathogenesis of denv. denv is a single-stranded positive-sense rna virus that belongs to the flaviviridae family. it has a 5' cap, no poly(a) tail, highly structured 5'- and 3'-untranslated regions (utr), and a single open reading frame (orf) [reviewed in [3]]. following virus entry, the viral rna is released into the cytoplasm. viral translation and replication occur in membranous assembly "factories" localized in the perinuclear region of the endoplasmic reticulum (er) [4]. the positive-stranded rna molecules are encapsidated; virions are further processed as they are transported through the secretory pathway to the cell surface and released extracellularly [reviewed in [3]].
in addition to the viral proteins, cellular proteins, termed host factors, participate in most, if not all, steps of the denv life cycle, including entry, translation, replication, virion assembly, and release [5]. since viruses require host factors for efficient amplification, targeting host factors can provide an effective antiviral strategy over which the virus has no genetic control. therefore, it may be more difficult for viruses to evolve escape mutants that can replicate efficiently in the absence of the host factor [5, 6]. several cellular proteins are known to impact denv infection. for example, the polypyrimidine-tract-binding protein (ptbp1) is relocalized from the nucleus to the cytoplasm following denv infection, where it enhances denv amplification by binding to the denv 3'utr and to ns4a, a viral protein required for the formation of the viral replication complex [7] [8] [9] [10]. ptbp1 may also stimulate denv translation [8], although this is still controversial [7, 9]. while most of the known denv rna binding proteins enhance viral amplification, several reduce denv titers [10] [11] [12]. one such factor, ybx1, inhibits viral translation [12]. although previous studies have laid a foundation for establishing critical interactions between viral rna and cellular proteins [[13] and reviewed in [14]], the host factors identified thus far likely represent only a fraction of the total network of denv host factors. previously, we have described a high-throughput mass spectrometry method termed tux-ms (thiouracil cross-linking mass spectrometry) to identify host factors that interact with viral rna during a live infection in cell culture [15]. importantly, tux-ms allows for identification of proteins that are bound directly to the viral rna in living cells.
briefly, during a viral infection in cell culture, thiouridine is biosynthetically incorporated into the viral rna to serve as a zero-distance cross-linker upon exposure to ultraviolet (uv) light. thus, proteins that are bound directly to the viral rna during a live infection are cross-linked to the rna prior to disruption of cellular compartmentalization. this is particularly valuable for the identification of denv host proteins, since denv amplification is tightly associated with cellular membrane structures [4]. following cross-linking, the viral rna is isolated together with the cross-linked proteins under denaturing conditions, and the proteins are identified by mass spectrometry-based proteomics. using tux-ms, we reported previously the successful identification and validation of host factors for poliovirus, pointing to a low false discovery rate of < 12% [15]. here, we expanded the tux-ms methodology for use with other types of rna viruses, and investigated rna-protein interactions during denv infection. we modified the method to use virus-specific dna oligos to capture the viral rna and cross-linked proteins. furthermore, we used metabolic labelling with stable isotopes to accurately quantify relative protein levels. this quantitative thiouridine cross-linking mass spectrometry (qtux-ms) analysis identified 79 novel host proteins, which were not previously shown to be involved in denv infection. we placed these findings in the context of whole proteome changes upon denv infection, and further validated and functionally analysed a subset of the novel denv host factor candidates. overall, validation of the qtux-ms-identified factors using secondary assays indicates a low rate of false positives (17%), suggesting that the majority of the other factors identified by qtux-ms may also play significant roles in denv amplification.
hela uprt cells expressing uracil phosphoribosyltransferase (uprt) were generated previously by transduction of hela cells (atcc, ccl-2) with uprt-gene containing lentivirus [15]. huh7.5 uprt cells were generated by transduction of huh7.5 cells (a kind gift from charles m. rice, rockefeller university) with the same lentiviral construct as in [15]. hela uprt and huh7.5 uprt cells were cultured at 37°c and 5% co2 in complete dulbecco's modified eagle's medium (dmem) supplemented with 10% fbs (fetal bovine serum; atlanta biologicals) and penicillin-streptomycin and grown with 1 mg/ml g418 (sigma) to select for the uprt gene; huh7.5 uprt cells were additionally supplemented with non-essential amino acids (cellgro). for silac labelling, huh7.5 uprt cells were passaged 1:10 at least twice in silac dmem (thermo scientific) with 10% dialyzed heat-inactivated fbs (thermo), l-proline (200 mg/l) and penicillin-streptomycin, and either 50 mg/l 'heavy' (13c6 l-lysine and 13c6-15n4 l-arginine; cambridge isotope laboratories, inc.) or 40 mg/l 'light' l-lysine and l-arginine amino acids [16]. dengue virus serotype 2 (denv2), strain 16681 (genbank accession number u87411) was kindly provided by dr. robert striker (university of wisconsin-madison). denv2 was propagated in mosquito c6/36 cells at 28°c and 5% co2 in advanced dmem supplemented with 10% fbs, penicillin-streptomycin (cellgro), l-glutamate and 10% tryptose phosphate broth (20 g/l tryptose; 2 g/l glucose; 5 g/l sodium chloride and 2.5 g/l disodium hydrogen phosphate). titers for denv2 were determined using plaque assays in bhk cells. for infections, cells were incubated with virus-containing medium for 2 hours, washed twice with dmem after removal of the virus and incubated in serum-free dmem for the indicated time. infections and titer determination of adenovirus 5 (ad5) were performed exactly as in [15].
the denv antisense biotin-labelled dna fragments were generated by pcr from the denv2 complementary dna (cdna) using the primers listed in s1 table, and correspond to positions 4350-4914 and 4740-4914 of the denv genome. pcr was followed by removal of the unlabelled dna strand according to the nanolink streptavidin magnetic beads (solulink) manual. the mixture of the two biotinylated single-stranded dna fragments, 174 and 564 bases long, was bound to nanolink streptavidin magnetic beads according to the manufacturer's protocol. 1-3 x 10^7 human hepatoma huh7.5 uprt cells labelled with either 'light' or 'heavy' silac media were infected with denv2 (moi = 10) or mock-treated, respectively. then, the virus was replaced with silac media containing 1 mm 4-thiouracil and 10% fbs. at 28 hours post-infection (hpi) the cells were washed with pbs, irradiated with 365 nm uv light for 20 min and collected; cell pellets were frozen on dry ice and stored at -80°c. cell pellets were lysed in lysis buffer (50 mm tris-hcl ph 8, 4 mm magnesium chloride, 150 mm sodium chloride, 0.1% tween-20, 5 mm dithiothreitol, recombinant rnasin ribonuclease inhibitor [500 units/ml; promega], 1× complete protease inhibitor cocktail [roche]) with a half volume of 465-600 μm glass beads (sigma) by shaking at a frequency of 30 hz for 1 min using a retsch mm200 mixer mill. an aliquot of each of the 'light' and 'heavy' cell lysates was removed, and the remaining lysates were incubated with the streptavidin magnetic beads carrying the denv antisense dna fragments for 15 to 30 min to allow the viral rna to bind. the beads were washed twice with wash buffer i (50 mm tris-hcl ph 8, 500 mm potassium chloride, 0.1% tween-20), once with wash buffer ii (50 mm tris-hcl ph 8, 150 mm sodium chloride, 0.5% sodium deoxycholate) and once with 10 mm tris ph 7.6. the samples were eluted at 65°c for 2 min in 20 μl of 10 mm tris ph 8.0.
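the probe lengths and the viral input implied by the protocol above can be checked with simple arithmetic. the sketch below assumes that the quoted genome positions act as plain start and end coordinates (length = end - start) and that moi means plaque-forming units (pfu) per cell; the function names are illustrative and do not come from the original study.

```python
def fragment_length(start: int, end: int) -> int:
    """Probe length, assuming length = end - start for the quoted genome positions."""
    if end <= start:
        raise ValueError("end must be greater than start")
    return end - start


def pfu_required(cell_count: float, moi: float) -> float:
    """Plaque-forming units needed to infect cell_count cells at the given MOI (PFU per cell)."""
    return cell_count * moi


# Genome positions of the two antisense capture fragments quoted in the text.
probes = [(4350, 4914), (4740, 4914)]
lengths = [fragment_length(start, end) for start, end in probes]
print(lengths)  # [564, 174]

# Range of cell numbers quoted in the text, infected at MOI = 10.
for cells in (1e7, 3e7):
    print(f"{cells:.0e} cells at MOI 10 need {pfu_required(cells, 10):.0e} PFU")
```

under these assumptions the two coordinate ranges reproduce the 564 and 174 fragment lengths given in the text, and infecting 1-3 x 10^7 cells at moi = 10 corresponds to 1-3 x 10^8 pfu.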
the eluted 'light' and 'heavy' samples were mixed at a 1:1 ratio, and rna was degraded with 0.5 ng/μl bovine rnase a (fisher scientific) at 25°c for 24 hrs. the samples were then subjected to quantitative mass spectrometry-based proteomic analysis.

quantitative mass spectrometry-based proteomic analysis of rna-interacting proteins (qtux-ms)

protein eluates and their respective mixed light/heavy input lysates were subjected to in-solution enzymatic digestion using a filter-aided sample preparation approach [17], then analyzed by nlc-ms/ms, as described in s1 methods and in [18]. light (denv2) and heavy-labelled (mock) cell pellets were lysed in 100 mm abc containing 5% sodium deoxycholate at 95°c to ensure denaturation and virus inactivation. the protein concentrations were determined by the bca assay and the lysates were mixed in equal protein amounts (100 μg total). proteins were subjected to in-solution digestion with trypsin, then fractionated and analyzed by nlc-ms/ms as described in the s1 methods.

proteomic data analysis

qtux-ms, its respective mixed input lysate, and the whole cell silac instrument raw data were separately processed using the maxquant software (ver. 1.5.3.8), configured with default settings, except for experiment-specific parameters, which are described in the s1 methods. the mass spectrometry proteomics data have been deposited to the proteomexchange consortium via the pride [19] partner repository with the dataset identifier pxd003593. using the filtered list of protein identifications, unique gene symbols were used for downstream functional ontology analyses. the gene ontology annotations from uniprot were used to generate and assign the denv2 rna interacting candidates into broader functional categories. for the whole cell silac protein expression study, genes and their associated ratios were analyzed by panther gene enrichment (panther database ver 10.0, 2015-05-15) [20] using the panther pathway and protein class ontologies.
significant differential protein abundance was determined as a function of ontological classification versus the overall population (bonferroni-corrected p-value < 0.05). for specific functional protein ontologies that were differentially regulated, a subset was selected for analysis by the reactome functional interaction (fi) network cytoscape plugin [21]. the reactome fi plugin was used to visualize candidate host factors identified by qtux-ms.

sirna transfections

2 x 10^6 hela uprt cells were transfected with 350 pmol of silencer select negative control (ambion) or the specified sirnas (s1 table) using the x-tremegene sirna transfection reagent system (roche). 24 hrs later, 4 x 10^5 cells/well were plated for infection. 48 hours post transfection, cells were infected (moi = 0.1) with denv2 or ad5 (n = 3). at either 30 (ad5) or 40 (denv2) hpi, virus titers were determined by plaque assays using 911 or bhk cells, respectively. knockdown efficiency was measured 48 hrs post transfection by rt-qpcr. experiments were performed in two biological replicates for each host factor. statistical analysis was performed using student's t-test.

cell viability and proliferation assay

24 hrs post sirna transfection, equal numbers of cells (either 2 x 10^3 or 8 x 10^3) were seeded in 96-well plates. cell viability was measured 48 hours after sirna-mediated knockdown of individual host factors using the vybrant mtt [3-(4,5-dimethyl-2-thiazolyl)-2,5-diphenyl-2h-tetrazolium bromide] assay kit (invitrogen) according to the manufacturer's protocol and reported relative to the negative-control sirna (set to 100%; n = 3). statistical analysis was performed using student's t-test.

cdna was generated from 1 μg of trizol (ambion)-purified total rna using moloney murine leukemia virus (mmlv) reverse transcriptase (promega) and random primers (invitrogen) as described by the manufacturer. qpcr was performed using iq sybr green supermix (bio-rad) with the primers listed in the s1 table.
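knockdown efficiency measured by rt-qpcr is commonly reported as target mrna abundance relative to a reference gene and to the control sirna sample. the sketch below assumes the widely used 2^-ΔΔct calculation, which the source does not explicitly name; β-actin as the reference gene is taken from the supporting figure legend, while the function name and the ct values are hypothetical:

```python
def relative_expression(ct_target_kd: float, ct_ref_kd: float,
                        ct_target_ctrl: float, ct_ref_ctrl: float,
                        efficiency: float = 1.0) -> float:
    """Target mRNA level in knockdown cells relative to control-siRNA cells.

    Each sample is first normalised to the reference gene (delta Ct), then the
    knockdown sample is compared with the control sample (delta-delta Ct).
    `efficiency` is the primer amplification efficiency as a fraction
    (1.0 = 100%, i.e. the product doubles every cycle).
    """
    delta_kd = ct_target_kd - ct_ref_kd
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    delta_delta = delta_kd - delta_ctrl
    return (1.0 + efficiency) ** (-delta_delta)

# hypothetical Ct values: the target appears two cycles later after knockdown,
# while the reference gene (e.g. beta-actin) is unchanged
remaining = relative_expression(ct_target_kd=26.0, ct_ref_kd=18.0,
                                ct_target_ctrl=24.0, ct_ref_ctrl=18.0)
print(f"target mrna remaining: {remaining:.0%}")
```

the calculation is only as good as the assumption that target and reference primers amplify with similar, near-100% efficiency.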
the amplification efficiency for each primer set was 100±10% as determined using a standard curve.

development of a quantitative thiouridine cross-linking mass spectrometry (qtux-ms) method for identification of proteins associated with the denv rna

tux-ms can be used to identify host factors by incorporating 4-thiouridine (4su), a zero-distance cross-linker, into the viral rna (vrna) to enable cross-linking of proteins bound to vrna during a live infection in cell culture [15]. cross-linking is carried out under physiological conditions prior to cell lysis to ensure specificity and reduce false positives from non-specific rna-protein interactions that occur upon loss of compartmentalization. vrna is isolated under denaturing conditions and cross-linked proteins are identified using liquid chromatography-tandem mass spectrometry (lc-ms/ms). to improve quantification of the tux-ms identified host proteins (qtux-ms), a silac (stable isotope-labelled amino acids in cell culture) approach [16] was used to label the uninfected (mock) and infected cells with either 'heavy' or 'light' amino acids, respectively (fig 1a). when 4-thiouracil (4tu) is present in the medium, huh7.5 uprt human hepatoma cells, which stably express uprt (uracil phosphoribosyltransferase), convert 4tu to 4-thiouridine monophosphate (4sump). the 4sump is then converted to 4-thiouridine triphosphate (4sutp) by cellular kinases [22]. both cellular and viral rna polymerases use 4sutp as a substrate during rna synthesis, and the incorporated 4su serves as a zero-distance cross-linker, covalently binding proteins to rnas upon exposure to long-wave uv light. importantly, protein-protein cross-linking is very inefficient at long uv wavelengths, ensuring that only proteins in direct contact with the reactive thiol group of the 4su-containing rna will be cross-linked [23]. we have shown previously that candidate vrna-binding proteins identified by tux-ms (and confirmed by western) specifically co-isolated with viral rna upon immunoisolation [15].
together, these findings established that tux-ms can identify bona fide interactions between host proteins and viral rna. the tux-ms method was originally developed to capture polyadenylated rna using oligo(dt) beads [15]. however, as denv rna is not polyadenylated, we modified the method to use sequence-specific capture of the vrna using magnetic beads. following cross-linking in huh7.5 uprt cells infected with denv at 28 hpi and affinity capture of the vrna, the ribonucleoprotein complexes were eluted from the beads, 'heavy' and 'light' eluates were mixed, and rnase a was used to degrade the rna. the proteins were digested in-solution with trypsin and subjected to quantitative ms-based proteomics (fig 1b). the median 'light' to 'heavy' peptide and protein ratios were calculated, reflecting the specificity of vrna-protein capture. we identified several classes of proteins, including denv proteins, known denv host factors, and putative rna-interacting host proteins, but most of the qtux-ms identified factors have not been previously identified through interactions with denv (s2 table). consistent with previous knowledge of flaviviral rna [24] [25] [26] [27], our qtux-ms analysis identified several viral proteins (c, e, ns3, ns4a and ns5) as associated with vrna. in addition, qtux-ms identified six known denv host factors: polypyrimidine tract-binding protein 1 (ptbp1), interleukin enhancer-binding factor 3 (ilf3), calreticulin (calr), calnexin and heterogeneous nuclear ribonucleoproteins hnrnp h1 and hnrnp k, as well as a known denv anti-viral protein, eukaryotic initiation factor 4a (eif4a), and 12 other proteins previously shown to associate with denv rna or proteins (s2 table). altogether, the recovery of several known host factors suggests that the adapted method is effective at identifying host factors for denv.
fig 1. (a) ... (blue) amino acids are infected with denv or treated with mock, respectively, in the presence of 4tu. 4su is incorporated into cellular and denv rnas and proteins are uv cross-linked to the contacting thio-containing rna (represented as either balls to indicate native conformation or curved lines to indicate denatured proteins) in living cells at 28 hpi prior to cell lysis under denaturing conditions. viral ribonucleoprotein complexes were isolated using dna molecules complementary to denv rna bound to magnetic beads, the rna was degraded with rnase a and the proteins were identified by mass spectrometry. (b) workflow for quantitative proteomic analysis of rna-bound host factors isolated in (a). isolated proteins were mixed between mock and virus-infected samples, digested into peptides, and analysed by mass spectrometry. relative 'light' and 'heavy' peptide abundances were quantified to determine the specificity of interaction. host factor candidates were identified and subjected to functional validation. doi:10.1371/journal.pntd.0004921.g001

for identification of novel host factors, cellular proteins with a denv/mock silac ratio of at least 1.5-fold (n ≥ 3 quantified peptides) were considered putative denv vrna interactions. this threshold was selected by considering the median variance in the silac ratio (for proteins with > 3 quantified peptides), which was approximately 25%. therefore, we opted for a conservative cut-off at 50%, representing twice this median value, or 1.5-fold. common environmental and non-human cell culture contaminants were excluded, since they existed in only the 'light' silac state. in addition, our qtux-ms samples also contained histones: h3, h4, h2a, h2b, h1.5 and macroh2a.1. in a previous study, histones were shown to play roles in dengue infection [28]. however, their functions were mediated through an interaction with a viral capsid protein and were shown to be independent of rna.
in addition, histones are primarily nuclear and highly abundant proteins commonly detected (> 50%) in control affinity purifications compiled across diverse protein-protein interaction studies [29] . for these reasons, histones likely represent non-specific associations rather than denv rna binding factors, and thus were excluded from further analysis. in total, 93 cellular proteins passed these selection criteria, 79 of which have not been previously shown to associate with denv (s2 table, s1 dataset). several of the known denv host factors were enriched but did not meet the stringent inclusion criteria (s2 table) , suggesting that there may be additional host factors below our enrichment threshold (see s1 dataset). importantly, the subset of 79 host factors represents a significant potential expansion in the number of known denv host factors, providing a valuable resource to test for pro-viral or antiviral activities during denv infection. it is well recognized that viral infections can induce significant changes in cellular proteomes [30] [31] [32] and an increase in protein levels during denv infection may contribute to the increased protein capture measured by qtux-ms. to address this, we used silac-ms to quantify proteome (i.e., total protein abundance) changes following denv infection. comparison of qtux-ms and proteome silac ratios showed that the protein abundances for the 93 qtux-ms-identified vrna-binding factors remained largely unchanged (fig 2a, s1 fig and s1 dataset). on average, for these proteins, the denv-induced changes in whole cell abundance were ±1.1-fold, suggesting that their identification by qtux-ms was not due to an increase in their abundance in the cell following denv infection. noteworthy, a retrospective qualitative comparison of the qtux-ms identified factors for denv with those identified in the tux-ms analysis on poliovirus revealed less than 10% were identified for both viruses [15] ( fig 2b) . 
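the selection criteria described above (a cut-off set at twice the median silac ratio variability, a minimum number of quantified peptides, and exclusion of contaminants and histones) can be sketched as a simple filter. this is an illustration under stated assumptions, not the authors' pipeline: the record layout, the placeholder gene names and ratios are hypothetical, and the peptide requirement is interpreted here as "at least three":

```python
from dataclasses import dataclass

@dataclass
class ProteinHit:
    gene: str            # placeholder identifiers, not real data
    ratio: float         # denv/mock ('light'/'heavy') silac ratio
    n_peptides: int      # number of quantified peptides
    excluded: bool = False  # True for contaminants or histones

def fold_cutoff(median_variability: float, multiplier: float = 2.0) -> float:
    """Fold-change cut-off placed `multiplier` times the median variability above 1."""
    return 1.0 + multiplier * median_variability

def select_candidates(hits, cutoff: float, min_peptides: int = 3):
    """Keep hits that pass the ratio, peptide-count and exclusion criteria."""
    return [
        h for h in hits
        if not h.excluded and h.n_peptides >= min_peptides and h.ratio >= cutoff
    ]

cutoff = fold_cutoff(0.25)  # ~25% median variability -> 1.5-fold

hits = [
    ProteinHit("GENE_A", ratio=2.3, n_peptides=8),
    ProteinHit("GENE_B", ratio=1.3, n_peptides=12),                  # below cut-off
    ProteinHit("GENE_C", ratio=4.0, n_peptides=2),                   # too few peptides
    ProteinHit("HISTONE_X", ratio=3.1, n_peptides=9, excluded=True),  # excluded class
    ProteinHit("GENE_D", ratio=1.5, n_peptides=3),
]

candidates = select_candidates(hits, cutoff)
print(cutoff, [h.gene for h in candidates])
```

whether the threshold is inclusive (>=) or strict (>) changes only borderline cases; the inclusive form is used here as an assumption.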
since the identified proteins are largely denv specific, qtux-ms is not biased towards identifying a sub-set of cellular rna binding proteins. taken together, our results indicate that the enrichment ratios measured by qtux-ms are predominantly due to specific association of these proteins with the denv rna. while proteins that bound denv rna did not show significant changes in abundance upon infection, we performed bioinformatics analysis on the complete proteome dataset of whole cell abundance to determine the global proteome effects of denv infection under these conditions. in total, 4,907 host proteins were quantified by silac ms in biological duplicates (s1 dataset). the abundance ratios between biological duplicates were reproducible, with only ~2% of the ratios varying by > 50% (fig 3a). from these duplicates, an average abundance ratio was calculated and the respective proteins were analyzed by panther gene enrichment analysis (s2 fig) [20]. this analysis found systematic up regulation of proteins in the ubiquitin proteasome pathway (upp), comprising 18 members of the 26s proteasome as well as various ubiquitin-conjugating enzymes (s3 fig). many of these enzymes are linked to ubiquitin-dependent proteasome degradation, consistent with the current knowledge that the upp is important for production of infectious virions [31, 33, 34]. yet, other enzymes, such as ube2n and ube2v2, catalyze polyubiquitination at lys-63, which does not lead to proteasome degradation but rather participates in transcriptional activation of target genes and may promote innate immune signaling [35, 36]. in contrast, proteins in the transporter protein class were on average down regulated (fig 3b and s2 fig). assembly of the annotated proteins into reactome protein networks identified several subnetworks with various transporter activities (s4 fig).
while the abundances of mitochondrial transporters and nucleoporins were the most consistently decreased, not all transporters were down regulated; for example, the lipoprotein (apo) transporters were increased in expression (fig 3c) . interestingly, the rna binding protein class was significantly down regulated (fig 3b and s2 fig) , though the effect was not as pronounced as the transporter class (fig 3b) . the overall down-regulation of rna binding proteins appears to be driven by changes in cytoplasmic and mitochondrial ribosomal subunits, and proteins involved in rna degradation and processing (s5 fig). nevertheless, the relative protein abundance for the set of 93 (known and putative) denv binding factors identified by qtux-ms was largely unchanged, despite being enriched in rna processing and translation factors (fig 2a) . overall, the quantitative proteome analysis suggests that denv selectively alters the abundance of proteins, and reveals several pathways that could be directly or indirectly modulated in the host response to denv infection. to gain insight into potential molecular mechanisms and biological processes of the 93 qtux-ms identified factors, we performed a functional network-based analysis using the curated human pathway relationships from the reactome database. this analysis revealed a high degree of connectivity, with 62 proteins forming a large interconnected network (fig 4) . the densest network connectivity included proteins involved in rna processing/translation (orange nodes) and dna binding/transcription (blue nodes). several additional proteins with rna and/or translational activities were also identified, but lacked annotation in reactome (orange single nodes). overall, our bioinformatic evaluation further supports the ability of qtux-ms to capture vrna-bound host factors and points to their possible function in denv amplification. 
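the connectivity result above (62 of the 93 proteins forming one large interconnected network, with others remaining as single nodes) corresponds to finding connected components in an interaction graph. a minimal sketch using breadth-first search follows; the node names p1 to p7 and the edges are hypothetical placeholders, not the reactome functional interaction data used in the study:

```python
from collections import deque

def connected_components(nodes, edges):
    """Return the connected components of an undirected graph, largest first."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # breadth-first search from each not-yet-visited node
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)

# toy graph: one four-protein network, one pair, one unconnected single node
nodes = ["p1", "p2", "p3", "p4", "p5", "p6", "p7"]
edges = [("p1", "p2"), ("p1", "p3"), ("p3", "p4"), ("p5", "p6")]

components = connected_components(nodes, edges)
largest = components[0]
singletons = [c for c in components if len(c) == 1]
print(len(largest), "of", len(nodes), "proteins in the largest network")
```

single nodes in such an analysis may simply lack annotated interactions in the database, which is why the text treats them separately rather than as negatives.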
since most of the factors were associated with rna processing in the reactome analysis (fig 4), we focused on these factors for functional analysis of their roles in dengue infection. we randomly selected six qtux-ms identified host proteins with functions in rna processing/translation, which were enhanced in the denv sample by 1.32- to 2.26-fold (denv/mock). validating factors that are only modestly enhanced in the qtux-ms analysis indicates whether factors near the 1.5-fold cut-off are bona fide host factors. we assessed the effect of sirna knockdown of these factors (fig 5a) on viral production. hela uprt cells were used for sirna silencing experiments due to higher sirna transfection efficiency compared with huh7.5 uprt cells. knockdown of five of the six qtux-ms identified candidates (non-pou domain-containing octamer-binding protein (nono), embryonic stem cell-specific 5-hydroxymethylcytosine-binding protein (hmces), rbmx (rna-binding motif protein, x chromosome), hnrnp m and hnrnp f) significantly decreased denv production, while knockdown of hnrnp l had no effect on denv titer (fig 5b; s5 fig). knockdown of ptbp1, a positive control [7, 8], also resulted in decreased viral titers (fig 5). for negative controls, we selected two rna binding proteins (ddx39 and hnrnp a0) that were not identified by qtux-ms. denv titers were not altered following silencing of these two proteins, suggesting that only specific rna binding proteins are used by denv. altogether, our data suggest that rbmx, nono, hmces, hnrnp m and hnrnp f are required for viral production. these results demonstrate that qtux-ms is a robust method with a low false discovery rate for high-throughput identification of viral host factors. reduced viral amplification could be due to compromised cell fitness rather than a specific requirement of the virus for a particular host factor.
using an mtt assay, we confirmed that knockdown of these factors did not impact cell viability (fig 5c) . as a positive control, knockdown of g3bp2 did reduce cell fitness, as previously shown (s6 fig) [37] . to more rigorously rule out any potential effects of host factor knockdown on cell fitness that would affect viral amplification, an unrelated virus (adenovirus 5), was amplified following knockdowns of the candidate factors. adenovirus 5 replication was not significantly decreased by rbmx, nono, hnrnp m, hnrnp f or hmces sirna knockdown (fig 5b and s6c fig) demonstrating that knockdown of these factors does not affect cell fitness for viral amplification. given that these proteins bind directly to viral rna and are required for viral amplification, rbmx, nono, hnrnp m, hnrnp f and hmces are novel denv host factors. to determine if the host factor is required for a step prior or subsequent to viral replication, vrna was quantified by qrt-pcr in the denv infected cells knocked down for rbmx, nono, hnrnp m, hnrnp f or hmces (fig 6) . as a positive control, knockdown of ptbp1, which is required for denv replication [7] , reduced denv rna levels. similarly, the intracellular vrna levels were reduced in cells knocked down for either hnrnp f, rbmx or hmces. the decrease in dengue rna levels (fig 6) is consistent with the decrease in viral titers ( fig 5b) . thus, hnrnp f, rbmx or hmces are required for the early steps in the viral replication cycle, such as translation, replication or rna stability. in contrast, knockdown of hnrnp m and nono did not change intracellular viral rna levels despite the dramatic decrease in viral titers, suggesting they may play a role downstream of replication. altogether, we have identified and validated five novel host factors for denv, demonstrating that qtux-ms can identify factors that function at different stages of the virus life cycle. 
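the reasoning used above to place each factor in the viral life cycle combines two read-outs: whether knockdown reduces viral titer, and whether it reduces intracellular vrna. a minimal sketch of that decision logic follows; the function name and category labels are hypothetical, while the per-factor observations are those reported in the text (vrna was not reported for hnrnp l, so it is left as none):

```python
from typing import Optional

def classify_factor(titer_reduced: bool, vrna_reduced: Optional[bool]) -> str:
    """Assign a tentative stage from titer and intracellular vRNA read-outs."""
    if not titer_reduced:
        return "not required"
    if vrna_reduced is None:
        return "required, stage undetermined"
    if vrna_reduced:
        return "early step (translation, replication or rna stability)"
    return "downstream of replication"

# (titer reduced on knockdown, vrna reduced on knockdown), as reported in the text
observations = {
    "hnrnp f": (True, True),
    "rbmx": (True, True),
    "hmces": (True, True),
    "hnrnp m": (True, False),
    "nono": (True, False),
    "hnrnp l": (False, None),
}

stages = {name: classify_factor(*obs) for name, obs in observations.items()}
for name, stage in stages.items():
    print(f"{name}: {stage}")
```

the labels are deliberately coarse: an unchanged vrna level with reduced titer points to a step after replication but does not by itself distinguish assembly, packaging or release.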
an estimated 40% of the world's population is at risk from dengue for which vaccines or antivirals are not yet available. since diagnostic tests can detect denv infection at early stages, administration of antivirals could significantly improve survival rates as viral load is correlated with symptom severity [38] . using antivirals that target host factors may limit the appearance of drug-resistant viruses and may be effective for all denv serotypes and possibly for multiple flaviviruses [5, 6, 39, 40] . the qtux-ms analysis identified 79 novel cellular proteins, for which the majority are distinct from those previously identified for poliovirus using a similar approach [15] . this suggests that unrelated rna viruses have evolved to utilize distinct host rna binding proteins. interestingly, ptbp1 and nono, which were identified in both the poliovirus and denv tux-ms analyses, were shown to be required for production of both viruses (this study, figs 5 and 6) [15, 41, 42] . further analysis of virus-specific and shared host factors will reveal whether unrelated viruses utilize similar or diverse mechanisms to control viral rna replication, processing and packaging within cells. the host factors that enhance amplification of both poliovirus and dengue, (fig 2b) could serve as attractive targets for the development of broad-spectrum antivirals. the novel denv rna interactions identified in our study reveal a large network of cellular proteins which belong to different functional classes primarily associated with the nucleic acid metabolism, including numerous components of splicing, rna processing and translation machineries. these factors likely play direct roles in denv translation, replication or packaging. in addition, we have detected multiple components of cell signaling and stress response, such as several members of 14-3-3 adapter proteins, heat shock proteins and β-catenin. 
these factors are known to regulate diverse pathways, including host innate immune and cellular homeostasis [43] [44] [45] suggesting their possible role in host antiviral response or viral strategies to subvert the innate immune response. we demonstrated that the majority of the qtux-ms factors that we selected for validation were required for efficient denv amplification (fig 5) . specifically, we found that hnrnp f, hmces and rbmx are required for the early steps in the viral life cycle. in contrast, hnrnp m and nono appear to act downstream of viral rna replication (fig 6) , which may be significant given that both have been shown to be in a complex together [46] . nono and its binding partners are predominantly nuclear, bind rna, and are involved in pre-mrna processing, splicing, and rna transport, as well as in transcriptional activation and repression [47] [48] [49] . interestingly, other nono binding partners: psf/sfpq (polypyrimidine tract-binding protein (ptb)-associated splicing factor) and matrin3 were identified by qtux-ms as well. many of the qtux-ms identified cellular proteins are hnrnps, which encompass a large class of rna binding proteins that either localize to the nucleus or shuttle between the nucleus and the cytoplasm in order to perform multiple functions in rna metabolism, from transcription to rna turnover [50] [51] [52] . importantly, the vast majority of these factors have established roles in viral infections or in modulating the antiviral host response to various viruses [53] [54] [55] [56] , including denv (s2 table and references within). our study establishes that rbmx (hnrnp g), hnrnp f and hnrnp m are required for efficient denv amplification (fig 5b) . 
since several hnrnps, such as ptbp1 (hnrnp i), hnrnp a1 and hnrnp k, were previously shown to re-localize from the nucleus to the viral replication sites during denv infection [8, 57, 58], rbmx, hnrnp f, hnrnp m and possibly other nuclear qtux-ms identified factors are also likely to be either actively recruited to the viral replication sites or retained in the cytoplasm upon denv infection. interestingly, the qtux-ms identified hnrnps affect different steps in the denv life cycle (fig 6), suggesting that they have distinct functions during infection. this is consistent with studies that have suggested that denv rna structures are dynamic during the viral life cycle [59] and may suggest that host factors play an important role in these structural changes. furthermore, we showed that knockdown of some hnrnps (hnrnp a0 and hnrnp l) did not affect dengue viral titers significantly, demonstrating that only certain hnrnps are required for denv amplification. since some hnrnps are known to modulate cellular gene expression in response to dengue infection [60, 61], we cannot rule out that some of the observed effects on virus titers derive from their roles in regulating host mrnas. among the numerous qtux-ms identified factors of interest, our study is the first to demonstrate the involvement of hmces (or c3orf37) in viral infection. while the cellular role of human hmces is currently unknown, the mouse homologue was suggested to be an rna-binding protein and predicted to contain a putative peptidase domain [62, 63]. interestingly, it is possible that the nucleic acid binding domain enhances the protease activity or vice versa, as has been shown for other such proteins [64] [65] [66]. for example, adenovirus uses a nucleic acid binding protease to localize the protease activity to the viral substrates [64].
it has been suggested that the protease is recruited to the empty capsid in an inactive form and becomes fully activated once bound to the viral dna inside the virion. using the dna as a guide wire, it moves along the nucleic acid, searching for capsid and core proteins to cleave, which is required to render the viral particle infectious [64, 67, 68]. however, since we observed that knockdown of hmces resulted in a decrease in viral rna, it seems more likely that it participates in translation, replication or the switch from translation to replication, as has been shown for other nucleic acid binding proteases [65, 69]. only one of the qtux-ms identified factors was increased at the protein level in whole cells following denv infection, suggesting that qtux-ms identifications derived from specific associations with vrna rather than changes in protein abundances. our analysis of the host cell proteome upon infection also revealed interesting changes, including up regulation of proteins in the ubiquitin pathway and down regulation of transporter proteins. the ubiquitin-proteasome pathway is one of two major cellular pathways used to degrade 80 to 90% of proteins. previous studies on denv infected cell lines and patient samples showed that the ubiquitin pathway was upregulated [31, 39, 70]. many groups have consistently shown that the ubiquitin proteasome pathway is critical for amplification of a number of flaviviruses, including denv and west nile virus [31, 34, 71, 72]. however, it remains controversial which step in viral amplification is affected by ubiquitination, although it appears to be early, during internalization or viral genome release [33, 39, 73]. further studies will be required to understand how denv up-regulates the pathway and the role that ubiquitination plays in denv amplification.
altogether, our study has significantly increased the number of cellular proteins known to interact with the denv rna during a live infection in cells. we have also placed these interactions in the context of proteome abundance changes in the infected cells. a recent study by phillips and colleagues [13] exploited a cross-linking label-free ms approach to identify denv rna associating proteins in cell culture by cross-linking the rna to the proteins using short wavelength uv light and isolating denv rna bound proteins by anti-sense dna affinity capture [74] . while their method identified several denv host factors [12, [75] [76] [77] , the qtux-ms method reported here resulted in improved identification of known dengue host factors and putative denv rna interacting proteins. there could be several reasons for these results, such as, the qtux-ms approach achieves greater cross-linking efficiency by using long wave uv light to form crosslinks to 4-thio-uridines compared to short-wavelength uv light, which is inherently inefficient [78] . moreover, since thio-uridine is a zero distance cross-linker for rna-bound proteins at long uv wavelengths and protein-protein cross-links are not formed at long uv wavelengths, qtux-ms may also have achieved improved specificity, as only proteins in direct contact with the viral rna would be captured [23] . additionally, qtux-ms used an ms-based silac approach to determine which host proteins were specifically enriched in the vrna isolations versus mock. though isotope-labelling is not applicable in all model systems, it does afford greater quantitative accuracy compared to label-free ms strategies [16] . overall, the qtux-ms method identified 93 cellular proteins that bind to denv rna, which include 14 previously known or putative interactions. importantly, five out of the six qtux-ms identified novel factors that were tested were shown to be bona fide host factors. 
we used robust assays to show that the identified host factors were specifically required for denv amplification and that their knockdown did not merely decrease cell fitness for viral amplification. future studies will reveal whether the identified factors are also required for other flavivirus infections that cause life-threatening illnesses, such as yellow fever, west nile, zika, japanese and tick-borne encephalitis. our data therefore demonstrate that qtux-ms is an effective technique for identifying novel virus host factors, and that it can be applied to a broad spectrum of rna viruses simply by designing antisense dna oligonucleotides that allow efficient sequence-specific isolation of the vrna.

supporting information

s1 table containing all proteins quantified by qtux-ms, including proteins enriched in denv infection (red highlighted rows) and those that did not meet the specificity threshold. for each protein group, the following are provided (from left to right): uniprot accession number, gene name, protein name, the linear and log2 denv/mock qtux-ms enrichment ratios (columns d and e), the qtux-ms ratio variability, the number of qtux-ms quantified peptides, the average log2 denv/mock whole cell relative abundance silac ratio, whether this ratio was up (u) or down (d) regulated by more than ± 1.75-fold, the average silac ratio variability, the average number of quantified peptides, the number of razor+unique peptides and sequence coverage for qtux-ms, the protein's molecular weight and sequence length, and the complete fasta header entry for the primary protein group member. columns h-k are cross-referenced from the respective whole cell data (b). nd = not detected. (b) table containing all proteins quantified in whole cell lysates by silac.
for each protein group, the following are provided (from left to right): uniprot accession number, gene name, protein name, the log2 denv/mock relative abundance ratios for replicates 1, 2, and the average (columns d and e), and whether this ratio was up (u) or down (d) regulated by more than ± 1.75-fold. for each replicate, the following are provided: the silac ratio variability, the number of quantified peptides, the number of razor+unique peptides, and the number of unique peptides, total protein intensity for light (denv) and heavy (mock) conditions, and the overall sequence coverage. the complete fasta header entry is listed for the primary protein group accession number. (xlsx)

s1 (a) hela uprt cells were transfected with either control or specific sirnas. 24 hours post transfection, cells were counted and seeded (4 x 10^5 cells/well) in 6-well plates in triplicate. 48 hours post transfection, cells were infected with denv2 at moi 0.1 and virus released into the media was collected 40 hours post infection. denv2 titers were measured using plaque assays. the bars represent average values from triplicates, and the standard error is reported. representative data from one of at least two independent experiments are shown. (b) to verify the sensitivity of the mtt assay, we performed sirna knockdown of g3bp2, which is known to bind denv and was previously shown to affect cell viability [37, 75]. relative viability of non-infected cells was measured using an mtt assay (invitrogen) and represented exactly as described in fig 5c. cell viability or proliferation is decreased by knockdown of g3bp2 compared with control sirna (p<0.01). sirna transfection was performed as described in the methods section using a previously published sirna sequence [37]. in cells treated with g3bp2-specific sirna but not control sirna, g3bp2 protein was knocked down to levels undetectable by western analysis using antibodies against g3bp2 (abcam, ab86135). (c).
hela uprt cells were seeded 4.0 x 10 5 cells in 60 mm plates and transfected with either control or specific sirnas. 24 hours post transfection cells were infected with ad5 at moi 0.1 and collected 30 hours post infection. sirna knockdown efficiency was determined by measuring respective mrna levels in comparison to β-actin mrna abundance as described in the experimental section and reported relative to control sirna transfection (upper panel). ad5 titers were determined using plaque assays on 911 cells. the bars represent average values from triplicate, a standard error is reported. the representative data from one of at least three independent experiments is shown. (tif) the global distribution and burden of dengue perspectives for the treatment of infections with flaviviridae dengue virus life cycle: viral and host factors modulating infectivity. cellular and molecular life sciences: cmls composition and three-dimensional architecture of the dengue virus replication and assembly sites targeting a host process as an antiviral approach against dengue virus targeting host factors to treat west nile and dengue viral infections the polypyrimidine tract-binding protein is required for efficient dengue virus propagation and associates with the viral replication machinery polypyrimidine tract-binding protein is relocated to the cytoplasm and is required during dengue virus infection in vero cells polypyrimidine tract-binding protein influences negative strand rna synthesis of dengue virus translation elongation factor-1alpha, la, and ptb interact with the 3' untranslated region of dengue 4 virus rna eukaryotic initiation factor 4ai interacts with ns4a of dengue virus and plays an antiviral role. 
biochemical and biophysical research communications y box-binding protein-1 binds to the dengue virus 3'-untranslated region and mediates antiviral effects identification of proteins bound to dengue viral rna in vivo reveals new host proteins important for virus replication revisiting dengue virus-host cell interaction: new insights into molecular and cellular virology thiouracil cross-linking mass spectrometry: a cell-based method to identify host factors involved in viral amplification stable isotope labeling by amino acids in cell culture, silac, as a simple and accurate approach to expression proteomics universal sample preparation method for proteome analysis the functional interactome landscape of the human histone deacetylase family. molecular systems biology 2016 update of the pride database and its related tools panther version 10: expanded protein families and functions, and analysis tools reactomefiviz: a cytoscape app for pathway and network-based data analysis biosynthetic labeling of rna with uracil phosphoribosyltransferase allows cell-specific microarray analysis of mrna synthesis and decay photo-leucine and photo-methionine allow identification of protein-protein interactions in living cells subcellular localization and some biochemical properties of the flavivirus kunjin nonstructural proteins ns2a and ns4a japanese encephalitis virus nonstructural protein ns3 has rna binding and atpase activities rna sequences and structures required for the recruitment and activity of the dengue virus polymerase structural analysis of viral nucleocapsids by subtraction of partial projections dengue virus capsid protein binds core histones and inhibits nucleosome formation in human liver cells the crapome: a contaminant repository for affinity purification-mass spectrometry data high-throughput quantitative proteomic analysis of dengue virus type 2 infected a549 cells the ubiquitin-proteasome pathway is important for dengue virus infection in primary human 
endothelial cells proteomic analysis of host responses in hepg2 cells during dengue virus infection dengue virus genome uncoating requires ubiquitination. mbio the ubiquitin-conjugating system: multiple roles in viral replication and infection activation of the ikappab kinase complex by traf6 requires a dimeric ubiquitin-conjugating enzyme complex and a unique polyubiquitin chain trim5 is an innate immune sensor for the retrovirus capsid lattice gap161 targets and downregulates g3bp to suppress cell growth and potentiate cisplaitin-mediated cytotoxicity to colon carcinoma hct116 cells dengue viremia titer, antibody response pattern, and virus serotype correlate with disease severity. the journal of infectious diseases rna interference screen for human genes associated with west nile virus infection dengue virus evolution under a host-targeted antiviral a cytoplasmic 57-kda protein that is required for translation of picornavirus rna by internal ribosomal entry is identical to the nuclear pyrimidine tract-binding protein translation of polioviral mrna is inhibited by cleavage of polypyrimidine tract-binding proteins executed by polioviral 3c(pro) genome-wide rnai screen reveals a new role of a wnt/ctnnb1 signaling pathway as negative regulator of virus-induced innate immune responses the role of the 14-3-3 protein family in health, disease, and drug development. drug discovery today virus-heat shock protein interaction and a novel axis for innate antiviral immunity hnrnp m interacts with psf and p54(nrb) and co-localizes within defined nuclear structures psf and p54(nrb)/nono-multi-functional nuclear proteins kinesin transports rna: isolation and characterization of an rnatransporting granule the multifunctional protein p54nrb/psf recruits the exonuclease xrn2 to facilitate pre-mrna 3' processing and transcription termination advances in experimental medicine and biology hnrnp complexes: composition, structure, and function. 
current opinion in cell biology hnrnp proteins and the biogenesis of mrna. annual review of biochemistry hnrnps relocalize to the cytoplasm following infection with vesicular stomatitis virus rna-rna and rna-protein interactions in coronavirus replication and transcription hnrnp l and nf90 interact with hepatitis c virus 5'-terminal untranslated rna and promote efficient replication high-affinity interaction of hnrnp a1 with conserved rna structural elements is required for translation and replication of enterovirus 71 nf90 binds the dengue virus rna 3' terminus and is a positive regulator of dengue virus replication the heterogeneous nuclear ribonucleoprotein k (hnrnp k) is a host factor required for dengue virus and junin virus multiplication dynamic rna structures in the dengue virus genome the heterogeneous nuclear ribonucleoprotein k (hnrnp k) interacts with dengue virus core protein dengue virus infection induces upregulation of hn rnp-h and pdia3 for its multiplication in the host cell novel autoproteolytic and dna-damage sensing components in the bacterial sos response and oxidized methylcytosine-induced eukaryotic dna demethylation systems the rna-binding protein repertoire of embryonic stem cells human adenovirus proteinase: dna binding and stimulation of proteinase activity by dna rna binding by the ns3 protease of the hepatitis c virus dna and rna binding by the mitochondrial lon protease is regulated by nucleotide and protein substrate the active adenovirus protease is the intact l3 23k protein processing of the l1 52/ 55k protein by the adenovirus protease: a new substrate and new insights into virion maturation cellular protein modification by poliovirus: the two faces of poly(rc)-binding protein host gene expression profiling of dengue virus infection in cell lines and patients west nile virus and dengue virus capsid protein negates the antiviral activity of human sec3 protein through the proteasome pathway west nile virus genome amplification 
requires the functional activities of the proteasome appraising the roles of cbll1 and the ubiquitin/proteasome system for flavivirus entry and replication antisense-mediated affinity purification of dengue virus ribonucleoprotein complexes from infected cells quantitative mass spectrometry of denv-2 rna-interacting proteins reveals that the dead-box rna helicase ddx6 binds the db1 and db2 3' utr structures a)-binding protein binds to the non-polyadenylated 3' untranslated region of dengue virus and modulates translation efficiency. the journal of general virology role of human heterogeneous nuclear ribonucleoprotein c1/c2 in dengue virus replication detecting rna-protein interactions by photocross-linking using rna molecules containing uridine analogs we would like to thank those who generously supplied reagents: lncx uprt-myc, tat, and vsvg plasmids (edward mocarski), denv2 (robert striker), and huh 7.5 cells (charlie rice). we would like to thank quynh-mai trinh for technical assistance. conceptualization: ovv tmg imc srt. key: cord-026012-r0w0jbpg authors: tennant, bud c.; hornbuckle, william e. title: gastrointestinal function date: 2014-06-27 journal: clinical biochemistry of domestic animals doi: 10.1016/b978-0-12-396350-5.50013-9 sha: doc_id: 26012 cord_uid: r0w0jbpg this chapter discusses the functions of gastrointestinal tract. the principal functions of the gastrointestinal tract are assimilation of nutrients and excretion of the waste products of digestion. within the gastrointestinal tract, these substances are solubilized and degraded enzymatically to simple molecules, sufficiently small in size and in a form that permits absorption across the mucosal epithelium. the distribution of the different types of secretory cells in the salivary glands varies among species. the mandibular and sublingual glands are mixed salivary glands containing both mucous and serous types of cells, and produce a viscous secretion that contains large amounts of mucus. 
the cytoplasm of the secretory cells contains numerous zymogen granules that vary in size and number depending on the activity of the gland. these granules contain the precursors of the hydrolytic enzymes responsible for digestion of the major dietary components. the cells of the terminal ducts probably secrete the bicarbonate ion responsible for neutralizing hydrochloric acid that enters the duodenum from the stomach. the digestive system is composed of the gastrointestinal tract or alimentary canal, salivary glands, liver, and exocrine pancreas. the principal functions of the gastrointestinal tract are assimilation of nutrients and excretion of the waste products of digestion. most nutrients are ingested in a form which is either too complex or insoluble for absorption. within the gastrointestinal tract, these substances are solubilized and degraded enzymatically to simple molecules, sufficiently small in size and in a form which permits absorption across the mucosal epithelium. in the following section, the normal biochemical processes of intestinal secretion, digestion, and absorption are described. with these in perspective, we then discuss the mechanisms involved in the pathogenesis of the most important gastrointestinal diseases and the biochemical basis for diagnosis and treatment. a. saliva saliva is produced by three major pairs of salivary glands and by small glands distributed throughout the buccal mucosa and submucosa. two types of secretory cells are found in the acinar portions of the salivary glands: (1) the mucous cells, which contain droplets of mucus, and (2) the serous cells, which contain multiple secretory granules. in those species which produce salivary amylase, the secretory granules are the zymogen precursors of this enzyme. a third cell type is found lining the striated ducts.
the striations along the basal borders of these cells are caused by vertical infoldings of the cell membrane, a characteristic of epithelial cells involved in rapid movement of water and electrolytes. the primary secretion of the acinar cells is modified by active transport processes of the ductal epithelium. the distribution of the different types of secretory cells in the salivary glands varies among species. the parotid glands of most animals are serous glands which produce a secretion of low specific gravity and osmolarity, containing electrolytes and proteins including certain hydrolytic enzymes. the mandibular (submaxillary) and sublingual glands are mixed salivary glands containing both mucous and serous types of cells and produce a viscous secretion which contains large amounts of mucus (dukes, 1955). a. mucus. mucus is an aqueous mixture of protein-polysaccharide complexes and glycoproteins (gottschalk, 1972), which have relatively large amounts of carbohydrate bound to protein. the protein-polysaccharide complexes have long polysaccharide chains containing repeating units bound to a protein core. the glycoproteins contain numerous oligosaccharide residues distributed along the polypeptide chain. one of the most completely studied glycoproteins is mucin from the submaxillary glands of ruminants. the carbohydrate portion is a disaccharide of n-acetylneuraminic acid (a sialic acid) and n-acetylgalactosamine. approximately 800 such disaccharide molecules are present per molecule of mucin (bhavanandan et al., 1964; bertolini and pigman, 1967). an enzyme capable of linking protein with hexosamine was demonstrated in sheep submaxillary glands (mcguire and roseman, 1967). the physiological functions of mucin are closely related to its high viscosity. n-acetylneuraminic acid is the component responsible for the formation of viscous aqueous solutions.
at physiological ph, it causes expansion and stiffening of the mucin molecule (gottschalk and thomas, 1961). the resistance of mucin to enzymatic breakdown is also due to the presence of disaccharide residues. removal of the terminal n-acetylneuraminic acid residues by neuraminidase significantly increases the susceptibility of peptide bonds to trypsin (gottschalk and fazekas de st. groth, 1960). b. amylase. the saliva of most species contains the α-amylase ptyalin. this enzyme is said to be absent, however, in the saliva of dogs, cats, and horses (dukes, 1955). salivary amylase splits the α-1,4-glucosidic bonds of various polysaccharides. the salivary enzyme is similar in all major respects to pancreatic α-amylase, which is described below (section ii,d). salivary amylase initiates digestion of starch and glycogen in the mouth of those species which secrete the enzyme. the optimal ph for amylase activity is approximately 7, and activity therefore terminates when the enzyme mixes with acidic gastric contents. saliva bathes the oral cavity continuously, serving to protect the surface epithelium. ingested food is moistened and lubricated by saliva, facilitating mastication and swallowing. the teeth also are protected from decay by saliva, which washes food particles from the surfaces of the teeth and, because of its buffering capacity, neutralizes the organic acids produced by bacteria normally present in the mouth. ruminants produce much greater quantities of saliva than simple-stomached animals, and the saliva has a higher ph and bicarbonate ion concentration. in ruminants, saliva serves several unique functions (phillipson, 1977). it is required for maintenance of the composition of the contents of the rumen. the great buffering capacity is necessary to neutralize the large amounts of short-chain fatty acids which are the major end products of rumen fermentation. the urea in saliva can be utilized by rumen bacteria for protein synthesis.
protein synthesized in the rumen is then used to meet dietary protein requirements. in this way, urea nitrogen can be "recycled" through the amino acid pool of the body and in ruminants need not be considered an end stage in protein catabolism. the ability to reutilize urea has also been demonstrated in the horse and may be of particular benefit during periods of protein deficiency (houpt and houpt, 1971; prior et al., 1974). the stomach is divided into two main regions on the basis of secretory function (grossman, 1958). the oxyntic gland area corresponds approximately to the body of the stomach in most species of domestic animals and also to the fundus in the dog and cat. the oxyntic glands contain (a) oxyntic or parietal cells, which are responsible for hydrochloric acid production, (b) peptic (zymogenic, chief) cells, which produce pepsinogen, and (c) mucous cells. the pyloric gland area contains the pyloric glands, which produce a slightly alkaline secretion, and, in addition to mucus, contains the polypeptide hormone gastrin. a variety of stimuli can initiate gastric secretion. the sight or smell of food or the presence of food within the mouth causes gastric secretion by a reflex mechanism involving the vagus nerve. the presence of certain foods within the stomach or distention of the stomach alone also can initiate both intrinsic and vagal nerve reflexes which cause secretion of gastric juice. in addition to neural reflexes, these stimuli cause release of the polypeptide hormone gastrin from the pyloric gland area, which enters the bloodstream, stimulating gastric secretion. the release of gastrin from the specific g cells responsible for synthesis is inhibited by excess hydrogen ion, and this negative feedback mechanism is believed to be of physiological importance in the control of hydrochloric acid production.
fig. 1. amino acid sequence of porcine gastrin i (gregory, 1966). gastrin ii differs from gastrin i by the presence of a sulfate ester group on the single tyrosyl residue.
gastrin has been isolated in pure form from the antral mucosa of swine (gregory and tracy, 1964). when administered intravenously, the purified hormone causes the secretion of hydrochloric acid and pepsin. it also stimulates gastrointestinal motility and causes pancreatic secretion. two separate peptides have been obtained from porcine gastric mucosa and have been designated gastrin i and gastrin ii. the structure of gastrin has been determined and has been confirmed by synthesis (anderson et al., 1964). it is a heptadecapeptide amide, with a pyroglutamyl n-terminal residue and the amide of phenylalanine as the c-terminal residue (fig. 1). in the center of the molecule is a sequence of five glutamyl residues, which give the molecule its acidic properties. gastrin ii differs from gastrin i only in the presence of a sulfate ester group linked to the single tyrosyl residue. the c-terminal tetrapeptide amide, trp-met-asp-phe-nh2, is identical in all species so far studied (gregory, 1967). the tetrapeptide has all of the activities of the natural hormone. it is not as potent as the parent molecule, but activity can be increased by lengthening the peptide chain. gastrin is the only hormone known to stimulate hcl secretion (walsh and grossman, 1975). as indicated above, gastrin is released in response to vagal stimulation, by distention of the pyloric antrum, and by direct luminal contact with food, particularly partially hydrolyzed protein (walsh and grossman, 1975). the exact mechanism of action is not known, but studies using preparations of isolated parietal cells suggest that the effects of gastrin are not mediated by cyclic amp (soll, 1977). some of the other factors which are important in regulation of hcl secretion are summarized in fig. 2, after dousa and dozois (1977).
there is little doubt that histamine secreted locally within the mucosa has a major effect on the function of parietal cells (soll and grossman, 1978). histamine has been recognized as a potent stimulant of hcl production for many years (code, 1965). this effect, however, was not inhibited by traditional antihistaminic drugs (h1 antagonists), and, until the demonstration of h2 receptors in the stomach (and in the atrium and uterus) by black et al. (1972), the physiological role of histamine in hcl secretion was controversial. specific h2 antagonists (burimamide, cimetidine, metiamide) now have been shown to inhibit the secretory response not only to histamine, but also to gastrin, to cholinergic stimuli, and to food (grossman and konturek, 1974). although there has been significant conflict in the published literature, current evidence suggests that histamine activates the adenylate cyclase of parietal cells (dousa and dozois, 1977), resulting in synthesis of cyclic amp and ultimately in hcl secretion (fig. 2). the controversy with regard to the role of cyclic amp as a mediator of histamine action has come from observations that prostaglandins and secretin, both potent inhibitors of gastric hcl secretion, also stimulate adenylate cyclase (thompson et al., 1977). it is now believed that prostaglandins, in addition to inhibiting hcl secretion, act on a mucosal cell population which is different from parietal cells and that these cells secrete cytoprotective substances (mucin, glycosaminoglycans). the ulcerogenic effects of prostaglandin inhibitors (indomethacin, acetylsalicylic acid) apparently result from inhibition of this protective effect of endogenous prostaglandins. a. basal versus stimulated secretion. gastric juice is composed of two components. one is secreted continuously by the surface epithelial cells and other mucus-producing cells. the other component is produced by the oxyntic glands in response to various stimuli.
the basal component is neutral or slightly alkaline. the electrolyte composition is similar to that of an ultrafiltrate of plasma (table i) and contains large amounts of mucus, which protects the epithelium. the secretory component produced by the oxyntic glands in response to stimulation contains free hydrochloric acid and pepsinogen, the principal enzyme of gastric digestion. the composition of gastric juice depends on the relative amounts of the two secretory components present, which in turn is a function of flow rate. in the dog, gastric juice is produced in the resting state at a rate of approximately 5 ml/hour (gray and bucher, 1941), and the composition is similar to that of the basal component, containing practically no peptic activity or hydrochloric acid. when the flow of gastric juice is stimulated maximally, the dog may produce 80 ml or more per hour (gray and bucher, 1941), and this secretion contains large amounts of peptic activity and hydrochloric acid. sodium, which is the principal cation in the basal secretion, is replaced to a large extent by hydrogen ion. the concentration of potassium is similar in both basal and stimulated secretions and therefore remains relatively constant at various rates of flow. hydrochloric acid and pepsinogen are secreted by separate mechanisms, but these appear to be closely linked under physiological conditions. stimulation of the vagus nerve (bachrach, 1953; hirschowitz and sachs, 1965) or intravenous injection of gastrin (hirschowitz, 1966) increases pepsinogen and hydrochloric acid levels together. other stimuli may affect the two processes differently. in the dog, for example, histamine infusion stimulates hydrochloric acid production maximally but inhibits pepsinogen secretion (abrams and brooks, 1960; hirschowitz, 1966; emås and grossman, 1967).
[figure: inhibitor peptide, leu-leu ... leu-glu (3200 mol. wt.)]
pepsinogen is the zymogen, or inactive precursor, of pepsin, the principal proteolytic enzyme of gastric juice. pepsinogen was first crystallized from the gastric mucosa of swine (herriott, 1938), and several pepsinogens have been separated by ryle (1965), ryle and porter (1959), and ryle and hamilton (1966). porcine pepsinogen has a molecular weight of approximately 43,000 and is composed of the pepsin molecule and several smaller peptides (fig. 3). one of these peptides has a molecular weight of 3200 and is an inhibitor of peptic activity (herriott, 1962). activation of pepsin from pepsinogen occurs by selective cleavage of this small basic peptide from the parent pepsinogen (neurath and walsh, 1976). autocatalytic conversion begins below ph 6.0. at ph 5.4, the inhibitor peptide dissociates from the parent molecule, and, at ph 3.5-4.0, the inhibitor is completely digested by pepsin (taylor, 1968). pepsin has a very acidic isoelectric point, being stable in acidic solution below ph 6.0 but irreversibly denatured at ph 7.0 or above. in contrast, pepsinogen is stable in neutral or slightly alkaline solution. the optimal ph for peptic activity is generally between 1.6 and 2.5, but the effect of ph may vary with the substrate. pepsin is capable of hydrolyzing peptide bonds of most proteins, mucin being one important exception. pepsin splits bonds involving phenylalanine, tyrosine, and leucine most readily but can hydrolyze almost all other peptide bonds. c. rennin. rennin is another proteolytic enzyme produced by the gastric mucosa and has some characteristics which are similar to those of pepsin. it has been separated from pepsin in preparations from the stomachs of newborn calves. rennin splits a mucopeptide from casein to form paracasein, which then reacts with calcium ion to form an insoluble coagulum. the coagulated milk protein probably delays gastric emptying and increases the efficiency of protein digestion in young calves. d. hydrochloric acid.
hydrochloric acid is produced by the oxyntic cells. when the normal mucosa is stimulated, both chloride and hydrogen ions are secreted together, but current evidence suggests that h+ and cl- are secreted by separate, closely coupled pump mechanisms. small amounts of cl- are secreted continuously by the unstimulated parietal cells in the absence of h+ secretion, and this mechanism is responsible for the relative negative charge of the resting mucosal surface. hydrogen ion and cl- secretory systems may also be differentiated in vitro by the demonstration of hydrogen ion secretion in the absence of cl-. a scheme for the secretion of hydrochloric acid is presented in fig. 4. for every h+ secreted, an electron is removed. the electron ultimately is accepted by oxygen to form oh-, which is neutralized within the cell by h+ from carbonic acid. the bicarbonate ion produced enters the venous blood, and this explains why the ph of gastric venous blood frequently is greater than that of arterial blood during hydrochloric acid secretion (davenport, 1966). conversion of carbon dioxide and water to carbonic acid is catalyzed by carbonic anhydrase, which is present in high concentration within parietal cells. when the rate of acid secretion is high, this enzyme contributes to the secretory mechanism by maintaining normal intracellular ph. carbonic anhydrase inhibitors, such as acetazolamide, interfere with hydrochloric acid production in high concentrations and when the rate of acid secretion is high (janowitz et al., 1952). bile is secreted continuously by the hepatocytes into the bile canaliculi and is transported through a system of ducts to the gallbladder, where it is modified, concentrated, and stored. during digestion, bile is discharged into the lumen of the duodenum, where it aids in emulsification, hydrolysis, and solubilization of dietary lipids.
the digestive functions of bile are accomplished almost exclusively by the detergent action of its major components, the bile salts and phospholipids. the primary bile acids are c24 carboxylic acids synthesized by the liver from cholesterol. bile acid formation represents the major pathway for cholesterol metabolism (danielsson, 1963). cholic acid (3α,7α,12α-trihydroxy-5β-cholanoic acid) and chenodeoxycholic acid (3α,7α-dihydroxy-5β-cholanoic acid) are the primary bile acids formed by most species of domestic animals. in swine, chenodeoxycholic acid is hydroxylated at the 6α position by the liver to yield hyocholic acid, which is a major primary bile acid in this species (haslewood, 1964). bile acids are secreted as amino acid conjugates of either glycine or taurine. taurine conjugates predominate in the dog, cat, and rat. in the rabbit, the conjugating enzyme system appears to be almost completely specific for glycine (bremer, 1956). both taurine and glycine conjugates are present in ruminants. in the newborn lamb, 90% of the bile acids are conjugated with taurine. as the lamb matures, glycine conjugates increase, accounting for one-third of the total in mature sheep (peric-golia and socie, 1968). under normal conditions, only conjugated bile acids are present in the bile and in the contents of the proximal small intestine. in the large intestine, the conjugated bile acids are hydrolyzed rapidly by bacterial enzymes so that, in the contents of the large intestine and in the feces, free or unconjugated bile acids predominate. several genera of intestinal bacteria, including clostridium, enterococcus, bacteroides, and lactobacillus (midtvedt and norman, 1967), are capable of splitting the amide bonds of conjugated bile acids. intestinal bacteria also modify the basic structure of the bile acids. one such reaction is the removal of the α-hydroxyl group at the 7 position of cholic acid or chenodeoxycholic acid.
these bacterial reactions yield the secondary bile acids deoxycholic acid and lithocholic acid, respectively (gustafsson et al., 1957). lithocholic acid is relatively insoluble and is not reabsorbed to any great extent (gustafsson and norman, 1962). deoxycholic acid is reabsorbed from the large intestine in significant quantities and is either rehydroxylated by the liver to cholic acid and excreted (lindstedt and samuelsson, 1959) or excreted as conjugated deoxycholic acid. the extent to which bacteria transform the primary bile acids depends on the nature of the diet, the composition of the intestinal microflora, and the influences which these and other factors have on intestinal motility (gustafsson et al., 1966; gustafsson and norman, 1969a,b). the carboxyl group of the bile acids is completely ionized at the ph of bile and is neutralized by sodium ion, resulting in the formation of bile salts. the bile salts are effective detergents. they are amphipathic molecules, which have both hydrophobic and hydrophilic regions. in low concentrations, bile salts form molecular or ideal solutions, but, when their concentration increases above a certain critical level, they form polymolecular aggregates known as micelles. the concentration at which these molecules aggregate is called the critical micellar concentration (cmc). bile salt micelles are spherical and consist of a central nonpolar core and an external polar region. fatty acids, monoglycerides, and other lipids are solubilized when they enter the central core of the micelle and are covered by the outside polar coat. solubilization occurs only when the cmc is reached. for the bile salt-monoglyceride-fatty acid-water system present during normal fat digestion, the cmc is approximately 2 mM, which is ordinarily exceeded both in bile and in the contents of the upper small intestine (hofmann, 1963). phospholipids, principally lecithin, are also major components of bile.
in the lumen of the small intestine, pancreatic phospholipase catalyzes the hydrolysis of lecithin, forming free fatty acid and lysolecithin. the latter compound also is a potent detergent which acts with the bile salts to disperse and solubilize lipids in the aqueous micellar phase. the enterohepatic circulation begins as conjugated bile acids enter the duodenum and mix with the intestinal contents, forming emulsions and micellar solutions. the bile acids are not absorbed in significant amounts from the lumen of the proximal small intestine. absorption occurs primarily in the ileum (lack and weiner, 1961, 1966; weiner and lack, 1962), where an active transport process has been demonstrated. the conjugated bile acids pass unaltered into the portal circulation (playoust and isselbacher, 1964) and return to the liver, where the cycle begins again. this arrangement provides optimal concentrations of bile acids in the proximal small intestine, where fat digestion and absorption occur, and then efficient absorption after these functions have been accomplished. absorption of unconjugated bile acids from the large intestine accounts for 3-15% of the total enterohepatic circulation (weiner and lack, 1968). in dogs, the total bile acid pool was estimated to be 1.1-1.2 gm. the half-life of the bile acids in the pool ranged between 1.3 and 2.3 days, and the rate of hepatic synthesis was 0.3-0.7 gm/day (wollenweber et al., 1965). the daily requirement for bile acids greatly exceeds the normal synthetic rate. this necessitates repeated reutilization of the bile acids, which is accomplished by means of the enterohepatic circulation. under steady-state conditions, the entire bile acid pool passes through the enterohepatic circulation approximately ten times each day (hofmann, 1966). the size of the bile acid pool is dependent upon diet, the rate of hepatic synthesis, and the efficiency of the enterohepatic circulation.
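the figures quoted above for the dog can be checked against one another with a short calculation. the following is a minimal sketch, assuming that bile acids leave the pool by simple first-order kinetics and that the pool is at steady state, so that synthesis equals loss; this single-compartment model is an assumption made here for illustration and is not stated in the text. the function names and the choice of 1.15 gm as a midpoint pool size are likewise illustrative.

```python
import math


def synthesis_rate(pool_g, half_life_days):
    """Steady-state synthesis (g/day) balancing first-order loss from the pool.

    Assumes a single well-mixed pool with fractional turnover k = ln(2) / t_half.
    """
    k = math.log(2) / half_life_days  # fraction of the pool lost per day
    return k * pool_g


def daily_secretion(pool_g, cycles_per_day):
    """Bile acid delivered to the intestine per day (g/day) by recycling the pool."""
    return pool_g * cycles_per_day


pool = 1.15  # g; illustrative midpoint of the reported 1.1-1.2 gm range
for t_half in (1.3, 2.3):  # days; reported range of half-lives
    rate = synthesis_rate(pool, t_half)
    print(f"half-life {t_half} d -> synthesis about {rate:.2f} g/day")

secreted = daily_secretion(pool, 10)  # about ten cycles per day (hofmann, 1966)
print(f"secreted per day about {secreted:.1f} g")
```

under these assumptions the computed synthesis rates fall within the reported 0.3-0.7 gm/day, while the amount secreted each day is many times larger than the amount synthesized, which is the quantitative sense in which the daily requirement exceeds the synthetic rate.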
surgical removal of the ileum in dogs interrupts the enterohepatic circulation, causing an increase in bile acid turnover rate and a reduction in the size of the bile acid pool (playoust et al., 1965). in diseases of the ileum, there may be defective bile salt absorption and bile salt deficiency. if the deficiency is severe, impaired utilization of dietary fat may occur, resulting in steatorrhea and impaired absorption of the fat-soluble vitamins. the exocrine pancreas is an acinous gland with the same general structure as the salivary glands. the cytoplasm of the secretory cells contains numerous zymogen granules, which vary in size and number depending on the activity of the gland. these granules contain the precursors of the hydrolytic enzymes responsible for digestion of the major dietary components. the cells of the terminal ducts probably secrete the bicarbonate ion responsible for neutralizing hydrochloric acid which enters the duodenum from the stomach. 1. composition a. electrolyte composition. the cation content of pancreatic secretion is similar to that of plasma. sodium is the predominant cation, with smaller concentrations of potassium and calcium being present. a unique characteristic of pancreatic juice is its high bicarbonate ion concentration and alkaline ph. in the dog, the ph ranges from 7.4 to 8.3, depending on hco3- content. the volume of pancreatic juice is directly related to hco3- content: as the volume increases, hco3- content and ph increase and the cl- concentration decreases. the sodium and potassium ion concentrations and osmolarity appear to be independent of secretory rate (fig. 5). b. α-amylase. the amylase produced by the pancreas catalyzes the specific hydrolysis of α-1,4-glucosidic bonds, which are present in starch and glycogen (α-1,4-glucan 4-glucanohydrolase). pancreatic amylase appears to be essentially identical to the amylase of saliva. it is a calcium-containing metalloenzyme (vallee et al., 1959).
removal of calcium by dialysis inactivates the enzyme and markedly reduces the stability of the apoenzyme. pancreatic amylase has an optimal ph for activity of 6.7-7.2 and is activated by chloride ion. synthesis of pancreatic α-amylase occurs in the ribosomes. the enzyme is transferred from the endoplasmic reticulum to cytoplasmic zymogen granules for storage (redman et al., 1966). it is secreted in active form upon stimulation of the acinar cells. newborn calves (huber et al., 1961) and pigs (walker, 1959) secrete amylase at a significantly lower rate than mature animals. the rate of synthesis is also influenced by diet. animals fed a high-carbohydrate diet synthesize amylase at several times the rate of animals on a high-protein diet (ben abdeljlil and desnuelle, 1974). unbranched α-1,4-glucosidic chains, such as those found in amylose, are hydrolyzed in two steps. the first is rapid and results in formation of maltose and maltotriose. the second step is slower and involves hydrolysis of maltotriose with formation of glucose and maltose. polysaccharides such as amylopectin and glycogen contain branched chains with both α-1,4- and α-1,6-glucosidic linkages. when α-amylase attacks these compounds, the principal products are maltose (α-1,4-glucosidic bond), isomaltose (α-1,6-glucosidic bond), and small amounts of glucose. final hydrolysis of the maltose and isomaltose occurs at the surface of the mucosal cell, where the enzymes maltase and isomaltase are integral parts of the microvillous membrane. c. proteolytic enzymes. the proteolytic enzymes of the pancreas are responsible for the major portion of protein hydrolysis, which occurs within the lumen of the gastrointestinal tract. two types of peptidases are secreted by the pancreas. trypsin, chymotrypsin, and elastase are endopeptidases, which attack peptide bonds along the polypeptide chain, producing smaller peptides.
the exopeptidases attack either the carboxy-terminal or amino-terminal peptide bonds, releasing single amino acids. the principal exopeptidases secreted by the pancreas are carboxypeptidases a and b. the endopeptidases and exopeptidases act in combination (table ii), producing free amino acids, which are absorbed directly, or small peptides, which are further hydrolyzed by the aminopeptidases of the intestinal mucosa (see section iii,c). the pancreatic peptidases are secreted as inactive proenzymes or zymogens termed trypsinogen, chymotrypsinogen, proelastase, and procarboxypeptidases a and b. trypsinogen is converted to active trypsin in two ways. at alkaline ph, trypsinogen can be converted autocatalytically to trypsin, the activated enzyme converting more zymogen to active enzyme. trypsinogen can also be activated by the enzyme enterokinase, which is produced by the duodenal mucosa. the latter reaction appears to be highly specific in that enterokinase will not activate chymotrypsinogen. chymotrypsinogen, proelastase, and the procarboxypeptidases a and b are converted to active enzymes by the action of trypsin. the amino acid sequences and other structural characteristics of bovine trypsinogen and chymotrypsinogen have been determined (hartley et al., 1965; hartley and kauffman, 1966; brown and hartley, 1966). the polypeptide chain of trypsinogen contains 229 amino acid residues. activation of the proenzyme occurs with hydrolysis of a single peptide bond, located between the lysine in position 6 and the adjacent isoleucine. the n-terminal hexapeptide is released as enzyme activity appears. there is also substantial change in the helical structure of the parent molecule (davie and neurath, 1955; neurath et al., 1956). chymotrypsinogen a is composed of 245 amino acid residues and has numerous structural similarities to trypsinogen. activation of chymotrypsinogen also occurs with cleavage of a single peptide bond. for a complete discussion of this subject, see the review by keller (1968). d. lipase.
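The activation relationships among the pancreatic zymogens described above can be summarized as a small dependency table. The following is a minimal sketch using hypothetical names: it records only which active enzyme can convert which zymogen, as stated in the text, and ignores kinetics, ph dependence, and inhibitors.

```python
# Sketch of the zymogen activation cascade described in the text.
# Each zymogen maps to the set of active enzymes able to convert it.
ACTIVATORS = {
    "trypsinogen": {"enterokinase", "trypsin"},  # trypsin: autocatalytic route
    "chymotrypsinogen": {"trypsin"},
    "proelastase": {"trypsin"},
    "procarboxypeptidase a": {"trypsin"},
    "procarboxypeptidase b": {"trypsin"},
}

ACTIVE_FORM = {
    "trypsinogen": "trypsin",
    "chymotrypsinogen": "chymotrypsin",
    "proelastase": "elastase",
    "procarboxypeptidase a": "carboxypeptidase a",
    "procarboxypeptidase b": "carboxypeptidase b",
}


def activate(initial_active):
    """Return every enzyme that becomes active, given those active at the start."""
    active = set(initial_active)
    changed = True
    while changed:  # repeat until no further zymogen can be converted
        changed = False
        for zymogen, activators in ACTIVATORS.items():
            product = ACTIVE_FORM[zymogen]
            if product not in active and activators & active:
                active.add(product)
                changed = True
    return active
```

Starting from `{"enterokinase"}`, the sketch yields all five active peptidases, whereas starting from `{"chymotrypsin"}` activates nothing further, which reflects the central position of trypsin in the cascade.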
the pancreas produces several lipolytic enzymes with different substrate specificities. the most important of these from a nutritional viewpoint is the lipase responsible for hydrolysis of dietary triglyceride. this enzyme has the unique property of requiring an oil-water interface for activity so that only emulsions can be effectively attacked (sarda and desnuelle, 1958). the principal products of lipolysis are glycerol, monoglycerides, and fatty acids. the monoglycerides and fatty acids accumulate at the oil-water interface and can inhibit enzyme activity. their transfer from the interface to the aqueous phase is favored by bile salts and by the sodium bicarbonate that is also secreted by the pancreas. mattson and volpenhein (1966) described two other carboxylic ester hydrolases in pancreatic juice. both enzymes have an absolute requirement for bile salts, in contrast to glycerol ester hydrolase, which is actually inhibited by bile salts at ph 8. one of these enzymes is a sterol ester hydrolase responsible for hydrolysis of cholesterol esters. the other enzyme hydrolyzes various water-soluble esters. the two enzyme activities have been differentiated on the basis of stability and optimal ph. the pancreas secretes a third lipolytic enzyme which hydrolyzes phospholipids. phospholipase a converts lecithin which is present in bile to lysolecithin, an effective detergent which aids in emulsification of dietary fat. pancreatic secretion is controlled and coordinated by neural and endocrine mechanisms. when ingesta or hydrochloric acid enters the duodenum, the hormone secretin is released into the circulation by the duodenal mucosa. secretin increases the volume, ph, and hco3- concentration of the pancreatic secretion. secretin is a polypeptide hormone which contains 27 amino acid residues. all 27 amino acids are required to maintain the helical structure of the molecule and its activity (bodanszky et al., 1969).
the c-terminal amide is a property of other polypeptide hormones, such as gastrin and vasopressin, which act on the flow of water in biological systems (mutt and jorpes, 1967). in addition to its effects on the pancreas, secretin increases the rate of bile formation (wheeler and mancusi-ungaro, 1966). the pancreatic juice which results from stimulation by secretin is large in volume and has high bicarbonate concentration but is low in enzyme activity. stimulation of the vagus nerve causes a significant rise in enzyme concentration. this type of response also is produced by pancreozymin, another polypeptide hormone secreted by the duodenal mucosa. pancreozymin is now believed to be identical to cholecystokinin, an intestinal hormone which causes contraction of the gallbladder (thompson, 1969). the c-terminal pentapeptide of pancreozymin-cholecystokinin is exactly the same as that of gastrin. this fascinating relationship suggests that gastrin and pancreozymin-cholecystokinin may participate in some unified but as yet poorly understood system of digestive control (thompson, 1969). during the past several years, a large number of papers have been published on the endocrine function of the gastrointestinal mucosa and on several new polypeptides which are being classified as gut hormones (table iii). many of these new substances have not met the rigid physiological requirements for true hormone status, including (1) biological action in very small concentration, (2) release into the bloodstream, and (3) normal serum levels comparable to those provided experimentally by exogenous administration. these criteria probably will be modified, particularly with regard to requirements for transport in the vascular system. a large class of peptides are under investigation which have paracrine rather than endocrine activities; that is, their actions are on cells and tissues in the immediate vicinity of the cells of origin.
motilin is a polypeptide containing 22 amino acids that was originally isolated from porcine duodenal mucosa (brown et al., 1971). the amino acid composition and sequence have been described (brown et al., 1972, 1973). immunoreactive motilin has been found in the enterochromaffin cells of the duodenum and jejunum of several species (polak et al., 1975) and, by means of radioimmunoassay, motilin has been identified in the plasma of dogs (dryburgh and brown, 1975). motilin has been shown to stimulate pepsin output and motor activity of the stomach (brown et al., 1971) and to induce lower esophageal sphincter contractions (jennewein et al., 1975). studies by itoh et al. (1978) suggest that motilin plays an important role in initiating interdigestive gastrointestinal contractions. somatostatin, which is named for its growth hormone release-inhibiting activity, was first purified from bovine hypothalamus (brazeau and guillemin, 1974). somatostatin also has been demonstrated in the stomach, pancreas, and intestinal mucosa in concentrations higher than in the brain (pearse et al., 1977). somatostatin is a potent inhibitor of insulin and glucagon release. it also inhibits gastrin release and gastric acid secretion (barros d'sa et al., 1975; bloom et al., 1974), apparently acting independently on parietal cells and on g cells. these and a variety of other physiological effects suggest that somatostatin has important gastrointestinal regulatory functions (pearse et al., 1977). enteroglucagon is the hyperglycemic, glycogenolytic factor isolated from the intestinal mucosa. it occurs primarily in the distal small intestine and colon in at least two forms, one with a molecular mass of 3500 daltons and the other somewhat larger (valverde et al., 1970). enteroglucagon differs from pancreatic glucagon biochemically, immunologically, and in its mode of release.
the physiological function of enteroglucagon is not known, but its release from the mucosa following a meal and the associated increase in circulating blood levels have suggested a regulatory role on bowel function (pearse et al., 1977). enteroglucagon also differs significantly from glucagon produced by the a cells of the gastric mucosa of the dog (sasaki et al., 1975). canine gastric glucagon is biologically and immunochemically identical to pancreatic glucagon. gastric glucagon appears to be unique to the dog, similar activity not being observed in the stomach of the pig or the abomasum of cattle and sheep (sutherland and de duve, 1948). the microvillous membrane of the intestinal mucosa, like other cell membranes, is a lipid structure which acts as a barrier to water and water-soluble substances. water and polar solutes penetrate in one of two ways. (1) they may pass through pores in the membrane, which are believed to be aqueous channels connecting the luminal surface of the cell with the apical cytoplasm. the "effective" diameter of jejunal pores has been estimated to be approximately 0.4 nm (lindemann and solomon, 1962). (2) they may attach to membrane carriers, which facilitate passage through the lipid phase of the membrane. transport of water and water-soluble compounds is influenced by the permeability characteristics of the limiting membrane and by the nature of the driving forces which provide energy for transport. passive movement occurs either by simple diffusion or as a result of gradients in concentration (activity), ph, osmotic pressure, or electrical potential which may be present across the membrane. the passive movement of an ion in the direction of an electrochemical gradient is referred to as single-file diffusion (hladky, 1965). when a substance moves in a direction opposite that of an established electrochemical gradient, an active transport process is said to be responsible.
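The distinction drawn above between movement along and against an electrochemical gradient can be made concrete by computing the electrochemical potential difference for a transfer across the membrane. The sketch below uses the standard thermodynamic expression, the sum of a concentration term and an electrical term; the function names, the default temperature, and any concentrations or potentials supplied to it are illustrative assumptions, not values from the text.

```python
import math

R = 8.314    # gas constant, J/(mol K)
F = 96485.0  # Faraday constant, C/mol


def electrochemical_potential_difference(c_from, c_to, z, psi_from_mv, psi_to_mv,
                                         temp_k=310.0):
    """Energy change (J/mol) for moving a solute from one side to the other.

    c_from, c_to: concentrations (same units) on the source and destination sides
    z: charge number (0 for a nonelectrolyte such as glucose)
    psi_from_mv, psi_to_mv: electrical potentials (mV) on the two sides
    """
    chemical = R * temp_k * math.log(c_to / c_from)
    electrical = z * F * (psi_to_mv - psi_from_mv) / 1000.0
    return chemical + electrical


def classify_transport(delta_mu, tolerance=1e-9):
    """Negative values are downhill (passive); positive values need energy input."""
    if abs(delta_mu) <= tolerance:
        return "equilibrium"
    return "passive" if delta_mu < 0 else "active"
```

For a nonelectrolyte, `z` is zero and only the concentration term remains, which corresponds to the definition of active transport of glucose as movement against a concentration gradient.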
most water-soluble compounds, such as monosaccharides and amino acids, cannot diffuse across the intestinal mucosal membrane at rates which are adequate to meet nutritional requirements. transport of these substances is believed to be by means of membrane carriers. the nature of these carriers is not well understood, but they are believed to be an integral part of the membrane and responsible for binding the transported substance in a rather specific way. their existence is based primarily on kinetic evidence. carrier-mediated transport systems can be saturated and are competitively inhibited by related compounds. three types of carrier transport mechanisms are recognized (curran and schultz, 1968). (1) active transport, as stated previously, involves movement of electrolytes against an electrochemical gradient. in the case of nonelectrolytes, such as glucose, active transport is defined as movement against a concentration gradient. active transport requires metabolic energy and is inhibited by various metabolic blocking agents or by low temperature. (2) facilitated diffusion occurs when the passive movement of a substance is more rapid than can be accounted for by simple diffusion. facilitated diffusion systems can increase the rate of movement across the membrane by two or three orders of magnitude. the carrier mechanism is similar to that involved in active transport in that it displays saturation kinetics, may be inhibited competitively, and is temperature dependent. however, transport does not occur against concentration or electrochemical gradients, and direct expenditure of energy is not required. (3) exchange diffusion is a transfer mechanism similar to facilitated diffusion. it was postulated originally by ussing (1947) to explain the rapid transfer of radioactive na+ across cell membranes. the mechanism does not give rise to net transport but contributes in a major way to unidirectional flux rates, which are measured with isotopic tracers.
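The kinetic evidence for membrane carriers mentioned above, saturation and competitive inhibition, can be sketched with a Michaelis-Menten-type rate law. The text does not commit to a particular equation, so this form is an assumption chosen because it is the simplest expression that saturates; the parameter names (`vmax`, `km`, `ki`, `permeability`) are hypothetical.

```python
# Sketch contrasting simple diffusion with saturable, carrier-mediated transport.
# Assumption: carrier transport follows a Michaelis-Menten-type rate law.


def simple_diffusion_rate(conc, permeability):
    """Simple diffusion: rate rises in proportion to concentration, no ceiling."""
    return permeability * conc


def carrier_rate(conc, vmax, km, inhibitor=0.0, ki=None):
    """Carrier-mediated rate with optional competitive inhibition.

    A competitive inhibitor raises the apparent km by the factor
    (1 + inhibitor / ki) and leaves vmax unchanged.
    """
    if inhibitor > 0.0:
        if ki is None or ki <= 0.0:
            raise ValueError("a positive ki is required when an inhibitor is present")
        km = km * (1.0 + inhibitor / ki)
    return vmax * conc / (km + conc)
```

In this model the rate approaches `vmax` at high substrate concentration, and a competitive inhibitor with a small `ki` (high affinity for the carrier) depresses transport strongly at low substrate concentration but can be overcome by raising the substrate concentration.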
in the intestine, net water absorption is the result of bulk flow through pores in the membrane. diffusion in the usual sense plays no important role in water movement (section iii,a,4). when bulk flow occurs, it is possible for solutes to move across the membrane in the direction of flow by a phenomenon called solvent drag. the effect of solvent drag on the transport of a given solute depends on the rate of volume flow and upon the reflection coefficient, which is an expression of the relationship between the pore radius and the radius of the solute molecule being transported. a solute such as urea can be transported by the intestine against a concentration gradient by means of solvent drag (hakim and lifson, 1964). studies with isotopic tracers have shown that transport of water and electrolytes by the intestinal mucosa is a dynamic process, with rapid unidirectional fluxes of the substances occurring continuously in both directions. net absorption occurs when the flow from lumen to plasma exceeds that in the opposite direction (code et al., 1960; berger et al., 1959; hindle and code, 1962). active transport of na+ can occur along the entire length of the intestine, but the rate of absorption is greatest in the ileum and colon, where most net sodium and water absorption occurs. sodium transport is believed to be accomplished by an energy-requiring "sodium pump." the characteristics of this pump are not completely understood, but skou (1965) presented evidence that the pump is intimately related to the activity of a na+-k+-dependent adenosine triphosphatase located within the cell membrane. this enzyme is inhibited by cardiac glycosides, such as ouabain, which also are effective inhibitors of na+ transport, and it has been suggested that this enzyme system may actually be the pump. in the jejunum, net absorption of sodium occurs slowly unless nonelectrolytes, such as glucose or amino acids, are absorbed simultaneously. in in vivo studies by fordtran et al.
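Two quantitative ideas in the passage above lend themselves to a short sketch: net absorption as the difference between opposing unidirectional fluxes, and solvent drag as solute carried by volume flow and scaled by the reflection coefficient. The solvent drag expression used here, volume flow times (1 minus the reflection coefficient) times mean solute concentration, is the convective term of the Kedem-Katchalsky formulation and is an assumption added for illustration; the text describes the dependence only qualitatively. Function names are hypothetical.

```python
# Sketch of net flux and solvent drag across the intestinal mucosa.


def net_flux(lumen_to_plasma, plasma_to_lumen):
    """Net absorption is positive when lumen-to-plasma flux is the larger one."""
    return lumen_to_plasma - plasma_to_lumen


def solvent_drag_flux(volume_flow, reflection_coefficient, mean_conc):
    """Solute carried along with bulk flow (assumed convective term).

    A reflection coefficient of 1 means the solute is completely excluded from
    the pores (no drag); 0 means it moves as freely as water.
    """
    if not 0.0 <= reflection_coefficient <= 1.0:
        raise ValueError("reflection coefficient must lie between 0 and 1")
    return volume_flow * (1.0 - reflection_coefficient) * mean_conc
```

Because the drag term depends on volume flow rather than on the solute's own gradient, a small solute with a low reflection coefficient can be carried in the direction of water flow even when its concentration gradient opposes that movement.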
(1968), jejunal absorption of sodium appeared to be explained, in part, by solvent drag which was associated with active glucose transport. in the ileum, na+ absorption was independent of glucose absorption. water absorption in the jejunum also appears to be almost entirely dependent upon the absorption of glucose, while absorption from the ileum is unaffected by glucose (barry et al., 1961). the differential effect of glucose on absorption from the jejunum and ileum appears to be the result of fundamental metabolic differences between these two areas of the intestine (curran, 1960; gilman and koelle, 1960). as sodium is transported across the mucosal membrane, an equivalent amount of anion must be transported simultaneously to maintain electrical neutrality. a significant amount of chloride ion absorption can be accounted for on this basis. it is generally agreed that chloride transport in the intestine is a passive process (clarkson et al., 1961), although active secretion by the gastric mucosa seems well established. the intestinal mucosa can, under certain circumstances, absorb cl- independently of cation absorption and maintain electrical neutrality by exchange secretion of bicarbonate into the lumen (ingraham and visscher, 1936). dietary potassium is absorbed almost entirely in the proximal small intestine. absorption appears to be a passive process since movement across the mucosa occurs down a concentration gradient (high luminal concentration to a low concentration in plasma). the fluid which reaches the ileum from the jejunum has a potassium concentration and a sodium/potassium ratio which are similar to those of plasma. in the ileum and colon, the rate of sodium absorption is much greater than that of potassium so that, under normal conditions, the sodium/potassium ratio in the feces is much lower than that of plasma, approaching a ratio of 1. the absorption of water has been one of the most extensively studied aspects of intestinal transport.
it is now generally agreed that water movement is the result of bulk flow through membranous pores and that simple diffusion plays only a minor role. the question of whether water is actively or passively transported has been the subject of considerable controversy, and the controversy itself points to the fundamental difficulties which arise in trying to establish a definition of active transport. hypertonic saline solutions can be absorbed from canine intestine in vivo (grim, 1962) and from canine and rat (parsons and wingate, 1961) intestine in vitro. these observations indicate that water absorption can occur against an activity gradient and that the process is dependent upon metabolic energy. this would suggest that an active transport process is involved. curran (1965), however, presents an alternative interpretation which is now generally accepted. this view is that water transport occurs secondarily to active solute transport and is the result of local gradients established within the mucosal membrane. water transport is then coupled to the energy-dependent process responsible for solute transport but is one step removed from it. in the dog and probably other carnivores, the ileum is the main site of net sodium and water absorption. the colon accounts for no more than perhaps 20% of the total. in the case of herbivorous animals in which the large intestine is developed extensively, net secretion of water may occur in the ileum so that all net absorption of water must take place in the cecum and colon (powell et al., 1968; argenzio, 1975). carbohydrate is present in the diet primarily in the form of polysaccharides of glucose. the most common polysaccharides are starch, glycogen, and cellulose. starch and glycogen are composed of long chains of glucose molecules linked together by repeating α-1,4-glucosidic bonds. branching chains are linked by α-1,6-glucosidic bonds.
in those species which secrete salivary amylase, digestion of starch and glycogen begins in the mouth when this enzyme mixes with food. the action of salivary amylase is interrupted in the stomach, however, because of the low ph of the gastric secretion. starch digestion begins again in the proximal small intestine with the action of pancreatic amylase. this enzyme catalyzes a series of stepwise hydrolytic reactions, resulting in formation of the principal end products of starch digestion, the disaccharides maltose and isomaltose, and small amounts of glucose. glucose is absorbed directly by the intestinal mucosa and transported to the portal vein. the disaccharides are broken down further by hydrolytic enzymes of the brush border. b. cellulose. cellulose, like starch, is a polysaccharide of glucose but differs from starch in that the glucose molecules are linked by β-1,4-glucosidic bonds. starch can be utilized by all species, but cellulose is utilized as a source of energy only by animals which have extensive bacterial fermentation within the gastrointestinal tract. ruminant species digest cellulose most efficiently, but other animals in which the large intestine is well developed also can utilize cellulose to some degree. in ruminants, hydrolysis of cellulose is accomplished by cellulolytic bacteria, which are part of the complex rumen microflora. the end products of cellulose fermentation are the short-chain fatty acids: acetic, propionic, and butyric acids. these are absorbed directly from the rumen and serve as the major source of energy for ruminants. propionic acid is the major precursor for synthesis of carbohydrate.
maltose and isomaltose are the disaccharides (glucose-glucose) produced as end products of starch digestion. the diet also contains lactose (galactose-glucose) and sucrose (fructose-glucose). it once was believed that disaccharides were hydrolyzed within the lumen of the intestine by enzymes secreted by the mucosa. there is now general agreement, however, that disaccharide digestion is completed at the surface of the cell by disaccharidases (gray, 1975), which are components of the brush border (table iv). this is considered a form of intracellular digestion (ugolev, 1965). the disaccharidases have been solubilized from the brush border and partially purified. two separate maltases have been isolated (auricchio et al., 1965). isomaltase and sucrase have been separated and purified together as a two-enzyme complex (kolínská and semenza, 1967). the mucosa also contains two enzymes with lactase activity. one of these is a nonspecific β-galactosidase which hydrolyzes synthetic β-galactosides effectively but which hydrolyzes lactose at a slow rate. this enzyme has an optimal ph of 3 and is associated with the lysosomal fraction of the cell. the other lactase hydrolyzes lactose readily. it is associated with the brush border fraction of the cell and is the enzyme which is important in the digestive process (alpers, 1969). maltase, isomaltase, and sucrase are almost completely absent from the intestine in newborn pigs (dahlqvist, 1961) and calves (huber et al., 1961). the activity of these disaccharidases increases after birth and reaches adult levels during the first months of life. lactase activity is highest at birth and decreases gradually during the neonatal period. the relatively high lactase activity seems to be an advantage to the newborn in utilizing the large quantities of lactose present in the diet. bywater and co-workers demonstrated lactase deficiency following acute enteric infections and suggested that lactose utilization may be decreased in such cases. a. specificity of monosaccharide transport.
regardless of whether monosaccharides originate in the lumen of the intestine or are formed at the surface of the mucosal cell, transport across the mucosa involves processes which have a high degree of chemical specificity. glucose and galactose are absorbed from the intestine more rapidly than other monosaccharides. fructose is absorbed at approximately one-half of the rate of glucose, and mannose is absorbed at less than one-tenth the rate of glucose (kohn et al., 1965). glucose and galactose can be absorbed against concentration gradients and are said, by definition, to be actively transported. active absorption requires metabolic energy and can be inhibited by a variety of substances which block oxidative phosphorylation. the monosaccharides that are transported most efficiently against concentration gradients have certain common structural characteristics, which were summarized by wilson (1962). these include (1) the presence of a pyranose ring, (2) a carbon atom attached to c-5, and (3) a hydroxyl group at c-2 with the same stereoconfiguration as d-glucose. these features once were believed to be necessary for active monosaccharide transport, but recent observations suggest that they are not absolute requirements. both d-xylose, which has no substituted carbon atom at c-5, and d-mannose, which lacks the appropriate hydroxyl configuration at c-2, can be transported against concentration gradients under proper experimental conditions (csáky and lassen, 1964; csáky and ho, 1966; alvarado, 1966b). most current concepts imply that, during the initial phase of monosaccharide absorption, the monosaccharide molecule attaches to a mobile carrier located within the cell membrane. the evidence for such membrane carriers comes from kinetic studies of the overall transport process. the rate of glucose absorption increases with luminal concentration over a rather wide range, but a maximal rate of absorption can be demonstrated at very high concentrations.
this limitation of transport is believed to be due to saturation of binding sites on the membrane carrier. glucose transport is competitively inhibited by galactose (cori, 1925; fisher and parsons, 1953) and by a variety of substituted hexoses, which compete with glucose for carrier binding sites. the glucoside phlorizin is a very potent inhibitor (parsons et al., 1958; alvarado and crane, 1962). phlorizin also competes for binding sites but has a much higher affinity for these sites than does glucose. the absorptive surface of the mucosal cell is the microvillous membrane, or brush border (figs. 6a and 6b). it is through this part of the plasma membrane that glucose must pass during the initial phase of mucosal transport. techniques have been developed for isolating highly purified preparations of microvillous membranes from mucosal homogenates (forstner et al., 1968). faust et al. (1967) studied the binding of various sugars to these isolated membrane fractions. they found that d-glucose was bound by the membrane preferentially to l-glucose or to d-mannose and that glucose binding was completely inhibited by 0.1 mM phlorizin. the specificity of their observations suggested that binding represented an initial step in glucose transport, namely, attachment to a membrane carrier. c. sodium requirement. the absorption of glucose and other monosaccharides is influenced significantly by sodium ion (schultz and curran, 1970; kimmich, 1973). when sodium is present in the solution bathing the intestinal mucosa, glucose is absorbed rapidly, but, when sodium is removed and replaced by equimolar amounts of other cations, glucose absorption virtually stops (riklis and quastel, 1958; csáky, 1961; bihler and crane, 1962; bihler et al., 1962). glucose absorption is inhibited by ouabain, digitalis, and other cardiac glycosides which are also inhibitors of na-k-dependent adenosine triphosphatase activity and sodium transport (csáky and hara, 1965; schultz and zalusky, 1964).
these observations suggest a close relationship between the transport of glucose and sodium. on the basis of their own observations, crane and co-workers (1965) suggested that sodium ion acts directly upon the membrane carrier to increase affinity of the carrier for glucose. csáky (1963) interprets the apparent coupling of sodium transport to the transport of various nonelectrolytes as being due to the need to maintain a critical intracellular sodium concentration, which, in turn, is essential for conversion of metabolic energy (atp, etc.) to energy for transport. the initial step in protein digestion is enzymatic hydrolysis of peptide bonds with formation of smaller peptides and amino acids. the endopeptidases (proteases) hydrolyze peptide bonds within the protein molecule and also hydrolyze certain model peptides. exopeptidases hydrolyze either the carboxy-terminal (carboxypeptidase) or the amino-terminal (aminopeptidase) amino acids of peptides and certain proteins. dietary proteins first come in contact with proteolytic enzymes in the stomach. the best known of the gastric proteases is the family of pepsins (samloff, 1971), which attack most proteins with the exception of keratins, protamines, and mucins. pepsins are relatively nonspecific endopeptidases and split peptide bonds involving many amino acids. the most readily hydrolyzed peptide bonds are those of leucine, phenylalanine, tyrosine, and glutamic acid (ryle, 1965; ryle and hamilton, 1966; meyer and kelly, 1977). the extent of proteolysis in the stomach depends on the nature of the dietary protein and the length of time spent in the stomach. the food bolus mixed with saliva has a neutral or slightly alkaline ph as it enters the stomach, and a certain period of time is necessary for it to mix with gastric secretions and become acidified. proteolytic digestion begins when the ph of the gastric contents approaches 4 and occurs optimally in two ph ranges, 1.6-2.4 and 3.3-4.0 (taylor, 1959a,b).
because of the relative lack of specificity of the pepsins, some peptide bonds of almost all dietary proteins are split during passage through the stomach. the gastric phase of protein digestion appears to have only a minor and probably dispensable role in overall protein assimilation (freeman and kim, 1978). the reservoir function of the stomach, however, contributes to the gradual release of nutrients, ensuring more efficient utilization in the small intestine. partially digested protein passes from the stomach to the duodenum, where the acidic contents are neutralized by sodium bicarbonate secreted in the bile and pancreatic juices. peptic activity persists in the duodenum only during the period required to raise the ph above 4.0. the major peptidase activity in the lumen of the small intestine comes from the pancreatic enzymes trypsin, chymotrypsin, elastase, and carboxypeptidases a and b. the action of these enzymes is integrated so that the endopeptidases produce peptides with c-terminal amino acids which are appropriate substrates for the exopeptidases. trypsin produces peptides with basic c-terminal amino acids which are particularly suited for the action of carboxypeptidase b. chymotrypsin produces peptides with aromatic amino acids in the c-terminal position, and elastase produces peptides with c-terminal amino acids which are nonpolar. carboxypeptidase a hydrolyzes both types of c-terminal peptide bonds (table ii). the intestinal mucosa contains a broad range of aminopeptidases which complete the process of protein digestion (heizer and laster, 1969). most of the aminopeptidase activity is found in the soluble fraction of the cell (newey and smyth, 1960), but a small fraction is tightly bound to the microvillous membrane and appears to serve a digestive function at the cell surface similar to that described for the disaccharidases (rhodes et al., 1967).
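The integrated action of the pancreatic endopeptidases and exopeptidases described above can be illustrated on a peptide written in one-letter amino acid code. This is a deliberately simplified sketch with hypothetical function names: it assumes trypsin cleaves after every lysine (K) or arginine (R) and chymotrypsin after every phenylalanine (F), tyrosine (Y), or tryptophan (W), and it ignores the influence of neighboring residues, elastase, and carboxypeptidase a.

```python
# Simplified sketch of endopeptidase specificity and the follow-up action of
# carboxypeptidase b. Cleavage rules are idealized assumptions for illustration.

TRYPSIN_SITES = set("KR")        # basic residues
CHYMOTRYPSIN_SITES = set("FYW")  # aromatic residues


def cleave_after(peptide, residues):
    """Split a peptide on the carboxyl side of each listed residue."""
    fragments, start = [], 0
    for i, aa in enumerate(peptide):
        # a site at the very end has no peptide bond after it to hydrolyze
        if aa in residues and i < len(peptide) - 1:
            fragments.append(peptide[start:i + 1])
            start = i + 1
    fragments.append(peptide[start:])
    return fragments


def carboxypeptidase_b(peptide):
    """Release one basic c-terminal residue, if present.

    Returns (remaining peptide, released amino acid or empty string).
    """
    if peptide and peptide[-1] in TRYPSIN_SITES:
        return peptide[:-1], peptide[-1]
    return peptide, ""
```

Applied to the made-up sequence `"AGKLFRSW"`, the trypsin rule gives fragments ending in K and R, which are then substrates for `carboxypeptidase_b`, while the chymotrypsin rule gives a fragment ending in F, which this sketch leaves untouched.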
an endopeptidase from the intestinal mucosa was studied by hsu and tappel (1965) using hemoglobin as substrate. over 95% of the activity was located in the particulate fraction of the cell. the association with other acid hydrolases suggests that this is a lysosomal enzyme, and its relationship to the normal process of protein digestion is not known.
despite the long interest in and controversy regarding the subject of this section, the relative amounts of the various types of protein digestion products, i.e., peptides and amino acids, which are actually absorbed by intestinal mucosal cells during normal digestion are still not known. it is a difficult process to investigate from a kinetic standpoint because the products of proteolysis are absorbed rapidly after they are formed. studies of luminal contents, therefore, give only an estimate of the overall rate of protein digestion. in addition, dietary protein is continually mixed with endogenous protein in the form of digestive secretions and extruded mucosal cells. endogenous protein is hydrolyzed and the amino acids absorbed in a manner similar to that of dietary protein, and the two processes occur simultaneously. endogenous protein accounts for a significant part of the amino acids of the intestinal contents (nasset and ju, 1961). even when the dietary protein is labeled with a radioactive tracer, there is such rapid utilization that the tracer soon reenters the lumen in the form of endogenous protein secretion. in adult mammals, protein is not absorbed from the intestine in quantities of nutritional significance without previous hydrolysis. most neonatal animals absorb significant amounts of immunoglobulin and other colostral protein, but this capacity is lost soon after birth (see section iii,a,4 below). the intestinal mucosa is not totally impermeable to large polypeptide molecules, however.
the absorption of insulin (mw 5700) (laskowski et al., 1958; danforth and moore, 1959), ribonuclease (mw 13,700) (alpers and isselbacher, 1967), ferritin (bockman and winborn, 1966), and horseradish peroxidase (cornell et al., 1971) has been demonstrated. the intestine produces a part of the plasma β-globulin. this is believed to be the result of de novo synthesis of protein, however, presumably from individual amino acid precursors. during the digestion of protein, the amino acid content of the portal blood increases rapidly. attempts to demonstrate parallel increases in the level of peptides in the portal blood have not been successful (levenson et al., 1959). this has sometimes been taken as evidence that only amino acids can be absorbed by the intestinal mucosa and that the absorption of peptides does not occur. while it seems clear that a significant part of the dietary protein is absorbed in the form of free amino acids, peptides also may be taken up by the mucosal cell. evidence of the mucosal uptake of peptides came originally from experiments with isolated loops of intestine (wiggans and johnston, 1959; newey and smyth, 1959). various peptides were placed in solutions bathing the mucosa and analyses made subsequently of the serosal fluid. with the exception of small amounts of glycylglycine, peptides were never found on the serosal side, but free amino acids were found in significant quantities. the final steps of peptide digestion appear to be associated with mucosal epithelial cells. almost all of the aminopeptidase activity is associated with the mucosa, and very little activity is present in luminal contents (lindberg, 1966). as described above, mucosal aminopeptidase activity is located in the cytosol and in the brush border membrane fractions of the epithelial cell (heizer and laster, 1969; kim et al., 1972). these physically separate enzymes have remarkably different substrate specificities (kim et al., 1974).
the brush border enzyme has more than 50% of the activity for tripeptides, yet less than 10% of the total activity for dipeptides relative to the cytosolic enzyme(s) (peters, 1970; kim et al., 1972). almost all activity for tetrapeptides is present in the brush border (freeman and kim, 1978). proline-containing peptides are hydrolyzed almost exclusively by cytosolic peptidases, whereas leucine aminopeptidase activity is located primarily in the brush border. from these studies, it appears that, in the intact animal, peptides are absorbed in physiologically important quantities by intestinal mucosal cells and hydrolyzed either at the cell surface or intracellularly to constituent amino acids. the individual amino acids then are transported to the apical part of the cell and finally enter the portal circulation.
amino acids, like glucose and certain other monosaccharides, are absorbed and transferred to the portal circulation by active transport processes. the same type of saturation kinetics observed in studies of monosaccharide absorption is observed with amino acids, suggesting carrier transport mechanisms. certain monosaccharides inhibit amino acid transport (saunders and isselbacher, 1965; newey and smyth, 1964). inhibition generally has been of the noncompetitive type, but alvarado (1966a) demonstrated competitive inhibition between galactose and cycloleucine, suggesting that some form of common carrier may be involved. most amino acids are transported against concentration and electrochemical gradients, and the overall transport process requires metabolic energy. the chemical specificity of these transport mechanisms is demonstrated by the observation that the natural l forms of various amino acids are absorbed more rapidly than the corresponding d forms, and only the l-amino acids appear to be actively transported.
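the saturation kinetics and the two inhibition patterns mentioned above can be illustrated with a short numerical sketch. it assumes a michaelis-menten-type rate law for carrier-mediated uptake, which is a common textbook idealization and is not stated in the studies cited here. the function name, parameter names, and all numerical values are hypothetical and chosen only for illustration.

```python
def uptake_rate(s, vmax, km, i=0.0, ki=None, mode="competitive"):
    """Carrier-mediated uptake rate at substrate concentration s.

    Assumed (not from the source) Michaelis-Menten-type form:
        v = vmax * s / (km + s)
    A competitive inhibitor raises the apparent km; a purely
    noncompetitive inhibitor lowers the apparent vmax.
    """
    if i > 0.0:
        if ki is None:
            raise ValueError("ki is required when an inhibitor is present")
        factor = 1.0 + i / ki
        if mode == "competitive":
            km = km * factor        # same vmax, lower apparent affinity
        elif mode == "noncompetitive":
            vmax = vmax / factor    # same km, lower maximal rate
        else:
            raise ValueError("mode must be 'competitive' or 'noncompetitive'")
    return vmax * s / (km + s)


if __name__ == "__main__":
    # Hypothetical values: vmax = 10, km = 1, inhibitor at i = 2 with ki = 1.
    for s in (0.5, 1.0, 5.0, 50.0, 500.0):
        free = uptake_rate(s, 10.0, 1.0)
        comp = uptake_rate(s, 10.0, 1.0, i=2.0, ki=1.0, mode="competitive")
        nonc = uptake_rate(s, 10.0, 1.0, i=2.0, ki=1.0, mode="noncompetitive")
        print(f"s={s:7.1f}  free={free:5.2f}  comp={comp:5.2f}  noncomp={nonc:5.2f}")
```

in this idealized picture the rate levels off toward vmax as substrate concentration rises (saturation). competitive inhibition can be overcome by raising the substrate concentration, whereas noncompetitive inhibition cannot, which is how the two patterns are usually distinguished experimentally.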
sodium ion is necessary for absorption of amino acids as it is for a variety of other nonelectrolyte substances (schultz and curran, 1970; gray and cooper, 1971). separate transport systems appear to exist for different groups of amino acids. each member of a group inhibits the transport of other members competitively, suggesting that they share the same binding site. there is some overlap between groups, indicating that, in the overall transport process, certain steps may be common to all amino acids and other steps more specific (saunders and isselbacher, 1966; matthews and laster, 1965; wiseman, 1968). these groups are the following:
1. monoaminomonocarboxylic (neutral) amino acids, including histidine. these amino acids show mutual competition for transport and have the greatest requirement for na+.
2. monoaminodicarboxylic amino acids. aspartic and glutamic acids are not transported against concentration gradients. following uptake, they are transaminated, and, under physiological conditions, almost all of the aspartic and glutamic acid enters the portal blood as alanine.
3. dibasic amino acids, including lysine, arginine, ornithine, and the neutral amino acid cystine. these amino acids are apparently transported by the same transport system.
4. proline, hydroxyproline, the n-substituted glycine derivatives n-methylglycine (sarcosine), and n-dimethylglycine, and betaine. proline and hydroxyproline also can be transported by the first mechanism but the affinity of both amino acids for the na-dependent pathway is low.
the γ-glutamyl cycle has been proposed as a possible transport system for amino acids (meister and tate, 1976). γ-glutamyltransferase (ggt) is a membrane-bound enzyme which is present in a number of mammalian tissues and catalyzes the initial step in glutathione degradation.
the γ-glutamyl moiety of glutathione is transferred to amino acid (or peptide) receptors with the production of cysteinylglycine:
glutathione + amino acid ⇌ γ-glutamyl-amino acid + cys-gly
the highest ggt activity is present in tissues which are known to transport amino acids actively, e.g., the jejunal villus and the proximal convoluted tubule of the kidney. meister and his colleagues (1976) have suggested that ggt may function in translocation by interaction with extracellular amino acids and with intracellular glutathione. the hypothetical mechanism involves the noncovalent binding of extracellular amino acids to the plasma membrane, while intracellular glutathione interacts with ggt to yield a γ-glutamyl enzyme. when the γ-glutamyl moiety is transferred to the membrane-bound amino acid, a γ-glutamyl-amino acid complex is formed and, when released from the membrane binding site, moves into the cell. the γ-glutamyl-amino acid complex is split by the action of γ-glutamylcyclotransferase, an enzyme appropriately located in the cytosol. glutathione is regenerated by means of the γ-glutamyl cycle, which are good substrates for ggt (thompson and meister, 1975). the γ-glutamyl cycle does not require sodium, and the previously demonstrated sodium dependence for amino acid transport would not be explained by the cycle. the cycle is not considered to be the only amino acid transport system, and its quantitative significance in individual tissues is unknown. certain nutrient cell types which are deficient in ggt have been shown to transport amino acids normally.
at birth most domestic species, including the calf, foal, lamb, pig, kitten, pup, and infant, absorb significant quantities of colostral protein from the small intestine (brambell, 1958; walker and isselbacher, 1974). γ-globulin either is absent in the serum of these species at birth, or is at a low level. within a few hours after ingestion of colostrum, the serum γ-globulin level rises.
this is the principal mechanism by which the young of the above-listed species acquire maternal immunity. under normal environmental conditions, ingestion of colostrum is an absolute requirement for the health of these species during the neonatal period (fig. 7). in the neonatal calf, immunoglobulin deficiency has a role in the pathogenesis of gram-negative bacterial septicemia (smith, 1962; gay, 1965; roberts et al., 1954). most calves deprived of colostrum develop septicemia early in life but may develop diarrhea before death (smith, 1962; roberts et al., 1954; wood, 1955; tennant et al., 1975). hypogammaglobulinemia is almost always demonstrable in calves dying of gram-negative bacterial septicemia (fey, 1971), and hypogammaglobulinemia is believed to be due to insufficient immunoglobulin intake or to insufficient intestinal absorption. the factor in colostrum that protects against systemic infections is the igm fraction (penhale et al., 1971). serum immunoglobulin values of neonatal calves vary, and a 10% incidence of hypogammaglobulinemia may occur in clinically normal calves (tennant et al., 1969a; house and baker, 1968; smith et al., 1967; thornton et al., 1972; braun et al., 1973). most hypogammaglobulinemic individuals probably had insufficient colostrum intake. even when calves were given the opportunity to ingest colostrum, a surprising number were hypogammaglobulinemic. some of the reasons for varying gammaglobulinemia values are recognized, but the relative importance of each reason is not known. the concentration of lactoglobulin, the volume consumed (bush et al., 1971; selman et al., 1971), the time elapsed from birth to ingestion of colostrum, and the method of ingestion (natural suckling versus bucket feeding) may have an important influence on the serum γ-globulin (smith et al., 1967; mcbeath et al., 1971).
calves that suckle their dams usually attain serum γ-globulin concentrations that are higher than those attained by calves given colostrum from a bucket. the frequency of hypogammaglobulinemia may be influenced by seasons (gay et al., 1965b; mcewan et al., 1970a), although this relationship has not always been observed (smith et al., 1961; thornton et al., 1972). familial factors also influence hypogammaglobulinemia (tennant et al., 1969a). regardless of cause, the mortality of hypogammaglobulinemic calves is higher than that of calves with normal serum γ-globulin values (gay, 1965; house and baker, 1968; thornton et al., 1972; mcewan et al., 1970a; boyd, 1972; naylor et al., 1977). in addition to having more septicemic infections (smith, 1962; gay, 1965a; roberts et al., 1954; wood, 1955; fey, 1971; mcewan et al., 1970a), hypogammaglobulinemic calves have a greater prevalence of acute diarrheal disease (boyd, 1972; naylor et al., 1977; penhale et al., 1970; gay et al., 1965); the local protective effects of immunoglobulin in the intestine apparently are important (fisher et al., 1975). the prevalence of hypogammaglobulinemia and the high mortality associated with it have led to the development of several rapid tests for identification of hypogammaglobulinemic calves (mcbeath et al., 1971; aschaffenburg, 1949; fisher and mcewan, 1967a; patterson, 1967; stone and gitter, 1969). the zinc sulfate turbidity test (kunkel, 1947) was the first to be used for determination of serum immunoglobulin concentrations of neonatal calves (mcewan et al., 1970). a close correlation has been established between test results and the amount of serum igg and igm (fisher and mcewan, 1967a,b; mcewan et al., 1970b; penhale et al., 1967). the sodium sulfite turbidity test is similar to the zinc sulfate test and also has been used to identify hypogammaglobulinemic calves (stone et al., 1969; pfeiffer and mcguire, 1977).
failure of turbidity to develop when serum is added to a saturated solution of sodium sulfite indicates immunoglobulin deficiency, and semiquantitative assessment of the immunoglobulin concentration may be made by grading the degree of turbidity (stone and gitter, 1969). the refractometer is used as a rapid test for immunoglobulin deficiency (mcbeath et al., 1971; boyd, 1972). the close relationship between the concentration of γ-globulin and that of total serum protein in neonatal calves was described previously (tennant et al., 1969a), and the wide variation in total protein concentration was due to differences in γ-globulin concentration. direct linear correlation between the serum protein concentration (refractive index) and the immunoglobulin concentration also has been described (mcbeath et al., 1971). the equation for the regression line in that report was virtually identical to that observed recently (tennant et al., 1978). the y intercepts in our study and in that previously reported were identical (4 gm/dl). the refractometer has value as a rapid field instrument for the assessment of immunoglobulin status, but in cases of hemoconcentration it has limitations (boyd, 1972). the glutaraldehyde coagulation test was used originally for the detection of hypergammaglobulinemia in cattle, using whole blood (sandholm, 1974). glutaraldehyde reagent also has been used in a semiquantitative test to evaluate γ-globulin in canine (sandholm and kivisto, 1975) and human serum (sandholm, 1976). we modified this procedure to detect hypogammaglobulinemic calves (table v). calves that had a negative test result (serum γ-globulin ≤0.4 gm/dl) had markedly higher mortality than did calves with positive results (table vi) (tennant et al., 1979), findings similar to those obtained by using the zinc sulfate turbidity test (gay et al., 1965a; mcewan et al., 1970a).
many tests can be initiated at one time using the glutaraldehyde coagulation test, and all results can be evaluated rapidly without instrumentation (tables v and vi).
[table footnotes: samples of serum were obtained at birth, but no follow-up of calves was made. the death rate of calves that were test negative was significantly (p < 0.01) greater than that of test-positive calves, using t test for significance of differences between two percentages.]
protein enters the absorptive cell by pinocytosis and passes across the cell to the lymphatics. the process is not selective because many proteins other than the immune globulins can be absorbed (payne and marsh, 1962a,b). the ability to absorb intact protein is lost by domestic species within 1 or 2 days following birth. in rodents, protein absorption normally continues for approximately 3 weeks. the mechanism of intestinal "closure" was studied by lecce and co-workers (1964; lecce, 1966; lecce and morgan, 1962). they found that complete starvation of pigs lengthened the period of protein absorption to 4-5 days, whereas early feeding shortened the period. feeding different fractions of colostrum including lactose and galactose resulted in loss of protein absorptive capacity. the route of feeding may not be the critical factor, however. calves prevented from eating but which receive nutrients parenterally lose the ability to absorb protein at the same time as control calves (deutsch and smith, 1957).
a. luminal phase. the fat present in the diet is primarily in the form of triglycerides of long-chain fatty acids. the initial step in utilization of triglycerides occurs in the lumen of the proximal small intestine, where hydrolysis is catalyzed by pancreatic lipase. this enzyme, which is secreted in active form, requires an oil-water interface for activity so that only emulsions are attacked (sarda and desnuelle, 1958). enzyme activity is directly related to the surface area of the emulsion.
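the dependence of lipase activity on emulsion surface area can be made concrete with a short geometric sketch. it assumes, purely for illustration, that a fixed volume of triglyceride is dispersed as identical spherical droplets; real emulsions are polydisperse, and the function name and numerical values are hypothetical. under this assumption the total interfacial area equals 3v/r, so it is inversely proportional to droplet radius.

```python
import math


def total_surface_area(total_volume, radius):
    """Total interfacial area of a fixed oil volume split into equal spheres.

    Idealized assumption (not from the source): all droplets are spheres
    of the same radius. Units must be consistent, e.g. cm^3 and cm.
    """
    droplet_volume = (4.0 / 3.0) * math.pi * radius ** 3
    droplet_area = 4.0 * math.pi * radius ** 2
    n_droplets = total_volume / droplet_volume
    return n_droplets * droplet_area   # algebraically equal to 3 * V / r


if __name__ == "__main__":
    volume = 1.0  # hypothetical: 1 cm^3 of triglyceride
    for radius_um in (100.0, 10.0, 1.0, 0.1):
        radius_cm = radius_um * 1e-4
        area = total_surface_area(volume, radius_cm)
        print(f"droplet radius {radius_um:6.1f} um -> total area {area:12.1f} cm^2")
```

each tenfold reduction in droplet radius gives a tenfold increase in the oil-water interface available to the enzyme for the same quantity of fat.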
the smaller the emulsion particle, the greater the total surface area of a given quantity of triglyceride and the greater the rate of hydrolysis (benzonana and desnuelle, 1965). bile salts are not an absolute requirement, but they favor hydrolysis (1) by their detergent action, which causes formation of emulsions with small particle sizes, and (2) by stimulating lipase activity within the physiological ph range of the duodenum (borgström, 1954, 1964a). a colipase is present in the pancreatic secretion which facilitates the interaction of lipase with its triglyceride substrate and protects lipase from inactivation (borgström and erlanson, 1971). pancreatic lipase splits the ester bonds of triglycerides preferentially at the 1 and 3 positions (sari et al., 1966), so that the major end products of hydrolysis are 2-monoglycerides and nonesterified fatty acids (mattson et al., 1952; mattson and volpenhein, 1962, 1964). both compounds are relatively insoluble in water but are brought rapidly into micellar solution by the detergent action of bile salts. the mixed micelles so formed have a diameter of approximately 2.0 nm (borgström, 1964b; laurent and persson, 1965) and are believed to be the form in which the products of fat digestion are actually taken up by the mucosal cell (hofmann and small, 1967). the intraluminal events which occur in fat absorption are schematically summarized in fig. 8.
b. mucosal phase. the initial step in fat transport is the uptake of fatty acids and monoglycerides by the mucosal cell from micellar solution. just how this occurs is not completely clear, but present evidence suggests that the lipid contents of the micelle are somehow discharged at the cell surface so that they enter the cell in molecular rather than micellar form. the net effect is the absorption of the end products of lipolysis with the exclusion of bile salts, which are absorbed farther down the intestine, primarily in the ileum (lack and weiner, 1963).
uptake of fatty acids appears to be a passive process having no requirement for metabolic energy (johnston and borgström, 1964; strauss, 1966). within the mucosal cell, the fatty acids are transported by a soluble binding protein to the endoplasmic reticulum, where the fatty acids and monoglycerides are rapidly reesterified to triglyceride (ockner and manning, 1974; ockner and isselbacher, 1974). the two biochemical pathways for triglyceride biosynthesis in the intestine are summarized in fig. 9 (from isselbacher, 1966). direct acylation of monoglyceride occurs in the intestine (senior and isselbacher, 1962) and probably is the major pathway for lipogenesis in the intestine during normal fat absorption (kern and borgström, 1965; mattson and volpenhein, 1964). the initial step in this series of reactions involves activation of fatty acids by acyl-coa synthetase, a reaction which requires mg2+, atp, and coa (dawson and isselbacher, 1960; clark and hübscher, 1960, 1961; brindley and hübscher, 1965) and which has a marked specificity for long-chain fatty acids (dawson and isselbacher, 1960; brindley and hübscher, 1965). this specificity appears to explain the observation by bloom et al. (1951) that medium- and short-chain fatty acids are not incorporated into triglycerides during intestinal transport but enter the portal circulation as nonesterified fatty acids. the activated fatty acids then react sequentially with mono- and diglycerides to form triglycerides in steps catalyzed by mono- and diglyceride transacylases (ailhaud et al., 1964). the enzymes responsible for this series of reactions were partially purified by rao and johnston (1966) from the microsomal fraction of the cell. they observed that purification of the separate enzyme activities occurred simultaneously, suggesting that these enzymes occur together in the endoplasmic reticulum as a "triglyceride-synthetase" complex.
an alternate route which is available for fatty acid esterification involves l-α-glycerophosphate, which may be derived from glucose or from dietary glycerol by the action of intestinal glycerokinase (haessler and isselbacher, 1963; clark and hübscher, 1962). activated fatty acid coa derivatives react with l-α-glycerophosphate to form lysophosphatidic acid (monoglyceride phosphate), which by a second acylation forms phosphatidic acid (diglyceride phosphate). phosphatidic acid phosphatase then hydrolyzes the phosphate ester bond, forming diglyceride, and by means of a transacylase step similar to that described in the previous paragraph, triglyceride can then be formed. although this pathway appears to be one of minor importance for triglyceride synthesis in the intestine, johnston (1968) pointed out the importance of certain of the intermediates in this sequence of reactions in the synthesis of phospholipids which are necessary for stabilization of the chylomicron. the next step in fat transport is formation of chylomicrons within the endoplasmic reticulum. the chylomicron is composed primarily of triglyceride and has an outer membranous coating of cholesterol, phospholipid, and protein (zilversmit, 1965). the β-lipoprotein component of the chylomicron is synthesized by the intestinal mucosal cell (isselbacher and budz, 1963; hatch et al., 1966; windmueller and levy, 1968). inhibition of protein synthesis by puromycin or acetoxycycloheximide interferes with chylomicron formation and significantly reduces fat transport (sabesin and isselbacher, 1965). the final step in fat absorption is extrusion of the chylomicron into the intercellular space opposite the basal lateral portion of the absorptive cell. this is accomplished by a process which is essentially the reverse of pinocytosis (palay and karlin, 1959). from the intercellular space the chylomicron passes through the basement membrane and enters the lacteals through small pores.
the chylomicron passes from the lacteal into lymph ducts and ultimately reaches the general circulation, having bypassed the liver completely during the initial phase of absorption.
a. cholesterol. dietary cholesterol is present in both free and esterified forms, but only nonesterified cholesterol is absorbed (vahouny and treadwell, 1964). cholesterol esters are hydrolyzed within the lumen of the intestine by sterol esterase secreted by the pancreas. bile salts are required both for the action of this enzyme (vahouny et al., 1965) and for the absorption of nonesterified cholesterol. in the mucosal cell, cholesterol is reesterified and transferred by way of the lymph to the general circulation. the type of triglyceride present in the diet significantly affects the absorption of cholesterol and its distribution in lymph lipids.
b. vitamin a. the diet contains vitamin a activity in two principal forms: (1) as esters of preformed vitamin a alcohol (retinol) and fatty acids and (2) as provitamin a, primarily in the form of β-carotene. vitamin a ester is hydrolyzed by a pancreatic esterase within the lumen (murthy and ganguly, 1962), and the free alcohol is absorbed in the upper small intestine by a process which apparently requires metabolic energy (skala and hrubá, 1964). vitamin a alcohol is reesterified in the mucosa utilizing primarily palmitic acid (mahadevan et al., 1963). the vitamin a ester is absorbed by way of the lymph. after reaching the general circulation, it is rapidly cleared from the plasma and stored in the liver. in the postabsorptive state, vitamin a circulates as the free alcohol. this is also the form released from the liver as needed by the action of a specific hepatic retinyl palmitate esterase (mahadevan et al., 1966). the blood level of vitamin a is independent of the liver reserve, and, as long as a small amount of vitamin a is present in the liver, the blood level remains normal (dowling and wald, 1958).
in diets which lack animal fat, the carotenes, mainly β-carotene, serve as the major vitamin a precursors. the intestinal mucosa plays the primary role in conversion of provitamin a to the active vitamin, although conversion can occur to a limited degree in other tissues (bieri and pollard, 1954; zachman and olson, 1963). the exact mechanism involved in the conversion of β-carotene to vitamin a is not completely established, but studies by olson (1961) suggest that there is central cleavage of β-carotene into two active vitamin a alcohol molecules, which are subsequently esterified and transported by the lymphatics as with the preformed vitamin. bile salts are required for the mucosal uptake of β-carotene and for the conversion of β-carotene to vitamin a. uptake of carotene and release of vitamin a ester into the lymph appear to be rate-limiting steps. cattle also absorb substantial amounts of carotene without prior conversion to vitamin a, and these pigments are responsible for much of the yellow color of the plasma. most other species have no carotene in the plasma, and it has been suggested that extraintestinal conversion may be more efficient in these species than in cattle.
c. vitamin d. vitamin d, like cholesterol, is a sterol which is absorbed from the intestine by way of the lymph (schachter et al., 1964). intestinal absorption differs, however, in that vitamin d is transported to the lymph in nonesterified form (bell, 1966). the uptake of vitamin d by the mucosal cell is favored by the presence of bile salts. simultaneous absorption of fat from micellar solutions increases transport out of the cell into the lymph, a step which appears to be rate limiting (thompson et al., 1969). one of the major actions of vitamin d is to enhance the intestinal absorption of calcium ion. the mechanism of action of vitamin d has been described by wasserman and co-workers (1968; wasserman and taylor, 1966, 1968).
they have shown that vitamin d causes synthesis of a calcium-binding protein present in the soluble fraction of the intestinal mucosal cell. they have accumulated a substantial amount of evidence which suggests that this protein plays a central role in the active transport of calcium.
vomiting is a coordinated reflex act which results in rapid, forceful expulsion of gastric contents through the mouth. the reflex may be initiated by (1) local gastric irritation caused by a variety of toxic irritants or infectious agents, (2) foreign bodies, (3) gastric tumors, (4) obstruction of the pyloric canal or of the small intestine, or (5) drugs, such as apomorphine, or other toxic substances which act centrally on the "vomiting center" located in the medulla. severe vomiting produces loss of large quantities of water and of h+ and cl− ions. these losses cause dehydration, metabolic alkalosis with elevated plasma bicarbonate concentration, and hypochloremia. chronic vomiting may also be associated with loss of tissue k+ and hypokalemia. the k+ deficit is caused primarily by increased urinary excretion, which is the result of the existing alkalosis (leaf and santos, 1961). gastric secretions contain significant quantities of k+ (section ii,b), and losses in the vomitus also contribute to the k+ deficiency. potassium deficiency, which develops initially because of alkalosis, ultimately may perpetuate the alkalotic state by interfering with the ability of the kidney to conserve h+ (koch et al., 1956; darrow, 1964). both potassium deficiency and the hypovolemia caused by dehydration may result in renal tubular damage and ultimately in renal failure (haden and orr, 1923, 1924). vomiting occurs frequently in the dog, cat, and pig but is an unusual sign in the horse, which has anatomical restrictions of the esophagus that interfere with expulsion of gastric contents.
in cattle, sheep, and goats, the physiological process of rumination utilizes neuromuscular mechanisms similar to those involved in vomiting. uncontrolled expulsion of ruminal contents is, however, an uncommon sign, occurring most frequently after ingestion of toxic materials. the contents of the abomasum are not expelled directly even when the pyloric canal is obstructed. a syndrome does occur in cattle with pyloric obstruction, however, which is similar metabolically to that observed in nonruminants. the syndrome has been observed in right-sided displacement of the abomasum with torsion (espersen, 1961; boucher and abt, 1968). we have also observed the syndrome in cows with functional pyloric obstruction, the result of reticuloperitonitis (a variety of "vagal indigestion"). when the pylorus is obstructed, abomasal contents are retained, causing distention of the abomasum, which in turn stimulates further secretion and retention. retained abomasal contents may be regurgitated into the large reservoir of the rumen and there are sequestered from other fluid compartments of the body. the net result is loss of h+ and cl− ions and development of metabolic alkalosis, hypochloremia, and hypokalemia (espersen and simesen, 1961; svendsen, 1969).
chronic hypertrophic gastritis, which resembles menetrier's disease in man, has been demonstrated in the dog (van der gaag et al., 1976; happe and van der gaag, 1977; kipnis, 1978). van kruiningen's series of cases were basenjis which had concomitant lymphocytic-plasmocytic enteritis. three unpublished cases were studied at the new york state college of veterinary medicine. signs of illness usually involved chronic vomiting, weight loss, and occasionally diarrhea. hypoalbuminemia was documented in most of these cases. in man, hyperchlorhydria or achlorhydria can occur.
the morphological changes in the stomach wall (hypertrophic rugae) as well as some of the clinical features help to differentiate this disease from gastric neoplasia and canine zollinger-ellison syndrome. canine zollinger-ellison syndrome was reported in four dogs (straus et al., 1977; van der gaag et al., 1978). vomiting, diarrhea, inappetence, and weight loss were reported. all of the dogs had pancreatic non-β islet cell tumors, resulting in hypergastrinemia, hyperchlorhydria, hypertrophic gastritis, peptic esophagitis, and duodenal ulcers.
the term "diarrhea" is used loosely to describe the passage of abnormally fluid feces with increased frequency and/or with increased volume. the significance of diarrhea depends primarily on the underlying cause and on the secondary nutritional and metabolic disturbances which are caused by excessive fecal losses. there are theoretically three factors which could act independently or in combination to produce diarrhea: (1) increased rate of intestinal transit, (2) decreased intestinal absorptive capacity, and (3) increased secretion into the intestinal lumen. an increase in the rate of intestinal transit has been considered to be important in various functional disorders of the gastrointestinal tract in which "hypermotility" has been considered the primary cause. although increased intestinal motility may be a factor in certain types of diarrheal disease, when the direction of motility has been investigated, diarrhea has actually been associated with decreased motility (christiansen, 1972).
decreased intestinal assimilation of nutrients may result from either (1) decreased intraluminal hydrolysis of nutrients, i.e., maldigestion (kaiser, 1964), due to pancreatic exocrine insufficiency or to bile salt deficiency or (2) defective mucosal transport of nutrients, i.e., malabsorption, which may be the result of various types of inflammatory bowel disease, intestinal lymphoma, or intrinsic biochemical defects in the mucosal cell which interfere with normal digestion and absorption. the role of increased intestinal secretion in the pathogenesis of certain types of acute diarrhea is now recognized. enteropathogenic strains of escherichia coli have been shown to produce soluble enterotoxins (smith and halls, 1967; köhler, 1968; moon, 1978), which alter bidirectional sodium and water flux (fig. 10). [fig. 10. (moon, 1978.)] rapid advances in understanding the pathogenesis of enterotoxin-induced diarrhea and the molecular basis of enterotoxin action have been made. the most extensively studied enterotoxin is that produced by vibrio cholerae. this bacterium produces a large molecular weight heat-labile toxin (ct), one subunit of which has properties similar to those of heat-labile (lt) enterotoxin produced by certain strains of e. coli (richards and douglas, 1978). the mechanism of action of ct is believed to involve the activation of adenylate cyclase. this membrane-bound enzyme converts atp to cyclic 3',5'-adenosine monophosphate (camp), which is then responsible for the greatly increased secretion of water and electrolytes by the intestinal mucosa (moon, 1978). although species differences have been observed (hamilton et al., 1978a,b; forsyth et al., 1978), this mechanism appears to be important in the mode of action of e. coli lt as well (richards and douglas, 1978). additional extensive studies have centered on the molecular mechanism of action of ct.
under physiological conditions, adenylate cyclase is activated by the binding of guanosine triphosphate (gtp) to the inactive enzyme. an associated gtpase inactivates the enzyme by converting enzyme-bound gtp to gdp and inorganic phosphate. this gtp-gdp system apparently plays a critical role in the regulation of adenylate cyclase. cholera toxin is believed to bind to adenylate cyclase in a way which inhibits hydrolysis of gtp, thereby maintaining the enzyme in an activated state (levinson and blume, 1977; johnson et al., 1978; cassel and pfeuffer, 1978) (fig. 11). [fig. 11. proposed mechanism of action of cholera toxin, which inhibits hydrolysis of gtp, thereby increasing adenylate cyclase activity. (after cassel and selinger, 1978.)] certain enteropathogenic strains of e. coli produce a low molecular weight heat-stable toxin (st) alone or in addition to lt (richards and douglas, 1978; moon, 1978; hamilton et al., 1978a). in most epidemiological studies of neonatal diarrheal diseases of calves, isolated strains of e. coli produce only st (moon et al., 1976; braaten and myer, 1977; larivière et al., 1979). in contrast to lt and ct, which induce intestinal sodium and water secretion only after a lag phase of several hours, st increases intestinal secretion at once. recent evidence suggests that st induces intestinal secretion by activating guanylate cyclase and that the mediator of intestinal secretion induced by st is cyclic 3',5'-guanosine monophosphate (field et al., 1978). such advances in our fundamental knowledge of the pathogenesis of enterotoxin-induced diarrheal disease have opened several avenues of investigation which may lead to pharmacological modification of intestinal secretion as a mode of therapy or prophylaxis. enterotoxin-induced intestinal secretion has been shown to be effectively blocked by cycloheximide, an inhibitor of protein synthesis (serebro et al., 1969).
the lack of specificity and the toxicity of cycloheximide precluded its clinical use, but acetazolamide has been shown to inhibit intestinal fluid secretion (norris et al., 1969; moore et al., 1971), and ethacrynic acid, another potent diuretic, has been shown to inhibit enterotoxin-induced fluid secretion (carpenter et al., 1969). unfortunately, the diuretic effects of these drugs preclude their clinical use, but an "intestinal-specific" derivative would have significant therapeutic potential. adenosine analogues also have been shown in preliminary studies to inhibit cholera toxin-stimulated intestinal adenylate cyclase, but their potential as prophylactic or therapeutic agents is not known. prostaglandin e1 and ct have a similar effect on electrolyte transport in rabbit ileum. application of either to the mucosa inhibits sodium absorption and stimulates chloride secretion. one possible explanation for the effects of ct is that it stimulated release of prostaglandin, which then acted on adenylate cyclase, producing camp. to test this hypothesis, the effects of inhibitors of prostaglandin release on enterotoxin-stimulated intestinal secretion were investigated. both indomethacin (gots et al., 1974) and acetylsalicylic acid (farris et al., 1976) were shown to be potent inhibitors of enterotoxin-induced intestinal secretion using laboratory animal models. current evidence does not support the hypothesis that prostaglandins play a primary role in the pathogenesis of cholera or other enterotoxin-induced diarrheal diseases (schwartz et al., 1975), but the effects of these known prostaglandin inhibitors and other drugs on the intestinal secretory process warrant their evaluation as possible prophylactic and therapeutic agents. in preliminary studies, jones and his colleagues demonstrated a positive therapeutic response to a new prostaglandin inhibitor (jones et al., 1977).
the autonomic nervous system has important effects on intestinal ion transport and water absorption (tapper et al., 1978). catecholamines stimulate formation of camp in a variety of mammalian cells (sutherland and rall, 1960; schultz et al., 1975), apparently by activating the gtp-gdp system described above (cassel and selinger, 1978; ciment and devellis, 1978). adrenergic blocking agents, such as chlorpromazine (holmgren et al., 1978) and propranolol (donowitz et al., 1979), have been shown to have significant inhibitory effects on enterotoxin-induced intestinal secretion. although the mechanism of action of these two adrenergic blockers is not known, they represent still another class of drugs which may be of therapeutic benefit. the intestinal "adsorbent" drug pepto-bismol, a patented medication containing bismuth subsalicylate, and attapulgite, a heat-treated silicate, have been shown to have antienterotoxic effects (drucker et al., 1977; ericsson et al., 1977; gyles and zigler, 1978). controlled therapeutic trials with bismuth subsalicylate have demonstrated significant therapeutic benefit in certain large-volume diarrheal diseases in man suspected of being enterotoxigenic in origin (portnoy et al., 1976; dupont et al., 1977; dupont, 1978). the mechanism by which it inhibits intestinal secretion has not been determined, but the chemical relation of bismuth subsalicylate to other known prostaglandin inhibitors is recognized. it is possible that such drugs, by decreasing endogenous production of prostaglandin, decrease the basal level of cyclic nucleotides, which in turn causes an increase in the threshold of response to enterotoxin. recent evidence suggests that salicylates also may stimulate sodium chloride absorption (powell et al., 1979). these observations taken collectively suggest that new, innovative methods for therapy and control of acute clinical diarrheal disease may be developed in the not too distant future.
acute diarrhea represents the leading cause of morbidity and mortality in neonatal calves and pigs. the pathogenesis of the neonatal enteric infection is complex, often involving nutritional or environmental factors as well as infectious agents, such as enteropathogenic strains of e. coli, the transmissible gastroenteritis virus (tge), rotaviruses, and other bacterial and viral pathogens. the severe clinical signs and frequently fatal outcome of acute diarrheal disease are often directly related to dehydration and to associated hydrogen ion and electrolyte disturbances (dalton et al., 1965; fisher and mcewan, 1967b; tennant et al., 1972, 1978). in acute diarrhea with watery stools of large volume, the fecal fluid originates primarily in the small intestine. the electrolyte composition of the stool in such cases is similar to that of the fluid found normally in the lumen of the small intestine, which in turn is similar to that of an ultrafiltrate of the plasma. the rapid dehydration which accompanies acute enteritis in the newborn soon produces hemoconcentration and ultimately hypovolemic shock. such cases are characterized by metabolic acidosis (dalton et al., 1965; phillips and knox, 1969) caused by decreased excretion of h+ due to renal failure and by increased production of organic acids, the result of decreased tissue oxygenation, which leads to excessive anaerobic glycolysis. hyperkalemia also is observed characteristically in young, severely dehydrated animals. hyperkalemia in such cases is the result of increased movement of cellular potassium into the extracellular fluid and of decreased renal excretion. cardiac irregularities caused by hyperkalemia can be demonstrated with the electrocardiogram, and cardiac arrest related to hyperkalemia is believed to be a direct cause of death in calves with acute diarrhea (fisher and mcewan, 1967b).
marked hypoglycemia also has been observed occasionally prior to death in calves with acute enteric infections. hypoglycemia is believed to be due to decreased gluconeogenesis and increased anaerobic glycolysis, the result of hypovolemic shock (tennant et al., 1968). the sequence of metabolic changes which occur during acute neonatal diarrhea is summarized in fig. 12. [fig. 12. (from tennant et al., 1972.)] in chronic forms of diarrheal disease, excessive fecal losses of electrolyte and fluid may be compensated in part by renal conservation mechanisms and by oral ingestion. if water is consumed without adequate ingestion of electrolytes, hyponatremia and hypokalemia may develop (tasker, 1967; patterson et al., 1968). in such cases, the osmolarity of the plasma is significantly decreased and hypotonic dehydration occurs. in longer-standing cases of chronic diarrhea, the plasma k+ concentration may become dangerously low. it is imperative, in this situation, that intravenous fluids contain sufficient k+ to prevent further reduction in plasma concentration. if they do not, additional cardiac irregularities or cardiac arrest may result. decreased assimilation of nutrients may occur either as a result of defective intraluminal digestion (maldigestion) (kaiser, 1964), or because of defects in mucosal transport (jeffries et al., 1969; floch, 1969; wilson and dietschy, 1971). the malabsorption syndrome is observed in several types of intestinal disease, including chronic intestinal granulomatous diseases such as johne's disease, intestinal parasitic infections, and lymphoma of the intestine. primary clinical signs include persistent or recurrent diarrhea, nutrient loss in the feces (e.g., steatorrhea), and weight loss. mucosal cell-enzymatic defects may be accompanied by chronic inflammation, villous atrophy, or cellular infiltrations of the lamina propria of the intestine.
early reports of primary or idiopathic intestinal malabsorption in dogs (miller, 1960; vernon, 1962; kaneko et al., 1965) were compared to nontropical sprue (adult celiac disease, gluten-induced enteropathy) of man, but no convincing association with gluten sensitivity was demonstrated. subsequent reports of malabsorption syndromes in the dog have described a variety of causes (van kruiningen, 1968; ewing, 1971; van kruiningen and hayden, 1972; hill, 1972; hill and kelly, 1974; schall, 1974; anderson, 1975, 1977; burrows et al., 1979), which must be distinguished from the maldigestion caused by pancreatic insufficiency (anderson and low, 1965a,b) (juvenile pancreatic atrophy, chronic pancreatitis) and from certain forms of hepatic or gastric disease. intestinal malabsorption can occur with protozoal enteritis (giardiasis, coccidiosis), lactase deficiency, eosinophilic gastroenteritis, lymphangiectasis, villus atrophy, lymphocytic-plasmacytic enteritis, histoplasmosis, chronic "bacterial" enteritis, malignant lymphoma, and intestinal amyloidosis. some authors (anderson, 1977; hayden and van kruiningen, 1973; arrick and kleine, 1978) described malabsorption and pseudoobstruction secondary to hypoplasia of the tunica muscularis of the jejunum in a dog. intestinal malabsorption is reported less frequently in the cat than in the dog (theran and carpenter, 1968; wilkinson, 1969). malabsorption syndromes similar to those recognized in dogs are being recognized with increased frequency in farm animals (blood et al., 1979). meuten et al. (1978), cimprich (1974), and merritt et al. (1976) have reported malabsorption in the horse secondary to chronic granulomatous enteritis, and specific amino acid malabsorption has been reported in johne's disease (patterson and berret, 1969). steatorrhea, the presence of excessive amounts of fat in the feces, is a prominent sign of intestinal malabsorption in dogs.
the stools are bulky, gray or tan, and, grossly, may have an oily appearance. the normal dog excretes 3-5 gm of fat in the stool each day. this level of fecal fat is quite constant and is independent of dietary fat intake over a wide range of 15 to 48 gm/day (heersma and annegers, 1948). in intestinal malabsorption, the ability to absorb fat is decreased and fecal fat excretion increases significantly. under these conditions, the amount of fecal fat excreted becomes proportional to dietary intake. merritt et al. (1979) reported that body weight is an important factor in fat output. small dogs (i.e., less than 10-15 kg body weight) with intestinal malabsorption had fecal fat outputs lower than or equal to published normal values. fecal fat excretion for normal dogs was 0.24 ± 0.01 gm/kg body weight per day. steatorrhea can be documented qualitatively by staining the fresh stool with a lipophilic stain, such as sudan iii, and observing increased numbers of oil droplets under the light microscope. in experienced hands, this method is a reliable diagnostic procedure (drummey et al., 1961). the following methods can be used to demonstrate neutral and split fats. for neutral fat, two drops of water are added to a stool sample on a glass slide and mixed. two drops of 95% ethanol are then added and mixed, followed by several drops of a saturated solution of sudan iii in 95% ethanol. a coverslip is applied to the mixture, which is then examined for yellow or pale orange refractile globules of fat, particularly at the edges of the coverslip. normally, two or three fat droplets per high-power field are present. a large number of neutral fat droplets suggests a lack of pancreatic lipase activity, i.e., exocrine pancreatic insufficiency. for free fatty acids, several drops of 36% acetic acid are added to a stool sample on a glass slide and mixed. several drops of sudan iii solution are then added and mixed.
a coverslip is applied, and the slide is gently heated over an alcohol burner until it begins to boil. the slide is air-cooled and then quickly heated again; this procedure is repeated two or three times. the warm slide is examined for stained free fatty acid droplets, which, when warm, appear as deep orange fat droplets from which spicules and soaps, resembling the pinna of the ear, form as the preparation cools. normal stools may contain many tiny droplets of fatty acids (up to 100 per high-power field). with increasing amounts of split fats, the droplets become larger and more numerous, which suggests an abnormality in fat absorption. quantitation of fecal fat is the most accurate method of assessing steatorrhea (burrows et al., 1979), with dietary fat balance being determined for a period of 48-72 hours. fecal fat is analyzed using a modification of the technique of van de kamer et al. (1949), which employs ether extraction of fecal lipid and titration of fatty acids. the results are expressed as grams of neutral fat excreted per 24 hours. merritt et al. (1979) have suggested that dogs be fed 50 gm fat per kilogram per day for two to three days prior to fecal collection. analysis of a 24-hour collection of stool when this is done is believed to be as accurate as a 72-hour stool collection. results are expressed as fat excretion in grams per kilogram body weight. in addition to malabsorption of fat, the canine malabsorption syndrome is associated with decreased absorption of other nutrients. these defects in absorption are responsible for the progressive malnutrition which is a cardinal feature of the disease. there may be malabsorption of vitamin d and/or calcium, resulting in osteomalacia. the anemia sometimes observed may be the result of malabsorption of iron or of the b vitamins, which are required for normal erythropoiesis. malabsorption of vitamin k can result in hypoprothrombinemia. glucose malabsorption has been clearly documented by kaneko et al.
(1965), and it is likely that amino acids, which are absorbed at a similar level of the small intestine, are also malabsorbed. carbohydrate and fat malabsorption unquestionably contributes to the calorie deficit which results in weight loss. amino acid malabsorption may contribute to the development of hypoproteinemia, although this is thought to be due primarily to increased intestinal loss of plasma protein (see section iv,c). the diagnosis of idiopathic canine malabsorption can be made only after appropriate diagnostic procedures have ruled out the presence of (1) other primary inflammatory, neoplastic, or parasitic diseases of the intestine and (2) the diseases of the pancreas, liver, or stomach which result in defective intraluminal digestion. the presence of parasitic infection is determined by examining the feces for parasite ova. other inflammatory or neoplastic diseases of the intestine may be suggested on the basis of clinical or radiological examination, but a definitive diagnosis usually depends on histopathological examination of an intestinal biopsy specimen. both primary and secondary intestinal malabsorption must be differentiated from those diseases in which there is decreased intraluminal hydrolysis of nutrients. the latter are due most frequently to pancreatic exocrine insufficiency, the result of such diseases as chronic pancreatitis or juvenile atrophy. in these diseases, degradation of the major dietary constituents is reduced because of a primary lack of pancreatic enzymes. intraluminal hydrolysis of fat may also be decreased because of a deficiency of bile salts caused either by decreased hepatic secretion or by bile duct obstruction. under certain experimental conditions, diversion of bile flow in the dog actually has a quantitatively small effect on fat absorption (wells et al., 1955; hill and kidder, 1972a).
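as a worked illustration of the fecal fat figures quoted earlier in this section, the sketch below normalizes a timed stool collection to grams of fat per kilogram body weight per day and compares it with the mean reported for normal dogs (0.24 gm/kg per day; merritt et al., 1979). it is a minimal sketch only: the function names and the example dog are hypothetical, and no diagnostic cutoff is implied, because the text does not give one.

```python
# Figures quoted in the text (heersma and annegers, 1948; merritt et al., 1979).
NORMAL_TOTAL_FAT_G_PER_DAY = (3.0, 5.0)   # whole-animal daily output, normal dog
NORMAL_FAT_G_PER_KG_PER_DAY = 0.24        # mean for normal dogs


def fecal_fat_per_kg_per_day(total_fat_g, collection_hours, body_weight_kg):
    """Convert a timed collection to gm fat per kg body weight per 24 hours."""
    if collection_hours <= 0 or body_weight_kg <= 0:
        raise ValueError("collection time and body weight must be positive")
    fat_g_per_day = total_fat_g * 24.0 / collection_hours
    return fat_g_per_day / body_weight_kg


def ratio_to_normal_mean(fat_g_per_kg_per_day):
    """Express a weight-normalized output as a multiple of the normal mean."""
    return fat_g_per_kg_per_day / NORMAL_FAT_G_PER_KG_PER_DAY


if __name__ == "__main__":
    # Hypothetical example: 8-kg dog, 72-hour collection containing 12 gm fat.
    per_kg = fecal_fat_per_kg_per_day(12.0, 72.0, 8.0)
    print(f"{per_kg:.2f} gm/kg/day, {ratio_to_normal_mean(per_kg):.1f}x normal mean")
```

in this hypothetical case the dog excretes 4 gm of fat per day, which lies inside the 3-5 gm/day range for the normal dog, yet the weight-normalized value of 0.5 gm/kg per day is about twice the normal mean. this is the situation described by merritt et al. (1979) for small dogs, and it is why results are expressed per kilogram body weight.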
the problems of pancreatic exocrine deficiency are discussed in detail elsewhere in this text (chapter 7). the simplest and perhaps most widely used test to differentiate intestinal malabsorption from pancreatic exocrine insufficiency is that described by jasper (1954). the test is employed to detect reduction in trypsin-like activity in the feces of dogs with decreased pancreatic exocrine secretion (grossman, 1962). there is wide variation in normal activity, making interpretation of the test difficult (frankland, 1969; hill and kidder, 1970; burrows et al., 1979). the test reveals only the presence or absence of hydrolysis of gelatin and does not differentiate between gelatinase activity produced by intestinal bacteria and that secreted by the pancreas. there is evidence in some species that trypsin is almost completely destroyed by bacteria during its passage through the intestine and that the proteinase activity of the feces is primarily of bacterial origin (borgström et al., 1959). despite these theoretical objections, the test has been of clinical diagnostic value in our hands. fecal gelatinase activity has been detected consistently in cases of intestinal malabsorption and is almost always absent when severe pancreatic exocrine insufficiency is present. burrows et al. (1979) reported that the mean 24-hour trypsin output in dogs with pancreatic insufficiency was significantly lower, and in dogs with malabsorption significantly higher, than that of clinically normal dogs. an indirect method to test chymotrypsin activity has been described (strombeck, 1978). a synthetic peptide, n-benzoyl-l-tyrosyl-p-aminobenzoic acid, is administered to test dogs orally. if chymotrypsin is present in the duodenum, hydrolysis of this peptide occurs and p-aminobenzoic acid (paba) is released in a free form, which is absorbed and subsequently excreted in the urine within 6 hours. the urine is analyzed for paba.
less than 43% paba excretion identifies dogs with suspected pancreatic exocrine insufficiency. a. oleic acid and triolein absorption. several tests have been developed for the clinical evaluation of intestinal absorptive capacity. the absorption of 131i-labeled oleic acid and 131i-labeled triolein has been studied extensively in normal dogs (turner, 1958; michaelson et al., 1960), and kaneko et al. (1965) used this test to study dogs with intestinal malabsorption. the day before administration of the 131i-labeled compound, a small amount of lugol's iodine solution is administered to block thyroidal uptake of the isotope. tracer amounts of the test substances are mixed with nonradioactive carrier and are administered orally. absorption is determined by measuring the radioactivity of the plasma at intervals following administration and calculating the percentage of the dose absorbed based on plasma volume. it is possible to use the results of these two tests, when performed in sequence, to differentiate between steatorrhea caused by a deficiency of pancreatic enzymes and that caused by a primary defect in absorption (kallfelz et al., 1968). if steatorrhea is caused by a lack of pancreatic lipase, oleic acid absorption will be normal, whereas that of triolein, which requires lipolysis for absorption, will be significantly reduced. the absorption of both compounds is reduced in intestinal malabsorption (fig. 13a,b). the results of this test also may vary depending on the rate of intestinal motility (tennant et al., 1969b). [fig. 13a,b. (kaneko et al., 1965.)] b. vitamin a absorption. following oral administration of vitamin a, mean serum vitamin a concentrations peak at 6-8 hours, with values ranging between three and five times fasting serum levels in normal dogs. breed differences and delayed gastric emptying will alter results. c. glucose absorption.
the absorption of glucose can be measured by means of an oral glucose tolerance test in which a test dose of glucose is given by mouth and the blood glucose level measured at intervals for 3-4 hours following administration. the test has been used in canine malabsorption, in which the normal rise in blood glucose level is reduced (kaneko et al., 1965). the test also has been reported for use in the horse (roberts and hill, 1973). dogs with pancreatic exocrine deficiency may, however, have "diabetic" tolerance curves (hill and kidder, 1972b). the major disadvantage of relying on this test alone is that it does not differentiate between decreased intestinal absorption and increased tissue uptake following absorption. this problem can be minimized by comparing results of the oral glucose tolerance test with those obtained with the intravenous tolerance test. the results of this test, however, must be interpreted carefully and in relation to other clinical and laboratory findings. hill and kidder (1972) reported that dogs on low-carbohydrate diets can have "diabetic" tolerance curves; test dogs should be on a high-carbohydrate diet 3-5 days before testing. the absorption of d-xylose also can be used to evaluate intestinal function. d-xylose is not metabolized by the body to any significant degree, and the problems of evaluating tissue utilization which occur with glucose are eliminated. because of the large amounts of d-xylose used in the test, absorption is independent of active transport processes, and the rate of absorption is proportional to luminal concentration. a d-xylose absorption test for dogs has been described by van kruiningen (1968). in this procedure, a standard 25-gm dose of d-xylose is administered by stomach tube. during the 5-hour period following administration, the dog is confined in a metabolism cage, and urine is collected quantitatively.
at the end of the 5-hour test period, the urine remaining in the bladder is removed by catheter, and the total quantity of d-xylose excreted in 5 hours is determined. normal dogs excreted an average of 12.2 gm during the test period, with a range of 9.1-16.5 gm. the results obtained by this method are dependent not only on the rate of intestinal absorption, but also on the rate of renal excretion, and it is necessary, therefore, to know that kidney function is normal. the oral xylose tolerance test has received the most clinical use (hill et al., 1970; hayden and van kruiningen, 1973). dogs are fasted overnight, a blood sample is obtained, and d-xylose is administered by stomach tube at the rate of 0.5 gm/kg. a control test is performed on a normal dog simultaneously with each dog with signs of intestinal malabsorption. the first blood sample is obtained one-half hour after administration. the second sample is obtained 1 hour following administration, and additional samples are taken at hourly intervals for 5 hours. the xylose concentration in the blood is determined by the method of roe and rice (1948). maximal blood levels almost always are reached at 1 hour after administration of the test dose; hill expects a xylose level of at least 45 mg/dl within 60-90 minutes in a normal dog. in preliminary studies of four dogs with the malabsorption syndrome, maximal blood xylose levels averaged 58% of corresponding control values. in dogs with pancreatic exocrine insufficiency and normal intestinal mucosa, the xylose response should be normal. the d-xylose absorption test also has been described for use in differential diagnosis of equine diarrheal diseases (roberts, 1974). bolton et al. (1976) reported that a dosage of 0.5 gm xylose per kilogram body weight was useful in detecting horses that absorbed the pentose abnormally.
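the arithmetic of the oral xylose tolerance test in the dog, as described above, can be sketched as follows. this is an illustrative sketch under stated assumptions, not a validated clinical tool: the function names and the two sample curves are hypothetical, and only the dose (0.5 gm/kg), the expected level (at least 45 mg/dl within 60-90 minutes), and the comparison with a simultaneous control dog come from the text.

```python
XYLOSE_DOSE_G_PER_KG = 0.5        # oral dose given by stomach tube
EXPECTED_LEVEL_MG_DL = 45.0       # level expected in a normal dog
EXPECTED_WINDOW_H = (1.0, 1.5)    # 60-90 minutes after dosing


def xylose_dose_g(body_weight_kg):
    """Total d-xylose dose in grams for a dog of the given weight."""
    return XYLOSE_DOSE_G_PER_KG * body_weight_kg


def peak(samples):
    """samples maps hours after dosing to blood xylose (mg/dl).

    Returns (hour, level) of the maximal blood level.
    """
    hour = max(samples, key=samples.get)
    return hour, samples[hour]


def reaches_expected_level(samples):
    """True if any sample taken 60-90 minutes after dosing is >= 45 mg/dl."""
    low, high = EXPECTED_WINDOW_H
    return any(low <= hour <= high and level >= EXPECTED_LEVEL_MG_DL
               for hour, level in samples.items())


def percent_of_control(patient, control):
    """Maximal patient level as a percentage of the maximal control level."""
    return 100.0 * peak(patient)[1] / peak(control)[1]


if __name__ == "__main__":
    # Hypothetical curves sampled at 0.5 h, 1 h, then hourly to 5 h.
    control = {0.5: 40.0, 1.0: 62.0, 2.0: 50.0, 3.0: 35.0, 4.0: 22.0, 5.0: 12.0}
    patient = {0.5: 20.0, 1.0: 31.0, 2.0: 28.0, 3.0: 20.0, 4.0: 14.0, 5.0: 9.0}
    print(xylose_dose_g(20.0), peak(control), reaches_expected_level(patient),
          percent_of_control(patient, control))
```

comparing each patient with a control dog tested at the same time, rather than with a fixed number alone, follows the procedure in the text and allows for day-to-day variation in the assay.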
gastrointestinal lesions associated with abnormal results were classified as (1) villous atrophy, (2) edema of the lamina propria, or (3) necrosis of the lamina propria. at this dosage in normal horses, the mean peak plasma concentration is less than one-third that seen in normal dogs given xylose (normal dogs: 60-70 mg % at 60 minutes). albumin, γ-globulin, and other plasma proteins are present in normal gastrointestinal secretions. because protein usually undergoes complete degradation within the intestinal lumen, it has been suggested that the gastrointestinal tract must have a physiological role in the catabolism of plasma proteins. the relative significance of this pathway, however, has been the subject of considerable controversy. some investigators have concluded that as much as 50% or more of the normal catabolism of albumin (glenert et al., 1961; campbell et al., 1961; wetterfors, 1964, 1965; wetterfors et al., 1965) and γ-globulin occurs in the gastrointestinal tract. others believe that the physiological role of the intestine in plasma protein catabolism is far less significant, accounting for only about 10% of the total catabolism (waldmann et al., 1967, 1969; katz et al., 1960; franks et al., 1963a,b). regardless of the questions concerning the relative importance of the gastrointestinal tract in plasma protein catabolism, it is well established that normal intestinal losses are increased significantly in a variety of gastrointestinal diseases, which are referred to collectively as protein-losing enteropathies. the increased loss causes hypoproteinemia (especially hypoalbuminemia), which may be observed in various types of chronic enteric diseases. the excessive losses are produced by ulcerations or other mucosal changes which alter permeability or by obstruction of lymphatic drainage from the intestine.
if severe, hypoalbuminemia may result in retention of fluid with development of ascites and subcutaneous edema of pendant areas. excessive plasma protein loss has now been demonstrated in swine with chronic ileitis (nielsen, 1966), in calves with acute enteric infections (marsh et al., 1969), in cattle with parasitic or other inflammatory abomasal disease (nielsen and nansen, 1967; halliday et al., 1968; murray, 1969), and in johne's disease (patterson et al., 1967; nielsen and andersen, 1967; patterson and berret, 1969). in addition to the classic mucosal and submucosal lesions of johne's disease, nielsen and andersen (1967) demonstrated the presence of secondary intestinal lymphangiectasia. meuten et al. (1978) described protein-losing granulomatous enteritis in two horses and presented a comparative overview of diseases causing malabsorption in the horse, cow, dog, pig, and man. protein-losing enteropathy has been seen with some frequency in the dog (campbell et al., 1968; farrow and penny, 1969; hill, 1972; finco et al., 1973; hayden and van kruiningen, 1973; mattheeus et al.; hill and kelly, 1974; milstein and sanford, 1977; barton et al., 1978; olson and zimmer, 1978). intestinal lymphangiectasia was commonly reported. the dog described by milstein and sanford (1977) was not hypoproteinemic because the rate of albumin synthesis by the liver was greater than protein loss into the intestine. protein loss has also been documented in dogs with chronic hypertrophic gastritis (section iv,a). increased intestinal protein loss is the most likely explanation for the hypoalbuminemia associated with certain other enteric diseases, including intestinal malabsorption and lymphoma of the intestine. munro (1974) demonstrated that protein loss in dogs with experimentally induced protein-losing gastropathy occurs by an intercellular route.
isotope-labeled polyvinylpyrrolidone (131i-pvp), 67cu-labeled ceruloplasmin, and 51cr-labeled albumin have been used to evaluate enteric protein loss in the dog (finco et al., 1973; barton et al., 1978; hill and kelly, 1974; van der gaag et al., 1976; olson and zimmer, 1978). canine ulcerative colitis was described originally in the report of cello (1964). since that time, ulcerative colitis and its variant form, granulomatous colitis of boxer dogs, have been reported by several investigators (kennedy and cello, 1966; koch and skelley, 1967; sander and langham, 1968; ewing and gomez, 1973; gomez et al., 1977; russell et al., 1971). the etiology is generally unknown. ewing and aldrete (1973) reported a case of canine giardiasis presenting as chronic ulcerative colitis, and cases of ulcerative colitis in dogs have been attributed to trichuriasis, balantidiasis, protothecosis, histoplasmosis, eosinophilic ulcerative colitis, or neoplasia (lorenz, 1975). rarely, severe ulcerative colitis is seen in the cat. in some of these cases, feline leukemia virus is demonstrated. shindel et al. (1978) described colonic lesions in cats caused by feline panleukopenia. histopathologically, periodic acid-schiff-positive macrophages are pathognomonic for the granulomatous colitis of boxer dogs. the disease causes chronic, intractable diarrhea, which is often hemorrhagic. in addition, afflicted dogs may vomit and are often emaciated. fever is usually not present. biochemical manifestations of ulcerative colitis depend upon duration and severity of illness, degree of colorectal involvement, and the presence of systemic complications. in severe cases of long duration with extensive colorectal involvement, hypoalbuminemia and hypergammaglobulinemia (table vii) are sometimes observed. [table vii. (ewing and gomez, 1973; thirty-six observations on 29 affected dogs.)] the pathogenesis of hypoalbuminemia probably involves increased loss of plasma through the denuded and inflamed colorectal mucosa.
hypergammaglobulinemia is probably an associated response to chronic inflammation. the digestive process of ruminants differs from that of other animals because of microbial digestion and metabolism in the rumen, which occur prior to other normal digestive processes. the short-chain fatty acids (acetic, propionic, and butyric acids) are the primary end products of rumen fermentation and represent the chief dietary source of energy for ruminants (hungate et al., 1961). the polysaccharide cellulose, which undergoes only very limited digestion in most simple-stomached animals, is readily utilized by ruminants because of the activity of cellulolytic bacteria. significant quantities of nonprotein nitrogen also can be utilized by ruminal bacteria for protein synthesis, and this bacterial protein subsequently can be utilized to meet the protein requirements of the animal. bacterial production of vitamins may also meet essentially all the requirements of ruminants. maintenance of bacterial fermentation within the rumen also presents certain unusual hazards to ruminant animals. when rapid changes in dietary intake occur, the products of fermentation can be released more rapidly than they can be removed. acute rumen tympany, acute indigestion or d-lactic acidosis, and urea poisoning are diseases which result from such abrupt changes in diet (hungate, 1966, 1968). acute rumen indigestion occurs in sheep or cattle on a high-roughage diet when they inadvertently are allowed access to large amounts of readily fermentable carbohydrate, e.g. grain and apples (dunlop, 1972). streptococcus bovis is the rumen microorganism believed to be chiefly responsible for rapid fermentation and for production of large quantities of lactic acid (hungate et al., 1952; krogh, 1963a,b). as lactic acid accumulates more rapidly than it can be absorbed, the rumen ph falls and rumen atony results. rumen bacteria produce a racemic mixture of lactic acid.
some l-lactate may be metabolized by the liver and other tissues, but d-lactate cannot be and contributes significantly to the acid load of the body. the excessive lactic acid production results in metabolic acidosis, which is characterized by reduced blood ph and bicarbonate concentration and by a fall in urine ph from a normal value of 8.01 -8.0 to as low as 5.0. fluid accumulates in the rumen because of the increased osmolarity of its contents, causing hemoconcentration, which may lead to hypovolemic shock and death (hyldgaard-jensen and simesen, 1966). if affected animals survive the initial period of explosive fermentation, a chemical rumenitis, caused by lactic acid, may develop. secondary mycotic rumenitis may also occur and be fatal. hepatic abscesses also may result from severe rumenitis. the rumen of mature cattle can produce 1.2-2.0 liters of gas per minute (hungate et al., 1965). the gas is composed primarily of carbon dioxide and methane, which are products of rumen fermentation. carbon dioxide is also released when salivary bicarbonate is acted upon by organic acids within the rumen. under normal conditions, these large amounts of gas are continually removed by eructation. any factor which interferes with eructation can produce acute tympany of the rumen (bloat), leading to rapid death. interruption of the normal eructation reflex or mechanical obstruction of the esophagus typically results in free-gas bloat. the most important form of bloat, however, is seen in cattle consuming large quantities of legumes or in feedlot cattle on high-concentrate diets. the primary factor in these more common types of bloat is a change in the ruminal contents to a foamy or frothy character. because of altered surface tension, gas is trapped in small bubbles within the rumen and cannot be eliminated by eructation (clarke and reid, 1974). the chemical changes which cause foam to form within the rumen are not completely clear.
some reports (nichols, 1966; nichols and deese, 1966) suggest that plant pectin and pectin methyl esterase, an enzyme system also from plants, are critical factors. the enzyme acts on pectin to release pectic and galacturonic acids, which greatly increase the viscosity of the rumen fluid, resulting in formation of a highly stable foam. slime-producing bacteria also have been incriminated in the pathogenesis of frothy bloat. these microorganisms produce an extracellular polysaccharide, which results in stable foam formation. effective medical treatment and control are directed toward decreasing or preventing foam formation. this has been accomplished with certain nonionic detergents with surfactant properties which break up or prevent formation of foam within the rumen (bartley, 1965). another approach has been the prophylactic administration of sodium alkyl sulfonate, which inhibits pectin methyl esterase activity, preventing foam formation by eliminating the products of this enzyme reaction (nichols, 1963). much effort is now being directed toward genetic selection of cattle which are less susceptible to rumen tympany and toward varieties of legumes which are less likely to produce bloat (howarm, 1975). unlike monogastric species, ruminants can effectively use nonprotein nitrogen to meet dietary protein requirements. urea, biuret (oltjen et al., 1969), and ammonium salts (webb et al., 1972) all can serve as dietary supplements. urea, which is the most frequently used, is hydrolyzed by bacterial urease within the rumen, and the free ammonia formed is incorporated into amino acids by microorganisms within the rumen. the bacterial protein so produced is digested and absorbed in the small intestine along with protein from the diet. signs of urea poisoning typically develop within minutes after consumption of food containing toxic amounts of urea.
clinical manifestations are the result of excessive ammonia production (word et al., 1969; elmer and barclay, 1971) and are due to the encephalotoxic effects of free ammonia absorbed from the rumen. tolerance to urea may be significantly increased by gradually elevating the amounts of urea in the diet or by adding readily fermentable carbohydrate. it has actually been possible for ruminants to adapt and thrive on a diet in which urea was the sole source of dietary nitrogen. however, if urea is fed at a level of more than 3% in the diet of unadapted animals, toxic effects are likely. poisoning may occur when, by accident, animals obtain access to large amounts of urea-containing dietary supplement or in animals receiving bulk feed when there has been an error in formulation or when the urea-containing additive is incompletely mixed. oral administration of acetic acid has been shown to reduce acute urea toxicity, apparently by decreasing absorption of free ammonia from the rumen. acetic acid also has been used clinically for the treatment of urea poisoning but under experimental conditions it has more value prophylactically than in animals with frank signs of poisoning (word et al., 1969).
key: cord-006331-s2qf98lj authors: spiridonova, v. a. title: molecular recognition elements: dna/rna-aptamers to proteins date: 2010-05-23 journal: biochem mosc suppl b biomed chem doi: 10.1134/s1990750810020046 sha: doc_id: 6331 cord_uid: s2qf98lj
the review summarizes data on dna/rna aptamers, a novel class of molecular recognition elements. special attention is paid to aptamers to proteins involved in the pathogenesis of widespread human diseases. these include aptamers to serine proteases, cytokines, influenza viral proteins, immune deficiency virus protein and nucleic acid binding proteins. the high affinity and specific binding of aptamers to particular protein targets make them attractive as direct protein inhibitors. they can inhibit pathogenic proteins, and the data presented here demonstrate that the idea of using nucleic acid aptamers to regulate (inhibit) the activity of protein targets has moved from the stage of basic development to the stage of practical application. during the last decade a significant breakthrough has been achieved in the use of basic knowledge on dna in applied studies. the development of high-technology analytical methods employing immobilized dna is one of the rapidly developing directions. the major achievement of microarray technology (i.e. dna chips) is the possibility of using various dna libraries amplified by polymerase chain reaction (pcr) for the development of sets of dna sequences. by means of hybridization these sets can be used to rapidly analyze and compare the sequences of thousands of genes, their mutant forms and dna polymorphisms, and to discover new genes.
the second direction consists in the development of irrational design of nucleic acids for studies of nucleic acid-protein recognition. in 1990, two labs (the gold and szostak laboratories; usa) independently developed the selex method (systematic evolution of ligands by exponential enrichment) [1, 2]. using this method it is possible to isolate targeted nucleic acid molecules (aptamers) from a large set of individual molecules (more than 10^18) known as the combinatorial library. aptamers are small single-stranded dna or rna molecules of 40-100 nucleotide residues in length with a rather complex three-dimensional structure. this complex structure determines the ability of aptamers to bind various molecules including proteins. thus, the complex process of biosynthesis of protein-recognizing elements, antibodies, which nature has been creating for thousands of years, is now modeled in vitro. selection begins with generation of a large rna library with fixed 5' and 3' ends and a degenerate region of 30-60 nucleotides in length (fig. 1). such a library contains 10^14-10^15 variants of rna molecules, which are folded into complex 3d structures. the library is incubated with a protein, and rna molecules bound to the protein target are separated from unbound rna molecules. the bound rna molecules are separated from the protein and then amplified by means of reverse transcriptase and pcr to obtain a new pool of molecules with increased affinity. the procedure is usually repeated 10-15 times until the maximal number of aptamers exhibiting affinity to the target is clearly detectable in the enriched fraction. aptamers are then cloned (usually into a bacterial vector) and sequenced. in the case of dna the selection also begins with a dna library in which the randomized region is flanked by fixed sequences at the 5' and 3' ends.
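the library sizes quoted above can be compared with the number of sequences that a randomized region could contain in principle. the following python sketch is only an illustration: the function names are hypothetical, and the only inputs taken from the text are the 30-60 nucleotide randomized region and the 10^14-10^15 molecules per library. it assumes four equally likely nucleotides at each randomized position.

```python
def sequence_space(n_random: int) -> int:
    """Number of distinct sequences for a randomized region of n_random nucleotides.

    Assumes four possible nucleotides at every randomized position.
    """
    if not n_random >= 0:
        raise ValueError("n_random must be non-negative")
    return 4 ** n_random


def max_coverage(n_random: int, pool_size: float) -> float:
    """Upper bound on the fraction of sequence space a pool of pool_size molecules can sample.

    The bound is reached only if every molecule in the pool is unique.
    """
    return min(1.0, pool_size / sequence_space(n_random))


if __name__ == "__main__":
    # Randomized-region lengths and pool sizes within the ranges given in the text.
    for n in (30, 40, 60):
        for pool in (1e14, 1e15):
            print(f"N={n:2d}  pool={pool:.0e}  "
                  f"space={sequence_space(n):.2e}  "
                  f"max coverage={max_coverage(n, pool):.2e}")
```

under these assumptions even a 30-nucleotide region (4^30, about 1.2 x 10^18 sequences) is sampled only partially by 10^15 molecules, and longer regions far more sparsely.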
to produce single-stranded dna molecules, either asymmetric pcr is used or one of the primers carries a biotin tag, which helps to separate one dna strand from the other on streptavidin columns. many reviews describing all steps of this method in detail have been published to date [3-10]. now, the idea that nucleic acid aptamers can regulate (inhibit) the activity of protein targets has moved from the stage of basic development to the stage of practical application. in this review the major attention is paid to aptamers to proteins involved in the pathogenesis of widespread human diseases. tuerk and gold proposed the use of combinatorial rna libraries for the creation of rna ligands that selectively bind to t4 dna polymerase [1]. the rna ligands were named aptamers by ellington and szostak [2]. they also introduced the name of this method, selex. now convincing evidence exists that aptamers are a new effective group of therapeutics, which may represent an important tool against many diseases.

fig. 1. the scheme illustrating the selex method for preparation of dna and rna aptamers. an initial randomized dna library is transformed into single-stranded dna (ssdna) and then introduced into a binding reaction with a protein pre-immobilized on a column. rna is obtained by transcription of the initial library; the latter carries an introduced promoter for t7 rna polymerase. rna molecules are also exposed to binding with the protein immobilized on the column. after removal of unbound molecules, dna/rna molecules that bind to the immobilized protein are separated from this protein by phenol-chloroform treatment and then subjected to alcohol precipitation. rna molecules are used for cdna preparation by means of reverse transcriptase. all resultant molecules are then transformed into double-stranded dna (dsdna). in the case of dna, ssdna prepared using asymmetric pcr is used again for repeated binding. in the case of rna, the dsdna is transcribed and the resultant molecules are used for binding with the immobilized protein. the selection includes 10-15 rounds and yields an enriched fraction of aptamers. the next step includes cloning into a plasmid vector and sequencing of the resultant clones.

the drug macugen has already been approved by the us fda (food and drug administration) for the treatment of age-related macular degeneration (amd). some aptamers are now in different stages of preclinical and clinical trials [11, 12]. highly specific (antibody-like) recognition and binding of aptamers to their protein targets make them attractive therapeutics. aptamers (as well as antibodies) are folded into complex three-dimensional structures and form hairpins and loops. the dissociation constants characterizing the binding of dna and rna aptamers to their protein targets range from nanomolar to subnanomolar levels. aptamers can discriminate between related proteins consisting of the same structural domains [13-16]. it should be noted that the use of a 1000-fold excess of aptamer doses in animal models employed in preclinical trials and in therapeutic applications in humans did not cause allergic reactions [11, 17]. studies on the biocompatibility and pharmacokinetics of aptamers and investigations of various modifications of aptamer structures have been performed with a view to their further application as drugs [12]. nuclease degradation is the major problem that complicates manipulations with oligonucleotides. protection against the nonspecific action of nucleases during selection includes modification of pyrimidine nucleotides at ribose c2' (amino and fluoro derivatives) [18-20], the use of liposomes as carriers [21], and post-selection modification of the ribose c2' hydroxyl by introducing methyl, allyl, amino groups, etc. [22-24]. the molecular mass of short polynucleotides is 8-14 kda; this corresponds to 25-40 nucleotides.
such a small size facilitates rapid renal filtration within a few minutes. aptamer modification by conjugation to polyethylene glycol or other agents and attachment of aptamers to the liposome surface prolonged the period of aptamer action [21, 25]. usually, aptamers exhibit high specificity towards their targets, and this should be taken into consideration in preclinical trials on animal models, because differences between protein orthologs may decrease the efficacy of such compounds. nevertheless, an improved selection process named "toggle selex" seems to overcome this problem. toggle selex has been proposed for rna libraries incubated with the protein from human plasma; during the following rounds of selection the library was also exposed to binding with the animal ortholog used in subsequent preclinical trials [26]. studies on the pharmacokinetics of modified aptamers should be accompanied by analysis of their excretion from the body. this is important because, in addition to the main disease, patients may have multiple dysfunctions including renal insufficiency. for regulation of aptamer activity rusconi proposed the antidote proof of concept (the method of rational design) [27]. using watson-crick base pairing he designed an antidote as a complementary sequence that bound to the aptamer, altered its structure and, thus, prevented its complex formation with the protein target (fig. 2).

fig. 2. the mechanism of antidote action in the aptamer-antidote pair (modified from [26]). the antidote is an oligonucleotide of 15 nucleotides in length; it forms an inactive complex with the aptamer to factor ixa, which is involved in the blood coagulation cascade. possessing a sequence complementary to the aptamer, the antidote forms a complex with the aptamer and thus alters its 3d structure.

the proposed concept gave a unique possibility of aptamer regulation because it allows the action of an aptamer-based drug to be controlled in any clinical use.
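the base-pairing principle behind such an antidote can be sketched in a few lines of python. this is a minimal illustration, not the design procedure of the cited studies: the function name, the example aptamer sequence and the choice of the targeted window are all hypothetical, and only the watson-crick pairing rule and the 15-nucleotide antidote length come from the text.

```python
# Watson-Crick partners for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antidote_for(aptamer: str, start: int, length: int) -> str:
    """Return the reverse complement (written 5'->3') of a window of an RNA aptamer.

    aptamer: RNA sequence written 5'->3'.
    start:   0-based index of the first targeted nucleotide.
    length:  number of targeted nucleotides.
    """
    seq = aptamer.upper()
    if start not in range(len(seq)) or length not in range(1, len(seq) - start + 1):
        raise ValueError("window must lie inside the aptamer sequence")
    window = seq[start:start + length]
    try:
        # Reverse the window so that the result pairs antiparallel with it.
        return "".join(RNA_COMPLEMENT[base] for base in reversed(window))
    except KeyError as exc:
        raise ValueError(f"unexpected base {exc.args[0]!r}") from None


if __name__ == "__main__":
    # Hypothetical 25-nt aptamer; NOT a sequence from the cited studies.
    example_aptamer = "GGGAUGCCUAGCUUAACCGGAUCCC"
    # A 15-nt antidote directed against the 5' part of the example aptamer.
    print(antidote_for(example_aptamer, start=0, length=15))
```

a real antidote would additionally have to be chosen so that pairing disrupts the folded structure required for target binding; the sketch only shows how a complementary sequence is derived.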
since aptamers are direct protein inhibitors, it is very attractive to use them for studies of inhibitory mechanisms and also for therapeutic application. the possibility of preparing aptamers to any class of biomolecules underlines the importance of their use and widens the areas of their application. the fact that aptamers maintain their activity in multicellular organisms significantly facilitates preclinical trials and saves the time and costs required for antibody preparation in animal models. finally, the design of an antidote that controls aptamer activity possibly represents the most important contribution to the development of nucleic acid therapy because it can control drug dosage and, thus, provide the required safety. it should be noted that in addition to biomedical studies aptamers are also used as recognizing elements in microarray-based biosensors; this is another important and logical continuation of the development of the dna chip technology. this review summarizes current knowledge on aptamers to various proteins and their affinity to protein targets; it describes inhibitory properties of aptamers and information about preclinical and clinical trials.

2.1. thrombin

thrombin is a key protein in the blood clotting process. this serine protease is generated during a cascade of proteolytic reactions initiated by epithelial damage. thrombin is produced from prothrombin by factor xa. active thrombin catalyzes the conversion of fibrinogen into fibrin, which forms a fibrin matrix for the thrombus by "capturing" blood cells [28]. thrombin also activates platelets via interaction with their par receptors and regulates the expression of some substrates and activation molecules such as p-selectin [29, 30]. the problem of hemostasis requires the creation of a thrombin inhibitor that is specific for the blood clotting process, does not cause allergic reactions and effectively regulates this process.
an anti-thrombin aptamer was one of the first therapeutic aptamers obtained by the selex method. the single-stranded dna aptamer was isolated from a pool of ~10^13 oligonucleotide sequences containing a 60-nucleotide randomized region [31]. the 5-round selection resulted in identification of aptamers forming a complex with thrombin, with kd values ranging from 25 to 200 nm. these aptamers were based on the 15-nucleotide sequence dggttggtgtggttgg (15tba). the aptamer increased the time required for clot formation from 25 to 170 s in vitro and from 25 to 43 s in human blood plasma. the 15tba structure was investigated by means of nmr analysis, which included the study of 15tba alone and in complex with thrombin, and the study of aptamer binding to the anion-binding site of thrombin, exosite 1 [32-35]. 15tba forms a compact tertiary structure known as a g-quadruplex (fig. 3). other laboratories also used the selex method to select aptamers to thrombin [36, 37]. 15tba also influenced platelet aggregation stimulated by thrombin. thrombin also caused proteolytic activation of the platelet par-1 receptor, and the aptamer inhibited this activation in a dose-dependent manner [38]. the anticoagulant activity of the aptamer was tested on monkeys. the prothrombin time (pt) increased by 1.7 fold in 10 min and returned to baseline 10 min after aptamer administration. 15tba also inhibited platelet aggregation and prolonged platelet activation induced by thrombin [39]. the aptamer was also investigated using an anticoagulation model of extracorporeal circulation in sheep. the pt values reached 40-45 s (versus 21.7 s at the baseline level), whereas control pt remained close to the baseline.

fig. 3. the structure of the 15tba aptamer to thrombin, which inhibits fibrinogen-hydrolyzing activity. 15tba has the oligonucleotide sequence dggttggtgtggttgg. the g-quadruplex is a structure-forming element for dna. eight of the nine guanines form two planar g-quartets connected by three loops: the tgt loop located in the center and two symmetrical tt loops. the presence of an octa-coordinated calcium ion and the stacking interaction between the g-quartets maintain the g-quadruplex. the calcium ion is located between the parallel planes of the g-quadruplex.

in the other experiment 15tba was investigated using a cardiopulmonary bypass (cpb) model. the study included examination of the anticoagulation activity, pharmacokinetics and renal clearance of the aptamer [40]. animals were subdivided into two groups: one group received injections of heparin (300 u/kg) and protamine in boluses and was used as the control of activity in cpb. the second group of animals received aptamer infusion (0.3-0.5 mg/kg per min). these animals were characterized by increased pt, activated partial thromboplastin time (aptt), and activated clotting time (act), which then returned to the baseline after infusion [39-41]. in the pharmacokinetic studies using the cpb model, the elimination half-life of the aptamer was 1.9 min; however, during the 60 min infusion this parameter increased up to 7.7 min. these results suggested that the aptamer would function in the animal model and that unmodified dna aptamers were rapidly eliminated from the body. now this dna aptamer is under preclinical trials by archemix corp. for subsequent trials in humans. anti-thrombin rna aptamers were obtained using a library with a 30-nucleotide randomized sequence [42]. the anti-thrombin aptamers were isolated after 12 rounds of selection. the enriched fraction was then cloned into the plasmid puc18 and sequenced; this yielded two classes of aptamers. the conserved motif in 22 clones of the class i rnas was represented by the sequence uccggaucgaaguuaguaggcgga inside the variable zone. one of the best anti-thrombin aptamers was characterized by the kd value of 9.3 ± 1.0 nm.
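to give a feel for the dissociation constants quoted in this section, the python sketch below evaluates the simple 1:1 binding isotherm, fraction bound = [aptamer] / ([aptamer] + kd). this is a textbook approximation that assumes the aptamer is in large excess over thrombin; it is not a calculation from the cited studies, and the 100 nm aptamer concentration and the function name are chosen here for illustration only. the kd values are those mentioned above (9.3 nm for the best rna aptamer, 25 and 200 nm as the limits reported for the dna aptamers).

```python
def fraction_bound(aptamer_nM: float, kd_nM: float) -> float:
    """Fraction of target protein in complex for simple 1:1 binding.

    Assumes the free aptamer concentration equals the total aptamer
    concentration (aptamer in large excess over the target).
    """
    if not (aptamer_nM >= 0 and kd_nM > 0):
        raise ValueError("aptamer_nM must be >= 0 and kd_nM must be > 0")
    return aptamer_nM / (aptamer_nM + kd_nM)


if __name__ == "__main__":
    aptamer_nM = 100.0  # illustrative concentration, not taken from the source
    for label, kd_nM in (("rna aptamer, kd 9.3 nM", 9.3),
                         ("dna aptamers, kd 25 nM", 25.0),
                         ("dna aptamers, kd 200 nM", 200.0)):
        print(f"{label}: fraction bound = {fraction_bound(aptamer_nM, kd_nM):.2f}")
```

when the aptamer concentration equals kd, half of the target is bound, which is why a lower kd means that less aptamer is needed for the same occupancy.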
members of the second class exhibited lower affinity (kd of 155 ± 9.0 nm). competitive analysis with heparin and hirudin demonstrated that heparin, but not hirudin, displaced the rna aptamer from its complex with thrombin. this suggests affinity of this aptamer to thrombin exosite ii. however, tests of functional activity in animal models have not been performed. in preclinical trials highly specific aptamers to human proteins may demonstrate lowered affinity to protein orthologs in animal models. to overcome this problem the so-called "toggle" approach has been proposed: 2'-fluoro rna aptamers were incubated with a mixture of human and porcine thrombin during the first round, and then porcine and human thrombin were used alternately in subsequent rounds of selection [43]. after 13 rounds of selection, clones with the conserved sequence gggaacaaagcugaaguacuuaccc were found; they exhibited cross-reactivity with porcine and human thrombin. the complex with human thrombin was characterized by a dissociation constant kd of 2.8 ± 0.7 nm, and the complex with porcine thrombin had a kd value of 83 ± 3 pm. the aptamer increased the clotting time of blood plasma (thrombin concentration was 10 nm) from 11.6 ± 0.2 to 22.6 ± 1.4 s. in porcine plasma tog 25 increased clotting time from 15.7 ± 0.7 to 61.9 ± 1.2 s. inhibition of thrombin-dependent platelet aggregation by the aptamer occurred in a dose-dependent manner. the greater effect was achieved using porcine platelets: a 10-fold excess of tog 25 inhibited thrombin activity by 90% [26, 43]. factor viia (fviia) is a trypsin-like protease involved in the coagulation cascade. in combination with the tissue factor (tf), fviia plays a critical role in thrombin formation and thus promotes active clot formation. aptamers to fviia have been isolated from an rna library using the selex method. these aptamers inhibited activation of factor x (an inactive precursor in the coagulation cascade) by fviia [15].
after 16 rounds of selection from the 2'-amino-modified rna library the isolated aptamers formed a complex with fviia characterized by a kd value of 11.3 ± 1.3 nm. the specificity of some aptamers was investigated in binding reactions with other protein factors (fxia and fxa). the micromolar range of kd values determined for these complexes suggested nonspecific binding of these aptamers to these protein factors. addition of the anti-fviia aptamer inhibited the initial rate of fx activation by about 95%. experiments on dilution of the reaction mixture revealed a dose-dependent mode of inhibition. the aptamer prolonged clotting time up to 175% in the pt test. factor ixa (fixa) is a serine protease that plays an important role in formation of the critical mass of thrombin required for coagulation. the tf/fviia complex performs proteolytic cleavage of the protein factor fix into its active form fixa; the latter binds to fviiia on the platelet surface and activates factor fx to fxa, which catalyzes conversion of prothrombin into thrombin [28]. rusconi et al. performed rna selection to fixa; after eight rounds of selection they found an aptamer which bound to fixa with a kd value of 0.65 ± 0.2 nm and exhibited 5000-fold higher affinity to fixa than to fviia, fxa, fxia and activated protein c [20]. a truncated version of this aptamer (9.3t) maintained high affinity to fixa (kd of 0.58 ± 0.1 nm) and totally inhibited fx hydrolysis by the enzyme complex. the anticoagulation activity of 9.3t was evaluated using the activated partial thromboplastin time (aptt). the aptamer increased clotting time in a dose-dependent manner and caused a several-fold increase in aptt. in continuation of the antidote theory rusconi obtained an rna antidote, which caused reversible inhibition of 9.3t, thus creating a drug/antidote pair for anticoagulation therapy.
using the complementary base pairing principle, a second rna oligonucleotide complementary to the 9.3t aptamer was created. after administration of the antidote oligonucleotide the anticoagulation activity of the anti-fixa aptamer changed within 10 min, and this effect persisted for over 5 hours [16]. heparin-induced thrombocytopenia (hit) develops after one year in almost 5% of the 12 million people receiving heparin therapy [28], and this is the reason for cessation of heparin therapy. patients who need repeated anticoagulant therapy receive hemodialysis, which complicates the patient's life. to prolong the effect of the anti-fixa aptamer in vivo, rusconi et al. prepared a cholesterol derivative (ch-9.3t), which exhibited high affinity (kd = 5.3 ± 1.1 nm) and anticoagulant activity [27]. tests on porcine and mouse plasma showed the same efficacy as in the case of human plasma. in the porcine anticoagulant system the aptamer increased pt and aptt to an extent comparable with its effects on pt and aptt in human plasma samples. there was a significant difference between the modified and initial aptamers: the cholesterol moiety increased the half-life of ch-9.3t to 60-90 min (versus 5-10 min for the initial aptamer). the antidote 5-2c neutralized more than 95% of the aptamer effect within 10 min in animal models [27]. the antithrombotic effect of the aptamer was investigated in a mouse thrombosis model, in which thrombosis was induced by administration of ferric chloride to the carotid artery; mice were pretreated with ch-9.3t or a functionally inactive aptamer with a scrambled nucleotide sequence (negative control). all the mice in the negative control group developed an occlusive thrombus in 8.1 ± 0.1 min. in the aptamer-treated group 80% of mice maintained normal carotid artery blood flow during 30 min (time required for occlusive thrombus formation ≥24.4 min) [27]. the effect of the 5-2c antidote was assessed using the model of active bleeding (tail transection).
mice were pretreated with ch-9.3t or an aptamer with a scrambled sequence, and the tail was cut 1 h after the treatment. blood loss was measured for 15 min after tail transection. animals treated with ch-9.3t exhibited significantly more blood loss (176 ± 23.7 μl) compared with controls (48 ± 17.8 μl). administration of the 5-2c antidote immediately after tail transection prevented hemorrhage in the aptamer-treated animals (blood loss was 54.5 ± 13.6 μl) [27]. the biopharmaceutical company regado biosciences continues studies on the fixa aptamer-antidote pair named reg1, which is in the first stage of clinical trials. hepatitis c virus (hcv) is a major cause of both sporadic and viral hepatitis other than hepatitis a or b [44]. the nonstructural protein 3 (ns3) is a serine protease that exhibits protease and helicase activity and represents a good target for inhibition of hcv. aptamer selection was performed using a library carrying a 120-nucleotide randomized region, and after 6 rounds of selection two aptamers inhibiting protease and helicase activities were obtained [45]. for identification of an aptamer with affinity to the active site of ns3, subsequent selection was performed using a truncated polypeptide, δns3. using an rna library with a 30-nucleotide randomized region the authors performed 9 rounds of selection and identified 45 clones which bound δns3 [46]. according to their nucleotide sequences the aptamers were subdivided into three families. they all contained the conserved region ga(a/u)ugggac. these aptamers formed a complex with δns3 with a kd value of 10 nm and caused 90% inhibition of the protease activity of the δns3 peptide and of full-sized ns3 bound to a maltose-binding protein (mbp-ns3). in vivo, hcv proteins are processed by ns3 and the ns4a cofactor. to model physiological conditions the aptamer effect on ns3 activity was tested in the presence of the p41 peptide, which caused a sevenfold increase of mbp-ns3 activity.
under these conditions the aptamer inhibited mbp-ns3 activity by 70% [46]. human neutrophil elastase (hne) is involved in various inflammatory diseases, including acute respiratory distress syndrome (ards), septic shock, arthritis, and ischemia-reperfusion injury [47]. a covalent inhibitor of hne, a diphenyl phosphate derivative of valine (valp), was coupled to an rna library to enhance the binding of the inhibitor to hne [47]. ten rounds of selection yielded an rna aptamer conjugated to the dna:valp substrate (rna 10.11:dna:valp). the aptamer demonstrated binding to hne (kd = 71 nm) and enzyme inhibition (ki = 5 nm) in vitro. in contrast to the rna aptamer 10.11 or the substrate dna:valp administered separately, the aptamer modified with the substrate inhibited hne ex vivo in the rat model of ards [48]. the same group also performed a valyl phosphonate:dna library selection to find more potent hne inhibitors. the authors used a single enantiomer form of the valyl phosphonate, which was compared with a racemic mixture. inhibitor selection was performed using purified elastase and also secreted elastase in the presence of neutrophils [49]. after 18 rounds of selection the aptamer ed45, which inhibited hne, was found. the aptamer was truncated to a 42-mer, named nx21909, and tested in a rat model of lung inflammation. a 40 nmol dose of nx21909 inhibited neutrophil infiltration by 53% in the rat lung in vivo [49].

3.1. vegf

angiogenesis plays a central role in various physiological and pathological processes. vegf (vascular endothelial growth factor) is one of the best characterized growth factors; it is involved in the initial steps of angiogenesis and represents one of the most promising targets for anticancer therapy [50]. an increased vegf level associated with angiogenesis was observed during tumor growth and metastasis, premature aging and age-related degeneration of tissues [50-52].
using the selex method ruckman et al. performed 12 rounds of selection and isolated aptamers to human vegf 165 with k d of 50 pm from a 2'-f-pyrimidine rna library [19, 20, 53-55] . for increased stability against nucleases two aptamers were additionally modified at the 2'-o position [55] . these aptamers were characterized by k d values of 49 and 130 pm; they were specific to vegf 165 and did not bind to the related proteins vegf 121 and placenta growth factor plgf 129 . the aptamers to vegf 165 inhibited the binding of vegf 165 to its receptors, flt 1 and kdr (kinase domain receptor). inhibition of receptor binding was evaluated using 125 i labeled vegf 165 : the ic 50 values for aptamer competition with the flt 1 receptor and kdr ranged from 50-300 and 2-60 pm, respectively [55] . the therapeutic potential of the aptamers to vegf was evaluated by the miles assay, a simple and rapid means of monitoring the ability of aptamers to inhibit the activity of vegf 165 in vivo, which is assessed as vascular wall permeability in animal models. the test was performed using adult guinea pigs and the most effective aptamer inhibited vascular permeability by 58% at 1 μm [55] . pharmacokinetics of the 2'-fluoropyrimidine and 2'-o-methyl purine aptamer to vegf, called nx 1838, has been investigated in monkeys. after intravenous administration, this aptamer, conjugated with a 40 kda polyethylene glycol, was characterized by a half life of 9.3 h and a clearance rate of 6.2 ml/h. subcutaneous administration resulted in 80% absorption into the tissues within 8-12 h [25] . preclinical and clinical trials of nx 1838, also called macugen, were performed by eyetech pharmaceuticals inc for the treatment of age related macular degeneration in diabetic patients [11, 12] . the synthetic aptamer nx 1838 was also investigated in the rat model of angiogenesis. 
these studies confirmed a significant (80%) inhibition of vegf-induced angiogenesis in the presence of this aptamer. phase 1a clinical trials did not reveal any significant complications after a single administration of the drug. in addition, 80% of patients demonstrated stable improvement during observation for 3 months after injections, and 27% of patients demonstrated a threefold improvement of vision among diabetic patients (etdrs) [10] . clinical trials (phase 2) have shown that multiple macugen administration with or without photodynamic therapy (pdt) did not cause any serious impairments; moreover, 87.5% of patients demonstrated stable vision improvement and 25% of patients demonstrated significant improvement evaluated using the etdrs system (early treatment for diabetic retinopathy study). during the third phase of clinical trials macugen (generic name pegaptanib) was used as the only drug every 6 weeks over a period of 48 weeks at a dose of 0.3, 1 and 3 mg intravitreously [56] . all three groups of patients demonstrated significant improvement of vision. the incidence of severe visual loss, determined as the loss of 30 letters of visual acuity, was reduced from 22 to 10% in the group receiving 0.3 mg of macugen. in addition, 33% of patients receiving this dose maintained their visual acuity or gained acuity (versus 23% of the control group). no antibodies against macugen were found. eyetech in cooperation with pfizer obtained fda approval for the use of macugen for the treatment of amd. these first results demonstrate that aptamers may be effective drug preparations. it is known that intensive tumor development is accompanied by neovascularization and therefore an aptamer to vegf was used for inhibition of tumor vascularization (and tumor growth). the aptamer isolated and optimized by ruckman et al. was tested in a mouse model of wilms tumor (the most common malignant tumor of the kidneys in children). 
tumor was implanted into a mouse kidney and its growth was maintained for one week, and then the aptamer to vegf (200 μg) or vehicle (phosphate buff ered saline) were administered for experimental and control mice respectively, daily for 5 weeks [57] . after decapitation of animals authors observed that tumors weighed 84% less in treated versus control animals. lung metastases were seen only in 20% of the aptamer treated animals (versus 60% of animals from the control group). the aptamers tested in the murine nephroblastoma exhibited the decrease in tumor growth by 53% compared with control [57] . the increase in bfgf correlates with appearance of various diseases including retinopathy, rheumatoid arthritis, leukemia [58] . jellinek and co authors used a 2' amino pyrimi dine derivative rna library and performed 11 rounds of aptamer selection [58] . they found an aptamer named as m21a, which exhibited binding to bfgf with k d of 0.35 nm; a competitive binding study revealed that it competed for bfgf binding with unfractionated heparin and low molecular weight hep arin. the inhibitory activity of m21a was also investi gated using chinese hamster ovary (cho) cells: the aptamer bound to its target with the k d values of 1-3 nm [59] . the effect of m21a on the endothelial cell motility was also investigated using the migration of endothelial cells to a denuded area in bovine aortic cells where endogenous bfgf is essential for activity. at concentrations > 50 nm the aptamer inhibited cell migration in a dose dependent manner (as compared with control). the rna aptamer inhibited bfgf bind ing to its cell receptor [20] . platelet derived growth factor (pdgf) is a mito gen composed of two homologous (a and/or b) chains linked by three disulfide bonds; this dimeric protein is involved into wound healing and progression of vari ous diseases including atherosclerosis and glomerulo nephritis. many tumor cell lines produce and secrete pdgf [60] . 
a dna selection in vitro against human recombi nant pdgf ab was performed and after 12 rounds dna aptamers characterized by k d of 50 pm were iso lated. three aptamers effectively inhibited pdgf bb binding to pdgf α and β receptors with the k i value of 1 nm. the anti pdgf aptamers also inhibited mitogenic effects of pdgf on cells expressing pdgf β receptors with k i of 2.5 nm [61] . one aptamer termed 36t was truncated, 2'o methyl, 2' fluoro modified and capped at the 3' end (to increase resistance to nucleases) and conjugated to 40 kda peg (to increase its lifetime in blood circula tion) [62] . the modified aptamer exhibited high affin ity binding to the human protein (k d = 100 pm); it was tested using a rat glomerulonephritis. in this model intravenous administration of this aptamer (2.2 mg/kg) twice a day decreased mitoses by 64% on day 6 and by 78% on day 9. animals treated with this aptamer were characterized by a decreased mono cyte/macrophage index and glomerular matrix over production. control animals received a scrambled sequenced oligonucleotide or peg for 6 days [63] . interferon γ (ifn γ)exhibits various immunoregu latory effects. although its antiproliferative effect is less pronounced than in ifn α and ifn β, ifn γ is the most potent activator of macrophages and the inducer of expression of mhc class ii molecules [58] . in healthy nervous tissue ifn γ is almost absent, how ever, during inflammatory processes in the nervous system and in multiple sclerosis it is overproduced. ifn γ secretion can result in inflammatory and autoim mune diseases. rna selection using 2 fluoropyrimi dine and 2 aminopyrimidine rna or a mixture of these two modifications were screened for aptamers that inhibited receptor binding of ifn γ [64] . the resultant aptamer, 2' amino 30 had a k d value for its complex with receptor of 2.7 nm. in the culture of a549 cells it inhibited receptor binding of ifn γ with k i value of 10 nm. 
this aptamer also inhibited induction of the mhc complex regulated by ifn γ and icam 1 expression with ic 50 values of 700 and 200 nm, respectively [64] . endothelial receptor tyrosine kinase tie2 plays an important role in vascular wall stability. angiopoietin 2 (ang 2) is a natural antagonist, which is obviously expressed only during active angiogenesis (e.g. tumor growth) [65] . for investigation of ang 2 by aptamers 11 rounds of rna selection were performed and rna molecules exhibiting specific binding to ang 2 were iso lated. one aptamer demonstrated high affinity to ang 2 (k d = 3.1 nm); it did not bind to ang 1 (k d > 1 μm). this aptamer was truncated to 41 mer (k d = 2.2 nm) and the truncated aptamer inhibited ang 2 function in a cell culture and in a rat model, where it significantly inhibited neovascularization by 40% [65] . influenza is one of the most widespread disease in the world. control of this disease includes active cam paigns of vaccination, use of drugs blocking neuro minidase action. the use of aptamers helps to block virus binding to cell receptors. binding of isolated dna aptamers to viral hemagglutinins blocked virus penetration into cells [66, 67] . extracellular domains of influenza hemagglutinin cause agglutination of blood cells (mainly erythro cytes). hemagglutinin determines virus binding to cells. neuraminidase is responsible for : 1) ability of a viral particle to penetrate into the host cell; 2) ability of viral particles to leave host cells after reproduction. two dna aptamers were obtained to the hemaggluti nin peptide (residues 91-261), which is responsible for binding of an oligosaccharide component of cell receptors. the aptamer a22 exhibited high binding activity and blocked agglutination of chicken red blood cells [66] . 
the effect of a22 was confirmed by microscopic studies; they revealed preservation of cell structure compared with control preparations, in which damage of cell structure in the presence of influenza virus was observed. the same authors iso lated two rna aptamers to hemagglutinin using an rna library containing a 30 nucleotide randomized region. a predicted secondary structure in 73 bases long included nucleotides of the randomized region and also constant sequences of the flanking region. one of the aptamers exhibited binding to hemaggluti nin with the k d value of 2.9 nm. the affinity of this aptamer was 15 fold higher than that of a monoclonal antibody to this protein. moreover, the rna aptamer allowed to discriminate hemagglutinins isolated from two various strains [67] . binding proteins 5.1. tat protein therapeutic applicability of aptamers has been undertaken during studies of hiv replication. human immunodeficiency viruses hiv 1 and hiv 2 that belong to a lentivirus class selectively affect t helpers. a regulatory tat protein activates viral replication. the tar element (of 60 nucleotides long) presented in all predicted viral transcripts is required for func tioning of tat protein. it was proposed to express the 60 nucleotide tar sequence to capture tat protein into an rna decoy [68, 69] . tar rna, hiv tran script, was expressed in cem ss cells. the rna decoy inhibited hiv 1 replication over 99% in vitro. inhibition of viral replication in cd4+ cells proceeded at the high level tar aptamer expression from the trna promoter. changes in the nucleotide sequence of hair pins or loop in the structure of the tar aptamer abol ished the ability of the tar aptamer to inhibit hiv rep lication [68, 70] . the selex library consisted of a randomized sequence of 120 nucleotides was used for selection of aptamers to tat proteins. 
after 11 rounds of selection a truncated variant of an rna aptamer named as rna tat was obtained; it formed a complex with the tat protein with k d of 120 pm. this 37 mer rna aptamer inhibited hiv 1 in vitro and decreased viral replica tion by 70% in a cell culture [71] . other groups also found rna decoys, which bound tat protein and inhibited hiv 1 [72, 73] . no significant incompatibility between tar and tat interactions of hiv 1 and hiv 2 belonging to var ious subfamilies were found. however, although tat 1 could transactivate hiv 2 through tar 2, tat 2 did not interact with tar 1 in hiv 1 [74] . the viral protein rev promotes transportation of partially spliced rna molecules to cytoplasm, where they provide synthesis of usual retroviral products. the rre (rev responsive element) element composed of about 234 nucleotides forms a complex three dimen sional structure, to which rev protein binds. using a trna promoter for expression of the rre element, the major rev binding site in hiv 1, overexpression of this construct has been estimated in the cells. expres sion of the chimeric trna rre aptamer caused inhi bition of viral replication by more than 90% [75] . the aptamers passed phase 1 clinical trials, in which this construct was transduced in vitro to cd34+ cells obtained from bone marrow of a hiv 1 infected sub jects followed by subsequent reinfusion into these sub jects. aptamer administration did not cause adverse effects, however, a rather low level of rre gene expression was observed possibly due to inappropriate conditions for gene transfer [76] . reverse transcriptase (rt) was the first target for the development of the selex method for hiv ther apy. tuerk and gold published the pioneer paper on selex. they used rt as the target for isolation of rna ligands, inhibiting hiv replication [1] . 
after 9 rounds of selection from rna populations randomized at 32 positions, the authors isolated rna that specifically bound to hiv rt and inhibited the activity of this enzyme [77] . the structure of the rna aptamer was subsequently characterized and experiments performed on cells indicated inhibition of hiv 1 replication by 90-99% [78] . in addition, aptamer expressing t cells completely blocked the spread of hiv in culture [79] . proliferation of myocardial and vascular cells is a central problem in the development of such cardiovascular diseases as hyperplasia, atherosclerosis, malignant tumors [80, 81] . e2f plays a central role in regulation of cell proliferation. this factor exhibits highly specific binding to a double stranded dna containing eight base pairs tttcgcgc. the constructed 14 mer dna aptamer containing a sequence for e2f binding was tested for inhibition of e2f activity [82] . in vascular smooth muscle cells (vsmc) stimulated by e2f the 14 mer oligonucleotide (odn) inhibited vsmc proliferation and expression of the genes c myc and cdc2, controlling cell cycle, and proliferating cell nuclear antigen (pcna). in vivo the 14 mer odn was transfected to rats with experimental carotid injury and this markedly suppressed the fibrosis formation compared with nontransfected arterial segments. furthermore, this inhibition continued up to 8 weeks after a single transfection [81] . transfer of an e2f decoy can therefore modulate gene expression and inhibit smooth muscle proliferation and vascular lesion formation in vivo. using the selex method other aptamers to e2f were also prepared; they inhibited dna binding activity of this protein [82] . these interesting results prompted dzau's group to test the e2f aptamer in humans in order to determine whether it can limit intimal hyperplasia during intravenous administration [83, 84] . the e2f aptamer was delivered to the infra inguinal vein by transfection. 
cell transfection efficiency was 89%, and expression of c myc and pcna was reduced by 73 and 70%, respectively, compared with the control group. after 12 months fewer occlusions were registered in the group of patients treated with the e2f aptamers compared with control. the e2f dna aptamer is now evaluated by corgentech inc. in a phase iii study to estimate its efficacy at limiting coronary and peripheral vascular damages. this aptamer is very close to clinical application. the e2f aptamer was also used for evaluation of long term protection from neointimal hyperplasia and atherosclerosis [85] . hypercholesterolemic rabbits were treated with intravenous injections of e2f aptamer or scrambled oligonucleotide. after 6 months (when animals were put on a cholesterol containing diet) the e2f aptamer treated group of animals was free of plaque whereas animals treated with the scrambled oligonucleotide and also control animals had extensive plaque formation [85] . finally, selex was used to obtain an rna aptamer that would bind and inhibit e2f. insertion of the e2f aptamer into a trna expression cassette yielded rnas that effectively inhibited e2f1 binding to dna [86] . to test the ability of the e1 rna aptamer to block proliferation, human fibroblasts were treated with e1 rna aptamer and proliferation was then induced. the rna aptamer inhibited s phase by 80% compared with control [86] . thus, natural and in vitro selected aptamers can act as proliferation inhibitors. transcription factor nf κb activates genes involved in inflammatory processes and synthesis of cytokines, interferons, mhc proteins, growth factors, and cell adhesion molecules, which play a central role in infarctions and various ischemic pathologies [87] . it is also required for hiv 1 gene expression and regulation of cell tumors. 
a double stranded dna aptamer exhibiting high affinity binding to nf κb and named as a "natural decoy" was investigated in vivo using a cardiac ischemic/reperfusion model and a significant effect in inhibiting this injury was observed [88] . in a rat model of thrombosis animals transfected with the nf κb aptamer showed improved recovery of coronary flow (97% versus 61% in control) 3 days after transfection [89] . the aptamer treated group demonstrated a lower percentage of neutrophil adhesion to endothelial cells (38% versus 81%) and a lower level of interleukin 8 (109 versus 210 ng/mg) as compared with control [89] . a fluorescent labeled aptamer to nf κb was investigated in a murine model of nephritis, where it blocked glomerular inflammation and expression of the inflammatory markers il 1α, il 1β, il 6, icam 2, vcam 1 [90] . using the selex method an rna aptamer was also generated against the p50 subunit of nf κb. fourteen rounds of selection yielded the rna aptamer, exhibiting high affinity binding to p50 and inhibition of nf κb binding to dna by preventing protein dimerization [91] . work with aptamers has important advantages over antibodies: 1) in clinical practice aptamers may be applicable in the same fields where antibodies are already used for treatment, but in contrast to antibodies aptamers are non immunogenic; 2) aptamers exhibit the same high affinity to their protein targets as antibodies; 3) aptamers can bind and penetrate to a pathological nidus faster than antibodies; 4) aptamer antidotes may be developed and they can control activity of the administered aptamer. the selex method originally developed for nucleic acid binding proteins is now actively used for studies of proteins, which lack natural complexes. the method is applicable for manipulations with individual proteins and for work with cell cultures. 
key: cord-000257-ampip7od authors: bagowski, christoph p; bruins, wouter; te velthuis, aartjan j.w title: the nature of protein domain evolution: shaping the interaction network date: 2010-08-17 journal: curr genomics doi: 10.2174/138920210791616725 sha: doc_id: 257 cord_uid: ampip7od the proteomes that make up the collection of proteins in contemporary organisms evolved through recombination and duplication of a limited set of domains. these protein domains are essentially the main components of globular proteins and are the most principal level at which protein function and protein interactions can be understood. an important aspect of domain evolution is their atomic structure and biochemical function, which are both specified by the information in the amino acid sequence. changes in this information may bring about new folds, functions and protein architectures. with the present and still increasing wealth of sequences and annotation data brought about by genomics, new evolutionary relationships are constantly being revealed, unknown structures modeled and phylogenies inferred. such investigations not only help predict the function of newly discovered proteins, but also assist in mapping unforeseen pathways of evolution and reveal crucial, co-evolving inter- and intra-molecular interactions. in turn this will help us describe how protein domains shaped cellular interaction networks and the dynamics with which they are regulated in the cell. additionally, these studies can be used for the design of new and optimized protein domains for therapy. 
in this review, we aim to describe the basic concepts of protein domain evolution and illustrate recent developments in molecular evolution that have provided valuable new insights in the field of comparative genomics and protein interaction networks. the protein universe is the collection of proteins of all biological species that exist or have once existed on earth [1] . our sampling and understanding of it began over half a century ago, when the first peptide and protein sequences were determined by sanger [2, 3] and, subsequently, the sequencing of rna and dna [4] [5] [6] . in the meantime, the genome projects of the last decade have uncovered an overwhelming amount of sequence data and researchers are now starting to address a series of fundamental questions that should shed light onto protein evolution processes [7] [8] [9] [10] . for instance, how many gene encoding sequences are present in one genome? how many sequences are repetitive and are these sequences similar in the various organisms on earth? which genes were involved in the large scale genome duplications that we see in animals? a comparison of sequences for evolutionary insight is best achieved by looking at the structural and functional (sub)units of proteins, the protein domains. by convention, domains are defined as conserved, functionally independent protein sequences, which bind or process ligands using a core structural motif [11] [12] [13] . examples of domain modes of actions in signaling cascades for instance, are to connect different components into a larger complex or to bind signaling-molecules [14, 15] . protein domains can usually fold independently, likely due to their relatively limited size, and are well known to behave as independent genetic elements within genomes [16, 17] . 
the sum of these features makes protein domains readily identifiable from raw nucleotide and amino acid sequences and many protein family resources (e.g., superfamily and smart [see table 1 ]) indeed fully rely on such sequence similarity and motif identifications [18, 19] . the algorithms that are used for domain identification are built around a set of simple assumptions that describe the process of evolution. in general, evolution is believed to form and mold genomes largely via three mechanisms, namely i) chemical changes through the incorporation of base analogs, the effects of radiation or random enzymatic errors by polymerases, ii) cellular repair processes that counter mutations, and iii) selection pressures that manifest themselves as the positive or negative influence that determines whether the mutation will be present in subsequent generations [20, 21] . by definition, each of these phenomena styles, reproductive strategies, or the lack of apparent polymerase-dependent proofreading such as in positive-stranded rna viruses [22] [23] [24] [25] . consequently, substitution rates need to be calculated to correctly compare two or more sequences and hunt uncharted genomes for comparable domains. particularly this last strategy, using general rate matrices like blosum and pam, is an elegant example of how new protein functions can be discovered [26] [27] [28] [29] [30] . fast algorithms for pair-wise alignments can be found in the basic local alignment search tool (blast), whereas multiple sequence alignments (msas, fig. 1a) in which multiple sequences are compared simultaneously are commonly created with, for example, clustalx and muscle (see table 1 ) [31] [32] [33] [34] . close relatives, sharing an overall sequence identity above for example 50% and a set of functional properties, can also be grouped into families and subfamilies. 
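the identity threshold mentioned above can be made concrete with a small sketch. it assumes two sequences that have already been aligned (equal length, gaps written as "-") and counts identity over the columns where neither sequence has a gap; other denominators are in common use, so this is one convention among several rather than the definition used by any particular resource. the function names and the 50% default are illustrative only.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over ungapped columns of a pairwise alignment.

    Assumption: both strings come from the same alignment and therefore
    have the same length. Columns with a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped column: not counted in this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def likely_same_family(aligned_a: str, aligned_b: str, threshold: float = 50.0) -> bool:
    """Hypothetical helper: apply an identity cutoff such as the 50% example."""
    return percent_identity(aligned_a, aligned_b) > threshold


if __name__ == "__main__":
    # toy alignment, invented for illustration
    print(percent_identity("ACDE-F", "ACDQGF"))
    print(likely_same_family("ACDE-F", "ACDQGF"))
```

identity alone ignores how conservative each substitution is, which is why the scoring matrices named in the text are used for the alignment step itself.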
in turn, these families share also evolutionary relationships with other domains and form together so-called domain superfamilies [18, 35] . evolutionary distances between related domain sequences can easily be estimated from sequence alignments, provided that the correct rate assumptions are made. subsequently, these can be used to compute the phylogenies of the domain that share an evolutionary history. these, often tree-like graphs (fig. 1b) , depend heavily on rate variation models, such as molecular clocks or relaxed molecular clocks (e.g., maximum likelihood and bayesian estimation), which are calibrated with additional evidence such as fossils and may therefore also provide valuable information on aspects like divergence times and ancestral sequences [36] [37] [38] . 
(figure caption fragment) fig. (1a) . it was computed using bayesian estimation and presents the best-supported topology for the alignment. numbers indicate % support by the two methods used, while # indicates gene duplication events in the common ancestor and * marks a species-specific duplication event. for computational details, please see [42] . 
commonly used phylogenetic analysis strategies are listed in table 1 . a limitation of all inferred phylogenetic data is that it is directly dependent on the alignment and less so on the programs used to build the phylogenetic tree [39] . one of the shortcomings of automated alignments may thus derive from the fact that they commonly employ a scoring and penalty procedure to find the best possible alignment, since these parameters vary from species to species [22, 23] , as mentioned above. careful inspection of alignments is therefore advisable, even though software has been developed that combines the alignment procedure and phylogenetic analysis iteratively in one single program [40] . 
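as a minimal illustration of how an evolutionary distance can be estimated from an alignment under a stated rate assumption, the sketch below computes the uncorrected p-distance (fraction of differing ungapped columns) and the poisson-corrected distance d = -ln(1 - p), which assumes equal substitution rates across sites and no rate matrix. this is a textbook simplification chosen for brevity, not the model used in the analyses cited above; the function names are illustrative.

```python
import math


def p_distance(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Fraction of ungapped alignment columns at which two sequences differ."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


def poisson_distance(p: float) -> float:
    """Poisson-corrected distance d = -ln(1 - p).

    Assumption: substitutions follow a Poisson process with the same rate
    at every site; multiple hits at one site are corrected for, but
    differences between amino acid pairs are ignored.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return -math.log(1.0 - p)


if __name__ == "__main__":
    # toy sequences, invented for illustration
    p = p_distance("ACDEFGHIKL", "ACDQFGHIRL")
    print(p, poisson_distance(p))
```

the corrected distance is always at least as large as p, and the gap between the two grows as sequences diverge, which is one reason the choice of rate assumption matters for tree building.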
although sequence and phylogenetic analysis provide a relatively straightforward way for looking at domain divergence, comparison of solved protein structures has shown that protein tertiary organizations are much more conserved (>50%) than their primary sequence (>5%) [41] . for this reason, protein structures and their models provide significantly more insight into the relations of protein domains and how domain families diverged [16] . for example, the inactive guanylate kinase (gk) domain present in the maguk family was shown to originate from an active form of the gk domain residing in ca2+ channel beta-subunits (cacnbs) through both sequence and structural comparison [42] . furthermore, identification of functionally or structurally related amino acid sites in a fold sheds light on the complex, co-evolutionary dynamics that took place during selection [43] . as described above, the evolution of a protein domain is generally the result of a combination of a series of random mutations and a selection constraint imposed on function, i.e., the interaction with a ligand. the interaction between protein and ligand can be imagined as disturbances of the protein's energy landscape, which in turn bring about specific, three-dimensional changes in the protein structure [44, 45] . binding energies, however, need not be smoothly distributed over the protein's binding pocket as a limited number of amino acids may account for most of the free-energy change that occurs upon binding [45] [46] [47] . in these cases, new binding specificities (including loss of binding) may therefore arise through mutations at these hot spots. an example is a recent study of the pdz domain in which it was shown that only a selected set of residues, and in particular the first residue of α-helix 2 (αb1), directly confers binding to a set of c-terminal peptides [48] . the folding of a domain is essentially based on a complex network of sequential inter-molecular interactions in time [49] . 
this has of course significant implications for domain integrity, particularly if one assumes that the core of a protein domain is and has to be largely structurally conserved. indeed, even single mutations that arise in this area may easily derail the folding process, either because their free energy contribution influences residues in the direct vicinity or disturbs connections higher up in the intermolecular network [49] . it is therefore hypothesized that protein evolution took place at the periphery of the protein domain core, and that gradual changes via point mutations, insertions and deletions in surface loops brought about the evolutionary distance we see among proteins to date [21, [50] [51] [52] . however, distant sites also contribute to the thermodynamics of catalytic residues. this is achieved through a mechanism called energetic coupling, which is shaped by a continuous pathway of van der waals interactions that ultimately influences residues at the binding site with similar efficiency as the thermodynamic hotspots [53, 54] . indeed in such cases, evolutionary constraints are not placed on merely one amino acid in the binding pocket, but on two or more residues that can be shown to be statistically coupled in msas [54, 55] . in addition to contributions to binding, these principles also explain why the core of a domain structure will remain largely conserved, while at functionally related places residues can (rapidly) co-evolve with an overall neutral effect [56] . of course, these aspects of co-evolution are also of practical consequence for structure prediction and rational drug design [43] . through selective mutation, protein domains have been the tools of evolution to create an enormous and diverse assembly of proteins from likely an initially relatively limited set of domains. 
the combined data in genbank and other databases now covers over 200,000 species with at least 50 complete genomes and this greatly facilitates genome comparisons [57] [58] [59] . following such extensive comparisons, currently > 1700 domain superfamilies are recognized in the recent release of the structural classification of proteins (scop) [60] and it has become clear that many proteins consist of more than one domain [17, 61, 62] . indeed, it has been estimated that at least 70% of the domains are duplicated in prokaryotes, whereas this number may even be higher in eukaryotes, likely reaching up to 90% [35] . there are various mechanisms through which protein domains or whole proteins may have been duplicated. on the largest scale, whole genome duplications such as those seen in the vertebrate genomes duplicated whole gene families, including postsynaptic proteins, hormone receptors and muscle proteins, and thereby dramatically increased the domain content and expanded networks [42, 63, 64] . on the other end of the scale, domains and proteins have been duplicated through genetic mechanisms like exon-shuffling, retrotranspositions, recombination and horizontal gene transfer [65] [66] [67] . since the genetic forces, like exon-shuffling and genome duplication, vary among species, the total number of domains and the types of domains present fluctuate per genome. interestingly, comparative analyses of genomes have shown that the number of unique domains encoded in organisms is generally proportional to its genome size [60, 68] . within genomes, the number of domains per gene, the so-called modularity, is related to genome size via a power-law, which is essentially the relation between the frequency f and an occurrence x raised by a scaling constant k (i.e., f(x) ∝ x^k) [69, 70] . 
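the power-law relation above becomes a straight line after taking logarithms, log f = log c + k log x, so the scaling constant k can be estimated as the slope of a least-squares line through log-transformed data. the sketch below does this in plain python; the data points are synthetic and invented purely to exercise the code (they are not genome measurements), and a log-log regression is only one simple way to estimate k.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(xs: Sequence[float], fs: Sequence[float]) -> Tuple[float, float]:
    """Estimate (k, c) in f = c * x**k by least squares on log-log values.

    Assumption: all x and f values are strictly positive, so that the
    logarithms are defined.
    """
    if len(xs) != len(fs) or len(xs) < 2:
        raise ValueError("need at least two (x, f) pairs of equal length")
    log_x = [math.log(x) for x in xs]
    log_f = [math.log(f) for f in fs]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_f = sum(log_f) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    if sxx == 0.0:
        raise ValueError("x values must not all be identical")
    sxy = sum((u - mean_x) * (v - mean_f) for u, v in zip(log_x, log_f))
    k = sxy / sxx                        # slope on the log-log scale
    c = math.exp(mean_f - k * mean_x)    # intercept transformed back
    return k, c


if __name__ == "__main__":
    # synthetic points generated from f = 3 * x**1.5, for illustration only
    xs = [1.0, 2.0, 4.0, 8.0, 16.0]
    fs = [3.0 * x ** 1.5 for x in xs]
    k, c = fit_power_law(xs, fs)
    print(round(k, 6), round(c, 6))
```

on real counts the points scatter around the line, and the fitted k then depends on the range of genome sizes included, so the estimate is best read as descriptive.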
a similar correlation is found when the multi-domain architecture is compared to the number of cell types that is present in an organism, i.e., the organism complexity, or when the number of domains in an abundant superfamily is plotted against genome size (fig. 2) [71, 72] . given the amount of domain duplication and apparent selection for specific multi-domain encoding genes in, for example, vertebrates, it may come as little surprise that not all domains have had the same tendency to recombine and distribute themselves over the genomes [68, 73] . in fact, some are highly abundant and can be found in many different multi-domain architectures, whereas others are abundant yet confined to a small sample of architectures or not abundant at all [68, 70] . is there any significant correlation between the propensity to distribute and the functional roles domains have in cellular pathways? some of the most abundant domains can be found in association with cellular signaling cascades and have been shown to accumulate non-linearly in relation to the overall number of domains encoded or the genome size [70] . additionally, the onset of the exponential expansion of the number of abundant and highly recombining domains has been linked to the appearance of multicellularity [70] . a recurring theme among these abundant domains is the function of protein-protein interaction and it appears that these, usually globular, domains have been particularly selected for in more complex organisms [70] . this positive relation is underlined by the association of these abundant domains with diseases such as cancer and with gene essentiality, as the highly interacting proteins that they are part of have central places in cascades and need to orchestrate a high number of molecular connections [74, 75] . 
their shape and coding regions, which usually lie within the boundaries of one or two exons, make them ideally suited for such a selection, since domains are most frequently gained through insertions at the n-or c-terminus and through exon shuffling [76] [77] [78] . from a mutational point of view, protein-protein interaction domains are different from other domains as well and this appears to be particularly true for the group of small, relatively promiscuous domains like sh3 and pdz. these domains are promiscuous in the sense that they both tend to physically interact with a large number of ligands [79, 80] and are prone to move through the genome to recombine with many other domains. it has been found that particularly these domains evolve more slowly than non-promiscuous domains [70] . this likely stems from the fact that they are required to participate in many different interactions, which makes selection pressures more stringent and the appearance of the branches on phylogenetic trees relatively short and more difficult to assess when co-evolutionary data in terms of other domains in the same gene family or expression patterns is limited [42, 63] . non-promiscuous domains on the other hand can quite easily evade the selection pressure by obtaining compensatory mutations either within themselves or their specific binding partner [70] . the overall phenomenon that the number of protein domains and their modularity increases as the genome expands has not been linked to a conclusive biological explanation yet. a rationale for the increase in interactions and functional subunits, however, may derive from the paradoxical absence of correlation between the number of genes encoded and organism complexity, the so-called g-value paradox [81] . 
there is indeed evidence that domains involved in the same functional pathway tend to converge in a single protein sequence, which would make pathways more controllable and reliable without the need for supplementary genes [73] . additionally, the number of different arrangements found in higher eukaryotes is, given the vast scale of unique domains present, relatively limited. this in turn implies that evolutionary constraints have played an important role in selecting the right domain combinations and the right order from n-to c-terminus in multi-domain proteins [13, 82] . in fact, the ordering and co-occurrence of domains was demonstrated to hold enough evolutionary information to construct a tree of life similar to those based on canonical sequence data [70] . furthermore, the increased use of alternative splicing and exon skipping in higher eukaryotes likely supplied a novel way of proteome diversification by restricting gene duplication and stimulating the formation of multi-domain proteins [83, 84] . in plants, however, the latter notion is not supported since both mono-and dicots show limited alternative splicing and a more extensive polyploidy [85] [86] [87] . it is clear that some of the above characteristics are underappreciated in the phylogenetic analysis of linear amino acid sequences. moreover, the effects of evolution extend even further than these aspects and entail transcriptional and translational regulation, intramolecular domain-domain interactions, gene modifications and post-translational protein modifications [88] [89] [90] [91] [92] [93] [94] [95] [96] . new methods are thus being developed to take into account that when sequences evolve, their close and distant functional relationships evolve in parallel. correlations of mutations have already been found between residues of different proteins [97, 98] and compensating mutational changes at an interaction interface were shown to recover the instability of a complex [99] . 
these observations are evidence for the current evolutionary models for the protein-protein interaction (ppi) networks that are being constructed through large-scale screens [100] [101] [102]. in these, a gene duplication or domain duplication (depending on the resolution of the network) implies the addition of a node, while the deletion of a gene or domain reduces the number of links in the network (fig. 3). in the next step, extensive network rewiring may take place, driven by the effect of node addition or node loss in the network (i.e., the duplicability or essentiality of a domain/protein) and mutations in the domain-interaction interface [67, 74, 103, 104, 105]. beyond mutations at the domain and protein level, regulation of protein expression provides another vital mechanism through which protein networks can evolve. microarray studies are now well under way to map genome-wide expression levels of related and non-related genes under a variety of conditions [91] [94] [95] [96]. for example, transcriptional comparisons have investigated aging [106] and pathogenicity [107]. unfortunately, given the highly variable nature of gene expression and the fact that different species may respond differently to external stimuli, such comparisons can only be performed under strictly controlled research conditions. to date most studies have therefore focused on embryogenesis, metamorphosis, sex-dependency and mutation rates of subspecies [94, 108, 109, 110, 111]. other studies have revealed valuable information on promoter types and duplication events [91] [92] [93] [94]. to overcome the limitations mentioned above, the analysis of co-expression data has been developed to supplement the direct comparison of individual gene expression changes [95]. in this procedure, a co-expression analysis of gene pairs within each species precedes the cross-comparison of the different organisms in the study.
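the two-step co-expression procedure just described can be sketched as follows. this is a minimal illustration under stated assumptions, not the method of the cited studies: co-expression is measured here with the pearson correlation, orthologs are assumed to share the same identifier in both species, and the function names, the 0.8 threshold and the toy expression values are all hypothetical.

```python
import math
from itertools import combinations

def pearson(u, v):
    """Pearson correlation of two equally long expression profiles.

    Assumes neither profile is constant (a constant profile would give a
    zero denominator).
    """
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    sd_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (sd_u * sd_v)

def coexpression(expr):
    """Step 1: within one species, correlate every unordered gene pair."""
    return {(g, h): pearson(expr[g], expr[h])
            for g, h in combinations(sorted(expr), 2)}

def conserved_pairs(expr_a, expr_b, threshold=0.8):
    """Step 2: keep gene pairs co-expressed above `threshold` in both species.

    Keys of expr_a and expr_b are assumed to be shared ortholog identifiers.
    """
    shared = set(expr_a) & set(expr_b)
    co_a = coexpression({g: expr_a[g] for g in shared})
    co_b = coexpression({g: expr_b[g] for g in shared})
    return sorted(p for p in co_a
                  if co_a[p] >= threshold and co_b[p] >= threshold)

# hypothetical expression profiles; conditions need not match across species
species_a = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 3, 2, 1]}
species_b = {"g1": [1, 3, 2, 5], "g2": [2, 6, 4, 10], "g3": [1, 0, 2, 1]}
pairs = conserved_pairs(species_a, species_b)
```

because each species is analysed on its own first, the experimental conditions do not have to be matched between organisms, which is the practical advantage this procedure has over comparing individual expression changes directly.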
this approach thus primarily focuses on the similarities and differences of orthologous genes within networks, and is therefore ideally suited for the study of protein domain evolution; it has already revealed that species-specific parts of an expression network resulted from a merge of conserved and newly evolved modules [95, 112, 113]. fig. (3). evolutionary models for protein-protein interactions. the evolution of protein networks is tightly coupled to the addition or deletion of nodes. additionally, events that introduce mutations in binding interfaces of proteins may result in the addition or loss of links in the network. node addition may take place through e.g., domain duplication or horizontal gene transfer, while rewiring of the network is mediated by point mutations, alternative splice variants and changes in gene expression patterns. finding evolutionary relationships between protein domains is mostly based on orthology and is thus commonly performed on best sequence matches. identifying and categorizing these depends largely on multiple sequence alignments, and this will in most cases give good indications for function, fold and ultimately evolution. however, this approach usually discards apparent ambiguities that arise from species-specific variations (e.g., due to population size, metabolism or species-specific domain duplications or losses) and may therefore introduce significant biases [114]. biases may also derive from the method of alignment, the rate variation model used to infer the phylogeny, and the sample size used to build the alignment [39, 40, 115]. care should therefore be taken not to regard orthology as a one-to-one relationship, but as a family of homologous relations [91], to select appropriate analysis methods [39, 115] and to extend comparative data to protein interactions and expression profiles [91].
indeed, as our wealth of biological information expands, our systems perspective will improve and provide us with an opportunity to reveal protein domain evolution at the level network organization and dynamics. large-scale expression studies are beginning to show us evolutionary correlations between gene expression levels and timings [94, 106, 107, 112, 116] , while others demonstrate spatial differences between paralogs or (partial) overlap between interaction partners [117] [118] [119] [120] . indeed, when we are able to map the spatiotemporal aspects of inter-and intra-molecular interactions we will begin to fully understand the versatile power of evolution that shaped the protein universe and life on earth [118] . phylogenetic continuum indicates "galaxies" in the protein universe: perliminary results on the natural group structures of proteins the chemistry of amino acids and proteins some peptides from insulin nucleotide sequence from the coat protein cistron of r17 bacteriophage rna use of dna polymerase i primed by a synthetic oligonucleotide to determine a nucleotide sequenc of phage fl dna dna sequencing with chain-terminating inhibitors the genome sequence of drosophila melanogaster flybase: genomes by the dozen initial sequencing and comparative analysis of the mouse genome insights into social insects from the genome of the honeybee apis mellifera selectivity and promiscuity in the interaction network mediated by protein recognition modules modular peptide recognition domains in eukaryotic signaling the multiplicity of domains in proteins the modular nature of apoptotic signaling proteins regulatory potential, phyletic distribution and evolution of ancient, intracellular smallmolecule-binding domains protein families and their evolution: a structural perspective the folding and evolution of multidomain proteins the superfamily database in 2007: families and functions smart: identification and annotation of domains from signalling and extracellular 
protein sequences comparative genomics: genome-wide analysis in metazoan eukaryotes distribution of indel lengths heterogeneity of nucleotide frequencies among evolutionary lineages and phylogenetic inference review of concepts, case studies and implications who do species vary in their rate of molecular evolution infidelity of sars-cov nsp14-exonuclease mutant virus replication is revealed by complete genome sequencing who's your neighbor? new computational approaches for functional genomics protein function in the post-genomic era the role of pattern databases in sequence analysis gene ontology: tool for the unification of biology unique and conserved features of genome and proteome of sarscoronavirus, an early split-off from the coronavirus group 2 lineage phylip version 3.63. deptartment of genetics gapped blast and psi-blast: a new generation of protein database search programs the clustal_x windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools comparison of methods for searching protein sequence databases an insight into domain combinations evolutionary trees from dna sequences: a maximum likelihood approach mrbayes: bayesian inference of phylogenetic trees mammalian evolution and biomedicine: new views from phylogeny multiple sequence alignment: in pursuit of homologous dna positions bayesian coestimation of phylogeny and sequence alignment the relation between the divergence of sequence and structure in proteins molecular evolution of the maguk family in metazoan genomes why should we care about molecular coevolution the propagation of binding interactions to remote sites in proteins: analysis of the binding of the monoclonal antibody d1.3 to lysozyme structural stability of binding sites: consequences for binding affinity and allosteric effects revealing the architecture of a k+ channel pore through mutant cycles with a peptide inhibitor structural plasticity in a remodeled protein-protein interface a 
specificity map for the pdz domain family the linkage between protein folding and functional cooperativity: two sides of the same coin? empirical and structural models for insertions and deletions in the divergent evolution of proteins analysis of insertions/deletions in protein structures structural similarity of loops in protein families: toward the understanding of protein evolution the effect of inhibitor binding on the structural stability and cooperativity of the hiv-1 protease evolutionary conserved pathways of energetic connectivity in protein families how frequent are correlated changes in families of protein sequences? an improved method for determining codon variability in a gene and its application to the rate of fixation of mutations in evolution evolution of vertebrate genes related to prion and shadoo proteins--clues from comparative genomic analysis evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes data growth and its impact on the scop database: new developments estimating the number of protein folds and families from complete genome data insights into the molecular evolution of the pdz-lim family and indentification of a novel conserved protein motif independent elaboration of steroid hormone signaling pathways in metazoans integration of horizontally transferred genes into regulatory interaction networks takes many million years prokaryotic evolution in light of gene transfer how the global structure of protein interaction networks evolves the impact of comparative genomics on our understanding of evolution modular genes with metazoan-specific domains have increased tissue specificity evolution of protein domain promiscuity in eukaryotes the structure of the protein universe and genome evolution modules, multidomain proteins and organismic complexity detecting protein function and protein-protein interaction from genome sequences lethality and centrality in protein networks comparative genomics of centrality and 
essentiality in three eukaryotic protein-interaction networks domain deletions and substitutions in the modular protein evolution genome evolution and the evolution of exon-shuffling-a review significant expansion of exon-bordering protein domains during animal proteome evolution thermodynamic basis for promiscuity and selectivity in protein-protein interactions: pdz domains, a case study promiscuous binding nature of sh3 domains to their target proteins expansion of genome coding regions by acquisition of new genes the geometry of domain combination in proteins different levels of alternative splicing among eukaryotes how did alternative splicing evolve? alternative splicing and gene duplication are inversely correlated evolutionary mechanisms polyploidy and genome evolution in plants comparative analysis indicates that alternative splicing in plants has a limited role in functional expansion of the proteome structural characterization of the intramolecular interaction between the sh3 and guanylate kinase domains of psd-95 identification of an intramolecular interaction between the sh3 and guanylate kinase domains of psd-95 interplay of pdz and protease domain of degp ensures efficient elimination of misfolded proteins comparative biology: beyond sequence analysis a genetic signature of interspecies variations in gene expression genome-wide scan reveals that genetic variation for transcriptional plasticity in yeast is biased towards multi-copy and dispensable genes identification of tightly regulated groups of genes during drosophila melanogaster embryogenesis a gene-coexpression network for global discovery of conserved genetic modules similarities and differences in genome-wide expression data of six organisms accurate prediction of proteinprotein interactions from sequence alignments using a bayesian method correlated mutations contain information about protein-protein interaction mutually compensatory mutations during evolution of the tetramerization domain of 
tumor suppressor p53 lead to impaired hetero-oligomerization functional organization of the yeast proteome by systematic analysis of protein complexes a human protein-protein interaction network: a resource for annotating the proteome protein function, connectivity, and duplicability in yeast evolution and topology in the yeast protein interaction network modularity and evolutionary constraint on proteins comparing genomic expression patterns across species identifies shared transcriptional profile in aging genome-wide functional analysis of pathogenicity genes in the rice blast fungus a mutation accumulation assay reveals a broad capacity for rapid evolution of gene expression evolution of gene expression in the drosophila melanogaster subgroup sexdependent gene expression and evolution of the drosophila transcriptome microarray analysis of drosophila development during metamorphosis conservation and coevolution in the scale-free human gene coexpression network conservation and evolution of gene coexpression networks in human and chimpanzee brains cross-species sequence comparisons: a review of methods and available resources impact of taxon sampling on the estimation of rates of evolution at sites comparative genomics beyond sequence-based alignments: rna structures in the encode regions comparative analysis of splice form-specific expression of lim kinases during zebrafish development towards cellular systems in 4d gene expression map of the arabidopsis shoot apical meristem stem cell niche a gene expression map of arabidopsis thaliana development key: cord-011012-5mev3otu authors: rathore, abhishek singh; sarker, animesh; gupta, rinkoo devi title: production and immunogenicity of fubc subunit protein redesigned from denv envelope protein date: 2020-03-30 journal: appl microbiol biotechnol doi: 10.1007/s00253-020-10541-y sha: doc_id: 11012 cord_uid: 5mev3otu dengue virus (denv) is a vector-borne human pathogen that usually causes dengue fever; however, sometime 
it leads to deadly complications such as dengue with warning signs (dws+) and severe dengue (sd). several studies have shown that the fusion (fu) and bc loops of denv envelope domain ii are highly conserved and contain some of the most dominant antigenic epitopes. therefore, in this study, the fu and bc loops were joined together to develop a short recombinant protein as an alternative to the whole denv envelope protein, and its immunogenic potential as a fusion peptide was estimated. for de novo design of the antigen, the fu and bc peptides were linked with an optimised linker so that the three-dimensional conformation was maintained as it is in the denv envelope protein. the redesigned fubc protein was expressed in e. coli and purified. subsequently, the structural integrity of the purified protein was verified by cd spectroscopy. to characterise immune responses against recombinant fubc protein, balb/c mice were subcutaneously injected with emulsified antigen preparations. it was observed by elisa that the fubc fusion protein elicited a higher serum igg antibody response, both in the presence and in the absence of freund’s adjuvant, than the fu and bc peptides administered separately. furthermore, the binding of fubc protein to mouse antisera was validated by spr analysis. these results suggest that the fu and bc epitope-based recombinant fusion protein could be a potential candidate for the development of an effective subunit vaccine against denv. electronic supplementary material: the online version of this article (10.1007/s00253-020-10541-y) contains supplementary material, which is available to authorized users. in recent decades, dengue has emerged as one of the world's most dominant tropical diseases (guzman et al. 2016) with around 390 million (95% credible interval 284 to 528 million) cases recorded per year, of which only 96 million (95% credible interval 67 to 136 million) cases were manifested clinically (giraldo-garcía and castaño-osorio, 2019; bhatt et al.
2013). according to the world health organization (who), an estimated 3.9 billion people in 128 countries are living in areas with a high risk of dengue infection (brady et al. 2012). the recent estimate of 390 million annual dengue cases reveals that the dengue disease burden has tripled compared with previous predictions of 50 to 100 million reported cases without dengue warning signs. nonetheless, 250,000 to 500,000 patients were hospitalised due to dengue with warning signs (dws+) and severe dengue (sd), and the total annual global cost of the dengue burden was estimated at around us$ 8.9 billion (giraldo-garcía and castaño-osorio, 2019; guzman et al. 2016). owing to the paucity of effective vaccines and drugs to eliminate dengue comprehensively, it remains a major challenge to human health (martina et al. 2009). therefore, next-generation vaccine strategies such as inactivated purified virus, dna or protein-based subunit vaccines, and their fusion chimeras are now under investigation, and some of them are even in clinical trials (coller et al. 2011; danko et al. 2011; liu et al. 2015). recently, dengvaxia (cyd-tdv), a tetravalent live attenuated vaccine, has been approved for use in some of the highly dengue-endemic areas where the sero-prevalence is higher than 70% (guy et al. 2015; pang et al. 2018). abhishek singh rathore and animesh sarker contributed equally to this work. according to who, it has shown significant efficacy and an acceptable safety profile during clinical trials in seropositive individuals; however, it carries a risk of severe dengue infection in seronegative individuals, and also failed to confer subsequent protection in denv-2 positive people in areas with less exposure to dengue (arredondo-garcía et al. 2018; durbin et al. 2011; guy et al. 2011).
several other vaccine candidates were also formulated based on live-attenuated dengue viruses (e.g. tv003, tv005) and inactivated purified virus (e.g. dpiv), of which some have already completed clinical trials and some are currently in phase ii to iii clinical trials (whitehead et al. 2017; diaz et al. 2018). on the other hand, dna or protein-based alternative subunit vaccines (e.g. tvdv, den-80) and dengue-flavivirus chimeras (e.g. ediii-p64k, 80e-stf2, ediii-hbsag) are either in preclinical study or in phase i clinical testing (govindarajan et al. 2015; danko et al. 2018; castaño-osorio et al., 2019). however, most of them still appear to be slow immunity boosters and require a strong adjuvant and prolonged immunisation to achieve full protection against each of the dengue serotypes, all of which makes them ill-suited for a universal vaccine licence. the dengue envelope (e) protein is composed of three ecto-domains, a membrane-proximal stem and a transmembrane anchor (klein et al. 2013). various crystal structures have shown that the ecto-domains are arranged into three antiparallel dimers on the virus surface with icosahedral symmetry, where ecto-domain i (ei) is located at the centre, holding domains ii (edii) and iii (ediii) on two opposite sides in each monomer of a dimeric unit (fibriansah et al. 2015; kuhn et al. 2002; modis et al. 2004). most of the protective antibodies against dengue virus have been mapped to domain iii, which was initially thought to be a potential subunit vaccine target (murphy and whitehead 2011; sukupolvi-petty et al. 2010). however, recombinant ediii domain delivered as plasmid dna or purified from bacterial expression systems was found to be poorly immunogenic, and has shown low protective efficacy in animal models (guzman et al. 2010). on the other hand, edii has been reported as the dimerisation domain and contains the most conserved fusion (fu) and bc loops (li et al. 2019; zhang et al. 2004).
several studies have also shown that the conserved fu loop is highly immunogenic (lai et al. 2008; cherrier et al. 2009; smith et al. 2013), and induces cross-reactive antibodies, some of which cross-talk with the adjacent bc loop (gallichotte et al., 2015; cherrier et al. 2009). moreover, during endocytosis, both loops are found to be involved in the tertiary conformation of membrane fusion (nayak et al. 2009; sukupolvi-petty et al. 2010). therefore, it has been anticipated that both loops might have an integrated role in boosting cross-neutralizing immunity. in this study, we aim to develop an alternative vaccine by combining the fu and bc loops in a single orf for production of a fusion antigen. although a recombinant antigen in isolation tends to be poorly immunogenic in vivo, the use of potent immunomodulating compounds, fusion partners or suitable delivery systems improves the specific immune response (higgins et al. 2007; garçon et al. 2011). herein, we have expressed the recombinant fubc antigenic protein, optimised its large-scale purification protocol and finally evaluated its protein-specific immune response in balb/c mice. prior to the animal challenge, its secondary structure was checked by cd spectroscopy, and its binding specificity was cross-checked with a characterised anti-fusion loop scfv antibody. all of the male and female balb/c mice were obtained from national institute of nutrition, telangana, hyderabad, india and were maintained on a standard light:dark (12:12) cycle with adequate standard food and water provided ad libitum. all of the animals were acclimated and randomly distributed into different experimental groups.
furthermore, all the in vivo experiments were performed in accordance with the committee for the purpose of control and supervision on experiments on animals (cpcsea) guidelines and were approved by the south asian university institutional animal ethics committee (iaec) that was responsible for the care and use of laboratory animals. the dna sequence of fubc gene was retrieved from fusion (fu) and bc loop of denv serotype 2 envelope protein deposited in the protein data bank (pdb: 1oan). in order to construct stable and immunogenic protein, highly conserved fusion (fu) and bc loops and their neighbouring residues (amino acid residues 62 to 122) were selected by using antigen variability analyser (avana), pblast and multiple sequence alignment tool of ncbi. for convenient expression in bacteria, all of the oligos were codon optimised for e. coli, using codon optimisation tool of integrated dna technology. complete fubc gene was constructed by assembly pcr reactions using four 60 nucleotide long overlapping oligonucleotides. for cloning into pet28a expression vector, two unique restriction sites, ecori and xhoi, were also incorporated at the 5′ and 3′ ends respectively during the final amplification of fubc full-length gene using forward and reverse primers. initially, the full-length fubc gene was cloned into a ta cloning vector using instaclone kit (thermo scientific). positive clones were screened using x-gal bluewhite screening method and digested with ecori and xhoi restriction enzymes. the digested fubc gene was further sub-cloned into a pet28a expression vector. the recombinant plasmid (fubc + pet28a) was transformed into e. coli xl-10 gold for cloning, and subsequently into e. coli bl-21 rosetta (de3) for protein expression. the e. coli bl-21 (rosetta) cells carrying fubc gene in pet28a vector were grown overnight at 37°c in 10 ml lb broth (luria-bertani medium) containing 50 μg/ml kanamycin (sigma, usa). 
1 ml of the overnight-grown primary culture was used to inoculate a 100 ml secondary culture containing kanamycin, which was further incubated at 37°c. the secondary culture was induced with 0.5 mm isopropyl β-d-1-thiogalactopyranoside (iptg) when its od600 reached around 0.5, and the culture was grown for another 4 h at 37°c. the cells were harvested by centrifugation at 4000 rpm for 10 min. the resulting cell pellet was resuspended in lysis buffer containing 50 mm tris-hcl ph 8.0, 1 mm cacl2, 0.5% triton x-100, 0.1 mg/ml lysozyme, 1 mm edta and 1 mm pmsf. then, the resuspended cells were kept on a rocker for an hour at room temperature, and sonication was performed at 30% amplitude with five 30 s on/off pulses. the lysed sample was separated into supernatant and pellet by centrifugation at 10,000 rpm for 10 min at 4°c. finally, the fubc expression levels in supernatant and pellet were checked on 15% sds-page. despite a number of attempts to recover it in a soluble form, a major fraction of the fubc protein was observed in the pellet. it was therefore decided to recover soluble protein from the pellet fraction. the resulting pellet was washed 4 to 5 times with te 50/20 buffer (50 mm tris ph 8.0 and 20 mm edta) to remove impurities and extra salts. the remaining pellet was re-solubilised using mild denaturing agents such as 4.0 m urea and 5% n-propanol along with pbs buffer (ph 8.1) and was centrifuged at 20,000 rpm for 10 min. the soluble protein fraction was initially purified by using ni-nta agarose beads and was confirmed by western blot using anti-his antibody. however, the purity and yield were not sufficient. therefore, the soluble protein fraction was subjected to gel filtration on a superose 6 10/300 column by fast protein liquid chromatography (fplc) for large-scale production of good-quality protein. the column was pre-equilibrated with pbs (50 mm phosphate buffer ph 7.4 and 150 mm nacl), and the protein sample was eluted with the same pbs buffer.
in each run, 0.5 ml of protein sample was injected through a 0.5 ml loop at a flow rate of 0.4 ml/min. the elution profile of the injected protein was followed by monitoring uv absorbance at 280 nm on the akta fplc system with a u9-l uv monitor. peaks greater than 20 mau were collected in a fraction collector and checked on 15% sds-page. multiple fplc runs were carried out, and the fractions containing fubc protein were pooled together. finally, fubc was concentrated and desalted by using a 0.5 ml millipore amicon filter with a cut-off of 3 kda. the concentration of the purified fubc was also measured by using a bca protein assay kit. in order to assure the proper folding of purified (> 95%) fubc protein, secondary structure was analysed by cd spectroscopy at far uv wavelengths ranging from 180 to 260 nm. to achieve the best conformational reading, cd spectra were obtained at different protein and buffer concentrations, because the cd spectrum of a protein needs to be adequately intense for the data to be interpreted, and the intensity of a cd spectrum depends directly on the protein and buffer concentration (kelly et al. 2005; miles and wallace 2006). the cd spectra of pbs buffer and of fubc protein samples at 0.1 mg/ml and 0.2 mg/ml concentrations (diluted in 50 mm and 10 mm pbs buffer, ph 7.4) were recorded at far uv wavelengths ranging from 200 to 280 nm with a step size of 1 nm and a bandwidth of 1 nm. the measurement was performed at room temperature (22°c), and the uv spectra were recorded for each sample with five scans. the baseline cd spectrum of the buffer was subtracted from the spectrum containing the protein to yield the actual fubc protein cd spectrum. the mean residue ellipticity $[\theta]_{\mathrm{mrw},\lambda}$ at wavelength λ was quoted in units of deg cm$^{2}$ dmol$^{-1}$, and was calculated as $[\theta]_{\mathrm{mrw},\lambda} = \dfrac{\mathrm{mrw} \times \theta_{\lambda}}{10 \times d \times c}$, where $\theta_{\lambda}$ is the observed ellipticity (degrees) at wavelength λ, d is the path length (cm) and c is the concentration (g/ml).
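the conversion above can be checked with a few lines of arithmetic. the sketch below is only an illustration, taking the mean residue weight as m/(n − 1), the convention this section uses. the function names are hypothetical, and the molecular mass, residue count, observed ellipticity and path length are made-up values rather than measurements of fubc; only the 0.2 mg/ml concentration echoes the text.

```python
def mean_residue_weight(molecular_mass_da, n_residues):
    """MRW = M / (N - 1), where N - 1 is the number of peptide bonds."""
    return molecular_mass_da / (n_residues - 1)

def mean_residue_ellipticity(theta_deg, mrw, path_cm, conc_g_per_ml):
    """[theta]_mrw = MRW * theta / (10 * d * c), in deg cm^2 dmol^-1.

    theta_deg: observed ellipticity in degrees (instruments often report
    millidegrees, which must be divided by 1000 first).
    """
    return mrw * theta_deg / (10.0 * path_cm * conc_g_per_ml)

# hypothetical inputs, chosen only to make the arithmetic easy to follow
mrw = mean_residue_weight(11000.0, 101)      # 11 kDa chain with 101 residues
mre = mean_residue_ellipticity(
    theta_deg=-0.005,                        # -5 mdeg observed
    mrw=mrw,
    path_cm=0.1,                             # 1 mm cuvette
    conc_g_per_ml=0.0002,                    # 0.2 mg/ml
)
```

with these inputs the mean residue weight is 110 and the mean residue ellipticity is −2750 deg cm² dmol⁻¹; the most common slip in practice is feeding millidegrees or mg/ml into the formula without converting.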
mrw is the mean residual weight for the peptide bond which is given as mrw = m/(n − 1); where m is the molecular mass of the polypeptide chain (in da), and n is the number of amino acids in the chain; the number of peptide bonds is n − 1. complete and incomplete freund's adjuvant (sigma, usa) was used for primary (day 0) and booster immunisation (days 7, 14 and 21) respectively. to prepare a primary dose of the immunogen, complete freund's adjuvant (fa) was mixed with equal volume of purified fubc (25 μg) in pbs and emulsified by vigorous vortex. similarly, three booster doses of immunogen were prepared with an equal volume of fubc protein (25 μg) and incomplete freund's adjuvant. all of the immunogens were prepared according to the protocol "immunization of mice" by maira-litrán, 2017. finally, emulsified fubc immunogens were checked by observing stable droplet on the water surface. healthy balb/c mice (10 weeks old, male and female) were randomly distributed into four different groups. each group was immunised with different antigen preparations. out of the four, two groups were immunised with fubc antigen: one group was injected with fubc protein with adjuvant and other was with fubc protein without adjuvant. rest of the two groups were immunised with fu and bc peptide: one group was injected with fu and bc peptides with adjuvant, and other was fu and bc peptide without adjuvant. each of the groups was also subdivided into male and female sets, and with each set, one adjuvant control was used replacing adjuvant with pbs buffer. the mice of each group were subcutaneously immunised with emulsified antigen preparations, first dose (day 0) with cfa and three subsequent booster doses (at days 7, 14 and 21) with ifa (according to protocol maira-litrán 2017). after 5 days of the final booster dose, blood samples were collected by retro-orbital cavity, and all the antisera isolated from blood were stored at − 80°c for further analysis. 
recombinant fubc protein-specific igg was measured by indirect elisa. for this, 96-well elisa plates were coated by overnight incubation at 4°c with fubc protein (1 μg/well) in 50 mm sodium carbonate-bicarbonate buffer, ph 9.5. blocking was done for 2 h with 2% bsa in pbst (pbs plus 0.05% tween 20). sera collected from immunised and control mice were incubated for 1 h at 37°c after preparing seven two-fold serial dilutions in pbst starting from 1:500. subsequently, the elisa plate was washed once with pbst, and 100 μl of anti-mouse hrp-conjugated secondary antibody (1:1000 dilution in pbst) was added and incubated for 30 min at room temperature. the plate was then washed three more times with pbst, and the binding reaction was developed by adding 200 μl/well opd substrate (prepared in 0.05 m phosphate-citrate buffer, ph 5.0, plus 0.03% h2o2). the reaction was stopped by adding 50 μl of 2.5 m h2so4 as soon as an optimum colour was observed, and the absorbance was measured at 492 nm using a biotek synergy ht microplate reader (winooski, vt). mean absorbance was plotted against antiserum dilution, and two-way anova was performed to test for statistically significant differences between groups. the immune responses of the different antiserum groups were compared by converting each elisa curve to a linear equation (y = mx + c) and computing the serum dilution that gives a fixed elisa absorbance. the data are expressed as mean ± standard deviation (sd). p values of < 0.05 were considered statistically significant.

to validate the immune response generated by recombinant fubc, the collected mouse antisera were subjected to an spr interaction experiment on an autolab esprit instrument. as in the elisa, recombinant fubc protein of the denv envelope was immobilised, here on the gold disc of the spr device. for coupling of fubc protein to the gold disc, 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (edac) and n-hydroxysuccinimide (nhs) were applied for activation.
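the elisa comparison described above (seven two-fold dilutions from 1:500, a linear fit y = mx + c per curve, and the dilution read off at a fixed absorbance) can be sketched as below. this is only an illustration under stated assumptions: the text does not say what x is, so the sketch assumes x = log2 of the dilution factor; the absorbance values are invented, and the function names are hypothetical.

```python
import math


def twofold_dilutions(start: int = 500, steps: int = 7) -> list:
    """Dilution factors for a two-fold series, e.g. 500, 1000, 2000, ..."""
    return [start * 2 ** i for i in range(steps)]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + c; returns (m, c)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x


def dilution_at_absorbance(m: float, c: float, target: float) -> float:
    """Invert y = m*log2(dilution) + c to get the dilution giving `target`."""
    return 2.0 ** ((target - c) / m)


dilutions = twofold_dilutions()
# Invented absorbance readings (A492), one per dilution, for illustration only.
absorbances = [1.6, 1.4, 1.2, 1.0, 0.8, 0.6, 0.4]
m, c = fit_line([math.log2(d) for d in dilutions], absorbances)
cutoff_dilution = dilution_at_absorbance(m, c, target=1.0)
print(f"slope = {m:.3f}, dilution at A492 = 1.0 is about 1:{cutoff_dilution:.0f}")
```

a higher dilution at the same absorbance indicates a stronger antibody response, which is what allows groups to be compared on a single number.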
nhs activates the carboxymethyl groups by creating a highly reactive succinimide ester on the disc surface, which reacts with amine and other nucleophilic groups on proteins and thereby binds the target protein to the activated disc surface. ethanolamine was added to block the remaining activated carboxymethyl groups. for the qualitative assay, all serum samples (diluted in running buffer) were applied at a flow rate of 20 μl/min over 2 min. between injections, the surface of the sensor chip was regenerated by injecting 2 m nacl at the same flow rate for 15 s. surface coupling was done in buffer containing 150 mm nacl, 3 mm edta, 0.005% surfactant p20 and 10 mm hepes-naoh at ph 7.4, and buffer containing 50 mm phosphate and 150 mm nacl at ph 7.1 was used for sample running. initially, a series of serum dilutions was applied as analyte to find an optimum refractive index. among the serial dilutions, the refractive index at 1:500 dilution was within the detection limit (fig. 6a). therefore, a 1:500 dilution of the different experimental samples was used as standard for the comparative analysis of the fubc immune response with and without freund's adjuvant. a serum control (serum collected before immunisation) at the identical dilution was also applied to determine the basal (non-specific) immune response of the serum. finally, the refractive index was analysed from the association and dissociation curves of the spr sensorgram.

the fubc synthetic gene sequence was deposited in the genbank database with accession number mn781186.

the dengue envelope (e) protein folds into three distinct domains (designated domains i, ii and iii), a membrane-proximal stem and a transmembrane anchor (klein et al. 2013). across these structural elements, four highly conserved regions have been identified by in silico sequence analysis, and two of them were found in domain ii with very low informational entropy.
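one common way to quantify the "informational entropy" of an aligned position is the shannon entropy of its residue distribution, where 0 bits means a fully conserved column. the sketch below assumes that measure (the text does not specify which one was used) and runs on an invented four-sequence toy alignment, not on real denv envelope sequences.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of the residues in one alignment column."""
    total = len(column)
    counts = Counter(column)
    return -sum((k / total) * math.log2(k / total) for k in counts.values())


def alignment_entropies(sequences):
    """Per-column entropy for equal-length aligned sequences.

    Gap characters are treated as ordinary symbols in this simple sketch.
    """
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [column_entropy("".join(col)) for col in zip(*sequences)]


# Invented toy alignment: columns 0, 1 and 3 are conserved, column 2 varies.
toy_alignment = ["GWGN", "GWGN", "GWAN", "GWSN"]
entropies = alignment_entropies(toy_alignment)
conserved = [i for i, h in enumerate(entropies) if h == 0.0]
print(entropies, conserved)
```

runs of consecutive low-entropy columns are the kind of region the text describes as highly conserved; the threshold separating "low" from "high" is a choice the analyst has to make.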
further studies have revealed that these conserved regions are part of the previously characterised flavivirus fusion (fu) and bc loops (kuhn et al. 2002). during endocytosis, the hydrophobic fusion loop remains buried at the dimer interface in the prefusion state and later clusters into a larger hydrophobic surface at one end to form a trimer, which finally initiates membrane fusion (klein et al. 2013). in addition to this structural property, the fu and bc regions contain some of the most potent antigenic epitopes, which were identified with denv-neutralising, conformation-sensitive anti-denv hmabs (goncalvez et al. 2007; costin et al. 2013; yamanaka et al. 2013). therefore, these two conserved fu and bc loops were selected to design a recombinant fubc antigenic protein for the development of a dengue subunit vaccine. in this study, an alignment of the domain ii amino acid sequences of the denv1-4 envelope proteins spanning residues 61 to 120 was used to find the optimum conserved sequence for the development of the fubc fusion protein (fig. 1a). the structure of the truncated fubc protein was also compared with that of the whole dengue envelope and of the separate fu and bc peptides by modelling the three-dimensional structure of each. the models reveal that three anti-parallel β-strands and one disulphide linkage play a crucial role in preserving the three-dimensional conformation of the truncated fubc protein as it is in the original denv envelope protein (fig. 1b, c). in contrast, because they lack the anti-parallel β-strands and a single protein frame for the fu and bc loops, the two separately expressed peptides fail to adopt the three-dimensional conformation found in the original denv envelope (fig. 1c). the electrostatic and solvent-accessible areas of the truncated fubc protein are also comparable with those of the original dengue envelope. hence, it is speculated that the recombinant fubc protein could serve as an alternative vaccine target to the whole dengue envelope.
[fig. 1 caption, beginning missing] fu loop is shown in orange and bc loop is shown in yellow. b in the denv envelope, the fu and bc loops are present at the end of domain ii and are linked by a disulphide linkage that holds both loops together to form a stable structure that can also work as an epitope. c when the structure of fubc is compared with the original denv envelope, it is plausible that the three anti-parallel β-strands present in the fubc protein (as in the denv protein) play a crucial role in maintaining the three-dimensional conformation of the fubc protein as it is in the denv protein. another major factor in the conformation of the fu and bc loops is the disulphide linkage between the two highly conserved loops in the fubc protein, which ensures the same conformation as in the denv envelope, as shown in b and c. d because they lack the anti-parallel β-strands and a common protein frame when expressed separately, the peptides are unlikely to form the disulphide linkage and fold into the conformation found in the denv envelope and fubc proteins.

for in vitro synthesis of a short antigenic protein, a recombinant fubc gene was constructed by assembly pcr using overlapping oligonucleotides designed from the fu and bc loop-encoding dna sequences of the dengue envelope. the full-length fubc gene was then amplified using 5′ and 3′ end primers flanked by ecori and xhoi restriction sites. the fubc gene was cloned into a ta cloning vector, and the clones were screened by colony pcr using fubc gene-specific end primers. two positive clones were then confirmed by restriction digestion with ecori and xhoi (fig. s1a-d). the digested and purified fubc insert was further sub-cloned into the pet28a expression vector, and the positive clones were confirmed by restriction digestion (fig. s2b) and sanger sequencing.
initially, recombinant fubc protein expression was observed only in the pellet fraction; no equivalent fubc protein band was seen in the supernatant fraction when the fractions were separated on sds-page (fig. s2c). the fubc expression level was therefore checked further after lowering the growth temperature and iptg concentration, but no significant change was noticed in the supernatant fraction. a chaperone-assisted folding system also did not help appreciably to express the fubc recombinant protein in soluble form. the insoluble pellet fraction was therefore subjected to a mild solubilisation process to recover soluble fubc protein. the recovered soluble fubc fraction was first purified using ni-nta agarose beads, and the purified protein band was confirmed by western blot analysis using an anti-his tag monoclonal antibody (fig. s3). however, ni-nta affinity chromatography did not give good purity, and the yield was insufficient. gel filtration chromatography was therefore used to recover good-quality recombinant protein on a larger scale, and a single distinct peak greater than 20 mau was observed at approximately 20.57 ml for all injected fubc protein samples (fig. 2a). after pooling, the peak fractions were separated on 15% sds-page, and a single bright band was observed at a molecular weight of 11.5 kda (fig. 2b).

in the native protein, the fu and bc loops of the dengue envelope are composed of three anti-parallel β-sheets, and the formation of secondary structure in recombinant fubc was characterised here by negative bands at 208 nm and 222 nm (kumagai et al. 2017). the cd spectrum of recombinant fubc protein showed markedly decreased signal around 222 and 208 nm, which signifies the formation of β-sheet (fig. 3). in addition, the cd spectrum of fubc acquired at a protein concentration of 0.1 mg/ml in 10 mm pbs buffer was adequately intense (fig. 2).
therefore, it can be inferred from fig. 3 (lower panel) that fubc at concentrations of 0.2 mg/ml and 0.1 mg/ml in 10 mm pbs retained the best globular folded state compared with the other conditions.

immune response to the purified fubc protein in balb/c mice

indirect elisa showed that serum igg levels in both the male and female groups treated with fubc protein were significantly higher than in those treated with adjuvant only and in the pbs control (fig. 4a, b). in addition, the response to fubc recombinant protein alone (without adjuvant) was higher in the female group than in the corresponding male group (fig. 4b). all elisa data were found to be statistically significant by two-way anova (table s1). similarly, in the spr assay, a high-level immune response was observed for serum from mice injected with recombinant fubc protein together with adjuvant, and the response without adjuvant was also higher than that of the control (serum with pbs only) (fig. 6b).

after dengue infection, most circulating antibodies are non-neutralizing and are raised against the dengue envelope e protein and the prm protein (wahala and de silva 2011). in the absence of highly specific neutralising antibodies in secondary infections, cross-reactive non-neutralising antibodies usually enhance dengue severity (de alwis et al. 2014). according to the revised dengue case classification (denco), 2 to 3% of secondary infections with another serotype cause life-threatening dengue with warning signs (dws+) and severe dengue (sd). it has been reported that serotype-cross-reactive non-neutralising antibodies enhance the entry of the dengue genome into fc receptor-bearing monocytes and promote disease severity by a process known as antibody-dependent enhancement (ade) (flipse et al. 2013). the nature of the antibody response to denv is therefore most likely to play a major role in defining disease outcome.
it is therefore to be expected that antibodies recognising specific neutralizing epitopes help in virus clearance and reduce symptoms, whereas antibodies recognising non-neutralising epitopes lead to more severe forms of disease such as dws+/sd. hence, there is an urgent need for an advanced vaccine that can generate highly specific and cross-neutralising antibodies (costin et al. 2013). recently, several attempts have been made to develop a potent dengue vaccine. the most advanced candidate, dengvaxia (cyd-tdv), was licensed in some dengue-endemic countries (imai and ferguson 2018). however, it was found to carry a risk of severe infection in vaccinated children and dengue-naïve recipients (arredondo-garcía et al. 2018). considering these safety issues, production of a recombinant subunit vaccine with efficient immune-protective properties looks attractive (govindarajan et al. 2015). meanwhile, tv003/tv005, an admixture of four live attenuated recombinant dengue vaccines, has completed a phase iii clinical trial and has been licensed to several manufacturers including butantan, vabiotech and merck (whitehead et al. 2017). moreover, some recombinant tetravalent vaccines (e.g. den-80e, tvdv), expressing the prm and e genes of each of the four denv serotypes from plasmid dna, have already completed phase i clinical trials (govindarajan et al. 2015; danko et al. 2018). it has also been shown that recombinant dengue envelope domain iii can inhibit dengue infectivity and induce dengue-neutralising immunoglobulin in mice (hermida et al. 2006).

fig. 2 purification of recombinant fubc protein by size exclusion chromatography. a fplc chromatogram of fubc protein expressed in e. coli inclusion bodies. three distinct peaks at 19.88 ml, 20.57 ml and 22.57 ml were collected separately. b gel image of unpurified and purified fubc protein samples separated on 15% sds-page. lanes 1 and 2 represent the unpurified sample, and lanes 3 and 4 represent the purified protein sample collected from the peak at the 20.57 ml fraction.

in addition, a number of antibodies raised in mice and chimpanzees against the dengue domain ii fusion loop were found to be cross-reactive with other flaviviruses (goncalvez et al. 2004; stiasny et al. 2006). however, hmabs specific to the dengue domain ii fusion loop were not found to be equally effective against other flaviviruses, including wnv, yfv and other denv strains (costin et al. 2013). it was therefore conjectured that additional contact residues of the original domain ii surface structure, adjacent to the fusion loop, might be involved in raising cross-neutralizing antibodies. several studies have identified another similar loop adjacent to the fusion loop in the envelope of most flaviviruses (e.g. wnv, tbe, jev, yfv) (fibriansah and lok 2016; li et al. 2019). our previous bioinformatics studies have also shown that the envelopes of all denv serotypes contain four conserved regions (> 90%), two of which are found in domain ii: the fusion loop (fu), with more than 99% conservation, and the nearby bc loop (fig. 1a). these two highly conserved loops were therefore targeted in this study for the development of a fusion protein. by using reverse dna technology, we have previously shown the structural integrity (fig. 1b-d) and binding specificity of fubc protein with an anti-fusion loop scfv antibody derived from the dengue- and wnv-specific mab e53 (rodenhuis-zybert et al. 2011; rathore et al. 2019).

an early challenge in subunit vaccine design is to produce large quantities of functional protein. here, we have successfully optimised the expression and purification methods for large-scale production of high-quality recombinant fubc protein. the e. coli expression system was used, and the recombinant protein was expressed in high quantity, though most of the protein was initially extracted in the pellet fraction.
no significant change was noticed on altering the regular growth temperature and inducer concentration. we therefore used the mild-solubilisation methodology to recover soluble fubc protein from the insoluble pellet. for purification, we first used the convenient ni-nta affinity method; by adjusting the lysis and elution buffers, we were able to purify recombinant fubc protein to some extent, but the quality and quantity were not sufficient. finally, to scale up production of good-quality protein, size exclusion chromatography was performed on the akta fplc system. to confirm recombinant protein expression, a simple western blot was performed using an anti-his monoclonal antibody as the primary antibody. furthermore, the in vitro structural integrity and functional authenticity of the experimental fubc protein were verified by cd spectroscopy (fig. 3) and indirect elisa (fig. 4).

most of the hmabs identified earlier from dengue patient serum were predominantly cross-reactive and recognise epitopes containing highly conserved residues in the fu loop of domain ii (lai et al. 2008). however, despite the sequence homology in the fu loop of all flaviviruses, these hmabs are non-neutralizing against heterologous serotypes (lai et al. 2008; smith et al. 2013) and thus were found to be responsible for ade in animals (goncalvez et al. 2007). it has previously been suggested that this may be due to the cryptic nature and poor accessibility of the fusion loop epitope on the surface of the mature virion (lai et al. 2008; cherrier et al. 2009). however, in addition to the partially exposed fusion loop on immature flaviviruses, some neutralizing hmabs were also found to bind to the bc loop (costin et al. 2013). this suggests that, along with the conserved fusion loop, the adjacent less conserved linker sequence and the bc loop might need to be exposed as a whole to generate effective cross-neutralizing immunity.
our current mouse immunisation experiment also supports this idea, as the antibody response to full-length fubc protein was stronger than the response elicited by the separate fu and bc peptides (figs. 4 and 5). moreover, recombinant fubc protein was found to be immunogenic without freund's adjuvant, and the response in female mice was stronger than in males (figs. 4 and 5). these observations are consistent with other studies, which stated that female mice have a better immune response owing to the higher number of leukocytes occupying the naive peritoneal and pleural cavities, and also have a higher number of t and b lymphocytes as well as macrophages (terres et al. 1968; weinstein et al. 1984; klein et al. 2015). we therefore used female mouse serum samples for the further qualitative elisa and spr tests to quantify the immune response to recombinant fubc protein. the igg antibody response to the fu and bc peptides was significantly lower than the response to fubc recombinant protein, and the immune response to fu and bc peptides with freund's adjuvant was significantly higher than that to the peptides without adjuvant and to the serum control (fig. 5). two-way anova was also performed to analyse the significance of the igg immune responses between fubc + fa and fubc without fa, and between fu, bc peptide + fa and fu, bc peptide without fa. from the f values and p values obtained by anova, it can be interpreted that the difference between fubc protein with freund's adjuvant and fu, bc peptides with freund's adjuvant was significantly greater than that between protein and peptides injected without freund's adjuvant (table s2). the igg response to recombinant fubc protein was also better than that to the fu and bc peptides in both cases, with or without adjuvant.
in addition, the difference in igg immune response between fubc protein with fa and without fa was not significant (p = 0.1740 > 0.05), which suggests that fubc without any adjuvant is sufficient to elicit a significant immune response (table s2). consistently, the spr experiment showed that the refractive index of fubc + fa was significantly higher than that of fubc without fa and of the serum control (fig. 6b).

since the newly developed fubc recombinant protein is expressed at high levels in e. coli and induces a significant immune response in mice, it might be a good candidate for dengue subunit vaccine development. owing to the presence of the highly conserved fusion loop epitope, the overlapping less conserved linker sequences and the bc epitope, this fusion protein might induce cross-neutralising immunity against heterologous dengue serotypes. however, much further investigation is required before proceeding to clinical trials, including evaluation of a proper adjuvant to induce a robust immune response, characterisation of the memory immune response generated against the fubc protein by both humoral and cell-mediated immunity, and determination of whether this memory response will provide protection against a secondary encounter with denv. nevertheless, the production process and immune response of this fusion protein provide new insight for the development of a dengue subunit vaccine.

four-year safety follow-up of the tetravalent dengue vaccine efficacy randomized controlled trials in asia and latin america the global distribution and burden of dengue refining the global spatial limits of dengue virus transmission by evidence-based consensus current status of vaccines against dengue virus.
dengue fever -a resilient threat in the face of innovation intechopen book chapter 9 structural basis for the preferential recognition of immature flaviviruses by a fusion-loop antibody the development of recombinant subunit envelope-based vaccines to protect against dengue virus induced disease mechanistic study of broadly neutralizing human monoclonal antibodies against dengue virus that target the fusion loop development of dengue dna vaccines safety and immunogenicity of a tetravalent dengue dna vaccine administered with a cationic lipid-based adjuvant in a phase 1 clinical trial dengue viruses are enhanced by distinct populations of serotype cross-reactive antibodies in human immune sera phase i randomized study of a tetravalent dengue purified inactivated vaccine in healthy adults from puerto rico development and clinical evaluation of multiple investigational monovalent denv vaccines to identify components for inclusion in a live attenuated tetravalent denv vaccine dengue virus cryo-em structure of an antibody that neutralizes dengue virus type 2 by locking e protein dimers the development of therapeutic antibodies against dengue virus molecular mechanisms involved in antibody-dependent enhancement of dengue virus infection in humans rs 2015 a new quaternary structure epitope on dengue virus serotype 2 is the target of durable type-specific neutralizing antibodies development of a dengue vaccine and its use in pregnant women monoclonal antibody-mediated enhancement of dengue virus infection in vitro and in vivo and strategies for prevention epitope determinants of a chimpanzee fab antibody that efficiently cross-neutralizes dengue type 1 and type 2 viruses map to inside and in close proximity to fusion loop of the dengue type 2 virus envelope glycoprotein preclinical development of a dengue tetravalent recombinant subunit vaccine: immunogenicity and protective efficacy in nonhuman primates from research to phase iii: preclinical, industrial and clinical 
development of the sanofi pasteur tetravalent dengue vaccine development of the sanofi pasteur tetravalent dengue vaccine: one more step forward dengue infection domain iii of the envelope protein as a dengue vaccine target a recombinant fusion protein containing the domain iii of the dengue-2 envelope protein is immunogenic and protective in nonhuman primates immunostimulatory dna as a vaccine adjuvant targeting vaccinations for the licensed dengue vaccine: considerations for serosurvey design how to study proteins by circular dichroism structure of a dengue virus envelope protein late-stage fusion intermediate sex-based differences in immune function and responses to vaccination structure of dengue virus: implications for flavivirus organization, maturation, and fusion going deep into protein secondary structure with synchrotron radiation circular dichroism spectroscopy antibodies to envelope glycoprotein of dengue virus during the natural course of infection are predominantly cross-reactive and recognize epitopes containing highly conserved residues at the fusion loop of domain ii potent neutralizing antibodies elicited by dengue vaccine in rhesus macaque target diverse epitopes immunogenicity and efficacy of flagellin-envelope fusion dengue vaccines in mice and monkeys immunization of mice dengue virus pathogenesis: an integrated view synchrotron radiation circular dichroism spectroscopy of proteins and applications in structural and functional genomics structure of the dengue virus envelope protein after membrane fusion immune response to dengue virus and prospects for a vaccine crystal structure of dengue virus type 1 envelope protein in the postfusion conformation and its implications for membrane fusion dengue vaccination: a more balanced approach is needed designing antibody against highly conserved region of dengue envelope protein by in silico screening of scfv mutant library a fusionloop antibody enhances the infectious properties of immature flavivirus 
particles evaluation of scfv protein recovery from e. coli by in vitro refolding and mild solubilization process the potent and broadly neutralizing human dengue virus-specific monoclonal antibody 1c19 reveals a unique cross-reactive epitope on the bc loop of domain ii of the envelope protein cryptic properties of a cluster of dominant flavivirus cross-reactive antigenic sites structure and function analysis of therapeutic monoclonal antibodies against dengue virus type 2 a quantitative difference in the immune response between male and female mice the human antibody response to dengue virus infection sex-associated differences in the regulation of immune responses controlled by the mhc of the mouse in a randomized trial, the live attenuated tetravalent dengue vaccine tv003 is well-tolerated and highly immunogenic in subjects with flavivirus exposure prior to vaccination a mouse monoclonal antibody against dengue virus type 1 mochizuki strain targeting envelope protein domain ii and displaying strongly neutralizing but not enhancing activity

key: cord-000884-zq8kqf6h authors: shen, hsin-hui; lithgow, trevor; martin, lisandra l. title: reconstitution of membrane proteins into model membranes: seeking better ways to retain protein activities date: 2013-01-14 journal: int j mol sci doi: 10.3390/ijms14011589 sha: doc_id: 884 cord_uid: zq8kqf6h the function of any given biological membrane is determined largely by the specific set of integral membrane proteins embedded in it, and the peripheral membrane proteins attached to the membrane surface. the activity of these proteins, in turn, can be modulated by the phospholipid composition of the membrane. the reconstitution of membrane proteins into a model membrane allows investigation of individual features and activities of a given cell membrane component.
however, the activity of membrane proteins is often difficult to sustain following reconstitution, since the composition of the model phospholipid bilayer differs from that of the native cell membrane. this review will discuss the reconstitution of membrane protein activities in four different types of model membrane (monolayers, supported lipid bilayers, liposomes and nanodiscs), comparing their advantages in membrane protein reconstitution. variation in the surrounding model environments for these four different types of membrane layer can affect the three-dimensional structure of reconstituted proteins and may possibly lead to loss of the protein's activity. we also discuss examples where the same membrane proteins have been successfully reconstituted into two or more model membrane systems with comparison of the observed activity in each system. understanding of the behavioral changes for proteins in model membrane systems after membrane reconstitution is often a prerequisite to protein research. it is essential to find better solutions for retaining membrane protein activities for measurement and characterization in vitro. the cell membrane separates intracellular components from the outside environment and is constituted by various phospholipids, cholesterol, glycolipids and proteins. integral membrane proteins have at least one polypeptide segment spanning the membrane bilayer whereas peripheral membrane proteins are temporarily attached to the lipid bilayer or to integral membrane proteins by various interactions such as hydrophobic, electrostatic and other types of non-covalent interactions. membrane proteins work as a selective filter to regulate molecules entering cells and also serve in communicating with the surrounding environment. thus, membrane proteins play an essential role in the physiological functions needed for cell survival.
the functional activities of membrane proteins are modulated by the structure of the surrounding lipid molecules in the membrane [1, 2]; thus the composition of the lipid bilayer can affect the inter- or intra-molecular interactions between the lipid bilayer and membrane proteins [3]. investigating membrane proteins in vivo is difficult because the membrane proteins are associated with a complex mixture of other proteins, and are prone to aggregation in solution [4]. it is still a major challenge at this stage to extract the information needed in vivo to address specific questions about the function of the cell membrane. to simplify cell membrane systems, model membranes such as monolayers, bilayers, liposomes and nanodiscs have been developed, enabling detailed investigation of membrane protein structure in lipid membranes. model membrane environments more closely resemble the natural lipid bilayer than alternatives such as detergents. however, many features of phospholipid structure need to be considered and optimized in the creation of a suitable model membrane. for example, the hydrophobicity of the lipid chain, defined by the lengths of the fatty acid chains, is an important parameter for retaining protein activity. other factors affecting the reconstituted membrane protein activity are the chemical properties of the lipid head groups, which control membrane hydrophilicity. both parameters are crucial in stabilizing membrane protein structure. there are a number of approaches used to create a model membrane that mimics properties of the native cell membrane, and we will review these approaches for reconstituting membrane proteins into different types of model membrane: monolayers [5], supported planar lipid bilayers [6] and liposomes [7], as shown in figure 1a-c. we will also discuss the emerging technology of nanodiscs [8] (figure 1d).
nanodiscs are a new class of model membrane, with attractive properties that address shortcomings of other approaches in the study of membrane proteins. the first section gives a brief summary of each method and a comparison of their strengths and weaknesses. in the following section, we describe four case studies and will compare the protein activity changes when the membrane proteins are reconstituted into different model membranes. in these case studies, we demonstrate how protein activities are modulated by the lipid environment and discuss how this environment helps to retain protein activities.

[figure 1 caption, beginning missing] and black in b represent water and a substrate respectively. nanodiscs contain membrane scaffold proteins, shown in green.

one of the most common approaches to study membrane protein structure and activity uses a langmuir monolayer at the air-water interface. this method has been extensively used for more than a century [9, 10]. reconstitution of membrane proteins helps to obtain further information on their organization and structure in the langmuir membrane [11, 12]. it is a simple method to create a phospholipid monolayer at an air-water interface. basically, a desired amount of lipid or lipid mixture is dissolved in organic solvents such as chloroform or chloroform/ethanol mixtures, followed by spreading the lipid/solvent mixture on the water surface. as the solvent evaporates, the phospholipid molecules self-assemble vertically as a monolayer film at the air-water interface, with their hydrophilic head groups immersed in the water and their hydrophobic tails pointing towards the air, as shown in figure 1a [13]. a major advantage of using the langmuir monolayer system is that parameters such as thickness, surface pressure, molecular area and subphase thickness can be well controlled [10].
more advanced characterization techniques, such as π-a isotherms, uv-vis absorption, x-ray reflectivity, ellipsometry and rheology, have been developed to gain detailed information on the binding of proteins onto the phospholipid monolayer and to monitor enzyme activities upon binding to the monolayer [14]. however, langmuir monolayers have two limitations: they lack the second leaflet of the natural cell membrane (a bilayer), and the high surface tension of water can cause protein denaturation. despite these limitations, there are several successful studies using this approach. two types of membrane proteins in the monolayer model membrane system are briefly described below.

rhodopsin [15, 16], bacteriorhodopsin [17, 18] and the aliphatic peptide gramicidin [19, 20] have been successfully reconstituted and studied in monolayers at the air-water interface. in order to obtain information on secondary structure and orientation, the protein layer can be investigated in situ at the air-water interface by either polarization modulation infrared reflection absorption spectroscopy (pm-irras) or x-ray reflectivity, in combination with surface pressure-area isotherms [21]. the study of gramicidin is an example of such an approach: while gramicidin is unfolded at high molecular area (low pressure), it refolds upon compression and retains its precise structure and orientation. likewise, for both rhodopsin and bacteriorhodopsin, the secondary structures measured in monolayers are indistinguishable from those in native membranes when appropriate conditions are used. while some experiments have suggested that spreading of rhodopsin under certain conditions (>5 mn/m) leads to denaturation [21], bacteriorhodopsin, in contrast, is very stable under most test conditions (compression and temperature change).
the different properties of the two proteins are probably due to the ability of bacteriorhodopsin to form a stable two-dimensional crystalline structure at the air-water interface [21].

phospholipid monolayers are simple model membrane systems that are well suited to studying the binding of peripheral proteins onto a membrane surface. peripheral membrane proteins spontaneously bind onto phospholipid monolayers at the air-water interface when they are injected into the subphase underneath the lipid monolayer. in most cases, useful information can be obtained by measuring the binding of peripheral proteins onto the monolayer. for example, the kinetics and dynamics of adsorption of myristoylated and nonmyristoylated recoverin onto phospholipid monolayers have been investigated using the surface pressure isotherms described in figure 2 [21]. the curves can be fitted with a stretched exponential, which gives adsorption rates of 0.028 s⁻¹ and 0.0048 s⁻¹ for myristoylated and nonmyristoylated recoverin, respectively. this indicates that myristoylated recoverin adsorbs approximately six times faster than nonmyristoylated recoverin.

reconstituting enzymes into langmuir monolayers at the air-water interface has been found to be a very useful approach to understanding the hydrolysis of membrane phospholipids. for example, the interfacial recognition and adsorption of phospholipase a2 (pla2) and phospholipase c (plc) at the phospholipid membrane interface are poorly understood. using this approach, both pla2 and plc were found to be active at the monolayer model membrane, indicating that the kinetics of phospholipid hydrolysis at the air-water interface can be monitored in situ by biophysical characterization techniques such as pm-irras and infrared reflection absorption spectroscopy [22]. moreover, it has been found that, in the presence of calcium, phospholipid hydrolysis by pla2 results in the production of calcium-palmitate complexes.
this suggests that calcium is necessary for pla2 secretion.

the formation of a supported lipid bilayer on a solid substrate was reported by tamm and mcconnell in 1985 as a new model membrane system to study the physical properties of biological membranes and their constituent lipid and protein molecules [23, 24]. supported planar lipid bilayers are prepared by several methods [25, 26]. vesicle fusion is the simplest method for supported bilayer formation, and the fusion mechanism on a hydrophilic support is well understood [27, 28]. essentially, the bilayer is prepared by the fusion of small unilamellar vesicles on solid supports such as sio2, glass and modified gold surfaces, driven by van der waals, electrostatic, hydration and steric forces. the supported lipid bilayer has polar hydrophilic head groups facing the aqueous surroundings and two layers of hydrophobic tails facing the interior of the membrane, so it more closely resembles biological membranes than the langmuir monolayer does. the supported lipid bilayer can reproduce many key functions of biological membranes. however, the hydrophilic head groups on one side are still tightly attached to the solid support, and this may, in some cases, affect the fluidity of the model membrane. this matters, since integral membrane proteins may then be unable to diffuse in the plane of the membrane. furthermore, the orientation of membrane proteins cannot be controlled in the supported planar lipid bilayer. to alleviate some of these problems, a new tethered polymer-supported planar lipid bilayer system was developed to investigate the reconstitution of integral membrane proteins in a laterally mobile form into the supported lipid bilayer [29]. wagner and tamm [30] have successfully designed a supported lipid bilayer on a polyethyleneglycol cushion, shown in figure 3. the polymer cushion minimizes the interactions of the proteins with the substrate and the polymer.
it also provides a soft support and, for increased stability, covalent linkage of the membranes to the supporting quartz or glass substrates. in low polyethyleneglycol concentration regimes, the bilayers were assembled with high lateral lipid diffusion coefficients (0.8-1.2 × 10⁻⁸ cm²/s). cytochrome b5 and annexin v were used to test the polyethyleneglycol cushion system. two populations of laterally mobile proteins were observed in the polyethyleneglycol cushion-supported bilayers. approximately a quarter of the cytochrome b5 diffused with a diffusion coefficient of 0.8-1.2 × 10⁻⁸ cm²/s, and more than half of the cytochrome b5 diffused with a diffusion coefficient of ~2 × 10⁻¹⁰ cm²/s. similar results were found in the annexin v system: annexin v diffused as two populations, with diffusion coefficients of 3 × 10⁻⁹ cm²/s and 4 × 10⁻¹⁰ cm²/s. the new polymer-supported lipid bilayer system has increased the mobile fraction and retained the full lateral mobility of both cytochrome b5 and annexin v when integrated into or bound to the supported lipid bilayer. although polymer cushions allow for successful integration of small membrane proteins into bilayers, further challenges stem from studies with large transmembrane proteins. polymer cushions cannot provide large transmembrane proteins with good solvent accessibility or with enough space for the motion required for their activity. while several types of polymer cushions have been developed, including polymethyl methacrylate diblock polymer cushions [31], poly(ethylene imine) cushions [32, 33] and poly(ethylene glycol) tethered lipopolymers [30], these cushions are mostly limited to a thickness of up to 10 nm. recently developed maleic anhydride copolymer thin films reach thicknesses of up to 60 nm [34].
the hydrophilic polymer-cushioned supported lipid bilayers provide a higher mobility and a homogeneous distribution of the incorporated beta-amyloid precursor protein cleaving enzyme (bace) on the bilayer surface, and enhance the enzymatic activity of bace (increased from 8% to 16%). even so, the activity of the incorporated bace remains significantly lower (16%) than that of the native enzyme (100%). another important classic category of membrane proteins is the transporters of ions and small molecules. studies of how ion channels regulate the transport of substrates [7] are important for fundamental biology. however, it is challenging to incorporate ion channels in supported lipid bilayers because of leakage or instability issues. detailed studies of ion channel conduction or gating require a considerable period of time (possibly >1 h), and it is difficult to set up a stable and electrically quiet environment for the ion channel in a planar lipid bilayer. reconstitution of ion channels into proteoliposomes has proven to be a better alternative.

lipid vesicles, also known as liposomes, consist of a self-closed lipid bilayer. they have been widely used for more than 30 years to reconstitute membrane proteins in unilamellar phospholipid vesicles. liposomes are relatively easy to construct by procedures such as extrusion, ultrasonication and reverse-phase evaporation. furthermore, giant vesicles of unilamellar or multilamellar nature can be "micro-manipulated" under an optical microscope. reconstitution of membrane proteins in liposomes usually requires detergents: purified membrane proteins are solubilized in detergent and then mixed with the desired phospholipid vesicles, forming an isotropic solution of mixed phospholipid-protein-detergent micelles. the detergent can then be removed slowly by dialysis, gel filtration or biobead adsorption.
when the detergent concentration reaches a critical level, the protein spontaneously associates with the phospholipid membrane to form biologically active liposomes, called proteoliposomes. however, it has been difficult to control the final orientation of the protein in the proteoliposomes [35], as well as the amount of protein inserted, because of the limited area available. in many cases, disorientation of the protein causes aggregation. despite these difficulties, there have been many successful cases of membrane proteins reconstituted in proteoliposomes, and we describe two examples below.

several integral membrane proteins have been successfully reconstituted into proteoliposomes, such as rhodopsin [36], g proteins [37, 38], proapoptotic bcl-2 proteins and t-bid [39], phosphocholine cytidylyltransferase (ct) [40] and protein kinase c (pkc) [41]. however, these studies also found that the resulting protein activities are sensitive to the membrane curvature of the liposomes. this indicates that different phospholipids can cause considerable changes in curvature stress in the liposomes [42]. specifically, the curvature stress has been suggested to modulate the free energy and folding of integral membrane proteins [43]. sometimes the activity of different enzymes is modulated by the same driving force of membrane curvature, but the activity may also vary through different mechanisms. for example, the activities of both ct [40] and pkc [41] are enhanced by increasing the negative curvature strain of the membrane. the activity of ct appears to be directly coupled to the membrane curvature; in contrast, the enzymatic activity of pkc shows no direct relationship with the curvature strain [41]. the activity of pkc is instead modulated by nonlamellar-forming lipids via a less direct mechanism.
liposomes have been commonly used for reconstituting different types of transporters to allow for the free diffusion of solution or catalysis of obligatory co-transporters. a large number of functional membrane proteins have been successfully reconstituted into liposomes, but only a few examples will be discussed here. the reconstitution of colicin ia and e1 in either soybean phospholipids or e. coli phospholipids shows that channels form in the liposomes, but these are nonspecific channels that allow the passage of ions such as rubidium, sodium, chloride, potassium or phosphate but not of sugars [7, 44]. an example of the reconstitution of selective transport comes from the d-glucose transporter, purified from human erythrocytes and extracted from detergents, followed by incorporation into proteoliposomes. with incorporation of the d-glucose transporter, the proteoliposomes become permeable to d-glucose but not to l-glucose. the transport was inhibited by cytochalasin b, which is a potent inhibitor of the d-glucose transporter [45, 46]. several types of atp-dependent ion transporters, such as ca2+/mg2+-atpase, na+/k+-atpase and h+/k+-atpase, have been reconstituted into proteoliposomes [47]. upon addition of atp, ions are observed to be transported inwards and can form a complex. the single-channel properties of channels incorporated into proteoliposomes can be investigated using the well-known patch-clamp method [48]. channel activity is monitored following excision of the patch from the proteoliposomes. ion-channel reconstitution makes it possible to investigate the influence of membrane lipid composition on channel function. the kinetic investigation of these channels under physiological conditions has been discussed elsewhere [47].

another recent method uses an organic solvent or oil mixed with water to create water-in-oil (w/o) microdroplets coated by phospholipid.
the hydrophilic head groups are immersed in the water and the hydrophobic tails are located in the oil/organic solvent phase. water-in-oil systems can be applied to a wide range of model membranes, from monolayers to planar lipid bilayers and liposomes. funakoshi et al. [49] and maglia et al. [50] used a planar lipid bilayer, formed by bringing two microdroplets into contact, to reconstitute ion channels in the bilayer. this method is extremely simple and reproducible. recently, water-in-oil microdroplets have been extended to form liposomes by using the droplet-transfer method invented by yoshikawa [51]. with this approach, it is possible to modulate the lipid compositions of the outer and inner leaflets and, furthermore, to orient a reconstituted membrane protein in liposomes [52].

nanodiscs offer a solution to some of the challenges described in the previous sections. the first attempt to reconstitute membrane proteins in phospholipid bilayers using nanodisc technology was made by sligar's group a decade ago [8]. the nanodisc is a self-assembly of phospholipids and a membrane scaffold protein derived from human serum apolipoprotein a1. the detergent cholate can be used to solubilize phospholipids and membrane scaffold proteins into a micelle mixture. following detergent removal by dialysis or bio-beads adsorbent, a nanodisc self-assembles. the phospholipid associates as a bilayer domain while the membrane scaffold protein wraps around the edges of the discoidal structure in a belt-like configuration (figure 1d). the diameter of the bilayer disc can be modified by genetically engineering the apolipoprotein a1 to change the number of amphipathic helices. by this approach, the diameter of nanodiscs can be made anywhere from 9.8 to 17 nm, and nanodiscs can therefore accommodate a range of membrane proteins. the ratio of phospholipid to membrane scaffold protein is precisely defined, which helps to engineer nanodiscs for membrane proteins of different sizes.
the detailed formation of different types of nanodiscs has been described elsewhere [53, 54]. the great advantage of nanodiscs is that they keep membrane proteins in aqueous solution, in a native-like phospholipid bilayer environment that is soluble, stable, monodisperse and detergent-free. most importantly, nanodiscs isolate proteins or complexes as individual particles in monomeric or oligomeric states for analysis by techniques that range from activity assays to electron microscopy. since 2003, more than 100 membrane proteins have been reconstituted into nanodiscs [54], ranging from signaling receptors to transport machines. we discuss the applications separately below.

nanodiscs have been used to analyze the influence of substrate binding on monodisperse receptors isolated from the cell-surface membrane. those receptors include g protein-coupled receptors (gpcr) [55, 56], the cholera toxin receptor ganglioside gm1, a bacterial chemoreceptor [57] and the epidermal growth factor receptor. introduced into nanodiscs, the receptors stay in monodisperse, controllable, predefined oligomeric states, which makes it possible to characterize their oligomeric status. for example, two different gpcr proteins, the beta-adrenergic receptor (β2ar) and rhodopsin [58], have been extensively studied using nanodiscs. β2ar was one of the first receptors assembled into nanodiscs; it was found to be functionally active (54% of starting activity recovered) and was shown to couple to its g-protein. rhodopsin is a light-activated gpcr present in the photoreceptor cells of the retina, and transducin is an important g-protein naturally expressed in retinal rods and cones. functional rhodopsin assembled into nanodiscs was found to activate transducin with high efficiency and allowed isolation of the high-affinity transducin-metarhodopsin ii complex. this provides strong evidence that the monomeric state of rhodopsin can activate and interact with transducin.
dimeric rhodopsin nanodiscs were separated from monomeric forms using sucrose density gradients. even with two rhodopsins in the nanodiscs, interaction with a single transducin molecule was observed and was found to activate the transducin with high efficiency [56].

numerous membrane-associated enzymes have been incorporated into nanodiscs. cytochrome p450 (cyp) enzymes have been extensively studied, including cyp2b4 [59], cyp6b1 [8], cyp73a5 [60] and cyp19 [61, 62]. this system has provided a means for studying the extensive collection of membrane-bound cytochromes p450 with the same biochemical and biophysical tools that were previously limited to use with the soluble p450s. for example, cytochrome p450 3a4 (cyp3a4) is a membrane-bound human hepatic drug-metabolizing enzyme. most studies of ligand binding by cyp3a4 are carried out in the presence of detergents below their critical micelle concentrations [63, 64] but are complicated by the propensity of cyp3a4 to aggregate. even in studies attempting to use liposomes, cyp3a4 is unlikely to exist in its native state because the detergent concentrations are much higher than the phospholipid concentrations. as a result, the understanding of the structure and composition of cyp3a4 in the lipid phase has been limited, and the effect of the membrane on cyp3a4 ligand-binding behavior is unclear. nanodiscs have been used to study cyp3a4, which displays monophasic reduction kinetics. with a high lipid-protein ratio, cyp3a4 is captured as a monomer. however, at lower lipid ratios, cyp3a4 self-associates and heterogeneous behaviors are induced. the nanodiscs prevent self-association in this case, as there is only one cyp3a4 per nanodisc, and show a significant improvement in homogeneity and stability. this opens up new possibilities for detailed analysis of the equilibrium and steady-state kinetic characteristics of the catalytic mechanisms of cyp enzymes [63].
the sec translocon is a membrane-embedded protein assembly that drives protein translocation into or across membranes. the core translocon is formed from a trimeric arrangement of secy, sece and secg [65]. the secyeg protomer has 15 transmembrane helices sitting in the phospholipid membrane. the oligomerization of secyeg has been proposed to be necessary for proper function. researchers were successful in reconstituting sec into membrane vesicles in 1990 and have had great success in characterizing several partial reactions of secyeg function [65]. reconstitution of a single secyeg into a nanodisc with different types of lipids [66] suggests that acidic lipids can stabilize the secyeg channel in the nanodisc bilayer and trigger dissociation of the seca dimer. a model proposed by alami et al. [66] suggests that the dissociation of the seca dimer provoked by the secyeg complex is followed by activation of the seca atpase. furthermore, dalal et al. [67], using nanodisc technology, have also shown that only the secy dimer together with acidic lipids supports the activation of the seca translocation atpase. recently, the first single-particle cryo-em structure of single secyeg complexes in nanodiscs bound to translating ribosomes was solved at subnanometer resolution [68]. it allows the secyeg complex to be investigated in a natural lipid bilayer environment and identifies the ribosome-lipid interactions. wu et al. [69] also used surface plasmon resonance to investigate the competitive binding of ribosomes and seca. the data suggest that both ribosomes and seca can interact simultaneously with the secyeg complex during membrane protein insertion, but that seca competes with the ribosome when it binds to the secyeg complex.
in the previous section, we have shown that membrane proteins can be assembled into four different types of model membrane and that the activities of some of the membrane proteins can be retained, allowing their physicochemical properties to be studied. but is there a model membrane system that is the best for membrane-protein reconstitution? the reconstitution of the same membrane protein into different model membranes has been compared and, in this section, we list four membrane proteins with varying activities in different model membranes.

ganglioside gm1 is a naturally occurring receptor that binds to cholera toxin via hydrogen bonds [70]. it is an excellent receptor for studying lipid-receptor interactions. several different approaches to reconstituting the glycolipid receptor gm1 in model membranes have enabled measurement of the binding of its interaction partner, cholera toxin. in liposomes and supported lipid bilayer systems, ganglioside gm1 is free to diffuse across long distances and exhibits a non-uniform lateral distribution, i.e., self-aggregation, even at low incorporation ratios. the binding activity of ganglioside gm1 with cholera toxin b is therefore restricted [71]. investigations of ganglioside gm1 incorporated into nanodiscs found reduced aggregation. bricarello et al. [72] found that a low concentration of ganglioside gm1 reconstituted in nanodiscs binds cholera toxin with a significantly higher affinity than in liposomes or supported lipid bilayers. this is due to the interaction of ganglioside gm1 with the headgroup region of the disc, which reduces oligomerization and thereby potentially affects the affinity of toxin binding. thus, nanodisc technology restricts ganglioside gm1 oligomerization by controlling the number of ganglioside gm1 monomers isolated in each nanodisc. borch et al.
[73] have also used sensor chip-based surface plasmon resonance (spr) technology to measure the detailed binding kinetics of the interaction between soluble molecules and membrane receptors inserted in the bilayer of nanodiscs. the corresponding spr sensorgrams are displayed in figure 4. overall, the change in the sensorgram indicates that histidine-modified nanodiscs are immobilized on the spr sensor chip or that cholera toxin b is bound to the nanodiscs. the sensorgrams in both figure 4a and 4b show the binding of nanodiscs (576 ru) to the antibody-immobilized surface of the sensor chip. when cholera toxin b is injected over the two flow cells presented in figure 4a and 4b, the spr sensorgrams detect the interaction of cholera toxin b with nanodiscs with or without gm1. the captured 2% gm1-nanodiscs bound 238 ru of cholera toxin b, whereas no binding was observed to the captured nanodiscs without gm1 (figure 4b). the measured kinetic values of the interaction are in agreement with those reported by previous studies on the interaction of cholera toxin with the gm1 receptor embedded in different membrane systems. this, therefore, serves as a proof of concept that nanodiscs can be employed in kinetic spr studies.

the nuclear envelope is composed of two bilayers (the outer nuclear membrane and the inner nuclear membrane) and contains abundant ion channels, through which ions and small molecules diffuse between the cytoplasm, nucleoplasm and perinuclear (i.e., intermembrane) space. nuclear ionic channels are a ubiquitous structure in the nuclei of a wide range of cells, although little is known about their functional properties. to characterize nuclear ionic channels, guihard et al. [74] attempted to reconstitute nuclear envelope vesicles derived from canine liver nuclei into a planar lipid bilayer and giant proteoliposomes.
they found that the success rate of nuclear envelope fusion into planar lipid bilayers was extremely low, although cardiac nuclear ionic channels were successfully incorporated into planar lipid bilayers. detection of nuclear ionic channel activity was not possible. such a low efficiency can be explained by the clustering of nuclear envelope vesicles and the low density of single vesicles, as well as the presence of residual chromatin and/or nuclear proteins (histones or lamins), which would prevent fusion events with the bilayer. another approach is to reconstitute nuclear envelope vesicles into giant proteoliposomes and detect single ion channels by the patch-clamp technique [49]. large-conductance, voltage-gated, k+- and cl−-selective nuclear ionic channels are characterized and plotted as current-voltage relationships in figure 5a and 5b, respectively. under asymmetrical 150/50 mm kcl conditions, the zero-current potential for unitary currents is at −22 mv (figure 5a). the p(k+)/p(cl−) ratio calculated from the goldman-hodgkin-katz (ghk) flux equation is 9.4, which indicates the k+ selectivity of this channel. in figure 5b, the cl−-selective nuclear ionic channel yields a positive zero-current potential of +27.3 mv, with a p(cl−)/p(k+) ratio of 80, indicative of a high cl− selectivity over k+. this suggests super fusion of the channel under asymmetrical (150/50 mm) kcl conditions. the current-voltage relationship curves indicate that nuclear ion channels can be functionally characterized by incorporating the proteins into giant proteoliposomes, where it is possible to retain their channel activity. furthermore, the measured activities are consistent with those described for native nuclear ion channels.
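the permeability ratios and zero-current potentials quoted above are linked by the goldman-hodgkin-katz (ghk) voltage equation, and the relationship can be checked numerically. the following python sketch is illustrative only and is not taken from guihard et al.: the function name `ghk_reversal_mv`, the assignment of 150 mm kcl to the inside and 50 mm kcl to the outside, the temperature of 25 °c and the use of concentrations in place of activities are all assumptions made here.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def ghk_reversal_mv(p_k, p_cl, c_in=150.0, c_out=50.0, temp_c=25.0):
    """Zero-current potential in mV (inside relative to outside) for a
    channel permeable to K+ and Cl-, with the same KCl concentration (mM)
    for both ions on each side.

    The gradient orientation and the temperature are assumptions of this
    sketch, not values taken from the source.
    """
    rt_over_f_mv = 1000.0 * R * (temp_c + 273.15) / F
    # GHK voltage equation: the cation enters the numerator with its outside
    # concentration, the anion with its inside concentration.
    numerator = p_k * c_out + p_cl * c_in
    denominator = p_k * c_in + p_cl * c_out
    return rt_over_f_mv * math.log(numerator / denominator)


# K+-selective channel: P_K/P_Cl = 9.4 (ratio quoted in the text)
v_k = ghk_reversal_mv(p_k=9.4, p_cl=1.0)
# Cl--selective channel: P_Cl/P_K = 80 (ratio quoted in the text)
v_cl = ghk_reversal_mv(p_k=1.0, p_cl=80.0)

print(f"K+-selective channel:  {v_k:+.1f} mV")
print(f"Cl--selective channel: {v_cl:+.1f} mV")
```

under these assumptions, a permeability ratio of 9.4 in favor of k+ gives a zero-current potential of about 22 mv in magnitude with negative sign, and a ratio of 80 in favor of cl− gives a positive value close to the +27.3 mv quoted for the cl−-selective channel. reversing the assumed orientation of the gradient flips both signs but leaves the magnitudes unchanged.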
p-glycoprotein, the most extensively studied atp-binding cassette transporter, has been implicated in the phenomenon of multidrug resistance in tumor cells and has been suggested to play a significant role in drug absorption and deposition. how p-glycoprotein interacts with its substrates is still unknown. functional studies are limited because of the difficulty of obtaining large quantities of stable p-glycoprotein. in addition, no atpase activity of p-glycoprotein solubilized in detergent could be detected. when p-glycoprotein is reconstituted into proteoliposomes, it has detectable atpase activity; however, the whole complex is very unstable. heikal et al. [75] have further found that p-glycoprotein reconstituted in proteoliposomes has a half-life of less than one day. in 2009, ritchie et al. [76] performed a detailed study of the drug-stimulated atpase activity of p-glycoprotein using nanodisc technology. p-glycoprotein was reconstituted into both msp1e3d1 discs and liposomes in order to compare its atpase activities. the results described in figure 6 demonstrate that p-glycoprotein is functionally active when reconstituted into nanodiscs (filled squares). compared with the atpase activity of p-glycoprotein reconstituted in liposomes (filled circles), there is a twofold increase in the maximum atpase activity in nanodiscs. this could be due to the uniform orientation of p-glycoprotein in nanodiscs, whereas there are two possible orientations in liposomes. these data show not only that p-glycoprotein is functionally active when reconstituted into nanodiscs, but also that it exhibits higher specific activity than in the current standard reconstitution system.

figure 6. comparison of the atpase activity of p-glycoprotein in nanodiscs (squares) and proteoliposomes (circles). open symbols: basal activity in the absence of drug; filled symbols: activity in the presence of nicardipine [76].
atp-binding cassette transporters utilize the energy of atp hydrolysis to transport a wide range of substrates across cellular membranes and for non-transport-related processes such as translation of rna and dna repair [77]. a member of the atp-binding cassette superfamily, the maltose transporter malfgk2 from e. coli, together with the substrate-binding protein male, is one of the best-characterized atp-binding cassette transporters suitable for various reconstitution techniques. bao and duong have reported the reconstitution of the maltose transporter in nanodiscs, in detergent and in proteoliposomes. the atpase activity of the malfgk2 complex in various environments is shown in figure 7. the data presented in the first column of figure 7 show that the basal atpase activity for assembly in nanodiscs and detergent (~700 nmol/min/mg) is 10-fold higher than in proteoliposomes because of the decrease in the activation energy barrier of the transporter [78] in detergent micelles and nanodiscs. however, in the presence of male, the rate of atp hydrolysis increases in all assembly conditions. this is because male captures maltose and delivers the sugar to the transporter. note that the atpase activity of the assembly in nanodiscs increases dramatically, from 700 to 2300 nmol/min/mg. maltose alone has no effect on the basal atpase activity in nanodiscs and detergent. however, in nanodiscs and detergent, an inhibition of the atpase activity was observed in the presence of both maltose and male. this is because maltose reduces the binding affinity of the male-malfgk2 complex, which therefore reduces the atpase activity. in proteoliposomes, the atpase activity (~40 nmol/min/mg) shows a further 10-fold increase in the presence of both maltose and male in the figure. the authors used another type of male mutant which binds maltose with higher affinity.
this male mutant, in contrast, shows a reduction of the atpase activity in proteoliposomes, which is the same effect as in nanodiscs and detergents. overall, proteoliposomes have shown a low basal atpase activity because the lipid stabilizes the transporter. however, nanodiscs have been shown to be a better medium than proteoliposomes for studying the atp hydrolysis ability of atp-binding cassette transporters.

this review summarizes and compares the most up-to-date methods for reconstituting membrane proteins into model membranes. no single method for reconstituting membrane proteins in a model membrane is superior; instead, two or more model membranes should be considered, depending on the particular needs of the system and the proteins of interest. in general, systems based on lipid bilayers supported on a solid substrate are still the most favored and well-developed methods for studying membrane proteins in the bilayer. this approach allows detailed study of the fundamental properties of biological membranes, and the bilayer system is practical to reproduce. on the other hand, the proteoliposome is more suitable for ion channel reconstitution in the bilayer, as well as for combination with the patch-clamp method to detect the ionic selectivity of the channel. finally, the self-assembled nanodisc system provides a robust and common means for rendering these targets soluble in aqueous media while providing a native-like bilayer environment that maintains functional activities. nanodisc technology offers another way to prepare monodisperse samples of membrane proteins in the bilayer environment, and it is emerging as the favored approach in studies concerning membrane protein complexes.

biochemical and functional characterization of the membrane association and membrane permeabilizing activity of the severe acute respiratory syndrome coronavirus envelope protein lipid-protein interactions in human erythrocyte-membrane acetylcholinesterase.
modulation of enzyme activity by lipids correlation between the effect of the anti-neoplastic ether lipid 1-o-octadecyl-2-o-methyl-glycero-3-phosphocholine on the membrane and the activity of protein kinase calpha weak dependence of mobility of membrane protein aggregates on aggregate size supports a viscous model of retardation of diffusion structure and phase transitions in langmuir monolayers allogeneic stimulation of cytotoxic t cells by supported planar membranes effect of colicins ia and e1 on ion permeability of liposomes direct solubilization of heterologously expressed membrane proteins by incorporation into nanoscale lipid bilayers surface enhanced raman scattering of a lipid langmuir monolayer at the air-water interface lipid monolayers: why use half a membrane to characterize protein-membrane interactions? langmuir monolayer of artificial pulmonary surfactant mixtures with an amphiphilic peptide at the air/water interface: comparison of new preparations with surfactant polymyxin b-lipid interactions in langmuir-blodgett monolayers of escherichia coli lipids: a thermodynamic and atomic force microscopy study langmuir balance investigation of superoxide dismutase interactions with mixed-lipid monolayers modern physicochemical research on langmuir monolayers structure of rhodopsin in monolayers at the air-water interface: a pm-irras and x-ray reflectivity study formation, structure, and spectrophotometry of air-water interface films containing rhodopsin proton transport by bacteriorhodopsin in planar membranes assembled from air-water interface films structural and spectroscopic characteristics of bacteriorhodopsin in air-water interface films spectroscopic and structural properties of valine gramicidin a in monolayers at the air-water interface effects of gramicidin-a on the adsorption of phospholipids to the air-water interface organization, structure and activity of proteins in monolayers monitoring of phospholipid monolayer hydrolysis by phospholipase 
a2 by use of polarization-modulated fourier transform infrared spectroscopy supported planar membranes in studies of cell-cell recognition in the immune system supported phospholipid bilayers formation of high-resistance supported lipid bilayer on the surface of a silicon substrate with microelectrodes supported lipid bilayer self-spreading on a nanostructured silicon surface simulations of temperature dependence of the formation of a supported lipid bilayer via vesicle adsorption simulations of lipid vesicle rupture induced by an adjacent supported lipid bilayer patch membrane lateral mobility obstructed by polymer-tethered lipids studied at the single molecule level tethered polymer-supported planar lipid bilayers for reconstitution of integral membrane proteins: silane-polyethyleneglycol-lipid as a cushion and covalent linker reversible activation of diblock copolymer monolayers at the interface by ph modulation, 1: lateral chain density and conformation polymer-cushioned bilayers. i. a structural study of various preparation methods using neutron reflectometry polymer-cushioned bilayers. ii. 
an investigation of interaction forces and fusion using the surface forces apparatus controlled enhancement of transmembrane enzyme activity in polymer cushioned supported bilayer membranes orientation and reactivity of nadh kinase in proteoliposomes modulation of rhodopsin function by properties of the membrane bilayer role of lipid polymorphism in g protein-membrane interactions: nonlamellar-prone phospholipids and peripheral protein binding to membranes influence of the membrane lipid structure on signal processing via g protein-coupled receptors the apoptotic protein tbid promotes leakage by altering membrane curvature modulation of ctp:phosphocholine cytidylyltransferase by membrane curvature elastic stress the role of membrane biophysical properties in the regulation of protein kinase c activity membrane lipid polymorphism: relationship to bilayer properties and protein function elastic coupling of integral membrane protein stability to lipid bilayer forces reconstitution of colicin e1 into dimyristoylphosphatidylcholine membrane vesicles the permeability of bilayer lipid membranes on the incorporation of erythrocyte membrane extracts and the identification of the monosaccharide transport proteins binding of cytochalasin b to human erythrocyte glucose transporter conformational dynamics of na+/k+-and h+/k+-atpase probed by voltage clamp fluorometry the extracellular patch clamp: a method for resolving currents through individual open channels in biological membranes lipid bilayer formation by contacting monolayers in a microfluidic device for membrane protein analysis droplet networks with incorporated protein diodes show collective properties cell-sized liposomes and droplets: real-world modeling of living cells oriented reconstitution of a membrane protein in a giant unilamellar vesicle: experimental verification with the potassium channel kcsa phospholipid phase transitions in homogeneous nanometer scale bilayer discs membrane protein assembly into 
nanodiscs functional reconstitution of beta2-adrenergic receptors utilizing self-assembling nanodisc technology transducin activation by nanoscale lipid bilayers containing one and two rhodopsins using nanodiscs to create water-soluble transmembrane chemoreceptors inserted in lipid bilayers atomic-force microscopy: rhodopsin dimers in native disc membranes single-molecule height measurements on microsomal cytochrome p450 in nanometer-scale phospholipid bilayer disks co-incorporation of heterologously expressed arabidopsis cytochrome p450 and p450 reductase into soluble nanoscale lipid bilayers the critical iron-oxygen intermediate in human aromatase the ferrous-oxy complex of human aromatase kinetics of dithionite-dependent reduction of cytochrome p450 3a4: heterogeneity of the enzyme caused by its oligomerization ligand binding to cytochrome p450 3a4 in phospholipid bilayer nanodiscs the effect of model membranes the atpase activity of seca is regulated by acidic phospholipids, secy, and the leader and mature domains of precursor proteins nanodiscs unravel the interaction between the secyeg channel and its cytosolic partner seca two copies of the secy channel and acidic lipids are necessary to activate the seca translocation atpase cryo-em structure of the ribosome-secye complex in the membrane environment competitive binding of the seca atpase and ribosomes to the secyeg translocon crystal structure of cholera toxin b-pentamer bound to receptor gm1 pentasaccharide self-aggregation-an intrinsic property of g(m1) in lipid bilayers ganglioside embedded in reconstituted lipoprotein binds cholera toxin with elevated affinity nanodiscs for immobilization of lipid bilayers and membrane receptors: kinetic analysis of cholera toxin binding to a glycolipid receptor patch-clamp study of liver nuclear ionic channels reconstituted into giant proteoliposomes the stabilisation of purified, reconstituted p-glycoprotein by freeze drying with disaccharides chapter 
11-reconstitution of membrane proteins in phospholipid bilayer nanodiscs abc transporters, mechanisms and biology: an overview discovery of an auto-regulation mechanism for the maltose abc transporter malfgk2 the authors gratefully acknowledge financial support from the australian research council (arc). hhs is an arc super science fellow and tl is an arc federation fellow. tl and lm were awarded the arc super science fellowships and grant (fs110200015). we thank victoria hewitt for her critical reading of the manuscript. key: cord-005145-1l87fdmi authors: marquet-blouin, e.; bouche, f.b.; steinmetz, a.; muller, c.p. title: neutralizing immunogenicity of transgenic carrot (daucus carota l.)-derived measles virus hemagglutinin date: 2003 journal: plant mol biol doi: 10.1023/a:1022354322226 sha: doc_id: 5145 cord_uid: 1l87fdmi although edible vaccines seem to be feasible, antigens of human pathogens have mostly been expressed in plants that are not attractive for human consumption (such as potatoes) unless they are cooked. boiling may reduce the immunogenicity of many antigens. more recently, the technology to transform fruit and vegetable plants has been perfected. we transformed carrot plants with agrobacterium tumefaciens to generate plants (which can be eaten raw) transgenic for an immunodominant antigen of the measles virus, a major pathogen in man. the hemagglutinin (h) glycoprotein is the principal target of neutralizing and protective antibodies against measles. copy numbers of the h transgene were verified by southern blot and specific transcription was confirmed by rt-pcr. the h protein was detected by western blot in the membrane fraction of transformed carrot plants. the recombinant protein seemed to have an 8% lower molecular weight than the viral protein. although this suggests a different glycosylation pattern, proper folding of the transgenic protein was confirmed by conformation-dependent monoclonal antibodies. 
immunization of mice with leaf or root extracts induced high titres of igg1 and igg2a antibodies that cross-reacted strongly with the measles virus and neutralized the virus in vitro. these results demonstrate that transgenic carrot plants can be used as an efficient expression system to produce highly immunogenic viral antigens. our study may pave the way towards an edible vaccine against measles which could be complementary to the current live-attenuated vaccine. the development of genetic transformation technology has allowed the expression of foreign genes in an increasing number of plant species. the use of plants for the production of foreign antigen proteins that could serve as experimental immunogens was first reported in the early 1990s (cardineau and curtis, 1990; mason et al., 1992) . since then, a number of viral and bacterial antigens have been expressed in a variety of plant species (mcgarvey et al., 1995; thanavala et al., 1995; carrillo et al., 1998; gomez et al., 1998; modelska et al., 1998; tacket et al., 1998; wigdorovitz et al., 1999) . despite differences in post-translational processing, viral and bacterial antigens preserved their immunogenic properties when produced in plants and induced cross-reactive and sometimes neutralizing and protective antibodies. plants could therefore be an inexpensive source of antigens that could be easily purified for parenteral inoculation (thanavala et al., 1995; gomez et al., 1998) . moreover, oral ingestion of plants expressing high levels of antigens bears the potential of edible vaccines (kong et al., 2001) . strategies based on potent mucosal adjuvants such as cholera toxin and heat-labile enterotoxin of escherichia coli (haq et al., 1995; arakawa et al., 1998a, b) may pave the road for oral immunization. research on edible plant vaccines has been carried out mostly in plant species largely inappropriate for human consumption (e.g. 
tobacco, nicotiana tabacum, thanavala et al., 1995; mason et al., 1996; huang et al., 2001; nicotiana benthamiana, modelska et al., 1998; arabidopsis thaliana, carrillo et al., 1998; gomez et al., 1998) . for most proteins of human pathogens expressed in edible plants, potatoes were used, which are normally boiled before consumption (mason et al., 1996; arakawa et al., 1998a; richter et al., 2000; kong et al., 2001) . it can be anticipated that many antigens would not resist cooking without being denatured and that cooked plant material is less immunogenic than raw plants (kong et al., 2001) . therefore, there is a need to develop other and more appropriate transgenic plant species that can serve as edible vaccines. the technology for creating other transgenic edible plants, including fruits and vegetables, has been further perfected (schenk et al., 2001; brodzik et al., 2000) . we chose to transform carrots, which can be grown in most parts of the world, can be eaten both raw and cooked, and are part of the early diet of infants. current live-attenuated measles vaccines are given routinely at 9 to 15 months of age. after a single injection, seroconversion rates are high, complications are rare and protection is long-lasting. these advantages are difficult to match by other experimental measles vaccines. however, after 25 years 50% of vaccinees are thought to have lost protective levels of antibodies (mossong et al., 1999) . revaccination with an oral vaccine which would boost the residual immunity would be a preferred strategy since it can be self-administered, requires less training of health workers and avoids the risks associated with needle injections. the potential of a parenteral/oral prime-booster schedule has been demonstrated for a number of pathogens (kong et al., 2001; mantis et al., 2001) . such a schedule is probably less liable to problems of weak and variable responses after oral vaccination. 
a strategy based on oral vaccination could be particularly useful for large-scale booster immunization in developing countries where the need to deliver parenteral vaccines may hamper eradication efforts. measles is caused by a paramyxovirus (mv) which projects two glycoproteins, the hemagglutinin (h) and the fusion protein, from the outer viral envelope. the h protein is responsible for the attachment of the virus to the host cell (naniche et al., 1993) , whereas the fusion protein is directly involved in the fusion of viral and target cell membranes required for the penetration of the virus (wild et al., 1991) . virus-neutralizing and protective antibodies are mainly directed against the hemagglutinin and, to a lesser extent, the fusion protein (mcfalin et al., 1980; giraudon and wild, 1985) . the aim of this study was (1) to explore the potential of carrots as an expression system for antigens that is suitable for human consumption, and (2) to test whether the measles virus hemagglutinin glycoprotein would preserve its neutralizing immunogenicity in this system. although some work has been done with transgenic carrot callus cells (brodzik et al., 2000) , this is one of the first reports of the expression of a transgenic antigen in mature carrots, showing that high levels of virus-neutralizing antibodies can be induced with a glycoprotein produced in this plant. the coding sequence corresponding to the measles virus hemagglutinin (mv-h) protein (bouche et al., 1998a) was subcloned into the expression cassette of the prtl2 vector at the ncoi-bamhi sites (restrepo et al., 1990) . in this vector, the mv-h sequence was under the control of the constitutively expressed cauliflower mosaic virus (camv) 35s promoter fused to the tobacco etch virus (tev) 5′-untranslated region, a translational enhancer, and the camv 35s terminator (odell et al., 1985; pietrzak et al., 1986) . 
these control sequences were flanked with hindiii restriction sites that allowed their transfer, together with the h sequence, into the t-dna region present in the binary vector pbin19 (bevan, 1984) , creating the recombinant plasmid pbin19-mvh. the t-dna region, delimited by the right and left border sequences, also contains the neomycin phosphotransferase ii gene (nptii) that provides neomycin and kanamycin resistance to transformed plants (figure 1). after subcloning of the expression cassette in pbin19 and transformation of xl1-blue cells, kanamycin-resistant colonies were picked and checked for the presence of the expression cassette by pcr and automated sequencing (373 a model, perkin elmer, netherlands). plasmid dna was then isolated from a positive clone and introduced into agrobacterium tumefaciens strain lba4404 by electroporation (25 µf, 2500 v, 400 Ω). transformed bacteria were selected on yeb-agar solid medium containing 50 µg/ml kanamycin (28 °c, 48 h) and were used for subsequent carrot transformation. the protocols of a. tumefaciens-mediated transformation of hypocotyls were modified for the production of transgenic carrot plants (hardegger and sturm, 1998; tokuji and fukuda, 1999; brodzik et al., 2000) . sterilized carrot seeds (herrera-estrella and simpson, 1988; daucus carota cv. senkou-gosun, kindly given by dr tokuji, japan) were sown in 0.8% agar in the dark at 25 °c. after 14 days hypocotyl segments were harvested in gamborg b5 liquid medium (b5) supplemented with 3% sucrose and with the phytohormone 2,4-dichlorophenoxyacetic acid (1 mg/l) for 48 h at 25 °c. after washing, the segments were placed for 5 days on phytohormone-free b5 solid medium (0.8% agar) containing 3% sucrose. the hypocotyl fragments were then transformed by immersion (2 h) in the bacterial suspension of a. tumefaciens containing the recombinant pbin19-mvh binary plasmid. 
the segments were further co-cultured in the dark at 25 °c with the bacteria on b5 solid medium with 3% sucrose. five days later, the explants were washed with b5 medium containing cefotaxime (250 mg/l) to kill remaining agrobacteria. explants were grown in darkness at 25 °c on b5 solid medium containing cefotaxime (250 mg/l) and geneticin (10 mg/l) for the selective growth of transgenic cells. after 4 weeks of selection, somatic embryos resistant to geneticin appeared on the hypocotyl segments. each hypocotyl usually gave rise to a few embryos. embryos were subcultured and rooted on selective medium at 25 °c under light. plantlets with adequate roots were transferred to potting soil and grown in a greenhouse under normal light and humidity conditions. several lines of transgenic carrots were obtained and further studied. genomic dna was isolated from both untransformed and transformed plants by macerating frozen leaves (ca. 3 g) in liquid nitrogen. the resulting powder extract was re-suspended in the extraction buffer (100 mm tris-hcl, 20 mm edta, 1.4 m nacl, 100 mm 2-mercaptoethanol and 2% n-cetyl-n,n,n-trimethylammonium bromide, ph 8.0) and incubated at 60 °c for 30 min. after a chloroform extraction, nucleic acids were precipitated with 0.7 volume of isopropanol and the pellet obtained after centrifugation was resuspended in tris-hcl (10 mm, ph 8.0)/edta (1 mm) buffer. after rnase treatment the dna was stored at −20 °c until being used. a fragment of the mv-h expression cassette (promoter-mvh-terminator) was specifically amplified with a forward primer (5′-gcaagacccttcctctatat-3′) from the 35s promoter region and a reverse primer (5′-atctgggaactactcacac-3′) from the 35s terminator region. the presence of the nptii gene was detected by pcr amplification with a pair of specific primers within the nptii gene (5′-tgctcctgccgagaaagtatc-3′ and 5′-tcctgtatcgcaaccgatgggc-3′). genomic dna of untransformed plants was used as negative control. 
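as a quick sanity check on the four primers quoted above, the following minimal python sketch reports their length and gc content and gives the reverse complement of the reverse primers. the dictionary keys (for example `35s_promoter_fwd`) are hypothetical labels introduced here for illustration; the sequences are copied as they appear in this text and have not been checked against the original publication.

```python
# primer sequences as quoted in the text; the labels are hypothetical names
PRIMERS = {
    "35s_promoter_fwd": "gcaagacccttcctctatat",
    "35s_terminator_rev": "atctgggaactactcacac",
    "nptii_fwd": "tgctcctgccgagaaagtatc",
    "nptii_rev": "tcctgtatcgcaaccgatgggc",
}

COMPLEMENT = {"a": "t", "t": "a", "g": "c", "c": "g"}


def gc_fraction(seq: str) -> float:
    """Return the fraction of g and c bases in a dna sequence."""
    seq = seq.lower()
    return sum(base in "gc" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a dna sequence (5' to 3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.lower()))


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, gc = {gc_fraction(seq):.2f}, "
              f"revcomp = {reverse_complement(seq)}")
```

the sketch only describes the primers themselves; it makes no claim about annealing temperatures or binding positions, which the text does not give.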
southern blotting with genomic dna was performed following conventional protocols. briefly, 30 µg of extracted dna was digested overnight at 37 °c with ecori which cuts the recombinant t-dna at a single position (between the 35s promoter and the tev leader). after agarose gel (0.8%) electrophoresis, digestion products were transferred overnight onto a nylon membrane (hybond+, amersham-pharmacia biotech, uk). the membrane was equilibrated with 20× sspe prior to immobilization of dna by uv cross-linking (uv stratalinker, stratagene, netherlands). for hybridization, a 32p-labelled h-specific cdna probe, generated by nick translation according to feinberg and vogelstein (1983) , was incubated with the membrane for 4 h at 65 °c. the membrane was then washed for 20 min at 65 °c successively with 0.1% sds in 5× ssc, 2× ssc and finally in 0.2× ssc. hybridized complexes were detected by autoradiography. total rna from leaves (1 g) of transformed plants was isolated as described by hughes and galau (1988) . rna from untransformed plants was used as a negative control template. reverse transcription (rt) was carried out for 2 h at 40 °c in 50 µl final reaction volume containing 5 µg rna, 500 ng random hexamers and a mixture of dntps (20 mm each), 10 mm dtt, 10 units of rnasin (promega, netherlands) and 200 units of m-mlv reverse transcriptase (superscript ii, gibco life sciences, belgium). after adding 0.1 volume of 10 mm atp, ligation was carried out by incubation for 45 min at 37 °c in the presence of 2 units of t4 dna ligase. pcr amplification was performed with two h-specific primers including the first 18 and the last 12 nucleotides of the coding sequence. the pbin19-mvh vector was used as positive control template. a pcr reaction was also performed directly on the total rna to confirm the absence of specific dna in the extract. 
plant tissues were homogenized on ice in pbs containing 5 mm edta and protease inhibitors (10 µg/ml aprotinin, 1.8 mg/ml iodoacetamide, 10 µg/ml leupeptin, 1 mm pmsf). the mixture was centrifuged at 1700 × g (10 min, 4 °c) to remove insoluble debris. the membrane fraction was sedimented by ultracentrifugation at 100 000 × g (45 min, 4 °c), and re-suspended in pbs containing 0.5% np-40. protein concentration was determined with the dc protein assay kit (biorad, belgium). one gram of wet weight of carrot plant gave about 100 µg of membrane protein. proteins of the membrane fraction were separated by 12% sds-page under reducing and denaturing conditions (100 mm dtt, 4 m urea, 2% sds). proteins were blotted onto nitrocellulose membrane in tris-glycine buffer for 2 h at 250 ma. the membrane was blocked for 1 h at room temperature with 5% non-fat dehydrated milk in pbs containing 0.1% tween-20. for detection of the mv-h protein two specific monoclonal antibodies (mabs; bh47 and bh195, 1:1000 dilution) and goat anti-mouse igg-conjugated horseradish peroxidase secondary antibodies (1:5000 dilution) were used. bound antibodies were detected by enhanced chemiluminescence (ecl kit, amersham-pharmacia biotech). to characterize recombinant h protein in leaf and root extracts, microtiter plates (maxisorb, nunc, denmark) were coated with increasing concentrations of plant extract (in pbs) and revealed with h-specific mabs (dilution 1:1000). bh47 recognizes the sequential helix-forming epitope h236-250 (fournier et al., 1997; deroo et al., 1998) ; bh67 and bh81 are conformation-dependent antibodies (unpublished); bh1 binds only denatured protein (ziegler et al., 1996) ; bh216 binds to the hemagglutinin noose epitope (hne) h386-400 (ziegler et al., 1996) . mouse sera (1:250 to 1:32000) were titrated against immobilized purified h protein (bhk-h) produced in bhk-21 cells (50 ng/well; bouche et al., 1998b) . 
the coated antigens were washed and blocked with 1% bsa in tris-buffered saline (15 mm, ph 7.4). after addition of mabs or mouse serum, alkaline phosphatase-conjugated goat anti-mouse igg (1:750; southern biotechnology association, usa) and p-nitrophenylphosphate (sigma, usa) were used for detection. optical density (od) was measured after 90 min at 405 nm. data were expressed as net od values after subtracting either the absorbance of the negative control antigen (e.g. extract of wild-type roots or leaves, or bhk-0 antigen; figure 4 ) or the absorbance of the conjugate without mouse serum (<0.150 od). h-specific antibody isotypes and subclasses were determined in mouse sera (1:500) by elisa with specific conjugates and substrate of a commercial kit (biorad). data were expressed as od at 415 nm after 30 min as recommended. groups of four spf balb/c mice were primed by intraperitoneal injection with 500 µg of leaf or root extracts from transgenic or wild-type (wt) plants or 100 µg of bhk-h or bhk-0 antigens emulsified (1:1) in freund's complete adjuvant (sigma). by comparison with the elisa signal of mammalian-expressed purified h protein, the plant extract and the bhk extract were estimated to contain about 10 and 20 µg of the specific protein, respectively. mice were boosted on days 14, 28 and 42 with the same antigen preparation emulsified in freund's incomplete adjuvant (sigma). sera were drawn 7 days after boosting and were tested as pooled sera (after the second boost) or as individual sera (after the third boost). the reactivity of immune sera (1:50) with native h protein or mv was tested by flow cytometry as described before using permanently h-transfected mel-juso cells expressing the recombinant protein at their surface (mel-juso/h, gift of r. de swart, rotterdam, netherlands; de swart et al., 1997) or mv-superinfected ebv-transformed human b cell line (mv-wmpt; gift of b. chain, london, uk; muller et al., 1995) . 
for both assays wt or uninfected cells (mel-juso/wt or wmpt) were used as negative control cells. cells were incubated on ice (30 min) with diluted serum of individual mice. after washing, fitc-conjugated goat anti-mouse fc-specific antibody (1:200, sigma) was used for detection. fitc-conjugate alone, naïve serum on positive and negative cells or test serum on negative cells served as negative controls. bh26 (1:500; ziegler et al., 1996) served as an antibody positive control. dead cells were excluded by propidium iodide staining (1 µg/ml). to test mv-neutralizing activity, mouse sera were heated for 30 min at 56 °c to inactivate complement. duplicates of two-fold serial dilutions (1:64 to 1:32768) of heat-inactivated serum were mixed with an equal volume of medium containing 35 plaque-forming units of edmonston strain mv. after 2.5 h of pre-incubation at 37 °c, the mixture was added to a subconfluent vero cell culture in 24-well plates (2.5 × 10^5 cells per well). after one hour unbound virus was removed and cells were covered with a carboxymethylcellulose (4%) overlay. after four days of incubation under tissue culture conditions, a 0.2% neutral red solution was added. on day 5, the overlay was removed and cells were fixed with a 10% formalin solution. the 50% neutralization titre, defined as the reciprocal of the dilution that reduces the number of plaques by 50%, was calculated according to the spearman-kärber method. titres <64 were considered to be negative. the transformation of carrot plants was mediated by recombinant agrobacterium infection using the pbin19-mvh plasmid (figure 1 ). the regenerated transgenic plants showed no morphological changes in comparison to wt carrot plants (data not shown). about 30 plants resulting from independent transformation events were selected and grown in the greenhouse. ten of them were analysed further. 
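the methods above name the spearman-kärber method for the 50% neutralization titre but do not spell out the calculation. the sketch below is a textbook form of the estimator for equally spaced log dilutions, offered as an assumption about how such a titre can be computed rather than as the authors' exact procedure. the function name and the example neutralization fractions are hypothetical; only the dilution range (1:64 to 1:32768, two-fold steps) comes from the text.

```python
import math


def spearman_karber_titre(reciprocal_dilutions, fraction_neutralized):
    """Estimate the reciprocal dilution giving 50% plaque reduction.

    reciprocal_dilutions: increasing values such as 64, 128, ..., with a
        constant fold step between neighbours.
    fraction_neutralized: fraction of plaques neutralized at each dilution
        (1.0 = no plaques left, 0.0 = same plaque count as virus control).
        the series must start at 1.0 and end at 0.0.
    """
    if len(reciprocal_dilutions) != len(fraction_neutralized):
        raise ValueError("inputs must have the same length")
    if len(reciprocal_dilutions) < 2:
        raise ValueError("need at least two dilutions")
    logs = [math.log2(d) for d in reciprocal_dilutions]
    steps = [b - a for a, b in zip(logs, logs[1:])]
    step = steps[0]
    if step <= 0 or any(abs(s - step) > 1e-9 for s in steps):
        raise ValueError("dilutions must increase by a constant fold step")
    if fraction_neutralized[0] != 1.0 or fraction_neutralized[-1] != 0.0:
        raise ValueError("series must span 100% to 0% neutralization")
    # spearman-kaerber estimate on the log2 scale for equal spacing:
    # m = x_first - step/2 + step * sum(p_i)
    log_titre = logs[0] - step / 2 + step * sum(fraction_neutralized)
    return 2 ** log_titre


if __name__ == "__main__":
    # two-fold series from 1:64 to 1:32768, as described in the methods
    dilutions = [64 * 2 ** i for i in range(10)]
    # hypothetical neutralization fractions, for illustration only
    fractions = [1.0, 1.0, 1.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
    print(spearman_karber_titre(dilutions, fractions))
```

in practice the fractions would come from plaque counts relative to the virus-only control, averaged over the duplicate wells; sera that never reach full neutralization at 1:64 fall outside this simple form and would be reported as <64, as in the text.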
the presence of the mv-h expression cassette in transgenic plants was confirmed by pcr followed by gel electrophoresis of the amplified fragments (figure 2a) . from all transformed plants tested a product of the expected size (2.2 kb) was amplified (lanes 3-12). the same size was obtained with pbin19-mvh vector as a positive control template (lane 1). this product was absent in dna of untransformed plants (lane 2). similarly, the presence of the nptii gene was confirmed as a specific product of 360 bp in all geneticin-resistant plants (figure 2b, lanes 3-12) . this product was absent from untransformed plants (lane 2). copy numbers of the transgene and the integration pattern were determined by southern blot by digesting genomic dna of the 10 transformed plants (figure 2c) with ecori (which has a unique restriction site in the t-dna, see figure 1 ). hybridization with a mv-h-specific probe showed in every transgenic plant a unique restriction pattern (lanes 3-12) indicating that insertions occurred at random sites throughout the genome. all plants appeared to have integrated a single copy of the transgene, as illustrated by the presence of a single hybridization band. the insert corresponding to the radioactive probe was used as positive control template (lane 1). no signal appeared when genomic dna of wt plants was used as a template (lane 2). the transcription of the mv-h gene was analysed by rt-pcr on total rna extracted from transgenic plants (figure 3a). untransformed plants served as a negative control (lane 3). all tested plants produced a specific major transcript of the expected size (lanes 4-13) . the minor unspecific, smaller band was also found when other transgenes were tested. no amplified dna was detected when the pcr was directly performed on the rna preparations, confirming the rna-specificity of the reaction (lane 2). 
crude membrane preparations of leaves (500 ng protein) and of bhk-h cells (50 ng protein) gave specific bands of similar intensities in western blots (figure 3b). based on elisa with purified h protein, the specific protein was estimated to account for about 2% and 20% of the protein in the membrane fractions of leaves and bhk-h cells, respectively. the h-specific monoclonal antibodies bh47 and bh195 that bind to two non-overlapping epitopes revealed a strong band under reducing conditions with an estimated molecular mass of about 68 kda (lane 3). under the same conditions, h protein produced in mammalian cells (bhk-h) migrated with an apparent size of ca. 74 kda as reported earlier (lane 1; bouche et al., 1998b) . this difference in mass may correspond to different levels of glycosylation of the monomer, but this was not further explored (vialard et al., 1990) . negative controls included an irrelevant mab (lanes 4-6) as well as a crude membrane preparation of wt leaves (lane 2). the western blot suggested that there may be differences in glycosylation. glycosylation is well known to be important for the proper folding of the h protein (hu et al., 1994) . several monoclonal antibodies were used to investigate the conformational integrity of the transgenic protein (figure 4a). microtitre plates were coated with increasing concentrations (62.5-1000 ng/well) of membrane fractions from both transgenic leaves and roots. high net signals were obtained with two conformation-dependent mabs (bh67 and bh81), and with bh47, which recognizes the sequential epitope h236-250 (fournier et al., 1997) with a putative helical conformation (deroo et al., 1998) . in contrast, bh1, which binds denatured protein only, showed essentially no reactivity above the background of wt plants. bh216 binds to the hemagglutinin noose epitope (hne, h386-400) with its oxidized cysteine bridge (ziegler et al., 1996) . 
the antibody binding pattern was essentially the same in root and leaf extracts and resembled that of purified h protein of mammalian origin (figure 4b). in general, signals tended to be higher in younger than in older tissues (data not shown). since the transgenic h protein seemed to be antigenically conserved, its immunogenicity was tested in mice. after 2 or 3 boosts with transgenic plant extracts all animals showed reactivity with purified h protein produced in mammalian cells (figure 5a). after the third boost, average antibody levels obtained with leaf extracts were about 4 times higher than after immunization with root extracts. similar levels were found after boost 2 for leaves, whereas antibodies still increased between boost 2 and 3 with root extracts. sera of control mice immunized with wt plant extract showed no cross-reactivity with the h protein (net od <0.03; data not shown). interestingly, antibodies generated with the transgenic leaves were of both the igg1 and igg2a subclass, whereas the immune response against the h protein produced in bhk-21 cells was essentially restricted to igg1, suggesting a difference in the th1/th2 balance (figure 5b). no h-specific igg2b, igg3, iga and igm were detected. to exclude that the reactivity may be due to partially denatured recombinant protein, the sera were further tested for antibodies against the intact native protein expressed on mv-superinfected wmpt cells (figure 6a) and h-transfected mel-juso cells (figure 6b ). the flow cytometry data showed that all mice vaccinated with transgenic leaf or root extracts produced high levels of antibodies cross-reacting with the native protein, independently of whether virus-infected or h-protein-transfected cells were used. antibody levels were similar to those of mice immunized with h protein produced in mammalian cells, while wt plants induced no cross-reactive antibodies. 
in addition to a strong virus cross-reactivity, all sera showed high levels of neutralizing antibodies in a standard plaque reduction neutralization assay. in the group immunized with leaf extracts, mean neutralizing titres of 10 700 (range 5500-22 500) were observed, while root extracts induced significantly lower titres (3000; range 800-7400). all sera from mice given extracts of untransformed plants had titres <64 and were considered negative. in general, both cross-reactive and neutralizing titres were higher after immunization with leaf extracts than with root extracts. this could reflect a difference in expression or in the yield of the extraction procedure. although edible vaccines seem to be feasible, very few antigens have been expressed in plants fit for human consumption (modelska et al., 1998; kapusta et al., 1999; sandhu et al., 2000) . for instance, potatoes have been used to express antigens of human pathogens (mason et al., 1996; arakawa et al., 1998a, b; richter et al., 2000; kong et al., 2001) . however, raw potatoes are not very appealing, and cooking can drastically reduce the immunogenicity of the vaccine (kong et al., 2001) . this is the first report of the expression of an antigenic protein as a transgene in mature carrots (which can be eaten raw by human beings) and of the antigenic and immunogenic properties of the heterologous protein in mice. genomic integration of the mv-h gene under the camv 35s double promoter was obtained using agrobacterium tumefaciens. southern blot analysis showed that all regenerated transgenic plants analysed integrated a single copy of the transgene and that integration occurred randomly in the carrot genome. in all clones a specific transcript of the expected size could be amplified by rt-pcr. under the fluorescence microscope, a fusion protein of mv-h with the green fluorescent protein (gfp) expressed in tobacco protoplasts showed a predominant association with the plasma membrane (data not shown). 
the membrane fraction of carrot cells produced a strong signal in western blot corresponding to an estimated 2 µg of specific protein per gram of wet weight. interestingly, the transgenic protein migrated faster in sds-page than the same protein produced in mammalian cells, suggesting a size difference of about 6 kda. a similar observation was made when the rabies virus glycoprotein g was expressed in plants (mcgarvey et al., 1995). the expression of the c-terminally fused gfp demonstrates that the h protein is fully translated and that the difference in apparent size is most probably due to post-translational modifications. in the virus, the edmonston strain h protein undergoes glycosylation at 4 of the 5 predicted n-glycosylation sites (hu et al., 1994). it has been shown that glycosylation of mammalian proteins is efficient in plants, but normally different carbohydrate side chains are utilized (bardor et al., 1999). the complex glycans of plants are often smaller than those of animals, partially because they lack sialic acid (faye et al., 1993). in insect cells, which also lack sialic acid, the reduced size of the recombinant h protein was explained by a difference in glycosylation (vialard et al., 1990). authentic post-translational modifications such as glycosylation and cystine-bridge formation are thought to be important for intracellular trafficking (hu and norrby, 1995) as well as the antigenic conformation of the h protein (hu et al., 1994; hu and norrby, 1995). recently, huang et al. (2001) reported the expression of the h protein in tobacco leaves. in this system, the native h protein was undetectable. after adding a retention signal for the endoplasmic reticulum, low levels of protein became detectable by elisa with polyclonal sera from rabbits and man; the reactivity with mabs was even weaker or negative, suggesting that the conformation may have been less than optimal.
the immunogenic integrity of the h protein is critical for the induction of antibodies that protect against virus infection and disease. conformational dependent mabs confirmed the proper antigenic structure of the transgenic protein produced in carrots; while mab bh1 which recognizes only denatured protein showed no reactivity. after immunization with extracts of transgenic carrots high levels of virus-neutralizing antibodies were found, using the who recommended plaque neutralization assay (miller et al., 1995) . these titres were comparable to those obtained with the mammalian cell-derived h protein extract. the antibody isotype subclasses suggested that in contrast to the h protein of mammalian origin, which produced primarily a th2 (or antibody-dominated) response, the plant protein induced a th1/th2-balanced response, indicative of both a humoral and cellular response. this may be important to safeguard against atypical measles when such a vaccine is used in unprimed individuals. in many countries carrots are components of the diet of both adults and children and a stable antigenic transgene in an edible plant that can be consumed raw without further processing could bring a vaccine within reach of the most destitute. although plants can be used as efficient bioreactors for producing antigens, the full potential of plant-based vaccines becomes apparent when immunogenicity can be demonstrated after oral ingestion. a frequent problem of plant-based vaccines is that the response after oral delivery is inconsistent and variable. sometimes even low levels of antigen can give an effective response after oral administration in mice (mason et al., 1996; wigdorovitz et al., 1999) and man (kapusta et al., 1999) . in other cases mucosal immunity was improved by mucosal adjuvants (arakawa et al., 1998b; kong et al., 2001) . in the study by huang et al. 
(2001), the immune response after oral administration of transgenic tobacco was very weak even when exceedingly high levels (3 times 1 mg) of ctb were co-administered. however, when experimental oral vaccines were given after parenteral priming, results were more consistent than after oral immunization alone (kong et al., 2001; mantis et al., 2001). therefore, a prime booster schedule with an oral vaccine could be an attractive alternative when immunity to the current (injectable) live-attenuated measles vaccine wanes in adults. further studies are required to confirm the efficacy of the carrot plant-expressed measles antigen in such a scenario.
efficacy of a food plant based cholera toxin b subunit vaccine
a plant-based cholera toxin b subunit-insulin fusion protein protects against the development of autoimmune diabetes
analysis of the n-glycosylation of recombinant glycoproteins produced in transgenic plants
binary agrobacterium vectors for plant transformation
immunosorbent assay based on recombinant hemagglutinin protein produced in a high-efficiency mammalian expression system for surveillance of measles immunity
a simplified immunoassay based on measles virus recombinant hemagglutinin protein for testing the immune status of vaccinees
transgenic plants as a potential source of an oral vaccine against helicobacter pylori
oral immunization by transgenic plants
protective immune response to foot-and-mouth disease virus with vp1 expressed in transgenic plants
measles virus fusion protein- and hemagglutinin-transfected cell lines are sensitive tool for the detection of specific antibodies by facs-measured immunofluorescence assay
enhanced antigenicity of a four-contact-residue epitope of the measles virus hemagglutinin protein by phage display libraries: evidence of a helical structure in the putative active site
detection, biosynthesis and some functions of glycans n-linked to plant secreted proteins
a technique for radiolabeling dna restriction endonuclease fragments to high specific activity
antibodies to a new linear site at the topographical or functional interface between the haemagglutinin and fusion proteins protect against measles encephalitis
correlation between epitopes on hemagglutinin of measles virus and biological activities: passive protection by monoclonal antibodies is related to their hemagglutination inhibiting activity
expression of immunogenic glycoprotein s polypeptides from transmissible gastroenteritis coronavirus in transgenic plants
oral immunization with a recombinant bacterial antigen produced in transgenic plants
transformation and regeneration of carrot
foreign gene expression in plants
role of individual cysteine residues in the processing and antigenicity of measles virus hemagglutinin protein
role of n-linked oligosaccharide chains in the processing and the antigenicity of the measles virus haemagglutinin protein
plant-derived measles virus hemagglutinin protein induces neutralizing antibodies in mice
preparation of rna from cotton leaves and pollen
a plant-derived edible vaccine against hepatitis b virus
oral immunization with hepatitis b surface antigen expressed in transgenic plants
immunization of mice with recombinant gp41 in a systemic prime/mucosal boost protocol induces hiv-1-specific serum igg and secretory iga antibodies
expression of hepatitis b surface antigen in transgenic plants
expression of norwalk virus capsid protein in transgenic tobacco and potato and its oral immunogenicity in mice
monospecific antibody to the haemagglutinin of measles virus
expression of the rabies virus glycoprotein in transgenic tomatoes
antibodies to measles, mumps, and rubella in uk children 4 years after vaccination with different mmr vaccines
immunization against rabies with plant-derived antigen
modeling the impact of subclinical measles transmission in vaccinated populations with waning immunity
cholera toxin b stimulates systemic neutralizing antibodies after intranasal co-immunization with measles virus
human membrane cofactor protein (cd46) acts as a cellular receptor for measles virus
identification of dna sequences required for activity of a plant promoter: the camv 35s promoter
expression in plants of two bacterial antibiotic resistance genes after protoplast transformation with a new plant expression vector
nuclear transport of plant potyviral proteins
production of hepatitis b surface antigen in transgenic plants for oral immunization
oral immunization in mice with transgenic tomato fruit expressing respiratory syncytial virus-f protein induces a systemic immune response
promoters for pregenomic rna of banana streak badnavirus are active for transgene expression in monocot and dicot plants
immunogenicity in humans of a recombinant bacterial antigen delivered in a transgenic potato
human immune responses to a novel norwalk virus vaccine delivered in transgenic potatoes
immunogenicity of transgenic plant-derived hepatitis b surface antigen
a rapid method for transformation of carrot (daucus carota l.) by using direct somatic embryogenesis
synthesis of the membrane fusion and hemagglutinin proteins of measles virus, using a novel baculovirus vector containing the β-galactosidase gene
induction of a protective antibody response to foot and mouth disease virus in mice following oral or parenteral immunization with alfalfa transgenic plants expressing the viral structural protein vp1
measles virus: both hemagglutinin and fusion glycoproteins are required for fusion
protection against measles virus encephalitis by monoclonal antibodies binding to a cystein loop domain of the h protein mimicked by peptides which are not recognized by maternal antibodies
we are grateful to richard wagner and his team (ibmp) for taking care of the transgenic carrots. we also thank b. jérouville, s. willieme and w. ammerlaan for technical assistance.
this research was supported by a grant from the ministère de l'education nationale et de la formation professionnelle du grand-duché de luxembourg, the european union 4th framework programme (project pl970242) and the crp-santé, luxembourg. key: cord-004400-li1sc47z authors: ma, jingjiao; wu, rujuan; xu, guanlong; cheng, yuqiang; wang, zhaofei; wang, heng’an; yan, yaxian; li, jinxiang; sun, jianhe title: acetylation at k108 of the ns1 protein is important for the replication and virulence of influenza virus date: 2020-02-24 journal: vet res doi: 10.1186/s13567-020-00747-3 sha: doc_id: 4400 cord_uid: li1sc47z non-structural protein 1 (ns1) of influenza virus is a multifunctional protein that plays an important role in virus replication and virulence. in this study, an acetylation modification was identified at the k108 residue of the ns1 protein of h1n1 influenza virus. to further explore the function of the k108 acetylation modification of the ns1 protein, a deacetylation-mimic mutation (k108r) and a constant acetylation-mimic mutation (k108q) were introduced into the ns1 protein in the background of a/wsn/1933 h1n1 (wsn), resulting in two mutant viruses (wsn-ns1-108r and wsn-ns1-108q). in vitro and mouse studies showed that the deacetylation-mimic mutation k108r in the ns1 protein attenuated the replication and virulence of wsn-ns1-108r, while the constant acetylation-mimic mutant virus wsn-ns1-108q showed similar replication and pathogenicity as the wild-type wsn virus (wsn-wt). the results indicated that acetylation at k108 of the ns1 protein has an important role in the replication and virulence of influenza virus. to further explore the potential mechanism, the type i interferon (ifn-i) antagonistic activity of the three ns1 proteins (ns1-108q, ns1-108r, and ns1-wt) was compared in cells, which showed that the k108r mutation significantly attenuated the ifn-β antagonistic activity of the ns1 protein compared with ns1-wt and ns1-108q. 
both ns1-wt and ns1-108q inhibited the ifn-β response activated by rig-i card domain, mavs, tbk1, and irf3 more efficiently than the ns1-108r protein in cells. taken together, the results indicated that acetylation at ns1 k108 is important for the ifn antagonistic activity of the ns1 protein and virulence of the influenza virus. influenza virus non-structural protein 1 (ns1) is a multifunctional protein that is responsible for interacting with cellular factors to antagonize the host antiviral response during viral infection [1] . the major role of the ns1 protein is inhibition of both interferon (ifn) and ifn-stimulated proteins by different mechanisms. ns1 inhibits the transcription of type i ifn by binding the 5′ triphosphate viral double-stranded rnas generated during viral replication to prevent the recognition of viral genomic material by host pattern recognition receptors (prrs), including rig-i, dsrna-dependent protein kinase r (pkr), and 2′5′-oligoadenylate synthetase (oas)/rnase l [2] [3] [4] . ns1 can also interact directly with rig-i in the absence of rna binding to inhibit the conformational change of rig-i required for mavs activation [5] . moreover, ns1 is able to interrupt mrna maturation by inhibiting the nuclear export of host mrnas by binding to host poly(a)-binding protein ii (pabpii) and cleavage and polyadenylation specific factor 30 (cpsf30), which are required for host mrna processing, resulting in the accumulation of ifn pre-mrnas in the nucleus of infected cells [6] . 
*correspondence: lijinxiang@caas.cn; sunjhe@sjtu.edu.cn. 1 shanghai key laboratory of veterinary biotechnology, key laboratory of urban agriculture (south), ministry of agriculture, school of agriculture and biology, shanghai jiao tong university, shanghai 200240, china. 2 chengdu national agricultural science and technology center, sichuan, china.
in addition, ns1 also antagonizes the ifn signalling response by regulating other host factors, such as phosphoinositide 3-kinase (pi3k) activity, crk-like protein (crkl), and the jak-stat signalling pathway [7] [8] [9] [10] [11]. the ns1 protein, typically 202-237 aa in length depending on the strain, contains four functional regions: an rna binding domain (rbd, 1-73 aa), linker region (lr, 74-88 aa), effector domain (ed, 89-202 aa), and c-terminal "tail" (ctt, 207-237 aa) [12]. multiple basic amino acids (e.g., 35r, 38r, 41k, and 46r) in the rbd are important for rna binding activity and suppressing the activation of pkr [12, 13]. the ed plays an important role by targeting multiple host factors, such as pkr, cpsf30, and p85β (pi3k), to inhibit antiviral responses and enhance viral replication [1]. the residue 187w in the ed is important for the dimerization of the ns1 protein, and the w187r substitution impaired ns1 dimerization and attenuated the virus in vivo [14]. in addition, residues 186e, 189d, and 194v play important roles in the binding of ns1 to cleavage and polyadenylation specificity factor 30 (cpsf30), and mutations in those residues weaken the binding of ns1 to cpsf30 and impair the ability of the ns1 protein to shut off host gene expression [15, 16]. post-translational modifications, such as phosphorylation, sumoylation, and acetylation, are important for protein function.
phosphorylation at 49t, 80t, and 215t of the ns1 protein are important for interferon antagonistic activity and replication of human influenza virus [17, 18] . sumoylation at positions 219 and 221 of ns1 are crucial for host protein expression shutoff and replication of h5n1 influenza virus [19] . acetylation is an important post-translational modification that occurs in two forms [20] . one is the co/post-translational acetylation at the n α -termini of the nascent polypeptide chains [21] . the other form is acetylation of the ε-amino group of lysine, which was first recognized in histones regulating gene translation [22] and later was found in non-histone proteins [23] . the acetylation status is reversible and well balanced by lysine (k) acetyltransferases (kats) and lysine deacetylases (kdacs), which are tightly regulated to perform many cellular functions [24] . dysfunction of the acetylation machinery can inhibit protein functions and consequently lead to severe diseases [25, 26] . acetylation has been found in multiple proteins of influenza viruses. acetylation was identified in the np protein of influenza virus, and deacetylation of the np protein prevented the virus from assembling functional virus particles [27] . a histone-like sequence (histone mimic) was identified in the ns1 protein of influenza a h3n2, which contributes to suppression of the antiviral response [28] . the n-terminal acetylation of pa-x is required for the host shutoff activity of pa-x and for viral polymerase activity [29, 30] . acetylation of pa has been reported to be crucial for polymerase activity, and deacetylated pa protein restricts iav rna transcription and replication. the influenza virus haemagglutinin (ha) has three conserved cysteine residues (551, 559, and 562) at its c terminus serving as acylation sites that are essential for the formation of fusion pores and infectivity [31] . 
in the present study, an acetylation modification at k108 of the ns1 protein was identified and characterized. the results showed that the deacetylation-mimic mutation k108r in the ns1 protein attenuated the replication and virulence of the virus in vitro and in vivo. ifn-β antagonist assays indicated that the k108r mutation attenuated ifn antagonistic ability compared with the ns1-wt or ns1-108q (constant acetylation-mimic) proteins. overall, this study indicated that acetylation at k108 of the ns1 protein plays an important role in the replication and virulence of influenza virus. madin-darby canine kidney (mdck) cells were maintained in eagle's minimal essential medium (emem, hyclone, grand island, usa) with 5% foetal bovine serum (fbs, gibco, grand island, usa), l-glutamine (gibco), and 1% antibiotic (gibco). human embryonic kidney (hek) 293t cells and adenocarcinomic human alveolar basal epithelial cells (a549 cells) were maintained in dulbecco's modified eagle's medium (dmem, hyclone) supplemented with 10% fbs (gibco), l-glutamine (gibco), and 1% antibiotic (gibco). the virus strain a/wsn/1933 h1n1 (wsn), a mouse-adapted human influenza virus, was propagated and titrated in mdck cells. to identify the putative acetylation sites in influenza viral proteins, mass spectrometry was conducted with concentrated influenza virus. briefly, the h1n1 virus was propagated in mdck cells, and then a total of 100 ml of virus stock was prepared. to concentrate the virus, the collected virus was pelleted by centrifugation at 2000 g for 10 min to remove the cell debris. clarified virus supernatants were layered on a 30% (w/v) sucrose cushion and centrifuged at 200 000 g for 3 h. the virus pellet was suspended in water and subjected to mass spectrometry analysis performed by ptm biolabs llc (hangzhou, china). 
to generate mutant viruses, the site mutations k108r (deacetylation-mimic mutation) and k108q (constant acetylation-mimic) were introduced into the reverse genetic plasmid phw2000-wsn-ns using a commercial site-directed mutagenesis kit (invitrogen, grand island, usa). the mutant viruses were rescued in the background of wsn-h1n1 virus as described previously [32], resulting in wsn-ns1-108r and wsn-ns1-108q viruses, and all the mutant viruses were verified by sequencing. then, the ns1-wt, ns1-108r, and ns1-108q genes were amplified and cloned into the pcdna3.0-flag expression vector (flag-ns1-wt, flag-ns1-108r, and flag-ns1-108q). the three viruses were inoculated on monolayer mdck and a549 cells cultured in 12-well plates with multiplicities of infection (mois) of 0.001 and 0.01 for each virus, respectively. each time point was set up in triplicate, and the samples were collected at 12, 24, 36, and 48 hours post-inoculation (hpi). the supernatants were titrated on mdck cells cultured in 96-well plates following the reed and muench method to calculate tcid50/ml. the cells were collected and subjected to western blotting. briefly, the cell lysates were separated on a sodium dodecyl sulfate (sds)-polyacrylamide gel and transferred to a pvdf membrane. the membrane was blocked in pbs containing 5% skim milk and then incubated with rabbit anti-ns1 polyclonal antibody (genscript, piscataway, usa) and then with horseradish peroxidase-conjugated secondary anti-mouse antibody (thermo fisher, grand island, usa). the proteins were visualized by using an ecl kit (yeasen, shanghai, china). additionally, to determine the protein expression levels of the three ns1 expression plasmids, the flag-ns1-wt, flag-ns1-108q, and flag-ns1-108r plasmids were transfected into 293t cells. forty-eight hours post-transfection, the cells were collected and subjected to western blotting.
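the reed and muench endpoint calculation used for the tcid50 titration above can be sketched as follows. this is a minimal illustration, not the authors' script: the function name, the number of wells per dilution, the ten-fold dilution series, and the 0.1 ml inoculum per well are all assumptions, since the text does not report them.

```python
import math


def reed_muench_log10_tcid50_per_ml(log10_dilutions, infected, wells, inoculum_ml):
    """Return log10(TCID50/ml) by the Reed and Muench method.

    log10_dilutions: log10 of each dilution, most concentrated first (e.g. -3, -4, ...)
    infected: number of CPE-positive wells at each dilution
    wells: wells inoculated per dilution (assumed equal for all dilutions)
    inoculum_ml: volume added to each well, in ml (assumed value)
    """
    uninfected = [wells - i for i in infected]
    n = len(infected)
    # Cumulative infected: wells positive at this dilution or any more dilute one.
    cum_inf = [sum(infected[k:]) for k in range(n)]
    # Cumulative uninfected: wells negative at this dilution or any more concentrated one.
    cum_uninf = [sum(uninfected[:k + 1]) for k in range(n)]
    pct = [100.0 * ci / (ci + cu) for ci, cu in zip(cum_inf, cum_uninf)]

    for k in range(n - 1):
        if pct[k] >= 50.0 > pct[k + 1]:
            # Proportionate distance between the two dilutions bracketing 50%.
            pd = (pct[k] - 50.0) / (pct[k] - pct[k + 1])
            step = log10_dilutions[k] - log10_dilutions[k + 1]  # 1 for ten-fold steps
            log10_endpoint = log10_dilutions[k] - pd * step
            # Titre per inoculum is the reciprocal of the endpoint dilution; scale to 1 ml.
            return -log10_endpoint - math.log10(inoculum_ml)
    raise ValueError("the 50% endpoint is not bracketed by the dilution series")


# Hypothetical plate: 8 wells per dilution, 10^-3 to 10^-6, 0.1 ml per well.
titre = reed_muench_log10_tcid50_per_ml([-3, -4, -5, -6], [8, 6, 2, 0], 8, 0.1)
print(f"log10 TCID50/ml = {titre:.2f}")
```

the function raises an error when no pair of adjacent dilutions brackets 50%, which signals that the dilution series should be extended rather than extrapolated.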
forty-eight 4-week-old female balb/c mice were randomly allocated into four groups of 12 mice each. three groups were challenged with the indicated viruses, and one group was challenged with pbs as a control. the mice were inoculated intranasally with 10^5.5 tcid50 of virus in 50 µl of solution under light anaesthesia with co2. the mice were monitored for body weight, clinical signs, and survival each day until 14 days post-infection (dpi), and they were euthanized if they lost more than 25% of their original body weight. three mice from each group were euthanized at 3 and 5 dpi. the mouse lungs were collected for viral titration and cytokine analysis. to quantify the cytokine levels of il-1β, ifn-β, and tnf-α in mouse lungs, total rna was extracted from lung tissues, reverse-transcribed, and subjected to quantitative real-time polymerase chain reaction as described previously [33]. to detect the ifn-β antagonistic ability of ns1 proteins, 293t cells were transfected with the indicated ns1 expression plasmids (0.2 μg/well) together with a plasmid expressing firefly luciferase under the control of the ifn-β promoter (pgl-ifn-β-luc, 0.2 μg/well), the renilla luciferase-expressing plasmid prl-tk (0.07 μg/well), and the ifn-β stimulator poly(i:c) (0.2 μg/well) or a plasmid expressing the active caspase recruitment domain (card) of rig-i (pcdna-rig-i, 0.2 μg/well), pcdna-mavs (0.2 μg/well), pcdna-tbk1 (0.2 μg/well), or pcdna-irf3 (0.2 μg/well) as described previously [34]. twenty-four hours post-transfection, the cells were lysed and assayed with a dual-luciferase reporter assay kit (promega, madison, usa). mdck cells cultured on glass slides were infected with wsn-ns1-108r, wsn-ns1-108q, or wsn-wt at an moi of 1. all cells were fixed with 4% paraformaldehyde (pfa) and permeabilized with 0.1% triton x-100 in pbs at 12, 24, and 36 hpi.
to detect the ns1 proteins, the fixed cells were incubated with rabbit anti-ns1 polyclonal antibody (genscript), followed by fitc-conjugated anti-rabbit igg antibody (yeasen), and then the cells were stained with dapi. all images were obtained on a leica tcs sp2 confocal microscope (leica microsystems inc., buffalo grove, usa). the animal study was conducted in accordance with the guidelines of the animal care and use committee of shanghai jiao tong university, and the animal study protocols were approved by shanghai jiao tong university (approval no. 20181012). all data were analysed using analysis of variance (two-way anova) in graphpad prism version 5.0 (graphpad software inc., la jolla, usa); a p-value of 0.05 or less was considered significant. mass spectrometry identified one acetylation modification at position k108 of the ns1 protein of the wsn-wt virus (figure 1). to further explore whether k108 is subtype-specific, we compared the ns1 amino acid sequences of 1000 randomly selected influenza virus strains of each endemic subtype in birds and humans from genbank. most avian h9n2 (100%), h5n1 (99.9%), h7n9 (100%), human h3n2 (99.4%), and human h1n1 (99.9%, isolated before 2009) viruses contained k108 in the ns1 protein, whereas 99.9% of 2009 pandemic h1n1 viruses contained 108r in the ns1 protein. these data demonstrated that the k108 residue is relatively conserved in influenza viruses except the 2009 pandemic h1n1. since the ns1 protein is an important virulence marker and antagonist of host innate immunity, we chose to further explore the influence of acetylated k108 on virus replication and virulence in this study. to mimic deacetylated lysine, a k108r substitution was introduced into the ns1 protein, since an r substitution prevents acetylation but preserves the positive charge, and a mutant virus containing the ns1-k108r substitution was generated in the background of wsn virus (wsn-ns1-108r).
moreover, to mimic constantly acetylated lysine at k108 of the ns1 protein, a k108q substitution, which is a known acetylation mimic, was introduced into ns1, resulting in a mutant virus containing ns1-k108q (wsn-ns1-108q). to determine the effect of acetylated k108 on virus replication, mdck and a549 cells were infected with wsn-wt or one of the two mutant viruses, wsn-ns1-108r (deacetylation mimic) or wsn-ns1-108q (constant acetylation mimic), at the indicated mois to obtain multicycle growth curves. all three viruses replicated efficiently in mdck and a549 cells, and wsn-ns1-108q and wsn-wt replicated to similar levels at each time point. however, the growth of the deacetylation-mimic mutant virus wsn-ns1-108r was significantly impaired compared with that of the other two viruses in mdck and a549 cells at 36 and 48 hpi, which indicated that the acetylated k108 of ns1 is important for virus replication in vitro at the late stage of infection (figures 2a and b). similarly, ns1 protein levels were detected by western blotting. the results showed that the ns1 expression levels of wsn-ns1-108r were lower than those of wsn-wt and wsn-ns1-108q at different time points in mdck and a549 cells (figures 2c and d). all the mice infected with viruses showed clinical signs such as ruffled fur, depression, and inappetence. the mice infected with the constant acetylation-mimic wsn-ns1-108q displayed more severe clinical signs and started to show mortality earlier than wsn-wt-infected mice. both wsn-ns1-108q and wsn-wt caused 100% mortality in infected mice. however, infection with the deacetylation-mimic wsn-ns1-108r virus resulted in 80% mortality, and the mortality was delayed by 2 days compared with the other viruses, which indicated that the deacetylation-mimic k108r substitution attenuated the wsn-ns1-108r virus in mice (figures 3a and b). virus titers were slightly lower in the lungs of mice infected with wsn-ns1-108r than in the other two groups (figure 3c).
notably, significantly higher levels of ifn-β, il-1β, and tnf-α mrna were detected in wsn-ns1-108r-infected mice than in the other two groups at 3 dpi but not at 5 dpi (figures 3d-f), which indicated that wsn-ns1-108r was less efficient at inhibiting the innate immune response than wsn-ns1-108q and wsn-wt at 3 dpi in mice. to determine whether the k108r and k108q mutations affect the expression of the ns1 protein, the expression levels of flag-ns1-wt, flag-ns1-108q, and flag-ns1-108r in 293t cells were compared. the three proteins were expressed at similar levels in transfected 293t cells, which indicated that the k108r and k108q mutations did not influence protein expression (figure 4e). the major function of the ns1 protein is inhibition of type i ifn induction, and the acetylated k108 residue is located in the effector domain of ns1, which is important for its ifn antagonistic ability. to determine the effect of the acetylated k108 residue on ifn suppression by the ns1 protein, the inhibition of ifn-β promoter activity by flag-ns1-wt, flag-ns1-108q, and flag-ns1-108r was evaluated. the results showed that flag-ns1-wt and the acetylation-mimic flag-ns1-108q suppressed the ifn-β promoter activity stimulated by poly(i:c) (figure 4a); however, the deacetylation-mimic flag-ns1-108r protein was significantly less capable of inhibiting the activation of the ifn-β promoter than flag-ns1-wt and flag-ns1-108q (figure 4a). this result indicated that the impaired ifn-β antagonistic ability might be responsible for the attenuation of the wsn-ns1-108r virus in vitro and in vivo. influenza virus infection stimulates type i ifn production by signal transduction from rig-i to tbk1 to irf3.
to further explore how the k108r mutation attenuated the ifn-β antagonistic ability of the ns1 protein, we co-transfected ns1 expression plasmids, an ifn-β reporter plasmid and different type i interferon pathway components, including rig-i card, tbk1, and the active form of irf3, into 293t cells. the results showed that ns1-108q and ns1-wt inhibited the ifn-β response stimulated by each component more efficiently than ns1-108r, which suggested that the acetylated k108 residue is important for ns1-mediated inhibition of the ifn-β response, which may target factors downstream of irf3 or other proteins (figures 4b-d). two nuclear localization signals (35-41 and 216-221) have been identified in the ns1 protein, which drive ns1 to the nucleus during the early stage of infection. to determine whether k108r changes the subcellular localization of ns1 during infection, mdck cells were infected with the three viruses. the ns1 protein of wsn-wt was located in the nucleus and cytoplasm of infected cells at 12 and 24 hpi (figures 5a and b). at the late stage of infection (36 hpi), the ns1 protein was mainly located in the nucleus and perinuclear area of infected cells (figure 5c). the ns1-108q protein was located in both the nucleus and cytoplasm at 12 hpi and 24 hpi, while it was mainly located in the perinuclear area of infected cells at 36 hpi. in contrast, the ns1-108r protein accumulated mostly in the cytoplasm during the whole infection course. this result indicated that the deacetylation-mimic k108r substitution retained ns1 protein in the cytoplasm of infected cells, suggesting that the acetylated k108 residue is important for the nuclear localization of the ns1 protein (figures 5a-c). post-translational modification is important for protein function, stability, cellular localization, and protein-protein interactions.
recent studies have shown that post-translational modifications of viral proteins modulate the virus life cycle, e.g., phosphorylation of influenza viral proteins (ns1, m1, and np) plays important roles in virus replication [35] [36] [37] [38]. the ubiquitination of np and m2 proteins is crucial for viral rna replication and the production of infectious virus particles [37]. acetylation is an important post-translational modification in eukaryotes, but the occurrence and function of acetylation in influenza viral proteins remain largely unclear. giese et al. reported that acetylation of k77, k113, and k229 of the np protein is important for virus polymerase activity and replication [27]. in the present study, we identified and characterized the acetylation of k108 in the ns1 protein, and the deacetylation-mimic k108r substitution impaired viral replication in cells at 36 and 48 hpi. the expression levels of ns1-k108q and ns1-k108r in transfected 293t cells were similar (figure 4e), while the ns1 levels in infected mdck and a549 cells were different (figures 2c and d). this result could be attributed to the deacetylation mimic affecting virus replication, which resulted in low expression of ns1 in infected cells. moreover, the acetylation of k108 contributes to the ifn antagonistic ability of the ns1 protein. the mrna levels of ifn-β, il-1β, and tnf-α in mouse lungs of the wsn-ns1-108r-infected group were significantly higher than those in the other two groups at 3 dpi, which indicated that ns1-108r was less efficient at inhibiting the production of innate antiviral cytokines at 3 dpi in mice. the ns1 protein of influenza virus is a virulence factor that inhibits the antiviral immunity of the infected host, and c-terminal truncation has been widely used as a strategy to generate attenuated virus vaccine candidates [39].
one mechanism used by ns1 to inhibit the ifn response is through direct binding and sequestration of rna as well as direct interaction with trim25 and complex formation with the rna sensor rig-i, resulting in inhibition of the activation of the rig-i card and hence inhibition of irf3 activation [12]. the rna binding, rig-i and trim25 interacting domains are located in the n-terminus (1-73 aa) of the ns1 protein. however, the acetylated k108 is located in the ed, and acetylation of k108 may not affect the rna binding capability of the ns1 protein. the ns1-k108r substitution impaired the suppression of ifn promoter activation by poly(i:c), rig-i card, tbk1, and irf3, which suggested that the ns1-k108r substitution affected the ifn antagonism of ns1 either by targeting a step downstream of irf3 or through a general mechanism that ns1 uses to inhibit ifn, such as interaction with cpsf30, resulting in inhibition of the processing of mrna, including ifn mrna [6]. notably, the cpsf30 protein is mainly located in the nucleus and is required for the 3′ end processing of all host pre-mrnas. interestingly, the deacetylation-mimic k108r substitution retained ns1 protein in the cytoplasm of infected cells, possibly impairing the interaction between cpsf30 and the ns1 protein and subsequently attenuating ifn antagonism. benjamin hale and colleagues found that the a/california/04/2009 (h1n1) virus has 108r in ns1 and that this ns1 was unable to suppress general host gene expression. however, the r108k substitution in the ns1/2009 protein restored its ability to block general gene expression and bind cpsf30 [40]. this could explain why attenuation of the ifn antagonistic ability of ns1-k108r is independent of rig-i card, tbk1, and irf3 activation. in addition, anastasina et al.
[41] reported that the ns1 protein binds to cellular dna to block the cellular transcription of ifns and isgs; thus, ns1 proteins retained in the cytoplasm lose their cellular dna binding function, resulting in impaired ifn-β antagonistic ability. two known nuclear localization sequences (nlss) of ns1 proteins are located at the 35-41 and 219-232 positions. the 35-41 nls of the ns1 protein is highly conserved among influenza a virus strains [42] . the second nls (219-232) is virus strain specific, and the 2009 pandemic h1n1 lacks the second nls in the ns1 protein [43] . single point mutations, either r35a, r38a, or k41a, completely eliminated importin protein binding, which transports target proteins to the nucleus [42] . notably, in this study, acetylation of k108 located outside of the nls affected the cellular localization of ns1 protein, and the deacetylation-mimic k108r substitution blocked the nuclear localization of the ns1 protein in infected cells; however, the underlying mechanism remains unknown. interestingly, the ns1-k108 residue is relatively conserved in most influenza viruses, except for the 2009 pandemic h1n1. the 2009 pandemic h1n1 has 108r in ns1, which causes inefficient general host gene expression shutoff, while r108k restores its ability to block general host genes and bind cpsf30 [40] . potentially, the 2009 pandemic h1n1 virus might use different strategies to overcome the ifn response compared with the other influenza viruses. overall, we identified an acetylation of k108 of the ns1 protein of influenza virus, and the acetylation of k108 plays an important role in the cellular localization, ifn antagonistic ability, replication, and virulence of influenza virus. 
conformational plasticity of the influenza a virus ns1 protein inhibition of retinoic acid-inducible gene i-mediated induction of beta interferon by the ns1 protein of influenza a virus rig-i-mediated antiviral responses to single-stranded rna bearing 5′-phosphates immunogenicity and protection efficacy of replication-deficient influenza a viruses with altered ns1 genes structural basis for a novel interaction between the ns1 protein derived from the 1918 influenza virus and rig-i structural basis for suppression of a host antiviral response by influenza a virus influenza virus non-structural protein 1 (ns1) disrupts interferon signaling a site on the influenza a virus ns1 protein mediates both inhibition of pkr activation and temporal regulation of viral rna synthesis the primary function of rna binding by the influenza a virus ns1 protein in infected cells: inhibiting the 2′-5′ oligo (a) synthetase/rnase l pathway influenza a virus inhibits type i ifn signaling via nf-kappabdependent induction of socs-3 expression influenza a virus abrogates ifn-gamma response in respiratory epithelial cells by disruption of the jak/stat pathway the multifunctional ns1 protein of influenza a viruses influenza a virus virulence depends on two amino acids in the n-terminal domain of its ns1 protein to facilitate inhibition of the rnadependent protein kinase pkr contribution of ns1 effector domain dimerization to influenza a virus replication and virulence ns1 protein amino acid changes d189n and v194i affect interferon responses, thermosensitivity, and virulence of circulating h3n2 human influenza a viruses the k186e amino acid substitution in the canine influenza virus h3n8 ns1 protein restores its ability to inhibit host gene expression roles of the phosphorylation of specific serines and threonines in the ns1 protein of human influenza a viruses threonine 80 phosphorylation of non-structural protein 1 regulates the replication of influenza a virus by reducing the binding 
affinity with rig-i modification of nonstructural protein 1 of influenza a virus by sumo1 50 years of protein acetylation: from gene regulation to epigenetics, metabolism and beyond proteomics analyses reveal the evolutionary conservation and divergence of n-terminal acetyltransferases from yeast and humans acetylation and methylation of histones and their possible role in the regulation of rna synthesis acetylation and deacetylation of non-histone proteins the world of protein acetylation genetic dissection of histone deacetylase requirement in tumor cells the many roles of histone deacetylases in development and physiology: implications for disease and therapy role of influenza a virus np acetylation on viral growth and replication suppression of the antiviral response by an influenza histone mimic n-terminal acetylation by natb is required for the shutoff activity of influenza a virus pa-x hdac6 restricts influenza a virus by deacetylation of the rna polymerase pa subunit acylation-mediated membrane anchoring of avian influenza virus hemagglutinin is essential for fusion pore formation and virus infectivity analysis of recombinant h7n9 wild-type and mutant viruses in pigs shows that the q226l mutation in ha is important for transmission quantification of murine cytokine mrnas using real time quantitative reverse transcriptase pcr herpes simplex virus 1 ubiquitinspecific protease ul36 inhibits beta interferon production by deubiquitinating traf3 mapping the phosphoproteome of influenza a and b viruses by mass spectrometry effects of the s42 residue of the h1n1 swine influenza virus ns1 protein on interferon responses and virus replication ubiquitination of the cytoplasmic domain of influenza a virus m2 protein is crucial for production of infectious virus particles phosphorylation and dephosphorylation of threonine 188 in nucleoprotein is crucial for the replication of influenza a virus attenuation of the virulence of a recombinant influenza virus expressing the 
naturally truncated ns gene from an h3n8 equine influenza virus in mice inefficient control of host gene expression by the 2009 pandemic h1n1 influenza a virus ns1 protein influenza virus ns1 protein binds cellular dna to block transcription of antiviral genes nuclear and nucleolar targeting of influenza a virus ns1 protein: striking differences between different virus subtypes influenza a h3n2 subtype virus ns1 protein targets into the nucleus and binds primarily via its c-terminal nls2/nols to nucleolin and fibrillarin this study was supported by the national natural science authors' contributions jm, jl, and js designed the study; jm was involved in the acquisition of data, analysis, and figure preparation; rw, gx, yc, and zw contributed to some of the laboratory experiments and data analysis; hw and yy helped revise the manuscript; jl and js supervised the study; jm drafted the original paper. all authors read and approved the final manuscript. the authors declare that they have no competing interests. key: cord-103320-2rpr7aph authors: bhandari, bikash k.; gardner, paul p.; lim, chun shen title: solubility-weighted index: fast and accurate prediction of protein solubility date: 2020-03-26 journal: biorxiv doi: 10.1101/2020.02.15.951012 sha: doc_id: 103320 cord_uid: 2rpr7aph motivation recombinant protein production is a widely used technique in the biotechnology and biomedical industries, yet only a quarter of target proteins are soluble and can therefore be purified. results we have discovered that global structural flexibility, which can be modeled by normalised b-factors, accurately predicts the solubility of 12,216 recombinant proteins expressed in escherichia coli. we have optimised b-factors, and derived a new set of values for solubility scoring that further improves prediction accuracy. we call this new predictor the ‘solubility-weighted index’ (swi). importantly, swi outperforms many existing protein solubility prediction tools. 
furthermore, we have developed ‘sodope’ (soluble domain for protein expression), a web interface that allows users to choose a protein region of interest for predicting and maximising both protein expression and solubility. availability the sodope web server and source code are freely available at https://tisigner.com/sodope and https://github.com/gardner-binflab/tisigner-reactjs, respectively. the code and data for reproducing our analysis can be found at https://github.com/gardner-binflab/sodope_paper2020. high levels of protein expression and solubility are two major requirements of successful recombinant protein production (esposito and chatterjee 2006) . however, recombinant protein production is a challenging process. almost half of recombinant proteins fail to be expressed and half of the successfully expressed proteins are insoluble ( http://targetdb.rcsb.org/metrics/ ). these failures hamper protein research, with particular implications for structural, functional and pharmaceutical studies that require soluble and concentrated protein solutions (kramer et al. 2012; hou et al. 2018) . therefore, solubility prediction and protein engineering for enhanced solubility is an active area of research. notable protein engineering approaches include mutagenesis, truncation (i.e., expression of partial protein sequences), or fusion with a solubility-enhancing tag (waldo 2003; esposito and chatterjee 2006; trevino, martin scholtz, and nick pace 2007; chan et al. 2010; kramer et al. 2012; costa et al. 2014 ) . protein solubility, at least in part, depends upon extrinsic factors such as ionic strength, temperature and ph, as well as intrinsic factors-the physicochemical properties of the protein sequence and structure, including molecular weight, amino acid composition, hydrophobicity, aromaticity, isoelectric point, structural propensities and the polarity of surface residues (wilkinson and harrison 1991; chiti et al. 2003; tartaglia et al. 2004; diaz et al. 2010) . 
many solubility prediction tools have been developed around these features using statistical models (e.g., linear and logistic regression) or other machine learning models (e.g., support vector machines and neural networks) (hirose and noguchi 2013; habibi et al. 2014; hebditch et al. 2017; sormanni et al. 2017; heckmann et al. 2018; z. wu et al. 2019; yang, wu, and arnold 2019) . in this study, we investigated the experimental outcomes of 12,216 recombinant proteins expressed in escherichia coli from the 'protein structure initiative:biology' (psi:biology) (chen et al. 2004; acton et al. 2005) . we showed that protein structural flexibility is more accurate than other protein sequence properties in predicting solubility (craveur et al. 2015; m. vihinen, torkkila, and riikonen 1994) . flexibility is a standard feature that appears to have been overlooked in previous solubility prediction attempts. on this basis, we derived a set of 20 values for the standard amino acid residues and used them to predict solubility. we call this new predictor the 'solubility-weighted index' (swi). swi is a powerful predictor of solubility, and a good proxy for global structural flexibility. in addition, swi outperforms many existing de novo protein solubility prediction tools. we sought to understand what makes a protein soluble, and develop a fast and accurate approach for solubility prediction. to determine which protein sequence properties accurately predict protein solubility, we analysed 12,216 target proteins from over 196 species that were expressed in e. coli (the psi:biology dataset; see supplementary fig s1 and table s1a ) (chen et al. 2004; acton et al. 2005) . these proteins were expressed either with a c-terminal or n-terminal polyhistidine fusion tag (pet21_nesg and pet15_nesg expression vectors, n=8,780 and 3,436, respectively). they were previously curated and labeled as 'protein_soluble' or 'tested_not_soluble' (seiler et al. 
2014) , based on the solubility analysis of cell lysate using sds-page (r. xiao et al. 2010) . a total of 8,238 recombinant proteins were found to be soluble, of which 6,432 belong to the pet21_nesg dataset. both the expression system and the solubility analysis method are commonly used (costa et al. 2014) . therefore, this collection of data captures a broad range of protein solubility issues. protein structural flexibility, in particular the flexibility of local regions, is often associated with function (craveur et al. 2015) . the calculation of flexibility is usually performed by assigning a set of 20 normalised b-factors (a measure of vibration of c-alpha atoms; see supplementary notes) to a protein sequence and averaging the values by a sliding window approach (ragone et al. 1989; karplus and schulz 1985; m. vihinen, torkkila, and riikonen 1994; smith et al. 2003) . we reasoned that such a sliding window approach can be approximated by a more straightforward arithmetic mean for calculating global structural flexibility (see supplementary notes). we determined the correlation between flexibility (vihinen et al. 's sliding window approach as implemented in biopython) and solubility scores calculated as follows: $\text{solubility score} = \frac{1}{L} \sum_{i=1}^{L} B_i$ (1) where $B_i$ is the normalised b-factor of the amino acid residue at position $i$, and $L$ is the sequence length. we obtained a strong correlation for the psi:biology dataset (spearman's rho = 0.98, p-value below machine's underflow level). therefore, we reasoned that the sliding window approach is not necessary for our purpose. we applied this arithmetic mean approach (i.e., sequence composition scoring) to the psi:biology dataset and compared four sets of previously published, normalised b-factors (bhaskaran and ponnuswamy 1988; ragone et al. 1989; m. vihinen, torkkila, and riikonen 1994; smith et al. 
2003). among these sets of b-factors, sequence composition scoring using the most recently published set of normalised b-factors produced the highest auc score. to improve the prediction accuracy of solubility, we iteratively refined the weights of amino acid residues using the nelder-mead optimisation algorithm (nelder and mead 1965) . to avoid testing and training on similar sequences, we generated 10 cross-validation sets with maximised heterogeneity between these subsets (i.e., no similar sequences between subsets). we first clustered all 12,216 psi:biology protein sequences with usearch at a 40% similarity threshold to produce 5,050 clusters with remote similarity (see methods and supplementary fig s4) . the clusters were manually grouped into 10 cross-validation sets of approximately 1,200 sequences each. we did not select a representative sequence for each cluster as about 12% of clusters contain a mix of soluble and insoluble proteins (supplementary fig s4c) . more importantly, to address the issues of sequence similarity and imbalanced classes, we performed 1,000 bootstrap resamplings for each cross-validation step (fig 2a and supplementary fig s5) . we calculated the solubility scores from the optimised weights as in equation 1, and the auc scores for each cross-validation step. our training and test auc scores were 0.72 ± 0.00 and 0.71 ± 0.01, respectively, showing an improvement over flexibility in solubility prediction (mean ± standard deviation; fig 2b and supplementary table s3 ). the final weights were derived from the arithmetic means of the weights for individual amino acid residues obtained from cross-validation (supplementary table s4) . we observed a change of over 20% in the weights for cysteine (c) and histidine (h) residues (fig 2c and supplementary table s4 ). these results are in agreement with the contributions of cysteine and histidine residues as shown in supplementary fig s2b. 
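the sequence composition scoring of equation 1 and the auc used to evaluate it can be sketched as follows. this is a minimal illustration, not the authors' implementation: the names `composition_score`, `auc` and `TOY_WEIGHTS` are hypothetical, and the weights are made-up placeholders rather than the published normalised b-factors or the optimised swi weights.

```python
from typing import Dict, Sequence

# Hypothetical per-residue weights, for illustration only. They are NOT the
# published normalised b-factors nor the optimised SWI weights.
TOY_WEIGHTS: Dict[str, float] = {
    "K": 0.9, "E": 0.9, "D": 0.9, "G": 0.8,
    "A": 0.7, "F": 0.6, "W": 0.6, "C": 0.5,
}


def composition_score(sequence: str, weights: Dict[str, float]) -> float:
    """Arithmetic mean of per-residue weights over the sequence (equation 1)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    # A residue missing from `weights` raises KeyError rather than being skipped.
    return sum(weights[res] for res in sequence.upper()) / len(sequence)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (soluble, insoluble) pairs in which the soluble protein (label 1)
    has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one soluble and one insoluble example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

the pairwise definition of auc is quadratic in the number of proteins, which is acceptable for a sketch; it also makes clear why auc has no useful gradient with respect to the weights, which is the stated reason for choosing a derivative-free optimiser.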
we call the solubility score of a protein sequence calculated using the final weights the solubility-weighted index (swi). (a) flow chart shows an iterative refinement of the most recently published set of normalised b-factors for solubility prediction (smith et al. 2003) . the solubility score of a protein sequence was calculated using a sequence composition scoring approach (equation 1, using optimised weights, $W$, instead of normalised b-factors, $B$). these scores were used to compute the auc scores for training and test datasets. (b) training and test performance of solubility prediction using optimised weights for 20 amino acid residues in a 10-fold cross-validation (mean auc ± standard deviation). related data and figures are available as supplementary table s3 and supplementary fig s4 and s5 . (c) comparison between the 20 initial and final weights for amino acid residues. the final weights are derived from the arithmetic mean of the optimised weights from cross-validation. these weights are used to calculate swi, the solubility score of a protein sequence, in the subsequent analyses. filled circles, which represent amino acid residues, are colored by hydrophobicity (kyte and doolittle 1982) . solid black circles denote the aromatic amino acid residues phenylalanine (f), tyrosine (y) and tryptophan (w). the dotted diagonal line represents no change in weight. see also supplementary table s4 and fig s4. auc, area under the roc curve; roc, receiver operating characteristic; $\bar{W}$, arithmetic mean of the weights of an amino acid residue optimised from 1,000 bootstrap samples in a cross-validation step. to validate the cross-validation results, we used a dataset independent of the psi:biology data known as esol (niwa et al. 2009 ) . this dataset consists of the solubility percentages of e. coli proteins determined using an e. coli cell-free system (n = 3,198) . our solubility scoring using the final weights showed a significantly improved correlation with e. 
coli protein solubility over the initial weights (smith et al. 's normalised b-factors) [spearman's rho of 0.50 ($p = 9.46 \times 10^{-206}$) versus 0.40 ($p = 4.57 \times 10^{-120}$)]. we repeated the correlation analysis after removing extra amino acid residues, including his-tags, from the esol sequences (mrgshhhhhhtdpalra and glcgr at the n- and c-termini, respectively). this artificial dataset was created based on the assumption that his-tags have little effect on solubility. we observed a slight decrease in correlation for this artificial dataset (spearman's rho = 0.47, $p = 3.67 \times 10^{-176}$), which may be due to the effects of the his-tag on solubility and/or the limitation(s) of our approach, which may overfit to his-tag fusion proteins. we performed spearman's correlation analysis for both the psi:biology and esol datasets. swi shows the strongest correlation with solubility compared to the standard and 9,920 miscellaneous protein sequence properties (fig 3 and supplementary fig s2, respectively) . swi also strongly correlates with flexibility, suggesting that swi is also a good proxy for global structural flexibility. we asked whether protein solubility can be predicted by surface amino acid residues. to address this question, we examined a previously published dataset for the protein surface 'stickiness' of 397 e. coli proteins (levy, de, and teichmann 2012) . this dataset has the annotation for surface residues based on previously solved protein crystal structures. we observed little correlation between the protein surface 'stickiness' and the solubility data from esol (spearman's rho = 0.05, p = 0.34, n = 348; supplementary fig s6a) . next, we evaluated whether amino acid composition scoring using surface residues alone is sufficient; if so, optimising only the weights of surface residues should achieve similar or better results than swi. as above, we iteratively refined the weights of surface residues using the nelder-mead optimisation algorithm. the method was initialised with smith et al. 
's normalised b-factors, with maximisation of the correlation coefficient as the target. however, a low correlation was obtained upon convergence (spearman's rho = 0.18, $p = 7.20 \times 10^{-4}$; supplementary fig s6b) . in contrast, the swi of the full-length sequences has a much stronger correlation with solubility (spearman's rho = 0.46, $p = 2.97 \times 10^{-19}$; supplementary fig s6c) . these results suggest that the full length of a sequence, not just its surface residues, contributes to protein solubility, consistent with solubility being modulated by cotranslational folding (natan et al. 2018 ) . to understand the properties of soluble and insoluble proteins, we determined the enrichment of amino acid residues in the psi:biology targets relative to the esol sequences (see methods). we observed that the psi:biology targets are enriched in the charged residues lysine (k), glutamate (e) and aspartate (d), and depleted in the aromatic residue tryptophan (w), albeit to a lesser extent for insoluble proteins (supplementary fig s7a) . as expected, cysteine residues (c) are enriched in the psi:biology insoluble proteins, supporting previous findings that cysteine residues contribute to poor solubility in the e. coli expression system (diaz et al. 2010; wilkinson and harrison 1991) . in addition, we compared the swi of random sequences with that of the psi:biology and esol sequences. we included an analysis of random sequences to confirm whether swi can distinguish between biological and random sequences. we found that the swi scores of soluble proteins are higher than those of insoluble proteins (supplementary fig s7b) , and that true biological sequences also tend to have higher swi scores than random sequences, highlighting a potential evolutionary selection for solubility. to confirm the usefulness of swi in solubility prediction, we compared it with the existing tools protein-sol (hebditch et al. 2017 ) , camsol v2.1 (sormanni, aprile, and vendruscolo 2015; sormanni et al. 
2017) , parsnip, deepsol v0.3 (khurana et al. 2018) , the wilkinson-harrison model (davis et al. 1999; harrison 2000; wilkinson and harrison 1991) , and ccsol omics (agostini et al. 2014 ) . we did not include the specialised tools that model protein structural information such as surface geometry, surface charges and solvent accessibility because these tools require prior knowledge of protein tertiary structure. for example, aggrescan3d and solart accept only pdb files that can be downloaded from the protein data bank or produced using a homology modeling program (kuriata et al. 2019; hou et al. 2019) . swi outperforms other tools except for protein-sol in predicting e. coli protein solubility (table 1, fig 4a) . our swi c program is also the fastest solubility prediction algorithm (table 1, fig 4b and supplementary table s7 ). prediction accuracy of solubility prediction tools using the above cross-validation sets (fig 2a) . for swi, the test auc scores were calculated from a 10-fold cross-validation (i.e., a boxplot representation of fig 2b). protein structural flexibility has been associated with conformational variations, functions, thermal stability, ligand binding and disordered regions (mauno vihinen 1987; teague 2003; ma 2005; radivojac 2004; schlessinger and rost 2005; yuan, bailey, and teasdale 2005; yin, li, and li 2011) . however, the use of flexibility in solubility prediction has been overlooked, although their relationship has previously been noted (tsumoto et al. 2003) . in this study, we have shown that flexibility strongly correlates with solubility (fig 3) . based on the normalised b-factors used to compute flexibility, we have derived a new set of position- and length-independent weights to score the solubility of a given protein sequence (i.e., a sequence composition based score). we call this protein solubility score swi. upon further inspection, we observe some interesting properties in swi. 
swi anti-correlates with helix propensity, gravy, aromaticity and isoelectric point (fig 2c and 3) , suggesting that swi incorporates the key propensities affecting solubility. amino acid residues that are less aromatic or more hydrophilic are known to improve protein solubility (trevino, martin scholtz, and nick pace 2007; niwa et al. 2009; kramer et al. 2012; warwicker, charonis, and curtis 2014; han et al. 2019; wilkinson and harrison 1991) . consistent with previous studies, the charged residues aspartate (d), glutamate (e) and lysine (k) are associated with high solubility, whereas the aromatic residues phenylalanine (f), tryptophan (w) and tyrosine (y) are associated with low solubility (fig 2c and supplementary fig s7a) . the cysteine residue (c) has the lowest weight, probably because disulfide bonds could not be properly formed in the e. coli expression hosts (stewart, aslund, and beckwith 1998; rosano and ceccarelli 2014; jia and jeon 2016; aslund and beckwith 1999) . the weights would likely differ if the solubility analysis were done using the reductase-deficient e. coli origami host strains or eukaryotic hosts. higher helix propensity has been reported to increase solubility (idicula-thomas and balaji 2005; huang et al. 2012 ) . however, our analysis has shown that helical and turn propensities anti-correlate with solubility, whereas sheet propensity lacks correlation with solubility, suggesting that disordered regions may tend to be more soluble (fig 3) . in accordance with this, swi has stronger negative correlations with helix and turn propensities. these findings also suggest that protein solubility can be largely explained by overall amino acid composition, not just the surface amino acid residues. this idea aligns with our understanding that protein solubility and folding are closely linked, and that folding occurs cotranslationally, a complex process that is driven by various intrinsic and extrinsic factors (wilkinson and harrison 1991; chiti et al. 
2003; tartaglia et al. 2004; diaz et al. 2010) . however, it is unclear why sheet propensity has little contribution to solubility because β-sheets have been shown to link closely with protein aggregation (idicula-thomas and balaji 2005) . we conclude that swi is a well-balanced index that is derived from a simple sequence composition scoring method. to demonstrate the usefulness of swi, we developed a web server called sodope (soluble domain for protein expression; https://tisigner.com/sodope ). sodope calculates the probability of solubility of a user-selected region based on swi, which can either be a full-length or a partial sequence (see methods and supplementary table s8 ). this implementation is based on our observation that some protein domains tend to be more soluble than others. to demonstrate this point, we have analysed three commercial monoclonal antibodies and the severe acute respiratory syndrome coronavirus proteomes (sars-cov and sars-cov-2) (wang et al. 2009; marra et al. 2003; f. wu et al. 2020 ) ( supplementary fig s8 and s9 ). these soluble domains may enhance protein solubility as a whole. sodope also provides options for solubility prediction in the presence of solubility fusion tags. similarly, solubility tags may act as soluble 'protein domains' that can outweigh the aggregation propensity of insoluble proteins. however, some soluble fusion proteins may become insoluble after proteolytic cleavage of solubility tags (lebendiker and danieli 2014) . in addition, sodope is integrated with tisigner, a gene optimisation web service for protein expression. this pipeline provides a holistic approach to improve the outcome of recombinant protein expression. the standard protein sequence properties were calculated using the bio.sequtils.protparam module of biopython v1.73 (cock et al. 2009 ) . all miscellaneous protein sequence properties were computed using the r package protr v1.6-2 (n. xiao et al. 2015) . 
we used the standard and miscellaneous protein sequence properties to predict the solubility of the psi:biology and esol targets (n = 12,216 and 3,198, respectively) (seiler et al. 2014; niwa et al. 2009 ) . for method comparison, we chose the protein solubility prediction tools that are scalable (table 1) . default configurations were used for running the command line tools. to benchmark the wall time of solubility prediction tools, we selected 10 sequences that span a large range of lengths from the psi:biology and esol datasets (from 36 to 2389 residues). all the tools were run and timed using a single process without using gpus on a high performance computer [ /usr/bin/time -f '%e' ; centos linux 7 (core) operating system, 72 cores in 2× broadwell nodes (e5-2695v4, 2.1 ghz, dual socket 18 cores per socket), 528 gib memory]. single sequence fasta files were used as input files. to improve protein solubility prediction, we optimised the most recently published set of normalised b-factors using the psi:biology dataset (smith et al. 2003 ) (fig 2) . to avoid including homologous sequences in the test and training sets, we clustered the psi:biology targets using usearch v11.0.667, 32-bit (edgar 2010) . his-tag sequences were removed from all sequences before clustering to avoid false cluster inclusions. we obtained 5,050 clusters using the parameters: -cluster_fast -id 0.4 -msaout -threads 4 . these clusters were manually divided into 10 subsets of approximately 1,200 sequences each. the subsequent steps were done with his-tag sequences included. we used smith et al. 's normalised b-factors as the initial weights to maximise auc using these 10 subsets with a 10-fold cross-validation. since auc is non-differentiable, we used the nelder-mead optimisation method (implemented in scipy v1.2.0), which is a derivative-free, heuristic, simplex-based optimisation (oliphant 2007; millman and aivazis 2011; nelder and mead 1965) . 
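the key constraint in building the cross-validation subsets is that every member of a sequence cluster ends up in the same subset, so that similar sequences never straddle training and test data. the text states that this grouping was done manually; the sketch below shows one automated stand-in, a greedy largest-first assignment, purely as an assumption for illustration. the function name `assign_clusters_to_folds` and the cluster identifiers are hypothetical.

```python
from typing import Dict, List


def assign_clusters_to_folds(cluster_sizes: Dict[str, int],
                             n_folds: int = 10) -> List[List[str]]:
    """Assign whole clusters to folds so that fold sizes stay balanced.

    Greedy heuristic (an assumption, not the authors' manual procedure):
    visit clusters from largest to smallest and put each one into the fold
    that currently holds the fewest sequences. Clusters are never split,
    so similar sequences cannot appear in two different folds.
    """
    if n_folds < 1:
        raise ValueError("n_folds must be at least 1")
    folds: List[List[str]] = [[] for _ in range(n_folds)]
    totals = [0] * n_folds
    # Sort by decreasing size; break ties by identifier for determinism.
    ordered = sorted(cluster_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for cluster_id, size in ordered:
        target = totals.index(min(totals))  # first fold with the fewest sequences
        folds[target].append(cluster_id)
        totals[target] += size
    return folds
```

with 5,050 clusters covering 12,216 sequences, such a heuristic would be expected to give subsets close to the reported size of about 1,200 sequences each, although it does not reproduce the authors' actual partition.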
for each step in cross-validation, we used 1,000 bootstrap resamplings, each containing 1,000 soluble and 1,000 insoluble proteins. optimisation was carried out for each sample, giving 1,000 sets of weights. the arithmetic mean of these weights was used to determine the training and test auc for the cross-validation step (fig 2a) . to examine the enrichment of amino acid residues in soluble and insoluble proteins, we computed the bit score for each amino acid residue in the psi:biology soluble and insoluble groups ( supplementary fig s7a) . we normalised the count of each residue ($x$) in each group by the total number of residues in that group. we used the normalised count of amino acid residues in the esol e. coli sequences as the background. the bit score of residue $x$ for the soluble or insoluble group is then given by the following equation: $\text{bit score}_i(x) = \log_2 \frac{f_i(x)}{f_{esol}(x)}$ (2) where $f_i(x)$ is the normalised count of residue $x$ in the psi:biology soluble or insoluble group ($i$) and $f_{esol}(x)$ is the normalised count in the esol sequences. as a control, random protein sequences were generated with lengths from 50 to 6,000 residues in steps of 50 residues. a hundred random sequences were generated for each length, giving a total of 12,000 unique random sequences. to estimate the probability of solubility using swi, we fitted the following logistic regression to the psi:biology dataset: $\text{probability of solubility} = \frac{1}{1 + e^{-(ax + b)}}$ (3) where $x$ is the swi of a given protein sequence, $a = 81.05812$ and $b = -62.7775$. the p-value of the log-likelihood ratio test was less than machine precision. equation 3 can be used to predict the solubility of a protein sequence given that the protein is successfully expressed in e. coli ( supplementary table s8 ). on this basis, we developed a solubility prediction webservice called the soluble domain for protein expression (sodope). our web server accepts either a nucleotide or amino acid sequence. 
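the two formulas in this part of the methods can be sketched in a few lines. the sketch assumes that equation 2 is a base-2 log ratio of normalised residue counts and that equation 3 is the standard logistic function with the coefficients 81.05812 and −62.7775 as read from the text; the function names are hypothetical and this is not the sodope source code.

```python
import math
from collections import Counter
from typing import Dict, Iterable

# Coefficients of the fitted logistic regression as read from the text
# (assumed to be slope a and intercept b of equation 3).
A = 81.05812
B = -62.7775


def solubility_probability(swi: float, a: float = A, b: float = B) -> float:
    """Logistic transform of an SWI value into a probability (equation 3)."""
    return 1.0 / (1.0 + math.exp(-(a * swi + b)))


def normalised_counts(sequences: Iterable[str]) -> Dict[str, float]:
    """Count of each residue divided by the total number of residues."""
    counts = Counter("".join(sequences))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no residues to count")
    return {residue: n / total for residue, n in counts.items()}


def bit_scores(group: Iterable[str], background: Iterable[str]) -> Dict[str, float]:
    """log2 enrichment of each residue in `group` relative to `background`
    (equation 2). Residues absent from the background are skipped because
    the ratio is undefined for them."""
    f_group = normalised_counts(group)
    f_background = normalised_counts(background)
    return {
        residue: math.log2(f_group[residue] / f_background[residue])
        for residue in f_group
        if residue in f_background
    }
```

under these assumptions the probability crosses 0.5 at an swi of −b/a, roughly 0.774, and the steep slope means that small differences in swi near that value translate into large differences in predicted probability.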
upon sequence submission, a query is sent to the hmmer web server to annotate protein domains ( https://www.ebi.ac.uk/tools/hmmer/ ) (potter et al. 2018) . once the protein domains are identified, users can choose a domain or any custom region (including full-length sequence) to examine the probability of solubility, flexibility and gravy. this functionality enables protein biochemists to plan their experiments and opt for the domains or regions with high probability of solubility. furthermore, we implemented a simulated annealing algorithm that maximised the probability of solubility for a given region by generating a list of regions with extended boundaries. users can also predict the improvement in solubility by selecting a commonly used solubility tag or a custom tag. we linked sodope with tisigner, which is our existing web server for maximising the accessibility of translation initiation sites (bhandari, lim, and gardner 2019) . this pipeline allows users to predict and optimise both protein expression and solubility for a gene of interest. the sodope web server is freely available at https://tisigner.com/sodope . jupyter notebook of our analysis can be found at https://github.com/gardner-binflab/sodope_paper_2020 . the source code for our solubility prediction server (sodope) can be found at https://github.com/gardner-binflab/tisigner-reactjs . 
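the boundary search mentioned above can be illustrated with a small simulated annealing routine. the text does not describe the move set, cooling schedule or acceptance rule, so everything below is an assumption: boundaries move one residue at a time, the temperature decays geometrically, and worse candidates are accepted with the metropolis probability. the objective is the mean residue weight of the region, which ranks regions identically to a logistic probability with a positive slope. the name `anneal_region` is hypothetical.

```python
import math
import random
from typing import Dict, Tuple


def anneal_region(sequence: str, start: int, end: int,
                  weights: Dict[str, float], n_steps: int = 2000,
                  t_start: float = 0.05, t_end: float = 0.001,
                  seed: int = 0) -> Tuple[Tuple[int, int], float]:
    """Search for an extended region [s, e) that contains [start, end)
    and has the highest mean residue weight found during the run."""
    if not 0 <= start < end <= len(sequence):
        raise ValueError("need 0 <= start < end <= len(sequence)")
    rng = random.Random(seed)

    def score(s: int, e: int) -> float:
        return sum(weights[r] for r in sequence[s:e]) / (e - s)

    current = (start, end)
    current_score = score(start, end)
    best, best_score = current, current_score
    for step in range(n_steps):
        # Geometric cooling from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (step / max(1, n_steps - 1))
        s, e = current
        shift = rng.choice((-1, 1))
        if rng.random() < 0.5:
            # Move the left boundary, never past the original start.
            s = min(start, max(0, s + shift))
        else:
            # Move the right boundary, never before the original end.
            e = max(end, min(len(sequence), e + shift))
        candidate_score = score(s, e)
        delta = candidate_score - current_score
        # Metropolis rule: always accept improvements, sometimes accept losses.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = (s, e), candidate_score
            if current_score > best_score:
                best, best_score = current, current_score
    return best, best_score
```

this sketch returns only the best region seen, whereas the server is described as producing a list of regions; keeping every accepted state and sorting by score would be one way to obtain such a list.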
robotic cloning and protein production platform of the northeast structural genomics consortium ccsol omics: a webserver for solubility prediction of endogenous and heterologous expression in escherichia coli the thioredoxin superfamily: redundancy, specificity, and gray-area genomics highly accessible translation initiation sites are predictive of successful heterologous protein expression positional flexibilities of amino acid residues in globular proteins learning to predict expression efficacy of vectors in recombinant protein production targetdb: a target registration database for structural genomics projects rationalization of the effects of mutations on peptide and protein aggregation rates biopython: freely available python tools for computational molecular biology and bioinformatics fusion tags for protein solubility, purification and immunogenicity in escherichia coli: the novel fh8 system protein flexibility in the light of structural alphabets new fusion protein systems designed to give soluble expression in escherichia coli prediction of protein solubility in escherichia coli using logistic regression search and clustering orders of magnitude faster than blast enhancement of soluble protein expression through the use of fusion tags prediction of peptide and protein propensity for amyloid formation a review of machine learning methods to predict the solubility of overexpressed recombinant proteins in escherichia coli improve protein solubility and activity based on machine learning models expression of soluble heterologous proteins via fusion with nusa
protein protein-sol: a web tool for predicting protein solubility from sequence machine learning applied to enzyme turnover numbers reveals protein structural correlates and improves metabolic models espresso: a system for estimating protein expression and solubility in protein expression systems computational analysis of the amino acid interactions that promote or decrease protein solubility solart: a structure-based method to predict protein solubility and aggregation prediction and analysis of protein solubility using a novel scoring card method with dipeptide composition understanding the relationship between the primary structure of proteins and its propensity to be soluble on overexpression in escherichia coli high-throughput recombinant protein expression in escherichia coli: current status and future perspectives prediction of chain flexibility in proteins deepsol: a deep learning framework for sequence-based protein solubility prediction toward a molecular understanding of protein solubility: increased negative surface charge correlates with increased solubility aggrescan3d (a3d) 2.0: prediction and engineering of protein solubility a simple method for displaying the hydropathic character of a protein production of prone-to-aggregate proteins cellular crowding imposes global constraints on the chemistry and evolution of proteomes usefulness and limitations of normal mode analysis in modeling dynamics of biomolecular complexes the genome sequence of the sars-associated coronavirus data structures for statistical computing in python python for scientists and engineers cotranslational protein assembly imposes evolutionary constraints on homomeric proteins a simplex method for function minimization bimodal protein solubility distribution revealed by an aggregation analysis of the entire ensemble of escherichia coli proteins python for scientific computing scikit-learn: machine learning in python protein flexibility and intrinsic disorder protein engineering, 
design and selection parsnip: sequence-based protein solubility prediction using gradient boosting machine recombinant protein expression in escherichia coli: advances and challenges protein flexibility and rigidity predicted from sequence statsmodels: econometric and statistical modeling with python dnasu plasmid and psi:biology-materials repositories: resources to accelerate biological research improved amino acid flexibility parameters rapid and accurate in silico solubility screening of a monoclonal antibody library the camsol method of rational design of protein mutants with enhanced solubility disulfide bond formation in the escherichia coli cytoplasm: an in vivo role reversal for the thioredoxins the role of aromaticity, exposed surface, and dipole moment in determining protein aggregation rates implications of protein flexibility for drug discovery amino acid contribution to protein solubility: asp, glu, and ser contribute more favorably than the other hydrophilic amino acids in rnase sa practical considerations in refolding proteins from inclusion bodies relationship of protein flexibility to thermostability accuracy of protein flexibility predictions genetic screens and directed evolution for protein solubility the numpy array: a structure for efficient numerical computation potential aggregation prone regions in biotherapeutics: a survey of commercial monoclonal antibodies lysine and arginine content of proteins: computational analysis suggests a new tool for solubility design predicting the solubility of recombinant proteins in escherichia coli complete genome characterisation of a novel coronavirus associated with severe human respiratory disease in wuhan, china proceedings of the national academy of sciences of the united states of america protr/protrweb: r package and web server for generating various numerical representation schemes of protein sequences the high-throughput protein sample production platform of the northeast structural genomics 
consortium machine-learning-guided directed evolution for protein engineering on the relation between residue flexibility and residue interactions in proteins prediction of protein b-factor profiles we evaluated nine standard and 9,920 miscellaneous protein sequence properties using the biopython's protparam module and 'protr' r package, respectively (cock et al. 2009; n. xiao et al. 2015) . for example, the standard properties include the grand average of hydropathy (gravy), secondary structure propensities, protein structural flexibility etc., whereas miscellaneous properties include amino acid composition, autocorrelation, etc. we thank new zealand escience infrastructure for providing a high performance computing platform. we are grateful to harry biggs for proofreading our manuscript and providing feedback for the web server. this work was supported by the ministry of business, innovation and employment, new zealand (mbie grant: uoox1709). key: cord-018479-mvnm98hv authors: rehm, fabian b. h.; grage, katrin; rehm, bernd h. a. title: applications of microbial biopolymers in display technology date: 2017-11-16 journal: consequences of microbial interactions with hydrocarbons, oils, and lipids: production of fuels and chemicals doi: 10.1007/978-3-319-50436-0_377 sha: doc_id: 18479 cord_uid: mvnm98hv microorganisms produce a variety of different polymers such as polyamides, polysaccharides, and polyesters. the polyesters, the polyhydroxyalkanoates (phas), are the most extensively studied polymers in regard to their use in display technology. the material properties of bacterial phas in combination with their biocompatibility and biodegradability make them attractive substrates for use in display technology applications. by translationally fusing bioactive molecules to a gene encoding a pha-binding domain, the appropriate functionalization for a given application can be achieved such that the need for chemical immobilization is circumvented. 
by separately extracting and processing the biopolymer, using it to coat a surface, and then treating this surface with the fusion proteins, surface functionalization for immunodiagnostic microarray or tissue engineering applications can be accomplished. conversely, by expressing the fusion protein directly in the pha-producing organisms, one-step production of functionalized beads can be achieved. such beads have been demonstrated in diverse applications, including fluorescence-activated cell sorting, enzyme-linked immunosorbent assays, microarrays, diagnostic skin test for tuberculosis, vaccines, protein purification, and affinity bioseparation. the display of biologically active molecules is utilized for a range of applications such as diagnostics, biosensing, and microarray technologies. the substrate on which such display takes place is greatly deterministic of functionality and applicability. common, well-established techniques include display on cell surfaces, ribosomes, and phage particles (lee et al. 2003; zahnd et al. 2007; rakonjac et al. 2011) . the use of microbial biopolymers as substrates has more recently been revealed in a variety of contexts. although bacteria can produce a range of polymers, only a few of them have been considered for display technologies (fig. 1 ). bacterial cellulose, which exhibits various properties superior to plant-based cellulose with respect to display technology applications, has been limited to enzyme, bacterial cell and fungi immobilization (wu and lia 2008; ullah et al. 2016) . in contrast, bacterial polyhydroxyalkanoates (phas) have been extensively investigated as substrates for display technology applications, and thus will be the focus of this chapter. phas are biopolyesters which serve as carbon and energy storage materials in a range of bacteria and archaea (lenz and marchessault 2005; rehm 2010 ). 
during excess carbon availability, they are stockpiled as the amorphous core of pha inclusions, surrounded by structural proteins (phasins), pha metabolism-associated enzymes, and regulator proteins (jendrossek 2009). critical enzymes for pha synthesis and inclusion assembly are the pha synthases (such as phac), which catalyze the stereoselective conversion of the activated precursor (r)-3-hydroxyacyl-coa to polyoxoesters with the simultaneous release of coenzyme a (rehm 2003). these coa thioesters, depending on their carbon chain length, are synthesized from intermediates of fatty acid metabolism or directly from acetyl-coa (steinbuchel et al. 1993; rehm 2006) (fig. 2). unlike the other, hydrophobically interacting pha inclusion surface proteins, the pha synthase remains covalently linked to the pha inclusion core (hezayen et al. 2002; peters and rehm 2006). although phas are all hydrophobic and water insoluble, they can vary drastically in composition and thus in physical properties. melting points can range from 50 °c to 180 °c and crystallinity can range from 30% to 70%, both largely determined by monomer composition (rehm 2010). as such, phas have been classified on the basis of monomer chain length. medium-chain-length phas (mcl, c6-c14) are naturally produced primarily by pseudomonads, whereas production of short-chain-length phas (scl, c3-c5) is more widespread throughout bacteria and archaea (anderson et al. 1990) (fig. 2). while the common laboratory bacterium escherichia coli does not naturally accumulate phas, it becomes a competent pha producer upon introduction of the appropriate pha biosynthesis genes (schubert et al. 1988; lee et al. 1994). the intracellular pha inclusions may be isolated for polymer extraction and purification for processing into various materials, or maintained as functional shell-core beads (fig. 3). the latter requires engineering of proteins attached to the pha core in order to obtain functionality.
the pha material properties enable film coating of solid surfaces suitable for microarray applications. indeed, by exploiting the ability of pha-associated proteins to specifically bind phas, and thereby overcoming specificity- or orientation-associated issues, pha substrates for immobilization have been demonstrated as attractive for such applications. the first relevant example of pha as a protein micropatterning substrate was described in a study by park and colleagues (2005). poly(3-hydroxybutyrate) (p(3hb)) and poly(3-hydroxybutyrate-co-3-hydroxyhexanoate) were independently produced, purified, and used to spin-coat glass substrates, producing pha films. subsequently, enhanced green fluorescent protein and red fluorescent protein, fused with the hydrophobic side chain-interacting substrate binding domain (sbd) of a pha depolymerase, were micropatterned onto the film using microcontact printing. confocal fluorescence microscopy clearly revealed the printed patterns after several wash steps, and surface plasmon resonance spectroscopy using anti-green fluorescent protein polyclonal antibody further confirmed this specific fusion protein immobilization, additionally demonstrating that the system is capable of examining protein-protein interactions. spurred on by these encouraging results, a further study aimed to demonstrate the possibilities of this system for protein microarray development centered on immunodiagnostic applications (park et al. 2006). pha depolymerase sbd fusions with the single-chain antibody variable region (scfv) against hepatitis b virus (hbv) pres2 surface protein as well as the severe acute respiratory syndrome coronavirus envelope protein (scve) were produced and microspotted onto p(3hb) films. fluorescence-labelled hbv antigen and anti-scve antibody were then used to detect the interactions on the films.
fluorescence signals were detected only at the corresponding microspotted regions, indicating high affinity and selectivity and thus the suitability of the technology for use in immunodiagnostics. subsequent investigation into whether such a platform could be applicable for clinical pathogen detection via immobilized dna-protein complexes was undertaken by park et al. (2009) (fig. 4). pathogen-specific (acinetobacter baumannii, e. coli, klebsiella pneumoniae, and pseudomonas aeruginosa) biotin-labelled 15-mer dna probes immobilized to core streptavidin fused to the pha depolymerase sbd were microspotted onto polyhydroxybutyrate (phb)-coated slides. by hybridizing differentially fluorescently labelled target dnas to each pathogen-specific probe, simultaneous detection of the corresponding pathogens added to the slide was demonstrated, further evidencing the potential of phas for use in diagnostic microarrays. the diverse, adjustable material properties of phas in concert with their biodegradability, biocompatibility, noncarcinogenicity, and low cytotoxicity have driven their development in the area of biomaterials research (ali and jamil 2016). on the basis of pha surface functionalization by addition of pha-binding protein fusions, several approaches to enhancing tissue engineering and culturing techniques have been undertaken. by translationally fusing the pha-binding phasin, phap, to the cell adhesion motif arg-gly-asp (rgd) and applying this fusion protein to coat pha surfaces, the effect on fibroblast growth was investigated (dong et al. 2010). under serum-free conditions, confocal laser scanning microscopy in combination with cell counting assays revealed a significant increase in cell attachment on the phap-rgd-coated pha films relative to the phap-coated (nonfused) and noncoated films, in a manner that is unlikely to be attributable to changes in pha surface topology.
further cell proliferation assays revealed increased fibroblast proliferation levels on phap-rgd-coated pha films. thus, pha surface functionalization, using a well-known cell adhesion motif, was demonstrated using aqueous solutions, avoiding the toxicity of chemical immobilization techniques. continuing this pattern of investigation, similar increases in cell adhesion and cell proliferation of human vascular smooth muscle cells (hvsmcs) were demonstrated using the pha repressor protein (phar) fused to the specific integrin ligand peptide (kqagdv) (dong et al. 2012), and, more recently, phap fused to rgd or the laminin-derived ikvav peptides and used to coat pha films demonstrated enhanced neural stem cell (nsc) attachment, proliferation, and neurite outgrowth, without affecting nsc differentiation (xie et al. 2013). fluorescence activated cell sorting (facs) is a qualitative and quantitative technique used in biomolecule detection for in vitro diagnostics. generally, antigen-displaying beads are used to bind antibodies which are then detected via fluorescent-signaling secondary antibodies when the bead suspension passes through the facs machine flow cell. current bead preparation techniques involve tedious antigen purification and chemical crosslinking. therefore, more cost-effective, reliable means of producing antigen-displaying beads with comparable detection efficacy could greatly impact the accessibility of facs. a series of studies have shown that one path towards such improvement may be via one-step production of surface protein engineered pha beads. the first demonstration of in vivo-produced pha inclusions for facs applications was described by bäckström and coworkers (bäckström et al. 2007). the authors generated fusions between either interleukin-2 (il2) or the myelin oligodendrocyte glycoprotein (mog) and the c- or n-terminus of phap via a linker providing an enterokinase (specific protease) recognition site.
subsequently, the hybrid genes were expressed in e. coli and the pha granules extracted. the resultant antigen-displaying pha beads were then used for facs with the corresponding fluorescently labelled monoclonal antibody, which showed significant and specific antibody binding. enterokinase treatment reversed this recognition, indicating removal of the fusion partners/antigens. finally, sera from mice immunized with mog or ovalbumin (as a negative control) were analyzed using the mog-displaying beads and facs. again, high specificity and sensitivity (antibody detection in sera diluted 1:100,000) were demonstrated, providing strong support for application of engineered pha beads in facs (fig. 5). a subsequent investigation explored c-terminal phac-streptavidin fusions (peters and rehm 2008). the remarkably high streptavidin-biotin binding affinity has given rise to its use in a range of biotechnological applications, thus making it a strong candidate for further demonstrating the applicability of pha beads in facs. the study revealed a biotin binding capacity of 61 ng per μg of bead protein and demonstrated the detection of goat polyclonal biotinylated igg in facs using a secondary conjugated antibody. most recently, pha inclusions with gfp and mog fused to phap or phac, such that each granule simultaneously displays two protein-based functionalities, were examined in the context of facs, providing proof-of-concept for the biotechnological application of bifunctional pha beads, which may extend beyond facs. the enzyme-linked immunosorbent assay (elisa) is based on antigen-antibody binding on a solid support, detecting either antigen or antibody in samples. in vivo assembled engineered pha inclusions displaying antigens or specific binding domains as described above were directly used to coat elisa plates.
respective elisas showed specific and sensitive detection of the corresponding antibody or antigen, confirming the applicability of bioengineered pha beads in elisa (peters and rehm 2008; atwood and rehm 2009; parlane et al. 2009).

fig. 5 in vivo manufactured antigen displaying pha beads used in fluorescence activated cell sorting for specific antibody detection. pha beads were incubated in sera of mice either immunized with mog or ova. after washing, bound antibodies were detected using fluorescently labelled secondary anti-igg antibodies combined with facs. anti-mog antibodies were still detectable at a serum dilution of 1:100,000. mog, myelin oligodendrocyte glycoprotein; phap, phasin (structural pha inclusion surface protein); ova, ovalbumin (facs data representation was adapted from bäckström et al. (2007))

one in vivo-produced pha bead-based approach to in vivo diagnosis is a skin test for the detection of bovine tuberculosis (tb) (chen et al. 2014). the major disadvantage of the currently widespread tuberculin skin test is the lack of detection specificity for pathogenic mycobacterium tuberculosis complex members, that is, cattle exposed to nonpathogenic, environmental mycobacteria may give false-positive results. thus, with the aims of improving specificity and lowering production cost, an alternative test was developed. three immunodominant tb antigens, which are absent in most environmental mycobacteria, were fused to phac, mediating production of triple-antigen-displaying pha beads. in vitro, these granules showed increased reactivity with antigen-specific tb antibodies when compared with granules displaying only one antigen. assessment of triple-antigen-displaying pha beads in the skin test (in vivo) showed specificity, as all cattle experimentally infected with mycobacterium bovis were detected while no false-positive reactions were observed in cattle previously exposed to environmental mycobacteria.
a fourth antigen was added to the triple-antigen-displaying pha beads to boost skin test sensitivity (parlane et al. 2016). dose response studies showed that very low amounts of mycobacterial antigens (0.1 μg) were already sufficient for the skin test, suggesting greatly increased immunogenicity of antigen-displaying beads versus soluble antigens. hence, it was demonstrated that antigen-displaying pha beads mediate an antigen-specific immune reaction upon injection into the skin, that is, they serve as antigen delivery systems that stimulate a specific immune response. the concept of in vivo assembly of nano-/microsized pha beads displaying disease-specific antigens as particulate vaccines was investigated. since phas are considered as biomaterials, when produced coated with pathogen-specific antigens they might serve as safe and efficient particulate vaccines. in general, the nano-/microsized particulate nature of a vaccine mimics the dimensions of a pathogen, which boosts immunogenicity via enhanced uptake by antigen presenting cells (apc) (shah et al. 2014). immunodominant antigens from pathogens such as hcv and m. tuberculosis were displayed on pha beads using pha synthase engineering (parlane et al. 2011). purified antigen-coated pha beads were injected into mice, which resulted in strong and specific immune responses mediating protective immunity as assessed by challenge of vaccinated animals with the pathogen (parlane et al. 2012; martinez-donato et al. 2016) (fig. 6). it is noteworthy that these pha bead-based vaccines stimulated both a humoral antibody (th2) and a cell-mediated (th1) immune response, the latter being particularly relevant for protection against intracellular pathogens and more challenging to achieve.
purification of recombinantly produced proteins, generally with the target protein fused to an affinity tag, can be a time-consuming and costly process since it often requires multiple chromatography steps which all need to be individually optimized, and additionally requires cleavage and removal of the tag. over the past 10 years, various approaches have been undertaken to make use of pha and pha-associated proteins for the development of alternative, simple and cost-effective protein purification systems. the general principle that most of these studies have followed is to produce the target protein as a translational fusion to a pha inclusion-associated protein in cells that also make pha and to then copurify the protein with the pha beads. this can be done in a host that naturally produces pha or in an organism that has been genetically engineered to make pha (e.g., e. coli). different methods have then been applied to release the target protein from the beads. early examples used the phasin protein phap as an affinity tag and inducible self-cleaving inteins inserted between phap and the target protein to release the protein of interest. this method was applied by barnard and coworkers to purify green fluorescent protein (gfp) and β-galactosidase (lacz) from ralstonia eutropha, a natural phb producer (barnard et al.). gfp and lacz could be successfully purified fused to either end of phap using a thiol-inducible intein. similarly, several proteins (maltose-binding protein (mbp), lacz, chloramphenicol acetyltransferase (cat), and nusa) were tagged with phap and purified from recombinant e. coli with the help of a ph-inducible intein (banki et al. 2005). to strengthen the binding of the fusion protein to the bead surface to the point where significant leakage during the purification process could be avoided, multiple (two or three) phasins were used as a tag.
geng and coworkers chose a protein which is generally difficult to produce in bacteria due to the presence of several disulfide bonds (geng et al. 2010). they fused recombinant human tissue plasminogen activator (rpa), a truncated version of tissue plasminogen activator, to phap and inserted a thrombin cleavage site as a linker. active rpa could be released from isolated beads by thrombin treatment. in another study, which focused on the diagnostic applications mentioned above, mog and interleukin-2 were produced as phap fusions containing an enterokinase recognition site (bäckström et al. 2007). the successful removal of mog or il2, respectively, after enterokinase treatment was monitored by facs analysis. a slightly different approach to pha-based protein purification is based on production of fusion proteins separately from pha extraction and processing into beads. various target proteins were fused to different granule-associated proteins such as phap or the regulatory protein phar; the fusion proteins were recombinantly produced in e. coli and crude cell lysates were incubated with beads processed from extracted pha (wang et al. 2008; zhang et al. 2010). release of the target protein was also achieved via intein-mediated self-cleavage. while this method is more labor-intensive than producing pha and proteins in the same cell, an advantage might be the possibility of producing the tagged target protein in any host organism, including eukaryotes. a simplified approach used the n-terminal part of phaf to anchor a target protein to pha beads in vivo, followed by bead isolation and release of the target protein (i.e., the entire fusion of target protein and phap) from the beads by detergent treatment (moldes et al. 2004).
however, a general drawback of using phap (or phar) as a tag for protein purification is that these proteins are only attached to the pha inclusion surface via hydrophobic interactions, so there is a risk that the fusion protein could detach during either the bead purification process or the tag cleavage process, resulting in loss of target protein. in the natural system, phasins have the advantage of being the predominant protein on the surface of pha inclusions (wieczorek et al. 1995); however, in a recombinant system, proteins such as the pha synthase can be overproduced to achieve a high density at the inclusion surface (brockelbank et al. 2006; mifune et al. 2009). grage et al. harnessed the covalent attachment of the pha synthase to pha inclusions by translationally fusing hcred or an anti-β-galactosidase single-chain antibody fragment (scfv) (martineau et al. 1998) to the synthase, separated by an enterokinase recognition site. both target proteins were successfully released from purified beads by enterokinase treatment; however, cleavage efficiency was relatively low. in an attempt to find a robust and inexpensive auto-processing module, hay and coworkers used a modified soluble form of the cell surface sortase transpeptidase a (srta) from staphylococcus aureus which had been engineered to self-cleave in the presence of ca2+ (mao 2004; hay et al. 2015a). srta and the target protein were fused to the c-terminus of phac using an extended linker. using this technique, gfp, mbp, and antigen rv1626 from mycobacterium tuberculosis could be released from isolated pha beads at a high yield and purity (e.g., 6 mg/l of soluble gfp at a purity of about 98%) (hay et al. 2015a). it may not always be feasible or desirable to produce the target protein fused to a bead-associated protein (and to coproduce it with the beads/resin).
hence pha beads have also been engineered to serve as affinity resins for bioseparation, generally exploiting the possibility of densely displaying binding domains on nano-/microbeads. these resins can be produced in one step by fusing the binding domain of choice to the bead-associated protein and then isolating these functionalized beads from the production host (similar to the protein purification approach described above). the first example of this was the immunoglobulin g (igg) binding zz domain of staphylococcus aureus protein a translationally fused to phac (brockelbank et al. 2006 ). the resulting zz domain displaying beads were isolated from e. coli and successfully purified igg from human serum and mouse hybridoma supernatants, with purity and yield comparable to commercially available protein a sepharose (brockelbank et al. 2006; lewis and rehm 2009 ). further pha bead-based resins assembled by engineering phac displayed streptavidin which bound various biotinylated compounds such as enzymes, antibodies, and dna (peters and rehm 2008 ). grage and rehm were able to purify β-galactosidase from a mixture of proteins using anti-lacz scfv immobilized to pha beads (grage and rehm 2008 ). an endotoxin-removing resin was developed by producing human lipopolysaccharide-binding protein (hlbp) fused to phap in pichia pastoris (li et al. 2011) . after secretion and recovery from the culture supernatant, the hlbp-phap fusion was incubated with beads processed from extracted pha, and the resulting hlbp-beads were tested for their endotoxin removal abilities under a variety of conditions. according to the authors, the beads performed better than commercially available endotoxin-removing gels (li et al. 2011) . 
recently, hay and coworkers published a more extensive study that aimed at broadening the applicability of pha bead affinity resins by identifying and testing several easily customizable affinity binding domains which they translationally fused to phac (hay et al. 2015b). this study demonstrated that vhh domains from camelid antibodies, designed ankyrin repeat proteins (darpins) and ob-folds (obodies) could be densely displayed on pha beads, resulting in high affinity binding resins for purification of various target proteins. these binding domains were used to establish extensive libraries of variants enabling screening for binders specific for the target compound of interest (binz et al. 2004; harmsen and de haard 2007; stumpp et al. 2008; steemson et al. 2014). pha-based affinity resins showed a purification performance at least equal to current commercial offerings (hay et al. 2015b). a recent surface topology study of the r. eutropha phac attached to pha inclusions suggested new engineering strategies towards the development of pha-based affinity resins with increased binding capacity via improved display (hooks and rehm 2015). this study identified several surface-exposed flexible regions of phac which tolerated the insertion of the igg-binding zz domain. one of the double zz domain insertions (i.e., zz inserted in two of the surface-exposed regions) showed greatly improved igg binding capacity, with some of the single insertions also showing improved igg binding capacity when compared with terminal fusions. overall, phac engineering such as n- or c-terminal fusions and/or insertions enabled efficient display of binding domains for interaction with target compounds, resulting in purification performance suitable for application as bioseparation resin.

research needs

microbial phas show great promise as polymers providing a support structure for display of a range of protein functions.
since phas can be composed of various constituents resulting in a diversity of material properties, it currently remains unexplored how these different phas perform in the context of anchoring binding domains for display. additionally, research needs to address bioprocessing challenges to obtain phas of consistent structure and shape for improved implementation in various protein display applications. although microorganisms are capable of producing a variety of polymers, the hydrophobic thermoplastic phas currently hold the greatest promise as support material to display protein functions. the pha material properties allow processing into nano-/microbeads, films, and 3d structures, while pha-related proteins provide specific pha binding domains to anchor protein functions of interest. these approaches enabled implementation in microarray-based diagnostics as well as tissue engineering. besides the binding of protein functions to isolated and processed pha, recent research elucidated the concept of producing pha inclusions within the bacterial cell already coated with desired protein functions. the applicability of these pha beads as vaccines, in diagnostics and as bioseparation resin was demonstrated. 
polyhydroxyalkanoates: current applications in the medical field biosynthesis and composition of bacterial poly (hydroxyalkanoates) protein engineering towards biotechnological production of bifunctional polyester beads recombinant escherichia coli produces tailormade biopolyester granules for applications in fluorescence activated cell sorting: functional display of the mouse interleukin-2 and myelin oligodendrocyte glycoprotein novel and economical purification of recombinant proteins: intein-mediated protein purification using in vivo polyhydroxybutyrate (phb) matrix association integrated recombinant protein expression and purification platform based on ralstonia eutropha high-affinity binders selected from designed ankyrin repeat protein libraries recombinant escherichia coli strain produces a zz domain displaying biopolyester granules suitable for immunoglobulin g purification new skin test for detection of bovine tuberculosis on the basis of antigen-displaying polyester inclusions produced by recombinant escherichia coli the improvement of fibroblast growth on hydrophobic biopolyesters by coating with polyhydroxyalkanoate granule binding protein phap fused with cell adhesion motif rgd the cytocompatability of polyhydroxyalkanoates coated with a fusion protein of pha repressor protein (phar) and lys-gln-ala-gly-asp-val (kqagdv) polypeptide expression of active recombinant human tissue-type plasminogen activator by using in vivo polyhydroxybutyrate granule display in vivo production of scfv-displaying biopolymer beads using a selfassembly-promoting fusion partner bacterial polyhydroxyalkanoate granules: biogenesis, structure, and potential use as nano-/micro-beads in biotechnological and biomedical applications recombinant protein production by in vivo polymer inclusion display properties, production, and applications of camelid singledomain antibody fragments in vivo polyester immobilized sortase for tagless protein purification bioengineering of bacteria to 
assemble custom-made polyester affinity resins biochemical and enzymological properties of the polyhydroxybutyrate synthase from the extremely halophilic archaeon strain 56 insights into the surface topology of polyhydroxyalkanoate synthase: self-assembly of functionalized inclusions tolerance of the ralstonia eutropha class i polyhydroxyalkanoate synthase for translational fusions to its c terminus reveals a new mode of functional display polyhydroxyalkanoate granules are complex subcellular organelles (carbonosomes) microbial cell-surface display construction of plasmids, estimation of plasmid stability, and use of stable plasmids for the production of poly(3-hydroxybutyric acid) by recombinant escherichia coli bacterial polyesters: biosynthesis, biodegradable plastics and biotechnology zz polyester beads: an efficient and simple method for purifying igg from mouse hybridoma supernatants endotoxin removing method based on lipopolysaccharide binding protein and polyhydroxyalkanoate binding protein a self-cleavable sortase fusion for one-step purification of free recombinant proteins expression of an antibody fragment at high levels in the bacterial cytoplasm protective t cell and antibody immune responses against hepatitis c virus achieved using a biopolyester-bead-based vaccine delivery system production of functionalized biopolyester granules by recombinant lactococcus lactis vivo immobilization of fusion proteins on bioplastics by the novel tag biof micropatterning proteins on polyhydroxyalkanoate substrates by using the substrate binding domain as a fusion partner polyhydroxyalkanoate chip for the specific immobilization of recombinant proteins and its applications in immunodiagnostics microarray of dna-protein complexes on poly-3-hydroxybutyrate surface for pathogen detection bacterial polyester inclusions engineered to display vaccine candidate antigens for use as a novel class of safe and efficient vaccine delivery agents production of a particulate 
hepatitis c vaccine candidate by an engineered lactococcus lactis strain vaccines displaying mycobacterial proteins on biopolyester beads stimulate cellular immunity and induce protection against tuberculosis display of antigens on polyester inclusions lowers the antigen concentration required for a bovine tuberculosis skin test in vivo enzyme immobilization by use of engineered polyhydroxyalkanoate synthase protein engineering of streptavidin for in vivo assembly of streptavidin beads filamentous bacteriophage: biology, phage display and nanotechnology applications polyester synthases: natural catalysts for plastics genetics and biochemistry of polyhydroxyalkanoate granule self-assembly: the key role of polyester synthases bacterial polymers: biosynthesis, modifications and applications cloning of the alcaligenes eutrophus genes for synthesis of poly-beta-hydroxybutyric acid (phb) and synthesis of phb in escherichia coli the impact of size on particulate vaccine adjuvants tracking molecular recognition at the atomic level with a new protein scaffold based on the ob-fold molecular basis for biosynthesis and accumulation of polyhydroxyalkanoic acids in bacteria darpins: a new generation of protein therapeutics advances in biomedical and pharmaceutical applications of functional bacterial cellulose-based nanocomposites a novel self-cleaving phasin tag for purification of recombinant proteins based on hydrophobic polyhydroxyalkanoate nanoparticles analysis of a 24-kilodalton protein associated with the polyhydroxyalkanoic acid granules in alcaligenes eutrophus application of bacterial cellulose pellets in enzyme immobilization enhanced proliferation and differentiation of neural stem cells grown on pha films coated with recombinant fusion proteins ribosome display: selecting and evolving proteins in vitro that specifically bind to a target microbial polyhydroxyalkanote synthesis repression protein phar as an affinity tag for recombinant protein purification key: 
cord-005034-wyipzwo4 authors: gleeson, paul a.; teasdale, rohan d.; burke, jo title: targeting of proteins to the golgi apparatus date: 1994 journal: glycoconj j doi: 10.1007/bf00731273 sha: doc_id: 5034 cord_uid: wyipzwo4 the golgi apparatus maintains a highly organized structure in spite of the intense membrane traffic which flows into and out of this organelle. resident golgi proteins must have localization signals to ensure that they are targeted to the correct golgi compartment and not swept further along the secretory pathway. there are a number of distinct groups of golgi membrane proteins, including glycosyltransferases, recycling trans-golgi network proteins, peripheral membrane proteins, receptors and viral glycoproteins. recent studies indicate that there are a number of different golgi localization signals and mechanisms for retaining proteins in the golgi apparatus. this review focuses on the current knowledge in this field. the survival of a cell depends on maintaining the integrity of the intracellular organelles. this feat is achieved by highly selective sorting and accurate transport of proteins to their correct destinations. over the past few years the dissection of the molecular machinery for targeting and localization of proteins, in particular proteins of the secretory pathway, has been pursued vigorously. defined sequence motifs have been identified on proteins which can act as 'address labels'. the golgi apparatus represents the 'hub' of the secretory pathway where intense membrane traffic is controlled. this organelle not only co-ordinates the sorting of newly synthesized proteins but is also responsible for the control of posttranslational modifications, in particular glycosylation. 
a fundamental question currently being addressed in cell biology is how the golgi apparatus is organized to achieve these demanding functions and how it maintains its structural integrity in spite of the intense membrane traffic which enters and leaves this organelle. this review will focus primarily on our understanding of the molecular signals and mechanisms for the retention of resident golgi proteins. the golgi apparatus is a highly complex and dynamic organelle, which has been difficult to define in three-dimensional terms [1] . it consists of a number of (reflecting in part the cholesterol content), ph, and most importantly in the populations of resident proteins which they contain. however, detailed biochemical characterization of the individual cisternae is lacking as the current methods are inappropriate to allow their purification. newly synthesized proteins are transported sequentially from the er to the golgi and then to their final destination. the transport of newly synthesized proteins from the endoplasmic reticulum to golgi cisternae, between adjacent cisternae within the golgi stack, and from golgi cisternae to various destinations is mediated by vesicles shuttling between donor and recipient compartments. vesicles bud from one compartment and then target and fuse with the next compartment [7, 8] . an increasing number of structural and regulatory components have been identified which are involved in the orchestration of the complex and intriguing processes of budding, specific targeting, docking and fusion [9] [10] [11] . some of the components of this machinery are localized only to golgi membranes and are thought to be specific for membrane transport through and from the golgi apparatus. the restricted location of these components indicates the presence of specific localization signals. 
until recently it was widely believed that, if correctly folded, newly synthesized proteins are transported from the endoplasmic reticulum, through the golgi stack to the tgn without the requirement for a specific transport signal [7, 12] . however, the concept of 'bulk flow' of membrane proteins from the er has recently been challenged by the finding that vesicular stomatitis virus g glycoprotein is significantly concentrated during export from the er [13] . nonetheless forward transport, at least from the golgi apparatus to the cell surface, appears to constitute a signal-independent or default pathway. despite this extensive flux of proteins, it is imperative that the golgi apparatus maintains the set of resident proteins which define its unique structural and functional properties. thus, it would appear that newly synthesized golgi membrane proteins must stop at the correct cisterna, or subcompartment, and be prevented from being swept into transport vesicles that bud from the dilated rims of the cisternae. clearly, specific localization signals are required for retention of proteins which reside in the golgi apparatus. what do we know about the sorting signals and mechanisms for the localization of non-golgi proteins within the secretory and endocytic pathways? a number of sorting signals have been found associated with the cytoplasmic domains of membrane proteins [14] . for example, short tyrosine-containing peptide motifs found on cytoplasmic domains direct the sorting of proteins from the plasma membrane via the receptor mediated endocytosis pathway [15] and the transcytosis of basolateral proteins to the apical surface of certain polarized cells [16] , while a dileucine-containing peptide motif directs the transport of man-6-p receptors from the tgn to the late endosomes [17, 18] . 
these cytoplasmic domain sorting signals mediate interactions with coat structures of budding vesicles and thereby allow the selective vesicular transport of these membrane proteins between a variety of compartments [19]. much progress has been made in defining the retention signals for resident er proteins. targeting signals have been identified for both soluble and membrane-bound proteins residing in the er. a specific retention signal, comprising the carboxy-terminal sequence kdel/hdel, has been identified for a number of resident soluble er proteins [20, 21], and a receptor for this retention sequence has been identified [22-24]. retention of these soluble er proteins is mediated by a receptor-based salvaging mechanism, whereby escaped kdel-bearing proteins are retrieved from a post-er compartment by a recycling kdel receptor [24]. for membrane proteins of the er, a double lysine motif (kkxx) at the cytoplasmically exposed carboxy terminus of certain type i membrane proteins has been shown to specify er residence [25]. for type ii membrane er proteins, a related double arginine motif at the cytoplasmically orientated amino terminus has been identified [26]. interestingly, the localization of ergic-53 (p53), a type i membrane protein of the intermediate compartment or cgn, requires a kkxx er retention motif, again suggesting that the cgn may be an extension
overall, the localization signals of non-golgi proteins are hydrophilic motifs located on either the cytoplasmic or luminal domains of the protein, and some of these signals have been shown to interact specifically with receptor molecules or with protein coats of budding vesicles. it is now clear from recent studies that there are a number of distinct types of golgi localization signals. based on these localization signals and other biochemical features, the resident proteins of the golgi apparatus can be divided into five groups (table 1). they are all membrane proteins. 
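the er retention motifs described above are short, position-specific sequence patterns, so a first-pass screen for them can be sketched in a few lines of python. this is only an illustration under stated assumptions: the function name and the test sequences are hypothetical, the window used for the double arginine motif (two adjacent arginines within the first five residues) is an assumed simplification rather than a definition taken from this review, and membrane topology is not checked at all.

```python
import re

# Illustrative patterns only. Each motif is position-specific:
#   KDEL/HDEL : last four residues of a soluble ER protein
#   KKXX      : lysines at positions -4 and -3 of a type I membrane protein
#   RR        : adjacent arginines near the N-terminus of a type II membrane
#               protein (the five-residue window is an assumption)
ER_SIGNALS = {
    "KDEL/HDEL": re.compile(r"[KH]DEL$"),
    "KKXX": re.compile(r"KK..$"),
    "RR": re.compile(r"^.{0,3}RR"),
}


def find_er_signals(seq):
    """Return the names of the ER retention motifs matched by a one-letter sequence.

    Topology (soluble, type I, type II) is NOT checked here, so a hit is
    only a candidate signal, not evidence of ER residence.
    """
    seq = seq.upper()
    return [name for name, pattern in ER_SIGNALS.items() if pattern.search(seq)]


# Hypothetical toy sequences, not real proteins.
print(find_er_signals("MAAAGGGKDEL"))
print(find_er_signals("MSTLLAAKKAA"))
print(find_er_signals("MRRSTLLAAAA"))
```

a hit from such a scan is at best a starting point, since a motif only functions in the appropriate context (luminal carboxy terminus for kdel, cytoplasmically exposed termini for kkxx and the double arginine motif).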
interestingly, no soluble resident golgi proteins have been identified within the lumen of the golgi apparatus, which probably indicates that the mechanisms for retaining proteins in this organelle are restricted to membrane-associated proteins. the five groups are described individually as the mechanism of golgi localization may be unique for each group. the golgi apparatus plays a key role in the glycosylation of newly synthesized membrane and secreted proteins [29, 30]. based on the exquisite specificities of the currently defined glycosyltransferases [29, 31], the synthesis of all the known carbohydrate chains of glycoconjugates must require in the region of 100-200 different glycosyltransferase enzymes distributed throughout the golgi stack. however, very little is known about the structural organization of these integral membrane enzymes within the membranes of the golgi cisternae, and the signals which define their localization within the golgi are only now beginning to be defined. a number of glycosyltransferases have restricted distributions within the golgi apparatus, notably β1,4 galactosyltransferase (β1,4galt) (trans-golgi) [32-34], α2,6 sialyltransferase (α2,6st) (trans-golgi and trans-golgi network) [35, 36] and n-acetylglucosaminyltransferase i (glcnacti) (medial-golgi) [34, 37, 38]. furthermore, simultaneous immunogold localization of β1,4galt and glcnacti in the same golgi apparatus confirms that these enzymes have distinct, though overlapping, distributions [34]. from the purification tables of golgi glycosyltransferases, it is clear that individual transferases constitute only a minor percentage of the proteins of the cisternae in which they reside. for example, glcnacti constitutes about 1% of medial-specific golgi membrane protein in rabbit liver [39]. 
however, in view of the estimated number of golgi glycosyltransferases, collectively the glycosyltransferases of each golgi cisterna may represent a very significant proportion, if not the bulk, of the resident membrane proteins. numerous mammalian glycosyltransferases have been cloned and sequenced (see reviews [40-43]). individual glycosyltransferases are highly conserved across species; for example, rabbit glcnacti shares 92% and 93% amino acid sequence identity with human and mouse glcnacti, respectively [44-47]. but comparison of the amino acid sequences between the glycosyltransferases has revealed only isolated cases of sequence similarity. for example, there is a high degree of sequence similarity between blood group a and b glycosyltransferases [48], which are products of two alleles, and between a number of α3(4) fucosyltransferases [49]. furthermore, a conserved motif has been identified in the catalytic domain of cloned sialyltransferases (the 'sialylmotif') [50]. however, these examples of sequence similarity are the exceptions and, overall, little sequence similarity has been detected between different glycosyltransferases. this is dramatically illustrated by a lack of obvious amino acid similarity between the sequences of four different glcnac transferases involved in the synthesis of the outer antennae of complex n-glycans, namely glcnacti [44, 45], glcnactii [51], glcnactiii [52], and glcnactv [53]. one would expect there to be structural similarity between the catalytic sites of these glcnac transferases, but this has not been detected from their amino acid sequences. thus, comparison of the primary structures of golgi glycosyltransferases has not revealed a potential golgi localization motif. there is, however, a striking similarity in the domain structure of these golgi enzymes. 
all golgi transferases cloned to date are n(in)/c(out) (type ii) membrane proteins containing a single hydrophobic membrane-spanning domain (16-25 amino acids) which also serves as a non-cleavable signal sequence. each has a short n-terminal cytoplasmic domain (many have less than 10 amino acids), and a large carboxyl-terminal catalytic domain situated in the lumen of the golgi apparatus. the catalytic domain is linked to the transmembrane domain by a loosely defined 'stem' region which may play a role in positioning the catalytic domain away from the lipid bilayer, facilitating access to the substrate. over the past 4 years a number of groups have attempted to identify the targeting signal responsible for the localization of glycosyltransferases. three glycosyltransferases have been extensively examined, namely α2,6st, β1,4galt, and glcnacti. these enzymes are residents of the tgn, tgn/trans-golgi, and medial-golgi, respectively. a common strategy has been employed by all groups to identify a putative golgi retention signal(s) by analysing the localization, in transfected mammalian cells, of hybrid molecules containing limited sequences derived from golgi glycosyltransferases. in all cases, the membrane-spanning domains of the golgi glycosyltransferases have been shown to direct at least partial localization of hybrid molecules to the golgi apparatus [54-61]. indeed it has been further demonstrated that the transmembrane domains of β1,4galt and glcnacti can specifically localize hybrid proteins to the trans and medial cisternae, respectively [55, 61]. however, a number of studies have shown that sequences flanking the transmembrane domain also play auxiliary roles in mediating golgi localization [54, 61-63]. these studies all agreed that the transmembrane domain plays a central role in the targeting and localization of resident golgi glycosyltransferases. 
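the type ii domain layout described above (short cytoplasmic tail, a single hydrophobic span of 16-25 residues, then a large luminal domain) is the kind of feature that a sliding-window hydropathy scan can flag. the sketch below is a minimal illustration, not a method used in the studies reviewed here: it uses the kyte-doolittle hydropathy scale, and the function name, the 19-residue window, the 1.6 cut-off (a conventional kyte-doolittle threshold) and the toy sequence are all assumptions introduced for the example.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def best_hydrophobic_window(seq, window=19):
    """Return (start, mean_hydropathy) for the most hydrophobic window.

    start is a 0-based index; ties keep the first window found.
    Non-standard residue codes (e.g. X) raise KeyError.
    """
    seq = seq.upper()
    if len(seq) < window:
        raise ValueError("sequence is shorter than the window")
    best_start, best_score = 0, float("-inf")
    for start in range(len(seq) - window + 1):
        score = sum(KD[aa] for aa in seq[start:start + window]) / window
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score


# Hypothetical type II-like toy sequence: 5-residue tail, 20-leucine core,
# polar remainder standing in for stem and catalytic domain.
toy = "MRKSP" + "L" * 20 + "DEKRSTNQ" * 4
start, score = best_hydrophobic_window(toy)
is_candidate_span = score > 1.6  # assumed conventional cut-off
print(start, round(score, 2), is_candidate_span)
```

such a scan only locates a hydrophobic stretch; as the studies discussed here show, where a group chooses to draw the boundaries of the transmembrane domain, and whether flanking charged residues are included, can differ and affects how results are compared.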
this was an unexpected finding as localization signals up until then were hydrophilic regions of the cytoplasmic or luminal domains. the involvement of a hydrophobic stretch of amino acids in targeting indicated a unique mechanism for the localization of these resident golgi proteins. these initial studies were based to a large extent on the premise that a discrete region, or motif, was responsible for the golgi localization signal. further studies have indicated that, although the transmembrane domain is relevant, the situation is far more complex than at first appreciated. figs 2-4 summarize graphically the regions of glycosyltransferases that have been examined and their ability to direct reporter molecules to the golgi apparatus. collectively, a large number of constructs have now been analysed. figs 2-4 include many of these constructs; however, they are by no means all-inclusive. those selected are the most instructive and highlight the complexity of the situation. there is considerable variability in the results obtained between groups, even when comparing the same glycosyltransferase. for example, wong et al. [60] reported that the transmembrane domain of α2,6st resulted in very efficient golgi localization of a hybrid molecule, whereas a hybrid construct of munro [54], containing the equivalent α2,6st domain, resulted in considerable leakage to the cell surface (fig. 2). site-directed mutagenesis of residues of the transmembrane domain of β1,4galt, in the context of hybrid molecules, suggested that uncharged polar residues are critical for the ability of these hydrophobic domains to mediate golgi retention [56]. however, a number of other studies have indicated that considerable alterations can be made to the transmembrane domain of the native enzymes without abolishing golgi retention. 
for example, colley et al. [62] showed that sequential replacement of 4-5 amino acid blocks of the transmembrane domain of α2,6st had no effect on golgi localization. further, munro [54] made the striking demonstration that the transmembrane domain of an α2,6st hybrid protein (containing the stem and tail of α2,6st) can be totally replaced by a poly-leucine sequence of similar length without adversely affecting golgi retention. munro [54] also reported that the length of the poly-leucine segment appeared to be important in maintaining efficient golgi localization, as a transmembrane domain of 23 leucine residues showed leakage to the cell surface. in contrast to this apparent length requirement, dahdal and colley [63] replaced the 17 amino acid transmembrane domain of α2,6st with the long 29 amino acid transmembrane domain from influenza neuraminidase without any apparent disruption of the retention signal. the difficulty in defining the structural elements associated with transmembrane domains in golgi localization has been highlighted by a recent study by low et al. [64], who demonstrated that swapping the transmembrane domains of two cell surface proteins resulted in hybrid molecules which either accumulated in the golgi or were retarded in transport through the golgi apparatus. thus, although both native proteins are transported efficiently to the cell surface, swapping the transmembrane domains of these two proteins altered golgi to cell surface transport. these investigators concluded that the hydrophobic transmembrane domain in relation to its charged flanking sequences is important in transport from the golgi apparatus to the cell surface. for both α2,6st and glcnacti it is clear that regions flanking the transmembrane domain can augment the efficiency of golgi localization. 
for example, additional sequences from the stem region and the cytoplasmic tail increase the efficiency of golgi localization of α2,6st and glcnacti [54, 58, 61], although the tail and/or stem of α2,6st and glcnacti alone is not capable of retaining a reporter molecule in the golgi [54, 60, 61], and removal of the stem region from wild-type α2,6st [62] or β1,4galt [56] does not disrupt golgi localization. although the potential of the stem region has been addressed in a number of studies, the potential role of the catalytic domain has been overlooked in most studies. yet, removal of the cytoplasmic tail and stem, and considerable alteration of the signal/anchor domain, still allowed hybrid α2,6st molecules containing the catalytic domain to be golgi localized [62, 63]. it should be noted that the membrane flanking sequences, comprising short stretches of charged residues, were maintained in the latter constructs, which may also be an important factor in localization. the contribution of the membrane flanking sequences of β1,4galt to golgi localization is not known. comparison of the results of all these studies is not straightforward and there are a number of factors which may account for the apparent lack of agreement between them. first, as yet there is no direct evidence that glycosyltransferases localized to different golgi subcompartments are retained by identical mechanisms. there may be subtle differences between the localization signals which specify residency in medial- and trans-cisternae and in the tgn. second, in the majority of studies sequences involved in golgi localization have been identified by their ability a. 
third, the definition of the transmembrane domain has varied from group to group; in some cases the transmembrane domain has been defined as a stretch of hydrophobic residues, excluding charged residues necessary for anchoring membrane proteins within the lipid bilayer [38, 54], whereas in other studies, two or three charged amino acids on either side of the hydrophobic stretch have been included in the sequence defined as the transmembrane domain [55]. fourth, the appearance of hybrid or mutant glycosyltransferases at the cell surface has been frequently used as a measure of disruption of golgi localization. however, the majority of groups have only used fluorescence microscopy to compare levels of cell surface expression. fluorescence microscopy is relatively insensitive and comparisons are at best qualitative. only a few studies have employed the more sensitive and quantitative technique of flow cytometry to compare levels of cell surface expression between constructs. fifth, the expression levels of the hybrid constructs vary between and within studies. we believe this to be a critical factor in assessing these results. whereas many groups have shown that the native glycosyltransferases can be expressed at very high levels without saturation of the golgi retention mechanism [38, 54, 55, 59] [44] . of expression have been observed between different constructs. in addition, for any one construct there is a wide range of expression within one transfection experiment, and indeed dahdal and colley [63] noted that surface expression of some hybrid proteins appeared to be related to the level of expression in the cells. in our studies on β1,4galt, we have observed that a construct expressed transiently in cos cells showed a different intracellular distribution to that of the same construct stably expressed in mouse l cells. 
replacement of the 20 amino acid transmembrane domain of β1,4galt with the 27 amino acids from the transferrin receptor resulted in abundant cell surface expression in cos cells, with very little detected in the golgi region, whereas in stable l cells, which expressed the hybrid molecules at a 50 to 100-fold lower level, substantial amounts of the hybrid molecules were specifically retained within the golgi apparatus [66]. clearly, stable clones expressing low levels of the hybrid molecules are likely to be more informative. sixth, several groups have identified glycosyltransferase sequences which are capable of conferring golgi localization upon reporter proteins, but have neglected to assess the role played by these sequences within the context of the full length enzyme. strategies involving reporter proteins are useful for determining the minimum sequence requirements for golgi localization of hybrid molecules. however, it does not necessarily follow that a sequence which is sufficient to confer golgi localization upon a reporter molecule is the only sequence involved in retention of the native enzyme. this point was illustrated earlier with the golgi localization of glycosyltransferases bearing substituted transmembrane domains (figs 2-4). furthermore, most of the studies which have made substitutions of the native enzyme have not assessed the effect of those substitutions on enzyme activity; thus it is unclear whether the structure of the luminal catalytic domain has been perturbed in these studies. in our recent study on glcnacti [61] we have attempted to address many of these problems and have assessed the relative contribution of the cytoplasmic tail, transmembrane domain, and catalytically active luminal domain in medial-golgi localization. stable l cell clones expressing hybrid molecules were generated, and clones which expressed equivalent amounts of hybrid proteins were selected for analyses. 
all hybrid molecules expressing the luminal domains of glcnacti were catalytically active, implying a native structure for this domain. cellular localization was assessed by fluorescence microscopy, immuno-electron microscopy and flow cytometry. overall our study showed that each of the three glcnacti domains contributes significantly to medial-golgi localization. soluble, catalytically active forms of β1,4galt and α2,6st, which lack the cytoplasmic tail, transmembrane domain, and luminal stem region, have been detected in body fluids and are thought to be derived from the membrane-bound forms by proteolytic cleavage [67-69]. when the cytoplasmic tail, transmembrane domain and luminal stem region of either α2,6st or β1,4galt are replaced by a cleavable signal sequence, the resulting truncated enzymes are also rapidly secreted from transfected cells [59, 70]. from these data it has been argued that the catalytic domains of glycosyltransferases do not contain golgi retention signals. however, it is entirely possible that the luminal catalytic domain can only function in golgi retention if it is anchored to the membrane. at this stage the relative contribution of the stem and catalytic domains in the localization of glycosyltransferases is unresolved. overall, it is most unlikely that golgi retention is determined by a discrete and continuous sequence motif or peptide segment; rather, localization of golgi glycosyltransferases could be mediated by interactions spanning the entire length of the molecule. what are the possible mechanisms for the compartment-specific localization of the membrane-bound glycosyltransferases? it is unlikely that localization of glycosyltransferases involves a simple receptor-ligand interaction where the receptor is fixed in the golgi cisternae, as over-expression of wild-type transferases does not result in saturation of the retention mechanism [38, 54-56, 59, 62]. 
an alternative possibility is that escaped golgi glycosyltransferases are retrieved from post-golgi compartments, as with soluble er proteins. however, experimental evidence seems to argue against a retrieval system for golgi glycosyltransferases. wong et al. [60] have demonstrated that α2,6st leaked to the cell surface is not retrieved back to the golgi apparatus. also, we [66] have demonstrated that β1,4galt which has escaped golgi retention undergoes a post-translational modification, probably in the tgn, before appearance at the cell surface; as the golgi-localized β1,4galt does not accumulate this modification, a retrieval system would appear unlikely to play a dominant role. what could be the basis of an active golgi retention mechanism? it has been suggested by a number of investigators that retention of golgi glycosyltransferases could be mediated by the formation of protein aggregates within the membranes of the correct golgi cisternae [71-73]. this model proposes that such oligomers or aggregates of glycosyltransferases would then be excluded from entry into vesicles for forward transport. although an attractive hypothesis, the evidence for aggregation remains largely indirect. recent elegant experiments performed by nilsson et al. [74] have shown that the addition of an er retention motif to the glcnacti cytoplasmic tail not only causes glcnacti to localize to the er but also partially retains another medial-golgi enzyme, namely α-mannosidase ii, within the er. furthermore, burke [75] has demonstrated co-precipitation of glcnactii activity, another medial-golgi enzyme, using antibodies specific to glcnacti. as the amino acid sequences of the glcnacti and glcnactii transferases are not related, a likely explanation is the association of enzymes which occupy the same golgi cisternae. warren and colleagues have coined the term 'kin recognition' to denote this self-association of golgi enzymes [72]. 
the proposed aggregation of golgi glycosyltransferases is also consistent with earlier observations that the majority of glcnacti and β1,4galt exist as high molecular weight material following detergent extraction of tissue [39, 76]. how could the three domains of a glycosyltransferase play a role in aggregation? the fact that each domain of glcnacti is required for complete golgi retention implies that all three domains may be involved in the lateral interactions which lead to aggregate formation. for example, the transmembrane domains of resident golgi proteins may mediate homo- or hetero-dimerization via protein-protein interactions along uncharged polar faces of α-helices, predicted for some of the glycosyltransferases [71], or along one face of the α-helix containing a leucine zipper, as predicted for glcnacti [44]. such dimers may form prior to their arrival in the golgi, as indicated by the results of er retention of glcnacti/manii [74]. these homo- or hetero-dimers may then be induced to interact, within the correct golgi microenvironment, through their large luminal domains, resulting in a two-dimensional aggregate (fig. 5). aggregation may be induced by differences within the golgi cisternae, such as ph and calcium concentration. this model differs somewhat from that of warren's group which proposes that the golgi enzymes form homo-dimers (via their stem regions) and interact via their transmembrane domains with different neighbours to generate linear hetero-oligomers. α-mannosidase ii has been shown to be a disulphide-bonded dimer, but there is no evidence of stable covalent dimer formation for any of the golgi glycosyltransferases. the first report of a purified membrane-bound form of glycosyltransferase, namely β1,4galt, indicates that no disulphide bonded dimers exist [76], contrary to an earlier suggestion [77].
in addition, warren's model of linear heteroaggregates cannot readily explain the efficient golgi retention of hybrid molecules containing a transmembrane domain of a glycosyltransferase and a reporter molecule known to be a monomer in the native state, such as ovalbumin [38, 59, 61]. retention of such hybrid molecules could only occur at the ends of the linear aggregates, via their transmembrane domains, and would effectively cap these linear structures resulting in only a very minimal number of hybrid molecules retained in the golgi apparatus. finally, the cytoplasmic tail of the glycosyltransferases may be necessary for either transmembrane-mediated dimerization or, as proposed by slusarewicz et al. [78], may interact in a salt-dependent manner with a putative intercisternal matrix. consistent with this proposal, the differences in solubility of wild-type glcnacti and the glcnacti hybrid proteins indicate that golgi localized molecules may exist in a different physical state from their cell surface counterparts [61]. an interaction of the glycosyltransferases with the intercisternal matrix (the 'golgi glue'), either directly or indirectly, would ensure that the aggregates are immobilized within the golgi membranes and so are excluded from transport vesicles. clearly an aggregation model of retention may involve many additional components and further biochemical analysis is now required. a model of golgi localization also needs to account for the presence of soluble catalytically active forms of β1,4galt and α2,6st which have been detected in body fluids. the retention model proposed above could allow the release of soluble catalytic oligomers from the golgi aggregate by proteolytic cleavage, with the subsequent dissociation of the oligomers to monomers. golgi membranes differ in lipid composition from the er and plasma membranes. such lipid differences may be important in mediating interactions between the transmembrane domains of glycosyltransferases.
a lipid mediated mechanism of protein sorting has been proposed by bretscher and munro [79] who have suggested that the typically shorter transmembrane domains of golgi proteins may interact selectively with the low cholesterol bilayers of golgi membranes and be excluded from the thicker, cholesterol-sphingolipid enriched bilayers of post-golgi membranes. a protein-lipid interaction is compatible with the observations that the length of the hydrophobic stretch coupled with the adjacent flanking residues is important in golgi retention, rather than the actual amino acid sequence. the models discussed above are by no means mutually exclusive and it is possible that the golgi retention mechanism includes both a protein-lipid interaction (via the transmembrane domains of the proteins) and protein-protein aggregation. based on the aggregation model of golgi retention outlined above, wild type glycosyltransferases expressed in transfected cells may be retained within golgi cisternae as a consequence of self aggregation, or by virtue of their ability to interact or 'dock' to existing aggregates within the golgi apparatus of the mouse cells. self aggregation, as opposed to docking, represents a potentially non-saturable means of retention, consistent with the many reports that glycosyltransferases expressed in heterologous cell lines do not leak to the plasma membrane even when expressed at vastly elevated levels [38, 54-56, 59, 62]. on the other hand, hybrid constructs would have a reduced capacity to self-aggregate, due to insufficient domains, and may be retained by interacting or 'docking' to existing aggregates within golgi membranes, either through their luminal domain or transmembrane domain (fig. 6). this would be a more readily saturable means of retention, with only a finite number of exposed, endogenous molecules available as 'docking' sites.
the fact that golgi localized glcnacti hybrid proteins, including those which lack the glcnacti cytoplasmic tail, are predominantly localized to medial-golgi cisternae is in agreement with this proposal [61] . thus, golgi localization of hybrid molecules probably reflects the minimum structure(s) required to 'dock' with endogenous golgi aggregates. this model would also help to explain the discrepancies between studies as the expression levels of the hybrid molecules would be an important factor in the efficiency of golgi localization. the high level of conservation of individual glycosyltransferases across species is also consistent with structural constraints imposed by such an aggregation model. a conserved structure would be required in order to preserve the many interactions with neighbouring enzyme molecules of a heteroaggregate within the golgi compartment. if golgi glycosyltransferases have evolved with a fundamental requirement for such inter-molecular interactions, it would also explain the conservation of the retention mechanism across species and also the ability of an animal glycosyltransferase to be apparently correctly housed in the golgi apparatus of plant cells [80] . while most viruses mature at the plasma membrane, a limited number of viruses acquire their envelopes by budding into intracellular compartments. viruses which assemble from golgi membranes include, coronavirus, bunyavirus and pox virus [81, 82] . viral budding from the golgi apparatus is probably determined by the targeting of one or more viral glycoproteins to the golgi membranes. indeed, a number of viral proteins have been shown to be independently targeted to the golgi apparatus, including the m glycoproteins of an avian coronavirus [83] and a related murine coronavirus [84, 85] , the e1 and e2 spike glycoproteins of rubella virus [86] and the g1 glycoprotein of punta tora virus [87] . 
as a consequence of the specific localization of these viral glycoproteins they represent useful tools for the study of protein targeting to the golgi apparatus. the m (formerly called e1) glycoprotein of the avian coronavirus, infectious bronchitis virus (ibv), has been shown to be localized specifically to the cis-golgi cisternae [83]. in contrast to the type ii membrane orientation of the glycosyltransferases, the ibv m glycoprotein contains a short glycosylated amino-terminal domain, three membrane spanning domains and a carboxy-terminal cytoplasmic tail. only the first of the three membrane spanning domains of m glycoprotein of ibv is required to retain this protein in the golgi [88]. furthermore, this membrane spanning domain is sufficient to confer golgi localization upon a plasma membrane localized protein [89]. thus, as for the glycosyltransferases of the medial and trans-cisternae and the tgn, the transmembrane domain of a resident protein of the cis-cisternae has also been implicated in retention. extensive mutagenesis showed that four polar residues in the first m transmembrane domain were critical for golgi retention of a hybrid protein with the vsv g glycoprotein [90]. these four polar residues are predicted to form an uncharged polar face along one side of an α-helix, which has potential to be involved in protein-protein interactions and mediate oligomer formation. indeed, aggregation has been shown to correlate with retention of this m hybrid protein in the golgi apparatus [91]. these investigators demonstrated that the appearance of sds-resistant aggregates of the m hybrid protein correlated with golgi localization, whereas mislocalized transmembrane domain mutants do not form oligomers. the aggregates have not been biochemically characterized but it is possible that they include endogenous golgi proteins.
however, sds-resistant oligomers of the native m glycoprotein were not detected in this study [91], thus the relationship between aggregate formation of the m hybrid molecule and golgi retention of the native m protein remains unclear. in contrast to the findings for the m glycoprotein of ibv, machamer et al. [90] in the past few years it has become apparent that there is a distinct set of resident golgi proteins in the tgn of mammalian cells, and the late golgi of yeast, that have features associated with their localization which are distinct from the golgi glycosyltransferases [93-95]. these differences are associated with the structure of the proteins. the group includes the mammalian tgn38/41 [95] and furin [96], and the yeast proteolytic enzymes kex1p, kex2p, and dipeptidyl aminopeptidase a (dpap a) (for review see [97]). in contrast to the golgi glycosyltransferases, tgn38/41, furin, kex1p and kex2p are type i membrane proteins; however, membrane orientation is not a distinguishing characteristic of the group as dpap a is a type ii membrane protein. in contrast to the golgi glycosyltransferases, the cytoplasmic tail of all these proteins is essential for golgi localization and, in addition, a retrieval signal plays a role in defining residence of these proteins to the tgn or late golgi. tgn38/41 is a heterodimeric membrane protein complex which cycles between the tgn and the cell surface [95, 98-100]. tgn38/41 has been shown to interact with cytosolic proteins and may be involved in the formation of exocytic vesicles from the tgn [95, 98-100]. a number of groups have demonstrated that the tetrapeptide sequence yqrl, within the 33 amino acid cytoplasmic tail of tgn38, is both necessary and sufficient to target this membrane protein to the tgn [101-103]. this tyrosine-containing motif also acts as an internalization motif from the plasma membrane, via interaction with clathrin-coated pits.
recently this tyrosine motif has been shown to lie within an α-helix, and not a tight β-turn conformation which is typical of other tyrosine-containing internalization motifs [104]. there is evidence that individual amino acids around the tyrosine of the tgn38/41 internalization motif could signal different intracellular locations. for example, mutation of the yqrl sequence to yqdl abrogated tgn localization of tgn38 but did not affect internalization [101]. recently furin, a membrane associated subtilisin-like protease, has been shown to be concentrated in the tgn [96]. like tgn38/41, furin also cycles between the cell surface and tgn. sequences of the cytoplasmic tail of furin are required for tgn targeting, and a potential tyrosine motif has been identified [96, 105]. on the other hand, potential tyrosine motifs for internalization appear to be absent in the cytoplasmic tails of golgi glycosyltransferases. the yeast proteins dpap a, kex2p and kex1p are all integral membrane proteins with cytoplasmic tails of about 100 amino acids. these cytoplasmic tails are required for retention of these enzymes in the late golgi since deletions in the tail reduce the efficiency of retention [106-108]. the retention signals within the cytoplasmic tails of these proteins have been identified and are very similar to the proposed general motif for clustering into clathrin-coated pits of animal cells (see [94, 97]). deletion of the golgi retention signal, or over-expression of these proteins, results in mislocalization to the vacuolar compartment. this initially surprising finding has led to the conclusion that the default destination for membrane proteins in the yeast secretory pathway is the vacuolar compartment and not the plasma membrane. studies on the yeast vps mutants suggest that dpap a may leak from the late golgi and is transported, via the default pathway, to a post-golgi/pre-vacuolar compartment [97].
the cytoplasmic localization signals of these escaped dpap a molecules then mediate retrieval back to the late golgi; in the absence of the cytoplasmic tail golgi localization signals these membrane proteins would continue to be transported along this default pathway to vacuoles. thus there are clear similarities in the mechanism of golgi localization of these yeast proteolytic enzymes and mammalian tgn38/41 and furin. the kdel receptor resides in the cgn and possibly throughout the entire golgi stack [22, 24]. the kdel receptor is predicted to have six or seven transmembrane domains [109]. empty receptors do not recycle back to the er; however, after binding to ligand the ligand-receptor complex is returned by retrograde transport to the er [24]. thus, this receptor is likely to have signals for golgi localization. however, mutational analysis of the kdel receptor, although defining structural features associated with ligand binding and retrograde transport, revealed very little about the nature of the putative golgi localization signal [109]. there are a number of structural membrane proteins and proteins associated with the machinery of vesicular transport that are localized specifically to the golgi apparatus, for example β-cop [110], rab6 and rab12 [111], p230 [112], p200 [113], heterotrimeric g proteins [114, 115], sec 7 [116] and the actin binding protein, comitin [117] (table 1). these are not integral membrane proteins as they do not have transmembrane domains, but rather are peripheral membrane proteins associated with the cytosolic face of golgi membranes. some of these components recycle between a cytosolic pool and golgi membranes. in general very little is known about the golgi localization signals for these peripheral membrane proteins. there is evidence that the carboxy-terminal region of the gtp binding protein g,n is required for golgi membrane binding [118].
membrane association of rab proteins requires the geranylgeranylation of one or two c-terminal cysteines [119] as well as a localization signal to define the organelle-specificity. the hypervariable c-terminus of rab proteins has been implicated in localization [120] , although a recent study on rab6 indicates that efficient localization of this rab protein to golgi membranes requires both n-terminal and c-terminal domains [121] . the identification of the precise nature of the targeting signals and the mechanism of localization of these peripheral membrane proteins will be important to the understanding of the organization of the golgi apparatus and vesicular transport. it is now apparent that the localization of resident golgi proteins includes more than one mechanism. for some late golgi membrane proteins a retrieval system operates to recycle proteins from post-golgi compartments. on the other hand, golgi glycosyltransferases appear to be actively retained within golgi membranes; there is no evidence that glycosyltransferase molecules which have leaked from the golgi apparatus can be retrieved. from many 'cut and paste' experiments it is apparent that the localization of glycosyltransferases does not involve a discrete retention signal but may be dependent on many interactions spanning the length of the molecule. furthermore, there is increasing evidence to suggest that retention of glycosyltransferases involves the formation of aggregates within the golgi apparatus. the challenge now is to biochemically characterize these aggregates, to identify any associated molecules that may be important in mediating retention, and to identify the conditions which induce aggregation. 
this will require the development of novel strategies to allow the isolation and biochemical analyses of individual golgi compartments, in particular the lipid composition, the organization of the resident proteins within golgi membranes, and the nature of interactions with the intercisternal matrix. thus, the problem now is understanding the biogenesis of golgi membranes themselves. this work was supported by grants from the national health and medical research council of australia and the australian research council.
key: cord-031937-qhlatg84 authors: verma, anukriti; sharda, shivani; rathi, bhawna; somvanshi, pallavi; pandey, bimlesh dhar title: elucidating potential molecular signatures through host-microbe interactions for reactive arthritis and inflammatory bowel disease using combinatorial approach date: 2020-09-15 journal: sci rep doi: 10.1038/s41598-020-71674-8 sha: doc_id: 31937 cord_uid: qhlatg84 reactive arthritis (rea), a rare seronegative inflammatory arthritis, lacks exquisite classification under rheumatic autoimmunity. rea is solely established using differential clinical diagnosis of the patient cohorts, where pathogenic triggers linked to enteric and urogenital microorganisms e.g. salmonella, shigella, yersinia, campylobacter, chlamydia have been reported. inflammatory bowel disease (ibd), an idiopathic enteric disorder co-evolved and attuned to present gut microbiome dysbiosis, can be correlated to the genesis of enteropathic arthropathies like rea. gut microbes symbolically modulate immune system homeostasis and are elementary for varied disease patterns in autoimmune disorders. the gut-microbiota axis structured on the core host-microbe interactions executes an imperative role in discerning the etiopathogenesis of rea and ibd.
this study predicts the molecular signatures for rea with co-evolved ibd through the enveloped host-microbe interactions and microbe-microbe ‘interspecies communication’, using synonymous gene expression data for selective microbes. we have utilized a combinatorial approach that has a concomitant in-silico work-pipeline and experimental validation to corroborate the findings. in-silico analysis involving text mining, metabolic network reconstruction, simulation, filtering, host-microbe interaction, docking and molecular mimicry studies results in robust drug target/s and biomarker/s for co-evolved ibd and rea. cross validation of the target/s or biomarker/s was done by targeted gene expression analysis following a non-probabilistic convenience sampling. studies were performed to substantiate the host-microbe disease network consisting of protein-marker-symptom/disease-pathway-drug associations resulting in possible identification of vital drug targets, biomarkers, pathways and inhibitors for ibd and rea. our study identified na(+)/h(+) antiporter (nhaa) and kynureninase (kynu) to be robust early and essential host-microbe interacting targets for ibd co-evolved rea. other vital host-microbe interacting genes, proteins, pathways and drugs include adenosine deaminase (ada), superoxide dismutase 2 (sod2), catalase (cat), angiotensin i converting enzyme (ace), carbon metabolism (folate biosynthesis) and methotrexate. these can serve as potential prognostic/theranostic biomarkers and signatures that can be extrapolated to stratify rea and related autoimmunity patient cohorts for further pilot studies. www.nature.com/scientificreports/ approach has advantages over the traditional approach for network analysis that can help to simultaneously characterize several protein interaction modules and has the potential to study complex diseases.
the vital information obtained in our study from in-silico analysis is cross-validated through targeted gene expression experimental analysis on patient cohorts. this study will help us to obtain clinico-molecular informatics-based outcomes and expand our knowledge regarding the understanding of biological functions for ibd co-existent rea. text mining: data screening and selection. systematic data search and organization was carried out incorporating data identification, data screening and data selection to find target microorganisms involved in inflammatory bowel disease (ibd) and reactive arthritis (rea). data identification was carried out to obtain records through data sources utilising keywords (e.g. "microorganism and inflammatory bowel disease and reactive arthritis") incorporating boolean operators (and/or/not). data screening and selection were carried as part of the manual curation through primary and secondary screening scrutinizing collected data records to obtain organized records relevant for the autoimmune and enteric disorders triggered by microorganisms, especially ibd and rea and the microbial triggers implicated in ibd and rea that were utilised for further metabolic network reconstruction. bottom-up approach consisting of draft reconstruction and manual reconstruction refinement was followed to create metabolic networks of obtained target microorganisms. genome-scale metabolic models simulation, reconstruction and visualization (gemsirv) software 51 that includes reciprocal basic local alignment search tool (blast) of target microorganisms against a template metabolic network of its phylogenetic neighbour and incorporates information from national center for biotechnology information (ncbi), kyoto encyclopedia of genes and genomes (kegg) and transport db was used for creating draft reconstructs. 
the manual curation of missing links or gaps in the draft reconstruct was done by mapping the incomplete information to other databases such as expert protein analysis system (expasy) 52 and integrated relational enzyme database (intenz) 53 . this fully connected and annotated network was used for further simulation studies 54 . the metabolic networks thus obtained were visualized using celldesigner, a tool for modelling and editing biochemical and gene-regulatory networks. simulation analysis was carried out by converting the metabolic networks obtained into a mathematical model and performing gene deletion analysis to retrieve essential genes. model conversion was done by generating stoichiometric matrices consisting of reactions (columns) and metabolites (rows) corresponding to the respective genes. upper and lower flux bounds (flux being the movement of matter across a system) were generated for the gene-associated reactions and metabolites, and the model was extracted in systems biology markup language (sbml) format. the next step was gene deletion analysis, done using the constraint based reconstruction and analysis (cobra) toolbox that runs in matrix laboratory (matlab) 55 , to find the essential genes based upon the gene-reaction matrix and the boolean relationship between genes and reactions 56 . the purpose of data filtering is to remove repeats and homologs from the essential genes of target microorganisms associated with ibd co-existent rea. the non-homologous protein sequences corresponding to the essential genes of target microorganisms were extracted from pathosystems resource integration (patric) database 57 . refinement of protein sequences was further done using the cluster database at high identity with tolerance (cd-hit) 58 suite with a 60% sequence identity threshold for removing repeats.
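the gene deletion step described above can be illustrated with a small python sketch. it is only a toy under stated assumptions: the genes, reactions and metabolites are hypothetical, and a simple topological producibility check stands in for the flux-based simulation that the cobra toolbox performs on the stoichiometric matrix. what it does show is how boolean gene-reaction rules decide which reactions survive a single-gene knockout, and how a gene is then called essential.

```python
# toy single-gene deletion screen. all gene, reaction and metabolite names
# are hypothetical; producibility is checked topologically, not by
# flux balance analysis.

# boolean gene-reaction rules: a bare string is one gene,
# ("and", ...) is an enzyme complex, ("or", ...) is a set of isozymes.
GPR = {
    "r_uptake": "g1",
    "r_convert": ("and", "g2", "g3"),
    "r_alt_a": "g4",
    "r_alt_b": "g5",
    "r_biomass": ("or", "g6", "g7"),
}

# each reaction maps a set of substrates to a set of products.
REACTIONS = {
    "r_uptake": ({"nutrient"}, {"a"}),
    "r_convert": ({"a"}, {"b"}),
    "r_alt_a": ({"b"}, {"c"}),
    "r_alt_b": ({"b"}, {"c"}),
    "r_biomass": ({"b", "c"}, {"biomass"}),
}


def genes_in(rule):
    """collect every gene named in a rule."""
    if isinstance(rule, str):
        return {rule}
    return set().union(*(genes_in(part) for part in rule[1:]))


def rule_active(rule, present):
    """evaluate a rule against the set of genes still present."""
    if isinstance(rule, str):
        return rule in present
    results = [rule_active(part, present) for part in rule[1:]]
    return all(results) if rule[0] == "and" else any(results)


def producible(active, seeds, target):
    """fire active reactions until no new metabolite appears."""
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for name in active:
            substrates, products = REACTIONS[name]
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return target in available


def essential_genes(seeds=("nutrient",), target="biomass"):
    """return genes whose single deletion blocks production of the target."""
    all_genes = set().union(*(genes_in(rule) for rule in GPR.values()))
    essential = []
    for gene in sorted(all_genes):
        present = all_genes - {gene}
        active = [name for name, rule in GPR.items() if rule_active(rule, present)]
        if not producible(active, seeds, target):
            essential.append(gene)
    return essential


print(essential_genes())
```

in this toy network the screen flags g1, g2 and g3: the uptake gene and both subunits of the complex are indispensable, whereas isozymes (g6, g7) and genes for parallel reactions (g4, g5) can each be lost without blocking biomass production.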
blast-p was further used to remove the homologs from such non-repeats against human database at e-value of 10^-4 to obtain non-homologous protein sequences used for further in-silico analysis. essential host-microbe and microbe-microbe interactions. the host-microbe interactions of the non-homologous proteins for the selected target microorganisms were obtained using host-pathogen interaction database (hpidb) 59, 60 . the host-microbe interactions were visualised using cytoscape. simulation analysis (gene essentiality) was done to obtain the essential host proteins interacting with common microbe proteins of microorganisms triggering ibd and rea utilising the human metabolic model hmr 2, a cobra compliant metabolic model of human consisting of around 3,765 genes, 8,000 reactions and 3,000 metabolites 61 . this led to profiling of the common host-microbe and microbe-microbe interactions comprehending the complex 'interspecies communication' as complex interaction maps, executed using search tool for the retrieval of interacting genes/proteins (string) 62,63 . host-microbe disease network and molecular mimicry studies. the host-microbe disease network is a multilayered archetype that connects the protein-marker-symptom/disease-drug-pathway associations. the contributions of the microorganisms in the co-evolved ibd and rea as part of the disease network was created through the interactive maps of the essential host interaction proteins (verified using literature survey) and the information processed through gene expression data analysis 64 . the information patronised here is mostly scored through the available non-specific protein diagnostic markers of both ibd and rea e.g. c-reactive protein (crp), interleukin 6 (il6) and toll like receptor 4 (tlr4), major histocompatibility complex, class i, b (hla-b) and major histocompatibility complex, class ii, dr beta 1 (hla-drb1) with the essential host proteins determined using string 65 .
database genecards 66 was used to assess the role of these interacting partners aka proteins further with symptoms/diseases associated with ibd and rea. the pathways of the above host interacting proteins were identified using the kegg database that provides ontologies for proteins related to biological processes 67 . subsequently, the role of drugs or inhibitors used to suppress the effect of ibd and rea such as indomethacin, prednisone, ciprofloxacin, sulfasalazine, azathioprine, methotrexate and hydroxychloroquine was scored in the disease network through their docking studies against the potential targets (both host and microbial targets) as per published methodologies 68, 69 . the host-microbe disease network which is an amalgamation of all the above patterned associations was visualized using cytoscape software 70 . molecular mimicry analysis between the vital targets triggering ibd co-evolved rea, essential human proteins including hla-b27, hla-b51 and hla-drb1 was done using data repository expasy. this led to retrieval of microbe relayed protein sequences that have been implicated in disease development after sequence alignment performed using emboss 71 . experimental evidences to identify the signature molecules in patient samples. the cross-validation of vital in-silico targets was done in rea patient cohort cases via targeted gene expression analysis. scientific and ethical clearance was taken from amity university ethics committee and institutional ethics committee, fortis noida for handling the patient samples. all experiments were performed in accordance with indian council of medical research (icmr) guidelines constituting the ethics committees.
the study was carried out for 6 months on the rare disorder rea patients, with the inclusion criteria as patients having rea according to european spondyloarthropathy study group (essg) 72 and exclusion criteria as patients undergoing treatment from last 3-6 months and healthy controls (hc). the participants were inducted in the study design with an informed consent form along with a questionnaire containing information regarding symptomatic and diagnostic history of patient and linked disorders. blood (5 ml) was drawn from participants in ethylenediaminetetraacetic acid (edta) vacutainers. these were transported to the laboratory for further analysis. the processing of the samples was done within 2-4 h of procurement 73 . peripheral blood mononuclear cells (pbmc's) were isolated from blood using density gradient centrifugation 74 . rna was isolated from pbmc's using trizol method 75 . the quantification of rna was done using nano-drop 76 . the high capacity cdna reverse transcription kit (applied biosystems™) was used for conversion of rna to single-stranded cdna as per the standard protocol 77 . quantitative pcr analysis of target gene was executed using biorad cfx96 real time-pcr taking human housekeeping gene, gapdh as a reference. previously reported primers for qpcr analysis of target and reference gene were selected for this study 78, 79 following the standard protocol 80 . relative gene expression analysis from qpcr data was performed using the relative expression software tool (rest® 2009) 81 that utilises the expression of reference genes to normalize expression of target genes in different samples. the schematic representation of methodology involved in our combinatorial analysis is provided in fig. 1 . text mining: data screening and selection. a systematic literature mining and curation for our thematic connecting autoimmune disorders, inflammatory bowel disease (ibd) and reactive arthritis (rea) was carried out. 
data identification extracted 1,071 records (articles in journals, book chapters, conference papers etc.) corresponding to autoimmune and enteric disorders. data screening extracted 426 records of autoimmune and enteric disorders triggered by microorganisms that belong to the classes bacteria, fungi, protozoa, mites, viruses, yeasts and nematodes. data selection yielded 48 ibd, 32 rea and 5 ibd co-evolved rea records. data selection was directed towards the microbial contenders implicated here resulting in 6 target microorganisms namely campylobacter jejuni, escherichia coli o157:h7, klebsiella oxytoca, salmonella typhimurium, shigella dysenteriae and yersinia enterocolitica, whose genome information was available. the etiopathogenesis of the co-evolved disorders has been documented through gut microbiome associated host-pathogen interaction studies, in which pathogenic microorganisms are involved in dysbiosis leading to autoimmunity. the results of text mining are provided in fig. 2 . the list of microorganisms is provided in supplementary table s1 online. ing of genes along with their corresponding proteins, reactions and metabolites for the selected microorganisms serve as primary set of partial metabolic network information. the missing data persistent in the draft reconstruct obtained through genome-scale metabolic models simulation, reconstruction and visualization (gemsirv) was manually refined. entirely associated metabolic networks of target microorganisms were obtained (genes, proteins and reactions). the essential genes of microorganisms (vital for survival, sustenance and growth) were obtained after performing simulation on mathematical models consisting of gene associated reactions and metabolites (metabolites, inner cell reactions, exchange reactions and essential genes).
due to lack of availability of exchange reactions for campylobacter jejuni, simulation analysis on the partial metabolic network could not be carried out and essential genes could not be retrieved. an alternative approach for finding essential genes of campylobacter jejuni was carried out. the essential genes of campylobacter jejuni were taken from our previous published report 69 and numbered 228. table 1 portrays the results of metabolic network reconstruction and simulation of target microorganisms. the metabolic network and simulation analysis data of target microorganisms is provided in supplementary table s2 online. the proteins corresponding to essential genes, non-repeats and non-homologs were obtained as stated below according to the parenthesis {proteins corresponding to essential genes, non-repeats, non-homologs}. the essential genes, their corresponding proteins, reactions and metabolites from the curated dataset were refined to create a list of most relevant molecular indicators to assess their coveted role in disease establishment. the non-redundant filtered proteins were utilised further in the computational work-pipeline canvassing the drug targets and signatures in the interspecies communication. essential host-microbe and microbe-microbe interactions. the central mechanism of host-microbe/microbe interface conferred through gut microbiome was correlated for the selected microbial species and processed to obtain the common signatures so as to follow the core system of metabolic changes affecting the host harbouring them as either commensal or pathogenic loads. the interactors between human and target microorganisms were obtained. the interactors of escherichia coli o157:h7 were 136; klebsiella oxytoca were 141; salmonella typhimurium were 136; shigella dysenteriae were 117 and yersinia enterocolitica were 133. there were no interactors for campylobacter jejuni (supplementary table s3-s7 online).
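obtaining the common signatures across species amounts to a set intersection over the per-organism interaction lists. the python sketch below is one way this could be done, not the authors' procedure: the pair nhaa-kynu and the partner hax1 are taken from the study, whereas prot_x, prot_y, host_1 and host_2 are hypothetical placeholders, and skipping organisms with no recorded interactors (as for campylobacter jejuni) is an assumption.

```python
def shared_microbial_proteins(pairs_by_organism):
    """return (microbial proteins shared by all organisms, their host partners).

    pairs_by_organism maps an organism name to a list of
    (microbial_protein, host_protein) interaction pairs.
    organisms with no interactions are skipped (an assumption).
    """
    per_organism = {
        organism: {microbe for microbe, _host in pairs}
        for organism, pairs in pairs_by_organism.items()
        if pairs
    }
    if not per_organism:
        return set(), set()
    shared = set.intersection(*per_organism.values())
    host_partners = {
        host
        for pairs in pairs_by_organism.values()
        for microbe, host in pairs
        if microbe in shared
    }
    return shared, host_partners


# illustrative input: only nhaa, kynu and hax1 come from the text.
example = {
    "escherichia coli o157:h7": [("nhaa", "kynu"), ("nhaa", "hax1"), ("prot_x", "host_1")],
    "shigella dysenteriae": [("nhaa", "kynu"), ("prot_y", "host_2")],
    "campylobacter jejuni": [],
}

shared, hosts = shared_microbial_proteins(example)
print(sorted(shared), sorted(hosts))
```

with this illustrative input the only shared microbial protein is nhaa, and its host partners are hax1 and kynu.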
table 2 shows the results of filtering and host-microbe interactions of protein sequences corresponding to essential genes of target microorganisms. the host-microbe interactors were analysed for all the target microbial species and processed to obtain the common signatures. 43 proteins were found in common between all target microorganisms, interacting among themselves and with 130 human proteins. the essential host targets correlating with the microbial gene targets were followed up by obtaining host essential genes and the corresponding proteins from the human metabolic model hmr 2. there were 1,401 essential proteins (supplementary table s8 online). the essential human protein found to interact with the essential microbial protein nhaa was kynu (fig. 3). nhaa also interacted with the non-essential human host proteins hcls1 associated protein x-1 (hax1), prolyl endopeptidase-like (ppcel), biogenesis of lysosomal organelles complex 3 subunit 1 (hps1) and eukaryotic translation initiation factor 2 alpha kinase 1 (e2ak1). kynu was further mapped with host proteins (direct and indirect), resulting in 1994 interactions. of these, the single connected essential protein interactions numbered 988 and the protein interactors numbered 412 (fig. 4 and see supplementary table s9 online). the research design followed here assesses the interaction map of essential proteins in the human host to provide clinical insights into pathophysiological trends in autoimmune development.
host-microbe disease network and molecular mimicry. the human essential proteome complement with its interacting proteins was analysed further as part of the disease network. 394 human essential protein interactors were found to be associated with ibd and, similarly, 3 essential protein interactors, namely adenosine deaminase (ada), catalase (cat) and superoxide dismutase 2 (sod2), were found to be associated with both ibd and rea (see supplementary table s10 online).
these 397 proteins can be postulated as probable contenders acting as important regulators in the simulated network of the co-existent disorders. the composite associations of these 397 proteins with non-specific protein diagnostic markers of ibd and rea were obtained (see supplementary table s11 online). this gave rise to a single connected protein network consisting of 402 proteins and 13,350 interactions. the associations of these 402 proteins with symptoms and diseases linked with ibd and rea were obtained (see supplementary table s12 online). apart from the non-specific diagnostic markers, the protein linked with the majority of symptoms/diseases is angiotensin i converting enzyme (ace). in total, 78 pathways were obtained for the 402 proteins (see supplementary table s13 online), of which the pathway associated with the majority of proteins was carbon metabolism. another layer of the disease network concerns the therapeutic regime followed in the studied autoimmune diseases, so docking analysis was performed for drugs used to suppress the effects of ibd and rea against nhaa of the target microorganisms and kynu of the human host. the docking analysis yielded docking scores that represent the binding of drugs to host kynu and to microbial nhaa of all 5 microorganisms selected in our study. the more negative the docking score, the stronger the binding [68]. the strongest (most negative) and weakest scores, respectively, were: escherichia coli o157:h7 nhaa with methotrexate (−7.362) and azathioprine (−3.491); klebsiella oxytoca nhaa with methotrexate (−5.083) and azathioprine (−3.459); salmonella typhimurium nhaa with ciprofloxacin (−5.135) and hydroxychloroquine (−2.597); shigella dysenteriae nhaa with methotrexate (−8.059) and azathioprine (−3.847); yersinia enterocolitica nhaa with hydroxychloroquine (−7.47) and azathioprine (−3.451); and human kynu with hydroxychloroquine (−5.357) and indomethacin (1.113).
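the reading of these scores (more negative means stronger predicted binding) can be made explicit with a short python sketch that tallies which drug is the strongest binder for each protein. the numbers are the strongest and weakest scores quoted in the text only; the full table for all drugs is not reproduced here, and the names `REPORTED_EXTREMES`, `best_binder` and `top_drug_counts` are introduced purely for illustration.

```python
from collections import Counter
from typing import Dict

# strongest and weakest docking score per protein, as quoted in the text
REPORTED_EXTREMES: Dict[str, Dict[str, float]] = {
    "escherichia coli o157:h7 nhaa": {"methotrexate": -7.362, "azathioprine": -3.491},
    "klebsiella oxytoca nhaa": {"methotrexate": -5.083, "azathioprine": -3.459},
    "salmonella typhimurium nhaa": {"ciprofloxacin": -5.135, "hydroxychloroquine": -2.597},
    "shigella dysenteriae nhaa": {"methotrexate": -8.059, "azathioprine": -3.847},
    "yersinia enterocolitica nhaa": {"hydroxychloroquine": -7.47, "azathioprine": -3.451},
    "human kynu": {"hydroxychloroquine": -5.357, "indomethacin": 1.113},
}


def best_binder(scores: Dict[str, float]) -> str:
    """Drug with the most negative (strongest) docking score."""
    return min(scores, key=scores.get)


def top_drug_counts(table: Dict[str, Dict[str, float]]) -> Counter:
    """How often each drug is the strongest binder across proteins."""
    return Counter(best_binder(scores) for scores in table.values())


if __name__ == "__main__":
    for drug, count in top_drug_counts(REPORTED_EXTREMES).most_common():
        print(drug, count)
```

on the quoted values, methotrexate is the strongest binder for three of the six proteins, hydroxychloroquine for two and ciprofloxacin for one.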
our results show methotrexate to have the strongest docking scores with the largest number of proteins, and it can therefore be considered a vital drug for ibd associated rea. the resultant docking scores are provided in fig. 5. the extensive interaction pattern of nhaa with kynu, along with 396 proteins, 5 markers, 66 symptoms/diseases, 78 pathways and 7 drugs, gives rise to a host-microbe disease network of ibd co-existent rea (fig. 6 and see supplementary table s14 online). the final layer of information processed in this study design accommodates the concept of molecular mimicry between the essential host proteins and selected microorganisms. the nhaa protein of the target microorganisms shows homology with human hla-b27, hla-b51 and hla-drb1 (fig. 7; panels include peptides homologous to hla-b27 and peptides homologous to hla-drb1).
experimental evidence to identify the signature molecules in patients. the in-silico analysis followed so far for molecular signature identification, through gene expression datasets and curated metabolic reconstructions, strongly indicates the host protein kynu as the single common predictive marker for all the pathogenic microbes. kynu has also been indicated in the expression data of the inflammation-linked disorder ibd. data on differential expression of kynu in rea are lacking; therefore, kynu was evaluated experimentally through targeted expression analysis in rea patients. non-probabilistic convenience sampling was followed for our single blind study. this study encompassed 15 individuals: 60% male with a mean age of 45.7 and 40% female with a mean age of 38 (9 males and 6 females). of these, 10 were cases with rea and 5 were controls: 3 currently undergoing treatment, 1 with poncet's disease (pd) and 1 healthy control (hc).
the clinical characteristics of the patients recruited in the study included inflammatory back pain in 33%, fatigue in 60%, fever in 27%, swollen joints in 47%, ankylosing spondylitis (as), which affects the spine, in 7%, dactylitis (inflammation of a finger or toe) in 7% and poncet's disease (pd) in 7% of participants. the clinical characteristics of the recruits are provided in table 3. the expression of kynu in peripheral blood mononuclear cells (pbmcs) of rea cases vs controls was evaluated using the relative expression software tool (rest), which estimates a sample's relative expression ratio in relation to the control housekeeping gene (here gapdh) by calculating an intermediate absolute concentration value, where cp is the point at which fluorescence rises considerably above the background fluorescence. the cp values for reference and target genes are collectively redistributed to control and sample groups and the expression ratios are calculated based on the mean values. a pair-wise fixed reallocation randomisation test is used for normalisation of the target genes with a reference gene and for calculating the statistical significance of the difference between the 2 groups [81]. it utilises a bootstrapping technique providing a 95% confidence interval for expression ratios and uses a p(h1) test for testing the significance of the difference between samples and controls. according to our analysis, the kynu sample group differs from the control group, with p(h1) = 0.025. kynu was found to be downregulated in the sample group (in comparison to the control group) by a mean factor of 0.115 (standard error range 0.018-0.837), as depicted in the whisker-box plot (fig. 8). kynu expression thus showed a ~ninefold decline in rea cases as compared to controls.
the gut microbiome is pitched as the central theme, housing an enormous diversity of microbial species and characterizing the fine balance between healthy and diseased states.
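returning briefly to the kynu expression result reported above, the python sketch below shows how an efficiency-corrected relative expression ratio of the kind associated with rest can be computed from mean cp values, and how a factor of 0.115 corresponds to a roughly ninefold decline. it is a sketch under stated assumptions: amplification efficiencies of 2.0 are assumed for both genes, the cp values in the usage example are hypothetical, and the randomisation test and bootstrapped confidence intervals performed by the software are not reproduced.

```python
from statistics import mean
from typing import Sequence


def expression_ratio(target_control: Sequence[float],
                     target_sample: Sequence[float],
                     ref_control: Sequence[float],
                     ref_sample: Sequence[float],
                     e_target: float = 2.0,
                     e_ref: float = 2.0) -> float:
    """Efficiency-corrected ratio of target gene expression, sample vs control.

    ratio = e_target ** (mean cp control - mean cp sample, target gene)
          / e_ref    ** (mean cp control - mean cp sample, reference gene)
    """
    delta_target = mean(target_control) - mean(target_sample)
    delta_ref = mean(ref_control) - mean(ref_sample)
    return (e_target ** delta_target) / (e_ref ** delta_ref)


def fold_decline(ratio: float) -> float:
    """Convert a downregulation factor (< 1) into a fold decline."""
    return 1.0 / ratio


if __name__ == "__main__":
    # hypothetical cp values: target crosses threshold 3 cycles later in samples
    ratio = expression_ratio([25.0, 25.0], [28.0, 28.0], [20.0, 20.0], [20.0, 20.0])
    print(ratio)
    # factor reported in the text for kynu
    print(round(fold_decline(0.115), 1))
```

a ratio below 1 indicates downregulation in the sample group; the reciprocal of the reported factor, 1/0.115, is about 8.7, which is the basis for describing the change as approximately ninefold.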
the physiological drifts from healthy to diseased states and vice versa are tuned to sophisticated interactive networks of the human host and the microbial flora residing in the gut. the autoimmune conditions reactive arthritis (rea) and inflammatory bowel disease (ibd) have been linked to prevalent dysbiosis of the gut, where disease development occurs as a reaction to an invading population of microbes. to find the basal networks of interactions at the host-microbe interface, common microbes affecting the co-evolved diseases with shared characteristics were studied. this involved comprehensive analysis of the biomolecular functional networks, including the gene, protein and metabolite molecular signatures engraved at the host-microbe and microbe-microbe interface. this 'interspecies communication' has now been linked with the immuno-pathogenesis of most human autoimmune disorders [82, 83]. the etiopathology of these interactions has remained elusive, leading to non-specific diagnostic criteria and therapeutic regimes. it is suggested that microbial dysbiosis, pathogenic infection and host-microbe interactions cause rea. in this study, utilising a combinatorial approach, we have compiled a repertoire of microorganisms, biomolecules and pathways that are possibly involved in triggering the co-evolved autoimmune disorders ibd and rea. in our study, text mining results indicate that the microorganisms campylobacter jejuni, escherichia coli o157:h7, klebsiella oxytoca, salmonella typhimurium, shigella dysenteriae and yersinia enterocolitica are implicated in both disorders. the thematic concepts for microbial contribution to host immunity were explored in our previous analyses of metabolic reconstruction and simulation of campylobacter jejuni and salmonella enterica [69, 84]. in our current study, we used a designated work-pipeline for metabolic network reconstruction and simulation of target microorganisms.
the analysis extracted information via a constraint-based bottom-up approach; this information was filtered and used for further computational analysis. the essential genes, proteins and metabolites of microorganisms represent promising drug targets, as they are speculated to contribute towards infection-triggered host physiological drifts leading to development of the co-evolved pattern of autoimmunity in ibd and rea. thorough curation provided robust molecular cues, in terms of essential proteins and biological networks, that are correlated to the 'interspecies communication' using host-microbe and microbe-microbe interaction profiling. the most closely associated common protein observed in all the selected microbial species involved in both ibd and rea is the na(+)/h(+) antiporter (nhaa), a microbial integral membrane protein that catalyzes the exchange of 2 h(+) per na(+) [85] and is involved in processes crucial for cell viability. similarly, the common host protein interacting with nhaa is kynureninase (kynu), which is involved in tryptophan metabolism and whose differential expression (upregulation and downregulation relative to control samples) has been followed in ibd patient cohorts [86] [87] [88]. according to the scientific discourse on the studied disorders, the hypothesized pathological mechanism is that, after bacterial infection, antigen-presenting cells transport bacterial antigens/peptides into the synovial membrane, where the bacterial components persist, causing inflammation. it is suggested that in host-microbe interactions, bacterial proteins entering host cells interact with host proteins and inject their effector components, but this has not been proven in rea and ibd. this formed the basis of one of the parameters in our study design, where we found physical interactions between nhaa and kynu and predicted that these might be the early host-microbe interactors for establishing pathogenesis in ibd associated rea.
this could assist to comprehend the very few reports indicated in the rare autoimmune rea, where gene expression datasets of the co-evolved disorder ibd can serve to incorporate the larger theme of gut-microbiome associations. the theme of gut-microbiome paradigm shifts thus contemplates the vital cues in triggering autoimmunity with indirect linkages to diet and environmental triggers. this is indicative of the identified target molecular signature, kynu, found to be differentially regulated in the patient cohorts with history of infection triggered or ibd co-evolved rea. kynu and nhaa could serve as the robust early and essential host-microbe interacting targets and molecular indicators involved in interspecies communication in ibd associated rea. the investigations further were targeted for parallel analysis of other host-essential protein partners enmeshed to have interaction with host protein kynu indicating the intricate details of host-microbe interaction information. the disease network constructed through our approach consists of 412 single connected essential protein interactors of kynu, where 394 human essential protein interactors are found to be associated with ibd, while 3 of them (adenosine deaminase (ada), catalase (cat) and superoxide dismutase 2 (sod2)) are associated with both ibd and rea. ada protein has been reported in juvenile idiopathic arthritis and rea patient cohorts in serum samples 89 . similarly, cat and manganese superoxide dismutase (sod) genes polymorphisms were observed in rea patient cohorts 90, 91 . these become part of the host-microbe disease network where such molecular elements and co-regulatory pathways represent the intricate biological cross-talk followed during disease development. pathological conditions can also trigger immune cells such as il's and tlr's and various cytokines leading to immune cell infiltration in host and higher levels of inflammation. 
genetic factors such as hla alleles encode susceptibility, contribute to bacterial persistence and increase risk in rea cases. based on this we also found the interactions of important targets in our study with immunogenic and genetic factors. the host harboured assorted essential proteins were further probed for their association with non-specific protein diagnostic markers as well as with symptoms/diseases linked with ibd and rea, accruing towards a single connected network consisting of 402 interdependent proteins. the reciprocation of these integrated protein indicators to the disease development is conveyed through metabolite monitoring as in the study, angiotensin i converting enzyme (ace) was found to be linked with maximum symptoms/diseases. ace is involved in catalyzing the conversion of angiotensin i into angiotensin ii that is a potent vasopressor and aldosterone-stimulating peptide that controls blood pressure and fluid-electrolyte balance 92 . this could be the indicator of involvement of microbe triggered host physiological drifts. subsequently, the pathways associated with the proteins ramified into 78 pathways of human host speculated to give details of metabolic regulatory checkpoints where carbon metabolism is found to be associated with majority of deduced proteins. carbon metabolism pathway implicated here as the vitally generic pathway for ibd co-related rea confers how diet, balance of gut microbiome, antibiotic exposures can have layered impact on autoimmune disease progression and remissions. kynu is found to be downregulated in rea patients as compared to controls through our targeted gene expression analysis. collectively, the disease network followed here confers interaction of microbial nhaa with host kynu, that is further correlated to 396 proteins, 5 markers, 66 symptoms/diseases, 78 pathways and 7 drugs. 
docking analysis of drugs used to suppress the effects of ibd and rea predicts methotrexate as an important drug that could be useful for early treatment of ibd co-evolved rea. the genetic factors found to be common to both rea and ibd are hla-b27, hla-b51 and hla-drb1. the most important mechanism of hla-linked susceptibility in rea is molecular mimicry, in which microbial peptides mimic hla autopeptides of the human host, leading to autoimmunity. this mechanism has been observed in rea, where bioinformatic analyses have predicted that microbial peptides such as chlamydial proteins (clpc, nqra and dnap) and a yersinia pseudotuberculosis peptide (yoph) show homology with human hla-b27 [14]. similarly, molecular mimicry has also been observed in ibd cases having extraintestinal manifestations. we performed targeted molecular mimicry analysis in our study, comparing our robust microbial protein (nhaa) with hla-b27, hla-b51 and hla-drb1, which strengthens the case for nhaa acting as a trigger for ibd associated rea.
we propose a putative hypothesis amalgamating our key findings with the literature. we suggest that the initial host-microbe trigger for ibd associated rea occurs when the pathogenic microbial protein nhaa interacts with the host protein kynu, which further interacts with the human proteins ada, sod2, cat and ace, and that carbon metabolism involving these host proteins is hampered. methotrexate regulates carbon metabolism and the associated host-microbe proteins, reducing the effect of ibd associated rea. since carbon metabolism is the most basic aspect of life, and therefore an extensive network consisting of sub-pathways, we narrowed our findings down to a central and significant pathway within carbon metabolism that involves the molecular signatures kynu, ada, sod2, cat and ace, is also affected by the potential drug methotrexate, and is associated with ibd, rea, and ibd and rea cohorts.
it is reported that methotrexate is incorporated intracellularly interfering with adenosine concentrations and affecting proinflammatory cytokines in ibd reducing inflammation 93 . in inflammatory arthritis, the mechanisms reported by which methotrexate reduces inflammation include enhanced adenosine release, de novo synthesis of purines and pyrimidines, inhibition of transmethylation reactions, diminished accumulation of polyamines and nitric oxide synthase uncoupling. most of the mechanisms are associated with folate biosynthesis, a type of carbon metabolism 94 . kynu, ada, sod2, cat and ace are also found to be involved in folate biosynthesis and metabolism from genecards. apart from the above targets, parallel interactors, pathways and drugs for ibd co-evolved rea obtained in our host-microbe disease network can be utilised further as disease determinants. the experimental validation of these targets in patient cohorts need to be performed on a pilot scale in future to increase the robustness of this network. the intertwined information processed through the knowledge-base created for the linked disorders have given the most elaborate layout of patterns observed in disease diagnosis and analysis. the major information after processing the gene expression profiles, protein markers, molecular networks and metabolic networks involved here have led to chalk out as well as connect the strings for robust gut microbiome paradigm shifts. the current work on host-microbe interactions provides a starting point for researchers and clinicians to investigate inflammatory bowel disease (ibd) associated reactive arthritis (rea). in this study a combinatorial approach is utilised to reveal the interactions of gut microbes with human host extensively sketched through the work-pipeline providing the vital insights for the drug targets, biomarkers, pathways and inhibitors for etiology, prognosis, diagnosis and treatment attributes of pathogenic rheumatic autoimmunity. 
the information sorted through the combinatorial study will be useful in deciphering the etiopathogenesis of the co-linked disorders, especially the rare rea, from analogous analyses of ibd datasets, connected through common microbial triggers. these predictions furnish the intricate details of the cross-talk between post-infectious inflammatory reactions with shared patho-immunogenesis, as a starting point for researchers and clinicians for detailed and newer experimental analysis. future studies on larger cohorts of patients having rea due to ibd are required in order to validate the outputs of the predictive network.
reactive arthritis: a review management of arthritis in patients with inflammatory bowel disease enteric pathogens and reactive arthritis: a systematic review of campylobacter, salmonella and shigella-associated reactive arthritis vedolizumab as induction and maintenance therapy for ulcerative colitis achieving deep remission in crohn's disease: treating beyond symptoms gut microbiota perturbations in reactive arthritis and postinfectious spondyloarthritis epidemiology: time to revisit the concept of reactive arthritis role of human leukocyte antigens (hla) in autoimmune diseases human leukocyte antigen (hla) and immune regulation: how do classical and non-classical hla alleles modulate immune response to human immunodeficiency virus and hepatitis c virus infections? clostridium difficile: an under-recognized cause of reactive arthritis? 
reactive arthritis after enteric infections in the united states: the problem of definition mhc class i and class ii genes in tunisian patients with reactive and undifferentiated arthritis reiter's syndrome associated with hlab51 novel hla-b27-restricted epitopes from chlamydia trachomatis generated upon endogenous processing of bacterial proteins suggest a role of molecular mimicry in reactive arthritis mechanisms of disease: pathogenesis of crohn's disease and ulcerative colitis th1-type responses mediate spontaneous ileitis in a novel murine model of crohn's disease lack of tnfr p55 results in heightened expression of ifn-γ and il-17 during the development of reactive arthritis microbial antigens mediate hla-b27 diseases via tlrs microbes in gastrointestinal health and disease arthritis associated with yersinia enterocolitica infection chlamydia pneumoniae-a new causative agent of reactive arthritis and undifferentiated oligoarthritis diagnosis of chlamydia trachomatis in patients with reactive arthritis and undifferentiated spondyloarthropathy salmonella lipopolysaccharide in synovial cells from patients with reactive arthritis the role of intracellular organisms in the pathogenesis of inflammatory arthritis a pilot study for detection of intra-articular chromosomal and extra chromosomal genes of chlamydia trachomatis among genitourinary reactive arthritis patients in india acute erosive reactive arthritis associated with campylobacter jejuni-induced colitis seroprevalence of campylobacteriosis and relevant post-infectious sequelae the role of microbiome in rheumatoid arthritis treatment immunopathogenesis of rheumatoid arthritis recombinant salmonella typhimurium outer membrane protein a is recognized by synovial fluid cd8 cells and stimulates synovial fluid mononuclear cells to produce interleukin (il)-17/il-23 in patients with reactive arthritis and undifferentiated spondyloarthropathy outer membrane protein of salmonella is the major antigenic target in 
patients with salmonella induced reactive arthritis a single nonamer from the yersinia 60-kda heat shock protein is the target of hla-b27-restricted ctl response in yersinia-induced reactive arthritis the 19 kda protein of yersinia enterocolitica o: 3 is recognized on the cellular and humoral level by patients with yersinia induced reactive arthritis identification of the yersinia enterocolitica urease beta subunit as a target antigen for human synovial t lymphocytes in reactive arthritis association between reactive arthritis and antecedent infection with shigella flexneri carrying a 2-md plasmid and encoding an hla-b27 mimetic epitope role of 30 kda antigen of enteric bacterial pathogens as a possible arthritogenic factor in post-dysenteric reactive arthritis the microbiome in autoimmune diseases antivirulence activity of the human gut metabolome the human gut microbiome-a potential controller of wellness and disease prevalence of antibodies against chlamydia trachomatis and incidence of c. 
trachomatis-induced reactive arthritis in an early arthritis series in finland in 2000 campylobacter reactive arthritis: a systematic review reactive arthritis: current perspectives the microbiome and autoimmunity: a paradigm from the gut-liver axis aspects of gut microbiota and immune system interactions in infectious diseases systematic review of gut microbiota and major depression omic' technologies: proteomics and metabolomics learning objectives: ethical issues in vivo and in silico determination of essential genes of campylobacter jejuni a community-driven global reconstruction of human metabolism systems analysis of inflammatory bowel disease based on comprehensive gene information introduction of inflammatory bowel disease biomarkers panel using protein-protein interaction (ppi) network analysis gemsirv: a software platform for genome-scale metabolic model simulation, reconstruction and visualization expasy: the proteomics server for in-depth protein knowledge and analysis a systematic reconstruction and constraint-based analysis of leishmania donovani metabolic network: identification of potential antileishmanial drug targets genome-scale metabolic reconstructions of bifidobacterium adolescentis l2-32 and faecalibacterium prausnitzii a2-165 and their interaction quantitative prediction of cellular metabolism with constraint-based models: the cobra toolbox v2.0 reconstruction of the metabolic network of pseudomonas aeruginosa to interrogate virulence factor synthesis patric, the bacterial bioinformatics database and analysis resource cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences hpidb-a unified resource for host-pathogen interactions study of intra-inter species protein-protein interactions for potential drug targets identification and subsequent drug design for escherichia coli o104:h4 c277-11 integration of clinical data with a genome-scale metabolic model of the human adipocyte string v10: 
protein-protein interaction networks, integrated over the tree of life oral squamous cell cancer protein-protein interaction network interpretation in comparison to esophageal adenocarcinoma association of inflammatory bowel disease with arthritis: evidence from in silico gene expression patterns and network topological analysis the landscape of protein biomarkers proposed for periodontal disease: markers with functional meaning significant modules and biological processes between active components of salvia miltiorrhiza depside salt and aspirin gene ontology and kegg pathway enrichment analysis of a drug target-based classification system flexible ligand docking with glide identification of novel drug targets against campylobacter jejuni using metabolic network analysis cytoscape: a software environment for integrated models of biomolecular interaction networks multiple groups of endogenous epsilon-like retroviruses conserved across primates update of the eular recommendations for the management of early arthritis hla-b27 correlates with the intracellular elimination, replication, and trafficking of salmonella enteritidis collected from reactive arthritis patients recombinant salmonella typhimurium outer membrane protein a and d reactive t cells are expanded in synovial fluid of patients with reactive arthritis and undifferentiated spondyloarthropathy (hum6p. 
251) purification of rna using trizol (tri reagent) unique transcriptome signatures and gm-csf expression in lymphocytes from patients with spondyloarthritis development of a reverse transcription-quantitative pcr system for detection and genotyping of aichi viruses in clinical and environmental samples characterization of the kynurenine pathway and quinolinic acid production in macaque macrophages persistence of gene expression changes in noninflamed and inflamed colonic mucosa in ulcerative colitis and their presence in colonic carcinoma vitamin d receptor expression in dogs relative expression software tool (rest) for group-wise comparison and statistical analysis of relative expression results in real-time pcr anti-microbial antibodies, host immunity, and autoimmune disease host-microbe interactions in the pathogenesis and clinical course of sarcoidosis elucidating vital drug targets of salmonella enterica utilizing the bioinformatic approach overproduction and purification of a functional na+/h+ antiporter coded by nhaa (ant) from escherichia coli pro-inflammatory mir-223 mediates the cross-talk between the il23 pathway and the intestinal barrier in inflammatory bowel disease pediatric crohn disease patients exhibit specific ileal transcriptome and microbiome signature disruption of macrophage pro-inflammatory cytokine release in crohn's disease is associated with reduced optineurin expression in a subset of patients sensitivity and specificity of adenosine deaminase in diagnosis of juvenile idiopathic arthritis antioxidant enzyme levels in reactive arthritis and rheumatoid polyarthritis cytochrome p450 1a1 and manganese superoxide dismutase genes polymorphisms in reactive arthritis coronavirus disease 2019 (covid-19): do angiotensin-converting enzyme inhibitors/angiotensin receptor blockers have a biphasic effect the current role of methotrexate in patients with inflammatory bowel disease methotrexate and its mechanisms of action in inflammatory arthritis the 
authors are grateful to amity institute of biotechnology, amity university uttar pradesh, noida and department of biotechnology, teri school of advanced studies, new delhi for providing the facility and technical support during the preparation of the manuscript. we also thank fortis hospital, noida for providing the patient samples.
s.s. and b.r. conceived the study concept; s.s., b.r. and p.s. jointly designed and supervised the work; b.d.p. supervised the clinical setting and recruitment of participants; b.d.p. and a.v. recruited the participants and contributed to the sample collection and preparation; a.v. performed the experiments; s.s., b.r., p.s. and a.v. contributed to the analysis and interpretation of data; a.v. generated all figures and tables; a.v. wrote the first draft of the manuscript; s.s., b.r., p.s. and b.d.p. critically reviewed and edited the manuscript; all authors reviewed and approved the final version of the manuscript.
the authors declare no competing interests.
supplementary information is available for this paper at https://doi.org/10.1038/s41598-020-71674-8. correspondence and requests for materials should be addressed to s.s. reprints and permissions information is available at www.nature.com/reprints. publisher's note: springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
open access: this article is licensed under a creative commons attribution 4.0 international license, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the creative commons licence, and indicate if changes were made. the images or other third party material in this article are included in the article's creative commons licence, unless indicated otherwise in a credit line to the material. 
if material is not included in the article's creative commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. to view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
key: cord-001898-ntqyjqqk authors: huang, chih-wei; lin, hui-chen; chou, chi-yuan; kao, wei-chuo; chou, wei-yuan; lee, hwei-jen title: lys-315 at the interfaces of diagonal subunits of δ-crystallin plays a critical role in the reversibility of folding and subunit assembly date: 2016-01-05 journal: plos one doi: 10.1371/journal.pone.0145957 sha: doc_id: 1898 cord_uid: ntqyjqqk
δ-crystallin is the major structural protein in avian eye lenses and is homologous to the urea cycle enzyme argininosuccinate lyase. this protein is structurally assembled as double dimers. lys-315 is the only residue which is arranged symmetrically at the diagonal subunit interfaces to interact with each other. this study found that wild-type protein had both dimers and monomers present in 2–4 m urea whilst only monomers of the k315a mutant were observed under the same conditions, as judged by sedimentation velocity analysis. the assembly of monomeric k315a mutant was reversible in contrast to wild-type protein. molecular dynamics simulations showed that the dissociation of primary dimers is prior to the diagonal dimers in wild-type protein. these results suggest the critical role of lys-315 in stabilization of the diagonal dimer structure. guanidinium hydrochloride (gdmcl) denatured wild-type or k315a mutant protein did not fold into functional protein. however, the urea dissociated monomers of k315a mutant protein in gdmcl were reversible folding through a multiple steps mechanism as measured by tryptophan and ans fluorescence. two partly unfolded intermediates were detected in the pathway. 
refolding of the intermediates resulted in a conformation with greater amounts of hydrophobic regions exposed which was prone to the formation of protein aggregates. the formation of aggregates was not prevented by the addition of α-crystallin. these results highlight that the conformational status of the monomers is critical for determining whether reversible oligomerization or aggregate formation occurs. introduction δ-crystallin is a taxon-specific eye lens protein. it is the major soluble protein in the eye lens of reptiles and birds and functions as a structural protein to maintain the refraction properties of the lens [1, 2] . δ-crystallin and argininosuccinate lyase (asl) are homologous proteins. asl is in this study, the effects of this interaction on the folding pathway of wild-type and mutant proteins were investigated using urea as a denaturant. the different distributions of dissociated component from wild-type and mutant proteins, as measured by sedimentation velocity experiment, suggests the quaternary structure dissociates in different ways for wild-type and mutant proteins. structural simulation supports different dissociation processes for the two proteins. these results highlight the critical role of k315 in stabilizing the quaternary structure of δ-crystallin. the residue appears to control both the dissociation of dimers into monomers and the stability of the produced monomers. the monomers dissociated from the k315a mutant protein with a stable and compact conformation provided a good model for studying the folding mechanism of the δ-crystallin. this study reveals the conformational status of the monomers, which determines whether functional protein or aggregates are formed. the recombinant wild-type and the k315a mutant δ-crystallin or αa-crystallin plasmid were transformed and over-expressed in e. coli bl21 (de3) with induction by isopropyl-β-d-thiogalactopyranoside (iptg). proteins were purified as previously described [8, 17] . 
the supernatants of crude cell extracts were loaded onto a q-sepharose anion exchange column (hiprep 16/10 q xl, ge healthcare) pre-equilibrated in buffer a (50 mm tris-hcl buffer, ph 7.5) and eluted with a linear gradient of 0 to 0.4 m nacl in buffer a. recombinant protein was eluted at approximately 0.15 m nacl. the eluted protein was pooled and treated with ammonium sulfate to 1.2 m. after filtration, the sample was loaded onto a hydrophobic interaction column (source™ 15phe) pre-equilibrated in buffer a containing 10% (v/v) glycerol and 1.2 m ammonium sulfate and eluted with a linear gradient to the same buffer lacking ammonium sulfate. the retained proteins were eluted at~0.3 m ammonium sulfate. fractions were pooled and loaded onto s-300 sephacryl column (26 mm x 85 cm) pre-equilibrated in 50 mm tris-acetic acid buffer, ph 7.5. fractions were analyzed by sds-page and protein concentrations determined by the method of bradford [18] . proteins possessing a c-terminal his 6 tag were purified on ni affinity column (chelating sepharose ff, ge healthcare) then desalted using a sephadex g-25 column (26 mm x 12 cm) as previously reported [11] . equilibrium unfolding experiments were carried out by overnight incubation of δ-crystallin with various concentrations of urea or gdmcl in 50 mm tris-acetic acid buffer, ph 7.5 at 25°c. the refolding experiments were undertaken by dilution of equilibrium-denatured δ-crystallin to a series of urea or gdmcl concentrations in the same buffer. the experiments for equilibrium unfolding of monomeric δ-crystallin were undertaken by overnight incubation of δ-crystallin in 1.5 m urea at 25°c followed by addition of various concentrations of gdmcl to the solution followed by incubation for 2 hrs. the refolding experiments were carried out by dilution of the denatured δ-crystallin into a solution containing 1.5 m urea and 0.8, 3 or 5 m gdmcl in 50 mm tris-acetic acid buffer, ph 7.5. 
to analyze the conformation and quaternary structure of the refolded δ-crystallin, the refolding experiments were undertaken by 20-fold dilution of the denatured monomeric δ-crystallin with buffer. the transition region in fig 2b for the reversible dissociation of k315a mutant δ-crystallin in urea solution was analyzed using the following method. the unfolding process was described as a two-state transition for the conversion of the tetramer (t) into monomers (m). the thermodynamic parameters were obtained by global fitting of the tryptophan fluorescence signal to eq 1 [19], where $y$ is the observed signal from tryptophan fluorescence, $y_n$ and $y_u$ are the signals in the folded and unfolded states, and $m_f$ and $m_u$ are the slopes of the baselines preceding and following the transition region. $\Delta G_u^0$ is the free energy difference in the absence of urea and $m$ is the variation in the free energy of unfolding with the urea concentration. reversible unfolding of monomeric k315a mutant δ-crystallin was described as a four-state process ($M \rightleftharpoons I_1 \rightleftharpoons I_2 \rightleftharpoons U$). the unfolding curve from tryptophan fluorescence was analyzed using a four-state unfolding model described as the transition from n to u with two intermediates, $I_1$ and $I_2$, in the process. the thermodynamic parameters were calculated by fitting the data to eq 2 [8, 20], where $y_{i1}$ and $y_{i2}$ are the signals in the $I_1$ and $I_2$ states. $\Delta G_1^0$, $\Delta G_2^0$ and $\Delta G_3^0$ are the free energy differences in the absence of gdmcl for the n to $I_1$, $I_1$ to $I_2$ and $I_2$ to u transitions, respectively, and $m_1$, $m_2$ and $m_3$ are the variations in the free energy of unfolding with the gdmcl concentration. enzymatic activity assay δ-crystallin was assayed for asl activity by monitoring the absorption of fumarate at 240 nm in a perkin-elmer lambda 40 spectrophotometer. assays were performed at least in triplicate in 50 mm tris-hcl buffer (ph 7.5) with 1 mm sodium argininosuccinate as substrate. 
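the two-state analysis referred to as eq 1 can be illustrated numerically. the exact form of eq 1 is not reproduced in the text, so the sketch below assumes the standard linear extrapolation form that matches the symbol definitions given above (sloping folded and unfolded baselines, with the free energy of unfolding decreasing linearly with denaturant concentration). the function names, the sign convention for m, and the parameter values are illustrative assumptions, not values from the study.

```python
import math

R_KCAL = 1.987e-3  # gas constant, kcal/(mol*K)


def two_state_signal(d, y_n, m_f, y_u, m_u, dg0, m, temp_k=298.15):
    """Signal at denaturant concentration d (molar) for a two-state transition.

    Assumes dG(d) = dg0 - m * d (linear extrapolation); this form is an
    assumption, not taken from the source.
    """
    dg = dg0 - m * d                           # kcal/mol at this concentration
    k_unf = math.exp(-dg / (R_KCAL * temp_k))  # unfolded/folded population ratio
    folded = y_n + m_f * d                     # pre-transition baseline
    unfolded = y_u + m_u * d                   # post-transition baseline
    return (folded + unfolded * k_unf) / (1.0 + k_unf)


def transition_midpoint(dg0, m):
    """Denaturant concentration at which dG = 0, i.e. [D]1/2 = dg0 / m."""
    return dg0 / m


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to illustrate the curve shape.
    for conc in (0.0, 1.0, 2.0, 3.0, 4.0):
        print(conc, round(two_state_signal(conc, 1.0, 0.0, 0.0, 0.0, 6.0, 3.0), 4))
```

in a real analysis these parameters would be obtained by non-linear least-squares fitting of the measured fluorescence signal; under this assumed form, the midpoint of the transition is the ratio of the free energy in the absence of denaturant to m.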
a molar absorption coefficient of 2.44 × 10^3 m^-1 cm^-1 was used for all calculations [21]. the fluorescence spectra were measured on a perkin-elmer ls-50 luminescence spectrophotometer at 25°c. intrinsic tryptophan fluorescence spectra of the protein were recorded with the excitation wavelength set at 295 nm and a 5 nm bandwidth for both excitation and emission. all spectra were corrected for buffer or denaturant absorption. the average emission wavelength was utilized for data analysis [22]. ans (1-anilinonaphthalene-8-sulfonic acid) (molecular probes; eugene, oregon) was used as a probe to bind to proteins, and the fluorescence was recorded from 450 to 550 nm. the circular dichroism (cd) spectra were measured on a jasco j-810 spectropolarimeter at 25°c. experiments were performed in 20 mm tris-acetic acid buffer (ph 7.5) with a 1 mm path-length over a wavelength range from 200 to 250 nm. all spectra were averaged from three accumulations and were buffer corrected. the observed ellipticity $\theta$ (degrees) was converted into the mean residue ellipticity $[\theta]$ by the equation $[\theta] = \theta \times M_{mrw} / (10 \times d \times c)$ [23], where $M_{mrw}$ is the mean residue weight, $d$ is the light path (cm), and $c$ is the concentration of protein in g/ml. sedimentation velocity experiments were performed on a beckman-coulter (palo alto, ca) xl-a analytical ultracentrifuge (auc) with an an50 ti rotor at 20°c and 130 000 g in standard double-sector aluminum centerpieces. the radial scans were recorded with a time interval of 7 min and a step size of 0.003 cm. all data were fitted to the continuous c(s) distribution model and a continuous size-distribution with respect to frictional ratio (f/f0) model using the sedfit program [24, 25]. the partial specific volume of the protein, solvent density and viscosity were calculated using the sednterp program [26]. the solvent density and viscosity at each urea concentration were calculated and included in the fitting. 
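the conversion from observed ellipticity to mean residue ellipticity given for the cd measurements above is a one-line calculation. the sketch below implements that stated equation directly; the function name and the numerical inputs are hypothetical and serve only as a worked example.

```python
def mean_residue_ellipticity(theta_deg, mean_residue_weight, path_cm, conc_g_per_ml):
    """Mean residue ellipticity [theta] = theta * M_mrw / (10 * d * c).

    theta_deg: observed ellipticity in degrees
    mean_residue_weight: mean residue weight of the protein
    path_cm: optical path length in cm
    conc_g_per_ml: protein concentration in g/ml
    """
    return theta_deg * mean_residue_weight / (10.0 * path_cm * conc_g_per_ml)


if __name__ == "__main__":
    # Hypothetical inputs: -10 mdeg observed, mean residue weight 110,
    # 1 mm (0.1 cm) path length, 0.2 mg/ml protein.
    print(mean_residue_ellipticity(-0.010, 110.0, 0.1, 0.0002))
```

with the units stated in the text (degrees, cm, g/ml), the result is in deg cm^2 dmol^-1.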
electrophoresis was performed using a phastsystem (ge healthcare). the samples were applied to a phastgel 4-15% gradient gel, with the native buffer strips (0.88 m l-alanine, 0.25 m tris/hcl ph 8.8) attached to the surface of the gel and both electrodes. the electrophoresis was carried out at 10 ma and 15°c for 400 vh. after electrophoresis, the gel was fixed in 20% (w/v) trichloroacetic acid and stained with coomassie brilliant blue r250. the turbidity of the protein solution was measured using a perkinelmer lambda 40 spectrophotometer equipped with a peltier temperature control accessory to monitor the light scattering at 360 nm. all measurements were carried out at 25°c. spectra were corrected using measurements recorded for native protein in the absence of denaturants. the aggregation rate was calculated by fitting the data to a single or double exponential equation, where $y_t$ is the signal at time $t$, $y_0$ is the signal of the final state, $y_i$ is the change in amplitude, and $k_i$ is the rate constant for aggregation. data for the linear increase in signals were fitted to a linear equation using sigmaplot 10. the thioflavin t (tht) assay was performed by setting the excitation wavelength at 440 nm and measuring the emission spectrum from 460 to 600 nm. proteins unfolded in denaturant were diluted into 50 mm tris-acetic acid buffer (ph 7.5) to give 0.05 mg/ml of protein in the presence of tht (50 μm). the spectrum of tht alone was used as a correction for each assay. the crystal structure of goose δ-crystallin (pdb code: 1xwo) with all water molecules removed was subjected to the charmm force field and energy minimized with the smart minimization algorithm to satisfy an rms gradient of ~0.1 kcal/mol/å. the implicit solvent model of generalized born was included in the calculation. the structural model was used as a template to build the k315a mutant model using the build mutant protocol (accelrys discovery studio 3.5, accelrys inc.). 
the best scoring model conformation was selected for energy minimization. molecular dynamics (md) simulations were performed using the standard dynamics cascade protocol. the structures of wild-type and mutant δ-crystallin in the charmm force field were subjected to initial energy minimization for 500 steps by steepest descent followed by a conjugate gradient for 500 steps. the minimized models were then heated from 50 to 300 k in 2 ps md simulations followed by equilibration for 2 ps at 300 k in the absence of any structural restraint. finally, the equilibrated models were submitted to md simulations for 100 ps at nvt (constant number of particles, volume, and temperature) using the berendsen coupling algorithm [27] . wild-type and k315a mutant δ-crystallin purified to near homogeneity were used for all analysis. equilibrium unfolding experiments were conducted by incubation of proteins in buffer supplemented with different gdmcl or urea concentrations overnight. tryptophan fluorescence was used to monitor the conformational changes during the unfolding process in the microenvironment around the tryptophan residues (fig 2a and 2b ). there are two tryptophan residues, w74 and w169, in the structure of δ-crystallin. they are located in the solvent accessible domain 1 and the helix bundle of domain 2, respectively ( fig 1a and 1b) . unfolding of the wild-type and k315a mutant protein follows a multistep process in gdmcl solution and was not reversible after 20-fold dilution (fig 2a) . as shown in fig 2a, the dramatic changes in the signal for the first transition were due to subunit dissociation as reported previously, with the gdmcl concentrations for half transition ([d] 1/2 ) at 1 ± 0.05 and 0.5 ± 0.01 m for wildtype and k315a mutant protein, respectively [28] . unfolding of the two proteins in urea solution followed a three-state process, with the [d] 1/2 values in the first transition at 3.6 ± 0.1 and 1.7 ± 0.1 m urea, respectively ( fig 2b) [8, 12] . 
the differences in the denaturant concentration required for the half transition clearly show the potency of gdmcl when disrupting of the conformation of δ-crystallin [11] . in the presence of urea, the k315a mutant protein was dissociated and stayed in a stable conformation between 2 and 5 m urea. the conformation was more stable than for wild-type protein at urea concentrations exceeding 4 m. when the urea was removed by 20-fold dilution into buffer, the denatured mutant protein was able to refold into a conformation similar to the original state (fig 2b) . in contrast, dilution of 3.5 or 4 m urea denatured wild-type δ-crystallin or 6 m urea denatured k315a mutant protein did not result in the restoration of the properly folded conformation. these results suggest that the reversible assembly of the quaternary structure of δ-crystallin is correlated with the conformation of the dissociated monomers. the exposure of hydrophobic surfaces in the presence of urea was investigated using ans titration. a dramatic increase in fluorescence was observed at around 3 m and 1 m urea, for wild-type and the k315a mutant, respectively, indicating that subunit dissociation had occurred due to the exposure of hydrophobic areas (figs 2c and 3) . the result is consistent with the observed changes in the tryptophan micro-environment as probed by tryptophan fluorescence ( fig 2b) . the highest signals occurred around 2 and 5 m urea for the mutant and wildtype protein, respectively. dilution of the 2.5 m urea denatured k315a mutant protein resulted in restoration of a native-like conformation. however, dilution of 4 or 6 m urea denatured wildtype or mutant protein, respectively, resulted in higher levels exposure of hydrophobic areas. since α-helices are the major secondary structure in δ-crystallin, the ellipticity at 222 nm was used to analyze the structural changes induced by the presence of urea (fig 2b) . 
the results showed both proteins retaining relatively stable α-helical structure at concentrations of urea below 2 m. there was about 13% and 30% loss of the structure at 3 m and 4 m urea, respectively. effect of urea on the size-and-shape changes of δ-crystallin variants the size and size-and-shape changes of wild-type and k315a mutant protein in different urea concentrations were determined by sedimentation velocity measurements and using continuous c(s) distribution and c(s, f/f 0 ) distribution analysis, respectively (figs 3 and 4) (s1 and s2 figs) [29] . in the absence of urea, the two proteins appeared as one major component with sedimentation coefficients about 8.5 and 8.4 s, respectively (fig 3) . this peak corresponds to tetrameric δ-crystallin [11] . they possessed the native conformation as judged from the friction ratio (f/f 0 ) distribution profile (with the centre region below 1.5 as shown by red in the contour) (fig 4a and 4b ). at 2.5 m urea two components were observed for wild-type protein with sedimentation coefficients about 6.6 and 3.2 s, and these are assumed to be the dissociated dimeric and monomeric forms. the s values of the two peaks decreased with increasing urea concentration. the proportion of the second (monomeric) peak increased from 6% to 27% to 60% in the presence of 2.5, 3.0 to 3.5 m urea, respectively. dilution of the denatured wild-type protein at 3.5 m urea resulted in refolding into one major component with an s value of 8.1 (fig 3a) . measurement of asl activity showed that around 25% activity was recovered following refolding (table 1) . subunit dissociation was observed at about 1.2 m urea for the k315a mutant protein, with s values for the major peak of 6.8 and a shoulder at about 4.5 s (fig 3b) . a single peak with an s value of 4.5 was observed for the mutant protein at~1.5 m urea. 
this species is thought to be dissociated monomers possessing the native conformation, as judged from the friction ratio (f/f0) distribution profile (fig 4c). the monomers reassembled into a quaternary structure similar to that of wild-type protein after removal of the urea (fig 4d). when the urea concentration was increased to 4.5 m, the s value of the single component gradually decreased to about 2.4 (fig 3b). dilution of the protein denatured with 2.5 m urea resulted in refolding into one major peak with an s value of 8.1 s. around 80% of the asl activity was recovered in the refolded protein (table 1). since the k315a mutant protein was reversibly dissociated into stable monomers at 1.5 m urea, the conformational reversibility of the monomers was investigated. monomeric k315a mutant protein that had been produced by equilibrium incubation of the native protein in 1.5 m urea was treated with various gdmcl concentrations. unfolding of the monomeric k315a mutant protein followed a multistep pathway as measured by both tryptophan and ans fluorescence (fig 5a and 5b). stable intermediates were identified from the unfolding curve of ans fluorescence, which included the highest ans fluorescence region at around 1 m gdmcl and a steady state at ~3 m gdmcl (fig 5b). the monomeric protein had lost about 30 and 55% of its secondary structure at 1 and 3 m gdmcl, respectively. the changes in tryptophan fluorescence were dilution of monomeric k315a mutant protein denatured in 5 m gdmcl resulted in refolding to a conformation similar to the original monomeric state (fig 5a and 5b). however, dilution of 1 and 3 m gdmcl denatured monomeric protein resulted in an increase in ans fluorescence, indicating higher exposure of hydrophobic area (fig 5b). the results suggest that the conformation of the partly unfolded intermediate could affect the folding reversibility of the monomeric k315a δ-crystallin mutant. 
k315a mutant δ-crystallin that had been denatured in 3 m urea reversibly folded back to the original conformation after dilution (fig 2b). signal changes in the tryptophan fluorescence with different urea concentrations were used to calculate the thermodynamic parameters by direct fitting to the two-state mechanism (eq 1) [19]. the free energy difference in the absence of urea ($\Delta G^0$) for the transition was determined to be 6.5 ± 0.3 kcal/mol (table 2). the changes of tryptophan fluorescence as a function of gdmcl concentration were used to calculate the thermodynamic parameters for the reversible unfolding of the monomeric k315a δ-crystallin mutant (fig 5a). the unfolding curve was best fitted to a four-state model [8, 20]. the $[\mathrm{gdmcl}]_{1/2}$ values for the transitions from m to $I_1$, $I_1$ to $I_2$ and $I_2$ to u (denaturation) were about 0.6 ± 0.04 m, 2.1 ± 0.2 m and 3.9 ± 0.1 m, respectively. the thermodynamic parameters determined are summarized in table 2. the total free energy difference ($\Delta G^0$) for folding of the monomeric k315a δ-crystallin mutant was determined to be 12.8 ± 0.7 kcal/mol. to determine whether the denatured monomeric k315a mutant could refold and then reassemble into a tetrameric protein, the monomeric proteins that had been denatured by gdmcl were diluted with buffer to remove both the urea and the gdmcl. the results showed that the quaternary structure of the k315a mutant protein was recovered from the denatured monomers instantly after dilution. the amount of the assembled protein increased with the incubation time, with about 60% of the activity recovered (fig 5c and table 1). in contrast, on dilution of 5 m gdmcl denatured wild-type δ-crystallin, the quaternary structure was recovered after overnight incubation with no detectable activity (table 1). a similar result was also shown for 5 m gdmcl denatured k315a mutant protein. these results suggest different pathways for protein folding, which seem to be due to the distinct conformations of the denatured protein caused by the different means of denaturation.
[fig 5 caption, partial: ... without denaturant treatment, respectively. samples from refolding of 5 m gdmcl denatured wild-type and mutant protein for time period of zero or overnight are shown in lanes 3-4 and 5-6, respectively. the protein concentrations used in the assays were 0.03, 0.1 and 0.1 mg/ml for (a), (b) and (c), respectively. doi:10.1371/journal.pone.0145957.g005]
[table 2 footnotes, partial: ... is the concentration of denaturant at which the transition is half completed. the data were calculated by global fitting to eq 1. b the data were fitted to a 4-state unfolding model (eq 2) described as the transition from m to u. two intermediates, $I_1$ and $I_2$, were assumed in the process before the denatured species (u). these data are the mean ± sd of at least three independent experiments. doi:10.1371/journal.pone.0145957.t002]
since refolding of partly unfolded monomeric mutant δ-crystallin resulted in a conformation with high exposure of hydrophobic regions, the occurrence of protein aggregation in the process was determined using light scattering measurement. no protein aggregation was detected upon dilution of 0.84, 3 and 5 m gdmcl denatured monomeric mutant protein into buffer containing 1.5 m urea. however, when 0.84 and 3 m gdmcl denatured monomeric mutant protein was diluted into buffer, protein aggregation occurred. the rates for aggregate formation were calculated to be ca. 0.14 and 0.0004 min^-1, respectively (fig 6a). when αa-crystallin, the chaperone protein, was added in a 5:1 ratio to 0.84 m gdmcl denatured monomeric mutant protein in folding buffer, no change in the rate of protein aggregation was observed. formation of aggregates by αa-crystallin alone did not occur under the same conditions. it is notable that upon dilution of 5 m gdmcl denatured monomeric mutant protein into buffer, no aggregation occurred. 
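the aggregation rates above come from fitting turbidity time courses to a single or double exponential. a minimal sketch of such a model follows, assuming the form y_t = y_0 + sum_i y_i exp(-k_i t), which is inferred from the symbol definitions in the methods (y_0 as the signal of the final state) and is not confirmed by the source. the function names and amplitudes are hypothetical; only the rate constant 0.14 min^-1 is taken from the text.

```python
import math


def aggregation_signal(t, y_final, terms):
    """Signal at time t for a sum of exponential phases.

    y_final: signal of the final state (y_0 in the text)
    terms: iterable of (amplitude, rate_constant) pairs, one per phase;
           a negative amplitude gives a signal that rises toward y_final.
    """
    return y_final + sum(amp * math.exp(-k * t) for amp, k in terms)


def half_time(k):
    """Time for one exponential phase to decay to half its amplitude."""
    return math.log(2.0) / k


if __name__ == "__main__":
    # Single-exponential example using the faster rate reported in the text
    # (0.14 per min); the amplitudes are hypothetical.
    for minutes in (0, 5, 10, 30, 60):
        print(minutes, round(aggregation_signal(minutes, 1.0, [(-1.0, 0.14)]), 3))
    print("half-time (min):", round(half_time(0.14), 2))
```

a double exponential is obtained by passing two (amplitude, rate) pairs; under this assumed form, a rate of 0.0004 min^-1 corresponds to a far longer half-time than 0.14 min^-1.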
the structural features of the protein aggregate were investigated using the thioflavin t assay [30]. an increase in fluorescence intensity resulting from binding of tht to the aggregates over time was observed following dilution of 0.84 and 3 m gdmcl denatured monomeric mutant δ-crystallin into buffer (fig 6b). the results suggest the possible formation of ordered aggregates. no changes in the signal were observed during the incubation period upon refolding of 5 m gdmcl denatured monomeric mutant protein. to determine the effect of the interactions provided by k315 at the diagonal subunits on disassembly of the quaternary structure, md simulations were run for 100 ps for wild-type and mutant δ-crystallin in the absence of any structural restraints. from the simulation trajectory, the dynamic motion for disassembly of the quaternary structure and conformational changes in the tertiary structure were elucidated. the distances between the cα atoms of d237 and r182, r302 and e330, and the two k315 or a315 residues were measured to evaluate the extent of subunit dissociation between the a-c, a-d and a-b dimeric pairs, respectively (fig 7a). these residues interact with each other by hydrogen bonding or salt bridges at the dimeric pair interface in the native structure. these interactions are lost on replacement of k315 with a315. the results showed that the distances between d237 and r182, r302 and e330, and the two a315 residues increased linearly at a similar rate, except that the rate of change for the r302-e330 interaction in wild-type protein was about half of that for the mutant protein. in contrast, no changes in the distance between the k315 residues were observed before 80 ps. inspection of the time course at 20 ps in wild-type protein showed that the primary dimers of subunits a and c or b and d were separated, while the diagonal dimers of subunits a and b or c and d remained connected by the interactions of residue k315 (fig 7b). 
however, the subunits for both of the primary and diagonal dimers were separated from each other in the mutant protein. the results suggest a different disassembly process for tetrameric wild-type and mutant proteins. the quaternary structure of δ-crystallin is assembled as two pairs of closely associated dimers. previous mutagenesis studies have used to investigate the interactions at the interfaces of double dimers to elucidate their role in the stabilization of the quaternary structure [11] . the unique stable conformation from unfolding of k315a mutant protein in the presence of urea suggests that the interactions provided by this residue at the interfaces may play a critical role in stabilization of the quaternary structure of δ-crystallin. lys-315 is the only residue which is arranged symmetrically at the diagonal subunit interfaces (fig 1b) . the ε-amino group in the side-chain of this residue forms hydrogen bonds with the carbonyl groups of m312, v313 and k315 within the symmetric subunit (from pisa analysis: http://www.ebi.ac.uk/msd-srv/prpt_int/cgi-bin/piserver) (fig 1d) . substitution of this residue by alanine reduces the structural stability of the protein. the results from the previous study showed about a 9°c reduction in the thermal stability of the secondary structure and the changes in the micro-environment surrounding the tryptophan residues [11] . this mutant protein was also more susceptible to chemical denaturation, since about half of the concentration of denaturant was required to disrupt its quaternary structure compared to wild-type protein. both of the wild-type and mutant protein showed similar and not reversible denaturation in the presence of gdmcl. however, differences in the denaturation pathway were observed when urea was used as the denaturant. 
the results suggest that the non-covalent interactions between the intra and inter-subunits might be disrupted by the ionic character of gdmcl, with subunit dissociation and denaturation occurring simultaneously for both proteins [31] . the previous studies showing the presence of only a monomeric molten globule intermediate in the dissociation pathway of wild-type δ-crystallin in gdmcl supports this assumption [12, 13] . thus, the role of k315 in the folding process of δ-crystallin cannot be distinguished under the strong denaturant. around 2~5 m urea, the k315a mutant δ-crystallin is in a stable conformation, as judged by tryptophan fluorescence measurements (fig 2b) . subunit dissociation occurs under these conditions, resulting in the exposure of hydrophobic regions (figs 2c and 3b ). only monomers were identified in this state, as measured by sedimentation velocity analysis. the monomers that dissociated from the tetramers at~1.5 m urea possessed a nearly identical content of secondary structure as native protein and native-like conformation (figs 2b and 4c ). monomers in this conformation were able to refold and re-associate into tetramers with a similar conformation as the native protein, and possessed significant asl activity. when the urea concentration was increased to 2.5 m, the conformational changes led to further exposure of hydrophobic regions. however, following this conformational change at higher urea concentrations, protein folding was not reversible. the contrasting result found for wild-type protein was that the dissociation and denaturation were concurrent under the effect of urea (fig 3a) . in this condition, the conformation of the dissociated monomers was partly unfolded, as judged from the reduced level of α-helix (fig 2b) . the dissociated monomers seem to refold into alternative conformations then re-associate into tetramers with only part of the catalytic activity recovered (table 1 ). 
these results indicated that the conformation of the monomers seems related to the assembly pattern for functional protein. it is interesting that the interactions provided by k315 at the interfaces seem to affect the disassembly pathway of the quaternary structure of wild-type protein. it was found that both of the dimers and monomers were dissociated from the wild-type protein at around 2 to 4 m urea, as measured by sedimentation velocity while only monomers were dissociated from the mutant protein (fig 3) . these results suggest that the interactions provided by k315 at the interfaces seem to increase the energy barrier for dimeric dissociation. when these interactions were disrupted by mutation, the monomers could be isolated with intact conformation from the tetramers earlier than for the wild-type protein in urea. thus, the steps for dissociation and denaturation could be distinguished in the unfolding process of the k315a mutant δ-crystallin. in this study, the dynamic motion of protein structure in the process was simulated for elucidation the dissociation mechanism of δ-crystallin. from the trajectory, the tetramers were found to disassemble at the early stage. due to the interactions by k315, the interfaces between the diagonal dimers remain connected while distances increase between the subunits of the primary dimers in wild-type protein. in contrast, the subunits between both of the diagonal and primary dimers were dissociated in the k315a mutant protein (fig 7) . as δ-crystallin was assembled by two close contact dimers as transthyretin, dissociation at the two primary dimer interfaces would be expected to occur at the initial stage [4, 32] . the simulation result provides a novel pattern for dissociation of the double dimeric protein consistent with the results of sedimentation velocity experiments. 
a possible explanation for this earlier dissociation of the subunits of the primary dimers compared to the diagonal dimers is a difference in solvent accessibility. unlike the interfaces between the subunits of the primary dimer, k315 is buried at the interior interfaces away from solvent. thus, the interactions of k315 at the interfaces of the protein seem to elevate the stability of the quaternary structure. for wild-type protein, two diagonal dimers were presumed to disassemble initially from the tetramers, followed by subunit dissociation of the diagonal dimers. however, dissociation at the interfaces of the two primary dimers is assumed to be the first step in the unfolding process of the k315a mutant protein (fig 8). the detailed mechanism for folding of the monomeric protein remains elusive because the monomers dissociated from wild-type δ-crystallin were in a molten globule conformation. thus, the monomers that reversibly dissociated from k315a mutant δ-crystallin, which have a stable conformation and a level of secondary structure similar to the original state, would be a good model for studying the folding process. the monomeric protein was reversibly denatured by a four-state mechanism in the presence of gdmcl, and two intermediates were detected in the process (fig 5). refolding of the partly unfolded intermediate was not reversible, which in turn resulted in a conformation with more exposure of hydrophobic regions. only the denatured δ-crystallin refolded reversibly into monomers with a conformation similar to the original state. it is interesting that the refolded monomers were able to reassemble into tetramers instantly upon dilution, with substantial recovery of activity (fig 5 and table 1). this contrasts with the slow refolding of gdmcl denatured wild-type protein into its tetrameric form with no detectable activity. 
the slow recovery of the quaternary structure for the latter protein is due to an energy barrier for the appropriate assembly of double dimers, as reported previously [14]. the results suggest that the conformation of the denatured monomers which was unfolded by stepwise dissociation or directly unfolded with 5 m gdmcl could be different. the consequence of this might be that protein folding occurs via different pathways, leading to refolded monomers with different conformations that associate into the native structure or into alternative structures without function. protein aggregates are prone to form during the reassembly process from refolding of partly unfolded monomeric intermediates of δ-crystallin. the intermediate with the highest exposure of hydrophobic regions is particularly prone to aggregate formation (fig 5). aggregate formation by monomeric intermediates with defined conformations was also reported for transthyretin under mildly acidic conditions [33]. the result implies that the conformational status of the monomers influences subunit association. 
[fig 8 caption, partial: the tetrameric wild-type (t) and k315a mutant (t*) δ-crystallin were dissociated through the diagonal dimer (d) and primary dimer (d*) to monomers with partially unfolded (m) and stable (m*) conformations, respectively. monomers of the wild-type or the mutant protein were then denatured through an intermediate (i or i*) into the respective unfolded form (u or u*). the monomers (m) of wild-type protein in the partially unfolded conformation associated by an alternative pathway to form dimers (d1), and then assembled into tetramers (t1) or aggregates (a). the aggregation was prevented by αa-crystallin. refolding followed by assembly of the intermediates (i1* and i2*) of the mutant protein resulted in the formation of aggregates (a1 and a2), and the chaperone function of αa-crystallin was ineffective in this process. doi:10.1371/journal.pone.0145957.g008]
it is interesting that the presence of α-crystallin seems to increase the formation of aggregates from the partly unfolded monomeric intermediates of δ-crystallin, while α-crystallin alone was not affected under these conditions (fig 6). studies of the αa peptide, which induces the aggregation of soluble α-crystallin, suggested that the mechanism of aggregate formation might be due to changes in the hydrophobicity of α-crystallin induced by the peptide [34, 35]. our previous study reported that aggregate formation during refolding of gdmcl-denatured wild-type δ-crystallin was due to the improper assembly of double dimers and was prevented by the presence of α-crystallin [14]. in this study, aggregate formation was caused by assembly of the refolded monomeric intermediate, which facilitated the aggregation of α-crystallin. it is thus postulated that the electrostatic interaction with the substrate is a key factor determining the chaperone-like or anti-chaperone activity of α-crystallin [34, 36]. the underlying mechanism requires further investigation. nonetheless, the result highlights that the conformational status of the monomers plays a critical role in the folding pathway leading to reversible oligomerization or aggregate formation. in conclusion, the folding pathways of wild-type and mutant δ-crystallin are summarized in the working models shown in fig 8. this model depicts the key interactions of k315 at the interfaces of diagonal subunits, which not only stabilize the quaternary structure of δ-crystallin but also act as the energy barrier for dissociation of stable monomers. this stability might be one of the reasons for recruitment of the metabolic enzyme asl into the lens as a crystallin protein [37]. the single polypeptide chain of δ-crystallin after translation is assumed to fold into functional tetramers following the refolding pathway proposed for the k315a mutant.
however, due to the interactions of k315, the tetrameric protein is assumed to disassemble in an alternative manner to form the diagonal dimers, followed by simultaneous subunit dissociation and denaturation. monomers in this state might associate into dimers via a different pathway; these dimers then assemble slowly into a non-native tetrameric form or self-associate into aggregates, which can be prevented by the presence of α-crystallin. the reversible folding of the monomers that dissociated from the k315a mutant protein with near-native conformation provides insight into the folding mechanism of δ-crystallin. in this process, the ordered aggregate formation from re-association of the partly unfolded intermediate reveals a specific state of the protein that evades the chaperone function of α-crystallin. this model proposes a possible mechanism for the aggregate formation of lens proteins under stress and for their interaction with α-crystallin. this study reveals the key role of monomers that dissociated from the oligomeric crystallin; their conformational status determines the level of aggregate formation.

the panels show the raw sedimentation and theoretical fitted data (solid lines), and the fitting residual, respectively. (c) grayscale of residual bitmap. the raw sedimentation data were fitted to the continuous size distribution model using the sedfit program [25]. (tif)

lens crystallins: gene recruitment and evolutionary dynamism the recruitment of crystallins: new functions precede gene duplication lens crystallins.
innovation associated with changes in gene regulation the structure of avian eye lens δ-crystallin reveals a new fold for a superfamily of oligomeric enzymes human argininosuccinate lyase: a structural basis for intragenic complementation evidence for neutral and selective processes in the recruitment of enzyme-crystallins in avian lenses structural comparison of the enzymatically active and inactive forms of δ crystallin and the role of histidine 91 the effect of n-terminal truncation on double-dimer assembly of goose δ-crystallin structural studies of duck δ1 and δ2 crystallin suggest conformational changes occur during catalysis crystal structure of an inactive duck δii crystallin mutant with bound argininosuccinate substitution of residues at the double dimer interface affects the stability and oligomerization of goose δ-crystallin guanidine hydrochloride induced reversible dissociation and denaturation of duck δ2-crystallin monomeric molten globule intermediate involved in the equilibrium unfolding of tetrameric duck δ2-crystallin kinetic refolding barrier of guanidinium chloride denatured goose δ-crystallin leads to regular aggregate formation disruption of a salt bridge dramatically accelerates subunit exchange in duck δ2 crystallin guanidine hydrochloride-induced denaturation and refolding of transthyretin exhibits a marked hysteresis: equilibria with high kinetic barriers distinct interactions of αa-crystallin with homologous substrate proteins, δ-crystallin and argininosuccinate lyase, under thermal stress. 
biochimie a rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding determination and analysis of urea and guanidine hydrochloride denaturation curves equilibrium unfolding of bombyx mori glycyl-trna synthetase metabolism of amino acids and amines resolution of the fluorescence equilibrium unfolding profile of trp aporepressor using single tryptophan mutants the application of circular dichroism to studies of protein folding and unfolding macromolecular size-and-shape distributions by sedimentation velocity analytical ultracentrifugation determination of the sedimentation coefficient distribution by least-squares boundary modeling analytical ultracentrifugation in biochemistry and polymer science molecular dynamics with coupling to an external bath d18g transthyretin is monomeric, aggregation prone, and not detectable in plasma and cerebrospinal fluid: a prescription for central nervous system amyloidosis? reversible unfolding of the severe acute respiratory syndrome coronavirus main protease in guanidinium chloride mechanism of thioflavin t binding to amyloid fibrils protein denaturation with guanidine hydrochloride or urea provides a different estimate of stability depending on the contributions of electrostatic interactions kinetic stabilization of the native state by protein engineering: implications for inhibition of transthyretin amyloidogenesis the acid-mediated denaturation pathway of transthyretin yields a conformational intermediate that can self-assemble into amyloid mechanism of the chaperone-like and antichaperone activities of amyloid fibrils of peptides from αa-crystallin the αa66-80 peptide interacts with soluble αcrystallin and induces its aggregation and precipitation: a contribution to age-related cataract formation amyloid-induced aggregation and precipitation of soluble proteins: an electrostatic contribution of the alzheimer's beta(25-35) amyloid fibril gene sharing by 
δcrystallin and argininosuccinate lyase we thank dr m. d. lloyd (university of bath, u. k.) for reading of this manuscript before publication. conceived and designed the experiments: hjl wyc. performed the experiments: cwh hcl wck cyc. analyzed the data: cwh cyc. wrote the paper: cwh hjl. key: cord-003761-ikni2acz authors: li, zengbin; zou, zixiao; jiang, zeju; huang, xiaotian; liu, qiong title: biological function and application of picornaviral 2b protein: a new target for antiviral drug development date: 2019-06-04 journal: viruses doi: 10.3390/v11060510 sha: doc_id: 3761 cord_uid: ikni2acz picornaviruses are associated with acute and chronic diseases. the clinical manifestations of infections are often mild, but infections may also lead to respiratory symptoms, gastroenteritis, myocarditis, meningitis, hepatitis, and poliomyelitis, with serious impacts on human health and economic losses in animal husbandry. thus far, research on picornaviruses has mainly focused on structural proteins such as vp1, whereas the non-structural protein 2b, which plays vital roles in the life cycle of the viruses and exhibits a viroporin or viroporin-like activity, has been overlooked. viroporins are viral proteins containing at least one amphipathic α-helical structure, which oligomerizes to form transmembrane hydrophilic pores. in this review, we mainly summarize recent research data on the viroporin or viroporin-like activity of 2b proteins, which affects the biological function of the membrane, regulates cell death, and affects the host immune response. considering these mechanisms, the potential application of the 2b protein as a candidate target for antiviral drug development is discussed, along with research challenges and prospects toward realizing a novel treatment strategy for picornavirus infections. the picornaviridae family consists of 35 genera and 80 species, mainly including enterovirus, hepatovirus, cardiovirus, aphthovirus, and rhinovirus [1] . 
to date, research on picornaviruses has mainly focused on enterovirus (ev) 71, coxsackievirus (cv), poliovirus (pv), encephalomyocarditis virus (emcv), foot-and-mouth disease virus (fmdv), human rhinovirus (hrv), and hepatitis a virus (hav). picornavirus infections can cause enormous damage in humans and animals. ev71, cva16, and cva10 cause hand, foot, and mouth disease in millions of children in the asia-pacific region each year and can cause more serious clinical symptoms such as aseptic meningitis, acute flaccid paralysis, and neurological respiratory syndrome [1] [2] [3]. picornaviruses are non-enveloped spherical viruses with an icosahedral capsid. the picornaviral genome consists of a single-stranded positive-sense rna, approximately 6.7-10.1 kilobases in length, with a highly conserved structure, including a 5′-noncoding region (ncr), an open reading frame, a 3′-ncr, and a 3′-end poly(a) tail [1]. the 5′-ncr contains multiple rna secondary structural elements, including the internal ribosome entry site. the open reading frame of the viral genome consists of three regions: p1, p2, and p3. the p1 region is translated and processed to form the structural proteins vp1, vp2, vp3, and vp4, which compose the capsid of a picornavirus. the p2 and p3 regions are translated and processed into the non-structural proteins 2a, 2b, and 2c and 3a, 3b, 3c, and 3d, respectively. the majority of related research has focused on structural proteins of picornaviruses, such as vp1, whereas the importance of the non-structural protein 2b has been relatively overlooked. viroporins are proteins found in a variety of viruses and generally comprise 50 to 120 amino acids. each viroporin contains a highly hydrophobic domain capable of forming at least one amphipathic α-helical structure, which oligomerizes to form transmembrane hydrophilic pores [4, 5].
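as an illustration of what "amphipathic α-helical structure" means in quantitative terms, the following is a minimal sketch that computes a mean hydrophobic moment for a peptide placed on an ideal α-helix (100° rotation per residue). the choice of the kyte-doolittle hydropathy scale, the function name `hydrophobic_moment`, and both test peptides are assumptions made for illustration only; the peptides are hypothetical and are not 2b sequences from any virus.

```python
import math

# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
# Using this particular scale is an assumption made for illustration.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(peptide, angle_deg=100.0):
    """Mean hydrophobic moment per residue on an ideal helix.

    Each residue is treated as a vector whose length is its hydropathy
    and whose direction is its angular position around the helix axis
    (angle_deg per residue; 100 degrees corresponds to an alpha-helix).
    A large value means hydrophobic residues cluster on one face.
    """
    if not peptide:
        raise ValueError("peptide must contain at least one residue")
    sin_sum = 0.0
    cos_sum = 0.0
    for i, residue in enumerate(peptide.upper()):
        hydropathy = KYTE_DOOLITTLE[residue]
        theta = math.radians(i * angle_deg)
        sin_sum += hydropathy * math.sin(theta)
        cos_sum += hydropathy * math.cos(theta)
    return math.hypot(sin_sum, cos_sum) / len(peptide)


# Hypothetical 18-residue peptides (five full helical turns at 100 degrees).
# AMPHIPATHIC places leucine on one helical face and lysine on the other;
# UNIFORM is all leucine, so it has no preferred hydrophobic face.
AMPHIPATHIC = "LKKLLKKLLKLLKKLLKK"
UNIFORM = "L" * 18

if __name__ == "__main__":
    print("amphipathic:", round(hydrophobic_moment(AMPHIPATHIC), 3))
    print("uniform:", round(hydrophobic_moment(UNIFORM), 3))
```

in this sketch the designed peptide yields a clearly larger moment than the uniformly hydrophobic one, which is the property that would allow a helix to present a hydrophobic face to lipid and a hydrophilic face to a pore lumen. a real analysis of a 2b sequence would need a sliding window and an experimentally supported helix assignment.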
the 2b protein is a crucial component of picornaviruses that exhibits viroporin or viroporin-like activity and plays a key role in the picornavirus life cycle by inducing a series of cytotoxic reactions that promote picornaviral replication and release [6] [7] [8] [9] [10] [11] [12] [13] [14]. the 2b protein has a highly conserved sequence, which can be exploited for viral detection [15] [16] [17], vaccine development [18] [19] [20], and rna interference [21] [22] [23] [24] [25]. in addition, because the 2b protein exhibits viroporin or viroporin-like activity, drugs directed against viroporins could potentially target the 2b protein as a novel strategy to treat or prevent picornavirus infections. however, the detailed mechanism of action of the 2b protein has not been elucidated to date. therefore, here, we review the recent research data on the role of the 2b protein in the picornaviral life cycle and discuss its possible application in antiviral therapy. the picornaviral 2b protein is a relatively short molecule, containing a maximum of two predicted transmembrane hydrophobic helices, along with n- and c-terminal domains, which are connected by a short stretch of amino acid residues. the α-helix-turn-α-helix motif of the 2b protein is the basis for forming a transmembrane pore through homo-multimerization and is the major determinant of 2b protein function [4, 11, 26-30]. a computational approach has demonstrated that the ev 2b protein is a tetramer, and that 2b proteins with different orientations have different activities [31, 32]. the 2b protein belongs to the type ii family of viroporins, which can be further divided into different types according to the number and orientation of the membrane-spanning domains (figure 1).
in type iia viroporins, the n- and c-termini extend into the organelle lumen, as in the 2b protein of cvb3 [6, 9, 26, 33], whereas the n- and c-termini of type iib viroporins face the cytoplasmic matrix, as in pv [9, 27, 34] and fmdv [7, 14, 28]. in addition, the c-terminus of the hav 2b protein has a viroporin-like activity [11]. picornaviral 2b proteins target the membrane and form pores mainly through their transmembrane regions. the protein molecules are first inserted into the membrane individually and then self-interact and homo-oligomerize to form higher-order structures, which are important for the pore-forming activity and are determined by the specific sequence and structure [4, 34, 35]. the majority of 2b proteins are localized to organelles, with predominant co-localization with the golgi apparatus and the endoplasmic reticulum (er) in cvb3, pv, and hrv14 (figure 1) [36]. the hrv16 and fmdv 2b proteins are mainly localized to the er, whereas the emcv 2b protein is not localized to either the golgi complex or the er [28, 36, 37]. seggewiss et al. [38] found that the hav 2b protein was not localized to the er either but was involved in the modification of the er-golgi apparatus intermediate compartment. since protein structure determines ultimate function, and 2b proteins belonging to different viroporin species show both similarities and differences in their functions, much insight can be gained from research both on the same 2b viroporin and on different viroporins.

figure 1. picornavirus infects the host cell, and the viral genome then encodes the 2b protein. the 2b protein belongs to the type ii family of viroporins and includes two transmembrane hydrophobic helices, which are the basis for forming a transmembrane pore that changes cell membrane permeability. in type iia viroporins, the n- and c-termini extend into the organelle lumen, whereas the n- and c-termini of type iib viroporins face the cytoplasmic matrix.
the majority of 2b proteins are localized to organelles, with predominant co-localization with the golgi apparatus and the endoplasmic reticulum (er), resulting in an obvious decrease in ca2+ in the er and golgi complex, along with a decrease in calcium uptake by the mitochondria, and causing an influx of extracellular ca2+.

the 2b protein can induce many cellular reactions, such as changing membrane permeability, regulating apoptosis and autophagy, and affecting host immune responses. these functions are all related to changes in ion concentrations, especially of calcium ions (ca2+). therefore, here, we mostly focus on the role of ca2+ in these 2b protein activities. a common feature of infection by animal viruses is disruption of the ion balance in host cells. the picornaviral 2b protein may change the membrane permeability of target cells, disturbing the ion balance, especially that of ca2+, in organelles such as the er and the golgi apparatus (figure 1) [6, 36, 39]. the changes in membrane permeability caused by the 2b protein have also been suggested to be regulated by the content of specific membrane phospholipids [11, 40]. ca2+ ions are involved in the activation of enzymes in cells and play a crucial role in viral replication and other viral biological processes [41] [42] [43]. however, the role of the 2b protein in ca2+ homeostasis remains unclear. initial studies only indicated that host cells had elevated ca2+ levels owing to the expression of the 2b protein [9, 28, 29, 36, 44-46], but the mechanism was not clarified. since then, some researchers have proposed that the decrease in the concentration of ca2+ stored in organelles triggers the opening of specific calcium ion channels on the plasma membrane of cells, causing an influx of extracellular ca2+ [29].
this idea was supported by the findings that expression of the cvb3 and pv 2b proteins resulted in an obvious decrease in the ca2+ concentration in the er and golgi complex, along with a decrease in calcium uptake by the mitochondria. meanwhile, the increased ca2+ level in the cytoplasm was suggested to be mainly due to the influx of extracellular ca2+ [36, 44, 45]. similarly, the expression of the hrv 2b protein was shown to decrease the ca2+ concentrations in the er and golgi apparatus, whereas the emcv 2b protein only significantly reduced the ca2+ concentration in the er [36]. in contrast, other studies showed that expression of the hav and fmdv 2b proteins elevated the cytoplasmic ca2+ level but did not alter the level of stored ca2+ in organelles such as the er and golgi complex [28, 36]. taken together, these studies suggest that there are different mechanisms by which 2b proteins affect ca2+ concentrations, depending on the virus type. furthermore, it is unknown whether ca2+ passes directly through the channel formed by the 2b protein. pham et al. [47] demonstrated, using a planar lipid bilayer and liposome patch-clamp electrophysiological technique, that the rotavirus non-structural protein 4 (nsp4) viroporin region acts as a ca2+ conduction channel. although there is currently no direct evidence that the 2b protein can directly induce the observed changes in the ca2+ concentration in host cells upon infection, the above-reviewed studies suggest an association, and the mechanism requires further investigation. given the importance of ca2+ signaling for numerous cellular processes, further studies on picornaviral 2b protein function should include determination of the ca2+ concentration, which may provide more insight into the detailed function of the 2b protein.
in particular, the 2b protein may change the ca2+ concentration to regulate autophagy and apoptosis, which are distinct cell death mechanisms controlled by the virus to effectively evade host immunity, thereby promoting viral replication and release [48] [49] [50] [51] [52]. picornaviruses can form new cytoplasmic vesicles by inducing membrane remodeling, thereby promoting their own proliferation [39, 53, 54]. the 2b protein is capable of binding to the membrane and inducing target membrane remodeling to form a unique membrane structure that can serve as a viral replication site. this site, known as the viroplasm, is generated from the er to accumulate all of the cellular components required for viral replication (figure 2) [13, 39, 54-56]. the viroplasm is also the main membrane source for autophagy [54, 57]. the cvb3 2b protein depends on its transmembrane hydrophobic region to induce autophagy [8], which may be related to alterations in membrane permeability, especially with regard to the ca2+ concentration. moreover, at an early stage of fmdv infection, the virus specifically recognizes and binds to cell surface receptors, and the 2b protein rapidly upregulates the autophagy pathway, leading to punctate aggregation of a large number of autophagy marker proteins, such as the microtubule-associated protein 1 light chain 3 (map1-lc3) [28, 58]. in addition, rotavirus encodes the nsp4 viroporin, which releases er-stored ca2+ into the cytoplasm, thereby activating the ca2+/calmodulin-dependent kinase kinase-β (camkk-β) signaling pathway and leading to autophagy (figure 2) [59]. further, cvb4 induces autophagy in a calpain-dependent manner, causing an accumulation of lc3 lipids and autophagosomes [60]. considering the ability of the 2b protein to alter cellular calcium homeostasis, along with its viroporin-like activity, it is feasible that the 2b protein may regulate autophagy mainly by changing the ca2+ concentration.
the 2b protein has also been shown to regulate apoptosis through the endogenous pathway, which can be divided into the er stress pathway and the mitochondrial pathway, providing another potential mechanism of bypassing the host immune response to facilitate infection [32, 37, 44, 45, 48]. ca2+ plays a pivotal role in er stress-dependent apoptosis by regulating the flow between the er and the mitochondria [45, 61]. excessive mitochondrial uptake of ca2+ exerts a cytotoxic effect because a high ca2+ concentration can open numerous mitochondrial permeability transition pores, increase mitochondrial permeability, and destroy the mitochondrial outer membrane; consequently, cytochrome c and other proapoptotic factors are released, leading to apoptosis (figure 2) [9, 32, 44, 45, 62]. the cvb3 2b protein was shown to inhibit caspase activation and cell death induced by actinomycin d and cycloheximide by regulating the intracellular ca2+ concentration [44, 45]. additionally, the 2b protein of hrv16 induced an er stress response, accompanied by an increased expression of cleaved caspase-3 and ccaat-enhancer-binding protein homologous protein (chop), which might also have involved a change in the ca2+ level [37]. collectively, these results suggest that the 2b protein may regulate apoptosis by altering calcium homeostasis. furthermore, the 2b protein can regulate apoptosis through the mitochondrial pathway. madan et al. [32] showed that the pv 2b protein interacted with the mitochondria and altered the mitochondrial morphology, and that cytochrome c was released after pv 2b gene expression. cong et al. [63] reported that the ev71 2b protein was localized to the mitochondria and induced apoptosis by directly activating the proapoptotic b-cell lymphoma 2-associated x (bax) protein, without a significant uptake of ca2+ by the mitochondria.
therefore, the activation of the mitochondrial apoptotic pathway and subsequent apoptosis induced by the ev71 2b protein may not involve ca2+ signaling. collectively, picornaviral 2b proteins can induce cell death in a variety of ways, with ca2+ playing an important role in most of these mechanisms.

figure 2. 2b protein regulates autophagy and apoptosis. the 2b protein induces target membrane remodeling to form the viroplasm, which is generated from the endoplasmic reticulum (er). the isolation membrane is produced by the viroplasm. activation of the ca2+/calmodulin-dependent kinase kinase-β (camkk-β) signaling pathway is due to an increased intracellular calcium concentration. furthermore, the mitochondrion takes up ca2+ from the er, whereby cytochrome c is released, leading to apoptosis.

the host immune system is an important line of defense against pathogens, and pathogens can affect the immune system in a variety of ways. the 2b protein mainly affects the host immune response through inflammasome activation and by direct antagonism of the host immune response. recognition of pathogens by the immune system is mainly mediated by receptors for pathogen-associated molecular patterns, known as pattern recognition receptors, including nucleotide-binding oligomerization domain (nod)-like receptors (nlrs), retinoic acid-inducible gene-i (rig-i)-like helicases, and nlr family pyrin domain-containing 3 (nlrp3) [64] [65] [66]. activation of the nlrp3 inflammasome occurs during changes in ion concentrations [67, 68]. nlrp3 belongs to the nlr family of inflammasomes and causes interleukin (il)-1β and il-18 secretion via caspase-1 activation [67]. the emcv, pv, ev71, and hrv 2b proteins all activate the nlrp3 inflammasome but use distinct mechanisms [12, 69].
the hrv and emcv 2b proteins can stimulate the nlrp3 inflammasome pathway to activate caspase-1, which catalyzes the proteolysis of pro-il-1β to il-1β, leading to its secretion across the plasma membrane, by inducing a ca2+ efflux from intracellular storage (figure 3) [12, 69]. wang et al. [70] found that cvb3 infection induced nlrp3 activation in association with a k+ efflux. the influenza virus m2 protein, which is also a viroporin, is capable of transporting na+ and k+, resulting in activation of the nlrp3 inflammasome [71, 72]. since the cvb3 2b protein acts as a viroporin and can disrupt the intracellular ion balance [36, 46], it has been speculated that the induction of nlrp3 activation in cvb3-infected cells may be related to the 2b protein. in addition to activating the inflammasome, the 2b protein also antagonizes the host immune response. both in vitro and in vivo studies have suggested that inhibition of protein trafficking would effectively allow viral evasion of the host immune response (figure 3) [73] [74] [75]. moreover, inhibition of protein transport may be related to changes in the ca2+ concentration [46]. similar to the cvb3 2b protein, the 2b proteins of pv, hrv16, and hrv14 were shown to significantly inhibit protein transport through the golgi complex, whereas the hav, fmdv, and emcv 2b proteins did not inhibit protein transport [36, 46, 76]. the fmdv 2b and 2c proteins individually did not block protein secretion, whereas the transport of proteins from the er to the golgi complex was blocked by the fmdv 2bc protein, and this effect was reproduced upon co-expression of the 2c and 2b genes [7, 77]. collectively, these findings suggest that the 2b protein may participate in viral evasion of the host immune response, mainly by inhibiting protein transport.

figure 3. 2b protein affects the host immune response.
the 2b protein can stimulate the nlrp3 inflammasome pathway to activate caspase-1, which catalyzes the proteolysis of pro-il-1β to il-1β, which leads to its secretion across the plasma membrane, by inducing a ca2+ efflux from intracellular storage. moreover, the 2b protein inhibits protein transport through the golgi complex, which may help the virus evade the host immune response.

the 2b protein has also been suggested to facilitate viral evasion of the host immune response through other means. for example, the 2b protein antagonizes rig-i-mediated antiviral responses by inhibiting the expression of rig-i in an fmdv-specific manner [14]. the rna helicase lgp2 (also known as dexh-box helicase 58, dhx58) is a crucial factor involved in the host antiviral immune response [78]. the fmdv leader protein (lpro), 3c protein, and 2b protein have the ability to induce a decrease in lgp2 protein expression [79]. in addition, pv 2b variants were shown to inhibit the antiviral interferon (ifn) system [80], whereas the hav 2b protein inhibited the synthesis of ifn-β by affecting the mitochondrial antiviral signaling protein activity, thereby antagonizing the host immune response [81]. collectively, this evidence indicates that picornaviral 2b proteins can affect the host immune response, thereby promoting viral amplification or the release of viral particles. as discussed above, picornaviral 2b proteins have a viroporin or viroporin-like activity and play an important role in the picornaviral life cycle. therefore, many common applications targeting viroporins may be translatable to those targeting 2b proteins. in addition, the 2b protein may serve as a new target for the development of antiviral drugs. thus, further studies on the structure and function of the 2b protein might open up new avenues for the prevention and control of picornaviruses.
owing to its highly conserved sequence, the use of the 2b gene as a marker could effectively improve the accuracy of virus detection. li et al. [15] designed primers and taqman probes based on the 2b and 3d regions, which were successfully used in real-time polymerase chain reaction to accurately detect and quantify fmdv during infection and replication. in addition, wang et al. [16] developed a lateral-flow detection system that could rapidly and easily detect fmdv using the 2b gene. in addition to gene-based detection, the virus could be detected using a 2b antibody. biswal et al. [17] used an indirect enzyme-linked immunoassay based on a recombinant 2b protein to detect antibodies specific for fmdv. this method may be applicable not only to fmdv but also to other picornaviruses, including cvb3 and ev71. given the significant threat that picornaviruses pose to humans and animals, resulting in enormous economic damage to the livestock industry, development of picornavirus vaccines is of great significance. although inactivated virus vaccines can offer effective prevention, there are associated residual risks, including incomplete virus inactivation and virus escape during the vaccine production process [82] [83] [84]. therefore, genetically engineered vaccines are considered more suitable options to overcome these shortcomings of inactivated viral vaccines. the ev vp1 protein is located on the outer surface of the viral capsid and is thus exposed to the greatest immune pressure. accordingly, vp1 shows extreme serological variability, thus providing the most reliable molecular epidemiological information. consequently, the vp1 region of ev71 has become a focus of vaccine research for picornavirus infections [84]. however, dna constructs containing the vp1 gene of ev71 showed low levels of antigenicity. therefore, there is still a need to develop an effective adjuvant strategy to increase the antigenicity.
one possibility in this regard is the use of recombinant vaccines incorporating the 2b gene to enhance vaccine efficacy. at present, applications of the 2b gene in recombinant vaccines have mainly concentrated on fmdv. the addition of a 2b fragment to a vaccine designed with vp1 as the core has been shown to effectively enhance the vaccine efficacy [18] [19] [20] and reduce the dose and side effects [20]. these effects may be similar to those leading to a greater efficacy of the adenoviral vector vaccine fused to the fmdv 2b protein against serotype o, which is associated with the induction of specific cd4+ and cd8+ protective t cell responses [85]. therefore, future designs of other picornaviral genetically engineered vaccines might benefit from the addition of the 2b gene to increase the vaccine efficacy, including the addition of the 2b gene to a genetically engineered ev vaccine with the vp1 gene as the core. since viroporins play an important role in all stages of the viral life cycle, they are attractive antiviral therapeutic targets, and there have been great breakthroughs in this regard. by contrast, research and development of drugs targeting the 2b protein lag behind. since 2b proteins have a viroporin or viroporin-like activity, screening for anti-picornavirus drugs among existing viroporin-targeting drugs may be a viable approach. there are four main types of inhibitors of viroporin activity: adamantanes, amilorides, alkyl iminosugars, and spirane amines [86]. adamantanes (amantadine and rimantadine) inhibit the m2 channel of influenza a virus by disrupting the transmembrane network of hydrogen-bonded water molecules, thereby inhibiting viral amplification [87]. in bhk-21 cells infected with fmdv, the virus titer gradually decreased with increasing amantadine concentration, which may have been due to abrogation of the pore-forming activity of the 2b protein and ultimate inhibition of fmdv replication [28].
however, clinical trials showed that amantadine not only selected for specific resistance mutations in hepatitis c virus (hcv) p7 [88], but also caused a rapid emergence of amantadine-resistant variants of influenza a virus during monotherapy for influenza [89]. the amiloride class includes 5-(n,n-hexamethylene) amiloride and a novel inhibitor, bit225, which target hcv p7 and hiv-1 vpu and can block the viroporin ion channel activity or prevent ion channel formation, resulting in a potent antiviral effect [90] [91] [92] [93]. the alkyl iminosugars inhibit the formation of ion channels by targeting the hcv p7 viroporin [94]. finally, spirane amines, such as bl-1743, also inhibit the influenza a virus m2 protein, with an antiviral mechanism similar to that of amantadine [95]. there are also other drugs that act as viroporin inhibitors, including 1,3-dibenzyl-5-(2h-1,2,3,4-tetraazol-5-yl) hexahydropyrimidine (cd), n-(1-phenylethyl)-2-[4-(phenylsulfonyl)-1-piperazinyl]-4-quinazolinamine (lds25), and 6-methyl-1,3,8-trihydroxyanthraquinone (emodin), among others [88, 96, 97]. the mechanism of action of these viroporin inhibitors is based on the inhibition of the viroporin channel activity. therefore, these drugs may have the potential to be applied for the treatment of picornavirus infections by targeting the 2b protein. however, this application will require further detailed investigations and drug screening. nevertheless, the 2b protein has the potential to widen the range of antiviral treatment strategies. furthermore, specific degradation of complementary mrna can be triggered by small interfering rnas (sirnas) or folded short hairpin rnas (shrnas) [98], which can be explored as an rna interference strategy, a relatively novel technology that has already been applied against many important pathogens, including hiv-1, hepatitis b virus, and herpes simplex virus [99] [100] [101].
currently, shrnas targeting the highly conserved 2b gene sequence are widely used in picornavirus research, including fmdv [21, 25] , emcv [23] , and cvb3 [24] , and significant experimental viral suppression has been achieved. basically, rna interference against the 2b gene affects the stability and integrity of the whole viral genome. the high nucleotide sequence conservation makes the 2b gene an attractive target for rna interference, which may potentially be effective against multiple picornavirus types, and open the door for additional sirna drugs. to date, there have been few studies specifically focusing on inhibitors of the 2b protein. xie et al. [102] found that 4,4'-diisothiocyano-2,2'-stilbenedisulfonic acid (dids) blocked a chloride-dependent current, mediated by the ev71 2b protein, and suppressed viral amplification. however, further research is needed to uncover the underlying mechanism. despite the many challenges in drug development, new technologies such as fourier-transform infrared spectroscopy and design of molecular dynamics analogs, as well as cryo-electron microscopy and spectroscopy, are expected to greatly contribute to the development of antiviral drugs. recent studies have gradually clarified the function and the potential of the 2b protein, along with increasingly recognizing its importance in the viral life cycle. however, there are still some challenges to overcome in investigations of the picornaviral 2b protein. in particular, its strong hydrophobicity makes it difficult to achieve soluble expression. ao et al. [28] conjugated the small ubiquitin-like modifier (sumo) protein to the n-terminus of the fmdv 2b protein and successfully achieved soluble expression. therefore, this method can be tested for other picornaviral 2b proteins. 
moreover, the detailed molecular mechanism of the action of the 2b protein requires further study, along with the identification of interactions of 2b protein with host proteins, to better understand the role of the 2b protein in the pathogenesis of picornaviruses. in murine cells, the 2b protein was suggested to react with host proteins to promote rhinovirus proliferation [103] . using a yeast two-hybrid system, the fmdv 2b protein was found to interact with the host elongation factor 1γ (eef1g), and mislocalization of eef1g demonstrated that the eef1g deletion affected the synthesis of membrane proteins [104, 105] . although a yeast two-hybrid system is a common laboratory protein-screening technique, it has a low success rate and is time-consuming. alternatively, affinity purification-mass spectrometry can be used to overcome these shortcomings, which has already been widely used in studies on dengue, zika, and ebola viruses [106, 107] . at present, the development of antiviral drugs against viroporins is focused on three aspects, including viroporin and membrane fusion inhibitors, ion channel inhibitors, and targeted viroporin antibodies [9] . with respect to the biological function of the 2b protein, antiviral drugs targeting the 2b protein could be designed based on the following three approaches: broad-spectrum screening for anti-picornavirus drugs among existing viroporin inhibitors, screening for 2b protein and membrane fusion inhibitors, and screening for 2b protein pore activity inhibitors. as discussed herein, the most important basis for the function of the 2b protein is that it can be polymerized into pores, thereby changing the permeability of the membrane. therefore, the design of drugs targeting 2b protein should be based on inhibiting polymerization of the 2b protein into pores, thereby reducing its effects on cellular ion homeostasis. 
however, these designs first require detailed determination of the refined atomic structure of the 2b protein, along with the expansion of screening techniques and applications of meticulous medicinal chemistry. furthermore, to develop better antiviral drugs, it will be necessary to elucidate the exact role of the 2b protein channel in the viral life cycle. thus, the main points of focus for research on the structure and function of the 2b protein toward ultimate drug development are: (1) mechanism of increasing membrane permeability to disturb the ion balance, (2) regulation of autophagy and apoptosis, (3) inhibition of the host immune response, and (4) promotion of viral replication and release. taken together, as research aimed at further elucidation of the role of the 2b protein progresses, along with the adoption of new technologies, it is expected that more strategies will come to light for antiviral drug development and disease control. author contributions: z.l., z.z. and z.j. wrote the manuscript; q.l. and x.h. revised for its integrity and accuracy; q.l. approved the final version of this manuscript and takes responsibility for its contents. the authors declare no conflicts of interest. human parechoviruses-biology and clinical significance examples of the two-stage membrane protein folding model. viruses viroporins: structure and biological functions membrane-active peptides derived from picornavirus 2b viroporin molecular characterization of the viroporin function of foot-and-mouth disease virus nonstructural protein 2b protein 2b of coxsackievirus b3 induces autophagy relying on its transmembrane hydrophobic sequences. 
viruses topology and biological function of enterovirus non-structural protein 2b as a member of the viroporin family single point mutation in the rhinovirus 2b protein reduces the requirement for phosphatidylinositol 4-kinase class iii beta in viral replication the c-terminal region of the non-structural protein 2b from hepatitis a virus demonstrates lipid-specific viroporin-like activity encephalomyocarditis virus viroporin 2b activates nlrp3 inflammasome structural basis for host membrane remodeling induced by protein 2b of hepatitis a virus foot-and-mouth disease virus viroporin 2b antagonizes rig-i-mediated antiviral effects by inhibition of its protein expression development and validation of a duplex quantitative real-time rt-pcr assay for simultaneous detection and quantitation of foot-and-mouth disease viral positive-stranded rnas and negative-stranded rnas rapid detection of foot-and-mouth disease virus using reverse transcription recombinase polymerase amplification combined with a lateral flow dipstick detection of antibodies specific for foot-and-mouth disease virus infection using indirect elisa based on recombinant nonstructural protein 2b delivery of a foot-and-mouth disease virus empty capsid subunit antigen with nonstructural protein 2b improves protection of swine versatility of the adenovirus-vectored foot-and-mouth disease vaccine platform across multiple foot-and-mouth disease virus serotypes and topotypes using a vaccine dose representative of the adta24 conditionally licensed vaccine poly iclc increases the potency of a replication-defective human adenovirus vectored foot-and-mouth disease vaccine transgenically mediated shrnas targeting conserved regions of foot-and-mouth disease virus provide heritable resistance in porcine cell lines and suckling mice cross-inhibition to heterologous foot-and-mouth disease virus infection induced by rna interference targeting the conserved regions of viral genome specific small interfering rnas-mediated 
inhibition of replication of porcine encephalomyocarditis virus in bhk-21 cells short hairpin rna targeting 2b gene of coxsackievirus b3 exhibits potential antiviral effects both in vitro and in vivo adenovirus-vectored shrnas targeted to the highly conserved regions of vp1 and 2b in tandem inhibits replication of foot-and-mouth disease virus both in vitro and in vivo determinants for membrane association and permeabilization of the coxsackievirus 2b protein and the identification of the golgi complex as the target organelle viroporin-mediated membrane permeabilization. pore formation by nonstructural poliovirus 2b protein viroporin activity of the foot-and-mouth disease virus non-structural 2b protein coxsackievirus protein 2b modifies endoplasmic reticulum membrane and plasma membrane permeability and facilitates virus release membrane permeability induced by hepatitis a virus proteins 2b and 2bc and proteolytic processing of hav 2bc model generation of viral channel forming 2b protein bundles from polio and coxsackie viruses a peptide based on the pore-forming domain of pro-apoptotic poliovirus 2b viroporin targets mitochondria coxsackie b3 virus protein 2b contains cationic amphipathic helix that is required for viral rna replication mechanisms of membrane permeabilization by picornavirus 2b viroporin mutational analysis of different regions in the coxsackievirus 2b protein: requirements for homo-multimerization, membrane permeabilization, subcellular localization, and virus replication functional analysis of picornavirus 2b proteins: effects on calcium homeostasis and intracellular protein trafficking non-structural protein 2b of human rhinovirus 16 activates both perk and atf6 rather than ire1 to trigger er stress cellular localization and effects of ectopically expressed hepatitis a virus proteins 2b, 2c, 3a and their intermediates 2bc, 3ab and 3abc membrane integration of poliovirus 2b viroporin poliovirus 2b insertion into lipid monolayers and pore 
formation in vesicles modulated by anionic phospholipids release of intracellular calcium stores facilitates coxsackievirus entry into polarized endothelial cells calcium signals and calpain-dependent necrosis are essential for release of coxsackievirus b from polarized intestinal epithelial cells a dual role for ca(2+) in autophagy regulation the coxsackievirus 2b protein suppresses apoptotic host cell responses by manipulating intracellular ca 2+ homeostasis enterovirus protein 2b po(u)res out the calcium: a viral strategy to survive? the coxsackievirus 2b protein increases efflux of ions from the endoplasmic reticulum and golgi, thereby inhibiting protein trafficking through the golgi the rotavirus nsp4 viroporin domain is a calcium-conducting ion channel picornaviruses and apoptosis: subversion of cell death regulation of apoptosis during flavivirus infection virus infection and death receptor-mediated apoptosis autophagy promotes the replication of encephalomyocarditis virus in host cells coxsackievirus b3 infection induces autophagic flux, and autophagosomes are critical for efficient viral replication viral reorganization of the secretory pathway generates distinct organelles for rna replication remodeling the endoplasmic reticulum by poliovirus infection and by individual viral proteins: an autophagy-like origin for virus-induced vesicles a cytopathic and a cell culture adapted hepatitis a virus strain differ in cell killing but not in intracellular membrane rearrangements induction of intracellular membrane rearrangements by hav proteins 2c and 2bc subversion of the cellular autophagy pathway by viruses foot-and-mouth disease virus induces autophagosomes during cell entry via a class iii phosphatidylinositol 3-kinase-independent pathway autophagy hijacked through viroporin-activated calcium/calmodulin-dependent kinase kinase-beta signaling is required for rotavirus replication coxsackievirus b4 uses autophagy for replication after calpain activation in rat 
primary neurons cytochrome c binds to inositol (1,4,5) trisphosphate receptors, amplifying calcium-dependent apoptosis viral calciomics: interplays between ca 2+ and virus enterovirus 71 2b induces cell apoptosis by directly inducing the conformational activation of the proapoptotic protein bax nod proteins: regulators of inflammation in health and disease rigorous detection: exposing virus through rna sensing regulation of adaptive immunity by the innate immune system nlrp3 inflammasome activation: the convergence of multiple signalling pathways on ros production? inflammasomes: current understanding and open questions rhinovirus-induced calcium flux triggers nlrp3 and nlrc5 activation in bronchial cells involvement of nlrp3 inflammasome in cvb3-induced viral myocarditis influenza virus activates inflammasomes via its intracellular m2 ion channel response of host inflammasomes to viral infection inhibition of protein trafficking by coxsackievirus b3: multiple viral proteins target a single organelle the different tactics of foot-and-mouth disease virus to evade innate immunity foot-and-mouth disease virus 3c protease induces fragmentation of the golgi compartment and blocks intra-golgi transport human rhinovirus 16 causes golgi apparatus fragmentation without blocking protein secretion inhibition of the secretory pathway by foot-and-mouth disease virus 2bc protein is reproduced by coexpression of 2b with 2c, and the site of inhibition is determined by the subcellular location of 2c protective role of lgp2 in influenza virus pathogenesis foot-and-mouth disease virus infection inhibits lgp2 protein expression to exaggerate inflammatory response and promote viral replication poliovirus intrahost evolution is required to overcome tissue-specific innate immune responses hepatitis a virus protein 2b suppresses beta interferon (ifn) gene transcription by interfering with ifn regulatory factor 3 activation new insights into physiopathology of immunodeficiency-associated 
vaccine-derived poliovirus infection; systematic review of over 5 decades of data causes of impaired oral vaccine efficacy in developing countries recent progress towards novel ev71 anti-therapeutics and vaccines. viruses increased efficacy of an adenovirus-vectored foot-and-mouth disease capsid subunit vaccine expressing nonstructural protein 2b is associated with a specific t cell response viroporins: structure, function and potential as antiviral targets inhibitors of the m2 proton channel engage and disrupt transmembrane networks of hydrogen-bonded waters resistance mutations define specific antiviral effects for inhibitors of the hepatitis c virus p7 ion channel combination chemotherapy for influenza a phase 1b/2a study of the safety, pharmacokinetics and antiviral activity of bit225 in patients with hiv-1 infection antiviral efficacy of the novel compound bit225 against hiv-1 release from human macrophages understanding the inhibitory mechanism of bit225 drug against p7 viroporin using computational study a novel hepatitis c virus p7 ion channel inhibitor, bit225, inhibits bovine viral diarrhea virus in vitro and shows synergism with recombinant interferon-alpha-2b and nucleoside analogues the hepatitis c virus p7 protein forms an ion channel that is inhibited by long-alkyl-chain iminosugar derivatives growth impairment resulting from expression of influenza virus m2 protein in saccharomyces cerevisiae: identification of a novel inhibitor of influenza virus structure-guided design affirms inhibitors of hepatitis c virus p7 as a viable class of antivirals targeting virion release emodin inhibits current through sars-associated coronavirus 3a protein specific inhibition of gene expression by small double-stranded rnas in invertebrate and vertebrate systems rna interference approaches for treatment of hiv-1 infection fatality in mice due to oversaturation of cellular microrna/short hairpin rna pathways an sirna-based microbicide protects mice from lethal herpes 
simplex virus 2 infection dids blocks a chloride-dependent current that is mediated by the 2b protein of enterovirus 71 amino acid changes in proteins 2b and 3a mediate rhinovirus type 39 growth in mouse cells kinectin-dependent assembly of translation elongation factor-1 complex on endoplasmic reticulum regulates protein synthesis eef1g interaction with foot-and-mouth disease virus nonstructural protein 2b: identification by yeast two-hybrid system protein interaction mapping identifies rbbp6 as a negative regulator of ebola virus replication comparative flavivirus-host protein interaction mapping reveals mechanisms of dengue and zika virus pathogenesis this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license key: cord-009614-lbjesv8y authors: durmuş tekir, saliha d.; ülgen, kutlu ö. title: systems biology of pathogen‐host interaction: networks of protein‐protein interaction within pathogens and pathogen‐human interactions in the post‐genomic era date: 2012-11-29 journal: biotechnol j doi: 10.1002/biot.201200110 sha: doc_id: 9614 cord_uid: lbjesv8y infectious diseases comprise some of the leading causes of death and disability worldwide. interactions between pathogen and host proteins underlie the process of infection. improved understanding of pathogen‐host molecular interactions will increase our knowledge of the mechanisms involved in infection, and allow novel therapeutic solutions to be devised. complete genome sequences for a number of pathogenic microorganisms, as well as the human host, have led to the revelation of their protein‐protein interaction (ppi) networks. in this post‐genomic era, pathogen‐host interactions (phis) operating during infection can also be mapped. detailed systematic analyses of ppi and phi data together are required for a complete understanding of pathogenesis of infections. 
here we review the striking results recently obtained during the construction and investigation of these networks. emphasis is placed on studies producing large‐scale interaction data by high‐throughput experimental techniques. despite immense technological advances in medicine, pathogenic organisms remain the source of much human morbidity and mortality. hiv/aids, acute lower respiratory tract infections, hemorrhagic fever, diarrheal diseases, tuberculosis and malaria are particularly notorious for high mortality rates [1] [2] [3] . the continuous emergence of new diseases and drug-resistant pathogens has heightened the global burden of infectious diseases in the 21st century [1, 4] . to tackle such biological threats, an improved understanding of pathogenic microorganisms and their interactions with host organisms is needed since pathogen-host molecular interactions have crucial roles in initiating, sustaining, or preventing infection. pathogenic microorganisms communicate with human cells through interactions with human proteins both on the surface of the cell and within the interior of the cell. these interactions allow the microorganisms to enter the host cell and manipulate cellular mechanisms in order to use the host cell's capabilities to their own advantage, resulting in infection in the host organism. detailed knowledge of pathogen-host protein interactions may enable us to comprehend the mechanisms of infection and to identify better strategies to prevent or cure infection [5, 6] . however, the identification of new drug and vaccine targets for infectious diseases is only possible when the molecular machinery within individual pathogenic and host organisms is understood. for instance, anti-infection therapeutics should target essential genes in the pathogens which have no homology with human genes [7] . the first complete genome sequence, the dna sequence of the virus bacteriophage phix174, was published in 1977 [8] . 
following the sequencing of the bacterial pathogen haemophilus influenzae in 1995 [9] and the human genome in 2000 [10] , sequence data for prokaryotic and eukaryotic genomes have appeared at an accelerated rate. today, genomic data for most of the pathogen and host organisms are available [11] . these data are used to study individual genes and corresponding proteins as well as to identify intra- and interspecies connections between proteins. in the light of these advances, the initial steps towards complete understanding of infection mechanisms through protein interactions have been recently published. in this review, efforts toward the systematic determination and analysis of protein interaction networks underlying infection pathogenesis are summarized (mainly in chronological order) to present the current picture of the research on infectious diseases. from a classical perspective, a protein is a functional unit that specifies a small, but discrete, part of the cellular physiology of an organism. in the post-genomic era, a protein is seen to function as an element within the network of its interactions, and its role should be evaluated within this network together with its interacting partners [12] . advances in genomics and proteomics have been followed by the first large-scale efforts to identify functional networks of interacting proteins using the two-hybrid method [13] [14] [15] , pull-down assays [16, 17] , and protein chips [18] . to increase our understanding of the mechanisms of infection, protein-protein interaction (ppi) networks of pathogenic organisms should be determined in order to capture their functional and structural organizations. pathogenic ppi maps reveal biological pathways and processes, allowing prediction of protein functions and discovery of new drug and vaccine targets. the first genome-wide protein interaction networks were determined for viruses [19] [20] [21] . 
the first large-scale bacterial networks [22] [23] [24] followed successes in eukaryote mapping [15, 25-27] . today, the genome-wide ppi maps for a number of pathogens and hosts are available in public databases: bind [28] , biogrid [29] , dip [30] , hprd [31] , intact [32] , mint [33] , mips [34] , reactome [35] and string [36] . primarily due to their small genome size, whole genome ppi maps were first constructed for viruses. the first interaction map of a whole proteome was determined for escherichia coli bacteriophage t7, mapping 25 interactions among viral proteins [19] . subsequently, genome-wide analyses of important human pathogens, hepatitis c virus [20, 37] , vaccinia virus [21] , herpesviruses [38, 39] , and sars coronavirus [40, 41] were performed through intraviral ppi maps. hepatitis c virus (hcv), a flaviviridae family member causing severe liver disease, is a positive-sense single-stranded rna virus. it encodes only a single polyprotein which is co- or post-translationally processed into at least 10 viral proteins [42] . a controlled two-hybrid strategy based on a random genomic hcv library screen was used by flajolet et al. [20] , resulting in the identification of known and novel ppis. interactions among structural and non-structural proteins were revealed in the study, leading to the conclusion that almost all of the viral proteins encoded by the genome function in the hcv life-cycle, as in the cases of other members of the flaviviridae [43] . the roles of these functional interactions were discussed within the framework of the constructed genome-wide interaction map. interacting domains of the viral polyprotein were also identified to shed light on the development of anti-viral agents [20] . another genome-wide ppi map of hcv was then generated for the viral non-structural proteins [37] . 
vaccinia virus, well-known as a smallpox vaccine and also the source of potential recombinant vaccines against cancer and infectious diseases, is a member of the poxviridae family. it is a large, double-stranded dna virus. poxviruses replicate themselves in the cytoplasm of the host cells without depending on the host's transcriptional machinery. the large genome of vaccinia virus can potentially express more than 200 proteins [44, 45] . mccraith et al. [21] performed a comprehensive two-hybrid analysis of full-length vaccinia virus proteins and detected 37 ppis (including 28 novel interactions) among both characterized and uncharacterized proteins. many of the ppis mapped involved one partner which was known to function in a specific process, coupled with another of unknown function, allowing functions to be assigned to previously unannotated proteins within dna replication, transcription, virion structure, or host evasion. another double-stranded dna virus family is herpesviridae, whose members encode 70-170 proteins. herpesviruses cause human diseases such as kaposi sarcoma, b-cell lymphomas, chickenpox, shingles, and nasopharyngeal carcinoma [46] [47] [48] . the genome-wide intraviral protein interaction maps for three members of this family, kaposi sarcoma-associated herpesvirus (kshv), varicella-zoster virus (vzv), and epstein-barr virus (ebv), were generated by two-hybrid and analyzed comprehensively to reveal viral network properties [38, 39] . in the work of uetz et al. [38] , 123 ppis for kshv and 173 ppis for vzv were identified, the largest dataset published to date, allowing the construction of the first viral networks. topological network analyses of these interactome maps indicated that the viral networks appear as a single, highly coupled module (fig. 1) with relatively many hubs and few peripheral nodes [38] , in contrast to scale-free cellular networks with well-separated functional modules [49, 50] . 
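the topological terms used above (hubs, peripheral nodes, degree distribution) can be made concrete with a short sketch. it is a minimal illustration in plain python, assuming an interaction list given as unordered protein pairs; the protein names and the hub threshold are hypothetical and are not taken from the kshv or vzv datasets.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Build undirected adjacency sets from (protein_a, protein_b) pairs.

    Self-interactions are skipped so that degree counts distinct partners.
    """
    adj = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)

def degree_distribution(adj):
    """Map each degree k to the number of proteins with exactly k partners."""
    dist = defaultdict(int)
    for partners in adj.values():
        dist[len(partners)] += 1
    return dict(sorted(dist.items()))

def hubs(adj, min_degree):
    """Proteins with at least min_degree partners (threshold is an assumption)."""
    return sorted(p for p, partners in adj.items() if len(partners) >= min_degree)

def peripheral_nodes(adj):
    """Proteins attached to the network through a single interaction."""
    return sorted(p for p, partners in adj.items() if len(partners) == 1)

# Hypothetical toy network, not real viral data.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E"), ("E", "F")]
toy_adj = build_adjacency(toy_edges)
toy_dist = degree_distribution(toy_adj)
toy_hubs = hubs(toy_adj, min_degree=3)
toy_peripheral = peripheral_nodes(toy_adj)
```

in a scale-free network the degree distribution is dominated by low-degree proteins with a small number of hubs, whereas the viral maps described above were reported to contain relatively many hubs and few peripheral nodes; comparing the output of `degree_distribution` across networks is one simple way to look at that difference.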
just after this study was published, calderwood et al. [39] reported the detection of 43 ppis among ebv proteins. the construction of a ppi map for ebv by merging these interactions with already published ones resulted in a network of 52 proteins with 60 interactions. this large-scale network allowed the prediction of functions of uncharacterized proteins, further defining viral mechanisms. in these consecutive studies [38, 39] , core proteins common to all herpesviruses and noncore ones specific to each strain were investigated thoroughly. the severe acute respiratory syndrome coronavirus (sars-cov) is a positive-sense single-stranded rna virus belonging to the family of the largest rna viruses known, coronaviridae. its genome encodes 14 open reading frames expressing up to 30 structural and non-structural proteins that have roles in viral replication, assembly, and other functions for viral amplification in host cells [40, 41, 51] . for a genome-wide analysis of ppis of sars-cov, interactions between all sars-cov proteins were determined by two-hybrid in two studies [40, 41] , producing 65 and 40 interactions, respectively. intraviral ppis were analyzed to elucidate the functions of the proteins as well as to identify the essential proteins in viral replication. von brunn et al. [40] compared the intraviral network topology of sars-cov with a previously defined viral network [38] and cellular networks [52] [53] [54] , concluding that the sars-cov network shared similarities with the kshv network [38] . insights gained into molecular mechanisms and topological network properties provided by the genome-wide analyses of intraviral ppi maps (table 1 ) may be used as a basis for further characterization of the functions and mechanisms of viral proteins, especially for other members of the same virus families. 
having successfully built genome-wide ppi maps for viruses, similar two-hybrid methodology was applied to construct ppi networks for the larger, more complex genomes of pathogenic bacteria. the first prokaryotic ppi map was built for helicobacter pylori [22] . other large-scale prokaryotic networks eventually emerged for campylobacter jejuni [55] , treponema pallidum [56] , mycobacterium tuberculosis [57] , and bacillus subtilis [58] . genome-scale analyses of interacting proteins that assemble into protein complexes were performed for e. coli [23, 24] and mycoplasma pneumoniae [59] . the first large-scale intrabacterial ppi map was constructed for the human gastric pathogen and gram-negative bacterium h. pylori, identifying 1280 interactions between 46.6% of all 261 bacterial proteins using the two-hybrid method [22] . the comparison of these h. pylori ppis with previously described interactions between orthologous e. coli proteins resulted in prediction of protein functions within biological pathways such as chemotaxis and urease activity, essential for h. pylori pathogenicity. in this study, the interacting domains of h. pylori proteins were also identified and used in protein function predictions. interacting domains may serve in mapping new functional domains, providing crucial information for antibacterial drug design studies. the gram-negative bacterium e. coli, the main cause of urinary tract infections and a model bacterial system, is one of the best characterized and earliest studied organisms [60] [61] [62] . however, no large-scale analysis of protein complexes in e. coli was performed until the studies of butland et al. [23] and arifuzzaman et al. [24] . first, 716 binary interactions involving 83 essential and 152 nonessential proteins were identified by pull-down assay using tandem affinity purification-mass spectrometry, targeting 1000 orfs (about one-quarter of the e. coli genome) [23] . 
a small number (15%) of these interactions were already available in dip, bind, and string. ten newly described e. coli ppis were found to be orthologous to the interactions reported for h. pylori [22] . the novel interactions were analyzed for functional annotations of uncharacterized proteins, allocating them within ribosome function, rna processing, rna binding, and so on. the graph theoretical analysis of the ppi map of e. coli revealed scale-free behavior and a high correlation between connectivity and the degree of conservation. the genome-wide ppi map of e. coli k-12 strain with 11511 interactions among 2667 proteins was then constructed by a similar method [24] . the comprehensive analysis of this large-scale network also validated the scale-free nature and the connectivity-conservation correlation found previously [23] . arifuzzaman et al. [24] identified 107 functional units which have roles in metabolic pathways, transcriptional and translational machinery, recombination and flagella assembly. analysis of ppis based on this functional unit categorization provided further functional annotations. the gram-negative, food-borne pathogen c. jejuni is the major cause of gastroenteritis. the proteome-level analysis revealed 11687 interactions involving 80% of 1654 c. jejuni proteins [55] , the most comprehensive bacterial ppi map determined by two-hybrid. a scale-free network was obtained after removing interactions with low confidence scores. this ppi map of c. jejuni was used to identify evolutionarily conserved subnetworks through comparison with protein networks of h. pylori [22] , e. coli [23] and saccharomyces cerevisiae in dip. further analyses of the identified conserved sub-networks allowed the prediction of new c. jejuni interactions using orthologous interactions. this comparative analysis also enabled the identification of essential c. jejuni genes based on their orthology to essential genes in other organisms. 
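the idea of predicting interactions from orthologous interactions can be sketched as follows. this is only an illustration of the general principle (an interaction observed in one organism is transferred to a second organism when both partners have orthologs there); it is not the procedure used in the cited studies, and all protein identifiers and the ortholog table are hypothetical.

```python
def predict_orthologous_interactions(source_ppis, orthologs):
    """Transfer interactions from a source organism to a target organism.

    source_ppis: iterable of (protein_a, protein_b) pairs in the source organism.
    orthologs:   dict mapping a source protein to its ortholog in the target
                 organism (one-to-one mapping is a simplifying assumption).
    Returns a set of frozensets, each holding two target proteins.
    """
    predicted = set()
    for a, b in source_ppis:
        if a in orthologs and b in orthologs:
            ta, tb = orthologs[a], orthologs[b]
            if ta != tb:  # ignore pairs that collapse onto a single protein
                predicted.add(frozenset((ta, tb)))
    return predicted

# Hypothetical identifiers: "hp*" for the source organism, "cj*" for the target.
source_ppis = [("hp1", "hp2"), ("hp2", "hp3"), ("hp3", "hp4")]
orthologs = {"hp1": "cj1", "hp2": "cj2", "hp3": "cj3"}  # hp4 has no ortholog
predicted = predict_orthologous_interactions(source_ppis, orthologs)
```

predictions obtained this way are hypotheses only; the pair involving hp4 is dropped because one partner lacks an ortholog, which is why coverage of such predictions depends heavily on how complete the ortholog table is.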
these comprehensive interactome data were next used to predict protein roles and to map functional pathways such as chemotaxis. the causative agent of syphilis, t. pallidum, has one of the smallest genomes known in extracellular bacteria, encoding 1039 proteins [63] . the global ppi network of t. pallidum, involving 3649 interactions connecting 726 bacterial proteins, was identified by two-hybrid [56] . the high-confidence subset connects 576 proteins by 991 interactions. in that study, an integrated network of dna-metabolism related processes was constructed and 18 proteins were functionally annotated within this network. additionally, various orthologous interactions were predicted for completely sequenced genomes, allowing the description of phylogenetically conserved interaction patterns. the human pathogen m. pneumoniae, which causes atypical pneumonia, also has one of the smallest genomes among self-replicating organisms, with 689 protein-encoding genes, making it a good model organism to study proteome organization in prokaryotes [64] . a proteome-wide analysis was performed by tandem affinity purification-mass spectrometry, identifying 62 homo-multimeric and 116 hetero-multimeric protein complexes [59] . about a third of the hetero-multimeric complexes identified were observed to interact with further proteins, forming 35 larger multiprotein complexes; this implies a higher level of proteome organization and protein multifunctionality, and allows functional annotation of assemblies as well as prediction of the biological roles of individual proteins within the complexes. m. tuberculosis causes millions of deaths each year through tuberculosis infection [65] . after computational efforts to construct large-scale ppi maps of m. tuberculosis [66, 67] , its genome-wide network was identified experimentally by two-hybrid [57] . this global network is composed of 8042 interactions among 2907 proteins which represent 74.1% of the whole proteome. 
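as a rough worked example, the network sizes quoted above can be turned into a mean number of partners per protein using the standard relation for an undirected network, mean degree = 2e/n. the sketch assumes that each reported interaction is counted once and joins two distinct proteins, which the text does not state explicitly, so the values are indicative only.

```python
def mean_degree(n_interactions, n_proteins):
    """Average number of partners per protein in an undirected network (2E / N)."""
    return 2 * n_interactions / n_proteins

# (interactions, proteins) as quoted in the text above.
reported = {
    "t. pallidum (all)": (3649, 726),
    "t. pallidum (high-confidence)": (991, 576),
    "m. tuberculosis": (8042, 2907),
}
mean_degrees = {name: mean_degree(e, n) for name, (e, n) in reported.items()}

# Proteome size implied by 2907 proteins representing 74.1% of the proteome.
implied_mtb_proteome = 2907 / 0.741

for name, k in mean_degrees.items():
    print(f"{name}: mean degree {k:.2f}")
print(f"implied m. tuberculosis proteome size: {implied_mtb_proteome:.0f}")
```

the drop in mean degree between the full and the high-confidence t. pallidum networks shows how strongly such summary statistics depend on the confidence filter applied, which is worth keeping in mind when topological properties are compared across studies.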
the topological properties of the undirected network of these interactions were calculated and compared with those of the previously defined prokaryotic ppi networks [22-24, 55, 56]. similar scale-free behavior following a power-law distribution was observed. notably, the networks obtained by pull-down assay [23, 24] differ in clustering coefficient from the networks obtained by two-hybrid analysis [22, 55, 56]. moreover, wang et al. [57] performed a cross-species network comparison of m. tuberculosis interactions with the available large-scale ppi data [22-24, 55, 56] and identified conserved sub-networks. additionally, the highly connected critical proteins and the mechanisms of the protein secretion pathways that have roles in its pathogenesis were revealed. a large-scale ppi network was recently constructed by two-hybrid for the gram-positive bacterium b. subtilis (which is rarely pathogenic) [58]. this network of 793 interactions involves 287 bacterial proteins. because b. subtilis is a model organism, many studies have been performed to characterize the biological functions of its ppis in cellular processes [68-70]. however, many processes remained uncharacterized. hence marchadier et al. [58] performed a comprehensive analysis integrating transcriptomic data, focusing on cell division, cell responses to stresses, the bacterial cytoskeleton, dna replication and chromosome maintenance. these sequential efforts to construct large-scale ppi networks for prokaryotes (table 1) constitute the first comprehensive description of the intraspecies mechanisms of bacterial pathogens. the protozoan pathogen plasmodium falciparum causes malaria, which results in the deaths of nearly a million people each year [71]. a comprehensive protein interaction map of this pathogen was generated by two-hybrid, identifying a highly interconnected, scale-free network of 2823 interactions within 1267 proteins (~25% of the predicted p. 
falciparum proteins) [72]. in this network, 33% of the interactions are between two uncharacterized proteins whereas 49% of the interactions include one such protein. bioinformatic analysis of this network yielded functional annotations of the proteins within the following processes: chromatin modification, transcription, messenger rna stability, ubiquitination, and invasion of host cells. more detailed studies of ppis within p. falciparum are required in order to unravel its pathogenesis mechanisms thoroughly. despite the increasing rate of identification of genome-wide ppi networks, they remain unconstructed for most pathogens. in the light of accelerating advances in genomics, proteomics, and interactomics, large-scale maps for many more organisms are expected to be built in the near future. increasing numbers of ppi networks will allow the comparison of networks across diverse organisms, resulting in generalized conclusions about pathogenic molecular mechanisms. the first examples of such comparative studies have been highlighted in sections 2.1 and 2.2 above. integration of several high-throughput interaction datasets to generate more detailed networks is also possible, as indicated by recent examples for the e. coli system [73, 74]. the number of such integrated networks is expected to increase, owing to the large number of diverse data sets. these will be invaluable in defining whole proteomic maps of the pathogens. one of the most striking results of bioinformatic analyses on the constructed ppi maps is the identification of essential proteins functioning within pathogens. these proteins should be examined thoroughly to test their potential as novel therapeutic targets. the exploration of genome-wide ppi maps of the pathogens permits the assignment of unannotated proteins to biological pathways with function prediction. the proteins annotated to the host invasion processes may provide a launching point for pathogen-host interaction studies. 
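the topological measures used throughout the studies summarized above (degree distribution and clustering coefficient) can be computed from a plain edge list. the following python sketch is only an illustration on a hypothetical five-protein network; the function names are assumptions of this example, and a real scale-free assessment would additionally require fitting a power law to the degree distribution, which is not shown here.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_graph(edges):
    """Undirected adjacency sets from an edge list (self-loops are ignored)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def degree_distribution(adj):
    """Return {k: fraction of nodes with degree k}."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    n = len(adj)
    return {k: c / n for k, c in sorted(counts.items())}


def average_clustering(adj):
    """Mean local clustering coefficient (nodes of degree < 2 count as 0)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # number of edges that exist among the neighbours of this node
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


# hypothetical toy network (identifiers are illustrative only)
edges = [("p1", "p2"), ("p1", "p3"), ("p2", "p3"), ("p1", "p4"), ("p4", "p5")]
adj = build_graph(edges)
print(degree_distribution(adj))
print(round(average_clustering(adj), 4))
```

for a scale-free network the fraction of nodes with degree k falls off roughly as a power of k, so the first output would appear approximately linear on a log-log plot for a sufficiently large map.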
biochemical interactions of pathogens with their hosts are necessary to invade the host organism. these connections between pathogens and hosts include interactions between proteins, nucleotide sequences, and small ligands [75, 76]. however, the protein interactions of pathogen-host systems have been identified as the most important, and therefore the most studied, type of pathogen-host interactions (phis) [76, 77]. since these interspecies crosstalks determine the pathogenesis, focusing on the whole phi system, instead of investigating a pathogen or host individually, may allow us to capture critical mechanisms (i.e. strategies used by pathogens and host immune responses) during infection that cannot be provided by traditional methods. owing to a lack of sufficient experimental phi data until recent years, many computational phi prediction methods have been developed [78-84]. these studies focused mainly on interactions of p. falciparum and human immunodeficiency virus (hiv), as these are some of the most threatening pathogens to humans. very recently, experiments have been carried out to determine the first large-scale molecular interactions between human and viruses [39, 85, 86] and bacteria [87, 88]. as a result of the increase in data available for pathogen-host systems, phi-specific databases have been introduced, such as phi-base [89], virusmint [90], virhostnet [91], patric [92], and phisto [93]. although these advances in data archiving are promising, most data relevant to phi are still buried in the biomedical literature. a few efforts have been made to obtain hidden phis from the literature by text mining [94-96]. as in the case of intraspecies pathogen ppis, large-scale phi data were generated for viral systems before bacterial systems (table 2). the first examples are for commonly observed human pathogens, ebv [39], hcv [85] and influenza a virus (h1n1 and h3n2) [86], and then recently for hiv [97]. 
in calderwood et al. [39], protein interactions between the herpesvirus ebv and human were mapped by two-hybrid in conjunction with ebv intraviral ppi mapping, providing 173 phis between 40 ebv proteins and 112 human proteins. a systematic analysis of these interactome maps of ppis and phis enabled hypotheses of the roles of ebv proteins in pathogenesis to be generated. furthermore, intraspecies protein interaction data for human were integrated from databases (bind, dip, hprd, mips) and from the literature [52, 53] to analyze the organization of the human proteins targeted by ebv within human molecular machinery. it was found that ebv proteins tend to target human proteins which are highly connected (hubs) and central to many paths (bottlenecks) in the human ppi network. on the other hand, the degree distribution of the ebv-human protein interaction network could not be fitted to any model because of its incompleteness (fig. 2). attempts to analyze incomplete maps of ppis and phis are still able to supply a partial understanding of mechanisms underlying infection. a similar thorough analysis was earlier performed with herpesviral protein networks of kshv and vzv and their interaction with the human proteome [38]. in that study, protein interactions between herpesviruses and human were predicted using the interacting orthologs of both proteins in other organisms [54]. combined virus-human networks were constructed by starting with the viral networks, adding their human protein targets, and then adding the cellular interactions among the targeted human proteins. the topological analyses of the combined herpesviruses-human networks revealed properties distinct from those of both the viral and the human interactomes, providing insights into the impact of the two organisms on each other [38]. a proteome-wide phi map for the flavivirus hcv was generated by two-hybrid and then by literature mining of previously found interactions between hcv and human [85]. 
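the hub and bottleneck notions used above correspond to high degree and high betweenness centrality in the human ppi network. a minimal python sketch of this comparison is given below; it implements brandes' algorithm for unweighted graphs in pure python, and the five-protein human network, the set of virus-targeted proteins, and all function names are hypothetical and serve only to illustrate the calculation, not to reproduce the analyses of the cited studies.

```python
from collections import deque


def to_adjacency(edges):
    """Undirected adjacency sets; every node appears as a key."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def betweenness(adj):
    """Brandes' algorithm for unweighted, undirected graphs (unnormalized)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)   # number of shortest paths from s
        sigma[s] = 1
        dist = dict.fromkeys(adj, -1)
        dist[s] = 0
        queue = deque([s])
        while queue:                    # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:                    # accumulate dependencies backwards
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # each undirected shortest path was counted from both of its endpoints
    return {v: b / 2 for v, b in bc.items()}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# hypothetical human ppi network and hypothetical set of targeted proteins
human_adj = to_adjacency([("h1", "h2"), ("h2", "h3"), ("h3", "h4"), ("h3", "h5")])
targeted = {"h2", "h3"}
others = set(human_adj) - targeted

bc = betweenness(human_adj)
print("mean degree:", mean(len(human_adj[p]) for p in targeted),
      "vs", mean(len(human_adj[p]) for p in others))
print("mean betweenness:", mean(bc[p] for p in targeted),
      "vs", mean(bc[p] for p in others))
```

in practice such a difference would be judged against a statistical test or a randomized background rather than by comparing means alone.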
a map of 481 interactions between 11 hcv proteins and 421 human proteins was generated (314 phis by two-hybrid). 65% of the interactions in this phi network were novel. the integrated human network of 44 223 ppis among 9520 proteins [98] was used to evaluate the interplay between hcv and human. very similar behavior to ebv [39] was observed for hcv in terms of attacking hub and bottleneck proteins in the human network. to assess the human pathways targeted by hcv, kegg functional annotation pathways [99] were used. four pathways were found to be enriched in hcv-targeted human proteins. three of them, the insulin, tgf-β and jak/stat pathways, were already associated with hcv clinical syndromes. the last enriched pathway, focal adhesion, is a novel observation as a human pathway affected during hcv infection [85]. influenza a is a member of the negative-sense single-stranded rna viruses of the orthomyxoviridae family. it infects multiple species and is the source of all flu pandemics. for the h1n1 a/pr/8/34 strain of influenza virus, 31 intraviral ppis among 10 viral proteins and 135 phis between 10 viral and 87 human proteins, most of which are expressed in primary human bronchial cells, were detected by two-hybrid [86]. some of the phis identified had been published previously [100]. the topology of the constructed intraviral network revealed a highly interconnected nature, as observed previously for other viral networks [38, 101]. in the case of the influenza a-human interaction network, important properties concerning the connectivity of proteins were observed. first, viral proteins interact with a significant number of human proteins, reflecting the multifunctionality of the small number of proteins encoded by rna viruses. second, each of 24 human proteins connects with two or more viral proteins, forming virus-human multiprotein complexes. 
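pathway enrichment of the kind described above for hcv-targeted human proteins is commonly assessed with a one-sided hypergeometric test. the python sketch below shows that calculation under the assumption that this test is appropriate; the source does not state which statistic was used in [85], and the numbers in the example are invented for illustration.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, targets, overlap):
    """P(X >= overlap) when `targets` proteins are drawn without replacement
    from `population` proteins, `pathway_size` of which belong to the pathway."""
    upper = min(pathway_size, targets)
    total = comb(population, targets)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, targets - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# invented example: 20 annotated human proteins, 5 of them in the pathway,
# 4 proteins targeted by the virus, 3 of the targeted proteins in the pathway
p_value = hypergeom_enrichment_p(population=20, pathway_size=5, targets=4, overlap=3)
print(round(p_value, 4))
```

when many pathways are tested at once, the resulting p-values would normally be corrected for multiple testing before a pathway is called enriched.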
additionally, it was observed that viral proteins generally target human proteins which are highly connected within their own network, as was the case in the herpesvirus-human system [39]. in shapira et al. [86], another phi network was identified for a second strain of influenza virus, h3n2 a/udorn/72, by the same experimental approach. this phi network consists of 81 interactions between 10 viral and 66 human proteins, reflecting a similar nature to the network for the h1n1 strain-human system. this confirms that the functions of influenza virus proteins are conserved across strains. besides direct physical interactions between viral and human proteins, host responses in bronchial cells to influenza infection were identified by expression profiling, generating a regulatory map of interactions between influenza proteins and their human targets. comprehensive analysis of the physical and regulatory maps of the phi system elucidated human mechanisms involved in infection. for example, nf-κb, mitogen-activated protein kinase, apoptosis, and wnt signaling pathways are regulated through transcriptional and/or physical interactions during influenza a infection. one of the most dangerous human pathogens, hiv, belongs to retroviridae, a family of positive-sense single-stranded rna viruses. acquired immunodeficiency syndrome-causing hiv has been extensively studied since its first observation near the end of the 20th century [102-105]. similar to other rna viruses, hiv has a small genome and depends largely on human cellular machinery to be replicated. identifying the physical contacts between hiv and human proteins during hiv replication is critically important for a full understanding of hiv infection. as hiv-1 is one of the most studied pathogens, many phi data are available for it in virusmint and phisto. the current phi data have been produced mainly by small-scale experiments [106-108]. 
very recently, a global phi network was generated for hiv-human protein complexes by affinity tagging and purification mass spectrometry, producing 497 phis between 16 hiv-1 proteins and 435 human proteins [97]. it was observed that hiv-targeted human proteins are highly conserved across primates. the novel interactions identified in that study require further work to detail their biological significance in terms of hiv infection. besides whole proteins, domains of the interacting proteins were investigated and the enriched domain types in targeted human proteins were indicated to facilitate future structural modeling studies of the hiv-human system. the first large-scale interaction networks between viruses and humans [39, 85, 86, 97] provide crucial clues about viral infections, verifying the critical importance of phi analyses in infection research. until very recent years, phi data were scarce for bacterial systems because of the lack of any large-scale experiments. the first extensive bacterial phi networks were identified for the important human pathogens bacillus anthracis, francisella tularensis, and yersinia pestis [87], and then another high-throughput experimental study generating phi data of y. pestis was reported [88]. b. anthracis (gram-positive), y. pestis and f. tularensis (both gram-negative) are respiratory pathogens causing anthrax, bubonic plague, and acute pneumonic disease, respectively. using a two-hybrid assay, large-scale interaction data were generated between these bacteria and human, producing 3073 phis between 943 b. anthracis proteins and 1748 human proteins, 4059 phis between 1218 y. pestis proteins and 2108 human proteins, and 1383 phis between 349 f. tularensis proteins and 999 human proteins [87]. 
the first conclusion of computational analyses of these comprehensive bacteria-human networks, in combination with the integrated human ppi network from databases bind, dip, hprd, intact, mint, mips, and reactome, was that bacterial proteins tend to target hubs and bottlenecks in the human network. secondly, the roles of human proteins targeted by these bacteria were investigated using their gene ontology annotations [109]. the tendency of all three pathogens to target human proteins involved in immune responses was observed, as previously reported [110-112]. besides being effectors of immune signaling, the bacteria-targeted human proteins also have crucial roles in apoptosis [87]. thirdly, the conserved protein interaction modules of the three phi networks were computed [113, 114] for a more systematic comparative analysis. conserved modules revealed common attacks by the bacterial pathogens on the same human pathways. subsequently, another phi map was generated for plague-causing y. pestis by a different two-hybrid strategy, choosing only potential virulence factors as bait proteins [88]. this yielded 204 phis between 66 y. pestis proteins and 109 human proteins, and then 23 previously reported phis were integrated to construct a comprehensive network between y. pestis and human. a graph theoretic analysis confirmed that y. pestis preferentially targets hub and bottleneck proteins in the human intranetwork, as concluded previously for viruses [39, 86] and bacteria [87]. signaling pathways crucial for the human immune system were found to be enriched in human proteins targeted by y. pestis. these pathways include mitogen-activated protein kinase signaling and toll-like receptor signaling and also pathways functioning in focal adhesion, regulation of cytoskeleton, and leukocyte transendothelial migration. finally, y. pestis-targeted human proteins were compared with those targeted by viruses whose phi networks were identified previously. 16 of 109 y. 
pestis-targeted human proteins are included in phi networks of ebv [39] and hcv [85], indicating the common infection strategies of both viruses and bacteria. the recently determined first large-scale phi networks of bacteria-human systems [87, 88] contribute largely to the understanding of bacterial infection mechanisms with immune evasion. as phi data available for various pathogens increase, a need arises to analyze comprehensive phi data for all pathogen types together in order to draw a generalized picture. although infection mechanisms of individual pathogens have been studied through intraspecies pathogenic ppi maps and interspecies phi maps, a general overview of infection mechanisms was missing until analyses of phi data from different infection agents were attempted [6, 93]. in the absence of large-scale phi networks for bacterial, protozoan and fungal systems, dyer et al. [6] performed the first global analysis of 10 477 protein interactions between human and 190 pathogen strains of viruses, bacteria, and protozoa, through the properties of the 1233 targeted human proteins. the available phi data were not diverse: 98.3% of the 10 477 phis belonged to virus-human systems, with 77.9% of the interaction data drawn from hiv-human interaction systems. the importance of the pathogen-targeted proteins was evaluated within the intraspecies human ppi network of 75 457 interactions. these phi and ppi data were integrated from public databases: mint, intact, dip, hprd, reactome, bind and mips. firstly, targeting hub and bottleneck proteins was concluded to be global behavior for all pathogens, as reported for individual pathogen strains previously [39, 86-88]. gene ontology [109] functions enriched in the human proteins targeted by different pathogens revealed common infection mechanisms. 
attacks on human transcription factors and on key proteins that control the cell cycle, regulate apoptosis, and transport genetic material across the nuclear membrane were found to be among the common viral strategies. despite their scarcity (174 interactions in the dataset), bacterial phi data allowed identification of specific human proteins that function in the host immune response (via toll-like receptors and the i-κb kinase/nf-κb signaling cascade) as targets of the bacterial infection strategy [6]. recently we performed another study with comprehensive phi data to explore common and special infection strategies for viruses and bacteria [93]. a substantial number of bacterial phis, constituting 36.5% of all data, was available thanks to dyer et al. [87]. we analyzed 23 435 interactions between 3419 proteins of viral, bacterial, protozoan and fungal pathogens (257 strains in total) and 5210 human proteins obtained from phisto (www.phisto.org). to generate the intraspecies human protein network, 194 006 ppis were integrated from biogrid, dip, intact, mint and reactome. the substantial amount of bacterial and viral phi data allowed us to focus on comparisons between their specific infection mechanisms. firstly, attacking hub and bottleneck proteins in the human ppi network was verified as a common infection strategy of both bacteria and viruses. furthermore, viruses were observed to target human proteins of much higher connectivity and centrality values in comparison to bacteria. secondly, gene ontology enrichment analysis of the targeted human proteins verified that bacteria and viruses use specialized mechanisms to manipulate human immune defense mechanisms and cellular processes, respectively (as reported in dyer et al. [6], but relying on lower amounts of phi data). 
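comparisons of common and pathogen-specific targets of this kind reduce to set operations on the human proteins targeted by each pathogen group. the python sketch below illustrates this on invented phi triples; the tuple layout (pathogen group, pathogen protein, human protein), the identifiers, and the function names are assumptions of this example and do not reflect the phisto data format.

```python
def targeted_by(phis):
    """Map each pathogen group to the set of human proteins it targets."""
    targets = {}
    for group, _pathogen_protein, human_protein in phis:
        targets.setdefault(group, set()).add(human_protein)
    return targets


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# invented phi records: (pathogen group, pathogen protein, human protein)
phis = [
    ("virus", "v1", "h1"),
    ("virus", "v2", "h2"),
    ("virus", "v2", "h3"),
    ("bacteria", "b1", "h2"),
    ("bacteria", "b2", "h4"),
    ("bacteria", "b3", "h3"),
]

targets = targeted_by(phis)
shared = targets["virus"] & targets["bacteria"]
print(sorted(shared))
print(jaccard(targets["virus"], targets["bacteria"]))
```

the shared set is the natural input for a subsequent functional enrichment analysis of proteins attacked by both pathogen groups.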
a first attempt at the investigation of the human proteins targeted by both bacteria and viruses revealed that attacking human metabolic processes is a common strategy used by both pathogens during infections [93]. global analysis of phi data provides insights into the strategies adopted by bacteria and viruses to subvert human cellular processes and the immune system during infection. however, large-scale phi networks for pathogens other than bacteria and viruses are still undetermined, leaving their pathogenesis mechanisms relatively uncharacterized. research on infectious diseases through phis has accelerated within the post-genomic era (fig. 3). however, large-scale phi networks have been infrequently studied. efforts to identify and analyze large-scale phis for diverse pathogen types would be expected to parallel the acceleration of biotechnology and bioinformatics research. increasing amounts of data available will allow more complete data sets to be compiled, resulting in characterization of topological properties of phi networks. the first attempt to fit the degree distribution of the ebv-human interaction network failed due to scarcity of data [39]. on the other hand, bioinformatic analyses of the pathogen-targeted human proteins succeeded in unraveling some infection strategies, such as targeting human hubs and bottlenecks, subverting cellular processes to the pathogens' own advantage, and evading immune defenses [6, 39, 85, 87, 88, 93]. the large amount of data expected to be generated for phi systems should enable us to capture infection processes in much greater detail, potentially leading to the development of new and more efficient therapeutics. conventional treatments for infectious diseases often aim to kill pathogens by targeting their essential proteins. this approach unfortunately forces the pathogens to evolve for survival and consequently selects resistant strains (especially in the case of rna viruses with a high mutation rate). 
to fight drug-resistant pathogens, novel alternative therapeutics are emerging which target host proteins required by pathogens to replicate and persist within the host organism. if these host factors are indispensable for pathogens, but not essential for host cells, their silencing may inactivate pathogenic activity, allowing them to serve as therapeutic targets [4, 115]. in the light of phi studies, some human factors required by viral and bacterial pathogens have been determined for hiv [115-119], hcv [120], west nile virus [121], influenza virus [122, 123], and m. tuberculosis [124] in recent years. despite the efforts reviewed here, the use of systems biology approaches to investigate phi is still considered relatively undeveloped. the availability of new phi network data, together with further topological and functional analyses of pathogen-host systems, is expected to shed more light on infection mechanisms and novel therapeutic targets for infectious diseases in the near future. we particularly thank dr. tunahan çakır for critical reading of the manuscript and for his contributions to figure 3. financial support was provided by the research funds of boğaziçi university, through project 5554d. the doctoral scholarship for saliha durmuş tekir, sponsored by tübitak, is gratefully acknowledged. the authors declare no conflict of interest. figure 3. the number of scientific publications including phi-related terms in pubmed in the post genomic era. the searched phi-related terms: "pathogen host interactions", "host pathogen interactions", "pathogen host interaction", "host pathogen interaction", "pathogen-host interactions", "pathogen-host interaction", "host-pathogen interactions", "host-pathogen interaction". 
infectious diseases: for considerations the 21st century host-pathogen systems biology filoviruses are ancient and integrated into mammalian genomes new strategies to fight infectious diseases -arms race on a microscale signaling during pathogen infection the landscape of human proteins interacting with viruses and other pathogens antibacterial drug discovery and structure-based design nucleotide sequence of bacteriophage ϕx174 dna whole-genome random sequencing and assembly of haemophilus influenzae rd the sequence of the human genome protein function in the post-genomic era a novel genetic system to detect protein-protein interactions interaction mating reveals binary and ternary connections between drosophila cell cycle regulators toward a functional analysis of the yeast genome through exhaustive two-hybrid screens functional organization of the yeast proteome by systematic analysis of protein complexes systematic identification of protein complexes in saccharomyces cerevisiae by mass spectrometry global analysis of protein activities using proteome chips a protein linkage map of escherichia coli bacteriophage t7 a genomic approach of the hepatitis c virus generates a protein interaction map genome-wide analysis of vaccinia virus protein-protein interactions the protein-protein interaction map of helicobacter pylori interaction network containing conserved and essential protein complexes in escherichia coli largescale identification of protein-protein interaction of escherichia coli k-12 toward a protein-protein interaction map of the budding yeast: a comprehensive system to examine two-hybrid interactions in all possible combinations between the yeast proteins a comprehensive analysis of protein-protein interactions in saccharomyces cerevisiae protein interaction mapping in c. 
elegans using proteins involved in vulval development the biomolecular interaction network database and related tools: 2005 update the biogrid interaction database: 2011 update the database of interacting proteins: 2004 update human protein reference database the intact molecular interaction database in 2012 mint, the molecular interaction database: 2012 update mpact: the mips protein interaction resource on yeast reactome: a database of reactions, pathways and biological processes the string database in 2011: functional interaction networks of proteins, globally integrated and scored protein-protein interactions between hepatitis c virus nonstructural proteins herpesviral protein networks and their interaction with the human proteome epstein-barr virus and virus human protein interaction maps analysis of intraviral protein-protein interactions of the sars coronavirus orfeome genome-wide analysis of protein-protein interactions and involvement of viral proteins in sars-cov replication hepatitis c virus: structure, protein products and processing of the polyprotein precursor flaviviridiae: the viruses and their replication the complete dna sequence of vaccinia virus genetically engineered poxviruses for recombinant gene expression, vaccination, and safety the complete dna sequence of varicella-zoster virus identification of herpesvirus-like dna sequences in aids-associated kaposi's sarcoma the genome of epstein-barr virus type 2 strain ag876 emergence of scaling in random networks specificity and stability in topology of protein networks unique and conserved features of genome and proteome of sars-coronavirus, an early split-off from the coronavirus group 2 lineage towards a proteome -scale map of the human protein-protein interaction network a human proteinprotein interaction network: a resource for annotating the proteome a first-draft human protein-interaction map a proteome-wide protein interaction map for campylobacter jejuni the binary protein interactome of 
treponema pallidum -the syphilis spirochete a global protein-protein interaction network in the human pathogen mycobacterium tuberculosis h37rv an expanded protein-protein interaction network in bacillus subtilis reveals a group of hubs: exploration by an integrative approach proteome organization in a genome-reduced bacterium primary structure of the succinyl-coa synthetase of escherichia coli one-step inactivation of chromosomal genes in escherichia coli k-12 using pcr products new partners of acyl carrier protein detected in escherichia coli by tandem affinity purification complete genome eequence of treponema pallidum, the syphilis spirochete complete sequence analysis of the genome of the bacterium mycoplasma pneumoniae epidemiology, strategy, financing mycobacterium tuberculosis interactome analysis unravels potential pathways to drug resistance uncovering new signaling proteins and potential drug targets through the interactome analysis of mycobacterium tuberculosis an expanded view of bacterial dna replication dna polymerase i acts in translesion synthesis mediated by the y-polymerases in bacillus subtilis cell-cycle-dependent spatial sequestration of the dnaa replication initiator protein in bacillus subtilis chemical genetics of plasmodium falciparum a protein interaction network of the malaria parasite plasmodium falciparum inferring genome-wide functional linkages in e. 
key: cord-007211-prygoc0q authors: segawa, hiroaki; kato, masahiko; yamashita, tetsuro; taira, hideharu title: the roles of individual cysteine residues of sendai virus fusion protein in intracellular transport(1) date: 1998-06-17 journal: j biochem doi: 10.1093/oxfordjournals.jbchem.a022044 sha: doc_id: 7211 cord_uid: prygoc0q the role of intramolecular disulfide bonds in the fusion (f) protein of sendai virus was studied. the 10 cysteine residues were changed to serine residues using site-directed mutagenesis. none of the cysteine mutant f proteins reacted with a monoclonal antibody specific for the mature conformation of the f protein, but eight of ten mutants reacted with an immature conformation-specific monoclonal antibody. the transport of these mutant proteins to the cell surface was drastically reduced. all of the cysteine mutant f proteins remained sensitive to endoglycosidase h (endo h) for 3 h after their synthesis. moreover, cell surface transport of the hemagglutinin-neuraminidase (hn) protein co-expressed with each of these cysteine mutant f proteins was also reduced. these results suggest that all cysteine residues participate in the formation of intramolecular disulfide bonds, that co-translational disulfide bond formation is crucial to the correct folding and intracellular transport of the f protein, and that interaction of the f and hn proteins takes place intracellularly.
the two glycoproteins of paramyxoviruses, the fusion (f) and hemagglutinin-neuraminidase (hn) proteins, are present as integral proteins which form spike-like projections on the outer surface of the viral envelope. the hn protein exhibits both the hemagglutinating and the neuraminidase activities, while the f protein has been shown to be involved in virus penetration, hemolysis and cell fusion (for reviews, see refs. 1 and 2). the f protein is synthesized as an inactive precursor, f0, which is cleaved by proteases to form the biologically active protein consisting of the disulfide-linked subunits f1 and f2. the n-terminal portion of the f1 subunit is very hydrophobic, a feature which is highly conserved among paramyxovirus f proteins, and is suggested to mediate fusion of the virus envelope with the target membrane as well as cell fusion resulting in syncytium formation. the sequences of paramyxovirus f proteins reveal the conservation not only of amino acid residues of the fusion-inducing domain, cleavage site, and transmembrane domain, but also of cysteine residues at specific positions. as shown in fig. 1, mature sendai virus f0 protein has 10 cysteine residues, designated c1 to c10, and the relative position of cysteine residues is highly conserved among paramyxovirus f proteins (2, 3). this suggests that these cysteine residues play important structural and functional roles. two additional cysteine residues are present in the signal peptide and are not likely to participate in intramolecular disulfide bonds of the mature protein. the c1 cysteine is in the f2 subunit, and the remaining 9 cysteine residues are in the f1 subunit. eight of the 9 cysteine residues are clustered in a narrow region near the transmembrane domain. the formation of disulfide bonds takes place in the er and is co-translationally catalyzed by protein disulfide isomerase.
it is suggested that only native disulfide bonds are found in the principal folding intermediates and that disulfide bond formation plays an integral role in the folding. the mechanisms of folding, oligomeric assembly, and sorting of viral membrane proteins have been characterized in recent years (for review, see ref. 4). to date, the roles of individual cysteine residues in the folding of the hn protein of newcastle disease virus (ndv) (5) and the hemagglutinin protein of measles virus (6) have been reported. on the other hand, the roles of individual cysteine residues of paramyxovirus f proteins have not been characterized and remain largely unknown. intramolecular and intermolecular disulfide bonds are essential components of the structure of the majority of proteins, and it is of great interest to understand their functions. roles that have been suggested for disulfide bonds include (i) aiding in protein folding and maturation and (ii) maintenance of stability and solubility. two general approaches, in vivo reduction using reducing reagents (7-10) and site-directed mutagenesis to substitute each of the cysteine residues with other amino acids (5, 6, 11-14), have been used to study the functional role of disulfide bond formation within cells. recently, the sites of the disulfide bonds of sendai virus f protein were determined using protein sequencing analysis, and all 10 cysteine residues were shown to participate in disulfide bond formation (15). thus, mutagenesis to substitute each cysteine residue should provide additional findings regarding the functional role of each disulfide bond in the folding, intracellular transport and antigenicity. [fig. 1: codon changes used for the cysteine-to-serine substitutions, in the order printed: tgc→tcc, tgt→tct, tgt→tct, tgc→tcc, tgt→tct, tgt→tct, tgc→tcc, tgt→tct, tgc→tcc, tgt→tct.] previously, we described the efficient expression of sendai virus f gene cdna with mutations at its cleavage site and hn gene cdna to induce cell fusion (16).
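the cysteine-to-serine codon changes listed for fig. 1 (tgc to tcc and tgt to tct) each differ from the original codon by a single base, the middle g becoming c. the following minimal sketch checks this with a four-entry excerpt of the standard genetic code; the function and variable names are illustrative and do not come from the paper.

```python
# Excerpt of the standard genetic code covering only the codons discussed here.
CODON_TO_AA = {"TGC": "Cys", "TGT": "Cys", "TCC": "Ser", "TCT": "Ser"}


def cys_to_ser(codon: str) -> str:
    """Return the serine codon obtained by changing the middle base G to C."""
    codon = codon.upper()
    if CODON_TO_AA.get(codon) != "Cys":
        raise ValueError(f"{codon} is not a cysteine codon")
    mutated = codon[0] + "C" + codon[2]
    # Both cysteine codons map onto a serine codon after this change.
    assert CODON_TO_AA[mutated] == "Ser"
    return mutated


def n_mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


for original in ("TGC", "TGT"):
    mutated = cys_to_ser(original)
    print(original, "->", mutated, "mismatches:", n_mismatches(original, mutated))
```

a single mismatch per codon is what allows the change to be carried on a short mismatched primer.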
in this study, we employed site-directed mutagenesis to prepare mutant f protein genes at each of the 10 cysteines and analyzed the role of each cysteine residue in f protein intracellular processing, cell surface expression, and immunoreactivity with different monoclonal antibodies (mabs) after transient expression of the cysteine mutants in cos cells. plasmids-mutagenesis of pcr fragments by use of mismatched primers was performed in a three-step pcr (17). to replace each cysteine, except for c4, we used the sets of four primers listed in table i. in the first pcr, two separate reactions were run using the 5'-mutagenic and 3'-outer primers and the 3'-mutagenic and 5'-outer primers to amplify separate, overlapping sequences of the template plasmid puc-f (18). the 5'-mutagenic and 3'-mutagenic primers have overlapping homologous regions containing the mutation of interest. following amplification, the pcr products were purified, mixed, then denatured and re-annealed. an overlapping duplex was formed and extended by the second pcr reaction without primers to give the full-length target sequence containing the mutation. this mutated pcr fragment was amplified by the third pcr using the 5'-outer and 3'-outer primers used in the first pcr. plasmid psrd-fc1s was constructed as follows: the pcr product containing a mutation at its c1 site was digested with clai and ligated to the fc fragment, which was excised from puc-f by digestion with clai and hindiii. the ligated whole f gene was purified by gel extraction, treated with klenow fragment of dna polymerase i (klenow fragment) to create blunt ends, then treated with t4 polynucleotide kinase and ligated into the expression vector pcdl-srd (psrd) (19), which had been [...]. plasmid psrd-fc2s was constructed as follows: the pcr product containing a mutation at its c2 site was digested with bglii and psti and used to replace the corresponding fragment of puc-f.
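the three-step pcr described above can be illustrated on a toy sequence. the sketch below is a simplification under stated assumptions: it tracks only the top strand, treats primers as exact substrings, and uses a made-up 24-base template rather than any part of the f gene; the flank length and all names are hypothetical.

```python
def overlap_extension(template: str, pos: int, new_base: str, flank: int = 6) -> str:
    """Simulate three-step PCR mutagenesis on the top strand of a toy template."""
    if not (flank <= pos < len(template) - flank):
        raise ValueError("mutation too close to the template ends")
    # Region shared by the two mutagenic primers: template around pos,
    # with the base at pos replaced by the mismatch.
    mutated_region = (
        template[pos - flank:pos] + new_base + template[pos + 1:pos + flank + 1]
    )
    # Step 1: two separate reactions give two overlapping products.
    left = template[:pos - flank] + mutated_region        # outer primer to mutagenic primer
    right = mutated_region + template[pos + flank + 1:]   # mutagenic primer to outer primer
    # Step 2: the products anneal through the shared region and are extended
    # without primers to the full-length sequence.
    overlap = len(mutated_region)
    assert left[-overlap:] == right[:overlap]
    full_length = left + right[overlap:]
    # Step 3: amplification with the two outer primers leaves the sequence unchanged.
    return full_length


# Hypothetical template containing one TGT (Cys) codon at positions 9-11.
TOY_TEMPLATE = "ATGGCTAAATGTGGAGCTCTGAAA"
mutant = overlap_extension(TOY_TEMPLATE, pos=10, new_base="C")
print(TOY_TEMPLATE)
print(mutant)
```

the product has the same length as the template and differs from it only at the mismatched base, so the tgt codon in the toy template becomes tct.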
subsequently, the fc2s gene was excised by digestion with hindiii, blunt-ended by treatment with klenow fragment, and inserted into psrd, which had been linearized by digesting with ecori and treated with klenow fragment. to construct the expression plasmids psrd-fc3s, psrd-fc5s, psrd-fc6s, psrd-fc7s, psrd-fc8s, psrd-fc9s, and psrd-fc10s, we used the following procedure: a cdna fragment containing the carboxyl terminal portion of f protein from amino acid residues 235 to 565 was subcloned into pbluescript ii sk+ (stratagene cloning systems, la jolla, ca) at the psti site to yield pbs-fc. mutagenesis was performed as described above, and the resulting pcr products containing a mutation at each cysteine site were digested with xmni and ndei, then inserted into pbs-fc, which had been digested with smai and ndei. the resulting plasmids were digested with psti, and the mutated fc fragment was ligated with 4.1 kb psti-digested psrd-f. plasmid psrd-fc4s was constructed as follows: pcr was performed using n primer and mutagenic primer 3'-flcs3. the pcr product was digested with bamhi and inserted into pbs-fc, which had been digested with the same restriction enzyme. the resulting plasmid was digested with psti, and the mutated fc fragment was ligated with 4.1 kb psti-digested psrd-f as described above. the mutations at the desired sites of all the mutant dnas were confirmed by dna sequencing (20). as shown in fig. 1, a total of 10 cysteine mutants were prepared and designated fc1s to fc10s from the amino terminus of the f0 protein to the carboxyl terminus, corresponding to the cysteine residues at amino acid positions 70, 199, 338, 347, 362, 370, 394, 399, 401, and 424, respectively. the expression plasmid psrd-hn was constructed by inserting the whole hn gene into psrd as described previously (16). cell culture and dna transfection-monkey cos-1 cells (21) were grown in dulbecco's modified eagle's medium (dmem) supplemented with 10% fetal calf serum at 37°c in 5% co2.
for transient expression of the f protein, cos cells were grown to 80% confluency in 35-mm tissue culture dishes, then transfected with 5 µg of expression plasmid per dish by calcium phosphate precipitation (22). cells incubated for 48 h after transfection were analyzed by immunoprecipitation (18). antibodies-anti-sendai virus antiserum was prepared from sendai virus-infected rabbits. the antiserum used for immunoprecipitation was obtained from rabbits immunized with f proteins, which were expressed in escherichia coli as maltose-binding fusion proteins and purified by amylose resin as described by the manufacturer (new england biolabs, beverly, massachusetts). anti-f serum used to detect cell surface or intracellular expression of the f protein was obtained from rabbits immunized with the f1 protein, which was purified from virions by sds-polyacrylamide gel electrophoresis (sds-page) (23). anti-hn serum used to detect hn protein by western blotting was obtained from rabbits immunized with the hn protein, which was purified from virions by sds-page. monoclonal antibodies (mabs) f-49, f-236, f-921, and yl-111 were the generous gift of dr. h. tozawa (kitasato university). mab f-49 reacts with the mature form of f protein (24), and f-236 recognizes both f0 and f1 in immunoprecipitation but reacts specifically with f0 in western blot analysis. mab f-921 reacts with the immature form of f protein. in a competitive binding assay, the mabs were divided into two groups, f-i and f-ii. mabs f-236 and f-921 belonged to the f-i group and f-49 belonged to the f-ii group (25). mab yl-111 recognizes the hn protein. indirect immunofluorescence staining-indirect immunofluorescence staining was performed as described previously (26). in general, cos cells were grown on glass coverslips and transfected with plasmid dna as described above.
forty-eight hours after transfection, cells were washed with ice-cold phosphate buffered saline (pbs) and fixed with acetone for 10 min at -20°c or with 3.7% paraformaldehyde in 100 mm sodium phosphate buffer (ph 6.5) for 30 min at room temperature. the cells were washed twice with pbs for 10 min, then incubated for 1 h at 37°c in pbs containing 3% bsa, 0.02% nan3, and anti-f rabbit serum or yl-111 (diluted 1:100). after incubation with the antiserum, cells were washed with pbs twice, then incubated for 1 h at 37°c in pbs containing nan3, bsa, and fluorescein isothiocyanate (fitc)-conjugated anti-rabbit igg or anti-mouse igg (wako pure chemical industries), each diluted 1:100, as secondary antibody to anti-f rabbit serum or yl-111, respectively. immunofluorescence was examined by fluorescence microscopy. western blot analysis-cos-1 cells (in 12-well dishes) were transfected with plasmid dna as described above. forty-eight hours after transfection, cells were washed with ice-cold pbs containing 20 mm iodoacetamide (iaa), lysed in 100 µl of ripa buffer [0.01 m tris-hcl (ph 7.5), 0.15 m nacl, 1% triton x-100, 1% sodium deoxycholate, 0.1% sds] containing 2 mm pmsf and 20 mm iaa, and sonicated for 30 s. twenty-five microliters of 5x sds sample buffer [312.5 mm tris-hcl (ph 6.8), 25% 2-mercaptoethanol (2-me), 50% glycerol, 10% sds, 0.05% bromophenol blue] was added to the cell lysates, and the mixture was boiled for 3 min. fifteen microliters of each sample was electrophoresed on a 9% polyacrylamide gel. proteins were transferred to immobilon-p membrane (millipore) by semi-dry electroblotting. the membrane was preincubated in tris-buffered saline (tbs) [0.02 m tris-hcl (ph 7.6), 0.137 m nacl] containing 0.05% tween 20 and 5% low-fat milk for 1 h at room temperature.
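as a worked check of the sample preparation for western blotting, adding 25 µl of 5x sds sample buffer to 100 µl of lysate dilutes the buffer components fivefold. the short calculation below is a sketch that counts only the components contributed by the sample buffer and ignores what the ripa lysate itself contains (for example its own tris and sds); the dictionary keys and function name are illustrative.

```python
# 5x SDS sample buffer composition as listed in the text.
FIVE_X_BUFFER = {
    "tris-hcl (mM)": 312.5,
    "2-mercaptoethanol (%)": 25.0,
    "glycerol (%)": 50.0,
    "sds (%)": 10.0,
    "bromophenol blue (%)": 0.05,
}


def final_concentrations(stock: dict, stock_volume_ul: float, sample_volume_ul: float) -> dict:
    """Dilute each stock component into the combined volume (C1*V1 = C2*V2)."""
    factor = stock_volume_ul / (stock_volume_ul + sample_volume_ul)
    return {name: value * factor for name, value in stock.items()}


# 25 µl of 5x buffer added to 100 µl of cell lysate.
final = final_concentrations(FIVE_X_BUFFER, stock_volume_ul=25.0, sample_volume_ul=100.0)
for name, value in final.items():
    print(f"{name}: {value:g}")
```

the dilution factor is 25/125 = 0.2, so the buffer-derived components end up at 1x.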
the membrane was then washed in tbs containing 0.05% tween 20, incubated for 1 h at room temperature in tbs-tween buffer containing a 1:1,000 dilution each of the anti-f and anti-hn antisera, washed twice for 30 min in tbs-tween, and incubated for 1 h at room temperature in tbs-tween containing a 1:4,000 dilution of horseradish peroxidase (hrp)-conjugated protein a (e-y laboratories). after extensive washing of the membrane, bound antibodies were detected using the ecl western blotting detection reagent system (amersham). immunoprecipitation-cos-1 cells (in 35-mm dishes) were transfected with plasmid dna as described above. forty-eight hours after transfection, the medium was replaced with methionine- and cysteine-free minimum essential medium (mem). twenty minutes later, cells were labeled with 100 µl of methionine- and cysteine-free mem supplemented with [35s]methionine and [35s]cysteine (200 µci/ml, 1,000 ci/mmol; du pont/nen), and labeling was continued for 10 min. the cells were chased with dmem supplemented with 5% fbs, 5 mm methionine, and 5 mm cysteine for the time indicated in each figure legend. then the cells were washed twice with ice-cold pbs containing 20 mm iaa, lysed in 300 µl of ripa buffer containing 20 mm iaa, sonicated for 30 s, and centrifuged for 10 min at 15,000 rpm. fifty microliters of the supernatant (cellular extract) was incubated on ice for 1 h with 5 µl of antibody raised against recombinant f protein expressed in e. coli or of mabs reacting with various epitopes (25). twenty microliters of a suspension of protein-a sepharose fast flow (pharmacia biotech, uppsala, sweden) was added to the lysates.
the mixture was incubated at 4°c for 1 h with gentle mixing, and immune complexes adsorbed on the beads were washed three times in 50 mm tris-hcl (ph 7.5), 150 mm nacl, 1 mm edta, 0.25% (w/v) gelatin, 0.1% (v/v) np-40, 0.02% (w/v) nan3, then denatured in sds sample buffer [6.25 mm tris-hcl (ph 6.8), 2% sds, 10% glycerol] in the presence or absence of 5% 2-me by boiling for 3 min. the samples were analyzed by 9% sds-page followed by fluorography. glycosidase treatment-the immune complexes adsorbed on protein-a sepharose were resuspended in 100 µl of 50 mm sodium acetate buffer (ph 5.5), then samples were incubated with 2 mu of endo-β-n-acetylglucosaminidase h (endo h: boehringer mannheim biochemica) for 16 h at 37°c. after digestion, the immune complexes on the beads were recovered by centrifugation and denatured in sds sample buffer containing 5% 2-me by boiling for 3 min. the immune complexes were analyzed by sds-page followed by fluorography. expression of the cysteine mutant f proteins-the f protein of the z strain of sendai virus possesses 12 cysteine residues. the first and second cysteine residues are present in the signal peptide while the rest are distributed across the entire ectodomain of the protein. the signal peptide is cleaved in the er, and thus the remaining 10 cysteine residues are present in the mature f protein. the relative locations of these cysteine residues are highly conserved among paramyxovirus f proteins, and these cysteines are likely to participate in intramolecular disulfide bonds in the mature protein. to examine the processing of the intramolecular disulfide bonds and their contribution to the structure of the f protein, site-directed mutagenesis using pcr was used to substitute each cysteine residue by a serine residue to prevent the formation of disulfide bonds. ten expression plasmids of the cysteine mutant f proteins, named fc1s to fc10s, were prepared as shown in fig. 1.
recombinant plasmids were introduced into cos cells by the calcium phosphate precipitation method (22), and the cells were incubated for 48 h. transfected cells were labeled with [35s]methionine and [35s]cysteine for 10 min, chased for 1 h, then lysed in ripa buffer. the lysates were immunoprecipitated with anti-f antiserum and analyzed by 9% sds-page under reducing (fig. 2a) or nonreducing (fig. 2b) conditions. in fig. 2a, psrd-f transfected cells gave one main band (lane 3) with mobility corresponding to the uncleaved form of f, or f0, synthesized in sendai virus-infected cos cells as reported previously (16). the 10 cysteine mutant-transfected cells also gave one main band corresponding to the uncleaved f0 protein. a minor band of about 60 kda was also detected in all cells transfected with recombinant plasmids (lanes 3-13). this band seems to consist of nonspecifically degraded or unglycosylated f proteins, because unglycosylated and immature f proteins could be detected by the antiserum used here, which was raised against the unglycosylated f proteins expressed in e. coli. the cysteine mutant psrd-fc1s (fig. 2a, lane 4) transfected cells gave three bands: one main band corresponding to the f0 protein, the common minor band (60 kda), and another higher molecular weight species of f protein. the alteration of the cysteine residue of the f2 subunit to a serine residue introduced a new glycosylation site in this protein sequence. thus, the higher molecular weight species seems to be the f protein utilizing the additional new glycosylation site. approximately equal levels of most of the f proteins in the cells transfected with the above plasmids were detected by western blot analysis (data not shown). this showed that these 10 cysteine mutant f proteins were efficiently expressed at almost the same level as wild-type f protein in psrd-f transfected cells and were relatively stable.
under nonreducing conditions, formation of disulfide bonds in a protein should allow it to assume a more compact form and to migrate faster on sds-page than unfolded forms without disulfide bonds, as reported by machamer et al. (27). as shown in fig. 2b, the compact and unfolded forms of monomer f proteins and aggregates were detected with the mutants as well as the wild-type f protein. the compact form migrated as a 60 kda band and the unfolded form migrated as a 66 kda band, which is similar to the molecular mass of the reduced form of f protein as shown in fig. 2a. most of the wild-type f protein migrated as the compact form (lane 3). on the other hand, the relative amount of the unfolded form of each of the 10 cysteine mutant f proteins was increased, and the ratio of the compact form to the unfolded form was roughly 1 as determined by densitometry (lanes 4-13). this indicated that all cysteine residues contributed to make the compact form. no obvious oligomeric form of f proteins was detected in the mutant or wild-type f protein-expressing cells. this implied that f protein does not form disulfide-linked oligomers. although the aggregates disappeared from the top of the gel when the radiolabeled cell lysates were reduced by boiling in 5% 2-mercaptoethanol before sds-page, treatment with thiol alkylation reagents such as iodoacetamide (iaa) or n-ethylmaleimide (nem) had little effect on the formation of the aggregates. in preliminary experiments, we did not detect such aggregates in sendai virus-infected cells analyzed under nonreducing conditions. these observations indicated that these aggregates were formed before sample preparation and that other viral factors might be required for efficient folding of f protein. antibodies-to examine the antigenicity of the cysteine mutant proteins, the cell lysates were immunoprecipitated with mabs f-49, f-236, and f-921 and analyzed by sds-page under reducing conditions.
in preliminary experiments, f-921 detected intracellular f proteins but not cell surface-transported f proteins. mab f-236 recognized both f0 and f1 in the immunoprecipitation, but specifically recognized f0 in western blot analysis (unpublished data). mab f-49 recognized the mature form of f protein (24). thus, these mabs bind independent epitopes. as shown in fig. 3, only the wild-type f protein was precipitated with anti-mature conformation mab f-49 (panel a, lane 3), the wild-type f protein and a small amount of fc2s protein were precipitated with mab f-236 (panel b, lanes 3 and 5), and the wild-type f protein and most of the cysteine mutant f proteins, except for the fc5s and fc6s proteins, were precipitated with mab f-921 (panel c, lanes 3-7 and 10-13). these data indicated that loss of any one of the cysteine residues of the f protein had dramatic effects on the immunoreactivity to anti-mature mabs and that all of the cysteine mutant f proteins were blocked at some stage in their maturation. cell surface transport of cysteine mutant f proteins-to examine the expression of cysteine mutant f proteins in the cells and their transport to the cell surface, the cells were fixed with acetone or paraformaldehyde and examined by indirect immunofluorescence staining using anti-f antiserum. to detect subcellular localization, the cells expressing wild-type and cysteine mutant f proteins were fixed with acetone, then examined by indirect immunofluorescence staining (fig. 4, panels a, c, e, g, i, k, and m). we expected that a pair of cysteine mutants which had altered cysteine residues involving the same disulfide bond might show the same expression phenotype. recent results by iwata et al. (15) identified specific disulfide bonds between c1 and c2, c3 and c4, and c5 and c6, while c10 appears to be linked to c7, c8, or c9. thus, only the 5 cysteine mutants fc1s, fc3s, fc5s, fc7s, and fc8s were shown in fig. 4.
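the lane ranges quoted for fig. 3 can be cross-checked against the mutant names with a small lookup. the sketch below assumes that lanes 4 to 13 hold fc1s to fc10s in numerical order; the text itself states only that lane 3 is wild type, that lane 4 is fc1s (fig. 2a) and that lane 5 is fc2s (fig. 3, panel b). all names in the code are illustrative.

```python
SAMPLES = ["wild type"] + [f"fc{i}s" for i in range(1, 11)]
FIRST_LANE = 3  # wild type in lane 3; mutants assumed to follow in numerical order
LANE_TO_SAMPLE = {FIRST_LANE + i: name for i, name in enumerate(SAMPLES)}


def expand(ranges):
    """Turn inclusive (low, high) lane ranges into a flat list of lane numbers."""
    lanes = []
    for low, high in ranges:
        lanes.extend(range(low, high + 1))
    return lanes


# mab f-921: positive in lanes 3-7 and 10-13 (fig. 3, panel c).
f921_positive = [LANE_TO_SAMPLE[lane] for lane in expand([(3, 7), (10, 13)])]
f921_negative = [name for name in SAMPLES if name not in f921_positive]

# mab f-236: positive in lanes 3 and 5 (fig. 3, panel b).
f236_positive = [LANE_TO_SAMPLE[lane] for lane in (3, 5)]

print("f-921 negative:", f921_negative)
print("f-921 positive mutants:", len([s for s in f921_positive if s != "wild type"]))
print("f-236 positive:", f236_positive)
```

under this lane assignment the two lanes missing from the f-921 ranges correspond to fc5s and fc6s, leaving eight of the ten mutants reactive with f-921, in line with the statements in the text.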
the cos cells transfected with vector plasmid psrd showed no fluorescence (panel a), whereas the cells transfected with psrd-f (panel c), psrd-fc1s (panel e), psrd-fc3s (panel g), psrd-fc5s (panel i), psrd-fc7s (panel k), and psrd-fc8s (panel m) displayed a bright staining pattern, and the frequency of positive cells was estimated to be 5-10% of total cells in each case. as shown in panel c, the cells expressing wild-type f protein displayed internal staining throughout the cytoplasmic reticulum as well as in the juxtanuclear region. on the other hand, the cysteine mutant-expressing cells showed an intracellular staining pattern limited to a reticular perinuclear structure (panels e, g, i, k, and m). this indicates that cysteine mutant f proteins were efficiently expressed in these cells. to detect whether the expressed proteins were transported to the cell surface, the cells were fixed with paraformaldehyde instead of acetone, then examined by indirect immunofluorescence staining. the cells transfected with psrd-f displayed a staining pattern on the cell surface (panel b), whereas the surface fluorescence of the cells transfected with the cysteine mutant f plasmids was clearly much less intense than that of cells expressing wild-type f protein (panels d, f, h, j, and l). this indicates that the wild-type f protein was properly transported to the cell surface, but these cysteine mutant f proteins were not transported to the cell surface. in sendai virus-infected cells, both f protein and hn protein are transported to the cell surface. to examine the effect of hn on cell surface expression of wild-type and cysteine mutant f proteins, cells co-transfected with the hn gene and wild-type f or one of the cysteine mutant f genes were fixed with paraformaldehyde, then cell surface expression of f or hn proteins was tested with anti-f antiserum or anti-hn mab yl-111. the cell surface expression of cysteine mutant f proteins was not detected (data not shown).
furthermore, as shown in fig. 5a, the cell surface expression of the hn protein was drastically reduced in the cells co-expressing hn and cysteine mutant f proteins. in panels (b) and (c), hn protein was efficiently detected at the surface of cells transfected with psrd-hn alone or with psrd-f and psrd-hn. on the other hand, the cell surface expression of the hn protein was drastically reduced in the cells co-transfected with psrd-hn and one of the cysteine mutant f genes (examples shown in fig. 5a, panels d, e, and f). as shown in fig. 5b, in western blot analysis, both hn and wild-type or the cysteine mutant f proteins were detected in the cells co-transfected with hn and wild-type or cysteine mutant f genes (lanes 4-14). the expression levels of hn proteins in these cells and in the cells transfected with hn gene alone were similar (lanes 3-14). these results indicated that association of f and hn proteins takes place intracellularly and that the transport deficiency of the cysteine mutant f proteins affected the intracellular transport of the hn protein. intracellular transport of f proteins-the f protein has three n-glycosylation motifs and contains complex-type sugar chain(s) (28). the f protein is synthesized on the rough er, then glycosylated in a well-defined sequence in the er and golgi apparatus in the process of transport to the cell surface. high mannose-type sugar chains of the glycoprotein in the er and the cis-golgi compartments are sensitive to digestion with endoglycosidase h (endo h), while sugar chains once trimmed and further processed in the medial and trans-golgi compartments are not. we therefore determined the critical state in the transport process of the cysteine mutant f proteins by testing endo h sensitivity of the sugar chains. to follow the process of maturation of glycoproteins, the transfected cells were labeled with [35s]methionine and [35s]cysteine for 10 min at 48 h after transfection.
after being chased for 3 h, cells were lysed and the lysates were analyzed after immunoprecipitation with anti-f followed by endo h digestion, as described in "materials and methods." as shown in [...] psrd-fc1s-transfected cells was partially glycosylated f protein modified at a novel glycosylation motif generated by the mutagenesis. because the antiserum used in this study reacts efficiently with immature f protein, relatively large amounts of endo h-sensitive species of f proteins were detected even in the cells expressing wild-type f protein. these data showed that cysteine mutant f proteins were trapped in the er or the cis-golgi region, which prevented their transport to the cell surface. we used mutagenesis of the cysteine residues of the f protein to examine the effect of the disruption and rearrangement of the disulfide bonds on the folding of the f protein and its transport to the cell surface. we found that all cysteine mutant f proteins were efficiently expressed in cos cells (fig. 2a), but failed to fold into the compact form and to achieve proper conformation (fig. 3). they were retained in the er to cis-golgi compartments (fig. 6), formed aggregates (fig. 2b), and failed to be transported to the cell surface (fig. 4). this indicated that disulfide bond formation involving all cysteine residues is required for the f protein to exit the er. although our cysteine mutant f proteins were glycosylated, none of them acquired native structure. assignment of disulfide bonds of the f protein has been reported (15); thus, we expected that we would be able to detect some specific folding intermediates corresponding to the lack of individual disulfide bonds by analyzing individual cysteine mutant f proteins under nonreducing conditions. when mutations are made in either one of a pair of cysteines that are normally linked in a mature structure, the phenotype of the mutant proteins is often similar.
in our study, two species of f proteins were detected in the cysteine mutant f proteins under nonreducing conditions: folded f protein and unfolded f protein. this indicated that formation of incorrect disulfide bonds led to rapid unfolding or rearrangement of disulfide bonds. as shown in fig. 3, almost all cysteine mutants exhibited a drastic decrease in reactivity to the mature conformation-specific mab f-49. this result showed that cysteine residues contribute to the whole protein structure through disulfide bridging, even though they are fairly distant from each other in the primary amino acid sequence. mab f-921 could not immunoprecipitate fc5s and fc6s proteins but could immunoprecipitate the rest of the mutants. this implied that the epitope of f-921 is present in the loop formed by the c5 and c6 disulfide bond, and that the antigenic site is exposed when the f protein is inside the cells, but buried when the f protein is expressed on the cell surface. fc2s showed only slight reactivity to mab f-236. these data indicated that f-236 and f-921 may recognize cysteine-rich domains and supported the idea that the cysteine-rich domain in the bunched structure of paramyxovirus f proteins is recognized by mabs (29). since the formation of the correct intramolecular disulfide bonds seems to be a complex process involving disruption and rearrangement of disulfide bonds during intracellular transport, the three-dimensional structure of these cysteine mutants was changed, and the immunoreactivity of mabs with these mutant proteins decreased dramatically. these data suggested that co-translational disulfide bond formation is an absolute requirement for subsequent folding. because of the incorrect folding of cysteine mutant f proteins, their transport to the cell surface was reduced drastically and they contained only high mannose-type oligosaccharides. these results indicated that the mutant proteins were retained in the er.
viral glycoprotein mutants that had a temperature-sensitive phenotype and folded correctly at 32°c but misfolded at 37°c have been reported (4). in hsv-1, the cysteine mutant gd proteins showed temperature-sensitive cell surface expression (12). we examined whether cysteine mutant sendai virus f proteins show temperature sensitivity, but we could not detect cell surface expression of the cysteine mutant f proteins by incubating mutant gene-expressing cells at 33°c or at 31.5°c (data not shown). these results suggest that the inhibition of the expression of the cysteine mutant f proteins on the cell surface occurred at a stage involved in the intracellular transport. though many reports have shown that cysteine mutant proteins are improperly folded and found as disulfide-linked aggregates in the er, there are some reports that cysteine mutant proteins could be transported to the cell surface, e.g., newcastle disease virus (ndv) hn, measles virus h, and asialoglycoprotein receptor h2b subunit, all of which have type ii topology (5, 6, 14). it seems that type i transmembrane proteins more strictly require disulfide bond formation than do type ii proteins in order to exit from the er. we previously found that cleavage of an intrinsically protease-sensitive mutant f, fmut, was enhanced by coexpression with hn protein (16), indicating that f protein associates with hn protein intracellularly. thus, we examined the effect of co-expression of hn on cell surface expression of cysteine mutant f proteins. although the expression of both proteins was detected at roughly equal levels by western blot analysis (fig. 5b), cell surface expression of cysteine mutant f proteins was not detected, and a decrease of cell surface expression of hn protein was observed (fig. 5a).
these observations are in accord with the down-regulation of the human parainfluenza type 3 (hpiv3) hn protein cell surface expression by the mutant f protein containing an intracellular retention signal (30). recently, many reports have shown a direct interaction of paramyxovirus f and hn proteins (31-34). however, the cell surface expression of f and hn proteins is reported to occur at different rates because of the different degrees of association with bip (35). thus, the decrease of cell surface expression of hn protein in cells coexpressing the cysteine mutants may be caused not only by direct interactions of the f and hn proteins but also by indirect interactions mediated by the binding of molecular chaperones such as bip or calnexin. morrison et al. reported that ndv f protein undergoes conformational change with disruption and rearrangement of disulfide bonds during intracellular transport, and that the conformational change occurs before the cleavage reaction (36). because of the intracellular transport deficiency of the cysteine mutant f proteins, we could not identify the cysteine residues involved in such disulfide bond rearrangements. to study sequential disulfide bond formation and bond rearrangements, in vivo reduction experiments using well-characterized mabs are now under way.
paramyxovirus fusion: a hypothesis of changes the paramyxoviruses determination of the complete nucleotide sequence of the sendai virus genome rna and the predicted amino acid sequences of the f, hn and l protein folding and assembly of viral membrane proteins the role of the individual cysteine residues in the formation of the mature, antigenic hn protein of newcastle disease virus role of individual cysteine residues in the processing and antigenicity of the measles virus haemagglutinin protein ing disulfide bond formation and protein folding in the endoplasmic reticulum quality control in the endoplasmic reticulum folding and misfolding of vesicular stomatitis virus g protein in cells in vitro role of cotranslational disulfide bond formation in the folding of the hemagglutinin-neuraminidase protein of newcastle disease virus disulfide bonds in folding and transport of mouse hepatitis coronavirus glycoproteins cysteine mutants of herpes simplex virus type 1 glycoprotein d exhibit temperature-sensitive properties in structure and function disulfide bond structure of glycoprotein d of herpes simplex virus types 1 and 2 the contribution of cysteine residues to antigenicity and extent of processing of herpes simplex virus type 1 glycoprotein d enhanced folding and processing of a disulfide mutant of the human asialoglycoprotein receptor h2b subunit assignment of disulfide bridges in the fusion glycoprotein of sendai virus transfection of sendai virus f gene cdna with mutations at its cleavage site and hn gene cdna into cos cells induces cell fusion a general method for rapid site-directed mutagenesis using the polymerase chain reaction construction of expression plasmids for the fusion protein of sendai virus, and their expression in e. 
coli and eucaryotic cells sra promoter: an efficient and versatile mammalian cdna expression system composed of the simian virus 40 early promoter and the r-u5 segment of the human t-cell leukemia virus type 1 long terminal repeat dna sequencing with chain-terminating inhibitors sv40-transformed simian cells support the replication of early sv40 mutants a new technique for the assay of infectivity of human adenovirus 5 dna cleavage of structural proteins during the assembly of the head of bacteriophage t4 conformational aberrance of sendai virus f0 protein in thapsigargin-treated cells allowing exit from the endoplasmic reticulum but causing arrest at the golgi complex neutralizing activity of the antibodies against two kinds of envelope glycoproteins of sendai virus expression of the sendai virus fusion protein and the hemagglutinin-neuraminidase protein using a baculovirus vector heavy chain binding protein recognizes incompletely disulfide-bonded forms of vesicular stomatitis virus g protein carbohydrate structures of hvj (sendai virus) glycoproteins perturbation of cellular calcium blocks exit of secretory proteins from the rough endoplasmic reticulum downregulation of paramyxovirus hemagglutinin-neuraminidase glycoprotein surface expression by a mutant fusion protein containing a retention signal for the endoplasmic reticulum identification of regions on the hemagglutinin-neuraminidase protein of human parainfluenza virus type 2 important for promoting cell fusion functional interaction of paramyxovirus glycoproteins: identification of a domain in sendai virus hn which promotes cell fusion association of the parainfluenza virus fusion and hemagglutinin-neuraminidase glycoproteins on cell surfaces detection of an interaction between the hn and f proteins in newcastle disease virus-infected cells selective and transient association of sendai virus hn glycoprotein with bip conformational change in a viral glycoprotein during maturation due to disulfide bond 
disruption key: cord-020788-a33vcapl authors: gottardi, cara j.; caplan, michael j. title: signals and mechanisms of sorting in epithelial polarity date: 2008-05-22 journal: nan doi: 10.1016/s1569-2558(08)60020-x sha: doc_id: 20788 cord_uid: a33vcapl this chapter discusses epithelial-membrane polarity, sorting pathways in polarized cells, and the sorting-signal paradigm. polarized epithelial cells have long captured the attention of cell biologists and cell physiologists. at the electron-microscopic level, one of the most apparent and fundamental features of this cell type is its polarized organization of intracellular organelles and its structurally and compositionally distinct lumenal (apical) and serosal (basolateral) plasma-membrane domains. the polarized epithelial phenotype is an absolute necessity for organ-system function. in the most general sense, these cells organize to form a continuous, single layer of cells, or epithelium, which serves as a semi-permeable barrier between apposing and biologically distinct compartments. within the tubules of the nephron, these cells orchestrate complex ion-transporting processes that ultimately control the overall fluid balance of the organism. at the surface of the gastrointestinal tract, specialized versions of this cell type control the digestion, absorption, and immuno-protection of the organism. polarized epithelial cells have long captured the attention of cell biologists and cell physiologists. this is largely because the architecture of these cells so tellingly bespeaks their function. at the electron microscopic level, one of the most apparent and fundamental features of this cell type is its polarized organization of intracellular organelles and its structurally and compositionally distinct lumenal (apical) and serosal (basolateral) plasma membrane domains (figures 1a, b). through the eyes of the physiologist, the polarized epithelial phenotype is an absolute necessity for organ system function.
in the most general sense, these cells organize to form a continuous, single layer of cells, or epithelium, which serves as a semi-permeable barrier between apposing and biologically distinct compartments. within the tubules of the nephron, these cells orchestrate complex ion-transporting processes that ultimately control the overall fluid balance of the organism. at the surface of the gastrointestinal tract, specialized versions of this cell type control the digestion, absorption and immuno-protection of the organism. thus while polarized epithelial cells can carry out myriad functions, they share one defining feature: a structural polarity which serves their underlying functional polarity. the differential distribution of membrane proteins between the plasmalemmal surfaces of polarized epithelial cells enables these cells to both respond to and effect changes upon their environment in a directed fashion. the gastric parietal cell of the stomach, for example, contains a population of h,k-atpase-rich vesicles. upon stimulation, these vesicles fuse selectively with the lumenal membrane, resulting in the massive apical secretion of hcl which initiates digestion. without two important elements of the polarized phenotype, that is, junctional integrity and the precision of this membrane insertion, proton pumps might be delivered to a compartment which would be adversely affected by the secretion of acid. another illustration of the utility of the polarized phenotype is provided by the principal cells of the kidney, which carry out net sodium absorption through a mechanism which is entirely dependent upon the polarized distribution of two membrane proteins. sodium absorption is stimulated by the hormone aldosterone, which increases the amount or activity of na,k-atpase at the basolateral surface, while increasing the number or activity of apical sodium channels and thus the sodium conductance of the lumenal membrane (doucet and barlet-bas, 1989) . 
because the na,k-atpase generates low intracellular [na+], sodium is able to pass from the lumen of the kidney tubule through apical sodium channels and into the cytoplasm down its electrochemical gradient. the na+ is then pumped across the basolateral membrane and into the interstitium by the sodium pump and is ultimately prevented from leaking back into the lumen by impermeable tight junctions. therefore, it is the differential assignment of na+ channels to the apical surface and na,k-atpase molecules to the basolateral domain that ensures the vectoriality of this transport process. how the polarized cell assigns these two proteins (and apical and basolateral membrane proteins in general) to their respective surface domains has been the subject of much investigation and is the general focus of this review.
[figure caption: these morphologically distinct apical and basolateral membrane domains are separated by a unique ultrastructure known as the tight junction (tj). this structure is just visible as an area of close, uniform membrane apposition located at the apices between adjacent epithelial cells. (photo courtesy of dr. marian neutra, children's hospital, boston, ma).]
it is perhaps important to point out that the fundamental questions of plasma membrane protein anisotropy are not unique to surface membrane proteins or even to the study of epithelial polarity. the golgi apparatus, for example, is a polarized organelle whose cis- and trans-most cisternae are structurally and biochemically distinct. this organization is thought to enable the ordered addition and trimming of glycoprotein sugar residues as they traverse the stacked cisternae. as is clearly represented in the breadth of topics covered in this book, numerous cell types adopt a polarized state for some functional purpose.
the propagation of a nerve impulse from dendrite to axon requires compositionally different membrane proteins in each of these domains, while the localization of determinants to specific parts of an egg's cytoplasm gives rise to cells with different growth potentials and the asymmetries required for embryo development. what we hope will become clear in this chapter and related chapters in this book is that we are beginning to appreciate the universality of polarity. the mechanisms involved in establishing and maintaining the polarized state appear to be so fundamental that some of the schemes through which a cell is able to localize a particular protein to a given cellular domain are turning out to be conserved between epithelia and neurons, and even between epithelia and yeast. while the need for protein asymmetries in development, or membrane polarity in epithelial transport, is clear, the means through which it is achieved are only beginning to be elucidated. before we embark upon our review of the field, we first introduce the conceptual framework onto which the results in this field are organized and interpreted. first, a protein destined to accumulate with a polarized distribution needs to be recognized as different from other proteins. we presume that what is recognized is some structural aspect of the protein itself. we refer to that part of the protein that is recognized for polarized localization as a sorting signal or localization determinant. these two terms are often used interchangeably, but in fact there is a subtle difference between the two. "sorting signal" is often taken to imply a signal that is recognized and acted upon before the protein is delivered to its ultimate residence. sorting signals are thought to be those signals that enable a cohort of similar proteins with similar destinations to be sorted and sifted away from all of the other molecules traversing the biosynthetic pathway at the same time.
a "localization determinant" is perhaps a more general term that carries fewer mechanistic implications. it is defined here as the determinant that specifies a protein's polarized distribution, but it does not make a distinction between recognition that takes place before the protein has reached its final destination or after (e.g., through a selective retention mechanism). the proteins which serve to recognize a particular signal and act upon it are generally referred to as sorting machinery. often, a distinction is made in the literature between "sorting" and "targeting machinery." in these cases, the sorting machinery is exclusively those elements which recognize the sorting signal. any downstream effectors of this sorter that orchestrate the vectorial directing of a vesicle to its final destination are referred to as targeting machinery. a simple schematic of these elements is presented in figure 2. as is discussed in the second half of this review, we know much more about general targeting machinery than the sorters themselves.
figure 2. one of many possible ways to think about how a secretory or membrane protein could be sorted into a vesicle. it is presumed that the "sorters" will recognize a sorting signal embedded within the protein structure. it seems likely that this recognition event would need to take place in the lumen of the golgi for a secretory protein, but this might not be necessary for a membrane protein, which could interact with a sorter from either a lumenal- or cytoplasmic-facing signal domain. ultimately, the sorted protein(s) could be contained within a "domain-specific vesicle," which would then be targeted (with the help of protein targeting machinery x, y, and z) to the appropriate apical or basolateral surface domain.
it is thought that proteins destined for either the apical or basolateral domain of a polarized cell occupy the same golgi cisternae during their biosynthesis (matlin and simons, 1984; misek et al., 1984; rindler et al., 1984; fuller et al., 1985; pfeffer et al., 1985). immunoelectron microscopic studies performed on nonpolarized endocrine cells which manifest two biochemically and kinetically different secretory pathways suggested that the process of sorting components away from one another takes place at the tgn (orci et al., 1987; tooze et al., 1987). however, recent studies have demonstrated that sorting may not take place exclusively at the tgn. sorting mechanisms have been suggested to take effect as early along the biosynthetic pathway as the er (balch et al., 1994) as well as at the recycling endosome (matter et al., 1993). in hepatocytes, sorting appears to occur after all newly synthesized membrane proteins are delivered to the basolateral plasma membrane (bartles et al., 1987). similar delivery routes have been detected in polarized intestinal epithelial cell lines (matter et al., 1990). finally, in at least one subclone of the canine renal mdck cell line, sorting may take place both at the golgi and at the level of the plasma membrane. while most proteins in this cell line are sorted in the tgn, the na,k-atpase can be preferentially localized to the basolateral membrane through domain-specific stabilization mechanisms after random insertion into both plasma membrane domains (hammerton et al., 1991; siemers et al., 1993). apically and basolaterally sorted proteins have been shown to be packaged into distinct classes of golgi-derived vesicles which are ultimately targeted to their appropriate domains.
recently it has been shown that membrane and secretory proteins are segregated into distinct vesicular carriers upon transport from the golgi to the basolateral surface of hepatocytes (saucan and palade, 1994). the extent to which distinct basolateral (or apical) proteins are cosorted and incorporated within the same vesicle, either due to common localization signals or the ability to co-aggregate, has not yet been determined. after proteins are sorted, the targeting of a vesicle to a particular surface domain can occur directly (vectorially) from the tgn to the apical domain (matlin and simons, 1984; rindler et al., 1984; fuller et al., 1985) or basolateral domain (caplan et al., 1986), or indirectly, as has been shown for the poly-immunoglobulin receptor (pigr) (mostov and deitcher, 1986). in the latter case, the protein is first targeted to the basolateral surface where the receptor can bind its ligand and is then transported to the apical surface via a process known as transcytosis (reviewed in mostov and simister, 1985). as noted above, in hepatocytes all apical proteins studied to date make use of this indirect pathway for apical delivery (bartles et al., 1987), while cell lines derived from intestine and kidney can employ both routes for surface delivery (matter et al., 1990; casanova et al., 1991; low et al., 1991). while the details of the routes have been determined for a number of sorting pathways, the molecular signals and recognition components which control each of them are not well understood. the search for these molecular signals and recognition components has been the focus of much study over the last 15 years. during this period, the subjects of protein sorting and epithelial polarity have been extensively reviewed.
several of these reviews are listed here for those seeking more background on specific aspects of this field: for general reviews on protein sorting pathways (burgess and kelly, 1987); general concepts of sorting and targeting (caplan and matlin, 1989); a discussion of the mechanisms required for the establishment and maintenance of epithelial polarity (rodriguez-boulan and nelson, 1989); polarized transport of surface proteins and lipids in epithelial cells; comparative epithelial and neuronal polarity (rodriguez-boulan and powell, 1992); the generality of the polarized phenotype (nelson, 1992); cytoskeleton as a component of the protein sorting machinery (mays et al., 1994); summary of the few known sorting signals in polarized epithelial cells; common signals involved in sorting from the tgn and endosomes. perhaps now more than ever before, it is becoming a rather daunting task to provide a synthesis of the observations relevant to the study of epithelial polarity. this is in part due to the fact that important insights into the mechanisms of sorting are being contributed by fields that are not exclusively focused on epithelial biology. as we discuss in this review, some important contributions are emerging from studies of endocytosis, secretion in yeast and neurons, and the sorting of yeast lysosomal enzymes (see chapter i of this volume), in addition to more "classical" approaches to epithelial polarity. in this review, we explore the current paradigm that the generation and maintenance of distinct membranous compartments requires "sorting signals," the recognition domains embedded within the amino acid sequence or polypeptide structure of the protein, and "sorting machinery," the proteins which interpret and act upon these signals. in the first half, we review and categorize the signals that have begun to be elucidated, as well as discuss the approaches and difficulties associated with finding and interpreting sorting signals.
while the polarity field itself has not yet succeeded in characterizing the definitive sorting machinery, numerous components of the membrane budding and fusion apparatus are rapidly being elucidated. we have chosen to review some of the important findings in the field of membrane transport, and in particular examine the potential roles that gtp-binding proteins of the rab, arf and heterotrimeric classes may play. we also discuss a class of proteins referred to as adaptins as well as the implications that the snare hypothesis may have for epithelial polarity. although these components have not been shown to be directly involved in sorting per se, it is becoming increasingly clear that in a general sense, the composition of the membrane vesicle budding and fusion machinery may be part of the overall apparatus which "acts upon" the sorted species and contributes to domain specific surface targeting. as stated above, the paradigm for conceptualizing the mechanisms responsible for biosynthetic sorting requires that each protein contains signal information embedded within its polypeptide sequence/structure (sorting signal) which is interpreted and acted upon by components referred to as sorting machinery. this scheme takes its cue from the process through which ribosomes translating secretory and membrane proteins are targeted to the endoplasmic reticulum to initiate cotranslational protein translocation (blobel, 1980) . prior to the elucidation of this process, it was suggested that protein targeting might require cellular sorting machinery to recognize certain signals which would be shared by proteins with common destinations (blobel, 1980) . shortly after this suggestion, it became clear that targeting to the rer, mitochondria and chloroplasts required short, contiguous, n-terminal signal peptides (reviewed in burgess and kelly, 1987) . 
in the case of the rer, the signal was recognized by a receptor, srp (lingappa et al., 1978; von heijne, 1984; kurzchalia et al., 1986; walter and lingappa, 1986). subsequently, a number of short, contiguous amino acid domains have been shown to play a role in later stages of post-synthetic targeting. these include: (1) the kdel and adenovirus e19 signals which ensure the retention or recapture of resident er proteins (munro and pelham, 1987; nilsson et al., 1989); (2) a transmembrane domain signal responsible for golgi retention (swift and machamer, 1991; machamer, 1993); (3) the cluster of positively charged lysine residues (sv40-nls) sufficient for nuclear targeting (richardson et al., 1986); (4) the critical tyrosine/"tight-turn" structural motif which can mediate localization to clathrin-coated pits (goldstein et al., 1985; pearse and robinson, 1990; collawn et al., 1991); and (5) the phosphorylated sugar residue (mannose-6-phosphate) whose recognition targets lysosomal hydrolases to lysosomes (reviewed by kornfeld and mellman, 1989). in several of these cases receptors for these signals have been well-characterized: the signal recognition particle (srp) for secretory and membrane proteins (walter and lingappa, 1986), the mannose-6-phosphate receptor (m6pr) for the targeting of lysosomal hydrolases to the lysosome (sly and fischer, 1982; von figura and hasilik, 1986), the kdel receptor (tang et al., 1993) and the adaptins which couple coated pit localization sequences to clathrin cages (pearse and robinson, 1990; robinson, 1994). the search for definitive signals which mediate the delivery of proteins to a particular epithelial surface domain has proven to be quite difficult. this is due in part to general limitations imposed by certain molecular biological approaches, as well as to some inherent difficulties specific to the investigation of epithelial polarity.
our goal in this section is to outline reasonable criteria for the identification of a sorting signal. the observation that the influenza and vesicular stomatitis viruses bud from opposite surface domains of polarized madin-darby canine kidney (mdck) cells (rodriguez-boulan and sabatini, 1978) spawned an extensive search in which chimeric and deletion analyses were applied to the problem of identifying the underlying apical and basolateral sorting signals (reviewed in caplan and matlin, 1989). these efforts to characterize sorting signals have generally involved the generation of chimeric or truncated constructs prepared from portions of apical and basolateral membrane proteins. through analysis of the subcellular distributions of the resulting proteins, sorting information can, at least in theory, be assigned to particular portions of the parent molecules. while a large number of chimeric and truncated viral glycoproteins have been generated and analyzed, it has been difficult to interpret many of the resultant observations. with the benefit of hindsight, we now know that these difficulties can be attributed to a number of issues that we discuss in more detail below (including the tertiary structures of the experimental constructs, the confounding possibilities introduced by uncharacterized default pathways, and the potential for multiple and hierarchical signals to be embodied within the structures of the studied proteins). until recently (thomas and roth, 1994), the analysis of viral spike glycoproteins did not produce a definitive sorting signal. much of the uncertainty associated with this work is likely attributable to the fact that these studies engineered chimeras from portions of structurally dissimilar molecules. the tertiary structures of the resultant chimeras may thus differ substantially from those of either parent molecule, which may in turn exert unpredictable effects upon sorting behavior.
clearly, if sorting signals are formed from domains arising from noncontiguous regions of a polypeptide, for example, in much the same manner that heterotrimeric g proteins are thought to "see" their effectors (berlot and bourne, 1992), or in the way that the human growth hormone receptor (hghbp) is thought to interact with its ligand (cunningham and wells, 1989), it is easy to imagine how the structural integrity of the putative sorting signal could become compromised in a chimeric construct. while producing a rough map of the signal-bearing domain of a protein can be relatively straightforward, determining the exact residues which constitute the signal is turning out to require a combination of many different types of mutagenesis approaches. often, contradictory results can arise from alanine scanning, truncation and point mutation/deletion mutagenesis, since a mutated protein can manifest impaired sorting behavior even though the altered residues are not part of the actual sorting signal (aroeti et al., 1993). it is becoming clear that a judicious and thorough comparison of many different types of mutagenesis approaches may be necessary to determine definitively the key residues necessary for sorting. perhaps another difficulty in looking for apical or basolateral sorting signals is that the default pathway for "signal-less" membrane proteins is still not known. a protein that is sorted "by default" is, by definition, unable to interact with and be acted upon by any sorting machinery whatsoever. in theory, at least, such "unsorted" proteins may be distributed with polarity, depending on the nature and characteristics of the membrane vesicular traffic arising from the golgi complex in a particular cell type. obviously, if the localization of a protein construct under study is identical to that produced by the cell's default pathway, elucidation of a signal will be difficult, since elimination of the signal will not alter the protein's distribution.
thus, one can appreciate the difficulty in assigning localization information to a particular domain in the context of an undefined default pathway. this caveat explains, at least in part, why a definitive basolateral sorting signal in the c-terminal domain of vsv-g protein took so long to discern. in the following example we summarize the ha-vsvg spike glycoprotein chimera literature as a means to illustrate the difficulties in interpreting these types of studies. when a cdna encoding the influenza ha was expressed in mdck cells, the encoded protein localized to the apical membrane (roth et al., 1983), while a cdna encoding the vsvg polypeptide produced a protein that was localized to the basolateral domain (gottlieb et al., 1986b; stephens and compans, 1986). when truncation mutants were expressed in which soluble ectodomain versions of these proteins were synthesized, the vsvg ectodomain was secreted from both apical and basolateral domains (gonzalez et al., 1987) while the ha ectodomain was predominantly secreted from the apical domain (gonzalez et al., 1987; roth et al., 1987b). based on evidence that the default pathway for secreted proteins leads to nonpolarized secretion from both surface domains (kondor-koch et al., 1985; gottlieb et al., 1986a; caplan et al., 1987), it was reasoned that the ectodomain of ha encodes an apical sorting signal while the vsvg ectodomain lacks signal information. this was further confirmed by the observation that a hybrid ha-vsvg protein comprising the ha ectodomain fused to the vsvg transmembrane and cytoplasmic tail region was targeted to the apical membrane (mcqueen et al., 1986; roth et al., 1987a). but if the vsvg ectodomain is randomly secreted and the vsvg tail domain fused to ha is apical, which domain of vsvg encodes basolateral sorting information?
the complementary hybrid comprised of the ectodomain of vsvg (presumably signal-less) tethered to the ha transmembrane and tail region (perhaps also signal-less) was targeted either to the basolateral membrane or to both surface domains (mcqueen et al., 1986; puddington et al., 1987; roth et al., 1987a; compton et al., 1989). the interpretation of the behavior of this chimera was clearly complicated; it was suggested that this protein could be pursuing its distribution by default. (as discussed above, the default pathway for membrane proteins is still not defined in polarized cells). an alternative interpretation was that the vsvg ectodomain indeed contains basolateral sorting information, but that perhaps this domain needs to be tethered to the plasma membrane with a transmembrane anchor in order to interact with its presumptive sorting machinery. this interpretation, however, was proved incorrect by the observation that the anchoring of this ectodomain to the membrane through a lipid-linkage resulted in apical targeting (brown et al., 1989). interestingly, when the ectodomain of the normally apical placental alkaline phosphatase (plap) was attached to the vsvg transmembrane and cytosolic tail domains (which were thought to lack a dominant signal), the resulting chimeric protein was targeted basolaterally. it is difficult to reconcile the ha-vsvg and plap-vsvg chimeras without invoking hierarchical and competing signals. recently, a basolateral targeting signal has been precisely localized to the cytoplasmic domain of the vsvg protein (thomas and roth, 1994). in light of the vicissitudes which attended the interpretation of each round of chimeric constructs discussed above, it was certainly unexpected that definitive sorting information would be localized to the cytoplasmic tail of vsvg. the nature and function of this signal will be discussed in depth below.
the preceding discussion was presented simply to reinforce the caveat that the default pathway, protein structural considerations and the possible interactions between "dominant" and "recessive" sorting signals can considerably cloud the interpretation of chimera experiments. recent studies of the polymeric immunoglobulin receptor (pigr), the low density lipoprotein receptor (ldlr) and polytopic hetero-oligomeric proteins (h,k-atpase and na,k-atpase) suggest that individual proteins can interact in multiple and complex fashions with the machinery responsible for surface targeting. it is becoming increasingly clear that there can be an array of signals encoded within an individual protein, and the sorting problem is becoming ever more complicated by the apparent redundancy, multiplicity and hierarchical nature of these signals (matter et al., 1992; mostov et al., 1992). for example, brewer and roth's (1991) demonstration that they could completely overwhelm the apical signal present in the ha ectodomain and redirect it to the basolateral surface by changing a single amino acid in this protein's cytoplasmic tail strongly suggests that multiple signals present in a single protein can interact in a hierarchical fashion. the newly created cytoplasmic signal is dominant over the presumed apical sorting signal present in the ectodomain of ha. as discussed below, the ldl receptor has been shown to encode redundant basolateral sorting information, since either of two cytoplasmic determinants could independently mediate basolateral delivery (matter et al., 1992). moreover, the protein may also contain a cryptic apical sorting signal in its ectodomain, since a cytoplasmic tail-minus construct of this protein (ct12) is sorted with great efficiency to the apical membrane in mdck cells (matter et al., 1992). an ectodomain apical localization signal has also been found within the pigr, whose initial surface delivery is to the basolateral plasmalemma.
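the dominance and redundancy relationships described above can be sketched as a small toy model. this is only an illustration under stated assumptions, not a mechanism proposed in the literature: the `Signal` class, the `predicted_surface` function and the rule that a cytoplasmic signal masks any ectodomain signal are all hypothetical, chosen solely to reproduce the ha and ldlr observations cited here. a protein with no signal returns "undefined", reflecting the fact that the default pathway for membrane proteins is not known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """Hypothetical description of one sorting signal (illustration only)."""
    name: str
    destination: str  # "apical" or "basolateral"
    location: str     # "cytoplasmic" or "ectodomain"


def predicted_surface(signals, default="undefined"):
    """Toy rule (an assumption): a cytoplasmic signal, if present,
    is dominant over any ectodomain signal."""
    for location in ("cytoplasmic", "ectodomain"):
        for signal in signals:
            if signal.location == location:
                return signal.destination
    # No signal at all: the default pathway is unknown, so no prediction.
    return default


# Signals named in the text; the ldlr ectodomain signal is cryptic (inferred).
HA_ECTO = Signal("ha ectodomain", "apical", "ectodomain")
HA_TAIL_MUTANT = Signal("ha tail point mutation", "basolateral", "cytoplasmic")
LDLR_PROXIMAL = Signal("ldlr membrane-proximal", "basolateral", "cytoplasmic")
LDLR_DISTAL = Signal("ldlr membrane-distal", "basolateral", "cytoplasmic")
LDLR_ECTO = Signal("ldlr ectodomain (cryptic)", "apical", "ectodomain")

constructs = {
    "ha wild type": [HA_ECTO],
    "ha tail point mutant": [HA_ECTO, HA_TAIL_MUTANT],
    "ldlr wild type": [LDLR_ECTO, LDLR_PROXIMAL, LDLR_DISTAL],
    "ldlr proximal determinant only": [LDLR_ECTO, LDLR_PROXIMAL],
    "ldlr distal determinant only": [LDLR_ECTO, LDLR_DISTAL],
    "ldlr tail-minus (ct12)": [LDLR_ECTO],
    "signal-less protein": [],
}

for name, signals in constructs.items():
    print(f"{name}: {predicted_surface(signals)}")
```

the point of the sketch is that removing one of two redundant basolateral determinants leaves the prediction unchanged, whereas removing the whole cytoplasmic tail unmasks the ectodomain signal. it deliberately says nothing about proteins such as the h,k-atpase, for which the text does not assign the dominant signal to a cytoplasmic or ectodomain location.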
why do these proteins need multiple signals? what does the ldlr gain by expressing two basolateral localization signals? recent studies (discussed in greater detail in the following section) have more finely decoded these two signals and are revealing functional differences. for example, the "membrane proximal determinant" encodes coated-pit internalization information, while the "membrane distal determinant" appears to ensure efficient sorting from a basolateral endosome back to the basolateral surface (matter et al., 1993). analysis of the sorting behavior of multisubunit ion pumps provides further insight into the possible utility of multiple signals.
the gastric h,k-atpase and the na,k-atpase are close cousins in the large family of p-type ion transporting atpases. both are composed of 100 kda α-subunits and heavily glycosylated 55 kda β-subunits. they share similar reaction mechanisms and catalytic properties and, not surprisingly, are highly homologous at the amino acid sequence level. the α-subunits are ~65% identical, whereas the β-polypeptides manifest roughly 40% identity. while the na,k-atpase is a basolateral protein in most polarized epithelial cell types (with the exception of neural epithelia such as choroid plexus and retinal pigment epithelium), the h,k-atpase occupies the apical membrane and a pre-apical storage compartment in gastric parietal cells. hormonal stimulation of gastric acid secretion induces fusion of the membrane vesicles which comprise the intracellular reservoir with the plasma membrane, resulting in delivery of the h,k-atpase to the apical cell surface. during the interdigestive period, the h,k-atpase is re-endocytosed and returned to its storage compartment. chimera studies reveal that each subunit of the h,k-atpase possesses a sorting signal which participates in regulating this complex traffic.
the α-subunit is endowed with a dominant apical targeting signal, which can drive the apical sorting of chimeric pumps expressed in both mdck and llc-pk1 renal epithelial cells. the β-subunit of the h,k-atpase possesses a tyrosine-based endocytosis signal (roush et al., manuscript submitted). this signal causes the protein to be sorted basolaterally when it is expressed in mdck cells and apically when it is expressed in llc-pk1 cells. the na,k-atpase β-subunit does not possess a similar sequence domain. it seems likely that the two h,k-atpase signals participate in distinct stages of pump sorting in the gastric parietal cells. the apical signal in the α-subunit probably mediates the sorting of the entire complex to the apical membrane or the pre-apical storage compartment, whereas the β-subunit signal is responsible for ensuring the re-internalization of the pump following the cessation of secretagogue stimulation (courtois-coutry et al., 1997). it remains to be determined why the β-subunit's tyrosine-based signal is differentially interpreted by mdck and llc-pk1 cells. investigation of this phenomenon may shed light on the nature and function of the epithelial sorting machinery.
this apparent trend towards a multiplicity of signals is not entirely surprising, since many proteins are required to perform highly sophisticated feats of membrane targeting during the course of their transits throughout the endomembranous networks of the cell. for example, the pigr expressed in its native hepatocytes or by transfection in mdck cells travels first to the basolateral membrane to pick up ligand and is then transported to the apical surface domain. it appears that an apical sorting signal in this protein's ectodomain might be required for basolateral to apical transcytosis, while a basolateral signal in the cytoplasmic domain ensures the initial basolateral delivery.
unlike proteins that are constitutively expressed at one surface domain, a number of distinct and individually acting signals are necessary to orchestrate the more complicated surface targeting events displayed by the pigr and other molecules like it. obviously, the hierarchical (both temporal and spatial) regulation of each signal will be of utmost importance in ensuring that a protein follows a physiologically relevant trafficking pathway. recent evidence, for example, demonstrates that the pig receptor undergoes phosphorylation on a cytosolic serine residue around the time that it is delivered to the basolateral surface (larkin et al., 1986). this phosphorylation event appears to inactivate the protein's basolateral signal and thus permit its transcytosis to the apical membrane (casanova et al., 1990).
perhaps not surprisingly, the greatest advances in the elucidation of sorting signals have been made with single membrane-spanning monomeric or homooligomeric proteins (e.g., pigr, ldlr, tfr). with these molecules the requirements for surface expression are easily met and the effects of mutagenesis on tertiary structure can be assessed through well-characterized functional assays, such as receptor-ligand or antibody binding. through deletion analysis and heterologous expression in mdck cells, it was determined that the pigr (casanova et al., 1991) and the ldlr (hunziker et al., 1991) each contained basolateral targeting determinants which mapped to short, contiguous regions of their cytoplasmic domains (table 1). both signals could be grafted onto heterologous proteins and cause them to be targeted to the basolateral surface, supporting the idea that each determinant was truly an autonomous basolateral sorting signal. exhaustive mutagenesis studies have more finely mapped each of these determinants.
the ldlr possesses two distinct basolateral targeting determinants, one that is "coated-pit related" (proximal determinant) and another which is tyrosine-dependent but not capable of mediating localization into coated pits (distal determinant) (matter et al., 1992). interestingly, the polymeric immunoglobulin receptor (pigr) signal may constitute yet another class of basolateral targeting determinant, since it shares little in the way of sequence homology with either determinant of the ldlr and shows weak tyrosine dependence (aroeti et al., 1993). the general characteristics of these three determinants and the degree to which they are related are only beginning to be elucidated (thomas and roth, 1994). an attempt to categorize these basolateral sorting determinants has been made and is summarized in table 1.
before discussing the nature of the "coated-pit related" basolateral targeting determinant, it is necessary to be familiar with the signals that are known to mediate the accumulation of plasma membrane receptors into clathrin-coated pits (goldstein et al., 1985). it is now generally accepted that tyrosine- and dileucine-containing sequence motifs present in the cytoplasmic tails of a number of coated-pit clustering proteins serve as the critical recognition elements for the adaptor components of clathrin coats (pearse and robinson, 1990; trowbridge, 1991). more recently, numerous studies have demonstrated a strong relationship between the signals which mediate localization into coated pits and a subset of those involved in basolateral targeting (brewer and roth, 1991; hunziker et al., 1991; lebivic et al., 1991). for example, brewer and roth (1991) found that the apically targeted ha molecule could be completely rerouted to the basolateral membrane by replacing a strategically localized cysteine residue (cys 543) with a tyrosine in the cytoplasmic domain.
this tyrosine was also sufficient to localize this protein into coated pits and direct the protein's incorporation into endosomes. this observation that an endocytosis signal might also double as a basolateral targeting signal led to the suggestion that the recognition determinants for endocytosis and for tgn-to-basolateral targeting might be similar or identical to one another. thorough mutagenesis studies on the coated-pit localization and basolateral sorting determinants of ha-y543 (thomas and roth, 1994; lin et al., 1997), vsvg protein (thomas et al., 1993), and the ldlr, however, have led to a revision of this initial interpretation. it turns out that the "endocytosis signal" of both the ha-y543 and the ldlr (proximal signal) can be resolved into two overlapping but distinct signal components. in other words, there is information recognized for endocytosis that is distinct from that recognized for basolateral sorting, even though the sequences are in part superimposed and share marked similarity. table 2 shows the systematic mutagenesis that ultimately unraveled this relationship. brewer and roth (1991) found that ha-y543 is capable of both basolateral sorting and endocytosis. the second generation mutant ha-y543,r546, however, behaved as a protein that was capable of endocytosis, but whose basolateral localization was inhibited (lin et al., 1997). similar results were found with the ldlr proximal determinant. matter and colleagues (1994) showed that the truncation mutant ct27 was basolaterally targeted and rapidly endocytosed, while the removal of terminal acidic residues in ct22 produced a protein that was not capable of basolateral targeting, but could nonetheless be endocytosed. thus, the initial correlation between endocytosis signals and basolateral targeting has now resolved into two distinct but overlapping signals that can share common residues for their respective activities.
the implications of this result are very exciting for the field of epithelial polarity. first, they suggest that the signals for basolateral sorting/targeting may be structurally similar to signals for clathrin-coated pit localization and endocytosis. the involvement of similar signals suggests that the sorting/recognition molecules themselves may be related. at least for endocytosis signals, there is evidence in favor of clathrin "adaptins" (of the ap2 plasma membrane class) playing a role in recognizing these sequences (pearse, 1988; glickman et al., 1989; beltzer and spiess, 1991; sorkin and carpenter, 1993; sosa et al., 1993). in light of the recent characterization of adaptin-related molecules (cops, discussed in section iii, below), it has been suggested that a family of structurally and functionally similar sorting adaptors may serve as the sorting machinery which interacts with these basolateral sorting signals. the findings support the more general contention that sorting at the level of the tgn may be mechanistically similar to that at the level of the endosome (matter et al., 1993).
taken altogether, there now appear to be two general classes of basolateral targeting determinants. one of these is biochemically related to the signals that mediate sorting into coated pits. this type of signal can be colinear with an endocytosis determinant and may share the critical tyrosine residue required for the activity of both, but it is nonetheless distinct and dissociable from an endocytosis signal. the second class of basolateral targeting determinants appears to be unrelated to clathrin-coated pit localization signals, although it may also strongly depend on a tyrosine for activity. this second type of determinant appears to be unique to the ldlr, pigr (casanova et al., 1991) and the tfr (dargemont et al., 1993), although these signals share no primary sequence homology with one another.
it is possible, however, that this second determinant present in these three proteins may be mutually similar in three-dimensional structure but not in primary sequence. in this context it is important to note that adaptor proteins are thought to recognize tyrosine residues in the context of a tight turn, which can be achieved by many different primary sequences (glickman et al., 1989; collawn et al., 1990, 1991; bansal and gierasch, 1991). more detailed analyses are revealing that while the dependency on tyrosine is crucial, other residues which are acidic and c-terminal to the tyrosine are also important. it has been demonstrated that clusters of two or more acidic amino acids downstream from a tyrosine, phenylalanine or di-leucine are important for signal function (see table 1). while the authors of this study have argued that it is premature to propose a common motif characteristic of all basolateral targeting determinants, they have found that this critical aromatic amino acid followed by acidic residues can be discerned in the cytoplasmic domains of many known proteins which are targeted to the basolateral membrane of mdck cells, including e-cadherin, transferrin receptor, cation-independent and cation-dependent mannose-6-phosphate receptors, lap, pigr and fcriib2. as these authors have suggested, it will be exciting to define mutations that will prevent the recognition of these sequences so that the identification and characterization of the molecules which serve to interact with and interpret these signals can be facilitated.
an ever growing list of proteins is anchored to membranes through a covalent attachment to glycosylphosphatidylinositol, or gpi. proteins of this class are initially synthesized on bound polysomes as transmembrane polypeptides and, while still resident within the er, are cleaved from their transmembrane portions and transferred covalently to lumenally facing glycosylphosphatidylinositol molecules (cross, 1990).
gpi-anchored proteins are widely distributed with respect to both cell type and function. members of this class of proteins include protozoal surface coat proteins (e.g., the variant surface glycoproteins of trypanosomes), differentiation antigens (e.g., thy-1), adhesion molecules (e.g., the gpi-linked isoform of n-cam), hydrolases (e.g., alkaline phosphatase and 5'-nucleotidase), and receptors (e.g., folate receptor). the functional advantages that this membrane linkage confers upon a particular protein are presently unclear, and have been the focus of a great deal of attention (reviewed in brown, 1992). in general, the gpi-linkage has been suggested to be important for enabling proteins to "cluster" at a surface density much higher than is possible for single-pass transmembrane proteins (hooper, 1992). studies have also shown that these clusters of gpi-anchored proteins may be important for certain cell surface signal transducing events (reviewed in anderson, 1993). gpi-linked proteins captured the attention of epithelial biologists because of their polarized distribution in mdck cells (lisanti et al., 1988) and other cultured epithelial cell lines (lisanti et al., 1990). the nearly exclusive correlation of membrane anchoring via gpi with apical localization raised the question as to whether or not the gpi membrane anchor was itself a signal for apical targeting. chimeric analyses showed clearly that the gpi-linkage is sufficient for apical targeting in mdck cells (brown et al., 1989; lisanti et al., 1989a,b). of course, in the absence of a known default pathway for membrane proteins, it remains formally possible that the gpi-anchor prevents a protein from gaining entry into the basolateral sorting pathway. moreover, the fact that the cytoplasmic tail-minus versions of the ldl and pig receptors are directly targeted to the apical membrane is consistent with the possibility that apical sorting occurs by default.
nonetheless, the gpi-linkage is the field's best accepted apical localization signal characterized to date. interestingly, glycosphingolipids (gsls) share the apical preference of gpi-linked proteins and are generally found exclusively in the outer leaflets of the apical membranes of mdck cells. the means through which gpi-anchored proteins and gsls are sorted and subsequently targeted to the apical membrane are poorly understood. it has been shown that gsls manifest biophysical properties which enable them to self-associate or form clusters in the plane of the membrane (thompson and tillack, 1985). these properties have been invoked to support the proposal that gsl clustering occurs at the level of the tgn, and that newly synthesized gpi-linked proteins might co-cluster with these lipids (simons and van meer, 1988). it has been further suggested that apically-destined transmembrane proteins could similarly be sorted through an ability to co-cluster with gsls and gpi-linked proteins. according to this model, apical sorting could take place through selective inclusion within these gsl microdomains, while certain basolateral membrane protein components would be sorted by selective exclusion. however, it should be pointed out that there is still no experimental evidence showing that the gsl clusters are important for apical sorting. one cell line in particular suggests that the role of gsls in sorting of gpi-anchored proteins may be more complex. a rat thyroid epithelial cell line (frt) distributes its gsls and gpi-anchored proteins to the basolateral surface while the polarized distribution of a number of transmembrane proteins is identical to that of mdck cells (zurzolo et al., 1993). this suggests that at least some of the apical proteins analyzed (e.g., ha) do not partition with basolaterally directed gsls.
the frt cell line will serve as an excellent tool for furthering our understanding of the role of glycolipid clustering in the sorting of proteins and lipids in polarized epithelial cells.
most of the early studies in epithelial polarity used the kidney-derived mdck cell line as their workbench. however, the last six years have seen the introduction of a number of new cell culture models into the field: caco2 (pinto et al., 1983; matter et al., 1990; costa de beauregard et al., 1995); ht-29 and t-84 (human intestinal epithelial) (madara et al., 1987; polak-charcon et al., 1989; mikogami et al., 1994); llc-pk1 (pig kidney proximal tubule) (hull et al., 1976; gstrauthaler et al., 1985; gottardi et al., 1995); mdbk (madin-darby bovine kidney) (furuse et al., 1994); frt (fischer rat thyroid) (zurzolo et al., 1993); as well as primary cultures of choroid plexus and retinal pigmented epithelium (marrs et al., 1993). as we have discussed in the first half of this review, we are just beginning to elucidate the nature of certain "apical" and "basolateral" sorting signals. however, the "nonstandard" sorting of gpi-linked proteins in frt cells mentioned above, and the fact that a number of proteins display tissue and cell-type specific membrane localizations (see table 3), call into question the ways in which we think about polarized sorting signals and the mechanisms of sorting. as shown in table 3, there are notable differences in the localization of certain membrane proteins expressed in different tissue cell-types. the na,k-atpase, nearly ubiquitously expressed at the basolateral domain of most polarized cell types, is localized to the lumenal (apical) domain of both retinal pigmented epithelial and choroid plexus cells (wright, 1972; steinberg and miller, 1979; spector and johanson, 1989; gundersen et al., 1991).
when the cdna encoding the ldl receptor was placed under the control of a metallothionein promoter and employed in the generation of a transgenic mouse, the receptor was expressed at the basolateral domains of liver and intestinal epithelial cells, but unexpectedly localized to the apical domains of proximal kidney tubule cells (pathak et al., 1990). the polarized budding of certain viruses and the localization of their respective spike glycoproteins was shown to vary considerably between kidney-derived mdck and thyroid-derived frt cells (zurzolo et al., 1992a). in some instances, a shift in the type of targeting pathway used by a protein can depend on the differentiated state of the cell culture (zurzolo et al., 1992b). furthermore, the polarized localization of a particular gpi-linked protein was found to be developmentally regulated in drosophila embryos (shiel and caplan, 1995a). finally, a remarkable flexibility and "plasticity" of protein sorting has been suggested to be present in kidney intercalated cells, which appear to direct the vacuolar proton pump to either surface domain, depending on particular environmental cues (schwartz et al., 1985; brown et al., 1988a).
at the present time, we have little understanding of the signals or sorting mechanisms that mediate the differential sorting of the same protein in distinct cellular types. are different signals recognized by the different epithelial cells or is the same signal interpreted differently? is the sorting machinery itself different between polarized cells, or is the sorting machinery basically conserved between different cell-types while its regulation, adaptation, or wiring to the targeting machinery is different? evidence discussed in the second half of this review on the rab family of proteins suggests that elements of the targeting machinery are in fact highly conserved between different cell types, and it is the cell-type specific adaptation of this machinery which accounts for differences.
nonetheless, it is becoming clear that the sorting of a particular protein can be a highly idiosyncratic feature of each polarized cell model. the observation that different epithelial cell lines may handle the same protein (or the same signal) differently has to reflect more than a mere capriciousness of epithelial cells in culture. each of the cultured cell models employed in polarity studies derives from and reflects some of the differentiated features of a tissue or organ system. accordingly, the sorting behavior observed in a particular cell type needs to be evaluated in the context of this cell's functional history. for example, is this cell derived from a tissue specialized for apical secretion or apical endocytosis? studies of the sorting of ion-transporting atpase molecules expressed in distal tubule-derived mdck and proximal tubule-derived llc-pk1 kidney cells suggest that the distinct cell surface distributions which an atpase subunit achieves in these two lines are consistent with established physiologic differences between the distal and proximal tubule epithelial cells (roush et al., manuscript submitted). these observations have led to the suggestion that sorting mediates delivery to functionally defined rather than topographically defined domains (gottardi and caplan, 1993a).
it is becoming quite clear that the findings in the field of intracellular protein transport (reviewed by rothman, 1994 and by mellman, 1994) will prove to be extremely valuable to the discipline of epithelial polarity. in this field, the convergence of studies on synaptic vesicle (regulated) secretion in neurons, constitutive secretion in yeast, and intra-golgi transport has led to the rapid identification and characterization of the basic components necessary for vesicle formation, docking and fusion.
clearly, the general components of the budding and docking machinery lie at the heart of any transport process, whether we are considering the transport of a membrane protein from er to golgi, or a secretory protein from the tgn to a particular cell surface. in the following sections we touch upon some of the key discoveries in the field of intracellular transport and focus on the relevant molecules that may contribute to polarized sorting and delivery processes.
one of the recent paradigms in intracellular protein transport is based on the concept that vesicle shuttling between different organellar compartments is regulated through the coordinated efforts of different gtp-binding proteins. there are two broad classes of gtp-binding proteins which have been shown to regulate membrane trafficking events: the small g proteins (rabs and arf; reviewed by donaldson and klausner, 1994; pfeffer, 1992, 1994) and the trimeric g proteins (reviewed by bomsel and mostov, 1992). the role of a gtp-binding protein in regulating vesicular transport was first realized with the analysis of one of the temperature sensitive sec (secretory) mutants in yeast (salminen and novick, 1987). sec4 mutants display a rather striking accumulation of secretory vesicles when cultured at the restrictive temperature. the cloning, sequencing, and characterization of the sec4 gene revealed that it encoded a ras-like or 'small' gtp-binding protein which was present on the surfaces of the vesicles and could bind and hydrolyze gtp (salminen and novick, 1987; goud et al., 1988; kabcenell et al., 1990). since the phenotype of cells bearing mutant sec4 is the accumulation of transport vesicles, it was apparent that sec4 is necessary for the targeting and/or fusion of secretory vesicles with the plasma membrane.
similar results were found with another yeast protein, ypt1 (48% identical to sec4), which in its mutant form inhibited vesicular transport between the er and golgi complex (gallwitz et al., 1983; segev et al., 1988). the suggestion that two small gtpases were important in the regulation of two different vesicular transport events in yeast led to the hypothesis that each step in vesicular traffic was regulated by a specific gtpase (bourne et al., 1990). these ras-like gtpases are known to adopt either of two distinct conformations, depending upon whether they are complexed with gtp or gdp. consequently, these gtpases have been postulated to serve as key regulators or "molecular switches" for membrane fission and fusion events. the apparent generality of ras-like gtpases in yeast, as revealed by sec4 and ypt1, inspired a search for these proteins' mammalian counterparts. to date, 30 ypt1/sec4-related proteins have been identified and are often referred to as rab proteins ("ras-like" proteins from rat brain) (reviewed in balch, 1990; hall, 1990; goud and mccaffrey, 1991; zerial and stenmark, 1994). a number of rabs have been localized to specific organelles within the cell and, through the combined efforts of in vitro and in vivo approaches, have been shown to regulate membrane traffic between these organelles (reviewed in zerial and stenmark, 1994). how this class of molecules contributes to the overall fidelity of membrane trafficking events is still unclear (rothman, 1994). the idea that specific rab proteins regulate distinct steps along the transport pathway (e.g., rab1 always regulates er to golgi traffic, whether in a kidney cell or neuron) led to the hypothesis that cells which contain unique, cell-type specific transport processes might be regulated by distinct rabs. indeed, the best example of this is the family of rab3 isoforms, which have been found to be localized within cells which are well-adapted for regulated secretory events.
rab3a has been suggested to be important in the regulation of ca2+-dependent secretion in neuronal (fischer von mollard et al., 1991), neuroendocrine (darchen et al., 1990) and endocrine cell types (mizoguchi et al., 1989). interestingly, an isoform of rab3a, rab3d, has been localized to the glucose transporter-containing vesicles of adipocytes, which are known to undergo regulated exocytosis after insulin stimulation (baldini et al., 1992). thus, despite cell-specific or vesicle-content differences, these regulated pathways rely on similar rabs (rab3); distinct regulated exocytic events in a variety of cell types make use of similar molecular machinery (lutcke et al., 1993). in this context, it has been speculated that polarized epithelial cells, with their distinct apical and basolateral targeting pathways, may employ epithelia-specific rab molecules. recent data suggest that this may be true. there are four rabs which have been implicated in polarized epithelial-specific functions: rab17, rab3b, rab13, and rab8. of the four, only rab17 is truly specific to polarized epithelial cells. in the developing kidney, rab17 mrna is detected only after mesenchyme is induced to differentiate into polarized epithelial structures (lutcke et al., 1993). interestingly, rab17 induction was shown to occur just prior to the appearance of apical markers and has therefore been suggested to be involved in the generation of apical/basolateral polarity in these cells. rab17 localizes to the basolateral membrane and to electron dense tubules near the apical membrane. since rab proteins have been shown to regulate transport between the subcellular compartments with which they associate, it has been suggested that rab17 regulates epithelial transcytosis. as we stated previously, two isoforms of rab3 (3a and 3d) have been implicated in the regulated exocytosis events shared by neuronal, endocrine and adipocyte cell types.
interestingly, another isoform of rab3, rab3b, has been shown to be specific for polarized epithelial cells and is exclusively localized to the apical pole of cells, near the tight junctions (weber et al., 1994). rab13, like rab3b, also accumulates at the apical poles of polarized cells and co-localizes with the tight junction associated protein, zo-1 (zahraoui et al., 1994). it has been suggested that these two rabs could regulate events necessary for the establishment of polarity. for example, since the localization of both rabs is completely dependent on the presence of cell-cell contacts, it is possible that these molecules control the recruitment of membrane protein-containing vesicles required for establishing the tight junction "fence," a structure thought to maintain the distinct protein and lipid compositions of apical and basolateral membranes (dragsten et al., 1981). it has also been proposed that these rabs control general vesicle targeting to the apical membrane (zahraoui et al., 1994). this hypothesis was based on two independent observations. it has been shown that an apical membrane protein (aminopeptidase) inserts preferentially into the apical membrane at regions of cell-cell contact in mdck cells (louvard, 1980). furthermore, under conditions in which mdck cells are denied intercellular contacts, apical proteins appear to be sorted and retained within a large subapical vacuolar compartment (vacuolar apical compartment, or vac) which, after initiation of cell-cell contact, is inserted preferentially at regions of cell-cell contact (vega-salas et al., 1988). taken together, the localization of rab13 and rab3b at this region of cell contact places these monomeric gtpases in a position to regulate the delivery of apical proteins to the cell surface (zahraoui et al., 1994).
moreover, the localization of a regulated, exocytic compartment-specific rab (rab3) to a subdomain of the apical membrane of polarized cells is intriguing and suggests possible functional relationships between these subcellular compartments. the last rab worth exploring in the context of epithelial polarity is rab8. while rab8 is not solely expressed in polarized cells, it is the only rab that has been functionally implicated in vectorial targeting. a peptide derived from the c-terminal region of rab8 can inhibit basolateral but not apical transport of membrane proteins in a permeabilized-mdck cell assay (huber et al., 1993a). interestingly, rab8 can also regulate membrane transport to the dendritic plasma membranes of neurons in culture; antisense rab8 oligonucleotides decrease the level of viral glycoprotein transported to this domain (huber et al., 1993b). this observation is consistent with the model which suggests that the mechanisms which produce axon/dendrite polarity in neurons may be similar to those involved in apical/basolateral polarity in epithelia (simons et al., 1992). taken together, the identification of a polarized epithelia-specific rab (rab17), and the localization of other rabs to specific polarized epithelial domains (rab13 and rab3b, apical; rab8, basolateral) suggest that rabs may regulate specific pathways in polarized epithelial cells. for the epithelial cell biologist, the obvious question is, "what brings about the pathway-specific localizations of rab proteins in polarized epithelial cells?" it has been demonstrated that the carboxy-terminal regions of rab proteins are responsible for their unique cellular localizations (chavrier et al., 1991). it has been suggested that organelle-specific receptors exist which recognize the c-terminal domains of these molecules.
at least in terms of polarized cells, it would be tempting to speculate that identification of such receptors for rab13, rab3b and rab8 will bring us one step closer to an understanding of the overall machinery that orchestrates domain-specific vesicle formation and targeting. recent evidence, however, suggests that rabs may not provide the primary level of specificity in membrane targeting events (brennwald and novick, 1993; reviewed by rothman and warren, 1994). as we discuss below, a new class of proteins, the snares, may provide the necessary specificity for vesicle-membrane targeting events throughout the cell.
the snare hypothesis for vesicle targeting arose from research in three related fields: synaptic vesicle release in neurons, transport between cisternae of the golgi, and secretion in yeast. briefly, a number of synaptic proteins were discovered to be important for the regulated fusion of synaptic vesicles with their targets on the pre-synaptic plasma membranes (reviewed by pevsner and scheller, 1994). homologues of these proteins were found in yeast and shown to be required for constitutive vesicle transport (aalto et al., 1993). at the same time, key elements of the general machinery for intracellular membrane fusion were being elucidated. in all three cases, membrane fusion requires an nem-sensitive factor (nsf), adaptors that link nsf to membrane proteins (snaps: soluble nsf attachment proteins) and the membrane receptors for the nsf-snap complexes (snares: snap receptors) (reviewed in rothman and warren, 1994). distinct snare proteins are present in the membranes of the vesicle and the target. the snare hypothesis stipulates that each transport vesicle is endowed with its own vesicle-(v-) snare (or vamp-like molecule) that can specifically interact with its cognate target-(t-) snare (or syntaxin/snap25-like protein).
this 'pairing' could ensure vesicle/target membrane specificity, while a general fusion apparatus consisting of nsf and snaps could be used throughout the cell (sollner et al., 1993). in the context of epithelial polarity, this hypothesis suggests that vectorial targeting of apical and basolateral proteins will require distinct v-snares. interesting recent data suggest that the situation in at least one epithelial cell type may be somewhat more complicated. when the surface delivery of membrane proteins is examined in mdck cells permeabilized at their apical or basolateral surfaces with streptolysin o, it appears that basolateral transport involves all of the machinery discussed above. toxins which cleave snares inhibit basolateral delivery, as do antibodies directed against snaps. in contrast, apical protein insertion is unaffected by these reagents. isolation of apically bound vesicles from mdck cells reveals the presence of high concentrations of an annexin homologue in their surface membranes. annexins are calcium-dependent phospholipid binding proteins thought to be involved in a number of membrane fusion events (ikonen et al., 1995). it would appear, therefore, that completely distinct classes of vesicular targeting and fusion machinery may operate in the two membrane delivery pathways present in polarized epithelial cells.
in the absence of a readily available genetic system with which to identify the genes and gene products necessary for such higher eukaryotic functions as transcytosis or polarized targeting, epithelial cell biologists have been resigned to the prospect of "poking" at the epithelial cell with various reagents and watching how it responds.
reagents which prevent the polymerization of actin (gottlieb et al., 1993; jackman et al., 1994) and tubulin (achler et al., 1989; parczyk et al., 1989), toxins which modify a particular class of g proteins (stow et al., 1991; pimplikar and simons, 1993b), toxins that inactivate the vamp, syntaxin and snap-25 molecules described above, second messenger stimulators, analogues of the messengers themselves (apodaca et al., 1994; cardone et al., 1994; hansen and casanova, 1994) and the remarkable fungal metabolite brefeldin a (bfa) are all being incorporated into the repertoire of tools which we hope will enable us to glean more information from a particular transport pathway. those interested in polarized and nonpolarized cell functions alike have made use of such cell-perturbing reagents. since the focus of this review is epithelial polarity, we have chosen to summarize some of the studies which are providing insights about the mechanisms of polarized sorting and targeting.
brefeldin a is a fungal metabolite that endeared itself to cell biologists because of its dramatic effect on the protein secretory pathway. protein secretion is inhibited by bfa: membrane trafficking out of the er is blocked and the golgi appears to break down and become redistributed into the er (lippincott-schwartz et al., 1989). before golgi redistribution, bfa causes this organelle to form tubular extensions which are devoid of any cytoplasmic (nonclathrin) "coat" material (lippincott-schwartz et al., 1990). it has been shown that these morphological changes are not restricted to the golgi but rather are observed in a number of organelles of the endomembranous network such as endosomes, lysosomes and the tgn (hunziker et al., 1991; lippincott-schwartz et al., 1991; wood et al., 1991), suggesting that the bfa "effector" might play a role in membrane transport events all over the cell.
perhaps surprisingly, while membrane transport phenomena are remarkably altered in the presence of bfa, several processes are clearly unaffected, including receptor-mediated endocytosis and endocytic recycling. from the standpoint of sorting and polarized delivery, bfa's most interesting property is its ability to differentially affect polarized cell surface targeting events. for example, low and colleagues (1991, 1992) determined a concentration of bfa where er-golgi trafficking was not inhibited, so that delivery from the tgn to the surface could be assayed for bfa sensitivity. interestingly, bfa inhibited the apical delivery of both endogenous mdck secretory proteins (1991) and the membrane protein dppiv (1992) while also enhancing their mis-delivery to the basolateral surface. basolateral targeting of the endogenous mdck protein, uvomorulin, was not affected under these conditions. taken together, it would seem that a target molecule for bfa action exists that is exclusively involved in directing apical vesicles or which is simply more sensitive to the effects of bfa than similar molecules participating in the basolateral pathway. either way, these results provide a hint that there are indeed molecular differences between these two targeting pathways. it is important to add that in addition to inhibiting the exocytic apical pathway in mdck cells, this drug also inhibits basolateral to apical transcytosis (hunziker et al., 1991; low et al., 1992). these findings have led to the suggestion that sorting mechanisms for apically destined proteins, whether along the exocytic or the transcytotic pathway, may be functionally and biochemically similar (hunziker et al., 1991). the fact that the loss of the structural integrity of the golgi induced by bfa correlates with a striking absence of its characteristic "coat" (observed at the em level) led to the idea that coat proteins might be rendered non-functional due to bfa action.
through a number of studies (reviewed by donaldson et al., 1992; helms and rothman, 1992; rothman and orci, 1992), molecules which make up this "coat" were identified and characterized (e.g., β-cop and arf). an "order of events" necessary for vesicle budding emerged from these studies and is outlined below. arf is a gtp-binding protein loosely related to ras and distinct from the family of rabs. in its gtp-bound state, it is capable of associating with the membrane by virtue of its myristoyl group, while its gdp-bound form is soluble and not membrane bound. arf binding to membranes appears to be the signal for coatomer binding, that is, the binding of β-cop in addition to other as yet uncharacterized coat proteins. coatomer binding is believed to be absolutely necessary for vesicle budding. therefore proper coatomer assembly would be required for any event downstream of budding, such as targeting. recently, it has been determined that bfa inhibits coatomer assembly and vesicle formation through arf, essentially by allowing it to remain in its gdp-bound, inactive form. a class of proteins called guanine nucleotide exchange factors (gne) is able to catalyze the exchange of gdp for gtp. bfa has been proposed to antagonize the action of a gne on arf, thus preventing coatomer assembly and membrane budding (helms and rothman, 1992). with the recent identification of an ever-growing family of new arf-related genes (kahn et al., 1991) and the speculation that different cops may exist in the control of membrane budding events from different organelles, there is growing excitement that arfs and cops will turn out to be essential components for regulating a particular level of specificity inherent to membrane targeting events.
in the context of bfa's effect on apical sorting and targeting in polarized mdck cells (low et al., 1991, 1992), it is likely that distinct arf/coatomer complexes regulate the budding of apical and basolaterally destined vesicles from the tgn. moreover, the fact that significant missorting into the basolateral pathway was observed in the presence of bfa (low et al., 1992) suggests that coatomer assembly may be inextricably linked to proper secretory and membrane protein sorting.
it has been known for some time that members of the heterotrimeric family of g proteins are associated not only with the plasma membrane but also with intracellular membranes (reviewed by bomsel and mostov, 1992). a number of toxins (cholera, pertussis and mastoparan) known to activate or inhibit various classes of g proteins have been applied to studies of polarized sorting and targeting. stow et al. (1991) found that overexpression of gαi-3 in polarized llc-pk1 cells significantly reduced the level of constitutive basolateral secretion of an extracellular matrix component, heparan sulfate proteoglycan. pertussis toxin, which adp-ribosylates and inactivates the α-subunits of the gαi/o class of heterotrimeric g proteins, relieved this inhibition. similarly, pimplikar and simons (1993) suggested that gi and gs may differentially regulate the trafficking of apical and basolateral vesicles in slo-permeabilized mdck cells, while leyte et al. (1992) found that gi/o and gs associated with the tgn could oppositely regulate constitutive secretory vesicle formation. it should be noted that in no case did the g protein-related inhibition or stimulation appear to affect the actual sorting or missorting of apical or basolaterally destined proteins (in contrast to the bfa results discussed above (low et al., 1992)); rather, it may only affect the rate or "efficiency" of sorting/targeting.
a possible link between heterotrimeric g proteins and coatomer formation/vesicle budding was provided by ktistakis et al. (1992). this group found that activation of a gα protein with mastoparan promoted β-cop binding and prevented bfa-induced effects. pretreatment of cells with pertussis toxin, which is known to specifically affect the gαi subclass of heterotrimerics, prevented mastoparan's antagonizing effects on bfa. stated more simply, these results showed that activation of a pertussis-toxin-sensitive gα promotes the binding of β-cop to golgi membranes and thus antagonizes the action of bfa. the authors of this study suggest further that different subclasses or isoforms of gα could be responsible for some of the differences in bfa-sensitivities observed between cell types and organellar membranes. these key observations have led to the idea that heterotrimeric g proteins, by virtue of their membrane topology, would be ideal candidates for coordinating the transfer of sorting information to the cytoplasmic surface of the tgn necessary for vesicle budding (bomsel and mostov, 1992; ktistakis et al., 1992).
the outer surface of a fruit fly embryo is composed of a monolayer of polarized epithelial cells. the apical membranes of these epithelial cells face the outer shell, or chorion, while their basolateral surfaces face the embryonic interior and yolk space. invaginations of this surface epithelium give rise to all of the embryo's internal tissue structures (for review see shiel and caplan, 1995b). recent investigations have examined the mechanisms through which proteins are sorted in these epithelial cells. human placental alkaline phosphatase (plap) is a gpi-linked protein which has been shown to be sorted to the apical plasma membrane when it is expressed by transfection in mdck cells.
a chimeric construct of plap, in which the gpi-linkage domain is replaced by the transmembrane and cytoplasmic domains of the vsv g protein (plapg), is sorted to the basolateral surfaces of mdck cells (brown et al., 1989). these two proteins have been expressed under the control of heat shock promoters in transgenic flies and their distributions have been examined in embryonic epithelia throughout embryogenesis (shiel and caplan, 1995a). as would be expected, the plapg protein is restricted to basolateral surfaces throughout ontogeny in the surface epithelial cells as well as in the internal epithelia which derive from invaginations of the surface cells. surprisingly, plap was also restricted to a basolateral distribution in the surface epithelial cells in both early and late stage embryos. biochemical experiments demonstrated that this mis-sorting of the plap protein cannot be attributed to problems with the addition of the gpi-linkage, since at all embryonic stages plap is correctly glypiated. internal epithelial cells sorted plap exclusively to their apical surfaces. since in many cases internal epithelia form from surface epithelia without undergoing any mitosis (e.g., salivary gland), essentially the same epithelial cell is capable of differentially sorting plap depending on that cell's physical position within the embryo. examination of epithelia undergoing invagination (e.g., ventral furrow, tracheal placode) demonstrates that the transition in plap sorting occurs in the early stages of the invagination process. while the mechanism responsible for this switch remains unclear, the power of drosophila genetics will hopefully allow the cellular components responsible for this transition to be readily identified. it is likely that the isolation of the proteins responsible for this phenomenon will shed light on the drosophila as well as on the mammalian epithelial sorting machinery.
a drosophila mutation whose phenotype includes perturbations of the polarized organization of the surface epithelial cells has recently been identified and characterized at the molecular level (wodarz et al., 1993). the crumbs gene encodes a transmembrane protein which is normally expressed in the apical membranes of surface and internal epithelial cells. mutation of the crumbs gene results in a loss of crumbs polarity and markedly alters embryonic morphology. genetic studies have demonstrated that the crumbs gene product is necessary not only for its own apical sorting, but for the apical delivery of other proteins as well. furthermore, the product of the stardust gene appears to interact with the crumbs protein and also appears to participate in apical sorting. understanding these proteins' biochemical functions and their intermolecular associations will undoubtedly provide enormous insight into the cellular components responsible for generating and maintaining the polarized phenotype. hopefully, the development of genetic approaches such as these, in concert with the continuing refinement of in vitro and model systems, will allow us to develop a clear and fundamental understanding of how epithelial cells produce their remarkable asymmetry.
yeast syntaxins sso1p and sso2p belong to a family of related membrane proteins that function in vesicular transport role of microtubules in polarized delivery of apical membrane proteins to the brush border of the intestinal epithelium distribution of transport proteins over animal cell membranes potocytosis of small molecules and ions by caveolae the calmodulin antagonist, w-13, alters transcytosis, recycling and morphology of the endocytic pathway in madin-darby canine kidney cells mutational and secondary structural analysis of the basolateral sorting signal of the polymeric immunoglobulin receptor vesicular stomatitis virus glycoprotein is sorted and concentrated during export from the endoplasmic reticulum cloning of a rab3 isotype predominately expressed in adipocytes the npxy internalization signal of the ldl receptor adopts a reverse turn conformation biogenesis of the rat hepatocyte plasma membrane in vivo: comparison of the pathways taken by apical and basolateral proteins using subcellular fractionation. 1 in vitro binding of the asialoglycoprotein receptor to the beta adaptin of plasma membrane coated vesicles identification of effector-activating residues of gsα intracellular protein topogenesis. proc. natl. acad. sci. usa 77 role of heterotrimeric g proteins in membrane traffic the gtpase superfamily: a conserved switch sorting of gpi-anchored proteins to glycolipid-enriched membrane an h+-atpase in opposite plasma membrane 1317-1328. for diverse cell functions 413421.
subdomains during transport to the cell surface interactions between gpi-anchored proteins and membrane lipids mechanism of membrane anchoring affects polarized expression of two proteins in mdck cells constitutive and regulated secretion of proteins intracellular sorting and polarized cell surface delivery of na,k-atpase, an endogenous component of mdck cell basolateral plasma membranes dependence on ph of polarized sorting of secreted proteins sorting of membrane and secretory proteins in polarized epithelial cells phorbol myristate acetate-mediated stimulation of transcytosis and apical recycling in mdck cells phosphorylation of the polymeric immunoglobulin receptor required for its efficient transcytosis an autonomous signal for basolateral sorting in the cytoplasmic domain of the polymeric immunoglobulin receptor hypervariable c-terminal domain of rab proteins acts as a targeting signal transferrin receptor internalization sequence yxrf implicates a tight turn as the structural recognition motif for endocytosis transplanted ldl and mannose-6-phosphate receptor internalization signals promote high-efficiency endocytosis of the transferrin receptor a sorting signal for the basolateral delivery of the vesicular stomatitis virus (vsv) g protein lies in its luminal domain: analysis of the targeting of vsv g-influenza hemagglutinin chimeras suppression of villin expression by antisense rna impairs brush border assembly in polarized epithelial intestinal cells a tyrosine-based signal targets h,k-atpase to a regulated compartment and is required for the cessation of gastric acid secretion high-resolution epitope mapping of hgh-receptor interactions by alanine-scanning mutagenesis association of the gtp-binding protein rab3a with bovine adrenal chromaffin granules the internalization signal and the phosphorylation site of transferrin receptor are distinct from the main basolateral sorting information arf: a key regulatory switch in membrane traffic and organelle structure brefeldin a
inhibits golgi membrane-catalysed exchange of guanine nucleotide onto arf protein polarized sorting of glypiated proteins in hippocampal neurons involvement of na,k-atpase in antinatriuretic action of mineralocorticoids in mammalian kidney membrane asymmetry in epithelia: is the tight junction a barrier to diffusion in the plasma membrane? a small gtp-binding protein dissociates from synaptic vesicles during exocytosis an enzymatic assay reveals that proteins destined for the apical or basolateral domains of an epithelial cell line share the same late golgi compartments direct association of occludin with zo-1 and its possible involvement in the localization of occludin at tight junctions a yeast gene encoding a protein homologous to the human c-has/bas proto-oncogene product specificity of binding of clathrin adaptors to signals on the mannose-6-phosphate/insulin-like growth factor ii receptor receptor-mediated endocytosis: concepts emerging from the ldl receptor system nonpolarized secretion of truncated forms of the influenza hemagglutinin and the vesicular stomatitis virus g protein from mdck cells an ion transporting atpase encodes multiple localization signals biotinylation and assessment of membrane polarity: caveats and methodological concerns sorting of ion transport proteins in polarized cells secretion of endogenous and exogenous proteins from polarized mdck monolayers sorting and endocytosis of viral glycoproteins in transfected polarized epithelial cells actin microfilaments play a critical role in endocytosis at the apical but not the basolateral surface of polarized epithelial cells small gtp-binding proteins and their role in transport a gtp-binding protein required for secretion rapidly associates with secretory vesicles and the plasma membrane in yeast biochemical characterization of renal epithelial cell cultures (llc-pk1 and mdck) apical polarity of na,k-atpase in retinal pigment epithelium is linked to a reversal of the ankyrin-fodrin submembrane
cytoskeleton mechanism for regulating cell surface distribution of na,k-atpase in polarized epithelial cells gs alpha stimulates transcytosis and apical secretion in mdck cells through camp and protein kinase a inhibition by brefeldin a of a golgi membrane enzyme that catalyzes exchange of guanine nucleotide bound onto arf protein more than just a membrane anchor rab8, a small gtpase involved in vesicular traffic between the tgn and the basolateral plasma membrane protein transport to the dendritic plasma membrane of cultured neurons is regulated by rab8p the origin and characteristics of a pig cell strain, llc-pk1 basolateral sorting in mdck cells requires a distinct cytoplasmic domain determinant different requirements for nsf, snap, and rab proteins in apical and basolateral transport in mdck cells inhibition of apical but not basolateral endocytosis of ricin and folate in caco-2 cells by cytochalasin d binding and hydrolysis of guanine nucleotides by sec4p, a yeast protein involved in the regulation of vesicular traffic human adp-ribosylation factors: a functionally conserved family of gtp-binding proteins crumbs and stardust, two genes of drosophila required for the development of epithelial cell polarity.
development suppl exocytotic pathways exist to both the apical and the basolateral cell surface of the polarized epithelial cell mdck the biogenesis of lysosomes action of brefeldin a blocked by activation of a pertussis-toxin-sensitive g protein the signal sequence of nascent preprolactin interacts with the 54k polypeptide of the signal recognition particle phosphorylation of the rat hepatic polymeric iga receptor an internal deletion in the cytoplasmic tail reverses the apical localization of human ngf receptor in transfected mdck cells multiple trimeric g-proteins on the trans-golgi network exert stimulatory and inhibitory effects on secretory vesicle formation tyrosine-dependent basolateral sorting signals are distinct from tyrosine-dependent internalization signals a signal sequence for the insertion of a transmembrane glycoprotein rapid redistribution of golgi proteins into the er in cells treated with brefeldin a: evidence for membrane cycling from golgi to er microtubule-dependent retrograde transport of proteins into the er in the presence of brefeldin a suggests an er recycling pathway brefeldin a's effects on endosomes, lysosomes, and the tgn suggest a general mechanism for regulating organelle structure and membrane traffic polarized apical distribution of glycosyl-phosphatidylinositol-anchored proteins in a renal epithelial cell line steady-state distribution and biogenesis of endogenous mdck-glycoproteins: evidence for intracellular sorting and polarized cell surface delivery preferred apical distribution of glycosyl-phosphatidylinositol (gpi) anchored proteins: a highly conserved feature of the polarized epithelial cell phenotype apical membrane aminopeptidase appears at site of cell-cell contact in cultured kidney epithelial cells selective inhibition of protein targeting to the apical domain of mdck cells by brefeldin a inhibition by brefeldin a of protein secretion from the apical cell surface of madin-darby canine kidney cells rab17, a novel small gtpase, is
specific for epithelial cells and is induced during cell polarization targeting and retention of golgi membrane proteins structural analysis of a human intestinal epithelial cell line distinguishing roles of the membrane-cytoskeleton and cadherin mediated cell-cell adhesion in generating different na,k-atpase distributions in polarized epithelia sorting of an apical plasma membrane glycoprotein occurs before it reaches the cell surface in cultured epithelial cells sorting of endogenous plasma membrane proteins occurs from two sites in cultured human intestinal epithelial cells (caco-2) basolateral sorting of ldl receptor in mdck cells: the cytoplasmic domain contains two tyrosine-dependent targeting determinants mechanisms of cell polarity: sorting and transport in epithelial cells structural requirements and sequence motifs for polarized sorting and endocytosis of ldl and fc receptors in mdck cells polarized expression of a chimeric protein in which the transmembrane and cytoplasmic domains of influenza hemagglutinin have been replaced by those of the vesicular stomatitis g protein membranes and sorting.
c u r apical-to-basal transepithelial transport biogenesis of epithelial cell polarity: of human lactoferrin in the intestinal cell line ht-29c1.19a intracellular sorting and vectorial exocytosis of an apical plasma membrane glycoprotein tissue distribution of smg p25a, a ras p21-like gtp-binding protein, studied by use of a specific monoclonal antibody polymeric immunoglobulin receptor expressed in mdck cells transcytoses iga plasma membrane protein sorting in polarized epithelial short cytoplasmic sequences serve as retention signals for transmembrane proteins in the endoplasmic reticulum the trans-most cisternae of the golgi complex: a compartment for sorting of secretory and plasma membrane proteins microtubules are involved in the secretion of proteins at the apical cell surface of the polarized epithelial cell, madin-darby canine kidney tissue-specific sorting of the human ldl receptor in polarized epithelia of transgenic mice clathrin, adaptors, and sorting receptors compete for adaptors found in plasma membrane coated pits mechanisms of vesicle docking and fusion: insights from the gtp-binding proteins in intracellular transport. trends in cell biology 2 intracellular sorting and basolateral appearance of the g j. 7, 3331-3336.
nervous system isoforms of the na,k-atpase are present in both axons and dendrites of hippocampal neurons in culture regulation of apical transport in epithelial cells by a gs class of heterotrimeric g protein role of heterotrimeric g proteins in polarized membrane transport the effect of modifying the culture medium on cell polarity in a human colon cell line replacement of the cytoplasmic domain alters sorting of a viral glycoprotein in polarized cells localization of sodium pumps in the choroid plexus epithelium nuclear localization signals in polyomavirus large-t viral glycoproteins destined for apical or basolateral plasma membrane domains traverse the same golgi apparatus during their intracellular transport in doubly infected madin-darby canine kidney cells the distribution of na,k-atpase in the retinal pigmented epithelium from chicken embryo is polarized in vivo but not in primary cell culture the role of clathrin, adaptors and dynamin in endocytosis morphogenesis of the polarized epithelial cell phenotype polarity of epithelial and neuronal cells asymmetric budding of viruses in epithelial monolayers: a model system for study of epithelial polarity influenza virus hemagglutinin expression is polarized in cells infected with recombinant sv40 viruses carrying cloned hemagglutinin dna the large extracellular domain is sufficient for the correct sorting of secreted or chimeric influenza virus hemagglutinins in polarized monkey kidney cells the large external domain is sufficient for the correct sorting of secreted or chimeric influenza virus hemagglutinins in polarized monkey kidney cells mechanisms of intracellular protein transport molecular dissection of the secretory pathway implications of the snare hypothesis for intracellular membrane topology and dynamics a ras-like protein is required for a post-golgi event in yeast secretion membrane and secretory proteins are transported from the golgi complex to the sinusoidal plasmalemma of hepatocytes by distinct vesicular
carriers plasticity of functional epithelial polarity developmental regulation of membrane protein sorting in the generation of epithelial polarity in mammalian and lipid sorting in epithelia polarized sorting in epithelia biogenesis of cell-surface polarity in epithelial cells and neurons the phosphomannosyl recognition system for intracellular transport of lysosomal enzymes snap receptors implicated in vesicle targeting and fusion interaction of activated egf receptor with coated pit adaptins in vitro binding of plasma-coated vesicle adaptors to the cytoplasmic domain of lysosomal acid phosphatase 38), c207<216. drosophila embryos atpase in polarized epithelial cells the mammalian choroid plexus transport and membrane properties of the retinal pigment epithelium nonpolarized expression of a secreted murine leukemia virus glycoprotein in polarized epithelial cells a heterotrimeric g protein, gαi-3, on golgi membranes regulates the secretion of a heparan sulfate proteoglycan in llc-pk1 epithelial cells a golgi retention signal in a membrane-spanning domain of coronavirus e1 protein molecular cloning, characterization, subcellular localization and dynamics of p23, the mammalian kdel receptor vesicular stomatitis virus glycoprotein contains a dominant cytoplasmic basolateral sorting signal critically dependent on tyrosine the basolateral targeting signal in the cytoplasmic domain of vsv g protein resembles a variety of intracellular targeting motifs related by primary sequence but having diverse targeting activities organization of glycosphingolipids in bilayers and plasma membranes of mammalian cells sorting of progeny coronavirus from condensed secretory proteins at the exit from the trans golgi network of att20 cells endocytosis and signals for internalization exocytosis of vacuolar apical compartment (vac): a cell-cell contact controlled mechanism for the establishment of the apical plasma membrane domain in epithelial cells
analysis of the distribution of charged residues in the n-terminal region of signal sequences: implications for protein export in prokaryotic and eukaryotic cells mechanisms of protein translocation across the endoplasmic reticulum distinct transport vesicles mediate the delivery of plasma membrane proteins to the apical and basolateral domains of mdck cells expression and polarized targeting of a rab3 isoform in epithelial cells crumbs is involved in the control of apical protein targeting during drosophila epithelial development brefeldin a causes a microtubule-mediated fusion of the mechanisms of ion transport across the choroid plexus a small rab gtpase is distributed in cytoplasmic vesicles in non-polarized cells but colocalizes with the tight junction marker zo-1 in polarized epithelial cells opposite polarity of virus budding and of viral envelope glycoprotein distribution in epithelial cells derived from different tissues modulation of transcytotic and direct targeting pathways in a polarized thyroid cell line glycosylphosphatidylinositol-anchored proteins are preferentially targeted to the basolateral surface in fischer rat thyroid epithelial cells
key: cord-013315-plptulfb authors: tilocca, bruno; soggiu, alessio; greco, viviana; piras, cristian; arrigoni, norma; ricchi, matteo; britti, domenico; urbani, andrea; roncada, paola title: immunoinformatic-based prediction of candidate epitopes for the diagnosis and control of paratuberculosis (johne’s disease) date: 2020-08-27 journal: pathogens doi: 10.3390/pathogens9090705 sha: doc_id: 13315 cord_uid: plptulfb paratuberculosis is an infectious disease of ruminants caused by mycobacterium avium subsp. paratuberculosis (map). map is an intracellular pathogen with a possible zoonotic potential since it has been successfully isolated from the intestine and blood of crohn’s disease patients. since no cure is available, after the detection of the disease, animal culling is the sole applicable containment strategy.
however, the difficulty of detecting the disease in its subclinical form facilitates its spread, raising the need for effective diagnostic and vaccination strategies. the prompt identification and isolation of infected animals in the subclinical stage would prevent the spread of the infection. in the present study, an immunoinformatic approach has been used to investigate the immunogenic properties of 10 map proteins. these proteins were chosen according to a previously published immunoproteomics approach. for each previously-described immunoreactive protein, we predicted the epitopes capable of eliciting an immune response by binding b-cells and/or class i mhc antigens. the retrieved peptide sequences were analyzed for their specificity and cross-reactivity. the final aim is to employ the discovered peptide sequences as a filtered library useful for early-stage diagnosis and/or in novel multi-subunit or recombinant vaccine formulations. bovine paratuberculosis, also known as johne's disease (jd), is an infectious disease of ruminants caused by mycobacterium avium subsp. paratuberculosis (map). it is characterized by chronic and progressive granulomatous enteritis. the infected animals initially show normal appetite and food consumption, but the intestinal wall thickening and the impaired nutrient absorption cause a reduced feed-conversion rate and a progressive weight loss. milk yield is also negatively affected by the progression of the infection. nevertheless, clinical manifestations do not involve all infected animals [1-3]; the subclinical stage of infection can last from 2 to 15 years [4] and, despite the absence of visible symptoms, animals in this stage can shed map and spread the disease [3, 5-7]. for these reasons, this pathology leads to significantly increased veterinary costs worldwide [3, 8, 9]. the causal agent of jd is map. 
it is considered a zoonotic pathogen [10] because of its possible link with crohn's disease. map infection affects animals, and there is considerable evidence that it might be a co-cause of human crohn's disease [11]. map isolation from the intestine and blood of crohn's disease patients has been extensively documented. more precisely, map presence was found to be seven times higher in crohn's disease patients than in patients with any other bowel inflammation [12, 13]. map-infected animals and crohn's disease patients also show similar alterations of the immune system response, reinforcing the hypothesized analogy between the two [14-17]. map is a slow-growing bacterium, commonly acquired via the fecal-oral route by both animals and humans [18]. although the pathogenetic mechanism of map infection has not been fully understood, it has been demonstrated that its acid resistance enables it to survive in the gastric environment, allowing its entrance into the intestinal tract. at the ileal level, map invades the lymphatic system overlying peyer's patches. this stimulates the host's immune response that, besides activating the humoral response, promptly phagocytizes map into macrophages [8]. as an intracellular pathogen, map can either survive inside macrophages or be killed and disassembled so that its antigens are presented to t-lymphocytes [3]. evidence from multibacillary jd revealed a massive humoral antibody response along with a tendency to suppress the cell-mediated immune response [3, 19, 20]. by contrast, a recent comparative study between two groups of cows, one in the sub-clinical and the other in the clinical stage, highlighted an increased t-cell activity in the first group compared to the second [21]. 
studies on cattle at the early stage of map infection revealed an upregulation of class i mhc molecules, suggesting a pivotal role of these molecules at the very beginning of the infective process [22]; this is of great interest for both diagnosis- and prophylaxis-oriented studies. figure 1 provides an overview of the major immunological mechanisms triggered by map infections. to date, jd diagnosis relies on both direct (map culture, pcr, microarrays etc.) and indirect (elisa) detection of map from feces, milk and necropsy-derived tissues. however, all the available diagnostic methods suffer from limited sensitivity (especially in the sub-clinical phase), which strongly reduces their robustness and their applicability in large-scale control programs. the failure to detect subclinical infection makes it difficult to apply in a timely manner the control measures necessary to block the spread of the infection within the same herd and to other herds. a thorough comprehension of the etiopathogenetic mechanisms of map infection and host response would be beneficial for diverse research scenarios, providing guidance for the design of map-specific diagnostic tools capable of jd diagnosis in the subclinical phase. from this perspective, a previous study from our research group [18] employed an immunoproteomic approach to investigate and select map-specific immunoreactive proteins. there, incubation of the map proteome with sera from infected bovines highlighted several candidate immunoreactive proteins. these candidates represent a good starting point for an immunoinformatic analysis of their sequences in order to find the best immunoreactive sub-sequences and epitopes. this would provide a library of peptides that might be useful for novel prophylactic strategies and/or for the immune-based detection of map. 
the rapid development of bioinformatics tools and databases makes it possible to detect the antigenic/epitopic regions of given protein sequences. this in-silico strategy is time- and cost-effective compared to the "old-fashioned" laboratory-based approach. recently, carlos et al. [23] and rana et al. [24] applied immunoinformatics-based studies to detect class ii mhc epitopes possibly useful for the control of jd in a rapid and cost-effective manner. over the last decade, these computational approaches have led to successful epitope prediction in several research fields, such as virology [25, 26], bacteriology and cancer research [27]. in this study, previously-selected immunogenic proteins [18] were studied via several immunoinformatics approaches aiming at the detection of the most promising peptide sequences useful for diagnostic purposes. the parameters taken into account were the affinity for humoral antibody binding and for class i mhc molecule binding. we predicted the most suitable peptide sequences and discuss their potential employment in the design of innovative control measures against jd, with a specific focus on the early diagnosis of jd and/or potential use in novel specific vaccine formulations. 
the peptide sequences of the previously-identified immunoreactive proteins [18] were used to retrieve the current protein identifiers in the ncbinr protein database. because of the continuous evolution of the data repositories and the increasing knowledge on their entries, some protein accession numbers were re-classified into other identifiers. table 1 summarizes the blast-based alignments of the peptides performed to map the selected proteins to the current identification system. all pblast alignments matched at 100% with the reference protein kept. the low e-value of each alignment supports the attribution of the immunoreactive proteins to the novel identifiers. once the updated protein identifiers were inferred, the major immunogenic domains of the selected proteins were predicted through an immunoinformatic approach. prediction of the linear b-epitopes provided a list of epitopes capable of eliciting antibody production (supplementary table s1). all the selected proteins showed relevant epitopes from an immunogenic point of view. 
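the linear b-epitope calls can be pictured as thresholding a per-residue score and keeping only sufficiently long runs of residues. the following is a minimal python sketch of that idea, assuming the 0.35 score cutoff and the 10-residue minimum length reported in the methods of this study; the function name, the toy sequence and the toy scores are hypothetical and do not reproduce the output of the prediction server.

```python
from typing import List, Sequence, Tuple


def epitope_runs(sequence: str, scores: Sequence[float],
                 threshold: float = 0.35, min_len: int = 10) -> List[Tuple[int, str]]:
    """Return (start_index, peptide) for every run of consecutive residues
    scoring >= threshold that is at least min_len residues long."""
    if len(sequence) != len(scores):
        raise ValueError("one score per residue is required")
    runs = []
    start = None
    # A trailing sentinel below any threshold closes a run that reaches the end.
    for i, score in enumerate(list(scores) + [float("-inf")]):
        if score >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, sequence[start:i]))
            start = None
    return runs


# Toy data (hypothetical): a 12-residue run passes, a 5-residue run is discarded.
toy_sequence = "ACDEFGHIKLMNPQRSTVWYACDEF"
toy_scores = [0.1] * 3 + [0.5] * 12 + [0.2] * 2 + [0.4] * 5 + [0.0] * 3
candidates = epitope_runs(toy_sequence, toy_scores)
```

under these assumptions, short above-threshold stretches are dropped by the length filter, which is why many short predicted sequences reduce to only a handful of candidates per protein.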
a large number of short epitope sequences was predicted for each immunogenic protein, whereas an average of six candidate epitopes (min 4, max 8) meeting the minimal-length threshold of 10 amino acids was predicted for each of the selected immunogenic proteins. figure 2 depicts the ten immunogenic proteins of map along with the relative distribution of the predicted b-epitopes. the whole-protein immunogenic potential, calculated from the b-epitope prediction, indicates the "hypothetical protein map_1386c" (aas03703) as the most immunogenic one. this evidence is supported by its highest number of predicted epitopes and the highest average epitope length (figure 2). on the other hand, the fructose-bisphosphate aldolase (eta93906) showed the lowest number of predicted epitopes along with the lowest epitope sequence length. regardless of the number of predictions, candidate epitopes are evenly mapped over the full sequence length of the immunogenic proteins, suggesting a good versatility of the predicted sequences (figure 2). prediction of binding affinity for the diverse class i bola histocompatibility antigens returned a high number of peptide sites. the full list of class i mhc epitope predictions is provided in supplementary table s2. epitope prediction from the previously selected immunogenic proteins yielded a total of 7044 peptides, each with its own binding affinity score. peptides with a percentile rank within the top 0.5% are considered strong binders (sb), whereas peptides with a percentile rank between 0.6% and 2% were labelled as weak binders (wb). sorting all the entries to retain only wb and sb resulted in a total of 818 candidate epitopes when considering all the mhc haplotypes for the ten immunogenic proteins (supplementary table s2). 
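the sb/wb labelling described above amounts to a simple rule on the percentile rank. the sketch below illustrates it in python under stated assumptions: the 0.5% and 2% cutoffs come from the text, ranks falling between 0.5% and 0.6% (which the text leaves unspecified) are treated as wb, and the function names, peptide placeholders and rank values are hypothetical toy data rather than results of this study.

```python
from typing import Iterable, List, Optional, Tuple


def classify_binder(percentile_rank: float) -> Optional[str]:
    """Label a peptide by percentile rank: 'SB' up to 0.5%, 'WB' up to 2%,
    otherwise None (discarded). Ranks in (0.5, 0.6) are assumed to be WB."""
    if percentile_rank < 0:
        raise ValueError("percentile rank cannot be negative")
    if percentile_rank <= 0.5:
        return "SB"
    if percentile_rank <= 2.0:
        return "WB"
    return None


def keep_binders(predictions: Iterable[Tuple[str, str, float]]) -> List[Tuple[str, str, str]]:
    """Keep only (peptide, allele, label) entries classified as SB or WB."""
    kept = []
    for peptide, allele, rank in predictions:
        label = classify_binder(rank)
        if label is not None:
            kept.append((peptide, allele, label))
    return kept


# Toy predictions (hypothetical peptides and ranks, for illustration only).
toy_predictions = [
    ("PEPTIDE_A", "BoLA-HD6", 0.2),
    ("PEPTIDE_B", "BoLA-HD6", 1.5),
    ("PEPTIDE_C", "BoLA-T2a", 12.0),
]
binders = keep_binders(toy_predictions)
```

applied to a full prediction table, a filter of this kind is what reduces the complete set of scored peptides to the sb and wb candidates only.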
for a better evaluation of the most suitable map epitopes, we focused on the sequences most commonly recognized by the immune system effectors (i.e., bola haplotypes and, in turn, cd8+ t-cells). figure 3 lists, for each of the tested immunogenic proteins, the epitopes shared among the mhc haplotypes. 
figure 3. selected class i mhc binding peptides. the heat map displays the selected t-epitopes and summarizes the bola haplotypes predicted to bind each sequence by means of a color scale that depends on the binding affinity. peptide "amlqdmail" belongs to heat shock protein 65 (caa52630.1); "aqldlsnal" and "hmlrhqqir" belong to hypothetical protein map_0323c (aas02640.1); "arleppgpl" belongs to hypothetical protein map_1386c (aas03703.1); "gvfnleatl" and "neraveeal" belong to the protein fixa (aas05609.1); "rlleiepal", "aagqigysl" and "alegvvmel" belong to the malate dehydrogenase protein (p61976.1); "amqtlpqvl" and "rmaqtgspl" belong to phosphoglucosamine mutase (q73s29.1); "nqielhpll", "teravsaal", "amdaceasl" and "tqsytplal" belong to uncharacterized oxidoreductase map_3007 (q73vk6.1); "emfdlihqm" and "amrkwessm" belong to fructose-bisphosphate aldolase (eta93906.1); "glyefftpl", "gmigltqal" and "tqmtaaipl" belong to 3-ketoacyl-acp reductase (ajk73649.1); "fitwrgipl" and "nqsaiatyl" are peptides of the hypothetical protein ega31_12440 (azp81686.1). 
the vast majority of the listed epitopes are classified as sb, while eight of them, belonging to the proteins malate dehydrogenase (p61976), uncharacterized oxidoreductase map_3007 (q73vk6) and hypothetical protein ega31_12440 (azp81686), are to be considered wb on the basis of their affinity rank. regardless of the binding affinity, all these sequences are predicted to be bound by a plurality of mhc haplotypes. an average of 2 (min 1, max 4) suitable epitopes were selected for each of the tested proteins. such epitopes are predicted to be recognized by five of the six bola haplotypes used for the computer-based prediction, except for those of the immunogenic protein fixa (aas05609), which can be bound by four of the six. bola-hd6, bola-jsp.1 and bola-t2c are able to recognize all the selected epitope sequences among the immunogenic proteins. on the other hand, bola-t2a does not show any binding affinity to the epitope sequences, while bola-d18.4 and bola-t2b fail to bind the epitopes of the aas05609 protein (figure 3). 
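the selection of epitopes shared among haplotypes can be sketched as a counting step: for each protein, count how many bola alleles bind each peptide and keep the peptide(s) bound by the most alleles. the python example below is an illustration under assumptions, not the authors' implementation; the function name, the protein and peptide placeholders and the toy binder calls are hypothetical, and ties are simply all retained.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def most_shared_epitopes(calls: Iterable[Tuple[str, str, str]]) -> Dict[str, List[str]]:
    """calls holds (protein, peptide, allele) entries already labelled SB or WB.
    For each protein, return the peptide(s) bound by the largest number of alleles."""
    alleles_by_peptide: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for protein, peptide, allele in calls:
        alleles_by_peptide[(protein, peptide)].add(allele)

    # Highest allele count observed for any peptide of each protein.
    best_count: Dict[str, int] = {}
    for (protein, _), alleles in alleles_by_peptide.items():
        best_count[protein] = max(best_count.get(protein, 0), len(alleles))

    selected: Dict[str, List[str]] = defaultdict(list)
    for (protein, peptide), alleles in alleles_by_peptide.items():
        if len(alleles) == best_count[protein]:
            selected[protein].append(peptide)
    return {protein: sorted(peptides) for protein, peptides in selected.items()}


# Toy binder calls (hypothetical): PEP1 is bound by two alleles, PEP2 by one.
toy_calls = [
    ("protA", "PEP1", "BoLA-HD6"),
    ("protA", "PEP1", "BoLA-T2c"),
    ("protA", "PEP2", "BoLA-HD6"),
    ("protB", "PEP3", "BoLA-JSP.1"),
]
shared = most_shared_epitopes(toy_calls)
```

keeping all tied peptides is a design choice of this sketch; a stricter rule (for example, a minimum number of binding alleles) could be substituted without changing the structure.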
the class i mhc epitopes of figure 3 were further aligned against both the mycobacteria and cow databases to assess the specificity of the predicted epitope sequences for map. the complete list of alignments is available in supplementary table s3. sequence alignment highlighted that the peptides amdaceasl and amrkwessm, respectively from the "uncharacterized oxidoreductase map_3007" (q73vk6) and "fructose-bisphosphate aldolase" (eta93906) proteins, are the most specific for map. specifically, amdaceasl scores 100% identity with map and the mycobacterium avium complex (mac), whereas hits with other mycobacteria show a lower sequence identity (below 89%) and a far higher e-value than the map and mac hits (5.3 vs. 0.16, supplementary table s3). similarly, the peptide amrkwessm scores 100% sequence identity with map and mac, and less than 73% sequence similarity in the alignments with other mycobacteria. the e-value supports the robust alignment against map and mac (e-value 0.01) as opposed to the other alignments (e-value > 86), further supporting the hypothesis on the specificity of this peptide sequence (supplementary table s3). concerning the alignment of the peptides against the cow database, neither amdaceasl nor amrkwessm scored relevant matches with any of the cow proteins. several hits matched discontinuous sequences of the cow protein database with high e-values (supplementary table s3; this point is addressed in the discussion). the host's immune response to map infection is complex and heterogeneous. debates on the sequence of immunological events following map infection are currently ongoing. nevertheless, it seems widely accepted that the early stage of the infection elicits an important cell-mediated response. once map is phagocytized, its antigen presentation is accomplished through the loading of the processed antigen onto mhc molecules. 
the bovine mhc gene complex (i.e., bovine leukocyte antigen, bola) is carried on chromosome 23 and represents a fundamental component of the bovine immune system that allows the recognition and presentation of a virtually infinite number of antigens [28] (figure 1). such high versatility relies on diverse factors, including the polygenic origin of the mhc genes, the codominance of the parental alleles, the polymorphism of the genetic variants and peptide/protein splicing [29]. class i mhc molecules recognize, bind and present peptide antigens from intracellular pathogens to cd8+ t-lymphocytes [28]. in this view, class i mhc molecules and cytotoxic t-lymphocytes (ctl) are likely to play a pivotal role in the early stage of map infection [30] and are thus of potential interest for early diagnosis-oriented studies and the design of efficient vaccine formulations. class i mhc peptide antigens are to be considered among the main triggers of the cell-mediated response, and their specific immunostimulation would lead to a more efficient prophylactic outcome. nevertheless, a study by rana et al. highlighted an important involvement of the humoral response to map infection as well, in addition to the adaptive immunity mediated by class ii mhc molecules [24]. huge efforts have been made to optimize diagnostics for the efficient detection of map by means of both direct and indirect methods [4, 31, 32]. the slow growth rate of map, along with the reduced sensitivity of culture-based methods, raised the need to develop alternative diagnostic strategies. pcr-based diagnosis targeting the multicopy insertion sequence is900 held the promise of fast detection of map in both environmental and clinical samples. however, the presence of is900-like sequences in other bacterial species reduced the specificity of the pcr. 
this, along with the elevated costs of the reagents, equipment and procedures, precludes the applicability of pcr in large-scale programs [33]. among the indirect methods, elisa-based detection of anti-map antibodies enables faster diagnosis but still suffers from drawbacks related to sensitivity and specificity. although great improvements have been made in optimizing elisa kits to reduce cross-reactions with environmental mycobacterium strains [18, 34], this method still suffers from a lack of sensitivity. moreover, the high antigen similarity between map and mycobacterium bovis hinders the discrimination between bovines infected with tuberculosis and those inoculated with live or attenuated paratuberculosis vaccines [35]. this motivates the search for molecular target(s) capable of offering a more robust diagnosis. the present work describes a companion study that relies on datasets previously obtained by our research group [18]. employing an immunoproteomic approach, we experimentally screened the whole map proteome for its capability of being complexed by the antibodies naturally occurring in sera of infected bovines. ms-based identification of the immunogenic proteins enabled the detection of 10 protein candidates whose sequences have now been further investigated for their immunogenic features. we employed an immunoinformatic approach to focus further on the peptide sequences potentially involved in the immunostimulation. a key point of the immunoinformatic approach is the prediction of the protein epitope sequences. epitope prediction can be based on several features, such as the physical-chemical properties and structural folding of the primary protein sequence [36-38]. the present investigation is mainly focused on linear epitopes because protein-antibody complexes were selected through two-dimensional electrophoresis (2-de) and western blotting, and thus on linearized proteins [18, 39]. 
although the application of other mass spectrometry technologies is quickly developing in the field of immunoproteomics [40], there are still significant limitations to mapping conformational epitopes on a large scale. as expected from the previous experimental data, all the screened protein sequences showed the capability of being recognized by both b-cells and class i bolas. how map is recognized and bound by the host immune cells is still controversial. some studies document a relevant humoral response to map infections. on the other hand, other pieces of evidence describe the importance of the cell-mediated response in controlling map growth [41-45]. from our perspective and according to the collected evidence, map-targeted antibodies could play a key role in the specific and sensitive detection of this pathogen in the subclinical stage of the infection. b-cell epitope prediction highlighted the "hypothetical protein map_1386c" (aas03703) as the most active in stimulating antibody production. this finding is in agreement with our previous study [18], where this protein showed a high level of immunoreactivity exclusively against the serum of map-infected animals. to the best of our knowledge, this protein has not been described before as an antigen and, according to our dataset on its functional domains, it is possible to hypothesize that it is part of a surface-associated dehydrogenase with oxidoreductase activity involved in the pentose phosphate pathway [18, 46]. the fructose-bisphosphate aldolase (eta93906), instead, is predicted to be less prone to elicit antibody production. this is consistent with its intracytoplasmic localization and with its major role in central metabolism. despite its cellular localization, several moonlighting properties have been described as part of its multiple functions [47-49]. 
interestingly, b-cell epitope prediction highlighted a homogeneous distribution of multiple peptide sequences throughout the primary sequences of all the proteins (figure 2). this suggests the potential usefulness of the selected proteins for a variety of applications where two or more epitopes are needed in a single protein molecule (e.g., sandwich elisa and other indirect diagnostic tests ensuring higher sensitivity) [50, 51]. prediction of the class i bola binding peptides confirmed the immunogenicity of the previously studied proteins. similarly to hlas, bolas are highly polymorphic proteins; thus, including a plurality of bolas while computing the peptide binding affinity would benefit the robustness and reliability of the prediction [52, 53]. among the proteins carrying class i mhc epitopes predicted in the present study, the hypothetical protein ega31_12440 (azp81686) differs by only one amino acid residue from the map membrane protein 2121c (v7kre0), whose immunogenic properties have already been demonstrated by both our previous investigation and other studies [18, 31, 54]. it is, indeed, a surface-exposed protein involved in the mechanism of invasion of the epithelial cells [55, 56]. its expression is upregulated when map is exposed to physicochemical conditions similar to the intestinal environment, and the specific block of this protein reduces the virulence by up to 60% [34]. interestingly, this protein is among the entries classified as wb, suggesting that immunogenic properties can also be exploited in the other wb proteins, besides those predicted as sb. moreover, we specifically focused only on the peptide sequences whose binding affinity is shared among multiple bolas. in this manner, the most suitable epitope sequences are likely to be recognized in a larger portion of the bovine population [57, 58]. 
epitopes identified with this approach are of potential interest for diverse purposes and studies, including investigations aimed at elucidating the order of immunological events following map infection and at shedding light on the controversial question of whether the cell-mediated immune response is suppressed following map infection [3]. to test the selected epitopes as suitable candidates for the unbiased diagnosis of map infection, we aligned the peptide sequences against a database comprising the most closely related bacteria. this alignment produced a steep reduction in the number of input sequences and returned two peptides suitable for a specific diagnosis of map. these two candidates did not overlap with sequences of mycobacteria other than the mycobacterium avium complex (mac). the described approach is an in-silico pipeline; therefore, empirical tests will be required for the definitive assessment of the differential diagnosis capability of the selected sequences. the sequence alignment against the host-specific proteins (i.e., the publicly available cow proteome) failed to identify significant sequence identities. the only alignment hits observed (supplementary table s3) were not continuously overlapping and showed a low percentage of identity and a high e-value. given that linear epitopes were predicted, the matching of our candidates with gapped cow sequences is likely to be of negligible relevance, since it involves amino acid residues that do not lie in contiguous order. thus, we speculate that the candidate epitopes suggested in this study are of potential value for the design of either multi-subunit or recombinant vaccines to confer protection against the first-time infection of calves by map. nevertheless, confirmatory experimental trials are strongly encouraged, especially to assess the specificity between map-infected animals and bovines with tuberculosis [59, 60]. 
although less significant, a certain level of identity has been registered with other mycobacterium strains. however, the discrepancy observed in the sequence alignment might be used as the driving force for the differential diagnosis. to this end, the application of optimized laboratory protocols with high-stringency conditions might be the key to improving the specificity of the diagnostic methods. finally, empirical evaluation of the synergistic effect of both the b-cell epitopes and the class i mhc epitopes is desirable. this will aim at evaluating the successful diagnosis of map infection at the subclinical stage and the potential to elicit protective immunity. to conclude, the present study describes an innovative pipeline based on the in-silico survey of selected immunoreactive proteins, capable of uncovering the immunogenic features of each protein. this pipeline was applied to the detection of a restricted number of peptides potentially useful for the diagnosis of jd at the early subclinical stage. the results are also useful for the implementation of innovative vaccination strategies. the obtained results confirmed the immunogenicity observed experimentally through the immunoproteomic approach applied to the map proteome. this evidence demonstrates, once again, reciprocal support between immunoproteomics and immunoinformatics. nevertheless, empirical confirmation is required to test the provided epitope sequences both in-vitro and ex-vivo for the possible detection of the subclinical phase of the infection and for the efficacy of the eventual vaccine formulations. such experimental tests might also help with the comprehension of the controversial role of the host immune cell response underlying jd. 
complementing the array of linear epitopes with conformational ones is also important for the efficacy and safety of the deliverables. empirical confirmation may serve as further proof of the robustness of the immunoinformatics approaches as key contributors in the study of diverse infectious diseases. this would provide reliable scientific support in a safe, rapid and cost-effective approach. the current study focuses on ten proteins whose immunoreactivity has been experimentally investigated by means of an immunoproteomic approach [18]. briefly, the map proteome was incubated with sera from infected animals to screen for proteins with immunoreactive potential. the most promising entries were then subjected to ms-based identification. identifiers of the candidate immunoreactive proteins were queried in the ncbi non-redundant (ncbinr) protein database to retrieve the whole protein sequences and export them as a fasta file. the update of the protein accession numbers operated by the reference data repository (i.e., ncbi) required a protein sequence alignment to be run for the attribution of the novel protein identifiers (gi numbers). selected peptide sequences of the immunoreactive proteins were searched against the ncbinr database restricted to mycobacterium avium subsp. paratuberculosis (taxid 1770), and the best hit was used to convert the former protein accession numbers into the novel ncbi protein gis. the list of proteins employed in this study along with their current gi number is provided in table 1 and supplementary table s4. prediction of the protein sequences that are likely to elicit antibody production and/or bind class i mhc proteins was performed through two tools that are commonly employed for epitope prediction [27], namely iedb (http://tools.iedb.org/bcell/) and netmhc (http://www.cbs.dtu.dk/services/netmhc/), respectively for b- and class i mhc epitope prediction. 
the bepipred algorithm was chosen for the prediction of linear protein epitopes capable of binding b-cells. it employs a combination of a hidden markov model and a propensity scale method [61]. each protein residue is scored for its epitope behavior, and only amino acids with a score greater than or equal to 0.35 were considered as potential epitope residues. linear peptide epitopes with a minimum length of 10 amino acids were selected for this study. prediction of epitopes binding class i mhc molecules was performed through the netmhc prediction tool, using the artificial neural network (ann) algorithm [62]. the algorithm was set to predict nine-amino-acid-long peptides capable of binding the following bola alleles: bola-d18.4; bola-hd6; bola-jsp.1; bola-t2a; bola-t2b; bola-t2c. the binding affinity of the peptides was scored, and a percentile rank was provided by computationally comparing the score of each queried peptide sequence against 400,000 natural peptides of the same length. peptides with a percentile rank up to 0.5% were considered strong binders (sb), whereas peptides with a percentile rank between 0.6% and 2% were labelled as weak binders (wb). all other peptides were discarded [3, 62, 63]. the resulting list of selected peptide epitopes was further quality-checked and filtered. for each of the selected proteins, the epitopes shared among the largest number of bola haplotypes were kept (i.e., the most commonly recognized in the bovine population), resulting in a consensus list of epitopes to be further used in the study. a summary of the experimental workflow employed in this study is provided in figure 4. figure 4. schematic representation of the immunoinformatic approach. blue and orange arrows refer to the input and output of the data, respectively. 
figure 4. schematic representation of the immunoinformatic approach. blue and orange arrows refer to the input and output of the data, respectively. data arising from our previous immunoproteomic study [18] were used to retrieve the protein sequences of the immunogenic proteins, selected on the basis of their capability of being complexed by the immunoglobulins in the sera of infected cows. the sequences of the immunogenic proteins are subjected to epitope prediction through dedicated tools and algorithms. the b-epitope prediction was performed via the iedb prediction tool, which provides a list of candidate b-epitopes. the class i mhc epitopes were predicted via netmhc. this makes use of the bola haplotypes from the data repository (ncbi) as references for computing the linear peptides capable of being recognized and presented by the diverse bolas. the list of potential t-epitopes is further refined by selecting the most commonly recognized epitopes with a relatively high binding affinity. the refined list of epitopes is further tested for sequence specificity and cross-reactivity by pblast alignment versus the mycobacteria and cow protein databases which, in turn, derive from the publicly available data repository (ncbi).
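the refinement of candidate t-epitopes (classification by percentile rank, then retention of the peptides recognized by the largest number of bola alleles) can be illustrated as follows. this is a minimal sketch, not the authors' pipeline: the function names, the tuple-based input format and the example peptides are hypothetical, and ranks falling between 0.5% and 0.6% are treated as weak binders, which is an assumption because the text leaves that interval unspecified.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple

SB_MAX_RANK = 0.5  # percentile rank (%) up to which a peptide is a strong binder
WB_MAX_RANK = 2.0  # percentile rank (%) up to which a peptide is a weak binder


def classify(rank: float) -> Optional[str]:
    """Label a percentile rank as 'SB', 'WB', or None (discarded)."""
    if rank <= SB_MAX_RANK:
        return "SB"
    if rank <= WB_MAX_RANK:
        return "WB"
    return None


def consensus_epitopes(
    predictions: List[Tuple[str, str, float]]
) -> List[Tuple[str, int]]:
    """predictions holds (peptide, allele, percentile_rank) for one protein.
    Returns the peptides bound (SB or WB) by the largest number of alleles,
    each paired with that number of alleles."""
    alleles_per_peptide: Dict[str, Set[str]] = defaultdict(set)
    for peptide, allele, rank in predictions:
        if classify(rank) is not None:
            alleles_per_peptide[peptide].add(allele)
    if not alleles_per_peptide:
        return []
    best = max(len(alleles) for alleles in alleles_per_peptide.values())
    return sorted(
        (peptide, len(alleles))
        for peptide, alleles in alleles_per_peptide.items()
        if len(alleles) == best
    )
```

keeping only the peptides with the maximum allele count is one possible reading of "shared among the largest number of bola haplotypes"; a fixed minimum number of alleles would be an equally plausible criterion.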
the list of epitope sequences was further analyzed through the basic local alignment search tool for protein sequences (pblast) [64] . this tool uses the pam30 substitution matrix to score alignments between protein sequences and reports the significance of matches as expected values (e-value). the e-value describes the number of matches expected to occur "by chance"; thus, it decreases exponentially as the score of the match increases. in the pblast search, each epitope sequence was aligned against both the mycobacteria (ncbi taxid 85007) and cow (ncbi taxid 9913) protein repertoires to evaluate sequence specificity and cross-reactivity (figure 4) , which are important when selecting candidate epitopes for the effective diagnosis and/or prophylaxis of map. supplementary materials: the following are available online at http://www.mdpi.com/2076-0817/9/9/705/s1. table s1 . b-cell binding protein epitope prediction. the file summarizes the b-epitope prediction results. each sheet of the xls file corresponds to one of the ten selected proteins.
the "position" column gives the position of each amino acid along the protein sequence; "residue" indicates the type of amino acid; "score" is the epitope propensity attributed by the algorithm and "assignment" ranks each amino acid residue as epitope or not depending on the preset threshold. table s2 . class i mhc binding protein epitope prediction. the xls file summarizes the results in a table reporting: the predicted affinity (nm); the percentile rank, and the predicted binding core for all the selected alleles. two additional columns summarize the predictions across alleles: harmonic mean of the %rank calculated over all specified alleles (h_avg_ranks); the number of alleles covered by a given peptide (n_binders). table s3 . selected peptide epitopes alignment against the mycobacteria and cow databases. the xls file provides a summary of the pblast alignment search. full description of the table columns and the alignment criteria is available at https://blast.ncbi.nlm.nih.gov/blast.cgi?page=proteins. table s4 . full list of the prototypic peptides alignment. the file summarizes the pblast alignment of the two selected epitopes against both the mycobacteria and the cow database. full description of the table columns and the alignment criteria is available at https://blast.ncbi.nlm.nih.gov/blast.cgi?page=proteins. the authors declare no conflict of interest. defining resilience to mycobacterial disease: characteristics of survivors of ovine paratuberculosis experimental infection model for johne's disease using a lyophilised, pure culture, seedstock of mycobacterium avium subspecies paratuberculosis gene expression profiles during subclinical mycobacterium avium subspecies paratuberculosis infection in sheep can predict disease outcome control of paratuberculosis: who, why and how.
a review of 48 countries paratuberculosis in sheep and goats johne's disease) in cattle and other susceptible species temporal patterns and quantification of excretion of mycobacterium avium subsp paratuberculosis in sheep with johne's disease mycobacterium avium subsp. paratuberculosis: pathogen, pathogenesis and diagnosis peptidomics in veterinary science: focus on bovine paratuberculosis exploring the zoonotic potential of mycobacterium avium subspecies paratuberculosis through comparative genomics mycobacterium paratuberculosis as a cause of crohn's disease could mycobacterium avium subspecies paratuberculosis cause crohn's disease, ulcerative colitis· · · and colorectal cancer? facts, myths and hypotheses on the zoonotic nature of mycobacterium avium subspecies paratuberculosis mycobacterium avium subspecies paratuberculosis and crohn's disease: a systematic review and meta-analysis the immunopathogenesis of crohn's disease: a three-stage model mycobacterium avium ss. paratuberculosis zoonosis-the hundred year war-beyond crohn's disease mycobacteria in drinking water distribution systems: ecology and significance for human health identification of immunoreactive proteins of mycobacterium avium subsp. paratuberculosis differential cytokine gene expression profiles in the three pathological forms of sheep paratuberculosis longitudinal study of clinicopathological features of johne's disease in sheep naturally exposed to mycobacterium avium subspecies paratuberculosis divergent antigen-specific cellular immune responses during asymptomatic subclinical and clinical states of disease in cows naturally infected with mycobacterium avium subsp. paratuberculosis expression of genes associated with the antigen presentation and processing pathway are consistently regulated in early mycobacterium avium subsp. paratuberculosis infection in silico epitope analysis of unique and membrane associated proteins from mycobacterium avium subsp. 
paratuberculosis for immunogenicity and vaccine evaluation proteome-wide b and t cell epitope repertoires in outer membrane proteins of mycobacterium avium subsp. paratuberculosis have vaccine and diagnostic relevance: a holistic approach comparative computational analysis of sars-cov-2 nucleocapsid protein epitopes in taxonomically related coronaviruses. microbes infect molecular basis of covid-19 relationships in different species: a one health perspective. microbes infect epitope prediction by novel immunoinformatics approach: a state-of-the-art review the major histocompatibility complex in bovines: a an antigenic peptide produced by peptide splicing in the proteasome pathogenesis of bacterial infections in animals membrane and cytoplasmic proteins of mycobacterium avium subspecies paratuberculosis that bind to results of multiple diagnostic tests for mycobacterium avium subsp. paratuberculosis in patients with inflammatory bowel disease and in controls pathogenesis, molecular genetics, and genomics of mycobacterium avium subsp. paratuberculosis, the etiologic agent of johne's disease. front rapid expression of mycobacterium avium subsp. paratuberculosis recombinant proteins for antigen discovery immune reactions in cattle after immunization with a mycobacterium paratuberculosis vaccine and implications for the diagnosis of m. paratuberculosis and m. 
bovis infections immunoinformatics: a brief review immunoinformatics and epitope prediction in the age of genomic medicine a secondary antibody-detecting molecular weight marker with mouse and rabbit igg fc linear epitopes for western blot analysis applications of maldi-tof mass spectrometry in clinical proteomics model for immune responses to mycobacterium avium subspecies paratuberculosis in cattle susceptibility to and diagnosis of mycobacterium avium subspecies paratuberculosis infection in dairy calves: a review gene expression profiling of monocyte-derived macrophages following infection with mycobacterium avium subspecies avium and mycobacterium avium subspecies paratuberculosis analysis of the immune response to mycobacterium avium subsp. paratuberculosis in experimentally infected calves mycobacterium avium subsp. paratuberculosis antibody response, fecal shedding, and antibody cross-reactivity to mycobacterium bovis in m. avium subsp. paratuberculosis-infected cattle herds vaccinated against johne's disease cloning and expression of a gene cluster encoding three subunits of membrane-bound gluconate dehydrogenase from erwinia cypripedii atcc 29267 in escherichia coli fructose-1,6-bisphosphate and aldolase mediate glucose sensing by ampk fructose-bisphosphate aldolases: an evolutionary history the moonlighting protein fructose-1, 6-bisphosphate aldolase of neisseria meningitidis: surface localization and role in host cell adhesion a short history, principles, and types of elisa, and our laboratory experience with peptide/protein analyses using elisa diagnostic performance of direct and indirect methods for assessing failure of transfer of passive immunity in dairy calves using latent class analysis improved prediction of bovine leucocyte antigens (bola) presented ligands by use of mass-spectrometry-determined ligand and in vitro binding data improved prediction of mhc ii antigen presentation through integration and motif deconvolution of mass spectrometry 
mhc eluted ligand data production and characterization of monoclonal antibodies against a major membrane protein of mycobacterium avium subsp. paratuberculosis. clin. vaccine immunol a mycobacterium avium subsp. paratuberculosis rela deletion mutant and a 35 kda major membrane protein elicit development of cytotoxic t lymphocytes with ability to kill intracellular bacteria the mycobacterium avium subsp. paratuberculosis 35kda protein plays a role in invasion of bovine epithelial cells mhc peptidome deconvolution for accurate mhc binding motif characterization and improved t-cell epitope predictions pooshang bagheri, k. in silico rational design of a novel tetra-epitope tetanus vaccine with complete population coverage using developed immunoinformatics and surface epitope mapping approaches new latex bead agglutination assay for differential diagnosis of cattle infected with mycobacterium bovis and mycobacterium avium subsp. paratuberculosis duplex pcr for differential identification of mycobacterium bovis, m. avium, and m. avium subsp. paratuberculosis in formalin-fixed paraffin-embedded tissues from cattle improved method for predicting linear b-cell epitopes gapped sequence alignment using artificial neural networks: application to the mhc class i system in silico identification of epitopes in mycobacterium avium subsp. paratuberculosis proteins that were upregulated under stress conditions protein database searches using compositionally adjusted substitution matrices key: cord-001726-d7iwkatn authors: henry, kevin a.; arbabi-ghahroudi, mehdi; scott, jamie k. title: beyond phage display: non-traditional applications of the filamentous bacteriophage as a vaccine carrier, therapeutic biologic, and bioconjugation scaffold date: 2015-08-04 journal: front microbiol doi: 10.3389/fmicb.2015.00755 sha: doc_id: 1726 cord_uid: d7iwkatn for the past 25 years, phage display technology has been an invaluable tool for studies of protein–protein interactions. 
however, the inherent biological, biochemical, and biophysical properties of filamentous bacteriophage, as well as the ease of its genetic manipulation, also make it an attractive platform outside the traditional phage display canon. this review will focus on the unique properties of the filamentous bacteriophage and highlight its diverse applications in current research. particular emphases are placed on: (i) the advantages of the phage as a vaccine carrier, including its high immunogenicity, relative antigenic simplicity and ability to activate a range of immune responses, (ii) the phage’s potential as a prophylactic and therapeutic agent for infectious and chronic diseases, (iii) the regularity of the virion major coat protein lattice, which enables a variety of bioconjugation and surface chemistry applications, particularly in nanomaterials, and (iv) the phage’s large population sizes and fast generation times, which make it an excellent model system for directed protein evolution. despite their ubiquity in the biosphere, metagenomics work is just beginning to explore the ecology of filamentous and non-filamentous phage, and their role in the evolution of bacterial populations. thus, the filamentous phage represents a robust, inexpensive, and versatile microorganism whose bioengineering applications continue to expand in new directions, although its limitations in some spheres impose obstacles to its widespread adoption and use. the filamentous bacteriophage (genera inovirus and plectrovirus) are non-enveloped, rod-shaped viruses of escherichia coli whose long helical capsids encapsulate a single-stranded circular dna genome. subsequent to the independent discovery of bacteriophage by twort (1915) and d'hérelle (1917), the first filamentous phage, f1, was isolated by loeb (1960) and later characterized as a member of a larger group of phage (ff, including f1, m13, and fd phage) specific for the e.
coli conjugative f pilus (hofschneider and mueller-jensen, 1963; marvin and hoffmann-berling, 1963; zinder et al., 1963; salivar et al., 1964). soon thereafter, filamentous phage were discovered that do not use f-pili for entry (if and ike; meynell and lawn, 1968; khatoon et al., 1972), and over time the list of known filamentous phage has expanded to over 60 members (fauquet et al., 2005), including temperate and gram-positive-tropic species. work by multiple groups over the past 50 years has contributed to a relatively sophisticated understanding of filamentous phage structure, biology and life cycle (reviewed in marvin, 1998; rakonjac et al., 2011; rakonjac, 2012). in the mid-1980s, the principle of modifying the filamentous phage genome to display polypeptides as fusions to coat proteins on the virion surface was invented by smith and colleagues (smith, 1985; parmley and smith, 1988). based on the ideas described in parmley and smith (1988), groups in california, germany, and the uk developed phage-display platforms to create and screen libraries of peptide and folded-protein variants (bass et al., 1990; devlin et al., 1990; mccafferty et al., 1990; scott and smith, 1990; breitling et al., 1991; kang et al., 1991). this technology made it possible, for the first time, to seamlessly connect genetic information with protein function for a large number of protein variants simultaneously, and has been widely and productively exploited in studies of protein-protein interactions. many excellent reviews are available on phage-display libraries and their applications (kehoe and kay, 2005; bratkovic, 2010; pande et al., 2010). however, the phage also has a number of unique structural and biological properties that make it highly useful in areas of research that have received far less attention. thus, the purpose of this review is to highlight recent and current work using filamentous phage in novel and non-traditional applications.
specifically, we refer to projects that rely on the filamentous phage as a key element, but whose primary purpose is not the generation or screening of phage-displayed libraries to obtain binding polypeptide ligands. these tend to fall into four major categories of use: (i) filamentous phage as a vaccine carrier; (ii) engineered filamentous phage as a therapeutic biologic agent in infectious and chronic diseases; (iii) filamentous phage as a scaffold for bioconjugation and surface chemistry; and (iv) filamentous phage as an engine for evolving variants of displayed proteins with novel functions. a final section is dedicated to recent developments in filamentous phage ecology and phage-host interactions. common themes shared amongst all these applications include the unique biological, immunological, and physicochemical properties of the phage, its ability to display a variety of biomolecules in modular fashion, and its relative simplicity and ease of manipulation. nearly all applications of the filamentous phage depend on its ability to display polypeptides on the virion's surface as fusions to phage coat proteins (table 1). the display mode determines the maximum tolerated size of the fused polypeptide, its copy number on the phage, and potentially, the structure of the displayed polypeptide. display may be achieved by fusing dna encoding a polypeptide of interest directly to the gene encoding a coat protein within the phage genome (type 8 display on pviii, type 3 display on piii, etc.), resulting in fully recombinant phage. much more commonly, however, only one copy of the coat protein is modified in the presence of a second, wild-type copy (e.g., type 88 display if both recombinant and wild-type pviii genes are on the phage genome, type 8+8 display if the recombinant gene 8 is on a plasmid with a phage origin of replication) resulting in a hybrid virion bearing two different types of a given coat protein. multivalent display on some coat proteins can also be enforced using helper phage bearing nonfunctional copies of the relevant coat protein gene (e.g., type 3*+3 display).
table 1. the header row and the leading cells of the first row are missing from the source; the column names below are inferred from the footnotes and the surrounding text, and "-" marks a cell that is absent from the source.
coat protein | display type | helper phage required | polypeptide copy number | polypeptide size limit | references
(piii, leading cells missing) | - | - | - | - | parmley and smith (1988), mcconnell et al. (1994), rondot et al. (2001)
piii | hybrid (type 33 and 3+3 systems) | type 3+3 system | <1 (footnote 2) | - | smith and scott (1993), smith and petrenko (1997)
pvi | hybrid (type 6+6 system) | yes | <1 (footnote 2) | >25 kda | hufton et al. (1999)
pvii | fully recombinant (type 7 system) | no | ∼5 | >25 kda | kwasnikowski et al. (2005)
pvii | hybrid (type 7+7 system) | yes | <1 (footnote 2) | - | gao et al. (1999)
pviii | fully recombinant (landscape phage; type 8 system) | no | 2700 (footnote 3) | ∼5-8 residues | kishchenko et al. (1994), petrenko et al. (1996)
pviii | hybrid (type 88 and 8+8 systems) | type 8+8 system | ∼1-300 (footnote 2) | >50 kda | scott and smith (1990), greenwood et al. (1991), smith and fernandez (2004)
pix | fully recombinant (type 9+9* system; footnote 1) | yes | ∼5 | >25 kda | gao et al. (2002)
pix | hybrid (type 9+9 system) | no | <1 (footnote 2) | - | gao et al. (1999), shi et al. (2010), tornetta et al. (2010)
footnote 1: asterisks indicate that non-functional copies of the coat protein are present in the genome of the helper phage used to rescue a phagemid whose coat protein has been fused to a recombinant polypeptide.
footnote 2: the copy number depends on polypeptide size; it is typically <1 copy per phage particle, but for pviii peptide display it can be up to ∼15% of pviii molecules in hybrid virions.
footnote 3: the total number of pviii molecules depends on the phage genome size; one pviii molecule is added for every 2.3 nucleotides in the viral genome.
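the rule of thumb given in the table 1 footnotes (one pviii molecule for every 2.3 nucleotides of genome, and up to ∼15% recombinant pviii in hybrid virions) lends itself to a quick back-of-the-envelope calculation. the sketch below is purely illustrative: the function names are hypothetical, and the genome length of 6407 nucleotides used in the usage note is an assumed value, roughly the size of wild-type m13, that does not come from the text.

```python
NUCLEOTIDES_PER_PVIII = 2.3      # from the table 1 footnote
MAX_RECOMBINANT_FRACTION = 0.15  # upper bound quoted for hybrid pviii display


def estimate_pviii_copies(genome_length_nt: int) -> int:
    """Approximate number of pviii molecules in a virion whose
    single-stranded genome is genome_length_nt nucleotides long."""
    return round(genome_length_nt / NUCLEOTIDES_PER_PVIII)


def max_hybrid_display(genome_length_nt: int,
                       fraction: float = MAX_RECOMBINANT_FRACTION) -> int:
    """Approximate upper bound on displayed peptide copies per hybrid virion."""
    return round(estimate_pviii_copies(genome_length_nt) * fraction)
```

for the assumed 6407-nucleotide genome, `estimate_pviii_copies(6407)` gives 2786, close to the 2700 copies listed for fully recombinant pviii display, and `max_hybrid_display(6407)` gives 418, the same order of magnitude as the upper end of the ∼1-300 range listed for hybrid pviii display. a longer recombinant genome yields a proportionally longer virion with more pviii copies.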
by far the most commonly used coat proteins for display are the major coat protein, pviii, and the minor coat protein, piii, with the major advantage of the former being higher copy number display (up to ∼15% of recombinant pviii molecules in a hybrid virion, at least for short peptide fusions), and of the latter being the ability to display some folded proteins at an appreciable copy number (1-5 per phage particle). while pviii display of folded proteins on hybrid phage is possible, it typically results in a copy number of much less than 1 per virion (sidhu et al., 2000). for the purposes of this review, we use the term "phage display" to refer to a recombinant filamentous phage displaying a single polypeptide sequence on its surface (or more rarely, bispecific display achieved via fusion of polypeptides to two different capsid proteins), and the term "phage-displayed library" to refer to a diverse pool of recombinant filamentous phage displaying an array of polypeptide variants (e.g., antibody fragments; peptides). such libraries are typically screened by iterative cycles of panning against an immobilized protein of interest (e.g., antigen for phage-displayed antibody libraries; antibody for phage-displayed peptide libraries) followed by amplification of the bound phage in e. coli cells. early work with anti-phage antisera generated for species classification purposes demonstrated that the filamentous phage virion is highly immunogenic in the absence of adjuvants (meynell and lawn, 1968) and that only the major coat protein, pviii, and the minor coat protein, piii, are targeted by antibodies (pratt et al., 1969; woolford et al., 1977). thus, the idea of using the phage as a carrier to elicit antibodies against poorly immunogenic haptens or polypeptides was a natural extension of the ability to display recombinant exogenous sequences on its surface, which was first demonstrated by de la cruz et al. (1988).
the phage particle's low cost of production, high stability and potential for high valency display of foreign antigen (via pviii display) also made it attractive as a vaccine carrier, especially during the early stages of development of recombinant protein technology. building upon existing peptide-carrier technology, the first filamentous phage-based vaccine immunogens displayed short amino acid sequences derived directly from proteins of interest as recombinant fusions to pviii or piii (de la cruz et al., 1988) . as library technology was developed and refined, phage-based antigens displaying peptide ligands of monoclonal antibodies (selected from random peptide libraries using the antibody, thus simulating with varying degrees of success the antibody's folded epitope on its cognate antigen; geysen et al., 1986; knittelfelder et al., 2009) were also generated for immunization purposes, with the goal of eliciting anti-peptide antibodies that also recognize the native protein. some of the pioneering work in this area used peptides derived from infectious disease antigens (or peptide ligands of antibodies against these antigens; table 2) , including malaria and human immunodeficiency virus type 1 (hiv-1). when displayed on phage, peptides encoding the repeat regions of the malarial circumsporozoite protein and merozoite surface protein 1 were immunogenic in mice and rabbits (de la cruz et al., 1988; greenwood et al., 1991; willis et al., 1993; demangel et al., 1996) , and antibodies raised against the latter cross-reacted with the full-length protein. 
various peptide determinants (or mimics thereof) of hiv-1 gp120, gp41, gag, and reverse transcriptase were immunogenic when displayed on or conjugated to phage coat proteins (minenkova et al., 1993; di marzo veronese et al., 1994; de berardinis et al., 1999; scala et al., 1999; chen et al., 2001; van houten et al., 2006, 2010), and in some cases elicited antibodies that were able to weakly neutralize lab-adapted viruses (di marzo veronese et al., 1994; scala et al., 1999). the list of animal and human infections for which phage-displayed peptide immunogens have been developed as vaccine leads continues to expand and includes bacterial, fungal, viral, and parasitic pathogens (table 2). while in some cases the results of these studies have been promising, antibody epitope-based peptide vaccines are no longer an area of active research for several reasons: (i) in many cases, peptides incompletely or inadequately mimic epitopes on folded proteins (irving et al., 2010; see below); (ii) antibodies against a single epitope may be of limited utility, especially for highly variable pathogens (van regenmortel, 2012); and (iii) for pathogens for which protective immune responses are generated efficiently during natural infection, peptide vaccines offer few advantages over recombinant subunit and live vector vaccines, which have become easier to produce over time. more recently, peptide-displaying phage have been used in attempts to generate therapeutic antibody responses for chronic diseases, cancer, immunotherapy, and immunocontraception.
immunization with phage displaying alzheimer's disease β-amyloid fibril peptides elicited anti-aggregating antibodies in mice and guinea pigs (frenkel et al., 2000, 2003; esposito et al., 2008; tanaka et al., 2011), possibly reduced amyloid plaque formation in mice (frenkel et al., 2003; solomon, 2005; esposito et al., 2008), and may have helped maintain cognitive abilities in a transgenic mouse model of alzheimer's disease (lavie et al., 2004); however, it remains unclear how such antibodies are proposed to cross the blood-brain barrier. yip et al. (2001) found that antibodies raised in mice against an erbb2/her2 peptide could inhibit breast-cancer cell proliferation. phage displaying peptide ligands of an anti-ige antibody elicited antibodies that bound purified ige molecules (rudolf et al., 1998), which may be useful in allergy immunotherapy. several strategies for phage-based contraceptive vaccines have been proposed for control of animal populations. for example, immunization with phage displaying follicle-stimulating hormone peptides on pviii elicited antibodies that impaired the fertility of mice and ewes (abdennebi et al., 1999). phage displaying or chemically conjugated to sperm antigen peptides or peptide mimics (samoylova et al., 2012a,b) and gonadotropin-releasing hormone (samoylov et al., 2012) are also in development. for the most part, peptides displayed on phage elicit antibodies in experimental animals (table 2), although this depends on characteristics of the peptide and the method of its display: piii fusions tend toward lower immunogenicity than pviii fusions (greenwood et al., 1991), possibly due to copy number differences (piii: 1-5 copies vs. pviii: estimated at several hundred copies; malik et al., 1996).
in fact, the phage is at least as immunogenic as traditional carrier proteins such as bovine serum albumin (bsa) and keyhole limpet hemocyanin (klh; melzer et al., 2003; su et al., 2007), and has comparatively few endogenous b-cell epitopes to divert the antibody response from its intended target (henry et al., 2011). excepting small epitopes that can be accurately represented by a contiguous short amino acid sequence, however, it has been extremely difficult to elicit antibody responses that cross-react with native protein epitopes using peptides. the overall picture is considerably bleaker than that painted by table 2, since in several studies either: (i) peptide ligands selected from phage-displayed libraries were classified by the authors as mimics of discontinuous epitopes if they bore no obvious sequence homology to the native protein, which is weak evidence of non-linearity, or (ii) the evidence for cross-reactivity of antibodies elicited by immunization with phage-displayed peptides with native protein was not compelling. irving et al. (2010) describe at least one reason for this lack of success: it seems that peptide antigens elicit a set of topologically restricted antibodies that are largely unable to recognize discontinuous or complex epitopes on larger biomolecules. while the peptide may mimic the chemistry of a given epitope on a folded protein (allowing it to cross-react with a targeted antibody), being a smaller molecule, it cannot mimic the topology of that antibody's full epitope. despite this, the filamentous phage remains highly useful as a carrier for peptides with relatively simple secondary structures, which may be stabilized via anchoring to the coat proteins (henry et al., 2011). this may be especially true of peptides with poor inherent immunogenicity, which may be increased by high-valency display and phage-associated adjuvanticity (see immunological mechanisms of vaccination with filamentous phage below).
the filamentous phage has been used to a lesser extent as a carrier for t-cell peptide epitopes, primarily as fusion proteins with pviii (table 3). early work, showing that immunization with phage elicited t-cell help (kölsch et al., 1971; willis et al., 1993), was confirmed by several subsequent studies (de berardinis et al., 1999; ulivieri et al., 2008). from the perspective of vaccination against infectious disease, de berardinis et al. (2000) showed that a cytotoxic t-cell (ctl) epitope from hiv-1 reverse transcriptase could elicit antigen-specific ctls in vitro and in vivo without addition of exogenous helper t-cell epitopes, presumably since these are already present in the phage coat proteins (mascolo et al., 2007). similarly, efficient priming of ctls was observed against phage-displayed t-cell epitopes from hepatitis b virus (wan et al., 2001) and candida albicans (yang et al., 2005a; wang et al., 2006, 2014d), which, together with other types of immune responses, protected mice against systemic candidiasis. vaccination with a combination of phage-displayed peptides elicited antigen-specific ctls that proved effective in reducing porcine cysticercosis in a randomized controlled trial (manoutcharian et al., 2004; morales et al., 2008). while the correlates of vaccine-induced immune protection for infectious diseases, where they are known, are almost exclusively serum or mucosal antibodies (plotkin, 2010), in certain vaccine applications, the filamentous phage has been used as a carrier for larger molecules that would be immunogenic even in isolation. initially, the major advantages to phage display of such antigens were speed, ease of purification and low cost of production (gram et al., 1993). e.
coli f17a-g adhesin (van gerven et al., 2008), hepatitis b core antigen (bahadir et al., 2011), and hepatitis b surface antigen (balcioglu et al., 2014) all elicited antibody responses when displayed on piii, although none of these studies compared the immunogenicity of the phage-displayed proteins with that of the purified protein alone. phage displaying schistosoma mansoni glutathione s-transferase on piii elicited an antibody response that was both higher in titer and of different isotypes compared to immunization with the protein alone (rao et al., 2003). two studies of anti-idiotypic vaccines have used the phage as a carrier for antibody fragments bearing immunogenic idiotypes. immunization with phage displaying the 1e10 idiotype scfv (mimicking a vibrio anguillarum surface epitope) elicited antibodies that protected flounder fish from vibrio anguillarum challenge (xia et al., 2005). a chemically linked phage-bcl1 tumor-specific idiotype vaccine was weakly immunogenic in mice but extended survival time in a b-cell lymphoma model (roehnisch et al., 2013), and was well-tolerated and immunogenic in patients with multiple myeloma (roehnisch et al., 2014). one study of dna vaccination with an anti-laminarin scfv found that dna encoding a piii-scfv fusion protein elicited stronger humoral and cell-mediated immune responses than dna encoding the scfv alone (cuesta et al., 2006), suggesting that under some circumstances, endogenous phage t-cell epitopes can enhance the immunogenicity of associated proteins. taken together, the results of these studies show that as a particulate virus-like particle, the filamentous phage likely triggers different types of immune responses than recombinant protein antigens, and provides additional t-cell help to displayed or conjugated proteins.
however, the low copy number of piii-displayed proteins, as well as potentially unwanted phage-associated adjuvanticity, can make display of recombinant proteins by phage a suboptimal vaccine choice. although our understanding of the immune response against the filamentous phage pales in comparison to classical model antigens such as ovalbumin, recent work has begun to shed light on the immune mechanisms activated in response to phage vaccination (figure 1). the phage particle is immunogenic without adjuvant in all species tested to date, including mice (willis et al., 1993), rats (dente et al., 1994), rabbits (de la cruz et al., 1988), guinea pigs (frenkel et al., 2000; kim et al., 2004), fish (coull et al., 1996; xia et al., 2005), non-human primates (chen et al., 2001), and humans (roehnisch et al., 2014). various routes of immunization have been employed, including oral administration (delmastro et al., 1997) as well as subcutaneous (grabowska et al., 2000), intraperitoneal (van houten et al., 2006), intramuscular (samoylova et al., 2012a), intravenous (vaks and benhar, 2011), and intradermal injection (roehnisch et al., 2013); no published study has directly compared the effect of administration route on filamentous phage immunogenicity. antibodies are generated against only three major sites on the virion: (i) the surface-exposed n-terminal ∼12 residues of the pviii monomer lattice (terry et al., 1997; kneissel et al., 1999); (ii) the n-terminal n1 and n2 domains of piii (van houten et al., 2010); and (iii) bacterial lipopolysaccharide (lps) embedded in the phage coat (henry et al., 2011). in mice, serum antibody titers against the phage typically reach 1:10^5 to 1:10^6 after 2-3 immunizations, and are maintained for at least 1 year postimmunization (frenkel et al., 2000). 
primary antibody responses against the phage appear to be composed of a mixture of igm and igg2b isotypes in c57bl/6 mice, while secondary antibody responses are composed primarily of igg1 and igg2b isotypes, with a lesser contribution of igg2c and igg3 isotypes (hashiguchi et al., 2010). deletion of the surface-exposed n1 and n2 domains of piii produces a truncated form of this protein that does not elicit antibodies, but also results in a non-infective phage particle with lower overall immunogenicity (van houten et al., 2010).
figure 1 | types of immune responses elicited in response to immunization with filamentous bacteriophage. as a virus-like particle, the filamentous phage engages multiple arms of the immune system, beginning with cellular effectors of innate immunity (macrophages, neutrophils, and possibly natural killer cells), which are recruited to tumor sites by phage displaying tumor-targeting moieties. the phage likely activates t-cell independent antibody responses, either via phage-associated tlr ligands or cross-linking by the pviii lattice. after processing by antigen-presenting cells, phage-derived peptides are presented on mhc class ii and cross-presented on mhc class i, resulting in activation of short-lived ctls and an array of helper t-cell types, which help prime memory ctl and high-affinity b-cell responses.
frontiers in microbiology | www.frontiersin.org
although serum anti-phage antibody titers appear to be at least partially t-cell dependent (kölsch et al., 1971; willis et al., 1993; de berardinis et al., 1999; van houten et al., 2010), many circulating pviii-specific b cells in the blood are devoid of somatic mutation even after repeated biweekly immunizations, suggesting that under these conditions, the phage activates t-cell-independent b-cell responses in addition to high-affinity t-cell-dependent responses (murira, 2014). 
filamentous phage particles can be processed by antigen-presenting cells and presented on mhc class ii molecules (gaubin et al., 2003; ulivieri et al., 2008) and can activate th1, th2, and th17 helper t cells (yang et al., 2005a; wang et al., 2014d). anti-phage th2 responses were enhanced through display of ctla-4 peptides fused to piii (kajihara et al., 2000). phage proteins can also be cross-presented on mhc class i molecules (wan et al., 2005) and can prime two waves of ctl responses, consisting first of short-lived ctls and later of long-lived memory ctls that require cd4+ t-cell help (del pozzo et al., 2010). the latter ctls mediate a delayed-type hypersensitivity reaction (fang et al., 2005; del pozzo et al., 2010). the phage particle is self-adjuvanting through multiple mechanisms. host cell wall-derived lps enhances the virion's immunogenicity, and its removal by polymyxin b chromatography reduces antibody titers against phage coat proteins (grabowska et al., 2000). the phage's single-stranded dna genome contains cpg motifs and may also have an adjuvant effect. the antibody response against the phage is entirely dependent on myd88 signaling and is modulated by stimulation of several toll-like receptors (hashiguchi et al., 2010), indicating that innate immunity plays an important but largely uncharacterized role in the activation of anti-phage adaptive immune responses. biodistribution studies of the phage after intravenous injection show that it is cleared from the blood within hours through the reticuloendothelial system (molenaar et al., 2002), particularly of the liver and spleen, where it is retained for days (zou et al., 2004), potentially activating marginal-zone b-cell responses. thus, the filamentous phage is not only a highly immunogenic carrier, but by virtue of activating a range of innate and adaptive immune responses, serves as an excellent model virus-like particle antigen. 
long before the identification of filamentous phage, other types of bacteriophage were already being used for antibacterial therapy in the former soviet union and eastern europe (reviewed in sulakvelidze et al., 2001) . the filamentous phage, with its nonlytic life cycle, has less obvious clinical uses, despite the fact that the host specificity of inovirus and plectrovirus includes many pathogens of medical importance, including salmonella, e. coli, shigella, pseudomonas, clostridium, and mycoplasma species. in an effort to enhance their bactericidal activity, genetically modified filamentous phage have been used as a "trojan horse" to introduce various antibacterial agents into cells. m13 and pf3 phage engineered to express either bglii restriction endonuclease (hagens and blasi, 2003; hagens et al., 2004) , lambda phage s holin (hagens and blasi, 2003) or a lethal catabolite gene activator protein (moradpour et al., 2009) effectively killed e. coli and pseudomonas aeruginosa cells, respectively, with no concomitant release of lps (hagens and blasi, 2003; hagens et al., 2004) . unfortunately, the rapid emergence of resistant bacteria with modified f pili represents a major and possibly insurmountable obstacle to this approach. however, there are some indications that filamentous phage can exert useful but more subtle effects upon their bacterial hosts that may not result in the development of resistance to infection. several studies have reported increased antibiotic sensitivity in bacterial populations simultaneously infected with either wild type filamentous phage (hagens et al., 2006) or phage engineered to repress the cellular sos response (lu and collins, 2009) . filamentous phage f1 infection inhibited early stage, but not mature, biofilm formation in e. coli (may et al., 2011) . thus, unmodified filamentous phage may be of future interest as elements of combination therapeutics against certain drug-resistant infections. 
more advanced therapeutic applications of the filamentous phage emerge when it is modified to express a targeting moiety specific for pathogenic cells and/or proteins for the treatment of infectious diseases, cancer and autoimmunity (figure 2). the first work in this area showed as proof-of-concept that phage encoding a gfp expression cassette and displaying a her2-specific scfv on all copies of piii were internalized into breast tumor cells, resulting in gfp expression (poul and marks, 1999). m13 or fd phage displaying either a targeting peptide or antibody fragment and tethered to chloramphenicol by a labile crosslinker were more potent inhibitors of staphylococcus aureus growth than high-concentration free chloramphenicol (yacoby et al., 2006; vaks and benhar, 2011). m13 phage loaded with doxorubicin and displaying a targeting peptide on piii specifically killed prostate cancer cells in vitro (ghosh et al., 2012a). tumor-specific peptide:pviii fusion proteins selected from "landscape" phage (romanov et al., 2001; abbineni et al., 2010; fagbohun et al., 2012, 2013; lang et al., 2014; wang et al., 2014a) were able to target and deliver sirna-, paclitaxel-, and doxorubicin-containing liposomes to tumor cells (jayanna et al., 2010a; wang et al., 2010a,b,c, 2014b; bedi et al., 2011, 2013, 2014); they were non-toxic and increased tumor remission rates in mouse models (jayanna et al., 2010b; wang et al., 2014b,c). using the b16-ova tumor model, eriksson et al. (2007) showed that phage displaying peptides and/or fabs specific for tumor antigens delayed tumor growth and improved survival, owing in large part to activation of tumor-associated macrophages and recruitment of neutrophils to the tumor site (eriksson et al., 2009). 
phage displaying an scfv against β-amyloid fibrils showed promise as a diagnostic (frenkel and solomon, 2002) and therapeutic (solomon, 2008) reagent for alzheimer's disease and parkinson's disease due to the unanticipated ability of the phage to penetrate into brain tissue (ksendzovsky et al., 2012). similarly, phage displaying an immunodominant peptide epitope derived from myelin oligodendrocyte glycoprotein depleted pathogenic demyelinating antibodies in brain tissue in the murine experimental autoimmune encephalomyelitis model of multiple sclerosis (rakover et al., 2010). the advantages of the filamentous phage in this context over traditional antibody-drug or protein-peptide conjugates are (i) its ability to carry very high amounts of drug or peptide, and (ii) its ability to access anatomical compartments that cannot generally be reached by systemic administration of a protein. unlike most therapeutic biologics, the filamentous phage's production in bacteria complicates its use in humans in several ways. first and foremost, crude preparations of filamentous phage typically contain very high levels of contaminating lps, in the range of ∼10^2-10^4 endotoxin units (eu)/ml (boratynski et al., 2004; branston et al., 2015), which have the potential to cause severe adverse reactions. lps is not completely removed by polyethylene glycol precipitation or cesium chloride density gradient centrifugation (smith and gingrich, 2005; branston et al., 2015), but its levels can be reduced dramatically using additional purification steps such as size exclusion chromatography (boratynski et al., 2004; zakharova et al., 2005), polymyxin b chromatography (grabowska et al., 2000), and treatment with detergents such as triton x-100 or triton x-114 (roehnisch et al., 2014; branston et al., 2015). 
these strategies routinely achieve endotoxin levels of <1 eu/ml as measured by the limulus amebocyte lysate (lal) assay, well below the fda limit for parenteral administration of 5 eu/kg body weight/dose, although concerns remain regarding the presence of residual virion-associated lps which may be undetectable. a second and perhaps unavoidable consequence of the filamentous phage's bacterial production is inherent heterogeneity of particle size and the spectrum of host cell-derived virion-associated and soluble contaminants, which may be cause for safety concerns and restrict its use to high-risk groups. many types of bacteriophage and engineered phage variants, including filamentous phage, have been proposed for prophylactic use ex vivo in food safety, either in the production pipeline (reviewed in dalmasso et al., 2014) or for detection of foodborne pathogens post-production (reviewed in schmelcher and loessner, 2014). filamentous phage displaying a tetracysteine tag on piii were used to detect e. coli cells through staining with biarsenical dye. m13 phage functionalized with metallic silver were highly bactericidal against e. coli and staphylococcus epidermidis. biosensors based on surface plasmon resonance (nanduri et al., 2007), piezoelectric transducers (olsen et al., 2006), linear dichroism (pacheco-gomez et al., 2012), and magnetoelastic sensor technology (lakshmanan et al., 2007; huang et al., 2009) were devised using filamentous phage displaying scfv or conjugated to whole igg against e. coli, listeria monocytogenes, salmonella typhimurium, and bacillus anthracis with limits of detection on the order of 10^2-10^6 bacterial cells/ml. proof of concept has been demonstrated for use of such phage-based biosensors to detect bacterial contamination of live produce (li et al., 2010b) and eggs (chai et al., 2012). 
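to make the endotoxin figures quoted earlier in this section concrete, the following python sketch converts a preparation's endotoxin concentration (eu/ml) into a per-dose burden (eu/kg) and compares it with the 5 eu/kg limit cited in the text. the function names, the 20 g mouse, the 70 kg adult, and the injection volumes are illustrative assumptions and are not taken from the studies reviewed here.

```python
# Limit quoted in the text for parenteral administration (EU per kg per dose).
FDA_LIMIT_EU_PER_KG = 5.0


def endotoxin_burden_eu_per_kg(conc_eu_per_ml, volume_ml, body_weight_kg):
    """Endotoxin delivered by one dose, normalized to body weight (EU/kg)."""
    if conc_eu_per_ml < 0 or volume_ml < 0 or body_weight_kg <= 0:
        raise ValueError("concentration and volume must be >= 0, weight > 0")
    return conc_eu_per_ml * volume_ml / body_weight_kg


def max_dose_volume_ml(conc_eu_per_ml, body_weight_kg,
                       limit_eu_per_kg=FDA_LIMIT_EU_PER_KG):
    """Largest single-dose volume (ml) that stays within the per-dose limit."""
    if conc_eu_per_ml <= 0 or body_weight_kg <= 0:
        raise ValueError("concentration and weight must be > 0")
    return limit_eu_per_kg * body_weight_kg / conc_eu_per_ml


if __name__ == "__main__":
    # Hypothetical scenarios: (label, EU/ml, dose volume in ml, weight in kg).
    scenarios = [
        ("crude prep, 20 g mouse, 0.1 ml", 1e2, 0.1, 0.02),
        ("purified prep, 20 g mouse, 0.1 ml", 0.5, 0.1, 0.02),
        ("purified prep, 70 kg adult, 1 ml", 0.5, 1.0, 70.0),
    ]
    for label, conc, volume, weight in scenarios:
        burden = endotoxin_burden_eu_per_kg(conc, volume, weight)
        status = "within" if burden <= FDA_LIMIT_EU_PER_KG else "above"
        print(f"{label}: {burden:.3g} EU/kg ({status} the 5 EU/kg limit)")
```

under these assumed values, a crude preparation at the low end of the reported range exceeds the limit in a mouse by about two orders of magnitude, whereas a preparation purified to below 1 eu/ml stays within it. because the limit is normalized to body weight, the same concentration leaves far less margin in a small animal than in an adult human.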
the filamentous phage particle is enclosed by a rod-like protein capsid, ∼1000 nm long and 5 nm wide, made up almost entirely of overlapping pviii monomers, each of which lies ∼27 angstroms from its nearest neighbor and exposes two amine groups as well as at least three carboxyl groups (henry et al., 2011). the regularity of the phage pviii lattice and its diversity of chemically addressable groups make it an ideal scaffold for bioconjugation (figure 3). the most commonly used approach is functionalization of amine groups with nhs esters (van houten et al., 2006, 2010; yacoby et al., 2006), although this can result in unwanted acylation of piii and any displayed biomolecules. carboxyl groups and tyrosine residues can also be functionalized using carbodiimide coupling and diazonium coupling, respectively (li et al., 2010a). carrico et al. (2012) developed methods to specifically label pviii n-termini without modification of exposed lysine residues through a two-step transamination-oxime formation reaction. specific modification of phage coat proteins is even more easily accomplished using genetically modified phage displaying peptides (ng et al., 2012) or enzymes (chen et al., 2007; hess et al., 2012), but this can be cumbersome and is less general in application. for more than a decade, interest in the filamentous phage as a building block for nanomaterials has been growing because of its unique physicochemical properties, with emerging applications in magnetics, optics, and electronics. it has long been known that above a certain concentration threshold, phage can form ordered crystalline suspensions (welsh et al., 1996). lee et al. (2002) engineered m13 phage to display a zns-binding peptide on piii and showed that, in the presence of zns nanoparticles, they self-assemble into highly ordered film biomaterials that can be aligned using magnetic fields. 
figure 3 | chemically addressable groups of the filamentous bacteriophage major coat protein lattice. the filamentous phage virion is made up of ∼2,500-4,000 overlapping copies of the 50-residue major coat protein, pviii, arranged in a shingle-type lattice. each monomer has an array of chemically addressable groups available for bioorthogonal conjugation, including two primary amine groups (shown in red), three carboxyl groups (shown in blue) and two hydroxyl groups (shown in green). the 12 n-terminal residues generally exposed to the immune system for antibody binding are in bold underline. figure adapted from structural data of marvin, 1990, freely available in pdb and scope databases.
taking advantage of the ability to display substrate-specific peptides at known locations on the phage filament (hess et al., 2012), this pioneering work became the basis for construction of two- and three-dimensional nanomaterials with more advanced architectures, including semiconducting nanowires (mao et al., 2003, 2004), nanoparticles, and nanocomposites (oh et al., 2012; chen et al., 2014). using hybrid m13 phage displaying co3o4- and gold-binding peptides on pviii as a scaffold to assemble nanowires on polyelectrolyte multilayers, nam et al. (2006) produced a thin, flexible lithium ion battery, which could be stamped onto platinum microband current collectors (nam et al., 2008). the electrochemical properties of such batteries were further improved through piii-display of single-walled carbon nanotube-binding peptides (lee et al., 2009), offering an approach for sustainable production of nanostructured electrodes from poorly conductive starting materials. 
phage-based nanomaterials have found applications in cancer imaging (ghosh et al., 2012b; yi et al., 2012), photocatalytic water splitting (nam et al., 2010a; neltner et al., 2010), light harvesting (nam et al., 2010b; chen et al., 2013), photoresponsive technologies (murugesan et al., 2013), neural electrodes (kim et al., 2014), and piezoelectric energy generation (murugesan et al., 2013). thus, the unique physicochemical properties of the phage, in combination with modular display of peptides and proteins with known binding specificity, have spawned wholly novel materials with diverse applications. it is worth noting that the unusual biophysical properties of the filamentous phage can also be exploited in the study of structures of other macromolecules. magnetic alignment of high-concentration filamentous phage in solution can partially order dna, rna, proteins, and other biomolecules for measurement of dipolar coupling interactions (hansen et al., 1998, 2000; dahlke ojennus et al., 1999) in nmr spectroscopy. because of their large population sizes, short generation times, small genome sizes and ease of manipulation, various filamentous and non-filamentous bacteriophages have been used as models of experimental evolution (reviewed in husimi, 1989; wichman and brown, 2010; kawecki et al., 2012; hall et al., 2013). the filamentous phage has additional practical uses in protein engineering and directed protein evolution, due to its unique tolerance of genetic modifications that allow biomolecules to be displayed on the virion surface. first and foremost among these applications is in vitro affinity maturation of antibody fragments displayed on piii. libraries of variant fabs and single chain antibodies can be generated via random or site-directed mutagenesis and selected on the basis of improved or altered binding, roughly mimicking the somatic evolution strategy of the immune system (marks et al., 1992; bradbury et al., 2011). 
however, other in vitro display systems, such as yeast display, have important advantages over the filamentous phage for affinity maturation (although each display technology has complementary strengths; koide and koide, 2012) , and regardless of the display method, selection of "improved" variants can be slow and cumbersome. iterative methods have been developed to combine computationally designed mutations (lippow et al., 2007) and circumvent the screening of combinatorial libraries, but these have had limited success to date. recently, esvelt et al. (2011) developed a novel strategy for directed evolution of filamentous phage-displayed proteins, called phage-assisted continuous evolution (pace), which allows multiple rounds of evolution per day with little experimental intervention. the authors engineered m13 phage to encode an exogenous protein (the subject for directed evolution), whose functional activity triggers gene iii expression from an accessory plasmid; variants of the exogenous protein arise by random mutagenesis during phage replication, the rate of which can be increased by inducible expression of error-prone dna polymerases. by supplying limiting amounts of receptive e. coli cells to the engineered phage variants, esvelt et al. (2011) elegantly linked phage infectivity and production of offspring with the presence of a desired protein phenotype. carlson et al. (2014) later showed that pace selection stringency could be modulated by providing small amounts of piii independently of protein phenotype, and undesirable protein functions negatively selected by linking them to expression of a truncated piii variant that impairs infectivity in a dominant negative fashion. pace is currently limited to protein functions that can be linked in some way to the expression of a gene iii reporter, such as protein-protein interaction, recombination, dna or rna binding, and enzymatic catalysis (meyer and ellington, 2011) . 
this approach represents a promising avenue for both basic research in molecular evolution (dickinson et al., 2013) and synthetic biology, including antibody engineering. filamentous bacteriophage have been recovered from diverse environmental sources, including soil (murugaiyan et al., 2011), coastal fresh water (xue et al., 2012), alpine lakes (hofer and sommaruga, 2001) and deep sea bacteria (jian et al., 2012), but not, perhaps surprisingly, the human gut (kim et al., 2011). the environmental "phageome" in soil and water represents the largest source of replicating dna on the planet, and is estimated to contain upward of 10^30 viral particles (ashelford et al., 2003; chibani-chennoufi et al., 2004; suttle, 2005). the few studies attempting to investigate filamentous phage environmental ecology using classical environmental microbiology techniques (typically direct observation by electron microscopy) found that filamentous phage made up anywhere from 0 to 100% of all viral particles (demuth et al., 1993; pina et al., 1998; hofer and sommaruga, 2001). there was some evidence of seasonal fluctuation of filamentous phage populations in tandem with the relative abundance of free-living heterotrophic bacteria (hofer and sommaruga, 2001). environmental metagenomics efforts are just beginning to unravel the composition of viral ecosystems. the existing data suggest that filamentous phage comprise minor constituents of viral communities in freshwater (roux et al., 2012) and reclaimed and potable water (rosario et al., 2009) but have much higher frequencies in wastewater and sewage (cantalupo et al., 2011; alhamlan et al., 2013), with the caveat that biases inherent to the methodologies for ascertaining these data (purification of viral particles, sequencing biases) have not been well validated. there are no data describing the population dynamics of filamentous phage and their host species in the natural environment. 
at the individual virus-bacterium level, it is clear that filamentous phage can modulate host phenotype, including the virulence of important human and crop pathogens. this can occur either through direct effects of phage replication on cell growth and physiology, or, more typically, by horizontal transfer of genetic material contained within episomes and/or chromosomally integrated prophage. temperate filamentous phage may also play a role in genome evolution (reviewed in canchaya et al., 2003). perhaps the best-studied example of virulence modulation by filamentous phage is that of vibrio cholerae, whose full virulence requires lysogenic conversion by the cholera toxin-encoding ctxφ phage (waldor and mekalanos, 1996). integration of ctxφ phage occurs at specific sites in the genome; these sequences are introduced through the combined action of another filamentous phage, fs2φ, and a satellite filamentous phage, tlc-knφ1 (hassan et al., 2010). thus, filamentous phage species interact and coevolve with each other in addition to their hosts. infection by filamentous phage has been implicated in the virulence of yersinia pestis (derbise et al., 2007), neisseria meningitidis (bille et al., 2005, 2008), vibrio parahaemolyticus (iida et al., 2001), e. coli o18:k1:h7 (gonzalez et al., 2002), xanthomonas campestris (kamiunten and wakimoto, 1982), and p. aeruginosa (webb et al., 2004), although in most of these cases, the specific mechanisms modulating virulence are unclear. phage infection can either enhance or repress virulence depending on the characteristics of the phage, the host bacterium, and the environmental milieu, as is the case for the bacterial wilt pathogen ralstonia solanacearum (yamada, 2013). 
since infection results in downregulation of the pili used for viral entry, filamentous phage treatment has been proposed as a hypothetical means of inhibiting bacterial conjugation and horizontal gene transfer, so as to prevent the spread of antibiotic resistance genes (lin et al., 2011). finally, the filamentous phage may also play a future role in the preservation of biodiversity of other organisms in at-risk ecosystems. engineered phage have been proposed for use in bioremediation, either displaying antibody fragments of desired specificity for filtration of toxins and environmental contaminants (petrenko and makowski, 1993), or as biodegradable polymers displaying peptides selected for their ability to aggregate pollutants, such as oil sands tailings (curtis et al., 2011, 2013). engineered phage displaying peptides that specifically bind inorganic materials have also been proposed for use in more advanced and less intrusive mineral separation technologies (curtis et al., 2009). the filamentous phage represents a highly versatile organism whose uses extend far beyond traditional phage display and affinity selection of antibodies and polypeptides of desired specificity. its high immunogenicity and ability to display a variety of surface antigens make the phage an excellent particulate vaccine carrier, although its bacterial production and preparation heterogeneity likely limit its applications in human vaccines at present, despite being apparently safe and well-tolerated in animals and people. unanticipated characteristics of the phage particle, such as crossing of the blood-brain barrier and formation of highly ordered liquid crystalline phases, have opened up entirely new avenues of research in therapeutics for chronic disease and the design of nanomaterials. 
our comparatively detailed understanding of the interactions of model filamentous phage with their bacterial hosts has allowed researchers to harness the phage life cycle to direct protein evolution in the lab. hopefully, deeper knowledge of phage-host interactions at an ecological level may produce novel strategies to control bacterial pathogenesis. while novel applications of the filamentous phage continue to be developed, the phage is likely to retain its position as a workhorse for therapeutic antibody discovery for many years to come, even with the advent of competing technologies. kh and js conceived and wrote the manuscript. ma-g read the manuscript and commented on the text. evolutionary selection of new breast cancer cell-targeting peptides and phages with the cell-targeting peptides fully displayed on the major coat and their effects on actin dynamics during cell internalization generating fsh antagonists and agonists through immunization against fsh receptor n-terminal decapeptides metagenomics-based analysis of viral communities in dairy lagoon wastewater elevated abundance of bacteriophage infecting bacteria in soil phage displayed hbv core antigen with immunogenic activity cost effective filamentous phage based immunization nanoparticles displaying a full-length hepatitis b virus surface antigen hormone phage: an enrichment method for variant proteins with altered binding properties protective immune responses induced by the immunization of mice with a recombinant bacteriophage displaying an epitope of the human respiratory syncytial virus selection of pancreatic cancer cell-binding landscape phages and their use in development of anticancer nanomedicines targeted delivery of sirna into breast cancer cells via phage fusion proteins delivery of sirna into breast cancer cells via phage fusion protein-targeted liposomes association of a bacteriophage with meningococcal disease in young adults a chromosomally integrated bacteriophage in invasive 
this work was supported by funding from the national research council of canada (kh, ma-g) and the canada research chair program (js). we thank jyothi kumaran and roger mackenzie for critical appraisal of the manuscript, and jasna rakonjac for inviting us to contribute it. this is national research council canada publication number 53282. key: cord-020235-stcrozdw authors: nan title: abstracts of papers presented at the 38th meeting of the deutsche gesellschaft für hygiene und mikrobiologie, virology section, göttingen, 5.–8.10.1981 date: 2012-03-15 journal: zentralbl bakteriol mikrobiol hyg a doi: 10.1016/s0174-3031(82)80128-5 sha: doc_id: 20235 cord_uid: stcrozdw nan schaefer, a., zibirre, r., kabus, p., kohne, jutta, and koch, g. na+,k+-atpase activity was studied in a plasma membrane rich fraction isolated from control and poliovirus infected hela cells and compared to membrane potential and amino acid uptake in parallel cultures of intact cells. 
na+,k+-atpase activity in membranes from infected hela cells increased relative to control, with a maximum at 90 minutes post infection (+30 %), and decreased again (180 min.: -30 %). similar but slighter changes were observed in the membrane potential dependent tetraphenylphosphonium (tpp+) uptake, indicating a correlation between membrane potential in intact cells and our measurement of plasma membrane na+,k+-atpase. at approximately 1 h post infection we observed a decrease in the uptake of amino acids (methionine, leucine) in infected cells relative to controls. these results suggest that the decline in amino acid uptake is not mediated by virus-induced changes in the na+,k+-atpase activity or membrane potential. mpi f. biochemie, abt. virologie, sequence homology in different strains of fmdv and other picornaviruses marquardt, o. restriction enzyme generated subgenomic fragments of cloned cdna prepared from rna of the strain o1k of fmdv were compared quantitatively for sequence complementarity with radioactive rna from strains cobb and a2s or with rna from mengo or polio virus in hybridization experiments by use of the southern technique. nucleic acid sequences neighbouring or including the 3' end of viral genomes are demonstrated to be 80 % homologous in fmdv. in contrast, sequences coding for the capsid proteins vp1 and vp3 were remarkably heterologous (less than 20 %) in fmdv. sequences coding for non-structural proteins showed 35-50 % homology. thus no highly conserved coding sequence was detectable in fmdv by this technique. no hybridization could be detected between o1k specific dna and polio rna, while weak hybridization was observed with mengo rna at areas including the 3' end and with a part of the gene for precursor protein p52. abt. molekularbiologie, physiol.-chem. inst., grindelallee 117, d-2000 hamburg 13 modification of poliovirus capsid proteins scharli, claudia and koch, g. 
poliovirus type 1, strain mahoney, contains a protein kinase activity. a highly purified (two cycles of cscl gradient centrifugation) poliovirus preparation is able to transfer the γ-phosphoryl group of [32p]atp to acid-precipitable material. when the reaction product is analyzed by sds-page, all structural proteins of polio are phosphorylated. most of the label is incorporated in the minor capsid protein vp0 and to a lesser extent in vp1, vp2 and vp3. hepatitis a virus (hav) was isolated directly from human stool in diploid human fibroblasts; viral antigen was expressed only after 210 days of incubation of the infected cultures. in contrast, hav first adapted to growth in human hepatocellular carcinoma cells caused antigen expression in fibroblast cultures already 90 days after infection. during further passage the time of first appearance of antigen in infected fibroblasts decreased to about 14 days in both passage series (i.e. cultures inoculated with virus recovered directly from human stool or after adaptation to human hepatoma cells). antigen was predominantly cell-associated and was shown by immunofluorescence to be located exclusively in the cytoplasm. biophysical properties of hav particles extracted from infected cells were comparable to those described for hav extracted from human stool. these findings are of importance for the preparation of large amounts of hav for vaccine production. max von pettenkofer-inst. u. inst. f. biochemie d. univ. d-8000 münchen 2 cloning of hepatitis a virus genomes helm, k. von der, winnacker, e. l., and deinhardt, f. subgenomic fragments of the genomic rna of hepatitis a virus (hav) were cloned via cdna into the pst i site of pbr322. by restriction enzyme pattern analysis and hybridization of the cloned hav dna with selected fragments of the hav rna it was determined that the major part of the cloned hav dna fragments are located at the 3' end of the genome but a few clones are distributed over the entire genome. 
we have now identified subtype specific sites in hbv restriction maps. the maps were aligned to the eco ri site. only subtype ay revealed one subtype specific site of each xba i (2000 bp), hinc ii (2200 bp) and bam hi (2900 bp). they are situated outside the s-gene. employing this system the subtypes of three hbv-dna sequences that have been published so far could be determined. thereupon the first 54 triplets of a hbv (ad)-dna sequence of the s-gene could be compared with published data. nucleotide exchanges in triplets 12 and 13 did not cause an amino-acid (aa) exchange, but changes in the triplets 45 and 46 do cause such an exchange. thus, in subtype ay, aa 45 is thr, and aa 46 is thr, while in subtype ad, aa 45 is ser and aa 46 is pro. these exchanges may produce different conformations of the peptides. dept. med. microbiol., univ. d-3400 göttingen characterization of the hepatitis b virus (hbv) associated protein kinase gerlich, w. h. purified hbv-core preparations were shown to contain a protein kinase which phosphorylates the major core protein (albin and robinson, 1980). in this study it was found that this enzyme copurifies with the light (d = 1.31) but not with the heavy subfraction (d = 1.35) of the core particles. the enzyme has a high affinity for atp and it transfers approx. one phosphate per particle within minutes. the 32p-phosphate introduced in vitro cannot be removed from core particles by digestion with alkaline phosphatase. after lysis of the cores with sds the 32p can be cleaved from the protein. this suggests that the enzyme and its phosphate acceptor site are located within the particle. after acid hydrolysis the incorporated 32p co-migrates with phosphoserine but not with phosphothreonine or phosphotyrosine. max von pettenkofer-inst., univ. d-8000 münchen 2 a phosphokinase activity associated with the hepatitis b virus (hbv) core helm, k. von der, roggendorf, m., and siegert, w. 
preparations of core antigen positive (hbcag) fractions obtained from hbv positive human liver tissue are associated with a phosphokinase which copurifies with the hbcag in cscl density and sucrose velocity gradient centrifugations. the acceptor protein for this kinase is the 19000 d hbcag protein; casein as an exogenous acceptor is also phosphorylated to a minor extent. the phosphorylated amino acid is serine or threonine and not tyrosine, as is frequently the case with tumor virus specific protein kinases. dept. med. microbiol., univ. d-3400 göttingen effect of glycosidases on the proteins of hepatitis b surface antigen (hbsag) stibbe, w. and gerlich, w. h. the protein composition of purified hbsag was studied by sds gel electrophoresis and staining with silver. in addition to p25 and gp28 of hbsag two further proteins, gp33 and gp36, were found consistently in preparations from 9 different blood donors. the glycoprotein nature of gp33 and gp36 was shown by the increase of electrophoretic mobility after treatment with endoglycosidase h or neuraminidase. immune precipitation with anti-hbs in ripa buffer confirmed the specificity of the glycoproteins. the change in apparent mol. weight after treatment with neuraminidase was larger for gp36 (900) than for gp33 (600) or gp28 (300). the data suggest that gp33 and gp36 are multiply glycosylated proteins with n-glycosidic linked carbohydrate side chains of the mixed type. inst. f. virologie, univ., hbs antigen screening was done in the blood samples of 1,790 pregnant women. 33 blood samples were found to be hbs antigen positive. 32 women were asymptomatic carriers, 1 woman suffered from chronic hepatitis. 31 children have been born until now. the examination of cord blood showed hbs antigen in 6 samples; the venous blood samples taken thereupon demonstrated hbs antigen only in two samples. the hbs antigen negative children were treated with hbig. 
the follow-up examination of the treated children showed no signs of infection up to now. one child born to an hbe antigen positive mother was already infected at the time of birth. another child, born to an anti-hbe positive mother, became hbs antigen positive at the age of 4 months. unfortunately, this child had not been treated with hbig at the time of birth. willers, h., ponnath, h., sipos, s., and moller, r. sera of 5,847 pregnant women living in the hannover area were investigated for the presence of hbsag. 56 healthy hbsag carriers were found: 17 (0.34 %) among 4,962 women of german origin and 39 (4.28 %) among 912 foreign-born women from southern european and non-european countries. 10 out of the 56 hbsag carriers (1/4,962 german women and 9/912 foreign-born women) were recognized to be hbeag-positive. the risk of perinatal transmission of hepatitis b virus in infants of hbeag-positive mothers is suggested to be 90 % and of hbeag-negative mothers 12 %. thus in the hannover area 6 out of 10,000 infants born to german mothers and 129 out of 10,000 infants born to foreign mothers become hbsag carriers due to perinatal hepatitis b virus infection. max von pettenkofer-inst., univ. d-8000 munich 2 a radioimmunoassay for anti-hbc igm using higher concentration for the dilution of serum (5 mg/ml) and hbcag (10 mg/ml) was compared with an elisa test, j. clin. microbiol. 13 (1981), 618. only 10 of 157 (6 %) hbsag positive blood donors were positive (40 in elisa) and no correlation with the total anti-hbc titer could be found. after fractionation of a serum with discrepant results a positive result was found in the elisa only in the igg fractions. in a collaborative study with a/torfer et al. (presented also at lisbon, easl 1981 (1981), who showed a 10^6-fold reduction in hepatitis b titer by means of β-pl/uv treatment. 
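the perinatal transmission estimate in the hannover abstract above (willers et al.) can be re-derived from the counts it reports. the following is a minimal sketch, not the authors' method: it assumes one infant per screened woman, takes the hbeag-negative carriers as all hbsag carriers minus the hbeag-positive ones, and uses a hypothetical helper name, `infected_per_10000`.

```python
# Sketch of the arithmetic behind the per-10,000 estimates.
# Assumptions (not stated in the abstract): one infant per screened woman;
# HBeAg-negative carriers = HBsAg carriers - HBeAg-positive carriers.

def infected_per_10000(women, carriers, hbeag_pos, risk_pos=0.90, risk_neg=0.12):
    """Expected number of perinatally infected infants per 10,000 births."""
    hbeag_neg = carriers - hbeag_pos
    expected_infected = hbeag_pos * risk_pos + hbeag_neg * risk_neg
    return 10_000 * expected_infected / women

# Counts taken from the abstract.
german = infected_per_10000(women=4962, carriers=17, hbeag_pos=1)
foreign = infected_per_10000(women=912, carriers=39, hbeag_pos=9)

# Carrier prevalence in percent, for comparison with the reported values.
german_prevalence = 100 * 17 / 4962
foreign_prevalence = 100 * 39 / 912

print(f"german mothers:  {german:.1f} per 10,000")
print(f"foreign mothers: {foreign:.1f} per 10,000")
```

under these assumptions the sketch gives about 5.7 and 128.3 per 10,000, close to the reported 6 and 129; the small difference for the second group presumably reflects rounding or inputs not given in the abstract.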
the effectiveness of the cold sterilization procedure as regards reducing virus infectivity is considerably greater than that of pasteurization (10 h, 60 °c), for which shikata et al. (1977) showed a 10^4-fold reduction of hepatitis b virus infectivity. it has also been found that factor ix concentrate and stabilized serum (biseko®) derived from pooled, cold-sterilized human plasma transmitted neither hepatitis b nor hepatitis non-a, non-b in chimpanzees. max von pettenkofer-inst., d-8000 münchen 2 comparison of different evaluation systems for determination of viral antibodies of the igg and igm class in the enzymeimmunoassay roggendorf, m., zachoval, r., zoulek, g., and deinhardt, f. the enzymeimmunoassays used to demonstrate antibodies of the igg and igm class against viral antigens reveal a high sensitivity. at onset of acute viral infections igm antibodies can be demonstrated up to serum dilutions of 10^-7. due to this high sensitivity, the determination of antibody titers is not accurate and reproducible, because titer end points are determined as that dilution of sera giving a ratio of o.d. sample/o.d. negative control equal to or greater than 2.1. for the determination of antibody concentrations an evaluation method is proposed which correlates the measured o.d. of sera at one dilution step to the o.d. of a reference serum which is defined arbitrarily to contain 100 antibody units. using an elisa for detection of antibodies to adenovirus, a significant increase in antibody units of acute phase sera over that of convalescent phase sera is observed. low day to day variations are seen in tests carried out on different days. in an elisa which is designed to detect antibodies to tick-borne encephalitis virus of the igm class, the day to day variation using antibody units was significantly lower than using p/n ratios. inst. f. virologie, justus-liebig-univ., d-6300 giessen; max-planck-inst. f. 
molekulare genetik, d-1000 berlin structure and function of the core protein of alphaviruses boege, ulrike and wittmann-liebold, brigitte the complete primary structures of the core proteins of sindbis and semliki forest virus have been established by protein-chemical methods. both proteins contain clusters of basic amino acids and proline in the n-terminal part, suggesting that its function is to bind to the nucleic acid. the c-terminal parts have extensive identical sequence regions, probably providing the recognition sites for protein-protein interactions. both proteins contain within these highly conserved portions the sequence gly-asp-ser-gly, which is typical for serine proteases and most likely is related to the catalytic function of the core protein as a protease when cleaving its own peptide chain from the nascent protein precursor. experimental studies using peptides which contain certain sequence regions will help to elucidate the relationships of structure and function. an acute and a persistent infection with rabies virus (hep flury) was established in the cns-derived cell line 108-cc-15 (ng 108-15). these cells possess specific membrane receptors to many hormones and neurotransmitters including opiate receptors. in both cases we found increases in the dissociation constant (kd) for the agonist 3h-etorphine as estimated by scatchard plot analysis. however, in both cases there was no change in the number of opiate receptors per cell compared to uninfected cells. these studies complete our previously published observations of the impairment of receptor functions in rabies virus infected cells (1). (1) koschel, k. and m. halbach: j. gen. virol. 42 (1979), 627-632. inst. f. virologie u. immunobiologie d. univ., d-8700 würzburg effect of measles (sspe) antiserum on viral surface proteins and hormone receptor activity in c6/sspe persistently infected cells barrett, p. n. and koschel, k. 
it has been reported (1) that measles antiserum can modulate the expression of certain virus proteins in acutely measles infected cells. we have examined the effect of measles (sspe) antiserum on the expression of viral surface proteins in c6/sspe infected cells. we have shown an over 50 % reduction in the amount of viral antigen present on the membranes of these cells following incubation with sspe antiserum. this was demonstrated both by immunofluorescence and radioimmunoassay. however, this loss of antigen had no effect on membrane receptor linked camp synthesis. this is discussed with respect to the effect of virus antigen insertion in cell membranes on specialised membrane bound functions. (1) we have analyzed the effect of phosphorylation and dephosphorylation on the structure and dna binding of d2-t antigen. on non-denaturing pore-gradient gels the purified protein migrated with an apparent size of 135,000 daltons; in vitro phosphorylation by the protein kinase associated with the purified protein resulted in a shift of most of the protein to a size of 740,000 daltons, and it was this form that contained most of the phosphate incorporated. this aggregation was completely reversible by treatment of phosphorylated d2-t antigen with alkaline phosphatase. partial tryptic digestion indicated that phosphorylation of sites in the n-terminal part of the protein is responsible for the observed aggregation. as shown by protein blotting onto nitrocellulose filters, predominantly the 740,000 dalton form bound to sv40 dna. however, only a fraction of the in vitro phosphorylated protein did bind to dna, suggesting that aggregation alone is not sufficient for dna binding. subclasses of sv40 large t antigen were separated by zone velocity sedimentation. three major forms, which sedimented at about 5-6s, 14-16s and 23-25s, have been shown to differ biologically and biochemically (1).
each subclass was tested for specific binding to restriction fragments of sv40 dna using an immunoprecipitation assay. all three forms of t antigen bound specifically to a restriction fragment containing the sv40 origin of replication. however, the 5-6s form bound more origin fragment per unit t antigen than the other forms. the 5-6s form bound equally well to origin dna in the presence and absence of excess cellular dna, whereas the binding of the 14-16s form was reduced in the presence of cellular dna. all forms of t antigen from sv40-transformed sv80 cells bound much less origin dna than that from infected cells. (1) fanning, nowak, burger: j. virol. 37 (1981) wart scrapings from several small skin regions of an epidermodysplasia verruciformis patient were tested for papillomavirus-specific sequences by means of a 32p-labelled hpv 8 dna. uncleaved wart dna contained uniform full length hpv genomes. cleavage with bam hi revealed six different patterns and a surprising heterogeneity even within small biopsies. at least two virus types showed only limited cross-hybridization with hpv 8a. one of these closely resembles hpv 5b (bam hi fragments 2.9 and 2.1; eco ri fragments 3.6 and 1.1). hind ii fragments a and c of this virus perfectly hybridize with hpv 8a; b, d and e anneal only partially and f, g show no detectable hybridization. the dnas of four subtypes were partially characterized and mapped. only dna of the hpv 5b-like virus was detected in a probe from the center of a carcinoma at the patient's forehead. this dna persists extrachromosomally with more than 100 genome equivalents per cell. inst. f. virologie, zentrum hygiene, univ., hermann-herder-str. 11, d-7800 freiburg gene expression of papillomaviruses in hamster tumors freese, u. k., schulte, p., and pfister, h. bpv 1 induces fibrosarcomas in hamsters.
the tumors contain large amounts of complete virus genomes which persist extrachromosomally, but there is no evidence for capsid protein synthesis or virus particle production. we used this system to study early viral gene expression. rna from the tumors contained a single virus-specific rna with about 1300 nucleotides which was shown to be polyadenylated by affinity chromatography on poly-u-sepharose. the transcribed dna region of bpv 1 was mapped within the two hpa ii fragments, which are next to the bam hi cleavage site within the 1.4 x 10^6 d bam hi/eco ri fragment. cross-hybridization studies under low stringency revealed some sequence relationship of hpv 1 and hpv 4 dna to the transcribed region of bpv 1. inst. of genetics, univ., d-5000 cologne the adenovirus type 12-mouse cell system: permissivity and analysis of viral dna in tumor cells starzinski-powitz, a., schulz, m., and doerfler, w. interactions between viruses and eucaryotic cells can be studied in a genetically well defined system like the mouse system. we have investigated whether ad12 dna is able to replicate in primary mouse kidney cells or in the mouse cell line l929. it was shown by southern blot experiments that ad12 dna was not able to replicate in l929 cells, whereas in primary mouse kidney cells of (balb/c x c57/bl6) f1 origin, viral dna replicated. moreover, we subcutaneously injected ad12 into mice of various genetic origins. in about 100 mice injected, one tumor emerged in a balb/c mouse almost 8 months after injection. restriction pattern analysis of the tumor dna indicated that about 2-3 copies of ad12 dna were integrated and covalently bound to cellular dna. with the techniques available no deletion or rearrangement of viral dna could be found. the sites of junction between viral and cellular dna will now be cloned and sequenced.
inst. of genetics, univ., d-5000 cologne virus-cell dna recombinants in human cells lytically infected by ad2 neumann, r., weyer, u., and doerfler, w. there is ample evidence for the notion that virus-cell dna recombinants are formed in human cells productively infected with adenovirus type 2 (ad2). these high molecular weight forms were detected at 1-2 h postinfection and were generated at high frequency. a limited set of rather specific recombinants was produced (neumann and doerfler, j. virol. (1981) 887), suggesting that recombination exhibited a certain specificity. we now succeeded in molecularly cloning dna fragments excised from the high molecular weight dna of ad2-infected human cells by the restriction endonuclease ecori in λgtwes·λb or in λ charon 4b. some of these cloned fragments had sequence homology to both viral and cellular dna. this result provided proof for the occurrence of virus-cell dna recombinants. inst. of genetics, univ., d-5000 cologne unmethylated dna sequences in the promoter regions of integrated adenovirus genes correlate with gene expression kruczek, i. and doerfler, w. an inverse correlation was established between the levels of dna methylation at 5'-ccgg-3' sites in specific segments of integrated adenovirus dna and the extent to which these segments were expressed (sutter and doerfler, pnas 77 (1980) 253). similar correlations were reported in other viral and non-viral systems. more recently, the results of in vitro experiments provided direct evidence for the notion that dna methylation at specific sites led to gene inactivation (vardimon et al., pnas, in press). in the present study a detailed methylation map at 5'-ccgg-3' (hpaii/mspi) sites in the expressed early and the silent late genes of ad12 dna was determined in three ad12-transformed hamster cell lines.
the early regions of integrated ad12 dna were unmethylated; in particular their promoter regions were unmethylated at the hpaii sites in all three lines investigated. the late regions including the promoter sites were completely methylated. gahlmann, r., deuring, r., stabel, s., winterhoff, u., vardimon, l., and doerfler, w. the sites of junction between viral and cellular dna were sequenced to investigate two problems: 1. are the sites of insertion of viral dna specific? 2. what type of recombinatorial events occur in viral dna integration? we have studied junction sites from the ad2-transformed hamster line he5, and from the ad12-induced hamster tumor lines clac1 and clac3. the virus-cell dna junctions were cloned in the dna of bacteriophage λgt·λb. appropriate restriction fragments were sequenced by the maxam-gilbert technique. there was no direct sequence identity apparent at the viral and cellular junction sequences. deletions at the termini of the viral genome were seen involving 5 (he5), 45 (clac3) or 174 (clac1) nucleotides. peculiar patch type homologies between the cellular and viral sequence adjacent to the site of junction and also in remote areas were observed. these patches may have an important function in integrational recombination. buttner, w., veres-malnar, s., and block, j. as a first step to understand the relationship between structure and function of the adenovirus type 2 dna binding protein (ad2 dbp) we have begun to determine the primary structure of the dbp gene. the isolation and characterization of tupaia adenovirus (tav) has been reported. in order to construct a genetic map of the tav genome the use of temperature-sensitive (ts) mutants of tav was necessary. a variety of ts mutants of tav, which were generated by treatment of tav virions (1 x 10^10 pfu) with hydroxylamine, were isolated and preliminarily characterized.
six of these mutants (ts 1, 4, 11, 12, 13, and 54) did not replicate in tupaia embryonic kidney cells at 39.5 °c, but did replicate well at 32 °c. the characterization of these mutants was carried out using complementation tests, host range studies and dna analysis using different restriction enzymes. according to complementation analysis four groups were determined: group i = ts 1, group ii = ts 12, group iii = ts 13, and group iv = ts 4, 11, and 54. the host range study revealed that ts 1 and 54 also had properties of host range mutants. these mutants did not replicate in tupaia baby fibroblasts, in contrast to wild-type virus. genome analysis of these mutants revealed that the mutated region is located between 69.9 and 74.5 map units. in addition, in ts 1 and 54 mutants a deletion from 0.77 kb to 1.39 kb (map unit 76.5 to 100) was detected. interaction of viral substructures with serum and serum components resp. during the final stage in the course of many viral infections complete virus particles as well as viral substructures enter the intercellular space, e.g. hbsag, hbcag in hepatitis b virus infections. the present study deals with the interaction of the potentially infectious adenovirus cores with dna-specific antibodies and immunoglobulin solutions which had been absorbed by dna-antigens. cores were prepared by sarcosyl treatment of purified adenovirus type 5. by electron microscopy immunocomplexes could be demonstrated which are composed of several individual core units. by buoyant density centrifugation in metrizamide gradients a drastic rise in the density of cores could be shown too. with immunoglobulin solutions absorbed with native as well as with denatured dna, so far no complexes could be assessed. a cell line designated sbl-h12578 and several cell clones were established from skin tumours of the sporadic leukosis form. the cells were proved to be free of bovine leukemia virus (blv) and some other bovine viruses.
by indirect immunofluorescence macroschizonts of a theileria species were seen in the cytoplasm. the cells originated from a blv-antibody negative animal and carried a female karyotype with some morphological aberrations. evidence for a possible t cell origin of sbl-h12578 cells was obtained. after inoculation of 8 x loa cells into 'nude mice' transplantable sbl-h12578 tumours developed. by incorporation of 3h-uridine and electron microscopy, the production of retrovirus particles by the cultured cells was detected. in the simultaneous detection test a high molecular weight rna co-migrating with felv 70 s rna was demonstrated. no antigenic or genetic relationship between the skin tumour virus isolate and blv or other major mammalian retroviruses has been found. (supported by stiftung volkswagenwerk). abt. f. pathologie, gesellschaft f. strahlen- u. umweltforschung, d-8042 neuherberg insectpathogenic baculoviruses: studies of the activation of endogenous c-type retroviruses in mammalian cell cultures schmidt, j. and erfle, v. activation of endogenous c-type retroviruses by baculoviruses was studied in "in vitro" cell culture systems of four mammalian species: mouse, rat, monkey and man. cells were treated with baculoviruses (from larvae and insect cell cultures), baculovirus-dna, c-type retrovirus-activating chemicals and chemical insecticides alone and in combination. the activation of retroviral genomes was tested by the determination of reverse transcriptase activity in concentrated cell culture supernatants and by the demonstration of the intracellular localisation of retrovirus structural protein p30 applying the indirect immunoperoxidase technique. c-type retroviruses were activated in mouse cells only by the halogenated pyrimidine analogue iododeoxyuridine. in baculovirus-treated cell cultures no c-type retrovirus activation was detectable. in simultaneous treatments of the cells with baculoviruses and chemicals no potentiating effects could be detected.
virions of baculoviruses in mammalian cell cultures showed unaltered morphology, and upon reisolation their infectivity in homologous insect cell cultures was lowered by approximately one log. no influence upon growth or morphology of the cells could be observed. the gene products of replication-defective oncornaviruses are high molecular weight proteins which comprise a gag-related and an onc gene product. they are probably not cleaved because the p15 protease gene is lacking. in vitro, however, the gag-specific peptide sequences are cleaved off upon addition of the purified viral p15 protease; in the case of the replication-defective, transforming avian sarcoma virus prc ii the cleaved non-gag part has a tyrosine-phosphorylating kinase activity similar to that described for the rsv src-gene product pp60src. processing of pr92gp, the precursor to the viral glycoproteins of rous sarcoma virus bosch, v., schwarz, r. t., ziemiecki, a., and friis, r. r. the viral glycoproteins of rous sarcoma virus gp85 and gp35 are synthesized via a precursor polyprotein pr92gp. this precursor is already glycosylated and contains the polypeptides of both gp85 and gp35. we have studied the nature and site of processing of pr92gp to mature disulfide-linked gp85 and gp35 (vgp). we could show that in addition to proteolytic cleavage, processing involves conversion of the high-mannose oligosaccharides found in pr92gp to the complex, sialylated oligosaccharides found in vgp. experiments pertaining to the site of processing indicate that processing does not occur extracellularly as has been proposed by others (klemenz and diggelmann, j. virol. 29 (1979) 285-292). we have determined that the small amount of mature vgp found in infected cell lysates is localized chiefly within the cell, not at the cell surface. we favour the view that further glycosylation and proteolytic cleavage occur concomitantly on smooth membranes within the cell and that subsequent to this, export in virus is rapid.
1 paul-ehrlich-inst., d-6000 frankfurt; 2 inst. f. virologie, univ., d-6300 gießen; 3 a protein of a molecular weight of about 38,000 d has been found to be phosphorylated 1 h after the onset of cell transformation by rous sarcoma virus (rsv). it is assumed to be a physiological target of the pp60src kinase, since, apart from being phosphorylated in the transformed cell, it can be phosphorylated in vitro by the pp60src kinase in tyrosine (1, 2). using 6 different chromatographic procedures (chromatography on deae-sephacel, poly(a)-sepharose, blue sepharose, and hydroxylapatite, isoelectric focusing and gel filtration) the 38,000 d protein could not be separated from cytosolic malic dehydrogenase activity (c-mdh). antiserum against the 38,000 d protein inhibited c-mdh. the transformation-specific protein "v-myc" of the acute avian leukemia virus mc29 donner, p., greiser-wilke, irene, and moelling, karin avian acute leukemia viruses transform cells through a virus-coded oncogene. in the case of the myelocytomatosis virus, mc29, which transforms fibroblasts as well as bone marrow cells in vitro, this oncogene is fused to a viral structural component, p19. the fusion protein, v-myc, was characterized by using monoclonal antibodies against p19. mc29 transformed quail fibroblasts which do not produce any virus, mc29-q8, were analyzed by immunofluorescence. a nuclear fluorescence was observed which was not detected in normal cells or virus-producing cells. the monoclonal antibodies were used to purify the v-myc protein by immuno-affinity chromatography. the purification achieved by this single-step procedure was 3,500-fold. the eluted protein was precipitable by anti-sera and will be further characterized for its biological properties. med. poliklinik, univ., d-8000 münchen; abt. f. pathologie d.
gsf, neuherberg biochemical characterization of antigens in human leukemic sera that crossreact with sisv p30 and baev p30 leib, c., schetters, h., erfle, v., and hehlmann, r. antigens crossreacting with the core proteins p30 of baboon endogenous virus (baev) and/or of simian sarcoma associated virus (sisv) have been isolated from human leukemic sera by immunoaffinity chromatography. the antigens have an apparent molecular weight of 70,000 in sds-polyacrylamide gel electrophoresis. peptide maps performed with the antigens from two different leukemic sera show that the two antigens are identical. peptide analyses of sisv p30 and of baev p30 and simultaneous mapping of p30 proteins mixed with human antigens show that 11 out of 21 major peptides of sisv p30 and 10 out of 20 major peptides of baev p30 have identical mobility with peptides of the human antigens. human serum albumin, transferrin, fibrinogen, igg and igm share clearly fewer peptides of identical mobility with the human antigens. the isoglycoproteins gp69 and gp71 were purified from f-mulv particles (propagated in eveline cells) by solubilization (freezing and thawing), ion exchange chromatography (phosphocellulose) and preparative sds-page. prior to amino acid and nh2-terminal amino acid sequence analysis, the purified glycoproteins were subjected to high performance liquid chromatography (hplc) to remove contaminants. the nh2-terminal amino acid sequences (23 residues) of gp71 and gp69 were found to be different (in 10 positions) but highly related. f-mulv gp69 shows 41 % homology to gp71 but lacks the potential glycosylation site (sequon) at position 12 in both f-mulv gp71 and r-mulv gp70. et al., 1980). in our laboratory 99 breast cancers, 60 normal breast tissues, 44 benign breast lesions and 10 other carcinomas from south german patients were tested for crossreactivity with mmtv-gp52.
the tests were done by indirect immunoperoxidase staining on paraffin sections using antisera against gp52 provided by dr. spiegelman, n. y. specificity of positive reactions was controlled by preabsorption with purified mmtv-gp52. 86 breast cancers (87 %) and 13 benign breast lesions (30 %) gave positive reactions, whereas normal breast tissue and other carcinomas were negative. the test might be an additional useful diagnostic tool for the early detection of micrometastases, for the assignment of metastases from unknown primary tumors and in doubtful cases of breast cancer. it can possibly serve as an additional criterion for the classification of mastopathies. viruses were found in 3,691 patients. nearly all illnesses were caused by mumps and enteroviruses; other viruses were found in less than 5 % of all cases. the mumps meningitides were ascertained constantly over all these years. in 1967, 1974 and 1980 an increased incidence of meningitides caused by echovirus type 30 was seen. there was no seasonal dependence in the occurrence of mumps virus meningitides. meningitides caused by enteroviruses were found more often in the autumn. male patients fell ill twice as often as female patients. mumps meningitides were rarely found in the first year of life. in contrast, enterovirus meningitides could be detected during the first year and caused meningitides up to the age of 14. virus meningitides among adults were rarely found. we suspected that a number of peripheral facial pareses (p.f.p.) considered "idiopathic" might, in fact, have a viral aetiology. 71 patients of the cologne university e.n.t. clinic with so-called idiopathic p.f.p. were examined under the aspect of a viral aetiology. in 32 cases conditions for virological investigations were optimal (paired sera, first serum within the first week after onset of disease).
in 10 of these 32 cases a viral infection could be proven (varicella zoster virus: 7, herpes simplex virus: 2, coxsackie b4: 1). in 5 of 7 vzv cases minimal zoster lesions were observed, 4 facial and 1 thoracic. among the 39 other patients (with late sera) no viral infection could be proven. in conclusion, by means of alert clinical and virological examinations, a considerable fraction of idiopathic p.f.p. could be associated with a virus infection. 152 neonates were screened from delivery to the time of discharge for rotavirus infections. daily faecal specimens were examined by an enzyme-linked immunosorbent assay (elisa) and a subgroup of positive specimens were also tested by a negative staining electron microscopic method. 22 babies (14.5 %) were found to excrete rotavirus. 20 of them were asymptomatically infected and 2 showed mild gastrointestinal symptoms. with one exception viruses were not detected in babies less than 24 h old, but 10.9 % of them excreted virus during the second day of life and after 10 days 30 % of the neonates were positive for rotavirus. excretion persisted for 1 to 5 days. according to our study most babies (40 %) excreted rotavirus for 3 days. a great number of the stools (29 %) from the first day which were tested by elisa were found to have nonspecific activity in the absence of rotaviral antigen. therefore such stool specimens should only be examined by electron microscopy. the single radial hemolysis test (srh-test = hemolysis-in-gel test) is a suitable technique for detection of rubella and influenza antibodies in a large number of sera. in our study we looked for the effect of attachment of the viral antigens to the erythrocytes by different coupling reagents. chromic chloride, cyanogen chloride, glutaraldehyde and tetrazotized o-dianisidine (tod) were used for the sensitization of the erythrocytes. in the rubella srh-test no improvement with regard to sensitivity of the test was seen.
moreover, it is not possible to detect igm specific antibodies after different kinds of attachment in the srh-test. in the influenza srh-test with allantoic fluid of eggs infected with h3n2 virus it was possible to increase the sensitivity with tod, chromic chloride and potassium periodate. if tween-ether treated hemagglutinin was used, hemolytic zones were detectable only after sensitization with tod, chromic chloride and periodate. univ.-kinderklinik, mathildenstr. 1, d-7800 freiburg rna-electropherotyping of human rotaviruses 1978 . and pastor, s. rotaviruses contain a double-stranded ribonucleic acid genome consisting of 11 segments. this can easily be extracted from crude stool suspension (method by rodger et al., j. clin. microbiol. 1981). 55 out of 210 rota antigen positive samples contained sufficient rna to produce a satisfying pattern in the sds-acrylamide electrophoresis. we demonstrate six electropherotypes with differences in the relative mobility of segments 2, 3, 4, 5, 7, 8, 9, 11. of these diarrhea producing strains at least two different electropherotypes were found during every outbreak of rotavirus gastroenteritis. our findings are in good agreement with the results reported by rodger et al., espejo et al. (1980), and kalica et al. (1978, 1976). the so-called m-type of emc virus is capable of inducing diabetes mellitus in mice by a selective β-cell damage. the m-emc strain used in our laboratory has partially lost this capacity. we attempted to reisolate a diabetogenic variant and to elucidate the causes of the change. seven serial heart passages in mice did not enrich such a variant. cloning of the virus stock yielded one clone diabetogenic in two of ten animals (5 clones tested so far). in a different substrain of m-emc virus (the highly diabetogenic d-variant, obtained from dr. petersen) we found both diabetogenic and non-diabetogenic clones. incidence and severity of diabetes has been shown to be age-related.
this safety study was to demonstrate whether or not granulosis virus (gv) of laspeyresia pomonella can replicate in vertebrates. after feeding gv to nmri mice, no virus induced antibodies could be detected within eighty days by radioimmunoassay (ria) and no vertical virus transmission was observed. gv did not replicate in mice. human sera, as well as sera from horses, cattle, sheep and swine, reacted with gv in the ria. on characterizing the positively reacting human immunoglobulin class(es), igg showed the strongest positive reaction, less positive reactions were shown by ige and igm, and no reaction by iga and igd. by concentrating the immunoglobulins of the negative reacting sera to 250 pg igg/ml, all sera could be recognized as positive. thus a non-immunological reaction has been suggested. infection of human fibroblasts with cytomegalovirus induces typical cytopathic alterations. cell rounding within 5-6 h postinfection (p.i.) is followed by an increase in cell size, appearance of cytoplasmic and nuclear inclusions and a morphological change to an epitheloid cell shape by 48 h p.i. in order to elucidate the participation of the cytoskeleton in this alteration of cell morphology, experiments were initiated in serum-starved human fibroblasts to visualize changes in actin arrangement by indirect immunofluorescence. as early as 3 h p.i. cytoplasmic microfilaments had shortened and were rearranged to a more irregular pattern. at 12 h p.i. actin fibers were absent from rounded cells. the same was observed in epitheloid cells at 48 h p.i. cultivation of infected cells with phosphonoacetic acid resulted in a partial preservation of the normal actin fiber distribution. in contrast, infected cells did not exhibit major changes in actin synthesis as estimated from the specific radioactivity of cytoplasmic actin isolated from 3h-leucine pulse labelled infected cells by dnase i-sepharose affinity chromatography and sds-polyacrylamide electrophoresis with fluorography.
chicken erythrocytes were coated with glycine-extracted cmv antigen and negative control antigen by treatment with formaldehyde and crcl3 and were used for detection of cmv specific serum antibodies in pha tests. their sensitivity was proven to be in good agreement with, and their specificity superior to, results seen in cft. igm-specific cmv antibody detection was performed either after a simple and rapid deae-cellulose exchange chromatography of serum samples or by igg immunoprecipitation combined with me-reduction controls. the main advantage of the modified cmv-phat was seen in the stability of sensitized erythrocytes, which can be lyophilized. lanvers, a., mertens, th., and eggers, h. j. there are reports that herpes simplex virus (hsv) could be isolated from the genitourinary tract of up to 15 % of males without manifest herpes. in order to confirm these results we first wanted to establish a method for typing possible hsv isolates. we used 3 published methods: a plaque test in chick embryo cells, a neutralisation test, and the analysis of the early hsv proteins (sds-page). all 3 methods yielded identical results. we then tried to isolate hsv from 192 materials of 181 asymptomatic males (89 urethral swabs, 55 seminal fluids, 48 fresh tissue probes). all materials were immediately inoculated into tube cultures of two cell lines shown to be highly hsv-susceptible. additionally, the tissue probes were co-cultivated with permissive cells. we also induced cell fusion (peg) in such cocultures. despite all efforts we did not isolate any virus from any of these materials. for an experimental approach to answer the question as to which viral genes may control pathogenesis after peripheral infection of inbred mice with hsv, we have isolated a variant strain (ang-path) that proved highly neuropathogenic upon both i.p. and intravaginal infection from the apathogenic strain hsv-1 ang.
two alterations of the ang genome have been detected by restriction enzyme analysis: the loss of the amplifiable 500 b.p. nucleotide sequence typical for ang and a 400 b.p. deletion approximately at the position mapped for the structural viral glycoprotein d. despite the induction of interferon and nk cells both variants multiply to a similar extent, probably in the same target cells of the peritoneum. the spread from the peritoneum, spleen, liver, thymus and local lymph nodes to the cns seems, however, to be controlled by the action of gene products coded for in only one of the variants. div. of exp. virology, inst. for med. microbiology, univ., d-6500 mainz knoblich, a., friedrich, d., goertz, j., and falke, d. herpesviruses are known to cause infections in men and animals. strong differences in resistance to herpes simplex virus (hsv-1) are noted among mice of various strains. anti-hsv activity of t-lymphocytes, macrophages and nk cells has been demonstrated in the last years. our interest was focused on neutralizing antibodies in primary hsv-1 infections of mice. (abstracts of the 38th meeting of the dghm.) antibodies become detectable by day 5 after infection and reach a plateau level at day 21, the day we chose to test the sera. comparison of antibody titers in 18 strains of mice revealed titers always to be higher in female mice, whereas no clear influence of either h-2 haplotype or background genome could be detected. sex steroids produced in ovaries and testes were identified by castration experiments to exert influence on antibody formation. treatment of mice with silica once between day 1 before and day 12 after infection resulted in a strong increase of antibody titers both in females and males, at the same time abolishing the difference in antibody titers between the sexes. silica could enhance antibody levels also after immunisation of mice with a formol-inactivated hsv vaccine.
bestatin is a small peptide known selectively to stimulate dna metabolism in t-lymphocytes and to enhance hsv-antibody titers maximally when given at day 5 after infection. after pretreatment of mice with silica the antibody augmenting effect is already achieved at day 1 after infection. in secondary hsv-1 infection bestatin acts best 1 day after infection, too. inst. f. med. virologie d. univ., d-6900 heidelberg induction and characterization of herpes simplex virus reisolates, isolated after intertypic superinfection of latently infected tupaias darai, g. and scholz, j. the susceptibility of juvenile and adult tupaias to herpes simplex virus type 1 and 2 had been reported. the intertypic recombination of herpes simplex virus using temperature-sensitive mutants of hsv-1 and 2 and superinfection with wild-type hsv-1 and 2 was studied. it was found that animals survived an infection of 1 x 10^7 to 1 x 10^8 pfu of ts mutants of hsv-1 and/or 2 which were inoculated intravenously (ld50 for hsv-1 = 10^-3.75 and for hsv-2 = 10^-2.25). the inoculated animals were protected against a superinfection of hsv-1 or 2 (5.0 x 10^6 pfu/animal). the state of viral latency in surviving animals was investigated. it was found that infectious virus was recovered from ganglia of those animals which had initially been infected with wild-type hsv-1 or 2 and/or superinfected with wild-type viruses. in contrast, the infectious viruses were recovered only from spleens of those animals which had initially been infected with ts mutants of hsv-1 or 2 and superinfected with wild-type hsv-2 and/or 1. it was found that recovered viruses from the spleen of the animals lost their pathogenicity and their natural tissue tropism. significant changes in the genome of the recovered viruses from the spleen were detected. recombination between ts mutants of hsv-1 and 2 and challenged wild-type viruses was observed.
thus, the pathogenicity and genomic properties of recovered viruses were altered. this observation is the first description of generation of intertypic recombinants of hsv-1 and 2 in vivo. div. of exp. virology, inst. for med. microbiology, univ., d-6500 mainz the functions of the hsv-coded dpyk-complex enzyme labenz, j., brauer, d., moller, w. e. g., and falke, d. analysing the phosphorylating capacity of the hsv-coded dpyk of hsv-1 and 2 by glycerol gradient centrifugation we detected three peaks for each, differing in molecular weights, using atp, adp or amp as phosphate donors. also by page peaks with differing rf-values could be seen. indeed, 32p-amp and 32p-adp were used for phosphorylation of dthd. the amp-dependent activity could be purified boo-fold. two antisera against hsv-coded dpyk neutralized all three activities. the tk- mutant mdk (b2006) did not induce the respective activities in tk- cells; only the 3 cellular tk's were detected and identified by their rf-values. further experiments have shown that only the hsv-1 dpyk has thymidylate kinase activity, but not that of hsv-2. the hsv-1 thymidylate kinase activity could be neutralized by a tk-antiserum. -the ph-optimum and sensitivity to mg2+, fe2+, zn2+, co2+ and mn2+ ions differed. the susceptibility to thiol reagents was different, and the amp-dependent activity proved to be susceptible to phenanthroline. the km and vmax values were also determined. a diagram summarizes the biochemical function of the hsv-coded dpyk-complex. finally there is some indication that the enzyme phosphorylates ado by using dtmp or atp as a phosphate donor. the importance of the enzyme complex in hsv-infected cells is discussed. dkfz, inst. f. virusforschung, im neuenheimer feld 280, deletion of nucleotide sequences in cloned hsv-1 fragments during propagation in the rec a e. coli, hb 101 gray, c. p., jellinghaus, u., and kaerner, h. c. 
passage of hsv at high moi results in the generation of defective genomes which are of full length, consisting of a relatively short region of the wild type genome that is repeated. they are packaged into mature virus particles and thus leave the cell in a state that is capable of entering a new host. such defective molecules are of interest as they represent a simpler model in which to study replication, recombination and packaging of hsv. -such a defective molecule arising from the serial passage of hsv type 1 ang has been mapped, and restriction fragments have been cloned into pbr 322 using the rec a e. coli, hb 101, as host. -all the resulting clones were found to be unstable and to contain deletions, one of which has been characterised in more detail. the isolation and characterization of tupaia herpesviruses (thv 1 to 4) has been reported. the analysis of dna of these viruses showed the absence of submolar dna fragments, when the dna was cleaved with different restriction enzymes, as well as of a stem loop structure, when analysed by electron microscopy. with respect to this genomic structure it was of interest to study the recombination events between thv 1 to 4. recombinants were generated between these viruses using a co-infection technique in vitro on tupaia fibroblasts. recombinants were selected after the stocks of new progeny virus were treated with specific antiserum against each parental virus. different recombinants were isolated, plaque-purified and characterized. results for one recombinant thv-r-26 between thv-2 and 3 were as follows: (i) the in vitro host range and the in vivo pathogenicity in juvenile tupaia were altered compared to parental viruses. phosphorylation of proteins is a posttranslational modification which is regulatory for the activity of several enzymes. most animal viruses code for phosphoproteins and the degree of their phosphorylation is thought to be a controlling factor during viral macromolecular synthesis. 
recent studies on protein kinases from a number of tumor viruses have raised the possibility that the phosphorylation of cell proteins is involved in the processes leading to cell transformation. -incubation of purified tree shrew (tupaia) herpesvirus (thv) particles with γ-32p atp resulted in the incorporation of 32p-labelled phosphate into proteins. a nonionic detergent such as np-40 was necessary for the detection of protein kinase activity. the incorporation of 32p-phosphate was proportional to the quantity of the viral proteins, indicating that the viral proteins can serve as substrates for the viral enzyme. α-32p datp or α-32p atp did not function as phosphate donors. a divalent cation such as mg2+ or mn2+ is essential for the enzymatic activity. a product analysis revealed that six viral polypeptides are phosphorylated. the predominant sites of phosphorylation are the β-oh groups of serine and threonine. in 1972 a herpesvirus was isolated from young goats with a severe generalized infection in california. this isolate has been characterized in some detail and named caprine herpesvirus 1. recently a herpesvirus was isolated in switzerland from goats with a similar infection. we report a further characterization of both isolates and propose their classification as bovid herpesvirus 6 (bhv-6). -bhv-6 multiplies rapidly and shows a broad host-cell range. cross-neutralization could be observed only with bhv-1 (ibr/ipv-virus), in a one-way reaction. in gel immunoelectrophoresis the serologic relationship involved the major antigenic components of bhv-1. we have evaluated two methods for the analysis of antibodies directed against ebv-specified proteins: 1. indirect immunoprecipitation (ip) and 2. radioimmunoassay of electroblots of sds page separated ebv-specified proteins (riab). using ip we have identified 20 proteins against which antibodies are made during infection; some of these proteins have also been found using the latter technique. 
many of the proteins are only reactive with ea+vca+ sera and not reactive with ea-vca+ sera and may therefore be candidates for the ea specifying proteins. the failure to identify all proteins using the riab technique is probably due to the strong denaturing conditions used during the sds page step. only those antibodies directed towards the primary sequence of the protein will react with the blotted proteins, whereas with ip analysis the native proteins are available for interaction with the sera. previously we demonstrated that the synthesis of ebv-induced proteins in superinfected raji cells (raji si) could be divided into 3 phases: primary, secondary and tertiary (bayliss and wolf, j. gen. virol., 1981, in press). recent experiments show that incubation of raji si in the presence of canavanine and the absence of arginine allows only a limited expression of the viral genome. if the cells are released from a canavanine block (applied from 0 to 8 h post infection) then between 2 and 4 h later several new proteins are synthesized; however, if m-rna synthesis is inhibited (with actinomycin d) after removal of the canavanine then these new proteins fail to appear. amongst these proteins are members of the ebv-specified early antigen complex (ea). these results indicate that an arginine-containing protein is synthesized soon after infection and that this protein is required in an active form in order to synthesize m-rna for the secondary group of proteins. amongst these proteins are the major components of the early antigen (ea) complex. max von pettenkofer inst., univ., d-8000 münchen 2; ent-klinik, univ., d-8700 würzburg; dept. tumor virology, chinese acad. med. sci., beijing, china seibl, r., richter, w., zeng, y., gu, s.-y., and wolf, h. antibody titers of iga anti-virus capsid antigen (vca) can be used for screening early tumor cases, confirming histological diagnoses and long-term surveillance of therapy. however, in high risk areas (s. 
china, incidence of npc: 20/10^5) 2% of the population have iga anti vca antibody, indicating a need for methods which allow monitoring of additional parameters for deciding on the need for therapy. the same need exists in the case of long term (1-2 years) survivors of npc with constant iga anti vca titers. -we have developed a system to collect cell specimens by application of buffer to the tumor site and collection of cells directly on to nitrocellulose filters. these cells can be examined cytologically or for ebv dna in nucleic acid hybridization. for the latter tests a modification of the grunstein-hogness colony hybridization test and cloned ebv dna have been used. in reconstruction experiments 25 virus-producing cells could be detected. inst. f. klin. virologie, univ., d-8520 erlangen-nürnberg keil, g. and fleckenstein, b. after infection of permissive cell cultures with overlapping restriction fragments of viral dna derived from different strains of herpesvirus saimiri (h. saimiri) a number of recombinants could be isolated. the analysis of the viral proteins of the wt-strains 11, omi and s295c led to the identification of four proteins that differed within these strains with respect to molecular weight. we were able to localize these proteins in the internal region of the m-dna by comparing the protein patterns of recombinant and parent strains. exact localization has not so far been possible because the recombination events occurred mainly in regions near the ends of the l-dna. -furthermore recombinants were constructed to identify the genomic region responsible for oncogenicity. the wt-strains of h. saimiri are oncogenic, while the attenuated strain 11-att, which was obtained from strain 11, has lost this property. this may be due to a deletion of 1.1 md at the left end of the l-dna. recombinants between this attenuated strain and wt-strains were constructed. the test for oncogenicity in vivo is in progress. 
the t-lymphoid cells can contain up to 300 genome copies per cell as episomes. we have begun to study translation and transcription in transformed cells in comparison to lytically infected omk-cells. about 25 new proteins can be detected in lytically infected cells, having apparent molecular weights between 12 and 200 kd. -at early and late stages of infection the right part of the genome is preferentially transcribed and each dna-fragment encodes a series of specific rnas. -in contrast, we do not find any virus-specific proteins in the transformed cells after labeling with 35s-methionine and subsequent immunoprecipitation. preliminary results suggest that the only virus-specific rnas found in the transformed cells are small rnas with sizes of about 0.13 kb. -a more detailed analysis has to be performed in order to confirm these hybridization data. max von pettenkofer inst., d-8000 münchen 2 characterization of herpesvirus saimiri glycoproteins modrow, s. and wolf, h. for the identification of herpesvirus saimiri glycoproteins we separated h. saimiri 11 induced cell proteins on sds-polyacrylamide gels and transferred the polypeptides by electrophoretic blotting (2 h, 3.7 ma/cm2) to nitrocellulose paper using carbon electrodes and buffer-soaked sponge to cover the gel/filter layer. lectins, which are known to bind very specifically to certain sugar residues (concanavalin a to d-glucose and d-mannose derivatives, soy bean agglutinin to n-acetyl-d-galactosamine and d-galactose, dolichos biflorus agglutinin to n-acetyl-d-galactosamine, ulex europaeus agglutinin to l-fucose) were iodinated using the nen lactoperoxidase system. the glycoproteins bound to the nitrocellulose sheets were detected by incubation with the 125i-labelled lectins. in this way eight viral glycoproteins could be identified, two of which were synthesized in the presence of canavanine, and characterized according to the type of glycosylation. 
lymphocystis disease (ld) is a virus disease of marine fish with an almost world-wide geographical distribution. this disease is characterised by papilloma-like tumour lesions. lymphocystis disease virus (fldv) has been tentatively classified as belonging to the family of iridoviridae. this report describes the anatomy of fldv: the fldv virions were isolated from a total of 22 fish with ld lesions, caught near the doggerbank, including 12 flounders, 6 dabs, and 4 plaice, which were analysed individually. the purity of the virus preparations was examined by a negative-staining technique and the virion diameters were determined: flounder 227.5 ± 12.5 nm, ldv-plaice 198.8 ± 12.9 nm, and ldv-dab 200.5 ± 12 nm. dnas of these different ldv isolates were cleaved with different restriction endonucleases and the resulting dna fragments were separated electrophoretically on agarose slab gels. the fragment patterns demonstrate that ldv dnas of flounders and of plaice are indistinguishable, but clearly different from those of dab. the determination of the molecular weights of fldv dnas using contour length measurements by electron microscopy resulted in a value ranging from 60 to 150 x 10^6 daltons. in contrast the molecular weight estimations by restriction enzyme analysis resulted in lower values ranging from 60 to 90 x 10^6 daltons. this discrepancy is probably due to a restricted, permutated structure of the ldv genome similar to the genome of frog virus 3 as reported by granoff. the structure of fldv constituents and their interaction with host cells remain obscure. in an effort to clarify the viral components and their functions we have studied the proteins of purified fldv and searched for virion-associated enzymes. -at least 33 distinct viral polypeptides were detected by polyacrylamide gel electrophoresis under denaturing conditions. 
the polypeptide patterns are remarkably specific for a given fish species (flounder, plaice, and dab) although some heterogeneity was found when proteins of individual fish of the same species were analysed. -it was found that a nucleoside triphosphate phosphohydrolase activity is closely associated with fldv particles. this activity hydrolyses atp with a high preference. the reaction requires a non-ionic detergent and a divalent cation, such as mg2+. reaction rates and substrate specificities were determined. the products of the reaction are nucleoside diphosphates and inorganic phosphate. characterization of a poxvirus isolated from white rhinoceros (ceratotherium s. simum) pilaski, j., schaller, k., olberding, p., and finke, hannelore in september 1977 an outbreak of pox disease occurred in white rhinoceroses (ceratotherium s. simum) in the münster zoo. at the same time a similar outbreak was observed in elephants (elephas maximus, loxodonta africana) of the zoo in frankfurt (airline distance about 230 km). in both cases orthopoxvirus strains could be isolated which were similar but not identical in their biological properties (small efflorescences on the cam with hemorrhagic center, inclusion bodies of type a v+, high pathogenicity for rabbit skin, characteristic skin lesions in adult mice) and their dna restriction patterns (xho i, eco r i, hin d iii). both virus strains were incorporated into the group of "cowpox-like viruses" (baxby and ghaboosi, 1977). the results indicate that both outbreaks had occurred independently of each other. characterization of a 37k protein specifically associated with released extracellular vaccinia particles hiller g. and aulbach, h. infectious vaccinia virus can be isolated from infected cells after experimentally induced lysis (intracellular virus) or from the growth medium of infected cultures (extracellular virus). 
we have characterized a 37k protein only present on extracellular particles by its amino acid composition and its behaviour on isoelectric focusing. in addition we have used a 37k-specific antiserum to detect its distribution within infected cells. -37k protein is a late viral protein appearing 5-6 h p.i. it is predominantly found associated with the cellular golgi complex but never with structures representing pox virus "factories". later in infection 37k protein is incorporated into single viral particles preferentially found in the cell periphery. upon electron microscopy approximately 30-40% of morphologically mature virions inside the cell are enwrapped by a double-membrane vesicular structure. thus 37k viral protein is probably a component of this vesicular structure and only vesiculated virions can be released by the cell before lysis occurs. epidemiology of influenza in lower saxony willers, h. and hopken, w. the influenza surveillance in lower saxony is mainly based on laboratory investigations, especially on attempts to isolate influenza viruses throughout the year. -in the winter 1977-78 and in the winter 1980-81 the two influenza subtypes h3n2 and h1n1 circulated at the same time. in recent years the h3n2 subtype affected persons of all age-groups whereas the h1n1 subtype affected only persons younger than 30 years. -in the winter 1978-79 influenza b was found to an epidemic extent. the disease caused outbreaks mostly in schools and kindergartens but also affected adults. -in 1980-81 from mid-january to mid-february scattered outbreaks were caused by the a subtype h1n1, which particularly affected schoolchildren, in contrast to 1978, when the h1n1 subtype mostly infected young adults. influenza of the h3n2 subtype circulated from january to march. inst. f. med. mikrobiologie, abt. virologie d. tu, biedersteiner str. 29, antibodies against influenza c virus in the population of germany, kenya and australia pfeil-putzien, c. 
and meier-ewert, h. over one hundred human sera from each of the three countries germany, kenya and australia were tested for antibodies against influenza c virus, using the conventional hemagglutination inhibition test (hi). the rate of positive sera, showing a hi titer 8 amounted to 59% for germany, 93% for kenya and 96% for australia. in the age group up to 5 years, 34% of german and 94% of kenyan sera already had antibody titers against influenza c virus. the australian sera were tested for the age group of 16-25 years and showed 95% seropositivity. the results show that influenza c virus is circulating to a greater extent in the populations of countries with subtropical climate, as compared to the more temperate middle european zones. inst. f. virologie, justus-liebig-univ., d-6300 giessen the proteolytic activation of influenza hemagglutinin, structure of the cleavage site and the mechanism of cleavage garten, w. and bosch, f. x. the hemagglutinin precursor ha is posttranslationally cleaved by proteases to the complex ha1,2. in vitro cleavage of ha by trypsin and trypsin-like proteases yields infectious virus. cleavage by thermolysin or chymotrypsin yields non-infectious virus. we have analyzed the cleavage sites of h1, h3 and h10 hemagglutinins. the amino acid sequences at the cleavage site, i.e. the c-terminus of ha1 and the n-terminus of ha2, are identical when virus is activated in vivo and in vitro. under both conditions, an arginine residue connecting ha1 and ha2 in the precursor is eliminated. the elimination of arginine results in a shift of the isoelectric point of the hemagglutinin as demonstrated by isoelectric focusing. non-activating enzymes cleave only one peptide bond in the ha2 n-terminal region of activated ha and thus do not affect the isoelectric point. we have also analyzed the cleavage site of the hemagglutinin of fowl plague virus (h7). a connecting peptide containing several basic amino acids is eliminated. 
-the data show that activation of the influenza hemagglutinin involves the action of a cellular protease with trypsin-like specificity followed by the action of an exopeptidase of the carboxypeptidase b type. the latter enzyme activity is associated with purified virus and can be analyzed by an assay employing peptides bearing 3h-arginine at the c-terminus. the carbohydrates of the glycoproteins of 21 influenza a strains containing hemagglutinin and neuraminidase of all serotypes known to date have been compared by analysis of glycopeptides labeled with radioactive sugars. analysis of incompletely glycosylated glycoproteins synthesized in the presence of glycosylation inhibitors allowed the determination of the number of oligosaccharide side chains on ha2. with all strains, the neuraminidase contains side chains of both the complex type i and the mannose-rich type ii. there are distinct quantitative and qualitative differences between the strains in the distribution of type i and type ii side chains on the hemagglutinin fragments ha1 and ha2. the majority of the hemagglutinin oligosaccharides are located on ha1. these side chains are usually of type i. only the hemagglutinins of serotype h3 have, in addition, a substantial amount of type ii side chains on ha1. most strains have on ha2 a single side chain which is usually of type i. with serotype h5 this side chain is free of fucose, and with serotype h8 it appears to be missing completely. serotypes h7 and h10 have, in addition to the type i, a type ii side chain on ha2. these observations strengthen the concept that the primary structure of the polypeptide chain is an important determinant for the carbohydrate moiety of the hemagglutinin. inst. f. virologie, justus-liebig-univ., d-6300 giessen acylation of viral glycoproteins schmidt, m. f. g. covalent binding of fatty acids to viral glycoproteins was first detected with sindbis virus and vesicular stomatitis virus (1, 2). 
studies on acylation in other enveloped viruses revealed that covalent addition of fatty acid to spike glycoproteins is a more general feature. while in sindbis and in semliki forest virus both species of spike glycoprotein (e1 and e2) carry fatty acid chains, acylation with the other viruses studied (corona-, influenza a and paramyxovirus family) seems restricted to those glycoproteins that are known to carry fusion activity. this new type of modification of viral glycoproteins occurs in a wide variety of host cells including those of human, bovine, mouse, hamster, avian and insect origin (3). -with the aid of controlled digestion of 3h-palmitic acid labelled virus particles and by the analysis of cyanogen bromide fragments of fatty acid labelled glycoprotein the fatty acid binding site in influenza hemagglutinin (ha2), vsv g-protein and sindbis virus e1 and e2 could be located to the membrane spanning portion of the respective proteins (3). heinrich-pette-institut f. exp. virologie u. immunologie an d. univ., martinistr. 52, d-2000 hamburg 20 mannweiler, k., bohn, w., rutter, g., and hohenberg, h. in replica-immunocytochemical (ric) and ultrathin section (us) preparations the ultrastructures of the specific alterations and of virus antigens which appear at the plasma membrane of hela cell coverslip cultures after infection with an adapted measles virus strain were investigated. as immunomarker protein a-coated gold particles were used (1). this method is sufficiently sensitive to enable labeling of even small altered areas of the plasma membrane (70-100 nm in diameter). due to the high atomic number contrast in the tem and the small size of the marker (≤ 10 nm) the ultrastructure of characteristic alterations remains easily visible in a three-dimensional aspect. data obtained by ric and us preparations after labeling with antimeasles immune serum or with monoclonal antibodies against ha (2) are demonstrated, compared and discussed. 
bohn, w., rutter, g., and mannweiler, k. by use of the mouse hybridoma technique, monoclonal antibodies were obtained with specificity for the ha (79k), p (72k) and m (36k) polypeptides of measles virus. balb/c mice were immunized with native measles virus and measles virus treated with detergents and heat. clones obtained after immunization of mice with native measles virus showed specificity for the ha polypeptide only. after immunization with measles virus treated with 1% sodium sarkosyl sulfate (sss) at 20°c a clone was obtained producing antibodies to the m polypeptide. heating of measles virus in the presence of 1% sds under reducing conditions elicited a selective immune response to the p and np polypeptides. thus, clones producing antibodies to the p polypeptides were isolated. contains a 50s rna, but it does not show any infectivity even after trypsin treatment. an activation of the 6/94 cl virus can be obtained by i) cocultivation of the cl-e-8 cells with standard cells (e.g. bhk-21) and ii) serial passaging of the 6/94 cl virus in several cell lines (e.g. bsc-1). infectious 6/94 cl virus can be detected in the cell supernatant after a period of 5-10 days and 25-30 days respectively. this virus cannot be propagated in chicken eggs, but it can replicate in serial cell cultures without trypsin treatment; moreover trypsin treatment does not influence the viral replication. in comparison, 6/94 virus released from an in-vitro generated persistent infection (bsc-1 cells infected with egg-grown 6/94 standard virus) can be propagated in chicken eggs. this virus, termed 6/94 pi virus, can also grow on serial cell passages without trypsin treatment, whereas the serial replication of 6/94 standard virus on cell cultures depends on trypsin activation. inst. f. virologie u. immunbiologie, versbacher str. 7, d-8700 würzburg monzel, p. and koschel, k. 
the paramyxoviruses measles (sspe) virus and canine distemper virus (cdv) cause an impairment of the catecholamine-induced β-adrenergic receptor dependent c-amp generation in persistently infected c6 rat glioma cells. in cdv persistently infected c6 cells the number of receptors is greatly reduced. hirata and axelrod have shown that the number of β-adrenergic receptors could be regulated by methylation of phosphatidyl ethanolamine (pe) resulting in lecithin synthesis (1). we have therefore studied the methylation of pe in persistently infected cells by the incorporation of (3h) methyl groups from (3h-methyl)-methionine into pe. in both infected systems, c6/sspe and c6/cdv, we observed a total loss of catecholamine-stimulated β-adrenergic receptor dependent methylation whereas the β-receptor independent methylation of phospholipids is unchanged. inst. f. med. virologie d. univ., d-6900 heidelberg; 1 robert-koch-institut, d-1000 berlin 65; 2 inst. f. virusforschung, dkfz, d-6900 heidelberg kurz, w., gelderblom, h.1, flogel, r. m.2, and darai, g. a paramyxovirus was isolated from a kidney biopsy of a tupaia (tree shrew) and termed tpv. the detailed host range study revealed that only tupaia embryonic fibroblasts and tupaia kidney cells are the cells of choice for the efficient propagation of tpv. tpv can be plaque-assayed on tupaia embryonic fibroblasts and this cell line was used for the continued propagation of tpv. electron microscopy of purified tpv revealed the presence of typical paramyxovirus particles. -the hemagglutination test was performed with erythrocytes from a variety of different species. it was found that guinea pig erythrocytes were agglutinated by tpv. the buoyant density of purified virions was determined in sucrose gradient and found to be 1.19 g x ml-1. -the biological characterization of tpv, which was performed by host range study in vivo, revealed that tpv is highly pathogenic for newborn mice and hamsters. 
-the characterization of viral rna and proteins of this new member of paramyxoviridae is now in progress. coronaviruses contain two glycoprotein species e2 (180 k) and e1 (23 k) which are both synthesized in the rer of the infected host cell. glycosylation of e2 is initiated at the cotranslational level and it can be inhibited by 2-deoxyglucose and tunicamycin, indicating the presence of n-glycosidic carbohydrate protein linkages. particles formed in the presence of these inhibitors are non-infectious and lack detectable amounts of e2. -cell fractionation experiments show that glycosylation of glycoprotein e1 occurs posttranslationally in smooth membranes. the carbohydrate protein linkages in e1 are susceptible to mild alkaline reductive conditions and n-acetylgalactosamine was determined to be the reducing sugar of the released oligosaccharides. this, together with the finding that glycosylation of e1 is not sensitive to inhibitors of n-glycosylation, suggests that glycoprotein e1 of coronaviruses is the first structural virus glycoprotein containing o-glycosidic side chains exclusively. inst. f. virologie, justus-liebig-univ., d-6300 giessen target cells of infectious bursal disease virus (ibdv) of chickens moller, h. and becht, h. infectious bursal disease virus (ibdv), the causative agent of a highly contagious disease of young chickens resulting in severe necrotic lesions in the bursa of fabricius (gumboro disease), is a non-enveloped icosahedral particle with a diameter of about 60 nm. its genome consists of 2 segments of double-stranded rna with molecular weights of 2.2 x 10^6 and 2.5 x 10^6 daltons. the virion is composed of 5 structural polypeptides with molecular weights of 90 kd, 48 kd, 40 kd, 32 kd and 28 kd. the 40 kd polypeptide, one of the two main structural proteins, is derived from the 48 kd polypeptide, perhaps by proteolytic modification (1). 
with immunofluorescence and "infectious center assays" we were able to show that 1. after infection of isolated lymphoid cells in vitro only 20% of bursa cells, 2% of thymus cells and 5% of spleen cells produce infective virus (i.e. plaques in infectious center assays) although the donor chickens were at the most susceptible age of 4 to 5 weeks. 2. the number of virus-producing cells is not correlated with the appearance of sigm or sigg. 3. virus yields seem to be influenced by the cell cycle: the number of chick embryo fibroblasts producing plaques in "infectious center assays" is increased after synchronisation of the cells before infection. 4. cells that produce infective virus show an increased uptake of 3h-thymidine. 5. isolated lymphoid cells from thymus or spleen as well as blood lymphocytes can be stimulated by mitogens to produce higher virus yields. infection of borna virus without any influence of the immunoresponse. the antigens first appeared in the nucleus of neurons and sometimes fibroblasts, then filled the cytoplasm, most brilliantly 7 days post infection. thereafter they disappeared from the cytoplasm, but remained persistently only in the nucleus in point shape. no morphological changes were seen in the infected cells during 60 days post infection. we can say that the virus does not kill the cell. for the destruction of nerve cells under in vivo conditions, the importance of the immunological events, which might cause the clinical pictures, can be pointed out. abt. molekularbiologie, physiol.-chem. inst., grindelallee 117, d-2000 hamburg 13; 1 heinrich pette-inst., martinistr. 52, d-2000 hamburg 20 large scale production of biologically active vsv in eat-cells mack, d., kruppa, j., and breindl, m.1 ehrlich ascites tumor cells maintained in mice were used to prepare milligram quantities of biologically active vsv. at the 6th day after passage when the cell number reached approx. 
7 x 10^8 cells/mouse, mice were infected by intraperitoneal injection of appropriate concentrations of vsv. ascites fluid was harvested after 20 h. virus production was exponential for at least 16 h and continued for at least 20 h p.i. -approximately 3-4 mg of viral protein/mouse and 2 x 10^11 pfu/mouse were routinely obtained. the specific infectivity of vsv isolated from eat-cells reached nearly 1.5 x 10^8 pfu/µg protein. the endogenous transcriptase activity of vsv produced in eat, bhk, and hela cells showed no significant differences. large amounts of biologically active vsv may be produced rapidly and at much lower cost with the described procedure than with tissue culture cells. it should be possible to adapt the procedure for the production of other viruses. lonza a.g., ch-4000 basel, schweiz the quantitative determination of bardac-22, formaldehyde, glyoxal and glutardialdehyde in disinfectants weinreis, p. and goller, s. several institutes of hygiene have been provided with two disinfectant formulations, a and b, with the intention of analysing these mixtures, containing different amounts of quaternary ammonium compounds, formaldehyde, glyoxal and glutardialdehyde. -the quantitative analysis of the components of the two mixtures depends on well known volumetric and spectrophotometric methods. -the results of this analysis have shown that it is possible to use the described procedure for routine check-ups of disinfectants without using microbiological tests. previous studies have shown that mouse-neurovirulent recombinants can be obtained from mixtures of avirulent influenza a-viruses provided one of the parents had previously been adapted to that host. this study shows that prior adaptation of parental strains is not necessary and that generation of neurovirulent recombinants is frequent. 
the gene constellation for neurovirulence was predictable for recombinants derived from a particular pair of parental strains.

in influenza c virus-infected cells a virus-specific protein of a molecular weight of 65,000 dalton can be detected when glycosylation is inhibited by tunicamycin. since this protein cannot be found in untreated control cells, it probably constitutes the unglycosylated precursor of the viral glycoprotein gp 88. since the peptide pattern is identical for the doublet bands, it remains to be established whether this reflects a differential glycosylation or dissimilar proteolytic cleavage sites. the coding capacity of the viral genome rna-segment no

abstracts of the 38th meeting of the dghm

1 augenklinik d. justus-liebig-univ., d-6300 giessen; 2 inst. f. neuropathologie u. virologie, freie univ., d-1000 berlin
retino-cerebral manifestation of experimental borna disease in rhesus monkeys
krey, h.1, roggendorf, w.2, and ludwig, h.3
inflammation of the uveal tract, the retina, meninges and brain represents uveo-meningoencephalitic syndromes of unknown origin. one of these, the vogt-koyanagi-harada syndrome, is primarily manifested by inflammation of the retinal pigment epithelium, the uveal tract and meninges. severe visual and neurological impairment can occur. in our experimental studies 14 rhesus monkeys were infected with borna disease virus. after a 4-7 week incubation period a progressive retino-cerebral syndrome was observed. focal inflammatory lesions in the retinal pigment epithelium and the uveal tract were accompanied by encephalitic and meningeal infiltrates. infectious virus could be demonstrated in the retina and in the brain. experimental borna disease can serve as an appropriate model for uveo-meningo-encephalitic syndromes in man.

borna disease (bo) virus induced encephalitis in horses, sheep, rodents and primates shows similarities in many aspects with other so-called slow virus diseases.
the eeg has been shown to be a specific tool in studying encephalitides of different types in man. this is a report on the eeg of the bo virus specific encephalitis. twenty-eight rabbits inoculated by different routes and with different virus preparations were screened for eeg changes. the basic frequency was measured optically and an analysis of the eeg was performed. the following conclusions were made: 1. a significant slowing down of the basic frequency was observed in the eegs of bo virus infected rabbits. 2. spikes and spike-waves were present rather regularly and correlated with epileptic seizures at the end of the disease, when they appeared rhythmically at intervals of twenty seconds. rademaker complexes appeared from the third week on. virological and serological data collected from all animals demonstrated a strong correlation of bo virus specific reactions with the eeg alterations. the patterns of eeg changes are reminiscent of those in sspe. eeg features in this kind of slow virus disease may have rather similar characteristics, which could suggest that a common pathophysiological mechanism underlies them.

we established the primary cultures of neural retina and pigment epithelium of rabbit and of the brain of chicken embryo to study the sensitivity of each kind of cells to the

key: cord-021626-ck2kybtp authors: walker-smith, john title: dietary protein intolerance date: 2013-10-21 journal: diseases of the small intestine in childhood doi: 10.1016/b978-0-407-01320-9.50011-x sha: doc_id: 21626 cord_uid: ck2kybtp

this chapter discusses dietary protein intolerance. clinical food intolerance has many causes and many manifestations, including psychological aversion to the sight, smell, or taste of food as well as psychological intolerance to one or more of the many constituents of food.
dietary protein intolerance is the clinical syndrome resulting from the sensitization of an individual to one or more proteins that have been absorbed via a permeable mucosa in the small intestine. intolerance to various food proteins, especially to cows' milk, has been recognized in children for many years. such food intolerance may be the result of a variety of causes—for example, a congenital digestive enzyme defect such as sucrase–isomaltase deficiency or an acquired lactase deficiency secondary to small-intestinal mucosal damage, which in turn can be the result of a food allergy. the incidence of gastrointestinal food allergy diseases is greatest in the first few months and years of an infant's life and decreases with age. the acute syndrome is usually characterized by the sudden onset of vomiting, after cows' milk ingestion, occasionally followed by pallor and a shock-like state; however, acute anaphylaxis is rare. acute abdominal pain seems to be a particular feature of fish hypersensitivity, while peanuts often produce immediate reactions in the oral mucosa as well as abdominal pain. clinical food 'intolerance' has many causes and many manifestations, including psychological aversion to the sight, smell or taste of food as well as psychological intolerance to one or more of the many constituents of food. intolerance to various food proteins especially the protein of cows' milk, has been recognized in children for many years. such food intolerance may be a result of a variety of causes; for example, congenital digestive enzyme defect such as sucrase-isomaltase deficiency, or acquired lactase deficiency secondary to small intestinal mucosal damage, which in turn can be the result of a food allergy. bleumink (1974) has classified adverse reactions after food ingestion as follows: • toxic effects, including those due to bacterial contamination and food additives. • intolerance phenomena due to enzyme deficiencies, e.g. 
lactose intolerance as a sequel to lactase deficiency.
• allergic reactions.
• symptoms resembling allergic reactions but not elicited by immunological phenomena. to this category belong symptoms caused by histamine releasers, e.g. strawberries, where histamine release is not the consequence of an immunological reaction.
in recent publications the term 'food idiosyncrasy' has been used in the sense of a non-immunological abnormal response to food. there is, however, increasing evidence that dietary protein intolerance may be mediated via an allergic reaction. in this chapter, the varieties of food protein intolerance in which there is some evidence for such an allergic reaction or reactions affecting the small intestine will be discussed. von pirquet of vienna introduced the term allergy in 1906. he used it to describe a deviation from the original state or normal behaviour of the individual. his contribution and its relevance to current concepts of immunity and allergy have been very clearly reviewed by turk (1987). today, when the term allergy is used, it implies a heterogeneous group of conditions which have in common a state of altered reactivity to foreign proteins (antigens) (gell and coombs, 1968). these antigens are called allergens when they produce symptoms in an allergic person. a child who has an allergy is distinguished from other children by an abnormal response on contact with an allergen or allergens, a response that does not occur when a non-allergic child is exposed to the same allergen. the typical features of an allergic reaction are: first, the lack of any untoward reaction on the child's first exposure to the allergen; and second, that subsequent exposure to the allergen produces a hypersensitivity reaction.
indeed, ferguson (1976) regards the term 'hypersensitivity' as preferable to the term 'allergy' when used to describe tissue damage resulting from the immune reaction to a further dose of antigen occurring in a previously immunized host. gell and coombs (1968) have classified the allergic or hypersensitivity reactions that may produce tissue damage of some kind into four types, as follows.
type i: this reaction is initiated by an allergen reacting with mast cells that have been passively sensitized by ige (reaginic) antibody, with release of vasoactive agents such as histamine. the reaction occurs within minutes of exposure. in the modern usage of the term atopy, this is the state where an individual is prone to develop antibodies of the ige class. the presence of such antibodies, however, does not necessarily mean that the child is intolerant, in a clinical sense, to the antigen producing the antibody.
type ii: this reaction is initiated by antibody reacting with an antigenic component of a cell or tissue element or one that is intimately associated with these. complement is usually necessary to effect cellular damage.
type iii: in this type of reaction, antigen and antibody (igg or igm) react in the presence of antigen excess with the subsequent fixation of complement and consequent local inflammatory response. this reaction is maximal a few hours after exposure to antigen.
type iv: this reaction is mediated by t-lymphocytes and macrophages and manifests as infiltration of lymphocytes and macrophages at the site where antigen is present, due to release of lymphokines. these are soluble factors secreted by lymphocytes on contact with antigen. this reaction takes 1-2 days to develop after antigen exposure.
evidence of such an allergic reaction, with tissue damage, in children who show clinical intolerance to dietary protein is not always available. therefore, in clinical practice, the descriptive term 'food protein intolerance' is often used rather than the more precise term 'food allergy'.
until more is known about the pathogenesis of food allergy, terminological difficulties will continue to arise. the first case report of food allergy (cows' milk allergy) was made by hamburger in 1901. then finkelstein in 1905 described cows' milk as a cause of acute death in an infant. schloss, in 1911, related gastrointestinal symptoms to food allergy. he made the diagnosis of egg allergy on the basis of a positive skin test with a protein fractionated from ovomucoid. gastrointestinal food allergy has since come to be recognized as an important cause of gastrointestinal symptoms in infancy. many paediatricians in the past have been sceptical about this diagnosis because of the absence of precise and objective diagnostic criteria; nevertheless, most paediatricians now accept the existence of the condition. on the other hand, there are those who undoubtedly have exaggerated its true importance. there is still debate, however, concerning its frequency and importance in different parts of the world. gastrointestinal symptoms may be the only manifestation of clinical intolerance to food protein but there may also be respiratory symptoms, skin reactions and other clinical features. only gastrointestinal effects will be discussed in any detail here. dietary protein intolerance is the clinical syndrome resulting from the sensitization of an individual to one or more dietary proteins that have been absorbed via a permeable small intestinal mucosa. clinically, it appears to be a transient phenomenon of variable duration in children. gastrointestinal food allergies may be defined as clinical syndromes which are characterized by the onset of gastrointestinal symptoms following food ingestion where the underlying mechanism is an immunologically mediated reaction within the gastrointestinal tract.
a food-sensitive enteropathy is a disorder characterized by an abnormal small intestinal mucosa while the offending food is in the diet; the abnormality is reversed by an elimination diet, only to recur on challenge with the relevant food. clinical intolerance to a variety of food proteins has been described. the most common are cows' milk, eggs and fish; but intolerance to tomatoes, oranges, bananas, meat, nuts, chocolates and cereals, including soy protein, has been described (bleumink, 1974). there is no consistent association between a particular food and specific syndromes. in fact the clinical manifestations that may occur in cows' milk protein intolerance are large in number and diverse in nature (table 5.1) (bahna and heiner, 1980; hill et al., 1986; hutchins and walker-smith, 1982). chemically, allergens are usually glycoproteins with a molecular weight of between 20 000 and 40 000. broadly, gastrointestinal reactions to food in children with gastrointestinal food allergy may be divided into those that manifest quickly, i.e. within minutes to an hour of food ingestion, and those in which the onset is slow, taking hours or days after food ingestion.
[table 5.1, part iv. secondary general effects: iron deficiency anaemia, hypoproteinaemia, thrombocytopenia, eosinophilia, angioedema]
both types of reaction may occur individually or together in different children. yet there are clear immunological differences between these groups. for example, it has been shown by fallstrom et al. (1986) that children with slow onset reactions to cows' milk feedings have significantly elevated titres of igg antibodies against both native and digested beta-lactoglobulin, when compared with both controls and those children who develop symptoms quickly after milk ingestion. these children also tended to have higher levels of antibody of the iga class to both native and processed milk.
seven out of nine children with quick onset cows' milk allergy had ige antibodies to cows' milk but these did not occur at all in the slow onset group. the incidence of gastrointestinal food allergic diseases is greatest in the first months and years of life and decreases with age. this is especially true for late onset reactions to food with food induced small intestinal mucosal damage (dannaeus and johansson, 1979). it remains uncertain whether gastrointestinal syndromes of allergic origin causing small intestinal mucosal damage exist in adult life. these gastrointestinal food syndromes of early childhood appear to be temporary in duration, although it does seem possible, as in the case of cows' milk protein intolerance, that gastrointestinal syndromes may be replaced with the passage of time by syndromes involving other systems. these syndromes are usually easy to diagnose on historical grounds. levels of food-specific ige antibodies are typically elevated and skin prick tests are also often positive. children with such problems often present to allergy clinics rather than to gastroenterology clinics. thus diagnosis is usually simple and specific diagnostic tests are available. a good example of this is egg hypersensitivity. the most dramatic deleterious response is acute, possibly life-threatening, anaphylaxis to food. the peculiar attribute some proteins possess, when injected, to diminish instead of to increase the defences of the body against their harmful action is described as anaphylaxis (the reverse of a guard or protection). anaphylaxis was first observed at the beginning of this century. charles richet and paul portier in 1901 on the yacht of prince albert of monaco discovered anaphylaxis by injecting dogs with an extract of the sea anemone la physalie without their becoming ill after the first injection. a second injection led to acute vomiting and diarrhoea with their rapid death.
only 4 years later, in 1905, schlossman documented similar symptoms of acute shock not after injection but after ingestion of a foreign protein, namely cows' milk, in infants. in the same year finkelstein described a death due to cows' milk ingestion in infancy. it is now known that anaphylaxis usually results from a generalized immediate ige mediated reaction following the introduction of sufficient antigen into a previously sensitized individual, releasing histamine and other biologically active mediators from sensitized mast cells. reactions with the clinical features of anaphylaxis have also been described without evidence of ige mediation. thus their precise causation is not clear. the term anaphylaxis in clinical practice remains a term used to describe a severe collapse-like reaction, not necessarily ige mediated. this phenomenon of an acute anaphylactic reaction to an ingested food represents the most severe example or one extreme of the clinical spectrum of gastrointestinal food allergy, but fortunately does not usually result in death. the acute syndrome is usually characterized by the sudden onset of vomiting, after cows' milk ingestion, occasionally followed by pallor and a shock-like state, but acute anaphylaxis is rare. when it occurs, acute anaphylaxis is a dramatic syndrome (figure 5.1) (de peyer and walker-smith, 1977) and can be fatal. a breast-fed infant given cows' milk feeding may react in this dramatic way. in these circumstances, acute vomiting with or without diarrhoea can be clearly related to the ingestion of cows' milk by taking a careful history. other clinical features usually accompany these gastrointestinal symptoms, such as swelling of the lips and tongue, oedema, and urticaria. all these symptoms disappear in a few hours if cows' milk is stopped. the amount of cows' milk responsible for this can be extremely small.
it has been proven that infants can be sensitized to cows' milk via its presence in maternal breast milk when the mother is herself drinking cows' milk (lake, whitington and hamilton, 1982). investigations show a high serum ige level and elevated milk specific rasts. the milk antibodies of the other immunoglobulin classes (igg, iga and igm) are present but usually at a low titre (firer, hosking and hill, 1981). skin prick tests are also positive (ford et al., 1983). in a series of 100 children with cows' milk allergy (hill et al., 1986), 27 children fell into the quick onset group. the role of cows' milk hypersensitivity in the genesis of sudden unexpected death or 'cot death' was raised by parish et al. (1960). the postulated mode of death was an acute anaphylactic reaction to cows' milk. devey et al. (1976), in cambridge, went on to report that guinea-pigs given cows' milk to drink, instead of water, soon became anaphylactically sensitized to the proteins of cows' milk. coombs, devey and anderson (1978) then found that when the drinking of cows' milk was continued by the guinea-pigs for more than 70 days they became refractory to the effect. anderson et al. (1979) then showed that there were differences in the anaphylactic sensitizing capacities of different milks in their animal model. evaporated whole cows' milk was practically without sensitizing capacity to beta lactoglobulin, and a formula in a liquid concentrate form had extremely low sensitizing capacity to both casein and beta lactoglobulin. in both cases this only occurred when given to the guinea-pigs by mouth, sensitizing capacity being retained when given parenterally. this suggests that these formulae are handled differently in the small intestine from other milk feeds. these observations have far-reaching implications if they are true for human infants, because they suggest that modification of artificial feeding formulae may profoundly influence their allergenic or sensitizing capacity.
this aspect is discussed further on page 154. rudd, manuel and walker-smith (1981) have described an acute anaphylactic reaction after feeding an infant with a wheat rusk. this infant did not have rast antibody to wheat and his ige was only marginally elevated, to 17 iu ml^-1. egg hypersensitivity was first described by schloss in 1911. vomiting within a few minutes to an hour of egg ingestion is characteristic of egg hypersensitivity. diarrhoea, abdominal pain and nausea may also occur. typically, skin and respiratory manifestations also occur and may be a more important part of the clinical presentation than gastrointestinal symptoms (ford and taylor, 1982). ovomucoid is the most important egg protein capable of producing this syndrome (bleumink and young, 1969). rast and skin prick responses to egg are usually positive and helpful diagnostically. they provide a useful guide to resolution or persistence of egg allergy (figure 5.2). acute abdominal pain seems to be a particular feature of fish hypersensitivity (niziami, lewin and baloo, 1977), whilst peanuts often produce immediate reactions of the oral mucosa (wraith et al., 1979) as well as abdominal pain. where one or a number of foods produce quick onset symptoms, skin prick and rast responses are usually positive and give helpful information. the total serum ige is usually elevated (dannaeus and johansson, 1979). one unfortunate aspect of recent times has been the appearance of a number of commercial laboratories offering diagnostic tests for allergy directly to the public. the ability of such laboratories to accurately diagnose nine fish-allergic patients and nine controls who provided specimens of blood and hair for testing was assessed by sethi et al. (1986). all five laboratories were not only unable to diagnose fish allergy but reported many allergies in apparently non-allergic subjects and also provided inconsistent results on duplicate samples from the same subject.
thus laboratory investigations appear to be most helpful when they are clinically least necessary, as history should give a clear indication of the diagnosis in most cases of quick onset syndromes. sodium cromoglycate as an oral preparation (nalcrom, fisons ltd.) has been reported to be an effective treatment for some cases of immediate food allergic disease. symptoms provoked by ingestion of one or more foods may be prevented by sodium cromoglycate if it is taken before taking the food. the literature contains many anecdotal reports of the beneficial effect of this treatment in small numbers of patients at various ages (freier and berger, 1973; watson and timmins, 1979), but larger studies are still awaited. scepticism about the value of this drug has been based upon its failure to act on mucosal mast cells in the animal model (pearce et al., 1982). however, recent evidence that it improves abnormal gastrointestinal sugar permeability in patients suggests its mode of action may be other than on the mast cell (scotto et al., 1987). in england and wales there has been a sharp decline in the number of childhood deaths reported to be due to choking on food. the number of deaths from this cause fell from 144 in 1974 to 46 in 1984. this was especially due to a fall in mortality for those under 3 months of age. roper and david (1987), who drew attention to this, relate the fall to the change in infant feeding practice, namely that early introduction of solid food should be avoided. this appears to be yet another beneficial consequence of the dhss publication present day practice in infant feeding, which recommended that solids should not be given before the age of 3 months. some individuals have gastrointestinal and other symptoms related to a wide variety of foods. such patients characteristically have a number of immediate symptoms such as vomiting, urticaria or wheezing upon exposure to multiple foods.
they often have an individual and family history of atopy, peripheral eosinophilia, elevated total serum ige and positive rast and skin tests to specific foods. diets involving the elimination of a number of foods may be impractical or ineffective on their own. however, the addition of disodium cromoglycate may be highly effective, as in the group of children described by syme (1979). the therapeutic dose is empiric at present (kocoshis and gryboski, 1979). it is usually 100 mg twice daily. curiously, if oral disodium cromoglycate alleviates symptoms, these may not relapse when the drug is discontinued. these patients need to be distinguished from cases of eosinophilic gastroenteritis. whereas quick onset syndromes often present to allergy clinics, the slow onset syndromes usually present as a gastroenterological problem to paediatric or paediatric gastroenterology clinics. such children may often have failure to thrive. in these cases there is often no clear history of food being related to the onset of symptoms. diagnosis may be difficult. accurate diagnosis centres upon the following three groups of tests:
• investigation of gastrointestinal structure, e.g. proximal small intestinal mucosal biopsy.
• investigation of gastrointestinal function, e.g. intestinal sugar permeability.
• investigation of immunological function: (a) systemic, e.g. specific antibody production. (b) gut associated, e.g. studies of local antibody producing cells.
once these initial investigations have been performed, dietary elimination and challenge continue to have an important diagnostic role. this approach is of most value when such elimination and challenge are related to gastrointestinal structure and function, i.e. serial observations. at present there are no simple laboratory tests available for diagnostic screening of children with these slow onset gastrointestinal symptoms.
in individual patients cows' milk antibody estimation is not diagnostically useful (see figure 1.1). in children such problems often overlap with gastrointestinal infection, which may coexist, thus making diagnosis difficult. unless full microbiological study of the stools is done, i.e. stool electron microscopy for viruses as well as stool bacterial culture, infection of the gastrointestinal tract can be easily overlooked. food allergy and infection often coexist. changes in the structure of the small intestinal mucosa in response to the ingestion of particular foods provide clear objective evidence for the existence of food sensitive disorders affecting the small intestinal mucosa. this approach of using serial small intestinal mucosal biopsies related to dietary elimination and challenge was first used for the diagnosis of coeliac disease in the interlaken or espgan diagnostic criteria (see chapter 5). coeliac disease is a state of permanent food sensitivity, but there also exists a group of temporary food sensitive enteropathies, presenting in infancy. indeed, ingestion of a number of foods apart from gluten has now been shown to produce food sensitive enteropathies in infancy. from studies in the experimental animal (macdonald and ferguson, 1977; ferguson, 1980) it seems likely that such enteropathies may result from a type iv or t-cell mediated reaction within the mucosa. in such animal studies a type i reaction in the small gut mucosa is associated with only minimal morphological changes (mast cell degranulation and some oedema), while mucosal type iii reactions, which are associated with polymorph infiltration, do not cause crypt hyperplasia. of course more than one type of allergic response may coexist within the mucosa at any one time and it is possible, for example, that a type i reaction may precede a type iv reaction.
abnormalities of the small intestinal mucosa have been reported in children suffering temporary intolerances to cows' milk protein, soy protein, gluten, eggs, chicken, ground rice and fish. the evidence that the enteropathy is directly related to ingestion of a particular food is based upon serial small intestinal biopsy studies related to dietary elimination and challenge, as in coeliac disease. the enteropathy is not usually as severe as that seen in coeliac disease, although a flat mucosa may occasionally be seen. these disorders usually resolve by the age of 18 months to 2 years. in some cases the children appear to develop food intolerance after an acute episode of gastroenteritis. the underlying causes of these temporary food intolerances of infancy probably relate to a transient sensitization of the child to dietary antigens, which may be a result of a breach of the mucosal barrier. the precise mechanisms which cause the enteropathy are unclear, although the application of the gell and coombs classification of hypersensitivity reactions provides a basis for investigation. for the reactions to occur the offending food antigen must enter the mucosa in appropriate amounts to cause sensitization. there are two hypotheses regarding this process: one suggests that sensitization is caused by an overstimulation of the immune system by excess antigen entry; the other proposes a minimal entry of antigen sufficient to stimulate a reaginic response, which in turn leads to increased antigen entry and so to mucosal damage. in the experimental animal it has been shown that intestinal anaphylaxis can lead to increased uptake of intestinal luminal antigens (walker et al., 1975). both hypotheses may be correct. post-enteritis food sensitive enteropathies may result from excess local food antigen entry in susceptible individuals following gut damage induced by viral or bacterial pathogens.
an hypothesis relating acute gastroenteritis and cows' milk sensitive enteropathy is illustrated in figure 5.3. it is known from the observations of gruskay and cook (1955) that excess antigen absorption (in their studies egg albumin) occurs in infants with acute gastroenteritis. this has been well documented in animal studies of viral enteritis (keljo, butter and hamilton, 1985). clinical studies have also shown increased entry of both small molecular weight sugars in acute gastroenteritis (figure 5.4) (ford et al., 1985), and a larger molecular weight protein, horseradish peroxidase, in post-enteritis food sensitive enteropathies as observed using organ culture (jackson, walker-smith and phillips, 1983). thus direct and indirect evidence exists that damage to the small intestinal mucosa may result in a local increase in antigen entry. however, most children in whom this happens are not sensitized to food. thus for this excessive antigen entry to be of pathogenic importance it must occur in susceptible individuals. the nature of such susceptibility remains to be established but clearly relates in part to impaired immunoregulation by the local defence mechanisms in the small intestinal mucosa. the local effect of excess antigen absorption may also be determined by the allergenicity of the antigen entering the mucosa, as suggested by the guinea-pig studies of coombs and colleagues in cambridge referred to earlier and supported by the clinical studies of manuel, walker-smith and france (1979). the role of cows' milk sensitive enteropathy as a cause of the post-enteritis syndrome is discussed further in chapter 6. it remains to be established whether small intestinal mucosal damage due to food ingestion occurs in adults, other than with gluten ingestion in patients with coeliac disease.

cows' milk proteins

cows' milk and human breast milk have different protein compositions, as table 5.2 illustrates.
human milk does not contain beta-lactoglobulin, which represents the major protein in cows' milk whey proteins. most observers, including visakorpi and immonen (1967) and freier and his colleagues (1969), have noticed that this protein is often the factor responsible for cows' milk allergy, although the other proteins may also be allergenic in children. cows' milk contains three times more protein than human milk (due to its higher content of casein), but has the same (table 5.2). it is perhaps unfortunate that proteins in breast milk and cows' milk such as casein are not distinguished by special names to describe human casein and cows' casein respectively. both these proteins, despite the same name, are biologically and chemically (e.g. in their amino acid composition) quite distinct. in modern adapted milks, the so-called 'humanized' milks, the total protein content is reduced to about the level of human milk and the proportion of soluble proteins to casein is corrected by the addition of whey proteins (i.e. all the soluble proteins in milk after precipitation of casein either by the action of rennin or by acidification to ph 4.6). thus these milks contain more beta-lactoglobulin. despite this there is both clinical and experimental evidence that they are less sensitizing (walker-smith, 1986). the immunoglobulins present in breast milk are chiefly of the iga class. table 5.3 indicates the difference in the immunoglobulin composition of breast and cows' milk. there are probably two syndromes, a primary disorder of immunological origin and a secondary disorder, a sequel of mucosal damage. abnormal handling of dietary antigens across the intestinal mucosa probably occurs in infants with this disorder. this may be related to a temporary immunodeficiency state such as transient iga deficiency (taylor et al., 1973), or to non-specific small intestinal mucosal damage from any cause permitting excess antigen entry as referred to above.
there is indeed clinical evidence that acute enteritis may be followed by not only lactose intolerance but by more persistent and longer lasting cows' milk sensitive enteropathy (harrison, wood and walker-smith, 1976; harrison et al., 1976; walker-smith, 1982 ) (see chapter 6). in the experimental animal increased protein antigen uptake occurs when the mucosa is damaged by parasitic infection (bloch et al., 1979) . the pathogenetic role of circulating antibodies to cows' milk remains to be established. lippard et al. (1936) showed that at whatever age a child first begins to drink cows' milk, cows' milk antigen and then cows' milk antibody can be detected in his blood. délire, cambiaso and masson (1978) have found that neonates fed on cows' milk have in their blood immune complexes containing cows' milk protein antigens, and igg antibodies of maternal origin. despite these findings, only a few children go on to develop cows' milk protein intolerance. how such a state of clinical intolerance develops is unknown. even the presence of a high level of serum anti-milk antibodies is not necessarily associated with damage, e.g. there is a high incidence of elevated titres of cows' milk antibodies in children with both coeliac disease and kwashiorkor (chandra, 1976) yet, as a rule, these children improve clinically on cows' milk diet. as stated earlier the local reaction in the small intestine may be mediated via one of the allergic reactions as classified by gell and coombs (1968) , namely type i, type iii, and type iv. evidence that these three types of reaction may occur in children with cows' milk protein intolerance include the following observations. first, in relation to type i reaction, elevated titres of ige antibodies to cows' milk protein, have been observed in some children with enteropathy but are more typical of quick reactors. however, such elevated titres can also be found in milk tolerant children. 
positive skin prick tests may be found in milk intolerant children but, again, correlation with symptoms is variable. the involvement of ige in the immunological response of the lamina propria to milk challenge in children with cows' milk protein intolerance has been described by shiner and her colleagues (1975), and also by kilby, walker-smith and wood (1976), who showed an increase in ige cell numbers in the small intestinal mucosa after a milk challenge in a child with cows' milk sensitive enteropathy. however, whether ige mediated small intestinal disease, due to sensitized mast cells releasing mediators at the site of reaction between cows' milk protein and reaginic ige antibody fixed to mast cells, really occurs has not yet been proved. if this does happen, disodium cromoglycate may help. second, in relation to type iii reaction, elevated titres of igg and igm milk antibodies have been described in cows' milk protein intolerance but again with a poor clinical correlation. increased numbers of iga cells, but not in general igm plasma cells, may occur after a positive milk challenge. matthews and soothill (1970) have observed the effect of milk feeding on complement activation, and reports of circulating immune complexes in children fed with cows' milk have already been referred to. whether such complexes are implicated in the genesis of the mucosal lesion of cows' milk protein intolerance is unclear. third, abnormalities of cell mediated immunity leading to abnormal lymphocyte transformation tests (fontaine and navarro, 1975) have been reported. studies by macdonald and ferguson (1977) and ferguson (1980) in the animal show that type iv hypersensitivity produces an appearance similar to cows' milk sensitive enteropathy. 
the importance of the type i reaction in the gut would be in allowing increased amounts of antigen to cross a damaged mucosa, and by causing capillary dilatation and increased permeability, allowing large amounts of antigen into the systemic circulation to initiate secondary immunization. if this antigen meets tissue fixed ige on mast cells then type i reaction would occur, e.g. in the skin (rash), in the gut (mucosal damage) and in bronchial mucosa (wheeze). thus the very variable clinical reactions encountered can be accounted for by differences in antigen reaching ige on mast cells in different sites in the body. involvement of systemic immunity, and the local immune system, could explain the transient nature of the illness. the illness could disappear after a period on a milk-free diet when the small intestinal mucosa local immune system was mature enough to prevent much antigen getting through. the exact role of cell-mediated immunity in this hypothesis is not certain but it is clearly important. the allergenicity or antigenicity of the cows' milk formula may be of critical importance in pathogenesis, as clearly most individuals who have acute gastroenteritis associated with increased gut permeability do not become sensitized to food. manuel et al. (1981) showed a remarkable difference in the incidence of delayed recovery between infants fed with different formulae immediately after an acute attack of gastroenteritis. old formula pregestimil (high osmolality and based on a casein hydrolysate) and al 110 (based on casein) and standard sma (a conventional adapted formula) were compared for infants aged under 6 months. it is likely that delayed recovery in these circumstances is related to cows' milk sensitive enteropathy; 26% of infants fed with the casein formula al 110 developed delayed recovery compared with only 5% with pregestimil and the very low figure of 2.5% with sma. this latter figure may, by chance, have been unusually low. 
nevertheless, when these milks are tested in an animal model (guinea-pigs) then similar results are found: al 110 is sensitizing, pregestimil not sensitizing at all and sma sensitizing significantly less often (mclaughlan and coombs, 1983; manuel et al., 1979) . thus adapted feeding formulae appear to be much less sensitizing than the older infant feeding formula still routinely used in much of the developing world. this is consistent with the decline in the severity of cows' milk sensitive enteropathy in societies where such milks are now universally used. thus the allergenicity of the milk formula fed at the time of an acute attack of gastroenteritis may be a central factor in the development of cows' milk sensitive enteropathy. boys and girls appear to be equally affected (table 5 .4). although an atopic family history is very common, no definite genetic factor has been identified. kuitunen et al. (1975) have shown an hla status identical to that of the community. however, swarbrick, stokes and soothill (1979) have shown, in animals, a genetic variation in the control of antigen absorption by the gut, which suggests certain individuals may be predisposed to develop dietary protein intolerance. in the vast majority of patients with slow onset of symptoms related to cows' milk ingestion the architecture of the proximal small intestinal mucosa is abnormal but the severity of the enteropathy is variable (kuitunen et al., 1975; fontaine and navarro, 1975; harrison et al., 1976) . however, in some early reports the mucosa was flat and indistinguishable from that seen in coeliac disease (kuitunen et al., 1975) . more recently from the same centre in finland less severe mucosal damage has been characteristic and now very few children are seen in finland with severe cows' milk sensitive enteropathy (verkasalo et al., 1981) . 
when present, the enteropathy can be shown to be cows' milk sensitive by serial biopsies related to withdrawal and challenge with cows' milk (figure 5.5). unlike the gluten sensitive enteropathy of untreated coeliac disease, this cows' milk sensitive enteropathy is of variable severity on proximal mucosal biopsy, and is patchy in distribution, while a flat mucosa, indistinguishable from the mucosal appearances found in coeliac disease, may occasionally occur (figure 5.5). typically the mucosa in untreated cows' milk sensitive enteropathy is thin (maluenda et al., 1984) (figure 5.6). the pathological changes are often patchy (manuel et al., 1979). the intra-epithelial lymphocyte count is increased although not to the level found in untreated coeliac disease (figure 5.7). there is also often a dense accumulation of fat in the epithelium which is a particular feature of this disorder and the post-enteritis syndrome (variend et al., 1984) (figure 5.8). the mucosa rapidly returns towards normal or near normal on withdrawal of milk, only to relapse following challenge with cows' milk albeit to a variable and inconsistent degree (walker-smith, 1975). however, unlike coeliac disease the mucosa remains thin throughout (maluenda et al., 1984) (see figure 5.6) and the intra-epithelial lymphocytes fall to levels below normal on a milk-free diet, rising to levels within normal limits after cows' milk challenge (see figure 5.7). cows' milk sensitive enteropathy only affects children less than 3 years of age as a general rule. one remarkable exception is the case report of watt, pincott and harries in 1983 of a child with coeliac disease whose small intestinal mucosa was responsive to cows' milk until at least the age of 7 years. after a positive milk challenge, alteration in microvilli of the enterocyte may be seen (figure 5.9) with a reduction in microvillous surface area (phillips et al., 1980) in parallel with a fall in disaccharidase activity (figure 5.10). 
figure 5 .11 shows the relationship between milk challenge and lactose tolerance. the onset of symptoms may be acute with the sudden onset of vomiting and diarrhoea; the diarrhoea persisting and becoming chronic. alternatively in some infants the onset is insidious and the presentation is very similar to coeliac disease. in fact in most cases in the early reports the onset with acute symptoms commenced before the age of 6 months. in those earlier reports the majority of infants were less than 3 months of age at the onset of symptoms (harrison et al., 1976; walker-smith, kilby and france, 1978) . in the more recent study of digeon et al. (1986) the mean age of onset was 7 months and some patients were over 1 year at the age of presentation. this change may relate to the fact that for the first 6 months of life most infants were fed modern adapted formulae, whereas in the previous reports infants were fed old-fashioned partly modified milks. the onset of symptoms usually occurred at a time when infants were having ordinary pasteurized (doorstep) milk. as referred to earlier, it seems likely that modern adapted milks are less sensitizing, and there is both animal model (coombs and mclaughlin, 1985) and clinical evidence to support this view. there is usually a latent interval between the introduction of cows' milk and the onset of symptoms. such an onset may be clinically indistinguishable from acute gastroenteritis. indeed the illness may begin with acute gastroenteritis and then after return to a cows' milk feeding the diarrhoea becomes persistent. lactose intolerance may also be present. whatever the mode of onset of symptoms most often at the time of diagnosis these infants have chronic diarrhoea and failure to thrive. cows' milk sensitive enteropathy is a most important cause of the syndrome of chronic diarrhoea and failure to thrive in infancy. 
the mode of presentation of 18 infants at time of diagnosis seen at queen elizabeth hospital for children is illustrated in table 5.4. cows' milk protein intolerance may also be associated with protein losing enteropathy and with iron deficiency anaemia due to intestinal blood loss (either occult or overt). this is related to an endoscopic colitis, which is not discussed here (gryboski, 1967; jenkins et al., 1984). usually cows' milk sensitive enteropathy and cows' milk induced colitis do not coexist in the same patient. cows' milk protein intolerance has also been described as producing severe gastritis with antral erosions accompanied in some cases by duodenitis, diagnosis being made by upper endoscopy. these infants presented between 2 and 4 months with vomiting and failure to thrive as the principal symptoms. all had hypochromic anaemia and occult faecal blood, responded favourably to cows'-milk-free diet, and relapsed on milk challenge. neither small intestinal biopsy nor colonoscopy was performed so the state of the rest of the gastrointestinal tract in these infants remains unknown (coello-ramirez and larrosa-haro, 1984). until recently, the only satisfactory way to make the diagnosis of cows' milk protein intolerance has been based purely on clinical observations of repeated withdrawal of milk and challenge with milk, associated with clinical remission and relapse as formulated by goldman and colleagues in 1963, and since spoken of as the goldman criteria (table 5.5). these criteria have obvious drawbacks. most mothers are reluctant to submit their infants to three potentially hazardous challenges, especially after one positive challenge. diagnosis of clinical relapse can be misleading, e.g. intercurrent illness may cause vomiting and diarrhoea and lead to error in interpretation. it is also now clear that children may take longer than 48 hours to relapse after a milk challenge. 
finally, positive challenges do not always have a similar onset, duration and clinical features. the use of these rigorous criteria has probably led to under-diagnosis of this syndrome. serial small intestinal biopsies taken first at the time of initial presentation, and second after the return of symptoms following a milk challenge, now permit a firm diagnosis to be made on the basis of one diagnostic milk challenge. because of its transient nature it may be difficult to fulfil all the diagnostic criteria for cows' milk sensitive enteropathy. nevertheless, accurate diagnosis is important. it is important to exclude an infective cause of the enteropathy in any infant with chronic diarrhoea. as with all dietary protein intolerances, diagnosis is based upon the response to withdrawal and subsequent reintroduction of the offending protein. there is no specific laboratory test apart from serial biopsy related to elimination and challenge. the first stage in making the diagnosis is the suspicion that the child's symptoms relate to milk ingestion. if the small intestinal mucosa is shown to be abnormal on biopsy the finding of a patchy enteropathy with a thin mucosa in a cows' milk fed infant, when the infant responds rapidly to a cows' milk elimination diet provides firm presumptive evidence for this diagnosis. such children may then be described as having a milk elimination responsive enteropathy. the transient nature of this disorder as well as the desire to avoid early milk challenge (because of the potential risk of acute anaphylaxis) (de peyer and walker-smith, 1979) leads in practice to a late milk challenge at the age of 9 months to 1 year or even later. these children may then no longer be milk intolerant at the time of milk challenge (i.e. at the age of 9-12 months). milk provocation in such circumstances merely establishes the safe return of cows' milk into the diet. 
it does not confirm the diagnosis of a cows' milk sensitive enteropathy which must remain forever unproven, at least on present-day diagnostic criteria. it is important that a challenge with cows' milk should be carried out in hospital so that it may be critically evaluated and because of the occasional risk of an acute anaphylactic reaction. the infant who is having a milk-free diet (almost the same as a lactose-free diet) is admitted to hospital. first, he has a control pre-challenge small intestinal biopsy to show that his mucosa has returned to normal or near normal. if it has not improved, challenge should be deferred and the diagnosis reconsidered, the diet being carefully checked. in these circumstances if the child is truly on a milk-free diet but continues to have gluten in his diet coeliac disease must be considered. when the mucosa has been shown on biopsy to have healed, the following day an oral lactose load of 2 g lactose/kg is given. this is followed by a lactose mixture containing 7% lactose for 24 hours. during this period the infant should be observed carefully and any loose stools tested for reducing substances. if watery diarrhoea with excess reducing substances occurs, the infant is clearly lactose intolerant and goes back to his milk-free diet, and milk challenge is deferred. if there is no diarrhoea the milk challenge takes place the next day. the amount given in the challenge will depend upon the previous severity of symptoms. if the history suggests the possibility of a previous anaphylactic reaction an intravenous infusion should be set up before the challenge and the initial amount of milk given should be small; for example 0.2 ml. 
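The pre-challenge sequence described above (control biopsy, then a lactose load, then the milk challenge) can be read as a short decision procedure. The sketch below is only an illustration of that logic in Python. The 2 g lactose/kg figure comes from the text, while the class name, field names and returned wording are hypothetical and chosen for this example.

```python
from dataclasses import dataclass

# Figure quoted in the text: oral lactose load of 2 g lactose per kg body weight.
LACTOSE_LOAD_G_PER_KG = 2.0


@dataclass
class PreChallengeFindings:
    """Hypothetical record of the findings made before a milk challenge."""
    mucosa_healed: bool              # control biopsy normal or near normal
    lactose_diarrhoea: bool = False  # watery stools with excess reducing substances
                                     # during the 24 h on the lactose mixture


def lactose_load_g(weight_kg: float) -> float:
    """Return the oral lactose load in grams for a given body weight."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    return LACTOSE_LOAD_G_PER_KG * weight_kg


def next_step(findings: PreChallengeFindings) -> str:
    """Map the pre-challenge findings onto the next step named in the text."""
    if not findings.mucosa_healed:
        # Mucosa not improved: challenge deferred, diagnosis and diet rechecked.
        return "defer challenge and reconsider diagnosis"
    if findings.lactose_diarrhoea:
        # Lactose intolerant: back to the milk-free diet, milk challenge deferred.
        return "return to milk-free diet and defer milk challenge"
    return "milk challenge the next day"
```

For an 8 kg infant, `lactose_load_g(8.0)` gives 16 g. The function deliberately stops at the decision to challenge; the amount of milk given depends on the previous severity of symptoms and is not modelled here.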
if there has been a previous anaphylactic reaction before challenge, resuscitation equipment should be ready, together with 1:10 000 adrenaline 0.1 ml/kg for injection, hydrocortisone 100 mg for intravenous use, intramuscular chlorpheniramine (piriton) 2.5 mg for intravenous use, plasma infusion, oxygen and equipment for intubation. all children who have had a previous anaphylactic reaction should be admitted to hospital. it is suggested that at least a year should elapse before reintroduction is attempted. in clinical practice an anaphylactic reaction may be defined as collapse, hypotension, impairment of consciousness or significant upper airway obstruction (swelling of structures in mouth or throat). when this has occurred previously the following method is used to determine if the child is still allergic. a drop of full-strength cows' milk is placed on the skin of the forearm. if no skin reaction occurs, cautiously administer 1 ml of a 1 in 20 dilution of milk or 15 ml teaspoon on the tongue. if a reaction such as swelling, itching, redness or pain occurs, no further milk is given. if no reaction occurs within 20 minutes, 1 ml full-strength milk may be given. in most children who have no history of previous reaction, 5 ml of milk should be given, followed after an hour by a further 10 ml if no symptoms occur. if this is tolerated the child is then regraded in quarters back onto cows' milk or his normal milk feeding. should symptoms recur, such as vomiting and diarrhoea, evidence of an intercurrent infection, e.g. hospital acquired rotavirus gastroenteritis, should be sought, i.e. stools should be sent for rapid viral diagnosis. if viral gastroenteritis is excluded, it is only a biopsy that can establish whether a relapse has occurred and should be done as soon as possible. if the previously normal mucosa is now abnormal, then this is regarded as a positive challenge, i.e. 
a cows' milk sensitive enteropathy has been shown and a firm diagnosis of cows' milk protein intolerance is made. if the mucosa is still normal, he continues on milk, and other causes for the symptoms are sought, such as an intercurrent illness, e.g. urinary tract infection or other infection. an occasional child may clinically relapse after a milk challenge despite a normal biopsy in the absence of any other explanation. this poses a difficult problem in diagnosis and may be due to patchiness of the cows' milk sensitive enteropathy. the symptoms that occur after challenge vary considerably and range from an alarming anaphylactic reaction that comes on rapidly after the child has ingested cows' milk, to the development of diarrhoea, with stools obviously blood-stained 24 or more hours after exposure to milk protein (table 5.6). vomiting is usually a striking symptom and is often the first to appear. however, there is still no unanimity concerning the diagnostic criteria for cows' milk protein intolerance; for example, there is no consensus as to how a milk challenge should be performed; some use whole milk for the challenge and others use milk protein fractions such as lactoglobulin. goldman and his colleagues in 1963 reported investigations of 89 children with cows' milk protein intolerance and found the following frequencies of reactions to challenge with various milk protein fractions: beta-lactoglobulin, 66%; casein, 57%; alpha-lactalbumin, 54%; bovine serum albumin, 51%. in addition to the above challenge procedure, related to small intestinal biopsy, various laboratory tests have been studied to help to diagnose cows' milk protein intolerance. anderson and schloss in the usa in 1923 found antibodies to cows' milk protein in the serum of infants exposed to cows' milk. since then the occurrence of circulating milk precipitins in the serum of children has been reported in a wide variety of disorders, including chronic respiratory disease and iga deficiency. 
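The interpretation of a recurrence of symptoms after milk challenge, as set out earlier in this section, follows a fixed order: exclude intercurrent viral gastroenteritis, then rely on the biopsy. A minimal Python sketch of that ordering is given below. The enum members and the function name are hypothetical labels introduced for illustration and are not terms used in the text.

```python
from enum import Enum
from typing import Optional


class ChallengeResult(Enum):
    """Hypothetical labels for the outcomes described in the text."""
    NO_RELAPSE = "no symptoms; regrade back onto milk"
    INTERCURRENT_INFECTION = "symptoms attributed to viral gastroenteritis"
    BIOPSY_REQUIRED = "biopsy needed as soon as possible"
    POSITIVE = "previously normal mucosa now abnormal; positive challenge"
    NOT_CONFIRMED = "mucosa still normal; continue milk, seek other causes"


def interpret_challenge(
    symptoms_recurred: bool,
    viral_gastroenteritis_found: bool = False,
    mucosa_abnormal: Optional[bool] = None,
) -> ChallengeResult:
    """Apply the order of reasoning in the text to one milk challenge.

    mucosa_abnormal is None until the post-challenge biopsy has been done.
    """
    if not symptoms_recurred:
        return ChallengeResult.NO_RELAPSE
    if viral_gastroenteritis_found:
        return ChallengeResult.INTERCURRENT_INFECTION
    if mucosa_abnormal is None:
        # Only a biopsy can establish whether a relapse has occurred.
        return ChallengeResult.BIOPSY_REQUIRED
    return ChallengeResult.POSITIVE if mucosa_abnormal else ChallengeResult.NOT_CONFIRMED
```

The `NOT_CONFIRMED` branch does not exclude the disorder: as the text notes, an occasional child relapses clinically despite a normal biopsy, possibly because the enteropathy is patchy.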
however, most authorities believe that the presence of circulating milk antibodies cannot be correlated with clinical intolerance to this protein. a good example of this is provided by some children with coeliac disease who still have cows' milk antibodies in their sera whilst in remission on a gluten-free diet with no symptoms related to milk ingestion. some observers, e.g. matthews and soothill (1970), have studied complement activation after milk challenge. five children with gastrointestinal symptoms due to milk protein intolerance had evidence of complement activation after milk challenge with 5 ml of milk. however, assay of complement activity after milk challenge has not been adopted as a useful laboratory confirmation of the diagnosis of cows' milk allergy. it should be noted that the above authors were not considering cows' milk sensitive enteropathy specifically but were looking at cows' milk allergy as a whole. it may not be possible to distinguish cows' milk sensitivity from lactose malabsorption at the time of initial presentation as both cows' milk sensitive enteropathy and lactose intolerance may coexist. however, when a child is on a cows' milk-free diet it is possible by cows' milk challenge to make the distinction. in a classic paper, liu, tsao and moore, in 1967, showed that challenge with cows' milk protein could induce intestinal malabsorption of lactose in children who were intolerant to cows' milk protein. it is now clear that lactase deficiency with secondary lactose intolerance may be produced by cows' milk challenge in children who have cows' milk sensitive enteropathy, particularly when this has occurred as a sequel to acute gastroenteritis. this secondary lactase deficiency is a consequence of mucosal damage produced by cows' milk protein per se (see figure 5.1) (harrison, 1974; harrison et al., 1976). 
while it may be impossible at the time of initial presentation to make the differential diagnosis, at the time of milk challenge it is practicable and important that the distinction should be made. the therapy for this disorder is to eliminate cows' milk from the child's diet and also all foods based upon cows' milk. this latter point is most important as dietetic failure may sometimes relate to neglect of restriction of foods such as ice-cream, which are based on cows' milk, despite strict adherence to avoidance of cows' milk per se. treatment involves substituting cows' milk feeds with the commercially available cows' milk protein-free formulae (francis, 1987). in practice five categories of cows' milk substitutes have been used, namely those based on:
• casein hydrolysate, pregestimil, and for infants over 6 months, nutramigen.
• lactalbumin hydrolysate, alfare.
• soya protein (cow and gate formula s, prosobee, wysoy).
• a formula based upon comminuted chicken which requires supplements with the complete range of vitamins and minerals.
• boiled goats' milk for children over 6 months plus vitamins a, d, c, b12 and folic acid tablets (these contain lactose).
as children may become sensitized both to soy and goats' milk feeds, the author prefers a hydrolysate formulation in most circumstances, with the occasional use of comminuted chicken formulae. the latter are used when the child is intolerant to one of the hydrolysates. a casein hydrolysate has most often been used but there are some children who are intolerant to it. this sometimes is due to glucose polymer intolerance (see chapter 7). in a study comparing a casein hydrolysate and a whey hydrolysate, whilst both were effective, weight gain and healing of the mucosa were better in some cases fed a whey hydrolysate. however, the latter formula was less palatable (walker-smith, digeon and phillips, 1987). 
only those formulae that are nutritionally complete (if necessary with vitamin supplementation) are to be recommended. those with a low osmolality should be chosen for young infants or infants with small intestinal disease. it is important to ensure that both liquid and solid feeds are free of cows' milk proteins. lactose intolerance may accompany the protein intolerance, and in such circumstances lactose should also be withdrawn from the diet. the necessity for dietary treatment is always temporary, and reintroduction of a normal diet is nearly always possible between the age of 1 and 2 years. this may be done in the home, but a history of previous severe reactions, such as urticaria or anaphylactoid shock, is an absolute indication for reintroduction of a normal diet under very close medical supervision usually in hospital. when a repeat biopsy is done to assess mucosal healing before challenge, clearly this will be done in hospital, which the author recommends on most occasions. usually this shows significant improvement. because of the risk of anaphylaxis milk challenge is usually delayed to the age of 9-12 months. soy bean was first proposed as a substitute for cows' milk in infants by ruhrah in 1909. however, it was not until the recommendation by hill and stuart in 1929 that a soy bean food prepared to resemble milk began to be used in many countries for infants with milk allergy. glaser and johnstone went on in 1953 to suggest that soy bean milk when used as a substitute for cows' milk could play an important role in the prevention of allergy to cows' milk in those who were at risk. soy bean is rich both in protein (40%) and fat (20%). it belongs to the leguminosae family as does the pea. it is principally produced in the usa. there were some difficulties with the first-generation soy formulae. these included carbohydrate intolerance because of indigestible carbohydrates and vitamin deficiencies. 
from the mid-1960s second-generation soy bean formulae have been used, based on a soy-protein isolate. the commercial formulae available in britain (formula s, isomil, prosobee and wysoy) are based on soy protein isolate. soy beans contain trypsin inhibitors, and in the experimental animal soy bean diets may cause growth retardation, pancreatic hypertrophy and even adenocarcinoma (mcguiness et al., 1980). none of these serious effects has been described in infants, probably because heat-labile trypsin inhibitors are inactivated in the manufacture of the formulae. however, the kunitz soy bean trypsin inhibitor has been reported to be the antigenic stimulant for an acute anaphylactic reaction in an adult woman (moroz and yang, 1980). some problems with the second-generation formulae do remain. amongst these is mineral bioavailability. this is probably due to its 1% phytate content and there is a report of its use exacerbating acrodermatitis enteropathica (glasgow and elmes, 1975). one other immunological concern about the use of soy formulae is the report from verona that healthy non-atopic infants fed soy formulae have lower immunoglobulin levels and more infections than do similar infants fed cows' milk (zoppi et al., 1979, 1983). the globulin fraction of soy beans is the major protein component. it consists of four main components (shibaski et al., 1980). the 2s globulin has the highest allergenic potency. heat treatment increases this potency. haemagglutinating titres to heat treated soy were higher when injected into rabbits than crude soy powder extract (eastham et al., 1982). this is in contrast to cows' milk formulae which became less antigenic after heat treatment (mclaughlin et al., 1981). soy-based formulae have been shown to be at least as antigenic as milk-based formulae in a study by eastham et al. (1978, 1982). 
circulating antibodies developed rapidly for up to 3 months with little further rise in normal infants fed these formulae. high levels of haemagglutinating antibody have been shown in amniotic fluid, suggesting in-utero sensitization (kuruome et al., 1976). although it has been shown that soy bean has low antigenicity in guinea-pig studies (ratner and crawford, 1955), over recent years there has been an increasing number of clinical reports of intolerance to soy protein. such reactions have varied from a dramatic anaphylactic response to the onset of respiratory symptoms and the appearance of gastrointestinal symptoms. these observations are in accord with the concept that soy protein is in fact not a weak antigen in man. the first report of soy allergy was as long ago as 1934 (duke). acute anaphylaxis has been described in infancy (david, 1984). it has been shown that soy protein can produce a small intestinal enteropathy which resolves with soy elimination and which reappears when soy protein is reintroduced into the diet, i.e. a soy-sensitive enteropathy (ament and rubin, 1972). these workers described a flat mucosal lesion indistinguishable from that found in coeliac disease. this observation has been confirmed by others, but more usually the small intestinal damage is less severe and is similar to cows' milk sensitive enteropathy (perkkio et al., 1981). as with cows' milk protein intolerance the colon may also be affected and there are reports both of enterocolitis (mcdonald et al., 1984) and colitis (halpin et al., 1977). soy formulae have been recommended in three situations:
• when the small intestinal mucosa is normal, as a prophylaxis against cows' milk allergy.
• in those who are already cows' milk protein intolerant, some of whom may have small intestinal mucosal damage.
• for the management of gastroenteritis by virtue of the formulae being lactose free. here too the small intestinal mucosa is likely to be abnormal. 
it is particularly in these last two situations when the mucosa is already damaged that the use of soy formulae may lead to intolerance by sensitizing the mucosa to this protein. there is clear evidence that the incidence of soy protein intolerance is lower when used for prophylaxis than as treatment for cows' milk allergy (0.5% versus 15-50% or even higher). the very high figure of 80% reported by wong in 1965 (wong, 1985) for children with severe cows' milk protein intolerance contrasts with the figure of 11% reported by kuitunen et al. referred to earlier. however, the former report concerned first-generation isolate. in between is the figure of 15% reported by perkkio, savilahti and kuitunen (1981) who described 16 cases of soy intolerance from 108 children with cows' milk sensitive enteropathy treated with soy formula. it is because of such data that the author recommends a protein hydrolysate rather than a soy formula for the management of children with cows' milk sensitive enteropathy. the pathogenesis of soy protein intolerance is probably very similar to that of cows' milk sensitive enteropathy. it frequently appears to occur as its sequel. butler et al. (1981) studied neutrophil chemotaxis and neutrophil random migration in infants with milk and/or soy protein intolerance. chemotaxis is the ability of certain cells to move or to turn towards other cells or substances that exert a chemical influence (positive chemotaxis) or away from them (negative chemotaxis). all infants showed a significant decrease in chemotaxis at a time when the disease was active. however, this depression did not relate to the severity of the protein intolerance as judged by symptoms or the nutritional status. random migration of neutrophils was increased in patients with active protein intolerance as compared with controls. the relationship of this observation to pathogenesis is unclear at present. kuitunen et al. 
(1975) who described the clinical findings in 54 children with cows' milk intolerance reported that 35 of these children were given soya protein as a cows' milk substitute. four of these developed soya protein intolerance. the symptoms were vomiting, diarrhoea and weight loss. three had partial villous atrophy on biopsy and the fourth had a flat mucosa. it is clear that soy sensitive enteropathy may occur as a sequel to cows' milk sensitive enteropathy. treatment is with a protein hydrolysate formula such as pregestimil. strict avoidance of soy may be difficult as so many modern prepared foods contain some soy. like cows' milk protein intolerance, the need for dietary avoidance is temporary. definition transient gluten intolerance may be defined as the syndrome seen when a child with gastrointestinal symptoms and an abnormal small intestinal mucosa who responds clinically and histologically to a gluten-free diet, subsequently thrives and continues to have a normal mucosa despite returning to a normal gluten-containing diet. dicke in holland, in 1952, described a transient wheat sensitivity in pre-school children following gastroenteritis. visakorpi and immonen in finland, in 1967 , described a state of transient gluten intolerance in 28 children associated in some cases with temporary cows' milk protein intolerance. these reports did not include serial small intestinal biopsies. in 1970, a child with transient gluten intolerance was described in australia, and this report included such biopsies (walker-smith, 1970) . this child had an abnormal small intestinal mucosa (a severe degree of partial villous atrophy) and he responded clinically to a gluten-free diet. after 1 year, while he was still on a gluten-free diet, a further biopsy revealed a normal mucosa. he was put back onto a normal gluten-containing diet, and 16 months later a further biopsy demonstrated a persistently normal mucosa. he has subsequently remained in excellent health. 
despite these case reports there has been considerable scepticism as to whether transient gluten intolerance actually exists. in particular, scepticism has been expressed as to whether those children reported to have had transient gluten intolerance had really been gluten intolerant ab initio. hence stricter criteria were laid down for diagnosis. these were, first, the need to provide evidence that gluten toxicity was in fact present and that the apparent clinical response to gluten restriction was not fortuitous and, second, the need to demonstrate the presence of a normal small intestinal mucosa 2 years or more after the return to a normal diet (i.e. the 2 years rule) as laid down by the european society for paediatric gastroenterology in interlaken in 1970 to exclude coeliac disease (see chapter 4) (meeuwisse, 1970). the precise criteria necessary to establish the existence of any form of transient intolerance to a dietary substance are indicated in diagrammatic form in figure 5.12. mcneish et al. (1976), in fact, have demonstrated such early evidence of gluten toxicity by serial xylose absorption studies at the time of an early gluten challenge in an infant with an enteropathy who had previously responded to a gluten-free diet. serial biopsies were not used to establish gluten toxicity at that time in their infant, but the mucosa has been shown to be normal 2 years after return to a normal gluten-containing diet. at the queen elizabeth hospital, a child has been observed who completely fulfils both the espgan and mcneish criteria for the diagnosis of transient gluten intolerance (see table 5.8). this child had an early gluten challenge at the age of 1 year 2 months, having previously had a flat mucosa and a clinical response to a gluten-free diet. she relapsed clinically and histologically with gluten.
subsequently, after a second challenge with gluten, she has remained clinically well and her mucosa 2 years 3 months after return to a normal gluten-containing diet has remained normal. she has now left the paediatric age group symptom free. it thus seems clear that this child had transient gluten intolerance from which she has now recovered (see table 5.7). application of these very strict criteria has therefore established evidence that transient gluten intolerance does in fact exist in this very small number of patients. in routine clinical practice these criteria are impossible to fulfil as early gluten challenge is no longer performed. the diagnosis of transient gluten intolerance is therefore usually retrospective and presumptive, i.e. a child provisionally diagnosed as having coeliac disease fails to relapse clinically and histologically after 2 years or more back on a gluten-containing diet, and so fails to fulfil the espgan criteria for the diagnosis of coeliac disease (see chapter 4). in current clinical practice the term transient gluten intolerance is reserved for the small group of children provisionally diagnosed as having coeliac disease who fail to relapse after 2 years or more back on a gluten-containing diet. this retrospective diagnosis is then based upon the following diagnostic criteria.
• initial illness associated with a severe small intestinal enteropathy.
• complete clinical remission on a gluten-free diet.
• healing of the enteropathy on a gluten-free diet.
• normal intestinal mucosa 2 years or more after return to a gluten-containing diet ('the two years rule').
dietary sequence (table fragment): gluten started 2\ months of age; gluten-free; gluten 10 g powder daily for 3 months; gluten-free 3 years 2 months; gluten 5 g in diet daily for 3 months; gluten 5-10 g in diet, daily for 2 years 2 months.
in a review of the espgan criteria, mcneish et al.
in 1979 commented upon the paucity of published evidence, at that time, to endorse the concept that all coeliac children relapse within 2 years of gluten challenge and therefore that patients not relapsing after this time had transient gluten intolerance. in fact the literature is now more extensive. several published studies of gluten challenge in children have established that relapse most often occurs within 2 years. there are, however, a few reports to suggest that it may take more than 2 years for a relapse in some exceptional cases of coeliac disease. mcnicholl et al. (1974) described two children who took more than 2 years to relapse after return to a normal gluten-containing diet. only one case has had the full clinical details published (egan-mitchell, fottrell and mcnicholl, 1978) . in that case the intra-epithelial lymphocyte count rose and disaccharidase activities fell before frank mucosal relapse. a survey by the european society of paediatric gastroenterology and nutrition suggested that others had a similar experience (shmerling, 1978) . in a study of 65 children, originally diagnosed as having coeliac disease, at the queen elizabeth hospital (see chapter 4), 15 proved on reinvestigation to have a normal mucosa 2 or more years after having gluten in their diet (walker-smith, kilby and france, 1978) . nine of these had a documented abnormal initial biopsy at the time of the original presentation on a normal gluten-containing diet. they responded clinically to a gluten-free diet, but despite this they had a final biopsy which was normal or near normal after 2 or more years on a gluten-containing diet. all have now left the paediatric age group symptom free. figure 5 .13 demonstrates the histological findings in one of these children. seven of these children had serial disaccharidase assay after start of challenge. 
on two occasions, disaccharidase activity fell despite normal morphology as reported in such children by mcnicholl, egan-mitchel and fottrell (1974) . their ages ranged from 8 weeks to 1 year 5 months at the time of the initial biopsy, i.e. all were less than 2 years of age at that time. a critical review of the early history in three children reveals evidence of a preceding episode of acute enteritis and evidence of other food intolerances, e.g. cows' milk protein intolerance, but in the remainder there was no such evidence (table 5. within this group there were some children who met all the classical criteria for the initial diagnosis of coeliac disease as they had a completely flat mucosa ( figure 5 .13) and evidence of malabsorption, and responded dramatically to a gluten-free diet alone without any evidence of other food intolerances. it is impossible now to convince the mothers of these children that a gluten-free diet at that time did not account for their clinical improvement. figure 5 .14 shows the weight progress over the years of one of these children. these cases with two others making a total of 11 were further reviewed (walker-smith, 1987) . this represents the total number of children diagnosed at queen elizabeth hospital between 1972 and 1986. they have now been followed up for a period of 8-10 years in most cases but in one for 20 years. many of these children had gluten introduced into their diet at an early age as did children with coeliac disease diagnosed at the same time. all had normal biopsies 2 years or more after a return to a normal gluten containing diet. four had further biopsies. one was abnormal and thus this child has coeliac disease. he had in fact been symptom free, the indication for biopsy being the development of serum gliadin antibodies. nusslé et al. 
(1978) have described a similar group of six children, initially diagnosed as having coeliac disease, who have had a normal small intestinal mucosa 2h-^h years after reintroduction of gluten to their diet. all had normal iel levels, and all except one, normal disaccharidases. thus, again, these children appear to have had transient gluten intolerance from which they have recovered. schmitz, jos and rey (1978) , from paris, in a very important paper have described three children who had earlier responded to a gluten-free diet but who then developed flat mucosa on a gluten-containing diet at the ages of 10 years, 62 years and 4 years 8 months respectively. however, they did not return to a gluten-free diet but continued on a gluten-containing diet. further biopsies after more than 9 years, 12 years and 5 years respectively on a normal gluten containing diet, showed surprisingly normal or near normal mucosae. the third case was having a low gluten intake but had relapsed earlier on a similar low gluten diet. thus these three children appear to have recovered spontaneously on a gluten-containing diet. a further seven children have since been described (schmitz et al., 1984) . the author would regard this as good evidence that these children had transient gluten intolerance which was longer lasting than other children described from which they had recovered, but schmitz, jos and rey have raised the possibility that the mucosal lesion of coeliac disease may disappear during adolescence only to reappear in adulthood. further prospective studies will establish whether this is so. it must be acknowledged that the ages of the cases described by schmitz et al. are clearly different to those reported by walker-smith et al. (1975) and nusslé et al. (1978) as they are much older than those cases which were all under 2 years at time of initial diagnosis. 
further evidence comes from the study of shmerling and franckx (1986) who described three patterns of response after a gluten challenge: • a group of 24 coeliac patients who fulfilled the espgan criteria; 21 relapsed within 2 years of commencement of the gluten-challenge. three took up to nearly 5 years to relapse. • six children who after a gluten challenge of 2.42-6.92 years have not relapsed. these children are being followed carefully and would appear to have had transient gluten intolerance. • a group of 11 children whose mucosa after gluten challenge deteriorated without becoming flat. all were symptom-free and have continued on gluten. they resemble the cases described by schmitz et al. likewise their future is uncertain. thus the 2-year rule appears to be valid for the majority of children with coeliac disease who have a gluten challenge although there are occasional exceptions. thus most children previously diagnosed on firm criteria as suffering from coeliac disease but who fail to relapse by 2 years after a return to a gluten-containing diet can be provisionally labelled as transient gluten intolerance. in each individual child, however, follow-up must be close in order to detect the occasional child who eventually has a late relapse and thus proves in the end to have coeliac disease, i.e. permanent gluten intolerance. from all these studies it is clear that there is thus a particular need for detailed and long-term follow-up into adult life of these children retrospectively diagnosed as transient gluten intolerance. it is therefore difficult to regard the diagnosis of transient gluten intolerance in our present state of knowledge as ever being any more than provisional and it cannot be regarded as final. it remains possible that any or all of the children referred to above ultimately may relapse after many years in adult life. similar uncertainty must remain concerning the fate of the children described by schmitz et al. and shmerling and franckx. 
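the counts quoted above for the coeliac group of shmerling and franckx (1986) can be turned into simple proportions. the following is a minimal sketch for illustration only and is not part of the original study: the function and variable names are hypothetical, and only the counts (24 coeliac patients, 21 relapsing within 2 years of challenge) are taken from the text.

```python
def proportion(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100.0 * part / whole, 1)


# Counts quoted in the text for the coeliac group (shmerling and franckx, 1986).
coeliac_total = 24
relapsed_within_2_years = 21

# The remaining patients relapsed later than 2 years (the text gives three).
late_relapsers = coeliac_total - relapsed_within_2_years

within_rule = proportion(relapsed_within_2_years, coeliac_total)
outside_rule = proportion(late_relapsers, coeliac_total)

print(f"relapsed within 2 years: {within_rule}%")
print(f"relapsed after 2 years:  {outside_rule}%")
```

on these figures roughly seven in eight coeliac children relapsed within the 2-year window, which is consistent with the statement that the 2-year rule holds for the majority but not for all.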
a schematic model of the present scene is outlined in figure 5 .15. two explanations have been proposed to explain the development of this syndrome. firstly, it has been suggested that there may be a temporary depression of dipeptidase activity occurring in the small intestinal mucosa, secondary to nonspecific mucosal damage, such as may occur as a sequel to gastroenteritis. such a suggestion at present is only speculative and is based on reports of this syndrome following clinical episodes of gastroenteritis where there has been the demonstration of an abnormal small intestinal mucosa on biopsy. secondly, it is possible that a transient 'allergy' to gluten may occur in a similar and equally unknown manner to that suggested earlier in this chapter in relation to cows' milk protein. there is little evidence available so far to support either theory. all the cases described by walker-smith 1985 had gluten introduced very early in to their diet (under 2 months). this may account for the fact that no new case has been diagnosed at queen elizabeth hospital presenting since 1974. from 1975 mothers have been encouraged in britain to introduce gluten at a later time and to use rice rather than wheat cereal as a weaning food. thus the early age of introduction of gluten into the diet of these children may be an important factor in the pathogenesis of transient gluten intolerance and could account for its importance in the early 1970s. the small intestinal mucosa is by definition abnormal, i.e. thickened ridged mucosa characterized histologically by partial villous atrophy, or sometimes a flat mucosa. the demonstration of a flat mucosa, however, should not ordinarily suggest this diagnosis, as it is more characteristic of coeliac disease. the mucosal abnormality is therefore typically less severe than that found in coeliac disease. 
in the child referred to in table 5.9 it is notable that there was an increase in the intra-epithelial lymphocyte count in the initial diagnostic biopsy, a fall in the count after a gluten-free diet, followed by a rise on mucosal relapse following a gluten challenge and, finally, a return to a normal level which has remained normal on a gluten-containing diet. thus at the time when the child was gluten sensitive, the intra-epithelial lymphocyte count appeared to be as responsive to gluten in the diet as in children with coeliac disease (i.e. permanent gluten intolerance). transient gluten intolerance should be considered as part of the differential diagnosis of the infant who develops gastrointestinal symptoms when he first encounters wheat protein, especially when he appears to be intolerant to other food proteins such as milk and egg. it should also be considered as a possibility in a child who fails to thrive following gastroenteritis, e.g. salmonellosis (walker-smith, 1970), in the presence of an abnormal small intestinal mucosa and the absence of other explanations, such as secondary lactose intolerance, particularly when such an infant has not responded to several other dietary measures but is having cereal in his diet. there appear to be two clinical syndromes associated with transient gluten intolerance: first, when gluten intolerance accompanies other forms of food intolerance; and secondly, when there is gluten intolerance alone producing a clinical picture identical with coeliac disease (walker-smith, kilby and france, 1978; nusslé et al., 1978). gliadin antibodies may be detected in the serum of such children even though their mucosa is now normal.
table 5.9 histological grading, lactase activity and intra-epithelial lymphocyte counts/100 epithelial cells (iel) in serial biopsies from a child s.s.
related to age and gluten-containing (g) or gluten-free diet (gf)
the dietary management is identical with that prescribed for children with coeliac disease but the need for such dietary restriction is, of course, by definition, a temporary one. the duration of the need for such dietary restriction will vary from child to child. clearly, at present this is a confusing field and in need of much clarification. it has been discussed here at some length, not because of the intrinsic importance of transient gluten intolerance itself, but rather because of the importance of distinguishing the condition from coeliac disease. if it were not for the existence of a permanent gluten intolerance this disorder would be looked upon in much the same way as cows' milk and soy intolerance. finally, it is most important that those children diagnosed as transient gluten intolerant, but who have completely recovered, should have long-term follow-up. these children could eventually relapse. it is the author's practice to follow up such children throughout their childhood and then refer them to adult gastroenterologists for indefinite follow-up. it has now been established by serial biopsy and dietary elimination and challenge that egg protein (lyngkaran, 1982), ground rice, chicken meat and fish (vitoria et al., 1982) may all temporarily damage the small intestinal mucosa in infancy. in this latter study all infants were also cows' milk intolerant and were under the age of 6 months at the time of diagnosis. these findings are of more theoretical than practical importance, as the author would not advocate in clinical practice serial biopsy and challenge if a child with an enteropathy who responds to milk elimination develops diarrhoea in the absence of infection when given one of these foods. rather, the offending food would be removed from the diet. what is important is that there is now firm evidence that these food sensitive enteropathies do exist in infancy.
soy protein-another cause of the flat intestinal lesion allergy to cow's milk in infants with nutritional disorders anaphylactic sensitivity of guinea-pigs drinking different preparations of cow's milk and infant formulae allergies to milk studies on the atopic allergen in hen's egg i. identification of the skin reactive fraction in egg white allergies and toxic protein in food intestinal uptake of macromolecules vi. uptake of protein antigen in vivo in normal rats and rats infested with nippostrongyloides and brasiliensis or subject to mild systemic anaphylaxis depressed neutrophil chemotaxis in infants with cow's milk and or soy protein intolerance gastrointestinal occult haemorrhage and gastroduodenitis in cow's milk protein intolerance allergenicity of food proteins and its possible modification refractoriness to anaphylactic shock after continuous feeding of cow's milk to guinea-pigs a follow-up study of infants with adverse reactions to cow's milk anaphylactic shock during elimination diets for severe atopic eczema circulatory immune complexes in infants fed on cow's milk cow's milk intolerance presenting as necrotizing enterocolitis the modified anaphylaxis hypothesis for cot death: anaphylactic sensitization in guinea-pigs fed cow's milk de subacute, chronische en recidiverende darmstoornis van de kleuter. nederlandsch tijdschrift voor geneeskunde food intolerance and gastrointestinal disease in infancy: personal practice. 
digestive disease soy bean as a possible important source of allergy antigenicity of infant formulas: role of immature intestine on protein permeability antigenicity of infant formulas and the induction of systemic immunological tolerance by oral feeding: cow's milk versus soy milk serum antibodies against native, processed and digested cow's milk proteins in children with cow's milk protein intolerance pathogenesis and mechanisms in the gastrointestinal tract effect of antigen load on development of milk antibodies in infants allergic to milk small intestinal biopsy in cow's milk protein allergy in infancy natural history of egg hypersensitivity in childhood cow's milk hypersensitivity: immediate and delayed onset clinical patterns intestinal sugar permeability: relationship to diarrhoeal disease and small bowel morphology intolerance of milk protein disodium cromoglycate in gastrointestinal protein intolerance classification of allergic reactions responsible for hypersensitivity and disease exacerbation of acrodermatitis enteropathica by soya-bean milk feeding milk allergy the gastrointestinal absorption of unaltered protein in normal infants and in infants recovering from diarrhoea colitis, persistent diarrhoea, and soy protein intolerance biologisches über die eiweisskorper der kuhmilch und über sauglingsernahrung cows' milk protein intolerance: a possible association with gastroenteritis lactose intolerance and ig a deficiency sugar malabsorption in cow's milk protein intolerance manifestations of milk allergy in infancy: clinical and immunologie findings food allergy: the gastrointestinal system egg-protein-induced villous atrophy macromolecular absorption by histologically normal and abnormal small intestine mucosa in childhood: an in vitro study using organ culture small intestinal mucosa in cow's milk allergy altered jejunal permeability to macromolecules during viral enteritis in the piglet use of cromolyn in combined gastrointestinal allergy response of 
the jejunal mucosa to cow's milk in the malabsorption syndrome with cow's milk intolerance malabsorption syndrome with cow's milk intolerance clinical findings and course in 54 cases milk sensitivity and soy bean sensitivity in the production of eczematous manifestations in breast-fed infants with particular reference to intrauterine sensitization dietary protein induced colitis in breast fed infants immune reactions induced in infants by intestinal absorption of incompletely digested cow's milk proteins bovine milk protein induced malabsorption of lactose and fat in infants food protein induced enterocolitis: altered antibody response to ingested antigen hypersensitivity reactions in the small intestine 3. the effects of allograft rejection and of graft-versus-host disease on epithelial cell kinetics the effect of long-term feeding of soya flour on rat pancreas an oral screening procedure to determine the sensitizing capacity of infant feeding formulae criteria for diagnosis of temporary gluten intolerance the diagnosis of coeliac disease: a commentary on the current practices of members of the european society for paediatric gastroenterology and nutrition quantitative analysis of small intestinal mucosa in cow's milk sensitive enteropathy a comparison of three infant feeding formulae for the prevention of delayed recovery after infantile gastroenteritis cow's milk allergy in chronic diarrhoea and malnutrition in indonesian infants complement activation after milk feeding in children with cow's milk allergy diagnostic criteria in coeliac disease annales de la nutrition et de l'alimentation kunitz soy bean trypsin inhibitor: a specific allergen in food anaphylaxis oral cromolyn therapy in patients with food allergy: a preliminary report non coeliac gluten intolerance in infancy hypersensitivity to milk and sudden death in infancy morphometric and immunohistochemical study of jejunal biopsies from children with intestinal soy allergy mucosal mast cells. 11. 
effects of antiallergic compounds on histamine secretion by isolated intestinal mast cells small intestinal lymphocyte levels in cow's milk protein intolerance soy bean: anaphylactogenic properties decline in deaths from choking on food in infancy: an association with change in feeding practice anaphylactic shock in an infant after feeding with a wheat rusk. a transient phenomenon the soy bean in infant feeding a case of allergy to common foods transient mucosal atrophy in confirmed coeliac disease intestinal permeability in allergic children before and after treatment with scg assessed of different sized polyethylene glycol (peg) how reliable are commercial allergy tests? lancet, i questionnaire of the european society for paediatric gastroenterology and nutrition on coeliac disease the absorption of antigens after immunisation and the simultaneous induction of specific tolerance investigation and treatment of multiple intestinal food allergy in childhood transient ig a deficiency and pathogenesis of infantile atopy allergy and infectious diseases: a review small intestinal mucosal fat in childhood enteropathies changing pattern of cow's milk intolerance intolerance to cow's milk and wheat gluten in the primary malabsorption syndrome in infancy enteropathy related to fish, rice and chicken allergic münchen medicin wochenschrift cow's milk protein intolerance, transient food intolerance of infancy the pathology of gastrointestinal allergy cow's milk intolerance as a cause of postenteritis diarrhoea experience with semi-elemental formula (alfare) food allergies and bowel disease milk intolerance in children reinvestigation of children previously diagnosed as coeliac disease infantile colitis responding to elimination of cow's milk from mother's diet. british paediatric association meeting evaluation of a caesin and a whey hydrolysate in the management of cow's milk sensitive enteropathy in infancy food allergy. 
response to treatment with sodium cromoglycate combined cow's milk protein and gluten induced enteropathy: common or rare rice gruel in management of infantile diarrhoea recognition of food allergic patients and their allergens by the rast technique and clinical investigation gammaglobulin level and soy protein intake in early infancy diet and antibody response to vaccinations in healthy infants key: cord-000479-u87eaaj8 authors: stolf, beatriz s.; smyrnias, ioannis; lopes, lucia r.; vendramin, alcione; goto, hiro; laurindo, francisco r. m.; shah, ajay m.; santos, celio x. c. title: protein disulfide isomerase and host-pathogen interaction date: 2011-10-11 journal: scientificworldjournal doi: 10.1100/2011/289182 sha: doc_id: 479 cord_uid: u87eaaj8 reactive oxygen species (ros) production by immunological cells is known to cause damage to pathogens. increasing evidence accumulated in the last decade has shown, however, that ros (and redox signals) functionally regulate different cellular pathways in the host-pathogen interaction. these especially affect (i) pathogen entry through protein redox switches and redox modification (i.e., intraand interdisulfide and cysteine oxidation) and (ii) phagocytic ros production via nox family nadph oxidase enzyme and the control of phagolysosome function with key implications for antigen processing. the protein disulfide isomerase (pdi) family of redox chaperones is closely involved in both processes and is also implicated in protein unfolding and trafficking across the endoplasmic reticulum (er) and towards the cytosol, a thiol-based redox locus for antigen processing. here, we summarise examples of the cellular association of host pdi with different pathogens and explore the possible roles of pathogen pdis in infection. 
a better understanding of these complex regulatory steps will provide insightful information on the redox role and coevolutional biological process, and assist the development of more specific therapeutic strategies in pathogen-mediated infections. host cells have the ability to cope with the progression and severity of infection in response to different types of pathogen. on the other hand, numerous mechanisms have evolved that support the use of the host cell machinery to facilitate pathogen survival and multiplication. such co-evolutionary processes are directly affected by different physicochemical factors within different cell compartments, both in the host and in the pathogen. for instance, ph critically affects antigen stability of the influenza virus, which modulates endosome acidity in a way that attenuates its own infection [1]. ros (and reactive nitrogen species) production and the redox state of different cell compartments are also critically involved in the cellular host-parasite interaction. among the many redox-sensitive proteins that are altered during the course of different infections, protein disulfide isomerase (pdi)-mediated redox switches have been associated with pathogen attachment-internalization, antigen processing in the er/phagosome, and the regulation of ros production by nox family enzymes. thus, pdi emerges as a ubiquitous redox protein that regulates different steps of diverse infection processes. several pathogens also have their own pdis that act as important virulence factors (table 1). other redox modifications directly mediated by ros and especially via nitric oxide (no) generated by inducible nitric oxide synthase (inos), which is abundant in phagocytic cells, have been reviewed elsewhere [2, 3] and are not considered in this article. below, the main cellular redox aspects of host and pathogen pdi will be discussed.
the ancient pdi is a ubiquitous redox chaperone belonging to the thioredoxin oxidoreductase superfamily and can reduce (reaction 1), oxidize (reaction 2), and catalyse dithiol-disulfide exchange reactions (i.e., isomerase activities, reaction 3, figure 1). this broad range of activities overlaps with the chaperone role of pdi, which overall performs a housekeeping function in helping to maintain proteins in a more stable conformation. there are around 20 pdi homologues, and the detailed structure and function of eukaryotic pdis have been covered in recent excellent reviews [4, 5]. the classic mammalian pdi (55 kda) has several domains ordered as a-b-b'-a'-c with 2 thioredoxin-like motifs (trp-cys-gly-his-cys) displayed in the a and a' domains [4-6] (figure 1). pdi is abundant in the er (∼0.5 mm), where the relatively oxidizing conditions at basal level (i.e., gsh/gssg ratios ∼2-3 : 1) favour pdi isomerase/oxidase activity, which is primarily involved in client protein redox folding (reactions 2-3, figure 1). the oxidizing equivalents for this process are driven mainly by the er thiol-containing oxidase, ero1 (endoplasmic reticulum oxidase-1), which binds fad and is in turn re-oxidized via electron transfer to oxygen, generating h2o2 in the process [7-10]. the fate of the h2o2 is elusive, but it can oxidize er-located peroxiredoxin iv (prxiv), which is then reduced by pdi, itself becoming oxidized in the process [11]. this redox circuit is thought to increase total protein folding and thiol oxidation via ero1 [11]. however, even in the absence of ero1, protein folding still occurs, and it is suggested that other oxidases may compensate for redox demand in the er in some circumstances [12, 13]. nevertheless, the pdi-ero1-dependent oxidative activity is balanced against cytosolic glutathione levels, suggesting a functional redox interplay between these compartments [12].
pdi reductase activity has been primarily associated with more reducing compartments (i.e., gsh/gssg ratios ∼30-100 : 1), such as those in the vicinity of the plasma membrane [6, 10]. pdi redox versatility is mainly governed by the low pka of the proximal cysteine on the active n-terminal a domain. indeed, the lower pka of 4.5 renders pdi a much better oxidase than thioredoxin, which has a pka of 7.1 and is mainly a reductase in most neutral ph cell compartments. it should also be noted that pdi functions as a chaperone independently of its redox-active domains, as especially required for its atpase and ca2+ activity, although pdi redox motifs still stabilize binding interaction [4, 5, 10]. in the er, pdi is tightly associated with prolyl-4 hydroxylase (the rate-limiting enzyme for collagen biosynthesis), the sec61 translocon, and the mhc class i complex (see later). it can also be found as a heterodimer with microsomal triglyceride transfer protein [6, 10]. pdi is a soluble homodimer and does not have a transmembrane domain and, similarly to other er chaperones, carries the kdel c-terminal sequence, which binds to respective receptors in the cop vesicles that circulate in the er-golgi vicinity, recycling proteins back to the er. pdi also undergoes intense intracellular trafficking and is found on the surface of diverse prokaryotic and eukaryotic cells [14-16]. despite limited knowledge about this traffic, it is possible that pdi exits the er through the translocon sec61 pore and/or via secretory vesicles [17]. pdi is thought to attach to lipids, glycans, and integral membrane proteins via electrostatic interactions at the cell plasma membrane [14, 15], where its reductive activity mediates the infection of different pathogens (figure 2, discussed later). pdi, along with erp57, has been found in the nuclei in association with dna, affecting the transcriptional activity of nf-kb, ap-1, and stat3 [15].
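the two pka values quoted above (4.5 for the proximal active-site cysteine of pdi, 7.1 for thioredoxin) can be related to the ionization state of the thiol through the henderson-hasselbalch relation. the following is a minimal sketch under stated assumptions: a ph of 7.0 is assumed here to stand for a neutral compartment (the text does not give a value), and the function name is hypothetical. it only illustrates how much of each cysteine would be present as thiolate; it does not model redox potential or enzyme kinetics.

```python
def thiolate_fraction(pka, ph):
    """Fraction of a thiol present as thiolate (deprotonated) at a given pH.

    Uses the Henderson-Hasselbalch relation:
        fraction = 1 / (1 + 10 ** (pKa - pH))
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


ASSUMED_PH = 7.0  # assumption: a neutral compartment; not a value from the text

# pKa values as quoted in the text
pdi_fraction = thiolate_fraction(4.5, ASSUMED_PH)
trx_fraction = thiolate_fraction(7.1, ASSUMED_PH)

print(f"pdi (pka 4.5):         {pdi_fraction:.3f}")
print(f"thioredoxin (pka 7.1): {trx_fraction:.3f}")
```

under this assumption the pdi cysteine is almost entirely deprotonated, whereas somewhat less than half of the thioredoxin cysteine is, which gives a rough feel for the difference the text attributes to the two pka values.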
these transcriptional regulators are key elements in many inflammatory processes, but their functional association with pdi and with different pathogens is still elusive. in contrast to many other members of its family, such as thioredoxin itself and erp57, pdi is not normally found in the cytosol, where it is likely cleaved by caspase-3 and -7 [18] . protozoans and bacteria have their own pdis ( table 1 ). the function of these pdis in protein folding is poorly understood; however, there are intriguing data correlating pdi expression and the pathogenicity of several parasites, especially obligatory intracellular protozoans. leishmania that leads to distinct types [20] . whether l. chagasi is a subgroup of l. infantum brought to america by european colonists or a species in its own right is still a matter of controversy (see discussion in [21] ). the infection cycle in the vertebrate host is initiated when the leishmania promastigote is injected into the skin by the insect vector. in the host, the promastigote is phagocytosed, especially by macrophages, and is then converted into the intracellular amastigote. the amastigote replicates inside the phagosome and is released when the cell lyses, subsequently infecting other cells and resulting in progression to disease [21, 22] . l. amazonensis has at least four pdis, and the use of specific pdi inhibitors substantially affected parasite growth [23] . in l. major, increased levels of leishmania pdi (lmpdi) expression and secretion at the parasite surface reflect optimal protein folding balanced with parasite multiplication. importantly, this is correlated with high virulence of the parasite strains [16] . more recently, the use of lmpdi antigens to generate a vaccine for l. major partially protected balb/c animals and accelerated the cure of different strains of mice [24] . 
similarly to leishmania species, other parasites of the trypanosomatid group such as trypanosoma contain several genes predicted to encode pdis that can carry out n-glycosylation and protein folding in the er [25, 26] . although pdi was considered essential for t. brucei survival, pdi activity was not essential for the growth of trypanosomes in vitro [26] . pdi is also expressed in different species of plasmodium protozoans (table 1) , the parasites that cause malaria [27, 28] . p. falciparum expresses at least nine different pdis; pfpdi-8 has great similarity to the prototype pdi and is expressed during all stages of the parasite life cycle. this pdi facilitates the disulfide-dependent conformational folding of the eba-175 protein, an emerging candidate for the development of malaria vaccines [28] . this is intriguing given that malaria parasites express proteins with a high cysteine content, which are associated with parasite invasion and sequestration in the vertebrate host and transmission to the mosquito host [28] . finally, toxoplasma gondii pdi was identified in host tears, suggesting an extracellular location and adhesion to host cells during the initial phase of infection [29] . antigen presentation occurs through two distinct pathways. antigen presenting cells (apcs; especially macrophages and dendritic cells, dcs) are long-lived cells that capture antigens and subsequently process and present them at the cell surface, where they are recognized by t lymphocytes. this process provides a long-term adaptive immune response to fungi, bacteria, and parasites. after internalization by the apc, antigens pass through phagosome/lysosome vesicles, where they form complexes with mhc class ii (figure 2 ), which are recognized by helper cd4+ t lymphocytes (exogenous pathway). in contrast, self cell antigens and viral antigens synthesized within cells (mostly non-apcs) are degraded by the proteasome in the cytosol and nucleus. 
in successive steps, the antigen is processed, folded, and incorporated into mhc class i ( figure 2 ), and the complex is exposed on the cell surface and recognized by cytotoxic cd8+ t lymphocytes (endogenous pathway). these two pathways overlap and some antigens are presented by both mhc class i and ii, in a process called cross-presentation. this has been described in dcs responding to viral infection, transplant rejection, and some autoimmune diseases and cancer. moreover, a wide range of pathogens passing through or living in the phagosome, such as mycobacterium tuberculosis, salmonella typhimurium, toxoplasma gondii, and especially leishmania spp. and trypanosoma cruzi, are all cross-presented in association with high levels of cd8+ t cells [31] . pdi, as part of the er protein folding machinery, directly regulates antigen processing of the mhc class i complex [32] [33] [34] [35] . antigens that are degraded by peptidases and the proteasome to shorter peptides in the cytosol and nucleus can be further transported to the er through the tap system, a transmembrane er member of the atp-binding cassette (abc) peptide transporter family [36] . er-located pdi interacts with the peptide-loading complex (pcl), which efficiently promotes peptide assembly with mhc class i molecules and supports the exit of the peptide-antigen complex from the er [32] [33] [34] . other pcl components include calreticulin, tapasin, erp57 (another pdi family member), and the tap transporter itself. cells lacking pdi show much less peptide loading onto mhc class i, and the disulfide bridge between the peptide and mhc groove remains in a reduced redox state [32] . normally, this interaction is affected by the redox exchange between pdi (predominantly oxidized) and erp57 (predominantly reduced) [32] , a condition in which pdi favours the release of peptide-mhc class i from the pcl and the exit of the antigen-mhc i complex from the er [34] . 
in fact, pdi-bound peptide facilitates the disassembly of the tapasin-erp57 complex, while pdi not bound to the complex is unable to interact with tapasin-erp57, retaining mhc i molecules in the er [34] . overall, pdi redox activity modulates the stability of the antigen peptide-mhc class i complex and further determines the transport of the complex to the plasma membrane [32] [33] [34] [35] . these redox effects may vary according to the type of antigen, and some pathogens interfere with this pathway to escape antigen processing and evade cd8+ t-cell recognition. this is the case for the us3 protein from human cytomegalovirus, which enhances pdi degradation via the proteasome [32] . pdi participation in the immune response, however, goes beyond its role in the er protein folding machinery, and it acts at other cellular steps of host-pathogen interaction. pdi in the er is also thought to play a role in parasite phagocytosis, and the pdi displayed on the cell surface can mediate the entry of some viral, bacterial, and protozoan pathogens. pdi is also implicated in protein unfolding and trafficking of some pathogenic antigens across the endoplasmic reticulum and towards the cytosol by the endoplasmic reticulum-associated degradation system (erad). this is the main pathway by which proteins are retrotranslocated from the er to the cytosol and then degraded by the proteasome. next, we discuss some examples of the cellular association between host pdi and different pathogens. phagocytosis is the main gate for large microbes to enter apcs. after binding and attaching to the pathogen, these cells can internalize organisms and large particles even bigger than their own size, which are phagocytosed in an active process that involves intense membrane remodelling [37] . proteomics studies accumulated over the last decade revealed the presence of er chaperones in the isolated phagosome, uncovering a process called er-mediated phagocytosis [38] [39] [40] [41] [42] [43] [44] [45] . 
er chaperones were detected in phagosomes of macrophages exposed to different particulate materials and pathogens, including latex beads opsonised or not with immunoglobulin g (igg) or mouse serum (to facilitate entry through the fcr or complement receptors), igg-opsonized erythrocytes, and promastigotes of leishmania donovani derived from wild-type cells or cell-surface lpg knockouts, among other parasites [38] . a mix of er and endocytic vesicles in the formation of the parasitophorous vacuoles (pvs) during the uptake of different leishmania spp. was recently shown in macrophages overexpressing er-tagged green fluorescent protein [41] . the presence of the er proteins sec61, bip/grp78, and pdi in the phagosome of apcs [38, 41] supports the idea that the er provides the necessary machinery for antigen translocation from the phagosome to the cytoplasm and thus possibly brings together mhc class i and class ii antigen presentation in cross-presentation [42] (figure 2 ). there are several other complementary hypotheses on how peptides cross from phagosomes to the cytoplasm during the cross-presentation process [43, 44] . neutrophils are short-lived cells (half-life of 4-8 h in the human circulation) and very active in the phagocytosis of large microbes such as bacteria, parasites, and fungi. in contrast to apcs, neutrophils contain only restricted amounts of er machinery and are thought to lack the er-mediated phagocytosis process [38] . whether er proteins functionally operate in phagocytosis-mediated infection has not yet been well characterised. an important study has shown that dictyostelium lacking both er calreticulin and calnexin presents altered phagocytic cup formation and a substantial decline in phagocytosis [45] . these two proteins utilise calcium, and their disruption per se affects actin filaments and plasma membrane remodelling during phagocytosis [45] . we recently showed that pdi is critically involved in leishmania parasite infection in vitro [22] . 
we showed that phagocytosis of promastigotes (but not amastigotes) of leishmania chagasi was significantly inhibited by macrophage incubation with the thiol/pdi inhibitors dtnb, bacitracin, and phenylarsine oxide, and with a neutralizing pdi antibody, in a parasite redox-dependent way [22] . the phenylarsine oxide response is of particular interest, since this arsenic compound may act similarly to antimonials, widely used in leishmaniasis chemotherapy [46, 47] . pdi preferentially affects parasite internalization, and the phagocytosis of the promastigote forms is increased when wild-type pdi is overexpressed in macrophages, an effect opposed by pdi knockdown. at later stages of infection (i.e., after 4 h), pdi from promastigote-infected j774 macrophages was immunoprecipitated and subsequently blotted with an anti-leishmania antibody, revealing a parasite band at ∼ 94 kda [16, figure 10 (b), lane 5]. subsequent excision and analysis of this band by mass fingerprint spectrometry showed a 58% match with elongation factor 2 (ef2) of l. major (q4q259; data not shown). the incubation of purified bovine pdi (sigma, p3818) with parasites did not yield any detectable protein complexes, suggesting that the macrophage milieu may be important to sustain the pdi-ef2 association [22] . interestingly, leishmania ef2 has important virulence features and acts as a soluble antigen in lymphocyte stimulation in vitro [48] and in vivo [49] . moreover, proteomics studies revealed that ef2 is secreted during promastigote differentiation into the amastigote stage, with potential immunomodulatory properties in animal models [50] . leishmania ef2 is therefore of particular interest for leishmania therapeutic interventions such as vaccines. although our studies did not address the role of the er in mediating phagocytosis, these data provide compelling evidence for a functional role of er-pdi in a host-parasite interaction. other mechanisms underlying pdi-mediated l. 
chagasi promastigote phagocytosis involve its association with ros production by the phagocyte nadph oxidase, and this is discussed next. the nadph oxidase (nox) family of enzymes uses nadph as an electron donor to convert oxygen to superoxide anion (o 2 •− ), a precursor of h 2 o 2 and other powerful oxidants such as hydroxyl radical and peroxynitrite (in the presence of nitric oxide), collectively called ros [3, 51, 52] . each of the seven oxidase family members is characterized by a distinct catalytic subunit (i.e., nox1-5 and duox1-2) and has differing requirements for additional protein subunits [51, 52] . the prototypic member of the nox family, nox2 oxidase (or gp91 phox oxidase), is best known for its role in neutrophil and macrophage phagocytosis. genetic defects in the enzyme are related to chronic granulomatous disease, a condition in which affected children suffer from recurrent severe fungal and bacterial infections due to defective phagocyte function [51] . each nox isoform forms heterodimers with a lower molecular weight p22 phox subunit and is predicted to be membrane-bound. nox2 is normally quiescent and is acutely activated by agonists such as pma, lpg, and cytokines in a tightly regulated process in which cytosolic subunits (p47 phox , p67 phox , p40 phox , and rac1 in the case of macrophages and dendritic cells, or rac2 in neutrophils) associate with the nox2-p22 phox heterodimer to initiate enzyme activity [51, 52] . nox2 also has electrogenic features [53] and in apcs is linked to the regulation of phagosome/lysosome ph and antigen processing [54, 55] . usually, phagosome acidity is maintained by a vacuolar atpase (v-atpase) that transports protons from the cytosol into the phagosome lumen, thereby regulating the function of lysosomal proteases in the fused phagolysosomes. savina et al. 
[56] [57] [58] have shown that nox2-derived superoxide in the phagosomal vesicle promptly consumes protons maintaining a higher ph ambient in dendritic cells during particle internalization, which favours antigen processing and presentation [56] [57] [58] . opposite results were found in macrophages [56] [57] [58] . the selective role of nox2 in different phagocytic cells remains to be defined. the jury is out on whether the results shown in macrophages (association of nox2 to rab27a; a member of rab family of gtpases) are related to vesicle traffic molecule assembly and quality, or rather associated to degradation processes [59] . nox complex protein expression and function is greatly affected by redox compounds, and it is especially regulated by pdi with implications for cell signalling [60] . the association of pdi to p22 phox and other nox isoforms in different cell types, especially in vascular cells, has been previously described [60] [61] [62] . a functional and spatial/physical interaction between pdi and the p22 phox oxidase subunit was shown in macrophages [22] and more recently between pdi and p47 phox in neutrophils [62] . in macrophages, pdi-nox association was correlated to leishmania infection in vitro [22] . it is well known that during phagocytosis of leishmania, nox2 is activated and parasite uptake is inhibited by antioxidants such as catalase [22] . intriguingly, in the course of promastigote infection, some parasites evade that stressful condition and convert themselves into intracellular amastigotes, multiplying and resulting in progression to a disease process. overall, our studies support the view that parasite phagocytosis/infection by macrophages is a redox process mediated by pdi in at least two ways. initially, pdi-nadph oxidase increases ros production generating an oxidizing milieu, which seems to favour promastigote infection. 
the downstream role of ros generated by pdi-nadph oxidase remains unknown but may be related to unfolded protein response signalling [63] or, similar to pdi-ero1, to protein folding in the macrophage er compartment, with key implications for antigen processing. nevertheless, at later stages of infection, macrophage pdi physically associates with leishmania elongation factor-2 (as discussed earlier). some viruses enclose their genetic material within a protein capsid that is further surrounded by a lipid membrane, for example, influenza virus, baculovirus, hepatitis c virus, hiv, and herpes virus. these enveloped particles require successive steps to successfully enter and infect host cells. they usually first attach to host receptors (and attachment factors), and their membranes fuse to interact with endosome vesicles that traffic the virus toward the endoplasmic reticulum, where it is uncoated. the proteins are finally transported to the cytosol and nucleus [64] (figure 2 ). there is convincing evidence showing that most viral infections are strongly influenced by changes in the redox environment and that host pdi mediates infection by enveloped viruses [65] [66] [67] [68] [69] [70] . in the course of hiv infection, the virus first binds to attachment factors, for example, mannose-binding c-type lectin receptor and intercellular adhesion molecule 3 (icam-3), on the surface of host cd4+ t cells. the glycoprotein 120 (gp120) subunit of the virus envelope binds to immunoglobulin g of cd4 + and undergoes conformational changes, allowing the virus to interact with its coreceptors, cxcr4 or ccr5. these interactions favour downstream conversion of the gp41 envelope subunit to a fusion-competent conformation. initial studies showed that membrane-impermeable pdi inhibitors and monoclonal antibodies against pdi prevent hiv-1 infection [65] . it was then revealed that domain d2 of cd4 has redox-active disulfide bonds and is regulated by thioredoxin [66] . 
using membrane-impermeable reducing agents (especially arsenical-derived compounds) and thiol-labelling reagents, it was demonstrated that reactive thiols in cd4 critically drive hiv entry into cells [66] . work from another group also revealed that pdi, on the surface of hiv-1 target cells, reduces disulfide bonds of the recombinant envelope glycoprotein gp120 (reaction 1, figure 1 ), a reaction prevented by the usual pdi inhibitors [67] . intriguingly, pdi silencing in u373 and hela cells had little impact on hiv infection itself as compared to the effect mediated by general thiol inhibitors [68] . the reasons for this discrepancy remain to be elucidated and raise the question of whether the reductive effect of pdi is coupled to other redox proteins (e.g., thioredoxin or noxs) that could amplify the virus-cd4 redox association in some cells. it is noteworthy that in these latter studies, pdi knockdown on the cell surface was not evident as compared to the massive loss of most pdis within the er, an observation that supports the idea that pdi in the er has little impact on hiv-mediated infection [68] . thiol inhibitors also affect viral fusion, such as that mediated by the fusion (f) protein from the paramyxovirus newcastle disease virus [69] . the overexpression of pdi family members pfdi and erdj5 has also been shown to significantly catalyze the reduction of thiols in f protein, facilitating membrane fusion [70] . there is evidence suggesting a possible association between pdi and infection mediated by some members of the herpesviridae family [71] . pdi is also implicated in the attachment of some bacteria from different species of chlamydia [72] [73] [74] . chlamydia is an obligatory intracellular pathogen that causes diverse diseases in humans. the most common species are chlamydia trachomatis, which is sexually transmitted and can cause blindness and infertility, and c. pneumoniae, which affects the respiratory tract. 
cho cells have impaired endogenous pdi expression due to a defect in truncated mrna processing, thus providing a valuable model to understand the effect of pdi-mediated cell-cell interaction and infection. these cells are very resistant to chlamydia infection showing impaired attachment, an effect restored by ectopic expression of pdi [73] . similar to hiv infection, the molecular mechanisms most likely include the reductive activity of pdi (reaction 1, figure 1 ) on the surface of cho cells [72] . crossing the endoplasmic reticulum (er) membrane is an irreversible process for most proteins. in some cases, however, this flow is reversed and misfolded proteins retained in the er are retrotranslocated to the cytosol via erad to be degraded by the proteasome. this pathway is also exploited by small pathogens, especially non-enveloped viruses and some bacterial toxins, to gain access to the cytosol. in these cases, antigenic particles that reach the er by different means suffer molecular redox rearrangements and binding to pdi allowing them to be transported back to the cytosol or nucleus. well-known examples are infections mediated by polyomaviruses (py) and simian virus 40 (sv40) extensively studied in the field of carcinogenesis. after sv40 interaction with the gm1 receptor on the cell surface, the particle enters the host cell through endocytosis and traffics via the caveosome (a particular caveolin containing endosome with neutral ph) towards the er compartment [75, 76] . sv40-coated pentamers are linked to each other by disulfide bonds between cysteine 104 (c104). further isomerisation in the er is crucial for the viral uncoating process. in vitro cell screening shows that among all er-resident proteins, pdi and erp57 more specifically regulate sv40 infection [75] . pdi silencing substantially decreases sv40 infection that is also dependent on some redox sensitive cysteines on the viral particle [75] . 
pdi cooperation with the erad proteins derlin-1 and sel1l is calcium dependent and facilitates sv40 traffic through erad [75] . a similar pathway is used by some nonobligatory intracellular bacteria that exert their effect through production of potent exotoxins, such as diphtheria toxin (dt) and cholera toxin (ct). these proteins function similarly to some plant toxins, such as ricin and abrin. conversion into toxic proteins involves cleavage of their interchain disulfide bond, allowing them to traffic into the endocytic pathway within the host cell [77, 78] . in humans, ct is derived from the bacterium vibrio cholerae, which causes cholera, and has 2 subunits (a1 and a2). the protein first attaches to the host cell surface via gm1 and, through the subunit a2, which contains a kdel sequence, is transported back to the er (see earlier discussion). there, pdi reduces and unfolds a2 and a1, which exit the er via the sec61 channel into the cytosol. pdi in the reduced state (reaction 1, figure 1 ) binds to the toxin, and subsequent oxidation of pdi, probably via ero1α, enables the release of ct [79, 80] . the active polypeptide a1 efficiently modifies a heterotrimeric g protein in the cytosol, which leads to massive chloride loss and water secretion by intestinal epithelial cells in mammals, resulting in severe diarrhoea. in this article we have reviewed the main cellular aspects of pdi-mediated host-pathogen interactions and the pathways that are involved in viral, bacterial (including bacterial toxins), and parasitic infections. a number of cellular mechanisms through which pdi modulates some specific cellular pathways in immune cells have been described, such as redox-sensitive attachment, antigen presentation in the er and exit from it, and association with the phagosome and ros production by nadph oxidase (figure 2 ). 
many of these responses are antigen-specific and the precise mechanisms of action remain to be fully elucidated, especially in the context of redox changes in cross-presentation phenomena. moreover, little is known about the role of pdi in infection per se, or about how pdi signals to a more integrated cellular response to stress [63] . pdi global knockout mice are only viable until birth, but partially gene-modified mice and modified pathogens will help to reveal the redox role of pdi and its redox partners. overall, pdi is a key regulator that may propagate or limit the severity of infection, depending on the infectious organism involved. a better understanding of these complex regulatory steps will provide insight into the redox role of pdi and the coevolutionary biological process, and assist the development of more specific therapeutic strategies in pathogen-mediated infections. 
abbreviations: er: endoplasmic reticulum; mhc: major histocompatibility complex; h 2 o 2 : hydrogen peroxide; erp57: a member of the pdi family also known as 58-kd glucose-regulated protein (grp58); nf-kb: nuclear factor kappa b; ap-1: activator protein 1; stat-3: signal transducer and activator of transcription 3. 
cell entry by enveloped viruses: redox considerations for hiv and sars-coronavirus role of nitric oxide synthesis in macrophage antimicrobial activity nitrogen dioxide and carbonate radical anion: two emerging radicals in biology the human pdi family: versatility packed into a single fold protein disulfide isomerase: a critical evaluation of its function in disulfide bond formation protein disulfide isomerase: the multifunctional redox chaperone of the endoplasmic reticulum generating disulfides enzymatically: reaction products and electron acceptors of the endoplasmic reticulum thiol oxidase ero1p the fad- and o2-dependent reaction cycle of ero1-mediated oxidative protein folding in the endoplasmic reticulum the contributions of protein disulfide isomerase and its homologues to oxidative protein folding in the yeast endoplasmic reticulum protein disulfide isomerase recycling of peroxiredoxin iv provides a novel pathway for disulphide formation in the endoplasmic reticulum oxidative protein folding by an endoplasmic reticulum-localized peroxiredoxin oxidative folding in the endoplasmic reticulum: towards a multiple oxidant hypothesis? 
secretion, surface localization, turnover, and steady state expression of protein disulfide isomerase in rat hepatocytes proteins of the pdi family: unpredicted non-er locations and functions identification of a disulfide isomerase protein of leishmania major as a putative virulence factor glutathione limits ero1-dependent oxidation in the endoplasmic reticulum protein disulfide isomerase is cleaved by caspase-3 and -7 during apoptosis tegumentary leishmaniasis as a manifestation of immune reconstitution inflammatory syndrome in 2 patients with aids the molecular basis of leishmania pathogenesis further thoughts on the use of the name leishmania (leishmania) infantum chagasi for the aetiological agent of american visceral leishmaniasis protein disulfide isomerase (pdi) associates with nadph oxidase and is required for phagocytosis of leishmania chagasi promastigotes by macrophages identification and enzymatic activities of four protein disulfide isomerase (pdi) isoforms of leishmania amazonensis comparative evaluation of two vaccine candidates against experimental leishmaniasis due to leishmania major infection in four inbred mouse strains a developmentally regulated gene of trypanosomes encodes a homologue of rat protein-disulfide isomerase and phosphoinositol-phospholipase c characterization of two protein disulfide isomerases from the endocytic pathway of bloodstream forms of trypanosoma brucei cloning of plasmodium falciparum protein disulfide isomerase homologue by affinity purification using the antiplasmodial inhibitor 1,4-bis{3-[n-(cyclohexyl methyl)amino]propyl}piperazine protein disulfide isomerase assisted protein folding in malaria parasites protein disulfide isomerase of toxoplasma gondii is targeted by mucosal iga antibodies in humans oxidative folding and reductive activities of ehpdi, a protein disulfide isomerase from entamoeba histolytica leishmania antigens are presented to cd8 + t cells by a transporter associated with antigen 
processing-independent pathway in vitro and in vivo redox regulation facilitates optimal peptide selection by mhc class i during antigen processing molecular mechanisms of mhc class i-antigen processing: redox considerations redox-regulated export of the major histocompatibility complex class i-peptide complexes from the endoplasmic reticulum a role for protein disulfide isomerase in the early folding and assembly of mhc class i molecules peptide trafficking and translocation across membranes in cellular signaling and selfdefense strategies professional and non-professional phagocytes: an introduction the phagosome proteome: insight into phagosome functions er-mediated phagocytosis: a new membrane for new functions endoplasmic reticulum-mediated phagocytosis is a mechanism of entry into macrophages leishmania parasitophorous vacuoles interact continuously with the host cell's endoplasmic reticulum phagosomes are competent organelles for antigen cross-presentation the known unknowns of antigen processing and presentation quantitative and dynamic assessment of the contribution of the er to phagosome formation calreticulin and calnexin in the endoplasmic reticulum are important for phagocytosis molecular mechanisms of antimony resistance in leishmania dual action of antimonial drugs on thiol redox metabolism in the human pathogen leishmania donovani proteomic approach for identification and characterization of novel immunostimulatory proteins from soluble antigens of leishmania donovani promastigotes identification and characterization of t cell-stimulating antigens from leishmania by cd4 t cell expression cloning approaches for the identification of potential excreted/secreted proteins of leishmania major parasites nadph oxidases in lung biology and pathology: host defense enzymes, and more nox enzymes and the biology of reactive oxygen the superoxide-generating nadph oxidase of human neutrophils is electrogenic and associated with an h + channel the nadph oxidase of 
professional phagocytes-prototype of the nox electron transport chain systems phagocytosis and antigen presentation in dendritic cells nox2 controls phagosomal ph to regulate antigen processing during crosspresentation by dendritic cells rab27a regulates phagosomal ph and nadph oxidase recruitment to dendritic cell phagosomes nadph oxidase controls phagosomal ph and antigen crosspresentation in human dendritic cells activation of antibacterial autophagy by nadph oxidases novel role of protein disulfide isomerase in the regulation of nadph oxidase activity: pathophysiological implications in vascular diseases regulation of nad(p)h oxidase by associated protein disulfide isomerase in vascular smooth muscle cells protein disulfide isomerase redox-dependent association with p47phox: evidence for an organizer role in leukocyte nadph oxidase activation mechanisms and implications of reactive oxygen species generation during the unfolded protein response: roles of endoplasmic reticulum oxidoreductases, mitochondrial electron transport, and nadph oxidase how viruses enter animal cells inhibition of human immunodeficiency virus infection by agents that interfere with thiol-disulfide interchange upon virus-receptor interaction disulfide exchange in domain 2 of cd4 is required for entry of hiv-i inhibitors of protein-disulfide isomerase prevent cleavage of disulfide bonds in receptor-bound glycoprotein 120 and prevent hiv-1 entry role of protein disulfide isomerase and other thiol-reactive proteins in hiv-1 envelope protein-mediated fusion thiol/disulfide exchange is required for membrane fusion directed by the newcastle disease virus fusion protein overexpression of thiol/disulfide isomerases enhances membrane fusion directed by the newcastle disease virus fusion protein the human cytomegalovirus ul37 immediate-early regulatory protein is an integral membrane n-glycoprotein which traffics through the endoplasmic reticulum and golgi apparatus attachment and entry of chlamydia 
have distinct requirements for host protein disulfide isomerase chlamydia attachment to mammalian cells requires protein disulfide isomerase protein disulfide isomerase, a component of the estrogen receptor complex, is associated with chlamydia trachomatis serovar e attached to human endometrial epithelial cells simian virus 40 depends on er protein folding and quality control factors for entry into host cells downregulation of protein disulfide isomerase inhibits infection by the mouse polyomavirus cleavage of disulfide bonds in endocytosed macromolecules. a processing not associated with lysosomes or endosomes cell surface sulfhydryls are required for the cytotoxicity of diphtheria toxin but not of ricin in chinese hamster ovary cells unfolded cholera toxin is transferred to the er membrane and released from protein disulfide isomerase upon oxidation by ero1 the ero1α-pdi redox cycle regulates retro-translocation of cholera toxin protein disulfide isomerase and host-pathogen interaction 
the authors thank daniel c. pimenta for mass spectrometry experiments (centre for applied toxinology cat/cepid, butantan institute). they also thank dr. joão wosniak jr. from the vascular biology laboratory, heart institute (incor), for helpful discussion. this paper was supported by auxílio fundação de amparo à pesquisa do estado de são paulo (fapesp) and conselho nacional de pesquisa (cnpq), instituto do milênio redoxoma; fundação ej zerbini (incor). a. m. shah, s. ioannis and c. x. c. santos are also supported by the british heart foundation and a fondation leducq transatlantic network of excellence award. 
key: cord-001567-3bw7jbzq authors: borlak, jürgen; singh, prashant; gazzana, giuseppe title: proteome mapping of epidermal growth factor induced hepatocellular carcinomas identifies novel cell metabolism targets and mitogen activated protein kinase signalling events date: 2015-02-25 journal: bmc genomics doi: 10.1186/s12864-015-1312-z sha: doc_id: 1567 cord_uid: 3bw7jbzq background: hepatocellular carcinoma (hcc) is on the rise and is the sixth most common cancer worldwide. to combat hcc effectively, research is directed towards its early detection and the development of targeted therapies. given that epidermal growth factor (egf) is an important mitogen for hepatocytes, we searched for disease regulated proteins to improve the understanding of the molecular pathogenesis of egf induced hcc. disease regulated proteins were studied by 2de maldi-tof/tof and a transcriptomic approach, by immunohistochemistry and advanced bioinformatics. results: mapping of egf induced liver cancer in a transgenic mouse model identified n = 96 (p < 0.05) significantly regulated proteins of which n = 54 were tumour-specific. to unravel molecular circuits linked to aberrant egfr signalling, diverse computational approaches were employed and this defined n = 7 key nodes using n = 82 disease regulated proteins for network construction. string analysis revealed protein-protein interactions for > 70% of the disease regulated proteins, with individual proteins being validated by immunohistochemistry. the disease regulated network proteins were mapped to distinct pathways, and bioinformatics provided novel insight into molecular circuits associated with significant changes in glycolysis and gluconeogenesis, arginine and proline metabolism, protein processing in endoplasmic reticulum, hif and mapk signalling, lipoprotein metabolism, platelet activation and hemostatic control as a result of aberrant egf signalling. 
the biological significance of the findings was corroborated with gene expression data derived from tumour tissues to eventually define a rationale by which tumours embark on intriguing changes in metabolism, which is of utility for an understanding of tumour growth. moreover, among the egf tumour specific proteins n = 11 were likewise uniquely expressed in human hcc and for n = 49 proteins regulation in human hcc was confirmed using the publicly available human protein atlas depository, therefore demonstrating clinical significance. conclusion: novel insight into the molecular pathogenesis of egf induced liver cancer was obtained and among the 37 newly identified proteins several are likely candidates for the development of molecularly targeted therapies and include the nucleoside diphosphate kinase a, bifunctional atp-dependent dihydroxyacetone kinase and phosphatidylethanolamine-binding protein 1, the latter being an inhibitor of the raf-1 kinase. electronic supplementary material: the online version of this article (doi:10.1186/s12864-015-1312-z) contains supplementary material, which is available to authorized users. liver malignancies are common cancers worldwide and are responsible for approximately one million deaths each year, with most hcc patients having poor prognosis as a result of rapid disease progression. the relative 5-year survival rate is about 15%, which can be attributed to an advanced stage of disease at the time of diagnosis, the occurrence of cirrhosis and other co-morbidities [1]. detection of early stages of disease is essential for an improved prognosis and overall survival. however, apart from alpha-fetoprotein (afp) only a few serological markers are available in clinical practice (such as glypican-3, mir-21, fucosylated gp73, α-fucosidase), with afp diagnostics remaining unsatisfactory because of its low sensitivity and the non-specific correlation between the clinical behavior of hcc and afp blood levels. 
for this reason, new biomarkers are in strong demand [2, 3] and more selective markers, such as soluble interleukin-2 receptor levels, are being evaluated [4]. importantly, research into the molecular pathogenesis of hcc identified several signalling pathways as deregulated. this inspired the development of molecularly targeted therapies such as the multikinase inhibitor sorafenib, which effectively inhibits signalling of c-raf-1, mek, erk, vegfr, pdgfr and other kinases [5]. given that epidermal growth factor is an important mitogen for hepatocytes, we were particularly interested in understanding the consequences of its targeted overexpression in liver. in our initial study we reported the oncogenomics and pathology of egf induced hepatocarcinogenesis and identified molecular circuitries linked to exaggerated egfr signalling [6]. furthermore, we investigated the serum proteome of egf tumour-bearing mice to obtain information on disease-regulated proteins and to search for novel biomarkers at different stages of disease [7]. note, a regulatory loop was proposed whereby egf induces transcriptional activation of hdac2 by ck2α/akt activation in liver cancer cells [8]. additionally, inhibition of egfr by erlotinib in different animal models was shown to attenuate liver fibrosis and the development of hepatocellular carcinoma [9], thus suggesting new therapeutic intervention strategies in the prevention of hcc. indeed, the egf receptor tyrosine kinase plays a much wider role in the immortalization of different cell types than originally anticipated [10] and is highly expressed in a number of solid tumours. egfr over-expression correlates well with tumour progression, resistance to chemotherapy and poor prognosis. 
the present study aimed to identify disease regulated proteins in order to improve the understanding of the complex signalling networks in egf induced hcc, to search for cross-talk with other pathways and to aid the development of molecularly targeted therapies. for this purpose, a two-dimensional electrophoresis and maldi-tof/tof ms strategy was employed to identify disease regulated proteins in an egf transgenic mouse model of hcc. this resulted in an identification of 96 statistically significant regulated proteins of which 54 are uniquely expressed in liver cancer. importantly, 11 out of 54 mouse tumour specific proteins were likewise uniquely expressed in human hcc and 49 disease regulated proteins identified in egf induced liver cancer were similarly regulated in human hcc, as determined by immunohistochemistry using different antibodies and the information given in the publicly available human protein atlas depository. clinical significance of the identified proteins could be demonstrated and a total of 37 so far unknown proteins could now be related to egf induced liver cancer, several of which are likely candidates for the development of molecularly targeted therapies. these include the nucleoside diphosphate kinase a, bifunctional atp-dependent dihydroxyacetone kinase, phosphatidylethanolamine binding protein 1 (an inhibitor of the raf-1 kinase) as well as aldo-keto reductase family 1 proteins, members c14 and c6, interleukin 25 and the v-crk sarcoma virus ct10 oncogene homolog. finally, to gain insight into the molecular circuitries of egfr induced hepatocarcinogenesis diverse computational approaches were employed. this revealed master regulatory proteins and permitted network constructions of 82 disease regulated proteins with protein-protein interactions being confirmed for > 70% of regulated proteins in string analysis. 
their regulation was also studied by immunohistochemistry in egf transgenic hcc. we also compared the serum and liver proteomes of hcc bearing mice and found 10 proteins to be similarly regulated, thus evidencing leakage of tumour proteins that can be detected in serum. these are highly interesting biomarker candidates, 6 of which were also regulated in human hcc as determined by immunohistochemistry. all animal work strictly followed the public health service (phs) policy on humane care and use of laboratory animals of the national institutes of health, usa. formal approval to carry out animal studies was granted by the animal welfare ethics committee of the state of lower saxony, germany ('lower saxony state office for consumer protection and food safety' (laves)). the approval id is az: 33.9-42502-04-06/1204. a total of n = 12 c57/bl6 non-transgenic and n = 12 egf transgenic mice (aged 6-8 months), weighing 25-33 g, were housed in makrolon® type iii cages. water and food (v1124-000, ssniff, the netherlands) were given ad libitum. the temperature and relative humidity were set to 22 ± 2°c and 40-70%, respectively, and a 12-h day and night cycle was used. tris, urea, thiourea, chaps, dithiothreitol, bromophenol blue, glycerin, sodium dodecyl sulfate, glycine, temed, ammonium peroxodisulfate, ammonium sulfate, ammonium bicarbonate, colloidal coomassie blue, and acrylamide were purchased from roth (karlsruhe, germany). iodoacetamide was obtained from serva (heidelberg, germany) and benzonase was purchased from novagen (darmstadt, germany). ampholytes (biolyte 3-10) were purchased from bio-rad laboratories (münchen, germany) and destreak was obtained from amersham bioscience (freiburg, germany). mice were anesthetized with ketamine 10% 100 μl/100 g and xylazine 2% 50 μl/100 g, and after surgical removal the liver was perfused and rinsed with ice cold ringer solution until free of blood. 
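the anesthetic doses above are stated per 100 g body weight, so the injected volume scales with the weight of each animal. the following python sketch shows the arithmetic for the reported weight range of 25-33 g. it assumes that the stated percentages are weight/volume concentrations (1% = 10 mg/ml), which the text does not say explicitly; the function names are illustrative only.

```python
def injection_volume_ul(body_weight_g: float, ul_per_100_g: float) -> float:
    """Injection volume in microlitres for a dose stated in ul per 100 g."""
    return body_weight_g * ul_per_100_g / 100.0


def dose_mg_per_kg(percent_w_v: float, ul_per_100_g: float) -> float:
    """Convert a % solution dosed in ul/100 g into mg/kg.

    Assumption: the percentage is weight/volume, so 1% = 10 mg/ml = 0.01 mg/ul.
    A dose per 100 g is multiplied by 10 to obtain the dose per kg.
    """
    mg_per_ul = percent_w_v * 0.01
    return mg_per_ul * ul_per_100_g * 10.0


if __name__ == "__main__":
    for weight_g in (25.0, 33.0):  # weight range reported for the mice
        ket = injection_volume_ul(weight_g, 100.0)  # ketamine 10%, 100 ul/100 g
        xyl = injection_volume_ul(weight_g, 50.0)   # xylazine 2%, 50 ul/100 g
        print(f"{weight_g:.0f} g mouse: ketamine {ket:.1f} ul, xylazine {xyl:.1f} ul")
    print(f"ketamine: {dose_mg_per_kg(10.0, 100.0):.0f} mg/kg")
    print(f"xylazine: {dose_mg_per_kg(2.0, 50.0):.0f} mg/kg")
```

under the weight/volume assumption, the stated volumes correspond to about 100 mg/kg ketamine and 10 mg/kg xylazine.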
approximately 0.1 g of the liver sample was ground in a mortar under liquid nitrogen flow. the samples were first processed with 0.5 ml of a buffer containing 40 mm tris base, 7 m urea, 4% chaps, 100 mm dtt, and 0.5% (v/v) biolyte 3-10 (lb2). the suspensions were homogenized by sonication (3 × 20 s) and, after addition of 3 μl of benzonase (an endonuclease that degrades dna and rna), were left at room temperature for 20 min. the samples were then centrifuged at 12,000 g for 20 min. the pellets were washed and sonicated for 5 min with a further 0.5 ml of lb2 and centrifuged at 12,000 g for another 20 min, and the resulting two fractions of supernatant were collected (extract a). finally, the pellets were redissolved with 0.5 ml of buffer containing 40 mm tris base, 5 m urea, 2 m thiourea, 4% chaps, 100 mm dtt, and 0.5% (v/v) biolyte 3-10 (lb3), sonicated, and centrifuged at 12,000 g for 20 min. the pellet was collected, and the supernatant was marked as extract b. from the same animals, a further 0.1-g portion was ground in a mortar, but was now treated with 0.5 ml of lb3. the suspensions were sonicated, incubated with benzonase, and centrifuged. the pellets were then washed with another 0.5 ml of lb3, sonicated and centrifuged, and the supernatants were collected (extract c). proteome mapping was done under a variety of conditions, e.g. extraction with lysis buffers 2 and 3. in addition, proteins were separated at two different ph ranges (5-8 and 7-10). a total of 4 independent experiments were carried out, and duplicate measurements were run for each experiment. the protein concentration of all extracts was determined using the bradford assay. liquid-phase ief pre-fractionation was performed in the rotofor cell system (bio-rad) following the supplier's instructions. ion exchange membranes were equilibrated overnight in the appropriate electrolyte (anion exchange membranes in naoh 0.1 m and cation exchange membranes in h3po4 0.1 m). 
after four runs the ion exchange membranes were always discarded and replaced with new membranes for the remaining samples. for each run, the electrode chambers were filled with appropriate fresh electrolytes (30 ml). initially, the cell was filled with pure water and run for 5 min at 5 watts constant power to remove residual ionic contaminants from the membrane core and ion exchange membranes. approximately 32 ml of lb2 were used to fill the cell. a total of 60 mg of total proteins in approximately 2 ml of lb2 were added to the cell to reach the maximum loadable volume (40 ml). focusing started at 12 watts constant power. after approximately 4 hours the voltage had increased to 3000 v and the wattage had decreased to 3 w. the focused proteins were harvested in 20 fractions of approximately 1.5 ml, and ph values were checked. fractions having ph values between 3 and 7.0 were collected and denoted "a-a" (acid). fractions having ph values > 7.0 were collected and denoted "a-b" (basic). again, the protein concentration was determined for both fractions (a-a and a-b) by the bradford method. approximately 30 mg of protein were recovered at the end of the liquid-phase ief pre-fractionation from an initial 60-mg load. the losses are accounted for by the multi-step pre-fractionation procedure and are not the result of a precipitate that could not be dissolved in our lysis buffer. after each run the membrane core was cleaned with naoh 0.1 m overnight and sonicated for 5 min in water before the next focusing run. isoelectric focusing (ief) - first dimension. ief was performed using precast linear ipg strips. the 17-cm ipg strips 7-10 and 5-8 were loaded with 1.5 mg of proteins by active rehydration (12 h, 50 v). samples destined to be separated by ipg strips 7-10 received an excess of hydroxyethyldisulphide (hed) (destreak™) prior to the focusing run. for the ipg strips 5-8, focusing began at 250 v for 20 min in rapid mode, followed by 10,000 v for 5 h in linear mode and 10,000 v for 50,000 vh in rapid mode. 
ief for the strips 7-10 was carried out at 250 v for 60 min in rapid mode, 10,000 v for 3 h in linear mode and 10,000 v for 50,000 vh in rapid mode. each sample was analysed in duplicate. control and hcc samples were always run at the same time (6 control and 6 hcc samples). after ief, the ipg strips were either stored at −80°c or transferred to 10 ml of equilibration buffer (6 m urea, 30% w/v glycerin, 2% w/v sds, 50 mm tris-hcl ph 8.8) with 2% w/v dtt and 0.5% v/v bromophenol blue solution (0.25% w/v bromophenol blue, 1.5 m tris-hcl ph 8.8, 0.4% w/v sds) and incubated for 20 min at room temperature. strips were removed and incubated in equilibration buffer with 4% w/v iodoacetamide and 0.5% v/v bromophenol blue solution for a further 20 min at room temperature. finally, the strips and 10 μl sds-page molecular weight standard on filter paper were placed on top of the 20 cm x 20.5 cm 12% second-dimension gel (12% v/v acrylamide/bis solution, 375 mm tris, ph 8.8, 0.1% v/v sds, 1/2000 temed, 0.05% v/v aps). both were fixed in place with a 0.5% w/v agarose overlay. gels were run in a protean plus dodeca cell (bio-rad) at 70 v for approximately 14 h, followed by 200 v until the bromophenol blue dye reached the bottom of the gel. the running buffer (25 mm tris, 0.2 m glycine, 0.1% sds) was cooled externally to 16°c. gels were fixed overnight in 30% ethanol, 2% phosphoric acid, and washed 3 x 20 min with 2% phosphoric acid. the gels were equilibrated with 15% ammonium sulfate, 18% ethanol, 2% phosphoric acid for 15 min and finally stained with colloidal coomassie blue for 48 h. after staining, gels were washed for 10 min with pure water and scanned on a molecular fx scanner (bio-rad) at 100 μm resolution. protein spots were detected first automatically and then manually and analysed using the pdquest™ software (bio-rad). the normalization was carried out in total density in gel mode according to the manufacturer's recommendation. 
protein spots were excised from the gels using the spot cutter of bio-rad and placed into 96-well microtiter plates. excised gel spots were washed manually with 20 μl of water for 10 min and destained twice, first with 15 μl ammonium bicarbonate 50 mm for 5 min and then with 15 μl 50% ammonium bicarbonate 50 mm - 50% acetonitrile for 5 min. finally, the gel particles were covered with acetonitrile until the gel pieces shrank and were left to dry for 10 min. all proteins were digested manually in situ with 4 μl of ammonium bicarbonate 50 mm containing 20 ng trypsin (sequencing grade modified trypsin, promega, germany). after 15 min each gel piece was re-swelled with 10 μl of ammonium bicarbonate 50 mm and incubated for 4 h at 37°c. after 4 h the reaction was stopped by adding 10 μl of trifluoroacetic acid 1% containing 1.5% (w/v) n-octyl-beta-d-glucopyranoside (ogp) (applichem). for the application of the samples, 4 μl of peptide solution were loaded onto an mtp anchor chip target 600/384 (bruker daltonics) previously prepared with a saturated solution of matrix, alpha-cyano-4-hydroxy-cinnamic acid (alpha-hcca) (bruker daltonics, bremen, germany). maldi-ms was performed on an ultraflex ii maldi-tof/tof (bruker daltonics) mass spectrometer equipped with a smartbeam™ laser and a lift-ms/ms facility. the instrument was operated in positive ion reflectron mode with an acceleration voltage of 25 kv for the peptide mass fingerprint (pmf) mode. typically, 600 spectra, acquired at 100 hz, were summed and externally calibrated. in the case of ms/ms-cid the lift device was used for selection and fragmentation of the ions; the acceleration voltage in the ion source was 8 kv, the timed ion selector was set to 0.4% (relative to parent mass), and argon was used as collision gas (~4-6 × 10⁻⁶ mbar). resulting fragments were further accelerated in a second source by 19 kv and analysed by a two-stage gridless reflectron. typically, 400 shots were accumulated for the parent ion signal and 1000 shots for the fragments. 
flexcontrol™ 3.0 and flexanalysis™ 3.0 were used as instrument control and processing software (bruker daltonics, bremen, germany). a calibration standard was used for the external calibration of spectra (peptide calibration standard for mass spectrometry, which covered the mass range ~1000-4000 da). internal calibration was achieved using trypsin autolysis products (m/z 1045.564, 2211.108 and 2225.119), resulting in a mass accuracy of ≤ 50 ppm. spectra were collected by the flexcontrol software without smoothing or baseline subtraction and with a peak resolution higher than 6000 or 7000 a.u. in case of dhb and chca matrix sample preparation, respectively. the spectra were sent to the flexanalysis software, which labeled the peaks for protein identification by proteinscape 1.3 or biotools 3.1 (bruker daltonics). trypsin autolysis products, tryptic peptides of human keratin and matrix ions were automatically discarded by proteinscape (mass control list). the proteinscape score booster feature was used to improve database search results by automatic iterative recalibrations and background eliminations. protein scores greater than 53 were considered significant (p < 0.05, mascot), and annotation as a mouse protein was required for the top candidate when no restriction was applied to the species of origin in the search. identified proteins were checked individually for further consideration. for pmf peak picking the snap peak detection algorithm was applied with a signal to noise threshold of 6, a maximal number of peaks of 100, a quality factor threshold of 50 and tophat baseline subtraction. 
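the mass accuracy criterion above is expressed in parts per million. as a minimal sketch of how such a check can be computed, the following python code evaluates the relative deviation of a measured peak from its theoretical m/z and compares it with a tolerance. the three trypsin autolysis m/z values and the 50 ppm limit are taken from the text; the measured values in the usage section are hypothetical and serve only as illustration.

```python
# m/z values of the trypsin autolysis products used for internal calibration
TRYPSIN_AUTOLYSIS_MZ = (1045.564, 2211.108, 2225.119)


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass deviation in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz: float, theoretical_mz: float,
                     tol_ppm: float = 50.0) -> bool:
    """True if the absolute deviation does not exceed tol_ppm."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


def max_deviation_da(theoretical_mz: float, tol_ppm: float = 50.0) -> float:
    """Largest absolute m/z deviation allowed at the given ppm tolerance."""
    return theoretical_mz * tol_ppm / 1e6


if __name__ == "__main__":
    # hypothetical measured peaks, not data from the study
    for measured, theoretical in ((1045.600, 1045.564), (2211.300, 2211.108)):
        err = ppm_error(measured, theoretical)
        ok = within_tolerance(measured, theoretical)
        print(f"{measured:.3f} vs {theoretical:.3f}: {err:+.1f} ppm, accepted={ok}")
    for mz in TRYPSIN_AUTOLYSIS_MZ:
        print(f"m/z {mz}: 50 ppm corresponds to {max_deviation_da(mz):.4f}")
```

because the tolerance is relative, the permitted absolute deviation grows with m/z, which is why calibrants spanning the working mass range are used.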
peptide masses were searched against the swiss-prot database (download 2005; 197 228 sequences, 71 501 181 residues) employing the mascot server (in-house mascot-server, matrix science ltd., http://www.matrixscience.com/, revision 2.0.0), taking into account carbamidomethylation of cysteines, carbamidomethyl (c), as a fixed modification and possible oxidation of methionine, oxidation (m), as a variable modification, and allowing one missed cleavage. based on initial data, ion precursors were selected by proteinscape for tandem ms data acquisition (by lift-tof/tof, bruker daltonics, bremen, germany). in the mascot ms/ms ions search, the restriction mammalia was applied with a peptide tolerance of ±70 ppm and an ms/ms tolerance of ±0.9 da (fixed and variable modifications as for pmf). the acceptance criteria for pmf-based identification were an individual ions score > 27, at least five matching peptides and 10% peptide coverage of the theoretical sequences. livers, dissected from egf-overexpressing mice aged between 7-9 months, were fixed in 4% buffered paraformaldehyde and embedded in paraffin. 5 μm thick sections were deparaffinized and rehydrated through a descending alcohol series followed by a 4 min washing step in distilled h2o. subsequently, antigen retrieval was performed in citrate buffer (ph 6) by autoclaving the sections for 15 min at 121°c. the envision kit (dako, hamburg, germany) was used for immunohistochemistry. the slides were rinsed with distilled h2o and after a 5 min incubation step in tris-buffered saline (washing buffer), endogenous peroxidase activity was blocked with dako peroxidase blocking reagent for 5 min followed by a second washing step. thereafter, the sections were blocked for 10 min with protein-block serum free (dako) and incubated with primary antibodies for 45 min. details of antibody dilutions with washing buffer are given in additional file 1: table s1. in the case of goat primary antibodies a rabbit-anti-goat bridging antibody (dako) was employed. 
specifically, the bound primary antibodies or bridging antibodies were detected by use of labelled polymer hrp anti-rabbit secondary antibody (envision kit; dako) and the immunoreactivity was visualized by dako liquid dab substrate chromogen system in a 5 min incubation. finally, the sections were counterstained with harris haematoxylin for 2 min, dehydrated in an ascending alcohol series, coverslipped and examined under a light microscope (leica, jülich, germany). a total of n = 122 disease regulated proteins were filtered for statistical significance at p < 0.05 (table 1). this yielded n = 96 statistically significantly regulated proteins, two of which appeared under the same accession number in more than one spot, i.e. aah81431 = atp synthase h+ transporting mitochondrial f0 complex, subunit d and bac36241 = apoa1, the spot ids differing as a result of posttranslational modifications. the statistically significantly regulated proteins were grouped into four different categories to yield 54 tumour specific (to), 9 up-regulated (ur), 19 down-regulated (dr) and 14 proteins only expressed in healthy non-transgenic control livers (co). categorization of tumour regulated proteins based on ontology terms. a total of 82 non-redundant tumour proteins covering the to, ur and dr categories were considered and analysed for ontologies using the genexplain software (v.2.4.1), the biological pathway tools reactome (http://www.reactome.org) and kegg (http://www.genome.jp/kegg) and wikipathways (http://wikipathways.org). the tumour regulated proteins (to + ur + dr) were subjected to functional classification based on ontology terms and a p-value of < 0.01 was considered to be significant. moreover, disease regulated proteins were analysed with the cytoscape software version 3.0.2 using the function go-tree levels and number or % of proteins for a given term (see additional file 2: table s2). master regulatory proteins were searched based on the designated workflow of the genexplain software. 
it is designed to find master regulatory molecules upstream of an input list of regulated tumour proteins. after annotation of the input datasets the tool for master regulator finding over geneways network (http://www.genexplain.com) was applied. specifically, the geneways software is used to automatically extract, analyse, visualise and integrate molecular pathway data from the published peer reviewed literature. it is based on document sorting, term identification, term meaning disambiguation, information extraction, ontology, visualization and system integration [61]. the following filtering thresholds were used: score cutoff (0.2), search collection (geneways hub), maximum radius [4, 10], fdr cutoff (0.05), z-score cutoff (1.0), penalty (0.1) and decay factor (0.1) (additional file 3: table s3). protein networks for disease regulated proteins were also constructed using the string software (http://string-db.org/). the underlying database informs on known and predicted protein-protein interactions, and the constructed networks are based on the active prediction methods neighborhood, gene fusion, co-occurrence, co-expression, databases and textmining. confidence scores were calculated for each interaction pair and only those above the default cutoff score (0.4) were selected. finally, mapping of pathway information from reactome, kegg and wikipathways was implemented over protein networks using information of known pathways and sustained proteins connecting these pathways in a given network. 
table 1 legend: the proteins are sorted in alphabetical order, and the ncbi annotation is given in the accession number column. molecular weight, pi, and mascot scores are also given. the columns "gels", "c" (c = control) and "t" (t = tumour) indicate the frequency of positive identification of proteins in a total of 48 independent gels, whereas "lb2" and "lb3" (lb = lysis buffer) refer to the different lysis buffers employed. furthermore, references are given for those proteins which have already been described as hcc-associated, whereas those marked with a star (*) are so far unknown as egfr disease regulated in hepatocellular carcinoma. 
the histopathology and oncogenomics of egf induced liver cancers were previously reported [6] and an important finding of the study was the 100% incidence of malignant tumour formation in less than one year after birth. notably, a sequence of events was observed that initially consisted of diffuse large cell dysplasia followed by multiple dysplastic foci and nodules and growth of hcc. figure 1a and b depict the histopathology of healthy non-transgenic control liver and egf induced tumours, respectively. after protein extraction 2de was performed. subsequently, the gels were scanned on a bio-rad molecular fx scanner at a 100 μm resolution. image analysis was done with the pdquest™ software and spots were detected automatically. a total of 122 proteins differed in expression or were de novo expressed when 2de gels of non-transgenic controls and hcc mice were compared (see table 1 for detailed information on the proteins identified and figure 1e-g depicting examples of zoom-in gels of some regulated proteins). among them are 96 statistically significantly regulated proteins (p ≤ 0.05) of which 63 were significantly up-regulated (ratio hcc/control ≥ 2) and included fibrinogen and its subunits, vimentin, cu/zn superoxide dismutase, and apolipoprotein e (figure 1f (i-iv)), while 33 proteins were repressed in expression (ratio hcc/control ≤ 0.6) and included arginase 1, dhdh protein, glutathione peroxidase 1 and agmatine ureohydrolase (figure 1g (i-iv) and table 1). a reference 2-de map of mouse liver and serum proteins was constructed that consists of more than n = 500 proteins [2, 7]. 
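the grouping of regulated proteins into the four categories (to, ur, dr, co) and the ratio thresholds given above can be illustrated with a short python sketch. the thresholds (hcc/control ≥ 2 and ≤ 0.6) and the category counts are taken from the text. the rule that a spot absent from all control gels counts as tumour specific, and one absent from all tumour gels as control only, is an assumption about how the categories were assigned, and the function name and intensity values are hypothetical.

```python
def categorize(control_intensity: float, tumour_intensity: float,
               up: float = 2.0, down: float = 0.6) -> str:
    """Assign a spot to TO, CO, UR, DR or 'unchanged'.

    Assumption: an intensity of 0 means the spot was not detected at all.
    """
    if control_intensity == 0 and tumour_intensity == 0:
        raise ValueError("spot not detected in either group")
    if control_intensity == 0:
        return "TO"  # tumour specific
    if tumour_intensity == 0:
        return "CO"  # only expressed in control livers
    ratio = tumour_intensity / control_intensity
    if ratio >= up:
        return "UR"  # up-regulated
    if ratio <= down:
        return "DR"  # down-regulated
    return "unchanged"


# category counts as reported in the text
REPORTED = {"TO": 54, "UR": 9, "DR": 19, "CO": 14}

if __name__ == "__main__":
    print(categorize(0.0, 5.0))   # hypothetical spot seen only in tumours
    print(categorize(1.0, 3.0))   # hypothetical ratio of 3.0
    print(categorize(1.0, 0.5))   # hypothetical ratio of 0.5
    print(sum(REPORTED.values()))              # all significant proteins
    print(REPORTED["TO"] + REPORTED["UR"])     # tumour specific plus up-regulated
    print(REPORTED["DR"] + REPORTED["CO"])     # down-regulated plus control only
```

with these counts the four categories sum to 96, and the 63 up-regulated and 33 repressed proteins quoted above are consistent with to + ur and dr + co, respectively; this reading is an inference from the numbers rather than an explicit statement in the text.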
note, in our previous efforts we identified n = 25 serum proteins as regulated in the egf transgenic disease model. among them were alpha-fetoprotein, clusterin, fibrinogen-α and -γ, serum amyloid component p and several apolipoproteins, all of which were significantly up-regulated. based on the combined use of 2de and maldi-ms a total of n = 122 differentially expressed proteins were identified (table 1) and included isoforms as well as post translational modifications of albumin (5 up-regulated spots), alpha enolase (4 down-regulated spots), apolipoprotein a-i (2 up-regulated spots), atp synthase h+ transporting mitochondrial (2 down-regulated spots), fibrinogen beta (2 up-regulated spots), glycine n-methyltransferase (3 spots, in controls only), hsp60 (2 down-regulated spots), nit protein 2 (2 down-regulated spots), peroxiredoxin 6 (1 up-regulated spot and 1 down-regulated spot), and 4931406c07rik (2 up-regulated spots) (see table 1). importantly, a total of n = 37 so far unknown disease regulated proteins were identified that can now be related to egf induced liver cancer. these are marked with an asterisk in table 1. furthermore, a comparison of serum and liver proteomes revealed n = 10 proteins to be regulated in common, thus evidencing leakage of tumour proteins into systemic circulation (table 2). among them was serum afp; its up-regulation and that of others was confirmed by western blot analysis (figure 2a-e). likewise, apolipoprotein e was up-regulated both in serum and tumour samples, the ratio hcc/control being 2.2 and 3.9, respectively. in a previous study on human hcc increased expression of apoe was observed in 88% of study cases; however, apoe gene expression and serum levels were unchanged, suggesting its accumulation and impaired secretion [21]. two isoforms of alpha-2-macroglobulin were up-regulated in serum of hcc-bearing mice (spot 1: ratio hcc/control = 1.8; spot 2: ratio hcc/control = 3.2). 
its expression was exclusively associated with tumours. finally, serum amyloid component p was up to 10-fold up-regulated in serum and its tissue expression was tumour specific (table 2). to further evidence disease regulated proteins and to provide information on their subcellular localization a total of n = 8 proteins were studied by immunohistochemistry. five of them were selected for their novelty (see table 1) while amphiregulin and epiregulin were chosen for their importance in the egf-signalling pathway. furthermore, hnf4α was studied for its pivotal role in liver cancer [62]. depicted in figure 3 are immunohistochemistry stainings performed with egf transgenic livers to confirm regulation and predominant cytoplasmic expression of arginase ii. note, arg2 is only expressed in hcc, and recent evidence suggests that modulation of arginine levels in the extracellular milieu is part of an immune escape mechanism whereby lack of local arginine weakens tumour-infiltrating lymphocytes, as t cells require adequate arginine levels [63]. likewise, the tumour specific and cytoplasmic expression of the f-actin capping protein α1 subunit (capza1) and the predominant nuclear expression of tubulin β, which was particularly visible beneath the liver capsule, may possibly promote microtubule stability and interactions of microtubules with endogenous proteins. furthermore, the induced and predominantly cytoplasmic expression of the gdp dissociation inhibitor 2 (gdi2) protein is part of the control of vesicular trafficking. this protein is known to regulate gdp-gtp exchange amongst members of the rab family of proteins. the tumour specific and cytoplasmic expression of amphiregulin supports the notion of a switch in autocrine signalling, and it has been reported that amphiregulin is a prognostic marker for poor outcome in a variety of malignancies including colorectal liver metastasis [64]. 
finally, the repressed nuclear expression of hnf4α was not unexpected and confirms earlier findings [62]. based on the information given in table 1 the human protein atlas depository (www.proteinatlas.org, version 12) was interrogated. as shown in additional file 4: table s4, 48 out of 96 mouse liver cancer regulated proteins were likewise regulated in human hcc. it should be noted that for some proteins several antibodies were used to study their expression; only representative data were considered. importantly, out of the 54 proteins uniquely expressed in mouse liver tumours n = 11 were likewise uniquely expressed in human hcc, thus evidencing clinical significance of our findings. we compared our previously published transcriptomic data of egf induced liver cancers with the proteomic data obtained in the present study. such comparisons revealed n = 22 genes to be significantly regulated, of which n = 17 were regulated in common whereas for n = 5 genes transcript expression was opposite to that of the coded proteins (see additional file 5: table s5). 82 of the 96 significantly regulated proteins were mapped to 40 different biological processes (see figure 4a) of which prominent examples are regulation of arginine metabolism and amino acid import, regulation of cdc42 protein signal transduction, cellular response to oxidative stress, hydrogen peroxide and superoxide, glycolysis and gluconeogenesis, regulation of cholesterol transport, protein-lipid complex and plasma lipoprotein particle remodeling, positive regulation of steroid metabolic process, negative regulation of calcium ion transmembrane transporter activity and release of sequestered calcium ion into cytosol by sarcoplasmic reticulum (see additional file 6: table s6). in figure 4a-c the go biological process, cellular components and molecular functions are depicted. note, some of the ontology terms could be grouped, i.e. 
chaperone-mediated protein complex assembly and folding, endoplasmic reticulum unfolded protein response, er-nucleus signalling pathway and response to er oxidative stress as well as hypoxia, blood coagulation, developmental growth and regulation of programmed cell death. as depicted in figure 4b 76 significantly regulated proteins were mapped to 21 cellular components (see additional file 7: table s7), i.e. mitochondrial crista, matrix and inner membrane, endoplasmic reticulum lumen, early endosome and cytoplasmic membrane-bounded vesicle, chylomicron, very-low and high density lipoprotein particle, proteasome accessory complex, peroxisome, extracellular vesicular exosome and extracellular membrane-bounded organelle. furthermore, 75 significantly regulated proteins were mapped to 21 molecular functions (see figure 4c) and included arginase activity, fructose-bisphosphate aldolase activity, hydrolase and oxidoreductase activity, acting on carbon-nitrogen (but not peptide) bonds, acting on aldehyde, ch-oh group or oxo group of donors, nad or nadp as acceptor as well as steroid dehydrogenase activity. in addition, phosphatidylcholine-sterol o-acyltransferase activator activity, cholesterol transporter activity, sterol transporter, antioxidant and lipid transporter activity as well as electron carrier and serine-type endopeptidase inhibitor activity were prominent functions. finally, proteins functioning in metal ion and purine ribonucleoside triphosphate binding, lipoprotein particle receptor binding, chaperone and oxygen binding, binding of magnesium ion and nad, protease and single-stranded dna binding were observed as disease regulated (additional file 8: table s8). in all, 96 significantly regulated proteins were classified by the reactome, kegg and wikipathway databases, respectively.
the different databases provided similar information with the majority of tumour proteins acting in 4 major metabolic pathways (see figure 5 and information derived from cluego and cluepedia). for example, the proteins aldoa, aldoc, fbp1 and pkm function in glycolysis and gluconeogenesis whereas akr1c6, aldoa, aldoc and fbp1 are part of the fructose and mannose metabolic pathway. likewise, atp5h and ndufv1 are part of the oxidative phosphorylation pathway and mdh1 and pkm contribute to pyruvate metabolism. similarly, the proteins akr1c14, akr1c18, akr1c6, alb, apoa1, apoa4, apoe, fdps, gpx1, hacl1 and plg take part in the metabolism of lipids, arachidonic acid and lipoproteins whereas the proteins agmat, arg1, arg2, bckdha, ckb, cps1, haao and phgdh are specified for arginine and proline metabolism. in the same manner the proteins gpx1, itpa and nme1 contribute to the metabolism of nucleotides and related to this are the proteins itpa, pkm and psmc5 which are part of the purine metabolic pathway. figure 2 western blotting of serum proteins in control and egf transgenic mice. for the commonly regulated proteins in serum and tumours their regulation in liver tissue was confirmed by 2de and maldi-tof/ms (see table 1). depicted are western blots for serum proteins. note, with the exception of egf the regulated serum proteins were already reported in our earlier publication [7]. apart from these pathways a highly significant regulation of the blood coagulation cascade, platelet activation and fibrinolysis was observed as defined by the proteins crk, fga, fgb, fgg, plg and sod1 all of which were highly significantly regulated.
furthermore, trna aminoacylation (aars, gars and sars), advanced glycosylation endproduct receptor signalling (alb, capza1 and lgals3), peroxisome (ech1, hacl1 and sod1), protein processing in endoplasmic reticulum (ganab, hyou1 and pdia4), proteasome (psmc5 and psmd11) and activation of chaperone genes by xbp1(s) and 'unfolded protein response' (hyou1 and lmna) are pathways significantly perturbed in liver cancer induced by egf (see additional file 9: table s9 and additional file 10: table s10 ). using the designated workflow of the genexplain platform (see methods section) we searched for master regulatory proteins. the software is designed to identify molecules upstream of regulated tumour proteins to assist in the construction of molecular circuitries. after annotation of the input datasets the tool "find master regulators in networks (geneways)" was used to identify key nodes amongst 54 proteins exclusively expressed in tumours (to). this revealed 24 upstream regulatory molecules. among them five were selected for their link to the egfr signalling pathway, i.e. plaur, fgfr1, ptbp1 and agtrap while the protein s100a1 was chosen for its importance in the plaur/egfr network, (see additional file 11: figure s1a -e). in additional file 3: table s3 and additional file 12: table s11 , the tumour regulated proteins distributed amongst the selected master regulatory molecules are summarized. in support of its biological significance the constructed networks were enriched with gene expression data from transgenic non-tumour and tumour tissues. thus, the gene and protein data were merged and hybrid networks for each master regulatory protein were constructed. subsequently, these were merged into one (see figure 6 ) and the integrated hybrid network consisted of n = 82 network proteins of which n = 20 were tumour specific. in support, the genes coding for lmna, i.e. 
a component of the nuclear lamina that is frequently up-regulated in cancers, and mvp, which has been linked to multidrug resistance, were up-regulated (ur-t) whereas nme, a suppressor of metastasis, was repressed in expression (dr-t). likewise, the genes coding for lgals3, i.e. a beta-galactoside-binding protein frequently overexpressed in cancers, and pcbp1, which is involved in transcription and functions as an inhibitor of invasion [65], were up-regulated in transgenic non-tumour livers (ur-tr-nt) whereas transcript expression of aars, an aminoacyl-trna synthetase, and anxa6, a calcium-dependent, phospholipid-binding protein with important roles in the tumour microenvironment and metastasis, was repressed (dr-tr-nt). finally, the entire network was enriched with expression data of 16 and 17 genes, respectively, that were significantly regulated in tumour and non-tumour transgenic livers. next, we searched for master regulatory molecules by considering 82 regulated tumour proteins, i.e. proteins that were tumour specific or up- and down-regulated as compared to healthy non-transgenic controls (to + ur + dr). this revealed 29 filtered (threshold radius of 10) upstream regulators. among these 7 were selected as candidates because of their regulation in liver tumours and their link to egfr signalling. notably, in the constructed network four master regulators were significantly up-regulated, i.e. pdia4, apeh, pebp1 and apoe, while the protein expression of arg1, fbp1 and haao was repressed (see additional file 13: figure s2a-g). note, in the case of arg1 transcript expression was equally repressed. in additional file 3: table s3 and additional file 12: table s11 the tumour regulated proteins distributed amongst the selected master regulatory molecules are summarized. in support of its biological significance the fused hybrid network was enriched for gene expression data derived from transgenic non-tumour and tumour tissues.
thus, the integrated hybrid network consisted of 34 out of 82 regulated proteins and gene expression calls evidenced 6 of the 27 up-regulated tumour (to + ur) proteins to be regulated at the transcript level as well whereas among the 7 down-regulated tumour proteins (dr) the gene arg1 was repressed in expression. likewise, gene expression data from non-tumour transgenic livers evidenced 5 genes out of 27 network partners to be increased in expression (ur-tr-nt) and among the 7 down-regulated network proteins the gene phb was repressed (dr-tr-nt). thus, when the tumour gene expression data of the entire network was considered a total of 22 genes were regulated, of which 13 were up-regulated and 9 were repressed in expression (see figure 7). based on the information of the hybrid master regulatory network and in addition to other disease regulated proteins summarized in table 1 (note, some of the proteins were not part of the networks) a total of n = 122 disease regulated proteins were considered for network construction. after filtering for non-connected proteins the string database informed on n = 151 protein-interactions of which n = 76 were disease regulated as identified in the present study. among these 45, 24 and 7 were either up-, down- or not statistically significantly regulated. furthermore, gene expression calls for 45 up-regulated proteins were supported by 5 up- and 4 down-regulated genes identified in tumours and 4 up- and 6 down-regulated genes in transgenic non-tumour livers. likewise, gene expression calls for 24 down-regulated proteins were supported by 8 and 5 down-regulated genes in tumours and transgenic non-tumour livers, respectively. therefore, the entire network was supported by 14 induced and 17 repressed tumour specific gene expression changes and 16 up-regulated and 13 down-regulated genes observed in transgenic non-tumour livers.
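the comparison of protein regulation with gene expression calls described above amounts to sorting each protein into a concordant or a discordant class. the following is a minimal sketch of that bookkeeping and not the workflow used in the study; the function name and the entry `gene_x` are hypothetical, and only the directions for arg1 (repressed at both levels) and lmna (induced at both levels) follow statements made in the text.

```python
def classify_concordance(protein_direction, transcript_direction):
    """Split genes measured at both levels into concordant and discordant lists.

    Both arguments map a gene symbol to +1 (up-regulated) or -1 (repressed).
    Genes present in only one of the two data sets are ignored.
    """
    concordant, discordant = [], []
    shared = protein_direction.keys() & transcript_direction.keys()
    for gene in sorted(shared):
        if protein_direction[gene] == transcript_direction[gene]:
            concordant.append(gene)
        else:
            discordant.append(gene)
    return concordant, discordant


# Illustrative input only: arg1 and lmna follow the text, gene_x is a
# hypothetical placeholder for a protein with opposite transcript regulation.
protein_direction = {"arg1": -1, "lmna": +1, "gene_x": +1, "fbp1": -1}
transcript_direction = {"arg1": -1, "lmna": +1, "gene_x": -1}

concordant, discordant = classify_concordance(protein_direction, transcript_direction)
print("concordant:", concordant)
print("discordant:", discordant)
```

in this toy input fbp1 is dropped because no transcript call is supplied for it, which mirrors the fact that only a subset of the regulated proteins was supported by gene expression data.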
as depicted in figure 8 the proteins of the fusion network displayed functional associations via the egf/egfr network and included 69 out of 96 (72%) significantly regulated tumour proteins with 6 out of 7 master regulators being connected to egfr through the network's proteins (see additional file 14: table s12 for possible protein-protein interactions and related scores). of the 151 network proteins 109 could be mapped to distinct pathways. after removal of non-relevant terms such as alzheimer disease a total of 94 proteins were mapped to 6 pathways with meaningful associations (see figure 9) and consisted of 'platelet activation, signalling and aggregation (platelet degranulation)', 'lipoprotein metabolism', 'mapk signalling pathway', 'glycolysis and gluconeogenesis', 'metabolism of amino acids and derivatives (arginine and proline metabolism)', 'apoptosis' and 'egfr1 signalling pathway'. additionally, a total of 2 and 3 tumour regulated proteins were mapped to the hif-1 signalling and protein processing in endoplasmic reticulum pathways, respectively. the pathway mapping was also supported by gene expression data with 10 up- and 9 down-regulated genes in tumours and 9 up- and 6 down-regulated genes in transgenic non-tumour livers. note, two of the significantly regulated tumour proteins, i.e. crk and pebp1, are members of the egfr1 signalling pathway with pebp1 also functioning as a master regulator while the other regulated proteins are connected to egfr signalling through cross-talk among the pathways (see additional file 15: table s13). figure 6 integrated master regulatory network for proteins uniquely expressed in tumours. based on network information obtained for the 5 different master regulators an integrated hybrid network was constructed. the network contained 82 proteins including 20 with connectivity to egfr signalling (yellow coloured inner node).
the master regulator, the connecting proteins (network elements) and regulated proteins are given as red, green and blue coloured inner node, respectively. furthermore, each node is partitioned into four segments; the first segment from the left refers to tumour specific proteins and is red-coloured. the second, third and fourth segments refer to either up- and down-regulated proteins, tumour specific gene expression changes and gene regulations in transgenic non-tumour liver tissue, respectively. increased expression of either proteins or genes is given in red, whereas the blue colour denotes repressed expression. recent research into the molecular pathogenesis of hcc evidenced significant alterations in signalling pathways. given the fact that the epidermal growth factor is an important mitogen for hepatocytes we were particularly interested in investigating the consequences of its targeted overexpression in the liver. in our previous study we employed chromatin immunoprecipitation followed by cloning and sequencing of dna to search for tumour associated gene regulations targeted by novel hnf4alpha p1 and p2 promoter-driven isoforms. this identified egf-receptor substrate (eps15r) and eps15 as regulated by the p2 promoter-driven hnf4alpha splice variant in mouse and human hcc. a molecular circuitry was proposed whereby eps15 and eps15r mediate internalization of activated egfr to stimulate receptor recycling, therefore responding to mitogenic signalling of egf [66]. in the present study disease proteomics was performed to further investigate the role of egf in liver cancer. this identified 122 regulated proteins of which 37 are novel and have not been reported so far. a total of 63 proteins were significantly up-regulated (table 1). among these 18 were extracellular or secreted proteins and included albumin and its isoforms, apolipoproteins (apoe, apoa4 and apoa1), α-, β- and γ-fibrinogen, plasminogen as well as interleukin 1 receptor antagonist (il-1ra).
note, an isoform of apoa1 was already proposed as serum marker of hcc [67] and based on ihc staining il-1ra expression was confirmed in about 70% of mouse liver adenoma and carcinoma cases; however preneoplastic foci as well as normal hepatocytes surrounding the lesions were negative. furthermore, rt-pcr analysis confirmed mouse hepatic tumours to contain both secreted and intracellular forms of il-1ra [51] and serum levels of il-1ra were monitored to assess therapeutic efficacy of radiofrequency ablation in hcc patients [68]. figure 7 integrated master regulatory network for hcc regulated proteins. based on network information obtained for 7 different master regulators an integrated hybrid network was constructed. the network contained 114 proteins including 34 with connectivity to egf/egfr signalling (yellow coloured inner node). the master regulator, the connecting proteins (network elements) and regulated proteins are given as red, green and blue coloured inner node, respectively. furthermore, each node is partitioned into four segments whereas the first segment seen from left refers to tumour specific proteins and is red-coloured. the second, third and fourth segments refer to either up- and down-regulated proteins, tumour specific gene expression changes and gene regulations in transgenic non-tumour liver tissue, respectively. increased expression of either proteins or genes is given in red, whereas the blue colour denotes repressed expression. an important finding of the present study is the statistically significant regulation of 20 mitochondria associated proteins of which 13 were repressed while 7 were up-regulated. similar results were reported by chignard and wei sun with mitochondrial proteins being the second largest proportion of regulated proteins in human viral hcc [15, 69]. among the repressed proteins were nadh dehydrogenase (ubiquinone) 1 alpha subcomplex 8 and prohibitin, a mitochondrial chaperone.
this protein, when deleted (prohibitin ko mice) induced fibrosis, bile duct metaplasia, liver dysplasia and eventually multifocal hcc. however, its overexpression in tumour cell lines inhibited cell proliferation to demonstrate tumour suppressor function [70] . likewise, glutathione peroxidase 1 (response to oxidative stress) and argininosuccinate synthetase 1 (ass, urea cycle) were repressed. note, ass is the first of two enzymes to convert citrulline to arginine and this pathway allows cells to synthesize arginine from citrulline to function in no production, ammonia detoxification and synthesis of polyamines. several reports suggest ass deficiency to be common in tumour cell lines [25] [26] [27] [28] [29] [30] , and the present study confirms ass expression to be confined to healthy non-transgenic control liver, but ass was absent in tumour tissue extracts (see table 1 ). ablation of ass in diverse tumours suggests a tumour suppressor function and the fact that forced expression of ass in osteosarcoma cell lines suppresses growth adds weight to this notion [71] . another example of tumour specific ablation of proteins refers to glycine n-methyltransferase (gnmt). the enzyme catalyzes the transfer of a methyl group from s-adenosylmethionine (sam) to glycine thereby generating s-adenosylhomocysteine and n-methylglycine. this protein was completely downregulated in liver tumours. gnmt is known to play a role in the maintenance of genetic stability [44, 72] , and a novel tumour suppressor function was recently reported that is independent of its catalytic activity but does require its nuclear localization [73] . several of the proteins listed in table 1 were already reported for their tumour specific regulation while proteins so far unknown for their regulations in hcc, are marked with an asterisk (table 1) . these function in diverse biological processes including metabolism, translation and signalling. 
specifically, changes in carbohydrate metabolism are commonly observed in tumours where energy production relies on glycolysis rather than mitochondrial oxidative phosphorylation. in the present study induced expression of several glycolytic enzymes was observed, most notably (1) pyruvate kinase 3 that catalyzes the transfer of a phosphate group from phosphoenolpyruvate to adp and was shown to be a target of microrna-122 in hcc [74], (2) aldolase, an enzyme that converts fructose 1,6-bisphosphate into dihydroxyacetone phosphate (dhap) and glyceraldehyde 3-phosphate and was reported to be a sensitive marker for benign and malignant liver disease [75] and (3) alpha glucosidase 2, a hydrolase that cleaves glycosidic bonds with the release of alpha glucose from carbohydrates. further important findings include the tumour specific expression of alanyl-, glycyl- and seryl-trna synthetases which catalyze the transfer of specific amino acids to trna, as well as regulation of eukaryotic translation elongation factor 2 and poly(rc) binding protein 2 that binds to oligo dc. note, knowledge on the role of aminoacyl-trna synthetases in cancer is just emerging [76] and through the use of a lentiviral mediated shrna vector, a link between aminoacyl-trna synthetases [aars]-interacting multifunctional protein 2 (aimp2) and repressed egfr signalling was established that resulted in repressed glucose uptake [77]. we also observed induced expression of heterogeneous nuclear ribonucleoprotein (hnrnp) that takes on diverse functions in the processing of mrna. its expression was reported to be increased in serum of hcc patients. in contrast, proteins involved in the synthesis and degradation of cholesterol, lipids, steroids and fatty acids were in part oppositely regulated and included induced expression of the aldo-keto reductase family 1.
regulation of this protein has been reported for lung and pancreatic cancers [78], and gene silencing of aldo-keto reductase family 1 b10 resulted in growth inhibition of colorectal cancer cells that might be of therapeutic utility [79]. the repressed expression of certain proteins may also be considered as an adaptive response and includes the enzyme enoyl coenzyme a hydratase 1. its activity was shown to contribute to lymphatic spread of liver tumours as was evidenced in gene silencing studies [80]. likewise, we observed repressed expression of dihydrodiol dehydrogenase in tumours. this enzyme plays an important role in the metabolism of steroids that leads to inactivation of circulating androgens, progestins and glucocorticoids and was repeatedly reported to be overexpressed in non-small cell lung cancer. figure 9 pathway mapping of fused network proteins. cytoscape 3.0.2 with plugins (see methods section) was used to generate a functionally grouped network of pathways. grouping of significant pathway terms (p ≤ 0.05) was based on a kappa score threshold of 0.4, an initial group size of 2 and a sharing group percentage of 50. the pathway network consisted of 35 significantly and 7 non-significantly regulated proteins involved in distinct pathways which are colour-coded. note, the two individual terms are grey-coloured. up- and down-regulated proteins are coded as orange and green small discs, respectively. up- and down-regulated as well as non-significantly regulated proteins and connecting proteins of the network are given as orange, green, yellow and blue coloured discs, respectively. the network depicts protein-protein interactions in liver tumours of egfr transgenic mice and their relation to various pathways under the influence of egfr signalling. egfr is highlighted as blue triangle in this network.
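the kappa score threshold of 0.4 quoted in the figure 9 legend refers to a chance-corrected measure of how strongly two pathway terms share proteins. the sketch below computes cohen's kappa for two terms from binary membership over a background list; it is an illustration under stated assumptions and not the cluego implementation, whose details may differ. the function name and the 20-protein background are hypothetical, while the term memberships for glycolysis, fructose and mannose metabolism and pyruvate metabolism are taken from the results above.

```python
def kappa_score(term_a, term_b, universe):
    """Cohen's kappa for the overlap of two terms within a background set."""
    universe = set(universe)
    a = set(term_a) & universe
    b = set(term_b) & universe
    n = len(universe)
    both = len(a & b)                      # proteins annotated to both terms
    only_a = len(a - b)
    only_b = len(b - a)
    neither = n - both - only_a - only_b   # proteins annotated to neither term
    observed = (both + neither) / n
    expected = (len(a) * len(b) + (n - len(a)) * (n - len(b))) / n ** 2
    if expected == 1.0:                    # degenerate case: no variation at all
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical background of 20 proteins named in this study (assumption).
universe = ["aldoa", "aldoc", "fbp1", "pkm", "akr1c6", "mdh1", "atp5h",
            "ndufv1", "gpx1", "itpa", "nme1", "psmc5", "arg1", "arg2",
            "apoe", "apoa1", "apoa4", "plg", "fga", "fgb"]
glycolysis = ["aldoa", "aldoc", "fbp1", "pkm"]
fructose_mannose = ["akr1c6", "aldoa", "aldoc", "fbp1"]
pyruvate = ["mdh1", "pkm"]

THRESHOLD = 0.4  # value quoted in the figure 9 legend
for name, term in [("fructose/mannose", fructose_mannose), ("pyruvate", pyruvate)]:
    k = kappa_score(glycolysis, term, universe)
    print(f"glycolysis vs {name}: kappa = {k:.4f}, grouped = {k >= THRESHOLD}")
```

with this toy background the glycolysis and fructose/mannose terms exceed the threshold and would fall into one group, whereas glycolysis and pyruvate metabolism, which share only pkm, would not. the outcome depends on the chosen background, so the numbers carry no meaning beyond the illustration.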
amongst patients with high dihydrodiol dehydrogenase (dhd) expression the incidence of early tumour recurrence and distant metastasis is significantly higher and patients are highly resistant to chemo- and radiotherapy [81]. intriguingly, complete ablation of mitochondrial butyryl coenzyme a synthetase 1, a gtp-dependent lipoate-activating enzyme, was observed in tumours of egf transgenic mice. little is known about the possible link between butyrate metabolism and liver cancer. however, butyrate is well known to inhibit proliferation of human colon carcinoma cells in an epigenetic manner that involves histone acetylation [82]. note, it was recently reported that the warburg effect dictates butyrate-mediated histone acetylation and cell proliferation [83]. several lines of evidence therefore suggest butyrate to act as a cytosolic sensor for histone acetylation; when transformed to intermediates by butyryl coenzyme a synthetase it is unable to escape the mitochondria. moreover, we observed a highly significant repression of 2-hydroxyphytanoyl-coa-lyase. this peroxisomal thiamine pyrophosphate-dependent enzyme is rate limiting in the breakdown of 2-hydroxy fatty acids. the biological role of 2-hydroxy fatty acids has only recently become apparent [84] and cumulative evidence suggests intermediates of energy metabolism to specifically activate g-protein coupled receptors which are now classified as hydroxy carboxylic acid receptors (hca1-3). the hca2 receptor is involved in a complex negative feed-back loop whereby ketone bodies derived from fatty acid oxidation are sensed by hca2 via the activity of 3-hydroxybutyrate that leads to inhibition of lipolysis and to restriction of further fatty acid supply. in this way triglyceride use is diverted and energy demands for tumour growth are met more efficiently. specifically, during rapid tumour growth and the herewith associated ischemia the yield of high energy bonds (atp) from glucose oxidation is about twice that of fatty acid oxidation.
our observation that proteins involved in the ß-oxidation of fatty acids were either repressed or unchanged agrees well with this principle (see also discussion below). the reduced expression of lysophospholipase signifies an adaptive response; it catalyses the production of lysophosphatidic acid, i.e. a second messenger known to contribute to tumour cell motility, survival and proliferation [85]. additionally, the repressed expression of mitochondrial acyl-coa thioesterase 1 in liver tumours, which hydrolyzes acyl-coas to free fatty acids and coenzyme a, will influence the supply of ligands for nuclear receptors and the regulation of fatty acid oxidation in mitochondria and peroxisomes. equally, the regulation of farnesyl diphosphate synthetase, i.e. a key enzyme in the isoprenoid biosynthetic pathway, is highly interesting and this enzyme is explored as a drug target of bisphosphonates to treat tumour growth [86]. its up-regulation in colon cancers was reported [39]. in the present study repressed expression of the ribosomal component rps12 and enzymes of amino acid metabolism like branched chain ketoacid dehydrogenase e1 as well as dimethylglycine dehydrogenase was observed. conversely, expression of the proteasome 26s atpase subunit 5 (p45/sug) and its non-atpase regulatory subunit 11 (psmd11) was confined to tumour tissues (see table 1); the latter subunit is known to display high activity in embryonic stem cells. this multicomplex molecular machinery degrades intracellular proteins marked by ubiquitin chains. psmd11 was reported to be up-regulated in breast cancer cells [87]. enhanced expression of cytoskeletal proteins such as tubulin β 5 and capza1 was also confirmed by ihc staining (see figure 3).
differences in the localization of these proteins were obvious with tubulin ß 5 expression being primarily associated with cells proximal to the liver capsule, whereas expression of capping protein z-line α1 (capza1) was strongly associated with tumour foci; this protein is known to play a pivotal role in cytoskeletal networks to support cell mobility, invasion and metastasis. additionally, gdi2, a protein functioning in the cycling of rab gtpases, and arginase ii, i.e. a non-liver isoform of the urea cycle, were up-regulated in tumours of egf transgenic mice (see figure 3). regulation of arginase ii was observed in various malignancies including lung cancer [88]. besides, the actin-binding protein lasp1 was uniquely expressed in tumours and is also up-regulated in breast cancer [89] to possibly support migration of cancer cells [90]. furthermore, pdia4, a disulfide bond isomerase and master regulator of the constructed networks (see below), was up-regulated as was kininogen that is part of the blood coagulation system and functions as a precursor of kinin. conversely, the serine proteinase inhibitor serpinb1a was repressed in expression, possibly limiting immunological responses in tumour growth and influencing inflammatory cytokine production by infiltrating monocytes [91]. the significant regulation of the calcium binding proteins sorcin and nucleobindin 1 is a further highly interesting result. sorcin is associated with multidrug-resistance in human leukemia cells [92] and nucleobindin 1 is evaluated as a biomarker of colon cancer [93]. in egf induced liver tumours transthyretin was also up-regulated. this protein is involved in the transport of thyroid hormones and was reported to be aberrantly regulated in thyroid cancer [94]. among the newly identified proteins is v-crk sarcoma virus ct10.
this oncoprotein interacts with several tyrosine-phosphorylated proteins and is part of intracellular signalling cascades, notably the phosphoinositide 3-kinase (pi3k)/akt pathway [95]. likewise, regulation of the 170 kda glucose-regulated protein grp170 is of great importance. this lumenal endoplasmic reticulum protein plays a role in immunoglobulin folding as was confirmed by coimmunoprecipitation in four different b cell hybridoma cell lines [11]. in our previous study several immunoglobulins were found to be either repressed or absent in serum of egf tumour bearing mice and this was particularly obvious for the ig k and l classes [7]. it remains to be determined whether repression of immunoglobulins can be attributed to aberrant grp170 activity. a summary of the biological functions in addition to their previous reported tumour association is given in additional file 16: table s14 while the regulation of genes coding for newly identified proteins and of genes coding for commonly regulated proteins in liver tumours and serum of egf2b-transgenic mice is given in additional file 17: table s15 and additional file 18: table s16. initially the network construction was based on proteins exclusively expressed in tumours and on the selection of master regulatory proteins linked to egfr signalling. thereafter, a fused hybrid network was developed that included the tumour specific proteins. subsequently, the search was extended to all significantly regulated proteins (table 1). this revealed 7 master regulatory proteins and their associated networks, which encompassed 114 proteins of which 34 were disease regulated. eventually a fused network was developed; however not all disease regulated proteins are part of it. the pathway mapping performed over fused networks (see string analysis) defined protein interactions and grouped 76 disease regulated proteins into 6 distinct pathways of which platelet activation, signalling and aggregation is a major one (see figure 8).
specifically, the glycoprotein fibrinogen is a multimeric protein and consists of α, ß and γ subunits. it is synthesized by hepatocytes and is an essential blood coagulation factor with all polypeptide chains being highly regulated in tumours of egf transgenic mice. note, an association between coagulation factors and malignancies was established whereby fibrinogen functions as an extracellular matrix protein to interact with integrin receptors in the control of cell proliferation and cell migration [96]. accordingly, induced gene expression of the integrin receptors itgb1, itga3 and itgav was observed in egf induced liver tumours. in cancer progression a regulatory loop between fibrinogen, platelets and tumour cells has been determined that is activated by platelet cytosolic ca2+. this second messenger induces integrin receptor complex formation through an association of platelet glycoprotein chains iib and iiia (cd41/cd61) thereby creating an active binding site for fibrinogen. an association of tumour regulated proteins with the regulatory loop was confirmed in string analysis (figure 8) and fibrinogen was reported to be an important determinant for metastasis of circulating tumour cells [97]. it is therefore of no surprise that elevated blood fibrinogen is a poor prognostic factor. haemostatic complications are commonly observed in cancer patients and future therapeutic strategies may focus on the haemostatic system by targeting tumour stroma. in this regard the tumour specific induction of plasminogen is of great importance. this zymogen [98] is converted to plasmin by urokinase (upa), a serine protease which itself was unchanged; however, gene expression of its receptor was significantly up-regulated in transgenic non-tumour livers. one report suggests the urokinase receptor to prime cells for proliferation in response to egf by promoting tyr845 phosphorylation and stat5b activation; nonetheless, this depended on intracellular c-src levels [99].
further studies established a link between induced expression of plasminogen activator, upa receptor and plasminogen activator inhibitor type-1 (pai-1) and invasiveness and metastasis of hcc [100, 101]. indeed, a fine balance exists between the plasminogen activating system and its inhibition by pai-1 and pai-2. based on transcriptomic data a highly significant induction of pai-1 (up to 12-fold) in large tumours of egf transgenic mice was observed [6]; consequently, the regulation of components of the plasminogen activating system may be considered as part of a strategy to degrade extracellular matrix thereby facilitating invasion and metastasis [102, 103]. to meet energy demands efficiently different sources are utilized and the induction of the proteins aldoa, aldoc, eno1, pkm and fbp1 is testimony to altered glycolytic and pentose phosphate pathways. however, with the exception of acyl-coa thioesterase 2, which was below the limit of detection and functions in the hydrolysis of myristoyl-, palmitoyl-, stearoyl- and arachidoyl-coa esters, the regulation of enzymes linked to fatty acid metabolism in mitochondria and peroxisomes was hardly observed. in pursuit of tumour growth and to sustain organelle and membrane biogenesis lipids are de novo synthesized and mobilized from stores and while the complex interaction of hepatic lipid and glucose metabolism in liver disease is the subject of intense research [104] the present study evidences significant regulation of several apolipoproteins, i.e. apoe, apoa1 and apoa4, and of isoforms of albumin. apart from lipid transport apolipoproteins play a wider role in cancers and are known to interact with diverse receptors to elicit cellular events as demonstrated for apoe to cause sustained proliferation and survival of cancer cells [105]. a further group of highly regulated proteins are aldo-keto reductases. their quantitative evaluation in different hepatocellular carcinoma (hcc) cell lines was recently reported [106].
this superfamily of proteins comprises nad(p)(h)-dependent enzymes which catalyze the oxidoreduction of a variety of prostaglandins, steroids and toxic aldehydes. their involvement in tumorigenesis is supported by several studies and they are being explored as drug targets to overcome chemoresistance. in the present study the aldo-keto reductases akr1c14, akr1c18 and akr1c6 were uniquely expressed in tumours, whereas glutathione peroxidase 1 was repressed to 30% of the level in healthy control livers, possibly to support hif-1 signalling. indeed, the redox state, and therefore glutathione, participates in the hypoxic induction of hif-1 [107], and two proteins of the glycolytic pathway which respond to hif-1 signalling, i.e. aldoa1 and eno1, were regulated. moreover, glutathione peroxidase 1 was shifted in the gel, as shown in figure 1 panel g iii, as a result of post-translational modifications that most likely involved c-abl and arg kinase activity at tyr 96 of gpx1 [108]. likewise, the genes coding for aldo1 and eno3 were significantly up-regulated in egf induced liver tumours. a complex interaction exists between egfr and rage signalling. this receptor for advanced glycation end-products is a member of the immunoglobulin family of cell surface molecules and was reported to significantly influence hepatic tumour growth in murine models of colorectal carcinoma [109]. there is strong evidence that rage promotes cancer growth upon ligand dependent activation, and several proteins of the s100 family bind to the extracellular domain of rage [110, 111]. it is of considerable importance that gene expression of s100a4 and s100a11 was induced up to 34-fold in tumours of egf transgenic mice, whereas expression of s100a1 was repressed. likewise, the tumour specific expression of the rage binding proteins lectin, galactoside-binding, soluble, 3 and capza1 in tumours of egf transgenic mice is highly suggestive of a sustained crosstalk between rage and egfr [112].
although the precise mechanism by which s100 proteins stimulate egfr signalling remains to be elucidated, binding of s100a4 to egf and to other egfr ligands was reported to possibly facilitate interaction with the receptor [113]. similarly, the binding of s100a8/a9 to rage was shown to promote migration and invasion of human breast cancer cells through actin polymerization and epithelial-mesenchymal transition [114]. conversely, advanced glycation endproduct (age) receptor 1 suppressed oxidant stress-dependent signalling via the egfr and shc/grb2/ras pathway [115]. as depicted in figure 5, amino acid metabolism was another distinct pathway to which several of the regulated proteins could be mapped. note that the tumour specific regulation of arginase 1 and 2 as well as the regulation of subunits of the proteasome 26s atpase (psmc5 and psmd11) have already been discussed (see above). in the following, additional proteins regulated in this pathway are briefly summarized. specifically, 3-hydroxyanthranilate-3,4-dioxygenase (haao) catalyzes the oxidation of 3-hydroxyanthranilate to quinolinate, an intermediate that functions as a precursor in nad and pyridine biosynthetic pathways. expression of haao was significantly repressed in tumours of egf transgenic mice, and hypermethylation of the coding gene was observed in ovarian cancer [116]. because haao is significantly repressed at the gene and protein level in at least two different tumour entities (ovarian and liver cancer), the protein may function as a tumour suppressor that appears to be repressed by an epigenetic mechanism. a significant finding is the tumour specific expression of 3-phosphoglycerate dehydrogenase, which catalyses the oxidation of 3-phosphoglycerate to 3-phosphohydroxypyruvate. 3-phosphoglycerate is an intermediate of glycolysis and an essential precursor of the serine biosynthetic pathway.
importantly, a recent metabolomic study evidenced that 3-phosphoglycerate is diverted into serine and glycine metabolism, and repressed expression of 3-phosphoglycerate dehydrogenase resulted in impaired tumour cell proliferation [117]. in support of tumour growth, the diversion of glycolytic intermediates affects protein, membrane lipid and nucleotide synthesis. moreover, the observed induction of creatine kinase in tumours of egf transgenic mice creates a circuitry for cellular energy homeostasis under conditions of high metabolic demand [118]. the enzyme catalyses the reversible transfer of phosphate from phosphocreatine to adp to yield atp and creatine. its induction has been observed in many cancers including liver cancer cell lines [119, 120], and a further study suggested a possible interplay between p53 mutations, hcc and ck expression, together with growth-inhibitory effects of cyclocreatine in hcc [121]. while the rationale of tumour cells for embarking on abnormal metabolism has already been discussed (see above), the finding that agmatine ureohydrolase was strongly repressed in egf induced liver tumours to about 10% of the level in non-transgenic healthy livers is of great importance. this enzyme hydrolyzes agmatine (= decarboxylated arginine) to form putrescine and urea, and its repression will significantly increase the agmatine tissue concentration to influence diverse cellular control mechanisms. importantly, in the study of battaglia and coworkers [122], 1 mm agmatine induced large amounts of superoxide production in rat liver mitochondria; however, it did not affect mitochondrial respiration or the redox levels of thiols and glutathione.
furthermore, atp synthesis remained normal and agmatine prevented ca(2+)-induced mitochondrial permeability transition in the presence of phosphate. this suggests an intriguing regulatory loop whereby h2o2 induces hypoxia signalling that is linked to aberrant metabolism; by selecting interconnected physiological pathways, tumour cells are nonetheless equipped to avoid programmed cell death [122, 123]. thus, arginine deprivation is being evaluated for its utility in cancer therapy [124]. a further enzyme, repressed to 20% of the level in healthy non-transgenic liver, is carbamoyl phosphate synthetase 1 (cps1), a liver specific ligase that functions in ammonia detoxification. it is perplexing that tumour cells disable such an important step of the urea cycle. however, a recent study demonstrated dna hypermethylation as a key mechanism of silencing cps1 gene expression in human hcc. note that forced expression of cps1 induced cell proliferation, and the observed repression in human hcc may simply be the result of genomic instability, as was observed in tumour cells [125]. the present study identified novel disease regulated proteins induced by overexpression of egf and provides new insight into the complex signalling events in hcc. six major pathways perturbed by egfr hyperactivity were identified, and several of the regulated proteins are interesting drug target candidates; these include kinases with tumour specific expression as well as proteins involved in aberrant metabolism. an identification of commonly regulated proteins in tumour and sera will be of great utility in the development of biomarkers to monitor disease progression and responses to therapy. the following additional data are available with the online version of this paper.
aimp2: aminoacyl-trna synthetases interacting multifunctional protein 2; alpha-hcca: alpha-cyano-4-hydroxycinnamic acid; aps: ammonium persulfate; ass: argininosuccinate synthetase 1; atp: adenosine triphosphate; ck: creatine kinase; ck2α: casein kinase ii subunit alpha; co: control specific protein; dr: down-regulated protein; dr-t: down-regulated tumour; dr-tr-nt: down-regulated transgenic non-tumour; egf: epidermal growth factor; egfr: epidermal growth factor receptor; eps15: epidermal growth factor receptor substrate 15; eps15r: epidermal growth factor receptor substrate 15r; er: endoplasmic reticulum; erk: extracellular signalling regulated kinase; fdr: false discovery rate; fgfr1: fibroblast growth factor receptor 1; go: gene ontology; grp170: 170 kda glucose-regulated protein; gtp: guanosine triphosphate; h3po4: orthophosphoric acid; hca: hydroxy carboxylic acid; hcc: hepatocellular carcinoma; hdac2: histone deacetylase 2; hed: bis(2-hydroxyethyl) disulfide; hrp: horseradish peroxidase; ihc: immunohistochemistry; il-1ra: interleukin-1 receptor antagonist; ko: knockout mouse; mek: mitogen-activated protein kinase kinase; nad: nicotinamide adenine dinucleotide; nad(p)(h): nicotinamide adenine dinucleotide phosphate; nadp: nicotinamide adenine dinucleotide phosphate; no: nitric oxide; ogp: n-octyl β-d-glucopyranoside; pai-1: plasminogen activator inhibitor type-1; pdgfr: platelet-derived growth factor receptors; plaur: plasminogen activator, urokinase receptor; pmf: peptide mass fingerprinting; ptbp1: polypyrimidine tract binding protein 1; rage: receptor for advanced glycation endproduct; s100a1: s100 calcium binding protein a1; sam: s-adenosylmethionine; shrna: short hairpin rna; stat5b: signal transducer and activator of transcription 5b; temed: tetramethylethylenediamine; to: tumour specific protein; trna: transfer rna; tyr845: tyrosine residue 845; upa: urokinase-type plasminogen activator; ur: up-regulated protein; ur-tr-nt: up-regulated transgenic non-tumour; vegfr: vascular endothelial
growth factor receptor; xbp1(s): x-box binding protein 1 isoform surgery and ablative therapy for hepatocellular carcinoma improved method for proteome mapping of the liver by 2-de maldi-tof ms identification of specific protein markers in microdissected hepatocellular carcinoma soluble interleukin-2 receptor levels in hepatocellular cancer: a more sensitive marker than alfa fetoprotein discovery and development of sorafenib: a multikinase inhibitor for treating cancer epidermal growth factor-induced hepatocellular carcinoma: gene expression profiles in precursor lesions, early stage and solitary tumours mapping of the serum proteome of hepatocellular carcinoma induced by targeted overexpression of epidermal growth factor to liver cells of transgenic mice oncogenic potential of ck2alpha and its regulatory role in egf-induced hdac2 expression in human liver cancer epidermal growth factor receptor inhibition attenuates liver fibrosis and development of hepatocellular carcinoma distribution and function of egfr in human tissue and the effect of egfr tyrosine kinase inhibition the 170-kda glucose-regulated stress protein is an endoplasmic reticulum protein that binds immunoglobulin the hsp110 and grp1 70 stress proteins: newly recognized relatives of the hsp70s proteome database of hepatocellular carcinoma proteomic analysis of differentially expressed proteins in hepatocellular carcinoma developed in patients with chronic viral hepatitis c proteome analysis of hepatocellular carcinoma by two-dimensional difference gel electrophoresis: novel protein markers in hepatocellular carcinoma tissues identification of human hepatocellular carcinoma-related proteins by proteomic approaches expressed proteome analysis of human hepatocellular carcinoma in nude mice (lci-d20) with high metastasis potential overexpression of alpha enolase in hepatitis c virus-related hepatocellular carcinoma: association with tumor progression as determined by proteomic analysis profilin 1 obtained 
by proteomic analysis in all-trans retinoic acid-treated hepatocarcinoma cell lines is involved in inhibition of cell proliferation and migration serum biomarkers of hepatitis b virus infected liver inflammation: a proteomic study protein level of apolipoprotein e increased in human hepatocellular carcinoma proteome analysis of hepatocellular carcinoma by laser capture microdissection proteomic profiling of proteins decreased in hepatocellular carcinoma from patients infected with hepatitis c virus current progress in proteomic study of hepatitis c virus-related human hepatocellular carcinoma incidence and distribution of argininosuccinate synthetase deficiency in human cancers: a method for identifying cancers sensitive to arginine deprivation pegylated recombinant human arginase (rharg-peg5,000mw) inhibits the in vitro and in vivo proliferation of human hepatocellular carcinoma through arginine depletion renal cell carcinoma does not express argininosuccinate synthetase and is highly sensitive to arginine deprivation via arginine deiminase identification and preliminary validation of novel biomarkers of acute hepatic ischaemia/ reperfusion injury using dual-platform proteomic/degradomic approaches pegylated arginine deiminase: a novel anticancer enzyme agent arginine deprivation, growth inhibition and tumour cell death: 3. 
deficient utilisation of citrulline by malignant cells accurate qualitative and quantitative proteomic analysis of clinical hepatocellular carcinoma using laser capture microdissection coupled with isotope-coded affinity tag and two-dimensional liquid chromatography mass spectrometry oxidative phosphorylation and f(o)f(1) atp synthase activity of human hepatocellular carcinoma underexpression of mrna in human hepatocellular carcinoma focusing on eight loci insight into hepatocellular carcinogenesis at transcriptome level by comparing gene expression profiles of hepatocellular carcinoma with those of corresponding noncancerous liver proteome analysis of hepatocellular carcinoma hypermethylation of nad(p)h: quinone oxidoreductase 1 (nqo1) gene in human hepatocellular carcinoma use of proteomic methods to identify serum biomarkers associated with rat liver toxicity or hypertrophy identification of hepatocellularcarcinoma-associated antigens and autoantibodies by serological proteome analysis combined with protein microarray higher farnesyl diphosphate synthase activity in human colorectal cancer inhibition of cellular apoptosis proteome analysis of human hepatocellular carcinoma tissues by two-dimensional difference gel electrophoresis and mass spectrometry selenoprotein deficiency and high levels of selenium compounds can effectively inhibit hepatocarcinogenesis in transgenic mice contrasting patterns of regulation of the antioxidant selenoproteins, thioredoxin reductase, and glutathione peroxidase, in cancer cells selenium regulation of glutathione peroxidase in human hepatoma cell line hep3b genotypic and phenotypic characterization of a putative tumor susceptibility gene, gnmt, in liver cancer characterization of reduced expression of glycine n-methyltransferase in cancerous hepatic tissues using two newly developed monoclonal antibodies strategic shotgun proteomics approach for efficient construction of an expression map of targeted protein families in hepatoma 
cell lines the alteration of histidase catalytic activity and the expression of the enzyme protein in rat primary hepatomas identification of differential expression of genes in hepatocellular carcinoma by suppression subtractive hybridization combined cdna microarray inhibition of the acute-phase response in a human hepatoma cell line il-1 receptor antagonist regulation of acute phase protein synthesis in human hepatoma cells expression of an il-1 receptor antagonist during mouse hepatocarcinogenesis demonstrated by differential display analysis major urinary protein as a negative tumor marker in mouse hepatocarcinogenesis hepatocellular carcinoma: from bedside to proteomics plasminogen fragment k1-5 improves survival in a murine hepatocellular carcinoma model expression of angiostatin cdna in human hepatocellular carcinoma cell line smmc-7721 and its effect on implanted carcinoma in nude mice the role of angiostatin, vascular endothelial growth factor, matrix metalloproteinase 9 and 12 in the angiogenesis of hepatocellular carcinoma using proteomic approach to identify tumor-associated antigens as markers in hepatocellular carcinoma a gene expression biomarker provides early prediction and mechanistic assessment of hepatic tumor induction by nongenotoxic chemicals positional expression profiling indicates candidate genes in deletion hotspots of hepatocellular carcinoma hormonal regulation of vitamin d-binding protein production by a human hepatoma cell line geneways: a system for extracting, analyzing, visualizing, and integrating molecular pathway data progression of hcc in mice is associated with a downregulation in the expression of hepatocyte nuclear factors boosting antitumor responses of t lymphocytes infiltrating human prostate cancers amphiregulin is a promising prognostic marker for liver metastases of colorectal cancer pcbp-1 regulates alternative splicing of the cd44 gene and inhibits invasion in human hepatoma cell line hepg2 cells tasp1, and prpf3 
are novel disease candidate genes targeted by hnf4alpha splice variants in hepatocellular carcinomas a strategy for the comparative analysis of serum proteomes for the discovery of biomarkers for hepatocellular carcinoma proteomic analysis of sera from hepatocellular carcinoma patients after radiofrequency ablation treatment proteomics for hepatocellular carcinoma marker discovery liver-specific deletion of prohibitin 1 results in spontaneous liver injury, fibrosis, and hepatocellular carcinoma in mice reduced argininosuccinate synthetase is a predictive biomarker for the development of pulmonary metastasis in patients with osteosarcoma methyl transfer in glycine n-methyltransferase. a theoretical study a novel tumor suppressor function of glycine n-methyltransferase is independent of its catalytic activity but requires nuclear localization mir-122 targets pyruvate kinase m2 and affects metabolism of hepatocellular carcinoma serum aldolase isoenzymes in benign and malignant liver disease aminoacyl trna synthetases and their connections to disease lentiviral vector-mediated shrna against aimp2-dx2 suppresses lung cancer cell growth through blocking glucose uptake overexpression and oncogenic function of aldo-keto reductase family 1b10 (akr1b10) in pancreatic carcinoma aldo-keto reductase family 1 b10 gene silencing results in growth inhibition of colorectal cancer cells: implication for cancer intervention enoyl coenzyme a hydratase 1 is an important factor in the lymphatic metastasis of tumors expression of dihydrodiol dehydrogenase and resistance to chemotherapy and radiotherapy in adenocarcinoma cells of lung butyrate metabolism in human colon carcinoma cells: implications concerning its growth-inhibitory effect the warburg effect dictates the mechanism of butyrate-mediated histone acetylation and cell proliferation biological roles and therapeutic potential of hydroxy-carboxylic acid receptors autotaxin has lysophospholipase d activity leading to tumor cell 
growth and motility by lysophosphatidic acid production farnesyl pyrophosphate synthase: a key enzyme in isoprenoid biosynthetic pathway and potential molecular target for drug development over-expression of genes and proteins of ubiquitin specific peptidases (usps) and proteasome subunits (pss) in breast cancer tissue observed by the methods of rfdd-pcr and proteomics arginase 2 is expressed by human lung cancer, but it neither induces immune suppression, nor affects disease progression nuclear localization and cytosolic overexpression of lasp-1 correlates with tumor size and nodal-positivity of human breast carcinoma regulation of cell migration and survival by focal adhesion targeting of lasp-1 serpinb1 regulates homeostatic expansion of il-17+ gammadelta and cd4+ th17 cells sorcin, an important gene associated with multidrug-resistance in human leukemia cells autoantibodies to ca2+ binding protein calnuc is a potential marker in colon cancer detection fine-needle aspiration of thyroid nodules: proteomic analysis to identify cancer biomarkers v-crk activates the phosphoinositide 3-kinase/akt pathway by utilizing focal adhesion kinase and h-ras tumors and fibrinogen. 
the role of fibrinogen as an extracellular matrix protein fibrinogen is an important determinant of the metastatic potential of circulating tumor cells plasminogen receptors and their role in the pathogenesis of inflammatory, autoimmune and malignant disease urokinase receptor primes cells to proliferate in response to epidermal growth factor invasion and metastasis of hepatocellular carcinoma in relation to urokinase-type plasminogen activator, its receptor and inhibitor urokinase-type plasminogen activator receptor transcriptionally controlled adenoviruses eradicate pancreatic tumors and liver metastasis in mouse models plasminogen activator system and its clinical significance in patients with a malignant disease the urokinase plasminogen activator system: role in malignancy the interaction of hepatic lipid and glucose metabolism in liver diseases apolipoprotein e is required for cell proliferation and survival in ovarian cancer quantitative evaluation of aldo-keto reductase expression in hepatocellular carcinoma (hcc) cell lines the redox state of glutathione regulates the hypoxic induction of hif-1 controlled elimination of intracellular h(2)o(2): regulation of peroxiredoxin, catalase, and glutathione peroxidase via post-translational modification rage signaling significantly impacts tumorigenesis and hepatic tumor growth in murine models of colorectal carcinoma binding of s100 proteins to rage: an update rage and rage ligands in cancer identification of galectin-3 as a high-affinity binding protein for advanced glycation end products (age): a new member of the age-receptor complex epidermal growth factor receptor ligands as new extracellular targets for the metastasis-promoting s100a4 protein rage-binding s100a8/a9 promotes the migration and invasion of human breast cancer cells through actin polymerization and epithelial-mesenchymal transition advanced glycation end product (age) receptor 1 suppresses cell oxidant stress and activation signaling via egf 
receptor identification of candidate epigenetic biomarkers for ovarian cancer detection phosphoglycerate dehydrogenase diverts glycolytic flux and contributes to oncogenesis intracellular compartmentation, structure and function of creatine kinase isoenzymes in tissues with high and fluctuating energy demands: the 'phosphocreatine circuit' for cellular energy homeostasis creatine and creatinine metabolism mitochondrial creatine kinase as a tumor-associated marker elevated creatine kinase activity in primary hepatocellular carcinoma different behavior of agmatine in liver mitochondria: inducer of oxidative stress or scavenger of reactive oxygen species? hypoxia signaling pathways in cancer metabolism: the importance of co-selecting interconnected physiological pathways arginine deprivation as a targeted therapy for cancer
we gratefully acknowledge support from the virtual liver network (grant 031 6154) of the german federal ministry of education and research (bmbf) to jb.
additional file 1: table s1. antibodies and dilutions used to study disease regulation by immunohistochemistry.
additional file 2: table s2. statistics and selection criteria for functional grouping of pathways terms of disease regulated proteins. a hypergeometric test followed by bonferroni correction was used with p-value ≤ 0.05. for grouping of pathway terms, the kappa score threshold was set to 0.4.
additional file 3: table s3. different master regulators and associated network for proteins expressed in tumours only (to) or significantly regulated when compared to healthy liver of non-transgenic control animals (t + ur + dr).
additional file 4: table s4.
comparison of disease regulated proteins in mouse and human hcc. data were taken from 'the human protein atlas' database.
additional file 5: table s5. comparison of transcriptomic and proteomic data. this comparison revealed 19 significantly regulated genes, of which 15 are regulated in common, whereas for 4 genes transcript expression was opposite to that of the coded proteins.
additional file 6: table s6. biological processes ontology for significantly regulated proteins.
additional file 7: table s7. cellular component ontology for significantly regulated proteins.
additional file 8: table s8. molecular function ontology for significantly regulated proteins.
additional file 9: table s9. biological pathways and their clusters. cytoscape 3.0.2 with plugins (cluego and cluepedia) was used to generate a functionally grouped network of pathways based on the reactome, kegg and wikipathways databases. the grouping of significant pathway terms (p ≤ 0.05) is based on a kappa score of 0.4, an initial group size of 2 and a sharing group percentage of 50.
additional file 10: table s10. biological pathway information for each significantly regulated protein (to + ur + dr). this table depicts pathway terms for nearly 80% of the regulated proteins (to + ur + dr), taken from the reactome and kegg databases.
additional file 11: figure s1. master regulatory networks for proteins uniquely expressed in tumours with link to egfr signalling: (a) the plaur network consists of 44 proteins of which 20 are tumour-specifically regulated, (b) the fgfr1 network consists of 41 proteins of which 18 are tumour-specifically regulated, (c) the ptbp1 network consists of 44 proteins of which 18 are tumour-specifically regulated, (d) the agtrap network consists of 43 proteins of which 19 are tumour-specifically regulated and (e) the s100a1 network consists of 38 proteins of which 16 are tumour-specifically regulated.
note, all networks display connectivity to egfr signalling (yellow coloured inner node), and in the case of the s100a1 master regulatory protein egfr signalling is via the plaur/egfr network. the master regulator, the connecting proteins (network elements) and regulated proteins are given as red, green and blue coloured inner nodes, respectively. furthermore, each node is partitioned into four segments, where the first segment seen from the left refers to tumour specific proteins and is red-coloured. the second, third and fourth segments refer to up- and down-regulated proteins, tumour specific gene expression changes and gene regulations in transgenic non-tumour liver tissue, respectively. increased expression of either proteins or genes is given in red, whereas the blue colour denotes repressed expression.
additional file 12: table s11. integrated hybrid network with master regulator information for proteins expressed in tumours only (to) and significantly regulated proteins when compared to healthy liver of non-transgenic control animals (t + ur + dr).
additional file 13: figure s2. master regulatory networks for regulated proteins with link to egfr signalling: (a) the pdia4 network consists of 68 proteins including 32 significantly regulated proteins, (b) the apeh network consists of 68 proteins including 32 significantly regulated proteins, (c) the pebp1 network consists of 63 proteins including 31 significantly regulated proteins, (d) the apoe network consists of 66 proteins including 31 significantly regulated proteins, (e) the arg1 network consists of 67 proteins including 31 significantly regulated proteins, (f) the fbp1 network consists of 69 proteins including 31 significantly regulated proteins, (g) the haao network consists of 66 proteins including 32 significantly regulated proteins. note, all networks display connectivity to egf protein (yellow coloured inner node).
the master regulator, the connecting proteins (network elements) and regulated proteins are given as red, green and blue coloured inner nodes, respectively. furthermore, each node is partitioned into four segments, where the first segment seen from the left refers to tumour specific proteins and is red-coloured. the second, third and fourth segments refer to up- and down-regulated proteins, tumour specific gene expression changes and gene regulations in transgenic non-tumour liver tissue, respectively. increased expression of either proteins or genes is given in red, whereas the blue colour denotes repressed expression.
additional file 14: table s12. protein interaction information of the fused network. given are interacting proteins with association scores from different prediction methods, i.e. neighborhood, gene fusion, co-occurrence, co-expression, databases and textmining.
additional file 15: table s13. biological pathways and their clusters for fused network proteins. reactome, kegg and wikipathways database information was used as the input data file for cytoscape 3.0.2. the table shows the grouping of significant pathway terms (p ≤ 0.01) and is based on a kappa score of 0.4, an initial group size of 2 and a sharing group percentage of 50.
additional file 16: table s14. biological function of newly identified proteins and their previously reported tumour association.
additional file 17: table s15. regulation of genes coding for newly identified proteins in egf2b-transgenic liver tumours.
additional file 18: table s16. regulation of genes coding for common proteins in tumour tissue and serum of egf transgenic mice.
the authors declare that they have no competing interests.
authors' contributions: jb conceived the study and contributed the reagents, gg performed the experiments, ps performed the bioinformatics analysis. jb, gg and ps analysed the data, jb wrote the manuscript and all authors read and approved the final manuscript.
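the statistics named in the captions of the additional files (a hypergeometric test followed by bonferroni correction at p ≤ 0.05, and grouping of pathway terms at a kappa score of 0.4) can be illustrated with a minimal sketch. this is not the authors' pipeline, which used cytoscape with the cluego and cluepedia plugins; the function names, the example protein counts and the assumption that the kappa score is cohen's kappa on shared protein membership are all hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_pvalue(k, population, annotated, drawn):
    """Upper-tail probability P(X >= k) of a hypergeometric distribution.

    population: proteins in the background set (illustrative value)
    annotated:  background proteins that belong to the pathway term
    drawn:      regulated proteins that were tested
    k:          regulated proteins that belong to the pathway term
    """
    k = max(k, 0)
    upper = min(annotated, drawn)
    hits = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return hits / comb(population, drawn)


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def kappa_score(term_a, term_b, universe):
    """Cohen's kappa of two terms seen as binary membership vectors."""
    universe = set(universe)
    if not universe:
        raise ValueError("universe must not be empty")
    a = set(term_a) & universe
    b = set(term_b) & universe
    n = len(universe)
    both = len(a & b)
    neither = n - len(a | b)
    observed = (both + neither) / n
    expected = (len(a) * len(b) + (n - len(a)) * (n - len(b))) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: both terms empty or both cover everything
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical counts: (hits, background, annotated, regulated) per term.
    tests = {"term_1": (8, 2000, 40, 100), "term_2": (3, 2000, 60, 100)}
    raw = [hypergeom_pvalue(*args) for args in tests.values()]
    for name, p_adj in zip(tests, bonferroni(raw)):
        print(name, "significant" if p_adj <= 0.05 else "not significant")

    # Two terms are grouped when their kappa score reaches the threshold.
    kappa = kappa_score({"p1", "p2", "p3"}, {"p2", "p3", "p4"},
                        {f"p{i}" for i in range(1, 21)})
    print("group terms:", kappa >= 0.4)
```

the sketch uses only the python standard library; with real data the background set and the term annotations would come from the pathway databases named in the captions.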
key: cord-007208-wnkjdg6y authors: li, sheng-hsiang; lee, robert kuo-kuang; hsiao, ya-ling; chen, yee-hsiung title: demonstration of a glycoprotein derived from the ceacam10 gene in mouse seminal vesicle secretions(1) date: 2005-09-01 journal: biol reprod doi: 10.1095/biolreprod.105.039651 sha: doc_id: 7208 cord_uid: wnkjdg6y ceacam10 was purified from mouse seminal vesicle secretions by a series of purification steps that included ion exchange chromatography on a deae-sephacel column and ion exchange high-performance liquid chromatography on a sulfopropyl column. it was shown to be a 36-kda glycoprotein with an n-linked carbohydrate moiety. the circular dichroism spectrum of ceacam10 in 50 mm phosphate buffer at ph 7.4 appeared as one negative band arising from the β form at 217 nm. ceacam10 was expressed predominantly in seminal vesicles of adult mice. both ceacam10 and its mrna were demonstrated on the luminal epithelium of the mucosal folds in the seminal vesicle. the amount of ceacam10 mrna in the seminal vesicle was correlated with the stage of animal maturation. castration of adult mice resulted in cessation of ceacam10 expression, while treatment of castrated mice with testosterone propionate in corn oil restored ceacam10 expression in the seminal vesicle. during the entire course of pregnancy, ceacam10 might be silent in the embryo. a cytochemical study illustrated the presence of the ceacam10 binding region on the entire surface of mouse sperm. ceacam10-sperm binding greatly enhanced sperm motility in vitro. mammalian sperm display an intriguing sense of timing in undergoing some modification during their transit in the reproductive tract before encountering an egg. studying how the lumen of the reproductive tract affects sperm function is a prerequisite to unraveling the molecular mechanisms underlying the complex modification of sperm.
factors that affect sperm motility have been reported in the seminal plasma of several mammals including the pig [1-3], bovine [4], mouse [5], and human [6]. the seminal vesicle is a male accessory sexual gland found in many of the more than 4000 mammalian species alive on the earth today. after puberty, the gland secretes a fluid called seminal vesicle secretion (svs), which accumulates in its lumen. svs contains both protein and nonprotein components. when ejaculated, svs squirts into the urethra, contributing the major part of the liquid portion of seminal plasma, which is the complex biological fluid formed from the mixing of various fluids in the male reproductive tract. it has been found that extirpation of the seminal vesicle from mice and rats greatly reduces fertility [7, 8], demonstrating the importance of svs to sperm modification under natural circumstances. svs differs extensively in terms of volume and composition in various species of mammals. however, rodents have proven to be good experimental animals for the molecular study of mammalian reproduction, so attempts have been made to isolate the proteins involved in sperm modification from mouse svs, which contains several minor proteins and seven well-defined major proteins designated svs i-vii, named in decreasing order of molecular mass according to their mobility in sds-page [9]. previously, we demonstrated that svs vii enhances sperm motility [10] and that two of the minor proteins modulate sperm activity. one is a caltrin-like trypsin inhibitor/p12, which suppresses ca2+ uptake by sperm [11], and the other is a seminal vesicle autoantigen, which serves as a decapacitation factor [12, 13]. here we report the purification and identification of an androgen-stimulated 36-kda glycoprotein, a minor protein component of mouse svs that is able to enhance sperm motility in vitro.
we have demonstrated that its core protein is derived from the ceacam10 gene [14], which is a member of the cell adhesion molecule (cam) subgroup belonging to the carcinoembryonic antigen (cea) family.
fig. 1. a) fractionation of soluble mouse svs proteins by ion exchange chromatography on a deae-sephacel column. b) resolution of fraction iii sample from a by ion-exchange hplc on an sp column. c) demonstration of the glycoprotein nature. each of the a-to-d peaks from b was digested with n-glycosidase f. the parent proteins (lane 1, peak a; lane 3, peak b; lane 5, peak c; lane 7, peak d) and their deglycosylated forms (lanes 2, 4, 6, and 8) were identified by sds-page on a 12% polyacrylamide gel slab. the proteins in the gel were stained with coomassie brilliant blue.
the following materials were obtained from commercial sources: deae-sephacel (amersham pharmacia biotech, uppsala, sweden); protein pak sp 5pw column (waters, milford, ma); vydac 218tp54 c18 column (separations group, hesperia, ca); aminolink coupling gel, bicinchoninic protein assay kit (pierce, rockford, il); testosterone propionate, nitroblue tetrazolium, 5-bromo-4-chloro-3-indolyl phosphate (bcip), pmsf, periodic acid schiff reagent, and silanated glass slides (sigma chemical co., st louis, mo); cdna integrity kit, alkaline phosphatase-conjugated streptavidin, and biotin-conjugated goat anti-rabbit immunoglobulin g (igg; kirkegaard & perry laboratories, gaithersburg, md); rhodamine-conjugated goat anti-rabbit igg (zymed laboratories, san francisco, ca); nuclear fast red (vector laboratories, burlingame, ca); enhanced chemiluminescent substrate and [α32 
pgem-t-easy vector (promega, madison, wi); ultraspec-ii rna isolation kit (biotecx laboratories, inc., houston, tx); thermoscript reverse transcriptase (invitrogen life technologies, carlsbad, ca); n-glycosidase f, taq dna polymerase (takara, shiga, japan); and mouse embryo-stage blots designed to observe gene expression during pregnancy (seegene, inc., korea). all other chemicals were reagent grade. outbred icr mice were purchased from charles river laboratories (wilmington, ma) and were maintained and bred in the animal center at the college of medicine, national taiwan university. animals were treated according to institutional guidelines for the care and use of experimental animals. they were housed under controlled lighting (14l:10d) at 21-22°c and were provided with water and nih-31 laboratory mouse chow ad libitum. for investigation of androgenic effects, 8-wk-old adult male mice, which had been castrated 3 wk earlier, received a daily s.c. injection of testosterone propionate in corn oil (5 mg/kg body weight) for 8 consecutive days. control animals received corn oil only. seminal vesicles were removed from the animals 12 h after the last injection. normal adult mice (8 to 12 wk old) were killed by cervical dislocation. the seminal vesicles of 30 mice were carefully dissected to free them from the adjacent coagulating glands, and the secretions were squeezed directly into 50 ml of ice-cold 10 mm tris-hcl in the presence of 1 mm pmsf at ph 8.0. after centrifugation at 10 000 × g for 15 min, the supernatant was resolved by ion exchange chromatography on a deae-sephacel column (12 × 2.6 cm) that had been pre-equilibrated with 10 mm tris-hcl at ph 8.0. after the nonretarded fractions were washed out, the column was eluted with 0.5 m nacl in the same buffer at a flow rate of 18 ml/h. fractions (4 ml) were collected, and their absorbance at 280 nm was recorded (fig. 1a). 
fraction iii was further subjected to ion-exchange high-performance liquid chromatography (hplc) on a sulfopropyl (sp) column (7.5 cm × 7.5 mm). the column was sequentially eluted with three linear gradients, including 0%-15%, 15%-40%, and 40%-80% of 0.6 m nacl in 20 mm sodium acetate at ph 6.0 at a flow rate of 1.0 ml/min for 60 min (see fig. 1b). thirty micrograms of each peak (a-d) in 50 μl of 100 mm tris-hcl ph 8.6 were digested with 0.6 μg of trypsin-tpck at 37°c for 18 h. the reaction was stopped by adding 100 μl of 0.1% trifluoroacetic acid and the reaction mixture was resolved on a reverse-phase c18 column (4.6 × 250 mm). the column was eluted with a linear gradient of 0%-80% acetonitrile at a flow rate of 1.0 ml/min for 60 min (see fig. 2a). the n-glycoconjugate was removed from a glycoprotein by following the method described by tarentino and plummer [15]. the protein was boiled in 1.0% sds and incubated with n-glycosidase f (40 u/mg of protein) in 20 mm sodium phosphate at ph 7.2 in the presence of 50 mm edta, 0.5% nonidet p-40, and 10 mm sodium azide for 16 h at 37°c. the concentration of ceacam10 was determined using the bicinchoninic acid protein assay [16] according to the manufacturer's instructions. the amino acid sequence was determined using automated edman degradation with a 492 protein sequencer with an online 140 c analyzer (applied biosystems, foster city, ca). the circular dichroism (cd) spectra were measured with a jasco j-700 spectropolarimeter under constant flushing with n2 at room temperature. the mean residue ellipticity ([θ]) was estimated from the mean residue weight, which was calculated from the primary structure. antisera against ceacam10 were raised in new zealand white rabbits. the anti-ceacam10 antibody was purified from the antiserum on a column (2.5 × 1.0 cm) of aminolink gel coupled with the purified antigen according to our previously described method [17]. 
proteins were resolved using sds-page on a 12% gel slab (8.2 × 7.3 × 0.075 cm) according to the method described by laemmli [18]. the proteins on the gel were stained with coomassie brilliant blue or transferred to a nitrocellulose membrane using an electroblotting method, which was conducted at 35 v at 4°c for 18 h in a solution of 25 mm tris-hcl, 197 mm glycine, and 13.3% methanol. membranes were blocked with 10% normal goat serum in pbs for 2 h, and then incubated with anti-ceacam10 antibody (0.5 μg/ml) in the blocking solution for 1 h at room temperature. after gently agitating in four changes of pbs for 15 min each, they were immunoreacted with horseradish peroxidase-conjugated goat anti-rabbit igg diluted to 1:15 000 in the blocking solution for 1 h. immunoreactive bands were revealed using an enhanced chemiluminescent substrate according to the manufacturer's instructions. mouse seminal vesicles were fixed in bouin solution, embedded in paraffin, and 8-μm serial cross-sections were mounted on silanated glass slides. deparaffinized sections were blocked in blocking solution for 1 h at room temperature and then incubated with affinity-purified anti-ceacam10 antibody at a concentration of 1 μg/ml in the blocking solution for 1 h. the slides were gently agitated in three changes of washing solution for 10 min each and then treated with biotin-conjugated goat anti-rabbit igg (1 μg/ml) in the blocking solution for 1 h at room temperature. the slides were washed again as described above and then incubated with alkaline phosphatase-conjugated streptavidin (1 μg/ml) in blocking solution for 1 h at room temperature. protein signals were observed after the slides were incubated for 10 min with 0.0375% nitroblue tetrazolium and 0.0188% bcip in a solution of 100 mm tris-hcl, 100 mm nacl, and 5 mm mgcl2 at ph 9.5. the slides were washed in three changes of water for 3 min each and then counterstained with nuclear fast red for 3 min. finally, the slides were briefly washed with water and photographed using a brightfield microscope (ah3-rfca; olympus, tokyo, japan).
fig. 2. identification of the 36-kda glycoprotein derived from ceacam10 and its circular dichroism. a) the trypsin-digested sample of peak c from figure 1b was resolved by reverse-phase hplc on a c18 column (see materials and methods). b) the protein sequence was deduced from the reading frame of ceacam10 cdna (genbank accession number nm_007675). the initial and stop codons are underlined. the potential n-linked glycosylation sites are denoted by open boxes. the deduced protein sequence and the amino acid sequences determined directly from protein analysis for peaks a to d in figure 1b and peaks 9 and 18 in figure 2a agree in all positions except that asn11 from the cdna-deduced protein was not identified in protein sequencing. the cleavage points for the generation of mature protein are indicated by an arrow. c) circular dichroism of ceacam10 in 50 mm phosphate buffer at ph 7.4 at room temperature.
seminal vesicle tissues were oriented in a small aluminum foil cup and frozen immediately in tissue-tek oct medium. embedded samples were then stored at −80°c before further processing. serial 8-μm cryostat sections were mounted on uncoated glass slides and stored at −80°c. before use, sections were fixed in 70% ethanol and stained with a histogene lcm frozen section-staining kit following the supplier's protocol. slides were dipped in xylene twice for 5 min each time and then air-dried. mucosal folds or smooth muscle cells were harvested using a pixcell ii lcm system (arcturus engineering, mountain view, ca). captured tissues from three sections were collected on capsure hs lcm caps containing transfer film. tissue samples from three different animals were pooled for subsequent analysis. total rna was extracted from the captured cells by using the picopure rna isolation kit, followed by treatment with dnase i before cdna synthesis. 
the total rna was reverse-transcribed using thermoscript reverse transcriptase for first-strand cdna synthesis according to the manufacturer's instructions. a cdna integrity kit was employed to examine the quality of the cdna. the qualified cdna samples were used as templates for pcr. the primer pair for amplification of a 237-base pair (bp) ceacam10 gene fragment was a forward primer (5′-tatgctatttcaaaacttcgatcat-3′), which corresponds to sequence 822-846, and a reverse primer (5′-gttatgcggactttattg-3′), which corresponds to sequence 1058-1041 (genbank accession number nm_007675). the primer pair for amplification of a 557-bp mouse glyceraldehyde-3-phosphate dehydrogenase (gapd) dna fragment was a forward primer (5′-cggcaaattcaacggcacagt-3′), which corresponds to sequence 199-219, and a reverse primer (5′-tgggggtaggaacacggaagg-3′), which corresponds to sequence 755-735 (genbank accession number xm_194302). pcr reaction mixtures consisted of 2 μl of template, 1. for 30 sec at 53°c, extension for 30 sec at 72°c, and a final extension for 7 min at 72°c. the polymerase chain reaction (pcr) products were analyzed by electrophoresis on a 2.0% agarose gel. the identity of pcr products was confirmed by cloning and sequencing. dna sequencing was carried out with an abi prism 377-96 dna sequencer using the abi prism bigdye terminator cycle sequencing ready reaction kit (applied biosystems). total rna was extracted from tissue homogenates using an ultraspec-ii rna isolation kit. a pcr-amplified fragment of ceacam10 cdna (237 bp), which was inserted in pgem-t-easy, and a cdna fragment of the mouse gapd gene (1233 bp), which was inserted in pgem3 vector, were used as templates to prepare 32p-labeled cdna probes using a promega random-priming kit. rna samples (20 μg) were subjected to denaturing 1.0% agarose-formaldehyde gel electrophoresis and then blotted onto nylon membranes by capillary transfer as previously described [19]. 
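the primer coordinates quoted above for the ceacam10 and gapd fragments can be checked against the stated product sizes with a few lines of arithmetic. the following is a minimal sketch, not part of the original protocol; the function names `amplicon_length` and `primer_length` are hypothetical, and the only assumption is that the coordinates are 1-based, inclusive positions on the genbank entries.

```python
def amplicon_length(forward_5prime: int, reverse_5prime: int) -> int:
    """Product size in bp, given the 1-based positions of the 5' ends
    of the forward and reverse primers on the reference sequence."""
    return reverse_5prime - forward_5prime + 1


def primer_length(five_prime: int, three_prime: int) -> int:
    """Primer length in nt from its inclusive 5' and 3' coordinates.
    abs() handles reverse primers, whose coordinates run downward."""
    return abs(three_prime - five_prime) + 1


# Coordinates and primer sequences as quoted in the text.
ceacam10_fwd = "tatgctatttcaaaacttcgatcat"  # positions 822-846
ceacam10_rev = "gttatgcggactttattg"         # positions 1058-1041
gapd_fwd = "cggcaaattcaacggcacagt"          # positions 199-219
gapd_rev = "tgggggtaggaacacggaagg"          # positions 755-735

print(amplicon_length(822, 1058))  # ceacam10 fragment size in bp
print(amplicon_length(199, 755))   # gapd fragment size in bp
```

the same helpers confirm that each primer length matches its coordinate span, which is a cheap way to catch transcription errors in a primer table.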
after incubation with the prehybridization buffer (50% deionized formamide, 6× ssc, 5× denhardt solution, 1.0% sds, and 100 μg/ml of sheared salmon sperm dna) for 2 h at 50°c, the membranes were hybridized with one labeled probe overnight at 50°c. following hybridization, the membranes were washed using standard procedures. rna messages on one filter membrane were observed after autoradiography and the probes were removed from the membranes as previously described [19]. the same membrane was then hybridized with another labeled probe. thus, hybridization with ceacam10 or gapd cdna probe was performed on the same filter membrane. in accordance with a method previously used [20], a modified tyrode buffer, which consisted of 124.7 mm nacl, 2.7 mm kcl, 0.5 mm mgcl2, 0.4 mm nah2po4, 5.6 mm glucose, 0.5 mm sodium pyruvate, 15 mm nahco3, 10 mm hepes, 100 iu/ml penicillin, and 100 μg/ml streptomycin was adjusted to ph 7.3-7.4 by aeration with humidified air/co2 (19:1) in an incubator for 48 h at 37°c before use. mouse epididymides were removed and immersed in the medium. after careful dissection from the connective tissue, spermatozoa were extruded from the distal portion of the tissues for 10 min at 37°c. the cells were gently filtered through two layers of nylon gauze, layered on top of a linear gradient of 20%-80% percoll (v/v), and centrifuged at 275 × g for 30 min at room temperature [21, 22]. three distinct cell layers formed. the lowest layer, which contained cells with progressive motility, was washed with three volumes of the medium and collected using centrifugation at 60 × g for 10 min at room temperature. the sperm were resuspended and centrifuged two more times in a similar manner. the cell pellets were resuspended, and cacl2 was added to the culture medium at a final concentration of 1.8 mm before the sperm were assayed. 
freshly prepared epididymal spermatozoa (10^6 cells/ml) were blocked in pbs containing 10% normal goat serum for 30 min at room temperature. the cells were further incubated with 1 μm ceacam10 for 1 h. at the end of incubation the cells were centrifuged and the cell pellets were washed with pbs to remove the unbound ligands. the cells were air-dried on a glass slide and washed twice with pbs. the slides were incubated with the affinity-purified anti-ceacam10 antibody at a concentration of 1 μg/ml in blocking solution for 1 h. the slides were washed three times with pbs to remove excess antibody before they were incubated with rhodamine-conjugated goat anti-rabbit igg diluted to 1:500 in blocking solution for 40 min. all slides were then washed with pbs, covered with 50% (v/v) glycerol in pbs, and photographed with a fluorescence microscope (axioplan 2 imaging; carl zeiss, oberkochen, germany). ejaculated sperm were collected from semen present in the uterine cavity of three vagina-plugged female mice. after extensive washing with pbs the ejaculated sperm, without incubation with exogenous ceacam10, were smeared onto slides for immunolocalization of ceacam10 as mentioned above. sperm motility was determined using a computer-assisted sperm assay (casa) with a sperm motility analyzer (ivos version 10; hamilton-thorne research, beverly, ma). a 10-μl sample was placed in a 10-μm-deep makler chamber at 37°c. the analyzer was set as follows: negative phase-contrast optics and recording at 60 frames/sec; minimum contrast, 40; minimum cell size, 4 pixels; low-size gate, 0.2; high-size gate, 1.5; low-intensity gate, 0.5; high-intensity gate, 1.5; nonmotile head size, 29; nonmotile head intensity, 76; medium average path velocity, 50 μm/sec; low path velocity, 7.0 μm/sec; slow motile cells, yes; and threshold straightness, greater than 80%. ten fields were assessed for each sample.
the fresh preparation of soluble svs was divided into fractions i to iv by ion exchange chromatography on a deae-sephacel column (fig. 1a). the fraction iii sample was further resolved into peaks a-e (fig. 1b) by ion exchange hplc on an sp column. peak e was a fad-dependent sulfhydryl oxidase (unpublished observation). on reducing sds-page gel, each of the a-to-d peak samples gave one rather broad 36-kda band that could be stained with either coomassie brilliant blue or periodic acid-schiff reagent, demonstrating their glycoprotein nature (fig. 1c, lanes 1, 3, 5, and 7). each protein sample could be deglycosylated either by trifluoromethane sulfonic acid or exhaustive digestion with n-glycosidase f to a core protein that was identified as a sharp band between 26 and 28 kda by sds-page (fig. 1c, lanes 2, 4, 6 and 8), indicating a similar molecular mass of the protein cores. apparently, peaks a-d were glycoproteins with an n-linked carbohydrate moiety. they were purified to homogeneity. automated edman degradation of each of the a-to-d peak samples for 14 cycles gave reliable data, which were assembled into the n-terminal sequences. avppxvtadnnvll was determined from the peak a sample. two amino acids were detected in each cycle during protein analysis for each peak (b to d). the actual yield of the two sequences in an individual cycle was such that the ratio of the major sequence to the minor one was estimated to be 2.5-3.0:1. assembly of the major and minor sequences gave a peptide sequence of aqvtveavppxvta and qvtveavppxvta, respectively. the three n-terminal peptides were completely confirmed in the ceacam10-deduced protein consisting of 265 amino acid residues in all positions except that x (asparagine), one potential site for an n-linked carbohydrate in ceacam10, was not identified in the protein sequencing (fig. 2b). 
the post-translational cleavage at the peptide bond between glu and ala, thr and ala, or ala and gln in the signal peptide of the putative ceacam10 sequence gives rise to a peak a protein or to peak b-to-d proteins. as a result, peaks a to d share a very similar protein core with a slight difference in their n-terminal sequences. thereafter, we combined them for further study. among the svs protein components on sds-page gel, antibody against ceacam10 immunoreacted only to a 36-kda protein band corresponding to the antigen, showing high specificity of the antibody. taken together, these data indicate that peaks a to d are translational products of the ceacam10 gene. each peak (a to d) was digested with trypsin, and the digests were subjected to hplc on a c18 column. the chromatographic patterns of the four trypsin-digested samples were very similar. one representative chromatogram is shown in figure 2a. three amino acids were identified in each cycle of automated edman degradation of peak 9 on figure 2a. these data could be assembled to three peptide sequences of vfywyk, etiysn, and aiywyr in ceacam10. the peptide sequence of ndegayaldmlfqnf in ceacam10 was completely confirmed by automated edman degradation of peak 18 (see fig. 2b). ceacam10 was stable in 10 mm tris-hcl at ph 8.0, but it was degraded to an 18-kda protein component in 5% acetic acid (not shown). the cd spectrum of this protein at ph 7.4 shows a negative band with a minimum mean residue ellipticity of 12 000 deg cm2 dmol−1 at 217 nm (fig. 2c). in addition, a positive band appeared as the cd profile extending below 200 nm. the spectral profile in the uv region shows some resemblance to that of the β form of protein conformation [23] [24] [25] [26], suggesting the presence of a considerable amount of β form, a β turn, or both in the protein molecule. 
we examined the distribution of ceacam10 and its rna message in the tissue homogenates of reproductive glands, including the seminal vesicle, epididymis, testis, coagulating gland, vas deferens, prostate, uterus, and ovary. the rna message was predominantly detected in the seminal vesicles (fig. 3a). this was confirmed by the results of western blot analysis showing that ceacam10 was abundant in the seminal vesicle; a trace appeared in the epididymis and prostate only after overexposure during autoradiography (fig. 3b). when equal amounts of total rna from the homogenate of a nonreproductive organ were compared with those of the seminal vesicle, very little to no ceacam10 mrna was found in brain, heart, lung, liver, spleen, kidney, stomach, small intestine, muscle, skin, and thymus (not shown). ceacam10 was immunolocalized primarily to the luminal epithelium of the mucosal folds in the seminal vesicle slides of adult mice (fig. 4a). the smooth muscle layer contained almost none. the strong immunochemical staining in the lumen supports the view that ceacam10 accumulates in the lumen as a result of its secretion from the luminal epithelium. further, we separated mucosal epithelial cells and smooth muscle cells from the tissue slices by lcm. ceacam10 transcripts were relatively abundant in the mucosal cells, but only trace amounts of the rna message appeared in the smooth muscle cells (fig. 4b). the amounts of ceacam10 mrna in the seminal vesicles of mice at different ages were compared. the rna message first appeared at a considerable level in 3-wk-old mice. thereafter, the amount of transcript began increasing rapidly at 4 wk and reached a maximum in 7-wk-old mice (fig. 5a). we analyzed mouse embryos from 4.5 to 18.5 days postcoitus (d.p.c.). the 4.5-6.5 d.p.c. samples included early stage embryos, extraembryonic tissue, and maternal uterus; the 7.5-9.5 d.p.c. samples included embryos and extraembryonic tissues, and the 10.5-18.5 d.p.c. samples were solely embryos. 
the rna message in the embryo samples was present in trace amounts on 5.5 d.p.c., increased remarkably from 6.5 d.p.c. to a maximum on 9.5 d.p.c., and rapidly declined thereafter to an almost undetectable level until delivery (fig. 5b). because seminal vesicle growth is known to be androgen-dependent, we examined how androgen influenced ceacam10 expression in the seminal vesicles of adult mice that had been castrated 3 wk earlier (fig. 6). ceacam10 mrna was undetectable in the total rna prepared from the control castrates that had received a daily injection of corn oil only compared with that of normal adults. induction of ceacam10 mrna was observed in the castrates treated with testosterone (5 mg/kg per day) for 8 consecutive days. figure 7a shows micrographs of epididymal spermatozoa with indirect fluorescence staining. no fluorescence was observed on the cells after they were treated successively with the ceacam10 antibody and rhodamine-conjugated anti-rabbit igg, demonstrating a lack of ceacam10 on the cell surface. when spermatozoa were preincubated with 1 μm ceacam10 in a blocking solution at room temperature for 45 min, rhodamine fluorescence was prominent on the middle piece, relatively weak on the tail, and faint on the head (fig. 7a, d). apparently, sperm have ceacam10-binding sites that cover the entire cell surface. to assess the binding of ceacam10 to epididymal sperm upon ejaculation, the ejaculated sperm were directly stained with the antibody. ceacam10 was immunodetected on the surface of the ejaculated sperm, in spite of high fluorescence background due to the free ceacam10 that was difficult to remove completely during cell preparation (fig. 7b, d). most spermatozoa freshly retrieved from the caudal epididymis of mice in modified tyrode buffer were motile with visible tail beating. 
the result of casa for the cell incubation at specified conditions revealed that 90.0 μm ceacam10 in cell culture greatly enhanced sperm motility relative to the motility of control cells at any incubation time (fig. 7c). this work is the first to purify ceacam10 from mouse svs. we demonstrated it to be a 36-kda protein with an n-linked glycoconjugate. asn11, asn54, asn71, and asn191, each being part of the consensus asn-xaa-(ser/thr) [27, 28] in the protein molecule, are the potential acceptor sites for the attachment of the carbohydrate moieties. our results of edman degradation support asn11 as being one n-glycosylated site but rule out asn71 in that role. removal of the hydrophobic leader sequence from the putative form of this protein gives a protein core consisting of 226, 231, or 232 amino acid residues with a calculated molecular mass of 25 270, 25 827, or 25 898 daltons, which is close to the molecular size of the deglycosylated proteins as determined by sds-page (fig. 1c). the cea family consists of ceacam and pregnancy-specific glycoprotein subfamilies. these evolutionarily and structurally divergent glycoproteins of mammals share many common structural features [29]. they are characterized by the assembly of immunoglobulin variable (igv)-like domain and immunoglobulin constant (igc)-like domains in each member of the family. according to the molecular model established by watt et al. [30] and tan et al. [31], there are nine β strands in one immunoglobulin-like domain involved in the maintenance of the three-dimensional architecture. in the ceacam10 molecule, residues 2-96 form one igv-like domain and residues 122-216 form another [14]. this may account for the characteristic cd shown in figure 2c. 
many members of the murine and human ceacam family contain either a transmembrane domain or a glycosyl ptdins moiety, but no such structural element is present in the ceacam10-deduced protein sequence, suggesting that ceacam10 is not a membrane-bound protein [32]. this is substantiated by our demonstration of its secretion from the luminal epithelium of seminal vesicle (fig. 4). in the sexual glands of adult mice, the ceacam10 gene is predominantly transcribed and translated in the seminal vesicle. on ejaculation, the ceacam10-sperm binding may take place in the semen to enhance sperm motility. finkenzeller et al. [33] demonstrated that ceacam10−/− male and female mice developed indistinguishably from wild-type litter mates with respect to sex ratio, weight gain, and fertility, but a significant reduction by 23% of litter size was observed in ceacam10−/− mating. although this may be partially attributed to the lack of ceacam10 in the semen of ceacam10-inactivated males, it remains arguable whether in vitro ceacam10-enhanced sperm motility may play a significant role after coitus under natural circumstances. the maternal decidua surrounding the implantation site was not removed from the mouse conceptus that was used to prepare the commercially available embryo-stage blot for the observation of gene expression during pregnancy. as a result, the appearance of ceacam10 mrna in embryo samples at the blastula or gastrula stage (6.5-9.5 d.p.c.) (fig. 5b) may arise from the maternal decidua as suggested in the study by finkenzeller et al. [33]. this finding, together with the lack of an rna message in embryo samples at 4.5 and 10.5-18.5 d.p.c. (fig. 5b), implies that ceacam10 might be weakly expressed if not silent during the entire course of embryonic development. in fact, we were unable to immunodetect the presence of ceacam10 in the embryo samples obtained by carefully microdissecting the extraembryonic tissues from the conceptus at 8.5 to 18.5 d.p.c. cytochemical observations shown in figure 7 suggest the presence of ceacam10-binding sites on the entire sperm surface. considering the presence of ceacam1 on human sperm cells [34], this raises the possibility that heterophilic adhesion exists between ceacam10 and other ceacam molecules on the mouse sperm surface. this is unlike the action of other sperm motility effectors in the mouse svs, such as svs vii, which binds neutral phospholipid to enhance sperm motility [10], and sva, which predominantly binds membrane phosphatidylcholine to suppress sperm motility [12].
(a and b) or the ceacam10 antibody (c and d) and followed by incubation with rhodamine-conjugated anti-rabbit igg. the slides were observed via light microscopy (a and c) or fluorescence microscopy (b and d). bar = 10 μm. c) freshly prepared mouse spermatozoa in modified tyrode solution (10^5 cells/ml) containing 1.8 mm cacl2 were incubated alone (○) or in the presence of 90 μm ceacam10 (•) at 37°c for 0 to 60 min. cell motility determined at each specified incubation time was expressed as a percentage of control cell motility at time zero. points are mean ± sd for three determinations. *p < 0.01 in a paired statistical comparison with the corresponding control. values were evaluated using one-way analysis of variance.
motility of spermatozoa in hydrosalpingeal and follicular fluid of pigs purification and characterization of a sperm motility-dynein atpase inhibitor from boar seminal plasma. 
mol reprod purification and characterization of reversible sperm motility inhibitors from porcine seminal plasma low molecular weight components in bovine semen diffusate and their effects on motility of bull sperm effects of seminal vesicle fluid components on sperm motility in the house mouse purification and characterization of the active precursor of a human sperm motility inhibitor secreted by the seminal vesicles: identity with semenogelin the role of the seminal vesicles, coagulating glands and prostate glands on the fertility and fecundity of mice effects of seminal vesicle removal on fertility and uterine sperm motility in the house mouse the androgen-dependent mouse seminal vesicle secretory protein iv: characterization and complementary deoxyribonucleic acid cloning a novel heat-labile phospholipid-binding protein, svs vii, in mouse seminal vesicle as a sperm motility enhancer developmental profile of a caltrin-like protease inhibitor, p12, in mouse seminal vesicle and characterization of its binding sites on sperm surface seminal vesicle autoantigen, a novel phospholipid-binding protein secreted from luminal epithelium of mouse seminal vesicle, exhibits the ability to suppress mouse sperm motility a seminal vesicle autoantigen of mouse is able to suppress sperm capacitation-related events stimulated by serum albumin the cea10 gene encodes a secreted member of the murine carcinoembryonic antigen family and is expressed in the placenta, gastrointestinal tract and bone marrow enzymatic deglycosylation of asparagine-linked glycans: purification, properties, and specificity of oligosaccharide-cleaving enzymes from flavobacterium meningosepticum measurement of protein using bicinchoninic acid various forms of mouse lactoferrins: purification and characterization cleavage of structural proteins during the assembly of the head of bacteriophage t4 molecular cloning. 
cold spring harbor recovery, capacitation, acrosome reaction, and fractionation of sperm choosing among different technical variations of percoll centrifugation for sperm selection motility and fertilizing ability of rat epididymal spermatozoa washed by a continuous gradient of percoll determination of the secondary structures of proteins by circular dichroism and optical rotatory dispersion determination of the helix and beta form of proteins in aqueous solution by circular dichroism circular dichroic analysis of protein conformation: inclusion of the beta-turns analysis of the circular dichroism spectrum of proteins using the convex constraint algorithm: a practical guide sequence differences between glycosylated and non-glycosylated asn-x-thr/ser acceptor sites: implications for protein engineering the carcinoembryonic antigen (cea) family: structures, suggested functions and expression in normal and malignant tissues homophilic adhesion of human ceacam1 involves n-terminal domain interactions: structural analysis of the binding site crystal structure of murine sceacam1a[1,4]: a coronavirus receptor in the cea family redefined nomenclature for members of the carcinoembryonic antigen family zimmermann w. carcinoembryonic antigen-related cell adhesion molecule 10 expressed specifically early in pregnancy in the decidua is dispensable for normal murine development soluble isoforms of ceacam1 containing the a2 domain: increased serum levels in patients with obstructive jaundice and differences in 3-fucosyl-n-acetyl-lactosamine moiety key: cord-022499-7d58f1k3 authors: mall, sanjay; malcolm east, j.; lee, anthony g. title: transmembrane α helices date: 2004-01-07 journal: curr top membr doi: 10.1016/s1063-5823(02)52014-7 sha: doc_id: 22499 cord_uid: 7d58f1k3 this chapter discusses effects of intrinsic membrane proteins on lipid bilayers and model transmembrane α helices. 
incorporation of a protein into a lipid bilayer has significant effects on the properties of the bilayer. the rough surface presented by a protein to the surrounding lipid bilayer tends to produce poor packing unless the lipid fatty acyl chains distort to match the surface of the protein. in a liquid crystalline bilayer the lipid fatty acyl chains are disordered, because the chains undergo extensive wobbling fluctuations. the presence of a rigid protein surface reduces the extent of these motional fluctuations. however, the chains tilt and become conformationally disordered to maximize contact with the rough surface of the protein. the net result is that the presence of a protein leads to decreased order for the chains, with a wide range of chain orientations relative to the bilayer normal, but with reduced extent and rate of motion. because of the reduced motion, lipids adjacent to membrane proteins are often referred to as being motionally restricted. it is clear that the reasons for the disorder of the bulk lipids and the disorder of the lipids adjacent to the protein are different; for the bulk phospholipids, the disorder is dynamic, whereas, for the boundary lipids the disorder is static. bilayer of 30 Å, about 20 residues will be required to span the core of the bilayer. the residues in the transmembrane region will be predominantly hydrophobic. however, for membrane proteins such as transporters and ion channels the transmembrane α helices will also have to contain polar groups; the transmembrane helices will then be amphipathic rather than just hydrophobic. an extreme case could be a transmembrane α helix totally surrounded by other α helices in the center of a helical bundle: such a helix would not need to be hydrophobic at all, because it is not in direct contact with the lipid bilayer. 
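the figure of about 20 residues for a 30 Å hydrophobic core follows from the geometry of an ideal α helix, which rises roughly 1.5 Å along its axis per residue. the following is a minimal sketch of that arithmetic; the function name `residues_to_span` and the tilt parameter are hypothetical additions for illustration, and the 1.5 Å rise is the standard textbook value rather than a number given in the text.

```python
import math

HELIX_RISE_PER_RESIDUE = 1.5  # angstroms per residue for an ideal alpha helix


def residues_to_span(thickness: float, tilt_deg: float = 0.0) -> int:
    """Minimum number of residues an ideal alpha helix needs to cross a
    hydrophobic core of the given thickness (angstroms).

    tilt_deg is the angle between the helix axis and the bilayer normal;
    a tilted helix must be longer to cover the same thickness.
    """
    path_length = thickness / math.cos(math.radians(tilt_deg))
    return math.ceil(path_length / HELIX_RISE_PER_RESIDUE)


print(residues_to_span(30.0))  # helix parallel to the bilayer normal
```

under these assumptions a thinner or thicker hydrophobic core changes the required helix length proportionally, which is one way to think about mismatch between a helix and the surrounding bilayer.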
however, although small in number, the available high-resolution structures for membrane proteins show no evidence for the presence of such purely hydrophilic transmembrane α helices. it may be that the process of insertion of membrane proteins into the membrane during biogenesis requires all the transmembrane α helices to be relatively hydrophobic. analysis of the compositions of a large number of membrane proteins predicted to contain single transmembrane α helices has shown that the amino acid composition of a transmembrane α helix is distinctly different from that of hydrophobic α helices in water-soluble proteins. as expected, hydrophobic residues make up the bulk of the residues, the most common being leu (landolt-marticorena et al., 1993; wallin et al., 1997). amino acids essentially excluded are the basic (arg and lys) and acidic (asp and glu) amino acids and their amide counterparts (asn and gln). transmembrane α helices are, however, relatively rich in bulky residues, such as ile, val, and thr, which, in water-soluble proteins, are classed as helix destabilizers (their bulky side chains interfere sterically with the carbonyl oxygen in the preceding turn of the α helix and thus destabilize the helical conformation). thus, factors such as residue volume and packing, which are important in determining helix stability in water-soluble proteins, are not so important for transmembrane α helices, at least for membrane proteins containing single transmembrane α helices; effects of large residue volume will, in the membrane, be balanced by the favorable hydrophobic interactions of a large side chain with the fatty acyl chains. in water-soluble proteins, the gly residue is also classed as a helix breaker, because it is an intrinsically flexible residue with the potential to adopt most of the dihedral angles available in a ramachandran plot. 
the observation that gly is quite common in transmembrane α helices suggests that its potential flexibility is constrained in the bilayer environment. because gly possesses the smallest of all the side chains, it may play a role in mediating helix-helix interactions and packing in the membrane. the polar amino acids most commonly found within transmembrane α helices are cys, thr, and ser. these residues can be stabilized within a hydrophobic environment by hydrogen bonding between their polar side chains and the peptide backbone at positions i − 3 and i − 4 (eilers et al., 2000). figure 1 shows the positional preferences of the residues in the transmembrane α helices of type i membrane proteins with a single transmembrane α helix oriented with its c-terminus on the cytoplasmic side of the membrane (landolt-marticorena et al., 1993).
figure 1. positional preferences for amino acids in the transmembrane domains of human type i membrane proteins with single transmembrane α helices. modified from landolt-marticorena et al. (1993).
the amino-terminal end of the transmembrane domain contains an ile-rich region followed by a val-enriched region. the carboxyl-terminal half of the transmembrane α helix is leu-rich. ala is found randomly distributed throughout the transmembrane domain. aromatic residues are found located preferentially in the boundary regions, with trp at either end of the transmembrane domain, but with tyr and phe only at the carboxyl-terminal boundary. unlike the other aromatic amino acids, phe is also found in the hydrophobic segment as well as in the boundary region. the polar regions flanking the transmembrane domain are enriched in arg and lys on the c-terminal side; asn, ser, and pro are enriched in the n-terminal flanking region. the presence of a positively charged c-terminus (cytoplasmic) could play a role in the process of insertion into the membrane, according to the inside positive rule of von heijne (1996). 
the presence of particular residues at the n- and c-terminal ends of the helices could also be important in meeting the requirement to "cap" the ends of the α helices; the initial four -nh and final four -c=o groups of an α helix have no hydrogen-bonding partners provided by the peptide backbone of the α helix itself, and so suitable hydrogen-bonding partners have to be provided in some other way. one way is to extend the helix by three or four residues at each end with polar residues containing suitable hydrogen-bonding partners such as pro and asn. alternatively, if the hydrophobic, nonpolar residues in the transmembrane α helix extend into the headgroup region, hydrogen bonds could form with suitable groups in the glycerol backbone and headgroup regions of the lipid bilayer. either way, if about 20 residues are required to span the hydrophobic core of the bilayer, the total helix length could be up to 28 residues. the result is that there is a degree of indeterminacy in where the ends of transmembrane helices should be drawn; the precise ends of transmembrane α helices are often not known. the observed preference for trp and tyr residues for the ends of transmembrane α helices agrees with measurements of the binding of small peptides at the lipid-water interface, which show that aromatic residues have a preference for the interface (wimley and white, 1996). further, a number of small tryptophan analogues have been shown to bind in the glycerol backbone and lipid headgroup region of the bilayer, stabilized partly by location of the aromatic ring in the electrostatically complex environment provided by this region of the bilayer, and partly by exclusion of the flat, rigid ring system from the hydrocarbon core of the bilayer for entropic reasons (yau et al., 1998). 
thus, although it is agreed that aromatic residues at the ends of transmembrane α helices probably act as "floats" at the interface serving to fix the helix within the lipid bilayer, it is unclear whether the aromatic rings are located in the hydrocarbon or the headgroup region of the bilayer. this uncertainty is also apparent in the crystal structures of a number of membrane proteins. for example, the trp residues in the bacterial potassium channel kcsa (doyle et al., 1998) are found clustered at the ends of the transmembrane α helices, forming clear bands on the two sides of the membrane, as shown in fig. 2. however, the tyr residues clearly form a band on the periplasmic side of the membrane above the band formed by the trp residues. similarly, in the bacterial photosynthetic reaction center (rees et al., 1994) the majority of the trp residues are found near the periplasmic side of the protein near the ends of the transmembrane α helices, as shown in fig. 3. however, the band of trp residues is more diffuse than in kcsa, and some trp residues are likely to be located in the hydrocarbon core and some in the headgroup region.
figure 2. the crystal structure of the potassium channel kcsa. a cross section with just two of the four identical subunits is shown. trp residues are shown in space-fill representation and tyr residues are shown in ball-and-stick representation. two potassium ions in space-fill representation are shown moving through the channel. the separation between the two planes representing the outer edges of the trp residues is 35 Å. (protein data bank [pdb] file 1bl8.)
figure 3. the structure of the l and m subunits of the photosynthetic reaction center of rhodobacter sphaeroides. trp residues are shown in ball-and-stick representation. an approximate location for the hydrophobic core of the bilayer of thickness 30 Å is shown, as defined by the surface covered by detergent. (pdb file 1aij.)
the average number of 
residues in the transmembrane α helices of the bacterial photoreaction center is 26, corresponding to a length of about 39 Å. the stretch of hydrophobic residues in these helices is, however, only about 19 amino acids or about 28.5 Å long (ermler et al., 1994; michel and deisenhofer, 1990). this matches the thickness of the nonpolar region of the complex (about 30 Å) as defined experimentally as the part covered by detergent in the crystal (roth et al., 1989, 1991). detergent is seen to cover some of the trp residues on the periplasmic side of the membrane, but not others (roth et al., 1991). the distribution of trp residues on the cytoplasmic side of the complex is much less distinct than on the periplasmic side (fig. 3). if the hydrocarbon core of the bilayer around the complex does have a thickness of 30 Å, then again the trp residues on the cytoplasmic side of the membrane will be located in both the hydrocarbon core and the headgroup regions of the bilayer (fig. 3). in the ca2+-atpase of skeletal muscle sarcoplasmic reticulum the situation is more complex, as shown in fig. 4 (toyoshima et al., 2000). many of the transmembrane α helices extend above the membrane surface to form a central stalk linking the transmembrane region to the cytoplasmic head of the protein. as a consequence some of the helices are very long; helix m5, for example, contains 41 residues. a ring of trp residues can be seen on the cytoplasmic side of the membrane helping to define the location of the membrane surface (fig. 4). a lys residue (lys-262) in transmembrane α helix m3 can be seen pointing up from the hydrophobic core of the bilayer like a snorkel. 
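The residue counts and helix lengths quoted for the reaction center (26 residues for about 39 Å, 19 residues for about 28.5 Å) are consistent with an axial rise of 1.5 Å per residue. The short sketch below makes that conversion explicit. The rise value is the standard figure for an ideal α helix and is an assumption here, as are the function names, which are purely illustrative.

```python
import math

# Assumed axial rise per residue of an ideal alpha helix, in angstroms.
# This is the standard textbook value; it is not stated explicitly in the text.
RISE_PER_RESIDUE = 1.5


def helix_length(n_residues, rise=RISE_PER_RESIDUE):
    """Length (angstroms) of an ideal alpha helix with n_residues residues."""
    return n_residues * rise


def residues_to_span(thickness, rise=RISE_PER_RESIDUE):
    """Smallest number of residues whose helix length reaches `thickness` (angstroms)."""
    return math.ceil(thickness / rise)


if __name__ == "__main__":
    for n in (19, 23, 26, 41):
        print(n, "residues ->", helix_length(n), "angstroms")
    print("residues needed for a 30 angstrom core:", residues_to_span(30.0))
```

On this basis a 30 Å hydrophobic core needs about 20 residues, and the 41-residue helix m5 of the ca2+-atpase would be roughly twice as long as the core it crosses.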
because the cost of burying a charged residue in the hydrophobic core of a bilayer is very high (about 37 kj mol^-1 for a lys residue; engelman et al., 1986), it is likely that the amino group on lys-262 will be located at the interface; the trp residues in the ca2+-atpase will then be located in the headgroup region of the bilayer. the structure of the ca2+-atpase is also unusual in that the first transmembrane α helix contains two polar residues, asp-59 and arg-63, pointing out into the hydrocarbon core; presumably, stacking of asp-59 against arg-63 allows formation of an ion pair. the distribution of trp residues on the lumenal face of the ca2+-atpase is much more diffuse than on the cytoplasmic side. the hydrophobic thickness of the ca2+-atpase could be expected to be about 30 Å, because that is the hydrophobic thickness of a bilayer of di(c18:1)pc,1 the phospholipid that supports highest activity for the atpase. however, as shown in fig. 4, this definition locates the lumenal loops between transmembrane α helices m5 and m6 and between m9 and m10 within the hydrocarbon core. further, it locates a lys residue (lys-972) totally within the hydrocarbon core, which seems unlikely. the hydrophobic thickness of the bilayer would have to be about 21 Å to locate the two interhelical loops and lys-972 at the lumenal surface (fig. 4); this is close to the thickness of a bilayer of di(c14:1)pc (22 Å). the crystal structure shown in fig. 4 corresponds to the ca2+-bound, e1 conformation of the atpase.
1 phospholipid designations are pc, ps, and pa for phosphatidylcholine, phosphatidylserine, and phosphatidic acid, respectively. fatty acyl chains are given in the format m:n, where m is the number of carbon atoms and n is the number of double bonds. thus, for example, dioleoylphosphatidylcholine is di(c18:1)pc. 
it has been shown that the e1 conformation of the atpase is favored by di(c14:1)pc, whereas di(c18:1)pc favors the other major conformation of the atpase, e2 (starling et al., 1994). thus, it is possible that conformational changes within the transmembrane region of the ca2+-atpase lead to changes in the interhelical loops and thus to changes in the effective hydrophobic thickness of the atpase (lee and east, 2001). incorporation of a protein into a lipid bilayer can be expected to have significant effects on the properties of the bilayer. the rough surface presented by a protein to the surrounding lipid bilayer will tend to produce poor packing unless the lipid fatty acyl chains distort to match the surface of the protein. in a liquid crystalline bilayer the lipid fatty acyl chains are disordered, because the chains undergo extensive wobbling fluctuations. the presence of a rigid protein surface would be expected to reduce the extent of these motional fluctuations. however, the chains will have to tilt and become conformationally disordered to maximize contact with the rough surface of the protein. the net result is that the presence of a protein will lead to decreased order for the chains, with a wide range of chain orientations relative to the bilayer normal, but with reduced extent and rate of motion. because of the reduced motion, lipids adjacent to membrane proteins are often referred to as being motionally restricted. it is clear, therefore, that the reasons for the disorder of the bulk lipids and the disorder of the lipids adjacent to the protein (the boundary or annular lipids) are different; for the bulk phospholipids the disorder is dynamic, whereas for the boundary lipids the disorder is static. an example is provided by the bacteriorhodopsin trimer, whose crystal structure is unusual in showing a few well-defined lipid molecules (belrhali et al., 1999; luecke et al., 1999). figure 5 shows some of the lipids located at the surface of the trimer. 
the electron densities for the chains are well defined, but the headgroups are disordered, so that the headgroups could not be identified; the lipids were therefore modeled simply as 2,3-di-o-phytanyl-sn-propane (belrhali et al., 1999). the considerable static disorder of the chains is clear in fig. 5, the rotational disorder of the chains being necessary to obtain good van der waals contacts with the molecularly rough surface of the bacteriorhodopsin trimer. lipids on the extracellular side of the membrane are better resolved than those on the cytoplasmic side; the degree of order of the lipids parallels that of the protein, which is also greater on the extracellular side (grigorieff et al., 1996).
figure 5. structures of four phospholipid molecules identified in x-ray crystallographic studies of bacteriorhodopsin (belrhali et al., 1999). the lipids have been modeled as 2,3-di-o-phytanyl-sn-propane. (pdb file 1qhj.)
the average distance between the glycerol backbone oxygens for phospholipids on the two sides of the membrane was 31.6 Å (mitsuoka et al., 1999). as expected, this closely matches the hydrophobic length of the transmembrane helices of bacteriorhodopsin; the mean helix length is 23 residues, corresponding to a length of about 35 Å. the crystal structure also makes clear the very different conformations adopted by the various lipid molecules located on the surface of the trimer. for example, one lipid molecule forms a hydrogen bond from its ether oxygens to a tyrosine -oh group at the end of a transmembrane α helix (fig. 6; belrhali et al., 1999; essen et al., 1998). the result is that the strength of the interactions of individual boundary lipid molecules with the protein will be different. the disorder of the chains seen in fig. 6 is consistent with the results of molecular dynamics simulations of the bacteriorhodopsin trimer in a bilayer of diphytanyl phosphatidylglycerophosphate (edholm et al., 1995). 
the molecular dynamics simulation agrees with experiment in predicting higher order for both the lipids and the protein on the extracellular side of the membrane; fluctuations in the loops and the ends of helices on the cytoplasmic side of the membrane are greater than on the extracellular side. this is also seen in fluctuations of the lipids, with lipids on the cytoplasmic side of the membrane fluctuating more strongly than those on the extracellular side (edholm et al., 1995) . the calculated order parameters for the chains are low, mainly due to a static tilt of the chains necessary to allow them to nestle against the rough surface of the protein. the chains in the purple membrane behave more like parts of the protein than parts of a fluid lipid phase, consistent with the idea of boundary lipids (edholm et al., 1995) . the boundary lipids and the bulk lipids in a membrane can be distinguished experimentally in many systems, because of the static disorder of the boundary lipids and the dynamic disorder of the bulk lipids. static and dynamic disorder give rise to very different electron spin resonance (esr) spectra for spin-labeled lipids, and esr spectra for membrane protein systems usually show two components, one "immobilized," corresponding to boundary lipid, and one relatively mobile, corresponding to bulk lipid (devaux and seigneuret, 1985; marsh, 1995) . studies with oriented samples have confirmed a wide range of orientational distributions for the boundary lipid, in contrast to the bulk lipid phase, where motion of the lipid long axis is about the bilayer normal (jost et al., 1973; pates and marsh, 1987) . a particularly important feature of a membrane protein as far as the lipid bilayer is concerned is the thickness of the transmembrane region of the protein. the cost of exposing hydrophobic fatty acyl chains or protein residues to water is such that the hydrophobic thickness of the protein should match that of the bilayer. 
the question then is how the system responds when these do not match. most models of hydrophobic mismatch assume that the lipid chains in the vicinity of the protein adjust their length to that of the protein, with the protein acting as a rigid body. when the thickness of the bilayer is less than the hydrophobic length of the peptide the lipid chains must be stretched. conversely, when the thickness of the bilayer is greater than the hydrophobic length of the peptide the lipid chains must be compressed (fig. 7). stretching the fatty acyl chains will effectively decrease the surface area they occupy in the membrane surface, and, conversely, compressing the chains will increase the effective area occupied in the surface (fig. 7). thus, changes in fatty acyl chain order are linked to changes in average interfacial areas per lipid molecule.
figure 7. the result of a mismatch between the hydrophobic length of a peptide and the hydrophobic thickness of a lipid bilayer. left: positive hydrophobic mismatch (dp > dl). right: negative mismatch (dp < dl). the top shows a "side" view of the chain packing around the peptide and the bottom shows a top view, illustrating the variation in chain cross-sectional area with distance from the peptide. figure based on fattal and ben-shaul (1993).
a number of terms have been suggested to contribute to the total free energy cost of deforming a lipid bilayer around a protein molecule (fattal and ben-shaul, 1993; nielsen et al., 1998):
1. loss of conformational entropy of the chains imposed by the presence of the rigid protein wall
2. bilayer compression/expansion energy due to changes in the membrane thickness
3. surface energy changes due to changes in the area of the bilayer-water interface
4. splay energy due to changes in the cross-sectional area available to the chains along their length, resulting from curvature of the monolayer surface near the protein
a number of models have been proposed to estimate these terms. 
fattal and ben-shaul (1993) calculated the total lipid-protein interaction free energy as the sum of chain and headgroup terms. for the chains the loss of conformational entropy imposed by the rigid protein wall was positive (unfavorable) even for perfect hydrophobic matching. the other contribution to the chain term arose from the requirement for hydrophobic matching and the consequent stretching or compression of the chain. the term due to the headgroup region was treated as an interfacial free energy, including an attractive term associated with exposure of the hydrocarbon core to the aqueous medium and a repulsive term due to electrostatic and excluded-volume interactions between the headgroups (excluded-volume interactions signify that no two atoms can occupy the same position in space). the resulting profile of energy of interaction as a function of hydrophobic mismatch was fairly symmetrical about the point of zero mismatch. the lipid perturbation energy f (in units of kt per angstrom of protein circumference) calculated by ben-shaul (1995) fits to the equation

$$f = 0.37 + 0.005\,(d_p - d_l)^2 \qquad (1)$$

where dp and dl are the hydrophobic thicknesses of the protein and lipid bilayer, respectively, and the unperturbed bilayer thickness is 24.5 Å. the hydrophobic thickness of a bilayer of phosphatidylcholine in the liquid crystalline phase is given by

$$d_l = 1.75\,(n - 1)\ \text{Å} \qquad (2)$$

where n is the number of carbon atoms in the fatty acyl chain (lewis and engelman, 1983; sperotto and mouritsen, 1993). a simple, but crude calculation gives an idea of the size of the effect that can be expected from hydrophobic mismatch. it is assumed that the protein is very large and so appears flat to a lipid molecule. it is also assumed that all the lipid perturbation energy is concentrated in the first shell of lipids around the protein. 
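As a rough illustration of the chain-length dependence of bilayer thickness, the sketch below uses the linear relation d = 1.75(n − 1) Å that is commonly quoted for liquid crystalline phosphatidylcholines. That relation is taken here as an assumption rather than as a result derived in this chapter, and the function name is hypothetical. It reproduces the thicknesses quoted in the text to within about 1 Å: roughly 30 Å for C18 chains, roughly 22 Å for C14 chains, and 24.5 Å for n = 15.

```python
# Assumed linear relation between chain length and hydrophobic thickness
# for a liquid crystalline phosphatidylcholine bilayer: d = 1.75 * (n - 1),
# with d in angstroms and n the number of carbons per fatty acyl chain.
ANGSTROM_PER_CARBON = 1.75


def bilayer_thickness(n_carbons):
    """Hydrophobic thickness (angstroms) for chains with n_carbons carbons."""
    return ANGSTROM_PER_CARBON * (n_carbons - 1)


if __name__ == "__main__":
    for n in (12, 14, 15, 16, 18):
        print("C%d: %.2f angstroms" % (n, bilayer_thickness(n)))
    # Each additional carbon adds 1.75 angstroms, so 4 carbons correspond
    # to 7 angstroms and 10 carbons to 17.5 angstroms.
    print(bilayer_thickness(18) - bilayer_thickness(14))
```

Under this relation a change of 4 carbons in chain length corresponds to a 7 Å change in thickness and a change of 10 carbons to 17.5 Å, which are the two mismatches used in the crude estimate discussed in the text.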
equation (1) then shows that if a lipid occupies 6 Å of the protein circumference, the lipid-protein interaction energy would change by 3.6 kj mol^-1 for a hydrophobic mismatch of 7 Å, corresponding to an increase in fatty acyl chain length of 4 carbons, and by 22.8 kj mol^-1 for a hydrophobic mismatch of 17.5 Å, corresponding to an increase in acyl chain length of 10 carbons. changes in interaction energies of 3.6 and 22.8 kj mol^-1 correspond to decreases in the lipid-protein binding constant by factors of 4.3 and 10^4, respectively. if the change in lipid-protein interaction energy were to propagate out from the protein surface to affect more than the first shell of lipids, effects of hydrophobic mismatch would be reduced. for example, if effects were averaged over three shells of lipids, changes in fatty acyl chain lengths by 4 and 10 carbons from that giving optimal interaction would decrease lipid-protein binding constants by factors of 1.6 and 21, respectively. energies of the magnitude calculated by ben-shaul (1995) are easily sufficient to result in conformational changes in a protein. if a protein conformational change results in a change in the hydrophobic thickness of the protein, the change will result in a deformation of the adjacent bilayer. because the equilibrium constant describing the equilibrium between two conformational states of a protein is determined by the total free energy difference between the two states, the energetic cost of the membrane deformation will contribute toward the equilibrium constant. the approach adopted by nielsen et al. (1998) came to rather similar conclusions. the most important energy terms were found to be the splay energy and the compression-expansion term, the splay energy term being most important close to the protein, with the compression-expansion term being more important further from the protein. 
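The crude estimate of mismatch energies and binding-constant factors can be checked numerically. The sketch below assumes that the mismatch-dependent part of the perturbation energy is quadratic, 0.005 (dp − dl)^2 in units of kT per Å of circumference; this coefficient is an assumption, chosen because it reproduces the figures quoted in the text, and the temperature of 298 K is likewise assumed. Averaging over several lipid shells is modeled simply by dividing the energy per lipid by the number of shells, which is one possible reading of the text. All function names are illustrative.

```python
import math

R_KJ = 8.314e-3      # gas constant, kJ mol^-1 K^-1
TEMPERATURE = 298.0  # K; assumed, not stated in the text
QUAD_COEFF = 0.005   # kT per cubic angstrom; assumed coefficient of the quadratic term


def mismatch_energy_kt(mismatch, contact=6.0, shells=1):
    """Energy change per lipid (units of kT) for a hydrophobic mismatch in angstroms.

    contact: length of protein circumference occupied by one lipid (angstroms).
    shells:  number of lipid shells over which the energy is averaged.
    """
    return QUAD_COEFF * mismatch ** 2 * contact / shells


def kt_to_kj_per_mol(energy_kt, temperature=TEMPERATURE):
    """Convert an energy in units of kT to kJ per mole."""
    return energy_kt * R_KJ * temperature


def binding_factor(energy_kt):
    """Factor by which the binding constant falls for an energy cost in kT."""
    return math.exp(energy_kt)


if __name__ == "__main__":
    for mismatch in (7.0, 17.5):
        e1 = mismatch_energy_kt(mismatch)
        e3 = mismatch_energy_kt(mismatch, shells=3)
        print("mismatch %.1f A: %.1f kJ/mol, factor %.3g (one shell), %.3g (three shells)"
              % (mismatch, kt_to_kj_per_mol(e1), binding_factor(e1), binding_factor(e3)))
```

With these assumptions the sketch returns about 3.6 and 22.8 kJ mol^-1, factors of about 4.3 and 10^4 for a single shell, and factors of about 1.6 and 21 when the effect is spread over three shells.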
even though the bilayer deformation was calculated to extend some 30 Å from the protein, most of the deformation was found concentrated in the component immediately adjacent to the protein. an alternative model for mismatch is the mattress model of mouritsen and bloom (1984, 1993). this again expresses mismatch as two terms. the first is an excess hydrophobic free energy associated with exposing either lipid chains or the protein surface to the aqueous medium. the second is proportional to the contact area between the lipid chains and the hydrophobic surface of the protein. the calculations showed that, for a protein of hydrophobic thickness 20 Å, which matches a bilayer of di(c14:0)pc in the liquid crystalline state, the binding constant for di(c14:0)pc is about a factor of 2.5 greater than for a bilayer of di(c18:0)pc, which will give a bilayer too thick by 7 Å (sperotto and mouritsen, 1993). thus, effects of mismatch calculated in this way are similar to those calculated using the approach of fattal and ben-shaul (1993). the importance of hydrophobic matching has been confirmed in a number of experimental studies (dumas et al., 1999; killian, 1998). for example, although bacteriorhodopsin has relatively little effect on the phase transition temperatures of di(c14:0)pc or di(c16:0)pc (alonso et al., 1982), it increases the transition temperature of di(c12:0)pc and decreases that of di(c18:0)pc (piknova et al., 1993). this is consistent with hydrophobic matching models; because di(c12:0)pc gives a too thin bilayer in the liquid crystalline phase, bacteriorhodopsin favors the gel phase, whereas di(c18:0)pc gives a too thick bilayer in the gel phase, so that bacteriorhodopsin favors the liquid crystalline phase. 
hydrophobic matching could also be the explanation for the unexpected observation that, in mixtures of di(c14:0)pc and di(c18:0)pc at temperatures where the mixture contains both gel and liquid crystalline phases, bacteriorhodopsin partitions equally between the two phases (piknova et al., 1997; schram and thompson, 1997). this contrasts with the observed exclusion of bacteriorhodopsin from gel-phase lipid in mixtures containing single species of phospholipid (alonso et al., 1982; cherry et al., 1978). it has been suggested that this shows that the requirements of hydrophobic matching are of prime importance; the hydrophobic thickness of bacteriorhodopsin is intermediate between the hydrophobic thickness of di(c14:0)pc in the gel phase and di(c18:0)pc in the liquid crystalline phase, so that bacteriorhodopsin shows little preference between the two. in mixtures of di(c12:0)pc and di(c18:0)pc, where, at low temperatures, two separate gel phases are formed, one enriched in di(c12:0)pc and one enriched in di(c18:0)pc, bacteriorhodopsin partitions very strongly into the di(c12:0)pc-enriched domains; this could be because the hydrophobic thickness of bacteriorhodopsin is better matched to gel-state di(c12:0)pc than to gel-state di(c18:0)pc (dumas et al., 1997). however, in studies of the ca2+-atpase of sarcoplasmic reticulum using either spin-labeled (london and feigenson, 1981) or brominated phospholipids, strengths of binding of liquid crystalline-phase phospholipids to the atpase were found to be independent of fatty acyl chain length. these results are not consistent with the expectations of hydrophobic matching theory and suggest that, in liquid crystalline bilayers, α-helical membrane proteins are not rigid, but, in fact, can distort to match the thickness of the bilayer. such a distortion could explain why bilayer thickness affects the activity of membrane proteins such as ca2+-atpase (lee, 1998). 
the structural distortion could take the form of a change in the tilt of the transmembrane α helices with respect to the bilayer normal or could be a change in the packing of the transmembrane α helices. an alternative approach to these questions, avoiding the complexity of real membrane proteins, is to use simple model transmembrane α helices, which can be synthesized chemically in large quantity. a number of studies have used peptides of the type ac-k2-g-ln-k2-a-amide (pn) consisting of a long sequence of hydrophobic leu residues capped at both the n- and c-terminal ends with a pair of charged lysine residues. the poly(leu) region forms a maximally stable α helix, particularly in the hydrophobic environment of the lipid bilayer. the charged lys caps were chosen both to anchor the ends of the peptides in the lipid headgroup region and to inhibit the aggregation of the peptides in the membrane. the peptide has been shown to adopt the expected α-helical structure in both liquid crystalline- and gel-phase bilayers (davis et al., 1983; huschilt et al., 1989; zhang et al., 1992a). rates of hydrogen/deuterium exchange for the peptide p16 in lipid bilayers suggest that at least 80% of the peptide is in an α-helical conformation in the bilayer, meaning that the whole of the poly(leu) core must be α-helical (zhang et al., 1992b). rates of hydrogen/deuterium exchange were greater at the n- and c-termini of the peptides than in the middle, suggesting some unraveling of the peptide at its ends (zhang et al., 1992b). experiments with the peptides of this type suggest that about 15 lipid molecules are required for complete incorporation of the peptide into a bilayer of the appropriate thickness (webb et al., 1998). this agrees with estimates from molecular modeling that about 16-18 lipid molecules will be required to form a complete bilayer shell around the peptide. 
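The design of the pn peptides described above can be written out explicitly. The sketch below builds the one-letter sequence for a given number of leu residues and estimates the length of the poly(leu) core, assuming an ideal α helix with a rise of 1.5 Å per residue (an assumed standard value). The n-terminal acetyl and c-terminal amide groups are not represented in the one-letter string, and the function names are hypothetical.

```python
def pn_sequence(n_leu):
    """One-letter sequence of a peptide of the type Ac-K2-G-Ln-K2-A-amide.

    The acetyl and amide end groups are omitted from the returned string.
    """
    if n_leu < 1:
        raise ValueError("n_leu must be a positive integer")
    return "KKG" + "L" * n_leu + "KKA"


def poly_leu_length(n_leu, rise=1.5):
    """Approximate length (angstroms) of the poly(Leu) core as an ideal alpha helix."""
    return n_leu * rise


if __name__ == "__main__":
    for n in (16, 24):
        print("P%d: %s  core length about %.0f angstroms"
              % (n, pn_sequence(n), poly_leu_length(n)))
```

On these assumptions the poly(leu) core of p16 is about 24 Å long and that of p24 about 36 Å, which gives a sense of how the peptide length can be tuned relative to the hydrophobic thickness of a given bilayer.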
at molar ratios of lipid less than this, non-bilayer phases can be induced, particularly when the hydrophobic length of the peptide is less than the hydrophobic thickness of the bilayer and when the peptide contains interfacial aromatic groups (de planque et al., 1999; killian et al., 1996; morein et al., 1997). effects of single transmembrane α helices on lipid bilayers are likely to be less than those of a protein containing a bundle of transmembrane α helices. the cross-sectional area of a single transmembrane α helix is not much greater than that of a phospholipid molecule in the liquid crystalline phase, so that the hydrophobic surface presented to the lipid molecules is rather small. the structure of the lipid bound to bacteriorhodopsin shown in fig. 6 shows the two chains interacting predominantly with two different transmembrane α helices. this kind of interaction will obviously not be possible with a single transmembrane α helix. less extensive interactions between lipids and single transmembrane α helices than between lipids and membrane proteins are suggested by esr experiments. whereas esr spectra of spin-labeled lipids in the presence of membrane proteins typically show two-component spectra, as described above, esr spectra for lipid bilayers containing the peptide l24 and for a tryptophan-containing peptide of the type aw2(la)nw2a are single-component (de planque et al., 1998; subczynski et al., 1998). this means either that the lipid fatty acyl chains are not "immobilized" on the peptide surface or that the rate of exchange between bulk and boundary lipid is fast on the esr time scale (i.e., exchange is faster than 10^7 s^-1). effects of peptides on chain order in di(c14:0)pc or di(c16:0)pc measured using deuterium nuclear magnetic resonance (nmr) methods are small in both the liquid crystalline and the gel phases (davis et al., 1983; de planque et al., 1998, 1999; roux et al., 1989). 
thus, addition of a peptide aw2(la)tw2a to bilayers of di(c12:0)pc in the liquid crystalline phase resulted in only a 1.4-Å increase in thickness (de planque et al., 1998), whereas about an 11-Å increase would be necessary for the bilayer thickness to match the hydrophobic length of the peptide. similarly, nezil and bloom (1992) estimated that the peptide p24 increased the thickness of a bilayer of (c16:0,c18:1)pc by just 0.6 Å, despite the hydrophobic mismatch between the peptide and the lipid bilayer being ca. 10 Å. increases in chain order caused by p24 in (c16:0,c18:1)pc in the liquid crystalline phase were detected using esr, but again the effects were small (subczynski et al., 1998). effects of peptides on chain order will depend on the relative hydrophobic length of the peptide compared to the hydrophobic thickness of the bilayer, with long peptides decreasing order and short peptides increasing order, and such effects have been detected using infrared (ir) spectroscopy, but again effects were small (zhang et al., 1992a). thus it appears that lipids will distort slightly to improve the match between the hydrophobic length of the peptide and the hydrophobic thickness of the bilayer, but the extent of these modifications is very limited and much less than required to produce full matching. a number of studies have been published on the effects of these peptides on the phase transition properties of lipid bilayers. addition of the peptide p16 to bilayers of di(c16:0)pc both broadens the main gel-to-liquid crystalline phase transition and decreases the enthalpy of the transition (morrow et al., 1985). similar effects have been seen on incorporation of membrane proteins such as bacteriorhodopsin and ca2+-atpase (alonso et al., 1982; gomez-fernandez et al., 1980). 
the decrease in enthalpy of the transition has often been taken to mean that the lipids adjacent to the protein (the boundary lipids) are very strongly perturbed by the peptide and so are unable to take part in the normal phase transition: they are effectively withdrawn from the transition. however, deuterium nmr spectra of mixtures of p24 and di(c16:0)pc above and below the phase transition are typical of liquid crystalline- and gel-phase lipids, respectively, with no evidence for any "special" lipid unable to take part in the phase transition. similarly, as already described, esr spectra of spin-labeled lipids show the presence of a single type of lipid in the system, not separate bulk and boundary lipids in slow exchange (subczynski et al., 1998). thus the peptides (or proteins) do not remove lipid from the main transition, but, rather, perturb the whole lipid bilayer. the peptide decreases the enthalpy difference between the liquid crystalline and gel phases, whereas the lipids in the bilayer remain recognizably liquid crystalline or gel (morrow et al., 1985). morrow et al. (1985) showed that mixtures of lipids and peptides can be modeled in terms of regular solution theory (lee, 1978). unfortunately, the number of free parameters in fitting to regular solution theory is high, so that the analysis yields little useful information beyond showing that the data are consistent with regular solution theory. effects of peptides or proteins on phase transition properties have therefore been interpreted qualitatively in terms of a two-component model in which one component is more or less unperturbed bulk lipid and the other is highly perturbed boundary lipid, which undergoes a broad phase transition of low enthalpy; this approach works for peptides of the poly(leu) type, but, for some reason, does not work with peptides of the k2(la)nk2 type (zhang et al., 1992a, 1995).
differential scanning calorimetry (dsc) thermograms for mixtures with the poly(leu) peptides have been fitted to two components, attributed to the phase transitions of peptide-free and boundary lipid, respectively. the phase transition temperature for the peptide-free component is slightly less than that for pure lipid; this is probably due to the normal colligative effects that will follow from mixing the "pure" lipid phase with the boundary lipid, the latter acting as an "impurity." the phase transition temperature for the boundary lipid is higher than the bulk transition temperature for short-chain lipids, but is lower for long-chain lipids (zhang et al., 1992a). the same observation has been made for membrane proteins (dumas et al., 1999; piknova et al., 1993). this is consistent with the idea that short fatty acyl chains have to stretch to match the hydrophobic thickness of the membrane protein, whereas long fatty acyl chains have to compress. however, the experimental changes are much smaller than expected from models of hydrophobic matching. further insights into how transmembrane α helices might interact with lipid molecules in a bilayer have come from molecular dynamics simulations. one study was of a transmembrane α helix of 32 alanine residues in a bilayer of di(c14:0)pc in the liquid crystalline state (shen et al., 1997). the peptide is not an ideal model for a transmembrane α helix, because it lacks charged groups at each end to interact with the polar headgroups of the phospholipids. nevertheless many features of the simulation are informative. the simulation was started with the peptide as a pure α helix. the central 15 residues (ala-12-26), which interacted just with the lipid fatty acyl chains, remained as a stable α helix.
the n- and c-terminal regions of the α helix, located in the lipid headgroup region, were less stable and fluctuated more, because of transient hydrogen bonding between the peptide bond amide hydrogen and the phosphate or fatty acyl ester oxygen atoms and the water; as a result, the ends of the helix became frayed. the length of the central helical region oscillated slightly about the 22-Å average expected for an α helix, varying between 20 and 23 Å. the helix was tilted up to 30° with respect to the bilayer normal. because the helix contains no residues that would make strong contacts in the headgroup region or with water, there is no reason for it not to tilt (shen et al., 1997). tilting in fact allows more hydrophobic contact by allowing more of the ala residues to be located in the core of the bilayer. the presence of the peptide had little effect on the calculated properties of the bilayer. the average bilayer thickness was not significantly changed, although the average order parameter for the ch2 groups in the chains decreased in the presence of the peptide. many different lipid molecules contributed to the immediate surroundings of the peptide (shen et al., 1997). even if the fatty acyl chain of a particular lipid was immediately adjacent to the peptide, the headgroup of the lipid could be a substantial distance away. given the diameter of the helix and the size of the phosphate group, a phosphate immediately adjacent to the helix would be between 8 and 10 Å away from the center of the helix (shen et al., 1997). on average, five lipid molecules having their chains adjacent to the helix also had their headgroups adjacent, using this definition. however, the headgroup, as given by the position of the phosphorus atom, could be up to 16-18 Å away. since the average distance between lipid headgroups is 8 Å, this puts these lipids in the "second" shell around the peptide.
other lipids existed between these extremes, suggesting a very diffuse environment around the protein rather than a discrete set of well-ordered shells of lipid. only rarely did an entire lipid molecule pack tightly around the helix. the shell around the helix contains contributions from a large number of lipid molecules, each contributing a small number of atoms (shen et al., 1997). thus there is no evidence for a distinct shell of lipids around the peptide and any perturbation of the lipids extends out just a few angstroms. the lack of a clear shell of lipids around the poly(ala) peptide contrasts with the boundary lipids observed for membrane proteins and illustrated for bacteriorhodopsin in figs. 5 and 6. in part this could be an intrinsic feature of single transmembrane α helices, which will not be able to present a large surface area on which fatty acyl chains could be immobilized. the regular structure of a poly(ala) peptide compared to the rough surface of a typical α-helical peptide might also contribute to the lack of an immobilization shell of lipids. however, a significant factor is also likely to be the lack of polar groups at the ends of the peptide able to interact with the phospholipid headgroups. the importance of polar groups at the ends of the peptides has been shown in a simulation of isolated helices from bacteriorhodopsin (woolf, 1998). the simulations show that a small number of the lipids surrounding a helix interact with it much more strongly than other lipids, due to a combination of van der waals (mediated by chains) and charge interactions (mediated by headgroups) (woolf, 1998).
a molecular dynamics simulation of the peptide l16 in a bilayer of di(c14:0)pc in the liquid crystalline phase showed that the peptide tilted with an average angle of 15.3° with respect to the bilayer normal, even though the thickness of the hydrophobic region of the bilayer (22 Å) was a good match to the length of the α helix, 24 Å (belohorcova et al., 1997). a molecular dynamics simulation has been reported for pf1 coat protein in di(c16:0)pc (roux and woolf, 1996). the coat protein contains an amphipathic α helix on the bilayer surface and a hydrophobic transmembrane α helix. fatty acyl chains next to the transmembrane helix were slightly more ordered than bulk lipids (roux and woolf, 1996). similarly, in a simulation of the seventh transmembrane α helix of the serotonin 5ht receptor in di(c14:0)pc, lipids in contact with the peptide had slightly higher order parameters than bulk lipids (duong et al., 1999). the theories described above show that there will be an energetic cost associated with any change in the thickness of a bilayer. this would be reflected in values for the equilibrium constant describing the binding of lipids to the protein. a lipid that can bind to a protein without a change in bilayer thickness would bind more strongly to the protein than one for which binding required a change in bilayer thickness. the strength of interaction between a peptide and a phospholipid in a bilayer can be measured using a fluorescence quenching method (webb et al., 1998). peptides used are of the type kkgl7wl9kka (l16) and kkgl10wl12kka (l22) containing a central trp residue as a fluorescence reporter group. the peptide is incorporated into bilayers containing the brominated phospholipid dibromostearoylphosphatidylcholine (di(br2c18:0)pc); di(br2c18:0)pc behaves much like a conventional phospholipid with unsaturated fatty acyl chains, because the bulky bromine atoms have effects on lipid packing similar to those of a cis double bond.
contact between the bromine atoms in the lipid and the trp residue in the peptide leads to fluorescence quenching. in mixtures of brominated and nonbrominated phospholipids, the degree of quenching of the fluorescence of the tryptophan residue is related to the fraction of the surrounding (boundary) phospholipid molecules that are brominated, and thus to the strength of binding of the nonbrominated lipid to the peptide. an example of the method is shown in fig. 8. the fluorescence intensity for the peptide l16 incorporated into bilayers of di(br2c18:0)pc at a molar ratio of lipid to peptide of 100:1 is 5% of that in di(c18:1)pc, demonstrating highly efficient quenching of the tryptophan by the bromine-containing fatty acyl chains (fig. 8). the fluorescence intensity in mixtures of di(br2c18:0)pc and di(c18:1)pc decreases with increasing content of di(br2c18:0)pc, reflecting the increasing number of boundary lipids that will be di(br2c18:0)pc. as shown in fig. 8, fluorescence quenching curves for l16 in mixtures of di(br2c18:0)pc and di(c14:1)pc show more fluorescence quenching at intermediate mole fractions of di(br2c18:0)pc than in mixtures with di(c18:1)pc. this shows that di(c18:1)pc binds more strongly to the peptide than does di(c14:1)pc. the results can be analyzed to give relative lipid-binding constants, as described in webb et al. (1998). these lipid-binding constants for l16 and l22 are given in table i. for l22 strongest binding is seen with di(c22:1)pc, for which the relative binding constant is about double that for di(c18:1)pc (table i). the hydrophobic length of the peptide l22 is about 36 Å, calculated for a stretch of 24 hydrophobic residues in total, with a helix translation of 1.5 Å per residue. thus, strongest binding is seen when the hydrophobic length of the peptide matches the hydrophobic thickness of the bilayer, as expected from theories of hydrophobic mismatch.
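the link between quenching and relative binding constants can be illustrated with a simple competitive-binding sketch. it is not the published analysis of webb et al. (1998); it assumes that each boundary site around the peptide holds either a brominated or a nonbrominated lipid, that k_rel is the binding constant of the nonbrominated lipid relative to di(br2c18:0)pc, and that fluorescence falls as (1 - f_br)^n. the function names and the default n_sites = 2 are hypothetical; the residual intensity of 0.05 is the 5% value quoted above for l16 in pure di(br2c18:0)pc.

```python
def brominated_site_fraction(x_br, k_rel):
    """Fraction of boundary sites occupied by brominated lipid.

    x_br  : mole fraction of brominated lipid in the bilayer (0..1)
    k_rel : binding constant of the nonbrominated lipid relative to the
            brominated lipid (assumed simple one-for-one site exchange)
    """
    if not 0.0 <= x_br <= 1.0:
        raise ValueError("x_br must lie between 0 and 1")
    if k_rel <= 0.0:
        raise ValueError("k_rel must be positive")
    return x_br / (x_br + k_rel * (1.0 - x_br))


def relative_fluorescence(x_br, k_rel, n_sites=2.0, f_min=0.05):
    """Fluorescence relative to that in pure nonbrominated lipid.

    n_sites is a hypothetical number of sites close enough to the trp
    residue to quench it; f_min is the residual intensity in pure
    brominated lipid (5% for l16, from the text).
    """
    f_br = brominated_site_fraction(x_br, k_rel)
    return f_min + (1.0 - f_min) * (1.0 - f_br) ** n_sites


if __name__ == "__main__":
    # A more weakly bound nonbrominated lipid (smaller k_rel) is displaced
    # more easily, so quenching is stronger at the same mole fraction.
    for k_rel in (1.0, 0.5):
        print(k_rel, round(relative_fluorescence(0.5, k_rel), 3))
```

under these assumptions, lowering k_rel deepens the quenching curve at intermediate mole fractions, which is the qualitative behavior described for di(c14:1)pc compared with di(c18:1)pc.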
however, relative binding constants do not continue to decrease with decreasing chain length from di(c18:1)pc to di(c14:1)pc as would have been predicted (table i). an even larger deviation from theoretical predictions is observed with the short peptide l16 (table i). in this case strongest binding is observed with di(c18:1)pc, with binding decreasing with decreasing chain length to di(c14:1)pc, as expected. however, the peptide was found not to incorporate at all into bilayers of di(c24:1)pc, instead forming aggregates of peptide separate from the bilayer. ren et al. (1999) obtained very similar results, except that under their conditions, unincorporated peptide bound to the surface of the lipid bilayer, with the long axis of the peptide parallel to the surface. similarly, l16 was found to be only partly incorporated into bilayers of di(c22:1)pc (webb et al., 1998). [table i footnotes: a, bilayer hydrophobic thickness d (in Å) calculated from d = 1.75(n - 1), where n is the number of carbon atoms in the fatty acyl chain (sperotto and mouritsen, 1988); b, estimated hydrophobic length is 27 Å for l16 and 36 Å for l22.] thus, a short peptide cannot incorporate into a too-thick bilayer. it is suggested that a too-thin bilayer can match to a too-long peptide both by a slight stretching of the lipid and by tilting of the long axis of the helix with respect to the bilayer normal so that its effective length across the bilayer is reduced. however, a too-thick bilayer can only match a too-short peptide by compression of the lipid, which becomes energetically unfavorable when the difference between the bilayer thickness and the peptide length exceeds about 6 Å (webb et al., 1998). possible effects of aromatic residues at the ends of transmembrane α helices have been studied using peptides k2gfl6wl8fk2a (f2l14) and k2gyl6wl8yk2a (y2l14), in which one leu residue at each end of the poly(leu) stretch is replaced by either a phe or a tyr (mall et al., 2000).
in contrast to the results with l16, peptide f2l14 incorporated fully into bilayers of di(c24:1)pc, and y2l14 partitioned partially into di(c24:1)pc. the fluorescence quenching method was again used to obtain binding constants for phosphatidylcholines to the peptides, measured relative to the binding constant for di(c18:1)pc (table ii). the effective hydrophobic length of the peptide y2l14 might be expected to be somewhat greater than that of l16; if the peptide is modeled as an α helix with the two tyr residues oriented to be roughly parallel to the long axis of the α helix, the distance between the two tyr-oh groups is ca. 33 Å, about 6 Å greater than the hydrophobic length of l16. the hydrophobic length of y2l14 calculated in this way matches the hydrophobic thickness of a bilayer of di(c20:1)pc, whereas the relative lipid-binding constants increase from di(c14:1)pc to di(c22:1)pc (table ii). similarly, relative binding constants for f2l14 increase with increasing chain length from di(c14:1)pc to di(c24:1)pc. the results with y2l14 and f2l14 show that introduction of aromatic residues at the two ends of the hydrophobic sequence increases the ability of the short peptides to partition into thick lipid bilayers. [table ii footnote: a, hydrophobic thickness of the bilayer (in Å) calculated from d = 1.75(n - 1), where n is the number of carbon atoms in the fatty acyl chain (sperotto and mouritsen, 1988).] the observation that the highest relative binding constant is obtained with bilayers considerably thicker than the calculated hydrophobic length of the peptides suggests that the presence of aromatic residues at the ends of the helices could lead to marked thinning of the bilayer around the peptides (mall et al., 2000).
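the matching arithmetic used in this discussion can be sketched in a few lines. the sketch takes the two relations quoted in the text, d = 1.75(n - 1) Å for the hydrophobic thickness of a bilayer with n-carbon chains (sperotto and mouritsen, 1988) and 1.5 Å per residue for the length of an α helix. the set of even chain lengths from c14 to c24, the function names, and the treatment of tilt as a simple geometric projection (effective length = length x cos(tilt)) are assumptions made for illustration only.

```python
import math

RISE_PER_RESIDUE = 1.5  # Å per residue along an α helix (value from the text)


def bilayer_thickness(n_carbons):
    """Hydrophobic thickness in Å from d = 1.75(n - 1)."""
    return 1.75 * (n_carbons - 1)


def helix_length(n_residues):
    """Hydrophobic length in Å of an ideal α helix of n_residues."""
    return RISE_PER_RESIDUE * n_residues


def best_matching_chain(peptide_length, chain_lengths=(14, 16, 18, 20, 22, 24)):
    """Chain length whose bilayer thickness is closest to the peptide length."""
    return min(chain_lengths,
               key=lambda n: abs(bilayer_thickness(n) - peptide_length))


def tilt_for_match(peptide_length, thickness):
    """Tilt (degrees from the bilayer normal) at which the projected helix
    length equals the bilayer thickness; 0 if the helix is not too long."""
    if peptide_length <= thickness:
        return 0.0
    return math.degrees(math.acos(thickness / peptide_length))


if __name__ == "__main__":
    l22 = helix_length(24)           # 24 hydrophobic residues
    print(l22, best_matching_chain(l22))
    print(best_matching_chain(33.0))  # tyr-oh to tyr-oh distance of y2l14
    print(round(tilt_for_match(l22, bilayer_thickness(18)), 1))
```

with these relations the 36 Å length of l22 lies closest to the thickness of di(c22:1)pc and the 33 Å length estimated for y2l14 lies closest to that of di(c20:1)pc, as stated in the text. the tilt value is only a geometric estimate of how far a too-long helix would need to lean to span a thinner bilayer.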
the chain length dependence of lipid binding to y2l20 is much less marked than for the shorter peptides; the relative binding constant increases from di(c14:1)pc to di(c18:1)pc, but then hardly changes with increasing chain length between di(c18:1)pc and di(c24:1)pc (table ii). this contrasts with l22, which shows markedly stronger interaction with di(c22:1)pc than with phospholipids with shorter or longer chains. this again suggests that the introduction of the two tyr residues leads to an increase in the thickness of the bilayer with which optimal interaction of the peptide is observed. interactions between transmembrane α helices and the phospholipid headgroups also have to be considered. using the fluorescence quenching method, it was shown that a small number of anionic phospholipid molecules (possibly just one) bound strongly to the peptide l16, the remaining molecules binding with an affinity equal to that of phosphatidylcholine. the binding constant for the strongly bound phosphatidic acid molecule relative to phosphatidylcholine in a medium of low ionic strength was 8.6 (in mole fraction units), corresponding to a difference in unitary binding energies of -5.3 kj mol-1. at ph 7.2, phosphatidic acid bears a single negative charge (cevc, 1990). the binding constant for phosphatidic acid changed little with ionic strength, suggesting that the interaction with the positively charged peptide did not follow simply from a high positive potential in the vicinity of the positively charged lys residues on the peptide, increasing the local concentration of anionic phospholipid. the energy of interaction u between two ions is given by $u = \frac{z_1 z_2 e^2}{4 \pi \varepsilon_0 \varepsilon_r r}$, where $z_1$ and $z_2$ are the charges on the two ions (in units of the elementary charge $e$), $\varepsilon_0$ is the permittivity of a vacuum, $\varepsilon_r$ is the relative permittivity (dielectric constant) of the medium, and $r$ is the distance between the two ions.
assuming a dielectric constant of 78.5 (water), we find that an energy of interaction of 5.3 kj mol-1 corresponds to a distance of separation between two monovalent ions of 3.3 Å. this therefore suggests that strong interaction requires the anionic headgroup of phosphatidic acid to be in close contact with one of the lys residues on the peptide. once this strong interaction with a single phosphatidic acid molecule has been made, other phosphatidic acid molecules will then interact with l16 relatively nonspecifically, with a binding constant relative to phosphatidylcholine close to 1. the relative binding constants for phosphatidylserine were lower than those for phosphatidic acid and were more sensitive to ionic strength. for phosphatidylserine, the presence of the positively charged ammonium group as well as the negatively charged carboxyl group in the headgroup region may reduce interaction with the positively charged peptide. in contrast to l16, the binding constants for anionic phospholipids to l22 are very similar to those for zwitterionic phospholipids, with a relative binding constant close to 1. it could be that tilting of l22 in the bilayer, necessary to match the hydrophobic length of l22 to the hydrophobic thickness of a bilayer of di(c18:1)pc, locates the lys residues on the peptide too far from the lipid headgroup region to allow a strong interaction between the anionic phospholipid and the peptide. both phosphatidylserine and phosphatidic acid bind more strongly to the peptides y2l14 and y2l20 than does phosphatidylcholine, the effect of anionic phospholipid decreasing slightly with increasing ionic strength. however, in this case the experiments are consistent with a model in which the binding constants for all the anionic phospholipid molecules binding to the peptide are increased slightly (mall et al., 2000). this suggests that the presence of the tyr residues prevents close association of the anionic phospholipid group with the cationic lys residues.
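the two numerical steps in this argument, converting the relative binding constant of 8.6 into a binding energy and converting that energy into an ion-ion separation with coulomb's law, can be checked with a short calculation. this is a sketch under stated assumptions: the temperature of 298.15 k is not given in the text and is assumed here, the relation ΔG = -RT ln K is the standard one for a relative binding constant, and the function names are illustrative.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
EPSILON_0 = 8.8541878128e-12  # permittivity of a vacuum, F/m
AVOGADRO = 6.02214076e23      # 1/mol
GAS_CONSTANT = 8.314462618    # J/(mol K)


def binding_energy_kj(k_rel, temperature=298.15):
    """Binding energy difference in kJ/mol from a relative binding constant.

    The temperature is an assumption (25 degrees C); it is not stated in
    the text for this measurement.
    """
    return -GAS_CONSTANT * temperature * math.log(k_rel) / 1000.0


def coulomb_energy_kj(z1, z2, r_angstrom, eps_r=78.5):
    """Coulomb energy in kJ/mol for two point charges r_angstrom apart."""
    r = r_angstrom * 1e-10  # Å to m
    u = z1 * z2 * E_CHARGE ** 2 / (4.0 * math.pi * EPSILON_0 * eps_r * r)
    return u * AVOGADRO / 1000.0


def separation_for_energy(energy_kj, z1=1, z2=-1, eps_r=78.5):
    """Separation in Å at which the Coulomb energy equals energy_kj."""
    u = energy_kj * 1000.0 / AVOGADRO  # J per ion pair
    r = z1 * z2 * E_CHARGE ** 2 / (4.0 * math.pi * EPSILON_0 * eps_r * u)
    return r * 1e10


if __name__ == "__main__":
    print(round(binding_energy_kj(8.6), 2))       # energy for K = 8.6
    print(round(separation_for_energy(-5.3), 2))  # separation in water
```

with a dielectric constant of 78.5 the calculation gives a binding energy close to -5.3 kj mol-1 and a separation close to 3.3 Å, in line with the values quoted above. a lower effective dielectric constant near the membrane surface would give a larger separation for the same energy.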
these results suggest that the effects of charge on the interactions between phospholipids and transmembrane α helices will often be rather small and will be strongly dependent on the detailed structure of the peptide and its orientation in the membrane. this picture is consistent with the results of the molecular dynamics simulations of individual α helices of bacteriorhodopsin in bilayers of di(c14:0)pc, which showed that a small proportion of the lipid molecules interacted with the α helices much more strongly than the others, and that these strong interactions were dominated by electrostatic terms rather than van der waals terms (woolf, 1998). in general, binding constants for phospholipids to membrane proteins also show relatively little selectivity for anionic phospholipids. for example, binding constants for phosphatidic acid and phosphatidylserine relative to phosphatidylcholine are close to 1 for the ca2+-atpase (dalton et al., 1998), and binding constants for phosphatidic acid and phosphatidylserine for the (na+-k+)-atpase are about twice those for phosphatidylcholine (esmann and marsh, 1985). however, there is evidence for the presence of a small number of "special" anionic phospholipids binding to some membrane proteins, acting as "cofactors." an example is provided by cytochrome c oxidase, whose crystal structure shows the presence of a lipid molecule bound between the transmembrane α helices (iwata et al., 1995). interaction between an anionic phospholipid and a binding site on a membrane protein would be specific if strong binding requires close interaction between the anionic headgroup and a positively charged residue on the protein, as suggested by the results presented here. the phase of the phospholipid is important in determining interactions with transmembrane α helices. as shown in fig.
8, fluorescence quenching is much more marked in mixtures of di(br2c18:0)pc and di(c16:0)pc at temperatures where both liquid crystalline and gel phases are present than in mixtures of di(br2c18:0)pc and di(c18:1)pc (mall et al., 2000). thus, l16 is excluded from regions of lipid in the gel phase and accumulates in regions in the liquid crystalline phase. the binding constants of l16 and l22 for di(c16:0)pc in the gel phase relative to di(c18:1)pc in the liquid crystalline phase are ca. 0.15 (mall et al., 2000). this is consistent with the expectation that van der waals contacts between an all-trans fatty acyl chain and the molecularly rough surface of a peptide will be poor; one way that this poor packing can be overcome is by exclusion of the peptide from the gel phase. quenching plots for y2l14 are very similar to those of l16 (fig. 8), showing that the presence of bulky aromatic residues does not have any significant effect on the selectivity for liquid crystalline- over gel-phase lipid (mall et al., 2000). further, since y2l14 shows a preference for longer chain phospholipids than l16, y2l14 might have been expected to show a greater preference for gel-phase lipid than l16, because phospholipid in the gel phase gives a thicker bilayer than the corresponding lipid in the liquid crystalline phase. because y2l14 and l16 show equal preferences for liquid crystalline- over gel-phase lipid, any effects of hydrophobic matching between the peptide and the bilayer must be small compared to effects of lipid phase on the interaction energies between the peptides and the lipid (mall et al., 2000). preferential partitioning of proteins from domains in the gel phase into domains of liquid crystalline lipid has been demonstrated for a variety of membrane proteins, including bacteriorhodopsin (cherry et al., 1978) and ca2+-atpase (kleeman and mcconnell, 1976). effects of sphingomyelin at 25°c are very similar to the effects of gel-phase di(c16:0)pc (fig.
8; mall et al., 2000). mixtures of bovine brain sphingomyelin and di(c18:1)pc are in a two-phase region at 25°c, with gel-phase domains enriched in sphingomyelin (untracht and shipley, 1977). thus, partitioning of the peptides between gel- and liquid crystalline-phase lipid shows little dependence on the structure of the phospholipid. it has been suggested that plasma membranes of mammalian cells contain domains or "rafts" enriched in sphingomyelin and that certain enzymes, particularly those associated with cell signaling, are concentrated within the rafts (simons and ikonen, 1997). the results presented here suggest that membrane proteins containing transmembrane α helices will tend to be excluded from these rafts, and it may therefore be significant that many of the signaling proteins suggested to be contained within the rafts are anchored to the membrane by glycosylphosphatidylinositol anchors (harder and simons, 1997). the presence of cholesterol has a marked effect on incorporation of the peptides into phospholipid bilayers (webb et al., 1998). incorporation of cholesterol at a 1:1 molar ratio to phospholipid leads to a general reduction in incorporation of the peptides l16 and l22, but superimposed on this effect is a chain length effect. in the presence of cholesterol, the binding constant of p16 for di(c14:1)pc relative to di(c18:1)pc increased from 0.4 to about 1, as expected if the presence of cholesterol increases the effective chain length of the c14 chain so that it more nearly matches the hydrophobic length of the peptide (fig. 8). consistent with this interpretation, the presence of high molar ratios of cholesterol prevented the incorporation of p16 into bilayers of di(c18:1)pc. nezil and bloom (1992) showed that incorporation of cholesterol at 33 mol% increases bilayer thickness by about 4 Å.
studies with brominated analogues of cholesterol showed that cholesterol binds to the peptides with a binding constant only about a factor of two lower than that of di(c18:1)pc. this is rather surprising, given the relatively rigid structure of the steroid ring of cholesterol and the molecularly rough surface of the peptide. in other studies, it has been shown that cholesterol binds relatively weakly at the lipid-protein interface of the atpase (simmonds et al., 1982, 1984); comparison with the peptide studies reported here suggests that weak binding of cholesterol to the atpase involves interactions in the lipid headgroup region rather than interactions between the sterol ring and the hydrophobic transmembrane helices. the requirement to match the hydrophobic thickness of a membrane protein to that of the surrounding lipid bilayer could be important in a number of ways. targeting of proteins to their correct final destinations in a cell is essential in maintaining cell integrity. in the bulk flow model, the vast majority of proteins synthesized in the endoplasmic reticulum (er) are believed to leave the er by default and flow along the exocytic pathway until they reach the plasma membrane (nilsson and warren, 1994). some proteins, however, have to be retained at particular points along the exocytic pathway. compartmental localization could be achieved in one of two ways. the first involves a retention signal in the protein, which, at the appropriate point in the exocytic pathway, prevents forward movement of the protein by denying it access to budding transport vesicles of the onward pathway. the second involves a retrieval signal, leading to recapture of the protein after it has left the compartment in which it resides.
the classical retrieval signal is the kdel sequence found in many er-resident proteins; the situation appears to be different for golgi-resident proteins, where membrane-spanning domains act as retention signals (nilsson and warren, 1994). despite the extensive flux of proteins through the golgi, the golgi maintains its own distinctive population of resident proteins. furthermore, the distribution of enzymes within the golgi is organized according to function, so that, for example, the distributions of glycosyltransferases and glycosidases, although overlapping, are distinct (colley, 1997; roth, 1987). many of the proteins in the golgi membrane are predicted to contain a single transmembrane α helix, oriented with the n- and c-termini on the inner and outer faces of the membrane, respectively. the golgi retention signal in such proteins has been shown to involve the membrane-spanning domain (munro, 1991, 1995; nilsson et al., 1991; swift and machamer, 1991). however, the membrane-spanning domains show no sequence homology, and it has not been possible to identify any particular motif leading to retention (bretscher and munro, 1993; colley et al., 1992; munro, 1991). thus, sialyltransferase remains localized in the golgi even when its 17-amino-acid transmembrane domain is replaced by 17 leu residues (munro, 1991). however, a longer stretch of 23 leu residues did not provide an efficient retention signal (munro, 1991). similarly, a 4-residue insertion into the transmembrane domain of galactosyltransferase reduced its retention in the golgi (masibay et al., 1993). the reverse effect has been shown with the influenza virus neuraminidase, which shifted from the plasma membrane to the golgi and er when the number of residues in the transmembrane domain was reduced (sivasubramanian and nayak, 1987).
the lack of a clear retention motif, together with the inability to saturate the mechanism for golgi retention by overexpression, suggests that retention is not a receptor-mediated event (nilsson and warren, 1994). one possible model is then retention by preferential interaction with membranes of optimal thickness (nilsson and warren, 1994). both bretscher and munro (1993) and masibay et al. (1993) showed that transmembrane domains of golgi proteins are shorter (average 15 residues) than transmembrane domains of plasma membrane proteins (average 20 residues). it has therefore been suggested that if the golgi membrane is thinner than the plasma membrane, membrane proteins with short transmembrane domains will interact "more strongly" with the lipid bilayer of the golgi than with that of the plasma membrane, leading to retention in the golgi (bretscher and munro, 1993; masibay et al., 1993). the studies with model peptides described above show that a protein containing a transmembrane α helix with a hydrophobic length greater than the hydrophobic thickness of the golgi membrane will be able to move out of the golgi into the plasma membrane. however, a protein whose transmembrane α helix has a hydrophobic length less than the hydrophobic thickness of a particular membrane will not be able to enter that membrane, and such a protein would then be retained in the golgi (webb et al., 1998). studies of targeting of proteins in yeast are also consistent with a lipid-based model (rayner and pelham, 1997). the length of the transmembrane domain is important in targeting, with long helices (24 residues) ensuring transport to the plasma membrane. however, for proteins with shorter transmembrane domains, the relative hydrophobicity of the transmembrane domain has been suggested to be important as well as its length, together determining targeting to the golgi and the vacuole (rayner and pelham, 1997).
retention of some membrane proteins in the er could also depend on the length of the transmembrane domain of the protein. an important class of er membrane proteins are those with an n-terminal catalytic domain exposed to the cytoplasm and a c-terminal membrane anchor. such proteins are inserted into the er membrane post-translationally by a signal-recognition-particle-independent pathway. no er retrieval signals have been identified in these proteins. instead, it has been observed that the hydrophobic domain is rather short. for example, cytochrome b5, a protein of this type, has a transmembrane domain containing just 17 hydrophobic amino acid residues (pedrazzini et al., 1996). if the length of the hydrophobic stretch is increased to 22 residues, the protein is transported out of the er along the secretory pathway (pedrazzini et al., 1996). it could therefore be that matching of the thickness of the lipid bilayer and the transmembrane length of the protein is important in retention in the er, as was suggested for the golgi complex. although the length of the transmembrane domain appears to be the most important factor, the structure of the c-terminal, lumenal, region has also been shown to contribute to retention (honsho et al., 1998). experiments with another c-terminal-anchored protein, the ubiquitin-conjugating enzyme ubc6 from yeast, suggest that the thickness requirements of the er and golgi membranes may be different, explaining targeting between these two organelles (yang et al., 1997). whereas ubc6 containing the wild-type 17-residue transmembrane domain targets to the er, increasing the length of the transmembrane domain to 21 residues results in movement to the golgi, and increasing the length further to 26 residues allows movement to the plasma membrane. these experiments show that the length of the transmembrane α helix is often an important factor in targeting, although it is likely to be only one of a number of important factors.
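to put the residue counts quoted for ubc6 and cytochrome b5 on the same scale as the bilayer thicknesses discussed earlier, they can be converted to approximate hydrophobic lengths. the sketch below assumes an ideal α helix with the 1.5 Å rise per residue used earlier in the text for the model peptides; the dictionary and function names are illustrative, and the lengths are geometric estimates rather than measured values.

```python
RISE_PER_RESIDUE = 1.5  # Å per residue, ideal α helix (value used in the text)

# Transmembrane domain lengths (residues) and destinations quoted in the text
# for ubc6 variants (yang et al., 1997).
UBC6_VARIANTS = {
    17: "er",
    21: "golgi",
    26: "plasma membrane",
}


def hydrophobic_length(n_residues):
    """Approximate hydrophobic length in Å of a transmembrane α helix."""
    return RISE_PER_RESIDUE * n_residues


def summarize(variants):
    """Return (residues, length in Å, destination) sorted by residue count."""
    return [(n, hydrophobic_length(n), dest)
            for n, dest in sorted(variants.items())]


if __name__ == "__main__":
    for n, length, dest in summarize(UBC6_VARIANTS):
        print(f"{n} residues ~ {length:.1f} Å -> {dest}")
    # cytochrome b5: 17 residues retained in the er, 22 residues exported
    print(hydrophobic_length(22) - hydrophobic_length(17))
```

on this estimate each additional residue adds 1.5 Å, so the 4- and 5-residue extensions that change the destination of ubc6 and cytochrome b5 correspond to an increase in hydrophobic length of roughly 6 to 7.5 Å.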
the lengths of the transmembrane α helices are also likely to be important in the proper function of membrane proteins containing multiple transmembrane α helices. an example already described is that of the ca2+-atpase, which shows highest atpase activity in di(c18:1)pc and lower activities in bilayers of phospholipids with longer or shorter fatty acyl chains (lee, 1998). changes in the atpase underlying these changes in atpase activity are complex (lee, 1998), but all must be mediated by the transmembrane α helices, because these are the parts of the atpase that can "sense" the change in bilayer thickness. in the case of the ca2+-atpase it seems that, as described above, the two major conformational states of the atpase (e1 and e2) have different preferences for bilayer thickness, the e1 conformation favoring thin bilayers and the e2 conformation favoring thick bilayers (lee, 1998). changing the bilayer thickness could change the tilt of the transmembrane α helices in the ca2+-atpase, it could change the packing of the helices, and, possibly, it could lead to changes in the structures of the loops connecting the helices, changing the effective lengths of the helices. all these changes could be linked to changes in the phosphorylation domain of the ca2+-atpase, located well above the surface of the membrane. if, as seems likely, the various membranes in a cell have different thicknesses because of their different lipid compositions, the structure of each membrane protein will have evolved to match the thickness of the membrane in which it resides.
protein-lipid interactions and differential scanning calorimetric studies of bacteriorhodopsin reconstituted lipid-water systems structure and dynamics of an amphiphilic peptide in a lipid bilayer: a molecular dynamics study protein, lipid and water organization in bacteriorhodopsin crystals: a molecular view of the purple membrane at 1.9 Å resolution molecular theory of chain packing, elasticity and lipid-protein interaction in lipid bilayers cholesterol and the golgi apparatus membrane electrostatics temperature-dependent aggregation of bacteriorhodopsin in dipalmitoyl- and dimyristoyl-phosphatidylcholine vesicles golgi localization of glycosyltransferases: more questions than answers the signal anchor and stem regions of the β-galactoside α2,6-sialyltransferase may each act to localize the enzyme to the golgi apparatus interaction of phosphatidic acid and phosphatidylserine with the ca2+-atpase of sarcoplasmic reticulum and the mechanism of inhibition interaction of a synthetic amphiphilic polypeptide and lipids in a bilayer structure influence of lipid/peptide hydrophobic mismatch on the thickness of diacylphosphatidylcholine bilayers: a 2h nmr and esr study using designed transmembrane alpha-helical peptides and gramicidin a different membrane anchoring positions of tryptophan and lysine in synthetic transmembrane alpha-helical peptides specificity of lipid-protein interactions as determined by spectroscopic techniques the structure of the potassium channel: molecular basis of k+ conduction and selectivity molecular sorting of lipids by bacteriorhodopsin in dilauroylphosphatidylcholine/distearoylphosphatidylcholine lipid bilayers is the protein/lipid hydrophobic matching principle relevant to membrane organization and functions?
molecular dynamics simulation of membranes and a transmembrane helix lipid selectivity of the calcium and magnesium ion dependent adenosinetriphosphatase, studied with fluorescence quenching by a brominated phospholipid structure and fluctuations of bacteriorhodopsin in the purple membrane: a molecular dynamics study internal packing of helical membrane proteins identifying nonpolar transbilayer helices in amino acid sequences of membrane proteins structure and function of the photosynthetic reaction center from rhodobacter sphaeroides spin label studies on the origin of the specificity of lipid-protein interactions in na+,k+-atpase membranes from squalus acanthias lipid patches in membrane protein oligomers: crystal structure of the bacteriorhodopsin-lipid complex a molecular model for lipid-protein interaction in membranes: the role of hydrophobic mismatch protein-lipid interaction. biophysical studies of (ca2+ + mg2+)-atpase reconstituted systems electron-crystallographic refinement of the structure of bacteriorhodopsin caveolae, digs, and the dynamics of sphingolipid-cholesterol microdomains retention of cytochrome b5 in the endoplasmic reticulum is transmembrane and luminal domain-dependent phase equilibria in an amphiphilic peptide-phospholipid model membrane by deuterium nuclear magnetic resonance difference spectroscopy orientation of α-helical peptides in a lipid bilayer structure at 2.8 Å resolution of cytochrome c oxidase from paracoccus denitrificans identification and extent of fluid bilayer regions in membranous cytochrome oxidase hydrophobic mismatch between proteins and lipids in membranes induction of nonbilayer structures in diacylphosphatidylcholine model membranes by transmembrane alpha-helical peptides: importance of hydrophobic mismatch and proposed role of tryptophans interactions of proteins and cholesterol with lipids in bilayer membranes non-random distribution of amino acids in the transmembrane segments of type 1 single span membrane
proteins calculation of phase diagrams for non-ideal mixtures of lipids, and a possible nonrandom distribution of lipids in lipid mixtures in the liquid crystalline phase how lipids interact with an intrinsic membrane protein: the case of the calcium pump what the structure of a calcium pump tells us about its mechanism lipid bilayer thickness varies linearly with acyl chain length in fluid phosphatidylcholine vesicles fluorescence quenching in model membranes. 2. determination of local lipid environment of the calcium adenosinetriphosphatase from sarcoplasmic reticulum structure of bacteriorhodopsin at 1.55 Å resolution lipid-protein interactions in the membrane: studies with model peptides effects of aromatic residues at the ends of transmembrane alpha-helices on helix interactions with lipid bilayers specificity of lipid-protein interactions mutational analysis of the golgi retention signal of bovine β-1,4-galactosyl transferase the photosynthetic reaction center from the purple bacterium rhodopseudomonas viridis: aspects of membrane protein structure the structure of bacteriorhodopsin at 3.0 Å resolution based on electron crystallography: implication of the charge distribution influence of membrane-spanning alpha-helical peptides on the phase behavior of the dioleoylphosphatidylcholine/water system simultaneous modeling of phase and calorimetric behavior in an amphiphilic peptide/phospholipid model membrane mattress model of lipid-protein interactions in membranes models of lipid-protein interactions in membranes sequences within and adjacent to the transmembrane segment of alpha-2,6-sialyltransferase specify golgi retention an investigation of the role of transmembrane domains in golgi protein retention combined influence of cholesterol and synthetic amphiphilic peptides upon bilayer thickness in model membranes energetics of inclusion-induced bilayer deformations retention and retrieval in the endoplasmic reticulum and the golgi apparatus the membrane
spanning domain of β-1,4-galactosyl transferase specifies trans golgi localization lipid mobility and order in bovine rod outer segment disk membranes: a spin-label study of lipid-protein interactions a mutant cytochrome b5 with a lengthened membrane anchor escapes from the endoplasmic reticulum and reaches the plasma membrane hydrophobic mismatch and long-range protein/lipid interactions in bacteriorhodopsin/phosphatidylcholine vesicles fluorescence quenching and electron spin resonance study of percolation in a two-phase lipid bilayer containing bacteriorhodopsin transmembrane domain-dependent sorting of proteins to the er and plasma membrane in yeast membrane protein structure and stability: implications of the first crystallographic analyses control of the transmembrane orientation and interhelical interactions within membranes by hydrophobic helix length hydrophobicity of the peptide c=o...h-n hydrogen bonded group subcellular organization of glycosylation in mammalian cells detergent structure in crystals of a bacterial photosynthetic reaction centre structure of the detergent phase and protein-detergent interactions in crystals of the wild-type (strain y) rhodobacter sphaeroides photochemical reaction center molecular dynamics of pf1 coat protein in a phospholipid bilayer conformational changes of phospholipid headgroups induced by a cationic integral membrane peptide as seen by deuterium magnetic resonance influence of the intrinsic membrane protein bacteriorhodopsin on gel-phase domain topology in two-component phase-separated bilayers transmembrane helix structure, dynamics, and interactions: multi-nanosecond molecular dynamics simulations annular and non-annular binding sites on the (ca2+ + mg2+)-atpase interactions of cholesterol hemisuccinate with phospholipids and (ca2+-mg2+)-atpase functional rafts in cell membranes mutational analysis of the signal-anchor domain of influenza virus neuraminidase dependence of lipid membrane phase transition
temperature on the mismatch of protein and lipid hydrophobic thickness lipid enrichment and selectivity of integral membrane proteins in two-component lipid bilayers characterization of the single ca2+ binding site on the ca2+-atpase reconstituted with short and long chain phosphatidylcholines molecular organization and dynamics of 1-palmitoyl-2-oleoylphosphatidylcholine bilayers containing a transmembrane alpha-helical peptide a golgi retention signal in a membrane-spanning domain of coronavirus e1 protein crystal structure of the calcium pump of sarcoplasmic reticulum at 2.6 Å resolution molecular interactions between lecithin and sphingomyelin principles of membrane protein assembly and structure architecture of helix bundle membrane proteins: an analysis of cytochrome c oxidase from bovine mitochondria hydrophobic mismatch and the incorporation of peptides into lipid bilayers: a possible mechanism for retention in the golgi experimentally determined hydrophobicity scale for proteins at membrane interfaces molecular dynamics simulations of individual alpha-helices of bacteriorhodopsin in dimyristoylphosphatidylcholine. ii. interaction energy analysis the transmembrane domain of a carboxyl-terminal anchored protein determines localization to the endoplasmic reticulum the preference of tryptophan for membrane interfaces interaction of a peptide model of a hydrophobic transmembrane α-helical segment of a membrane protein with phosphatidylcholine bilayers: differential scanning calorimetric and ftir spectroscopic studies ftir spectroscopic studies of the conformation and amide hydrogen exchange of a peptide model of the hydrophobic transmembrane α-helices of membrane proteins peptide models of helical hydrophobic transmembrane segments of membrane proteins. 2.
differential scanning calorimetric and ftir spectroscopic studies of the interaction of ac-k2-(la)12-k2-amide with phosphatidylcholine bilayers
we thank the biotechnology and biological sciences research council (bbsrc) for financial support of the original studies reported here.
key: cord-002835-qaogpxy9 authors: too, issac horng khit; bonne, isabelle; tan, eng lee; chu, justin jang hann; alonso, sylvie title: prohibitin plays a critical role in enterovirus 71 neuropathogenesis date: 2018-01-11 journal: plos pathog doi: 10.1371/journal.ppat.1006778 sha: doc_id: 2835 cord_uid: qaogpxy9 a close relative of poliovirus, enterovirus 71 (ev71) is regarded as an important neurotropic virus of serious public health concern. ev71 causes hand, foot and mouth disease and has been associated with neurological complications in young children. our limited understanding of the mechanisms involved in its neuropathogenesis has hampered the development of effective therapeutic options. here, using a two-dimensional proteomics approach combined with mass spectrometry, we have identified a unique panel of host proteins that were differentially and dynamically modulated during ev71 infection of motor-neuron nsc-34 cells, which are found at the neuromuscular junctions where ev71 is believed to enter the central nervous system. meta-analysis with previously published proteomics studies in neuroblastoma or muscle cell lines revealed minimal overlap, which suggests unique host-pathogen interactions in nsc-34 cells. among the candidate proteins, we focused our attention on prohibitin (phb), a protein that is involved in multiple cellular functions and the target of the anti-cancer drug rocaglamide (roc-a). we demonstrated that cell surface-expressed phb is involved in ev71 entry into neuronal cells specifically, while membrane-bound mitochondrial phb associates with the virus replication complex and facilitates viral replication.
furthermore, roc-a treatment of ev71-infected neuronal cells significantly reduced virus yields. however, the inhibitory effect of roc-a on phb in nsc-34 cells was not through blocking the craf/mek/erk pathway as previously reported. instead, roc-a treated nsc-34 cells had lower mitochondria-associated phb and lower atp levels that correlated with impaired mitochondria integrity. in vivo, ev71-infected mice treated with roc-a survived longer than the vehicle-treated animals and had significantly lower virus loads in their spinal cord and brain, whereas virus titers in their limb muscles were comparable to controls. together, this study uncovers phb as the first host factor that is specifically involved in ev71 neuropathogenesis and a potential drug target to limit neurological complications. enterovirus 71 (ev71) is a non-enveloped, positive-sense, single-stranded rna virus, and causes hand, foot and mouth disease (hfmd) in humans. being a close relative of poliovirus, ev71 is deemed as an important neurotropic virus worldwide [1]. since its first isolation in california in 1969, several major outbreaks have been reported in china, singapore, korea, and japan [2] [3] [4] [5]. although the clinical manifestations are generally mild and self-limiting, including hfmd and herpangina, severe neurological complications have been consistently reported with ev71-associated infections, causing brainstem encephalitis, acute flaccid paralysis, pulmonary edema and cardiopulmonary failure [6, 7]. in addition, some patients who have recovered from severe disease have been reported to develop long term neurologic and psychiatric disorders [8]. there are currently no effective prophylactic or therapeutic agents against ev71. although several vaccines have completed phase iii clinical trials [9], regulatory issues may limit their widespread utilization.
in addition, as these vaccine candidates consist of inactivated virus from a single ev71 genotype (c4), cross-protection against other genotypes may be limited [10, 11] . the increasing awareness of life-threatening ev71 infections has boosted research in recent years to further understand virus-host interactions and develop effective antiviral strategies [12] [13] [14] [15] [16] [17] . however, the neuropathogenesis of ev71 is still poorly understood. infection occurs when the virus enters the body upon ingestion and/or inhalation. the virus multiplies initially in the alimentary tract mucosa and rapidly reaches the deep cervical and mesenteric lymph nodes via the tonsils and peyer's patches [18] . after a short transient systemic dissemination phase, the virus accumulates and actively replicates in muscles where it is believed to infect motor neurons at the neuromuscular junctions. experimental evidence supports that ev71 migrates to the brainstem via retrograde axonal transport as previously described for its close relative poliovirus [1, 2, [19] [20] [21] . however, the molecular mechanisms involved in ev71 infection of motor neurons to access the central nervous system (cns) have not been studied. indeed in vitro studies aiming at studying ev71 neurovirulence have employed neuroblastoma cell lines that may not reflect accurately infection in motor neurons. to address this gap, we have recently reported a novel in vitro model of ev71 infection in the murine motor neuron cell line nsc-34 [22] . nsc-34 cells originate from the fusion between murine neuroblastoma and spinal cord cells, and possess motor neuron-like properties, such as generation of action potentials and production of acetylcholine [23] , therefore making it a relevant model to study the mechanism of ev71 neuropathogenesis. 
we demonstrated that nsc-34 cells are permissive to ev71 clinical isolates and found that, unlike any other mammalian cell types so far reported, ev71-infected nsc-34 cells do not undergo apoptosis and lysis. instead we showed that the virus exits the cells via a non-lytic mode, a phenomenon that has also been previously described for poliovirus [21, 24, 25]. these unique features thus suggested that the infection cycle of ev71 in nsc-34 cells involves host pathways and partners that are likely to be different from those previously identified in other mammalian cell types such as muscle cells and neuroblastoma cells. in this work, using a proteomics approach coupled with mass spectrometry, we have identified a panel of cellular proteins that were dynamically regulated during ev71 infection of nsc-34 cells. among the host protein candidates that were up-regulated, we focused our attention on prohibitin (phb) and characterized its role during ev71 infection in nsc-34 cells. we also demonstrated the importance of phb during ev71 infection in a symptomatic mouse model of ev71 infection. to identify the host proteins involved in the ev71 infection cycle in nsc-34 cells, a 2de proteomic approach was undertaken. nsc-34 cells were infected with ev71 at an m.o.i. of 10, and the cell lysates were harvested at 6, 24, 48 and 72 hours for downstream proteomic analysis in which a range of 350-800 spots were resolved. by using pdquest 2-d analysis software (biorad), a total of 81 protein spots (fig 1a) that displayed at least 0.5-fold differential expression (p<0.05, two-tailed student's t-test) compared to uninfected controls were excised for in-gel digestion and maldi-tof ms analysis. the peptide fingerprints were then searched against the ncbinr mouse genome database for protein identification using the mascot program (http://www.matrixscience.com/). the protein candidates were then categorized based on their primary functional class indicated in the uniprotkb database (s1 table).
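the spot-selection rule described above (a minimum change in expression combined with a two-tailed student's t-test) can be sketched as follows. this is only an illustration under stated assumptions: the intensities are invented, three replicates per group are assumed (hence the tabulated critical value for 4 degrees of freedom), and "0.5-fold" is read here as a change of at least 50% relative to the control mean, which is one possible interpretation. pdquest's own implementation is not reproduced.

```python
from math import sqrt
from statistics import mean, variance

# Two-tailed critical t value for alpha = 0.05 with 4 degrees of freedom
# (3 infected + 3 control replicates; the replicate count is an assumption).
T_CRIT_DF4 = 2.776


def student_t(a, b):
    """Pooled-variance (Student's) t statistic for two independent samples."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def is_differential(infected, control, min_change=0.5, t_crit=T_CRIT_DF4):
    """Flag a spot whose mean intensity differs from the control mean by at
    least `min_change` (as a fraction) and whose |t| exceeds `t_crit`."""
    change = abs(mean(infected) / mean(control) - 1.0)
    return change >= min_change and abs(student_t(infected, control)) > t_crit


# Hypothetical normalised spot intensities (arbitrary units), not study data.
spots = {
    "spot_a": ([2.1, 1.9, 2.0], [1.0, 1.1, 0.9]),    # clear increase
    "spot_b": ([1.05, 0.95, 1.0], [1.0, 1.1, 0.9]),  # essentially unchanged
}
selected = [name for name, (inf, ctl) in spots.items()
            if is_differential(inf, ctl)]
print(selected)
```

comparing |t| with a tabulated critical value keeps the sketch free of external libraries; with a statistics package available, an exact p value would normally be computed instead.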
to illustrate the dynamic regulation of host proteins during the viral infection, a heat map was generated using multiexperiment viewer (mev), with the distance between proteins represented by euclidean average linkage clustering (fig 1b). this clustering analysis revealed that proteins that were up-regulated (fig 1c) during infection are mainly involved in motility (23%) and catalytic processes (20%), while proteins that participate in rna processing (32%) and energy biosynthesis (20%) generally displayed a down-regulation trend during the course of infection (fig 1d). functional interactions among the selected host proteins were analyzed by string (search tool for the retrieval of interacting genes/proteins). this platform allows protein-protein interactions to be established based on published literature, online databases, predicted functional associations using genomic information or observations made with other organisms [26]. the protein network obtained was significantly enriched with a p value of less than 0.05, suggesting that the interactions are highly associated and unbiased (fig 2; s2 table). furthermore, some of the selected host proteins appear to have strong associations among each other as indicated by the thickness of connecting lines which reflects the confidence level of the interactions [26]. using go annotations for biological processes, molecular functions, cellular compartments and protein classes, the protein candidates were localized within the cytoplasm (29.2%), organelles (20.8%) and macromolecular complexes (13.9%) (s1a fig). in addition, they were found to be involved in various biological processes including mitochondrial biogenesis, proteolytic activity, cytoskeletal machinery and rna processing (s1a fig), consistent with the protein clusters observed in the string network (fig 2).
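the euclidean average-linkage clustering used to order proteins in the heat map can be illustrated with a small pure-python sketch. the protein labels and log2 fold-change values below are hypothetical placeholders for the four time points (6, 24, 48 and 72 hours), and the code is a generic agglomerative procedure, not the mev implementation.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def average_linkage(profiles):
    """Agglomerative clustering with Euclidean distance and average linkage.

    `profiles` maps a label to its expression profile (equal-length lists).
    Returns the merges in order as (members_a, members_b, distance) tuples.
    """
    clusters = [(name,) for name in profiles]
    merges = []

    def avg_dist(c1, c2):
        # Average of all pairwise distances between members of two clusters.
        total = sum(dist(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: avg_dist(clusters[p[0]], clusters[p[1]]))
        merges.append((clusters[i], clusters[j], avg_dist(clusters[i], clusters[j])))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical log2 fold changes at 6, 24, 48 and 72 h (not data from the study).
profiles = {
    "up_1": [0.2, 0.8, 1.5, 2.0],
    "up_2": [0.1, 0.9, 1.4, 2.1],
    "down_1": [-0.3, -1.0, -1.6, -2.2],
    "down_2": [-0.2, -1.1, -1.5, -2.0],
}
merges = average_linkage(profiles)
for a, b, d in merges:
    print(a, "+", b, f"at distance {d:.2f}")
```

with these placeholder values the two rising profiles merge first, then the two falling profiles, and the two groups join last, which is the kind of separation into up- and down-regulated clusters that the heat map is meant to convey.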
finally, molecular function analysis indicates that the majority of these proteins contribute to nucleic acid binding transcription factor activity (34.7%), structural molecule activity (25%) or binding (22.2%) (s1a fig). a meta-analysis with other selected proteomic studies of ev71-infected muscle and neuronal cells [13] [14] [15] 17, [27] [28] [29] revealed minimal overlap between nsc-34, rd and other neuronal cell types with 4 protein candidates only, namely actb, tubb, pdia3 and eno1, suggesting that the host-pathogen interactions in nsc-34 cells are unique (s1b fig). the limited overlap may also be partly explained by differential proteomics approaches. actb and tubb are involved in maintaining cytoskeletal structure, and they are found to be highly modulated during viral infection to facilitate virus internalization and transportation [30] [31] [32]. eno1 has been shown previously to interact with cytoskeletal proteins in intermediate filaments framework rearrangement [33]. on the other hand, pdi functions in catalyzing reduction and oxidation processes and protein folding [34]. it has also been demonstrated to be involved in humoral immune response [35] or viral replication (denv) [36] and entry (hiv) [37]. not surprisingly, greater overlap was seen between motor-neuron nsc-34 and other neuronal cells (26 shared hits) than between nsc-34 and rd cells (5 shared hits). importantly, neuro-specific proteins such as prph and uchl1 were only identified from profiling studies in nsc-34 and other neuronal cells, thus validating our 2de proteomic approach. seven protein candidates, namely alpha-enolase (eno1), dep domain-containing mtor-interacting protein (deptor), peripherin (prph), phosphatidylethanolamine-binding protein 1 (pebp1), prohibitin (phb), stomatin-like protein 2 (stoml2) and protein disulfide-isomerase a3 (pdia3), were selected for validation of the proteomic findings by gene knockdown.
these host proteins have been previously shown to be associated with various steps in the life cycle of viruses, such as entry [37] [38] [39] [40] and replication [15, 41, 42], or to be involved in autophagy [43] [44] [45] and axonal transport [46] [47] [48]. silencing of each selected gene target was achieved by reverse transfecting the on-targetplus sirna smartpool into nsc-34 cells, prior to viral infection. sirna smartpools consist of four highly potent gene-specific sirna molecules which have been modified to minimize off-target activity and enhance gene specificity [49]. cytotoxicity of the sirna smartpools was first established. apart from the sirna pool targeting pdia3, no significant cytotoxicity was observed with the other sirna pools at 25 and 50 nm, with cell viabilities greater than the 70% viability threshold (fig 3a). the pdia3-specific sirna pool concentrations were lowered to 5 and 10 nm to avoid cytotoxicity (fig 3a). virus titers in the culture supernatants of sirna-transfected cells were then determined at 48 h.p.i. results indicated that silencing of stoml2, prph, phb and deptor led to significantly lower virus titers, whereas knockdown of pdia3, pebp1 and eno1 resulted in increased viral titers in the supernatants of ev71-infected nsc-34 cells compared to control (fig 3b). importantly, both trends were dose-dependent. therefore, these results validate the 2d-proteomics approach as a powerful way to identify host proteins that play a role during ev71 infection in nsc-34 cells. prohibitins belong to a highly conserved protein family present in unicellular and multicellular eukaryotes [50]. prohibitin (phb; bap-32) and prohibitin 2 (phb2, rea, bap-37) are two highly homologous members of this family and are ubiquitously expressed in multiple cellular compartments including the mitochondria, nucleus, and the plasma membrane.
prohibitins have been involved in multiple cellular functions including cell proliferation and maintenance of the functional integrity of the mitochondria [50]. in addition, phb specifically has been previously reported to be involved in the entry step of alphavirus chikungunya (chikv) [40], and flaviviruses dengue (denv) [38] and hepatitis c (hcv) [39], and to interact with envelope proteins from white spot syndrome virus to prevent infection [51]. however, there has been no report so far on the role of phb during ev71 infection. first, modulation of phb expression during ev71 infection in nsc-34 cells was confirmed by western blot and showed an overall up-regulation of phb during the course of infection compared to uninfected control (s2 fig). next, the impact of phb gene silencing on virus production was further analyzed using a wider range of sirna pool concentrations including 5, 10, 25 and 50 nm. efficacy of the gene silencing was assessed by western blot and showed a dose-dependent decrease of phb expression in nsc-34 cell lysates (fig 3c). this dose-dependent phb knockdown correlated well with a dose-dependent reduction in the viral titers measured in the culture supernatant (fig 3d), therefore supporting the role for phb in the ev71 infection cycle. to address the possibility of false positive or off-target effects of the sirna pool, phb gene silencing was performed with the individual sirna species from the pool. western blot showed that each individual sirna was capable of silencing phb expression significantly (s3a 10 . the viral titers in the culture supernatants were determined by plaque assay at 48 h.p.i. cell viability of the transfected cells was assessed using alamarblue assay. statistical analysis was performed using two-tailed student's t-test (** p<0.005, *** p<0.005). relative band quantification (below western blot) was determined by imagej, by normalizing to loading control, β-actin. error bars represent mean ± standard deviation.
one representative of two biological repeats is shown. to determine if phb mediates entry of ev71 into nsc-34 cells, a competition assay was performed using commercially available anti-phb antibody. incubation of nsc-34 cells with anti-phb antibody prior to infection led to significant reduction of the viral titer in a dose-dependent manner (fig 4a). to demonstrate a physical interaction between cell surface-expressed phb and ev71, a proximity ligation assay (pla) was performed. in this assay, phb and ev71 are recognized by specific primary antibodies raised in two different species, which are in turn recognized by species-specific secondary antibodies conjugated to a probe and a target, respectively. should ev71 and phb be in close proximity, the probe and the target ligate, amplify and result in emission of a fluorescent signal. scarb-2, which has been previously demonstrated as the main receptor for ev71 in rd cells [52], was used as positive control and a red fluorescent signal was readily detected in ev71-infected rd cells (fig 4b). a positive signal was also detected with nsc-34 cells incubated with anti-phb and anti-ev71 antibodies, thus supporting the close proximity between ev71 virus particles and surface-expressed phb (fig 4b). in contrast, and as expected, no signal was detected with mouse scarb-2 (mscarb2) and ev71 antibodies (fig 4b), since we have shown previously that entry of ev71 into nsc-34 cells is not mediated by mscarb-2 [22]. the physical interaction between ev71 and cell surface-expressed phb was further assessed by performing a co-immunoprecipitation experiment. nsc-34 cells were incubated with ev71 for 2 hours at 4˚c to allow viral adsorption onto the cell surface but no internalization. the total cell lysate was obtained and a pulldown was carried out using antibody specific to phb or an isotype igg antibody control. the immunoprecipitates were then analyzed by western blot using anti-ev71 primary antibody.
a discrete band at the expected size was obtained when pulldown was performed with the anti-phb antibody whereas no band was seen when pulldown was done with the igg isotype (fig 4c). these findings thus support that ev71 physically interacts with cell surface-expressed phb, and suggest that phb may serve as a receptor for ev71 entry into nsc-34 cells. in addition to its association with the plasma membrane at the cell surface, phb is also present intracellularly [50]. to study the role of intracellular phb in the ev71 infection cycle, phb-knocked down nsc-34 cells were transfected with ev71 rna genome and the viral titers were determined at 6, 12, 18 and 24 hours post-transfection, in order to assess virus production within a single cell infection cycle. transfection of viral genome was meant to bypass the virus entry step which we have shown involves cell surface-expressed phb. no viral titer was obtained at 6 h.p.t. in the phb-knocked down and control cells (fig 5a). from 12 h.p.t. onwards the viral titers detected in the culture supernatant from phb-knocked down cells were consistently lower than those measured in sintc or non-treated cells (fig 5a). this result thus demonstrates that phb plays a role in the intracellular virus infection cycle. to further confirm this hypothesis, a luciferase ev71 (lucev71) replicon transfection assay was performed. in this replicon, the viral structural genes have been replaced with a luciferase-encoding gene while the other parts of the viral genome are retained (fig 5b). upon transfection, the replicon undergoes a single replication cycle with no production of virus progeny since it is deficient in viral structural proteins. the luminescence measured from the cell culture is proportional to the amount of luciferase produced inside the cell, thereby reflecting the replication activity of the replicon.
here, a significant reduction in the luminescence signal was observed in phb-knocked down nsc-34 cells compared to sintc-treated and non-treated cells (fig 5c). thus, together the data support that intracellular phb is involved in ev71 viral replication. ) were incubated with ev71 at 4˚c for 2 hours to allow viral adsorption, prior to pla staining. scarb2-stained nsc-34 and rd cells served as negative and positive controls, respectively. scale bar represents 20 μm. a representative experiment of two independent repeats is shown. (c) co-immunoprecipitation of ev71 and surface-expressed phb. ev71 and nsc-34 cells were co-incubated for 2 hours at 4˚c. the cell lysate was pulled down with anti-phb or igg isotype control antibodies prior to immunoblotting using anti-vp1 primary antibody. mock-infected cells and igg isotype served as control. a representative experiment of two independent repeats is shown. ip, immunoprecipitation; ib, immunoblot. to further investigate the role of intracellular phb during the ev71 infection cycle, immunostaining was performed on ev71-infected nsc-34 cells probing for phb, ev71 capsid proteins vp0/vp1 and the viral replication intermediate dsrna. co-localization between phb and dsrna, and between phb and ev71 capsid proteins was readily observed (fig 6a), supporting that intracellular phb could be involved in the viral replication and/or assembly processes. to further study the role of intracellular phb in viral replication, co-immunoprecipitation was carried out using antibody against phb. pull down with anti-phb antibody followed by western blot using anti-ev71 3d/3cd antibody led to the detection of a 72 kda band that corresponds to the ev71 3cd protein complex and a 53 kda band (3d polymerase) which comigrated with igg heavy chain (fig 6b).
taken together, these data support a physical interaction between intracellular phb and ev71 non-structural proteins 3d and 3cd, indicating that intracellular phb is likely involved in viral replication. previous studies have shown that the main replication sites of picornavirus are located at the golgi apparatus and endoplasmic reticulum (er) [53] . we have demonstrated that intracellular phb co-localizes with dsrna and is closely associated with the ev71 3d polymerase. given that intracellular phb is abundantly and mainly expressed on mitochondria [54] , we speculated that in nsc-34 cells mitochondria may be exploited by ev71 as replication site. consistently, co-localization of phb and ev71 with mitochondria was observed by ifa (fig 7a and 7b) . furthermore, co-localization of phb, dsrna and mitochondria was also readily detected, thus indicating that mitochondrial phb is associated with the viral replication (fig 7c) . to exclude the possibility that the replication complexes detected were actually associated to the er, which is in close proximity to mitochondria, the mitochondrial fraction was prepared from ev71-infected nsc-34 cells and western blot analysis revealed the presence of viral capsid protein vp1 (38 kd), 3d (53 kd) and 3cd (72 kd) proteins (fig 7d) . furthermore, the mitochondrial fraction was shown to be free of cytoplasmic contamination, as evidenced by the presence of mitochondrial marker (atpb, 52 kd) and lack of er marker (calreticulin, 48 kd). similar observation was made with the mitochondrial fraction prepared from ev71-infected rd cells (fig 7d) , suggesting that ev71 is able to exploit various cellular organelles as replication scaffolds in various mammalian cell types from different species. this finding is consistent with a previous study where ev71 vp1 was found to be associated with mitochondria in hela cells [55] . 
to further support the close proximity and likely interactions between the viral replication complexes and mitochondria, transmission electron microscopy was performed on ev71-infected nsc-34 cells. clustering of mitochondria surrounding viral replication complexes could be seen (electron-dense like structures) in the infected cells, and examination at a higher magnification indicated a close association between mitochondria membrane and virus complex/virus particle (fig 7e) . collectively, the data strongly indicate that mitochondria in nsc-34 cells are exploited by ev71 as a replication scaffold and that mitochondria-associated phb plays a role in this process. recent studies have shown that phb activity is inhibited by a group of phytochemicals called rocaglamides, which are derived from the traditional chinese medicinal plants aglaia [39, [56] [57] [58] . we therefore investigated whether roc-a could interfere with ev71 infection cycle by blocking phb activity in nsc-34 cells. incubation of roc-a with virus prior to nsc-34 cell infection (co-treatment) did not result in any significant reduction in viral titer (s4a fig) . when cells were pre-treated with roc-a prior to ev71 infection (pre-treatment), less than 1 log pfu/ml of decrease in viral titer was observed at the highest drug concentration only (500 nm) (s4b fig). in contrast, a dose-dependent decrease in the viral titer was seen when roc-a treatment was applied after infection (post-treatment) at concentrations ranging between 10-100 nm (fig 8a) . western blot analysis of the cell lysates further confirmed the dose-dependent reduction of intracellular viral capsid protein and phb (fig 8b) . taken together, the data suggest that the antiviral effect of roc-a on ev71-infected nsc-34 cells targets the viral replication step and not the entry step, in contrast to a previous study with hcv [39] . 
prior studies focusing on cancer have reported that the mechanism by which roc-a targets and inhibits phb activity involves blocking the craf/mek/erk pathway [56, 57] , and this was also described in the hcv study [39] . to investigate whether the craf/mek/erk to decipher the mode of action of roc-a in nsc-34 cells, immunostaining of roc-a treated nsc-34 cells was performed. decreased signals for phb and mitochondria were observed with increasing concentrations of roc-a (fig 8c) , suggesting that roc-a might affect mitochondrial integrity. we thus assessed the mitotoxicity and cytotoxicity of roc-a using the mitochondrial toxglo assay (promega). while cytotoxicity remained generally minimal over the range of roc-a concentrations tested, the intracellular atp levels were significantly reduced in a dose-dependent manner, thus indicating functional impairment of the mitochondria in roc-a-treated nsc-34 cells (fig 8d) . consistently, using the membrane-permeant jc-1 dye as an indicator of mitochondrial membrane potential, roc-a-treated nsc-34 cells displayed marked and dose-dependent mitochondrial depolarization as evidenced by the decrease of red fluorescent j-aggregates, compared to untreated cells (s6 fig). together, these observations suggest that roc-a treatment in nsc-34 cells results in reduced levels of phb, leading to mitochondrial destabilization and lower atp production. one could thus speculate that the lack of intact mitochondria and reduced intracellular atp levels might eventually impact negatively on ev71 replication efficacy. the role of phb in ev71 infection cycle was also studied in human muscle (rd) and neuronal (sk-n-sh) cell lines. as human (gi246483) and murine (gi6679299) phb display high similarity in their amino acid composition, most of the anti-phb antibodies commercially available demonstrate good cross reactivity with cell lines of both species. 
we first showed by flow cytometry comparable levels of surface expression of phb on rd, sk-n-sh and nsc-34 cells (s7 fig). however, both phb gene silencing and phb receptor blocking experiments performed in human muscle rd cells did not impact the viral titer (s8a and s8b fig). on the contrary, phb silencing in the human neuroblastoma cells sk-n-sh led to a significant dose-dependent reduction in viral titer in the culture supernatant (fig 9a). in addition, reduced virus titers were observed with sk-n-sh cells pre-treated with anti-phb antibodies, thus supporting that cell surface-expressed phb is involved in ev71 entry into this human neuroblastoma cell line (fig 9b). to investigate if intracellular phb is also involved in viral replication in sk-n-sh cells, transfection of lucev71 replicon into phb-silenced sk-n-sh cells was performed. results showed a significant reduction in the luminescence signal compared to controls (fig 9c). finally, the effectiveness of roc-a treatment in ev71-infected sk-n-sh cells was also assessed. similar to our observations with nsc-34 cells, a significant dose-dependent decline in viral titers was observed (fig 9d). taken together, these findings thus strongly indicate the specific involvement of phb in both viral entry and replication of ev71 in neuronal cells from both human and murine origins.

concentrations of roc-a for 48 hours before assessment of cytotoxicity (fluorescence) and mitotoxicity (luminescence) using mitochondrial toxglo assay. statistical analysis was performed using one-way anova with dunnett's post-test (*, p<0.05; **, p<0.005; ***, p<0.001; ****, p<0.0001). one representative from two independent experiments is shown. https://doi.org/10.1371/journal.ppat.1006778.g008

i. the culture supernatant was collected for viral titer determination by plaque assay, and the cell lysate was harvested for western blot analysis.
statistical analysis was performed using one-way anova with dunnett's post-test (**, p<0.005). relative band quantification (below western blot) was determined by imagej, by normalizing to loading control, β-actin. error bars represent mean ± standard deviation. cellular cytotoxicity was assessed using alamarblue assay. one representative from two independent experiments is shown. https://doi.org/10.1371/journal.ppat.1006778.g009

the role of phb was further investigated in vivo, using the mouse model of ev71 infection that we established previously where 2-week old ag129 mice (deficient in type i&ii ifn pathways) infected with ev71 display progressive limb paralysis and spatio-temporal virus accumulation in the limb muscles, spinal cord and brainstem [59]. here, immunohistochemical analysis showed that phb was readily detected in the limb muscles, brainstem and spinal cord at day 4 p.i. (fig 10a). furthermore, some co-localization with ev71 was observed (fig 10a). next, the in vivo anti-ev71 efficacy of roc-a was assessed by treating therapeutically ev71-infected mice with roc-a at day 1 and 3 p.i. the development of clinical manifestations was clearly delayed in the roc-a-treated mouse group which resulted in increased survival time compared to the untreated or vehicle-treated control groups (fig 10b). in addition, the viral loads in limb muscles, spinal cord and brain in both roc-a-treated and vehicle-treated mice were determined. comparable viral loads were detected in the limb muscles from both groups (fig 10c). in contrast, viral titers in the spinal cord and brain from the roc-a-treated animals were significantly lower compared to the vehicle-treated group (fig 10c), thus supporting that roc-a treatment specifically impairs ev71 neuropathogenesis. these findings correlate well with our in vitro data showing that the role of phb during ev71 infection cycle is specific to neuronal cells.
together, the in vivo data support that phb plays a critical role in ev71 neurovirulence, and that roc-a represents a potential therapeutic strategy to limit ev71 neuropathogenesis, thereby minimizing neurological manifestations and complications. the re-emergence of neurotropic enteroviruses in recent years has motivated investigations into ev71 transmission in the neuronal system. understanding the interplay between virus and host proteins is likely to result in the identification of potential novel drug targets and development of novel antiviral strategies. here, using a proteomics approach, we have identified a panel of host factors that displayed dynamic regulation during the course of ev71 infection in the motor neuron nsc-34 cells. the host protein candidates are mainly involved in cytoskeletal structure maintenance, rna processing and mitochondrial biogenesis. by employing a sirna gene silencing approach, we have shown that some of these host factors either facilitate or limit ev71 productive infection. among these host factors, phb was found to exert a pro-viral effect as evidenced by the reduced viral titers measured in the culture supernatant of nsc-34 cells when the expression of phb was down-regulated, and by an increased viral titer in phb over-expressing cells. phb is mainly localized on plasma membrane, mitochondria and nucleus, and has been involved in multiple signaling pathways regulated by growth factors, immune response, mitochondrial biogenesis, cell migration, proliferation and survival [50, 58] . interestingly, knockdown in nsc-34 cells of stoml2, which was shown to interact with phb and participate to mitochondria biogenesis [60] , resulted in significant reduction of viral titer, similar to that seen with phb-knocked down cells. this further supports the involvement of mitochondrial proteins during ev71 infection cycle in nsc-34 cells. 
in addition, previous studies have reported the association of phb with internalization of several viruses, including hcv [39], chikv [40], denv [38], and coronavirus (sars-cov) [61]. furthermore, phb was shown to promote hiv replication by interacting with the hiv-1 glycoprotein [62]. using various experimental approaches, we have demonstrated that cell surface-expressed phb is physically associated with ev71 and is involved in the entry of the virus into nsc-34 cells. on the other hand, by employing a lucev71 replicon, we have shown that intracellular (mitochondrial) phb plays a role in ev71 replication activity. this was further supported by the observation that mitochondrial phb co-localizes with the replicating viral genome (dsrna) and the non-structural proteins 3d polymerase and 3cd complex.

(a) two week-old ag129 mice (n = 3) were infected i.p. with ev71 (10^7 pfu) and the limb muscles, spinal cord and brainstem were harvested for immunohistochemical analysis at day 4 p.i. scale bars denote 100 μm (20× magnification) and 50 μm (40× magnification). (b) ev71-infected ag129 mice (n = 8) were treated i.p. with roc-a at 0.25 mg/kg at day 1 and 3 p.i. and were monitored for survival and clinical manifestations. clinical scores were defined as follows: 0, healthy; 1, ruffled hair and hunched back; 2, limb weakness; 3, one limb paralysis; 4, both limbs paralysis at which point the animals were euthanized. statistical analysis of survival curve was performed using logrank (mantel-cox) test (**, p<0.005). (c) limb muscles, spinal cord and brain were harvested at day 4 p.i. for viral load determination by plaque assay (n = 6/7). dotted line represents the limit of detection. statistical analysis was performed using mann-whitney u test (*, p<0.05). error bars represent mean ± sem. one representative of two biological repeats is shown. https://doi.org/10.1371/journal.ppat.1006778.g010
co-immunoprecipitation and tem approaches also confirmed the physical proximity and likely interactions between viral complexes/viral particles and mitochondria. these data thus led us to propose that mitochondria could serve as replication site for ev71 in nsc-34 cells. the association of ev71 with mitochondria was reported in a previous study where it was proposed that ev71 could potentially interact with some mitochondrial signaling proteins to evade host anti-viral innate immunity [55] . previous studies have reported that membrane-bound phb binds to ras in a gtp-dependent manner, which in turn activates craf kinase and eventually triggers the mapk pathway [58] . similar to other flavaglines, rocaglamide (roc-a) is a natural product that displays insecticidal, anti-fungal, anti-inflammatory and anti-cancer activities [63] . mechanistically, roc-a was found to inhibit craf-phb interactions in tumor cells [64] [65] [66] , and in an in vitro model of hcv infection [39] . in ev71-infected nsc-34 cells, incubation with nm concentrations of roc-a resulted in a dose-dependent reduction in virus titers. however, the mechanism by which roc-a exerts its antiviral effect against ev71 in nsc-34 cells does not seem to be mediated by blocking the craf/mek/erk pathway, given that phosphorylated erk proteins could not be detected in uninfected nsc-34 cells, suggesting that this pathway is not functional in these cells. instead, we found that roc-a-treated cells displayed reduced expression of mitochondrial phb and lower levels of intracellular atp, which suggests that mitochondria integrity/functionality is impaired in roc-a treated nsc-34 cells. since we showed that mitochondrial phb is involved in ev71 replication and that mitochondria serve as replication site for this virus, the impact of roc-a on phb expression and mitochondria integrity could represent the basis of its antiviral activity. 
further investigation is necessary to decipher the molecular mechanisms by which roc-a affects the expression of mitochondrial phb. interestingly, we found that the role of phb in ev71 entry and replication was limited to cells of neuronal origin, thus supporting a role of phb specifically in ev71 neuropathogenesis. this neuro-specific phenotype was also observed in vivo, where roc-a treatment resulted in reduced virus loads only in the cns (spinal cord and brain) and not in the limb muscles from infected mice, although phb was readily detected in the muscle cells as well. the cell type-dependent involvement of phb during ev71 infection likely reflects differential intracellular events, with different host factors being engaged during the ev71 intracellular life cycle. since ev71 is known to be able to use multiple receptors to enter host cells, one could speculate that the host factors that are engaged during ev71 infection depend on the entry receptor that is being used by the virus. further study is necessary to explore this idea. in conclusion, our work has uncovered a novel host factor that is specifically involved in ev71 neurovirulence. in addition, our data support that roc-a, a previously established anticancer drug that targets phb, could represent a therapeutic approach to limit ev71 neuropathogenesis, and thus prevent or limit associated neurological complications. given the current lack of effective antiviral drugs against ev71, roc-a repurposing is worth considering seriously.

all the animal experiments were carried out under the guidelines of the national advisory committee for laboratory animal research (naclar) in the aaalac-accredited nus animal facilities. the animal experiments described in this work were approved under the nus institutional animal care and use committee (iacuc) protocol number 16-0136. non-terminal procedures were performed under anesthesia, and all efforts were made to minimize suffering.
murine motor neuron nsc-34 cells (cellutions biosystems, clu140), human rhabdomyosarcoma (rd) cells (atcc ccl-136) and human neuroblastoma sk-n-sh cells (atcc htb-11) were used in this study. all cell lines were cultured in dulbecco's modified eagle's medium (dmem) (gibco) containing 10% fetal bovine serum (fbs) (gibco) at 37˚c with 5% co2. non-mouse-adapted ev71 s41 (5865/sin/00009, accession no.: af316321), kindly provided by prof. chow v. t. k. at national university of singapore, was isolated from the lymph node of an ev71-infected patient who died of encephalitis and pulmonary edema in singapore [67]. the virus stocks were made in rd cells and the viral titers were determined by plaque assay on rd cells.

total protein extract from infected cells was prepared using the proteoextract complete mammalian proteome extraction kit (millipore). briefly, the cell pellet was thawed by resuspension in ice-cold resuspension buffer and the proteins were extracted with extraction buffer at room temperature (rt). benzonase and reducing agent were added during protein extraction to minimize nucleic acid contamination and to reduce disulphide bonds, respectively. the solubilized protein suspension was subjected to centrifugation at 25,000 ×g for 30 minutes at 4˚c to remove the remaining insoluble material. the recovered cell extract was stored at -20˚c until further analysis.

the protein samples (200 μg, quantified using rc dc bradford assay, biorad) were loaded onto individual lanes of the isoelectric focusing (ief) tray with pre-wetted electrode wicks. passive rehydration was performed for each protein sample using 11 cm ph 4-7 readystrip ipg strips (biorad) for 12 hours at rt with gel side down configuration. after rehydration, the protein sample was subjected to ief on protean ief cell i11 (biorad) according to the following conditions: 250 v for 20 minutes with linear ramp, 8,000 v for 2.5 hours with linear ramp and 8,000 v for 30,000 v-hours with rapid ramping.
ipg strip equilibration was achieved by incubating the strips with pre-warmed dtt equilibration buffer i (biorad) followed by iodoacetamide-supplemented equilibration buffer ii (biorad) for 10 minutes each on an orbital shaker. equilibrated ipg strips were then transferred onto 12.5% tris-hcl criterion gel (biorad) and overlaid with readyprep overlay agarose (biorad). electrophoresis was run at 200 v for 65 minutes. after electrophoresis, the gels were stained with instantblue (expedeon) for 1 hour and submerged in milliq water overnight to remove background signal. gels were scanned using a gs-800 calibrated densitometer (biorad). gel images were further processed using pdquest 2-d analysis software (biorad), whereby the different gel images from three independent experiments were matched and the intensities of detected spots were measured. protein spots that showed at least 0.5-fold change in spot intensity (p<0.05, two-tailed student's t-test), compared to the uninfected control sample, were excised for maldi-tof ms. the fold change was calculated as the mean of spot expression in infected samples for each time point relative to that in the uninfected control sample. in-gel digestion and maldi-tof ms of the excised protein spots were done by protein and proteomics centre, national university of singapore (singapore). the data were searched against the murine and viral national centre for biotechnology information non-redundant (ncbinr) databases using the mascot program (http://www.matrixscience.com). no threshold was applied to the ms/ms fragment ion intensities. data mining of the identified proteins was done by searching in panther (http://www.pantherdb.org/) and swiss-prot/trembl (http://www.uniprot.org/) databases. the enrichment analysis of protein-protein interactions was performed using string network analysis version 10 (http://string-db.org/). hierarchical clustering and classification were performed using multiexperiment viewer version 4.9 (http://mev.tm4.org/#/welcome).
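the fold change computation described for the 2-d gel spots can be sketched as follows. this is a minimal illustration only: it assumes that the fold change is the mean spot intensity in infected samples at one time point divided by the mean intensity in the uninfected control, and the function name, time-point labels and intensity values are hypothetical, not taken from the study.

```python
from statistics import mean


def fold_change(infected_intensities, control_intensities):
    """Ratio of the mean spot intensity in infected samples (one time point)
    to the mean spot intensity in the uninfected control (assumed denominator)."""
    control_mean = mean(control_intensities)
    if control_mean == 0:
        raise ValueError("control mean intensity is zero; fold change is undefined")
    return mean(infected_intensities) / control_mean


# Hypothetical intensities for one spot, measured on three independent gels.
control = [800.0, 820.0, 780.0]
infected_by_time = {
    "early": [1500.0, 1700.0, 1600.0],  # hypothetical time-point label
    "late": [400.0, 380.0, 420.0],      # hypothetical time-point label
}

# One fold change per time point, each relative to the same uninfected control.
fold_changes = {
    label: fold_change(values, control)
    for label, values in infected_by_time.items()
}

for label, value in fold_changes.items():
    print(f"{label}: fold change = {value:.2f}")
```

a ratio above 1 would indicate a spot that is more intense after infection and a ratio below 1 a spot that is less intense; how such ratios map onto the selection cut-off used in the study is not specified here.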
rd cells (10^5 cells/well) were seeded onto 24-well plates. culture supernatant from ev71-infected samples was serially diluted (10-fold) with dmem containing 2% fbs prior to infection. the cell monolayer was incubated with 100 μl of the diluted viral suspension for 1 hour at 37˚c. the cells were then washed twice with pbs and the medium was replaced with 1 ml dmem containing 2% fbs and 1% carboxymethyl cellulose (cmc, sigma aldrich). after 3 days of incubation at 37˚c, the infected monolayers were fixed and stained with 4% paraformaldehyde/0.1% crystal violet solution (sigma aldrich). the number of plaques was scored visually and viral titers were expressed as plaque-forming units (pfu) per milliliter (pfu/ml).

drug-treated or sirna-transfected cells (2.5×10^4 nsc-34 or 5×10^4 sk-n-sh cells/well) were washed twice with pbs and 1× alamarblue reagent (invitrogen) diluted with dmem containing 2% fbs was added. after 4 hours of incubation at 37˚c, the fluorescence signals were determined using a microplate reader (infinite 2000, tecan) at ex 570 nm and em 585 nm. percentage of viable cells was calculated using non-treated cells as control.

table) for 20 mins at rt, the protein complexes were pulled down and subjected to western blot analysis (s6 table). briefly, the protein extracts were incubated with the conjugated magnetic beads and further incubated for 3 hours at 4˚c with constant rotation, before eluting using laemmli buffer. isotype antibody pull-down and uninfected cells were used as controls.

table). briefly, 20 μl of 5× cytotoxicity reagent were added into each well and incubated at 37˚c for 30 minutes. cytotoxicity was measured using fluorescence at ex 485 nm and em 525 nm. after equilibrating the assay plate to rt for 10 minutes, 100 μl of atp detection reagent were added into each well and mitotoxicity was measured by luminescence. all readings were normalized against non-treated cells.
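the two routine calculations described above, the plaque assay titer and the normalization of assay readings to non-treated cells, can be sketched as follows. this is an illustration under stated assumptions: the titer is taken as the plaque count divided by the product of the dilution and the inoculum volume (100 μl, i.e. 0.1 ml, as in the plaque assay), no blank subtraction is applied, and all function names and numeric values are hypothetical.

```python
from statistics import mean


def pfu_per_ml(plaque_count, dilution_exponent, inoculum_ml=0.1):
    """Titer from one well: plaques / (dilution x inoculum volume).

    dilution_exponent=4 denotes the 10^-4 step of a 10-fold dilution series.
    """
    if inoculum_ml <= 0:
        raise ValueError("inoculum volume must be positive")
    return plaque_count * (10 ** dilution_exponent) / inoculum_ml


def percent_of_control(sample_signal, control_signals):
    """Express a reading as a percentage of the mean non-treated control reading."""
    control_mean = mean(control_signals)
    if control_mean == 0:
        raise ValueError("control mean is zero; cannot normalize")
    return 100.0 * sample_signal / control_mean


# Hypothetical example: 25 plaques counted in the well inoculated with the 10^-4 dilution.
titer = pfu_per_ml(plaque_count=25, dilution_exponent=4)

# Hypothetical fluorescence readings: one treated well versus three non-treated wells.
viability = percent_of_control(450.0, [500.0, 480.0, 520.0])

print(f"titer: {titer:.2e} pfu/ml")
print(f"viability: {viability:.1f}% of non-treated control")
```

the same normalization helper applies equally to the fluorescence (cytotoxicity) and luminescence (mitotoxicity) readings, since both are expressed relative to non-treated cells.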
sodium azide (s2002, sigma aldrich) and staurosporine (s6942, sigma aldrich) were used as mitochondrial toxin and cytotoxin positive controls, respectively. nsc-34 (10 5 cells/well) cells were seeded onto 8-wells chamber slides (ibidi) and incubated overnight. the cells were treated with various concentrations of roc-a (diluted with 2% dmem) for 48 hours. jc-1 dye (invitrogen) (10 μg/ml in 2% dmem) was added into each well and incubated for another 15 mins, prior to imaging. nsc-34 cells incubated with sodium azide nan3 at 1 μm in 2% dmem (s2002, sigma aldrich) was used as positive control. cell lysate was prepared using m-per mammalian protein extraction reagent containing 1% of halt protease inhibitor cocktail and 1% of 0.5 m edta (thermoscientific). protein quantification was performed using quick start bradford protein assay (biorad). denatured proteins (5 μg) were resolved in 10% sds-page gel and transferred electrophoretically onto nitrocellulose membrane using trans-blot turbo transfer system (biorad). after blocking with 5% (w/v) milk in tbst (tbs buffer with 0.01% tween 20) for 1 hour at rt, the membrane was probed using specific primary antibodies and relevant secondary antibodies (s6 table) . the chemiluminescence signal was visualized using clarity western ecl substrate (biorad) on x-ray films. densitometric quantification was performed using imagej and the relative band intensity was normalized against β-actin. nsc table) . the nucleus was revealed using nucblue live readyprobes reagent (molecular probes). fluorescence images were captured using olympus ix81 microscope and further processed using imagej. limb muscles, spinal cord and brainstem from ev71-infected ag129 mice were harvested at day 4 p.i. after systemic perfusion. the organs were fixed in 4% pfa overnight prior immersing in 15% and 30% sucrose solution, and embedded in tissue-tek oct (vwr) solution. 
the organ samples were then frozen at -80˚c, sectioned (10 μm) using a cryostat (leica) and mounted on a glass slide prior to blocking and staining as described above.

samples were fixed for 1 h at rt with 2.5% glutaraldehyde containing 1% tannic acid in 0.1 m cacodylate buffer (ph 7.2), then washed three times for 5 min (each time) in 0.1 m cacodylate buffer and post-fixed for 1 h at rt with 1% osmium tetroxide in the same buffer. samples were then dehydrated in a graded series of ethanol and embedded in spurr. thin sections were stained with 2% uranyl acetate and lead citrate and observations were performed by transmission electron microscopy using a fei tecnai spirit g2 at 100 kv. images were taken using a fei eagle 4k ccd camera.

nsc-34, sk-n-sh (8 × 10^6 cells) and rd (4 × 10^6 cells) cells were seeded onto 6-well plates overnight. the cells were dislodged by incubating them in 10 mm edta/pbs at 4˚c for 10 mins and spun down to collect the cell pellet. cells were then blocked with either human or murine fc-blocker (564200 or 553142, bd pharmingen; 1:400) for 30 min at rt, and subsequently stained with anti-phb antibody (ab75766, abcam; 1:200) and/or anti-rabbit af488 antibody (a27034, thermoscientific; 1:500) for 30 min at 4˚c. stained cells were then fixed with 4% pfa. flow cytometry analysis was carried out using the becton-dickinson fortessa flow cytometer and analysed using flowjo v10.

two-week old ag129 mice (b&k universal) were bred and housed under specific pathogen-free conditions in individual ventilated cages. infection was performed by injecting intraperitoneally (i.p.) 10^7 pfu of s41 (in 100 μl) per mouse. at day 1 and 3 post-infection (p.i.), mice were injected i.p. with 0.25 mg/kg of roc-a (in 0.25% dmso in sterile olive oil). the control group was inoculated with 0.25% dmso in olive oil. clinical manifestations were observed for a period of 20 days.
clinical score was graded as follows: 0, healthy; 1, ruffled hair and hunched back; 2, limb weakness; 3, one limb paralysis; 4, two limbs paralysis; and 5, death. two limbs paralysis was used as criterion for early euthanasia. for virus titer determination, infected mice were euthanized at day 4 p.i. and perfused with 50 ml of sterile pbs systemically. the fore and hind limb muscles, spinal cord and brain were harvested and weighed before mechanical homogenization in 1 ml of serum-free dmem. the homogenates were spun down at 10,000 rpm for 10 minutes at 4˚c, and clarified using a 0.22 μm filter before serial dilution was carried out for plaque assay. viral titers were expressed as pfu per gram of tissue.

supporting information s1 relative band quantification (below western blot) was determined by imagej, by normalizing to loading control, β-actin. statistical analysis was performed using two-tailed student's t-test (**, p<0.005). one representative from two independent experiments is shown. (pdf)

(a) for co-treatment assay, virus was incubated with roc-a for 1 hour before adding the mixture onto the cells. after one hour of incubation on cell monolayer, the mixture was then removed and replaced with fresh 2% dmem. (b) for pre-treatment assay, the cells were pre-treated with roc-a for 3 hours, prior to viral infection. culture supernatants were harvested at 48 h.p.i. for viral titer determination by plaque assay. cell viability was assessed using alamarblue viability assay. statistical analysis was performed using one-way anova with dunnett's post-test (**, p<0.005). one representative from two independent experiments is shown. (pdf)

viral titer in the culture supernatant was determined by plaque assay at 12 h.p.i. statistical analysis was performed using two-tailed student's t-test (* p<0.05, ** p<0.005, *** p<0.005, **** p<0.0001). error bars represent mean ± standard deviation.
relative band quantification (below western blot) was determined by imagej, by normalizing to loading control, β-actin. non-targeting sirna (ntc) served as control. (b) rd cells were pre-incubated with anti-phb antibody or isotype control antibody for 1 hour before infection. culture supernatant was harvested at 12 h.p.i. for viral titer determination. cell viability was determined using alamarblue cytotoxicity assay. one representative of two biological repeats is shown. (pdf) neurotropic enterovirus infections in the central nervous system epidemiological analysis, detection, and comparison of space-time patterns of beijing hand-foot-mouth disease enterovirus 71 infection with central nervous system involvement the epidemiology of hand, foot and mouth disease in asia: a systematic review and analysis virology, epidemiology, pathogenesis, and control of enterovirus 71. the lancet infectious diseases 10 outbreaks of hand, foot, and mouth disease by enterovirus 71. high incidence of complication disorders of central nervous system ev71 vaccine, an invaluable gift for children ev71 vaccine: protection from a previously neglected disease status of research and development of vaccines for enterovirus 71 proteomic analysis of extremely severe hand, foot and mouth disease infected by enterovirus 71 proteomics analysis of ev71-infected cells reveals the involvement of host protein nedd4l in ev71 replication comparative proteome analyses of host protein expression in response to enterovirus 71 and coxsackievirus a16 infections transcriptomic and proteomic analyses of rhabdomyosarcoma cells reveal differential cellular gene expression in response to enterovirus 71 infection human genome-wide rnai screen reveals host factors required for enterovirus 71 replication global quantitative proteomic analysis of human glioma cells profiled host protein expression in response to enterovirus type 71 infection a practical guide to clinical virology retrograde axonal transport: a major 
transmission route of enterovirus 71 in mice virus infections in the nervous system & feuer r. enterovirus infections of the central nervous system enterovirus 71 infection of motor neuron-like nsc-34 cells undergoes a non-lytic exit pathway neuroblastoma x spinal cord (nsc) hybrid cell lines resemble developing motor neurons nonlytic viral spread enhanced by autophagy components topology of double-membraned vesicles and the opportunity for nonlytic release of cytoplasm string v10: protein-protein interaction networks, integrated over the tree of life cellular proteome alterations in response to enterovirus 71 and coxsackievirus a16 infections in neuronal and intestinal cell lines beta-actin variant is necessary for enterovirus 71 replication proteomic analysis of human brain microvascular endothelial cells reveals differential protein expression in response to enterovirus 71 infection requirement for an intact t-cell actin and tubulin cytoskeleton for efficient assembly and spread of human immunodeficiency virus type 1 tubulin: a factor necessary for the synthesis of both sendai virus and vesicular stomatitis virus rnas subversion of the actin cytoskeleton during viral infection multifunctional alpha-enolase: its role in diseases protein disulfide isomerase in redox cell signaling and homeostasis comparing the primary and recall immune response induced by a new ev71 vaccine using systems biology approaches dengue virus infection induces upregulation of hn rnp-h and pdia3 for its multiplication in the host cell galectin-9 binding to cell surface protein disulfide isomerase regulates the redox environment to enhance t-cell migration and hiv entry identification and characterization of prohibitin as a receptor protein mediating denv-2 entry into insect cells a novel class of small molecule compounds that inhibit hepatitis c virus infection by targeting the prohibitin-craf pathway identification of prohibitin as a chikungunya virus receptor protein interaction of 
stomatin with hepatitis c virus rna polymerase stabilizes the viral rna replicase complexes on detergent-resistant membranes network based analysis of hepatitis c virus core and ns4b protein interactions pebp1, a raf kinase inhibitory protein, negatively regulates starvation-induced autophagy by direct interaction with lc3 epigenetic regulation of autophagy by the methyltransferase ezh2 through an mtordependent pathway knockdown of deptor induces apoptosis, increases chemosensitivity to doxorubicin and suppresses autophagy in rpmi-8226 human multiple myeloma cells in vitro defective axonal transport of neurofilament proteins in neurons overexpressing peripherin intermediate filaments in peripheral nervous system: their expression, dysfunction and diseases neuronal subtype and satellite cell tropism are determinants of varicella-zoster virus virulence in human dorsal root ganglia xenografts in vivo prohibitins role in cellular survival through ras-raf-mek-erk pathway prohibitin interacts with envelope proteins of white spot syndrome virus and prevents infection in the red swamp crayfish, procambarus clarkii human scarb2-dependent infection by coxsackievirus a7, a14, and a16 and enterovirus 71 viral and host proteins involved in picornavirus life cycle the mitochondrial phb complex: roles in mitochondrial respiratory complex assembly, ageing and degenerative disease enterovirus 71 protease 2apro targets mavs to inhibit anti-viral type i interferon responses inhibition of the craf/prohibitin interaction reverses craf-dependent resistance to vemurafenib the natural anticancer compounds rocaglamides inhibit the raf-mek-erk pathway by targeting prohibitin 1 and 2 prohibitin ligands in cell death and survival: mode of action and therapeutic potential a non-mouse-adapted enterovirus 71 (ev71) strain exhibits neurotropism, causing neurological manifestations in a novel mouse model of ev71 infection stomatin-like protein 2 binds cardiolipin and regulates mitochondrial 
we would like to thank a/p vincent chow from the department of microbiology & immunology, nus, for sharing the s41 strain, and the electron microscopy unit core facility of the yong loo lin school of medicine, nus. conceptualization: issac horng khit too, sylvie alonso. formal analysis: issac horng khit too, isabelle bonne.
key: cord-006636-xgikbdns authors: ühlein, e. title: übersicht über neue ernährungswissenschaftliche publikationen date: 1964-02-01 journal: z ernahrungswiss doi: 10.1007/bf02021334 sha: doc_id: 6636 cord_uid: xgikbdns nan
shoemaker, w. c., yanof, h. m., turk, l. n., u. wilson, t. h.: glucose and fructose absorption in the unanesthetized dog. gastroenterology 44 [1963] nr. 5, s. 654ff. (10 s.). * siegel, p. s., u. correia, m. j.: speed of resumption of eating following distraction in relation to number of hours food-deprivation. physiol. rec. 13 [1963] nr. 1, s. 39ff. (6 s.). si~eth, r. n., u. christie, it. j.: studies on the absorption of insulin from the gastrointestinal tract of the rabbit. diabetes 12 [1963] nr. 3, s. 243ff. (6 s.).
* st~oh~.x~r, g.: messung der körperradioaktivität zur bestimmung der eisenresorption beim menschen. nuclear med., suppl. 1 [1963] s. 69/76. tallis, g. m., moore, r. w., u. gream, b. d.: specific gravity of live sheep. nature 198 [1963] nr. 4876, s. 214. tanaka, s., nu-kasawa, a., yoshikawa, h., u. yoneyama, y.: protein nutrition and radiation damage in mice. nature 197 [1963] nr. 4864, s. 305/306. thomson, a. m., u. billewicz, w. z.: nutritional status, maternal physique, and reproductive efficiency. proc. nutrition soc. 22 [1963] nr. 1, s. 55/60. toothill, j.: the effect of certain dietary factors on the apparent absorption of magnesium by the rat. brit. j. nutrition 17 [1963] nr. 1, s. 125/134. * trail, j. c. m.: upgrading the indigenous poultry of uganda. i.: the growth rates and feed conversion from hatching to maturity of indigenous poultry crossed with four imported breeds. j. agric. sci. 60 [1963] nr. 2, s. 211ff. (6 s.). van niekerk, b. d. h., reid, j. t., bensadoun, a., u. paladines, o. l.: urinary creatinine as an index of body composition. j. nutrition 79 [1963] nr. 4, s. 463/473. voigtländer, h.: der rhythmus des brotessens. lebensmittel u. ernährung 16 [1963] nr. 2, s. 12/13. wadsworth, g. r.: nutrition surveys: clinical signs and biochemical measurements. proc. nutrition soc. 22 [1963] nr. 1, s. 72/78. watson, w. c., gordon, r. s., karmen, a., u. jover, a.: the absorption and excretion of castor oil in man. j. pharmacy pharmacol. 15 [1963] nr. 3, s. 183/188. wheeler, r. r.: evaluation of various indicator techniques in estimating forage intake and digestibility by range cattle. diss. abstr. 23 [1962] nr. 6, . williams, w. p., davies, r. e., u. couch, j. r.: the utilization of carotenoids by the hen and chick. poultry sci. 42 [1963] nr. 3, s. 691ff. (8 s.). wolford, j. h., ringer, r. k., coleman, t. h., u. zindel, h. c.
: individual feed consumption of turkey breeder hens and the correlation of feed intake, body weight, and egg production. poultry sci. 42 [1963] nr. 3, s. 599ff. (5 s.). young, g. m., tensuan, r. s., saimt, f., u. holmes, f.: estimating body fat of normal young woman. visualizing fat pads by soft-tissue x-rays. j. amer. dietetic assoc. 42 [1963] nr. 5, s. 409ff. (5 s.). zucker, t. f., u. zucker, l. m.: fat accretion and growth in the rat. j. nutrition 80 [1963] nr. 1, s. 6/19.
intermediary metabolism 2c-1 kohlenhydrate - carbohydrates
+ n. n.: enzymatic defects in the hereditary fructosurias. nutrition rev. 21 [1963] nr. 5, s. 137/138. * agosin, m., scaramelli, n., dinamarca, m. l., u. aravena, l.: intermediary carbohydrate metabolism in triatoma infestans (insecta: hemiptera). ii.: the metabolism of 14c-glucose in triatoma infestans nymphs and the effect of ddt. comp. biochem. physiol. 8 [1963] nr. 4, s. 311ff. (10 s.). baens, g. s., lundeen, e., u. cornblath, m.: studies of carbohydrate metabolism in the newborn infant. vi.: levels of glucose in blood in premature infants. pediatrics 31 [1963] nr. 4, s. 580ff. (10 s.). * barker, j., u. mapson, l. w.: studies in the respiratory and carbohydrate metabolism of plant tissues. xii.: further studies of the formation of co2 and the changes in lactate, alcohol, sucrose, pyruvate, and cr in potato tubers in nitrogen and in air following anaerobic conditions. proc. royal soc. b 157 [1963] nr. 968, s. 383ff. (20 s.). * bean, r. c., porter, g. g., u. bur, b. k.: carbohydrate metabolism of avocado. ii.: formation of sugars during short periods of photosynthesis. plant physiol. 38 [1963] nr. 3, s. 280ff. (5 s.). bergman, e. n.: quantitative aspects of glucose metabolism in pregnant and nonpregnant sheep. amer. j. physiol. 204 [1963] nr. 1, s. 147/152. b~-~t~au~, p. l.: effect of temperature on the fate of certain sugars in the gut of oxycarenus hyalinipennis costa (heteroptera: lygaeidae).
naturwissenschaften 50 [1963] nr. 12, s. 445. * broitman, s. a., u. zamcheck, n.: utilization and excretion of d-xylose in thyroxine-treated rats. j. labor. clin. med. 61 [1963] nr. 3, s. 476ff. (7 s.). davidson, e. a., u. small, w.: metabolism in vivo of connective tissue mucopolysaccharide. i.: chondroitin sulfate c and keratosulfate of nucleus pulposus. biochimica biophysica acta 69 [1963] nr. 3, w.: metabolism in vivo of connective tissue mucopolysaccharide. ii.: chondroitin sulfate b and hyaluronic acid of skin. biochimica biophysica acta 69 [1963] nr. 3, s. 453/458. --, u. small, w.: metabolism in vivo of connective tissue mucopolysaccharide. iii.: chondroitin sulfate and keratosulfate of cartilage. biochimica biophysica acta 69 [1963] nr. 3, s. 459/463. dea~, j. m.: a comparative study of carbohydrate metabolism in fish as affected by temperature and exercise. diss. abstr. 23 [1963] nr. 10, 63-2088 (79 s.). de bodo, r. c., steele, r., altszuler, n., dunn, a., u. bishop, j. s.: effects of insulin on hepatic glucose metabolism and glucose utilization by tissues. diabetes 12 [1963] nr. 1, s. 16ff. (12 s.). * edelman, j., recaldin, d. a. c. l., u. dickerson, a. g.: the metabolism of fructose polymers in plants. 3.: the activity of 1f-fructosylsucrose/sucrose transfructosylase in living tissue of helianthus tuberosus l. bull. res. council israel 11 a [1963] nr. 4, s. 275ff. (4 s.). * farooqi, m. i. h., u. karl, k. n.: potassium nitrate and carbohydrate contents of argemone mexicana l. at different stages of its growth. current sci. 32 [1963] nr. 4, s. 171ff. (1 s.). * geissbühler, f., u. favarger, p.: effet des rayons x sur le métabolisme du glucose et des acides gras chez la souris mâle. helvetica physiologica et pharmacologica acta 20 [1962] nr. 4, s. c 56ff. (2 s.). gra~cr, b. r.: an investigation of the mechanism of absorption of sugars by plant cells. diss. abstr. 23 [1963] nr. 7, 63-124 (112 s.). * gryboski, j. d., thayer, w.
r., gryboski, w. a., gabrielson, i. w., u. spiro, h. m.: a defect in disaccharide metabolism after gastrojejunostomy. new engl. j. med. 268 [1963] nr. 2, s. 78/80. * hatch, m. d., u. glasziou, k. [1963] nr. 3, s. 338ff. (6 s.). heald, p. j.: the metabolism of carbohydrate by liver of the domestic fowl. biochem. j. 86 [1963] nr. 1, s. 103/110. henschel, m. j., hill, w. b., u. porter, j. w. g.: carbohydrate digestion in the small intestine of the young steer. proc. nutrition soc. 22 [1963] nr. 1, s. v/vi. hernández, a., u. sols, a.: transport and phosphorylation of sugars in adipose tissue. biochem. j. 86 [1963] nr. 1, s. 166/172. h, , .z, h., eric, , , c., u. glaubitt, d.: veränderung von zelldichte und polysaccharidstoffwechsel im alternden bindegewebe. klin. wschr. 41 [1963] nr. 7, s. 332/335. holt, p. r., haessler, h. a., u. isselbacher, k [1963] nr. 40, s. 56ff. (7 s.). jennings, a. c., u. morton, r. k.: changes in carbohydrate, protein, nitrogenous compounds of developing wheat grain. australian j. biol. sci. 16 [1963] nr. 2, s. 318ff. (14 s.). * kok, b., hoch, g., u. cooper, b.: sensitization of chloroplast reactions. i.: sensitization of reduction and oxidation of cytochrome c by chloroplasts. plant physiol. 38 [1963] nr. 3, s. 274ff. (6 s.). * levin, b., oberholzer, v. g., snodgrass, g. j. a. i., stimmler, l., u. wilmers, m. j.: fructosaemia. an inborn error of fructose metabolism. arch. disease childhood 38 [1963] nr. 199, s. 220ff. (11 s.). lindquist, b., u. meeuwisse, g. w.: intestinal transport of monosaccharides in generalized and selective malabsorption. acta paediatrica 1963 suppl. 146, s. 110ff. (6 s.). lundholm, l., u. mohme-lundholm, e r, j. a.: i.: glucose tolerance in swine as related to post-mortem muscle characteristics. ii.: the relationship of niacin and niacin coenzymes with various post-mortem properties of porcine muscle. diss. abstr. 23 [1963] nr. 9, 63-2902 (151 s.). * mráz, m.
: carbohydrate metabolism following traumatization in the noble-collip drum and shock due to burns in rats. physiologia bohemoslovenica 12 [1963] nr. 2, s. 145ff. (5 s.). * murder, r., u. smith, f. h.: abnormal carbohydrate metabolism in pancreatic carcinoma. med. clinics north america 47 [1963] nr. 2, s. 397ff. (10 s.). * randle, p. j., garland, p. b., hales, c. n., u. newsholme, e. a.: the glucose fatty-acid cycle. lancet 1 [1963] nr. 7285, s. 785ff. (5 s.). rossi, f., zatti, m., u. greenbaum, a. l.: evidence for the existence of the hexose monophosphate pathway for glucose metabolism in the normal and denervated skeletal muscle of rats. biochem. j. 87 [1963] nr. 1, s. 43/48. * ruiter, j., weinberg, f., u. morrison, a.: the stability of glucose in serum. clinical chem. 9 [1963] nr. 3, s. 356ff. (4 s.). * sacher, j. a., hatch, m. d., u. glasziou, k. t [1963] nr. 5, s. 1203ff. (3 s.). * schimassek, h.: metabolite des kohlenhydratstoffwechsels der isoliert perfundierten rattenleber. biochem. z. 336 [1963] nr. 6, s. 460ff. (8 s.). * schirren, c.: relation between fructose content of semen and fertility in man. j. reproduction fertility 5 [1963] nr. 3, s. 347ff. (12 s.). * schlubach, h. h., u. g~em~, m neptune, e. m., u. sudduth, h. c.: toxic effects of oxygen at high pressure on the metabolism of d-glucose by dispersion of rat brain. biochem. j. 88 [1963] nr. 1, s. 31/45. * wager, h. g.: pathways of utilization of (1-14c) glucose and (6-14c) glucose in slices of peas. j. exper. botany 14 [1963] nr. 40, s. 63ff. (19 s.). weic-~r, h., schs~r~cl, h., u. re~schle~, h. e.: über physiologische glucose-ausscheidung im urin von stoffwechsel-gesunden. klin. wschr. 41 [1963] nr. 4, s. 201. weinhouse, s., friedmann, b., u. reichard, g.: effects of insulin on hepatic glucose production and utilization. diabetes 12 [1963] nr. 1, s. 1ff i~ou~, l~i., u. kummerow, f. a.: fatty acid composition of lymph lipids from rats fed fresh and thermally oxidized fats.
j. dairy sci. 46 [1963] nr. 3, s. 176ff. (5 s.). * blou~rr, a. w., u. co~, m.: tissue lipid pattern in a case of xanthoma disseminatum. arch. intern. med. 111 [1963] nr. 4, s. 511/517. bortz, w., abraham, s., u. chaikoff, i. l.: localization of the block in lipogenesis resulting from feeding fat. j. biol. chem. 238 [1963] nr. 4, s. 1266/1272. bronsert, u., hartmann, f., u. mitzkat, h.-j.: die wirkung von milchsäure im fettstoffwechsel der leber bei experimenteller fettleber der ratte. naturwissenschaften 50 [1963] nr. 4, s. 129. brown, m. l.: effect of a low dietary level of three types of fat on reproductive performance and tissue lipid content of the vitamin b6-deficient female rat. j. nutrition 79 [1963] nr. 2, s. 124/130. * buskirk, e. r., thompson, r. h., u. whedon, g. d.: metabolic response to cold air in men and women in relation to total body fat content. j. appl. physiol. 18 [1963] nr. of mast cells to fat transport. ann. new york acad. sci. 103 [1963] art. 1, s. 313ff. (9 s.). katorsx~, b. a.: the influence of growth hormone on fat and protein metabolism. diss. abstr. 23 [1962] nr. 6, 63-1036 (55 s. amer. j. clinical nutrition 12 [1963] nr. 6, s. 431ff. (6 s.). mxmo~, j. e.: the influence of diet and age on lipid metabolism of chickens. diss. abstr. 23 [1963] nr. 10, 63-1951 (63 s.). * me, i, f.: a study on the lipid metabolism in the patients with atherosclerosis especially on the cholesterol in serum lipoprotein β-fraction. japan. heart j. 4 [1963] nr. brown, l. d., grev~es, 1%. m., u. duncan, c. w.: effect of protein level in milk replacers on growth and protein metabolism of dairy calves. j. dairy sci. 46 [1963] nr. 6, s. 538ff. (6 s.). * mandelstam, j.: protein turnover and its function in the economy of the cell. ann. new york acad. sci. 102 [1963] nr. 3, s. 621ff. (16 s.). masters, c. j.: nucleic acids and protein stores in the merino sheep. australian j. biol. sci. 16 [1963] nr. 1, s. 192ff. (9 s.).
noack, r.: zur biochemischen funktion des proteins als grundlage der bedarfsaormen. ernehrungsforschung 8 [1963] nr. 1, s. 15/24. *par.r. anin, a. u : le m6tabolisme pro~ique dans le systems nel-ceux. studii si cercet~ri de biochimie 6 [1963] nr. 1, s. 7ft. (16 s.) . roouski, j., ~i[akowska, k., i~sn~, j., stankowska, a., hryniewiecki, l., u [1963] nr . 42 [1963] . suppl. to nr. 1, s. 453ff. (8 s.) . co~solazio, c. f., ~ixtous~, l. 0., nv. lson, r. a., ~[a~drng, r. s., u. can" a~, j. e.: excretion of sodium, potassium, magnesium, and iron in human sweat and the relation of each to balance and requirements. j. nutrition 79 [1963] nr . goodwin, a. f., u sci. 42 [19631 nr. 1, s. 202ff. (4 s.) . *gerlinger, p., much, j.-v., u. clukvert, j.: note pr4liminaire sur raction du eyclophosphamide (endoxan) sur le d6veloppement de l'embryon. c. r. acad. sci. 255 [19621 nr. 23, s. 3229ff. (3 s. [19621 nr. 6, 62-5407 (110 s.) . nelson, f. e., jensen, l. s., u *caruzzo, c., tartara, d., pagano, p. g., pellegrini, a., costanzo, f., u [1963] nr. 10, 63--2084 (126 s.) . 38 [1963] nr. 6, s. ll0ff. (5 s.) . gal/~bos, j. t., asada, m., u. s~ks, j. z. : the effect of intravenous ethanol on serum enzymes in patients with normal or diseased liver. gastroenterology 44 [1963] nr. 3, s~ 267ff. (12 s.) . gall, c., u. weissw~.~le~, p. : un~ersuchungen fiber die labf~higkeit der milch und ihre beziehung zur mineralsteff-fiitterung der kfihe. milchwissenschaft 17 [1962] nr. 8, s. 413ff. (? s.) . zitat: dt. lebensmittel-rdsch. 59 [1963] nr. 4, s. 123 . gasic, g., u. morrison, a.b.: mucopolysaccharides of renal collecting tubule ceils in potassium deficient rats. prec. soc. exper. biol. med. 112 [1963] l~r. 4, s. 871ff. (2 s.) . gear~-, t. f. : oak wilt development and its reduction by growth regulators. i. production and activity of oak wilt fungus pectinase, cellulase, and auxin. ii. effect of halogenated benzoic acids on oak trees, the oak wilt diseases, and the oak wilt fungus. diss. abstr. 
23 [1962] nr. 6, 63--650 (102 s.) . *g~l~,~cs~,r, f., g~ti, t., gy~g~, k., u. s6s, j.: effec~ of cardiopathogenic diet on the thiopental anaesthesia. acts physiologica aead. sci. hung. 1963 suppl. nr. 22, s. 16 . gar~'ls, j., piazu~lo, e., u [1962] nr. 6, 62--5595 (116 s.) . ikir~a, h., u. t~, k. v. : activity of gibberellin 'd' on the germination of photosensitive lettuce seeds. nature 197 ] nr. 4874, s. 1313 /1314 die wirkung des glucagons auf den blutzuckerspiegel in abh~ngigkeit yore alter. z. aiternsforsch. 16 [1963] nr. 3, s. 206/210. *loosli, j. k. : primary signs of nutritional deficiencies of laboratory animals. j. amer. veter, reed. assoc. 124 [1963] nr. 9, s. 1001 ft. (4 s.) . lowrey, r. s., pond, w. g., lo0sli, j. k. u. barnes, r. h.: effect of dietary protein and fat on growth, protein utilization, and carcass composition of pigs fed purified diets. j. animal sci. 22 [1963] nr. 1, s. 109ft. (6 s.) . lvnd, c. c., u. ) [1963] nr. 1, s. 32ff. (5 8.) . nr~cleod, l. b. : effect of liming and potassium fertilization on soil solution and on yield and composition of alfalfa and orchard grass mixtures. dlss. abs~r. 23 [1962] nr. 6, 62--5826 (322 s.) . ~ds~n, k. 0., u. edmonds, e. j. : prolonged effect on caries of short-term feeding of rice hulls to cotton rats. j. dental res. 42 [1963] nr. 1, s. 137ff. (9 s.) . m~a~.~, a. c. : biological responses of young rats fed diets containing genistin and genistein. j. nutrition 89 [1963] nr. 2, s, 151] 156. ~_ajaj, a.s., dnc~g, j.s., azzam, s.a., u. da_~by, w. j.: vitamin e responsive megaloblastic anemia in infants with protein-calorie malnutrition. amcr. j. clinical nutrition 12 [1963] nr. 5, s. 374 ft. (6 s.) . 1v~jcm~owicz, e., u. quxstel, j. h. : effects of aliphatic alcohols on the metabolism of glucose and fructose in rat liver slices. canad. j. bioehem. physiol. 41 [1963] nr. 3, s. 793ff. (12 s.) . m_al~otra, o. p., n~a-wl)ov, a.v., r~ber, e. f., u. norton, h . 
w., effects of rat strain, stilbestrol, and testosterone on the occurrence of hemorrhagic diathesis in rats fed a ration containing irradiated beef. j. nutrition 79 [1963] nr. 3, s. 381ff. (8 8,) . --, u. r~ber, e. f. : effect of methionine and age of rat on the occurrence of hemorrhagic diathcsis in rats fed a ration containing irradiated beef. j. nutrition 80 [1963] nr. 1, *misga, u. k. : effect of corn oil feeding on the lipids of dog bile. indian j. exper. biol. 1 [1963] nr. 2, s. 87/90. * miyao, m., tsvn•isnz, m., nagano, k., u. ttosogi, t.: experimental studies on the digestibility and absorbability of milk proteins. 4. effects of carbohydrate addition on the digestibility and absorbability of cow's milk proteins. tokushima j. exper. med. 9 [1962] nr 1963, nr. 5334, s. 846ff. (4 s.) . nagra, c. l., brerrenbach, r. p., u. meyer, 1~. k. "-influence of hormones on food intake and lipid deposition in castrated pheasants. poultry sci. 42 [1963] nr. 3, s. 770ff. (6 s.) . *+na~:ler, w. g.: the significance of calcium ions in cardiac excitation and concentration. amer. heart j. 65 [1963] nr l~, b. l., u. re(~a~, w. s., u. dobozy, a.: effect of vitamin a on the nucleic acid metabolism of rats. acts biologics acad. sci. hungariae 1963, suppl. 5, s. 46 . 01~itz, k., u. loeser, a. : uber den einflufl appetithemmender substanzen auf das fettgewebe. klin. wschr. 41 [1963] nr. 4, s. 1931196. *0rstadius, k., nordstrom, g., u. lannek, n. : combined therapy with vitamin e and selenite in experimental nutritional muscular dystrophy of pigs. cornell veterinarian 53 [1963] nr. 1, s. 60ft. (9 s.) . ern~hrung und therapie. med. u. ern~hrung 4 [1963] nr. 6, s. 146/150 . *--pflanzliche polysaccharide zur steigerung der kiirpereigenen abwehr. ivied. welt 1963, nr. 6, s. pxrkrnson, t. m., u. 0lso~r, j. a. : inhibitory effects of bile acids on the uptake, metabolism, and transpor~ of water-soluble substances in the small intestine of the rat. life sci. 1963, nr. 6, s. 393] 398. 
*pxl~o% j.-l., u. mord~l~t-dx~rin~, )i..' action du potassium et du calcium sur l'histaminopexie s~rique. recherches sur le cobaye, le rat et l'homme. j. physiol. 54 [1962] nr. 4, s. 579ff. (12 s.) . * patek, a. j., de fr1tsc] t, n. m., kendall, f. e., n. hirsch, r. l.: corn and coconut effects in dietary cirrhosis of rats. arch. pathology 75 [1963] nr. 3, s. 264ff. (7 s.) . patil, s. s.: the relation of ehlorogenic acid and total free phenols in potato plants to resistance to infection by verticillum alboatrum. diss. abstr. 23 [1962] nr. 6, 65--5602 (92 s.) . petuet,y, f.: diskussionsbemcrkungen zu rcferaten fiber sauermilchprodukte auf einer vortragsveranstaltung der gesellsehaft fiir erniihrungsbiologie e.v., miinchen, 22. juni 1961 . milchwissensehaft 18 [1963 nr. 5, s. 236] 237. pfei~ter, (3. j., gass, g. h., u. schw~rz, (3. s.: reduction by chlorpromazine of ulcem due to acute starvation in mice. nature 197 [ ] nr. 4871, s. 1014 1015. prrrllros, a. w., newc0m~, tl r., u. sha.wklrs, d. r.: long-term rat feeding studies on irradiated chicken stew and irradiated cabbage. toxicol. appl. pharmacol. 5 [1963] nr. 3, s. 273]297. prrmt~s, p. h., sutti~, j. w., u. ze~rowski, e. j. : effects of dietary sodium fluoride on dairy cows. vii.: recovery from fluoride ingestion. j. dairy sei. 46 [1963] l~r. 6, s. 513ff. (4 s. browi~, t. h., u. levee, r. v.: the effect of milk intake on nematode infestation of the lamb. prec. nutrition 8oc. 22 [1963] nr. 1, s. 32/41. * srnwrvasan, m., n~o~bhusha_~a~, a., u. sm~rrvas~, k. s.: changes in serum inorganic phosphate following ingestion of protein. curr. 8ci. 32 [1963] nr. 1, s. 21. stallwoth, h. : some effects of 2.4-dichlorophen-oxyacetie acid on sweet corn (zea mays rugosa l.) with emphasis on yield, tillering, root development, and exudation of electrolytes from roots and stems. diss. abstr. 23 [1963] nr. 8, 62---6233 (79 s.) . starcher, b., u. i~.ratzer, f~ h.: effect of zinc on bone alkaline phosphatase in turkey poults. j. 
nutrition 79 [1963] nr. 1, s. 18/22. stei~-~off, d., u. ~rqu~lrdt, p. : kombination yon kaliumpyrosulfit und ~xthy]alkohol ira tr~nkungsversuch an ratten. arzneimittelforschung 13 [1963] nr. 3, s. 237/238. stevermer, e. j. : influences of level of nutrition of the boar and of ionic environment of the spermatozoa on the properties of boar semen. diss. abstr. 23 [1962] nr. 6, 63--619 (104 s.) . stn~p~., f., u. 8c~warz, k.: incorporation of valine-l-x*c into serum and tissue proteins of rats fed torula yeast diets. j. nutrition 79 [1963] nr. 2, s. 151/160. 8ton~., d. b., u. co~or, w. e. : the prolonged effects of a low cholesterol, high carbohydrate diet upon the serum lipids in diabetic patients. diabetes 12 [1963] nr. 2, 8.127 ft. (6 s.) . *stormont, j. •., u. waterhouse, c. : effect of variations in previous diet on fasting plasma lipids. j. labor. clinical med. 61 [1963] nr. 5, s. 826ff. (6 s.) . * stuart, a. e., u vity of the heart extract of rats. medlcina exporimentalis 7 [1962] nr. 6, s. 363ff. (5 s.) . szepsenwol, j.: carcinogenic effect of egg white egg yolk and lipids in mice. prec. soc. exper. biol. med. 112 [1963] nr. 4, s. 1073ff. (3 s.) . *t~cs, l, u. ny~i, i.: effect of saline infusion, acth infusion, and blood transfusion on the hormone excretion of patients with hypereme~is. act~ medics aead. sei. hun. garicae 18 [1962] nr. 4, s. 385ff. (14 s.) . tanner, j. w.: an external effect of inorganic nitrogen on nodulation. diss. abstr. 23 [1963] nr. 10, 63--3007 (50 s.) . --, u [1963] nr. 4, s. 308ff. (13 s.) . *t~i~er, l., m~az, m., u. cs:menafiov4, m. : the effect of glucose and glucose together with insulin on the resistance of fasted rats to trauma in the noble-collip drum. physiologia bohemoslovenica 12 [1963] nr. 2, s. 136ff. (9 8.) . ude~iend, s. : factors in amino acid metabolism which can influence the central nervous systems. amer. j. clinical nutrition 12 [1963] nr. 4, s. 287ff. (4. 8 .) vahouny, g. v., moede, a., silver, b., n . treadw~ll, c. r. 
: nutrition studies in the cold. iv. effect of cold environment on experimental atherosclerosis in the rabbit. j. nutrition 79 [1963] nr. 1, s. 45/52. van p1lsum, j. f., olsen, b., taylor, d., rozyc'ki, t., u voss, r. d.: yield and foliar composition of corn as affected by fertilizer rates and environmental factors. i)iss. abstr. 23 [1963] nr. 10, 63--3008 (293 s.) . *vyval,ro, i. g. : the effect of gibberellin on the transformation of substances in germinating corn seeds. i)oldady akad. nauk sssr [russ.] 149 [1963] nr. 4, s. 979ff. (3 s.) . wao~ei~, g. r., cl~mk, a. j., hays, v. w., u amer. j. clinical nutrition 12 [1963] lgr. 3, s. 235ff. (6 s.) . the assessment of marginal protein malnutrition. prec. nutrition soc. 22 [1963] mr. 1, s. 66]72. --, u. stnp~, j. m. l.: the free ly~ine and amino nitrogen content of liver, muscle, and serum in normal and protein-depleted rats. prec. nutrition soc. 22 [1963] mr. i, s . viii/ix. *watson, w. c.: the morphology and lipid composition of the erythrocytes in normal and essential-fatty-acid-deflcient rats. bri6. j. haematology 9 [1963] lgr. 1, s. 32ff. (7 s.) . waite, r., u. : blackburn, p. s. : the relationship between milk yield, composition and tissue damage in a case of subclinical masticis. j. dairy res. 30 [1963] 1963, nr. 5330, s. 561ff. (1 s.) . *n. n.: paraty1~hoid fever from frozen chinese eggs. brit. reed. j. . (1 s.). *n. n. : radioactivity and human diet. chem. ind. 1963, nr. 18, 8. 721 . ~q. ~.: 8eh~digungen dutch konservlerungsmittel bei zitrusfrfichten? dr. reed. wschr. 88 [1963] nr. 17, s. 927. *n. n. : salt-poisoning in infancy. lancet 1 [1963] l~r. 7293, s. 1251. n.n.: kouoquium des arbeitskrcises hamburg der gdch-fachgruppe lebensmittelchemic und gerichtliche chemic am 7. januar 1963 . lebensmittelchem. u. gerichtl. chem. 17 [1963 nr. 6, s. 1161118. *n. n. : smoking and heath disease. new england j. med. 268 [1963] nr. 15, s. 903. +hi. n. : toxic components of lathyrus peas. nutrition rev. 21 [1963] nr. 
1, s. 28/30. +n. n. : cariogenic ability of different diets. nutrition roy. 21 [1963] nr. 2, s. 50]52. +n. 1~.: aminoaeiduria in lead intoxication. nutrition rev. 21 [1963] nr. 3, s. 75/76. +n. n. : pyridoxine and dental caries. human studies. nutrition rev. 21 [1963] nr. 5, s. 1431145. +n. n.: pyridoxine and dental caries. animal studies. i~utrition rev. 21 [1963] nr. 5, s. 1451147. +hi. n. : rcporb: the prophylactic requirement and the toxicity of vitamin d. pediatrics 31 [1963] nr. 3, s. 512ff. (? s.). axrkroa, a.: caesium-137 from fall-out in human milk. nature 197 [1963] nr. 4868, s. 667ff. (1 s.) . *allcroft, r., u. car~agha~, r. b. a. : groundnut toxicity: an examination for toxin in human food products from animals fed toxic groundnut meal. veteri. rec. 75 [1963] nr. 10, s. 259ff. (5 s.) . *--, l~wis, g., u. hill, k. 1~.: groundnut toxicity in cattle: experimental poisoning of calves and a report on clinical effects in older ca~tle. ve~er. rcc, 75 [1963] nr, 19, s. 487ff. (6 s. nr. 1, 2, 3 nr. 1, 2, 3, 4, 5 nr. 1, 2 nr. 1, 2 influence of calcium and ouabai bain upon potassium influx in human erythrocytes the enzymatic assimilation of nitrate in the tomato plant translocation of '~p, i~n, and ~c in plants einige neue gesiehtspunktr zum caleiumstoffwechsel dis~ibution of water, sodium, and potassium in resting and stimulated mammalian muscle. canad the influence of vitamin bi, on calcium ('sca) metabolism of maxillodental tissues kidney, water, and electrolyte metabolism intermediiirer elektroly~toffwechsel und zellgrenzfl~chenphysiologie im theoretischen zusammenhang mit der krebsentstehtmg. tell i dcr einflul~ yon vollkornbrot auf den calcium-stoffwechsel bei schulkindern recherches sur le m6tabolisme du soufre. x. 
: la non-6quivalence de la eystine et de la eyst4ine dans la couvcrture des besoins sufr~s du rat adulte potassium-magnesium antagonism in soils and crops low serum iron levels in obese adolescents metal content of human organs studies on the requirements and interaction of copper and iron in broad breasted bronze turkeys to 4 weeks of age iron absorption and excretion in experimental iron deficiency the measurement of exchangeable magnesium in dogs the copper metabolism of warmblooded animals with special reference to the rabbit and the sheep comparative studies of the metabolism of strontium and barium in the rat the utilization of iron in erythropoiesis binding of strontium in blood ~iolybdenum, copper, and zinc contents of mouse liver and sarcoma 180 treated with molybdenum compounds biochemical effects of zinc deficiency in tomato plants excretion of sodium, potassium, magnesium, and iron in human sweat and the relation of each to balance and requirements turnover rate of zinc in the body as determined by the study of 6szn in rats a study of the iron absorption in mice as modified by various agents funktionsteste des radiojodstoffwechselstudiums und ihre bedeutung in der diagnostik der sehilddrfisenerkrankungen. ~rztl. laboratorium 9 the fate of radioiodine after parcnteral administration a possible humeral regulator of iron absorption beitrag zur kl~rung der ursachen der anreicherung yon caesinm-137 im 0rganismus blood-and serum-level of watersoluble vitamins in man and animals significados metabslicos do ~cido ~-lip6ico. 3. o ~cido r162 e o metabolismo do ferro observations on a magnesium-fluoride interrelationsip in chicks prevention of ,meat anemia" in mice by copper and calcium iron metabolism in experimental pyridoxine deficiency aluminium in soils and plants on the coastlands of british guinea physiology of adolescence. ii. 
: nutrition -basal oxygen consumption -energy expenditure and balance -nitrogen metabolism -calcium metabolism -iron metabolism -red cell mass and hemoglobin dietary strontium and calcium, and deposition of 89sr and asca in the bones of rats -~iengenelementansatz wachsender sehweine bei unterschicdiichen cuso4-zulagen differences in copper retention in two strains of chickens untersuchungen fiber anreicherung und verteilung yon rubidium in gerstenkcimpflanzen 112 [1963] nr. 3, s. 631/636. *+lv~ointyr~, i.: an outline of magnesium metabolism in health and disease. a review uptake by the root and subsequent distribution within the potato plant of strontium-89 leached from the foliage nor zinkstoffwechsel in der schwangerschaft foetal metabolism of caesium-137 in the rat magnesium metabolism of chickens zinc metabolism in patients with the syndrome of iron deficiency anemia hepatospenomegaly dwarfism, and hypogonadism. labor. clinical mcd c~isium-137 trod kalium in menschlichen organen und in der milch 1959/60 l~ole of the genotype in controlling accumulation of strontium-89 by plants copper and zinc interrelationships in the pig effect of chromium, cadmium, and other trace metals on the growth and survival of mice studies in the metabolism of zinc. iv. some observations on the urinary zine-porphyrin relationship in non-porphyries and in a patient with aeutm intermittent porphyria aastrontium balances in man studies on zinc metabolism. ii.: effect of the diabetic state on zinc metabolism: an experimental aspect effect of diabetic state on zinc metabolism: a clini. cal aspect copper metabolism and the liver iron metabolism and the liver with particular reference to the pathogcnesis of haemachromatosis studies on iron metabolism uber den eisenstoffwechsel. (bemerkung zu t. su~di~, miinchener reed yifinchener reed. wschr. 105 [1963] nr. 19, s. 99911000. 2c -7 wirkstoffe -biocatalysts +n. n. : vitamin a in human livers. 
nutrition biological half.llfe of vitamin b~ in plasma hypervitaminosis a and mast cells. a study of the interrelationship of mast cells and vitamin a in vivo and in vitro evidence concerning the human requirement for vitamin bi~. use of the whole body counter for determination of absorption of vitamin blz vitamin c in plasma and leucocy~s of smokers and non-smokers some factors affecting the absorption of vitamins the determination of vitamin a in animal tissues and its presence in the liver of the vitamin a deficient rat metabolic activities of vitamin a and related compounds in animals. i.: role of vitamin a in intestinal muscular contraction zum vitamin-b--haushalt dcr ratte bei sorbitfiitterung contribution a l'~tude do la relation entre la vitamine big et la glande thyrolde effects of deficiencies of certain b vitamins and ascorbic acid on absorption of vitamin blv amer ascorbic acid metabolism in plants. ii. : biosynthesis dietary and thyroid interrelationships affecting vitamin a status of feedlot beef cattle die wirkung gesteigerter kupferzufuhr auf den vitamin-c-haushalt vom meerschweinchen bei parental zugeffihrter ascorbins~ure kritische auswe~ung der naeh 1949 erschienenen arbeiten fiber gebundene ascorbins~ure im tierischen gewebe metabolism and biological activity of vitamin a acid in the chick biochemical studies of vitamin metabolism in poultry urine. ii. : on the excretion of thiamine in poultry urine after subcutaneous and oral administrations of some thiamine derivatives untersuchungen fiber die speiehcrung und fiber die ansscheidung yon vitamin a nach ungeniigender vitamin-a-versorgung bei legenden hfihnern studies on metabolism of vitamin a. 1.: the biological acticity of vitamin a acid in rats vitamin a and cholesterol absorption in the chicken studies on the interaction of vitamin bi~, intrinsic factor, and receptors. 
ii. the possible absorption of intrinsic factor human growth hormone in infant malnutrition macro-and micromethods for the determination of serum vitamin a using trifluoroacetic acid the activation of sulphate by extracts of cornea and colonic mucosa from normal and vitamin a-deficient animals the importance of blood as a pool of vitamin d studies on metabolism of vitamin a. 2.: enzymic synthesis and hydrolysis of phenolic sulphates in vitamin-a-deficient rats human metabolism of l-ascorbic acid and erythorbic acid tissue distribution and storage forms of vitamin b~, injected and orally administered to the dog relation between vitamin a, tocopherol, and cholesterol serum levels in the elderly interrelationships of vitamin b12, folic acid, and ascorbic acid in the megaloblastic anemias zum wirkungsmechanismus des vitamins e. helvetica physiologica et pharmacologica effect of some physiologic factors on the absorption of vitamin b12 in rats transport of dietary nitrogen mitochondrial fatty acids of fish and fish-eating birds the biosynthesis of fatty acids influence of age and dietary protein on certain free amino acids in chick blood plasma nitrogen metabolism in cold-exposed rats über das vermehrte auftreten von fettsäuren mit 10 bis 14 c-atomen in den depotfetten säugender ratten und den übergang der linolsäure von den müttern auf die jungen die bildung von antikörpern gegen verschiedene kuhmilchproteine bei neugeborenen, kindern, erwachsenen und graviden the myocardial arterio-venous differences of free amino acids and of free fatty acids in healthy individuals, patients with diabetes and with essential hypercholesterolemia urinary amino acids on phenylalanine-tyrosine-supplemented diets difference in the metabolic fate of acetate and ethanol fed to higher plant tissues hemodynamic relationships of anaerobic metabolism and plasma free fatty acids during prolonged, strenuous exercise in trained and untrained subjects effects of palmitate on the metabolism of 
leukocytes from guinea pig exudate the dynamics of plasma free fatty acid metabolism during exercise über die retention, den abbau und die ausscheidung von 2-thion-tetrahydro-l.3.5-thiadiazinen transport systems for amino acids effects of protein intake and cold exposure on selected liver enzymes associated with amino acid metabolism quantitative studies on tryptophan metabolism in the pyridoxine-deficient rat effect of desoxypyridoxine-induced vitamin b6 deficiency on polyunsaturated fatty acid metabolism in human beings studies on the wheat plants using carbon-14 compounds. xix.: observations on the metabolism of lysine-14c. canad genetic defects of amino acid metabolism. pediatric clinics north america metabolism of tryptophan in diabetes mellitus the two-carbon chain in metabolism der einfluss von vitamin a auf den citronensäurestoffwechsel über phenolspeicherung und phenolabbau in wasserpflanzen. naturwissenschaften 50 effect of physical exercise on nitrogen balance in obese subjects metabolism of nitrogen compounds in the rumen of ruminants. izvest. acad der intermediär-stoffwechsel stoffwechsel der carotine im hühnerembryo nucleic acid metabolism of germinating corn seedlings carbon metabolism of 14c-labelled amino acids in wheat leaves. ii.: serine and its role in glycine metabolism bovine metabolism of insecticides. the metabolism of sevin in dairy cows stoffwechsel der carotine. wiss. veröff. dt. ges. ernährung 9 metabolism of labelled linoleic-1-14c acid in the sheep rumen nonessential nitrogen supplements and essential amino acid requirements. nutrition rev vitamin c requirements of man re-examined. 
new values based on previously unrecognized exhalatory excretory pathway of ascorbic acid studies on the requirements and interaction of copper and iron in broad breasted bronze turkeys to 4 weeks of age water requirements of men as related to salt intake evidence for a high zinc requirement at the onset of egg production effect of lysine and glycine upon arginine requirement of guinea-pigs the cobalt requirement of subterranean clover in the field fluid and electrolyte requirements of newborn infants with intestinal obstruction the requirement and availability of dietary iron for young pigs studies on the protein and methionine requirements of young bobwhite quail and young ringnecked pheasants phenylalanine requirement of women consuming a minimal tyrosine diet and the sparing effect of tyrosine on the phenylalanine requirement effects of starvation on the cardiovascular system of the chicken calcium and phosphorus requirements of finishing broilers using phosphorus sources of low and high availability water intake of normal children sex differences in the tocopherol requirement of rats as shown by the haemolysis test further studies on protein and energy requirements of chicks selected for high and low body weight smoking and blood clotting dental caries and trace elements a statement approved by the board of directors of the canadian heart foundation the question of fats. ii.: fats and disease moldy peanuts and liver cancers vitamin c and healing of wounds bericht über die vortragstagung des fachverbandes lebensmittelchemie der chemischen gesellschaft in der ddr vom 19. bis 21 diet and human depot fat ethanol and plasma free fatty acid in man dietary water and protein efficiency in rats. nutrition rev. 21 [1963] nr. 1, s. 16/17. +n. n. : nature of the coagulation defect in rats fed diets producing thrombosis or experimental atherosclerosis factors affecting growth depression by raw soybeans bones in undernourished animals. nutrition rev. 21 [1963] nr. 1, s. 
21/23. +n. n. : vitamin e and the etiology of muscular dystrophy in the rabbit riboflavin coenzymes and congenital malformations the thyroid gland in infant malnutrition. nutrition rev. 21 [1963] nr. 1, s. 32. +n. n. : a proposed mechanism for the effect of fats on serum cholesterol effect of varying levels of dietary protein on synthesis and excretion of urea dietary phosphates and dental caries folic acid restriction and cancer inhibition changes induced by lipoic acid in normal rat liver vitamin b6 deficiency and tryptophan metabolism effect of ubichromenol on development of encephalomalacia in vitamin e deficient chicks leucine-induced hypoglycemia nutritional muscular dystrophy in lambs. nutrition rev. 21 [1963] nr. 4, s. 120/122. +n. n.: amino acid imbalance in cold-exposed rats. nutrition rev. 21 [1963] nr. 4, s. 122/124. +n. n.: milk and athletic performance effect of vitamin a deficiency on the ubiquinone content of rat liver idiopathic steatorrhea gastrointestinal protein loss in iron-deficiency anemia nutritional cirrhosis of the liver. nutrition rev. 21 [1963] nr. 6, s. 175/178. +n. n. : exercise and heart disease calcium deficiency in the etiology of osteoporosis the relation of dietary fat to the fatty acids in the intestinal wall clot-strength and clot-lysis in rats fed hyperlipemic diets effect of potassium iodide and duodenal powder on the growth and organ weights of goitrogen-fed rats veränderungen im stoffwechsel und wachstum junger tomatenpflanzen nach gibberellinsäurebehandlung effect of restricted feeding during the growing period on reproductive performance of large type white turkeys keys, a. l glucose, sucrose, and lactose in the diet and blood lipids in man ambient temperature and survival on a protein-deficient diet some considerations of changes in total body composition in relation to nutritional status the effect of variations in the energy and protein levels of the ration upon performance in the pig studies in choline deficiency. 
fate of injected 1-14c-palmitic acid and fatty acid spectra in fasting and refed rats bartou i~ vitamin a requirements of chicks at moderately elevated temperature influence of age and dietary protein on certain free amino acids in chick blood plasma the effects of nicotine on weight increment, activity, food intake, and water intake in weanling albino rats effect of pyridoxine deficiency upon delayed hypersensitivity in guinea-pigs vitamin a deficiency in chickens cerebellar encephalomalacia produced by diets deficient in tocopherol vitamin b12 deficiency in indian infants plasma lipids in scurvy: effect of ascorbic acid supplement and insulin treatment effect of gibberellin on the variations of the growth-point in winter wheat uptake of dinitrophenol and its effect on transpiration. calcium accumulation in barley seedlings methylmalonate excretion in vitamin b12 deficiency reduction of plasma cholesterol levels in atherosclerosis by diet and drug treatment. australasian ann effects of reduced dietary intake on the activities of various enzymes in the livers and kidneys of growing male rats the effect of feeding d-methionine on the d-amino acid oxidase activity of chick tissues metabolic effects of dietary protein level in cold-exposed rats relative effects of rapeseed oil and corn oil on rats subjected to adrenalectomy, cold, or pyridoxine deprivation metabolic effects of dietary protein level with caloric restriction in cold-exposed rats. canad metabolic effects of three dietary protein levels fed isocalorically to cold-exposed rats. canad die rolle verschiedener fette im eiweis des organismus. 
nahrung the bursa of fabricius and xanthine oxidase activity of liver and kidney following dietary supplementation of iodinated casein to chickens effects of linoleate and dietary fat level on plasma and liver cholesterol and vascular lesions of the cholesterol-fed rat a comparative study of the effect of bile acids and cholesterol on cholesterol metabolism in the mouse, rat, hamster, and guinea-pig the effects of ruminal and plasma sodium concentrations on the sodium appetite of sheep effect of level and sequence of feeding and breed on ovulation rate, embryo survival, and fetal growth in the mature ewe inhibitory effects of carbohydrates on histamine release and mast cell disruption by dextran fettsäurestoffwechsel bei the effect of calcium infusions, parathyroid hormone, and vitamin d on renal clearance of calcium nutritional supplementation during pregnancy effect of level of dietary protein with and without added cholesterol on plasma cholesterol levels in man die schilddrüsenfunktion bei enteralem eiweißverlust effects of pelleting and varying grain intakes on milk yield and composition fatty acid composition of lipids of serum and aorta in the chicken on different diets rat intestinal sucrase. ii.: the effects of rat age and sex and of diet on sucrase activity the effect of selenium administration on the growth and health of sheep on scottish farms die wirkung von vitamin b 6 bei leukopenien effect of energy source and level of alfalfa pellets on growth and tissue lipids of beef calves effect of magnesium deficiency on mast cells and urinary histamine in rats histamine-liberating effect of magnesium deficiency in the rat zur wirkung des wassers bei der seitenwurzelbildung an luftwurzeln influence of the aqueous potato extract and its fractions on growth and spore formation of the b. 
pumilus and the production of the antibiotic tetaine influence of mineral nutrition on the resistance of peach tree to fusicoccum amygdali de la croix the effect of dietary protein on the course of various infections in the chick bifidus intestinal flora in infants fed on mamysan b. acta paediatrica 52 effect of food fats on concentration of ketone bodies and citric acid level in blood and tissues is there a hemostatic effect of peanuts in hemophilioid disorders? milk allergy in infancy dental effects of fluoridation of water with particular reference to a study in the united kingdom influence of previous feeding with a high-fat diet on liver steatosis produced by acute starvation of growth hormone in mice effect of amino acid imbalance on nitrogen retention. ii.-interrelationships between methionine, valine, isoleucine, and threonine as supplements to corn protein for dogs supplementation of cereal proteins with amino acids. v. : effect of supplementation lime-treated corn with different levels of lysine, tryptophane, and isoleucine of the nitrogen retention of young children the interrelation of nutrient supply, leaf nutrient content, and vegetative growth of ilex crenata gastric content of fasted primates. a survey serum cholesterol in vitamin c deficiency in man the effect of carbohydrates on the production of staphylococcal pigment effect of a low dietary level of three types of fat on reproductive performance and tissue lipid content of the vitamin b6-deficient female rat effect of magnesium deficiency on synthesis of heart and liver mitochondria phospholipids über den einfluß länger dauernder körperlicher inaktivität auf die blutzuckerkurve nach oraler glucosebelastung. 
helvetica medica acta 30 action of trace elements on the metabolism of fluoride zur frage des einflusses von kondensmilch und einer protein-valerina-polyphosphat-komplexverbindung auf die kreislaufwirkung des kaffees beim menschen modifications de la croissance de la plantule de lupin blanc (lupinus albus l.) provoquées par une diminution expérimentale des réserves influence of the dietary protein level on the magnesium requirement effects of high levels of copper and chlorotetracycline on performance of pigs effect of dietary calcium lactate and lactic acid on faecal escherichia coli counts in pigs die entwicklung von calcium-mangelsymptomen. z. pflanzenernährung, düngung, bodenkde. 100 [1963] nr. 1, s. 53/58. --calcium-mangelsymptome an höheren pflanzen copper deficiency in relation to swayback in sheep. i. : effect of molybdate and sulphate supplements during pregnancy long-term, low fat, low protein diets and their effect on normal trappist subjects further studies of the influence of diet on radiosensitivity of guinea-pigs, with special reference to broccoli and alfalfa effects of the infusions of ammonia, amides, and amino acids on excretion of ammonia auswirkungen langfristig fettreicher ernährung auf das plasma-cholesterin the relationship of dietary energy level and density to the growth response of chicks to fats influences of dietary carbohydrate-fat combinations on various functions associated with glycolysis and lipogenesis in rats. i. effects of substituting sucrose for rice starch with unsaturated and with saturated fat compensatory carcass growth in steers following protein and energy restriction fatty acid composition and glyceride structure in rats fed rapeseed oil or corn oil. canad influence of selective and nonselective hydrogenation of rapeseed oil on carcass fat of rats. canad studies in serum lipids. 
with special reference to spontaneous variations and the effect of short-term dietary changes experimentelle untersuchungen zur wirkung von kaffeefett evaluation of the effect of breed on vitamin b6 requirements of chicks high salt content of western infant's diet: possible relationship to hypertension in the adult the effect on the serum cholesterol levels of the consumption of a special dietary fat with a high content of unsaturated fatty acids in elderly people effect of dilution of the diet with an indigestible filler on feed intake in the mouse effect of tea and its tannins upon capillary resistance of guinea-pigs food input and energy extraction efficiency in carassius auratus effect of calcium and magnesium upon digestibility of a ration containing corn oil by lambs effects of calorie restriction during the growing period on the performance of egg-type replacement stock effects of insulin on hepatic glucose metabolism and glucose utilization by tissues cellulase and polygalacturonase in tomato fruits and the effect of calcium on fruit cracking effect of nutritional muscular dystrophy and of starvation on amino acid penetration in rabbit tissues plasma protein synthesis in nutritional muscular dystrophy inulin and sucrose distribution in tissues of vitamin e-deficient and control rabbits protein-bound dyes in the serum and liver of rats fed aminoazo dyes vanadium. excretion, toxicity, lipid effect in man the influence of vitamin a status on the proteolytic activity of lysosomes from the livers and kidneys of rats disauxie metainfettive e da malnutrizione induced drinking in dogs: comparative effects of hypertonic sodium chloride and sorbitol the influence of early nutrition on brain cholesterol accumulation during growth changes in composition of the saliva of cows on grazing heavily fertilized grass. res. 
veter changes in composition of the saliva of sheep on feeding heavily fertilized grass effect of varying alfalfa: barley ratios on energy intake and volatile fatty acid production by sheep the influence of calcium on the secretory response of the submaxillary gland to acetylcholine or to noradrenaline clinical dentistry and fluoride food allergy as a cause of abdominal pain the effect of various dietary levels of ddt on liver function, cell morphology, and ddt storage in the rhesus monkey effect of natural and purified diets on survival of x-irradiated mice effect of autoclaving and lysine supplementation of skimmilk-powder diets on growth and caries in rats effects of alcohol intake on subjective and objective variables over a five-hour period nitrogen metabolism of african cattle fed diets with an adequate energy feeding value of β-carotene following treatment with n~o~ the relationship of the quantity and quality of dietary fats to serum cholesterol levels in men of different ages and weights effects on girls of greater intake of milk, fruits, and vegetables effect of antioxidant, protein, and energy on vitamin a and feed utilization in steers the growth-maintaining activity of ascorbic acid the effects of an induced pyridoxine and pantothenic acid deficiency on excretions of oxalic and xanthurenic acids in the urine influence of lactose and dried skim milk upon the magnesium deficiency syndrome in the dog. i.: growth and biochem chronic malnutrition in turkey. v. 
study on serum fatty acids in malnourished children the effect of nutrition conditions on the growth of and nitrogen accumulation by fodder beans when sown together with indian corn effects of a diet high in polyunsaturated fat on the plasma lipids of normal young females citrate and action of vitamin d on calcium and phosphorus metabolism beeinflussung der sportlichen leistungsfähigkeit durch eine geeignete ernährung effect of dietary orotic acid on liver proteins effect of barbituric acid and chlortetracycline upon growth, ammonia concentration, and urease activity in the gastrointestinal tract of chicks effects of feeding low levels of dimethoate on milk and whole blood cholinesterase activity of dairy cattle changes in serum lipoproteins after a large fat meal in normal individuals and in patients with ischemic heart disease the relationship of specific nutrient deficiencies to antibody response in swine. i. : vitamin a. ii. : pantothenic acid, pyridoxine or riboflavin. diss. abstr relationship of specific nutrient deficiencies to antibody production in swine. i. : vitamin a relationship of specific nutrient deficiencies to antibody production in swine. ii.: pantothenic acid, pyridoxine or riboflavin histochemistry of dietary cardiac lesions effect of various levels of fluorine, stilbestrol, and oxytetracycline, in the fattening ration of lambs uptake of copper and its physiological effects on chlorella vulgaris the effects of a small dose of ethyl alcohol on certain basic components of human physical performance. i. the effect on cardiac rate during muscular work. arch. 
internat some effects of feeding stilbestrol, chlortetracycline and penicillin with alfalfa soilage on steer performance and carcass quality experimental induction of ciguatera toxicity in fish through diet effects of potassium fertilizer, age of ewe, and small magnesium supplementation on blood magnesium and calcium levels of lactating ewes the influence of higher volatile fatty acids on the intake of urea-supplemented low quality cereal hay by sheep untersuchungen über den umsatz wachsender schweine ab geburt. 2. mitt eczema and cow's milk. brit. med. j. 1963, nr. 5332, s. 753. isaacson, a. : the effects of zinc on responses of frog skeletal muscle effects of zinc on responses of skeletal muscle effect of diet on work metabolism carbohydrate-phosphorus metabolism in the skeletal muscles of epinephrectomized animals during treatment with cortisone and vitamins c and p. ukrainskii biokhimichnii zhurnal composition of dietary fat and the accumulation of liver lipid in the choline-deficient rat nutrition and palatability the incidence of protein-calorie malnutrition of early childhood theories on the mode of action of fluoride in reducing dental decay saccharase deficiency, familial entailing intolerance to cane sugar. acta paediatrica 52 raw and heat-treated soybeans for growing-finishing swine and their effect on fat firmness effect of the administration of isoniazid and a diet low in vitamin b6 on urinary excretion of oxalic acid dietary and thyroid interrelationships affecting vitamin a status of feedlot beef cattle ration effects on rumen acids, ketogenesis, and milk composition. 
i.: unrestricted roughage feeding the effect of supplements of groundnut flour or groundnut protein isolate fortified with calcium salts and vitamins or of skimmilk powder on the digestibility coefficient, biological value and net utilization of the proteins of poor indian diets given to undernourished children a comparison of skin-testing with natural foods and commercial extracts die wirkung von hungern auf den ammoniakgehalt und das ph der pansenflüssigkeit sowie auf die harnstoff-, cholesterin-und zuckerkonzentration im blut. acta veterinaria acad increase of plant virus infection by magnesium in the presence of phosphate effect of intramuscular injection of magnesium sulphate solution on the growth rate and serum composition in rats diskussionsbemerkungen zu referaten über sauermilchprodukte auf einer vortragsveranstaltung der gesellschaft für eru~ effect of massive sodium bicarbonate infusion on renal function excessive insulin response to glucose in obese subjects as measured by immunochemical assay die wirkung gesteigerter kupferzufuhr auf den vitamin-c-haushalt von meerschweinchen bei parenteral zugeführter ascorbinsäure the influence of growth hormone on fat and protein metabolism. diss. abstr. 23 [1962] nr phenolcarbonsäuren in menschlichen nahrungsprodukten. zum vorkommen von phenolcarbonsäuren in menschlichen nahrungsprodukten und ihr einfluß auf den intermediären stoffwechsel influence of pregnancy and an oxidized lipid diet on the fatty acid composition of blood and tissue an experimental approach to the mechanisms of weight loss. ii. 
a comparison of effects of thyroxine, fat-mobilizing substance (fms) and food deprivation in achieving weight loss in mice fat accumulation in the regenerating media of arteries in rats given an atherogenic diet the effect of nutrition on the growth of fasciola hepatica in its snail host the effects of specific viruses, virus complexes, and nitrogen nutrition on the growth, flowering, and mineral composition of strawberry plants body weight and food intake as initiating factors for puberty in the rat the role of catecholamines in the free fatty acid response to cigarette smoking die renale steroidausscheidung bei experimentellem vitamin-e-mangel peculiar features of respiration and redox processes in the rice plants grown under different nutritional conditions der bohnenkaffee und die migräne repeatability of litter size and weight in the laboratory rat as affected by selection and plane of nutrition wirkungen von muskelextrakt auf den stoffwechsel. arzneimittelforschung 13 [1963] nr. 3, s. 238/239 einfluß der silagefütterung auf die qualität der butter einfluß der silagefütterung auf die qualität von milch und milchprodukten. 3. mitt.: einfluß von silagefütterung auf die organoleptischen eigenschaften der milch effect of protein intake and cold exposure on selected liver enzymes associated with amino acid metabolism body iron levels and hematologic findings during excess methionine feeding der einfluß des kulturmediums auf die bildung von streptolysin s durch ruhende zellen influence of polyphosphates in chilling water on quality of poultry meat influence of breed-type, feed level, and sex on characteristics of the lamb carcass, and some relationships among live animal and carcass measurements the toxic action of phenothiazine and some disturbances of intermediary metabolism in undernourished sheep efficiency of feed use in beef cattle über die wirkung der nahrungsfette auf die blutlipoide, teil i. ernährungs-umschau 10 [1963] nr. 8, s. 174/176. 
--über die wirkung der nahrungsfette auf die blutlipide fette und blutgerinnung. bibliotheca haematologica effect of heavy cigarette smoking on postprandial triglycerides, free fatty acids, and cholesterol role of calcium in fibrin formation glucose-6-phosphate dehydrogenase and aldolase in lenses of lactose-fed rats -effect of riboflavin, choline, pantothenic acid and vitamin a on the excretion of sodium in urine of rats the effect of waterwashed rice in the diet on the growth, excretion of sodium in urine and blood pressure of rats effect of ascorbic acid, vitamin b, and milk on the dark adaption effect of single deficiency of vitamin a, thiamine der einfluß von vitaminen auf die psychomotorische leistungsfähigkeit beim menschen. naunyn-schmiedeberg's arch. exper feeding response of adult tribolium to carbohydrates in relation to their utilization nutritional secondary hyper hieronymi: influence of nutritional conditions on the cellular rna metabolism in vivo and in vitro diffusible auxin increase in a rosette plant treated with gibberellin transamination in muscular dystrophy and the effect of exogenous glutamate: a study on vitamin e deficient rabbits, and mice with hereditary dystrophy. canad effect of auxin on the emergence of lateral roots in p. 
mungo seedlings the effect of nutritive status on oestrus, ovulation, and graafian follicles in merino ewes biologische wirkungen autoxydierter, epoxydierter und bestrahlter fette säure-basen-gleichgewicht und chronische acidogene und alkalogene ernährung effect of protein level in milk replacers on growth and protein metabolism of dairy calves effect of sodium bicarbonate in the drinking water of ruminants on the digestibility of a pelleted complete ration sucrose diet and biliary chelate excretion in rats: with note on procedure for chelate determination einfluß sauerer milcherzeugnisse auf die darmflora untersuchungen über die speicherung und die ausscheidung von vitamin a nach ungenügender vitamin-a-versorgung bei legenden hühnern the effects of low magnesium intake on lactating ewes effect of vitamin a on the content of pyridinnucleotides, pyruvic, and lactic acid and on anaerobic phosphorylation. ukrainskii biokhimichnii zhurnal the pyridine nucleotides. a study of a method of measurement. a study of the alterations in rat liver under the conditions of diabetes and starvation. 
a preliminary study of various marine fish tissues with the emphasis on the islet of langerhans iron uptake-transport of soybeans as influenced by other cations höhe und zeitpunkt der düngung von sommerweizen mit chlorcholinchlorid zur verkürzung der halmlänge nutritional significance of soluble nitrogen in dietary proteins for ruminants effects of long-term feeding of vegetable fats on atherosclerosis effects of feeding various milo, corn, and protein levels on laying performance of egg production stock some observations on the influence of a magnesium-deficient diet of rats, with special reference to renal concentrating ability effect of gibberellic acid on flowering of apple trees the effects of dietary fat and energy levels on the performance of caged laying birds motivational producing properties of the feeding system of the rat hypothalamus the influence of diet and age on lipid metabolism of chickens effect of age on the response of chickens to dietary protein and fat absorption and excretion of biotin after feeding minced liver in achlorhydria and after partial gastrectomy variations de la calcémie du chien normal après ingestion de cholestérol cristallisé dans l'éthanol ou dans l'hexane changes in activity of rat epididymal adipose tissue in vitro due to time elapsed since last feeding differences in rat strain response to three diets of different compositions l'alcoolisme à l'hôpital psychiatrique de colson (martinique). ann. médico-psychologiques 1 nitrogen, lipid, glycogen, and deoxyribonucleic acid content of human liver. the effect of brief starvation and intravenous administration of glucose some metabolic and nutritional factors affecting the survival of erythrocytes erythrocyte survival of rats deficient in vitamin e or vitamin b 6. j. nutrition 80 [1963] nr. 2, s. 
185/190 nutrition and lactation the effect of administering sodium chloride, sodium bicarbonate, and potassium bicarbonate to newly born piglets strontium-90 and calcium in milk of miniature swine further studies on cariostatic effect of organic and inorganic phosphates urinary excretion of magnesium in man following the ingestion of ethanol the effects of magnesium compounds and of fertilizers on the mineral composition of herbage and on the incidence of hypomagnesaemia in dairy cows behavioral, dietary, and autonomic effects of chlordiazepoxide in the rat the effect of a high and low sodium diet in a patient with familial periodic paralysis the effect of quaternary ammonium anion exchange resin on plasma and egg yolk cholesterol in the laying hen vitamin e deficiency and ion transport in rat liver slices effect of level of nitrogen on growth and reproductive physiology of young bulls and rams influence of low protein rations on growth and semen characteristics of young beef bulls, if treatment of nutritional cirrhosis in rats with choline and methionine; with special reference to fibrogenesis and fibroelasia probleme der beurteilung von sauermilcherzeugnissen. milchwissenschaft 18 [1963] nr. 5, s. 228/231. --antwort auf die diskussionsbemerkungen auf einer vortragstagung der gesellschaft für ernährungsbiologie e. v., münchen, am 22 response of early-weaned pigs to variations in dietary calcium level with and without lactose effect of low calcium diet on bone crystallinity and skeletal uptake of 45ca in rats response of rural guatemalan indian children with hypocholesterolemia to increased crystalline cholesterol intake source of plasma free fatty acids in dogs receiving fat emulsion and heparin alcoholic intoxication in the newborn infant. 
brit. med dental caries of rats fed a rice diet and modifications a study of zinc deficiency in the dairy calf effects of different levels of zinc and phosphorus on the growth of subterranean clover (trifolium subterraneum l) über den einfluß von fluor auf den wassergehalt des knochens gegevens over vitamine b,-deficientie, -behoefte en -voorziening the liquefying action of pancreatic, cereal, fungal, and bacterial α-amylases ernährung und endokrines system 1. mitt.: der einfluß der ernährung auf die schilddrüse the effect of saline water on kidney tubular function and electrolyte excretion in sheep zinc and iron deficiencies in male subjects with dwarfism and hypogonadism but without ancylostomiasis, schistosomiasis or severe anemia die verträglichkeit von xylit beim diabetiker einfluß der laevulose auf die fusionsbreite. sportärztl. praxis der einfluß von vitamin a auf den zitronensäurestoffwechsel studies on the growth-promoting value and digestibility of passion fruit seed oil ration effects on drylot steer feeding patterns dextrothyroxine on metabolism of cholesterol some effects of feeding lactates to dairy cows copper deficiency problems in south-east scotland bone changes in iron deficiency anaemia a preliminary report on nucleic acid levels in mineral deficient plants metabolism of histidine in protein malnutrition hypoglycemic effect of l-leucine during periods of endogenous hyperinsulinism nutritional studies with the guinea-pig. viii. 
: effect of different proteins, with and without amino acid supplements, on growth some effects of sulphur-containing amino acids on the growth and composition of wool effects of hunger and vi value on vi pacing potency of conditioned reinforcem based on food and on food and punishment sfibwaren und karies in theorie und praxis effect of magnesium on the changes of myocardial potassium confent untersuchungen fiber den einflub oral verabreiehten oxytetraeyclins auf leberlipide und serumeholesterin der weil~en ratte further studies on manganese nutrition of tobacco in relation to virus infection and synthesis aminobutyric acid (),-aba)-vitamin be relationships in the brain serum lipids and diet: a comparison between three population groups with low, medium, and high fat intake effects of light and gibberellin on elongation of intact wheat coleoptiles experimental magnesium deficiency in the cow thyroid function in chickens and rats: effect of iodine content of the diet and hypophysectomy on iodine metabolism in white leghorns cockerels and long-evans rats kuhmilehallergie beim sgugling und a rapid method for assessing drug inhibition of feeding behavior variation in, and the effects of vitamins on vertieillinm albo-atrnm influence of high level vitamin a supplement on semen characteristics and blood composition of breeding bulls influence of diet on viral hepatitis der einflul] der stiekstoffdfingung auf die zusammcnsetzung yon karf~ffeleiweib insulin response to fructose and galactose the effect of excess vitamin a on the oxygen consumption of young female rats effect of dietary amino acid pattern on plasma ~mino acid pattern and food intake gibbere/lln: effect on diffusible auxin in fruit development effect of intravenous versus oral fat administration in fat-deiicient dogs plasma vitamin bi~ levels in some nutritional deficiency states nutritional factors influencing the conversion of tryptophan to niacin pancreatic hypertrophy and chick growth inhibition by 
soybean fractions devoid of trypsin inhibitor production, interior egg quality, and some physiological effects of feeding raw soybean meal to laying hens effect of palm jaggeries on the growth and blood and liver composition of albino rats kept on rice and jowa effects of polyphosphates on water uptake, moisture retention, and cooking loss in broilers untersuchungen fiber den ern~hrungsphysiologischen wer~ yon kasein entgegnung zu diskussionsbemerkungen auf einer vortragsveranstaltung der gesellschaft for ern~ihrungsbiologie e. v., mfinchen, am 22. juni die erg~nzungswirkung yon dl-methionin allein oder in kombination mit l-lysin beim wachsenden schwein der einflul~ der nahnmg auf den kauapparat der einflu6 der nahrung auf den kauapparat. teil il changes in bone mass and density in living rats during the manipulation of calcium intake effect of chromium, cadmium, and other trace metals on the growth and survival of mice a study of fermentation in the production of mahewu, an indigenous sour maize beverage of southern africa the vitamin b~ deficiency syndrome in human infancy, biochemical and clinical observations lipids in chick urine: the influence of dietary rapeseed oil effects of dietary nitri~ on the chick" growth, liver vitamin a stores, and thyroid weight influence of radioactive sodium-24 on higher nervous activity of dogs, when chronically administered into the organism in comparatively small doses 9 the change of erythroeyte blood composition in persons with prolonged complete alimentary starvation (without limiting the water intake) and subsequent feeding dental abnormalities in rats attributable to protein deficiency during reproduction the effect of environment, and nutrition of pathogen and host, in the damping off of seedlings by rhizoctonia solani effect of dietary protein on fructose, citric acid, and 5-nucleotldase activity in the semen of bulls the effect of fluorine on dairy cattle. v. 
: fluorine in the urine as an estimator of fluorine intake some effects of diet and therapy on the survival and metabolism of adrenalectomized rats effect of methonine and choline deficiency on liver choline oxidase activity in young rats untersuchungen fiber die wasserlssliehen hemmstoffe aus dem 8chnittholz der weinrebe (vi~is vinifera l.). i. zur wirkung der hemmstoffe auf die keimlmg mad entwicklung yon rebs~mlingen nutrition and palatability. lancek 11 the effect of feeding fluoride on some enzymes of bovine tissues diet and histamine in the rmninant effect of food reflexes on cholinestera~e activity of cortical tissue. federation essential fatty acid deficiency and rat liver homogenate oxidations the effect of vitamin and antibiotic injections on early turkey poult growth and mortality alimentary production of gallstones in hamsters. 12. studies with rice starch diets with and without antibiotics nitrogen studies with apple and cranberry the influence of diet on the quality of faecal fat in patients with and without steatorrhoea uber die unterschiediiche beeinflussung des tryptophansteffwechsels dutch vitamin b6-mangel in der ratte. hoppe-seyler's influence of vitamin b12 and its coenzyme upon incorporation in rive of amino acids into tissue proteins in rats relativer vitamin b6-mangel hei erkrankungen der schilddriise strukturanomalien der z/ihne bei vitamin d-mangel-raehitis und der vitamin d-resistenten renalen rachitis the effect of fluoride on bone effects of insulin on hepatic glucose production and utilization prevention by hydrocortisone of changes in connective tissue induced by an excess of vitamin a acid in amphibia acute hypervitaminosis a in guinea-pigs. i. 
: effect on acid hydrolases der einfiu~ yon vollkornbrot auf den calciumstoffwechsel bei schulkindern effect of feeding milk replacers with varying amounts of f~ for hothonsc lamb production egg yolk and serum cholesterol levels: importance of dietary cholesterol intake effect of protein intake on glutamic dehydrogenase and amino acid desruination in rive observations in experimental magnesium depletion effect of gibbercluc acid on growth, gibberellin content, and chlorophyll content of leaves of potato ~ physiological factors influencing growth, reproduction, and production of well-fed dairy heifers. i. age at first breeding. ii. feeding of diethytstilbestrol tm~ influence of diet on the development of parotid salivation and the rumen of the lamb bericht fiber den wissenschaftlichen kongreb 1963 der deutschen geseuschaft fiir erniihrung influence of variations in dietary calcium: phosphorus ratio on performance and blood constituents of calves the lack of a consistent chick growth response to norwegian kelp meal some effects of kinctin on the growth and flowering of intact green plants incorporation of [,2p] orthophosphate into phospholipids of frog tissues during feeding and stmrvation growth-modifying and antimetabolite effects of amino acids in chrysanthemum a study of techniques used by advertisers in dealing with weight control. a national health problem lipid excretion. 3.: examination of faecal lipids of rats injected intravenously with serum lipoprotcin containing ~ac-labelled cholesterol effect of diet and diabetes on plasma glucose, fatty acid, and insulin effect of cigaret smoking during pregnancy: study of 2000 cases. 
obstetrics gynecology 21 respiration and phosphorylation in live homogenates from rats exposed to hypoxia and food restriction the influence of mi]l~ fat depressing rations on the yield and composition of bovine milk phosphatides and cholesterol in the rat bed: effects of growth, diet, and age the effect of plant nutrients and antagonistic microorganisms on the damp. ing-off of cotton seedlings caused by rhizoctonia solani kurus l~utrition of gram-negative anaerobic bacilli. nutrition rev. 21 effects of glucose on the production by escherichia coli of hydrogen sulphide from cysteine. j. general iylierobiol. 30 enumeration of psyehrophilie microorganisms vitamin requirements of three pathogenic fungi calorie requirements of rat intestinal microorganisms specificity of the salt requirement of halobacterium cutirubrum influence of the aqueous potato extract and its fractions on growth and spore formation of the b. pumilus and the production of the antibiotic tetaine the relationship between hormonal activity and sugar metabolism in protoparce scxta (joka~sen) and blabcrus craniifer bur~ieister apparent incorporation of ammonia and amino acid carbon during growth of selected species of ruminal bacteria l~ber die wirkung anorganischer st~ube auf das wachstum yon mikroorganismen effect of dietary calcium lactate and lactic acid of faecal eseherichia coli counts in pigs uber den einflul3 des n~ihrsubstrats auf die hemmung des bakterienwachstums durch cyanid autoradiographic studies of the differential incorporation of glycine, and purine and pyrimidine ribosides by paramecium aurelia correlation between the essential amino-acid requirements of staphylococcus aureus, their phage types, and antibiotic patterns nutrition and metabolism of marine bacteria. 
xii.: ion activation of adenosine triphosphat~se in membranes of marine bacterial cells carbon dioxide fixation in bacillus anthracis bacterial growth under conditions of limited nutrition interrelationship between temperature and sodium chloride on growth of lactic acid bacteria isolated from meat-curing brines morphological variation and nutrition of a new monoeentric marine fungns feeding stimulants required by a polyphagons insect, schlstocera gregaria vitamin requirements of root nodule bacteria phcnotypic, genotypic, and chemical changes in starving populations of aerobacter aerogenes studies on the d-amino acids. ii.: utilization of d-amino acids by lactic acid bacteria role and formation of the acid phosphatase in yeast der einflub yon co~-partia]druck und glucose-konzentration auf wachsturn und stiekstoffbindung yon azotobacter chroococcum bei~ inorganic polyphosphate metabolism in chlorobium thiosulfatophilum effects of molybdenum, vanadium, tungsten, and cobalt on growth of rhizobia and their hosts nutrition of leptospira pomona. ii.: fatty acid requirements 9 sterilization by beta-propiolactone of solid nutrient media for eultivation of moulds the digestion of natural food protein by the elearnose skate raja eglanteria (bose.) utilization of amino acids during metabolism in escheriehia coil the effect of nutrition on the growth of fasciola hepatica in its snail host cobamide coenzyme contents of soybean nodules and nitrogen fixing bacteria in relation to physio]ogical conditions determination of carbohydrate metabolism of marine bacteria the amino acid requirements of various types of shigella mushroom culture. factors affecting the growth of morel mushroom myecelium in submerged culture lebensmittelzusatzstoffe und mutagene wirkung. vii. 
: priifung einiger xanthen-farbs~offe auf mutagene wirkung an escherichia coll the biological control of glycogen metabolism in agrobaeterium tumefaeiens the maintenance requirement of escheriehia coll methionine requirement for growth of a species of mieroeoecus ~iorphogenesis and nutrition in the memnionella-stachybotrys group of fungi viable organisms from feces and food-s~uffs from early antarctic expeditions the metabolism of yeas~ sporulation. v. : stimulation and inhibition of sporulation and growth by nitrogen compounds the effect of lipids on citric acid production by an aspergillns niger mutant relationship between deuterium inhibition of growth and glucose catabolism in saecharomyees cerevisiae function of trehalose in baker's yeast (saccharomyces cerevisiae). arch. biochem preparation and lyophilization of colicine suspensions. i. production of eolicines in liquid nutrient media lvber den einflub der kulturbedingungen auf die stramenempfmdliehkeit der glueoseoxydation in baeterium cadaveris nutritional requirements and metabolism of myeoplasma laid-lawil j. gen. microbiol. 30 die wirkung subletaler konzentrationen yon sorbinsi~ure auf escherichia coli und aspergillus niger the genetic analysis of carbohydrate utilization in aspergillus nidulans the amino acid nutrition of respiration deficient and respiration competent saecharomyces. a. van leewenhoek nutritional studies of some membem of mucorales. iv.: 1. sugars, amino-, and organic acids of the myceaium selektivn~hrboden fiir staphylokokken effects of certain amino acids in anthranilate production in neurospora crassa studies on the polysaccharide-fermenting lactic acid bacteria. in. 
: nutritional requirements and the existence of fermentation promoting factors for sucrose and inulin the catabolism of proteins and nucleic acids in starved aerobacter aerogenes aerobic fermentation and the depletion of the amino acid pool in yeast cells influence of hydrogen ion concentrations on the utilization of sodium nitrite by fungi oxidative metabolism of glucose in leaf tissues infected with tobacco mosaic virus differential effects of amino acid deficiencies on bacterial cytochemistry nutritional requirements of an aspergiuus niger mutant for citric acid production culture de tissus d'insectes ~, l'aide d'extrait d'embryon de poulet en l'absenee d' h6molymphe. c. r. acad utilization of some carbon and nitrogen sources by pseudomorms fluorescens on the selection of microorganisms for use in bacterial fertilizers in vitro and -rive uptake of carbon-14 labelled alanine and glucose by ascaridia galli, parasitic nematode of chickens growth of psychrophiles. i. : lipid changes in relation to growgh-temperature reductions vitamin requirements of listeria monoeytogenes parasitism and nutrition of gonatobo~rys simplex the effect of alkali metals on the growth of staphylococcus pyogenes the uptake of potassium and rubidium by staphylococcus pyogenes metabolism of nucleic acids and of nucleotides in the course of synchronous development of azotobacter vihelandii studies on the biotin-oleie acid requirements of a lactobaeillus plantarum variant isolated from chick feces unusual response to iron-dextran. brit. *ned. j. 1963, nr. 5331, s. 630. +in'. 
n.: tissue trypsin and trypsinogen levels in pancreatitis skeletal development of suckling kittens rate of liver regeneration atherogenesis in the monkey the significance of serum triglyeerides anaemia and parasitism in man physiological mechanisms in nutritionally-induced differences in ovarian activity of mature ewes bone development in suckled pigs production performance of artificiauy and non~r~ifically sired herd-mates in wisconsin search for an unidentified nutrient in mammalian liver. part i.: growth studies with various ox liver preparations proline control of the feeding reaction of cordylo-phora relationship between longevity and production in holstein-friesian cattle circadian adrenal cycle in c mice kept without food and water for a day and a half metabohc pmduc~s form labelled ethanol. iv. : disappearance of ethanol-carbon from morphological fractions and lipids of rat tissues acetate utilization by maize roots vajda, ]3. : i~c~sll-rcs of body fat and hydration in adolescent boys some characteristics of a proteolytic enzyme system of pseudomonas fluorescens some metabolic relationships between host and parasite with particular reference to the eimcriae of domestic poultry nucleotide degradation in the muscle of iced haddock (gadns aeglefinns), lemon sole (pleuronectes microcephalus), and plaice (pleuronectes platessa) effect of starvation and 6-mcthylprednisolone (m_edrol) on the acid phosphatase of rat liver and muscle metabolic patterns in preadolescent children. vii. : intake of niacin and tryptophan and excretion of niacin or tryptophan metabolites biochemical changes in fish muscle during rigor morris studies on ornithine synthesis in relation to benzoic acid excretion in the domestic fowl effect of manual total collection of feces upon nutrient digestibilities polarographie studies on storage of meats. xxii. 
: influence of proteolytic enzymes on the polarographie wave of beef protein solutions post-mortem changes in chilled and frozen muscle genetic-nutritional interactiions as affecting the early growth rate of chickens effect of unequal milking intervals on lactation milk, milk fat, and total solids production of cows changes with age in glutamic oxalacetic transaminase activity of sonically oscillated tureen juice compared to total steam volatile fatty acids in calves fed different roughages catecholamine metabolism and some functions of the nervous system a study of some of the conditions affecting the rate of excretion and stability of creatinine in sheep urine changes in feeding behavior after intracerebral injections in the rat kanzcrogene substanzen in wasser und boden food additives and contaminants and cancer milk allergy in infancy food poisoning due to salmonella cnteritidis vat the mineral element content of spring pasture in relation to the occurrence of grass tetany and hypomagnesaemia in dairy cows insecticide residues in meat. residues in body tissues of livestock sprayed with sevin or given sevin in the diet over de giftigheid van solanum-alkaloidcn toxic products in groundnuts zur beziehung zwischen lipidcn hepatotoxicity of foods: a consideration of the hepatotoxicity of a few phanerogams and eryptogams. their possible influence in the pathogenesis of cirrhosis and hepatoma food-poisoning potential of the enterococei vanadium. excretion, toxicity, lipid effect in man an institutional food-poisoning outbreak examination of market milk of novokuznctsk for brucellosis an outbreak of food poisoning in a mental hospital food allergy as a cause of abdominal pain radioactivity in the diet the effect of microbial contamination on the requirement of chicks for certain nutrients the acute oral toxicity of cottonseed pigment glands and intraglandular pigments sur l'absorption du edsium radioactif par rorge. c. r. hebd. 
s6ances aead die experimentelle alimentiire lebernekrose a]s empfindlicher indikator bei thermiseher belastung der milch. uber die magermilehtroekntmg the comparative toxicity of ethylene dibromide when fed as fumigated grain and when administered in single daily doses repository polyvalent insect antigen treatment for patients sensitive to hymenoptera. a clinical evaluation precursors of carcinogenic hydrocarbons in tobacco smoke toxin production in naturally separated liquid and solid components in preparations of heated surface-ripened cheese inoculated with clostridium botulinum allergieversuehe am tier zur ~tiologie der sogenannten margarinekrankheit. dr. reed. wschr. 14 [1963] nr. 1, s. 9/12. --, allergenwirkung oder immunologisohe adjuvanswirkung in der ~tiologie der sogenarmten margarinekrankheit radium-226 in human diet and bone miodine in the thyroids of north american deer experimental groundnut poisoning in pigs cholesterol as carcinogen safety factors in water fluoridation based on toxicology of fluorides entelo epidemiologische gegevens over her ,planta-exantheen" te rotterdam, verkregen door enquetc-onderzoek planta-~x~ntheem" epidemie te rotterdam in de m~nden ~ugnstu~ en september 1960 salmonella-verontreiniging van plantaardige grondstoffen veer voedingsmiddelen van mens en dier increase of strontium-90 and caesium-137 sodium fluoride intoxication salmonellosis in the netherlands zwei seltene salmonellenfunde untersuehungen fiber die chronisehe toxizit~t der ascorbins~ure bei der ratte captan in green vegetables rfickst~nde yon pflanzenschutzmitteln, insektiziden mid dergleichen in der nahrung und ihre bedeutung ffir die gesundheit der gehalt der milch an 181j, 1,~co ' u0ba _{_ 140la in der deutschen )iileh in der zeit yon langfristige nutritive anwendung yon antibiotika in der tierern~hrung im hinblick auf die menschliche gesundheit mi~ besonderer beriicksichtigung yon chlortetrazyklin a milk-borne outbreak of food poisoning due to salmonella 
heidelberg ergebnisse yon schwebversuehen an farbstoffen zur farbmattierung yon tabakwaren nutritional secondary hyperparathyroidism of the cat insecticide residues in fat. a screening method for chlorinated pesticide residues in fat without cleanup untersuehungen fiber die quantitative verteilung radioaktiver falloutprodukte in milch too many vitamins radios~ron$ium removal from milk. determination of apparent equilibrium constants of the exchange reactions of sodium, potassium, calcium, and magnesium with amberlite ir-120 ern~hrungsphysiologische eigenschaften der margarine. fette, seffen, anstrichmittel 65 smoking and cancer: retrospective studies and epidemlologieal evalution beobachtungen fiber den verlauf der alkoholkrankheit am krankengut einer heilanstalt die verschmutzung yon trinkwasser dutch i)etergentien grain fumigant residues. occurrence of bromides in the milk of cows fed sodium bromide and grain fumigated with methylbromide insecticide persistence. the disappearance of endrin residues on cabbage lebensm ittelchem u. gerichtl reproductive performance of female miniature swine ingesting strontium-90 daily toxicity of nitrate nitrogen to cattle methods of residues of phosphated insecticides and miticides in foods on bacillary excretion in food toxinfections of salmonella etiology relationships between the deposition of strontium-90 and the contamination of milk in the united kingdom staphylococci in cottage cheese is~cs and potassium in people and diet. 
-a study of finnish lapps effect of treatment of seeds with 2-chloroethanol on the resistance to boron toxicity in wheat seedlings desferrioxamin, eine neue das eisen bindende und eliminierende substanz zur behandlung der 9rim~rcn und sckund~ren i-i~mochromatose akuter eisenvergiftungen toxic products in groundnuts smoking, arteriosclerosis, and age the incidence of milk sensitivity and the development of allergy in infants einflul] yon fluor und jod auf den stoffwechsel, insbesondere auf die schilddrtise quelques exemples illnstrant la valour et l'utllit6 des m~thodes de lysotypie clans certaines salmonelloses humaines d'origine alimentaire food poisoning caused by panthogenic halophilic bacteria (pseudomonas enteritis txkikawa): 1%ep0rt of four autopsy cases procaine penicillin g in milk following intramuscular injections comparative subacute toxicity for rabbits of citric, fumaric, and tartaric acids distribution of pesticides in fermentation products obtained from artificially fortified grape musts mercurial fungicidal seed protectant toxic for sheep and chickens the problem of salmonella food poisoning dietary factors in the pathogenesis and treatment of cirrhosis of the liver. med. clinics of north america 47 an outbreak of salmonella food poisoning in l~ehmadabad town, kaira i)istxiet, gnjaxat 8~a~e la tossinfezionl alimentarl da salmonella nell' 0spedale maggiore di milano dal 1954 al 1961 water intoxication due to oxytocin: reporb of a case c~sium-137 und kalium in menschlichen organen und in der milch 1959/60 caesium-137 in dried milk produets in relation to phytoellmatic zones smoking and oral cancer occurrence of hepatomas in rats fed diets containing peanut meal as a major source of protein nachweis yon mangan-54 und kobalt-57 in pflanzen als fo]ge russischer kernwaffenvcrsuche e-ruhr-epidemie durch speiseeis bcricht fiber eine arbeitstagung bei der internationalen atomenergic-behsrde in wien vom 12. bis 14. i)ezember 1962. i)t. 
lebensmittel-rdsch the development of microbiological standards for foods. j. milk food technol a case of breslau salmoneliosis caused by eating chicken internationales rundgespr~ch fiber lebensmittelchemische probleme in wiesbaden und eltvllle a. rh. (4 vortragsreferate) staphylococcal infection of raw milk as a cause of food poisoning. monthly bull. ministry health pub carcinogenic effect of egg white, egg yolk, and lipids in mice eczema and cow's milk exitus letalis nach lebensmittelvergiftung dutch bacillus cereus repression of staphylococcus aureus by food bacteria. ii. : causes of inhibition a further report on the radioactive contamination of milk and milk products in japan. determination of strontlum-90 and cesium-137 concentrations in milk powder in japan concerning sporadic salmonelloses insecticide residues. extraction and cleanup studies for parathion residues on leafy vegetables salmonellosis epidemiology zur frage der deponierung yon nutrltiven allergenen im organismus. allergic, _~sthma 9 allergic children with various symptoms caused by cows' milk messungen der umweltsradioaktivit~t und der radioak-tivit~t yon lebensmitteln im jahre ein cxperimenteller bcitrag zur tabakrauehkanzerogenese. dr. reed. wschr. 88 [1963] nr. 13. u n. : contamination of leaves by radio active fall-out insecticide residues in milk and meat. residues in butterfat and body fat of dairy cows fed at two levels of kelthane (1.0 and 2 insecticide residues in milk. 
residues in milk from dairy cows fed low levels of toxaphene in their daily rations tier-und pfhnzenerniihrnug _anlrnal and plant nutrition summary of ,tropical crops soil, analysis, and its relation to plant composition and growth fertilisers and plant nutrients ulcers in swine tnfluence of chelating agents on the concentration of some nutrients for plants growing in soil under acid and under alkaline conditions nutritional evaluation of permanent pastures with dairy cattle in louisiana the herbage intake of eattle grazing lucerne and cocksfoot pastures terminology and methods for feeding and weighing animals the effect of feeding on evaporative heat loss and body temperature in zebu and jersey heifers studies on the requirements and interaction of copper and iron in broad breasted bronze turkeys to 4 weeks of age some factors affecting iron uptake by strawberry plants feed consumption during lactation and involution in sprague-dawley-rolfsmeycr rats the effect of variations in the energy and protein levels of the ration upon performance in the pig use of barley in high-efficiency broiler rations. 6. poultry sci tierarzncimittcl und aufzuchtmittel in der landwirtschaftlichen praxis. gesund. heitliche erwggungen zum schutze des konsumenten bei der anwendung yon tierarzneimitteln und aufzuchtmitteln in der landwirtschaftlichcn praxis, tell ii mechanism for movement of plant nutrients from soil and fertilizer to plant root growth rate of the tea leaf as determined by shade and nutrients. empire j. 
exper note on induction of flowering in ~railing shoots of clones of saccharum spontaneum effect of level and sequence of feeding and breed on ovulation rate, embryo survival, and fetal growth in the mature ewe evidence for a high zinc requirement at the onset of egg production aufnahme und wirkung des mikronghrstoffs knpfer in ionogencr und ehelatisierter form bei gerste effects of pelleting and varying grain intakes on milk yield and composition the relationship of gibberellic acid to flower initiation in column stock, math~ela incana the effect of selenium administration on the growth and health of sheep on scottish farms the horsebean (vicia faba l.) as a vegetable protein concentrate in chick diets size and feeding of different types of fishes the influence of age of chicks on their sensitivity to raw soybean oil meal influence of the mineral nutrition on the resistance of peach trees to fusicoceum amygdali de la croix granular fertilizer. influence of associated salts on plant response to dicalcium phosphate a comparison of feeding growing pigs once or twice daily the interrelation of nutrient supply, leaf nutrient content, and vegetative growth of ilcx crenata 'green island' effect of rationing grass on the growth rate of dairy heifers and on output per acre, with a note on its significance in experimental design experiments on the nutrition of the dairy heifer. iv.: protein requirements of 2-year-old heifers grass silage vs. hay for lactating dairy cows hay vs. silage for two to six months old dairy calves weaned at 25 or 60 days effects of high levels of copper and ehlortetracycline on performance of pigs seedlings resistance of corn to leaf feeding of the european corn borer ostrina nubflalis ease of hydrolysis of the hemieeiluloses of forage plants in relation to digestibility bodenkde. 100 [1963] mr. 1, s. 53/58. 
--caleium-mangelsymptome an h6heren pflanzen effects of frequency of feeding on urea utilization and growth charae%oristics in dairy heifers factors affecting the voluntary intake of foods by cows. 6. : a preliminary experiment with ground, pelleted hay the relationship of dietary energy level and density to the growth response of chicks to fats salt tolerances of pinus thunbergii compensatory carcass growth in steers following protein and energy restriction a guide to production, care, and use of laboratory animals. federation prec. 22 estimation of essential fatty acid intake in swine automatic dispensing at frequent regular intervals of liquid diet for piglets chelation in nutrition. chelates and the trace element nutrition of corn der einflul3 yon humuss~ure auf wachstum und ver~inderungen des freien zuekergehaltes bci winterweizenpflanzen, die im dtmkeln kultiviert wurden a comparison of the growth of chicks in the gustafsson germ-free apparatus and in a conventional environment, with and without dietary supplements of penicillin an evaluation of weed competition and the effects of weed extracts and leachates on the development of field corn (zea mays l.) 
and oats (arena sativa l digestibility of rations containing different sources of supplementary protein by young pigs the effects of urea supplements on the utillzation of straw plus molasses diets by sheep production performance of artificially and nonartifieiallysired herd-mates inwiseonsin dietary phosphorus for laying hens tolerance to acid soil conditions in barley effect of calcium and magnesium upon digestibility of a ration containing corn oil by lambs effects of caloric restriction during the growing period on the performance of egg-type replacement stock untersuchungen fiber die verwertung yon calcium-und phosphorsalzen aus fisehgr~itenmehl einigcr frischfische und fischkonserven bei der verffitt~rung a nut~rient requirement for optimum water absorption by intact root systems preparation of feed for animal nutrition experiments responses of winter wheat to nitrogen and soil nitrogen status studies on calcium requirements of broilers znm problem dcr nahrungspflanzenwabl der aphiden some factors affecting leaf development and longevity and the subsequent yield of corn grain efficiency of energy utilization by young cattle ingesting diets of hay, silage, and hay supplemented with lactic acid the effects of a plant steroid on body weight and feed efficiency of broilers feeding troughs for guinea-pigs beitrag zur ]~ackfruchtmast mit schweinen unter besondercr beriicksichtigung des n~hrstoffgehaltes der beifuttermischungen und der the feeding of thyroprotein to lactating sows zur planung, durchfiihrnng und auswertung yon schweinemastvcrsuchen bei gruppenfiitterung the influence of barbituric acid derivatives on the development of plant roots and root hairs factors affecting the voluntary intake of food by cows. 
5.: the relationship between the voluntary intake of food, the amount of digesta in the retieulo-rumen, and the rate of disappearance of digesta from the alimentary tract with diets of hay, dried grass or concentrates artificial food for oak-silkworm raising the comparative toxicity of ethylene dibromide when fed as fumigated grain and when administered in 8ingle daily do~0~ gibberellin at the vineyard oak wilt development and its reduction by growth regulators. i. production and activity of oak wilt fungus pectinase, ecmulase, and auxin. ii. effect of halogenated benzoic acids on oak trees, the oak wilt diseases, and the oak wilt fungus regulating nutrient intake in laying hens with diets fed ad libitum some effects of different soils on composition and growth of sugar beet production of fodder yeast from barley. i. preliminary studies on the use of the waldhof fermenter development and nutrition of new species of thranstochytrium studies on the effect of frequency of feeding upon the biology of a rabbit-adapted strain of pediculns humanus the influence of previous vitamin k nutrition on thromboplastie activity of brain extract the effect of nutrition conditions on the growth of and nitrogen accumulation by fodder beans when sown together with indian corn. i)oklady akad. nauk sssr dutch phenylbors~ure induzierte fragen der resistenzsteigerung in der modernen gefliigelhaltung chelation in nutrition. 
metal chelates in plant nutrition beziehungen zwischen dcm kupfergehalt und dem zeitlichen auftreten yon kupfermangelsymptomen an hafer in wasserkultur mit kleincn bodenmengen increased tolerance of bean plants to soil drought by means of growth-retarding substances effect of monocaleium and diammonium phosphates on crop yield, and their influence on soil solution movement and characteristics when associated with different salts effect of barbituric acid and ehlortetraeyeline upon growth, ammonia concentration, and urease activity in the gastrointestinal tract of chicks effects of feeding low levels of dimethoate on milk and on whole blood cholinesterase activity of dairy cattle die ziichtung von fleischschweinen und die folgeerscheinungen, die sich besenders im hinblick auf die qualit~t yon fleiseh und fett ergeben the relationship of specific nutrient deficiencies to antibody response in swine. i.: vitamin a. ii.: pantothenie acid, pyridoxine or riboflavin further studies of diet composition on egg weight effect of various levels of fluorine, stilbestrol, and oxytetraeycline, in the fattening ration of lambs studies on the properties of l~ew zealand butterfat. viii. the fatty acid composition of the milk fat of cows grazing on ryegrass at two stages of maturity and the composition of the ryegrass lipids soil potassium and the growth of vegetable seedlings artificial feed for silkworm, bombyx mori some effects of feeding stilbestrol, ehlortetracyeline, and penicillin with alfalfa soflage on steer performance and carcass quality food agric. 14 [1963] nr. 2, s. 66/75. --, mineral supplements for sheep the influence of higher volatile fatty acids on the intake of urea-supplemented low quality cereal hay by sheep untersuehungen fiber den umsatz waehsender sehweine ab geburt. 2. mitt growth of edible emorophyllous plant tissues in vitro chelates in agriculture. 
metal chelation by glucose-ammonia derivatives economic analysis of high-level grain feeding for dairy cows evaluation of the dacron bag technique as a method for measuring cellulose digestibility and rate of forage digestion n~ihrlssungen fiir zuckerriiben in wasser-und sandkultur activity of gibbereuin:'d' on the germination of photosensitive lettuce seeds raw and heat-treated soybeans for growing-finishing swine, and their effect on fat firmness ration effects on tureen acids, ketogenesis, and milk composition. i.: unrestricted roughage feeding a new growth stimulant, ~ growth hormone the effects of specific viruses, virus complexes, and nitrogen nutrition on the growth, flowering, and mineral composition of strawberry plants peculiar feature of respiration and redox processes in the rice plants grown under different nutritional conditions einflul] der silagefiitterung auf die qualit~t yon milch und milchprodukten. 3. mitt. : einflul] der silagcfiitterung auf die organoleptischen eigenschaften dcr milch effect of grinding and pelleting on the utilization of coastel bermuda grass hay by dairy heifers langfristige nutritive anwendung yon antibiotika in der tierernlihrung im hinblick auf die menschliche gemmdheit mit besonderer beriicksichtigung yon chlor-te~azyklin mode of action of growth retarding chemicals yield of sugarcane in louis-ana as influenced by soil moisture status and climate diss effect of auxin on the emergence of lateral roots in p. mungo seedlings compound mouse diets a semipurified caries-test diet for rats present status of feeding antibiotics to htctating dairy cows effect of sodium bicarbonate in the drinking water of ruminants on the digestibility of a pelleted complete ration semi-purified diets for sheep effect of vacuum-drying, freeze-drying, and storage environment on the viability of pea pollen. ii. 
: effect of boron, sucrose, and agar on the germination of pea pollen hshe und zeitpunkt der diingung yon sommerweizen mit chlorcholinchlorid zur verkiirzung der halml~nge nutritional signifieance of soluble nitrogen in dietary proteins for ruminants primary signs of nutritional deficiencies of laboratory animals ]~ffe[~b of dlctary pro~in and fat on growth, protein utilization, and carcass composition of pigs fed purified diets trans-fetts~iuregehalt yon schweineschmalz nach fiitterung yon sehweinen mit rindertalghaltigem kraftfutter. (ein beitrag zur quantitativen infrarotspektroskopischen bestimmung yon trans-fetts~uren in fetten effect of liming and potassium fertilization on soil solution and on yield and composition of alfaffa and orchard grass mixtures effects of feeding various milo, corn, and protein levels on laying performance of egg production stock effect of gibberellie acid on flowering of apple trees the effects of dietary fat and energy ]evels on the performance of caged laying birds effect of age on the response of chickens to dietary protein and fat chemical control of flowering. concentration of a floral-inducing entity from plant extracts strontium-90 and calcium in milk of miniature swine studies on the properties of newzealand butterfat. vii. effect of the stage of maturity of ryegrass fed to cows on the characteristics of butterfat and its carotene and vitamin a contents new radioactive tests show how termites feed mechanisms regulating the feeding rate of daphni~ magna straus influence of low protein rations on growth and semen characteristics of young beef bulls a study of zinc deficiency in the dairy call effects of different levels of zinc and phosphorus on the growth of subterranean clover (trifolium subterraneum l.). australian j. 
agrie absorption, translocation, exudation, and metabolism of plant growth-regulating substances in relation to residues the effect of the performance of growing pigs of the level of meal fed in conjunction with an unrestricted supply of whey increase in yield of legumes by fer~iliser mixture with lime chemically defined medium for growth of animal cells in suspension dis sieherung der eiweiljwrsotgung in dor l~ndwirt influences of previous calcium and phosphorns intake and plant phosphorus on the requirement of developing turkeys for calcium and phosphorus relationship between isotopicauy exchangeable calcium and absorption by plants effect of adding buffers to all-concentrate rations on fcedlot performance of steers, ration digestibility, and intrarumen environment lysine supplementation of corn -and barley-base diets for growing-finishing swine the effect of gibbereliin on the germination of seeds of arboreal plants effect of physical state of coastal bermuda grass hay on passage through digestive tract of dairy heifers nitrate reduction and carotene stability. effect of nitrate and some of its reduction products on carotene stability, d. agric. food chem chemical preparations for plant protection untersuchungen fiber die ksrperzusammensetzung und den stoffansatz waehsender mastschweine und ihre beeinflussung dutch die erniihrung. 3. mitt the cobalt requirement of sub-~erranean clover in the field comparing mile and corn in broiler diets on an equivalent nutrient intake basis effect of mineral nutrition on the invasion and response of turnip tissue to plasmodiophora brassicae wor the relation of chlorogenic acid and total free phenols in potato plants to resistance to infection by verfieillium alboafxum nitrogen and potassium as variables influencing soluble nitrogen and organic acid accumulation in soybean (glyoine max). di~s. abstr. 23 bett~r british beef and barley feed. 
veter effect of nitrogen fertilization upon yield and digestibility of aftermath timothy forages fed to dairy heifers ration effects on dltlot steer feeding patterns effects on zea mays seedlings of a strontium replacement for calcium in nutrient media evaluation of albumen quality in a poultry breeding program nutritional studies with the guinea-pig. viii. : effect of different proteins, with and without amino acid supplements, on growth some effects of heredity and environment on appetite in dairy animals further studics on manganese nutrition of tobacco in relation to virus infection and synthesis amillo acid supplementation of pig diets chelation as a basic biological mechanism der einflu~ der stiekstoffdiingung auf die znsammensetzung yon kartoffeleiweiil z studies on photosynthesis. i. : biosynthesis of sucrose from glycolate. par~ ii. : bicarbonate utilization by washed algae production, interior egg quality, and some physiological effects of feeding raw soybean meal to laying hens alteration of post-mortem changes in porcine muscle by preslaughter heat treatment and diet modification chelation in nutrition. soft microorganisms and soil chelation. the pedogenie action of lichens and lichen acids die ergiinzungswirkung yon dl-•ethionin allein oder in kombination mit l-lysin beim wachsenden schwein untersuchungen iibcr den einflub unterschiedlicher wasservemorgung auf ertr~ge, gehalte an ~therischem 01, trenspirationsquotienten, biattgrsl~en und relative ~)ldriisendichtsn bei einigen arten aus der familie der labiaten. 2. teil gehalte an iitherischem 01, transpirationsquotienten, blattgrsflen und relative 01-driisendichten bei einigen arten aus dcr familie der labiaten. iii.: blattgrsflen, relative 01driisendichten, anzahl am haupttricb inseriertcr blattpaare und internodien. 
liingen cadmium: uptake by vegetables from supcrphosphate in soil studies on the protein and methionine requirements of young bobwhite quail and young ringnecked pheasants chelation in nutrition. evidence for natural chelates which aid in the utilization of zinc by chicks selective fertilization of apple-trees some soils and fertilizer relationships of the cavendish banana (muss cavendlshl lambert) on three different soils in costa rice soil organic phosphorus and the phosphorus nutrition of plants the effect of heat treatment on the nutritive value of milk for the young calf. 5. : a comparison of spray-dried skim milks prepared with different preheating treatments and roller.dried skim milk, and the effect of chlortetracyclinc supplementation of the spray-dried skim milks the effect of heat, ~reatmen~ on the nutritive value of milk for the young calf. 6. :the effect of the addition of calcium a biological assay for metabolizable energy in poultry feed ingredients together with findings which demonstrate some of the problems associated with the evaluation of fats feed additives in livestock rations: part i. : urea in dairy rations. part.i/: use of thyroprotein in cattle nutrition diet and histamine in the ruminant synthetic ion-exchange resins as a medium for plant growth nutrition of vibrio fetus theoretical basis of unicellula algae cultivation amino acid supplementation of barley diets for growing swine some effects of 2.4-dichiorophen-oxyacetic acid on swect corn (zea ]~ays rugosa l.) with emphasis on yield, tillering, root development, and exudation of electrolytes from roots and stems. i)iss. abstr feeding and management of broiler strain breeder hens relationships among seven elements in the nutrition of corn in sand culture an external effect of inorganic nitrogen on nodulation influence of enzyme supplements in lamb fattening rations gravenstein and jonathan apples produced with giberellle acid the role of carotene in the dairy cow. wiss. ver6ff. dr. ges. 
ern~hrung 9 vitamin a-wirksamkeit der carotine bei versehiedenen tierarten. wiss. ver-5ff. dt. ges. eru~hrung 9 upgrading the indigenous poultry of uganda. i. : the growth rates and feed conversion from hatching to maturity of indigenous poultry crossed with four imported breeds effect of different kinds of litter on growth and feed efficiency in chick rearing investigation of the mineral nutrition of datura innoxia the effect of flooding in the availability of phosphorus and on the growth of rice nutrition of the boll weevil larva ascorbic acid in the nutrition of plant-feeding insects effects on the s~maeh worm, i-iaemonehus contortus, of feeding lambs natural versus semipuriiicd diets yield and foliar composition of corn as affected by fertilizer rates and environmental factors. i)iss. abstr. 2~ [1963] nr praktische erfahrm]gen in der carotinoidversorgung yon vsgeln effect of protein-energy relationship on the performance and carcass quality of growing swine the use of quarter samples in the assessment of the effects of feeding treatments on milk composition * calcium and phosphorus requirements of finishing broilers using phosphorus sources of low and high availability amino acid supplementation of peanut meal diets for broiler chicks the effects of feeding various levels of vitamin a on chicks with cecal coccidiosis chelation in nutrition. review of chelation in plant nutrition water use by irrigated arabia coffee in the failure of certain dietary ingredients to affect the incidence of blood spots in chicken eggs dcr einflul~ der anbauverh~ltnisse auf die eigensehaften der kartoffelknolle und der st~rke effect of feeding milk replacers with varying amounts of fat for hothouse lamb production l~hysiologieal factors influeneelng growth, reproduction, and production of wcll-fed dairy heifers. i. ago at first breeding. ii. feeding of diethylstflbestrol results of an experiment ot rothamsted testing farmyard manure and n, p, and k fertilizers on five arable crops. 
i. : yields results of an experiment at rothamsted testing farmyard manure and n, p, and k fertilizers on five arable crops. ii.: nutrients removed by crops 9 the utilization of carotenoids by the hen and chick some effects of potassium and lime on the relation between phosphorus in soil and plant, with particular reference to glasshouse tomatoes, carnations, and winter lettuce the lack of a consistent chick growth response to norwegian kelp meal further studies on protein and energy requirements of chicks selected for high and low body weight some effects of kinetin on the growth and flowering of intact green plants individual feed consumption of turkey breeder hens and the correlation of feed intake, bocly weight, and egg production crop analysis technique for studying the food habits and preferences of chickens on range supplemental value of turkey protein for wheat herbicides and plant growth regulators preparation of purified ration for chick. parg iv. : preparation of crystalline amino acid diet evaluation of algae as a food for human diets the influence of milk fat depressing rations on the yield and composition of bovine milk the effect of plant nutrients and antagonistic microorganisms on the damping-off of cotton seedlings caused by rhizoetonia solani kukn ki;-;sehe ern~ihrung und di~tetlk clinical nutrition and dietetics a statement approved by the board of directors of the canadian heath foundation radioactivity and human diet probleme der ern~hrung durch gefrierkost. sympomum der d0utschen gcgell~ehaft ffir ernghrung veto 14. bis 15. mgrz klinische ernghrungslehre" und wissenschaftlicher kongreb der deutschen gesellschaft flit ern~hrung an der johannes-gutenberg-universit~t mainz veto 17. bis 19 wissenschaftlicher kongreb der deutsehen gesellschaft ftir ern~hrung an der johannes-gutenberg-universit~t mainz am 18. und 19 ernghrung nnd digit" 12. deutseher kongreb ffir ~irztliehe fortbildung in berlin veto 5. 
bis 9 arbeitstagung fiber klinisebe ernghrungslehre. ern~ foods of the future (forts.). problems in space foods and nutrition. foods for extended space travel and habitation the question of fats. ii.: fats and disease behandlung fettbedingter gerinnungsstgrungen mit lipostabil sugar and dental caries obesity and sugar addiction hunger and malnutrition lancet 2 nutrition and general practice bericht fiber die vortragstagung des fachverbandes lebensmittelchemie der chemischen gesellschaft in der ddr vom 19 arbeitstagung fiber kommission ffir volksern~hrung, lebensmittelgesetzgebung und -kontrolle (eek) zu yi~inden des eidg the national diet-heart study low fat diet in familial mediterranean fever the thyroid gland in infant malnutrition evaluation of fao amino acid reference pattern studies on the physiology of nutrition in surinam rickets in southern israel diet and heart disease maiskeims1 in ernt~hrung und ditltetik apha conference report safe and nutritious food supply malnutrition and disease expert committee on medical assessment of nutritional status protein malnutrition the fat tolerance curves of patients with hyperllpidcmia and athcrosclcrosis die lactose im rahmen der ernt~hrung effect of environment on nutritional status zur theorie und praxis der zuckerkrankheit. wiener z dietetically induced experimental flous of rats physikalisch-diiitetische therapie yon hautkrankheiten. arch. phys. therapie 15 err~hrungsforschung 7 [1962/63] :nr. 4, s. 598/ 612. +bo rtiw~l, p. w. : milk-borne disease consumers' reactions to instand foods de voeding van woonwagenbcwoners experimental investigations on nutrition and human behavior. a post-script. 
amer di~tetische therapie der chronischen herzinsuffizienz construction and validation of the food attitude scale why we have a safe and wholesome food supply use of food in a psychiatric setting stature and nutrition in cystinuria and hartnup disease ])as endokrinologisehe syndrom des proteinmangels urinary excretion of 3.4-dlhydroxyphenylalanine (dopa) in two children of short stature with malnutrition current problems affecting consumption of milk and indnstry's response to them preparedness for emergency feeding fluoridation and public relations dieetprodukten in vlaanderen incorporation of labelled glycine into erythrocyto glutathione of rabbits; effect of nutritional muscular dystrophy hot wcreldvoedselvraagstuk sniker -glycogcen -tandbederf dietary in take in patients with arthritis and other chronic diseases a clinical trial of iron-fortified bread effects of freater intake of milk, fruits, and vegetables joint fao/w/to expert committee on nutrition" fiber eine sitzung in genf vom 18. his 25 serum cholesterol in a military population. its relation to obesity and the military diet ). ~z~, a. c. 9 some nutritional problems of older age groups neuere biochemisehe untersuchungen zur diagnostik und therapie yon b-vitamin-mangelzust~nden call-harvard nutrition project. iii.: the erythroid atrophy of severe protein deficiency in monkeys cali-harvard nutrition projeet. ii. : the erythroid a~rophy of kwashiorkor and marasmus zur hshe des erwiinsehten fottverbrauehs ern~hrung des sportlers voeding 24 moderne ern~hrungsbedarfsnormen. i. mitt. z. ges. hyg. 9 [1963] nr. 1, s. 11122. *--~ioderne ern~hrungsbedarfsnormen. 2. mit~.: z. ges. hyg. 9 a comprehensive home-care program for the chronically ill fat-modified foods for serum cholesterol reduction besonderheiten der ern~hrung alter menschen chronic malnutrition in turkey. v. 
studies on serum fatty acids in malnourished children prevention of ,meat anemia" in mice by copper and calcium beeinflussung der sportlichen leistungsfdhigkeit dutch eine geeignete er-n~ihrung physiology of adolescence. ii, l~u-trition -basal oxygen consumption -energy expenditure and balance -nitrogen metabolism -calcium metabolism -iron metabolism -red cell mass and hemoglobin ern~hrungsproblemc bei chirurgisehcn kranken. wiss. versff. dr. ges. ern~hrung 11 moderne vitamin b~-therapie: oral, rektal oder parenteral ? a palatable diet for producing experimental folate deficiency in man smoking in hospital was ist hungern und was heibt the cultivation of tflapia. this prolific fish as a fine source of proteinrich food in underdeveloped areas pyridoxine supplementation during pregnancy. clinical and laboratory observations on japanese foods nutritional sequelae of ga~trio surgery koehsalzarme kost und nierenerkrankungen theoretische und praktische grundiagen der ernilhrung in der fett in der diabeteskost foods or supplements? g zur hygiene und ,di~tetik des rauehens zur dis behandlung der uremic anaphylactie shock of the lungs triggered by mieroaspiration of cows' milk: a form of sudden unexpected death in early infancy the food service industry and its relation to the control of foodborne illness die konservative therapie des peptischen gesehwiirs gezondheid op reis addictive aspeeta in heavy cigarette smoking die ern~hrung im rahmen de~ heilvers im kurort the diet in renal faiiure is the rationale for gaetrointestinal diet therapy sound? familie-beruf-ern~hrung die dfiitbehandiung der leberkrankheiten. i)t. reed vom hunger his zum ~beritul3-weltweite ern~hrungsprobleme die bedeutung der vitamine in der t~tglichen erniihrung s~ure-und baseniibersehiissige naln'ung. therapiewoehe 13 [1963] nr. 13, s. 563/565. --ss und chronische acidogene und alkalogene erns z. 
em~hrungswiss industrial lunches and public health the assessment of nutritional status in man: chairman's opening remarks therapie der essentiellen hypertonie speeifieke voedings-en voorlichtingsproblemen in tropische landen wie kann man unsere kos~ und unsere kostgewohnheiten beeinflussen? neue konzeptionen in der wasser-und salzsubstitution some aspects of the relation of nutrition and pregnancy is coronary heart disease preventable? world hunger demineralization of whey. use of its protein in infant feeding elaidinized olive oil and cholesterol atherosclerosis soziatrischo aspekte dee genul)-und arz~aeimittelkonsums verwendung yon htilsenfriichten in der diabetiker-di~t sanitation and dishes sanitation and dishes. aspects old and new. part ii smoking, arteriosclerosis, and age neue weg~ zur erni~hrungsphysiologi~chen aufwertung yon getreide-erzeugnissen diphyliobothrium latum and human nutrition, with particular reference to vitamin bli deficiency milk and diverticulosis dietary factors in the pathogenesis and treatment of cirrhosis of the liver. ivied. clinies north america 47 zur frage der quanti~tiven charakteristik der ern~hrung der berufst probleme der gemeinschaftsverpflegung aus der sieht des ern~hrungsphysio-logen gegevens over vitamine b,-deficientie, -behoefte en -voorziening use of government-donated foods in a rural community fluoride, teeth, and the analyst eiweil3bedarfsnormen im rahmen unserer ern~hrungsrichts~tze physiologic discomforts in 1962 navy protective shelter tests di~tvorschl~ige : akute nierenentziindtmg siii]waren und karies in theorie und praxis ernehrtmgsberatung im krankenhaus use of a low-sodium formula as an improved karell diet, with emphasis upon the outpatient management of heart failure and lymphedema pflanzliches eiweil~ fiir die erniihrung des menschen influence of diet on viral hepatitis influence of siblings on student smoking patterns voeding yon leerlingen van een lagere technische school. ii. 
calorie~n-en nutri~ntenwaarde early use of circulating blood volume, weights, and normal diet in acute renal failure fette in tier nahrung. dr. recd. wschr. 14 [1963] nr. 9, s. 247/250. *scm~-idt-b~bach, a.: ~j~ber die di~tetik der sauermilch begriff und anfgabe di~tetischer lebensmittel zur ursache yon geschmackskalamit~ten in trinkwassertalsperren psychologische motive im wandel des brotverzehrs improving levels of nutrition through better food practices the vitamin b~ deficiency syndrome in hun~n infancy. biochemical and clinical observations treatmen~ of ,refractory obesry" with ,formula die~ nr. 1, s. 66ff. (45 s.). 3elwzea, c. c.: morphologie constitution and smoking prediction the outcome for obese dieters symposium fiber probleme der ern~hrung durch gefrierkost in karlsruhe yore 18 coordination of long-term care of :pku children die di~t bei diabetes meuitus a nutritional supplement (nutrament) for elderly patients dietary intake of five groups of subjects. 24-hr. recall diets vs. dietary patterns the prolonged effects of a low cholesterol, high carbohydrate diet upon the serum lipids in diabetic patients stand und perspektiven der eiwei~versorgung. zur zielsetzung des verhandiungsthemas de voeding van rijswerkers signification des standards calorieo-azotes utilists en france erg~nzungen veto standpunkt des lebensmittelehemikers zu (dem beitrag) official acceptance of homogenized milk in the united states advances in nutrition and dietetics nutrition of 96 naval recruits during a shelter habitability study hot ombuigen van voedingsgewoonten de voeding van schippemkinderen san boord en in de internaten voeding 24 [1963] iqr. 5, s. 291/308. 
--le traitement de rinsuffisance rtnale her zoutarme-eiwitarme dicer ein beitrag zur allgemeinen el3problematik, ausgehend yon einer anorexia nervosa rroblem~ in nutrltlon~l supplementation an4 ~mri~hnl~ut detection of nutritional imbalances theorie und praxis der schwangerenern~hrung die pharmakologisehe beeinflussung yon hunger mad s~ttigung problems in the evalutaion of nutritional status in chronic illness europ~ische di~ttagung in amstercl~m (17. bls 19 entwieklung des brot-und gctreidevcrzehrs in der neuercn zeit nutrition and palatability coehac dis~tse. -biochemical and technological aspects die di~itetik der :fettsucht kinderemll}~mg nutrition of infants and children report on infant feeding childhood nutrition in lapland. :nutrition rev the thyroid gland in infant malnutrition. nutrition rev. 21 [1963] nr. 1, s. 32. +n. n.: physical activity of obese girls appraisal of nutritional adequacy of infant formulas used as cow milk substitutes isomaltose intolerance causing decreased ability to utilize dietary starch passagere hypereh]or~mische azidose bei zwei ausgetragenen s~uglingen wghrend s~uremflch-em~hrung ober erfahrungen in der frfihgeborenonaufzucht mit einer neuen bedarfsangepa6ten friilmahrung nutritional defects in adolescence g p 27 intravenous glucose tolerance in the normal newborn infant: the effects of a double dose of glucose and insulin ! attitudes towards physical ac. tivity, food, and family in obese and nonobese adolescent girls urinary excretion of 3.4-dihydroxyphenylalanine (dopa) in two children of short stature with malnutrition high salt content of western infant's diet: possible relationship ~o hypertension in the adult vitamin e to premature infants des enzym yon ~'leming (lysozym) und seine bedeutung flit die si~ugllngsern~ihrung. ann investigation on the relation of between-meal eating and dental caries of sixth-year molars in school children studies in infantile malnutrition. 
i.: nature of the problem in peru chronic malnutrition in turkey. v. studies on serum fatty acids in malnourished children emi~hrnng mad faekalc lysozymaktivitiit beim s~iugiing role of linoleio acid in infant nutrition factors related to the eating behavior and dietary adequacy of girls 12 to 14 years of age. dies. abstr. 23 des vitamin c im jugendalter. ii. mitt.: uber die wirkung yon natiirlichem und synthctisehemvitamin c bei l~ngeren zugaben the incidence of protein-calorie malnutrition of early childhood ein besonderer znsammenhang zwischen dem bedarf an nahrungsfett und dem stoffwechsel in den ersten lebensjahren the effect of supplements of groundnut flour or groundnut prorein isolate fortified with calcium salts and vitamins or of skim-milk powder on the digestibility coefficient, biological value, and net utilization of the proteins of poor indian diets given to undernourished children lysine fortifications of wheat bread fed to haitian school children lrber den 24-stunden-rhythmus der kalorienerzeugung bei friihgeborenen beitriige zur frage der spezifisch-dyrmmi~ehen wlrkung auf grund yon glykokoll-belastungen bei friihgeborenen. acta paediatriea acad carotine in tier stiuglingserni~hrung. wiss. versff. d$. ges. ernkhrung 9 liver and depot lipids in children on normal and high carbohydrate diets response of rural guatemalan indian children with hypocholesterolemia to increased crystalline cholesterol intake feeding value of soy milks for premature infants early feeding and birth difficulties in childhood schizophrenia. a brief study zusammenh~nge zwischen stoffwechsel und fl~issigkeitsbedarf beim s~ug-ling kuhmilchauergie beim s~ugling und ,cot death". die unspezifische kumulative sensibilisicrung malnutrition and the health of children practical aspects of infant feeding breast-feeding, weaning, ~nd acculturation appetithemmer in der ]~ehandlung der fettsucht bei kindern. 
miinchener med the effect of different amounts of vitamin d on growth and serum levels of calcium, inorganic phosphorus, and alkaline phosphatase in premature infants partition of urinary nitrogen in children with kwashiorkor treated with animal and vegetable proteins der einflud yon voukornbrot auf den caleiumstoffweehsel bei schulkindern ist eine rektale vitamin blz-behandlung vertretbar? dr ethyl alcohol in the pathogenesis of gout c|e~rance of infused fat emulsion in diabetic dogs praktische durehffihrung der parentcralon ern~hrung die parcntcrale ern~hrung chirurgischor paticntcn. wiss. ver6ff. dr. ges. em~hrung 11 verwertung intravenss verabfolgter aminos~,urengemische. wiss. versff. dr. ges die erkcnnung yon fe~-transportstsrungen und ihre bedeutung ftir die intra-ven6se fettzufuhr intravenous glucose tolerance in the normal newborn infant: the effect of a double dose of glucose and insulin new intravenous fat emulsion indikationen und kontraindikationen der intraven5sen fettzufuhr in der chirurgie anwendung intraven6s gegebener aminos~urengemische in dcr p~diatrie ymtravensse fettinfusionen. wiss. versff. dt. ges. ern~rung ~qotwendigkcit und erfolge der parenteralen mad sonder-ern~hrung moderne vitamin blz-therapie: oral, rektal oder p~renteral? mcd anwendung intravenss gegebener aminos~urengemische in der gyn~kologie und geburt~hilfe aminos~ureninfusionen. schweiz. reed. wsehr. 93 klinische anwendung mad erfahrungen bei der verabreiehung intravensser fettemulsionen an ehirurgischen patienten diskussionsbemerkung zum thema: die parenteralo ern~hrung ~tude exp~rimentale de la tolerance d'une solution de graisse vsgstale d'administration intraveineuse ern~hrungsphysiologische grundlagen der parenteralen ern~hrung erfahrungen mit der parenteralen ern~hrung mittels fettinfusionen. helvetica chirurgica aeta 30 nitrogen, lipid, glycogen, and deoxyribonucleic acid content of human liver. 
the effec~ of brief starvation and intravenous administration of glucose techn~ und indikationen der parenteralen ern~hrung des neugeborenen die praktische organisation der klinisehen infusionstherapie mit zuckerund elektrolytlt)sungen. l ~ed untersuchungen und bcobachtungen fiber intravensse fettinfusionen in der inneren klinik. wiss. versff. d$. ges. ern~hrung 11 l'alimentation parenttrale, 6mulsions lipidiques. (a suivre) ann intravcnsse ern~hrungstherapie mit fettemulsionen parenteral-und sondeneru~hrte patienten zur rekt~len kaliumsubstitufion parenteraie ern~hrtmg mit fettemulsionen konservierung mad zubereitung yon lebens-und futtermitteln nutritional hygiene, preservation and preparation of foodstuffs and feeds n. n.: vortragsveranstaltung ,fleischhygieno" der th seminar on the use of radioisotopes in nutrition science and of ionising radiation in food technology. strasbourg, 1st -6th october progress of food irradia~on work and programmes in o.e.c.d. member countries (16 berichte) safe heat processing of canned cured meats with regard to bacterial spores the role of food science i and technology on the freeze dehydration of foods public health aspects of handling animal products in the txopics fack)rs affecting bacfcrial spoilage of animal products at elevated temperatures. food technol sterilized concentrated millr. food teehnol. 17 [1963] nr. 6, s. 43/44, 49. n.n.: foods of the future. now opportunities for flavor modification unrestricted approval for irradiated bacon 3-a sanitary standards for multiple-use rubber and rubber-like materials used as product contact surfaces in dairy equipment 3-a sanitary standards for batch and continuous freezers for ice cream, ices, and similarly-frozen dairy foods bericht fiber die vortrag~tagung des fachverbandes lebensmittelchemio der chemischen gesellsch~ft in der ddr vom 19. bis 21 lebensmittelchem. u. geriehtl. chem. 
17 fluoridation in great britain die enzymatische phycinspaltung in geschrotetem getreide in abh~ingigkeit yon der relativen lufffeuchtigkeit the effect of certain antioxidants during freezer storage of pork chops and sausage the mechanics of treating hatching eggs for disease prevention beitrag zur sfil~gerinnung yon kakaotrunk jodophore als desinfektionsmittel in der milchwirtschaft. milchwissenschaft 17 [1962] nr. 9, s. 513ff. (? s.). zitat: dr. lebensmittel tierarzneimittel und anfzuchtmittcl in der landwirtschaftllehen praxis. gesundheitliche erwrgungen znm sehutze des konsumenten bei der anwendung yon tierarzneimitteln und aufzuchtmitteln in der landwirtschaftlichen praxis n~ihrwertminderung dureh zubereitung dcr nahrung suue modifieazioni della flora mierobiea dei mollusohi eduli par effetto di eonservazione impropria verlinderungen des inhaltes yon dosenkonserven w~hrend lgngerer lagerung digtbrote aus der sieht ihrer praktischen gestmtung. wiss. versff. dr. ges. ern~hrung l0 erniihrungshygienische untersuchungen in kindergarten an budapester kindern im alter yon 1 bis 3 jahren. z. ges. hyg. 9 die eignung der bakteriologischen untersuehung yon kannenmilchproben als grundlage eines eutergesundheitsdienstes adequacy of cooking procedures for the destruction of salmonellae zur revitaminierung des mehles bzw. brotes. wiss. versff. dr. ges. er-n~hrung 10 conventionele verwarningsmethoden beitrag zur kenntnis der wechselwirkungen zwischen proteinen und poly. phenolcn der kakaobohnen wtihrend der fermentation nihydrazone feed medication ag~ins~ ar~iiieiaily induced escheriehia eoli air-sac infection foam-mat dried orange juice. i. time-temperature drying studies lebensmittelhygicnische probleme bei der herstellung yon gemeinschaftsverflpegung. 5. mitt. : z. ges. hyg. 9 lebensmittelhygienische probleme bci der herstellung yon gemeinschaftsverpflegung. 6. mitt.: z. ges. ttyg. 
9 untersuchungen über die temperaturvorgänge im innern von lebensmitteln während ihrer thermischen zubereitung, erläutert am kochen von kartoffelklößen. z. ges. hyg. 9 zur gewinnung von niederverestertem pektin aus technischen apfelpektinextrakten mit ammoniak: einfluß der entesterungsbedingungen auf das geliervermögen hygienische beurteilung einer durch clostridium verursachten massen-lebensmittelvergiftung [ungar neuartige technik der lebensmittelverpackung für geschmacks-aromastoffe und andere artikel bei überdruck studies with a natural source of xanthophylls for the pigmentation of egg yolks and skin of poultry die problematik der tuberkulosebeurteilung in der schlachttier- und fleischuntersuchung freezing rate of beef as affected by moisture, fat, and wrapping materials über mit kombinierten konservierungsmitteln hergestellte konfitüren und consumers' reactions to instant foods. food technol effect of supplementing lime-treated corn with different levels of lysine, tryptophane, and isoleucine on the nitrogen retention of young children effect of freezing on autoxidation of oxymyoglobin solutions the control of gloeosporium album rot of stored apples by orchard sprays which reduce sporulation of wood infections bakteriologische befunde bei der speiseeisuntersuchung im sommer the microbiology of vacuum packed sliced bacon the stability of canned foods in long-term storage the effect of proofing and baking on concentrations of organic acids, carbonyl compounds, and alcohols in bread doughs prepared from pre-ferments nutrients in raw vs. cooked globe artichokes effect of gamma-radiation, chemical, and packaging treatments on refrigerated life of strawberries and sweet cherries. food technol beeinflussung der wirkung von kaffeeinhaltsstoffen durch bestimmte behandlungsverfahren der rohbohne. 
(eine tierexperimentell-toxikologische studie influence of surface pasteurization and chlortetracycline on bacterial incidence on fryers the hydrolysis of grass hemicelluloses during ensilage post-harvest storage studies with selected fruits the science of food technology in venezuela beitrag zur aufbewahrung von stärkesirup in verzinkten gefäßen inactivation-rate studies on a radiation-resistant spoilage microorganism. ii.: thermal inactivation rates in beef the occurrence and growth of staphylococci on packed bacon, with special reference to staphylococcus aureus zur verhütung von lebensmittelinfektionen in großküchenanlagen durch desinfektionsmaßnahmen. ärztl the influence of selected bacteria upon the flavor of a precooked frozen poultry product flour maturing and bleaching with acyclic acetone peroxides effect of processing conditions on dry-heat expansion of bulgar wheat zur verwendung von pentachlornitrobenzol bei der lagerung von kohl. dt. lebensmittel-rdsch. 59 [1963] nr. 1, s. 14/15 les matières fécales des porcs et les selles des ouvriers d'abattoir constituent une source permanente de dissémination des salmonella studies on cooking fats and oils aspect sanitaire et légal actuel des aliments conservés. rev. d'hyg over de betekenis van postduiven als besmettingsbron van levensmiddelen met salmonella-kiemen. tijdschr. v. diergeneesk 87 over het voorkomen van salmonella-kiemen bij slagerijen. tijdschr. v. diergeneesk 88 ursache und entstehung von brotfehlern les salmonella des oeufs et ovoproduits français et étrangers the microstructure of baked products and doughs. food technol tomato powder by foam-mat drying zitat: dt. lebensmittel-rdsch. 59 [1963] nr. 4, s. 123. - die verwendung von gefrosteter sahne zur butterherstellung. teil iv. 
die ergebnisse der in dänemark, frankreich und in den niederlanden durchgeführten praktischen versuche und ihre bedeutung für die in der deutschen molkereiwirtschaft erfolgende verwendung von gefrosteter sahne bei der butterherstellung tenderness of the turkey meat as influenced by pre-cooling before processing and hand massaging vortragsmaterialien für die ernährungspropädeutik. (erläuterungen zu insgesamt 6 großen schautafeln.) 2. mitt.: behandlung der tafeln iv bis vi. ernährungsforschung 7 die unterschiedliche problematik und ihre konsequenzen bei der bekämpfung der rinder- und schweinefinnen salmonellae from flies in a mexican slaughterhouse factors affecting quality of pies prepared from frozen bulkpack red sour pitted cherries zur bedeutung antimikrobieller stoffe in der nahrung modified equipment for pasteurizing and deodorizing market milk and for pasteurizing, deodorizing, and slightly concentrating cheese milk adhesion of coatings on frozen fried chicken cottage cheese problems in production and sanitation. public health aspects über den einfluß von licht, sauerstoff und temperatur auf die haltbarkeit von verpacktem emmentaler käse in scheiben aktuelle notwendigkeiten - gesetzliche möglichkeiten zur fischkühlung in eis und seewasser techniques de recherche des salmonella dans les oeufs frais et de conserve food hygiene on board ship safety factors in water fluoridation based on toxicology of fluorides the effect of oiling before and after cleaning in maintaining the albumen condition of shell eggs lebensmittel-aerosole. fette preservation of the natural color in processed sweetpotato products. i.: flakes. 
food technol temporary inhibition of fermentation in apple juice preservatives and artificial sweeteners the mechanism of the development of rancidity in frozen fresh pork sausage and practicable methods for its inhibition die entwicklung der trinkwasserfluoridierung in den usa microbiological principles in prepacking meats standard-kapazitätstest für die bestimmung der desinfektionswirkung von desinfektionsmitteln in der milchwirtschaft. internationaler standard fil/idf 18-1962 die anwendung einiger arten, bzw. stämme, von propionsäurebakterien zur herstellung bestimmter käsesorten mit hohem vitamin b12-gehalt the extraction of pectins from apple marc preparations zur hygiene und ,diätetik des rauchens studies on control of respiration of mcintosh apples by packaging methods. food technol effects of ingredients used in condensed and frozen dairy products on thermal resistance of potentially pathogenic staphylococci der frischkäse und seine verpackung studies on the viscosity of mayonnaise. ii.: the influence of addition of vinegar on the viscosity of mayonnaise effect of chemical additives on the spreading quality of butter. ii. laboratory and plant churnings studies on browning mechanisms of fruit juice products. i.: changes in chemical composition which accompany browning of commercial concentrated lemon juice during storage ergebnisse der dlg-qualitätsprüfung 1963 für speiseeis. dt. 
molkerei-ztg beeinflussung verschiedenartig verpackter lebensmittel durch desinfektion mit formaldehyd grundsätzliches zur stabilisierung und solubilisierung von carotin und carotinoidpräparaten rückstände von pflanzenschutzmitteln, insektiziden und dergleichen in der nahrung und ihre bedeutung für die gesundheit abschließende stellungnahme lebensmittel-rdsch langfristige nutritive anwendung von antibiotika in der tierernährung im hinblick auf die menschliche gesundheit mit besonderer berücksichtigung von chlortetrazyklin nutritional studies on the utilization of distiller's stillage. part i: insolubles of molasses-butanol distiller's stillage auswertung der dlg-prüfung für frischkäse in verbraucherpackungen zu den fermentativen eigenschaften der milchsäurebakterien (,lactobacillus beijerinck"), zugleich ein beitrag zur vermeidung von fehlfabrikaten bei roh- und brühwurst. arch. lebensmittel-hyg microbiological aspects of one-trip glass bottles as used by the carbonated beverage industry de bereiding van bouillon aktuelle milchhygienische aufgaben und ziele des organisation der überwachung der umweltradioaktivität unter besonderer berücksichtigung der überwachung des gehaltes von lebensmitteln an radioaktiven stoffen. dt. lebensmittel bakteriologie der sauermilcherzeugnisse über die haltbarkeit von lebensmittelkonserven preparation of acid-modified flour for tub sizing radiostrontium removal from milk. determination of apparent equilibrium constants of the exchange reactions of sodium, potassium, calcium, and magnesium with amberlite ir-120 probleme der vitaminierung von brot the effect of several operational variables on the rate of freeze-drying of beef studies on beef quality. x. effect of temperature, freezing, frozen storage, thawing, and ph on the rate of hypoxanthine production. div. food preservation techn retardation of gelation in high temperature-short-time sterile milk concentrates with polyphosphates nonenzymatic bread browning and flavor. 
changes in amino acids and formation of carbonyl compounds during baking hinweise für konservierende wirkungen synthetischer senfölbildner nach versuchen an fischen neue wege zur herstellung haltbarer fisch-präserven behavior of ethylene dibromide, methyl bromide, and their mixtures. i.: in columns of grains and milled materials der einfluß des wässerns auf die kartoffel irradiation of fruits and vegetables in india effect of storage in nitrogen on the soluble sugar and dry matter contents of ryegrass drying of seaweeds and other plants. v. through-circulation drying of ascophyllum nodosum in a semi-continuous dryer niacin, thiamin, and riboflavin in fresh and cooked pale, soft, watery versus dark, firm, dry pork muscle nouvelles observations concernant la survie des salmonellae dans les fromages pyrocarbonic acid diethyl ester as a potential food preservative the effect of phosphates on moisture absorption, retention, and cooking losses of broiler carcasses zur hygienischen beurteilung der trinkwasserverhältnisse des oberen vogtlandes - eine hydrobiologische studie. z. ges. hyg. 9 studies on preserving quality in market eggs rapid detection of faecal coliform bacteria in the food processing plant. j. milk food technol the relationship between the loss of water and carbon dioxide from eggs and the effect upon albumen quality plastic packaging of eggs. 2 study on improvement of digestibility of milk protein. i.: the effect of heating, adjustment of activity of calcium ion, addition of whey protein, homogenization, and elimination of coarse casein micelle from milk by ultracentrifuge on the digestibility of milk especially on the coagulability of it part iii.: the digestibility of slightly hydrolized milk with proteinase and the preparation of milk which has same coagulability as human milk. j. agric. chem. soc study on improvement of digestibility of milk protein. part iv. 
: the nature of coagulation of casein of milk preparation which has same coagulability as human milk the influence of added microorganism on the quality of margarine. i.: the influence of mold inoculation die silberung von tafelwässern. dt. lebensmittel-rdsch technological aspects of the radiation pasteurization of foods rapid hydration of dried fruits. food technol untersuchungen über polygalakturonase-enzyme aus schimmelpilzen. 6. mitt.: eigenschaften der polygalakturonasen aus schimmelpilzen role of individual phospholipids as antioxidants association of veterinary food hygienists symposium on the marketing, transport, and slaughter of calves. iii.: scientific aspects. veter ricerche sulla resistenza della brucella abortus nelle salsicce. riv présence des salmonelles dans les viandes. données françaises et étrangères biochemische vorgänge im fleisch bei der lagerung einfluß des röstgrades von kaffee auf die extinktion wäßriger extrakte und die menge der trockensubstanz die herstellung von quark und weißkäse unter ausnützung ss eiweißstoffe der milch verluste von vitamin b~ und c beim kochen und turmkochen von gemüse effect of chilled storage on the frozen storage life of whiting salmonellenfunde in einer importsendung amerikanischer tiefgefrierhühner. arch. 
lebensmittel-hyg iron sulfide blackening in canned protein foods: oxidation and reduction mechanisms in relation to sulfur and iron raft research on food preservation by irradiation in poland no~: gas chromatography of chicken and turkey volatiles: the effect of temperature, oxygen, and type of tissue on composition of the volatile fraction l'inactivation dans l'eau de mer et l'eau d'alimentation de certains entérovirus de voedingswaarde van aardappelen van verschillende rassen en de invloed daarop van bemesting en bewaring effect of l-arabinose and d-xylose on dough fermentation and crust browning gelation of egg yolk corn carotenoids: effects of temperature and moisture on losses during storage salmonellenfunde in einer importsendung amerikanischer tiefgefrierhühner. arch. lebensmittel-hyg bacteriological examination of unbottled soft drink überblick über kunststoff-folien und -kombinationen als verpackungsmaterial in der milchwirtschaft pigmentierung des eidotters bei geflügel. wiss. veröff. dt. ges. ernährung 9 s~ beitrag zur bedeutung wasserlöslicher hochmolekularer kohlenhydrate für die verkleisterung der stärke einfluß chemischer verbindungen auf die antimikrobielle konservierungsstoffwirkung. 1. 
mitt.: einfluß verschiedener stoffgruppen auf die konservierungswirkung gegen aspergillus niger a quantitative study of changes in dried skim-milk and lactose casein in the 'dry' state during storage the role of the major sugars of potatoes in the browning reaction during chipping probleme der zuverlässigkeit von kunststoffen zur lebensmittelverpackung in europäischer sicht bakteriologisch-hygienische beurteilung von speiseeis weizenkeime als wertvoller rohstoff - einige fragestellungen und probleme probleme der frischhaltung und haltbarmachung von brot und backwaren the effect of selected polymers upon the albumen quality of eggs after storage for short periods preparation and quality evaluation of processed fruits and fruit products with sucrose and synthetic sweeteners the microflora within the tissue of fruits and vegetables changes in carbohydrate and phosphorus content of potato tubers during storage in nitrogen preparation of "natural" cow-milk fat globules; preliminary investigation of materials adsorbed at their surfaces lethal doses of gamma radiation of some fruit spoilage microorganisms alteration of post-mortem changes in porcine muscle by preslaughter heat treatment and diet modification über die möglichkeiten und grenzen eines effects of polyphosphates on water uptake, moisture retention, and cooking loss in broilers flavors imparted to dairy products by phenol deriva aromatische crackprodukte von sterinen. (i). z. ernährungs-wiss dose requirements for the radiation sterilization of food bericht über eine arbeitstagung bei der internationalen atomenergie-behörde in wien vom 12 zur bekämpfung der rinderfinne zum einfluß handelsüblicher, in lebensmittelbetrieben gebräuchlicher desinfektionsmittel auf lactobakterien; zugleich ein beitrag zur desinfektion in der marinadenindustrie einfluß chemischer umsetzungen bei trockenen lebensmittelgemischen in hinsicht auf die lagerfestigkeit. vi. mitt. 
: lebensmittelgemische mit trockenmagermilch als hauptkomponente. z. lebensmittel-untersuchung u die technologie von sauren milcherzeugnissen, insbesondere der sauermilcharten und sauerrahmarten effects of several edible coatings on poultry meat quality how to control insects in stored foods. part 2 die antibiotika und die aus ihrer anwendung für die milchwirtschaft sich ergebenden probleme zur haltbarkeitsverlängerung empfindlicher nahrungs- und genußmittel durch abpacken unter vakuum. fette, seifen the diffusion of hydrogen through tinplate containers packed with grapefruit juice effect of ice cream stabilizers on the freezing characteristics of various aqueous systems ist die infektion mit trichinen aus amtlich untersuchtem schweinefleisch im lichte der mathematischen analyse der bestimmungen der fleischbeschau von schweinefleisch möglich? hygiene in milk production, processing, and distribution effect of pre-cooling eggs and cartons upon quality after storage a biological after-effect in radiation-processed chicken muscle accounting for farm tank milk factors related to the flavor stability during storage of foam-dried whole milk. iii. effect of antioxidants untersuchungen zur hygienischen beurteilung von metallverunreinigungen in lebensmitteln the effect of bleed time prior to scald and refrigerated storage upon bacterial counts in the axillary diverticula of the interclavicular air sac of chickens techniques de recherche des salmonella dans les viandes the serotypes of salmonella isolated from foods carotinverluste bei der zubereitung der nahrung. wiss. veröff. dt. ges. ernährung 9 stability of ascorbic acid in a liquid multivitamin emulsion containing sodium fluoride the effect of storage time and holding temperature on egg interior quality in uganda über den einfluß verschiedener fangverfahren auf die qualität und lagerreserve der fische zerstäubungstrocknung von tomatenkonzentraten. dt. 
lebensmittel radiation pasteurization of fresh fruits and vegetables a bacteriological survey of certain processed meats. part i. population studies at packet and retail levels association of veterinary food hygienists symposium on the marketing, transport, and slaughter of calves. i.: marketing and slaughter die kombinierte verarbeitung von kartoffeln auf stärke und alkohol (fortschrittsbericht) usaec program in radiation research preservation of certain fish and fruits biochemical and quality changes in chicken meat during storage at above-freezing temperatures inleidend onderzoek naar structuurveranderingen die ontstaan bij verhitten van plantaardige produkten veranderingen van het vetgehalte bij de bereiding van vlees a study on the relationship between the factors influencing the time of cheese salting maple sirup. xxi.: the effect of temperature and formaldehyde on the growth of pseudomonas geniculata in maple sap vacuum-tempering corn for dry-milling organoleptische eigenschappen, thiamine-en ascorbinezuurgehalte van enige week-en diepvriesgroenten growth of psychrophiles. ii.: growth of poultry meat spoilage bacteria and some effects of chlortetracycline tests of corn stored four years in a commercial bin association of veterinary food hygienists symposium on the marketing, transport and slaughter of calves. ii.: the slaughter and inspection of calves. veter the effect of processing conditions upon the nutritional quality of vegetable oils bericht über den wissenschaftlichen kongreß 1963 der deutschen gesellschaft für ernährung (5 vortragsreferate) an objective measurement of the freshness of ready-to-cook broilers the fieldman's responsibilities in milk quality and procurement studies on the bacteria found in the wine during its making. 
i.: multiplication of bacteria in wine and malo-lactic fermentation hydrophilic colloids as additives in white layer cakes the acceptability of cooked poultry protected by an edible acetylated monoglyceride coating during fresh and frozen storage ist es zu vertreten, daß das fleisch schwachfinniger rinder auch in gebriitetem zustand eingefroren wird? arch. lebensmittel-hyg hefeninfizierte kondensmilch als ursache von fehlfabrikation bei schokolade. arch. lebensmittel-hyg. 14 [1963] nr. 1, s. 6/10. m, über die bedeutung aerober sporenbildner als bombageerreger von würstchenkonserven. arch. lebensmittel-hyg aluminiumfolie zur verpackung tiefgekühlter und gefriergetrockneter lebensmittel. fette über die milchfluorierung. bull. schweiz. akad. med. wiss vitamin stability in diets sterilized for germfree animals die trinkwasserversorgung ernährungsstatistik nutritional statistics african nutrition problems der verbrauch von alkoholischen getränken in österreich childhood nutrition in lapland overweight children in stockholm iron deficiency in the finnish population vitamin b12 deficiency in indian infants the indices of nutritional change in great britain dietary values from a 24 h recall compared to a 7-day survey on elderly people het vaststellen van de voedingstoestand van een bevolking in de tropen en subtropen dental effects of fluoridation of water with particular reference to a study in the united kingdom de voeding van woonwagenbewoners nutritional attitudes of some london housewives de voedingsgewoonten van bejaarden in amsterdam community studies of drinking behavior fats and carbohydrates as factors on atherosclerosis and diabetes in yemenite jews nutritional beliefs among a low-income urban population algemene gezondheidsaspecten van de voeding in de ontwikkelingsgebieden a comparative study of the nutritional adequacy of the morning intake of women clerical workers and women factory workers serum cholesterol in a military population. 
its relation to obesity and the military diet nutrient intakes of healthy older women analysis of the structures of food consumption by groups in japan thiamin (vitamin b1)-unterernährung in deutschland? med vortragsmaterialien für die ernährungspropädeutik (erläuterungen zu insgesamt 6 großen schautafeln.) 2. mitt.: behandlung der tafeln iv bis vi. ernährungsforschung 7 [1962/63] nr. 4, s. 613/634 zur ernährungssituation in arbeiterfamilien aus verschiedenen bezirken der ddr. 2. mitt.: ernährungssoziologische auswertung der lebensmittelverzehrungen in 6 haushaltungen mit 2 erwachsenen und 2 kindern, stand 1950 studies in infantile malnutrition. i.: nature of the problem in peru zur methodik von ernährungserhebungen bei der gemeinschaftsverpflegung weight changes in relation to birthweight of papuan, indonesian, and chinese children during the first two weeks of life numbers of tasters required to determine consumer preferences for fruit drinks onderzoek naar de menupatronen in de noordoostpolder smoking habits of medical and non-medical university staff i**cs and potassium in people and diet. - a study of finnish lapps. ann. acad. scientiarum fennicae a überblick über die radioaktivität der in österreich im jahre 1961 konsumierten lebensmittel diet and plasma cholesterol in 99 bank men der mengenmäßige getränkeverbrauch je einwohner im bundesgebiet predictors of human food consumption use of government-donated foods in a rural community bericht über die durum- und teigwarentagung der arbeitsgemeinschaft der getreideforschung e. v. vom 12. bis 13. märz voeding van leerlingen van een lagere technische school. ii. calorieën- en nutriëntenwaarde nutritional deficiencies in developing countries dietary survey in surinam vitamine a-tekorten op de aarde schwierigkeiten beim erreichen einer vollwertigen ernährung in ausgewählten verbrauchergruppen die entwicklung des brot- und getreideverzehrs in der neueren zeit. wiss. veröff. dt. ges. 
ernährung 10 nährwert und zusammensetzung von lebens- und futtermitteln nutritive values, composition of foodstuffs and f 9 physical chemistry of ice cream vitamin e in human nutrition appraisal of nutritional adequacy of infant formulas used as cow milk substitutes enkele gegevens betreffende de calorische waarde van kleine hapjes, gerechten en maaltijden the public health aspects of the use of antibiotics in food and foodstuffs. report of an expert committee nährwertminderung durch zubereitung der nahrung metals and other elements in foods die farbe der nahrungsmittel in anthropologischer sicht the enzymatic destruction of carotene and carotenoids foodstuff flavors. some factors affecting the flavor of sodium caseinate chemical and radiochemical composition of the rongelapese diet the organic constituents of food. i.: lettuce the influence of dehydration of foods on the digestibility and the biological value of the protein a study of two methods of assessing vitamin b6 nutriture mineralsalze und spurenelemente in der nahrung distribution of the bound form of nicotinic acid in natural materials the distribution of carotenoids in nature and their biological significance zur bedeutung antimikrobieller stoffe in der nahrung die konsistenz von margarine und fetten an improved nutrient solution for diploid chinese hamster and human cell lines extraneous materials in foods and drugs the composition of food flavors studies on pantothenic acid intake. i. pantothenic acid content in japanese foods haben wir mangel an essentiellen fettsäuren? phenolcarbonsäuren in menschlichen nahrungsprodukten. zum vorkommen von phenolcarbonsäuren in menschlichen nahrungsprodukten und ihr einfluß auf den intermediären stoffwechsel drugs in feeds relationship between the sulphur/nitrogen ratio and the protein value of diets gums in foods zur definition der begriffe ,aroma" und begriff und aufgabe diätetischer lebensmittel valeur vitaminique des carotènes pour l'homme. wiss. veröff. dt. 
ges. ernährung 9 internationales rundgespräch über lebensmittelchemische probleme in wiesbaden und eltville a. rh. (4 vortragsreferate) consumer awareness of texture and other food attributes organische und organisierte substanz in der lebensmittelchemie frost resistivity of fruit plants california association of chemistry teachers : inorganic nutrients in the sea fluoride in food antibiotics in feeds and other products plankton as foods relation between color of cranberries and color and stability of sauce bericht über die vortragstagung des fachverbandes lebensmittelchemie der chemischen gesellschaft in der ddr vom 19. bis 21 bulletin on tobacco evaluation of algae as a food for human diets digestibility of high-amylose corn starch. nutrition rev. 21 [1963] nr. 1, s. 27/28. n. n.: comparative evaluation of corn masa and steam-processed whole corn flours the major anthocyanin pigments of vitis vinifera varieties flame tokay, emperor, and red malaga peonidin-3-monoglucoside in vinifera grapes formation and distribution of amylose and amylopectin in the starch granule nutrient in seeds. amino composition of some seeds sugar levels in fruits of the lowbush blueberry estimated at four physiological ages the relation of pectic substances to firmness of processed sweet potatoes (ipomoea batatas) processed vegetable products the protein composition of different flours and its relationship to nitrogen content and baking performance relationship between 'antitryptic factors' of some plant protein feeds and products of proteolysis precipitable by trichloroacetic acid. j. sci. food agric a thermostable haemolytic factor in soybeans foam-mat dried orange juice. i. time-temperature drying studies bound" growth inhibitor in raw soybean meal leaf analysis as a guide to the nutrition of fruit crops. ii.: distribution of total n, p, k, ca and mg in the laminae and petioles of raspberry (rubus idaeus l.) as influenced by soil treatments nutritive value of pumpkin seed. 
essential amino acid content and protein value of pumpkin seed (cucurbita farinosa) effect of cooking and of amino acid supplementation on the nutritive value of black beans (phaseolus vulgaris l.) supplementation of cereal proteins with amino acids. iv.: lysine supplementation of wheat flour fed to young children at different levels of protein intake in the presence and absence of other amino acids über den chemischen total ascorbic acid in potatoes. raw, fresh, mashed, and reconstituted flakes moisture contents of hard red winter wheat as determined by meters and by oven drying, and influence of small differences in moisture content upon subsequent deterioration of the grain in storage rheological studies with canned tomato juice the chemical composition of maple sugar sand soluble carbohydrate content of varieties of tetraploid ryegrass natürlicher gehalt und stabilität von carotinen und carotinoiden in citrussäften über wein und weinuntersuchungen fatty acids and other lipids in mayonnaise cyclic fatty acid yields from linseed oil factors affecting enzymatic solubilization of beef proteins weibulls original ring, en ny medeltidig vårvetesort. (a new medium early variety of spring wheat, weibull's ring.) agri hortique genetica 21 über die askorbinsäuresynthese in zerschnittenen kartoffeln die bestimmung der amylaseaktivität und einige studien über amylaseaktivität in gekeimtem roggen effect of processing conditions on dry-heat expansion of bulgar wheat oils, fats, and waxes über das öl der johannisbrotkerne. 
fette, seifen fruit and fruit products über die variabilität einiger eigenschaften der kartoffelstärke in abhängigkeit von witterung a short-term effect of weather on malic acid in pineapple fruit the specific surface of flour and starch granules in a hard winter wheat flour and in its five subsieve-size fractions oranges and lemons factors affecting quality of pies prepared from frozen bulk-pack red sour pitted cherries studies on the consistency of thiamin and protein contents of pure-bred strains of rice a comparison of the nutritional value of protein from several soybean fractions zum vitamin- und aminosäuregehalt von maisquellwasser dark discoloration of canned all-green asparagus. i. chemistry and related factors enzymatic enhancement of flavor pectinesterase in normal and abnormal tomato fruit storage effects on winter squashes. varietal differences and storage changes in the ascorbic acid content of six varieties of winter squashes grape pigments. concord grape pigments corn meal as a source of ribonuclease banana odor components. volatile components of bananas. part i. isolation of an odor concentrate. part ii. separation and identification lipids of algae. iii.: the components of unsaponifiable matter of the algae chlorella volatile esters of bartlett pear. ii studies on the nutritive value of raw and cooked soybeans for growing rats and swine and their effect on fat firmness characterization of fruit juices by acid profiles nitrate content of beets, collards, turnip greens lysine fortifications of wheat bread fed to haitian school children studies on the flavor of green tea. part iv. 
: dimethyl sulfide and its precursor funktionelle eigenschaften von lebensmittelstärken über das vorkommen von xylit im speisepilz champignon ein beitrag zur zusammensetzung von apfelsinensäften aus spanischen und marokko-apfelsinen an examination of the free amino acids of the common onion (allium cepa) a trypsin inhibitor in wheat flour the protein quality, digestibility, and composition of algae, chlorella 71105 chemical and color changes in canned apple sauce digestibility of the a-cellulose and pentosan components of the cellulosic micelle of fescue and alfalfa safeness and servability of meringued pie alcoholic beverages diurnal-nocturnal changes in the starch of tobacco leaves sugar and sugar products modification of flour proteins by dough mixing: effects of sulfhydryl-blocking and oxidizing agents über das haferprotein. z. lebensmittel-untersuchung u sodium and potassium in wines and distilled spirits non-volatile organic acids of the dwarf cavendish (chinese) variety of banana biological evaluation of soybean meal and cottonseed meal by amino acid digestibility and protein efficiency ratio studies. diss. abstr. 23 [1963] nr oxidation-reduction potentials of saké and synthetic saké. xi.: on the relationship of the various fermenting processes of saké to oxidereduction potentials and indicator time test (i.t.t.) values of saké mash neue wege zur ernährungsphysiologischen aufwertung von getreideerzeugnissen cereal products ascorbic acid in dehydrated potatoes die eiweißqualität von ,getoastetem" (dampferhitztem) und ungetoastetem sojaextraktionsschrot color studies on processed dried fruits studies on the basic amino acid of the soy sauces and the seasoning liquids. ii.: the quantitative changes of l-arginine in the process of soy sauces brewing studies on the flavorous substances in soy sauce. xxii. 
: the differences between the soy sauce made from soy bean and wheat and that made from defatted soy bean and wheat feeding value of soy milks for premature infants flavors and non-alcoholic beverages fat content and fatty acids in some commercial mixes for baked products sensory examination of four organic acids added to wine über die inhaltsstoffe der roßkastanie und versuche zu ihrer gehaltsbestimmung. diss. univ. hamburg, 3.2 malt beverages, sirups, extracts, and brewing materials vitamin b 6 and niacin in potatoes. retention after storage and cooking chemical investigation of some wild indian legumes zusatzstoffe der margarine. fette component glycerides of an indian fresh-water fish fat einfluß des röstgrades von kaffee auf die extinktion wäßriger extrakte und die menge der trockensubstanz gaschromatographische untersuchungen von fuselölen aus verschiedenen gärprodukten. 1. mitt.: problemstellung und literaturübersicht de voedingswaarde van aardappelen van verschillende rassen en de invloed daarop van bemesting en bewaring electrophoretic separation of beet pigments studies on the growth-promoting value and digestibility of passion fruit seed oil polycyclische und aliphatische kohlenwasserstoffe des tabakrauches carotenoid, oil, and tocopherol content of corn inbreds location and possible role of esterified phosphorus in starch fractions untersuchungen über die chemischen u beim altern von röstkaffee. 1. mitt.: gaschromatographische analyse der leichtflüchtigen aromabestandteile nature, origin, and prevention of hydrogen sulphide aroma in wines the magnesium contents of soil and crops versuche zur chemischen differenzierung der eiweißstoffe des weizens und roggens lemon juice composition. 
iii.: characterization of california-arizona lemon juice by use of a multiple regression analysis weizenkeime als wertvoller rohstoff - einige fragestellungen und probleme isolation of gram quantities of a rhamnoglucoside of apigenin from grapefruit determination of distribution of water in wheat grains by interference microscopy preparation and quality evaluation of processed fruits and fruit products with sucrose and synthetic sweeteners changes in carbohydrate and phosphorus content of potato tubers during storage in nitrogen chemical composition of some natural and processed orange juices effects of various factors in the candy test zum stand der kenntnisse über die wechselwirkung zwischen nativer stärke und wasser some volatile compounds from cooked potatoes firmness of canned apple slices as affected by maturity and steam-blanch temperature nutritive values of ten samples of western canadian grains proteins of wheat and flour. the separation and purification of the pyrophosphate-soluble proteins of wheat flour by chromatography on deae-cellulose changes in quality and composition produced in wine by s~ gamma irradiation citrus essential oils. iii. evaluation of sicilian natural lemon oils mineral analysis of plant tissues a. : nature of colloids in clarified cane juices studies on amino acid content of rice. i. : amino acid composition of polished rice glutelin estimated by beckman amino acid analyzer flüchtige carbonylverbindungen in honig neue aminosäuren in höheren pflanzen studium der wirkung der gammastrahlung auf die l-ascorbinsäure flour lipids and oxidation of sulfhydryl groups in dough zur kenntnis einer weiteren in der stärke vorkommenden kohlenhydrat-komponente. ernährungsforschung 8 lemon juice composition. ii. : characterization of california-arizona lemon juice by its polyphenolic content lemon juice composition. 
i.: characterization of california-arizona lemon juice by its total amino acid and l-malic acid content safflower amino acids amino acid composition of safflower kernels, kernel protein, and hulls, and solubility of kernel nitrogen citrus fruit enzymes. i. : ascorbic acid oxidase in oranges untersuchungen über die verfärbung gekochter kartoffeln an den sorten des kulturkartoffelsortiments des instituts für pflanzenzüchtung groß-lüsewitz nutritive value of red kidney beans (phaseolus vulgaris) for chicks instability in potable spirits. ii.: rum and brandy effect of size classification and maturity on the protein content of alaska and perfection peas ma~tr~, d. c.: ascorbic acid retention and color of strawberries as related to low-level irradiation and storage time historical aoac data on four fema samples of vanilla extract the acid-extracted pentosan content of wheat as a measure of milling quality of pacific northwest wheats petroleum ether extractables in tobacco isolation, origin, and synthesis of a bread flavor constituent the protein composition of air-classified flour fractions * studies on the flavor of green tea. v. : examination of the essential oil of the tea-leaves by gas liquid chromatography nutritive value of starches. iv.: comparison of digestibility of 15 natural starches estimated by a new procedure c lebensmittel tierischen ursprungs foodstuffs of animal origin n.n.: gdch-fachgruppe lebensmittel- und gerichtliche chemie bericht über die vortragstagung des fachverbandes lebensmittelchemie der chemischen gesellschaft in der ddr vom 19. bis 21 a taxometric study of the propionic acid bacteria of dairy origin comparisons of the caseins of buffalo's and cow's milk natürlicher gehalt und stabilität der carotinoide in fetten und milchprodukten. wiss. veröff. dt. ges. ernährung 9 antibiotics in milk vitamin a and d enrichment of nonfat dry milk some characteristics of yolk solids affecting their performance in cake doughnuts. i. 
effects of yolk type, level, and contamination with white proteine des eidotters post-mortem changes in the muscles of landrace pigs über den fettgehalt von fleischkonserven. dt. lebensmittel der einfluß der verfahrenstechnik auf die ks163 jahreszeitliche einflüsse auf den gehalt der fischmuskulatur an freien aminosäuren u. deren bedeutung für die qualität fangtechnik und fisch-qualität. fette specific distribution of fatty acids in marine lipids a comparison of pigs slaughtered at three different weights. i. : carcass quality and performance a comparison of pigs slaughtered at three different weights. ii. : association between dissection results, various measurements and visual assessments a note on the effect of heat on the colour of goat's milk chemical and nutritional changes in stored herring meal. 4.: nutritional significance of oxidation of the oil relation of pork muscle quality factors to zinc content and other properties the chemical nature of the characteristic flavor of cultured buttermilk occurrence of vanillin in heated milks the electrophoretic properties of the proteins in cottage cheese curd the influence of post-mortem glycolysis on poultry tenderness seasonal variations in cod liver oil isolation and characterization of the flavor components of rancid pork some problems in the evaluation of egg albumen quality quality evaluation studies of fish and shellfish from certain northern european waters indices for lamb carcass composition subjective and objective evaluations of prefabricated cuts of beef oils, fats, and waxes die bildung von eiskristallen in dünnen milchschichten. 
milchwissenschaft 18 the incidence of bacteria in cheese milk and cheddar cheese and their association with flavour body composition of market weight pigs some factors affecting tenderness of turkey meat ovine bioenergetics and nutritional efficiency, with special reference to forage utilization die züchtung von fleischschweinen und die folgeerscheinungen, die sich insbesondere im hinblick auf die qualität von fleisch und fett ergeben. arch. lebensmittel-hyg on the structure of highly unsaturated fatty acids of fish oils by high resolution nuclear magnetic resonance spectral analysis the fatty acid composition of the milk fat of cows grazing on ryegrass at two stages of maturity and the composition of the ryegrass lipids effect of intrauterine infusion of penicillin-streptomycin and furacin and vaginal deposition of furacin on chemical residues levels in milk beziehungen zwischen l-ascorbinsäure und milch comparison of chemical and organoleptic data obtained on thawed and unthawed frozen cod, haddock, and perch fillets die aminosäurenzusammensetzung der ziegenmilch und des ziegenmilch-caseins food flavors and odors. meat flavor: lamb dairy products de chemische samenstelling van vis en visprodukten chemical studies on the herring (clupea harengus). vii.: collagen and cohesiveness in heat-processed herring and observations on a seasonal variation in collagen content growth and proteolytic activity of pseudomonas fluorescens in eggs and egg products the fatty acid composition of some perirenal and subcutaneous beef depot fats inactivation of peroxidase in milk by homogenization a study of the "cured meat" color producing reaction and the effects of some curing adjuncts über den fettgehalt von brühwürsten stress effects, carcass composition, and carcass quality in lambs effect of chemical additives on the spreading quality of butter. ii. 
laboratory and plant churnings preliminary studies on protein and moisture relationship in fresh and processed hams die zusammensetzung des kuhmilchfettes in abhängigkeit von der fütterung. fette chemical characterization of off-flavors in concentrated and nonfat dry milk zur ausscheidung von xanthindehydrase und molybdän in der kuhmilch copper distribution in milk during early lactation die physikalisch-chemischen ursachen der hitzestabilität von milcheiweißstoffen. milchwissenschaft sobn~a: über einige veränderungen in den caseinfraktionen normaler und anomaler milch (colostralmilch und milch an mastitis erkrankter kühe). milchwissenschaft 18 some characteristics of yolk solids affecting their performance in cake doughnuts. ii. variability in commercial yolk solids zum problem des zusammenhanges zwischen der konsistenz und der physikalischen struktur der butter. fette relationships between milk fat acidity, short-chain fatty acids, and rancid flavors in milk induced and natural inhibitory behavior of milk and significance to antibiotics disc assay testing. j. dairy sci. 46 [1963] nr. 2, s. 95ff. (7 s.). --some distribution patterns of cottage cheese particles and conditions contributing to curd shattering natural inhibitory characteristics of some irish manufacturing milks thermophilic actinomycetes in milk and dairy products. mikrobiologiya the free fatty acids of purdue swiss-type cheese die zusammensetzung von sülzen und ihre beurteilung im regierungsbezirk düsseldorf note on tyrosine production in frozen stored liver studies on the muscles of meat animals. iii. : comparative composition of various muscles in pigs of the three weight groups studies on beef quality. x. effect of temperature, freezing, frozen storage, thawing, and ph on the rate of hypoxanthine production. die. food preservation techn increased iodine in milk as a countermeasure for ~liodine time-temperature studies of baked loaves. 
meat, fish, and poultry vergleichende untersuchungen an butter und einem butterähnlichen umgeesterten fett. fette, seifen, anstrichmittel 65 zur beziehung zwischen fettgehalt und wassergehalt bei ungewaschener süßrahmbutter (fritzbutter) variation of ovine fat composition within the carcass studies on the properties of new zealand butterfat. vii. effect of the stage of maturity of ryegrass fed to cows on the characteristics of butterfat and its carotene and vitamin a contents studies on turkey body composition. 2. poultry sci thermal conductivity of beef studies on turkey body composition. 1 free fatty acid, tyrosine, and ph changes during ripening of blue cheese made from variously treated milks l : factors influencing the nutritional value of fish flour. ii.: availability of lysine and sulphur amino acids. canad characterization of flavor compounds isolated from evaporated milk effect of egg yolk size on yolk cholesterol concentration étude immunoélectrophorétique du lait dans les divers types de mammites. i. milchwissenschaft 17 résultats hemmstoffe in der anlieferungsmilch und methoden zu ihrem nachweis eiweißveränderungen gefriergetrockneter muskulatur meat and meat products effect of high doses of vitamin a palmitate on vitamin a aldehyde, esters, and alcohol and carotenoid contents of hen's eggs. brit. j. nutrition 17 [1963] nr. 2, s. 235/242. --the amounts of vitamin a aldehyde, esters, and alcohol and of carotenoids in hen's eggs and in day-old chicks nutritive value of marine oils i. : menhaden oil at varying oxidation levels, with and without antioxidants in rat diets a quantitative study of changes in dried skim-milk and lactose-casein in the 'dry' state during storage physico chemical characteristics of canadian milk fat. unsaturated fatty acids collagen content and its relation to tenderness of connective tissue in two beef muscles über eine braune verfärbung von mariniertem hering. dt. 
lebensmittel versuche über den sonnenlichtgeschmack in mit ass markierter milch effect of unequal milking intervals on lactation milk, milk fat, and total solids production of cows weitere untersuchungen über die nitratreduktase verschiedener an der reifung von rohwurst beteiligter mikroorganismen nutritive value of leg of lamb roasts. moisture, energy, protein, fat, and iodine values über einige zulässigkeitsgrenzen bei konsistenzfehlern von butter influence of linoleic acid content of milk lipids on oxidation of milk and milk fat fish hydrolysates-iii.: influence of degree of hydrolysis on nutritive value einige beobachtungen über die bildung des durch licht verursachten oxydationsgeschmackes s.). zitat: dt. lebensmittel comminuted meat emulsions: factors affecting meat proteins as emulsion stabilizers factors related to the flavor stability during storage of foam-dried whole milk. iii. effect of antioxidants upgrading the indigenous poultry of uganda. iii. : shell and egg interior quality relation between carcass composition and live weight of sheep diethylstilbestrol occurrence in eggs of subcutaneously injected hens biochemical and quality changes in chicken meat during storage at above-freezing temperatures investigations on the alleged goitrogenic properties of milk fish and other marine products biochemical properties of pork muscle in relation to curing potassium content of dried milk vergleichende histometrische untersuchungen über den kollagengehalt von brühwürsten a comparison of the volatile compounds of fresh and decomposed cream by gas chromatography studies on the volatile carbonyl compounds in ladino clover and their influence on the flavor of milk autoxidation of fish oils. ii.: changes in the carbonyl distribution of autoxidizing salmon oils the public health aspects of the use of antibiotics in food and feedstuffs. 
report of an expert committee composition and digestibility of corn silage as affected by fertilizer rate and plant population factors influencing the nitrate content of forage the component sugars and rate of hydrolysis of forage hemicellulose as related to digestibility digestibility trials on forages in trinidad and their use in the prediction of nutritive value acetyl-(para-nitrophenyl)-sulfanilamide in feeds distribution of major and trace elements in some common pasture species o-dimethyl o-(2,4,5-trichlorophenyl)phosphorothioate) in feeds feeding value of low-moisture alfalfa silage from conventional silos ovine bioenergetics and nutritional efficiency, with special reference to forage utilization a system for naming and describing feeds, energy terminology, and the use of such information in calculating diets supplemental methionine in a sixteen percent protein diet for laying chickens l : organic arsenicals in feeds cellulose degradation by enzymes added to ensiled forages influence of corn distillers dried grains with solubles on the feeding value of wheat silage zinc content of certain feeds, associated materials, and water der natürliche gehalt und die stabilität von carotin und carotinoiden in heu und silage forage digestibility. benzene-ethanol extracts of forage and faeces as indicators of digestibility the prediction of the metabolizable energy content of poultry feedingstuffs from a knowledge of their chemical composition factors affecting the metabolizable energy content of poultry feeds. 11 factors affecting the metabolizable energy content of poultry feeds unidentified chick growth factor in fish solubles diet and histamine in the ruminant. 
occurrence of histamine in silage ethopabate in feeds die isolierung von apocarotinalen aus luzernemehl isoflavone contents of red and subterranean clovers metabolizable energy of some oil seed meals and some unusual feedstuffs ii methoden der untersuchung von lebens- und futtermitteln techniques analysis of foodstuffs and feeds n.n.: gdch-fachgruppe analysis of foods by neutron activation techniques continuous measurement of dissolved solids in food processes by critical-angle refractometry bericht über die vortragstagung des fachverbandes lebensmittelchemie der chemischen gesellschaft in der ddr vom 19. bis 21 a rapid test for anionic detergents in drinking water fortschritte in der lebensmittelchemie durch moderne analysenmethoden direct potentiometric determination of chloride in cheese modification of the polarimetric starch determination on high-amylose corn methoden der überwachung des wassers auf radioaktivität. bundes prediction of quality in protein concentrates by laboratory procedures involving determination of soluble nitrogen direct chromatographic analysis of milk die eignung der bakteriologischen untersuchung von kannenmilchproben als grundlage eines eutergesundheitsdienstes. arch. lebensmittel-hyg nachweis fremder carotinoide in orangensäften mittels dünnschichtchromatographie. dt. lebensmittel-rdsch determination of phosphate composition of stock food calcium phosphate. j. assoc. off. agric. chemists 46 molybdenum in plants and animals. 
determination of molybdenum in biological materials with dithiol control of copper interference the determination of dissolved oxygen in canned drinks using a vibrating mercury-plated platinum electrode determination of chromium and lead in periodic acid solution and dialdehyde starch the utilization of infrared and ultraviolet spectrometric procedures for assay of pesticide residues ultramicro determination of potassium and sodium in biologic fluids zum qualitativen nachweis des pektins und der alginsäure ein schliffapparat zur abtrennung von flüchtigen stoffen mittels wasserdampfdestillation étude par chromatographie en phase gazeuse des acides gras du beurre fabriqué en italie et dans d'autres pays. application à la recherche des falsifications dans le beurre commercial. ann. falsifications expertise chim note on the determination of caffeine in coffee a comparison of the press method with taste-panel and shear measurements of tenderness in beef and lamb muscles zur dichtebestimmung der milch mit der neuen milchspindel. lebensmittelchem. u. gerichtl. chem. 17 [1963] nr. 6, s. 113/116. --aufschluß-ätherextrakt und titrationswert bei der teigwarenuntersuchung als beispiel einer dynamischen lebensmittelanalyse quantitative measures of carcass composition and qualitative evaluations fruit preservatives analysis. determination of calcium in cherry brines by versenate titration: elimination of anthocyanin interference by means of carbonyl reagents, j. agric. 
food chem allgemeine prinzipien der analytik von carotinen und carotinoiden zur bestimmung von ethoxyquin kolorimetrische bestimmung von dipterex-rückständen von lebensmitteln the direct determination of shear stress-shear rate behavior of foods in the presence of a yield stress methods for evaluating the feeding quality of meat-and-bone meals a paper chromatographic method for determination of vanillin and ethyl vanillin in vanilla flavorings über die anwendung der papierchromatographischen analyse auf dem fettgebiet. 4. mitt.: über ölsäurereiche samenöle. ermittlung der konstituierenden fettsäuren der samenöle von margosa (azadirachta indica), cashewkern (anacardium occidentale) und putranjiva roxburghii. nahrung 7 die jodzahl des rückenspecks im verhältnis zu der qualität des futterfettes, dem alter der schweine und dem fettungsgrad bei schweinen der dänischen landrasse determination of parathion, methyl parathion, epn, and their oxons in some fruit and vegetable crops an improved chromatographic method for determining trace elements in foodstuffs determination of guthion residues on fruits techniques used in meat flavor research the analysis of edible oils contaminated with synthetic ester lubricants colorimetrische bestimmung von nitrat und nitrit in biologischem material confrontation de quelques procédés de dosage iodométrique de l'anhydride sulfureux dans les vins zur anwendung der massenspektroskopie zur strukturermittlung von naturstoffen, mit besonderer berücksichtigung der lebensmittelanalytik z lebensmittel cellulose solubility as an estimate of cellulose digestibility and nutritive value of grasses the 2-thiobarbituric acid reagent for determination of oxidative rancidity in fish oils über die eignung von daphnia magna zur ermittlung von rückständen auf frischem obst und gemüse bemerkung zum dünnschichtchromatographischen kakaoschalennachweis nach the determination of vitamin a in animal tissues and its presence in the liver of the vitamin 
a-deficient rat elution column preparation of leaf sample for flame photometry. ii. : determination of calcium in tobacco enzymatic-ultraviolet method for determination of uric acid in flour die bestimmung der amylaseaktivität und einige studien über amylaseaktivität in gekeimtem roggen identification and determination of ascorbic acid (vitamin c) with janus green and its localisation in mitochondria cholesterinbestimmung im kleinen laboratorium chick edema factor. iii.: application of microcoulometric gas chromatography to detection of chick edema factor in fats or fatty acids bioassay of chick edema factor. 1962 collaborative study determination of nitrite and nitrate in meat products the measurement of the surface areas of milk powders by a permeability procedure sedimentbeurteilung und schalm-mastitis-test als sortierverfahren zur ermittlung von sekretionsstörungen bei der massenuntersuchung von milchproben aoac methods for nutritional adjuncts die ermittlung der ribonucleinsäure im pflanzenmaterial beitrag zur analytischen beurteilung des frischezustandes der pharmazeutisch verwendeten öle und fette schnellbestimmung von kupfer in fetten applications of oscillographic polarography to the determination of organophosphorus pesticides. ii. : a rapid screening procedure for the determination of parathion in some fruits and vegetables zur standardisierung der vitaminb~-bestimmung in getreide und getreideprodukten eine mikromethode zur bestimmung des fettgehaltes der milch kleiner laboratoriumstiere the determination of organophosphate pesticides and their residues by paper chromatography die mikrochemie beim studium von nahrung und ernährung acaricide residues. 
an improved method for kelthane residue analysis with applications for determination of residues in milk collaborative study of the determination of ethoxyquin in feeds über den nachweis von quellstoffen in fleischwaren und mögliche störungen durch andere polysaccharide ersatz manueller labormethoden der lösungsspektralanalyse durch den automaten determination of total acids in wines: american society of enologists determination of aldehydes in wines and spirits by the direct bisulfite method über einen vereinfachten nachweis des vitamin b12 mit poteriochromonas stipitata extraction of nystatin in animal feeds for microbiological analysis zur quantitativen bestimmung der sorbinsäure mit dem thiobarbitursäurereagenz cottage cheese problems in production and sanitation. quality control in cottage cheese sitzung des arbeitskreises berlin der gdch-fachgruppe lebensmittelchemie und gerichtliche chemie am 23. 11. 1962 in berlin-dahlem (2 vortragsreferate) ein neues elektronisches schnellverfahren zur ermittlung der frische von seefischen fatty acids of lard. a. identification by gas-liquid chromatography oxydative abbauprodukte der l-ascorbinsäure. 1. mitt. : papierchromatographischer nachweis analysis of orange juice for total carotenoids, carotenes, and added betacarotene. food technol. 17 [1963] nr. 3, s. 95/98. u. r~ck~, a. a. : refractometric measurement of soluble solids in orange juice analysis of iron chelates in plant extracts. ii.: ferric ethylenediamine'bis'(~176 acid) determination of n-acetylglucosamine-1-phosphate and n-acetylglucosamine in milk arsenic in foods: collaborative comparison of the aminemolybdenum blue and the silver diethyldithiocarbamate methods improved method for testing macaroni products neuere beiträge zur chemie der stärkefraktionen. 13. mitt.: die mikro-ameisensäurebestimmung bei der perjodat-oxydationsbestimmungsmethode von stärke recherche des falsifications dans les extraits de vanille. ann. 
falsifications expertise chim bestimmung des gesamtstickstoffgehaltes von milch nach der kjeldahl-methode. internationaler standard fil vorteile und grenzen des einsatzes von markiertem phosphat bei untersuchungen am hühnerei paper chromatography of carotene and carotenoids determination of sevin insecticide residues in fruits and vegetables insecticide residues in meat and eggs. determination of sevin insecticide and its metabolites in poultry tissues and eggs bestimmung der jodzahl von fetten und ölen mittels n-bromsuccinimid. 2. mitt.: über eine möglichkeit zur bestimmung der gesamtjodzahl von eläostearinsäure und holzöl. nahrung 7 [1963] nr. 5, s. 375/381. kxrr:~e~r, m. a. : über die anwendbarkeit des thiobarbitursäuretestes bei der untersuchung von milchprodukten application of gas chromatography to the measurement of gas permeability of packaging materials a sensitive method for quantitative microdetermination of lipids comparison of chemical and microbiological methods for the determination of procaine penicillin in premixes and mixed feeds electrophoretic analysis of flour proteins from various varieties of wheat katalytische methode zur bestimmung kleinster manganmengen in lebensmitteln am beispiel der milch eine methode zur papierchromatographischen qualits von silagen comparison of methods of measuring potassium in pork and lamb and prediction of their composition from sodium and potassium electron capture gas chromatography for determination of ddt in butter and some vegetable oils eine methode zum schnellnachweis von nitriten in fleisch- und wurstwaren. arch. lebensmittel-hyg determination of glyodin residues on pears and peaches electrophoretic analysis of flour proteins erstes internationales symposium über methoden zur analyse von lebensmitteln in bordeaux-talence (frankreich) vom 8. bis 12. 
oktober on the problem of luminescence technique of protein definition in milk beiträge zur aminosäuren-bestimmung in biologischem material aktuelle fragen der lebensmitteluntersuchung, insbesondere der histologischen wurstanalyse untersuchungen von importiertem tiefgefrorenem hasenfleisch argentinischer herkunft fish hydrolysates. iv.: microbiological evaluation untersuchungen zum nachweis von emulgatoren in lebensmitteln. 3. mitteilung. fette use of a slice-tenderness evaluation device with pork studies on improvements in quantitative paper chromatography of amino acids in foods. i. : research on procedure of development studies on improvements in quantitative paper chromatography of amino acids in foods. ii.: selection of solvent systems determination of the equilibrium relative humidity of foods untersuchungen über die methodik der quantitativen bestimmung von heizölen und flüssigen treibstoffen im wasser determination of α-lactalbumin in complex systems hydrogen sulphide in cheddar cheese; its estimation and possible contribution to flavour ascorbic acid measurement. polarographic determination of total ascorbic acid in foods trans-fettsäuregehalt von schweineschmalz nach fütterung von schweinen mit rindertalghaltigem kraftfutter. (ein beitrag zur quantitativen infrarotspektroskopischen bestimmung von trans-fettsäuren in fetten) the importance of starch on the microscopic identification of cereal grains in feeds insecticide residues. chromatographic identification of some organophosphate insecticides in the presence of plant extracts chemical and biological estimation of the carotene content in fresh and processed italian apricots sialic acid as an index of the κ-casein content of bovine skimmilk foodstuffs analysis. nonvolatile acids of blueberries le dosage de la matière grasse dans les fromages. 
étude critique de la méthode en usage au laboratoire municipal la détermination de résidus d'insecticides et de fongicides par la méthode polarographique fusel oil determination by gas-liquid chromatography * die brabanter mastitis-reaktion, ein neues verfahren zur ermittlung von sekretionsstörungen des euters durch die kannenmilchuntersuchung über eine enzymatische äpfelsäurebestimmung in wein und traubensaft direct microscopic technique to detect viable yeast cells in pasteurized orange drink die enzymatische bestimmung der glucose und saccharose und ihre anwendung in der lebensmittelanalyse a modified zirconium-alizarin method for determining fluoride in natural waters paper chromatography of some cholesterol derivatives determination of moisture by nuclear magnetic resonance and oven methods in wheat, flour, doughs, and dried fruits dye binding by soybean and fish meal as an index of quality the absorptiometric determination of silicon in water: part i: formation, stability, and reduction of α- and β-molybdosilicic acids column chromatography of soybean whey proteins enzymatic determination of carbon dioxide in lightly carbonated wines. 1962 collaborative study nachweis von büffelmilch als verfälschungsmittel in kuhmilch durch serologische methoden photometric determination of phosphate in wines pertinent references to analytical lipid methods published recently, ft spectrophotometric estimation of nucleic acid of plant leaves hemmstoffe in der anlieferungsmilch und methoden zu ihrem nachweis determination of potassium in tobacco determination of chlorides in tobacco phospholipase c determination by egg yolk turbidimetry über die inhaltsstoffe der roßkastanie und versuche zu ihrer gehaltsbestimmung microbiological method for assaying nystatin in animal feeds /63] nr. 5, s. 401/415. --gaschromatographische untersuchungen von fuselölen aus verschiedenen gärprodukten. 2. mitt.: methodik der fuselölbestimmung lactase activity measurements. 
evaluation of lactase preparations for use in breadmaking comparison of methods for determination of lysine in cereals direct determination of calcium in plants, soils, and milk by means of a flame photometer colorimetric determination of urea in feeds kolorimetrische bestimmung des dihydroxyacetons fluoride, teeth, and the analyst the quantitative micro-determination of biphenyl in citrus fruit kolorimetrisches verfahren zur gleichzeitigen bestimmung der weinsäure und milchsäure in wein und most estimation of extra-cellular starch of dehydrated potatoes versuche zur chromatographischen trennung von kohlenhydraten und eiweißen aus dem wäßrigen extrakt von roggenvollkornmehl quantitative determination of the amino acid content of rumen fluid from twin steers fed soybean oil meal or urea versuche zur chemischen differenzierung der eiweißstoffe des weizens und roggens aromastoffe des brotes. versuch einer auswertung chemischer geschmacksanalysen mit hilfe des schwellenwertes zur trennung von sacchariden an kohle/celite-säulen. ernährungsforschung 7 untersuchungen zur bestimmung der löslichkeit von milchpulver * selection of a medium for the isolation and enumeration of enterococci in dairy products colorimetric determination of amino nitrogen in corn syrups östrogene und versuche zu ihrem nachweis in geflügelfleisch. dt. lebensmittel ein beitrag über die verwendung der analysen-quarzlampe zu frühzeitiger erkennung der ranzigkeit colorimetrische methode zur bestimmung von 1-monoglyceriden in eiskrem determination of fusel oil in distilled spirits zu den möglichen fehlerquellen bei der histometrischen ermittlung des kollagen-, bzw. gelatine-gehaltes bei brühwürsten studium der uv-spektren der auf höhere temperaturen erhitzten öle measuring of oil-binding characteristics of flour nachweis der konservierungsmittel mit hilfe der papierchromatographie identification of chlorogenic acid in castor bean and oranges. canad evaluation of forages in the laboratory. 
iii.: comparison of various methods for predicting silage digestibility feed microscopy essential oils. determination of botanical and geographical origin of spearmint oils by gas chromatographic and ultraviolet analysis fluorometric determination of chlortetracycline in premixes dünnschicht-chromatographie von carotin- und carotinoidgemischen measurement of the sub-sieve particle size distribution of flour chemische bestimmung der rückstände von parathion, malathion und diazinon auf blumenkohl, kohlrabi, bohnen und gurken determination of cadmium anthranilate in feeds nachweis von penicillin und anderen antibiotika in milch microbiological evaluation of protein quality with tetrahymena pyriformis w. 3.: a simplified assay procedure peanut lipoprotein. ii. : analysis in foods by gas chromatography beitrag zum papierchromatographischen und spektrophotometrischen nachweis fettlöslicher synthetischer farbstoffe in lebensmitteln und kosmetika elektrochemische sauerstoffbestimmung in olefinischen fetten bestimmung der gesamten schwefligen säure in getränken the modified whiteside test. recommended procedures for bulk or blended milk deliveries die bestimmung von milcheiweiß in fleischerzeugnissen. dt. lebensmittel zur bestimmung des veresterungsgrades von pektin a quantitative fluorometric method for the determination of serpasil (reserpine) in feeds at the micro level fluorimetric microdetermination of carbohydrates zur methodik der klebrigkeitsbestimmung von brot frage der papierchromatographischen untersuchung von amylopektin und amylose chemical determination of diethylstilbestrol residues in the tissues of treated chickens gas chromatographic identification of components in maple sirup flavor extract dark discoloration of canned all-green asparagus. ii. 
development of a new tin plate for its control thermal conductivity and density of chicken breast mnsele and skim beitrag zur bestimmung der fluoreszenz in spriten microbiological determination of alanine in proteins and foods collaborative study of the method for counting microorganisms in maple sirup glass fiber paper strip charring. a rapid and simple method for monitoring column chromatography of lipids verbesserte mcthode zum serienm~bigcn quantitativen nachweis yon insektizidrficksti~nden bei 0bst und gemfise use of the shear press in determining fibronsness of raw and canned green asparagus betrachtungen fiber verschiedene methodcn zur bestimmnng des bindegewebeanteiles in rohem fleisch und fleischwaren. arch. lebensmittel-hyg the determination of citric acid in milk and milk sera formaldehyde in maple sirup: an adaption of the nash method polarographische bestimmung dcr ascorbinsiiure und des gesamt-vitamin c untersuchungcn fiber die mbglichkeit zur objektiven beurteilung der organoleptischen eigensehaften yon kokosraspeln analysis of the flavor and aroma constituents of florida orange juices by gas chromatography assay for cyzine in finished feeds dfinnsehicht-ehromatographische trennnngen yon synthetischen lebensmitteffarbstoffen auf ceuulose-schichten evaluation of quantitative methods of determining peroxidase in vegetables. i.: the indophenol and the o-phenylenediamine methods ein beitrag zur untersuchung thermisch oxydierter fette. dt. lehensmittel a more accurate method for determination of caffeine in decaffeinated coffee bestimmung yon vitamin c in friichten, fruchtsiiften, gemiise und konserven naeh der methode naeh tillma~s unter ausschaltung reduzierender stoffe lebensmittelrecht und lebensmitteliiberwaehung nutritional laws and nutritional control briihwiirst~ einfaeher qualit~t, die zu einem kaliber unt~r 32 mm in den verkehr gebracht werden, sind i.s. des w 4 nr. 
3 lmg irreffihrend aufgemaeht new regulations under the food and drugs act lebcnsmittel-rdseh orb der kenntlichmaehung fremder stoffe probleme des geltenden lebensmittelrechts glasurmittel ftir r6stkaffee beschr~nkt zugelassen food technol. 17 [1963] nr. 6, s. 96. n. iv. : revised u. k. preservatives regulations. food teehnol ausschub fiir lebeusmittelrechtliche fragen der fachgruppe lebensmittelehemio und gerichtliche chemic in der geseuschaft dcutscher chemiker vi. t~tigkeitsberieht der eidg. kommission fiir volksern~hrung, lebensmittelgesetzgebung und -kontrolle (eek) zu h~nden des eidg. departementes des i_nnern, umfassend die jahre federal food, drug, and cosmetic act, as amended. selected u. s. government publ. 1963 nr. 8; 50 h. catalog i~o verkauf yon frikadellen mit brotzusatz in gastwirtsehaften tierarzneimittel und aufzuchtmittel in der landwirtschaftlichen praxis. gesundheitliche erw~gungen zum schutze des konsumenten bei der anwendung yon tierarzneimitteln und aufzuchtmit~ln in der landwirtschaftliehen praxis. teil hi der amtsarzt und das neue lebensmitt~lgesetz. off ist der beschlub des bundesgerichtshofes vom 18. 5. 1962 -1 stl~ 546161 -geeignet, den vertrieb yon hackfleiseh befriedigend zu regeln ? arch. lebensmittel-hyg zur beurteilung des saccharingehaltes yon meerrettiehzubereitungen. dt. lebensmittel zum begriff ,mai]gebend" in w 4a abs. 2 des lebensmittelgesetzes aspect sanitaire et ldgal actuel des aliments conservds verbrauchererwaxtung und lebensmitte]kontrolle bei fleiseherzeugnissen bei ger~ueherten fleischwaren. sehwarz-w~lder speck, schwarzw~lder sehinkenspeck, sehwarzw~lder sehinken. arch. lebensmittel-ttyg kampf der wasserverseuchung. aktuelle notwendigkeiten -gesetzliehe ~/isgliehkeiten. ~iiinchener reed. wschr. 105 lebensmittelreehtliehe stellung yon carotin und carotinoiden in der schweiz die lebensmittelgesetzgebung 0sterreiehs, der sehweiz und der bundesrepublik deutschland. 
eine vergleiehende untersuchung arzneimittel-lebensmittel dr. lebensmittel-rdseh erwiderung des autors (zur stellungnahme yon f. nitzsc~, d~. lebensmittel-rdseh. 59 [1963] nr. 5, s. 146). i)t. lebensmittel das lebensmittelgesetz und seine auswirkung auf die gummiindtlstrie codex alimentarius austriacus absehliel]ende stellungnahme der ort der kenntlichmachtmg fremder stoffe. dr. lebensmittel-rdsch _rreffihrende bezeiehnung und angaben bei limonade aus mineralamem tafelwasser. dr. lebensmittel zur herkunftsbezeichnung yon lebensmitteln. dt. lebensmlttel aktuelles zur lebensmitteliiberwachung einige beispiele fiir die auswirkung der neuen lebensmittelrechtliehen vorschriften auf erniihrungs-und landwirtschaft auf dem wege zum einheitliehen lebensmittel-rech~ zur ~r yon b. rsssl~: welehe anforderungen sind an alkoholhaltige siigwaren zu stellen? dr die silbertmg yon tafelwiissern mokka-kaffee" auch im handel mit kaffeebohnen keine tterkunftsbezeiehnung 59 [1963] nr. 1, s. 29131. --zuliissigkeit trod grenzen bildlicher darstellungen yon fleiseh und ylelscherzeugnissen auf paclmngen yon suppen in trockener form. db. lebensmit~el-l~lsch bedeutung und beur~eitung yon galtstreptokokken in vorzugsmileh. arch. lebensmittel-hyg auf clipverschliissen zul~ssig ist die infektion mit trichinen aus amtlich untersuehtem schwelnefleiseh im lichte der mathematischen analyse der bestimmungen der fleischbesehau yon schweinefleiseh msglich? arch. lebensmittel-hyg intema~ionales rundgespr~ch fiber lebensmitte]ehemische probleme in wiesbaden und eltville a. rh. z. lebensmittel-untersuch. u. -forschung. 129 [1963] nr. 2, s. 1271130. 
--kommission zur priifung fremder stoffe bei lebensmitteln (fremdstoff-kommission) der deutsehen forsehungsgemeinsehaft entwicklung des lebensmittelrechts im nationalen und internationalen bereich entwicklung des lebensmittelrechts im nationalen und inter~ationalen bereich entwicklung des lebensmittelrechts im nationalen und internationalen bereich de warenwet en de hierop berustendc besluiten zum entwurf einer harmonisierung der lebensmittelrechtlichen bestimmungen fiber konservierungsstoffe im gebiet der europaischen wirtschaftsgemeinschaft. er lebensmittelrechtliche stellung yon carotinen und carotinoiden in der bundesrepublik deutschland die instrumente der lebensmittelfiberwachung in osterreich 0sterreichischer standpunkt zur l~rage der f~rbung yon lebensmitteln mit carotinen und carotinolden. wiss. ver6ff. dr. ges. ernkhrung 9 lst es zu vcrtreten, dab das fleisch schwacbfinniger rinder auch in gebr~tetem zustand eingefroren wird? z lrch. lebensmittel-iiyg richtlinien fiber die zulassung yon gegeusachverst~ndigen zur untersuchung yon lebensmittel-gegenproben die ]ebensmittelrechtliche bedeutung yon bildiichen darstellmlgen auf verpackungen diverse problems (education, documentation associations, terminology etc the nutritional education of the food technologist proposed formation of a food engineering panel within the food group studium der hauswirtschafts-und ern~hrungswissenschaften. ern~hrungswirt-sehaft l0 an information service for the american food industry 1st international congress of food science and technology. symposium on education and training technically trained people for developing countries lft paces the food frontier un committee on food additives international standardization of fruit and vegetables. 
food technol terminology and methods for feeding and weighing animals food additives and food standards beitriige zur durehffihrung der umweltsradloaktivitiits-0berwaehung training dairy personnel in denmark training opportunities for the sanitarian-specialized in-service training r die benennungen honigkueheniihnlicher oebiieke als lobkuehen und lebzelten sowie als biber und bibenzelten. dt. lebensmittel nutrition and dietetics for the medical student problem of keeping dairy plants supplied with the foreman-type employee literaturdokumentation fiir hochschulassistenf~n a system for naming and describing feeds, energy terminology, and the use of such information in calculating diets a brief sketch of veterinary periodical literature in great britain before the foundation of the veterinary record a conference on nutrition teaching in medical schools the food service industry and its relation to the control of foodborne illness research and educational progress in nutrition informationsm6gliehkeiten fiir tieriirzte auf dem measuring readability of health education literature the dai~y literature problem the central food technological research institute the education and function of the nutritionist training in food service for nursing homes. l: tools for evaluation training in food service for nursing homes. iii.: observations on management of twelve units pennsylvania takes a look at nutrition in the orthopedic program zur definition der begriffe ,aroma" und l~ber die notwendigkeit einer priifung for beamtete sehlaehthofleiter. arch. lebensmittel-hyg organisation der dokumentation in einem forschungsinstitut (bundes. forsehungsanstalt f'tir lebensmittelfrischhaltung in karlsruhe). dt. lebensmittel-rdsch on changing the name of our assoziation. for a name change 0pporbunities in nutrition education questionnaires to identify nursing homes most in need of dietary counsel. publ. 
key: cord-021500-sy6lnt7b authors: jean harry, g.; toews, arrel d. title: myelination, dysmyelination, and demyelination date: 2007-05-09 journal: handbook of developmental neurotoxicology doi: 10.1016/b978-012648860-9.50007-8 sha: doc_id: 21500 cord_uid: sy6lnt7b normal functioning of the nervous system involves the transmission, processing, and integration of information as nervous impulses. impulse transmission along axons is greatly facilitated by the presence of myelin, the compact multilamellar extension of the plasma membrane of specialized glial cells that spirals around larger axons. in the central nervous system (cns), oligodendroglial cells are responsible for the synthesis and maintenance of myelin, whereas schwann cells subserve this role in the peripheral nervous system (pns). schwann cells produce a single segment of myelin (called an internode), whereas oligodendroglial cells furnish multiple myelin segments around different axons, although only one segment for a given axon. there are periodic interruptions along the axons between adjacent myelin internodes; termed nodes of ranvier, these short intervals where axons are not enveloped by myelin are vital for normal nervous system function (see later this chapter). myelin is an electrical insulator, and the periodic interruptions at the nodes allow for rapid and efficient transmission of nervous impulses. in unmyelinated axons, impulse transmission involves a wave of membrane depolarization that moves down the axon in a continuous sequential manner. 
however, in myelinated axons, the myelin internodes function as high-resistance insulators, so the excitable axonal membrane, containing a high concentration of voltage-sensitive sodium channels, is exposed only at the nodes of ranvier. impulse conduction thus involves excitation at the nodes only, and the impulse jumps from node to node (saltatory conduction) (see funch and faber, 1984; waxman et al., 1989; and morell et al., 1994, for details) . saltatory conduction is much more rapid and requires much less energy for membrane repolarization than conduction in unmyelinated axons. myelin thus greatly increases the efficiency of the nervous system, facilitating conduction while conserving metabolic energy and space. it is not difficult to imagine how even minor loss of myelin or perturbations in its structure and/or function could have deleterious effects on normal nervous system function. structural aspects of the process of myelination are most easily illustrated in the pns. each myelin-forming schwann cell produces an elaborate specialized extension of its plasma membrane, which is wrapped spirally around a segment of one axon (fig. 1 ). schwann cells, the glial cells of the pns, are derived from the portion of the neural epithelium that gives rise to the neural crest (le douarin, 1982) . during development, schwann cells invade the developing nerves, where they migrate along bundles of axons, proliferate (probably in response to an axonal mitogen; webster and favilla, 1984) , and segregate the axons individually within invaginations on the surface of schwann cells. as they cease migrating, they synthesize a basal lamina (billings-gagliardi et al., 1974) , composed of laminin, merosin, type iv collagen, fibronectin, nidogen/entactin, and heparan sulfate proteoglycan (sanes and cheney, 1982; tohyama and ide, 1984; bannerman et al., 1986; leivo and engvall, 1989; sanes et al., 1990) . 
as the axons continue to enlarge, the larger axons become further segregated so that a single schwann cell envelops a single axon. after the plasma membrane of the schwann cell has completely enclosed the axon, the external surfaces of the plasma membrane fuse to form a structure known as the mesaxon. the mesaxon then elongates and spirals around the axon, eventually resulting in a "jelly-roll" structure consisting of double layers of the schwann cell plasma membrane. myelin internodes can be as much as 2 mm long and contain 5 mm of myelin spiral (friede and bischhausen, 1980). myelin compaction occurs as cytoplasm is extruded and the cytoplasmic faces are condensed to produce the dark major period line visible in electron micrographs (fig. 2). the juncture of what was originally the outer faces of the apposing plasma membranes forms the lighter appearing intraperiod line. mature myelin thus has a characteristic compact multilamellar structure, but cytoplasmic inclusions continuous with the perikaryal cytoplasm of the schwann cells are also present (figs. 1 and 2).

figure 2 electron micrograph of compact myelin from the mammalian cns. although there are minor ultrastructural differences, compact pns myelin has a similar ultrastructural appearance. note the alternating pattern of darker major dense lines and paler intraperiod lines, originally formed by fusion of apposing surfaces of the inner and outer leaflets, respectively, of the oligodendroglial plasma membrane. cytoplasm-containing internal mesaxons can be seen on two of the myelinated axons.

in addition, the myelin internode contains several ultrastructurally and biochemically distinct membrane domains, including the outer plasma membrane of the myelin-forming cell and the compact myelin itself, as well as the cytoplasm-containing schmidt-lanterman incisures, paranodal loops, nodal microvilli, and outer and inner mesaxons. 
the latter cytoplasm-containing structures provide connections with the perikaryal cytoplasm, and are vital for myelin maintenance. the process of myelination in the cns is similar, except that a single oligodendroglial cell extends a number of processes from its cell body; each process then envelops and myelinates a single segment of a given axon (fig. 3). much of the local membrane assembly to give mature, compact myelin occurs within the oligodendroglial cytoplasmic processes (waxman and sims, 1984). the size of the fibers and the thickness of the sheaths are very different in the pns and the cns, but the overall surface area of myelin generated by an oligodendrocyte around multiple axons may be no larger than that formed by a schwann cell around a single internode. the term oligodendrocyte, meaning "few processes," is actually somewhat of a misnomer, as a given oligodendrocyte may myelinate anywhere from fewer than five axons up to dozens of axons (butt and ransom, 1989; bjartmar et al., 1994). as in the pns, the onset of myelination is preceded by proliferation of oligodendroglia. as development continues, both the diameter and length of the axons increase, and this is associated with a corresponding increase in nodal length as well as increases in myelin thickness. thus, despite its compact, highly ordered appearance, myelin continues to expand in all planes during growth and development. in general, myelination follows the order of phylogenetic development, with the peripheral nerves myelinating first, then the spinal cord, and finally the brainstem, cerebellum, and cerebrum. there is, however, considerable overlap in this progression. in addition, each fiber tract may have its own spatiotemporal pattern of myelination, so that the degree of myelination may differ in different fiber tracts at a given developmental stage. 
for example, myelination in the spinal cord proceeds in a rostral-caudal gradient, whereas in the optic nerve, myelination progresses with a retinal-to-chiasmal gradient. although not necessarily an absolute prerequisite for function, in general, fiber tracts are myelinated before they become fully functional. myelination is a major metabolic and structural event that occurs during a relatively brief but precisely defined period in the normal progression of events involved in nervous system development. in both the cns and pns, an enormous amount of myelin membrane is formed (some pns axons may have as many as 100 layers) and this membrane must be maintained at a considerable distance from the supporting glial cell body. the surface area of myelin per adult oligodendrocyte in the rat brain has been calculated to be 1-20 × 10⁵ µm², several orders of magnitude greater than the perikaryal membrane surface area of about 100 µm² (pfeiffer et al., 1993). myelinating glial cells are thus maximally stressed in terms of their metabolic and synthetic capacity during this time, with each cell synthesizing myelin equivalent to up to 3 times the weight of its perikarya each day. because of these very high levels of synthetic and metabolic activity, these myelinating cells are especially vulnerable to nutritional deficits and/or to toxic insults or injuries during this period. the tightly programmed sequence of events eventually resulting in the formation of mature compact myelin and the consequent initiation of impulse transmission is regulated by interactions between axons and glial cells at numerous stages (see waxman and black, 1995, for detailed discussion). during the early stages of myelination, the axon is loosely ensheathed by processes arising from immature, relatively undifferentiated glial cells. loose glial ensheathment of axons in the optic nerve is seen in the rat beginning at postnatal day 6. 
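the surface-area comparison quoted above can be checked with a few lines of arithmetic. the sketch below is only an illustration: it reads the quoted myelin area as 1-20 × 10⁵ µm² per oligodendrocyte and the perikaryal area as about 100 µm², and the variable names are hypothetical rather than taken from the chapter.

```python
import math

# figures as quoted in the text (pfeiffer et al., 1993); units are square micrometres
myelin_area_um2 = (1e5, 20e5)   # low and high end of the quoted range
perikaryal_area_um2 = 100.0     # "about 100" in the text, so treat as approximate

# ratio of myelin membrane area to cell-body membrane area
ratios = tuple(area / perikaryal_area_um2 for area in myelin_area_um2)

# the same ratios expressed as orders of magnitude (base-10 logarithm)
orders_of_magnitude = tuple(math.log10(r) for r in ratios)

print(ratios)
print(orders_of_magnitude)
```

on these numbers the myelin membrane is roughly a thousand to twenty thousand times larger than the perikaryal membrane, i.e. three to just over four orders of magnitude, which is consistent with the phrase "several orders of magnitude."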
this loose ensheathment is followed by spiral wrapping of the axon by oligodendroglial processes that form compact myelin. in some tracts, the immature myelin sheath is initially close to the oligodendroglial cell body (remahl and hildebrand, 1990), but with maturation the sheath is displaced radially and often is connected to the cell body by only a thin cytoplasmic bridge. a single oligodendrocyte can myelinate axons of various diameters in its vicinity and can form myelin sheaths of different thicknesses around axons of differing diameter. the development, maturation, and maintenance of the myelin sheath depend on both the normal functioning of the myelinating glial cells and the integrity of their relationship to the axons they ensheathe. during development, physical features of the myelin sheath, such as thickness and number of lamellae, are not preprogrammed within the myelin-forming cells, but rather depend on local regulation by the axon, with larger axons having thicker myelin sheaths (waxman and sims, 1984). myelin internodal distance is also matched to fiber diameter (hess and young, 1952), and the internodal distance:diameter ratio is different for fibers in different tracts. there is some evidence that myelination is initiated when a developing axon reaches a "critical diameter." however, myelination occurs over a range of axonal diameters (fraher, 1972) and at various times along a single axon (waxman et al., 1972; waxman, 1985). the signal for initiation of myelination thus appears to be specific for particular axons, or for specific domains along axons. the axonal membrane contains molecules that trigger mitogenesis in schwann cells and oligodendrocytes (salzer et al., 1980a, 1980b; devries et al., 1983; chen and devries, 1989) and regulate the rate and degree of myelin formation (black et al., 1986; waxman, 1987a,b). although some schwann cells go on to myelinate axons, others only ensheathe bundles of unmyelinated axons. 
these nonmyelinating schwann cells express distinct molecular markers such as the low-affinity nerve growth factor receptor (ngf-r), neural cell adhesion molecules (n-cam and l1), and growth-associated protein-43 (gap-43), but none of the myelin-specific proteins (see following sections, as well as mirsky and jessen, 1990; curtis et al., 1992). transplantation studies have shown that the axon determines the phenotype of the schwann cell (aguayo et al., 1976; weinberg and spencer, 1976). thus, there are a number of potentially distinct physical interactions between schwann cells and axons as schwann cells proliferate, migrate, ensheathe axons, and form myelin sheaths during development of the pns. although the axon influences and helps direct the formation of myelin, the myelin sheath also significantly defines features of the axon. the premyelinated axon is electrically excitable (foster et al., 1982; waxman et al., 1989) and the loss of excitability in the internodal axonal membrane that occurs with myelination involves an active suppression of na+ channels by the overlying oligodendrocyte or myelin sheath (black et al., 1985, 1986). proliferation of oligodendrocyte precursors in the optic nerve is dependent on axonal electrical activity, in that the blockage of optic nerve electrical activity by transection or exposure to tetrodotoxin resulted in a dramatic loss of oligodendrocyte precursor cells (barres and raff, 1993). potassium channels may also participate in myelin formation; the potassium channel blocker tea+ was shown to be effective in eliminating myelination in spinal cord explants while leaving axonal conduction and synapse formation intact. the ontogenic development of the myelinating schwann cell lineage has been relatively well characterized, particularly in rodents. 
most of our knowledge derives from cell-culture studies, but in vivo developmental studies in normal as well as in transgenic and gene knock-out mice have also proved useful. understanding of the events and stages involved is of potential clinical relevance, not only with respect to various toxic neuropathies and to pns nerve regeneration, but also because schwann cell precursors may be attractive candidates for transplantation to facilitate remyelination and repair in injured cns. a detailed discussion of this subject is beyond the scope of this chapter, but the reader is referred to several comprehensive reviews on this subject if more details are desired (jessen and mirsky, 1991; gould et al., 1992; mirsky and jessen, 1996; zorick and lemke, 1996). schwann cells develop from neural crest cells and most of the key developmental stages in schwann cell maturation seem to depend on axon-associated signals. schwann cell precursors give rise to immature schwann cells, which have a distinct phenotype (fig. 4a). these cells then develop into either myelin-forming or nonmyelin-forming schwann cells, under the control of reversible processes regulated by axon-associated signals. even mature schwann cells show a high degree of plasticity. following a demyelinating insult, schwann cells dedifferentiate into more primitive precursor cells, but generally do not die. when appropriate conditions present themselves (such as the regrowth of axons that occurs following a nerve-crush injury), these cells can proliferate, reestablish contact with axons, redifferentiate to the myelin-forming phenotype, and remyelinate the axons, thereby restoring normal function. it is possible that primitive neural crest cells enter the schwann cell lineage only when they first encounter axons in the developing nerves, and that a signal related to axonal contact guides them down the path to mature schwann cells. such signals may involve members of the neu-differentiation factor (ndf) family. members of the ndf growth factor family, including glial growth factor (ggf), heregulin, acetylcholine receptor-inducing activity (aria), and neuregulin, are alternatively spliced products of a single gene, and these molecules are emerging as important regulators of schwann cell lineage development (dong et al., 1995; zorick and lemke, 1996). alternatively, it is possible that selected neural crest cells have already entered the schwann cell lineage before they encounter axons (see mirsky and jessen, 1996, for discussion). investigation of which transcription factors regulate schwann cell development is a current area of active study. among factors likely to play significant roles are the zinc-finger transcription factors krox-20 (topilko, 1994), scip (see zorick and lemke, 1996), pax3 (kioussi et al., 1995), and possibly c-jun (stewart, 1995). a number of signalling molecules also regulate schwann cell proliferation and myelination, including insulin-like growth factor-1 (igf-1), which promotes expression of a myelinating phenotype in cell culture, and transforming growth factor-βs (tgf-β), which inhibit myelin formation. the latter may be involved in generating the nonmyelinating schwann cells, which ensheathe smaller pns axons. the developmental lineage of the oligodendrocyte, the myelin-producing cell of the cns, is also relatively well characterized, particularly in rodent in vitro systems (fig. 4b). oligodendrocytes originate as neuroectodermal cells of the subventricular zones and then migrate, proliferate, and further differentiate into mature, postmitotic myelin-forming oligodendroglial cells. their development is discussed only briefly here, but a more comprehensive review is available (warrington and pfeiffer, 1992). panels of cell- and stage-specific antibodies have proved especially useful in characterizing the sequential expression of various developmental markers, and this has allowed identification of distinct phenotypic stages, each characterized by its proliferative capacity, migratory ability, and distinct morphology. primitive precursor cells differentiate into proliferative, migratory bipolar o-2a progenitor cells. these cells are bipotential, being capable of differentiating into either astrocytes or oligodendrocytes. the oligodendrocyte lineage progresses through several additional stages, including an immature oligodendrocyte expressing galactocerebroside, sulfatide, and cyclic nucleotide phosphodiesterase (all myelin components), finally arriving several days later at the mature oligodendrocyte stage. mature oligodendrocytes express all of the myelin-specific proteins and are capable of myelin synthesis in vitro in the absence of axons.

figure 4 (a) schwann cells of the pns originate from primitive neural crest cells, proliferative multipotential cells also capable of differentiating into neurons or melanocytes. the first step along the schwann cell lineage gives the schwann cell precursor, a proliferative cell that becomes associated with many axons and expresses the low-affinity nerve growth factor receptor (ngf-r), growth-associated protein 43 (gap-43), and the neural cell adhesion molecules n-cam and l1. the subsequent "committed" schwann cell becomes associated with progressively fewer axons and expresses, in addition to the previously noted markers, s-100 protein (from this stage onward, all schwann cells express s-100). committed schwann cells develop into either nonmyelinating schwann cells, which remain associated with several axons and express galactocerebroside (galc) in addition to the previous markers, or into myelinating schwann cells. myelinating schwann cells progress through a proliferative "premyelinating" stage, characterized by transient expression of suppressed camp-inducible pou-domain transcription factor (scip), followed by a "promyelinating" galc-positive stage, becoming associated with a single axon in the process. the final differentiation into a mature myelinating schwann cell involves downregulation of ngf-r, gap-43, n-cam, and l1 expression, with upregulation of expression of galc and myelin proteins, and in vivo, the synthesis and elaboration of myelin. because of the high degree of plasticity of schwann cells, most of the developmental steps shown are reversible. (modified from mirsky and jessen [1996] and zorick and lemke [1996].) (b) myelinating oligodendroglial cells of the cns originate from neuroectodermal cells of the subventricular zones of the developing brain. the earliest precursor cells recognized to date (pre-gd3 stage) are proliferative, unipolar cells that express the embryonic neural cell adhesion molecule (e-ncam). these cells develop into gd3 ganglioside-expressing proliferative bipolar cells, termed o-2a progenitor cells because they are capable (in culture, at least) of developing into either "type 2" astrocytes or oligodendrocytes. development continues through a postmigratory but proliferative multipolar pro-oligodendroblast (pro-ol) and a pre-galc stage, characterized by lack of expression of galc. the onset of terminal oligodendroglial cell differentiation (immature ol stage) is identified by the surface appearance of a subset of "myelin components," consisting of the lipids galc and sulfatide, as well as the enzyme cyclic nucleotide phosphodiesterase (cnp). immature ols then undergo final differentiation into mature oligodendrocytes (mature ol), characterized by regulated expression of myelin components such as mbp and plp and by the synthesis and elaboration of sheets of myelin membrane.

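the stage-by-stage marker scheme for the oligodendrocyte lineage lends itself to a small ordered data structure. the following is a minimal sketch, assuming only the stage names and markers given in the figure 4b description; the names `Stage`, `OL_LINEAGE`, and `first_stage_with` are hypothetical, markers are listed only at the stage where the text first names them, and proliferative status is left as `None` wherever the text does not state it.

```python
from typing import NamedTuple, Optional


class Stage(NamedTuple):
    name: str
    proliferative: Optional[bool]  # None where the text does not say
    markers: frozenset             # markers the text names at this stage


# ordered from earliest recognized precursor to mature cell
OL_LINEAGE = (
    Stage("pre-GD3", True, frozenset({"E-NCAM"})),
    Stage("O-2A progenitor", True, frozenset({"GD3"})),
    Stage("pro-OL", True, frozenset()),        # no defining marker named in the text
    Stage("pre-GalC", None, frozenset()),      # defined by the absence of GalC
    Stage("immature OL", None, frozenset({"GalC", "sulfatide", "CNP"})),
    Stage("mature OL", False, frozenset({"MBP", "PLP"})),  # described as postmitotic
)


def first_stage_with(marker: str) -> Optional[str]:
    """Return the earliest stage at which the text names this marker, if any."""
    for stage in OL_LINEAGE:
        if marker in stage.markers:
            return stage.name
    return None


print(first_stage_with("GalC"))
print(first_stage_with("PLP"))
```

a lookup such as `first_stage_with("GalC")` then points to the immature ol stage, matching the statement that terminal differentiation is identified by the appearance of galc, sulfatide, and cnp. the structure deliberately records less than a full marker panel would; it is meant as a reading aid, not a classification scheme.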
it is worth noting that oligodendroglial cells show some developmental plasticity, and this may be of clinical relevance with respect to cns remyelination. in addition, a small population of oligodendrocyte progenitors persists in the adult rat brain (see wolswijk and noble, 1995, for details), and these constitute a potential source of myelin-forming cells for cns remyelination. immature, cycling oligodendroglial progenitor cells endogenous to adult white matter are capable of remyelinating cns axons following lysolecithin-induced demyelination (gensert and goldman, 1997). also, o-2a progenitor cells from mice subjected to coronavirus-induced demyelination show increased phenotypic plasticity and enhanced mitotic potential, properties that may be linked to the efficient remyelination that occurs following the demyelinating phase of this disease (armstrong et al., 1990). manipulation of these progenitor cells by various factors (see later this chapter) thus may also be useful in promoting remyelination in a clinical context. as is the case for schwann cells in the pns, development of oligodendrocytes is governed by a number of growth factors, including platelet-derived growth factor (pdgf), basic fibroblast growth factor (bfgf), igf-1, tgf-β, and nerve growth factor (ngf), as well as several cytokines (see pfeiffer et al., 1993). oligodendrocytes differ from schwann cells in that they can be induced to produce myelin in culture in the absence of axons. this raises the question of the extent to which neurons and their axons influence oligodendrocytes with respect to development and myelination in vivo. it is, however, difficult to imagine the lack of significant interaction between these two cells in the developing nervous system, and in fact, many such interactions are known. 
neurons produce many of the growth factors involved, and they are known to modulate steady-state levels of myelin components as well as mrna levels for these components (singh and pfeiffer, 1985; macklin et al., 1986; kidd et al., 1990; barres and raff, 1993). as in the pns, production of myelin by oligodendrocytes requires the coordinated synthesis of massive amounts of myelin components. the marked upregulation of myelin-specific proteins, as well as of enzymes involved in synthesis of myelin lipids (see later this chapter), reflects corresponding increases in abundance of the respective mrna transcripts, suggesting that most regulation of the program for myelination occurs at the level of transcription (see hudson, 1990, for review). a possible candidate for the coordinate control of cns myelination is myelin transcription factor 1 (myt1), a zinc-finger dna-binding protein first identified by its ability to recognize the myelin proteolipid protein (plp) gene (kim and hudson, 1992). myt1 mrna transcripts are most abundant in oligodendrocyte progenitor cells, suggesting that this factor acts at a very early stage in the regulation of transcription for myelinogenesis (armstrong et al., 1995). myelin of both the cns and pns has a distinctive composition that differs somewhat from that of most cellular membranes. it is the major component of white matter of the cns, accounting for about half the dry weight of this tissue, and is responsible for its glistening white appearance. the same is true for larger nerves of the pns, such as the sciatic nerve. myelin in situ has a water content of about 40%, and is characterized by its high lipid content (70-85% of its dry mass) and its correspondingly low protein content (15-30%). most biologic membranes have a much higher protein:lipid ratio, usually somewhere around unity. the insulating properties of myelin, vital to its physiological function, are related to this high lipid content. 
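the composition figures above imply a lipid:protein ratio well above the value of about one typical of other membranes. the short calculation below is a sketch under one stated assumption, namely that lipid and protein together make up essentially all of the dry mass (so 70% lipid pairs with 30% protein and 85% with 15%); the function and constant names are hypothetical.

```python
# figures quoted in the text
WATER_FRACTION_IN_SITU = 0.40        # myelin in situ is about 40% water
LIPID_DRY_FRACTION = (0.70, 0.85)    # lipid is 70-85% of dry mass


def lipid_to_protein_ratio(lipid_dry_fraction: float) -> float:
    """Lipid:protein mass ratio, assuming dry mass = lipid + protein only."""
    return lipid_dry_fraction / (1.0 - lipid_dry_fraction)


def lipid_wet_fraction(lipid_dry_fraction: float,
                       water_fraction: float = WATER_FRACTION_IN_SITU) -> float:
    """Lipid as a fraction of total (hydrated) myelin mass."""
    return lipid_dry_fraction * (1.0 - water_fraction)


ratios = [round(lipid_to_protein_ratio(f), 2) for f in LIPID_DRY_FRACTION]
wet_fractions = [round(lipid_wet_fraction(f), 2) for f in LIPID_DRY_FRACTION]

print(ratios)         # lipid:protein by mass at the low and high end
print(wet_fractions)  # lipid share of hydrated myelin at the low and high end
```

on these assumptions the lipid:protein ratio falls between roughly 2.3 and 5.7, and lipid makes up about 42-51% of hydrated myelin by mass.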
the high lipid content of myelin also results in a buoyant density less than that of other biologic membranes, and advantage can be taken of this to isolate myelin with a high yield and a high degree of purity (norton and poduslo, 1973a). the lipid content of cns myelin (table 1) is characterized by high levels of galactolipids (about 32% of lipid dry weight), including galactocerebroside (gal-c) and its sulfated derivative, sulfatide, and cholesterol (about 26% of total lipid weight), with phospholipids accounting for most of the remainder (norton and cammer, 1984; dewille and horrocks, 1992; morell et al., 1994). plasmalogens, phospholipids having a fatty aldehyde linked to the c1 of glycerol instead of a fatty acid, are especially prominent in myelin. ethanolamine plasmalogens and phosphatidylcholine (lecithin) are the major phospholipid species. gangliosides are also present in minor amounts. pns myelin has a similar composition, although there are quantitative differences (see smith, 1983). pns myelin has less cerebroside and sulfatide and more sphingomyelin than cns myelin. these differences are minor, however, relative to the larger differences in protein composition discussed later.

notes to table 1: (a) see norton and cammer, 1984; morell et al., 1994; and morell and toews, 1996b, for references and additional details. (b) other lipids are also present in myelin, including gangliosides, galactosyl diglycerides, and fatty acid esters of cerebroside; although not shown in the table, polyphosphoinositides may account for up to 7% of total myelin lipid phosphorus (see morell et al., 1994). (c) calculated from data in norton and cammer (1984), using 800 and 750 as average molecular weights for phosphoglycerides and sphingomyelin, respectively. (d) abbreviations: epg, ethanolamine phosphoglycerides; cpg, choline phosphoglycerides; spg, serine phosphoglycerides; ipg, inositol phosphoglycerides. (e) primarily ethanolamine plasmalogens. (f) value includes both spg and ipg.

notes to table 2: (a) only major proteins are shown; see text for discussion of other myelin proteins, and morell et al (1994) and newman et al. (1995) for additional references and details. (b) abbreviations: plp, proteolipid protein; mbp, myelin basic protein; cnp, cyclic nucleotide phosphodiesterase; mag, myelin-associated glycoprotein; pmp-22, peripheral myelin protein-22. (c) composite values representative of adult mammalian myelin. (d) although mrna for this protein has been detected, the protein itself, if present at all, is present in myelin at only low to undetectable levels.

the protein composition of myelin is relatively simple in that a few major structural proteins account for the bulk of the total protein (table 2). the plp and the myelin basic proteins (mbp) together account for about 80% of the protein content of cns myelin (norton and cammer, 1984; morell et al, 1994). in contrast, p0 protein, a protein not found in the cns, accounts for more than half the total protein of pns myelin. mbp, p2-protein, and pmp-22 account for most of the remainder of pns myelin proteins. plp, the major protein component of cns myelin, is present in pns myelin at only very low levels, if at all (see later). in both cns and pns myelin, there are a number of other minor but integral protein components, and the list will continue to grow as research continues. these include structural proteins and proteins involved in cell-cell interactions, as well as a large number of enzymes, receptors, and second messenger-related proteins. all of these have vital roles in maintaining the complex structure of myelin and/or in its function. a description of the characteristics of some myelin proteins, including selected aspects of their gene structure and expression, follows, but it is necessarily brief. for a more detailed discussion of individual myelin proteins, see lemke (1988), morell et al.
(1994), campagnoni (1995), and newman et al. (1995). it is worth noting at this point that the composition of myelin changes during development, with the first myelin deposited having a somewhat different composition than that present in adults (norton and cammer, 1984). in the rat brain, myelin galactolipids increase by about 50%, and phosphatidylcholine decreases by a similar amount. similar changes have been noted in human myelin as well. other minor changes in lipids and gangliosides also occur. the protein portion of myelin also changes somewhat during development; both mbp and plp increase during development, whereas the amount of higher molecular weight proteins decreases.

myelin basic proteins (mbps) are highly basic proteins of related isoforms derived from alternative splicing of a single gene. the mbp gene consists of seven exons distributed over about 32 kb of chromosome 18 in the mouse (roach et al, 1985) and human (sparkes et al, 1987) and chromosome 1 in the rat (koizumi et al., 1991); at least six transcripts are expressed via alternative splicing of rna (table 3). the mbp gene is actually a "gene within a gene," being part of a much larger (approx. 105 kb in mice and 179 kb in humans) transcriptional unit, called the golli-mbp gene (pribyl et al, 1993). portions of the golli-mbp gene are expressed outside the nervous system, including in the immune system, although the exact function of these gene products remains unknown. this may be of relevance to clinical disorders related to autoimmunity against mbp. expression of the various mbp protein products is also very complex; in addition to alternative splicing of a number of exons, there is also considerable transcriptional and posttranscriptional control (see campagnoni, 1995, for details).
this complexity of gene expression is augmented by posttranslational protein modifications, including loss of c-terminal arginine, n-acylation, glycosylation, phosphorylation, methylation, deamination, and substitution of some arginine residues with citrulline (toews and morell, 1987; smith, 1992; morell et al, 1994). mbps are extrinsic membrane proteins localized to the cytoplasmic membrane surface (major dense line) of myelin of both the cns and pns. in the cns this protein accounts for approximately 30% of the total myelin protein, whereas in the pns it accounts for only 18% (greenfield et al, 1980). myelin lipids can promote mbp self-association, suggesting it may exist as oligomers on the cytoplasmic surface of the myelin membrane (smith, 1992). it has been suggested that stabilization and maintenance of the myelin structure may be due to specific associations between mbps and sulfatides and gangliosides (ong and yu, 1984; yu, 1989, 1992; mendz, 1992; smith, 1992).

proteolipid protein (plp) and dm20 are integral membrane proteins with several transmembrane domains. plp is the most abundant protein in cns myelin (50%), and although mrna for this protein is present in the pns, the protein itself is present at only very low levels in pns myelin and its function in pns myelin is unknown (kamholz et al., 1992). a report (garbern et al, 1997) of a human plp null mutation phenotype characterized by a demyelinating peripheral neuropathy suggests that plp/dm20 is necessary for proper myelin formation in the pns as well as the cns. this report also demonstrates by immunoelectron microscopy the presence of plp in compact pns myelin (however, see also puckett et al., 1987). like mbp, plp is one of the products of alternative splicing of a single gene; it has a molecular weight of approximately 25 kda.
dm20, a second isoform that migrates as a 20 kda band on sds gel electrophoresis, is identical to plp except for the deletion of amino acid residues 116-150 (macklin, 1992). the plp-dm20 gene, located on chromosome x in the mouse, rat, and human, is approximately 17 kb in length and consists of seven exons. the alternative splicing of this gene to give plp and/or dm20 is developmentally regulated, with the dm20 splice product predominating during early myelination. some mutations of the plp/dm20 gene (e.g., jimpy mice; see later this chapter) result in developmental abnormalities prior to any myelination, suggesting that the gene may be involved in other functions besides myelination. in addition, expression is not confined to myelin-producing oligodendrocytes in the cns. the role of protein products of this gene outside the nervous system remains unknown, however. missense mutations of the plp/dm20 gene give rise to a host of cns pathologies, the most devastating being pelizaeus-merzbacher disease (pmd) (see seitelberger, 1995). in some forms of pmd, there is complete deletion of the plp/dm20 gene (raskind et al, 1991). the described single base pair deletion in humans leads to the absence of plp/dm20 expression; this produces a disease similar to, but less severe than, classic pmd, but also involving a progressive demyelinating peripheral neuropathy (garbern et al, 1997). a number of both positive and negative cis-acting elements, as well as some trans-acting factors, have been identified for this gene (see campagnoni, 1995, for details). in its orientation in the myelin membrane, the extracellular domains of plp may be instrumental in stabilizing the intraperiod line of myelin. in addition, plp may play an active role as an ion channel (toews and morell, 1987; lees and bizzozero, 1992). as noted previously, dm20 is generally a relatively minor product of the plp gene but it shows a pattern of developmental regulation distinct from plp.
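the structural relationship between the two isoforms, in which dm20 is plp lacking residues 116-150, amounts to a simple deletion on the amino acid sequence. the sketch below illustrates this with 1-based, inclusive residue numbering. the sequence used is a placeholder, not the real plp sequence, and the length of 276 residues is an assumption made for illustration only; the helper name `delete_residues` is hypothetical.

```python
def delete_residues(sequence: str, first: int, last: int) -> str:
    """Remove residues first..last (1-based, inclusive) from a protein sequence."""
    if not (1 <= first <= last <= len(sequence)):
        raise ValueError("residue range falls outside the sequence")
    return sequence[: first - 1] + sequence[last:]


# Placeholder sequence, NOT the real PLP sequence: it only cycles through
# the 20 standard one-letter codes. The length (276) is an assumption.
ASSUMED_PLP_LENGTH = 276
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
placeholder_plp = "".join(AMINO_ACIDS[i % 20] for i in range(ASSUMED_PLP_LENGTH))

# Deleting residues 116-150 gives the DM20-like product.
placeholder_dm20 = delete_residues(placeholder_plp, 116, 150)

removed = len(placeholder_plp) - len(placeholder_dm20)
print(f"residues removed: {removed}")
```

with real data, `placeholder_plp` would be replaced by the plp sequence taken from a sequence database; the deletion logic itself would not change.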
dm20 is expressed earlier in development than plp and is the major plp gene product in the developing embryo (ikenaka et al., 1992; macklin, 1992; timsit et al., 1992). its presence in "premyelinating" glial cells and in cells outside the glial cell lineage suggests a possible functional role unrelated to myelination.

myelin-associated glycoprotein (mag) is the principal glycoprotein of central nervous system myelin (for review, see milner et al, 1990; quarles et al, 1992). mag is heavily glycosylated and is specific to myelin sheaths, with an especially high concentration in the periaxonal regions of both cns and pns myelin. it is a member of the immunoglobulin gene superfamily (sutcliffe et al, 1983; lai et al., 1987; salzer et al, 1987). examination of the extracellular, n-terminal domains suggests that mag is most closely related to the cell adhesion molecules n-cam, l1, and contactin. in the pns, mag immunostaining is seen in glial membranes of the schmidt-lanterman incisures, paranodal loops, and mesaxons (trapp, 1990) and is distinctly absent from the compact myelin sheath. it is thought that mag plays a major role in membrane-membrane interactions during myelin formation and maintenance (quarles et al, 1992; morell et al., 1994). it is presumed to be involved in the adhesion of the myelin sheath to the axonal plasmalemma and in membrane spacing (trapp, 1990), and it has been implicated in various peripheral neuropathies (mendell et al, 1985; tatum, 1993). mag exists as two isoforms (l-mag and s-mag) that are derived by alternative splicing from a single gene. l-mag is produced almost exclusively in the cns and is the predominant variant during early development and active myelination (campagnoni, 1988; trapp, 1990). s-mag is the major isoform in the adult cns and in the pns at all ages.
it is thought that the differences in distribution within the cns and pns may be associated with either phosphorylation or other posttranslational modifications (e.g., sulfation of oligosaccharide moieties, acylation of the transmembrane domain) altering interactions with the cytoskeleton (trapp, 1990; quarles et al, 1992). homotypic interaction may be operational in schmidt-lanterman incisures, paranodal loops, and mesaxon membranes in pns myelin, whereas heterotypic interactions with axolemmal constituents may mediate glia-axon adhesion (trapp, 1990; quarles et al., 1992).

2',3'-cyclic nucleotide 3'-phosphodiesterase (cnp) is localized within oligodendrocytes in the cns and within schwann cells in the pns. one of the earliest markers for cells of the oligodendroglial lineage, cnp is an enzyme that hydrolyzes 2',3'-cyclic nucleotides to form 2'-nucleotides exclusively. because no physiologically relevant substrate molecules have been found in myelin, however, this enzymatic activity may actually be vestigial and unrelated to its function in myelin. the current view of cnp is that it may be a key component of an interactive protein network within oligodendroglial cells, possibly involved in extension of processes (e.g., see braun et al, 1990). it is isoprenylated, suggesting possible involvement in signal transduction processes (braun et al., 1991). furthermore, the presence of potential nucleotide-binding domains on cnp (sprinkle, 1989) suggests it might exert regulatory influence on various cellular processes such as growth and differentiation by serving as a link between extracellular signals and intracellular effector molecules. its exact function within myelin and oligodendroglial cells, however, remains unknown.

myelin/oligodendrocyte glycoprotein (mog) is localized primarily at the external surfaces of the myelin sheath and oligodendrocytes.
it is developmentally regulated, appearing with the onset of myelination as one of the last myelin protein genes expressed (scolding et al., 1989). features of the protein structure suggest it may be a member of the immunoglobulin gene superfamily (gardinier et al., 1992). because anti-mog antibodies can cause demyelination in vivo (schluesener et al., 1987), it has received attention for a potential role in autoimmune-mediated demyelination such as experimental autoimmune encephalomyelitis (eae) and multiple sclerosis (gunn et al., 1989; bernard and kerlero de rosbo, 1991).

oligodendrocyte-myelin glycoprotein (omgp) is one of the minor protein components of myelin that appears in the cns during the period of myelination. it is highly glycosylated and appears to be specific to oligodendrocytes and myelin membranes in the cns (mikol and stefansson, 1988). the protein is anchored to the membrane through a glycosylphosphatidylinositol intermediate. a subpopulation of omgp molecules contains the human natural killer cell antigen-1 (hnk-1) carbohydrate (mikol et al, 1990b). the presence of a tandem leucine repeat domain in the predicted polypeptide sequence and the apparent presence of omgp at the paranodal regions of the myelin sheath have led to speculation that its major role is as an adhesion molecule that mediates axon-glial cell interactions (mikol et al., 1990a).

p0 protein (peripheral myelin protein zero) accounts for more than 50% of the protein in peripheral nerve myelin (ishaque et al., 1980). this has led to the suggestion that p0 is the pns equivalent of plp in the cns, although the properties of these two proteins are very different. p0 is a transmembrane protein with a glycosylated extracellular domain, a single membrane-spanning region, and a highly basic intracellular domain (lemke and axel, 1985).
it is thought to play an important role in the compaction of myelin through the homotypic interaction of molecules on adjacent myelin lamellae (lemke, 1988). like mag, it is a member of the immunoglobulin superfamily. unlike many of the other myelin protein genes, expression of the p0 gene is highly restricted to schwann cells. the p0 gene contains six exons distributed over about 7 kb in both rats and mice (lemke, 1988; you et al, 1991). based on transgenic experiments, elements regulating its expression appear to reside in the first 1.1 kb of the 5'-flanking region (messing et al, 1992). mutation of the p0 gene in humans is associated with charcot-marie-tooth disease, type 1b, an inherited peripheral neuropathy (hayasaka et al., 1993; kulkens et al., 1993).

peripheral nerve protein p2 is a basic protein distinct from mbp. interest in p2 arose from its ability to induce experimental allergic neuritis, a demyelinating disease of the pns (kadlubowski and hughes, 1980). it has sequence homology to cellular retinol- and retinoic acid-binding proteins (crabb and saari, 1981; eriksson et al., 1981) and fatty acid-binding proteins (fabps) (veerkamp et al., 1991), and has high affinity for oleic acid, retinoic acid, and retinol (uyemura et al., 1984). the p2 protein gene belongs to an ancient family of fabps that diverged into two major subfamilies (medzihradszky et al., 1992). p2 mrna levels parallel myelination during development as well as the levels of microsomal enzymes involved in fatty acid elongation. this suggests the p2 protein may be involved in fatty acid elongation or in the transport of very long-chain fatty acids to myelin (narayanan et al, 1988).

peripheral myelin protein-22 (pmp-22) is a glycoprotein with an apparent molecular weight of about 22 kda (kitamura et al., 1976; smith and perret, 1986).
the rat and human genes have been cloned (spreyer et al., 1991; welcher et al, 1991; hayasaka et al., 1992); although these have a high homology to the growth arrest-specific mrnas for gas 3 and pasii-glycoprotein, the function of pmp-22 in pns myelin remains unknown. the pmp-22 gene maps to mouse chromosome 11 and a point mutation in this gene is apparently responsible for the autosomal dominant mutation in the trembler (tr) mutant mouse (suter et al, 1992a,b). in humans, the gene maps to chromosome 17; this gene is duplicated in patients with charcot-marie-tooth disease, type 1a, and this alteration is presumably related to the pathology (patel et al., 1992; valentijn et al., 1992; see campagnoni, 1995, and newman et al., 1995 for additional references and discussion). the rat pmp-22 gene, the expression of which is largely confined to the pns, is developmentally regulated in schwann cells, where its expression coincides with myelination (spreyer et al, 1991; snipes et al., 1992). mrna expression is coordinately down-regulated along with other myelin proteins during tellurium-induced primary demyelination and during degeneration induced by nerve transection or crush; message levels are subsequently up-regulated along with other myelin genes during remyelination if it occurs (spreyer et al., 1991; toews et al, 1997).

although myelin initially was believed to be metabolically inert (due largely to the very slow metabolic turnover of some of its components) and to function exclusively as an electrical insulator, it is now known that the picture is considerably more complex and interesting. all protein and lipid components of myelin turn over at measurable rates (see benjamins and smith, 1984; morell et al, 1994). although some structural components of myelin are indeed relatively stable metabolically, with half-lives of several months, there is also a very rapid turnover of some myelin components.
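to give a feel for how different these turnover regimes are, the sketch below computes the fraction of a labeled pool remaining after a given time. it assumes simple first-order (exponential) decay, which is a modeling assumption on our part and not a claim from the source, and the two half-lives used (90 days and 5 minutes) are hypothetical values chosen only to represent "several months" and a rapidly turning over component.

```python
def fraction_remaining(elapsed: float, half_life: float) -> float:
    """Fraction of a labeled pool left after `elapsed`, assuming first-order decay.

    `elapsed` and `half_life` must be expressed in the same time unit.
    """
    if half_life <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (elapsed / half_life)


# Hypothetical half-lives in days, chosen only to illustrate the contrast.
SLOW_HALF_LIFE_DAYS = 90.0               # stands in for "several months"
FAST_HALF_LIFE_DAYS = 5.0 / (24 * 60)    # five minutes, expressed in days

ONE_DAY = 1.0
slow_left = fraction_remaining(ONE_DAY, SLOW_HALF_LIFE_DAYS)
fast_left = fraction_remaining(ONE_DAY, FAST_HALF_LIFE_DAYS)

print(f"slow component left after one day: {slow_left:.4f}")
print(f"fast component left after one day: {fast_left:.3e}")
```

under these assumptions a slowly turning over component is almost untouched after a day, whereas a component with a half-life of minutes has been replaced many times over.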
the phosphate groups modifying myelin basic protein in compact cns myelin turn over with half-lives of minutes or less (desjardins and ...), and the phosphate groups on myelin polyphosphoinositides also show a very rapid turnover rate. at least 40 different enzyme activities have been documented in cns myelin (newman et al, 1995). in addition to cnp described previously, these include enzymes related to second messenger signalling, as well as associated receptors and g-proteins (larocca et al., 1990). a number of enzymes involved in myelin lipid metabolism are also present, including those for phospholipid synthesis and catabolism. noteworthy among the latter are phospholipase c activities for polyphosphoinositides, and these may have important roles in signal transduction mechanisms in myelin (ledeen, 1992). also of note are a cholesterol ester hydrolase, and udp-galactose:ceramide galactosyltransferase, the terminal enzyme in biosynthesis of galactocerebroside, the most "myelin-specific" lipid. various proteases, protein kinases and phosphatases, and transport-related enzymes are also present; of particular interest with respect to the transport-related enzymes are carbonic anhydrase (cammer et al, 1976) and na+,k+-atpase (zimmerman and cammer, 1982). these enzymes may be involved in controlling k+ levels at nodes of ranvier (lees and sapirstein, 1983) and/or in removal of carbonic acid from metabolically active axons.

the cell surface area of myelin-forming cells is so large as to suggest the need for specialized structures and mechanisms for transporting components between the perikaryon and the remote extensions. there is indeed a great deal of protein and lipid transport and targeting within the myelinating glial cell, and cytoskeletal elements play important roles in these processes.
p0, mag, and laminin (a secreted extracellular matrix component; cornbrooks et al, 1983) are synthesized and modified in the rough endoplasmic reticulum (rer) and golgi membranes of the perinuclear cytoplasm and then sorted into different carrier vesicles upon exit from the trans golgi network (trapp et al, 1993). these proteins must be transported over millimeter distances prior to insertion into their proper surface membrane locations. various proteins enriched in compact myelin reach their proper destinations via different mechanisms. for example, p0 reaches compact myelin by vesicular transport, whereas it is the mrna for mbp that is translocated, with synthesis of mbp occurring close to the site of its insertion into the forming myelin (colman et al, 1982; trapp et al, 1987; griffiths et al, 1989).

as noted previously, the cytoskeleton plays a major role in transport and assembly of myelin. in myelinating schwann cells, microfilaments are enriched beneath the membranes of the schmidt-lanterman incisures, the outer and inner mesaxons, and portions of the outermost compact myelin lamellae and schwann cell plasma membrane (trapp et al., 1989; zimmerman and vogt, 1989; kordeli et al, 1990). it is thought that interactions between mag and microfilaments play a role in membrane motility during myelination (trapp and quarles, 1982; trapp et al, 1984; martini and schachner, 1986; salzer et al, 1987). mag colocalizes with microfilaments at membranes that move during internodal growth (trapp et al, 1989). a role for mag in myelin wrapping and spacing is supported by studies showing precocious spiral wrapping by myelinating schwann cells transfected with additional copies of mag (owens et al, 1990), and impaired or prevented wrapping when mag expression is reduced or eliminated in schwann cells (owens and bunge, 1991; mendell et al., 1985).
microfilaments also help to define and maintain organelle-rich channel regions and organelle-free nonchannel regions of the myelin internode. the channel regions are important to formation and maintenance of the myelin internode and to intracellular transport of myelin components. they include the external cytoplasmic channels, schmidt-lanterman incisures, paranodal loops, and periaxonal cytoplasm. microfilaments associated with the abaxonal plasma membrane and the adaxonal periaxonal membrane may have multiple functions in dealing with the external environment, including endocytosis (blok et al., 1982), pinocytosis (phaire-washington et al, 1980), and exocytosis (john et al, 1983; koffer et al., 1990), as well as stress resistance.

just as they do in axons, microtubules may also subserve a role in the directional movement of organelles within glial cells. post-golgi vesicles transported in this way could be involved in the growth, turnover, or modification of compact myelin at distant sites. microtubules are the largest of the cytoskeletal filaments and provide a dynamic substrate for organelle trafficking and structural organization (for review see dustin, 1984; kirschner and mitchison, 1986; schroer and sheetz, 1991). they are present in all major cytoplasmic compartments of the myelin internode, but are excluded from compact myelin (peters et al., 1991). in myelinating schwann cells, microtubules are crucial to the transport of myelin proteins and organelles (trapp et al, 1995). this function is determined by the orientation and organization of microtubules, which in turn are influenced by axons (kidd et al., 1994). during the formation of the myelin sheath, contact with a myelin-inducing axon results in a more complex microtubule organization (kidd et al., 1994). as in axons, microtubules are inherently unstable and oscillate between phases of elongation and collapse (dustin, 1984; kirschner and mitchison, 1986).
the extent of depolymerization and repolymerization is determined by complex assembly/disassembly kinetics and can be influenced by modifications such as binding of microtubule-associated proteins (maps) (sloboda et al., 1976; pryer et al., 1992). microtubule disassembly causes marked accumulation of p0, mag, and laminin in the perinuclear cytoplasm of myelinating schwann cells (trapp et al., 1995). because of this, chemically induced neurotoxicity involving microtubules may lead to alterations not only in axons, but also in myelin and myelinating cells.

in myelinating schwann cells, the major intermediate filament is vimentin (dahl et al., 1982; schachner et al., 1984; kobayashi and suzuki, 1990). in most cells, intermediate filaments are considered to have a structural role in mechanically maintaining cell shape against external forces (klymkowsky et al., 1989; skalli and goldman, 1991). it has been proposed that vimentin intermediate filaments interact with microfilament-associated molecules and with microtubules in resisting stress (wang et al., 1993). it is thought that intermediate filaments play a role in the process of myelination in the pns because myelinating schwann cells contain abundant intermediate filaments and the content of these filaments can differ substantially between myelinating and nonmyelinating schwann cells.

because the structure and composition of myelin are unique, its formation involves activation of a set of unique genes (see lemke, 1988, for details). these genes include those related to induction of myelination (e.g., glial-specific receptors for differentiation signals), those involved in controlling and directing the initial deposition of myelin (e.g., axon-glial cell-adhesion molecules), and those involved in actual production of compact myelin (e.g., structural proteins of myelin). genes for enzymes involved in synthesis of lipids enriched in myelin are also preferentially activated.
the process of myelination is a highly regulated event that begins postnatally during the first few weeks of life in the rodent brain and within the third fetal trimester in the human spinal cord. in the cns of rodents, maximum levels of synthesis of myelin components and actual accumulation of myelin occur at about 3 weeks of age (norton and poduslo, 1973b; norton and cammer, 1984; morell et al., 1994), and although myelin accumulation continues for an extended time, the rate of synthesis declines considerably by about 6 weeks of age. this time course is similar to the profile of expression of myelin protein genes (see campagnoni and macklin, 1988). in the rodent pns, myelination begins at about birth, peaks at about 2 weeks, and then decreases to a low basal level by the end of the first month (webster, 1971). as is the case in the cns, the pattern of myelin synthesis and accumulation is closely paralleled by expression of mrna for pns myelin protein components (lemke and axel, 1985; stahl et al, 1990) and for enzymes involved in synthesis of major myelin lipids (hydroxymethylglutaryl-coenzyme a [hmg-coa] reductase, the rate-limiting enzyme in cholesterol biosynthesis; and ceramide galactosyltransferase, the rate-limiting enzyme in cerebroside biosynthesis) (lemke and axel, 1985). regulation of the expression of myelin genes occurs at a number of different levels including promoter choice, transcription, mrna splicing and stability, translation, and posttranslational processing (for reviews see campagnoni, 1988; campagnoni and macklin, 1988; lemke, 1988; nave and milner, 1989; ikenaka et al, 1991; mikoshiba et al, 1991; campagnoni, 1995).
the synthesis and assembly of myelin have been examined by measuring incorporation of radioactive precursors into myelin components both in vivo and in tissue slices, by measuring the in vitro activities of enzymes involved in synthesis of myelin components, by examining levels of expression of mrna species for myelin-related genes, and by actual isolation and analysis of myelin (for reviews see benjamins and smith, 1984; dewille and horrocks, 1992; morell et al., 1994). after individual myelin components have been synthesized, they must be assembled to form mature compact myelin. some components, such as plp, which is synthesized on bound polyribosomes in the oligodendroglial perikaryon, show a time lag of about 45 min between their synthesis and their appearance in myelin, reflecting the time required for transport from their site of synthesis to the forming myelin. other components, such as mbp, show only a short lag, in keeping with their synthesis on free polyribosomes in oligodendroglial processes, near the actual site of myelin assembly. in keeping with this difference, studies have shown that myelinating cells spatially segregate mrna species for myelin-specific proteins (see trapp et al., 1987). mrna for mbp is transported to near the sites of myelin assembly before the protein is synthesized (gillespie et al., 1990), whereas plp mrna is present in a perinuclear location. individual lipids also show different kinetics of entry into myelin following synthesis, and some of this may be due to synthesis in, or movement through, different intracellular pools (benjamins and iwata, 1979; for review, see benjamins and smith, 1984).
the mature myelin internode contains several ultrastructurally and biochemically distinct membrane domains that include the outer plasma membrane of the myelin-forming cell and its attached compact myelin, as well as the schmidt-lanterman incisures, paranodal loops, nodal microvilli, the periaxonal membrane, and the membranes of the outer and inner mesaxons (see figs. 1 and 3). some of these membrane domains are compositionally distinct, containing different structural proteins and differing in lipid composition as well. with respect to the pns, p0, mbp, and pmp-22 are enriched in compact pns myelin (trapp et al., 1981; omlin et al., 1982), whereas plp and mbp predominate in compact cns myelin. the exact mechanisms by which individual components are targeted to their respective membrane domains and other molecular aspects of actual myelin membrane assembly are not well understood, but this continues to be an active area of investigation. it seems clear that cytoskeletal elements are closely involved in the intracellular sorting and transport of myelin components, as discussed in a previous section, and they presumably also function in the actual process of myelin assembly.

in the mouse, terminal differentiation of myelin-forming cells occurs mostly after birth, following establishment of the basic wiring of the nervous system. many neurologic mutations result in dysmyelination, the inability of myelin-forming glial cells to assemble qualitatively and/or quantitatively normal myelin (see quarles et al., 1994; nave, 1995). because myelination is a postnatal event in rodents, this inability to assemble normal myelin is not immediately lethal, and the animals usually survive for at least several weeks.
mutations that affect myelin are characterized behaviorally by abnormalities such as shivering, ataxia, and, frequently, seizures; these signs of abnormal nervous system function begin at about the time when myelin accumulation becomes significant, probably an indication of the importance of myelin for motor control and normal brain function. in general, myelin-deficient mice possess a mutant gene for some structural myelin protein (see lemke, 1988). the major known "myelin-deficient" mutations in mice are described below, as are some related transgenic and "knockout" mouse models.

the shiverer mouse (shi, mouse chromosome 18) was one of the first neurologic mouse mutants examined at the molecular-genetic level (roach et al., 1983). affected homozygotes lack any detectable mbp and fail to make normal cns myelin. the behavioral phenotype of this mouse is first observed within the second postnatal week, when a general body tremor develops that becomes more pronounced with intentional movements (biddle et al., 1973; chernoff, 1981). the shivering behavior derives from a loss of spinal motor and reflex control and increases with age, often progressing to include seizures. the life span of the shiverer mouse is limited to approximately 6 months. histologic examination shows severe dysmyelination throughout the entire cns, but with normal-appearing pns myelin (privat et al., 1979; kirschner and ganser, 1980; rosenbluth, 1980). at the ultrastructural level, the pattern of dysmyelination is dominated by a severe lack of myelin. the myelin-like structures that are occasionally present are loosely wrapped around the axon, and the intracellular adhesion zone of the extended cell process that normally forms the "major dense line" of myelin cannot be discerned. the lack of proper myelin sheath formation may be the result of a defect in myelin compaction or a related process, because oligodendroglia appear to be normally differentiated.
histologic evidence of dysmyelination is supported biochemically by a dramatic reduction of all major myelin proteins, and more specifically by the complete lack of mbp. this is the result of a large (20 kb) genomic deletion encompassing exons 3-7 of the mbp gene (or exons 7-11 of the larger golli-mbp gene), resulting in no coding capacity for any of the mbp isoforms (roach et al., 1985; molineaux et al., 1986). the tremoring phenotype can be cured by a number of manipulations associated with restoration of mbp expression, indicating that the amount of mbp available to the oligodendrocytes is rate-limiting for the assembly of cns myelin. successful approaches include reintroduction of the entire wild-type mbp gene into the germ line of the shiverer mouse, increasing the transgene copy number (shine et al., 1990), and the reintroduction of an mbp minigene encoding only the smallest (14 kda) mbp isoform (kimura et al., 1989). a shiverer-like phenotype can also be generated in normal mice by specifically down-regulating the amount of mbp mrna available for protein synthesis via transgenic expression of the mbp gene in "anti-sense" orientation under the control of its cognate mbp promoter (katsuki et al., 1988). similarly, the formation of antisense mbp mrna is the presumed primary defect of the myelin-deficient (shi-mld) mouse mutant, an allele of the shiverer mutation on chromosome 18 (doolittle and schweikart, 1977; popko et al., 1988). the presence of this antisense rna is thought to reduce and dysregulate the amount of normal mbp mrna functionally available, thereby resulting in a level insufficient for normal myelin formation (fremeau and popko, 1990; tosic et al., 1990). the dysmyelination is less severe in the myelin-deficient mouse than in the shiverer mouse.
isolated cns white matter tracts from shiverer mice have a 3- to 4-fold increase in sodium channel density, and it has been suggested that some myelin-associated molecule absent from the shiverer white matter tracts could cause a down-regulation of either the synthesis or accumulation of sodium channels in myelinating axons (noebels et al., 1991). the lack of dysmyelination in the pns is thought to be due to the substitution of p0, the major integral protein of pns myelin, for the structural function of mbp (lemke and axel, 1985).
the mammalian plp gene is linked to the x chromosome, and defects in this gene are associated with neurologic abnormalities in the mouse and with pelizaeus-merzbacher disease in humans. in the mouse, three mutations have been characterized: jimpy (jp), myelin synthesis-deficient (jp msd), and rumpshaker (rsh). each derives from a point mutation that alters the structure of the encoded protein. in the jimpy mouse, the mutation is a single nucleotide change in the plp gene that inactivates the splice-acceptor site of intron 4. the last α-helical transmembrane domain is replaced by an aberrant carboxy terminus, and the resulting abnormally folded protein is degraded in the endoplasmic reticulum shortly after synthesis, failing to reach the golgi apparatus for further processing and transport (roussel et al., 1987). the cns is nearly completely devoid of myelin, with less than 1% of the axons ensheathed; pns myelin, however, is ultrastructurally intact (sidman, 1964; herschkowitz et al., 1971). there are only a few layers of abnormally thin myelin around cns axons, consisting of either uncompacted membrane whorls or compacted myelin with abnormal ultrastructure (duncan et al., 1989). the major cause of the dysmyelination in the jimpy mouse seems to be a lack of differentiated oligodendrocytes.
the proliferation rate of oligodendrocyte precursor cells is increased, but an abnormally high rate of apoptotic cell death eliminates most of these maturing oligodendrocytes (skoff, 1982; knapp et al., 1986; barres et al., 1992). islands of myelinated fibers are formed by the few oligodendrocytes that escape degeneration and developmental arrest. the behavioral phenotype is evident at 2 weeks of age and consists of general body tremor and ataxia; the animals die with seizures and convulsions by about 4 weeks of age. heterozygous jimpy females, which are mosaics with respect to the x-linked plp gene, display normal behavior. in the allelic mouse mutant, jimpy msd, the ultrastructural alterations in myelin are similar to those seen in the jimpy mouse, but about twice as many glial cells escape premature degeneration (billings-gagliardi et al., 1980). the rumpshaker mutant is the result of a novel mutation of the plp gene (schneider et al., 1992) and displays a phenotype very different from that of jimpy and jimpy msd. rumpshaker mice have more myelin than other dysmyelinated mutants, and the degree of dysmyelination varies among cns regions, with early myelinating regions appearing normal whereas late myelinating regions are severely hypomyelinated. the oligodendrocytes appear differentiated and most escape apoptotic cell death, resulting in a normal complement of mature oligodendrocytes (griffiths et al., 1990). the rumpshaker mutation appears to allow the oligodendrocyte to survive but somehow interferes with its ability to deposit plp normally in the myelin membrane (schneider et al., 1992). although sparse, some myelin sheaths persist in the rumpshaker mutants, and these show selective immunostaining for dm20 (schneider et al., 1992). these findings suggest that dm20 may serve a critical purpose in glial cell development that is distinct from any function in myelin formation and maintenance.
p0-deficient mice have been generated by homologous recombination of the p0 gene in mouse embryonic stem cells with the cloned gene and subsequent generation of germline chimeric mice (giese et al., 1992). animals lacking one functional copy of the p0 gene are phenotypically normal, but the homozygotes develop a behavioral phenotype by the third week of life. these mice show a body tremor and dragging movements of the hindlimbs. there is no evidence of paralysis or seizures, and the mutants have a normal life span. histologically, the deficit is characterized by the inability of the schwann cell to assemble a compacted multilamellar pns myelin sheath. the high degree of variability in the pathology is thought to be due to the intervening actions of other proteins such as cell adhesion molecules (mag; n-cam), and perhaps other myelin proteins. with a fusion gene construct containing promoter and regulatory regions of the p0 gene, schwann cells were destroyed when they began to express p0, after associating in a 1:1 ratio with axons (messing et al., 1992). the behavioral phenotype of the schwann cell-ablated mice was similar to the phenotype displayed by the homozygous p0 mutants. a proliferation of nonmyelin-forming schwann cells was induced, along with skeletal muscle atrophy.
the trembler mouse (tr; mouse chromosome 11) contains a mutation of the pmp-22 gene, which results in pns dysmyelination. the pmp-22 gene encodes an integral membrane protein specific to schwann cells (22 kda peripheral myelin protein) believed to be important for normal schwann cell development (spreyer et al., 1991; welcher et al., 1991). histologically, the majority of large caliber axons in the sciatic nerve are devoid of a myelin sheath, and sheaths that are present are abnormally thin and relatively uncompacted (henry and sidman, 1988). the total number of schwann cells is dramatically increased at the time of segregation of axons, and myelin deposition is arrested.
in the absence of pns myelin, the mice display a behavioral phenotype characterized by a coarse-action tremor that begins at the end of the second postnatal week and results in moderate quadriparesis and a waddling gait. under controlled conditions, these animals can experience a normal life span.
the quaking mouse (qk; mouse chromosome 17) is the result of an autosomal recessive mutation (sidman et al., 1964). homozygous mice carrying the viable quaking allele (qkv/qkv) show the typical motor coordination signs of dysmyelination in the absence of seizures and have a normal life span. the myelin deficiency, characterized by fewer than normal myelin lamellae, is predominantly in the brain and spinal cord, although a lack of normal compaction and enlarged intraperiod lines of some myelinated fibers have been noted in the pns as well (trapp, 1988). interestingly, the distribution of mag is shifted from the innermost myelin layer facing the axon to throughout the compact myelin sheath.
myelin mutants in a number of other species besides humans and mice have also been described (see duncan, 1995, for detailed discussion). these include x-linked mutations in the dog (shaking pup, griffiths et al., 1981), pig (type a iii hypomyelinogenesis congenita, blakemore et al., 1974), rat (myelin-deficient; dentinger et al., 1982; jackson and duncan, 1988), and rabbit (paralytic tremor, taraszewska, 1988), as well as the autosomal recessive taiep mutant rat.
as noted previously, myelination is a critical process in the maturation of the nervous system. it involves the synthesis of an enormous amount of specialized membrane within a relatively short period of time. much of the myelin in both the cns and the pns is formed during a relatively short "developmental window" (first few years in humans; first 30 days of age in rodents), and this period is preceded by a burst of myelinating cell proliferation.
during these time periods, a large portion of the nervous system's metabolic capacity is devoted to myelinogenesis. during these "vulnerable periods," the process of myelination is especially susceptible to perturbations such as toxic insults, nutritional deficiencies, genetic disorders of metabolism, viral infections, substances of abuse, and other environmental factors (for review, see wiggins, 1986). insults occurring during the period of proliferation of myelinating cells may be especially disruptive, as this may lead to an irreversible deficit of myelin-forming cells and consequent permanent hypomyelination. perturbation of myelination at a later stage may result in a myelin deficit that can be reversed. depending on the timing of the insult, a myelin deficiency can result from alterations related to several different developmental events, including failure of myelin-forming cells to proliferate, reduction of axonal development resulting in fewer and/or smaller axons to myelinate, and decreased formation of myelin at the time of maximal synthesis.
the morphologic term myelinopathy describes damage to white matter or myelin, and disorders of myelin can be classified by a number of different factors. such factors include preferential effects on either the cns or the pns, or an involvement of both systems. in addition, effects on myelin can sometimes be delineated as the result of a primary effect on myelin itself or on the myelinating glial cell. myelin loss due to a primary insult to myelin or the myelinating cell is termed primary demyelination. there are a number of factors relevant to selective targeting of various toxic or metabolic insults to myelin (for discussion, see morell et al., 1994; morell and toews, 1996a). an intact axon is a prerequisite for maintenance of normal myelin; alterations in myelin due to an effect on the neuron or the underlying axon are termed secondary demyelination.
secondary demyelination is an inevitable consequence of serious damage to neurons supporting myelinated axons, or of axonal transection or crush (wallerian degeneration). however, the distinction between primary and secondary demyelination is often somewhat vague; the basis for this distinction usually involves morphologic evidence of the initial target site. the term hypomyelination is used to describe developmental alterations of myelination in which an insufficient amount of myelin accumulates. hypomyelination can be the result of disease processes, undernutrition, or toxic insult. the term dysmyelination, when used in its strictest sense, refers to certain inborn errors of metabolism in which a block in the breakdown of a myelin lipid causes accumulation of myelin of an abnormal composition (which eventually leads to a collapse and degeneration of myelin), but it is also in wide use as a general descriptor of situations characterized by any abnormalities in myelin.
some specific myelinopathies that are preferential to developing organisms are discussed later in this chapter. additional toxicants have been demonstrated to disturb myelin in the adult animal, and the morphologic descriptions and mechanistic processes involved have been discussed previously (morell, 1994; morell and toews, 1997). these include tellurium, diphtheria toxin, 2',3'-dideoxycytidine, vigabatrin, carbon monoxide, triethyltin, lead, hexachlorophene, cuprizone, and isoniazid.
in the human infant, several studies have provided evidence supporting the concept of a critical period from birth to about 2 years of age, during which time the nervous system is most vulnerable to malnutrition. the production of neurons is virtually completed by about the midpoint of gestation, but glial cell production continues through the end of gestation into the second postnatal year.
the vulnerability of the developing nervous system to various factors is determined by the developmental stage of the cellular activities targeted by a specific insult. the effects of an agent or condition may vary depending on the agent, the time of insult during development, and the species under study (dobbing et al., 1971). general factors such as undernutrition can have maximal effects on processes that are most active during what has been called the "brain growth spurt" (dobbing and sands, 1979). depending on the developmental process ongoing at the time of exposure, alterations can be produced in the number of neurons and extent of axonal arborizations, the number of glial cells, or the degree of myelination.
myelination in both the cns and pns is sensitive to nutritional factors (see wiggins, 1986; blass, 1994). brains of rats undernourished from birth contain a lower amount (20% deficit) of total lipids, cholesterol, and phospholipids and a 50% deficit in cerebrosides (benton et al., 1966). following severe nutritional deprivation during lactation and post-weaning, total brain galactolipids, cholesterol, and lipid phosphorus showed a slower rate of accumulation (krigman and hogan, 1976). lipid phosphorus and cholesterol levels recovered by adulthood, whereas galactolipids remained at a 60% decreased level. myelin recovered from undernourished rats was normal in lipid composition but significantly reduced in total amount (fishman et al., 1971). a reduced proportion of basic and proteolipid protein was seen in myelin isolated from undernourished rats at postnatal days 15 and 20, but the composition was similar to normal by postnatal day 30. at all time points examined, myelin yield was 25% less than normal levels (wiggins et al., 1976). these studies suggested that undernutrition produced a delay in myelin maturation.
morphologic examination of animals undernourished from birth showed a decreased number of mature oligodendroglia and poorly stained myelin (bass et al., 1970; krigman and hogan, 1976). the number of myelin lamellae per axon and the number of myelin lamellae for a given axon diameter were both lower (krigman and hogan, 1976). some studies suggest that in some cases, myelination is able to catch up and achieve normal levels once unrestricted feeding is initiated. rats deprived by an increased litter size rapidly gained body and brain weight and normal brain lipid composition within 3 weeks after weaning to an unrestricted diet (benton et al., 1966). however, nutritional deprivation during the first 21 days of life resulted in reduced levels of total brain lipids, cerebrosides, cholesterol, and plp, and this deficit persisted through 120 days of age (bass et al., 1970). similar persistent myelin deficits were found in brains of rats subjected to either moderate or severe food deprivation during the first 30 days of life (toews et al., 1983). metabolic studies showed that after 6 days of free feeding following 20 days of postnatal starvation, incorporation of labeled precursors into myelin proteins was higher than in animals starved for the entire 26 days, although it was still depressed relative to controls (wiggins et al., 1976). severe underfeeding in rats from 1 to 14 days of age resulted in a lasting significant deficit in myelin, even with rehabilitation (wiggins and fuller, 1978). overall, these studies point to the possibility of irreversible deficits in myelin resulting from nutritional deficiencies during development. additional studies suggest the most vulnerable period for myelin may be the time of oligodendroglial proliferation. animals deprived during this period are left with a permanent deficit of myelin-forming cells, resulting in irreversible hypomyelination (wiggins and fuller, 1978).
apparently, once normal numbers of oligodendroglia have been formed, the process of myelin formation itself is somewhat more capable of nutritional rehabilitation.
similar effects are found with the more specific nutritional manipulation of depleting protein in the diet. in rats subjected to a protein and calorie deficiency during gestation and lactation, glial numbers were greatly reduced, and by postnatal day 19 the majority of cells in the corpus callosum appeared to be glioblasts rather than differentiated oligodendroglia (robain and ponsot, 1978). an early postnatal protein deficiency resulted in reduced levels of brain myelin and an altered myelin composition in rats (nakhasi et al., 1975). in myelin from offspring of rats maintained on a 4% protein diet during lactation, an excess of high molecular weight proteins and a deficiency of plp in heavy myelin were found at postnatal day 17, with normal protein composition seen at 53 days (figlewicz et al., 1978). mag persisted in its higher molecular weight form longer than normal, suggesting that protein deficiency results in a delay in development and maturation of the myelination process (druse and krett, 1979).
animals raised on a fat-deficient diet are able to synthesize all fatty acids except the essential fatty acids (linoleic and linolenic acid families). essential fatty acid deficiency induced prenatally in the mothers and postnatally in the offspring resulted in lower brain weights (white et al., 1971) and a low level of galactolipid and plp (mckenna and campagnoni, 1979). in the optic nerve of essential fatty acid-deficient rats, vacuolation, intramyelinic splitting, and wallerian degeneration were present (trapp and bernsohn, 1978).
several studies have demonstrated hypomyelination in the offspring of copper-deficient mothers (dipaolo et al., 1974; prohaska and wells, 1974).
in the third generation of mice maintained on a copper-deficient diet, the offspring showed an approximately 60% decrease in myelin yield, with the major glycoprotein shifted to a higher molecular weight (zimmerman et al., 1976).
thyroid hormones influence the temporal onset of myelination and its compositional maturation. neonatal thyroidectomy in rats results in a lasting reduction of total cerebroside in the brain and a 30% reduction in myelin yield (balazs et al., 1969). it is thought, however, that hypothyroidism does not exert a specific effect on myelin but rather delays myelin development and maturation (dalal et al., 1971; walters and morell, 1981). hypothyroidism resulted in a 1-2 day delay of myelinogenesis with prolonged formation of immature myelin; the myelin eventually attained a normal composition, although the myelin deficit persisted.
a classic example of differential susceptibility of the developing organism to the effects of an environmental chemical is that of inorganic lead exposure. children are more vulnerable to lead in terms of external exposure sources, internal levels of lead, and timing of exposures during development. at high exposure levels, lead induces encephalopathy in children and can be life threatening. experimental animal studies have allowed examination of various specific target sites and processes of development susceptible to lead toxicity. during development, the process of cns myelination shows an increased vulnerability to lead exposure. the amount of lead that accumulates in the brain of the developing animal during lactation can be as much as 4 times higher than brain levels in the lactating dam receiving lead in the drinking water. under these conditions, myelin was significantly reduced; however, the relationship between the axon diameter and myelin lamellae remained normal, suggesting that the hypomyelination was the result of altered axonal growth (krigman et al., 1974).
direct administration of lead to pups via intubation from 2 to 30 days of age resulted in a reduction of myelin accumulation in the forebrain and optic nerve. these effects were not due to undernutrition, as the accumulation of brain myelin was decreased significantly relative to controls undergoing a similar degree of malnourishment (toews et al., 1980). in developing rats, there is a synergistic interaction between lead exposure and mild malnutrition induced by milk deprivation with respect to decreasing the normal developmental accumulation of myelin (harry et al., 1985). this interaction appeared to be more prevalent in females than in males. the decrement in myelin induced by developmental exposure to inorganic lead is a long-lasting effect that persists into adulthood (toews et al., 1980, 1983). myelination is not necessarily the most sensitive target for lead, as low doses sufficient to produce some microscopically discernible hemorrhagic encephalopathy in the cerebellum of young rats did not depress myelination (sundstrom and karlsson, 1987); this hemorrhagic encephalopathy may be related to concentration of lead in brain capillaries (toews et al., 1978).
the basic cns change induced by exposure to triethyltin (tet) is a massive cerebral edema, restricted primarily to the white matter (magee et al., 1957; torak et al., 1970), with the formation of intramyelinic vacuoles (jacobs et al., 1977). the pathologic effect varies with the age of the animal (suzuki, 1971). young rats exposed to tet develop severe spongy white matter similar to that seen in the adult, but without the major clinical signs seen in the adult (suzuki, 1971; blaker et al., 1981). it is thought that the severe paralysis seen in the adult animal is due in part to the intracranial pressure developed during severe edema, whereas the open cranial sutures in the young rat may allow for edema in the absence of increased pressure.
when newborn rats were exposed to tet, brains became swollen and petechial hemorrhages were observed, particularly in the cerebellum. necrotic cells were found diffusely throughout the brain (watanabe, 1977). when older (postnatal day 8) animals were exposed, both the hemorrhagic and necrotic changes occurred, but damage was also seen in the myelinated fibers of the brain stem and cerebellum. although the morphologic alterations in myelin dissipate with time, biochemical evidence suggests that the amount of myelin produced is decreased and that this myelin deficit persists through adulthood (blaker et al., 1981; toews et al., 1983). chronic exposure to tet from 2 to 30 days after birth decreases myelin yield and cerebroside content (55%) and 2',3'-cyclic nucleotide 3'-phosphohydrolase activity (20%) (blaker et al., 1981). in studies using radioactive tracer, smith (1973) demonstrated that it is the newly forming cns myelin that is preferentially susceptible to degradation by tet. interestingly, administration of tet to quaking mice did not produce intramyelinic edema (nagara et al., 1981).
when young animals are exposed to trialkyllead, the process of myelination is inhibited (konat and clausen, 1976). unlike with triethyltin, this impairment in myelinogenesis is not accompanied by edema of white matter. the impairment appears to be primarily in the deposition of myelin rather than in the program for myelination, as the protein composition of forebrain myelin isolated from triethyllead-intoxicated young rats was normal (konat and clausen, 1978). in vitro studies suggest that the alteration involves posttranslational processing and transport of integral membrane proteins, processes particularly important for myelin proteins during development (konat and clausen, 1980; konat and offner, 1982).
hexachlorophene (2,2'-methylenebis-3,4,6-trichlorophenol) is an antimicrobial agent that has been used previously in soaps and detergents, as well as in the bathing of newborn babies to prevent bacterial infections (herter, 1959; powell et al., 1973; for review, see towfighi, 1980). both cns and pns myelin show a severe white matter edema following exposure, and young rats are more vulnerable than adults (towfighi et al., 1974). in young rats, edema of the myelin sheath becomes evident after postnatal day 15, probably because the myelin membrane provides a hydrophobic reservoir for accumulation of this toxic compound and thereby becomes a significant site for fluid accumulation (nieminen et al., 1973). developmental exposure results in a decrease of the normal accumulation of myelin during development (matthieu et al., 1974). in 22-day-old rats nursed by mothers fed hexachlorophene, there was a decrease in myelin yield, yet the myelin composition remained normal. abnormal "dissociated" myelin accounted for about 10% of the total myelin and contained the typical myelin constituents with the exception of mag, which was absent (matthieu et al., 1974).
degeneration caused by the antimetabolite 6-aminonicotinamide involves myelin, neurons, astrocytes, and oligodendroglia. in young animals injected with this compound, the pns shows a selective swelling of schwann cell cytoplasm at the inner surface of the myelin sheath. the swelling compresses the nerve and results in an overgrowth of the myelin sheath (friede and bischhausen, 1978).
young ducklings fed a diet containing isoniazid (inh) developed a wobbling gait and head tremor after 2 weeks, progressing to ataxia and inability to stand (lampert and schochet, 1968). examination of the cns showed spongy degeneration of the myelin-containing white matter.
cuprizone (bis-cyclohexanone oxaldihydrazone) is a copper chelator that causes cns demyelination when fed in the diet to weanling mice.
the loss of myelin can reach as much as 70% in white matter regions of the cerebrum (carey and freeman, 1983). deficits in adenosine triphosphate (atp) production secondary to reduced activity of cytochrome oxidase (a copper-requiring enzyme) may lead to alterations in energy-requiring ion transport mechanisms, but the underlying reason for targeting of this compound to cns myelin is not clear. interestingly, cuprizone inhibits carbonic anhydrase, an enzyme present in myelin, and this inhibition takes place well before any demyelination is observed (komoly et al., 1987). in experimental studies of cuprizone neurotoxicity, mrna for mag, a protein located at the myelin-axonal interface, is downregulated during demyelination and returns to normal levels following cessation of exposure (fujita et al., 1990). the mrna for this glycoprotein exists in two major splice variants that are both severely downregulated. on recovery, one splice variant returns to normal levels whereas the other accumulates above control levels. prolonged exposure to cuprizone (9 weeks or longer in mice) results in irreversible demyelination (tansey et al., 1996), possibly due to death of oligodendrocytes and/or oligodendroglial precursor cells.
exposure of weanling rats to a diet containing tellurium (element 52) leads to a highly synchronous demyelinating peripheral neuropathy (lampert and garrett, 1971; duckett et al., 1979; said and duckett, 1981; takahashi, 1981; bouldin et al., 1988). when tellurium exposure is discontinued, there is rapid and synchronous remyelination. although tellurium toxicity in humans is rare, this model is of considerable interest as a system for studying the manner in which a specific metabolic insult can lead to demyelination. because there is little or no associated axonal degeneration, it has also proved useful for examining events and processes related to pns remyelination, independently of processes related to axonal regeneration.
inclusion of 1-1.5% elemental tellurium in the diet of weanling rats leads to a primary segmental demyelination of about 20-25% of myelinating internodes in the sciatic nerve, but with sparing of axons (bouldin et al., 1988; harry et al., 1989). this demyelination results in a peripheral neuropathy characterized by hindlimb paresis and paralysis. older rats are much more resistant to tellurium, and the cns is generally not affected, although some pathologic alterations can be induced with prolonged exposure periods. the nature of the underlying metabolic insult has been delineated. tellurium blocks cholesterol synthesis, specifically by inhibiting the enzyme squalene epoxidase, which catalyzes an obligate step in the cholesterol biosynthesis pathway (harry et al., 1989; wagner-recio et al., 1991; wagner et al., 1995). tellurite, a water-soluble oxidized metabolite of the administered insoluble element, is the active species in vitro, effective at micromolar concentrations in a cell-free system (wagner et al., 1995). the organotellurium compound dimethyltelluronium dichloride, (ch3)2tecl2, is also effective in inhibiting squalene epoxidase in cultured schwann cells and in inducing demyelination when administered intraperitoneally (goodrum, 1997). presumably, the resulting cholesterol deficit in schwann cells eventually leads to an inability to maintain preexisting myelin and to assemble new myelin; this in turn leads to the observed demyelination. although the tellurium-induced inhibition of cholesterol biosynthesis is systemic, deleterious effects are confined largely to the sciatic nerve. in the liver, which supplies cholesterol for most body tissues, the resulting intracellular cholesterol deficit results in a marked upregulation of the cholesterol biosynthetic pathway (toews et al., 1991b; wagner-recio et al., 1991; wagner et al., 1995), presumably via well-characterized feedback mechanisms (see goldstein and brown, 1990, for review).
this allows normal levels of cholesterol synthesis in this tissue despite considerable inhibition of one of the steps in the synthesis pathway, and normal levels of lipoprotein-associated circulating cholesterol are maintained. however, unlike many other tissues, the sciatic nerve cannot use circulating cholesterol; all cholesterol required for myelin in the sciatic nerve must be synthesized locally (jurevics and morell, 1994). this fact, coupled with the great demand for cholesterol in the rapidly myelinating pns at the time of tellurium exposure, may account for the specificity of toxicity observed. expression of mrna for myelin proteins is markedly down-regulated during the demyelinating phase of tellurium neuropathy, as is gene expression for enzymes involved in synthesis of lipids enriched in myelin (toews et al., 1990, 1991a,b, 1997). the latter include hmg-coa reductase, the rate-limiting enzyme in cholesterol biosynthesis. although this enzyme is markedly up-regulated in the liver (as expected from the tellurium-induced intracellular sterol deficit), it is down-regulated in the sciatic nerve in concert with other myelin-related genes (toews et al., 1991b). failure to up-regulate the cholesterol biosynthesis pathway in the sciatic nerve is, in fact, probably the major underlying reason for the preferential susceptibility of this tissue. the co-ordinate down-regulation of myelin-related genes seen following exposure to tellurium suggests that gene expression of all proteins involved in myelin synthesis and assembly may be under the co-ordinate control of the overall program for myelination (see morell and toews, 1996a; toews et al., 1997, for discussion). the co-ordinate down-regulation of myelin gene expression takes place in all myelinating schwann cells and not just in those undergoing demyelination (toews et al., 1992).
thus, this downregulation is not just a secondary response to injury but rather reflects the co-ordinate control of myelin gene expression. when tellurium exposure is discontinued, there is co-ordinate up-regulation of these messages during the remyelinating period. thus, tellurium toxicity specifically leads to pns demyelination because (1) synthesis of cholesterol, a major myelin lipid, is severely inhibited; (2) unlike other tissues, peripheral nerve cannot up-regulate the synthesis of cholesterol in response to the tellurium-induced cholesterol deficit; (3) because the pns is isolated from the circulation by barriers, it cannot use circulating cholesterol derived from the diet or from synthesis in the liver; and (4) there is a particularly high demand for cholesterol in the myelinating pns at the time of tellurium exposure. the process of myelination by oligodendroglia in the cns and by schwann cells in the pns represents a complex series of metabolic and cell-biologic events involving intercellular recognition and interaction, adhesion, synthesis, sorting and assembly of specialized myelin membranes, compaction of myelin lamellae, and axonal (and possibly glial) ion channel reorganization. this entire process must be completed during an intense burst of metabolic activity at a specific predetermined interval during development, and failure to complete this program of myelination within the proper "developmental window" may have permanent deleterious effects. because myelin-forming cells are operating at near their metabolic capacity, they are especially sensitive to toxic or other types of insults during this "vulnerable period" of nervous system development, and deficiencies in myelin, either qualitative or quantitative, may result. 
these myelin deficiencies can result from underlying alterations of various developmental events, including failure of myelin-forming cells to proliferate in normal numbers, reduction of axonal development, and/or decreased or altered formation of myelin at its normal time of maximal synthesis. the timing of any toxicant exposure or other insult can also differentially affect one or more of these events necessary for normal myelination. although these processes are best examined in the developing nervous system, a clearer understanding of the biochemistry, molecular biology, and cell biology of these events is also of particular relevance with regard to remyelination in injured adult nervous tissue. further delineation of the underlying nature of insults that result from toxic, genetic, nutritional, or other perturbations will also be useful in better understanding these vital processes.

multipotentiality of schwann cells in cross-anastomosed and grafted unmyelinated nerves: quantitative microscopy and radioautography expression of myelin transcription factor i (myti), a "zinc-finger" dna-binding protein, in developing oligodendrocytes in vitro analysis of the oligodendrocyte lineage in mice during demyelination and remyelination the effects of neonatal thyroidectomy on myelination in the rat brain light microscopic immunolocalization of laminin, type iv collagen, nidogen, heparan sulfate proteoglycan and fibronectin in the enteric nervous system of rat and guinea pig cell death and control of cell survival in the oligodendrocyte lineage proliferation of oligodendrocyte precursor cells depends on electrical activity in axons effect of neonatal malnutrition on developing cerebrum. ii.
microchemical and histologic study of myelin formation in the rat kinetics of entry of galactolipids and phospholipids into myelin metabolic relationships between myelin subfractions: entry of galactolipids metabolism of myelin modification of the schedule of myelination in the rat by early nutritional deprivation immunopathological recognition of autoantigens in multiple sclerosis research news hypomyelinated mutant mice: description of jpmsd and comparison with jp and qk on their present genetic background morphological heterogeneity of rat oligodendrocytes: i. electronmicroscopic studies on serial sections membrane ultrastructure of developing axons in glial cell deficient rat spinal cord effects of delayed myelination by oligodendrocytes and schwann cells on the macromolecular structure of axonal membrane in rat spinal cord ultrastructural observations on the spinal cord of a landrace pig with congenital tremor type a iii effect of triethyl tin on myelination in the developing rat vitamin and nutritional deficiencies endocytosis in adsorptive cells of cultured human small-intestine tissue: effects of cytochalasin b and d schwann-cell vulnerability to demyelination is associated with internodal length in tellurium neuropathy 2',3'-cyclic nucleotide 3'-phosphodiesterase has characteristics of cytoskeletal proteins: a hypothesis for its function isoprenoid modification permits 2',3'-cyclic nucleotide 3'-phosphodiesterase to bind to membranes visualization of oligodendrocytes and astrocytes in the intact rat optic nerve by intracellular injection of lucifer yellow and horseradish peroxidase brain carbonic anhydrase: activity in isolated myelin and the effect of hexachlorophene molecular biology of myelin proteins from the central nervous system molecular biology of myelination cellular and molecular aspects of myelin protein gene expression structure and developmental regulation of golli-mbp, a 105 kilobase gene that encompasses the myelin basic protein gene and
is expressed in cells in the oligodendrocyte lineage in the brain biochemical changes in cuprizone-induced spongiform encephalopathy. i. changes in the activities of 2',3'-cyclic nucleotide 3'-phosphohydrolase, oligodendroglial ceramide galactosyl transferase, and the hydrolysis of the alkenyl group of alkenyl, acyl-glycerophospholipids by plasmalogenase in different regions of the brain mitogenic effect of axolemma-enriched fraction on cultured oligodendrocytes shiverer, an autosomal recessive mutant mouse with myelin-deficiency synthesis and incorporation of myelin polypeptide into cns myelin in vivo and in vitro observations on laminin production by schwann cells n-terminal sequence homology among retinoid-binding proteins from bovine retina glial fibrillary acidic (gfa) protein in schwann cells: fact or artifact? regulatory role of thyroxine on myelinogenesis in the developing rat ultrastructure of the central nervous system in a myelin deficient rat the phosphate groups modifying myelin basic proteins are metabolically labile; the methyl groups are stable further studies on the mitogenic response of cultured schwann cells to rat cns axolemma-enriched fractions synthesis and turnover of myelin phospholipids and cholesterol copper deficiency and the central nervous system: myelination in the rat: morphological and biochemical studies vulnerability of developing brain: vii.
permanent deficit of neurons in cerebral and cerebellar cortex following early mild undernutrition comparative aspects of the brain growth spurt myelin deficient, a neurological mutation in the mouse neu differentiation factor is a neuron-glia signal and regulates survival, proliferation, and maturation of rat schwann cell precursors cns myelin-associated glycoproteins in the offspring of protein deficient rats tellurium-induced neuropathy: correlative physiological, morphological and electron microprobe studies inherited disorders of myelination of the central nervous system myelination in the jimpy mouse in the absence of proteolipid protein the taiep rat: a myelin mutant with an associated oligodendrocyte microtubular defect the proximal end of mouse chromosome 17: new molecular markers identify a deletion associated with quaking viable the nh2-terminal amino acid sequence of cellular retinoic-acid binding protein from rat testis maternal deficiency of protein or protein and calories during lactation: effect upon cns myelin subfraction formation in rat offspring the effect of undernutrition on the development of myelin in the rat central nervous system a quantitative study of anterior root fibres during early myelination in situ analysis of myelin basic protein gene expression in myelin-deficient oligodendrocytes: antisense mrna and read through transcription the precise geometry of large internodes induction of myelin-associated mrna in experimental remyelination measurement of myelin sheath resistances: implications for axonal conduction and pathophysiology proteolipid protein is necessary in peripheral as well as central myelin myelin-oligodendrocyte glycoprotein is a unique member of the immunoglobulin superfamily conservative amino acid substitution in the myelin proteolipid protein of jimpy msd mice endogenous progenitors remyelinate demyelinated axons in the adult cns disruption of the p0 gene in mice leads to abnormal expression of recognition molecules,
and degeneration of myelin and axons distribution of myelin basic protein and p2 mrnas in rabbit spinal cord oligodendrocytes regulation of the mevalonate pathway tellurium-induced demyelination is correlated with squalene epoxidase inhibition. (abstr.) the cell of schwann: an update characterization of the basic proteins from rodent peripheral nervous system myelin shaking pups: a disorder of central myelination in the spaniel dog. i. clinical, genetic, and light microscopical observations expression of myelin protein genes in schwann cells rumpshaker mouse: a new x-linked mutation affecting myelination: evidence for a defect in plp expression identification of a common idiotype on myelin oligodendrocyte glycoprotein-specific autoantibodies in chronic relapsing experimental allergic encephalomyelitis tellurium-induced neuropathy: metabolic alterations associated with demyelination and remyelination in rat sciatic nerve the effect of lead toxicity and milk deprivation on myelination in the rat isolation and sequence determination of cdna encoding pmp-22 (pas-ii/sr13/gas-3) of human peripheral myelin charcot-marie-tooth neuropathy type 1b is associated with mutations of the myelin p0 gene long lives for homozygous trembler mutant mice despite virtual absence of peripheral nerve myelin myelin differences in the central and peripheral nervous system in the 'jimpy' mouse hexachlorophene poisoning.
kaiser found the nodes of ranvier molecular genetics of x-linked mutants selective expression of dm-20, an alternatively spliced myelin proteolipid protein gene product the po glycoprotein of peripheral nerve myelin cell kinetics and cell death in the optic nerve of the myelin deficient rat acute effects of triethyltin on the rat myelin sheath schwann cell precursors and their development effect of cytochalasin d on secretion by rat pancreatic acini sources of cholesterol for kidney and nerve during development the neuritogenicity and encephalitogenicity of p2 in the rat, guinea-pig and rabbit structure and expression of proteolipid protein in the peripheral nervous system conversion of normal behavior to shiverer by myelin basic protein antisense cdna in transgenic mice organization of microtubules in myelinating schwann cells novel member of the zinc finger superfamily: a c2-hc finger that recognizes a glia-specific gene restoration of myelin formation by a single type of myelin basic protein in transgenic shiverer mice pax3: a paired domain gene as a regulator in pns myelination compact myelin exists in the absence of basic protein in the shiverer mutant mouse beyond self-assembly: from microtubules to morphogenesis purification and partial characterization of two glycoproteins in bovine peripheral nerve myelin membrane functions of intermediate filaments oligodendroglial cell death in jimpy mice: an explanation for the myelin deficit development of unmyelinated fibers in peripheral nerve: an immunohistochemical and electronmicroscopic study changes in the state of actin during the exocytotic reaction of permeabilized rat mast cells localization of the gene encoding myelin basic protein to mouse chromosome 18e3-4 and rat chromosome 1 p11-p12 decrease in oligodendrocyte carbonic anhydrase activity preceding myelin degeneration in cuprizone induced demyelination triethyllead-induced hypomyelination in the developing rat forebrain protein composition of forebrain 
myelin isolated from triethyllead-intoxicated young rats suppressive effect of triethyllead on entry of proteins into the cns myelin sheath in vitro effect of triethyllead on posttranslational processing of myelin proteins an isoform of ankyrin is localized at nodes of ranvier in myelinated axons of central and peripheral nerves lead lead encephalopathy in the developing rat: effect upon myelination undernutrition in the developing rat: effect upon myelination deletion of the serine 34 codon from the major peripheral myelin protein p0 gene in charcot-marie-tooth disease type 1b two forms of 1b236/myelin-associated glycoprotein, a cell adhesion molecule for postnatal neural development, are produced by alternative splicing mechanism of demyelination in tellurium neuropathy: electron microscopic observations demyelination and remyelination in lead neuropathy: electron microscopic studies receptor activity and signal transduction in myelin enzymes and receptors of myelin the neural crest structure and acylation of proteolipid protein myelin-associated enzymes merosin, a protein specific for basement membranes of schwann cells, striated muscle and trophoblast, is expressed late in nerve and muscle development unwrapping the genes of myelin isolation and sequence of a cdna encoding the major structural protein of peripheral myelin the myelin proteolipid protein gene and its expression expression of myelin proteolipid and basic protein mrnas in cultured cells the experimental production of edema in the central nervous system of the rat by triethyltin compounds interaction and fusion of unilamellar vesicles containing cerebrosides and sulfatides induced by myelin basic protein modulation by glycosphingolipids of membrane-membrane interaction induced by myelin basic protein and melittin immunoelectron microscopic localization of neural cell adhesion molecules (l1, n-cam, and mag) and their shared carbohydrate epitope and myelin basic protein in developing sciatic nerve
hexachlorophene intoxication: characterization of myelin and myelin related fractions in the rat during early postnatal development polyneuropathy and igm monoclonal gammopathy: studies on the pathogenetic role of anti-myelin-associated glycoprotein antibody effect of pre- and postnatal essential fatty acid deficiency on brain in development and myelination the primary structure of fatty acid-binding protein from nurse shark liver: structural and evolutionary relationship to the mammalian fatty acid-binding protein family structure and molecular interactions of myelin basic protein and its antigenic peptides p0 promoter directs expression of reporter and toxin genes to schwann cells of transgenic mice structure and chromosomal localization of the gene for the oligodendrocyte-myelin glycoprotein the oligodendrocyte-myelin glycoprotein belongs to a distinct family of proteins and contains the hnk-1 carbohydrate a phosphatidylinositol-linked peanut agglutinin-binding glycoprotein in central nervous system myelin and on oligodendrocytes structure and function of myelin protein genes organization of myelin protein genes: myelin-associated glycoprotein schwann cell development and the regulation of myelination schwann cell development, differentiation, and myelination recombination within the myelin basic protein gene created the dysmyelinating shiverer mouse mutation biochemical and molecular bases of myelinopathy myelin formation, structure, and biochemistry schwann cells as targets for neurotoxicants biochemistry of lipids.
in: neurodystrophies and neurolipidoses, handbook of clinical neurology myelin and myelination as affected by toxicants jimpy mutant mouse: a 74-base deletion in the mrna for myelin proteolipid protein and evidence for a primary defect in rna splicing triethyl tin does not induce intramyelinic vacuoles in the cns of the quaking mouse effects of a postnatal protein deficiency on the content and composition of myelin from brains of weanling rats characterization of a cloned cdna encoding rabbit myelin p2 protein neurological mouse mutants: a molecular-genetic analysis of myelin proteins proteolipid proteins: structure and genetic expression in normal and myelin-deficient mutant mice biochemistry of myelin proteins and enzymes effect of hexachlorophene on the rat brain during organogenesis sodium channel density in hypomyelinated brain increased by myelin basic protein gene deletion isolation and characterization of myelin myelination in rat brain: method of myelin isolation myelination in rat brain: changes in myelin composition during brain maturation immunocytochemical localization of basic protein in major dense line regions of central and peripheral myelin interaction of ganglioside gm1 and myelin basic protein studied by 13c and 1h nuclear magnetic resonance expression of recombinant myelin-associated glycoprotein in primary schwann cells promotes the initial investment of axons by myelinating schwann cells schwann cells infected with a recombinant retrovirus expressing myelin-associated glycoprotein antisense rna do not form myelin the gene for the peripheral myelin protein pmp-22 is a candidate for charcot-marie-tooth disease type 1a the fine structure of the nervous system: neurons and their supporting cells the oligodendrocyte and its many cellular processes phorbol myristate acetate stimulates pinocytosis and membrane spreading in mouse peritoneal macrophages a novel mutation in myelin-deficient mice results in unstable myelin basic protein gene transcripts
myelin deficient mice: expression of myelin basic protein and generation of mice with varying levels of myelin hexachlorophene myelinopathy in premature infants the human myelin basic protein gene is included within a 179-kilobase transcription unit: expression in the immune and central nervous systems absence of the major dense line in myelin of the mutant mouse 'shiverer' copper deficiency in the developing rat brain: a possible model for menkes steely hair disease brain microtubule-associated proteins modulate microtubule dynamic instability in vitro: real-time observations using video microscopy myelin-specific proteolipid protein is expressed in myelinating schwann cells but is not incorporated into myelin sheaths myelin-associated glycoprotein: structure-function relationships and involvement in neurological diseases diseases involving myelin complete deletion of the proteolipid protein gene (plp) in a family with x-linked pelizaeus-merzbacher disease the dysmyelinating mouse mutations shiverer (shi) and myelin deficient (shi mld) expression of a myelin basic protein gene in transgenic shiverer mice: correction of the dysmyelinating phenotype relation between axons and oligodendroglial cells during initial myelination: the glial unit characterization of cloned cdna representing rat myelin basic protein: absence of expression in brain of shiverer mutant mice chromosomal mapping of mouse myelin basic protein gene and structure and transcription of the partially deleted gene in shiverer mutant mice effects of undernutrition on glial maturation central myelin in the mouse mutant shiverer arrest of proteolipid transport through the golgi apparatus in jimpy brain tellurium-induced myelinopathy in adult rats studies of schwann cell proliferation. iii. evidence for the surface localization of the neurite mitogen studies of schwann cell proliferation. ii.
characterization of the stimulation and specificity of the response to a neurite membrane fraction the amino acid sequences of the myelin-associated glycoproteins: homology to the immunoglobulin gene superfamily structure and function of the myelin-associated glycoprotein laminin, fibronectin, and collagen in synaptic and extrasynaptic portions of muscle fiber basement membrane molecular heterogeneity of basal laminae: isoforms of laminin and collagen iv at neuromuscular junction and elsewhere expression of glial antigens c1 and m1 in the peripheral nervous system during development and regeneration a monoclonal antibody against a myelin oligodendrocyte glycoprotein induces relapses and demyelination in central nervous system autoimmune disease uncoupling of hypomyelination and glial cell death by a mutation in the proteolipid protein gene functions of microtubulebased motors myelin-oligodendrocyte glycoprotein (mog) is a surface marker of oligodendrocyte maturation neuropathology and genetics of pelizaeus-merzbacher disease myelin basic protein myelinogenesis: morphometric analysis of normal, mutant and transgenic central nervous system mutant mice (quaking and jimpy) with deficient myelination in the central nervous system myelin-associated galactolipids in primary cultures from dissociated fetal rat brain: biosynthesis, accumulation, and cell surface expression recent insights into the assembly, dynamics, and function of intermediate filament networks increased proliferation of oligodendrocyte in the hypomyelinated mouse mutant-jimpy microtubule-associated proteins and the stimulation of tubulin assembly in vitro studies on the mechanism of demyelination: triethyl tin-induced demyelination peripheral nervous system myelin properties and metabolism immunological non-identity of 19k protein and p0 in peripheral nervous system myelin the basic protein of cns myelin: its structure and ligand binding characterization of a novel peripheral nervous system myelin 
protein (pmp-22/sr13) assignment of the myelin basic protein gene to human chromosome 18q22-qter axon-regulated expression of a schwann cell transcript that is homologous to a growth arrest-specific gene 2',3'-cyclic nucleotide 3'-phosphodiesterase, an oligodendrocyte-schwann cell and myelin-associated enzyme of the nervous system quantitative analysis of myelin protein gene expression during development in the rat sciatic nerve d, and camp response element binding protein by schwann cells and their precursors in vivo and in vitro myelin basic protein in brains of rat with low dose lead encephalopathy identifying the protein products of brain-specific genes with antibodies to chemically synthesized peptides a leucine-to-proline mutation in the putative first transmembrane domain of the 22-kda peripheral myelin in the trembler-j mouse trembler mouse carries a point mutation in a myelin gene some new observations in triethyl-tin of rats experimental study on segmental demyelination in tellurium neuropathy expression of carbonic anhydrase ii mrna and protein in oligodendrocytes during toxic demyelination in the young adult mouse ultrastructure of axons in disturbed cns myelination in pt rabbit experimental paraprotein neuropathy, demyelination by passive transfer of igm anti-myelin-associated glycoprotein the dm20 protein of myelin: intracellular and surface expression patterns in transfectants myelin deficits produced by early postnatal exposure to inorganic lead or triethyltin are persistent primary demyelination induced by exposure to tellurium alters mrna levels for nerve growth factor receptor, scip, 2',3'-cyclic nucleotide 3'-phosphodiesterase, and myelin proteolipid protein in rat sciatic nerve tellurium-induced alterations in hmg-coa reductase gene expression and enzyme activity: differential effects in sciatic nerve and liver suggests tissue-specific regulation of cholesterol synthesis primary demyelination induced by exposure to tellurium alters schwann-cell
gene expression: a model for intracellular targeting of ngf-receptor alterations in gene expression associated with primary demyelination and remyelination in the peripheral nervous system experimental lead encephalopathy in the suckling rat: concentration of lead in cellular fractions enriched in brain capillaries effect of inorganic lead exposure on myelination in the rat tellurium-induced neuropathy: a model for reversible reductions in myelin protein gene expression posttranslational modification of myelin proteins the localization of laminin and fibronectin on the schwann cell basal lamina krox-20 controls myelination in the peripheral nervous system pathobiology of acute triethyltin intoxication post-transcriptional events are responsible for low expression of myelin basic protein in myelin-deficient mice: role of natural antisense rna hexachlorophene hexachlorophene-induced changes in central and peripheral myelinated axons of developing and adult rats distribution of the myelin-associated glycoprotein and p0 protein during myelin compaction in quaking mouse peripheral nerve the myelin-associated glycoprotein: location and potential functions essential fatty acid deficiencies and cns myelin presence of the myelin-associated glycoprotein correlates with alterations in the periodicity of peripheral myelin myelin-associated glycoprotein and myelinating schwann cell-axon interaction in chronic β,β'-iminodipropionitrile neuropathy cellular and subcellular distribution of 2',3'-cyclic nucleotide 3'-phosphodiesterase and its mrna in the rat central nervous system co-localization of the myelin-associated glycoprotein and the microfilament components f-actin and spectrin in schwann cells of myelinated fibers immunocytochemical localization of p0 protein in golgi complex membranes and myelin of developing rat schwann cells spatial segregation of mrna encoding myelin-specific proteins polarization of myelinating schwann cell surface membranes: role of microtubules and
the trans-golgi network lipid binding activities of the p2 protein in peripheral nerve myelin identical point mutation of pmp-22 in trembler-j mouse and charcot-marie-tooth disease type 1a structural and functional features of different types of cytoplasmic fatty acid-binding proteins tellurite specifically affects squalene epoxidase: investigations examining the mechanism of tellurium-induced neuropathy tellurium blocks cholesterol synthesis by inhibiting squalene metabolism: preferential vulnerability to this metabolic block leads to peripheral nervous system demyelination effects of altered thyroid states on myelogenesis mechano-transduction across the cell surface and through the cytoskeleton proliferation and differentiation of o4+ oligodendrocytes in postnatal rat cerebellum: analysis in unfixed tissue slices using anti-glycolipid antibodies effect of triethyltin on the developing brain of the mouse structure and function of the myelinated fiber molecular organization of the cell membrane in normal and pathological axons: relation to glial contact rules governing membrane reorganization and axon-glial interactions during the development of myelinated fibers axoglial interactions at the cellular and molecular levels in central nervous system myelinated fibers specificity in central myelination: evidence for local regulation of myelin thickness morphological correlates of functional differentiation of nodes of ranvier along single fibers in the neurogenic electric organ of the knife fish sternarchus low density of sodium channels supports action potential conduction in axons of neonatal rat optic nerve the geometry of peripheral myelin sheaths during their formation and growth in rat sciatic nerves development of peripheral nerve fibers studies on the control of myelinogenesis. ii.
evidence for neuronal regulation of myelin production a myelin protein is encoded by the homologue of a growth arrest-specific gene brain recovery from essential fatty acid deficiency in developing rats myelination: a critical stage in development early postnatal starvation causes lasting brain hypomyelination myelin synthesis during postnatal nutritional deprivation and subsequent rehabilitation in vitro studies of the development, maintenance and regeneration of the oligodendrocyte-type-2 astrocyte (o-2a) lineage in the adult central nervous system dna sequence, genomic organization, and chromosomal localization of the mouse peripheral myelin protein zero gene: identification of polymorphic alleles hypomyelination in copper-deficient rats atpase activities in myelin and oligodendrocytes isolated from the brains of developing rats and from bovine brain white matter membrane proteins of synaptic vesicles and cytoskeletal specializations at the node of ranvier in electric ray and rat schwann cell differentiation key: cord-018276-elb93kp6 authors: li, shitao title: proteomics defines protein interaction network of signaling pathways date: 2012-12-27 journal: bioinformatics of human proteomics doi: 10.1007/978-94-007-5811-7_2 sha: doc_id: 18276 cord_uid: elb93kp6 protein interactions play fundamental roles in signaling transduction. analysis of protein–protein interaction (ppi) has contributed numerous insights to the understanding of the regulation of signal pathways. different approaches have been used to discover ppi and characterize protein complexes. in addition to conventional ppi methods, such as yeast two-hybrid (yth), affinity purification coupled with mass spectrometry (ap-ms) is emerging as an important and popular tool to unravel protein complex and elucidate protein function through the interaction partners. 
with the ap-ms method, protein complexes are first prepared by affinity purification directly from cell lysates, and their components are then characterized by mass spectrometry. in contrast to most ppi methods, ap-ms reflects ppi under near-physiological conditions in the relevant organism and cell type. ap-ms is also able to probe dynamic ppi that depend on protein posttranslational modifications, which are common in signal transduction. the use of ap-ms to map protein interaction networks of various signal pathways has increased dramatically in recent years. here, i’ll present the strategies toward obtaining an interactome map of a signal pathway and the methodology, detailed protocols, and perspectives of ap-ms. protein interaction plays an essential role in cell structure and function. in a simplified diagram of a signaling pathway, upon interaction with a ligand, the receptor undergoes a change, such as a conformational alteration, dimerization, phosphorylation, or ubiquitination, that leads to recruitment of intracellular molecules and subsequent activation of downstream signal cascades. each level of the signaling cascade requires protein interactions to work as a well-assembled, multifunctional protein complex essential for signal transduction. the functionality of proteins relies on their ability to interact with one another, whereas pathogenic conditions can reflect perturbations of these protein interactions. numerous protein-protein interaction (ppi) methods have been developed, but only a few of them are used for large-scale ppi detection, including yeast two-hybrid (yth), protein fragment complementation assay (pca), luciferase-mediated interactome (lumier), mammalian protein-protein interaction trap (mappit), protein array, and affinity purification coupled with tandem mass spectrometry (ap-ms). the yth system was the first assay for large-scale analysis of protein-protein interactions and is a widely accepted method (fields and song 1989).
in the yth system, the gene of interest (bait, x) is fused to the dna-binding (db) domain of a transcription factor such as gal4 (db-x), while the interacting protein (prey, y) is fused to an activation domain (ad) such as gal4-ad (ad-y). physical interaction between x and y brings ad and db together, which reconstitutes the transcription factor and subsequently activates the downstream reporter genes (fields and song 1989). like the yth, pca requires that bait and prey are each fused with incomplete fragments of a third protein, which acts as a reporter. interaction between bait and prey proteins brings the fragments of the reporter protein into close enough proximity to allow them to form a functional reporter protein (rossi et al. 1997). when fluorescent proteins are reconstituted, the pca is called a bimolecular fluorescence complementation assay (kerppola 2009). lumier is basically a co-immunoprecipitation assay, in which the bait is linked to an epitope for purification and the prey protein is fused to renilla or firefly luciferase for detection (barrios-rodiles et al. 2005). in mappit, bait and prey proteins are linked to signaling-deficient cytokine receptor chimeras. interaction of bait and prey restores the jak-stat cascade after the receptor has been stimulated with ligand, which leads to stat3-dependent reporter gene activation (eyckerman et al. 2001). a protein microarray is a microscopic array on a glass slide on which proteins of interest have been affixed at separate locations in an ordered manner using a variety of available chemical linkers (macbeath 2002). protein microarrays are typically high-density arrays that are used to identify novel proteins or protein-protein interactions. antibody microarrays are the most common analytical microarray. ap-ms is the biochemical purification of protein complexes followed by characterization of their components by mass spectrometry.
however, unlike the methods discussed above, ap-ms is not designed to detect one-to-one protein interactions (i.e., binary interactions). instead, ap-ms detects multi-protein complexes. for ap-ms, the gene of interest is tagged with a suitable epitope for affinity purification. various tags have been developed, such as the flag tag, the ha tag, glutathione s-transferase (gst) tags, the calmodulin-binding peptide, the streptavidin-binding peptide, or in vivo biotinylation of the tagged target peptide by coexpression of the bira ligase (waugh 2005). with an affinity tag, protein complexes are first enriched by affinity purification. one early ap-ms approach uses the tandem affinity purification (tap) tag (puig et al. 2001). the original tap tag is composed of a protein a tag and a calmodulin-binding peptide for two sequential enrichment purifications. in the first purification step, the protein complex is isolated from the cell lysate using immunoglobulin gamma (igg) resin, which has high affinity for protein a. after the protein complex is cleaved from the protein a tag with tev protease, the eluate undergoes a second purification on an immobilized calmodulin column. to date, ap-ms has been performed in combination with other techniques, such as biochemical fractionation and chemical cross-linking, for the characterization of protein complexes. combining biochemical fractionation, such as size fractionation, with ap-ms can provide a more precise characterization of multi-protein complexes according to the fractions. for example, a combination of tap purification with standard gel filtration has allowed a better characterization of the rna polymerase ii complex (mueller and jaehning 2002). cross-linkers are used to detect weak interactions, such as those in membrane complexes, which may be disrupted by detergents in the lysis buffer.
a combination of tap with in vivo formaldehyde cross-linking was used to identify novel proteasome interactors (tagwerker et al. 2006). ap-ms can also be combined with quantitative proteomics approaches, such as silac and icat, to better understand the dynamics of protein complex assembly. stable isotope labeling by amino acids in cell culture (silac) is an approach for in vivo incorporation of a label into proteins for mass spectrometry (ms)-based quantitative proteomics (ong et al. 2002). isotope-coded affinity tags (icat) are complementary to silac and measure dynamic changes in complexes isolated from tissues or organisms that cannot be metabolically labeled (gygi et al. 1999). both entail labeling the samples with isotope labels that allow the mass spectrometer to distinguish between identical proteins in separate samples. differentially labeled samples are combined and analyzed together, and the differences in the peak intensities of the isotope pairs accurately reflect differences in the abundance of the corresponding proteins. given the fundamental importance of protein interactions, efforts to systematically map protein-protein interactions (ppi) in various species have increased dramatically in recent years. using high-throughput yth, proteome-wide physical interaction maps have been generated for several organisms: saccharomyces cerevisiae (fromont-racine et al. 1997; uetz et al. 2000; ito et al. 2001), caenorhabditis elegans (walhout et al. 2000; reboul et al. 2003; li et al. 2004), drosophila melanogaster (giot et al. 2003; guruharsha et al. 2011), and human (guruharsha et al. 2011; rual et al. 2005). virus-host protein interactomes have also been explored, such as those of severe acute respiratory syndrome (sars)-coronavirus (pfefferle et al. 2011), kaposi sarcoma herpesvirus (kshv), and varicella zoster virus (vzv) (uetz et al. 2006; rozen et al. 2008).
in addition to global mapping, the protein interaction networks of several important signal pathways, such as mapk (bandyopadhyay et al. 2010), tgfβ (tewari et al. 2004), smad (colland et al. 2004), and pi3k-mtor (pilot-storck et al. 2010), have been investigated. in addition to yth, ap-ms is another widely used ppi tool for mapping protein interactomes. owing to the many advantages discussed later, the use of ap-ms to map the protein interaction networks of various signal pathways has increased dramatically in recent years. global interactomes have been established for escherichia coli (hu et al. 2009), mycoplasma pneumoniae (kuhner et al. 2009), saccharomyces cerevisiae (krogan et al. 2006; gavin et al. 2006; ho et al. 2002), drosophila melanogaster (guruharsha et al. 2011), and the hiv-host interactome (jager et al. 2012). in vertebrates, this approach has so far been used to define proteomic subspaces or specific signal pathways: the antiviral innate immunity pathway (li et al. 2011), the autophagy pathway (behrends et al. 2010), the deubiquitinase interactome (sowa et al. 2009), the endoplasmic reticulum-associated protein degradation (erad) network (christianson et al. 2012), the tnf pathway (bouwmeester et al. 2004), the proteasome interaction network (guerrero et al. 2008), and a disease-related protein network (ewing et al. 2007). systematic identification of protein interactions within an organism will facilitate systems-level studies of biological processes. current binary ppi networks are mainly generated by high-throughput yeast two-hybrid. because of the small overlap between these maps, it has been assumed that they are of low quality and contain many false positives (parrish et al. 2006). recent efforts to map interactions using ap-ms illustrate its promise for measuring specific protein interactions in vivo (instead of in yeast) and provide a more powerful tool for modeling the in vivo interactome.
first, i discuss the advantages of ap-ms versus yth, and then focus on the details of the methodology, applications, and perspectives of ap-ms. despite the wide acceptance of the yth system for protein-protein interaction analysis and discovery, high-throughput yth for protein interaction networks has several major limitations: (1) reporter analysis reflects protein-protein interaction only indirectly, which usually leads to many false positives. for example, proteins with transcriptional activity can lead to autoactivation of the reporter genes. (2) some heterologous proteins are incompatible with or toxic to yeast, e.g., membrane proteins, which are unlikely to be appropriately assayed as fusions with a reconstituted transcription factor in yth. (3) yth cannot reflect the endogenous protein interactions in the relevant organism. (4) many signaling pathways in vertebrates do not exist in yeast. thus, interactions triggered by posttranslational modifications do not occur in yeast, resulting in many intrinsic false negatives. (5) the coverage of the prey library is usually incomplete. in addition, in high-throughput yth, bait expression is not monitored. heterologous full-length proteins, especially high-molecular-weight proteins, are expected to be expressed at low levels in yeast. although both yth and ap-ms detect protein-protein interactions, they differ in several respects (table 2.1). ap-ms couples affinity purification with mass spectrometry and requires more labor and more sophisticated equipment. in principle, baits can be expressed in any cell line of interest to the investigator. after antibiotic selection, bait expression levels in stable cell lines are monitored by western blot, and a cell line expressing a low level of bait protein (close to the endogenous level) is usually chosen for the subsequent affinity purification.
since bait expression is close to the level of the endogenous counterpart, we expect the purified complex to reflect the endogenous protein interactions under physiological conditions. ap-ms can also be used to detect dynamic protein interactions that depend on posttranslational modifications induced by signal stimulation. unlike yth, which detects one-to-one interactions (aka binary interactions), ap-ms analyzes the entire bait complex and provides all prey information in one run. however, the purified complex represents a mix of direct and indirect binding partners, since ap-ms data cannot establish whether an identified interaction is direct or indirect. last, protein abundance and specificity in different cell lines also limit the detection of protein complexes. for example, mib1 and mib2 have comparable affinity for tbk1, but we did not detect mib2 in the tbk1 complex in 293t cells by ap-ms. using real-time pcr, we found that mib1 is the predominantly expressed form in the 293t cell line (li et al. 2011). taken together, ap-ms overcomes the limitations of yth discussed above but has several disadvantages of its own relative to yth: high cost, inclusion of indirect interactions, and cell type specificity. the pipeline of ap-ms from gene construction to interaction network mapping is shown in fig. 2.1 (li et al. 2011). in brief, the gene of interest is tagged with suitable epitopes such as flag, gst, his, and biotin. depending on the purification strategy, one or two tags (usually tandem tags) are adopted. these vectors should carry an antibiotic resistance gene for selection of stable mammalian cell lines. after transfection or infection of the chosen mammalian cell line, cells are selected with the designated antibiotic to obtain stable expression close to the endogenous protein level. protein complexes are precipitated from lysates of bulk cells using various immobilized matrices, such as resin conjugated with antibody.
protein complexes are then eluted from the matrices after several washing steps to remove nonspecific interactors. the protein complex is either separated on a gel followed by silver staining or precipitated. sliced gel bands or solution samples are analyzed by mass spectrometry. after data collection and statistical analysis, the protein interaction network is generated and ready for validation and further functional analysis. to purify protein complexes at close to physiological levels, a cell line stably expressing the tagged bait is a prerequisite. therefore, an antibiotic resistance gene should be included in the vector for stable cell line selection. the gene of interest also needs to be tagged in-frame with an epitope (at either the n or c terminus), which is used to affinity purify the tagged protein (aka bait) along with its interacting partners (aka prey). in theory, any affinity tag can be used for ap-ms, and the most successful tags developed to date are flag, ha, s-tag, and the tandem affinity purification (tap) tag. each purification tag has advantages and disadvantages, and the appropriate technique should be selected depending on the goals of the experiment. for example, a single flag or ha epitope adds only 8-11 amino acids (li et al. 2011), while the tap tag adds a >20-kda tag (krogan et al. 2006), which may cause more nonspecific binding. because a tag may interfere with protein expression or interaction, both n-terminal and c-terminal fusions should be tested for optimal ap-ms. for example, a membrane protein may need the tag at the c terminus or after the signal peptide at the n terminus. furthermore, two kinds of purification methods (single and tandem purification) are used for ap-ms, which require baits fused with single or double epitopes, respectively. depending on the number of tags on the vector, there are one-step and two-step purification methods for a specific protein complex, cell line, or organism.
originally developed for yeast, the first tap tag consists of a calmodulin-binding peptide (cbp), followed by a tobacco etch virus protease (tev protease) cleavage site and protein a, which has high affinity for immunoglobulin gamma (igg). the protein complex is first purified from the cell lysate on an igg affinity resin and cleaved from the protein a tag with tev protease. the eluate is then enriched in a second affinity purification step on an immobilized calmodulin column. several variants of tap with different combinations of tags, such as flag-ha double tags, have been developed. on average, one-step purifications preserve weaker or more transient protein-protein interactions at the price of a higher number of nonspecific binding proteins. conversely, the tandem procedure tends to yield cleaner results, but weak interactions can be lost. flag and ha double tags are most commonly applied for tandem purification of protein complexes. we compared the effect of tandem tag versus single tag purification on the yield of total prey and high-confidence interacting proteins (hcip) by examining four protein complexes purified by single purification with flag versus a two-step purification with flag followed by ha (li and dorf 2013). ms analysis revealed that the number of total interactors was dramatically reduced in all protein complexes (tbk1, nap1, irf3, and sintbad) isolated by tap purification. however, the ratio of hcip to total prey did not increase. consistently, more hcip were detected by single-step affinity purification (fig. 2.2). in brief, tandem purification reduces the nonspecific binding proteins (nsbp) at the price of hcip loss. because on average more than 90% of the proteins recovered by one-step purification are nonspecific binding proteins, researchers prefer tandem affinity purification for a cleaner background if they are studying only a few protein complexes.
however, if the aim of the study is to map the protein interaction network of a specific signaling pathway, nsbp from one-step purification can be excluded by statistical analysis of the whole database. in most proteomics experiments, the purified proteins are separated by one-dimensional sds-page and stained with a mass spectrometry-compatible dye such as silver, sypro ruby, or coomassie. sds-page separation removes unwanted contaminants such as buffer components from the protein sample, and the sample complexity is decreased by separating the proteins according to molecular weight. moreover, it can also be used to compare band distributions with and without stimulation. in some cases, such as the irf3 complexes shown in fig. 2.1, unique bands are found only in the bait complex with stimulation, indicating that these interacting proteins depend on ligand stimulation. individual protein bands of interest are excised, or the entire lane is cut into approximately 1-mm³ pieces. gel pieces are then subjected to an in-gel trypsin digestion procedure to produce peptides for mass spectrometry analysis. however, the extraction efficiency of peptides from a gel is low and depends on the primary structure of the peptide. as an alternative to in-gel digestion, protein mixtures can be digested in solution without prior separation (behrends et al. 2010). because buffer components, such as detergents, interfere with the mass spectrometry ionization process, protein samples need to be precipitated with trichloroacetic acid (tca), washed, and redissolved in a digestion buffer. the main advantages of solution digestion are the shorter processing time and a higher recovery of peptides compared to in-gel digestion. however, bear in mind that some proteins, such as membrane proteins, are difficult to redissolve. the peptide mixture can be introduced directly into the mass spectrometer or separated by hplc before mass spectrometric analysis (lc-ms).
the two primary mass spectrometry methods developed for identification of proteins are electrospray ionization (esi) (fenn et al. 1989) and matrix-assisted laser desorption/ionization (maldi) (hillenkamp et al. 1991). electrospray ionization mass spectrometry is a desorption ionization method. a sample solution is sprayed from a small tube into a strong electric field in the presence of a flow of warm nitrogen to assist desolvation. the droplets formed evaporate in a region maintained at a vacuum of several torr, causing the charge on the droplets to increase. the multiply charged ions then enter the analyzer. the most obvious feature of an esi spectrum is that the ions carry multiple charges, which reduces their mass-to-charge ratio compared to a singly charged species. this advantage allows mass spectra to be obtained for large molecules. a major disadvantage is that this technique cannot analyze mixtures very well. the other most used technique, maldi, is a two-step process. first, desorption is triggered by a uv laser beam. the matrix material strongly absorbs uv laser light, leading to ablation of the upper layer (~micron) of the matrix material. the hot plume produced during the ablation contains many species: neutral and ionized matrix molecules, protonated and deprotonated matrix molecules, matrix clusters, and nanodroplets. the second step is ionization (more accurately, protonation or deprotonation). in the most common instrumental designs, esi and maldi are performed with mass spectrometers capable of tandem mass spectrometry (ms/ms) experiments. ion traps, quadrupole time-of-flight instruments (q-tof), fourier transform ion cyclotron resonance (ft-icr) mass spectrometers (ftms), and the orbitrap are the most common types of instrumentation now used in high-end protein analysis. most protein interactomes are represented only as static entities, which poorly capture the dynamics of complex composition.
there have been increasing efforts to obtain dynamic views of interactomes using various modified ap-ms approaches. systematic methods to map dynamic changes include semi-quantification based on total spectral counts or ion intensities of precursor peptides (ms1) or fragment ions (ms2) and the use of isotopic labeling approaches to obtain more accurate relative quantification. relative quantification methods such as stable isotope labeling by amino acids in cell culture (silac) detect differences in protein abundance among samples using nonradioactive isotopic labeling. although relative quantitation by isotopic labeling is more costly and time-consuming than label-free quantitation, it is less sensitive to experimental bias. it entails labeling the samples with stable isotope labels that allow the mass spectrometer to distinguish between identical proteins in separate samples. differentially labeled samples are combined and analyzed together, and the differences in the peak intensities of the isotope pairs accurately reflect differences in the abundance of the corresponding proteins. thus, relative quantitation may reveal dynamic interactions by comparing the abundance of identical proteins from the same bait cells with and without extracellular stimulation. absolute quantitation of proteins has also been developed; it entails spiking known concentrations of synthetic, heavy isotopologues of target peptides into an experimental sample (mirgorodskaya et al. 2012). however, the cost of absolute quantitation is too high to be realistic for large-scale interactome mapping. as quantitative methods become more robust, there will be increasing demand for detection of dynamic protein interactions upon extracellular stimulation. for example, we found that ~20% of protein interactions in the human innate immunity interactome for type i interferon (hi5) depend on ligand stimulation, such as stimulation with the viral dsrna mimic poly(di:dc) (li et al. 2011).
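as a rough illustration of the label-free, semi-quantitative comparison mentioned above, the following python sketch flags preys whose total spectral counts change between unstimulated and stimulated purifications of the same bait and reports the fraction that do. the function name, the twofold cutoff, the pseudocount, and the toy counts are all assumptions made for this example; they are not taken from the hi5 study or from any published pipeline.

```python
def stimulation_dependent(unstim, stim, fold=2.0, pseudo=1.0):
    """Return (set of stimulation-dependent preys, their fraction of all preys).

    unstim, stim: dicts mapping prey name -> total spectral count (TSC)
    fold:   hypothetical fold-change cutoff (assumption, not from the text)
    pseudo: pseudocount so preys absent in one condition do not divide by zero
    """
    preys = set(unstim) | set(stim)
    if not preys:
        return set(), 0.0
    dependent = set()
    for prey in preys:
        ratio = (stim.get(prey, 0) + pseudo) / (unstim.get(prey, 0) + pseudo)
        # gained or lost upon stimulation
        if ratio >= fold or ratio <= 1.0 / fold:
            dependent.add(prey)
    return dependent, len(dependent) / len(preys)


# toy counts for one bait (hypothetical prey names A-E)
unstim_counts = {"A": 10, "B": 12, "C": 0, "D": 8, "E": 20}
stim_counts = {"A": 11, "B": 12, "C": 9, "D": 7, "E": 19}
dependent_preys, fraction = stimulation_dependent(unstim_counts, stim_counts)
```

with these toy numbers only prey c, which appears solely after stimulation, is flagged, so one of five interactions is scored as stimulation dependent. isotope-labeling approaches would replace the spectral-count ratio with the heavy/light peak intensity ratio.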
as another example, in the insulin pathway, glatter et al. defined the interaction network of the insulin receptor/target of rapamycin pathway in drosophila (glatter et al. 2011). they found that 22% of the detected interactions were regulated by insulin. in addition to the quantitative power of mass spectrometry, it is also crucial to establish a stable cell line that is sensitive to stimulation. when overexpressed in cells, a bait protein may not respond to stimuli as sensitively as the corresponding endogenous protein. in most cases, the raw data files are first processed by the software controlling the respective mass spectrometry instrument. the generated data sets are then searched against a protein database using search engines such as mascot (hirosawa et al. 1993) or sequest (maccoss et al. 2002). a sound way to validate the chosen parameters is to search the obtained data sets against a decoy protein database. the data also need to be filtered further by setting specific thresholds, such as a minimum peptide length or a minimum number of peptides required to accept a protein identification. mass spectrometry has some intrinsic problems, such as the common problem of carryover between mass spectrometry runs. to circumvent the carryover problem, we usually analyze the repeated sample in a different batch. the same carryover is unlikely to show up in both of two independent ap-ms runs of the same bait. the record of each batch of ms runs also helps to identify carryover. in addition to mass spectrometry, affinity purification has its own inherent false positives and false negatives, which are a critical general limitation in the interpretation of ap-ms data because binary interaction information is lacking. false positives are nonspecific binding proteins and contaminants found in the purified bait complex. several types of false positives are present in typical affinity-purified protein samples.
the most common ones come from researchers' hands when they perform purification and handle samples. these contaminants are usually keratin proteins and are easy to remove from the dataset. there are also various other kinds of nonspecific binding proteins: (1) proteins that bind to affinity matrices, like stk38 and prmt5; (2) proteins that bind to the affinity tag, like kif11 binding to the flag tag; (3) abundant proteins (e.g., actin, tubulin); (4) proteins that preferentially bind to a specific domain, like ribosomal proteins binding to baits with a nucleic acid-binding domain; and (5) heat-shock proteins involved in protein folding. therefore, it is important to use cell lines stably expressing baits at near-physiological levels to avoid nsbps, as transient overexpression may result in protein aggregation and improper intracellular localization. to discriminate nsbps from the protein complex, repetition of ap-ms is mandatory. in our experience, nsbps differ dramatically between two independent ap-ms runs of the same bait. proper controls, including cells expressing gfp with the same epitope, are also useful for excluding nsbps. last, a large database built with the same affinity tag and the same cell line background in a high-throughput study is a good resource for identification of nsbps and hcips. if a protein is often isolated with many unrelated bait proteins, it is easily recognized through analysis of the high-throughput data. however, systematic large-scale experiments do not allow subjective and individual evaluation of their results, which means the removal of potential contaminating proteins cannot be based on judging individual purifications. therefore, statistical tools for analysis of the database are required to filter out nonspecific proteins and yield high-confidence interacting proteins. for statistical analysis of ap-ms data, the three main parameters are protein abundance, uniqueness (the frequency with which a protein is observed in the database), and reproducibility.
total spectral counts (tsc) have gained acceptance as a practical, label-free, semiquantitative measure of protein abundance in proteomics studies. several computational tools have been developed for the processing of ap-ms data, like comppass (sowa et al. 2009), saint (breitkreutz et al. 2010), and mist (jager et al. 2012). we designed a simplified method for analysis of ap-ms data that combines the three main parameters: protein abundance, uniqueness (the frequency with which a protein is observed in the database), and reproducibility. we adopted the z-score statistic to compare protein abundance because the z-score calculates the probability of a tsc occurring within a normal distribution. however, the z-score does not reflect reproducibility. in our protocol, each protein complex is tested in 4 ms runs, so reproducibility can be readily factored into the analysis. the z-score also does not take into account prey occurrence (i.e., prey uniqueness). to assess the likelihood that an interaction is specific, we set the cutoff for prey occurrence at <5%. we now propose a simple 3-stage scoring system to identify hcip. this algorithm combines z-score plus prey occurrence and reproducibility (zspore) (li and dorf 2013). in the zspore scoring system, each interaction must pass all three criteria to merit classification as hcip. the flowchart of zspore is shown in fig. 2.3, and a detailed description is provided in sect. 2.4.6. taken together, the zspore method combines three parameters (z-score based on tsc, prey occurrence, and reproducibility) and is a simple, efficient, and robust way to analyze ap-ms data. as with any large screening database, ap-ms data also contain false negatives, such as the absence of many known, previously documented protein-protein interactions.
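the three-criteria logic described above can be sketched in python as follows. this is a minimal illustration under stated assumptions, not the published zspore implementation: only the <5% prey-occurrence cutoff and the four ms runs per bait come from the text, whereas the z-score cutoff, the minimum number of runs, the way the z-score is computed (summed tsc of a prey for one bait compared with its summed tsc across all baits), and the toy database are hypothetical. the last toy prey also shows how a weakly detected protein is rejected by such a filter, one route by which statistical filtering can produce false negatives.

```python
from statistics import mean, pstdev

Z_CUTOFF = 2.0         # assumption: not specified in this section
MAX_OCCURRENCE = 0.05  # prey found with <5% of baits (from the text)
MIN_RUNS = 2           # assumption: detected in at least 2 of the 4 ms runs


def zspore_pass(prey, bait, database):
    """Return True only if the prey passes all three criteria for this bait.

    database: dict bait -> dict prey -> list of TSC, one value per ms run
    (0 means the prey was not detected in that run).
    """
    # summed TSC of this prey for every bait in the database
    totals = [sum(preys.get(prey, [0])) for preys in database.values()]
    runs = database[bait].get(prey, [])

    # 1. abundance: z-score of the TSC for this bait against all baits
    sd = pstdev(totals)
    z = (sum(runs) - mean(totals)) / sd if sd > 0 else 0.0

    # 2. uniqueness: fraction of baits with which the prey was observed
    occurrence = sum(t > 0 for t in totals) / len(totals)

    # 3. reproducibility: number of runs in which the prey was detected
    runs_detected = sum(c > 0 for c in runs)

    return z >= Z_CUTOFF and occurrence < MAX_OCCURRENCE and runs_detected >= MIN_RUNS


# toy database: 25 hypothetical baits, each co-purifying abundant actin (ACTB)
database = {f"bait{i}": {"ACTB": [50, 50, 50, 50]} for i in range(25)}
database["bait0"]["P1"] = [10, 8, 12, 9]  # specific, reproducible prey
database["bait0"]["P2"] = [3, 0, 0, 0]    # seen in a single run only

results = {p: zspore_pass(p, "bait0", database) for p in ("P1", "ACTB", "P2")}
```

in this toy data p1 passes all three criteria, actb fails on both abundance and uniqueness because it is found with every bait, and p2 fails on reproducibility even though it is unique to the bait.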
there are several reasons why a known interaction may fail to be found by ap-ms. first, the statistical analysis tool may filter out the known interaction as nonspecific binding. second, the nature and location of the tag might interfere with bait protein function and disrupt its interactions. third, to allow parallel comparison, all ap-ms experiments are performed under the same single condition. the generic conditions of affinity purification may be too harsh to preserve some protein interactions; for example, the buffer for membrane proteins should differ from that for other proteins. fourth, the known protein interaction may depend on a different stimulation. some proteins may be involved in several pathways and have different interactors in response to the relevant stimulation. last, the absence of detection is often due to the protein expression level in the specific cell type, especially when the cells have relatively low abundance of the protein. to visualize the protein interaction network formed by hcips and baits, the graphic representation of an interaction between two proteins basically consists of two circles (nodes) linked by a line (edge). all interactions are combined to generate a map of the network (smoot et al. 2011). for comprehensive and dynamic visualization of the network, various kinds of attributes can be assigned to the nodes and edges and represented by different colors and line thicknesses. in addition, the functional classifications of hcips can be analyzed with a few online programs. for example, an hcip list can be uploaded to panther (thomas et al. 2003) or david (da huang et al. 2009) via a web interface. these programs group the proteins by protein domains, molecular functions, biological processes, and signal pathways. the functional classifications may help discover common threads underlying the proteins of interest. another approach is to obtain clues from known protein interactions to discover regulation mechanisms.
several protein-protein interaction databases are available for online search, deposition, and free download, such as biogrid, string, intact, and mint. the biogrid database is an online protein interaction repository with data compiled through comprehensive curation efforts. the latest version searches 31,739 publications for 510,188 raw protein and genetic interactions from major model organism species. string is a database of known and predicted protein interactions. the interactions include direct (physical) and indirect (functional) associations. string quantitatively integrates interaction data from these sources for a large number of organisms and transfers information between these organisms where applicable (szklarczyk et al. 2011). the intact database provides a freely available, open-source database system and analysis tools for molecular interaction data (kerrien et al. 2012). all interactions are derived from literature curation or direct user submissions and are freely available. the mint database focuses on experimentally verified protein-protein interactions mined from the scientific literature by expert curators (licata et al. 2012). ap-ms raw data can also be deposited in the tranche repository (smith et al. 2011), a distributed file system into which any sort of proteomics data may be uploaded. the data are then distributed over the internet and can be downloaded by anyone who has access to the hash key identifiers for the data, which may be kept private or publicly released. in summary, all these free online programs are useful and convenient research tools for mapping, analysis, and deposition of ap-ms data. ap-ms has been applied to map the protein interactomes of various cellular signaling pathways in mammalian cells.
our lab has established an efficient ap-ms pipeline for defining protein interaction networks and has successfully applied it to several pathways, including the human innate immunity interactome for type i interferon (hi5) (li et al. 2011), the mirna pathway interactome (mii), and the influenza-host (ihost) protein interaction network (li and dorf, unpublished data). the detailed pipeline of our ap-ms is provided in this section, and its application to different pathways in mammalian cells is discussed. genes known to regulate the signaling pathway under study are usually selected as primary baits. baits range from extracellular signals like ligands to cognate receptors on the cell membrane and to signaling intermediates, kinases, and transcription factors involved in these signaling pathways, together with their family members. after analysis of the primary bait ap-ms, some new and important hcips of the primary baits are also chosen as secondary baits. secondary baits not only validate the association with primary baits but also expand the protein interaction network and provide new insights into the signaling pathway and its cross talk with other pathways. bait cdnas can be tagged with various epitopes, such as the flag or ha epitope. as we discussed earlier, commercially available anti-flag beads have much higher affinity than anti-ha beads. we use two mammalian expression vectors, pcmv-3tag8 (stratagene) and the viral expression vector plpcx (clontech), for transfection and infection, respectively. vector pcmv-3tag8 harbors a hygromycin resistance gene, while plpcx confers resistance to puromycin. transfection and transduction are two common methods of dna delivery into mammalian cells. for cell lines that are easy to transfect, like hek293 cells, bait constructs are transfected directly into the cells. for cell lines with low transfection efficiency, such as the thp-1 cell line, the bait gene first needs to be packaged into retroviral virions.
the subsequent infection allows the bait gene to integrate into the genomic dna of the cell and to be expressed. two days after transfection or infection, cells are treated with puromycin or hygromycin for 14 days. single colonies are picked and expanded in 6-well plates. protein expression levels in each colony are determined by immunoblotting. a colony with protein expression close to the endogenous level is selected for ap-ms. most protein interactomes describe a specific signaling pathway at homeostasis, such as the dub network (sowa et al. 2009), the autophagy interaction network (behrends et al. 2010), and the erad interactome (christianson et al. 2012). however, many protein interactions depend on protein posttranslational modifications induced by different stimuli. for example, we found that about 20% of interactions were ligand dependent in the hi5 protein interaction network (li et al. 2011). we also noticed many new interactions between influenza virus proteins and the human host after viral infection (li and dorf, unpublished data). therefore, in our pipeline for ap-ms, each stable cell line is divided into two groups, one of which is treated with a ligand specific for the signaling pathway or, for studying a virus-host interactome, infected with virus. each group of cells is cultured in four or five 15-cm² culture dishes (about 5 × 10⁷ cells) to scale up for affinity purification. cells are lysed in 10 ml tap buffer (50 mm tris-hcl [ph 7.5], 10 mm mgcl₂, 100 mm nacl, 0.5% nonidet p40, 10% glycerol, phosphatase inhibitors, and protease inhibitors). after shaking on ice for 30 min, cell lysates are centrifuged for 30 min at 15,000 rpm. supernatants are collected and precleared with 50 µl of protein a/g resin. after shaking for 1 h at 4°c, the resin is removed by centrifugation. cell lysates are added to 20 µl anti-flag m2 resin (sigma) and incubated on a shaker for 12 h. the anti-flag resin is then washed three times (15 min each) with 10 ml tap buffer.
after removing the wash buffer, the resin is transferred to a spin column (sigma) and incubated with 40 µl 3× flag peptide (sigma) for 1 h at 4°c on a shaker. eluates are collected by centrifugation and stored at −80°c. purified complexes are loaded on 4-15% nupage gels (invitrogen) and run for a distance of about 1 cm (8 min at 200 v). gels are stained using the silverquest staining kit (invitrogen). each entire stained lane is excised and rinsed twice with 50% acetonitrile. the taplin biological mass spectrometry facility (harvard medical school) performs ms analysis for our samples. excised gel bands were cut into approximately 1-mm³ pieces. gel pieces were then subjected to a modified in-gel trypsin digestion procedure. gel pieces were washed and dehydrated with acetonitrile for 10 min, followed by removal of the acetonitrile. pieces were then completely dried in a speed-vac. gel pieces were rehydrated with 50 mm ammonium bicarbonate solution containing 12.5 ng/µl modified sequencing-grade trypsin (promega, madison, wi) at 4°c. after 45 min, the excess trypsin solution was removed and replaced with just enough 50 mm ammonium bicarbonate solution to cover the gel pieces. peptides were later extracted by removing the ammonium bicarbonate solution, followed by one wash with a solution containing 50% acetonitrile and 1% formic acid. the extracts were then dried in a speed-vac (~1 h) and stored at 4°c until analysis. on the day of analysis, the samples were reconstituted in 5-10 µl of hplc solvent a (2.5% acetonitrile, 0.1% formic acid). a nanoscale reverse-phase hplc capillary column was created by packing 5-µm c18 spherical silica beads into a fused silica capillary (100-µm inner diameter × ~12-cm length) with a flame-drawn tip. after equilibrating the column, each sample was loaded via a famos autosampler (lc packings, san francisco, ca) onto the column.
a gradient was formed and peptides were eluted with increasing concentrations of solvent b (97.5% acetonitrile, 0.1% formic acid). as peptides eluted, they were subjected to electrospray ionization and then entered an ltq velos ion trap mass spectrometer (thermo fisher, san jose, ca). peptides were detected, isolated, and fragmented to produce a tandem mass spectrum of specific fragment ions for each peptide. dynamic exclusion was enabled such that ions were excluded from reanalysis for 30 s. peptide sequences (and hence protein identity) were determined by matching protein databases with the acquired fragmentation pattern using the software program sequest (thermo fisher, san jose, ca). the human ipi database (ver. 3.6) was used for searching. precursor mass tolerance was set to ±2.0 da, and ms/ms tolerance was set to 1.0 da. a reversed-sequence database was used to set the false discovery rate at 1%. filtering was performed using the sequest primary score, xcorr, and delta-corr. spectral matches were further examined manually, and multiple identified peptides (>1) per protein were required. as with many screening methods, unfiltered ap-ms data contain many nonspecific binding proteins because of intrinsic characteristics such as nonspecific binding to beads or tag, protein aggregation, and carryover during ms runs. we now describe a simple and efficient statistical method, the z-score plus prey occurrence and reproducibility (zspore) scoring system, for identification of hcips. using this pipeline, we achieve a higher efficiency of ap-ms and better identification of high-confidence interacting proteins. the methods and criteria used to remove nonspecific binding proteins and identify high-confidence interacting proteins include: (a) gfp and controls. ap-ms of gfp-flag and of various controls, such as non-flag igg-conjugated resin, was used to identify nonspecific binding proteins in the database. (b) z-score.
a z-score (also known as a standard score) indicates how many standard deviations an element is from the mean. to calculate the z-score, mass spectrometry data were transformed into a "stats table," where the columns are total spectral counts (tsc) from 4 ms runs and the rows are bait-associated proteins (table 2.2). we then calculated the z-score of each $x_{i,j}$ (prey $i$ interacting with bait $j$) based on the maximum total spectral counts (tsc) of the 4 ms runs. for hi5 database analysis, we set the z-score cutoff at 2. the z-score is $z = (x - \mu)/\sigma$, where $z$ is the z-score, $x$ is the value of the element, $\mu$ is the population mean, and $\sigma$ is the standard deviation. (c) prey occurrence. we considered any prey associated with a single bait to be an hcip and preys associated with all baits to be nonspecific binding proteins (nsbp). generally, we set the bar for prey occurrence at <5%, which means that a given prey interacts with fewer than 5% of the baits in the entire database. in hi5, we showed that preys that interact with fewer than 5 baits represented statistically significant interactions in the hi5 dataset, so the threshold for prey occurrence in hi5 is set at 4. because of the known high interconnectivity among the selected baits, bait-to-bait interactions were considered hcips. (d) reproducibility. each prey must appear in at least 2 out of 4 ms runs. (e) batch reproducibility. to account for possible variations in the list of background contaminants observed in our dataset that were not identified by other statistical approaches, we intentionally sequenced each duplicate purified complex in different experiments. any protein that did not appear in different purifications was considered an nsbp and manually removed from the hcip list. after statistical analysis of the dataset, all pairwise interactions are collected and analyzed with cytoscape. several important attributes, such as z-score and tsc, can be integrated into the interaction map. in addition to generating the interaction map, the functional classifications of hcips also need to be analyzed.
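the zspore criteria (b) to (d) described above can be sketched in python as follows. this is only an illustrative sketch, not the authors' implementation: the function name `zspore_filter`, the nested-dictionary input layout, and the choice to compute the z-score of each prey across all baits (counting 0 where the prey was not detected) are assumptions made for the example. the cutoffs (z-score of 2, at most 4 baits, at least 2 of 4 ms runs) are the hi5 values given in the text. the gfp/control subtraction (a), the bait-to-bait rule, and batch reproducibility (e) are not modeled.

```python
from statistics import mean, pstdev


def zspore_filter(tsc, z_cutoff=2.0, max_baits=4, min_runs=2):
    """Return the set of (bait, prey) pairs passing the three filters.

    tsc: {bait: {prey: [tsc_run1, ..., tsc_run4]}} (hypothetical layout).
    """
    baits = list(tsc)
    preys = {prey for bait in baits for prey in tsc[bait]}
    hcips = set()
    for prey in preys:
        # Maximum TSC over the MS runs for each bait; 0 if prey not detected.
        maxima = {b: max(tsc[b].get(prey, [0])) for b in baits}
        mu = mean(maxima.values())
        sigma = pstdev(maxima.values())  # population standard deviation
        # Prey occurrence: number of baits in which the prey was detected.
        occurrence = sum(1 for v in maxima.values() if v > 0)
        if sigma == 0 or occurrence > max_baits:
            continue  # uniform background or too promiscuous a prey
        for b in baits:
            runs = tsc[b].get(prey, [])
            # Reproducibility: prey seen in at least `min_runs` MS runs.
            reproducible = sum(1 for c in runs if c > 0) >= min_runs
            z = (maxima[b] - mu) / sigma
            if reproducible and z >= z_cutoff:
                hcips.add((b, prey))
    return hcips


# Toy data: six baits, one ubiquitous prey, one specific and reproducible
# prey, and one specific prey seen in only a single run.
data = {f"B{i}": {"HSP": [10, 10, 10, 10]} for i in range(1, 7)}
data["B1"]["P1"] = [5, 4, 0, 6]
data["B2"]["P2"] = [0, 0, 0, 7]
print(zspore_filter(data))
```

in this toy dataset only the pair of bait b1 and prey p1 passes: the ubiquitous prey is removed by the occurrence filter, and the prey seen in a single run fails the reproducibility filter even though its z-score exceeds the cutoff.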
interactors can be grouped by protein domains, molecular functions, biological processes, and signaling pathways, which may help uncover common mechanisms underlying the proteins of interest. to determine which interactions in the database are new, several protein-protein interaction databases such as biogrid, string, intact, and mint can be used to identify known interactions. however, protein interactions reported in recent publications will not yet be included in these databases. the interaction information is also incomplete, and many known interactions may not be found in these databases. it is therefore important to search the curated literature for protein interaction information. taken together, all ap-ms data must be interpreted with care and validated with additional experiments. as with any screening approach, the database does not represent a final or complete interaction network. understanding how proteins interact in complex and dynamic networks is the key to dissecting the complexity of many genotype-to-phenotype relationships. the systematic mapping of physical interactions is therefore critical for post-genomic research. comprehensive analysis of protein-protein interactions is still a challenging endeavor in functional proteomics. since every technique has intrinsic limitations, the physical interaction data generated by ap-ms may carry many false positives and false negatives. thus, ap-ms is unlikely to capture the entire interactome. it also remains a challenge to develop optimal computational tools to visually and computationally represent the multiple layers of data and to integrate existing biological knowledge and functional data from the literature with the interactome data. since most ap-ms data represent a static ppi map, advanced methods that focus on dynamic and spatial changes in ppi have to be developed.
we have presented the general principles of the ap-ms approach and highlighted some recently developed technologies and successful applications to various signaling pathways. despite the increasing volume of ap-ms data and analysis tools, many major challenges remain. these include (1) the specificity of protein complexes in different cells and tissues, (2) the dynamics of protein complexes under different stimulations or posttranslational modifications, (3) the absolute and relative quantitation of proteins, (4) the mapping of transient or weak ppi and of endogenous ppi from native cells and tissues, (5) the integration of ppi data sets with other functional data sets, (6) the standardization and benchmarking of interactome mapping, and (7) the challenges posed by primary cells such as neuronal cells and the detection of weak endogenous interactions. given the different types of mass spectrometric instrumentation, ionization processes, and software platforms, the assessment of published data is becoming increasingly difficult. to facilitate the sharing of experimental data, common standards in data acquisition, data interpretation, and data storage are required. many processes in a cell depend on ppi, and perturbations of these interactions can lead to disease. comprehensive knowledge of the ppi networks of signaling pathways will not only give us insights into how cells respond to stimulation but will also provide new drug targets for therapeutic application. moreover, many viral and bacterial pathogens rely on host ppis to survive in host cells and tissues and exert their damaging effects. ultimately, such high-quality ppi networks will become invaluable resources for better understanding the mechanisms underlying major human diseases and will enable the better definition of drug targets. shitao li, ph.d., usa shitao li is a research fellow in the department of microbiology and immunobiology, harvard medical school. he obtained his ph.d. from wuhan university, china.
he is a recipient of a kaneb fellowship and an aai abstract award (2010). dr. li studies protein interaction networks using proteomics approaches. he has mapped a dynamic antiviral innate immunity protein interaction network and is currently working on a virus-host protein interaction network. by examining the protein network, he is investigating the signaling mechanisms controlling innate antiviral immunity and new drug targets for host defense against viral infection. he has published his research in several prestigious journals such as nature, immunity, and molecular cell. a human map kinase interactome; high-throughput mapping of a dynamic signaling network in mammalian cells; network organization of the human autophagy system; a physical and functional map of the human tnfalpha/nf-kappa b signal transduction pathway; a global protein kinase and phosphatase interaction network in yeast; defining human erad networks through an integrative mapping strategy; functional proteomics mapping of a human signaling pathway; systematic and integrative analysis of large gene lists using david bioinformatics resources; large-scale mapping of human protein-protein interactions by mass spectrometry; design and application of a cytokine-receptor-based interaction trap; electrospray ionization for mass spectrometry of large biomolecules; a novel genetic system to detect protein-protein interactions; toward a functional analysis of the yeast genome through exhaustive two-hybrid screens; proteome survey reveals modularity of the yeast cell machinery; a protein interaction map of drosophila melanogaster; modularity and hormone sensitivity of the drosophila melanogaster insulin receptor/target of rapamycin interaction proteome; characterization of the proteasome interaction network using a qtax-based tag-team strategy and protein interaction network analysis; a protein complex network of drosophila melanogaster; quantitative analysis of complex protein mixtures using isotope-coded affinity tags; matrix-assisted
laser desorption/ionization mass spectrometry of biopolymers; mascot: multiple alignment system for protein sequences based on three-way dynamic programming; systematic identification of protein complexes in saccharomyces cerevisiae by mass spectrometry; global functional atlas of escherichia coli encompassing previously uncharacterized proteins; a comprehensive two-hybrid analysis to explore the yeast protein interactome; global landscape of hiv-human protein complexes; visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation; the intact molecular interaction database in 2012; global landscape of protein complexes in the yeast saccharomyces cerevisiae; proteome organization in a genome-reduced bacterium; a map of the interactome network of the metazoan c. elegans; optimization and zspore analysis of affinity purification coupled with tandem mass spectrometry in mammalian cells; mapping a dynamic innate immunity protein interaction network regulating type i interferon production; mint, the molecular interaction database: 2012 update; protein microarrays and proteomics; yates 3rd jr.
probability-based validation of protein identifications using a modified sequest algorithm; absolute quantitation of proteins by acid hydrolysis combined with amino acid detection by mass spectrometry; ctr9, rtf1, and leo1 are components of the paf1/rna polymerase ii complex; stable isotope labeling by amino acids in cell culture, silac, as a simple and accurate approach to expression proteomics; yeast two-hybrid contributions to interactome mapping; the sars-coronavirus-host interactome: identification of cyclophilins as target for pan-coronavirus inhibitors; interactome mapping of the phosphatidylinositol 3-kinase-mammalian target of rapamycin pathway identifies deformed epidermal autoregulatory factor-1 as a new glycogen synthase kinase-3 interactor; the tandem affinity purification (tap) method: a general procedure of protein complex purification; c. elegans orfeome version 1.1: experimental verification of the genome annotation and resource for proteome-scale protein expression; monitoring protein-protein interactions in intact eukaryotic cells by beta-galactosidase complementation; virion-wide protein interactions of kaposi's sarcoma-associated herpesvirus; towards a proteome-scale map of the human protein-protein interaction network; tranche distributed repository and proteomecommons.org; cytoscape 2.8: new features for data integration and network visualization; defining the human deubiquitinating enzyme interaction landscape; the biogrid interaction database: 2011 update; the string database in 2011: functional interaction networks of proteins, globally integrated and scored; a tandem affinity tag for two-step purification under fully denaturing conditions: application in ubiquitin profiling and protein complex identification combined with in vivo cross-linking; systematic interactome mapping and genetic perturbation analysis of a c.
elegans tgf-β signaling network; panther: a browsable database of gene products organized by biological function, using curated protein family and subfamily classification; a comprehensive analysis of protein-protein interactions in saccharomyces cerevisiae; herpesviral protein networks and their interaction with the human proteome; protein interaction mapping in c. elegans using proteins involved in vulval development; making the most of affinity tags key: cord-007648-tm0hn0hz authors: mockett, a.p.adrian title: envelope proteins of avian infectious bronchitis virus: purification and biological properties date: 2002-12-20 journal: j virol methods doi: 10.1016/0166-0934(85)90138-7 sha: doc_id: 7648 cord_uid: tm0hn0hz immunoadsorbents, made with monoclonal antibodies, were used to purify the spike and membrane proteins of infectious bronchitis virus (ibv). the purified proteins were inoculated into rabbits to produce antisera. the rabbit anti-spike sera neutralized the infectivity of the virus whereas the anti-membrane sera did not. ibv-infected chickens produced antibodies to both the spike and membrane proteins. both these antibodies were at their highest concentration about 9–11 days after inoculation, whereas neutralizing antibodies were present only at very low concentrations at that time. neutralizing antibodies were at their highest concentration 21 days after inoculation. a second inoculation of virus at 42 days induced an anamnestic antibody response to the spike and membrane proteins and also for the neutralizing antibodies. the neutralizing, anti-spike and anti-membrane antibodies all reached highest concentrations 7–11 days after this inoculation. the advantages of purifying viral proteins using affinity chromatography with monoclonal antibodies are discussed. avian infectious bronchitis virus (ibv) is a coronavirus whose principal site of replication is the ciliated epithelial cells of the respiratory tract mucosa of chickens.
viral replication occurs in the cytoplasm of the cell and virions are formed by budding from the endoplasmic reticulum. there are three viral structural proteins: spike (s; peplomers), membrane (m) and nucleocapsid (n). s and m proteins are both glycosylated and parts of them are exposed at the surface of the virion. the spike protein consists of two glycopolypeptides, s1 and s2, which have molecular weights of 90 kda and 84 kda, respectively (cavanagh, 1981). the membrane protein is present as a number of distinct species which have molecular weights ranging from 23 kda to 34 kda; the molecular weight differences are associated with the various degrees of glycosylation. n protein (54 kda) is associated with the viral rna. the ibv spike protein, associated with the outer projections, plays an important part in the infection of cells. chicken antisera to this protein and spike-specific monoclonal antibodies (mockett et al., 1984) can neutralize the infectivity of the virus. a similar function has been found for the spike protein of murine coronavirus mhv-4 (collins et al., 1982; fleming et al., 1983) and the porcine coronavirus tgev (garwes et al., 1978). the spike protein of ibv also contains strain-specific determinants (mockett et al., 1984). the membrane protein appears to be a more highly conserved antigen and it is possible that only a small amount (approx. 1 kda) is exposed at the viral surface (boursnell et al., 1984). the nucleocapsid protein interacts with the viral rna to form a helical nucleocapsid. the objectives of this work were to produce immunoadsorbents using monoclonal antibodies which have been prepared previously (mockett et al., 1984) and to purify the virus-coded proteins of the viral envelope in a single step in relatively large amounts. this allowed hyperimmune rabbit antisera to the proteins to be produced and tested for neutralizing antibodies.
in addition, the sequential humoral antibody response of chickens after ibv infection has been studied using the purified viral proteins and whole virus in elisas and compared with the results of the neutralization test. the massachusetts m41 strain of ibv was grown in the allantoic cavities of 11-day-old embryonated chicken eggs and purified on isopycnic sucrose gradients as described by cavanagh (1981). purified virus was pelleted in a 6 × 14 ml rotor at 70,000 × g for 3 h at 4°c and resuspended in phosphate-buffered saline (pbs). an equal volume of pbs containing 4% (wt./vol.) np40 was added, mixed using a dounce homogeniser and incubated for 2 h at 25°c. the material was centrifuged for 5 min in an eppendorf microcentrifuge and the resulting supernatant, containing soluble viral components, was used for the affinity chromatography purification. monoclonal antibodies (designated a38 and c124) to the spike and membrane proteins, respectively, of ibv strain m41 were prepared (mockett et al., 1984). the gammaglobulin fraction of ascitic fluids containing either anti-spike or anti-membrane monoclonal antibodies was isolated by salt precipitation using a final concentration of 18% (wt./vol.) na2so4. for the spike immunoadsorbent 5.6 mg of gammaglobulin was coupled to 0.75 mg of cnbr-sepharose 4b (pharmacia) according to the manufacturers' instructions and for the membrane immunoadsorbent 5.5 mg was coupled to the same amount of gel. unreactive groups on the gel were blocked using 1 m ethanolamine, ph 8.0, and any non-covalently bound proteins were removed by repeated washings with 0.1 m nahco3 buffer, ph 8.3, containing 0.5 m nacl and 0.1 m acetic acid buffer, ph 4.4, containing 0.5 m nacl. the immunoadsorbent was stored in pbs containing 0.2% nan3 at 4°c until used. it was washed twice with 3 m nh4scn in pbs containing 0.1% octylglucoside, four times with pbs and twice with pbs containing 2% np40 before use. all wash volumes were 10 ml.
the solubilised virus preparation was mixed with the immunoadsorbent for 16 h at 4°c using a rotary stirrer. the gel was poured into a chromatography column and washed with pbs containing 0.1% np40 (40 ml) and pbs containing 0.1% octylglucoside (10 ml). 3 m nh4scn in pbs containing 0.1% octylglucoside was added and 10 fractions of 1 ml collected. the absorbance at 280 nm of each of the fractions was read using a sp1800 pye unicam spectrophotometer. the fractions in the absorbance peak were dialysed against pbs. a sample of each fraction was then subjected to electrophoresis in a polyacrylamide gel. those fractions containing detectable viral protein were pooled and constituted the purified protein preparation. ten per cent polyacrylamide slab gels containing sds were used (laemmli, 1970) and after electrophoresis samples were stained first with coomassie brilliant blue r-250 and then silver (morrisey, 1981). new zealand white rabbits were used. samples of purified proteins were mixed with an equal volume of freund's complete adjuvant and inoculated intramuscularly into the rabbits. a similar inoculation was given 1 wk later. after 5 wk the same antigen in incomplete freund's adjuvant was inoculated subcutaneously. five months later blood was collected from the ear vein and the resulting serum stored at -20°c until used. the houghton poultry research station line of rhode island red chickens was used. ibv (m41) was inoculated intratracheally (500 ciliostatic dose fifty (cd50); darbyshire et al., 1979) into 8 chickens and sequential blood samples were taken from the wing vein (see fig. 2 for times after inoculation). sera from the blood samples were stored at -20°c until used. a serum pool for each time of sampling was made by mixing an equal volume of serum from each of the eight samples. three different antigens were used for the elisas: spike protein, membrane protein and ib virus.
the purified spike and membrane proteins were used at a dilution of 1:20 in carbonate buffer whilst the purified ib virus was used at a 1:100 dilution. antigens were adsorbed for 1 h at 37°c. in the second step chicken sera were serially diluted in 0.5 m nacl containing 0.5% np40 (saline/np40) from an initial 6.64 log2 dilution. any specifically bound antibody was detected using a rabbit anti-chicken igg serum (1:300) and a goat anti-rabbit igg alkaline phosphatase conjugate (1:1,000) (sigma), both diluted in saline/np40. the substrate was p-nitrophenyl phosphate (1 mg/ml) in diethanolamine buffer and the reaction was stopped using 3 m naoh. each step was for 30 min at 37°c, the reaction volumes were 50 µl and the plate was washed three times with pbs containing 0.1% tween 20 between each step. titres were calculated graphically (mockett and darbyshire, 1981) using an absorbance value of 0.1 as the cut off. the chicken antisera were tested for neutralizing antibody to ibv as described by darbyshire et al. (1979). rabbit sera were precipitated with 18% (wt./vol.) na2so4 and the resulting gammaglobulin tested because rabbit sera have high concentrations of non-specific inhibitors of ib virus replication. the viral proteins purified by affinity chromatography are shown in fig. 1. the spike protein, which is composed of two polypeptides, was the only protein detected in the two fractions shown using the sensitive silver staining procedure. similarly the membrane protein was not contaminated with other proteins, although this protein did not stain as well as the spike. there were other stained bands present, but these are artifacts sometimes observed, even in the absence of protein, with this staining procedure. the purification was highly reproducible. the purified viral proteins were inoculated into rabbits and the gammaglobulin fraction of the antisera tested for neutralizing activity. only the anti-spike gammaglobulin neutralized the virus.
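as an aside on the elisa method described above, the graphical titre calculation with an absorbance cut off of 0.1 can be approximated numerically. the sketch below is an assumption-laden illustration rather than the procedure of mockett and darbyshire (1981): it assumes that absorbance falls as the serum is diluted and uses linear interpolation between the two dilutions that bracket the cut off. the function name `endpoint_titre_log2` and the example readings are hypothetical.

```python
def endpoint_titre_log2(log2_dilutions, absorbances, cutoff=0.1):
    """Return the log2 dilution at which absorbance falls to `cutoff`.

    Linear interpolation between adjacent points stands in for reading
    the value off a hand-drawn titration curve. Returns None if the
    curve never crosses the cutoff within the tested dilutions.
    """
    points = list(zip(log2_dilutions, absorbances))
    for (d0, a0), (d1, a1) in zip(points, points[1:]):
        if a0 >= cutoff > a1:
            # Interpolate between the two dilutions bracketing the cutoff.
            return d0 + (a0 - cutoff) * (d1 - d0) / (a0 - a1)
    return None


# Hypothetical readings for one serum pool, starting at the initial
# 6.64 log2 dilution and assuming twofold dilution steps.
dilutions = [6.64, 7.64, 8.64, 9.64]
readings = [0.8, 0.4, 0.2, 0.0]
print(endpoint_titre_log2(dilutions, readings))
```

with these made-up readings the curve crosses the cut off halfway between the last two dilutions, so the titre is close to 9.14 log2.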
however, the membrane protein was a good immunogen because the rabbit anti-membrane sera tested in the elisa had high activity against the whole virus: in fact the titres were higher than those of the rabbit anti-spike sera (see table 1). the results of testing sera from ibv-infected chickens for antibody to spike and membrane proteins showed that both anti-spike and anti-membrane antibodies were produced early after infection (see fig. 2). peak titres were between 9 and 11 days after infection. the antibody response to the whole virus had a similar profile. however, the neutralizing antibodies were present only at very low concentrations at that time and reached their highest concentration 21 days after inoculation. the second inoculation of virus induced an anamnestic antibody response. the elisa detected a similar increase in antibody titres using spike, membrane and whole virus; the peak was at 10-11 days after infection. the neutralizing antibody response was also similar and the peak titres were at 11 days after infection, which contrasted with the slow rise to the peak titres after the primary inoculation. this paper describes the application of affinity chromatography using monoclonal antibodies for the purification of the two viral structural proteins present at the surface of the ib virion: spike and membrane. a previous report has described procedures for the purification of these viral proteins and also nucleocapsid protein, the only other major structural protein (cavanagh, 1983). ibv was solubilised in np40 detergent and centrifuged in a sucrose gradient containing this detergent in order to purify the nucleocapsid protein. the addition of 1 m nacl to the sucrose solutions was required for the purification of the spike and membrane proteins, as they co-migrated in gradients containing low salt concentrations. however, the nucleocapsid protein could not be purified in gradients containing high salt concentrations. the yield of material from these gradients was relatively low, due to the limited number of fractions which contained purified viral components.
in other studies purified spike material contained some nucleocapsid protein and the membrane preparation contained other proteins which were thought to be of cellular origin. there are a number of advantages in using affinity chromatography. by making use of the specificity of the antibody, pure material can be isolated, even from a crude mixture of proteins. the method is very quick and easy and the immunoadsorbent can be used several times. thus, relatively large amounts of purified material can be obtained. the availability of spike and membrane proteins in a highly purified form will allow more biochemical, structural and immunological studies to be done. the conditions used to solubilise the virus did not dissociate the two spike polypeptides; therefore, both s1 and s2 were detected in the material eluted from the anti-spike immunoadsorbent. the results of experiments using the rabbit antisera to the viral proteins confirmed the biological importance of the spike protein, as only antibodies to this protein neutralized the infectivity of the virus. the chicken, about 10 days after an ibv infection, has antibodies to both the spike and membrane proteins in its serum but only very low concentrations of neutralizing antibodies. the profile of neutralizing antibodies shown in this paper agrees with previous published findings (holmes, 1977; mockett and darbyshire, 1981; hawkes et al., 1983). the results show that anti-spike antibodies produced early after infection are non-neutralizing, as assessed by our in vitro technique. this raises the question as to the function of these antibodies in the chicken. previous evidence has shown only the spike protein to be capable of eliciting neutralizing antibodies. there is a possibility that the anti-spike antibodies could be neutralizing in vivo and the function of the anti-membrane antibodies could be similar. the possible role of these antibodies in protection remains to be resolved.
purified viral proteins can be used to determine which is responsible for protecting the chicken from infection. protection tests such as the ciliostasis (darbyshire, 1980) and the mixed infection (escherichia coli and ibv) (smith et al., 1985) tests are available. it is only by using methods such as affinity chromatography that sufficiently large amounts of pure viral proteins can be made available to enable such tests to be done. the author wishes to thank ms. j.k.a. cook for her help with the neutralization tests, ms. debra southee for her excellent technical assistance and dr. t.d.k. brown for useful discussions. key: cord-000979-cav9n18w authors: hoppe, sebastian; bier, frank f.; nickisch-rosenegk, markus v. title: rapid identification of novel immunodominant proteins and characterization of a specific linear epitope of campylobacter jejuni date: 2013-05-29 journal: plos one doi: 10.1371/journal.pone.0065837 sha: doc_id: 979 cord_uid: cav9n18w campylobacter jejuni remains one of the major gut pathogens of our time. its zoonotic nature and widespread distribution in industrialized countries call for a quick and reliable diagnostic tool. antibody-based detection presents a suitable means to identify pathogenic bacteria. however, the knowledge about immunodominant targets is limited. thus, an approach is presented, which allows for the rapid screening of numerous cdna derived expression clones to identify novel antigens. the deeper understanding of immunodominant proteins assists in the design of diagnostic tools and furthers the insight into the bacterium’s pathogenicity as well as revealing potential candidates for vaccination. we have successfully screened 1536 clones of an expression library to identify 22 proteins that have not been described as immunodominant before.
after subcloning the corresponding 22 genes and expression of full-length proteins, we investigated the immunodominant character by microarrays and elisa. subsequently, seven proteins were selected for epitope mapping. for cj0669 and cj0920c linear epitopes were identified. for cj0669, specificity assays revealed a specific linear epitope site. consequently, an eleven amino acid residue sequence tlikelkrlgi was analyzed via alanine scan, which revealed the glycine residue to be significant for binding of the antibody. the innovative approach presented herein of generating cdnas of prokaryotes in combination with a microarray platform rendering time-consuming purification steps obsolete has helped to illuminate novel immunodominant proteins of c. jejuni. the findings of a specific linear epitope pave the way for a plethora of future research and the potential use in diagnostic applications such as serological screenings. moreover, the current approach is easily adaptable to other highly relevant bacteria making it a formidable tool for the future discovery of antigens and potential biomarkers. consequently, it is desirable to simplify the identification of structural epitopes, as this would extend the spectrum of novel epitopes to be detected. c. jejuni is a gram-negative, microaerophilic bacterium possessing a helical-shaped morphology [1]. in industrialized countries, c. jejuni has been one of the primary causal agents of gastroenteritis. in germany alone, 62,626 cases were reported in 2012 [2]. campylobacteriosis predominantly induces mild, self-limiting diarrhoea; however, severe cases have been reported [3]. several studies have shown the potential contribution of campylobacteriosis to the development of neuropathies such as the guillain-barré syndrome [4, 5]. the prominent route of infection is the improper handling and insufficient cooking of poultry [6].
the broad distribution of this pathogen in combination with a high clinical relevance necessitates fast and reliable diagnosis. although several genomic typing methods exist [7, 8] , these are often time-consuming and inappropriate for a point-of-care application. instead, a direct approach detecting the whole bacterium is beneficial. this can be achieved by using specific antibodies to membrane-associated antigens similar to the latex-agglutination-test that is already available for several bacterial pathogens including campylobacter [9] . in order to achieve this, copious knowledge of potentially suitable targets, i.e. immunodominant proteins, is indispensable. in the past, screening for immunogenic proteins has been carried out on nitrocellulose membranes [10] or using microarray library screening with extensive protein purification [11] . however, these methods have some major drawbacks as the former is prone to non-specific binding and cross-reactivity when using polyclonal sera for screening of bacterial libraries [12] , while the latter method is time-consuming and laborious due to the purification steps needed prior to microarray printing. as we have shown elsewhere, an approach using halotag® and specifically coated halolink™ slides is better suited to detect immunodominant proteins while reducing cross-reactivity to a minimum [13] . the halotag® provides several advantages over other commonly used tags as it enhances the amount of soluble proteins expressed [14] , reducing the formation of inclusion bodies. in addition, the interaction of tag and its specific ligand is based on covalent binding [15] . this negates the need for additional purification steps as the crude lysate can simply be spotted onto coated microarrays. only the target proteins presented as fusion constructs bearing a halotag® will bind to the surface, whereas the remaining proteins are washed off. combining the described screening method with expression libraries derived from c. 
jejuni cdna allows for the fast analysis of hundreds of different proteins. thus, suitable immunodominant proteins can be detected, isolated and identified via sequencing the encoding cdna sequence. the generation of a cdna derived expression library offers advantages in contrast to genomic libraries. the latter demand excessive screenings as the genetic information is mostly truncated or of little relevance representing areas within the genome that do not encode for proteins, whereas the former focuses on the genes transcribed [16] . this reduces the amount of clones to be screened. nevertheless, for effective cdna library screening normalization is needed, as rrna is strongly overrepresented due to its extreme abundance within a total rna extraction prior to reverse transcription [17] . bacteria only possess a poly(a)-tail on their mrna in rare cases [18, 19] . although methods exist to isolate mrna from bacteria [20, 21] , it is generally considered to be more challenging as compared to eukaryotic rna, where oligo(dt) primers are sufficient [22] . therefore, we refrained from isolating the mrna prior to reverse transcription. instead, the generated cdna was normalized, i.e. trimmed down, afterwards by the use of a duplex-specific nuclease [23] . this approach has been shown to effectively reduce the amount of rrna-derived molecules, thereby altering the overall composition in favour of the mrna-derived cdna without including a bias [24] . further optimization of library construction was achieved by using ligation-independent cloning as well as electroporation, which have been shown to enhance overall cloning efficiency [25, 26] . using this approach, a relatively small number of clones can be screened to illuminate immunodominant proteins effectively. the identification of previously unknown or incompletely described antigens offers several benefits and potential applications. first, these proteins might serve in a diagnostic tool to rapidly identify c. 
jejuni in biological samples, e.g. in food manufacturing industry or hospitals. secondly, proteins eliciting an immune response might be suitable candidates for vaccination. finally, gaining insight into the structure and function of novel immunodominant proteins might improve the overall understanding of a bacterium's pathogenicity as it could accelerate the elucidation of novel virulence-associated factors. in this paper, we show the successful screening of an expression library of c. jejuni identifying several potentially immunodominant proteins. after further investigations, we selected a subset of these proteins for epitope mapping and succeeded in identifying linear epitopes for two proteins, namely cj0669 and cj0920c, which have not been described before. furthermore, assays were performed to assess specificity of the binding as well as to investigate the relevance of the amino acid residues involved via alanine scanning. additionally, the structure and antigenicity of the proteins and epitopes were modelled to analyze the suitability of the identified sequences for future applications like diagnostic tools or vaccine development. the rin values for the c. jejuni rnas isolated were above 8.5 in all cases (see s1: rin). after polyadenylation of the rna, it was reverse transcribed and subsequently normalized using a duplex-specific nuclease (dsn). to assess the performance of normalization, we sequenced a sample set of 96 individual clones after transformation with trimming and without trimming. without dsn treatment 28% of clones contained 23s rrna derived cdna and 8% other rrna derived cdna (16s and 5s). in comparison, after incubation with dsn only one clone in 96 showed a 23s rrna derived cdna. a total number of 1536 different clones were screened using halolink™ slides. we used hisj (uniprot/swiss-prot: q46125), cjaa (q0p9s0) and peb1a (q0p9x8) as positive reference markers, as they have been described previously to be immunodominant [27] [28] [29] . 
in contrast, argc (q9pis0) and pyrc (q0pbp6) were used as negative reference proteins. after comparing the median value and the standard deviation of each sample to the values of the positive references, a selection of 192 clones was picked to be sequenced. generally, we grouped clones into three categories after screening. group i included samples showing a higher median contrast value than all the positive references used, group ii encompassed the samples with median contrast values in between those of different positive references and group iii the remaining samples which, albeit below the lowest positive reference, were still above all the negative reference signals. after screening 1536 clones, 32% fell into group i, 15% into group ii and 4% into group iii while the remaining 50% were below the negative controls. the sequenced clones were ''blasted'' against the genome of c. jejuni nctc 11168 (refseq: nc_002163) to identify the corresponding genes and proteins. for further experiments, known antigens of c. jejuni found within the sequenced clones were discarded and we focused solely on 22 novel potentially immunogenic proteins. several of the identified clones carried only truncated fragments of the corresponding genes. in addition, some of the sequenced inserts possessed a frame shift. in order to overcome these limitations full-length inserts were prepared and subcloned to express the full-length proteins. the 22 genes and their corresponding proteins are summarized in table 1 including the length as well as the size of the protein without the attached halotag®. recombinant expression of the full-length fusion proteins was assessed by sds-page using a halotag® 488 alexa ligand, see figure 1 . the correct-sized proteins are highlighted as they are expressed fused to the 34 kda halotag®. bands at roughly 34 kda in size are visible throughout, likely representing early-termination transcripts comprised only of the halotag®. 
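the grouping rule described above can be sketched in a few lines of python. this is only an illustration of the three-group logic, not the authors' analysis script: the function names and the contrast values in the example are hypothetical, and the sketch assumes that every positive reference lies above every negative reference and ignores the standard deviation that was also considered during selection.

```python
from collections import Counter

GROUPS = ("i", "ii", "iii", "below negative")


def classify_clone(contrast, positive_refs, negative_refs):
    """Assign one clone to group i, ii, iii or 'below negative'.

    contrast      -- median contrast value of the clone
    positive_refs -- median contrast values of the positive references
    negative_refs -- median contrast values of the negative references
    """
    highest_pos = max(positive_refs)
    lowest_pos = min(positive_refs)
    highest_neg = max(negative_refs)
    if contrast > highest_pos:      # above all positive references
        return "i"
    if contrast >= lowest_pos:      # between the positive references
        return "ii"
    if contrast > highest_neg:      # below positives, above all negatives
        return "iii"
    return "below negative"


def group_fractions(contrasts, positive_refs, negative_refs):
    """Return the fraction of clones that falls into each group."""
    if not contrasts:
        raise ValueError("no clones to classify")
    counts = Counter(
        classify_clone(c, positive_refs, negative_refs) for c in contrasts
    )
    return {group: counts[group] / len(contrasts) for group in GROUPS}


if __name__ == "__main__":
    # Invented reference values, for illustration only.
    positives = [1.1, 1.4, 1.8]   # e.g. hisj, cjaa, peb1a
    negatives = [0.3, 0.4]        # e.g. argc, pyrc
    print(group_fractions([2.0, 1.5, 0.5, 0.1], positives, negatives))
```

whether a clone that exactly equals a reference value belongs to the upper or the lower group is not stated in the text; the sketch places ties with the lowest positive reference in group ii.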
evaluating immunodominance of these proteins, we performed 32 microarray and elisa analyses using three different primary antibodies against c. jejuni. summing up these results, table 1 shows the respective mean q values and errors (n = 25) for each of the proteins identified. the highest scores were attained by cj0926 and cj0927 with 1.77 ± 0.75 and 2.23 ± 0.58 respectively, indicating a potentially stronger immunogenicity as compared to the known immunodominant proteins hisj and cjaa. however, we refrained from further characterizing the former two proteins via epitope mapping as both are highly conserved and show multiple homologies in cellular organisms (cj0927) or epsilonproteobacteria (cj0926) according to the ncbi protein cluster database [30] . the same rationale was applied for cj1576, which is conserved in cellular organisms and peaks at 1.55 ± 0.43. instead, we focused on cj0623, cj1575 and cj0669, which are highly conserved in campylobacter only as well as cj0920c and cj1380, which in turn are conserved in campylobacteraceae. cj1380, cj0623 and . for almost all proteins, bands were visible with the respective sizes of their molecular weight and the halotag™ . the halotag™ alone at 34 kda is visible throughout. the proteins were as follows: 1 -cj0476, 2 -cj1723, 3 -cj1208, 4 -cj1619, 5 -cj0920c, 6 -cj1380, 7 -cj1381, 8 -cj0669, 9 -cj1624, 10 -cj1320, 11 -cj1486, 12 -cj0130, 13 -cj1366, 14 -cj0016, 15 -cj1575, 16 -cj1576, 17 -cj1729, 18 -cj0571, 19 -cj0926, 20 -cj0927, 21 -cj0623 and 22 -cj1138. the pictures of the two gels as well as the right-hand marker were fused by coreldraw to minimize space. for all proteins except cj0920c (5) and cj1320 (10) the overlapping peptide sequences corresponding to each protein were mapped with three different primary antibodies to c. jejuni. each peptide contains 15 amino acid residues with an overlap of 11 to each adjacent peptide. 
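the peptide layout used for epitope mapping (15 residues, overlap of 11, hence a shift of 4 residues per peptide) can be illustrated with a short python sketch. the function names are hypothetical and the sketch assumes that peptide 1 starts at residue 1; how the authors treated a c-terminal remainder shorter than one shift is not stated, so trailing residues are simply left uncovered here.

```python
def overlapping_peptides(sequence, length=15, overlap=11):
    """Cut a protein sequence into overlapping peptides.

    Returns a list of (first_residue, last_residue, peptide) tuples with
    1-based, inclusive residue numbers.
    """
    step = length - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the peptide length")
    peptides = []
    for start in range(0, len(sequence) - length + 1, step):
        peptides.append((start + 1, start + length, sequence[start:start + length]))
    return peptides


def peptide_span(first_peptide, last_peptide, length=15, overlap=11):
    """Residue range (1-based, inclusive) covered by a run of peptides."""
    step = length - overlap
    first_residue = (first_peptide - 1) * step + 1
    last_residue = (last_peptide - 1) * step + length
    return first_residue, last_residue


if __name__ == "__main__":
    # Toy 23-residue sequence, for illustration only.
    for first, last, pep in overlapping_peptides("ACDEFGHIKLMNPQRSTVWYACD"):
        print(first, last, pep)
    print(peptide_span(6, 8))
```

under these assumptions, peptides 6 to 8 span residues 21 to 43, which agrees with the residue range the results give for cj0920c, and two adjacent peptides share exactly 11 residues, the length of the consensus sequence tlikelkrlgi.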
cj0669, an abc-transporter atp-binding protein, showed significant intensities in two adjacent peptide spots as shown in figure 2 . the consensus sequence derived from these spots is tlikelkrlgi, which is highly conserved in c. jejuni (see s2: tlikelkrlgi is conserved in c. jejuni). next, we assessed specificity of these signals by applying antibodies reactive to bacteria other than c. jejuni in the epitope mapping to assess whether the binding occurs via the paratope of the antibody or in a non-specific manner. figure 3 shows the results of these investigations. the mean rfis (n = 12) of peptides 44 and 45 are at 8000 and 6000 a.u. respectively after incubation with polyclonal antibody to c. jejuni, whereas these intensities drop below 50 a.u. if an e. coli antibody is used. the remaining proteins painted a different picture. for cj1575 (s3: epitope mapping of cj1575) -a fragment with only 75 residues -no linear epitope was identified. the same was true for cj0623 (s4: epitope mapping of cj0623), cj0476 (s5: epitope mapping of cj0476) as well as cj1723 (s6: epitope mapping of cj1723) which harboured no linear epitopes. for the protein cj1380 (s7: epitope mapping of cj1380) only one peptide showed signal intensities above the positive control. another abc-transporter protein, cj0920c, possesses several regions, where signal intensities were above those of the positive controls, namely peptides 6 to 8 (aa 21 -43), 17 to 19 (aa 65 -87) as well as 56 and 57 (aa 221 -239), see s8: epitope mapping of cj0920c. the three potential antigenic regions were further assessed using the transmembrane prediction tool tmhmm2.0 and the antigenic prediction tool by emboss. here, peptides 6 -8 showed a consensus sequence, namely spfavwkfldal, which ought to be presented extracellularly as well as being antigenic according to the prediction tools. for the other two sites, either no antigenicity was predicted or most of the amino acids lie within a transmembrane region. 
this is true for amino acids 47 -69, 90 -112 and 214 -236. for a summary of the predicted characteristics, see s9: transmembrane and antigenic potential of three potential epitope sites for cj0920c. further, specificity control assays revealed that these signals do not drop significantly when using antibodies to salmonella enterica, indicating that non-specific binding occurs, see s10: specific vs. non-specific binding to potential linear epitopes of cj0920c. the influence of each amino acid within the consensus sequence tlikelkrlgi was assessed by performing an alanine scan. as figure 4 shows, the majority of amino acids in the sequence do not confer a change in binding intensity if replaced by alanine. however, in case of glycine a substantial drop in rfi values to less than 10% of the original value is observed. this indicates an important role of the glycine residue in the binding of the antibody. further, these results were highly reproducible in a second set of experiments and the specificity of the binding was evaluated using antibodies reactive to helicobacter pylori as summarized in figure 5 . the green boxes on the left represent the values of original sequence tlikelkrlgi and altered sequence tlikelkrlai after incubation with polyclonal antibodies to c. jejuni. as was observed in the previous figure, the drop in intensity is significant from a mean value of approximately 38000 a.u. to less than 7900 a.u., equal to a decrease of roughly 80% in the signal. in comparison, the adjacent boxes represent the data of the original sequence as well as tlikelkrlai after incubation with antibodies reactive to helicobacter pylori (orange). here, no significant difference between unaltered and altered sequence could be observed. furthermore, the mean intensities were below 800 a.u. thus, binding of an anti-helicobacter pylori antibody to the sequence is approximately 2% as compared to the original binding of polyclonal antibodies to campylobacter jejuni against this linear epitope. 
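as an illustration of the alanine scan and of the percentages quoted above, the following python sketch builds the single-substitution variants of tlikelkrlgi and recomputes the relative changes from the approximate mean intensities given in the text (38000, 7900 and 800 a.u.). the helper names are hypothetical and the sketch covers only the peptide design and the arithmetic, not the measurement itself.

```python
def alanine_scan(peptide):
    """Return (position, original_residue, variant) for every residue
    that can be replaced by alanine (positions are 1-based)."""
    variants = []
    for index, residue in enumerate(peptide):
        if residue == "A":
            continue  # an alanine cannot be scanned by alanine substitution
        variant = peptide[:index] + "A" + peptide[index + 1:]
        variants.append((index + 1, residue, variant))
    return variants


def percent_decrease(before, after):
    """Relative signal loss in percent."""
    return 100.0 * (before - after) / before


def percent_of(reference, value):
    """Signal expressed as a percentage of a reference signal."""
    return 100.0 * value / reference


if __name__ == "__main__":
    for position, residue, variant in alanine_scan("TLIKELKRLGI"):
        print(position, residue, variant)
    # Approximate mean intensities taken from the text (a.u.).
    print(round(percent_decrease(38000, 7900), 1))  # glycine -> alanine
    print(round(percent_of(38000, 800), 1))         # anti-h. pylori signal
```

the variant at position 10 is tlikelkrlai, the altered sequence named in the text; with the rounded intensities quoted there, the glycine substitution corresponds to a loss of about 79% and the anti-helicobacter signal to about 2% of the original binding.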
the same trend of low binding to the epitope sequence was observed when using antibodies reactive to s. enterica (s11: binding specificity assay of tlikelkrlgi with anti-salmonella antibodies). this indicates low non-specific binding. modelling of cj0669's structure was performed by swiss-model using automated mode. the template chosen during automatic identification was pdb1ji0a from the rcsb protein data bank [35] , which referred to the crystal structure analysis of the abc transporter from thermotoga maritima [36] . the model spans amino acids 3 to 236 of the molecule incorporating more than 90% of the residues. figure 6 shows a graphical display of the structure with the distinct secondary structures highlighted. in addition, the epitope sequence tlikelkrlgi is marked in orange. the first nine residues comprise parts of an alpha helix, whereas glycine and isoleucine form a loop connecting the alpha helix to a beta strand further down the primary sequence of the protein. the modelling was supported by statistical analyses including determination of a z-value [37] based on published data in the protein data bank. figure 7 shows the accuracy of the model based on the z-value calculations. we analyzed the consensus peptide sequence tlikelkrlgi by geneious pro 5.6.5. notably, when predicting secondary structure and antigenic regions by emboss garnier and antigenic tool respectively, the results matched the previous observations. an alpha helix precedes a loop that is connected to a beta-sheet. the only difference to the structural modelling by swiss model is the number of residues within the loop as emboss only includes the glycine residue in the loop. for antigenic potential, emboss antigenic predicts all amino acids within the consensus sequence to be part of antigenic sites except for lysine. the development of a cdna derived expression library and subsequent screening of the expressed fusion proteins has revealed several novel immunodominant proteins. 
optimization during library construction by reducing the amount of rrna molecules via dsn treatment has proven to greatly reduce the amount of rrna clones as was shown for other applications elsewhere [38, 39] . this enhances the number of clones carrying mrna-derived cdna, thus increasing the number of clones potentially expressing proteins of interest. for screening purposes antibodies were used, which were generated by immunizing rabbits with whole and partly lysed cells of c. jejuni. therefore, both membrane-associated as well as soluble proteins were available during immune response. thus, it was to be expected that the antibodies used in screening target both membrane and cytosolic proteins, which could be observed in the proteins identified. the screening not only detected the 22 proteins listed here but included other already known immunogenic proteins. these were excluded from further analyses as the main goal of this work was to identify novel proteins and their linear epitopes. the screening and analysis of immunodominant behaviour by microarrays was highly reproducible. some of the proteins showing the strongest signals during screening and microarray experiments were discarded, as they possess strong homologies within a wide spectrum of organisms. instead, we focused on proteins specific for campylobacter due to the potential application in a diagnostic tool. thus, linear epitope mapping was performed on a subset of seven proteins. this revealed linear epitopes for two proteins, cj0669 and cj0920c, both abc-type transporters, which were expected to be immunogenic [40] . the identified region tlikelkrlgi of cj0669 proved to be specific, showing little to no non-specific interaction with other antibodies. most notably, h. pylori antibodies did not reveal any binding to the linear epitope. together with the results for the other antibodies tested, i.e. anti-e. coli and anti-s. enterica, this strongly suggests non-specific binding to be minute. 
furthermore, the applicability of tlikelkrlgi as a specific target for campylobacter identification, antibody or vaccine production seems plausible. alanine scan has further identified the glycine residue to be of utmost importance for the antibody binding as replacing it by alanine reduces the intensity by roughly 80%. this has been somewhat surprising as glycine is a non-charged, achiral residue. however, the flexibility of glycine might allow the protein chain to turn more easily, consequently leading to a better accessibility. thus, immunogenic ability might be rendered by the presence of glycine. in fact, epitopes containing glycine as an important residue have been reported before [41] [42] [43] .
figure 5 . binding specificity assay of epitope sequence in alanine scan. box-whisker-plot (n = 12) of the eleven residue long consensus peptide sequence tlikelkrlgi and its derivative tlikelkrlai of cj0669 after incubation with antibodies reactive to c. jejuni (green) and h. pylori (blue). each box represents 50% of the values, while 98% fall within the whiskers. the median is represented by a horizontal line within each box and the small rectangle corresponds to the mean for each sample. for reference the rfi values of the positive control, rabbit igg, and negative control, mbp, are indicated on the right for both antibody incubations. for the original sequence a mean value of 38000 a.u. can be observed after incubation with antibody to c. jejuni. this dropped to less than 8000 a.u. after alanine substitution. this represents a drop of roughly 80% in overall intensity. in contrast, neither original nor substituted peptide showed any significant mean values after incubation with the anti-h. pylori antibodies indicating specific binding to the peptide by the anti-campylobacter antibody. doi:10.1371/journal.pone.0065837.g005
next, we modelled the 3-dimensional structure of cj0669 in order to evaluate the accessibility of the potential epitope. this is an important feature when bearing a diagnostic application in mind. from the model, we could conclude that the glycine residue and the linear epitope are most likely presented as part of a loop structure adjacent to the end of an alpha helix. these are located on the outer rim of the protein, hence easily accessible for antibody binding to occur. although the structural information is merely based on modelling, the z-value of this model lies close to zero (-0.88), a range commonly associated with x-ray crystallographic results [37] . consequently, the probability of this model to match the real structure is high. these findings are supported by further calculations performed using the emboss garnier and emboss antigenic algorithms. these results underline the antigenic potential of the consensus sequence tlikelkrlgi as well as predicting almost identical secondary structures as the automated swiss model. moreover, the secondary structure prediction changes when glycine is replaced by alanine, which further underlines the important role glycine most likely plays in antibody binding due to its ability to create a loop structure which would otherwise be absent with high probability. in conclusion, tlikelkrlgi seems a suitable candidate to be studied further and shows high potential for applications in a diagnostic tool or vaccination due to its good accessibility and antigenicity. this gains further momentum as tlikelkrlgi is highly conserved in c. jejuni, whereas other species of campylobacter, e.g. c. coli, c. upsaliensis or c. lari possess other amino acid residues within this region. for helicobacter pylori, this effect is even more pronounced. 
specifically, the glycine residue at the tenth position that proved paramount for the binding is not present in helicobacter pylori, see s2. thus, tlikelkrlgi seems a suitable candidate for specific diagnostic applications targeting c. jejuni. the analysis of the remaining six proteins revealed linear epitopes only in the case of cj0920c, another abc-type transporter. the region encompassing amino acids 27 -38 spfavwkfldal seems to be the most promising as its rfi values are high in microarray experiments. furthermore, the prediction tools determined this sequence to be antigenic and located extracellularly, thus easily accessible. this is not true for the other two regions detected from cj0920c, which are either lacking antigenic potential or are located within the cell or in transmembrane regions. still, the results indicate non-specific binding to contribute mainly to the positive signal. this currently excludes spfavwkfldal from diagnostic applications; however, further analysis might be helpful to investigate the full potential of this sequence as a specific epitope. although the analyses of full-length proteins on microarrays hint at the presence of antigenic sites within each protein, a lack of linear epitopes was observed. however, this was expected, as most naturally occurring epitopes are conformational and not linear [44] [45] [46] [47] . the current method used for epitope mapping cannot detect conformational epitopes. still, the presence of linear epitopes and their detection are important features in serological applications. this is particularly true for the high-throughput analysis of sera on microarrays as described elsewhere [48, 49] . the present work has accomplished two main goals. first, we have been able to identify a previously unknown antigenic site tlikelkrlgi of cj0669 from c. 
jejuni and were able to determine the important residue involved in antibody binding as well as modelling the epitope's accessibility within the full-length protein. for c. jejuni the generation of monoclonal antibodies against cj0669 is mandatory to further investigate the affinity and kinetics of the observed binding event by biacore. this might grant further insight into whether or not this sequence is a suitable candidate for specific detection in a biosensor or if cj0669 is a suitable vaccine candidate. mutagenesis of cj0669 might help to illuminate the function of this protein and its potential role, if any, in pathogenicity. once this is determined, a monoclonal antibody might be used to cocrystallize the antigen for x-ray crystallography to assign a measured structure to the predicted model. on top of that, antibody validation by elisa with a wide array of c. jejuni isolates is needed to evaluate the applicability of the derived antibody for future clinical point-of-care devices. additionally, next-generation sequencing of transcriptomes of a broad spectrum of c. jejuni strains ought to be beneficial in analyzing the presence and expression of the protein. this might provide further insight into the suitability of the protein for clinical applications and vaccine development. consequently, it could help to identify potential sequence homologies or discrepancies, which as of now are limited to already published datasets. thus, next-generation sequencing might be an attractive procedure to complement the approach presented herein. secondly, we have established and applied a quick and easy method for the screening of cdna expression libraries in order to rapidly detect novel immunodominant proteins of bacteria, see figure 8 for a summary. this approach is easily transferable to other bacterial pathogens such as methicillin-resistant staphylococcus aureus, klebsiella pneumoniae, neisseria gonorrhoeae, pseudomonas aeruginosa, salmonella enterica and others. 
the information needed prior to analysis is minute, yet using fully sequenced strains is advantageous as it speeds up the identification of genes and proteins. still, even unsequenced strains such as clinical isolates can easily be analyzed and should not pose any hindrance, as homologous proteins ought to be identified via blast. the more important prerequisite for the screening is the availability of polyclonal antibodies or patient sera. in the context of a diagnostic tool to be used in hospitals, patient sera are beneficial, as they resemble the desired immune response better, thus narrowing down the results to the clinically most relevant proteins. finally, suitable references are useful, yet in most cases, some immunodominant antigens have already been described. regardless, even without a suitable known antigen, this method could identify novel immunoreactive proteins and their linear epitopes with some minor adjustments to the overall calculations. in conclusion, we have successfully shown the application of this method while gaining insight into some novel immunodominant proteins of an important gut pathogen, c. jejuni. the current research emphasizes the need for future investigations in two distinct areas. first, more insight into the newly identified proteins of c. jejuni might foster the understanding of this enigmatic pathogen and help to illuminate its pathogenicity while providing suitable means for rapid detection and combating its spreading. second, transferring this method to other bacterial pathogens will help to discover other immunodominant proteins, potentially leading to a broad spectrum of clinical applications and serological assays. additionally, it might be preferable to simplify the identification of conformational epitopes as this could greatly enhance the retrieval rate of suitable antigenic sites. the state-of-the-art technique involves the cocrystallization of an antigen with a monoclonal antibody [50, 51] . 
besides, mutation analysis [52] , mass-spectrometry [53] or clips technology [54] are other methods to identify structural epitopes. however, the existing methods are extremely costly, time-consuming and demand high material inputs. therefore, advances to simplify structural epitope detection are essential and would be ideally suited to be combined with our current approach. nevertheless, linear epitopes provide an important tool for clinical applications. while their low abundance poses a problem, their simplicity grants major benefits. they are rapidly synthesized and modified. this enables them to be used in a broad spectrum of assays. in a clinical context where antibody titer determination of patient sera is necessary, short peptides are extremely valuable. chemical synthesis is relatively inexpensive and easier than recombinant expression of full-length proteins. furthermore, specific peptides of pathogenic bacteria remove the need to use the entire pathogen, thus reducing the associated risk. although prediction-based strategies for antigens and linear epitopes have been published [55] [56] [57] [58] , their accuracy is often lacking [59] . in contrast, our approach offers an attractive dual procedure as it rests on experimental data that are complemented by a widespread support of bioanalytical tools. thus, such a thorough approach makes it possible to focus on relevant epitopes quickly, while rapidly evaluating their suitability for prospective diagnostic applications as compared to prediction-based or experimental methods alone. consequently, we believe our present findings to be of outstanding interest for diagnostic applications and to pave the way for future implementation. moreover, the ability to quickly generate cdna libraries and identify novel immunodominant proteins independent of the bacterium used should lay the foundations for future research with highly relevant pathogens. creating a method to extend the detection to structural epitopes would add another tier to the current approach and greatly enhance the knowledge about antigenic sites. further, we are expanding our focus to other pathogens to help elucidate novel antigens within a wide array of clinically relevant bacteria.
figure 8 . schematic summary of the methods involved for library construction, screening and characterization. a bacterial pathogen is selected and its rna isolated. after cdna generation, normalization is performed to minimize the number of rrna derived clones. next, screening is performed using polyclonal antibodies and potentially immunogenic proteins are selected. the corresponding clones are sequenced and the genes and proteins identified, either by exact match (sequenced strain) or homology (clinical isolate). a set of candidate proteins is selected focusing only on previously unknown antigens, while known antigens are discarded. immunodominant nature of each full-length protein is assessed by elisa and microarray analyses to further narrow down the number of proteins to be tested via epitope mapping, if desired. linear epitope mapping reveals potential antigenic sites, which are tested for specificity and in alanine scan. finally, bioinformatic tools are used to model the 3-dimensional structure, accessibility and antigenicity of each protein. afterwards, the most promising candidates can be used in future applications including but not limited to monoclonal antibody generation, kinetic measurements, structural and functional analyses and diagnostic applications. doi:10.1371/journal.pone.0065837.g008
the strain c. jejuni nctc 11168 was grown on solid karmali media for 48 h at 37 °c under microaerophilic conditions (85% n 2 , 5% o 2 and 10% co 2 ) within a hypoxic workstation (coylabs). for rna isolation, a liquid culture was prepared by inoculating 10 ml of brain-heart-infusion broth (bhi) with a single colony and incubated overnight at 37 °c, 140 rpm under microaerophilic conditions. 
this overnight culture was used to inoculate a flask containing 100 ml fresh bhi medium and incubated for 16 h prior to harvesting. for initial screening, a rabbit polyclonal igg antibody to c. jejuni (acris ap24002pu-n) was used. for further microarray analyses of a subset of candidate proteins, elisa, epitope mapping and alanine scanning this antibody was used as well as two other rabbit polyclonal igg antibodies to c. jejuni (abcam ab22542 and abd serotec 1744-9035). the immunogen used for generation of antibodies to c. jejuni was c. jejuni atcc 29428. for specificity assays rabbit polyclonal igg antibodies to e. coli bl21 (micromol #322), s. enterica (abcam ab35156), s. aureus (fitzgerald 20c-cr1274rp) and h. pylori (abcam ab20459) were used respectively. detection was achieved by usage of secondary antibodies. a goat polyclonal antibody to rabbit igg was applied, conjugated with chromeo™-546 (abcam ab60317) for a fluorescent readout or with horseradish peroxidase (abcam ab6721) for a colorimetric readout. the cells were harvested by centrifugation (2000 x g, 10 min). the supernatant was discarded and the pellets resuspended in fresh bhi medium. subsequently, 0.5 ml of the bacterial suspension were added to 1 ml of rnaprotect bacteria reagent (qiagen) in a 2 ml tube, vigorously vortexed for 5 s and incubated for 5 min at room temperature. after centrifugation (5000 x g, 10 min) the pellets were resuspended in 200 µl of lysis buffer (30 mm tris-cl, 1 mm edta, 15 mg/ml lysozyme, .12 mau proteinase k) by pipetting and vortexing for 10 s. the solution was incubated for 10 min using a thriller thermomixer (peqlab) at room temperature and 1000 rpm. after addition of 700 µl buffer rlt and 500 µl 96% ethanol, the lysate was applied to rneasy bacteria mini kit spin columns (qiagen) for rna isolation following the manufacturer's instructions. 
after loading of the lysate, an on-column dnase digest was performed using rnase-free dnase i solution (qiagen) according to the manufacturer's instructions. the isolated total rna was eluted in 50 µl of rnase-free water and its concentration and purity analyzed by nanodrop (peqlab) measurements. the quality of isolated rna was assessed using the rna 6000 pico kit and bioanalyzer 2100 (agilent). the total rna was diluted to a working concentration of 200-500 pg/µl. the analysis was performed following the manufacturer's instructions and the rna integrity number (rin) calculated by the 2100 expert software (agilent). the rin is defined to fall into a range of 0 to 10, with a higher score indicating intact rna, whereas lower numbers are associated with degraded rna molecules [60]. in order to use bacterial mrna as a substrate in cdna synthesis, polyadenylation was mandatory. the tailing was achieved using the poly(a) polymerase tailing kit (epicentre) following the alternate protocol offered by the manufacturer. briefly, up to 10 µg of total rna were combined with 2 µl poly(a) polymerase reaction buffer, 2 µl 10 mm atp, 0.5 µl riboguard rnase inhibitor and 1 µl poly(a) polymerase (4 u) in a total reaction volume of 20 µl. the reaction was incubated for 20 min at 37 °c, terminated by the addition of 1 µl 0.5 m edta and purified by rneasy mini kit (qiagen) following the manufacturer's instructions. yield and purity were determined by nanodrop measurements. for cdna synthesis, the in-fusion® smarter™ directional cdna library construction kit (clontech) was used according to the manufacturer's instructions with slight modifications. 3.5 µl of total, polyadenylated rna were mixed with 1 µl of 3' in-fusion smarter cds primer, heated first for 3 min at 72 °c and then incubated for an additional 2 min at 42 °c.
after addition of 5.5 µl mastermix (2 µl 5x first strand buffer, 0.25 µl 100 mm dtt, 1 µl 10 mm dntps, 1 µl 12 µm smarter v oligonucleotide, 0.25 µl rnase inhibitor and 1 µl smartscribe™ reverse transcriptase) the tubes were incubated for 90 min at 42 °c. the reaction was terminated at 68 °c for 10 min. for second strand cdna synthesis, two 2 µl aliquots of the first strand reaction were used in long distance pcr using phusion polymerase (finnzymes). each pcr reaction was composed as follows: 2 µl first-strand reaction, 70 µl rnase-free water, 20 µl 5x phusion hf buffer and 2 µl each of dntp mix (10 mm), 5' pcr primer ii a (12 µm), 3' in-fusion smarter pcr primer (12 µm) and phusion polymerase, with a total reaction volume of 100 µl. the pcr reactions were subjected to a cycling program with 98 °c for 1 min as initial denaturation followed by 15 cycles of 10 s denaturation at 98 °c, 30 s of primer annealing at 65 °c and 6 min extension at 72 °c. for improved pcr results, optimization was performed as follows: 30 µl of the 15 cycle experimental tube were transferred to a separate pcr tube, cycling commenced and aliquots of 5 µl each were collected after 15, 18, 21, 24 and 27 cycles total. the different cycles were compared by gel electrophoresis and the experimental tubes subjected to additional cycles if necessary. finally, pcr reactions were purified using the qiaquick pcr purification kit (qiagen). the purity and yield of each reaction were analyzed by nanodrop measurements. normalization of double-stranded cdna was achieved with the trimmer-2 cdna normalization kit (evrogen) to reduce the number of cdna molecules derived from rrnas. briefly, 12 µl of cdna (approx. 100 ng/µl) were mixed with 4 µl of 4x hybridization buffer. for the trimming reaction, 4 µl of this mixture were distributed to each of four different pcr tubes and overlaid with a drop of pcr-grade mineral oil. after centrifugation (13000 x g, 2 min), the tubes were incubated for 2 min at 98 °c followed by 5 h at 68 °c.
next, pre-heated (68 °c) duplex-specific nuclease (dsn) master buffer was added to each tube and incubation prolonged for 10 min. dsn was added to the first three tubes in decreasing concentrations (1 u/µl, 0.5 u/µl and 0.25 u/µl), with the fourth tube receiving dsn storage buffer and no enzyme as a control reaction. the incubation was continued for 25 min at 68 °c. after addition of 5 µl dsn stop solution and subsequent incubation for 5 min at 68 °c, the tubes were placed on ice. the chilled reaction was diluted by addition of 25 µl sterile, rnase-free water. for amplification of normalized cdna, 1 µl of each reaction was used as template in pcr. each pcr reaction contained 1 µl of template from the normalization reaction, 33 µl nuclease-free water, 10 µl 5x phusion hf buffer, 1 µl 10 mm dntp mix (neb), 2 µl of each primer, 5' pcr primer ii a (12 µm) and 3' in-fusion smarter pcr primer (12 µm), and 1 µl phusion polymerase. the pcr was performed with initial denaturation at 98 °c for 1 min and seven cycles of denaturation at 98 °c for 10 s, primer annealing at 65 °c for 30 s and extension at 72 °c for 3 min. for optimization, the control tube was subjected to 7, 9, 11, and 13 cycles with 12 µl aliquots taken every two cycles. the optimization samples were analyzed by gel electrophoresis (1% agarose, tae, 100 v) and the optimal cycle number determined. the remaining three tubes were subjected to 9 + x cycles with x being the differential of the optimized cycles to the originally performed seven cycles. after the second pcr, the experimental reactions were compared to the optimal control reaction using gel electrophoresis as above. reactions showing a successful normalization were combined and used in a third pcr reaction. after the final pcr, the reactions were purified by qiaquick pcr purification kit.
for cloning, pfn18a (promega) was used as a vector, as it encodes an n-terminal halotag® fusion protein, which allows for specific and covalent binding to a unique ligand, thus reducing background and minimizing cross-reactivity in immunoassays with halolink™ microarrays harbouring the ligand on their surface. first, the vector needed to be linearized to be used with the in-fusion cloning technology. this was achieved by reverse pcr using ifs 18a for (5' ttgataccactgcttttccatggcgatcgcgttatc 3') and ifs 18a rev (5' tctcatcgtaccccgtgtttaaacgaattcgggctcg 3'). each reaction contained 2 µl each of 1:10 diluted pfn18a (10 ng/µl) and the two primers, 10 µl 5x phusion hf buffer, 1 µl 10 mm dntps, 0.5 µl phusion polymerase and 32.5 µl nuclease-free water to reach a total reaction volume of 50 µl. the pcr was run using a 25 cycle two-step program with 98 °c denaturation for 10 s and 4 min extension at 72 °c. after completion, 2 µl of dpni (20 u/µl) were added to the reaction and incubated at 37 °c for 1 h. the presence of a single band was checked by gel electrophoresis and the remaining reaction purified by qiaquick pcr purification kit. cloning of normalized cdna and linearized pfn18a vector was performed following the manufacturer's instructions within the in-fusion smarter directional cdna library construction kit (clontech). for electroporation, 2 µl of the cloning reaction were mixed with 25 µl of electrocompetent acella™ e. coli cells (mobitec), a bl21 derivative, and electroporated in 1 mm cuvettes using the easyject plus electroporator (peqlab). conditions for electroporation were as follows: voltage = 1400 v, capacity = 25 µf, resistance = 200 Ω and pulse duration of 5 ms. the electroporated cells were added to 970 µl of super optimal broth with catabolite repression (soc) and incubated at 37 °c for 1 h with shaking at 250 rpm. afterwards, 150 µl of the transformation reaction were plated on lysogeny broth (lb) agar containing ampicillin.
for each reaction, at least two plates were prepared and incubated at 37 °c for 16 h. a total of 1536 clones were selected and transferred to 1.3 ml u96 deepwell™ plates (nunc) containing 0.8 ml lb-amp. the plates were incubated overnight at 37 °c, 130 rpm. on the next day, the deepwell™ plates were centrifuged, the supernatant discarded and the pellets resuspended in 370 µl of lb-amp. a new set of u96 deepwell™ plates was prepared with 850 µl of fresh lb-amp and inoculated with 100 µl each from the resuspended overnight cultures. the remaining 270 µl of resuspended overnight culture were mixed with 30 µl of sterile-filtered dmso and stored at -80 °c. the newly inoculated plates were incubated for 6 h at 37 °c, 130 rpm. afterwards, the temperature was reduced to 20 °c, incubation continued for 1 h and protein expression induced by addition of 2 µl of 0.5 m isopropyl β-d-1-thiogalactopyranoside (iptg). incubation continued overnight at 20 °c, 130 rpm. the cells were harvested by centrifugation (2500 x g, 10 min), the supernatant discarded and the pellets frozen at -20 °c. after 15 min the pellets were resuspended in 180 µl of easylyse™ bacterial protein extraction solution (epicentre) and incubated for 5 min at room temperature. dnase i was mixed with dnase reaction buffer (10 mm tris-hcl, 2.5 mm mgcl2, 10 mm cacl2), added to the reaction and incubation was continued for 10 min at 37 °c. the plates were centrifuged for 3 min at 2500 x g to pellet cell debris. for each sample, 10 µl of lysate were transferred to 384-well microtiter plates (genetixx), which were used as reservoirs for the spotting procedure. the samples were spotted onto halolink™ slides (promega) using the qarray2 microarray spotter (molecular devices). 384 different samples were spotted per slide with three replicate slides per screening. in total 1536 samples were screened on 12 slides (n = 3).
each sample was spotted in quadruplicate with controls in two identical sets of eighteen 10 × 10 subarrays each (total number of spots per slide 3600). the controls used included ht-hisj, ht-cjaa and ht-peb1a as positive reference proteins, as these have been described as immunodominant before. as specificity controls ht-argc and ht-pyrc were used, representing proteins without known immunodominant behaviour, so binding of the polyclonal antibodies is not expected. in addition, two different e. coli strains, acella™ electrocompetent cells and krx single-step competent cells (promega), were spotted as further controls. as those two lack proteins expressed with a halotag®, they are used as negative controls. after spotting of the samples, the slides were incubated for 1 h at room temperature in a humidity chamber. next, slides were washed with pbs + 0.05% igepal® ca-630 (pbsi, sigma aldrich) and dried by nitrogen flow. the 2 well proplate™ module (grace biolabs) was attached to each slide. the top chamber was filled with 1.5 ml of rabbit polyclonal antibody to c. jejuni (acris, 2 µg/ml) in pbs. the bottom chamber was incubated with pbs only. after 2 h of incubation at room temperature with gentle rocking, both chambers of each slide were washed three times with 2 ml of pbsi. secondary antibody (goat polyclonal to rabbit igg conjugated with chromeo™-546, abcam, 5 µg/ml) in pbs was applied to each chamber and the slides were incubated at room temperature for 2 h in the dark under gentle rocking. finally, slides were washed three times with pbsi, the proplate™ modules removed and the slides dried by nitrogen flow. the slides were scanned on an axon genepix 4200a laser scanner (molecular devices) with the following settings: 532 nm laser, pmt gain 400, 40% laser power, lines to average 1, 10 µm resolution and standard green emission filter at 575 nm.
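the slide counts quoted for the screening (1536 clones, 384 samples per slide, three replicate slides, two sets of eighteen 10 × 10 subarrays) can be cross-checked with a few lines of arithmetic. the sketch below is only a consistency check of those published figures; the constant and function names are illustrative and not taken from the study.

```python
# Layout figures quoted in the text; names are illustrative only.
TOTAL_CLONES = 1536        # clones picked into deepwell plates
SAMPLES_PER_SLIDE = 384    # different samples spotted per slide
REPLICATE_SLIDES = 3       # replicate slides per screening (n = 3)
SETS_PER_SLIDE = 2         # two identical sets per slide
SUBARRAYS_PER_SET = 18     # eighteen subarrays per set
SUBARRAY_SIDE = 10         # each subarray is 10 x 10 spots


def slide_layout():
    """Return (distinct slide layouts, total slides, spots per slide)."""
    layouts = TOTAL_CLONES // SAMPLES_PER_SLIDE
    slides = layouts * REPLICATE_SLIDES
    spots = SETS_PER_SLIDE * SUBARRAYS_PER_SET * SUBARRAY_SIDE ** 2
    return layouts, slides, spots


if __name__ == "__main__":
    print(slide_layout())
```

the result agrees with the 12 slides and 3600 spots per slide stated in the text.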
the raw data sets of all the microarray analyses in this publication have been deposited in ncbi's gene expression omnibus [61] and are accessible through geo series accession number gse44717. the median fluorescence intensity of each spot corrected by the local background (median f532 - b532) was used. further, relative fluorescence intensity (rfi) was calculated by subtracting the signals of the bottom chamber from the raw data signals of the top chamber to account for non-specific binding of secondary antibodies. for screening of expression libraries we used the contrast method with either argc or pyrc as specificity control to determine the contrast via the formula: [...] with $\widetilde{rfi}_{control}$ the median of all rfis of the control used. clones harbouring strong signals in microarray screening were selected to be sequenced. sequencing was performed externally by lgc genomics using ht7f (5' acatcggcccgggtctgaatc 3') and flxr (5' cttcctttcgggctttgttag 3') primers. after sequencing and identification of potentially immunodominant proteins, primers were designed to generate full-length clones for each identified gene, see s12 for a list of the primers used. cloning was performed as mentioned above with slight modifications. the pfn18a vector was linearized using the following primer set: 18a if linear for (5' gtttaaacgaattcgggctc 3') and 18a if linear rev (5' ggcgatcgcgttatcgctctg 3') with pcr conditions as mentioned before. protein expression, lysis and spotting of full-length proteins were performed as described above. the slides were incubated for 1 h at room temperature in a humidity chamber. for incubation with antibodies, 3 well or 16 well proplate™ modules (grace biolabs) were attached to the halolink™ slides. processing of the slides was similar to the original screening; however, several different antibodies were used, see section antibodies.
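the background correction and rfi calculation described above for the screening data can be sketched in a few lines. this is a minimal illustration, assuming that each spot is represented by its median f532 and b532 values and that the bottom-chamber signal is background-corrected in the same way before subtraction; the function names are hypothetical, and the contrast formula itself is not reproduced here because it is not given in the text.

```python
from statistics import median


def background_corrected(f532_median, b532_median):
    """Spot signal corrected by its local background (median F532 - B532)."""
    return f532_median - b532_median


def rfi(top_spot, bottom_spot):
    """Relative fluorescence intensity of one sample.

    Each argument is an (f532_median, b532_median) pair for the same sample:
    top chamber = primary + secondary antibody, bottom chamber = secondary only.
    Assumption: both chambers are background-corrected before subtraction.
    """
    return background_corrected(*top_spot) - background_corrected(*bottom_spot)


def control_median_rfi(control_spots):
    """Median RFI over all (top_spot, bottom_spot) pairs of a control protein."""
    return median(rfi(top, bottom) for top, bottom in control_spots)


if __name__ == "__main__":
    # Made-up intensities, for illustration only.
    print(rfi((1200, 200), (300, 200)))
```

subtracting the secondary-only chamber per sample keeps non-specific binding of the secondary antibody out of the signal that is later compared with the argc or pyrc control.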
for testing of immunodominant characteristics with elisa, the crude lysate was first purified using halolink™ magnetic beads (promega) following the manufacturer's instructions. the proteins of interest were subsequently cleaved off by digestion with protev protease (promega) and concentration was determined by nanodrop measurements. the samples were diluted to a total protein content of 20 µg/ml in pbs and 50 µl of each sample was added to maxisorb plates (nunc). each sample was analyzed at least in triplicate. the elisa plate was covered with a lid and incubated overnight at 4 °c in a humidity chamber. after five washing steps each with pbs + 0.05% tween-20 (pbst), the plates were blocked using 200 µl 5% non-fat dried milk in pbs per well for 2 h. afterwards, plates were washed three times with pbst. 100 µl of primary antibody solution (c = 4 µg/ml) in pbs containing 1% non-fat dried milk were applied to each well using the respective desired antibody or pbs for controls. the plates were incubated for 2 h at room temperature and washed four times with pbst. next, 100 µl of conjugated secondary antibody (goat polyclonal to rabbit igg conjugated with horseradish peroxidase, abcam ab6721, c = 20 ng/ml) were added to each well and incubation continued for 1 h. finally, plates were washed once again four times with pbst and 100 µl 3,3',5,5'-tetramethylbenzidine (tmb, sigma-aldrich) was added to each well for detection. after 30 min of incubation at room temperature in the dark, the reaction was stopped by applying 100 µl of 2 m h2so4 to each well. the optical density of each well was measured using the omega fluostar (bmg labtech) at a wavelength of 450 nm. primers were designed using primer3 [62] within geneious pro 5.6.5 [63]. the sequenced inserts were identified by blast [64]. peptide sequence secondary structures were predicted using the emboss garnier [65] algorithm and the transmembrane regions predicted by tmhmm2.0 [66, 67].
antigenic sites were predicted by emboss antigenic [68, 69]. data evaluation was performed with originpro 8g (originlab) and microsoft excel. 3-dimensional structure predictions were performed using the swiss model automated mode [70] [71] [72] [73] [74] and pdb files were visualized and analyzed by the ucsf chimera package [75]. chimera is developed by the resource for biocomputing, visualization, and informatics at the university of california, san francisco (supported by nigms p41-gm103311). analysis of full-length proteins was achieved by combining the results from elisa and microarray data. first, the rfi of each sample was calculated. next, a normalized rfi was generated by dividing the rfi of each sample by the median rfi of all the samples within an area of interest, i.e. an incubation compartment. from these normalized rfis a median and standard deviation were calculated. if the median normalized rfi of the positive control was below the median normalized rfi of any of the negative references whilst considering the standard deviations, the test was rendered invalid. if the test passed the above criterion, the q values were calculated as follows: [...] with $\widetilde{rfi}_{sample}$ the median of normalized rfis of the sample and $\widetilde{rfi}_{pos.control}$ the median of the normalized rfis of the positive control, hisj or cjaa respectively. the resulting error was calculated by error propagation according to gauss. finally, incorporating all valid tests, the mean q value was determined along with its resulting error following error propagation by gauss, see table 1. several proteins were chosen for epitope mapping. these were the proteins encoded by cj0467, cj1723, cj0669, cj1380, cj0920, cj1575 and cj0623. the proteins were divided in silico into 15-mer oligopeptides with an overlap of 11 amino acids. the synthesis and coupling to microarray slides was performed externally by jpt peptide technologies gmbh. each peptide sequence was applied 9 times to one slide.
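the in silico division of a protein into 15-mer oligopeptides with an overlap of 11 amino acids corresponds to a sliding window that advances by 4 residues. the sketch below illustrates this; the function name is hypothetical, the example sequence is arbitrary, and the handling of the c-terminus (adding one final window so that the last residues are covered) is an assumption, since the text does not say how the end of each protein was treated.

```python
def tile_peptides(sequence, length=15, overlap=11):
    """Split a protein sequence into overlapping peptides.

    Consecutive peptides share `overlap` residues, so the window advances
    by `length - overlap` residues (4 for 15-mers overlapping by 11).
    Assumption: a final window is appended if the regular grid would leave
    the C-terminal residues uncovered.
    """
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    if len(sequence) <= length:
        return [sequence]
    step = length - overlap
    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # cover the C-terminus
    return [sequence[s:s + length] for s in starts]


if __name__ == "__main__":
    # Arbitrary 25-residue example sequence, not one of the mapped proteins.
    for peptide in tile_peptides("ACDEFGHIKLMNPQRSTVWYACDEF"):
        print(peptide)
```

with this step size every residue away from the termini appears in several neighbouring peptides, which helps to localise a linear epitope from the signals of adjacent spots.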
the slides were used with the proplate 3-well chamber system (grace), allowing for incubation with different antibodies. first, the slides were blocked with superblock blocking buffer (thermo fisher) for 2 h, washed five times with pbs + 0.05% tween-20, primary antibodies applied, incubated overnight at 4 °c with mild rocking, washed again, secondary antibodies applied for 2 h in the dark and, after a final washing procedure, dried and scanned as above. three different primary antibodies to c. jejuni were tested. the bottom chamber was always used as a control chamber, incubated only with secondary antibody. the peptide tlikelkrlgi and 11 derivatives thereof, each substituting one amino acid for alanine, were synthesized and coupled to microarray slides by jpt peptide technologies gmbh. each peptide was applied 9 times to one slide. the incubation procedure was performed as described above for epitope mapping. the expression of the desired halotag® fusion proteins was checked by sds-page. after lysis of cells, 2 µl of each protein extract was mixed with 1 µl of 10 µm halotag® alexa 488 ligand. after addition of 7 µl 1x tbs (100 mm tris, 150 mm nacl, ph 7.6) the reaction was incubated at room temperature for 30 minutes. 2 µl of each reaction were removed, mixed with 8 µl of 5x loading buffer (fermentas) and 1 µl dtt and heated for 5 min at 70 °c. the separation was performed on a mini-protean tgx gel (biorad, any kd, 15 wells) in a protean ii xi cell chamber (biorad) for 30 min at 200 v. benchmark fluorescent protein standard (life technologies) was used as a size reference. fluorescence was measured in a fla-5100 (fujifilm) with excitation at 473 nm.
figure s1 rin. electropherogram and rna integrity number (rin) for sample 5, a total rna isolated from c. jejuni nctc 11168, after analysis using the rna 6000 pico kit and the agilent bioanalyzer 2100. the rin equals 9 and the ratio of 23s to 16s rrna is 1.8. on the right-hand side, a virtual gel picture is presented as calculated by the agilent expert 2100 software. (tif)
figure s2 likelkrlgi is conserved in c. jejuni. the protein sequence of cj0669 was blasted and subsequently aligned according to a blosum62 matrix. as a reference sequence tlikelkrlgi of q0pak5, the protein encoded by cj0669 from c. jejuni nctc 11168, was used. the dots indicate an agreement to the reference, while differences are given by the one letter amino acid code. the first 13 sequences including the reference are 100% identical and are all derived from c. jejuni proteins. for other species of campylobacter such as c. coli or c. upsaliensis several of the residues are replaced. for other organisms, especially helicobacter pylori, the degree of replaced residues increases. the positions showing the most conservation throughout are residues 3, 6, 9 and 11. however, residue 10, the glycine which was revealed to be paramount for the binding of the antibody, is not found in helicobacter pylori proteins. (pdf)
figure s10 specific vs. non-specific binding to potential linear epitopes of cj0920c. the bars represent the mean values (n = 15) of rfi values after incubation with polyclonal antibody to c. jejuni (green) and salmonella enterica (orange). the mean values for each peptide fall within the same range or possess overlapping standard deviations. thus, no specific interaction of the antibody to the epitope is present; rather a non-specific binding seems likely. (tif)
figure s11 binding specificity assay of tlikelkrlgi with anti-salmonella antibodies. the different sequences tested in alanine scanning are shown in the box-whisker-plot (n = 15) with each box representing 50% of the values. the whiskers encompass 98% of the values, the median is indicated by a horizontal line and the mean represented by a small rectangle. the 12 boxes in green on the left represent the results after incubation with polyclonal antibody to s. enterica.
for comparison, the two red boxes show the original signals from fig. 4 for the sequence tlikelkrlgi as well as tlikelkrlai, after incubation with polyclonal antibodies to c. jejuni. all the green boxes fall into the same range as the altered sequence tlikelkrlai, where alanine replaced the glycine residue, which possessed only 10% intensity of the original sequence. thus, no specific interaction of the antibody to the epitope is present; rather a non-specific binding seems likely. (tif)
figure s12 primers used in this study. the name of each primer, the corresponding target gene or vector and its sequence in 5' to 3' direction is given. for each gene, f represents the forward primer and r the reverse. the primers were used for cloning in the in-fusion smarter™ directional cdna library construction kit. (xls)
proposal for a new family epidemiologisches bulletin 03 campylobacter jejuni and related species campylobacter and guillain-barré syndrome genetic basis for the variation in the lipooligosaccharide outer core of campylobacter jejuni and possible association of glycotransferase genes with post-infectious neuropathies risk factors for sporadic campylobacter infection in the united states. 
a case-control study in foodnet sites typing of campylobacter jejuni and campylobacter coli isolated from live broilers and retail broiler meat by flaa-rflp, mlst, pfge and rep-pcr rapid pulsed-field gel electrophoresis protocol for subtyping of campylobacter jejuni evaluation of three commercial latex agglutination tests for identification of campylobacter spp molecular cloning: a laboratory manual, third edition severe acute respiratory syndrome diagnostics using a coronavirus protein microarray immunogenic cross-reaction among outer membrane proteins of gram-negative bacteria microarray-based method for screening of immunogenic proteins from bacteria proteinprotein interactions: analysis of a false positive gst pulldown result halotag: a novel protein labeling technology for cell imaging and protein analysis validation of two ribosomal rna removal methods for microbial metatranscriptomics polyadenylic acid sequences in e. coli messenger rna identification of the gene for an escherichia coli poly(a) polymerase a simple method to enrich mrna from prokaryotic rna magnetic capturehybridization method for purification and probing of mrna for neutral protease of bacillus cereus direct detection of recombinant gene expression by two genetically engineered yeasts in soil on the transcriptional and translational level normalization of full-length-enriched cdna duplex-specific nuclease efficiently removes rrna for prokaryotic rna-seq ligation-independent cloning of pcr products (lic-pcr) high efficiency transformation of e. 
coli by high voltage electroporation the campylobacter jejuni/coli cjaa (cj0982c) gene encodes an n-glycosylated lipoprotein localized in the inner membrane genetic diversity of the campylobacter genes coding immunodominant proteins immunogenicity and immunoprotection of recombinant peb1 in campylobacter-jejuni-infected mice the national center for biotechnology information's protein clusters database characterization of genetically matched isolates of campylobacter jejuni reveals that mutations in genes involved in flagellar biosynthesis alter the organism's virulence potential nutrient acquisition and metabolism by campylobacter jejuni identification of campylobacter jejuni genes contributing to acid adaptation by transcriptional profiling and genome-wide mutagenesis structure, function, and evolution of bacterial atp-binding cassette systems the protein data bank: a computer-based archival file for macromolecular structures the 2.0 crystal structure of abc transporter from thermatoga maritima toward the estimation of the absolute quality of individual protein structure models construction and evaluation of normalized cdna libraries enriched with full-length sequences for rapid discovery of new genes from sisal (agava sisalana perr.) 
different development stages construction of normalized rna-seq libraries for next-generation sequencing using the crab duplex-specific nuclease atp-binding cassette transporters are targets for the development of antibacterial vaccines and therapies a glycine-rich bovine herpesvirus 5 (bhv-5) ge-specific epitope within the ectodomain is important for bhv-5 neurovirulence identical igm antibodies recognizing a glycine-alanine epitope are induced during acute infection with epstein-barr virus and cytomegalovirus glycine-rich cell wall proteins act as specific antigen targets in autoimmune and food allergic disorders epitope mapping x-ray crystallography of antibodies b-cell epitopes: fact and fiction antigenic diversity among helicobacter pylori vacuolating toxins functional peptide microarrays for specific and sensitive antibody diagnostics serodiagnosis of echinococcus spp. infection: explorative selection of diagnostic antigens by peptide microarray three-dimensional structure of an antibody-antigen-complex epitope mapping using the x-ray crystallographic structure of complement receptor type 2 (cr2)/cd21: identification of a highly inhibitory monoclonal antibody that directly recognizes the cr2-c3d interface high-resolution epitope mapping of hghreceptor interactions by alanine-scanning mutations characterization of an anti-borrelia burgdorferi ospa conformational epitope by limited proteolysis of monoclonal antibody-bound antigen and mass spectrometric peptide mapping functional reconstruction and synthetic mimicry of a conformational epitope using clips technology diagnostic peptide discovery: prioritization of pathogen diagnostic markers using multiple features best: improved prediction of b-cell epitopes from antigen sequences an introduction to epitope prediction methods and software immunoinformatics and the in silico prediction of immunogenicity. 
an introduction benchmarking b cell epitope prediction: underperformance of existing methods the rin: an rna integrity number for assigning integrity values to rna measurements gene expression omnibus: ncbi gene expression and hybridization array data repository primer3 on the www for general users and for biologist programmers geneious v5 basic local alignment search tool analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins predicting transmembrane protein topology with a hidden markov model: application to complete genomes a hidden markov model for predicting transmembrane helices in protein sequences a semi-empirical method for prediction of antigenic determinants on protein antigens new hydrophilicity scale derived from high-performance liquid chromatography peptide retention data: correlation of predicted surface residues with antigenicity and x-ray derived accessible sites the swiss-model workspace: a web-based environment for protein structure homology modeling the swiss-model repository and associated resources swiss-model: an automated protein homology-modeling server swiss-model and the swiss-pdbviewer: an environment for comparative protein modeling protein modeling by e-mail ucsf chimera -a visualization system for exploratory research and analysis the c. jejuni strain nctc 11168 was a kind gift of the group of s. bereswill (department of microbiology and hygiene, charité -university medicine berlin, berlin, germany). sh is greatly indebted to martina obry for her assistance during expression library construction and clone isolation. the authors would like to thank simone aubele for technical assistance. we also gratefully acknowledge michaela schellhase for microarray printing. conceived and designed the experiments: sh mvnr. performed the experiments: sh. analyzed the data: sh ffb mvnr. contributed reagents/materials/analysis tools: sh. wrote the paper: sh ffb mvnr. 
key: cord-010260-8lnpujip authors: anthonsen, henrik w.; baptista, antónio; drabløs, finn; martel, paulo; petersen, steffen b. title: the blind watchmaker and rational protein engineering date: 1994-08-31 journal: j biotechnol doi: 10.1016/0168-1656(94)90152-x sha: doc_id: 10260 cord_uid: 8lnpujip in the present review some scientific areas of key importance for protein engineering are discussed, such as problems involved in deducting protein sequence from dna sequence (due to posttranscriptional editing, splicing and posttranslational modifications), modelling of protein structures by homology, nmr of large proteins (including probing the molecular surface with relaxation agents), simulation of protein structures by molecular dynamics and simulation of electrostatic effects in proteins (including ph-dependent effects). it is argued that all of these areas could be of key importance in most protein engineering projects, because they give access to increased and often unique information. in the last part of the review some potential areas for future applications of protein engineering approaches are discussed, such as non-conventional media, de novo design and nanotechnology. nature has evolved using several types of random mutations in the genetic material as a fundamental mechanism, thereby creating new versions of existing proteins. by natural selection nature has given a preference to organisms with proteins which directly or indirectly made them better adapted to their environment. thus nature works like a blind watchmaker, trying out an endless number of combinations. this may seem to be an inefficient approach by industrial standards, but nevertheless nature has been able to develop some highly complex and sophisticated designs, simply by the power of natural selection over millions of years, occurring in a large number of parallel processes. 
by virtue of reproduction several copies of each organism have been able to test the effect of different mutations in parallel. it is quite probable that the mutation frequency was higher in ancient species (doolittle, 1992), although it is still possible to find highly mutable loci in genes involved in adaptation to the environment (moxon et al., 1994). enzymes have been used by man for thousands of years for modification of biological molecules. the use of rennin (chymosin) in rennet for cheese production is a relevant example. and with increased knowledge about proteins, genes and other biological macromolecules scientists started to look at methods for making modified proteins with new or improved properties. at first this was done by speeding up nature's own approach, by increasing the number of mutations (e.g., by using chemicals or radiation) and by using a very strong selection based on tests for specific properties. with the introduction of new and powerful techniques for structure determination and site directed mutagenesis, it is now possible to do rational protein modification. rather than testing out a large number of random mutations, it has become feasible to identify key residues within the protein structure, to predict the effect of changing these residues, to implement these changes in the genetic material, and finally to produce large amounts of modified proteins. this is protein engineering.
fig. 1. starting with a protein with known sequence and properties, we make a 3-d model of the protein from experimental structure data or by homology. by modelling and simulation we identify mutations that will modify selected properties of the protein (the design part of the process), these mutations are implemented at the dna level and expressed in a suitable organism (the production part of the process), and the success of the design is verified by experimental methods.
there are several reviews describing the fundamental ideas in protein engineering, see fersht and winter (1992) for a recent one. the basic protein engineering process is shown in fig. 1 (see also petersen and martel (1994)). in most cases it starts out with an unmodified protein with well-characterised properties. for some reason we want to modify this protein. in the case of an enzyme we may want to make it more stable, alter the specificity or increase the catalytic activity. first we enter the design part of the protein engineering process. based on structural data we create a computer model of the protein. by a combination of molecular modelling and experimental methods the correlation between relevant properties and structural features is established, and changes affecting these properties can be identified and evaluated for implementation. in more and more cases the effect of these changes can be simulated, and the modifications can be optimised with respect to these simulations. as soon as a new design has been established we may enter the production part of the process. the necessary mutations must be implemented in the genetic material, this genetic material is introduced into a production organism, and the resulting modified protein can (in most cases) be extracted from a bioprocess. this protein can be tested with respect to relevant properties, and if necessary it may be used as a basis for re-entering the design part of the protein engineering process. after a few iterations we may reach an optimal design. there are several examples of successful protein engineering projects. protein engineering may be used to improve protein stability (kaarsholm et al., 1993), enhance or modify specificity (getzoff et al., 1992; witkowski et al., 1994), adapt proteins to new environments (arnold, 1993; gupta, 1992), or to engineer novel regulation into enzymes (higaki et al., 1992).
in some cases even de novo design of new proteins may be relevant, using knowledge gained from existing structures (kamtekar et al., 1993; shakhnovich and gutin, 1993; ghadiri et al., 1993; ball, 1994). in a truly multidisciplinary project, chymosin mutants with optimal activity at increased ph values compared to wild-type chymosin were designed and produced (pitts et al., 1992). point mutations changing the charge distribution of superoxide dismutase have been used to increase reaction rate by improved electrostatic guidance (getzoff et al., 1992). a project on converting trypsin into chymotrypsin has been important for understanding the role of chymotrypsin surface loops (hedstrom et al., 1992), a serine active site hydrolase has been converted into a transferase by point mutations (witkowski et al., 1994), and mutations in insulin aiming at increased folding stability have given an insulin with enhanced biological activity (kaarsholm et al., 1993). an example of a rational de novo project (as opposed to the random approach used, e.g., in generation of catalytic antibodies) is the design of an enzymatic peptide catalysing the decarboxylation of oxaloacetate via an imine intermediate, in which a very simple design gave a three to four orders of magnitude faster formation of imine compared to simple amine catalysts. in some cases it may also be an interesting approach to incorporate nonpeptidic residues into otherwise normal proteins (baca et al., 1993), or to build de novo proteins by assembling peptidic building blocks on to a nonpeptidic template (tuchscherer et al., 1992). it has been shown that incorporation of nonpeptidic residues into β-turns of hiv-1 protease gives a more stable enzyme (baca et al., 1993). the main problem with this approach is how to incorporate the non-standard residues.
in the hiv-1 protease case solid-phase peptide synthesis combined with traditional organic synthesis was used; others have suggested that the degeneracy of the genetic code may be used to incorporate novel residues via the standard protein synthesis machinery of the cell (fahy, 1993). in the present review we will look at the design part of the protein engineering process, with emphasis on some of the more difficult steps, especially homology based modelling in cases with very low sequence similarity, nuclear magnetic resonance (nmr) of very large proteins and modelling of electrostatic interactions. in the last part of the review we will discuss some possible future directions for protein engineering and protein design. any protein engineering project is based on information about the protein sequence. this information may stem from either direct protein sequencing or a deduced translation of the dna/rna sequence. the amount of information on protein and nucleic acid sequences, as well as on relevant data like 3-d structures and disease-related mutations, is growing at a very rapid pace, and novel databases and computer tools give increased access to these data (coulson, 1993). it is very reasonable to expect that projects like the human genome project will succeed in providing us with sequence information about every single gene in our chromosomes within the next decade. this information will be of key importance for our understanding of the biology, development and evolution of man. it should, however, be kept in mind that the sequence itself may give us little information about regulation of gene expression, i.e., under what conditions genes are expressed, if they are expressed at all. most protein sequences have been deduced from gene sequences. it is in most cases a priori assumed that a trivial mapping exists between the two sets of information. however, this may not necessarily be the case. in fig. 2, the various steps currently recognised as being of importance for the production of the mature enzyme are shown, and several of these steps may affect the mapping from gene to protein.
fig. 2. after transcription, the mrna may be edited, a process that now has been reported in man, plants and primitive organisms (trypanosoma brucei). the mrna is then translated into a protein sequence. this protein sequence can subsequently be modified, leading to n- or o-glycosylation, phosphorylation, sulfation or the covalent attachment of fatty acid moieties to the protein. at this stage the protein is ready for transport to its final destination, which may be right where it is at the time of synthesis, but the destination may also be extracellular or in secluded compartments such as the mitochondria or lysosomes. in this case the protein is equipped with a signal sequence. after arrival at its destination the protein is processed, often involving proteolytic cleavage of the signal sequence. shorter routes to the functional protein with fewer steps undoubtedly exist, as well as routes with interchanged steps of processing. finally, the catabolism of the protein is also part of the process, but has been left out in the figure.
posttranscriptional editing comprises modifications at the mrna level affecting the mapping of information from gene to protein, often involving modification, insertion or deletion of individual nucleotides at specific positions (cattaneo, 1994). currently only speculative models exist for the underlying molecular mechanism(s) for posttranscriptional editing. in the case of mammalian apolipoprotein b two forms exist, both originating from a single gene. the shorter form, apo b48, arises by a posttranscriptional mrna editing whereby cytidine deamination produces a uaa termination codon (teng et al., 1993).
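the effect of such a c to u edit on the gene to protein mapping can be illustrated with a small sketch. the transcript below is a hypothetical toy sequence, not the real apo b mrna, and the function names are illustrative only; the codon table is the standard genetic code. a single c to u change converts a caa (glutamine) codon into uaa, so the edited message yields a truncated product.

```python
# Standard genetic code for RNA codons, built in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(mrna):
    """Translate from the first codon up to (not including) the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":  # stop codon reached
            break
        protein.append(aa)
    return "".join(protein)


def edit_c_to_u(mrna, position):
    """Return a copy of mrna with the C at the 0-based position deaminated to U."""
    if mrna[position] != "C":
        raise ValueError("editing site must be a cytidine")
    return mrna[:position] + "U" + mrna[position + 1:]


# Hypothetical toy transcript (assumption, for illustration): M-A-Q-L-K-stop.
unedited = "AUGGCUCAACUGAAAUAA"
# Editing the first base of the CAA codon gives UAA, an in-frame stop.
edited = edit_c_to_u(unedited, 6)

print(translate(unedited))  # MAQLK
print(translate(edited))    # MA
```

a translation deduced from the gene sequence alone corresponds to the unedited message, which is why editing sites have to be known before the mature protein sequence can be inferred.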
in the ampa receptor subunit glur-b mrna editing is responsible for changing a glutamine codon (cag) into an arginine codon (cgg) (higuchi et al., 1993). this editing has a pronounced effect on the ca2+ permeability of the ampa receptor channel, and it seems to be controlled by the intron-exon structure of the rna. similar mrna editing has been reported in the related kainate receptor subunits glur-5 and glur-6, where two additional codons in the first trans-membrane region are altered (sommer et al., 1991; köhler et al., 1993). it is also interesting in this context that certain human genetic diseases have been related to reiteration of the codon cag (green, 1993). mrna editing in plant mitochondria and chloroplasts has also been reported (gray and covello, 1993). here the posttranscriptional mrna editing consists almost exclusively of c to u substitutions. editing occurs predominantly inside coding regions, mostly at isolated c residues, and usually at the first or second position of the codons, thus almost always changing the amino acid compared to that specified by the unedited codon. in trypanosoma brucei some extensive and well-documented cases of posttranscriptional editing have been reported (read et al., 1992; harris et al., 1992; adler et al., 1991). the editing takes place at the mitochondrial transcript level, where a large number of uridine residues are added to or deleted from the mrna, which is subsequently translated. several non-editing processes affecting the transcription/translation steps are also known. although the ribosomes translate the message provided by the mrna in an almost perfect manner (with error rates less than 5 × 10^-4 per amino acid incorporated), it appears that the mrna in certain cases contains information that forces the ribosome to read the nucleic acid information in a non-canonical fashion (farabaugh, 1993).
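to give a feeling for what an error rate of 5 × 10^-4 per incorporated amino acid means for a whole protein, the following minimal sketch computes the fraction of chains expected to be free of translation errors. it assumes that errors at different positions are independent and uses the quoted upper bound as the rate, so the result is a lower bound under that assumption; the function name and the chain lengths are illustrative.

```python
def error_free_fraction(length, error_rate=5e-4):
    """Fraction of chains of the given length translated without any error,
    assuming independent errors at each position (a simplifying assumption)."""
    return (1.0 - error_rate) ** length


for n_residues in (100, 300, 1000):
    print(n_residues, round(error_free_fraction(n_residues), 3))
```

even at this low per-residue rate a noticeable fraction of long chains would carry at least one misincorporated residue, but such errors are random, unlike the programmed non-canonical reading events discussed in the text.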
a special case that may deserve some attention as well is the selenoproteins, where selenocysteine is introduced into the protein by an alternative interpretation of selected codons (böck et al., 1991; yoshimura et al., 1990; farabaugh, 1993). translational frameshifting has been found in retroviruses, coronaviruses, transposons and a prokaryotic gene, leading to different translations of the same gene. two cases of translational 'hops' have been reported, where a segment of the mrna is skipped by all ribosomes; in the two cases 50 and 500 nucleotides were skipped, respectively (farabaugh, 1993). to our knowledge posttranscriptional editing and related processes are uncommon but definitely present in humans. it is, therefore, important to understand precisely how these mechanisms work, in order to correctly deduce the protein sequence from the gene sequence. the most common posttranslational modifications are side chain modifications like phosphorylations, glycosylations and farnesylations, as well as others. however, some modifications may also affect the (apparent) gene to protein mapping. posttranslational processing may involve removal of both terminal and internal protein sequence fragments. in the latter case an internal protein region is removed from a protein precursor, and the external domains are joined to form a mature protein (hodges et al., 1992; xu et al., 1993). interestingly, all intervening protein sequences reported so far have sequence similarity to homing endonucleases (doolittle, 1993), which also can be found in coding regions of group i introns (grivell, 1994). posttranslational modifications like phosphorylation, glycosylation, sulfation, methylation, farnesylation, prenylation, myristylation and hydroxylation should also be considered in this context. they modify properties of individual residues and of the protein, and may thus make surface prediction, dynamics simulations and structural modelling in general more complex.
the residues that are specifically prone to such modifications are tyrosine (phosphorylation and sulfation), serine and threonine (o-glycosylation), asparagine (n-glycosylation), proline and lysine (hydroxylation) and lysine (methylation). in addition glutamic acid residues can become γ-carboxylated, leading to high affinity towards calcium ions (alberts, 1983). specific transferases are involved in the modification, e.g., tyrosylprotein sulfotransferases (suiko et al., 1992) and farnesyl-protein transferases (omer et al., 1993). phosphorylation of amino acid residues is an important way of controlling the enzymatic function of key enzymes in the metabolic and signalling pathways. tyrosine kinases phosphorylate tyrosine residues, thus introducing an electrostatic charge at a residue which is uncharged under normal physiological ph. phosphorylation is central to the function of many receptors, such as the insulin and insulin-like growth factor i receptors. given the possibility that several modifications may be introduced in the sequence when we move from gene to mature protein, the task of deducing a protein sequence from the gene sequence may be more complex than we normally assume. although the protein sequence itself is a valuable starting point, the optimal basis for a rational protein engineering project will be a full structure determination of the protein. in many cases this turns out to be an expensive and time-consuming part of the project. most structure determinations are based on x-ray crystallography. this approach may give structures of atomic resolution, but is limited by the fact that stable high quality crystals are needed. many proteins are very difficult to crystallise, in particular many structural and membrane-associated proteins.
a large number of important x-ray structures have been published over the last few years, and the structures of the hhai methylase (klimasauskas et al., 1994), the tbp/tata-box complex (kim et al., 1993a; kim et al., 1993b) and the porcine ribonuclease inhibitor (kobe and deisenhofer, 1993) are mentioned as examples only. nmr may be an alternative in many cases, as the proteins can be studied in solution, and for some experiments they can even be membrane associated. however, nmr is limited to relatively small molecules, and even with incorporation of labelling in the protein the upper limit for a full structure determination using current state of the art methods seems to be close to 30 kda. some novel techniques for studying structural aspects of larger proteins will be discussed (vide infra). representative examples of important nmr structures may be interleukin 1β (clore et al., 1991a), the glucose permease iia domain (fairbrother et al., 1992) and the human retinoic acid receptor-β dna-binding domain (knegtel et al., 1993). cryo electron microscopy (cem) is a relatively new approach to protein structure determination. the resolution of the structures is still lower than that of the corresponding x-ray structures, and a 2-dimensional crystal is a prerequisite. despite this, cem appears to be a very promising approach to structure determination of membrane associated proteins that can form 2-dimensional crystals. cem has been used to study the nicotinic acetylcholine receptor at 9 Å resolution (unwin, 1993) and the atp-driven calcium pump at 14 Å resolution (toyoshima et al., 1993), and in a combined approach using high resolution x-ray data superimposed on cem data the structures of the actin-myosin complex (rayment et al., 1993) and of the adenovirus capsid (stewart et al., 1993) have been studied. the recent structure by kühlbrandt et al. (1994) of the chlorophyll a/b-protein complex at 3.4 Å resolution shows that the resolution of cem is rapidly approaching the resolution of most x-ray protein data.
scanning tunnelling microscopy (stm) is another new approach for studying protein structures (amrein and gross, 1992; lewerenz et al., 1992; haggerty and lenhoff, 1993). the method is interesting because of a very high sensitivity, as individual molecules may be examined. the method will give a representation of the surface of the molecule, rather than a full structure determination. however, it is possible that both cem and stm can be used for identification of protein similarity. if data from these methods show that the overall shape of a protein is similar to some other known high resolution protein structure, then the known structure may be evaluated as a potential template for homology based modelling. we believe that such a model can either be used as an improved starting point for a full structure determination (i.e., for doing molecular replacement on x-ray data), or as a low resolution structure determination by itself. in homology based modelling a known structure is used as a template for modelling the structure of a homologous sequence, based on the assumption that the structures are similar. this is a very simple and rapid process, compared to a full structure determination. the sequences may be homologous in the strict sense, meaning that there is an evolutionary relationship between the sequences. the same approach may obviously also be used for sequences that are similar, but not necessarily evolutionarily related, and in that case we probably should talk about similarity based modelling. however, in this paper we will use homology based modelling as a general term, especially since the distinction between homology and similarity may be difficult in many cases. homology based modelling may turn out to be essential for the future of protein engineering. in fig.
3, the number of entries in the swissprot protein sequence database (bairoch and boeckmann, 1992) and the brookhaven protein structure database (bernstein et al., 1977; abola et al., 1987) are shown as a function of time. as we can see, there is a very significant gap between the number of sequences and the number of structures. this gap is in fact even larger than shown in fig. 3, as not all entries in the brookhaven database are unique structures. a large number of entries are mutants of other structures or identical proteins with different substrates or inhibitors. there has been an exceptional growth in the number of protein structures over the last 2-3 years. however, it is unrealistic to assume that we will be able to get high resolution experimental structures of all known proteins. the structure determination process is too time consuming, and the sequence databases are growing at a far faster pace, as shown in fig. 3, especially as a consequence of several large-scale genome sequencing projects. on the other hand, it may not really be necessary to do experimental structure determination of all proteins (ring and cohen, 1993). the assumption that similar sequences have similar structures (see fig. 4) has been proved valid several times and it seems to be true even for short peptide sequences as long as they come from proteins within the same general folding class. an interesting case which is to some degree an exception to this rule is the structure of hiv-1 reverse transcriptase (kohlstaedt et al., 1992). two units with identical sequence have similar secondary structure, but very different tertiary structure. however, this seems to be a rather exceptional case.
fig. 4. sequence and structure similarity (axes: sequence distance and structure distance). in most cases similar sequences have similar structures (region 1), and dissimilar sequences (i.e., measured by a standard mutation matrix) have dissimilar structures (region 2). in several cases quite dissimilar sequences have been shown to have very similar structures, at least with respect to individual domains (region 3). in very special cases we may have similar sequences with different structures (region 4), at least with respect to tertiary structure, showing that environment and binding to other proteins may be essential for the final conformation in some cases. however, in most cases it seems to be safe to assume that structures can be found in the lower grey triangle of this graph, indicating that structure is better conserved than sequence.
new approaches to general structure alignment (orengo ...; holm et al., 1992; alexandrov and go, 1993; lessel and schomburg, 1993) have made it possible to search for structurally conserved domains in proteins with very low sequence similarity (swindells, 1994). this is an important approach, as structure normally is better conserved than sequence (doolittle, 1992). several cases have been identified where the sequences are very different (at least by traditional similarity measures), whereas the three-dimensional structures are surprisingly similar. the identification of a globin fold in a bacterial toxin (holm and sander, 1993), and the similarity between the dsba protein and thioredoxin (martin et al., 1993) are relevant examples. recently the structure of the human serum amyloid p component was shown to be similar to concanavalin a and pea lectin, despite only 11% sequence identity (emsley et al., 1994), and the similarity between hen egg-white lysozyme and a lysozyme-like domain in bacterial muramidase "is remarkable in view of the absence of any significant sequence homology", as noted by thunnissen et al. (1994).
this shows that there probably is a limited number of protein folds, and this number must be lower than the number of sequence classes, defined as groups of similar protein sequences. recent estimates show that this number probably is close to 1000 different protein folds (chothia, 1992), and approx. 160 of these folds are known so far (burley, 1994; orengo et al., 1993). this means that rather than full structure determination of a very large number of proteins, it may be sufficient to do structure determination of only a few selected examples of each protein fold, and use this as a basis for homology based modelling of other proteins shown to have the same fold. homology based modelling of the 3-d structure of a novel sequence can be divided into several steps. first, one or more templates must be identified, defined as known protein structures assumed to have the same fold as the trial sequence. then a sequence alignment between trial sequence and template is defined, and based upon this alignment an initial trial model can be built. this initial model must be refined in several steps, taking care of gap splicing, loops, side chain packing etc. the final model can be evaluated by several quality criteria for protein structures. an example of homology based modelling is the modelling of cinnamyl alcohol dehydrogenase based on the structure of alcohol dehydrogenase (mckie et al., 1993). the protein folding problem is a fundamental problem in structural biology. this problem can be defined as the ab initio computation of a protein's tertiary structure starting from the protein sequence. this problem has not been solved and appears to be extremely difficult. if we want to solve the problem by computing an energy term for all conformations of a protein, defined by rotation around the φ and ψ backbone angles of n residues in 10 degree steps, we have to evaluate 36^(2(n-1)) alternatives, even without considering the side chains.
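the size of this search space is easy to check numerically. the sketch below evaluates 36^(2(n-1)) for a 15-residue peptide and converts it into a run time for a hypothetical machine with 10^6 processors, each evaluating 10^15 conformations per second, which are the parameters of the text's own thought experiment. the function names are illustrative, and a year is taken as 365.25 days.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def n_conformations(n_residues, step_degrees=10):
    """Backbone conformations when both angles of each of the n-1 residue
    pairs are sampled in fixed angular steps (side chains ignored)."""
    states_per_angle = 360 // step_degrees  # 36 states for 10 degree steps
    return states_per_angle ** (2 * (n_residues - 1))


def years_to_enumerate(n_residues, processors=10**6, rate_hz=10**15):
    """Years needed if every processor evaluates one conformation per cycle."""
    seconds = n_conformations(n_residues) / (processors * rate_hz)
    return seconds / SECONDS_PER_YEAR


count = n_conformations(15)
print(f"conformations: {count:.2e}")
print(f"years needed:  {years_to_enumerate(15):.2e}")
```

the exact count is about 4 × 10^43, which rounds to 10^44; with the rounded figure the run time becomes roughly 3 × 10^15 years, and with the exact count about 10^15 years. either way the order of magnitude is far beyond any realistic computation.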
for a peptide with 15 residues this corresponds to 10^44 conformations. a hypothetical computer with 10^6 processors, each processor running at 10^15 hz (the frequency of uv light) and completing the energy evaluation of one conformation per cycle, would need 3 × 10^15 years in order to test all conformations. the estimated age of the universe is 14 × 10^9 years. a more realistic approach is the use of molecular dynamics or monte carlo methods for simulation of protein folding. however, it is still very difficult to use this as an ab initio approach, both because folding is a very slow process compared to a realistic simulation time scale, and also because it is very difficult to distinguish between correctly and incorrectly folded structures using standard molecular mechanics force fields (novotny et al., 1984). a possible alternative approach may be to generate potential folds on a simplified lattice representation of possible residue positions (covell and jernigan, 1990; crippen, 1991). however, this approach is still very experimental. some progress has been achieved in the area of secondary (rather than tertiary) structure prediction (benner and gerloff, 1993). studies of local information content indicate that 65% match may be an upper limit for single-sequence prediction methods (rao et al., 1993), whereas methods taking homology data into account may probably raise this limit to approx. 85%. methods based on neural networks and combinations of several prediction schemes seem to give good predictions, and especially methods using homology data from multiple alignments may give predictions at 70% match or better in many cases (salzberg and cost, 1992; boscott et al., 1993; rost and sander, 1993a; rost et al., 1993; levin et al., 1993). also methods taking potential residue-residue interactions into account, like the hydrophobic cluster analysis (hca), may be used for identification of potential secondary structure elements (woodcock et al., 1992).
it has been shown that by restricting the prediction to a consensus region with stable conformation it is possible to make very reliable predictions (rooman and wodak, 1992). in one case, neural networks were shown to be capable of returning a limited amount of information on the tertiary structure (bohr et al., 1993).
fig. 5. structure retrieval by secondary structure. a flow chart for structure retrieval by secondary structure (right side) compared to retrieval by sequence (left side). please see the text for details. in this example the secondary structure library was generated using dssp (kabsch and sander, 1983), the secondary structure was predicted with the phd program (rost et al., 1994), and fasta (pearson, 1990; pearson and lipman, 1988) was used to search the secondary structure library and the nrl-3-d databases (namboodiri et al., 1988; george et al., 1986). only the secondary structure based method was able to identify the hla class i structure as similar to the class ii structure. the ribbon representation of the hla class i antigen binding region used in this figure was generated with molscript (kraulis, 1991).
be inconsistent, compared to the more sophisticated classification which can be achieved by a trained expert. recent studies show that the average agreement between alternative assignment methods used on identical structures is close to 65% for three methods (colloc'h et al., 1993), or 79% if only two methods are compared (woodcock et al., 1992). vadar is a new classification method which aims at a better agreement between manual and automatic assignment (wishart et al., 1994); to what degree this may influence prediction systems remains to be seen. over the last few years it has been realised that the inverse folding problem is much easier to solve (bowie et al., 1991; blundell and johnson, 1993; bowie and eisenberg, 1993).
the inverse folding problem can be defined as follows: given a known protein structure, identify all protein sequences which can be assumed to fold in the same way. a large number of protein structures must be available in order to use this as a general approach, as the relevant protein fold has to be represented in the database in order to be identified. however, with a limited number of possible folds actually used by nature, a complete database of all folds appears to be possible. some information about possible folding classes can be derived from experimental data. circular dichroism can be used as a crude way of measuring the relative amounts of secondary structure in a protein. classification methods based on amino acid composition can be used for classification of proteins into broad structural classes (zhou et al., 1992; chou and zhang, 1992; dubchak et al., 1993). this information may limit the number of different folds which have to be evaluated. it is also possible that such information may be used to improve the performance of other methods, although data on secondary structure prediction of all-helical proteins seems to indicate that the gain may be small (rost and sander, 1993b). however, for a unique identification of folding class more sensitive methods are needed, and the most useful one is probably some kind of protein sequence library search. in order to identify the folding class we have to search a database of known protein structures with our trial sequence. the problem is that standard methods for sequence retrieval may not be sensitive enough in all cases. if the sequences are similar, then retrieval is trivial. however, we know that there are cases where structures are known to be similar despite very different sequences. how can these cases be identified in a reliable way?
the most promising approaches are based on methods for describing the environment of each residue (bowie et al., 1991; eisenberg et al., 1992; overington et al., 1992; ouzounis et al., 1993; wilmanns and eisenberg, 1993; lüthy et al., 1994). this description can be used for generating a profile, showing to what degree each residue is found in a similar environment in other structures, and this profile can be used as a basis for sequence alignment and library searches. similar property profiles can also be used for searching database systems of protein structures (vriend et al., 1994).
a very simple approach can be used if we accept the hypothesis that protein sequences representing structures with a similar linear distribution of secondary structure elements may fold in a similar way. we can then create a sequence type library of known structures where the residues are coded by secondary structure codes rather than residue codes (see fig. 5). given the sequence of a protein with unknown 3-d structure, we can use a secondary structure prediction method and translate the sequence into a secondary structure description. if we define a suitable 'mutation' matrix describing the probability of interconversion between different secondary structure elements, then a standard library search program like fasta (pearson and lipman, 1988; pearson, 1990) can be used in order to identify potential template structures. the example shown in fig. 5 is the identification of hla class i as a suitable candidate for homology modelling of hla class ii. the sequence similarity is very low, 11% sequence identity in the antigen binding region (based on alignment of the structures), and especially for this region most sequence based methods will retrieve a large number of alternative sequences before any of the class i molecules. however, for the secondary structure based approach the hla class i sequences are retrieved as top candidates. the structure prediction did not include any information about the hla class ii structure, which recently has been published (brown et al., 1993).
it should be mentioned that the 11% sequence identity score is not significantly higher than the score from a random alignment of sequences. if we, for each sequence in the swissprot protein sequence library, align it against a sequence selected at random from the same library (alignment without gaps, using the full length of the shortest sequence, and starting the alignment at a random position within the longest sequence), then the average percentage of identical residues is (6 ± 6)% at 3 standard deviations.
the identification method using secondary structure is based on an assumption which has to be examined more closely, and the implementation of it is very crude. much work can be done on the secondary structure prediction, the 'mutation' matrix and the search method. it will probably improve the performance to use a position-dependent gap penalty, where most gaps are placed in loop regions rather than in helices or strands. however, the method is very simple to implement and test, as necessary tools and data already are available in most labs.
4.2. sequence alignment
as described in the introduction, a crucial feature in molecular evolution has been the parallel exploration of several different mutations. although mechanisms like horizontal gene transfer and intragenic recombination may have been important as key steps in the evolution of new proteins, the most common mechanism seems to have been gene duplication followed by mutational modification (doolittle, 1992). this means that especially multiple sequence alignment can give essential information about the mutation studies already performed by nature. conserved residues are normally conserved because they have an important structural or functional role in the protein, and identification of such residues will thus give vital information about structure and activity of a protein.
several tools have been developed for multiple alignment. a very attractive one is macaw (schuler et al., 1991), which will generate several alternative alignments of a given set of regions, and in a very visual way help the user to identify a reasonable combination of (sub)alignments. an even more general tool is multim (drabløs and . here all possible alignments, based on short motifs, are shown simultaneously, and the user is free to identify potential similarities even in cases with low sequence identity and very disperse motifs. this is possible because of the superior classification potential of the human brain compared to most automatic approaches. the method includes an option for probability based filtering of motifs, and an example of a multim alignment is shown in fig. 6.
fig. 6. multim alignment. alignment of inositol triphosphate specific phospholipase c β1 from rat (pip1_rat) against three other pip sequences from rat. each horizontal bar represents a sequence, marked in 50 residue intervals. black lines connecting the bars represent well conserved motifs found in all sequences, in this case subsequences of 8 residues where at least 4 residues are completely conserved in all 4 sequences. it is very easy to identify two well conserved regions, annotated as region x and y in the swissprot entries, despite a 400 residue insertion in two of the sequences. this insertion contains sh2 and sh3 domains (pawson, 1992). it is an interesting observation that the extra c-terminal domain of the pip1_rat sequence shows a weak similarity to myosin and tropomyosin sequences.
however, it is important to realise that in standard sequence alignment we are trying to solve a three-dimensional problem (residue interactions) by using an essentially one-dimensional method (alignment of linear protein sequences). as a consequence, important conserved through-space interactions may not be evident from a standard sequence alignment.
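the motif criterion quoted for the multim figure (windows of 8 residues in which at least 4 positions are identical in every sequence) can be illustrated with a small brute-force scan. this is only a sketch under stated assumptions: the function names and the toy sequences are hypothetical, and the exhaustive search over all offset combinations is not claimed to be how multim itself works.

```python
from itertools import product


def conserved_columns(windows):
    """Count the columns in which every window carries the same residue."""
    return sum(1 for column in zip(*windows) if len(set(column)) == 1)


def find_motifs(seqs, width=8, min_conserved=4):
    """Return tuples of start offsets (one per sequence) whose windows share
    at least `min_conserved` fully conserved columns.

    Brute force over all offset combinations: fine for toy input only.
    """
    starts = [range(len(s) - width + 1) for s in seqs]
    hits = []
    for offsets in product(*starts):
        windows = [s[o:o + width] for s, o in zip(seqs, offsets)]
        if conserved_columns(windows) >= min_conserved:
            hits.append(offsets)
    return hits


# Hypothetical toy sequences sharing a GDSGGP-like motif.
toy = ["AAGDSGGPLVKK", "TTGDSGGPAQ", "GDSGGPWW"]
print(find_motifs(toy))
```

each hit corresponds to one connecting line in a multim-style plot. a scan of this kind is purely sequence based, so it is subject to the limitation noted above: conserved through-space interactions remain invisible to it.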
a good example can be found in the alignment of lipases (schrag et al., 1992). in fig. 7, the sequence alignment of residues in a structurally conserved core of three lipases (rhizomucor miehei lipase (derewenda et al., 1992), candida antarctica b lipase (a. jones, personal communication) and human pancreatic lipase (winkler et al., 1990)) is shown. the active site residues, ser (s), asp (d) and his (h), are shown as black boxes. the ser and his residues are at identical positions. however, the asp residue of the pancreatic lipase is at a very different sequence position compared to the other two lipases. it would be very difficult to identify this as the active site asp from a sequence alignment. if we look at the structural alignment in fig. 8, we see that the positions are structurally equivalent: it is possible for all three lipases to have highly similar relative orientation of the active site atoms, despite the fact that the alternative asp positions are located at the end of two different β-strands. an improved alignment may be generated if we can incorporate 3-d data for at least one of the sequences in the linear alignment (gracy et al., 1993). however, in order to get a reliable alignment of sequences with low sequence similarity, we have to take true three-dimensional effects into account. this means that if we are able to identify a known 3-d structure as a potential basis for modelling, then the sequence alignment should be done in 3-d using this structure as a template. this can be done by threading the sequence through the structure and calculating pairwise interactions (jones et al., 1992; bryant and lawrence, 1993).
fig. 7. sequence alignment of lipases. alignment of structurally conserved regions of three lipases. for each lipase the solvent accessible surface in % compared to the gxg standard state (grey scale, white is buried and black is exposed), the secondary structure as defined by the dssp program, and the sequence is shown. the position of each subsequence in the full sequence is also shown. the active site residues are shown in white on black. please observe the shift of the active site asp (d) between two very different positions. the alignment was generated using alscript (barton, 1993).
as soon as a template has been identified, and an alignment between this template and a sequence has been defined, a 3-d model of the protein can be generated. we can either use the template coordinates directly, combined with different modelling approaches for the ill-defined regions, or the template can be used as a more general basis for folding the protein by distance geometry (srinivasan et al., 1993) or general molecular dynamics methods. loop regions are often highly variable, and must be treated with special approaches (topham et al., 1993). it is also necessary to consider the orientation of side chains. although the backbone may be well conserved, many residues, especially at the protein surface, will be mutated, as shown in fig. 9. the stability of a protein depends upon an optimal packing of residues, and it is important to optimise side chain conformation if we want to study protein stability and complex formation. a very common approach is the use of rotamer libraries combined with molecular dynamics refinement. recent studies show that this step of the modelling in fact may be less difficult than has been assumed.
however, in general no model (or even experimental structure) should be trusted beyond what can be verified by experimental methods. a protein model based on homology (or similarity) has to be verified in as many ways as possible, and experimental methods should always be preferred. mutation studies may give valuable information about active site residues and important interactions, and exposed regions may to some degree be identified by using antibodies. however, in many cases the rationale for modelling by homology is the very lack of experimental data related to structure, and we have to use other more general methods for evaluation of models. some of the approaches we already have described for sequence alignment can obviously also be used for evaluation of models. in general, model evaluation can be based on 3-d profiles (lüthy et al., 1992), contact profiles (ouzounis et al., 1993) or more general energy potentials (hendlich et al., 1990; jones et al., 1992; nishikawa and matsuo, 1993). some of these approaches have been implemented as programs for evaluation of structures or models, like procheck (laskowski et al., 1993) and prosa (sippl, 1993a, b).
fig. 8. ... in fig. 6, including parts of the sequences connecting the core regions. the active site asp is able to maintain a similar relative orientation, despite very different sequence positions. the alignment was generated using insight (biosym technologies).
a prerequisite for rational protein engineering is 3-d structure information about the protein. in addition to x-ray crystallography, nmr is the most important method for protein structure determination. x-ray crystallography has several advantages when compared to nmr. solving the crystal structure by x-ray crystallography is usually fast as soon as good crystals of the protein are obtained (even if it may not be so easy to obtain these crystals). it is also possible to determine the structure of very big proteins. the major disadvantage of x-ray crystallography is that it is the crystal structure that is determined. this implies that crystal contacts may distort the structure (chazin et al., 1988; wagner et al., 1987). since active sites and other binding sites usually are located on the surface of the proteins, very important regions of the protein may be distorted.
some structures even show large differences between nmr and x-ray structure (frey et al., 1985; klevit and waygood, 1986). the advantage of nmr is that it is dealing with protein molecules in solution, usually in an environment not too different from their natural one. it is possible to study the protein and the dynamical aspects of its interaction with other molecules like substrates, inhibitors, etc. it is also possible to obtain information about apparent pka values, hydrogen exchange rates, hydrogen bonding and conformational changes.
all nuclei contain protons, and therefore they carry charge. some nuclei also possess a nuclear spin. this creates a magnetic dipole, and the nuclei will be oriented with respect to an external magnetic field. the most commonly studied nuclei in protein nmr (1h, 13c and 15n) have two possible orientations, representing high and low energy states. the frequency of the transition between the two orientations is proportional to the magnetic field. at a magnetic field of 11.7 tesla the energy difference corresponds to about 500 mhz for protons. in an undisturbed system there will be an equilibrium population of the possible orientations, with a small difference in spin population between the high and low energy orientation. the equilibrium population can be perturbed by a radio frequency pulse of a frequency at or close to the transition frequency. in addition, the spins will be brought into phase coherence (concerted motion) and a detectable magnetisation will be created. the intensity of the nmr signal is proportional to the population difference between the levels the nuclei can possess. nuclei of the same type in different chemical and structural environments will experience different magnetic fields due to shielding from electrons. the shielding effect leads to different resonance frequencies for nuclei of the same type.
the shielding effect is measured as a difference in resonance frequency (in parts per million, ppm) between the nuclei of interest and a reference substance, and this is called the chemical shift. in molecules with low internal symmetry most atoms will experience different amounts of shielding, the resonance signals will be distributed over a well-defined range, and we get a typical nmr spectrum. the process that brings the magnetisation back to equilibrium may be divided into two parts, longitudinal and transverse relaxation. the longitudinal or t1 relaxation describes the time it takes to reach the equilibrium population. the transverse or t2 relaxation describes the time it takes before the induced phase coherence is lost. for macromolecules the t2 relaxation is always shorter than the t1 relaxation. short t2 relaxation leads to broad signals because of poor definition of the chemical shift. most molecules have dipoles with magnetic moment, and the most important cause of relaxation is fluctuation of the magnetic field caused by the brownian motion of molecular dipoles in the solution. how effectively a dipole relaxes the signal depends upon the size of the magnetic moment, the distance to the dipole, and the frequency distribution of the fluctuating dipoles. a nucleus may also detect the presence of nearby nuclei (less than three bonds apart), and this will split the nmr signal from the nucleus into more components. several nuclei in a coupling network are called a spin system. by applying radio frequency pulses it is possible to create and transfer magnetisation to different nuclei. it is, as an example, possible to create magnetisation at one nucleus, and transfer the magnetisation through bonds to other nuclei where it may be detected. the pulses are applied in a so-called pulse sequence (ernst, 1992; kessler et al., 1988).
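the quantities introduced above (transition frequency proportional to the field, the small population difference, the chemical shift in ppm, and the broadening caused by short t2) can be put into numbers with a few lines of python. this is a minimal sketch: the gyromagnetic ratios are approximate literature magnitudes, the function names are hypothetical, and the linewidth formula assumes an ideal lorentzian line.

```python
import math

PLANCK = 6.62607015e-34       # J s
BOLTZMANN = 1.380649e-23      # J / K

# Approximate gyromagnetic ratios divided by 2*pi, in MHz per tesla
# (magnitudes only; the sign for 15N is negative).
GAMMA_BAR_MHZ_PER_T = {"1H": 42.577, "13C": 10.708, "15N": 4.316}


def larmor_mhz(nucleus, field_tesla):
    """Transition (Larmor) frequency in MHz, proportional to the field."""
    return GAMMA_BAR_MHZ_PER_T[nucleus] * field_tesla


def population_excess(nucleus, field_tesla, temperature_k=298.0):
    """Fractional excess (N_low - N_high) / (N_low + N_high) for spin 1/2."""
    delta_e = PLANCK * larmor_mhz(nucleus, field_tesla) * 1.0e6
    return math.tanh(delta_e / (2.0 * BOLTZMANN * temperature_k))


def chemical_shift_ppm(freq_hz, reference_hz):
    """Chemical shift relative to a reference resonance, in ppm."""
    return (freq_hz - reference_hz) / reference_hz * 1.0e6


def linewidth_hz(t2_seconds):
    """Full width at half height of a Lorentzian line: 1 / (pi * T2)."""
    return 1.0 / (math.pi * t2_seconds)


print(larmor_mhz("1H", 11.7))          # close to the 500 MHz quoted above
print(population_excess("1H", 11.7))   # a few parts in 100 000
print(linewidth_hz(0.05), linewidth_hz(0.005))
```

the tiny population excess is the reason nmr is an insensitive method, and the last line shows how a tenfold shorter t2 gives a tenfold broader line.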
the methodology for determination of protein structure by two-dimensional nmr is described in several textbooks and review papers (wagner, 1990; wüthrich, 1986; wider et al., 1984). the standard method is based on two steps: sequential assignment (assignment of resonances from individual amino acids) and distance information (assignment of distance correlated peaks between different amino acids). the first step involves acquiring coupling correlated spectra (cosy, tocsy) in deuterium oxide to determine the spin system of correlated resonances. some amino acids have spin systems that in most cases make them easy to identify (gly, ala, thr, ile, val, leu). the other amino acids have to be grouped into several classes, due to identical spin systems, even though they are chemically different. the spin systems can be correlated to the nh proton by acquiring cosy and tocsy spectra in water. the assigned nh resonance is then used in distance correlated spectra (noesy) to assign correlations to protons (nh, hα, hβ) at the previous amino acid residue (fig. 10). by combining the knowledge of the primary sequence (which gives the spin system order) with the nmr data collected it is possible to complete the sequential assignment. when the sequential assignment is done the assignment of short range noe (up to four residues) will give information about secondary structure (α-helix, β-strand). long-range correlations will serve as constraints (together with scalar couplings) to determine the tertiary structure of the protein. excellent procedures describing these steps are available (roberts, 1993; wüthrich, 1986). with large proteins there will be spectral overlap of resonance lines. the problem is partially solved by labelling the protein with 13c and 15n isotopes. triple resonance multidimensional nmr methods (griesinger et al., 1989; kay et al., 1990) may then be applied.
the resonances will then be spread out in two more dimensions (13c and 15n) and the problem with overlap is reduced. these methods depend upon the use of scalar couplings to perform the sequential assignment; the sequential assignment procedure will then be less prone to error. the noesy spectra of such large proteins are often very crowded, but four-dimensional experiments like the 13c-13c edited noesy spectrum (clore et al., 1991b) have been designed. such experiments will spread the proton-proton distance correlated peaks by the chemical shift of their corresponding 13c neighbour and reduce the spectral overlap. secondary structure elements may also be predicted from the chemical shift of 1h and 13c (spera and bax, 1991; williamson and asakura, 1991; wishart et al., 1992). obtaining nmr spectra of proteins has some aspects that should be considered. spectral overlap. as we move to larger proteins the probability of overlap of resonance lines increases. at some point it will become impossible to do sequential assignment due to this overlap. application of 3-d and 4-d multiresonance nmr has made it possible to assign proteins in the 30 kda range (foght et al., 1994; stockman et al., 1992). fast relaxation. as the size of the protein is increased the rate of tumbling in solution is reduced. this leads to a reduced transverse relaxation time (t2), and broadening of the resonance lines in the nmr spectra. the intensities of the peaks are reduced and they may be difficult to detect. the short transverse relaxation time will also limit the length of the pulse sequences it is possible to apply (because there will be no phase coherence left), and multidimensional methods become difficult. the proteins for which it is possible to determine a 3-d structure by nmr or x-ray crystallography are probably a subset of all proteins (wagner, 1993). proteins may have regions with mobility and few cross peaks. the effective size of a protein is often increased by aggregation.
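the link between protein size and tumbling rate can be made concrete with the stokes-einstein-debye estimate of the rotational correlation time of a sphere. this is a rough sketch under stated assumptions: the protein is treated as a rigid sphere, the viscosity is set close to that of water, and the function name and radius are illustrative rather than taken from the text.

```python
import math

BOLTZMANN = 1.380649e-23  # J / K


def rotational_correlation_time(radius_m, viscosity_pa_s=1.0e-3,
                                temperature_k=298.0):
    """Stokes-Einstein-Debye estimate for a rigid sphere:
    tau_c = eta * V / (k * T), with V = 4/3 * pi * r**3."""
    volume = 4.0 / 3.0 * math.pi * radius_m ** 3
    return viscosity_pa_s * volume / (BOLTZMANN * temperature_k)


# Illustrative hydrodynamic radius of 2 nm (assumed value).
monomer = rotational_correlation_time(2.0e-9)
# A dimer of the same density has twice the volume.
dimer = rotational_correlation_time(2.0e-9 * 2.0 ** (1.0 / 3.0))
print(monomer, dimer, dimer / monomer)
```

because the correlation time scales with the volume, a dimer tumbles about twice as slowly as the monomer, which is why aggregation has the same effect on t2 and linewidth as an increase in molecular size.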
the amount of aggregation can often be reduced by reducing the protein concentration. thus, very often the degree of aggregation will determine whether it is possible to assign and solve a protein structure by nmr, by limiting the maximum concentration that may be used. the stability of the proteins is also a major issue. a sample may be left in solution for days, often at elevated temperatures, so denaturation may become a problem. photo-cidnp (chemically induced nuclear polarisation) is an interesting technique for the study of surface positioned aromatic residues in proteins (broadhurst et al., 1991; cassels et al., 1978; hore and kaptain, 1983; scheffier et al., 1985). by introducing a dye and exciting it with a laser, it is possible to transfer magnetisation to aromatic residues, where it can be observed. in addition to high-resolution nmr, solid state nmr has also been applied to studies of proteins. studies of active sites and conformation of bound inhibitors yield interesting information. the stability of proteins may be monitored under different conditions by detecting signals from transition intermediates bound to the active site (burke et al., 1992; gregory et al., 1993). structural constraints on transition state conformation of bound inhibitors can be obtained (auger et al., 1993; christensen and schaefer, 1993). structural constraints on the fold and conformation of the amino acid sequence may be gathered by setting upper and lower limits for distances between specific amino acids (mcdowell et al., 1993). using solid-state nmr it is also possible to study membrane proteins and their orientation with respect to their membrane (killian et al., 1992; ulrich et al., 1992). we expect such studies to give insight into ion channels in membranes (woolley and wallace, 1992). an important mechanism for relaxation in high-resolution nmr is dipolar relaxation.
usually this is induced by the spin of nuclei in the immediate vicinity, and it is a function of the size of the dipole. the electron is also a magnetic dipole, and the magnitude of this dipole is about 700 times that of a proton. paramagnetic compounds have an unpaired electron that will interact with nearby protons and increase the relaxation rate of these protons. the widest use of paramagnetic compounds has been of gd3+ bound to specific sites in a protein, but also other compounds have been used (chang et al., 1990; hernandez et al., 1990a, b). this will make it possible to identify resonance lines from residues in the vicinity of the binding site. it is also possible to calculate distances from the paramagnetic atom as the relaxation effect is distance dependent. the paramagnetic broadening effect can also be used with a compound moving freely in solution (drayney and kingsbury, 1981; esposito et al., 1992; petros et al., 1990). in this way residues located on or close to the protein surface will give broadened resonance lines compared to residues in the interior of the protein. this method can be used to measure important noe and chemical shifts inside the protein directly, or it can be used as a difference method to identify resonances at the surface by comparing spectra acquired with and without the paramagnetic relaxation agent (fig. 11).
fig. 11. the paramagnetic relaxation method. outline of the paramagnetic relaxation method. the protons located at the protein surface will be closer to the dissolved paramagnetic relaxation agent than the protons located inside the protein core, hence the resonance lines from protons at the surface will be broadened more than resonance lines stemming from protons located inside the protein.
we have used the paramagnetic compound gadolinium diethylenetriamine pentaacetic acid (gd-dtpa) as a relaxation agent. gd-dtpa will increase both the longitudinal and the transverse relaxation rates of protons within the influence sphere.
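the distance dependence that makes surface protons broaden more than core protons can be sketched numerically. the text only states that the effect is distance dependent; the inverse sixth power law used here is the usual form for dipolar relaxation and is an assumption of this sketch, as are the function names and the distances.

```python
def relative_enhancement(distance, reference_distance):
    """Relaxation enhancement at `distance`, relative to the enhancement at
    `reference_distance`, assuming an r**-6 dipolar distance dependence."""
    return (reference_distance / distance) ** 6


def distance_from_enhancement(ratio, reference_distance):
    """Invert the r**-6 law: the distance at which the enhancement is
    `ratio` times the enhancement observed at `reference_distance`."""
    return reference_distance * ratio ** (-1.0 / 6.0)


# Illustrative numbers: a proton 10 angstrom from the paramagnetic centre
# compared with one at 5 angstrom.
print(relative_enhancement(10.0, 5.0))            # 1/64 of the effect
print(distance_from_enhancement(1.0 / 64.0, 5.0))  # back to 10 angstrom
```

doubling the distance reduces the effect by a factor of 64, so a freely diffusing agent such as gd-dtpa acts almost exclusively on protons near the surface.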
suitable nmr experiments to highlight the relaxation effect may be noesy, roesy and tocsy (bax and davis, 1985; braunschweiler and ernst, 1983). gd-dtpa is widely used in magnetic resonance imaging (mri) to enhance tissue contrast. it is assumed to be non-toxic and we do not expect it to bind to proteins. we used the well-studied protein hen egg-white lysozyme as a test protein. both the structure and the nmr spectra of this protein are known (diamond, 1974; redfield and dobson, 1988), and the protein is extremely well suited for nmr experiments. in fig. 12, the 1-d 1h-nmr spectrum recorded in the presence and absence of gd-dtpa is shown. although it is evident that there is a selective broadening in the 1-d spectrum, it is also clear that there are problems with overlapping spectral lines. we therefore applied two-dimensional nmr methods, and shown in fig. 13 is the low field region of a noesy spectrum of lysozyme. the region corresponds to the same region as shown in fig. 12. from fig. 13 we see that the signals from w62, w63 and w123 disappear with addition of gd-dtpa, while the signals from w28, w108 and w111 still are observable. by examination of the solvent accessible surface of lysozyme it is evident that the indole nh of w62, w63 and w123 is exposed to solvent, while the indole nh of w28, w108 and w111 is not exposed. this shows that the changes in the spectrum are as expected from the structure data. the appearance of the nh-nh region of the spectrum (fig. 14) also shows the reduction in the number of signals in the gd-dtpa exposed spectrum. this shows that the paramagnetic broadening effect can be used for selective identification of signals from solvent exposed residues in a protein.
one of the fundamental steps in the protein engineering process shown in fig. 1 is the design step, where a correlation between structure and properties is established in order to select potential structural candidates that match new functional profiles.
the understanding of this correlation implies a realistic modelling of the physical chemical properties involved in the functional features to be engineered. these features are basically of two types: diffusional and catalytic. any ligand binding to a protein, whether ligand-receptor or substrate-enzyme, is essentially a diffusional encounter of two molecules. electrostatic interactions are the strongest long-range forces at the molecular scale and, thus, it is not surprising that they are one of the determinant effects in the final part of the encounter (berg and von hippel, 1985). in the case of substrate-enzyme interactions the catalytic step that follows the binding of the substrate seems to be made possible mainly by the presence of electrostatic forces that stabilise the reaction intermediates in the binding site (warshel et al., 1989), from which the product formation may proceed. another and much more basic necessary condition for a successfully engineered protein is that a functional folded conformation is maintained. solvation of charged groups is one of the determinants in protein folding (dill, 1990), so that even the conformation of the protein is electrostatically driven. given the ubiquitous role of electrostatic interactions, it is then obvious that their accurate modelling is an essential prerequisite in the design of engineered proteins. several good reviews exist on protein electrostatics (warshel and russel, 1984; matthew, 1985; rogers, 1986; harvey, 1989; davis and mccammon, 1990; sharp and honig, 1990). this section intends to give a brief overview of the subject. we start by presenting the methods one can use to model electrostatic interactions. the most familiar methodology in biomolecular modelling is certainly molecular mechanics (mm) (either through energy minimisations or molecular dynamics (md)).
we point out some of the limitations of mm in the treatment of electrostatic interactions, and the need to use alternative ways of describing the system, such as continuum methods. the computation of ph-dependent properties and some potential extensions of mm are also discussed. finally, we refer to some applications of electrostatic methods relevant to protein engineering.
in mm simulations, electrostatic interactions are usually described with a pairwise coulombic term of the form q1q2/(d r12), where q1 and q2 are the charges of the pair of atoms, r12 their distance, and d the dielectric constant. d is usually set equal to 1 when the solvent is included explicitly. a complete simulation in a sufficiently big box with water molecules should, in principle, give a realistic description of the protein molecule (harvey, 1989). this would be especially true if a force field including electronic polarizability effects (see 6.3.) was available for use with biomolecular systems, which unfortunately is not the case (harvey, 1989; davis and mccammon, 1990). we use the term force field in this context as including both the functional form and parameters describing the energetics of the system, from which the forces are derived. simulations where solvent molecules are not treated explicitly are naturally appealing, since the computation time increases with the square of the number of atoms. several methods have been proposed that attempt to account for solvent effects. the most popular approach is an ad hoc dielectric 'constant' proportional to the distance (e.g., mccammon and harvey, 1987), but different distance dependencies can be used (e.g., solmajer and mehler, 1991). a variety of more elaborate methods were also suggested (northrup et al., 1981; still et al., 1990; gilson and honig, 1991). all these methods should be viewed as attempts to include solvent screening effects in a simplified way.
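the pairwise coulombic term and the ad hoc distance-dependent dielectric can be written down in a few lines. this is a minimal sketch, assuming charges in units of the elementary charge, distances in ångström and energies in kcal/mol (hence the conversion factor of about 332); the function names are hypothetical and no particular force field is implied.

```python
COULOMB_KCAL = 332.06  # kcal * angstrom / (mol * e**2), approximate


def coulomb_energy(q1, q2, r12, dielectric=1.0):
    """Pairwise Coulomb term q1*q2 / (D * r12) in kcal/mol."""
    return COULOMB_KCAL * q1 * q2 / (dielectric * r12)


def coulomb_energy_distance_dependent(q1, q2, r12, slope=1.0):
    """Same term with an ad hoc dielectric 'constant' D = slope * r12,
    which makes the energy fall off as 1 / r12**2."""
    return coulomb_energy(q1, q2, r12, dielectric=slope * r12)


def number_of_pairs(n_atoms):
    """Number of atom pairs; grows with the square of the system size."""
    return n_atoms * (n_atoms - 1) // 2


# Opposite unit charges 4 angstrom apart.
print(coulomb_energy(1.0, -1.0, 4.0))                     # about -83 kcal/mol
print(coulomb_energy_distance_dependent(1.0, -1.0, 4.0))  # about -21 kcal/mol
print(number_of_pairs(1000))
```

the second function is one example of the simplified solvent screening schemes discussed above.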
they can be useful when inclusion of water is computationally prohibitive, but they cannot substitute for an explicit inclusion of solvent since, e.g., the existence of hydrogen bonding with the solvent is not properly described by these approaches. mm of biomolecules has, in general, heavy computation needs. the number of water molecules that should be included in order to simulate a typical protein in a realistic way is quite large, especially if one wants to perform md. also, each pair of atoms has its own electrostatic interaction and the number of pairs cannot be lowered by a short cut-off distance (e.g., 7 Å) as in van der waals interactions, since electrostatic interactions are very long range, typically up to 10 Å. mm simulations also have some limitations on the description of the system, since ph and ionic strength effects usually are difficult or impossible to include. the only way to include ph effects is through the protonation state of the residues. each titrable group (in asp, glu, his, tyr, lys, arg, c- and n-terminal) in the protein has two states, protonated or unprotonated. thus, a protein with n titrable groups will have 2^n possible protonation charge sets. the best we can do is to choose the set corresponding to the protonation states of model compounds at the desired ph. free ions can be included in md simulations of proteins (levitt, 1989; mark et al., 1991), but it is not clear if the simulated time intervals are long enough to realistically reflect ionic strength effects. another problem with mm is that the understanding it provides of the system (through energy minimisation or md) does not include entropic aspects explicitly, i.e., it does not give free energies directly.
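the choice of a single charge set from model compound behaviour, as described above for mm simulations, amounts to comparing the desired ph with a table of model pka values. the sketch below illustrates this; the pka values are approximate textbook numbers inserted as assumptions, and the residue labels and function names are purely illustrative.

```python
# Approximate model-compound pKa values (assumed, textbook-style numbers).
MODEL_PKA = {
    "C-term": 3.8, "Asp": 4.0, "Glu": 4.4, "His": 6.3,
    "N-term": 7.5, "Tyr": 9.6, "Lys": 10.4, "Arg": 12.0,
}


def protonated_fraction(pka, ph):
    """Fraction of an isolated group that is protonated at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def model_charge_set(groups, ph):
    """Pick, for each (label, kind) pair, the dominant protonation state of
    the corresponding model compound. True means protonated."""
    return {label: protonated_fraction(MODEL_PKA[kind], ph) > 0.5
            for label, kind in groups}


def number_of_charge_sets(n_groups):
    """Each titrable group has two states, so there are 2**n charge sets."""
    return 2 ** n_groups


# Hypothetical labels for three titrable groups.
groups = [("asp-a", "Asp"), ("his-b", "His"), ("lys-c", "Lys")]
print(model_charge_set(groups, 7.0))
print(number_of_charge_sets(30))
```

the last line shows why the full set of charge states cannot be enumerated routinely: already 30 titrable groups give more than a billion sets.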
there are methods to calculate free energies based on mm potentials (beveridge and dicapua, 1989) , but even though several applications have been made on biomolecular systems (for a review see beveridge and dicapua, 1989) , they are still too demanding for routine use in systems of this size. then, when the properties under study are related to free energies rather than energies (which is often the case), mm by itself can only be seen as a first approach. in summary, although mm simulations can provide some unique information on the structural and dynamical behaviour of biomolecular systems, some limitations exist due to both conceptual and practical reasons, in particular regarding the treatment of electrostatic interactions. fortunately, other methods exist that can provide insight on aspects whose modelling is poor or absent in mm simulations, although at the cost of the atomic detail in the description. there is no 'best' modelling method and we should resort to the several methods available in order to gain an understanding of the system that is as complete as possible. the so-called continuum or macroscopic models assume that electrostatic laws are valid at the protein molecular level and that macroscopic concepts such as dielectric properties are applicable. protein and solvent are treated as dielectric materials where charges are located. these charges may be titrable groups (whose protonation state may vary), permanent ions (structural and bound ions, etc.) or, more recently, permanent partial charges of polar groups. given the dielectric description of the system and the placement of the charges, the problem can be reduced to the solution of the poisson equation (or any equivalent formulation), as can any problem of electrostatics (e.g., jackson, 1975) . the electrostatic potential thus obtained can be used to study diffusional processes or visually compare different molecules (see 6.6.). 
the simplest continuum model assumes the same dielectric constant inside and outside the protein. typically, a value somewhere between the protein and solvent dielectric constants has been used (sheridan and allen, 1980; koppenol and margoliash, 1982; hol, 1985). this approach completely ignores the effects of having two very different dielectric regions, but can be used for a first qualitative computation. the more common continuum models treat the protein as a low dielectric cavity immersed in a high dielectric medium, the solvent. the way the charges are placed in this cavity and the way the electrostatic problem is solved vary with the particular method. analytical solutions can be obtained for the simplest shapes, such as spheres, but in general the more complex shapes require numerical techniques. in the first cavity model the protein was assumed to be a sphere with the charge uniformly distributed over its surface (linderstrøm-lang, 1924). tanford and kirkwood (1957) proposed a more detailed model in which each charge has a fixed position below the surface. assuming a spherical geometry allows for a simple solution to the electrostatic problem. it is even possible to include an ionic atmosphere that accounts for ionic strength effects (leading to the poisson-boltzmann equation). the effect of ph occurs naturally in the formalism. the energy cost of burying a charge inside the low-dielectric protein (self-energy) is taken to be the same as in small model compounds, since at the time when this method was developed (before protein crystallography) charges were believed to be restricted to the protein surface. this limits the method to proteins without buried charges, unless we have some estimate of the self-energy. there are, obviously, some problems in fitting real, irregular-shaped proteins to a spherical model.
some solutions to this problem were proposed, including an ad hoc scaling of interactions based on solvent accessibility (shire et al., 1974), and the placing of more exposed charges in the solvent region (states and karplus, 1987). the inclusion of non-spherical geometries implies the use of numerical techniques, as mentioned above. warwicker and watson (1982) and used the finite differences technique to solve, respectively, the poisson and poisson-boltzmann equations. self-energies can be included (gilson and honig, 1988), such that the method is fully applicable when buried charges exist. the intrinsic discretization of the system in the finite differences technique makes these methods readily applicable to any kind of spatial dependency on any of the properties involved. the inclusion of a spatially-dependent dielectric constant, for instance, will be relatively simple. other extensions such as additional dielectric regions (ligands, membranes, etc.), eventually with charges, should also be possible. alternative numerical techniques for solving the poisson or poisson-boltzmann equations have also been used, including finite elements (orttung, 1977) and boundary elements (zauhar and morgan, 1985). the dielectric constant in a region comes from the existence of dipoles in that region, permanent or induced. permanent dipoles are due to atomic partial charges (e.g., water dipole, peptide bond dipole). induced dipoles are due to the polarizability of electron clouds. warshel and levitt (1976) represented this electronic polarizability by using point dipoles in the atoms. as pointed out by davis and mccammon (1990) this representation is roughly equivalent to a spatially-dependent dielectric constant. this approach is usually combined with a simplified representation of water by a grid of dipoles (warshel and russel, 1984). ionic strength and ph effects are not considered.
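the remark that finite differences handle a spatially dependent dielectric constant with little extra effort can be illustrated in one dimension. the sketch below solves d/dx(eps dphi/dx) = -rho on a uniform grid by gauss-seidel iteration. it is a toy under stated assumptions: reduced units with all physical constants absorbed into rho, fixed potentials at both ends, hypothetical function and parameter names, and no relation to the actual programs cited above.

```python
def solve_poisson_1d(eps_half, rho, phi_left, phi_right,
                     spacing=1.0, tol=1e-12, max_iter=200000):
    """Gauss-Seidel solution of d/dx(eps * dphi/dx) = -rho.

    eps_half[i] is the dielectric constant on the interval between grid
    nodes i and i + 1; rho[i] is the charge density at node i.
    The potential is fixed at the two end nodes.
    """
    n = len(rho)
    if len(eps_half) != n - 1:
        raise ValueError("need one dielectric value per grid interval")
    # Start from a straight line between the boundary values.
    phi = [phi_left + (phi_right - phi_left) * i / (n - 1) for i in range(n)]
    for _ in range(max_iter):
        largest_change = 0.0
        for i in range(1, n - 1):
            e_minus, e_plus = eps_half[i - 1], eps_half[i]
            new = (e_plus * phi[i + 1] + e_minus * phi[i - 1]
                   + rho[i] * spacing ** 2) / (e_plus + e_minus)
            largest_change = max(largest_change, abs(new - phi[i]))
            phi[i] = new
        if largest_change < tol:
            break
    return phi


# Low-dielectric 'protein' (eps = 2) next to high-dielectric 'solvent'
# (eps = 80), no charges, unit potential difference across the grid.
eps = [2.0] * 10 + [80.0] * 10
phi = solve_poisson_1d(eps, [0.0] * 21, 0.0, 1.0)
print(phi[10])  # potential at the dielectric boundary
```

almost the entire potential drop occurs inside the low dielectric region, which is the qualitative reason why a uniform dielectric constant is a poor description of a protein in water. additional dielectric regions or fixed charges only change the input lists, not the solver.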
all the above methods deal with a particular charge set (see 6.1.), even when ph effects are considered. however, a protein in solution does not exist in a single charge set. we are usually interested in the properties of a protein at a given ph and ionic strength, not at a particular charge set. moreover, if we want to test the available methods, we have to test them against experimental results, which usually do not correspond to a specific charge set. a common test of the accuracy of electrostatic models is their ability to predict pka values of titrable groups in a protein (see 6.6.), obtained via titrations, nmr, etc. these values can be quite different from those of model compounds, due to the environment of the groups in the protein. this difference (pka shift) can amount to several pk units. the experimentally determined apparent pka (pk_app) is the ph value at which half of the groups of that residue are protonated in the protein solution, i.e., at which its mean charge is 1/2 in absolute value (thus, the equivalent notation pk_1/2). then, if we can devise a method to compute the mean charge of the titrable groups at several ph values, we can predict their pk_app values. as mentioned above (see 6.1.), we have 2^n possible charge sets for n titrable groups. any structural property can, in principle, be computed through a boltzmann sum over all those sets, with each one contributing according to its free energy (taken as the electrostatic energy) (tanford and kirkwood, 1957; bashford and karplus, 1990). the property thus computed is characteristic of the chosen ph value (and ionic strength, if considered) instead of a specific charge set. we are particularly interested in computing the mean charges at a given ph (see the preceding paragraph). a sum with 2^n terms is not, however, a trivial calculation in terms of computer time. tanford and roxby (1972) avoided the boltzmann sum by placing the mean charges directly on the titrable groups, instead of using one of the integer sets.
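the boltzmann sum over all 2^n charge sets can be written down directly for a small number of sites. the sketch below is only an illustration under stated assumptions: each site is described by a hypothetical intrinsic pk (the pk it would have without the other titrable charges), the charge of its deprotonated form, and a matrix w of pairwise interaction energies between unit charges in units of kt; these quantities would in practice come from, e.g., a continuum calculation. the function names are invented for the example.

```python
import itertools
import math

LN10 = math.log(10.0)


def state_energy(x, ph, pk_int, q_deprot, w):
    """Free energy (units of kT, up to an additive constant) of the charge
    set x, where x[i] is 1 if site i is protonated and 0 otherwise."""
    charges = [q0 + xi for q0, xi in zip(q_deprot, x)]
    energy = sum(xi * LN10 * (ph - pk) for xi, pk in zip(x, pk_int))
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            energy += w[i][j] * charges[i] * charges[j]
    return energy


def mean_protonation(ph, pk_int, q_deprot, w):
    """Mean protonation of every site from the full sum over 2^n charge sets."""
    n = len(pk_int)
    partition = 0.0
    totals = [0.0] * n
    for x in itertools.product((0, 1), repeat=n):
        weight = math.exp(-state_energy(x, ph, pk_int, q_deprot, w))
        partition += weight
        for i, xi in enumerate(x):
            totals[i] += xi * weight
    return [t / partition for t in totals]


def pk_half(site, pk_int, q_deprot, w, lo=-5.0, hi=20.0):
    """pH at which the given site is half protonated, found by bisection
    (assumes the protonation of the site decreases monotonically with pH)."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if mean_protonation(mid, pk_int, q_deprot, w)[site] > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Two hypothetical acidic groups (charge -1 when deprotonated) that
    # repel each other by 2 kT when both are charged.
    w = [[0.0, 2.0], [2.0, 0.0]]
    print(pk_half(0, [4.0, 4.0], [-1, -1], w))
```

for these two equivalent sites the repulsion raises pk_1/2 above the intrinsic value of 4, the kind of pka shift discussed above. the cost of the loop doubles with every added site, which is why the exact sum becomes impractical for whole proteins.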
placing the mean charges on the titrable groups corresponds to considering the titration of the different groups as independent (a mean field approximation; bashford and karplus, 1991). other alternatives to the boltzmann sum are the monte carlo method (beroza et al., 1991), less drastic mean field approximations (yang et al., 1993; gilson, 1993), the 'reduced site' approximation (bashford and karplus, 1991), or even the assumption that the predominant charge set is enough to describe the system (gilson, 1993). since electrostatic interactions in proteins are typically dominated by titrable groups whose charge is affected by ph, no electrostatic treatment can be complete without taking this effect into account. a simple, although effective, way of doing this is to: (i) compute the electrostatic free energies (e.g., by a continuum method); (ii) compute the mean charge of each titrable group at a given ph (e.g., by a mean field approximation); (iii) use those charges to compute the electrostatic potential (e.g., by a continuum method), which can be displayed together with the protein structure (see the human pancreatic lipase example in section 6.6.). in this way a ph-dependent electrostatic model of the protein can be obtained, which is not possible with the usual mm-based modelling techniques. as stated above (see 6.1.), electronic polarizability is not explicitly considered in common force fields. van belle et al. (1987) included the induced dipole formalism (warshel and levitt, 1976) in mm calculations. the electrostatic interactions in the applied force field were simply 'corrected' with additional terms due to inducible dipoles. however, it should be noted that a force field fitted to experimental data without polarizability terms should be fitted again if those terms are included. the protein conformation used in molecular modelling is usually an experimentally based (x-ray, nmr) mean conformation, characteristic of those particular experimental conditions.
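returning to step (ii) of the procedure above, a mean field calculation of the mean charges can be sketched as a self-consistent iteration in which every site titrates in the average field of the others. this is a minimal illustration in the spirit of the mean field idea, not the published algorithms: the intrinsic pk values, the deprotonated charges, the interaction matrix w (units of kt between unit charges), the damping factor and the function name are assumptions of the example.

```python
import math

LN10 = math.log(10.0)


def mean_field_protonation(ph, pk_int, q_deprot, w, max_iter=1000, tol=1e-10, mixing=0.5):
    """Self-consistent mean protonation of each site at a given pH.

    Site i feels the mean charges of all other sites; its protonation then
    follows a Henderson-Hasselbalch expression with a shifted free energy.
    """
    n = len(pk_int)
    x = [0.5] * n  # initial guess: every site half protonated
    for _ in range(max_iter):
        new_x = []
        for i in range(n):
            # Interaction (kT) of one added proton on site i with the mean
            # charges q_deprot[j] + x[j] of the other sites.
            field = sum(w[i][j] * (q_deprot[j] + x[j]) for j in range(n) if j != i)
            dg = LN10 * (ph - pk_int[i]) + field
            new_x.append(1.0 / (1.0 + math.exp(dg)))
        change = max(abs(a - b) for a, b in zip(new_x, x))
        # Damped update to help convergence for strongly coupled sites.
        x = [mixing * a + (1.0 - mixing) * b for a, b in zip(new_x, x)]
        if change < tol:
            break
    return x


if __name__ == "__main__":
    w = [[0.0, 2.0], [2.0, 0.0]]
    x = mean_field_protonation(4.0, [4.0, 4.0], [-1, -1], w)
    mean_charges = [q0 + xi for q0, xi in zip([-1, -1], x)]
    print(x, mean_charges)
```

the cost grows with n^2 per iteration instead of 2^n, at the price of neglecting correlations between sites; the resulting mean charges are what would be passed on to step (iii).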
such an experimental conformation may, however, be inadequate for modelling the protein properties at different conditions. in particular, proteins are known to denature at extreme ph conditions. thus, ph-dependent methods such as the continuum methods may give incorrect results when using one single conformation over the whole ph range. indeed, md simulations have shown that the results can be highly dependent on side chain conformation (wendoloski and matthew, 1989). although overall properties like titration curves did not seem to be very sensitive, individual pka values showed variations of up to 2.0 pk units. as mentioned in section 6.1, mm has the problem of what charge set to use in simulations. instead of using a charge set corresponding to model compounds at the intended ph, one may use the predominant charge set of the protein, determined, e.g., by a continuum method, as suggested by gilson (1993). a different approach to this problem would be to devise a way of including the averaged effect of all charge sets in the mm simulation. we have recently developed a method where a force field is derived which includes the proper averaged effect of all charge sets (a potential of mean force) (to be published). the method depends on the calculation of electrostatic free energies obtained from, e.g., a continuum method. the electrostatic potential, computed in some of the methods referred to above, can help to understand the contribution of electrostatic interactions to the diffusional encounters of proteins with ligands (substrates or not). the diffusional process driven by the electrostatic field can be simulated through brownian dynamics (bd) and diffusion rates may be computed (for references see, e.g., davies and mccammon, 1990).
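the core of a bd simulation is a simple update rule: a drift along the electrostatic force plus a random displacement whose size is set by the diffusion constant. the sketch below shows one such step for a point ligand near a single fixed charge, without hydrodynamic interactions. it is an illustration only; the function names, the reduced units (energies in kt, lengths in å), the coulomb force scaled by an assumed bjerrum length of 7 å and the reaction and escape radii are all assumptions of the example.

```python
import math
import random


def coulomb_force(pos, q_mobile, q_fixed, bjerrum_length):
    """Force (kT per unit length) on a mobile charge at `pos` due to a
    fixed charge at the origin, for U = bjerrum_length * q1 * q2 / r (kT)."""
    r = math.sqrt(sum(x * x for x in pos))
    scale = bjerrum_length * q_mobile * q_fixed / r ** 3
    return tuple(scale * x for x in pos)


def bd_step(pos, force, diffusion, dt, gauss=random.gauss):
    """One Brownian dynamics step: deterministic drift D*F*dt/kT (force
    already in kT units) plus a Gaussian displacement of variance 2*D*dt."""
    sigma = math.sqrt(2.0 * diffusion * dt)
    return tuple(x + diffusion * f * dt + sigma * gauss(0.0, 1.0)
                 for x, f in zip(pos, force))


def reaches_target(start, q_mobile, q_fixed, rng, bjerrum_length=7.0,
                   diffusion=1.0, dt=0.5, r_react=5.0, r_escape=40.0,
                   max_steps=100000):
    """Follow one trajectory until it reacts (r <= r_react) or escapes."""
    pos = start
    for _ in range(max_steps):
        r = math.sqrt(sum(x * x for x in pos))
        if r <= r_react:
            return True
        if r >= r_escape:
            return False
        force = coulomb_force(pos, q_mobile, q_fixed, bjerrum_length)
        pos = bd_step(pos, force, diffusion, dt, gauss=rng.gauss)
    return False


if __name__ == "__main__":
    rng = random.Random(1)
    for q_fixed, label in ((+1, "attractive"), (0, "neutral")):
        hits = sum(reaches_target((20.0, 0.0, 0.0), -1, q_fixed, rng)
                   for _ in range(200))
        print(label, hits / 200.0)
```

the fraction of trajectories that react before escaping is the raw quantity from which diffusion-controlled rates are estimated in bd studies; comparing charged and neutral targets gives a first impression of electrostatic steering.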
the effect of mutations on the diffusion of superoxide ion into the active site of superoxide dismutase has been studied by bd (sines et al., 1990), and faster mutants showing a 2-3-fold increase in reaction rate could be designed (getzoff et al., 1992), although this enzyme is usually considered to be 'perfect'. electrostatically driven bd simulations can help to reveal steric 'bottlenecks' (reynolds et al., 1990) and orientational effects (luty et al., 1993). this method can also be applied to study the encounter of two proteins (northrup et al., 1988). visual comparison of electrostatic fields can also provide useful information. soman et al. (1989) showed that rat and cow trypsins have similar electrostatic potentials near the active site, despite a total charge difference of 12.5 units. as an illustration of this type of comparison, using ph-dependent electrostatics, we have applied the solvent accessibility-modified tanford-kirkwood method (see 6.2.) to the human pancreatic lipase structures with both closed (van tilbeurgh et al., 1992) and open lid (van tilbeurgh et al., 1993), as shown in fig. 15a and b. fig. 15c-f shows surfaces corresponding to an electrostatic potential equal to ±1.0 kt/e (where k is the boltzmann constant, t the absolute temperature and e the proton charge). these surfaces correspond to regions where the electrostatic interactions on a charge are roughly of the same magnitude as the thermal effects due to the surrounding solvent, i.e., where charged molecules in solution start to feel electrostatic steering or repulsion. at ph 7 clear differences exist between the closed and open forms, the latter showing a dipolar groove in the presumed binding site region. at ph 4 the molecule is strongly positively charged and most electrostatically differentiated regions have disappeared.
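the contour level of 1.0 kt/e can be put on a more familiar scale with a short calculation. the sketch below converts kt/e to millivolts and computes the distance at which the unscreened potential of a single unit charge in water equals 1 kt/e (the bjerrum length). the temperature of 298.15 k and the relative dielectric constant of 78.5 for water are assumptions of the example, ionic screening is ignored, and the function names are invented here.

```python
import math

K_B = 1.380649e-23            # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19    # proton charge, C
EPS0 = 8.8541878128e-12       # vacuum permittivity, F/m


def thermal_voltage_mv(temperature):
    """kT/e expressed in millivolts at the given absolute temperature."""
    return 1e3 * K_B * temperature / E_CHARGE


def bjerrum_length_angstrom(temperature, eps_r):
    """Distance (angstrom) at which two unit charges interact with an
    energy of kT in a uniform medium of relative dielectric constant eps_r."""
    return 1e10 * E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * eps_r * K_B * temperature)


if __name__ == "__main__":
    print(thermal_voltage_mv(298.15))
    print(bjerrum_length_angstrom(298.15, 78.5))
```

under these assumptions 1 kt/e is about 26 mv, and an isolated unit charge in pure water would carry its own 1 kt/e contour at a radius of roughly 7 å; the much larger contours of a protein reflect the cooperative effect of many charges.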
given the role of electrostatic interactions in molecular orientation and association (see the beginning of this section (6)), the loss of electrostatically differentiated regions at ph 4 is expected to markedly affect the interaction with the lipid-water interface. for enzymes, the catalytic activity involving a charged residue can be modulated by shifting the pka of that residue. the pka shifts of the active site histidine have been successfully predicted for a number of mutants of subtilisin (loewenthal et al., 1993). one of the main reasons why enzymes are good catalysts is that they stabilise the transition state intermediate (fersht, 1985). for enzymatic reactions that are not diffusion limited, engineering leading to an enhanced stabilisation of the intermediate will result in an increased activity. the induced dipole method was used to compute the activation free energy for different mutants of trypsin and subtilisin (warshel et al., 1989), with some qualitative agreement with the experimental results. the prediction of changes introduced by mutations on redox potentials could also be of interest to protein engineering. prediction of redox potentials has been made with some success (rogers et al., 1985; durell et al., 1990). in plastocyanin the effect of chemically modifying charged groups was also considered (durell et al., 1990). the effect of mutations could also be analysed, as has been done for pka shift calculations (see above). the above examples clearly show that, whatever the particular method used, the modelling of electrostatic interactions in proteins has an important role to play in protein engineering. a highly relevant example is the design of a faster 'perfect' enzyme (getzoff et al., 1992), which also illustrates the combination of different methods (bd and electrostatic continuum methods) that can sometimes be decisive in a modelling study.

the science of protein engineering is advancing rapidly, and is emerging in many new contexts, such as metabolic engineering.
rational protein engineering is a complex undertaking, and only the groups with sufficient understanding of sequences and 3-d structures can handle the complex underlying problems. predicting protein structure may be difficult, but predicting future developments in a very active branch of science can be hazardous at best. however, we will review a few of the more recent research aspects that we are convinced will be of key importance in the future development of protein engineering. often the substrates or products in an enzymatic process are poorly soluble in an aqueous medium. this may lead to poor yields and difficult or expensive purification steps. the potential of using other solvents, either pure or in mixture, in which substrates and/or products may be soluble has attracted a great deal of attention (tramper et al., 1992; arnold, 1993). dissolving the protein in organic solvents will alter the macroscopic dielectric constant and lead to a much less pronounced difference between the interior and exterior static dielectric behaviour. protein function in such media may be altered and is poorly understood; we can expect significant developments in the future. despite the often dramatic change in dielectric constant when changing the solvent from, e.g., water to an organic substance, the protein 3-d structure can remain virtually intact, as has been documented in the case of subtilisin carlsberg dissolved in anhydrous acetonitrile (fitzpatrick et al., 1993). the hydrogen bonding pattern of the active site environment is unchanged, and 99 of the 119 enzyme-bound structural water molecules are still in place. one-third of the 12 enzyme-bound acetonitrile molecules reside in the active site. many enzymes remain active in organic solvents, and in the case of enzyme reactions where the substrate has very poor water solubility, a change to organic solvent can be of major importance (gupta, 1992).
an extreme case of a non-conventional medium for enzymatic action is the gas phase. certain enzymes, immobilised on a solid bed, have been shown to be active at elevated temperatures towards selected substrates in the gas phase (lamare and legoy, 1993). obviously the range of substrates that potentially can be used is limited to those that actually can be brought into the gas phase under conditions where the enzyme is still active. enzymes for which such reactions have been studied include hydrogenase, alcohol oxidase and lipases. the fact that even interfacially activated lipases (such as the porcine pancreatic and the candida rugosa lipases) function with gas phase carried substrate molecules opens up the interesting possibility of studying the role of water in this reaction.

fig. 15. electrostatic maps of hpl with closed and open lid. ribbon models of human pancreatic lipase with colipase are shown with closed (left: a,c,e) and open (right: b,d,f) lid. the colipase is shown in blue and the mainly α-helical 'lid' region is highlighted in cyan. the residues of the active site are shown in green. access to the active site pocket seems to be controlled by the conformational state of the lid. electrostatic isopotential contours of ±1.0 kt/e are shown at ph 4 (c,d) and ph 7 (e,f). the negative surfaces are represented in red and the positive surfaces in blue. the models and isopotential contours were produced with insight ii and delphi (biosym technologies, san diego). the ph-dependent charge sets were computed with titra (to be published).

protein engineering may be used to enhance enzyme activity in organic solvents (arnold, 1993; chen and arnold, 1993). when subtilisin e is dissolved in 60% dimethylformamide (dmf), the kcat/km for the model substrate suc-ala-ala-pro-met-p-nitroanilide drops 333-fold. after ten mutations were introduced, the activity in dmf was restored almost to the level of the native enzyme in water.
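a fold change in kcat/km can be translated into a change in apparent activation free energy through ddg = rt ln(ratio). the short sketch below does this arithmetic for the 333-fold drop quoted above; the temperature of 298.15 k is an assumption (the conditions of the original measurement are not given here), and the function name is invented for the example.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal / (mol * K)


def ddg_from_fold_change(fold_change, temperature=298.15):
    """Change in apparent activation free energy (kcal/mol) that
    corresponds to a given fold change in kcat/Km."""
    return R_KCAL * temperature * math.log(fold_change)


if __name__ == "__main__":
    print(ddg_from_fold_change(333.0))
```

at the assumed temperature the loss of activity in dmf corresponds to roughly 3.4 kcal/mol, which gives a scale for what the ten mutations together had to recover.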
all metabolic conversions in micro-organisms are carried out directly or indirectly by proteins. our ability to manipulate single genes has opened the way to actual control of such processes. we may alter the efficacy of a certain pathway or we may introduce totally new pathways. thus, escherichia coli can be modified in such a way that one can use d-glucose in the e. coli based manufacture of hydroquinone, benzoquinone, catechol and adipic acid (dell and frost, 1993; draths and frost, 1990; frost, 1993). presently such compounds are produced through organic chemical synthesis using aromatics as one of the reactants. the prospect of producing the same compounds using only microbes and glucose thus has some obvious environmental benefits. we expect to see a virtual surge in the engineering of microorganisms towards the production of rare chemical or biochemical compounds or compounds for which the current synthetic route is costly either economically or from an environmental perspective. the prospect of designing and producing functional protein molecules from scratch is extremely attractive to many visionary scientists. some central questions arise: do we know enough to undertake such tasks, and what goals can we define? screening mutation studies of protein interfaces show that the majority of mutations reduce activity or binding affinity (cunningham and wells, 1993), indicating that most proteins already represent highly optimised designs. the groups active in this area have aimed at constructing certain 3-dimensional folds such as the four helix bundle (felix) (hecht et al., 1990) and histidine-based metal binding sites (arnold, 1993), and even the observation of limited enzymatic activity is regarded as a successful result. de novo design of helix bundles may even follow a very simple binary pattern of polar and nonpolar amino acids, as was concluded in a study of four-helix bundle proteins (kamtekar et al., 1993).
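the idea of a binary polar/nonpolar pattern can be illustrated with a small sketch that derives such a pattern from ideal helix geometry and then fills it with residues. this is not the procedure of the cited study: the 100 degrees per residue of an ideal α-helix, the width of the buried face, the two residue sets and the function names are all assumptions chosen for illustration.

```python
import random

# Illustrative residue sets (assumed for this example).
NONPOLAR = "FLIMV"
POLAR = "EDKNQH"


def helix_binary_pattern(length, degrees_per_residue=100.0,
                         buried_half_width=70.0, phase=0.0):
    """Return a string of 'N' (nonpolar) and 'P' (polar) positions.

    A residue is marked nonpolar when its angular position around the
    helix axis lies within buried_half_width degrees of the buried face,
    which is placed at 0 degrees.
    """
    pattern = []
    for i in range(length):
        angle = (phase + i * degrees_per_residue) % 360.0
        distance = min(angle, 360.0 - angle)
        pattern.append("N" if distance <= buried_half_width else "P")
    return "".join(pattern)


def fill_sequence(pattern, rng):
    """Pick a random residue of the required class for every position."""
    return "".join(rng.choice(NONPOLAR if p == "N" else POLAR) for p in pattern)


if __name__ == "__main__":
    pattern = helix_binary_pattern(18)
    print(pattern)
    print(fill_sequence(pattern, random.Random(0)))
```

because only the class of each position is fixed, a single pattern is compatible with a very large number of sequences, which is what makes the binary code attractive for generating and screening many variants.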
the helix-helix contact surfaces are mainly hydrophobic, whereas the solvent exposed regions are hydrophilic. many variants conforming to this hydrophobic pattern were generated, and two of these proteins were stabilised by 3.7 and 4.4 kcal mol⁻¹ relative to the unfolded form, thus approaching what is found for many natural proteins. the authors suggest that such a binary pattern may have been important in the early stages of evolution. in our laboratory we have results supporting this conclusion for the trypsin family of proteins, which has a predominantly β-strand based fold. fusion and hybrid proteins may be produced by fusing the genes or gene fragments, including a proper linking region between the two genes (argos, 1990). this in principle may allow for combining properties from two different proteins. thus artificial bifunctional enzymes have been produced by fusing the genes for the proteins, e.g., β-galactosidase and galactokinase (bulow, 1990). in a recent paper an elegant hybrid protein concept is described. a hybrid antibody fragment was designed to consist of a heavy-chain variable domain from one antibody connected through a linker region of 5-15 residues to a short light-chain variable domain from another antibody (holliger et al., 1993). the antibody fragments displayed binding characteristics similar to those of the parent antibodies. the prospect of engineering multifunctional antibodies for medical applications is imminent. a hybrid protein between the glucose transporter and the n-acetylglucosamine transporter of e. coli has been produced. the two proteins displayed 40% residue identity. the hybrid protein consisted of the putative transmembrane domain from the glucose transporter and the two hydrophilic domains from the n-acetylglucosamine transporter. the hybrid protein was, somewhat surprisingly, still specific for glucose (hummel et al., 1992).
interestingly, several naturally occurring proteins themselves seem to have originated through gene fusion. in the case of hexokinase it is proposed that it originated from a duplication of the glucokinase gene, maintaining even the gene organisation (kogure et al., 1993). several other proteins, such as receptor proteins of the insulin family, can best be understood as gene fusion products of a kinase domain onto the rest of the receptor (which in itself may consist of several fragments). with potential medical applications in mind, protein-nucleic acid hybrids have been constructed, where the nucleic acid fragment complemented the sequence of a fragment of the mrna that the rnase should be targeted towards. the results obtained confirmed that this approach indeed worked (kanaya et al., 1992). the potential for generating anti-viral agents against, e.g., hiv is obvious. as a consequence of the enormous growth in our understanding of molecular biology and material technology, a new technological sector is emerging which aims at exploring the possible advantages of creating micro-machines and switchable molecular entities. this concept is currently known as nanotechnology (birge, 1992). two concepts that we find particularly interesting are described briefly below. rhodopsin is a very ancient molecular construct; we find rhodopsin-like molecules in a range of roles, all of them associated with its membrane location. proton transport and receptor functions are particularly interesting. bacteriorhodopsin from halobacterium halobium maintains a large ph gradient across the bacterial membrane. this protein complex is coloured, and its colour can be changed by exposing the protein to light of an appropriate frequency. the lifetime of the excited state can be adjusted by adjusting the physical chemical parameters of the medium in which the rhodopsin is embedded (birge, 1992).
this protein can be used as a molecular switch in a very broad sense, e.g., as part of a high density memory device. however, changing the colour of a protein molecule is just one example that could be considered. another molecular based switch concept involves the transfer of a molecular ring (paraquat-derived rotaxane ring) between two binding sites (bradley, 1993). currently the transfer is induced by a solvent change, but it is believed that an electrochemical transfer mechanism can be developed as well. similar concepts can probably also be developed for proteins.

the present paper reviews some of the many new developments in protein engineering. the review is not exhaustive; it is simply not possible to do this properly within the limits of this paper. we have tried to review some selected scientific areas of key importance for protein engineering, such as the validity of protein sequence information as well as structural information. sometimes the translation of a gene sequence to an amino acid sequence is not trivial: a range of posttranscriptional editing and splicing events may occur, leading to a functional protein whose amino acid sequence cannot be directly deduced from the gene sequence. in addition, posttranslational modification may provide triggers for other parts of the cell's molecular machinery. we are thus in a situation where the full benefits and profits from projects such as the human genome project may escape us for a while. we have covered some of the recent developments in the modelling of protein structure by homology, which we regard as one of the most strategic areas of development. we will be flooded with sequence information deduced from gene sequences, and in the cases where the deduced amino acid sequences are assumed valid, we have to use homology based structure prediction in most cases. given that the number of protein structure families is expected to be limited, the task is doable. here we should again caution the reader.
we have no a priori reason to assume that non-soluble proteins, such as structural proteins, have structures that can be predicted from our limited library of mostly globular, soluble proteins. some structural proteins are gigantic: the cuticle collagen in the riftia worms from deep sea hydrothermal vents has a molecular mass of 2600 kda (gaill et al., 1991). it is extremely unlikely that a 3-d structure at atomic resolution of such a protein will ever be determined using the methods we have available today. nmr has emerged with surprising speed as a structure determination tool. many excellent reviews have been written on this topic. we have decided to direct the reader's attention to some recent developments that we believe will be of significant importance to the usage of nmr in protein engineering projects. the potential of using nmr to study the solvent exposed outer shell of larger proteins, which far exceed the 30 kda limit mentioned earlier, is intriguing. this is particularly so since most of the functionality of a protein is a feature of exactly the residues in the outer shell. thus, we can 'peel' the protein, and thereby isolate the spectral information that pertains to the surface only. this simplifies the spectra, and in some cases even allows for a partial assignment of specific residues. recent developments in ph-dependent protein electrostatics have been given special attention here. the similarities and differences within a family of structurally related proteins can only be understood if we are capable of interpreting the consequences of the substitutions, insertions and deletions that mostly occur at the surface of the proteins. when such changes are found and they involve charged residues, they will affect the extent or polarity of the electrostatic fields that the protein molecule is embedded in.
we believe that the consequences of charge mutations to a large extent can be predicted through the use of ph-dependent electrostatics, although practical examples are still lacking. to our knowledge the results on the electrostatic consequences of the lid motion in the human pancreatic lipase (vide supra) are among the first such reported. the story of molecular biology is continuously unfolding, and our understanding of our own biology, development and evolution is becoming ever deeper and more detailed. but we are also, once again, discovering that one of the many qualities of nature is endless complexity.

protein data bank. crystallographic databases -information content, software systems, scientific applications. bonn/cambridge/chester, data commission of the international union of crystallography modification of trypanosoma brucei mitochondrial rrna by posttranscriptional 3' polyuridine tail formation significance of similarities in protein structures (in abstracts of the 5th annual meeting of the protein engineering society of japan) scanning tunneling microscopy of biological macromolecular structures coated with a conducting film an investigation of oligopeptides linking domains in protein tertiary structures and possible candidates for general gene fusion engineering proteins for nonnatural environments solid-state 13c nmr study of a transglutaminase-inhibitor adduct structural engineering of the hiv-1 protease molecule with a β-turn mimic of fixed geometry the swiss-prot protein sequence data bank polymers made to measure alscript: a tool to format multiple sequence alignments pka's of ionizable groups in proteins: atomic detail from a continuum electrostatic model multiple-site titration curves of proteins: an analysis of exact and approximate methods for their calculation mlev-17-based two-dimensional homonuclear magnetization transfer spectroscopy predicting the conformation of proteins.
man versus machine diffusion-controlled macromolecular interactions the protein data bank: a computer-based archival file for macromolecular structures protonation of interacting residues in a protein by a monte carlo method: application to lysozyme and the photosynthetic reaction center of rhodobacter sphaeroides free energy via molecular simulation: application to chemical and biomolecular systems research and perspectives catching a common fold seleno protein synthesis: an expansion of the genetic code protein structures from distance inequalities secondary structure prediction for modelling by homology inverted protein structure prediction a method to identify protein sequences that fold into a known three-dimensional structure will future computers be all wet? coherence transfer by isotropic mixing: application to proton correlation spectroscopy a photochemically induced dynamic nuclear polarization study of denatured states of lysozyme three-dimensional structure of the human class ii histocompatibility antigen hla-dr1 an empirical energy function for threading protein sequence through the folding motif preparation of artificial bifunctional enzymes by gene fusion solidstate nmr assessment of enzyme active center structure under nonaqueous conditions forward to the fundamentals study of the tryptophan residues of lysozyme using 1h nuclear magnetic resonance rna duplexes guide base conversions ph dependence of relaxivities and hydration numbers of gadolinium(ill) complexes of linear amino carboxylates ih nmr studies of human c3a anaphylatoxin in solution: sequential resonance assignments, secondary structure, and global fold tuning the activity of an enzyme for unusual environments: sequential random mutagenesis of subtilisin e for catalysis in dimethylformamide proteins. 
one thousand families for the molecular biologist a correlation-coefficient method to predicting protein-structural classes from amino acid compositions solid-state nmr determination of intra-and intermolecular 31p-13c distances for shikimate 3-phosphate and [1-i3c]glyphosate bound to enolpyruvylshikimate-3-phosphate synthase four-dimensional 13c/13c-edited nuclear overhauser enhancement spectroscopy of a protein in solution: application to interleukin 1/3 high-resolution three-dimensional structure of interleukin 1/3 in solution by three-and four-dimensional nuclear magnetic resonance spectroscopy origins of structural diversity within sequentially identical hexapeptides comparison of three algorithms for the assignment of secondary structure in proteins: the advantages of a consensus assignment extracting the information -sequence analysis software design evolves conformations of folded proteins in restricted spaces prediction of protein folding from amino acid sequence over discrete conformation spaces comparison of a structural and a functional epitope electrostatics in biomolecular structure and dynamics identification and removal of impediments to biocatalytic synthesis of aromatics from d-glucose: rate-limiting enzymes in the common pathway of aromatic amino acid biosynthesis the crystal and molecular structure of the rhizomucor miehei triacylglyceride lipase at 1.9 a resolution real-space refinement of the structure of hen egg white lysozyme dominant forces in protein folding complete assignment of aromatic 1h nuclear magnetic resonances of the tyrosine residues of hen lysozyme stein and moore award address. 
reconstructing history with amino acid sequences the comings and goings of homing endonucleases and mobile introns multim -tools for multiple sequence analysis genomic direction of synthesis during plasmid-based biocatalysis free radical induced nuclear magnetic resonance shifts: comments on contact shift mechanism prediction of protein folding class from amino acid composition modeling of the electrostatic potential field of plastocyanin three-dimensional profiles for analysing protein sequence -structure relationships a method to configure protein side-chains from the main-chain trace in homology modelling structure of pentameric human serum amyloid p component nuclear magnetic resonance fourier transform spectroscopy (nobel lecture) probing protein structure by solvent pertubation of nuclear magnetic resonance spectra molecular nanotechnology low resolution solution structure of the bacillus subtilis glucose permease iia domain derived from heteronuclear three-dimensional nmr spectroscopy alternative readings of the genetic code enzyme structure and mechanism. freeman protein engineering enzyme crystal structure in a neat organic solvent ih, 13c and lsn nmr backbone assignments of the 269-residue serine protease pb92 from bacillus alcalophilus polypeptide -metal cluster connectivities in metallothionein 2 by novel i h-113cd heteronuclear two-dimensional nmr experiments design and use of heterologous microbes for conversion of d-glucose into aromatic chemicals. 
we want to thank christian cambillau, cnrs, marseille, for kindly providing us with pre-release 3-d data of human pancreatic lipase, jerry h. brown, harvard university, for sending us a pre-release dataset for the hla ii structure, alwyn jones, uppsala university, for pre-release 3-d data of candida antarctica b lipase, and john mccarthy, brookhaven national laboratory, for helping us with data on previous pdb releases. the french norwegian foundation (fns 27958) and the norwegian research council (bp 29345) have contributed with financial support to some of the research activities described in this paper. a.b. and p.m. thank junta nacional de investigação científica, portugal, for their grants.
key: cord-010604-3d37o05y authors: rein, theo title: post-translational modifications and stress adaptation: the paradigm of fkbp51 date: 2020-04-29 journal: biochem soc trans doi: 10.1042/bst20190332 sha: doc_id: 10604 cord_uid: 3d37o05y adaptation to stress is a fundamental requirement to cope with changing environmental conditions that pose a threat to the homeostasis of cells and organisms. post-translational modifications (ptms) of proteins offer a way to quickly produce proteins with new features while demanding relatively few cellular resources. fk506 binding protein (fkbp) 51 is a pivotal stress protein that is involved in the regulation of several executers of ptms. this mini-review discusses the role of fkbp51 in the function of proteins responsible for setting the phosphorylation, ubiquitination and lipidation of other proteins. examples include the kinases akt1, cdk5 and gsk3β, the phosphatases calcineurin, pp2a and phlpp, and the ubiquitin e3-ligase skp2. the impact of fkbp51 on ptms of signal transduction proteins significantly extends the functional versatility of this protein. as a stress-induced protein, fkbp51 uses re-setting of ptms to relay the effect of stress on various signaling pathways. the physiological stress response is elicited whenever a change in the environment is sensed and interpreted as a threat to homeostasis. effector systems comprise the autonomic nervous system and the hypothalamic-pituitary-adrenocortical (hpa) axis [1]. the effector molecules of these systems are noradrenaline and adrenaline for the autonomic nervous system, and cortisol for the hpa axis (corticosterone in rodents). adrenaline and noradrenaline act through g protein-coupled receptors that are coupled to various intracellular pathways involving determinants of post-translational modifications (ptms) such as kinases and phosphatases [2]. these determinants are also referred to as 'writers' and 'erasers' of ptms [3, 4].
using ptms for signal transduction comes with the obvious advantage of allowing a quick change in the mode of action of pre-synthesized proteins within a vast functional space defined by the huge number of possible combinations of ptms at different amino acids [4, 5]. moreover, ptms are reversible on a short time scale, which is another important feature for the stress response, in particular when stress exposure is short [5]. the hpa axis also activates g protein-coupled receptors, namely the receptors for the hormones corticotropin-releasing factor and adrenocorticotropic hormone [1]. however, its final effector cortisol operates through the steroid receptors glucocorticoid receptor (gr) and mineralocorticoid receptor [6]. these receptors are transcription factors, and thus this part of the overall stress response typically takes longer than the time required to redefine the function of proteins through re-setting their ptms. nevertheless, the action of these receptors also is intertwined with ptms: they are subject to regulation by ptms and they impact writers and erasers of ptms [7] [8] [9]. in part, this is achieved through the synthesis of specific proteins that take part in the orchestration of proteome function through ptms. fk506 binding protein (fkbp) 51 is one of these proteins and is the subject of this review.

the stress protein fkbp51, promoted through translational research

fkbp51 is a showcase of translational research where clinical and basic science approaches stimulated each other. the background of fkbp51 is outlined here only briefly, and the reader is referred to the numerous recent reviews for more detailed information [10] [11] [12] [13] [14] [15]. originally discovered as part of steroid receptor-heat shock protein 90 hetero-complexes, fkbp51 was shown to be a potent inhibitor of gr by several laboratories [16] [17] [18] [19].
by virtue of its binding to immunosuppressive drugs such as fk506, fkbp51 also has been classified as an 'immunophilin' [15]. biochemically, fkbp51 is able to isomerize peptidyl-prolyl bonds [20]; the physiological relevance, if any, of this function is not clear [11, 12, 21]. however, the peptidylprolyl isomerase domain of fkbp51 is engaged in protein interactions, and drug binding to this domain likely affects several functions of this protein [13]. the inducibility of fkbp51 gene (named fkbp5) expression by the activated gr [22] [23] [24] [25] [26] [27] gives rise to an intracellular ultra-short negative feedback loop, one of the hallmarks of adaptive molecular circuits [11]. of particular interest for neuropsychiatric research was the observation that fkbp51 was overexpressed in squirrel monkeys featuring altered set-points of the hpa axis [28]. thus, the molecular settings in these animals were considered a model for gr-resistance [29]. this was related to the situation in patients suffering from depression, where malfunction of gr was hypothesized to be causal for the development of the disease [30]. based on these considerations, fkbp5 was included as a candidate gene in an early gene association study in depression that found this gene linked to the response to antidepressant treatment [31]. later, fkbp5 could be linked to additional stress-related diseases such as post-traumatic stress disorder [12, 14]. these findings strongly amplified the interest in fkbp51 and stimulated research on its function and regulation in several laboratories; this greatly expanded the knowledge base on fkbp51's (patho)physiological role, its regulation on several levels and its involvement in multiple molecular pathways, going beyond stress regulation [12, 13]. the variety of its physiological functions goes along with its association with several proteins, including proteins involved in writing and erasing ptms, as detailed in the subsequent sections.
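the ultra-short negative feedback loop described above (activated gr induces fkbp51, and fkbp51 in turn inhibits gr) can be illustrated with a toy simulation. this is only a minimal sketch under stated assumptions: the functional forms, the parameter names (`k_induce`, `k_decay`, `k_inhibit`) and all numerical values are hypothetical choices for illustration and are not taken from the reviewed literature.

```python
def simulate_feedback(cortisol, steps=200, dt=0.1,
                      k_induce=1.0, k_decay=0.5, k_inhibit=2.0):
    """Toy model of the GR-FKBP51 loop (all parameters are hypothetical).

    Returns a list of (gr_activity, fkbp51_level) tuples, one per time step.
    """
    fkbp51 = 0.0
    trajectory = []
    for _ in range(steps):
        # Assumed form: GR activity rises with hormone and is damped by FKBP51.
        gr_activity = cortisol / (1.0 + k_inhibit * fkbp51)
        # Assumed form: FKBP51 is induced by active GR and decays first-order.
        fkbp51 += dt * (k_induce * gr_activity - k_decay * fkbp51)
        trajectory.append((gr_activity, fkbp51))
    return trajectory


# Same hormone input with and without the inhibitory arm of the loop.
with_loop = simulate_feedback(cortisol=1.0)
without_loop = simulate_feedback(cortisol=1.0, k_inhibit=0.0)
print("final gr activity with loop:", with_loop[-1][0])
print("final gr activity without loop:", without_loop[-1][0])
```

in this sketch, gr activity starts at its uninhibited value and settles at a lower level once fkbp51 has accumulated, which is the qualitative behaviour expected of a negative feedback loop; setting `k_inhibit=0.0` removes the feedback for comparison.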
more than 200 ptms are known with a major impact on the configuration of protein networks [32] [33] [34]; the vast majority of these modifications are reversible, with prominent examples being the attachment of chemical groups (e.g. phosphorylation, acetylation, methylation, nitrosylation, sulfonation), the conjugation with polypeptides (e.g. ubiquitination, nedd8 [neural-precursor-cell-expressed developmentally down-regulated 8], and ubiquitin-like peptides such as sumo [small ubiquitin-like modifier] or atg8 [autophagy-related gene 8]) and the addition of a complex group of molecules including, e.g. prenylation, farnesylation, glycosylation, palmitoylation, myristoylation, glutamylation, adp-ribosylation and ampylation [4, 35]. protein phosphorylation is one of the first known ptms [36, 37] and probably the best studied one, affecting almost all biological processes [38]. in eukaryotes, proteins are phosphorylated primarily through phosphoester bonds formed at the residues serine, threonine and tyrosine, and up to 2% of the protein-coding genes produce the enzymatic machinery governing this process [39]. while this review focusses on the effect of fkbp51 on ptms of other proteins, it should be noted that fkbp51 also is subject to ptms itself. this has been reviewed very recently [40], and thus is mentioned here only briefly: not surprisingly, the first reported ptm of fkbp51 was phosphorylation. originally, it was inferred from the pattern of this protein in 2d gel electrophoresis and the modulation of this pattern by the use of kinase inhibitors or phosphatases [40] [41] [42]. more recently, pten-induced putative kinase 1 (pink1) was found to phosphorylate fkbp51 at yet-to-be-mapped serine residues [43]. thereby, pink1 regulates the interaction of fkbp51 with the kinase akt1 and the phosphatase phlpp [43]. this interaction further is controlled by acetylation of fkbp51 at lysines 28 and 155 [44].
the sirtuin sirt7 has been identified as the deacetylase acting at these sites [44]. sumoylation of fkbp51 was detected and mapped to lysine 422 [45]. it regulates the inhibitory action of fkbp51 on gr [45]. the first evidence for the involvement of fkbp51 in the regulation of the phosphoproteome was provided by the observation that fk506-bound fkbp51 inhibits the serine/threonine-phosphatase calcineurin (also known as protein phosphatase 2b), thereby inhibiting nuclear factor of activated t cells (nfat) [46, 47]. it also has been reported that fkbp51 interacts with calcineurin in the absence of fk506 [48], which was not observed by others [47]. nevertheless, impacting ptms through re-arranging protein association of kinases and phosphatases, as in the case of calcineurin/nfat, is the mode of action also revealed for the effect of fkbp51 on many other signaling pathways. the inhibition of calcineurin by fkbp51 is reported to also affect the nuclear factor (nf)κb pathway [49]. in this pathway, phosphorylation of the inhibitor of κb (iκb) by the kinase of iκb (ikk) leads to activation of nfκb [50]. therefore, the protein associations of fkbp51 with calcineurin as well as with ikkα and other kinases of the nfκb pathway [51] support a model where fkbp51 impacts nfκb activity through re-setting phosphorylation at multiple levels with variable outcome [11]. for example, a direct association between fkbp51 and both tnf receptor-associated factor 2 (traf2) and ikkγ was found [52]. traf2 catalyzes k63-linked poly-ubiquitination of receptor-interacting protein 1 (rip1) in response to tnfα, thereby facilitating the recruitment of the ikk complex to its upstream activating kinase tak1 (transforming growth factor-beta-activated kinase 1) [53, 54]. fkbp51 enhances and shapes this polyubiquitin-mediated interaction, thereby changing the phosphorylation of ikk and iκb and thus the activity of nfκb [52].
the serine/threonine kinase akt (also known as protein kinase b) is another well-examined example of fkbp51's impact on ptms through organizing protein complexes. akt is activated by step-wise phosphorylation in response to extracellular signals that involves its translocation from the cytoplasm to the cell membrane [55, 56]. fkbp51 employs its ability to interact with various proteins, frequently referred to as scaffolding, to recruit the phosphatase phlpp that de-phosphorylates and thereby inactivates akt [57, 58] (figure 1a). akt is a central pathway regulator that inhibits apoptosis and promotes cell growth [59]. accordingly, fkbp51 expression is enhanced in most cancer types and is linked to resistance to chemotherapy [57, [60] [61] [62]. evidence has also been provided that fkbp51 mediates the inactivation of akt induced by stress [58]. thus, fkbp51 may also relay the effect of stress on the phosphoproteome [11]. fkbp51's effect on akt also alters downstream ptm-dependent pathways. for example, through akt1, fkbp51 impacts p38 mapk, thereby differentially regulating the transcription factors gr and peroxisome proliferator-activated receptor-γ [63]. fkbp51 also governs the effects of akt1 on its targets beclin 1 (becn1) and s-phase kinase-associated protein 2 (skp2), constituting a link to autophagy as detailed below [58, 64, 65]. the recruitment of phlpp leads to lower akt phosphorylation and activity, entailing less phosphorylation of becn1 and of the e3-ligase skp2. thereby, skp2 is less active, resulting in lower ubiquitination of becn1. whether or not fkbp51 associates with all these proteins in one complex remains to be elucidated. another downstream target of akt is glycogen synthase kinase 3β (gsk3β) [66]. consistent with the inhibitory effect of fkbp51 on akt1, it has been reported that overexpression of fkbp51 decreased the phosphorylation of gsk3β at serine 9 [57].
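the sign logic of the akt branch described above (fkbp51 recruits phlpp, phlpp lowers akt activity, akt phosphorylation activates skp2, skp2 ubiquitinates becn1) can be made explicit by multiplying edge signs along the cascade. the following is a minimal sketch under stated assumptions: the node names and the reduction of each step to a single +1 or −1 edge are simplifications introduced here for illustration, not a model proposed in the cited studies.

```python
# Signed edges: +1 = promotes/activates, -1 = reduces/inhibits.
# Node names are hypothetical labels for the steps named in the text.
EDGES = {
    ("fkbp51", "phlpp_recruitment"): +1,
    ("phlpp_recruitment", "akt_activity"): -1,
    ("akt_activity", "skp2_activity"): +1,
    ("skp2_activity", "becn1_ubiquitination"): +1,
}


def net_effect(path):
    """Multiply the edge signs along a path of consecutive nodes."""
    sign = 1
    for src, dst in zip(path, path[1:]):
        sign *= EDGES[(src, dst)]
    return sign


PATH = [
    "fkbp51",
    "phlpp_recruitment",
    "akt_activity",
    "skp2_activity",
    "becn1_ubiquitination",
]
print("net effect of fkbp51 on becn1 ubiquitination:", net_effect(PATH))
```

a negative product corresponds to the lower ubiquitination of becn1 stated in the text; the sketch ignores dose, timing and any parallel routes.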
however, it also has been found that fkbp51 associates with gsk3β and increases its phosphorylation [67]. this appears to be accomplished through rearrangement of the protein heterocomplex governing the phosphorylation and thus the activity of gsk3β. more specifically, fkbp51 recruits cyclin-dependent kinase 5 (cdk5) and furthermore associates with the three subunits of the phosphatase pp2a [67] (figure 1b), which acts in concert with cdk5 to regulate gsk3β, affecting downstream targets [68]. thus, fkbp51 redefines signaling pathway connections through protein associations. fkbp51's impact on protein phosphorylation furthermore provides a link to epigenetic regulation as well as to metabolic function. the link to epigenetics is evidenced by its impact on the phosphorylation, and thus the activity, of dna methyltransferase 1 (dnmt1) [69]. mechanistically, this effect appears to be achieved through the differential association of fkbp51 and its close homolog fkbp52 with cdk5 and its regulatory protein p35 [69]. metabolic function is impacted by fkbp51 through protein associations that dephosphorylate and thus inhibit akt2, leading to dephosphorylation and thus inhibition of as160 and reduced glucose uptake [70]. ptms are also involved in the effect of fkbp51 on microtubule dynamics. it is assumed that phosphorylation of tau leads to its dissociation from microtubules and adoption of the trans configuration of specific peptidylprolyl bonds, while the association of phosphorylated tau with fkbp51 promotes its dephosphorylation and the cis configuration that is required for microtubule association [71] [72] [73]. another mass spectrometry-based screen for fkbp51-associated proteins revealed the rho (ras homologous) gtpase-activating proteins deleted in liver cancer (dlc) 1 and dlc2 as novel interaction partners [74].
accordingly, fkbp51 enhances rhoa activity and signaling through the serine/threonine kinase rock (rho-associated coiled-coil containing protein kinase), along with the linked processes of cell migration and invasion [74]. the exact mechanism remains to be elucidated. this appears to be an example of a more indirect effect of fkbp51 on the phosphoproteome: it inhibits a protein, dlc, that serves as a gtpase activator for members of the rho family of gtpases that regulate downstream kinases [74, 75]. ubiquitination is an essential ptm in all eukaryotes [3, 76]. it determines protein stability as well as protein function through changing protein-protein interaction. biochemically, ubiquitination is a process where the 76 amino acid protein ubiquitin is covalently linked to other proteins, in most cases through the formation of an amide bond between its carboxy terminus and the ε amino group of lysine residues in the substrate proteins [34, 76]. in addition to this isopeptide bond, ubiquitin forms other links to target diverse protein residues such as cysteines, serines or threonines [3]. ubiquitination comes in the form of mono-ubiquitination and poly-ubiquitination, where distinct lysines of one ubiquitin serve as attachment sites for additional ubiquitin moieties. the site of linkage destines the modified protein to different functions. for example, poly-ubiquitination through the lysines at positions 11 and 48 typically leads to degradation of the protein through the 26s proteasome [77, 78]. the first indication that fkbp51 influences the ubiquitination of other proteins came from the observation that it stabilizes tau and protects it from becoming ubiquitinated [71]. the mode of action appears to be an indirect mechanism where fkbp51 influences the conformation and/or phosphorylation of tau to prevent the access of ubiquitinating enzymes [71, 73, 79].
for the differential effect of fkbp51 and fkbp52 on nfκb, it has been proposed that a mechanism is involved that is similar to pin1's promotion of the ubiquitin-mediated proteolysis of the nfκb subunit p65/rela [80]. further experimental evidence is awaited; in any case, this effect also would be indirect. the reported effect of fkbp51 on nfκb signaling through the association with traf2 and ikk is dependent on the k63-poly-ubiquitination of rip1, but does not appear to impact ubiquitination in this context [52]. the e3 ubiquitin-protein ligase complex members traf3 and traf6 also were discovered as fkbp51-associating proteins, with currently unknown consequences for their enzymatic activity [81]. more recently, a direct association of fkbp51 with glomulin, a regulator of the scf (skp1-cul1-f-box protein) e3 ubiquitin-protein ligase complex, has been described in detail and in comparison with other fkbps [82]. it is likely that this interaction affects the ubiquitination activity of scf; experimental investigation of this potential impact of fkbp51 on the ubiquitination machinery has not been reported yet. a yeast two-hybrid screen indicated protein associations of fkbp51 with the ubiquitin-specific peptidases (usps) 18, 36 and 49 as well as with the e3 ubiquitin ligases ring finger protein 219 and skp2 [83]. while the functional consequences were not explored in this report, another study revealed that usp49 stabilizes fkbp51 by the removal of ubiquitin chains [84]. thus, in this case, the association with a ptm executer results in fkbp51 being a target rather than a modifier. conversely, fkbp51 regulates the ubiquitination activity of skp2 through phosphorylation, probably by functioning as a protein scaffolder for the association with phlpp and akt1 [65].
this novel fkbp51 function on protein ubiquitination has far-reaching consequences: the study further identified the autophagy regulator becn1 as a novel target of skp2, which executes k48-linked poly-ubiquitination of becn1 [65]. thus, fkbp51 drives autophagy involving at least two types of ptms of becn1, ubiquitination and phosphorylation [58, 64, 65] (figure 1c). given the multiple targets of skp2 [85] [86] [87] [88], it is assumed that the ubiquitination of more proteins will be changed by fkbp51 with diverse functional consequences. autophagy is another fundamental cellular process essential for protein, organelle and energy homeostasis [89]. it involves several atg products that through several steps form autophagosomes, vesicles that engulf material destined for degradation, which is accomplished upon fusion with lysosomes [89, 90]. an important step in this process is the lipidation of atg8 (also known as microtubule associated protein 1 light chain 3 beta), a ubiquitin-like protein that is integrated into the autophagosomal membrane upon formation of an amide bond with phosphatidylethanolamine [91, 92]. similar to the process of ubiquitination, the formation of this amide bond is executed by a set of ligases (e1-e3) that use cysteine-linked thioesters as intermediates. the last step, conjugation to phosphatidylethanolamine, is mediated by the e3-like atg12-atg5:atg16 complex [93]. the effect of fkbp51 on this protein lipidation probably is indirect through driving autophagy by regulating becn1. however, fkbp51 also leads to enhanced levels of atg12, a member of the conjugation system [58], pointing to different pathways used by fkbp51 to enhance atg8 lipidation.

figure 2. the stress protein fkbp51 is intertwined with gr as both inhibitor and target. as target, it has the potential to relay the stress response to downstream pathways through the association with writers and erasers of ptms. known associations include de-ubiquitinases, ubiquitinases, protein kinases and protein phosphatases (first box, writers and erasers of ptms are grouped according to the type of their biochemical activity). some of the associated proteins are changed in their activity, others are redirected to certain targets. examples of target proteins affected by the altered activities of ptm writers and erasers are provided in the box below. '+' and '−' after the protein names indicate the overall effect of fkbp51 on the activity of associating proteins or downstream proteins. not all potential interactions are displayed. becn1 is a downstream protein that also forms a complex with fkbp51. however, most of the downstream target proteins are indirectly affected in the sense that they were not shown to associate with fkbp51.

• fkbp51 is receiving increased attention for its pivotal role in research on stress and stress-related diseases, but also in several additional fields such as immunology, metabolism, oncology, neurology, etc.
• the stress protein fkbp51 is engaged in various signaling pathways through diverse protein interactions. it both affects the stress response and is affected by the stress response, and furthermore relays stress to multiple pathways. likewise, it is the subject of ptms and affects the activity of ptm writers and erasers (figure 2).
• protein associations of fkbp51 with ptm writers and erasers could affect ptms of fkbp51 itself or of other proteins, and deciphering these scenarios will significantly contribute to our mechanistic understanding of this versatile protein. furthermore, it will be of high interest to elucidate which of the ptm-mediated downstream effects of fkbp51 contribute to the physiological stress reaction.

the author declares that there are no competing interests associated with this manuscript.
atg, autophagy-related gene; becn1, beclin 1; cdk, cyclin-dependent kinase; dlc, deleted in liver cancer; dnmt1, dna methyltransferase 1; fkbp, fk506 binding protein; gr, glucocorticoid receptor; gsk, glycogen synthase kinase; hpa axis, hypothalamic-pituitary-adrenocortical axis; iκb, inhibitor of κb; ikk, kinase of iκb; mapk, mitogen-activated protein kinase; nfκb, nuclear factor kappa b; nfat, nuclear factor of activated t cells; phlpp, pleckstrin homology domain leucine-rich repeat protein phosphatase; pp2a, protein phosphatase 2a; pparγ, peroxisome proliferator-activated receptor-γ; ptm, post-translational modification; rip1, receptor-interacting protein 1; scf, skp1-cul1-f-box protein; skp2, s-phase kinase-associated protein 2; tnf, tumor necrosis factor; traf, tnf receptor-associated factor; usp, ubiquitin-specific peptidase.

the corticotropin-releasing factor family: physiology of the stress response
molecular pathways: beta-adrenergic signaling in cancer
principles of ubiquitin-dependent signaling
orchestrating the proteome with post-translational modifications
adaptive posttranslational control in cellular stress response pathways and its relationship to toxicity testing and safety assessment
mechanisms of stress in the brain
post-translational modifications of the mineralocorticoid receptor: how to dress the receptor according to the circumstances?
role of phosphorylation in the modulation of the glucocorticoid receptor's intrinsically disordered domain
glucocorticoid receptor control of transcription: precision and plasticity via allostery
fkbp51 and fkbp52 in signaling and disease
fk506 binding protein 51 integrates pathways of adaptation: fkbp51 shapes the reactivity to environmental change
the fkbp51 glucocorticoid receptor co-chaperone: regulation, function, and implications in health and disease
the many faces of fkbp51
hsp90 and fkbp51: complex regulators of psychiatric diseases
biological actions of the hsp90-binding immunophilins fkbp51 and fkbp52
fk506-binding proteins 51 and 52 differentially regulate dynein interaction and nuclear translocation of the glucocorticoid receptor in mammalian cells
the hsp90-binding peptidylprolyl isomerase fkbp52 potentiates glucocorticoid signaling in vivo
squirrel monkey immunophilin fkbp51 is a potent inhibitor of glucocorticoid receptor binding
overexpression of the fk506-binding immunophilin fkbp51 is the common cause of glucocorticoid resistance in three new world primates
functional analysis of the hsp90-associated human peptidyl prolyl cis/trans isomerases fkbp51, fkbp52 and cyp40
peptidylprolylisomerases, protein folders or scaffolders? the example of fkbp51 and fkbp52
tissue distribution and abundance of human fkbp51, an fk506-binding protein that can mediate calcineurin inhibition
isolation and characterization of glucocorticoid- and cyclic amp-induced genes in t lymphocytes
intronic hormone response elements mediate regulation of fkbp5 by progestins and glucocorticoids
glucocorticoid-resistant b-lymphoblast cell line derived from the bolivian squirrel monkey (saimiri boliviensis boliviensis)
allele-specific fkbp5 dna demethylation mediates gene-childhood trauma interactions
glucocorticoid receptor activates poised fkbp51 locus through long-distance interactions
glucocorticoid hormone resistance during primate evolution: receptor-mediated mechanisms
the new world primates as animal models of glucocorticoid resistance
the corticosteroid receptor hypothesis of depression
polymorphisms in fkbp5 are associated with increased recurrence of depressive episodes and rapid response to antidepressant treatment
posttranslational modifications
the roles of post-translational modifications in the context of protein interaction networks
post-translational modifications in signal integration
uniprot: the universal protein knowledgebase
conversion of phosphorylase b to phosphorylase a in muscle extracts
protein-tyrosine kinases
exploring the function of dynamic phosphorylation-dephosphorylation cycles
evolution of protein kinase signaling from yeast to man
regulation of fkbp51 and fkbp52 functions by post-translational modifications
molecular cloning of human fkbp51 and comparisons of immunophilin interactions with hsp90 and progesterone receptor
the 90-kda heat-shock protein (hsp90)-binding immunophilin fkbp51 is a mitochondrial protein that translocates to the nucleus to protect cells against oxidative stress
pink1 regulates fkbp5 interaction with akt/phlpp and protects neurons from neurotoxin stress induced by mpp+
regulation of serine-threonine kinase akt activation by nad(+)-dependent deacetylase sirt7.
cell rep the activity of the glucocorticoid receptor is regulated by sumo conjugation to fkbp51 fkbp51, a novel t-cell-specific immunophilin capable of calcineurin inhibition comparative analysis of calcineurin inhibition by complexes of immunosuppressive drugs with human fk506 binding proteins calcium-and fk506-independent interaction between the immunophilin fkbp51 and calcineurin overexpression of fkbp51 in idiopathic myelofibrosis regulates the growth factor independence of megakaryocyte progenitors regulation and function of ikk and ikk-related kinases a physical and functional map of the human tnf-α/ nf-κb signal transduction pathway fkbp51 employs both scaffold and isomerase functions to promote nf-κb activation in melanoma regulation of the nf-κb-inducing kinase by tumor necrosis factor receptor-associated factor 3-induced degradation nf-κb-inducing kinase and iκb kinase participate in human t-cell leukemia virus i tax-mediated nf-κb activation aktivation mechanisms the pi3k pathway in human disease fkbp51 affects cancer cell response to chemotherapy by negatively regulating akt association of fkbp51 with priming of autophagy pathways and mediation of antidepressant treatment response: evidence in cells, mice, and humans akt/pkb signaling: navigating downstream role of fk506-binding protein 51 in the control of apoptosis of irradiated melanoma cells hsp90-binding immunophilin fkbp51 forms complexes with htert enhancing telomerase activity fkbp51 immunohistochemical expression: a new prognostic biomarker for oscc? 
fkbp51 controls cellular adipogenesis through p38 kinase-mediated phosphorylation of grα and pparν fkbp5/fkbp51 enhances autophagy to synergize with antidepressant action skp2 attenuates autophagy through beclin1-ubiquitination and its inhibition reduces mers-coronavirus infection inhibition of glycogen synthase kinase-3 by insulin mediated by protein kinase b fkbp51 inhibits gsk3β and augments the effects of distinct psychotropic medications mice lacking phosphatase pp2a subunit pr61/b 0 δ (ppp2r5d) develop spatially restricted tauopathy by deregulation of cdk5 and gsk3β chaperoning epigenetics: fkbp51 decreases the activity of dnmt1 and mediates epigenetic effects of the antidepressant paroxetine stress-responsive fkbp51 regulates akt2-as160 signaling and metabolic function the hsp90 cochaperone, fkbp51, increases tau stability and polymerizes microtubules organization and function of the fkbp52 and fkbp51 genes bending tau into shape: the emerging role of peptidyl-prolyl isomerases in tauopathies fkbp51 regulates cell motility and invasion via rhoa signaling rhoa-rock signaling as a therapeutic target in traumatic brain injury the ubiquitin system protein modifications: beyond the usual suspects' review series cellular quality control by the ubiquitin-proteasome system and autophagy accelerated neurodegeneration through chaperone-mediated oligomerization of tau nf-κb transcriptional activity is modulated by fk506-binding proteins fkbp51 and fkbp52: a role for peptidyl-prolyl isomerase activity mitochondria-nucleus shuttling fk506-binding protein 51 interacts with traf proteins and facilitates the rig-i-like receptor-mediated expression of type i ifn fkbp51 and fkbp12.6-novel and tight interactors of glomulin a quantitative chaperone interaction network reveals the architecture of cellular protein homeostasis pathways usp49 negatively regulates tumorigenesis and chemoresistance through fkbp51-akt signaling skp2 is required for ubiquitin-mediated degradation of 
the cdk inhibitor p27 expression of skp2, a p27(kip1) ubiquitin ligase, in malignant lymphoma: correlation with p27(kip1) and proliferation index kip1 meets skp2: new links in cell-cycle control dysregulated expression of skp2 and its role in hematological malignancies autophagy: renovation of cells and tissues dynamics and diversity in autophagy mechanisms: lessons from yeast the atg8 conjugation system is indispensable for proper development of autophagic isolation membranes in mice atg8 controls phagophore expansion during autophagosome formation atg systems from the protein structural point of view key: cord-013046-r6dtiu97 authors: liu, bin; zhang, deyuan; xu, ruifeng; xu, jinghao; wang, xiaolong; chen, qingcai; dong, qiwen; chou, kuo-chen title: combining evolutionary information extracted from frequency profiles with sequence-based kernels for protein remote homology detection date: 2014-02-15 journal: bioinformatics doi: 10.1093/bioinformatics/btt709 sha: doc_id: 13046 cord_uid: r6dtiu97 motivation: owing to its importance in both basic research (such as molecular evolution and protein attribute prediction) and practical application (such as timely modeling the 3d structures of proteins targeted for drug development), protein remote homology detection has attracted a great deal of interest. it is intriguing to note that the profile-based approach is promising and holds high potential in this regard. to further improve protein remote homology detection, a key step is how to find an optimal means to extract the evolutionary information into the profiles. results: here, we propose a novel approach, the so-called profile-based protein representation, to extract the evolutionary information via the frequency profiles. the latter can be calculated from the multiple sequence alignments generated by psi-blast. three top performing sequence-based kernels (svm-ngram, svm-pairwise and svm-la) were combined with the profile-based protein representation. 
various tests were conducted on a scop benchmark dataset that contains 54 families and 23 superfamilies. the results showed that the new approach is promising, and can obviously improve the performance of the three kernels. furthermore, our approach can also provide useful insights for studying the features of proteins in various families. it has not escaped our notice that the current approach can be easily combined with the existing sequence-based methods so as to improve their performance as well. availability and implementation: for users’ convenience, the source code of generating the profile-based proteins and the multiple kernel learning was also provided at http://bioinformatics.hitsz.edu.cn/main/∼binliu/remote/ contact: bliu@insun.hit.edu.cn or bliu@gordonlifescience.org supplementary information: supplementary data are available at bioinformatics online. by march 2013, 89 003 experimentally determined protein structures were deposited in the protein data bank (berman et al., 2007) . however, this number is only about one-sixth of 539 616, the number of protein sequences held in the uniprotkb/swiss-prot database (wu et al., 2006) . to timely use such vast amount of structure-unknown protein sequences for basic research and drug development, it is highly desired to determine their 3d structures and functions by means of homology approaches (chou, 2004) . unfortunately, protein remote homology detection is still a challenging problem in bioinformatics. the early methods in dealing with this problem were based on the pairwise sequence comparison approaches, such as blast (altschul et al., 1990) and smith-waterman local alignment algorithm (smith and waterman, 1981) . however, in many cases, this kind of sequence alignment method failed to detect remote homologies due to the low sequence similarities. 
later methods that tackled this problem were based on generative models, which induce a probability distribution over a protein family and then generate unknown proteins as new members of the family from the stochastic model. for example, the hidden markov model (hmm) (karplus et al., 1998) can be trained iteratively in a semi-supervised manner by using both positively labeled and unlabeled samples of a particular family to generate the positive set (qian and goldstein, 2004). recently, discriminative methods, such as the support vector machine (svm) (vapnik, 1998), were used to address this problem by focusing on the differences between protein families. the key to the svm methods is the kernel function, which computes the inner product between two samples in the feature space. the most straightforward approach to generating kernels is based on features extracted from protein sequences. svm-ngram (dong et al., 2006), svm-pairwise (liao and noble, 2003) and svm-la (saigo et al., 2004) were three of the most successful sequence-based kernels. svm-ngram (dong et al., 2006) was based on the feature space that contains all short subsequences of length n. in svm-pairwise (liao and noble, 2003), a protein sequence was represented as a vector of pairwise similarities to all protein sequences in the training set, and the inner product between these vector-space representations was taken as the kernel. svm-la (saigo et al., 2004) measured the similarity between a pair of proteins by taking into account all the optimal local alignment scores with gaps between all possible subsequences. besides these kernels, several other sequence-based kernels were also proposed, such as mismatch (leslie et al., 2004) and svm-balsa (webb-robertson et al., 2005). the profile-based kernels could further improve the performance by using the evolutionary information extracted from the profiles.
for example, top-n-grams (liu et al., 2008) extracted the profile-based patterns by considering the most frequent elements in the profiles; profile kernel (kuang et al., 2005) extracted the short substrings according to the profile-based ungapped alignment scores; some profile-based methods improved the predictive performance by developing more sensitive profiles. the hhsearch method (söding, 2005) was based on a novel profile using the hmm. in compass (sadreyev et al., 2009), numerical profiles were generated to construct optimal profile-profile alignments and to estimate the statistical significance of the corresponding alignment scores. in the meantime, some other features and techniques have been applied to this field to further improve the predictive performance. for instance, the kernel combination methodology (vbkc) (damoulas and girolami, 2008) used a single multiclass kernel machine to combine various kernels based on different feature spaces; svm-physicochemical distance transformation (pdt) (liu et al., 2012) combined the amino acid physicochemical properties and the profile features via pdt to incorporate the local sequence-order information of the entire protein sequences. also, based on the similarities between protein sequences and natural languages, natural language processing techniques were applied to this field. it was shown that the performance of building-block-based methods could be improved by using latent semantic analysis (lsa) (dong et al., 2006). moreover, protembed (melvin et al., 2011) detected protein remote homology by embedding protein sequences into a low-dimensional semantic space. as we can see from the aforementioned introduction, most of the top-performing methods were developed based on the features extracted from profiles. this is consistent with the fact that a profile is much richer than an individual sequence in encoding information. also, biology is a natural science with a historic dimension.
all biological species have developed from a limited number of ancestral species. the same is true for protein sequences (chou, 2004). their evolution involves changes of single residues, insertions and deletions of several residues, gene doubling and gene fusion (chou, 1995). as these changes accumulate over a long period, many similarities between the initial and resultant amino acid sequences are gradually eliminated, but the corresponding proteins may still share many common features, such as having basically the same biological function (loewenstein et al., 2009), folding topology, subcellular location and other attributes (chou, 2013). accordingly, the key to improving the performance of these methods is to find a suitable approach to extract the evolutionary information from the profiles. in view of this, the current study was initiated in an attempt to propose a profile-based protein representation by extracting the evolutionary information from the frequency profiles. as shown by a series of publications (liu et al., 2009; xiao et al., 2013; xu et al., 2013) and summarized in a comprehensive review, to develop a useful statistical prediction method or model for a biological system, one needs to go through the following procedures: (i) construct or select a valid benchmark dataset to train and test the predictor; (ii) formulate the samples with an effective mathematical expression that can truly reflect their intrinsic correlation with the target to be predicted; (iii) introduce or develop a powerful algorithm (or engine) to operate the prediction; (iv) properly perform cross-validation tests to objectively evaluate the anticipated accuracy of the predictor; and (v) provide the downloadable source code or a web-server for the prediction method. below, let us describe how to deal with these procedures.
suppose $S$ is a remote homology system investigated in the current study that contains 4352 protein sequences, which were taken from (liao and noble, 2003) at http://noble.gs.washington.edu/proj/svm-pairwise/. these proteins were selected from scop version 1.53 and extracted from the astral database (brenner et al., 2000). none of the 4352 proteins has pairwise sequence similarity to any other that is more significant than an e-value of $10^{-25}$ [for more about the e-value and its implication in reducing homology bias and redundancy, see (brenner et al., 2000)]. these proteins were also used by others (dong et al., 2006; lingner and meinicke, 2006; saigo et al., 2004) to study remote homology detection. the 4352 proteins in $S$ can be classified into 853 superfamilies and 1356 families; i.e.
$$S = S^{\mathrm{SF}}_{1} \cup S^{\mathrm{SF}}_{2} \cup \cdots \cup S^{\mathrm{SF}}_{853} = S^{\mathrm{F}}_{1} \cup S^{\mathrm{F}}_{2} \cup \cdots \cup S^{\mathrm{F}}_{1356} \tag{1}$$
where $S^{\mathrm{SF}}_{i}$ $(i = 1, 2, \dots, 853)$ is the ith superfamily, $S^{\mathrm{F}}_{k}$ $(k = 1, 2, \dots, 1356)$ is the kth family and the symbol $\cup$ represents the 'union' in set theory. for readers' convenience, the codes of the 4352 proteins and their sequences as well as the attributes of their families and superfamilies are given in supplementary material s1. because some families and superfamilies in $S$ do not contain a significant number of protein sequences, and also because the negative dataset for each protein family can be any proteins except those belonging to its own superfamily, selecting protein samples from $S$ to define the training and testing datasets is not straightforward. to provide a clear description, let us consider a different manner to address this.
as demonstrated by many previous studies on a series of important biological topics, such as enzyme-catalyzed reactions (zhou and deng, 1984), inhibition of human immunodeficiency virus-1 reverse transcriptase (althaus et al., 1993), drug metabolism systems (chou, 2010) and the use of wenxiang diagrams or graphs to study protein-protein interactions (zhou, 2011; zhou and huang, 2013), graphical approaches to complicated problems can provide an intuitive picture and useful insights for in-depth analysis of the relations in these systems (lin and lapointe, 2013). in view of this, let us also use a graphical approach to describe the features and relations of the families and superfamilies in $S$, as shown in figure 1, where the open circles denote the families or superfamilies that have a significant number of protein sequences and the gray circles denote those that do not. of the 1356 families in $S$ (cf. equation 1), 54 contain a significant number of proteins (see the third row of fig. 1) and form a target family set $S^{\mathrm{F}}_{\mathrm{target}}$; i.e.
$$S^{\mathrm{F}}_{\mathrm{target}} = \bigcup_{k=1}^{54} S^{\mathrm{F}}_{\mathrm{target}}(k) \tag{2}$$
of the 853 superfamilies in $S$, 23 contain at least one target family (see the open circles in the second row of fig. 1) and form a target superfamily set $S^{\mathrm{SF}}_{\mathrm{target}}$; i.e.
$$S^{\mathrm{SF}}_{\mathrm{target}} = \bigcup_{i=1}^{23} S^{\mathrm{SF}}_{\mathrm{target}}(i) \tag{3}$$
thus, we have
$$S^{\mathrm{F}}_{\mathrm{target}} \subset S^{\mathrm{SF}}_{\mathrm{target}} \subset S \tag{4}$$
meaning that $S^{\mathrm{F}}_{\mathrm{target}}$ is a subset of $S^{\mathrm{SF}}_{\mathrm{target}}$, and $S^{\mathrm{SF}}_{\mathrm{target}}$ is a subset of $S$; the three sets contain 857, 1508 and 4352 proteins, respectively. now, for each of the 54 families in the target family set $S^{\mathrm{F}}_{\mathrm{target}}$, we can define a training dataset and a testing dataset given by
$$S_{\mathrm{train}}(k) = S^{+}_{\mathrm{train}}(k) \cup S^{-}_{\mathrm{train}}(k), \qquad S_{\mathrm{test}}(k) = S^{+}_{\mathrm{test}}(k) \cup S^{-}_{\mathrm{test}}(k) \qquad (k = 1, 2, \dots, 54) \tag{5}$$
where the positive training dataset $S^{+}_{\mathrm{train}}(k)$ contains at least 10 of its superfamily members, none of which belongs to the kth family, and the positive testing dataset $S^{+}_{\mathrm{test}}(k)$ contains at least five protein domains within the family.
the proteins in the negative training and testing datasets, $S^{-}_{\mathrm{train}}(k)$ and $S^{-}_{\mathrm{test}}(k)$, were picked from $S$ by excluding the superfamily of the kth family and randomly split between the two in the same ratio as the positive ones. the 54 training and testing datasets thus obtained are given in the supplementary materials s2 and s3, respectively. the frequency profile $M$ for protein $P$ with $L$ amino acids can be represented by
$$M = \begin{bmatrix} m_{1,1} & m_{1,2} & \cdots & m_{1,L} \\ m_{2,1} & m_{2,2} & \cdots & m_{2,L} \\ \vdots & \vdots & \ddots & \vdots \\ m_{20,1} & m_{20,2} & \cdots & m_{20,L} \end{bmatrix} \tag{6}$$
where 20 is the total number of standard amino acids; $m_{i,j}$ $(0 \le m_{i,j} \le 1)$ is the target frequency, which reflects the probability of amino acid $i$ $(i = 1, 2, \dots, 20)$ occurring at the sequence position $j$ $(j = 1, 2, \dots, L)$ in protein $P$ during evolutionary processes. for each column in $M$, the elements add up to 1. the target frequency is calculated from the multiple sequence alignments generated by running psi-blast (altschul et al., 1997) against the ncbi's nr with default parameters, except that the number of iterations was set to 10 rather than 1 in the current study. the target frequency of amino acid $i$ in sequence position $j$ is calculated as:
$$m_{i,j} = \frac{\alpha f_{i,j} + \beta g_{i,j}}{\alpha + \beta} \tag{7}$$
where $f_{i,j}$ is the observed frequency of amino acid $i$ in column $j$; $\beta$ is a free parameter set to a constant value of 10, which is initially used by psi-blast; $\alpha$ is the number of different amino acids in column $j$ minus 1; and $g_{i,j}$ is the pseudo-count for amino acid $i$ in protein sequence position $j$, which can be calculated as:
$$g_{i,j} = \sum_{k=1}^{20} f_{k,j} \, \frac{q_{i,k}}{p_{k}} \tag{8}$$
where $p_{k}$ is the background frequency of amino acid $k$, and $q_{i,k}$ is the score of amino acid $i$ being aligned to amino acid $k$ in the blosum62 substitution matrix, which is the default score matrix of psi-blast (altschul et al., 1997). although methods that use amino acid sequence information achieve a certain degree of success, sequence information alone cannot accurately detect protein remote homology.
recent studies demonstrated that the profile-based methods show better performance because a profile encodes richer information than an individual sequence. however, a profile is a 2d matrix, whereas a protein sequence is an amino acid string. therefore, the 2d evolutionary profile information cannot be directly incorporated into the sequence-based methods for prediction. to deal with this problem, we propose an approach to convert the frequency profiles into a series of profile-based proteins. thus, the existing sequence-based methods can be directly performed on these proteins for the prediction. the target frequencies in the frequency profiles reflect the probabilities of the corresponding amino acids appearing in the specific sequence positions. the higher the frequency is, the more likely the corresponding amino acid is to occur. it is therefore reasonable to use the nth most frequent amino acid at each position of the frequency profile to represent the protein sequence. below is the description of how to convert frequency profiles into profile-based proteins. given the frequency profile m for protein p (equation 6), we can sort the amino acids in each column in descending order of frequency. the frequency profile thus obtained by the sorting operation is called the sorted frequency profile and denoted by m′. for each row in m′, the amino acids are combined to produce a profile-based protein. by following this approach, the frequency profile m is converted into 20 profile-based proteins p1, p2, ..., p20 (supplementary fig. s1 in supplementary material s4), which contain the evolutionary information in the frequency profile. these 20 proteins differ in importance. during the evolutionary process, protein p is most likely to transform into p1 and least likely to transform into p20.
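the conversion described above can be sketched in a few lines of python. this is a minimal illustration and not the authors' released code: the function name profile_based_proteins, the row order of the alphabet and the tie-breaking rule (a stable sort, so tied amino acids keep alphabet order) are assumptions made here.

```python
import numpy as np

# Assumed row order of the 20 standard amino acids in the frequency profile.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def profile_based_proteins(profile, alphabet=AMINO_ACIDS):
    """Convert a (len(alphabet) x L) frequency profile into profile-based proteins.

    The n-th returned string is built from the n-th most frequent amino acid
    at every sequence position, i.e. row n of the sorted frequency profile.
    """
    profile = np.asarray(profile, dtype=float)
    if profile.ndim != 2 or profile.shape[0] != len(alphabet):
        raise ValueError("profile must have one row per amino acid")
    # Sort every column by descending frequency (stable, so ties keep alphabet order).
    order = np.argsort(-profile, axis=0, kind="stable")
    letters = np.array(list(alphabet))
    # Each row of `order` indexes the amino acids of one profile-based protein.
    return ["".join(letters[row]) for row in order]


if __name__ == "__main__":
    # Toy profile over a hypothetical 3-letter alphabet with L = 2 positions.
    toy = [[0.7, 0.1],
           [0.2, 0.3],
           [0.1, 0.6]]
    print(profile_based_proteins(toy, alphabet="ACD"))
```

with a real 20 x l profile the function returns p1 to p20 in order, so the first few entries are the ones a sequence-based kernel would be run on.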
for reader's convenience, the source code for generating the profile-based proteins is accessible at http://bioinformatics.hitsz.edu.cn/main/~binliu/remote/. three state-of-the-art sequence-based kernels [svm-ngram (dong et al., 2006), svm-pairwise (liao and noble, 2003) and svm-la (saigo et al., 2004)] were used to validate whether the proposed approach could improve their performance. in the svm-ngram (dong et al., 2006) method, the ngrams were the set of all possible subsequences of amino acids of a fixed length. a protein sequence was mapped to a feature vector by the occurrence frequency of each ngram. the value of n was set at 3 as suggested by the authors (dong et al., 2006), and therefore the dimension of the vector is 8000. at the heart of the svm is a kernel function that acts as a similarity score between pairs of vectors. the kernel was normalized so that each vector had length 1 in the feature space:
$$K'(x, y) = \frac{K(x, y)}{\sqrt{K(x, x)\,K(y, y)}}$$
where $x$ and $y$ are two proteins in the dataset. this normalization step was also used by svm-pairwise (liao and noble, 2003) and svm-la (saigo et al., 2004). the normalized kernel thus obtained was then transformed into a radial basis kernel. in the svm-pairwise (liao and noble, 2003) method, the feature vector was a list of pairwise sequence similarity scores, computed with respect to all of the sequences in the training set. the radial basis function was used as the kernel. the remaining steps were the same as the ones used in svm-ngram (dong et al., 2006). in svm-la (saigo et al., 2004), the kernel was calculated by summing up scores obtained from the local alignments with gaps between the two sequences, computed by the smith-waterman dynamic programming algorithm. such a kernel might not be positive definite, and the authors (saigo et al., 2004) provided two solutions for this problem. owing to its performance and simplicity, we implemented one of the two methods, namely, the la-ekm kernel.
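returning to the svm-ngram feature map and the unit-length normalization described above, the following python sketch shows one way to build the 8000-dimensional 3-gram frequency vector and the normalized linear kernel. the function names, the alphabet order and the choice to skip n-grams containing non-standard residues are assumptions for illustration; the subsequent radial basis transformation is not shown.

```python
from functools import lru_cache
from itertools import product

import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering of the 20 standard amino acids


@lru_cache(maxsize=None)
def _ngram_index(n, alphabet):
    """Map every possible n-gram over the alphabet to a vector position."""
    return {"".join(p): i for i, p in enumerate(product(alphabet, repeat=n))}


def ngram_vector(sequence, n=3, alphabet=AMINO_ACIDS):
    """Occurrence-frequency vector of all n-grams (20**3 = 8000 entries for n = 3)."""
    index = _ngram_index(n, alphabet)
    vec = np.zeros(len(index))
    for i in range(len(sequence) - n + 1):
        j = index.get(sequence[i:i + n])
        if j is not None:  # assumption: n-grams with non-standard residues are skipped
            vec[j] += 1.0
    total = vec.sum()
    return vec / total if total > 0 else vec


def normalized_kernel(x, y):
    """K'(x, y) = K(x, y) / sqrt(K(x, x) K(y, y)) for the linear kernel K(x, y) = x . y."""
    kxx, kyy = float(x @ x), float(y @ y)
    if kxx == 0.0 or kyy == 0.0:
        return 0.0
    return float(x @ y) / np.sqrt(kxx * kyy)


if __name__ == "__main__":
    a = ngram_vector("ACDACDEF")
    b = ngram_vector("ACDEFGHI")
    print(a.shape, normalized_kernel(a, b))
```

the same two functions can be applied unchanged to a raw sequence p or to a profile-based protein such as p1, since both are plain amino acid strings.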
the parameters of the la-ekm kernel took the optimal values ($\beta = 0.5$, $d = -11$, $e = -4$). the kernels described in the previous section can be used by kernel methods to train the svm classifier. each kernel contains different discriminative information, and therefore combining the kernels automatically is a promising way to improve the performance. in the machine learning field, this approach is called multiple kernel learning (mkl) (cortes et al., 2010; varma and babu, 2009), which has attracted a lot of attention recently. the mkl technique aims to combine different kernels to improve the performance, and has shown state-of-the-art results in image classification (varma and babu, 2009). in this article, we focused on the weighted linear combination of kernels. the weight of each kernel can be optimized based on different criteria, and the methods fall into two groups. one group is the one-stage kernel learning methods, which optimize the weights and the svm objective function simultaneously (varma and babu, 2009). these methods suffer from high training complexity. the other group is the two-stage kernel learning methods, which first optimize the weights by using a criterion and then train the svm classifier using the kernel combined with the learned weights. compared with one-stage learning methods, the two-stage kernel learning methods showed better performance with reduced training cost. therefore, in this study, we adopted the two-stage kernel learning method. specifically, the kernel target alignment (kta) objective function was used to optimize the weight of each kernel; it has theoretical guarantees and can improve the performance in practice (cortes et al., 2010; varma and babu, 2009). given $m$ training samples $x_1, x_2, \dots, x_m$ and their corresponding labels $y_1, y_2, \dots, y_m$, the ideal kernel matrix can be formulated as $K_y = y^{T} y$, where $y$ is the vector of labels $[y_1, y_2, \dots, y_m]$.
for the given $n$ kernels $K_1, K_2, \dots, K_n$, the aim is to learn the weight of each kernel. to avoid the kernel scaling problem, we center kernel $K_k$ and the corresponding ideal kernel $K_y$ in feature space by the following equation:
$$K_{ck} = \left[ I - \frac{\mathbf{1}\mathbf{1}^{T}}{m} \right] K_k \left[ I - \frac{\mathbf{1}\mathbf{1}^{T}}{m} \right]$$
where $I$ is the identity matrix and $\mathbf{1}$ is the all-ones vector, and $K_{ck}$ is normalized by:
$$K'_k = \frac{K_{ck}}{\lVert K_{ck} \rVert_F}$$
following the above steps, each kernel is normalized, and then these $n$ kernels are linearly combined by the following equation:
$$K_{\mathrm{comb}} = \sum_{k=1}^{n} w_k K'_k$$
where $w_k$ $(0 \le w_k \le 1, \ \sum_{k=1}^{n} w_k = 1)$ is the weight of kernel $K'_k$. the weight is learned by the kta objective function, which maximizes the alignment between $K_{\mathrm{comb}}$ and the centered ideal kernel $K_{cy}$. this leads to a quadratic programming problem that can be solved quite efficiently. for implementation details, please refer to (cortes et al., 2010). in this study, three kernels ($K_{P}$, $K_{P1}$ and $K_{P2}$) for each selected sequence-based method were linearly combined by using the above kta approach to further improve the performance. for reader's convenience, the source code of the mkl is accessible at http://bioinformatics.hitsz.edu.cn/main/~binliu/remote/. svm is a class of supervised learning algorithms first introduced by vapnik (1998). svm-based machine learning algorithms have been successfully used to investigate various problems in molecular biology, such as identifying dna recombination spots, membrane protein types (cai et al., 2003) and heat-shock protein functions (feng et al., 2013), among many others. in this study, the publicly available gist svm package (http://www.chibi.ubc.ca/gist/) was used. because the test sets have many more negative than positive samples, simply measuring error rates will not give a good evaluation of performance. for cases in which the positive and negative samples are not evenly distributed, the best way to evaluate the trade-off between specificity and sensitivity is to use a receiver operating characteristic (roc) score (gribskov and robinson, 1996).
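the two-stage weighting described earlier in this section (center each kernel, normalize it, then choose non-negative weights that maximize alignment with the centered ideal kernel) can be sketched as follows. this is a hedged illustration rather than the released mkl code: the frobenius-norm normalization, the projected-gradient loop used in place of a dedicated quadratic programming solver, the final rescaling of the weights to sum to one and all function names are assumptions made here.

```python
import numpy as np


def center(K):
    """Center a kernel matrix in feature space: H K H with H = I - 11^T / m."""
    K = np.asarray(K, dtype=float)
    m = K.shape[0]
    H = np.eye(m) - np.ones((m, m)) / m
    return H @ K @ H


def center_and_normalize(K):
    """Center K, then scale it to unit Frobenius norm (assumed normalization)."""
    Kc = center(K)
    norm = np.linalg.norm(Kc)  # Frobenius norm for a 2-D array
    return Kc / norm if norm > 0 else Kc


def kta_weights(kernels, labels, n_iter=5000):
    """Non-negative kernel weights that maximize alignment with the ideal kernel y y^T.

    Minimizes v^T M v - 2 v^T a subject to v >= 0 by projected gradient descent,
    where M holds pairwise Frobenius products of the normalized kernels and
    a holds their products with the centered ideal kernel.
    """
    y = np.asarray(labels, dtype=float)
    normalized = [center_and_normalize(K) for K in kernels]
    ideal = center(np.outer(y, y))
    M = np.array([[np.sum(Ki * Kj) for Kj in normalized] for Ki in normalized])
    a = np.array([np.sum(Ki * ideal) for Ki in normalized])
    lam = np.linalg.eigvalsh(M)[-1]  # largest eigenvalue sets a safe step size
    uniform = np.full(len(normalized), 1.0 / len(normalized))
    v = uniform.copy()
    if lam > 0:
        for _ in range(n_iter):
            v = np.maximum(v - (M @ v - a) / lam, 0.0)  # gradient step, then projection
    total = v.sum()
    return v / total if total > 0 else uniform  # rescale so the weights sum to 1


def combine(kernels, weights):
    """Weighted linear combination of the centered, normalized kernels."""
    return sum(w * center_and_normalize(K) for w, K in zip(weights, kernels))


if __name__ == "__main__":
    y = np.array([1, 1, 1, -1, -1, -1])
    informative = np.outer(y, y).astype(float)  # perfectly aligned toy kernel
    uninformative = np.eye(6)                   # carries no label information
    print(kta_weights([informative, uninformative], y))
```

in the toy example nearly all of the weight goes to the kernel that matches the labels, which mirrors the situation reported later where one kernel dominates the combination.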
an roc score is the normalized area under a curve that plots true positives against false positives for different classification thresholds. a score of 1 means perfect separation of positive samples from negative ones, whereas a score of 0 means that none of the sequences selected by the algorithm is positive. another performance measure is the roc50 score, which is the area under the roc curve up to the first 50 false positives. the frequency profile of a protein p can be converted into 20 profile-based proteins (p1, p2, ..., p20) by using the proposed approach (see section 2 for details). these 20 proteins differ in importance. p1 is the most important protein, as it is the combination of the most frequent amino acids in the frequency profile, whereas p20 is the profile-based protein into which protein p is least likely to convert because it is the combination of the amino acids with the lowest frequencies in the frequency profile. if all 20 profile-based proteins are used in the prediction, the computational cost is relatively high. in this study, only the top n most important profile-based proteins (p1, ..., pn) were used in the prediction. to select the value of n, the following experiment was conducted. the frequencies of the 20 standard amino acids in each column of a frequency profile add up to 1. therefore, the average frequency is 0.05 (1/20 = 0.05). if an amino acid has a frequency > 0.05, it is likely to occur during the evolutionary process; otherwise, it is not likely to occur. the percentage of amino acids with frequencies > 0.05 in each profile-based protein on the scop benchmark was calculated, and the results are shown in figure 2. as we can see from the figure, such amino acids are abundant in profile-based proteins p1, p2 and p3 (99.99%, 99.60% and 98.13%, respectively), but for the other 17 profile-based proteins, the percentage decreases significantly (from 89.28 to 0%).
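the roc and roc50 scores defined above can be computed directly from a ranked list of predictions. the sketch below is an illustration under stated assumptions, not the evaluation code used in the study: the function name roc_n, the convention that labels greater than zero are positives and the handling of tied scores (input order is kept) are choices made here.

```python
import numpy as np


def roc_n(scores, labels, max_fp=None):
    """Normalized area under the ROC curve up to the first `max_fp` false positives.

    With max_fp=None the full ROC score is returned; max_fp=50 gives ROC50.
    Labels greater than zero are treated as positives (assumed convention).
    """
    scores = np.asarray(scores, dtype=float)
    is_positive = np.asarray(labels) > 0
    order = np.argsort(-scores, kind="stable")  # highest score first, ties keep input order
    ranked = is_positive[order]
    n_pos = int(ranked.sum())
    n_neg = ranked.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    limit = n_neg if max_fp is None else min(max_fp, n_neg)
    true_pos = false_pos = area = 0
    for positive in ranked:
        if positive:
            true_pos += 1
        else:
            false_pos += 1
            area += true_pos  # positives ranked above this false positive
            if false_pos == limit:
                break
    return area / (limit * n_pos)


if __name__ == "__main__":
    s = [0.9, 0.8, 0.7, 0.6]
    y = [1, -1, 1, -1]
    print(roc_n(s, y), roc_n(s, y, max_fp=50))
```

because the benchmark test sets contain far more negatives than positives, truncating at 50 false positives focuses the score on the top of the ranking, which is where a user of a homology search would look.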
therefore, in this study, only the top three profile-based proteins were used in the prediction. these profile-based proteins were combined with three state-of-the-art methods based on sequence composition, including svm-ngram (dong et al., 2006), svm-pairwise (liao and noble, 2003) and svm-la (saigo et al., 2004), and the results are shown in supplementary table s1 of supplementary material s4. for each of the three methods, the best performance was achieved for the most important protein p1. compared with the methods performed on the raw protein sequence p, the performance of the proposed methods is improved by 3.7 to 7.5% and 9.6 to 13.7% in terms of average roc and roc50 scores, respectively, indicating that the proposed profile-based protein representation is useful for protein remote homology detection. the performance of the methods performed on p2 is similar to that of the methods performed on the raw protein p. the predictive results of the methods performed on p3 were the lowest. these results are consistent with the different importance of the three profile-based proteins p1, p2 and p3. besides the current method, there are some other methods for predicting protein remote homologies based on profiles, such as svm-top-n-gram-combine-lsa (liu et al., 2008), svm-pdt-profile (liu et al., 2012), profile (kuang et al., 2005), biosvm-2l (muda et al., 2011) and hhsearch (söding, 2005). svm-top-n-gram-combine-lsa (liu et al., 2008) extracted the building blocks of proteins from the frequency profiles, which could be treated as the 'words' of protein language. the lsa (dong et al., 2006) was applied to further improve the performance of this method. svm-pdt-profile (liu et al., 2012) combined the amino acid physicochemical properties in the amino acid index (aaindex) (kawashima et al., 2008) with the frequency profiles for the prediction.
the feature vector of the profile method (kuang et al., 2005) was constructed from the short subsequences whose pssm-based ungapped alignment score was above a predefined threshold. biosvm-2l constructed two-layer svm classifiers with profile-based kernels (muda et al., 2011). all the above three methods were based on svm, and the difference among them was in the extracted features. hhsearch (söding, 2005), which used a novel profile-based hmm, was one of the best protein remote homology detection methods. the results obtained by these four methods on the scop benchmark are listed in supplementary table s1 of supplementary material s4, from which we can see that the current method outperforms svm-top-n-gram-combine-lsa (liu et al., 2008), svm-pdt-profile (liu et al., 2012) and biosvm-2l (muda et al., 2011) and is highly comparable with profile (kuang et al., 2005) and hhsearch (söding, 2005), indicating that the profile-based protein representation is a promising approach to extract the evolutionary information from frequency profiles for protein remote homology detection. as mentioned above, the approaches based on the top two profile-based proteins p1, p2 and the raw protein p are among the top performing methods. it is interesting to investigate whether these methods can be combined to further improve the performance. in this study, the mkl framework was used to combine these methods. the kta method was used to automatically optimize the weight of each kernel on the training set, and the kernels were then combined with these weights into a single kernel for the svm-based prediction. the results are shown in supplementary table s2 of supplementary material s4 as well as supplementary materials s5-s7. the mkl approach can improve the performance of svm-ngram (dong et al., 2006), but has only a minor impact on svm-pairwise (liao and noble, 2003) and svm-la (saigo et al., 2004). to uncover the reason, the weight of each kernel was analyzed.
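the idea of weighting kernels by kernel-target alignment before summing them can be sketched as follows. the exact kta weighting rule used in the study is not given in this text, so the rule below (weights proportional to the non-negative, uncentered alignment of each kernel with the ideal kernel y y^T) is an assumption, as are all function names and the toy kernels.

```python
import math


def frobenius(a, b):
    """Frobenius inner product of two equally sized matrices (lists of lists)."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def alignment(kernel, labels):
    """Uncentered kernel-target alignment with the ideal kernel y y^T."""
    target = [[yi * yj for yj in labels] for yi in labels]
    return frobenius(kernel, target) / math.sqrt(
        frobenius(kernel, kernel) * frobenius(target, target))


def combine_kernels(kernels, labels):
    """Weight each kernel by its alignment (assumed rule) and sum the kernels."""
    scores = [max(alignment(k, labels), 0.0) for k in kernels]
    total = sum(scores)
    if total == 0:
        # Fall back to uniform weights if no kernel aligns with the labels.
        weights = [1.0 / len(kernels)] * len(kernels)
    else:
        weights = [s / total for s in scores]
    n = len(labels)
    combined = [[sum(w * k[i][j] for w, k in zip(weights, kernels))
                 for j in range(n)] for i in range(n)]
    return weights, combined


# Toy example: labels are +1 (family member) or -1 (non-member).
labels = [1, 1, -1, -1]
ideal = [[yi * yj for yj in labels] for yi in labels]            # perfectly aligned kernel
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
weights, combined = combine_kernels([ideal, identity], labels)
print(weights)
```

under this rule a kernel whose weight falls close to zero contributes almost nothing to the combined kernel, which is the situation the weight analysis is meant to detect.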
for each kernel, the average weight on all the 54 protein families is shown in supplementary table s2 of supplementary material s4. for these three methods, the p1-based kernel was weighted most heavily. for svm-pairwise (liao and noble, 2003) and svm-la (saigo et al., 2004), the weight values of their corresponding p and p2 kernels are <0.1, indicating that these kernels have only a minor impact on the final results, and hence the performance improvement is modest. vbkc (damoulas and girolami, 2008) is another method based on mkl, which combined four string kernels: svm-pairwise (liao and noble, 2003), svm-la (saigo et al., 2004), svm-mm (leslie et al., 2004) and svm-mono (lingner and meinicke, 2006). our proposed svm-pairwise-kta and svm-la-kta outperform vbkc (damoulas and girolami, 2008) by 1.2 to 2.2% and 29.9 to 31.3% according to the average roc and roc50 scores, respectively. this clear performance improvement is mainly due to the proposed profile-based protein representation and the mkl approach. the svm-ngram (dong et al., 2006) method is based on an explicit feature space representation, which makes it possible to measure the correlations between ngrams and protein families. the sequence-specific weight learnt from the svm training process can be used to calculate the discriminant weight for each ngram, which indicates the importance of the corresponding ngram. following the approach of lingner and meinicke (2008), given the weight vector of a set of m sequences obtained from the kernel-based training process, $\alpha = [\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_m]$, the discriminant weight vector $w$ in the feature space can be calculated by the following equation:
$$w = F\alpha \qquad (14)$$
where $F$ is the matrix of sequence representatives. the magnitude of each element in $w$ represents the discriminative power of the corresponding feature. in most protein families, kernel p1 is weighted more heavily than kernel p and kernel p2.
two such protein families (scop id: 2.1.1.4 and 3.2.1.5) were selected from the scop benchmark for further study, and the results are shown in supplementary tables s3 and s4 of supplementary material s4, respectively. for each kernel, the top 10 most discriminative ngram features calculated by equation 14 are also shown in the tables. for protein family 2.1.1.4, kernel p and kernel p1 share some of their most discriminative ngrams, such as 'mtm', 'yty', 'mtf' and 'wwf', indicating that these ngrams remain stable during the evolutionary process and are therefore likely to be important sequence patterns for maintaining the structure and function of this protein family (supplementary table s3 of supplementary material s4). however, kernel p and kernel p1 share few of their most discriminative ngrams in protein family 3.2.1.5. the top 10 most discriminative ngrams of kernel p1 are all different from those in kernel p (supplementary table s4 of supplementary material s4). these ngrams would contribute to the higher discriminative power of kernel p1 for this protein family. although in most cases kernel p1 was weighted most heavily, some exceptions were observed. for example, for protein family 7.3.6.1, kernel p2 is the most discriminative kernel with a weight value of nearly 1, while the other two kernels contribute little to the mkl (supplementary table s5 of supplementary material s4). the top 10 most discriminative ngrams for each kernel were investigated, and the results are shown in supplementary table s5 of supplementary material s4, from which some interesting patterns can be observed. the ngrams containing amino acids 'n' and 'f' tended to show strong discriminative power in both kernel p and kernel p1, whereas amino acid 'a' was abundant in the top discriminative ngrams in kernel p2, indicating that the ngrams with amino acid 'a' could better describe the properties of protein family 7.3.6.1 in the evolutionary process.
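the discriminant weight calculation referred to above as equation 14 can be sketched in python for an explicit ngram feature space. this is a minimal illustration under stated assumptions: each column of the matrix of sequence representatives is taken to be the ngram count vector of one training sequence, the alpha values are assumed to be signed svm dual coefficients, and the function names and toy inputs are hypothetical.

```python
from collections import Counter, defaultdict


def ngram_counts(sequence, n=3):
    """Count overlapping ngrams of length n in a sequence."""
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def discriminant_weights(sequences, alphas, n=3):
    """Compute w = F * alpha, where column i of F is the ngram count
    vector of sequence i and alphas[i] is its learnt (signed) weight."""
    weights = defaultdict(float)
    for seq, alpha in zip(sequences, alphas):
        for gram, count in ngram_counts(seq, n).items():
            weights[gram] += alpha * count
    return dict(weights)


def top_ngrams(weights, k=10):
    """Rank ngrams by the magnitude of their discriminant weight."""
    return sorted(weights, key=lambda g: -abs(weights[g]))[:k]


# Toy input (hypothetical sequences and weights, not from the benchmark).
w = discriminant_weights(["MTMA", "AMTM"], [0.5, -0.2])
print(top_ngrams(w, k=2))
```

ranking by magnitude follows the statement that the magnitude of an element of w reflects the discriminative power of the corresponding feature; a ranking by signed value would instead favour ngrams associated with the positive class.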
3.5 application of the proposed remote homology detection methods for studying the 3d structure of nck5a
in addition to providing useful insights for evolutionary studies, protein remote homology detection is useful for drug development as well. as is well known, many drug-targeted proteins still lack an x-ray or nuclear magnetic resonance structure. pharmaceutical scientists have to resort to the homology modeling technique or structural bioinformatics tools (chou, 2004) to develop their 3d structures in a timely manner, so as to be able to conduct molecular docking studies (chou et al., 2003; wang et al., 2009), one of the key steps in structure-based drug design. however, a reliable template, i.e. a protein of known structure that is homologous to the target protein, is a necessary prerequisite in this regard (chou, 2004). unfortunately, many target proteins did not have significant sequence similarity with any protein of known structure, and hence it was hard to find a proper template to develop their 3d structures. actually, many of them did have homologous proteins of known structure, but the problem was how to detect them. for example, the sequence similarity between nck5a and cyclina was <20% (chou et al., 1999) and hence their homologous relationship could not be detected by the simple sequence alignment technique (mohabatkar, 2010). now let us see what happens if the current remote homology detection technique is applied. to this end, a dataset was constructed based on scop, from which 11 proteins were selected as the positive samples in the cyclin family (scop id: a.74.1.1), while 3605 negative samples were selected from scop version 1.67 by excluding all the proteins within the cyclin-like superfamily. none of these proteins shares >95% sequence similarity. trained with these 11 positive proteins and 3605 negative proteins, the best performing proposed method, svm-la (p1), was used to predict nck5a.
it was found that nck5a is homologous to cyclina, fully consistent with the experimental results obtained by the site-directed mutagenesis studies (tang et al., 1997) . actually, chou et al. (1999) did use cyclina as a template to construct the 3d structure of activation domain of nck5a, one of the important parts of tau protein kinase ii, an important therapeutic target against alzheimer's disease. furthermore, based on the computed structure thus obtained, the molecular truncation experiments (zhang et al., 2002) were conducted with an outcome that confirmed and validated the structure computed by using such a remote homologous protein as a template. therefore, it is anticipated that the proposed method for detecting remote homology proteins will certainly enhance the power of homology modeling, and hence have impacts on drug development as well. discriminative methods based on svm are the most effective and accurate methods for protein remote homology detection. the performance of the svm-based methods depends on the kernel function, which measures the similarity between the samples in any pair. varieties of kernels based on sequence composition have been proposed. however, these methods often fail to accurately predict the proteins sharing low sequence similarity. recently, methods using the evolutionary information extracted from profiles achieved great success, such as profile (kuang et al., 2005) , sw-pssm (rangwala and karypis, 2005) , svm-top-ngram (liu et al., 2008) and svm-acc (liu et al., 2011) . a key step to improve the performance of these methods is in how to find a suitable approach to incorporate the evolutionary information extracted from the profiles for prediction. in this article, we proposed a method that can convert the frequency profile into a series of profile-based proteins. three state-of-the-art sequence-based kernels, i.e. 
svm-ngram (dong et al., 2006), svm-pairwise (liao and noble, 2003) and svm-la (saigo et al., 2004), were selected for demonstration on a well-known benchmark. it was shown that the methods based on the profile-based proteins p1 and p2 achieved the best performance, outperforming the original three string kernels by 3.7 to 7.5% and 9.6 to 13.7%, respectively, according to the average roc and roc50 scores. these results are fully consistent with our previous findings that the top two most frequent amino acids show stronger discriminative power than the other, less frequent amino acids in the frequency profiles (liu et al., 2008), further confirming that the proposed profile-based protein representation is a promising approach to extracting the evolutionary information from frequency profiles for protein remote homology detection. it has not escaped our notice that the current approach can be easily combined with sequence-based methods, and hence, with the development of the sequence-based kernels, the currently proposed method can be further improved accordingly. it is instructive to point out that since the concept of pseudo amino acid composition, or chou's pseaac (lin and lapointe, 2013), was introduced in 2001 (chou, 2001), it has been successfully used to predict various attributes of proteins (e.g. chen and li, 2013; chou, 2005; georgiou et al., 2009; huang and yuan, 2013; khosravian et al., 2013; mohabatkar, 2010; mohabatkar et al., 2011, 2013; mohammad beigi et al., 2011; nanni et al., 2012; sahu and panda, 2010; zhang et al., 2008; zhou et al., 2007; liu et al., 2013). accordingly, there is high potential to develop a powerful method for protein remote homology detection by combining pseaac with the profile-based protein representation. the original pseaac uses only three indices: the hydrophobicity index, the hydrophilicity index and the side-chain mass index.
because protein remote homology detection is a more difficult problem, proteins in the dataset only share low sequence similarity. only these three indices would not be enough to capture the different properties of various proteins. therefore, our further research will focus on incorporating new amino acid indices into pseaac and applying it to protein remote homology detection. conflict of interest: none declared steady-state kinetic studies with the non-nucleoside hiv-1 reverse transcriptase inhibitor u-87201e basic local alignment search tool gapped blast and psi-blast: a new generation of protein database search programs the worldwide protein data bank (wwpdb): ensuring a single, uniform archive of pdb data the astral compendium for sequence and structure analysis support vector machines for predicting membrane protein types by using functional domain composition irspot-psednc: identify recombination spots with pseudo dinucleotide composition predicting membrane protein types by incorporating protein topology, domains, signal peptides, and physicochemical properties into the general form of chou's pseudo amino acid composition the convergence-divergence duality in lectin domains of the selectin family and its implications prediction of protein cellular attributes using pseudo amino acid composition review: structural bioinformatics and its impact to biomedical science using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes graphic rule for drug metabolism systems some remarks on protein attribute prediction and pseudo amino acid composition (50th anniversary year review) some remarks on predicting multi-label attributes in molecular biosystems a model of the complex between cyclin-dependent kinase 5 (cdk5) and the activation domain of neuronal cdk5 activator binding mechanism of coronavirus main proteinase with ligands and its implication to drug design against sars wenxiang: a web-server for drawing wenxiang diagrams two-stage 
learning kernel algorithms probabilistic multi-class multi-kernel learning: on protein fold recognition and remote homology detection application of latent semantic analysis to protein remote homology detection ihsp-pseraaac: identifying the heat shock protein families using pseudo reduced amino acid alphabet composition use of fuzzy clustering technique and matrices to classify amino acids and its impact to chou's pseudo amino acid composition use of receiver operating characteristic (roc) analysis to evaluate sequence matching a multilabel model based on chou's pseudo-amino acid composition for identifying membrane proteins with both single and multiple functional types hidden markov models for detecting remote protein homologies aaindex: amino acid index database, progress report predicting antibacterial peptides by the concept of chou's pseudo-amino acid composition and machine learning methods profile-based string kernels for remote homology detection and motif extraction mismatch string kernels for discriminative protein classification combining pairwise sequence similarity and support vector machines for detecting remote protein evolutionary and structural relationships theoretical and experimental biology in one remote homology detection based on oligomer distances word correlation matrices for protein sequence analysis and remote homology detection a discriminative method for protein remote homology detection and fold recognition combining top-n-grams and latent semantic analysis prediction of protein binding sites in protein structures using hidden markov support machine using amino acid physicochemical distance transformation for fast protein remote homology detection protein remote homology detection by combining chou's pseudo amino acid composition and profile-based protein representation protein remote homology detection based on auto-cross covariance transformation protein function annotation by homology-based inference detecting remote evolutionary 
relationships among proteins by large-scale semantic embedding prediction of cyclin proteins using chou's pseudo amino acid composition prediction of gaba(a) receptor proteins using the concept of chou's pseudo-amino acid composition and support vector machine prediction of allergenic proteins by means of the concept of chou's pseudo amino acid composition and a machine learning approach prediction of metalloproteinase family based on the concept of chou's pseudo amino acid composition using a machine learning approach remote protein homology detection and fold recognition using two-layer support vector machine classifiers identifying bacterial virulent proteins by fusing a set of classifiers based on variants of chou's pseudo amino acid composition and on evolutionary information performance of an iterated t-hmm for homology detection profile-based direct kernels for remote homology detection and fold recognition compass server for homology detection: improved statistical accuracy, speed and functionality a novel feature representation method based on chou's pseudo amino acid composition for protein structural class prediction protein homology detection using string alignment kernels identification of common molecular subsequences protein homology detection by hmm-hmm comparison cyclin-dependent kinase 5 (cdk5) activation domain of neuronal cdk5 activator. 
evidence of the existence of cyclin fold in neuronal cdk5a activator statistical learning theory more generality in efficient multiple kernel learning insights from investigating the interactions of adamantanebased drugs with the m2 proton channel from the h1n1 swine virus svm-balsa: remote homology detection based on bayesian sequence alignment the universal protein resource (uniprot): an expanding universe of protein information icdi-psefpt: identify the channel-drug interaction in cellular networking with pseaac and molecular fingerprints isno-aapair: incorporating amino acid pairwise coupling into pseaac for predicting cysteine s-nitrosylation sites in proteins identification of the n-terminal functional domains of cdk5 by molecular truncation and computer modeling using the concept of chou's pseudo amino acid composition to predict protein subcellular localization: an approach by incorporating evolutionary information and von neumann entropies an extension of chou's graphic rules for deriving enzyme kinetic equations to systems involving parallel reaction pathways the disposition of the lzcc protein residues in wenxiang diagram provides new insights into the protein-protein interaction mechanism using chou's amphiphilic pseudo-amino acid composition and support vector machine for prediction of enzyme subfamily classes the ph-triggered conversion of the prp(c) to prp(sc.) key: cord-004719-3stcx0dd authors: mushegian, a. r.; koonin, e. v. title: cell-to-cell movement of plant viruses: insights from amino acid sequence comparisons of movement proteins and from analogies with cellular transport systems date: 1993 journal: arch virol doi: 10.1007/bf01313766 sha: doc_id: 4719 cord_uid: 3stcx0dd cell-to-cell movement is a crucial step in plant virus infection. in many viruses, the movement function is secured by specific virus-encoded proteins. 
amino acid sequence comparisons of these proteins revealed a vast superfamily containing a conserved sequence motif that may comprise a hydrophobic interaction domain. this superfamily combines proteins of viruses belonging to all principal groups of positive-strand rna viruses, as well as single-stranded dna containing geminiviruses, double-stranded dna-containing pararetroviruses (caulimoviruses and badnaviruses), and tospoviruses that have negative-strand rna genomes with two ambisense segments. in several groups of positive-strand rna viruses, the movement function is provided by the proteins encoded by the so-called triple gene block including two putative small membrane-associated proteins and a putative rna helicase. a distinct type of movement proteins with very high content of proline is found in tymoviruses. it is concluded that classification of movement proteins based on comparison of their amino acid sequences does not correlate with the type of genome nucleic acid or with grouping of viruses based on phylogenetic analysis of replicative proteins or with the virus host range. recombination between unrelated or distantly related viruses could have played a major role in the evolution of the movement function. limited sequence similarities were observed between i) movement proteins of dianthoviruses and the mip family of cellular integral membrane proteins, and ii) between movement proteins of bromoviruses and cucumoviruses and m1 protein of influenza viruses which is involved in nuclear export of viral ribonucleoproteins. it is hypothesized that all movement proteins of plant viruses may mediate hydrophobic interactions between viral and cellular macromolecules. it is generally accepted that plant viruses exploit plasmodesmata to move from initially infected cells, which are usually negligible in amount, into neighbouring healthy cells. without cell-to-cell movement productive virus infection is not established. 
it is also known that specific movement proteins encoded by virus genomes are essential for cell-to-cell spread. frequently, a plant virus protein is referred to as a movement protein if i) it is not a capsid protein and ii) disruption of the coding sequence of this protein abolishes infection in whole plants but has no effect on virus replication in protoplasts. putative movement proteins have been identified in this manner in about 15 different plant virus groups (see [4, 19, 40] for recent reviews). in some cases, additional virus proteins are also required for movement, e.g. coat protein [13, 52, 58] or a specific segment in a replication protein [55], but these phenomena are not universal. on the other hand, a specific movement protein or at least a movement domain is thought to be present in all viruses that are able to spread from cell to cell [4]. apart from this genetic evidence for the movement function, the current knowledge of mechanisms which facilitate intercellular spread of plant virus genomes is insufficient. recently certain properties of some movement proteins, notably of the 30 kda movement protein of tobacco mosaic virus (tmv), were characterized in vitro and in vivo, and models for movement of plant viruses have been prompted by these studies [16, 19]. additionally, movement proteins were subjects of comparative sequence analysis, which allowed the delineation of common motifs in movement proteins from otherwise unrelated groups, implying the existence of common features among many movement proteins [33, 43]. in this work, we sought to critically analyze currently available biochemical and ultrastructural data on movement proteins. we then investigated whether sequences of movement proteins could provide information on their function based on similarities with other viral or cellular proteins. fluorescent probes injected into a plant cell are either transported into neighbouring cells via plasmodesmata or retained in the initially injected cell.
the crucial determinant for the exclusion of a particle from plasmodesmatal movement is its hydrodynamic (stokes) radius [51]. studies of transgenic tobacco plants constitutively expressing tmv 30 kda movement protein (mp+ plants) revealed that certain molecules, which were too large to be transported in wild type plants, gained such an ability in mp+ tobaccos. specifically, fitc-labelled dextran (mr 9.4 kda) moved from injected to neighbouring cells only in plants transformed with and expressing tmv 30 kda protein but not in control mp- plants [57]. interestingly, plasmodesmata in mp+ and mp- plants appeared to be structurally indistinguishable [19, 57]. these data indicate that the movement protein of tmv functionally modifies plasmodesmata. it is not yet understood whether these modifications and the changes required for moving the virus genome through plasmodesmata are the same or different. virus nucleic acid would not be expected to exist in naked form in the cell; rather, it seems reasonable that, like all nucleic acids in eukaryotic cells, it is associated with cellular and/or virus proteins throughout its lifetime. while the stokes radius of a 10 kda dextran molecule is estimated to be about 3.1 nm [20, 57], ribonucleoproteins (rnps) are clearly larger, many of them having sizes within the range of 10-50 nm [41], which would exclude them from movement even in the mp+ plants described in [57]. recently, plasmodesmatal permeability in nicotiana clevelandii leaves infected with tobacco rattle virus (trv) has been investigated [20]. the movement protein of trv, the 29 kda protein, shares high sequence similarity with the tmv movement protein, and both proteins are considered to be functionally very similar. simultaneous injection of virus and fluorescent tracers into cells of leaf trichomes allowed the movement of the labels to be observed under conditions in which the virus itself was known to move in the same cells.
it has been shown that fitc-labelled dextran (mr 4.4 kda) moved from cell to cell only in trv-infected cells and synchronously with the virus. however, lucifer yellow-labelled dextran (mr 10 kda) was excluded from plasmodesmatal movement even in infected cells. thus, certain molecules with dimensions resembling the smallest of the known rnps did not move under conditions when virus rnps readily moved. apparently, either an extremely thin virus-specific rnp should be postulated (see below), or it might be concluded that the slight increase in plasmodesmatal permeability induced by movement proteins hints at some important circumstances but is not sufficient to explain cell-to-cell movement of the virus genome.
binding of movement proteins to single-stranded nucleic acids
tmv 30 kd movement protein has been overexpressed in e. coli, and the purified protein has been shown to bind single-stranded (ss) dnas and rnas nonspecifically and cooperatively [14]. a model has been proposed in which the movement protein binds genomic rna stoichiometrically to unfold and shape it into a thin complex [14, 16, 17]. electron microscopy of such complexes formed in vitro revealed thin structures at the limit of microscope resolution; it was concluded that, as it was barely seen under em, the complex was thin enough to fit in plasmodesmata with a size exclusion limit of 3 nm [17]. additionally, ss nucleic acid-binding properties have been reported for movement proteins of cauliflower mosaic caulimovirus (camv) [15], red clover necrotic mosaic dianthovirus (rcnmv) ([47]; d. cookmeyer, pers. comm.) and alfalfa mosaic virus (almv) [40]. interestingly, p1, the movement protein of cauliflower mosaic virus, has been shown to preferentially bind rna rather than dna [15]; it was speculated that the 35 s transcript, a genome length virus replication intermediate, might be the form which is transported from cell to cell.
these in vitro observations offer new insights into a mechanism of virus cell-to-cell movement. however, it should not be forgotten that virus rnas in vivo are associated with proteins, as revealed for tmv-specific rnas [22]; it is thought that such an association is maintained throughout the whole life of the rnps in the cell; a mechanism of stripping virus rna from these proteins has not yet been proposed. also, as binding of movement proteins to rna is thought not to be sequence-specific [14, 17], it could be anticipated that the majority of movement protein will bind to cellular rnas, which in the case of tmv infection are in excess over virus rnas at the time when the 30 kd protein is transiently expressed [18]. the "ss nucleic acid-binding" model, therefore, has to be modified to deal with these difficulties. biochemical fractionation of infected cells revealed that in many virus-plant interactions movement proteins are found in the cell wall-enriched fraction. immunogold labelling has confirmed these observations and demonstrated localization of tmv, almv, camv, and rcmv movement proteins in plasmodesmatal channels or in the cell wall near plasmodesmata [36, 40]. the movement protein of cowpea mosaic comovirus (cpmv), the 58 kd/48 kda protein, has been found to form tubular structures perpendicular to and intruding into the plasma membrane and cell wall. these tubules have been found both in whole leaves and in protoplasts [56]. virions have been observed inside the tubules [56]. it has been speculated that the tubular inclusions are the structures required for movement. cucumber mosaic virus (cmv) movement requires the 32 kda movement protein [7]. this protein has not been found to associate with the cell wall or form tubules; instead, it was located in nucleoli of infected cells [37]. observations on movement proteins of tmv and cpmv have given rise to the idea of two different types of virus movement, "tobamo-type" and "como-type".
it has been proposed that the tobamo-type is characterized by movement protein localization in the plasmodesmatal area of the cell wall and by the virus infectious entity being capable of cell-to-cell movement in the absence of the viral capsid protein. the como-type of movement was thought to require both movement protein and coat protein and to involve formation of tubular structures protruding from the cell wall into the cytoplasm and formed by the movement protein [19, 56]. complementation data argued, however, that there is a certain compatibility between the two types of movement, as tobamoviruses were able to complement comovirus movement [4]. when all available ultrastructural data are considered, the picture becomes less clear-cut. it should be expected that rcmv, a comovirus closely related to cpmv, moves according to the como-type; instead, the movement protein of rcmv was found exclusively in plasmodesmata of the infected plants [40], which had been proposed to be a tobamo-type feature. camv movement protein is found in the cell wall fraction and, more specifically, in plasmodesmata [2, 36]; this seemingly supports the attribution of caulimovirus movement to the tobamo-type. however, tubular structures of unknown composition have been observed in the cytoplasm of camv-infected cells [36], and recently it has been shown that p1, the movement protein of camv, forms tubular structures on the surface of inoculated protoplasts [49]. these findings seem to be more consistent with the como-type action of the caulimovirus movement protein. the above discussion shows that it may be premature to draw too sharp a distinction between different types of movement based on ultrastructural observations. it should also be remembered that we see movement proteins at the sites where they are most easily observed and/or are retained for longer periods, but not necessarily at the sites of their essential activity.
with full-length dna copies of numerous virus genomes available, attempts have been made to genetically dissect movement protein coding sequences. in most cases, lengthy deletions have been introduced, and the effect of these deletions on the overall infectivity or individual properties of the movement proteins has been investigated. it has been shown that even small changes near the n-terminus of tmv 30 kd movement protein abolish virus infectivity [6]; at the c-terminus, up to 33 of the 268 amino acid residues of this protein could be deleted without apparent effect on infectivity or cell wall localization of the movement protein. when 55 c-terminal amino acids were deleted, the virus moved slowly though the movement protein was still found in plasmodesmata [6]. deletion of 74 c-terminal residues (195 to 268) was lethal, and, at the same time, the movement protein was no longer found in plasmodesmata. it has been noted that essential residues from 195 to 214 could comprise a domain with a specific function, or their deletion might simply cause misfolding of the protein [6]. in another study, c-terminal and internal deletions have been introduced into tmv movement protein to determine the location of ss nucleic acid-binding site(s) [17]. initially, it was thought that residues 65 to 86 constitute an rna-binding domain [14]. later, this domain has been shown to rather decrease protein solubility [17]. up to 84 amino acids from the c-terminus of tmv movement protein could be eliminated without effect on the ss nucleic acid binding capacity of the protein [17]. in the remaining part, deletion of amino acids 111-125 abolished ss nucleic acid binding, but when the gene segment encoding these amino acids was fused to an unrelated sequence, the resulting chimeric protein did not bind to ss nucleic acids [17].
the movement protein of almv associates with the cell wall upon infection and in transgenic tobacco plants; however, this did not happen when n-proximal amino acids (12-77) were deleted [24]. strains of tmv have been characterized that are temperature sensitive in cell-to-cell movement or break the host resistance which is based on plant genes restricting virus movement. involvement of mutations in the 30 kd movement protein cistron could be implied, and, indeed, in many cases nucleotide changes responsible for the altered phenotype were shown to be confined to the movement protein gene [11]. many of these mutations were scattered around the middle one-third of the 30 kd movement protein and resulted in a change in polarity of the encoded amino acid [11]. as no known function could be mapped to the area, these data are difficult to interpret. with representative sequences of plant virus genomes from many groups available, efforts were made to identify regions of similarity between known movement proteins and to predict movement functions for uncharacterized orfs. early observations made by this approach have been reviewed previously (e.g. [4]) and included strong similarity between tobamovirus and tobravirus movement proteins [29] as well as less pronounced but significant local similarity between segments of tobamovirus and caulimovirus movement proteins [30]. later, similarities between movement proteins of tricornaviruses and dianthoviruses were observed [59]. in fact, the movement function for the respective proteins of tobraviruses, caulimoviruses and dianthoviruses was implied by these observations and only subsequently confirmed by site-directed mutagenesis [54, 59, 60]. more detailed analyses performed by melcher [43] and by ourselves [33] revealed longer (ca. 150 amino acids) segments of sequence similarity between movement proteins from different virus groups.
interestingly, one family of movement proteins included proteins encoded by viruses with positive-strand rna genomes, by single-strand dna geminiviruses and by pararetroviruses with virion dna, namely caulimoviruses [33]. since the publication of these analyses, the number of sequenced virus genomes has increased significantly further, and it has been argued that the current collection of such sequences may represent the majority of the existing virus groups [32, 34]. with this in mind, we undertook a systematic comparison of the sequences of the movement proteins with each other and with the current sequence databases. using the blast program [3], statistically highly significant similarities were observed only between proteins of viruses belonging to a single group or two closely related groups. further comparisons were done using the programs macaw [53 a] and optal [27] to generate multiple alignments according to the progressive alignment strategy [21], i.e. starting with pairs of the most closely related sequences and proceeding to incorporate more and more distant ones. it was demonstrated that movement proteins of viruses belonging to 17 groups contain a weak but confidently identified conserved motif consisting of approximately 30 amino acid residues (fig. 1). this motif includes a nearly invariant aspartic acid residue (replaced by asparagine in dianthoviruses and ilarviruses) preceded by a region enriched in hydrophobic and nonpolar residues. in some of the movement proteins this region was predicted to form a transmembrane helix using the algorithm of rao and argos ([50], data not shown; see also discussion below). remarkably, this conserved motif unites viruses with single-stranded dna genomes (geminiviruses of group ii), pararetroviruses with double-stranded dna genomes (caulimoviruses and badnaviruses), negative-strand rna viruses with ambisense genome segments (tospoviruses), and numerous groups of positive-strand rna viruses.
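the description of the conserved motif given above (a stretch enriched in hydrophobic and nonpolar residues followed by an aspartic acid, or an asparagine in some groups) can be illustrated by a simple scan. the python sketch below is only an illustration and is not the procedure used to derive fig. 1: the residue set, the window length of 12, the 0.75 threshold, the function name and the toy sequence are all assumptions made for this example.

```python
# Residues treated as hydrophobic or nonpolar in this sketch.
# This set is an assumption for illustration, not a definition from the text.
HYDROPHOBIC_OR_NONPOLAR = set("AVLIMFWCGP")


def find_motif_candidates(seq, window=12, min_fraction=0.75):
    """Return (position, residue, fraction) for every D or N that is preceded
    by a window enriched in hydrophobic/nonpolar residues.

    Positions are 0-based; window and min_fraction are hypothetical values.
    """
    seq = seq.upper()
    hits = []
    for i, residue in enumerate(seq):
        # Only D/N residues with a full upstream window are considered.
        if residue not in "DN" or i < window:
            continue
        upstream = seq[i - window:i]
        fraction = sum(r in HYDROPHOBIC_OR_NONPOLAR for r in upstream) / window
        if fraction >= min_fraction:
            hits.append((i, residue, round(fraction, 2)))
    return hits


# Invented toy sequence: hydrophilic flank, hydrophobic stretch, then D.
toy = "KKEESK" + "LVIALVGLIAVF" + "D" + "GSKKE"
print(find_motif_candidates(toy))  # -> [(18, 'D', 1.0)]
```

such a scan would flag many unrelated proteins; in the analysis described here the motif was identified from multiple alignments, so the sketch shows only what the motif looks like, not how it was found.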
among the positive-strand rna viruses, all three major phylogenetically defined divisions [31, 34] are represented, namely run-like viruses (tobamoviruses, tobraviruses, cucumoviruses, bromoviruses, ilarviruses, capilloviruses, furoviruses, raspberry bushy dwarf virus), picorna-like viruses (comoviruses, parsnip yellow fleck virus, pea enation mosaic virus), and flavi-like viruses (dianthoviruses and tombusviruses). several groups of movement proteins showed significant similarity over longer sequence stretches. in fig. 2 we show the amino acid sequence alignments for four of these groups, delineation of which included identification of new putative movement proteins. such new identifications are: i) the 38 kda protein of apple stem grooving virus, which is related to the putative movement proteins of the other capilloviruses, namely apple chlorotic leaf spot virus and potato virus t (fig. 2 a); ii) the 37 kda protein of soil-borne wheat mosaic furovirus, related to the movement proteins of dianthoviruses (fig. 2 b); iii) the 27 kda protein of pea enation mosaic virus and the 21 kda protein of tombusviruses, related to the movement proteins of bromoviruses and cucumoviruses (fig. 2 c); and iv) the nsm protein of tospoviruses and the n-terminal domain of the parsnip yellow fleck virus polyprotein, related to the putative movement proteins of caulimoviruses and badnaviruses (fig. 2 d; the similarity between caulimovirus movement proteins and n-terminal domains of badnavirus polyproteins was noted by u. melcher and cited in [8]).
fig. 2. amino acid sequence alignments for groups of proteins including newly identified putative movement proteins. for each group, boundaries of the conserved segments were determined using the macaw program and the alignment itself was constructed hierarchically using the optal program. in the consensus, u indicates a bulky aliphatic residue (i, l, v, or m), a indicates an aromatic residue (f, y, or w), & indicates a bulky hydrophobic residue (aliphatic or aromatic) and x indicates any residue. the conserved motif of the "30 k superfamily" is highlighted by bold typing. (a) putative movement proteins of capilloviruses: aclsv apple chlorotic leaf spot virus; pvt potato virus t; asgv apple stem grooving virus. (b) movement protein of red clover necrotic mosaic dianthovirus (rcnmv), putative movement proteins of raspberry bushy dwarf virus (rbdv) and soil-borne wheat mosaic furovirus (sbwmv). (c) movement proteins of cucumber mosaic cucumovirus (cmv) and brome mosaic bromovirus (bmv), putative movement proteins of pea enation mosaic virus (pemv) and cucumber necrosis tombusvirus (cnv); the similarity between 21 kda proteins of tombusviruses and movement proteins of bromoviruses and cucumoviruses has been independently reported by u. melcher (pers. comm.). (d) putative movement proteins of caulimoviruses, badnaviruses, parsnip yellow fleck virus, and tomato spotted wilt tospovirus; asterisks show identical residues and colons show similar residues between the sequences of pyfv and fmv, and coymv and tswv.
even among these smaller groups, the latter three brought together proteins of very different viruses. group iv) is particularly notable in that it brings together proteins of a positive-strand rna virus, negative-strand (ambisense) rna viruses, and dna-containing pararetroviruses. curiously, the array of domains, namely n-movement domain-capsid protein(s)-replicative domains-c, in the polyproteins of such seemingly unrelated viruses as parsnip yellow fleck virus and the badnaviruses is very similar. a relevant observation from our previous work is the relationship of the movement proteins of two-component geminiviruses (bl1/bc1) to those of tobamoviruses and tobraviruses [33]. all the proteins containing the conserved motifs shown in fig.
1 should be considered a single vast and highly diverged superfamily of plant virus movement proteins. we would like to coin the nickname "30 k superfamily", after the most thoroughly studied movement protein of tobamoviruses, and also because most of the proteins of this superfamily have a size around this value (fig. 1 a). several groups of positive-strand rna viruses, namely potexviruses, carlaviruses, hordeiviruses, and a group including beet necrotic yellow vein virus, nicotiana velutina mosaic virus, and peanut clump virus, encompass the triple gene block that consists of two small membrane proteins and a putative rna helicase [34, 44]. it has been demonstrated that disruption of any of the triple block genes abolished the cell-to-cell movement of hordeiviruses, potexviruses and bnyvv [5 a, 25 a, 49 a]. the putative helicases in the triple block comprise a distinct group within superfamily i of dna and rna helicases and contain several unusual amino acid substitutions in the conserved helicase motifs [26, 34]. unlike the genes coding for the proteins of the "30 k superfamily", the triple block is found only in positive-strand rna viruses. carmoviruses and necroviruses encode a pair of small proteins, one of which is predicted to be an integral membrane protein. it has been shown that disruption of either of these genes precluded cell-to-cell movement of turnip crinkle carmovirus [28]. these two genes may perhaps be considered a "truncated triple block", although this was hard to demonstrate at the level of sequence comparison because of the small size of the hydrophobic proteins. a third type of movement protein has been revealed in tymoviruses. a 75 kda proline-rich protein that is encoded in the 5' portion of the genomic rna and overlaps with the replicative polyprotein gene is essential for the movement of tymv in infected plants [10].
this protein may have a specific secondary and tertiary structure similar to that of other proline-rich filamentous proteins. interestingly, upon screening of the amino acid sequence databases with the sequence of the tymovirus movement protein, the highest (albeit moderate) similarity was observed with the plant cell wall-associated protein extensin (data not shown). recently it has been shown that cell-to-cell movement of the geminivirus msv is dependent on the small protein v1 [9]. this protein and the related proteins of other one-component geminiviruses did not show appreciable sequence similarity to movement proteins of other viruses. interestingly, however, alignment of their amino acid sequences revealed the conservation of a highly hydrophobic central domain terminating at an invariant aspartic acid residue and thus distantly resembling the conserved motif of the 30 k superfamily (fig. 3). this domain was flanked by two proline-rich domains showing some similarity to extensins (fig. 3). thus, in a sense the movement proteins of one-component geminiviruses combine the 30 k superfamily and the tymovirus structural themes. for several virus groups in which at least one member has been completely sequenced, the movement function has not been genetically mapped and identification of putative movement proteins by amino acid sequence comparison was very uncertain, if possible at all. among these, uncharacterized proteins of nepoviruses and closteroviruses showed very limited sequence similarity to some members of the 30 k superfamily, in particular to caulimovirus movement proteins [33]. the observations discussed in the previous sections show that classification of movement proteins based on comparison of their amino acid sequences does not correlate with the type of genome nucleic acid or with grouping of viruses based on phylogenetic analysis of replicative proteins [31, 34, 61].
also, we were unable to notice any correlation between grouping of movement proteins and biological properties of the viruses encoding these proteins (e.g. host ranges). recombination between unrelated or distantly related viruses could have played a major role in the evolution of the movement function. this is clearly the preferred explanation for situations in which related movement proteins are found in viruses with different types of genome nucleic acid (e.g. the striking similarity involving the putative movement proteins of caulimo/badnaviruses; fig. 2 d). in some cases recombinational transfer of movement protein genes also appears very likely between remote groups of positive-strand rna viruses (e.g. dianthoviruses and soil-borne wheat mosaic furovirus; fig. 2 b). another equally important trend in the evolution of movement protein genes is the apparently high rate of mutational change, leading to the very limited level of sequence conservation described above. even within compact groups of viruses that share a common genome organization and functional properties (e.g. tobamoviruses), the divergence between the movement protein sequences is much higher than that between the principal replicative domains, although comparable to that between capsid proteins. it has been proposed that plasmodesmata are plant analogues of certain animal cell-cell contacts, namely gap junctions, based mainly on studies of the permeability of both to fluorescent dye-labelled molecules [5, 51]. details of the protein organization of plasmodesmata are poorly understood, although a protein serologically related to a major gap junction protein, connexin, was reportedly found in plasmodesmata and the gene encoding this plant protein was cloned [42]. in the animal kingdom, gap junctions are important for electrical coupling of cells in various tissues [51].
interestingly, the permeability of both plasmodesmata and gap junctions for some low molecular weight tracer dyes is down-regulated by activation of protein kinase c [5]. however, movement of macromolecules, in particular nucleoproteins, through gap junctions has not been described. a case of movement of a nucleoprotein through a channel in a membrane is represented by nucleocytoplasmic trafficking of rnps (reviewed in [38, 45]). among the features of the nucleocytoplasmic export of cellular rnas revealed thus far, several might be relevant to the mechanisms of movement of plant virus genomes. in particular, it has been shown that at all stages of nucleocytoplasmic export, mrna is attached to specific protein frameworks, being transferred from the nucleoskeleton to the nuclear pore complex to the cytoplasm [12]; specific proteins in the mrnp are thought to chaperone association of the mrna with, and its dissociation from, these frameworks [1, 53]. components of the mrnp activate enzymes within the nuclear pore complex which are essential for energy-dependent translocation of the rnp through the nuclear pore; two of these enzymatic activities are atpase and rna helicase [53]. it is tempting to speculate that mechanistic analogies may exist between the nucleocytoplasmic transport of cellular rnps and intracellular and/or intercellular movement of viral rnps. more specifically, the parallel between the rna helicase in the nuclear pore and the putative viral rna helicase in the triple block that is involved in virus movement (see above) certainly is provocative. it should be noted that plant viruses that pass through some stages in the nucleus of the infected cells (i.e. pararetroviruses and geminiviruses) should be able to perform both nucleocytoplasmic movement and cell-to-cell movement of their genomes; one may wonder to what extent these two processes are similar and whether movement proteins are involved in both of them.
comparison of the amino acid sequences of dianthovirus movement proteins with amino acid sequence databases revealed a marginally significant but provocative similarity with membrane proteins of the mip family [48]. with a blast score of 67 and a probability of random matching of 0.23, the similarity of the movement protein of rcnmv to the tonoplast membrane protein tip1 from arabidopsis was the highest in the entire database except for the similarities with the homologous proteins of the other dianthoviruses. analysis of the multiple alignment of the dianthovirus movement proteins with a representative set of the mip proteins (fig. 4) showed that the region of similarity included the conserved motif of the "30 k superfamily" described above and the characteristic conserved motif of the mip family [48]. each of these motifs consisted of a putative transmembrane helix followed by a loop containing conserved amino acid residues (fig. 4). the drastic difference between the two types of proteins is that mip proteins contain six transmembrane helices whereas in the virus movement proteins the region shown in figs. 1 and 4 is the only such predicted helix. nevertheless, we believe that the observed sequence similarity may be functionally relevant, indicating that the conserved motif of the "30 k superfamily" may mediate membrane interaction. for many proteins of this superfamily, membrane interaction is compatible with the observed localization in the cell wall [40]. however, caution is due in the interpretation of these observations, as the hydrophobicity of the conserved region differed significantly between movement proteins and a membrane-spanning helix was not confidently predicted for all of them (fig. 1 b). in some of these proteins the conserved motif may be involved in hydrophobic interactions other than membrane spanning. as mentioned above, membrane localization also has been proposed for the triple block proteins and strongly predicted for the movement proteins of monopartite geminiviruses.
fig. 4. a conserved sequence motif in the movement proteins of dianthoviruses and the mip family of membrane proteins. asterisks show identical residues and colons show similar residues in the sequences of the movement protein of rcnmv and tip1. the consensus (for symbols see legend to fig. 2) includes the conserved residues, with one possible exception in the mip-related proteins. the hydrophobic regions predicted to form transmembrane helices are underlined. the motif was extracted from an alignment generated using the macaw program.
the observations on the involvement of the influenza virus m1 protein in nuclear export of virus ribonucleoproteins [39] prompted us to directly compare the m1 sequence with those of plant virus movement proteins. moderate similarity was revealed between m1 and the movement proteins of bromoviruses and cucumoviruses (fig. 5). the region of similarity included the conserved motif of the "30 k superfamily", with the invariant aspartic acid residue and the preceding hydrophobic stretch. the m1 protein chaperones newly formed influenza virus rnps from the nucleus to the cytoplasm; upon formation of virions, this protein is thought to provide the link between the rnp and the virion membrane (reviewed in [35]). both of these functions are apparently based on the ability of m1 to interact with both the rnp and the membrane, providing a provocative analogy with the plant virus movement proteins. very recently, a mutation in the m1 protein gene resulting in impaired nucleocytoplasmic movement was mapped within the region of similarity to bromovirus and cucumovirus movement proteins [23]. finally, our previous observations indicated a weak sequence similarity between a domain of cellular 90 kda heat shock proteins and some of the movement proteins, specifically those of caulimoviruses [33].
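the probability of random matching of 0.23 quoted above for the rcnmv/tip1 comparison can be restated as an expected number of chance matches. the python sketch below assumes that the reported value is a poisson probability of observing at least one chance match with at least this score, so that p = 1 - exp(-e); that interpretation and the function names are assumptions made for illustration and are not stated in the text.

```python
import math


def expected_hits_from_p(p):
    """Convert P (probability of at least one chance match) into the
    expected number of chance matches E, assuming P = 1 - exp(-E)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return -math.log1p(-p)


def p_from_expected_hits(e):
    """Inverse conversion: P = 1 - exp(-E)."""
    return -math.expm1(-e)


e_value = expected_hits_from_p(0.23)
print(round(e_value, 3))  # roughly 0.26 chance matches expected per search
```

under this assumption, about one database search in four would produce a match at least this good by chance alone, which is why the similarity is described as marginally significant and was followed up by inspection of the multiple alignment.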
a common denominator for all these observations may be that the function of plant virus movement proteins is determined by domains mediating hydrophobic interactions. taking into account the limited sequence conservation and the absence of strictly invariant amino acid residues, it is very unlikely that these proteins (with the exception of the putative helicase in the triple block) have any enzymatic activity. among the known groups of plant viruses [25], movement proteins have so far been identified genetically or by sequence comparison for about 75 percent. the variety of these proteins could be reduced to only three types, namely the "30k superfamily", with the distantly similar proteins of monopartite geminiviruses; the triple block; and the proline-rich proteins of tymoviruses. obviously, future studies may lead to characterization of movement proteins unrelated to any of these groups. different types of movement proteins may affect different aspects of the virus-cell interaction leading to virus spread in planta. however, it appears likely that in many cases these proteins mediate hydrophobic interactions between viral and cellular components, in a general analogy with molecular chaperone action [16, 33]. as discussed in the first part of this review, a large fraction of the experimental data on plant virus movement proteins is very hard to interpret unequivocally. conceivably, this reflects the highly complex nature of the virus-plant interaction. in this situation, it appears reasonable to focus further experiments on probing the specific functions of the conserved domains revealed by amino acid sequence comparison. hopefully, this approach will highlight new important aspects of the plant virus lifestyle.
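the consensus notation used in the legend to fig. 2 (u for a bulky aliphatic residue, a for an aromatic residue, & for a bulky hydrophobic residue, x for any residue) can be expressed as a small rule applied to each alignment column. the python sketch below is an illustration only: it assumes an ungapped alignment and the strict rule that every sequence in a column must belong to the class, neither of which is specified in the legend, and the three aligned sequences are invented.

```python
# Residue classes as named in the legend to fig. 2.
ALIPHATIC = set("ILVM")              # u: bulky aliphatic
AROMATIC = set("FYW")                # a: aromatic
HYDROPHOBIC = ALIPHATIC | AROMATIC   # &: bulky hydrophobic


def consensus_symbol(column):
    """Consensus symbol for one alignment column (strict rule: all
    residues must fall in the class; this threshold is an assumption)."""
    residues = set(column.upper())
    if len(residues) == 1:
        return next(iter(residues))  # invariant residue is shown as itself
    if residues <= ALIPHATIC:
        return "u"
    if residues <= AROMATIC:
        return "a"
    if residues <= HYDROPHOBIC:
        return "&"
    return "x"


def consensus(aligned):
    """Consensus string for equal-length aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return "".join(consensus_symbol("".join(col)) for col in zip(*aligned))


# Invented toy alignment, not taken from fig. 2.
print(consensus(["LFIDK", "VYLDR", "IWFDK"]))  # -> ua&Dx
```

a consensus of this kind makes a conserved hydrophobic stretch followed by an invariant aspartic acid easy to see even when no single position is identical across all sequences.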
recently, it has been shown that tmv movement protein expressed by transgenic tobacco behaves as an integral membrane protein [moore pj, fenczik ca, deom cm, beachy rn (1992) developmental changes in plasmodesmata in transgenic tobacco expressing the movement protein of tobacco mosaic virus. protoplasma 170:115-127].
[1] between nucleus and cytoplasm
[2] cauliflower mosaic virus gene i product detected in cell wall-enriched fraction
[3] basic local alignment search tool
[4] expression of a plant virus-coded transport function by different viral genomes
[5] dynamic continuity of cytoplasmic and membrane compartments between plant cells
[5 a] triple gene block proteins of white clover mosaic potexvirus are required for transport
[6] the tmv movement protein: role of the c-terminal 73 amino acids in subcellular localization and function
[7] mutational analysis of cis-acting sequences and gene function in rna 3 of cucumber mosaic virus
[8] an analysis of the complete nucleotide sequence of a sugarcane bacilliform virus genome infectious to banana and rice
[9] replication of maize streak virus mutants in maize protoplasts: evidence for a movement protein
[10] expression of orf-69 of turnip yellow mosaic virus is necessary for virus spread in plants
[11] nucleotide sequence analysis of the movement genes of resistance breaking strains of tomato mosaic virus
[12] a three-dimensional view of precursor messenger rna metabolism within the mammalian nucleus
[13] potato virus x as a vector for gene expression in plants
[14] the p 30 movement protein of tobacco mosaic virus is a single-stranded nucleic acid binding protein
[15] gene i, a potential cell-to-cell movement locus of cauliflower mosaic virus, encodes an rna-binding protein
[16] how do plant virus nucleic acids move through intercellular connections
[17] visualization and characterization of tobacco mosaic virus movement protein binding to single-stranded nucleic acids
[18] relationship of tobacco mosaic virus gene expression to movement within plant
[19] plant virus movement proteins
[20] increase in plasmodesmatal permeability during cell-to-cell spread of tobacco rattle virus from individually inoculated cells
[21] of urfs and orfs. a primer on how to analyze derived amino acid sequences
[22] informosome-like virus-specific ribonucleoprotein (vrnp) may be involved in the transport of tobacco mosaic virus infection
[23] an influenza virus temperature-sensitive mutant defective in the nuclear-cytoplasmic transport of the negative-sense viral rnas
[24] an n-proximal sequence of alfalfa mosaic virus movement protein is necessary for association with cell walls in transgenic plants
[25] classification and nomenclature of viruses. fifth report of the international committee on taxonomy of viruses
[25 a] efficient cell-to-cell movement of beet necrotic yellow vein virus requires 3' proximal genes located on rna 2
[26] a novel superfamily of nucleoside triphosphate-binding motif-containing proteins which are probably involved in duplex unwinding in dna and rna replication and recombination
[27] an atp-binding motif is the most conserved sequence in a highly diverged group of proteins involved in positive strand rna viral replication
[28] turnip crinkle virus genes required for rna replication and virus movement
[29] the complete nucleotide sequence of tobacco rattle virus rna-1
[30] the sequence of carnation etched ring virus dna: comparison with cauliflower mosaic virus and retroviruses
[31] the phylogeny of rna-dependent rna polymerases of positive-strand rna viruses
[32] virus evolution: time for sturm und drang
[33] diverse groups of plant dna and rna viruses share related movement proteins that may possess chaperone-like activity
[34] evolution and taxonomy of positive-strand rna viruses: implications of comparative analysis of amino acid sequences
[35] genes and proteins of the influenza viruses
[36] the subcellular localization of gene i product of cauliflower mosaic virus is consistent with a function associated with virus spread
[37] ultrastructural location of non-structural protein 3a of cucumber mosaic virus in infected tissue using monoclonal antibodies to a cloned chimeric fusion protein
[38] nuclear mrna export
[39] nuclear transport of influenza virus ribonucleoproteins: the viral matrix protein (m1) promotes export and inhibits import
[40] virus movement in infected plants
[41] translocation of a specific premessenger ribonucleoprotein particle through the nuclear pore studied with electron microscope tomography
[42] gap junction protein homologue in arabidopsis thaliana: evidence for connexins in plants
[43] similarities between putative transport proteins of plant viruses
[44] probable reassortment of genomic elements among elongated rna-containing plant viruses
[45] nuclear import-export: in search of signals and mechanisms
[46] detection of the movement protein of red clover necrotic mosaic virus in a cell wall fraction from infected nicotiana clevelandii plants
[47] cooperative binding of the red clover necrotic mosaic movement protein to single stranded nucleic acids
[48] evolution of the mip family of integral membrane transport proteins
[49] cauliflower mosaic virus gene i product (p 1) forms tubular structures which extend from the surface of infected protoplasts
[49 a] identification of barley stripe mosaic virus genes involved in viral rna replication and systemic spread
[50] a conformation preference parameter to predict helices in integral membrane proteins
[51] spray dc (eds) (1990) parallels in cell-to-cell junctions in plants and animals
[52] effects of deletions in the n-terminal basic arm of brome mosaic virus coat protein on rna packaging and systemic infection
[53] evidence for involvement of a nuclear envelope-associated rna helicase activity in nucleocytoplasmic rna transport
[53 a] a workbench for multiple alignments construction and analysis
[54] a mutation in cauliflower mosaic virus gene i interferes with virus movement but not virus replication
[55] deletion analysis of brome mosaic virus 2 a protein: effects on rna replication and virus spread
[56] tubular structures involved in movement of cowpea mosaic virus are also formed in infected cowpea protoplasts
[57] plasmodesmatal function is probed using transgenic tobacco plants that express a virus movement protein
[58] cell-to-cell transport of cowpea mosaic virus requires both the 58/48 k proteins and the capsid proteins
[59] the roles of the red clover necrotic mosaic virus capsid and cell-to-cell movement proteins in systemic infection
[60] tobacco rattle virus rna-1 29 k gene product potentiates viral movement and also affects symptom production in tobacco
[61] evolution of rna viruses
we appreciate helpful discussions with dr. v. v. dolja and dr. u. melcher. we would like to thank drs. d. cookmeyer, s. a. demler, r. kormelink, u. k. melcher, n. olszewski and yu. shirako for communicating data prior to publication. a. m. is grateful to dr. r. j. shepherd for constant support and encouragement.
received june 9, 1993
key: cord-048471-7jszm1nd authors: salim, omar; clarke, ian n.; lambden, paul r. title: functional analysis of the 5′ genomic sequence of a bovine norovirus date: 2008-05-14 journal: plos one doi: 10.1371/journal.pone.0002169 sha: doc_id: 48471 cord_uid: 7jszm1nd background: jena virus (jv), a bovine norovirus, causes enteric disease in cattle and represents a potential model for the study of enteric norovirus infection and pathogenesis. the positive sense rna genome of jv is organised into orf1 (non-structural proteins), orf2 (major capsid protein) and orf3 (minor capsid protein). the lack of a cell culture system for studying jv replication has meant that work to date has relied upon in vitro systems to study non-structural protein synthesis and processing. principal findings: only two of the three major orf1 proteins were identified (p110 and 2c) following in vitro translation of jv rna; the n-term protein was not detected. the n-term encoding genomic sequence (5′gs) was tested for ires-like function in a bi-cistronic system and displayed no evidence of ires-like activity.
the site of translation initiation in jv was determined to be at the predicted nucleotide 22. following the insertion of an epitope within the 5′gs, the jv n-term protein was identified in vitro and within rna transfected cells. conclusions: the in vitro transcription/translation system is currently the best system for analysing protein synthesis and processing in jv. unlike similarly studied human noroviruses, jv initially did not appear to express the n-terminal protein, presenting the possibility that the encoding rna sequence had a regulatory function, most likely involved in translation initiation in an ires-like manner. this was not the case and, following determination of the site of translation initiation, the n-term protein was detected using an epitope tag, both in vitro and in vivo. although slightly larger than predicted, the n-term protein was detected in a processed form in vivo, thus demonstrating not only initial translation of the orf1 polyprotein but also activity of the viral protease. these findings indicate that the block to noroviral replication in cultured cells lies elsewhere.
jena virus, a bovine norovirus, is a member of the caliciviridae family of positive sense rna viruses and was first isolated from the diarrhoeic stools of newborn calves [1, 2]. jv is a type i genogroup iii (giii) norovirus which is closely related to the type ii giii bovine noroviruses newbury agent 2 and dumfries [3, 4]. the giii noroviruses are responsible for causing enteric disease in cattle [2, 5] and, thus, likely share a similar tissue tropism with the human-associated enteric noroviruses. like human noroviruses [6], bovine noroviruses have a high seroprevalence [4]. jv is therefore a potentially useful model for studying the molecular biology of enteric norovirus pathogenesis and replication. the 7.3 kb polyadenylated rna genome of jv has been characterised previously [7] and, like other noroviruses, is organised into 3 open reading frames (orfs).
orf1 encodes the non-structural proteins in the form of a large 185 kda polyprotein, which is subsequently cleaved into functional replication proteins by the viral encoded 3c-like protease. orf2 encodes the structural capsid protein (56 kda) and orf3 encodes a small basic protein, which has been shown to function as a minor capsid component [8]. jv orf1 is consistent with other caliciviruses in that it encodes a 39 kda 2c-like nucleoside triphosphatase (ntpase), a 3c-like protease and a 56 kda 3d-like rna-dependent rna polymerase [7, 9-13]. however, the genomic sequence within the 5′ region of jv orf1 (5′gs) displays a high level of divergence. this divergence is mainly attributed to the presence of several proline-encoding polypyrimidine tracts within the region predicted to encode a 35 kda n-terminal protein [7]. the predicted size of n-terminal proteins relative to the size of the respective 2c proteins differs within the norovirus genus. within the gi noroviruses, such as southampton virus, the n-terminal protein (44.8 kda) is larger than the 2c protein (39.6 kda). this is in contrast to the gii noroviruses, such as lordsdale virus and camberwell virus, in which the n-terminal protein is smaller than the 2c protein [11, 14]. this is also the case for jena virus, in which the predicted jv n-terminal protein (35 kda) is smaller than the jv 2c protein (39 kda) [7]. the norovirus n-terminal protein varies in relative size across the genus, and the encoding sequence bears no similarity to other cellular or viral proteins. alignment of the n-term protein sequences of various noroviruses indicates little similarity between genogroups within the first 180 residues; however, towards the c-terminal end of the protein, similarity between the amino acid residues increases.
recent studies investigating the functions of the norwalk virus n-terminal protein have successfully demonstrated association with the golgi apparatus in transfected cells [15]. this study also identified a picornaviral 2b-like region within the n-terminal protein, suggesting that the protein is involved in host cell membrane interactions and reinforcing other findings that the norwalk virus n-terminal protein disrupts intracellular protein trafficking, including that of proteins destined for the host cell membrane [16]. a 3c protease-mediated cleavage event within the n-terminal protein (37 kda) was described for camberwell virus, a genogroup 2 norovirus, yielding proteins of 22 kda and 15 kda [17]. based on these observations and its location within the genome, it was hypothesised that the n-terminal protein of noroviruses corresponds to the 2ab region in picornaviruses. another possibility is that the n-term encoding rna itself functions as a translational enhancer by interacting with cellular proteins involved in translation. indeed, this phenomenon has been previously reported for norwalk virus, within which a double stem loop structure has been predicted at the 5′ end of the genomic rna [18]. it was subsequently demonstrated that elements within the 5′ end of norwalk virus bind specifically to cellular proteins such as la, ptb and pcbp2 [19], which have all been implicated in ires-mediated cap-independent translation in the closely related picornaviruses [20-23]. in this study the role of the jv 5′gs was investigated, including its potential to direct cap-independent translation initiation. the precise location of translation initiation in jv was also investigated.
previous studies of norovirus polyprotein processing have yielded three major products following in vitro transcription and translation, representing the uncleaved 3abcd, n-term and 2c proteins.
however, initial analysis of jv polyprotein processing indicated that only two major proteins are synthesised initially which, based on molecular weight predictions, are the 3abcd (110 kda) and the 2c (39 kda). the lack of an n-terminal protein encoded by the jv 5′gs, predicted to be 35.3 kda, is unique among the noroviruses that have been studied in this way. the in vitro transcription and translation profile for jv was therefore studied in more detail. as initial experiments had analysed tnt reactions following a 1 hr incubation, reaction aliquots were harvested at time points before and after the recommended 1 hr incubation. the results in figure 1 show that there are no major reaction products synthesised prior to the 1 hr time point, at which time the 3abcd/p110 and 2c/p39 proteins are clearly visible. extended incubation past the 1 hr point resulted in further proteolytic cleavage of p110 that coincided with the appearance of proteins of the following sizes: 86 kda, 55 kda, and 51 kda. in addition, proteins of 29 kda, 22 kda and 20 kda were also visible at the 24 hr time point (figure 1, lane 7). the only protein that was consistently visible following the 1 hr time point was the 2c/p39 protein. despite prolonged incubation there was no indication that the n-terminal/p35 protein was synthesised. a comprehensive study of polyprotein processing within the murine norovirus (mnv) suggests likely identities for the equivalent proteins in the similar profile for jv [12]. using region-specific antisera the authors were able to identify p110 as the 3abcd uncleaved precursor, p90 as the 3bcd, p57.5 as the 3d-like polymerase, p52 as a 3abc precursor and p40 as the 2c-like ntpase, which was determined by mutagenesis and microsequencing experiments. the 19 kda protein was identified as the 3c-like protease.
the antisera used to detect the mnv n-term protein recognised 3 products; one was at the predicted molecular weight of 39 kda and the other two bands migrated as a 45 kda doublet.
the 5′gs region of jv is highly divergent compared to other noroviruses, mainly due to the relatively high cytosine content (32%), which contributes to an overall g/c content of 58%. there are many polypyrimidine tracts within the sequence, potentially yielding a relatively high degree of rna secondary structure. previous studies have described potential secondary rna structure and interaction with proteins involved with ires-mediated translation within the 5′ genomic region of norwalk virus [18, 24]. it was of interest therefore, based on these findings, to ascertain whether or not the 5′gs of jv possessed ires-like properties within the context of a 'bi-cistronic' expression system, independently of other viral proteins, including the vpg which, in other caliciviruses, has been shown to be associated with translation initiation factors [25, 26]. traditionally the bi-cistronic vector system has been used to define potential ires-like sequences from a variety of viral and cellular mrnas, and is recognized as being the standard test for this function [27]. a bi-cistronic vector comprises a 5′ and a 3′ cistron; translation of the 5′ cistron is cap-dependent and translation of the 3′ cistron is regulated by the putative ires-like sequence. thus, if the 3′ cistron is translated in addition to the 5′ cistron then the sequence of interest is said to have ires-like properties, as translation is initiating internally. to test for ires-like function in jv, bi-cistronic constructs were made with a cap-dependent 5′ egfp cistron and a 3′ lacz cistron under the translational control of either the jv 5′gs (pegfp-c1/jv 5′gs/lacz) or an authentic emcv ires (pegfp-c1/ires/lacz).
crfk cells were transfected with the bi-cistronic constructs and, following incubation, were assayed for egfp and lacz expression. both constructs were able to direct translation of the egfp cistron effectively as expected (figure 2a and figure 2d). the use of an authentic emcv ires to direct translation of the lacz cistron was also effective (figure 2e), with levels of β-galactosidase activity comparable to those of the β-galactosidase reporter (figure 2f). however, no β-galactosidase activity was detected from cells transfected with the pegfp-c1/jv 5′gs/lacz construct (figure 2b), demonstrating that the jv 5′gs was unable to initiate translation and therefore, in this context, did not possess any ires-like functions.
as it was clear that the jv 5′gs did not possess any ires-like functions it was necessary to determine the location of translation initiation within orf1. this was predicted to be the atg encoding methionine at nucleotide position 22, as it is situated in a favourable context for translation initiation [7]. to investigate this, multiple translation termination codons (polystop) were inserted into the jv genome within the 3b-encoding region, downstream of the 5′gs, to halt translation at a defined point. in vitro transcription and translation of this construct would, in theory, yield a product whose size would relate to the initiation codon used within the 5′gs (figure 3). to address the unlikely event of translation read-through or re-initiation downstream of the polystop, which would result in subsequent translation of the 3c protease and cleavage of the truncated orf1 polyprotein, a mutation was made within the active site encoding region of the 3c protease within jv orf1, to prevent any viral mediated cleavage of orf1 translation products (jv 3c mut/polystop).
a point mutation of the critical cysteine residue within the highly conserved gdcg motif to a glycine residue was performed; this approach has been described for the successful inactivation of the 3c activity of other noroviruses [28]. in vitro transcription and translation analysis was performed on jv wild type (figure 4) [...] within the 3c region of jv successfully inactivated the 3c protease, thus a large, >200 kda uncleaved polyprotein is yielded following tnt. the major product generated by jv 3c mut/polystop was calculated to be 103 kda in size. based on computer predictions this is in agreement with the initiation of translation occurring at nucleotide 22, which demonstrates that the jv n-term protein is translated in full in vitro. at this time it is not possible to determine whether translation of intracellular vpg-bound viral rna initiates at nucleotide 22, although it is likely given the favourable context in which the initiation codon is situated.
as the jv n-term was found to be translated in vitro, attempts were made to express and purify the protein in bacteria for immunisation so that the protein could be identified by radioimmune precipitation assay (ripa), as it was possible that the n-term protein was migrating on gels aberrantly and possibly co-migrating with 2c. attempts to express the protein in bacteria were unsuccessful due to toxicity. therefore, the 14 aa v5 epitope encoding sequence was cloned in frame into the jv cdna construct at nucleotide position 123 (jv v5). the v5 epitope originates from the p and v proteins of the sv5 paramyxovirus [29], for which a commercially available monoclonal antibody is used for detection. following in vitro transcription and translation of jv v5 a new product, approximately 42 kda in size, was visible (figure 5, lane 2). this product was not observed in any prior analyses of jv.
to confirm that this protein was v5/n-term associated the tnth reaction was subjected to ripa using the anti-v5 antibody ( figure 5, lane 3) . this confirmed expression of the n-term protein in vitro. to confirm expression of the v5/n-term protein in cell culture capped rna was synthesised from the jv v5 t7 cdna construct, which was used to transfect crfk cells. as there is currently no host cell line in which to propagate jv the crfk cell line was used as it has been shown to support the replication of feline calicivirus [30] . confocal immunofluorescence of transfected cells using the anti-v5 antibody demonstrated expression of the v5/nterm protein in cultured cells ( figure 6 ). expression of the v5/nterm protein was diffuse and did not co-localise with the golgi/ er/plasma membrane marker wheat germ agglutinin (wga) and therefore displays a different pattern of cellular expression compared to norwalk virus [15] . cells transfected with the wild type full length jv rna were negative for fluorescence (data not shown). lysates of cells transfected with wild type jv and jv/v5 rna were subjected to western blot using the anti-v5 antibody ( figure 7) . no product was present for cells transfected with wild type jv rna, but a protein of approximately 42 kda in size was visible in cells that had been transfected with jv/v5 rna, confirming n-term expression and size as seen in the in vitro system. in addition, this important observation also confirms for the first time that the jv 3c protease was active in cells transfected with capped rna as the size of the v5/n-term indicated successful cleavage of the protein from the orf1 polyprotein. to address the issue of potential rapid degradation of the jv nterm protein crfk cells were transfected with jv v5 rna and were harvested at designated time points following the addition of the protein synthesis inhibitor cycloheximide. cell lysates were analysed by western blot using the anti-v5 antibody (figure 8 ). 
the consistent appearance of the n-term/v5 protein suggested that it is stable and insensitive to degradation by viral and host cell proteases.
the predicted molecular weight of the jv n-term is 35.3 kda, based on the site of initiation of translation and location of conserved cleavage sites. the appearance, therefore, of a previously unseen 42 kda protein in the in vitro transcription and translation profile was unexpected but this protein does represent a translation product for the jv 5′gs. to date, it has not been possible to explain the difference in the predicted and observed sizes for the jv n-term, and the addition of the 14 amino acid v5 epitope within jv n-term does not account for this apparent large shift in molecular weight. however, a recent study described a similar anomaly when investigating proteolytic processing in the murine norovirus mnv-1 [12]. the predicted molecular weight for the mnv-1 n-term protein was 38.3 kda. the authors successfully generated antisera against the mnv-1 n-term and used it to immunoprecipitate the protein from in vitro transcription and translation reactions and observed that the n-term existed as a 45 kda doublet, in addition to the predicted size of 38 kda. however, when mnv-1 n-term antisera was used to probe mnv-1-infected cell lysates only the 43-45 kda doublet and a large 115 kda precursor could be detected, suggesting that the predicted 38 kda form of the n-term is not generated in cell culture. again, it was not possible to conclusively determine the cause of this discrepancy, but it was speculated that the n-term protein may migrate abnormally in sds-page, or may be proteolytically processed at a previously unknown cleavage site downstream of the protein's predicted c-terminus. it is also possible that the n-term protein might be modified in some way leading to a shift in observed molecular weight. at this time the same conclusions would seem appropriate for the jv n-term.
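the statement that the v5 tag cannot account for the size shift can be checked with simple arithmetic. the following is a minimal python sketch, not part of the original analysis: it sums standard average residue masses for the 14-residue v5 sequence given in the methods and compares the result with the gap between the predicted (35.3 kda) and observed (approximately 42 kda) sizes. the function and variable names are illustrative assumptions.

```python
# Average residue masses (Da) for the amino acids that occur in the V5 epitope.
# These are standard textbook values for residues within a peptide chain
# (i.e. after loss of water on peptide-bond formation).
RESIDUE_MASS_DA = {
    "G": 57.05, "K": 128.17, "P": 97.12, "I": 113.16, "N": 114.10,
    "L": 113.16, "D": 115.09, "S": 87.08, "T": 101.10,
}

V5_EPITOPE = "GKPIPNPLLGLDST"  # 14 residues, as listed in the methods


def inserted_mass_kda(peptide: str) -> float:
    """Mass added when `peptide` is inserted internally into a protein, in kDa."""
    return sum(RESIDUE_MASS_DA[aa] for aa in peptide) / 1000.0


predicted_kda = 35.3  # predicted size of the untagged jv n-term
observed_kda = 42.0   # approximate size of the tagged product on gels

tag_kda = inserted_mass_kda(V5_EPITOPE)
unexplained_kda = observed_kda - predicted_kda - tag_kda

print(f"v5 tag contributes about {tag_kda:.2f} kDa")
print(f"shift not explained by the tag: about {unexplained_kda:.1f} kDa")
```

under these assumptions the tag contributes roughly 1.4 kda, leaving about 5 kda of the apparent shift unexplained, which is consistent with the interpretation that aberrant migration, modification or alternative processing is involved.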
in addition, it is not known why the jv n-term was not detected in in vitro transcription and translation studies prior to the insertion of the v5 epitope. it cannot be ruled out, however, that the wild type jv n-term aberrantly co-migrates with the 39 kda jv 2c protein in sds-page. indeed, the v5/n-term product from transfected cell lysates appears to be one of a doublet (figure 8), analogous to the observed appearance of the mnv n-term protein in infected cells, suggesting the likelihood of a further cleavage site within the jv n-term protein which has yet to be elucidated. nevertheless, these studies clearly demonstrate that a protein representative of the 5′gs of jv is translated both in vitro and in vivo and is proteolytically processed from the orf1 polyprotein following translation initiation at nucleotide 22.
human norovirus infection has been shown to be the leading cause of non-bacterial gastroenteritis [31]; however, there is currently no cell culture system available to facilitate viral replication and ethical considerations have hindered progress in establishing a permissive human organ culture system. the study of jena virus offers a potential animal model of enteric noroviral infection. however, until a permissive bovine cell and/or organ culture system is established, analysis of the molecular mechanisms underpinning viral replication and pathogenesis relies upon in vitro systems, most notably polyprotein synthesis and processing. unlike similarly studied human noroviruses jv initially did not appear to express the n-terminal protein, presenting the possibility that the encoding rna sequence had a regulatory function itself, most likely involved in translation initiation in an ires-like manner.
this was shown not to be the case and, following determination of the site of translation initiation at the predicted nucleotide 22, the n-term protein was detected following the insertion of an epitope tag, both in vitro and in vivo. although slightly larger than predicted, the n-term protein was detected in a processed form in vivo, thus not only demonstrating initial translation of the orf1 polyprotein but also activity of the viral encoded protease. these important findings indicate that the block to replication of enteric norovirus in cultured cells cannot be attributed to a failure to synthesise and process the non-structural proteins. the detection of processed and active orf1 proteins in transfected cultured cells, however, highlights the potential for the development of cell and bovine organ based systems to facilitate the replication of jena virus.
the pegfp-c1 vector (clontech) comprises an egfp coding sequence under the control of a cmv promoter and a kozak translation initiation site. downstream of the egfp sequence is the multiple cloning site containing unique bglii, saci, hindiii and apai restriction sites. construction of pegfp-c1/jv 5′gs/lacz was as follows: the jv 5′gs sequence was amplified from the jv full length cdna clone [7] using bio-x-act dna polymerase (bioline) with the primers 5′gs f (5′-aactgcagatcttaataagtgaatgaagactttgacgat-3′), containing the bglii restriction site (bold) and two in-frame translation termination codons (underlined) to ensure that translation of the egfp sequence did not carry over to the 5′gs, and 5′gs r (5′-aactgcaagcttctgcaggacacaatgagg-3′), containing the hindiii restriction site. the jv 5′gs amplicon was ligated to the pegfp-c1 vector, following restriction enzyme digestion of both amplicon and vector with bglii and hindiii restriction enzymes, and the ligated dna used to transform e. coli top10 (invitrogen). this intermediate construct was named pegfp-c1/jv 5′gs.
the lacz coding sequence was amplified from the psv-β-gal reporter vector (promega) using bio-x-act dna polymerase and the primers lacz f (5′-aactgcaagcttgatatgggggatcccgtcgttttacaacg-3′), containing the hindiii restriction site (bold) and a kozak translation initiation site (underlined), and lacz r (5′-aactgcgggcccttattatttttgacaccagacca-3′), containing the apai restriction site (bold) and translation termination codons (underlined). the lacz amplicon was ligated to the pegfp-c1/jv 5′gs vector following restriction enzyme digest of both amplicon and vector with hindiii and apai restriction enzymes, and the ligated dna used to transform e. coli top10. the construct was verified by sequencing. construction of pegfp-c1/ires/lacz was as follows: the emcv ires sequence was amplified from the pires2-egfp vector (clontech) using bio-x-act dna polymerase and the primers ires bgl f (5′-actcgaagatcttaatagagcttcgaattctgcagtcga-3′), containing the bglii restriction site (bold) and translation termination codons (underlined) to prevent carry over translation as before, and ires sac r (5′-actcgagagctctgtggccatattatcatcgtg-3′), containing the saci restriction site (bold). the ires amplicon was ligated to the pegfp-c1 vector following restriction enzyme digest of both amplicon and vector with bglii and saci restriction enzymes, and the ligated dna used to transform e. coli top10.
figure 8 (caption fragment): whole cell lysate was collected at the following time points following chx treatment: 0 hr, 1 hr, 3 hr, 6 hr, 12 hr, 24 hr. bradford analysis was performed on the lysates to ensure equal loading. following western analysis the ecl treated membrane was exposed to film for 1 min. molecular weight marker is represented in lane 1. doi:10.1371/journal.pone.0002169.g008
the lacz amplicon described previously was ligated to the intermediate pegfp-c1/ires vector following restriction enzyme digest of both amplicon and vector with hindiii and apai restriction enzymes, and the ligated dna used to transform e. coli top10.
the jv 3c protease mutant was created by point mutation of the critical tgt encoded cysteine residue, within the gdcg active site motif, to a ggt encoded glycine residue by mutagenic overlap pcr using bio-x-act dna polymerase. three rounds of amplification using the jv full length cdna clone as template were used to generate the final mutant protease cassette. round 1 used the primers jv f1 (5′-cgtctcagggttgatact-3′) and jv mut 1 (5′-gcaaccaccgtcaccag-3′), yielding a 222 bp amplicon (point mutation nucleotide shown in bold). round 2 used the primers jv mut 2 (5′-ctggtgacggtggttgc-3′) and jv r2 (5′-ttcctgggaggaacaagtt-3′), yielding a 651 bp amplicon. amplicons generated in rounds 1 and 2 were pooled to serve as template for round 3 using the primers jv nf (5′-atgtcaaccaccaccagc-3′) and jv nr (5′-aagggctccggtgaagg-3′). this cassette contained two bcli restriction sites flanking the 3c protease active site, as also found in the wild-type full length clone. restriction digest using bcli was used to remove the appropriate wild-type cassette from the jv full length clone. the mutant cassette was also digested with bcli prior to ligation to the bcli-digested jv full length clone. the ligated dna was used to transform e. coli top10, and was designated jv 3c mut.
construction of jv 3c mut/polystop was as follows: complementary oligonucleotides with three translation termination codons (underlined) in each reading frame in sense and anti-sense orientations were designed in such a way that upon annealing the duplex would contain blunt termini. the oligonucleotides were termed pstop top (5′-ctaggtaagtaaacgcgtctactcactcac-3′) and pstop comp (5′-gtgagtgagtagacgcgtttacttcaatag-3′).
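mutagenic overlap pcr relies on the two internal primers being complementary, so that the round 1 and round 2 amplicons can anneal to each other and prime round 3. as a minimal python sketch (an illustrative check, not part of the original protocol), the jv mut 1 and jv mut 2 sequences quoted above can be tested for exact reverse complementarity; the helper name `reverse_complement` is an assumption introduced here.

```python
# Translation table mapping each lowercase DNA base to its complement.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (lowercase output)."""
    return seq.lower().translate(COMPLEMENT)[::-1]


# Internal mutagenic primers as quoted in the text (5' to 3').
JV_MUT_1 = "gcaaccaccgtcaccag"
JV_MUT_2 = "ctggtgacggtggttgc"

# The two primers should anneal over their full length.
primers_overlap = reverse_complement(JV_MUT_1) == JV_MUT_2
print("jv mut 1 and jv mut 2 are reverse complements:", primers_overlap)
```

the check confirms that the two 17-nucleotide primers are exact reverse complements, as expected for an overlap extension design in which both carry the same point mutation.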
each oligo (1 µg) was incubated with t4 polynucleotide kinase and atp to phosphorylate the 5′ termini, pooled and heated to 75 °c for 15 min, and left to cool to room temperature to anneal the oligos. following purification the polystop duplex was ligated to eco47iii digested jv 3c mut, and ligated dna was used to transform e. coli top10. the duplex contained the unique restriction site mlui (shown in bold) to assist screening of recombinant clones.
the v5 epitope (n-gly-lys-pro-ile-pro-asn-pro-leu-leu-gly-leu-asp-ser-thr-c) is recognized by the anti-v5 monoclonal antibody (invitrogen). complementary oligonucleotides encoding the v5 epitope were designed in such a way as to generate sacii compatible termini following annealing (bold), and to preserve the reading frame when inserted into the sacii restriction site at nucleotide 123 within the 5′gs of the jv genome (underlined). the oligos were termed v5 top (5′-ggtaagcctatccctaaccctctcctcggtctcgattctacgagc-3′) and v5 comp (5′-tcgtagaatcgagaccgaggagagggttagggataggcttaccgc-3′). the oligos were phosphorylated and annealed as described previously and the duplex ligated to the sacii digested jv full length clone. ligated dna was used to transform e. coli top10.
in vitro coupled transcription and translation was performed using the tnt® coupled reticulocyte lysate system (promega) as per the manufacturer's instructions. reactions were incubated at 30 °c for 1-2 hr. for non-radiolabelled reactions the 35s-methionine was replaced with 1 mm unlabelled methionine (2 µl). reaction products (1-2 µl) were analysed by sds-page. gels were stained and prepared for autoradiography by incubating for 30 min in a solution containing 32 g sodium salicylate, 100 ml methanol and 100 ml dh2o. gels were dried under vacuum and the reaction products were detected by exposure to kodak x-omat scientific imaging film (sigma) at −70 °c for 16 hr followed by developing using a kodak automated developer.
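the design of the v5 top oligonucleotide described above can be checked by translating it in frame and comparing the result with the stated epitope. the following python sketch is illustrative only: it builds the standard genetic code table and translates the v5 top sequence from its first nucleotide, which is an assumption about the reading frame of the oligo itself rather than a statement about the frame at the insertion site in the jv genome.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order at each position.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate a DNA sequence into one-letter amino acids from `frame`."""
    dna = dna.lower()
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3)
    )


# Oligonucleotide and epitope as quoted in the text.
V5_TOP = "ggtaagcctatccctaaccctctcctcggtctcgattctacgagc"
V5_EPITOPE = "GKPIPNPLLGLDST"

translated = translate(V5_TOP)
print(translated)
print("encodes the v5 epitope:", translated.startswith(V5_EPITOPE))
```

translated from its first nucleotide, the oligo yields the 14 residues of the v5 epitope followed by one further codon (agc, serine), which lies outside the epitope itself.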
specific v5-tagged proteins synthesised by tnt were precipitated from 5-10 µl of reaction product using the anti-v5 monoclonal antibody (invitrogen) at the recommended dilution in 600 µl of 1x ripa buffer (diluted from 10x stock: 10 mm tris-hcl (ph 7.5), 1 mm edta, 0.15 mm nacl, 0.1% sds, 0.5% empigen bb, 0.1 mm phenylmethylsulphonylfluoride) for 1 hr at 37 °c. this was followed by a second incubation of the tube for 2 hr, rotating at room temperature, with goat anti-mouse immunoglobulin g agarose beads (sigma) to absorb the immune complexes. the beads were washed three times with 500 µl 1x ripa buffer and once with 500 µl pbs. the beads were resuspended in sample buffer for analysis by sds-page and autoradiography as before.
endotoxin-free preparations of plasmid dna were prepared using the genelute™ endotoxin free plasmid midi prep kit (sigma). crandall-reese feline kidney cells (crfks) were seeded into a 12 well tray at approximately 40-50% confluence. crfk cells were transfected with no dna (negative control), psv-β-gal (control for β-galactosidase activity), pegfp-c1/jv 5′gs/lacz and pegfp-c1/ires/lacz (control for ires activity) using the superfect™ transfection reagent (qiagen) as per the manufacturer's recommendations. following a 16 hour incubation the cells were observed for egfp expression using a leica leitz dmrb fluorescence microscope. the cells were washed in pbs and fixed using a 0.5% solution of glutaraldehyde for 30 min at room temperature. the cells were incubated with an x-gal stain solution: 5 mm k3fe(cn)6, 5 mm k4fe(cn)6, 2 mm mgcl2, 1x x-gal (sigma) for 4 hours at 37 °c and were observed for β-galactosidase activity by light microscopy. the experiment was performed more than once to confirm the results.
jv v5 and jv flc t7 cdna plasmid constructs were linearised using ndei (invitrogen). capped rna was synthesised using the mmessage mmachine® capped rna transcription kit (ambion) according to the manufacturer's instructions.
crfk cells were seeded into 6 well trays at approximately 50% confluence and were transfected with 2 µg purified rna per well using transmessenger transfection reagent (qiagen) according to the manufacturer's instructions.
for immunofluorescence crfk cells were seeded onto 19 mm coverslips in 6 well trays and were transfected with rna as described. following a 24 hr incubation the coverslips were washed with pbs and fixed in 4% formaldehyde for 15 min at room temperature. cells were permeabilised and blocked in saponin buffer, also used as staining buffer (0.1% saponin, 10% foetal calf serum, 0.1% sodium azide), for 1 hr at 4 °c. cells were stained using an anti-v5 monoclonal antibody (invitrogen) followed by an anti-mouse alexafluor 488 conjugated secondary antibody (molecular probes) at the recommended dilution in staining buffer for 30 min in the dark. cells were then stained for 30 min in the dark with a wheat germ agglutinin alexafluor 594 nm conjugate (molecular probes) to allow identification of plasma and golgi membranes. coverslips were washed and mounted onto slides using vectashield containing dapi (vector labs). microscopy was performed using an inverted leica tcs-nt confocal laser scanning microscope.
the anti-v5 antibody was also used to detect v5-tagged protein by western blot. cell lysates were prepared following transfection using lysis buffer (0.15 m sodium chloride, 0.5% (v/v) sodium deoxycholate, 0.1% (w/v) sds, 50 mm tris-cl ph 8.0) and protease inhibitor cocktail (sigma). lysates were incubated for 15 min on ice followed by sonication to shear genomic dna. following bradford analysis equal protein content from jv v5 and jv flc lysates was run on a 10% sds-page gel and subsequently transferred onto immobilon-p pvdf membrane (millipore) according to the manufacturer's recommendations.
the membrane was probed using the anti-v5 monoclonal antibody at the manufacturer's recommended dilution, followed by an anti-mouse hrp-conjugated secondary antibody (santa cruz) at the recommended dilution. the ecl western blotting reagents kit (g.e. healthcare) was used to detect antibody bound protein, which was visualised by exposure to biomax light film (kodak).
crfk cells were seeded into 6 well trays and transfected with capped jv v5 rna as described. following a 24 hr incubation cycloheximide (sigma) was added to the cells at a final concentration of 50 µg/ml. cells were harvested at indicated times for the preparation of lysates for v5 western analysis as described above. bradford reagent (sigma) was used to ensure equal loading of lysates according to the manufacturer's recommendations.
[1] studies into diarrhoea of young calves. sixth communication: detection and determination of pathogenicity of a bovine corona virus and an undefined icosahedric virus. archives of experimental veterinary medicine
[2] studies into diarrhoea of young calves-seventh communication: ''zackenvirus'' (jena-agens 117/80)-a new diarrhoea pathogen to calf. archives of experimental veterinary medicine
[3] complete genomic characterization and antigenic relatedness of genogroup iii, genotype 2 bovine noroviruses
[4] genotype 1 and genotype 2 bovine noroviruses are antigenically distinct but share a cross-reactive epitope with human noroviruses
[5] characterization of a calici-like virus (newbury agent) found in association with astrovirus in bovine diarrhea
[6] the seroepidemiology of genogroup 1 and genogroup 2 norwalk-like viruses in italy
[7] molecular characterization of a bovine enteric calicivirus: relationship to the norwalk-like viruses
[8] two nonoverlapping domains on the norwalk virus open reading frame 3 (orf3) protein are involved in the formation of the phosphorylated 35k protein and in orf3-capsid protein interactions
[9] processing map and essential cleavage sites of the nonstructural polyprotein encoded by orf1 of the feline calicivirus genome
[10] rabbit hemorrhagic disease virus: genome organisation and polyprotein processing of a calicivirus studied after transient expression of cdna constructs
[11] organisation and expression of calicivirus genes
[12] cleavage map and proteolytic processing of the murine norovirus nonstructural polyprotein in infected cells
[13] norovirus protein structure and function
[14] open reading frame 1 of the norwalk-like virus camberwell: completion of sequence and expression in mammalian cells
[15] norwalk virus n-terminal nonstructural protein is associated with disassembly of the golgi complex in transfected cells
[16] norwalk virus nonstructural protein p48 forms a complex with the snare regulator vap-a and prevents cell surface expression of vesicular stomatitis virus g protein
[17] activity of the norovirus camberwell proteinase and cleavage of the n-terminal protein encoded by orf1
[18] sequence and genomic organization of norwalk virus
[19] la, ptb, and pab proteins bind to the 3′ untranslated region of norwalk virus genomic rna
[20] requirement of poly(rc) binding protein 2 for translation of poliovirus rna
[21] stem-loop structure synergy in binding cellular proteins to the 5′ noncoding region of poliovirus rna
[22] direct evidence that polypyrimidine tract binding protein (ptb) is essential for internal initiation of translation of encephalomyocarditis virus rna
[23] sonenberg n (1993) la autoantigen enhances and corrects aberrant translation of poliovirus rna in reticulocyte lysate
[24] interaction of cellular proteins with the 5′ end of norwalk virus genomic rna
[25] the genome-linked protein vpg of the norwalk virus binds eif3, suggesting its role in translation initiation complex recruitment
[26] vpg of murine norovirus binds translation initiation factors in infected cells
[27] new ways of initiating translation in eukaryotes?
[28] polyprotein processing in southampton virus: cleavage in trans by the 3c-like protease
[29] identification of an epitope on the p-proteins and v-proteins of simian-virus 5 that distinguishes between 2 isolates with different biological characteristics
[30] electron-microscopic observation of feline kidney-cells infected with a feline calicivirus
[31] viral gastroenteritis outbreaks in europe
key: cord-017817-ztp7w9yh authors: land, walter gottlieb title: cell-autonomous (cell-intrinsic) stress responses date: 2018-03-28 journal: damage-associated molecular patterns in human diseases doi: 10.1007/978-3-319-78655-1_18 sha: doc_id: 17817 cord_uid: ztp7w9yh in this chapter, the role of cell-intrinsic stress responses is examined which include autophagic processes, the oxidative stress response, the heat shock response, the unfolded protein response, and the dna damage response. autophagy (macroautophagy, microautophagy, and chaperone-mediated autophagy) is a self-digestive process in response to environmental stress to eukaryotic cells, by which cytoplasmic components are delivered to the lysosome for recycling and degradation.
the oxidative stress response is directed against any oxidative stress and is mediated by antioxidative defense systems including antioxidant enzymes such as superoxide dismutase, detoxifying enzymes such as glutathione peroxidase, and energy-dependent efflux pumps. the heat shock response is induced upon exposure of cells to any stress condition and characterized by emission of heat shock proteins which operate as damps to maintain and restore homeostasis. the unfolded protein response is induced by any stress of the endoplasmic reticulum that is perceived by three sensor molecules. under remediable endoplasmic reticulum stress conditions, the sensors trigger signalling pathways to resolve this stress. however, in severe irremediable endoplasmic reticulum stress, the unfolded protein response may lead to pro-inflammatory and pro-apoptotic responses resulting in regulated cell death. finally, the dna damage response is induced by any dna damage that occurs in a variety of exogenous and endogenous conditions. when successful, this stress response leads to dna repair and is associated with the emission of various damps which contribute to restoration of homeostasis. when unsuccessful, the dna damage response, like the unsuccessful unfolded protein response, can result in regulated cell death, either in form of apoptosis or necrosis. together, the ultimate goal of all the stress responses is to maintain cellular homeostasis and ensure cell integrity. when they fail, the incidence of regulated cell death is frequently observed. as comprehensively described in part ii, prms are specifically involved in the recognition of mamps and damps. 
as will be discussed in part vi, each of these recognition receptors can trigger distinct signalling cascades in innate immune cells that modify their gene expression to create and execute efferent innate immune responses that involve (1) production of inflammatory mediator substances such as cytokines and chemokines, (2) phagocytosis, and (3) cytotoxicity, as well as, as described in part viii, may elicit and shape antigen-specific adaptive immune responses. beyond this well-characterized mamp/damp engagement of prms leading to a variety of downstream efferent cellular and humoral responses, the innate immune defense program also depends on cell-autonomous, that is, cell-intrinsic, responses which counteract any stressful insult [1] . constitutive cell-autonomous immunity mobilizes pre-existing molecules and processes in order to primarily and quickly defend the cell and the host against infectious and sterile injury. hence it can be considered as the very first line of innate immune defense. here, the role of constitutive cell-autonomous responses will be examined, whose involvement in the innate immune defense to stress and injury has only been appreciated within the last few years. the focus of this brief overview will be mainly directed toward cellular stress responses. the term autophagy comes from the greek words "phagy" meaning eat and "auto" meaning self. autophagy is an evolutionarily highly conserved self-digestive process in response to environmental stress to eukaryotic cells, by which cytoplasmic components such as defective/damaged or redundant organelles or protein aggregates are delivered to the lysosome for recycling and degradation. there is convincing evidence indicating that activation of the autophagic process is promoted by mamps and/or damps [2, 3] . 
put more simply, autophagy is a classical cell-protective and cell-autonomous process of the innate immune system aimed at maintaining and restoring homeostasis at both the cellular (cell-intrinsic) and organismal (cell-extrinsic) level [4]. although autophagy was initially identified in mammals, a significant breakthrough in our understanding of how autophagy is controlled came from analyses in the genetically tractable yeast system. pioneering work from ohsumi's group showed that the morphology of autophagy in yeast was similar to that documented in mammals [5]. (ohsumi was awarded the 2016 nobel prize in physiology or medicine.) in fact, the discovery of the autophagy-related genes in yeast has significantly advanced the understanding of the molecular mechanisms participating in autophagy and the genes involved in regulating the autophagic pathway. many yeast genes have mammalian homologues, confirming that the basic machinery for autophagy has been evolutionarily conserved across eukaryotes [6-9]. notably, a panel of leading experts in the field of autophagy has recently published a new definition of several autophagy-related terms based on specific biochemical features [10]. accordingly, in the following, three types of autophagy are briefly sketched: macroautophagy, microautophagy, and, in mammals, chaperone-mediated autophagy. each of them fulfils very specific tasks in intracellular degradation. there is general agreement on two main features that characterize bona fide, functional autophagic responses, irrespective of type: (1) they involve cytoplasmic material; and (2) they culminate with (and strictly depend on) lysosomal degradation [10]. thus, although autophagy substrates can be endogenous, such as damaged cellular organelles, or exogenous, such as viruses or bacteria escaping phagosomes, autophagy acts on entities that are freely accessible to cytosolic proteins.
this property is essential in order to distinguish between autophagic responses and branches of vesicular trafficking that originate at the plasma membrane and also culminate in lysosomal degradation. such endocytic processes include phagocytosis, receptor-mediated endocytosis, and macropinocytosis, which are dealt with in part vi, sect. 22.6. of note, however, some forms of autophagy and the endocytic pathway interact at multiple levels, and the molecular machinery responsible for the fusion of late endosomes (also known as mvbs) or autophagosomes with lysosomes is essentially the same [11]. as stressed [10], the strict dependency of autophagic responses on lysosomal activity is necessary to discriminate them from other catabolic pathways that also involve cytoplasmic material, such as proteasomal degradation [12]. thus, the 26s proteasome (box 18.1) degrades a large number of misfolded cytoplasmic proteins that have been ubiquitinated (for (poly)ubiquitination, see box 18.2) as well as properly folded proteins that expose specific degradation signals, such as the so-called n-degrons [13]. on the other hand, the proteasome system shares some substrates with different forms of autophagy, although these two catabolic pathways differ drastically in their final products. thus, proteasomal degradation results in short peptides that are not necessarily degraded further but may flow into additional processes including but not limited to antigen presentation/cross-presentation at the plasma membrane, thereby generating mhc-ii and mhc-i epitopes (compare part viii, chap. 31). by contrast, lysosomal proteases fully catabolize polypeptides to their constituent amino acids, which eventually become available for metabolic reactions or repair processes.
together, as summarized [10], bona fide functional autophagic responses navigate cytoplasmic material of endogenous or exogenous origin to degradation within lysosomes (or late endosomes, in specific cases).

box 18.1. the proteasome is a common complex for all living cells, needed to recycle and eliminate unwanted proteins. by analogy, it resembles a chaff-cutter. this molecular machine provides a pathway that is involved in many cellular processes such as protein degradation, antigen processing, cell cycle, apoptosis, and dna repair. the 26s proteasome that is present in the cytoplasm and nucleus is usually formed by one 20s proteasome complex and two 19s proteasome complexes, which are composed of proteases and structural units. the 26s proteasome is a giant protease responsible for the regulated degradation of polyubiquitylated proteins (see box 18.2). it consists of at least 33 distinct subunits and is arranged into two modules: a core particle containing catalytic sites and regulatory particles. the cylinder-shaped proteolytic core is the 20s core particle, which is capped at one or both ends by 19s regulatory particles. further reading: wehmer m, sakata e. recent advances in the structural biology of the 26s proteasome. int j biochem cell biol 2016;79:437-442.

box 18.2. polyubiquitination is the binding of many ubiquitin molecules to the same target protein. in its simplest form, ubiquitin can be attached to the target protein as a single moiety resulting in monoubiquitination. ubiquitin itself can be ubiquitinated, resulting in the formation of ubiquitin chains attached to the target protein: polyubiquitination. polyubiquitination of proteins is the triggering signal that leads to subsequent degradation of the protein in the proteasome. ligases play a central role in polyubiquitination. ligases are enzymes that catalyze the synthesis of polyubiquitin chains. ubiquitin conjugation typically requires three classes of enzymes. e1 (ubiquitin-activating enzyme) hydrolyzes atp and forms a thioester-linked complex between itself and ubiquitin. e2 (ubiquitin-conjugating enzyme) receives ubiquitin from e1 and forms a similar thioester-linked intermediate with ubiquitin. e3 (ubiquitin ligase) finally binds both the e2 and a substrate and catalyzes the transfer of ubiquitin to the substrate. ubiquitin itself is often a substrate for further ubiquitylation, which results in the formation of so-called polyubiquitin chains. ubiquitin has seven lysine residues, and depending on the lysine residue used for ubiquitin-ubiquitin chain formation, the polyubiquitin chain can signal different functions. proteins modified by lysine-48 (k48)- or lysine-29 (k29)-linked chains are usually degraded by the proteasome. in contrast, modification by k63-linked chains or by a single ubiquitin moiety (monoubiquitylation) seems to trigger other functions, e.g., protein sorting, gene expression, and dna repair. further reading: callis j. the ubiquitination machinery of the ubiquitin system. arabidopsis book 2014;12:e0174.

basically, the term macroautophagy is often used when describing autophagy in general. the phenomenon is characterized by its typical morphological features which involve dedicated vesicles that can occupy a considerable part of the cytoplasm. typically, macroautophagy is one type of autophagic process in which the substrates are sequestered within cytosolic double-membrane vesicles termed autophagosomes. the substrates of macroautophagy include superfluous and damaged organelles, cytosolic proteins, and invasive microbes. the formation and regulation of macroautophagy are very complex processes that are outlined here in a considerably simplified way. macroautophagy involves the sequestration of cytoplasm via a double-membrane intermediate structure termed the phagophore which matures into an autophagosome; the latter compartment fuses with a lysosome allowing degradation and recycling of the cargo [14]. in more detail, the process begins with the formation of a membrane of unknown origin, the initial phagophore or isolation membrane. the phagophore then expands, surrounds proteins or organelles, sequesters cytoplasm, and, on completion, develops into a large double-membrane transport vesicle, the autophagosome. subsequently, the autophagosome fuses with a lysosome containing acid hydrolases and releases its contents into this lytic compartment as part of single-membrane vesicles, termed autophagic bodies. the fused compartment where the autophagic body and its contents are degraded is called an autophagolysosome or autolysosome (fig. 18.1). notably, the process of phagophore expansion provides tremendous flexibility and capacity with regard to cargo, allowing entire organelles to be deleted via autophagy; however, this flexibility also means that autophagy must be tightly controlled in order to prevent inappropriate degradation, which could lead to cell death (for relevant papers, see [6-9, 14-16]). intensive studies have been carried out in the past two decades to understand the mechanism and regulation of autophagy. the biogenesis of autophagosomes needs the ordered intervention of autophagy-related (atg) proteins that act on different modules. thus, more than 30 atg genes have been identified in humans that orchestrate the complex membrane dynamics involved in autophagic sequestration. these atg proteins act sequentially in three macromolecular complexes involved in the three successive stages of autophagy. initiation of autophagy requires the unc-51-like kinase 1 (ulk1)-atg13-fip200 (also known as rb1-inducible coiled-coil 1) complex, whereby the kinase activity of ulk1 is controlled by the kinase mammalian target of rapamycin (mtor) in mtor complex 1 (mtorc1), which is sensitive to rapamycin [9].
the next process, membrane nucleation, requires the beclin1/class iii pi3k complex, which also plays a major role in membrane trafficking and restructuring involved in autophagy [15, 17]; the final process refers to the elongation, expansion, and closure of the phagophore membrane/autophagosome which mainly relies on atg8/microtubule-associated protein 1 light chain 3 (lc3) lipidation. in fact, atg8/lc3 lipidation is regarded as a hallmark of autophagy and is established by a covalent linkage of cytosolic lc3 to the lipid phosphatidylethanolamine on the surface of the autophagosome [7, 9]. of note, in addition to the cytoplasmic ptm of various atg proteins, recent studies have explored the transcriptional and epigenetic control of autophagy [18]. notably, in human cells, tfeb (for transcription factor eb) and zkscan3 (for zinc finger with krab and scan domains 3) were shown to play a crucial role in autophagy regulation [19, 20]. also, there is growing evidence in support of the notion that histone modification/dna methylation acts as an alternative approach for long-term autophagy control [21] (for histone modification, see part vi, sect. 24.2.3). also recently, a new ampk→skp2→carm1 (for: amp-activated protein kinase; s-phase kinase-associated protein 2 (p45); coactivator-associated arginine methyltransferase 1) regulatory axis was reported that incorporated cellular nutrient sensing with transcriptional as well as epigenetic control of autophagy [22]. as concluded by xu and klionsky [14], "…this ampk-skp2-carm1 signalling axis integrates the various levels of autophagy regulation including cell signalling, and transcriptional regulation as well as epigenetic modification. epigenetic and transcriptional regulation provides an energy-saving approach for control and also create an enduring memory in preparation for future adverse events.
thus, this study has deepened our understanding of how autophagy can be controlled in a holistic manner by pathways linking a multitude of regulation mechanisms. given the extensive involvement of autophagy in human diseases, this work also presents potential directions for novel therapeutic intervention." indeed, besides its beneficial function in controlling cellular homeostasis, macroautophagic pathways when disrupted can have severe consequences leading to major diseases such as cancer, metabolic and neurodegenerative disorders, and cardiovascular and pulmonary diseases [23] . of note, macroautophagy can be divided into two subtypes depending on the organelle that is targeted for autophagic degradation; thus, the process of mitophagy corresponds to autophagy of mitochondria, whereas the term er-phagy refers to autophagy of the endoplasmic reticulum (er). both processes deserve a few more words in the following subsection. the term mitophagy corresponds to cargo-specific autophagy of mitochondria, a process which mediates the selective removal of mitochondria [24, 25] . the aim of mitophagy is to eliminate mitochondria, either to regulate their number to adjust to metabolic demand or to explicitly remove those that are damaged in terms of a quality control. mechanistically, mitochondria are selectively recruited into isolation membranes, which seal and then fuse with lysosomes to eliminate the trapped mitochondria. as discussed [24] , mitophagy is preceded by so-called mitochondrial fission that divides elongated mitochondria into pieces of manageable size for encapsulation and also controls segregation of damaged mitochondrial material for selective removal by mitophagy. the term er-phagy (also called micro-er-phagy) refers to a process of distinct selective degradation of er membranes and proteins in the lysosome under stress, and this is independent of the core autophagy machinery [26] [27] [28] (for er stress, see sect. 18.5) . 
studies on yeast showed that er-phagy is characterized by the fact that stress-induced er whorls are selectively taken up into the vacuole, the yeast lysosome. import into the vacuole was found not to involve autophagosomes but occurs through invagination of the vacuolar membrane, indicating that er-phagy is topologically equivalent to microautophagy [27] . recent studies on yeast provide evidence suggesting that the atg proteins atg39 and atg40 are specific receptors for this pathway of er-phagy [28] . at this point, it also appears worthwhile to mention that a more recent study on yeast revealed a novel er quality-control pathway, namely, the so-called macro-er-phagy. first results from this study suggest that this pathway delivers an excess of integral-membrane proteins from the er to the lysosome for degradation and, typically, requires the core autophagy machinery [29] . the brief overview about macroautophagy provides another typical example of innate immune responses which, when controlled, operate in a beneficial homeostatic way but, when uncontrolled, may lead to severe pathologies. for other forms of phagocytic responses, this phenomenon has not been investigated sufficiently. clearly, mitophagy also plays a key homeostatic role in mitochondrial quality control. upregulation of mitophagy has been shown to mitigate excessive mitochondrial accumulation and toxicity to safeguard mitochondrial fitness. hence, mitophagy is a viable target to promote longevity and prevent age-related pathologies [25] . concerning the two types of er-phagy (micro-and macro-er-phagy), one has to state that research in this exciting field has just begun. several questions remain to be addressed, for example, what is the purpose of er-phagy and what are the underlying mechanisms. future studies will probably provide a clue to elucidating the molecular mechanisms and physiologic roles of er-phagy in other organisms. 
of note, besides mitophagy and er-phagy, other specific forms of autophagic pathways have been described. they include
- pexophagy as a macroautophagic response preferentially targeting peroxisomes
- nucleophagy as an autophagic response selectively targeting portions of the nucleus
- ribophagy as a specific autophagic response targeting ribosomes
- aggrephagy as an autophagic response specific for protein aggregates
- lipophagy in terms of selective autophagic degradation of neutral lipid droplets
- bacterial xenophagy as a macroautophagic removal of cytoplasmic bacteria which have escaped the phagosomal compartment upon phagocytosis
- viral xenophagy as a macroautophagic response targeting fully formed cytoplasmic virions or components thereof
- proteaphagy in terms of macroautophagic responses specific for inactive proteasomes
- lysophagy as a specific macroautophagic disposal of damaged lysosomes in mammalian cells
for details of these specific forms of autophagic responses, the reader is referred to the excellent comprehensive review article of galluzzi et al. [10]. in addition to macroautophagy, two other types of autophagy have been described, called microautophagy and chaperone-mediated autophagy (cma). microautophagy together with macroautophagy plays, for example, a role in nutrient recycling under starvation. on the other hand, cma is known to contribute to the maintenance of cellular homeostasis by facilitating recycling of amino acids of the degraded proteins and by eliminating abnormal or damaged proteins, thereby exerting major regulatory functions in different pathophysiological scenarios such as metabolic regulation. here, a few aspects of these two types of autophagic responses are touched on briefly. in contrast to macroautophagy, the process of microautophagy is much less defined in mammals since most studies have been performed in yeast and plants. according to current models, the term refers to a collection of diverse processes.
unlike macroautophagy, microautophagy does not involve the autophagosome-dependent degradation of cytoplasmic components but characteristically relies on the direct engulfment of small portions of cytoplasm into lysosomes or late endosomes by invagination and inward budding of the lysosomal/endosomal membrane, a process that leads to their degradation [30-32]. though microautophagy is the least studied form of autophagy, a molecular signature of the process has begun to emerge and has led to the definition of microautophagy as a type of autophagy in which the cargo is directly internalized in small vesicles that form at the surface of the lysosome/vacuole or late endosomes (multivesicular bodies), respectively [10]. in addition to macroautophagy and microautophagy, there is another type of autophagy receiving increased attention, namely cma. characteristically, in cma, cargo delivery also occurs directly at lysosomes, but it requires neither the formation of vesicles nor membrane invagination. instead, the substrate proteins for this autophagic pathway cross the lysosomal membrane through a protein-translocation complex. this process requires protein interaction with the chaperone hspa8 (also known as hsc70) and association of hspa8 with a specific splicing isoform of lamp-2, that is, the lysosomal protein lamp-2a. thus, chaperone-bound autophagy substrates bind lamp-2a monomers on the cytosolic side of the lysosome, which stimulates the formation of an oligomeric lamp-2a translocation complex [10, 33-35]. essential functions that cma fulfils in cells include a contribution to amino acid recycling during prolonged starvation as well as quality control, directly linked to the ability of this pathway to selectively remove single proteins from the cytosol. for example, cma is up-regulated during oxidative stress where it contributes to the degradation of oxidized proteins (reviewed in [36]) (see sect. 18.3).
of note, growing evidence demonstrates that malfunction of cma plays a vital role in the pathogenesis of severe human disorders. often, the mechanisms underlying the alterations of cma in these pathologies involve perturbations in the functioning of the cma translocation complex. both diminished and enhanced cma activities have been shown to associate with diseases, an observation that emphasizes the importance of a tight regulation of cma activity (highlighted in [36] ). as argued above, current knowledge about mechanism and physiological relevance of microautophagy in mammalian cells is hard to judge since most findings derive from studies in yeast. however, future studies aimed at identifying proteins controlling microautophagy-related vacuolar membrane changes in yeast will probably allow to search for homologues in mammals and then investigate their contribution to mammalian microautophagy. by contrast, research on cma has already made substantial progress. for example, the recent identification of a plethora of new cma substrates and deficiencies in cma associated with diverse human pathologies has expanded our understanding of the importance of cma in multiple cellular functions. in fact, the growing number of connections between cma and human diseases has already generated interest in modulating cma activity for therapeutic purposes. there is a close relationship between autophagy and mamps and/or damps in the cellular response to injury. in fact, autophagy cannot be restricted to an innate immune mechanism that controls intracellular homeostasis alone but has to be extended in terms of "immunological autophagy" to a process that is committed to control and regulate efferent innate and potential adaptive immune responses. the "medium" for achieving this goal is the mamps and/or damps which operate as a link between intracellular and extracellular events. this scenario is briefly touched in the following. 
growing evidence indicates that autophagy regulates release and degradation of damps-here in terms of inducible damps-including hmgb1, atp, and dna in several cell types [37] . for example, autophagic mechanisms reportedly promote and regulate the release and secretion of hmgb1 in a ros-dependent manner in fibroblasts, macrophages, and cancer as well as net-mediated release of hmgb1 in neutrophils [37, 38] . moreover, autophagy has been shown to be required for the liberation/active secretion of atp by dying cancer cells [39, 40] . in addition, autophagy was found to contribute to the regulation of the ddr at multiple levels, that is, a process associated with the emission of damps [37, 41] (for ddr, see sect. 18.6). via emission of damps, eventually, together with mamps, autophagy can amplify or even instigate mamp/damp-prm signalling leading to efferent innate immune responses. on the other hand, autophagy can inhibit pro-inflammatory signalling cascades. for example, the atg5-atg12 complex, a key regulator of the autophagic process, was shown to negatively regulate rlr signalling by direct binding to card domains of rig-i and interferon promoter-stimulating factor-1 (ips-1) [42] (compare part vi, sect. 22.3.6). moreover, as reviewed elsewhere [1] , autophagy has been found to inhibit both nlrp3 and aim2 inflammasome activation and subsequent production of pro-inflammatory cytokines il-1β and il-18 (for inflammasomes, see part vi, sects. 22.4.2 and 22.4.4). as a possible mechanism, the authors propose that inflammasome components and pro-il-1β are subjected to ubiquitination and subsequent degradation by autophagy, thereby leading to functional inactivation of inflammasomes. conversely, an increasing number of studies suggest that damps, including hmgb1, atp, and dna, are powerful stimuli and regulators to elicit autophagic responses [3, [43] [44] [45] [46] [47] [48] [49] [50] . 
for example, hmgb1 was demonstrated to be an important regulator of autophagy in various types of cancer cells and keratinocytes. mechanistically, the reduced form of the hmgb1 protein was proposed to be responsible for the promotion of autophagy in an ager/rage-dependent fashion [47, 48] . also, and of high interest, in a clinical study on patients with chronic hepatitis b, hmgb1-induced autophagy was found to maintain treg function during chronic viral infection [49] . moreover, in studies on a mini pig lung iri model, evidence was provided indicating that autophagy, when triggered by damps such as hmgb1 and hsp60 during iri, amplifies the inflammatory response through enhancing k63-linked ubiquitination of traf6 and activation of the downstream mapk and nf-κb signalling (for traf6, mapk, and nf-κb signalling, see part moreover, there is already first evidence suggesting a role of atp in the regulation of autophagy [50] . in addition, there are accumulating data indicating that cytosolic dna, dislocated as a result of dna damage, may contribute to the regulation of autophagy whereby the dna damage-regulated autophagy modulator 1 (dram1) appears to play a mechanistically crucial role [51] [52] [53] . the mechanisms involved in mamp/damp-activated autophagic responses have only partially been elucidated. in fact, there is convincing evidence suggesting that many mamp/damp-recognizing prms including tlrs (in particular endosomal tlrs), nlrs, and anti-dna receptors can activate autophagic responses by triggering specific signalling pathways (reviewed or discussed in [1, 2, 54, 55] ). the crosstalk between autophagic responses and damps represents a powerful instrument of the innate immune system to integrate and unify various tools for the promotion and regulation of injury-induced inflammation and, in the presence of nonself-or altered self-antigens, injury-induced adaptive immunity. 
thus, on the one hand, autophagy is known to promote and regulate the release of damps (though the exact mechanisms are still elusive); subsequently, damps via prm-triggered pathways participate in the regulation of inflammation. on the other hand, activation of prms by mamps and/or damps promotes autophagy activation through a mechanism that has been partially elucidated. in fact, an increasing number of findings suggest that this activation process is triggered by prms following recognition of mamps and/or damps. nevertheless, the precise molecular mechanisms by which prms modulate autophagy remain largely unknown. there is increasing evidence in support of the notion that mamp/damp-activated autophagic responses promote emission of damps which in turn support cellular homeostasis in the course of adaptive stress responses in healthy cells. notably, this cellular homeostatic effect may spread out and affect the whole organism via emission of autophagy-dependent damps. in other words, via damps, autophagy as a cell-intrinsic stress response can fortify its defending capability by providing a link to promotion and regulation of cell-extrinsic efferent innate immune and eventually subsequent adaptive immune responses. however, despite the fact that autophagy is one of the best-known cell-autonomous responses in innate immunity and has clearly been shown to counteract dangerous infectious and sterile cell stress, much is left unclear. one such issue concerns the definition of autophagy-dependent cell death. as discussed and summarized [10], autophagy-dependent cell death can be defined as a form of rcd (see next chapter) that can be retarded by pharmacological or genetic inhibition of macroautophagy. in this context, as stressed by galluzzi et al.
[10] , it is important to note that (1) specificity issues affect most, if not all, pharmacological agents employed so far for suppressing macroautophagic responses and (2) multiple components of the macroautophagy machinery have autophagy-independent functions. in view of these facts and findings, these authors recommend to favor genetic approaches and to test the involvement of at least two different proteins of the macroautophagy apparatus in a specific instance of rcd before etiologically attributing it to macroautophagy. other unclear issues refer to the specific modulation of autophagy by mamps and/or damps, the precise interaction of autophagy with innate immune signalling cascades, and the cooperation between autophagy and other physiologic cell-intrinsic and cell-extrinsic processes during scenarios of cell stress and tissue injury. efforts to solve these problems are of utmost importance in view of the fact that autophagy-when induced by excessive, chronic, or acute-repetitive emission of damps-can contribute to the pathogenesis of many human diseases, that is, acute and chronic, infectious, or sterile inflammatory disorders. at the respective places, they will often be mentioned in the following chapters as well as in volume 2. oxidative cell stress and tissue injury reflect most potent and omnipresent threats an organism is exposed to. though there is a robust defense response continuously operating, this kind of injury is known to contribute to the pathogenesis of many human diseases. how can this be? oxidative stress is caused by an imbalance between the production of oxidants such as ros on one side and the biological antioxidative defense system's ability on the other side to counter the oxidant levels with antioxidants, that is, to readily detoxify the toxic reactive species or easily repair the resulting damage. 
thus, it is the excessive production of ros, overriding the antioxidative capacities, that is pathophysiological and contributes to dysfunction, damage, and even death of cells. by contrast, generation of ros in physiological low/moderate concentrations (operating as second messenger molecules and causing so-called oxidative "eustress" [56]) assists in intracellular signalling pathways and, thus, is essential for optimal cell functions and homeostasis of an organism. in other words, the biological effects of ros, whether beneficial or deleterious, depend considerably on the amounts of ros present and in action. this is in agreement with the idea that cellular ros generation has characteristics of hormesis, that is, a dose-response phenomenon characterized by beneficial effects at low doses and deleterious effects at high toxic doses [57]. to guarantee this homeostatic function of ros, to keep these molecules within physiological limits, and to prevent their deleterious effects, that is, to maintain hormesis, a smooth running of the oxidative stress response is of utmost importance. accordingly, a few aspects of this critical stress response are addressed in the following. reactive oxygen species are produced from molecular oxygen as a result of normal cellular metabolism. to understand any discussion on a role of ros in host defense or human diseases, one should define free radicals. according to halliwell and gutteridge [58], "a free radical is any species capable of independent existence that contains 1 or more unpaired electrons." an unpaired electron is one that occupies an atomic or molecular orbital by itself. radicals can be formed by the loss of a single electron from a non-radical, or by the gain of a single electron by a non-radical. in this sense, superoxide anions (o₂·⁻), hydroxyl radicals (·oh), peroxyl radicals (ro₂·), and alkoxyl radicals (ro·) are oxygen radicals.
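the hormesis idea described above, with beneficial effects of ros at low doses and deleterious effects at high doses, can be illustrated with a small numerical sketch. the model below is purely hypothetical and is not taken from the chapter or from the cited literature: the function names, the saturating "benefit" term, the quadratic "damage" term, and all parameter values are assumptions chosen only to produce a biphasic curve, and the label "distress" is used here simply as the counterpart of "eustress".

```python
def net_ros_effect(dose, benefit=1.0, half_sat=1.0, toxicity=0.05):
    """Toy biphasic (hormetic) dose-response; all parameters are hypothetical.

    dose      -- ROS amount in arbitrary units (must be >= 0)
    benefit   -- maximal signalling benefit (assumed)
    half_sat  -- dose at which half of the benefit is reached (assumed)
    toxicity  -- weight of the damage term (assumed)
    """
    if dose < 0:
        raise ValueError("dose must be non-negative")
    signalling_gain = benefit * dose / (half_sat + dose)  # saturating benefit
    oxidative_damage = toxicity * dose ** 2               # accelerating harm
    return signalling_gain - oxidative_damage


def classify(dose, **params):
    """Label a dose by the sign of the toy net effect."""
    effect = net_ros_effect(dose, **params)
    if effect > 0:
        return "eustress"
    if effect < 0:
        return "distress"
    return "neutral"


if __name__ == "__main__":
    for dose in (0.0, 0.5, 2.0, 5.0, 10.0):
        print(dose, round(net_ros_effect(dose), 3), classify(dose))
```

in this sketch, low doses give a positive net effect and high doses a negative one; where the sign changes depends entirely on the assumed parameters and carries no biological meaning.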
of note, ros is a collective term often used by scientists to include not only the oxygen radicals but also some non-radical derivatives of oxygen such as h2o2, hypochlorous acid (hocl), ozone (o3), and singlet oxygen (¹o2). nitrogen-containing oxidants, such as no·, are called rns. generation of ros is generally a cascade of reactions that starts with the production of superoxide anions. superoxide rapidly dismutates to h2o2 either spontaneously (especially at low ph) or catalyzed by sod. other elements in the cascade of ros generation include the reaction of superoxide with no to form the very toxic peroxynitrite, the peroxidase-catalyzed formation of hocl from h2o2, and the iron-catalyzed fenton reaction, leading to the generation of hydroxyl radical. the oxidants are produced endogenously as by-products or metabolites of various metabolic processes. multiple enzyme systems produce superoxide radicals and their derivatives, including xanthine oxidoreductase (xor), nadph oxidases (noxes), and mitochondrial electron transport chain (etc)-associated molecular complexes (fig. 18) [61, 62, 64-70]. the mitochondrial etc is believed to be the main source of ros. in the following, some aspects of these three systems (other systems are not covered here) are briefly touched upon, exemplified by and focused on their role as vascular sources of ros production as investigated in models of iri. xanthine oxidoreductase (xor), as a housekeeping enzyme, is probably expressed in all cells but primarily in surface epithelia such as capillary endothelial tissue of various organs. xanthine oxidoreductase, a complex molybdoflavoprotein, is the rate-limiting enzyme in the catabolism of purines, where it catalyzes the last steps of purine metabolism: the conversion of hypoxanthine to xanthine and of xanthine to uric acid, with superoxide/h2o2 generated as by-products.
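the cascade of ros generation described above can be summarized as a short reaction scheme. the following is a simplified sketch in standard textbook stoichiometry, not a scheme taken from the cited references; chloride as the co-substrate of the peroxidase reaction and ferrous iron as the catalyst of the fenton reaction are assumptions based on general chemistry, since the text names neither explicitly.

```latex
\begin{align*}
2\,\mathrm{O_2^{\bullet-}} + 2\,\mathrm{H^+} &\longrightarrow \mathrm{H_2O_2} + \mathrm{O_2}
  && \text{(dismutation, spontaneous or SOD-catalyzed)}\\
\mathrm{O_2^{\bullet-}} + \mathrm{NO^{\bullet}} &\longrightarrow \mathrm{ONOO^-}
  && \text{(peroxynitrite formation)}\\
\mathrm{H_2O_2} + \mathrm{Cl^-} + \mathrm{H^+} &\longrightarrow \mathrm{HOCl} + \mathrm{H_2O}
  && \text{(peroxidase-catalyzed)}\\
\mathrm{Fe^{2+}} + \mathrm{H_2O_2} &\longrightarrow \mathrm{Fe^{3+}} + \mathrm{OH^-} + {}^{\bullet}\mathrm{OH}
  && \text{(Fenton reaction)}
\end{align*}
```

read from top to bottom, the scheme shows why superoxide is regarded as the starting point of the cascade: each of the other oxidants is derived either directly from superoxide or from its dismutation product h2o2.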
there is a definite role of xor in reperfusion of tissue and organs [59, 60] . for example, experiments performed in isolated rat hearts have demonstrated that radical generation and functional injury are decreased by inhibition of xor with oxypurinol. similarly, in human aortic or venous ecs, xor-mediated ros generation has been shown to be a central mechanism of oxygen radical generation upon postischemic reoxygenation [61, 62] . the nadph oxidases were initially considered as enzymes expressed only in phagocytic cells involved in host defense and innate immunity; however, recent evidence indicates that there is an entire family of noxes based on the discovery of gp91phox homologues. the family comprises seven members, including nox1, nox2 (formerly termed gp91phox), nox3, nox4, nox5, duox1, and duox2 [63] . three members out of the enzyme family are important sources of ros in the vasculature, namely, nox1, nox2, and nox4 [64] [65] [66] . today, noxes are perhaps the best-studied enzymes involved in ros production in the blood vessels. remarkably, different members of the nox/duox family engaged in iri are localized in various cells, that is, in vascular cells and phagocytes. this may lead to the notion that noxes in vascular cells are responsible for the first wave of ros production because vascular cells are first confronted with reintroduced molecular oxygen. as generally believed, there is, in fact, no vascular specific nox isoform but rather a complex expression of various nox isoforms in different cells and regions of the vascular system. nevertheless, in arteries from humans and animals, nox2, nox4, and a shallow level of nox1 have been consistently found to be present both as messenger rna and as protein [63] . altogether, the findings and data briefly described here make clear that vascular cells are equipped with efficient machinery able to efficiently produce ros. 
regarding the different enzymatic sources, superoxide radicals appear to be predominantly generated compared, for example, to hydroxyl radicals. mitochondria have been implicated as potential oxygen sensors by increasing the generation of ros which regulate a variety of hypoxic responses [67] [68] [69] [70] . in fact, mitochondria are increasingly recognized as lynchpins in the evolution of tissue injury during postischemic reperfusion. it is generally acknowledged that the majority of intracellular ros production is generated in the mitochondrial etc and its associated metabolic enzymes. however, very little is known about which mitochondrial sites are involved in physiological or pathological ros production under native conditions. of note, using inhibitors to manipulate the redox states of particular sites and prevent superoxide generation from others, at least ten different locations of superoxide/h 2 o 2 production in the etc and associated enzymes (krebs cycle, β-oxidation, etc.) have been identified in mammalian mitochondria. in fact, the relative and absolute contributions of specific sites to the production of ros in isolated mitochondria depend very strongly on the substrates being oxidized, and the same is likely valid in cells and in vivo [71] . for example, superoxide formation occurs on the outer mitochondrial membrane, in the matrix, and on both sides of the inner mitochondrial membrane (fig. 18.3) . complex i (nadh-ubiquinone oxidoreductase) accepts electrons from nadh; these electrons are carried to complex ii (the succinate dehydrogenase-coq oxidoreductase), where they are used to oxidize succinate to fumarate. afterward, electrons continue to travel down their electrochemical gradient to complex iii (the cytochrome bc1 complex (ubiquinol-cytochrome c oxidoreductase)), and subsequently to complex iv (cytochrome c oxidase); finally, the electrons are used to reduce molecular oxygen to water. 
thus, complex i and complex ii oxidize the energy-rich molecules nadh and reduced flavin adenine dinucleotide (fadh2), respectively, and then transfer the resulting electrons to ubiquinol, which carries them to complex iii (for competent articles, see [72, 73]). notably, complexes i and ii generate superoxide within the mitochondrial matrix, whereas complex iii produces superoxide at the qo site, resulting in the release of superoxide into either the intermembrane space or the matrix. regarding complex i, it was recently demonstrated that inhibition of nd5, a subunit of complex i, suppresses the activity of this complex and thus ros production [74]. furthermore, data from another set of studies on complex i showed that stable down-modulation of its subunits grim-19 and ndufs3 decreased complex i activity, which was associated with a significant reduction in the overall nadh oxidation rate but with an increased production of ros by the target cells [75]. similar results have been found in studies on complex ii: there is evidence suggesting that inhibition of complex ii at the level of subunits even leads to an increase in ros production. these phenomena can be explained by assuming that, if electrons provided in the course of the etc cannot efficiently be transferred to the next complex, they leak out from the inhibited complex and generate ros [76]. the complex iii subunits rieske iron-sulfur protein (risp), encoded by ubiquinol-cytochrome c reductase, rieske iron-sulfur polypeptide 1 (uqcrfs1), and ubiquinol-cytochrome c reductase binding (uqcrb) protein appear to play a crucial role in hypoxia-triggered mitochondrial ros generation (for rieske, see box 18.3). thus, it was shown that risp promotes the hypoxic stabilization of the transcription factor hif-1α protein [77], and uqcrb was found to mediate hypoxia-induced tumor angiogenesis via mitochondrial ros-mediated signalling [78, 79].
box 18.3 the rieske protein is an iron-sulfur protein (isp) component of the cytochrome bc1 complex that was first discovered and isolated by john s. rieske and coworkers in 1964. the rieske iron-sulfur protein is an essential subunit of mitochondrial cytochrome bc1 complexes and, like the majority of mitochondrial proteins, is encoded by a nuclear gene and synthesized on cytoplasmic ribosomes as a precursor with a 32-residue amino-terminal extension. the iron-sulfur protein is then post-translationally imported into the mitochondria where it is inserted into the bc1 complex in the inner mitochondrial membrane. at first, the precursor is translocated via translocation contact sites into the matrix. there, cleavage to an intermediate containing an 8-residue extension occurs. the intermediate is then redirected across the inner membrane, processed to the mature subunit, and assembled into complex iii.

also, and of note, a mouse model to permit conditional deletion of the nuclear-encoded risp gene was recently developed to assess its role in hypoxia-induced ros signalling in the pulmonary circulation [80]. it was found that depletion of risp abolishes the ros response to hypoxia in isolated pulmonary arterial smcs and isolated pulmonary artery segments. further, in this article, it was discussed that mitochondria are not the only source of ros during hypoxia. thus, studies using a genetic knockout of p47phox suggested that cytosolic nadph oxidase systems may also contribute to a hypoxic pulmonary vasoconstriction response during acute hypoxia [81, 82]. according to the authors' conclusion, the blockade of hypoxia-induced ros responses (in these studies observed with depletion of risp) suggests that the mitochondria may act as the initiators of ros production, which could be amplified by engagement of nadph oxidase systems elsewhere in the cell.
such "ros-induced ros release" might permit small ros signals generated by mitochondria to activate ros signalling throughout the cell, thereby avoiding mitochondrial damage that might arise if the entire cellular oxidant signal originated from that organelle [83] (or even leading to excessive ros production?). indeed, the substantial advances in oxidative stress research of recent times, in particular, the specification of hypoxia-sensing ros-producing enzyme systems, will contribute to new therapeutic strategies to be applied in acute and chronic human diseases known to be influenced by oxidative stress. for example, discrimination of oxidative eustress, a fundamental process in maintaining health, from oxidative damage will improve clarity in developing "redox medicine" [56] . when the redox equilibrium of a cell is upset by pro-oxidant environmental stimuli, that is, when oxidative stress exists, an adaptive stress response takes place which can result in upregulation of antioxidant proteins and detoxification enzymes. these antioxidative defense molecules comprise the following [58] : (1) agents that catalytically remove free oxygen radicals and other reactive species, for example, sod, catalase, peroxidase, and thiol-specific antioxidants; (2) proteins that minimize the availability of pro-oxidants such as iron ions, copper ions, and heme, for example, transferrins, haptoglobins, hemopexin, and metallothionein; (3) proteins that protect biomolecules against damage (including oxidative damage) by other mechanisms, for example, hsps; and (4) low-molecular-mass agents that scavenge ros and rns, for example, glutathione, α-tocopherol, and (possibly) bilirubin and uric acid. 
in the past, it was fashionable to divide the oxidative stress response into three main tiers: (1) antioxidant enzymes including sod, catalase, glutathione peroxidase, and glutathione; (2) detoxifying enzymes such as glutathione peroxidase, glutathione s-transferase, aldo-keto reductase, and aldehyde dehydrogenase; and (3) energy-dependent efflux pumps. as a fourth defense system, the antioxidant nutrients such as vitamins e and c as well as carotenoids were appreciated [84]. it was generally accepted that the first line of enzymes is of enormous importance in limiting ros-mediated damage to intracellular macromolecules. for example, among the most important regulators of ros levels were the sod enzymes: cu/znsod in the cytoplasm and outer mitochondrial space and mnsod exclusively in the inner mitochondrial space. mechanistically, superoxide is converted to h2o2 and oxygen (o2·− + o2·− + 2h+ → h2o2 + o2) by sod. peroxiredoxins and abundant catalase enzyme then scavenge h2o2, converting it to molecular oxygen and water. another example of a first-line defense molecule is trx. thioredoxin contains two adjacent -sh groups in its reduced form, which are converted to a disulfide in oxidized trx. notably, it can undergo redox reactions with multiple proteins. however, it then turned out that these antioxidative principles were clearly not 100% effective at performing this task, as under normal physiological conditions, lipid and dna oxidation products can be detected in blood and urine. because some of the compounds generated by the interaction of ros with macromolecules are highly reactive, these secondary oxidation products must likewise be detoxified to prevent them from also damaging dna, proteins, and lipids. without the adequate detoxification of such products, an extended chain reaction will occur, resulting in the degradation of cellular components and the ultimate death of the cell.
this second line of defense against ros is provided by the detoxifying enzymes. finally, detoxified metabolites produced by these enzymes are eliminated from the cell by energy-dependent efflux pumps such as the glutathione s-conjugate transporter, also called the multidrug resistance-associated protein (mrp) [58]. today, one must state that these previous notions are incomplete. first, it became apparent that members of the so-called cnc (for cap 'n' collar)-basic region-leucine zipper (bzip) family of transcription factors are principal mediators of defense responses to redox stress. in mammals, the cnc family members nrf1 and nrf2 were shown to be involved in the transcriptional upregulation of cytoprotective genes encoding a large number of diverse detoxification, antioxidant, and anti-inflammatory proteins (e.g., glutamate cysteine ligase, nadph-quinone oxidoreductase, glutathione s-transferases, and aldo-keto reductases) as well as enzymes with essential roles in cell metabolism [85]. more recent studies then revealed that these transcription factors, notably nrf2, are controlled by keap1, the primary negative regulator of nrf2, that is, a molecule that simultaneously operates as a sensor protein able to perceive dyshomeostatic subclass iic-4 damps, for example, in terms of redox changes reflecting electrophilic stress. it is worth adding here that six critical domains have been defined in nrf2 (neh1-neh6), and it is neh2, located at the n terminus of nrf2, that acts as the regulatory domain for the cellular stress response. specifically, neh2 interacts with the cytoplasmic protein keap1 [86]. in the following, this very important damp-induced and gene-based antioxidative and cytoprotective system is addressed in somewhat more detail. increasing evidence indicates that several redox-regulated gene products serve to protect cells from ros damage.
the antioxidant response element (are), a cis-acting dna regulatory element or enhancer sequence is known to be activated by oxidative stress and to be responsible for the transcriptional regulation of several redox-regulated gene products. both nrf 1 and 2 bind to are and regulate are-mediated gene expression and induction. the molecule nrf2 is more potent than nrf1 in activation of are-regulated gene expression and is regarded as the principal transcription factor that binds to the are. this transcription factor is ubiquitously expressed and present in various organs and tissues including the kidney, muscle, lung, heart, liver, and brain (for reviews, see [87] [88] [89] [90] [91] ). as touched in part ii, sect. 5.3.2 and part iv, sect. 13.4.5, the nrf2-triggered antioxidant response is initiated by activation of keap1 that functions as a substrate adaptor protein for the degradation of nrf2 and serves as an intracellular sensor for redox changes reflecting the presence of subclass iic-4 damps (for reviews see [92] [93] [94] ). earlier studies had already shown that nrf2 is a bzip transcription factor that translocates to the nucleus after liberation under oxidative stress conditions from its cytosolic inhibitor keap1 [86] . in the nucleus, nrf2 was found to form dimers with the proteins maf, jun, fos, activating transcription factors 4 (atf4), and/or creb binding protein (cbp) and, in addition, regulates transcription by binding to the are upstream of a variety of cytoprotective and detoxification target genes to combat the oxidative stress [95] . thus, established nrf2-regulated genes reportedly included cu/zn sod, catalase, trx, trx reductase, glutathione reductase (gr), glutathione peroxidase (gpx), and ferritin l (ftl) [96] (ftl and ferritin h (fth) subunits are responsible for intracellular iron storage). all of these genes are involved in the response to oxidative stress. 
there are several other genes also known to be engaged in the response to oxidative stress that are not described here. recently, the molecular signalling mechanism involved in the keap1 ↔ nrf2 pathway has been further elucidated and specified. the core can be seen in an axis consisting of redox change (subclass iic-4 damps)-initiated → keap1-induced → nrf2-triggered → are-driven expression of antioxidant and detoxifying genes (fig. 18.4) (discussed in [92-94, 97, 98]). the complex sequelae of the pathway are simplified in the following text. under homeostatic and stress-free conditions, binding of keap1 to the nrf2 molecule leads to its polyubiquitination and subsequent degradation by the proteasomal pathway, so that nrf2, although continuously generated, is retained at very low levels in the cytoplasm. in this scenario, a keap1 homodimer binds to a single nrf2 protein via a high-affinity so-called "etge" motif and a low-affinity so-called "dlg" motif. the two-site recognition of nrf2 by the keap1 dimer is essential for polyubiquitination of nrf2 (also see part ii, sect. 5.3.2) (for polyubiquitination, see box 18.2). in this sense, the keap1 ↔ nrf2 system can be regarded as a vital part of cell regulation in a homeostatic environment. however, in case of dangerous and threatening oxidative (and xenobiotic) stress, the system instigates a stress response that is characterized by a rapid and dramatic cessation of the keap1-dependent polyubiquitination process, resulting in a rapid increase of nrf2 abundance. in fact, exposure to ros-mediated stress (or electrophilic stress) is thought to modify the reactive "iic-4 damps-sensing" cysteine residues in keap1, which is associated with a conformational change of the protein resulting in a loss of keap1 ubiquitination activity (fig. 18.4). notably, as a cysteine-rich protein, the human keap1 possesses 27 cysteine residues, and they are all reactive to stress to varying degrees.
among these sensor cysteines of keap1, c151 is best characterized. evidence from a number of studies has suggested that c151 is the most reactive and critical to the keap1 ↔ nrf2 stress-sensing response. there is even evidence from a first atomic-level view suggesting that cys151, with its unique environment, appears to be (besides cys171, cys273, and cys288) the critical residue of keap1 responsible for detecting increased levels of oxidative stress [91, 92, 97]. oxidative modification of cysteine sensors of keap1 leads to a loss of its polyubiquitination and degradation activity, thereby stabilizing nrf2. consequently, stabilized nrf2 accumulates in the cytoplasm, translocates into the nucleus, and forms a heterodimer with a smaf transcription factor (smaf, small musculoaponeurotic fibrosarcoma).

fig. 18.4 the oxidative stress-induced keap1 ↔ nrf2 pathway. under non-oxidative homeostatic conditions, the sensor keap1 is bound to the nrf2 molecule, resulting in its polyubiquitination and subsequent degradation via the proteasomal pathway. oxidative stress modifies the ros-sensing cysteine residues of keap1, leading to loss of its polyubiquitination and degradation activity and to dissociation of nrf2, which becomes stabilized and accumulates. then, nrf2 translocates to the nucleus and forms a heterodimer with the smaf transcription factor. the nrf2/smaf heterodimer binds to are and induces transcription of numerous cytoprotective antioxidant and detoxification genes. are, antioxidant response element; keap1, kelch-like erythroid cell-derived protein with cnc homology (ech)-associated protein 1; nrf2, nuclear factor-erythroid 2 p45-related factor 2; smaf, small musculoaponeurotic fibrosarcoma; ub-ub-ub, poly-ubiquitin chain. sources: refs. [92-94, 97, 98]
thus, this accumulation of nrf2 in response to ros (and electrophiles) cannot be regarded as an induction in a strict sense but instead is a mechanism referred to as derepression, that is, release from the rapid degradation-based repression. as highlighted and discussed [93], there are two models to explain the cytoplasmic accumulation process of nrf2. the "hinge-and-latch" model holds that the modification of the sensing cysteine residues of keap1 reduces its affinity for nrf2 but does not result in release. instead, newly synthesized nrf2 is translocated to the nucleus to trigger the transcription of nrf2-dependent genes. the other model, denoted as the "conformation cycling" model, claims that keap1 uses a cyclic mechanism to target nrf2 for polyubiquitination and proteasomal degradation. an important feature of this cyclic mechanism is that it ensures regeneration of keap1, which allows the cycle to proceed. modification of specific reactive cysteine residues of keap1 may block the cycle of keap1-dependent nrf2 degradation, allowing de novo synthesized nrf2 to accumulate. the subsequent transcriptional process in the nucleus has been specified as well. as partially mentioned above, the nrf2-smaf heterodimer binds to are or electrophile-responsive element (epre) and induces transcription of numerous cytoprotective genes. of note, recently, an extensive genome-wide analysis of the nrf2-smaf-binding sequence, that is, the are/epre, and the maf homodimer-binding sequence (the so-called maf responsive element or "mare") was conducted, and the differences between these elements were clarified. as a result, it was proposed that are, epre, and the nf-e2 binding sequence be collectively named cnc-smaf-binding elements (csmbe) [99]. interestingly, prms appear to be co-players in this scenario. thus, tlrs have been observed to induce nrf2 activation. remarkably, in a recent study, tlr agonists were shown to activate nrf2 signalling via reduction of keap1 [100].
the authors could demonstrate that tlr signalling-induced keap1 reduction promotes nrf2 translocation from the cytoplasm to the nucleus, where it activated transcription of its target genes. further, tlr agonists were found to modulate keap1 at the protein post-translation level through autophagy. in fact, tlr signalling increased the expression of autophagy protein p62 and lc3-ii and induced their association with keap1 in the autophagosome-like structures. the keap1 ↔ nrf2 system as briefly described here is a robust oxidative stress response. its regulatory mechanisms, for example, stress-sensing mechanism, proteasome-based regulation of nrf2 activity, and selection of target genes have been elucidated mainly in mammals (for proteasome, see box 18.1). nevertheless, the pathway is now regarded as an evolutionarily conserved defense mechanism against oxidative and xenobiotic stress across the tree of life. thus, the keap1 ↔ nrf2 system has been found to be also present in zebrafish, fruit fly, and caenorhabditis elegans indicating that its roles in cellular defense are conserved throughout evolution among vertebrates and suggesting that analogous defense systems are widely conserved throughout the animal kingdom [101, 102] . as briefly demonstrated in this subchapter, aerobic organisms have integrated antioxidant systems, which include damp-promoted generation of enzymatic and nonenzymatic antioxidants that are usually effective in blocking harmful effects of ros. however, when ros is produced in excess causing pathological conditions, the stress response against oxidative damage can be overridden. thus, oxidative stress is known to contribute to many pathological conditions, including cancer, neurological disorders, atherosclerosis, hypertension, ards, and chronic obstructive pulmonary disease, just to mention a few of them. 
plausibly, these disorders are motivation enough to search for new effective therapeutic options by harnessing the new insights into mechanisms of the keap1 ↔ nrf2 system. certainly, intense research is essential for a detailed understanding of the precise consequences of targeting keap1 for disease prevention and treatment. all the more so as the oxidative stress response is integrated into other forms of innate stress responses that will be further outlined in the following subchapters. the heat shock response-one of the most ancient and evolutionarily conserved cytoprotective mechanisms found in nature-is induced upon exposure of living cells to acute, subacute, or chronic stress conditions. this defense response is characterized by the expression of a group of phylogenetically conserved intracellular hsps, which possess the capacity to recognize structures commonly found in the interior of proteins and to bind such structures. thanks to this property, they form a chaperone network involved in correct protein folding, trafficking, and complex assembly (for reviews, see [103] [104] [105] [106] [107] [108] [109] ). in order to be released from engagement with proteins after folding and take part in further rounds of activity, hsp70 and other chaperones utilize an intrinsic atpase domain to hydrolyze atp and assume a free conformation [110] . once released, hsps mediate protective cellular defense mechanisms including regulation of apoptosis, the aim being to maintain and restore cellular protein homeostasis (proteostasis). exposure of almost any cell to heat shock most often leads to the rapid transcription, translation, and accumulation of a variety of hsps that increase to quite considerable levels when the stress is pronounced. 
in fact, these molecules are induced by various environmental insults that can cause protein denaturation and unfolding within the cells, leading to the formation of nonnative proteins and protein aggregates, thereby emitting dyshomeostatic damps. of note, besides their protective role in different intracellular compartments, hsps in terms of inducible damps can translocate to the cell surface to get exposed or are actively secreted via non-canonical pathways. in case of necrotic cell death, they are passively released in large amounts into the extracellular space to act as constitutively expressed damps (compare part iv, sect. 12.2.3 and sects. 12.3.2.3 and 14.2.2.2). once emitted, hsps are sensed by classical recognition receptors and/or non-classical receptors (e.g., cd91) [111-117]. interestingly, recent knowledge about similarities between allograft and tumor rejection has revealed that the processes of both iri to allografts [108, 118, 119] and therapy-mediated injury to tumors [120-123] are characteristically associated with emission of hsps. in all eukaryotes, the hsr is primarily regulated and controlled by the hsfs, in particular, hsf1, a sequence-specific factor that binds upstream to heat shock elements in the promoters of target genes [124] (fig. 18.5). for example, hsf1 is activated by environmental stress including oxidative and tumor-associated stress [125, 126]. notably, in all eukaryotes, hsf1 responds to such stress conditions by undergoing a monomer-to-trimer transition and becoming massively phosphorylated, thereby rapidly acquiring the ability to bind to dna and activate transcription [127]. there is at least preliminary evidence suggesting that intracellular perturbations reflecting dyshomeostatic damps may activate hsfs [128-131].
such molecular alterations reportedly include changes in cytosolic ca 2+ concentration, for example, caused by increase of fluidity in specific membrane domains [128, 129] . interestingly, a recent study in support of these earlier findings provided evidence of the existence of a plasma membrane-dependent mechanism of hsf1 activation in animal cells, which is initiated by specific membrane-dependent trpv calcium channel-like receptors [130] . these findings lend support to the notion that heat sensing and signalling in mammalian cells are dependent on trpv channels, suggesting that these receptors may act as a major hsr sensor in different epithelial non-cancerous and cancerous cells, capable of triggering the cellular hsr [130] . in another line of experiments, trpv2 was demonstrated to mediate the effects of transient heat shock on endocytosis of human monocyte-derived dcs, suggesting a central role of trpv2 in mediating the cellular action of heat shock on these important cells of the innate immune system [131] (for trpv channel receptors, also compare part ii, sect. 5.3.6). induction of an hsr is not only mediated by sterile stress conditions but is also believed to be promoted by cell-invading viruses or bacteria. consequent emission of hsps in their role as damps may reflect a mechanism by which pathogens may contribute to sterile inflammation. for example, exacerbation of hepatitis b virus (hbv)-associated liver injury is reportedly characterized by an abnormal immune response that not only mobilizes specific antiviral effects but also poses a potentially lethal non-specific sterile inflammation to the host [132] . heat shock proteins may be the most extensively studied damps in the context of hbv infection. a number of hsps such as hsp70 and hsp90 have been reported to be supportive factors in the process of hbv replication, and selective inhibition of these hsps was proposed to be host-based anti-hbv strategies [133] [134] [135] . 
thus, as argued [132] , both infective and sterile inflammation may synergistically contribute to the exaggeration of chronic hepatitis, if the hbv cannot be cleared entirely. likewise, bacterial infections were also reported to promote induction of the hsr [136, 137] . for example, in studies on h. pylori and e. coli infection models, the initiation of an hsp70 stress response could be demonstrated [138, 139] . the hsr is recognized and accepted as a classical stress response in nearly all species across the tree of life. it reflects the desperate efforts of a cell to restore homeostasis and survive upon both infectious and sterile insults. its products, the hsps, operate as damps in commission of the innate immune system to reach this goal. notably, via this mechanism, an hsr, induced by infectious damage to a cell, can promote sterile inflammation. whereas a successful hsr upon stress leads to restoration of cellular homeostasis and cell survival, an unsuccessful hsr may result in rcd such as apoptosis [109] . this phenomenon will be resumed in the following subchapters. the er is a continuous membrane system that forms a series of flattened sacs within the cytoplasm of eukaryotic cells. as a subcellular organelle in the control of proteostasis, it is responsible for calcium storage and lipid biosynthesis as well as the synthesis, correct folding, processing, and maturation of proteins as well as for the orchestration of their transport along the classical/conventional secretory pathway. the er delivers these components to their destination compartments which include the er itself, the golgi apparatus, the plasma membrane, and the extracellular milieu or the endocytic and autophagic pathways. plausibly, the multifunctional nature of this organelle requires a myriad of proteins, unique physical structures, and coordination with and response to perturbations in the intracellular environment. 
a series of chaperones, folding enzymes, glucosidases, and carbohydrate transferases support and execute these processes. perturbation of er-associated functions such as accumulation of unfolded/misfolded proteins, excessive ros production, hypoxia, calcium and glucose depletion, or viral and bacterial infections reflect stress of the organelle and result in activation of an er stress-coping response, the evolutionary conserved upr [140] [141] [142] [143] . for example, the processes of both iri-mediated cell damage/cell death [144] [145] [146] and induction of the icd of cancer cells [147] [148] [149] are characterized by demonstration of er stress that is almost always associated with oxidative stress and vice versa [150] . to cope with er stress, cells have the unique possibility to activate the upr, a dynamic signalling network that orchestrates the recovery of homeostasis or triggers rcd modalities, depending on the level of damage. perception of any perturbation of the er is provided by three sensor molecules of the upr embedded in the er membrane: the perk, ire-1, and atf6. the upr-mediated recovery of er homeostasis mainly occurs through the perk-eif2α-mediated temporary shutdown of protein translation and the activation of a complex genetic program that aims to improve er quality control and adaptive responses [140] [141] [142] [143] (fig. 18.6) . accordingly, perk, devoted to perceiving dyshomeostatic damp-emitting er perturbations (so far, no clear evidence for ire-1 and atf6 in this respect), may be regarded as a new family of non-classical prms, at least in a broader sense. in nonstressed conditions, the three sensors of the er homeostasis are kept in an inactive state by the er-luminal binding immunoglobulin protein (bip). the protein bip, also known as glucose regulated protein 78 (grp78), is a member of the hsp70 family of proteins (specifically hspa5). 
it functions as a chaperone to selectively bind unfolded proteins in the er lumen by interacting with exposed hydrophobic residues on nascent peptides [151] .
fig. 18.6 simplified scenario model illustrating the er stress-induced three arms of the unfolded protein response. perception of any perturbation of the er is provided by three sensor molecules of the unfolded protein response embedded in the er membrane: the perk, ire-1, and atf6. perk perceives dyshomeostatic damp (subclass iic-4 damp)-emitting er perturbations (ire1 and atf6 still questionable). perk phosphorylates eif2α to up-regulate transcription factor atf4 that induces the expression of transcription factor chop. ire1α signals through its rnase via the splicing of xbp1 mrna. the active transcription factor xbp1s translocates to the nucleus. atf6 is exported from the er to the golgi complex to enter the nucleus as a potent transcription factor. together, these transcription factors of the unfolded protein response determine the cell fate by the regulation of distinct subsets of target genes toward recovery of er homeostasis and cell survival or the induction of regulated cell death in form of apoptosis and ferroptosis. in addition, the transcription factor atf4 is differentially translated, up-regulating genes participating in autophagy and other homeostatic pathways. atf activating transcription factor, chop cytidine-cytidine-adenosine-adenosine-thymidine-enhancer-binding homologous protein, eif2α eukaryotic translational initiation factor 2α, er endoplasmic reticulum, ire1α inositol-requiring transmembrane kinase/endoribonuclease 1α, perk protein kinase-like eukaryotic initiation factor 2α kinase, upr unfolded protein response, xbp1 x-box binding protein 1, xbp1s x-box binding protein 1 whereby the "s" stands for the spliced form of xbp1. sources: refs. [142] [153] [154] [155] [156] [157] [158] [159] [160] [161]
however, in conditions of er stress, bip is detached from these sensors, allowing their activation to trigger pathways collectively referred to as the upr. this stress response acts as a corrective path, capable of both increasing the er folding capacity and decreasing the incoming polypeptide load. of note, the downstream pathway of each of the three upr sensors appears to have an innate preference for a particular type of er stress. moreover, as reviewed [152] , upon dissociating from bip, each of the three sensors modifies the er to mitigate stress in its own unique way. for example, atf6 is often the first sensor to respond to er stress. once atf6 dissociates from bip, it is translocated to the golgi apparatus for cleavage. the cytosolic domain of atf6 is then free to move to the nucleus, where it mediates increased expression of chaperones and of several proteins involved in lipid biosynthesis. this allows an increase in the volume of the er and provides more chaperone proteins to aid in folding, thus relieving some of the er stress. the other two sensors, ire-1 and perk, remain as integral er proteins but oligomerize and autophosphorylate following bip dissociation (autophosphorylation is a type of post-translational modification of proteins in which a protein kinase adds a phosphate to itself; see part vi, sect. 24.3). under remediable er stress conditions, the three sensors trigger signalling pathways to resolve er stress, aiming at maintaining cellular integrity (fig. 18.6) . they are briefly touched on in the following (for reviews and original articles, see refs [142] [153] [154] [155] [156] [157] [158] [159] [160] [161] [162] ). for example, misfolded proteins are dislocated to the cytosol where degradation processes such as the er-associated degradation (erad) and autophagy will clear them, thereby reducing their potential toxicity. 
characteristically, erad is a protein quality control mechanism conserved in all eukaryotic cells and represents a critical arm of the upr, necessary to alleviate er stress. the erad mechanism results in the selective dislocation of unfolded and misfolded proteins from the er to the cytosol via specific membrane machinery. the erad targets are subsequently degraded by the cytosolic ubiquitin proteasome system (ups). the whole process includes transcriptional activation of a variety of er-associated chaperones and folding enzymes which include but are not limited to bip and the lectins calr, calmodulin (cam), and calnexin (cnx). a pathway that represents the most conserved branch of the upr is mediated by ire-1, a multifunctional protein that possesses kinase and endonuclease activities. upon activation, ire-1 aggregates and autophosphorylates, thereby activating its endonuclease activity to catalyze the unconventional splicing of x-box binding protein 1 (xbp1) via removal of a 26-nucleotide intron. this processing event changes the open reading frame of the mrna, resulting in the translation of the transcription factor now termed xbp1s ("s" stands for the spliced form of xbp1) (for splicing, see box 18.4) . production of xbp1s leads to upregulation of several genes involved in the upr's adaptive phase, for example, expression of er-resident molecular chaperones and protein folding enzymes. activation of perk leads to the phosphorylation of eif2α that is required for the initiation of translation. this factor inhibits global protein synthesis by inhibiting the assembly of the 80s ribosome, thereby reducing er load and promoting cellular survival. at the same time and under these conditions, the transcription factor atf4 is differentially translated, up-regulating genes participating in protein folding, amino acid metabolism and transport, autophagy, and oxidative stress resistance/redox homeostasis. 
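the statement above that removal of the 26-nucleotide intron changes the open reading frame follows from simple arithmetic: 26 is not a multiple of the codon length of 3, so every codon downstream of the splice site is read in a different frame. the following python sketch illustrates this under stated assumptions; the toy sequence, the intron position, and the helper names `frame_shift`, `splice`, and `codons` are hypothetical and are not taken from the real xbp1 mrna.

```python
# Illustrative sketch only: the sequence below is invented, not the real XBP1 mRNA.
INTRON_LENGTH = 26  # nucleotides removed by ire-1 (value stated in the text)
CODON_LENGTH = 3


def frame_shift(removed_nt: int) -> int:
    """Reading-frame offset (0, 1 or 2) caused by deleting removed_nt nucleotides."""
    return removed_nt % CODON_LENGTH


def splice(mrna: str, intron_start: int, intron_length: int = INTRON_LENGTH) -> str:
    """Remove intron_length nucleotides starting at intron_start."""
    return mrna[:intron_start] + mrna[intron_start + intron_length:]


def codons(seq: str) -> list[str]:
    """Split a sequence into complete codons, reading from position 0."""
    return [seq[i:i + CODON_LENGTH] for i in range(0, len(seq) - 2, CODON_LENGTH)]


# Hypothetical toy transcript: 6 nt upstream, a 26 nt "intron", 12 nt downstream.
upstream = "AUGGCU"
intron = "C" * INTRON_LENGTH
downstream = "GAAUUUCCGUAA"

unspliced = upstream + intron + downstream
spliced = splice(unspliced, intron_start=len(upstream))

if __name__ == "__main__":
    print("frame offset:", frame_shift(INTRON_LENGTH))
    print("unspliced tail codons:", codons(unspliced)[-3:])
    print("spliced tail codons:  ", codons(spliced)[-4:])
```

in this toy transcript the downstream region is read as a different set of codons before and after splicing, which is the sense in which the spliced mrna encodes a different protein (xbp1s).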
under conditions of prolonged or severe er stress that the upr cannot resolve (see below), atf4 also contributes to apoptosis through the induction of the transcription factor chop (for cytidine-cytidine-adenosine-adenosine-thymidine-enhancer-binding homologous protein) and by enhancing oxidative stress and protein synthesis. finally, the third branch of the upr is initiated by atf6. the sensor atf6 is retained at the er under homeostatic conditions but translocates to the golgi apparatus under er stress where it is cleaved by the golgi-resident proteases site-1 protease (s1p) and site-2 protease (s2p). this event leads to the release of atf6 n-terminal fragment, a potent transcription factor that moves to the nucleus, where it binds the er stress response element upstream of a subset of upr genes to activate their transcription. together with xbp1, this fragment regulates the expression of several genes with functions in protein folding, protein transport, and lipid biosynthesis, that is, genes involved in re-establishing er homeostasis.
box 18.4 splicing. in molecular biology, splicing is a modification of an rna after transcription in which introns are removed and exons are joined. thus, an intron is any noncoding nucleotide sequence within a gene that is removed by rna splicing during maturation of the final rna product; an exon is any part of a gene that encodes a part of the final mature rna produced by that gene after introns have been removed by rna splicing. this process is needed for the typical eukaryotic messenger rna before it can be used to produce a correct protein through translation. for many eukaryotic introns, splicing is done in a series of reactions catalyzed by the spliceosome, a complex of small nuclear ribonucleoproteins (snrnps), but there are also self-splicing introns.
strikingly, there is a considerable interconnectedness of the er stress-promoted upr with other innate immune processes. for example, an increasing number of studies support the view that oxidative stress has a strong connection with er stress. during the protein folding process, ros are produced as by-products, leading to an impaired redox balance and thus to oxidative stress. as the protein folding process is dependent on redox homeostasis, oxidative stress can disrupt the protein folding mechanism and facilitate the production of misfolded proteins, causing further er stress [163] . moreover, this stress response functions as a productive source of damps. typically, hsps and calr, like other er chaperones, can translocate to the cytosol and eventually to the surface of cells. once exposed, these molecules can operate as subclass ib-1 damps to facilitate engulfment of antigens as mostly described in the context of induction of icd [148] [164] [165] [166] [167] [168] . via these mechanisms and by crosstalk with other molecular machines of the innate immune system including nlrp3 inflammasome activation via txnip (see part iv, sect. 13.4.6.3 and part vi, sect. 22.4.2.2) , the upr may contribute to sterile inflammation and immunity (for articles, see refs [153] [154] [155] [156] [157] [169] [170] [171] [172] ). also, several signalling cascades triggered by the three sensors are apparently potent inducers of autophagy at a cell-wide level that, as mentioned above, normally has an adaptive/protective function and consists of the three major types: chaperone-mediated autophagy, macroautophagy, and microautophagy. interestingly, recent evidence has indicated that upr-induced autophagic processes are capable of alleviating the upr, pointing to a crosstalk between these two innate immune defense mechanisms [173, 174] . strikingly, a complex relationship reportedly exists between autophagy and damps in cellular adaptation to stress and injury and cell death, characterized by a crosstalk between autophagy induction and secretion or release of damps. 
in fact, growing evidence indicates that autophagic mechanisms are involved in regulating release and degradation of damps including calr, hmgb1, atp, and dna in several cell types [37, 148, 175] . this scenario may help explain the observation that autophagy is also able to shape a supportive cellular immune response [176, 177] . this kind of innate immune interrelationship is, for example, involved in mechanisms of both reperfusion-mediated cell injury and icd of cancer cells (see refs [40] [178] [179] [180] [181] [182] [183] [184] [185] [186] [187] [188] [189] [190] ). overall, depending on the duration and intensity of the stress, the upr engages different cellular pathways either to restore and maintain cell survival or to trigger apoptosis. in cases of severe irremediable er stress, however, the balance is tipped in favor of pro-death signalling; that is, the upr, now mediated by different biochemical pathways, may lead to pro-inflammatory and pro-apoptotic responses resulting in catastrophic rcd (fig. 18.6) . while the precise pathways of apoptosis induced by er stress are not known, the up-regulated perk → eif2α → atf4 → chop pathway plays an essential role by reversing translational arrest, increasing generation of ros, and promoting calcium efflux from the er. together, these signals lead to cytochrome c release from mitochondria and loss of membrane potential, resulting in apoptosis [191] . in this context, it is interesting that one pathway of rn (ferroptosis; see sect. 19.3.3) apparently shares a partially overlapping machinery with er stress, suggesting a molecular interconnectivity between these two events [192] . the er stress-promoted upr is a prime example of an injury-induced innate immune response that can decide between life and death of a cell. 
placed at the very beginning of defense processes of multicellular organisms upon injury, this stress response nicely reflects a hierarchy of damp emission (see part iv, chap. 16) . striking is also the existence of an inter-organelle communication, for example, between upr, inflammasome activation, and autophagic pathways which emerge as a homeostatic network determining the switch from adaptive life-saving programs to cell death under stress conditions, where specialized sentinels are localized at organelle membranes to induce the core apoptosis pathway. as briefly sketched in the next section, this innate immune defense response is not only directed against sterile stress but also against pathogen-mediated stress. in the previous section, induction of a upr was mainly exemplified by referring to reperfusion-mediated cell damage and icd of cancer cells, that is, instances of sterile stress. however, growing evidence is coming to light clearly indicating that viruses and bacteria also induce er stress, thereby activating a robust upr. in fact, the constitution of cellular stress responses is meanwhile regarded as the first line of defense against both viral and bacterial infection. however, the outcome is not always in favor of the host; the pathogen may also profit from this stress response, at least under certain circumstances. an increasing number of reports have recently been published on this emerging topic (such as refs [192] [193] [194] [195] [196] [197] ), the quintessence briefly being addressed here. there are several mechanisms described of how a virus can induce er stress. the central mechanisms of perturbation of the er during virus infection can be seen in the production of large amounts of viral proteins by the virus concerned. such accumulation of viral proteins in the er implies a challenge to the protein folding machinery which may cause er stress and, in turn, activate the upr resulting in restoration of the er homeostasis or apoptosis. 
so far, at least 36 viruses have been found to be able to induce er stress and activate the three upr stress signalling pathways [198] . moreover, er stress can be caused by viruses via other mechanisms, for example, as a result of er membrane exploitation, imbalance of calcium concentration, or sabotage/depletion of the er membrane during virion release. viral infections may activate these pathways resulting in the inhibition or promotion of viral replication. for example, the perk-mediated global translation shutdown is a very efficient antiviral mechanism, and a similar shutdown by pkr has been used in the interferon pathway to defend against viral infection [199] . also, the virus-related upr was found to trigger host inflammatory signalling cascade through innate immune signalling pathways that activate nf-κb and ap-1 transcription factors as a result of chronic er stress. in fact, overexpression of viral proteins in the er has long been known to activate these transcription factors which induce expression of pro-inflammatory cytokines such as il-6 [200, 201] . increasing evidence supports the notion that the upr signalling synergistically interacts with virus-induced signalling to produce inflammatory cytokines and type i ifns. in addition, other lines of studies have shed light on a role of nod1 and nod2 receptors in transducing virus-related er stress signals to elicit inflammation [197] . so far, however, a possible contribution of damps to the promotion of these signalling pathways has not been investigated. importantly, however, the effects of virus-induced upr have been observed not only to inhibit but also to potentiate viral infection. in fact, manipulation of the upr response has become an asset for many viruses to promote their translation, thereby leading to chronic er stress. in other words, during infection, viruses are capable of hijacking the host translational machinery and fill the er with viral proteins. 
for example, this is the case for many positive-strand rna viruses, which house the virus replication machinery in the protective er membrane. in fact, viruses need host er to produce increased quantities of viral proteins to continue replication. intriguingly, as discussed [194] , many viruses have evolved strategies aimed at continuing the replication cycle. thus, viruses were shown to manipulate the host upr in various ways to stimulate protein synthesis capacity and to improve cell survival by inhibiting cellular apoptosis. in particular, the link between the upr and autophagy is intensely discussed as being involved in this scenario. these two systems may act dependently, or the induction of one system may interfere with the other [202] . thus, experimental studies could demonstrate that different viruses modulate these mechanisms to allow them to circumvent and bypass the host immune response or, worse, to exploit the host's defense to their advantage. according to current knowledge, rna viruses including influenza virus, poliovirus, coxsackievirus, enterovirus, japanese encephalitis virus, hcv, and dengue virus were shown to regulate these processes. for example, recent studies on hcv-infected hepatocytes confirmed that virus-associated er stress and upr are linked to cellular autophagy. thus, induction of the cellular autophagic response is reportedly required to improve survival of infected cells by inhibition of cellular apoptosis. moreover, the autophagic response was demonstrated to inhibit the cellular innate antiviral program that usually inhibits virus replication. nevertheless, as argued by the authors [194] , though hcv induces er stress and autophagy, their cause-effect relationship is not clear. notably, the upr is not only modulated by viruses. recent evidence indicates that this stress response plays multiple roles during bacterial infections as well. 
thus, as comprehensively reviewed by celli and tsolis [203] , the upr has been shown to be induced in murine lungs by m. tuberculosis (associated with apoptotic events) and also be correlated with helicobacter-induced gastric carcinogenesis. similarly, in vitro infectious models have revealed upr induction in macrophages and epithelial cells infected with either brucella melitensis and b. abortus or listeria monocytogenes. on the other hand, there is evidence suggesting that bacteria can subvert the upr for their own advantage. nevertheless, indications that bacteria can modulate this response are still somewhat sparse. one example refers to l. pneumophila that recruits components of the erad on its vacuole to mediate turnover of bacterial effectors on the vacuolar surface and uses the proteasome to generate amino acids necessary for its intracellular growth. interestingly, like with virus-induced upr, a bacteria-initiated upr may turn out to be of advantage for the host or in favor of the bacteria, but it is not clear in each case whether this response benefits the host or the pathogen. thus, as argued by celli and tsolis [203] , "modulation of er function during infection by intracellular bacteria can promote bacterial infection by providing a replicative niche, but at the same time the resulting disruption of the secretory pathway can provide a pattern of pathogenesis that aids the innate immune system in recognizing intracellular infection and in mounting an appropriate defence. however, considering the more rapid evolution of bacterial pathogens compared to their hosts, it is likely that bacteria have evolved to modulate the upr to their advantage during infection." the function of the er stress-induced upr appears to provide another impressive evidence in support of the notion that immunity is induced by both sterile and infectious stressful injury and not primarily by invading nonself. 
in fact, this stress response is at the forefront of any stressful injury and is dedicated to initiating involvement of further damp-promoted defense mechanisms. however, like all the other innate immune processes, the upr may become uncontrolled. together, this may be reason enough that this stress response is involved in the pathogenesis of many human systemic and organ-specific disorders. in fact, the list of human diseases that are pathogenetically associated with er stress and activation of the upr is steadily growing [204] [205] [206] . they include neurodegenerative and metabolic diseases, autoimmune disorders, and atherosclerosis. given such an ever-increasing list of diseases, it is not a surprise that chemical compounds and inhibitors targeting the upr signalling pathways with a high degree of specificity have been and will be further developed [207] [208] [209] . maintaining genome integrity and transmission of intact genomes is a condition sine qua non for cellular, organismal, and species survival [210] . this homeostatic integrity is threatened by dna damage that occurs in a variety of conditions including but not limited to ionizing radiation, chemical reactions, and viral infections, two of the most dominant conditions being oxidative stress and replication stress. these exogenous and endogenous factors induce diverse lesions in the dna such as nucleotide alterations (substitution, deletion, and insertion), bulky adducts, collapsed dna replication forks, ssbs, and dsbs (see refs [57, [211] [212] [213] [214] [215] [216] [217] [218] [219] ). in response to dna damage, cells initiate and activate a complex network of cellular signalling cascades that cooperate to sense and repair lesions in dna, denoted as the ddr. this stress response plays an important role in fighting against detrimental effects of cell stress and injury. 
it orchestrates many processes, including not only dna repair but also regulation of cell-cycle checkpoints, transcription of ddr genes, and autophagy ( fig. 18.7 ). the ddr is controlled by three pi3k-related kinases (pikks): the atm, the dna-pk, and the atr kinases (mostly nuclear proteins). all three pikks are enormous polypeptides with similar domain organizations and various common structural features. equipped with the capability to autophosphorylate (see above however, at the beginning of this dance, sensors have recently been identified which, as prms in a wider sense or even as a new group of recognition receptors, are capable of detecting dna damage-as manifested by the toxic dsbs or generation of ssdna to activate these three pikks. as already briefly touched in part ii, sect. 5.2.6.4, two highly conserved multiprotein complexes, mrn and ku, are considered the primary sensors of dsbs to subsequently activate atm and dna-pk [221, 227, 228] . in addition, ssdna is sensed by the recognition molecule rpa that then, analogous to ku, acts as a signalling and repair platform for downstream factors such as the prp19, an e3 ubiquitin ligase involved in pre-mrna splicing, and atr kinase [229] . other lines of studies have provided evidence suggesting that the protein prp19 itself may act as the primary sensor of rpa-ssdna to subsequently activate atr [224] (fig. 18.7) . in brief, upon dna damage, dsbs are recognized by the mrn complex. following recognition, mrn recruits atm to this dna lesion where it binds to the c-terminus of nbs1 as a component of mrn [230] . following binding, atm kinase is activated. however, the exact mechanism whereby mrn activates atm is still not fully understood (discussed in [226] ). recognition of dsbs is also realized by ku70/80 heterodimer that has been shown to bind broken dsdna ends preferentially [231] . 
it is the ku70/80 then that recruits the catalytic subunit of dna-pk (dna-pkcs) at the site of dsbs to form the dna-pk holoenzyme. the major role of activated dna-pk is to promote a peculiar dna repair mechanism called non-homologous end joining (nhej), a pathway that repairs double-strand breaks in dna [231] . in fact, nhej repairs most dsbs in mammalian cells. as its name implies, nhej involves ligation of two broken dna ends without needing a repair template [232] . in contrast to atm and dna-pkcs which respond primarily to dsbs, atr is activated by a much wider range of genotoxic stresses, for example, reflected by exposure of increasing amounts of ssdna as a consequence of compromised activity of replication proteins or nucleolytic processing of various forms of damaged dna. in fact, first evidence suggests that ssdna is initially sensed by rpa which then recruits and activates prp19 [229] . in turn, prp19 facilitates the accumulation of atr-interacting protein (atrip, the regulatory partner of the atr kinase) at dna damage sites, thereby activating atr [233, 234] . plausibly, the recent discovery of dna damage-sensing molecules such as mrn, ku70/80, and rpa calls for the definition of those molecules they recognize, that is, broken dna ends at the site of dsbs and ssdna. as outlined in part iv, sect. 13.4.2, they have been tentatively sorted into a subclass of cell-intrinsically emitted damps (subclass iic-1). future studies will have to assign their exact place in the world of damps. apart from those damps emitted in the nucleus, other lines of studies lend support to the suggestion that dna damage or dna replication stress, in case of unsuccessful dna repair, promotes release of aberrant dna structures into the cytosol in the form of ssdna and dsdna breaks/fragments which operate as cell-intrinsically emitted dislocated damps. 
they are sensed by the dna receptor cgas and probably other as-yet-not-identified dna receptors to activate sting-dependent pathways to promote defense pathways [235, 236] (compare part ii, sect. 5.2.6; part iv, sect. 13.4.3; as well as part vi, sect. 22.3.7). as discussed by the authors [236] , it is conceivable that cytosolic dna is released by dysfunctional mitochondria upon dna damage or generated during repair of damaged genomic dna. moreover, as shown in studies on tumor models, the ddr-through the activation of sting-mediated pathways-induces the expression of another class of constitutive damps which are exposed at the cell surface, namely, the mics and different ulbps [236] [237] [238] [239] [240] (compare part iv, sect. 12.3.3). of note, however, the ddr does not always result in a happy end. in fact, when unsuccessful in repairing dna damage, the ddr can lead to cellular senescence or-like a upr in case of irremediable er stress-ultimately induces an rcd, most often in the form of apoptosis, less frequently in the form of necrosis. such subroutines of rcd are presumably aimed at mitigating the propagation of potentially mutated cells leading to cancer or other age-related pathologies (fig. 18.7) [222, 223, 241]. the dna damage response must be regarded as an efficient cell-intrinsic defense process, which is sophistically connected with other stress responses such as autophagy that also plays a significant role in maintaining genome stability [242] and the er stress-induced upr [243] . 
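the sensor-to-effector relationships described above for the ddr can be summarized as a small lookup table. the python sketch below is only a bookkeeping aid under stated assumptions: the names `DDR_CHAINS`, `sensors`, and `effectors` are hypothetical, the chains are transcribed from the prose (mrn to atm, ku70/80 to dna-pkcs, rpa via prp19 and atrip to atr, cgas to sting), and no kinetics or feedback are modelled. the alternative view mentioned in the text, in which prp19 itself acts as the primary sensor of rpa-ssdna, is not represented.

```python
# Lesion type -> list of signalling chains (sensor first, effector last).
# Transcribed from the prose; a lookup table, not a mechanistic model.
DDR_CHAINS = {
    "dsb": [
        ["mrn", "atm"],
        ["ku70/80", "dna-pkcs"],
    ],
    "ssdna": [
        ["rpa", "prp19", "atrip", "atr"],
    ],
    "cytosolic dna": [
        ["cgas", "sting"],
    ],
}


def sensors(lesion: str) -> list[str]:
    """First element of every chain recorded for this lesion type."""
    return [chain[0] for chain in DDR_CHAINS.get(lesion, [])]


def effectors(lesion: str) -> list[str]:
    """Last element of every chain recorded for this lesion type."""
    return [chain[-1] for chain in DDR_CHAINS.get(lesion, [])]


if __name__ == "__main__":
    for lesion, chains in DDR_CHAINS.items():
        for chain in chains:
            print(f"{lesion}: " + " -> ".join(chain))
```

keeping each chain as an ordered list makes it easy to add intermediate factors later without changing the two query functions.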
also, of particular interest is the observation that three tiers of damps are involved in this pathway, starting with broken dna ends at the site of dsbs and ssdna in the nucleus immediately generated upon dna damage; followed by dna fragments dislocated in the cytosol and operating as subclass iic-2 damps to promote intracellular innate immune signalling; and finishing with the exposure of subclass ib-2 damps (e.g., mics) able to activate nk, nkt, and γδ t cells (compare part vii, sects. 27.2.2, 28.2.2 and 28.4.2). this phenomenon may again reflect a particular hierarchy in the work of damps to maintain homeostasis. in fact, genomic integrity is of utmost importance and key to human health, and it is preserved by the ddr. this stress response, however, may fail in maintaining and restoring homeostasis. accordingly, dna damage has been observed to play a causal role in numerous human pathologies associated with genome instability or aberrant pikk function such as cancer, leukemia, premature ageing, and certain chronic inflammatory conditions. furthermore, there is growing evidence suggesting that uncontrolled ddr signalling is associated with various neurodegenerative diseases, as documented by recent work showing that atm inhibition alleviates pathologies in models of huntington's disease [244] . this is reason enough that ddr pathways have been and are being explored therapeutically to treat these diseases. in particular, small molecule inhibitors of atm, dna-pk, and atr are regarded as potential therapeutic agents, for example, as innovative drugs in cancer treatment [245, 246] . an increasing number of publications in the international literature provides convincing evidence that cell-intrinsic, cell-autonomous stress responses are initiated by any damage to a cell, regardless of whether it is sterile or infectious in nature. 
the ultimate goal of those responses, which are all highly conserved among eukaryotic species, is to maintain cellular homeostasis and ensure cell integrity. again this knowledge is in support of the concept that any form of host defense is primarily directed against injury and not against microbes. autophagy in terms of "immunological autophagy" appears to act as the center of all stress responses that is committed to control and regulate efferent innate and potential adaptive immune responses. the "medium" for achieving this goal is the mamps and/or damps which operate as a link between intracellular and extracellular events. another characteristic feature of cell-intrinsic stress responses is their interconnectivity and interdependency. a typical example is represented by oxidative stress and er stress responses, which appear to act mutually in any form of cellular damage. strikingly, this kind of interconnectedness is not restricted to a collaboration between the stress responses but expands to the elicitation of innate and adaptive immune processes. again, the damps-in terms of a particular hierarchy in their emission -appear to take center stage in this scenario. a better understanding of these multiple connections between cell-autonomous processes, on the one hand, and innate/adaptive immune responses, on the other hand, will undoubtedly stimulate new approaches to therapeutic interference with injury-induced pathologies. 
key: cord-022354-aqtceqqo authors: hunter, eric title: membrane insertion and transport of viral glycoproteins: a mutational analysis date: 2012-12-02 journal: protein transfer and organelle biogenesis doi: 10.1016/b978-0-12-203460-2.50007-x sha: doc_id: 22354 cord_uid: aqtceqqo the eukaryotic cell faces a continual problem of partitioning a variety of enzyme activities into different subcellular organelles where specific macromolecular reactions can occur. this subcellular organization requires specific intracellular targeting of macromolecules through the cytoplasm, by as yet ill-defined mechanisms, and through the secretory pathway via a more clearly defined vesicular transport mechanism (sabatini et al., 1982). as obligate intracellular parasites with only limited genetic complexity, viruses must utilize the existing cellular transport mechanisms to colocalize their virion components in a part of the cell where assembly can take place. 
this problem of intracellular targeting is compounded for the enveloped viruses because they must transport capsid polypeptides through the cytoplasm and envelope components through the secretory pathway to a common point of assembly. since these viruses depend on the preexisting host cell processes and possess lipid envelopes that are biochemically similar to cellular membranes, they can provide ideal systems for probing the cellular mechanisms involved in glycoprotein biosynthesis and transport. the relatively simple structure of virions, the high level of expression of viral genes, the ease of molecularly cloning and manipulating those genes, together with the availability of conditional lethal mutants with defects in viral protein transport, confer several additional advantages for such studies. we have chosen an enveloped, rna-containing virus, rous sarcoma virus (rsv), for an analysis of viral glycoprotein biosynthesis and transport because, in addition to the aspects delineated above, the availability of molecularly cloned, infectious dna copies of the genome of this retrovirus has also allowed us to pose questions about the role of the glycoproteins in virus assembly and virus infectivity. like most simple enveloped viruses, the retroviruses consist of a host membrane-derived lipid bilayer that surrounds (envelopes) a protein capsid structure (fig. 1). the icosahedral capsid of rsv assembles as the virus particle buds from the plasma membrane of the cell, and so these two events are linked both temporally and spatially. glycoprotein knobbed spikes extend from the surface of the virion, and it is generally thought that during virus assembly a specific interaction between the glycoproteins and capsid (and/or "matrix" protein) is required, since cell-derived polypeptides are for the most part excluded from the budding structure. 
the envelope glycoproteins span the lipid bilayer and are thereby divided into three distinct domains: an external, hydrophilic receptor-binding domain that functions in virus-cell attachment; a hydrophobic membrane-spanning domain; and a hydrophilic cytoplasmic domain. in rsv two polypeptides, gp85 and gp37, make up this structure (fig. 2); the external domain is primarily made up of the 341 amino acid long gp85, which contains regions that define the host-range and neutralization properties of the virus. the 198 amino acid long gp37 polypeptide, on the other hand, is a bitopic protein that anchors the envelope glycoprotein complex into the virion via disulfide linkages to gp85. it is less heavily glycosylated than gp85 (having only 2 versus 14 potential glycosylation sites) and contains two apolar regions in addition to the hydrophilic cytoplasmic domain. one apolar region is located near the amino terminus of gp37 and may be analogous to the fusion peptide of the hemagglutinin ha2 polypeptide of influenza virus that mediates viral entry into the cell. the second apolar region in gp37 consists of a 27 amino acid long stretch of hydrophobic residues near the carboxy terminus that functions during translation to stop the movement of the protein into the lumen of the rough endoplasmic reticulum (rer) and to anchor the complex in the membrane. the general orientation and structure of the rsv glycoprotein is thus similar to that of the influenza virus hemagglutinin (ha) (porter et al., 1979; gething et al., 1980), the vesicular stomatitis virus (vsv) g protein (gallione and rose, 1983), and several membrane-spanning cell-encoded glycoproteins.

fig. 2. the hatched regions represent the highly hydrophobic signal and anchor sequences found at the n and c terminus, respectively. the location of a nonpolar region that may be analogous to the fusion peptide of ortho- and paramyxoviruses is shown by the stippled box. the aug at the start of the env orf was used to initiate translation of transcripts expressed in an sv40 vector (wills et al., 1984), but during a virus infection the genomic length transcript is spliced such that the aug and first 5 codons from gag (black box) are spliced into the env orf. in both cases the signal peptide is removed during translation, and cleavage of the polyprotein precursor to gp85 and gp37 occurs in the golgi at the basic tetrapeptide, -arg-arg-lys-arg-. branched structures denote potential n-linked oligosaccharide addition sites (asn-x-ser or asn-x-thr). (b) orientations of gp85 and gp37 in the viral membrane are depicted schematically. in the electron microscope this structure is seen as a spiked knob, where gp37 is the spike and gp85 is the knob.

the two viral glycoproteins of rsv are encoded by a single viral gene, env, and are translated in the form of a heavily glycosylated precursor polypeptide, pr95env. since several processing and maturation events occur during the transport of the env gene products to the plasma membrane, they provide excellent markers for the subcellular compartments of the cell. a long (62 amino acid) amino-terminal signal peptide, which mediates translocation of the env gene product across the rer, is removed cotranslationally from the precursor protein, and in the lumen of the rer 15-16 high-mannose core glycosylation units are added to the nascent pr95. the marked increase in molecular weight (mw) (approximately 40k) that results from the addition of this endo-β-n-acetylglucosaminidase h (endo h)-sensitive carbohydrate provides a clear indicator for the translocation event. removal of glucose residues and some mannose moieties appears to occur prior to transport of the protein to the golgi, where further trimming of the mannose residues and addition of glucosamine, galactose, and fucose are observed. 
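the sequence features named in this description, the asn-x-ser or asn-x-thr addition sites and the basic tetrapeptide -arg-arg-lys-arg-, are simple enough to locate by pattern matching. the following python sketch is illustrative only: the toy sequence is hypothetical (it is not the rsv env sequence), the function names are invented for this example, the exclusion of proline at the x position is a commonly applied refinement that the text does not state, and placing the cut immediately after the tetrapeptide is an assumption.

```python
import re

# Hypothetical toy precursor in one-letter code; NOT the real RSV env sequence.
TOY_PRECURSOR = "MKNATLLNPSGGANGTWRRKRSAVNLTQQ"

# Asn-X-Ser/Thr with X != Pro (the proline exclusion is an added assumption).
# The lookahead lets overlapping motifs be reported individually.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(seq):
    """Return 0-based positions of the Asn in each potential N-linked site."""
    return [m.start() for m in SEQUON.finditer(seq)]


def split_at_basic_site(seq, site="RRKR"):
    """Split seq just after the first occurrence of the basic tetrapeptide.

    Returns (n_terminal_part, c_terminal_part), or None if the site is absent.
    Cutting immediately after the tetrapeptide is an assumption of this sketch.
    """
    start = seq.find(site)
    if start < 0:
        return None
    cut = start + len(site)
    return seq[:cut], seq[cut:]


if __name__ == "__main__":
    print("potential glycosylation sites:", find_sequons(TOY_PRECURSOR))
    print("fragments:", split_at_basic_site(TOY_PRECURSOR))
```

a scan of this kind only lists potential sites; whether a given site is actually used has to be established experimentally, as with the endo h sensitivity shift described in the text.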
cleavage of pr95 to gp85 and gp37 takes place after galactose addition and prior to fucose addition; it thus provides an excellent marker for trans-golgi locations. while the transit time of pr95 from the rer to the golgi is quite long (t = 90 min) compared to other viral glycoproteins, movement through the golgi appears rapid and precludes any dissection of individual compartments within this organelle. these major biochemical modifications to the env glycoprotein, coupled with sensitive immunological probes, allow a fairly accurate mapping of the cell's secretory pathway.

several viruses assemble at points within the secretory pathway, and since in many cases the location of the viral glycoproteins defines the virus maturation point (roth et al., 1983b; jones et al., 1985; gottlieb et al., 1986; gahmberg, 1984; kabcenell and atkinson, 1985), these systems are proving useful in investigating the signals that target proteins to specific subcellular locations. figure 3 is a schematic summary of the assembly points for the major groups of enveloped viruses. herpes simplex virus (hsv), a complex enveloped dna virus that encodes several glycoproteins (gb, gc, gd, ge), buds into the nuclear envelope (spear, 1985). virion glycoproteins synthesized on the rer appear to be transported to the nuclear membrane in an endo h-sensitive form where they are incorporated into nascent virions (compton and courtney, 1984). it has been postulated that intact hsv virions traverse the secretory pathway, thereby exposing the glycoproteins to the entire array of carbohydrate-modifying enzymes such that the mature virion contains glycoproteins with complex carbohydrate side chains (compton and courtney, 1984; spear, 1985). a majority of the hsv glycoproteins can be found on the surface of infected cells, suggesting that targeting to the nuclear membrane is not absolute. 
expression of the cloned hsv gd gene in the absence of other viral components resulted in more rapid transport to the plasma membrane and reduced accumulation on the nuclear membrane, indicating that interactions with other viral-encoded proteins may be required for normal nuclear membrane localization (johnson and smiley, 1985). the corona-, flavi-, and rotaviruses have been reported to undergo assembly at the rer (dubois-dalcq et al., 1984), but the rotaviruses appear to target this organelle most specifically. while the mature rotavirus is not enveloped, it contains a glycosylated capsid protein in its outer shell that is derived from a transient membrane during the assembly process. the inner and outer protein shells of these viruses form sequentially and by very different mechanisms. the inner capsids, containing the genomic rna segments, are the equivalent of nucleocapsids of other viruses and assemble in the cytoplasm at the edge of electron-dense inclusions called "viroplasm" (petrie et al., 1982). they acquire a transient envelope, or pseudoenvelope, by budding at rer membranes adjacent to the viroplasm. further maturation of rotaviruses occurs within the cisternae of the rer, where enveloped particles are converted to mature double-shelled virions and the lipid bilayer is removed by a process that remains to be elucidated (dubois-dalcq et al., 1984). vp7, the glycosylated protein found in the outer capsid, has been shown to have carbohydrate structures consistent with its rer location and to target specifically to this organelle when expressed in the absence of other viral proteins from a recombinant expression vector (kabcenell and atkinson, 1985; poruchynsky et al., 1985). the golgi body is the site of assembly for several viruses. 
coronaviruses, for example, mature by budding into the lumina of rer or golgi cisternae; virions form as the intracytoplasmic, helical nucleocapsids align under regions of intracellular membranes containing viral proteins. two glycoproteins, e1 and e2, comprise these membrane-associated polypeptides. e2 forms the large peplomers or spikes characteristic of coronaviruses and is a multifunctional molecule, being responsible for virus-induced cell fusion, for binding of the virion to receptors on the plasma membrane of susceptible cells, and for inducing neutralizing antibody (dubois-dalcq et al., 1984). e1, in contrast, is an unusual polypeptide; it has only a short amino-terminal domain, which contains the glycosylation sites of the protein, and two long stretches of hydrophobic amino acids, suggesting that it may traverse the membrane more than once (dubois-dalcq et al., 1984; boursnell et al., 1984). during infection, e2 can be transported through the secretory pathway to the plasma membrane, whereas e1 is transported only as far as the golgi apparatus, where it accumulates during the infection cycle (sturman and holmes, 1983); it has been suggested that this restricted intracellular movement of e1 accounts for the intracellular budding site of coronaviruses (sturman and holmes, 1983). members of the bunyavirus family also mature intracellularly, by budding at the golgi complex (bishop and shope, 1979; dubois-dalcq et al., 1984). by immunofluorescent microscopy, kuismanen and colleagues showed that in cells infected with uukuniemi virus the golgi region underwent an expansion and became vacuolized. both glycoproteins, g1 and g2, accumulated in the golgi region during virus infection; neither polypeptide could be chased out of the golgi even after a 6-hr treatment with cycloheximide (gahmberg et al., 1986), conditions that would allow complete transport of the semliki forest virus membrane proteins from the golgi (green et al., 1981a). 
furthermore, the glycoproteins of a temperature-sensitive strain of uukuniemi virus were retained in the golgi even under conditions where no virus maturation took place and no nucleocapsids accumulated in the golgi region (gahmberg et al., 1986). thus intracellular targeting of these glycoproteins appears to be independent of other viral components and of the assembly process itself. moreover, it supports the concept that it is the glycoproteins themselves that dictate the cellular site of virus maturation. for several virus groups, virion assembly does not occur until the envelope components have traversed the entire secretory pathway. thus the ortho- and paramyxoviruses, rhabdoviruses, alphaviruses, and retroviruses mature at the plasma membrane. even these proteins, however, possess additional membrane-targeting information such that in polarized epithelial cells, where different cell proteins are inserted in the apical and basolateral membranes, the different viruses assemble from distinct membranes; for example, the ortho- and paramyxoviruses bud from apical membranes, and the rhabdo- and retroviruses from basolateral membranes (rodriguez-boulan and sabatini, 1978; herrler et al., 1981; roth et al., 1983a; rindler et al., 1985). as with those viruses that mature at points within the secretory pathway, it is the glycoproteins themselves that appear to determine the specific plasma membrane location for virus assembly, since glycoproteins expressed from recombinant expression vectors are transported in a polarized fashion (roth et al., 1983b; jones et al., 1985; gottlieb et al., 1986). the problem of polarized expression will be dealt with in more detail later in this chapter (see section ii, b, 4). from the brief outline presented above it is clear that viral envelope components provide a plethora of systems for studying intracellular protein targeting. 
recombinant dna approaches, described below, are already providing information on the role of the different glycoprotein domains in this important aspect of viral and cell biology. in addition to utilizing the secretory pathway of vertebrate cells for transporting viral components to the point of virus maturation, several enveloped viruses take advantage of a second vesicle-mediated transport system, the endocytic pathway, to gain entry into susceptible cells. these viruses, such as orthomyxoviruses, rhabdoviruses, and togaviruses, bind to the host cell surface and are subsequently internalized by endocytosis, in contrast to paramyxoviruses such as sendai virus, which bind to and fuse directly with the plasma membrane of the host cell. endocytosis serves an important role in the normal uptake of nutrients and in the internalization of receptor-bound ligands such as hormones, growth factors, lipoproteins, and antibodies (mellman et al., 1986; hopkins, 1983). bound virions are carried into clathrin-coated pits, which form continually on the surface of the cell, and which fold inward and pinch off into the cytoplasm to form "coated vesicles." as the coated vesicle moves into the cytoplasm, it loses its clathrin and fuses with an endosome, a large acidic vacuole with a smooth outer surface. for viruses entering by this pathway, membrane fusion occurs in the endosomal compartment (marsh, 1984; yoshimura and ohnishi, 1984). fusion is triggered by the mildly acidic endosomal ph and is catalyzed by the virally encoded glycoproteins, which undergo a low ph-dependent conformational change (skehel et al., 1982; kielian and helenius, 1985). the ph dependence of fusion varies among virus types, with the optimal ph for fusion generally falling within the range of ph 5.0-6.2 for endocytosed viruses (white et al., 1983).
in order to obtain an understanding of the molecular mechanisms involved in these low ph-induced fusion reactions, virus mutants have been isolated which fuse with ph optima different from those of their respective parents. through the use of an elegant selection scheme, in which mutagenized virus was allowed to fuse with nuclease-filled liposomes at a ph below 6.0, kielian et al. (1984) isolated the first such fusion mutant of semliki forest virus. this virus, fus-1, fused at a ph optimum 0.7 ph units lower than that of the wild type (ph 5.5 versus 6.2). the mutant was, nevertheless, fully capable of infecting cells under standard infection conditions and even under conditions that prevent fusion of endosomes with lysosomes. on the other hand, the fus-1 mutant showed increased sensitivity to lysosomotropic agents that increase the ph in acidic vacuoles of the endocytic pathway. in addition to proving that alterations within viral structural components can significantly affect the ph at which virus-induced fusion can occur, these results showed that a ph below 5.5 exists within the endosomal compartment and thereby demonstrated the usefulness of mutant viruses as biological ph probes of this pathway. in parallel studies, rott and co-workers (1984) have shown that variants of the x31 strain of influenza virus, selected for their ability to undergo activation cleavage and growth in madin-darby canine kidney (mdck) cells, have an elevated fusion ph threshold (approximately 0.7 ph units higher than the wild type). similar virus variants have been selected by growth of influenza virus in the presence of amantadine, a compound that raises endosomal ph (daniels et al., 1985). in this latter study, viruses were obtained that fused at ph values 0.1-0.7 units higher than the parental strain. analogous mutants have been reported to occur naturally within stocks of the x31 strain of influenza virus (doms et al., 1986).
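as a rough illustration of what a shift of 0.7 ph units in the fusion optimum means, the sketch below converts the two optima quoted above for semliki forest virus (ph 5.5 for fus-1, ph 6.2 for wild type) into a ratio of hydrogen ion concentrations, using the definition ph = -log10[h+]. the function and variable names are illustrative only and do not come from the studies cited.

```python
def proton_fold_change(ph_low: float, ph_high: float) -> float:
    """Return [H+] at ph_low divided by [H+] at ph_high.

    Uses pH = -log10[H+], so the ratio is 10 ** (ph_high - ph_low).
    """
    return 10 ** (ph_high - ph_low)


# pH optima quoted in the text for Semliki Forest virus fusion.
wild_type_optimum = 6.2
fus1_optimum = 5.5

fold = proton_fold_change(fus1_optimum, wild_type_optimum)
print(f"fus-1 optimum corresponds to about {fold:.1f}-fold more H+ than wild type")
```

on this reading, a difference of 0.7 ph units corresponds to roughly a fivefold difference in hydrogen ion concentration, which is consistent with the increased sensitivity of fus-1 to agents that raise endosomal ph.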
such mutants should provide useful probes for elucidating the endocytic pathway. the mechanisms by which cells send membrane-bound and secreted proteins to their proper subcellular locations remain a central problem in cell biology. sorting has been postulated to involve the specific interaction of "sorting signals," located within the structure of the newly synthesized proteins, with membrane-bound receptors in the rer and golgi apparatus of the cell (for review, see sabatini et al., 1982; silhavy et al., 1983). this concept is supported by the facts that cell, as well as viral, glycoproteins can be retained at or targeted to specific points in the secretory pathway and that cells can transport and secrete a variety of glycosylated and nonglycosylated proteins at distinctly different rates (strous and lodish, 1980; fitting and kabat, 1982; gumbiner and kelly, 1982; ledford and davis, 1983; kelly, 1985). very little is known about the composition(s) or indeed the exact role of sorting signals, but it is generally thought that they are composed of protein. clearly, the initial step that introduces polypeptides into the secretory pathway is mediated by the interaction of a sequence of amino acids (the signal sequence) within the polypeptide and the signal recognition particle (srp)/docking protein (dp) complex (blobel and dobberstein, 1975; blobel, 1980, 1981a,b; walter et al., 1981; meyer and dobberstein, 1980; meyer et al., 1982; gilmore et al., 1982a,b). mutants of secreted proteins that are defective in later stages of transport have been identified that differ from the wild-type forms by one (mosmann and williamson, 1980; wu et al., 1983; shida and matsumoto, 1983) or two (yoshida et al., 1976; hercz et al., 1978) amino acid substitutions, supporting the concept that sorting signals are composed of protein.
also, several conditional transport-defective mutants of membrane-bound viral glycoproteins have been identified (for example, knipe et al., 1977b; zilberstein et al., 1980; pesonen et al., 1981). studies using tunicamycin, an inhibitor of glycosylation, suggest that carbohydrate moieties are not recognized directly by the sorting machinery but may be important for maintaining the proper secondary or tertiary structures of (protein-composed) sorting signals (struck et al., 1978; gibson et al., 1978, 1979; leavitt et al., 1977; hickman et al., 1977; roth et al., 1979; strous et al., 1983; green et al., 1981b). the lack of a direct role for carbohydrate moieties in the sorting process is perhaps to be expected in view of the fact that many secreted proteins are not glycosylated at all (for example, strous and lodish, 1980; underdown et al., 1971). nevertheless, addition of carbohydrate to molecules that are unable to be transported through the secretory pathway can release the block to their transport (guan et al., 1985; machamer et al., 1985). in addition, the transport of certain hydrolases to the lysosome (and away from the secretory pathway) does appear to require the addition of a carbohydrate moiety (mannose 6-phosphate) (hasilik and neufeld, 1980; sly and fisher, 1982; creek and sly, 1984), but these additions in turn must require the recognition of signals within the polypeptide chains. even less is known about the intramolecular location(s) of sorting signals. in the case of the membrane-spanning glycoproteins, three protein domains exist which together or separately may harbor sorting signals: (1) the internal or cytoplasmic domain, (2) the hydrophobic or transmembrane domain, and (3) the extracytoplasmic or external domain.
since most secreted proteins (which may be cotransported with membrane-bound glycoproteins, strous et al., 1983) possess only external domains, it might be reasonable to expect the transmembrane and cytoplasmic domains to be unimportant to the sorting process. we have tested this hypothesis by introducing genetic lesions into the gene encoding the envelope glycoproteins of rsv (wills et al., 1983, 1984; hardwick et al., 1986), as have others for the vsv g protein (bergmann, 1982, 1983; rose et al., 1984; adams and rose, 1985a,b), the influenza virus hemagglutinin protein (sveda et al., 1982; gething and sambrook, 1982; doyle et al., 1985, 1986; gething et al., 1986), and the glycoproteins of semliki forest virus (garoff et al., 1983; garoff, 1985). genetic analyses of protein transport in prokaryotic systems have provided both support for the role of the signal peptide in protein translocation and valuable insights into the polypeptide interactions that are required for the intracellular targeting of bacterial secreted and membrane proteins (reviewed by michaelis and beckwith, 1982; silhavy et al., 1983; benson et al., 1985; oliver, 1985). while similar experiments are more difficult to perform in eukaryotic cells with the enveloped virus systems described here, both the classic and molecular genetic approaches outlined below are providing information on the role of different protein domains in the transport process. during the genetic analysis of enveloped virus replication through the isolation and biochemical characterization of spontaneous and mutagen-induced variants, complementation groups were established for several viruses that contained mutants defective in normal transport of the viral glycoproteins (knipe et al., 1977a,b; zilberstein et al., 1980; pesonen et al., 1981; gahmberg, 1984; ueda and kilbourne, 1976).
the existence of conditional lethal mutants that were blocked at different stages of virus glycoprotein maturation suggested that the viral polypeptides themselves might contain the signals necessary for normal sorting by the cells' transport machinery and raised the possibility that such mutants could be used to dissect the maturation pathway of a glycoprotein. since it is impossible in this chapter to provide a detailed review of the characterization of mutants in each of these systems, and since the transport of the influenza virus glycoproteins has recently been discussed in detail elsewhere, this section will concentrate primarily on mutants of the vsv g protein gene as an example of these approaches. temperature-sensitive (ts) mutants in complementation group v of vsv have defects in the structural gene for the viral glycoprotein, g, and cells infected at the nonpermissive temperature with such mutants produce markedly reduced yields of virus-like particles which are noninfectious and specifically deficient in g protein. at the nonpermissive temperature the mutant g polypeptide is synthesized normally; however, it does not accumulate on the cell surface, nor is it incorporated into virions (knipe et al., 1977b; zilberstein et al., 1980). the ts(v) mutants can be subdivided into two subclasses with respect to the stage of posttranslational processing at which the block occurs (zilberstein et al., 1980). three mutants, tsl513, tsm501, and ts045, encode g proteins that at the nonpermissive temperature are blocked at an early, pre-golgi step of the secretory pathway. while insertion into the er membrane, removal of the amino-terminal hydrophobic signal sequence, and addition of the two n-linked high-mannose core oligosaccharides occur in a way that is indistinguishable from the wild type, all subsequent golgi-mediated carbohydrate processing reactions are blocked (zilberstein et al., 1980).
these results are consistent with subcellular fractionation and immunoelectron microscopy studies which indicated that the g protein in tsm501- or ts045-infected cells was arrested in its transport from the rer to the golgi complex at the nonpermissive temperature (zilberstein et al., 1980; bergmann et al., 1981; bergmann and singer, 1983). the defect in transport in these mutants is a reversible phenomenon, thereby excluding irreversible denaturation as the basis for lack of movement; proteins synthesized at the nonpermissive temperature rapidly move by stages to the plasma membrane upon shift to the permissive temperature (bergmann and singer, 1983). within 3 min after shift to 32°c, g protein of ts045 could be seen by immunoelectron microscopy at high density in saccules at one face of the golgi complex and by 3 min later was uniformly distributed through the complex (bergmann and singer, 1983). movement of the mutant proteins to the cell surface occurred rapidly and was accompanied by incorporation into virions. the second class of ts mutants of vsv is represented by tsl511. g protein encoded by this mutant is transported normally through most of the golgi-mediated functions involved in the processing of carbohydrate side chains, including addition of the terminal sialic acid residues. however, this molecule does not undergo two posttranslational modification reactions that take place with the wild-type g protein. the first is the addition of fucose to the tsl511 oligosaccharide chains, which is reduced at both permissive and nonpermissive temperatures (zilberstein et al., 1980). in the second, a molecule of palmitic acid (a 16-carbon fatty acid) is covalently attached to the wild-type g polypeptide near the membrane-spanning region (rose et al., 1984). attachment occurs at a late stage of maturation, just before oligosaccharide processing is completed (schmidt and schlesinger, 1980), probably in the cis compartment of the golgi complex (dunphy et al., 1981).
this modification of the g protein does not occur at the nonpermissive temperature in cells infected with the tsl511 mutant. taken together with the almost complete processing of the mutant's oligosaccharide side chains, this suggests that at the nonpermissive temperature the tsl511 glycoprotein accumulates at a specific region within the golgi complex. the mutants of vsv, together with equivalent mutants in other viral systems that were blocked at different stages of the secretory pathway (pesonen et al., 1981; gahmberg, 1984; kuismanen et al., 1984), raised the possibility of identifying and understanding the nature of sorting signals in secreted polypeptides. with the advent of recombinant dna and rapid nucleotide sequence techniques, it has been possible to determine at the amino acid level the basis for these defects (gallione and rose, 1985; arias et al., 1983), but the interpretation of this information with regard to protein transport has been less than straightforward. gallione and rose (1985) determined the nucleotide sequence of the ts045 mutant of vsv and compared it to that of the parent and a wild-type revertant. the mutant and revertant differed in three amino acid residues, and through the construction and expression of hybrid genes it was possible for these investigators to demonstrate that the basis of the temperature-sensitive phenotype was a single amino acid change of phenylalanine to serine. since this polar substitution occurred within a very hydrophobic region of the g protein, it was suggested that it might significantly affect protein folding in this region such that reversible denaturation of the protein might occur at the nonpermissive temperature. this denaturation could prevent further transport to the golgi apparatus and the cell surface.
alternatively, gallione and rose (1985) pointed out that the conformational change at the nonpermissive temperature might be more subtle, perhaps preventing recognition by a component of the protein transport machinery. however, since hydrophobic residues are generally buried within a protein (kyte and doolittle, 1982), it is unlikely that the mutated sequence itself would play a direct role in such an interaction. nothing is known about the 3-dimensional structure of the g protein or whether accessory proteins are involved at this stage of protein transport; thus both suggestions remain viable possibilities. a similar analysis has been carried out by arias et al. (1983), who sequenced the genes encoding the viral glycoproteins of ts10 and ts23, mutants of sindbis virus defective in the intracellular transport of their glycoproteins, and of revertants of these mutants. these investigators found ts23 to have a double mutation in glycoprotein e1, while ts10 was a single mutant in the same glycoprotein. in each case reversion to temperature insensitivity occurred by changes at the same site as the mutation, in two cases restoring the original amino acid and in the third case substituting a homologous amino acid (arginine in place of lysine). since the three mutations were far apart from each other in the protein, these authors concluded that the 3-dimensional conformation of e1 was very important for the correct migration of the glycoproteins from the er to the plasma membrane. similarly, two ts mutants in the ha gene of influenza virus that result in ha protein transport being arrested in the rer are also caused by single point mutations that probably disrupt the tertiary structure of the molecule (nakajima et al., 1986). in summary, several conditional mutants have been isolated by classic genetic approaches that at the nonpermissive temperature disrupt the normal transport of viral glycoproteins through the secretory pathway.
while these mutants carried the promise of defining specific protein domains that might interact with components of the transport machinery, the evidence from the nucleotide sequencing experiments outlined above suggests that many or all of the mutations may exert their phenotype through distortion of the 3-dimensional shape of the molecule. while this mechanism does not preclude a role for specific protein-protein interactions in the secretory pathway, it provides no direct evidence for it at this time. the recent development of cdna cloning, gene sequence manipulation, and gene expression technologies has opened up new approaches for localizing and characterizing those structural features of a protein that act as sorting signals. these new methodologies have allowed investigators to delete or modify potentially important structural regions of a protein at the nucleotide level and then to determine the effect of such changes by expressing modified genes in suitable eukaryotic expression vectors. furthermore, in some instances it has been possible to test directly the functional role of a particular peptide region by fusing it to another protein and analyzing the behavior of the chimera. since these general approaches to protein sorting have been reviewed recently (garoff, 1985; gething, 1985) , this chapter will describe our recent recombinant dna analyses of the biosynthesis and transport of the rsv envelope glycoproteins, within the context of similar analyses in other viral systems. as we have discussed earlier, the rsv env gene encodes two viral glycoproteins, gp85 and gp37, that mediate recognition of, attachment to, and penetration of the susceptible target cell. these proteins are synthesized as a glycosylated precursor protein, pr95, that is proteolytically cleaved in the golgi complex. 
the coding sequences for gp85 and gp37 have been placed in an open reading frame that extends from nucleotide 5054 to nucleotide 6863, and predict sizes of 341 amino acids (40,000 mw) for gp85 and 198 amino acids (21,500 mw) for gp37 (fig. 2). carbohydrate makes a significant contribution to the observed molecular weights of these polypeptides: the predicted amino acid sequence contains 14 potential glycosylation sites (asn-x-ser/thr) in gp85 and 2 in gp37. experiments aimed at determining the number of carbohydrate side chains yielded results consistent with most or all of the sites being occupied. although an initiation codon is located early (codon 4) in the open reading frame, during a viral infection splicing yields an mrna on which translation initiates at the same aug as the gag gene to produce a nascent polypeptide in which gp85 is preceded by a 62 amino acid long leader (signal) peptide (fig. 2). this peptide contains a hydrophobic sequence that we have shown (see below) is necessary for translocation across the rer and is completely removed from the env gene product during translation. it represents one of the longest signal peptides described to date, and we were therefore interested in determining the signal peptide requirements for normal biosynthesis of gp85 and gp37. for these studies the env open reading frame was excised from the rsv genome and inserted into an sv40 expression vector under the control of the late-region promoter (wills et al., 1983). in this construction translation is initiated at the aug present at the start of the open reading frame (at nucleotide 5054 of the rsv genome) and results in the synthesis of an even longer (64 amino acid) signal peptide; nevertheless, biosynthesis of pr95 and signal peptide cleavage occur normally.
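the asn-x-ser/thr rule used above to count potential glycosylation sites can be sketched as a simple pattern search over a one-letter amino acid sequence. this is a minimal sketch, not the procedure used in the work described: the function name and the toy sequence are hypothetical, and the option to exclude proline at the x position is a commonly applied refinement that the text itself does not state.

```python
import re

# Zero-width lookaheads let overlapping sequons (e.g. N-N-T-S) all be reported.
ANY_X = re.compile(r"(?=N.[ST])")          # asn-x-ser/thr exactly as written in the text
NO_PROLINE = re.compile(r"(?=N[^P][ST])")  # assumption: x may be any residue except proline


def potential_glycosylation_sites(protein, exclude_proline=True):
    """Return 1-based positions of asparagines in asn-x-ser/thr sequons."""
    pattern = NO_PROLINE if exclude_proline else ANY_X
    return [match.start() + 1 for match in pattern.finditer(protein.upper())]


# Hypothetical toy sequence, not the rsv env sequence.
toy = "MNASANPTNGT"
print(potential_glycosylation_sites(toy))
print(potential_glycosylation_sites(toy, exclude_proline=False))
```

a count obtained this way gives only potential sites; as noted above, whether each site is actually occupied has to be determined experimentally.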
furthermore, expression of the rsv env gene in african green monkey (cv-1) cells parallels that seen in a normal virus infection in avian cells (wills et al., 1984; hardwick et al., 1986), making it an excellent system for the analysis of mutant env genes. a. deletion/substitution of the signal peptide. in order to examine the role of the signal peptide in rsv glycoprotein biosynthesis we constructed a series of deletion mutations within the 5' coding region of the env gene using the double-stranded exonuclease bal31. oligonucleotide linkers of the sequence catcgatg were ligated to the ends of the truncated molecules to introduce a unique restriction endonuclease cleavage site and to replace the deleted in-frame aug. the mutants were then sized and their nucleotide sequence determined to find those with a suitable deletion and an in-frame aug. one such mutant, a1, contained a deletion of 171 nucleotides within the env coding sequences and encoded an env product that completely lacked an amino-terminal hydrophobic sequence (fig. 4). expression of this gene from the sv40 vector resulted in the synthesis of a nonglycosylated, 58 kda cytoplasmic protein that was similar in size to the nonglycosylated wild-type env gene product produced in the presence of the glycosylation inhibitor, tunicamycin. in contrast to the tunicamycin product, however, the a1 protein was not associated with membrane vesicles and was rapidly degraded (half-life < 5 min; e. hunter, k. shaw, and j. wills, unpublished). thus the signals for initiating translocation of the rsv env gene product must reside within the cotranslationally removed amino-terminal sequence, and in their absence the molecule is synthesized as an unstable cytoplasmic protein. this result is similar to those obtained by gething and sambrook (1982) and by sekikawa and lai (1983) with the influenza virus ha gene product. influenza hemagglutinin sequences are depicted by italicized text.
underlined text denotes amino acid residues encoded by oligonucleotide linkers. arrows depict the signal peptidase cleavage site at which the signal is removed cotranslationally from each of the constructs. plus symbols indicate that translocation, glycosylation, or transport to a plasma membrane location is observed; minus symbols mean that the properties above are not observed. since the signal peptide of the rsv env gene product is exceptionally long, it was of interest to determine whether another signal peptide could substitute for it. for these experiments we have utilized the signal sequences of the influenza virus ha gene (a/jap/305/57; gething et al., 1980). two constructions were made: in the first of these the a1 deletion mutant coding sequence was fused in-frame to the ha signal coding sequence at the signal peptidase cleavage site of the latter (fig. 4) and in the second, we made use of a sali restriction enzyme site in the ha gene and an xhoi site in the rsv env gene, so that the signal sequence and 16 amino acids of ha1 were fused with env 6 amino acids into gp85 (fig. 4). expression of these hybrid genes in cv-1 cells resulted in the biosynthesis of a glycosylated pr95 protein that was transported to the golgi complex, cleaved to gp85 and gp37, and displayed on the cell surface (e. hunter, k. shaw, and j. wills, unpublished). to demonstrate that the pr95 molecules expressed from the ha-a1 fusion gene had undergone signal peptide cleavage, pr95 was immunoprecipitated from [3h]leucine pulse-labeled cells and analyzed by sequential edman degradation, in order to determine the amino-terminal sequence. to our surprise the signal peptidase had cleaved at the ha cleavage site, despite the fact that according to the analyses of von heijne (1983) only the rsv cleavage sequence should have been recognized.
thus the ha-a1 fusion protein contains two potential signal peptidase cleavage sites (that from the ha and that remaining in the env sequences), but only the first of these is utilized. both gene fusions, therefore, result in the synthesis of aberrant gp85 proteins (that from the ha-a1 fusion having an 8 amino acid amino-terminal extension, and that from the sali/xhoi fusion having lost 6 amino-terminal amino acids and gained 16 from ha1), which nevertheless can be transported to the plasma membrane (wills et al., unpublished data). reciprocal gene fusions, in which the env gene signal peptide was fused to the structural sequences of ha (fig. 4), also resulted in translocation of the ha molecule across the rer membrane, supporting the concept that this transient sorting sequence is not polypeptide specific. however, only in the construction where the signal sequence of env was precisely fused to the amino terminus of ha1 was transport beyond the rer observed (wills et al., unpublished data). in constructions where the amino-terminal sequence of ha1 was perturbed, the recombinant protein was apparently prevented from assembling into trimers and its transport was blocked in the rer. thus, while signal peptides may be capable of mediating the translocation of foreign polypeptides across the rer, other sorting "signals" must be active for transport of the molecule to continue. the experiments described above have been extended to determine the following: (a) whether the hydrophobic region of the signal peptide carries all the information required for transfer of the env gene product into the rer; (b) what the structural specificities of the signal peptide are; and (c) where the specificity for signal peptidase cleavage is located. more than 200 prokaryotic and eukaryotic signal peptides have been sequenced (watson, 1984). comparison shows that most extensions comprise 20-40 amino acid residues, one of the longest being that of the rsv envelope glycoprotein.
there is no homology between sequences, but a characteristic distribution of amino acid chains is observed. three structurally distinct regions have been observed so far: a positively charged amino-terminal region, a central region of 9 or more hydrophobic residues, and a more polar carboxy-terminal region that appears to define the cleavage site (von heijne, 1983, 1984, 1985; perlman and halvorson, 1983). the importance of these general features has been supported by the genetic studies in prokaryotic systems (reviewed by silhavy et al., 1983; benson et al., 1985). to investigate these questions we initially constructed a series of internal deletion mutants that initiated within the amino-terminus of gp85 and extended into the signal peptide. the deletion mutations were introduced into the coding region for the envelope glycoprotein by digestion of a plasmid containing the env gene at a unique xhoi site located 13 base pairs (bp) from the 5' end of the coding sequence for gp85, followed by digestion with the double-stranded exonuclease bal31. potential mutants were identified by restriction enzyme analysis and dna sequencing, and those of interest were engineered into the sv40 expression vector. these are depicted in figs. 5 and 6. mutants x4-a, -b, and -c were derived from a single out-of-frame parent, and they represent a nested set of mutants in which the hydrophobic sequence varies from the wild-type length of 11 to only 9 amino acids. expression of these mutant genes in cv-1 cells gave the results summarized in fig. 5. mutant polypeptides with the shortest hydrophobic domain (x4-c) resembled the a1 mutant polypeptides in that they had a cytoplasmic location, were nonglycosylated, and were rapidly degraded. they differed from the a1 mutant in length (65 versus 58 kda), confirming that the mutated signal was not removed.
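because the x4 mutants vary the length of the central hydrophobic region, it can help to see how such a length might be measured from a sequence. the sketch below finds the longest uninterrupted run of hydrophobic residues in a one-letter sequence. it rests on assumptions that are not in the text: the hydrophobic set is taken as the residues with positive hydropathy on the kyte and doolittle (1982) scale, the function name is illustrative, and the toy sequences are hypothetical rather than the rsv env signal peptide or its mutants.

```python
# Assumption: "hydrophobic" means a positive Kyte-Doolittle hydropathy value.
HYDROPHOBIC = frozenset("ACFILMV")


def longest_hydrophobic_run(protein):
    """Return (1-based start, length) of the longest run of hydrophobic residues."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for index, residue in enumerate(protein.upper()):
        if residue in HYDROPHOBIC:
            if run_len == 0:
                run_start = index  # a new run begins here
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0  # a polar or charged residue ends the run
    return best_start + 1, best_len


# Hypothetical signal-like sequences: a full-length core and a truncated one.
full_core = "MKRLLLAVVIAFCGSD"
short_core = "MKRLLLAVVIGSD"
print(longest_hydrophobic_run(full_core))
print(longest_hydrophobic_run(short_core))
```

a strict run length is a crude measure; a sliding-window hydropathy average would tolerate an occasional polar residue within the core, but the simple version is enough to compare a nested set of deletions.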
mutant x4-a polypeptides, on the other hand, were translocated and glycosylated with an efficiency equivalent to wild type, despite the substitution of serine and isoleucine for leucine and cysteine residues within the hydrophobic domain. mutant x4-b expressed a phenotype intermediate between that of x4-a and x4-c, approximately 50% of the polypeptides being translocated and glycosylated. none of the mutants contain the sequences that specify the signal peptidase cleavage site, and molecules of x4-a/x4-b that were translocated across the rer retained an uncleaved signal peptide. the data from these mutations suggest the following: (1) that the length, rather than the amino acid composition, of the hydrophobic domain of the env signal peptide is critical for translocation across the rer and (2) that signal peptide cleavage is not a requirement for translocation. the first of these conclusions is supported by genetic experiments in prokaryotic systems, where a requirement for secondary structure in the signal peptide was suggested (bankaitis et al., 1984). the second is consistent with the presence of permanent insertion sequences in secreted and membrane-spanning proteins that are translocated across the rer membrane without removal of an amino-terminal signal peptide (palmiter et al., 1978; bos et al., 1984; markoff et al., 1984; zerial et al., 1986; spiess and lodish, 1986). several membrane-spanning proteins are anchored in the membrane by an amino-terminal anchor/signal domain and display what is termed group ii protein topology (garoff, 1985; wickner and lodish, 1985), where the amino terminus of the protein is cytoplasmic and the carboxy terminus is luminal. after translocation of a nascent chain across the endoplasmic reticulum has been initiated, the signal peptide is removed. this cleavage is carried out by signal peptidase, a cellular gene product. two classes of signal peptidases have been described.
a signal peptidase of escherichia coli (spase i) has been cloned into pbr322 (date and wickner, 1981), and has been shown to accurately cleave eukaryotic precursor proteins as well as bacterial protein precursors (talmadge et al., 1980). conversely, the eukaryotic signal peptidase will accurately cleave prokaryotic proteins (watts et al., 1983). the latter enzyme has been studied, using detergent-solubilized dog pancreas signal peptidase (jackson and white, 1981) and hen oviduct signal peptidase (lively and walsh, 1983), and has been demonstrated to be an integral membrane protein that can be solubilized only when the lipid bilayer is dissolved. a second prokaryotic signal peptidase (e. coli spase ii) has been described that is specific for prolipoproteins (hussain et al., 1982; tokunaga et al., 1982) and membrane-bound penicillinases (nielsen and lampen, 1982). this enzyme maps to a different locus on the e. coli genome and requires a glyceride-modified cysteine for cleavage. perlman and halvorson (1983) and von heijne (1983) have examined sequences of a number of membrane proteins and have described amino acid sequence patterns that allow prediction of signal peptidase cleavage sites with greater than 90% accuracy. the most striking feature of signal peptidase cleavage sites is the presence of an amino acid with a small, uncharged side chain at the carboxy terminus of the signal peptide. the most common amino acids found at this position are alanine and glycine. from statistical analyses, the peptidase cleavage site appears to be determined by sequences within the signal peptide and not by sequences beyond the cleavage site. this is in contrast to the observations that mutations within the structural protein itself prevent signal peptidase cleavage of the lamb gene product and the m13 coat protein (emr and bassford, 1982; benson and silhavy, 1983; rüssel and model, 1981).
to investigate this question with regard to the rsv env gene product, the deletion mutants shown in fig. 6 were constructed as described above. mutant x1 has a 14 amino acid deletion encompassing residues 4-17 of gp85, which results in the loss of one potential glycosylation site. this deletion resulted in the synthesis of a slightly smaller precursor polypeptide that lacked one carbohydrate side chain but was otherwise glycosylated normally. based on size estimations in pulse-labeling experiments, in the presence and absence of tunicamycin, the precursor polypeptide lacked the long, signal-containing leader peptide. thus, although the mutation in x1 significantly alters the sequences near the signal peptidase site, the signal peptidase still recognized and removed the signal peptide.
fig. 6. amino acid sequence deduced from dna sequence of mutants x1, x2, and x3. the rsv glycoprotein is schematically represented: the location of the hydrophobic signal sequence within the long (64 amino acid) leader peptide is denoted by a stippled bar, and the mature gp85 glycoprotein by a hatched bar. the signal peptidase cleavage site in both the cartoon and the numbered amino acid sequence is denoted by a long arrow. the potential glycosylation site in the amino terminus of gp85 is shown as a cho. the amino acid sequence of the last 7 amino acids of the signal peptide and the amino-terminal 18 amino acids of gp85 are shown for the wild-type gene product. the solid black bars show the lengths and positions of the deletions in mutants x1, x2, and x3. [figure labels recovered from the image: xhoi site in dna; gp85 residues 1-18: asp val his leu leu glu gln pro gly asn leu trp ile thr trp ala asn arg.]
in mutants x2 and x3 the amino-terminal nine and six amino acids, respectively, of gp85 are deleted. they therefore encode gp85 polypeptides with novel amino termini that alter the signal peptidase cleavage site from ala/asp-val-his to ala/asn-leu-trp and ala/gln-pro-gly, respectively.
thus both the charge and secondary structure of the cleavage site would be predicted to be altered by the loss of asp and his (x2) and by the relocation of proline near the cutting site (x3). nevertheless, the signal peptidase efficiently cleaved the leader peptide from the nascent polypeptide, and its specificity of cleavage was unaffected by these alterations (hardwick et al, 1986) . these experiments then support the generalized conclusion from statistical analysis, that the sequence to the right of the signal peptidase site is not critical for signal peptidase cleavage. none of the deletions generated in the rsv glycoprotein gene resulted in a loss of recognition and cleavage by the signal peptidase. this result contrasts with previously described prokaryotic mutants. a 12 amino acid deletion starting at the fifth residue beyond the signal peptidase site of the lamb gene product blocked cleavage of the signal (emr and bassford, 1982; emr et al, 1981) , and a deletion of 130 amino acids beginning 70 amino acids downstream from the signal abolished signal cleavage although the shortened protein was localized correctly . in addition, the substitution of a leucine, in place of the glutamic acid, at residue 2 of the mature m13 coat protein also inhibited signal peptidase cleavage; however, in this latter instance the procoat protein was transported inefficiently across the inner membrane (boeke et al., 1980; rüssel and model, 1981) . although more mutants will be required to properly define these systems, the prokaryotic cleavage site appears to be more sensitive to manipulation than that of eukaryotes. there is accumulating evidence that transported prokaryotic proteins, unlike those of eukaryotes, may not be transferred across membranes in a strictly cotranslational manner (randall and hardy, 1984) . 
thus, altered regions within the structural protein portion of a molecule would have the opportunity to interact and interfere with signal peptidase cleavage; such an interaction would not be possible in the cotranslational system described for eukaryotes. although the mutant x1 polypeptides were translocated across the rer membrane in a normal fashion, immunofluorescence experiments and posttranslational modification probes indicated that the transport and maturation of the x1 glycoprotein was halted shortly after exiting the rer, perhaps within pre- or cis-golgi vesicles. cells synthesizing the mutant protein showed no surface immunofluorescence, no cleavage of the pr95 to gp85/gp37, and no terminal sugar additions (hardwick et al., 1986). the basis for this block appears to be the altered amino acid sequence rather than the loss of the carbohydrate side chain, since by using a mutagenic oligonucleotide we have modified the amino terminus of the x1 gp85 from asp-val-his-arg-thr- to asp-val-asn-arg-thr-, thereby reinserting the glycosylation site missing from this mutant. the derivative mutant, x1a, is glycosylated at this site but remains blocked at the same intracellular point in the secretory pathway (k. shaw, k. kervin, and e. hunter, unpublished). a second deletion mutant of the rsv env gene is also blocked in intracellular transport. this mutant, c3, has an engineered deletion at the carboxy terminus of gp37 that removed the cytoplasmic tail and transmembrane region (see below and wills et al., 1984). its transport is clearly blocked at an earlier stage than that of the x1 mutant since it was localized to the er and never reached the golgi apparatus, whereas by immunofluorescent staining of fixed cells the x1 protein appeared to colocalize with the golgi complex (see fig. 7; hardwick et al., 1986).
although the x1 and c3 mutants contain alterations at opposite ends of the env gene product, they both appear to lack an element that normally signals their transport to and beyond the cis-golgi. while there may be a specific amino acid sequence (analogous to the amino-terminal signal sequence) that is required for these transport steps, it is as likely that a correctly aligned tertiary structure is the critical factor.
fig. 7. intracellular immunofluorescence of cells expressing wild-type and mutant polypeptides. fixed infected cv-1 cells were stained to detect the intracellular localization of wild-type and mutant glycoproteins. rabbit anti-glycoprotein antibodies were tagged with fluorescein-conjugated goat anti-rabbit antibodies which in wild-type and x1-infected cells could be localized on the nuclear membrane (nm), endoplasmic reticulum (er), and golgi apparatus (g). the golgi was localized by staining the same cells with rhodamine-conjugated wheat germ agglutinin. in mutant c3-infected cells neither the nuclear membrane nor the golgi apparatus stained with the antiglycoprotein antiserum. magnification is 500.
just as small changes at the amino terminus of ha1 can disrupt assembly and transport of the ha trimer, the deleted amino acids unique to the x1 mutation may similarly play a critical role in the tertiary structure of the env complex, such that sorting signals required for transport through the golgi complex are lost. mutants x2 and x3 have deletions that begin at the amino terminus of gp85 and extend nine and six amino acids into this structural protein; thus these mutations overlap with the deletion in x1. nevertheless, both mutant proteins were transported to the cell surface and were indistinguishable from the wild type.
mutants x2 and x3 thus indicate that the terminal nine amino acids of pr95 are not required for normal intracellular transport and define the critical region in gp85 as the seven amino acids that are uniquely deleted in the x1 mutant. a. truncation of the cytoplasmic sequences. dna and protein sequence studies demonstrated the presence of a 27 amino acid long hydrophobic (and presumably membrane-spanning) domain and a 22 amino acid long cytoplasmic domain at the carboxy terminus of gp37 (see fig. 1). comparison of these domains with those of other exogenous and endogenous strains of rsv has revealed that the sequence within the hydrophobic domain is highly conserved and that within the cytoplasmic domain the sequence of the first 18 amino acids (adjacent to the hydrophobic domain) is also highly conserved while those at the carboxy termini diverge greatly (hughes, 1982; hunter et al., 1983). these results raised the possibility that the conserved region of the cytoplasmic domain might play a functional role either in transport of the env gene product through the secretory pathway or in virus assembly. to investigate this question we initially altered the cytoplasmic domain by introducing deletion mutations into the molecularly cloned sequences of the proviral env gene and examined the effects of the mutations on transport and subcellular localization in cv-1 cells. we found that replacement of the nonconserved region of the cytoplasmic domain with a longer unrelated sequence of amino acids from sv40 vector sequences (mutant c1) altered neither the rate of transport to the golgi apparatus nor the appearance of the glycoprotein on the cell surface. larger deletions, extending into the conserved region of the cytoplasmic domain (mutant c2), however, resulted in a 3-fold slower rate of transport to the golgi complex, but did not prevent transport to the cell surface (wills et al., 1983, 1984).
these results were thus consistent with the cytoplasmic domain of the rsv env gene product playing some role in transport to the golgi complex. similar results were obtained by rose and bergmann (1983) who introduced into the cdna clone encoding the vsv g protein a series of deletions that affected the cytoplasmic domain. these mutants fell into two classes; the first was completely arrested in their transport at a stage prior to the addition of complex oligosaccharides (presumably the rer) and the second showed severely reduced rates of transport to the golgi complex although the proteins were ultimately transported to and expressed on the cell surface. the method by which these mutants were constructed (as with the rsv env mutants) meant that the truncated g proteins terminated in sv40 sequences, and in at least one case the block to transport could be alleviated by substitution of a termination codon for these "poison" sequences. even in these constructions, however, three foreign amino acids were translated prior to termination (rose and bergmann, 1983) . the concept that the cytoplasmic domain might influence or govern the rate at which membrane-spanning proteins were transported to the golgi complex was supported by similar studies of doyle et al. (1985) on the ha polypeptide. while the ha cytoplasmic domain could be replaced by the equivalent region from the rsv env gene product without affecting the rate of transport of the hybrid ha from the rer to the golgi complex, truncation of the ha cytoplasmic domain or addition of the 21 amino acid long cytoplasmic domain from gp37 slowed transport significantly, and addition of 16 amino acids encoded by pbr322 sequences blocked transport of the ha from the rer. 
on the other hand, studies on the class i histocompatibility antigens (zuniga et al., 1983; murre et al., 1984) , the p62 of semliki forest virus (garoff et al., 1983) , and additional studies on the ha of influenza indicated that the cytoplasmic domains of these proteins could be truncated without affecting transport to the cell surface, although the kinetics of transport were not determined in every case. b. substitution mutations. it should be noted that in most of the mutant constructions described above, the carboxy-terminal region contained one or several aberrant amino acids as a result of the recombinant dna approach. thus in order to determine more directly the role of the cytoplasmic domain of gp37, we have used oligonucleotide-directed mutagenesis to introduce an early termination codon in the coding sequences of gp37 such that the arginine residue that represents the first amino acid of the cytoplasmic domain is changed to an opal terminator. this mutation creates a truncated viral glycoprotein lacking specifically the cytoplasmic domain of gp37. the biosynthesis and transport of the products of this mutant viral glycoprotein gene were analyzed by expression from an sv40 late-region replacement vector, and its ability to be active in viral assembly was investigated by substitution of the mutated gene for the wild-type gene in an infectious avian retrovirus vector. in contrast to our previous results, deletion of the entire cytoplasmic domain alone had no effect on the biosynthesis or rate of intracellular transport of the env glycoprotein. thus it seems unlikely that the conserved amino acids present in this region play a role in intracellular transport. 
although the cytoplasmic domain contains several charged, hydrophilic residues, it does not appear, by itself, to be required for anchoring the complex in the membrane, since molecules lacking the cytoplasmic domain were expressed stably on the plasma membrane and were not shed into the cell culture medium. a recent study by gething et al. (1986) has demonstrated that mutations within the cytoplasmic domain of the influenza virus ha can affect the conformation of the extracellular domain by preventing assembly and trimerization of the ha molecule, thereby resulting in a failure of those mutants to be efficiently transported. a similar requirement for the assembly of oligomeric forms of the vsv g protein prior to its transport to the golgi complex has also been reported (kreis and lodish, 1986). the inconsistency of our previous results with the ones obtained with the opal mutant could be explained in a similar way. we cannot rule out the possibility that in our earlier experiments the extra amino acids, added as a consequence of the loss of the env termination codon, created a conformational change in the extracellular domain of pr95 and slowed its transport from the rer. our present results indicate that the cytoplasmic domain of gp37 is neither a recognition signal for transport to the plasma membrane nor a requirement for anchoring the molecule to it. these findings also support the idea that the charged amino acids present in the cytoplasmic domains of many transmembrane proteins (garoff et al., 1983; sabatini et al., 1982) are dispensable for anchor function (davis et al., 1984). this latter question has also been addressed by cutler and co-workers, who mutated the cytoplasmic domain of the p62 polypeptide of semliki forest virus. this region, which normally contains a charge cluster (arg-ser-lys) flanking the hydrophobic domain, was changed to a neutral (met-ser-gly) or an acidic (met-ser-glu) one using oligonucleotide mutagenesis.
expression analyses of these mutant proteins confirmed that the basic amino acids were not required for cell surface transport since they reached the surface in a biologically active form. nevertheless, both mutant polypeptides showed reduced stability when membranes containing them were extracted with high-ph buffer. charged residues within the cytoplasmic domain may thus provide an additional measure of stability to the membrane-bound complex. since the conserved residues in the cytoplasmic domain of gp37 were not required for protein transport it seemed possible that this region might play a role in the process of infectious virus assembly. the fact that the mutant protein was efficiently transported to the cell surface allowed us to analyze this potential role for the cytoplasmic domain in the process of virus budding. chemical cross-linking experiments have demonstrated an interaction between gp37 and p19, one of the gag gene products that structure the viral core of rsv (gebhardt et al., 1984). while it is clear that virus assembly can occur in the absence of glycoproteins, it was suggested that the p19/gp37 interaction may be part of the driving force for the process of viral assembly and budding. furthermore, since host membrane glycoproteins are excluded from the viral membrane there must be some positive signal for inclusion of the viral env gene products in the budding virion. to determine whether the cytoplasmic domain is involved in this interaction and required for infectious virus assembly, we reconstructed a retrovirus genome carrying the "tail(-)" env gene mutation. surprisingly, such mutant viruses were infectious on avian cells and spread through the culture with similar efficiency to those containing a native env glycoprotein complex. furthermore, this truncated env gene complex was incorporated as efficiently into virus particles as the wild-type complex.
this fact suggests that if an interaction between gp37 and p19 is required to mediate the incorporation of the glycoproteins into the envelope of the budding viral particle, it must occur within the lipid bilayer, presumably with the hydrophobic anchor domain. it is thus unlikely that interactions between viral capsid proteins and the cytoplasmic domain of the env complex constitute a driving force for preferential incorporation of the viral glycoproteins in the avian retroviral envelope. what then is the function of the cytoplasmic domain of the env glycoprotein? since this segment of the viral polypeptide does show a region of conserved sequence, it is possible that it has evolved to facilitate transport to the plasma membrane without being a requirement for it; clearly, randomly inserted alterations within this domain can exert a negative effect on the transport process. while we have observed normal assembly and infection by virus encoding a "tail(-)" env product, it will be of interest to determine whether continued growth of the virus results in the dominant appearance of revertants that encode a functional cytoplasmic domain. many cell surface and membrane proteins of animal viruses are bound to the lipid bilayer by a membrane-spanning hydrophobic peptide close to the carboxy terminus of the polypeptide (reviewed by warren, 1981; armstrong et al., 1981). experimental evidence for this first came from deletion mutants of the influenza virus ha (gething and sambrook, 1982; sveda et al., 1982), the vsv g protein (rose and bergmann, 1982), and the minor coat protein of phage f1 (boeke and model, 1982) in which removal of sequences that encoded the cytoplasmic and membrane-spanning domains resulted in secretion of the protein. the hydrophobic membrane-spanning peptide of these polypeptides is thought to be an essential component of the cotranslational signal that results in the arrest of chain transfer across the rer membrane during synthesis.
these stop-translocation sequences have been proposed to be a region of the nascent protein molecule which halts insertion through the membrane by disassembling the translocation apparatus and thereby creates proteins with three topological domains (blobel, 1980). they appear to be inseparable from the anchor sequences (yost et al., 1983; rettenmier et al., 1985) since the transfer of intact transmembrane domains to normally secreted proteins has caused translocation of the constructed hybrid molecules to stop at the added sequences (yost et al., 1983; guan and rose, 1984). however, the precise structural and physical properties of the stop-translocation sequences have not been defined. wold et al. (1985) have suggested that the cytoplasmic domain of membrane-spanning proteins might act to interrupt translocation; however, this seems unlikely since deletion mutants lacking this domain are found to be associated with the membrane in a normal manner (see above; garoff et al., 1983; zuniga et al., 1983; murre et al., 1984; doyle et al., 1986). while length and sequence vary widely among regions described as transmembrane anchors, they do have characteristics in common. most often, they are long stretches (19-30 residues) of predominantly nonpolar and hydrophobic amino acids, bounded by charged residues, at the carboxy terminus of membrane proteins. membrane-spanning sequences have also been described, however, at the amino terminus of some viral proteins (blok et al., 1982; palmiter et al., 1978; bos et al., 1984; markoff et al., 1984; zerial et al., 1986; spiess and lodish, 1986) and in the middle of other proteins (rettenmier et al., 1985; kopito and lodish, 1985; finer-moore and stroud, 1984). we have investigated the structural requirements for a functional anchor/stop-translocation sequence in the rsv env system by constructing both deletion and point mutations in this region. a. deletion of the anchor domain.
during our studies on the role of the cytoplasmic domain in env product transport, we characterized a mutant (c3) in which the entire cytoplasmic and transmembrane domains were deleted. this mutant, in contrast to those described for the influenza virus ha and the vsv g protein, was arrested in its transport at the rer and thus was not secreted from the cell (wills et al., 1984) . pulse-chase experiments coupled with oligosaccharide precursor labeling experiments showed that the c3 polypeptide was not transported to the golgi complex, even though it accumulated in a soluble, nonanchored form in the lumen of the rer; the mutant thus appeared to lack a functional sorting signal. surprisingly, immunofluorescent labeling studies showed that the c3 protein (unlike the wild type) did not accumulate on the nuclear membrane but rather in vesicles distributed throughout the cytoplasm (fig. 7) , suggesting that movement to the nuclear membrane, blocked in c3, may require a specific transport event, even though the rer and nuclear membranes appear to be continuous. this hypothesis is supported by studies on the vsv g protein that indicate that transitional vesicles (for the transport of glycoproteins to the golgi apparatus) may be derived from "blebs" in the nuclear membrane (bergmann and singer, 1983) . although these studies raised the possibility that sorting signals might exist within the deleted region of c3, the 95 amino acid deletion in this mutant, which extends into the external domain of gp37, would be expected to prevent normal folding of the glycoprotein. to determine whether the transmembrane domain was required for intracellular transport, we have modified the env gene by oligonucleotide-directed mutagenesis, changing the lysine (aaa) codon, which precedes the hydrophobic domain of gp37, to an ochre nonsense codon (taa). 
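the effect of the lysine (aaa) to ochre (taa) change can be illustrated with a short python sketch. only the aaa to taa substitution and the amino acid sequence his-leu-leu-lys-gly-leu-leu-leu (the residues flanking the start of the hydrophobic domain in fig. 8) come from the text; the other codons, the function names, and the reduced codon table are hypothetical choices made for illustration, and the real env nucleotide sequence is not reproduced here.

```python
# Reduced codon table covering only the codons used below.
# None marks a termination codon (TAA is the ochre codon).
CODON_TABLE = {
    "CAT": "his", "CTG": "leu", "AAA": "lys", "GGC": "gly",
    "TAA": None, "TAG": None, "TGA": None,
}


def translate(dna):
    """Translate codon by codon and stop at the first termination codon."""
    peptide = []
    for start in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[start:start + 3]]
        if residue is None:
            break
        peptide.append(residue)
    return peptide


def substitute_codon(dna, codon_index, new_codon):
    """Return a copy of dna with the codon at codon_index replaced."""
    start = codon_index * 3
    return dna[:start] + new_codon + dna[start + 3:]


# Hypothetical coding fragment for his-leu-leu-lys-gly-leu-leu-leu.
wild_type = "CAT" "CTG" "CTG" "AAA" "GGC" "CTG" "CTG" "CTG"
# Replace the lysine codon (fourth codon, index 3) with the ochre codon.
mutant = substitute_codon(wild_type, 3, "TAA")

print(translate(wild_type))
print(translate(mutant))
```

in this sketch the mutant translation ends after his-leu-leu, so neither the hydrophobic domain nor anything downstream of it is made, which is the logic behind using a single nonsense codon to remove both carboxy-terminal domains precisely.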
this modified gene thus encodes a protein consisting of the entire external domain of pr95 and lacking precisely the hydrophobic membrane-spanning and hydrophilic cytoplasmic domain. the biosynthesis and intracellular transport of the truncated protein in cv-1 cells was not significantly different from that of the wild-type glycoprotein, suggesting that any protein signals for biosynthesis and intracellular transport of this viral glycoprotein complex must reside in its extracellular domain. in contrast to the case of the c3 mutant, this complex lacking just the transmembrane and cytoplasmic domains is secreted as a soluble molecule into the culture medium . since the glycoprotein complex lacking only the cytoplasmic domain of gp37 is stably expressed on the cell surface, in a manner similar to the wild-type complex, it can be concluded that the transmembrane domain alone is required for anchoring the rsv env complex in the cell membrane. b. requirements for a functional stop-translocation/anchor sequence. we have approached the question of the compositional requirements for membrane anchoring and orientation (stop-translocation) of a membrane-spanning protein by substituting an arginine for a centrally positioned leucine in the hydrophobic anchor region of the rsv env gene product (fig. 8) . 
fig. 8 (sequence panel; columns: mutant, membrane-spanning domain). vertical bars mark the boxed boundaries of the membrane-spanning domain, and [deleted] marks the position of the residues removed in each truncation mutant.
wt: his leu leu lys | gly leu leu leu gly leu val val ile leu leu leu leu val cys leu pro cys leu leu gln phe val ser ser ser ile | arg lys met
arginine mutant: his leu leu lys | gly leu leu leu gly leu val val ile leu leu leu leu val cys arg pro cys leu leu gln phe val ser ser ser ile | arg lys met
t24: his leu leu lys | gly leu leu leu gly leu val val ile leu leu leu leu val cys [deleted] leu leu gln phe val ser ser ser ile | arg lys met
t18: his leu leu lys | gly leu leu leu gly leu val val ile leu leu leu leu val [deleted] ser ser ser ile | arg lys met
t16: his leu leu lys | gly leu leu leu gly leu val val ile leu leu [deleted] val ser ser ser ile | arg lys met
t11: his leu leu lys | gly leu leu leu gly leu val [deleted] ser ser ser ile | arg lys met
t5: his leu leu lys | gly leu [deleted] ser ser ile | arg lys met
the arginine substitution is one of the most drastic compositional point mutations that could be made since arginine is only rarely found buried in hydrophobic environments (kyte and doolittle, 1982) and has a high predicted potential for terminating membrane-buried helices (rao and argos, 1986). the substitutions we have made fall within the conserved leucine-rich " i c " region proposed by patarca and haseltine (1984) and near the two cysteine residues where palmitate may be covalently added (gebhardt et al., 1984; kaufman et al., 1984). by changing the anchor's hydrophobic integrity through the insertion of point mutations we hoped to define better what constituted a functional anchor sequence. placing a highly charged basic side chain into the hydrophobic core of the membrane might be expected to either (1) terminate the membrane-spanning helix, thereby partitioning the charged residue to one side or the other of the membrane, or (2) destroy the stop-translocation signal, causing the protein to be secreted.
the results of these experiments showed that a single amino acid substitution in the transmembrane anchor did not affect membrane association or its orientation in the membrane; unexpectedly, however, it affected targeting of the protein at a stage late in the transport pathway, such that the mutant protein was rapidly degraded in lysosomes. the early translation products of both the arginine-mutant and wild-type genes behaved normally: they were synthesized with equal efficiency, had normal bitopic symmetry, and were glycosylated. the kinetics for the turnover of these precursors were nearly identical to those previously reported in infected chicken embryo fibroblasts (bosch and schwarz, 1984) and in sv40 expression vectors (wills et al., 1984). furthermore, in the golgi, palmitate was added to the precursors, they were cleaved to gp85-gp37, and they received terminal sugars. only after this last stage did the presence of the charged side chain of the substituted arginine alter expression. at the level of the trans golgi, a post-golgi compartment (saraste and kuismanen, 1984), or at the cell surface, the gp85-gp37 complex was rapidly shunted to lysosomes and degraded, as shown by the protection afforded the terminally glycosylated env proteins by the lysosomotropic agent, chloroquine. the exact pathway that the molecules take to the lysosome is not known. they may be transported directly from the trans golgi or first to the surface where they are rapidly endocytosed. discriminating between the alternate pathways has not been possible from current data. why the insertion of an arginine into the anchor should result in targeting to lysosomes is not obvious. a possible explanation is that we have introduced a specific sorting signal into the molecule; however, this is unlikely since other polypeptides with charged residues in the membrane-spanning domain are not so targeted (kabcenell and atkinson, 1985; saito et al., 1984; hayday et al., 1985).
the arginine's charge is incompatible with the hydrophobic environment of the lipid bilayer; to achieve stability, the charged guanidinium group needs to be neutralized, and how this is done inside the bilayer is not clear. parsegian (1969) has postulated that a lone charge sequestered in a membrane must form a pore or tunnel along with localized membrane thinning to achieve the lowest energy state. if the charged residue in the gp37 anchor causes the mutant molecules to aggregate and form channels in the membrane in an analogous manner, it would likely kill the cell unless there was a mechanism to remove it rapidly. alternatively, since the env protein is not an isolated entity in the membrane, it is conceivable that it aggregates with other components of the membrane to reduce net charge cooperatively, and thereby triggers the endocytotic machinery (mellman and plutner, 1984). charged residues are found in several proposed membrane-spanning helices (kabcenell and atkinson, 1985; saito et al., 1984; hayday et al., 1985; reviewed by rao and argos, 1986). the charged residues in bacteriorhodopsin membrane-spanning α helices have been suggested to be neutralized by forming ion pairs (engelman et al., 1980). this is likely a special case, however, since the energy required to bury an ion pair in the membrane is not much different from that required to bury the free charged group itself (parsegian, 1969). neutralization of strong charges, particularly of lysine and arginine, may occur through the formation of strong hydrogen bonds with tyrosine (kyte and doolittle, 1982); however, no tyrosine residues that could interact with the arginine are present in the anchor domain of the env gene product.
the t cell α, β, and γ gene products and the rotavirus vp7 protein have a putative structure similar to that of the arginine mutant, with a lysine centered within the transmembrane anchor; however, unlike the molecule we created, they invariably have tyrosine residues adjacent to the lysine (saito et al., 1984; kabcenell and atkinson, 1985; hayday et al., 1985) which could stabilize the charge through hydrogen bonds (kyte and doolittle, 1982). adams and rose (1985a) have described the similar insertion of an arginine (and glutamine) residue at the center of the transmembrane domain of the vsv g protein. their mutant protein, like the env mutant we have characterized, was bitopic, could be seen localizing in the golgi, and did not accumulate on the cell surface. since these investigators observed a "lower level of protein expression" with their mutant g protein, it is possible that it also was rapidly degraded in lysosomes following terminal glycosylation. in contrast, no alteration was observed in the biosynthesis and transport of a mutant p62 polypeptide of semliki forest virus in which the hydrophobic domain was interrupted by insertion of a glutamic acid residue. in this protein, however, the outer boundary of the hydrophobic domain is not delineated by a charged residue and so it is possible that additional uncharged residues from the external domain were pulled into the membrane. since the insertion of a charged polar residue into the transmembrane region of the rsv env gene product did not interfere with its anchor/stop-translocation function, we have investigated the requirement for the long (27 amino acid) hydrophobic domain in arresting translocation and anchoring the env complex. a series of deletion mutations was generated by progressively removing base pairs to either side of a unique sphi restriction site that had been previously engineered into the center of the anchor coding region.
this produced env proteins with truncated transmembrane anchors that ranged in length from 24 (t24) to a single apolar amino acid (t1) (summarized in fig. 8). while the effects of the deletions on the transport and subcellular localization of the env gene product appeared to be a complex function of the length and composition of the remaining anchor, the mutants appeared to fall into three broad phenotypic classes (summarized in table i). even the smallest deletion (t24), which removed only three amino acids, greatly reduced the surface expression of the mature env proteins. t24 and mutants t18, t17, and t16 had a normal bitopic orientation in the membrane but appeared to be cleared from the cell surface and degraded in lysosomes, since they accumulated only in the presence of chloroquine, an inhibitor of lysosomal degradation.
table i footnotes. a as determined by the distance between charged residues. b hydrophobicity score as determined using the hydropathicity values of kyte and doolittle (1982). mean hydrophobicity is hydrophobicity score divided by the apparent anchor length. c symbols: + denotes either a wild-type or positive response; ± denotes an intermediate response; - denotes a negative response. d sec, secreted. e nd/na, not determined/not applicable. f with chloroquine treatment, pr95 appears in the medium with gp85 and gp37. g anchorless and tailless deletion mutant.
the reduced surface expression of the mutant retaining the longest anchor (t24) was surprising since the effective hydrophobic peptide remaining in this construct was as long or longer than functional anchors reportedly present in other integral membrane proteins [e.g., 19 amino acids: m2 protein of influenza (lamb, 1985); 20 amino acids: vsv g protein (rose et al., 1980) and adenovirus e3 protein (wold et al., 1985)]. deletions which reduced the size of the transmembrane anchor to seven amino acids (t7) or less resulted in the secretion of mature glycoproteins into the medium.
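the two quantities defined in the table i footnotes (apparent anchor length as the distance between charged residues, and hydrophobicity score as a sum of kyte and doolittle hydropathy values) can be computed as in the following python sketch. the hydropathy values are those of the published kyte and doolittle (1982) scale and the sequence is the wild-type sequence of fig. 8 in one-letter code. the function names are hypothetical, and two choices are assumptions made here rather than statements from the text: only asp, glu, lys, and arg are treated as charged (histidine is not), and the anchor is taken as the longest uninterrupted run of uncharged residues.

```python
# Kyte and Doolittle (1982) hydropathy values, one-letter amino acid codes.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumption: histidine is not counted as a charged residue.
CHARGED = set("DEKR")


def apparent_anchor(sequence):
    """Return the longest run of consecutive uncharged residues."""
    best = current = ""
    for residue in sequence:
        if residue in CHARGED:
            current = ""
        else:
            current += residue
            if len(current) > len(best):
                best = current
    return best


def hydrophobicity_score(segment):
    """Sum the Kyte-Doolittle hydropathy values over a segment."""
    return sum(KYTE_DOOLITTLE[residue] for residue in segment)


def mean_hydrophobicity(segment):
    """Hydrophobicity score divided by the segment length."""
    return hydrophobicity_score(segment) / len(segment)


# Wild-type sequence from fig. 8: flanking residues plus the 27-residue domain.
WILD_TYPE = "HLLK" + "GLLLGLVVILLLLVCLPCLLQFVSSSI" + "RKM"

anchor = apparent_anchor(WILD_TYPE)
print(len(anchor), hydrophobicity_score(anchor), mean_hydrophobicity(anchor))
```

for the wild-type sequence the run bounded by the flanking lysine and arginine is 27 residues long, in agreement with the anchor length given in the text; applying the same functions to the truncated sequences of fig. 8 gives the corresponding values for each mutant under the stated assumptions.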
a third class of mutants, with hydrophobic regions of 14 (t14) and 11 (t11) amino acids, respectively, while remaining membrane associated, no longer appeared to span the rer membrane as bitopic proteins. neither mutant could be found at the surface of cells, nor could their degradation be arrested by chloroquine treatment. from the sum of the data obtained with these mutants, it would appear that bitopic insertion of the env gene product is possible with effective anchor domains of at least 16 amino acids; if additional amino acids are removed from the domain, the protein can no longer exist bitopically, and it either partitions monotopically to the luminal side of the rer membrane or withdraws amino acids from the cytoplasmic side of the membrane into the bilayer. nevertheless, such sequences from the cytoplasmic domain are not able to stabilize the shortest anchors (t1, t5, and t7), since these are secreted from the cell. davis and model (1985) have investigated the requirements of a functional anchor domain by inserting artificial hydrophobic peptides of varying length into the membrane-associated piii protein of bacteriophage f1. their results show that 17 hydrophobic amino acids are sufficient to maintain the protein in a bitopic configuration; however, the 17 amino acid anchor was "deleterious to the cell," presumably because it was too short to assume a stable conformation compatible with existence in the bilayer and thereby destabilized the membrane. a construct with an anchor of 12 hydrophobic amino acids was membrane associated but showed an intermediate phenotype in its sensitivity to solubilization by alkali. in contrast, a construct with only 8 hydrophobic amino acids, although also membrane associated, was completely released into the supernatant at high ph. adams and rose (1985b) reduced the anchor domain of vsv g by precisely deleting amino acids from within the hydrophobic core.
when the length of the anchor was reduced from 20 amino acids to as few as 14, the protein was normally membrane associated and expressed in a bitopic fashion on the cell surface. on the other hand, proteins with an anchor domain of 12 or 8 amino acids, while spanning the membrane, appeared to be transported only as far as the golgi, where they accumulated; the surface expression of these proteins was greatly reduced (12 amino acid anchor) or undetectable (8 amino acids). doyle et al. (1986) have characterized a series of carboxy-terminal deletion mutants of the ha polypeptide in which the 27 amino acid anchor domain was truncated to 17, 14, and 9 amino acids, respectively. in this case, molecules with a 17 amino acid long transmembrane domain were stably anchored but were transported less efficiently to the plasma membrane. truncation of the hydrophobic anchor to 9 or 14 residues resulted in ha proteins that were unstable and whose transport appeared to be blocked in the rer or in a pre-golgi compartment, resembling the t11 mutant of the rsv env gene described above. from the results of these different systems and approaches it would seem that, in strictly physical terms, anchors may be significantly reduced in length without consequence for membrane association. the lower limit for this length appears to be about 8-12 amino acids, but it must be appreciated that the mere presence of a stretch of hydrophobic amino acids within a protein does not serve to constitute an anchor. several mammalian virus envelope proteins contain in their external domain long hydrophobic amino acid regions that are equivalent to the truncated bitopic anchors we have described here (gething et al., 1978; white et al., 1981). indeed, gp85 contains a strongly hydrophobic 11 amino acid region that clearly does not act as a stop-translocation sequence or play a role in membrane association.
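for reference, the three phenotypic classes reported above for the rsv env anchor deletion mutants can be collected in a small lookup table. this is only a bookkeeping sketch in python: the mutant names and phenotypes are those given in the text, while the dictionary, the function name, and the decision to return nothing for anchor lengths that were not tested are assumptions made for illustration.

```python
from typing import Optional

# Phenotypic classes as described in the text, keyed by remaining anchor length
# (in amino acids) of each RSV env deletion mutant named there.
ENV_MUTANT_CLASS = {
    24: "bitopic; cleared from the surface and degraded in lysosomes",  # T24
    18: "bitopic; cleared from the surface and degraded in lysosomes",  # T18
    17: "bitopic; cleared from the surface and degraded in lysosomes",  # T17
    16: "bitopic; cleared from the surface and degraded in lysosomes",  # T16
    14: "membrane associated but not bitopic",                          # T14
    11: "membrane associated but not bitopic",                          # T11
    7: "secreted into the medium",                                      # T7
    5: "secreted into the medium",                                      # T5
    1: "secreted into the medium",                                      # T1
}


def env_mutant_phenotype(anchor_length: int) -> Optional[str]:
    """Return the reported class, or None if that length is not described."""
    # No interpolation: lengths between the tested mutants are left undefined.
    return ENV_MUTANT_CLASS.get(anchor_length)


if __name__ == "__main__":
    for length in (24, 16, 14, 12, 7):
        print(length, env_mutant_phenotype(length))
```

leaving untested lengths undefined is deliberate, since the text stresses that the phenotype is a complex function of both the length and the composition of the remaining anchor.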
the work of davis and model (1985), on the other hand, implies that the length of a hydrophobic region is the major determinant as to whether or not it will confer membrane association properties to a protein, although they point out that the position of such sequences within the molecule may play a role. most eukaryotic membrane-spanning polypeptides have a complex tertiary structure that is stabilized by multiple disulfide linkages, and it is possible that the entropy of a correctly folded molecule is sufficient to pull potential stop-transfer regions through the membrane. a corollary of this hypothesis, therefore, is that, once folding is complete, short hydrophobic regions that can potentially span the membrane as an α helix would stop translocation. the length requirements for such a region could be shorter than the 20 residues predicted from α-helix dimensions if the region were flanked by arginines or lysines. since the latter have long side chains, equivalent in length to a single turn of an α helix, a stretch of hydrophobic amino acids 13-14 residues long might be sufficient. such a prediction fits well with the data we have obtained and with those of adams and rose (1985b) and davis and model (1985). it will be of interest to determine what effect inserting the truncated anchors of mutants t18 and t16 into the middle of the env precursor has on translocation; if the above speculations are correct, they should be extruded into the external domain. finally, it should be reemphasized that merely providing a bitopic membrane anchor/stop-translocation sequence is not sufficient to confer wild-type biological activity on a polypeptide. mutant t24 of the rsv env gene has a hydrophobic domain which might be expected to be sufficiently long and hydrophobic in character to span the membrane stably; it is modified normally by palmitic acid and can clearly be transported to its targeted cellular location. nevertheless, it is degraded rapidly by the cell.
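the helix-length argument made earlier in this discussion (about 20 residues to span the bilayer, reduced to 13-14 when the segment is flanked by arginines or lysines) can be checked with simple arithmetic. the python sketch below uses the standard α-helix geometry of 1.5 Å rise per residue and 3.6 residues per turn; the 30 Å thickness of the hydrophobic core and the function name are assumptions introduced here, not values stated in the text.

```python
import math

RISE_PER_RESIDUE = 1.5   # angstroms of axial rise per residue in an alpha helix
RESIDUES_PER_TURN = 3.6  # residues in one turn of an alpha helix
BILAYER_CORE = 30.0      # assumed thickness of the hydrophobic core, in angstroms


def residues_to_span(thickness: float, flanking_basic_residues: int = 0) -> float:
    """Helical residues needed to span `thickness` angstroms.

    Each flanking arginine or lysine is credited with one helical turn,
    following the argument that its long side chain is equivalent in
    length to a single turn of an alpha helix.
    """
    full_length = thickness / RISE_PER_RESIDUE
    return full_length - flanking_basic_residues * RESIDUES_PER_TURN


if __name__ == "__main__":
    print(residues_to_span(BILAYER_CORE))                # unflanked segment
    print(math.ceil(residues_to_span(BILAYER_CORE, 2)))  # one basic residue at each end
```

under these assumptions an unflanked segment needs 20 residues, and a segment with one basic residue at each end needs about 12.8, which rounds up to 13 and is in line with the 13-14 residue estimate.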
these results imply that hydrophobic transmembrane domains contain additional (and perhaps subtle) signals that remain to be deciphered, a conclusion that is supported by the finding that deletion of the anchor domain of the rotavirus vp7 protein abolishes its specific targeting and retention in the rer (poruchynsky et al., 1985).

polarized epithelial cells exhibit apical and basolateral membrane domains that are separated by well-defined tight junctions. each membrane domain has a unique protein composition (louvard, 1980; reggio et al., 1982), indicating that mechanisms must exist to specifically target membrane proteins to different surfaces. the sorting process occurs during or shortly after passage of the glycoproteins through the golgi complex (pfeiffer et al., 1985; rindler et al., 1984, 1985). however, the mechanisms determining this directed transport to either the apical or basolateral membranes are not understood. their study has been facilitated by the use of cultured epithelial cell lines, such as the mdck cell line, and by the observation that certain rna viruses bud exclusively from apical or basolateral domains of these polarized cells in culture (rodriguez-boulan and sabatini, 1978; herrler et al., 1981; roth et al., 1983a; rindler et al., 1985). avian and mammalian retroviruses, together with rhabdoviruses such as vsv, mature from the basolateral surface, while ortho- and paramyxoviruses bud from the apical surface. the carbohydrate residues present on the different proteins do not appear to play a role in this sorting process, since tunicamycin does not interfere with the polarized release of the viruses (roth et al., 1979; green et al., 1981b). as with the maturation of viruses that assemble at intracellular locations within the secretory pathway, polarized budding of enveloped viruses is dependent on the site to which viral glycoproteins are transported.
expression of cloned viral glycoprotein genes from both sv40-based and vaccinia expression vectors in polarized cells has demonstrated that the ha and neuraminidase polypeptides of influenza virus are targeted to the apical surface (roth et al., 1983b; jones et al., 1985; gottlieb et al., 1986), while the g protein of vsv and the gp70/p15e complex of murine leukemia virus (mulv) are transported exclusively to the basolateral membranes. in an attempt to locate the signals which direct these glycoproteins to the apical or basolateral domains, recombinant dna techniques have been employed to construct chimeric proteins and express these in polarized cells in culture. in experiments where sequences encoding the external domain of ha were fused to those encoding the transmembrane and cytoplasmic domains of the vsv g protein, the hybrid glycoprotein behaved in the same manner as wild-type ha and was transported to the apical domain of polarized cells (mcqueen et al., 1986; roth et al., 1986). conversely, fusing the external domain of g protein to the anchor/cytoplasmic domain of ha results in basolateral transport (mcqueen et al., 1987). these experiments thus suggest that the ectodomains of ha and g protein contain signals for apical and basolateral transport, respectively. while expression of a secreted form of the ha glycoprotein in an apical polarized manner supports this conclusion (roth et al., 1986), the unanchored ectodomain of the mulv gp70/p15e complex, which is normally targeted to the basolateral membrane, is secreted in a nonpolarized fashion. it is possible that this soluble protein is improperly folded and thus is unable to interact with the sorting machinery; alternatively, this result raises the possibility that targeting signals may be located in more than one domain of these molecules. further analyses should shed light on this problem.
the studies described in this chapter demonstrate the breadth of information that has been and can be obtained from studies on enveloped virus glycoprotein biosynthesis. many of the studies were performed at a time when the cloned genes and molecular probes for cellular glycoproteins were unavailable and thus provided valuable insights into the manner in which cells compartmentalized and transported membrane proteins. the exciting possibility of utilizing viral glycoprotein genes for genetic analyses of the transport pathway, in a way analogous to that pursued in prokaryotic systems (michaelis and beckwith, 1982; silhavy et al., 1983; oliver, 1985), has led to a plethora of studies that utilized both classic and recombinant dna genetic approaches. these investigations have resulted in great progress in our understanding of the general processes involved in intracellular transport of proteins through the secretory pathway but at the same time have raised difficult questions about the molecular interactions required for protein sorting. the initial observation that proteins destined for secretion contain an amino-terminal sorting sequence provided a precedent on which to build models for protein targeting based on topogenic sequences (blobel, 1980). to a large extent the identification of signal sequences was facilitated by their transient nature, not by a conserved primary sequence. indeed, while some common characteristics of signal peptides can be recognized (von heijne, 1985), the identification of signal peptides in proteins where they are not removed has proved difficult and has required the use of sophisticated recombinant dna technology [e.g., ovalbumin (tabe et al., 1984)].
the requirement for a signal sequence to initiate translocation across the er membrane has been clearly confirmed through the isolation and construction of mutants which lack this functional region (as discussed above) and by fusion of this sequence to proteins that are not normally translocated (lingappa et al., 1984). these approaches have also been facilitated by the transient, nonstructural nature of many amino-terminal translocation signals. recent experiments by friedlander and blobel (1985) and kaiser et al. (1987), however, raise questions about the informational content of signal sequences. in particular, kaiser and colleagues showed that several random amino acid sequences derived from human genomic dna fragments could act as signal sequences for translocation of the yeast invertase enzyme. thus even in this well-defined situation, where an amino acid sequence is known to play a functional role in the sorting process, it can be impossible to predict with confidence its location in the protein; how then might we expect to identify additional sorting sequences that may or may not exist within the structural domain of a transported protein? the possibility that additional sorting sequences might be involved in steering the transport of a membrane-spanning or secreted protein through the vesicular maze of the secretory pathway remains open. at the present time the necessity for a native tertiary structure cannot be separated from the possibility of such additional sorting sequences. it is clear that disruption of a polypeptide's normal folding can completely prevent its transport from the er (kreis and lodish, 1986), and the simplest explanation for the phenotypes of a variety of conditional and nonconditional transport mutants would be that they alter the tertiary structure of the mature protein (see section ii, a, above).
a three-dimensional structure has been determined for only a few molecules that traverse the secretory pathway, and even with these proteins the current predictive algorithms are insufficiently accurate to model potential changes in molecular shape in response to mutations. the question of a direct role for tertiary structure in protein transport thus represents a major challenge to molecular biologists. furthermore, one might argue that a change in protein shape could also mask or distort a necessary (peptide) sorting sequence. this possibility is supported by the observation that for a majority of eukaryotic proteins the amino-terminal signal peptide is unable to initiate translocation across the er if translation is allowed to proceed to completion, presumably because the tertiary structure of the nascent polypeptide precludes the interaction of the signal peptide with the translocation machinery. it is quite feasible that sorting signals and targeting signals could be represented by different entities within a single polypeptide, particularly if the latter were required to fix the intracellular location of a protein. for example, the rotavirus vp7 polypeptide accumulates within the er unless its amino-terminal hydrophobic anchor region is deleted, whereupon it is transported to the cell surface and secreted (poruchynsky et al., 1985); in this instance the deleted region presumably contains a sequence that can fix the intracellular location of the protein despite the fact that the molecule has the potential to be exported from the cell. since most translocated polypeptides appear to follow a common pathway to a late compartment of the golgi (kelly, 1985), it might be argued that a native conformation is the sole requirement for transport to this organelle and that the observed differences in the transport rates of proteins to the golgi merely reflect the time necessary for completion of the folding process.
nevertheless, proteins leaving the golgi appear to be sorted into specific pathways; for example, in secretory cells proteins may either follow the constitutive pathway or be sequestered in secretory granules (moore and kelly, 1985; reviewed by kelly, 1985), and in epithelial cells specific proteins appear to be transported directly to either the apical or basolateral membranes (see above). thus, it would seem likely that some form of sorting signal must be present in the polypeptide at this point in the secretory pathway in order to correctly direct its transport; initial results from viral glycoprotein expression studies indicate that, at least in polarized cells, the ectodomain of the sorted protein plays a dominant role (mcqueen et al., 1986). additional studies should provide a clearer picture of this complex process.

in summary, studies on the biosynthesis and transport of enveloped virus glycoproteins have provided important insights into the general processes involved in the intracellular movement of these membrane-associated molecules. the specific questions that remain to be answered are many and difficult, but it is likely that these viral systems will continue to play a vital role by providing clues and direction in this important area of cell biology.
incorporation of a charged amino acid into the membrane spanning domain blocks cell surface transport but not membrane anchoring of a viral protein structural requirements of a membrane-spanning domain for protein anchoring and cell surface transport sequence analysis of two mutants of sindbis virus defective in the intracellular transport of their glycoproteins domain structure of bacteriophage fed adsorption protein intragenic suppressor mutations that restore export of maltose binding protein with a truncated signal peptide information within the mature lamb protein necessary for localization to the outer membrane of eseherichia coli k-12 genetic analysis of protein export in escherichia coli k-12 immunoelectron microscopic studies of the intracellular transport of the membrane glycoprotein (g) of vesicular stomatitis virus in infected chinese hamster ovary cells passage of an integral membrane protein, the vesicular stomatitis virus glycoprotein, through the golgi apparatus en route to the plasma membrane bunyaviridae. in "comprehensive virology intracellular protein topogenesis transfer of proteins across membranes. i. 
presence of proteolytically processed and unprocessed nascent immunoglobulin light chains on membrane-bound ribosomes of murine myeloma studies on the size, chemical composition and partial sequence of the neuraminidase (na) from type a influenza viruses show the nterminal region of na is not processed and serves to anchor na in the viral membrane a prokaryotic membrane anchor sequence: carboxyl terminus of bacteriophage fl gene iii protein retains it in the membrane processing of filamentous phage precoat protein: effect of sequence variations near the signal peptidase cleavage site nh 2 -terminal hydrophobic region of influenza virus neuraminidase provides the signal function in translocation processing of gpr92env, the precursor to the glycoproteins of rous sarcoma virus: use of inhibitors of oligosaccharide trimming and glycoprotein transport sequence of the membrane protein gene from avian coronavirus ibv virus-specific glycoproteins associated with the nuclear fraction of herpes simplex virus type 1-infected cells mutants of the membrane-binding region of semliki forest virus e2 protein. i. cell surface transport and fusgenic activity mutants of the membrane-binding region of semliki forest virus e2 protein. ii. 
topology and membrane binding fusion mutants of the influenza virus hemagglutinin glycoprotein isolation of the escherichia coli leader peptidase gene and effects of leader peptidase overproduction in vivo a charged amino acid substitution within the transmembrane anchor of the rous sarcoma virus envelope glycoprotein affects surface expression but not intracellular transport altered surface expression, membrane association and intracellular transport result from deletions within the transmembrane anchor of the rous sarcoma virus envelope glycoprotein an artificial anchor domain: hydrophobicity suffices to stop transfer fine structure of a membrane anchor domain variant influenza virus hemagglutinin that induces fusion at elevated ph mutations in the cytoplasmic domain of influenza virus hemagglutinin affect different stages of intracellular transport analysis of progressive deletions of the transmembrane and cytoplasmic domains of influenza hemagglutinin assembly of enveloped rna viruses early and late functions associated with the golgi apparatus reside in distinct compartments localization and processing of outer membrane and periplasmic proteins in escherichia coli strains harboring export-specific suppressor mutations suppressor mutations that restore export of a protein with a defective signal sequence importance of secondary structure in the signal sequence for protein secretion path of the polypeptide in bacteriohodopsin amphipathic analysis and possible formation of the ion channel in an acetylcholine receptor evidence for a glycoprotein "signal" involved in transport between subcellular organelles bovine opsin has more than one signal sequence characterization of two recombinant-complementation groups of uukuniemi virus temperature-sensitive mutants uukuniemi virus glycoproteins accumulate in and cause morphological changes of the golgi complex in the absence of virus maturation nucleotide sequence of a cdna clone encoding the entire glycoprotein from the 
new jersey serotype of vesicular stomatitis virus a single amino acid substitution in a hydrophobic domain causes temperature-sensitive cell-surface transport of a mutant viral glycoprotein using recombinant dna techniques to study protein targeting in the eucaryotic cell expression of semliki forest virus proteins from cloned complementary dna. ii. the membrane-spanning glycoprotein e2 is transported to the cell surface without its normal cytoplasmic domain rous sarcoma virus pl9 and gp35 can be chemically crosslinked to high molecular weight complexes. an insight into viral association cold spring harbor laboratory, cold spring habor construction of influenza haemagglutinin genes that code for intracellular and secreted forms of the protein purification of the fusion protein of sendai virus: analysis of the nh 2 -terminal sequence generated during precursor activation cloning and dna sequence of double-stranded copies of haemagglutinin genes from h2 and h3 strains elucidates antigenic shift and drift in human influenza virus expression of wild-type and mutant forms of influenza hamagglutinin: the role of folding in intracellular transport synthesis and infectivity of vesicular stomatitis viruses containing nonglycosylated g protein the nonglycosylated glycoprotein of vesicular stomatitis virus is temperature-sensitive and undergoes intracellular aggregation at elevated temperatures protein translocation across the endoplasmic reticulum. i. detection in the microsomal membrane of a receptor for the signal recognition particle protein translocation across the endoplasmic reticulum. ii. 
isolation and characterization of the signal recognition particle receptor the mechanism of protein translocation across the endoplasmic reticulum membrane sorting and endocytosis of viral glycoproteins in transfected polarized epithelial cells passage of viral membrane proteins through the golgi complex glycosylation does not determine segregation of viral envelope proteins in the plasma membrane of epithelial cells conversion of a secretory protein into a transmembrane protein results in its transport to the golgi complex but not to the cell surface glycosylation allows cell-surface transport of an anchored secretory protein two distinct intracellular pathways transport secretory and membrane glycoproteins to the surface of pituitary tumor cells amino-terminal deletion mutants of the rous sarcoma virus glycoprotein do not block signal peptide cleavage but block intracellular transport biosynthesis of lysosomal enzymes in fibroblasts: phosphorylation of mannose residues structure, organization, and somatic rearrangement of t cell a genes antitrypsin: the presence of excess mannose in the z variant isolated from liver isolation and structural analysis of influenza virus c virion glycoproteins studies on the mechanisms of tunicamycin inhibition of iga and ige secretion by plasma cells the importance of the endosome in intracellular traffic sequence of the long terminal repeat and adjacent segments of the endogenous avian virus rous-associated virus complete sequence of the rous sarcoma virus env gene: identification of structural and functional regions of its product mechanism of signal peptide cleavage in the biosynthesis of the major lipoprotein of the escherichia coli outer membrane phospholipid is required for the processing of presecretory proteins by detergent-solubilized canine pancreatic signal peptidase intracellular transport of herpes simplex virus gd occurs more rapidly in uninfected cells than in infected cells surface expression of influenza virus 
neuraminidase an amino-terminally anchored viral membrane glycoprotein, in polarized epithelial cells processing of the rough endoplasmic reticulum membrane glycoproteins of rotavirus sah many random sequences functionally replace the secretion signal sequence of yeast invertase cysteines in the transmembrane region of major histocompatability complex antigens are fatty acylated via thioester bonds pathway of protein secretion in eukaryotes ph-induced alterations in the fusogenic spike protein of semlilci forest virus membrane fusion mutants of semliki forest virus separate pathways of maturation of the major structural proteins of vesicular stomatitis virus maturation of viral proteins in cells infected with temperature-sensitive mutants of vesicular stomatitis virus primary structure and transmembrane orientation of the murine anion exchange protein oligomerization is essential for transport of the vesicular stomatitis virus glycoprotein to the cell surface uukuniemi virus maturation: an immune fluorescence microscopy study using monoclonal glycoprotein-specific antibodies a simple method for displaying the hydropathic character of a protein influenza m2 protein is an integral membrane protein expressed on the infected-cell surface impaired intracelluiar migration and altered solubility of nonglycosylated glycoproteins of vesicular stomatitis virus and sindbis virus kinetics of serum protein secretion by cultured hepatoma cells: evidence for multiple secretory pathways determinants for protein localization: /3-lactamase signal sequence directs globin across microsomal membranes hen oviduct signal peptidase is an integral membrane protein reversible block in intracellular transport and budding of mutant vesicular stomatitis virus glycoprotein hepatoma secretory proteins migrate from rough endoplasmic reticulum to golgi at characteristic rates apical membrane aminopeptidase appears at site of cell-cell contact in cultured kidney epithelial cells l985). 
a single n-linked oligosaccharide at either of the two normal sites is sufficient for transport of vesicular stomatitis virus g protein to the cell surface polarized expression of a chimeric protein in which the transmembrane and cytoplasmic domains of the influenza virus hemagglutinin have been replaced by those of the vesicular stomatitis virus g protein basolateral expression of a chimeric protein in which the transmembrane and cytoplasmic domains of vesicular stomatitis virus g protein have been replaced by those of the influenza virus hemagglutinin glycosylation and surface expression of the influenza virus neuraminidase requires the n-terminal hydrophobic region the entry of enveloped viruses into cells by endocytosis internalization and degradation of macrophage fc receptors bound to polyvalent immune complexes acidification of the endocytic and exocytic pathways identification and characterization of a membrane component essential for the translocation of nascent proteins across the membrane of the endoplasmic reticulum secretory protein translocation across membranes-the role of the "docking protein mechanism of incorporation of cell envelope proteins in escherichia coli secretory protein targeting in a pituitary cell line: differential transport of foreign secretory proteins to distinct secretory pathways structural mutations in a mouse immunoglobulin light chain resulting in failure to be secreted construction, expression and recognition of an h-2 molecule lacking its carboxyl terminus identification of the defects in the hemagglutinin gene of two temperature-sensitive mutants of a/wsn/33 influenza virus membrane-bound penicillinases in grampositive bacteria carbohydrate moieties of glycoproteins, a reevaluation of their function protein secretion in escherichia coli ovalbumin: a secreted protein without a transient hydrophobic leader sequence energy of an ion crossing a low dielectric membrane: solutions to four relevant problems similarities among 
retro virus proteins mutations within the proteolytic cleavage site of the rous sarcoma virus glycoprotein precursor block processing to gp85 and gp37 mutants of the rous sarcoma virus envelope glycoprotein that lack the transmembrane anchor and/or cytoplasmic domains: analysis of intracellular transport and assembly into virions a putative signal peptidase recognition site and sequence in eucaryotic and procaryotic signal peptides reversible defect in the glycosylation of the membrane proteins of semliki forest virus tsl mutant localization of rotavirus antigens in infected cells by ultrastructural immunocytochemistry intracellular sorting and basolateral appearance of the g protein of vesicular stomatitis virus in mdck cells complete nucleotide sequence of an influenza virus haemagglutinin gene from cloned dna deletions into an nh 2 -terminal hydrophobic domain result in secretion of rotavirus vp7, a resident endoplasmic reticulum glycoprotein export of protein in bacteria a conformation preference parameter to predict helices in integral membrane proteins surface and cytoplasmic domains in polarized epithelial cells transmembrane orientation of glycoproteins encoded by the w-fms oncogene viral glycoproteins destined for apical or basolateral plasma membrane domains traverse the same golgi apparatus during their intracellular transport in doubly infected madin-darby canine kidney cells polarized delivery of viral glycoproteins to the apical and basolateral plasma membranes of madin-darby canine kidney cells infected with temperature-sensitive viruses asymmetric budding of viruses in epithelial monolayers: a model system for study of epithelial polarity intracellular transport of influenza virus hemagglutinin to the apical surface of madin-darby canine kidney cells expression from cloned cdna of cell-surface secreted forms of the glycoprotein of vesicular stomatitis virus in eukaryotic cells altered cytoplasmic domains affect intracellular transport of the 
key: cord-014685-ihh30q6f
authors: nan title: posters p788 p999 date: 2005-09-21 journal: eur biophys j doi: 10.1007/s00249-005-0504-x sha: doc_id: 14685 cord_uid: ihh30q6f nan

in order to understand mrna stability, we study a model system for mrna lifetime composed of the regb ribonuclease and the ribosomal protein s1. the t4 bacteriophage life cycle is modulated by regb, which mediates specific mrna degradation. regb cleaves the mrna in the middle of the translation initiation region (shine-dalgarno, sd). studies have shown that regb activity is enhanced by the addition of s1, which is normally required for the translation of mrnas with unusual sd regions. s1 is considered a key factor in translation initiation. composed of six similar domains (f1-f6), it interacts with the ribosome through its f1-f2 domains and with the mrna through f3-f6. we have shown that the f345 fragment has the same properties as the whole protein. many global architectures have been published (linear or globular conformations). we therefore try to understand how s1 acts in regb activation and in rna recognition. we study the global conformation of the f345 fragment, using nmr to determine the domain interfaces. first, we made the nmr backbone assignment of 15n/13c/2h-labeled fragments (f34-f45). then we analysed the overlap of the hsqc spectra of the fragments in order to identify interface residues. locating these residues on a homology model of s1 allowed us to determine the interaction regions between domains. we are currently studying s1-rna interactions to determine whether the ligand could change the interaction region and induce conformational changes.

translocation of amino acids from the membrane interface to the interior: theory and experiment
c. aisenbrey 1 , e. goormaghtigh 2 , j.-m. ruysschaert 2 , b.
bechinger 1 1 ulp/cnrs, chemistry, 4 rue blaise pascal, 67070 strasbourg, 2 université libre, campus plaine cp206/2, 1050 brussels
the interactions of a series of histidine-containing peptides with biological model membranes have been investigated by attenuated total reflection fourier transform infrared (atr-ftir) and oriented solid-state nmr spectroscopies. related peptides have previously been shown to exhibit antibiotic and dna transfection activities. the 26-residue lah4x4 and lah4x6 peptides were designed to form amphipathic helical structures in membrane environments. four histidines and four or six variable amino acids x constitute one face of the helix, whereas leucines and alanines characterize the opposite hydrophobic surface. the dichroic ratio of the atr-ftir spectra, or the orientation-dependent 15n solid-state nmr chemical shift, has been used to follow the transition from in-plane to transmembrane alignments upon increase in ph. a theoretical model of the topological modulations is presented, and the experimental transition curves are analysed in order to reveal the gibbs free energy of transition. the novel concept provides access to the free energy changes associated with the amino acids x incorporated into an extended α-helix in the context of phospholipid bilayers. for the peptides of the lah4x4 series, the gibbs free energies associated with the transition from the membrane interface to the bilayer interior follow the sequence of amino acids: l < a ≈ i < s ≈ f < t ≈ g < v ≈ w << y.

here we present a novel solid-state nmr approach which allows for the accurate determination of the tilt and rotational pitch angles of peptides reconstituted into uniaxially oriented membranes. the method works with transmembrane or in-plane oriented peptides that have been labelled with 3,3,3-2h3-alanine and 15n-leucine at two selected sites.
proton-decoupled 15n and 2h solid-state nmr spectroscopy at sample orientations with the membrane normal parallel to the magnetic field direction has been used to characterize the tilt and rotational pitch angles of these peptides in considerable detail. the same samples, when inserted into the magnetic field at alignments tilted by 90 degrees, provide valuable information on the rotational diffusion constants in membranes and thereby on the association and size of peptide complexes within the membrane environment. whereas monomeric transmembrane peptides exhibit spectral averaging and well-defined resonances, larger complexes are characterized by broad spectral line shapes. in particular, the deuterium line shape is sensitive to the association of a few transmembrane helices. in contrast, the formation of much larger complexes affects the 15n chemical shift spectrum. aisenbrey ch. & bechinger, b. biochemistry 43, 10502-10512 (2004)

a meccano set approach of joining trpzip, a water soluble β-hairpin peptide, with a didehydrophenylalanine containing hydrophobic helical peptide
p. chetal, v. chauhan, d. sahal international centre for genetic engineering & biotechnology, new delhi 110067, india
a 16-residue, water soluble, monomeric β-hairpin peptide, "trpzip" [cochran et al (2001) pnas 98, 5578-5583], stabilized by a tryptophan zipper, has been linked via a tetraglycyl linker to a hydrophobic, didehydrophenylalanine (Δf) containing helical octapeptide. circular dichroism studies of this 28-residue peptide, "trpzipalpha" (ac-gewtwddatktwtwte-gggg-ΔfalΔfalΔfa-nh2), in water have revealed the presence of both the β-hairpin and the helical conformations. this is the first instance in which a Δf containing peptide has been found to display a helical fold in water. the fluorescence emission wavelengths of tryptophan in ac-g-w-g-nh2, trpzip and trpzipalpha were 341.5, 332.8 and 332.6 nm, respectively.
the fluorescence quantum yield of trpzip was 2.6-fold higher than that of trpzipalpha, suggesting that proximal interactions between the β-hairpin and the helix caused the quenching of tryptophan fluorescence in the former by the Δfs in the latter. the molar ellipticity of the far uv couplet characteristic of trpzip was reduced in trpzipalpha, and the cd based thermal melting temperatures at 228 nm were 62 °c (trpzip) and 57 °c (trpzipalpha). a concentration dependent, variable temperature cd study in water showed that in trpzipalpha increasing temperature is detrimental to the β-hairpin but augments the helical motif through intermolecular oligomerization. our results show that in water, trpzipalpha exhibits long-range interactions between two different secondary structures.

structural-based differential stability in the yoeb-yefm toxin-antitoxin module
i. cherny, e. gazit department of molecular microbiology and biotechnology, faculty of life sciences, tel aviv university, tel aviv 69978, israel
the specific physiological role of natively unfolded proteins is only recently beginning to be explored. a notable case in which the natively unfolded state appears to have physiological significance is the e. coli yoeb-yefm toxin-antitoxin (ta) module. proper functioning of ta systems requires physiological instability of the antitoxin, in contrast to the stable profile of its toxin partner. we have shown that the yefm antitoxin is a natively unfolded protein, lacking secondary structure even at low temperatures. in contrast, its toxin partner has a well-folded conformation at physiological temperatures. we suggest that the structure-based difference in thermodynamic stability between the two components is the cause of their differential physiological stability, since structural instability of the antitoxin exposes it to the cellular quality-control machinery. we further revealed that yefm and yoeb interact and form a tight complex, and determined its stoichiometry.
a potential use of ta systems is as novel antibacterial targets. indeed, we identified homologous yefm-yoeb systems in a large number of bacteria, including major pathogens. we aim to design peptides capable of interfering with the yefm-yoeb interaction, thus releasing the toxin to execute its detrimental function. for this purpose, we identified a short linear determinant within yefm that is involved in the yoeb interaction. this peptide motif will be optimized for the development of antibacterial lead molecules.

a. chatterjee, a. prabhu, a. ghosh-roy, r. v. hosur tata institute of fundamental research, mumbai, india-400005
dynein light chain (ddlc1), a member of the cytoplasmic motor assembly, exists as a monomer or a dimer (the functional form) under different experimental conditions. here we report the unfolding characteristics of monomeric ddlc1 at ph 3, induced by urea and guanidine hydrochloride, as studied by various biophysical techniques. the unfolding pathways with the two denaturants differ in many respects. urea unfolding seems to be two-state, while guanidine unfolding is more complex. nmr experiments carried out at low denaturant concentrations have enabled detailed characterization of the structure and dynamics of the near-native excited states of the protein. these are similar to the native state in structure, except for small extensions of the helices in the n-terminal half of the protein. however, the local stabilities of the α-helices and β-strands are perturbed, and this occurs differently in the two denaturants. in the guanidine case, the entire multi-stranded β-sheet in the c-terminal half is destabilized. in either case, the motional characteristics seem to suggest the presence of a finite population of the dimer in the excited-state ensemble. these states are suggested to be likely intermediates in the monomer-dimer transition, and their characterization here thus provides clues to the molecular mechanism of the transition.
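as a hedged illustration of what a two-state unfolding transition implies, the sketch below uses the standard linear extrapolation model, in which the unfolding free energy falls linearly with denaturant concentration. the model choice, the function names and the numerical values of the free energy and m-value are assumptions made for illustration only; none of them are taken from the ddlc1 measurements reported here.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol*K)


def fraction_unfolded(denaturant_molar, dg_water_kj, m_value_kj, temperature_k=298.15):
    """Fraction of unfolded protein for a two-state N <-> U equilibrium.

    Assumes the linear extrapolation model:
        dG_unfold(c) = dG_water - m * c
    where c is the denaturant concentration in mol/L.
    """
    dg = dg_water_kj - m_value_kj * denaturant_molar
    k_unfold = math.exp(-dg / (R_KJ * temperature_k))  # equilibrium constant [U]/[N]
    return k_unfold / (1.0 + k_unfold)


def midpoint_concentration(dg_water_kj, m_value_kj):
    """Denaturant concentration at which half of the protein is unfolded."""
    return dg_water_kj / m_value_kj


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to produce a visible transition.
    dg_water, m_value = 20.0, 5.0  # kJ/mol and kJ/(mol*M)
    for conc in (0.0, 2.0, 4.0, 6.0, 8.0):
        f_u = fraction_unfolded(conc, dg_water, m_value)
        print(f"{conc:.1f} M denaturant: fraction unfolded = {f_u:.3f}")
```

a single sigmoidal curve of this kind is what a two-state transition predicts; a pathway with populated intermediates, as suggested above for guanidine, would generally not be described by one such curve.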
it is also envisaged that the near-native excited states could play regulatory roles in the functioning of the protein.

kinetic bottlenecks identification in different folding models
f. cecconi 1 , c. guardiani 2 , r. livi 3 1 infm - istituto di sistemi complessi - cnr, italy, 2 centro interdipartimentale per lo studio delle dinamiche complesse, università di firenze, italy, 3 dipartimento di fisica, università di firenze, italy
the ww domains are a family of fast-folding, compact, modular domains featuring a triple-stranded, antiparallel beta-sheet. the ww domain of the pin1 protein, in particular, represents an excellent benchmark for testing computational methods, owing to the availability of a complete picture of the residues involved in thermodynamic stability and in the formation of the transition state. the objective of the present work is to identify the kinetic bottlenecks in the folding process through md unfolding simulations at increasing temperatures. the kinetic bottlenecks are related to the establishment of contacts that require overcoming a large entropy barrier and that act as a nucleus for the creation of further contacts. the key sites are therefore those involved in contacts showing a dramatic decrease in fractional occupation near the specific heat peak. the technique was applied to the go model and to a model based on the knowledge of secondary structure, providing in both cases a picture of the folding process consistent with the experimental data. evidence is also shown that, while the go model allows a more accurate prediction of the native structure, the folding pathway is better described by the other model.

the protein talin plays a key role in coupling the integrin cell adhesion molecules to the actin cytoskeleton and in integrin activation.
the globular head of talin, which binds β-integrins, is linked to a rod containing an actin-binding site and binding sites for the protein vinculin, which regulates the dynamic properties of cell-matrix junctions. we have determined the structures of three domains which contain vinculin binding sites (vbss) and shown that each of these is made up of a helical bundle. the structures of complexes between vinculin and peptides corresponding to the vbss show that the residues which interact with vinculin are buried in the hydrophobic core of the helical bundles of the talin domains. nmr studies of the interaction of one of these domains with vinculin show that it involves a major structural change in the talin fragment, including unfolding of one of its four helices, to make the vbs accessible. while the folding of unstructured regions of a protein on interaction with a 'partner' is quite common, this kind of major unfolding to permit a protein-protein interaction is much less common. ways in which it may be regulated will be discussed.

conformational changes of eye lens proteins studied by combined saxs and high pressure
s. finet 1 , f. skouri-panet 2 , a. tardieu 3 1 european synchrotron radiation facility, 6 rue horowitz, bp220, 38043 grenoble france, 2 institut de minéralogie et de physique des milieux condensés, 140 rue de lourmel, 75015 paris france, 3 protéines: biochimie structurale et fonctionnelle, case 29, 7 quai st-bernard, 75252 paris france
α-, β- and γ-crystallins are the main components of vertebrate eye lenses, with exceptional structural and associative properties. the crystallins are known to be exceptionally stable in vivo, since they have to last the lifetime of the organism. they therefore represent an extreme case of stability against unfolding and aggregation. these proteins are mainly beta strands. γ-crystallins are 21 kda monomers (from 50 to 80% sequence identity), and α-crystallins are large hetero-oligomers of about 800 kda.
α-crystallins are molecular chaperones; they belong to the ubiquitous superfamily of small heat shock proteins (shsps). here, the conformation and the stability of -and -crystallins were investigated by small angle x-ray scattering (saxs) and high pressure, as a function of temperature and ph. at room temperature, -crystallins showed a partially reversible change in size from 2 to 3 kbar, and this effect was enhanced by the combination of temperature and pressure. in the case of -crystallins and in the pressure range up to 2 kbar, the pressure was combined with temperature and ph. the results depend upon the different itself.

structural studies on pore-forming peptides
s. m. ennaceur 1 , d. bown 2 , j. m. sanderson 1 1 chemistry department, university of durham, 2 biology department, university of durham
the resistance of pathogenic microorganisms to conventional antibiotic drugs has created a need for new antibacterial agents. biologically active antimicrobial peptides that act as primary defense agents in a large variety of species are thought to have potential as precursors for a new range of drugs, from antibiotics to cancer treatments. this study has attempted to analyse the structural properties of membrane peptides and proteins through the use of model systems designed to mimic their natural counterparts: we have successfully synthesised model membrane peptides with a beta-sheet structural motif and have used a wide range of techniques to analyse their interactions with phospholipid (pc) membranes. the synthetic peptides were very hydrophobic and only soluble in fluorinated alcohols such as hfip and, to a lesser extent, tfe. we found hfip to have a very strong affinity for pc membranes and carried out a series of experiments to investigate this affinity [1]. binding of hfip to pc membranes was found to be reversible, and we exploited this property in 2d crystal trials of our synthetic peptides.
we overexpressed the c-terminal domain of brka, a gram-negative autotransporter protein, which forms a beta-barrel channel in the outer membrane (omp), for comparison with our model peptides. we performed 2d crystal trials on the omp and imaged the resulting protein arrays by stem and afm.

a protein spontaneously folds into a unique native structure under physiological conditions. this process is accompanied by a huge loss of conformational entropy (ce). our major concern is to specify the factor that can compete with the ce loss. previous discussions of protein folding have focused on contributions to the free energy of folding from the interaction potentials in a system. a view lacking in earlier studies is that folding is critically influenced by the translational movement of water molecules. when solute particles contact each other in a solvent, the excluded volumes for the solvent molecules overlap, and the total volume available to their translational movement increases. this leads to a gain in the translational entropy (te) of the solvent. this type of te effect should be much stronger in protein folding, where tight packing of the side chains occurs. an elaborate statistical-mechanical theory is employed to analyze the te of water in which a peptide or a protein molecule is immersed. it is shown that the te gain upon folding is large enough to compete with the ce loss. when water is replaced by another solvent whose molecular size is larger, the te gain decreases to a remarkable extent. we suggest that the entropic loss accompanying self-assembly and the formation of ordered structures in a living system is compensated mostly by the te gain of water, highlighting an aspect of the crucial importance of water in sustaining life.

domain ii of ribosome recycling factor is required for disassembly of the post-termination complex
p. guo, l. zhang, y. feng, g.
jing institute of biophysics, chinese academy of sciences, national laboratory of biomacromolecules, beijing, china
ribosome recycling factor (rrf) consists of two domains and, in concert with elongation factor ef-g, triggers dissociation of the post-termination ribosomal complex. however, the exact function of the individual domains of rrf remains unclear. to clarify this, two rrf chimeras, ecodi/ttedii and ttedi/ecodii, were created by exchanging the rrf domains between the proteins from escherichia coli and thermoanaerobacter tengcongensis. the ribosome recycling activity of the rrf chimeras was compared with that of the wild-type rrfs using in vivo and in vitro activity assays. the experiments show that, like wild-type tterrf, the ecodi/ttedii chimera fails to complement the rrf ts phenotype of the e. coli lj14 (frr ts) strain and has no polysome breakdown activity. however, under the same conditions, the ttedi/ecodii chimera complements the rrf ts phenotype and has polysome breakdown activity equivalent to that of wild-type ecorrf. the results indicate that domain ii of rrf is the functional domain mainly responsible for the disassembly of the post-termination ribosomal complex, and that the specific interaction between rrf and ef-g on the ribosome depends mainly on the interaction between domain ii of rrf and ef-g, while domain i of rrf is the main contributor to ribosome binding and to maintaining the stability of the rrf molecule. this study provides direct genetic and biochemical evidence for the assignment of individual functions to the rrf domains.

self-assembly of natural somatostatin into liquid crystalline nanofibrils w
the natural neuropeptide somatostatin-14 is a cyclic tetradecapeptide hormone with broad inhibitory effects on both endocrine and exocrine secretions. we report the self-assembly of somatostatin in solution into stable liquid crystalline nanofibrils, based on the bioactive backbone conformation of the neuropeptide.
the system was studied as a function of peptide concentration, milieu composition and time, using optical and electron microscopy, x-ray scattering, vibrational spectroscopy and sec/rp-hplc. in pure water, the formation of twisted nanofibrils (around 10 nm wide and a few microns long) was characterized. their structure relies on the native somatostatin β-hairpin and on networks of intermolecular antiparallel β-sheets. the nanofibrils were observed to associate laterally with increasing concentration and time, as well as to generate hexagonal phases. an increase in ionic strength (sodium chloride, phosphate) was found to significantly favor the self-association process. the mild conditions under which the somatostatin nanofibrils form support their biological relevance, for instance to the biological mechanism of storage of the neuropeptide hormone.

unraveling the physical origin of the structure of fully denatured ubiquitin
s. golic grdadolnik, f. avbelj national institute of chemistry, hajdrihova 19, 1001 ljubljana, slovenia
characterizing the structure and dynamics of non-native states of proteins is crucial for understanding the mechanism of protein folding. recently, many experimental studies have shown variations of conformational propensity and flexibility along the backbone chain of fully denatured proteins. it has been supposed that areas of residual structure may serve as initiation sites of protein folding. however, the physical origin of these variations is still unclear. we analyze the structure of fully urea-denatured ubiquitin. the experimental verification of the conformational propensities of the protein backbone is obtained through structurally dependent nmr parameters. although the secondary structure of ubiquitin under strongly denaturing conditions is not detectable and no correlation with the native overall topology is found, the variations of the nmr parameters along the backbone follow the secondary structure elements of its native state.
we show that these variations are in accord with the recently developed electrostatic screening model of denatured proteins (1). in this model, the backbone conformations of residues in an unfolded protein are determined by local backbone electrostatic interactions and their screening by backbone solvation.

many virulence factors from gram-negative bacteria are autotransporter proteins. the final step of autotransporter secretion is c → n-terminal threading through the outer membrane (om), followed by folding. this process requires neither atp hydrolysis nor a proton gradient. pertactin, an autotransporter from bordetella pertussis and the largest β-helix structure solved to date, folds much more slowly than expected based on its size and native state topology, yet its folding intermediates are not aggregation-prone. equilibrium denaturation results in the formation of a partially folded structure, a stable core comprising the c-terminal half of the protein. examination of the pertactin crystal structure does not reveal the origin of the enhanced c-terminal stability. yet sequence analysis reveals that, despite size and sequence diversity, all autotransporters are predicted to fold into parallel β-helices, suggesting that this structure may be important for secretion. for example, slow folding in vivo could prevent premature folding in the periplasm prior to the assembly of the om porin. moreover, extra stability in the c-terminal rungs of the β-helix may serve as a template for the formation of the native protein during secretion, and formation of the growing template may contribute to the energy-independent translocation mechanism. coupled with the sequence analysis, these results suggest a general mechanism for autotransporter secretion.

thermal and functional properties of e. coli outer membrane protein-receptor fhua
fhua is an e. coli outer membrane protein which transports iron into the bacterial cell and also serves as a receptor for several phages.
in order to gain deeper insight into the structural properties of fhua, we have studied its thermal properties by means of calorimetry. we have also investigated the direct interaction between the t5 and ira phages and fhua by means of viscometry. calorimetry of the heat denaturation of the membrane protein fhua, followed by deconvolution of the recorded calorimetric curve with two transitions, has shown that under the chosen conditions the structure of fhua consists of 4 domains. although both t5 and ira phages grow on a common bacterial strain expressing fhua (e. coli k12 ho830), the results of the viscometric investigation show that, under direct interaction of the phages with fhua, the protein displays receptor activity only for the t5 phage. we therefore conclude that a protein other than fhua serves as the receptor for the ira phage. it should be mentioned that the receptor-induced phage dna ejection process was observed by us for the first time in a continuous regime.

electrospray ionization mass spectrometry (esi-ms) is a powerful tool for the investigation of protein folding and of non-covalent protein interactions in solution, since charge state distributions (csds) in esi-ms are affected by the conformational state, while the mass reports on the association state. we used this technique to examine, at different ph values and under different conditions, the dimerization process of the porcine and bovine β-lactoglobulins, which share high sequence similarity and close 3d structures. dimerization of β-lactoglobulins is reversible and involves both electrostatic and hydrophobic interactions. it was possible to detect simultaneously both the monomeric and the dimeric forms of the proteins in solution, pointing out the different dimerization behaviour of the two isoforms. we found the maximal stability of the dimeric structure at ph 4 for the porcine protein and at ph 6 for the bovine one. moreover, we showed that bovine lactoglobulin has a stronger dissociation constant than the porcine protein.
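to make the link between a dissociation constant and the observed monomer and dimer populations concrete, the following sketch solves the mass balance for a simple 2 m ⇌ d equilibrium. it is only an illustration under stated assumptions: the function names are hypothetical, the dissociation constant is defined here as kd = [m]^2/[d], and the concentrations used are arbitrary values, not data from the esi-ms measurements described above.

```python
import math


def monomer_dimer_equilibrium(total_conc, kd):
    """Return (monomer, dimer) concentrations for 2 M <-> D.

    Assumes Kd = [M]^2 / [D] and a total concentration, expressed in
    monomer units, of C_total = [M] + 2 [D]. Substituting [D] = [M]^2 / Kd
    gives the quadratic (2 / Kd) [M]^2 + [M] - C_total = 0, whose positive
    root is used below.
    """
    if total_conc < 0 or kd <= 0:
        raise ValueError("total_conc must be >= 0 and kd must be > 0")
    monomer = kd * (math.sqrt(1.0 + 8.0 * total_conc / kd) - 1.0) / 4.0
    dimer = monomer ** 2 / kd
    return monomer, dimer


def dimer_fraction(total_conc, kd):
    """Fraction of all protein chains that are present in dimers."""
    if total_conc == 0:
        return 0.0
    _, dimer = monomer_dimer_equilibrium(total_conc, kd)
    return 2.0 * dimer / total_conc


if __name__ == "__main__":
    # Arbitrary illustrative values in the same (unspecified) concentration unit.
    for kd in (0.1, 1.0, 10.0):
        print(f"Kd = {kd:5.1f}: dimer fraction at C_total = 1 is {dimer_fraction(1.0, kd):.3f}")
```

with this definition, a larger kd at fixed total concentration gives a smaller dimer fraction, which is why relative monomer and dimer signal intensities can be used to compare the dimerization behaviour of two isoforms.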
further, we showed that it is possible to modulate the dimerization equilibrium of the bovine isoform at ph 6, both by increasing the temperature and by adding methanol, without inducing denaturation of the protein.

a possible novel method of protein structure prediction
a. ikehata division of structural bioinformatics, science of biological supramolecular systems, graduate school of integrated sciences
the mechanism of protein folding has been mysterious since anfinsen's dogma was suggested. here i would like to propose a possible novel method of protein structure prediction: the origami method. the method derives from the fluctuation properties of the protein backbone and from residue hydrophobicity.

l. marsagishvili 1 , m. shpagina 1 , z. podlubnaya 2 1 institute of theoretical and experimental biophysics ras, 2 pushchino state university
amyloid fibrils are formed by proteins or their peptides as the result of a conformational transition from alpha-helix into beta-sheet structure. despite the different nature of the precursor proteins, their amyloids have common properties: a beta-pleated sheet structure with individual beta-sheets oriented parallel to the main axis; insolubility in vivo; and specific binding to congo red and thioflavin t. amyloid deposits are observed in different diseases such as myositis, myocarditis, cardiomyopathies and others. we showed that sarcomeric cytoskeletal proteins of the titin family (x-, c-, h-proteins) of rabbit skeletal muscles are capable of forming amyloid fibrils in vitro. these proteins already contain the 90% of beta-sheet structure necessary for the formation of amyloids. the amyloid nature of their fibrils was confirmed by electron, polarization and fluorescence microscopy. as x-, c-, h-proteins form amyloid fibrils easily in vitro, there is a danger of fast growth of their amyloid deposits in vivo. taking into account the common properties of amyloids formed by different proteins, our results open the way to controlling amyloidogenesis in human organs and tissues.
work is supported by rfbr grants 03-04-48487, "universities of russia" 11.01.462 and the program of the presidium ras "fundamental sciences for medicine".

probing the folding capacity and residual structures in 79- and 110-residue fragments of staphylococcal nuclease
d. liu, x. wang, y. feng, l. shan, j. wang national laboratory of biomacromolecules, institute of biophysics, chinese academy of sciences
n-terminal fragments of staphylococcal nuclease (snase) with different chain lengths were used as a model system in the folding study. the detailed characterization of the conformational states of the 1-79 and 1-110 residue snase fragments (snase79 and snase110) and their v66w and g88w mutants can provide valuable information on the development of conformations in the folding of snase fragments of increasing chain length in vitro. in this study, the presence of retained folding capacity and residual structures in snase79 and snase110 is detected by cd, fluorescence, ftir, and nmr spectroscopy. snase79 is represented as an ensemble of interconverting conformations. the fluctuating nascent helix-like and β-sheet-like structures, localized in regions a58-a69 and t13-v39, respectively, are transiently populated in snase79. native-like tertiary conformations are obtained for g88w110 and v66w110, and for snase110 in the presence of 2.0 m tmao. analysis of the results of these studies indicates that the folding of snase fragments is dominated by the development of local and non-local nucleation sites from native-like secondary structures and by the intensification of long-range interactions of residues at the nucleation sites with residues further removed in sequence.

thermal disruption of a spanning network of hydration water and conformational changes of elastin
a. krukau 1 , i. brovchenko 2 , a. oleinikova 2 , a. geiger 2 1 international max-planck research school in chemical biology, otto-hahn str. 11, d-44227, germany, 2 physical chemistry, university of dortmund, otto-hahn str.
6, d-44227, germany hydration water strongly influences the structural and dynamical properties of biomolecules. the existence of a spanning hydrogen-bonded network formed by the hydration water enables the function of biomolecules at low hydration levels. we can expect that the formation/disruption of the spanning network formed by the hydration water in solution also crucially affects the protein properties. we present the first computer simulation study of the thermal disruption of a spanning network formed by the hydration water of a biomolecule (elastin-like peptide). this process obeys the laws of a 2d percolation transition, similarly to the formation of a spanning water network with increasing hydration level [1]. the spanning water network transforms into an ensemble of small water clusters with increasing temperature: it is still permanent at 280 k and exists with probability 50% at 320 k. in the same temperature interval, the conformation of the peptide changes noticeably: its radius of gyration increases sharply (by 15%) at about 295 k. these two phenomena may be related to the "inverse temperature transition" at about 310 k, where an elastin solution separates into two phases. in our simulations, the displacement of hydration water by the addition of a denaturant (urea) or by other peptide molecules causes an even stronger increase of the radius of gyration (up to 25%

lipidic cubic phases formed by distinct water and lipid volumes provide bicontinuous 3d bilayer matrices that have specific and controllable water channel sizes and large surface areas. these systems have also proven to be valuable as membrane mimetic structures and as promising matrices for controlled release and delivery of proteins, vitamins and small drugs in pharmacological applications, and they offer a 3d lipid matrix for successful crystallization of membrane proteins which do not easily crystallize in bulk solution.
the present study is directed towards a better understanding of the interplay between curved cubic lipid phases and the protein entrapped within their aqueous channel structures. as model systems, we have chosen a cubic ia3d phase formed by an uncharged lipid, monoolein, and incorporated different proteins, such as cytochrome c, α-chymotrypsin and insulin, in its narrow water channels. we show that the protein secondary structure and unfolding behaviour may be influenced by the confinement and, vice versa, the topology of the lipid matrix may change as a function of protein size and concentration. in fact, even new cubic lipid structures may be formed that are not known in pure lipid systems. furthermore, we compare the aggregation scenario of insulin in bulk solution and in the narrow water channels of the cubic lipid matrix and discuss the differences found in terms of the geometrical limitations imposed by the confinement.

after endocytosis, i.e. at acidic ph, the t domain inserts in the membrane of the target cell and helps the translocation of the catalytic domain into the cytoplasm. therefore, the t domain has a key role in the strategy of internalization of the toxin. the study of the interaction of the t domain with membranes and its ph dependence is important for a better understanding of the diphtheria toxin translocation mechanism. at least two steps can be distinguished during the membrane insertion of the t domain. the first step involves hydrophobic interactions with the membrane and is related to the ph-induced stabilization in a molten-globule state. in the second step, electrostatic interactions are preponderant and the ph-sensitivity comes from changes of the balance between repulsive and attractive electrostatic interactions. the role of the n-terminal part of the t domain in the second step has been investigated by studying peptides corresponding to the amphiphilic helices found in this part of the domain.
the results are correlated with those obtained with a single trp mutant probing the n-terminal region of the whole domain. the translocation mechanism will be discussed in view of the physico-chemical properties of the peptides.

in complex systems with many degrees of freedom, such as peptides and proteins, there exist a huge number of local-minimum-energy states. one way to overcome this multiple-minima problem is to perform a simulation in a generalized ensemble where each state is weighted by a non-boltzmann probability weight factor so that a random walk in potential energy space may be realized (for reviews, see refs. [1] [2] [3]

xas spectroscopy results show that there are two different structures of the metal binding site in the aβ 1-40 peptide according to whether it is complexed with cu2+ or zn2+ ions. while the geometry around copper is suggestive of an intra-peptide binding with three histidine residues bound to the metal, the zinc site geometry is compatible with an inter-peptide aggregation mode. this result reinforces the hypothesis that assigns opposite physiological roles to the two metals, with zinc favouring and copper blocking peptide aggregation and consequent plaque formation.

effect of pressure on the conformation of proteins. a molecular dynamics simulation of lysozyme a. n. mccarthy, r. grigera instituto de física de líquidos y sistemas biológicos (iflysib), conicet-unlp-cic, university of la plata, argentina the effect of pressure on the structure and dynamics of lysozyme was studied by md computer simulation at 1 bar (100 000 pa) and 3 kbar using the gromacs package. an all-atom (ff2gmx) force field was used for the minimization process and for all the md simulations, and all protein bond lengths were kept constrained (lincs algorithm). water molecules (spc/e model) were constrained using the settle algorithm. for the electrostatic forces we applied the reaction field method. lennard-jones interactions were calculated within a cut-off radius of 1.4 nm.
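
the lennard-jones cut-off mentioned above can be illustrated with a minimal python sketch. it is not the gromacs implementation: it only evaluates a plainly truncated 12-6 pair energy that is set to zero at and beyond 1.4 nm. the function name `lj_energy` and the oxygen-oxygen parameters used in the usage example (roughly those commonly quoted for spc/e water) are illustrative assumptions, not values taken from the abstract.

```python
def lj_energy(r_nm, epsilon, sigma_nm, cutoff_nm=1.4):
    """12-6 Lennard-Jones energy of one atom pair with a plain cut-off.

    r_nm      : pair distance in nm
    epsilon   : well depth (the result is returned in the same energy unit)
    sigma_nm  : distance at which the energy crosses zero, in nm
    cutoff_nm : pairs at or beyond this distance contribute nothing
    """
    if r_nm <= 0.0:
        raise ValueError("distance must be positive")
    if r_nm >= cutoff_nm:
        return 0.0
    sr6 = (sigma_nm / r_nm) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Illustrative (assumed) parameters, roughly SPC/E oxygen-oxygen values.
EPS_KJ_MOL = 0.650
SIGMA_NM = 0.3166

r_min = 2.0 ** (1.0 / 6.0) * SIGMA_NM          # position of the energy minimum
print(lj_energy(r_min, EPS_KJ_MOL, SIGMA_NM))  # close to -EPS_KJ_MOL
print(lj_energy(1.5, EPS_KJ_MOL, SIGMA_NM))    # beyond the cut-off: 0.0
```

a production code would normally add a shift or long-range correction at the cut-off; the sketch omits this on purpose.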
the results are in good agreement with the available experimental data, allowing the analysis of other features of the effect of pressure on the protein solution. the studies of mobility show that although the general mobility is restricted under pressure, this is not true for some particular residues. from the analysis of secondary structures along the trajectories it is observed that the conformation under pressure is more stable, suggesting that pressure acts as a 'conformer selector' on the protein. the difference in solvent accessible surface (sas) with pressure shows a clear inversion of the hydrophilic/hydrophobic sas ratio, which consequently shows that the hydrophobic interaction is considerably weaker under high hydrostatic pressure conditions.

direct observation of mini-protein folding using fluorescence correlation spectroscopy h. neuweiler, s. doose, m. sauer applied laser physics & laser spectroscopy, university of bielefeld, 33615 bielefeld, germany the "trp-cage" motif represents the smallest and one of the fastest folding mini-proteins known to date. the globular fold is characterized by a hydrophobic core burying a single tryptophan (trp) residue. here, we report on the direct observation of trp-cage folding kinetics using fluorescence correlation spectroscopy (fcs). our method is based on the selective fluorescence quenching of oxazine dyes by trp, which becomes efficient only upon contact formation between the dye and the indole moiety of trp. by site-specifically attaching the dye to trp-cage, temporal fluorescence fluctuations of the dye-peptide conjugate, caused by intramolecular contact formation between dye and trp, directly report on conformational dynamics and folding transitions of the peptide chain. in order to measure fluorescence fluctuations directly in solution we used fcs on a confocal fluorescence microscope setup.
fcs allows us to reveal conformational dynamics with nanosecond time resolution, under thermodynamic equilibrium conditions, and in highly dilute solutions (i.e. at nanomolar sample concentrations). our method confirms microsecond folding kinetics of the trp-cage motif, previously estimated with non-equilibrium temperature-jump techniques. we further investigated stability and folding rates under denaturing conditions and at various temperatures, giving further insight into structural transitions during the folding process.

identification and mutagenesis of a region of tnt required for the stability of tnt-tni coiled-coil evolutionarily conserved heptad hydrophobic repeat (hr) domains present in troponin subunits tnt and tni are involved in alpha-helical coiled-coil formation. using recombinant peptides from fast-skeletal tnt and tni, we examined the contributions of amino acid residues within these hr domains, as well as those flanking these domains, to the stability of the coiled-coil interaction. a series of tnt fragments were tested for their ability to form a coiled-coil with the tni hr domain. we show that the tnt region 166-178, although it remains outside of the coiled-coil domain, is absolutely required for the stability of the coiled-coil. interestingly, the region tnt 166-178 contains a few absolutely conserved residues that are potential candidates for ionic interaction, as predicted by molecular modeling. using single point mutants we show that among all the conserved residues, lysine 175 is most important in stabilizing the coiled-coil interaction, whereas the others play an accessory role. we propose that lysine 175 initiates the stabilization of the coiled-coil interaction and that the other residues then act in a zipper-like fashion.

magnesium promotes conformational switching of ca2+ sensor s. mukherjee, k. v. r.
chary tata institute of fundamental research the importance of mg2+, one of the most abundant metal ions in the cell cytoplasm, for the calcium sensor mechanism is demonstrated. in this study it has been shown that the 134 amino acid long calcium binding protein from entamoeba histolytica (ehcabp) can exist in three different forms, namely the calcium-free (apo) form, the magnesium-bound (apo-mg) form and the calcium-bound (holo) form. these three forms have been characterized using chromatographic, calorimetric as well as various spectroscopic techniques. there is a radical difference in stability between the calcium-free and ca2+-bound forms. the calcium-free form has molten-globule-like characteristics. mg2+ stabilizes the closed conformation of the apo form, where the hydrophobic core remains buried. the presence of mg2+ significantly alters the calcium binding cooperativity, thereby increasing the cooperativity of the conformational switching between the open and closed conformations, which is an important aspect of such regulatory proteins. a structural model for the molten-globular form of apo-ehcabp and its equilibrium folding towards the completely folded holo state in the presence and absence of mg2+ will be presented.

intermediate states of formin binding protein ww domain: explored by replica-exchange simulation y. mu school of biological science, nanyang technological university, singapore the ww domain of formin-binding protein (fbp) is a model system for beta-strand folding studies. although it is small, only 37 amino acid residues in total, the folding kinetics of the fbp ww domain proved to be biphasic. an extensive molecular dynamics-monte carlo hybrid method, called the replica-exchange simulation strategy, is employed to study the folding/unfolding of the fbp ww domain in an explicit water model.
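
the replica-exchange strategy named above can be sketched in python as follows. this is a generic illustration under stated assumptions, not the author's code: the abstract does not say how the temperatures were spaced, so a geometric ladder is assumed, and the function names `geometric_ladder` and `swap_probability` are hypothetical. the swap rule is the standard metropolis criterion for exchanging two replicas.

```python
import math

KB = 0.008314462618  # Boltzmann constant in kJ/(mol K)


def geometric_ladder(t_min, t_max, n):
    """Return n temperatures from t_min to t_max with a constant ratio
    between neighbours (an assumed, commonly used spacing)."""
    if n < 2:
        raise ValueError("need at least two replicas")
    ratio = (t_max / t_min) ** (1.0 / (n - 1))
    return [t_min * ratio ** i for i in range(n)]


def swap_probability(e_i, e_j, t_i, t_j):
    """Metropolis probability of exchanging the configurations of two
    replicas with potential energies e_i, e_j (kJ/mol) held at
    temperatures t_i, t_j (K): min(1, exp[(beta_i - beta_j)(e_i - e_j)])."""
    delta = (1.0 / (KB * t_i) - 1.0 / (KB * t_j)) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


# Ladder with the replica count and range reported for this simulation.
temps = geometric_ladder(290.0, 570.0, 88)
print(len(temps), round(temps[0], 1), round(temps[-1], 1))
```

a swap is always accepted when the colder replica currently holds the higher energy, which is what lets low-energy conformations migrate towards the low-temperature end of the ladder.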
beginning with randomly chosen conformations from a high-temperature unfolding trajectory distributed over 88 replicas (88 different temperatures covering the range from 290 k to 570 k), the simulation lasts 30 nanoseconds in each replica. in the end an interesting distribution of conformations takes shape. we find that there are three distinct subgroups: one being the unfolded conformations with rmsd averaging around 10 å, one being native conformations with rmsd 2.5 å, and, more interestingly, a third group having rmsd 5 å, an intermediate folding ensemble. by checking the intermediate ensemble in detail, we find that it is quite heterogeneous and that the heterogeneity mainly comes from the flexibility of the c-terminal loop region. our findings provide a microscopic picture of the folding kinetics of the ww domain: the stable intermediate states with mis-registered hydrogen bonds on the c-terminal beta strand make the folding of this peptide a three-state process rather than the usual two-state one.

h. rezaei, f. eghiaian, j. perez, y. quenet, y. choiset, t. haertle, j. grosclaude institut national de la recherche agronomique, france in pathologies due to protein misassembly, low oligomeric states of the misfolded proteins rather than large aggregates play an important biological role. to get better insight into the molecular mechanisms of prpc/prpsc conversion, we studied the kinetic pathway of heat-induced amyloidogenesis of the full-length recombinant ovine prp (arq) at ph 4.0. according to the size exclusion chromatography experiments, three sets of oligomers were generated from the partial unfolding of the monomer. the effect of concentration on the oligomerization kinetics was different for the three species observed, suggesting that they are generated from distinct kinetic pathways.
limited proteolysis and peptide analysis of the two best separated peaks showed a difference in the accessibility of the c-terminal domain of these two oligomers, and allowed the identification of regions undergoing a structural change during the conversion process. the analysis of correlations between oligomer populations as well as numerical kinetic modeling led us to propose a multi-step kinetic pathway describing the evolution of each species as a function of time. the existence of at least three distinct oligomerization pathways on the one hand and the differences in the accessibility of the two purified oligomers on the other hand reflect the structural plasticity of the prp protein.

studying the mechanism of retention of ovine prion protein in soils will tackle the environmental aspect of potential dissemination of the scrapie infectious agent. the conformational transition from the monomeric cellular prion protein prp c in α-helical structure into the aggregated β-sheet-rich multimer prp sc is supposed to be responsible for the so-called prion diseases. it is commonly accepted that the recombinant prp can serve as a model of conversion of the normal prion protein prp c. fundamentally, the interaction of proteins with surfaces, either fluid or solid, involves both protein binding and unfolding. our goal in studying protein adsorption is to determine the nature and the amplitude of the structural changes occurring during non-specific adsorption. the protein-clay interaction depends on several parameters such as protein hydration, net charge and charge distribution on the protein surface. ftir spectroscopy is well-suited to probe structural changes of proteins at a molecular level at aqueous/solid interfaces. the conformational states of the full-length ovine prp adsorbed on the electronegative clay surface are compared to its solvated state in deuterated buffer in the pd range 3.5-9, using ftir spectroscopy.
during the intoxication of a cell, the diphtheria toxin binds to a cell surface receptor, is internalized and reaches the endosome. the translocation (t) domain of the toxin interacts with the membrane of the endosome in response to the acidic ph found in this compartment. this process then drives the passage of the catalytic domain of the toxin through the membrane into the cytoplasm. the interaction of the t domain with the membrane involves at least two steps. in the first step, occurring at ph 6, t adopts a molten globule conformation, which is able to bind superficially to the membrane through hydrophobic interactions by the c-terminal region (helices th8 and th9). in a second step, occurring between ph 6 and 4, penetration into the membrane involves electrostatic interactions. this step leads to a functional inserted state. trp 281 was mutated to phe in order to use trp 206, located in helix th1, as a probe of its behavior during the interaction with the membrane as a function of ph. we found that the second step is correlated with the reorganization of the n-terminal region in the membrane and is controlled by electrostatic interactions.

peptide conformational search using generalized simulated annealing method p. g. pascutti 1, f. p. agostini 2, c. osthoff 2, k. c. mundim 3, m. a. moret 4 1 ibccf - ufrj - brazil, 2 lncc - brazil, 3 iq-unb - brazil, 4 ceppev - fundação visconde de cairú / df-uefs - brazil the three-dimensional structure of proteins is mainly determined by the sequence of amino acids, making possible the development of ab initio methods for peptide and protein structure prediction. we proposed a stochastic method based on a classical force field and generalized simulated annealing (gsa), which utilizes the tsallis generalization of boltzmann-gibbs statistics. we have applied this method for peptide conformational search and as a complement for comparative modeling of proteins, searching the conformation for loops and mismatched sequence alignments.
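
as a rough illustration of how generalized simulated annealing differs from the classical scheme, the python sketch below implements the acceptance rule and cooling schedule in the form usually attributed to tsallis and stariolo. it is an assumption that the authors used exactly these expressions; in particular, the abstract's separate temperature parameter is not modelled here, and the function names `gsa_acceptance` and `gsa_temperature` are hypothetical. the temperature is expressed in energy units.

```python
import math


def gsa_acceptance(delta_e, temperature, q_a):
    """Generalized (Tsallis-type) acceptance probability for an energy
    change delta_e at the given temperature (same energy units).

    Downhill moves are always accepted. For q_a -> 1 the rule reduces to
    the Boltzmann factor exp(-delta_e / temperature)."""
    if delta_e <= 0.0:
        return 1.0
    if abs(q_a - 1.0) < 1e-12:
        return math.exp(-delta_e / temperature)
    base = 1.0 + (q_a - 1.0) * delta_e / temperature
    if base <= 0.0:  # can only happen for q_a < 1: move is rejected
        return 0.0
    return base ** (-1.0 / (q_a - 1.0))


def gsa_temperature(step, t_initial, q_v):
    """Cooling schedule T(step) for step = 1, 2, ...; T(1) equals t_initial.

    For q_v -> 1 it reduces to the logarithmic schedule of classical
    simulated annealing."""
    if step < 1:
        raise ValueError("step must be >= 1")
    if abs(q_v - 1.0) < 1e-12:
        return t_initial * math.log(2.0) / math.log(1.0 + step)
    num = 2.0 ** (q_v - 1.0) - 1.0
    den = (1.0 + step) ** (q_v - 1.0) - 1.0
    return t_initial * num / den


print(gsa_acceptance(1.0, 1.0, 2.0))   # 0.5
print(gsa_temperature(1, 100.0, 1.5))  # 100.0
```

for acceptance parameters above 1 the probability of an uphill move decays as a power law in the energy change rather than exponentially, which is what helps the search escape local minima.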
the gsa efficiency depends on the right choice of the parameters involved in the conformational calculation: the qv parameter, which defines a function for visiting the molecular energy surface, and the qa parameter, defining an acceptance probability, both according to tsallis statistics. to avoid conformational trapping in local energy minima we introduced a new parameter in this work, qt, to control the temperature decrease. to investigate the best set of qv, qa and qt parameters, we used the 18-alanine and 26-alanine peptides, which have a known alpha-helical structure in a low dielectric environment. the global minimum energy occurs for the alpha-helix folded structure, and was found for qv ranging from 1.1 to 1.9, qa from 1.6 to 2.9, and qt from 1.1 to 2.4. we also observed that convergence values for qv decrease, while those for qa and qt increase, for folded structures.

p. s. santiago, é. v. de almeida, m. tabak instituto de quimica de são carlos-usp cp780 13560 são carlos sp brasil extracellular hbgp is a giant hemoglobin, similar to other annelid hemoglobins, having a molecular weight of 3.1 mda. the effect of ctac on the oligomeric protein structure was assessed by optical absorption and emission spectroscopies. optical absorption spectra of hbgp 0.075 mg/ml as a function of ctac concentration from 0 up to 10.5 mm show changes both in the soret band at 414 nm and at 490 nm, the latter associated with light scattering which appears at low ctac concentration. below 1 mm of surfactant, extensive light scattering occurs together with significant shifts of the λmax of the soret band from 414 nm to around 400 nm, which is probably due to oxidation of the original oxyhbgp. light scattering reaches a maximum value and disappears at higher ctac concentrations. fluorescence spectra show a significant increase in intensity (30-fold) upon titration with ctac.
this is consistent with the dissociation of the oligomer, with significant reduction of the intrinsic quenching of tryptophan fluorescence due to the heme groups. similar data were obtained at a protein concentration of 3 mg/ml. in this case a significant increase of light scattering is observed, with protein precipitation in a narrow ctac range followed by redissolution at higher ctac concentration. unlike the anionic surfactant sds, the cationic one induces protein aggregation. support: cnpq and fapesp.

in situ formation of silk protein nano-particles studied by small angle x-ray scattering m. w. roessle 1, c. riekel 2, d. i. svergun 1 1 european molecular biology laboratory; outstation hamburg/germany, 2 european synchrotron radiation facility; grenoble/france silk threads from the mulberry silkworm bombyx mori have been used for the production of textiles for centuries. in modern applications silk proteins are chosen because of their good tensile strength, their high biocompatibility and their resorbability by the human body. however, the processing of silk by the silkworm is barely understood. the silk proteins are stored as a solution in the glands of the silkworm and processed during spinning into a block-copolymer-like fiber. in a first step the random coil silk proteins are transformed into molecules with beta-sheet subdomains, which provide a protein-protein interface for the fiber assembly. this transformation can be mimicked by a rheometer applying a shear force to the silk protein solution. applying combined small/wide angle x-ray scattering (saxs/waxs), transiently formed silk nanoparticles were found upon increasing shear force. further details of these macromolecules were derived by fixing the transient state with chemicals such as polyethylene oxide (peo). the resulting data can be analyzed in detail by saxs data evaluation software, and low resolution models of the nanoparticles found were derived.
moreover, the internal structure of the particles was explored, and suggestions regarding silk processing by the silkworm could be made.

on the three-dimensional information of a protein sequence v. a. risso, l. g. gebhard, r. g. ferreyra, j. santos, m. e. noguera, m. r. ermacora universidad nacional de quilmes conicet in this work, β-lactamase of b. licheniformis (esβl) was used as an experimental model to (a) study the conversion of sequential information into 3d structure and (b) investigate the distribution of conformational information in the polypeptide chain. by a novel approach, over thirty connectivity variants of the polypeptide chain were prepared, which were also n-terminally truncated to a variable degree. the variants produced in e. coli were purified to homogeneity and refolded, and their structure content was analyzed by circular dichroism, hydrodynamic behaviour and aggregation state. several variants were dimeric in solution, suggesting a possible general nonspecific stabilization mechanism. most variants were compact and had different degrees of secondary and tertiary structure. a strikingly large number of variants showed native-like spectroscopic signatures and significant enzymatic activity, which means that the very elaborate active site of β-lactamase is formed, at least in a fraction of the molecules, despite the absence of long stretches of sequence. these findings are discussed in the light of the current knowledge of the protein folding process. var and lgg contributed equally to this work.

unfolding of the extrinsic proteins of photosystem ii (33 kda, 23 kda and 17 kda) induced by pressure unfolding of the three extrinsic proteins of spinach photosystem ii induced by pressure has been systematically investigated. thermodynamic equilibrium studies indicated that these proteins are very sensitive to pressure. at 20 °c all the proteins show a reversible unfolding transition, at about 180 mpa for 33 kda, 200 mpa for 23 kda and 270 mpa for 17 kda. the ph and temperature dependence of pressure unfolding of these proteins were explored. the stabilizing effect of reagents such as sucrose on the proteins was found to be associated not only with the increase in the unfolding free energy, but also with the reduction of the absolute value of ΔVu. pressure-jump studies of unfolding of the 23 kda protein revealed a negative activation volume for unfolding and a positive activation volume for refolding, indicating that, in terms of system volume, the transition state lies between the folded and unfolded states.
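
the link between a transition pressure and the unfolding volume change can be made concrete with a small python sketch of a two-state model. it assumes, as a simplification not stated in the abstract, that the unfolding free energy varies linearly with pressure with a pressure-independent volume change, so that the free energy of unfolding is the volume change times the distance from the midpoint pressure. the function name `fraction_unfolded` and the value of -100 ml/mol are hypothetical; only the 180 mpa midpoint and the 20 °c temperature come from the text.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)


def fraction_unfolded(p_mpa, p_half_mpa, dv_ml_per_mol, temp_k=293.15):
    """Two-state fraction of unfolded protein at pressure p_mpa.

    p_half_mpa    : midpoint pressure, where half the protein is unfolded
    dv_ml_per_mol : unfolding volume change; negative if pressure unfolds
    temp_k        : absolute temperature (293.15 K = 20 degrees C)

    Uses dG_u(p) = dV_u * (p - p_half); 1 MPa * 1 mL/mol = 1 J/mol.
    """
    dg_j_per_mol = dv_ml_per_mol * (p_mpa - p_half_mpa)
    return 1.0 / (1.0 + math.exp(dg_j_per_mol / (R * temp_k)))


# Midpoint of the 33 kDa protein from the text; the volume change is assumed.
for p in (0.1, 150.0, 180.0, 210.0):
    print(p, round(fraction_unfolded(p, 180.0, -100.0), 3))
```

in this picture a reagent that reduces the absolute value of the volume change makes the transition both broader and, for a given free energy of unfolding at ambient pressure, shifts its midpoint to higher pressure.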
comparison of the temperature dependence of ΔVu‡, ΔVf‡ and ΔVu indicated that the thermal expansivities of the transition state and the unfolded state are similar and larger than that of the folded state.

aggregation-prone intermediate protein structures on the refolding pathway l. smeller 1, j. fidy 1, k. heremans 2 1 semmelweis university dept. biophysics and radiation biology, budapest, hungary, 2 department of chemistry, katholieke universiteit leuven, belgium the folding of the polypeptide chain into a native conformation can be studied in experimental systems where the environmental parameters causing the denatured state can be eliminated easily and fast. one such parameter is high hydrostatic pressure. refolding of the unfolded protein can be studied after decompression. the refolding pathway can contain several intermediate states. on the other hand, the destabilization of native proteins can populate conformations where the polypeptide chain is not completely folded. these metastable conformations can easily aggregate. deposition of insoluble protein aggregates plays a crucial role in the conformational diseases (parkinson's disease, alzheimer's disease, amyloidosis, etc.). the stability and conformation of the above-mentioned metastable conformations were investigated in the case of myoglobin, apo-horseradish peroxidase and lipoxygenase. fluorescence and infrared spectroscopy and light scattering experiments were used to explore not only the structure but also the aggregation affinity of the intermediates. for all the above-mentioned proteins a well-defined temperature range could be determined where the metastable, not completely folded structures were populated considerably during the refolding process. these intermediate conformations were significantly more aggregation-prone than the native conformers existing in the same temperature range.

aggregation processes in beta-amyloid peptides: effects of molecular chaperones a. sgarbossa, d. buselli, f.
lenci cnr istituto biofisica, pisa, italy several neurodegenerative pathologies, like parkinson's, huntington's and alzheimer's diseases, are related to the formation of small peptide aggregates, from which amyloid fibrils originate. understanding the molecular mechanisms responsible for these processes can, therefore, contribute to clarifying the origin and, hopefully, to controlling the development of the aforementioned diseases. here we report the results of an in vitro study aiming to affect the aggregation kinetics of 1-42 and 1-40 beta-amyloid peptides by means of an endogenous chaperone-like protein (alpha-crystallin) and an exogenous polycyclic aromatic pigment (hypericin) that can perturb the aggregation process through stacking interactions with the aromatic residues of the peptides. because of the well-known problems in getting reproducible and reliable results, particular attention has been devoted to carefully checking the preparation procedures of the samples. the effects of both alpha-crystallin and hypericin on the self-assembly process have been examined at different times of the aggregation kinetics. the results are discussed in relation to the involvement of different molecular structures in the amyloid fibrillation phenomenon.

remodelling the folding of thioredoxin by removal of the c-terminal helix j. santos, j. m. delfino iquifib and departamento de química biológica, facultad de farmacia y bioquímica, uba, junín 956, c1113aad buenos aires, argentina e. coli thioredoxin (trx) is a monomeric α/β protein of 108 amino acids with a fold characterized by a central beta sheet surrounded by alpha helices. two subdomains are topologically noticeable, but it is unclear whether their folding occurs in a concerted fashion. subdomain trx1-73 has been extensively studied as a model of a partially folded state, with no tertiary or persistent secondary structure.
this work describes the expression and characterization (by circular dichroism (cd), fluorescence emission, size exclusion chromatography, chemical cross-linking and light scattering) of a novel engineered fragment (trx1-93) lacking the last stretch of 15 amino acids. after refolding from inclusion bodies, trx1-93 shows a strong propensity to form soluble oligomers endowed with distinctive optical properties unlike those observed for the full protein. although trx1-93 also shows significant changes in secondary structure, trp residues appear to occupy rigid and apolar environments. these findings support the existence of an alternatively folded form for trx1-93. in addition, the secondary structure content of the chemically synthesized peptide trx94-108 and its ability to complement fragment trx1-93 upon refolding were also evaluated by cd. taken together, the data presented here shed light upon issues such as the distribution of information content relevant for folding along the polypeptide chain with regard to conformational stability. with grants from anpcyt, ubacyt and conicet.

thermal aggregation of two "beta-protein" models at different ph values v. vetri, f. librizzi, v. militello, m. leone physical and astronomical sciences, univ. of palermo, italy & infm the structural stability of proteins strongly depends on the environment, and the loss of this stability may trigger a partial unfolding, leading in turn to the formation of aggregates. such processes have been extensively studied, also in view of their biotechnological and medical implications. in fact a large number of diseases are associated with protein misfolding and aggregation. conformational changes play a key role in the aggregation processes and have their onset under particular external conditions.
the aggregation pathways and the topology of the obtained supramolecular structures sizeably depend on the details of the involved conformational changes, which are determined by the details of the external conditions. here we present an experimental study on thermal aggregation processes of two model proteins mainly composed of β structures, β-lactoglobulin and concanavalin a, at different ph values. the conformational changes of the proteins (whose association state depends on ph) and the aggregation pathways were monitored by intrinsic fluorescence and by the fluorescence of suitable external dyes. at the same time, the growth of supramolecular structures was followed by measuring the rayleigh scattering of the excitation light. secondary structure changes were followed by circular dichroism measurements. the results show that at different ph values the aggregation processes of both proteins follow different pathways determined by the variations in the native structure and by the details of the involved conformational changes.

a. verma 1, t. herges 2, w. wenzel 2 1 iwr, forschungszentrum karlsruhe, po box 3640, 76021, karlsruhe, germany, 2 int, forschungszentrum karlsruhe, po box 3640, 76021, karlsruhe, germany ab initio protein structure prediction and the elucidation of the mechanism of the folding process are among the most important problems of biophysical chemistry. investigations of the protein landscape may offer insights into the folding funnel and help elucidate folding mechanism and kinetics. we investigate the landscape of the internal free energy of the 36 amino acid villin headpiece with a modified basin hopping method in the all-atom force field pff01, which was previously used to fold several helical proteins with atomic resolution. we identify near-native conformations of the protein as the global optimum of the force field.
more than half of the twenty best simulations started from random initial conditions converge to the folding funnel of the native conformation, but several competing low-energy metastable conformations were observed. from 76,000 independently generated conformations we derived a decoy tree which illustrates the topological structure of the entire low-energy part of the free energy landscape and characterizes the ensemble of metastable conformations. these emerge as similar in secondary structure content, but differ in the tertiary arrangement.
exploring the free energy landscape of a folded protein by means of afm stretching experiments
m. vassalli 2, b. tiribilli 1, l. casetti 2, a. torcini 1, a. pacini 3, a. toscano 3
1 isc-cnr, florence, italy, 2 csdc, univ. of florence, italy, 3 anatomy dept, univ. florence, italy
the aim of our work is to study the mechanically induced unfolding and refolding of single proteins by means of an atomic force microscope (afm). the resulting data will be analyzed with theoretical methods, both to determine the folding pathway and to gain information on the energy landscape of real systems. recently, experiments employing atomic force microscopy have shown that mechanical and thermal unfolding share several common features. we are using an afm to perform mechanical stretching experiments on single biomolecules. these experiments are particularly well suited to reconstructing the folding-unfolding pathways as well as the free energy landscape of the examined protein. in particular, we are interested in the free energy profile associated with titin and elastin, which we probe by applying a periodic loading of the afm cantilever, instead of the usual linear ramp, and measuring the force as a function of displacement. these experiments will be complemented by theoretical and numerical studies.
molecular dynamics simulations of simple models that nonetheless include the experimental geometry will allow us to examine in detail the effect of the different experimental procedures (periodic loading versus linear ramp) proposed to reproduce equilibrium energy landscapes. moreover, we will investigate the limits of applicability of jarzynski's equality, which has been claimed to allow equilibrium results to be extracted from non-equilibrium measurements.
association of subunits is a prerequisite for formation of the native structure of the dimeric ipmdh
a. varga, é. gráczer, i. hajdú, p. závodszky, m. vas
institute of enzymology, biological research center, hungarian academy of sciences, budapest, hungary
to answer the question of whether subunits are autonomous folding units, or whether their association at an early stage of folding is required for formation of the native protein structure, denaturation-renaturation experiments were carried out with the dimeric isopropylmalate dehydrogenase (ipmdh). denaturation was induced by guanidine hydrochloride; renaturation was initiated by dilution and followed by activity measurements and fluorimetry. reactivation is a complex process with an initial lag phase, indicating the presence of an inactive intermediate. the kinetics of the process are independent of protein concentration, suggesting that association of the two polypeptide chains takes place much faster than the rate-limiting first-order isomerisation step(s). restoration of protein fluorescence during renaturation is also a protein-concentration-independent, biphasic process; however, the initial lag phase is replaced by an even faster burst increase of fluorescence. the first step leads to formation of an intermediate with a native-like fluorescence spectrum.
based on our experiments, the following mechanism is proposed for refolding of ipmdh: $\mathrm{D} + \mathrm{D} \rightarrow \mathrm{I}_2 \rightarrow \mathrm{I}_2^{*} \rightarrow \mathrm{N}_2$, where d is the denatured monomer, i2 and i2* are inactive dimer intermediates, and n2 is the native dimer. this means that initial association of the polypeptide chains during refolding is a prerequisite for formation of the native three-dimensional structure of ipmdh.
we describe a new beamline for optical biophysics under construction at the synchrotron soleil. the high brilliance of the synchrotron beam, together with its tunability over a broad part of the electromagnetic spectrum, makes it an excitation source of choice for several biophysical optical techniques. the disco beamline we present will consist of three endstations:
1. the circular dichroism (cd) endstation will benefit from the inclusion of the energies accessible in the vuv wavelength range (350-165 nm) and from the natural polarization of the synchrotron beam. cd spectra of proteins covering a broad range of wavelengths will enable better and finer structural analysis. moreover, new biological chromophores such as sugars, which absorb in the deep uv, will be accessible in cd.
2. the mass spectrometry endstation will benefit from an ionisation beam with even greater energies (down to 60 nm), combined with nebulisation at atmospheric pressure. photoionisation of macromolecular bio-structures without any solvent restriction will produce perfect analytes for mass spectrometry.
3. the multiparametric imaging endstation, built on a confocal microscope, will use the great tunability of the synchrotron radiation (200-700 nm) to excite samples at many wavelengths simultaneously. the temporal component of the beam will allow natural lifetime imaging by phase modulation-demodulation.
predictive all-atom protein folding with stochastic optimization methods
the prediction of protein tertiary structure, in particular based on sequence information alone, remains one of the outstanding problems in biophysical chemistry.
according to the thermodynamic hypothesis, the native conformation of a protein can be predicted as the global optimum of its free energy surface with stochastic optimization methods, orders of magnitude faster than by direct simulation of the folding process. we have recently developed an all-atom free energy force field (pff01) which implements a minimal thermodynamic model based on physical interactions and an implicit solvent model. we demonstrated that pff01 stabilizes the native conformation of several helical proteins as the global optimum of its free energy surface. in addition, we were able to reproducibly fold several helical proteins (the 20 amino acid (aa) trp-cage protein, the villin headpiece (36 aa), the conserved headpiece of the hiv accessory protein (40 aa), the headpiece of protein a (40 aa) and the 4-helix bacterial ribosomal protein l20 (60 aa)), as well as several beta-sheet peptides. we used several stochastic optimization methods: the stochastic tunneling method, an adapted version of parallel tempering, basin hopping techniques and distributed evolutionary optimization strategies. we discuss advantages and limitations of this approach to de-novo all-atom protein structure prediction.
nine independent thermal unfolding simulations of gb1 were performed. twelve physical property parameters of protein structure were chosen to construct a 12-dimensional physical property space. this 12-dimensional property space was then reduced to a 3-dimensional principal component property space. in this property space, the unfolding pathway ensemble of gb1 was obtained. the pathway ensemble resembles a funnel that gradually widens from the native state ensemble to the unfolded state ensemble. the unfolding trajectories show similar trends in the native state and transition state ensembles.
in the unfolded state, the 9 unfolding trajectories fell into two types: one comprising a single trajectory and the other comprising 8 trajectories. the first type of unfolded state followed a discontinuous, step-like distribution, which is not random. the second type followed a nearly ellipsoidal, nearly random distribution. there were substantial overlaps among the unfolded states, indicating that the thermally unfolded state consists of a confined set of property values, which makes the number of unfolded states of a protein much smaller than previously believed.
the protein circular dichroism data bank (pcddb): a bioinformatics and spectroscopic resource
b. a. wallace 1, l. whitmore 1, r. w. janes 2
1 dept. of crystallography, birkbeck college, university of london, 2 school of biological sciences, queen mary, university of london
we describe the development and creation of the protein circular dichroism data bank (pcddb), a deposition data bank for validated circular dichroism spectra of biomacromolecules. its aim is to provide a resource for the biological and bioinformatics communities, by providing open access and archiving facilities for circular dichroism spectra. it is named in parallel with the protein data bank (pdb), a long-existing valuable reference data bank for protein crystal structures. it will permit spectral deposition via user-friendly web forms and will include automatic reading of a range of data formats and data mining from file headers to facilitate the process. it will be linked, in the case of proteins whose crystal structures and sequences are known, to the appropriate pdb and sequence data bank files, respectively. a series of validation tools that will provide reports on data quality are included (and are accessible as stand-alone software).
it is anticipated that this data bank will provide a readily accessible biophysical catalogue of information on folded proteins that may be of value in structural genomics programs, for quality assurance and archiving in industrial and academic labs, as a resource for programs developing spectroscopic structural analysis methods, and in bioinformatics studies.
the relation of n-terminal residues and structural stability of l-chain apoferritin
k. yoshizawa 1, k. iwahori 3, y. mishima 1, i. yamashita 2
1 crest/jst, 2 atrl-matsushita, 3 nara institute of science and technology
the denaturation of apoferritin by acidic solution was studied. ferritin, the ubiquitous iron storage protein, represents a well-known polymeric assembly that is highly resistant to chemical and physical denaturants. it is a cage-shaped protein composed of 24 subunits. natural vertebrate ferritins are copolymers of two different subunits, l- and h-chains. in the recombinant h-chain apoferritin (rhf), the structural stability is decreased by deletion of the n-terminal residues. we studied the effect of the n-terminal residues of recombinant l-chain apoferritin (rlf) on acidic ph denaturation and re-assembly. we constructed rlf and mutant rlfs which lack 4 (fer4) or 8 (fer8) amino acid residues from the n-terminus and investigated their stability by cd spectra. among the three, fer8 has the least endurance against a decrease in ph. in the case of fer8, the re-assembly of subunits into apoferritin can be achieved by increasing the solution ph without forming by-products, whereas large aggregates form in fer0 and fer4. the structural comparison of the three proteins indicates that the inter- and intra-subunit hydrogen bonds decrease with the loss of the n-terminal residues. we therefore conclude that the inter- and intra-subunit hydrogen bonds formed by the n-terminal residues affect the molecular stability and re-assembly of l-chain apoferritin.
pressure perturbation calorimetric studies of solvation, unfolding and aggregation of proteins
r. winter
university of dortmund, physical chemistry, otto-hahn-str. 6, d-44227 dortmund
pressure perturbation calorimetry (ppc) was used to study the solvation and volumetric properties of various proteins in their native and unfolded states. in ppc, the coefficient of thermal expansion of the partial specific volume of the protein is deduced from the heat consumed or produced after small isothermal pressure jumps, which strongly depends on the interaction of the protein with the solvent or cosolvent at the protein-solvent interface. the effects of ph and various chaotropic and kosmotropic cosolvents (glycerol, sucrose, urea, guhcl, salts, etc.) on the solvation and unfolding behavior of the proteins were also investigated, and the observed volume and expansivity changes are correlated with further thermodynamic and spectroscopic properties of the systems. depending on the type of cosolvent and its concentration, specific differences are found for the solvation properties of the proteins, and the volume change upon unfolding may even change sign. taken together, the data obtained lead to a deeper understanding of the solvation process of proteins in different cosolvents in their native and unfolded states. in addition, the effects of confinement and crowding on the solvational properties of the proteins were studied. finally, the use of ppc for studying intermolecular interactions and aggregation (amyloidogenesis) phenomena of proteins (e.g., insulin, prp) will be discussed.
the influence of semisynthetic derivatives of phenolic lipids on activity of yeast abc pumps
phenolic lipids are the natural amphiphilic long-chain homologues of orcinol (1,3-dihydroxy-5-methylbenzene). they occur in numerous plants and microorganisms. resorcinolic lipids exhibit high affinity for lipid bilayers and biological membranes and are able to modify the activity of membrane enzymes (e.g.
pla2, ache). the influence of semisynthetic derivatives of phenolic lipids on yeast pdr protein activity was studied by a spectrofluorimetric method using the potentiometric fluorescence probe dis-c3(3). the probe is expelled from s. cerevisiae by abc pumps and can conveniently be used for studying their performance. two pump-competent s. cerevisiae strains and different pump-free mutant strains were used to check the effect of the semisynthetic derivatives of phenolic lipids on the activity of abc transporters. two of these derivatives, named 23.1 and 23.2, seem to affect the activity of pdr pumps. their influence on the activity of yeast plasma membrane multidrug resistance abc pumps is concentration-dependent.
influence of lipid membrane composition on p-glycoprotein activity
k. bucher, s. d. krämer, h. wunderli-allenspach
department chemistry and applied biosciences, eth zurich, switzerland
p-glycoprotein (p-gp), a membrane atpase expelling many structurally unrelated compounds, is one of the major contributors to multidrug resistance. it is proposed that substrates bind to it within the membrane and are exported from there out of the cell. p-gp substrates are generally hydrophobic and their binding to the transporter is governed by their ability to partition into the membrane. the intimate association of both p-gp and its substrates with the membrane suggests that p-gp function may be regulated by the composition of the lipid bilayer. as detergents influence the membrane properties and have been shown to affect p-gp atpase activity, we developed virtually detergent-free proteoliposomes to investigate the influence of the membrane environment on the atpase activity of p-gp. the basal and substrate-induced atpase activity was dependent on the cholesterol level of egg phosphatidylcholine (pc) membranes.
the compound concentration at half-maximal activation of p-gp (km) in proteoliposomes correlated with the affinity of the respective compound to liposomes consisting of the same lipids as the proteoliposomes tested. in conclusion, the basal and drug-induced atpase activity of p-gp is strongly dependent on the cholesterol content in detergent-free p-gp/egg pc/cholesterol proteoliposomes.
m. berchel 1, j. jeftic 1, t. benvegnu 1, j.-y. thepot 2, d. plusquellec 1
1 enscr umr cnrs 6052, campus de beaulieu, 35700 rennes, 2 université de rennes 1, umr cnrs 6509, 35042 rennes
bipolar lipids found in archaebacterial membranes, generally termed bolaamphiphiles, induce increased stability in membranes exposed to environments such as acidic conditions, high temperatures, high salt concentrations and/or absence of oxygen. we have synthesized a spin-labeled unsymmetrical bolaamphiphile that self-organises in aqueous solution into multilamellar vesicles and shows a slow flip-flop phenomenon in comparison to conventional liposomes. generally, the flip-flop from the exovesicular to the endovesicular membrane surface is a relatively slow process, owing to the high energy barrier to transferring the polar amphiphilic heads through the lipophilic membrane. it can be involved in membrane transport mechanisms, and cells have evolved various supramolecular strategies to facilitate the transport. the half-life of the flip-flop is estimated at more than twelve hours. we are now modulating the flip-flop rate by incorporating chemical modifications, such as the addition of cyclopentanes or double or triple bonds, into the bridging chain of the molecule, in order to control membrane transport via the flip-flop mechanism.
transport activity of the monocarboxylate transporter 1 is increased by carbonic anhydrase
h. m. becker, j. w. deitmer
abt. allgem.
zoologie, tu kaiserslautern, 67663 kaiserslautern
the enzyme carbonic anhydrase (ca), which catalyses the conversion of co2 and h2o to bicarbonate and protons, is present in nearly all animal cells and is highly expressed in astrocytes. it is known that ca can bind to several membrane transporters, forming a transport metabolon and thereby enhancing the transport activity of the protein. in this project we have studied the functional interaction of the enzyme with the monocarboxylate transporter 1 (mct1), which transports lactate and other monocarboxylates together with protons and is believed to play a pivotal role in metabolite shuttling between astrocytes and neurons. to this end, we expressed mct1 in xenopus oocytes and then injected ca into them. our results indicate a direct binding of ca to mct1, leading to a ca-induced increase in the acid/base flux mediated by the transporter. interestingly, the effect was insensitive to the ca inhibitor ethoxyzolamide and to the nominal absence of co2/hco3-, but disappeared when binding of ca to mct1 was hindered. it seems that ca, bound to mct1, mediates local buffer capacity by removing protons transported into the cell via mct1. this helps to stabilise the proton gradient close to the cell membrane and thereby enhances the transport activity of mct1. these findings suggest that ca can enhance metabolite-acid/base transport by forming a transport metabolon with mct1.
melibiose permease of e. coli (melb) is a membrane-bound ion-coupled sugar symporter that uses the favorable na+, li+, or h+ electrochemical potential gradient to drive cell accumulation of α- or β-galactosides. cysteine scanning mutagenesis and electrophysiological (ssm, solid supported membranes) and fluorometric measurements were used in order to better understand the role of specific parts of the protein in the function of this symporter. the ssm technique combines a rapid solution exchange with the high sensitivity of planar lipid membranes.
it employs a solid supported membrane as a capacitive electrode and allows the time-resolved investigation of charge translocation during the catalytic cycle of transporters such as na+/solute symporters. in order to obtain more precise information about the function of the melb symporter, the mutant g117c was constructed starting from the c-less melb. from electrical, spectrofluorimetric and fret measurements carried out on this mutant, in the absence and in the presence of specific inhibitors, conclusions were drawn about the possible role of helix iv in the function of the symporter.
two-dimensional crystallization of co-reconstituted ca2+-atpase, phospholamban and sarcolipin
the sarcoplasmic reticulum ca2+-atpase and its regulators phospholamban (plb) and sarcolipin (sln) form a primary control mechanism in the recovery of resting-state calcium levels in the myocardium. defects in the regulation of ca2+-atpase by plb and sln are central determinants in cardiac contractility and disease states such as cardiomyopathy. given the significance of these proteins, the structural details of their regulatory mechanisms remain an important future goal for the clinical improvement of heart disease. using co-reconstitution into proteoliposomes at low lipid-to-protein ratios, we have examined the effects of mutation on the functional properties of plb and sln, revealing novel insights into calcium pump regulation. in addition, these same co-reconstituted proteoliposomes have been used for structural studies by electron cryo-microscopy. in an attempt to better define the structural interactions between plb, sln and ca2+-atpase, we have sought methods for the production of large two-dimensional crystals suitable for high-resolution electron crystallography. we previously utilized the co-reconstituted proteoliposomes to produce long, tubular crystals suitable for helical reconstruction.
our new procedure comprises three steps (co-reconstitution, membrane fusion, and crystallization), producing large two-dimensional crystals suitable for high-resolution structural studies. herein, we will present our latest results characterizing the structural interaction between plb, sln and ca2+-atpase.
t. v. demina 1, n. s. melik-nubarov 1, h. frey 2, e. e. pohl 3
1 state university, chemistry department, moscow, russia, 2 institute of organic chemistry, johannes-gutenberg-university, mainz, germany, 3 humboldt university, charite, institute of cell and neurobiology, berlin, germany
multidrug resistance (mdr) of tumours is associated with overexpression of the p-glycoprotein responsible for active drug efflux from cells. block copolymers of ethylene oxide and propylene oxide (pluronics) are known to cause a pronounced chemosensitization of tumour cells. the effect may be due either to specific polymer-protein interactions or to unspecific lipid bilayer disturbance. we have shown recently that amphiphilic copolymers with various hydrophilic and hydrophobic blocks can disturb lipid bilayers. importantly, block copolymers of propylene oxide and glycerol (ppo-pg) with a hyperbranched "corona" induced larger effects than pluronics with linear polyethylene oxide chains. in the present work we have shown that ppo-pg copolymers increase dox cytotoxicity towards human erythroleukemia (k562is9, k562/dox) and breast carcinoma (mcf7/dox) resistant cell lines. using confocal and two-photon microscopy, we demonstrated that these copolymers accelerated dox penetration into resistant cells, inhibited efflux and caused drug redistribution into nuclei. a clear correlation between the ability of the polymers to disturb lipid bilayers and to favour drug accumulation in mdr cells was found. this finding points to an unspecific mode of the copolymers' chemosensitizing activity.
voltage dependence of processes related to electrogenic membrane transporters
electrogenic membrane transporters, such as the sodium-bicarbonate cotransporter (nbce1), may confer dependence on membrane potential upon processes which are innately voltage-independent. we tested this hypothesis by heterologously co-expressing electrogenic nbce1 from human kidney in oocytes of the frog xenopus laevis with the electroneutral rat monocarboxylate transporter mct1. the apparent intracellular buffer capacity was increased by nbce1 expression and became voltage-dependent (7 mm per 10 mv of membrane depolarisation). lactate transport via mct1 not only became enhanced after co-expression with nbce1, but also became dependent upon membrane potential. injection of carbonic anhydrase caii from bovine erythrocytes into oocytes enhanced the efficacy of nbce1 activity, identifying an additional, ca-sensitive membrane current via nbce1. our results show that nbce1 adds voltage-dependent buffer capacity to the cytosol; this is suggested to be the prime cause for enhancing acid/base-coupled transport and conferring membrane potential dependence on transporters which are stoichiometrically electroneutral. these interactions may have functional consequences for cells and tissues where electrogenic and electroneutral processes interact, such as in brain, heart and muscle.
by using carboxy-snarf-1 as a ph-sensitive fluoroprobe and microspectrofluorimetry, we now show that nhe1 activation is due to jun kinase (jnk) activation, resulting from reactive oxygen species (ros) produced during metabolism of b(a)p, and might involve lipid rafts. when analysing b(a)p-induced apoptosis, we found that cariporide significantly reduces both nuclear fragmentation and caspase-3-like activity. we further show that nhe1 activation and/or alkalinization affects the mitochondrial ros production detected during the apoptotic cascade, likely via an effect on complex iii of the electron transport chain.
altogether, our results suggest that apoptotic xenobiotics, such as benzo[a]pyrene, induce an early activation of nhe1 that might play a significant role in the subsequent mitochondria-dependent apoptosis.
conformational dynamics of a lactose proton symporter
j. c. holyoake, m. s. sansom
biochemistry department, university of oxford, south parks road, oxford, ox1 3qu, u.k.
transporter proteins are an essential class of proteins, catalysing the transfer of molecules across membranes. increasing numbers of transporter structures are becoming available, opening the way to study their dynamic properties using computational techniques. we present a study on the conformational dynamics of the lactose proton symporter lactose permease using molecular dynamics simulations. these simulations exhibit large-scale conformational changes from the initial intracellularly open conformation to a more closed conformation that may be significant for the transport mechanism. the conformational change is analysed to identify the contributing motions.
effects of nortriptyline and chlorpromazine on anthroylouabain-labeled na,k-atpase
e. a. guevara 1, m. l. barriviera 1, a. hassón-voloch 2, s. r. louro 1
1 departamento de física, puc-rio, rio de janeiro, brazil, 2 instituto de biofísica carlos chagas filho, ufrj, rio de janeiro, brazil
the effects of nortriptyline and chlorpromazine (cpz) on the fluorescence properties of anthroylouabain (ao)-treated na+,k+-atpase of electrocyte membranes from e. electricus are studied. na+,k+-atpase oscillates between two major conformations, e1 and e2, during the ion transport cycle. the cardiotonic steroid ouabain specifically inhibits this enzyme by binding to the e2 conformation. the fluorescent label ao shows increased fluorescence when bound to the ouabain site of na+,k+-atpase. tricyclic drugs such as the antipsychotic cpz and the antidepressant nortriptyline inhibit na+,k+-atpase activity in the micromolar range.
for the e2 enzyme, but not e1, nortriptyline was found to increase the fluorescence in a concentration-dependent manner, suggesting a further stabilization of e2. for both conformations, cpz induces a negligible fluorescence change up to 10 µm. the fluorescence of atpase-bound ao, however, strongly increases upon ultraviolet exposure after cpz treatment at concentrations around 20 µm. fluorescent products of cpz photodegradation were studied in pure buffer and in the presence of membranes. the results suggest that cpz binds to na+,k+-atpase and photolabels amino-acid residues near the ouabain binding site.
how important is protein flexibility for transport through ion channels?
a. a. gray-weale
university of sydney
certain ion channels are selective for k+ over other ions, but the geometry of the pore does not explain selectivity because thermal fluctuations are too large. i extend the usual treatment of ion channels with molecular dynamics simulation by calculating the static and dynamic pair correlations between monovalent ions and ion channels (gramicidin-a and kcsa), and also between certain small, complex cations and the gramicidin channel. this means not only the radial distribution functions or the density profiles, but also correlations between the ion and the mass and charge densities of various regions of the protein. the advantage of this approach is that it systematically identifies the elements of the protein and modes of motion that contribute to selectivity, and illustrates the decay of correlations. recently, noskov et al. [1] showed that thermal fluctuations protect selectivity. my results on the interaction of ions with carbonyl groups agree with theirs, but take the analysis further to higher correlations. the key new element is the study of the time-correlation functions that describe the motion of the ions through the channel, borrowing methods originally developed for the study of dense or even supercooled liquids.
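as an illustration of the kind of time-correlation function referred to in the preceding abstract, the following minimal python sketch estimates the normalized autocorrelation of an equally sampled scalar series, for example an ion's velocity along the channel axis. it is not the author's method: the function names and the toy input series are assumptions made here for illustration only, and a real analysis would use trajectories from the simulation.

```python
def autocovariance(series, max_lag):
    """Estimate C(k) = <dx(t) * dx(t + k)> for k = 0..max_lag.

    dx is the deviation of the series from its mean; the average runs
    over all time origins t available for each lag k.
    """
    n = len(series)
    if n == 0 or not 0 <= max_lag < n:
        raise ValueError("need 0 <= max_lag < len(series)")
    mean = sum(series) / n
    fluct = [x - mean for x in series]
    result = []
    for lag in range(max_lag + 1):
        pairs = n - lag  # number of time origins for this lag
        total = sum(fluct[i] * fluct[i + lag] for i in range(pairs))
        result.append(total / pairs)
    return result


def normalized_autocorrelation(series, max_lag):
    """Return C(k) / C(0), so that the value at lag 0 equals 1."""
    cov = autocovariance(series, max_lag)
    if cov[0] == 0:
        raise ValueError("series has zero variance")
    return [c / cov[0] for c in cov]


# Toy input (hypothetical, not simulation data): a strictly alternating signal.
toy = [1.0, -1.0] * 4
acf = normalized_autocorrelation(toy, 2)
```

for the alternating toy series the correlation flips sign at every lag, whereas a series from a liquid-like environment would typically show a decay towards zero; how fast it decays is the quantity of interest in such an analysis.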
surface-enhanced infrared absorption spectroscopy (seiras) is used for the investigation of two membrane proteins, cytochrome c oxidase (cco) and bacteriorhodopsin (br). of central interest are the transport mechanisms of electrons (cco) and ions (br). the main parts of the setup are an infrared light source, a hemispherical si crystal in which the beam is internally reflected, and a plexiglas cell with buffer solution (see schematic sketch). the beam is totally reflected at the inner flat surface of the crystal, but the evanescent wave excites surface plasmons in a chemically adsorbed gold layer on the crystal (attenuated total reflection spectroscopy, atr). the protein to be analysed is attached to the gold surface and can absorb certain wavelengths. the gold is needed for the surface-enhancing effect. cco plays a major role in the respiratory chain; the retinal protein br is a photosynthetic protein. the cco is immobilized on the gold surface via the affinity of its histidine tag for a nickel-chelating nitrilotriacetic acid (nta) surface. for br, we incorporate the protein in a lipid membrane, which is attached to the gold surface by the sulfur bonds of the 2,3-di-o-phytanyl-sn-glycerol-1-tetraethylene glycol-d,l-lipoic acid ester lipid (dptl). the sensitivity of this method is further enhanced by modulation of an external parameter, such as the electric potential.
effects of copper ions on the escherichia coli growth and proton-potassium exchange
copper ions are required for the function of many important enzymes in escherichia coli but can also cause a number of toxic cellular effects. it is therefore of interest to determine the influence of copper ions on the growth of bacteria and on proton-coupled membrane systems. upon transition of e.
coli mc4100 wild-type culture to stationary growth phase, a decrease in redox potential (eh) from positive values (+140 mv) to negative ones (-380 to -550 mv) and the resulting h2 production by formate hydrogenlyase (fhl) were studied. copper increased the duration of the latent growth phase and delayed the logarithmic growth phase in a concentration-dependent manner. during anaerobic growth, the production of h2 was strongly inhibited in the presence of cucl2 (2 mm). 0.1 mm cucl2 inhibited h2 production under experimental conditions with glucose. an inhibitory effect of copper ions (0.1 mm) on n,n'-dicyclohexylcarbodiimide (dcc)-sensitive h+/k+ exchange was also observed: k+ uptake was decreased and the stoichiometry of dcc-inhibited ion fluxes varied. interestingly, these effects on h+ and k+ fluxes were absent in the mutant hd700 (in which the hyc operon for hydrogenase was deleted). we suggest that copper ions, by inhibiting the activity of fhl, have an effect on the h+/k+-exchanging mechanism, which is the proton f0f1-atpase associated with the k+ uptake trk system. this effect may be due to the relationship of fhl with the above ion-exchanging mechanism under fermentation at alkaline ph.
lateral diffusion in tethered bilayer membranes
m. jung, v. atanasov, w. knoll, i. koeper
max planck institute for polymer research, mainz, germany
tethered bilayer membranes (t'blm) provide a useful platform for the investigation of bilayer membranes as well as embedded membrane proteins. we have developed a modular system which is suitable for surfaces providing gold and oxide surface coatings. these systems serve as a quasi-natural environment for the study of membrane proteins, which are functionally incorporated into a lipid bilayer that is covalently bound (tethered) to the substrate. functionality could be shown using electrochemical methods.
here we present a study of these systems, focusing on the membrane itself, using fluorescence recovery after photobleaching (frap) to investigate lateral motion in the lipid bilayers. lateral mobility is essential for successful incorporation of large membrane protein complexes. we will present first results of experiments that aim to distinguish motion hindered by the tethering from diffusion in free-floating or suspended bilayers. the information gained in this study will serve to improve the chemical structure of the tethered molecules. we will develop the system as a basis for biosensing applications, in which embedded proteins will serve as the actual sensing elements.

a. d. ivetac, j. campbell, m. s. p. sansom biochemistry department, university of oxford, south parks road, oxford, ox1 3qu, u.k.

atp-binding cassette (abc) transporters form an important superfamily of membrane proteins which couple atp hydrolysis to the active transport of diverse compounds across the cell membrane. their biomedical relevance is highlighted by examples such as multidrug resistance to antibacterial and anticancer agents, and cystic fibrosis. the availability of crystal structures of three complete bacterial abc transporters provides an opportunity to study structure-function relationships at the atomic level. in this work, we carry out multi-nanosecond molecular dynamics simulations of the vitamin b12 importer from e. coli (btucd), with both the complete multimeric transporter embedded in a phospholipid bilayer and the soluble subunits in a membrane-free environment, in an attempt to elucidate some of the conformational changes which arise during the transport event. atp-bound and atp-free structures are used to investigate the effect of nucleotide on the system.
a range of analytical techniques has been applied to assess the dynamic behaviour of the protein during the simulations, including measurements of conformational drift, residue flexibility, transmembrane domain (tmd) movement, concerted protein motions, nucleotide binding and translocation pathway changes.

an in vitro method was designed to measure transmembrane transport rates. liposomes were prepared by extrusion with dipalmitoyl phosphatidylcholine (dppc) and optionally cholesterol, and loaded with a peptide (zinc-insulin tagged with a fluorescent group, or bsa). after removal of the non-encapsulated peptide from the liposome solution by gel filtration, the release of the peptide from the liposomes was monitored by fluorescence as a function of time at various temperatures. the transport was greatly accelerated by the presence of a specific proprietary excipient molecule (cyclopentadecanolide, cpe215™), which effectively triggered the release of the peptide. a mathematical model was developed to quantify these results. a semi-empirical nonlinear equation involving four parameters fits the protein release profiles. a neural network prediction model was then used to correlate the release condition parameters with the four semi-empirical fitting parameters, based on the experimental data sets. most release data fit the mathematical model well, further supporting our theory of a two-step release mechanism.

phenyltin

phenyltins are a well-known group of organotin compounds that exhibit toxic properties towards biological systems. no studies have yet been performed to establish directly whether organotins such as diphenyltin dichloride (dpht) and triphenyltin chloride (tpht) cross the lipid bilayer. we have performed experiments that showed transfer of these compounds across the lipid bilayer, using the stopped-flow technique, and desorption of these compounds from a monolayer, using the langmuir technique.
the results obtained demonstrate that dpht and tpht first adsorb onto the lipid bilayer surface in a diffusion-controlled manner within a very short time (0.05 s), whereas passage through the membrane was observed on a time scale of minutes. the long-time kinetics show a complex dependence on the kind of compound, its concentration and the presence of cholesterol in the membrane. the desorption of both compounds from the monolayer to the water subphase occurs on a time scale of minutes. these observations may explain the known fact that organic, amphiphilic tin (and also lead) compounds are more toxic than inorganic ones. the phenyltins penetrate, e.g., the blood-brain barrier much more easily than tin or lead ions.

under resting conditions, the ca2+ concentration in agonist-sensitive ca2+ stores reflects a balance between active uptake mediated by a ca2+-atpase (serca) and passive efflux of ca2+. this ca2+ leak appears to be a common property of ca2+-storing organelles, but the nature of the leak in submandibular acinar cells remains unclear. we have studied the ca2+ leak pathways in the endoplasmic reticulum (er) of acinar cells of rat submandibular salivary gland by directly measuring the ca2+ concentration in the er ([ca2+]er) in mag-fura 2/am preloaded cells while [ca2+]i was clamped at a resting level with an egta/ca2+ mixture. we have shown that thapsigargin (tg) or ca2+-free buffer treatment completely blocked ca2+ uptake by serca after the first minute of superfusion and caused a ca2+ leak, seen as a continuous decline in [ca2+]er. this ca2+ leak from the er was not sensitive to tg, heparin or ruthenium red and therefore appears to be independent of serca, the insp3 receptor and the ryanodine receptors. however, treatment with puromycin (0.1-1 mm) to remove nascent polypeptides from er-ribosome translocon pores increased the ca2+ leak from the er by a mechanism independent of serca, insp3 or ryanodine receptors.
thus we conclude that the basal ca2+ leak from the er of submandibular acinar cells occurs through translocon pores in the er membrane.

r. b. kishore, j. reiner, e. edgu-fry, a. jofre, k. helmerson physics laboratory, national institute of standards and technology

we have developed a procedure to make lipid and polymer nanotubes, up to one cm long and 50 nm in diameter, from the surface of giant liposomes and polymersomes, using microfluidics and optical tweezers. the liposomes and polymersomes were formed by the electroformation method from phospholipids and amphiphilic diblock copolymers, respectively. the polymer tubes were made extremely robust by chemical cross-linking. we are currently studying the transport of molecules in the cross-linked nanotubes for use in nanofluidic networks.

p-glycoprotein (p-gp) is an active membrane transporter capable of expelling from the cell a large number of potentially cytotoxic amphiphilic molecules with unrelated chemical structures. as a consequence, p-gp may be responsible for multidrug resistance of tumors against chemotherapy (mdr); it also plays a key role in the absorption, biodisposition and elimination of many pharmaceuticals. to understand the molecular mechanisms of the transmembrane drug transport mediated by p-gp, it is highly desirable to design a convenient assay for measuring both p-gp atpase activity and p-gp transport function. to do so, we used inside-out native membrane vesicles, prepared from mdr cells and containing high amounts of p-gp. we took advantage of a specific property of a fluorescent dye, the carbocyanine jc-1, known to be expelled from mdr cells: above a critical concentration (the "cjc"), this dye forms j-aggregates which emit fluorescence at a wavelength very different from that emitted by the monomer.
in the presence of mgatp, the p-gp-containing vesicles accumulated jc-1, which locally exceeded the cjc and thus formed intraluminal j-aggregates; these aggregates allowed the accumulated jc-1 both to be sequestered inside the vesicles, by dramatically slowing down its passive back-diffusion, and to be specifically detected. kinetic characterization of this transport suggests that jc-1 is first translocated to the exoplasmic leaflet of the vesicle membrane before its internalization into the aqueous phase of the vesicle lumen.

interaction between the energy metabolism and externally applied electric fields in yeast cells

electric fields are often used for biophysical or biomedical treatment of biological cells, e.g. cell fusion or killing of cells. however, only a few data about the possible mechanisms of electrosensitivity of biological cells are available. since electrostimulation always induces depolarization of biomembranes, an impact on the energy metabolism is to be expected, because electrochemical gradients are regenerated at the expense of cellular energy. our aim is to investigate the interactions between externally applied electric fields and the aerobic/anaerobic energy metabolism of yeast cells. for this, we have constructed a new electrical interface for local stimulation of biological cells with variable duration and amplitude. when applying short electrical pulses to yeast cells, we find a direct response of the energy metabolism (measured by nadh fluorescence) to these pulses. a sudden and fast decrease in nadh is followed by a slower recovery of the fluorescence signal. these nadh signals are abolished in the presence of antimycin a or kcn, demonstrating the importance of mitochondrial energy production for this phenomenon. we attribute these changes to the immediate breakdown of atp as a consequence of the regeneration of the membrane potential (atpases) and the slower regeneration of atp by mitochondrial respiration.
new insights into hypericin blood transport and its incorporation into the plasma membrane

b. m. macri 1, g. stoian 2, m. l. flonta 1 1 department of animal physiology and biophysics, faculty of biology, university of bucharest, 2 department of biochemistry, faculty of biology, university of bucharest, bucharest, romania

hypericin (hyp) is one of the active compounds from hypericum perforatum (a herb usually prescribed as an antidepressant). the bioavailability of this hydrophobic molecule is a very important issue for medical applications. the goal of our work was to study the mechanisms of hyp blood transport. absorption spectroscopy, electrophoresis and fluorescence microscopy were used to define the properties of hyp-albumin and hyp-lipoprotein complexes and to explain the action of hyp at the plasma membrane level. hyp binds to several bands resolved by sds-page of mouse plasma. both albumin and lipoproteins bind hyp, forming complexes, during the blood transport process. gradient electrophoresis showed that different types of lipoproteins from the plasma of male and female mice bind hyp. the hyp-albumin complex was also identified by absorption spectra, and the ratio a594/a550 is ph-dependent. the interaction of hyp with plasma membranes was also examined in cell culture by fluorescence microscopy; hyp incorporation into the plasma membrane is a dose- and incubation time-dependent process. our results partially elucidate the plasma fractions that bind hyp and contribute to its blood transport. this study proposes a new mechanism of cellular insertion of hyp, discussing its penetration of the plasma membrane in terms of its high hydrophobicity.

the overexpression of p-glycoprotein (p-gp) is one of the major causes of multidrug resistance (mdr) in cancer chemotherapies. many p-gp inhibitors have been designed to reverse the mdr effect, but the structure-activity relationships of p-gp inhibitors and substrates still remain largely unknown.
it is still very challenging to obtain high-resolution p-gp structures, and currently only a low-resolution electron microscopy structure is available. this has made the structure-based design of "p-gp-ignoring" therapeutic agents a remote goal. however, the recent determination of x-ray crystal structures of the bacterial lipid a transporter, msba, has provided suitable structural templates for homology modeling of p-gp. we have therefore conducted explicit-solvent molecular dynamics simulations of the full-length efflux pump, human p-gp, in an excessively hydrated popc bilayer to refine the homology model. both free and atp-bound forms of p-gp have been simulated. the entire system consists of more than 365,000 atoms. our molecular dynamics simulations have shown that the overall architecture of p-gp remained very stable for tens of nanoseconds, while the observed membrane undulation was rather large. the simulation results have allowed us to investigate the conformational changes of p-gp upon atp binding in the efflux process and to predict the possible binding sites of various known substrates and inhibitors. the structure models of p-gp refined by our simulations could be used as the basis for further drug design.

electroporation (ep) is a phenomenon in which increased permeability of cells exposed to an external electric field is observed. the induced transmembrane voltage presumably leads to the formation of aqueous pores in the phospholipid bilayer, which increases the permeability of the membrane for molecules and ions. ep is currently used in many biomedical applications, including gene transfer and electrochemotherapy of tumors. still, the molecular mechanisms of the process are not fully explained. recently it was proposed that ep could be monitored in real time by measuring the electric conductivity of tissue. so far, studies have focused mostly on a single pulse; in biomedical applications, however, several pulses are usually used.
in our study we used a train of electric pulses to analyse the relationship between electric conductivity and cell permeabilization. current-voltage measurements during and after pulse application were performed in a dense suspension of cells. conductivity changes were analysed numerically using the finite element method and compared with the percentage of permeabilized cells. we observed a transient increase in conductivity above a certain voltage, with complete relaxation in < 1 s. substantial changes in conductivity are also due to the diffusion of ions through membrane pores and to osmotic swelling. we further show that the relation between conductivity and permeabilization level is indirect.

interaction of quinolones with bacterial porin ompf: fluorescence quenching studies

quinolones are widely used antibiotics which exert their antibacterial action by inhibiting important bacterial enzymes. because their targets are located inside the cell, translocation of these drugs through the outer membrane is an essential step in their antibacterial action. in vivo studies have shown that ompf is important for the entry of some of these antibiotics, but the exact degree of involvement of this protein in the transport of the different members of this group of antibiotics remains unknown. in this study, the quenching of the intrinsic tryptophan fluorescence of ompf, in the presence and absence of the drugs and by two distinct quenchers, was used as a first approach to elucidate ligand-induced structural changes and consequently to prove the differential involvement of ompf in the entry of these antibiotics into the bacterial cell. the results obtained reveal that the degree of interaction with the protein is related to the hydrophobicity of the different antibiotics.
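as an illustration of how fluorescence quenching titrations of this kind are commonly analysed, the sketch below fits the stern-volmer relation f0/f = 1 + ksv[q] to a titration series. the abstract does not state which analysis was used, so the relation, the function name `stern_volmer_constant` and the numbers in the usage lines are assumptions made for illustration only, not the authors' method or data.

```python
def stern_volmer_constant(quencher_conc, f0, intensities):
    """Least-squares slope of (f0/f - 1) versus [q], with the intercept fixed at 0.

    Assumes purely collisional (linear) quenching. The result has the inverse
    of the concentration units passed in.
    """
    if len(quencher_conc) != len(intensities):
        raise ValueError("need one intensity per quencher concentration")
    num = 0.0
    den = 0.0
    for q, f in zip(quencher_conc, intensities):
        y = f0 / f - 1.0  # stern-volmer ordinate minus the fixed intercept of 1
        num += q * y
        den += q * q
    if den == 0.0:
        raise ValueError("at least one non-zero quencher concentration is required")
    return num / den


# hypothetical titration: concentrations in mol/l, intensities in arbitrary units
conc = [0.0, 0.05, 0.10, 0.20]
f0 = 100.0
f = [f0 / (1.0 + 5.0 * q) for q in conc]  # synthetic data generated with ksv = 5 l/mol
ksv = stern_volmer_constant(conc, f0, f)
print(round(ksv, 6))
```

under the stated assumption of linear quenching, comparing ksv obtained with and without a bound drug would be one simple way to express how strongly the drug changes the accessibility of the tryptophan residues to a given quencher.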
this kind of evidence suggests that entry through the porin channel is not the only path used by these antibiotics and that it is more important for the latest generations of this group because of their increased hydrophilic character.

inhibition of multidrug resistance-associated protein mrp1 and kv channels by natural polyphenols

k. michalak, a. teisseyre, b. ania-pietrzak department of biophysics, wroclaw medical university, poland

resistance to cytotoxic agents remains a major obstacle to successful chemotherapy in cancer. the best-characterized form of drug resistance is caused by the overexpression of genes encoding membrane drug pumps, such as p-gp or mrp1. in the present study, the activity of several plant polyphenols (flavonoids and stilbene) against mrp1 has been studied using a functional assay based on the efflux of a fluorescent mrp1 substrate. very recently, a role of kv1.3 potassium channels in the proliferation of various cancer cells was suggested. in our study the effect of the plant polyphenols on voltage-gated potassium channels kv1.3 was investigated by the patch-clamp electrophysiological method. some of the compounds studied were found to be active inhibitors of multidrug resistance-associated protein mrp1 and of voltage-gated potassium channels, and their properties are promising for further research in the field of anticancer activity of natural products.

the transport mechanism of melibiose permease: a study using electrical measurements and fluorescence techniques

k. meyer-lipp 1, c. ganea 2, t. pourcher 3, g. leblanc 3, k. fendler 1 1 max planck institut für biophysik, frankfurt, germany, 2 c. davila medical university, bucharest, romania, 3 cea, université de nice sophia antipolis, nice, france

the melibiose permease (melb) of escherichia coli is a membrane-bound carrier that uses the favorable na+, li+, or h+ electrochemical potential gradient to drive cell accumulation of alpha-galactosides (melibiose, raffinose) or beta-galactosides (methyl-1-thio-beta-d-galactopyranoside).
electrophysiological techniques have proved to be extremely useful tools to investigate the mechanism of ion transfer across the membrane by ion-coupled transporters. using a solid supported membrane (ssm) as a capacitive electrode, rapid solution exchange can be combined with the high sensitivity of planar lipid membranes, allowing time-resolved investigation of charge translocation during the catalytic cycle of na+/solute symporters. this technique has been combined with fluorescence measurements, which report on structural changes during the substrate transport process of the carrier. we have used time-resolved tryptophan fluorescence, fluorescence energy transfer with a fluorescent sugar substrate and site-specific fluorescence of a label attached to a cysteine residue on the protein. this allowed us to identify conformational transitions during the reaction cycle of the melibiose permease. we could assess their electrogenicity and determine rate constants. a kinetic model for na+ and melibiose binding and transport is presented.

electrophysiological characterization of the vast number of annotated channel and transport proteins in the postgenomic era would be greatly facilitated by the introduction of rapid and robust methods for the functional incorporation of membrane proteins into defined lipid bilayers. we present an automated method for reconstitution of membrane proteins into lipid bilayer membranes that substantially reduces both the reconstitution time and the amount of protein required. we have applied this well-defined system to the characterization of a novel mitochondrial uncoupling protein, ucp2, and demonstrated that ucp2 exhibits protonophoric function exclusively in the presence of fatty acids, similar to that previously shown for its homologue ucp1. the membrane conductance was proportional to the concentration of reconstituted ucp2 in the presence of oleic acid or eicosatrienoic acid, and was inhibited by atp.
amphipols are amphipathic polymers designed to replace or supplement detergents in membrane protein solution studies. for the study of the ca2+-atpase from sarcoplasmic reticulum, previous experiments have revealed both advantages and disadvantages to the use of a polyacrylate-based amphipol, a8-35. these issues have been reinvestigated using four different amphipols. size exclusion chromatography showed that, although a8-35 aggregates in the presence of millimolar concentrations of calcium (an effect that probably accounts for most of the aggregation of atpase/a8-35 complexes observed in our previous work), aggregation can be avoided by resorting to a sulfonated version of a8-35. we also found that all amphipols tested slowed down the rate of calcium dissociation from its binding sites and reduced atpase activity, while protecting the solubilized protein against denaturation. this suggests that association with the polymer may damp the protein's dynamics, perhaps due to the multipoint attachment of the polymer to its hydrophobic transmembrane surface. such a "gulliver" effect could contribute both to the protection of membrane proteins against denaturation and to the reversible inhibition of serca1a.

clc proteins are found from prokaryotes to mammals. they function as plasma membrane chloride channels or provide neutralizing anion currents for v-type h+-atpases that acidify compartments of the endosomal/lysosomal pathway. vesicular clcs have been thought to be cl- channels, in particular because clc-4 and clc-5 mediate plasma membrane cl- currents upon heterologous expression. we have shown, however, that these two mainly endosomal clc proteins rather function as electrogenic cl-/h+ exchangers, resembling the transport activity of the bacterial clc-e1 that has been crystallized. neutralization of a critical glutamate residue not only abolished the steep voltage-dependence of transport, but also eliminated the coupling of anion flux to proton counter-transport.
clc-4 and clc-5 may still compensate the charge accumulation by endosomal proton pumps, but are expected to tightly couple vesicular ph gradients to cl- gradients.

calorimetry and mechanics of ca2+ transporting systems in rat myocardial bigeminies

a series of new n-oxides of tertiary amines (nta) was checked for its biological activity. individual compounds differed in the length of the substituted alkyl chain. the primary goal was to find whether they can be used as effective antioxidants and to what degree they modify the model (liposomes) and biological (erythrocytes, algae and cucumber) membranes used. various methods were used to this end. the mechanism of interaction between nta and membranes was studied by measuring their potency to hemolyse erythrocytes, to influence the phase transition temperature of dppc liposomes and to change the membrane potential of algal cells. the interaction of nta with cucumber cells was measured as potassium leakage, chlorophyll content and inhibition of hypocotyl growth. the antioxidative abilities of nta were determined by measuring their efficiency in protecting erythrocytes against membrane lipid oxidation induced by uv irradiation and by comparing their antioxidative efficiencies with that of trolox (vitamin e analogue) in chromogen experiments.

the most widely employed mechanism of drug extrusion in bacteria is via membrane transport proteins called efflux pumps. in gram-negative bacteria, multidrug resistance is conferred by tripartite complexes, rather than by a single transport protein. through these systems, a wide range of substrates is expelled from the cytoplasm, through the periplasmic region, to the exterior of the cell. among these complexes, the acrab/tolc system in escherichia coli is formed by an inner membrane efflux pump, acrb, an outer membrane protein, tolc, and a periplasmic protein known as an adaptor, acra. the components of this complex are studied in order to provide insights into drug transport in bacteria.
here we present a dynamics study of mexa, the homologue of acra from pseudomonas aeruginosa. the protein has been studied by molecular dynamics simulations in bulk water. a structural adjustment of the periplasmic protein is required in order to engage both the bottom part of the om protein and the top region of the im protein. the dynamics of mexa reveals a flexible behaviour of the protein in water. the major concerted motions observed are the hinge-bending of the two domains and the rotation of the β-barrel domain. these can be related to the adaptation of mexa (and acra) to the om and im proteins during assembly of the complex and during the opening of the channels.

electrophysiologic study of ap in chara corallina - indication of its biochemical nature

the specific conductances of aqueous solutions of electrolytes (viz. naf, nacl, nano3, na2so4, kf, kcl, kno3, k2so4, mgcl2, cacl2, fecl3, mncl2, crcl3, cucl2, cocl2) have been measured across peritoneum at temperatures between 15 and 35 °c. conductance attains a maximum limiting value at higher concentrations for each electrolyte due to a progressive accumulation of ionic species within the transmembrane region. the membrane becomes more and more conductive to incoming ions and attains a limiting value because an electrically neutral pore, which is specific for a particular ion, is unlikely to contain more than one type of ion. consequently, at high electrolyte concentration, the pore saturates and the conductance approaches a limiting value. the values of specific conductance measured follow the sequence so4(2-) > cl- > no3- > f- for anions, whereas for the cations the sequence is k+ > na+; ca2+ > mn2+ > co2+ > cu2+ > mg2+; cr3+ > fe3+.
the energy of activation for the cations as well as for the anions follows the sequence (for cations):

the low temperature (77 k) chlorophyll fluorescence, photochemical activity, oxygen flash yield and oxygen burst decay of thylakoid membranes with different organization of the light-harvesting chlorophyll a/b complex of photosystem ii (lhcii) were investigated after a freeze-thaw cycle in cryotoxic and cryoprotective medium. the increase of lhcii oligomerization, which is associated with a significant reduction of the surface charge density of the thylakoid membrane, correlates with a lower extent of freezing damage of the photosynthetic apparatus when the procedure is carried out in cryotoxic medium (nacl). in the presence of the cryoprotective compound (sucrose), freezing damage is less pronounced and is not affected by the degree of lhcii oligomerization. the mechanisms of damage and protection of the photosynthetic apparatus during freeze-thaw treatment are discussed.

spectral and redox characterization of the novel heme ci in the cytochrome b6f complex

this is an indication of different spin delocalization in the primary donor, for the mutant being typical of a monomeric oxidized bchl. considering the fact that the properties of both isolated and membrane-associated mutant rcs were similar, we conclude that the missing bchl molecule in the mutant rc was the result of the introduced mutation and not of the protein purification procedure. authors acknowledge the support by the russian brf.

we created two site-directed mutants, a249s and l267i, in the d2 protein of photosystem ii in thermosynechococcus elongatus. both mutations are within the binding pocket of the primary quinone acceptor (qa). we investigated the effects of the mutations in vivo and in isolated psii. while the l267i mutant exhibits characteristics similar to the wild type, the a249s mutation affects qa charge recombination as measured by thermoluminescence and fluorescence decay.
these results strongly indicate that the a249s mutation induces a shift in the redox potential of qa. the a249s mutation accelerates the rate of photoinhibition, an effect consistent with the negative shift in the redox potential. epr was used to measure the temperature dependence of the electron transfer from qa to qb in the a249s mutant. it was found to be indistinguishable from the wild type despite the difference in the midpoint potential of qa. this is taken as an indication of a gating mechanism on the acceptor side of psii similar to that in bacterial reaction centers.

protochlorophyllide oxidoreductase takes an abnormal reaction pathway below the glass transition

g. durin 1, d. j. heyes 2, c. n. hunter 2, d. bourgeois 1 1 ibs and esrf, grenoble, france, 2 krebs institute and r hill institute for photosynthesis, sheffield university, sheffield, uk

motions through the energy landscape of proteins lead to biological function. at temperatures below a dynamical transition (150-250 k), the activity of some proteins ceases. in this work, we describe an enzyme that, instead, enters a non-productive pathway below 160 k. protochlorophyllide oxidoreductase (por) catalyzes the reduction of protochlorophyllide (pchlide) into chlorophyllide (chlide), a key step in chlorophyll biosynthesis. por is one of the two enzymes known to require light for catalysis. when illuminated with gentle light at 165 k, the complex of t. elongatus por with pchlide and nadph transforms into a non-fluorescent intermediate. upon warming, several fluorescent intermediates develop, and at 290 k chlide is released. when illuminated at temperatures below 155 k, por behaves differently. if gentle light is used, the reaction cannot start. if, instead, a blue laser source is used, the initial complex disappears, as at 165 k. however, upon warming, a new intermediate develops that fluoresces at 694 nm and leads to a dead-end product.
by using fluorescence microspectrophotometry, we have measured the solvent glass transition temperature of the system to be 158 k. the solvent glass transition, possibly controlling a por dynamical transition, may be the determinant that switches the enzyme reaction pathway from a non-productive to a productive one. the non-productive pathway results from a two-photon absorption mechanism, whereas the productive pathway is a one-photon mechanism.

sensory rhodopsin ii from n. pharaonis (npsrii) forms a complex with its cognate transducer nphtrii in a 2:2 stoichiometry [1]. light activation of npsrii leads to a movement of helix f which triggers a rotation of tm2 in nphtrii [2, 3]. the mechanism of signal transduction through the hamp region to the cytoplasmic domain of the transducer is still unknown. structural information exists for the transmembrane and cytoplasmic regions; however, the hamp domain is not yet characterized. in order to obtain structural information on this domain, twenty-four residues in the membrane-adjacent region (78-101) and six residues in the following region were spin labeled and investigated by cw and pulsed x-band epr. to analyze the overall architecture of the complex, doubly spin-labeled variants between the transducer and the receptor were also engineered.

depending on their function, the absorption spectra of rhodopsins can be tuned by the protein over a wide range. major determinants of spectral shifts between different rhodopsins are electrostatic interactions between the chromophore retinal and the protein. we compute and compare the classical electrostatic potential at the retinal of three archaeal rhodopsins: bacteriorhodopsin (br), halorhodopsin (hr), and sensory rhodopsin ii (srii). these proteins are an excellent test case for understanding the spectral tuning of retinal. the absorption maxima of br and hr are very similar, while the spectrum of srii is considerably blue-shifted.
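a minimal sketch of a classical electrostatic potential calculation is given here: the coulomb potential at a probe position is summed over point charges, and the contributions are also grouped by residue. the uniform dielectric constant, the function names and the toy coordinates and charges are assumptions for illustration; the abstract does not specify the electrostatic model actually used.

```python
import math

# coulomb constant for charges in units of e and distances in angstrom;
# the resulting potential is in kcal/(mol*e)
COULOMB_K = 332.0637


def potential_at(point, charges, dielectric=4.0):
    """Sum k*q/(eps*r) over point charges given as (x, y, z, q) tuples."""
    total = 0.0
    for x, y, z, q in charges:
        r = math.dist(point, (x, y, z))
        if r == 0.0:
            raise ValueError("probe point coincides with a charge")
        total += COULOMB_K * q / (dielectric * r)
    return total


def potential_by_residue(point, residues, dielectric=4.0):
    """Return {residue name: potential contribution} for a dict of charge lists."""
    return {name: potential_at(point, atoms, dielectric)
            for name, atoms in residues.items()}


# toy system (hypothetical coordinates and charges, not taken from any structure)
residues = {
    "asp_a": [(3.0, 0.0, 0.0, -1.0)],
    "arg_b": [(0.0, 6.0, 0.0, +1.0)],
}
probe = (0.0, 0.0, 0.0)
contributions = potential_by_residue(probe, residues)
total = sum(contributions.values())
print(contributions, total)
```

ranking such per-residue contributions for two proteins at equivalent probe positions is one simple way to see which residues account for a difference in potential, within the limits of the uniform-dielectric assumption.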
we find that the electrostatic potential is similar in br and hr, but differs significantly in srii. a quantum mechanical model of a particle in a box with a step potential can qualitatively relate the differences between the electrostatic potentials of the proteins to the relative shifts of their absorption maxima. by decomposing the electrostatic potential into contributions of individual residues, we could identify six residues that are responsible for the differences in electrostatic potential between the proteins. three of these residues are close to the retinal, while the other three residues are more than 8 å away from the retinal. the counterion of the schiff base, which is frequently discussed as being involved in the spectral tuning, does not contribute to the dissimilarities between the electrostatic potentials.

effect of uv-a radiation on thylakoid membranes with different organization

p. ivanova, a. dobrikova, t. markova, s. taneva, e. apostolova institute of biophysics, bulgarian academy of sciences, 1113 sofia, bulgaria

the effect of uv-a (320-400 nm) radiation on the energy transfer and the photosynthetic oxygen evolution of thylakoid membranes from pea mutants was investigated. the membranes have different pigment composition, stoichiometry and organization of pigment-protein complexes. the aim of our work was to find out whether uv-a induced damage is affected by the altered content and/or oligomerization of the main light-harvesting chlorophyll-protein complex (lhcii) in thylakoid membranes.
the data for the effect of uv-a radiation on the oxygen evolution demonstrate that: (i) the inhibition of photosystem ii (psii)-mediated electron transport and flash-induced oxygen yields strongly depends on the amount of lhcii; (ii) the increase of the s0 populations of psii centers in darkness is more pronounced in thylakoid membranes with a smaller amount of lhcii; (iii) the inhibition of the oxygen evolution is related to the reduced number of functionally active psii centres; (iv) the degree of impairment of active psii centres depends on the amount and oligomerization of lhcii. the results also show that the altered content and organization of lhcii influence the uv-a light-induced changes in the energy transfer between psii and psi and within the supramolecular lhcii-psii complex. the effects of uv-a radiation on leaves and isolated thylakoid membranes are compared.

sudden polarisation, a large change in the electric dipole moment between the excited and the ground state, is a well-known phenomenon for the retinal chromophore. some early models of the energy transduction mechanism in bacteriorhodopsin (br) even attribute a primary functional role to it. however, it was apparently unrecognized that the maxwell theory intuitively predicts the appearance of an ultrafast transient electromagnetic radiation due to this dipole moment change. here we show that the existence of this type of radiation can be derived from semiclassical quantum electrodynamics as a second-order phenomenon. in optical terms it corresponds to the previously unstudied resonant case of optical rectification. recently we experimentally observed a major component in the fs coherent infrared emission of oriented purple membranes of br corresponding well to this effect (groma et al., proc. natl. acad. sci. 101, 7971, 2004). 
our theory predicts that such a signal holds detailed information on the dynamics of excited state polarization, opening a new branch of impulsive spectroscopy on asymmetric systems. beyond optical rectification we found a complex phase, a coherent oscillation living for a few ps, i.e. much longer than the excited state of br. fitting analysis yielded at least seven vibrational modes in the 700-1500 cm⁻¹ region, while a windowed fourier transform indicated a time-dependent frequency distribution.

a. ghignoli, g. cercignani, s. lucia, g. colombetti istituto di biofisica, cnr - pisa, dipartimento di fisiologia e biochimica, università di pisa, italy the life cycle of ophryoglena flava, a histophagous ciliate dwelling in fresh waters, reportedly includes several stages that feature morphology changes and different phototactic responses. previous studies on the phototactic responses in o. flava during its phase of maximal positive phototaxis led to an action spectrum with two main peaks at 420 and 590 nm, and a minor peak at 540 nm. starting from those results, we analyzed the phototactic response at various cell ages, using three broad-band interference filters (fwhm = 50 nm) centred respectively at 420, 550 and 600 nm, and constructed dose-effect curves for each band. a higher photosensitivity at 420 nm, and lower photosensitivities with the other two filters (550 and 600 nm), were observed at every cell age. however, the photosensitivities in the blue and orange regions show a different time course vs. cell age with respect to the photosensitivity in the green region. measurements were also carried out on cells whose feeding cycle was altered by a 4-day starvation (twice as long as in the standard protocol) before being fed at t = 0. the maximal photoresponse values reached by starved cells are lower than the highest values reached with standard cultures; in other words, a general reduction of the phototactic response is observed. 
these results suggest that, while feeding optimally induces cell division, it does not generally reset all cellular functions.

a. quaranta 1 , f. lachaud 2 , y. pellegrin 2 , p. dorlet 2 , m.-f. charlot 2 , s. un 1 , a. aukauloo 2 , w. leibl 1 1 service de bioénergétique, cea-saclay, bât. 532, 91191 gif-sur-yvette cedex (france), 2 laboratoire de chimie inorganique, bât. 420, université paris-sud, 91405 orsay (france) coordination complexes based on a photoactive ruthenium-polypyridyl moiety linked to simple, rigid ligands with binding sites for transition metals are developed to mimic the light-induced charge separation and water oxidation processes taking place in the photosynthetic apparatus. inspired by the structure around the donor side of photosystem ii, a family of phenanthroline-based ligands holding an imidazole, a phenol or an indole unit, simulating the amino acids histidine, tyrosine and tryptophan in the oxygen evolving complex, was developed as models for proton-coupled electron transfer. in some of the molecules investigated the hydrogen bonding interaction present in the natural system is reproduced. combined data from photophysical and spectroelectrochemical studies and dft calculations evidenced the photogeneration of a phenoxyl or a tryptophan radical upon excitation of the chromophore in the presence of an external electron acceptor, thereby mimicking the electron transfer between p680 and tyrz-hist190.

finite element model to predict the electric potential distribution in ps i containing vesicles c. p. a. pennisi 1 , e. chemineau 1 , e. greenbaum 2 , k. yoshida 1 1 center for sensory motor interaction, aalborg university, denmark, 2 chemical sciences division, oak ridge national laboratories, usa, 3 facultad de ingeniería, uner, argentina photosynthetic reaction centers (rc) are integral membrane proteins and molecular photovoltaic structures. 
recently, their use as triggers of voltage-gated ion channels in excitable cells, where a certain voltage threshold has to be reached to evoke a response, was suggested (kuritz et al., ieee trans. nanobiosci. in press 2005). experimental studies with rcs reconstituted in lipid vesicles have shown different values of transmembrane voltage, depending on parameters like light intensity, rc concentration and membrane passive properties. ultimately, the purpose of this work is to have a tool to estimate the proximity, number and density of rcs required near a voltage-gated channel to activate an excitable cell. as a starting point, we aim to predict the spatial distribution of the membrane potential in vesicles. a finite element model was realized using a commercial package (femlab, comsol a/s). the three-dimensional distribution of the electrical potential near a single rc in the surface of a spherical vesicle was calculated. in terms of density, in conditions of saturating light, a minimum of 1.8 × 10¹² rcs/cm² is needed to develop a potential of 20 mv, capable of activating voltage-gated sodium channels.

microsecond time-resolved x-ray diffraction study of purple membrane t. oka 1 , k. inoue 2 , m. kataoka 3 , n. yagi 2 1 faculty of science and technology, keio university, japan, 2 japan synchrotron radiation research institute (jasri), japan, 3 nara institute of science and technology, japan the structural changes in the photoreaction cycle of bacteriorhodopsin, a light-driven proton pump, were investigated at a resolution of 7 å by a time-resolved x-ray diffraction experiment utilizing synchrotron x-rays from an undulator of spring-8. the x-ray diffraction measurement system, used in combination with a pulsed yag laser, made it possible to record diffraction patterns from purple membrane film at a time resolution of 6 µsec over the time domain of 5 µsec to 500 msec. 
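
returning to the reaction-centre density quoted in the vesicle model abstract above (1.8 × 10¹² rcs/cm²), the short python sketch below converts that figure into an area per rc and a mean centre-to-centre spacing. it is only a back-of-the-envelope illustration: the square-lattice packing and the function names are assumptions made here, not part of the authors' finite element model.

```python
import math

NM2_PER_CM2 = 1e14  # 1 cm = 1e7 nm, so 1 cm^2 = 1e14 nm^2


def area_per_rc_nm2(density_per_cm2: float) -> float:
    """Membrane area available to one reaction centre, in nm^2."""
    return NM2_PER_CM2 / density_per_cm2


def mean_spacing_nm(density_per_cm2: float) -> float:
    """Centre-to-centre spacing, ASSUMING a uniform square lattice."""
    return math.sqrt(area_per_rc_nm2(density_per_cm2))


if __name__ == "__main__":
    density = 1.8e12  # rcs per cm^2, the value quoted in the abstract
    print(f"area per rc : {area_per_rc_nm2(density):.1f} nm^2")
    print(f"mean spacing: {mean_spacing_nm(density):.2f} nm")
```

under these assumptions the quoted density corresponds to roughly 56 nm² of membrane per rc, i.e. a spacing of about 7.5 nm.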
the low temperature (77 k) chlorophyll fluorescence, photochemical activity, oxygen flash yield and oxygen burst decay of thylakoid membranes with different organization of the light-harvesting chlorophyll a/b complex of photosystem ii (lhcii) were investigated after a freeze-thaw cycle in cryotoxic and cryoprotective media. the increase of lhcii oligomerization, which is associated with a significant reduction of the surface charge density of the thylakoid membrane, correlates with a lower extent of freezing damage of the photosynthetic apparatus when the procedure is carried out in a cryotoxic medium (nacl). in the presence of a cryoprotective compound (sucrose), freezing damage is less pronounced and independent of the degree of lhcii oligomerization. the mechanisms of damage and protection of the photosynthetic apparatus during the freeze-thaw process are discussed.

we have studied the effect of the cytokinin meta-topolin (mt, 10⁻⁴ m) on senescence-induced changes in the photosynthetic apparatus of detached primary leaves of wheat (triticum aestivum l. cv. hereward). the senescing leaves were kept under continuous light conditions. mt significantly slowed down the senescence-induced decrease in chlorophyll content and markedly stimulated violaxanthin-to-zeaxanthin (z) conversion. the high z content was maintained even after an hour in darkness. mt treatment also caused the appearance of an emission band f699 peaking at 698-700 nm. this emission band is attributed to aggregates of light-harvesting chlorophyll a/b-binding proteins (lhc), the production of which is associated with a higher z content. the presence of lhc aggregates in mt-treated leaves was also documented by electron microscopy images. besides the lhc aggregation, mt also induced a decrease in photosystem i content, which was documented by electrophoresis and 77 k fluorescence spectra. supported by grants frvs 3190/2005 and msm 6198959215. 
recently, high resolution images of bacterial photosynthetic membranes have revealed the organization of membrane proteins in these native membranes. the organization revealed is remarkable, and all the more so when we realize that these specialized, protein-rich membranes differentiate from the cytoplasmic membrane, which has a more complex composition and is richer in lipids. analysis of the protein organization in these specialized membranes from several different bacteria suggests that the organization results from a phase separation of several different contiguous phases. in order to better understand our observations we have undertaken an examination of the different phase behaviors that are possible for membrane proteins considered as a two-dimensional colloid. monte-carlo modeling of the phase diagram of this system shows the importance of interaction distance in the determination of system behavior. transferring our observations on the model systems to the photosynthetic membranes suggests that electrostatic and elastic forces in the membrane are of particular importance in determining the high-level order of membrane proteins.

the recent crystal structure of photosystem i (psi) from synechococcus elongatus shows two quasi-symmetric branches of potential electron transfer cofactors including the primary donor (dimer of chlorophylls, p700), monomeric chlorophylls a and a0 and quinone a1, bound to the psaa/psab heterodimer. so far, it is not clear whether both potential electron transfer pathways are active in this process or only one of them. to resolve this issue, we studied a set of 6 mutants in which the methionine coordinating the primary electron acceptor, a0, was replaced by histidine, leucine, or serine in either of the two branches. our results, obtained with femtosecond transient absorption spectroscopy, show that both branches are equally active in electron transfer. 
mutation in either branch slows the forward electron transfer between a0 and a1 from 20 ps in wild-type psi to 1-2 ns in all these mutants. this strong effect is explained by a significant change in the redox midpoint potential and a change in the position of a0 caused by the mutations.

i. s. zaharieva department of biophysics and radiobiology, faculty of biology, university of sofia an approach to the investigation of structural and functional properties of the photosystem ii supramolecular complex in native photosynthesizing objects, based on the registration of delayed chlorophyll a fluorescence, is developed. using a disc phosphoroscope, we register simultaneously: i) changes of the intensity of millisecond delayed fluorescence (decaying in the 0.35-5.5 ms time range) during the transition of the photosynthetic apparatus from the dark to the light-adapted state; ii) changes of the intensity of delayed fluorescence decaying in different subintervals of the investigated time range; iii) dark relaxation curves at different moments of the transition; iv) changes of the intensity of prompt chlorophyll a fluorescence. the analysis of these data allows the correlation of the delayed fluorescence characteristics to particular processes occurring in the photosystem ii complex: proton or electrical gradient accumulation, changes in the redox state of quinone acceptors, and changes in the pigment-protein complexes caused by different stress factors, for example temperature.

three-dimensional structure of major light-harvesting antenna of photosystem ii from cucumber h. yan 1 , z. liu 1 , k. wang 2 , t. kuang 2 , j. zhang 1 , l. gui 1 , x. an 1 , w. 
chang 1 1 national lab of biomacromolecules, institute of biophysics, chinese academy of sciences (cas), 15 datun road, chaoyang distr., beijing 100101, china, 2 lab of photosynthesis and environmental molecular physiology, institute of botany, cas, 20 nanxincun, xiangshan, beijing 100093, china the major light-harvesting antenna complex of photosystem ii (lhc-ii), the most abundant integral membrane protein, functions in light capture, energy transfer/distribution and photoprotection. lhc-ii from different species or conditions shows different spectral properties and variation in polypeptide and pigment components. this indicates some specific function-related alterations in the organization of lhc-ii. here we report a 2.66-å crystal structure of cucumber homo-trimeric lhc-ii, organized in a perfect virus-like icosahedral particle. the electron-density map shows the reasonable existence of a chlorophyll (chl) a/b mixed binding site in the complex. the occurrence and locus of lactucaxanthin (lac) was seen directly for the first time. based on the credible structure information, a mechanism of the energy transfer, regulation and excess excited energy dissipation under high light conditions was proposed.

coherent anti-stokes raman scattering (cars) microscopy is a new approach for chemical imaging of molecular systems within cells and tissues, with high sensitivity, high spatial resolution, and three-dimensional sectioning capabilities, without using fluorophores that are prone to photobleaching. this technique permits selective mapping of molecular species by using the vibrational properties of their chemical bonds. the epi-detected (e-cars) and forward-detected (f-cars) intensities depend on the shape and size of the sample, as well as on the refractive index of the solvent. 
in this presentation, after introducing the cars microscopy technique, we show the first cars studies of the refractive effect of the sample, comparing the e-cars and f-cars signals for different diameters of polystyrene beads in solvents of different refractive index. we present several simulations comparing forward-detected and backward-detected signals in polystyrene beads of different sizes embedded in solvents of different index, and we show that the backward-reflected f-cars dominates the experimentally epi-detected signals. furthermore, we demonstrate experimentally and theoretically that the maxima of forward and epi-detected signals are generated at different positions along the z axis in the sample. we finally discuss how index mismatch in cells can alter cars images.

the effects of static magnetic fields on humans have been the subject of continuous investigations. since one of the major static magnetic field sources is nuclear magnetic resonance imaging (mri), the present study aimed to investigate the effects on humans of the 1.5 t magnetic field produced by mri. the study was carried out with 33 healthy young male volunteers from 20 to 25 years of age. the subjects were informed about the purpose of the study at the beginning. the subjects were exposed to a 1.5 t static magnetic field for 30 minutes by placing them in the magnetic resonance unit. 5 ml of blood was taken from each subject one minute before and one minute after exposure. t1 and t2 relaxation times and trace elements were measured in the pre- and post-exposure plasma of the subjects. the post-exposure values obtained were compared with the pre-exposure values of the subjects. pre- and post-exposure results were analyzed by means of student's t-test.

evaluation of tumor response of breast cancer patients by diffusion weighted mri k. a. danishad 1 , v. seenu 2 , u. sharma 1 , p. k. julka 3 , g. k. rath 3 , n. r. 
jagannathan 1 1 department of nmr, 2 department of surgery, 3 department of radiotherapy, all india institute of medical sciences, new delhi, india diffusion weighted mr imaging (dwi) measures the diffusion of water molecules in tissues and is quantified by the apparent diffusion coefficient (adc). dwi can be used to differentiate tumors from normal tissue and can also be used to monitor the response of a tumor to chemotherapy. thirteen healthy volunteers and twelve patients were recruited for the study. dw images were obtained prior to therapy (n=10) and after three cycles of therapy (n=3). the mean adc value of tumors ((0.83 ± 0.05) × 10⁻³ mm²/s) was significantly lower (p < 0.05) than that of normal tissue ((1.80 ± 0.2) × 10⁻³ mm²/s). the decrease in adc in tumors is due to an increase in cellularity, which restricts the diffusion of water molecules. in patients receiving neo-adjuvant chemotherapy, the adc values were higher ((1.36 ± 0.86) × 10⁻³ mm²/s) and were closer to that of the normal tissue (p < 0.05), indicating response of the tumor to chemotherapy. the post-therapy increase in adc is due to the cell damage caused by the therapeutic agents, which increases the fractional volume of the interstitial space, causing an increase in the mobility of water. the study showed that dwi can be used non-invasively to assess the response of breast cancer patients to neo-adjuvant chemotherapy.

quantification by optical imaging of gene electrotransfer in mouse muscle and knee optical imaging was evaluated for monitoring and quantification of the mouse knee joint and tibial cranial muscle electrotransfer (et) of a luciferase-encoding plasmid. the substrate of luciferase (luciferin) was injected i.p. or locally in the muscle or the knee joint. luminescence resulting from the luciferase-luciferin reaction was measured with a cooled ccd camera. luminescence of the knee joint and muscle was higher after local than after i.p. injection of luciferin, but both measurements were highly correlated. 
the local injection procedure was adopted. a significant correlation was observed between measurements in vivo and in vitro on the same muscle. reproducibility of individual luminescence measurements was also verified, and the luminescence levels were clearly dependent on the amount of plasmid injected. in vivo luciferase in the electrotransferred knee joint was detected for two weeks. intramuscular electrotransfer of 0.3 or 3 µg of plasmid led to stable luciferase expression for 62 days, whereas injecting 30 µg of plasmid resulted in a fall in luminescence two weeks after electrotransfer. these decreases were, at least partly, related to the production of antibodies against luciferase. thus, optical imaging was shown to be a relevant technique to quantify variations of luciferase activity in vivo in one given tissue. furthermore, evaluating the effective amount of luciferase in tissues from in vivo luminescence levels requires calibration, since it relies on the conditions of the enzymatic reaction and on light absorption.

acute effect of corticosterone on nmda receptor-mediated ca2+ elevation in mouse hippocampal slices m. saito 1 , s. sato 1 , h. osanai 1 , a. hirano 1 , y. komatsuzaki 1 , s. kawato 2 1 department of physics and applied physics, college of humanities and sciences, nihon university, 2 department of biophysics and life sciences, graduate school of arts and sciences, university of tokyo corticosterone (cort) is a principal glucocorticoid synthesized in the rodent adrenal cortex and secreted in response to stress. we examined the rapid effects of cort on n-methyl-d-aspartate (nmda) receptor-mediated ca2+ signals in adult mouse hippocampal slices by using a ca2+ imaging technique. application of nmda caused a transient elevation of intracellular ca2+ concentration followed by a decay to a plateau within 150 sec. a 30 min preincubation with cort induced a significant decrease of the peak amplitude of nmda-induced ca2+ elevation in the ca1 region. 
the rapid effect of cort was induced at a stress-induced level (0.4-10 µm). because the membrane non-permeable bovine serum albumin-conjugated cort also induced a similar rapid effect, the rapid effect of cort might be induced via putative surface cort receptors. in contrast, cort induced no significant effects on nmda-induced ca2+ elevation in the dentate gyrus. in the ca3 region, cort effects were not evaluated, because a marked elevation of nmda-induced ca2+ signals was not observed there.

in vivo subcellular structures recognized with phase k. nagayama 1 , r. danev 1 , n. usuda 2 , y. kaneko 3 , k. nitta 1 , a. nakazawa 2 , k. atsuzawa 2 , m. tanaka 4 , m. setou 1 1 okazaki institute for integrative bioscience, okazaki, japan, 2 fujita health university, school of medicine, toyoake, aichi, japan, 3 saitama university, saitama, japan, 4 tokyo metropolitan institute of gerontology, itabashi-ku, tokyo, japan phase contrast transmission electron microscopy has been developed to enable high-contrast and high-resolution observation of unstained ice-embedded samples. to enhance the image contrast, two methodologies have already been developed: i) scattering contrast for stained samples with small aperture diaphragms and ii) defocus contrast for unstained or stained samples with deep defocusing. the former prevails in histochemical sciences and the latter is popular in electron crystallography. both methods, however, have a common drawback: the contrast is only improved by impairing the image quality. this drawback can be removed by using the phase contrast method with phase plates, which has traditionally been used in visible light microscopy. due to the severe obstacle of the charging of phase plates, however, the idea has not yet materialized. we have solved the phase-plate charging problem. an experiment with a 300 kv tem on an unstained, ice-embedded whole cyanobacterial cell fulfilled the expectation. 
only weak and vague contrast was obtained for the conventional image of the cell, even with a very deep defocus. in contrast, a high-contrast image was obtained with phase contrast, in which various fine structures are clearly recognized. this may be the first example of observing nanometer-scale structures in detail in the intact cell. other examples, including intact-state intravesicular structures, will be shown.

j. lichtenberger, p. fromherz max-planck-institute for biochemistry we cultured bovine chromaffin cells on an array of electrolyte-oxide-silicon field-effect transistors (eos fet) and monitored granule secretion. upon stimulation with barium chloride, vesicles are released into the narrow sheet of electrolyte between the chip surface and the plasma membrane. the interaction of released protons with the silicon dioxide surface of the chip alters the threshold voltage of the transistor and gives rise to a measurable signal. simultaneously performed measurements with a carbon fibre showed a correlation of the transistor signals and amperometric current traces. we conclude that the transistors are able to monitor exocytosis at the single-vesicle level. to elucidate the role of protons, we destroyed the proton gradient across the vesicle membrane with nigericin and valinomycin. as a result, a massive reduction of the transistor signals was induced, whereas there was only little change in the amperometric records. we conclude that released protons are responsible for the detection of vesicles with transistors. the individual transistor records of vesicle exocytosis can be explained by combining the dynamics of the exocytotic event with the diffusion in the cell-chip junction. transistor recording of exocytosis does not depend on the electrochemistry of transmitters. as many kinds of exocytotic vesicles contain a large amount of buffered protons, it can be applied to numerous kinds of exocytotic events, independent of the nature of the transmitter. 
we tried various solvents to dissolve the fluorescent product, and found that the product was insoluble in water and most organic solvents. a quite bright fluorescence emitted by the particles was observed by fluorescence microscopy when they were excited with uv light at 365 nm. sem indicated that the size of the particles was 1-20 µm, depending on the reaction time and the phospholipid concentration in hexane solution. endothelial cells from human vein grew better on the surface prepared from the particles than on the culture plate, implying a possible application as a new type of biomaterial, as a coating material for medical devices, and as a fluorescent tracer for human bodies.

confocal microscopy of the phototactic ciliate f. salina. fabrea salina is a marine heterotrich ciliate which dwells in salt ponds. in previous works we have described the phototaxis and the fluorescence properties of a hypericin-like endogenous pigment in an albino strain. we have recently obtained a heavily pigmented strain from the saline of torre colimena (taranto, italy). we have used confocal microscopy to characterize the fluorescence properties of this strain and to compare them with those of the albino strain. the results obtained by one- and two-photon confocal microscopy show that, as in the albino strain, the fluorescence intensity of the pigment is higher in dead cells than in living cells. the excitation and emission spectra are quite similar in the two strains, and this is also true for the fluorescence lifetime, which is about 2 ns. altogether, these measurements indicate that the pigment of the new strain belongs to the family of hypericin-like chromophores. the analysis of different confocal planes shows that the pigment is localized not only in granules under the somatic membrane in the cellular body, as currently thought, but also in the cilia. 
some photobleaching experiments "in situ" confirm this result, which might have important implications for understanding the mechanisms of the photomotile responses of f. salina and probably of other heterotrich ciliates.

there is no doubt that a modern physician should have knowledge of basic sciences such as physics, chemistry and biology. furthermore, biophysics is incorporated into the curriculum of most european medical schools. at the medical school in zagreb, the course of physics and biophysics is positioned in the first and fourth years of study. the students learn basic physical phenomena of the structure of matter, mechanics, thermodynamics, electromagnetism, optics and acoustics as applied to the human body. additionally, the interactions of the body with its surroundings are thoroughly discussed as the basis for different diagnostic methods. debate continues at our school about where to place the content of this course. should it be an autonomous course or part of the physiology and radiology courses in a problem-based learning approach? so far at zagreb medical school the biophysical courses are autonomous and structured according to the biophysics programs at other european universities. the emphasis is on seminar and laboratory work, encouraging students to learn independently. the seminars are made more vivid and instructive for students by the inclusion of different model devices constructed in our department.

learning about science is also learning how scientific knowledge is produced. in this sense, issues related to the dynamics of science should be brought into focus in science education. the idea has support in the 1999 "declaration on science and the use of scientific knowledge", particularly in the statement that "science curricula should include science ethics, as well as training in the history and philosophy of science and its cultural impact". 
with first-year undergraduate students in mind, we are interested in an integrated approach to these kinds of themes. this presentation describes a practical teaching module included in the basic biophysics course for biochemistry majors. organized in case studies, it deals with stories of biophysics (and biochemistry), and addresses the role of the biophysical approach in the progress of the life sciences. the module runs through the whole course, and consists of small exercises on the way a given understanding has been constructed, explored within the practice trend of contemporary science studies. examples of the chosen stories, anchored in the subjects covered in the lectures of the course, relate to the search for the mechanism of energy production through atp-synthase, the development of radioactive labelling techniques, and the discovery of protein water channels. beyond their value as cultural legacy and as motivating tools, the insight they might provide is vast.

axiomatic theory of biophysics q. zhao china rehabilitation research centre, beijing, china. until now, all approaches to interpreting biology by applying principles of physics have been declared failures. the achievements of biophysics are limited to providing essential tools for biological research, and biophysics is far from being the basic theory of biology. resolving this requires fundamental research into the logical features of the biophysical processes of biology. we put forward system logic and then protein thermodynamics structure theory. based upon it, the axiomatic theory of biophysics and biology could be developed. our result shows that a real understanding of biology or biophysics must be constructed upon new thinking methods.

a successful ligand-receptor docking methodology depends strongly on the efficiency of the global optimization algorithm used to explore the ligand conformational space. 
in this work we have implemented and analyzed the performance of a new flexible ligand-receptor docking methodology. this methodology uses as its optimization method a multisolution version of the generalized simulated annealing algorithm adapted to problems with box constraints. a grid-based methodology, considering the receptor rigid, and the gromos96 classical force field are used to evaluate the ligand-receptor scoring function. the methodology was tested in redocking (ligand within its own protein conformation) and cross-docking (ligand within another protein conformation) experiments for five hiv1 protease-ligand complexes with known three-dimensional structures. all ligands tested are highly flexible, having 12 to 20 conformational degrees of freedom. the implemented docking methodology was able to redock successfully all flexible ligands with a success ratio 95% and a mean rmsd lower than 1.52 å with respect to the corresponding experimental structures. in the cross-docking experiments we observed a strong dependence of the mean success ratio on the protein structure used as reference. in 4 situations we observed a mean success ratio 40% and 70% in 13 cases among the 20 possible ones.

s. aida-hyugaji 1 , h. nakagawa 2 , j. nomura 2 , m. sakurai 2 , d. tokushima 3 , t. takada 3 , u. nagashima 4 , t. ishikawa 2 1 tokai university, japan, 2 tokyo institute of technology, japan, 3 nec corporation, japan, 4 national institute of advanced industrial science and technology, japan irinotecan is a widely used antitumor drug that inhibits mammalian dna topoisomerase i. however, overexpression of abcg2 can confer on cancer cells resistance to sn-38, that is, the active form of cpt-11. 
in the present study, to develop a platform for molecular modeling to circumvent drug resistance associated with abcg2, we have characterized a total of fourteen new sn-38 analogues by some typical properties, which were evaluated by molecular orbital (mo) calculations and neural network (nn) analysis. the nn was first applied to estimate hydrophobic properties (logp) of the analogues. thereafter, the electrostatic potential (esp) and the solvation free energy (Δg) were evaluated by mo calculation. these indexes were found to be well correlated with the drug resistance ratio experimentally observed in abcg2-overexpressing cells. it is suggested that hydrophilic analogues carrying oh or nh2 groups are good substrates for abcg2 and are therefore exported from cancer cells. in contrast, sn-38 analogues with a cl or br atom at those positions have similar logp values and high affinities toward the putative active site of abcg2; however, they were not substrates of abcg2. from these results, it is strongly suggested that hydrogen bond formation with oh or nh2 groups is critically involved in the transport mechanism of abcg2.

a. agopian 1 , j. depollier 1 , e. gros 1 , g. aldrian-herrada 1 , p. clayette 2 , n. bosquet 2 , g. divita 1 1 crbm-cnrs, 1919 route de mende, 34293 montpellier, france., 2 spi-bio-cea fontenais aux roses, france reverse transcriptase (rt) plays an essential role in the replication of hiv and constitutes the main target for the development of aids therapies. the biologically active form of hiv rt is a heterodimer of two subunits, p51 and p66, each consisting of distinct subdomains: the fingers, the palm, the thumb, the connection and the rnase h subdomain, the latter only present in p66. we have demonstrated that formation of fully active rt is a two-step process involving rapid association of the two subunits (dimerization) followed by a conformational change (maturation). 
thanks to the crystal structure of rt we have identified a new class of inhibitors based on short peptide motifs derived from the dimer interface. we first identified a short 9mer peptide (pep-7) derived from the tryptophan-rich motif of the connection subdomain that blocks dimerization of rt and efficiently abolishes hiv-1 replication. pep-7 interacts preferentially in a pocket involving residues trp 24 and phe 61 on p51. we then designed 15mer peptides derived from the thumb domain which inhibit rt maturation as well as viral replication when delivered into cells. taking into account these results we propose that dimerization of rt constitutes a potential target for the design of new, more specific antiviral drugs.
the success of gene therapy largely relies on the availability of vectors that would deliver the genetic material efficiently to the target cells with minimal toxicity. in this context, our purpose was to evaluate as possible vectors a series of newly synthesized low molecular weight (5 kda) chitosan derivatives grafted with dodecenoyl (ddc) groups at different percentages (3, 9, 16 and 25 %). in the absence of dna, the critical micellar concentration (cmc) of these derivatives in 20 mm mes buffer ph 6.5 was found to be strongly dependent on the percentage of ddc but not on ph or salt concentrations. this indicates that the ddc groups confer on the chitosan derivatives the ability to self-assemble, probably into micellar structures: a property that may dictate the formation and the structure of their complexes with dna. next, we investigated by quasielastic light scattering the size and the surface charge of complexes of plasmid dna with these derivatives at different ph, salt concentrations and n/p ratios (expressed in charged units of chitosan amines to dna phosphates).
we found that the smallest and most positively charged complexes were obtained at ph 5.8 and n/p=5 in the absence of salt: a condition where the chitosan derivatives were fully protonated and in excess over the dna phosphate groups.
biophysical and biological examination of dna/lipid complex particles of virus-like structure designed for in vivo gene transfer
d. durand 1, m. schmutz 2, b. lebleu 3, a. r. thierry 3; 1 lure, centre universitaire paris sud, orsay, france, 2 institut henri sadron, strasbourg, france, 3 laboratoire des défenses antivirales et antitumorale, umr 5124, montpellier, france
the structure of complexes made from dna and suitable lipids (lipoplexes, lx) was examined by cryo electron microscopy. we observed a distinct concentric ring-like pattern with striated shells when using plasmid dna. these spherical multilamellar particles have a mean diameter of 254 nm, with a repetitive spacing of 7.5 nm and striations 5.3 nm wide. small angle x-ray scattering (saxs) confirmed the cryoem data and revealed repetitive ordering of 6.9 nm, suggesting a lamellar structure containing at least a dozen layers. this concentric and lamellar structure with different packing regimes was also observed by cryoem with linear dsdna, ssdna and oligodeoxynucleotides. for the first time, dna chains could be visualized in dna/lipid complexes. such specific supramolecular organization is the result of thermodynamic forces, which cause compaction to occur through concentric winding of dna in a liquid crystalline phase. cryoem of t4 phage dna packed either in t4 capsids or in lipidic particles showed similar patterns. saxs suggested a hexagonal phase in lx-t4 dna. thus, both lamellar and hexagonal phases may coexist in the same lx preparation or particle, and the transition between the two phases may depend upon an equilibrium influenced by the type and length of the dna used. organization of such nucleotidic supramolecular assemblies is relevant for prebiotic chemistry.
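as a rough consistency check on the lipoplex dimensions reported above, the following minimal sketch relates the 254 nm mean diameter and the 7.5 nm cryoem spacing to an upper bound on the number of concentric shells, and converts the 6.9 nm saxs repeat distance into the position of the first-order lamellar peak using the standard relation q = 2π/d. the function names are hypothetical, and the assumption that shells fill the whole particle radius is ours, not the authors'.

```python
import math


def max_shell_count(diameter_nm: float, spacing_nm: float) -> int:
    """Upper bound on the number of concentric shells.

    Assumption (ours): shells are evenly spaced and fill the particle
    from the surface to the centre, so the count is radius / spacing.
    """
    return math.floor((diameter_nm / 2.0) / spacing_nm)


def lamellar_peak_q(spacing_nm: float) -> float:
    """First-order scattering vector q = 2*pi/d, in nm^-1, for repeat distance d."""
    return 2.0 * math.pi / spacing_nm


if __name__ == "__main__":
    # values taken from the abstract: 254 nm diameter, 7.5 nm spacing (cryoem)
    print("max shells:", max_shell_count(254.0, 7.5))
    # 6.9 nm repeat distance (saxs)
    print("q (nm^-1):", round(lamellar_peak_q(6.9), 3))
```

under these assumptions the particle radius accommodates at most 16 shells, which is compatible with the reported "at least a dozen layers".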
engineering self-assembly peptides for targeted delivery of therapeutics and imaging agents
s. s. dhadwar, m. sung, k. kawamura, j. gariépy; department of medical biophysics, university of toronto, canada
peptide-mediated delivery systems have recently emerged as a means to substitute or augment conventional drug and gene delivery technologies. these approaches are versatile and easily designed to incorporate a number of specific attributes required for efficient delivery of therapeutic and imaging agents. in particular, self-associating peptide domains can be utilized to construct stable and structurally well-defined protein-like assemblies displaying a series of cell-routing functions. more specifically, a peptide-based self-assembling intercellular delivery vehicle was designed by incorporating the 30-residue long tetramerization domain of the human tumor suppressor protein p53 (hp53tet). the resulting peptide tetramer displays 8 termini within its structure that allow for the simultaneous presentation of distinct cell targeting signals or functional domains. the fusion of polycationic sequences to the hp53tet domain promotes the cellular import of the resulting constructs into eukaryotic cells. this internalization event was dramatically enhanced for such multivalent peptides in relation to their monomeric counterparts. peptides containing a nuclear localization sequence along with a polycationic sequence were found to shuttle reporter plasmids efficiently to the nucleus of cells. these results have important implications in the design and construction of novel targeted delivery vehicles.
mechanisms of non-covalent peptide mediated cellular delivery of therapeutics: a biophysical study
s. deshayes 1, m. c. morris 1, a. heitz 2, p. charnet 1, g. divita 1, f.
heitz 1; 1 crbm-cnrs fre 2593, montpellier, france, 2 cbs-cnrs umr 5048-inserm u554, montpellier, france
two different cell-penetrating peptides, mpg and pep-1, were shown to promote non-endosomal intracellular delivery of non-covalently bound cargos, namely nucleic acids and proteins, respectively. in order to identify the peptide-mediated internalization pathway, we undertook conformational investigations of both peptides with and without associated cargos and checked the conformational consequences of the presence of phospholipids. from the conformational point of view, pep-1 behaves differently from mpg. cd analysis revealed that pep-1 undergoes a transition from a non-structured to a helical conformation upon increase of the concentration, while mpg remained non-structured. determination of the structure by nmr showed that, in water, its α-helical domain extends from residue 4 to 14. cd and ftir indicated that pep-1 adopts a helical conformation in the presence of phospholipids while mpg is in a β-sheet form. adsorption measurements performed at the air-water interface were consistent with the helical form. pep-1 did not undergo conformational changes upon formation of a particle with a cargo peptide. in contrast, we observed a partial conformational transition when the complex encountered phospholipids. for mpg, interactions with nucleic acids generated a partial folding into β-sheet which was more pronounced in the presence of lipids. electrophysiological measurements showed that both peptides, whether associated or not with their cargo, can induce transmembrane pore-like structures.
self-assembly of hydrolysed alpha-lactalbumin into nanotubes
j. f. graveland-bikker 1, k. g. de kruif 2; 1 nizo food research, ede, netherlands, 2 van 't hoff laboratory, netherlands
nanotubes are formed by self-assembly of partially hydrolysed α-lactalbumin, a 14 kda milk protein. there are several promising applications of these α-lactalbumin tubes, in food, pharmacy and nanotechnology.
we studied the mechanism of self-assembly, the structure and the properties of the nanotubes. limited proteolysis of the α-lactalbumin (by a serine protease) makes the molecule prone to self-assembly. in the presence of ca2+, tubular structures are formed. other divalent ions like mn2+ and zn2+ can also induce tubular self-assembly, while mg2+ leads to random aggregation. light scattering showed that the self-assembly is reversible, which is of relevance for controlled release applications. on the other hand, we could also make stable tubes by cross-linking, which would be a requisite for several other applications. from afm and saxs measurements, we obtained values for the outer diameter (21 nm) and the inner diameter (8 nm). afm and cryo-em revealed the helical structure of the tube wall; it is a right-handed helix. by performing nano-indentations with afm we determined mechanical properties of the tubes. the tubes were shown to be relatively resilient upon small deformations; the elastic modulus is of the order of 0.1 gpa.
targeted delivery of photosensitizers into the cancer cell nuclei enhances their cytotoxic efficacy
the search for new pharmaceuticals has raised interest in locally acting drugs which act over short distances within the cell, and for which different cell compartments have different sensitivities; e.g., photosensitizers used in anticancer therapy should be transported to the most sensitive subcellular compartments, where their action is most pronounced. earlier we have produced a number of modular recombinant transporters for locally acting drugs comprising several functional modules for cell-specific targeting, internalization, escape from intracellular acidic vesicles, and targeting to the nuclei of melanoma cells overexpressing melanocortin receptors. here we describe new transporters on the basis of epidermal growth factor which are specific for a wide variety of cancers.
these transporters possess all necessary functional activities and deliver photosensitizers into the nuclei of human carcinoma cells, resulting in photocytotoxic effects almost 3 orders of magnitude greater than those of nonmodified photosensitizers.
characterization of mixtures of dna and nonionic polymeric agents for gene delivery in muscle
j. m. gau 1, j. lal 2, l. auvray 2; 1 laboratoire mpi-lrp, umr cnrs 7581, université d'évry val d'essonne, 91025 évry cedex, france, 2 argonne national laboratory, ipns, 9700 south cass avenue, illinois 60439, usa
a strategy to cure muscle disease is to introduce genes (dna) into the muscle cell to correct or to add genes. nonionic polymeric agents have emerged as an efficient vector to deliver dna into the muscle. these polymers protect dna from extracellular nuclease degradation by allowing the dna diffusion throughout the muscle tissue. there is at present no understanding of how nonionic polymers enhance transfection in the muscle. the kinds of interactions between these nonionic agents and dna, and between dna-nonionic polymeric agent mixtures and the cell membrane, are currently unknown. also, the structure of dna-nonionic polymeric agent mixtures is not yet well defined. more information is needed to improve this delivery system. neutron scattering (contrast variation) and light scattering were used to investigate the interaction between dna and nonionic polymers (pvp, di- and triblock copolymers). furthermore, electrical measurements with the same polymer complexes and black lipid membranes were also performed. depending on the polymer type, there is either direct interaction with dna or, in other cases, the polymers exhibit strong interaction with the lipid membrane. an explanation for the transfection efficiency of these nonionic agents in gene delivery to muscle will be given.
high throughput in-silico screening against flexible protein receptors
virtual screening of chemical databases to targets of known three-dimensional structure is developing into an increasingly reliable method for finding new lead candidates in drug development. based on the stochastic tunneling method (stun) we have developed flexscreen, a novel strategy for high-throughput in-silico screening of large ligand databases. each ligand of the database is docked against the receptor using an all-atom representation of both ligand and receptor. in the docking process both ligand and receptor can change their conformation. the ligands with the best evaluated affinity are selected as lead candidates for drug development. using the thymidine kinase inhibitors as a prototypical example we documented the shortcomings of rigid receptor screens in a realistic system. we demonstrate a gain in both overall binding energy and overall rank of the known substrates when two screens with a rigid and a flexible (up to 15 sidechain dihedral angles) receptor are compared. we note that stun suffers only a comparatively small loss of efficiency when an increasing number of receptor degrees of freedom is considered. flexscreen thus offers a viable compromise between docking flexibility and computational efficiency to perform fully automated database screens on hundreds of thousands of ligands.
maturation and inhibitor design of sars-cov 3cl protease based on a product-bound crystal structure
severe acute respiratory syndrome (sars) is an emerging infectious disease caused by a novel human coronavirus. here we report that the 3clpro containing n- and/or c-terminal additional in-frame sequences underwent autoactivation to cleave the tags and yielded the mature protease in vitro.
the 3-d structure of the c145a mutant protease shows that the active site of one protomer of the dimeric protease is bound with the c-terminal six amino acids of the protomer in another asymmetric unit, suggesting a possible mechanism for maturation. the crystal structure of this product-bound form shows that the active site has a p1 pocket that binds the gln side chain specifically. in addition, the p2 and p4 sites are clustered together to accommodate large hydrophobic side chains. the tagged c145a mutant protein served as a substrate for the wild-type protease, and the n-terminus was first digested (55-fold faster), followed by the c-terminal cleavage, as shown by the sds-page analysis. the analysis of the quaternary structures for the tagged and mature proteases by analytical ultracentrifuge experiments reveals remarkably tighter dimer formation for the mature enzyme (kd = 0.35 nm) than for the mutant (c145a) containing the n-terminal (kd = 17.2 nm) or the c-terminal 10 extra amino acids (kd = 5.6 nm). taken together, the study here provides insights into the design of our new structure-based inhibitors.
nevertheless, a significant proportion of patients do not respond to this therapy, and adverse effects are common. here we report the delivery and expression of recombinant mycobacterial dna vaccines in vivo and demonstrate the ability of multicomponent dna vaccines to enhance th1-polarized immune responses. splenocytes from immunized groups of mice were re-stimulated in vitro and examined for cytotoxicity against bladder tumour cells. we used four combined recombinant bcg dna vaccines (multi-rbcg) for electroporative immunotherapy in vivo, and found that tumour growth was significantly inhibited and mouse survival was prolonged. increased immune cell infiltration and induction of apoptosis were noted after treatment with multi-rbcg alone, with the interleukin-23 (il-23) vaccine alone, and, most significantly, with their combinations.
thus, electroporation immunogene therapy using multi-rbcg plus il-23 may be an attractive regimen for the treatment of bladder cancer. this approach presents new possibilities for the treatment of bladder cancer using recombinant bcg dna vaccines and il-23 dna vaccine.
the cell-penetrating peptide (cpp) pep-1 is capable of introducing large proteins into different cell lines while maintaining their biological activity. two mechanisms have been proposed to explain the entrance of other cpps into cells: endosomal-dependent and endosomal-independent. we evaluated the molecular mechanisms of pep-1-mediated cellular uptake of β-galactosidase (β-gal) from e. coli in large unilamellar vesicles (luv) and hela cells. fluorescence spectroscopy and immunofluorescence microscopy were used to study the translocation. internalization of β-gal into luv and protein functionality in hela cells were detected by enzymatic activity. β-gal translocated into luv in a transmembrane potential-dependent manner. likewise, β-gal incorporation was extensively decreased in depolarized cells. furthermore, β-gal uptake efficiency and kinetics were temperature-independent, and β-gal did not co-localize with endosomes, lysosomes or caveosomes. therefore, β-gal translocation was not associated with the endosomal pathway; moreover, transmembrane pores were not detected. these results indicated that the protein uptake in vitro and in vivo was mainly, if not solely, dependent on a physical mechanism governed by electrostatic interactions between pep-1 (positively charged) and membranes (negatively charged).
peptide
the dramatic acceleration in the identification of new nucleic acid-based therapeutic molecules has provided new perspectives in pharmaceutical research. however, the development of nucleic acid- and peptide-based therapeutics is limited by their poor cellular uptake and trafficking.
with the aim of addressing these issues, we have designed a family of short amphipathic peptides for delivery of nucleic acids (mpg) and of peptide/pna (pep). these carriers consist of a hydrophobic moiety and an nls-derived hydrophilic domain. they form stable non-covalent complexes with peptides, proteins, sirna or pna without any requirement for prior covalent cross-linking. both mpg and pep carriers enter cells rapidly, in a process involving membrane disorganization, independently of the endosomal pathway. mpg efficiently delivers short odns and sirna into a wide variety of mammalian cell lines, without interfering with their biological function. pep significantly improves delivery of pna and peptides. both carriers were used for the delivery of sirna or antisense pna targeting the cell cycle regulatory protein cyclin b1 in an animal model and were found to block tumor growth upon intravenous injection. we believe that mpg- and pep-based technologies will contribute significantly to the development of basic and therapeutic applications.
probing the bound conformation of acetylcholinesterase (ache) inhibitor at the binding site
c. g. kim, x. zhao, s. goodall, a. watts; department of biochemistry, university of oxford, south parks road, oxford, ox1 3qu, uk
acetylcholinesterases are enzymes which preferentially hydrolyze acetyl esters (such as ach or acetyl-β-methylcholine), containing 543 amino acid residues in the eeache form and arranged as a 12-stranded β-sheet surrounded by 14 α-helices. the protein is ellipsoidal in shape, with approximate dimensions of 45 å by 60 å by 65 å. inhibitors of acetylcholinesterase are of commercial and medical interest as pesticides and as therapeutics in the treatment of alzheimer's disease. an understanding of the conformation of inhibitors in the binding site enables the rational design of novel inhibitors with increased potency and specificity.
interaction between the ligand, amino-2-methyl-3-(3-trifluoroacetylbenzyl-oxymethyl)quinoline (r414983), and ache inhibitor has been studied by advanced solid-state nmr through double-quantum chemical shift and distance measurements. combining solid-state nmr data and docking simulations, the conformation of the ache inhibitor at the active site has been predicted.
in vivo, heat shock proteins (hsps), being stress-inducible chaperones, can attenuate detrimental consequences of ischemic insults, inflammation, neurodegenerative diseases, etc. also, intracellular accumulation and chaperone activities of some hsps may contribute to improved cell survival following uv or ionizing radiation. in models of pathological states and their treatment, we used special virus-based vectors for overexpression of hsp70 or hsp27 in cell cultures to confer cytoprotection under simulated ischemia/reperfusion. in parallel, similar cytoprotection was achieved after pretreatment of the cells with a pharmacological hsp inducer, geranylgeranylacetone. the cytoprotective effects were manifested in a lesser extent of oxidative modification and aggregation of cellular proteins, better preservation of the cytoskeleton, faster restoration of energy metabolism and improved post-stress cell survival. in the other model, we treated normal and tumor cells with an inhibitor of the chaperone activity of hsp90, geldanamycin. only the drug-treated tumor cells became more sensitive to gamma-irradiation; such results characterize this drug as a potentially selective radiosensitizer of tumors. taken together, our data demonstrate promising approaches to clinically beneficial manipulation of the levels of expression and/or chaperone activity of hsp(s) by means of gene therapy or pharmacotherapy.
characterization of proteins from human pleural fluid
r. jain, s. kumar, n. singh, s. sharma, t. p.
singh; all india institute of medical sciences, new delhi, india 110029
the samples of human pleural fluid were obtained from both healthy subjects and patients infected by tuberculosis. after the preliminary processing, these samples were run in independent lanes of sodium dodecyl sulphate-polyacrylamide gel electrophoresis (sds-page). the two lanes indicated variations in the intensities of a few bands, and some new bands were also observed in the infected samples. these were characterized by determining their n-terminal sequences. the new bands which had low density were carefully identified and cloned. some of the common bands that showed intensity variations were characterized. these were matrix metalloproteins, secretory phospholipase a2, transferrin and ceruloplasmin. they were also studied with maldi-tof and their molecular weights have been determined. some of these proteins have been crystallized and their detailed crystal structure determinations are in progress.
biophysical study of non-lethal stress response of cultured dc3f cells
stress factors may induce two kinds of responses in living cells: either cell death or adaptive mechanisms. our aim was to search for non-lethal effects of various stress conditions on cultured hamster lung fibroblasts (dc3f cells) as well as to assess the recovery time after stress removal. dc3f cells were cultured in standard conditions and were submitted to stress, either by incubation with chemical substances (sodium arsenate, sodium nitroprusside) and drugs (bleomycin and statins) or by irradiation (uv, he-ne laser). the doses and exposure times were chosen so as to avoid cell death. after stress removal, cells were allowed to recover and the recovery time period was measured. structural and functional parameters were evaluated before and after stress, as well as during recovery. by now, experimental models for the in vitro study of non-lethal stress-inducing factors have been set up.
severe acute respiratory syndrome (sars) is an emerging infectious disease caused by a novel human coronavirus. the viral maturation requires a main protease (3clpro) to cleave the virus-encoded polyproteins. we report here that the 3clpro containing n- and/or c-terminal additional in-frame sequences underwent autoactivation to cleave the tags and yielded the mature protease in vitro. the 3-d structure of the c145a mutant protease shows that the active site of one protomer of the dimeric protease is bound with the c-terminal six amino acids of the protomer in another asymmetric unit, suggesting a possible mechanism for maturation. the tagged c145a mutant protein served as a substrate for the wild-type protease, and the n-terminus was first digested (55-fold faster), followed by the c-terminal cleavage, as shown by the sds-page analysis. the analysis of the quaternary structures for the tagged and mature proteases by analytical ultracentrifuge experiments reveals remarkably tighter dimer formation for the mature enzyme than for the mutant (c145a) containing the n-terminal or the c-terminal 10 extra amino acids. taken together, the study here provides insights into the design of our new structure-based inhibitors.
characterisation of macromolecular transport in physiologically relevant mixed ecm based gels
s. lelu, a. pluen; school of pharmacy and pharmaceutical sciences, university of manchester (uk)
the extracellular matrix (ecm), a complex gel made of hyaluronic acid (ha), collagen and proteoglycans (pg), impedes the penetration of macromolecules, especially in tumours, and may compromise the success of novel therapies. though recent in vivo investigations pointed out that not only ha but also fibrillar collagen (its content and organisation) and its interactions with pg are involved in macromolecular transport hindrance, the transport mechanisms relating the macromolecular drug and these ecm components are unknown.
in this study we seek to evaluate the determinants of passive transport mechanisms of biomacromolecules in complex gels made of ha and collagen, using fluorescence techniques (frap and confocal reflection microscopy (crm)) and rheology. focus was on conditions relevant to tumours and, initially, on low collagen and relatively high ha content. rheology experiments showed that mixed systems containing less than 10 mg/ml of ha present a higher elastic modulus ge than a pure ha network or pure collagen gels. interestingly, crm and frap studies revealed similarities for collagen and mixed gels: the organisation and spacing of the collagen fibres did not change, and the ratios of the diffusivities (d/d0) of dextran 2m and igg were not different but were higher than those in ha networks. systems with higher collagen content are under investigation to complete the characterisation of transport.
encapsulation of clone vector dna by cationic diblock copolymer vesicles for gene delivery
a. v. korobko 1, j. r. van der maarel 2; 1 leiden university, the netherlands, 2 national university of singapore, singapore
we will discuss the design, control, and structural characterization of cationic copolymer vesicles loaded with dna. these vesicles serve as a model system for diverse applications such as gene delivery, micro-arraying techniques and packaging of dna in congested states. encapsulation of dna was achieved with a single emulsion technique. for this purpose, an aqueous puc18 or pegfp-n1 plasmid solution is emulsified in an organic solvent and stabilized by an amphiphilic diblock copolymer. the neutral block of the copolymer forms an interfacial brush, whereas the cationic block complexes with dna. a subsequent change of the quality of the organic solvent results in a collapse of the brush and the formation of a capsule. the capsules are subsequently dispersed in aqueous medium to form vesicles and stabilized with an osmotic agent in the external phase.
inside the vesicles, the dna is compacted in a liquid-crystalline fashion as shown by the appearance of birefringent textures under crossed polarisers and the increase in fluorescence of labeled dna. the compaction efficiency and the size distribution of the vesicles were determined by light and electron microscopy, respectively, and the integrity of the dna after encapsulation and subsequent release was confirmed by gel electrophoresis. we demonstrate the gene transfer ability of this new carrier system by the transfection of encapsulated pegfp plasmid into hela cancer cells.
cellular transduction of nucleotide kinases to improve the activation of nucleoside analog prodrugs
m. konrad 1, c. monnerjahn 1, s. ort 1, a. lavie 2; 1 max-planck-institute for biophysical chemistry, goettingen, germany, 2 university of illinois at chicago, chicago, usa
the objective of our study is to improve therapeutic enzyme-prodrug systems by generating catalytically superior nucleoside and nucleotide kinases that are essential for activation of nucleoside analogs. compounds, such as azt for the treatment of hiv infections, acv and gcv used against herpes virus, or the anticancer compounds arac and gemcitabine, can enter cells only in the unphosphorylated state (prodrug) and need to be transformed by different kinases to their pharmacologically active triphosphate state that interferes with dna replication. we have first designed mutants of the human tmp kinase (htmpk) that phosphorylate aztmp up to 200-fold faster than wild-type. expression of this enzyme in human cells leads to 10-fold higher intracellular concentrations of azttp and to enhanced hiv inhibition. second, the prodrugs acv and gcv are not phosphorylated by human kinases, but are converted to their monophosphate forms by hsv1-tk, which is used in enzyme/prodrug-dependent cancer suicide gene therapy. we generated enzyme variants which show selective and efficient phosphorylation of gcv.
third, an engineered human dck variant catalyzes more efficiently the activation of the prodrugs arac and gemcitabine. thus, the concept of a gene (or enzyme) therapeutic treatment involving expression (or direct intracellular transduction) of a catalytically improved human enzyme may pave the way to the development of novel strategies in nucleoside prodrug-dependent cancer chemotherapy.
docking-molecular dynamics studies on the peroxidase site of prostaglandin endoperoxide h2 synthase
prostaglandin endoperoxide h2 synthases 1 and 2 (pghs-1 and -2) catalyze the first step in the biosynthesis of prostaglandins, prostacyclins and thromboxanes. arachidonic acid is transformed into prostaglandin g2 (pgg2) at the cyclooxygenase site of the enzyme, and the 15-hydroperoxide oxygen-oxygen bond of pgg2 is subsequently cleaved by reaction with haem at the distinct peroxidase site (pox) to produce prostaglandin h2 (pgh2). herein we present a plausible productive conformation obtained by docking calculations for the binding of pgg2 to the pox site of pghs-1. the enzyme-substrate complex stability was verified by a 500-ps molecular dynamics simulation. structural analysis unveils the requirements for enzyme-substrate recognition and binding: the pgg2 15-hydroperoxide group is in the proximity of the haem iron and participates in a hydrogen bond network with the invariant his207 and gln203 and a water molecule, whereas the carboxylate group establishes salt bridges with the remote lysines 215 and 222.
the interaction of the peptide lah4 with anionic lipids during dna/rna delivery to eukaryotic cells
a. j. mason 1, a. martinez 2, c. leborgne 2, a. kichler 2, b. bechinger 1; 1 faculté de chimie, université louis pasteur, strasbourg, france, 2 généthon, evry, france
the histidine-rich amphipathic peptide lah4 has antibiotic and dna delivery capabilities.
the peptide has a strong affinity for anionic lipids found in the outer membrane of bacteria and has shown evidence of higher transfection activity in transformed than in healthy tissue in culture. it has been proposed that anionic lipids can flip-flop to reach the cytoplasmic monolayer. there, they neutralise the cationic transfection complexes, thereby causing release of oligonucleotides into the cytoplasm. we were, therefore, particularly interested to test for the role of the acidic lipid phosphatidylserine (ps) in the efficiency of lah4-mediated delivery of dna. to understand the potential peptide-lipid interactions in more detail, solid-state nmr experiments on model membranes have been performed. 31p mas nmr on mixed phosphatidylcholine (pc)/ps and pc/phosphatidylglycerol (pg) membranes has been used to investigate specific lah4 interactions with anionic lipids. by using deuterated lipids and wide-line 2h nmr to probe lipid chain order, it is demonstrated that lah4 preferentially interacts with ps over pc. lah4 thereby effectively disorders the anionic ps lipid fatty acyl chains. the lipid chain destabilising effect of lah4 and also lah4 analogues can then be compared with their transfection efficiency for dna or sirna in cell culture to aid in rational peptide vector design.
virtual screening is now widely accepted as a basis for drug discovery thanks to significant improvement and good hit rates [1]. however, it is still highly cpu-consuming. at the same time, the number of protein-ligand complexes described at the atomic level is rising and sequence similarity is used for structure and function predictions. new approaches are being developed to take advantage of the available structural data and the huge number of protein sequences in order to allow better tuned virtual screening. new web servers are being built to ease and to speed up the whole process (http://abcis.cbs.cnrs.fr/kindock/).
integrating these servers into a pipeline dedicated to molecular modelling (http://abcis.cbs.cnrs.fr/atome/) should allow both the refined validation of modelled active sites and the oriented screening for the primary characterization of potential ligands.

an ideal drug delivery system should have the following characteristics. the first is the targeting of the therapeutic agent to its specific site of action. the second is the controlled delivery of a therapeutic molecule or protein in a pulsatile or staggered fashion. the third is sustained zero-order release of a therapeutic agent over a prolonged period of time. in this study, a new drug delivery system combining these characteristics is provided, which contains azobenzene derivatives (ab lipid) incorporated into liposomes as an on-off switch. a drastic release of calcein was observed on the first uv irradiation of ab lipid to the cis isomer, while a suppressed release was observed on the first irradiation with visible light. after that, the slopes of the release profile became coincident. furthermore, calcein release was greatly increased after uv irradiation of ab lipid to the cis isomer, and the drug release was greatly suppressed after vis irradiation of ab lipid to the trans isomer. we can control the release rate of calcein from ab lipid/egg pc mixed liposomes by uv or vis light irradiation.

tryptophanase (trpase), a bacterial enzyme with no counterpart in eukaryotic cells, converts l-trp into pyruvate, ammonia and indole. it has been suggested that indole is essential for bacterial multiplication and biofilm formation. biofilms destroy equipment and food and cause many illnesses. most synthesized quasi-substrates inhibit trpase in the mm range. an optimal and specific inhibitor of trpase may eliminate indole production and prevent biofilm formation. x-ray crystallography of the holo-wt e. coli trpase soaked with l-trp and the known mechanism of trpase activity should provide the information for the design and synthesis of active-site-specific quasi-analogues. using the chromogenic substrate s-(o-nitrophenyl)-l-cysteine, michaelis-menten kinetic analyses determined the mode of trpase inhibition by trp and quinone based quasi-analogues and the corresponding ki values (in µm): dl-2-alanyl-2,10 anthraquinone, noncompetitive, 110; tryptophan ethylester, competitive, 24; acetyltryptophan, uncompetitive, 61.5; s-phenylbenzoquinone-l-tryptophan, uncompetitive, 92. plp-l-trp irreversibly inhibited only the apo form of trpase and may serve for structure-determination purposes. further attempts are being made to synthesize improved trpase inhibitors, i.e., in the nm range.

polyelectrolyte multilayer films (pem) adsorbed on biomaterial surfaces are a new way to create a controlled release system. when biodegradable polymers are used, the films can be degraded in vivo and release active molecules. in this work, we demonstrate the possibility of tuning the degradability of polysaccharide pem in vitro and in vivo. chitosan and hyaluronan pem (chi/ha) were either native or cross-linked (cl) using a water soluble carbodiimide (edc) at various concentrations in combination with n-hydroxysulfosuccinimide. the in vitro degradation of the films in contact with enzymes was followed by quartz crystal microbalance measurements and by confocal laser scanning microscopy after film labeling with chi-fitc. whereas the native films were subject to degradation, the cl films were more resistant to enzymatic degradation. films made of chitosan of medium molecular weight were indeed more resistant than films made of chitosan-oligosaccharides. in addition, macrophages could degrade all types of films and internalize the chitosan in vitro.
the native films implanted in vivo in the mouse peritoneal cavity for a week showed an almost complete degradation, whereas the cl films were only partially degraded. these results suggest that polysaccharide pem are of potential interest for in vivo applications as biodegradable coatings and that degradation can be tuned by controlling film cross-linking.

membrane electroporation - tool for therapeutic electrotransfer of drugs and gene dna
e. neumann, s. kakorin physical and biophysical chemistry, faculty of chemistry, university of bielefeld, germany
membrane electroporation (mep) is a new electrical high voltage scalpel, transiently opening the cell membranes of tissue for the penetration of foreign substances. due to the enormous complexity of cellular membranes, many fundamental problems of mep have to be studied first on model systems, such as the curved bilayer membranes of unilamellar lipid vesicles. electrooptical and conductometrical data of unilamellar liposomes indicate that electric field pulses cause not only the formation of membrane electropores but also shape deformation of the liposomes, both processes mutually affecting each other. the primary field effects of mep and cell deformation can trigger a cascade of numerous secondary phenomena, such as pore percolation and transport of small and large molecules across the electroporated membrane. the chemical mep theory represents a molecular physico-chemical approach to electrochemomechanical pore formation, yielding transport parameters such as permeation coefficients, pore fractions and pore sizes. the pore concept is successfully applied to rationalize optimization strategies for biotechnological and medical applications of mep.

in silico elucidation of xenobiotic processing loops
k. nakata 1 , y. tanaka 2 , t. nakano 1 , t. ishikawa 3 , h. tanaka 2 , t. kaminuma 4 1 national institute of health sciences, tokyo, japan, 2 tokyo medical and dental university, tokyo, japan, 3 tokyo institute of technology, yokohama, japan, 4 hiroshima university, hiroshima, japan
one of the important challenges for drug designers is to predict and analyze how drugs are absorbed, distributed, metabolized and excreted (adme) in the body. these processes are highly correlated with the toxicity of drugs and are actively studied in pharmacology. two classes of proteins, the drug metabolizing enzymes such as cytochrome p450s (cyps) and the transporters, are the targets of such adme/tox research. it was recognized only relatively recently that these two classes of proteins are encoded by target genes of the nuclear receptors. nuclear receptors are ligand-activated transcription factors that form a superfamily. in humans there are 48 nuclear receptors, for almost half of which ligands have been identified, leaving some as true orphans. thus it is now recognized that these nuclear receptors play the role of sensors of drugs and other xenobiotic substances, including environmental chemical pollutants and nutritional ingredients, while the drug metabolizing enzymes and the transporters are the processors which carry out the actual cleaning jobs. we have started to elucidate the feedback loops that are formed by the xenobiotic ligands, nuclear receptors, their target genes, their product proteins, and their feedback actions on the ligands. the work is being carried out using our background database on the ligands and their receptors, called kibank, and programs that search algorithmically for target genes of nuclear receptors. the most recent results will be presented.

the major route for drug entry into cells is permeation across lipid bilayers. due to methodological limitations there are only a few studies on the permeation of drug-like molecules across lipid bilayers.
an assay developed in our lab allows the direct measurement of lipid bilayer permeation of aromatic carboxylic acids (acas). tb 3+ , which forms a fluorescent complex with acas, is entrapped in liposomes, and aca entry is determined from the luminescence increase. lipid bilayer permeation was ph-dependent, following a henderson-hasselbalch function with a plateau for the neutral and for the anionic species, respectively. in contrast to the expectations of the ph-partition hypothesis, permeation of the anionic species was only 1 to 3 orders of magnitude lower than that of the neutral species, leading to anion-controlled permeation at ph 7.4, independently of bilayer state and lipid composition. permeation across bilayers with a biologically relevant lipid composition was significantly slower than across egg-phosphatidylcholine membranes. the influence of single lipids, such as cholesterol, depended on the structure and ionization state of the permeant. permeation coefficients of the neutral species correlated better with the polar surface area (psa) than with logp oct ; psa is therefore a better predictor than logp oct for bilayer permeation of the neutral species of small acas.

interplay between polymerized liposomes physicochemical properties and composition and cytotoxicity
this study was aimed at investigating whether there is an interplay between the physicochemical properties and the lipid composition of diacetylenic polymerized liposomes that affects cytotoxicity in vitro. unsaturated 1,2-bis(10,12-tricosadiynoyl)-sn-glycero-3-phosphocholine (dc8,9pc) and saturated 1,2-dimyristoyl-sn-glycero-3-phosphocholine (dmpc), in molar ratio 1:1, were combined to give a chemically modified membrane by uv-polymerization. biophysical characterization was carried out by determining the hydrophobic factor and the hydrodynamic radius. cytotoxicity was evaluated through haemolytic capacity on bovine red blood cells and, indirectly, through the capacity to induce lipid peroxidation in microsomes or mitochondrial membranes.
the haemolysis percentage in the presence of dc8,9pc/dmpc is less than that induced by polymers used in dentistry. the data obtained suggest that the polymerized lipids cannot induce lipid peroxidation in natural membranes. the polymerized diacetylenic liposomes showed less interaction with serum proteins than non-polymerized liposomes and lower cytotoxicity compared with natural lipids. cell viability was also determined in the nih3t3 cell line after exposure to the lipid systems under study. the hydrophobic factor showed a further increase for polymerized liposomes and is discussed in relation to in vitro stability. the above results suggest that polymerized and non-polymerized liposomes would serve as effective delivery vehicles.

s. sonar, s. d´souza, k. p. mishra radiation biology and health sciences division, bhabha atomic research centre, mumbai-400085, india
liposomes offer new approaches for drug delivery: encapsulation alters the pharmacodynamic properties of the loaded drug, leading to reduced toxicity and/or improved efficacy. for prolonged systemic circulation, the liposome size has been shown to be limited to 200 nm or less. the ethanol injection method is an excellent technique for the formation of liposomes of < 200 nm without the need for sonication or extrusion. the present study was aimed at producing liposomes encapsulating doxorubicin in a minimum of procedural steps. liposomes were prepared using either distearoyl phosphatidylcholine (dspc) and cholesterol (chol), or dspc, cholesterol and oleic acid (oa). the effects of different operational conditions for vesicle production and drug encapsulation were evaluated, with a view to minimizing process cost while achieving suitable size and high encapsulation efficiency. although high efficiency of doxorubicin encapsulation was obtained by the 'active' or 'remote' loading process in the dspc/chol system, it was poor in the one-step injection method. oleic acid was included to avoid the need for active loading by ph gradient.
dspc/chol/oa systems spontaneously loaded doxorubicin with an encapsulation efficiency of 50 % and a final drug to lipid molar ratio of up to 0.172. the mean diameter of the vesicles was 175 ± 5 nm. the method offers liposomes of small size with high loading.

design of peptides with consecutive dehydro phenylalanine residues
in order to develop general rules for the design of peptide conformations with consecutive alpha,beta-dehydro phenylalanine residues, peptides were synthesized and crystallized, and their crystal structures and molecular conformations were determined. the following conclusions were drawn from the structural data:
- peptide unit sequences with two consecutive dehydro-phe residues at (i+1) and (i+2) positions adopt an unfolded s-shaped structure with dihedral angles phi, psi centred at 60°, 30°.
- peptides containing two consecutive dehydro-phe residues at (i+2) and (i+3) positions
  • form two overlapping type iii beta-turns (incipient 3₁₀-helix);
  • with a branched beta-carbon residue only at the (i+1) position, adopt a conformation with two overlapping beta-turns of types ii and iii;
  • with branched beta-carbon residues such as val and ile at both (i+1) and (i+4) positions, form two overlapping beta-turns of types ii and i.
the consistency with which these conformations form makes the design of peptides with alpha,beta-dehydro residues a useful and highly predictable method for developing specific ligands for various biological applications, including drug design.

binding of cationic porphyrin to isolated double-stranded dna and nucleoprotein complex
k. zupán 1 , l. herényi 1 , k. tóth 2 , z. majer 3 , g. csík 1 1 institute of biophysics and radiation biology, semmelweis univ., budapest, hungary, 2 biophysics of macromolecules, german cancer research center, heidelberg, germany, 3 department of organic chemistry, eötvös loránd univ., budapest, hungary
the complexation of tetrakis(4-n-methylpyridyl)porphyrin (tmpyp) with free and encapsidated dna of t7 bacteriophage was investigated. to identify binding modes and relative concentrations of bound tmpyp forms, the porphyrin absorption spectra at various base pair/porphyrin ratios were analyzed. spectral decomposition, fluorescence lifetime, and circular dichroism measurements proved the presence of two main binding types of tmpyp, namely external binding and intercalation, both in free and in encapsidated dna. tmpyp binding does not influence the protein structure and/or the protein-dna interaction. concentrations of tmpyp species were determined by comprehensive spectroscopic methods. our results facilitate a qualitative analysis of the tmpyp binding process under various experimental conditions. we analyzed the effect of the base pair composition of dna, the presence of the protein capsid and the composition of the buffer solution on the binding process.

protein crystallization and structural study of upa protease domain with active site serine mutation
the urokinase (upa) system is composed of upa, its receptor (upar), and its inhibitor (pai). it plays an important role in various physiologic processes, including fibrinolysis, cell adhesion, and signal transduction, and has been recognized as a target for intervention in tumor growth and tumor metastasis. we constructed an active site mutant of the upa protease domain (159-405) with three mutations (c279a, n302q, and s356a) and expressed it as a secreted protein in pichia pastoris with the ppicza vector. the secreted mutant was captured from the culture medium by a cation exchange column and then further purified on a gel filtration column.
the purified mutant was then crystallized by the sitting drop vapor diffusion method under several precipitant conditions: (1) 1.5-2.1 m ammonium sulphate, 5-8% peg400, 50 mm sodium citrate ph 4.6 or 50 mm sodium phosphate ph 7.5, 0.05% sodium azide; (2) 1.8 m ammonium sulphate, 0.14 m lithium sulphate, 50 mm sodium acetate at ph 5.2; (3) 2.5-2.8 m sodium formate, 50 mm sodium acetate at ph 5.2; (4) 3.2-3.6 m sodium chloride, 50 mm sodium acetate at ph 5.2. the crystals were of varying quality but generally diffracted to 1.7-2.1 å with an in-house x-ray source. the structure of this upa mutant and its complexes with various inhibitors will provide a platform for rational upa inhibitor design.

in this work, the linear interaction energy (lie) method was used to calculate the binding free energies of hiv-1 integrase (in) and a series of dicaffeoyl- or digalloyl-pyrrolidine and furan derivative inhibitors. a model for predicting the binding free energy of homogeneous inhibitors of hiv-1 in has been obtained with a root-mean-square deviation (rmsd) of 1.39 kj/mol and is estimated to be a precise model with good prediction capability. in addition, the probable binding mode of this series of inhibitors with hiv-1 in was proposed by using molecular docking and molecular dynamics (md) simulation methods. our results indicate that the caffeoyl or galloyl groups of the inhibitors interact closely with the conserved dde motif of hiv-1 in.

a specific non-competitive inhibitor of a small g protein/gef complex

on the protective role of selenium and catechin in cadmium toxicity
s. özdemir 1 , s. dursun 1 , s. toplan 1 , n. dariyerli 2 , m. c. akyolcu 1 1 istanbul university, cerrahpasa medical faculty, department of biophysics, turkey, 2 istanbul university cerrahpasa medical faculty, department of physiology, turkey
cadmium, a heavy metal, is toxic and carcinogenic for organisms. cadmium exerts its effects on living organisms by accumulating in blood and various tissues.
due to this accumulation in various tissues and in blood, tissue antioxidant enzyme systems are affected. the present study was planned to determine the possible protective roles of selenium and catechin against the toxic effects of this heavy metal. the study was performed in wistar albino rats divided into four groups: control, cadmium, cadmium+selenium and cadmium+catechin. in addition to cadmium, selenium concentrations were determined in blood, liver and kidney tissues of each group of rats. in the same tissue samples, besides lipid peroxidation measurements, glutathione levels and glutathione peroxidase and superoxide dismutase enzyme activities were also determined. accumulation of the heavy metal was found in blood, liver and kidneys after cadmium administration during the experimental period. in the tissues of the experimental group animals, lipid peroxidation was increased while antioxidant enzyme activities were decreased. while selenium was found to decrease cadmium toxicity, no statistically significant effect of catechin was observed.

proposed that motors could dynamically cluster at the tip of tubes when they are individually attached to the membrane. we demonstrate, in a recently designed experimental system, the existence of an accumulation of motors allowing tube extraction. we determine the motor density along a tube by using fluorescence intensity measurements. we also perform a theoretical analysis describing the dynamics of motors and tube growth. the only adjustable parameter is the motor binding rate onto microtubules, which we measure to be 4.7 ± 2.4 s⁻¹. in addition, we quantitatively determine, for a given membrane tension, the existence of a threshold in motor density on the vesicle above which nanotubes can be formed. we find that the number of motors pulling a tube can range from four at threshold to a few tens away from it.
the threshold in motor density (or in membrane tension at constant motor density) could be important for the understanding of membrane traffic regulation in cells.

kinesin and dynein move a peroxisome in vivo: a tug-of-war or coordinated movement?
c. kural, p. r. selvin, k. hwajin, g. goshima, v. i. gelfand university of illinois at urbana-champaign, usa
we have used fluorescence imaging with one nanometer accuracy (fiona) for the analysis of organelle movement by conventional kinesin and cytoplasmic dynein in a cell. we can locate a green fluorescent protein (gfp)-tagged peroxisome in cultured drosophila s2 cells to within 1.5 nanometers in 1.1 milliseconds, a 400-fold improvement in temporal resolution, sufficient to determine the average step size to be 8 nanometers for both dynein and kinesin. furthermore, we find that dynein and kinesin do not work against each other in vivo during peroxisome transport. rather, we find that multiple kinesins or multiple dyneins work together, producing up to 10 times the in vitro speed.

engineering a bio-molecular walker
h. jankevics, j. e. molloy division of physical biochemistry, mrc national institute of medical research, the ridgeway, mill hill nw7 1aa, london, uk
in this work we describe the design and development of a biomolecular walker based on the motile system found in certain ciliated protists. the motile system is driven by the binding of ca 2+ ions and, in contrast to other commonly studied motor proteins, is independent of atpase (amos et al., 1975). the motor protein is a 20 kda ca-binding protein called spasmin (maciejewski et al., 1999; itabashi et al., 2003), which belongs to the ef-hand family of calcium binding proteins called calmodulins.
upon calcium binding, spasmin is thought to undergo a large conformational change as it binds its own target peptide, and we wish to exploit this in order to create our own novel molecular walker. we have created a recombinant spasmin with sequence tags to enable specific immobilization and various conjugate chemistries. cysteine mutants have been introduced at specific points in the protein to enable attachment of other small molecules, for example fluorophores. we are now optimising protein expression and purification to maximise the yield of active protein on which we can perform the conjugate chemistries. we will characterize the structural changes using biophysical methods such as circular dichroism, analytical ultracentrifugation, electron microscopy, afm and total internal reflection fluorescence spectroscopy on single molecules.

the force generation in muscle arises from direct interaction of the two main protein components of the muscle, myosin and actin. the process is driven by the energy liberated from the hydrolysis of atp by myosin. the interaction proceeds through the cyclic interaction of myosin with atp and actin, and at least six intermediates are proposed for actomyosin atpase in solution. the powerful dsc technique allows the derivation of the heat capacity of proteins as a function of temperature. from the deconvolution of the thermal unfolding patterns it is possible to characterize the structural domains of the motor protein. in this work we tried to approach the temperature-induced unfolding processes in different intermediate states of atp hydrolysis in striated muscle fibres. we have extended the experiments to study the fibre system prepared from psoas muscle of rabbit in rigor, strongly binding and weakly binding states of myosin to actin, where the inorganic phosphate (p i ) was substituted by the phosphate analogue orthovanadate.
the dsc transitions were analyzed in different buffer solutions (tris and mops) to get information about the influence of the temperature dependence of ph on the conformational changes.

single kinesin motor proteins walking through the searchlight
s. verbrugge, l. c. kapitein, e. j. peterman vrije universiteit, de boelelaan 1081, 1081 hv, amsterdam, the netherlands
the dimeric motor protein kinesin steps by a hand-over-hand mechanism. this means that the centre of mass moves in 8 nm steps, while the two motor domains, one after the other, move 16 nm to the next binding site on the microtubule. the molecular details of what happens during a step are not fully understood, partly because of the lack of time resolution in wide-field, single-molecule fluorescence experiments. we set out to develop an approach to study the motility of kinesin with a time resolution below a millisecond (a single step takes on the order of 10 milliseconds). this approach allows us to look into the mechanochemistry and coupling of the two kinesin motor domains while they are stepping. our method is based on confocal microscopy, and we study the fluorescent properties of single labeled motors while they walk through the confocal laser spot. we present the experimental details of our approach and show our results on human kinesin constructs that are specifically labeled in the tail. we show that our approach enables us to study the mechanism of kinesin with a much higher time resolution than was achieved before with single-molecule fluorescence experiments.

movement of coupled single-headed kinesins analysed by a brownian-ratchet model
the analysis of this system is expected to provide insights into the mechanism underlying the motility of conventional double-headed kinesin, especially the roles played by individual heads.
we would like to clarify whether the experimentally observed behaviors that are supposed to be caused by a pair of single-headed kinesins can be explained by a simple brownian-ratchet model, which is successful in describing the motion of the unconventional single-headed kinesin kif1a. our model consists of two brownian motors (ratchets) separated by a fixed distance r. the velocity and other quantities of the coupled motors are calculated by solving the fokker-planck equation with various choices of r and other parameters. then, assuming a certain probability distribution of r associated with the random attachment of kinesin heads on a bead in the experiment, the statistical properties of the motion of the coupled brownian motors are analysed. the force-velocity relation observed experimentally is found to be consistent with the present model with appropriate choices of the model parameters. the adequacy of the parameter choice needs to be confirmed by other experiments.

optical trap with fast programmable feedback loop to study rotary molecular motors
t. pilizota, f. bai, r. m. berry clarendon laboratory, university of oxford
an optical trap with back-focal plane detection and fast programmable feedback has been developed for the study of rotary molecular motors. a helium-neon laser (632 nm) is used for position detection, and a solid state fibre laser (1064 nm, 3 w cw) forms the trap. acousto-optic deflectors (aods) controlled by a digital signal processing board are used to achieve programmable feedback loops with flexible control options and speeds up to 8 khz. several modes of feedback are demonstrated, controlling both bead position (x, y) and angle (r, θ). polystyrene beads or bead pairs can be held at a set (x, y) or θ, and the set-point can be changed while the program is running. for example, feedback can be used to move a bead or a bead pair in a circle. results of using the system to study the bacterial flagellar motor are presented.
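
as an illustration of the kind of position feedback described above, the following is a minimal sketch, not the authors' control program. it assumes a noise-free, overdamped bead that relaxes exponentially towards the trap centre, a proportional feedback law, and hypothetical values for the circle radius, rotation rate, trap corner frequency and gain; only the 8 khz update rate is taken from the abstract.

```python
import math

def circle_feedback(radius_nm=100.0, f_rot_hz=10.0, f_loop_hz=8000.0,
                    f_corner_hz=1000.0, gain=0.5, duration_s=1.0):
    """Steer a trapped bead around a circle with proportional feedback.

    All parameter values except the 8 kHz loop rate are hypothetical.
    Returns the final bead position (nm) and its distance from the set-point.
    """
    dt = 1.0 / f_loop_hz
    # Fraction of the bead-trap offset that survives one loop period
    # for an overdamped bead in a harmonic trap (thermal noise ignored).
    relax = math.exp(-2.0 * math.pi * f_corner_hz * dt)
    bead = [0.0, 0.0]
    err = 0.0
    n_steps = int(round(duration_s * f_loop_hz))
    for step in range(1, n_steps + 1):
        # The set-point moves around the circle at f_rot_hz.
        angle = 2.0 * math.pi * f_rot_hz * step * dt
        target = (radius_nm * math.cos(angle), radius_nm * math.sin(angle))
        for axis in range(2):
            # Proportional feedback: push the trap past the set-point
            # by gain times the current position error.
            trap = target[axis] + gain * (target[axis] - bead[axis])
            # The bead relaxes towards the trap centre during one period.
            bead[axis] = trap + (bead[axis] - trap) * relax
        err = math.hypot(bead[0] - target[0], bead[1] - target[1])
    return bead, err

if __name__ == "__main__":
    bead, err = circle_feedback()
    print("final bead position (nm):", bead, "tracking error (nm):", err)
```

in this toy model the set-point can be changed between loop iterations simply by replacing the target function, which mirrors the statement that the set-point can be changed while the program is running.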
a dimeric 1-d lattice gas as model for molecular motors collective dynamics
p. pierobon 1 , t. franosch 2 , e. frey 3 1 ludwig-maximilian universitaet, münchen, germany, 2 hahn meitner institut, berlin, germany, 3 arnold sommerfeld center for theoretical physics, münchen, germany
the transport of molecular motors along microtubules closely resembles the dynamics of a driven lattice gas of dimers without conservation of particles. the unidirectionality, asymmetry and stochasticity of the motion are encoded in the well studied totally asymmetric simple exclusion process (tasep). we extend the model to a more realistic one, including attachment and detachment kinetics and extended (dimeric) particles. we study the stationary phase diagram by means of monte carlo simulations combined with a continuum description (based on an extended mean field theory). we also evaluate the domain wall theory, finding the effective potential confining the phase interface in the bulk.

a. e. wallin 1 , j. lisal 2 , r. tuma 2 1 department of physical sciences, pobox 64, fin-00014, university of helsinki, finland, 2 institute of biotechnology, university of helsinki, finland
molecular motors often consist of two or more subunits that cooperate to convert chemical energy into mechanical motion. hexameric helicases and viral packaging atpases constitute a special class of molecular motors that translocate along nucleic acids. recent structural and spectroscopic characterization of these motors revealed that their enzymatic cooperativity does not result from cooperative binding [1-2]. in order to understand this new type of cooperativity we simulated the kinetics of a single hexameric motor by multiple coupled stochastic reactions using the gillespie algorithm [3]. in contrast to analytical methods, our direct simulation allowed us to investigate the kinetics with an arbitrary model for cooperativity between the subunits.
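
the gillespie direct method mentioned above can be sketched as follows. this is a toy model under stated assumptions, not the authors' simulation: each of six ring subunits is either empty or nucleotide-bound, the rate constants `k_bind` and `k_hyd` are hypothetical, and the cooperativity rule (hydrolysis is accelerated by the factor `coop` when the preceding subunit is empty) is invented purely to show how an arbitrary coupling between subunits can be plugged in.

```python
import math
import random

def gillespie_hexamer(k_bind=10.0, k_hyd=5.0, coop=4.0, t_end=10.0, seed=0):
    """Gillespie direct-method simulation of a toy six-subunit ring.

    Rates (per second) and the cooperativity rule are hypothetical.
    Returns the number of hydrolysis events and the final ring state.
    """
    rng = random.Random(seed)
    state = [0] * 6          # 0 = empty, 1 = nucleotide-bound
    t = 0.0
    n_hydrolysed = 0
    while True:
        # Propensity of the single reaction available to each subunit.
        rates = []
        for i, s in enumerate(state):
            if s == 0:
                rates.append(k_bind)      # empty subunit binds nucleotide
            else:
                # Toy coupling: faster hydrolysis if the preceding
                # subunit in the ring (index i - 1, wrapping) is empty.
                factor = coop if state[i - 1] == 0 else 1.0
                rates.append(k_hyd * factor)
        total = sum(rates)
        # Waiting time to the next event is exponential with rate `total`.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # Pick the reacting subunit with probability rate / total.
        r = rng.random() * total
        acc = 0.0
        chosen = len(rates) - 1
        for i, rate in enumerate(rates):
            acc += rate
            if r < acc:
                chosen = i
                break
        if state[chosen] == 0:
            state[chosen] = 1
        else:
            state[chosen] = 0
            n_hydrolysed += 1
    return n_hydrolysed, state

if __name__ == "__main__":
    events, final_state = gillespie_hexamer()
    print("hydrolysis events:", events, "final state:", final_state)
```

because the coupling enters only through the propensity list, replacing the `factor` line is enough to try a different cooperativity mechanism, which is the flexibility the direct simulation offers over analytical treatments.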
simulations of the kinetics of the hexameric rna packaging motor p4 from dsrna bacteriophage [1] with different cooperativity mechanisms provided insight into the rna-mediated cooperativity and yielded a sound theoretical basis for the interpretation of experimental results [2].

the viral infectivity factor (vif) encoded by hiv-1 is a small basic protein that strongly modulates viral replication and is required for pathogenicity. vif is packaged into hiv-1 particles through a strong interaction with genomic rna and is associated with viral nucleoprotein complexes. moreover, vif acts during the early stages of viral infection (capsid disassembly, reverse transcription) as well as during the late stages of virus replication (virus assembly and maturation of the virion). however, the effect on early stages is probably a consequence of defective assembly and/or virion maturation. understanding the rna-binding properties of vif would help to elucidate the role played by vif in the regulation of genomic rna trafficking in the cytoplasm to enable efficient packaging, and in preventing cellular inhibitors from altering hiv-1 rna. in this context, we have characterised the interactions of recombinant vif with hiv-1 genomic rna by fluorescence spectroscopy, and determined the affinity of the protein for synthetic rnas corresponding to various regions of the hiv-1 genome. taken together, our results demonstrate cooperative and specific binding. in particular, we showed that vif has a high affinity for the 5' untranslated region of hiv-1 genomic rna.

s. bernacchi 2 , e. ennifar 1 , k. toth 2 , p. walter 1 , j. langowski 2 , p. dumas 1 1 cnrs upr 9002 - strasbourg (france), 2 dkfz - heidelberg (germany)
we have used the dimerization initiation site (dis) of hiv-1 genomic rna as a model to investigate hairpin-duplex interconversion by using a combination of fluorescence, uv-melting, gel electrophoresis and x-ray crystallographic techniques.
fluorescence studies with molecular beacons and crystallization experiments with 23-nucleotide dis fragments showed that the ratio of hairpin to duplex formed after annealing in water essentially depends on rna concentration, and not on cooling kinetics. with natural sequences able to form a loop-loop complex (or 'kissing complex'), concentrations as low as 3 µm in strands are necessary to obtain a majority of the hairpin form. on the contrary, when kissing-complex formation was made impossible by mutation in the loop, a majority of hairpins was obtained even at 80 µm in strands. this mutated sequence also showed that kissing-complex formation is not a prerequisite for an efficient conversion to duplex in the presence of salts. we proved that this happens through hairpins engaged in a cruciform intermediate, but not from free strands after hairpin melting. supporting this view, the very first step of formation of such a cruciform intermediate could be trapped in a crystal structure. such a mechanism might be biologically significant beyond the strict field of hiv-1 rna dynamics.

generation of the dimeric rna form of the human immunodeficiency virus type 1 (hiv-1) genome is important for viral replication. the dimerization initiation site (dis) has been identified as a short sequence that can form a stem-loop structure with a self-complementary sequence in the loop and a bulge in the stem. a 39-mer dis rna fragment, dis39, spontaneously formed a "loose dimer" and was converted into a "tight dimer" by addition of the nucleocapsid protein ncp7. nmr chemical shift analysis for dis39 in the kissing-loop and extended-duplex dimers revealed that the three dimensional structures of the stem-bulge-stem region were similar between the two types of dimers.
therefore, we determined the solution structures of two shorter rna molecules corresponding to the loop-stem region and the stem-bulge-stem region of dis39, and the solution structures of dis39 in the kissing-loop and extended-duplex dimers were determined by combining the partial structures. the mechanism of conformational conversion will be discussed based on the solution structures and the molecular dynamics analysis.

nmr and molecular modelling studies of an rna hairpin containing a g-rich hexaloop
the mrna of the pgy1/mdr1 gene encoding the transmembrane p-glycoprotein (p-gp) contains a hairpin that is the target of antisense oligonucleotides suppressing the multidrug resistance function of p-gp. the solution conformation of this hairpin, constituted by the 5'(gggaug)3' loop closed by a stem containing a g-u mismatch, is studied by nmr and molecular dynamics in explicit solvent. special attention is given to the sugar and backbone conformations, and the intrinsic properties of the two components (stem and hexaloop) are carefully investigated. the stem structures obtained by molecular dynamics with and without nmr constraints converge to the same a-type double helix. the wobble g-u mismatch moderately perturbs the overall conformation, despite c2'-endo sugars and unusual backbone conformations located between the mismatch and the loop. in the hexaloop, the sugar puckers are mostly in c2'-endo conformations, probably to extend the strand with the help of unusual backbone angles. the loop appears to be stabilized by one hydrogen bond and by stacking interactions. thus, from the 5' to the 3' end, the four purine bases ggga are stacked together, then a u-turn-like structure is observed, and finally u stacks on the last g, which remains rather far from the stem.
aminoglycoside binding to hiv-1 dis kissing-loop complex: from crystals to cells e. ennifar 1 , j.-c. paillart 1 , a. bodlenner 2 , p. pale 2 , r. marquet 1 , p. dumas 1 1 cnrs upr 9002, strasbourg -france, 2 cnrs/université louis pasteur strasbourg umr 7123, strasbourg -france all retroviral genomes consist in two homologous single stranded rnas. dimerization is an essential step for viral replication. hiv-1 dimerization initiation site (dis) is a strongly conserved stem-loop in the 5' leader region of the genomic rna. it was shown in vivo that alteration of the dis dramatically reduces viral infectivity.
we have previously solved crystal structures of the dis kissing-loop complex. analysis of these structures revealed an unexpected resemblance between the dis kissing-loop and the 16s ribosomal aminoacyl-trna site (a-site), which is the target of aminoglycoside antibiotics. we have shown that some aminoglycosides specifically bind to the dis kissing-loop complex with an affinity and geometry similar to that observed in the a-site. in agreement with these previous results, we have now solved high-resolution crystal structures of the dis kissing-loop complex bound to four aminoglycosides. these structures show that, as expected, two aminoglycosides are bound per kissing-loop complex. importantly, the binding is observed not only in vitro on large hiv-1 genomic rna fragments, but also in infected cells. moreover, we showed that some of these aminoglycosides stabilize the kissing-loop rna dimer, which is consistent with the observation in crystal structures of numerous direct and water-mediated drug-rna contacts. these structures are currently used as starting points for designing potential new drugs targeted against the viral rna.
modeling the long range entropy of rna: w. k. dawson 1 , k. fujiwara 1 , k. yamamoto 2 , g. kawai 1 1 chiba institute of technology, 2-17-1 tsudanuma, narashino, chiba, japan, 2 international medical center of japan, 1-21-1 toyama, shinjuku-ku, tokyo, japan
non-coding rna appears to make up a large part of the human genome. a reliable rna structure prediction program is needed to understand the structure of this non-coding rna. we recently developed a new way to model the long range entropy in rna and applied it to rna secondary structure prediction. in some instances, the new approach is able to achieve far better predictions than the state of the art secondary structure programs even given exactly the same parameters. predictions using this method tend to show distributions that are funnel shaped.
a new and important parameter in these calculations is the persistence length (a measure of the correlation and flexibility of the rna). (url: http://www.rna.it-chiba.ac.jp/vsfold/vsfold4/) this new approach has now been extended to prediction of pseudoknots. the method is a heuristic wherein the hierarchical folding hypothesis is used to find the pseudoknots as the rna secondary structure is folding, and corrections to that secondary structure are made to accommodate the pseudoknot. it is able to do these searches in roughly n^4 time. the model is consistent with the hierarchical hypothesis and it is possible to estimate rna folding times that are of the correct order of magnitude using this model. with further adaptations to account for the size, shape and variability of amino acid residues (hydrophobicity etc.), the model also appears to be transferable to protein folding problems.
a. v. melkikh ural state technical university, ekaterinburg, russia
a model of the genome as a gene network capable of receiving information about the environment and performing some operations on genes has been considered. the evolution rate of replicators for the mechanism of random mutations has been estimated. it was shown that the evolution rate under random actions is negligibly small for real dimensions of genomes of replicators [1]. it was inferred that only a deterministic mechanism of the evolution can explain the known evolution rate of replicators. a deterministic model of the evolution has been proposed. the basic principles of this model include: 1) information about replicator evolution is encoded in the conformational states of proteins; 2) the conformational language of proteins is translated into the language of nucleotide sequences during the evolution; 3) the structure of genes is controlled such that the transition to a nearest free ecological niche takes a minimum time (at a preset restriction on the control).
transfer rnas are synthesized as part of longer primary transcripts that require processing of both their 3' and 5' extremities in every living organism known. the 5' side is matured by the quasi-universally conserved endonucleolytic ribozyme, rnase p, while removal of the 3' tails can be either exonucleolytic or endonucleolytic. the endonucleolytic pathway is catalysed by an enzyme known as rnase z. rnase z cleaves precursor trnas immediately after the discriminator base in most cases, yielding a trna primed for addition of the cca motif by nucleotidyl transferase. rnase z is found in the vast majority of eukaryotes and archaea and in about half of the sequenced bacteria. it is often essential for growth, and mutations in one of the two genes encoding rnase z (elac2) have been linked with prostate cancer in man. in this poster we present the crystal structure of bacillus subtilis rnase z at 2.1 å resolution (1), solved by the mad method, and propose a model for trna recognition and cleavage. the structure explains the allosteric properties of the enzyme and also sheds light on the mechanisms of inhibition. it also highlights the extraordinary adaptability of the metallohydrolase domain of the β-lactamase family for the hydrolysis of covalent bonds. (1)i.
most rnas undergo several steps of post-transcriptional modification before carrying out their assigned functions. one of the major modifications is the splicing process, by which non-coding introns are removed from the coding exons. splicing can be performed by autocatalytic, self-splicing introns (e.g. group ii introns), i.e. catalytic rna or ribozymes. group ii introns, which occur in bacterial genomes and in organellar genes of plants, fungi and lower eukaryotes, consist of a conserved set of six domains.
domain 1 recognizes the 5'-exon through a 10-15 base pairing interaction formed by two regions within the intron (exon binding sites, ebs1 and ebs2) and the last 10-15 nucleotides of the 5'-exon (intron binding sites, ibs1 and ibs2). as the correct recognition of ibs1 by ebs1 is crucial for a successful splicing event, we are investigating the structural and metal ion requirements of this part by various spectroscopic techniques, e.g. nmr. our data show that the hairpin including ebs1 consists of a helical region followed by an unstructured single stranded part, which is ready for splice site recognition. the results of the structure analysis will be presented. financial support by boehringer ingelheim fonds (fellowship to d. k.) and the swiss national science foundation (snf-förderungsprofessur to r. k. o. s.) is gratefully acknowledged.
structural basis for the antigene and antisense properties of modified dna:dna and rna:dna duplexes e. c. m. juan 1 , t. kurihara 1 , j. kondo 1 , t. ito 2 , y. ueno 3 , a. matsuda 2 , a. takenaka 1 1 graduate school of bioscience and biotechnology, tokyo institute of technology, 2 graduate school of pharmaceutical sciences, hokkaido university, 3 faculty of engineering, gifu university
oligonucleotides containing polyamines are currently being evaluated as potential antigene and antisense compounds. those with 5-(n-aminohexyl)carbamoyl-2'-deoxyuridine ( n u) and its 2'-o-methyl derivative ( n u m ) exhibit improved nuclease resistance and form stable duplexes with their dna and rna targets. x-ray structures of these duplexes have shown good correlation between the conformational changes and the observed chemotherapeutic properties. the amide groups of the modified uracil bases form six-membered rings through the intra-molecular nh-o4 hydrogen bonds, so that the aminohexyl chains protrude into the major grooves.
some of the terminal ammonium groups are involved in intra-duplex interactions with phosphate oxygen anions, whereas the others interact with those of the adjacent duplex. such interactions contribute to the stability of duplex formation. the 2'-o-methyl modification in n u m shifts the ribose ring toward the c3'-endo conformation and influences duplex stability. observed changes in the dimensions of the minor grooves and in the hydration structures are also well correlated to nuclease resistance and duplex stability.
group ii intron ribozymes catalyze self-splicing in bacterial genomes as well as in organellar genes of lower eucaryotes. for correct structure and function these ribozymes need specific concentrations of monovalent and divalent cations such as k+ and mg2+. most of these ions are used for charge screening, but some are also bound to distinct sites fulfilling various specific tasks. the conserved secondary structure of group ii intron ribozymes consists of six domains grouped around a central wheel. here the focus is set on domain 5 (d5) of the yeast mitochondrial intron ai5γ, a hairpin of 34 nucleotides, which is crucial for catalytic activity. the 3-nucleotide bulge in d5 is known to be flexible and acts as a metal ion binding platform. we have investigated the binding of different metal ions (mg2+, ca2+, mn2+, cd2+) to this platform by fluorescence spectroscopy. for this, the bulge site adenosine in d5 was replaced by the fluorescent nucleotide base analogue 2-aminopurine (2ap). the binding data fit an equation describing binding to a single class of sites. the titration experiments not only reveal different dissociation constants for the tested metal ions but also indicate different effects on the bulge structure. financial support by the swiss national science foundation (snf-förderungsprofessur to r.k.o.s) is gratefully acknowledged.
modified nucleosides and across the anticodon loop interactions in trna u. b. sonavane, k. d. sonawane, r. p.
tiwari national chemical laboratory, pune 411008, india
in several interesting trna molecules, the 34th as well as the 37th nucleosides are hypermodified. as an example, the unique hypermodified nucleosides mcm5s2u34 and ms2t6a37 are crucial in human trna lys , which acts as a primer in hiv replication. modified nucleosides may facilitate or hinder across-the-loop interactions. large substituents in the 34th and 37th modified nucleosides, if oriented suitably, may also interact with each other. across-the-loop interactions may lead to unconventional anticodon loop structures, also affecting the flexibility of the anticodon loop. this may restrict or enlarge synonymous codon choice and decoding during protein biosynthesis. except for trna asn (with interacting q34 and t6a37), our studies show a conventional 'open' loop structure, free of across-the-loop interactions, for a number of interesting trna anticodon loops with diverse hypermodified nucleosides at both of these locations. molecular dynamics simulations of the hydrated anticodon arm of trna asn show a persisting interaction involving the diol group of q34 and the carbonyl group of the ureido linkage in t6a37. additionally, the hoogsteen edge of the 37th adenine base participates in hydrogen bonding with the watson-crick edge of the 33rd base and thus contributes to the unique loop structure of trna asn . the resulting suboptimal q:c base pairing leads to unbiased reading of u or c as the third codon letter. absence of the queuosine modification q34 is also associated with uncontrolled rapid proliferation of cells and malignant growth.
structural properties of ctg/cag repeats, and preliminary x-ray analysis of cug repeats y. sato, k. kimura, a. takénaka graduate school of bioscience and biotechnology, tokyo institute of technology, yokohama, japan.
the human genome contains many different types of repetitive sequences. some of them are tandem repeats of trinucleotides.
their unusual expansions cause genetic diseases such as type 1 myotonic dystrophy (dm1) and huntington's disease (hd), the unit sequences being ctg and cag, respectively. the numbers of repeats of the two complementary sequences change independently during dna replication or repair. the direct origin of dm1 is, however, the transcribed rna fragments with cug repeats, which form a specific structure and inhibit the synthesis of other proteins. in the present study, the structural versatility of such dna and rna fragments has been examined. native pages of (cug)n show that the hairpin structure with an even number of repeats is more stable than that with an odd number. this difference might be ascribed to the structural difference at the hairpin head. the pages also show that duplex formation is dependent on coexisting cationic species and their concentration. crystal data of (cug)6 (a=b=39.6, c=141.0 å, and the space group r32) suggest that the asymmetric unit contains the rna fragment. an approximate crystal structure solved by molecular replacement techniques at 1.95 å resolution shows that the rna fragments form a duplex similar to an a-form rna.
context-dependent selection of promoter in a natural selection-type evolution reactor h. nagayasu, y. ageno, x. t. ma, y. husimi saitama university, saitama, japan
using an isothermal amplification of hairpin dna/rna (developed by g.joyce), we drove a natural selection-type evolution reactor taking the specific growth rate as the fitness. we used hiv-1 rt and thermot7 rnap at 50 °c. starting from the random promoter pool, we selected the strongest promoter at 50 °c, which was separated by hamming distance 2 from the strongest promoter at 37 °c. the latter was found to be identical to the natural t7 promoter. when we used a simple random pool, the selection process showed one-step convergence.
when we used a random pool with a specific short sequence at the upstream flanking region of the random region, we observed an evolution process of convergence-divergence-convergence of the promoter sequence, driven by deletion of the specific short sequence. this context-dependent selection was found to come through a neutral path, judged from the fitness measurement.
intronic sirna and mirna, and dna methylation
gif sur yvette, france. abnormal activation of small g proteins is involved in several human diseases. small g proteins are activated by gdp/gtp exchange, which is stimulated by their guanine nucleotide exchange factors (gefs). thus, small g protein/gef complexes appear as emerging targets for interrupting signalling networks regulated by small g proteins in pathological contexts. the activation of small g proteins of the arf family is initiated by exchange factors which carry a sec7 catalytic domain stimulating the dissociation of the bound gdp nucleotide. the structure of the arf1-gdp-arno reaction intermediate, trapped by a mutation of the catalytic glutamate, was recently solved by x-ray crystallography (pdb code: 1r8s). acknowledgements: the work was supported by contract kj.
q. yin, f. chen, y. tan, t. gu institute of biophysics, chinese academy of sciences, beijing, china
sirna/mirna can efficiently induce mrna cleavage or translational repression at the posttranscriptional level in a sequence-specific manner. recently, it has been shown that these small rnas guide genome modification in mammalian cells. however, their ability to direct cognate dna methylation has been confirmed so far only in plants, and their biogenesis, functions, and modes of silencing genes remain elusive. here, we report that small rnas derived from intron regions of some genes can target homologous dna sequences in promoter, 5'utr or 3'utr regions of genes in different human tumor cells.
surprisingly, we also discovered by bioinformatics that endogenous sirnas from introns of genes possess a large number of target mrnas, and confirmed their existence in human cells with northern blot analysis. intronic small rnas generated by spliceosomes can form mature mirnas or sirnas through the processing of drosha and/or dicer. rt-pcr analysis indicated that vector-based small rna repressed expression of homologous genes at the transcriptional and/or translational levels. western blotting demonstrated that the expression of some proteins was greatly reduced or completely inhibited owing to promoter methylation. these findings reveal that the expression of some genes can incredibly control cell activities at both protein and rna levels. our results also suggest that these small rnas may regulate gene expression in different modes of action.
role of stacking in specific recognition of capped rna by the cbc protein
the stacking interaction involving the 7-methylguanine moiety of the mrna 5' terminal cap (m7g) and aromatic amino acid side chains is a common feature of all known cap-binding proteins. the crystal structures of the human cap binding complex (cbc) showed its induced folding upon m7gpppg cap analogue binding. stabilization of the cbc-m7gpppg complex by sandwich stacking of m7g in between y20 and y43 is additionally enhanced by stacking of the second base of capped mrna with y138. gibbs free energy of the association of various cbc mutants with the synthetic cap analogues, m7gpppg, m7gpppu, and m7gtp, has been determined by fluorimetric titration. preference of the wild type (wt) cbc for the dinucleotide analogues is also observed for the y43a mutant, with the energy loss of 0.8-1.5 kcal/mol. however, all proteins with the mutated second stacking partner y20 prefer to bind m7gtp compared with m7gpppg. the binding of m7gtp to the y20a mutant is only 0.3 kcal/mol less favourable compared with the wt cbc.
these divergences may be ascribed to smaller entropic costs of conformational rearrangement of cbc in the case of a smaller ligand, which can find more favourable contacts when its second part is not efficiently 'constrained' by stacking with y138. supported by kbn 3 p04a 021 25
annealing of the tar dna hairpin to a complementary tar rna hairpin, resulting in the formation of an extended duplex, is an essential step in the minus-strand transfer process of hiv-1 reverse transcription. in this work, we use gel-mobility-shift analysis to follow the kinetics of this reaction in the absence or presence of hiv-1 nc prepared by solid-phase peptide synthesis. to elucidate the reaction pathway, we use either the complete 59-nt tar hairpins or truncated 27- to 32-nt minihelices (mini-tar) derived from the top part (i.e. hairpin loop) of tar. assays were also carried out with mutant tar constructs. the annealing kinetics were studied systematically as a function of dna concentration and temperature. we show that the annealing initiates through a weak loop-loop kissing interaction, followed by a much slower conversion step, which results in formation of the extended duplex. nc facilitates both reaction steps, resulting in an overall 10^4-fold and 10^6-fold rate enhancement for mini-tar and tar annealing, respectively. we show that the kissing step is facilitated by the nc-induced nucleic acid aggregation, which is more pronounced for the longer tar hairpins. at the same time, the conversion steps in tar and mini-tar appear to be very similar and are similarly facilitated by nc 10-100-fold. the latter effect relies on the ability of nc to destabilize nucleic acid duplexes, and is equivalent to destabilization of a few base pairs required for the conversion initiation.
linking of the n-terminus of a peptide to its encoding mrna s. ueno, h. arai, y.
husimi department of functional materials science, saitama university, saitama, japan
in evolutionary protein engineering, in vitro selection using a cell-free translation system has the advantages of a large library size and of applicability to cytotoxic proteins. although many in vitro protein selection techniques, such as in vitro virus and ribosome display, have been developed, most of these link the genotype molecule to the peptide at its c-terminus. we developed a method to link the genotype molecule to the n-terminus of its encoded peptide. the mrna has a dna-linker hybridization region, translation enhancer, initiation codon, single codon, amber stop codon, n-terminus sequence in the gfp gene, his-tag and ochre stop codon. the mrna is also linked at its 5'-terminus to sup trna via a spacer and the specific amino acid. the mrna is translated in the cell-free translation system. thus, the c-terminus of the protein becomes free. moreover, in this system, the free protein of the full length is never generated. for this reason, the problem in the conventional technique, that is, the competition between the protein displayed on the genotype molecule and the free protein, is eliminated. we succeeded in completing one round of the "life cycle" of the in vitro virus, judged by his-tag selection.
key: cord-014661-mrh2pbi6 authors: dumitrascu, georgiana r.; bucur, octavian title: critical physiological and pathological functions of forkhead box o tumor suppressors date: 2013-12-31 journal: nan doi: 10.15190/d.2013.5 sha: doc_id: 14661 cord_uid: mrh2pbi6
the forkhead box, subclass o (foxo) proteins are critical transcription factors, ubiquitously expressed in the human body. these proteins are characterized by a remarkable functional diversity, being involved in cell cycle arrest, apoptosis, oxidative detoxification, dna damage repair, stem cell maintenance, cell differentiation, cell metabolism, angiogenesis, cardiac development, aging and others.
in addition, foxos have critical implications in both normal and cancer stem cell biology. new strategies to modulate foxo expression and activity may now be developed since the discovery of novel foxo regulators and non-coding rnas (such as micrornas) targeting foxo transcription factors. this review focuses on physiological and pathological functions of foxo proteins and on their action as fine regulators of cell fate and context-dependent cell decisions. a better understanding of the structure and critical functions of foxo transcription factors and tumor suppressors may contribute to the development of novel therapies for cancer and other diseases. the forkhead box (fox) proteins represent a wide family of transcription factors that display an extraordinary functional diversity, regulating a variety of critical biological processes 1, 2 . fox proteins are well known to control the following physiological processes: apoptosis, cell cycle, cellular metabolism, immune response, differentiation, development (such as cardiac development) and aging. fox proteins are also involved in various pathologies, such as cancer, diabetes and neurodegenerative diseases 3, 4 . the first forkhead transcription factor was identified in 1989, and its function was related to the development of the anterior and posterior gut of the drosophila embryo. the "forkhead" name was given because of the two spiked-head structures found in the embryos of the drosophila melanogaster forkhead mutant, which shows defects in gut formation 5 . one year later, the sequence comparison between forkhead and the mammalian hepatocyte-enriched transcription factor hnf-3a showed a conserved 110-amino-acid dna binding domain, bringing evidence that forkhead proteins are a new class of transcription factors 6 .
hence, forkhead transcription factors are defined by their winged-helix dna binding domain, a conserved structure called "forkhead box", the symbol fox being assigned to all vertebrate forkhead transcription factors, according to the revised nomenclature 7 . since its discovery, numerous forkhead genes have been identified in a broad range of organisms, from yeast and worms to humans 2 . interestingly, up to now, fox genes have not been identified in plants 8 . however, an update published in 2010 reported that there are 50 fox genes identified and classified in the human genome and 44 in the mouse genome 9 . therefore, a standard nomenclature system was required for this extended number of discovered factors. in 2000, daniel e. martinez and his team established the fox nomenclature committee, the first step towards a unified nomenclature for the winged-helix / forkhead transcription factors. the committee has changed the initial terms for the forkhead proteins (e.g. freac -forkhead related activator; fkh -drosophila gene fork head) 7 . currently, based on phylogenetic analysis, fox genes are classified in subclasses that range from foxa to foxs to yield 23 subclasses in total 10, 11, 12, 13 . each subclass has several members noted with an arabic number. while for human forkhead proteins abbreviations have all letters in uppercase (e.g., FOXD3), for mouse only the first letter is capitalized (e.g., Foxd3) and for all other chordates the first and subclass letter are in uppercase (e.g., FoxD3) 7 . one of the largest and the most important subclasses of the fox family is represented by foxo (forkhead box, subclass o) transcription factors. foxo transcription factors (foxos) are characterized by the same conserved dna binding domain that defines the family of forkhead proteins 1 . however, foxos share an additional unique 5 amino acid sequence insertion within the dna binding domain that is not present in other forkhead proteins 14 .
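the capitalization convention for fox symbols described above (all uppercase for human, only the first letter capitalized for mouse, first letter and subclass letter capitalized for other chordates) can be sketched in a few lines of code. this is an illustration only: the function name `format_fox_symbol` and the organism labels are hypothetical and are not part of the nomenclature itself.

```python
def format_fox_symbol(subclass: str, member: int, organism: str) -> str:
    """Format a Fox symbol following the capitalization rule in the text.

    Hypothetical helper for illustration; `organism` is a free-text label
    where only "human" and "mouse" get special treatment.
    """
    sub = subclass.lower()
    if organism == "human":
        # human: every letter in uppercase
        return f"FOX{sub.upper()}{member}"
    if organism == "mouse":
        # mouse: only the first letter capitalized
        return f"Fox{sub}{member}"
    # all other chordates: first letter and subclass letter in uppercase
    return f"Fox{sub.upper()}{member}"


for species in ("human", "mouse", "zebrafish"):
    print(species, format_fox_symbol("d", 3, species))
```

the three printed symbols differ only in letter case, which is exactly the information that is lost when the text is lowercased.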
only one foxo transcription factor is known in invertebrates (named abnormal dauer formation protein 16/daf-16 in the nematode worm caenorhabditis elegans and dfoxo in the fruit fly drosophila melanogaster), while in mammals four foxo proteins encoded by four different genes were identified: foxo1, foxo3, foxo4 and foxo6 15 . initially, foxo1 transcription factor (previously known as fkhr-forkhead in rhabdomyosarcoma) was identified through its involvement in chromosomal translocations t(2;13) and t(1;13) in alveolar rhabdomyosarcoma tumors due to pax3/7-fkhr fusion transcript 16, 17 . a few years later, foxo3 (previously known as fkhrl1 -forkhead in rhabdomyosarcoma like protein 1) was characterized and named, based on similarities to fkhr 18 . the gene for foxo4 was described as being fused to mll transcription factor as a result of the t(x; 11) chromosomal translocation in acute lymphoblastic leukemia; therefore, the initial term for foxo4 was afx (the acute leukemia fusion gene located on chromosome x) 19 . foxo proteins are considered unique cellular targets, regulating a wide variety of critical cellular processes, such as: apoptosis, oxidative stress, dna damage repair, cell cycle, stem cell proliferation and maintenance, metabolism, angiogenesis, vascular tone, cardiovascular development, fertility, immune response and neuronal survival 4, 20 . the aim of this review is to discuss the most recent advances in elucidating the functions, structure and transcriptional regulation of forkhead box o transcription factors, both in physiological and pathological conditions. the advances in understanding how foxos regulate stem cells and cancer stem cells, and how non-coding rnas regulate, and are regulated by, foxo genes/proteins are presented.
in humans, the primary structure of the foxo proteins is characterized by a length of approximately 655-675 amino acid residues (aa) for foxo1 (655 aa) and foxo3 (673 aa), and a shorter sequence of approximately 500 amino acids for foxo4 (505 aa) and foxo6 (492 aa) 21,22,23,24 . all members of the foxo family consist of four domains: a forkhead dna-binding domain (dbd), a nuclear localization signal (nls), a nuclear export sequence (nes) and a c-terminal transactivation domain 25 (figure 1) . the forkhead dna-binding domain (dbd), also called the forkhead box, is described as a "winged helix" due to the butterfly-like aspect on x-ray crystallography and nuclear magnetic resonance 26 . analysis of the amino acid sequence alignment of foxo proteins revealed a highly conserved dna binding domain among all the members of the broader group of forkhead transcription factors, but also among species, in eukaryotic organisms, from yeast to humans 3, 27 . moreover, several studies point out that the forkhead dbd is not the only region in the foxo molecule that is highly conserved. similarities were also observed in the n-terminal region near the first akt/protein kinase b (pkb) phosphorylation site (thr24 for foxo1 and thr32 for foxo3), the region containing the nls, and sequences from the c-terminal transactivation domain 25 . as mentioned above, the forkhead domains in the foxo subclass of the fox family are similar to the forkhead regions of other subclasses. for example, structural similarities have been proven in the core region of foxo3 compared to foxa3, foxk1 and foxp2 28 . the forkhead/winged helix motif is a sequence shared between subclasses of fox, of around 100 amino acids, that folds into three alpha-helices (h1, h2 and h3), three beta-strands (s1, s2 and s3) and two wing-like loops (w1 and w2). the structure has an h1 -s1 -h2 -h3 -s2 -w1 -s3 -w2 topology and the strands s1, s2 and s3 interact with each other forming a beta-sheet.
while the n-terminal part of the domain is formed by a cluster of three alpha-helices, the c-terminal half consists of beta strands s2 and s3 and two large loops (wings w1 and w2) 25 . the main dna recognition site is represented by the highly conserved helix h3 of forkhead dna-binding domain. the stability of forkhead dbd -dna complexes is increased by the more variable regions of the dbd, including h1, the region between h2 and h3, wing w1 and c-terminal segment 25 . although the foxo forkhead box is clearly related to those found in other forkhead genes, all foxo members contain an additional segment of five amino acids (198-gdsns-202) between helices h2 and h3, that creates a small extra loop, forming a coil structure in the foxo3 dbd, and a short helix in the foxo4 dbd 28, 29 . this is important in sequence-specific interactions with dna-binding sites 30 . foxo proteins mediate transcriptional activation through binding to the conserved consensus core motif ttgtttac in the dna 31 . they bind dna through a foxo-recognized element with a t/c-g/a-a-a-a-c-a-a consensus sequence in the forkhead dbd (c-terminal) 26 . there are fourteen protein-dna contacts described, which mediate the activation/inhibition of foxo's target genes, such as bim, trail, p27, p21 and catalase. the main recognition site is the α-helix h3 32 . forkhead proteins function as transcription factors that bind to dna through their forkhead domain in order to upregulate or downregulate the expression of a tremendous number of target genes 20 . hence, the foxo transcription factors control the expression of a wide spectrum of genes that regulate essential physiological cellular processes, such as cell death/cell survival, cell cycle, cell proliferation, cell differentiation/development, angiogenesis, cell metabolism, stress response and stem cell maintenance 33,34,35,36,37,38 (figure 2) .
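as an illustration of how the ttgtttac core motif mentioned above could be located in a dna sequence, the following minimal sketch scans both strands for exact matches. the function names and the example sequence are hypothetical, and exact string matching is a simplifying assumption; a realistic binding-site search would allow degenerate positions and score matches.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase a/c/g/t sequence."""
    complement = {"a": "t", "t": "a", "g": "c", "c": "g"}
    return "".join(complement[base] for base in reversed(seq.lower()))


def find_core_motif(seq: str, motif: str = "ttgtttac"):
    """List (position, strand) for exact motif matches on both strands.

    Positions are 0-based offsets on the given (forward) strand; a "-" hit
    means the reverse complement of the motif occurs at that offset.
    """
    seq = seq.lower()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# made-up example sequence containing one site on each strand
example = "ccattgtttacggaagtaaacaatt"
print(find_core_motif(example))
```

searching for the reverse complement as well matters because a transcription factor site can lie on either strand of a promoter, while sequence files usually store only one strand.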
moreover, foxos are also involved in a wide range of pathological cellular processes, such as cancer, neurodegenerative diseases (parkinson's disease, alzheimer's disease), diabetes mellitus, cardiac failure, atherosclerosis, hypertension and anovulation 39, 40, 41, 42, 43, 44, 45, 46 . the functions and regulatory mechanisms of foxo1 and foxo3 proteins are considered to be partially redundant, with several exceptions. for example, akt phosphorylates all foxo family members, inducing their translocation into the cytoplasm and inactivation, while pp2a dephosphorylates part of the akt phosphorylation sites in foxo1 and foxo3, reactivating them 47 . several particular differences regarding the regulatory kinases, e3 ligases or other important enzymes and regulators exist. thus, their activity is in part controlled by different mechanisms 31 . foxo transcription factors regulate the expression of multiple pro-apoptotic and anti-apoptotic proteins, and they have the ability to induce apoptosis by activating either intrinsic or extrinsic pathways of apoptosis 48,49,50,51,52 . moreover, the consensus foxo recognition element (fre), (g/c)(t/a)aa(c/t)aa, which differs from that of other forkhead proteins, seems to have a very important role in both apoptotic pathways, since matching functional fre sites have been identified in the promoters of foxo target genes encoding fas ligand (fasl), insulin-like growth factor-binding protein 1 (igfbp1), the apoptotic regulator bcl-2 interacting mediator of cell death (bim) and others 30 . foxo triggers the mitochondria-dependent intrinsic apoptotic pathway through upregulation of multiple pro-apoptotic bcl-2 family members, such as puma, bim, noxa, bnip3, and downregulation of the anti-apoptotic bcl-2 family member bcl-xl 48,49,50,51,52 . 
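as an illustration of how candidate fre sites can be located in a promoter sequence, the following is a minimal sketch that scans both dna strands for the consensus (g/c)(t/a)aa(c/t)aa quoted above. the function names, the return format and the example sequence are assumptions made for illustration only; a sequence match is not evidence of a functional binding site, which requires experimental validation.

```python
import re

# FRE consensus as quoted in the text: (G/C)(T/A)AA(C/T)AA.
# The lookahead makes overlapping candidate sites visible as well.
FRE_PATTERN = re.compile(r"(?=([GC][TA]AA[CT]AA))")
FRE_LENGTH = 7

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_fre_sites(seq):
    """Hypothetical helper: list (start, strand, site) for every FRE match.

    `start` is the 0-based position of the leftmost base of the site on the
    forward strand, whichever strand the match was found on.
    """
    seq = seq.upper()
    hits = []
    # Forward strand: positions are used directly.
    for match in FRE_PATTERN.finditer(seq):
        hits.append((match.start(), "+", match.group(1)))
    # Reverse strand: scan the reverse complement, then map back.
    for match in FRE_PATTERN.finditer(reverse_complement(seq)):
        start = len(seq) - (match.start() + FRE_LENGTH)
        hits.append((start, "-", match.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Made-up sequence, not a real promoter.
    print(find_fre_sites("CCGTAACAACC"))
```

scanning the reverse complement matters because a transcription factor site can lie on either strand of the promoter; reporting forward-strand coordinates for both strands keeps the hits comparable.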
the expression of the anti-apoptotic protein bcl-xl is suppressed by foxo proteins, such as foxo4, after increasing the expression of the transcriptional repressor bcl-6, in order to trigger apoptosis 52 . both foxo-induced expression of the pro-apoptotic bcl-2 family members and foxo-downmodulated anti-apoptotic bcl-2 family members lead to apoptosis due to mitochondrial outer membrane permeabilization (momp), as a response to intracellular stress, growth factor deprivation or other factors 53 . these mitochondrial modifications result in release of cytochrome c, smac/diablo and omi/htra2 from mitochondria, subsequently triggering downstream caspase activation 54,55 . apaf 1 binds cytochrome c, forming a complex named the apoptosome, required for activation of caspase 9 (initiator caspase) 54 . caspase 9 activation further activates the effector caspases, such as caspases 3, 6 or 7, triggering the caspase cascade 54 . foxo proteins are also able to induce apoptosis by activating the receptor-dependent extrinsic pathway of apoptosis. they induce the upregulation of the death receptor ligands fasl and trail, promoting an autocrine and/or paracrine action of these death ligands on the death receptors 56 . the fas signaling pathway is important in apoptosis induction through the extrinsic pathway, since jurkat cells lacking several critical components of the fas pathway fail to induce foxo-dependent cell death 57 . upregulation of tumor necrosis factor related apoptosis inducing ligand (trail) by increased levels of foxo1 or foxo3 in cancer cells, such as prostate carcinoma cells, leads to apoptosis 58 . thus, binding of fas ligand (fasl) and trail to their receptors (fas/cd95/apo-1 for fasl and dr4, dr5 for trail) triggers a death-inducing signaling complex (disc) with a subsequent activation of caspases, mainly initiator caspase 8 and effector caspase 3, leading to apoptosis 59 . 
moreover, foxo transcription factors directly regulate the expression of tumor necrosis factor receptor-associated death domain (tradd). activation of foxo proteins by pi3k-akt pathway inhibitors results in increased expression of tradd protein 60 . tradd is an important adaptor protein interacting with tnfr1 and fas receptors, mediating apoptosis and nfkb pathway activation 61 . under normal physiological conditions, cyclins and their associated cyclin dependent kinases (cdks) are very important for the progression of the cell cycle 62 . in response to dna damage, foxo transcription factors increase the expression levels of the cdk inhibitors binding to cyclin/cdk complexes, such as p21 (also known as cdk inhibitor 1) and p27 (kip1) 63,64 . cdk inhibitors together with foxo-induced inhibition of cyclin expression act by stopping the cell cycle at different checkpoints in order to repair the dna damage or to remove the damaged cell 65 . for example, a study performed on 32d murine myeloid cells and on baf3 murine pre-b-cell lines brings evidence that endogenous foxo proteins are required to enforce cell cycle checkpoints after the cell cycle arrest at g1/s transition is induced by foxo by upregulating the negative regulators of the g1/s phase, such as the cdk inhibitors: p27 kip1 , p57 kip2 , p21 cip1/waf1 (cip/kip family), p15 ink , p19 ink (ink family) and the retinoblastoma protein family member p130 27 . also, foxo decreases the positive regulators, such as cyclin d1 or cyclin d2, blocking the g1/s transition 66 . additionally, foxo increases the expression of negative regulators such as gadd45alpha and cyclin g2, resulting in cell cycle arrest at g2/m 66 . surprisingly, foxo proteins act as transcription factors for plk1 expression during the cell cycle, plk1 being critical in promoting g2-m phase transition, m phase progression and end of mitosis 67 . 
a complete knock-down of foxo protein levels induces an arrest in cell cycle, suggesting that a basal, low level of foxo activity is required for cell cycle progression 67 . surprisingly, in specific settings, foxo may actually serve as a promoter of proliferation. for example, neutrophils isolated from foxo3 deficient mice show high levels of fasl expression and apoptosis, revealing that foxo3 may repress fasl expression in neutrophils, leading to proliferation in this type of cells 68 . foxo1 was also shown to positively influence proliferation induction in pancreatic β cells in vitro, under low nutritional circumstances. the mechanisms implicated in this process are not completely understood. however, the induction of ccnd1 gene transcription, which encodes cyclin d1, may at least partially be involved. cyclin d1 represents one of the earliest cell cycle-related events, being critical for the g1 to s phase progression during the cell cycle 34 . thus, foxo proteins coordinate the expression of multiple important cell cycle regulators, in order to block the g0/g1, g1/s or the g2/m transitions during cell cycle when needed, such as after dna damage. however, a basal level of foxo proteins is required for g2-m cell cycle transition, when foxo-dependent plk1 expression seems to be necessary. these results suggest that foxo proteins have to be tightly regulated and are important and sensitive regulators of cell cycle, proliferation and other critical cellular processes. foxo proteins play an important role in regulating differentiation of a wide variety of tissues. it is well known that foxos can control the differentiation of precursor cells into muscle, adipose tissue or blood cells. interestingly, the effect of foxos on differentiation is context dependent and sometimes foxo isoform dependent. while foxo3 promotes differentiation of erythroid cells, foxo1 suppresses the differentiation of precursor cells in adipose and muscle tissues 69 . 
foxos can also suppress bone formation by inhibiting wnt-β catenin-tcf signaling. wnt signaling is known to stimulate bone formation 70 . moreover, foxo proteins are required for maintenance of somatic/adult stem cells, such as hematopoietic stem cells, and they also regulate cancer stem cells 71 (see details in chapter 5). metabolic signaling mediated by forkhead transcription factors is conserved among multiple species, including but not limited to mammals, drosophila melanogaster and caenorhabditis elegans 20 . it was first noticed for the foxo homolog named dauer formation-16 (daf-16), in the caenorhabditis elegans worm 72 . foxo proteins are involved in critical physiological processes that regulate cellular metabolic activity in many organs, such as liver, pancreas, adipose tissue and hypothalamus 73, 74, 75, 76 . in the liver, foxo1 forms a complex with another transcription factor, the liver specific pgc1alpha, in order to induce gluconeogenesis by upregulating g6pase and pepck genes. this is important for maintaining glucose homeostasis 73 . confirming these results, loss of hepatic foxo genes in mice induces a downmodulation of gluconeogenesis and an upregulation of glycolysis 77 . in addition, ectopic expression of foxo1 in rat primary hepatocytes increases apociii, an inhibitor of lipoprotein lipase, suggesting that foxo1 is involved in regulating lipid metabolism 78 . recently, the nicotinamide phosphoribosyltransferase (nampt) gene was described as a new transcriptional target gene of foxos for regulating hepatic triglyceride levels 79 . in pancreas, foxo1 plays an important role in the maintenance of beta cell function, but also in the development of the pancreas, through repression of the pancreatic transcription factor pdx1 74 . additionally, foxo1 increases food intake by upregulating agrp and npy, and by downregulating pomc in hypothalamus, antagonizing the anorexigenic hormone leptin 76, 80 . 
recently, g-protein-coupled receptor gpr17 was described to be a foxo1 target that activates agrp, suggesting a new mechanism for foxo1-agrp mediated food intake that might provide a new treatment for obesity 81 . all these results establish foxo proteins as important regulators of cellular metabolism and potential targets in metabolic related diseases. foxo transcription factors function as sensors for reactive oxygen species (ros) and play an important role in cellular resistance to stress, increasing cellular survival. foxo proteins are able to protect the cell from oxidative damage, decreasing the availability of ros. they stimulate the expression of certain genes responsible for ros inactivation, such as the antioxidant enzymes manganese superoxide dismutase (mnsod), which catalyzes the conversion of superoxide to hydrogen peroxide, and catalase, which converts hydrogen peroxide into water and oxygen 27 . studies on human cardiac fibroblasts revealed that foxo also modulates the antioxidant enzyme peroxiredoxin iii, fighting against cell damage induced by oxidative stress. concomitant peroxiredoxin iii knockdown and foxo3 knockdown resulted in higher levels of hydrogen peroxide in response to serum starvation, as compared to peroxiredoxin iii knockdown only 82 . other foxo transcriptional targets (figure 3 ), such as sterol carrier proteins (scps), are now known to play an important role during the defence against oxidative stress. scps are implicated in protecting fatty acids from peroxidation 83 . foxo proteins are also able to increase dna repair, inducing a cell cycle arrest at the g2/m checkpoint, in order to provide time for repairing the dna damage. one of the transcriptional targets of foxo, implicated not only in cell cycle arrest but also in dna repair and cell survival in response to cellular stress, is gadd45 84 . similarly, the ddb1 gene protein product is able to repair dna damage through upregulation by foxo 69 . 
foxos regulate the longevity of the cell mainly by increasing resistance to stress, modulating the dna repair process and maintaining the stem cells 27 . lifespan extension is an important function of foxo that is conserved among several species. a well studied example is the role of the foxo homolog daf-16 (caenorhabditis elegans) in aging 85 . loss of foxo3 in human skin fibroblasts results in aging specific morphological changes, suggesting that foxo3 is necessary for maintenance or promotion of cellular longevity 86 . moreover, loss of foxo3 activity leads to decreased mnsod and enhanced cell injury in vascular smooth muscle explanted from aged mice, due to limited ros inactivation 87 . thus, cellular lifespan maintenance requires a certain level of foxo activity and this is mainly achieved by inactivation of akt activity 88 . foxo proteins not only play an important protective role during senescence/aging, but also during exercise. it was previously suggested that foxos may at least partially be involved in the exercise-induced beneficial cardiac effects. interestingly, exercise induces an upregulation of foxo3 and sirt1 in the heart, with subsequent increased mnsod, catalase and gadd45alpha, and decreased cyclin d2, suggesting the protective role of foxo proteins 89 . interestingly, other studies revealed that, in conditions of oxidative stress, foxo3 is also able to induce apoptosis by triggering a fas-mediated death pathway in cultured motoneurons, by activating trail, bh3-only proteins noxa and bim, and by promoting pro-apoptotic activity of p53 20 . 
additional reports suggest that suppression of foxo protein expression during oxidative stress can be protective to some extent for cells, since protein inhibition or gene knockdown of foxo1 and foxo3 decreases the ischemic infarct size in the brain, protects the metabotropic glutamate receptors during vascular injury, enhances pancreatic β-cell or neuronal survival through nad + precursors and provides trophic factor protection with erythropoietin (epo) and neurotrophins 20 . this suggests that while a certain level of foxo3 activation is required in cellular stress resistance, sustained activation or activation over a threshold of foxo3 is detrimental and may induce apoptosis. thus, foxo3 plays an important role in cellular decision during stress, helping the cells to survive by multiple mechanisms (such as dna repair, ros inactivation etc) and inducing apoptosis when dna or cellular damage cannot be repaired. foxo proteins play a central role in maintaining the immune response of the human body. cell type-specific deletions of foxo1 and/or foxo3 in mice revealed an important role of foxo proteins in regulating immunological homeostasis and tolerance, by controlling the function and development of b and t lymphocytes. thus, foxo1 and foxo3 are essential transcription factors involved in early b cell development and peripheral b cell function, since early deletion of foxo1 blocked b cell differentiation at the pro-b cell stage. this is due to a defect in interleukin 7 receptor alpha (il-7ralpha) expression. deletion of foxo1 in peripheral b cells was associated with defective expression of both cd62l and aid, with subsequent failure in class-switch recombination and reduced igg production upon immunization. 
moreover, it is known that the pi3k-akt-mediated inactivation of foxo1 is essential for the optimal proliferation of b cells, while ectopic expression of pi3k-independent variants of foxo1 or foxo3 (active triple mutants at the akt phosphorylation sites) resulted in cell cycle arrest and increased cell death in b cells 90 . another study on mice with t cell-specific deletion of foxo1 showed a decreased expression of il-7 receptor on mature t cells, suggesting that il7-r is a transcriptional target of foxo1, which mediates, through binding with il-7, the survival and homeostatic proliferation of peripheral t cells 91 . excessive inflammatory cells may become harmful for the human body through the generation of ros and through the production of cytokines. studies performed on mice deficient in foxo3 illustrated lymphoproliferation, inflammation of the salivary glands, lung, and kidney, and increased activity of helper t cells. these observations demonstrate the beneficial role of foxo proteins in the human body, by preventing t cell hyperactivity 20 . also, it was demonstrated that mir-182 has a central role in the late phase of clonal expansion of the helper t lymphocytes, by inducing il-2 which is able to inactivate foxo1. foxo1 was found to be a suppressor of proliferation in resting helper t lymphocytes and, in order to allow proliferation, t cell activation via tcr/cd28 and il-2r signaling must inhibit foxo1 90 . other studies reported that t cells derived from bim and puma deficient mice were resistant to apoptosis after il-2 deprivation, demonstrating that foxo3, through upregulation of bim, puma and p27 kip1 , is important for the induction of cell cycle arrest and apoptosis of t cells in the absence of cytokines 48 . foxo proteins may be beneficial for autoimmune disorders by inducing a fas mediated apoptosis that targets activated t cells, followed by a decrease in cytokine stimulation in patients with autoimmune lymphoproliferative syndromes 20 . 
foxo proteins play an important role not only during physiological cellular processes, but also in a few pathologies, such as cancer. foxos are well known tumor suppressor proteins. although foxo proteins protect the human body by playing a central role in a wide range of mainly physiological functions (as described earlier), under some circumstances, foxo's roles can become harmful for human cells. this is because foxo is also involved in a number of pathological functions, such as inflammation and muscle atrophy. for example, apoptosis places foxo proteins on the good side when it leads to tumor suppression. however, cellular apoptosis can itself become a significant component of pathology in diseases such as neurodegenerative disease, diabetes mellitus (dm), and cardiovascular injury 40,41,92,93 . cancer: foxo proteins are tumor suppressors foxo proteins are inactivated in a wide variety of malignancies, either posttranslationally (mainly by pi3k-akt mediated phosphorylation) or by fusion mutations (such as pax3-foxo1; mll-foxo3, mll-foxo4) 94, 95 . inactivation of foxo by pi3k-akt pathway activation is a common feature of many malignancies, such as prostate cancer, breast cancer, leukemia and glioblastoma 69 . conditional deletion of foxo family members in mice leads to lymphomas and hemangiomas, which suggests that loss of foxos maintains or promotes survival of tumor cells 96 . foxo loss or inactivation is known to play an important role in tumorigenesis or cancer progression, in vivo 97 . for example, a recent study described a significant correlation between low expression of foxo3 and a poor prognosis for gastric cancer patients, bringing evidence that foxo3 could be a valuable prognostic biomarker for patients with gastric cancer 98 . in addition, foxo3 overexpression was shown to reduce motility, invasiveness, and aggressiveness in estrogen receptor α-positive (erα+) breast cancer cells 39 . 
foxo proteins exert their tumor suppressor functions predominantly by promoting cell cycle arrest, apoptosis, ros inactivation and dna repair, through expression of their target genes 47,99 . for instance, foxo3-induced expression of bim, a pro-apoptotic member of the bcl-2 family of proteins, induced a caspase-dependent cell death in several types of cancers, such as in chronic leukemias, breast and gastric cancers 27 . similarly, foxo-induced trail and noxa expression induces apoptosis in many malignancies, including leukemias 100 . interestingly, foxo3 can also inhibit the protooncogene c-myc, indirectly controlling the transcription of a wide set of target genes implicated in cell survival, cell cycle, apoptosis and tumorigenesis. myc is a transcription factor and a well known promoter of survival, proliferation and tumorigenesis, being found upregulated in a large variety of malignancies. notably, activation of foxo proteins not only induces cell cycle arrest or apoptosis, but also a differentiation program. in chronic myeloid leukemia (cml), foxo3 can induce differentiation of cml leukemic cells by inhibiting the expression of id1 (inhibitor of dna binding 1). cml is characterized by the presence of the bcr-abl fusion protein, which is a constitutively active kinase, controlling many crucial downstream pathways implicated in cell cycle, proliferation, apoptosis and cell adhesion. bcr-abl activates the pi3k-akt pathway, inactivating foxo proteins and their pro-apoptotic and cell cycle arrest signals 27 . the use of the tyrosine kinase inhibitors (tkis) and of the proteasome inhibitor bortezomib results in inhibition of bcr-abl activity and its downstream pathways, with a subsequent activation of foxo proteins 102 . tki treatment induces a foxo-dependent suppression of id1 expression, leading to differentiation of the k562 bcr-abl positive cell line. 
thus, bcr-abl can maintain the leukemic state not only by promoting proliferation, cell cycle progression and inhibiting apoptosis, but also by inhibiting foxo-mediated differentiation 27 . inflammation (rheumatoid arthritis, osteoarthritis, systemic lupus erythematosus) it was shown that loss of functional foxo proteins leads to inflammatory cell activation in several disorders, with subsequent cellular damage through oxidative stress and excess of cytokines. inactivation of foxo3 in t lymphocytes, as well as inactivation of foxo1 and foxo4 in synovial macrophages in patients with rheumatoid arthritis and osteoarthritis, results in inflammatory cell activation. in addition, loss of foxo proteins may be a potential etiology for systemic lupus erythematosus (sle) and rheumatoid arthritis, since foxo1 gene transcript levels are downregulated in peripheral blood mononuclear cells of these patients 20 . a link between inflammation, insulin resistance and foxo transcription factors was previously suggested when downregulated foxo1 decreased the levels of c/ebp beta transcription factor in adipocytes, with the subsequent reduction in expression of the pro-inflammatory cytokines ccl2 (chemokine ligand 2) and il-6 103 . thus, foxo1 might indirectly induce an inflammatory status of the adipose tissue, responsible for the insulin resistance in type 2 diabetes. yet, there are studies that describe the ability of foxo proteins to directly induce inflammation through upregulation of the inflammatory cytokine il-1b 104 . however, further experiments are required in order to clearly understand the context related mechanisms of foxo-dependent modulation of inflammation. muscle atrophy appears in a variety of diseases, including cancer, diabetes and sepsis, and is characterized by accelerated proteolysis that can be induced through two pathways: the ubiquitin-proteasome pathway and the lysosomal pathway, as a consequence of autophagy 27 . 
studies revealed that foxo proteins can induce muscle atrophy, characterized by decreased muscle function 105 . thus, foxo proteins have been shown to increase the transcription of key regulators of both lysosomal and proteasomal proteolysis: the autophagy followed by the lysosomal proteolysis is stimulated by upregulation of bnip3, lc3, and gabarapl1, while the proteasomal proteolysis is induced by increasing the ubiquitin ligase atrogin1. it was also shown that foxo3 activity is both required and sufficient for induction of autophagy in muscle cells, since studies performed on adult muscle fibers from mice show that the ectopic expression of an active foxo3 mutant leads to lysosomal proteolysis after formation of autophagosomes, while the knockdown of foxo3 in these muscle fibers blocked autophagosome formation after starvation 106, 107 . autophagy has a critical role in maintaining cellular and metabolic homeostasis. it seems that the metabolic status of the cell strongly influences this process in both normal cells and cancer cells, despite the profound differences in their metabolism 108 . in cancer cells, atp is predominantly produced through the constitutive activation of aerobic glycolysis, a process that is modulated by the transcription factor hif1α 108 . since p38α is required to maintain the levels of hif1α target genes, researchers demonstrated that in colorectal cancer cells, the inhibition of p38α causes a rapid drop in atp levels, with an acute energy need which activates foxo3 in an ampk-dependent manner, in order to induce autophagy, cell cycle arrest and cell death in these cells 96, 108 . moreover, the knockdown of foxo3 was sufficient to induce hypertrophy in cultured neonatal rat cardiomyocytes. 
in these cells, stimulation with insulin inhibits foxo3 function, with subsequent downregulation of the antioxidant enzyme catalase and increased levels of ros, that in low levels may act as second messengers for intracellular signaling, making possible the increase in cell size 109 . atrogin 1, upregulated by foxo1 and foxo3, plays important roles in cardiovascular system, since mice lacking atrogin-1 are susceptible to cardiac hypertrophy 20 . foxo proteins are inducing atrophy of differentiated cardiac and skeletal muscle cells through protein synthesis inhibition, which leads to a decrease in cell size 69 . in skeletal muscle, this mechanism involves myostatin, a foxo transcriptional target and a secreted molecule that can induce atrophy by protein synthesis inhibition 110 . foxo1 might be implicated in the development or progression of type 2 diabetes, since increased foxo1 expression in diabetic mouse liver is associated with increased expression of pepck and g6pase, and inhibition of foxo1 activity downregulated both pepck and g6pase expression and normalized the blood glucose levels. thus, foxo1-mediated expression of g6pase and pepck is critical for gluconeogenesis in the liver during fasting, but its deregulation may be involved in diabetes etiology 111 . physiological functions of foxo take place under certain circumstances, since sometimes the ability to maintain the proper control is overwhelmed 20 . experiments on insulin-producing mouse pancreatic beta cells (betatc-6) show that chronic exposure to high glucose activates foxo transcription factors and leads to upregulation of endogenous inflammatory cytokines interleukin-1beta (il-1beta) and suppressors of cytokine signalling (socs). these events trigger the activation of caspase-3 with subsequent apoptosis, suggesting a new mechanism that leads to the destruction of endocrine pancreas in type 2 diabetes 92 . 
also, exposure to high glucose of cardiac microvascular endothelial cells (cmecs) isolated from hearts of adult rats shows that foxo transcription factors lead to reactive oxygen species (ros) accumulation and apoptosis, suggesting that foxos might be involved in microvascular complications of diabetes 93 . foxo is also involved in insulin resistance and metabolic syndrome, since the activation of foxo1 in cardiomyocytes leads to increased akt activity and attenuated cellular response to insulin, followed by decreased glucose uptake 112 . interestingly, clinical studies regarding metabolic status profile on age-related diseases, fertility, fecundity and mortality revealed higher hba1c levels and increased mortality risk associated with specific haplotypes of foxo1 113 . foxo proteins are also activated in an attempt to protect the human body against the oxidative stress resulting from hyperglycemia, which leads to increased production of ros in endothelial cells, liver cells, and pancreatic β-cells. this hyperglycemia-dependent ros increase leads to a subsequent development of insulin resistance and significant neurodegenerative and cardiovascular diseases in patients with diabetes mellitus 20 . foxos are necessary for endothelial cell development and angiogenesis, since mice that are deficient in foxo1 lack development of the vascular system and die by embryonic day eleven 114 . in addition, foxo1 and foxo3 were shown to be the most abundant foxo isoforms in mature endothelial cells, having an important role in the regulation of postnatal vessel formation and not only in embryogenesis 36 . unfortunately, angiogenesis is not only involved in critical physiological processes, such as embryogenesis and postnatal vessel formation, but it is also involved in pathological events, such as chronic inflammation and tumor growth 115 . 
thus, it is fascinating how the angiogenesis mediated by foxo proteins may become a negative element for the organism, antagonizing the tumor suppressor's main function through new vessel formation, which can lead to tumor cell growth 116 . foxo3 was associated with both cardiomyocyte survival after oxidative stress and heart muscle loss with subsequent ventricular dysfunction 43, 117 . foxos are activated after akt inhibition by insulin or other factors, leading to atrogin-1 induction, which results in a suppression of heart muscle cell size 43 . foxo3 proteins seem to inhibit vascular smooth muscle cell proliferation and growth in a rat balloon carotid arterial injury model, suggesting a role of foxos in the regulation of vascular tone and systemic arterial blood pressure, preventing or at least lessening the effects of atherosclerosis and hypertension 118 . also, decreased foxo1 expression due to high flow states in vessels leads to proliferation of vascular smooth muscle cells, vascular neointimal hyperplasia, and subsequent hypertension 45 . moreover, experiments on low-density lipoprotein (ldl) receptor knockout mice resulted in the prevention of atherosclerosis when the triple ablation of foxo1, foxo3 and foxo4 was induced in endothelial cells 44 . interestingly, the same experiment on myeloid cells led to more severe atherosclerosis compared to the controls, explained by the authors through increased proliferation of granulocyte-monocyte progenitors and high levels of inducible nitric oxide synthase (inos) and oxidative stress, which predispose to atherosclerosis 119 . notably, analysis of mouse oocytes revealed overexpressed foxo3 transcription factors in primordial and early primary follicles, but downregulated foxo in primary and more developed follicles, suggesting that foxo proteins also have reproductive functions, modulating oocyte and follicular cell maturation. 
to confirm the hypothesis, constitutively active foxo3 was induced in transgenic mouse oocytes in primary and more developed follicles, which affected oocyte growth and follicular development, leading to anovulation and luteinization of unruptured follicles, with subsequent infertility 46 . in addition, foxo3 and foxo1 mutations were detected in a small percentage of women with premature ovarian failure 120 . involvement of the foxo family of transcription factors in stem cell self-renewal, survival, proliferation and differentiation is currently under investigation. foxos have been shown to play critical functions in maintaining the self-renewal potential and quiescence of hematopoietic stem cells; however, the mechanisms of these processes are not yet well understood 1, 69 . recent reports show that foxo-mediated regulation of cell cycle, oxidative stress and apoptosis plays an important role in these processes 69 . stem cells are characterized by the capacity for self-renewal and the ability to differentiate 121 . they are necessary in maintenance and propagation of several adult tissues, including but not limited to blood, skin and gastro-intestinal epithelium. adult stem cells or cells with stem cell properties were also found in other critical organs/systems, such as the central nervous system and the lung 69 . previous studies suggested that hematopoietic stem cells are sensitive to reactive oxygen species (ros) levels. foxo family members are known to play a central role in ros detection and in inducing an adaptive response after ros exposure, by inducing the expression of critical enzymes that neutralise ros, such as catalase and manganese superoxide dismutase (mnsod). interestingly, foxo1, foxo3, and foxo4 inactivation leads to an upregulation of ros levels in hematopoietic stem cells and their death 121 . deletion of these three foxo family members in mice revealed their importance in controlling ros in stem cells, in vivo 27 . 
while the number of hematopoietic stem cells in bone marrow of foxo-deficient mice is low, an increase in myeloid progenitor cells in blood is observed. moreover, the repopulation ability is decreased in the absence of foxos and the treatment of foxo-deficient mice with n-acetylcysteine (nac) at least partially rescued these effects 27 . thus, persistent akt activation, which induces inactivation of foxo transcription factors by akt-mediated phosphorylation, results in ros-induced cell death. this is due to the fact that the cells cannot synthesize the foxo-dependent ros neutralizing factors catalase and mnsod 121 . among foxo family members, the most important regulator of hematopoietic stem cell survival and self-renewal is foxo3, since foxo3 knockdown induces hematopoietic stem cell depletion 121 . the ros neutralizing agent n-acetylcysteine (nac) can rescue hematopoietic stem cell quiescence and at least partially rescue their ros-induced loss 121 . all these results suggest that foxo1, foxo3 and foxo4-induced resistance to ros is critical in maintaining homeostasis of hematopoietic stem cells in bone marrow 27 . the pten tumor suppressor was revealed as an important modulator of hematopoietic stem cell self-renewal and survival. pten is a major inhibitor of akt activation. depletion of pten induces akt activation and subsequent foxo inactivation. it is likely that foxos mediate at least a part of the pten effects on hematopoietic stem cells 121 . although foxo3 was established as the most important foxo family member regulating hematopoietic stem cell self-renewal, foxo1 is critical for the maintenance of human embryonic stem cell (hesc) pluripotency. foxo1-induced sox2 and oct4 expression is one of the mechanisms responsible for this process. interestingly, in embryonic stem cells akt is not the major regulator of foxo1 71 . 
fascinatingly, a recent study provides evidence that foxo is a critical regulator of stem cell maintenance in hydra vulgaris, a member of the phylogenetically old animal phylum cnidaria, which has been suggested to be biologically immortal due to the unlimited self-renewal capacity of its stem cells 38, 122 . foxo transcription factors are not only required for the maintenance of somatic stem cells, but also play an important role in cancer stem cells 71 . recent reports provided evidence that in many types of malignancies, such as leukemia, colon, or gastro-intestinal malignancies, a small population of cells similar to stem cells exists 121 . these cells, called cancer stem cells, have the potential to form new tumors. notably, cancer stem cells are in most cases resistant to current cancer treatments, and these therapies may result in cancer stem cell enrichment. these cells serve as a starting point for cancer recurrence 121 . similarities between stem cells and cancer stem cells were best described in the hematopoietic system, where similar surface markers and signal transduction patterns were described for hematopoietic stem cells and leukemia-initiating cells 121 . the results presented have implications not only for understanding the regulation, self-renewal and differentiation of stem cells and cancer stem cells, but also for the development of novel therapeutic strategies in cancer, degenerative diseases and many other pathologies 71 . foxo family members are expressed in almost every tissue of the human body, including the nervous system, cardiovascular system, male and female reproductive systems, lung, liver, spleen, pancreas, thymus, and skeletal muscle. however, each member of the foxo family has its own expression pattern, since they are not equally expressed in all tissues 20 . foxo1 is better represented in adipose tissue 27 . 
foxo3 has the highest expression in liver, but it is also highly expressed in heart, brain, kidneys, and ovaries, while foxo4 is found mainly in muscle and heart 20,27 . the newest member of the family, foxo6, is present in the brain; its expression in other tissues is still under study 123 . cell lines (immortalized and/or cancer cells) are some of the most used and useful tools for studying the structure, function, regulation and expression of proteins in general, and of foxo family members in particular. the expression of foxo members in different cell lines (including the nci 60 group of cell lines) is partially known 124, 125 . foxo1 is expressed at high levels in igrov 1 (human ovarian carcinoma cells), astrocyte cells, rl 7 (human follicular lymphoma cells), ht29 (human colon carcinoma cells) and hek 293 (human embryonic kidney cells). knowing the qualitative and quantitative expression pattern of foxo family members is important in elucidating their functions and regulation. thus, databases summarizing these patterns in various tissues and cell lines are very helpful and necessary. for example, biogps presents experimental results showing the mrna expression levels of a large number of genes in most human tissues and many (mostly human) cell lines 126, 127, 128 . the regulation of foxo expression is not yet well understood. p300 was shown to control foxo1 gene expression by binding to the proximal region of the foxo1 promoter (cre tandem sites) 129 . non-coding rnas were shown to suppress translation of foxo family members in various tissues and contexts. in particular, the microrna-dependent inhibition of foxo1 and foxo3 expression is better studied and is summarized below, in chapter 7. however, the main transcription factors implicated in foxo expression are not well understood. interestingly, methylation of fox promoters suppresses their expression. 
as an example, promoter methylation induced by braf results in inactivation of fox gene expression 130 . mirnas are noncoding rna molecules, 18-28 nucleotides long, that play an important role in post-transcriptional regulation of protein expression and regulate a variety of cellular processes, including cell differentiation, cell cycle progression and apoptosis. these mirnas can function as oncogenes or tumor suppressors, and oncogenic mirnas (oncomirs) are upregulated in cancer cells. in cancer, mirnas were found to be situated both upstream and downstream of the carcinogenesis process, and modified expression of some mirnas is the outcome of carcinogenic transformation or progression, suggesting that mirnas may be potential diagnostic or prognostic tools in cancer 131 . for instance, micrornas have gained special attention in melanoma studies, and the altered pattern of mirnas in melanoma seems to be related to apoptosis (mir-15b), cell cycle (mir-193b) and invasion/metastasis (mir-182) 132 . the microrna (mirna)-mediated regulation of foxo transcription factors was demonstrated by several groups within the last three years. mir-182 has been shown to specifically target foxo transcription factors irrespective of cell type, since mir-182 seems to target foxo3 in melanoma cells, foxo1 in breast cancer cells and foxo1 in activated helper t (th) lymphocytes 90 . thus, in melanoma cells, mir-182 modulates the expression of both foxo3 and microphthalmia-associated transcription factor (mitf). the inhibition of mir-182 by anti-mirs (blocking antisense oligonucleotides) hindered melanoma cell migration and triggered their apoptosis 90 . in breast cancer cells, foxo1 was coordinately targeted by mir-27a, mir-96, and mir-182, and the inhibition of each mirna resulted in increased levels of foxo1 and reduced breast cancer cell survival 90 . 
mir-182 also targets foxo1 in osteoblast lineage cells to inhibit osteoblast proliferation and differentiation, repressing osteogenesis 133 . many other micrornas were shown to regulate foxo activity and functions, including apoptosis and cell cycle control. mir-96 has been shown to regulate foxo1-induced apoptosis in transitional cell carcinoma 134 . increased expression of mir-96 in breast cancer cells has also been shown to downregulate the foxo3 transcription factor, with consequent induction of cell proliferation 135 . likewise, increased mir-96, mir-182 and mir-183 downregulate the expression of the foxo1 transcription factor in classical hodgkin lymphoma (chl) cell lines, suggesting that decreased foxo1 expression is involved in lymphomagenesis 136 . moreover, a recent study performed on du145 and lncap human prostate cancer cells shows that upregulated microrna-370 induces proliferation through downregulation of the foxo1 transcription factor 137 . in addition, mir-155, which is highly induced in mature activated t and b cells and in treg cells, has recently been shown to target foxo3 in t cells, but it remains to be shown whether foxo3, via mir-155, contributes to the observed phenotypes in b and t lymphocytes 90 . with the discovery of mirnas targeting foxo transcription factors, new strategies to modulate foxo expression may now be developed 90 . however, some difficulties may arise in the mirna-based therapeutic manipulation of foxo transcription factors. for example, a single mirna may target hundreds of genes, foxos among them, and the therapeutic manipulation of a specific mirna could have unanticipated adverse effects by influencing whole gene networks, while having only moderate effects on foxo genes 90 . 
since delivery of mir-182 mimics could induce lymphoproliferative disease or other forms of cancer by systemic repression of foxo transcription factors, another challenge in mirna-foxo therapeutic manipulation is the specific delivery of the mirna into the target cell, in order to avoid adverse effects in other cell types or tissues 90 . several micrornas were identified as targets of foxo transcription factors. an akt-foxo-mir-30d signaling pathway was recently identified. after inhibition of akt, activated foxo3 leads to upregulation of mir-30d. mir-30d acts as a tumor suppressor in renal cell carcinoma, further inhibiting the oncoprotein metadherin (mtdh) 138 . another recent study shows that foxo1 directly stimulates the expression of a microrna cluster located on the x chromosome, in a manner dependent on rna polymerase ii but not on de novo protein synthesis. thus, foxo1 upregulates mir-506, mir-508, mir-513c, mir-513a-1 and mir-513a-2. the same study also shows that inhibition of the pi3k-akt axis in lncap and mcf7 human carcinoma cell lines is followed by increased mir-506. as suggested by the authors, mirnas could be valuable biomarkers of foxo activity 139 . a study performed on primary cultures of neural stem/progenitor cells (nspcs) from adult mice shows that the expression levels of the mir-106b~25 cluster members (mir-106b, mir-93, and mir-25) are modulated by foxo3 transcription factors in a complex manner 140 . the precursors of the mir-106b~25 cluster members are located in the mcm7 gene. foxo3 directly binds to the first intron of this gene, modulating the expression of the micrornas. thus, foxo3 transcription factors could be an important tool in preventing the loss of neurogenesis during aging 140 . further studies are needed to completely understand the regulation of microrna expression by activated foxo proteins. 
foxo transcription factors and tumor suppressors are ubiquitously expressed in the human body, with some specific differences between family members. as summarized above, foxo proteins are characterized by a remarkable functional diversity, being implicated in the regulation of many critical cellular functions, such as cell cycle arrest, apoptosis, oxidative detoxification, dna damage repair, stem cell maintenance, cell differentiation, cell metabolism, angiogenesis, cardiac development, aging and others. foxo proteins play an important role not only in physiological cellular processes, but also in a few pathologies, such as cancer. they are well-known tumor suppressor proteins. although foxo proteins protect the human body by playing a central role in a wide range of mainly physiological functions, under some circumstances their roles can become harmful for human cells. this is because foxo is also involved in a number of pathological functions, such as inflammation and muscle atrophy, and in a number of physiological functions that become harmful for the organism. for example, apoptosis places foxo proteins on the good side when it leads to tumor suppression, but in some cases cellular apoptosis can itself become a significant component of pathology in diseases such as neurodegenerative disease, diabetes mellitus (dm), and cardiovascular injury. interestingly, while excessive foxo levels induce cell cycle arrest and cell death, complete knock-down of foxos impairs cell cycle progression, suggesting that certain levels of foxo activation are required for cell cycle progression. moreover, foxo proteins play a variety of roles in cells depending on the context. 
in deciphering the role of forkhead transcription factors in cancer therapy foxo tumor suppressors and bcr-abl-induced leukemia: a matter of evasion of apoptosis snapshot: forkhead transcription factors i sly as a foxo": new paths with forkhead signaling in the brain the homeotic gene fork head encodes a nuclear protein and is expressed in the terminal regions of dynamic foxo transcription factors forkhead transcription factors contribute to execution of the mitotic programme in mammals inflammatory arthritis requires foxo3a to prevent fas ligandinduced neutrophil apoptosis foxo transcription factors and stem cell homeostasis: insights from the hematopoietic system foxos attenuate bone formation by suppressing wnt signaling foxo1 is an essential regulator of pluripotency in human embryonic stem cells the fork head transcription factor daf-16 transduces insulin-like metabolic and longevity signals in c. elegans insulin regulated hepatic gluconeogenesis through foxo1-pgc-1alpha interaction the forkhead transcription factor foxo1 links insulin signaling to pdx1 regulation of pancreatic beta cell growth forkhead transcription factor foxo1 in adipose tissue tissue regulates energy storage and expenditure forkhead protein foxo1 mediates agrpdependent effects of leptin on food intake deletion of hepatic foxo1/3/4 genes in mice significantly impacts on glucose metabolism through downregulation of gluconeogenesis and upregulation of glycolysis foxo1 mediates insulin action on apoc-iii and triglyceride metabolism hepatic foxos regulate lipid metabolism via modulation of expression of the nicotinamide phosphoribosyltransferase gene minibrain/dyrk1a regulates food intake through the sir2-foxo-snpf/npy pathway in drosophila and mammals foxo3 regulates peroxiredoxin iii expression in human cardiac fibro blasts regulation of sterol carrier protein gene expression by the forkhead transcription factor foxo3a dna repair pathway stimulated by the forkhead transcription factor foxo3a 
through the gadd45 protein daf-16/foxo directly regulates an atypical amp-activated protein kinase gamma isoform to mediate the effects of insulin/igf-1 signaling on aging in caenorhabditis elegans down-regulation of a forkhead transcription factor, foxo3a, accelerate s cellular senescence in human dermal fibroblasts down-regulation of manganese-superoxide dismutase through phosphorylation of foxo3a by akt in explanted vascular smooth muscle cells from old rats akt negatively regulates the in vitro lifespan of human endothelial cells via a p53/p21-dependent pathway exercise training promotes sirt1 activity in aged rats lymphocyte signaling: regulation of foxo transcription factors by micrornas an essential role of the forkhead-box transcription factor foxo1 in control of t cellhomeostasis and tol erance high glucose induces suppression of insulin signalling and apoptosis via upregulation of endogenous il-1beta and suppressor of cytokine signalling-1 in mouse pancreatic beta cells high glucose induced oxidative stress and apoptosis in cardiac microvascular endothelial cells are regulated by foxo3a apo2 ligand/tumor necrosis factor-related apoptosis-inducing ligand in prostate cancer therapy common mechanism for oncogenic activation of mll by forkhead family proteins applications of post-translational modifications of foxo family proteins in biological functions foxos are lineage-restricted redundant tumor suppressors and regulate endothelial cell homeostasis decreased expression of the foxo3a gene is associated with poor prognosis in primary gastric adenocarcinoma patients the dna damage repair protein ku70 interacts with foxo4 to coordinate a conserved cellular stress response therapy-resistant acute lymphoblastic leukemia (all) cells inactivate foxo3 to escape apoptosis induction by trail and noxa induction of mxi1-sr alpha by foxo3a contributes to repression of myc-dependent gene expression combination of bortezomib and mitotic inhibitors down-modulate bcr-abl 
and efficiently eliminates tyrosine-kinase inhibitor sensitive and resistant bcr-abl-positive leukemic cells foxo1 increased pro-inflammatory gene expression by inducing c/ ebpbeta in tnf-alpha-treated adipocytes foxo1 links insulin resistance to proinflammatory cytokine il 1beta production in macrophages inhibition of foxo transcriptional activity prevents muscle fiber atrophy during cachexia and induces hypertrophy foxo3 controls autophagy in skeletal muscle in vivo foxo3 coordinately activates protein degradation by the autophagic/lysosomal and proteasomal pathways in atrophying muscle cells inhibition of p38alpha unveils an ampk-foxo3a axis linking autophagy to cancer-specific metabolism foxo3a inhibits cardiomyocyte hypertrophy through transactivating catalase regulation of myostatin expression and myoblast differentiation by foxo and smadtranscription factors inhibition of foxo1 function is associated with improved fasting glycemia in diabet ic mice foxo transcription factors activate akt and attenuate insulin signaling in heart by inhibiting protein phosphatases haplotypes in the human foxo1a and foxo3a genes; impact on disease and mortality at old age abnormal angiogenesis in foxo1 (fkhr)-deficient mice multifaceted link between cancer and inflammation constitutive phosphorylation of the foxo1 transcription factor in gastric cancer cells correlates with microvessel area and the expressions of angiogenesis -related molecules foxo transcription factors promote cardiomyocyte survival upon induction of oxidative stress forkhead transcription factors inhibit vascular smooth muscle cell proliferation and neointimal hyperplasia expanded granulocyte/monocyte compartment in myeloid-specific triple foxo knockout increases oxidative stress and accelerates atherosclerosis in mice mutational screening of foxo3a and foxo1a in women with premature ovarian failure the pi-3kinase pathway in hematopoietic stem cells and leukemia-initiating cells: a mechanistic difference 
between normal and cancer stem cells mortality pattern suggest lack of senescence in hydra foxos: signalling integrators for homeostasis maintenance modulators of sensitivity and resistance to inhibition of pi3k identified in a pharmacogenomic screen of the nci-60 human tumor cell line collection global proteome analysis of the nci-60 cell line panel biogps: an extensible and customizable portal for querying and organizing gene annotation resources biogps and mygene.info: organizing online, gene-centric information a gene atlas of the mouse and human protein-encoding transcriptomes control of foxo1 gene expression by co-activator p300 braf mutation-specific promoter methylation of fox genes in colorectal cancer tissular and soluble mirnas for diagnostic and therapy improvement in digestive tract cancers immune-related biomarkers for diagnosis/prognosis and therapy monitoring of cutaneous melanoma mir-182 is a negative regulator of osteoblast proliferation, differentiation, and skeletogenesis through targeting foxo1 mir-96 regulates foxo1-mediated cell apoptosis in bladder cancer unregulated mir-96 induces cell proliferation in human breast cancer by downregulating transcriptional factor foxo3a foxo1 is a tumor suppressor in classical hodgkin lymphoma upregulation of microrna-370 induces proliferation in human prostate cancer cells by downregulating the transcription factor foxo1 akt/foxo pathway in renal cell carcinoma foxo1 regulates expression of a microrna cluster on x chromosome the microrna cluster mir-106b~25 regulates adult neural stem/progenitor cell proliferation and neuronal differentiation
key: cord-022779-himray6q authors: nan title: abstracts of oral presentations date: 2005-06-10 journal: biopolymers doi: 10.1002/bip.20321 sha: doc_id: 22779 cord_uid: himray6q nan
s. tchertchian, f. oplinger, m. paolini, s. manganiello, s. raimondi, b. depresle, n. dafflon, h. gaertner, and p. 
botti
geneprot inc., geneva branch, 2, pré de-la-fontaine, 1217 meyrin, switzerland
the last decade has provided extensive demonstration of the key role played by native chemical ligation (ncl) in the preparation of small and medium-sized proteins [1] . yet the requirement for cysteine at the ligation site in standard ncl has limited its flexibility. recently, different types of auxiliary groups [2, 3] have been developed to extend the application of ncl to other ligation sites. however, the generally slower ligation rates, especially with large fragments, and the additional step required to cleave the auxiliary after ligation have reduced their utility. here we present a novel strategy to synthesize proteins through chemical ligation of unprotected peptide segments. our scheme does not make use of auxiliary groups [2, 3] ; instead, it exploits the features of certain removable side-chain functionalities. ligation rates are high, comparable to those of ncl, and the residues available for ligation are more frequent than cysteine. furthermore, the whole process is "one pot", and at the end a native polypeptide is obtained directly in the ligation mixture. the total chemical syntheses of c5a (1-74) using both ncl and our method will be presented and compared.
during the biosynthesis of glycopeptide antibiotics of the vancomycin family, several oxidative phenol coupling reactions take place. the enzymes catalyzing these reactions are of interest from structural and mechanistic viewpoints. in this work [1, 2] , it is shown that the oxygenase oxyb from the vancomycin producer only catalyzes a phenol coupling reaction when the putative peptide substrate is linked as a thioester to a peptide carrier domain (pcd) derived from the nonribosomal peptide synthetase. an efficient route to representative free linear peptide substrates is described, which makes use of alloc solid-phase peptide chemistry but largely avoids the use of amino acid side chain protecting groups. 
in this way, the target linear peptides can be released from the resin under very mild conditions and then be activated as thioesters prior to loading onto the pcd.
we have recently discovered a new nonenzymatically formed product from n-(3-oxododecanoyl)-l-homoserine lactone. interestingly, both the n-acylhomoserine and its novel tetramic acid degradation product 1 are potent antibacterial agents. bactericidal activity was observed against all tested gram-positive bacterial strains, while no toxicity was seen against gram-negative bacteria. we propose that p. aeruginosa utilizes this tetramic acid as an interference strategy to preclude encroachment by competing bacteria. additionally, we have discovered that this tetramic acid binds iron with an affinity comparable to that of known bacterial siderophores, possibly providing an unrecognized mechanism for iron solubilization.
using short portions (7-11 amino acids) rich in positively charged residues from either human lactoferricin or the marcks protein as templates, a panel of 70 peptides, each possessing a specific chemical structure, was synthesized. these included amino acid omissions, substitutions, and insertions, with the aim of modifying the peptides' overall charge, hydrophobic core, and/or amphipathicity. the peptides' antimicrobial activities against a large panel of bacteria were assessed using both conventional tests (mic, mbc) and non-conventional assays (mic quantified by an automated turbidimetry-based system, and mbc measured on resting cells suspended in low-ionic strength medium, the "survival assay"). furthermore, the membrane permeabilizing activity of the peptides on strains of several gram-negative bacterial species was quantitated by measuring their ability both to decrease the mic of novobiocin and to promote the uptake of the hydrophobic fluorescent probe npn. 
while the mic determined by turbidimetry and by the conventional method did not differ significantly, the bactericidal activity of the peptides measured by the survival assay was 1 to 2 orders of magnitude higher than that measured by the conventional mbc test. on the other hand, the two assays used to measure the permeabilizing activity of the peptides gave similar results. interestingly, the most potent permeabilizers did not correspond to the peptides exhibiting the highest bactericidal activity, indicating that these two activities have different structural bases.
protein farnesyl transferase (pftase) catalyzes the attachment of farnesyl diphosphate (fpp) to proteins that contain a caax-box sequence at their c-termini [1] . several analogues of fpp that incorporate azide functional groups have been synthesized and shown to be incorporated into peptides using pftase as a catalyst. in particular, it has been shown that the prenyl azide moiety from 1 or related analogues can be transferred to the peptide substrate n-dansyl-gcvia to yield the corresponding thioether-linked products. the resulting azide-containing peptides have been derivatized with a triphenylphosphine-based reagent to generate o-alkyl imidate-linked products rather than the amide-linked material expected via a staudinger reaction [2] . since caax-box sequences can be appended to the c-termini of many different proteins, these analogues provide a simple and general method for incorporating orthogonal azides into proteins at unique sites. subsequent functionalization of such azide groups via staudinger or "click" chemistry should provide a convenient method for linking proteins with a diverse array of probes, biomolecules, surfaces and other materials under mild conditions.
chemoselective glycosylation, acylation, and alkylation of completely unprotected peptides can be accomplished by incorporating n-alkylaminooxy amino acids into the peptide sequence. 
the n-alkylaminooxy side chains react selectively with reducing sugars, activated alkyl halides, and various acylating agents in mildly acidic aqueous buffers (ph 4) to furnish neoglyco- and neolipopeptides. a key feature of the approach is that a single parent peptide can be quickly reacted with a variety of agents to provide a large number of "post-translationally" modified peptides. the ability to easily synthesize arrays of modified peptides allows comprehensive studies of the effects that glycosylation and lipidation have on peptide structure and function. here we present an overview of the methodology and initial results on its application to studying problems of biological interest.
this presentation describes cyclic peptides that fold into well-defined β-sheet structures in aqueous solution and can dimerize through β-sheet interactions. the cyclic peptides contain the unnatural amino acid hao, which mimics the hydrogen-bonding pattern of one edge of a peptide β-strand, and δ-linked ornithine, which mimics a β-turn and provides enhanced water solubility or a linkage point for creating multivalent structures.
institute for molecular bioscience, the university of queensland, queensland 4072, australia
the human genome project and other major sequencing projects have rapidly provided a vast array of new protein/peptide sequences. in contrast, many other new proteins/peptides are also being uncovered from plant and animal sources whose genomes are yet to be tapped. in the post-genomic era, the physical form of many of these gene-encoded sequences will be vital for biomedical research and drug development. moreover, the advantages of peptide and protein chemical synthesis over recombinant-dna methods are increasingly being used to provide rapid structure-activity information on complex bioactive peptides, small proteins and functional receptor domains. 
in a program designed to exploit the potential of australian conus species we have isolated, characterised and chemically synthesised a wide range of novel conotoxins. these cysteine-rich microproteins have well-defined tertiary structures with considerable rigidity and stability and contain many elements of protein secondary structure. of particular interest are the conotoxins containing two disulfide bonds (examples below), which target transporters, ion channels and receptors at nanomolar potencies. in this presentation i will describe some of our research on controlling the shape and potency/selectivity of these microproteins through intramolecular native ligation chemical approaches. it appears that there is considerable scope to control the properties of these native sequences, which in some cases may prove useful in the development of these molecules as therapeutic candidates.
despite identical amino acid composition, differences on the hydrophobic face of class a amphipathic helical peptides result in substantial differences in anti-inflammatory properties. one of these peptides is an apolipoprotein a-i mimetic, d-4f. when given orally to mice and monkeys, d-4f caused the formation of pre-β hdl, improved hdl-mediated cholesterol efflux, reduced lipoprotein lipid hydroperoxides, increased paraoxonase activity and converted hdl from pro-inflammatory to anti-inflammatory. in apoe null mice d-4f increased reverse cholesterol transport from macrophages. oral d-4f reduced atherosclerosis in apoe null and ldl receptor null mice. in vitro d-4f caused the formation of pre-β hdl, reduced lipoprotein lipid hydroperoxides and converted hdl from pro-inflammatory to anti-inflammatory. physical properties and the ability of various class a amphipathic helical peptides to activate the enzyme lcat in vitro did not predict biologic activity in vivo. 
in contrast, the use of cultured human artery wall cells in evaluating these peptides was more predictive of their efficacy in vivo. thus, the anti-inflammatory properties of different class a amphipathic helical peptides depend on subtle differences in the configuration of the hydrophobic face of the peptides. physical-chemical properties provide an explanation for the mechanism of action of the active peptides. peptides to ameliorate atherosclerosis and other inflammatory diseases can be designed using this strategy.
inflammatory diseases. this chemokine belongs to the family of cxc chemokines, and its response is mediated through binding to the seven-transmembrane helical g-protein coupled receptors cxcr1 and cxcr2. in order to investigate the relevance of selected protein segments for biological activity we synthesized chemically modified and biologically active analogues of the 77-mer of hil-8 by expressed protein ligation (epl). the naturally occurring cysteine at position 55 was chosen for ligation. c-terminal peptides carrying an n-terminal cysteine were synthesized by solid phase peptide synthesis (spps) applying the fmoc strategy and used to introduce modifications. ligation of the recombinantly produced thioester with synthetic peptides yielded full-length hil-8, which was then correctly folded and stabilised by two disulfide bridges, as in the native protein. in addition to fluorescent and photoactivatable analogues, we produced variants that contain a β-peptide helix instead of the naturally occurring α-helix. thus, for the first time, we obtained a protein containing a whole β-peptide segment and still showing high biological activity. depending on the linker between the β-sheet and the β-peptide helix of interleukin 8, we could discriminate between active and inactive proteins, suggesting that the overall orientation of the c-terminal segment is highly relevant for the folding of the protein and subsequently for the signalling of interleukin 8. 
a peptide based on residues 109-122 of the syrian hamster prion protein (h1) forms β-sheet aggregates in solution, which grow to form large fibers. isotope-edited infrared spectroscopy has shown that the initial antiparallel β-sheet formed by this peptide is disordered. a slow rearrangement occurs to form a structure in which the hydrophobic cores of the strands (residues 112-122) pack together, resulting in the alignment of residue 117 across the sheet. the kinetics of the realignment have been monitored for h1 and for peptides with mutations at residue 117 (a117i, a117l and a117b, where b is aminobutyric acid). h1 and a117i align with non-exponential kinetics. at low concentrations h1 aligns via the repeated detachment and annealing of strands, whereas at higher concentrations a reptation mechanism is observed. a117b aligns instantaneously within the dead-time of our experiments. a117l does not align at residue 117 but some undefined reordering can still be observed as a shift of the 13c band. these data are the first experimental probes of the types of intersheet rearrangements which are required for the nucleation of fibrous peptide aggregates, and the evidence for strand reptation within the β-sheet confirms observations in molecular dynamics simulations.
[1, 2] . we have since established the microfluidic peptide chip as a miniaturizing platform. the challenging issues in making peptide chips a practical tool for understanding biology, drug discovery, and diagnostics are quality of synthesis, specificity in reported activities, and ability for quantitative measurements. 
we will present the results of the work which involves our intensive effort in: • development of the method for monitoring and analyzing quality of peptide chip synthesis • improvement in peptide chip synthesis • development of the methods for quantitative analysis of (a) the specific binding of antibodies/proteins to peptides on chip and (b) kinase enzymatic activities against substrate peptides on chip. our presentation should demonstrate novel applications of peptide chips that can be implemented as routine laboratorial processes. [1] gao, x., pellois, j. p., kim, k., na, y. , gulari, e., and zhou, x. to prototype this approach we developed a protein array consisting of the ras-binding domain of craf-1 (rbd) that was c-terminally modified with a 24mer oligonucleotide and immobilized via hybridization with the complementary oligonucleotide on a wafer [1] . the rbd-dna conjugate was generated by native chemical ligation using a recombinantly produced rbd-thioester and an oligonucleotide carrying a 5'-cystein residue [2] . incubation of the immobilized rbd with ras(gtp), the activated rbd-binding form, retains sufficient amounts of ras on the protein array to allow detection by mass spectrometry with high sensitivity. controls carried out with inactive ras(gdp) did not produce any signal, demonstrating a sufficient selectivity for biotechnological applications. protein microarrays in which proteins are immobilized to a solid surface are ideal reagents for high-throughput experiments that require very small amounts of analyte. such protein microarrays ('protein chips') can be used very efficiently to analyze all kind of protein interactions en masse. the present work describes a general method for the selective attachment of proteins to solid surfaces through its c-termini that can be used for the creation of protein chips. 
our method is based on the chemoselective reaction between a protein c-terminal α-thioester and a modified surface containing n-terminal cys residues. α-thioester proteins can be obtained using standard recombinant techniques by using expression vectors containing modified inteins. we also present an efficient solid-phase approach for the rapid synthesis of cys-containing linkers that can be used for the modification of au- and si-based surfaces. this new method was used to immobilize two fluorescent proteins and a functional sh3 domain.
a series of glycopeptides based on the leu-enkephalin analogue yt-gfs*-conh2 led to greatly enhanced stability in vivo and effective penetration of the bbb. transport through the bbb hinges on the biousian nature of the glycopeptides: the glycopeptides have two conflicting conformational manifolds, an h2o-soluble state and an amphipathic state at h2o-membrane phase boundaries. multiple lines of evidence suggest that the bbb transport mechanism is absorptive endocytosis. mixed µ/δ-agonists showed antinociceptive potencies greater than morphine, and lacked many of the side effects generally associated with classical µ-selective opiate analgesics. the biousian design was extended to larger glycopeptides (16 residues) related to β-endorphin, which also penetrated the bbb and produced antinociception in mice. plasmon waveguide resonance (pwr) studies showed that the amphipathic helices bound to membrane bilayers with µm to low nm kd values. the presence of diverse endogenous neuropeptide transmitters and neuromodulators in the human brain is potentially applicable to the treatment of a wide range of behavioral disorders.
clemencia pinilla, 1 mireia sospedra, 2 yindong zhao, 3
we have recently demonstrated the feasibility of utilizing the ligase activity of inteins for the in vivo backbone cyclization of peptidic chains. 
this procedure, called siclopps (split intein circular ligation of peptides and proteins), provides a biosynthetic pathway for peptides that are metabolically stable and can be produced with spatial and temporal control [1]. to screen for bacteriotoxic peptides, a siclopps library was introduced into an escherichia coli population, such that each bacterium encodes a different peptide sequence. siclopps library over-expression afforded six distinct bacteriostatic peptides that reduce cell growth. one of these peptides (ln05) also caused cell aggregation. an e. coli genomic library was introduced into cells encoding ln05. co-expression of the genomic library and ln05 peptide rescues growth only in cells expressing genomic fragments able to counteract peptide toxicity. genomic library and ln05 co-expression resulted in enrichment of a single genomic construct, a fragment of the narz gene. narz is part of a nitrate reductase complex and has a role in tuberculosis persistence [2]. ln05 production in mycobacterium smegmatis resulted in a slow-growth phenotype. [1] abel-santos, e., scott, c. p., and benkovic, s. j., methods in
the special feature of proteins involved in alzheimer's or prion diseases is their ability to adopt at least two different (meta)stable conformations. thus, amyloid-forming proteins that mainly contain α-helical structures in their native conformation must undergo an α-helix → β-strand conversion before or during fibril formation. the conformational transition that shifts the equilibrium from the functional to the pathological isoform can happen sporadically. it can also be triggered by mutations in the primary structure or by changes in environmental conditions such as ph, ionic strength, metal ions, protein concentration, oxidative stress, free radicals, or the action of physiological or pathological chaperones. alternatively, the introduction of a small quantity of protein polymer may act as a structural template and initiate the disease. 
therefore, the development of model systems which allow the investigation of the complex folding mechanisms that lead to ␤-sheet aggregation appears to be one of the main challenges in the detailed understanding of the pathways from incubation to mortality. in order to create an ␣-␤ switch system we designed ideal ␣-helical parallel coiled coil peptides, introduced trigger functions by mutations in the primary structure, and studied the consequences that these mutations have on the secondary structure properties of the resulting peptides under certain environmental conditions. based on these results we continued to change the primary structure of the coiled coil system subsequently by mutations in the heptad repeat untill we ended up with soluble ␤-sheet peptides. the most important feature of these new ␤-sheet peptides is that they still follow the characteristic hydrophobic heptad repeat of an ␣-helical coiled coil and that all of the positions which are part of the dimerization domains of the coiled coil remained untouched. thus, these new peptides bear all of the requirements which are necessary for formation of cooperatively interacting helical structures and, furthermore, contain domains for cooperative sheet aggregation as well. the folded structure will now strongly depend on the environment. this peptidic model system allows a systematic study of the subtle influences that environmental conditions may have on protein folding stepwise, which means changing these conditions one after one or in all of the possible combinations that nature applies in vivo. plasmon waveguide resonance (pwr) spectroscopy is a powerful new biophysical method which allows us to examine structural changes, kinetics and thermodynamics of anisotropic biological systems and processes such as proteolipid membranes. 
this method has probed the mechanisms of g-protein coupled receptor (gpcr) signal transduction, and has obtained new insights into specific signaling pathways of agonists and antagonists with gpcrs. we now have extended these studies to examine the effects of lipid microdomains (rafts) on the binding, signaling and transduction pathways. using a 1:1 mixture of palmitoyloleoylphosphatidylcholine (popc) and sphingomyelin (sm), we have directly observed the formation of two lipid bilayer microdomains, and the preferred segregation of the delta opioid receptor (hdor) into sm lipid rafts when the agonist ligand was bound, but not for the unoccupied receptor, which preferentially incorporated into the popc-rich domain. furthermore, we can demonstrate directly that g-proteins bind much more strongly to the hdor in the lipid raft (sm-rich) environment than in the fluid non-raft (popc-rich) domain of the lipid bilayer. the implications of these findings for novel design of drugs and for drug screening will be emphasized.
† supported by grants from the u.s.p.h.s., national institute of drug abuse and national science foundation.
their measurement in high resolution nmr requires partial molecular alignment. for proteins in aqueous solution a number of standard methods exist to achieve such small alignment. we found that swelling of cross-linked polymers inside the nmr tube results in anisotropic gels. peptides in such a gel phase exhibit well resolved spectra of the partially aligned molecules. the orientation can be scaled through the cross-linking of the polymer, the thickness of the unswollen stick or the temperature. rdcs can now easily be measured in natural abundance in peptides [3]. all common nmr solvents can be used. with the chiral gel gelatin it is possible to discriminate d- and l-alanine to determine enantiomeric purity. 
the procedure is demonstrated to refine the solution structures of peptides such as cyclosporin a, somatostatin analogues and others.
galia blum, 1 georges von degenfeld, 2 kinneret keren, 3
misregulation of cysteine protease activity is associated with numerous pathologies ranging from cancer to autoimmune disease. protease activity is controlled by a delicate balance of many factors such as levels of natural inhibitors and posttranslational modifications. thus developing a detection method for monitoring protease activity rather than abundance is desirable. here we describe the design and synthesis of a novel class of chemical tools, activity based probes (abps), that detect protease activity. these probes are composed of a fluorescent tag and its cognate quenching group, a peptide recognition scaffold, and a reactive "warhead". these fluorescently quenched "smart probes" covalently modify protease active sites in a fashion that is dependent on activity of the protease. this results in loss of the quenching group, producing a fluorescent signal. we report the production of selective, cell permeable activity based probes for the study of papain family cysteine proteases in cells and whole animals. these probes are used to monitor real time protease activity in living cells using fluorescence microscopy techniques as well as standard biochemical methods.
key: cord-012878-j9vndxgg authors: tremblay, reynald; feng, mary; menassa, rima; huner, norman p. a.; jevnikar, anthony m.; ma, shengwu title: high-yield expression of recombinant soybean agglutinin in plants using transient and stable systems date: 2010-06-18 journal: transgenic res doi: 10.1007/s11248-010-9419-0 sha: doc_id: 12878 cord_uid: j9vndxgg soybean agglutinin (sba) is a specific n-acetylgalactosamine-binding plant lectin that can agglutinate a wide variety of cells. 
sba has great potential for medical and biotechnology-focused applications, including screening and treatment of breast cancer, isolation of fetal cells from maternal blood for genetic screening, the possibility as a carrier system for oral drug delivery, and utilization as an affinity tag for high-quality purification of tagged proteins. the success of these applications, to a large degree, critically depends on the development of a highly efficient expression system for a source of recombinant sba (rsba). here, we demonstrate the utility of transient and stable expression systems in nicotiana benthamiana and potato, respectively, for the production of rsba, with the transgenic protein accumulated to 4% of total soluble protein (tsp) in nicotiana benthamiana leaves and 0.3% of tsp in potato tubers. furthermore, we show that both plant-derived rsbas retain their ability to induce the agglutination of red blood cells, are similarly glycosylated when compared with native sba, retained their binding specificity for n-acetylgalactosamine, and were highly resistant to degradation in simulated gastric and intestinal fluids. affinity column purification using n-acetylgalactosamine as a specific ligand resulted in high recovery and purity of rsba. this work is the first step toward use of rsba for various new applications, including the development of rsba as a novel affinity tag for simplified purification of tagged proteins and as a new carrier molecule for delivery of oral drugs. electronic supplementary material: the online version of this article (doi:10.1007/s11248-010-9419-0) contains supplementary material, which is available to authorized users. soybean agglutinin (sba) is a legume lectin glycoprotein that binds non-covalently to specific cell-surface carbohydrates, provoking agglutination of the bound cells when in solution. sba has been used to fractionate different cell types for use in clinical and biomedical applications. 
one such application is the separation of pluripotent stem cells from human bone marrow. cells fractionated by sba do not produce graft versus host disease (gvhd) and can be used in bone marrow transplantation across histocompatibility barriers (yura et al. 2008) . another application is the enrichment and isolation of fetal cells from the blood of pregnant women as a means of detecting fetal genetic and chromosomal abnormalities. it has been shown that the number of fetal erythroblasts recovered by a soybean agglutinin galactose-specific lectin method was approximately eightfold higher than the number obtained by a standardized magnetic cell-sorting (macs) procedure (babochkina et al. 2005 ). in addition, sba binds very effectively to some tumor cells and has been used to detect and treat several cancers including breast cancer (pusztai et al. 2008) . sba and other lectins, which are carbohydrate-binding proteins that bind sugar residues reversibly and specifically, have also been exploited as a ligand for targeted oral drug delivery, because their glycoprotein and glycolipid targets are integral parts of the enterocyte membrane (smart 2004) . procedures for high recovery and purity of sba have been developed, with extraction of sba from soybean flour resulting in high-quality purification with over 90% yield (percin et al. 2009 ), opening the door for the use of recombinant sba as a potential novel affinity tag for purification of genetically fused proteins. the potential production of a variety of recombinant proteins genetically fused to sba, however, necessitates the generation of an efficient recombinant production system. native sba is produced in the developing seeds of glycine max, with maximum production in seeds reaching over 2% tsp (lindstrom et al. 1990 ). it is relatively simple to obtain large amounts of purified native sba given that soybean seeds express high levels of native sba together with the establishment of efficient purification methods. 
however, many potential applications of sba in biotechnology and medicine could not be achieved or are difficult to achieve by use of unmodified native sba. for example, it is difficult to use native sba as an affinity tag for protein purification unless the target protein is genetically fused to sba and produced as a recombinant fusion protein. sba has previously been expressed in transgenic tobacco seeds in order to study the upstream and downstream cis-regulatory sequences mediating the expression of native sba (lindstrom et al. 1990). although a protein band with a molecular mass expected for the monomer of sba protein was seen on western blot, no analysis or characterization of the recombinant protein itself was performed; it was merely used as a marker for gene expression. others have successfully expressed sba in both e. coli and monkey bs-c-1 cells in an attempt to generate recombinant sba in order to analyze the relationship between glycosylation and tetramer assembly (adar et al. 1997). they demonstrated that the recombinant sba retained its agglutination ability and specificity for n-acetylgalactosamine. the sba generated in bacteria, however, lacks glycosylation that, while not required for agglutination or assembly, is important to the stability of the bioactive tetramer (sinha and surolia 2005). production in bs-c-1 cells resulted in a protein that was glycosylated similarly to native sba and had an identical sugar binding profile, but for some unknown reason had a lower agglutination ability. the minimum amount of bs-c-1 cell-derived sba required to induce hemagglutination is 10-20 µg/ml, which is 4-5 times larger than that of native sba. in addition, for production of a therapeutic protein, animal cells carry an inherent risk of pathogen contamination and have a high production cost due in large part to media costs. therefore, neither bacteria nor mammalian cells are ideal bioreactors for sba production for therapeutic uses. 
plants are a suitable alternative expression system for recombinant sba production. as bioreactors, plants enable unlimited scalability, elimination of product contamination by mammalian pathogens, and reduced production costs compared with microbial or animal cell-based systems (boehm 2007; ma et al. 2008; tremblay et al. 2010 ). because of their eukaryotic nature, plants can perform the complex post-translational modification and processing required by many transgenic therapeutic proteins for biological and/or immunological function. furthermore, plant bioreactors have the short turn-around time needed to obtain gram quantities of a recombinant protein in a matter of weeks when expressed transiently. this is not only economically advantageous, but is also critical to meeting challenges related to quick access to life-saving biotechnology drugs and therapies. in addition, stable edible transgenic plant tissue might enable direct oral delivery of plant-derived therapeutic proteins and peptides, eliminating the need for expensive downstream protein purification and processing. our long-term objective is to generate recombinant sba for use as a carrier molecule for oral delivery of protein and peptide drugs and as an affinity tag for simplified and high-yield, high purity isolation of recombinant proteins. as a first step toward this objective, we report here the transient production of recombinant sba (rsba) in nicotiana benthamiana (nb), a close relative of tobacco (nicotiana tabacum), and stable production in solanum tuberosum (st) under the control of a ubiquitous camv35s promoter. we demonstrate transient expression of sba in nb plants at levels as high as 4% of total soluble protein (tsp), whereas its stable expression in st tubers reaches 0.3% tsp. 
furthermore, nbrsba and strsba are similarly glycosylated compared with native sba, retain their ability to induce hemagglutination, bind specifically to n-acetylgalactosamine, are stable in simulated gastric fluid (sgf) containing pepsin at acidic ph and are rapidly isolated in high purity from total soluble protein. the cdna of soybean agglutinin (sba) was cloned from glycine max cdna derived from 1 to 5 mm developing seeds. in brief, total messenger rna was extracted from the seeds using the rneasy plant mini kit (qiagen #74903) and converted to cdna by following the superscript ii procedure (invitrogen #18064). the primers f 5′-aatccatggctacttcaaagttgaaaacc-3′ and r 5′-tctagattaatgatgatgatgatgatggatggcctcatgcaacacaaaacttg-3′ (with addition of a 6xhis-tag to the c-terminus underlined, restriction sites in bold, and a silent mutation to remove an internal hindiii site italicized) were constructed using the previously published sequence for sba (genbank accession #k00821.1). the generated sba cdna was then inserted into puc-19. after confirmation by sequencing, the sba cdna was inserted into prtl-2 via digestion with ncoi and xbai, replacing the gus gene; prtl-2 provides a 35s promoter and 5′ and 3′ utrs (carrington and freed 1990). the resulting expression cassette was digested with hindiii and inserted into pbi-101 to create pbi-rsba. tri-parental mating was used to transfer pbi-rsba to agrobacterium tumefaciens strain lba4404, and the transfer was confirmed via pcr using specific primers.
transient expression of sba in n. benthamiana
transient expression of pbi-rsba was accomplished by infection of 6-8-week-old leaves of nicotiana benthamiana as described by sparkes et al. (2006). 
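The primer design described above can be checked with a few lines of code. The sketch below is not part of the published methods; it assumes that the spaces inside the printed primer sequences are extraction artifacts and joins the fragments, then confirms that the forward primer carries an ncoi site (ccatgg) around the start codon and that the reverse primer, read on the sense strand, encodes six histidine codons followed by a stop codon and the xbai site (tctaga). All function and variable names are illustrative.

```python
# Standard genetic code, codons enumerated with bases in the order t, c, a, g.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of a lowercase dna sequence."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]


def translate(seq):
    """Translate from position 0, stopping after the first stop codon ('*')."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        protein.append(residue)
        if residue == "*":
            break
    return "".join(protein)


# Primer sequences as printed in the text; internal spaces are assumed to be
# line-break artifacts and are removed.
FORWARD = "aatccatggctacttc aaagttgaaaacc".replace(" ", "")
REVERSE = "tctagattaa tgatgatgatgatgatggatggcctcatgca acacaaaacttg".replace(" ", "")

# n-terminus encoded by the forward primer, starting at the atg inside ccatgg.
n_terminus = translate(FORWARD[FORWARD.index("atg"):])

# The reverse primer anneals to the sense strand, so its reverse complement
# gives the 3' end of the coding sequence; the his codons set the frame.
sense_3prime = reverse_complement(REVERSE)
frame = sense_3prime.index("cat" * 6) % 3
c_terminus = translate(sense_3prime[frame:])

print(n_terminus)  # MATSKLKT
print(c_terminus)  # SFVLHEAIHHHHHH*
```

Reading the reverse primer as its reverse complement is what makes the tag visible: the repeated atg triplets in the primer become cat (histidine) codons on the coding strand.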
briefly, overnight cultures of agrobacterium were grown until the density reached an a600 of 0.5-1.0, at which time the cells were centrifuged at 800g for 10 min, rinsed four times with infiltration medium (0.5% d-glucose, 50 mm mes, 2 mm na3po4, 0.0001 m acetosyringone), and then resuspended to the desired cell density, between 0.01 and 0.75 a600. the cells were then infiltrated into the abaxial side of the leaves using a 1-ml syringe and infected tissue was harvested each morning at day 1 through 9 postinfection. for co-infiltration with a second agrobacterium harboring the t-dna vector encoding the p19 suppressor gene of tomato bushy stunt virus (lakatos et al. 2004), the two agrobacterial cultures were mixed at equal concentration before infiltration.
stable expression of sba in s. tuberosum
solanum tuberosum mini-tubers were generated as described previously (bourque et al. 1987). overnight cultures of agrobacterium with pbi-rsba were used to transform tubers as previously described (ma et al. 2005). regenerating plantlets were transferred to magenta boxes containing ms media supplemented with 50 µg/l kanamycin. the presence of the transgene in transgenic plants was confirmed by pcr, and pcr-positive plants were then selected to produce mini tubers for protein expression analysis. for mini tuber induction, a stem section (1 cm) with one resting axillary bud and one fully developed leaf was excised from a sterile magenta-grown transgenic plant. the leaf was removed and the stem was transferred to tuber-inducing medium as described above and placed in darkness at 20°c. after 4 weeks, the tubers formed (3 mm in diameter) were harvested and used for protein analysis. total soluble protein was extracted from infected n. benthamiana leaf tissue or potato tuber and then quantified as described previously. 
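The resuspension step in the agroinfiltration protocol amounts to a simple dilution calculation. The following is a minimal sketch, assuming that a600 scales linearly with cell density over the range used and that no cells are lost during the rinses; the function name and the example culture values are hypothetical.

```python
def resuspension_volume_ml(culture_a600, culture_volume_ml, target_a600):
    """Volume of infiltration medium giving target_a600 after pelleting a culture.

    Assumes a600 is proportional to cell density and that no cells are lost
    during centrifugation and rinsing (both are simplifications).
    """
    if min(culture_a600, culture_volume_ml, target_a600) <= 0:
        raise ValueError("all inputs must be positive")
    return culture_volume_ml * culture_a600 / target_a600


# Hypothetical example: 10 ml of culture at a600 = 0.8, resuspended to the
# density reported as optimal in this study (0.25 a600).
volume = resuspension_volume_ml(0.8, 10.0, 0.25)
print(f"resuspend the pellet in {volume:.1f} ml infiltration medium")
```

For co-infiltration with the p19 culture, the same calculation would be applied to each culture before the two are mixed at equal concentration.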
briefly, approximately 100 mg plant material, either solanum tuberosum tuber or nicotiana benthamiana leaf, was ground in a 1.5-ml tube with a plastic pestle and mixed with protein extraction buffer (200 mm tris ph 8.0, 100 mm nacl, 400 mm sucrose, 1 mm phenylmethylsulfonyl fluoride, 10 mm edta, 14 mm β-mercaptoethanol, 0.05% tween-20, 2 µg/ml leupeptin, 2 µg/ml aprotinin). the mixture was incubated on ice for 30 min and clarified by centrifugation at top speed for 10 min at 4°c. the supernatant was transferred to a new 1.5-ml tube and used for all subsequent assays. protein samples, boiled or unboiled, were loaded on to a 12.5% sds-page gel and resolved. gels were then either stained with coomassie blue to visualize the protein profile or transferred to a pvdf membrane as described previously. the blot was then blocked for 1 h at room temperature with 5% w/v milk in tbs-t and washed 3 times, each time for 5 min, with tbs-t. the blots were incubated overnight at 4°c in 1:2,000 rabbit anti-sba (cedarlane al-1301-2) in 1/3 blocking buffer:2/3 wash buffer. the blots were washed 3 times, each time for 10 min, in wash buffer and incubated in 1:5,000 goat anti-rabbit-hrp (g-7641, sigma) for 1 h at room temperature. the blots were washed 3 times, each time for 10 min, in wash buffer and then incubated with supersignal west pico chemiluminescent substrate (34080, pierce, rockford, il, usa). blots were exposed and then developed using a film processor. the amount of accumulated rsba in nicotiana benthamiana leaves or solanum tuberosum tubers was quantified by enzyme linked immunosorbent assay (elisa). briefly, commercial native sba standard (sigma l-1395) and triplicate nbrsba or strsba samples were bound to 96 well plates by incubation in phosphate buffer overnight at 4°c. the plates were then washed with pbs-t and blocked with 3% bsa in pbs-t for 1 h at room temperature. 
the plates were washed 3 times and then incubated overnight at 4°c in 1:2,000 rabbit anti-sba in 1/2 blocking:1/2 wash buffer. the plates were washed 3 times and incubated for 1 h at room temperature with 1:5,000 goat anti-rabbit-hrp in 1/2 blocking:1/2 wash buffer and then rinsed 3 times. the plates were incubated with substrate reagent pack (dy999, r&d systems) according to the manufacturer's instructions. the color was developed for 20 min and then stopped by addition of an equal volume of 2 m h2so4. the plate was then read using a thermomax microplate reader (molecular devices, usa). standard curves were calculated and used to determine protein concentrations of the individual samples. negative control was protein extract prepared from wild-type plant tissue. purification of rsba from nicotiana benthamiana or potato tubers was carried out according to the procedure for ge healthcare histrap hp column (#17-5248-01) or using an n-acetylgalactosamine-agarose column. for purification with the n-acetylgalactosamine-agarose column, agarose columns with pre-bound n-acetylgalactosamine (sigma a2787-5 ml) were washed and equilibrated with 0.1 m nacl. the tsp containing rsba was then applied to the column and rinsed with excess 0.1 m nacl. samples were taken throughout rinsing and total protein in the rinse was determined by spectrophotometry at a280. once protein levels were negligible, rsba was eluted with 0.5 m galactose/0.1 m nacl. the purified protein was confirmed via western blot and then desalted via dialysis against excess 0.5× pbs buffer. hemagglutination assay was performed using 2% rabbit red blood cells (rbc) suspended in saline buffer. commercial native sba standard and purified nbrsba and strsba were used for the assay. the proteins were diluted to different concentrations with saline in order to determine the effective unit for each sample, with one unit defined as the amount of sba required to induce agglutination (lin et al. 2008). 
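The elisa quantification above relies on a standard curve built from native sba and expresses the result as a percentage of tsp. The text does not state how the curve was fitted, so the sketch below assumes a straight-line least-squares fit over the linear range of the assay; the standard amounts, absorbance readings and tsp load per well are made-up illustration values, not data from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rsba_percent_tsp(sample_absorbances, standards, tsp_ng_per_well):
    """Convert replicate absorbances to rsba as a percentage of tsp.

    standards: list of (ng native sba per well, absorbance) pairs.
    tsp_ng_per_well: total soluble protein loaded per well, in ng.
    """
    slope, intercept = fit_line([s[0] for s in standards],
                                [s[1] for s in standards])
    mean_absorbance = sum(sample_absorbances) / len(sample_absorbances)
    rsba_ng = (mean_absorbance - intercept) / slope
    return 100.0 * rsba_ng / tsp_ng_per_well


# Hypothetical standard curve and triplicate sample readings.
standards = [(0, 0.05), (10, 0.25), (20, 0.45), (40, 0.85)]
triplicate = [0.44, 0.45, 0.46]
print(f"{rsba_percent_tsp(triplicate, standards, 500.0):.2f}% of tsp")
```

A four-parameter logistic fit is a common alternative when readings extend beyond the linear range; the linear form is used here only to keep the arithmetic transparent.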
all agglutination assays were repeated in triplicate. rbc (2%, 50 µl) was added to round-bottomed 96-well plates and then mixed with 50 µl protein dilutions. the mixture was then allowed to settle for 1 h and the results were recorded.
competitive sugar-binding assay
sugar-binding assay was carried out as described above for hemagglutination, with the following modifications. one unit of native sba or nbrsba or strsba was mixed with saline containing 40 µm or 400 µm concentrations of one of the sugars n-acetylgalactosamine, n-acetylglucosamine, arabinose, lactose, or raffinose. the mixture (50 µl) was then added to 50 µl 2% rbc and incubated for 1 h. the results were then recorded as positive or negative for hemagglutination. deglycosylation of nbrsba and strsba was carried out with pngase f according to the manufacturer's procedure (sigma p-9120-1set). the samples were then loaded on to an sds-page gel followed by western blot analysis using anti-sba antibody. n-linked glycan removal was confirmed via a band shift on the western blot. to confirm the presence of mannose-type glycans on plant-derived rsba, blots containing both pngase f-treated and untreated samples were incubated with 20 µg/ml concanavalin a (c-2010, sigma) at room temperature for 1 h. after several washes with 0.5% tween-20 in pbs (ph 7.5), the blot was incubated with hrp-conjugated anti-cona antibodies (hal-1104-1, e.y. laboratories) at 1,000-fold dilution for 1 h. after the same wash step, the blot was incubated with supersignal west pico chemiluminescent substrate. the signals were then developed using a film processor. analysis of in-vitro digestion of nbrsba was carried out using either simulated gastric fluid (sgf) or simulated intestinal fluid (sif). 
for sgf (0.2 g nacl, 0.32 g pepsin, 700 µl hcl, in 100 ml h2o, final ph 2.5) and sif (0.68 g monobasic kh2po4, 7.7 ml 0.2 m naoh, 1.0 g pancreatin, in 100 ml h2o, final ph ~6.8), purified protein was incubated in a 37°c water bath with samples taken and mixed in neutralization buffer (1.7 g na2co3 in 100 ml h2o) at time 0, 15 s, 30 s, 1 min, 5 min, 15 min, and 30 min. samples were boiled for 10 min, separated on a 12.5% sds-page gel and subjected to western blot analysis as described above. control native sba and a non-glycoprotein, human gad65 made in e. coli (plantigen), were also tested.
isolation and cloning of cdna encoding sba
to obtain a cdna clone encoding sba, rna was extracted from wild-type soybean seeds that were approximately 1-5 mm in size, then converted into cdna that was then used as a template for the pcr cloning of the sba coding sequence. the primers included the addition of 5′ ncoi and 3′ xbai restriction sites to facilitate sub-cloning of the pcr products, and addition of the 6× histidine tag to the c-terminus. the cloned sba gene was confirmed by sequencing. for the convenience of sub-cloning, the internal hindiii site within the native sba coding sequence was removed by converting a g to an a in the 2nd codon position encoding serine, resulting in no change of the amino acid. the complete sba expression cassette was released from the prtl-rsba via hindiii digestion and inserted into pbi101.1 to form pbi-rsba (supplemental fig. 1). the plasmid pbi-rsba was then transferred into a. tumefaciens strain lba4404 for plant transformations. six-week-old n. benthamiana plants were infiltrated with the a. tumefaciens clone harboring pbi-rsba or mixed with cultures of a. tumefaciens containing the p19 expression cassette, a viral inhibitor of the rna silencing pathway in plants that has proven effective in increasing the yield of transiently expressed proteins (lakatos et al. 2004) (fig. 1a). 
leaf samples were collected from three independent plants on different days post-infiltration with different concentrations of a. tumefaciens cultures, with an equal amount of p19 added for each assay, in order to determine the peak of rsba expression in the agroinfiltrated n. benthamiana plants. the results of immunoblot analysis showed that the highest expression using the gene-silencing suppressor p19 was reached at day 5, with an optimum concentration of agrobacterium of 0.25 a600 (fig. 1b,c). as calculated by elisa, the expression level of rsba at day 5 reached 4% of tsp (fig. 1d). examination of unboiled samples on sds-page gels followed by western blotting showed the presence of two bands, one corresponding in size to the monomer and the other equivalent to the size of the tetramer of sba (supplemental fig. 2), suggesting that n. benthamiana-made rsba was assembled into a tetrameric protein complex, essential for its biological activity. stably transformed potato plants were also tested for expression of rsba. thirteen individual transgenic potato lines were generated after agrobacterium-mediated transformation and transferred to minituber-inducing magenta boxes in order to generate consistently sized tubers for further analysis. immunoblot analysis of rsba expression in unboiled samples of tuber extract showed two major bands of 32 kda and just over 100 kda (supplemental fig. 2), expected for the monomer and tetramer of sba, respectively. immunoblot analysis of boiled samples of tuber extract showed only the small band of 32 kda (data not shown). the level of strsba accumulation in potato mini-tubers was variable among individual transgenic lines, ranging from no detectable signal to 0.31% tsp (supplemental fig. 3). soybean agglutinin (sba) specifically binds to n-acetylgalactosamine, enabling the purification of this protein via an n-acetylgalactosamine-agarose column to a high degree of purity. 
purification through this column also serves as confirmation of authentic sba production. figure 2a shows that sugar-specific purification of rsba from nicotiana benthamiana leaf extracts resulted in efficient protein retention during purification and elution. in addition, the sugar-purified sample was of high quality when examined by sds-page and coomassie blue staining, with negligible contamination by non-specific proteins (fig. 2b). sugar-specific purification of rsba from potato mini tubers also resulted in high-quality, purified protein (data not shown). as a control, the isolation of nicotiana benthamiana-derived rsba using a traditional histrap column was also performed (fig. 2b).
fig. 1 (caption, panels c and d) c accumulation of rsba at day 5 with different a600 concentrations of agrobacterium. ? is native sba, nb is nicotiana benthamiana wild-type tsp. d elisa data for accumulation at days 1-9 after infection at optimized concentration. values are expressed as a percentage of total soluble protein (tsp), with each bar representing the mean value of the three collected samples repeated in triplicate from each day, with standard error. all western blots were loaded with approximately 30 µg tsp per lane, 150 ng native sba control.
hemagglutinating activity of plant-derived rsba
soybean agglutinin is known to agglutinate human and animal red blood cells (hemagglutination). rsba was therefore assessed for its ability to agglutinate rabbit erythrocytes. purified nbrsba was found to induce the agglutination (clumping) of rabbit erythrocytes within 1 h of treatment (fig. 3a). similar results were obtained with strsba (data not shown). the minimum amount of native sba required to induce agglutination in the assay was 2.5 µg, whereas nbrsba and strsba required approximately 2 and 3.5 µg, respectively.
as agglutination of erythrocytes by sba is because of its ability to bind to specific sugar residues on the cell surface, it is therefore speculated that the agglutinating activity of sba could be inhibited by addition of specific sugars. indeed, addition of n-acetylgalactosamine even at low concentrations was able to prevent the agglutinating ability of nbrsba, similarly to the native sba control (fig. 3b) . in contrast, the agglutinating ability of nbrsba or native sba was unaffected by high concentrations of non-specific sugars (supplemental table 1 ; see also fig. 3b ). native sba is a glycosylated protein with a single high-mannose glycan (adar et al. 1997) . the presence of glycosylation, while not required for assembly or function, is important in maintaining the stability of sba (sinha and surolia 2007) . to determine the glycosylation status of plant-derived rsba, nbrsba, strsba, and sba standard were digested with pngase f and analyzed by immunoblotting following separation by sds-page gels. both sba standard and plant-derived rsba had a decrease in molecular size of approximately 2-3 kda, which is an expected size for a single glycan (fig. 4a) . to determine the type of glycans of plant-derived rsba, blots containing both pngase f-treated and untreated nbrsba, strsba, and sba standard were incubated with cona, which binds to n-linked mannose, and then with anti-cona antibody. figure 4b shows that binding of cona occurs only in the untreated samples, with no binding to any of the pngase f treated sba proteins, suggesting that plant-derived rsba contains high-mannose type glycans, similar to native sba. plant-derived rsba was further investigated in simulated gastric and simulated intestinal fluid (sgf and sif) to assess its stability in the digestive system. sgf (3.2 g/l pepsin, ph 2.5) has been used to mimic the acidic stomach environment in animals, and sif (10 g/l pancreatin, ph 6.8) is used for proximal small bowel conditions. 
to this end, purified nbrsba and native sba were used in simulated digestion experiments. the effect of sgf and sif on the degradation of nbrsba, as evaluated by immunoblot analysis, is shown in fig. 5a,b. nbrsba remained relatively intact even after 30 min of digestion in either solution. native sba showed similar resistance to sgf and sif digestion (data not shown). to ensure proper activity of the simulated fluids, e. coli-derived recombinant gad65 (glutamic acid decarboxylase) previously produced in our laboratory was tested as a control. this protein lasted less than 1 min in sgf and 5 min in sif (fig. 5c,d). we have demonstrated the successful recombinant production of sba through both stable and transient expression in solanum tuberosum and nicotiana benthamiana, respectively. sba has served as an excellent system of choice to study the fundamentals of protein-carbohydrate interaction (loris et al. 1998; sharon and lis 1990). with a vast increase in knowledge of sba and other lectins, it has become apparent that sba has many additional applications, especially in medical and biotechnology-focused industrial areas. examples include the use of sba to selectively remove cancerous cells from whole blood without removing normal t cells, to prevent graft-versus-host disease following transplant, or to enrich hematopoietic stem cells (bakalova and ohba 2003; kernan et al. 1987; yura et al. 2008). moreover, recombinant sba has the potential to serve as a carrier system for oral drug delivery or to be used as an affinity tag for isolation of high-purity fusion proteins. to make these applications feasible, however, a more affordable source of large quantities of high-quality rsba is required. previous recombinant production of sba in tobacco was incidental, because it was simply used as a marker for studying promoter activity and, as such, no assays were performed to confirm that the protein made retained authentic sba activity (lindstrom et al.
1990). similarly, recombinant production in both bacterial and mammalian cell culture was for the purpose of elucidating the functional role of glycosylation, and the recombinant sba produced was either lacking glycosylation (e. coli-derived) or had reduced agglutinating ability (mammalian cell-derived) (adar et al. 1997). our objective was to develop a robust expression system that can generate large quantities of low-cost authentic rsba, which retains its binding specificity to facilitate simplified, high-purity protein isolation and maintains the stability in the gastrointestinal (gi) tract needed for future use as a carrier molecule for oral drug delivery. authenticity of both nbrsba and strsba was confirmed by several results. the foremost is the binding and inhibition studies using rabbit red blood cells. sba binds to different surface proteins bearing n-acetylgalactosamine and, in the case of red blood cells, this results in hemagglutination (gordon and marquard 1974). both of the plant-derived rsbas were able to induce hemagglutination of rabbit red blood cells. in addition, a similar minimum amount of sba, whether native or from n. benthamiana or s. tuberosum, was able to induce agglutination. this suggests that, unlike monkey cell-made sba, nb/strsba is functionally equivalent to native sba. many lectins can induce hemagglutination, however, so it is important to confirm the specificity of rsba for its ligand. for example, the lectin from the chinese black soybean variety is able to cause agglutination of rabbit red blood cells but is inhibited by a different sugar, in that case melibiose (lin et al. 2008). the inhibition of hemagglutination by addition of n-acetylgalactosamine but not other sugars confirms that rsba retains the authentic binding profile of native sba.
in addition to this, the ability to specifically purify high-quality rsba via n-acetylgalactosamine bound to agarose further confirms the authenticity of the recombinant protein, as there was no detectable level of rsba in either the flow-through or wash steps, only in the elution step. the capture rate for this technique has been reported to be over 90% efficient, which is supported by the lack of rsba signal in the flowthrough steps seen in fig. 2b (percin et al. 2009 ). the ability to purify high-quality rsba also enables potential use as an affinity tag. through genetic fusion with a gene of interest (goi), it may be possible to first purify the fusion protein on the sugar columns, followed by subsequent cleavage of the desired goi via specific endoproteolytic cleavage, such as with tobacco etch virus (tev) protease (tubb et al. 2009) . a second possibility would be in combination with a linker intein, an intervening protein sequence that can be induced to undergo n-terminal, c-terminal or both termini cleavage under specific conditions (evans et al. 2005) . this could enable cleavage of the goi from rsba without any expensive proteases and result in high-purity protein. it should be mentioned that because of the requirement of the tetrameric formation for proper sba activity, addition of a fusion partner may affect the formation of a tetrameric sba. however, several groups, including our own, have previously demonstrated that other multimer forming proteins, for example the non-toxic b subunit of cholera toxin, retain their ability to form pentamers and bind to their target ligand when fused with other proteins (arakawa et al. 1998; li et al. 2006; ruhlman et al. 2007; tremblay et al. 2008 ). the production of rsba was accomplished using both transient and stable transformation of nicotiana benthamiana and potato, respectively. 
transient gene expression in plants has a number of advantages over stable genomic transformation in plants or other recombinant expression platforms. first and foremost, new recombinant proteins can be generated in sufficient quantities for in-vitro analysis and in-vivo studies at a speed not possible with traditional bioreactors. this is best demonstrated by the generation of idiotype-specific scfvs to treat non-hodgkin's lymphoma patients. researchers were able to generate patient-specific scfvs by first cloning the antibody fragment sequences from the patient's biopsy and then producing sufficient quantities in plants to inject the plant-made scfvs back into the patient in less than 16 weeks, something unachievable with any other biological system (mccormick et al. 2008). transient expression can also be used to generate and validate new vaccines against future pandemics, for example h1n1 or sars, for which large quantities of vaccine are needed in a short period of time in order to meet world-wide demand. an additional benefit of transient expression in plants is the level of accumulation. in our work, we achieved almost 4% accumulation of rsba in less than a week. this is more than a tenfold increase over the accumulation seen in our stably transformed potato lines. based on our data, the transient system can generate an approximate yield of between 1.5 and 3 g of rsba per kg fresh weight in 1 week, meaning that a greenhouse setting capable of producing 1,000 kg of tissue per week could generate up to 3 kg of purified rsba every week. the use of stable lines expressing rsba can likewise prove advantageous. a stable expression platform provides a long-term, low-cost supply of therapeutic proteins, which is typically required where the therapeutic agent is used in many individuals over a long period of time, for example in the potential prevention of type 1 diabetes using glutamic acid decarboxylase 65 (gad65) (ma et al. 2004).
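the yield estimate above is simple arithmetic, and a short sketch makes the assumptions explicit. the function names below are illustrative only and are not part of the original work; the calculation assumes that all expressed rsba is recovered during purification, which is an upper bound rather than a measured recovery. the second function compares the transient (4% tsp) and best stable (0.31% tsp) accumulation levels reported in the results.

```python
def weekly_yield_kg(biomass_kg_per_week, yield_g_per_kg):
    """Projected rsba output in kg per week.

    Assumes (hypothetically) 100% recovery during purification, so the
    result is an upper bound on purified protein.
    """
    return biomass_kg_per_week * yield_g_per_kg / 1000.0


def fold_difference(transient_pct_tsp, stable_pct_tsp):
    """Ratio of transient to stable accumulation, both given as % of tsp."""
    return transient_pct_tsp / stable_pct_tsp


# values taken from the text: 1.5-3 g rsba per kg fresh weight, 1,000 kg tissue per week
low_kg = weekly_yield_kg(1000, 1.5)
high_kg = weekly_yield_kg(1000, 3.0)

# values taken from the results: ~4% tsp (transient) vs. 0.31% tsp (best potato line)
fold = fold_difference(4.0, 0.31)

print(f"projected weekly yield: {low_kg:.1f} to {high_kg:.1f} kg")
print(f"transient vs. stable accumulation: {fold:.1f}-fold")
```

under these assumptions the projected range is 1.5 to 3 kg per week, and the transient system accumulates roughly 13 times more rsba (as a share of tsp) than the best stable potato line.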
we were able to accumulate rsba to over 0.3% of tsp in transgenic potato tubers. a major benefit of tuber accumulation is the stability of recombinant proteins during long periods of storage. in the production of monoclonal antibodies, for example, the recombinant antibodies were stable for over 6 months when stored in the dark at room temperature, with little to no loss in either quantity or activity (de wilde et al. 2002) . this makes potatoes an excellent system for recombinant protein production, because one of the major disadvantages of field/ greenhouse grown plants for recombinant production is losses in yield owing to protein degradation during transit and processing. an additional benefit of the use of tubers is that they can be used to orally deliver the target protein with minimal processing, especially given that potato tubers form a staple part of many cultures' diets and can be processed for consumption via freeze-drying that enables ingestion without cooking and thus could help minimize protein denaturation/loss during food preparation. in-vitro digestion of n. benthamiana-derived rsba under simulated gastric and intestinal conditions proved it to be stable. there was no significant loss or degradation of the protein after incubation of nbrsba in sgf for 30 min, as seen in fig. 5a ,c. these results suggest that the plant-derived rsba may hold great promise for use as a carrier system for oral delivery of protein and peptide drugs. the oral route for drug delivery is the most preferred route and has considerable advantages: requiring neither sterile needles nor trained personnel, lower cost and increased quality of life, increased access to a large population, reduced side-effects often seen with systemic delivery, and greater patient compliance and acceptability. however, administration of therapeutic peptide or protein drugs by the oral route is a major challenge. 
orally administered peptide or protein drugs are readily degraded because of their exposure to the harsh environment of the gi tract (low ph and various proteinases and peptidases). therefore, development of suitable delivery systems is crucial to the success of oral administration of protein drugs. plant-derived rsba may provide an ideal vehicle for oral drug delivery. moreover, sba undergoes endocytosis through the epithelial lining of the intestine, adding to its potential as an adjuvant for therapeutic protein delivery (benjamin et al. 1997). a potential concern with the use of rsba as a carrier system for targeted drug delivery is its anti-nutritional property. indeed, sba is one of the predominant anti-nutritional factors found in raw soybean and accounts for approximately 50% of the growth inhibition in rats fed unheated soybean (liener 1996). however, its anti-nutritional activity is strictly dose-dependent. a negative effect of sba on growth and immune function in rats was observed only when very high levels of this protein were given in the diet (14 mg per rat daily, or equivalent to 0.2% of its body weight). in our previous work on oral tolerance induction in mice using plant-derived autoantigenic protein, we showed that only microgram quantities (or 0.005% of the animal's body weight) of the plant-derived antigen are required to induce a response (ma et al. 2004). this suggests that, when rsba is used as a carrier system, the minimum effective dose of rsba-antigen fusion protein to be delivered is in the range of micrograms, not milligrams. delivery of microgram quantities of sba to animals is safe (buttle et al. 2001; tang et al. 2006). however, as with all therapeutics destined for human use, thorough studies would be required to determine the overall bio-safety. in conclusion, our work has demonstrated the feasibility of high-accumulation, high-purity recombinant production of sba. this is the first molecular characterization of rsba in plants.
the availability of low-cost, high-quality rsba may make many important applications of this protein possible, especially in medicine.
synthesis of soybean agglutinin in bacterial and mammalian cells
a plant-based cholera toxin b subunit-insulin fusion protein protects against the development of autoimmune diabetes
evaluation of a soybean lectin-based method for the enrichment of erythroblasts
interaction of soybean agglutinin with leukemic t-cells and its use for their in vitro separation from normal lymphocytes by lectin-affinity chromatography
inflammatory and anti-inflammatory effects of soybean agglutinin
bioproduction of therapeutic proteins in the 21st century and the role of plants and plant cells as production platforms
use of an in vitro tuberization system to study tuber protein gene-expression
the binding of soybean agglutinin (sba) to the intestinal epithelium of atlantic salmon, salmo salar and rainbow trout, oncorhynchus mykiss, fed high levels of soybean meal
cap-independent enhancement of translation by a plant potyvirus 5' nontranslated region
expression of antibodies and fab fragments in transgenic potato plants: a case study for bulk production in crop plants
protein splicing elements and plants: from transgene containment to protein purification
factors affecting hemagglutination by concanavalin a and soybean agglutinin
prevention of gvhd in hla-identical marrow grafts by removal of t-cells with soybean agglutinin and srbcs
molecular mechanism of rna silencing suppression mediated by p19 protein of tombusviruses
expression of cholera toxin b subunit and the b chain of human insulin as a fusion protein in transgenic tobacco plants
effects of processing on antinutritional factors in legumes: the soybean case
purification of melibiose-binding lectins from two cultivars of chinese black soybeans
expression of soybean lectin gene deletions in tobacco
legume lectin structure
induction of oral tolerance to prevent diabetes with transgenic plants requires glutamic acid decarboxylase (gad) and il-4
production of biologically active human interleukin-4 in transgenic tobacco and potato
plant-based pharmaceuticals and its application in oral tolerance. in: pontell eb (ed) immune tolerance research development
plant-produced idiotype vaccines for the treatment of non-hodgkin's lymphoma: safety and immunogenicity in a phase i clinical study
n-acetyl-d-galactosamine-specific lectin isolation from soyflour with poly(hpma-gma) beads
uses of plant lectins in bioscience and biomedicine
expression of cholera toxin b-proinsulin fusion protein in lettuce and tobacco chloroplasts-oral administration protects against development of insulitis in nonobese diabetic mice
carbohydrate-protein interactions
oligomerization endows enormous stability to soybean agglutinin: a comparison of the stability of monomer and tetramer of soybean agglutinin
attributes of glycosylation in the establishment of the unfolding pathway of soybean agglutinin
lectin-mediated drug delivery in the oral cavity
rapid, transient expression of fluorescent fusion proteins in tobacco plants and generation of stably transformed plants
effects of purified soybean agglutinin on growth and immune function in rats
expression of a fusion protein consisting of cholera toxin b subunit and anti-diabetic peptide (p277) from human heat shock protein in transgenic tobacco plants
tobacco, a highly efficient green bioreactor for production of therapeutic proteins
purification of recombinant apolipoproteins a-i and a-iv and efficient affinity tag cleavage by tobacco etch virus protease
selection of hematopoietic stem cells with a combination of galactose-bound vinyl polymer and soybean agglutinin, a galactose-specific lectin
key: cord-003435-ke0az7nf authors: schlake, thomas; thess, andreas; thran, moritz; jordan, ingo title: mrna as novel technology for passive immunotherapy date: 2018-10-17 journal: cell mol life sci doi: 10.1007/s00018-018-2935-4 sha: doc_id: 3435
cord_uid: ke0az7nf while active immunization elicits a lasting immune response by the body, passive immunotherapy transiently equips the body with exogenously generated immunological effectors in the form of either target-specific antibodies or lymphocytes functionalized with target-specific receptors. in either case, administration or expression of recombinant proteins plays a fundamental role. mrna prepared by in vitro transcription (ivt) is increasingly appreciated as a drug substance for delivery of recombinant proteins. with its biological role as transient carrier of genetic information translated into protein in the cytoplasm, therapeutic application of mrna combines several advantages. for example, compared to transfected dna, mrna harbors inherent safety features. it is not associated with the risk of inducing genomic changes and potential adverse effects are only temporary due to its transient nature. compared to the administration of recombinant proteins produced in bioreactors, mrna allows supplying proteins that are difficult to manufacture and offers extended pharmacokinetics for short-lived proteins. based on great progress in understanding and manipulating mrna properties, efficacy data in various models have now demonstrated that ivt mrna constitutes a potent and flexible platform technology. starting with an introduction into passive immunotherapy, this review summarizes the current status of ivt mrna technology and its application to such immunological interventions. our bodies are continuously exposed to molecules that may indicate disease or parasitic invasion. the immune system is responsible for the detection and clearance of such molecules and for control of the underlying causes. immediate discrimination between self and foreign upon first exposure is mediated by innate immunity. signals that induce innate immunity are called pathogen-associated and damage-associated molecular patterns (pamps and damps) [1, 2].
the innate immune system is characterized by induction of generic defenses against a broad spectrum of infectious agents and is mainly aimed at clearance of tissue damage at the site of infection and interruption of further pathogen replication. responses against specific targets following prolonged or repeated exposures are processed by the adaptive immune system [3] . important effectors of adaptive immunity are highly specialized cells: b cells secrete antibodies against soluble or cell-associated antigens [4] . cytotoxic cd8 + t cells (ctls) recognize and kill infected or neoplastic cells [5] . regulatory cd4 + t cells augment b cell maturation or inhibit auto-reactive immune cells [6] . dendritic cells (dcs) process and present antigen for the transition from innate to adaptive immunity [7] . some b and t cells progress towards persisting memory cells that react faster and with greater affinity to the foreign molecules upon re-exposure. compared to innate immunity, adaptive immunity usually requires weeks (as opposed to minutes) until an effective response against novel targets is achieved. advantages of the adaptive immune system include the ability to launch specific immune responses also against antigens of endogenous (not only microbial) origin that may be associated with degenerative or neoplastic disease. manipulation of the immune system is an important component of many prophylactic and therapeutic applications against infectious, degenerative and neoplastic diseases. the diverse repertoire of methods can be roughly divided into four approaches: active immunization (vaccination) prepares adaptive memory responses, usually prior to first exposure, and constitutes one of the most efficacious and cost-effective medical interventions. immunization with antigens generally only leads to the induction of antibody production, whereas for instance inoculation with attenuated viruses also elicits cytotoxic effector cells [8] . 
in addition, active immunization can be accomplished by adoptive cell transfer. in a typical setting, dendritic cells are loaded with tumor antigens ex vivo and subsequently infused into patients to induce an adoptive immune response against cancer cells [9]. in contrast, passive immunotherapy circumvents the initial steps required for immune responses to be launched and directs the immune system efficiently to the desired medical targets. for cellular approaches, ctls are equipped with recombinant receptors ex vivo. these cells are designed to attack neoplastic cells that express the cognate tumor-associated antigen immediately after infusion [5]. passive immunization by administration of processed antibodies derived from human or animal donors is a well-established emergency procedure for treatment of snake-bite envenomation or post-exposure prophylaxis against, for example, rabies [10]. the advantage of passive immunization is that protective antibodies can be provided in a very short time. application of recombinant antibodies further expands the number of available targets and is increasingly important for augmentation of conventional therapies against cancer [10, 11]. passive immunotherapies either require or can exploit modern nucleic acid-based methods, among which mrna is the latest technology. the present review aims to provide an overview of mrna in passive immunotherapy after introducing the immunological approach as well as mrna technology as such. it is well known that protection raised by most of today's licensed vaccines is primarily antibody-dependent (fig. 1a) [8, 12]. basically, this explains the long and successful history of passive immunotherapy (fig. 1b). the protective capacity of serum against bacterial toxins was discovered in the early 1890s [13].
the avoidance or control of infection by such passive immunization is based on the transfer of serum and later polyclonal immune globulin (= antibody) preparations from convalescent or vaccinated humans or animals [14, 15]. prior to the discovery of antibiotics, serum was the only antidote for bacterial diseases [16]. thus, following successful passive immunization against diphtheria toxin [13], a whole plethora of serum or immune globulin therapies for viral and bacterial diseases as well as to neutralize snake toxins was developed [17]. clinical benefits of serum and immune globulin therapy were demonstrated for viral diseases such as influenza, measles, and polio and bacterial infections with meningococcus or pneumococcus [18] [19] [20]. with the advent of antibiotics, the use of serum or polyclonal immune globulin preparations as antibacterial agents was largely discontinued in the late 1940s. however, such therapies retained a niche as treatment for venoms, toxins, and certain viral infections. in the second half of the twentieth century, polyclonal antibody preparations were developed and in part licensed for the prophylaxis and treatment of hepatitis a and b, cytomegalovirus, varicella-zoster virus, vaccinia virus, rabies, respiratory syncytial virus, west nile virus, and various hemorrhagic fevers [21]. for instance, passive immunization against argentine hemorrhagic fever shows beneficial effects when applied within 1 week after the emergence of symptoms, and post-exposure treatment with human or equine immunoglobulins is recommended for rabies [22] [23] [24]. moreover, botulism is treated with equine antitoxin [25, 26]. such immune globulin preparations of non-human origin are particularly prone to elicit an immune response that obstructs their therapeutic use or efficacy [27].
further disadvantages or difficulties associated with the use of serum or polyclonal globulin preparations are the often high content of non-neutralizing antibodies, batch-to-batch variations, and in case of human sources the availability of appropriate immune donors [21, 28]. in 1975, groundbreaking work described the production of monoclonal antibodies (mab) by immortalization of b cells [29]. the resulting hybridoma technology was then rapidly exploited for clinical use, for instance, to produce a mab to cd3 for preventing organ rejection [30]. recombinant technologies further expanded the available therapies based on mabs. in vitro antibody selection technologies like phage or ribosome display were developed to enable the generation of highly specific human mabs out of libraries that may even be naïve for the specific antigens [31] [32] [33] [34] [35] [36]. a high throughput technique to amplify and clone antibody genes from single human b cells was described [37, 38].
fig. 1 schematic illustration comparing active immunization and passive antibody immunotherapies. (a) during active immunization triggered by natural infection or vaccination, antigenic patterns are presented by antigen-presenting cells in the lymph nodes. this leads to t cell mediated activation of antigen-specific b cells. as a consequence, b lymphocytes differentiate into plasma cells which produce and secrete antigen-specific antibodies that bind to cognate structures, finally leading to their clearance. (b) instead of being produced by plasma b cells, antibodies can be manufactured recombinantly and administered for instance by subcutaneous injection for passive immunization. after injection, antibodies enter circulation by diffusion and act like endogenous antibodies. (c) for dna-based passive immunization, dna is often packaged in nanoparticles, e.g., virus capsids, which for instance can be injected intramuscularly. after uptake by muscle cells, dna is released into the cytosol. for transcription into mrna, the dna has to enter the cell nucleus first. mrna is then translated into antibodies which are secreted to bind their cognate targets.
although recombinant mab technology is exploiting a large variety of different antibody formats (fig. 2), the prevalent type is still a full-size antibody of the igg class. in addition to the variable domains essential for antigen binding, they contain constant domains including the fc-region. the latter is important for antibody function and can mediate antibody-dependent cellular cytotoxicity (adcc), complement-dependent cytotoxicity (cdc), and antibody-dependent cellular phagocytosis (adcp) [39]. furthermore, binding to the neonatal fc receptor (fcrn) plays a role in controlling the antibody half-life which is 21-28 days for human igg [40] [41] [42]. fcrn rescues bound antibody from degradation by transporting it back to the cell surface where it is released into the extracellular space [43, 44]. specific mutations in the fc region that increased the affinity to fcrn have been shown to prolong antibody half-life up to fivefold [45]. in addition to the impact of fcrn binding, half-life also benefits from the large size of iggs. it obstructs antibody clearance by the kidney as well as metabolization by cytochrome p450 [42, 46]. the downside of the large size is the correspondingly low access to and penetration of tissue which can affect therapeutic efficacy [47]. full-size antibodies are often posttranslationally glycosylated, modulating fc function. although aglycosylated iggs can be produced in bacteria [48], modern production processes usually take advantage of the cellular machinery for advanced posttranslational maturation and secretion of eukaryotic cells [48] [49] [50] [51]. the vast majority of approved therapeutic antibodies are currently produced in mammalian cells [52, 53].
production processes have been optimized especially for the predominant production system, the continuous chinese hamster ovary (cho) cell line. cho cells secrete antibodies with negligible non-human glycoforms [54] and are also amenable to glycoengineering [55]. differences in glycoforms depend on the production system and affect distribution and stability of the antibody, fc effector function and immunogenicity in recipients [54]. since traditional expression hosts such as e. coli do not allow efficient production of full-size antibodies, smaller proteins consisting of fragments derived from the variable domains were developed as promising alternatives. such single-chain variable fragments (scfv) and various derivatives thereof preserve antigen binding while facilitating manufacturing (fig. 2b, c) [56]. another type of antibody fragment is derived from camelids or cartilaginous fish. these animals produce single-domain antibodies devoid of light chains (fig. 2e) [57, 58]. since antigens are recognized by a heavy-chain-only vh domain (vhh) in camelids [59], the variable vhh fragment can be easily engineered into nanobodies that offer additional advantages such as improved heat and ph stability [60]. moreover, they can also be assembled into vhh-based neutralizing agents (vnas) (fig. 2e) [61]. various studies demonstrated that multivalent formats were more effective than monovalent single-domain antibodies [62, 63]. notably, all formats based on antibody fragments can be relatively efficiently produced with less expensive bacterial expression systems, typically employing e. coli [64, 65]. the antibody fragments produced in this system are often targeted to the oxidative environment of the periplasm using specific signal peptides to foster disulfide bond formation and proper folding [64, 65]. moreover, enhanced expression of chaperones and cytoplasmic oxidases has been demonstrated to increase the yield of antibody fragments [48, 66].
small antibody fragments were also the basis for developing the concept of bispecific antibodies more than 20 years ago. initially, single chain antibodies with a different binding specificity were fused to the c-terminal ends of heavy chains of iggs [67]. generation of the first bispecific igg molecules benefited from the knob-into-hole technology [68]. today, many different bispecific antibody formats combining two different antigen binding domains in one molecule are available (fig. 2d) [69] [70] [71] [72]. among them, bispecific diabodies (bi-(scfv)2) and bite antibodies are prominent examples [73, 74]. in general, bispecific antibodies can be deployed to target therapeutic substances such as toxins, radionuclides, and drugs as well as effector cells like ctls to the site of cognate antigen expression [75]. owing to their small size, many formats using antibody fragments are cleared by renal elimination [76, 77]. moreover, in the absence of an fc region, recycling by the fcrn rescue mechanism cannot take place [77]. as a consequence, these formats usually have short plasma half-lives [77]. for instance, bi-(scfv)2 antibodies have a serum half-life of less than 2 h which usually requires continuous infusion [78]. in case of the bite blinatumomab, the antibody is usually administered daily due to its short half-life [79]. possible strategies to extend serum half-lives are site-specific pegylation and fusion to an fc region [80, 81]. however, the latter approach would negate various advantages of antibody fragments including their better and faster tissue penetration [41, 82]. it has been shown that small single-domain antibodies could even cross the blood-brain barrier [83]. in case of an anti-rabies antibody, this allowed partial rescue of mice challenged with virus injection into the brain, in contrast to full-size immunoglobulins [84, 85].

today, monoclonal antibodies play an important role in the therapeutic armamentarium.
dozens of antibodies have been licensed to treat cancer, rheumatoid arthritis, multiple sclerosis, psoriasis, allergy, systemic lupus, and other diseases. in addition, mabs have shown promise in protecting against various microorganisms, viruses, and fungal infections as well as in treating neurodegenerative diseases [86] [87] [88] [89]. however, mabs for infectious disease indications are still scarce among licensed products. palivizumab for respiratory syncytial virus prophylaxis in high-risk infants was the first antiviral mab approved by the fda [90]. since then, antibodies against anthrax and, in india, against rabies have become available. in addition, bezlotoxumab binding to clostridium difficile toxin is used to prevent recurrent bacterial infections [91]. mabs for treating cancer are still the largest group of licensed products. here, rituximab, directed against the transmembrane protein cd20 on the surface of b lymphocytes, was the first mab in clinical use [92]. however, as for many other mabs in antitumor therapy, high doses are needed to obtain clinical efficacy [93]. a successful example of a bispecific antibody is the first-in-class bite against cd19/cd3, blinatumomab, which is approved for the treatment of acute lymphoblastic leukemia [94] [95] [96]. in addition to the administration of immunoglobulins, passive immunity can also be conferred by transferring functionalized immune cells. adoptive transfer of ctls was shown to be a potent therapeutic means to treat both viral infections and cancers [97] [98] [99]. to this end, t cells can be equipped with an additional t-cell receptor (tcr) or a chimeric antigen receptor (car) [100]. while cars are limited to the binding of surface antigens, tcrs recognize mhc-presented peptides derived also from intracellular proteins. the first successful clinical trial with an engineered tcr on ctls demonstrating tumor regression was reported in 2006 [99].
subsequently, passive immunotherapies with tcr-engineered t cells became an important approach for antitumor treatments [101]. efficient targeting and killing of cancer cells expressing the respective antigen in patients with various forms of cancer including metastatic melanoma, synovial sarcoma, and colorectal carcinoma have been demonstrated [99, 102, 103, 104]. a possible problem specific to the use of tcr-engineered t cells is the presence of an endogenous tcr. mispairing of endogenous and introduced α- and β-chains may create new specificities with potential reactivity to host molecules [105, 106]. to avoid such mispairing, the use of γ/δ t cells has been suggested, since γ/δ-chains do not pair with α- or β-chains [107] [108] [109]. alternatively, gene editing is now being explored to disable endogenous tcr expression [110, 111]. the concept of cars was developed in 1989 [112] and then refined using an scfv fragment to obtain antibody-like receptor specificity without the need to transfer multiple genes [113]. first generation cars consisted of an antigen-specific scfv, fused to a transmembrane and intracellular cd3ζ tcr signaling domain, conferring transient activation and cytotoxicity to t cells [114]. upon target binding through the scfv domain, the engineered t cell is activated in an mhc-independent manner [115]. subsequent generations of cars were improved with respect to cytotoxicity and persistence by including additional co-stimulatory domains such as cd28, ox-40 or 4-1bb [114, 116, 117, 118]. t cells engineered with such cars targeting specific tumor antigens are remarkably successful in treating hematological malignancies like leukemia and lymphoma [119, 120]. for instance, cd19-directed car t cells repeatedly achieved complete and durable remissions in patients with b-cell acute lymphoblastic leukemia (b-all) [121] [122] [123]. in contrast, car t cell therapies face various challenges for solid tumors [124].
at present, the most common techniques for generating tcr- or car-engineered t cells utilize viral gene transduction with retro- or lentiviral vectors [125]. however, permanent expression of the transgenic receptor mediated by this efficient technology can be disadvantageous in case of therapy-related severe toxicities due to accidental cross reactivity [126] [127] [128]. on-target/off-tissue and off-target toxicities by engineered t lymphocytes attacking healthy host cells as well as cytokine release syndrome are feared side-effects which were repeatedly reported when virus vector transduced cells were applied [129] [130] [131] [132]. another concern of using retroviral vectors is the potential to induce insertional mutagenesis and genotoxicity in effector cells [133] [134] [135] [136]. hence, more precise cell manipulations are currently under investigation. among them, gene editing of primary human t cells was recently demonstrated to be an efficient approach [137]. while t cell engineering for adoptive transfer inevitably requires the use of nucleic acids encoding a target-specific receptor, antibody immunotherapy can deploy recombinant proteins. however, as pointed out above, maintaining therapeutically effective levels may require frequent administrations dependent on clearance and indication [138]. thus, dna-mediated antibody expression directly in the body may represent an attractive alternative to administration of recombinant proteins (fig. 1c). both plasmids and viral vectors have been used for passive immunization. although efficient in small animal models, strong expression of recombinant genes using unformulated dna does not scale well to larger animals (including primates) [139]. consequently, application of recombinant adeno-associated viruses (aavs) is currently the preferred method for transduction of the antibody gene of interest [140]. early work reached single digit µg/ml serum titers [141].
later studies with advanced vector designs reported expression levels in the high µg/ml or even in the single digit mg/ml range [142] [143] [144]. much work has been done in the field of hiv prophylaxis. here, a single intramuscular injection of recombinant aav was demonstrated to elicit peak antibody titers above 100 µg/ml in mice [142]. treated mice showed substantial anti-hiv antibody levels for more than a year and were protected from hiv-1 challenge. although aav usually maintains an episomal state, this vector still harbors an inherent risk of insertional mutagenesis. in patients with hepatocellular carcinomas, integration of aav2 into known cancer genes was observed [145]. moreover, aav-based immunotherapy faces various issues regarding immunogenicity [146] [147] [148]. a substantial percentage of the population has already been in contact with the virus used and consequently shows pre-existing immunity limiting the efficacy of treatment [149, 150]. the induction of anti-viral responses during immunotherapy may have similar consequences if a single virus serotype is used in repeated treatments [148]. pre-existing or induced immunity could lead to clearance of the viral vector and/or aav-transduced cells. finally, gene delivery by aav has been reported to induce immune responses against the encoded protein. even when using an endogenous gene such as erythropoietin (epo), some macaques receiving epo-encoding aav intramuscularly developed severe autoimmunity against the protein [151]. when non-human primates were treated with aav vectors expressing antibody or antibody-like proteins, titers dropped rapidly in some animals [152]. one reason is the possibility of inducing anti-idiotype antibodies which requires immune suppression to obtain sustained expression [153]. further studies indicated that primates may be more prone to develop a robust t cell response to the aav-encoded protein compared to mice [154].
in summary, the risks of insertional mutagenesis and genotoxicity, long-lasting expression without control in case of adverse events, and various potential issues regarding immunogenicity associated with viral vectors that may limit efficacy emphasize the high demand for vectors other than dna for passive immunotherapy. how mrna offers a viable option to meet the demands is reviewed in the following sections.

the cellular machinery uses mrna as a transient carrier of genetic information for the synthesis of proteins. based on this fundamental biological concept, administering exogenous mrna represents an alternative to dna-mediated protein delivery in vitro and in vivo. using mrna instead of dna as therapeutic substance is attractive because it carries no risk of insertional mutagenesis. moreover, efficient expression is even obtained in non-dividing cells, since mrna does not require a nuclear phase for activity. compared to the delivery of proteins and peptides, mrna may prolong the availability of effector molecules, although not as much as dna. in contrast to dna-based approaches, mrna therapy therefore has to cope with the short in vivo half-life of exogenously delivered mrna, as indicated for instance by mrna-mediated vegf expression in myocardium returning to baseline within 72 h [155]. while this may be a therapeutic disadvantage in various instances, it can be considered advantageous from a safety perspective, particularly in case of adverse events. mrna was first employed for the expression of a protein of interest in the early 1970s when rna preparations were microinjected into xenopus oocytes and synthesis of the encoded protein was demonstrated [156, 157]. in 1989, the group of inder verma presented a reliable method to efficiently introduce rna into a variety of cells using a cationic lipid [158]. almost at the same time, mrna-mediated protein expression in vivo was demonstrated after direct injection into mouse muscle [159].
much of the early work on a potential therapeutic use of mrna focused on the development of active vaccination approaches, in part since low amounts of antigen suffice due to the amplifying nature of the immune response. subcutaneous injection of liposome-encapsulated, antigen-encoding mrna was the first example of eliciting an antigen-specific ctl response in mice [160]. gene gun delivery of mrna into mouse epidermis provided the first evidence of an antigen-specific antibody response [161]. in addition, mrna turned out to be a potent means to load dendritic cells with antigens to convert them to tailored antigen-presenting cells in vitro and in vivo [162]. later, a new vaccination protocol was introduced which elicited a complete adaptive immune response consisting of antigen-specific antibodies and t cells with lytic activity without requiring any transfection reagents, special equipment or heterologous boost [163]. from a retrospective view, this event marked the starting point of the commercial development of mrna vaccines.

in the minimal structure, mrna contains a protein-encoding open reading frame (orf) flanked at the 5′- and 3′-end by two elements essential for the function of eukaryotic mrna: the "cap", a 7-methyl-guanosine residue (m7g) bound to the 5′-end of the rna via a 5′-5′ triphosphate bond, and, with the exception of histone mrnas, a poly(a) tail at the 3′-end [164] [165] [166]. synthetic mrna is transcribed in vitro from a plasmid dna template that contains at least a bacteriophage promoter, the orf, and a unique restriction site for linearization of the plasmid to ensure defined termination of transcription. typically, the template also contains a poly(d[a/t]) sequence transcribed into poly(a). alternatively, the poly(a) tail can be generated by enzymatic in vitro polyadenylation of transcribed rna.
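the template layout described above can be sketched as a small data structure. this is an illustrative model only: the class name `IvtTemplate`, its fields, and the toy sequences are assumptions made for the example, the promoter is represented by name rather than by sequence, and cap addition is not modelled.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass(frozen=True)
class IvtTemplate:
    """Transcribed part of a linearized DNA template (coding strand, 5'->3').

    Hypothetical model: field names are illustrative, not taken from the text.
    """
    promoter_name: str   # e.g. a bacteriophage promoter; kept as a label only
    utr5: str            # 5' untranslated region (DNA letters)
    orf: str             # open reading frame including start and stop codon
    utr3: str            # 3' untranslated region (DNA letters)
    poly_a_length: int   # length of the encoded poly(d[A/T]) stretch

    def validate(self) -> None:
        """Check basic ORF and tail properties; raise ValueError on problems."""
        orf = self.orf.upper()
        if len(orf) % 3 != 0:
            raise ValueError("ORF length must be a multiple of three")
        if not orf.startswith("ATG"):
            raise ValueError("ORF must start with ATG")
        if orf[-3:] not in STOP_CODONS:
            raise ValueError("ORF must end with a stop codon")
        if self.poly_a_length < 0:
            raise ValueError("poly(A) length cannot be negative")

    def transcript(self) -> str:
        """Run-off transcript as RNA letters, ending with the poly(A) tail."""
        self.validate()
        dna = self.utr5 + self.orf + self.utr3 + "A" * self.poly_a_length
        return dna.upper().replace("T", "U")


if __name__ == "__main__":
    # Toy sequences, far shorter than any real construct.
    template = IvtTemplate("T7", "GGGAGACC", "ATGGCCTAA", "GCTGCC", 64)
    rna = template.transcript()
    print(len(rna), rna[:30] + "...")
```

linearizing the plasmid directly after the poly(d[a/t]) stretch corresponds here to the transcript ending with the last adenosine of the tail; enzymatic polyadenylation would instead be modelled by setting `poly_a_length` to zero and extending the rna afterwards.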
the cap may be incorporated into the transcript during transcription by including an m7gpppg dinucleotide as a structural homolog of the endogenous cap structure in the transcription reaction. different options for how to derive ivt mrna are summarized in fig. 3 . various elements of an mrna molecule contribute to the level and duration of expression of the encoded protein. the cap structure is required for efficient translation and stabilizes mrna towards exonucleolytic decay [167] [168] [169] . various structures have been repeatedly used for ivt mrna (fig. 4) . the basic m7gpppg cap analog is incorporated in both orientations into the rna by the bacteriophage rna polymerase [170] . however, reverse incorporation of the cap analog results in mrna molecules that lack the m7 methylation at the cap and are not recognized by the translational machinery [171] . substitution of the hydroxyl group in c2′ or c3′ position of the m7g with a methoxy group prevents reverse incorporation of the cap analog by inhibiting elongation at the m7g. the dinucleotide is, therefore, called 'antireverse cap analog' or arca [171, 172] . arca-capped mrna revealed increased as well as prolonged protein expression in cultured cells and enhanced reporter protein expression in mouse dendritic cells up to 20-fold [173, 174] . for further optimization of the cap structure, modifications were introduced within the triphosphate linkage to inhibit decapping. substitution of a non-bridging oxygen in the β-phosphate moiety of an arca by sulfur results in the β-s-arca dinucleotide. while β-s-arca maintains recognition by the translational machinery, protein expression from mrna capped with β-s-arca was extended in hc11 cells and immature dcs, but not in mature dcs [175, 176] . enzymatic capping represents an alternative to the cotranscriptional approach, avoiding a n7-unmethylated cap as does arca. 
to this end, mrna is transcribed without cap analog and subsequently capped using the vaccinia virus capping complex [177]. this complex with triphosphatase, guanylyltransferase and (guanine-7)-methyltransferase activity adds a natural cap to the 5′-triphosphate of an rna molecule. such a cap0 structure can be further converted into a cap1 group by o-methylation of the ribose of the cap-proximal nucleotide using the vaccinia virus 2′o-methyltransferase [178]. cap1 (and cap2 harboring a further o-methyl-ribose at the subsequent nucleotide) structures are typical of eukaryotic mrna and are recognized less by cytosolic rna sensors of the innate immune system, thereby rendering the mrna less immunostimulatory. for instance, rig-i is activated by cap0 but not cap1 mrna and ifit1 binds cap0 mrna more tightly than cap1 molecules [179, 180]. such sensor-mediated immune stimulation may in turn negatively impact mrna translation [181]. it is interesting to note that a new cap analog, cleancap, has recently become available which enables the cotranscriptional incorporation of a cap1 structure into ivt mrna. like the cap structure at the 5′-end, details of the poly(a) tail at the 3′-end influence translation and stability of mrna [167, 182]. while numerous studies demonstrated a positive effect of a poly(a) tail and some correlation between effectiveness and length of the element, the details of the observations were remarkably variable. the majority of studies indicate an enhancement of translation when extending the poly(a) length from approximately 60-70 to 100-150 nucleotides or by tail extension using enzymatic polyadenylation [174, 183, 184, 185]. effects were mostly moderate, but reached a maximum of a 35-fold increase in one particular setting. in contrast, another study reported an optimum of approximately 60 nucleotides; protein expression declined with further increasing poly(a) length [186].
in addition to poly(a) length, one report suggests a positive role of using a type iis restriction enzyme for dna template linearization to obtain a free poly(a) end rather than one extended with unrelated nucleotides [183]. further elements that can affect mrna translation and/or half-life are untranslated region (utr) sequences flanking the orf sequence. trans-acting regulatory rna-binding proteins (rbps) interact with distinct rna sequence elements, thereby affecting ribosome recruitment and transit [187]. for instance, β globin 5′- and 3′-utrs, a duplicated β globin 3′-utr, the 5′-utr of tobacco etch virus, and a structure of the 5′-utr of human heat shock protein 70 all enhanced mrna translation in mammalian cells [158, 183, 188, 189]. according to a very recent survey of a combinatorial utr library, 5′-utr sequences appear to be most critical for protein expression [190]. in contrast, 3′-utrs seem to be the key driver for mrna half-life as exemplified by the stabilizing effects of the α globin 3′-utr as well as a duplication of the β globin 3′-utr [183, 191].

fig. 4 schematic representation of different cap structures. a the typical 5′ cap of eukaryotic mrnas. a guanosine is methylated at position 7 and linked to the first nucleotide of the mrna by an unusual 5′ to 5′ triphosphate bridge. depending on the degree of methylation of the first two bases of the mrna, the full 5′ terminal structure is referred to as cap0, cap1 or cap2. the cleancap™ analog, a trinucleotide introducing a cap1 structure during ivt, is indicated in blue. b a plain cap0 analog (orange) is incorporated in two orientations during ivt. c inverse orientation can be avoided using anti-reverse cap analogs (arcas, highlighted in orange). such analogs are characterized by the presence of a methoxy group at either c2′ or c3′ of m7g. to improve resistance to decapping, a phosphorothioate was positioned in the 5′-5′ bridge of arca (β-s-arca).
codon usage of the orf sequence also affects translation efficacy in many species. although in humans codon usage bias does not correlate with trna levels and gene expression [192, 193], increased protein expression from mrna upon codon usage optimization has nevertheless been reported. for instance, codon usage adaptation of hiv-1 gag improved protein yield from mrna approximately 1.6-fold in a human t lymphocyte cell line [194]. a more pronounced increase in protein expression as a result of codon optimization was reported upon transfection of mrna encoding angiotensin-converting enzyme 2 into a549 and hepg2 cells [195]. however, rather than being a direct effect of codon usage, the enhancement may be an indirect result of the use of modified nucleotides, the content of which was altered by codon optimization. furthermore, coding sequence engineering exploiting more advanced concepts like codon optimality may prove valuable in designing therapeutic mrnas of high efficacy. according to recent insights, codon usage may also affect fidelity of translation or the stability of transcripts [196]. in addition, orf as well as utr sequences can have an effect on immunostimulation and thus on translational activity. initially, researchers applied mrna containing only the four unmodified bases a, u, c, and g [158, 159, 160, 197]. in this context, for instance, u-rich sequences as well as several rna structural features were described as immunostimulatory due to interactions with various rna sensors such as toll-like receptors (tlr), rig-i, and protein kinase r (pkr) [198] [199] [200] [201] [202] [203] [204] [205]. as a consequence, development of mrna therapies was hampered by the immunogenicity of in vitro transcribed mrna. the consideration of mrna for therapeutic purposes gained momentum by the finding that incorporation of modified bases into in vitro transcribed mrna reduced immunostimulation of such preparations.
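to illustrate how codon choice can change the content of a particular nucleotide, and thus the number of positions at which a modified uridine analog would be incorporated, the following sketch recodes a toy orf with uridine-poorer synonymous codons. it is not the procedure used in the cited studies; the substitution table is a deliberately partial, hand-picked subset of the standard genetic code, and the names `LOW_U_SYNONYMS`, `uridine_fraction` and `recode_low_u` are hypothetical.

```python
# Partial, illustrative table: each key is replaced by a synonymous codon
# (same amino acid in the standard genetic code) that contains fewer uridines.
LOW_U_SYNONYMS = {
    "GCU": "GCC",  # Ala
    "CUU": "CUG",  # Leu
    "UUU": "UUC",  # Phe
    "GGU": "GGC",  # Gly
    "AUU": "AUC",  # Ile
    "GUU": "GUG",  # Val
    "CCU": "CCC",  # Pro
    "ACU": "ACC",  # Thr
    "UAU": "UAC",  # Tyr
}


def uridine_fraction(rna: str) -> float:
    """Share of uridines among all nucleotides of an RNA sequence."""
    rna = rna.upper()
    if not rna:
        raise ValueError("sequence must not be empty")
    return rna.count("U") / len(rna)


def recode_low_u(orf: str) -> str:
    """Swap codons for uridine-poorer synonyms where the table has an entry."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    return "".join(LOW_U_SYNONYMS.get(codon, codon) for codon in codons)


if __name__ == "__main__":
    toy_orf = "AUGGCUUUUCUUGGUUAA"  # Met-Ala-Phe-Leu-Gly-stop (toy example)
    recoded = recode_low_u(toy_orf)
    print(toy_orf, f"{uridine_fraction(toy_orf):.2f}")
    print(recoded, f"{uridine_fraction(recoded):.2f}")
```

the encoded peptide is unchanged, but the uridine count of the toy orf drops from 9 to 5 of 18 nucleotides, so an experiment comparing the two sequences with fully modified uridines would vary codon usage and modified-nucleotide content at the same time.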
various modified bases found in natural rnas suppressed recognition by tlrs in vitro [206] . among these, particularly pseudouridine increased translation and stability of mrna [207] . in addition to the effect on tlr binding, replacement of uridine by pseudouridine affected binding to and activation of further sensors such as pkr and 2′-5′-oligoadenylate synthetase, contributing to higher and longer protein expression [208, 209] . however, the effects of mrna modifications on translation, immunostimulation and resulting protein expression appear to be variable, for instance dependent on the type of target cells. in vitro testing of various modifications and combinations thereof revealed decreased protein yield for any of them in a context dependent manner, while most of them reduced immunostimulation in raw124.7 macrophages [210] . as found in vitro, pseudouridine modification of mrnas reduced immunostimulation and increased level and duration of protein expression after intravenous (iv) or intraperitoneal administration of formulated mrna in mice [188, 207] . in contrast, another study on systemic administration of nanoparticle-complexed mrna concluded that neither immunostimulation nor protein expression benefited from pseudouridine modification [211] . after intradermal or intramuscular injection of formulated mrna, only n1-methyl-pseudouridine, but not pseudouridine, substantially enhanced expression [212] . in line, a different study applying lipid nanoparticle (lnp)-formulated mrna intradermally found that replacement of uridine with n1-methylpseudouridine resulted in much increased and longer lasting protein expression [213] . notably, recent studies revealed that also endogenous eukaryotic mrna harbors various modified nucleotides [214] [215] [216] [217] . however, the total level of modification is rather low which is in strong contrast to the usually 100% replacement of an unmodified nucleotide in ivt mrna. 
moreover, heavy nucleotide modification appears to interfere with the function of translation-enhancing rna elements such as utrs and internal ribosomal entry sites (ireses) [218] . at the time when modified nucleotides were introduced to minimize immunostimulation of ivt mrna, the importance of stringent purification of such preparations was recognized. chromatographic purification, particularly hplc, can separate mrna according to size, thereby removing smaller or larger by-products such as abortive transcripts, mrna from traces of non-linearized dna template or double-stranded rna (fig. 5) [219, 220] . such purification enhanced protein yield, most probably by enriching functional transcripts and depleting contaminants causing detrimental immunostimulation [220, 221] . the latter is corroborated by the finding that stringent purification reduced the beneficial effect of chemical modification or even made it dispensable, particularly in combination with specific sequence-engineering of the mrna [218, 220] . importantly, the specifics of mrna formulation are another layer that can influence activation of the innate immune system by masking mrna from recognition by sensors, particularly tlrs. for example, immune responses after mrna administration into the central nervous system were effectively suppressed by the use of a nanomicelle formulation compared to naked mrna [222] . after in vivo administration of mrna was proven to be feasible, the concept of using mrna as a basis for therapeutics was pursued almost immediately. the very first report on a therapeutic effect with exogenous mrna was already published in 1992 and described a temporary reversion of diabetes insipidus in rats by intrahypothalamic injection of vasopressin mrna [197] . thereafter, it took almost two decades until further studies started to demonstrate the broad potential of mrna-based protein therapies. 
meanwhile, there is a plethora of publications on a huge variety of indications comprising anemia [188, 218] , hemophilia [223, 224] , myocardial infarction [155, 225] , cancer [226, 227] , lung disease such as surfactant b deficiency and asthma [228] [229] [230] , metabolic disorders [231] [232] [233] [234] [235] , fibrosis [195] , skeletal degeneration [236] , tendon impairment [237] , and neurological disorders such as sensory nerve dysfunction, friedreich's ataxia and alzheimer's disease [238] [239] [240] . whereas evidence for the therapeutic potential of mrna is mostly restricted to mouse models, first data in swine indicate that mrna-based protein therapies are feasible also in large animals [218, 225] . in view of the various indications, it is hardly surprising that this diversity goes along with different routes of administration and various formulations. only very few studies looking at local administration used uncomplexed and thus unprotected mrna [225, 228, 230, 237] . the majority of investigations built on lipid-based formulations with a clear tendency to the application of lnps [223, 224, [231] [232] [233] 239] . most if not all groups purified their ivt mrna before in vivo administration. while some simply precipitated the mrna [195, 227, 236] , most used commercial purification kits. only a few researchers applied hplc purification [188, 218, 232] . with respect to the mrna, the vast majority of studies used long poly(a) tails of at least 100 nucleotides. likewise, there is a clear prevalence to chemically modified mrna, although various examples suggest that this is not mandatory [218, 223, 239, 240] . while most mrnas harbored 5-methyl-cytosine and/or pseudouridine initially [155, 188, 226] , there appears to be a trend towards the use of n1-methyl-pseudouridine at present [224, 225] . regarding the cap structure, almost all early studies cotranscriptionally generated cap0 mrnas using arca [226, 228, 229] . 
however, for about 2 years, research groups have preferred to apply mrnas with a cap1 5′-end [223, 227, 231].

initial attempts to apply mrna to passive immunotherapy focused on cellular approaches for various reasons. adoptive transfer of ctls equipped with either an additional tcr or a car had shown great promise in cancers and viral infections. in contrast to typical scenarios of antibody therapy, receptor expression usually requires much lower protein levels. furthermore, t cells are loaded with receptor-encoding nucleic acid (dna or mrna) ex vivo. hence, passive cellular immunotherapy does not require sophisticated and highly efficient formulations for in vivo delivery but can build on the armamentarium of cell transfection methods. previous work on active cellular vaccination with antigen-presenting cells that had been transfected with antigen-encoding mrna revealed electroporation as an easy and efficient means to load cells [241]. comparison to passive cell pulsing and lipofection demonstrated that electroporation was also most efficient for transfection of t lymphocytes [242]. rna electroporation had up to 90% efficiency without eliciting any critical toxicity [243]. onset of transgene expression was very rapid and expression lasted about 7 days [243]. receptor transfer into t cells by mrna electroporation has now been well established for many years [243, 244]. moreover, gmp-compliant protocols for manufacturing receptor-expressing t cell preparations via mrna electroporation are available today [245, 246]. electroporation of human t lymphocytes with various antigen-specific tcrs redirected them to recognize cancer cells in an mhc-dependent manner in vitro [243]. mrna-mediated tcr expression conferred in vitro cytotoxicity to t cells for at least 72 h [244]. the lytic efficacy of such cells was comparable to retrovirally transduced lymphocytes [244, 247]. likewise, transfection of car-encoding mrnas generated cells that were lytically active in vitro.
using an optimized ivt mrna for a cd19-specific car, surface expression and cytotoxic function were detectable for up to 10 days [248]. to avoid as many manipulation steps as possible in generating t cells for adoptive transfer, it was demonstrated that human peripheral blood lymphocytes instead of purified t lymphocytes could be used as well to elicit strong cytotoxicity in vitro upon electroporation of a car mrna [249]. currently, most car approaches deploy αβ t cells. however, γδ t lymphocytes are an attractive target as well due to their antitumor effector function which is not mhc restricted and does not require co-stimulation. accordingly, mrna-mediated tcr and car expression in such cells was investigated very recently and shown to kill target cells in an antigen-specific manner [246]. to the best of our knowledge, there is so far just one study that started to systematically analyze the role of different mrna elements for receptor expression in t cells. to this end, the group of carl june built on previous findings in dendritic cells which revealed the superiority of a duplicated β globin 3′-utr over a single copy of the same element and of a long (120 nt) over a short (16-51 nt) poly(a) tail [183]. with respect to receptor expression, a poly(a) tail of 150 adenosines enhanced expression compared to one of 64 adenosines [184]. a tandem repeat of the β globin 3′-utr also had a beneficial effect, particularly in combination with a long poly(a). in contrast, the vegf translational enhancer used as 5′-utr element even had detrimental consequences. the authors speculated that this may be due to reduced capping efficacy but did not provide data corroborating their hypothesis. however, they demonstrated the important role of the cap structure. co-transcriptional cap0 using arca and an enzymatically generated cap1 structure were equivalent and outperformed the basic cap analog as well as an enzymatic cap0. besides expression level, capping also appeared to have an effect on the persistence of expression.
modification of the orf sequence by removing all internal orfs had no effect on receptor production. t lymphocytes transfected with tcr- or car-encoding mrna proved to be functional also in vivo. robust antitumor effects were observed in various preclinical models [184, 249]. they were mrna-specific, since mock-transfected t cells had no or very little unspecific impact. although mrna-mediated receptor expression is transient, a single injection of car t cells against cd19 was sufficient to prolong survival of mice [248]. using peripheral blood lymphocytes instead of purified t cells for car mrna transfection (see above) also enabled a strong antitumor response in vivo although those cells could not persist long-term in vitro [249]. very recently it was shown that mrna can not only be used to drive receptor expression but can also support the generation of t cells for adoptive immunotherapy. by expressing a chimeric membrane protein against cd3, cells could be efficiently stimulated and expanded in vitro [250]. after transfection with an mrna for an anti-cd19 bite, those cells mediated sustained reduction in tumor burden upon intraperitoneal injection. based on encouraging preclinical data, adoptive t cell therapies using mrna have already been subjected to first clinical testing. in a phase 1 trial on solid tumors addressing the safety and feasibility of using such cells, car-transfected lymphocytes migrated to tumor sites after iv injection [251]. in addition, the study appeared to provide initial evidence of antitumor activity. due to the transient nature of mrna expression, subjects received repeated infusions of t cells. this led to an anaphylactic response in one patient who developed antibodies specific to the scfv domain of the car [252]. however, this could be a consequence of the murine origin of this domain. in another trial, mrna-transfected car t cells were injected intratumorally in metastatic breast cancer patients [253].
treatment was well tolerated and elicited an inflammatory response within tumors. as discussed above, the use of viral vectors for adoptive t cell therapy has potential safety issues. in various studies, authors ascribed toxicities particularly to the persistence of receptor-expressing t cells. regarding such concerns, mrna-mediated tcr or car expression offers at least two advantages. first, mrna does not integrate into a cell's genome, thus excluding genotoxicity. second, due to its transient nature, any potential toxicities accompanying treatment are temporary as well [254] . however, the increased level of safety has a substantial drawback. apparently, clinical efficacy correlates with long-term persistence of receptor-engineered t cells [255, 256] . as a consequence, mrna-transfected cells are expected to have limited antitumor activity because of rapidly declining receptor expression. substantial mrna translation only lasts about 1 day which translates into efficacious receptor levels on the cell surface for several days [183, 184] . importantly, clinical studies demonstrated that iv infused t lymphocytes reached tumor sites only 2-3 days after administration [257, 258] . as a consequence, intratumoral or intraarterial administration was suggested to counteract the delayed cell arrival. the problem of transience of mrna is further enhanced by the well-known ligand-induced receptor internalization. upon target recognition, tcrs as well as cars are rapidly internalized, a mechanism which is important for proper signal transduction [259, 260] . this explains why lentiviral vectors generated a more robust treatment effect than mrna [184] . mrna transfection can give rise to high receptor expression, equaling lentiviral vectors [261] . however, mrna was only similarly effective during the first hours after electroporation. later, contact to target cells strongly down-regulated receptor on the cell surface while lentiviral expression remained constant [261] . 
in comparison to a single transfer of cells with retroviral car expression, an mrna-encoded receptor required three consecutive lymphocyte infusions to obtain a comparable antitumor effect [262]. these observations and considerations may be put into perspective at least in part by the finding that transferred t cells can become tolerized rather rapidly, thereby losing their ability to function in the tumor microenvironment [263]. thus, frequent injection of t cells may be desired even for viral vector transduced t cells. notably, mrna was also considered to be of use in settings that require long-term expression for therapeutic efficacy and preclude repeated cell administration. the greater safety of mrna-transfected t cells may make the initial testing of novel antigens and receptors with unknown on-target/off-tissue toxicity less hazardous [248].

passive immunization with antibodies often requires considerable amounts of polypeptide to obtain therapeutically active concentrations after systemic administration. this poses a substantial challenge to the broad applicability of any nucleic acid mediated passive immunization strategy. thus, compared to the very first attempts regarding in vivo protein expression with mrna in the 1990s, the optimization of ivt mrna to enhance and extend expression is a prerequisite for many mrna-based passive immunotherapies. as reviewed above, great progress towards this goal was made during almost the last three decades. another potential hurdle for passive mrna immunization is related to delivery. to obtain high levels of in vivo antibody expression, the mrna should be targeted to as many "producer" cells as possible which in turn should be transfected with high efficiency. in addition, the mrna, which is prone to degradation by rnases, should be protected against these ubiquitous nucleases in an appropriate manner. moreover, a viable mrna complexation reagent needs to be well tolerated. 
notably, various commercial transfection reagents can be used to formulate mrna and suffice for research purposes. among them, transit has been repeatedly used for in vivo studies [188, 218, 264]. with respect to potential therapeutic applications, the class of lipid nanoparticles (lnps) became the most widely deployed means of complexation [223, 224, 231-233]. after iv delivery, lnps mainly route to the liver on the basis of an apolipoprotein e (apoe)-dependent mechanism [265]. however, such nanoparticles were demonstrated to be also applicable to intramuscular and subcutaneous administration [266]. advances in mrna and formulation technology led to a couple of recent intriguing studies on passive immunization with mrna (fig. 6). while the group of drew weissman described successful passive mrna immunization for prophylaxis of viral infections [267], stadler et al. demonstrated the applicability of mrna-mediated antibody expression for cancer immunotherapy [264]. the feasibility of using mrna for such indications was confirmed by thran et al. who applied different mrna-encoded antibody formats to diverse biological threats: viruses, toxins, and tumors [268]. finally, sabnis and colleagues presented first antibody expression data in nonhuman primates (nhps) in a publication dealing with the development of novel lnp formulations [269].

fig. 6 schematic illustration of mrna-mediated passive antibody immunotherapy. for in vivo administration, mrna is usually formulated in nanoparticles which for instance can be administered by iv injection. for many formulations, liver is the main target organ. upon uptake of nanoparticles by hepatocytes and release of the mrna into the cytosol, it is translated into antibodies that are typically secreted into circulation and finally bind their cognate antigens.

mrna design and formulation

fundamental designs of antibody-encoding mrnas reveal only a few common features but several differences (table 1). 
among the latter, the exclusive use of chemically unmodified nucleotides by thran et al. as in a previous publication from the same group [218] may be the most prominent one, since it contrasts to all other reports. other differences are much more diverse among these studies. hence, they do not provide an unequivocal guidance for future work, but at least commonalities may be taken as a recommendation. although obviously not mandatory, mrna with cap1 structure was clearly preferred. in addition, all mrnas harbored a poly(a) tail. the use of bipartite poly(a) elements by some groups may be owed to the experience that maintenance of long poly(d[a/t]) vector sequences is challenging and strongly dependent on bacterial strains [186] . beyond these common rna elements, the publications suggest that mrna should be subjected to further optimizations to exploit its full potential for antibody expression. however, different strategies appear to be applicable, but little is known about the interchangeability of individual elements. finally, chromatographic purification of ivt mrna appears to be generally recommended as well (table 1) . whether pardi et al. actually used fplc as stated throughout their report instead of hplc applied by other groups is not fully clear, since they referred to earlier publications describing the use of hplc [220, 270] . pardi et al. encoded a well-known, broadly neutralizing antibody against hiv-1, vrc01 [267] . to this end, heavy and light chains of the full igg antibody were represented on separate mrna molecules. for delivery, heavy and light chain mrnas were mixed in a molar ratio of 1:1. likewise, thran et al. used separate mrnas to encode heavy and light chain of various full igg antibodies [268] . titration of heavy and light chain mrna found a molar ratio of approximately 1.5:1 to be optimal for co-delivery. neither report provides a rationale for encoding chains on separate molecules. 
in principle, a bicistronic construct separating heavy and light chain by an ires sequence could have been used, as could an mrna for a single polypeptide in which a 2a sequence between heavy and light chain leads to separate antibody chains by ribosome skipping [271]. for pardi et al. the observation that modified nucleotides can hamper the function of ires elements may have affected the selection [218]. sabnis et al. also worked with a full igg antibody, directed against influenza a, but did not provide any details on how heavy and light chain were represented [269]. in contrast, stadler et al. chose bite antibodies directed against tcr-associated cd3 and one of three different tumor-associated antigens (taas) [264]. they displayed the bites as fab(scfv)₂ or scfv₂ molecules but focused on the latter format. their findings on single-chain antibodies are complemented by thran et al. whose work covers single domain-derived vnas in addition to igg antibodies [268]. for iv administration of antibody-encoding mrna all but the group of ugur sahin used lnp formulations which, however, may differ from each other in composition (table 1). the latter team exploited transit but switched the route of administration which had been intraperitoneal in previous studies [188, 218]. as with lnps, these nanoparticles were shown to mainly target the liver upon iv injection [264]. drew weissman's group administered 30 µg of vrc01-encoding mrna in most in vivo studies [267]. this corresponded to doses between 1 and 1.4 mg/kg due to differences in mouse weight among experiments. antibody serum titers 24 h after administration, the earliest time of analysis, ranged between approximately 80 and 200 µg/ml in various mouse strains. obviously, slight differences in dosage, as well as the respective strain contributed to varying peak levels. notably, each doubling of the administered mrna dose enhanced serum titers by more than twofold. 
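the relation between the absolute amount of mrna per animal and the weight-normalized dose quoted above is simple arithmetic, and a short sketch may make it explicit. the python snippet below is an illustration only and not part of the cited work: the function names are hypothetical, and the body weights it returns are merely the values implied by the reported figures (30 µg per mouse, 1 to 1.4 mg/kg), not measured weights.

```python
def dose_mg_per_kg(mrna_ug: float, body_weight_g: float) -> float:
    """Weight-normalized dose; ug per g is numerically equal to mg per kg."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return mrna_ug / body_weight_g


def implied_body_weight_g(mrna_ug: float, dose: float) -> float:
    """Body weight (g) implied by an absolute dose (ug) and a dose in mg/kg."""
    if dose <= 0:
        raise ValueError("dose must be positive")
    return mrna_ug / dose


if __name__ == "__main__":
    # 30 ug per animal and the 1 to 1.4 mg/kg range are taken from the text;
    # the resulting weights are derived values, not reported data.
    for dose in (1.0, 1.4):
        weight = implied_body_weight_g(30.0, dose)
        print(f"30 ug at {dose} mg/kg implies a body weight of about {weight:.1f} g")
```

under these assumptions, the reported dose range corresponds to mice weighing roughly 21 to 30 g.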
moreover, 30 µg of mrna in lnps generated higher serum titers than 600 µg of recombinant vrc01 protein. the kinetics of antibody serum titers revealed an accelerated decline after about a week in balb/c mice. the kinetics in nsg mice appeared to be basically the same, since the level at 1 week after single administration was largely the same as in balb/c at this time. the observed kinetics may also explain why weekly injections of mrna-lnps in nsg mice did not show additive effects on serum titers at the times of analyses. since measurements were conducted 7 days after each treatment, antibodies from the preceding injection probably dropped to background levels within this 2-week period as observed in balb/c animals. such accelerated decline of protein titers after a few days is often indicative of the induction of an anti-drug antibody (ada) response [139] . the likelihood of such a response may be particularly high for the reported experiments, since the authors expressed a human antibody in mice. the emergence of adas cannot be fully ruled out because animals were not analyzed accordingly. however, the apparently similar kinetics in immunocompromised nsg mice which are unable to develop adas suggests a different explanation for the pharmacokinetics. possibly, the mrna continues to express antibody for a few days which would inevitably lead to a seemingly extended antibody serum half-life during that period. only after expression ceases, the actual shorter antibody half-life becomes evident. while this could also easily explain the serum profile of repeated treatment of nsg mice, it remains hypothetical due to the lack of respective analyses. stadler et al. first characterized their mrna in vitro demonstrating expression and secretion of functional antibodies [264] . 
in a pbmc-mediated killing assay, mrna-derived bite antibodies targeted ctls to tumor cells via binding to cd3 on pbmcs and to the cognate taa on tumor cells, thereby inducing t cell activation and tumor cell lysis. these antibodies were equally potent as the corresponding recombinant protein. in immunodeficient nsg mice, antibody plasma levels peaked within 6 h, but rapidly declined by more than 50% within the next 18 h. subsequently, the decrease of bite titers became much slower. the authors did not provide an explanation of this striking kinetics. a pharmacokinetic analysis in non-tumor-bearing mice could have elucidated whether the initial kinetics reflects the trapping of antibody in the engrafted tumor until saturation of binding sites. bite plasma levels above background for a few days were in accordance with the sustained ex vivo cytotoxicity of plasma from mrna-treated mice. in contrast to the antibody plasma kinetics, cytotoxicity showed a steady and slow decline during the observation period. 0.05 µg of mrna were already sufficient to obtain strong plasma activity in the ex vivo killing assay. 5 µg of mrna (approx. 0.25 mg/kg) were comparable to 4-7 µg of recombinant antibody with respect to peak plasma concentrations that were in the range of 6.5 µg/ml in nsg mice. as opposed to this modified and hplc-purified mrna, antibody plasma levels were almost undetectable with mrna preparations without modification and chromatographic purification. plasma titers declined much faster for recombinant protein compared to mrna, thereby demonstrating the substantial impact of mrna on bite pharmacokinetics. consequently, only mrna was able to maintain a sustained cytotoxic activity of plasma by weekly administrations. the various mrna-encoded antibodies of thran et al. included vrc01 which had been used in drew weissman's work [267, 268] . however, analyses were limited to in vitro characterization, preventing a direct comparison between studies. 
as observed for bites, igg and vna antibodies produced from mrna in vitro revealed potencies comparable to that of the respective recombinant proteins. iv administration of 40 µg of unmodified mrna (approx. 2 mg/kg) gave rise to antibody serum titers between 15 and 400 µg/ml in immunocompetent mice. this contrasts strongly with the finding of stadler et al. who found unmodified mrna to be basically inactive. differences in purification and mrna design may be responsible for this striking discrepancy. similar to the weissman work, thran and colleagues observed a disproportionate increase of antibody serum titers with elevated mrna doses. onset of antibody expression was rather rapid, being already substantial 2 h after injection and reaching peak levels after approximately 4 h. this confirms findings on other mrna-mediated protein therapies showing that mrna starts accumulating in hepatocytes within minutes after administration and leads to substantial protein levels within a couple of hours [223, 231]. serum half-life of igg antibodies appeared to be in the range of 1 week and thus slightly longer compared to pardi et al. [267]. as in the latter study, one of two iggs showed an accelerated decline after about 1 week, however, only in approximately half of the animals. here, the expedited clearance could be assigned to the development of an ada response against the mrna-encoded antibody. importantly, this response was antibody-dependent and not intimately linked with the use of mrna. as expected, vnas revealed a much shorter serum half-life of about 1-2 days. compared to published kinetics data on recombinant vnas, mrna appeared to contribute to extended antibody availability during the first days after administration, as has been observed for bite-expressing mrna by ugur sahin's group. however, the lack of a head-to-head comparison hampers a detailed analysis. 
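the idea that ongoing translation from mrna can make an antibody appear longer-lived than it is during the first days can be illustrated with a minimal one-compartment sketch. this is not a model used in any of the cited studies: it assumes that antibody production decays exponentially together with the mrna and that antibody elimination is first-order, and all parameter values (a 3-day antibody half-life, a 1-day half-life of mrna-driven production) as well as the function names are hypothetical choices for illustration.

```python
import math


def antibody_level(t_days, ab_half_life, mrna_half_life, k_prod=1.0):
    """Serum antibody (arbitrary units) at time t for the model
    dA/dt = k_prod * exp(-k_m * t) - k_el * A, with A(0) = 0."""
    k_el = math.log(2) / ab_half_life    # first-order antibody elimination rate
    k_m = math.log(2) / mrna_half_life   # decay rate of mRNA-driven production
    if math.isclose(k_el, k_m):
        # limiting case of equal rates
        return k_prod * t_days * math.exp(-k_el * t_days)
    return k_prod / (k_m - k_el) * (
        math.exp(-k_el * t_days) - math.exp(-k_m * t_days)
    )


def apparent_half_life(t1, t2, **model):
    """Half-life one would read off the serum curve between t1 and t2.
    Only meaningful after the peak, where the level is declining."""
    a1 = antibody_level(t1, **model)
    a2 = antibody_level(t2, **model)
    return math.log(2) * (t2 - t1) / math.log(a1 / a2)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to illustrate the effect.
    params = dict(ab_half_life=3.0, mrna_half_life=1.0)
    early = apparent_half_life(3.0, 5.0, **params)
    late = apparent_half_life(10.0, 12.0, **params)
    print(f"apparent half-life, days 3-5:   {early:.2f} d")
    print(f"apparent half-life, days 10-12: {late:.2f} d")
```

in this sketch the apparent half-life shortly after the peak is longer than the true antibody half-life and approaches the true value only once production from the mrna has faded, which is qualitatively the pattern discussed above. whether this explains the published serum profiles remains open, since the model ignores distribution, target binding, and ada responses.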
while previous in vivo studies on mrna-mediated antibody expression were limited to mice, sabnis et al. presented expression results in nhps using a proprietary lnp formulation [269]. a 0.3 mg/kg dose of mrna gave rise to antibody serum titers of about 4 µg/ml 24 h after iv administration which is at least at the lower end of the range of efficacy observed in mouse studies. however, data on a different protein suggest that efficacy of the formulation may be slightly lower in nhp than in mouse. whereas a 0.5 mg/kg dose induced protein levels of approximately 7 µg/ml in mice, a 0.2 mg/kg dose generated protein titers between 200 and 800 ng/ml in nhps. all mouse studies on mrna-encoded antibodies investigated their therapeutic efficacy. pardi et al. used two different humanized mouse models to demonstrate that mrna-derived vrc01 protects from hiv-1 challenge [267]. mrna encoding a reporter protein was utilized as control. mrna-lnps were administered 24 h prior to challenge with one of two hiv-1 isolates. in the authors' first model, a vrc01 mrna dose of 0.35 mg/kg was ineffective, but 0.7 mg/kg already reduced viral rna copies in the plasma to undetectable levels as assessed by quantitative real-time polymerase chain reaction (qrt-pcr). the latter dose is well below the 10-20 mg/kg doses that are typically used for prophylactic immunization with recombinant antibody in humanized mice to reach therapeutic concentrations [272, 273]. however, the authors did not titrate the dose of recombinant vrc01 but used a 28 mg/kg dose as control which was sufficient to completely eradicate viral rna copies in the plasma. mrna efficacy could also be demonstrated in the second mouse challenge model. to show in vivo efficacy of mrna-mediated bite expression, stadler et al. implanted tumor cells subcutaneously in immunodeficient nsg mice [264]. about 1 week before mrna treatment, human pbmcs were engrafted into these animals. 3 µg of bite mrna (approx. 
0.15 mg/kg) given three times with an interval of 1 week could eliminate tumors entirely. in contrast, tumors progressed in control animals that received mrna encoding a reporter protein. the recombinant bite required three injections per week and a total of ten injections of 4-7 µg each to obtain an antitumor effect comparable to that of bite mrna. the need for a more frequent administration corroborated the previous finding that mrna substantially improved antibody plasma half-life. due to the diversity of antibodies included in their study, thran et al. utilized various disease models for demonstrating therapeutic efficacy [268]. in contrast to all other studies, the authors applied mrna encoding irrelevant antibodies instead of a reporter protein as control. a 40 µg dose (approx. 2 mg/kg) of antibody mrna could protect mice from challenges with either rabies virus or botulinum toxin. in the intoxication model, mrna was proven to be as protective as recombinant antibody. however, mice received approximately 0.1 mg/kg of recombinant vna compared to approximately 2 mg/kg of mrna. based on protein expression levels from mrna dose titrations, lower doses than 2 mg/kg may still confer full protection but this remains hypothetical, since the authors did not conduct an mrna dose titration in their challenge model. notably, mrna was effective in pre- as well as post-exposure settings. the latter is important for some indications of passive immunization and confirms the aforementioned rapid onset of antibody expression. the post-exposure scenario for botulinum toxin requires very rapid availability of neutralizing antibodies. whereas recombinant protein can act immediately after administration, mrna needs more time to provide the antibody. hence, it may well be that in such instances higher doses of mrna than of protein are required, not for obtaining the same peak level but for reaching meaningful titers in a timely manner. in a further model, thran et al. 
evaluated their mrna approach with respect to anti-tumor efficacy. using a disseminated tumor model for rituximab, they showed efficient tumor growth control with injections of 50 µg (approx. 2.5 mg/kg) of rituximab mrna twice a week. higher doses (200 µg, approx. 10 mg/kg) of recombinant rituximab were less potent. this finding is reminiscent of results of drew weissman's group and contrasts those of ugur sahin and colleagues who required similar doses of mrna and recombinant protein (but less frequent dosing with mrna) to obtain equivalent therapeutic effects [264, 267] . notably, the difference among studies may be related to the use of igg antibodies on the one hand and a scfv 2 protein on the other hand. moreover, the irrelevant antibody control used by thran et al. appeared to have a slight unspecific anti-tumor effect. it may have contributed to the superiority of mrna compared to recombinant protein regarding dosing. amongst other explanations, the potential unspecific effect may be due to an mrna-lnp-independent response to repeated treatment or may be the consequence of a weak and transient cytokine response observed after mrna-lnp administration. however, the phenomenon was not investigated further. in line with previous reports, pardi et al. confirmed the importance of highly purified ivt mrna. in combination with lnps, only modification with n1-methyl-pseudouridine plus chromatographic purification was sufficient to avoid cytokine release by innate immune activation [267] . for this analysis, however, the authors deployed an mrna encoding a different protein than the vrc01 antibody which had been used for in vivo expression and efficacy experiments. tolerability of mrna-lnps was also addressed by repeated mrna treatments. translation of vrc01 mrna was not compromised over time, but the analysis was conducted in immunodeficient nsg mice. 
to overcome this caveat, the authors complemented their study by repetitive treatment of immune competent balb/c mice. to this end, they switched to an endogenous protein, since human vrc01 may be recognized as foreign and thus elicit an immune response. again, mrna injections did not lose efficacy over time. however, the authors also changed the formulation (transit instead of lnp) as well as the route of administration (intraperitoneal instead of iv) compared to the use of vrc01 mrna. hence, evidence for immune silence and overall tolerability of vrc01 mrna in lnps is just circumstantial yet. stadler et al. did not observe any liver toxicity upon treatment with mrna in transit according to liver enzyme analyses [264] . moreover, bite mrna administration did not elevate murine cytokines such as ifnα and tnfα above background in plasma. likewise, analysis of systemic human cytokine release from engrafted pbmcs did not show any unspecific t cell activation. as opposed to modified and hplc-purified mrna, preparations without nucleotide modification and chromatographic purification induced detectable levels of murine cytokines. similar to the weissman group, the authors also assessed the tolerability of formulated mrna by repeated injections. administrations did not lose efficacy over time, but as in the corresponding weissman experiment immunodeficient nsg mice were used. using chemically unmodified mrna formulated in lnps, thran et al. did not observe any liver toxicity in histopathological analyses [268] . only a few animals developed an ada response which was dependent on encoded antibody and was, thus, no intrinsic consequence of treatment with mrna-lnp. in addition, treatment appeared to elicit a transient weak cytokine release which, however, neither suppressed antibody expression nor induced adverse effects. 
since there are ample differences among studies on mrna-mediated antibody expression and no detailed analyses of the issue, the role of mrna, lnp, and/or encoded antibody/protein in cytokine induction remains elusive. an earlier study on erythropoietin showing the absence of any appreciable immunostimulation suggests that the use of chemically unmodified instead of modified mrna is not the decisive parameter [218].

quite a few in vivo studies provided compelling evidence for the feasibility in principle of mrna-based immunotherapies. as discussed above, challenges and open questions regarding adoptive t cell transfer are less related to the mrna and its formulation or transfection but are more fundamental in character. in contrast to ex vivo loading of cells, mrna-mediated antibody expression is strongly affected by body size. thus, while there are now convincing efficacy data in diverse small rodent models, the translation to larger animals and finally humans has still to be demonstrated. first data suggest that substantial expression can be obtained in small nhps. however, the utilized lnps appeared to lose some efficacy when switching from mouse to nhp. hence, the development of human therapies may perhaps require further advancements of the mrna technology as well as primate-specific formulations with improved efficacy. in addition, tolerability of formulations has to be analyzed further and in more depth in the future. for instance, repeated dosing of nanoparticles can induce complement activation-related pseudoallergy (carpa) [274]. however, this can in principle be counteracted by optimization towards better biocompatibility. in case of lnps, fast degradation was shown to be particularly important [231, 232, 269]. while antibodies for cancer treatment were initially developed for iv administration, there is a trend towards subcutaneous injections today. 
for instance, rituximab was initially formulated for iv infusion which is typically administered over a period of 1.5-6 h [275]. this treatment schedule poses a substantial burden to patients as well as the healthcare system. thus, a formulation which reduces the time and required resources would be advantageous. to meet these goals by subcutaneous administration, the antibody solution was concentrated 12-fold [276, 277]. since this volume was still too large for subcutaneous injection, rituximab was co-formulated with human hyaluronidase which limits swelling and associated pain by increasing the dispersion and absorption of co-administered substances [276, 278]. now, median administration time for rituximab using the subcutaneous route is 6 min. as a consequence, antibody immunotherapy with mrna requires not only competitive efficacy and costs but also competitive routes of administration to become a viable alternative to recombinant proteins. although other routes than iv have been shown to be possible for mrna, there are still a few open questions to be addressed by future studies.

what are the advantages of using mrna for antibody immunotherapies? compared to dna it may be primarily the safety aspect. concerning recombinant proteins various points matter. as reviewed above, mrna provides benefits as to the pharmacokinetics when short-lived antibodies such as scfv, (bi)-scfv₂ or vna are used. moreover, solving the challenges of antibody cocktails may be easier using mrna. different mrna sequences are much more similar with respect to their physicochemical characteristics than different proteins are. hence, producing a cocktail may be less demanding for mrna compared to protein. however, co-delivery and thus co-expression entails the risk of antibody chimerism and thus requires specific solutions such as knob-into-hole concepts [68]. 
last but not least, while proteins are difficult to deliver directly through the cell membrane [279], mrna-mediated protein expression makes a large number of potential intracellular targets accessible to antibody immunotherapy. particularly single-chain and single-domain formats are amenable to functional expression in the cytosol and thus suited as intrabodies, since they are less dependent on disulfide bond formation [280, 281]. the value of targeting intracellular proteins has already been demonstrated by various studies. a bispecific scfv could restore p53 function in mutant p53 colon cancer cells and trapping ccr5 in the er via an intrabody reduced hiv cell entry [282, 283]. support for the potential of intrabodies as therapeutics also comes from further work in the field of oncology or neurodegenerative diseases [284-286]. although it has been recently demonstrated that even a full antibody can be delivered into cells in vivo [287], it has been recognized that fusions of cell-penetrating peptides (cpps) and macromolecules are often trapped in endosomes instead of being released to the cytoplasm [288]. in contrast, nucleic acids including mrna can be very efficiently transfected into cells, making them ideal for the delivery of intrabodies. however, while lnps provide very efficient solutions for systemic delivery to the liver, formulations for routing mrna to other tissues are scarce today. with the recent burst of publications on successful applications of mrna-based antibody immunotherapy it is likely that there will be much to follow in the near future. 
pathogen recognition and innate immunity inside, outside, upside down: damage-associated molecular-pattern molecules (damps) and redox fundamentals and methods for t-and b-cell epitope prediction b-lymphocyte lineage cells and the respiratory system current advances in t-cell-based cancer immunotherapy foxp3 + regulatory t cells in the human immune system human dendritic cell subsets: an update correlates of protection induced by vaccination cancer immunotherapy via dendritic cells recombinant snakebite antivenoms: a cost-competitive solution to a neglected tropical disease? strategies and challenges for the next generation of therapeutic antibodies contributions of humoral and cellular immunity to vaccine-induced protection in humans ueber das zustandekommen der diphtherie-immunitat und der tetanus-immunitat bei thieren passive immunity in the prevention of rabies passive antibody therapy for infectious diseases monoclonal antibody-based therapies for microbial diseases passive immunity in prevention and treatment of infectious diseases use of concentrated human serum gammaglobulin in the prevention and attenuation of measles evaluation of red cross gamma globulin as a prophylactic agent for poliomyelitis. iv. 
final report of results based on clinical diagnoses serum therapy revisited: animal models of infection and development of passive antibody therapy the growth and potential of human antiviral monoclonal antibody therapeutics efficacy of immune plasma in treatment of argentine haemorrhagic fever and association between treatment and a late neurological syndrome treatment of argentine hemorrhagic fever rabies in the americas and remarks on global aspects clostridial infections botulism in the united states: a clinical and epidemiologic review heterologous antisera and antivenins are essential biologicals: perspectives on a worldwide crisis comparison of an anti-rabies human monoclonal antibody combination with human polyclonal anti-rabies immune globulin continuous cultures of fused cells secreting antibody of predefined specificity an overview of the use of the monoclonal antibody okt3 in renal transplantation overview of antibody phage-display technology and its applications a surface expression vector for antibody screening selecting and screening recombinant antibody libraries ribosome display human monoclonal antibody as prophylaxis for sars coronavirus infection in ferrets a neutralizing human monoclonal antibody protects african green monkeys from hendra virus challenge predominant autoantibody production by early human b cell precursors efficient generation of monoclonal antibodies from single human b cells by single cell rt-pcr and expression vector cloning therapeutic antibodies for autoimmunity and inflammation fcrn: the neonatal fc receptor comes of age clinical pharmacokinetics of therapeutic monoclonal antibodies monoclonal antibody pharmacokinetics and pharmacodynamics a theoretical model of gamma-globulin catabolism mediator of transmission of immunity and protection from catabolism for igg enhanced antibody half-life improves in vivo activity pharmacokinetics and pharmacodynamics of monoclonal antibodies: concepts and lessons for drug development 
acknowledgements we thank mariola fotin-mleczek for discussion on the manuscript. we are grateful to nigel horscroft and michael stolz for critical reading of the review. we also thank bettina danker for her graphical illustrations. finally, we apologize to those authors whose work was not cited owing to space limitations.
key: cord-222664-4qyrtzhu authors: coban, mathew; morrison, juliet; freeman, william d.; radisky, evette; roch, karine g. le; caulfield, thomas r. title: attacking covid-19 progression using multi-drug therapy for synergetic target engagement date: 2020-07-06 journal: nan doi: nan sha: doc_id: 222664 cord_uid: 4qyrtzhu covid-19 is a devastating respiratory and inflammatory illness caused by a new coronavirus that is rapidly spreading throughout the human population. over the past 6 months, severe acute respiratory syndrome coronavirus 2 (sars-cov-2), the virus responsible for covid-19, has already infected over 11.6 million people (25% of them in the united states) and killed more than 540,000 people around the world. as we face one of the most challenging times in our recent history, there is an urgent need to identify drug candidates that can attack sars-cov-2 on multiple fronts. 
we have therefore initiated a computational dynamics drug pipeline using molecular modeling, structure simulation, docking and machine learning models to predict the inhibitory activity of several million compounds against two essential sars-cov-2 viral proteins and their host protein interactors: s/ace2, tmprss2, cathepsins l and k, and mpro, to prevent binding, membrane fusion and replication of the virus, respectively. altogether, we generated an ensemble of structural conformations that increases high-quality docking outcomes to screen more than 6 million compounds, including all fda-approved drugs, drugs under clinical trial (>3000) and an additional >30 million selected chemotypes from fragment libraries. our results yielded an initial set of 350 high-value compounds, from both new and fda-approved compounds, that can now be tested experimentally in appropriate biological model systems. we anticipate that our results will initiate screening campaigns and accelerate the discovery of covid-19 treatments. covid-19 is a disease caused by severe acute respiratory syndrome coronavirus 2 (sars-cov-2). it was identified in wuhan city, in the hubei province of china, in december 2019 (chen et al., 2020; huang et al., 2020; zhu et al., 2020). the virus is spread between people via small droplets produced by talking, sneezing and coughing. the disease was declared a global pandemic by the world health organization (who) on march 11th, 2020. while a large proportion of cases result in mild symptoms such as fever, cough, fatigue, loss of smell and taste, as well as shortness of breath, some cases progress to more acute respiratory symptoms and complications such as pneumonia, multiple-organ failure, septic shock and blood clots. these more severe symptoms can lead to death and are likely to be precipitated by a cytokine storm after infection and multiplication of the virus in humans. 
indeed, recent data indicate that the levels of il-6 correlate with respiratory and organ failures (gubernatorova et al., 2020). so far, the estimated death rate of sars-cov-2 is above 1.3%, which is more than 10 times higher than the death rate of seasonal influenza (abdollahi et al., 2020). older patients and patients who have serious underlying medical conditions such as hypertension, diabetes, and asthma are at higher risk for severe disease outcomes. the genetics and molecular mechanisms controlling severe illness remain to be determined. sars-cov-2 is a positive-sense, single-stranded rna betacoronavirus, closely related to sars-cov-1, which caused severe acute respiratory syndrome (sars) in 2003, and middle east respiratory syndrome coronavirus (mers-cov), which caused mers in 2012. positive-strand rna viruses make up a large fraction of known viruses, including common pathogens such as rhinoviruses that cause common colds, as well as dengue virus, hepatitis c virus (hcv) and west nile virus. the first genome sequence of sars-cov-2 was released in early january on the open access virological website (http://virological.org/) (zhou et al., 2020). its genome is ~29.8 kb and possesses 14 open reading frames (orfs), encoding 27 proteins (wu et al., 2020a). the genome encodes four structural proteins: spike (s) glycoprotein, envelope (e) protein, membrane (m) protein, and nucleocapsid (n) protein. the e and m proteins form the viral envelope, while the n protein binds to the virus's rna genome. the spike glycoprotein is a key surface protein that interacts with the cell surface receptor angiotensin-converting enzyme 2 (ace2), mediating entry of the virus into host cells (zhu et al., 2018). in addition to its dependence on the binding of s to ace2, cell entry also requires priming of s by the host serine protease, transmembrane serine protease 2 (tmprss2). 
tmprss2 proteolytically processes s, promoting membrane fusion, cell invasion and viral uptake (heurich et al., 2014; hoffmann et al., 2020). blocking viral entry by targeting the s/ace2 interaction or tmprss2-mediated priming may constitute an effective treatment strategy for covid-19. the non-structural proteins, which include the main viral protease (nsp5 or mpro) and rna polymerase (nsp12), regulate virus replication and assembly. they are expressed as two long polypeptides, pp1a and pp1ab, which are proteolytically processed by mpro. the key role of mpro in viral replication makes it a good therapeutic target as well. a third group of proteins are described as accessory proteins. this group is the least understood, but its members are thought to counteract host innate immunity (kim et al., 2020, cell 181, 914-921) (fig. 1a). there is currently no treatment or vaccine available to prevent or treat covid-19 (baden and rubin, 2020; lurie et al., 2020) (https://www.fda.gov/news-events/press-announcements/coronavirus-covid-19update-daily-roundup-june). while the fda has granted emergency use authorization (eua) for the 65-year-old antimalarial drug hydroxychloroquine for covid-19 treatment, based on early results from clinical trials in china and france (gautret et al., 2020a; gautret et al., 2020b; million et al., 2020), more recent results reported that hydroxychloroquine does not decrease viral replication, pneumonia or hospital mortality, and may in fact increase cardiac arrest in patients infected with covid-19 (rosenberg et al., 2020). the accuracy of the statistical analyses in these studies raised serious concerns in the scientific community. more accurate data are needed to reach a conclusion about the effect of hydroxychloroquine in covid-19 patients. 
in another recent study published in the new england journal of medicine, the antiviral remdesivir, an unapproved drug that was originally developed to fight ebola, seemed to improve outcomes in patients with severe breathing problems (beigel et al., 2020) and has also recently been granted eua by the fda. repurposing drugs that were designed to treat other diseases is one of the quickest ways to find therapeutics to control the current pandemic. such drugs have already been tested for toxicity issues and can be granted eua by the fda to help doctors treat covid-19 patients. another efficient way to attack the virus is to use drug cocktails to target multiple enzymes/pathways used by the virus. combination therapy has the advantage of being less likely to select for treatment-resistant viral mutants. such a strategy has been successfully used to treat hepatitis c virus (hcv) and human immunodeficiency virus (hiv) infections. in the case of hcv, the treatment, epclusa, combines sofosbuvir, which inhibits the viral rna-dependent rna polymerase (ns5b), and velpatasvir, a defective substrate that inhibits ns5a. antiretroviral therapy (art) against hiv combines drugs from different drug classes to target disparate aspects of the hiv replication cycle. these drug classes include nucleoside reverse transcriptase inhibitors, non-nucleoside reverse transcriptase inhibitors, protease inhibitors, fusion inhibitors, ccr5 antagonists, post-attachment inhibitors, and integrase inhibitors. one example from the hiv-aids literature is a randomized trial that compared monotherapy with combination therapy in 4 groups of patients: zidovudine (zdv) monotherapy; zdv plus didanosine; zdv plus zalcitabine; or didanosine monotherapy. this randomized trial showed positive results when zdv was combined with didanosine or zalcitabine, and for didanosine compared to zdv monotherapy, in raising cd4 counts greater than 50% (hammer et al., 1996). 
combination therapy has become the standard-of-care initial treatment in other infectious diseases such as tuberculosis (mycobacterium tuberculosis), where monotherapy fails to cure and multidrug therapy (mdt) is required (collaborative group for the meta-analysis of individual patient data in et al., 2018). similar mdt is also effective in hepatitis c virus infection, where glecaprevir and pibrentasvir combination therapies lead to sustained virological response rates as far out as 12 weeks post-treatment (wang et al., 2019). we propose that an effective combination therapy for covid-19 could target the sars-cov-2 replication cycle at multiple levels to synergistically inhibit viral spread and dissemination. using a computational pipeline that aimed to expeditiously identify lead compounds against covid-19, we combined compound library preparation, molecular modeling, and structure simulations to generate an ensemble of conformations and increase high-quality docking outcomes against two essential sars-cov-2 viral proteins and their host protein interactors: s/ace2, tmprss2, cathepsins l and k, and mpro, which are known to control viral binding, entry and replication (fig. 1a). our in silico approach (fig. 1b) will most likely lead into experimental virus screening, structural characterization of binding interactions by x-ray crystallography, and compound safety profiling. virtual screening (vs) is a rationally driven approach for the identification of new hits from compound libraries (willett, 2006) using either ligand-based (lbvs) or structure-based (sbvs) virtual screening (dror et al., 2004). lbvs tactics use structural and biological data of known active compounds to select favorable candidates with biological activity from experiments (jahn et al., 2009; maldonado et al., 2006). sbvs approaches, on the other hand, examine quantitative structure-activity relationships (qsar), clustering, pharmacophore and 3d shape matching (villoutreix et al., 2007). 
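the ligand-based idea described above, selecting candidates that resemble known actives, can be sketched with a simple similarity ranking. the python example below is a minimal illustration and not the method used in this study: it assumes each compound is represented by a hypothetical fingerprint (a set of "on" bit indices) and ranks a toy library by tanimoto similarity to one known active. the function names and all data are invented for illustration.

```python
# illustrative ligand-based screen: rank library compounds by tanimoto
# similarity to a known active. fingerprints are hypothetical sets of "on"
# bit indices; a real workflow would compute them with a cheminformatics toolkit.


def tanimoto(fp_a, fp_b):
    """tanimoto coefficient |a & b| / |a | b| for two sets of on-bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        # two empty fingerprints share nothing measurable; report 0.0
        return 0.0
    return len(fp_a & fp_b) / union


def rank_by_similarity(active_fp, library):
    """return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, tanimoto(active_fp, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# hypothetical toy data, not taken from the study
known_active = {1, 4, 7, 9, 12}
library = {
    "cmpd_a": {1, 4, 7, 9, 13},
    "cmpd_b": {2, 3, 5},
    "cmpd_c": {1, 4, 7, 9, 12},
}
ranking = rank_by_similarity(known_active, library)
for name, similarity in ranking:
    print(f"{name}: {similarity:.2f}")
```

in practice a similarity cutoff would decide which compounds move forward; the appropriate threshold depends on the fingerprint type and is not specified here.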
the utility of vs is evident in the growth of our knowledge base of new compounds and existing drugs as well as the expansion of our structural databases. sbvs is generally the preferred approach when access to the target 3d-information derived from nmr, x-ray crystallography or homology models (jahn et al., 2009; maldonado et al., 2006) is possible. molecular docking (docking) is the most common sbvs approach used today (bottegoni et al., 2009; corbeil et al., 2012; fernandez-recio et al., 2005; friesner et al., 2006; mcgann, 2012; morris et al., 2009 ) and searches for the ideal position and orientation (called "pose") of the small molecule within a target's binding site, which gives a score for the pose. when including knowledge of experimentally known compounds ("actives") from a 3d target, lbvs and sbvs can be combined to increase likelihood of obtaining new actives from searches (kruger and evers, 2010) . hit identification in vs also requires careful selection of the methods used based on the goal of the project (e.g. compound databases and libraries can be either proprietary, commercial or public) (bender, 2010) . zinc is one such large public database often used in vs (irwin and shoichet, 2005) , which contains millions of compounds. by contrast, other libraries have structure-activity relationships (sar) databases (scior et al., 2007) that integrate information about compound interactions with their known targets. drugbank, chem-space are other attractive sources of compounds for drug repurposing (or repositioning) (ashburn and thor, 2004; duenas-gonzalez et al., 2008; o'connor and roth, 2005) (wishart et al., 2008) , and maintain drug diversity that is useful for scaffold development (gozalbes et al., 2008; schreiber, 2000) . 
advances in computing power have increased utility of in silico screening capabilities and balanced the need for accuracy with virtual high-throughput screening approximations and assumptions (anthony, 2009; lee et al., 2008; mcgaughey et al., 2007; plewczynski et al., 2009) , while recent techniques have improved accuracy without sacrificing cpu time (caulfield and devkota, 2012; caulfield et al., 2011; jiang et al., 2014; mackerell et al., 1998; phillips et al., 2005) (fig. 1b) . further innovations in docking methods have improved the exactness of empirical docking equations (corbeil et al., 2012; fernandez-recio et al., 2005; friesner et al., 2006; kalid et al., 2012; kruger and evers, 2010; mcgann, 2012) . accuracy is improved by incorporating molecular flexibility with simulations (caulfield, 2012; caulfield et al., 2019; caulfield and medina-franco, 2011; caulfield et al., 2011; caulfield et al., 2014; kayode et al., 2016) , thus capturing conformational information on structural changes that directly impact compound docking results. to target the covid19 problem on multiple fronts (e.g. ace2:s protein, tmprss2, m pro , and cathepsin l and k), as well as improve our screening accuracy using our selected repurposing libraries and new chemical entity libraries (zinc database), we implemented a novel method that integrates protein flexibility/shape, adaptive biasing algorithms, machine learning from drug data, and final z-score matrix weighting to our drug modeling. 
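the text does not spell out how the final z-score matrix weighting is computed, so the following is only a minimal sketch of one common way to combine scores from several methods: standardize each method's scores to z-scores and rank compounds by a weighted sum. the function names, the weights, and the convention that a lower (more negative) score is better are all assumptions made for illustration, not the authors' implementation.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize scores to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All scores identical: no compound is distinguished by this method.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def consensus_rank(score_table, weights):
    """Rank compounds by a weighted sum of per-method z-scores.

    score_table: {method: {compound: score}}; lower score is assumed better
                 (e.g. docking or interaction energies in kcal/mol).
    weights:     {method: weight}; hypothetical weights chosen by the user.
    Returns compound names, best (lowest combined z-score) first.
    """
    compounds = sorted(next(iter(score_table.values())))
    combined = {c: 0.0 for c in compounds}
    for method, scores in score_table.items():
        z = zscores([scores[c] for c in compounds])
        for c, zc in zip(compounds, z):
            combined[c] += weights[method] * zc
    return sorted(compounds, key=lambda c: combined[c])
```

standardizing before summing keeps a method with a wide numeric range (for example md interaction energies) from dominating a method with a narrow one (for example docking scores); the weights then express how much each method is trusted.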
we matched all fda compounds with our realistic (x-ray derived) protein structures over a dynamic range of protein conformations with accelerated dynamics using our algorithms, such as maxwell's demon molecular dynamics (mdmd); this approach combines docking with simulations for exploration of both ligand and protein flexibility (caulfield, 2012; caulfield et al., 2019; caulfield and devkota, 2012; caulfield and medina-franco, 2011; caulfield et al., 2014; kayode et al., 2016; von roemeling et al., 2018) . we then refined the drug-target interface of our specific lead-like hit compounds using quantum mechanics (qm)-based scoring within our mdmd matrix (caulfield, 2012) to make our go/no-go assessment, which is particularly useful with nces and de novo compound designs (dcds). the protocol for library preparation, structural modeling, dynamics, refinement, and hit identification as part of a pipeline is given (fig. 1b) . to improve our docking outcome, we constructed x-ray structure-based models of ace2 bound to s-protein, m pro , and tmprss2 for our molecular dynamics simulations (mds) and virtual screening (fig. 1b,s1) . as s-protein interfaces with ace2 at a region distinct from the active site (fig. s1a-d) , inhibition of the binding site by ligands may disrupt the ace2/s-protein interaction. canonical inhibitors of ace2 bind at the active site where angiotensin interacts, whereas the structural region for s-protein binding does not overlap with that site. any modulation of the ace2/s-protein interaction by canonical ace2 inhibitors is therefore likely allosteric and suboptimal. directly targeting the interface of the interaction should thus increase efficacy of the approach and block covid viral binding, precluding entry (fig. s1) . additional investigation into the glycosylation sites of the s-protein demonstrated that the ace2 binding site is mostly unaffected by these additions (fig. s2) . a. 
s-protein:ace2 interaction (protein-protein inhibitor, ppi) requires dynamics to reveal binding site to get the optimal interface for drug screening, we used our grid searching algorithms, as well as site mapping and protein-protein docking, to examine the protein-protein interactions surface using mds ( fig. 2-3 ,s1) (bhachoo and beuming, 2017; caulfield and devkota, 2012; caulfield et al., 2011; caulfield and harvey, 2007; fernandez-recio et al., 2005; kozakov et al., 2006) . the protein-protein inhibitor (ppi) interaction complex did not identify any immediate binding site on the surface of the ppi interfaces. nevertheless, a small pore around one single beta-sheet in the center of the ppi interaction area could be exploited as a weak point that may perturb the interface equilibrium. using uniprot, which contains information about a number of confirmed mutations, we determined the relative potencies of ppi binding residues, identifying those that would likely affect the integrity of the complex (fig. 2) . residues k353 and y41, which interact with d155 at the center of the ppi, are likely stabilizing its surface, potentially forming a useful "hot spot" for targeted druggability ( fig. 2-3,s2) . to check whether this is true and to understand how ace2:s-protein cooperation functions, we performed two md simulations, one with and one without the mutation of y41a. this mutation causes strict inability to form the s-protein:ace2 complex. analysis of the trajectory of the wild-type protein, which possessed an intact complex, revealed the three most stable conformations of the "hot spot" region with expanded pores inside the triangle of residues k353, d155, y41. since it is impossible to determine which of these three conformations is the most stable, we ran three high-throughput screenings based on the donor-acceptor atoms and hydrophobic areas of the region. we then performed three md simulations with top pose ligands. 
as demonstrated in figure 3k , ligands failed binding within 10 ns, while docked ligands became leaders, as determined by energetic stability, during md and interaction energy values (electrostatic -red, van der waals -blue) ( fig. 3j/l) . to identify inhibitors of the s-protein:ace2 interaction via docking, we used the best scoring compounds obtained after combination of molecular docking and molecular dynamics simulations, which feeds into the pipeline for constraint-based screening. the high-throughput screening (hts) of a ppi library did not produce any results, since the ppi binding sites were weakly identified shallow regions (fig. 2,4a -d,s1). compounds that made good insertion into the sites situated between ace2 and s-protein were able to perturb the association of s-protein with ace2 via steric hindrance of s-protein association (fig. 3) . from the mds, we detected compounds that decreased energy of stability between the ace2:sprotein complex, which is desired in an inhibitor of protein-protein interaction. as a whole, this approach identified a deep and narrow binding site to disturb the s-protein interaction with ace2 (fig. 3,4a-d) . c. tmprss2 and m pro modeling requires dynamics to reveal optimal inhibitor binding to optimize the binding site of our inhibitors, we constructed a full-length (zymogen) model of tmprss2 (epitheliasinogen), as well as a mature version of the protease (epitheliasin), as described in our method section ( fig. 4e-g) . the mature protease model was used for mds studies to generate a reference dynamical profile that can be used to assist in silico screening of tmprss2 inhibitors. a control experiment was also completed with the uncleaved (non-catalytic) form of tmprss2 to demonstrate the pocket's instability and poor ligand binding capacity ( fig. s3 ) (ko et al., 2015; lucas et al., 2014; wilson et al., 2005) . a full-length model of monomeric m pro was also constructed, as well as a homodimer ( fig. 4h-k,s1 ). 
the structure derived from pdb code 6y2f with its ligand was used for a consensus virtual screen . in addition, we used the dimer to generate a reference dynamical profile to assist with in silico screening and study its interdomain behavior. we acquired the dimer protein sequence from the uniprot database. blast search showed the highest identification values against factor xi, prothrombin, kallikrein proteases (~41-42%). however, we focused on ligands that could be active against active form of tmprss2 protein. thus, we found the ligand: (2s)-1-[(2r)-2-(benzylsulfonylamino)-5-guanidino-pentanoyl]-n-[(4-carbamimidoylphenyl)methyl]pyrrolidine-2-carboxamide, contained within the chembldb repository (chembl1229259) and active against tmprss2, prothrombin, and factor xi. likewise, another docked model was recovered with macrocyclic ligand (chembl3699198), called: ethyl14-[[(e)-3[5-chloro-2-(tetrazol-1-yl) phenyl]prop-2-enoyl]amino] -5-(methoxycarbonylamino)-17-oxo-8,16 diazatricyclo[13.3.1.02,7] nonadeca-1(18),2(7), 3,5,15(19) -pentaene-9-carboxylate. we launched several molecular dynamics simulations (up to 75 ns of duration) to understand the interaction with the target protein-binding site. figure s3 shows the initial and stable/final states of our various models ( fig. 4e-g) . the md analysis provided useful results for selecting the appropriate model. after 15 ns md, the putative binding site collapsed (fig. s3,4e-g) . although the active form of thrombin was used for tmprss2 modeling, as a negative control we also examined the region with prothrombin-based binding site for completeness of the docking study (fig. s3) . the overlay of the average homology model structure from md and structure 3f68 (pdb code) was used as a template to compare protein-ligand interaction map and assign docking constraints (baum et al., 2009) . two optimal inhibitors for tmprss2 were selected for demonstration purpose in figure 5 . 
we also modeled cathepsins l and k for preliminary work, since these can be implicated in late-endosomal entry of the virus (fig. s4 ). for the viral main proteinase, m pro , a key enzyme for coronavirus (sars-cov-2) replication and a potential target for anti-sars drug development, several peptidomimetics synthesized in early 2012 against sars-cov-1 proteases were identified as selective. there is a high degree of sequence identity between the sars-cov-1 and sars-cov-2 m pro . this means that sars-focused ligands could form a similar interaction map with the m pro protein and offer good launching points for 3d-qsar and machine learning-driven drug design in future iterations. to perform the virtual screening, the protein structure was taken from the pdb code 7bqy complex, and significant attention was paid to the interaction between the crystallized ligand from the complex and the protein-binding site (jin et al., 2020) (fig. 6) . as the binding site is quite large (fig. 6a) , we used a set of additional crystal structures (pdb code 6y2f and fragment-like compounds from https://www.diamond.ac.uk/) to narrow the range of possible conformations. the compounds bound in this region showed a canonical, recurring interaction motif: an α-ketoamide group flanked by aliphatic or saturated rings. we then performed 75 ns of molecular dynamics on the ligand-free dimer structure of m pro to evaluate and "catch" the most flexible elements of the binding site. our simulation revealed that the extended binding pocket was not very stable, unlike its individual sub-pocket, which contains the active cysteine (c145) residue (fig. 6b,6c) . we began our molecular docking after assigning several combinations of constraints that should define specific interactions with the protein-binding site. 
we performed several high-throughput screening procedures using the same set of features in different combinations of constraints with a partial matching algorithm ( fig. 6d-e) . we then ranked docking scores and compared the obtained conformations inside the binding site with the co-crystallized ligands from the 7bqy and 6y2f structures to select the most potent compounds. by targeting three critical steps of the sars-cov-2 viral process (binding, entry, and replication) with our virtual screening approaches against dynamic structures, we were able to identify 350 compounds (dataset s1) and compile data reflecting their physicochemical and chemoinformatic properties. an exemplar top hit from each target is summarized for docking score in table 1 . to classify the compounds and their chemical space, we completed various regression and k-means analyses and fingerprint measurements, and we provide further details about their structures and properties, including commonly evaluated traits: mw, hba, hbd, docking score, rule of three (jorgensen), rule of five (lipinski), logp o/w , and logs (dataset s2). we focused on new compound searches. the mw for these initial screening compounds ranges from large fragment (~250 da) to mature drug-sized molecules (~500 da), with only 10 of the 310 top-scoring compounds being over 500 da in size; the smallest fragment-based compound measured 178 da. overall, the docking scores were very good, with a median around -7 kcal/mol using the glide xp calculations. we also generated a list of the most commonly related drugs and discuss how some of our best hits relate to known and clinical trial drugs (dataset s3). the general process for pruning the >30 million total chemical fragments and compounds from commercially available compounds for the initial round of virtual screening is described (fig. 1b) , which reduces the primary large set to 3 million per conformation of target. 
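as an illustration of the property profiling mentioned above, the sketch below counts lipinski rule-of-five violations (mw > 500 da, logp > 5, more than 5 h-bond donors, more than 10 h-bond acceptors) and tallies compounds above a molecular-weight cutoff. the `Compound` record, the function names, and the tolerance of one violation are assumptions for illustration; the jorgensen rule of three uses different descriptors and is not covered here.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Hypothetical record holding precomputed descriptors for one compound."""
    name: str
    mw: float    # molecular weight in Da
    logp: float  # octanol/water partition coefficient
    hbd: int     # hydrogen-bond donors
    hba: int     # hydrogen-bond acceptors


def ro5_violations(c: Compound) -> int:
    """Count Lipinski rule-of-five violations for one compound."""
    checks = [c.mw > 500, c.logp > 5, c.hbd > 5, c.hba > 10]
    return sum(checks)


def passes_ro5(c: Compound, max_violations: int = 1) -> bool:
    """Accept a compound with at most `max_violations` violations (assumed cutoff)."""
    return ro5_violations(c) <= max_violations


def count_over_mw(compounds, cutoff: float = 500.0) -> int:
    """Number of compounds heavier than `cutoff` Da."""
    return sum(1 for c in compounds if c.mw > cutoff)
```

descriptors such as mw and logp would normally come from a cheminformatics toolkit; they are passed in directly here so that the filter logic stays self-contained.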
as an example, when examining some prototype compounds from our selected dataset of >300 nces screened from >10 million total compounds, we find the predicted interactions between drug and protein ( table s2 ) have some common binding modalities. when looking at the dynamical data for the drugs binding to the protein-protein site on ace2, we find the rmsd, rmsf, and h-bond occupancy evidence strong binding capability, as calculated from three separate simulations of ace2 with different ligands, referred to as 300, 392, and 488 ( fig. 4,s1) . these observations can be applied to generate constraints for additional virtual screening to improve the performance at higher throughput. based on these results, ligand 392 reduced the overall rmsd and per residue rmsf, while maintaining strong hydrogen bonds, as demonstrated by its greater occupancy during the simulation (table s1 ). this information, particularly h-bond occupancy and modulation of interface residue rmsfs, can be used in conjunction with docking and other data to profile the compounds more thoroughly (fig. 4) . in some cases, where constraints were utilized, the docking score underrepresents the compound and testing is needed to get important single-point data to clarify actives from non-actives, as well as determine the real ic50s for the selected active compounds. we will enrich our dataset with the top compounds for future rounds of parallel chemical screening and eventual de novo chemical design for novel chemical entities. current results of our approach are presented on all three targets (ace2, tmprss2, m pro ). for each of our targets, we screened for hits from a library of fda-approved compounds alongside the more extensive library of nces. our final result across all three targets identified a total of 350 specific compounds, with 167 against ace2, 40 against tmprss2, and 103 against m pro . 
among these are fda-approved drugs that could be repurposed: 21 against ace2, 11 against tmprss2, and 8 against m pro (supplemental dataset tables1_topnce-fda-hits.xlsx). isoprenaline hydrochloride (isoproterenol) is an adrenoreceptor agonist that can be repurposed as a vasopressor to augment cardiovascular function, with a beta-receptor side benefit of bronchodilation to improve breathing function. metaraminol bitartrate, a stereoisomer of meta-hydroxynorephedrine, is a potent sympathomimetic amine used to raise blood pressure. atenolol and nadolol are beta-receptor blocking agents used in chronic hypertension, a comorbid risk factor in covid-19 patients. propafenone is an anti-arrhythmic agent approved for patients with life-threatening ventricular tachycardia. levosulpiride is an atypical antipsychotic medication with prokinetic function that can be used in patients with agitated delirium and gut immotility. valganciclovir hydrochloride is an antiviral agent used for cytomegalovirus (cmv) and varicella zoster virus (vzv), and as a preventative medication in hiv patients (wu et al., 2020b) . recent data show that covid-19 depletes cd8 t helper cells, similar to hiv (zheng et al., 2020) . amikacin sulfate and cephalexin are antibacterial antibiotics that can treat bacterial super-infection. prochlorperazine dimaleate is a phenothiazine derivative prescribed for nausea. isoetharine mesylate is a selective adrenergic beta-2 agonist and fast-acting aerosolized bronchodilator for covid-19 respiratory distress. benserazide hydrochloride is an aromatic l-amino acid decarboxylase inhibitor (dopa decarboxylase inhibitor) used with levodopa for the treatment of parkinsonism. glucosamine hydrochloride is a constituent found in cartilage and is used for osteoarthritis joint pain. s4701, or 2-deoxy-d-glucose (2d-dg), can induce a ketogenic state, a powerful pathway involved in reducing systemic inflammation. 
inulin is a natural prebiotic agent that enhances gi function and digestion by supporting the gi homeostasis that is critical to stabilize downstream anti-inflammatory effects and prevent overgrowth of harmful bacteria. metaproterenol is a bronchodilator (beta-2 receptor agonist) that is commonly used in clinical practice to treat a variety of respiratory disorders, including asthma, copd, bronchitis, and wheezing associated with viral pneumonias. the novelty of this drug is that it is aerosolized and can be given as a breathing treatment, reaching the lungs, which have a tremendous surface area, and entering the blood rapidly. by inhalation this drug acts rapidly and could potentially be given in combination with other aerosolized, oral, or iv drugs. its inhalational route of delivery can also reach alveolar type ii cells, which express ace2, for dual synergism. metaraminol bitartrate, a stereoisomer of meta-hydroxynorephedrine, is a potent sympathomimetic amine used in patients with hypotension (low blood pressure). covid-19 patients hospitalized in the intensive care unit (icu) often need vasopressor agents to raise blood pressure when they develop shock (dangerously low blood pressure) from covid-19 disease or sepsis. therefore, metaraminol has the dual purpose of antiviral function at the ace2 docking/entry site and support of systemic blood pressure in acutely ill covid-19 patients. this drug has immediate repurposing use in this patient population. atorvastatin is a statin drug with anti-inflammatory, immunomodulatory (diamantis et al., 2017) and endothelial benefits (ackermann et al., varga et al., 2020) . carbenicillin disodium is a penicillin-derivative antibacterial agent. 
catechins are derived from plants with many beneficial properties in human health including anticancer, anti-obesity, antidiabetic, anti-cardiovascular, antiinfectious, hepatoprotective, and neuroprotective effects (isemura, 2019) . these substances fall outside fda purview since supplements and generally have a wide safety margin that will be tested on the multidrug platform. epicatechine s5105 is a naturally occurring flavonoid found in chocolate with anti-sarcopenic effects on skeletal muscle (gutierrez-salmean et al., 2014) . ivosidenib is an experimental drug for treatment of several forms of cancer. bezafibrate is a fibrate lipid-lowering drug, which creates a favorable anti-inflammatory ratio against cardiovascular diseases. pf299804 or dacomitinib is an egfr inhibitor used in cancer therapeutics. metaproterenol is a bronchodilator (beta-2 receptor agonist) that is commonly used to treat a variety of respiratory disorders with viral pneumonias in clinical practice. carbenicillin disodium is a penicillin derivative antibacterial antimicrobial agent that as mentioned above can be used in conjunction with other anti-sars-cov-2 agents to shut down antiviral effects and used in combination with those covid-19 patients with secondary super-infection with bacterial infection of lung, blood, or skin. bumetanide is a loop-diuretic used to remove extra fluid in the body (edema) such as pulmonary edema. aloin is an anthraquinone glycoside found naturally in aloe vera plants, a natural cathartic, and decreases 16s rrna sequencing of dysbiosis-producing butyrate producing bacterial species via an emodin breakdown product (gokulan et al., 2019) . emodin blocks ace2 and viral docking (ho et al., 2007) . salbutamol sulfate (albuterol) is a bronchodilator used in various breathing disorders. 
s4953 usnic acid is a naturally occurring dibenzofuran derivative found in lichen species and in some kombucha teas, with adrenergic function to raise blood pressure and potential bronchodilator activity. usnic acid is an active ingredient in some products and a preservative in others, and it has a wide array of antimicrobial action against human and plant pathogens, with antiviral, antiprotozoal, antiproliferative, anti-inflammatory, and analgesic activity (ingolfsdottir, 2002) . avanafil belongs to a class of medications called phosphodiesterase (pde) inhibitors, which dilate the pulmonary artery and circulation. s3612 rosmarinic acid is a naturally occurring compound found in plants (rosemary and sage), which has a broad range of antimicrobial activity, including antiviral activity against hiv (shekarchi et al., 2012) . ractopamine is a beta-agonist used for bronchodilation. neohesperidin dihydrochalcone (nhdc) is a naturally derived plant sweetener (bitter orange) with anti-tmprss2 effects. cidofovir and zidovudine (zdv) are both antiviral drugs used in hiv patients. there is a critical unmet patient need for therapeutics to treat the acute phase of covid-19 disease now and in the future. efforts to create and trial a vaccine are underway, but 11.6 million patients are confirmed infected globally (>540k deaths), with 25% infected within the united states, and we are just at the midpoint of 2020. therefore, there is an urgent need to speed drug discovery from the bench to the bedside. in order to accelerate drug discovery, translation and human application, a design funnel using high-powered artificial intelligence is needed to screen millions of compounds against macromolecular mechanistic targets of the virus. at the back end of this funnel 40 drug candidates emerged, many of which may represent repurposing candidates for use in humans due to known safety and tolerability profiles. 
however, the approach with the highest probability of overall clinical therapeutic success may be not a single drug therapy for this viral rna disease but rather a multi-pronged drug approach gleaned from decades of hiv-aids epidemic research. a multidrug approach for hiv has improved survival, markedly reduced viral loads, and vastly improved management of the disease by preventing aids end-stage fatal complications. we therefore suggest that a multifaceted drug approach for sars-cov-2 may prove superior by attacking 3 viral entry and replication cycle sites simultaneously: ace2 receptor docking site and entry, tmprss2 endosomal packaging, and m pro viral replication. multiple drug targets for each of the 3 sites also allow permutations and optimization for combinatorial success. a recent study that screened commercially available >10,000 clinical-staged and fda-approved small molecules against sars-cov-2 in a cell-based assay (riva et al., 2020) identified interesting compounds for alternative targets that complement our results. these fda approved compounds included mdl-28170, a selective cathepsin b inhibitor; vby-825, a non-specific cathepsin b, l, s, v inhibitor; apilimod, an inhibitor of production of the interleukins il-12 and il-23; z-lvg-chn2, a tripeptide derivative inhibitor for cysteine proteinases; ono 5334, a selective cathepsin k inhibitor; and sl-11128, a polyamine analogs designed against e. cuniculi, a antimicrobial agents used as an adjuvant treatment for opportunistic aids-associated infections. overall these compounds are cathepsin-centric or antibiotic in nature, with little to no effect on our intended targets (tmprss2, ace2, m pro ). additional top hits identified by riva et al. 
include: amg-2674, an amgen compound inhibitor of trpv-1 (vanilloid receptor); sb-616234-a that possesses high affinity for human 5-ht1b receptors; sdz 62-434 that strongly inhibited various inflammatory responses induced by lipopolysaccharide (lps) or function-activating antibody to cd29; hafangchin a (also called "tetrandrine"), a bisbenzylisoquinoline alkaloid, which acts as a calcium channel blocker; elopiprazole an antipsychotic drug of the phenylpiperazine class (antagonist at dopamine d2 and d3 receptors and an agonist at serotonin1a receptors) that was never marketed; yh-1238, which inhibits dipeptidyl peptidase iv (dpp-iv) enzyme prolonging the action of the incretin hormones, glucagon-like peptide-1 (glp-1) and glucose-dependent insulinotropic polypeptide (gip); kw-8232, an anti-osteoporotic agent that can reduce the biosynthesis of pge2; astemizole, an antihistamine; n-tert-butyl isoquine (also called "gsk369796"), an antimalarial drug candidate; and remdesivir, a broad-spectrum antiviral medication developed by the biopharmaceutical company gilead sciences. again, none of these compounds were geared toward targeting tmprss2 or mpro, and are also not specific to ace2. while the lack of overlap may be surprising, results generated by riva and colleagues are not in opposition to our findings and both approaches can complement each other. most importantly, these approved fda compounds can be combined with our set of identified nce (310 compounds) that have been demonstrated to have low toxicity issues based on our chemoinformatics filtering (fig. 1b) . all nce compounds identified were chemical moieties that do not overlap any fda drugs. altogether, the data presented here complements previously generated data and should help prioritize and rapidly identify safe treatments for covid-19. future work will rely on advanced 3d-qsar, fragment-based drug design principles for de novo drug optimization. 
among millions of potential covid-19 drugs screened the majority of the final 40 drug candidates have known medical use and/or fda approval for a primary indication (e.g., hypertension, cardiac indication, hyperlipidemia) with well-established patient safety and tolerability profiles from large phase iii human trials and post-market (phase iv) analyses. these large human data provide both a clinically significant and scientifically innovative window of opportunity to test 40 compounds on the multidrug platform, and, in conjunction, observe longitudinal human survival outcomes of covid-19 patients on these drugs for comparative effectiveness within established and ongoing patient registries. an emerging example of this important parallel is ace2 pathway drugs (ace inhibitors [acei] and angiotensin receptor blocking drugs [arb]), which are increasingly observed in humans with covid-19 to be associated with improved survival advantage (jarcho et al., 2020; mancia et al., 2020; mehta et al., 2020; patel and verma, 2020; vaduganathan et al., 2020) . however, there is a scientific knowledge gap within human registries data regarding a scientifically robust and testable translational platform to test mechanistic effects of these different molecular compounds. therefore, creation of a "pandemic platform" using newer technology of compound ai drug throughput screening combined with animal multi-drug screening models creates an early phase i/ii safety, tolerability and early efficacy platform which is rapidly needed to expedite bedside human use for covid-19 pandemic, and as a platform that can be used in future pandemics. a flurry of activity to identify compounds for sars cov-2 targets has been underway by academic labs globally. here in our approach we introduce our novel maxwell's demon molecular dynamics method for screening flexibility required to get rare and essential conformational transitions and pathways to find the most likely druggable state. 
we also used our quantum docking technique (qm-driven adaptive molecular dynamics scanning docking) (caulfield, 2012) to identify compounds effective for targeting ace2, tmprss2 and m pro . the compounds identified by our large-scale in silico platform can next be experimentally validated as binders for intended targets and for efficacy in models of the disease, evaluated for ec50/safety-toxicity data, and carried into hit-to-lead and lead optimization in a drug development pipeline. structural studies such as x-ray crystallography will also be important to generate structural sar data for these efforts. in sum, our leading edge in silico methods incorporating structural dynamics have produced a set of 350 candidate compounds suitable for screening in biological disease models. among these, 40 fdaapproved compounds are eligible for rapid clinical trial testing. additionally, our results bring forward 310 nces predicted to possess potency and specificity for viral or human accessory target proteins to lower the viral load. moreover, this resource offers the community a set of chemical tools to probe the behavior of these enzymes essential for sars-cov-2 progression, namely, binding, entry and replication. as sars-cov-2 is already endemic, the rapid identification of effective antivirals remains a paramount focus until we have an efficient vaccine to provide long-lasting protection. in general, coot was used for building in missing residues and regularizing geometry (emsley and cowtan, 2004; emsley et al., 2010) . more details for the preparation of each model are given in the respective subsections. since these structures were all used in downstream computational studies, a uniform structural preparation was implemented. the full-length structures are comprised of all residues and side chains. we added missing atoms in rotamers and de-clashed atoms, added missing residues for chain continuity, and removed extraneous molecules/atoms (e.g. 
artifacts of crystallography or alternative conformations of residues were removed (keeping the highest occupancy)), and the b-factors were set to isotropic. the pdbepisa server was used to data mine the interface between ace2 and s-protein (krissinel and henrick, 2007) . surface interactions data is provided (supplemental). calculations on molecular dynamics trajectories including rmsd, rmsf, and h-bonds were performed using vmd and internal tools thereof (rmsd trajectory tool and tk console). prior to calculations, the backbone (c, o, n, cα) atoms of each frame of the trajectories were aligned to the first frame as a reference, to remove the effect of random rotation/translation. after alignment, the per residue average of rmsf or rmsd per frame in Å across the entire mds trajectory is given. for the ace2-ligand simulations, the number of hydrogen bonds between the protein and ligand was recorded for each frame, and the occupancy of each specific h-bond is defined as the percentage of frames in which the bond is present. rmsd, rmsf, and h-bond data were plotted in 2d format in excel. the rmsf was also appended to the beta column of the pdb and heat-mapped to the structure using a custom tcl/tk script and pymol. all molecular graphics were generated in pymol (mooers, 2016) . molecular dynamics and monte carlo simulations were performed on the protein to allow local regional changes for all amino acids of each full-length structure. the x-ray refinement for monte carlo was built using the yasara ssp/pssm method (altschul et al., 1997; hooft et al., 1996a; hooft et al., 1996b; king and sternberg, 1996; krieger et al., 2009; qiu and elber, 2006) . the structure was relaxed to the yasara/amber force field using knowledge-based potentials within yasara. the side chains and rotamers were adjusted with knowledge-based potentials, simulated annealing with explicit solvent, and small equilibration simulations using yasara's refinement protocol (laskowski ra, 1993) . 
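the trajectory metrics described above (rmsd per frame, rmsf, and h-bond occupancy as the percentage of frames in which a bond is present) can be sketched in plain python as follows. this is not the vmd tooling used in the study: the function names and the input layout are assumptions, frames are assumed to be already aligned to the reference, and rmsf is computed per atom rather than averaged per residue.

```python
import math


def rmsd_per_frame(frames, ref=None):
    """RMSD of each frame against a reference (default: first frame).

    frames: list of frames, each a list of (x, y, z) tuples in angstroms,
            assumed to be pre-aligned to the reference.
    """
    ref = frames[0] if ref is None else ref
    out = []
    for frame in frames:
        sq = sum((a - b) ** 2 for p, q in zip(frame, ref) for a, b in zip(p, q))
        out.append(math.sqrt(sq / len(ref)))
    return out


def rmsf_per_atom(frames):
    """Root-mean-square fluctuation of each atom about its mean position."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    out = []
    for i in range(n_atoms):
        mean_pos = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        msf = sum(
            sum((f[i][k] - mean_pos[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        out.append(math.sqrt(msf))
    return out


def hbond_occupancy(hbonds_per_frame):
    """Percentage of frames in which each hydrogen bond is present.

    hbonds_per_frame: list with one set of h-bond labels per frame
                      (labels such as "LIG:O1-LYS353:NZ" are hypothetical).
    """
    n = len(hbonds_per_frame)
    counts = {}
    for frame in hbonds_per_frame:
        for hb in frame:
            counts[hb] = counts.get(hb, 0) + 1
    return {hb: 100.0 * c / n for hb, c in counts.items()}
```

a per-residue rmsf would follow by averaging the per-atom values over the atoms of each residue; the alignment step itself (a least-squares superposition of backbone atoms) is omitted here.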
the entire full-length structure was modeled, filling in any gaps or unresolved portions from the x-ray structure. refinement of the finalized models was completed using either schrodinger's lc-mod monte carlo-based module or namd2 protocols. these refinements started with the yasara-generated initial refinement of tmprss2 (altschul et al., 1997; hooft et al., 1996a; hooft et al., 1996b; krieger et al., 2009). the superposition and subsequent refinement of each protein region yields a complete model. the final structures were subjected to energy optimization with pr conjugate gradient with an r-dependent dielectric. atom consistency was checked for all amino acids of the full-length wild-type structure, verifying correctness of chain name, dihedrals, angles, torsions, non-bonds, electrostatics, atom typing, and parameters. the model was exported to the following formats: maestro (mae), yasara (pdb). model manipulation was done with maestro (macromodel, version 9.8, schrodinger, llc, new york, ny, 2010), or visual molecular dynamics (vmd) (humphrey et al., 1996). mds and mc searching were completed on each model for conformational sampling, using methods previously described in the literature (caulfield and devkota, 2012; caulfield and medina-franco, 2011; caulfield, 2011; caulfield et al., 2011). briefly, each protein system was minimized with relaxed restraints using either steepest descent or conjugate gradient pr, then allowed to undergo the mc search criteria, as shown in the literature (caulfield and devkota, 2012; caulfield and medina-franco, 2011; caulfield, 2011; caulfield et al., 2011). the primary purpose of mc, in this scenario, is examining any conformational variability that may occur with each protein. for ace2/s-protein, pdb code 6vw1 was used to construct the model (shang et al., 2020). while the structure was mostly complete, chain f (s-protein) was missing more residues, though it had residue ala522. chain e (s-protein) was only missing residue 522.
residue ala522 was built into chain e using coot, and the extraneous molecules (solvent/cryoprotectant) and chains were deleted to leave only the ace2/s-protein heterodimer, which was processed for use in computational studies, not to generate a de novo model or a complete structure with missing atoms and sections filled in. all information about the protein was found on the corresponding uniprot page. after identifying the hot spot residues using sitemap or protein-protein interfaces, we used md to find out how the y41a mutation can affect ppi inhibition. we performed md for the wild-type and mutated protein. residue mutation was also performed using pymol's built-in tools. gromacs 2018 and the amber99 force field were used to conduct md and further analysis of the results (baugh et al., 2011; dilip et al., 2016; janson et al., 2017; kazmierkiewicz, 2013, 2016; mooers, 2016). visual inspection of every 10th frame allowed us to identify tendencies toward structural deformation at particular places on the protein surface. based on the literature data and our findings, we focused on the predicted binding site. then, each trajectory was analyzed via the built-in clustering tool based on the rmsd distribution. the three most stable conformations of the binding site were chosen for the docking studies. all docking poses obtained from each docking study were evaluated based on the docking scores, interaction diagrams and solvent exposure. to make predictions regarding the binding mode, we carried out another molecular dynamics simulation for the top poses of each docking. after this confirmation of our assumptions, we selected the most potent and accurate compounds from the results of docking. a homology model was constructed on the basis of the prothrombin crystal structure in complex with the ligand analog (pdb code 3f68) (baum et al., 2009).
we modeled the 492 amino acid tmprss2 protein in two different ways: yasara based and swissmodel server based (krieger et al., 2002; waterhouse et al., 2018; zoete et al., 2011). first, the yasara based model begins with the fasta sequence:
malnsgsppaigpyyenhgyqpenpypaqptvvptvyevhpaqyypspvpqyaprvltqasnpvvct qpkspsgtvctsktkkalcitltlgtflvgaalaagllwkfmgskcsnsgiecdssgtcinpsnwcdg vshcpggedenrcvrlygpnfilqvyssqrkswhpvcqddwnenygraacrdmgyknnfyssqgi vddsgstsfmklntsagnvdiykklyhsdacsskavvslrciacgvnlnssrqsrivggesalpgawp wqvslhvqnvhvcggsiitpewivtaahcvekplnnpwhwtafagilrqsfmfygagyqvekvishpn ydsktknndialmklqkpltfndlvkpvclpnpgmmlqpeqlcwisgwgateekgktsevlnaakvll ietqrcnsryvydnlitpamicagflqgnvdscqgdsggplvtsknniwwligdtswgsgcakayrp gvygnvmvftdwiyrqmradg.
topological domains have the following characteristics: residues 1-84 form the cytoplasmic sequence; residues 85-105 form the transmembrane domain region (helical, 21 aa); and residues 106-492 form the signal-anchor for type ii membrane protein (extracellular). the protein has two main chains, a non-catalytic chain (met1-arg255) and a catalytic chain (ile256-gly492); each domain was modeled as a separate unit, and the units were built together into a composite. disulfide bonds exist between several residues, (113-126), (120-139), (133-148), (172-231), (185-241), (244-365), (281-297), (410-426), (437-465), which can be informative for building the structure. glycosylation sites are also possible at residues n213 and n249. the (active) cleavage site exists between arg255 and ile256 (see refinement section). in the second method, homology modeling was performed using the swissmodel server after performing a blast search on available protein structures in the rcsb database. molecular dynamics simulations of 100 ns of both the suggested and re-modeled protein structures were performed with gromacs 2018 (kazmierkiewicz, 2013, 2016).
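the topology and disulfide annotations listed above lend themselves to a quick consistency check before model building. the following python sketch is an illustration only, not part of the published protocol: the residue ranges and disulfide pairs are copied from the text, while the function names and the toy sequence in the usage note are hypothetical.

```python
# Residue ranges (1-based, inclusive) and disulfide pairs as listed in the text.
TOPOLOGY = [
    (1, 84, "cytoplasmic"),
    (85, 105, "transmembrane"),
    (106, 492, "extracellular"),
]
DISULFIDES = [
    (113, 126), (120, 139), (133, 148), (172, 231), (185, 241),
    (244, 365), (281, 297), (410, 426), (437, 465),
]

def region_of(resnum):
    """Return the topological region that contains a residue number."""
    for start, end, name in TOPOLOGY:
        if start <= resnum <= end:
            return name
    raise ValueError(f"residue {resnum} is outside 1-492")

def non_cysteine_pairs(sequence, pairs):
    """Return the pairs in which either position is not a cysteine.

    sequence: one-letter amino acid string (any case); pairs use 1-based numbering.
    """
    seq = sequence.lower()
    return [(i, j) for i, j in pairs if seq[i - 1] != "c" or seq[j - 1] != "c"]
```

for example, `non_cysteine_pairs("acdca", [(2, 4), (1, 4)])` flags only the pair `(1, 4)`; running the same check with the full tmprss2 sequence and `DISULFIDES` would reveal any numbering offset between the annotation and the sequence.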
based on the structural analysis and the generated connolly surfaces, we identified critical changes in the binding site of the proposed model and began creating a grid for the binding site of the new homology model. since our model was based on the structure of thrombin, we used its co-crystallized ligand as a template for assigning constraints and ensured that we built the catalytically active state. for mpro, the structure co-crystallized with tert-butyl (1-((s)-1-(((s)-4-(benzylamino)-3,4-dioxo-1-((s)-2-oxopyrrolidin-3-yl)butan-2-yl)amino)-3-cyclopropyl-1-oxopropan-2-yl)-2-oxo-1,2-dihydropyridin-3-yl)carbamate (also referred to as alpha-ketoamide 13b) (pdb 6y2f) was used; this structure was also mostly complete. residues e47 and d48 were built in using coot, and the other preparations previously described were also performed. to build the missing residues, the coordinates and structure factors were downloaded, 2mfo-dfc and fem maps were generated, and real space refine zone/regularize zone were used to fit to electron density and optimize local geometry. the ligand (alpha-ketoamide 13b) was retained as a cognate ligand for virtual screening. the protein structure was initially studied using md to find out whether the binding site is stable enough or can break down without a ligand molecule during the simulation. simulation of the dimeric complex for 100 ns was sufficient to compare conformational changes from different md states. a set of positional and hydrogen-bond constraints was assigned based on the available peptidomimetic structure. thus, two screenings were conducted, with an emphasis on either positional constraints or hydrogen-bond interactions. using mds and mc refinement with schrodinger and/or yasara ssp/pssm methods (altschul et al., 1997; hooft et al., 1996a; hooft et al., 1996b; king and sternberg, 1996; krieger et al., 2009; qiu and elber, 2006), each structure was relaxed to the yasara/amber force field using knowledge-based potentials within yasara.
the side chains and rotamers were adjusted with knowledge-based potentials, simulated annealing with explicit solvent, and small equilibration simulations using yasara's refinement protocol (laskowski, 1993). the entire full-length structure was modeled, filling in any gaps or unresolved portions from the x-ray structure. refinement of the finalized models was completed using either schrodinger's monte carlo-based module or in-house protocols. these refinements started with the generated initial refinement for each independent structure (altschul et al., 1997; hooft et al., 1996a; hooft et al., 1996b; krieger et al., 2009). the superposition and subsequent refinement of the overlapping regions yields a complete model for all four proteins. the final structures were subjected to energy optimization with pr conjugate gradient with an r-dependent dielectric. atom consistency was checked for all amino acids (and atoms) of the full-length wild-type model, verifying correctness of chain name, dihedrals, angles, torsions, non-bonds, electrostatics, atom typing, and parameters. a multimeric-complex model is predicted, including cofactors and ions. all of the models were exported in the following formats: maestro (mae), yasara (pdb). model manipulation was done with maestro (macromodel, version 9.8, schrodinger, llc, new york, ny, 2010), or visual molecular dynamics (vmd) (humphrey et al., 1996). analyses focused on the protein-protein interaction regions. monte carlo dynamics searching (mc-search) was completed on each model for additional conformational sampling, using methods previously described in the literature (caulfield and devkota, 2012; caulfield, 2011; caulfield et al., 2011). briefly, each protein system was minimized with relaxed restraints using either steepest descent or conjugate gradient pr, then allowed to undergo the mc search criteria, as shown in the literature (caulfield and devkota, 2012; caulfield, 2011; caulfield et al., 2011).
the primary purpose of mc, in this scenario, is examining any conformational variability that may occur with different orientations in the region near protein-protein interfaces. an all-atom force field was used to minimize the energy of the system, namely, the steepest descent algorithm for 20,000 steps with an iteration interval of 2 fs. solvent equilibration was carried out with positional restraints imposed on the atoms of the protein structures, while the solvent molecules remained mobile, for 100 ps. each system was placed in a box with a 10 Å layer of tip3p water molecules. the final systems were neutralized by the addition of na+ and cl- ions to a concentration of 150 mm. all simulations were performed under periodic boundary conditions using the v-rescale thermostat algorithm to maintain temperature (310 k) and the parrinello-rahman barostat algorithm for constant pressure (1 bar) (bussi et al., 2007; parrinello and rahman, 1981). long-range electrostatic interactions were calculated using the particle-mesh-ewald (pme) method (abraham and gready, 2011). all molecules were relaxed with a molecular dynamics simulation of 100 ns. ligand topologies were created using the antechamber module from the ambertools18 package (case et al., 2005). we used sitemapper (bhachoo and beuming, 2017) to identify possible binding sites for docking affinity with the proteins ace2 (allosteric site), tmprss2, and mpro. we also used our novel mds biasing algorithm, maxwell's demon md, to search within these sites for potential flexible zones that would have beneficial peptide interactions, which served as a reductive filter limiting the total number of possible sites screened on the proteins to those with adequately deep binding grooves (caulfield, 2011; kayode et al., 2016) or interesting insertion sites (ace2).
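the simulation setup above implies some simple bookkeeping that is easy to get wrong by a factor of ten. the sketch below works through two such quantities in python: the number of integration steps in a run, and the number of ion pairs needed to reach a target salt concentration. it assumes that 2 fs is the md integration timestep; the box volume in the usage note is hypothetical, and counter-ions added purely for neutralization are not included.

```python
AVOGADRO = 6.02214076e23  # particles per mole

def md_steps(sim_time_ns, timestep_fs):
    """Number of integration steps for a run of sim_time_ns at timestep_fs."""
    # 1 ns = 1,000,000 fs
    return round(sim_time_ns * 1_000_000 / timestep_fs)

def ion_pairs_for_concentration(box_volume_nm3, conc_mol_per_l):
    """Number of na+/cl- pairs giving conc_mol_per_l in a box of the given volume."""
    # 1 nm^3 = 1e-24 L
    litres = box_volume_nm3 * 1e-24
    return round(conc_mol_per_l * AVOGADRO * litres)
```

under these assumptions, the 100 ns production run corresponds to `md_steps(100, 2)` = 50,000,000 steps, the 100 ps solvent equilibration to `md_steps(0.1, 2)` = 50,000 steps, and a hypothetical 1000 nm^3 box at 150 mm needs about `ion_pairs_for_concentration(1000, 0.150)` = 90 ion pairs.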
prior to docking with ace2 (allosteric site), tmprss2, and mpro, we completed rigorous molecular dynamics simulations (mds) and monte carlo (mc) conformational searching for each model for additional conformational sampling, using methods previously described in the literature (caulfield and devkota, 2012; caulfield, 2011; caulfield et al., 2011). the primary purpose of mc, in this scenario, is examining any conformational variability that may occur with different orientations in the region near protein-protein interfaces. over three million compounds were docked to each site using the glide xp docking program (bhachoo and beuming, 2017). all compounds were accounted for using opls3 within the maestro program (maestro-9.4, 2014). using our published docking protocols on each identified site, we reductively scanned from hundreds of poses down to the top 10 poses from each docking and then did cross-comparisons of docking scores to retain only the top binding pose of each compound from each site in a winner-takes-all strategy. each compound was converted into a set of energy-minimized three-dimensional shapes with the ligprep module. without protein preparation, it was used for the correct distribution of protonation and post-minimization in the opls3 force field. when assigning constraints based on ligands (mpro, tmprss2), we tried to cover the most important and strongest interactions. in the case of ace2, enough constraints were generated to cover combinations of possible interactions. positional constraints (1.8 Å radius) and h-bond constraints were generated in the schrodinger glide module, namely in the grid generation tool. aromatic and hydrophobic features were represented with short smarts patterns. a partial matching protocol for applying constraints was also used to improve the accuracy of the process. a high-throughput screening protocol with regulated ligand flexibility was applied.
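the reductive pose selection described above (hundreds of poses, then the top 10 per docking, then a single winning pose) can be sketched in a few lines of python. this is one possible reading of the winner-takes-all strategy, not the authors' implementation: it assumes that more negative docking scores are better, as in glide, and the function names and example records are hypothetical.

```python
def top_poses(scores, keep=10):
    """Keep the `keep` best (most negative) scores from one docking run."""
    return sorted(scores)[:keep]

def best_pose_per_site(poses, keep=10):
    """Best score of each compound at each site.

    poses: iterable of (compound_id, site, score) records.
    Returns {(compound_id, site): best_score}.
    """
    by_docking = {}
    for compound, site, score in poses:
        by_docking.setdefault((compound, site), []).append(score)
    # Reduce every docking to its top poses, then retain only the winner.
    return {key: top_poses(scores, keep)[0] for key, scores in by_docking.items()}

def winning_site(best):
    """Cross-compare sites and keep the single best (site, score) per compound."""
    winners = {}
    for (compound, site), score in best.items():
        if compound not in winners or score < winners[compound][1]:
            winners[compound] = (site, score)
    return winners
```

keeping the per-site result separate from the cross-site winner makes it possible to inspect a compound's second-best site before it is discarded.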
conformations of compound orientations were generated using our standard protocols (bhachoo and beuming, 2017; kalid et al., 2012; unger et al., 2015). the starting conformation of relaxed protein structures was first obtained by the method of polak-ribière conjugate gradient (prcg) energy minimization with the optimized potentials for liquid simulations (opls) 2005 force field (jorgensen, 2004; jorgensen and tirado-rives, 1988) for 5000 steps, or until the energy difference between subsequent structures was less than 0.001 kj/mol-Å. our docking methodology has been described previously (caulfield and devkota, 2012; friesner et al., 2006; loving et al., 2009; vivoli et al., 2012). briefly, compounds were docked within the schrödinger software suite (mohamadi et al., 1990) using a virtual screening workflow (vsw) (bhachoo and beuming, 2017; friesner et al., 2006; jacobson et al., 2002; kalid et al., 2012; kozakov et al., 2006). alternative docking methods were also employed, including in-house software techniques for top leads for sar elucidation.
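the minimization stopping rule described above (a step limit of 5000, or an energy change between subsequent structures below 0.001) can be illustrated with a toy polak-ribière conjugate gradient minimizer. the sketch below is not the force-field minimizer used in the study: it assumes a quadratic model energy e(x) = 0.5 x.a.x - b.x with a symmetric positive definite matrix a, which allows an exact line search, and all names in it are hypothetical.

```python
def prcg_minimize(a, b, x0, max_steps=5000, tol=1e-3):
    """Polak-Ribiere conjugate gradient on E(x) = 0.5*x.A.x - b.x.

    Stops after max_steps iterations or when the energy change between
    subsequent points falls below tol. Returns (x, steps_taken).
    """
    def matvec(m, v):
        return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    def energy(x):
        return 0.5 * dot(x, matvec(a, x)) - dot(b, x)

    x = list(x0)
    g = [axi - bi for axi, bi in zip(matvec(a, x), b)]  # gradient A x - b
    d = [-gi for gi in g]                               # first direction: steepest descent
    e_old = energy(x)
    for step in range(1, max_steps + 1):
        gg = dot(g, g)
        ad = matvec(a, d)
        curvature = dot(d, ad)
        if gg == 0.0 or curvature <= 0.0:
            return x, step - 1                          # already at a stationary point
        alpha = -dot(g, d) / curvature                  # exact line search (quadratic only)
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = [gi + alpha * adi for gi, adi in zip(g, ad)]
        e_new = energy(x)
        if abs(e_old - e_new) < tol:
            return x, step
        # Polak-Ribiere coefficient, reset to steepest descent if negative.
        diff = [gn - go for gn, go in zip(g_new, g)]
        beta = max(0.0, dot(g_new, diff) / gg)
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g, e_old = g_new, e_new
    return x, max_steps
```

for a two-variable quadratic such as a = [[4, 1], [1, 3]], b = [1, 2], the search reaches the minimum at (1/11, 7/11) within a few iterations, far below the step limit; a real force field is not quadratic, so a numerical line search replaces the exact one.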
the top seeded poses were ranked and unfavorable scoring poses were discarded. top favorable scores from initial dockings yielded hundreds of poses, of which the top five poses were retained. molecular interactions of the ligand-protein interfaces were used to help determine the optimal binding set; descriptors were used to obtain atomic energy terms such as hydrogen bond interaction, electrostatic interaction, hydrophobic enclosure and π-π stacking interaction that result during the docking run. molecular modeling for importing and refining the proteins was completed (maestro-9.4, 2014). structure stability was examined for all proteins investigated, s-protein:ace2, tmprss2, and mpro, respectively (caulfield and devkota, 2012; caulfield and medina-franco, 2011; caulfield, 2011; reumers et al., 2005; schymkowitz et al., 2005; zhang et al., 2013). object stability, which the foldx algorithm can provide, was used prior to docking studies to determine from immediate inspection whether any changes in structure were deleterious to function. thus, we examined the local residues around the docking site and determined that an electrostatic calculation may be useful to explain the change in function. the molecular model for the full structure and its truncated form are given (fig. s1) using our state-of-the-art methods, which have been established (abdul-hay et al., 2013; ando et al., 2017; caulfield and devkota, 2012; caulfield and medina-franco, 2011; caulfield, 2011; caulfield et al., 2011; caulfield et al., 2014; caulfield et al., 2015; fiesel et al., 2015a; fiesel et al., 2015b; puschmann et al., 2017; zhang et al., 2013). local residues within the 12 Å cutoff near docking sites were analyzed (fig. s1-s2). any interactions requiring induced fit, threonine/serine hydroxyl rotation, or other docking parameters (π-stacking/halogen directionality) were also included.
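the 12 Å neighborhood selection mentioned above amounts to a distance filter between protein atoms and ligand atoms. the python sketch below shows that filter under simple assumptions: coordinates are in Å, a residue counts as local if any of its atoms lies within the cutoff of any ligand atom, and the function name and the coordinates in the usage note are hypothetical.

```python
import math

def residues_within_cutoff(protein_atoms, ligand_atoms, cutoff=12.0):
    """Residues with at least one atom within `cutoff` angstroms of the ligand.

    protein_atoms: list of (residue_label, (x, y, z)) records.
    ligand_atoms: list of (x, y, z) coordinates.
    Returns a sorted list of residue labels.
    """
    near = set()
    for label, xyz in protein_atoms:
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            near.add(label)
    return sorted(near)
```

with a single ligand atom at (1, 0, 0) and protein atoms labelled d350 at the origin, d382 at (5, 0, 0) and r393 at (20, 0, 0), the call returns `["d350", "d382"]`.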
mapping electrostatics was accomplished using the poisson-boltzmann calculation for solvation on all amino acids for each docked structure (caulfield and devkota, 2012; caulfield and medina-franco, 2011; caulfield, 2011; reumers et al., 2005; schymkowitz et al., 2005; zhang et al., 2013).
libraries used. compounds were derived from either a set of all fda-approved and clinically tested compounds, a bioactive set of compounds, or a large multi-million compound set from the zinc database. in all cases the libraries were prepared using ligprep as described above. the zinc database was pruned using parameters for a better drug-like profile and to remove reactive functional groups and compounds with poor chemoinformatics properties, delivering a large set suitable for screening on all targets across dynamic time points from mds.
supplemental information can be found online at: xxx
acknowledgements
author contributions
t.c. designed and conducted most experiments, analyzed data, and wrote the manuscript with inputs from j.m., k.l., m.c. and e.r.; t.c. and m.c. performed mds, docking, and generated analyses; t.c. and m.c. performed post simulation analyses; c.b. and m.c. performed bioinformatics analysis; t.c. supervised m.c.; t.c. provided expertise on data analysis; e.r. provided expertise and insight interpreting experimental structures and homology models; and t.c. proposed the project to k.l., who helped with formatting, detailing analysis and edited the manuscript. the authors declare no competing interest.
zhu, z., zhang, z., chen, w., cai, z., ge, x., zhu, h., jiang, t., tan, w., and peng, y. (2018). predicting the receptor-binding domain usage of the coronavirus based on kmer frequency on spike protein. infect genet evol 61, 183-184.
zoete, v., cuendet, m.a., grosdidier, a., and michielin, o. (2011). swissparam: a fast force field generation tool for small organic molecules.
(g) ligand interaction diagram rendered with maestro for ace2 with ligand 300 at the allosteric site impacting s-protein binding from sars-cov-2. this 2d "flat" representation shows the interactions at this particular compound's interface on ace2 that would interfere with s-protein binding. the interactions are described in the subsequent sentences, proceeding from deeply inserted to superficial. d382 and d350 are hydrogen bond acceptors (side chains) from the opposite nh+ on the piperazine-like ring deeply inserted into the binding pocket. r393 is a hydrogen bond donor (side chain) to the alcohol group connecting the piperazine-like ring to the fused ring. e37 is a hydrogen bond acceptor (side chain) to one of the nh on the fused ring. the fluorocyclohexane group is entirely solvent-exposed at the mouth of the binding pocket. although glycosylation sites at residues n165, n234, n343 from s-protein (pdb code 6vsb) are near the ace2:s-protein binding interface, they do not overlap or interfere with the protein-protein interface, so an adjacent site is readily available for ppi docking (s-prot glycosylation analysis: doi: 10.1126/science.abb9983; 10.1101/2020.04.29.069054). the majority of glycosylation sites are not on the rbd (fig. s2); the glycosylation site that is actually present on the rbd, n343, is not in 3d proximity to the binding interface. recently, a variant of the s-protein, d614g, was identified to possess enhanced transmissibility and resistance to contemporary interventions, and this site is not present on the rbd. neither the glycosylation sites nor the enhanced transmissibility variant d614g are within 3d proximity to the drug binding site for our targeted protein-protein interface disrupting therapeutics for ace2.
3-methyl-7-{2-methyl-3-[(1-phenyl-1h-1,2,3,4tetrazol-5-yl) sulfanyl]propyl}-8-(4-methylpiperazin-1-yl)-2,3,6,7-tetrahydro-1h-purine-2,6-dione ace2 -7.596339 [ (2-chloro-1-benzofuran-3-yl) rac-(3r,4s)-1-[ (3-cyclopentyl-1,2,4-oxadiazol-5yl) methyl]-4(1-methyl-1h-imidazol-5-yl) 2-(decahydroisoquinolin-2-yl)-n(2-oxo-2,3-dihydro-1h-1,3-benzodiazol-5-yl) predicting molecular interactions in silico: i. a guide to pharmacophore identification and its applications to drug design the prince and the pauper. a tale of anticancer targeted agents a common feature pharmacophore for fdaapproved drugs inhibiting the ebola virus coot: model-building tools for molecular graphics features and development of coot optimal docking area: a new method for predicting protein-protein interaction sites patho-)physiological relevance of pink1-dependent ubiquitin phosphorylation structural and functional impact of parkinson disease-associated mutations in the e3 ubiquitin ligase parkin extra precision glide: docking and scoring incorporating a model of hydrophobic enclosure for protein-ligand complexes breakthrough: chloroquine phosphate has shown apparent efficacy in treatment of covid-19 associated pneumonia in clinical studies clinical and microbiological effect of a combination of hydroxychloroquine and azithromycin in 80 covid-19 patients with at least a six-day follow up: a pilot observational study dose-dependent effects of aloin on the intestinal bacterial community structure development and experimental validation of a docking strategy for the generation of kinase-targeted libraries il-6: relevance for immunopathology of sars-cov-2 effects of (-)-epicatechin on molecular modulators of skeletal muscle growth and differentiation a trial comparing nucleoside monotherapy with combination therapy in hiv-infected adults with cd4 cell counts from 200 to 500 per cubic millimeter tmprss2 and adam17 cleave ace2 differentially and only proteolysis by tmprss2 augments entry driven by the severe 
acute respiratory syndrome coronavirus spike protein emodin blocks the sars coronavirus spike protein and angiotensin-converting enzyme 2 interaction sars-cov-2 cell entry depends on ace2 and tmprss2 and is blocked by a clinically proven protease inhibitor the pdbfinder database: a summary of pdb, dssp and hssp information with added value errors in protein structures clinical features of patients infected with 2019 novel coronavirus in wuhan vmd: visual molecular dynamics usnic acid zinc--a free database of commercially available compounds for virtual screening catechin in human health and disease on the role of the crystal environment in determining protein side-chain conformations optimal assignment methods for ligandbased virtual screening the reframe library as a comprehensive drug repurposing library and its application to the treatment of cryptosporidiosis pymod 2.0: improvements in protein sequence-structure analysis and homology modeling within pymol inhibitors of the renin-angiotensin-aldosterone system and covid-19 generalized scalable multiple copy algorithms for molecular dynamics simulations in namd structure of m(pro) from sars-cov-2 and discovery of its inhibitors the many roles of computation in drug discovery the opls potential functions for proteins -energy minimizations for crystals of cyclic-peptides and crambin consensus induced fit docking (cifd): methodology, validation, and application to the discovery of novel crm1 inhibitors an acrobatic substrate metamorphosis reveals a requirement for substrate conformational dynamics in trypsin proteolysis identification and application of the concepts important for accurate and reliable protein secondary structure prediction androgen-induced tmprss2 activates matriptase and promotes extracellular matrix degradation, prostate cancer cell invasion, tumor growth, and metastasis piper: an fft-based protein docking program with pairwise potentials improving physical realism, stereochemistry, and side-chain 
accuracy in homology modeling: four approaches that performed well in casp8 increasing the precision of comparative models with yasara nova--a self-parameterizing force field inference of macromolecular assemblies from crystalline state comparison of structure-and ligand-based virtual screening protocols considering hit list complementarity and enrichment factors online structure-based screening of purchasable approved drugs and natural compounds: retrospective examples of drug repositioning on cancer targets procheck -a program to check the stereochemical quality of protein structures optimization of high throughput virtual screening by combining shape-matching and docking methods energetic analysis of fragment docking and application to structure-based pharmacophore hypothesis generation the androgen-regulated protease tmprss2 activates a proteolytic cascade involving components of the tumor microenvironment and promotes prostate cancer metastasis developing covid-19 vaccines at pandemic speed all-atom empirical potential for molecular modeling and dynamics studies of proteins molecular dynamics simulation by gromacs using gui plugin for pymol improvements in gromacs plugin for pymol including implicit solvent simulations and displaying results of pca analysis molecular similarity and diversity in chemoinformatics: from theory to applications renin-angiotensin-aldosterone system blockers and the risk of covid-19 fred and hybrid docking performance on standardized datasets comparison of topological, shape, and docking methods in virtual screening advances in the computational development of dna methyltransferase inhibitors hydroxychloroquine or chloroquine with or without a macrolide for treatment of covid-19: a multinational registry analysis association of use of angiotensin-converting enzyme inhibitors and angiotensin ii receptor blockers with testing positive for coronavirus disease early treatment of covid-19 patients with hydroxychloroquine and azithromycin: 
a retrospective analysis of 1061 cases in marseille macromodel-an integrated software system for modeling organic and bioorganic molecules using molecular mechanics simplifying and enhancing the use of pymol with horizontal scripts autodock4 and autodocktools4: automated docking with selective receptor flexibility finding new tricks for old drugs: an efficient route for publicsector drug discovery polymorphic transitions in single crystals: a new molecular dynamics method covid-19 and angiotensin-converting enzyme inhibitors and angiotensin receptor blockers: what is the evidence? coinhibition of the deubiquitinating enzymes, usp14 and uchl5, with vlx1570 is lethal to ibrutinib-or bortezomib-resistant waldenstrom macroglobulinemia tumor cells scalable molecular dynamics with namd a medicinal chemistry perspective of drug repositioning: recent advances and challenges in drug discovery virtual high throughput screening using combined random forest and flexible docking heterozygous pink1 p.g411s increases risk of parkinson's disease via a dominant-negative mechanism ssaln: an alignment algorithm using structure-dependent substitution matrices and gap penalties learned from structurally aligned protein pairs homology model-based virtual screening for gpcr ligands using docking and target-biased scoring snpeffect: a database mapping molecular phenotypic effects of human non-synonymous coding snps a large-scale drug repositioning survey for sars-cov-2 antivirals. 
[table residue: smiles strings, compound name fragments and docking scores for candidate mpro ligands; legible scores: -7.652268, -7.71278, -7.632513, -7.623712, -7.800731]
key: cord-004672-0lf5j8lo authors: anderson, kevin; bond, clifford w. title: structural and physiological properties of mengovirus: avirulent, hemagglutination-defective mutants express altered alpha (1 d) proteins and are adsorption-defective date: 1987 journal: arch virol doi: 10.1007/bf01313891 sha: doc_id: 4672 cord_uid: 0lf5j8lo structural and physiological properties of two mutants of mengovirus, 205 and 280, were compared to those of wild-type virus to understand the molecular basis of changes exhibited in their biological function. 
two-dimensional gel electrophoresis of wild-type and mutant structural proteins revealed alterations in the isoelectric character of the alpha (1 d) protein of both mutants 205 and 280. these data suggest that alterations in the alpha (1 d) protein may be responsible for the phenotypic changes exhibited by the mutants. a delay in detectable virus-specified protein synthesis was exhibited in mutant-infected cells in comparison to wild-type. the amount of rna synthesized in mutant- and revertant-infected cells was less than that synthesized in wild-type-infected cells. changes in virus-specified macromolecular synthesis in mutant- and revertant-infected cells reflected a decrease in the ability of the viruses to attach to cells. biological properties of two mengovirus mutants, 205 and 280, were compared with those of wild-type virus (2). these mutants were defective in their ability to agglutinate erythrocytes, produced smaller plaques in cell culture, and were avirulent in mice, but were not temperature-sensitive. these data suggested that changes in the structure of the mutant viruses may be responsible for the expression of altered phenotypic traits. in this communication, the structural proteins of the wild-type and mutant viruses were compared by sds-page, peptide mapping, and two-dimensional gel electrophoresis to determine which of the mutant structural proteins differed from those of wild-type and the extent of homology among analogous proteins. the structural proteins of two revertants of mutant 205 were also examined by two-dimensional gel electrophoresis to relate changes in biological function to structural differences in the analogous proteins of the mutants and wild-type mengovirus. in addition, the progression of events associated with wild-type, mutant, and revertant virus infection was examined to identify factors responsible for physiological changes associated with mutant and revertant virus-specified macromolecular synthesis. 
the growth and assay of wild-type, mutant and revertant viruses using bhk-21 cells have been described previously (2). mutant and revertant virus stocks used in the experiments described below were from the second passage of virus isolated by two successive rounds of plaque purification. purified virions, radiolabeled in vivo with [14c(u)]-l-amino acid mixture (6 µci/ml) [new england nuclear (nen), nec-445], were prepared and analyzed by sds-polyacrylamide gel electrophoresis as described previously (2). surface tyrosine residues of purified virions were labeled with [125i]-sodium iodide [(nen), nez-033h] using a modification of the method described by millar and smith (15). pellets containing purified virions were resuspended in tn buffer [50 mm tris-hcl (ph 7.5), 150 mm nacl] and quantitated by absorbance at 260 nm as described (19). suspensions of purified virus in tn buffer containing 0.3 absorbance units (260 nm), 200 µci of 125i, and 100 µg of iodogen (pierce) were mixed in a total volume of 0.3 ml and agitated for 10 minutes. an equal volume of 0.4 ~m nai, 5.0 ~m 2-mercaptoethanol was added to the mixture, which was then layered onto sephadex g-25 columns (60 × 7 mm) equilibrated with tn buffer. the excluded volumes were collected and centrifuged in an sw 41 rotor at 37,000 rpm for 90 minutes at 6 °c. pellets were prepared for preparative sds-polyacrylamide gel electrophoresis (sds-page). virus-specified intracellular proteins were labeled with l-[35s]-methionine, immunoprecipitated and analyzed by sds-page. cells were mock-infected or infected with virus in suspension (1 × 10^7 cells/ml) at an moi of 3, adsorbed, plated into plastic 35 mm dishes (2.5 × 10^6 cells/dish), and incubated at 33 °c. cells were pulse-labeled for 15 minutes with 100 µci/ml 35s-methionine in 0.3 ml methionine-free medium. 
the cells were washed twice with serum-free medium and lysed with 0.1 ml b 10 [10 mm tris-hcl (ph 7.4), 5 mm mgcl2, 0.5 percent np 40, 0.1 percent sds, 1 percent aprotinin, 50 µg/ml ribonuclease a, 50 µg/ml deoxyribonuclease] for 5 minutes on ice. cellular lysates were centrifuged at 6500 × g for 1 minute at 4 °c and the supernatant fluid was collected and stored at -20 °c. lysates were diluted in b 11 [50 mm tris (ph 7.4), 150 mm nacl, 5 mm edta, 0.02 percent sodium azide, 0.05 percent np 40, 1 percent aprotinin, 0.1 percent bovine serum albumin]. twelve µl of mouse anti-mengovirus hyperimmune ascitic fluid (2) was added to the diluted lysates (0.5 ml volume) and incubated at 0 °c for 60 minutes. immune complexes were precipitated with 100 µl of 10 percent fixed staphylococcus aureus (cowan strain) (13) by incubation at 0 °c for 60 minutes and pelleted by centrifugation for 1 minute at 6500 × g. pellets were washed four times with b 11 buffer at 0 °c and resuspended in 40 µl of 20 mm dithiothreitol (dtt), 1 percent sds. after 15 minutes of incubation at room temperature, the immunoprecipitated proteins were eluted and reduced by heating at 60 °c for 5 minutes. bacteria were removed by centrifugation and the supernatant fluids were prepared for sds-page. virus-specified intracellular rna was labeled with [5,6-3h]-uridine [(nen), net-367]. cells were infected in suspension with virus (1 × 10^7 cells/ml) at an moi of 3, plated in plastic 35 mm dishes (2.5 × 10^6 cells/dish) following adsorption at 33 °c for 30 minutes, and incubated at 33 °c. actinomycin d (5 µg/ml final concentration) was added to each dish at various time points and the cells were incubated at 33 °c for 20 minutes. the medium was aspirated and replaced with 0.3 ml dme 2 containing 10 µci/ml 3h-uridine, and the monolayers were incubated at 33 °c for 60 minutes. 
at the end of the labeling period, the cells were lysed at 4 °c for 5 minutes with net buffer [10 mm tris-hcl (ph 7.4), 100 mm nacl, 1 mm edta] supplemented with 1 percent np 40. the lysate was centrifuged at 6500 × g for 1 minute. duplicate samples containing 25 µl of the supernatant fluid were spotted onto whatman gf/c glass fiber filters and precipitated in ice-cold 10 percent trichloroacetic acid (tca). filters were placed in vials with scintillation fluid [5 g/l 2,5-diphenyloxazole (ppo) in xylene], and counted in a packard lsc 460 cd liquid scintillation counter using the pre-set tritium channel (4). to prepare virus particles containing radiolabeled rna, cells were infected with virus at an moi of 3 and were labeled at 4.5 hpi with 20 µci/ml [5,6-3h]-uridine [(nen), net-367] or 500 µci/ml 32p-orthophosphate [(nen), nex-054] in medium containing 12.5 mg/l monosodium phosphate. virus-infected cells were allowed to incubate at 33 °c until lysis and virions were purified as described above. 35s-methionine-labeled virions were suspended in sample buffer containing 9.95 m urea (bio-rad), 4 percent np-40 (particle data laboratories), 2 percent bio-lyte 3/10 (bio-rad), and 100 mm dtt as described (8) and incubated at 25 °c for 30 minutes. isoelectric focusing (ief) was performed as described (17). following equilibration, the tube gels were placed onto 10 percent polyacrylamide slab gels and subjected to sds-page. the ph gradients for ief and nephge were generated from 1 cm serial gel sections equilibrated in 50 mm nacl and the ph values were lowered by 0.5 ph units to correct for measurement in the presence of urea (22). the isoelectric point (pi) value for each protein species was estimated from these gradients. 
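the pi estimation described above can be sketched in code. the following is a minimal illustration, not the authors' procedure: it assumes that each ph reading applies to the centre of its 1 cm gel section and that the pi of a spot is obtained by linear interpolation between neighbouring sections. the function names and the example readings are hypothetical; only the 0.5 ph unit urea correction and the 1 cm section length come from the text.

```python
from bisect import bisect_left

UREA_CORRECTION = 0.5  # pH units subtracted to correct for urea, as stated in the text


def corrected_gradient(section_ph):
    """Apply the urea correction to the raw pH readings of serial gel sections."""
    return [ph - UREA_CORRECTION for ph in section_ph]


def estimate_pi(position_cm, section_ph, section_cm=1.0):
    """Estimate the pI of a spot from its position along the focusing gel.

    Assumption (not from the source): each reading is assigned to the centre
    of its section and values in between are linearly interpolated.
    """
    ph = corrected_gradient(section_ph)
    centres = [(i + 0.5) * section_cm for i in range(len(ph))]
    if not centres[0] <= position_cm <= centres[-1]:
        raise ValueError("position lies outside the measured gradient")
    hi = bisect_left(centres, position_cm)  # first centre >= position
    if centres[hi] == position_cm:
        return ph[hi]
    lo = hi - 1
    frac = (position_cm - centres[lo]) / (centres[hi] - centres[lo])
    return ph[lo] + frac * (ph[hi] - ph[lo])


# Hypothetical raw readings for three consecutive 1 cm sections.
raw_readings = [6.0, 6.5, 7.0]
print(estimate_pi(1.0, raw_readings))
```

with real data the readings would cover the whole gel length, and a spot position would be measured from the same end of the gel as the first section.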
bands corresponding to the structural proteins of mengovirus were identified by exposure of dried preparative sds-page gels to x-ray film, excised, rehydrated in 10 mm ammonium bicarbonate, 0.1 percent sds, and electroeluted as described (7, 21). protein samples were lyophilized and sds was removed as described (10). the proteins were dried, oxidized for 90 minutes at 4 °c with 0.2 ml of fresh performic acid and lyophilized three times in distilled water. proteins were analyzed for purity by sds-page prior to digestion. purified, oxidized proteins were resuspended in 0.2 ml of 1 percent ammonium bicarbonate (ph 7.8) for digestion with n-alpha-p-tosyl-l-lysine chloromethyl ketone-treated chymotrypsin (tlck-chymotrypsin) at 37 °c as described (7). enzyme-treated samples were frozen and lyophilized before peptide analysis by thin layer chromatography (tlc). two-dimensional peptide mapping was performed as previously described (11). lyophilized samples were dissolved in electrophoresis buffer [butanol-pyridine-acetic acid-water (2 : 1 : 1 : 36)] and 2-5 µl was spotted onto cellulose tlc plates (e. merck). separation of peptides in the first dimension was performed by electrophoresis for 45 minutes at 1000 v (40 to 50 ma). the tlc plates were air-dried and the peptides were separated in the second dimension by ascending chromatography in n-butanol-pyridine-acetic acid-water (393 : 304 : 61 : 243) until the solvent front reached 2 cm from the top. dried plates were sprayed with en3hance (nen) and exposed to preflashed kodak nag-2 x-ray film at -70 °c. the number and percentage of wild-type, mutant, and revertant virus particles adsorbing to bhk-21 cells were determined as follows. cells were infected with purified 32p-labeled virus at a multiplicity of 2000 particles/cell. the 32p-labeled virus suspensions were allowed to adsorb to bhk-21 cell monolayers in plastic dishes (60 mm diameter) for 60 minutes at 33 °c. 
the cells were lysed with 0.1 ml b 10 for 5 minutes on ice. duplicate 100 µl samples were spotted onto whatman glass fiber filters and precipitated in ice-cold 10 percent tca. the filters were placed in vials with scintillation fluid and counted in a packard 460 cd liquid scintillation counter using the pre-set 32p channel (4). the fraction of cell-associated virus particles was calculated by dividing the number of cell-associated counts per minute (cpm) by the cpm of the input virus. the average number of virus particles adsorbed per cell was calculated by multiplying the fraction of cell-associated virus by the multiplicity of infection (2000 particles/cell). the structural proteins of purified 14c-labeled wild-type and mutant mengoviruses were analyzed by sds-page on 8, 10 and 15 percent polyacrylamide slab gels. analyses of the structural proteins resolved on 15 percent gels are shown (fig. 1). five structural proteins were resolved for each of the viruses: epsilon (1 ab), 40 kd; alpha (1 d), 37 kd; beta (1 b), 33 kd; gamma (1 c), 25 kd; and delta (1 a), 7.8 kd. no changes in the migration of the mutant structural proteins were observed in comparison to those of the wild-type virus. in addition, the migration of the structural proteins of several ha+ revertants of mutant 205 was also similar to that of wild-type (data not shown). although some variation in the amount of delta (1 a) protein of the viruses was observed (fig. 1), this was not a result found consistently throughout our analyses. 
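the adsorption calculation given in the methods (cell-associated cpm divided by input cpm, then multiplied by the input multiplicity of 2000 particles/cell) can be written out as a short sketch. the function names and the example counts are hypothetical, and the sketch assumes that the cell-associated and input counts have already been scaled to the same sample fraction. the comparison to wild-type follows the percentage reported in table 1.

```python
def mean(values):
    """Average of duplicate scintillation counts."""
    values = list(values)
    if not values:
        raise ValueError("at least one count is required")
    return sum(values) / len(values)


def adsorption(cell_cpm_duplicates, input_cpm, particles_per_cell=2000):
    """Return (fraction of virus that is cell-associated, particles adsorbed per cell).

    particles_per_cell defaults to the input multiplicity stated in the text.
    """
    if input_cpm <= 0:
        raise ValueError("input cpm must be positive")
    fraction = mean(cell_cpm_duplicates) / input_cpm
    return fraction, fraction * particles_per_cell


def relative_percent(particles_per_cell, wild_type_particles_per_cell):
    """Adsorption of a mutant or revertant as a percentage of wild-type."""
    return 100.0 * particles_per_cell / wild_type_particles_per_cell


# Hypothetical counts: duplicates of 490 and 510 cpm from an input of 10000 cpm.
fraction, per_cell = adsorption([490, 510], 10000)
print(fraction, per_cell)
```

keeping the fraction and the per-cell value as separate outputs mirrors the two quantities defined in the methods.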
35s-methionine-labeled structural proteins of wild-type and mutant viruses 205 and 280 were analyzed following digestion with tpck-trypsin by reverse-phase hplc to further examine the structure of the proteins by separating peptides in a gradient on the basis of charge. however, the spectra of methionine-containing peptides of the delta (1 a), beta (1 b), gamma (1 c), and alpha (1 d) proteins from the different viruses were identical (data not shown). although not all of the peptides generated by tpck-trypsin digestion were likely to contain methionine, the results suggested that the peptide compositions of the four structural proteins of the wild-type and mutant viruses were remarkably similar. hplc analysis of the tryptic peptide compositions of the four structural proteins of the wild-type and mutant viruses uniformly labeled with a 14c-amino acid mixture was inconclusive because the resulting chromatograms contained many unresolvable, overlapping peaks (data not shown). to determine whether the arrangement of the structural proteins on the surface of mutants 205 and 280 was similar to that of wild-type, purified virions were labeled with 125i and analyzed by sds-page (fig. 2). most of the tyrosine residues on the surface of the wild-type and mutant viruses were on the alpha (1 d) protein, as demonstrated by the extensive labeling of this protein. the beta (1 b) protein was labeled to a much lesser extent. there was no detectable difference in the pattern of surface protein labeling among the three viruses, suggesting that the arrangement of the structural proteins on the surfaces of the virions of the wild-type and mutant viruses was similar. 
to examine the structure of the surface proteins of the three viruses in greater detail, two-dimensional tlck-chymotryptic peptide chromato[...] although not all of the surface peptides generated by tlck-chymotrypsin are likely to be labeled, numerous peptides were labeled. these data suggest that the arrangement of the structural proteins on the surface of the three viruses was remarkably similar. the structural proteins of the wild-type, mutant, and revertant strains were analyzed by ph gradient gel electrophoresis to determine whether any changes were apparent in migration of the mutant and revertant structural proteins relative to those of the wild-type virus. the proteins were analyzed by ief and nephge two-dimensional gel electrophoresis in order to resolve both acidic and basic proteins. the two-dimensional electrophoretic migration of the structural proteins of wild-type mengovirus is shown in fig. 4, panel a. the epsilon (1 ab) protein had a pi of 5.32. the alpha (1 d) protein was separated into four major protein species, each with a distinct pi (5.35, 5.52, 5.61, and 5.72). the beta (1 b) protein was separated into two major protein species with pi values of 5.55 and 5.62 and a heterogeneous species ranging from 6.3 to 6.6. the gamma (1 c) protein was not well resolved by this technique and appears to be slightly acidic or neutral in charge. the delta (1 a) protein does not appear in any of the two-dimensional ief gels. the two-dimensional electrophoretic migration of the structural proteins of mutant 205 is shown in fig. 4, panel b. the electrophoretic migration of the proteins was identical to that of the wild-type strain with one exception. only three alpha (1 d) protein species were resolved. one of the protein species resolved in separation of the wild-type proteins, pi = 5.72, was absent. however, this protein species was resolved in the wild-type/205 mixed sample separation (fig. 4, panel d). 
these data suggest that the absence of this particular protein species represents a phenotypic change or mutation in the alpha (1 d) protein of mutant 205 relative to the wild-type virus. in addition, the two-dimensional electrophoretic patterns of the structural proteins of the revertant viruses, 205-a 7 and 205-d 2, were identical to those of mutant 205 (data not shown). the two-dimensional electrophoretic migration of the structural proteins of mutant 280 is shown in fig. 4, panel c. the electrophoretic migration of the proteins was identical to that of the wild-type strain with one exception. the pi of one of the four alpha (1 d) protein species (pi = 5.58) was slightly more acidic than the analogous wild-type isoelectric species (pi = 5.61). this species appeared broader and slightly lower in molecular weight than the corresponding wild-type protein species. the wild-type/280 mixed sample separation (fig. 4, panel e) shows a broad protein species in the region of the gel corresponding to the apparent change in the pis of the wild-type and 280 alpha (1 d) protein species. the altered migration of the mutant 280 isoelectric species was found consistently in two-dimensional profiles of independently derived virion protein samples. these data suggest that the change in form and migration of this protein species represents a phenotypic change or mutation in the alpha (1 d) protein of mutant 280 relative to the wild-type virus. two-dimensional nephge analyses confirmed the results obtained by ief analysis of the wild-type, mutant, and revertant structural proteins. the delta (1 a) proteins of the wild-type and mutant viruses were resolved by this technique and migrated similarly as a single acidic protein species at ph 2.5 (data not shown). 
the specific activities (cpm/particle) of the wild-type and mutants differed when bhk-21 cells were infected and labeled with 35s-methionine or 14c-amino acids under similar conditions (anderson and bond, unpublished data). therefore, to determine whether differences in the kinetics of virus-specified macromolecular synthesis exist among the viruses, virus-specified intracellular protein and rna synthesis were examined. virus-infected cells were pulse-labeled at 1 hour intervals from 3 to 11 hpi with 35s-methionine for 15 minutes. cytoplasmic lysates were immunoprecipitated and analyzed by sds-page (fig. 5). eight virus-specified intracellular proteins were resolved in wild-type and mutant strain-infected cells: a (1-2 a), b (1), c (3), d (3 cd), 1 abc, d 2 (1 cd), alpha (1 d), and gamma (1 c). the number and molecular weights of the proteins specified by the wild-type and mutant strains were identical. however, virus-specified protein synthesis was detected initially at 5 hpi for the wild-type strain and at 6 hpi for mutant strains 205 and 280. to determine whether changes in the kinetics of rna synthesis could explain the delay in detectable protein synthesis observed in mutant-infected cells, virus-infected cells were treated with actinomycin d, pulse-labeled with 3h-uridine, lysed, and counted. the results are shown in fig. 6. the peak of rna synthesis for each virus was from 10 to 11 hpi. however, the amount of rna synthesized by the wild-type virus was 10-fold greater than that of the mutants and 2-fold greater than that of the revertants. since the magnitude of virus-specified rna synthesized in the mutant virus-infected cells was 10-fold less than that of wild-type virus, the beginning of virus-specified protein synthesis, as detected by immunoprecipitation (fig. 5), would appear to be delayed because less rna would be available for translation. 
since the magnitude of virus-specified rna synthesis in mutant virus-infected cells was 10-fold less than that of the wild-type virus, alterations in uncoating or adsorption of the mutant viruses to cells may account for this difference. however, surface peptide analysis suggested that the arrangement of the structural proteins of the wild-type and mutant virions was remarkably similar. therefore, it is possible that uncoating of the virions would occur at a similar rate in infected cells due to their similarity in structure. the average number of wild-type, mutant, and revertant virus particles adsorbing to bhk-21 cells was examined following incubation at 33 °c for 60 minutes. under these conditions, the amount of virus adsorption to cells would approximate saturation. the results are shown in table 1. the average number of virus particles adsorbing per cell clearly differed among the wild-type, mutant, and revertant viruses. therefore, it is apparent that the mechanism of adsorption of the mutant and revertant viruses to cells is modified from that of wild-type. the difference in the cellular binding affinities among the wild-type, mutant, and revertant viruses would be an important factor contributing to the differences in the magnitude of viral rna and protein synthesis observed for these viruses. these data suggest that fewer cells would be infected by the mutant and revertant viruses and therefore, fewer virus-specified macromolecules would be synthesized.
table 1 footnotes: a, cells were infected at an moi of 2000 particles/cell and incubated at 33 °c for 60 minutes; b, the average number of particles adsorbed/cell; c, the percentage of adsorption of the mutants and revertants in comparison to wild-type mengovirus.
biological and structural properties of two mengovirus mutants have been compared to those of the parental wild-type strain. 
these mutants, 205 and 280, exhibited alterations in agglutination of erythrocytes, virulence in mice, and plaque morphology (2). in addition, biological characterization of several ha+ revertants of mutant 205 isolated from the brains of mice infected intracranially indicated that agglutination and virulence may be linked traits and that the size of plaques produced by a particular mengovirus isolate may reflect its binding affinity to cells and virulence in mice. analysis of 35s-methionine-labeled structural proteins of wild-type and mutant viruses by sds-page and hplc revealed that extensive homology exists among these viruses. to further examine these viruses for structural differences, surface labeling of intact wild-type and mutant virus particles was done to compare the arrangement of the structural proteins which form the viral capsids. previous studies have shown that surface iodination of intact poliovirus and mengovirus labels primarily the 1 d (vp 1 or alpha) protein of the respective virus capsids (3, 12) and the 1 b (vp 2 or beta) protein is also labeled, but to a much lesser extent than the 1 d protein. we obtained similar data for the surface iodination of intact wild-type and mutant virus particles. in addition, no apparent differences were evident from the analysis of chymotryptic peptides of 125i-labeled wild-type and mutant alpha (1 d) proteins. these data suggest that arrangement of structural proteins on the surface of mutant capsids did not differ from that of wild-type. therefore, the mutations may result in the expression of altered determinants that reflect differences in biological function and may not result in the masking of otherwise functional determinants due to altered arrangement of the capsid proteins. we compared the isoelectric character of the viral capsid proteins by two-dimensional gel electrophoresis. 
multiple species of the alpha (1 d) and beta (1 b) proteins were reproducibly detected by this technique in both the presence (fig. 4) and absence (data not shown) of sds prior to isoelectric focusing. previous studies have demonstrated multiple isoelectric species for the major capsid proteins of poliovirus [vp 1 (1 d), vp 2 (1 b), and vp 3 (1 c)] (9, 23) and emc virus [alpha (1 d) and beta (1 b)] (6). vrijsen et al. (23) also excluded the possibility that the multiple species are derived from differential binding of sds or the accumulation of mutants in the original virus stock. these authors suggested that the multiple species may result from heterogeneity in the cleavage of structural protein precursors, although the sequences of the amino- and carboxyl-termini of the major capsid polypeptides of mengovirus were determined without mention of sequence variations (24). we suggest that the charge heterogeneity of these molecules may reflect differences in the interaction of these proteins with the ampholines. the different protein species observed may represent different forms of the viral proteins associated with either intracellular particles or free extracellular particles or extracellular particles bound to or eluted from membranes, assuming that these particles share the same buoyant density in equilibrium gradients. therefore, the heterogeneity observed may reflect differences in the manner of isolating virus particles, whether intracellularly or from lysed cells. differences were detected among the alpha (1 d) structural proteins of the wild-type and mutant viruses. since the mutants exhibited different alterations in the alpha (1 d) protein, yet similar changes in biological activity, it is possible that alteration of this protein resulted in the observed phenotypic changes. 
the differences in the number and migration of the alpha (1 d) protein isoelectric species are likely to be due to changes in uncharged amino acid residues affecting the interaction of the altered species with the ampholines. changes in charged amino acid residues would result in altered migration of all four isoelectric species. since the chromatograms of surface- and metabolically-labeled peptides were similar, we speculate that the alterations in the mutant alpha (1 d) proteins may be associated with a particular determinant on this protein. this determinant may serve as or be part of a functional attachment site on the surface of virus particles that determines their affinity for various cells. atomic resolution of the structure of another picornavirus, human rhinovirus 14, revealed a large cleft on the icosahedral face that has been postulated to serve as the host cell receptor binding site (18). the large cleft separates the major part of five 1 d (alpha) subunits from the other viral protein subunits. since the picornaviruses share similar structural characteristics, by analogy it seems reasonable to speculate that mutations in the 1 d (alpha) protein could affect the structure of the cleft, and thus alter the binding affinities of the mutant and revertant viruses. alteration of the mengovirus host cell receptor binding site may lead to changes in affinity for erythrocytes as well as other cell types, which may explain the lack of virulence of the mutants in mice. previous work by morishima et al. (16) demonstrated that differences in binding affinities of closely related strains of emc and mengovirus to various cells reflect their difference in pathogenicity for mice. ha+ revertants were isolated from the brains of mice infected intracranially with mutant 205 (2). in addition to regaining agglutination activity, the revertants were also virulent for mice and exhibited a slight increase in plaque size. however, revertants required 10^3- to 10^4-fold more pfu to kill mice than the wild-type virus. since the isoelectric character of the proteins of the ha+ revertants was identical to that of mutant 205, this partial phenotypic reversion may have resulted from a change in an uncharged amino acid associated with a determinant, located on the alpha (1 d) protein, which partially restores its biological activities. the partial reversion of this mutation may involve modification of the altered surface determinant, increasing the binding affinity of the virus for cells and restoring virulence, but would not result in reappearance of the isoelectric species absent in mutant 205 and revertant alpha (1 d) capsid proteins. alternatively, mutant 205 may express more than one mutation since complete reversion to wild-type virulence and plaque size did not occur and the isoelectric profile of the alpha (1 d) protein was the same as that of mutant 205. however, revertants were not isolated from mice infected intracranially with mutant 280, which shared biological properties as well as alpha (1 d) protein alterations with mutant 205. therefore, the mutation observed in the alpha (1 d) structural protein of mutant 280 is apparently more stable than that of mutant 205. however, unlike mutant 205, expression of the mutation of mutant 280 resulted in a more subtle, yet reproducible change in the isoelectric point of one of the four alpha (1 d) protein species. changes in virus-specified macromolecular synthesis in mutant and revertant virus-infected cells can be explained by a decrease in the ability of these viruses to attach to cells. these data suggest that a smaller proportion of cells were productively infected with the mutant and revertant viruses in comparison to wild-type. therefore, the effective moi for the mutant and revertant viruses would be less than that of wild-type. 
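the effect of a reduced effective moi on the proportion of infected cells can be illustrated with a small sketch. it rests on an assumption that is not made in the text: that infectious events are distributed among cells according to a poisson distribution, so that the fraction of cells infected at multiplicity m is 1 - exp(-m). the function names and the relative adsorption value of 0.1 are hypothetical; only the nominal moi of 3 comes from the methods.

```python
import math


def effective_moi(nominal_moi, relative_adsorption):
    """Scale the nominal moi by adsorption relative to wild-type (1.0 = wild-type)."""
    if nominal_moi < 0 or relative_adsorption < 0:
        raise ValueError("moi and relative adsorption must be non-negative")
    return nominal_moi * relative_adsorption


def fraction_infected(moi):
    """Fraction of cells receiving at least one infectious unit (Poisson assumption)."""
    return 1.0 - math.exp(-moi)


nominal = 3.0  # moi used for the protein and rna labeling experiments
wild_type = fraction_infected(effective_moi(nominal, 1.0))
mutant = fraction_infected(effective_moi(nominal, 0.1))  # hypothetical 10 percent adsorption
print(wild_type, mutant)
```

under this assumption a lower effective moi reduces the fraction of infected cells without shifting the timing of synthesis in the cells that are infected, which is in line with the argument that the peak hour of rna synthesis is unchanged while its magnitude falls.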
this difference can be explained by changes in the cellular binding affinities and by the higher particle : pfu ratios exhibited by the mutants and revertants (2), assuming that a greater proportion of noninfectious particles would lower the probability of these viruses infecting cells. this interference phenomenon would explain why pfu values obtained under dilute conditions could not be used to accurately predict the fraction of infected cells in high particle : cell infections. since fewer cells would be productively infected with mutant and revertant viruses, less virus-specified rna would be synthesized, although the peak hour of rna synthesis would be the same as that of wild-type. therefore, less rna would be available for translation, and protein synthesis, as detected by immunoprecipitation, would appear to be delayed in cells infected with the mutant viruses. changes in the metabolic rates of mutant virus-specified rna synthesis would predict different results than those presented here if the fraction of infected cells were similar to wild-type infections. collectively, our data suggest that the phenotypic changes expressed by the mutants may be due to mutations located exclusively within the alpha (1 d) coding region of the genome. previous work by agol et al. (1) has indicated that the neurovirulence of poliovirus maps to the 5' end of the genome, which includes the coding region of the capsid proteins. consistent with these results, we have identified two different mutations in the alpha (1 d) capsid protein, which maps to the 3' end of this region, of two avirulent mengovirus mutants. in addition, kohara et al. (14) have stated that a molecular recombinant of the sabin vaccine strain of poliovirus containing vp 1 (1 d) and most of vp 3 (1 c) of the neurovirulent mahoney strain is virulent, but not as virulent as the mahoney strain, and retains the small plaque morphology of the sabin strain. 
our data also indicate that changes in the alpha (1 d) protein result in altered virulence of a picornavirus. since revertants of mutant 205 shared similar characteristics with the poliovirus recombinant, the plaque-forming ability of a particular virus isolate may be a characteristic that reflects as well as contributes to its virulence. future experiments which compare the nucleotide sequences of the mutant and revertant structural proteins to that of the parental wild-type virus will be useful in determining the genetic basis for the altered biological properties exhibited by the mutant and revertant viruses.
1. construction and properties of intertypic poliovirus recombinants: first approximation mapping of the major determinants of neurovirulence
2. biological properties of mengovirus: characterization of avirulent, hemagglutination-defective mutants
3. iodination of poliovirus capsid proteins
4. liquid scintillation counting: elimination of spurious results due to static electricity
5. relatedness of virion and intracellular proteins of the murine coronaviruses jhm and a 59
6. two-dimensional electrophoretic analysis of encephalomyocarditis viral proteins
7. identification of the initiation site of poliovirus protein synthesis
8. two-dimensional gel electrophoresis and computer analysis of proteins synthesized by clonal cell lines
9. isoelectric points of polypeptides of standard poliovirus particles of different serological types and of empty capsids and dense particles of poliovirus type 1
10. a micromethod for complete removal of dodecyl sulfate from proteins by ion-pair extraction
11. characterization of t antigens in polyoma-infected and transformed cells
12. structure of the mengo virion. distribution of the capsid polypeptides with respect to the surface of the virus particle
13. rapid isolation of antigens from cells with a staphylococcal protein a-antibody absorbent: parameters of the interaction of antigen-antibody complexes with protein a
14. in vitro phenotypic markers of a poliovirus recombinant constructed from infectious cdna clones of the neurovirulent mahoney strain and the attenuated sabin 1 strain
15. protein iodination using iodogen
16. genomic and receptor attachment differences between mengovirus and encephalomyocarditis virus
17. two-dimensional polyacrylamide gel electrophoretic fractionation
18. structure of a human common cold virus and functional relationship to other picornaviruses
19. on the structure and morphogenesis of picornaviruses
20. systematic nomenclature for picornavirus proteins
21. poliovirus replication proteins: rna sequence encoding p 3-1 b and the sites of proteolytic processing
22. isoelectric points and conformations of proteins. i. effect of urea on the behavior of some proteins in isoelectric focusing
23. resolution of the major poliovirus capsid proteins into doublets
24. structure of the mengo virion. iv. amino- and carboxyl-terminal analysis of the major capsid polypeptides
we thank drs. sandra ewald and andreas luder for their helpful discussions and interest in our work and dr. andrew king for helpful comments on the manuscript. support for this work was obtained from three research creativity development grants awarded to k. a. by the department of graduate studies, montana state university, and a grant from the montana heart association, inc. awarded to c.w.b. 
received march 20, 1986 
key: cord-193133-puqcbf8t authors: piplani, sakshi; singh, puneet kumar; winkler, david a.; petrovsky, nikolai title: in silico comparison of spike protein-ace2 binding affinities across species; significance for the possible origin of the sars-cov-2 virus date: 2020-05-13 journal: nan doi: nan sha: doc_id: 193133 cord_uid: puqcbf8t the devastating impact of the covid-19 pandemic caused by sars coronavirus 2 (sars-cov-2) has raised important questions on the origins of this virus, the mechanisms of any zoonotic transfer from exotic animals to humans, whether companion animals or those used for commercial purposes can act as reservoirs for infection, and the reasons for the large variations in susceptibilities across animal species. traditional lab-based methods will ultimately answer many of these questions but take considerable time. in silico modeling methods provide the opportunity to rapidly generate information on newly emerged pathogens to aid countermeasure development and also to predict potential future behaviors. we used a structural homology modeling approach to characterize the sars-cov-2 spike protein and predict its binding strength to the human ace2 receptor. we then explored the possible transmission path by which sars-cov-2 might have crossed to humans by constructing models of ace2 receptors of relevant species, and calculating the binding energy of sars-cov-2 spike protein to each. notably, sars-cov-2 spike protein had the highest overall binding energy for human ace2, greater than for all the other tested species including bat, the postulated source of the virus. this indicates that sars-cov-2 is a highly adapted human pathogen. of the species studied, the next highest binding affinity after human was for pangolin, which is most likely explained by a process of convergent evolution. 
binding of sarscov2 for dog and cat ace2 was similar to affinity for bat ace2, all being lower than for human ace2, and is consistent with only occasional observations of infections of these domestic animals. overall, the data indicates that sarscov2 is uniquely adapted to infect humans, raising questions as to whether it arose in nature by a rare chance event or whether its origins lie elsewhere. the devastating impact of covid-19 infections caused by sars-coronavirus 2 (sars-cov-2) has stimulated unprecedented international activity to discover effective vaccines and drugs for this and other pathogenic coronaviruses. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] it has also raised important questions on the mechanisms of zoonotic transfer of viruses from animals to humans, questions as to whether companion animals or those used for commercial purposes can act as reservoirs for infection, and the reasons for the large variations in sars-cov-2 susceptibility across animal species. [17] [18] [19] understanding how viruses move between species may help us prevent or minimize these pathways in the future. elucidating the molecular basis for the different susceptibilities of species may also shed light on the differences in susceptibilities in different sub-groups of humans. very recently, shi et al. published the results of experiments to determine the susceptibility to sars-cov-2 of ferrets, cats, dogs, and other domesticated animals. 20 they showed that sars-cov-2 virus replicates poorly in dogs, pigs, chickens, and ducks, but ferrets and cats are permissive to infection. other studies have reported the susceptibility of other animal species to sars-cov-2. 17, 20, 21 susceptible species such as macaques, hamsters and ferrets are used as animal models of sars-cov-2 infection. 
[22] [23] [24] in the absence of purified, isolated ace2 from all the relevant animal species that could be used to measure the molecular affinities to spike protein experimentally, computational methods offer considerable promise for determining the rank order of affinities across species, as a method to impute which species may be permissive to sars-cov-2. here we show how computational chemistry methods from structure-based drug design can be used to determine the relative binding affinities of the sars-cov-2 spike protein for its receptor, angiotensin converting enzyme (ace)-2, a critical initiating event for sars-cov-2 infection, across multiple common and exotic animal species. [25] [26] [27] the aim of these studies was to better understand the species-specific nature of this interaction and see if this could help elucidate the origin of sars-cov-2 and the mechanisms for its zoonotic transmission. to construct the three-dimensional structure of the sars-cov-2 spike protein, the sequence was retrieved from the ncbi genbank database (accession number yp_009724390.1). a psi-blast search against the pdb database was performed for template selection, and the x-ray structure of the sars coronavirus spike (refcode 6acc), with 76.4% sequence similarity to sars-cov-2 spike protein, was selected as the template. the protein sequences of the ace2 proteins for different species are summarized in table 11 and the full sequence alignment is shown in supplementary figure 1 . the phylogenetic tree for ace2 proteins from selected animal species is illustrated in supplementary figure 2 . the 3d-structures were built using modeller 9.21 (https://salilab.org/modeller/). 28 the quality of the generated models was evaluated using the ga341 and dope scores, and the models were assessed using the swiss-model structure assessment server (https://swissmodel.expasy.org/assess). 29 the x-ray crystal structures of human ace2 (refcode 3sci) and human sars spike protein (refcode 5xlr) were retrieved from the protein data bank. 
protein preparation and removal of nonessential and non-bridging water molecules for docking studies were performed using the ucsf chimera package (https://www.cgl.ucsf.edu/chimera/). 30 these modelled structures were docked against the sars-cov-2 spike protein structure using the hdock server (http://hdock.phys.hust.edu.cn/). 31, 32 molecular docking was performed on the homology modelled sars-cov-2 spike protein with human and animal ace2 proteins. the sars-cov spike protein was also docked with human ace2 protein to obtain the docking pose for binding energy calculations. the docking poses were ranked using an energy-based scoring function. the docked structures were analyzed using ucsf chimera software. the final models were optimized using the amber99sb-ildn force field in gromacs2020 (http://www.gromacs.org/). 33 docked complexes (sars-cov-2 spike with human ace2, human sars-cov spike with human ace2, sars-cov-2 spike with bat ace2 etc) were used as starting geometries for md simulations. simulations were carried out using the gpu accelerated version of the program with the amber99sb-ildn force field in periodic boundary conditions on an oracle cloud server. docked complexes were immersed in a truncated octahedron box of tip3p water molecules. the solvated box was further neutralized with na+ or cl− counter ions using the tleap program. particle mesh ewald (pme) was employed to calculate the long-range electrostatic interactions. the cutoff distance for the long-range van der waals (vdw) energy term was 12.0 å. the whole system was minimized without any restraint, applying 2500 cycles of steepest descent minimization followed by 5000 cycles of conjugate gradient minimization. after system optimization, the md simulations were initiated by gradually heating each system in the nvt ensemble from 0 to 300 k for 50 ps using a langevin thermostat with a coupling coefficient of 1.0/ps and with a force constant of 2.0 kcal/mol·å² on the complex. 
finally, a production run of 100 ns of md simulation was performed under a constant temperature of 300 k in the npt ensemble with periodic boundary conditions for each system. during the md procedure, the shake algorithm was applied to constrain all covalent bonds involving hydrogen atoms. the time step was set to 2 fs. the structural stability of the complex was monitored by the rmsd and rmsf values of the backbone atoms of the entire protein. finally, the free energies of binding were calculated for all the simulated docked structures. calculations were also performed for up to 500 ns on human ace2 to ensure that 100 ns is sufficiently long for convergence. duplicate production runs starting with different random seeds were also run to allow estimates of binding energy uncertainties to be determined for the strongest binding ace2 structures. the binding free energies of the protein-protein complexes were evaluated in two ways. the traditional method is to calculate the energies of the solvated sars-cov-2 spike and ace2 proteins and that of the bound complex, and derive the binding energy by subtraction: 
$$\Delta G_{\mathrm{binding,aq}} = \Delta G_{\mathrm{complex,aq}} - \left(\Delta G_{\mathrm{spike,aq}} + \Delta G_{\mathrm{ACE2,aq}}\right)$$ 
we also calculated binding energies using the molecular mechanics poisson boltzmann surface area (mm-pbsa) tool in gromacs, which derives them from the nonbonded interaction energies of the complex. 34, 35 this is also a widely used method for binding free energy calculations. the binding free energies of the protein complexes were analyzed during the equilibrium phase from the output files of the 100 ns md simulations. the g_mmpbsa tool in gromacs was used after the molecular dynamics simulations, and the output files obtained were used to post-process the binding free energies. free energy decomposition analyses were also performed by mm-pbsa decomposition to get a detailed insight into the interactions between the ligand and each residue in the binding site. 
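the subtraction scheme described above can be illustrated with a minimal python sketch. it assumes that average free energies for the solvated complex, spike and ace2 have already been extracted from the simulations; the function names and all numerical values are hypothetical placeholders, not results from this study. the second helper shows one way the duplicate production runs could be summarised as a mean and a sample standard deviation.

```python
from statistics import mean, stdev


def binding_free_energy(g_complex, g_spike, g_ace2):
    """Return dG(binding) = dG(complex) - (dG(spike) + dG(ACE2)).

    All three inputs must be expressed in the same units (e.g. kJ/mol).
    """
    return g_complex - (g_spike + g_ace2)


def summarize_replicates(values):
    """Return (mean, sample standard deviation) over replicate runs.

    A single run has no spread estimate, so 0.0 is returned for it.
    """
    values = list(values)
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread


if __name__ == "__main__":
    # Hypothetical placeholder energies (kJ/mol) for two replicate runs.
    replicates = [
        binding_free_energy(-150.0, -60.0, -40.0),
        binding_free_energy(-154.0, -61.0, -41.0),
    ]
    avg, sd = summarize_replicates(replicates)
    print(f"dG_bind = {avg:.1f} +/- {sd:.1f} kJ/mol")
```

with only two replicates the standard deviation is a rough indication of run-to-run variability rather than a formal confidence interval.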
the binding interaction of each ligand-residue pair includes three terms: the van der waals contribution, the electrostatic contribution, and the solvation contribution. another estimate of the strength of the interaction within a protein-protein complex can be obtained from the non-bonded interaction energy between its components. gromacs has the ability to decompose the short-range nonbonded energies between any number of defined groups. to compute the interaction energies as a part of our analysis, we reran the trajectory files obtained during simulation to recompute energies using the -rerun command. the interaction energy is the combination of the short-range coulombic interaction energy (coul-sr:protein-protein) and the short-range lennard-jones energy (lj-sr:protein-protein) (see table 3 ). while this paper was being prepared, a paper by guterres and im described a substantial improvement in protein-ligand docking results using high-throughput md simulations. 36 they employed docking using autodock vina, followed by md simulation using charmm. the parameters they advocated were very similar to those used in our study. proteins were solvated in a box of tip3p water molecules extending 10 å beyond the proteins and the particle-mesh ewald method was used for electrostatic interactions. nonbonded interactions over 10 and 12 å were truncated. their systems were minimized for 5000 steps using the steepest descent method followed by 1 ns equilibration with an nvt setting. for each protein-ligand complex, they ran 3 × 100 ns production runs from the same initial structure using different initial velocity random seeds and an integration step size of 2 fs. the ancestry of sars-cov-2 traces back to the human, civet and bat sars-cov strains, which all use the same ace2 proteins for cellular entry. 
[37] [38] [39] the similarities and variations in sequences for both the sars-cov-2 spike protein and the sars spike protein were determined from sequences retrieved from the ncbi genbank databank and aligned using clustalw. the spike protein receptor binding domain (rbd) region showed a 72% identity between the two viruses (figure 1; conserved regions in black and non-conserved residues in red and blue). [figure: three-dimensional structure of sars-cov-2 spike protein] the ramachandran scores of the modeled ace2 structures for selected species are summarized in table 1, with the predicted ace2 structures and ramachandran plots for each selected species shown in supplementary figure 3 . the modelled structures were further assessed for quality control using ramachandran plot and molprobity scores in swissmodel structure assessment. the ramachandran plot checks the stereochemical quality of a protein by analyzing residue-by-residue geometry and overall structure geometry, and is also a way to visualize energetically allowed regions for backbone dihedral angles ψ against φ of amino acid residues in protein structure. the ramachandran score of sars-cov-2 spike protein was 90% in the binding region, and the molprobity score, which provides an evaluation of model quality at both the global and local level, was 3.17. further, the ace2 modelled structures from selected species were also assessed using swiss model structure assessment. the ramachandran score, i.e. the percentage of amino acid residues falling into the energetically favored region, ranged from 96-99% for all species (table 1) . these ramachandran and molprobity scores show that all the built structures were of good quality and were suitable for use in further studies. the ramachandran graphs are presented in supplementary figure 3 . the receptor binding domain (rbd) of sars-cov-2 spike protein was docked against the ace2 receptor of various species using the hdock server. 
the interacting residues of ace2 and sars-cov-2 spike protein are depicted in table 3 . we found certain key amino acids in the receptor binding motif (rbm) that were in accordance with previous studies 40 . certain amino acids including phe28, asn330, asp355 and arg357 were conserved in ace2 of most of the selected species and were observed to take part in the interaction with spike protein. tyr41, lys353, ala386 and arg393 also interacted with spike protein residues and were highly conserved across all species except bat, mouse, ferret and pangolin, respectively. spike protein interacting residues with ophiophagus hannah (king cobra) were least common amongst all the ace2 species included in the study, consistent with its low sequence similarity to human ace2. the molecular dynamics simulations of complexes of sars-cov-2 spike protein and ace2 receptors of various species were performed for 100 ns. all complexes became stable during simulation with rmsd fluctuations converging to a range of 0.5 to 0.8 nm from the original position. the calculated binding energies for the interactions of sars-cov-2 with ace2 from the species studied are presented in table 4 . table 4 also includes observational in vivo data on sars-cov-2 infectivity and disease symptoms in the species where this has been reported. although bats carry many coronaviruses including sars-cov, a relative of sars-cov-2, direct evidence for the existence of sars-cov-2 in bats has not been found. as highlighted by our data, the binding strength of sars-cov-2 for bat ace2 is considerably lower than for human ace2, suggesting that even if sars-cov-2 did originally arise from a bat precursor it must later have adapted its spike protein to optimise its binding to human ace2. there is no current explanation for how, when or where this might have happened. 
instances of direct human infection by coronaviruses or other bat viruses are rare, with transmission typically involving an intermediate host. for example, henipaviruses such as hendra are periodically transmitted from bats to horses and then to humans who contact the infected horse. similarly, sars-cov was shown to be transmitted from bats to civet cats and from them to humans. to date, a virus identical to sars-cov-2 has not been identified in bats or any other non-human species, making its origins unclear. to date, the most closely related coronavirus to sars-cov-2 is the bat coronavirus, batcov ratg13, which has 96% whole-genome identity to sars-cov-2. 50 hence other genetic factors could underlie the apparent lack of susceptibility of dogs to covid-19 clinical infection. it is known that gain of function (gof) mutations occur in viruses that can lead to pandemics. gof means that viruses gain a new property; e.g. in influenza virus, gof has been associated with the acquisition of a new function, such as mammalian transmissibility, increased virulence for humans, or evasion of existing host immunity. 49 the conditioning of viruses to humans as pandemics progress is well recognized. however, the sars-cov-2 structures and sequences that we employed were from viruses collected very early in the pandemic. it is therefore not clear how sars-cov-2 could have developed such a high affinity for human ace2, notably higher than for those of putative zoonotic sources for sars-cov-2, unless it has been previously selected on human ace2 or an ace2 of another species bearing a closely homologous spike protein binding domain. interestingly, pangolin ace2 bears some similarities in its sbd to human ace2. this marries with the fact that pangolin-cov shares a highly similar rbd to sars-cov-2, although their remaining sequence has only 90% similarity. 
this could be consistent with a process of convergent evolution whereby human and pangolin coronaviruses infecting via ace2, have come to the same solution in respect of evolving an optimal spike rbd for binding of either human or pangolin ace2, respectively. our data does indicate that humans might be permissive to pangolin covs that use ace2 for cell entry, a fact that needs to be borne in mind in respect of future potential coronavirus pandemic sources. however, this does not mean that pangolin ace2 was the receptor on which the sars-cov-2 spike protein rbd was initially selected, with the strength of binding to pangolin ace2 lower than binding to human ace2. this makes it unlikely that pangolins are the missing intermediate host. if sars-cov-2 spike was selected on pangolin ace2, then given the higher affinity of sars-cov-2 for human ace2 than for bat ace2, sars-cov-2 would have to have circulated in pangolins for a long period of time for this evolution and selection to occur and to date there is no evidence of a sars-cov-2 like virus circulating in pangolins. another possibility would be a short term evolutionary step where a pangolin was recently coinfected with a bat ancestor to sars-cov-2 at the same time as it was infected by a pangolin cov allowing a recombination event to occur whereby the spike rbd of the pangolin virus was inserted into the bat cov, thereby conferring the bat cov with high binding for both pangolin and human ace2. such recombination events are known to occur with other rna viruses and can explain creation of some pandemic influenza strains 49 . nevertheless, such events are by necessity rare as they require coinfection of the one host at exactly the same time. 
most importantly, if such a recombination event had occurred in pangolins it might have been expected to have similarly triggered an epidemic spread of the new highly permissive sars-cov-2 like virus among pangolin populations, such as we now see occurring across the human population. currently there is no evidence of such a pangolin sars-cov-2 like outbreak, making this whole scenario less likely. indeed, pangolins might be protected from sars-cov-2 infection due to the existence of crossprotective spike rbd neutralising antibodies induced by exposure to pangolin cov, given the rbd similarity of these two viruses. another possibility which still cannot be excluded is that sars-cov-2 was created by a recombination event that occurred inadvertently or consciously in a laboratory handling coronaviruses, with the new virus then accidentally released into the local human population. given the seriousness of the ongoing sars-cov-2 pandemic, it is imperative that all efforts be made to identify the original source of the sars-cov-2 virus. in particular, it will be important to establish whether covid-19 is due to a completely natural chance occurrence where a presumed bat virus was transmitted to humans via an intermediate animal host or whether covid-19 has alternative origins. this information will be of paramount importance to help prevent any similar human coronavirus outbreak in the future. 
clinical trials for the treatment of coronavirus disease 2019 (covid-19): a rapid response to urgent need progress and prospects on vaccine development against sars-cov-2 antiviral treatment of covid-19 covid-19: a fast evolving pandemic the covid-19 vaccine development landscape world health organization declares global emergency: a review of the 2019 novel coronavirus (covid-19) repurposing antimalarials and other drugs for covid-19 pharmacologic treatments for coronavirus disease 2019 (covid-19): a review what does plant-based vaccine technology offer to the fight against covid-19? vaccines (basel) clinical trials on drug repositioning for covid-19 treatment perspectives: potential therapeutic options for sars-cov-2 patients based on feline infectious peritonitis strategies: central nervous system invasion and drug coverage research towards treating covid-19 timely development of vaccines against sars-cov-2 don't rush to deploy covid-19 vaccines and drugs without sufficient safety guarantees covid-19 needs a big science approach can companion animals become infected with covid-19? can companion animals become infected with covid-19? 
covid-19: zoonotic aspects susceptibility of ferrets, cats, dogs, and other domesticated animals to sarscoronavirus 2 absence of sars-cov-2 infection in cats and dogs in close contact with a cluster of covid-19 patients in a veterinary campus infection and rapid transmission of sars-cov-2 in ferrets age-related rhesus macaque models of covid-19 simulation of the clinical and pathological manifestations of coronavirus disease 2019 (covid-19) in golden syrian hamster model: implications for disease pathogenesis and transmissibility structural basis for the recognition of sars-cov-2 by full-length human ace2 ace2 the janus-faced protein -from cardiovascular protection to severe acute respiratory syndrome-coronavirus and covid-19 structural basis of receptor recognition by sars-cov-2 comparative protein modelling by satisfaction of spatial restraints toward the estimation of the absolute quality of individual protein structure models ucsf chimera--a visualization system for exploratory research and analysis hdock: a web server for proteinprotein and protein-dna/rna docking based on a hybrid strategy the hdock server for integrated protein-protein docking high performance molecular simulations through multilevel parallelism from laptops to supercomputers electrostatics of nanosystems: application to microtubules and the ribosome open source drug discovery, c. & lynn, a. 
g_mmpbsa--a gromacs tool for high-throughput mm-pbsa calculations improving protein-ligand docking results with high-throughput molecular dynamics simulations isolation and characterization of a bat sars-like coronavirus that uses the ace2 receptor angiotensin-converting enzyme 2 is a functional receptor for the sars coronavirus receptor recognition by the novel coronavirus from wuhan: an analysis based on decade-long structural studies of sars coronavirus a highly conserved cryptic epitope in the receptor-binding domains of sars-cov-2 and sars-cov predicting the angiotensin converting enzyme 2 (ace2) utilizing capability as the receptor of sars-cov-2 silico analysis of intermediate hosts and susceptible animals of sars-cov-2. chemrxiv identifying sars-cov-2 related coronaviruses in malayan pangolins an overview of sars-cov-2 and animal infection sars-cov-2 spike protein favors ace2 from bovidae and cricetidae the pathogenicity of 2019 novel coronavirus in hace2 transgenic mice comparative pathogenesis of covid-19, mers and sars in a nonhuman primate model risks and benefits of gain-of-function experiments with pathogens of pandemic potential, such as influenza virus: a call for a science-based discussion a pneumonia outbreak associated with a new coronavirus of probable bat origin alignment of ace2 amino acid sequence from selected species. the sar-cov-2 spike protein binding region is highlighted in red we would like to thank harinda rajapaksha for assistance to optimise gromacs for this project. we would like to thank oracle corporation for providing their cloud computing resources through the oracle for research program for the modelling studies described herein. in particular, we wish to supplementary figure 4 . predicted and confirmed utilization of ace2 receptors by the sars-cov-2 spike protein based on sequence homology from qui et al. 
41 
key: cord-010938-12igesqw authors: patra, prasanta; mondal, niladri; patra, bidhan chandra; bhattacharya, manojit title: epitope-based vaccine designing of nocardia asteroides targeting the virulence factor mce-family protein by immunoinformatics approach date: 2019-08-21 journal: int j pept res ther doi: 10.1007/s10989-019-09921-4 sha: doc_id: 10938 cord_uid: 12igesqw nocardia asteroides is the main causative agent of nocardiosis in immunocompromised patients, viz. those with acquired immunodeficiency syndrome (aids), malignancy, diabetes, organ transplants or genetic disorders. virulence factors and outer membrane proteins contribute greatly to the design of epitopic vaccines and to limiting disease outbreaks. an epitope-based vaccine element carrying b and t cell epitopes along with an adjuvant is highly immunoprophylactic in nature. the present research employs immunoinformatics to identify suitable epitopes for effective vaccine design. the selected epitopes, with amino acid sequences vlgssvqta, vnielkpef and vvpsnlfav, are recognized by hla-drb alleles of both mhc classes (mhc-i and ii). these epitopes are also accessible to b-cells, as confirmed through the abcpred server. their antigenic property is validated by the vaxijen antigenic prediction web portal. molecular docking between the epitopes and the t cell receptor delta chain confirms the interaction between epitope and receptor, with significantly low binding energy. easy access of the epitopes to the immune system can also be concluded from the transmembrane nature of the protein, verified using the tmhmm server. a structural model of the virulence factor mce-family protein was generated through the phyre2 server and subsequently validated by the prosa and procheck program suites. the structural configuration of these epitopes was also modelled using the distill web server. 
both the structures of the epitopes and of the protein will contribute a significant step in the design of an epitopic vaccine against n. asteroides. therefore, such an immunoinformatics based computational approach provides a clear impetus towards the development of an epitopic vaccine as a promising remedy for nocardiosis. the gram positive bacillus nocardia asteroides causes nocardiosis, directly targeting immunocompromised patients (wilson 2012) . a total of 225 species have been estimated so far under the genus nocardia (bennett et al. 2014) . it is an opportunistic pathogen responsible for several diseases in humans as well as other vertebrate animals (moylett et al. 2003) . in practice these bacteria are a predominant cause of pulmonary infection. the result is acute necrotizing pneumonia, which is also reflected in inflammation of cutaneous and subcutaneous tissue. the brain is recognized as the most favored site for secondary infection, affected in 25% of patients (ellis and beaman 2002; quinones-hinojosa 2012) . n. asteroides is also reported to be the prime cause of serious cerebral abscess (fleetwood et al. 2000) . nocardiosis is distributed worldwide but is most prevalent in the united states (patil et al. 2012; saubolle and sussland 2003) . nocardiosis infection is influenced by the sex and age of the patient. the reported male to female patient ratio of 3:1 indicates that men are more prone to nocardiosis than women. recently, around 0.375 nocardia infections for every 100 000 individuals have been reported throughout the world (palmieri et al. 2014) . in the united states alone, 500-1000 individuals are affected every year in a random mode (sethy et al. 2016) . numerous efforts have been made to overcome this fatal disease, but research data show that it remains prevalent. 
nocardiosis is predominant in alcoholic, diabetic, cancer, aids and organ transplant patients, so it needs immediate control and monitoring. at present such patients are recorded quite frequently. therefore, this disease is already of great concern and requires an immediate remedy by promising, cost effective means. vaccination against the disease will be the finest and most reliable option for controlling disease outbreaks (nichol et al. 2003) . a vaccine would be more effective if it could elicit both b and t-cell dependent immune responses. vaccine design against any disease can be performed using two specific approaches, the conventional approach and the reverse vaccinological approach. in the present research plan the reverse vaccinological approach has been applied to design an epitopic vaccine against n. asteroides, as the conventional technique has some limitations. the conventional technique requires much time and is also very expensive. furthermore, it relies on killed, attenuated or indolent pathogens for vaccine development, which will prove deadly if the pathogen reverts to its pathogenic form (khan et al. 2015) . in contrast to the conventional approach, reverse vaccinology harnesses immunoinformatics for developing vaccines against pathogenic diseases (vivona et al. 2008) . this advanced approach involves several web portals and computational tools to predict potential vaccine candidates. the fundamental principle of reverse vaccinology is that it targets protein derived epitopic sequences for novel vaccine generation. reverse vaccinology provides easy and reliable techniques to find accurate b and t-cell epitopes (srivastava et al. 2019) . the exomembrane and secreted proteins of pathogens are ideal for vaccine generation, as these can easily interact with the immune system (gourlay et al. 2013) . 
since the bacterial proteins are foreign to the host body, they will be recognized as antigenic determinants. expression of both b and t-cell epitopes will enhance the immunogenicity driven by b and t-cell proliferation. the virulence factor mce-family protein has been found to be responsible for the virulence and pathogenicity of n. asteroides. the current research focuses on the computational analysis and identification of potential epitopes present within the bacterial mce-family protein candidate. the identified epitopes can therefore serve as ideal components for future vaccine development against n. asteroides infection. computer aided techniques have been employed to carry out this research. the targeted protein sequence has been analyzed through several established web prediction portals in order to extract the functional epitopes leading to novel vaccine generation against nocardiosis. the amino acid sequence of a protein is the preliminary requirement for the development of a computer aided epitopic vaccine component. the amino acid sequence of the virulence factor mce-family protein was retrieved from the ncbi protein database (coordinators 2017) . the identified amino acid sequence was downloaded in fasta file format for further computer based analysis. b-cell epitope identification is one of the most crucial steps in the development of an epitopic vaccine (jamal et al. 2017) . potential b-cell epitopes have been predicted employing the abcpred web server (saha and raghava 2006) . the server utilizes a recurrent neural network technique to locate the specific b-cell epitopes present in the targeted protein sequence. the server also uses five parameters to validate the predicted outputs. 
these factors include the positive prediction value, $\mathrm{ppv} = \frac{tp}{tp + fp}$, where tp, fn, tn and fp represent true positive, false negative, true negative and false positive outputs respectively (saha and raghava 2006). the fasta sequence of the targeted virulence factor mce-family protein was submitted to the abcpred server. for b-cell epitope identification, the threshold score was set to 0.75 and the window length to 20. an overlapping filter was also used to eliminate overlapping epitope predictions. to generate a strong immune response, an antigenic epitope needs to be accessible to both classes of mhc molecules (naz et al. 2015) as well as to b-cells (barh et al. 2010; bhattacharya et al. 2019). for the prediction of potent t-cell epitopes, the propred (singh and raghava 2001) and propred-i (singh and raghava 2003) servers were used for mhc-ii and mhc-i molecules respectively. the propred and propred-i web servers predict epitopes that can be recognized by 51 mhc-ii and 47 mhc-i alleles (mustafa and shaban 2006). both servers apply matrix-based prediction algorithms (lafuente and reche 2009; lin et al. 2008) to find potent t-cell epitopes. in the current work, the predicted b-cell epitopes were submitted to the propred and propred-i servers with default parameters to identify the epitopes common to b- and t-cells. these common epitopes were further investigated for their antigenic property through the vaxijen (v. 2.0) server (doytchinova and flower 2007). the server requires a query amino acid sequence and evaluates its antigenic propensity with an accuracy of 70-97% (dimitrov et al. 2016). the server also restricts the prediction to one of five target groups: bacteria, virus, tumor, parasite and fungal (zaharieva et al. 2017). 
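the step in which 20-residue b-cell epitopes are screened for 9-residue mhc binders can be pictured as a sliding-window search followed by a set intersection. the sketch below is only an illustration under stated assumptions: the peptide and the two binder lists are hypothetical placeholders (they are not taken from the tables of this study), the function names are invented for the example, and the real propred and propred-i servers score each window against allele-specific matrices rather than looking it up in a list.

```python
def nine_mers(peptide, k=9):
    """Return every contiguous k-residue window of a peptide, in order."""
    peptide = peptide.upper()
    return [peptide[i:i + k] for i in range(len(peptide) - k + 1)]


def common_epitopes(b_cell_epitopes, mhc1_binders, mhc2_binders, k=9):
    """Keep k-mers that lie inside a B-cell epitope and bind both MHC classes."""
    shared = set(mhc1_binders) & set(mhc2_binders)
    hits = []
    for epitope in b_cell_epitopes:
        for window in nine_mers(epitope, k):
            if window in shared and window not in hits:
                hits.append(window)
    return hits


# Hypothetical 20-residue B-cell epitope (placeholder, not from table 1)
b_cell = ["ACDEFGHIKLMNPQRSTVWY"]
# Hypothetical binder lists standing in for propred-i and propred output
mhc1 = ["DEFGHIKLM", "PQRSTVWYA"]
mhc2 = ["DEFGHIKLM", "KLMNPQRST"]

print(len(nine_mers(b_cell[0])))            # 12 windows in a 20-mer
print(common_epitopes(b_cell, mhc1, mhc2))  # ['DEFGHIKLM']
```

a 20-residue window contains 20 − 9 + 1 = 12 overlapping 9mers, which is why a single b-cell epitope can yield several candidate t-cell epitopes.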
the auto cross covariance (acc) method is also used to convert the submitted sequences into vectors of uniform length (doytchinova and flower 2007). the acc is estimated by the formulas $acc_{jj}(l) = \sum_{i=1}^{n-l} \frac{z_{j,i}\, z_{j,i+l}}{n-l}$ (1) and $acc_{jk}(l) = \sum_{i=1}^{n-l} \frac{z_{j,i}\, z_{k,i+l}}{n-l}$ (2), where 'j' is the z-scale (j = 1, 2, 3), 'n' is the number of amino acids in the sequence (i = 1, 2,…, n) and 'l' is the lag ($l = 1, 2, \ldots, L$). eq. (2) is used when there are two different z-scales, j and k. the query sequence was submitted to the server with bacteria selected as the target organism. to obtain a more stringent result, a threshold value of 1.0 was set instead of the default threshold (0.4). the common b- and t-cell epitopes with antigenic property are promising candidates for epitopic vaccine design. epitopes predicted by the abcpred, propred and propred-i servers that were also predicted by vaxijen to be antigenic were selected for vaccine development. the three-dimensional conformation of an epitope plays a significant role in vaccine development. it is the key object for investigating epitope-antibody docking (alam et al. 2016). because of the short amino acid length of the epitopes, conventional modeling servers cannot be used, so the distill 2.0 server (baú et al. 2006) was used instead. the server uses two sets of bidirectional recurrent neural networks (pollastri et al. 2002), and filtering is applied in two successive stages. the first-stage predictions are averaged over several adjacent windows and added to the network input. if $\sigma_j = (\alpha_j, \beta_j, \gamma_j)$ denotes the first-stage output for residue j (helix, strand and coil), the second-stage network takes $I_j$ as its input instead of $\sigma_j$, in which $k_f = j + f(2\omega + 1)$, $2\omega + 1$ is the window size (ω = 7) and 2p + 1 is the number of windows (p = 7) taken into account (pollastri and mclysaght 2004). the server returns a pdb file of the submitted epitope sequence as its result. 
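the auto cross covariance transformation mentioned above turns peptides of any length into vectors of one fixed length, which is what lets a single model score sequences of different sizes. the following is a minimal sketch of that idea, not the vaxijen implementation: the descriptor table `TOY_Z` holds hypothetical placeholder values for a three-letter toy alphabet (the published z-scales are not reproduced here), and the default maximum lag of 5 is an assumption made for the example.

```python
def acc_transform(sequence, z_table, max_lag=5):
    """Auto cross covariance of a peptide described by per-residue descriptors.

    For descriptors j and k and lag l (0-based residue index i):
        acc[j, k, l] = sum_{i=0}^{n-l-1} z[i][j] * z[i + l][k] / (n - l)
    j == k gives the auto covariance terms, j != k the cross covariance terms.
    """
    z = [z_table[res] for res in sequence.upper()]   # n rows of descriptors
    n = len(z)
    n_desc = len(z[0])
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    features = []
    for lag in range(1, max_lag + 1):
        for j in range(n_desc):
            for k in range(n_desc):
                total = sum(z[i][j] * z[i + lag][k] for i in range(n - lag))
                features.append(total / (n - lag))
    return features


# Placeholder descriptors for a toy alphabet (hypothetical values, not z-scales)
TOY_Z = {"A": (1.0, 0.0, -1.0), "V": (0.5, 1.0, 0.0), "L": (-1.0, 0.5, 1.0)}

short = acc_transform("AVLAVLAVL", TOY_Z)
long_ = acc_transform("AVLLAVVALAVLLAVVAL", TOY_Z)
print(len(short), len(long_))   # both 45: 3 descriptors x 3 descriptors x 5 lags
```

the output length depends only on the number of descriptors and the maximum lag, so a 9-residue and an 18-residue peptide map to vectors of the same size.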
the three-dimensional (3d) architecture of a protein plays an essential role in protein function and stability (roy et al. 2012; willard et al. 2003). unraveling the structural configuration of the protein is therefore necessary for the study. the structure of the virulence factor mce-family protein of n. asteroides is unavailable in the pdb database (berman et al. 2000). because no pdb structure is available, the phyre2 web server (kelley et al. 2015) was used to generate the protein structural data. the phyre2 server applies hidden markov model alignment through hhsearch, an open-source software package, to detect the structure of the target protein (kumar and jena 2014; nema and pal 2013). the server also uses the poing folding simulation to model the parts of the protein sequence that could not be aligned (nema and pal 2013). the amino acid sequence of the virulence factor mce-family protein was submitted to the server, which returns a zip file containing the predictions. validation of a predicted protein model is fundamental for establishing the 3d configuration of the protein (laskowski et al. 2006; rodriguez et al. 1998). to validate the model quality, two web portals were used. the procheck web server (laskowski et al. 1993) analyzes the stereochemical properties of the protein model, with emphasis on the torsion angles around the cα atoms of the amino acids. a pdb file of the targeted protein model has to be provided to the web server. the server returns a ramachandran plot of all amino acids, which is also vital for validating the targeted protein model (laskowski et al. 1993). the prosa-web server was used to calculate the z score of the protein model; it plots the z score of the model against those of all known protein structures (wiederstein and sippl 2007). this server also provides an energy plot for assessing the model quality. 
negative energy values of the amino acid residues indicate good model quality of the examined protein (belkina et al. 2001; wiederstein and sippl 2007). transmembrane helix prediction for the protein was performed with the tmhmm (ver. 2.0) server, which is also available as a standalone software package (sonnhammer et al. 1998). the server predicts transmembrane helices with a hidden markov model method with 97-98% accuracy (krogh et al. 2001). the fasta sequence of the protein was submitted to the server, which returns a list of predicted transmembrane helices along with a graphical representation. molecular docking between epitope and antibody component is critical for the proper functioning of a vaccine candidate. the patchdock server was used for molecular docking in the present research (duhovny et al. 2002). it applies a geometric shape complementarity method to dock the peptide and protein structures (schneidman-duhovny et al. 2003). the pdb file or pdb code of the protein and of the epitope has to be submitted to the server for docking analysis. the docked complexes are ranked according to geometric shape complementarity score and presented along with ace value, interface area and pdb data (schneidman-duhovny et al. 2005). in molecular docking, ace (atomic contact energy) is the aggregated desolvation energy of atom pairs. for atom sets $S_1$ and $S_2$ and a threshold distance d, the ace is: $\mathrm{ace} = \sum_{s \in S_1} \sum_{t \in S_2,\ |s - t| < d} T[s,t]$. here |s − t| is the euclidean distance between s and t, and $T[s,t]$ is the predefined score of the atom pair s and t. $T[s,t]$ is estimated using the subsequent formula: here 0 signifies the solvent. $N_{s,t}$ and $N_{s,0}$ are the numbers of s-t and s-0 contacts observed in known complexes, and $C_{s,t}$ and $C_{s,0}$ are the possible numbers of s-t and s-0 contacts (guo et al. 2012). 
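the ace value described above is a sum of predefined pair scores over all atom pairs, one atom from each molecule, that lie closer than the threshold distance d. a minimal sketch of that summation is given below. it is not the patchdock code: the atom types, coordinates, pair-score table and the strict "less than d" cutoff are all assumptions chosen for illustration, and the score values are hypothetical rather than published contact energies.

```python
import math


def ace_score(atoms1, atoms2, pair_score, d):
    """Sum pair_score[(type_s, type_t)] over atom pairs closer than d.

    atoms1, atoms2: lists of (atom_type, (x, y, z)) tuples.
    pair_score: dict mapping (type_s, type_t) to a predefined score T[s, t].
    """
    total = 0.0
    for type_s, pos_s in atoms1:
        for type_t, pos_t in atoms2:
            if math.dist(pos_s, pos_t) < d:   # Euclidean distance |s - t|
                total += pair_score[(type_s, type_t)]
    return total


# Hypothetical toy data: two atoms per molecule, made-up pair scores
epitope_atoms = [("C", (0.0, 0.0, 0.0)), ("N", (1.0, 0.0, 0.0))]
receptor_atoms = [("O", (0.0, 3.0, 0.0)), ("C", (10.0, 0.0, 0.0))]
T = {("C", "O"): -0.5, ("N", "O"): 0.2, ("C", "C"): -1.0, ("N", "C"): 0.1}

# Only the two pairs involving the oxygen are within 6 angstroms
print(ace_score(epitope_atoms, receptor_atoms, T, d=6.0))
```

a more negative total indicates a more favourable desolvation contribution, which is how the low ace values of the docked complexes are read in this study.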
the amino acid sequence of the virulence factor mce-family protein was retrieved from the ncbi protein database. the protein consists of 493 amino acids and the sequence was downloaded in fasta format (genbank: sfl64340.1) (benson et al. 2012). this sequence was then processed through several web servers to find potential epitopes for vaccine development. the abcpred server predicted 20 linear epitopes (table 1) within the virulence factor mce-family protein, each 20 amino acids long (the window length). the sensitivity, specificity and accuracy of the server are 57.14%, 71.57% and 64.26% respectively at a window length of 20 (saha and raghava 2006). a score is assigned to each predicted epitope and the epitopes are ranked accordingly (han et al. 2015). a predicted sequence with a higher score has a better chance of being a potent epitope (jones and carter 2014). the predicted epitopes are of high quality because of the high threshold score of 0.75. the identified b-cell epitopes were further investigated for prospective t-cell epitopes through propred and propred-i to improve immunogenicity (oprea and antohe 2013). the b-cell epitopes that are recognized by both classes of mhc molecules were taken forward. the epitopes common to both mhc classes and b-cells are listed in table 2 and may be regarded as ideal vaccine candidates. the corresponding mhc alleles are listed in the supplementary table. the propred server predicts epitopes of only nine amino acids; the common epitopes therefore consist of 9mer sequences. antigenicity is the prime criterion for a novel epitope from an immunobiological point of view (chen et al. 2007). the common b- and t-cell epitopes were then evaluated for antigenic propensity with the vaxijen server; table 3 presents the epitopes along with their antigenic scores. 9mer epitopes with an antigenic score above the threshold value of 1.0 are considered antigenic. 
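the thresholding step amounts to keeping the peptides whose score exceeds the chosen cutoff and ranking them. the sketch below uses only the three vaxijen scores that this study reports for the epitopes that pass; the scores of the ten rejected 9mers are not given in the text and are therefore not included. the function name is hypothetical, and the strict "greater than" comparison is an assumption based on the wording "above the threshold".

```python
# Scores reported in this study for the three retained epitopes
VAXIJEN_SCORES = {
    "VLGSSVQTA": 1.1110,
    "VNIELKPEF": 2.4569,
    "VVPSNLFAV": 1.0810,
}


def antigenic_epitopes(scores, threshold=1.0):
    """Return (peptide, score) pairs above the threshold, best score first."""
    passing = [(pep, s) for pep, s in scores.items() if s > threshold]
    return sorted(passing, key=lambda item: item[1], reverse=True)


for peptide, score in antigenic_epitopes(VAXIJEN_SCORES):
    print(peptide, score)

# A slightly stricter cutoff would already drop one of the three
print(len(antigenic_epitopes(VAXIJEN_SCORES, threshold=1.1)))  # 2
```

two of the three retained scores sit close to the customized cutoff of 1.0, so the selection is sensitive to the exact threshold chosen.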
as a result, 10 of the 13 9mer epitopes failed to show antigenicity. three 9mers, vlgssvqta, vnielkpef and vvpsnlfav, with antigenic scores of 1.1110, 2.4569 and 1.0810 respectively, showed antigenic propensity. these three 9mer b- and t-cell epitopes (table 4) will be easily accessible to the immune system (comerford et al. 1991). they can be used for future epitopic vaccine development and are expected to elicit a strong immune response when administered as vaccine elements. the selected epitopes vlgssvqta, vnielkpef and vvpsnlfav were submitted to the distill server (2.0). the distill server returns five models of each epitope in pdb format. only the top-ranked model of each epitope was considered. the ucsf chimera (ver. 1.13.1) program was used to visualize the selected pdb files of the targeted epitopes (pettersen et al. 2004). the generated images are presented in fig. 1a, b, c for vlgssvqta, vnielkpef and vvpsnlfav respectively. the virulence factor mce-family protein lacks a pdb entry; for that reason its structure was predicted by homology modeling. modeling was performed via the hidden markov model approach in the phyre2 server. of the 493 amino acids of the virulence factor mce-family protein, 41%, 11% and 4% form α-helix, β-strand and transmembrane helix respectively. the phyre2-predicted pdb data of the protein were processed with the ucsf chimera program to generate a structural image. figure 2 shows the three-dimensional conformation of the virulence factor mce-family protein of n. asteroides. the topology of the virulence factor mce-family protein was validated using the prosa and procheck web servers. according to the procheck-generated ramachandran plot (fig. 
3), 79.3% of residues lie in the most favoured region and only 1.7% lie in the disallowed region (table 5). the presence of only two non-glycine, non-proline residues in the disallowed region supports the model quality. the prosa-estimated z score (−2.38) of the model lies within the range of experimentally determined protein structures (fig. 4) (sharma and jaiswal 2009). as presented in fig. 5, most residues of the model have negative energy values and only a few have positive values, which supports the model (bodade et al. 2010). pdb files of the vlgssvqta, vnielkpef and vvpsnlfav epitopes and of the variable domain of the t cell receptor delta chain (pdb id: 1tvd) were submitted to the patchdock server (li et al. 1998). the server provides 20 docking complexes for each epitope, ranked by geometric shape complementarity score. only the top-ranked complex of each epitope was picked for computational analysis. the epitopes, along with score, interface area and ace value, are listed in table 6. the significantly low ace values of the docking complexes indicate elevated reactivity between epitope and receptor (guo et al. 2012; lavi et al. 2013). the docking of the selected epitopes with the t cell receptor suggests that the epitopes will be accessible to the immune system and generate specific immunogenicity. rational identification, validation and in silico analysis of epitopic components facilitate the successful generation of novel vaccines. a vaccine plays a key role not only in recovery from infection but also in controlling future disease outbreaks. nocardiosis has a dreadful influence on immunocompromised patients, as they show a high level of disease susceptibility. organ transplant patients in particular face a greater chance of nocardiosis. mortality among organ recipients with nocardia infection is often high (husain et al. 2002). 
moreover, the causative pathogen is highly resistant to several well-known, commercially available antibiotics (husain et al. 2002). the survivability, infectivity and antibiotic resistance of the bacterial pathogen are among the greatest concerns for medical biotechnologists. the present computational research emphasizes the discovery of an effective vaccine against n. asteroides. the virulence factor mce-family protein of n. asteroides is responsible for the pathogenesis of the bacterium in other organisms. hence this protein served as the key component for designing an epitopic vaccine against nocardiosis. the virulence factor mce-family protein was processed through several bioinformatic tools to identify suitable epitopes with antigenicity. the transmembrane localization of the protein permits it to interact with the immune system. the transmembrane helix prediction server tmhmm (ver. 2.0) attests the exomembrane localization of the protein component. both b- and t-cell epitopes were considered in order to obtain the maximum immune response through humoral and cell-mediated immunity. the epitopes were identified through sequence-based prediction using several web prediction servers. after the protein sequence of the virulence factor mce-family protein (genbank: sfl64340.1) was retrieved from the ncbi database, it was processed through the abcpred server for b-cell epitope motifs. the 20 b-cell epitopes meeting the threshold value were then submitted to the propred and propred-i servers for mhc-ii and mhc-i binding alleles respectively (barh et al. 2010). the common epitopes are shortened to 9mers because the propred prediction module provides only 9mer epitopes. the epitopes vlgssvqta, vnielkpef and vvpsnlfav were selected for vaccine design after validation of their antigenic property through the vaxijen server. 
these are highly antigenic (vlgssvqta = 1.1110, vnielkpef = 2.4569 and vvpsnlfav = 1.0810), lying above the customized antigenic threshold score. this research design revealed that the three epitopes are the finest vaccine components, as they are recognized by b-cells and mhc molecules. the structures of these epitopes were modeled through distill 2.0 for conformational analysis and molecular docking. proper binding of the epitopes to the t cell receptor delta chain is reflected in the significantly low ace values. the structural profile of the virulence factor mce-family protein was also mapped out via the phyre2 server, which will be fundamental for designing the vaccine. the epitopes vlgssvqta, vnielkpef and vvpsnlfav are expected to be effective for designing an epitopic vaccine against n. asteroides, limiting nocardiosis and the casualties linked with it. this immunoinformatic study focuses on the virulence factor mce-family protein of n. asteroides and its significance in the expression of bacterial pathogenicity. the research should be valuable for modern therapeutic purposes, to resist nocardiosis outbreaks. the antigenicity of the common b- and t-cell epitopes (vlgssvqta, vnielkpef and vvpsnlfav) substantiates their capacity to generate humoral and cell-mediated immunity. the transmembrane localization of the protein further favours easy interaction of the targeted epitopes with immune receptors. the structural signature of the protein and its epitopes serves as a decisive factor for novel vaccine development. however, the epitopes require substantial in vivo and in vitro validation and refinement to generate the finest vaccine component for restricting nocardia infection. 
such computer aided research techniques are also highly influential and efficient for designing of desired epitopic vaccine against several associated diseases in light of immunoinformatics. from zikv genome to vaccine: in silico approach for the epitope-based peptide vaccine against zika virus envelope glycoprotein a novel strategy of epitope design in neisseria gonorrhoeae distill: a suite of web servers for the prediction of one-, two-and three-dimensional structural features of proteins modelling of three-dimensional structures of cytochromes p450 11b1 and 11b2 mandell, douglas, and bennett's principles and practice of infectious diseases: 2-volume set the protein data bank computational characterization of epitopic region within the outer membrane protein candidate in flavobacterium columnare for vaccine development homology modeling and docking study of xanthine oxidase of arthrobacter sp. xl26 prediction of linear b-cell epitopes using amino acid pair antigenicity scale identification of t-and b-cell epitopes of the e7 protein of human papillomavirus type 16 database resources of the national center for biotechnology information pymol: an open-source molecular graphics tool a cohesive and integrated platform for immunogenicity prediction. 
vaccine design vaxijen: a server for prediction of protective antigens, tumour antigens and subunit vaccines efficient unbound docking of rigid molecules murine polymorphonuclear neutrophils produce interferon-γ in response to pulmonary infection with nocardia asteroides nocardia asteroides cerebral abscess in immunocompetent hosts: report of three cases and review of surgical recommendations exploiting the burkholderia pseudomallei acute phase antigen bpsl2765 for structure-based epitope discovery/design in structural vaccinology protein-protein binding site identification by enumerating the configurations identification of immunodominant b-cell epitope regions of reticulocyte binding proteins in plasmodium vivax by protein microarray based immunoscreening nocardia infection in lung transplant recipients identification of b-cell epitope of leishmania donovani and its application in diagnosis of visceral leishmaniasis prediction of b-cell epitopes in listeriolysin o, a cholesterol dependent cytolysin secreted by listeria monocytogenes the phyre2 web portal for protein modeling, prediction and analysis epitopebased peptide vaccine design and target site depiction against ebola viruses: an immunoinformatics study predicting transmembrane protein topology with a hidden markov model: application to complete genomes understanding rifampicin resistance in tuberculosis through a computational approach prediction of mhc-peptide binding: a systematic and comprehensive overview pro-check: a program to check the stereochemical quality of protein structures procheck: validation of protein-structure coordinates detection of peptide-binding sites on protein surfaces: the first step toward the modeling and targeting of peptidemediated interactions structure of the vδ domain of a human γδ t-cell antigen receptor evaluation of mhc class i peptide binding prediction servers: applications for vaccine research clinical experience with linezolid for the treatment of nocardia infection 
propred analysis and experimental evaluation of promiscuous t-cell epitopes of three major secreted antigens of mycobacterium tuberculosis identification of putative vaccine candidates against helicobacter pylori exploiting exoproteome and secretome: a reverse vaccinology based approach exploration of freely available web-interfaces for comparative homology modelling of microbial proteins influenza vaccination and reduction in hospitalizations for cardiac disease and stroke among the elderly reverse-vaccinology strategy for designing t-cell epitope candidates for staphylococcus aureus endocarditis vaccine soil-acquired cutaneous nocardiosis on the forearm of a healthy male contracted in a swamp in rural eastern virginia ucsf chimera-a visualization system for exploratory research and analysis porter: a new, accurate server for protein secondary structure prediction improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles schmidek and sweet: operative neurosurgical techniques 2-volume set: indications, methods and results in silico identification of catalytic residues in azobenzene reductase from bacillus subtilis and its docking studies with azo dyes homology modeling, model and software evaluation: three related resources cofactor: an accurate comparative algorithm for structure-based protein function annotation prediction of continuous b-cell epitopes in an antigen using recurrent neural network membrane association and epitope recognition by hiv-1 neutralizing anti-gp41 2f5 and 4e10 antibodies nocardiosis: review of clinical and laboratory experience taking geometry to its edge: fast unbound rigid (and hinge-bent) docking patchdock and symmdock: servers for rigid and symmetric docking pleuroplumonary nocardiosis in an immunocompetent host egf domain ii of protein pb28 from plasmodium berghei interacts with monoclonal transmission blocking antibody 13.1 propred: prediction of hla-dr binding 
sites propred1: prediction of promiscuous mhc class-i binding sites a hidden markov model for predicting transmembrane helices in protein sequences design of novel multi-epitope vaccines against severe acute respiratory syndrome validated through multistage molecular interaction and dynamics computer-aided biotechnology: from immuno-informatics to reverse vaccinology prosa-web: interactive web service for the recognition of errors in three-dimensional structures of proteins vadar: a web server for quantitative evaluation of protein structure quality nocardiosis: updates and clinical overview epitope mapping of two immunodominant domains of gp41, the transmembrane protein of human immunodeficiency virus type 1, using ten human monoclonal antibodies immunogenicity prediction by vaxijen: a ten year overview key: cord-023726-2fduzqyb authors: strauss, james h.; strauss, ellen g. title: the structure of viruses date: 2012-07-27 journal: viruses and human disease doi: 10.1016/b978-0-12-373741-0.50005-2 sha: doc_id: 23726 cord_uid: 2fduzqyb nan virus particles, called virions, contain the viral genome encapsidated in a protein coat. the function of the coat is to protect the genome of the virus in the extracellular environment as well as to bind to a new host cell and introduce the genome into it. viral genomes are small and limited in their coding capacity, which requires that three-dimensional virions be formed using a limited number of different proteins. for the smallest viruses, only one protein may be used to construct the virion, whereas the largest viruses may use 30 or more proteins. to form a three-dimensional structure using only a few proteins requires that the structure must be regular, with each protein subunit occupying a position at least approximately equivalent to that occupied by all other proteins of its class in the final structure (the principle of quasi-equivalence), although some viruses are now known to violate the principle of quasi-equivalence. 
a regular three-dimensional structure can be formed from repeating subunits using either helical symmetry or icosahedral symmetry principles. in the case of the smallest viruses, the final structure is simple and quite regular. larger viruses with more proteins at their disposal can build more elaborate structures. enveloped viruses may be quite regular in construction or may have irregular features, because the use of lipid envelopes allows irregularities in construction. selected families of vertebrate viruses are listed in table 2 .1 grouped by the morphologies of the virions. also shown for each family is the presence or absence of an envelope in the virion, the triangulation number (defined later) if the virus is icosahedral, the morphology of the nucleocapsid or core, and figure numbers where the structures of members of a family are illustrated. electron micrographs of five dna viruses belonging to different families and of five rna viruses belonging to different families are shown in fig. 2 .1. the viruses chosen represent viruses that are among the largest known and the smallest known, and are all shown to the same scale for comparison. for each virus, the top micrograph is of a virus that has been negatively stained, the middle micrograph is of a section of infected cells, and the bottom panel shows a schematic representation of the virus. the structures of these and other viruses are described next. helical viruses appear rod shaped in the electron microscope. the rod can be flexible or stiff. the best studied example of a simple helical virus is tobacco mosaic virus (tmv). the tmv virion is a rigid rod 18 nm in diameter and 300 nm long ( fig. 2.2b ). it contains 2130 copies of a single capsid protein of 17.5 kda. in the right-hand helix, each protein subunit has six nearest neighbors and each subunit occupies a position equivalent to every other capsid protein subunit in the resulting network ( fig. 
2.2a), except for those subunits at the very ends of the helix. each capsid molecule binds three nucleotides of rna within a groove in the protein. the helix has a pitch of 23 å and there are 16 1/3 subunits per turn of the helix. the length of the tmv virion (300 nm) is determined by the size of the rna (6.4 kb). many viruses are constructed with helical symmetry and often contain only one protein or a very few proteins. the popularity of the helix may be due in part to the fact that the length of the particle is not fixed and rnas or dnas of different sizes can be readily accommodated. thus the genome size is not fixed, unlike that of icosahedral viruses. virions can be approximately spherical in shape, based on icosahedral symmetry. since the time of euclid, there have been known to exist only five regular solids in which each face of the solid is a regular polygon: the tetrahedron, the cube, the octahedron, the dodecahedron, and the icosahedron. the icosahedron has 20 faces, each of which is a regular triangle, and thus each face has threefold rotational symmetry (fig. 2.3a). there are 12 vertices where 5 faces meet, and thus each vertex has fivefold rotational symmetry. there are 30 edges in which 2 faces meet, and each edge possesses twofold rotational symmetry. thus the icosahedron is characterized by twofold, threefold, and fivefold symmetry axes. the dodecahedron, the next simpler regular solid, has the same symmetry axes as the icosahedron and is therefore isomorphous with it in symmetry: the dodecahedron has 12 faces which are regular pentagons, 20 vertices where three faces meet, and 30 edges with twofold symmetry. the three remaining regular solids have different symmetry axes. the vast majority of regular viruses that appear spherical have icosahedral symmetry. 
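the tmv figures quoted earlier in this section (2130 subunits, 16 1/3 subunits per turn, a pitch of 23 å, three nucleotides per subunit) can be checked against the stated particle length and genome size with a short calculation. the sketch below simply multiplies these numbers out; the variable names are chosen for the example and the calculation assumes that every subunit sits on the helix and binds exactly three nucleotides.

```python
from fractions import Fraction

subunits = 2130                         # capsid protein copies per virion
subunits_per_turn = Fraction(49, 3)     # 16 1/3 subunits per helical turn
pitch_angstrom = 23                     # rise of the helix per turn
nt_per_subunit = 3                      # nucleotides bound by each subunit

turns = subunits / subunits_per_turn            # about 130.4 turns
length_nm = float(turns * pitch_angstrom) / 10  # 10 angstroms per nanometre
genome_nt = subunits * nt_per_subunit

print(round(length_nm, 1))  # 299.9, i.e. the 300 nm rod
print(genome_nt)            # 6390, i.e. the 6.4 kb rna
```

the two results agree with the quoted 300 nm length and 6.4 kb rna, which illustrates why the length of a helical particle follows directly from the size of its genome.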
in an icosahedron, the smallest number of subunits that can form the three-dimensional structure is 60 (5 subunits at each of the 12 vertices, or viewed slightly differently, 3 units on each of the 20 triangular faces). some viruses do in fact use 60 subunits, but most use more subunits in order to provide a larger shell capable of holding more nucleic acid. the number of subunits in an icosahedral structure is 60t, where the permissible values of t are given by $t = h^2 + hk + k^2$, where h and k are integers and t is called the triangulation number. permissible triangulation numbers are 1, 3, 4, 7, 9, 12, 13, 16, and so forth. a subunit defined in this way is not necessarily formed by one protein molecule, although in most cases this is how a structural subunit is in fact formed. some viruses that form regular structures that are constructed using icosahedral symmetry principles do not possess true icosahedral symmetry. in such cases they are said to have pseudo-triangulation numbers. examples are described later.
figure 2.3 (caption, beginning missing) … a protein subunit, shown as blue trapezoids labeled "a." the twofold, threefold, and fivefold axes of symmetry are shown in yellow. this is the largest assembly in which every subunit is in an identical environment. (b) schematic representation of the subunit building block found in many rna viruses, known as the eightfold β barrel or β sandwich. the β sheets, labeled b through i from the n terminus of the protein, are shown as yellow and red arrows; two possible α helices joining these sheets are shown in green. some proteins have insertions in the c-d, e-f, and g-h loops, but insertions are uncommon at the narrow end of the wedge (at the fivefold axis). from granoff and webster (1999) vol. 3, color plate 31. [originally from j. johnson (1996)].
structural studies of viruses have shown that the capsid proteins that form the virions of many plant and animal icosahedral viruses have a common fold. 
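the rule for permissible triangulation numbers given above, t = h² + hk + k² with 60t subunits per shell, can be enumerated directly. the following sketch lists the allowed values up to a limit and the corresponding subunit counts; the function name and the limit of 16 are choices made for the example.

```python
def triangulation_numbers(limit):
    """Return the sorted values of T = h*h + h*k + k*k with 0 < T <= limit."""
    values = set()
    h = 0
    while h * h <= limit:
        # k <= h is sufficient because the formula is symmetric in h and k
        for k in range(h + 1):
            t = h * h + h * k + k * k
            if 0 < t <= limit:
                values.add(t)
        h += 1
    return sorted(values)


allowed = triangulation_numbers(16)
print(allowed)                        # [1, 3, 4, 7, 9, 12, 13, 16]
print([60 * t for t in allowed])      # subunits per shell: 60, 180, 240, ...
```

the enumeration reproduces the series quoted in the text; values such as 2, 5 or 6 never appear because no pair of integers h and k produces them.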
this fold, an eightstranded antiparallel β sandwich, is illustrated in fig. 2.3b . the presence of a common fold suggests that these capsid proteins have a common origin even if no sequence identity is detectable. the divergence in sequence while maintaining this basic fold is illustrated in fig. 2 .4, where capsid proteins of three viruses are shown. sv40 (family polyomaviridae), poliovirus (family picornaviridae), and bluetongue virus (family reoviridae) are a dna virus, a single-strand rna virus, and a double-strand rna virus, respectively. their capsid proteins have insertions into the basic eight-stranded antiparallel β-sandwich structure that serve important functions in virus assembly. however, they all possess a region exhibiting the common β-sandwich fold and may have originated from a common ancestral protein. thus, once a suitable capsid protein arose that could be used to construct simple icosahedral particles, it may ultimately have been acquired by many viruses. the viruses that possess capsid proteins with this fold may be related by descent from common ancestral viruses, or recombination may have resulted in the incorporation of this successful ancestral capsid protein into many lines of viruses. because the size of the icosahedral shell is fixed by geometric constraints, it is difficult for a change in the size of a viral genome to occur. a change in size will require a change in the triangulation number or changes in the capsid proteins sufficient to produce a larger or smaller internal volume. in either case, the changes in the capsid proteins required are relatively slow to occur on an evolutionary timescale and the size of an icosahedral virus is "frozen" for long periods of evolutionary time. for this reason, as well as for other reasons, most viruses have optimized the information content in their genomes, as will be clear when individual viruses are discussed in the following chapters. 
cryoelectron microscopy has been used to determine the structure of numerous icosahedral viruses to a resolution of 7 to 25 å. for this, a virus-containing solution on an electron microscope grid is frozen very rapidly so that the sample is embedded in amorphous frozen water. the sample must be maintained at liquid nitrogen temperatures so that ice crystals do not form and interfere with imaging. unstained, slightly out-of-focus images of the virus are captured on film, or more recently captured electronically, using a low dose of electrons. these images are digitized and the density measured. mathematical algorithms that take advantage of the symmetry of the particle are used to reconstruct the structure of the particle.

figure 2.4 structure of three vertebrate virus protein subunits that assemble into icosahedral shells. the n termini and c termini are labeled with the residue number in parentheses. the β barrels are shown as red arrows, α helices are gray coils, and the subunit regions involved in quasi-symmetric interactions that are critical for assembly are colored green. sv40 and pv have triangulation numbers of "pseudo-t=7" or p = 7 and "pseudo-t=3" or p = 3, respectively. [panel label: bluetongue virus vp7, t = 13.] adapted from granoff and webster (1999), vol. 3, plate 32.

a gallery of structures of viruses determined by cryoelectron microscopy is shown in fig. 2.5. all of the images are to scale so that the relative sizes of the virions are apparent. the largest particle is the nucleocapsid of herpes simplex virus, which is 1250 å in diameter and has t=16 symmetry (the virion is enveloped but only the nucleocapsid is regular [...] rest are not enveloped). b19 parvovirus has t=1. the general correlation is that larger particles are constructed using higher triangulation numbers, which allows the use of larger numbers of protein subunits. larger particles accommodate larger genomes.

figure 2.5 gallery of three-dimensional reconstructions of icosahedral viruses from cryoelectron micrographs. all virus structures are surface shaded and are viewed along a twofold axis of symmetry except ross river, which is viewed along a threefold axis. all of the images are of intact virus particles except for the herpes simplex structure, which is of the nucleocapsid of the virus. most of the images are taken from baker et al. (1999), except the images of ross river virus and of dengue virus, which were kindly provided by drs. r. j. kuhn and t. s. baker.

because the simplest viruses are regular structures, they will often crystallize, and such crystals may be suitable for x-ray diffraction. many viruses formed using icosahedral symmetry principles have been solved to atomic resolution, and a discussion of representative viruses that illustrate the principles used in construction of various viruses is presented here. among t=3 viruses, the structures of several plant viruses, including tomato bushy stunt virus (tbsv) (genus tombusvirus, family tombusviridae), turnip crinkle virus (tcv) (genus carmovirus, family tombusviridae), and southern bean mosaic virus (sbmv) (genus sobemovirus, not yet assigned to family), have been solved. all three of these viruses have capsid proteins possessing the eight-stranded antiparallel β sandwich. t=3 means that 180 identical molecules of capsid protein are utilized to construct the shell. the structures of two insect viruses that are simple t=3 structures have also been solved. as an example of these simple structures, the t=3 capsid of the insect virus, flock house virus (family nodaviridae), is illustrated in fig. 2.6. the 180 subunits in these t=3 structures interact with one another in one of two different ways, such that the protein shell can be thought of as being composed of an assembly of 60 ab dimers and 30 cc dimers (fig. 2.6a).
the bond angle between the two subunits of the dimer is more acute in the ab dimers than in the cc dimers (figs. 2.6b and c). for the plant viruses, there are n-terminal and c-terminal extensions from the capsid proteins that are involved in interactions between the subunits and with the rna. the n-terminal extensions have a positively charged, disordered domain for interacting with and neutralizing the charge on the rna and a connecting arm that interacts with other subunits. in the case of the cc dimers, the connecting arms interdigitate with two others around the icosahedral threefold axis to form an interconnected internal framework. in the case of the ab conformational dimer, the arms are disordered, allowing sharper curvature. for flock house virus, the rna plays a role in controlling the curvature of the cc dimers, as illustrated in fig. 2.6.

figure 2.6 [...] b. proteins that make up a triangular face are only quasi-equivalent. the angle between the a and b5 units (shown with a red oval and in diagram (c)) is more acute than that along the c-c2 edge, shown with a blue oval and in diagram (b). this difference in the angles is due to the presence of an rna molecule located under the c-c2 edge. from johnson (1996), with permission.

the structures of several picornaviruses and of a plant comovirus (cowpea mosaic virus) have also been solved to atomic resolution. the structures of these viruses are similar to those of the plant t=3 viruses, but the 180 subunits that form the virion are not all identical. a comparison of the structure of a t=3 virus with those of poliovirus and of cowpea mosaic virus is shown in fig. 2.7. poliovirus has 60 copies of each of three different proteins, whereas the comovirus has 60 copies of an l protein (each of which fills the niche of two units) and 60 copies of an s protein. all three poliovirus capsid proteins have the eight-stranded antiparallel β-sandwich fold.
in the comoviruses, the l protein has two β-sandwich structures fused to form one large protein, and the s protein is formed from one sandwich. the structures of the picornavirus and comovirus virions are called pseudo-t=3 or p=3, since they are not true t=3 structures. the picornavirus virion is 300 å in diameter. the 60 molecules of each of the three different proteins have different roles in the final structure, as illustrated in fig. 2.8, in which the structure of a rhinovirus is shown. notice that five copies of vp1 are found at each fivefold axis (compare fig. 2.7 with fig. 2.8). vp1, vp2, and vp3 are structurally related to one another, as stated, all possessing the common β-sandwich fold. there exists a depression around each fivefold axis of rhinoviruses that has been termed a "canyon." this depression is believed to be the site at which the virus interacts with the cellular receptor during entry, as illustrated in fig. 2.9. this interaction is thought to lead to conformational changes that open a channel at the fivefold axis, through which vp4 is extruded, followed by the viral rna.

figure 2.7 arrangement of the coat protein subunits of comoviruses compared with those of simple t=3 viruses and picornaviruses. in simple viruses, the asymmetric unit contains three copies of a single protein β sandwich, labeled a, b, and c in order to distinguish them. in picornaviruses such as poliovirus the asymmetric unit is made up of three similar but not identical proteins, all of which have the β-sandwich structure. in comoviruses such as cowpea mosaic virus, two of the β-sandwich subunits are fused to give the l protein. adapted from granoff and webster (1999), p. 287.

the structures of both mouse polyomavirus and of sv40 virus, two members of the family polyomaviridae, have been solved to atomic resolution. both viruses possess pseudo-t=7 icosahedral symmetry. although t=7 symmetry would require 420 subunits, these viruses contain only 360 copies of a major structural protein known as vp1. these 360 copies are assembled as 72 pentamers. twelve of the 72 pentamers lie on the fivefold axes and the remaining 60 fill the intervening surface in a closely packed array (fig. 2.10). these latter pentamers are thus sixfold coordinated and the proteins in the shell are not all in quasi-equivalent positions, a surprising finding for our understanding of the principles by which viruses can be constructed. the pentamers are stabilized by interactions of the β sheets between adjacent monomers in a pentamer (fig. 2.10c). the pentamers are then tied together by c-terminal arms of vp1 that invade monomers in an adjacent pentamer (figs. 2.10b and c). because each pentamer that is sixfold coordinated has five c-terminal arms to interact with six neighboring pentamers, the interactions between monomers in different pentamers are not all identical (fig. 2.10b). flexibility in the c-terminal arm allows it to form contacts in different ways.

members of the reovirus family are regular t=13 icosahedral particles. they are composed of two or three concentric protein shells. cryoelectron microscopy has been used to solve the structure of one or more members of three genera within the reoviridae, namely reovirus, rotavirus, and orbivirus, to about 25-å resolution. structures of a reovirus and of a rotavirus are shown in fig. 2.5. the complete structure of virions has not been determined because of their large size, but in a remarkable feat the atomic structure of the core of bluetongue virus (genus orbivirus) has now been solved. this is the largest structure determined to atomic resolution to date.
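the polyomavirus subunit counts quoted earlier in this section (420 expected for t=7, 360 observed) can be reproduced with a few lines of arithmetic. this sketch assumes the standard bookkeeping for icosahedral lattices, in which a shell with triangulation number t has 12 fivefold positions and 10(t − 1) sixfold positions; that formula and the function name `capsomer_positions` are not from the text and are used here only for illustration.

```python
def capsomer_positions(t):
    """Return (fivefold, sixfold) position counts for triangulation number t.

    Assumes the standard icosahedral bookkeeping: 12 fivefold positions
    and 10 * (t - 1) sixfold positions.
    """
    return 12, 10 * (t - 1)


fivefold, sixfold = capsomer_positions(7)

# A true t=7 shell: pentamers at fivefold positions, hexamers at sixfold ones.
true_t7_subunits = fivefold * 5 + sixfold * 6

# Polyomavirus shell: every position, sixfold ones included, holds a vp1 pentamer.
polyoma_vp1_copies = (fivefold + sixfold) * 5

print(fivefold + sixfold, true_t7_subunits, polyoma_vp1_copies)
```

the 60-subunit shortfall corresponds to one missing subunit at each of the 60 sixfold-coordinated positions, which is why the vp1 copies in those pentamers cannot all occupy quasi-equivalent environments.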
solution of the bluetongue virus core structure was possible because the virus particle had been solved to 25 å by cryoelectron microscopy, and the structures of a number of virion proteins had been solved to atomic resolution by x-ray diffraction. fitting the atomic structure of the proteins into the 25-å structure gave a preliminary reconstruction at high resolution, which allowed the interpretation of the x-ray data to atomic resolution. core particles are formed following infection, when the outer layer is proteolytically cleaved (described in more detail in chapter 5). the structure of the inner surface of the bluetongue virus core is shown in fig. 2.11a and of the outer surface in fig. 2.11b. the outer surface is formed by 780 copies of a single protein, called vp7, in a regular t=13 icosahedral lattice. the inner surface is surprising, however. it is formed by 120 copies of a single protein, called vp3. these 120 copies have been described as forming a t=2 lattice. because t=2 is not a permitted triangulation number, these 120 copies, strictly speaking, form a t=1 lattice in which each unit of the lattice is composed of two copies of vp3. however, the interactions are not symmetrical, leading to the suggested terminology of t=2. it has been suggested that the inner core furnishes a template for the assembly of the t=13 outer surface. the reasoning is that a t=13 structure may have difficulty in forming, whereas the t=2 (or t=1) structure could form readily. in this model, the threefold symmetry axis of the inner surface could serve to nucleate vp7 trimers and organize the t=13 structure.

cryoelectron microscopy has also been applied to adenoviruses, which have a triangulation number of 25 or pseudo-25. various interpretations of the structure of adenoviruses, both schematic and as determined by microscopy or crystallography, are shown in fig. 2.12. three copies of a protein called the hexon protein associate to form a structure called a hexon (fig. 2.12c).
the hexon is the basic building block of adenoviruses. five hexons, called peripentonal hexons, surround each of the 12 vertices of the icosahedron (which, as has been stated, have fivefold rotational symmetry). between the groups of peripentonal hexons are found groups of 9 hexons, which are sixfold coordinated. each group of 9 hexons forms the surface of one of the triangular faces. thus, there are 60 peripentonal hexons and 180 hexons in groups of nine. the structure of the hexon trimer has been solved to atomic resolution by x-ray crystallography ( fig. 2.12c ). the hexon protein has two eight-strand β sandwiches to give the trimer an approximately sixfold symmetry (and each hexon protein fills the role of two symmetry units). there are long loops that intertwine to form a triangular top. these structures can be fitted uniquely into the envelope of density determined by cryoelectron microscopy, which produces a structure refined to atomic resolution for most of the capsid. the minor proteins can be fitted into this structure. from the 12 vertices of the icosahedron project long fibers that are anchored in the surface by a unit called the penton base. each fiber terminates in a spherical extension that forms an organ of attachment to a host cell (fig. 2.12a ). the length of the fiber differs in the different adenoviruses. in addition to the nonenveloped viruses that possess relatively straightforward icosahedral symmetry or helical symmetry, many viruses possess more complicated symmetries made possible by the utilization of a large number of structural proteins to form the virion. the tailed bacteriophages are prominent examples of this ( fig. 2.13) . some of the tailed bacteriophages possess a head that is a regular icosahedron (or, in at least one case, an octahedron) connected to a tail that possesses helical symmetry. other appendages, such as baseplates, collars, and tail fibers, may be connected to the tail. 
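the adenovirus hexon counts given earlier in this section are consistent with a triangulation number of 25, as the following sketch suggests. it assumes, beyond what the text states, that each of the 12 vertex pentons contributes five symmetry units; the variable names are illustrative.

```python
VERTICES = 12
FACES = 20

peripentonal_hexons = VERTICES * 5   # five hexons surround each vertex
face_hexons = FACES * 9              # a group of nine on each triangular face
hexons = peripentonal_hexons + face_hexons

hexon_proteins = hexons * 3          # each hexon is a trimer of hexon protein
hexon_units = hexon_proteins * 2     # each hexon protein fills two symmetry units

penton_units = VERTICES * 5          # assumption: five units per vertex penton

total_units = hexon_units + penton_units
print(hexons, hexon_proteins, total_units, total_units // 60)
```

under these assumptions the 240 hexons and 12 pentons account for 1500 symmetry units, which is 60 × 25.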
other tailed bacteriophages have heads that are assembled using more complicated patterns. for example, the t-even bacteriophages have a large head, which can be thought of as being formed of two hemi-icosahedrons possessing regular icosahedral symmetry, which are elongated in the form of a prolate ellipsoid by subunits arranged in a regular net connecting the two icosahedral ends of the head of the virus.

figure 2.11 [...] the essential features of the orbivirus native core particle. the asymmetric unit is indicated by the white lines forming a triangle and the fivefold, threefold, and twofold axes are marked. (a) the inner capsid layer of the bluetongue virus (btv) core is composed of 120 molecules of vp3, arranged in what has been called t=2 symmetry. note the green subunit a and the red subunit b which fill the asymmetric unit. (b) the core surface layer is composed of 780 copies of vp7 arranged as 260 trimers, with t=13 symmetry. the asymmetric unit contains 13 copies of vp7, arranged as five trimers, labeled p, q, r, s, and t, with each trimer a different color. trimer "t" in blue sits on the icosahedral threefold axis and thus contributes only a monomer to the asymmetric unit. from granoff and webster (1999), color plate 17.

many animal viruses and some plant viruses are enveloped; that is, they have a lipid-containing envelope surrounding a nucleocapsid. the lipids are derived from the host cell. although there is some selectivity and reorganization of lipids during virus formation, the lipid composition in general mirrors the composition of the cellular membrane from which the envelope was derived. however, the proteins in the nucleocapsid, which may possess either helical or icosahedral symmetry, and the proteins in the envelope are encoded in the virus. the protein-protein interactions that are responsible for assembly of the mature enveloped virions differ among the different families and the structures of the resulting virions differ. the virions of alphaviruses, and of flaviviruses, are uniform structures that possess icosahedral symmetry. poxviruses, rhabdoviruses, and retroviruses also appear to have a regular structure, but there is flexibility in the composition of the particle and the mature virions do not possess icosahedral symmetry. the herpesvirus nucleocapsid is a regular icosahedral structure (fig. 2.5), but the enveloped herpesvirions are not regular. other enveloped viruses are irregular, often pleiomorphic, and are heterogeneous in composition to a greater or lesser extent. the structures of different enveloped viruses that illustrate these various points are described next.

the nucleocapsids of enveloped rna viruses are fairly simple structures that contain only one major structural protein, often referred to as the nucleocapsid protein or core protein. this protein is usually quite basic or has a basic domain. it binds to the viral rna and encapsidates it to form the nucleocapsid. for most rna viruses, nucleocapsids can be recognized as distinct structures within the infected cell and can be isolated from virions by treatment with detergents that dissolve the envelope. the nucleocapsids of alphaviruses, and probably flaviviruses and arteriviruses as well, are regular icosahedral structures, and there are no proteins within the nucleocapsid other than the nucleocapsid protein. in contrast, the nucleocapsids of all minus-strand viruses are helical and contain, in addition to the major nucleocapsid protein, two or more minor proteins that possess enzymatic activity. as described, the nucleocapsids of minus-strand rna viruses remain intact within the cell during the entire infection cycle and serve as machines that make viral rna. the coronaviruses also have helical nucleocapsids, but being plus-strand rna viruses they do not need to carry enzymes in the virion to initiate infection.
the helical nucleocapsids of (−) rna viruses appear disordered within the envelope of all viruses except the rhabdoviruses, in which they are coiled in a regular fashion (see later). the nucleocapsids of retroviruses also appear to be fairly simple structures. they are formed from one major precursor protein, the gag polyprotein, that is cleaved during maturation into four or five components. the precursor nucleocapsid is spherically symmetric but lacks icosahedral symmetry. the mature nucleocapsid produced by cleavage of gag may or may not be spherically symmetric. the nucleocapsid also contains minor proteins, produced by cleavage of gag-pro-pol, as described in chapter 1. these minor proteins include the protease, rt, rnase h, and integrase that are required to cleave the polyprotein precursors, to make a cdna copy of the viral rna, and to integrate this cdna copy into the host chromosome.

the two families of enveloped dna viruses that we consider here, the poxviruses and the herpesviruses, contain large genomes and complicated virus structures. the nucleocapsids of herpesviruses are regular icosahedrons but those of poxviruses are complicated structures containing a core and associated lateral bodies.

the external proteins of enveloped virions are virus-encoded proteins that are anchored in the lipid bilayer of the virus or whose precursors are anchored in the lipid bilayer. in the vast majority of cases these proteins are glycoproteins, although examples are known that do not contain bound carbohydrate. these proteins are translated from viral mrnas and transported by the usual cellular processes to reach the membrane at which budding will occur. when budding is at the cell plasma membrane, the glycoproteins are transported via the golgi apparatus to the cell surface. some enveloped viruses mature at intracellular membranes, and in these cases the glycoproteins are directed to the appropriate place in the cell.
both type i integral membrane proteins, in which the n terminus of the protein is outside the lipid bilayer and the c terminus is inside the bilayer, and type ii integral membrane proteins, which have the inverse orientation with the c terminus outside, are known for different viruses. many viral glycoproteins are produced as precursor molecules that are cleaved by cellular proteases during the maturation process. following synthesis of viral glycoproteins, during which they are transported into the lumen of the endoplasmic reticulum (er) in an unfolded state, they must fold to assume their proper conformation, and assume their proper oxidation state by formation of the correct disulfide bonds. this process often occurs very quickly, but for some viral glycoproteins it can take hours. folding is often assisted by chaperonins present in the endoplasmic reticulum. it is believed that at least one function of the carbohydrate chains attached to the protein is to increase the solubility of the unfolded glycoproteins in the lumen of the er so that they do not aggregate prior to folding. during folding, the solubility of the proteins is increased by hiding hydrophobic domains within the interior of the protein and leaving hydrophilic domains at the surface. the glycoproteins possess a number of important functions in addition to their structural functions. they carry the attachment domains by which the virus binds to a susceptible cell. this activity is thought to be related to the ability of many viruses, nonenveloped as well as enveloped, to bind to and agglutinate red blood cells, a process called hemagglutination. the protein possessing hemagglutinating activity is often called the hemagglutinin or ha. the viral glycoproteins also possess a fusion activity that promotes the fusion of the membrane of the virus with a membrane of the cell. the protein possessing this activity is sometimes called the fusion protein, or f. 
the glycoproteins, being external on the virus, are also primary targets of the humoral immune system, in which circulating antibodies are directed against viruses; many of these are neutralizing antibodies that inactivate the virus. the glycoproteins of some enveloped viruses also contain enzymatic activities. many orthomyxoviruses and paramyxoviruses possess a neuraminidase that will remove sialic acid from glycoproteins. the primary receptor for these viruses is sialic acid. the neuraminidase may allow the virus to penetrate through mucus to reach a susceptible cell. it also removes sialic acid from the viral glycoproteins so that these glycoproteins or the mature virions do not aggregate, and from the surface of an infected cell, thereby preventing released virions from binding to it. the viral protein possessing neuraminidase activity may be called na, or in the case of a protein that is both a neuraminidase and hemagglutinin, hn. the structure of most enveloped viruses is not as rigorously constrained as that of icosahedral virus particles. the glycoproteins are not required to form an impenetrable shell, which is instead a function of the lipid bilayer. they appear to tolerate mutations more readily than do proteins that must form a tight icosahedral shell and appear to evolve rapidly in response to immune pressure. however, the integrity of the lipid bilayer is essential for virus infectivity, and enveloped viruses are very sensitive to detergents. in some enveloped viruses, there is a structural protein that underlies the lipid envelope but which does not form part of the nucleocapsid. several families of minus-strand rna viruses possess such a protein, called the matrix protein. this protein may serve as an adapter between the nucleocapsid and the envelope. it may also have regulatory functions in viral rna replication. the herpesviruses also have proteins underlying the envelope that form a thick layer called the tegument. 
the thickness of the tegument is not uniform within a virion, giving rise to some irregularity in its structure. the tegument proteins perform important functions early after infection of a cell by a herpesvirus (see chapter 7). the alphaviruses, a genus in the family togaviridae, are exceptional among enveloped rna viruses in the regularity of their virions, which are uniform icosahedral particles. virions of two alphaviruses have been crystallized and the crystals are regular enough to diffract to 30-40-å resolution. higher resolution has been obtained from cryoelectron microscopy, which has been used to determine the structures of several alphaviruses to 7-25 å (fig. 2.5) . more detailed reconstructions of sindbis virus and ross river virus (rrv) have been derived from a combination of cryoelectron microscopy of the intact virion and x-ray crystallography of alphavirus structural proteins. a cutaway view of rrv at about 25-å resolution is shown in fig. 2.14a . the nucleocapsid, shown in red and yellow, has a diameter of 400 å, and is a regular icosahedron with t=4 symmetry. it is formed from 240 copies of a single species of capsid protein of size 30 kda. note the fivefold and sixfold coordinated pedestals in yellow that rise above the red background of rna and unstructured parts of the protein. each of these pedestals is formed by the ordered domains of one capsid protein molecule. the lipid bilayer is shown in green and is positioned between the capsid and the external shell of glycoproteins, shown in blue. the glycoproteins are also icosahedrally arranged with t=4 symmetry. the complete structure is therefore quite regular and the virion has been described as composed of two interacting protein shells with a lipid bilayer sandwiched between. the structure of the ordered part of the capsid protein of sindbis virus has been solved to atomic resolution by conventional x-ray crystallography and this structure is shown in fig. 2. 14b. 
the first 113 residues are disordered and the structure is formed by residues 114-264. this ordered domain has a structure that is very different from the eightfold β sandwich described earlier (compare fig. 2.14b with figs. 2.3b and 2.4). instead, its fold resembles that of chymotrypsin, and it has an active site that consists of a catalytic triad whose geometry is identical to that of chymotrypsin. the capsid protein is an active protease that cleaves itself from a polyprotein precursor. after cleavage, the c-terminal tryptophan-264 remains in the active site and the enzymatic activity of the protein is lost. the interactions between the capsid protein subunits that lead to formation of the t=4 icosahedral lattice have been deduced by fitting the electron density of the capsid protein at 2.5-å resolution into the electron density of the nucleocapsid found by cryoelectron microscopy. such a reconstruction, based on a cryoem structure of sindbis virus at a resolution of better than 10 å, is shown in fig. 2.14c. the fit of the capsid protein is unique and the combined approaches of x-ray crystallography and cryoelectron microscopy thus define the structure of the shell of the nucleocapsid to atomic resolution.

the envelopes of alphaviruses contain 240 copies of each of two virus-encoded glycoproteins, called e1 and e2. e2 is first produced as a precursor called pe2. e1 and pe2 form a heterodimer shortly after synthesis, and both span the lipid bilayer as type i integral membrane proteins (having a membrane-spanning anchor at or near the c terminus). the c-terminal cytoplasmic extension of pe2 interacts in a specific fashion with a nucleocapsid protein so that there is a one-to-one correspondence between a capsid protein and a glycoprotein heterodimer. the 240 glycoprotein heterodimers form a t=4 icosahedral lattice on the surface of the particle by interacting with one another and with the capsid proteins.
because of the glycoprotein-capsid protein interactions, the icosahedral lattices of the nucleocapsid and the glycoproteins are coordinated. at some time during transport of the glycoprotein heterodimers to the cell surface, pe2 is cleaved by a cellular protease called furin to form e2. e1 and e2 remain associated as a heterodimer. if cleavage is prevented, noninfectious particles are produced that contain pe2 and e1.

figure 2.15 comparison of (a) the spike structure of mature sindbis virus (an alphavirus) with (b) the spike of immature dengue virus (a flavivirus). the cα backbones of the three e1 (sindbis) and e (dengue) glycoprotein ectodomains are shown in red, green, and blue, as they were fitted into the cryoelectron density envelope. the e1 and e densities have been zeroed out, leaving the gray envelope that corresponds to e2 for sindbis and prm for dengue. the density corresponding to the lipid bilayer is shown in bright green. adapted from figure 5 in y. with permission.

in the virion, three glycoprotein heterodimers associate to form a trimeric structure called a spike, easily seen in figs. 2.5 and 2.14a. it is not known if the spike assembles during virus assembly or if heterodimers trimerize during their transport to the cell surface. a reconstruction of a spike of sindbis virus at a resolution better than 10 å is shown in fig. 2.15a. in this reconstruction, the electron density of e1 has been replaced by the e1 structure of the related semliki forest virus determined to atomic resolution by x-ray crystallography. the three copies of e1 project upwards at an angle of about 45° and are shown in three colors because they have slightly different environments. the electron density in gray that remains after subtracting the density due to e1 is thus the electron density of e2. e2 projects further upward than does e1 and covers the apex of e1, which has the fusion peptide.
thus, e2 covers the fusion peptide with a hydrophobic pocket so that it does not interact with the hydrophilic environment. the apex of the e2 spike contains the domains that attach to receptors on a susceptible cell. both e1 and e2 have c-terminal membrane-spanning anchors that traverse the lipid bilayer shown in green. the c-terminal domain of e1 is not present in the protein whose structure has been determined because hydrophobic domains do not easily crystallize. thus, the electron density shown traversing the lipid bilayer arises from both e1 and e2 and shows that the two membrane-spanning anchors go through as paired α-helical structures (fig. 2.16).

upon entry of an alphavirus into a cell, the acidic ph of endosomal vesicles causes disassembly of e2/e1 heterodimers and trimerization of e1 to form homotrimers. the fusion peptide is exposed and penetrates the target bilayer of the host endosomal membrane. fusion follows by methods discussed in chapter 1.

flaviviruses also possess a regular icosahedral structure (fig. 2.5) that has been solved by methods similar to those used to determine the structure of alphaviruses. the structures of alphaviruses and flaviviruses are related and have descended from a common ancestral structure. like alphaviruses, flaviviruses produce two structural glycoproteins, called e and prm (for precursor to m). e is homologous to e1 of alphaviruses. although no sequence identity is detectable, the structures of the two proteins are virtually identical and are formed with a similar fold (fig. 2.17). prm and e form a heterodimer and immature particles can be formed if cleavage of prm is prevented. the glycoprotein heterodimers in these immature particles trimerize to form spikes whose structure is very similar to the spikes of alphaviruses (fig. 2.15b). the arrangement of the glycoproteins on the immature virus particle is illustrated in fig.
2.18a and a cryoem reconstruction that illustrates the surface of the immature dengue virus particle is shown in fig. 2.18d. the major differences between the immature flavivirus particle and the alphavirus particle are that there are 180 copies of the heterodimer in the flavivirus particle arranged in a t=3 icosahedral structure rather than 240 heterodimers arranged in a t=4 structure in alphaviruses; that prm is a smaller molecule than pe2 so that in the flavivirus spike there is but a thin trace of density that projects downward, parallel to e, from the cap that shields the fusion peptide (fig. 2.15b) rather than a substantial trace of density in the alphavirus spike (fig. 2.15a); and that the c-terminal regions of prm and e that enter the membrane do so independently and do not emerge from the internal side of the membrane (illustrated in fig. 2.19b), unlike the alphavirus membrane-spanning regions (fig. 2.16). there is no evidence that the membrane glycoproteins interact with the nucleocapsid in flaviviruses, and the flavivirus nucleocapsid, assuming it is a regular icosahedral structure, is not coordinated with the icosahedral structure formed by the spikes, which is again different from alphaviruses where the c-terminal domain of pe2 interacts with the nucleocapsid. following cleavage of prm by furin to form m, there is a dramatic rearrangement of the flavivirus glycoproteins such that the final virion structure is very different from the alphavirus structure. the heterodimers dissociate and e-e homodimers are formed that collapse over the lipid bilayer (fig. 2.19). the e-e homodimers occur in two totally different environments, either perpendicular to the twofold axis where they interact with homodimers in side-to-side interactions, forming a herringbone pattern, or parallel to a twofold axis where they interact at fivefold and threefold axes (fig. 2.18b).
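the copy numbers that distinguish the alphavirus and immature flavivirus envelopes follow from their triangulation numbers. the sketch below is illustrative only: the function name `envelope_counts` is hypothetical, and it assumes 60t heterodimers per particle with three heterodimers per spike, as described above. the homodimer count for the mature flavivirus is simply the 180 copies of e paired off, a figure derived here rather than stated in the text.

```python
def envelope_counts(t):
    """Return (heterodimers, spikes) for a glycoprotein lattice of triangulation number t."""
    heterodimers = 60 * t
    spikes = heterodimers // 3  # three heterodimers associate per spike
    return heterodimers, spikes


alphavirus = envelope_counts(4)
immature_flavivirus = envelope_counts(3)

# After prm cleavage the copies of e pair up as e-e homodimers.
mature_flavivirus_homodimers = immature_flavivirus[0] // 2

print(alphavirus, immature_flavivirus, mature_flavivirus_homodimers)
```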
figure 2.16 the e1 and e2 transmembrane helices of sindbis (an alphavirus) determined from a 9å resolution cryoelectron microscopy reconstruction. shown are e1 residues from 409 to 439 and e2 residues 363 to 398 fitted into the transmembrane density. this is figure 6 from mukhopadhyay et al. (2006), reprinted with permission.

a comparison of the immature flavivirus particle, the mature flavivirus particle, and the alphavirus particle is shown in fig. 2.18. the mature flavivirus is smooth, with no surface projections, and is 50 nm in diameter (fig. 2.18e). the immature particle is ragged in appearance and is 60 nm in diameter (fig. 2.18d). the alphavirus virion shows conspicuous spikes and is 70 nm in diameter (fig. 2.18f). the arrangement of e or e1 in the three particles is also shown (figs. 2.18a, b, c), illustrating the differences in their arrangement. entry of flaviviruses follows pathways similar to those used by alphaviruses. the acidic ph of the endosome causes the e-e homodimers to reorganize to form e homotrimers. these trimers must reorient so that the exposed fusion peptide is projected upwards where it penetrates the host membrane. the arteriviruses possess icosahedral nucleocapsids, but the mature virion does not appear to be regular in structure. detailed structures of these particles are not available. the herpesviruses are large dna viruses that have a t=16 icosahedral nucleocapsid (fig. 2.5). a schematic diagram of an intact herpesvirion is shown in fig. 2.20a. underneath the envelope is a protein layer called the tegument. the tegument does not have a uniform thickness, and thus the virion is not uniform. two electron micrographs of herpesvirions are shown in figs. 2.20b and 2.20c that illustrate the irregularity of the particle and the differing thickness of the tegument in different particles. the retroviruses have a nucleocapsid that forms initially using spherical symmetry principles. 
cleavage of gag during virus maturation results in a nucleocapsid that is not icosahedral and that is often eccentrically located in the virion. fig. 2.21a presents a schematic of a retrovirus particle that illustrates the current model for the location of the various proteins after cleavage of gag and gag-pol. figs. 2.21b, c, and d show electron micrographs of budding virus particles and of mature extracellular virions for three genera of retroviruses. betaretrovirus particles usually mature by the formation of a nucleocapsid within the cytoplasm that then buds through the plasma membrane. this process is shown in fig. 2.21b for mouse mammary tumor virus. in the top micrograph in fig. 2.21b, preassembled capsids are seen in the cytoplasm. in the middle micrograph, budding of the capsid through the plasma membrane is illustrated. in the bottom micrograph, a mature virion with an eccentrically located capsid is shown. in gammaretroviruses, the capsid forms during budding, and the nucleocapsid is round and centrally located in the mature virion. this process is illustrated in fig. 2.21c for murine leukemia virus. the top micrograph shows a budding particle with a partially assembled capsid. the bottom micrograph shows a mature virion. in the lentiviruses, of which hiv is a member, the capsid also forms as a distinct structure only during budding. in the top panel of fig. 2.21d is shown a budding particle of bovine immunodeficiency virus. after cleavage of gag to form the mature virion, the capsid usually appears cone shaped or bar shaped (bottom panel of fig. 2.21d). mason-pfizer monkey virus is a betaretrovirus whose capsid is cone shaped and centrally located in the mature virion. a single amino acid change in the matrix protein ma determines whether the capsid preassembles and then buds, or whether the capsids assemble during budding. thus, the point at which capsids assemble does not reflect a fundamental difference in retroviruses. 
preassembly of capsids or assembly during budding appears to depend on the stability of the capsid in the cell. stable capsids can preassemble. unstable capsids require interactions with other viral components to form as a recognizable structure. the coronaviruses and the minus-strand rna viruses have nucleocapsids with helical symmetry. the structures of the mature virions are irregular, with the exception of the rhabdoviruses, and the glycoprotein composition is not invariant. because of the lack of regularity in these viruses, as well as the lack of symmetry, detailed structural studies of virions have not been possible. the lack of regularity arises in part because in these viruses there is no direct interaction between the nucleocapsid and the glycoproteins. the lack of such interactions permits these viruses to form pseudotypes, in which glycoproteins from other viruses substitute for those of the virus in question. pseudotypes are also formed by retroviruses. the structures of paramyxoviruses and orthomyxoviruses are illustrated schematically in fig. 2.22 . the helical nucleocapsids contain a major nucleocapsid protein called n or np, and the minor proteins p (ns1) and l (pb1, pb2, pa), as shown. there is a matrix protein m (m1) lining the inside of the lipid bilayer and also two glycoproteins anchored in the bilayer that form external spikes. the two glycoproteins, called f and hn in paramyxoviruses and ha and na in orthomyxoviruses, do not form heterodimers but rather form homooligomers so that there are two different kinds of spikes on the surface of the virions. ha in the orthomyxoviruses forms homotrimers whereas na forms homotetramers, and the two types of spikes can be distinguished in the electron microscope if the resolution is high enough. as occurs in many enveloped rna viruses, f and ha are produced as precursors that are cleaved by furin during transport of the proteins. 
cleavage is required to activate the fusion peptide of the virus, which is found at the n terminus of the c-terminal product (see fig. 1.6). electron micrographs of virions are shown in figs. 2.22c and d. the particles in the preparations shown are round and reasonably uniform, but in other preparations the virions are pleomorphic baglike structures that are not uniform in appearance. in fact, clinical specimens of some orthomyxoviruses and paramyxoviruses are often filamentous rather than round, illustrating the flexible nature of the structure of the virion. the micrograph of the paramyxovirus measles virus shown in fig. 2.22c is a thin section and illustrates the lack of higher order structure in the internal helical nucleocapsid. the micrograph of the orthomyxovirus influenza a virus shown in fig. 2.22d is of a negative-stained preparation and illustrates the spikes that decorate the virus particle. the structures of rhabdoviruses and filoviruses are illustrated in fig. 2.23. the rhabdoviruses assemble into bullet-shaped or bacilliform particles in which the helical nucleocapsid is wound in a regular elongated spiral conformation (figs. 2.23b and c). the virus encodes only five proteins (fig. 2.23a), all of which occur in the virion (fig. 2.23b). the nucleocapsid contains the major nucleocapsid protein n and the two minor proteins l and ns. the matrix protein m lines the inner surface of the envelope, and g is an external glycoprotein that is anchored in the lipid bilayer of the envelope. budding is from the plasma membrane (fig. 2.23d). the filoviruses are so named because the virion is filamentous. a schematic diagram of a filovirus is shown in fig. 2 [...]

the poxviruses, large dna-containing viruses, also have lipid envelopes. in fact, they may have two lipid-containing envelopes. the structures of poxviruses belonging to two different genera, orthopox and parapox, are illustrated in fig. 2.24. 
electron micrographs of the orthopox virus vaccinia virus and of a parapox virus are also shown. vaccinia virus has been described as brick shaped. the interior of the virion consists of a nucleoprotein core and two (in vertebrate viruses) proteinaceous lateral bodies. surrounding these is a lipid-containing surface membrane, outside of which are several virus-encoded proteins present in structures referred to as tubules. this particle is called an intracellular infectious virion. as its name implies, it is present inside an infected cell, and if freed from the cell it is infectious. a second form of the virion is found outside the cell and is called an extracellular enveloped virion. this second form has a second, external lipid-containing envelope with which five additional vaccinia proteins are associated. this form of the virion is also infectious. parapox virions are similar to orthopox virions. however, their morphology is detectably different, as illustrated in fig. 2.24. virions self-assemble within the infected cell. in most cases, assembly appears to begin with the interaction of one or more of the structural proteins with an encapsidation signal in the viral genome, which ensures that viral genomes are preferentially packaged. after initiation, encapsidation continues by recruitment of additional structural protein molecules until the complete helical or icosahedral structure has been assembled. thus, packaging of the viral genome is coincident with assembly of the virion, or of the nucleocapsid in the case of enveloped viruses. the requirement for a packaging signal may not be absolute. in many viruses that contain an encapsidation signal, rnas or dnas lacking such a signal may be encapsidated, but with much lower efficiency. for some viruses, there is no evidence for an encapsidation signal. assembly of the tmv rod (fig. 2.2) has been well studied. 
several coat protein molecules, perhaps in the form of a disk, bind to a specific nucleation site within tmv rna to initiate encapsidation. once the nucleation event occurs, additional protein subunits are recruited into the structure and assembly proceeds in both directions until the rna is completely encapsidated. the length of the virion is thus determined by the size of the rna. the assembly of the icosahedral turnip crinkle virion has also been well studied. assembly of this t=3 structure is initiated by formation of a stable complex that consists of six capsid protein molecules bound to a specific encapsidation signal in the viral rna. additional capsid protein dimers are then recruited into the complex until the structure is complete. it is probable that most other viruses assemble in a manner similar to these two well-studied examples. at least some viruses deviate from this model, however, and assemble an empty particle into which the viral genome is later recruited. it is also known that many viruses will assemble empty particles if the structural proteins are expressed in large amounts in the absence of viral genomes, even if assembly is normally coincident with encapsidation of the viral genome in infected cells. the nucleocapsids of most enveloped viruses form within the cell by pathways assumed to be similar to those described above. they can often be isolated from infected cells, and for many viruses the assembly of nucleocapsids does not require viral budding or even the expression of viral surface glycoproteins. after assembly, the nucleocapsids bud through a cellular membrane, which contains viral glycoproteins, to acquire their envelope. budding retroviruses were illustrated in fig. 2.21 and budding rhabdoviruses in fig. 2.23. a gallery of budding viruses belonging to other families is shown in fig. 2.25. 
the membrane chosen for budding depends on the virus and depends, in part if not entirely, on the membrane to which the viral glycoproteins are directed by signals within those glycoproteins. many viruses bud through the cell plasma membrane (figs. 2.25b-f); in polarized cells, only one side of the cell may be used. other viruses, such as the coronaviruses and the bunyaviruses, use the endoplasmic reticulum or other internal membranes. the herpesviruses replicate in the nucleus and the nucleocapsid assembles in the nucleus; in this case, the first budding event is through the nuclear membrane (fig. 2.25a) . although the nucleocapsid of most enveloped viruses assembles independently within the cell and then buds to acquire an envelope, exceptions are known. the example of retroviruses, some of which assemble a nucleocapsid during virus budding, was discussed earlier. in these viruses, morphogenesis is a coordinated event. the forces that result in virus budding are not well understood for most enveloped viruses. in the case of the alphaviruses, there is evidence for specific interactions between the cytoplasmic domains of the glycoproteins and binding sites on the nucleocapsid proteins. the model for budding of these viruses is that the nucleocapsid first binds to one or a few glycoprotein heterodimers at the plasma membrane. by a process of lateral diffusion, additional glycoprotein heterodimers move in and are bound until a full complement is achieved and the virus is now outside the cell. additional free energy for budding is furnished by lateral interactions between the glycoproteins, which form a contiguous layer on the surface (fig. 2.18c ). this model accounts for the regularity of the virion, the one-to-one ratio of the structural proteins in the virion, and the requirement of the virus for its own glycoproteins in order to bud. in other enveloped viruses, however, there is little evidence for nucleocapsid-glycoprotein interactions. 
the protein composition of the virion is usually not fixed, but can vary within limits. in fact, glycoproteins from unrelated viruses can often be substituted. in the extreme case of retroviruses, noninfectious virus particles will form that are completely devoid of glycoprotein. the matrix proteins appear to play a key role in the budding process, as do other protein-protein interactions that are yet to be determined. for most animal viruses, there are one or more cleavages in structural protein precursors during assembly of virions that are required to activate the infectivity of the virion. interestingly, these cleavages may either stabilize or destabilize the virion in the extracellular environment, depending on the virus. many of these cleavages are effected by viral proteases, whereas others are performed by cellular proteases present in subcellular organelles. virions are formed by the spontaneous assembly of components in the infected cell, sometimes with the aid of assembly factors ("scaffolds") that do not form components of the mature virion. for most nonenveloped viruses, complete assembly occurs within the cell cytoplasm or nucleoplasm. for enveloped viruses, final assembly of the virus occurs by budding through a cellular membrane. in either case, the virion must subsequently disassemble spontaneously on infection of a new cell. the cleavages that occur during assembly of the virus potentiate penetration of a susceptible cell after binding of the virus to it, and the subsequent disassembly of the virion on entry into the cell. a few examples will be described that illustrate the range of cleavage events that occur in different virus families. in the picornaviruses, a provirion is first formed that is composed of the viral rna complexed with three viral proteins, called vp0, vp1, and vp3. during maturation to form the virion, vp0 is cleaved to vp2 and vp4. 
no protease has been found that performs this cleavage, and it has been postulated that the virion rna may catalyze it. cleavage to produce vp4, which is found within the interior of the capsid shell, as illustrated schematically in fig. 2.9 , is required for the virus to be infectious. as described in chapter 1, vp4 appears to be required for entry of the virus into the cell. this maturation cleavage has another important consequence. whereas the provirion is quite unstable, the mature virion is very stable. the poliovirus virion will survive treatment with proteolytic enzymes and detergents, and survives exposure to the acidic ph of less than 2 that is present in the stomach. only on binding to its receptor (figs. 1.5 and 2.9) is poliovirus destabilized such that vp4 can be released for entry of the viral rna into the host cell. similarly, the insect nodaviruses first assemble as a procapsid containing the rna and 180 copies of a single protein species called α (44 kda). over a period of many hours, spontaneous cleavage of α occurs to form β (40 kda) and γ (4 kda). this cleavage is required for the particle to be infectious. these events in nodaviruses have been well studied because it has been possible to assemble particles in vitro, and the structures of both cleaved and uncleaved particles have been solved to atomic resolution. rotaviruses, which form a genus in the family reoviridae, must be activated by cleavage with trypsin after release from an infected cell in order to be infectious. trypsin is present in the gut, where the viruses replicate, and activation occurs normally during the infection cycle of the virus in animals. when the viruses are grown in cultured cells, however, trypsin must be supplied exogenously. a different type of cleavage event occurs during assembly of retroviruses and adenoviruses, as well as of a number of other viruses. 
during assembly of retroviruses, the gag and gag-pol precursor polyproteins are incorporated, together with the viral rna, into a precursor nucleocapsid. these polyproteins must be cleaved into several pieces by a protease present in the polyprotein if the virus is to be infectious. these cleavages often visibly alter the structure of the particle as seen in the electron microscope (fig. 2.21 ). an analogous situation occurs in adenoviruses, where a viral protease processes a protein precursor in the core of the immature virion. in most enveloped viruses, one of the envelope proteins is produced as a precursor whose cleavage is required to activate the infectivity of the virus. this cleavage may occur prior to budding, catalyzed by a host enzyme called furin, or may occur after release of the virus, catalyzed by other host enzymes. the example of the hemagglutinin of influenza virus was described in chapter 1. this protein is produced as a precursor called ha0, which is cleaved to ha1 and ha2 ( fig. 1.6 ). cleavage is required to potentiate the fusion activity present at the n terminus of ha2. as a second example, alphaviruses produce two envelope glycoproteins that form a heterodimer and one of the glycoproteins is produced as a precursor, as described before. the heterodimer containing the uncleaved precursor is quite stable, so that a particle containing uncleaved heterodimer is not infectious. the cleaved heterodimer, which is required for virus entry, is much less stable and dissociates readily during infection. thus, in contrast to the poliovirus maturation cleavage, the alphavirus cleavage makes the virion less stable rather than more stable. maturation cleavages also occur in the envelope glycoproteins of retroviruses, paramyxoviruses, flaviviruses, and coronaviruses. dna or rna has a high net negative charge, and there is a need for counterions to neutralize this charge in order to form a virion. 
in many viruses, positively charged polymers are incorporated that neutralize half or so of the nucleic acid charge. the dna in the virions of the polyomaviruses is complexed with cellular histones. the viral genomes in these viruses have been referred to as minichromosomes. in contrast, the adenoviruses encode their own basic proteins that complex with the genome in the core of the virion. another strategy is used by the herpesviruses, which incorporate polyamines into the virion. herpes simplex virus has been estimated to incorporate 70,000 molecules of spermidine and 40,000 molecules of spermine, which would be sufficient to neutralize about 40% of the dna charge. among rna viruses, the nucleocapsid proteins are often quite basic and neutralize part of the charge on the rna. as one example, the n-terminal 110 amino acids of the capsid protein of sindbis virus have a net positive charge of 29. the positive charges within this domain of the 240 capsid proteins in a nucleocapsid would be sufficient to neutralize about 60% of the charge on the rna genome. this charged domain is thought to penetrate into the interior of the nucleocapsid and complex with the viral rna. virions differ greatly in stability, and these differences are often correlated with the means by which viruses infect new hosts. viruses that must persist in the extracellular environment for considerable periods, for example, must be more stable than viruses that pass quickly from one host to the next. as an example of such requirements, consider the closely related polioviruses and rhinoviruses, members of two different genera of the family picornaviridae. these viruses shared a common ancestor in the not too distant past and have structures that are very similar. the polioviruses are spread by an oral-fecal route and have the ability to persist in a hostile extracellular environment for some time where they may contaminate drinking water or food. 
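the sindbis virus charge estimate given earlier in this section can be checked with simple arithmetic. the sketch below multiplies the stated 240 capsid proteins by the stated net charge of +29 per n-terminal domain and compares the total with the negative charge on the genome, taking one negative charge per nucleotide. the genome length of roughly 11,700 nucleotides is an assumption supplied for illustration; it is not stated in the text.

```python
# Values stated in the text.
CAPSID_COPIES = 240            # capsid proteins per nucleocapsid
NET_CHARGE_PER_DOMAIN = 29     # net positive charge of the N-terminal 110 residues

# Assumption (not from the text): approximate Sindbis genome length in nucleotides,
# with one negative phosphate charge counted per nucleotide.
ASSUMED_GENOME_NT = 11_700


def fraction_neutralized(copies: int, charge_per_copy: int, genome_nt: int) -> float:
    """Fraction of the genome's negative charge offset by the protein's positive charge."""
    return (copies * charge_per_copy) / genome_nt


total_positive = CAPSID_COPIES * NET_CHARGE_PER_DOMAIN
fraction = fraction_neutralized(CAPSID_COPIES, NET_CHARGE_PER_DOMAIN, ASSUMED_GENOME_NT)

print(f"positive charges: {total_positive}")
print(f"fraction of rna charge neutralized: {fraction:.2f}")
```

under this assumed genome length the result is close to the figure of about 60% quoted in the text.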
furthermore, they must pass through the stomach, where the ph is less than 2, to reach the intestinal tract where they begin the infection cycle. it is not surprising, therefore, that the poliovirion is stable to storage and to treatments such as exposure to mild detergents or to ph < 2. in contrast, rhinoviruses are spread by aerosols or contaminated mucus, and spread normally requires close contact. the rhinovirion is less stable than the poliovirion. it survives for only a limited period of time in the external environment and is sensitive to treatment with detergents or exposure to ph 3.

thin section of a herpes simplex virion (herpesviridae) in an infected hep-2 cell. the particle is apparently coated with an inner envelope, and is in the process of acquiring its outer envelope from the nuclear membrane. from roizman (1969). (b) machupo virus (arenaviridae) budding from a raji cell. from murphy et al. (1969). (magnification: 120,000×). (c) sindbis virus (togaviridae) budding from the plasma membrane of an infected chicken cell [...] 000×). (e) a portion of the cell surface with sv5 filaments (paramyxoviridae) in the process of budding. from compans and choppin (1973). (45,000×). (f) a row of sv5 virions budding from the surface of a monkey kidney cell. cross sections of the nucleocapsid can be seen within several of the particles

general structure of viruses principles of virus structure functional implications of protein-protein interactions in icosahedral viruses principles of virus structure structural biology of viruses ultrastructure of animal viruses and bacteriophages: an atlas. ultrastructure in biological systems cryoelectron microscopy adding the third dimension to virus life cycles: three-dimensional reconstruction of icosahedral viruses from cryo-electron micrographs. 
microbiol membrane proteins organize a symmetrical virus mapping the structure and function of the e1 and e2 glycoproteins of alphaviruses structures of immature flavivirus particles conformational changes of the flavivirus e glycoprotein visualization of membrane protein domains by cryo-electron microscopy of dengue virus the refined crystal structure of hexon, the major coat protein of adenovirus type 2, at 2.9å resolution nucleocapsid and glycoprotein organization in an enveloped virus three-dimensional structure of poliovirus at 2.9å resolution the fusion glycoprotein shell of semliki forest virus: an icosahedral assembly primed for fusogenic activation at endosomal ph a ligand-binding pocket in the dengue virus envelope glycoprotein x-ray crystallographic structure of the norwalk virus capsid structure of a human common cold virus and functional relationship to other picornaviruses the structure of simian virus 40 refined at 3.1å resolution the canyon hypothesis in a nutshell: structure and assembly of the vaccinia virion structure of dengue virus: implications for flavivirus organization, maturation, and fusion a structural perspective of the flavivirus life cycle rotavirus proteins: structure and assembly bluetongue virus assembly and morphogenesis budding of alphaviruses key: cord-001093-5l0fthw3 authors: jin, fan; yu, chen; lai, luhua; liu, zhirong title: ligand clouds around protein clouds: a scenario of ligand binding with intrinsically disordered proteins date: 2013-10-03 journal: plos comput biol doi: 10.1371/journal.pcbi.1003249 sha: doc_id: 1093 cord_uid: 5l0fthw3 intrinsically disordered proteins (idps) were found to be widely associated with human diseases and may serve as potential drug design targets. however, drug design targeting idps is still in the very early stages. 
progress in drug design is usually achieved using experimental screening; however, the structural disorder of idps makes it difficult to characterize their interaction with ligands using experiments alone. to better understand the structure of idps and their interactions with small molecule ligands, we performed extensive simulations on the c-myc(370–409) peptide and its binding to a reported small molecule inhibitor, ligand 10074-a4. we found that the conformational space of the apo c-myc(370–409) peptide was rather dispersed and that the conformations of the peptide were stabilized mainly by charge interactions and hydrogen bonds. under the binding of the ligand, c-myc(370–409) remained disordered. the ligand was found to bind to c-myc(370–409) at different sites along the chain and behaved like a ‘ligand cloud’. in contrast to ligand binding to more rigid target proteins that usually results in a dominant bound structure, ligand binding to idps may better be described as ligand clouds around protein clouds. nevertheless, the binding of the ligand and a non-ligand to the c-myc(370–409) target could be clearly distinguished. the present study provides insights that will help improve rational drug design that targets idps. intrinsically disordered proteins (idps), discovered in the 1990s, are proteins that lack a stable three-dimensional native structure under physiological conditions [1] [2] [3] [4] [5] . idps are sometimes described as ''protein clouds'' because of their structural flexibility and dynamic conformation ensemble [6] . various bioinformatics methods have been developed to predict idps based on their sequences [7, 8] . it was revealed that idps are abundant in all kingdoms of life; for example, more than 40% of the proteins in eukaryotic cells possess disordered regions longer than 50 residues [9, 10] . 
because of the flexibility of the chain and the resulting advantages in protein-protein interactions [1, 11, 12], idps play important roles in various critical physiological processes such as the regulation of transcription and translation [2], cellular signal transmission, protein phosphorylation and molecular assemblies [3, 13, 14]. on the other hand, idps also have some adverse effects. it was revealed that many idps are associated with human diseases such as cancer, cardiovascular disease, amyloidosis, neurodegenerative diseases, and diabetes [15]. it was also reported that the swiss-prot keywords for eleven severe diseases are strongly correlated with idps [16]. given their abundance and their biological importance, idps are regarded as promising potential drug targets [15, 17-19]. compared with rational drug design targeting ordered proteins [20-22], drug design targeting idps is still in its infancy. though some general strategies have been proposed [23], most of the studies [24-30] have been limited to only a few systems, namely, p53-mdm2, ews-fli1 and c-myc-max. among them, the oncoprotein c-myc is an encouraging example. c-myc is a transcription factor with a basic helix-loop-helix leucine zipper (bhlhzip) domain which becomes active by forming a dimer with its partner protein max [31]. in their unbound forms, both c-myc and max are disordered. however, on dimerization they undergo coupled folding and binding. in most cancer cells, c-myc protein is expressed persistently by a mutated myc gene, causing its unregulated expression in cell proliferation and signal transmission. therefore, inhibiting the overexpression of c-myc and/or its dimerization with max may provide a therapy for cancer. yin et al. [30] have used high-throughput experimental screening to successfully identify seven compounds that inhibit dimerization between c-myc and max. 
further biophysical studies using nuclear magnetic resonance (nmr), circular dichroism (cd) and fluorescence assays have verified three different binding sites (residues 366-375, 374-385, and 402-409) in the bhlhzip domain of c-myc [28]. these binding sites contain several successive residues that can independently bind different small molecules [28-30]. it should be noted that, after binding with the small molecule inhibitors, the c-myc sequence remains disordered, making the detailed experimental characterization of the molecular interactions almost impossible. therefore, the inhibition mechanism is still unclear. for example, a recent study using drift-time ion mobility mass spectrometry suggested that the binding between c-myc and these inhibitors is not as specific as previously thought [32]. the lack of conformation data also hampers the application of the well-developed structure-based drug design approach to optimize the inhibition. molecular simulations are useful in understanding the characteristics of idps because they can provide an atomic description of molecular interactions. coarse-grained models [11, 33-35] and all-atom simulation [36-42] have both been used to investigate idps. recently, knott and best [40] used large-scale replica exchange molecular dynamics (remd) simulations with a well-parameterized force field to obtain a conformational ensemble of the nuclear coactivator binding domain of the transcriptional coactivator cbp. their simulation results were in good agreement with nmr and small-angle x-ray scattering measurements, validating the efficacy of all-atom simulations in exploring the highly dynamic conformations of idps. 
for the c-myc/inhibitor complex described above, michel and cuchillo [43] built a structural ensemble using all-atom simulations for c-myc(402–412) with and without an inhibitor (10058-f4) and found that 10058-f4 bound to multiple distinct binding sites and interacted with c-myc(402–412). however, because the c-myc segment used in their simulation contained only the 11 residues that covered the binding sites of 10058-f4 (residues 402-409), it is unclear how the inhibitors would interact with longer segments of c-myc and how specific the interaction would be. in the present study, we conducted extensive all-atom molecular dynamics (md) simulations to investigate the c-myc(370–409) conformational ensemble and its interactions with a small-molecule inhibitor (10074-a4). first, we performed implicit-solvent remd simulations to clarify the conformational features of the unbound c-myc(370–409). next, we performed md simulations with an explicit water model to explore in detail the interactions between c-myc(370–409) and 10074-a4. finally, a negative control using a different peptide segment (c-myc(410–437)) was simulated to address the issue of interaction specificity. the conformational ensemble that we obtained will be useful not only in clarifying the structural features of c-myc and the binding mechanism with inhibitors, but also in providing reference structures for drug design targeting c-myc via structure-based approaches. conformational sampling of idps for molecular modeling is challenging because the energy landscapes of idps are relatively flat [44, 45]. in the present study, extensive remd simulations using an implicit solvent model were performed to explore the conformational characteristics of c-myc(370–409). the accumulative total of simulation time reached 34.5 μs (see methods). c-myc(370–409) is a 40-residue truncated construct of a full-length c-myc. 
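replica exchange molecular dynamics, mentioned above, runs copies of the system at different temperatures and periodically attempts to swap neighbouring replicas using a metropolis criterion, which helps sampling on flat energy landscapes. the sketch below shows that standard acceptance rule and a geometric temperature ladder. it is a generic illustration and not the protocol used in this study, which is described in the methods; the temperatures, energies, and replica count are hypothetical.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def exchange_probability(e_i: float, t_i: float, e_j: float, t_j: float) -> float:
    """Metropolis acceptance probability for swapping replicas i and j.

    p = min(1, exp[(beta_i - beta_j) * (E_i - E_j)]), with beta = 1 / (k_B * T).
    Energies are potential energies in kcal/mol, temperatures in kelvin.
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(e_i: float, t_i: float, e_j: float, t_j: float,
                 rng: random.Random) -> bool:
    """Return True if the swap is accepted for this random draw."""
    return rng.random() < exchange_probability(e_i, t_i, e_j, t_j)


def geometric_ladder(t_min: float, t_max: float, n: int) -> list[float]:
    """Temperatures spaced by a constant ratio between t_min and t_max (n >= 2)."""
    ratio = (t_max / t_min) ** (1.0 / (n - 1))
    return [t_min * ratio ** i for i in range(n)]


# Hypothetical example: 5 replicas between 300 K and 500 K.
ladder = geometric_ladder(300.0, 500.0, 5)
# The colder replica holds the higher energy, so the swap is always accepted.
p_favourable = exchange_probability(-100.0, 300.0, -110.0, 310.0)
# The colder replica already holds the lower energy, so acceptance is below 1.
p_unfavourable = exchange_probability(-110.0, 300.0, -100.0, 310.0)
```

a geometric ladder is a common starting choice because it keeps the ratio between neighbouring temperatures constant; production setups often tune the spacing further.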
the conformational properties of c-myc 370-409 in its bound state (with 10074-a4) and in its more dynamic unbound state have been studied experimentally using cd and nmr spectroscopy, and a likely average conformation was built from chemical shift data; this conformation is not meant to (and cannot) define detailed structural features [28]. we compared our simulation results with the available experimental results. to assess the sampling quality of the remd simulations, we computed 1h and 13c chemical shifts from the simulated conformational ensemble using shifts [46] and compared the computed values with the experimental values (figure 1). the agreement is reasonable, though not excellent. deviations between the average chemical shift values for a simulated ensemble and experimental values have been observed previously in several studies on idps [40, 47, 48]. chemical shift calculations performed using several other programs (shiftx [49], camshift [50], sparta+ [51]) also showed deviations between the computed and experimental values (figure s1). a possible reason is that chemical shifts are difficult to calculate accurately and that the underlying parameterizations in current chemical shift software have been optimized for ordered proteins but not for idps [47]. interestingly, when we back-calculated chemical shifts from the nmr-refined structure using either the shifts [46] or shiftx [49] software, the resulting values also deviated from the experimental ones (figure s2). in addition, the ensemble nature of idp conformations suggests that the chemical shifts of idps should be described as a distribution, and not merely as average values. the calculated distributions of the hα chemical shifts obtained from our simulations are summarized in figure 2. all the hα chemical shifts are distributed over a broad range.
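the comparison between ensemble-averaged and experimental chemical shifts can be sketched as follows. this is a minimal illustration and not the authors' workflow: the per-frame shift values and the experimental numbers are made-up placeholders, and the function names are hypothetical. in practice the per-frame values would come from a shift predictor run on every snapshot.

```python
from statistics import mean

def ensemble_average(per_frame_shifts):
    """Average predicted shifts over frames.

    per_frame_shifts: list of dicts mapping residue number -> shift (ppm).
    """
    residues = per_frame_shifts[0].keys()
    return {r: mean(frame[r] for frame in per_frame_shifts) for r in residues}

def mean_abs_deviation(computed, experimental):
    """Mean absolute deviation over residues that have an experimental value."""
    shared = [r for r in computed if r in experimental]
    return mean(abs(computed[r] - experimental[r]) for r in shared)

# Placeholder data: two frames, two residues (values are illustrative only).
frames = [{402: 4.50, 403: 4.10}, {402: 4.70, 403: 4.30}]
experiment = {402: 4.55, 403: 4.25}

average = ensemble_average(frames)
deviation = mean_abs_deviation(average, experiment)
print(f"mean absolute deviation: {deviation:.2f} ppm")
```

keeping the per-frame values, rather than only the average, also gives the distribution of shifts per residue.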
the experimental values, indicated by arrows in figure 2, are located close to the centers of the distributions, which supports the validity of the conformational sampling. data for the hn, cα and cβ chemical shifts are given in figure s3 and show similar behavior to the hα chemical shifts. we also computed the distribution of the backbone dihedral angles (ramachandran (φ,ψ) plot) for the simulations; the dihedral angles of the nmr-refined apo structure lie well within the simulated distributions (figure s4). the secondary structure content of the simulated structures was also calculated [43, 52-54] and compared with that estimated from the experimental chemical shifts (figure 3). the helix and polyproline ii content of the simulated structures was consistent with the experimental estimates (figure 3a). however, the sheet content of the simulated structures was much lower than the sheet content of the experimental structures. in a previous study [43] on a shorter c-myc segment, c-myc 402-412, a similar underestimation of sheet content was observed in the simulated structures. the deficiency of sheet content in the simulated structures might be caused by a bias in the force field. although c-myc 370-409 is intrinsically disordered, it possesses a high content of residual helical structure (>25%). the simulated helix propensity (figure 3b) showed three helical regions separated by the proline residues pro382 and pro391.
intrinsically disordered proteins (idps) exist as conformational ensembles that change rapidly. they are an important and common class of proteins in all kingdoms of life. idps are widely associated with human diseases and may serve as potential drug design targets. however, drug design targeting idps is difficult and only limited examples have been reported. one example is the oncoprotein c-myc, for which seven inhibitors were discovered by experimental screening.
understanding how small inhibitor molecules bind to c-myc may help in understanding the binding mechanism of idps with ligands. in the present study, we conducted extensive molecular dynamics simulations to explore the binding mechanism of the c-myc peptide with an inhibitor, 10074-a4. we found that 10074-a4 could bind to c-myc 370-409 at different sites along the peptide chain and that its binding behavior could be described as a 'ligand cloud'. even in the bound state, the structure of the c-myc 370-409 peptide remained a dynamic ensemble. compared to c-myc peptides that do not bind to 10074-a4, c-myc 370-409 binds selectively with 10074-a4, but the specificity of binding was not high. the interactions of idps with ligands can perhaps be described as a scenario of ligand clouds around protein clouds.
to clarify the conformational features of c-myc 370-409, backbone-rmsd clustering of the conformations with a cutoff of 2.0 å was performed. representative structures (the central structure of each group) of the first eight groups are depicted in figure 4. they are all somewhat collapsed compared to the fully extended structure and retain substantial residual helical structure. these well-populated states will be useful references for rational drug design targeting c-myc. the existence of residual structure may be related to the functional misfolding that prevents idps from unwanted interactions with non-native partners [55]. a quantitative analysis of the distributions of dimension and helix content is provided in figure s5. the mean radius of gyration is 10.3 ± 0.6 å, which is much smaller than the value expected for a random coil of the same chain length (18.5 å). the mean helix content of the conformational ensemble is 27.7 ± 11.1%, showing a broad distribution. these results indicate that c-myc 370-409 is disordered in nature and that interconversions between dispersed structures occur.
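cutoff-based rmsd clustering of the kind described above can be sketched as follows. the text does not name the clustering algorithm, so the greedy neighbour-count scheme below is an assumption chosen for illustration; the rmsd matrix is a toy placeholder and the function name is hypothetical.

```python
def cluster_by_cutoff(rmsd, cutoff=2.0):
    """Greedy neighbour-count clustering on a symmetric RMSD matrix (in angstrom).

    Repeatedly picks the structure with the most neighbours within `cutoff`
    as a cluster centre, assigns those neighbours to it, and removes them.
    Returns a list of (centre_index, member_indices), largest cluster first.
    """
    remaining = set(range(len(rmsd)))
    clusters = []
    while remaining:
        neighbours = {
            i: [j for j in remaining if rmsd[i][j] <= cutoff] for i in remaining
        }
        # Most neighbours wins; ties are broken by the lowest index.
        centre = min(remaining, key=lambda i: (-len(neighbours[i]), i))
        members = sorted(neighbours[centre])
        clusters.append((centre, members))
        remaining -= set(members)
    return clusters

# Toy pairwise backbone RMSD matrix for five snapshots (illustrative values).
toy_rmsd = [
    [0.0, 1.0, 1.5, 5.0, 6.0],
    [1.0, 0.0, 1.2, 5.5, 6.2],
    [1.5, 1.2, 0.0, 4.8, 5.9],
    [5.0, 5.5, 4.8, 0.0, 1.1],
    [6.0, 6.2, 5.9, 1.1, 0.0],
]
clusters = cluster_by_cutoff(toy_rmsd, cutoff=2.0)
populations = [len(members) / len(toy_rmsd) for _, members in clusters]
```

the cluster centre plays the role of the representative (central) structure, and the populations give the occupancy of each group.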
to reveal how the conformations of apo c-myc 370-409 were stabilized, we analyzed the lennard-jones and electrostatic residue-residue interactions among all the residues (figure s6). the lennard-jones interactions were rather weak (figure s6a), indicating that the conformations were disordered and that the packing in the collapsed structures was poor. this finding is consistent with the contact map, which showed that residue-residue contacts were dispersed and low in magnitude (figure s6b). the electrostatic interactions, on the other hand, were comparatively strong (figure s6c), probably because nearly one-third of the residues in c-myc 370-409 (12 out of the 40) are charged. the favorable electrostatic interactions of the arg372, arg378, lys389, lys392 and lys398 residues with the asp379, glu383, glu385 and glu409 residues (figure s6c) are the result of the electrostatic attraction between residues with opposite charges. residues such as ser373 and gln380 also contributed to the electrostatic interactions by forming hydrogen bonds (figure s6d). therefore, charge-pair interactions and hydrogen bonds were the main stabilizing factors for the c-myc 370-409 conformations. we conducted md simulations with an explicit solvent model to investigate the interactions between c-myc 370-409 and the inhibitor 10074-a4. 10074-a4 is the only inhibitor (among the seven inhibitors of c-myc) that binds to the 375-385 site in the loop region of the bhlhzip domain of c-myc, and we wanted to see whether stable local structures were induced when 10074-a4 interacted with the flexible loop region. in the experimental study, 10074-a4 was a mixture of two chiral forms, the s and r forms (figure 5). in the simulations, both chiral forms were tested. for comparison, apo c-myc 370-409 was simulated with the same explicit solvent model. the accumulated simulation time for each group was 7 μs (see methods).
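the residue counts quoted above can be checked directly against the c-myc 370-409 sequence given in the methods of this paper. the sketch below counts lys/arg as positive and asp/glu as negative at neutral ph; that charge assignment is an assumption (the sequence contains no histidine, so the choice does not affect the result).

```python
# c-Myc 370-409 sequence as listed in the methods (accession P01106).
SEQUENCE = "LKRSFFALRDQIPELENNEKAPKVVILKKATAYILSVQAE"
FIRST_RESIDUE = 370
POSITIVE = set("KR")
NEGATIVE = set("DE")

def residues_of(seq, first, kinds):
    """Return (residue_number, one_letter_code) for residues whose code is in `kinds`."""
    return [(first + i, aa) for i, aa in enumerate(seq) if aa in kinds]

positive = residues_of(SEQUENCE, FIRST_RESIDUE, POSITIVE)
negative = residues_of(SEQUENCE, FIRST_RESIDUE, NEGATIVE)
prolines = [number for number, _ in residues_of(SEQUENCE, FIRST_RESIDUE, {"P"})]

print(len(SEQUENCE), "residues,", len(positive) + len(negative), "charged")
print("prolines at", prolines)
```

the count of 12 charged residues out of 40 and the two prolines at positions 382 and 391 agree with the text.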
we calculated and compared the simulated chemical shifts with experimental chemical shifts for both the implicit-solvent remd and the explicit-solvent simulations (figures s7, s8, s9, s10). reasonable agreement was found. for example, the average discrepancy between the simulated and experimental chemical shifts for hα atoms of apo c-myc 370-409 is 0.14 and 0.16 in the md simulations with the explicit solvent model and the remd simulations, respectively (see table s1). the relative binding free energy of c-myc 370-409 with the two chiral 10074-a4 forms was analyzed from the md trajectories using the molecular mechanics/poisson-boltzmann surface area (mm/pbsa) method [56]. the results of this analysis, together with the average non-bonded interactions u non-bonded (lennard-jones and electrostatic potentials) between c-myc 370-409 and 10074-a4, are given in table 1. we found that the interaction between c-myc 370-409 and the s form of 10074-a4 was much stronger than the interaction with the r form. the difference in u non-bonded between the s and r forms (-3.7 kcal/mol) was close to the difference in Δh from mm/pbsa (-3.2 kcal/mol). the difference in binding free energy between the s and r forms was -2.2 kcal/mol, which corresponds to a markedly higher binding affinity for the s form than for the r form. therefore, compared with the binding of the s form to c-myc 370-409, the binding of the r form can be ignored. thus, only the holo system with the s form of 10074-a4 is discussed further.
figure 3 (caption): for the remd simulations (red), the helix and sheet content was computed using the dssp [52] method; the polyproline ii content was computed with the pross software [53]. for the experimental data (black), the secondary structure content was estimated from the chemical shifts using d2d [54]. b: helix propensity from the remd simulations using the dssp method. doi:10.1371/journal.pcbi.1003249.g003
hammoudeh et al.
[28] reported an induced circular dichroism (icd) effect on c-myc 370-409 upon binding of a racemate (1:1 mixture of the s and r forms) of 10074-a4. there were two possible reasons for the observed icd effect [28]: either the chiral surroundings affected the absorption transition of the compound, or an enantiomer-specific effect (the different binding affinities of the s and r forms) led to the icd effect. we have shown above that the s form of 10074-a4 bound much more strongly to c-myc 370-409 than the r form. therefore, we suggest that the enantiomer-specific effect was responsible for the observed icd effect. further experiments using single chiral forms of 10074-a4 would be helpful in clarifying this observation. we clustered the conformations from the md simulations with the explicit solvent model for both the apo and holo c-myc 370-409 peptide based on the rmsd of the backbone atoms. figures 6 and 7 show the representative conformations for the top eight clusters of the apo and holo peptides. it is clear that both the apo and the holo peptides have a rather broad conformational distribution, which is typical of disordered proteins. upon binding to the ligand 10074-a4, the conformational distribution became more condensed. the top eight conformation clusters of the holo peptide were more highly populated than those of the apo peptide, with a total occupancy of about 77% compared to 50%. similar to the apo c-myc 370-409 structure, the holo c-myc 370-409 structure is rich in helical structures. a quantitative analysis indicated that the helix and polyproline ii content was almost unaffected by the binding of 10074-a4 (figure s11), while the sheet content was enhanced (see also figure 7). the electrostatic interactions (from both charged residues and hydrogen bonding) dominated the intramolecular stabilizing force for holo c-myc 370-409 (figure s12). the binding free energy was calculated using the mm/pbsa method [56].
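a free-energy difference between two ligands translates into a ratio of binding affinities through a boltzmann factor, exp(-ΔΔg/rt). the sketch below applies this to the s and r forms, assuming a difference of -2.2 kcal/mol (reading the value quoted for the binding free energy difference as a negative number) and a temperature of 300 k as used in the simulations. the function name is hypothetical, and the result is only as reliable as the mm/pbsa estimate itself.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def affinity_ratio(delta_delta_g, temperature=300.0):
    """Ratio K_a(first) / K_a(second) for ddG = G(first) - G(second) in kcal/mol."""
    return math.exp(-delta_delta_g / (R_KCAL * temperature))

# Assumed inputs: ddG(S - R) = -2.2 kcal/mol at 300 K.
ratio_s_over_r = affinity_ratio(-2.2, temperature=300.0)
print(f"S:R affinity ratio is roughly {ratio_s_over_r:.0f}")
```

under these assumptions the s form binds roughly 40 times more strongly than the r form, which is consistent with neglecting the r form in a racemic mixture.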
table 1 notes: all quantities are in kcal/mol. *u non-bonded is the averaged non-bonded potential, composed of the lennard-jones and electrostatic potentials.
10074-a4 usually binds simultaneously to two or more regions that are flanked by several residues. the binding was highly dynamic and could switch between different modes within a trajectory. the percentage of time that each residue spent bound was calculated and is shown in figure 9. three binding sites were detected: site i (residues 372 to 384), site ii (387 to 395), and site iii (398 to 408). site i was near the n-terminus and showed stronger binding than the other two sites. this result was supported by the intermolecular interaction analysis (figure 10), which showed that both the electrostatic and lennard-jones interactions for site i were much stronger than those of the other two sites. in fact, in the latter cases, hydrogen bonds hardly formed and the electrostatic interactions were weak. site i was similar to the experimentally determined binding site of 10074-a4 on c-myc at residues 374-385 [28]. binding at all the other sites generated in our simulations was much weaker, which would make them difficult to observe experimentally. the low residue interaction specificity that we observed in the simulations is consistent with a recent simulation of an 11-residue peptide, c-myc 402-412, which suggested that ligand binding was driven by weak and nonspecific interactions [43]. the mass spectrometry experiment on c-myc reported by harvey et al. [32] also supported this conclusion. to further investigate the inherent specificity features of idps, we conducted a negative control study in which we chose another segment of c-myc (residues 410-437) that does not bind with 10074-a4 [28]. the simulated binding between c-myc 410-437 and 10074-a4 is shown in figure 11. the specificities of idps in molecular recognition are complicated [57].
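the per-residue binding time percentage can be sketched from solvent-accessible surface areas (sasa), following the idea in the methods that a residue in contact with the ligand loses sasa relative to the unbound state. the threshold value, the units and the toy numbers below are assumptions made for illustration, and the function name is hypothetical.

```python
def binding_time_percentage(sasa_unbound, sasa_bound, threshold=0.05):
    """Percentage of frames in which each residue counts as ligand-bound.

    sasa_unbound, sasa_bound: lists of frames, each a list of per-residue
    SASA values (same units, e.g. nm^2). A residue is counted as bound in a
    frame when its SASA drops by more than `threshold` on ligand binding.
    """
    n_frames = len(sasa_bound)
    n_residues = len(sasa_bound[0])
    percentages = []
    for r in range(n_residues):
        bound_frames = sum(
            1
            for f in range(n_frames)
            if sasa_unbound[f][r] - sasa_bound[f][r] > threshold
        )
        percentages.append(100.0 * bound_frames / n_frames)
    return percentages

# Toy data: four frames, three residues (illustrative values only).
unbound = [[1.0, 0.8, 0.5] for _ in range(4)]
bound = [
    [0.7, 0.8, 0.5],
    [0.6, 0.8, 0.5],
    [1.0, 0.7, 0.5],
    [0.9, 0.8, 0.5],
]
percentages = binding_time_percentage(unbound, bound)
```

contiguous stretches of residues with high percentages would then be grouped into binding sites.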
our simulation results showed that the specificity of c-myc in binding the small-molecule ligand 10074-a4 was not high. c-myc is a typical example of an idp. it is sticky and binds ligands at different regions with different interaction strengths. because folding is not coupled to binding, c-myc remains an ensemble of diverse conformations after binding, and the distinct conformations are all capable of binding the ligand. furthermore, for a given c-myc structure, the binding of the ligand occurred at dispersed sites (figure 12). we named this phenomenon ligand clouds. ligand clouds are remarkably different from the type of binding that is found in ordered proteins, where a dominant binding structure is formed. we expect that ligand clouds may be a general feature of idps binding with small-molecule ligands. for idps binding with macromolecular partners, it was reported that some idps remain disordered in the holo state [57]; examples include β-catenin/tcf4, β-catenin/apc peptide, β-catenin/phosphorylated apc, vif/elob/eloc, and errγ lbd/pgc-1α. these idp complexes assume dynamic structures upon binding, suggesting that idps may interact with their partners in a manner similar to the ligand clouds. the ligand cloud concept supports the idea that there is no definite binding mode in the interactions between idps and small-molecule inhibitors [43]. it suggests that the interactions could be described as protein clouds interacting with ligand clouds. the ligand cloud concept describes a scenario for the interactions between idps and small-molecule ligands and may provide a basis for drug design targeting idps. a straightforward strategy for rational drug design on idps is to extract metastable structures from simulations and then to conduct a virtual screen on them to identify potential inhibitors. a similar strategy was applied successfully in designing an inhibitor of aβ fibrillation [58].
however, the ligand cloud concept for small molecules binding with idps implies that strategies different from those used for ordered proteins should be developed for better rational drug design on idps. for example, because ligand binding on idps occurs at dispersed locations and in different orientations, multimode interactions should be considered in the scoring functions instead of the single-mode interaction that is commonly used for other proteins. therefore, schemes that can consider binding energy landscapes [59] might be expected to perform better when designing small-molecule ligands for idps. on the other hand, in contrast to conventional ordered proteins, which are in either "binding" or "non-binding" states with small molecules, idps are "sticky" and would be in either "strong binding" or "weak binding" states with small molecules. more care should therefore be taken with the problem of specificity in drug design targeting idps. for conventional ordered proteins, the binding conformation is unique and can be selected from pre-existing conformations (the conformational selection mechanism) or be induced (the induced fit mechanism) by particular ligands. the scenario of ligand clouds around protein clouds for idps indicates that multiple protein conformations are selected and/or induced by the binding of a ligand to idps. this may extend the conformational selection-induced fit continuum in a new dimension. in conclusion, we conducted extensive simulations to explore the conformational ensemble of c-myc 370-409 and its complex with a small-molecule inhibitor, 10074-a4. the conformational space was found to be rather dispersed. in contrast to conventional structured proteins, the conformations of c-myc 370-409 were mainly stabilized by charge interactions and hydrogen bonds. upon binding to 10074-a4, c-myc 370-409 remained disordered. the 10074-a4 ligand bound at different sites throughout the c-myc 370-409 chain with different strengths.
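binding at several sites with different strengths can be folded into a single effective free energy by summing boltzmann weights over the modes, g_eff = -rt ln Σ exp(-g_i/rt). the sketch below shows this standard statistical-mechanics relation as one possible way to score multimode binding; it is not a scoring function proposed in this paper, the mode energies are placeholders, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def effective_free_energy(mode_energies, temperature=300.0):
    """Combine per-mode binding free energies (kcal/mol) into one effective value.

    Uses G_eff = -RT * ln(sum_i exp(-G_i / RT)), evaluated relative to the
    lowest mode for numerical stability.
    """
    rt = R_KCAL * temperature
    lowest = min(mode_energies)
    partition = sum(math.exp(-(g - lowest) / rt) for g in mode_energies)
    return lowest - rt * math.log(partition)

# Placeholder energies for three binding modes (illustrative values only).
single_mode = effective_free_energy([-5.0])
multi_mode = effective_free_energy([-5.0, -4.5, -4.0])
```

with a single mode the result reduces to that mode's energy, while additional weaker modes lower the effective free energy, which is the contribution a single-mode score would miss.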
accordingly, a ligand cloud concept was proposed, that is, the interactions between small-molecule ligands and idps resemble ligand clouds around protein clouds. the different binding probabilities between the protein clouds and ligand clouds indicated that the ligand could be selective and thus specific. though the specificity of the binding was not high, the binding of ligand and non-ligand to the target idp could be clearly distinguished.
hammoudeh et al. [28] measured chemical shifts and several noe signals of c-myc 370-409 and predicted dihedral angle distributions and atomic contacts. to build the c-myc 370-409 peptide, we first built a completely extended conformation with the following sequence: 370 lkrsffalrdqipelennekapkvvilkkatayilsvqae 409 (accession number: p01106). we then built the initial structures from the reported dihedral angles [28] using pymol [60]. the apo and holo structures for c-myc 370-409 were refined further using the gromacs 4.5.4 software package [61] and the amber99sb force field, with the nmr data [28] as the dihedral angle and distance restraints in the simulation. each initial structure was minimized in vacuum. then, it was solvated, minimized, and equilibrated as described below. the time step was set to 0.5 fs. finally, a 5 ns production simulation was performed and the final structure was adopted as the refined structure. the conformations of the c-myc 370-409 peptide were sampled by remd simulations with a generalized born/surface area (gb/sa) implicit solvent model. the amber molecular simulations package was used with the amber99sb force field [62]. a total of 30 replicas were used, with temperatures ranging between 284.6 k and 608.8 k. all adjacent replicas attempted to exchange temperatures every 10 ps, with an average exchange rate between 35% and 40%.
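a replica temperature ladder of this kind can be sketched with geometric spacing. the text does not say how the 30 temperatures were chosen, so geometric spacing is an assumption; it is shown here because it reproduces, to within about 0.1 k, the temperatures of 292.2 k, 300 k and 308 k that this study uses for analysis. the function name is hypothetical.

```python
def geometric_ladder(t_min, t_max, n_replicas):
    """Temperatures spaced by a constant ratio between t_min and t_max (inclusive)."""
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio**i for i in range(n_replicas)]

# Values from the text: 30 replicas between 284.6 K and 608.8 K.
ladder = geometric_ladder(284.6, 608.8, 30)
for index in range(4):
    print(f"replica {index}: {ladder[index]:.1f} K")
```

a constant temperature ratio is a common choice because it tends to give roughly uniform exchange rates between neighbouring replicas when the heat capacity varies little with temperature.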
to produce the 30 starting conformations for an remd simulation, an initial structure (described below) was minimized using steepest descent for 500 steps followed by conjugate gradient for another 500 steps. the minimized conformation was then heated to the target temperature of each replica over 200 ps. the resulting conformations were used as starting conformations in the remd simulations, which were run with a time step of 2 fs. replica temperature was controlled with a coupling time constant of 2 ps. bonds involving hydrogen atoms were constrained with shake. chirality restraints on the backbone were employed to prevent non-physical chiralities. ionic strength was set to 0.2 m. the cutoff for non-bonded interactions and for the gb pairwise summations involved in calculating born radii was 999 å, so that all interactions were included. snapshots from each trajectory were stored every 10 ps. we conducted four groups of remd simulations with different initial structures: (a) the extended structure of the peptide; (b) the apo nmr-refined structure; (c) the structure after an 80-ns md simulation at 300 k starting from the extended conformation; and (d) the most populated representative conformation generated previously from the remd simulations of the extended structure in (a). the simulation time for the four groups of remd simulations was 150 ns, 270 ns, 210 ns and 520 ns, respectively. the total simulation time was 34.5 μs (1.15 μs per replica). the trajectories at 292.2 k, 300 k and 308 k were used in the further analyses, except that only the 300 k trajectory was used in the chemical shift calculations. to investigate the interactions between c-myc 370-409 and 10074-a4, md simulations of the complex were carried out with an explicit solvent model [63]. the apo c-myc 370-409 was also simulated with the same explicit solvent model for comparison.
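the remd totals quoted above follow from the four group lengths and the 30 replicas if the time units are read as microseconds. the short check below uses only numbers from the text; the variable names are illustrative.

```python
# Lengths of the four REMD groups (ns per replica), as listed in the text.
group_lengths_ns = {"a": 150, "b": 270, "c": 210, "d": 520}
n_replicas = 30
snapshot_interval_ps = 10

per_replica_ns = sum(group_lengths_ns.values())
total_us = per_replica_ns * n_replicas / 1000.0
snapshots_per_replica = per_replica_ns * 1000 // snapshot_interval_ps

print(f"{per_replica_ns / 1000:.2f} us per replica, {total_us:.1f} us in total")
print(f"{snapshots_per_replica} stored snapshots per replica")
```

the sum of 1150 ns per replica across 30 replicas gives 34.5 microseconds, so a reading in milliseconds would not be consistent with the group lengths.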
three groups of simulations were performed, one for the apo peptide and two for the holo peptide (one for each of the two chiral 10074-a4 forms; see figure 5). each group contained seven trajectories of 1 μs, so the total simulation time was 21 μs. one of the seven initial structures was the nmr-refined structure (apo or holo); the other six initial structures were taken from representative conformations generated previously in the 150-ns remd simulations (for the holo structures, the 10074-a4 isomers were docked using the autodock 4.2 program [64]). md simulations with the explicit solvent model were performed with the gromacs 4.5.4 software package [61] and the amber99sb force field under periodic boundary conditions with particle mesh ewald electrostatics. the tip4p-ew water model [63] was used with the amber99sb force field because of its previously reported good performance in other simulations of idps [36, 47, 65]. in the holo simulations, the small-molecule ligand 10074-a4 was parameterized using the general amber force field (gaff) with the acpype software [66]. the am1-bcc charge model [67] was used to assign charges to the ligand. each initial structure was immersed in a truncated octahedral box of explicit tip4p-ew water. the dimension of the box, defined as the distance between the farthest atoms of the peptide and the edge of the box, was set to 10 å. the system was neutralized by adding ions, and extra nacl was added to represent a solution with an ionic strength of 0.15 m. the system was minimized using the steepest descent approach. after the minimization, the system was equilibrated in the nvt ensemble with all heavy atoms restrained with a force constant of 239 kcal/mol. the temperature was maintained at 300 k using a v-rescale thermostat with a coupling constant of 0.1 ps.
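two of the numbers above can be cross-checked with simple arithmetic. the restraint force constant of 239 kcal/mol equals 1000 kj/mol after unit conversion; since gromacs position restraints are specified in kj mol^-1 nm^-2, it seems likely (though the text does not say so) that the value is 1000 kj mol^-1 nm^-2 with the per-area unit dropped. the explicit-solvent total follows from three groups of seven trajectories, reading each trajectory length as one microsecond.

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie

# Assumed origin of the quoted value: 1000 kJ/mol/nm^2 converted to kcal.
force_constant_kj = 1000.0
force_constant_kcal = force_constant_kj / KJ_PER_KCAL

# Explicit-solvent sampling: apo, holo (S) and holo (R) groups.
n_groups = 3
trajectories_per_group = 7
trajectory_length_us = 1.0
total_explicit_us = n_groups * trajectories_per_group * trajectory_length_us

print(f"{force_constant_kcal:.1f} kcal/mol/nm^2, {total_explicit_us:.0f} us in total")
```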
further equilibration was carried out in the npt ensemble without restraints, with the pressure maintained at 1 atmosphere using a parrinello-rahman barostat with the coupling constant set to 2.0 ps. both equilibrations were performed for 200 ps with a time step of 1 fs. for the production run, the thermostat and barostat settings were the same as for the npt run. to enable 2 fs time steps, bonds involving hydrogen atoms were constrained to their equilibrium lengths using the lincs algorithm [68]. a real-space cutoff of 10 å was used for the electrostatic and lennard-jones forces. snapshots from each trajectory were stored every 20 ps. to further investigate the inherent specificity features of idps, we conducted a negative control study using the c-myc 410-437 truncated peptide (410 eqkliseedllrkrreqlkhkleqlrns 437), which did not bind to 10074-a4. the extended structure of the peptide was used as the initial structure in an 80 ns implicit-solvent md simulation, and the final structure that was generated was used in all-atom explicit-solvent simulations. two groups of simulations were performed for each of the two chiral 10074-a4 isomers. each group contained one trajectory of 1 μs; the other parameters were the same as those used for the holo c-myc 370-409 simulations described above. all the simulations were analyzed using the gromacs utilities [61] with either pymol [60] or in-house scripts. Δsasa (the change in solvent-accessible surface area) was used to determine the binding sites: upon small-molecule binding, a residue in contact with the ligand shows a clear decrease in sasa, measured as the difference between the sasa of the bound and unbound states. backbone rmsd clustering of peptide conformations was performed to identify distinct structural clusters and to estimate their populations. the relative binding free energy was calculated every 200 ps using the mm/pbsa [56] method.
figure s1 comparisons of the computed and experimental chemical shifts for apo c-myc 370-409.
the computed values using shiftx (red circles), camshift (blue squares) and sparta+ (green squares) are from the remd simulations and the experimental values are from hammoudeh et al. [28] (black triangles). note that the experimental values for some residues were not available. chemical shifts are for the atoms: a hα, b hn
natively unfolded proteins: a point where biology waits for physics
intrinsic disorder and protein function
intrinsically unstructured proteins and their functions
intrinsically unstructured proteins
intrinsically disordered proteins: the new sequence-structure-function relations
drugs for 'protein clouds': targeting intrinsically disordered transcription factors
predicting intrinsic disorder in proteins: an overview
inherent relationships among different biophysical prediction methods for intrinsically disordered proteins
prediction and functional analysis of native disorder in proteins from the three kingdoms of life
comparing and combining predictors of mostly disordered proteins
kinetic advantage of intrinsically disordered proteins in coupled folding-binding process: a critical assessment of the "fly-casting" mechanism
smoothing molecular interactions: the "kinetic buffer" effect of intrinsically disordered proteins
malleable machines take shape in eukaryotic transcriptional regulation
exploring the binding diversity of intrinsically disordered proteins involved in one-to-many binding
intrinsically disordered proteins in human diseases: introducing the d2 concept
functional anthology of intrinsic disorder. 3. ligands, post-translational modifications, and diseases associated with intrinsically disordered proteins
intrinsically disordered proteins are potential drug targets
rational drug design via intrinsically disordered protein
novel strategies for drug discovery based on intrinsically disordered proteins (idps)
dynamic modeling of human 5-lipoxygenase-inhibitor interactions helps to discover novel inhibitors
discovery of multitarget inhibitors by combining molecular docking with common pharmacophore matching
virtual screening of novel noncovalent inhibitors for sars-cov 3c-like proteinase
rational drug design via intrinsically disordered protein
inhibition of the p53-mdm2 interaction: targeting a protein-protein interface
in vivo activation of the p53 pathway by small-molecule antagonists of mdm2
a small molecule blocking oncogenic protein ews-fli1 interaction with rna helicase a inhibits growth of ewing's sarcoma
improved low molecular weight myc-max inhibitors
multiple independent binding sites for small-molecule inhibitors on the oncoprotein c-myc
structural rationale for the coupled binding and unfolding of the c-myc oncoprotein by small molecules
low molecular weight inhibitors of myc-max interaction and function
x-ray structures of myc-max and mad-max recognizing dna: molecular bases of regulation by proto-oncogenic transcription factors
small-molecule inhibition of c-myc:max leucine zipper formation is revealed by ion mobility mass spectrometry
polyelectrostatic interactions of disordered ligands suggest a physical basis for ultrasensitivity
electrostatically accelerated coupled binding and folding of intrinsically disordered proteins
binding-induced folding of a natively unstructured transcription factor
optimizing protein-solvent force fields to reproduce intrinsic conformational preferences of model peptides
anchoring intrinsically disordered proteins to multiple targets: lessons from n-terminus of the p53 protein
energy landscape analyses of disordered histone tails reveal special organization of their conformational dynamics
a free-energy landscape for coupled folding and binding of an intrinsically disordered protein in explicit solvent from detailed all-atom computations
a preformed binding interface in the unbound ensemble of an intrinsically disordered protein: evidence from molecular simulations
residual structures, conformational fluctuations, and electrostatic interactions in the synergistic folding of two intrinsically disordered proteins
binding of two intrinsically disordered peptides to a multi-specific protein: a combined monte carlo and molecular dynamics study
the impact of small molecule binding on the energy landscape of the intrinsically disordered protein c-myc
ensemble modeling of protein disordered states: experimental restraint contributions and validation
constructing ensembles for intrinsically disordered proteins
automated prediction of 15n, 13cα, 13cβ and 13c' chemical shifts in proteins using a density functional database
structure and dynamics of the aβ(21-30) peptide from the interplay of nmr experiments and molecular simulations
homogeneous and heterogeneous tertiary structure ensembles of amyloid-β peptides
rapid and accurate calculation of protein 1h, 13c and 15n chemical shifts
fast and accurate predictions of protein nmr chemical shifts from interatomic distances
sparta+: a modest improvement in empirical nmr chemical shift prediction by means
dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features
a physical basis for protein secondary structure
determination of secondary structure populations in disordered states of proteins using nuclear magnetic resonance chemical shifts
intrinsically disordered proteins may escape unwanted interactions via functional misfolding
mmpbsa.py: an efficient program for end-state free energy calculations
do intrinsically disordered proteins possess high specificity in protein-protein interactions
inhibitor discovery targeting the intermediate structure of beta-amyloid peptide on the conformational transition pathway: implications in the aggregation mechanism of beta-amyloid peptide
binding energy landscape analysis helps to discriminate true hits from high-scoring decoys in virtual screening
the pymol molecular graphics system
algorithms for highly efficient, load-balanced, and scalable molecular simulation
development of an improved four-site water model for biomolecular simulations: tip4p-ew
autodock4 and autodocktools4: automated docking with selective receptor flexibility
atomic-level characterization of the ensemble of the aβ(1-42) monomer in water using unbiased molecular dynamics simulations and spectral algorithms
acpype - antechamber python parser interface
fast, efficient generation of high-quality atomic charges. am1-bcc model: i. method
lincs: a linear constraint solver for molecular simulations
the authors thank daqi yu, dr. changsheng zhang and dr. fangjin chen for helpful discussions. f.j. gratefully acknowledges the help of prof. tsun-mei chang and dr. shuangyu bi in preparing the manuscript.
key: cord-103528-3tib5o1m authors: ahmed, asad; mam, bhavika; sowdhamini, ramanathan title: deelig: a deep learning-based approach to predict protein-ligand binding affinity date: 2020-09-28 journal: biorxiv doi: 10.1101/2020.09.28.316224 sha: doc_id: 103528 cord_uid: 3tib5o1m
protein-ligand binding prediction has extensive biological significance. binding affinity helps in understanding the degree of protein-ligand interactions and has wide protein applications. protein-ligand docking using virtual screening and molecular dynamics simulations is required to predict the binding affinity of a ligand to its cognate receptor. such analyses require intense computational power, and it becomes impossible to cover the entire chemical space of small molecules.
it has been aided by a shift towards machine learning-based methodologies that aid binding prediction using regression. recent developments in deep learning have enabled us to make sense of massive amounts of complex data. herein, the ability of the model to “learn” intrinsic patterns in a complex plane of data is the strength of the approach. here, we have incorporated convolutional neural networks, which find spatial relationships among data, to help us predict the binding affinity of proteins in whole superfamilies towards a diverse set of ligands. the models were trained and validated using a detailed methodology for feature extraction. we have also tested deelig on protein complexes relevant to the current public health scenario. our approach to network construction and training on a protein-ligand dataset prepared in-house has provided significantly better results than previously existing methods in the field.
proteins are a diverse class of dynamic macromolecular structures in living organisms and are essential for the biochemistry and physiology of the organism. depending on their functional role(s), proteins may bind to other proteins, peptides, nucleic acids and non-peptide ligands with varying affinities. determining protein-ligand affinity helps in understanding the reaction mechanism and kinetics of the reaction and has applications in drug development and pharmacology (1). protein-ligand interaction is measured in terms of binding affinity. the stronger the measured binding affinity, the stronger the inferred interaction between protein and ligand. it is quantified in terms of inhibition constant (ki), dissociation constant (kd), changes in free energy measures (delta g, delta h) and half-maximal inhibitory concentration (ic50) (2). predicting binding affinity between a protein and ligand complements experimental approaches and is usually used as a starting point for the latter.
prediction-based approaches are also useful in cases where experimental determination of binding affinity may not be feasible. classical prediction methods to score free binding energies of small ligands to biological macromolecules, such as mm/gbsa and mm/pbsa, typically rely on molecular dynamics simulations for calculations and aid in silico docking and virtual screening as well as experimental approaches. however, there is a trade-off between computational resources and accuracy (3). with a recent shift towards the use of machine learning and deep learning-based methods in the field of structural biology, making biologically significant predictions using regression and 'learning' intrinsic patterns in a complex plane of available data has led to resource-optimal predictions without compromising on accuracy. deep learning has been known to learn representations and patterns in complex data forms. our aim was to apply deep learning to predict the binding affinity of protein-ligand interactions. convolutional neural networks (cnn) are deep neural networks that use an input layer, an output layer and one or more convolutional hidden layers. the first cnn was introduced by lecun in 1998 (4); its connectivity pattern was inspired by the elegant experiments of hubel and wiesel on the mammalian visual cortex in the 1960s (5). with growing technical advancements and massive amounts of data, cnns have become popular in biological fields in the recent decade, with various applications (6). in our study, we have used cnns to provide a quantitative estimate of protein-ligand binding using various sets of features corresponding to protein and ligand respectively, by finding spatial relationships amongst the data. our approach was validated using ligand-bound complexes from the kinase superfamily in the pdb. kinases belong to a class of enzymes required for substrate-dependent phosphorylation.
they are represented across diverse cellular functions such as signaling, differentiation and glycolysis (7). we have also tested our model on the covid-19 main protease (8) of the novel coronavirus strain complexed with various inhibitors whose binding affinities have not been predicted or experimentally determined so far. the raw data for our novel database were obtained from the rcsb pdb (9) database, where the following were selected as the query parameters. these criteria resulted in a list of 5464 protein pdb ids, 2568 complexed ligand(s) and corresponding binding affinity values. the search results include structures present in the pdb database, pdbbind (10, 11, 12), pdbmoad (13, 14) and scpdb (15). the initial raw database contained protein structures in pdb format, protein sequences in fasta format, ligands in sdf format and binding affinity values of the corresponding protein-ligand pairs for 5464 complexes. the filtered pdb, fasta and sdf files were further processed to refine our novel dataset, as shown in figure 1. protein-ligand complexes were 5,464 in number and corresponded to 29,650 unique chain-ligand pairs. binding affinity values were obtained from the rcsb database, and protein chain-ligand pairs with a corresponding binding affinity of 0 were discarded to reduce statistical errors. this narrowed down the total complexes to 4,750 protein-ligand pairs. pocket information was extracted from the protein using ghecom (16) and converted to mol2 format using chimera (17), which narrowed our results to 4699 pocket-ligand pairs. further filtering narrowed down the size of the dataset to 4286 pocket-ligand pairs. we discarded other protein-ligand pairs with missing pssm profiles, secondary structure or dihedral angle information. this resulted in a total of 4041 pocket-ligand pairs, which corresponds to 7414 pocket-ligand pairs containing unique chains.
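the successive filters described above can be sketched as a small pipeline that records how many pairs survive each step. this is only an illustrative sketch: the record type `Pair` and its field names are hypothetical and are not taken from the authors' code, and the three filters shown are the ones named in the text (non-zero affinity, successful pocket extraction, complete sequence-derived features).

```python
from dataclasses import dataclass


@dataclass
class Pair:
    """One protein chain-ligand pair (hypothetical schema for illustration)."""
    pdb_id: str
    chain: str
    ligand: str
    affinity: float               # reported binding affinity; 0 is treated as unusable
    has_pocket: bool              # pocket extraction succeeded
    has_sequence_features: bool   # pssm, secondary structure and dihedrals all present


def filter_pairs(pairs):
    """Apply the filters in order and report how many pairs survive each one."""
    steps = [
        ("non-zero affinity", lambda p: p.affinity != 0),
        ("pocket extracted", lambda p: p.has_pocket),
        ("sequence features present", lambda p: p.has_sequence_features),
    ]
    counts = []
    for name, keep in steps:
        pairs = [p for p in pairs if keep(p)]
        counts.append((name, len(pairs)))
    return pairs, counts
```

keeping the per-step counts makes it easy to compare a rebuilt dataset against the sizes reported in the text.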
training the deep learning network on raw information is known to result in longer convergence times and lower accuracy. we followed a conventional methodology for feature extraction and used the deep learning framework to learn the interaction between the protein pocket and ligand for their affinity prediction. a comprehensive two-level feature extraction methodology was used, one level at the atomic scale and the other at the level of amino acids, utilizing structural information and protein sequence respectively.
• 9-bit one-hot (or all-null) encoding for atom types: b, c, n, o, p, s, se, halogen and metal.
• 1 integer for hybridization
• 1 integer representing the number of bonds with heavy atoms
• 1 integer representing the number of bonds with hetero atoms
• 5 bits (1 if present) encoding properties defined with smarts patterns: hydrophobic, aromatic, acceptor, donor and ring
• 1 float for partial charges
• 1 integer to distinguish between ligand as -1 and protein as 1
we utilized the sequence information of the protein to get more features about the protein pocket-ligand interaction.
• position-specific scoring matrix (pssm): pssm is a matrix that represents the probability of mutation at each point of the sequence. it gives a 20-bit probability vector (one value per amino acid) at each location. pssm profiles were obtained using psi-blast (18) with swissprot as the subject database and an e-value threshold of 0.001. chains with fewer than 50 amino acids were removed from the input dataset.
• relative solvent accessibility (rsa): it is encoded by 1 bit of information for each amino acid that indicates whether it is buried or exposed to the solvent. we set a threshold of 25% in rsa values. rsa was obtained using naccess (19).
• secondary structure: it is encoded by 1 bit of information about the structure as coil, helix or plate and was predicted using dssp (20, 21).
• dihedral angles: it is encoded by 2 bits of information, the phi and psi angles of each amino acid, obtained using dssp (20, 21).
standard ligand features were calculated for ligands in our dataset using padel (22), comprising 1d and 2d descriptors and chemical fingerprints, which include hybridisation, atom-pair interactions and counts of various functional groups. we also used qikprop (34) and canvas (23) to derive admet (absorption, distribution, metabolism, excretion, and toxicity) properties, which include physical properties, solubility and partition coefficients. the exhaustive list of every property calculated is given in the appendix. this results in a 1d array of 14,716 dimensions containing the various properties of a given ligand. it is used as the feature vector for the ligand, which is represented in mol2 format.
the three-dimensional co-ordinates of atoms were converted into a 3d grid of resolution 10 å with 1 å spacing along each axis, centered on the centroid of the ligand. atoms outside each such grid were discarded. the coordinates of atoms lying inside the grid were rounded to the nearest grid point, and the features of atoms that fell on the same grid point were added up. this resulted in projecting ligand-interacting residues into a three-dimensional cube with features representing the atomic as well as protein-based properties of each atom of the protein pocket.
features were calculated at the atomic level (section 4.1.1) corresponding to each atom of an amino acid and ligand. a 19-bit vector was calculated that uniquely identified each of the atoms in the 3d co-ordinates of a given protein-pocket and ligand complex. a 4d tensor of size m x m x m x 19, i.e. the 3 coordinates (x, y, z) and the features, where m represents the number of atoms present in a complex, was constructed as the feature vector representing the given protein pocket-ligand pair.
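the atomic-level encoding and the grid featurization described above can be sketched as follows. this is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the "10 å" grid is read here as a box extending 10 å on either side of the ligand centroid (pass `half_width=5.0` if it is meant as the full edge), and the grid is stored sparsely as a dictionary rather than as a dense tensor.

```python
import math

ATOM_TYPES = ["B", "C", "N", "O", "P", "S", "Se", "halogen", "metal"]
SMARTS_PROPS = ["hydrophobic", "aromatic", "acceptor", "donor", "ring"]


def atom_features(atom_type, hybridization, heavy_bonds, hetero_bonds,
                  props, partial_charge, is_ligand):
    """Build the 19-value per-atom vector: 9 + 1 + 1 + 1 + 5 + 1 + 1."""
    # One-hot over the nine atom classes; stays all-zero for any other type.
    one_hot = [1.0 if atom_type == t else 0.0 for t in ATOM_TYPES]
    flags = [1.0 if p in props else 0.0 for p in SMARTS_PROPS]
    return (one_hot
            + [float(hybridization), float(heavy_bonds), float(hetero_bonds)]
            + flags
            + [float(partial_charge), -1.0 if is_ligand else 1.0])


def voxelize(coords, features, center, half_width=10.0, spacing=1.0):
    """Sum per-atom feature vectors at the nearest point of a cubic grid.

    Returns (points_per_axis, {(i, j, k): summed_feature_vector}).
    half_width is an assumption about how the grid extent is defined.
    """
    n = int(round(2 * half_width / spacing)) + 1
    grid = {}
    for xyz, feat in zip(coords, features):
        # Shift so the box starts at 0, then round to the nearest grid point.
        idx = tuple(math.floor((c - c0 + half_width) / spacing + 0.5)
                    for c, c0 in zip(xyz, center))
        if any(i < 0 or i >= n for i in idx):
            continue  # atom lies outside the box and is discarded
        cell = grid.setdefault(idx, [0.0] * len(feat))
        for d, value in enumerate(feat):
            cell[d] += value  # atoms sharing a grid point are added up
    return n, grid
```

a dense tensor of shape (n, n, n, 19) can be filled from the returned dictionary when the network input is assembled.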
the 4d vector contains the protein-pocket features and was converted to a 3d grid using grid featurization (section 4.3). the 3d-featurized grid is essentially a 4d tensor, where the coordinates are approximated to the points on the grid. the dataset is converted to vectors and is divided into training:validation:test sets in the ratio 80:10:10. convolutional neural networks (24) have been used to capture spatial features in an image. we use cnns to capture the interaction between ligand and protein atoms in three-dimensional space. a network was constructed (figure 2) from 3d cnn layers with channel sizes of [64, 128, 256] and the non-linear activation relu after each layer; each 3d cnn layer used a 5 å cubic filter to perform convolution operations. a maxpool (25) layer that acts in three dimensions, with a pool size of a 2 å cube, lowers the dimension, and a batch normalization (26) layer is added after each cnn layer; this in turn decreases the training time and helps in faster convergence. the latent features learnt from the above cnn layers were then flattened and used for calculating the binding affinity of the protein pocket-ligand pair. the cnn derives the relation among the 3d coordinates and their features, which would correspond well to the binding affinities of complexes. the flattened features from the last cnn layer are passed through a fully connected neural network with [1000, 500, 250] neurons and relu as the non-linearity after each layer. dropout (25), with a dropout probability of 0.50, is added after each layer to prevent overfitting by randomly setting neurons to zero, which forces the network to learn alternative pathways. the dense network outputs a single regression value for binding affinity, corresponding to a single output neuron. the training framework is shown in figure 2 and a detailed layer network is shown in figure 4 (a).
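the layer sizes above can be checked with some simple shape bookkeeping. the sketch below traces the spatial size through three convolution-plus-pooling stages and counts the parameters of the dense head. several values are assumptions that the text does not state: a grid of 21 points per axis, 19 input channels (the atomic model), "same" padding of 2 for the 5-point filter, and stride 1. the function names are hypothetical.

```python
def conv_out(n, kernel, padding=0, stride=1):
    """Output length of a convolution along one axis."""
    return (n + 2 * padding - kernel) // stride + 1


def pool_out(n, pool):
    """Output length of non-overlapping max pooling along one axis."""
    return n // pool


def trace_shapes(grid_points, in_channels, channels=(64, 128, 256),
                 kernel=5, padding=2, pool=2):
    """Return [(channels, points_per_axis), ...] per stage and the flattened size."""
    shapes = [(in_channels, grid_points)]
    n = grid_points
    for c in channels:
        n = conv_out(n, kernel, padding)   # 3d convolution (relu, batch norm keep shape)
        n = pool_out(n, pool)              # 3d max pooling
        shapes.append((c, n))
    flat = shapes[-1][0] * n ** 3
    return shapes, flat


def dense_params(n_in, hidden=(1000, 500, 250), n_out=1):
    """Weights plus biases of the fully connected head ending in one output neuron."""
    sizes = [n_in, *hidden, n_out]
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))


shapes, flat = trace_shapes(grid_points=21, in_channels=19)
print(shapes, flat, dense_params(flat))
```

without padding, a 5-point filter would exhaust a grid of this size before the third stage, which is why some padding is assumed here.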
the featurized protein-pocket grid was rotated to all 24 possible combinations, such that the network is able to learn in an orientation-invariant form. the network was trained by taking the mean square error between the predicted and actual values as the loss function. the network was optimized using adam (27) as the optimizer with a learning rate of 1e-5 and weight decay of 0.001 for 20 epochs. the network was trained on an nvidia pascal gpu using pytorch (28) as the framework. features were calculated at the amino acid level (section 4.1.2) and were concatenated alongside the atomic level features (section 4.1.1) for each atom of an amino acid. this results in a 44-bit vector uniquely identifying each of the atoms in the 3d co-ordinates of a given protein. a 4d tensor of size m x m x m x 44, i.e. the 3 coordinates (x, y, z) and the features, where m represents the number of atoms present in a complex, is constructed as the feature vector of the protein pocket. the 4d vector contains the protein-pocket features; it was converted to a 3d grid using grid featurization (section 4.3). the 3d featurized grid is essentially a 4d tensor, where the coordinates are approximated to the points on the grid. the ligands were separately featurized by calculating the ligand properties (section 4.2), which results in a 1d tensor. the dataset is converted to vectors and is divided into training:validation:test sets in the ratio 80:10:10. a multi-input network was constructed (29). the training framework is shown in figure 3 and a detailed layer network is shown in figure 4 (b). the featurized protein-pocket grid was rotated to all 24 possible combinations, such that the network is able to learn in an orientation-invariant form. the featurized protein pocket-ligand pairs of the training set were passed through the corresponding network, which was trained by taking the mean square error between the predicted and actual values as the loss function.
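two ingredients of the training procedure above lend themselves to a short sketch: the 24 axis-aligned orientations of a cubic grid, and the error measures (the mean square error used as the loss, and the closely related mae and rmse used for evaluation). the code below is an illustration with hypothetical function names, not the authors' implementation. the rotations are enumerated as the signed permutation matrices with determinant +1, which are exactly the proper rotations of a cube.

```python
import math
from itertools import permutations, product


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def cube_rotations():
    """Enumerate the 24 proper rotations that map a cubic grid onto itself."""
    rots = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            m = [[signs[r] if perm[r] == c else 0 for c in range(3)]
                 for r in range(3)]
            if det3(m) == 1:          # keep rotations, drop reflections
                rots.append(m)
    return rots


def rotate_index(idx, rot, n):
    """Rotate grid index (i, j, k) about the centre of an n-point cubic grid."""
    c = (n - 1) / 2
    v = [i - c for i in idx]
    out = [sum(rot[r][k] * v[k] for k in range(3)) + c for r in range(3)]
    return tuple(int(round(x)) for x in out)


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    return math.sqrt(mse(y_true, y_pred))
```

applying `rotate_index` to every occupied grid point with each of the 24 matrices yields the augmented copies of one featurized complex.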
the network was optimized using adam (27) as the optimizer with a learning rate of 1e-5 and weight decay of 0.001. the network was trained on an nvidia pascal gpu using pytorch (28) as the framework. the performance of the models was quantified using mean absolute error (mae) and root mean square error (rmse). it was tested on the validation and testing sets, which were initially divided from our dataset as mentioned in the training section. lower error corresponds to better learning capacity of the model. the standard deviation (sd) between the real and predicted values was also calculated. the mae, rmse and sd values are shown in table 1. for the purpose of training and testing the models, one nvidia tesla p100 gpu cluster was used. the computational times taken for featurization of the dataset, training and testing were 52 hours, 22 hours and 8 minutes respectively. two modules were trained. the first module was trained using a small set of features for protein and ligand, which were represented together in a 3d grid space. this approach has also been part of a previous study (29). however, the previous study uses a restricted ligand set that does not involve larger ligands. here we have used a diverse set of ligands as one of our inputs. with the atomic model trained for 35 epochs, an mae of 2.84 was achieved (table 1). we constructed another module that enabled us to improve on the ligand- and protein-based information. for this purpose, we used an increased feature vector size, which amounted to 14716 bits for the ligand and 44 bits for each atom of the protein. with the composite model trained for only 4 epochs, an mae of 2.27 was achieved (table 1). the performance of our model was further evaluated using ligand-bound complexes from the kinase superfamily from the pdb. the composite model outperformed the atomic model significantly and with lower standard deviation. in light of the ongoing coronavirus pandemic, we tested protein-ligand complexes from the coronavirus (cov) family.
the covid-19 main protease is a key enzyme of the novel strain of coronavirus implicated in the pandemic. a recent study tested the in vitro binding efficacy of the covid-19 virus main protease (mpro) with a potent reversible synthetic inhibitor, n3 (31). however, the highly potent inhibition by n3 made experimental determination of the binding affinity unachievable. using the structure of mpro at high resolution (7bqy: 1.7 angstrom), we have been able to predict the binding affinity of n3 to be 3.1e+4 nanomolar (table 3). this value agrees with the high affinity observed in the course of recent experiments (31). another study has deposited the complex of the covid-19 main protease with a broad-spectrum inhibitor x77 (n-(4-tert-butylphenyl)-n-[(1r)-2-(cyclohexylamino)-2-oxo-1-(pyridin-3-yl)ethyl]-1h-imidazole-4-carboxamide) (2020; unpublished). we used these complexes to predict their respective binding affinities, as these values have not been made available. based on our model-based predictions, the broad-spectrum inhibitor x77 scores the highest affinity, followed by ligands z45617795, n3, z31792168, z1220452176 and z219104216 in order of decreasing binding affinity.
we propose a deep learning-based approach to predict ligand (e.g., drug)-target binding affinity using only the structures of the target protein (pdb format) and ligand (sdf format) as inputs. convolutional neural networks (cnn) were used to learn representations from the features extracted from these inputs and hidden layers in the affinity prediction task. we used two approaches to feature extraction, atomic level as well as composite level, and compared their performance using the same network. deep learning-based approaches have been implemented for prediction of binding affinity.
one of the studies used atomic level features of the complex in a cnn-based framework for binding affinity prediction (32), while another study used protein sequence level features in a cnn-based framework for prediction (33). another approach has been to use feature learning along with gradient boosting algorithms to predict binding affinity (34). here, we provide a composite model that incorporates tripartite structural, sequence and atomic level features of the protein together with the atomic and other chemical features of the ligand to predict the binding affinity of a putative complex. we have trained two models to predict the binding affinity between protein and ligand in a given complex. this would help existing databases such as rcsb pdb and pdbbind in filling in missing binding affinity data for complexes. we have constructed a novel dataset that represents a diverse set of ligands, and using a novel deep learning-based approach we have achieved significant improvement in the prediction of binding affinity of protein-ligand complexes. interestingly, our approach performed better without ligand coordinates as input. owing to filtering for noise reduction, our dataset is smaller than pdbbind (35), but we have overcome the constraints on ligand selection that were part of a previous study (29). although our dataset contains 5464 complexes compared to the 16,151 complexes found in pdbbind, the ligands used as part of our training include 452 unique ligands absent from pdbbind. this helps in achieving ligand diversity while training the cnn model. the similarity matrix constructed from the binary fingerprints of the ligands used in the dataset supports our claim of improved ligand diversity in our dataset (supplementary file s1). we have also eliminated the need to provide ligands in a complex form with the protein. thus a given protein pocket may be tested for the degree of binding for any given ligand.
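a similarity matrix over binary fingerprints, as mentioned above for the ligand diversity check, can be sketched as follows. the text does not say which similarity measure was used; the tanimoto (jaccard) coefficient is assumed here because it is a common choice for binary fingerprints, and the function names are hypothetical. fingerprints are represented as sets of on-bit indices.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    # Two empty fingerprints are treated as identical by convention.
    return len(a & b) / union if union else 1.0


def similarity_matrix(fingerprints):
    """Full pairwise similarity matrix as a list of rows."""
    return [[tanimoto(a, b) for b in fingerprints] for a in fingerprints]


def mean_pairwise_similarity(fingerprints):
    """Average similarity over distinct pairs; lower values mean a more diverse set."""
    n = len(fingerprints)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 1.0
    total = sum(tanimoto(fingerprints[i], fingerprints[j]) for i, j in pairs)
    return total / len(pairs)
```

a low mean off-diagonal similarity is one simple summary of the diversity that such a matrix is meant to show.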
the approach can be extended to predicting potential binding partners for proteins in other superfamilies as well. it is also important to consider that docking score and pose do not correlate reliably with mm/gbsa poses (36). deelig can be used for a member of any protein superfamily and a non-peptide ligand, the docking pose of which may or may not be known. the code repository for the project is publicly available at: https://github.com/asadahmedtech/deelig
binding affinity predictions through deelig can be extended to protein-ligand complexes of protein superfamilies where the affinity is quantitatively unknown due to experimental limitations or where the potential for binding is yet to be explored in vitro. a webserver to implement deelig for easy online access would be useful for the general scientific community and is in the pipeline. a later version of deelig, trained on a peptide ligand dataset, will also be worked on.
the following properties of the ligand were calculated using padel (22):
• ip (ev) (ionization potential)
• ea (ev) (electron affinity)
• #metab (likely metabolic reactions)
• psa (van der waals sa of polar n and o atoms)
• #nando
• #ringatoms (number of atoms in rings)
• #in34 (number of atoms in 3 or 4 membered rings)
• #in56 (number of atoms in 5 or 6 membered rings)
• #noncon (ring atoms cannot form conjugated aromatic bonds)
• #nonhatm (heavy atoms-nonhydrogen atoms)
• ruleofthree
• ruleoffive (lipinski violations)
• qplogkhsa (binding to human serum albumin)
• percenthumanoralabsorption
recent improvements to binding moad: a resource for protein ligand binding affinities and structures
gapped blast and psi-blast: a new generation of protein database search programs
analysis and comparison of 2d fingerprints: insights into database screening performance using eight fingerprint methods
the mm/pbsa and mm/gbsa methods to estimate ligand-binding affinities. expert opinion on drug discovery
simboost: a read-across approach for predicting drug-target binding affinities using gradient boosting machines
receptive fields, binocular interaction and functional architecture in the cat's visual cortex
batch normalization: accelerating deep network training by reducing internal covariate shift
structure of m pro from sars-cov-2 and discovery of its inhibitors
a series of pdb related databases for everyday needs
sc-pdb: a 3d-database of ligandable binding sites-10 years on
dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features
on the binding affinity of macromolecular interactions: daring to ask why proteins interact
detection of multiscale pockets on protein surfaces using mathematical morphology
adam: a method for stochastic optimization
imagenet classification with deep convolutional neural networks
binding moad (mother of all databases)
gradient-based learning applied to document recognition
deepatom: a framework for protein-ligand binding affinity prediction
deep learning in bioinformatics
bindingmoad, a high-quality protein-ligand database
deepdta: deep drug-target binding affinity prediction
pytorch: tensors and dynamic neural networks in python with strong gpu acceleration
ucsf chimera-a visualization system for exploratory research and analysis
the rcsb protein data bank: redesigned website and web services
very deep convolutional networks for large-scale image recognition
development and evaluation of a deep learning model for protein-ligand binding affinity prediction
the pdbbind database: collection of binding affinities for protein-ligand complexes with known three-dimensional structures
the pdbbind database: methodologies and updates
padel-descriptor: an open source software to calculate molecular descriptors and fingerprints
pdb-wide collection of binding data: current status of the pdbbind database
binding of nicotinoids and the related compounds to the insect nicotinic acetylcholine receptor
schrödinger release 2020-3: qikprop, schrödinger, llc
supplementary files a. similarity matrix of ligands used in dataset (supplementary file s1) b. dataset details (supplementary file s2) c. dataset distribution
aa acknowledges funding awarded by the indian academy of sciences, bangalore (2019). bm would like to acknowledge the tata trusts-tdu fellowship for phd awarded to her from 2017 to 2019. all authors acknowledge ncbs for infrastructural support. the authors declare no conflict of interest.
key: cord-022200-hqc8r31t authors: hyatt, alex d. title: protein a–gold: nonspecific binding and cross-contamination date: 2012-12-02 journal: colloidal gold doi: 10.1016/b978-0-12-333928-7.50007-1 sha: doc_id: 22200 cord_uid: hqc8r31t nan
colloidal gold has been a popular electron-dense marker in immunoelectron microscopy for the past decade. over this period protein a-gold has been used extensively as a reagent for binding immunoglobulins from many mammalian species. due to the broad reactivity of the probe and the ease of preparation, protein a-gold remains a popular choice as an immunolabel.
the aim of this chapter is to discuss the potential problems associated with protein a-gold as stated in the literature and as recognized in this laboratory. to understand the potential difficulties, it is necessary first to discuss the techniques involved with the preparation of colloidal gold and protein a-colloidal gold conjugates. colloidal gold is generally prepared by the reduction of tetrachloroauric acid, h(aucl4)·4h2o, with a range of chemicals including phosphorus (zsigmondy, 1905; zsigmondy and thiessen, 1925), trisodium citrate (frens, 1973), ascorbic acid (stathis and fabrikanos, 1958), sodium borohydride (tschopp et al., 1982), and trisodium citrate and tannic acid (slot and geuze, 1985). the gold colloid is formed from micromolecular units (nuclei) and the increase in particle size is dependent on the rate of formation of nuclei and the rate of crystal growth. thus, by controlling the reduction of tetrachloroauric acid, gold probes of varying diameters can be produced. reduction with white phosphorus, ascorbic acid, or trisodium citrate, for example, produces gold particles with average diameters of 5, 12, and 16 nm, respectively (slot and geuze, 1981). for further details, see handley (1989). an important surface property of colloidal gold is its negative charge. in an aqueous phase, colloidal gold will stay in suspension by electrostatic repulsion. if electrolytes are added, the ion layers of the particles are compressed and so the critical "cohesion" distance is reduced, thereby facilitating flocculation. it has been recognized since 1901 (zsigmondy, 1901) that gold particles can be stabilized in solution by the addition of proteins. one of the more successful of such proteins is staphylococcus aureus coat protein a. the complexing of protein a (as with other proteins) to colloidal gold involves electrostatic interaction between the negatively charged surface of gold particles and positively charged groups of the proteins.
it is generally accepted that the protein is attracted into the action radius of van der waals-london attractive forces and there firmly binds to the surface of the gold particles. the important parameters for successful preparation of stable and active protein a-gold complexes are the ph of the colloidal gold suspension (roth, 1983) and the determination of the minimum amount of protein required for stabilization. although the isoelectric point for protein a is 5.1, complexes are generally produced at a ph range of 5.9 to 6.2. however, adequate protein binding can be obtained at a ph between 5 and 6 (slot and geuze, 1981); the overall bioactivity and stabilization do not appear to be compromised in such complexes. the amount of protein a required for stabilization of colloidal gold is generally determined by the technique described by zsigmondy and thiessen (1925). the test results in a color change of the colloidal gold solution from orange-red to blue when there is incomplete stabilization. the minimum amount of protein a required to prevent this effect is taken as an estimation for the stabilization of the gold solution. it is common practice to add 10 to 100% excess protein a to the solution to ensure stabilization (slot and geuze, 1981, 1985). the protein a complex is then stored in a buffer (e.g., phosphate-buffered saline, pbs) containing an excess of unrelated protein such as bovine serum albumin (bsa); this further stabilizes the gold complex. the protein a-gold method is a two-stage immunolabeling procedure. the antigenic site within or on the specimen is revealed (indirectly) by the addition of a primary antibody. the excess antibody is removed and protein a-gold added. protein a has two functional fc binding regions (langone, 1982), which interact predominantly at the ch2 and ch3 domains of the fc region of the antibody (forsgren and sjöquist, 1966).
the binding is strong in most igg classes of several mammalian species such as human, rabbit, guinea pig, and dog; weaker with igg classes from horse, cow, and mouse; and weakest in igg from goat, sheep, rat, and chicken (forsgren and sjöquist, 1966; kronvall et al., 1970, 1974; biberfeld et al., 1975; lindmark et al., 1983). binding to different classes of igg, human iga, and ige and igm from various species has also been shown (goudswaard et al., 1978; langone, 1982; lindmark et al., 1983). it is obvious from the above that antibodies used in protein a-gold procedures should be generated from species which produce high-affinity antibody-protein a complexes. furthermore, the antibodies should be of the one class, that is, they should be affinity-purified or monoclonal antibodies. labeling of antigens can occur prior to embedding (preembedding immunocytochemistry) or postembedding of whole or sectioned specimens. labeling can therefore occur either on or within tissue sections or whole mounts (for example, whole cells and complete cytoskeletal matrices). irrespective of the mode of antigenic presentation, the labeling protocol generally follows a set format such as that outlined below. labeling of structures in fig. 2.1a,b was produced by the same protocol, the details of which are shown in parentheses. the incubations were performed in plastic petri dishes and solutions changed with micropipettes (fig. 2.2).
1. wash (pbs, 3 min).
2. fixation (0.1% glutaraldehyde in pbs, 2 min; when cytoskeletons are prepared the fixative includes 1% of the non-ionic detergent np40).
3. wash (pbs, 3 x 3 min).
4. blocking step [pbs containing 1% bsa (pbs-bsa), 10 min].
5. primary antibody (1:10 dilution in pbs-bsa, 37°c (310 k), 1 hr).
6. wash (pbs-bsa, 6 x 3 min).
7. protein a-gold (1:20 in pbs-bsa, 37°c (310 k), 1 hr).
8. wash (pbs, 6 x 3 min).
9. stain (2% phosphotungstic acid adjusted to ph 6.8 with 1 m koh).
provided the antibodies are affinity purified or are monoclonal, the antigen-antibody interactions will be specific and indicative of the presence and localization of the antigen. this assumes that the antigen has not been leached or translocated from or within its biological matrix due to inappropriate fixation. the possibility of "unwanted" antibodies, from whole sera, reacting with cell constituents is discussed in the following section. gold particles of varying sizes can be produced by different methodologies (handley, 1989). when the gold particles vary in size, i.e., their coefficient of variation (cv) is greater than 15% (slot and geuze, 1985), the solutions are considered polydispersive in size; when the cv is less than 15% they are referred to as monodispersive. the production of monodispersive colloidal gold is dependent on the method for synthesis (handley, 1989) and subsequent treatment (bendayan, 1984). the ability to produce solutions of monodispersive gold particles is of paramount importance in multiple-labeling immunoelectron microscopy. the advantage of protein a is that it can be readily complexed to a gold probe of any size and possesses an affinity for most igg of most mammals. protein a-gold probes therefore provide a potential means whereby colocalization of antigens within a single sample can be achieved. in the literature there are numerous reports which claim success in double labeling of antigens within/on the one biological section. the techniques include colocalization of antigens on one face of a section with probes of different sizes (geuze et al., 1981; roth, 1982; hisano et al., 1984) or the labeling of both faces of a section (bendayan, 1982). in the first technique addition of each antibody is followed by protein a-gold probes of specific sizes in a defined sequence. the first protein a-gold probe to be applied is the smaller of the two (3-5 nm), followed by the larger probe (12-15 nm).
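as a brief aside on the size criterion given above, the 15% coefficient-of-variation threshold can be checked on a set of measured particle diameters as in the following sketch. the function names are hypothetical, the sample standard deviation is assumed (the text does not say which estimator is meant), and a cv of exactly 15%, which the text leaves undefined, is treated here as polydispersive.

```python
import statistics


def coefficient_of_variation(diameters_nm):
    """Coefficient of variation in percent: 100 * sample standard deviation / mean."""
    mean = statistics.mean(diameters_nm)
    return 100.0 * statistics.stdev(diameters_nm) / mean


def classify(diameters_nm, threshold=15.0):
    """Label a gold preparation by the 15% cv criterion."""
    cv = coefficient_of_variation(diameters_nm)
    return "monodispersive" if cv < threshold else "polydispersive"
```

for double labeling, each probe population would be checked separately, since overlapping size distributions make the two probes hard to tell apart.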
when this sequence is followed, little cross-contamination occurs (roth, 1982). if free protein a is added during the last minutes of step 7 (see above), little cross-contamination is reported to occur irrespective of which probe is applied first (slot and geuze, 1984). if, however, the order of probes is reversed without the addition of free protein a, cross-contamination can occur (roth, 1982). the protocol for this form of double labeling is a simple extension of that outlined for single labeling.

1-7. same as steps 1-7 of the single-labeling protocol.
8. free protein a (approximately 0.05 mg/ml) in the last few minutes of step 7.
9. wash (pbs, 6 × 3 min).
10. repeat steps 5 to 7.

while this procedure has enjoyed some success, there have been numerous reports of problems associated with it (bendayan, 1982; bendayan and stephens, 1984; hyatt et al., 1988). an alternative procedure is described by bendayan (1982). in this procedure both sides of the section are used; that is, there are two independent labeling steps. this second technique avoids any cross-contamination between the different immunolabeling steps.

not all double labeling is performed on biological sections. when only one aspect of a sample is available for labeling (for example, viruses adsorbed to a grid substrate), then successful double labeling, as illustrated in fig. 2.3, may not be possible with either of the above methods. hyatt et al. (1988), for example, found that protein a-gold could not be used for the multiple epitope mapping of akabane virus as it resulted in colabeling of the primary antibody. successful double labeling could be achieved only with specific igg-gold probes (fig. 2.3). for additional information on multiple labeling, the reader is referred to doerr-schott (1989).
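two size-related rules from the double-labeling discussion above lend themselves to a short worked example: a gold sol is treated as monodispersive when its coefficient of variation (cv) is below 15%, and in the one-face method the smaller probe is applied first. the sketch below is illustrative only. the function names are hypothetical, the diameters are made-up example measurements rather than data from the text, and the use of the sample standard deviation is an assumption, since the text does not say which estimator is meant.

```python
from statistics import mean, stdev
from typing import List, Sequence

CV_THRESHOLD_PERCENT = 15.0  # cutoff quoted in the text (slot and geuze, 1985)


def coefficient_of_variation(diameters_nm: Sequence[float]) -> float:
    """cv in percent: standard deviation divided by mean, times 100.

    assumption: sample standard deviation (n - 1 in the denominator).
    """
    return 100.0 * stdev(diameters_nm) / mean(diameters_nm)


def is_monodispersive(diameters_nm: Sequence[float]) -> bool:
    """True when the cv is below the 15% cutoff.

    the text does not define the case of exactly 15%; it is treated
    here as not monodispersive.
    """
    return coefficient_of_variation(diameters_nm) < CV_THRESHOLD_PERCENT


def application_order(probe_sizes_nm: Sequence[float]) -> List[float]:
    """Order probes smallest first, as in the defined-sequence method."""
    return sorted(probe_sizes_nm)


if __name__ == "__main__":
    narrow = [5.0, 5.2, 4.8, 5.1, 4.9]    # hypothetical measurements, nm
    broad = [3.0, 5.0, 8.0, 12.0, 15.0]   # hypothetical measurements, nm
    print(round(coefficient_of_variation(narrow), 1), is_monodispersive(narrow))
    print(round(coefficient_of_variation(broad), 1), is_monodispersive(broad))
    print(application_order([15.0, 5.0]))
```

a check of this kind only describes the size distribution of the sol; it says nothing about cross-contamination, which depends on the binding behavior discussed in the text.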
some potential problems associated with protein a-gold labeling are also associated with other immunocytochemical techniques, namely nonspecific staining (for example, nonspecific antibodies and electrostatic attachment; taylor, 1978; behnke et al., 1986; gosselin et al., 1986; birrell et al., 1987). double immunolabeling with the protein a method can induce further sources of error, namely cross-contamination of antibodies with protein a-gold. the problems of cross-contamination and nonspecific labeling are discussed below (also see birrell and griffith, 1989; park et al., 1989).

as stated previously, it is advisable to use monospecific antibodies raised in the appropriate mammalian species for the primary localization of antigens. from a practical viewpoint, affinity-purified and monoclonal antibodies generally yield specific labeling with a clean background (fig. 2.1a). the labeling in fig. 2.1b, on the other hand, was produced with a whole porcine serum as the primary antibody. although 1% bsa was used to minimize nonspecific labeling (as per fig. 2.1a), the background in fig. 2.1b was considerably greater. an explanation for this is that the use of whole serum (containing numerous natural antibodies, targeted antibodies, and antibodies to any carrier molecules or extraneous material carried over or used in the immunization) with protein a-gold produces many immunocomplexes. the overall effect can result in the adsorption of unwanted antibodies and subsequent labeling of various cellular constituents in addition to the targeted antigen. if normal serum is used to block nonspecific binding of primary antibodies to the reactive sites of tissues, an effect similar to that described above may occur. in many instances the problem of such nonspecific binding may be reduced by decreasing the effective antibody concentration.
if the studies involve infectious agents, hormones, or the like (i.e., cause-and-effect experiments), then nonspecific staining can be further reduced by preadsorption of the serum with the normal (nonexperimental) biological tissue. biological samples bathed in serum may also provide another source of error (for example, tissue culture cells and cells of the blood). it is not unreasonable to expect that during specimen preparation some antibodies may fortuitously bind to the sample surface. such antibodies may provide additional binding sites for protein a. these samples should be thoroughly washed (e.g., with pbs) before labeling is attempted.

in double-labeling experiments cross-contamination is now recognized as a problem which arises from the affinity which protein a possesses for different regions of igg antibodies, namely the fc and fab regions (roth, 1982; endresen, 1979; zikan, 1980). protein a can bind two igg molecules; thus, after the primary antibody-protein a complex has been formed, free igg binding sites can still be available on the bound protein a. addition of the second antibody can result in its binding to the free sites on protein a molecules and target specified antigens. a second protein a-gold probe can potentially bind not only to any protein a-gold free igg antibodies from the primary incubation (the fc region of the secondary igg complexes) but also to the fab regions of igg antibodies. reports which claim success with the "one-face" protein a-gold method use either a defined sequence of protein a-gold probes or free protein a (described above). if free protein a is not incorporated, then the success of the method depends on the smaller of the protein a-gold probes being used first. the small probes (approximately 3 nm) have approximately one associated protein a molecule, whereas larger probes (approximately 15 nm) can adsorb approximately 60 such molecules (roth, 1982).
if the larger of the gold probes is used for the visualization of the first antigen, many potential fc binding sites (unoccupied protein a) are available for the second immunoglobulin and significant cross-contamination may therefore result. if excess free protein a is added, the primary antibody-protein a interactions may reach saturation and so minimize interactions with secondary protein a-gold probes. the binding of secondary antibodies to the existing protein a fc binding sites should not be recognized by the secondary protein a-gold probe as it is assumed that igg can bind only one protein a molecule. cross-contamination can still occur, however, via interactions with the fab regions of igg. if cross-contamination is a continued problem, alternative techniques to the protein a method should be investigated. such techniques generally involve direct labeling and species-specific antibody-gold indirect labeling techniques.

the use of small (3-6-nm) gold probes has the advantage of improving immunocytochemical resolution. anyone who has worked with small gold probes has observed them (in some specimens) to be extremely "sticky" (fig. 2.4a,b). behnke et al. (1986), birrell et al. (1987), and birrell and griffith (1989) have reported the problem to be largely associated with exposed areas (areas not covered with protein) on the gold particles themselves and secondarily with the method by which the colloidal gold is produced. since small protein a-gold complexes are used widely, the problems associated with small probes merit discussion.

flocculation of colloidal gold can be prevented by partial saturation of the particles (goodman et al., 1980; horisberger and vauthey, 1984). the stabilization of colloidal gold therefore does not ensure its saturation. it is because of this that excess unrelated protein (for example, bsa) is added to these protein-gold complexes; the subsequent complex is then assumed to be saturated. however, it would appear that some immunogold labeling patterns are independent of protein-protein interactions and result from interactions between specimen constituents and the gold particles (behnke et al., 1986). the occurrence of such nonspecific binding is obvious in whole cytoskeletons (fig. 2.4a) and thick polyethylene glycol-extracted sections (a. d. hyatt, unpublished observations). strategies in colloidal gold labeling of cytoskeletal elements are discussed at length by birrell and griffith (1989). nonspecific binding is attributed to electrostatic interactions between exposed areas on the gold particles (areas of negative charge) and cationic structures within the specimen (behnke et al., 1986).

the method used to prepare colloidal gold has also been considered a source of nonspecific labeling (birrell et al., 1987). birrell et al. found that 5-nm gold particles produced by the trisodium citrate-tannic acid method exhibited a higher degree of nonspecificity than particles of similar size produced by other methods. this nonspecificity was attributed to the chemical agents used in colloidal gold production (birrell et al., 1987); for example, colloidal gold produced by the white phosphorus method was found to contain phosphate groups. tannic acid may therefore also be a source of contamination for colloidal gold produced by the trisodium citrate-tannic acid procedure (birrell et al., 1987; birrell and griffith, 1989). such contaminants on gold particles may, as with tannic acid, form a series of complex interactions with biological specimens, particularly those fixed with an aldehyde and/or oso4 (simionescu and simionescu, 1976). observations in this laboratory and by birrell et al.
(1987) and birrell and griffith (1989) that nonspecific labeling is more pronounced with smaller probes may be explained by the greater accessibility of these probes to tissue constituents within open and/or permeabilized specimens.

nonspecific binding due to protein-protein and electrostatic interactions can be minimized by including in the washing and incubation steps a noncompeting protein which has a high affinity for gold particles. these proteins (e.g., bovine skin gelatin, fish gelatin, and bsa) will bind to protein-reactive constituents within the specimen and will minimize electrostatic interactions by covering the exposed areas on the gold particles. the choice of an appropriate stabilizer will significantly reduce the level of nonspecific labeling, as is illustrated in fig. 2.4c (compare with fig. 2.4b). an alternative or concomitant procedure would involve the incubation of protein a-gold with the biological sample.

immunolabeling with protein a-gold has several distinct advantages: (1) it is easy and quick to prepare complexes of various sizes; (2) the complexes interact with antibodies from many mammalian species; and (3) the method is sensitive, clean, and can yield valuable information provided the parameters of immunolabeling are understood. in other words, the investigator must be familiar with the characteristics of the primary antibody, protein a-gold probes, and the specimen itself. generally, the use of the protein a-gold method should involve (1) the use of highly specific antibodies; (2) the use of such antibodies at their lowest effective concentration; (3) the absence of normal serum as a blocker; and (4), if using open matrix specimens, inclusion of an effective stabilizer in the incubation steps. these observations, if followed, will reduce the level of nonspecific labeling and cross-contamination. it should also be noted that at no stage during the labeling protocol should the sample be allowed to dry.
in addition, the sample should be thoroughly rinsed between antibody and protein a-gold steps; this will remove unbound antibodies and protein a. failure to comply with the above precautions will result in high backgrounds.

there are no detailed protocols for protein a-gold labeling which have universal applicability. for each different experiment, detailed protocols should be developed to produce optimum labeling characteristics. these protocols must be based on results of control immunocytochemical experiments. if protein a-gold is the label, and nonspecific binding and cross-contamination are persistent problems in spite of the addition of excess free protein a, high-affinity stabilizers, adsorption of contaminating antibodies, and colloidal gold, then alternative labeling techniques should be pursued and evaluated.

the author gratefully acknowledges drs. b. t. eaton and a. r. gould for reading the manuscript and mr. t. wise for skillful technical assistance.

non-specific binding of protein-stabilized gold sols as a source of error in immunocytochemistry
double immunocytochemical labeling applying the protein a-gold technique
protein a-gold electron microscopic immunocytochemistry: methods, applications, and limitations
double labelling cytochemistry applying the protein a-gold technique. immunolabelling for electron microscopy
demonstration and assaying of igg antibodies in tissues and on cells by labeled staphylococcal protein a
the grid-cell-culture technique: the direct examination of virus-infected cells and progeny viruses
antibody competition studies with gold-labelling immunoelectron microscopy
phylogenetic insight into evolution of mammalian fc fragment of γg globulin using staphylococcal protein a
protein a of staphylococcus aureus and related immunoglobulin receptors produced by streptococci and pneumococci
binding of immunoglobulins to protein a and immunoglobulin levels in mammalian sera
factors affecting the staining with colloidal gold
the preparation of protein a-gold complexes with 3nm and 15nm gold particles and their use in labelling multiple antigens on ultra-thin sections
the colloidal gold marker system for light and electron microscopic cytochemistry
galloylglucoses of low molecular weight as mordant in electron microscopy. i. procedure, and evidence for mordanting effect
sizing of protein a-colloidal gold probes for immunoelectron microscopy
gold markers for single and double immunolabelling of ultrathin cryosections
preparation of colloidal gold
immunoperoxidase techniques: practical and theoretical aspects
ultrastructure of the membrane attack complex of complement: detection of the tetramolecular c9-polymerizing complex c5b-8
interactions of pig fab gamma fragments with protein a from staphylococcus aureus
das kolloidale gold

key: cord-014368-4nasrbs6 authors: nan title: gene chip for viral discovery date: 2003-11-17 journal: plos biol doi: 10.1371/journal.pbio.0000003 sha: doc_id: 14368 cord_uid: 4nasrbs6 nan

comparing the genomes of related organisms, researchers can see what parts of the genomes are conserved (highly conserved genes tend to be important) and then focus on these regions to track down genes and determine how they function. to construct a draft sequence of the c.
briggsae genome, the researchers merged genomic data from three sources-one derived from whole-genome shotgun sequencing, another from physical genome mapping, and the third from regions of a previously "finished'' sequence. for the shotgun sequence, the researchers extracted dna from worms, randomly cut it into short pieces, sequenced them, and then assembled overlapping sequences to create thousands of stretches of contiguous dna sequence. to help fill in the gaps between these "contigs,'' stein and colleagues developed a "fingerprint'' map of the genome as a guide for aligning the shorter fragments. the map also helped them identify inconsistencies and misalignments in the genome assembly. finally, they integrated the previously finished sequence to improve the draft genome sequence. using these massive datasets, the authors produced a high-quality genome sequence; although it does not quite meet the gold standard of a "finished'' sequence, it covers 98% of the genome and has an accuracy of 99.98%. after confirming the accuracy of the draft, the researchers turned to the substance of the genome. examining two species side by side, scientists can quickly spot genes and flag interesting regions for further investigation. analyzing the organization of the two genomes, stein et al. not only found strong evidence for roughly 1,300 new c. elegans genes, but also indications that certain regions could be "footprints of unknown functional elements.'' while both worms have roughly the same number of genes (about 19,000), the c. briggsae genome has more repeated sequences, making its genome slightly larger. because the worms set out on separate evolutionary paths about the same time mice and humans parted ways-about 100 million years ago, compared to 75 million years ago-the authors could compare how the two worm genomes have diverged with the divergence between mice and humans. 
the worms' genomes, it seems, are evolving faster than their mammalian counterparts, based on the change in the size of the protein families (c. elegans has more chemosensory proteins than c. briggsae, for example), the rate of chromosomal rearrangements, and the rate at which silent mutations (dna changes with no functional effect) accumulate in the genome. this would be expected, the researchers point out, because generations per year are a better measure of evolutionary rate than years themselves. (generations in worms are about three days; in mice, about three months.) what is surprising, they say, is that despite these genomic differences, the worms look nearly identical and occupy similar ecological niches; this is obviously not the case with humans and mice, which nevertheless have remarkably similar genomes. both worm pairs-as well as mouse and human-also share similar developmental pathways, suggesting that these pathways may be controlled by a relatively small number of genes and that these genes and pathways have been conserved, not just between the worms, but also between the nematodes and mammals. this question, along with many others, can now be explored by searching the two species' genomes and comparing those elements that have been conserved with those that have changed. with the nearly complete c. briggsae genome in hand, worm biologists have a powerful new research tool. by comparing the genetic makeup of the two species, c. elegans researchers can refine their knowledge of this tiny human stand-in, fill in gaps about gene identity and function, as well as illuminate those functional elements that are harder to find, and study the nature and path of genome evolution. some 200,000 people live with partial or nearly total permanent paralysis in the united states, with spinal cord injuries adding 11,000 new cases each year. 
most research aimed at recovering motor function has focused on repairing damaged nerve fibers, which has succeeded in restoring limited movement in animal experiments. but regenerating nerves and restoring complex motor behavior in humans are far more difficult, prompting researchers to explore alternatives to spinal cord rehabilitation. one promising approach involves circumventing neuronal damage by establishing connections between healthy areas of the brain and virtual devices, called brain-machine interfaces (bmis), programmed to transform neural impulses into signals that can control a robotic device. while experiments have shown that animals using these artificial actuators can learn to adjust their brain activity to move robot arms, many issues remain unresolved, including what type of brain signal would provide the most appropriate inputs to program these machines. as they report in this paper, miguel nicolelis and colleagues have helped clarify some of the fundamental issues surrounding the programming and use of bmis. presenting results from a series of long-term studies in monkeys, they demonstrate that the same set of brain cells can control two distinct movements, the reaching and grasping of a robotic arm. this finding has important practical implications for spinal-cord patients-if different cells can perform the same functions, then surgeons have far more flexibility in how and where they can introduce electrodes or other functional enhancements into the brain. the researchers also show how monkeys learn to manipulate a robotic arm using a bmi. and they suggest how to compensate for delays and other limitations inherent in robotic devices to improve performance. while other studies have focused on discrete areas of the brain-the primary motor cortex in one case and the parietal cortex in another-nicolelis et al. 
targeted multiple areas in both regions to operate robotic devices, based on evidence indicating that neurons involved in motor control are found in many areas of the brain. the researchers gathered data on both brain signals and motor coordinates, such as hand position, velocity, and gripping force, to create multiple models for the bmi. they used

retraining the brain to recover movement

if you have ever spent an evening hoisting brews with your pals at the corner pub, chances are you never stopped to think: gee, how do i lift my glass now that it's only half full? it seems like a simple task (you raise that glass reflexively, whether it is empty or full), yet the neural calculations that determine the force needed to lift your arm smoothly to your lips in each case are anything but simple. the brain, it seems, operates like a computer to process variable cues, such as the weight of a glass and the position of your arm, to generate an appropriate response: lifting the glass. neuroscientists believe the brain builds a kind of internal software program based on past experience to transform such variable cues into motor commands. the brain's software, or internal model, depends on specialized sets of instructions, or "computational elements," in the brain. but exactly how the brain organizes these elements to process sensory variables that affect arm movements is far from clear. eun jung hwang and colleagues predict that these computational elements are based on a multiplicative mechanism, called a gain field, through

time after time in biology, revelations about structure lead to insights about corresponding functional mechanisms. while evolution throws in the occasional spandrel, more often organizational structure serves a practical purpose. so naturally, neuroscientists wonder, does the architectural organization of the motor system reveal an underlying functional organization?
progress on this question has been complicated by the fact that there appears to be no clear correspondence between the development of motor neurons centrally and their target muscles in the periphery. in the visual system, for example, retinal ganglion cells send axons in an ordered manner into the brain, where they form connections with neurons of the primary visual center in the brain responsible for detecting visual targets. the arrangement of these connections mirrors the neighboring relationships of the neurons in the retina, and so the neural map of connections in the brain is an "anatomical correlate'' of the arrangements in the retina. the origin of these anatomical relationships can be traced through the process of development, allowing scientists to link the assembly of this sensory system with the function of the neurons involved. matthias landgraf and colleagues now report that in the fruitfly drosophila the arrangement of motor neurons corresponds to the distribution of their target muscles. thus, anatomical correlates also exist in the motor system, in the form of a "myotopic map,'' where the arrangement of motor neuron dendritic branches in the central nervous system reflects the distribution of their target body wall muscles in the periphery. starting with the larger question of how the neural networks governing locomotion are specified and assembled during development, the researchers decided to see if they could identify an elementary principle of motor system organization. working in drosophila, they examined motor neurons and the body wall muscles they innervate. with an eye toward understanding the mechanisms directing the assembly of the motor system, the researchers concentrated on the early stages of development, when the motor neurons first establish their characteristic dendritic territories. 
they found that the dendrites of motor neurons innervating internal muscles and those innervating external muscles do in fact project into distinct regions, corresponding to the distinct mapping of the muscles themselves. surprisingly, the arrangement of the dendrites in the myotopic map forms independently of the muscles they innervate. it may be, the researchers suggest, that the initial signals charting the location of the dendrites are set very early in development, when the coordinates for other structural elements are established. but that question requires further investigation. the researchers are among the first to reveal such an orderly

underlying principles of motor system organization revealed

which sensory signals to the brain are amplified by signals from the eye, head, or limbs. in this way, the brain can rely on past experience of one kind of sensory cue to predict how to respond to new but similar situations. while previous studies had established that some visual cues are combined through a gain field, this study shows that motor commands may also be processed via gain fields. this finding, the researchers demonstrate, accounts for a range of behaviors. based on previous studies showing that when people reach in various directions in a small space, they can extrapolate what they learn about the forces in one starting position to a significantly different position, it has been proposed that the way the brain computes movement is not terribly sensitive to limb position. citing other research with seemingly contrary conclusions (that the brain can be highly sensitive to limb position in calculating force and movement), hwang et al. set out to investigate whether, and how, the brain creates a template to translate sensory variables (limb position and velocity) into motor commands (force).
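the idea of a gain field as a multiplicative mechanism can be illustrated with a toy one-dimensional element. this is a sketch under stated assumptions, not the model used by hwang et al.: the gaussian velocity tuning, the linear position gain, all parameter values, and the function names are hypothetical choices made only to show what "multiplicative" means and how an adjustable gain changes sensitivity to limb position.

```python
import math


def velocity_tuning(velocity: float, preferred: float, width: float) -> float:
    """Assumed gaussian tuning to limb velocity (peak value 1.0)."""
    return math.exp(-((velocity - preferred) ** 2) / (2.0 * width ** 2))


def position_gain(position: float, slope: float, offset: float) -> float:
    """Assumed linear gain as a function of limb position."""
    return slope * position + offset


def element_output(position: float, velocity: float, *, slope: float = 1.0,
                   offset: float = 1.0, preferred: float = 0.3,
                   width: float = 0.2) -> float:
    """Gain field: the velocity response is multiplied by a position gain."""
    return position_gain(position, slope, offset) * velocity_tuning(
        velocity, preferred, width)


if __name__ == "__main__":
    # zero slope: output does not depend on position (low sensitivity)
    print(element_output(0.0, 0.3, slope=0.0), element_output(0.5, 0.3, slope=0.0))
    # nonzero slope: the same velocity gives a position-scaled output
    print(element_output(0.0, 0.3, slope=1.0), element_output(0.5, 0.3, slope=1.0))
```

in this toy form, position rescales the velocity response without changing its shape, and the slope of the gain sets how strongly the output depends on limb position, which is one way to picture a sensitivity that can be low or high.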
they created a computer model to mimic the reaching behaviors observed by people in their experiments and found that the most accurate model used computational elements that are indeed sensitive to both limb position and velocity. if the brain processes these two independent variables through a gain field, it can use the relationship of the two variables (that is, the strength of the gain field) to adapt information about the force needed to move or lift something in one situation to accomplish a wide range of similar movements. when the researchers compared their model to previously published results, they found their model accounted for seemingly disparate findings. they explain that the brain's sensitivity to limb position can be either low or high after a task has been learned because the gain field itself is adjustable. the authors note that neurophysiological experiments suggest that the motor cortex may be one of the crucial components of the brain's internal models of limb dynamics. the next step will be to track the motor cortex neurons to see whether their activity supports this model. hwang et al. predict they will.

novel "checkpoint" mechanism mediates dna damage responses

connection between patterns of motor neuron dendrites and patterns of muscles. this organization, in the form of the myotopic map, may be mirrored by the patterning of processes of higher-order neurons, which form connections with the motor neuron dendrites themselves. in vertebrates, studies have shown that motor neurons are grouped into "pools" and "columns" that correlate with the muscles they innervate.
but because these pools and columns represent the location of the cell bodies and not the areas of the spinal cord where the neurons receive most of their inputs, that is, their dendritic branches, scientists could not say whether the pools and columns are simply spandrels (an incidental result of the way motor neurons are generated) or mirror a functional organization of the motor system. this novel finding in drosophila will pave the way for future studies on the relationship between anatomy and physiology during development. it will be particularly interesting to discover whether such myotopic arrangements of motor neuron dendrites are unique to insects or whether this organizational principle occurs in other motor systems, including vertebrates.

of all the tasks a cell must accomplish day in and day out, protecting its genome may be the most important. genomes confront all manner of potential assaults, from the strand-splitting action of gamma-radiation to the simple copying mistakes sometimes made when dna replicates before a cell divides. though some mutations are harmless, others can disrupt gene action, leading to cancer and other diseases. to guard against such events, healthy cells maintain quality-control "checkpoints" that sense and respond to dna injuries, as well as to defects in dna replication, and that prevent cell division until the dna can be repaired. if the damage is beyond repair, apoptosis pathways set about the business of destroying the afflicted cell. many of the genes and protein complexes involved in these checkpoint responses have been identified, but the biochemical mechanisms that in some cases trigger cell cycle arrest are not fully understood. experiments by philip hanawalt and his student david pettijohn at stanford university in 1963 suggested that the molecular machinery of dna replication and repair (which they discovered at sites of damage) are quite similar and closely linked.
while many studies have since supported that link, viola ellison and bruce stillman, the director of the cold spring harbor laboratory, have found new evidence that the two processes may indeed coincide by showing that protein complexes regulating a cellular checkpoint in dna repair operate much like similar complexes involved in dna replication. the molecular pathways governing the replication of dna before cell division are well known. as the double-stranded dna molecule unwinds, different protein complexes step in to ensure that each strand is faithfully reproduced. two protein complexes required for this process are replication factor c (rfc) and proliferating cell nuclear antigen (pcna). in the 1980s, stillman's laboratory isolated pcna and rfc and showed that they function together to "load" pcna onto a structure in dna that is created after dna synthesis begins. pcna forms a clamp around the dna strand and regulates the dna polymerases that duplicate the dna double helix. studies in yeast had identified a series of proteins required for the dna synthesis phase of the cell cycle and the dna damage checkpoint pathways; mutations in these proteins' genes make cells very sensitive to radiation (hence the name rad genes). a subset of these proteins, which are conserved in human cells, form two protein complexes, rsr and rhr, that function like rfc and pcna, respectively, with rsr loading the rhr clamp onto dna. ellison and stillman demonstrate that both pairs of "clamp-loading" complexes follow similar biochemical steps, but, significantly, rfc and rsr favor different dna structures for clamp loading. while it was known that the rsr/rhr complexes exist in human cells, it had not been established that the two types of clamps prefer different dna targets. the researchers also show that the rsr/rhr biochemistry depends on rpa, a protein known to be involved in the dna damage-response pathway.
the discovery that rsr loads its rhr clamp onto a different dna structure was unexpected; it suggests not only that the two clamp loaders have distinct replication and repair functions, but also how the checkpoint machinery might work to prevent dna damage from being passed on to future generations. by establishing the chemical requirements of rsr/rhr interactions as well as the preferred dna-binding substrate, the researchers have charted the way for determining the different functions of these cell cycle checkpoint complexes and how the complexes' different subunits affect these functions. the researchers propose that the role of this checkpoint machinery is not as an initial sensor of dna damage, but rather as a facilitator of dna repair, stepping in after preliminary repairs to dna lesions have been made. ellison and stillman's work helps establish a biochemical model for studying how both of these checkpoint complexes function to coordinate replication and repair-and promise to help scientists understand how cancer develops when the checkpoint repair mechanisms fail. during animal development, cells gradually grow, multiply, and specialize to create the tissues and organs that shape and sustain multicellular organisms. the progression from a single cell to a thousand-, million-, or trillion-celled animal follows an exacting schedule and plan involving an elaborate network of genes and proteins. one of the primary mechanisms coordinating this process is cell-to-cell communication. cellular signaling regulates two crucial development mechanisms, apoptosis (programmed cell death) and cell proliferation, which work like chisel and clay to sculpt multiplying masses of cells into, say, a fly wing or a human finger. controlled by multiple signals operating at fixed intervals, the entwined pathways can be steered off-course by a single defect in the communication network, resulting in the death of a healthy cell, for example, or the survival of a damaged cell. 
such disruptions can lead to physical abnormalities, such as webbed hands and feet, when cells that should die remain alive; degenerative nerve disease, when healthy cells are killed; and cancer, when damaged cells survive and evade normal growth limitations. researchers have uncovered some of the mechanisms underlying these processes by studying genes involved in fruitfly (drosophila) development. following that tradition, stephen cohen and david hipfner have identified a gene critical to drosophila development that juggles cell growth and survival signals to help promote cell growth and prevent inappropriate apoptosis. they searched for genes associated with changes in tissue growth in fruitfly wings and identified some that can cause tissue "overgrowth"-abnormally large masses resulting either from cells growing faster than they divide or from cells escaping proliferation controls when they are overexpressed. among these is a gene that encodes a newly identified kinase that contributes to the regulation of cell proliferation and survival (or death, depending on the circumstance) during drosophila development. cohen and hipfner called the gene slik based on its similarity to two human kinase-coding genes (slk and lok). little is known about these human proteins, though previous studies suggest they may affect cytoskeletal dynamics and cell adhesion. in this paper, the authors report preliminary evidence supporting the notion that slik may regulate the cytoskeleton, the "backbone" of the cell that confers structure and motility. interestingly, disturbances to cell adhesion and cytoskeletal structure are known triggers of apoptosis and are being explored as potential anticancer agents. kinases make up one of the largest families of proteins and are important regulators of cell signaling. to investigate the function of slik in drosophila, the researchers removed the gene and then studied the physical and cellular effects. they found striking delays in growth and developmental timing and showed that these effects result largely from the demise of the slik-deficient cells. while cells deprived of slik can grow, divide, and differentiate, most respond to the defect by killing themselves, even under conditions that normally promote survival. thus, cells without slik appear to have an intrinsic survival defect, suggesting that slik prevents apoptosis. when slik is overexpressed, cell proliferation increases, but so does apoptosis. only when apoptosis was blocked did the cells form tumor-like growths. this coupling of cell growth and cell death is characteristic of oncogenes (cancer-causing genes), and slik also seems to function in both pathways. the authors point out that the signal to proliferate may inherently sensitize cells to apoptosis, as has been shown previously for some cancer cells. this may keep an individual cell under the control of its neighbors, who collectively monitor the needs of the organism. for a cell to respond to a signal by dividing rather than dying, it must get the appropriate signs from its comrades. slik, the authors demonstrate, is a key factor in determining whether a cell lives or dies. whether its mammalian counterparts play a similar role is yet to be determined.
one might not expect that yeast lead terribly eventful lives, yet the single-celled fungus must struggle to survive just like everyone else. and for yeast-like everyone else-survival means being able to detect and coordinate a rapid response to changes in its environment. though survival for humans is a bit more complicated, our cells use the same regulatory networks, which maintain cell growth and health when they work and contribute to diseases, from asthma to cancer, when they break down. given the variety of conditions even the lowly yeast is likely to encounter during its life, one might expect to find a multitude of molecules mobilizing a response. but yeast cells, it turns out, are fairly resourceful.
as erin o'shea and colleagues report, just one protein in yeast activates different groups of genes in response to different amounts of an environmental stimulus. the researchers focused on how yeast responds to various levels of phosphate, an essential nutrient for all cells. one way that cells regulate responses to environmental stimuli is through the transcription (activation) of genes. these transcriptional responses are often controlled by a multistep process that shuttles gene-activating proteins into the nucleus, where they can generate the appropriate response for a given stimulus, or confines them to the cytoplasm if their gene products are not needed. during this process, called phosphorylation, the addition of a phosphate group to a protein-such as a receptor or transcription factor-acts as a mechanism for controlling gene expression. o'shea's team demonstrated that phosphorylation of a transcription factor called pho4 controls gene expression by controlling where that protein resides in the cell-in the cytoplasm or in the nucleus. as is the case with many proteins, pho4 can accept phosphate groups at multiple sites. to see whether the location of phosphorylation affects the action of pho4, o'shea's team exposed yeast to different levels of phosphate and tracked the cellular response. they found that when yeast is deprived of phosphate, pho4 has no phosphate groups at any of its binding sites and enters the nucleus, where it binds to dna and activates a set of genes whose products can scavenge for phosphate or otherwise compensate for the scarcity. when yeast has ample supplies of phosphate, pho4 is phosphorylated and remains in the cytoplasm-unable to influence transcription-suggesting that the cells can absorb plenty of nutrients from their environs without having to engage a specialized foraging team. when the researchers exposed the yeast to intermediate amounts of phosphate, the results were surprising. 
middling concentrations of phosphate produced different forms of phosphorylated pho4, which varied in their ability to activate genes, and so added to the number of possible responses. pho4 partially phosphorylated at one site, for example, could still enter the nucleus, but activated only one type of phosphate-recovery gene and not others. while it is not unexpected that differential phosphorylation could have different functional outcomes, the authors say, it is surprising that one enzyme acting on one transcription factor can create different phosphorylation patterns-and therefore different gene-expression patterns-in response to different amounts of a single stimulus. their results show that cells rely on a highly regulated series of interactions that induce subtle changes in gene expression to fine-tune their response to small environmental changes. and they do this in a remarkably efficient manner, relying on a small cast of characters to orchestrate the responses essential for survival. extraction, amplification, and decoding of viral sequences rapidly identify known viruses and classify new ones based on their genetic makeup. this was validated in march when the viral chip contributed to the identification of the cause for severe acute respiratory syndrome (sars) as a novel coronavirus. in the article published in this issue, the researchers describe the chip (or microarray), how it was used in the classification of the sars virus, and how it provides direct access to viral genomic sequence. microarray technology works by taking advantage of the structural properties of dna. dna molecules normally exist as double helices, two complementary strands of nucleotides wrapped around each other. the microarray consists of a large number of single dna strands attached to a solid base. 
these probes (which in case of the viral chip represent sequences from all fully sequenced reference viruses) can be used to interrogate unknown sequences: if a solution containing such sequences is passed over the chip, similar sequences will "hybridize,'' or bond in a signature double helix. known viruses hybridize in a characteristic pattern and can be identified quickly. because bonding occurs even when the match between probe and sample sequence is not perfect, new relatives of known viruses can be identified as belonging to a particular family (such as coronaviruses, in the case of sars). to quickly obtain more information on a novel virus, it is then possible to "syphon off'' those viral sequences that stuck to their respective counterparts on the chip and to use the material to determine part of the genomic sequence. such sequence information provides more detail on how the new virus relates to known ones, which might provide clues about its origin and possible treatment strategies. faced with all manner of potential threats in the form of billions of different viral, bacterial, and chemical pathogens, the mammalian immune system relies on a "safety in diversity'' strategy for protection. with two distinct subsystems-one innate, the other adaptive-the immune system can recognize some 100 trillion antigens. the innate system deploys cells programmed to quickly recognize microbes with a particular set of conserved molecular structures. the adaptive system relies on billions of uniquely outfitted lymphocytes (white blood cells) to identify just as many pathogens through their protein fragments, or antigens. a human being grinds out billions of these cells every day. in the absence of threats, the immune system maintains a quiescent state and many of these cells are discarded. but for the immune system, doing nothing takes a concerted effort. lymphocytes originate in the bone marrow, though not all differentiate there. 
one class of lymphocytes, called t cells, develops in the thymus, where every t cell acquires a one-of-a-kind receptor, called a t cell receptor (tcr), designed to recognize a different antigen. when an antigen gets bound by a tcr (a bound molecule is called a ligand), the antigen triggers a signaling cascade that tells the t cell either to attack the infected cell or to alert other immune cells of the infiltrator. but as jeroen roose, arthur weiss, and colleagues report, signaling pathways activated by bound tcrs appear to influence gene expression even in the absence of antigen or other receptor ligands, a process called ligand-independent signaling. these findings lend support to the notion that cellular signaling pathways regulated by surface receptors, like tcrs, exhibit a continuous low-level signaling (known as basal signaling) in the absence of a stimulus and that this continuous signaling, by influencing gene expression, has significant influence on cellular differentiation. roose, weiss, et al. focused on the tcr signaling pathway that regulates the expression of a group of genes, including rag-1 and rag-2, that are activated in two distinct waves during t cell development. rag genes play a crucial role in t cell development, a highly complex, multistage process that involves a reshuffling, or recombination, of tcr genes and the activation of different proteins and genes at different stages. rag genes regulate the genetic recombination and ultimate cell surface expression of tcrs. using chemical inhibitors and mutant human t cell lines deficient in critical signaling components involved in antigen receptor-dependent pathways, the researchers found that the loss of specific functions or specific proteins affected an unexpected set of target genes. notably, when downstream components (the protein kinases erk and abl) were disabled in the basal signaling pathway, the researchers saw a resurgence of rag gene expression. 
while erk was already known to play a prominent role in signaling pathways downstream of the tcr, it now appears that abl may also be regulated in tcr pathways. most importantly, these findings suggest that signaling pathways thought to be triggered only by ligated receptors can influence gene expression on their own. and it may be through this type of signaling that tcr pathways help regulate t cell development by repressing rag gene activity. these basal signals, the researchers postulate, may in effect save the rag expression machinery until recombination is called for. if rag genes were expressed at the wrong time, they could cause inappropriate genetic recombination and create t cells that either lack function or attack healthy cells, as happens in immunodeficiency and autoimmune diseases. elucidating the mechanisms and components of this basal pathway will contribute important insights into the development and function of the immune system. but these studies also establish a model for investigating other signaling systems, to determine whether biologically functional basal signaling is a rare phenomenon or whether it is a fundamental cell process needed to control the profile of gene expression in the quiescent state. a multicellular organism can have more than 200 different types of cells and as many as 100 trillion altogether. during the process of development, an organism enlists the service of hundreds of signaling molecules and thousands of receptors to direct cell growth, differentiation, and morphological destiny. any given cell has no use for most of these signals and gets by with just a limited repertoire of receptors on its surface. once a signal reaches a receptor, it triggers a series of biochemical reactions as different molecules transform the external signal into a biological response, in a process called signal transduction. 
one cell type controls all of its cellular functions-both universal and specialized-with just a few dozen receptors; each receptor elicits a wide range of responses by triggering a small number of interacting pathways. exactly how a receptor produces the right response at the right time is a fundamental question in biology. of particular interest is a class of receptors-called receptor tyrosine kinases (rtks)-that regulate cell proliferation, differentiation, and survival and play an important role in embryonic development and disease. growth factor receptors are an important subset of rtks.
the platelet-derived growth factor receptor (pdgfr) family activates downstream signaling enzymes that stimulate the growth and motility of connective tissue cells, such as vascular smooth muscle cells (vsmcs), oligodendrocytes (cells of the tissue encasing nerve fibers), and chondrocytes (cartilage cells). the pdgf beta receptor is essential for directing the differentiation of vsmcs. while studies of signal transduction of this growth factor have established a model of how receptor tyrosine kinases function, the role of individual downstream signaling components in a living organism is still unclear. using mouse molecular genetics, michelle tallquist and colleagues set out to determine the function of individual components in the pdgfr beta pathway. they discovered a quantitative correlation between the overall amount of signal produced by the receptor and the end product of the signal, formation of vsmcs. receptor responses, they report, are controlled in two ways: signaling was influenced both by the amount of receptors expressed and by the number of specific pathways engaged downstream of the receptor. surface receptors have "tails" that project into a cell's interior. when a surface receptor is activated, a number of potential binding sites-modified amino acid residues-are exposed on its intracellular tail. ten of these sites can bind to proteins with a specific amino acid sequence, called an sh2 domain; proteins with these domains can then initiate a signal transduction pathway. by introducing mutations in the sh2 domain-binding sites in mice, the researchers could evaluate how the loss of a particular binding site-and therefore pathway-affected the function of the receptor. they had previously investigated the functions of two other downstream signaling proteins in similar experiments. surprisingly, tallquist et al. found that losing some of the individual components did not produce a significant negative physiological effect.
only when multiple downstream signaling pathways were disrupted did the researchers see a significant effect on the population of the cells. reductions in the numbers of both activated receptors and activated signal transduction pathways produced reductions in the population of vsmcs. these results have not been seen in tissue culture before, suggesting that signal transduction is more complex in vivo and that future studies would benefit from incorporating a global approach, rather than targeting a single signaling component. the next step will be to investigate exactly how the individual pathways contribute to this result. it is also unclear whether these results apply only to these growth factor receptors or explain how rtks operate in general. such questions have significant clinical relevance. overexpression of the pdgfr beta pathway has been linked to a variety of serious diseases, including atherosclerosis and cancer. understanding how cells control the action of this growth factor is an important step in developing targeted therapies. since many of these conditions result from a growth factor stuck in the "on" position, inhibiting overactive receptors promises to be an effective clinical intervention.
the protozoan parasite plasmodium falciparum causes falciparum malaria, a fatal parasitic disease in humans, and is transmitted by anopheles mosquito vectors (predominantly the anopheles gambiae complex and an. funestus in africa). there are about 300 million malaria cases and 1-2 million deaths annually, the brunt of which are borne mostly in africa by children under 5 years of age and by pregnant women. in many african countries, malaria poses a formidable challenge to an overburdened and underfunded public health system. the current malarial control strategies consist of chemotherapy directed against the malaria parasite and prevention of mosquito vector/human contact using insecticide-impregnated bednets and, to a lesser extent, indoor residual insecticide spraying and environmental control for reducing mosquito breeding sites. there are still no malaria vaccines in clinical practice. chemotherapy (the use of drugs to target disease) is used for both treatment and prevention. drug resistance is increasingly becoming a problem. some of the antimalarial drugs in current use include quinolines, artemisinins, antifolates, atovaquone/proguanil, and antibiotics. chloroquine (cq) is a cheap and widely used aminoquinoline, but cq-resistant parasites have become ubiquitous in endemic countries and other drugs are now used much more frequently (ridley 2002). fansidar, a combination of sulphadoxine and pyrimethamine (sp), is a first-line treatment in several african countries, but resistance to sp is spreading rapidly. targeting the mosquito vector with pyrethroid-impregnated bednets, in addition to chemotherapy, is an effective method of controlling malaria transmission. however, pyrethroid resistance has been reported in an. gambiae s.s. in west africa, and there is concern about its emergence in east africa (chandre et al. 1999).
thus, the public health problem due to malaria is exacerbated by the emergence of drug-resistant parasites and insecticide-resistant mosquitoes. the clinical application of efficacious intervention tools is therefore an urgent imperative for malarial control. this brings into sharp focus the importance of genomics research for drugs, vaccines, diagnostics, and insecticides. the unraveling of the genomes of humans, p. falciparum, and an. gambiae has ushered in a new era of hope that genomics research will result in the development of new and better tools for malaria control. the p. falciparum genome of 22.8 megabases (mbp) distributed among 14 chromosomes consists of 5,300 protein-coding genes (gardner et al. 2002) . p. falciparum possesses a relict plastid, the apicoplast, homologous to the chloroplasts of plants and algae. the apicoplast is essential for parasite survival and functions in the anabolic synthesis of fatty acids, isoprenoids, and heme (seeber 2003). these essential metabolic pathways are not present in humans and are therefore ideal targets for the development of safe antimalarial drugs. inhibitors of type ii fatty acid biosynthesis (triclosan and thiolactomycin) and mevalonateindependent isoprenoid biosynthesis (fosmidomycin and fr900098) with potent antimalarial activities have been identified by computational mining of the genome data. the fact that fosmidomycin has rapidly entered into clinical trials underscores the great utility of genomics research in the control of malaria (lell et al. 2003) . about 3,200 proteins (60%) in p. falciparum have no known functions (gardner et al. 2002) . 
the greatest challenge of malarial functional genomics (the elucidation of the functions of genes encoded by an organism's genome) is to assign functions to these proteins, thus comprehensively identifying the proteins that function at various lifecycle stages and that function together to carry out particular cellular processes, e.g., red blood cell invasion, signal transduction, growth, vesicular trafficking, etc. the application of functional genomics approaches allows the properties of many genes and proteins to be assessed in parallel on a large scale. these approaches are being used to address specific questions about the biology of p. falciparum. gene profiling (determining which genes are expressed) by microarray technology allows a rapid, parallel analysis of genome-wide changes in gene expression over a variety of experimental conditions (e.g., chloroquine versus saline control), tissues, and cell types; these genes can be clustered (ordered by expression pattern) to identify those that function in the same process. one of the most promising applications of microarrays is the study of differential gene expression during the complex p. falciparum lifecycle, specifically the formidable and challenging task of determining which subset of the 5,300 genes is represented in the transcriptome of each stage (bozdech et al. 2003; le roch et al. 2003) . these approaches are beginning to yield invaluable insights about new vaccine candidates, novel drug targets, and the molecular basis of drug resistance. proteomics is the study of all the proteins expressed in an organism. global protein analysis offers a unique means of determining not only protein expression, but also interacting partners, subcellular localizations, and post-translational modifications of proteins of whole proteomes. 
analyses of the proteomes of parasites that have been exposed to distinct environmental stimuli (e.g., chloroquine versus saline control) or that manifest distinct phenotypes (drug resistant versus drug sensitive) might also facilitate the identification of biochemical drug targets and of the specific proteins involved in drug resistance. comparative genomics (the comparison of genomes of related species), on the other hand, will yield invaluable insights about the biology of and the pathogenesis of disease associated with different parasites, i.e., p. falciparum on the one hand and p. vivax on the other. the biology and pathology of the two parasites are quite distinct, e.g., the preference for reticulocytes (p. vivax) versus mature red blood cells (p. falciparum), the ability to cause severe (p. falciparum) versus mild (p. vivax) disease, and the implication of amino acid substitutions in pfcrt in cq resistance in one (p. falciparum) but not in the other (p. vivax). the 278 mbp sequence of the nuclear genome of the pest strain of an. gambiae s.s. has been published in draft form and is considerably larger than the 122 mbp assembled sequence of the fruitfly drosophila melanogaster (holt 2002). the an. gambiae genome includes a treasure trove of 79 odorant receptor genes and about 200 genes that encode glutathione-s-transferases, cytochrome p450s, and carboxylesterases. these and possibly other genes probably play a critical role in human host finding and detoxification of insecticides, respectively, and could be exploited, using gene profiling, proteomics, and comparative genomics, for the development of novel mosquito repellants or traps and insecticides. the ability to introduce foreign genes into anopheles vectors is an exciting advance that might facilitate the development of transgenic mosquitoes that do not transmit malaria parasites (moreira et al. 2002).
however, the future implementation of this control strategy, if current technical hurdles can be overcome, must take into consideration concerns about the environmental impact of releasing genetically altered mosquitoes. scientists in endemic countries must be active participants in malaria genomics research and not just conduits for field materials for northern partners. however, the reality is that there is an increasing technological gap between endemic- and developed-country researchers in the field. this needs to be urgently addressed. the world health organization special programme for research and training in tropical diseases has initiated a series of training workshops in bioinformatics in endemic countries; the howard hughes medical institute has supported one such workshop. the training must extend to other aspects of genomics and include infrastructure development. there is considerable optimism that genomics research will result in new drugs, vaccines, diagnostics, and tools for malarial vector control. strong linkages between genomics research and national malarial control programs will facilitate the translation of research findings into intervention tools. as it is for all new technologies, it might also be important for the communities in endemic countries to have a greater awareness and understanding of genomics research. this will enhance acceptance of the products and improve informed consent. there is therefore a unique opportunity for collaborations between social-economic scientists and genomics researchers.
the challenges of searching the scientific literature
the standard "front end" for biomedical literature search is medline and its entrez query system. huge, well-managed, and nearly exhaustive, medline and its 11 million references provide incredible ease and facility for anyone who can type a boolean query.
though not quite a parallel for google-which runs a kind of popularity contest for web links in real time-the entrez search has opened up the literature to anyone with a web browser. to those who grew up chasing citations and papers through the aisles of a scientific library, entrez is a dream come true. and yet. suspend disbelief and imagine for a moment a kind of literature search dream-tool. "find me all references citing my gene of interest," you could ask. but why stop there? "find me all references citing some or all of my four genes of interest with expression or in vitro data." and then, "bring up the text of the paragraph in which these citations occurred so i can view them in context. and do it in real time." tools that can perform such searches would go beyond google because they avoid the repetitiveness involved in multiple searches. and they would go beyond entrez because they would search the entire medical literature in full-text format and not, as medline does, just the abstracts. furthermore, they would go beyond both types of searches in that they would be at least somewhat intelligent. such text-mining efforts are the next frontier for both academic and commercial groups that have sprung up from pasadena to boston to tel aviv. but how realistic is this venture? text-mining and its more universal relative "information retrieval" are still in their infancy. the first paper on text-mining for biology was published only in 1997. furthermore, because biological text-mining comes so close to the challenge of comprehending human language-arguably the most complex invention in the history of the planet-it is what computer scientists call a "hard problem." so even here, at the embryonic and fun stage in this technology's history, the outcome and especially the timing of improvement are impossible to predict. language-processing software tools have been successfully applied in text-mining of nonscientific sources, especially to newswire content. 
computer programs can already perform all three levels of text-mining (figure 1) effectively: retrieving documents relevant to a given subject; extracting lists of entities or relationships among entities; and answering questions about the material, delivering specific facts in response to natural-language queries. information retrieval and extraction can be performed on news data at success rates of 90%-95%, says lynette hirschman, a structural linguist. question-answering has been reported in the literature at 85% accuracy, she notes, which is "amazingly good." the question is, how soon can these levels be achieved for biology? good thing for biologists that hirschman has turned her energies in their direction. hirschman works in massachusetts at mitre corporation, a government-funded institution that pursues projects in the national interest, be they in defense and intelligence or, as in the case of text-mining, "anywhere we can move an entire field forward," says hirschman. the good news from news-mining is that improvement seems to arrive in direct proportion to the time and energy expended by the research community. similar improvement has occurred in speech recognition by computers, she adds (figure 2). when people took successively harder problems and worked on them for four or five years, she explains, it caused error rates to drop, as a rule, by a factor of two every two years. one might think tackling the biomedical literature would be relatively easy, remarks hirschman: biology jargon has a lot of prefixes and suffixes, which can be parsed more easily than verbs and adverbs; it is highly regular, with greek-letter add-ons to gene or protein names signifying relatives or subtypes of the original proteins; and there are many resources available, such as databases and ontologies linking different biological terms. but whereas extraction of person and place names from news text routinely reaches 93%, results in biology remain mired in the 75%-80% range.
"it's a little depressing," warns hirschman. "even something as simple as a slash may imply two different entities or a single compound." a chorus of assent greets her observation. programmers eager to codify the rules of biology have been stymied by what one bioinformaticist calls "a sea of exceptions." moreover, there is a chronic lack of data that have been "marked up" by software or humans to indicate the roles played by some of the key words. this marking-up process, however it is done, is crucial for machine-learning tasks. getting these data is both hard and expensive, says hirschman. to move biology text-mining forward, she believes, requires organizing different academic and commercial groups so that they are at least working on the same problem. only then can standards emerge that will allow progress in the field even to be measured. this type of shared problem-known as a "challenge evaluation"-has become something of a "religion" in the speech and language community since the 1980s, says hirschman. by putting out a set of data to train on and then issuing a "challenge" for each group to extract the same information or answer the same questions, "you compare apples to apples. in the process you build a research community." last year, hirschman and others ran the very first challenge evaluation in biology, the kdd cup (officially called the knowledge discovery and data-mining challenge cup). six weeks in advance, the organizers gave participants a training set of 862 journal articles already included in the model organism database flybase, along with associated lists of genes and gene products, as well as relevant data fields from flybase. 
figure 1 (doi: 10.1371/journal.pbio.0000048.g001). the four levels of information retrieval: google and medline both use keywords to direct a searcher to documents. but the next level has been tough to crack. improved software would allow biologists to jump from the web or medline to specifics with a single query. (adapted with permission from the mitre corporation.)
after building their software tools, the entrants were then asked to take a test set of 213 articles and pretend they were curators: the tools were supposed to determine whether the articles were appropriate for curation, based on whether they contained experimental evidence for gene expression products, including both rna transcripts and proteins. eighteen participants took a shot at the kdd cup and their results speak of the infant state of the field. on average, they could assign only 58% of the papers correctly and could determine whether relevant gene products were present only 35% of the time. the winning entrant, a joint group from the israeli company clearforest ("see the forest and the trees") and maryland-based celera genomics, did better. their entry made the right decision to curate 78% of the time and the right call on the presence of gene products 67% of the time. the winning group did so well by using a clever "trick," says hirschman admiringly. their program searched for figure captions and then applied multiple techniques to find those gene products they were looking for. the techniques applied by clearforest and others fall into two broad categories, statistical and heuristic. statistical techniques are the next step up from keyword searches. they count words such as genes or gene products appearing close to one another, but apply no linguistic insights, such as whether an adjective modifies a noun. 
by contrast, heuristic approaches use hand-crafted rules designed for specific datasets: e.g., january, february, march, etc., are months; the word following "mr." is a name; and so forth. this approach is labor-intensive but especially useful when there is only a limited amount of data-as is the case with single scientific papers or small groups of papers. some statistical approaches have been labeled with the nickname "bag of words" because they fail to account for grammatical relationships; e.g., "man bites dog" and "dog bites man" would drop the same three words in the bag. a key observation at the kdd cup was that the most basic statistical approach, which counts word occurrences at the document level, is not sufficient unless it takes into account at least some higher-level context, such as the part of the paper from which the search terms are extracted. furthermore, the more hand-crafted rules there were, the better. many of the top teams included biologists who applied their expertise to help create empirical rules that became part of the program instructions. this points to a general theme in machine learning: the greater the degree of human intervention, the better. the best programs are covered with fingerprints. although the march toward better text-mining systems is building momentum, there are two issues that could stop it in its tracks. the first is access. experts in text-searching uniformly cite access as a key obstacle for developing better search tools. "access is a bigger problem than algorithms" is how one machine-learning expert puts it, and a half-dozen others agreed. the present "balkanized" situation for text-processing is filled with "dead ends" and "short circuits" in information flow among biologists, says david lipman, head of the united states' national center for biotechnology information, which runs pubmed, the medline database, as well as the national library of medicine and other critical resources in biology and bioinformatics. 
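the "bag of words" limitation mentioned above is easy to demonstrate. the following minimal sketch (function names are invented for this illustration) counts words without regard to order and then compares that with a representation that keeps adjacent word pairs, the simplest way of retaining some word-order context.

```python
from collections import Counter


def bag_of_words(text):
    """Word counts only; all information about word order is discarded."""
    return Counter(text.lower().split())


def ordered_pairs(text):
    """Adjacent word pairs (bigrams), which retain local word order."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


if __name__ == "__main__":
    a = "man bites dog"
    b = "dog bites man"
    # the two bags are identical, so a pure bag-of-words system cannot
    # tell the sentences apart
    print(bag_of_words(a) == bag_of_words(b))
    # the bigram lists differ, so even minimal order information separates them
    print(ordered_pairs(a) == ordered_pairs(b))
```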
it is as if readers are marine biologists on a coastline whose beaches are 98% private. at best, asking permission to view every article slows down the work. at worst, there are some important tools one can never build owing to the missing context. medline itself would be much more powerful if it were based on full text, experts say. owing to lack of access, says hirschman, "we miss a great deal by not having large corpora of full-text articles" included in the design of both the kdd cup and the next challenge evaluation, called biocreative, being held later this year. many of the relevant biological data are found outside abstracts, but getting access to full text is complicated at best. for manual searching, researchers traditionally fall back on portalhopping: jumping from one full-text subscription (to nature, science, or cell, for example) to another, or from one portal (highwire, web of science) to another. that way, many scientists routinely obtain access to as many as 80% of the journals they need. the rest they can usually request via interlibrary loan or order as photocopies online. however, this approach fails for most automated search programs. just sorting out the permissions and keeping up with changes in the portals dramatically increase the headaches for anyone trying to build a search tool. the second threat to text-searching programs ever becoming widely useful has more of the ring of linguistics jargon. the so-called " ontology problem" threatens successful searching based on the very specific nature of biological terminology. the issue here is not only that scientists are truly terrible about sticking to established terminologies. "scientists would rather share each other's underwear than use each other's nomenclature," as biochemist keith yamamoto is fond of saying. consequently, the scientific literature is a hodgepodge of identical or overlapping terms. 
a naïve text-parsing program does not know whether "cat" refers to the catalase gene, the chloramphenicol transferase gene, or a household animal. the challenge is to build an ontology describing all the important relationships so your computer program can navigate among them without asking you what to do. consequently, an ontology would prescribe rules for understanding the interactions among genes based on the appearance of certain verbs ("inhibit," "express"), nouns ("agonist"), or phrases. although within each narrow scientific subdiscipline it may be possible to build exquisitely useful text-mining tools, as soon as programmers broach the borders of the narrowest subfields, they will run into a kind of heisenberg uncertainty principle of linguistics and science. every toolmaker is faced with the ontology problem in one respect or another, especially when the tool is meant to be a general one. david gilmour, chief executive officer of tacit inc., a knowledge management company in palo alto, california, is an industry veteran of exactly this war "and i have scars all over my body to prove it," he says. the issue in a nutshell, he explains, is that "ontologies scale poorly, and by the time they are useful," that is, large enough to capture most of the possible relationships among words, "they are unmaintainable." hirschman acknowledges that keeping up with the literature and new terminologies is challenging. adapting tools to new domains has traditionally been one of the "critical stumbling blocks" for text-processing technology, she says. the dynamic growth of biological terminology does not help. there are 50-100 alterations every week to the nomenclature section of the mouse genome database web page. staying within one's narrow domain, then, could be a recipe for success, as long as the vocabulary and user questions remain tightly constrained, especially if there is a way to tiptoe around the access problem. 
that is apparently the case at wormbase, though the newly available tool there, called textpresso, is still being built. the motivation for textpresso was simple, says hans-michael mueller, a postdoctoral fellow in the lab of paul sternberg at caltech in pasadena, california, where wormbase-the genetic database for the nematode worm caenorhabditis elegans-is curated. "we want the user to be able to avoid going to the library to read all those papers [on genes and proteins] that your favorite gene interacts with. that is very tedious." the other goal is equally recognizable in the biology community: no mere mortal can hope to keep up with the burgeoning literature, even in the relatively narrow field of worm biology. mueller, a nuclear physicist by background, called textpresso "a search engine for full-text searches of abstracts and articles" that can help find answers to more challenging queries than simple keyword searches. mueller and his team use human "taggers" to mark up the corpus of text to indicate categories like "biological processes" ("late larval activation"), "genes" (let-7), and "molecular functions." then, like the clearforest-celera program, textpresso searches for combinations of categories in the same or neighboring sentences. the ontology relating the expressions and categories to one another is based both on scientific and common sense as well as linguistic components. in less than two years of work, mueller and his team have already marked up 3.9 million terms in 16,000 abstracts and 2,000 full-text papers. a typical search asks a question such as "what can be found out about the negative regulatory aspects of a genetic network in the pharynx?" answers emerge in the form of citations, abstracts, and, if available, a paragraph or so from the text of the relevant paper. textpresso went up-unpublicized-on the web in february this year and already receives a couple of hundred hits a day, a big number in a field of about 2,000 researchers. 
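the search strategy described above, looking for combinations of tagged categories in the same or neighboring sentences, can be sketched in a few lines. this is not textpresso's code or ontology: the lexicon, the category names other than those quoted in the article, and the toy text are all hypothetical, and real taggers would use far richer matching than substring tests.

```python
import re

# hypothetical hand-made lexicon mapping terms to categories
LEXICON = {
    "let-7": "gene",
    "lin-41": "gene",
    "late larval activation": "biological process",
    "represses": "regulation",
}


def tag_sentence(sentence):
    """Return the set of categories whose terms occur in the sentence."""
    lowered = sentence.lower()
    return {category for term, category in LEXICON.items() if term in lowered}


def search(text, required_categories, neighbors=1):
    """Indices of sentences where all required categories appear within
    the sentence itself plus the next `neighbors` sentences."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    tags = [tag_sentence(s) for s in sentences]
    hits = []
    for i in range(len(sentences)):
        combined = set().union(*tags[i:i + neighbors + 1])
        if set(required_categories) <= combined:
            hits.append(i)
    return hits


# toy text written for this example, not taken from any paper
TOY_TEXT = (
    "let-7 represses lin-41. "
    "this happens during late larval activation. "
    "the pharynx is unaffected."
)

if __name__ == "__main__":
    print(search(TOY_TEXT, {"gene", "biological process"}))
    print(search(TOY_TEXT, {"gene", "biological process"}, neighbors=0))
```

allowing neighboring sentences trades precision for recall: the gene and the process are mentioned in different sentences here, so a same-sentence search finds nothing.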
mueller estimates that textpresso is 95% accurate and that about 35% of the relevant papers have been included. textpresso needs full-text access to be as good as it is, says mueller. "we noticed" that drawing on full text "greatly increased the chances of a true hit," not a false positive. he managed to avoid the access issue by claiming a kind of "curator's privilege." only the curators see the full text. once the data are on the web, users can only get at most a paragraph, which falls within fair use, said mueller. if a user happens to subscribe to the journal in question, it is possible for him to click through to the publisher's portal and see the paper. whereas textpresso works exclusively on worm genetic data and commercial players like clearforest are just beginning to hunt for biological applications, a handful of companies have begun to market text-searching products to academic biomedical scientists. one such product is called quosa, for query, organize, share, and analyze. the software had its commercial launch in late 2002. put simply, the program, available on an institution-wide basis and already installed for hundreds of researchers at massachusetts general hospital and the dana-farber cancer institute in boston, allows a search across one's own documents. a front end for the literature that cooperates with medline, quosa pulls in and prioritizes full-text papers. the program first allows the user to search for the relevant files and download them in full-text format to the extent permitted by her library's subscription agreements and licenses. once it becomes second nature to users, they rave about it. like the best of the first-generation software, quosa allows users to make connections they would not have otherwise made. like so many other early software products, its long-term success will hinge on demand as well as improvements made in the upgrades. 
because of the ontology problem, improvements in searching in the next couple of years are likely to result from the application of ever-better techniques within existing domains. collaborations among wormbase, flybase, and other model-organism database groups will help improve all their search tools. medline itself may benefit from more advanced search techniques, though these will be restricted to abstract searches. the big unknown for predicting further development of text-search tools is the path publishers will take. if each publisher or portal such as reed-elsevier or highwire were to license or develop its own tool for searching its own content, the result might be better than the status quo, but would still be unsatisfying. running the same search three times on three different subsets of content might be better than running it 15 times, but wouldn't it be easier to run it just once?
the transcriptome of the intraerythrocytic developmental cycle of plasmodium falciparum we are grateful to drs. lisa ranford-cartwright (university of glasgow, glasgow, united kingdom), ayoade oduola and yeya toure (world health organization, geneva, switzerland), and wilfred mbacham (university of yaounde, yaounde, cameroon) for critical comments on the manuscript.
key: cord-009959-erh8ggh3 authors: bentley, william e.; wang, min‐ying; vakharia, vikram title: development of an efficient bioprocess for poultry vaccines using high‐density insect cell culture date: 2006-12-17 journal: ann n y acad sci doi: 10.1111/j.1749-6632.1994.tb44387.x sha: doc_id: 9959 cord_uid: erh8ggh3 nan infectious bursal disease virus (ibdv) is a pathogen of major economic importance to the world's poultry industries. it causes severe immunodeficiency in young chickens by destroying the precursors of antibody-producing b cells in the bursa of fabricius [1]. ibdv is a member of the birnaviridae family whose genome consists of two segments of double-stranded rna. 
presently, the principal method of controlling ibdv infection in young chickens is by vaccination with an avirulent strain of ibdv or by the transfer of high levels of maternal antibody to breeder hens [2]. recently, virulent strains that are antigenically different from previously established ibdvs have been isolated from vaccinated flocks on the delmarva peninsula. consequently, present vaccines afford only partial protection against infection. in figure 1, the large segment indicated encodes a precursor polyprotein that is processed into mature vp2, vp3, and vp4 [3]. vp2 and vp3 are the major structural proteins of the virion. vp2 is the major host-protective immunogen of ibdv and contains the antigenic regions responsible for the induction of neutralizing antibodies. for this reason, we have focused on vp2 as a potential vaccine. when expressed in baculovirus as one of a cassette (vp2, vp3, and vp4), the coat proteins self-assemble into empty virions, which subsequently afford protection in challenged chickens (fig. 2). this article presents the current status of our efforts to develop a cost-efficient vaccine for ibdv using the recombinant baculovirus and insect cell culture process. the baculovirus expression system has proven quite successful in laboratories, but it has yet to be used as an industrial expression system, although products for human use are in clinical trials. briefly, the system entails replacing the gene for the acnpv coat protein, polyhedrin, with the gene for the protein of interest. this coat protein is not required for the replication and maintenance of the virus in cell culture due to its biphasic life cycle. about three days after infection with a recombinant virus, the protein of interest is expressed instead of polyhedrin, which would normally constitute the majority of the total cellular protein. 
the system is attractive because it is nonpathogenic towards humans, and it involves the in vitro propagation of cells for which technology has advanced due to previous mammalian and hybridoma culturing techniques. it provides for an abundant supply of the protein of interest in part due to the strong polyhedrin promoter and also because insect cell lines reported to date provide for glycosylation and other posttranslational processing. in addition, the construction of recombinant baculoviruses has become simplified due to commercial expression kits. several general articles, manuals, and review articles have been published that describe both the expression system and novel process engineering aspects. in this work, we subdivide the entire bioprocess into three key areas: (1) metabolism of infected cells and specific heterologous protein yield; (2) bioreactor configuration and high cell density continuous culture; and (3) integration of expression and product separation. in each of the results subsections, we briefly review the recent literature as well as present our efforts to understand and advance these areas. the stock virus solutions were developed as described previously. virus titer was determined by the end-point dilution method. the infections were performed by adding different volumes of virus solution after viable cell count. thus, the time postinfection commences from the addition of the virus solution. the multiplicity of infection (moi) was evaluated separately in spinner flask culture. samples (2 ml) were taken from fermentors during the infection process, divided into two tubes, and centrifuged for 20 min at 12,000 rpm. the supernatants were separated from cell pellets and stored at -20°c until measurement of glucose and lactate. the cell pellet was resuspended in 250 µl 10 mm kh2po4 (ph 7.4) buffer solution and stored at -20°c until assayed (page) for quantification. another two 10 ml samples were centrifuged for 20 min at 3500 x g. 
the cell pellets were stored at -20°c without resuspending in any buffer and used for the measurement of proteolytic activity and total protein. total cell counts were performed with a vwr-brand hemacytometer, and viability was determined by trypan blue dye exclusion using a 0.04% solution (sigma). cell pellets resuspended in 250 µl ehbs were sonicated on ice for 10 s with a microtip and a 30% pulsed duty cycle. glucose and lactate were measured using a ysi model 27 analyzer (yellow springs instruments), and ammonia was determined by an enzyme-based assay kit (sigma, no 170-uv). total protein was measured using a biorad protein assay kit i. alkaline protease activity was assayed using the synthetic substrate azocasein (sigma) at a concentration of 2% (w/v) in 0.1 m tris/hcl buffer (ph 9.0). the cell pellets were resuspended using 250 µl of 0.1 m tris/hcl buffer (ph 9.0) and sonicated. one milliliter of azocasein solution was added to this 250 µl sample, and this mixture was incubated at 28°c for 1.5 hours. the reaction was stopped by the addition of 10% trichloroacetic acid (1.2 ml). after standing at 40°c for 30 min, the mixture was filtered, and 0.5 ml 0.5 m naoh was added to 0.5 ml of the filtrate. one unit of alkaline protease activity is defined as the amount of enzyme that gives an increase in absorbance of 0.1 in 30 min at 28°c. cells were harvested by centrifuging 70 ml of culture and then resuspending them in 10 ml native binding buffer (20 mm sodium phosphate, 500 mm sodium chloride, ph 7.8). egg white lysozyme (100 µg/ml) was added to this mixture, which was incubated for 15 minutes on ice. the cells were sonicated on ice three times for 10 s with a 30% pulsed duty cycle (branson sonic power, co.) and then flash frozen in a -80°c freezer. after thawing the sonicated cells at 37°c, rnase was added to the crude cell lysate to a final concentration of 5 µg/ml, and the lysate was incubated on ice for 15 minutes. 
insoluble cell debris was removed by centrifugation at 3,000 g for 15 minutes. the column resin (invitrogen, ca) was washed with distilled water and then equilibrated with native binding buffer. cell lysate (7 ml) was then loaded onto the column and washed with native binding buffer until the od280 was less than 0.01. a step down to ph 6.3 was performed using native wash buffer (20 mm sodium phosphate, 500 mm sodium chloride, ph 6.3), and the od280 of the supernatant was monitored until its value was smaller than 0.01. protein was then eluted by loading native-ph elution buffer (20 mm sodium phosphate, 500 mm sodium chloride, ph 4.0) onto the column and collecting 1 ml fractions. hundreds of laboratories have expressed heterologous proteins using recombinant acnpv and sf-9 or bm-5 cell lines, yet only a few laboratories have focused attention on maximizing yield by examining cellular functions during the growth and expression phases. factors such as moi, timing of infection, and cell density have been well documented. for example, neutra et al. [12] examined the effects of reactor volume, infection timing, moi, and cell density on β-galactosidase production in shake flasks. king et al. [13] found that low moi was favorable for a temperature-sensitive baculovirus. conversely, for a more typical baculovirus with polyhedrin promoter control and an rcd4 product, lazarte et al. [14] found high moi was most favorable. perhaps these systems differed by changes in cell metabolism that resulted from medium condition and infection temperature. in the lazarte et al. [14] study, fresh medium added at infection tripled the yield. indeed, several groups have demonstrated that infected cells retain significant metabolic activity, and this is essential for producing the recombinant protein. caron et al. noted that postinfection cell physiology was critical and suggested that different growth and production media might be required. zhang et al. 
[16] noted that for bm-5 cells, serum concentrations for optimal growth were different than for viral production. additionally, kamen et al. [17] quantified differences in principal nutrients and waste products during the growth and production phases. reports demonstrating the effects of nutrient limitation and by-product inhibition on protein yield have been limited, however. wang et al. [18] demonstrated that apparent glucose limitations could be offset by subsequent sucrose decomposition. lindsay and betenbaugh [19] noted that the oxygen demand was higher for infected cells and adjusted aeration properties, which facilitated dramatic increases in yield. likewise, scott et al. [20] reported a significant increase in protein yield by maintaining dissolved oxygen tension at 50% of saturation. however, at high cell densities, maintaining oxygen was not sufficient for increasing yield. caron et al. restored recombinant protein production at higher cell density by renewing the medium at the time of infection. feeding the specific rate-limiting nutrients is less expensive, however, and aids in elucidating insect cell physiology under viral infection. two batch cultures were conducted to examine the effects of dissolved oxygen (do) during viral replication and protein production. the first batch culture was in a fermentor without do control (fig. 3), while the other was conducted with do regulated near 35% (fig. 4). virus infection was performed for the fermentor without do control by adding 15 ml virus solution (moi 5) at 91 hours, after the cells had attained a concentration of 6.15 x 10^5 cells/ml. this moi and cell density were chosen so that other factors, such as nutrient depletion, were eliminated and differences due to oxygen level were discernible. shortly after infection, the do dropped to zero and remained zero for another 90 hours. the viable cell concentration remained near 6.55 x 10^5 cells/ml and then started to decrease simultaneously with an increase in do after 190 hours (fig. 3a). 
the onset of cell lysis, however, is marked by both the increase in lactate dehydrogenase (ldh) activity and the decrease in cell viability near 150 hours (fig. 3b). note that after the do dropped to zero, the lactate increased to a maximum of 2.5 mm, reflecting the inadequacy of oxidative phosphorylation to supply the required atp. in these energy-limited cells, the final epoxide hydrolase (eh) activity was 0.015 units per milliliter. for the batch culture with regulated do, the growth kinetics are shown in figure 4. virus infection was also performed at 90 hours and 6.55 x 10^5 cells/ml. unlike the oxygen-uncontrolled culture, the viable cell concentration continued to increase to a maximum of 1.05 x 10^6 cells/ml. the increased demand in oxygen immediately following infection was met in this experiment, and the do was maintained above 10% and averaged 35% for the remainder of the experiment. the continuous addition of oxygen likely provided the requirement for continued cell growth. note that the lactate concentration remained below 1 mm for the entire experiment, whereas the glucose concentration dropped almost 45%, from 14 mm to 8 mm after infection. the epoxide hydrolase activity reached 0.044 u/ml, which was 200% higher than in the oxygen-uncontrolled experiment. the specific productivity increased by 100% as well. this increase in protein synthesis is accomplished only by continued glucose uptake after infection. in the uncontrolled experiment, infected cells consumed little glucose (~1 mm) after the do dropped to zero. on the other hand, both the uninfected and infected cells in the do-controlled experiment continued to take up glucose and consumed almost 1 g/l after infection for cell growth and viral replication. note, therefore, that the effects of oxygen, like moi and cell density, are delineated and realized when sufficient nutrients are available in the extracellular environment. 
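the percentage comparisons quoted above follow directly from the reported values. the short check below (the helper name is an assumption made for this illustration) recomputes them: the rise in epoxide hydrolase activity from 0.015 to 0.044 u/ml works out to a little under 200%, and the fall in glucose from 14 mm to 8 mm to a little under 45%, so the figures in the text are rounded.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# epoxide hydrolase activity, u/ml: uncontrolled do vs. controlled do
eh_gain = percent_change(0.015, 0.044)

# glucose concentration, mm: before vs. after infection (do-controlled run)
glucose_drop = -percent_change(14.0, 8.0)

if __name__ == "__main__":
    print(f"eh activity increase: {eh_gain:.0f}%")
    print(f"glucose decrease: {glucose_drop:.0f}%")
```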
three experiments were performed to evaluate glucose and glutamine feeding in spinner flasks at high cell density (fig. 5). the use of 50 ml spinner flasks with 30 ml working volume prevents oxygen limitation. the cells were grown in a 250 ml spinner flask with 80 ml working volume until a density of 3.85 x 10^6 cells/ml was attained; they were then divided and infected in three smaller spinner flasks at an moi of 10 and a cell density of 3.3 x 10^6 cells/ml. one of the spinner flasks was diluted 3:1 with fresh medium as a control (total working volume, 30 ml). previously, this method was reported as resulting in the highest specific yield attainable in spinner flasks [15,19]. this spinner flask is denoted an oxygen and substrate-sufficient control. one spinner flask was then fed 0.5 ml of 200 mm glutamine and 80 µl of 2 m glucose. subsequent daily glucose feeding maintained glucose supply (fig. 5a). the last spinner flask was a control high cell density culture, without glucose/glutamine feeding. the epoxide hydrolase activity (fig. 5b) continued to increase for the culture with fresh medium (oxygen/substrate control) until 118 hours postinfection (hpi). by contrast, in the high cell density cultures, epoxide hydrolase activity increased more rapidly initially and then decreased at 88 hpi for the glucose/glutamine fed culture, and at 48 hpi for the unfed culture. a significant improvement in protein production from glucose and glutamine feeding was seen as early as 12 hpi. coincidentally, the glucose concentration in the unfed culture had dropped dramatically by this time (fig. 5a). the increase in activity continued well into the production phase and because the viable cell concentration in the fed culture was significantly lower after 48 hpi, the increase in activity was due primarily to an increase in specific productivity. 
when comparing the overall performance of the effects of oxygen, glucose, and glutamine feeding, the total protein activity of the do-controlled culture was three times that of the uncontrolled culture. further, the glucose/glutamine fed culture was twofold higher in yield than the do-controlled culture. ultimately, the epoxide hydrolase yield was 40-60 milligrams per liter [22]. as seen above, cell metabolism may play a very important role in recombinant protein production when cells are infected at high cell density. the metabolic stress responses of bacterial and mammalian cells to oxidation, heat shock, and amino acid starvation have been well documented, and many stress proteins have been identified. insect stress, however, has been quantified by previous researchers only by measuring the bulk proteolytic activity of whole insect homogenates [25,26]. in those reports, increased proteolytic activity corresponded to increased cellular stress. in many recent reports, degradation of the recombinant product has been significant. thus, another important aspect of cell metabolism concerns the elicitation of protease activity. we have used azocasein to investigate the appearance of alkaline proteases that may break down the product, particularly if allowed to follow in the separation and purification processes [29]. in another fermentor experiment, we controlled the dissolved oxygen to 50% and infected cells (moi 5) by the addition of virus stock solution when the cell density was about 1.2 x 10~ cells/ml. this lower cell density was chosen in order to avoid other cellular stress effects (for example, nutrient starvation). the cells (fig. 6) kept growing after viral infection due to a low moi [21,30] for a period of 40 hours and then the viable cell density decreased later due to a secondary infection. the total cell density decreased due to cell lysis, and the viability was 20% at cell harvest. 
two recombinant protein activities and the total specific protease activity are also shown. in this system, β-galactosidase expression is under promoter p10 control; epoxide hydrolase (eh) is our desired recombinant product and is under the polyhedrin promoter control. β-galactosidase is a cytoplasmic protein and will be liberated to the medium during expression and lysis. however, eh is membrane bound and not released to the medium. expression of both proteins (fig. 6) started simultaneously. the β-galactosidase activity increased until 79 hpi, was constant for a short time, and then decreased by 12% until harvest. on the other hand, eh increased more slowly initially, remained constant for 20 hours and increased slowly by 20% until harvest. the specific protease activity was shown to increase dramatically starting at about 70 hpi. this time just corresponds to the reduction of β-galactosidase activity and the slowdown in eh production. we should note that the viability was near 70% at this moment. inasmuch as the viability was high and there was no nutrient limitation (not shown), there was no apparent reason for cells to stop producing recombinant protein. the protease (measured as specific activity in the cell pellet) increased at least 15-fold even when considering partial loss from cell lysis. we also measured proteolytic activity in supernatant, and no increment was found, which suggests that the proteolytic activity was either associated with the cell membrane or was too dilute to detect in the supernatant. although there is no specific connection between the protease activity and the shift in protein activities (β-gal, eh), the coincident timing warrants further consideration. because it is unlikely that protease inhibitors will be used in a commercial process, the specific nature and timing of this proteolytic activity is significant. hence, characterization of the identity and consequences of this and other proteases is under investigation. 
in addition to the many productivity advantages that continuous suspension culture has over stationary (t-flask) or batch-suspension culture, continuous culture provides homogeneous and constant environmental conditions that facilitate the modeling of cell growth and metabolism. research with hybridoma cells has demonstrated this point. due to infected cell lysis, most continuous insect cell systems have employed two reactor stages: one for cell growth and the other for virus propagation and recombinant protein expression. however, it has been reported that there is eventually an almost complete loss of productivity from continuous insect cell culture with a continuously passaged virus. limited stability of the virus upon serial passage may limit the application of two-stage continuous reaction systems. this may be improved, however, by a three-stage continuous bioreactor system proposed by taticek and shuler [38], or by a cycling-batch reactor scheme. at present, batch and perfusion systems have achieved significantly higher viable cell density than continuous cultures. further, there is some question as to the long-term productivity of the cell lines in continuous cultures, independent of the virus [15]. consequently, we have investigated methods for maximizing cell number in extended continuous cultures [39]. dilution rates were examined from 0.014 to 0.0403 hr^-1, which is close to the maximum specific growth rate obtained in batch cultures. in figure 7, our small-volume glass-jacketed spinner flask reactor [39] is depicted. cultures were stirred at 130 rpm and contained 0.1% pluronic f-68 [40]. the air-sparging system was cyclic, with 60 min on and 30 min off. two volumetric air flow rates were examined: 2.056 ml/min (0.0093 vvm) and 2.76 ml/min (0.0125 vvm) for low- and high-sparging rates, respectively. the viable and total cell densities for both sparging rates are plotted as a function of dilution rate in figure 8. 
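the operating parameters above can be related by two standard definitions: vvm is the volumetric gas flow divided by the liquid working volume, and the mean residence time in a continuous culture is the reciprocal of the dilution rate. the sketch below uses hypothetical helper names and back-calculates the working volume implied by each reported flow rate and vvm pair; both pairs imply roughly the same volume, which is a useful consistency check on the reported numbers.

```python
def implied_working_volume_ml(air_flow_ml_per_min, vvm):
    """Liquid working volume implied by a gas flow rate and its vvm value
    (vvm = volume of gas per volume of liquid per minute)."""
    return air_flow_ml_per_min / vvm


def residence_time_hr(dilution_rate_per_hr):
    """Mean residence time of liquid in a continuous culture."""
    return 1.0 / dilution_rate_per_hr


if __name__ == "__main__":
    for flow, vvm in ((2.056, 0.0093), (2.76, 0.0125)):
        volume = implied_working_volume_ml(flow, vvm)
        print(f"{flow} ml/min at {vvm} vvm implies about {volume:.0f} ml")
    for d in (0.014, 0.0403):
        print(f"d = {d} per hr gives a residence time of {residence_time_hr(d):.1f} hr")
```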
at the low sparging rate, a monotonic decrease in viable cell density with increasing dilution rate was observed. this is consistent with the semicontinuous culture of mouse ls cells (without ph and do control) by sinclair. the high sparging rate exhibited a maximum in total and viable cell densities at a dilution rate of 0.0287 hr⁻¹. the decrease in viable cell density at the lower dilution rate (0.0251 hr⁻¹) may have been due to low nutrient and/or high toxin concentrations. note that at 0.0332 hr⁻¹, the cell density for the high sparging rate (0.0125 vvm) was five times higher (3.64 × 10⁶ vs. 7 × 10⁵ cells/ml) than for the low sparging rate. thus, this sparging rate provided much better oxygenation for cell growth with little contribution to cell death. the trends shown in figure 8 are also similar to those of a continuous sp2/0-derived mouse hybridoma cell culture with ph and do control [31] as well as an oxygen-limited continuous culture (ph control only) of mouse nb1 hybridomas. the data depicted were collected in one continuous culture of over 3100 hours in duration. at several times, the dilution rate was reset to a previous setting, and the cell density returned to the previous steady state value [39]. thus, the spinner flask reactor performed well, was reproducible, and will be very useful for further development of the additional protein expression stages. in figure 8, we demonstrated that a cell density of 4.2 × 10⁶ cells/ml with 94% viability could be achieved at a dilution rate of 0.0287 hr⁻¹. in order to check the protein expression level of infected cells grown continuously for an extended period, cells at a dilution rate of 0.0251 hr⁻¹ with low (0.0093 vvm, 800 hr) and high (0.0125 vvm, 2600 hr) sparging rates were taken from the bioreactor and directly infected at an moi of 5 in two 50 ml spinner flasks (30 ml working volume).
the maximum recombinant protein yields were 0.15 u/ml at 71 hpi and 0.05 u/ml at 69 hpi for cells from the 2600 hr and 800 hr samples, respectively. the maximum cell densities reached were 3.7 × 10⁶ and 1.6 × 10⁶ cells/ml for the 2600 hr and 800 hr samples.

figure 7. flow diagram and setup of continuous bioreactor system. the total volume is 250 ml, with working volume up to 230 ml. air is introduced by way of a peristaltic pump and an hplc solvent porous sparger. air is introduced in on/off mode with a home timer. temperature is maintained by water bath and glass jacket. agitation is provided by a magnetic stir plate (bellco) [39].

the average specific eh activity, calculated as the maximum enzyme activity divided by the maximum cell density, was slightly lower for cells from the lower sparging rate (3.125 vs. 4.05 × 10⁻³ u/10⁵ cells). perhaps this was due to a lower growth rate (0.022 hr⁻¹ vs. 0.025 hr⁻¹) or lower oxygen supply during the continuous culture. however, the specific eh activity from these long-term cultured cells is comparable to that from freshly prepared cells (3.34-5 × 10⁻³ u/10⁵ cells). this demonstrates that high cell density (more than 3 × 10⁶ cells/ml) can be achieved in continuous culture, even at a dilution rate of 0.0403 hr⁻¹. in addition, the specific protein yield from long-term cultured cells (over three months) is not compromised. the results of this work can be combined with the previous discussion on maximizing specific productivity in designing the best reactor strategy. this may involve incorporating feeding policies in the second or third stage of a multistage continuous system, or the inclusion of several alternating batch reactors that are filled by one continuous or semicontinuous reactor.

an attractive feature of the baculovirus system has been its potential for disease diagnostics and vaccine development. again, this is due to the high expression level, posttranslational processing, and nonpathogenicity.
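The specific eh activities quoted above for the two long-term continuous-culture samples (2600 hr and 800 hr) follow directly from the reported maxima. A minimal check is sketched below, assuming, as the text states, that specific activity is the maximum enzyme activity divided by the maximum cell density, expressed per 10⁵ cells. The function name is illustrative.

```python
def specific_activity(max_activity_u_per_ml: float,
                      max_cells_per_ml: float,
                      per_cells: float = 1e5) -> float:
    """Enzyme activity per `per_cells` cells (U per 10^5 cells by default)."""
    return max_activity_u_per_ml / max_cells_per_ml * per_cells


if __name__ == "__main__":
    # (label, max activity in U/ml, max cell density in cells/ml), from the text
    samples = [
        ("2600 hr, high sparging", 0.15, 3.7e6),
        ("800 hr, low sparging", 0.05, 1.6e6),
    ]
    for label, activity, cells in samples:
        value = specific_activity(activity, cells)
        print(f"{label}: {value:.3e} U per 1e5 cells")
```

The two results, about 4.05 × 10⁻³ and 3.125 × 10⁻³ u per 10⁵ cells, reproduce the mantissas given in the text and fix the order of magnitude of the reported specific activities.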
in addition, recent reports have shown that expression of virus coat proteins often results in self-assembled virus-like particles (vlp) that are essentially empty whole virions. of these vlp-producing systems, vaccines have been proposed for poliovirus [49], parvovirus [50], ibdv [4], flock house virus [51], and bluetongue virus [44]. in addition, vaccines have been proposed or tested from non-vlp proteins, including hiv [52], yellow fever [53], feline herpesvirus [54], trypanosoma cruzi [55], mokola and rabies viruses [56], newcastle disease virus [57], japanese encephalitis [58], hepatitis c [59], and bovine coronavirus [60]. of these, only two research groups have considered the advantages of integrating initial separation with product expression by employing imac [59,61]. we have developed the bioprocess so that imac can be employed as a potentially single-step separation procedure. imac was introduced for the selective adsorption of protein by porath et al. [62], who used divalent cations, zn2+ and cu2+, chelated to a chromatographic matrix to fractionate serum proteins. the metal ions most often used in imac are those of first-row transition metals (zn, ni, cu, and fe), chelated by iminodiacetate (ida). immobilized on the resin, they discriminate among the affinities exhibited by functional groups on the surface of proteins. these interactions between particular surface amino acids and immobilized metal ions provide the basis for metal-affinity protein separation. to a first approximation, proteins are retained on metal-affinity columns according to the number of accessible histidines. histidine is not a common residue in protein sequences; it accounts for only 2.1% of amino acids in globular proteins [68]. thus, the chance that a desired protein, engineered to contain additional histidines at its c- or n-terminus, will be preferentially separated from other proteins is quite good.
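The argument that an engineered histidine tag is distinctive can be made roughly quantitative. The sketch below assumes residues occur independently at the quoted 2.1% histidine frequency, and it assumes a six-residue tag purely for illustration (the tag length is not taken from the text above). Both function names are hypothetical, and real proteins are not random sequences, so the numbers are order-of-magnitude guides only.

```python
def run_probability(p_residue: float, run_length: int) -> float:
    """Probability that `run_length` consecutive positions all carry the residue,
    assuming independent positions (a simplifying assumption)."""
    return p_residue ** run_length


def expected_runs(protein_length: int, p_residue: float, run_length: int) -> float:
    """Expected number of start positions at which such a run begins."""
    windows = max(protein_length - run_length + 1, 0)
    return windows * run_probability(p_residue, run_length)


if __name__ == "__main__":
    p_his = 0.021      # histidine frequency quoted in the text
    tag_length = 6     # assumed tag length, for illustration only
    print(f"P(run of {tag_length} His at a given position) = "
          f"{run_probability(p_his, tag_length):.2e}")
    print(f"Expected runs in a 400-residue protein = "
          f"{expected_runs(400, p_his, tag_length):.2e}")
```

Under these assumptions a natural protein is very unlikely to carry a comparable histidine run by chance, which is the intuition behind expecting a tagged product to be retained preferentially.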
a description of the construction of the vp2 baculovirus that expresses his-vp2 (denoted vp2h) is given elsewhere [61]. monoclonal antibodies (mabs) recognizing ibdv were produced and characterized using protocols previously described. identification of ibdv antigens by modified antigen-capture elisa (ac-elisa) was also carried out as described by snyder et al. recombinant vp2 proteins from insect cells infected with vedlh-22 (for vp2h) and vedl-8 (for vp2) were indistinguishable when tested with the array of mabs by ac-elisa. however, these two proteins (vp2, vp2h) were distinguished and resolved on a 12.5% sds-polyacrylamide gel, and detected immunologically following western blotting with polyvalent chicken anti-ibdv serum. this indicated that the vp2 protein was fused with histidine residues. the vp2h was purified by affinity chromatography using a ni2+-immobilized resin column provided in a protein purification kit (invitrogen, ca), as described in the methods section. the final elution was monitored by od280 readings of the fractions, using the native-ph elution buffer as a blank. in figure 9, the optical density (od280) of the column eluate is shown for the wash and vp2 eluate steps. the eluted pool (fractions 18-26) from the ph 4 buffer was collected, concentrated and desalted using a 10 kda membrane (amicon), and checked by western blotting. in figure 10a, the purified vp2h protein (molecular weight near 43 kda, lane 2) comigrated with two vp2 proteins: one derived from gls ibdv (lane 1) and the other from sf-9 cells infected by ibdv-7 recombinant baculovirus [4] encoding entire vp2, vp3, vp4, vp2x proteins of ibdv (lane 5). this demonstrates that the vp2h protein can bind the ni2+ ion on the resin and was effectively eluted by the ph 4 elution buffer. the purity of the purified vp2h protein was also checked by sds-page (fig. 10b). the intense band with a molecular weight of about 43 kda showed that this vp2h protein was dominant in the mixture (roughly 40%).
some small proteins or peptides were still present in this purified mixture, which may have been due to partial degradation of vp2h (data not shown) or additional elution by this strong low-ph buffer. in this work, a step change from ph 6.3 to ph 4, which is much lower than the pka value commonly observed for surface histidines (~6.0), may have coeluted some peptides with a stronger binding capacity. because we demonstrated that the recombinant viruses express immunoreactive vp2 proteins, the next step was bench-scale production of vp2h protein. [...] proteins were about 120 and 70 mg/l, respectively. although we have not used imac for a large-volume purification, we have obtained sufficient data to estimate protein production costs using the bioprocess presented. advantages for the recombinant poultry vaccine, as compared to the injection of an avirulent strain or the transfer of maternal antibody to breeder hens, include potential for complete and passive protection, higher reactivity per milligram of vaccine protein, smaller dose quantities, and more convenient processing. at this point, results on the optimum dose composition and quantity are incomplete, so a direct economic comparison between the processes is unavailable. however, given both the attractiveness of the baculovirus insect cell expression system and the optimization research activity, one might anticipate commercialization of ibdv vaccines and other baculovirus-based products in the near future. the work reviewed and presented in this paper highlights specific areas where further advancements will lead to increased productivity and therefore lower processing cost, namely maintaining the maximum cell number and specific protein productivity, developing convenient and stable bioreactor operating strategies, and integrating unique low-cost separation steps.
structural and growth characteristics of infectious bursal disease virus
maternally derived antibody: effect on susceptibility of chicks to infectious bursal disease
genomic structure of the large rna segment of infectious bursal disease virus
infectious bursal disease virus structural proteins expressed in a baculovirus recombinant confer protection in chickens
insect cell culture technology in baculovirus expression systems
expression of foreign genes in insects using baculovirus vectors
baculovirus as vectors for foreign gene expression in insect cells
baculovirus expression vectors: a laboratory manual
large-scale insect cell culture for recombinant protein production
production of (recombinant) baculoviruses in insect-cell bioreactors
optimization of protein-production by the baculovirus expression vector system in shake flasks
assessment of virus production and chloramphenicol-acetyl-transferase expression by insect cells in serum-free and serum-supplemented media using a temperature-sensitive baculovirus
optimization of the production of full-length rcd4 in baculovirus-infected sf9 cells
high-level recombinant protein production in bioreactors using the baculovirus insect cell expression system
a two-stage bioreactor system for the production of recombinant proteins using a genetically engineered baculovirus/insect cell system
culture of insect cells in a helical ribbon impeller bioreactor
expression of epoxide hydrolase in insect cells: a focus on the infected cell
quantification of cell culture factors affecting recombinant protein yields in baculovirus-infected insect cells
effects of oxygen on recombinant protein production by suspension cultures of spodoptera frugiperda (sf-9) insect cells
effects of oxygen/glucose/glutamine feeding on insect cell baculovirus protein expression: a study on epoxide hydroxylase production
interaction of hepatic microsomal epoxide hydrolase derived from a recombinant baculovirus expression system with an azarene oxide and an aziridine substrate analogue
escherichia coli and salmonella typhimurium: cellular and molecular biology
coordinated regulation of a set of genes by glucose and calcium ionophores in mammalian cells
alterations in proteases, protease inhibitors and ecdysone levels: a profile of stress in insects
effect of developmental derangements on the proteolytic and protease-inhibitory activities in galleria mellonella (insecta)
ecdysteroids increase the yield of recombinant protein produced in baculovirus insect cell expression system
production of recombinant sarcotoxin ia in bombyx mori cells
kinetic analysis of protease activity, recombinant protein production, and metabolites of infected sf-9 cells under different do levels
factors influencing recombinant protein yields in an insect cell-baculovirus expression system: multiplicity of infection and intracellular protein degradation
a kinetic analysis of hybridoma growth and metabolism in batch and continuous suspension culture: effect of nutrient concentration, dilution rate, and ph
a structured kinetic modeling framework for the dynamic hybridoma growth and monoclonal antibody production in continuous suspension culture
a model for baculovirus production with continuous insect cell cultures
continuous production of baculovirus in a cascade of insect-cell reactors
continuous β-galactosidase production with a recombinant baculovirus insect-cell system in bioreactors
a structured dynamic model for the baculovirus infection process in insect-cell reactor configurations
a continuous process for the production of baculovirus using insect-cell cultures
a continuous flow bioreactor system for the production of recombinant proteins using the insect cell-baculovirus expression system
continuous insect cell (sf-9) culture with aeration through sparging in a novel low-volume bioreactor
scale-up of insect cell cultures: protective effects of pluronic f-68
response of mammalian cells to controlled growth rates in steady-state continuous culture
the baculovirus-integrated retrotransposon ted encodes gag and pol proteins that assemble into virus-like particles with reverse transcriptase
development of baculovirus triple and quadruple expression vectors: co-expression of three or four bluetongue virus proteins and the synthesis of bluetongue virus-like particles in insect cells
three-dimensional reconstruction of baculovirus expressed bluetongue virus core-like particles by cryo-electron microscopy
synthesis of bluetongue virus (btv) core-like particles by a recombinant baculovirus expressing the two major structural core proteins of btv
analyses of the requirements for the synthesis of virus-like particles by feline immunodeficiency virus gag using baculovirus vectors
assembly and release of hiv-1 precursor pr55gag virus-like particles from recombinant baculovirus-infected insect cells
synthesis of immunogenic, but non-infectious, poliovirus particles in insect cells by a baculovirus expression vector
canine parvovirus empty capsids produced by expression in a baculovirus vector: use in analysis of viral properties and immunization
crystallization of virus-like particles assembled from flock house virus coat protein expressed in a baculovirus system
cell-surface expression and purification of human cd4 produced in baculovirus-infected insect cells
17d yellow fever vaccine virus envelope protein expressed by recombinant baculovirus is antigenically indistinguishable from authentic viral protein
the use of feline herpesvirus and baculovirus as vaccine vectors for the gag and env genes of feline leukemia virus
trypanosoma cruzi flagellar repetitive antigen expression by recombinant baculovirus: towards an improved diagnostic reagent for chagas' disease
structure and expression in baculovirus of the mokola virus glycoprotein: an efficient recombinant vaccine
vaccination against newcastle disease with a recombinant baculovirus hemagglutinin-neuraminidase subunit vaccine
protection of mice against lethal japanese encephalitis with a recombinant baculovirus vaccine
secretion and purification of hepatitis c virus ns1 glycoprotein produced by recombinant baculovirus-infected insect cells
primary structure of the s peplomer gene of bovine coronavirus and surface expression in insect cells
purification of a recombinant protein produced in a baculovirus expression system by immobilized metal affinity chromatography
metal chelate affinity chromatography, a new approach to protein fractionation
imac: immobilized metal ion affinity based chromatography
purification of proteins by imac
the saga of imac and mit
cu(ii)-binding properties of a cytochrome c with a synthetic metal-binding site: his-x3-his in an α-helix
evaluation of the interaction of peptides with cu(ii), ni(ii), and zn(ii) by high-performance immobilized metal ion affinity chromatography
the independent distribution of amino acid near neighbor pairs into polypeptides
group and strain-specific neutralization sites of ibdv defined with monoclonal antibodies
differentiation of infectious bursal disease viruses directly from infected tissues with neutralizing monoclonal antibodies: evidence of a major antigenic shift in recent field isolates
naturally occurring-neutralizing monoclonal antibody escape variants define the epidemiology of infectious bursal disease viruses in the united states
comparative studies on structural and antigenic properties of two serotypes of infectious bursal disease virus
a mathematical model for metal affinity protein partitioning
metal affinity precipitation of protein carrying genetically attached polyhistidine affinity tails

we thank gerard h. edwards and sarah milczanowski for technical assistance.
key: cord-008556-oetrdm8g authors: kozak, marilyn title: regulation of protein synthesis in virus-infected animal cells date: 2008-03-01 journal: adv virus res doi: 10.1016/s0065-3527(08)60265-1 sha: doc_id: 8556 cord_uid: oetrdm8g this chapter summarizes the structural features that govern the translation of viral mrnas: where the synthesis of a protein starts and ends, how many proteins can be produced from one mrna, and how efficiently. it focuses on the interplay between viral and cellular mrnas and the translational machinery. that interplay, together with the intrinsic structure of viral mrnas, determines the patterns of translation in infected cells. it also points out some possibilities for translational regulation that can only be glimpsed at present, but are likely to come into focus in the future. the mechanism of selecting the initiation site for protein synthesis appears to follow a single formula. the translational machinery displays a certain flexibility that is exploited more frequently by viral than by cellular mrnas. although some of the parameters that determine efficiency have been identified, how efficiently a given mrna will be translated cannot be predicted by summing the known parameters. the first section summarizes the structural features that govern the translation of viral mrnas: where the synthesis of a protein starts and ends, how many proteins can be produced from one mrna, and how efficiently. the next section focuses on the interplay between viral and cellular mrnas and the translational machinery. that interplay, together with the intrinsic structure of viral mrnas, determines the patterns of translation in infected cells. the final section points out some possibilities for translational regulation that can only be glimpsed at present, but are likely to come into focus in the future. to keep the project manageable, i have concentrated on animal viruses. plant viruses are mentioned, however, when they provide the best (or sometimes the unique) example of a given mechanism.
the structural requirements for mrna function have been determined by inspection of natural eukaryotic mrnas, followed by manipulation of features that looked suspicious. the general structural characteristics of eukaryotic mrnas have been reviewed previously (kozak, 1983a) and will not be elaborated here. the discovery of the m7g cap on a wide variety of viral and cellular mrnas (shatkin, 1976) was a provocative clue that the mechanism of initiation in eukaryotes differs from that in prokaryotes. although the list of plant virus mrnas that are translated without a cap has grown in recent years, picornaviruses and caliciviruses are still the only animal viruses known to be translated without a cap (nomoto et al., 1976; ehresmann and schaffer, 1979). indeed, the near-indispensability of the m7g cap may be inferred from the fact that animal viruses that replicate in the cytoplasm routinely encode their own capping and methylating enzymes. this is true not only for poxviruses (moss et al., 1976), where the vast coding capacity of the genome allows room for frills, but also for reovirus (furuichi et al., 1976), vesicular stomatitis virus (vsv) (abraham et al., 1975), and alphaviruses (cross, 1983), in which the small size of the genome limits the encoded proteins to the barest essentials. the m7g cap enhances both the stability and translatability of mrnas. transcripts that are capped but not methylated are stable, but nonetheless untranslatable (furuichi et al., 1977; horikami et al., 1984). much of the discussion that follows assumes that a scanning mechanism underlies the initiation process. the scanning model postulates that a 40 s ribosomal subunit binds initially at the 5' end of the mrna and migrates until it reaches the first aug triplet. if the first aug codon occurs in the optimal context (accaugg; see kozak, 1981a, 1984a, 1986a), all 40 s subunits stop there, and that aug serves as the unique site of initiation.
if the first aug triplet occurs in a suboptimal context, only some 40 s subunits will initiate there; some will migrate beyond that site and initiate at an aug codon farther downstream. the scanning hypothesis is not universally accepted, but it is supported by extensive evidence from many laboratories (reviewed by kozak, 1980, 1981b, 1986a). two alternative models have been suggested from time to time. one is that ribosomes bind directly to the sequence around the aug codon, but experiments designed to distinguish between scanning and direct binding do not support the latter (kozak, 1979a, 1983b). a hybrid mechanism in which 98% of the ribosomes scan from the 5' end, while 2% of the binding occurs directly at the aug start site, is difficult to rule out, however. another suggestion is that secondary structure might guide the choice of aug codons (this idea is evaluated a few paragraphs hence). one consequence of the scanning mechanism is that deleting the "ribosome binding site" (i.e., the normal initiator codon and flanking sequences) will not abolish translation; ribosomes will simply use the next aug codon downstream, which, in some cases, has been shown to direct the synthesis of a biologically active, truncated protein (downey et al., 1984; halpern and smiley, 1984; katinka and yaniv, 1982). conversely, introducing spurious upstream aug codons will reduce initiation from the authentic start site, a prediction that has been verified many times with laboratory constructs (bandyopadhyay and temin, 1984; lomedico and mcandrew, 1982; smith et al., 1983; zitomer et al., 1984) as well as with naturally occurring variant forms of mrna from the early and late regions of simian virus 40 (sv40) (barkan and mertz, 1984). when the context around an upstream aug codon conforms closely to the accaugg consensus sequence, initiation from the downstream site is suppressed almost completely (kozak, 1983b, 1984b; liu et al., 1984; m. scott and h.
varmus, personal communication). when the context around the upstream aug codon is less ideal, initiation from the downstream site is reduced but not abolished (kozak, 1986a). stated in a more positive way, when the 5'-proximal aug codon occurs in a suboptimal context, ribosomes are able to initiate at the first and the second aug codons. this "leaky" scanning process is further explained and documented in section ii,c. the scanning mechanism predicts that translation should be downregulated by any ploy that interferes with the linear movement of 40 s ribosomal subunits from the cap to the aug codon: binding of a protein to the 5'-noncoding sequence; introducing spurious out-of-frame aug codons, as mentioned above; annealing cdna fragments that are complementary to the 5'-untranslated sequence (haarr et al., 1985; perdue et al., 1982; privalsky and bishop, 1982; willis et al., 1984); or creating a stable hairpin anywhere upstream from the aug codon, as described in the next section. on the other hand, the simplicity of the scanning mechanism suggests few possibilities for enhancing translation. although we know what features should be absent from the leader for a message to be efficient, the only features known to contribute in a positive way are the m7g cap and the sequence directly flanking the initiator codon. a promising place to look for other positive effectors is the tripartite leader on late adenovirus mrnas. transposition of the 200-nucleotide tripartite leader sequence to heterologous mrnas stimulates their translation 20-fold (berkner and sharp, 1985; logan and shenk, 1984), but the feature responsible for the stimulation has not been pinpointed, and could, disappointingly, turn out to be a long sequence that simply lacks all of the negative effectors cited above.
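The scanning and "leaky" scanning rules described above can be illustrated with a toy model. This is a sketch under strong simplifying assumptions, not the author's method: an aug is treated as "optimal" only if its flanking sequence matches the accaugg consensus exactly, every other context is lumped together as "suboptimal", and the fraction of 40 s subunits that scan past a suboptimal aug (`leak`) is a hypothetical parameter with no experimental basis. Secondary structure, termination, and reinitiation are ignored.

```python
CONSENSUS = "ACCAUGG"  # optimal context quoted in the text; AUG sits at offset 3


def find_augs(mrna: str) -> list[int]:
    """Indices of every AUG triplet, in 5'-to-3' order."""
    return [i for i in range(len(mrna) - 2) if mrna[i:i + 3] == "AUG"]


def is_optimal(mrna: str, aug_index: int) -> bool:
    """True if the AUG and its flanking bases match the consensus exactly."""
    return aug_index >= 3 and mrna[aug_index - 3:aug_index + 4] == CONSENSUS


def scan(mrna: str, leak: float = 0.5) -> list[tuple[int, float]]:
    """Return (AUG index, fraction of subunits initiating there).

    `leak` is the hypothetical fraction of subunits that bypass a
    suboptimal AUG and keep scanning downstream.
    """
    remaining = 1.0
    initiation = []
    for i in find_augs(mrna):
        if remaining <= 0.0:
            break
        if is_optimal(mrna, i):
            initiation.append((i, remaining))  # all subunits stop here
            remaining = 0.0
        else:
            initiation.append((i, remaining * (1.0 - leak)))
            remaining *= leak
    return initiation


if __name__ == "__main__":
    strong = "GGACCAUGGCUUAUGAAA"  # first AUG in the consensus context
    weak = "GGUUUAUGCCUUAUGAAA"    # first AUG in a poor context
    print(scan(strong))
    print(scan(weak))
```

In this toy model an optimal first aug captures all scanning subunits, so the downstream aug is silent, whereas a suboptimal first aug lets some subunits reach the second one. The binary optimal/suboptimal split is a design shortcut; the text describes a graded effect of context.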
the impression that the leader sequences on most viral mrnas do not contain unidentified translational "enhancers" is reinforced by the ease with which 5'-noncoding sequences can be deleted without deleterious effects (bendig et al., 1980; spindler and berk, 1984a; villarreal et al., 1979). [the synthesis of polyoma virus t antigen was significantly reduced in only one of the mutants studied by bendig et al. (1980): a mutant in which the deletion extended to within two nucleotides of the aug codon. this fits with evidence from other sources that (only) the nucleotides immediately preceding the aug codon are part of the ribosome recognition sequence.] if our intuition is correct that "extra" 5'-noncoding sequences are more likely to inhibit than to help, the trend toward short 5'-noncoding sequences on many viral mrnas becomes significant (reviewed by kozak, 1981b; see also rosel and moss, 1985). indeed, the 24-nucleotide leader sequence on the mrna that encodes adenovirus polypeptide ix seems to mediate translation more efficiently than the long tripartite leader that has received so much attention (lawrence and jackson, 1982).

secondary structure in viral mrnas might have various effects on translation.

1. one might expect secondary structure to inhibit more when it occurs near the cap, which is the presumptive entry site for ribosomes, than when a hairpin occurs farther downstream, because 40 s ribosomal subunits once bound must be able to melt secondary structure to some extent. (one knows for sure that 80 s ribosomes melt secondary structure during the elongation phase of protein synthesis; the triplet code could not be read linearly otherwise.)
the prediction that 40 s ribosomal subunits can melt their way through secondary structure within the interior of the leader sequence has been verified: introducing a 13-base-pair hairpin (ΔG = -30 kcal/mol) 60 nucleotides downstream from the cap did not impair the translation of preproinsulin mrna in vivo (kozak, 1986b). the effects of secondary structure close to the cap have not yet been tested systematically, but it has been noted that the 5' end of alfalfa mosaic virus rna-4 is unfolded (gehrke et al., 1983) and rna-4 is a notoriously efficient message. godefroy-colburn et al. (1985b) claim more generally that the degree of cap accessibility of the four alfalfa mosaic virus mrnas correlates with their translational efficiency, but the correlation appears weak. the cap was indeed least accessible on rna-3, which ranks lowest in translational efficiency, but the cap was equally accessible on rnas 1, 2, and 4, which differ 15-fold in competitive efficiency (godefroy-colburn, 1985a).

2. although we expect ribosomes to melt secondary structure to some extent, there must be a limit to that ability. whereas a hairpin of -30 kcal/mol at the midpoint of the leader sequence (involving neither the cap nor the aug codon) did not reduce the synthesis of preproinsulin under normal culture conditions, a hairpin of -50 kcal/mol nearly abolished translation (kozak, 1986b). because the hairpin did not encroach on the aug codon, the observed inhibition seems incompatible with the direct-binding hypothesis, but is consistent with the scanning hypothesis. pelletier and sonenberg (1985) have also shown that translational efficiency decreases as secondary structure in the 5'-noncoding region increases.

3. there is no experimental support for the idea that secondary structure orients the cap and the aug codon, thus determining which aug will initiate translation. were that true, denaturation should impair translation; in fact, denaturation often enhances translation (payvar and schimke, 1979).
nor is there support for the idea that downstream cistrons are silent due to conformational constraints: attempts to activate internal initiation sites by denaturing viral mrnas invariably fail (collins et al., 1982; monckton and westaway, 1982). a popular idea is that when secondary structure sequesters the 5'-proximal aug triplet, it might be skipped by ribosomes in favor of the next exposed aug codon (darlix et al., 1982; ghosh et al., 1978; hay and aloni, 1985; nomoto et al., 1982). the results of a direct test contradict that notion, however: when the primary sequence around the 5'-proximal aug codon in a chimeric preproinsulin mrna was favorable for initiation, no translation from a downstream site could be detected irrespective of whether the first aug codon was single stranded or base paired (kozak, 1986b). thus, 40 s ribosomal subunits appear to scan linearly, melting the secondary structure (ΔG of about -30 kcal/mol or less stable) to reach each aug codon in turn. if a hairpin is too stable to be melted (ΔG of about -50 kcal/mol or more stable), the 40 s subunit apparently stalls, but it does not "jump over" the barrier.

4. in some viral mrnas, sequences at the 3' end are complementary, to a limited extent, to those at the 5' end (antczak et al., 1982; dasgupta et al., 1980). that arrangement might be expected to inhibit translation, an expectation that has been confirmed recently using mrnas with artificially constructed terminal complementary sequences (spena et al., 1985). some viruses seem to take measures to preclude such inhibition. whereas the genomic rnas of influenza (robertson, 1979) and bunyaviruses (eshita and bishop, 1984) have complementary 5'- and 3'-terminal sequences, that potentially deleterious structure is not copied into mrna, inasmuch as the 3' terminus of each mrna stops short of the 5' end of the template strand (bouloy et al., 1984; eshita et al., 1985; hay et al., 1977).
arenaviruses also produce mrnas that lack the complementary sequences present at the termini of genomic rna (auperin et al., 1984).

5. incubation in hypertonic culture medium has been used often to study protein synthesis in virus-infected cells (see yates and nuss, 1982, and references therein). hypertonic shock results in the rapid and reversible inhibition of protein synthesis at the level of initiation (saborio et al., 1974). an intermediate concentration of salt or sucrose permits a residual low level of translation, under which circumstance viral protein synthesis nearly always predominates over cellular protein synthesis (cherney and wilhelm, 1979; nuss et al., 1975; oppermann and koch, 1976). it is difficult to deduce the mechanism of this differential response from inspection of natural forms of viral and cellular mrnas. however, a cloned preproinsulin gene has been experimentally converted from hypertonic resistant to hypertonic sensitive by inserting into the 5'-noncoding sequence the oligonucleotide agcttgggccgtggtgg, thereby creating a 13-base-pair hairpin around the aug initiator codon (mutant b13hp in kozak, 1986b). a reasonable interpretation is that the hairpin structure (ΔG = -30 kcal/mol), which does not inhibit translation under normal culture conditions, is stabilized under hypertonic conditions to the point where it becomes inhibitory. an alternative explanation, currently under investigation, is that the primary sequence of the oligonucleotide insert underlies the enhanced sensitivity of mutant b13hp to hypertonic stress. if the first explanation turns out to be correct, one might suggest by extrapolation that most viral mrnas are less structured near the 5' end than are most cellular mrnas, and for that reason viral mrnas are more resistant to hypertonic stress.
herpes simplex virus mrnas are a notable exception: they are unusually sensitive to hypertonic inhibition (stevely and mcgrath, 1978), perhaps because their high g + c content generates extensive secondary structure.
6. the mechanism of action of interferon is too complex to discuss here, except to mention that double-stranded regions of rna, either free or incorporated into the mrna structure (debenedetti and baglioni, 1984; knight et al., 1985), are critical in activating and targeting the interferon-induced enzymes. the deleterious effects of interferon on the stability and translation of viral mrnas have been reviewed by lengyel (1982).
the monocistronic rule means more than simply producing one protein from one mrna. a number of viral mrnas encode two or more proteins in nonoverlapping reading frames; with few exceptions, however (see section ii,c), it is exclusively the 5'-proximal cistron that gets translated (shih and kaesberg, 1973; reviewed by kozak, 1978). to cope with the usual inability of eukaryotic ribosomes to initiate at internal sites in mrna, the genomes of animal viruses are punctuated at one of four levels, as described below. the structures of plant virus rna genomes and their patterns of expression have been reviewed by davies and hull (1982), and they are not exceptional. the mode of expression of cauliflower mosaic virus, which has a circular dna genome, is exceptional indeed, and is discussed in section ii,c. the following descriptions are generalized; additional details and references have been published elsewhere (kozak, 1981b). each virus is classified according to its major mode of punctuation, which is often not the exclusive mode.
1. the genome itself is segmented. each segment typically consists of one gene, which is transcribed end to end, or nearly so. there is usually a simple correspondence between the size of the mrna and the size of the mature protein derived therefrom.
reoviruses, influenza viruses, and bunyaviruses fit this description. arenaviruses and nodaviruses (e.g., black beetle virus) have segmented rna genomes but rely also on other mechanisms.
2. the viral genes are linked, but internal start and stop sites for transcription generate a separate mrna for each protein. punctuation is accomplished for the most part at the level of transcription rather than by posttranscriptional processing. again, the size of the mrna usually corresponds to the size of the mature protein.3 this group includes poxviruses, herpesviruses, rhabdoviruses (vsv), and paramyxoviruses.
3. here the genome lacks internal transcriptional and translational stop/start sites. the genome-sized mrna is translated end to end to produce a "polyprotein," more than 2000 amino acids in length, which is cleaved to generate the mature viral proteins. the extreme situation in which all viral proteins are derived from a single precursor is characteristic of picornaviruses and flaviviruses (castle et al., 1986; c. m. ...). [rice et al. (1986) present a lucid explanation of some older data that had suggested a different translational strategy for flaviviruses.] posttranslational cleavage supplements other modes of punctuation in many animal virus systems, and is especially important in the maturation of retrovirus and alphavirus proteins.
4. the fourth, rather heterogeneous group of viruses characteristically produce big transcripts that cannot be translated completely: ribosomes bind at the 5' end and translate only up to the first stop codon, and the downstream cistrons in these polycistronic mrnas are usually silent. the downstream cistrons become translatable when they are moved closer to the 5' end, which is accomplished by producing truncated or subgenomic mrnas. various mechanisms generate these shortened transcripts. conventional splicing of nuclear transcripts is used by retroviruses, papovaviruses, and parvoviruses.
adenoviruses also use splicing, on a rather grand scale (nevins, 1982; ziff, 1985). coronaviruses use a novel cytoplasmic fusion mechanism to transfer a common leader sequence to each of six, progressively shorter, subgenomic mrnas (budzilowicz et al., 1985; lai et al., 1984; spaan et al., 1983). in the case of alphaviruses and parvoviruses, initiation at an internal transcriptional promoter produces the subgenomic mrnas that encode the major capsid proteins (brzeski and kennedy, 1978; janik et al., 1984).
3 whereas the molecular weight correlation between mrnas and proteins holds for most early vaccinia virus genes (cooper and moss, 1979; hruby and ball, 1982), late vaccinia mrnas are notoriously heterogeneous in size, apparently because transcription does not terminate discretely (mahr and roberts, 1984; rosel and moss, 1985). the 3'-proximal portions of such transcripts are assumed to be translationally silent. in the case of herpes simplex virus, the size of many mrnas corresponds simply to the size of the encoded protein, but more complex mrnas also exist (wagner, 1985); the functional significance of the latter is not yet clear.
hepadnaviruses (hepatitis b and others) cannot yet be classified, since mrnas have been identified for some but not all of the viral proteins (tiollais et al., 1985). the major subgenomic mrna is initiated at an internal promoter, and there is no evidence for splicing. the heterogeneous initiation sites for transcription in hepatitis viruses might be a means to regulate translation, as suggested by laub et al. (1983) and enders et al. (1985). arenaviruses are a special case. the genomic s-rna segment codes for two structural proteins, n and gpc, but only gpc can be translated conceptually directly from the 5' half of virion rna; the 3' half of the sequence is an antisense version of the n gene (auperin et al., 1984). thus, a subgenomic complementary mrna is produced to translate the n protein.
although gpc could in theory be translated from the full-length viral s-rna, a subgenomic rna corresponding to the 5' portion of s-rna is also present in infected cells. this might be necessary to avoid "hybrid arrest," which could occur if translation were attempted with full-length viral and antiviral transcripts.
whereas most eukaryotic mrnas are functionally monocistronic, certain viral mrnas have been shown to synthesize two separately initiated polypeptides. with few exceptions4 we can rationalize the production of two proteins from a single mrna by invoking one of the following mechanisms, each of which is experimentally supported. these mechanisms (with the exception of reinitiation) might be considered errors, i.e., the results of imprecise execution of some step in translation. a system that functions with less-than-perfect fidelity apparently gains the advantage of versatility.
4 the mechanisms outlined herein cannot explain the (inefficient) internal initiation that occurs in a mutant form of rous sarcoma virus src mrna (mardon and varmus, 1983). poliovirus mrna also initiates translation at more than one site, at least in vitro (celma and ehrenfeld, 1975), but one cannot attempt an explanation until the sites have been identified. [dorner et al. (1984) claim to have localized an internal initiation site, but they did not prove that the template rna was intact. the fact that they could demonstrate "internal initiation" in extracts from reticulocytes but not from poliovirus-infected cells hints of an artifact.] because the poliovirus 5'-noncoding sequence has eight aug triplets upstream from the major translational start site (kitamura et al., 1981; racaniello and baltimore, 1981), spurious initiation events are expected in that region. on the other hand, the upstream aug triplets would not preclude initiation of the polyprotein from the ninth aug codon, because seven of the upstream aug triplets lie in a weak context; the only one that lies in a favorable context is followed by an in-frame terminator codon, which would allow reinitiation. the same explanations are compatible with the genomic sequences of many other picornaviruses (callahan et al., 1985; forss et al., 1984; linemeyer et al., 1985). the two structural peculiarities of picornavirus mrnas (presence of upstream aug codons and absence of a cap) might be related. it is possible that, when cap binding protein(s) are not part of the 40 s initiation complex, aug codons in suboptimal contexts are recognized even less efficiently than usual, and the barrier effect of the upstream aug codons in poliovirus mrna would thus be minimized. perhaps p220 is cleaved (see section iii,d) to directly facilitate viral translation, rather than to inhibit host translation.
the scanning model postulates that, when the 5'-proximal aug codon occurs in a suboptimal context, ribosomes will initiate at that site as well as at another aug codon farther downstream. several nucleotides near the aug codon are known to affect the efficiency of initiation, but the most important determinants are a purine (preferably a) in position -3, and g in position +4; we can predict the occurrence of leaky scanning by focusing on those two positions. in each of the bifunctional viral mrnas listed in fig. 1, the more 5'-proximal initiation site lies in a suboptimal context, thus rationalizing the ability of some ribosomes to reach the start of the second cistron. (in sv40 16 s mrna, influenza b, and adenovirus-12, which are bracketed in the center of the figure, the sequence flanking the first aug codon is not really weak, but it is not perfect; thus, some 10-20% of the 40 s subunits are expected to bypass the first aug codon and reach the second. that may be adequate to produce the second protein in the case of adenovirus and influenza virus, but it does not seem adequate to explain the synthesis of sv40 vp1, which is an abundant protein. in sv40 16 s mrna, however, ribosomes can reinitiate at the vp1 start site, as explained below.) the scanning model does not necessitate that the second aug codon lie in a stronger context than the first, although that usually is the case; it is necessary only that the first aug codon lie in a context that is less than optimal.
each mrna listed in the upper part of fig. 1 produces two unrelated proteins, translated from two different reading frames. the mrnas in the lower part of the figure initiate at two aug codons in the same reading frame, thereby producing long and short versions of the same protein. whereas the relaxed scanning mechanism accounts qualitatively for the dual function of the mrnas listed in fig. 1, the model is not very good at predicting the frequency with which ribosomes initiate at each site. one problem is that the ratio of initiation at sites 1 and 2 in vivo is often different from that in vitro (bos et al., 1981; clarke et al., 1985; dethlefsen and kolakofsky, 1983), and the ratio changes when salt or other reaction conditions are varied. that is hardly surprising because the fidelity of initiation in vitro is sensitive to reaction conditions (jense et al., 1978; kozak, 1979b; petersen and hackett, 1985). on the other hand, the in vivo ratio might be skewed if one protein is less stable or less efficiently extracted than the other. in addition to the obvious economy of using one mrna to make two proteins, in a fixed ratio, their simultaneous production might allow the polypeptides to interact as the nascent chains grow. it would be amusing to determine whether complementation is less efficient when two proteins that are normally translated from one mrna are instead synthesized from separate templates.
the hundreds of eukaryotic cellular genes that have been sequenced to date invariably initiate translation at aug.
when alternate initiator codons were tested experimentally, however, they were not inert. eukaryotic ribosomes can initiate at gug (kozak, unpublished data) and uug (zitomer et al., 1984), but the efficiency is at least 30-fold lower than at an aug codon in the same context; initiation at gug, uug, or other nonstandard codons is (barely) detectable only when the codon is preceded by the optimal a in position -3 (m. k., unpublished data). there is credible, albeit not definitive, evidence that alternate initiator codons are used in two virus systems to produce minor virion components. one is adeno-associated virus capsid protein b, which probably initiates at an acg codon that lies upstream from the major aug start site (becerra et al., 1985). [an acg codon in coliphage t7 mrna is also recognized as an initiator codon by wheat germ ribosomes in vitro (anderson and buzash-pollert, 1985). although the template is unnatural in that case, the evidence for initiation at acg is irrefutable.] the second natural example is gpr80gag, a nonessential but nonetheless conserved form of gag produced by moloney murine leukemia virus (edwards and fan, 1980; the nucleotide sequence of the region is given by shinnick et al., 1981). gpr80gag is analogous to the elongated form of gag produced by feline leukemia virus, except that the latter is presumably initiated at an upstream aug codon in a weak context (fig. 1), whereas in murine leukemia virus the most likely initiation site(s) are upstream gug and/or cug codons that lie in a favorable context. charles van beveren has shown that gpr80gag is produced not only by moloney virus, but also by two other murine leukemia viruses that have no aug codons upstream from the major (pr65gag) start site (personal communication). thus, there is no alternative to believing that nonstandard codon(s) are used to initiate the elongated form of gag. experiments to pinpoint the start sites are in progress in van beveren's laboratory.
fig. 1 legend: the left-most column shows that in most cases the sequence around the first functional initiator codon is suboptimal with respect to the nucleotides in positions -3 and +4, thus explaining how some 40 s ribosomal subunits can reach the second initiation site. in the case of the coronaviruses and black beetle virus, the indicated proteins are predicted but have not yet been demonstrated. although the 6.7-kda protein predicted from adenovirus region e3 has not been seen, its ribosome binding site has been proven functional by demonstrating the synthesis of a fusion protein from an appropriately engineered mutant virus (wold et al., 1986). all of the other proteins listed here have been detected in infected cells, and most have also been synthesized in cell-free translation systems.
notes: (a) since the 5' ends of hepatitis virus and some bunyavirus mrnas are heterogeneous (laub et al., 1983; patterson et al., 1983), the second protein could be translated, without invoking leaky scanning, from the portion of the mrna population that lacks upstream aug codons.
(b) influenza virus rna-6 is unusual in that the first and second aug codons are separated by only four nucleotides (shaw et al., 1982), but that probably does not explain the ability of ribosomes to initiate at both sites. in a version of preproinsulin mrna in which the first and second aug codons (both in the perfect context for initiation) were separated by five nucleotides, ribosomes were unable to initiate at the second member of the pair (kozak, 1984b).
(c) because the first reading frame terminates upstream from the second in sv40 mrnas, ribosomes could reach the start site for vp1 by a combination of leaky scanning and reinitiation.
(d) the arrangement of aug codons near the 5' end of sv40 late 19 s mrna is gccaugg (out-of-frame at position 253-255) ... uccaugg (start of vp2) ... ccuaugc (out-of-frame at position 679-681) ... ggaaugg (start of vp3). we postulate that leaky scanning allows some 40 s ribosomal subunits to bypass the first aug triplet (position 253-255) in order to initiate vp2. that does not contradict the fact that, in 16 s mrna, the aug codon in position 253-255 initiates the agnogene product. by extrapolating from the systematic measurements carried out in another system (kozak, 1986a), we would expect 80-90% of the ribosomes to initiate at the aug codon in position 253-255, while 10-20% should reach the next aug; that seems sufficient to produce vp2, which is a minor component of the virion. synthesis of vp3 might depend on leaky scanning (bypassing the first three aug codons) as well as reinitiation, inasmuch as ribosomes that initiate at the first aug codon would terminate before reaching the vp3 start site.
(e) the nucleotide in position -3 varies among strains of foot-and-mouth disease virus, and the relative yields of p20a and p16 vary accordingly (clarke et al., 1985).
(f) it is likely that two annaug sequences farther upstream are also used to produce longer forms of surface antigen (heermann et al., 1984). most transcripts lack the extreme upstream aug codons, however.
(g) the two proteins postulated for feline leukemia virus are indeed seen in infected cells, but the mechanism of synthesis postulated here has not been proven. infrequent initiation at weak, upstream aug codons is also suspected with mrnas from some other retroviruses (gruss et al., 1981; willumsen et al., 1984).
references: sendai virus: giorgi et al., 1983. measles virus: bellini et al., 1985. reovirus: cashdollar et al., 1985; ernst and shatkin, 1985; kozak, 1982; sarkar et al., 1985. bunyaviruses: eshita and fuller et al., 1983 pasek et al., 1979; persing et al., 1985. feline leukemia virus: laprevotte et al., 1984. herpes simplex: haarr et al., 1985; wagner et al., 1981.
although reinitiation was documented years ago in prokaryotes, there was no reason to suspect a similar phenomenon in eukaryotes until laboratory manipulations with cloned genes yielded some results that are difficult to explain otherwise.5 the principal observation is that eukaryotic ribosomes can initiate at an internal aug codon, when another aug codon occurs upstream and in a highly favorable context (thus ruling out leaky scanning), provided that a terminator codon occurs in-frame with the first aug codon and upstream from the second (kozak, 1984b; liu et al., 1984; m. scott and h. varmus, personal communication). we envision that when a complete "minicistron," i.e., an aug triplet followed by a terminator codon, occurs upstream, it is translated; but the 80 s ribosome does not detach at the terminator codon. rather, the 60 s subunit dissociates while the 40 s subunit remains bound to the message and resumes scanning. when the 40 s subunit reaches the next aug codon, it reinitiates translation. reinitiation is more efficient when the terminator codon precedes, rather than when it overlaps, the aug codon (m. kozak, unpublished). with respect to natural mrnas rather than laboratory constructs, elegant genetic manipulations implicate reinitiation in the translation of rous sarcoma virus src mrna (hughes et al., 1984) and cauliflower mosaic virus mrna (dixon et al., 1986). the latter is the most striking example to date of a functionally polycistronic mrna in eukaryotes. the overlapping arrangement of cistrons rules out the possibility of reinitiation with many other viral mrnas (contreras et al., 1977; meshi et al., 1983; schwartz et al., 1983). however, in some instances in which adjacent cistrons do not overlap, and reinitiation is therefore expected, it has not been observed (barker et al., 1983; goelet et al., 1982; knowland, 1974; ou et al., 1982).
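the two routes by which a scanning 40 s subunit might get past an upstream aug codon (leaky scanning past a suboptimal context, or reinitiation after a terminated minicistron) can be illustrated with a short python sketch. this is only an illustration under stated assumptions: the function names, the 0-based indexing, the three-level labels ("favorable", "intermediate", "weak"), and the toy sequence are hypothetical and do not come from the text, which specifies only that a purine at -3 and g at +4 are the key positions and that the terminator must lie upstream from the second aug codon.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def normalize(seq):
    """Upper-case the sequence and write it as RNA (T -> U)."""
    return seq.upper().replace("T", "U")


def aug_context(mrna, aug_index):
    """Label the context of the AUG whose A is at aug_index (0-based).

    Only positions -3 and +4 are inspected (A of AUG is +1).
    The three labels are an illustrative simplification.
    """
    seq = normalize(mrna)
    minus3 = seq[aug_index - 3] if aug_index >= 3 else ""
    plus4 = seq[aug_index + 3] if aug_index + 3 < len(seq) else ""
    hits = int(minus3 in ("A", "G")) + int(plus4 == "G")
    return ("weak", "intermediate", "favorable")[hits]


def first_inframe_stop(mrna, aug_index):
    """Return the index of the first in-frame terminator, or None."""
    seq = normalize(mrna)
    for i in range(aug_index + 3, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def upstream_augs(mrna, target_aug):
    """Describe every AUG upstream of target_aug.

    Each entry is (index, context label, is_minicistron), where
    is_minicistron means the in-frame terminator ends before target_aug,
    so reinitiation at target_aug is conceivable.
    """
    seq = normalize(mrna)
    report = []
    pos = seq.find("AUG")
    while pos != -1 and pos < target_aug:
        stop = first_inframe_stop(seq, pos)
        minicistron = stop is not None and stop + 3 <= target_aug
        report.append((pos, aug_context(seq, pos), minicistron))
        pos = seq.find("AUG", pos + 1)
    return report


# hypothetical toy mRNA: a favorable AUG heading a minicistron,
# then a weak AUG, then the downstream start site at index 25
toy = "ACCAUGGCUUAAGGCCUAUGCCACCAUGG"
for entry in upstream_augs(toy, 25):
    print(entry)
```

in this toy case the first upstream aug codon is in a favorable context but heads a minicistron (reinitiation is conceivable), whereas the second lies in a weak context (leaky scanning is conceivable). the sketch says nothing about how efficiently either route operates.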
reinitiation, together with leaky scanning, could theoretically account for translation of the sv40 agnogene protein and vp1 from the same mrna, although neither mechanism has been experimentally demonstrated with sv40. (the simultaneous occurrence of two phenomena complicates the task of demonstrating either one.)
5 the alternative to reinitiation is to postulate that eukaryotic ribosomes can initiate directly at an internal aug codon, and that they usually fail to do so only because the downstream site is occluded by the stream of 80 s ribosomes advancing from upstream. occlusion indeed occurs during the translation of polycistronic prokaryotic transcripts, but the inhibitory effect of an overlapping upstream cistron is sometimes only two- or threefold (das and yanofsky, 1984; hoess et al., 1980). berkhout et al. (1985) claimed to see complete inhibition of translation of the ms2 lysis protein when the coat protein cistron overlapped, but the unknown sensitivity of their biological assay complicates the interpretation. moreover, their claim that a strong upstream initiation site (for coat protein) suppresses initiation from the much weaker site for lysis protein hardly compares with the situation in eukaryotes, where an upstream aug codon can completely suppress initiation from an equally favorable downstream site (kozak, 1983b, 1984b). the essential difference between the occlusion and reinitiation mechanisms is that the former postulates direct binding of ribosomes to internal aug codons, while the latter prohibits such binding. there is experimental evidence against direct binding (kozak, 1979a, 1983b) and against occlusion (kozak, 1984b).
reinitiation is expected within the leader region of rous sarcoma virus genomic rna, where three small open reading frames (orfs), one of them headed by an aug codon in a highly favorable context, precede the gag coding sequence (schwartz et al., 1983).
it has been difficult to demonstrate synthesis of the predicted leader peptides, perhaps because their small size makes them unstable. with admirable persistence, however, hackett et al. (1986) have devised a sensitive assay with which they have detected small amounts of the peptide encoded in the first minicistron of rous sarcoma virus. parenthetically, when one is designing experiments to probe the function of a particular viral or cellular product, one must remember that introducing a nonsense codon near the beginning of a gene might not abolish its function. if an in-frame aug codon occurs downstream from the nonsense codon, ribosomes will probably reinitiate and the truncated polypeptide might be functional. the mechanism by which reverse transcriptase is synthesized has long puzzled retrovirologists. the pol coding sequence is not preceded by an initiator codon; rather, reverse transcriptase is derived by cleavage from a joint gag-pol precursor (murphy et al., 1978; oppermann et al., 1977) . the problem is that the genomic arrangement of gag and pol sequences would seem to preclude their joint translation. in avian retroviruses, gag and pol are in different, partially overlapping, reading frames (schwartz et al., 1983) ; in murine retroviruses, gag and pol are in the same frame but are separated by a terminator codon (shinnick et al., 1981) . in both cases, the solution involves a translational "error." with avian retroviruses, about 5% of the ribosomes shift reading frames somewhere near the end of the gag sequence, thereby producing from one message both gag and a small amount of the gag-pol fusion protein. jacks and varmus (1985) have shown beyond reasonable doubt that frameshifting occurs near the gag-pol junction in a cell-free translation system from reticulocytes. 
by using mrna that was transcribed in vitro from cloned rous sarcoma virus dna, they excluded the possibility that a low-abundance, spliced transcript served as the template for the fusion protein. inspection of the gag-pol junction sequences in several other retroviruses leads one to expect that frameshifting is not limited to the avian system. neither is it limited to eukaryotes, of course. frameshifting occurs under intriguing circumstances in a few bacterial and phage genes (craigen et al., 1985; dunn and studier, 1983; kastelein et al., 1982). the excitement that accompanied the old discovery of a "readthrough" version of coliphage qβ coat protein (weiner and weber, 1973) has been rekindled recently by finding a similar phenomenon in eukaryotic systems. in murine retroviruses, for example, the gag and pol sequences are separated by a single uag terminator codon, the occasional suppression of which generates a gag-pol fusion protein. the first hint of this came from supplementing a cell-free translation system with yeast suppressor trna, which indeed enhanced the synthesis of the gag-pol precursor (philipson et al., 1978). the notion was confirmed for both murine and feline leukemia viruses when yoshinaka et al. (1985a,b) directly determined the amino acid sequence of the protease that constitutes the nh2-terminal portion of the pol gene product. suppression of a terminator codon is not peculiar to retroviruses, for it occurs also with alphaviruses (lopez et al., 1985; strauss et al., 1983), tobacco mosaic virus (pelham, 1978), and probably carnation mottle virus (guilley et al., 1985). suppression of the uag codon in tobacco mosaic virus rna has been traced to the major tyrosine-specific trnas which, in tobacco cells, have the anticodon sequence gψa (beier et al., 1984a,b). the most abundant trnatyr from wheat germ has the highly modified queuine base (q) in place of g in the wobble position of the anticodon, and it is not able to suppress.
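the two translational "errors" described above, a shift of reading frame near the end of gag and occasional suppression of a terminator codon, can be pictured with a small python sketch that reads codons from a message. everything here is illustrative: the function name, its parameters, and the 15-nucleotide toy message are hypothetical, and the default shift of -1 is an assumption (the text states only that ribosomes shift reading frames, not in which direction).

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def read_codons(mrna, start, shift_at=None, shift=-1, readthrough_at=None):
    """Read codons from `start` until a terminator or the end of the message.

    shift_at       -- index at which the ribosome slips once by `shift` nt
    readthrough_at -- index of one terminator codon that is suppressed
                      (decoded instead of ending translation)
    All indices are 0-based; both options are hypothetical knobs.
    """
    seq = mrna.upper().replace("T", "U")
    codons = []
    i = start
    while i + 3 <= len(seq):
        if shift_at is not None and i == shift_at:
            i += shift        # change reading frame once
            shift_at = None   # never shift a second time
            continue
        codon = seq[i:i + 3]
        if codon in STOP_CODONS and i != readthrough_at:
            break
        codons.append(codon)
        i += 3
    return codons


toy = "AUGAAAUAGGCCUAA"  # hypothetical message, not a retroviral sequence

print(read_codons(toy, 0))                        # normal termination at UAG
print(read_codons(toy, 0, readthrough_at=6))      # UAG suppressed, same frame
print(read_codons(toy, 0, shift_at=6, shift=-1))  # frame changed before UAG
```

in both altered readings the ribosome continues past the point where the short product ends, which is the sense in which a small fraction of ribosomes could yield a fusion protein while most yield the upstream product alone.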
minor differences in trna structure can therefore be an important determinant of host range for some viruses. in one sense, suppression solves the problem of how to produce a full-length protein from an interrupted coding sequence. but that probably misplaces the emphasis. the real problem might be how to produce only a small amount of an essential protein that might be toxic if overproduced. an inefficient mechanism, such as suppression or frameshifting, is an ideal solution.
whereas the features described in the preceding section are intrinsic to viral mrnas, and can be demonstrated readily in a "universal" reticulocyte lysate, the translation of viral mrnas in vivo is influenced by specific conditions that prevail in the cytoplasm of infected cells. the way in which the translational machinery is partitioned between viral and host mrnas is one important consideration. because the literature concerning inhibition of host protein synthesis by animal viruses has already been reviewed at length (fraenkel-conrat and wagner, 1984; kaariainen and ranki, 1984; shatkin, 1983), i shall be selective in my coverage. an overview of the phenomenology is presented in table i. the general mechanisms of host shutoff defined by these phenomena are described briefly in sections b and c, which are followed by a detailed discussion of two viruses (poliovirus and adenovirus) that seem to merit more attention. the phenomenon of host shutoff is not as widespread as might appear from table i. retroviruses, paramyxoviruses, parvoviruses, and flaviviruses do not suppress host translation, and papovaviruses actually stimulate host protein synthesis. because host shutoff is interesting, and because it is easier to detect viral protein synthesis against a clean background, virologists understandably have focused on systems that demonstrate the phenomenon. the inhibition of host protein synthesis may be of more interest to virologists than to viruses, however.
in many cases, the yield of infectious progeny from a virus that fails to shut off host protein synthesis is the same as from another virus strain (or the same virus in a different cell line) in which host protein synthesis is obliterated (detjen et al., 1982; gillies and stollar, 1982; jen and thach, 1982; lodish and porter, 1981; minor et al., 1979; munemitsu and samuel, 1984; read and frenkel, 1983; sharpe and fields, 1982). a virus strain that suppresses host macromolecular synthesis sometimes replicates faster in culture than one that does not, however. whether the inhibition of host protein synthesis is beneficial or harmful or irrelevant to the virus during the course of natural infections is not known. in short, with a few viruses inhibition of host protein synthesis might be a strategic move, necessary for efficient expression of viral genes, but no unequivocal example can be cited. in most instances, host shutoff is likely to be an unintentional side effect of viral gene expression, an effect of no real value, and possibly even harmful, to the virus. it is interesting that poliovirus replicates better during coinfection with cytomegalovirus than during single infection; cytomegalovirus stimulates the cell functions that are turned off by poliovirus (furukawa et al., 1978)! there are examples of nonpermissive virus-cell systems in which macromolecular synthesis is inhibited so effectively that neither host nor viral proteins can be made (brown and moyer, 1983; drillien et al., 1978; jones et al., 1982). in such cases the wild-type virus must have a way to throttle the shutoff mechanism. that notion will be pursued in the section on adenoviruses. throughout this section i have tried to point out wrinkles in the data, uncertainties in some popular interpretations, and alternative mechanisms.
this critical slant is intended not to minimize the value of the work that has been done, but to stimulate reconsideration of some paradigms that may have been accepted or rejected too quickly. experiments probing the mechanism of host shutoff are difficult. some of the pitfalls and caveats might be stated at the outset. certain techniques that are used to block virus infection at a particular step, in order to define the extent of viral expression that is needed to effect host shutoff, might inadvertently create a new inhibitory mechanism. in the resulting confusion one learns little about the physiological mechanism of inhibition. for example, treatment of poliovirus-infected cells with guanidine not only blocks the synthesis of progeny rna (which is the intended purpose), but also causes double-stranded rna to accumulate to higher-than-normal levels (baltimore, 1969); and double-stranded rna is a potent inhibitor of translation. experiments showing that a temperature-sensitive mutant virus which makes no progeny rna nevertheless shuts off host protein synthesis as effectively as wild-type poliovirus suffer the same defect. the mutant-infected cells accumulate massive amounts of partially double-stranded "replicative intermediates" which are likely to inhibit translation, irrespective of the normal shutoff mechanism (hewlett et al., 1982). in short, the problem with many experiments is that translation can be inhibited in a variety of ways, and in the process of blocking one pathway, another can be activated. for the same reason, the assumption that the mechanism of host shutoff is the same at high multiplicities of infection as at low multiplicities is untenable. in the case of encephalomyocarditis (emc) virus, the effect on host protein synthesis has been shown to differ qualitatively as a function of multiplicity (alonso and carrasco, 1981).
with poliovirus, the familiar statement that guanidine does not prevent host shutoff is true only when the cells are infected at a high multiplicity (helentjaris and ehrenfeld, 1977). at a normal multiplicity of infection, guanidine does block host shutoff, and therefore it is not clear that viral rna synthesis (which is the guanidine-sensitive step) is uninvolved in the normal mechanism of host shutoff by poliovirus. the specific deficiency or alteration in the translational machinery can sometimes be pinpointed by studying protein synthesis in extracts prepared from virus-infected cells, provided that one appreciates the limitations of that approach. the notion that one can study the mechanism of host shutoff by one virus by using a second virus as a stand-in for host mrna is questionable, because proteins encoded by two differ-
helentjaris and ehrenfeld (1978); nuss et al. (1975). (2) bienz et al. (1978). (3) bossart and bienz (1981); fernandez-munoz and darnell (1976). (4) celma and ehrenfeld (1974). (5) h e a l and . (6) etchison et al. (1982); a. dasgupta, personal communication. (7) helentjaris and ehrenfeld (1977).
emc virus in hela cells: (1) jen et al. (1980). (2) carrasco and lacal (1983). (3) alonso and carrasco (1981). (4) jen et al. (1980). (5) alonso and carrasco (1982b); lacal and carrasco (1982). (6) mosenkis et al. (1985); a. p. rice et al. (1985).
sindbis and sfv: (1) lachmi and kaariainen (1977); wengler and wengler (1976). (2) and (3) simizu (1984). (4) van steeg et al. (1981); wengler and wengler (1976). (5) carrasco and lacal (1983); garry et al. (1979). (6) van steeg et al. (1981). (7) simizu (1984).
vsv: (1) lodish and porter (1981); mcallister and wagner (1976). (2) grinnell and wagner (1985). (3) jaye et al. (1982); lodish and porter (1980); nishioka and silverstein (1978a). (4) lodish and porter (1980); otto and lucas-lenard (1980). (5) francoeur and stanners (1978).
(6) centrella and lucas-lenard (1982) ; dratewka-kos etal. (1984) . reouirus in l cells: (1) zweerink and joklik (1970) . (2) sharpe and fields (1982) . (3) (1984) . skup and millward (1980) . (7) sharpe and fields (1982) . influenza virus: (1-3) inglis (1982) ; . (4) lazarowitz etal. (1971) . (5) carrasco and lacal (1983) . (6) katze et al. ( , 1986 . adenouirus: (1) castiglia and flint (1983) . (2) babich et al. (1983) ; beltz and flint (1979) . (3) babich et al. (1983) . (4) castiglia and flint (1983) . (6) see text. (7) babiss and ginsberg (1984) . vaccinia uirus: (1) hruby and ball (1981) ; oppermann and koch (1976) . (2) salzman etal. (1964) . (3) cooper and moss (1979) ; rice and roberts (1983) . (4) oppermann and koch (1976) ; rice and roberts (1983) . (5) norrie etal. (1982) . (7) bablanian etal. (1981) . herpes simpler: (1) pereira et al. (1977) . (2) fenwick and walker (1978) ; stenberg and pizer (1982) . (3) stage 1-see text; stage 2inglis (1982) ; nishioka and silverstein (1978b) . (4) silverstein and engelhardt (1979) . (5) fenwick and walker (1978) ; hackstadt and mallavia (1982) . (7) fenwick and walker (1978) ; nishioka and silverstein (1978b) ; read and frenkel (1983) . frog virus 3: (6) cited in mosenkis et al. (1985) . all other entries are from willis et al. (1985) . bthe timing of host shutoff relative to the onset of viral translation is indicated. a capitalized entry in this or any other column identifies the probable major mechanism of host shutoff. "coincident" in capitals means that competition probably underlies host shutoff. cfunctional stability is usually evaluated by the ability of host mrnas, extracted from infected cells, to be translated in a cell-free reticulocyte lysate. dthis column indicates the presence or absence of a temporal correlation between the inhibition of host protein synthesis and the influx of sodium ions that often accompanies virus infection (carrasco and lacal, 1983) . 
ea change in cap binding protein was postulated because extracts from sfv-infected cells were unable to translate most capped mrnas, with the exception of emc and sfv late 26s mrnas (van steeg et al., 1981 1. although it is true that efficient mrnas like emc and sfv 26 s can be translated without benefit of the m7c cap, it does not follow that cap binding protein(s1 are deficient in every instance where translation of those mrnas persists in the face of an overall decline. efficient mrnas will be selectively translated when any component of the translational machinery is made limiting. the best evidence for this is the ability of both emc and sfv 26 s mrna to be translated in emc virus-infected cells, in which host translation is drastically inhibited by a mechanism that has not been difined, but that clearly does not involve cap binding protein (mosenkis et al., 1985) . wan steeg et al. (1984) have postulated that capsid protein is responsible for host shutoff by sfv, but the evidence is not compelling: the binding of host mrna to ribosomes was only slightly inhibited in fig. 4 of their paper, and the inhibition was a t the level of 80 s rather than 40 s ribosomes. the fact that translation of late viral 26 s mrna was unaffected is not adequate evidence of specificity, since 26 s mrna-by virtue of its high efficiency-would be relatively resistant to any inhibitor, physiological or otherwise. gwith type-2 reovirus in l cells, infection at a multiplicity of infection (moi) of 10 caused no significant decrease in translation; a t mol of 20, translation gradually declined by -40% (sharpe and fields, 1982) . with type-3 reovirus (moi of zo), overall protein synthesis was initially stimulated in both hela and l cells; translation declined later only in l cells (munoz et al., 1985a) . hrecent data do not corroborate an earlier hypothesis concerning inzctivation of a cap-specific translation factor (skup and millward, 1980) . 
although extracts from reovirus-infected cells translate capped reovirus mrnas poorly, other cap-dependent mrnas, such as globin and tobacco mosaic virus, are translated efficiently in such extracts (lemieux et al., 1984) ; and capped sv40 mrnas are translated in cells coinfected with reovirus (daher and samuel, 1982) . perhaps translation of capped reovirus mrnas is inhibited (artificially) in extracts from infected cells because viral structural proteins, which must be abundant in those extracts, adsorb to the homologous mrnas and sequester them from ribosomes. 'in contrast with most other host mrnas, the synthesis of histone mrnas is inhibited in adenovirus-infected cells (flint et al., 1984) . ]the 55-kda e1b protein probably functions only indirectly to shut off host translation. the protein is required for efficient cytoplasmic accumulation of late viral -as, which might in turn shut off host protein synthesis by competition (see text). proteins from regions e1b and e4 may function as a complex. khost transcripts were stable by hybridization when hela cells were infected in the presence of actinomycin d (rosemond-hornbeak and moss, 1975) but were degraded during productive infection of l cells by vaccinia virus (rice and roberts, 1983) . the second observation seems more pertinent. 'ben-hamida et al. (1983) have purified a component from vaccinia virions that blocks the binding of met-trna to 40 s ribosomes in uitro, but the physiological (in uiuo) mechanism of host shutoff seems to require the expression of viral genes. there is no evidence that eif-2 function is impaired in infected cells. it is possible, however, that some component in the eif-2 cycle is altered in a positive way, i.e., a way that prevents inactivation by eif-2 kinase (whitakerdowling and youngner. 1984) . "it is clear that host translation can be inhibited rapidly in the presence of drugs that preclude the synthesis of viral mrna (moss, 1968) . 
but it is not clear that the normal shutoff mechanism is at work in such cases (see text). "the virion-mediated rapid shutoff of host translation is not usually seen with herpes simplex type 1. except in vero cells; type 2 virus displays the early shutoff function in all cell types. an important, albeit undeciphered, clue is that type 1 virus interferes with the early shutoff by type 2 virions in doubly infected friend erythroleukemia cells (hill et al., 19851. ent viruses can often be produced simultaneously in cells in which host protein synthesis is suppressed (alonso and carrasco, 1982a,c; otto and lucas-lenard, 1980) . some features of the intracellular environment, such as ionic changes that favor the translation of viral over host mrnas, are inevitably lost when the cells are lysed, and other features are not easily preserved. for example, phosphorylated initiation factors have sometimes been inadvertently restored to normal during their purification (centrella and lucas-lenard, 1982; wong et al., 1982) . phosphorylation of eif-2 has also been missed on occasion because eif-2(ap), retains the ability to function stoichiometrically, and the defect is evident only if one assays for catalytic function (safer, 1983) .6 on other occasions, phosphorylation of eif-2 has been missed because a high concentration of gtp in the lysate masks the functional defect (schneider et al., 1985) . extreme care is needed also to preclude the artifactual modification of initiation factors-by proteolysis, for example-during the preparation of cell-free extracts. the fact that one can reproduce in vitro the preferential translation of viral over host mrnas does not necessarily mean that one is studying the physiological mechanism of host shutoff. if viral mrnas are even slightly more efficient than host mrnas, as is often the case, any manipulation that establishes competition will favor the viral mrnas. 
one cannot define how competition is established in vivo by showing that competition occurs in vitro. for example, the fact that translation of vaccinia mrnas is more resistant than that of host mrnas to inhibition by poly(a) when translation is studied in cell-free extracts from reticulocytes (bablanian and banerjee, 1986; coppola and bablanian, 1983) does not mean that vaccinia virus inhibits host translation by flooding the cytoplasm with short, polyadenylated transcripts. such transcripts are indeed produced in infected cells, but only when drugs are used to block the synthesis of normal viral mrnas (rosemond-hornbeak and moss, 1975). the aforementioned problem of an experimental manipulation creating a new inhibitory mechanism, rather than exposing the normal mechanism, almost certainly applies here. the tendency to attribute functional significance to foreign agents that cosediment with polysomes should be resisted. everything cosediments with polysomes to some extent. the presence of a trace of adenovirus va-rna in the polysome region of sucrose gradients (schneider et al., 1984), for example, is almost certainly unrelated to the function of va-rna. it is common to find viral capsid proteins stuck to ribosomes, and it is wise to treat such contamination as contamination, until it is proven otherwise.
6 eif-2, eukaryotic initiation factor 2, is responsible for binding initiator methionyl-trna to the 40 s ribosomal subunit, and eif-2(αp) is eif-2 phosphorylated on its α-subunit. when eif-2 is phosphorylated, the reaction in which gdp is exchanged for gtp fails. that reaction is mediated by an accessory protein called gef, which becomes trapped in an inactive complex with eif-2(αp). because the pool size of gef is small, phosphorylation of only 30% of the eif-2 pool can completely inhibit translation (safer, 1983; siekierka et al., 1984).
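the arithmetic behind the 30% figure in the eif-2 footnote can be illustrated with a minimal sketch. it assumes, for illustration only, that each phosphorylated eif-2 molecule traps exactly one gef molecule and that the gef pool equals 30% of the eif-2 pool; the function name `free_gef_fraction` and the parameter `gef_per_eif2` are hypothetical and do not come from the cited work.

```python
def free_gef_fraction(phosphorylated_fraction, gef_per_eif2=0.3):
    """Fraction of the GEF pool that remains free for GDP/GTP exchange.

    phosphorylated_fraction: share of the eIF-2 pool that is phosphorylated (0..1).
    gef_per_eif2: size of the GEF pool relative to the eIF-2 pool
                  (0.3 is an assumed value chosen to match the 30% figure).
    Assumes 1:1 trapping of GEF by phosphorylated eIF-2.
    """
    if not 0.0 <= phosphorylated_fraction <= 1.0:
        raise ValueError("phosphorylated_fraction must lie between 0 and 1")
    if gef_per_eif2 <= 0.0:
        raise ValueError("gef_per_eif2 must be positive")
    # GEF cannot be trapped beyond the size of its own pool.
    trapped = min(phosphorylated_fraction, gef_per_eif2)
    return (gef_per_eif2 - trapped) / gef_per_eif2


if __name__ == "__main__":
    for phos in (0.0, 0.1, 0.2, 0.3, 0.5):
        print(f"eIF-2 phosphorylated: {phos:.0%}  free GEF: {free_gef_fraction(phos):.0%}")
```

under these assumptions the free gef pool falls linearly and reaches zero once the phosphorylated fraction equals the relative size of the gef pool, which is why a partial modification of eif-2 can have an all-or-none effect on catalytic recycling.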
the known and suspected mechanisms by which translation of viral mrnas is facilitated, usually to the disadvantage of host mrnas, fall into four categories.
1. competition may be suspected when the decline in host protein synthesis and the onset of viral protein synthesis coincide. on the other hand, competition is an insufficient explanation when host protein synthesis is severely inhibited before the onset of viral translation, as occurs with poliovirus, herpes simplex virus, and frog virus 3. often competition is exacerbated by a decline in the overall translational capacity, which may be brought about by changes in the ionic environment or in the translational machinery. when initiation is limiting, most mrnas accumulate in small polysomes, the size of which increases upon exposure to a low concentration of cycloheximide. (cycloheximide slows elongation, thus causing the number of ribosomes to increase on mrnas that were previously limited at the initiation step.) the characteristic shift in polysome size upon exposure to cycloheximide is seen, for example, in cells infected by vsv (jaye et al., 1982) or adenovirus (perlman et al., 1972). the competition between host and viral mrnas that takes place in vivo is sometimes not maintained when the translation of endogenous mrnas is studied in extracts from infected cells, probably because the concentrations of critical components change during the preparation of such extracts.
2. inactivation of a normal component of the translational machinery. the resulting deficiency enables only a subset of mrnas, mostly viral, to be translated. the hallmark of this mode of regulation is the ability to restore translation to cell-free extracts by adding back the missing factor; in practice, this is not as easy as it sounds. there is evidence for inactivation of initiation factor eif-2 in several virus systems, as noted in table i.
alterations in the initiation factor that mediates the translation of capped mrnas have been postulated for several viruses (table i), but the story seems credible only in the case of poliovirus, which is described below.
the identifying characteristic here is that extracts from infected cells cannot be reactivated by the addition of normal initiation factors, but can be reactivated by washing the ribosomes to remove the inhibitor. based on these criteria, pensiero and lucas-lenard (1985) have postulated the production of an inhibitor during mengovirus infection. because mengovirus and host mrnas are equally sensitive to the inhibitor in cell-free extracts, one must invoke competition (for the residual functional ribosomes) to explain the selective persistence of viral translation in vivo. this seems justified in view of the extraordinary efficiency of mengovirus mrna when translation is carried out in vitro under conditions of competition (abreu and lucas-lenard, 1976). until the postulated inhibitor has been identified, however, we cannot be certain that mengovirus belongs in category 3.
the hallmark here is that viral mrnas should be translated more efficiently in cell-free extracts from infected than from uninfected cells. frog virus 3 meets this criterion (raghow and granoff, 1983). it is possible that some other animal viruses alter the translational machinery in a "positive" way.7 the best evidence to date comes from plant viruses, however. a genetic analysis of temperature-sensitive mutants of alfalfa mosaic virus strongly suggests that rnas-1 and -2 encode or induce a factor that facilitates the translation of coat protein from rna-4 (huisman et al., 1985). extrapolating that mechanism to brome mosaic virus would explain why rna-4 fails to synthesize coat protein when it is injected (without rnas-1, -2, and -3) into barley protoplasts (kiberstis et al., 1981).
unfortunately, cell-free systems from infected plant cells are not available to test the hypothesis. the aforementioned hints are only hints. no virus has yet been proved to produce a new or alter an old translational factor in a way that specifically promotes its own translation. since competition is the most common mechanism of translational regulation in virus-infected cells, that topic merits more attention. there are at least three variations on the theme.
7 although the overall ability to translate poliovirus mrna is about the same when cell-free systems are reconstituted with factors from infected or uninfected cells (brown and ehrenfeld, 1980), there is a qualitative difference in the selection of initiation sites when factors from infected cells are used (brown and ehrenfeld, 1979). other experiments support the idea that poliovirus (bernstein et al., 1985) as well as vaccinia (moss and filler, 1970) and human t-lymphotropic virus type iii (rosen et al., 1986) produce something that enhances the synthesis of viral proteins. the enhancing substance could be a virus-specific translation factor, or a protease inhibitor that stabilizes viral proteins, or a nuclease inhibitor, or something else. recent evidence indeed suggests that vaccinia encodes a function that protects late viral mrnas against degradation (pacha and condit, 1985).
in l cells infected by reovirus type 2, host protein synthesis is dramatically shut off. the elegant genetic studies of sharpe and fields (1982) revealed that the s4 gene, which encodes the major capsid protein σ3, is responsible for the inhibition. the effect of σ3 might be indirect, inasmuch as the same gene product is responsible for inhibiting host rna synthesis. although host shutoff by type 3 reovirus in sc-1 cells is less dramatic than with type 2 virus in l cells, the mechanism of type 3 shutoff is better understood due to the careful quantitative studies of thach and colleagues (walden et al., 1981). their conclusion was rather surprising: the intrinsic translational efficiency of reovirus mrnas is not higher than that of host mrnas, but rather, reovirus translation dominates because viral mrnas accumulate in massive amounts, up to 45% of the total mrna in the cell! the evidence that reovirus mrnas initiate translation less efficiently than most host mrnas is twofold: (1) reovirus polysomes are smaller than host polysomes that code for proteins of comparable size; and (2) whereas a low concentration of cycloheximide reduces the synthesis of host proteins (which is the result expected for mrnas of "normal" efficiency), the translation of reovirus proteins is actually enhanced by a low concentration of cycloheximide.
whereas reovirus mrnas appear to be less efficient than most host mrnas, vsv mrnas are probably translated as efficiently as host mrnas, but not more so. competition is simply proportional to the concentration of viral mrnas in the cytoplasm (lodish and porter, 1981),8 and vsv and host mrnas that encode the same-sized proteins are on polysomes of the same size (lodish and porter, 1980). in some cell lines infected by some strains of vsv, a portion of the eif-2 pool seems to be inactivated (centrella and lucas-lenard, 1982; dratewka-kos et al., 1984). although that would intensify the competition, it is obvious that lowering the eif-2 level per se cannot explain the selective inhibition of host translation. selective shutoff requires that viral mrnas be more abundant than host mrnas, or more efficient, or both. the aforementioned experiments argue against vsv mrnas being unusually efficient, but other experiments have been taken as evidence for the contrary view. because vsv mrnas are more resistant than host mrnas to hypertonic stress, nuss et al. (1975) have suggested that the viral mrnas are intrinsically more efficient. their interpretation seems reasonable, but it must be carefully circumscribed.
if high salt exacerbates some deleterious feature in the mrna (such as secondary structure) to the point where it becomes inhibitory, then the hierarchy of mrna strengths that one observes under hypertonic conditions might be irrelevant to normal growth conditions.9 the progressive inhibition of host protein synthesis during infection by vaccinia virus is probably due to competition, in proportion to the concentration of each mrna. viral mrnas are not apparently more efficient than host mrnas (cooper and moss, 1979). degradation of host mrnas (table i) and the massive synthesis of viral transcripts probably tip the balance in favor of viral protein synthesis. it has also been suggested that differential association of mrnas with the cytoskeleton might play a role, but that is a difficult hypothesis to test. in contrast with reovirus and vsv, the concentration of emc virus mrna in infected cells may be too low for simple competition to effect the observed switch from host to viral translation, even though emc mrna is translated more efficiently than host mrnas both in vivo (jen et al., 1978) and in vitro (golini et al., 1976; svitkin et al., 1978). in view of the overall decline in translation that begins 3 hours postinfection, however, the idea that emc virus mrna outcompetes host mrnas for the low, residual translational capacity seems reasonable. the overall decline is most likely due to an influx of monovalent cations, since the two events are temporally correlated (lacal and carrasco, 1982). host translation is restored when emc virus-infected cells are shifted to hypotonic medium (alonso and carrasco, 1981, 1982b), and excess salt, sufficient to inhibit the translation of host mrnas in vitro, dramatically stimulates the translation of emc rna (carrasco and smith, 1976).
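the two routes to dominance discussed above (abundance and efficiency) can be compared in a minimal sketch. it assumes a deliberately simple proportional model, not taken from the cited studies, in which each mrna class captures initiation events in proportion to its concentration multiplied by a relative efficiency. the function name `translation_shares` and all numerical values other than the 45% abundance figure are hypothetical.

```python
def translation_shares(mrnas):
    """Share of total translation captured by each mRNA class.

    mrnas maps a label to (relative_concentration, relative_efficiency).
    Assumed model: share is proportional to concentration * efficiency.
    """
    weights = {name: conc * eff for name, (conc, eff) in mrnas.items()}
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one mRNA class must have a positive weight")
    return {name: weight / total for name, weight in weights.items()}


if __name__ == "__main__":
    # Equal efficiency: the viral share simply tracks viral mRNA abundance.
    by_abundance = translation_shares({"host": (0.55, 1.0), "viral": (0.45, 1.0)})
    # Scarce but efficient viral mRNA (a 10-fold advantage is an arbitrary choice).
    by_efficiency = translation_shares({"host": (0.95, 1.0), "viral": (0.05, 10.0)})
    print("abundant, equally efficient:", by_abundance)
    print("scarce, highly efficient:   ", by_efficiency)
```

the model ignores any change in overall translational capacity, so it cannot by itself represent the salt-dependent effects described for emc virus; it only separates the contribution of mrna concentration from that of intrinsic efficiency.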
9 recall that, although reovirus mrnas are not more efficient than host mrnas in unperturbed cells, reovirus translation, like that of vsv, dominates when cells are subjected to hypertonic stress (nuss et al., 1975). in other studies, the creation of a hairpin (ΔG = -30 kcal/mol) within the 5'-noncoding region of preproinsulin mrna impaired translation only in hypertonic medium; the hairpin did not inhibit translation under normal culture conditions (kozak, 1986b).
a similar mechanism might mediate the switch from host to viral translation during infection by alphaviruses (see sindbis and sfv, i.e., semliki forest virus, in table i), since the influx of sodium ions exactly coincides with the overall decline in protein synthesis. the magnitude of the ion influx remains controversial (gray et al., 1983; munoz et al., 1985b). translation of sfv mrna is more resistant than host protein synthesis to hypertonic conditions (garry et al., 1979), but the resistance is not as dramatic as with emc virus (alonso and carrasco, 1982b). the notion that alphaviruses inhibit host translation by competition seems viable even if something more than enhanced permeability to monovalent cations is needed to explain the overall decline. the fact that sfv mrna can be translated in emc-infected cells (alonso and carrasco, 1982c), in which the overall translational capacity is very low, identifies sfv late 26 s mrna as an efficient message. consistent with the competition hypothesis, the time of host shutoff coincides with the production of viral mrna (lachmi and kaariainen, 1977) and the severity of inhibition correlates with the yield of viral rna in mutant-infected cells (atkins, 1976). polysomes containing sfv (wengler and wengler, 1976) or emc virus mrna do not increase in size upon exposure to cycloheximide, suggesting that those mrnas are efficient enough to be fully loaded with ribosomes even when the overall translational capacity is low.
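the logic of the cycloheximide test can be sketched with a simple steady-state estimate. the model below is an illustrative assumption rather than anything taken from the cited papers: the number of ribosomes on an mrna is taken as the initiation rate multiplied by the transit time, capped at a maximum packing density. the function name `ribosomes_per_mrna`, the 10-codon footprint, and all rate values are hypothetical.

```python
def ribosomes_per_mrna(initiation_rate, elongation_rate, length_codons,
                       footprint_codons=10):
    """Estimated steady-state ribosome load on one mRNA.

    initiation_rate: initiations per second on this mRNA.
    elongation_rate: codons translated per second per ribosome.
    length_codons: length of the coding region in codons.
    footprint_codons: codons occupied by one ribosome (assumed value).
    """
    if elongation_rate <= 0 or length_codons <= 0 or footprint_codons <= 0:
        raise ValueError("rates and lengths must be positive")
    max_load = length_codons / footprint_codons    # fully packed polysome
    transit_time = length_codons / elongation_rate  # seconds per passage
    return min(initiation_rate * transit_time, max_load)


if __name__ == "__main__":
    normal, slowed = 5.0, 2.5  # elongation with and without low-dose cycloheximide
    for label, init in (("initiation-limited mRNA", 0.05), ("efficient mRNA", 1.0)):
        before = ribosomes_per_mrna(init, normal, 300)
        after = ribosomes_per_mrna(init, slowed, 300)
        print(f"{label}: {before:.1f} -> {after:.1f} ribosomes")
```

in this sketch, halving the elongation rate doubles the load on the initiation-limited message, whereas the efficient message is already at the packing limit and shows no shift, which mirrors the interpretation given for sfv and emc virus polysomes.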
influenza virus mrnas are translated with extraordinarily high efficiency in vitro (katze et al., 1986). because the shutoff of host protein synthesis coincides with the onset of influenza virus protein synthesis and there is no overall decline in translation, simple competition would seem adequate to explain the switch from host to viral translation. in the case of adenovirus, competition is probably exacerbated by a reduction in functional eif-2 levels. these issues are discussed in more detail in section iii,e. in some virus systems, competition might dictate the switch from synthesis of early to late viral proteins. picornaviruses, rhabdoviruses, and influenza virus are uninteresting in this regard, as they display little or no temporal control over protein synthesis. the existence of a temporal switch is questionable for reoviruses, but all of the other entries in table i, as well as the papovaviruses, show a striking early-to-late transition. in every case, the switch is effected primarily at the level of transcription: the mrnas that encode late proteins are not synthesized until late. in several cases, however, early mrnas persist in the cytoplasm at late times, and some form of translational regulation seems to limit their expression (hruby and ball, 1981; johnson and spear, 1984; lachmi and kaariainen, 1977; vassef et al., 1982). with black beetle virus that phenomenon can be attributed to competition, because translation of the late mrna predominates over early mrna in cell-free extracts under conditions of competition (friesen and rueckert, 1984). the same explanation probably holds for alphaviruses. on the other hand, late vaccinia virus mrnas do not appear to be more efficient than early viral mrnas (cooper and moss, 1979; oppermann and koch, 1976). instead, degradation of some early vaccinia transcripts (hruby and ball, 1981) might be part of the switching mechanism.
the translation of early mrnas could be further reduced by the accumulation of "anti-early mrna" (boone et al., 1979), which could inhibit translation much as antisense rna does in other experimental systems (izant and weintraub, 1985).10 with baculoviruses, temporal switching involves the sequential activation of upstream promoters, such that the small, early mrnas are replaced by progressively longer overlapping transcripts (friesen and miller, 1985). the resulting relegation of early protein coding sequences to the 3' ends of late transcripts probably prohibits their translation. promoter switching late in sv40 infection also generates forms of mrna from which t antigen is translated inefficiently. thus, although transcription plays the dominant role, translational mechanisms (involving competition or other ploys) contribute to the temporal switch in expression of viral genes in some systems.
the current thinking is that poliovirus selectively shuts off host protein synthesis by inactivating a 220-kda protein (p220) which is a subunit of the initiation factor that mediates the translation of capped mrnas.11 because the 5' end of poliovirus mrna is uncapped, inactivation of the cap binding factor should not impair the translation of viral mrna.
10 few systems other than vaccinia show much potential for regulating translation by "hybrid arrest." complementary transcripts accumulate in the nuclei of many virus-infected cells, but the complementary sequences are usually edited from cytoplasmic mrnas. in the case of adenoviruses, for example, where transcription switches periodically from one dna strand to the other, the 3' ends of the juxtaposed mature mrnas rarely overlap (lemoullec et al., 1983). the 3' ends of papovavirus early and late mrnas do characteristically overlap, however.
11 although the proteins that can be cross-linked to the m7g cap have a disturbing tendency to change from year to year, two proteins in mammalian cells that reproducibly cross-link are p24-cbp and p46-cbp. p220 is not a "cap binding protein" inasmuch as it does not cross-link to the cap, but p220 does copurify with p24-cbp and p46-cbp. the aggregate, called eif-4f, is considered by most people to be the functional "cap binding factor." the functions of cap binding proteins have been reviewed by shatkin (1985).
the idea is appealing because it is straightforward, but some of the supporting data are less so. the experiment that gave birth to the hypothesis was provocative. using an antiserum against initiation factor eif-3, etchison et al. (1982) showed by immunoblotting that p220 is clipped during the first few hours after infection of hela cells by poliovirus. because affinity-purified antibodies against p220 recognized a protein of the same size in some preparations of cap binding factor, the working hypothesis was that cleavage of p220 inactivated the cap-binding initiation factor. indeed, an activity from uninfected cells that was subsequently purified, based on its ability to restore translation to poliovirus-infected cell-free extracts, contained p220 and the cap binding proteins. these experiments are described below in more detail. this is the most promising explanation to come forth, but there are some irregularities and many lacunae in the supporting data.
1. there is a discrepancy between the kinetics of degradation of p220 and the kinetics of host shutoff (etchison et al., 1982). the same anomaly occurs during infection by rhinovirus, which degrades p220 in a manner similar to poliovirus (etchison and fout, 1985): the rate of translation is still half-maximal at the time when p220 disappears from the polyacrylamide gels.
in many of the experiments carried out with cell-free extracts, the question of timing was disregarded, and cells were routinely harvested 3 hours postinfection (lee and sonenberg, 1982; lee et al., 1985a), which is well beyond the point when host translation is precipitously shut off.
2. the extent of cleavage is difficult to evaluate quantitatively. it seems dangerous to accept the recommendation of bernstein et al. (1985) to focus on the accumulation of the 115-kda cleavage products without also monitoring the disappearance of p220, because cleavage need not always be arrested at the 115-kda level. in some experiments the concentration of cleavage products in immunoblots from infected cells greatly exceeded the concentration of intact p220 in uninfected cells (bernstein et al., 1985).
3. a p220 cleavage pattern qualitatively similar to that which occurs in infected cells, although not nearly as extensive, is evident in some extracts from uninfected cells (bernstein et al., 1985; fig. 5 in lee et al., 1985a). because the link between degradation of p220 and virus infection is not tight, it was not surprising to find that the virus-encoded protease 3c is not responsible for cleaving p220 (lee et al., 1985b; lloyd et al., 1985). it would seem wise to include a spectrum of protease inhibitors during the preparation of extracts. phenylmethylsulfonyl fluoride is the only one routinely used, at concentrations ranging from 5 mm (which is adequate; etchison et al., 1982) to 1 mm (bernstein et al., 1985) or 0.2 mm (lee and sonenberg, 1982). one is reminded of the old excitement concerning "processing" of sv40 t antigen (ahmad-zadeh et al., 1976) which turned out to be an artifact of extraction (smith et al., 1978).
4. in uninfected hela cells, the concentration of p24-cbp is 10-fold lower than that of p220 (duncan and hershey, 1985a,b).
[the concentration of p24-cbp is also low in reticulocytes (hiremath et al., 1985).] if p24 and p220 (together with p46) function as the cap binding complex, large changes in the p220 pool, although easy to detect by immunoanalysis, are unlikely to alter the rate of translation; but small changes in the pool of p24, which might go undetected with immunological or biochemical probes, would significantly impair translation.
5. a mutant of poliovirus called hf121 has been described in which the synthesis of viral rna is normal in cv-1 cells, but viral protein synthesis is inefficient, host translation is inhibited more slowly than usual, and p220 is not rapidly cleaved (bernstein et al., 1985).12 (the phenotype of hf121 in hela cells, which are a more natural host than cv-1 monkey cells, is more complex. the synthesis of viral rna is greatly reduced and all protein synthesis, host and viral, is inhibited very early, again without the concomitant cleavage of p220.) the authors argue cogently that wild-type poliovirus appears to encode a function, absent from hf121, that promotes (or "avoids the early inhibition of") viral translation, and they argue less cogently that hf121 is translated poorly as a consequence of the failure to selectively inhibit host translation. to me, the second postulate seems redundant. lack of the putative positive factor would be sufficient to account for the poor translation of viral mrna, and the failure to inhibit host translation with normal kinetics could as likely be the result of inefficient viral translation as the cause.13 if the slow inhibition of translation by hf121 in cv-1 cells is a delayed version of the normal shutoff mechanism, then cleavage of p220 must not be central to the normal shutoff mechanism. the authors argue, to the contrary, that shutoff by hf121 is mechanistically different, inasmuch as the inhibition affects both host and viral mrnas in hf121-infected cells, whereas host translation is preferentially inhibited in wild-type-infected cells. however, selective stimulation of viral translation (mediated by the product that is defective in hf121), superimposed on a general inhibition of translation, would mimic selective inhibition. the authors contend that the ability of guanidine to block host shutoff by hf121 distinguishes it from the normal shutoff mechanism, but the experiment (in cv-1 cells) was done without testing wild-type virus in parallel, as could have been done by adding guanidine at the start of the infection rather than after 3 hours.
12 although cleavage of p220 is not detectable at all in hela cells infected by mutant hf121, cleavage products are clearly evident in cv-1 cells at 5 hours postinfection (fig. 8a, lane 7, in bernstein et al., 1985). that is later than normal, and the cleavage is less extensive than normal, but some cleavage does occur.
13 one could argue similarly that cleavage of p220 in wild-type infected cells is the consequence of the abundant accumulation of viral proteins, rather than a precondition. the issue might be resolved by treating wild-type infected cells with guanidine, which allows only limited synthesis of viral proteins while host translation is inhibited with the usual rapid kinetics. it would be informative to know whether p220 undergoes cleavage under those circumstances.
6. the assumption that tobacco mosaic virus, sindbis virus, and vsv mrnas are appropriate stand-ins for host mrna in the restoring assay is questionable (rose et al., 1978; tahara et al., 1981). in cells singly infected by vsv, viral mrnas are translated in preference to host mrnas under some conditions (nuss et al., 1975); thus, vsv mrnas are not equivalent to most host mrnas.
when cells are doubly infected with poliovirus and sfv (which is akin to sindbis), or with poliovirus and vsv, conditions can be found that allow the simultaneous translation of poliovirus mrna and the capped mrnas of vsv or sfv (alonso and carrasco, 1982a). thus, the factor that restores to poliovirus-infected extracts the ability to translate vsv or alphavirus mrnas might not be sufficient to restore the translation of most host mrnas. globin is the only cellular mrna that has been shown to work in the restoring assay (edery et al., 1984), and its translational efficiency resembles that of viral mrnas more than that of the average cellular mrna. it would be reassuring to omit the usual micrococcal nuclease pretreatment of lysates, and show that the addition of cap binding factor to poliovirus-infected cell-free extracts restores the translation of authentic endogenous host mrnas. the association of p220 with p24- and p46-cbps does not prove that p220 is a necessary component of the complex. in early studies, preparations of p24-cbp that lacked the p220 subunit did preferentially stimulate the translation of capped mrnas in hela cell-free extracts (tahara et al., 1981; sonenberg et al., 1980). those results were considered wanting because p24-cbp failed to reproducibly restore translation to extracts from poliovirus-infected cells, whereas an aggregate of p220, p46, and p24 could restore (tahara et al., 1981). but there is no reason to reject the aforementioned demonstration that p24-cbp by itself does stimulate in uninfected systems. using the restoring assay to define the structure of cap binding factor would be acceptable only if one knew that cap binding factor was deficient in infected-cell extracts. grifo et al. (1983) showed that translation of globin mrna was stimulated by the p220-p46-p24 aggregate, even when the system was saturated with p24- and p46-cbps.
those data prove only that the system which they reconstituted from partially purified subfractions of a reticulocyte lysate was deficient in p220; the data do not prove that p220 is an essential component of cap binding factor. (indeed, translation of the uncapped mrna from satellite tobacco necrosis virus was stimulated to the same extent as capped globin mrna.) the function of p220 would be clearer if one could show that antibodies against p220 inhibit the function of cap binding factor. such experiments have not been reported. indeed, the immunological evidence is far from convincing even for the original p24-cbp. a monoclonal antibody "directed against cap binding proteins" was shown to inhibit the translation of capped mrnas, but the antibody reacted with higher molecular weight proteins and not with p24-cbp (sonenberg et al., 1981). the claim that the higher molecular weight polypeptides were related to p24-cbp no longer seems valid, because polyclonal antibodies obtained recently against p24-cbp react only with that polypeptide (hiremath et al., 1985). in extracts from uninfected hela cells, p220 copurifies to some extent with both cbps and eif-3 (etchison et al., 1982). whereas it is known that p220 restores translation to poliovirus-infected extracts when it is introduced in association with cap binding proteins, it is not known whether p220 would also enhance translation were it introduced in association with eif-3. in several studies, eif-3 failed to restore translation to extracts from poliovirus-infected cells, but it was usually tested on an equal weight basis, vis-a-vis the other initiation factors (table iv in grifo et al., 1983; rose et al., 1978). because eif-3 is so massive, it must be tested on an equal molar basis.
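the difference between an equal weight and an equal molar comparison can be made concrete with a little arithmetic. the sketch below is only an illustration: the molecular masses are placeholder values chosen for the example (they are not taken from the studies cited above), and the function names are hypothetical.

```python
def picomoles(mass_ug, mol_mass_kda):
    """Convert a protein mass (micrograms) to picomoles.

    1 ug of a 1 kDa (1000 g/mol) protein is 1e-6 g / 1e3 g/mol = 1000 pmol.
    """
    return mass_ug * 1000.0 / mol_mass_kda


def mass_for_equal_moles(ref_mass_ug, ref_kda, target_kda):
    """Mass of the target factor that matches the moles of a reference factor."""
    return ref_mass_ug * target_kda / ref_kda


# ASSUMED, illustrative molecular masses (placeholders, not measured values)
SMALL_FACTOR_KDA = 50.0     # a hypothetical small initiation factor
LARGE_COMPLEX_KDA = 500.0   # a hypothetical massive multisubunit factor

if __name__ == "__main__":
    # equal weight: 1 ug of each factor added to the assay
    print(picomoles(1.0, SMALL_FACTOR_KDA), "pmol of the small factor")
    print(picomoles(1.0, LARGE_COMPLEX_KDA), "pmol of the large complex")
    # mass of the large complex needed to reach an equal molar amount
    print(mass_for_equal_moles(1.0, SMALL_FACTOR_KDA, LARGE_COMPLEX_KDA), "ug")
```

with these assumed masses, an equal weight of the large complex supplies one-tenth as many molecules as the small factor, so a negative result at equal weight says little about whether the large factor can restore translation.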
an experiment which intended to show that eif-3 from poliovirus-infected cells is fully functional failed to prove the point, because the assay for eif-3 was carried out in the presence of cap binding factor from uninfected cells (etchison et al., 1984). the exogenous cap binding factor may have contributed a component (such as p220) which was necessary for, and absent from, infected-cell eif-3. the assay would have been more meaningful had an uncapped mrna been used, thus allowing the function of eif-3 to be evaluated without the necessity of adding cap binding factor. whether p220 should be considered a component of eif-3 or of the cap recognition factor involves more than semantics. whereas inactivation of cap binding factor would be sufficient to explain the selective inhibition of host translation, eif-3 is apparently needed for translating all mrnas; were eif-3 activity low, poliovirus would have to outcompete host mrnas for the residual activity. casting a wider net might identify other components that are involved in host shutoff by poliovirus. a few candidates have been ruled out. initiation factors eif-4a and eif-4b, for example, appear to be unaltered (duncan et al., 1983). the normal association of host mrnas with the cytoskeleton is disrupted shortly after infection by poliovirus (lenk and penman, 1979). whether that is the cause, or the effect, or is unrelated to inhibition of host translation remains unclear. such a dramatic effect seems unlikely to be gratuitous, but in other systems disruption of the cytoskeleton does not preclude all translation (welch and feramisco, 1985). follow-up studies in the poliovirus system have not significantly extended penman's original, provocative observation. when bonneau et al.
(1985) infected cv-1 cells first with vsv (which does not dissociate host mrnas from the cytoskeleton) and then superinfected with poliovirus, translation of vsv g and m proteins was inhibited and those mrnas were released from the cytoskeleton; unfortunately, it was not shown that vsv n and ns mrnas, which continued to be translated, remained bound to the cytoskeleton. the conclusion that translation requires association with the cytoskeleton hardly seems warranted. carrasco has suggested that the increased permeability of virus-infected cells to monovalent cations might mediate the switch from host to viral translation (carrasco and lacal, 1983; carrasco and smith, 1976). when infected cells are incubated in medium containing sufficient excess nacl to inhibit the translation of most other proteins, poliovirus translation is fairly resistant (alonso and carrasco, 1982a; nuss et al., 1975), but the resistance is not as striking as with emc virus (alonso and carrasco, 1982b). the stimulation of in vitro translation by high salt is also less obvious with poliovirus mrna than with some other picornaviruses (bossart and bienz, 1981). in the natural course of infection by poliovirus, the precipitous decline in host translation occurs within the first 2 hours, prior to the observed increase in intracellular sodium ions (nair et al., 1979). moreover, the synthesis of cellular proteins cannot be reactivated by incubating poliovirus-infected cells in hypotonic medium (alonso and carrasco, 1982b), a manipulation that works beautifully with emc virus. thus, hypertonicity does not appear to underlie the shutoff of host protein synthesis by poliovirus. morrow et al. (1985) made the astonishing discovery recently that the host-encoded kinase that is responsible for phosphorylating eif-2 binds to and mediates the replication of poliovirus rna.
although that seems about as auspicious as a sheep shaking hands with a wolf, one can think of ways to rationalize such a dangerous move. if the pool of viral rna that serves as template for replication has to be kept free of ribosomes, for example, the presence of eif-2 kinase in replication complexes could help by phosphorylating the local pool of eif-2. indeed, eif-2 might become globally phosphorylated in infected cells, and the resulting eif-2 deficiency could contribute to the inhibition of host protein synthesis. whereas older studies suggested that eif-2 was not deficient in polio-infected cells (brown and ehrenfeld, 1980; helentjaris and ehrenfeld, 1978), the translation of heterologous mrnas in infected-cell extracts was restored to a limited extent by the addition of eif-2 (rose et al., 1978), an effect that the authors chose to ignore. asim dasgupta has reopened the question, and his careful measurements reveal extensive phosphorylation of eif-2(α) in poliovirus-infected cells (personal communication).

the adenovirus system has generated considerable excitement recently because genetic manipulations, pioneered in shenk's laboratory, have revealed a regulatory mechanism that is novel, and yet connects to the extensive older literature on inactivation of initiation factor eif-2. the focal point is a small virus-encoded rna called va-rnai. thimmappaya et al. (1982) found that, in cells infected by an adenovirus mutant that produced no va-rnai, late viral mrnas were synthesized, processed, and transported, but failed to be translated. in the absence of va-rnai, translation was blocked at the level of initiation (schneider et al., 1984) and the defect was ultimately localized to eif-2. overwhelming evidence now supports the hypothesis that, in the absence of va-rnai, a kinase becomes activated that phosphorylates, and thus inactivates, eif-2 (schneider et al., 1985; siekierka et al., 1985).
eif-2 kinase (one of the enzymes involved in the antiviral action of interferon; see lengyel, 1982) exists in uninfected hela cells in an inactive state, and is apparently activated by double-stranded rna that accumulates in infected cells as a by-product of adenovirus transcription (o'malley et al., 1986) (footnote 14). the exact mechanism by which va-rnai blocks the action of eif-2 kinase is not yet known. an intriguing scenario can be extrapolated from a model that was proposed by rosen et al. (1981) in another context. their model proposes that high molecular weight double-stranded rna activates and targets eif-2 kinase: because both the kinase and eif-2 have binding sites for dsrna, high molecular weight dsrna could link the two proteins (footnote 15). by virtue of its small size, va-rnai might be able to bind to eif-2 or to eif-2 kinase, but not simultaneously to both. va-rnai would thus block the phosphorylation of eif-2 much as a monovalent hapten blocks antigen-antibody interactions. the proposal rationalizes the known properties of va-rnai: its small size (about 160 nucleotides), double-stranded structure (monstein and philipson, 1981), and the high concentration that is required to confer protection. whether the double-stranded regions of va-rnai are crucial for its function is not yet clear. have mutated va-rnai and found that extensive regions could be deleted without affecting biological activity, although certain other mutations were deleterious. further experiments will be needed to pinpoint the essential region(s) in va-rnai. a second adenovirus-encoded species called va-rnaii rescues translation far less efficiently than va-rnai (thimmappaya et al., 1982), and va-rnaii appears less extensively base paired (mathews and grodzicker, 1981). in addition to its proven protective effect on eif-2, it has been suggested that va-rnai might interact directly with viral mrnas to promote their translation (kaufman, 1985; schneider et al., 1984; svensson and akusjarvi, 1985).
a sequence-specific interaction seems unlikely, however, because the small rnas encoded by epstein-barr virus (which are related to adenovirus va-rnas by size but not sequence) can substitute to some extent for va-rnai (bhat and thimmappaya, 1983, 1985), and the facilitating effect of va-rnai extends not only to late adenovirus mrnas that carry the standard tripartite leader, but also to early adenovirus mrnas and late mrnas with a truncated version of the tripartite leader (svensson and akusjarvi, 1984), adeno-associated virus mrnas (janik et al., 1982), and various heterologous mrnas (svensson and akusjarvi, 1985).

footnote 14. a virus that is protected (by va-rna or some other mechanism) from the deleterious effects of its own symmetrical transcription process would also have some resistance to interferon. an interesting story along these lines is emerging with vaccinia virus (rice and kerr, 1984; whitaker-dowling and youngner, 1984).

footnote 15. sen et al. (1978) showed that, once kinase has been activated by binding dsrna, incubation with ribonuclease iii does not abolish the ability of kinase to phosphorylate histone h1 (which was chosen as a convenient substrate), but neither the extent of trimming by the nuclease, nor the activity of the trimmed kinase on eif-2, were determined. thus, the experiment does not contradict the targeting hypothesis for dsrna.

the protective effect of va-rnai and the shutoff of host translation in adenovirus-infected cells might be two aspects of a single mechanism. if one postulates that the production of va-rnai by wild-type adenovirus is sufficient to protect only a portion of the eif-2 pool, the switch from host to viral protein synthesis could occur because late adenovirus mrnas outcompete host mrnas for the small residual translational capacity. the hypothesis that competition occurs during the late stage of infection by adenovirus is not entirely ad hoc.
the overall translational capacity is low late in the infection (castiglia and flint, 1983); a portion of the eif-2 pool is phosphorylated (m. mathews, personal communication); polysomes are small and their size increases in response to a low dose of cycloheximide (perlman et al., 1972); and the decline in host translation correlates with the temporal onset and magnitude of late viral translation (castiglia and flint, 1983). every mutation that has been shown to prevent host shutoff also prevents the cytoplasmic accumulation of late viral mrnas (babiss et al., 1985; halbert et al., 1985). an interesting set of experiments by logan and shenk (1984) can be rationalized in terms of competition during the late stage of infection. they observed that transposition of the late tripartite leader to the early e1a genes had no effect on the efficiency of translation of e1a products at early times, but significantly enhanced the translation of e1a mrnas at late times. this is understandable if there is no competition early in the infection, allowing efficient and inefficient mrnas to be translated equally well. the facilitating effect of the tripartite leader would become evident only late in the infection, when eif-2 has been partially inactivated and competition has set in. one might think along similar lines to explain the surprising ability of influenza virus mrnas to be translated in adenovirus-infected cells. in wild-type adenovirus-infected cells, in which host protein synthesis is drastically reduced, both adenovirus and influenza virus mrnas are translated efficiently. in cells infected by dl331, a deletion mutant that produces no va-rnai, adenovirus proteins are translated inefficiently, as noted above, but influenza virus proteins are still synthesized in abundance. despite that striking observation, there is little support or necessity for the notion that influenza virus establishes its own translational system.
a simpler explanation is that the very low capacity for protein synthesis that persists in the absence of va-rnai is sufficient for the translation of influenza virus mrnas (footnote 16). for that explanation to be correct, influenza virus mrnas would have to be translated with extraordinary efficiency, and that prediction has recently been confirmed (katze et al., 1986). what makes all this so intriguing is that the 5' ends of influenza virus mrnas, which presumably dictate their high translational efficiency, are derived from host mrnas (krug, 1981). it appears as if the viral cap-specific endonuclease (which selects the cellular mrnas that will serve as donors) is biased toward the same features that facilitate translation. indeed, that deduction has been verified directly, at least in vitro. bouloy et al. (1978, 1980) found that β-globin mrna, which is translated more efficiently than α-globin mrna, is also a more efficient primer for influenza transcription; and alfalfa mosaic virus rna-4, which translates in vitro with extraordinary efficiency, is the best known primer for influenza transcription. [from the fact that mrnas with 2'-o-methyl groups in the penultimate position of the cap function better than monomethylated caps as primers for influenza virus transcription (bouloy et al., 1980), one is tempted to suggest that 2'-o-methyl groups enhance translation, although there is little direct evidence for that view!] if the selection of primers in infected cells follows the pattern that is seen in vitro, the influenza virus takeover scheme is indeed remarkable: the most efficient cellular mrnas would be sacrificed to construct viral mrnas that ipso facto translate most efficiently. a few other possibilities for regulating translation in virus-infected cells are discussed briefly below. none of these topics is well understood at present, and the musings should be considered little more than that.
footnote 16. katze et al. (1986) have shown that influenza virus partially suppresses the activation of eif-2 kinase, and they suggest that this underlies the ability of influenza virus to replicate in cells infected by adenovirus mutant dl331. that interesting hypothesis would be stronger if it could be shown that influenza virus replicates even in cells infected by the adenovirus double mutant (vai-/vaii-), and if superinfection with influenza virus could be shown not just to reduce the level of activated kinase, but to increase the level of functional eif-2.

cauliflower mosaic virus seems to be the only case in which the ability of eukaryotic ribosomes to reinitiate is fully exploited to produce several full-length proteins from one mrna. the structure of a few other viral and cellular mrnas leads one to predict that reinitiation is necessary for ribosomes to reach the major protein coding sequence, but the upstream open reading frames (orfs) in such mrnas are characteristically short. in some cases, however, the small peptides encoded near the 5' end of the message might be biologically important. genetic studies indicate that this is certainly the case with the agnogene product of sv40 (margolskee and nathans, 1983). in contrast, the three peptides encoded within the avian retrovirus leader sequence probably are not functional because there is little conservation of amino acid sequences among virus strains (hackett et al., 1986). in retrovirus mutants that lack most of the leader sequence, the only known deficiency is the absence of a cis-acting packaging signal (mann et al., 1983; nishizawa et al., 1985). comparison of different strains of poliovirus reveals that the number of upstream aug codons varies and the coding properties of the small orfs are not conserved (toyoda et al., 1984). upstream minicistrons that do not encode anything interesting might nevertheless be important for regulation. several possibilities come to mind for retroviruses.
the least interesting idea is that upstream aug codons accumulate, not from design, but from default, because the deleterious effects on translation can easily be compensated by using efficient transcription signals to mass-produce retrovirus mrnas. the opposite view is that upstream aug codons are deliberately retained to throttle the synthesis of a protein that would be harmful if overproduced (tarpley and temin, 1984). while that seems a reasonable ploy to use for oncogenes, it makes little sense when extended to viral structural genes. a third possibility derives from the observation that reinitiation usually is not 100% efficient. with preproinsulin constructs, for example, in which the efficiency of reinitiation is routinely 20% (kozak, 1984b), one might ask what the remaining 80% of the ribosomes are up to. one scenario is that, after 80 s ribosomes have moved through the 5'-proximal orf, 80% of the 40 s subunits detach at the terminator codon while the rest remain on the message, resume scanning, and reinitiate at the second aug codon. a more interesting possibility is that all 40 s subunits remain bound and resume scanning, but only 20% reinitiate at the closest aug codon, perhaps because the codon-recognition step is inefficient in the absence of met-trnai, cap binding proteins, and/or other initiation factors, all of which were presumably released at an earlier step. [we do not know the precise sequence of events during initiation, but it seems likely that the factors that mediate the binding of met-trnai and mrna to the 40 s ribosomal subunit are released prior to or during the joining of the 60 s subunit at the first aug codon (moldave, 1985).] if the factor-deficient 40 s subunits that are unable to reinitiate at the second aug codon eventually become competent, they might reinitiate farther downstream.
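the two scenarios just described predict the same yield at the second aug codon and differ only in the fate of the remaining subunits. a minimal bookkeeping sketch makes that explicit. the 20% figure is the reinitiation efficiency quoted above; the fraction of factor-deficient subunits that regain competence (recover_frac) is a purely hypothetical parameter, since no such value has been measured, and the function names are invented for the illustration.

```python
def detach_model(reinit_eff):
    """Scenario 1: subunits that do not reinitiate at the second AUG
    have left the message at the upstream terminator codon."""
    return {
        "second_aug": reinit_eff,
        "farther_downstream": 0.0,
        "lost": 1.0 - reinit_eff,
    }


def scan_on_model(reinit_eff, recover_frac):
    """Scenario 2: all 40S subunits stay bound and resume scanning; those
    that pass the second AUG may regain competence and initiate later.

    recover_frac is a HYPOTHETICAL fraction (0..1), not a measured value.
    """
    missed = 1.0 - reinit_eff
    return {
        "second_aug": reinit_eff,
        "farther_downstream": missed * recover_frac,
        "lost": missed * (1.0 - recover_frac),
    }


if __name__ == "__main__":
    print(detach_model(0.20))
    print(scan_on_model(0.20, recover_frac=0.5))  # 0.5 chosen arbitrarily
```

measuring the yield at the second aug codon alone therefore cannot distinguish the scenarios; only initiation at sites farther downstream would.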
thus, the effect of an upstream minicistron could be to loosen the process of initiation in a way that permits ribosomes to reach otherwise inaccessible internal aug codons. there is no evidence for this, as yet. we know only that reinitiation at the closest aug codon (following a terminator codon) is less than 100% efficient. yet another way in which ribosomes might gain access to internal aug codons, even in a message in which the major open reading frame initiates with a "strong" aug codon, relies on the presence of weak, out-of-frame initiator codons in the retrovirus leader sequence and the ability of ribosomes to reinitiate. this hypothetical scheme is best illustrated by using as an example an avian retrovirus mrna that encodes the env glycoprotein (fig. 2). katz et al. (1986) have studied the effects of mutations in the leader region of this mrna, using as an assay the ability to complement a replication-defective (env-) strain of rous sarcoma virus.

fig. 2. a hypothesis whereby minor initiation sites in the leader sequence of retroviruses create a shunt that directs ribosomes to internal initiation sites. the diagram represents a subgenomic mrna that encodes the env protein of avian leukosis virus (katz et al., 1986). messenger rna is represented by a wavy line, the pathway followed by 40 s ribosomal subunits is shown above the mrna, and the pathway of 80 s ribosomes is shown below the mrna. a solid black line traces the pathway followed by most 40 s subunits: they scan from the m7g cap to the start of the env coding sequence, marked "major start site," where a 60 s subunit joins and translation begins. (some 40 s subunits will stop and initiate at three upstream aug codons, but in each instance there is a nearby terminator codon, enabling ribosomes to reinitiate. thus, the upstream aug codons are irrelevant for the present discussion and are not shown.) of more significance are the many nonstandard codons (gug, uug, etc.) that lie in the standard context for initiation. such codons occur frequently in the -1 reading frame, which is open (in the functional ev-2 viral genome) over a stretch of about 200 nucleotides preceding the major env start site; the open -1 reading frame ends 125 nucleotides beyond the start of the env coding sequence at uaa5188-5190, which is labeled t-1 in the figure. were a few 40 s subunits to recognize the nonstandard upstream codons as initiation sites, the resulting 80 s ribosomes, translating in the -1 frame, would bypass the normal env start site. a dashed line traces the pathway of this shunt. the main point is that ribosomes that terminate at t-1 could reinitiate at an internal site which would be inaccessible were it not for the shunt.

their results are provocative. point mutations in positions -4 and -7 (i.e., 4 and 7 nucleotides upstream from the aug codon that initiates env) caused a 10-fold reduction in complementation. on the other hand, the translational efficiency of deletion mutants varied from 5 to 106% of the wild-type level, and the variation did not correlate with the presence or absence of any particular portion of the leader sequence. to explain this puzzling pattern (or rather the absence thereof), skalka suggests that the mutations perturb some aspect of secondary structure that is critical for translation. because that idea is difficult to formulate in a way that can be tested, it can do no harm to consider an alternative explanation. the biological assay that was used has the advantage of being exquisitely sensitive, but it has the disadvantage of measuring the yield of env protein only indirectly: the authors did not show a 10-fold reduction in env synthesis; they showed a 10-fold drop in complementation.
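the reading-frame bookkeeping behind the shunt drawn in fig. 2 can be sketched in a few lines. this is an illustration only: the set of nonstandard codons accepted as weak start sites is an assumption, the flanking context that the text requires for initiation is not checked, positions are 0-based, and the sequences in the usage example are invented toy strings rather than the avian leukosis virus leader.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}
# ASSUMPTION: which non-AUG codons count as weak start sites
WEAK_STARTS = {"GUG", "UUG", "CUG", "ACG"}


def first_stop(mrna, start):
    """Index of the first in-frame terminator reading from `start`, or None."""
    for j in range(start, len(mrna) - 2, 3):
        if mrna[j:j + 3] in STOP_CODONS:
            return j
    return None


def shunt_candidates(mrna, main_start):
    """Weak upstream start codons in the -1 frame (relative to the major AUG
    at 0-based index `main_start`) whose reading frame stays open past it.

    Returns a list of (weak_start_index, terminator_index) pairs.
    """
    mrna = mrna.upper().replace("T", "U")
    frame = (main_start - 1) % 3          # the -1 reading frame
    hits = []
    for i in range(frame, main_start, 3):
        if mrna[i:i + 3] in WEAK_STARTS:
            stop = first_stop(mrna, i)
            if stop is not None and stop > main_start:
                hits.append((i, stop))    # ribosomes bypass the major AUG
    return hits


if __name__ == "__main__":
    toy_wild_type = "CCGUGCCCCAUGCCUAA"   # invented; major AUG at index 9
    toy_mutant = "CCGUGUAGCAUGCCUAA"      # invented; -1 frame stop upstream
    print(shunt_candidates(toy_wild_type, 9))
    print(shunt_candidates(toy_mutant, 9))
```

in the toy mutant a terminator codon in the -1 frame just upstream of the major aug closes the shunt, so no candidate is returned, whereas the toy wild-type sequence keeps one weak start site that reads past the major aug.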
what if complementation were to require, in addition to env, a second minor protein, either a truncated form of env that initiates a little farther downstream or a small protein encoded in an alternate reading frame? (there is an open reading frame beginning at aug,,,,-,,,,, for example, that could direct the synthesis of a 10-kda protein.) because the context at the major env start site is highly favorable, all of the 40 s ribosomal subunits that reach that site should initiate there; production of the putative internally initiated protein would therefore require a mechanism for shunting some ribosomes beyond the major env start site. the hypothesis illustrated in fig. 2 is that a small fraction of the ribosomes initiate within the leader sequence at weak sites (nonstandard codons that lie in a favorable context for initiation) in the -1 reading frame, and translate in that frame past the major env start site, terminating at the site labeled t-1 in fig. 2. the small fraction of ribosomes that follow this shunt could reinitiate to produce the second protein postulated above. the notion that the env gene encodes two products is certainly ad hoc, but it rationalizes the behavior of skalka's mutants. the deleterious mutation in position -4 creates a terminator codon in the -1 reading frame, which would short-circuit the shunt and prevent synthesis of the internally initiated protein. in all of the deletion mutants that fail to complement efficiently, the weak upstream start sites are either in-frame with the major env start site or terminate upstream from it, again abolishing the shunt. on the other hand, all of the deletion mutants that retain the ability to complement efficiently retain one or more weak upstream start sites (such as gug in position 132-134 in mutant 137/349, or uug in position 22-24 in mutant 65/349) which can feed ribosomes into the shunt. the hypothesis could be tested in two ways.
one is to directly measure the yield of the major env protein, which we predict will not vary, because it is the internally initiated protein that is deficient in these mutants. the best test would make use of a null mutant called pd99/394 that lacks the major env start site: that mutant should still make the second protein encoded within the env gene, and therefore should complement all of the other mutants that have lost the shunt. viruses that replicate in the cytoplasm have the potential for coupling transcription with translation. for example, if ribosomes were to bind the 5' end of reovirus mrnas as the nascent chains emerge from the subviral particle, the mrnas would be recruited for translation before the chains grew long enough to fold. that might enhance translation considerably, because the pattern of cleavage by t1 rnase suggests that the capped terminus might be sequestered in mature reovirus mrnas (kozak and shatkin, 1978b). it would be fun to know whether reovirus mrnas are translated more efficiently in naturally infected cells than in cells transfected with cloned viral genes which are transcribed from a plasmid vector. the idea of coupling is ad hoc for reovirus, but there is a glimmer of evidence in the case of silkworm cytoplasmic polyhedrosis virus; whereas preformed viral mrnas were inactive in reticulocyte or wheat germ translation systems, viral proteins were synthesized during coupled transcription-translation in frog oocytes (ikegami et al., 1985). payne and mertens (1983) obtained somewhat different results, in that some viral proteins were made in vitro in the absence of transcription; but the polyhedron protein that predominates in vivo was still not produced in vitro. in the vaccinia virus system, cooper and moss (1978) observed more efficient synthesis of vaccinia proteins when transcription and translation were coupled.
synergism could also occur in the opposite direction; i.e., viral transcription might be facilitated by translation. during the early hours of reovirus infection in l cells, transcription is mainly from genome segments m3, s3, s4, and one of the l segments (nonoyama et al., 1974). because mrnas from segments m3, s3, and s4 bind ribosomes very efficiently (shatkin and kozak, 1983), one wonders whether preferential transcription is the consequence of preferential translation (footnote 17). the possibility that coupled translation enhances transcription was fleetingly entertained for some other cytoplasmic viruses (ball and white, 1978; cooper and moss, 1978), but the reticulocyte lysate appeared to enhance transcription only because it conferred protection against nucleolytic degradation. it remains possible that transcription and translation are obligatorily coupled in some less well studied rna viruses, as has been hinted for bunyaviruses (patterson and kolakofsky, 1984; pattnaik and abraham, 1983).

footnote 17. the hypothesis is complicated, but not necessarily contradicted, by the finding that m3, s3, and s4 are preferentially transcribed in vivo even in the presence of cycloheximide (shatkin and kozak, 1983). although cycloheximide blocks elongation by 80 s ribosomes, 40 s ribosomal subunits could still bind to the nascent transcripts.

in the case of viruses that replicate in the nucleus, the possibility that movement of mrnas out of the nucleus might be coupled with translation has been raised from time to time. coupling clearly is not obligatory, because viral mrnas accumulate in the cytoplasm under many circumstances in which translation is blocked. a good example is the cytoplasmic accumulation of late adenovirus mrnas in the absence of va-rnai. on the other hand, the transport and translation of mrnas are sometimes coordinated.
a striking example occurs in adenovirus-infected hela cells that are superinfected with influenza virus: whereas adenovirus blocks both the transport and translation of host mrnas, influenza virus mrnas escape both blocks. the probable mechanism that enables influenza virus mrnas to be translated was discussed in section iii,e. what mechanism enables influenza mrnas to bypass the block that retains host mrnas in the nucleus? suggested one possibility, namely, an influenza virus-specific transport system. but it seems simpler to look for a single explanation that would account for both the preferential transport and translation. there could be competition at the level of transport, and the same features that make a message highly translatable might make it highly transportable. an alternative view is that the two processes are coupled. one might envision 40 s ribosomal subunits monitoring the nuclear pores, such that only mrnas that can be translated under given circumstances will be transported. along those lines, babiss et al. (1985) have noted that, whereas host mrnas are neither transported nor translated in wild-type adenovirus-infected cells, transport and translation of host mrnas are coordinately restored by mutations in early viral genes that reduce the cytoplasmic accumulation of late viral mrnas. as an extension of the idea that a message will be transported only if it can be translated, one might suggest that mrnas are transported as soon as they become translatable. the consequence would be that translation could sometimes regulate the extent of splicing. some splicing events that could occur, were the transcript kept longer in the nucleus, would be prevented by "prematurely" pulling the mrna out. svensson et al. (1983) invoked this notion to explain some of their observations on the processing of adenovirus early mrnas.
coupling of splicing with transport, and transport with translation, would explain why few if any incompletely processed transcripts enter the cytoplasm: no matter how many introns are present in a primary transcript, it remains in the nucleus until every intron has been removed, that is, until it becomes translatable. it would seem as if the easiest way to judge whether a transcript is translatable is to attempt to translate it.
the shutoff of host protein synthesis by herpes simplex virus might not involve a modification in the translational machinery per se. late (second-stage) shutoff is clearly caused by the massive degradation of host mrna. the puzzle of how the nuclease is targeted, such that it degrades host but not viral mrnas, has not yet been solved. a partial explanation might be that herpes virus mrnas are more highly structured, by virtue of their high g + c content. the unusual sensitivity of herpes virus mrnas to hypertonic stress is consistent with the hypothesis that they have extensive secondary structure. the irreversible (read and frenkel, 1983) early shutoff of host translation by a structural component of the herpes virion also seems to involve cleavage of host mrnas, enough to inactivate them for translation (fenwick and mcmenamin, 1984), although they can still be detected by hybridization, more or less (nishioka and silverstein, 1978b; schek and bachenheimer, 1985). since mutants that are defective in stage-one shutoff can still induce secondary shutoff of host protein synthesis (read and frenkel, 1983), two distinct viral gene products, either nucleases or activators thereof, are apparently involved. a herpes virus mutant that is defective in stage-one host shutoff is defective in switching off the translation of early viral mrnas as well (read and frenkel, 1983). the differential accumulation of adenovirus early mrnas is also mediated, in part, by the regulated degradation of some transcripts (wilson and darnell, 1981).
degradation of host mrnas might be part of the mechanism by which vaccinia and influenza viruses reduce host translation (see table i), although clear-cut genetic evidence, such as that described for herpes virus, is lacking in those systems.
the extent to which gene expression is regulated by posttranslational proteolytic degradation is probably not fully appreciated. there are striking, isolated examples, such as the selective degradation of measles virus m protein (sheppard et al., 1985), the rapid turnover of some early adenovirus proteins (spindler and berk, 1984b), and stabilization of the cellular protein p53 by its interaction with sv40 t antigen (oren et al., 1981). given the intricacies of the ubiquitin pathway for proteolysis, it might be surprising were that pathway not perturbed by virus infection. some animal viruses might encode a function that protects foreign (i.e., viral) proteins from degradation, analogous to the pin function of bacteriophage t4 (simon et al., 1983).
although the pattern of codon usage in viral genes is sometimes different from that of the cellular genome, imbalances in the trna pool probably do not affect the yield of most viral proteins because the rate-limiting step is usually initiation rather than elongation. moreover, while there is convincing evidence for a preferred pattern of codon usage in highly expressed bacterial and yeast genes (bennetzen and hall, 1982; ikemura, 1981, 1982), codon preference seems to be more relaxed in higher eukaryotes (tso et al., 1985), and therefore the cellular trna pool might not be markedly skewed. consistent with the idea that codon usage is not a major regulatory factor in viral gene expression, the close conservation of amino acid sequences between some viruses is not always accompanied by conservation in the choice of codons (ou et al., 1982).
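the kind of comparison described above, in which amino acid sequences are conserved while the choice of codons is not, can be illustrated with a small sketch. the python below is only an illustration under stated assumptions: the function names (`split_codons`, `codon_counts`, `shared_codon_fraction`) and the two short sequences are hypothetical and are not taken from any of the cited studies. it tallies in-frame codons and reports the fraction of aligned codon positions at which two coding sequences use the identical triplet.

```python
from collections import Counter


def split_codons(seq: str) -> list[str]:
    """Split a coding sequence into in-frame triplets, starting at position 0.

    RNA input is accepted (U is read as T); a trailing partial codon is dropped.
    """
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def codon_counts(seq: str) -> Counter:
    """Count how often each in-frame codon occurs in the sequence."""
    return Counter(split_codons(seq))


def shared_codon_fraction(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned codon positions where both sequences use the same triplet."""
    codons_a = split_codons(seq_a)
    codons_b = split_codons(seq_b)
    if len(codons_a) != len(codons_b) or not codons_a:
        raise ValueError("sequences must contain the same non-zero number of codons")
    same = sum(a == b for a, b in zip(codons_a, codons_b))
    return same / len(codons_a)


# Hypothetical example: both sequences encode Met-Leu-Leu-Arg,
# but they use different synonymous codons for Leu and Arg.
gene_a = "ATGCTGCTGCGT"
gene_b = "AUGUUAUUAAGA"

print(codon_counts(gene_a))
print(shared_codon_fraction(gene_a, gene_b))
```

in this made-up pair only the initiator codon is shared, so the fraction of identical codons is low even though the encoded peptide is the same.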
the degree to which expression might be limited by trna deficiencies has been tested in escherichia coli by using cloned genes that are rich in rare codons. the availability of trna was found to limit translation only when the mrna concentration was extraordinarily high (pedersen, 1984; robinson et al., 1984). codon usage might regulate translation in more subtle ways, however. one possibility with some experimental justification is that ribosomes pause briefly at rare codons (lizardi et al., 1979; misra and reeves, 1985; varenne et al., 1984). (footnote 18: an alternative explanation for discontinuous elongation is that ribosomes pause when they encounter hairpin structures in the mrna, but that idea is without experimental support.) discontinuous elongation is not incompatible with efficient translation, as pausing has been detected during the synthesis of some very abundant proteins (cepko and sharp, 1982; lizardi et al., 1979). slowing translation in certain positions might facilitate folding of the polypeptide and/or its interaction with other components, however. the pattern of codon usage in the signal peptide portion of some genes encourages this notion (spieth et al., 1985). the suppression of nonsense codons and the occurrence of frameshifting (see section ii,c) might also be facilitated by an imbalance between the cellular trna pool and the viral pattern of codon usage.
what we have learned about the structure and function of animal virus mrnas can often be extrapolated to cellular mrnas. the mechanism of selecting the initiation site for protein synthesis certainly appears to follow a single formula. the translational machinery displays a certain flexibility (leaky scanning, frameshifting, etc.) that is exploited more frequently by viral than by cellular mrnas. that no doubt reflects the limited coding capacity of most viral genomes.
in contrast, it would seem easier and more efficient for the expansive cellular genome to separately encode two versions of a protein than to attempt to skirt the "monocistronic rule" in the ways described for viruses. it is important to remember that there are rules for breaking the monocistronic rule. using those principles, we can correctly predict the qualitative aspects of viral protein synthesis, with very few exceptions. we understand much less about the quantitative aspects of translation, however. although some of the parameters that determine efficiency have been identified in the preceding pages, or at least surmised, we usually cannot predict how efficiently a given mrna will be translated by summing the known parameters. future studies will almost certainly uncover other features that affect translational efficiency: "repressor" proteins, perhaps, or helix-unwinding proteins, or effects of 3'-noncoding sequences, or aspects of mrna primary and secondary structure that are not yet obvious. the suggestion that it is easier to block translation than to enhance it merits repetition. the most efficient mrnas might be those that cannot interact with regulatory rnas, proteins, etc. it is sometimes but not always true that viral mrnas are translated more efficiently than cellular mrnas. i persist in believing that many viruses inhibit cellular protein synthesis inadvertently, and gain little thereby. understanding the mechanism of host shutoff is nonetheless interesting. it might aid in designing virus vectors, and in understanding the conditions that promote persistent virus infections (ahmed and fields, 1982).
acknowledgments
i thank many colleagues who kindly sent reprints and preprints, and especially those who gave me permission to cite their unpublished observations.
in several instances the the 5'-terminal structure of the methylated mrna synthesized in vitro by vsv cellular protein synthesis shutoff by mengovirus: translation of nonviral and viral mrnas in extracts from uninfected and infected ascites tumor cells two forms of sv40 tantigen in abortive and lytic infection role of the s4 gene in the establishment of persistent reovirus infection in l cells reversion by hypotonic medium of the shutoff of protein synthesis induced by emc virus translation of capped viral mrnas in poliovirus-infected hela cells protein synthesis in hela cells double-infected with emc virus and poliovirus translation of capped virus mrna in emc virus-infected cells can acg serve as an initiation codon for protein synthesis in eukaryotic cells? sequences at both termini of the 10 genes of reovirus serotype 3 (dearing) the effect of infection with sindbis virus and its temperature sensitive mutants on cellular protein and dna synthesis sequencing studies of pichinde arenavirus s rna indicate a novel coding strategy, an ambisense viral s rna effect of adenovirus on metabolism of specific host mrnas: transport control and specific translational discrimination adenovirus type 5 early region l b gene product is required for efficient shutoff of host protein synthesis adenovirus e1b proteins are required for accumulation of late viral mrna and for effects on cellular mrna translation and transport polyriboadenylic acid preferentially inhibits in vitro translation of cellular compared to vaccinia virus mrnas inhibition of protein synthesis by vaccinia virus coupled transcription and translation in mammalian and avian cell-free systems the replication of picornaviruses expression from an internal aug codon of herpes simplex thymidine kinase gene inserted in a retrovirus vector the number of ribosomes on sv40 late 16s mrna is determined in part by the nucleotide sequence of its leader complete nucleotide sequence of alfalfa mosaic virus rna 3 sequence 
analysis of hepatitis a virus cdna coding for capsid proteins and rna polymerase direct mapping of adeno-associated virus capsid proteins b and c: a possible acg initiation codon structure of the fmdv translation initiation site and of the structural proteins uag readthrough during tmv rna translation: isolation and sequence of two trnastyr with suppressor activity from tobacco plants the molecular basis for differential translation of tmv rna in tobacco and wheat germ measles virus p gene codes for two proteins inhibition of hela cell protein synthesis during adenovirus infection regulatory mutants of polyoma virus defective in dna replication and the synthesis of early proteins solubilization of a protein synthesis inhibitor from vaccinia virions codon selection in yeast translational interference a t overlapping reading frames in prokaryotic mrna effect of the tripartite leader on synthesis of a nonviral protein in an adenovirus 5 recombinant poliovirus mutant that does not selectively inhibit host cell protein synthesis two small rnas encoded by epstein-barr virus can functionally substitute for the virus-associated rnas in the lytic growth of adenovirus 5 construction and analysis of additional adenovirus substitution mutants confirm the complementation of vai rna function by two small rnas encoded by epstein-barr virus structural requirements of adenovirus vai rna for its translation enhancement function differential inhibition of host cell rna synthesis in several picornavirus-infected cell lines effect of viral infection on host protein synthesis and mrna association with the cytoplasmic cytoskeletal structure intermolecular duplexes formed from polyadenylated vaccinia virus rna the 2.2 kb e1b mrna of human ad12 and ad5 codes for two tumor antigens starting at different aug triplets regulation of protein synthesis in hep2 cells and their cytoplasmic extracts after poliovirus infection globin mrnas are primers for the transcription of influenza viral rna in 
uitro both the 7-methyl and the 2'-o-methyl groups in the cap of mrna strongly influence its ability to act as primer for influenza virus rna transcription a transcript from the s segment of the germiston bunyavirus is uncapped and codes for the nucleoprotein and a nonstructural protein sequencing of coronavirus ibv genomic rna: three open reading frames in the 5' "unique" region of mrna d translation of poliovirus rna in uitro: changes in cleavage pattern and initiation sites by ribosomal salt wash initiation factor preparations from poliovirusinfected cells restrict translation in reticulocyte lysates the white pock mutants of rabbit poxvirus: in uitro translation of early host range mutant mrna synthesis of alphavirus-specified rna complex regulation of sv40 earlyregion transcription from different overlapping promoters three intergenic regions of coronavirus mouse hepatitis virus genome rna contain a sequence that is homologous to the 3'-end of the viral mrna leader sequence molecular cloning and complete sequence determination of rna genome of human rhinovirus type 14 permeabilization of cells during animal virus infection sodium ions and the shut-off of host cell protein synthesis by picornaviruses sequences of the s1 genes of the three serotypes of reovirus effects of adenovirus infection on rrna synthesis and maturation in hela cells sequence analysis of the viral core protein and membrane proteins of the flavivirus west nile virus primary structure of the west nile flavivirus genome region coding for all nonstructural proteins effect of poliovirus double-stranded rna on viral and host-cell protein synthesis translation of poliovirus rna in uitro: detection of two different initiation sites regulation of protein synthesis in vsvinfected cells by decreased initiation factor 2 activity assembly of adenovirus major capsid protein is mediated by a nonvirion protein differential translation in normal and adenovirus type 5-infected cells and cell-free systems two 
initiation sites for foot and mouth disease virus polyprotein in vivo synthesis and processing of sindbis virus nonstructural proteins in uitro overlapping of the vp2-vp3 gene and the vp1 gene in the sv4q genome transcription of vaccinia virus mrna coupled to translation in uitro in uitro translation of immediate early, early, and late classes of rna from vaccinia virus-infected cells discriminatory inhibition of protein synthesis in cell-free systems by vaccinia virus transcripts bacterial peptide chain release factors: conserved primary structure and possible frameshift regulation of release factor 2 identification of a unique guanine-7-methyltransferase in semliki forest virus infected cell extracts mechanism of interferon action: differential effect of interferon on the synthesis of sv40 and reovirus polypeptides in monkey kidney cells structure-function relationship of rous sarcoma virus leader rna a ribosome binding site sequence is necessary for efficient expression of the distal gene of a translationally-coupled gene pair nucleotide sequence of a viral rna fragment that binds to eukaryotic ribosomes sequence of the 3'-untranslated region of brome mosaic virus coat protein messenger rna genome expression of plant positive-strand rna viruses inhibition of mrna binding to ribosomes by localized activation of dsrna-dependent protein kinase in vitro synthesis of the nonstructural c protein of sendai virus translational specificity in reovirus-infected mouse fibroblasts gene organization of the transforming region of adenovirus type 7 dna initiation of translation of the cauliflower mosaic virus genome from a polycistronic mrna: evidence from deletion mutagenesis oligonucleotide directed mutagenesis of cauliflower mosaic virus dna using a repair-resistant nucleoside analogue identifies an agnogene initiation codon in uitro translation of poliovirus rna: utilization of internal initiation sites in reticulocyte lysate peptide maps and nterminal sequences of 
polypeptides from early region 1a of human adenovirus 5 catalytic utilization of eif-2 and mrna binding proteins are limiting in lysates from vsv infected l cells host range restriction of vaccinia virus in cho cells: relationship to shutoff of protein synthesis regulation of initiation factors during translational repression caused by serum depletion cellular levels and covalent modification of the subunits of the cap binding protein complex, eif-4f protein synthesis factors 4a and 4b are not altered by poliovirus infection of hela cells complete nucleotide sequence of bacteriophage t7 dna and the locations of t7 genetic elements functional characterization of eukaryotic mrna cap binding protein complex: effects on translation of capped and naturally uncapped rnas sequence relationship of glycosylated and unglycosylated gag polyproteins of moloney murine leukemia virus mapping the major transcripts of ground squirrel hepatitis virus: the presumptive template for reverse transcriptase is terminally redundant reovirus hemagglutinin mrna codes for two polypeptides in overlapping reading frames the complete sequence of the m rna of snowshoe hare bunyavirus reveals the presence of internal hydrophobic domains in the viral glycoprotein analyses of the mrna transcription processes of snowshoe hare bunyavirus s and m rna species human rhinovirus 14 infection of hela cells results in the proteolytic cleavage of the p220 cap binding subunit viral translation proteolysis of a 220,000 dalton polypeptide associated with eif3 and a cap binding protein complex early virion-associated suppression of cellular protein synthesis by herpes simplex virus is accompanied by inactivation of mrna suppression of the synthesis of cellular macromolecules by herpes simplex virus structural difference between the 5' termini of viral and cellular mrna in poliovirus-infected cells: possible basis for the inhibition of host protein synthesis effect of adenovirus infection on expression of human 
histone genes nucleotide sequence and genome organization of foot and mouth disease virus evidence against the role of k + in the shutoff of protein synthesis by vsv temporal regulation of baculovirus rna: overlapping early and late transcripts early and late functions in a bipartite rna virus: evidence for translational control by competition between viral mrnas bunyavirus nucleoprotein, n, and a nonstructural protein, nss, are coded by overlapping reading frames in the s rna reovirus messenger rna contains a methylated, blocked 5'-terminal structure: m7g(5')ppp(s')gmpcp mechanism of formation of reovirus mrna 5'-terminal blocked and methylated sequence, m7gpppgmpc 5'-terminal structure and mrna stability enhanced poliovirus replication in cytomegalovirus-infected human fibroblasts na+ and k + concentrations and the regulation of the interferon system in chick cells na+ and k + concentrations and the regulation of protein synthesis in sindbis virus-infected chick cells 5'-conformation of capped alfalfa mosaic virus rna 4 may reflect its independence of cap structure or of cap-binding protein for efficient translation sv40 early mrnas contain multiple 5'-termini upstream and downstream from a hogness-goldberg sequence; a shift in 5' termini during the lytic cycle is mediated by large t antigen heterogeneity and 5'4erminal structures of the late rnas of sv40 protein synthesis in lysates of aedes albopictus cells infected with vsv sendai virus contains overlapping genes expressed from a single mrna translational discrimination between the four rnas of alfalfa mosaic virus cap accessibility correlates with the initiation efficiency of amv rnas competition between cellular and viral mrnas in uitro is regulated by a messenger discriminatory initiation factor protein synthesis in cells infected with semliki forest virus is not controlled by intracellular cation changes new initiation factor activity required for globin mrna translation inhibition of dna-dependent 
transcription by the leader rna of vsv sv40 recombinant molecules express the gene encoding p21 transforming protein of harvey murine sarcoma virus sequence of the black beetle virus subgenomic rna and its location in the viral genome nucleotide sequence and genome organization of carnation mottle virus rna utilization of internal aug codons for initiation of protein synthesis directed by mrnas from normal and mutant genes encoding herpes simplex virusspecified thymidine kinase synthesis in vitro of a seven amino acid peptide encoded in the leader rna of rous sarcoma virus sodium and potassium transport in herpes simplex virus-infected cells adenovirus early region 4 encodes functions required for efficient dna replication, late gene expression, and host cell shutoff effects of deletions on expression of the hsv thymidine kinase gene from the intact viral genome: the amino terminus is dispensable for catalytic activity influenza virus mrnas are incomplete transcripts of the genome rnas attenuation of late sv40 mrna synthesis is enhanced by the agnoprotein and is temporally regulated in isolated nuclear systems large surface proteins of hepatitis v virus containing the pre-s sequence inhibition of host cell protein synthesis by uvinactivated poliovirus control of protein synthesis in extracts from poliovirus-infected cells isolation and preliminary characterization of temperature-sensitive mutants of poliovirus type 1 virion component of hsv type 1 kos interferes with early shutoff of host protein synthesis induced by hsv type 2 immunological detection of the mrna cap-binding protein site-specific recombination of bacteriophage a: dna sequence of regulatory regions and overlapping structural genes for int and xis characterization of the infections of permissive cells by host range mutants of vsv defective in rna methylation control of expression of the vaccinia virus thymidine kinase gene mapping and identification of the vaccinia virus thymidine kinase gene 
mutation of a termination codon affects src initiation alfalfa mosaic virus temperature sensitive mutants transcriptioncoupled translation of silkworm cytoplasmic polyhedrosis virus genomic rna in xenopus oocytes correlation between the abundance ofe. coli transfer rnas and the occurrence of the respective codons in its protein genes correlation between the abundance of yeast transfer rnas and the occurrence of the respective codons in protein genes inhibition of host protein synthesis and degradation of cellular mrnas during infection by influenza and hsv constitutive and conditional suppression of exogenous and endogenous genes by anti-sense rna expression of the b u s sarcoma virus pol gene by ribosomal frameshifting biosynthesis of reovirus-specified polypeptides: the reovirus sl mrna encodes two primary translation products biosynthesis of reovirus-specified polypeptides polypeptide cleavages in the formation of poliovirus proteins requirement for adenovirus dna-binding protein and va-i rna for production of adeno-associated virus polypeptides adeno-associated virus proteins: origin of the capsid components identification of the sv40 agnogene product: a dna binding protein further studies on the inhibition of cellular protein synthesis by vsv inhibition of host translation in emc virus-infected l cells: a novel mechanism comparison of initiation rates of emc virus and host protein synthesis in infected cells shutoff of hela cell protein synthesis by emc virus and poliovirus: a comparative study two initiation sites for translation of poliovirus rna in uitro: comparison of lsc and mahoney strains evidence for translational regulation of hsv type 1 gd expression restriction of vsv in a nonpermissive rabbit cell line is at the level of protein synthesis inhibition of cell functions by rna-virus infections lysis gene expression of rna phage ms2 depends on a frameshift during translation of the overlapping coat protein gene deletions of n-terminal sequences of 
polyoma virus t-antigens reduce but do not abolish transformation of rat fibroblasts role of the avian retrovirus mrna leader in expression: evidence for novel translation control metabolism and expression of rna polymerase i1 transcripts in influenza virus-infected cells nuclear-cytoplasmic transport and vai rna-independent translation of influenza viral mrnas in late adenovirusinfected cells translational control by influenza virus: suppression of the kinase that phosphorylates the alpha subunit of initiation factor eif-2 and selective translation of influenza viral mrnas identification of the components necessary for adenovirus translational control and their utilization in cdna expression vectors viral protein synthesis in barley protoplasts inoculated with native and fractionated brome mosaic virus rna primary structure, gene organization and polypeptide expression of poliovirus rna interferon regulates c-myc gene expression in daudi cells at the post-transcriptional level protein synthesis directed by the rna from a plant virus in a normal animal cell how do eucaryotic ribosomes select initiation regions in messenger rna? 
inability of circular mrna to attach to eukaryotic ribosomes migration of 40s ribosomal subunits on mrna when initiation is perturbed by lowering magnesium or adding drugs evaluation of the "scanning model" for initiation of protein synthesis in eucaryotes possible role of flanking nucleotides in recognition of the aug initiator codon by eukaryotic ribosomes mechanism of mrna recognition by eukaryotic ribosomes during intiation of protein synthesis analysis of ribosome binding sites from the sl message of reovirus: initiation a t the first and second aug codons comparison of initiation of protein synthesis in procaryotes, eucaryotes, and organelles translation of insulin-related polypeptides from mrnas with tandemly reiterated copies of the ribosome binding site compilation and analysis of sequences upstream from the translational start site in eukaryotic mrnas selection of initiation sites by eucaryotic ribosomes: effect of inserting aug triplets upstream from the coding sequence for preproinsulin point mutations define a sequence flanking the aug initiator codon that modulates translation by eucaryotic ribosomes influences of mrna secondary structure on initiation by eucaryotic ribosomes migration of 40s ribosomal subunits on mrna in the presence of edeine identification of features in 5'-terminal fragments from reovirus mrna which are important for ribosome binding priming of influenza viral rna transcription by capped heterologous rnas relationship between membrane integrity and the inhibition of host translation in virus-infected mammalian cells characterization of leader 127, 359-366. 
virus-infected cells rna sequences on the virion and mrnas of mouse hepatitis virus, a cytoplasmic rna virus nucleotide sequence of the gag gene and gag-pol junction of feline leukemia virus synthesis of hepatitis b surface antigen in mammalian cells: expression of the entire gene and the coding region translation of adenovirus serotype 2 late mrnas influenza virus structural and nonstructural proteins in infected cells and their plasma membranes inactivation of cap-binding proteins accompanies the shut-off of host protein synthesis by poliovirus isolation and structural characterization of cap-binding proteins from poliovirus-infected hela cells poliovirus protease 3c (p3-7c) does not cleave p220 of the eucaryotic mrna cap-binding protein complex expression of vaccinia virus early mrnas in ehrlich ascites tumor cells. part of the polysomes at an early stage of virus infection are not bound to the cytoskeleton translation of cellular and viral early mrna in cell-free systems from uninfected and (vaccinia) virusinfected cells a t the early stage mrna discrimination in extracts from uninfected and reovirus-infected l-cells polyadenylic acid addition sites in the adenovirus type 2 major late transcription unit biochemistry of interferons and their actions the cytoskeletal framework and poliovirus metabolism molecular cloning and partial sequencing of hepatitis a viral cdna initiation of translation at internal aug codons in mammalian cells discontinuous translation of silk fibroin in a reticulocyte cell-free system and in intact silk gland cells poliovirus protease does not mediate cleavage of the 220,000-da component of the cap binding protein complex translational control of protein synthesis after infection by vsv vsv mrna and inhibition of translation of cellular mrna-is there a p function in vsv? 
key: cord-000264-o80duxhs authors: chandramouli, kondethimmanahalli; qian, pei-yuan title: proteomics: challenges, techniques and possibilities to overcome biological sample complexity date: 2009-12-08 journal: hum genomics proteomics doi: 10.4061/2009/239204 sha: doc_id: 264 cord_uid: o80duxhs proteomics is the large-scale study of the structure and function of proteins in complex biological sample. such an approach has the potential value to understand the complex nature of the organism. current proteomic tools allow large-scale, high-throughput analyses for the detection, identification, and functional investigation of proteome. advances in protein fractionation and labeling techniques have improved protein identification to include the least abundant proteins. in addition, proteomics has been complemented by the analysis of posttranslational modifications and techniques for the quantitative comparison of different proteomes. however, the major limitation of proteomic investigations remains the complexity of biological structures and physiological processes, rendering the path of exploration paved with various difficulties and pitfalls. the quantity of data that is acquired with new techniques places new challenges on data processing and analysis. 
this article provides a brief overview of currently available proteomic techniques and their applications, followed by a detailed description of their advantages and technical challenges. some solutions to circumvent technical difficulties are proposed. the term proteomics describes the study and characterization of the complete set of proteins present in a cell, organ, or organism at a given time [1]. in general, proteomic approaches can be used (a) for proteome profiling, (b) for comparative expression analysis of two or more protein samples, (c) for the localization and identification of posttranslational modifications, and (d) for the study of protein-protein interactions. the human genome harbors 26000-31000 protein-encoding genes [2], whereas the total number of human protein products, including splice variants and essential posttranslational modifications (ptms), has been estimated to be close to one million [3, 4]. it is evident that most of the functional information on the genes resides in the proteome, which is the sum of multiple dynamic processes that include protein phosphorylation, protein trafficking, localization, and protein-protein interactions [5]. moreover, the proteomes of mammalian cells, tissues, and body fluids are complex and display a wide dynamic range of protein concentrations: one cell can contain between one and more than 100000 copies of a single protein [6]. in spite of new technologies, several requirements are not yet fulfilled: analysis of complex biological mixtures, the ability to quantify separated protein species, sufficient sensitivity for proteins of low abundance, quantification over a wide dynamic range, the ability to analyze protein complexes, and high-throughput applications [7]. biomarker discovery remains a very challenging task due to the complexity of the samples (e.g., serum, other bodily fluids, or tissues) and the wide dynamic range of protein concentrations [8]. 
most of the serum biomarker studies performed to date seem to have converged on a set of proteins that are repeatedly identified in many studies and that represent only a small fraction of the entire blood proteome [9]. processing and analysis of proteomics data is indeed a very complex multistep process [10, 11]. the consistent and transparent analysis of lc/ms and lc-ms/ms data requires multiple stages [12], and this process remains the main bottleneck for many larger proteomics studies. to overcome these issues, effective sample preparation (to reduce complexity and to enrich for lower abundance components while depleting the most abundant ones), state-of-the-art mass spectrometry instrumentation, and extensive data processing and data analysis are required. a wide range of proteomic approaches is available. gel-based applications include one-dimensional and two-dimensional polyacrylamide gel electrophoresis [13, 14]. gel-free high-throughput screening technologies are equally available, including multidimensional protein identification technology [15], isotope-coded affinity tags (icat) [16], stable isotope labeling by amino acids in cell culture (silac) [17], and isobaric tagging for relative and absolute quantitation (itraq) [18]. shotgun proteomics [19] and 2d-dige [20] as well as protein microarrays [21, 22] are applied to obtain overviews of protein expression in tissues, cells, and organelles. large-scale western blot assays [23], multiple reaction monitoring (mrm) assays [24], and label-free quantification of high mass resolution lc-ms data [25] are being explored for high-throughput analysis. many different bioinformatics tools have been developed to aid research in this field, for example by optimizing the storage and accessibility of proteomic data or by statistically ascertaining the significance of protein identifications made from a single peptide match [26]. 
in this review we attempt to provide an overview of the major developments in the field of proteomics, some success stories, and the challenges that are currently being faced. about 20-30% of all genes in an organism encode integral membrane proteins, which are involved in numerous cellular processes [27]. membrane proteins constitute 30% of the typical proteome, yet their propensity to aggregate and precipitate in solution confounds their analysis [28]. the target residues for tryptic cleavage (i.e., lysine and arginine) are mainly absent in transmembrane helices and preferentially found in the hydrophilic part of these lipid bilayer-incorporated proteins. because proteins tend to aggregate during the ief step, 2de is unsuitable for the separation of integral membrane proteins and is limited to the detection of membrane-associated proteins and membrane proteins with low hydrophobicity [29]. membrane solubilization methods have been deployed to analyze enriched membrane fractions and address the solubility issue by using detergents [30], organic solvents [31], and organic acids [32] compatible with subsequent proteolytic digestion or chemical cleavage, separation, and analysis by lc/ms. three such approaches have been described: (1) an enriched yeast membrane fraction is solubilized with 90% formic acid in the presence of cyanogen bromide; the concentrated organic acid provides the solubilization agent, and cyanogen bromide, which is functional under acidic conditions, allows many embedded membrane proteins to be cleaved; (2) a membrane-enriched microsomal fraction is solubilized by boiling in 0.5% sds and, following isotope-coded affinity tag (icat) labeling, is diluted to reduce the concentration of sds; and (3) the proteins of an enriched membrane sample are thermally denatured and sonicated in 60% organic solvent (methanol) in the presence of trypsin. the resultant peptide mixture is then analyzed by lc/ms. 
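the scarcity of tryptic cleavage sites in transmembrane helices noted above can be illustrated with a minimal in silico digestion sketch. it assumes the common simplification that trypsin cleaves c-terminal to lysine (k) or arginine (r) unless the next residue is proline; the function name and both example sequences are hypothetical and are not taken from the cited studies.

```python
import re
from typing import List

# Simplified trypsin rule (an assumption for illustration): cut after K or R,
# but not when the following residue is P.
_TRYPSIN_SITE = re.compile(r"(?<=[KR])(?!P)")


def tryptic_peptides(sequence: str) -> List[str]:
    """Return the peptides from a complete in silico tryptic digest."""
    # The pattern is zero-width, so no residues are consumed by the split;
    # empty pieces (e.g. after a C-terminal K/R) are dropped.
    return [p for p in _TRYPSIN_SITE.split(sequence.upper()) if p]


# Hypothetical sequences: a lysine/arginine-rich hydrophilic stretch and a
# hydrophobic helix-like stretch that contains no K or R at all.
hydrophilic_loop = "MKTAYRAKQRLEDK"
transmembrane_helix = "LLIVAFLGALLVIAGLLFVI"

print(tryptic_peptides(hydrophilic_loop))     # several short peptides
print(tryptic_peptides(transmembrane_helix))  # one uncut 20-residue peptide
```

under these assumptions the hydrophobic stretch yields a single long peptide, which is one reason the approaches above combine strong solubilization with chemical cleavage or alternative proteases.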
all three of these methods are effective and optimize the identification of membrane proteins. another method, using high ph and proteinase k, is optimized specifically for the global analysis of both membrane and soluble proteins [33]. high ph favors the formation of membrane sheets, while proteinase k cleaves exposed hydrophilic domains of membrane proteins. the commercially available nonionic detergents dodecyl maltoside and decaethylene glycol monohexadecyl ether have proved to be the most efficient membrane protein solubilizers [34]. another, more successful approach to isolating membrane proteins relies on cell surface labeling in combination with high-resolution two-dimensional (2d) lc-ms/ms [35]. in addition, improved analytical tools should be developed, namely multidimensional liquid chromatography of peptide mixtures generated from membrane proteins, nanoflow chromatographic techniques for hydrophobic transmembrane peptides, and native electrophoresis of membrane protein complexes, which, in combination with mass spectrometry, should lead to the identification of the majority of proteins in the membrane proteome of simple microorganisms. it is important not only to quantify the identified membrane proteins but also to determine the levels of their interacting partners. subcellular fractionation techniques that employ a combination of centrifugation steps are a common choice for preparing plasma membrane (pm)-enriched fractions, including detergent-resistant membrane fractions, commonly known as lipid rafts. these methods can offer a significant improvement in specificity for pm proteins over approaches that do not perform any subcellular fractionation but rather use whole-cell or tissue preparations [36]. chemical-tagging methods [37] are a more widely applied technique for enriching pm proteins and are often used in conjunction with physical separation strategies. 
this method allows a specific class of protein or modification of interest to be physically separated from other nontagged proteins. importantly, when chemical tags are attached to the extracellular domain of pm proteins on intact cells, they offer unrivaled specificity for pm proteins, because they provide a way to distinguish true pm proteins from intracellular contaminants. cell-surface biotinylation, the covalent attachment of a biotin tag to the extracellular domain of pm proteins, is also a popular choice [38] [39] [40]. serum is a complex body fluid containing a large diversity of proteins. more than 10000 different proteins are present in human serum, and many of them are secreted or shed by cells during different physiological or pathological processes [41]. serum is expected to be an excellent source of protein biomarkers because it circulates through, or comes in contact with, all tissues. consequently, serum proteomics has raised great expectations for the discovery of biomarkers to improve the diagnosis or classification of a wide range of diseases, including cancers [42]. however, serum has been termed the most complex human proteome [43], with considerable differences in the concentrations of individual proteins, ranging from several milligrams to less than one picogram per milliliter [44]. the analytical challenge for biomarker discovery arises from the high variability in the concentration and state of modification of some human plasma proteins between different individuals [45]. albumin is a protein of very high abundance in serum (35-50 mg/ml) and would be a prime candidate for complete selective removal prior to performing a proteomic analysis of lower abundance proteins. however, albumin acts as a carrier protein, so its removal may also result in the specific removal of low abundance cytokines, peptide hormones, and lipoproteins of interest. 
immunoglobulins (antibodies) are also abundant proteins in serum; they function by recognizing "foreign" antigens in blood and initiating their destruction [46]. the presence of higher abundance proteins interferes with the identification and quantification of lower abundance proteins (lower than ng/ml in serum). the complexity and dynamic range of protein concentrations can be addressed with a combination of prefractionation techniques that deplete highly abundant proteins and fractionate the remaining sample. heparin chromatography coupled with protein g appears to be an efficient and economical strategy to pretreat serum for serum proteomics [47]. protein prefractionation by immunodepletion and reversed-phase separation of the depleted plasma on an mrp-c18 column provide methods compatible with lc-ms-based analysis. polyclonal antibody-based systems can rapidly deplete multiple highly abundant proteins in serum, plasma, csf, and other biological fluids; individual antibody materials are mixed in selected percentages and packed into a column format. albumin can be removed by immunoaffinity columns [48], isoelectric trapping [49], dye-ligand chromatography [50], and peptide affinity chromatography [51]. another approach involves the removal of igg by affinity chromatography using immobilized protein a or protein g [52]. a recently developed depletion method, the multiple affinity removal system (mars), which mixes 6 high-specificity polyclonal antibodies to remove the top 6 proteins in a single purification step, is commercially available [53]. the human-14 multiple affinity removal column depletes the top 14 abundant proteins from human serum, plasma, csf, and other biological fluids. to address the limitations of 2d electrophoresis, several types of mass spectrometry, in conjunction with various separation and analysis methods, are increasingly being adopted for proteomic measurements [54]. in contrast to 2d-page analysis, seldi-tof ms is a rather new method which is especially valuable for the identification of serum-derived biomarkers [55]. 
this method is based on proteinchip arrays which carry various chromatographic properties, such as anion exchange, cation exchange, and hydrophilic or hydrophobic surfaces [56]. for the analysis of serum, only 5-10 μl of serum sample is applied to these surfaces; after washing off unbound material, the protein fingerprint can be determined and visualized by time-of-flight mass spectrometry. the advantages of this method are the low amount of sample necessary for analysis, its speed, and its high-throughput capability. many different groups have used this method and related methods based on prefractionation of serum proteins by beads and subsequent maldi analysis for the identification of biomarkers in serum, urine, pancreatic juice, and other biological fluids [57]. the necessity of this removal or separation is also illustrated by the many proteins that have been found useful as biomarkers [58]. different fractionation steps (such as electrophoresis, seldi, and liquid chromatography) have been developed to reduce the complexity of the serum proteome and to allow the detection and identification of single proteins [59]. 2de and maldi ms have been applied to identify candidate biomarkers at early and late stages of lung cancer. this approach identified 46 proteins in tumor-bearing mice, including disease-regulated expression of orosomucoid-8, alpha-2-macroglobulin, apolipoprotein-a1, apolipoprotein-c3, glutathione peroxidase-3, plasma retinol-binding protein, and transthyretin [60]. recently, 1065 proteins were identified by a stable isotope labeled proteome (silap) standard coupled with extensive multidimensional separation and tandem mass spectrometry, of which 121 proteins were present at 1.5-fold or greater concentrations in the sera of patients with pancreatic cancer [61]. specimen collection (blood, serum, plasma samples) is an integral component of clinical research. 
access to high-quality specimens, collected and handled in standardized ways that minimize potential bias or confounding factors, is key to the "bench to bedside" aim of translational research [62]. variables that may impact analytic outcomes include (1) the type of additive in the blood collection tubes; (2) sample processing times or temperatures; (3) hemolysis of the sample; (4) sample storage parameters; and (5) the number of freeze-thaw cycles [63, 64]. the key requirement in any analysis is that the case and control samples are handled in exactly the same manner throughout the entire analytical process, from study design and collection of samples to data analysis [63, 65]. differences in handling between samples could have a significant impact on the stability of proteins or other molecules of interest in the specimens. small differences in the processing or handling of a specimen can have dramatic effects on analytical reliability and reproducibility, especially when multiplex methods are used. a representative working group (a standard operating procedures internal working group), composed of members from across the early detection research network, should be formed to develop standard operating procedures (sops) for the various types of specimens collected and managed for biomarker discovery and validation work. limitations of two-dimensional electrophoresis. figure 1 gives the general work flow in proteomics and table 1 addresses the strengths and limitations of the techniques. two-dimensional electrophoresis (2de) was developed two decades before the term proteomics was coined [66, 67]. 2de entails the separation of complex protein mixtures by molecular charge in the first dimension and by mass in the second dimension. 2de analysis provides several types of information about the hundreds of proteins investigated simultaneously, including molecular weight, pi, and quantity, as well as possible posttranslational modifications. 
2de is extensively used, but mostly for qualitative experiments, and the method falls short in its reproducibility, its inability to detect low abundance and hydrophobic proteins, and its low sensitivity in identifying proteins with pi values too low (pi < 3) or too high (pi > 10) and molecular masses too small (mr < 10 kd) or too large (mr > 150 kd) [2] [3] [4] [5]. poor separation of basic proteins due to "streaking" of spots and poor resolution of membrane proteins [68] are limiting factors in 2de. however, 2de is the only technique that can be routinely applied for parallel quantitative expression profiling of complex protein mixtures such as whole cell and tissue lysates [69], and it is the most widely used method for efficiently separating proteins, their variants, and modifications (up to 15000 proteins). there are two ways to study posttranslational modifications by means of 2de. first, posttranslational modifications that alter the molecular weight and/or pi of a protein are reflected in a shift in location of the corresponding protein spot on the proteomic pattern. second, in combination with western blotting, antibodies specific for posttranslational modifications can reveal spots on 2de patterns containing proteins with these modifications [70]. protein extraction and solubilization are key steps for proteomic analysis using 2de. highly hydrophobic proteins tend to precipitate during isoelectric focusing (ief), and the low copy number and insolubility of transmembrane proteins render quantitative analysis of these peptides and polypeptides very challenging [71]. in order to enhance protein extraction and solubilization, different treatments and conditions are necessary to efficiently solubilize different types of protein extracts [72, 73]. the major challenge for protein visualization in 2de is the compatibility of sensitive protein staining methods with mass spectrometric analysis. 
therefore, several fluorescent staining methods have been developed for the visualization of 2de patterns, including sypro stains and cy-dyes [74]. although sypro ruby [75] and silver staining [76, 77] have comparable sensitivity, sypro ruby staining allows much higher reproducibility, a significantly wider dynamic range, and less false-positive staining. in addition, sypro ruby allows for the detection of lipoproteins, glycoproteins, metalloproteins, calcium-binding proteins, fibrillar proteins, and low molecular weight proteins that are less "stainable" using other methods. finally, a large number of protein spots on 2de patterns contain several proteins with a similar pi. a ph gradient with a narrow range allows zooming into different proteins with the same molecular weight. an increased separation distance (40 × 40 cm gels) using ca-ief [78] could increase the proteome coverage up to 5000 proteins. the use of overlapping narrow-range ipg "zoom" gels and an increase in separation area could yield better membrane protein separation [79]. this technology, however, is biased against certain classes of proteins, including low abundance and hydrophobic proteins. proteins can also be fluorescently labelled with cy2, cy3, or cy5 prior to 2de [80]. cydyes are cyanine dyes carrying an n-hydroxysuccinimidyl ester reactive group that covalently binds the ε-amino group of lysine residues in proteins. during dige [81], proteins in each of up to 3 samples can be labelled with one of these fluorescent dyes, and the differentially labelled samples can be mixed and loaded together on a single gel, allowing the quantitative comparative analysis of three samples using a single gel (figure 2). the dige technique has exhibited higher sensitivity as well as linearity, eliminated postelectrophoretic processing (fixing and destaining) steps, and enhanced reproducibility by directly comparing samples under similar electrophoretic conditions [81, 82]. 
the resulting images are then analyzed by software such as decyder, which is specifically designed for 2d-dige analysis [83]. the major advantages of 2d-dige are the high sensitivity and linearity of its dyes, its straightforward protocol, and its significant reduction of intergel variability, which increases the possibility of unambiguously identifying biological variability and reduces bias from experimental variation. moreover, the use of a pooled internal standard, loaded together with the control and experimental samples, increases quantification accuracy and statistical confidence [84]. the dige technique has dramatically improved the reproducibility, sensitivity, and accuracy of quantitation; however, it has some limitations: proteins without lysine cannot be labeled, special equipment is required for visualization, and the fluorophores are very expensive [83, 85]. isotope-coded affinity tag (icat). gel-free, or ms-based, proteomics techniques are emerging as the methods of choice for quantitatively comparing protein levels among biological proteomes, since they are more sensitive and reproducible than two-dimensional gel-based methods. icat is one of the most employed chemical isotope labeling methods and the first quantitative proteomic method to be based solely on ms [86, 87]. each icat reagent consists of three essential groups: a thiol-reactive group, an isotope-coded light or heavy linker, and a biotin segment to facilitate peptide enrichment. in an icat experiment, protein samples are first labeled with either light or heavy icat reagents on cysteine thiols. the mixtures of labeled proteins are then digested by trypsin and separated through a multistep chromatographic separation procedure. peptides are identified with tandem ms, and the relative quantification of peptides is inferred from the integrated lc peak areas of the heavy and light versions of the icat-labeled peptides [88]. 
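the relative quantification step described above can be sketched as follows. this is a minimal illustration, not the procedure of any of the software packages cited in this review: the function names are hypothetical, the peak areas are made-up numbers, and summarizing a protein by the median of its peptide ratios is an assumption chosen for simplicity.

```python
from math import log2
from statistics import median
from typing import Iterable, Tuple


def heavy_to_light_ratio(light_area: float, heavy_area: float) -> float:
    """Ratio of integrated LC peak areas for one heavy/light peptide pair."""
    if light_area <= 0 or heavy_area <= 0:
        raise ValueError("peak areas must be positive")
    return heavy_area / light_area


def protein_log2_ratio(peptide_areas: Iterable[Tuple[float, float]]) -> float:
    """Summarize (light_area, heavy_area) pairs from one protein as a log2 ratio.

    The median is used here only as a simple, outlier-tolerant summary;
    it is an illustrative choice, not a prescribed method.
    """
    ratios = [heavy_to_light_ratio(light, heavy) for light, heavy in peptide_areas]
    return log2(median(ratios))


# Made-up peak areas for three cysteine-containing peptides of one protein.
example_pairs = [(1.0e6, 2.0e6), (5.0e5, 1.0e6), (2.0e5, 6.0e5)]
print(protein_log2_ratio(example_pairs))
```

a log2 ratio of 0 would indicate equal abundance in the two samples, and positive values indicate higher abundance in the heavy-labeled sample.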
the icat concept has been widely used since its introduction [89] [90] [91] . different software programs were developed to analyze icat-labeled ms data (e.g., proicat from applied biosystems, spectrum mill from agilent technologies, and sashimi from the institute for systems biology [92] ). icat is extremely helpful for detecting peptides with low expression levels, which is one of the bottlenecks in analytical protein techniques [93, 94] . however, major limitations of this technique include selective detection of proteins with high cysteine content and difficulties in the detection of acidic proteins [95, 96] . methods for the direct comparison of dige and icat for the identification and quantification of proteins in complex biological mixtures are also being considered [97] . because the icat reagent reacts only with the free sulfhydryl group of cysteine, and 8% of proteins contain no cysteine, stable isotope labeling with amino acids in cell culture (silac) has emerged as a valuable proteomic technique [98] that is becoming more common across cell types and has been applied in many fields [99] [100] [101] . the silac technique can be effectively expanded to compare the differential expression levels of tissue proteomes at different pathological states, which allows new candidate biomarkers to be identified [102] . compared with icat, a popular in vitro labeling method, silac is an in vivo coding method that requires no chemical manipulation, and there is very little chemical difference between the isotopically labeled amino acid and its naturally occurring counterpart [103] . in addition, the amount of labeled protein required for analysis with silac is far less than that required with icat. therefore, the silac-based method has been broadly applied in many areas of cell biology and proteomics. 
besides being powerful in comparative/differential proteomics, the silac-based quantitative method has been widely used for analyzing protein posttranslational modifications such as phosphorylation, for detecting protein-protein or peptide-protein interactions, and for investigating signal transduction pathways [104, 105] . although silac-based methods have numerous advantages over chemical labeling, a major drawback of silac is that it cannot be applied directly to tissue protein analysis. to overcome this shortcoming, silac has been successfully extended to tissue proteomes based on 15n isotope labeling [106] . microorganisms such as the malaria parasite can be labeled with isoleucine [107] . more recently, the culture-derived isotope tags (cdits) method was developed as an alternative quantitative approach, based on the application of silac, for studying the proteome of mammalian tissues [108] . 18o stable isotope labeling. differential 16o/18o coding relies on the 18o exchange that takes place at the c-terminal carboxyl group of proteolytic fragments, where two 16o atoms are typically replaced by two 18o atoms through enzyme-catalyzed oxygen exchange in the presence of h2 18o [109] . the resulting mass shift between differentially labeled peptide ions permits identification, characterization, and quantitation of the proteins from which the peptides are proteolytically generated. in contrast to icat, 18o labeling does not favor peptides containing certain amino acids (e.g., cysteine), nor does it require an additional affinity step to enrich for these peptides [110] . unlike itraq, 16o/18o labeling does not require a specific ms platform, nor does it depend on fragmentation spectra (ms2) for quantitative peptide measurements. it is amenable to the labeling of human specimens (e.g., plasma, serum, tissues), which cannot be labeled by metabolic approaches (e.g., silac). 
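the size of the 16o/18o mass shift follows directly from the isotope masses. the short sketch below computes it for the exchange of two oxygen atoms and shows how the spacing between the labeled and unlabeled peptide signals shrinks with charge state; the function names are illustrative assumptions, and the isotope masses are standard values rounded to six decimals.

```python
# monoisotopic masses in daltons (standard values, rounded)
MASS_16O = 15.994915
MASS_18O = 17.999160


def o18_mass_shift(n_oxygens=2):
    """Mass difference caused by replacing n 16O atoms with 18O."""
    return n_oxygens * (MASS_18O - MASS_16O)


def mz_spacing(charge, n_oxygens=2):
    """Spacing on the m/z axis between labeled and unlabeled ions."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return o18_mass_shift(n_oxygens) / charge


shift = o18_mass_shift()          # about 4.0085 Da for two exchanged oxygens
spacing_2plus = mz_spacing(2)     # about 2.0042 m/z units for a 2+ ion
```

incomplete exchange, in which only one oxygen is replaced, would give a shift of about 2.004 da instead, which is one reason inhomogeneous 18o incorporation complicates ratio calculations.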
taken together, recent advancements in the homogeneity of 18o incorporation, improvements made on algorithms employed for calculating 16o/18o ratios and the inherent simplicity of this technique should result in increased use of 18o labeling [111] . in general, 18o labeling suffers from two potential drawbacks, inhomogeneous 18o incorporation and inability to compare multiple samples within a single experiment. a dual 18o labeling using a non-gel-based platform has been developed to overcome the major problems of existing proteolytic 18o labeling methods [112] . (itraq). the itraq reagent is well known for relative and absolute quantitation of proteins. the itraq technology offers several advantages, which include the ability to multiplex several samples, quantification, simplified analysis and increased analytical precision and accuracy [113] [114] [115] . the interest of this multiplexing reagent is that 4 or 8 analysis samples [116] can be quantified simultaneously. in this technique, the introduction of stable isotopes using itraq reagents occurs on the level of proteolytic peptides ( figure 3 ). this technology uses an nhs ester derivative to modify primary amino groups by linking a mass balance group (carbonyl group) and a reporter group (based on n-methylpiperazine) to proteolytic peptides via the formation of an amide bond [117] . due to the isobaric mass design of the itraq reagents, differentially labelled peptides appear as a single peak in ms scans, reducing the probability of peak overlapping. when itraq-tagged peptides are subjected to ms/ms analysis, the mass balancing carbonyl moiety is released as a neutral fragment, liberating the isotope-encoded reporter ions which provides relative quantitative information on proteins. 
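how reporter ions yield relative quantitative information can be sketched as follows. the example assumes a 4-plex reagent with reporter ions at nominal m/z 114 to 117, uses channel 114 as the reference, and summarizes a protein by the median of its peptide-level ratios; these choices, the function names, and the intensity values are illustrative assumptions, not a description of any vendor software.

```python
from statistics import median


def channel_ratios(reporters, reference=114):
    """Ratio of every reporter-ion intensity to the reference channel."""
    ref_intensity = reporters[reference]
    if ref_intensity <= 0:
        raise ValueError("reference channel has no signal")
    return {channel: intensity / ref_intensity
            for channel, intensity in reporters.items()}


def protein_ratios(peptide_spectra, reference=114):
    """Median of the peptide-level ratios, computed per channel."""
    per_peptide = [channel_ratios(s, reference) for s in peptide_spectra]
    channels = per_peptide[0].keys()
    return {ch: median(r[ch] for r in per_peptide) for ch in channels}


# hypothetical reporter intensities from three ms/ms spectra of one protein
spectra = [
    {114: 1000.0, 115: 2000.0, 116: 500.0, 117: 1000.0},
    {114: 800.0, 115: 1600.0, 116: 480.0, 117: 800.0},
    {114: 1200.0, 115: 3000.0, 116: 600.0, 117: 1200.0},
]
summary = protein_ratios(spectra)
```

here the sample carried by channel 115 shows a roughly twofold increase and the one on channel 116 a twofold decrease relative to channel 114. isotope impurity correction and normalization across channels are left out of this sketch.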
an inherent drawback of the reported itraq technology is that proteins are enzymatically digested prior to labelling, which artificially increases sample complexity, so the approach needs a powerful multidimensional fractionation of peptides before ms identification. prefractionation of proteins by electrokinetic methods in free solution, essentially relying on isoelectric focusing (ief), has gained wide acceptance. many commercial devices are now constructed to take advantage of this principle ( table 2 ). reproducible fractionation steps reduce sample complexity while concentrating low-abundance species, resulting in more confident protein identification and quantification by 2d gels, mass spectrometry, and protein arrays. a good example of such an innovation is liquid-phase ief as a prefractionation tool before the first dimension of 2d gel electrophoresis [118, 119] . for more consistent pi separation, the zoom ief fractionator [120] and the multicompartment electrolyser (mce) [121] are being used to prefractionate the proteins. the fractionated samples can be applied directly to standard narrow-range ipg strips for 2d electrophoresis. this allows at least 10000 to 15000 separate proteins to be analyzed, including proteins of very low abundance. ief, a high-resolution electrophoresis technique, has been widely used in shotgun proteomic experiments [122] . ief runs in a buffer-free solution containing carrier ampholytes or in immobilized ph gradient (ipg) gels. the use of ipg-ief for the separation of complex peptide mixtures has been applied to the analysis of plasma and amniotic fluid [123, 124] as well as to bacterial material [125] . the ipg gel strip is divided into small sections for extraction and cleanup of the peptides. this technique recovers the sample from the liquid phase and was demonstrated to be of great interest in shotgun proteomics [126] . 
ief is not only a high resolution and high capacity separation method for peptides, it also provides additional physicochemical information like their isoelectric point [127, 128] . the pi value provided is used as an independent validating and filtering tool during database search for ms/ms peptide sequence identification [129] . the recent introduction of commercially available offgel fractionator system by agilent technologies provides an efficient and reproducible separation technique [130] . this separation is based on immobilized ph gradient (ipg) strips and permits to separate peptides and proteins according to their isoelectric point (pi) but is realized in solution [131] . its micropreparative scale provides fraction volumes large enough to perform subsequent analyses as reverse phase (rp)-liquid chromatography (lc)-maldi ms/ms. the combined use of itraq labeling and offgel fractionation methods for the proteomic study of complex sample is also being considered [132, 133] . in this procedure, a large well is used to separate the sample by page and lanes are created on the membrane containing immobilized protein with the use of a manifold [134] . compatible combinations of primary antibodies are predetermined, with the criterion of being able to identify proteins that do not comigrate. different combinations of primary antibodies are added to each well, with appropriate dilutions of each primary antibody so that expressed proteins are detected in a single condition. the scalability of the system depends on defining suitable combinations of primary antibodies, with up to 1000 antibodies in 200 lanes being used in the largest screens. detection software is used to identify proteins based on their expected and observed gel mobility. unlike 2d page and hplc-ms/ms, large-scale western blotting only identifies proteins for which antibodies are already available. 
while this is not an appropriate screen for identifying uncharacterized proteins, it greatly simplifies the verification and functional analysis of the proteins that are detected. in addition, this approach is highly flexible and can be focused on particular sets of proteins or protein functions, such as cell signaling molecules. importantly, the foundation of this approach is the large amount of data on individual antibodies that are already available and characterized in the literature [135] . another approach to analysing proteomes without gels is "shotgun" analysis using mudpit [136] . in the mudpit approach, protein samples are subjected to sequence-specific enzymatic digestion, usually with trypsin and endoproteinase lysc, and the resultant peptide mixtures are separated by strong cation exchange (scx) and reversed phase (rp) high performance liquid chromatography (hplc) [137, 138] . peptides from the rp column enter the mass spectrometer, and the ms data are used to search the protein databases [138] . the mudpit technique generates an exhaustive list of the proteins present in a particular sample and is fast and sensitive with good reproducibility; however, it lacks the ability to provide quantitative information [139] [140] [141] . a combination of hplc, liquid phase isoelectric focusing, and capillary electrophoresis provides other multimodular options for the separation of complex protein mixtures [142] . high throughput production of human proteins using different methods is being developed to make the protein array approach more practical. recently, simple and efficient production of human proteins using the versatile gateway vector system has been developed [143] . 
in this approach, the protein expression system was applied to the in vitro expression of 13364 human proteins, whose biological activity was assessed in two functional categories, and a "human protein factory" infrastructure was developed that includes the resources and expression technology for in vitro proteome research. in another approach, dna array to protein array (dapa), replicate protein arrays are "printed" directly from a dna array template using cell-free protein synthesis [144] . based on the nucleic acid programmable protein array (nappa) concept, a high-density self-assembling protein microarray has been developed to display thousands of proteins that are produced and captured in situ from immobilized cdna templates [145] . this method will enable various experimental approaches to study protein function in high throughput. the advantage of protein-based microarrays is that they allow the global observation of biochemical activities on an unprecedented scale, where hundreds or thousands of proteins can be simultaneously screened for protein-protein, protein-nucleic acid, and protein-small molecule interactions, as well as posttranslational modifications [146, 147] . the microarray format provides a robust and convenient platform for the simultaneous analysis of thousands of individual protein samples, facilitating the design of sophisticated and reproducible biochemical experiments under highly specific conditions [148] . the principal challenges in protein array development are 3-fold: (1) creation of a comprehensive expression clone library; (2) high-throughput protein production, including expression, isolation, and purification; (3) adaptation of dna microarray technology to accommodate protein substrates [149] . functional protein microarrays differ from analytical arrays in that they are composed of full-length functional proteins or protein domains (figure 4) . 
these protein chips are used to study the biochemical activities of an entire proteome in a single experiment. they are used to study numerous protein interactions, such as protein-protein, protein-dna, protein-rna, protein-phospholipid, and protein-small molecule interactions [150, 151] . companies have introduced protein arrays aimed not only at proteomic analysis but also at functional analyses of proteins (e.g., biacore ab, ciphergen biosystems inc., phylos inc.). affinity proteomics aims to produce antibodies to every protein expressed by the human genome; these antibodies will be characterized against purified antigens and tested on tissue arrays to collect information about their specificity for tissue antigens [152] . companies are focused on producing various binding partners, for example, affibodies, monoclonal antibodies, and their fragments [153] . protein chips will likely be the next major manifestation of the revolution in proteomics; they offer another solution for analyzing low-abundance proteins and have the potential for high throughput applications to identify biomarkers [154] . protein chips differ from the previously described methods: whereas screening by 2de or lc ms/ms can potentially detect any protein, protein chips can only provide data on the set of proteins selected by the investigator [155] . the development and application of high throughput, multiplex immunoassays that measure hundreds of known proteins in complex biological matrices is becoming a significant tool for quantitative proteomics studies, diagnostic discovery, and biomarker-assisted drug development. two broad categories of antibody microarray experimental formats have been described [156] : direct labelling, single antibody experiments [157] and dual antibody, sandwich immunoassays [158, 159] . in the direct labelling method, all proteins in a complex mixture are tagged, providing a means for detecting bound proteins following incubation on an antibody microarray. 
in the sandwich immunoassay format, proteins captured on an antibody microarray are detected by a cocktail of detection antibodies, each antibody matched to one of the spotted antibodies. in addition, a variety of microarray substrates have been described, including nylon membranes, plastic microwells, planar glass slides, gel-based arrays and beads in suspension arrays. much effort has been expended in optimizing antibody attachment to the microarray substrate. finally, various signal generation and signal enhancement strategies have been employed in antibody arrays, including colorimetry, radioactivity, fluorescence, chemiluminescence, quantum dots and other nanoparticles, enzyme-linked assays, resonance light scattering, tyramide signal amplification, and rolling circle amplification. each of these formats and procedures has distinct advantages and disadvantages, relating broadly to sensitivity, specificity, dynamic range, multiplexing capability, precision, throughput, and ease of use. in general, multiplexed microarray immunoassays are ambient analyte assays [160] . given the heterogeneity of antibody array formats and procedures currently in use in proteomics studies, and the absence of a "gold standard," there exists an urgent need for development and adoption of standards that permit platform comparisons and benchmarking. regardless of the choice of a given proteomic separation technique, gel-based or gel-free, a mass spectrometer is always the primary tool for protein identification. during the last decade, significant improvements have been made in the application of ms for the determination of protein sequences [161] . mass spectrometers consist of an ion source, the mass analyzer, and an ion detection system. analysis of proteins by ms occurs in three major steps (a) protein ionization and generation of gas-phase ions, (b) separation of ions according to their mass to charge ratio, and (c) detection of ions [162] . 
in gel-free approaches such as icat and mudpit, samples are directly analyzed by ms whereas, in gel-based proteomics (2de and 2d-dige), the protein spots are first excised from the gel and then digested with trypsin. the resulting peptides are then separated by lc or directly analyzed by ms. the experimentally derived peptide masses are correlated with the peptide fingerprints of known proteins in the databases using search engines (e.g., mascot, sequest). there are two main ionization sources which include matrix assisted laser desorption/ionization (maldi) and electrospray ionization (esi) and four major mass analyzers, which are time-of-flight (tof), ion trap, quadrupole, and fourier transform ion cyclotron (ftic) which are currently in use for protein identification and characterization [163] . a combination of different mass analyzers in tandem such as quadrupole-tof and quadrupole-ion trap has combined the individual strengths of different types of mass analyzers and greatly improved their capabilities for proteome analysis [162] . simple mass spectrometers such as maldi-tof are used for only measurement of mass, whereas tandem mass spectrometers are used for amino acid sequence determination [164] . in maldi the sample of interest is crystallized with the matrix on a metal surface and a laser ion source causes excitation of matrix along with the analyte ions, which are then released into the gas phase. maldi measures the mass of peptides derived from a trypsinized parent protein and generates a list of experimental peptide masses, often referred to as "mass fingerprints" [165, 166] . in esi, the analyte is ionized from a solution and transferred into the gas phase by generating a fine spray from a high voltage needle which results in multiple charging of the analyte and generation of multiple consecutive ions. tandem mass spectrometry or ms/ms is performed by combining two different ms separation principles. 
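the correlation of experimental peptide masses with the fingerprints of known proteins can be sketched in a few lines. the example below digests a sequence in silico with a simplified trypsin rule (cleavage after k or r unless followed by p, no missed cleavages), computes neutral monoisotopic peptide masses, and matches observed masses within a tolerance. the function names, the toy sequence, and the tolerance are assumptions for illustration; search engines such as mascot use considerably more elaborate scoring.

```python
# monoisotopic residue masses in daltons (standard values)
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # one water is added per free peptide


def tryptic_digest(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # c-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


def match_fingerprint(observed_masses, sequence, tol=0.01):
    """Pair each observed mass with theoretical peptides within tol Da."""
    theoretical = {p: peptide_mass(p) for p in tryptic_digest(sequence)}
    return [(m, p) for m in observed_masses
            for p, t in theoretical.items() if abs(m - t) <= tol]


protein = "AGKVSRPTK"                      # hypothetical toy sequence
peptides = tryptic_digest(protein)         # ["AGK", "VSRPTK"]
hits = match_fingerprint([274.164, 500.0], protein)
```

only the first observed mass matches a theoretical peptide (agk); the second finds no partner and would count against this candidate protein in a real scoring scheme.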
in tandem ms, individual trypsin-digested peptides are fragmented after a liquid phase separation. tandem ms instruments such as triple quadrupole, quadrupole ion trap, fourier transform ion-cyclotron resonance, or quadrupole time-of-flight are used in lc-ms/ms or nanospray experiments with electrospray ionization (esi) to generate peptide fragment ion spectra [167] . ion mobility spectrometry (ims) has been utilized as a rapid gas-phase separations strategy for biomolecular ions [168, 169] . the strategy provides high sensitivity because the gas-phase dispersion of peptide ions separates features corresponding to low abundance species from interfering chemical noise [170] . reduced spectral congestion also allows for the use of shorter experimental run times (lc separations) without sacrificing throughput; short analysis time scales are key to measuring the large numbers of samples required to determine normal protein variability prior to realizing individual plasma profiling. additionally, mobility-dispersed ions can be fragmented and mobility linked to fragment ions without ion loss from precursor mass selection [171] . these advantages have been demonstrated in head-to-head comparisons with conventional lc-ms/ms technology using rapid (21 minutes) lc gradients [169] . accurate mass and time (amt) tag approach [172] addresses an analogous situation in lc-ms-based proteomics studies. in this approach, initial lc-ms/ms analyses are performed on prefractionated peptide samples in order to provide peptide sequence identifications. these experiments are relatively low throughput because the peptide prefractionation can be quite extensive and require separate lc-ms/ms analyses for each fraction. 
the high-throughput accurate mass and time (amt) tag proteomic approach was utilized to characterize the proteomes for cytoplasm, cytoplasmic membrane, periplasm, and outer membrane fractions from aerobic and photosynthetic cultures of the gram-negative bacterium rhodobacter sphaeroides 2.4.1. there has been a recent trend in proteomics toward the development and application of technologies for the targeted analysis of proteins within complex mixtures [173] . selected reaction monitoring (srm) is a powerful tandem mass spectrometry method that can be used to monitor target peptides within a complex protein digest [174, 175] . the specificity and sensitivity of the approach, as well as its capability to multiplex the measurement of many analytes in parallel, has made it a technology of particular promise for hypothesis driven proteomics. the use of tandem mass spectrometry data acquired on an ltq ion trap mass spectrometer can accurately predict which fragment ions will produce the greatest signal in an srm assay using a triple quadrupole mass spectrometer [176] . one of the biggest benefits of a targeted assay on a triple quadrupole mass spectrometer is high throughput. using the selectivity of multiple stages of mass selection of a tandem mass spectrometer, these targeted srm assays are the mass spectrometry equivalent of a western blot [173] . an advantage of using targeted mass spectrometry-based assay over a traditional western blot is that it does not rely on the creation of any immunoaffinity reagent. while its application is novel in the proteomics community, srm has been utilized for several decades in the toxicology and pharmacokinetics disciplines [177] . peptidebased immunofractionation methods show potential for proteome wide screening approaches but are limited by the availability of antibodies [178, 179] . 
the stable isotope standards with capture by antipeptide antibodies (siscapa) approach is based on the addition of stable isotope labeled standard peptides to the digested clinical sample followed by immunoaffinity enrichment of standard and analyte peptide by highly specific antipeptide antibodies [180, 181] . this approach enables the absolute quantification of selected diagnostic peptides from digested clinical samples down to physiologically relevant analyte concentrations (ng/ml) at high precision (10% cv) and accuracy [178, 179] . further improvement of mrm-based biomarker quantification should be possible if whole sets of analyte peptides can be enriched by immunofractionation. since this method relies on one specific antibody per target protein/peptide the generation of more than 10000 antibodies is necessary for proteome wide screening approaches. novel peptide affinity enrichment strategies enabling proteome wide analyses of signature peptides may provide an important addition to future proteome workflows. undoubtedly, the accuracy, high throughput, and robustness of ms technologies have made the characterization of entire proteomes a realistic goal [180, 181] . the major bottlenecks in proteomics research today are related to data analysis to create an environment where computer scientists and biologists and the people who collect data can work closely together, so they can develop the necessary analytical tools that will help interpret the data [182] [183] [184] . processing and analysis of proteomics data is indeed a very complex multistep process ( figure 5 ). the meaningful comparison, sharing, and exchange of data or analysis results obtained on different platforms or by different laboratories remain cumbersome mainly due to the lack of standards for data formats, data processing parameters, and data quality assessment. accurate, consistent, and transparent data processing and analysis are integral and critical parts of proteomics workflows [185] . 
we can now generate huge amounts of data, and there is currently an enormous challenge in figuring out how to analyze these data and generate real biological insights. an integrated pipeline for processing and analysis of complex proteomics data sets has therefore become a necessity. validation. this step consists of the assignment of ms/ms spectra by a database search using one of the several engines available (e.g., sequest, mascot, comet, x!tandem, etc.). one of the difficulties related to the use of sequest for peptide identification is the lack of methods to globally evaluate the quality of data and to assess the global changes created by filtering schemes and/or database changes [186] . most approaches match and score large sets of experimental spectra against the predicted masses of fragment ions of peptide sequences derived from a protein database. results are scored according to a scheme that is specific to each search engine and also depends on the database used for the search. tools are usually linked to one specific platform or were optimized for one instrument type. the various search engines do not yield identical results, as they are based on different algorithms and scoring functions, making comparison and integration of results from different studies or experiments tedious [187, 188] . peptide identification via database searches is very computationally intensive and time-consuming. high quality data allow more effective searches owing to tighter constraints, that is, the tolerance on precursor ion mass and the charge state assignment, which drastically reduce the search time in the case of an indexed database. in addition, accurate mass measurements of fragment ions further simplify the database searches and add confidence to the results. 
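the effect of a tighter precursor mass tolerance on the search space can be illustrated with a small sketch. the candidate list, the masses, and the function names are hypothetical and serve only to show how a tolerance expressed in parts per million (ppm) prunes the peptides that must be scored.

```python
def ppm_error(observed, theoretical):
    """Signed mass error of an observation in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def candidates_within_tolerance(precursor_mass, candidates, tol_ppm):
    """Keep only candidate peptides whose mass lies within tol_ppm."""
    return [name for name, mass in candidates
            if abs(ppm_error(precursor_mass, mass)) <= tol_ppm]


# hypothetical database peptides with their theoretical masses (Da)
database = [
    ("pepA", 1000.0000),
    ("pepB", 1000.0040),
    ("pepC", 1000.0300),
    ("pepD", 1000.4000),
]
precursor = 1000.0050

wide = candidates_within_tolerance(precursor, database, tol_ppm=500)
narrow = candidates_within_tolerance(precursor, database, tol_ppm=10)
```

with a 500 ppm window all four candidates must be scored against the spectrum, whereas a 10 ppm window leaves only two, which is the kind of reduction that makes searches against an indexed database faster.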
the association of identified peptides with their precursor proteins is a critical and difficult step in shotgun proteomics strategies, as many peptides are common to several proteins, which leads to ambiguous protein assignments. it is therefore critical to have an appropriate tool that is able to assess the validity of the protein inference and associate a probability with it. the proteinprophet tool combines the probabilities assigned to peptides identified by ms/ms to compute accurate probabilities for the proteins present [189] . the lack of common standards and protocols has often resulted in duplication of efforts. results were usually reported as a set of identified proteins (i.e., a list of identified peptides and associated proteins) with minimal supporting data. the large volume of such data sets has made publication of detailed results through classical mechanisms very challenging. sharing and exchange of data and results require the definition of standard formats for the data at all levels (including raw mass spectrometric data, processed data, and search results) as well as a better definition (and/or standardization) of the parameters used for data processing and database searches. organellar proteomics aims to describe the full complement of proteins of subcellular structures and organelles. identification of the proteins contained in subcellular organelles has become a popular proteomics endeavor [190] . compared with whole-cell or whole-tissue proteomes, the more focused results from subcellular proteomic studies have yielded relatively simple datasets from which biologically relevant information can be more easily extracted [191] . subcellular fractionation consists of two major steps: disruption of the cellular organization (homogenization) and fractionation of the homogenate to separate the different populations of organelles. 
such a homogenate can then be resolved by differential centrifugation into several fractions containing mainly (1) nuclei, heavy mitochondria, cytoskeletal networks, and plasma membrane; (2) light mitochondria, lysosomes, and peroxisomes; (3) golgi apparatus, endosomes and microsomes, and endoplasmic reticulum; (4) cytosol. each population of organelles is characterized by size, density, charge, and other properties on which the separation relies [192] . analyzing subcellular fractions and organelles allows tracking of proteins that shuttle between different compartments, for example, between the cytoplasm and nucleus. the high dynamic range of protein abundance can be partially addressed by fractionating the proteome into subproteomes, and applying affinity purification may allow proteomic analysis of low copy number proteins [193] . the nuclear, chloroplast, amyloplast, plasma membrane, peroxisome, endoplasmic reticulum, cell wall, and mitochondrial proteomes were successfully characterized in arabidopsis [194] . several groups have taken advantage of this approach to recover a higher percentage of membrane proteins from subcellular extracts using various nonionic and zwitterionic detergents or phase-partitioning methods. these efforts resulted in the successful determination of the protein complement of the thylakoid and envelope membrane systems of the chloroplast [195] . by enriching for the protein class of interest on the basis of particular chemical or physical characteristics, such methods offer the advantage of reducing sample complexity and giving access to lower abundance proteins in a discovery-driven experimental approach [196] . free flow electrophoresis (ffe) utilizes differences in electrophoretic mobility rather than density to separate cells or subcellular organelles [197] . ffe has previously been used in separating endosomes from hamster ovary cells [198] , plasma membrane from human platelets [199] , and insulin transporting vesicles in liver cells. 
the separation is based on the electrophoretic mobility of cells or cell organelles suspended in a vertical free-flowing buffer film to which an electric field is applied at a right angle to the flow direction. ffe has been a most valuable tool in the investigation of the composition of secretory vesicles, and it has also clarified how the membrane of plasma membrane vesicles is oriented after nitrogen disruption of human neutrophils [200] . importantly, subcellular fractionation is a flexible and adjustable approach that may be efficiently combined not only with 2d gel electrophoresis but also with gel-independent techniques. however, fractionation methods are limited by considerable cross-contamination with other subcellular organelles. ptms of proteins are considered to be one of the major determinants of organismal complexity [201] . to date, more than 200 different types of ptms have been identified, of which only a few are reversible and important for the regulation of biological processes. specific functions are usually mediated through ptms such as phosphorylation, acetylation, or glycosylation, which places additional demands on the sensitivity and precision of the method [202] . one of the most studied ptms is protein phosphorylation, because it is vital for a large number of protein functions that are important to cellular processes ranging from signal transduction, cell differentiation, and development to cell cycle control and metabolism. enzymes and receptors can be switched "on" and "off" by phosphorylation and dephosphorylation. it has been estimated that 10-50% of proteins are phosphorylated. in eukaryotic proteins, phosphorylation often occurs on serine, threonine, and tyrosine residues [203] . analysis of the entire cellular phosphoproteome has been an attractive subject of study since the discovery of phosphorylation as a key regulatory mechanism of cell life. 
unfortunately, phosphoprotein analysis is not straightforward, for five main reasons. first, the stoichiometry of phosphorylation is generally relatively low, because only a small fraction of the available intracellular pool of a protein is phosphorylated at any given time as a result of a stimulus. second, the phosphorylation sites on proteins might vary, implying that any given phosphoprotein is heterogeneous (i.e., it exists in several different phosphorylated forms). third, many of the signaling molecules, which are major targets of phosphorylation events [204], are present at low abundance within cells and, in these cases, enrichment is a prerequisite before analysis. fourth, most analytical techniques used for studying protein phosphorylation have a limited dynamic range, which means that although major phosphorylation sites might be located easily, minor sites might be difficult to identify. finally, phosphatases could dephosphorylate residues unless precautions are taken to inhibit their activity during preparation and purification steps of cell lysates. in addition, various methods for protein phosphorylation site determination have been developed, yet this task remains a technical challenge [205]. western blot has been widely used to determine the presence of ptms. however, this technique relies on prior knowledge of the type and position of specific modifications and on the availability of antibodies. it has low throughput and is not ideal for studying highly complicated samples. specific chemical or affinity enrichment steps are usually incorporated into the sample preparation or fractionation stages of the general scheme of proteomic studies [206, 207]. well established methods involving the analysis of 32p-labeled phosphoproteins by edman degradation and two-dimensional phosphopeptide mapping have proven to be powerful but not without limitations.
consequently, mass spectrometry (ms) has emerged as a reliable and sensitive method for the characterization of protein phosphorylation sites [208] and may therefore represent a method of choice for the analysis of protein phosphorylation [209] . immobilized metal affinity chromatography (imac), metal oxide affinity chromatography (moac), and covalent methods are all capable of selectively enriching phosphopeptides [210] . moac based on adsorption to tio2 is especially attractive, but as with all techniques, loading, rinsing, and elution solutions must be carefully selected to minimize nonspecific adsorption and to maximize the detection of both monophosphorylated and multiphosphorylated species. imac might not provide the selectivity available with tio2 enrichment, but with appropriate reagents, imac can be selective and sensitive for monophosphorylated and tetraphosphorylated peptides. however, some buffers and reagents such as edta are not compatible with imac, so hplc purification may be needed prior to this technique [211] . when trying to isolate and identify as many phosphoproteins as possible in a cell lysate, chromatographic column-based methods are required. multiple elutions from imac or moac columns or even gradient elutions can help to simplify fractions of proteins and reveal more peptides [212, 213] . a combination of techniques can reveal large numbers of phosphopeptides in complex samples, but comprehensive phosphoproteomics is still not possible. for the highest protein coverage, future phosphoproteomic techniques will likely employ multiple enrichment techniques along with two-dimensional separations, but such studies are time consuming. combinations of affinity-based enrichment and extraction methods, multidimensional separation technologies, and mass spectrometry are particularly attractive for systematic investigation of posttranslationally modified proteins in proteomics [214] . organisms. 
the application of proteomics and related technologies for the analysis of the proteome is severely hampered by the lack of publicly available sequence information for most unsequenced organisms [215]. despite the precision of the mass information yielded by the seldi technique, a significant number of proteins were found to have no similarity to known peptides, an aforementioned weakness of proteomics studies in nonmodel organisms [216]. in order to circumvent this limitation, different strategies and tools were developed to make unsequenced organisms amenable to high-throughput proteomics [217] (figure 6). however, an evaluation of their performance in an integrated proteomics strategy using high-throughput shotgun ms data is currently missing. in principle, two different approaches can lead to an increase in protein identifications from unsequenced organisms. in the first approach, ms/ms data are searched against a protein database of an evolutionarily closely related organism. however, as a matter of principle in database-dependent searches, a protein can be identified only if it contains at least one peptide with exactly the same sequence as a peptide from a protein in the database. with increasing evolutionary distance this becomes an increasingly severe restriction [218]. in the second approach, the amino acid sequence of a peptide is extracted from the ms/ms spectrum by de novo sequencing, that is, in a fully database-independent manner using exclusively the information contained in the ms/ms spectrum. several software tools for peptide de novo sequencing are now available and some of them provide sufficiently good results when applied to high-quality spectra [219].
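the two approaches described above can be contrasted in a minimal sketch. it is an illustration under stated assumptions, not the workflow of any cited tool: the sequences are hypothetical, the fragment ladder is an idealized, complete series of singly charged b-ions, and the function names (`tryptic_peptides`, `database_hits`, `read_b_ladder`, `b_ions`) are invented for this example. the residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in Da (standard values). Isoleucine is omitted
# because it is isobaric with leucine and cannot be distinguished by mass alone.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "N": 114.04293,
    "D": 115.02694, "K": 128.09496, "E": 129.04259, "F": 147.06841,
    "R": 156.10111,
}
PROTON = 1.00728  # mass of a proton in Da


def tryptic_peptides(protein):
    """In silico trypsin digest: cleave after K or R unless P follows."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def database_hits(observed, database):
    """Approach 1: a protein is reported only on an exact peptide match."""
    observed = set(observed)
    return {name for name, seq in database.items()
            if observed & set(tryptic_peptides(seq))}


def b_ions(peptide):
    """Idealized singly charged b-ion m/z values (hypothetical helper)."""
    total, ladder = 0.0, []
    for aa in peptide:
        total += RESIDUE_MASS[aa]
        ladder.append(total + PROTON)
    return ladder


def read_b_ladder(mz_values, tol=0.01):
    """Approach 2: read residues from mass differences between fragments."""
    masses = [0.0] + [mz - PROTON for mz in sorted(mz_values)]
    residues = []
    for lower, upper in zip(masses, masses[1:]):
        delta = upper - lower
        match = [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tol]
        residues.append(match[0] if match else "?")  # "?" marks an unexplained gap
    return "".join(residues)


# Hypothetical protein from a sequenced, related organism.
database = {"related_protein": "GASPKLVTR"}

# The unsequenced organism carries a single T -> S substitution in this peptide.
same_as_database = database_hits(["LVTR"], database)
one_substitution = database_hits(["LVSR"], database)
de_novo_sequence = read_b_ladder(b_ions("LVSR"))
```

in this toy case the exact-match lookup finds the related protein for the conserved peptide but returns nothing once a single residue differs, whereas the ladder reading recovers the substituted sequence directly from the fragment masses. real spectra are rarely this complete, so the ladder reading here should be taken as the best case.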
a basic limitation of ms de novo sequencing methods is the necessity for backbone cleavage between each pair of adjacent amino acids; a mass value representing a terminal fragment containing only one of the two residues is a first requirement for ordering a specific pair [220, 221]. this limitation created the need for bioinformatics approaches that can help interpret the proteomics data [219]. in the past several years there have been very important and extremely useful advances in proteomics methods based on bottom-up display and bottom-up identification using peptides [222]. claims that these methods offer more sensitivity, greater rapidity, and greater proteome coverage are often made with the explicit or implicit assertion that they are bound to replace more traditional methods based on top-down analysis, especially using 2d gels [223, 224]. the combination of bottom-up display and bottom-up identification has achieved very important successes in detecting the presence of large numbers of different proteins in cells or subcellular organelles [225, 226]. the use of specific fractionation schemes and the prudent adoption of methods that increase the number of proteins that can be identified and quantified are enabling significant biological advances. further technological developments that enable a larger proportion of the proteome to be visualized will further enhance our ability to characterize biological systems. as such, these advances in proteomics will impact not only academic pursuits but also pharmaceutical, biotechnology, and diagnostic research and development [227]. in the future, gel-free techniques such as mudpit, itraq, and 18o stable isotope labeling can be expected to gain importance as they become more established. a sample prefractionation system provides a highly valuable tool to fractionate proteins and peptides from complex eukaryotic samples like plasma.
this approach has a positive influence on the number of proteins identified compared to the scx method [228]. itraq is a very powerful tool, recognised for its ability to quantify proteins in relative terms. the itraq reagent improves maldi ionisation, especially for peptides containing lysine. although silac labelling is easy for any laboratory that uses cell culture, the ms technology that is required is still beyond the capabilities of most groups. one of the factors that contributed to the rapid acceptance of the silac technology was the availability of an open-source program, msquant, for interpreting results. protein microarrays offer the ability to simultaneously survey multiple protein markers in an effort to develop expression profile changes across multiple protein analytes for potential use in diagnosis, prognosis, and measurement of therapeutic efficacy [229]. this technology is an excellent high-throughput method used to probe an entire collection of proteins for a specific function or biochemistry. it is an exceptional new way to discover previously unknown multifunctional proteins, and to discover new functionalities for well-studied proteins [230]. a systematic and efficient analysis of vast genomic and proteomic data sets is a major challenge for researchers today. to overcome limitations of current proteomics strategies in regard to the dynamic range of peptides detected, alternative mass spectrometry-based approaches are being explored. targeted strategies exemplified by multiple reaction monitoring detect, quantify, and possibly collect a product ion spectrum to confirm the identity of a peptide with much greater sensitivity because the precursor ion is not detected in the full mass spectrum [231].
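the targeted logic of multiple reaction monitoring mentioned above can be sketched as a filter on predefined precursor/product pairs. this is a simplified illustration rather than an instrument method: the m/z values, intensities, tolerance, and the names `Transition` and `monitor` are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    peptide: str          # label for the targeted peptide
    precursor_mz: float   # m/z selected by the first mass filter
    product_mz: float     # fragment m/z selected by the second mass filter


def monitor(transitions, events, tol=0.5):
    """Sum the intensity of fragment events that match a targeted transition.

    events: iterable of (precursor_mz, product_mz, intensity) tuples.
    Anything outside the tolerance window of every transition is ignored.
    """
    totals = {t: 0.0 for t in transitions}
    for precursor_mz, product_mz, intensity in events:
        for t in transitions:
            if (abs(precursor_mz - t.precursor_mz) <= tol
                    and abs(product_mz - t.product_mz) <= tol):
                totals[t] += intensity
    return totals


# Hypothetical numbers, for illustration only.
targets = [
    Transition("peptide_a", 523.3, 675.4),
    Transition("peptide_a", 523.3, 788.5),
]
events = [
    (523.3, 675.4, 1200.0),   # targeted transition
    (523.2, 788.6, 800.0),    # targeted transition, within tolerance
    (610.8, 675.4, 50000.0),  # abundant but untargeted precursor: ignored
    (523.3, 450.2, 300.0),    # untargeted fragment: ignored
]
signal = monitor(targets, events)
peptide_a_total = sum(signal.values())
```

because only the listed transitions contribute, the abundant untargeted species in the toy data does not affect the signal recorded for the target peptide, which is the property that makes targeted acquisition attractive when dynamic range is limiting.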
a systematic and efficient evaluation of large-scale experimental results requires (1) automatic retrieval of user defined information to construct a customized, queryable database; (2) an intuitive graphical and query platform to display and analyze experimental data in the context of the customized database; (3) efficient utilization of web-based bioinformatics software tools for data interpretation, prediction of function, and modeling; and (4) scalability and reconstruction of the database in response to changing user needs and an ever-expanding base of knowledge and bioinformatics tools [232]. creating a software tool to encompass the four crucial features outlined above is a challenging and ongoing task, particularly with respect to the ever-expanding publicly available base of knowledge and bioinformatics tools. the data processing and analysis bottleneck can be overcome through integration of the entire suite of tools into one linear pipeline. the good news is that all of the various proteomics strategies are in phases of very rapid technological development and that important advances in sensitivity, throughput, and proteome coverage can be expected in the near future for all of them.
abbreviations: ptms: post translational modifications; 2de: two-dimensional gel electrophoresis; dige: fluorescence 2d difference gel electrophoresis; esi: electrospray ionization; ftic: fourier transform ion cyclotron; hplc: high performance liquid chromatography; icat: isotope-coded affinity tag; itraq: isobaric tags for relative and absolute quantitation; ipg: immobilized ph gradient; lc: liquid chromatography; maldi: matrix-assisted laser desorption/ionization; ms: mass spectrometry; mudpit: multidimensional protein identification technology; tof: time of flight; page: polyacrylamide gel electrophoresis; scx: strong cation exchange; mrm: multiple reaction monitoring assay; seldi: surface-enhanced laser desorption/ionization; imac: immobilized metal affinity capture.
progress with proteome projects: why all proteins expressed by a genome should be identified and how to do it our genome unveiled highthroughput mass spectrometric discovery of protein posttranslational modifications perspectives for mass spectrumetry and functional proteomics proteomic technology for biomarker profiling in cancer: an update 2d protein electrophoresis: can it be perfected? proteomics-challenges and possibilities in finland. national technology agency advances in clinical cancer proteomics: seldi-tof-mass spectrometry and biomarker discovery challenges and opportunities in proteomics data analysis bioinformatics meets proteomics-bridging the gap between mass spectrometry data analysis and cell biology statistical and computational methods for comparative proteomic profiling using liquid chromatography-tandem mass spectrometry protein identification by mass spectrometry: issues to be considered proteomic approaches in brain research and neuropharmacology recent advances in 2d electrophoresis: an array of possibilities proteomic analysis by multidimensional protein identification technology quantitative analysis of complex protein mixtures using isotope-coded affinity tags stable isotope labeling by amino acids in cell culture, silac, as a simple and accurate approach to expression proteomics multiplexed protein quantitation in saccharomyces cerevisiae using amine-reactive isobaric tagging reagents an automated multidimensional protein identification technology for shotgun proteomics genetic analysis of the mouse brain proteome protein arrays: the current state-of-the-art proteomics in multiplex a large-scale proteomic analysis of human embryonic stem cells high sensitivity detection of plasma proteins by multiple reaction monitoring of n-glycosites superhirn-a novel tool for high resolution lc-ms-based peptide/protein profiling proteomics in 2005/2006: developments, applications and challenges generation and initial analysis of more than 15,000 
fulllength human and mouse cdna sequences molecular diversity of l-type calcium channels. evidence for alternative splicing of the transcripts of three non-allelic genes a method for global analysis of complex proteomes using sample prefractionation by solution isoelectrofocusing prior to two-dimensional electrophoresis quantitative profiling of differentiation-induced microsomal proteins using isotope-coded affinity tags and mass spectrometry enrichment of integral membrane proteins for proteomic analysis using liquid chromatography-tandem mass spectrometry large-scale analysis of the yeast proteomeby multidimensional protein identification technology a method for the comprehensive proteomic analysis of membrane proteins evaluation of nonionic and zwitterionic detergents as membrane protein solubilizers in two-dimensional electrophoresis proteomic analysis of integral plasma membrane proteins proteomic analysis of the hydrophobic fraction of mesenchymal stem cells derived from human umbilical cord blood chemical probes and tandem mass spectrometry: a strategy for the quantitative analysis of proteomes and subproteomes evaluation of biotinylated cells as a source of antigens for characterization of their molecular profile a chemical proteomics approach for the identification of accessible antigens expressed in human kidney cancer identification and relative quantification of membrane proteins by surface biotinylation and two-dimensional peptide mapping the human plasma proteome: history, character, and diagnostic prospects the case for early detection the human plasma proteome: a nonredundant list developed by combination of four separate sources recent advances in bloodrelated proteomics investigating diversity in human plasma proteins the human plasma proteome: history, character, and diagnostic prospects heparin chromatography to deplete high-abundance proteins for serum proteomics sample preparation of human serum for the analysis of tumor markers: comparison 
of different approaches for albumin and gamma-globulin depletion depletion of the highly abundant protein albumin from human plasma using the gradiflow isolation of albumin from whole human plasma and fractionation of albumin depleted plasma development of mammalian serum albumin affinity purification media by peptide phage display purification and some properties of streptococcal protein g, a novel igg-binding reagent multi-component immunoaffinity subtraction chromatography: an innovative step towards a comprehensive survey of the human plasma proteome quantitative analysis of bacterial and mammalian proteomes using a combination of cysteine affinity tags and 15n-metabolic labeling proteomic applications for the early detection of cancer the seldi-tof ms approach to proteomics: protein profiling and biomarker identification development of a novel proteomic approach for the detection of transitional cell carcinoma of the bladder in urine tietz fundamentals of clinical chemistry fractionation techniques improve the proteomic analysis of human serum a 2-de maldi-tof study to identify disease regulated serum proteins in lung cancer of c-myc transgenic mice stable isotope dilution multidimensional liquid chromatography-tandem mass spectrometry for pancreatic cancer serum biomarker discovery standard operating procedures for serum and plasma collection: early detection research network consensus statement standard operating procedure integration working group the need for the review and understanding of seldi/maldi mass spectroscopy data prior to analysis analytical validation of serum proteomic profiling for diagnosis of prostate cancer: sources of sample bias preanalytic influence of sample handling on seldi-tof serum protein profiles protein mapping by combined isoelectric focusing and electrophoresis of mouse tissues. 
a novel approach to testing for induced point mutations in mammals high resolution two dimensional electrophoresis of proteins strategies for the enrichment and identification of basic proteins in proteome projects current twodimensional electrophoresis technology for proteomics proteomic identification of proteins specifically oxidized in caenorhabditis elegans expressing human abeta (1-42): implications for alzheimer's disease membrane proteins and proteomics: un amour impossible membrane proteomics: use of additive main effects with multiplicative interaction model to classify plasma membrane proteins according to their solubility and electrophoretic properties effect of strong detergents and chaotropes on the detection of proteins in twodimensional gels applications and current challenges of proteomic approaches, focusing on two-dimensional electrophoresis background-free, high sensitivity staining of proteins in one-and two-dimensional sodium dodecyl sulfatepolyacrylamide gels using a luminescent ruthenium complex protein stains for proteomic applications: which, when, why protein detection methods in proteomics research fractionated extraction of total tissue proteins from mouse and human for 2-d electrophoresis sample complexity reduction for twodimensional electrophoresis using solution isoelectric focusing prefractionation difference gel electrophoresis: a single gel method for detecting changes in protein extracts recent advances in 2d electrophoresis: an array of possibilities 2d differential ingel electrophoresis for the identification of esophageal scans cell cancer-specific protein markers the development of the dige system: 2d fluorescence difference gel analysis technology a novel experimental design for comparative two-dimensional gel analysis: two-dimensional difference gel electrophoresis incorporating a pooled internal standard fluorescent twodimensional difference gel electrophoresis unveils the potential of gel-based proteomics quantitative 
analysis of complex protein mixtures using isotope-coded affinity tags quantitative protein profiling by mass spectrometry using isotopecoded affinity tags isotopecoded affinity tag (icat) approach to redox proteomics: identification and quantitation of oxidant-sensitive cysteine thiols in complex protein mixtures current trends in differential expression proteomics: isotopically coded tags phosphoprotein isotope-coded solidphase tag approach for enrichment and quantitative analysis of phosphopeptides from complex mixtures design and synthesis of visible isotope-coded affinity tags for the absolute quantification of specific proteins in complex mixtures automated statistical analysis of protein abundance ratios from data generated by stable-isotope dilution and tandem mass spectrometry quantitative profiling of differentiation-induced microsomal proteins using isotope-coded affinity tags and mass spectrometry complementary analysis of the mycobacterium tuberculosis proteome by twodimensional electrophoresis and isotope-coded affinity tag technology mass spectrometry and proteomics quantitative proteome analysis by solid-phase isotope tagging and mass spectrometry quantitative analysis of redox-sensitive proteome with dige and icat stable isotope labeling by amino acids in cell culture, silac, as a simple and accurate approach to expression proteomics mass spectrometricbased approaches in quantitative proteomics a proteomics strategy to elucidate functional protein-protein interactions applied to egf signaling unbiased quantitative proteomics of lipid rafts reveals high specificity for signaling factors stable isotope labeling of arabidopsis thaliana cells and quantitative proteomics by mass spectrometry stable isotope labeling with amino acids in cell culture (silac) for studying dynamics of protein abundance and posttranslational modifications functional and quantitative proteomics using silac identifying dynamic interactors of protein complexes by quantitative 
mass spectrometry metabolic labeling of mammalian organisms with stable isotopes for quantitative proteomic analysis quantitative proteomics of the human malaria parasite plasmodium falciparum and its application to studies of development and inhibition quantitative mouse brain proteomics using culture-derived isotope tags as internal standards 18 o stable isotope labeling in ms-based proteomics proteolytic 18 o labeling for comparative proteomics: model studies with two serotypes of adenovirus differential protein expression in the cytosol fraction of an mcf-7 breast cancer cell line selected for resistance toward melphalan non-gel-based dual 18 o labeling quantitative proteomics strategy shotgun proteomics using the itraq isobaric tags itraq is a useful method to screen for membrane-bound proteins differentially expressed in human natural killer cell types a perspective on the use of itraq reagent technology for protein complex and profiling studies eight-channel itraq enables comparison of the activity of 6 leukaemogenic tyrosine kinases multiplexed protein quantitation in saccharomyces cerevisiae using amine-reactive isobaric tagging reagents protein fractionation in a multicompartment device using off-gel tm isoelectric focusing efficient fractionation and improved protein identification by peptide offgel electrophoresis combining microscale solution-phase isoelectric focusing with multiplexed proteomics dye staining to analyze protein posttranslational modifications solution phase isoelectric fractionation in the multi-compartment electrolyser: a divide and conquer strategy for the analysis of complex proteomes a comparison of immobilized ph gradient isoelectric focusing and strong-cation-exchange chromatography as a first dimension in shotgun proteomics two-stage off-gel tm isoelectric focusing: protein followed by peptide fractionation and application to proteome analysis of human plasma proteome analysis of human plasma and amniotic fluid by off-gel tm 
isoelectric focusing followed by nano-lc-ms/ms exploring glycopeptide-resistance in staphylococcus aureus: a combined proteomics and transcriptomics approach for the identification of resistance-related markers identification of brain cell death associated proteins in human post-mortem cerebrospinal fluid the focusing positions of polypeptides in immobilized ph gradients can be predicted from their amino acid sequences ingel isoelectric focusing of peptides as a tool for improved protein identification gel based isoelectric focusing of peptides and the utility of isoelectric point in protein identification isobaric tags for relative and absolute quantitation (itraq) reproducibility: implication of multiple injections technical, experimental, and biological variations in isobaric tags for relative and absolute quantitation (itraq) neuronal differentiation and long-term culture of the human neuroblastoma line sh-sy5y the proteome of the human neuroblastoma cell line sh-sy5y: an enlarged proteome from transcriptome to proteome: differentially expressed proteins identified in synovial tissue of patients suffering from rheumatoid arthritis and osteoarthritis by an initial screen with a panel of 791 antibodies a large-scale proteomic analysis of human embryonic stem cells mudpit analysis: application to human heart tissue multidimensional separation of peptides for effective proteomic analysis large-scale analysis of the yeast proteome by multidimensional protein identification technology multidimensional protein identification technology: current status and future prospects multiplexed protein quantitation in saccharomyces cerevisiae using amine-reactive isobaric tagging reagents reproducibility of quantitative proteomic analyses of complex biological mixtures by multidimensional protein identification technology proteomics: from gel based to gel free human protein factory for converting the transcriptome into an in vitro-expressed proteome printing protein arrays from 
dna arrays nextgeneration high-density self-assembling functional protein arrays application of physicochemically modified silicon substrates as reverse-phase protein microarrays rppaml/rims: a metadata format and an information management system for reverse phase protein arrays multiplexed whole bacterial antigen microarray, a new format for the automation of serodiagnosis: the culture-negative endocarditis paradigm protein analysis on a proteomic scale regulation of gene expression by a metabolic enzyme global analysis of protein activities using proteome chips biochemical and genetic analysis of the yeast proteome with a movable orf collection analyzing antibody specificity with whole proteome microarrays severe acute respiratory syndrome diagnostics using a coronavirus protein microarray a quantitative protein interaction network for the erbb receptors using protein microarrays multiplexed protein profiling on antibody-based microarrays by rolling circle amplification measuring proteins on microarrays methods and applications of antibody microarrays in cancer research multiplexed sandwich assays in microarray format evaluating sandwich immunoassays in microarray format in terms of the ambient analyte regime mass spectrometry and protein analysis analysis of proteins and proteomes by mass spectrometry use of mass spectrometry-derived data to annotate nucleotide and protein sequence databases current initiatives in proteomics research: the plant perspective laser desorption ionization of proteins with molecular masses exceeding 10,000 daltons the characteristics of peptide collision-induced dissociation using a high-perfformance maldi-tof/tof tandem mass spectrometer mass spectrometry and proteomics gas-phase separations of protease digests development of high throughput dispersive lcion mobility-tofms techniques for analysing the human plasma proteome formation of peptide aggregates during esi: size, charge, composition, and contributions to noise mobility 
labeling for parallel cid of ion mixtures advances in proteomics data analysis and display using an accurate mass and time tag approach selective detection of membrane proteins without antibodies: a mass spectrometric version of the western blot analysis of protein phosphorylation by hypothesis-driven multiple-stage mass spectrometry expediting the development of targeted srm assays: using data from shotgun proteomics to automate method development quantitative, multiplexed assays for low abundance proteins in plasma by targeted mass spectrometry and stable isotope dilution current affairs in quantitative targeted proteomics: multiple reaction monitoringmass spectrometry peptidomics: a new approach to affinity protein microarrays antibodybased enrichment of peptides on magnetic beads for massspectrometry-based quantification of serum biomarkers mass spectrometric quantitation of peptides and proteins using stable isotope standards and capture by anti-peptide antibodies (sis-capa) mass spectrometry-based proteomics bioinformatics meets proteomics-bridging the gap between mass spectrometry data analysis and cell biology statistical and computational methods for comparative proteomic profiling using liquid chromatography-tandem mass spectrometry protein identification by mass spectrometry: issues to be considered challenges and opportunities in proteomics data analysis toward a human blood serum proteome: analysis by multidimensional separation coupled with mass spectrometry the need for guidelines in publication of peptide and protein identification data: working group on publication guidelines for peptide and protein identification data reporting protein identification data: the next generation of guidelines a statistical model for identifying proteins by tandem mass spectrometry proteome analysis at the level of subcellular structures complementary methods to assist subcellular fractionation in organellar proteomics subcellular fractionation, electromigration 
analysis and mapping of organelles affinity purification-mass spectrometry: powerful tools for the characterization of protein complexes the swiss-prot protein knowledgebase and expasy: providing the plant community with high quality proteomic data and tools proteomic study of the arabidopsis thaliana chloroplastic envelope membrane utilizing alternatives to traditional twodimensional electrophoresis subcellular fractionation and proteomics of nuclear envelopes 54 improved subcellular fractionation of the heavy mitochondria pellet using free flow electrophoresis system rapid analytical and preparative isolation of functional endosomes by free flow electrophoresis characterization of human platelet surface and intracellular membranes isolated by free flow electrophoresis chemoattractant-regulated mobilization of a novel intracellular compartment in human neutrophils large-scale analysis of the yeast proteome by multidimensional protein identification technology phosphoproteomics for the discovery of kinases as cancer biomarkers and drug targets mass spectrometry-based proteomics and peptidomics for biomarker discovery in neurodegenerative diseases the extracellular signal-regulated kinase: multiple substrates regulate diverse cellular functions abrf-prg03: phosphorylation site determination interpreting the protein language using proteomics proteomic analysis of posttranslational modifications phosphoamino acid analysis signaling-2000 and beyond techniques for phosphopeptide enrichment prior to analysis by mass spectrometry evaluation of the impact of some experimental procedures on different phosphopeptide enrichment techniques improved enrichment strategies for phosphorylated peptides on titanium dioxide using methyl esterification and ph gradient elution simac (sequential elution from imac), a phosphoproteomics strategy for the rapid separation of monophosphorylated from multiply phosphorylated peptides modification-specific proteomics: characterization of 
post-translational modifications by mass spectrometry a workflow to increase the detection rate of proteins from unsequenced organisms in high-throughput proteomics experiments application of genomics and proteomics for study of the integrated response to zinc exposure in a non-model fish species, the rainbow trout sequence similarity-based proteomics in insects: characterization of the larvae venom of the brazilian moth cerodirphia speciosa a workflow to increase the detection rate of proteins from unsequenced organisms in high-throughput proteomics experiments improving gene annotation using peptide mass spectrometry proteomics studies confirm the presence of alternative protein isoforms on a large scale plips, an automatically collected database of protein lists reported by proteomics studies proteome analysis by mass spectroscopy detection technologies in proteome analysis two-dimensional gel electrophoresis in proteomics: old, old fashioned, but it still climbs up the mountains is proteomics heading in the wrong direction? proteomic mapping of brain plasma membrane proteins how much of the proteome do we see with discovery-based proteomics methods and how much do we need to see? 
key: cord-003144-nqkw5v3w authors: qu, zehui; gao, fei; li, liwei; zhang, yujiao; jiang, yifeng; yu, lingxue; zhou, yanjun; zheng, hao; tong, wu; li, guoxin; tong, guangzhi title: label‐free quantitative proteomic analysis of differentially expressed membrane proteins of pulmonary alveolar macrophages infected with highly pathogenic porcine reproductive and respiratory syndrome virus and its attenuated strain date: 2017-11-24 journal: proteomics doi: 10.1002/pmic.201700101 sha: doc_id: 3144 cord_uid: nqkw5v3w significant differences exist between the highly pathogenic (hp) porcine reproductive and respiratory syndrome virus (prrsv) and its attenuated pathogenic (ap) strain in the ability to infect host cells. the mechanisms by which different virulent strains invade host cells remain relatively unknown. in this study, pulmonary alveolar macrophages (pams) are infected with hp‐prrsv (hun4) and ap‐prrsv (hun4‐f112) for 24 h, then harvested and subjected to label‐free quantitative ms. a total of 2849 proteins are identified, including 95 that are differentially expressed. among them, 26 proteins are located on the membrane. the most differentially expressed proteins are involved in response to stimulus, metabolic process, and immune system process, which mainly have the function of binding and catalytic activity. cluster of differentiation cd163, vimentin (vim), and nmii as well as detected proteins are assessed together by string analysis, which elucidated a potentially different infection mechanism. 
according to the function annotations, prrsv with different virulence may mainly differ in immunology, inflammation, immune evasion as well as cell apoptosis. this is the first attempt to explore the differential characteristics between hp‐prrsv and its attenuated prrsv infected pams focusing on membrane proteins which will be of great help to further understand the different infective mechanisms of hp‐prrsv and ap‐prrsv. porcine reproductive and respiratory syndrome virus (prrsv) has been a leading economically significant viral pathogen of swine worldwide for almost 28 years. [1] [2] [3] prrsv, equine arteritis virus, simian hemorrhagic fever virus, and lactate dehydrogenase-elevating virus are members of the family arteriviridae. [4, 5] prrsv is a positive-sense, single-stranded rna virus with a full-length genome of 15 kb that has a 5′ cap and a 3′ poly(a) tail. [4, [6] [7] [8] prrsv was first reported in the united states in the late 1980s. [9] in 2006, several large-scale, severe outbreaks of highly pathogenic (hp) atypical prrsv (hp-prrsv) were reported in china and neighboring asian countries. [2, 3, 10] reported hp-prrsv morbidity and mortality rates were much higher than those of previous pandemic prrsv strains and were associated with more severe clinical presentations and higher rectal temperature (>41°c). [11] the emergence of hp-prrsv has caused great economic loss to the swine industry in china and made preventing and controlling prrsv outbreaks difficult. therefore, elucidating the causes of the greater virulence of prrsv and the differences between the hp and attenuated pathogenic (ap) strains has become even more important. to this end, several trials have been conducted to identify virulence factors; these studies have resulted in some successes. [12, 13] however, changes in virulence and pathogenic mechanisms are difficult to discern. 
other than virulence in vivo, many distinctions in the biological aspects of hp-prrsv and ap-prrsv have been noted, such as in viral binding and entry into pulmonary alveolar macrophages (pams). prrsv exhibits highly restricted cell tropism both in vivo and in vitro. [14] the virus can be detected only in well-differentiated macrophages of lungs, lymph nodes, peyer's patches, spleen, tonsils, and thymus. pams are the main target cells of prrsv. [15] prrsv can also replicate in vitro in the african green monkey kidney cell line ma-104 and its derivatives, marc-145 and cl-2621, which are considered permissive cell lines for prrsv. [16, 17] reportedly, prrsv targets cellular membrane proteins (mps) and enters target cells through receptor-mediated endocytosis during viral infection. [18] studies have investigated possible mechanisms employed by prrsv to infect pams. [19] elucidating the characteristics of viruses and interactions between viruses and host cells is increasingly important. however, exploring individual proteins within the many proteins in cells is difficult. nonetheless, gradual advancements have come through use of proteomic techniques. of these, ms-based quantitative proteomic techniques offer the advantage of better accuracy and sensitivity, and have been widely used to analyze host cell responses to viral infection. among them, liquid chromatography-tandem ms (lc-ms/ms) for label-free quantitative proteomics (lfqp) is an important mass spectrometric tool to detect and quantify large amounts of proteins. [20] compared with quantitative proteomics using stable isotope labeling such as stable isotope labeling by amino acids in cell culture and isobaric tags for relative and absolute quantitation, lfqp detects greater amounts of proteins and important signaling pathways and networks. the aim of this study was to determine potentially different infection mechanisms used by the hp strain vhun4 [10] and its derivative attenuated strain vhun4-f112 [21, 22] using lfqp. 
membrane proteins (mps) of pams infected by highly pathogenic (hp) and attenuated pathogenic (ap) prrsv have been elucidated by lc-ms/ms for label-free quantitative proteomics. ninety-five differentially expressed proteins were identified and characterized. the most significant difference in the biological process between pams infected with hp- and ap-prrsv is the metabolic process. most different molecular functions were classified as binding and catalytic activities. cellular component categories showed that 26 differentially expressed proteins were confirmed as mps based on the annotation of the uniprot database, such as rap2a, vcl (vinculin), and ifitm3, which function in cell-cell junctions, the erk signaling pathway, g protein signaling pathways, response to biotic stimulus, and so on. among them, vcl is an f-actin-binding protein involved in cell-matrix adhesion and cell-cell adhesion in humans. overexpression of vcl was shown to inhibit the replication of both hp-prrsv and the attenuated prrsv at the mrna level, and the strength of inhibition differed clearly between hp-prrsv and its attenuated strain. this is the first attempt to explore the differential characteristics between hp-prrsv- and attenuated prrsv-infected pams focusing on mps, which will be of great help to further understand the different infective mechanisms of hp-prrsv and ap-prrsv. 
the animal study protocols were approved by the animal care and use committee of shanghai veterinary research institute, chinese academy of agricultural sciences. hp-prrsv strain vhun4 [3, 10] at a titer of 10^5 50% tissue culture infective doses (tcid50) ml−1 and cell-passaged attenuated virus strain vhun4-f112 (ap-prrsv) [21, 22] at a titer of 10^5 tcid50 ml−1 were stored as viral stocks. 
fifteen-day-old piglets free of antibodies and antigens for porcine circovirus 2, classical swine fever virus, and prrsv were used. animals were sacrificed in accordance with the ethics statement. lungs were dissected and lavaged with phosphate-buffered saline (pbs; life technologies, inc., gibco/brl division, grand island, ny, usa) supplemented with 1% penicillin-streptomycin (gibco/brl), then centrifuged at 1000 × g for 5 min, resuspended in pbs, centrifuged, and resuspended in pbs. pams were collected in roswell park memorial institute (rpmi) 1640 medium (gibco/brl) containing 10% fetal bovine serum (gibco/brl) [23] and incubated in 10 cm dishes (corning, inc., corning, ny, usa) for 12 h at 37°c in a 5% co2 atmosphere. when confluency exceeded 95%, pams were washed with pbs three times to remove dead and nonadherent cells. three dishes were inoculated with vhun4 and another three with vhun4-f112 at a multiplicity of infection of 1. an additional three dishes were inoculated with dmem (gibco/brl) as a blank control. all dishes were incubated at 37°c in an atmosphere of 5% co2, as described previously. [13] after incubation for 1 h, inocula were discarded and pams were washed with pbs three times. cell monolayers in all dishes were overlaid with rpmi-1640 medium containing 2% fetal bovine serum and incubated at 37°c in a 5% co2 atmosphere for 24 h. pams were digested with 2 ml 0.25% trypsin-ethylenediaminetetraacetic acid solution (gibco/brl), collected by gentle pipetting, centrifuged at 1000 × g for 5 min, and lysed using the proteoextract transmembrane protein extraction kit (novagen, emd biosciences, inc., madison, wi, usa), [24, 25] according to the manufacturer's instructions. cells were resuspended in extraction buffer 1 and protease inhibitor cocktail, incubated for 10 min at 4°c with gentle agitation, and centrifuged at 1000 × g for 5 min at 4°c. 
after removing supernatants, pellets were resuspended in 0.2 ml extraction buffer 2a and protease inhibitor cocktail, incubated for 45 min at room temperature with gentle agitation, and centrifuged at 16 000 × g for 15 min at 4°c. supernatants were precipitated with 1 ml acetone and centrifuged at 12 000 × g for 10 min. after evaporating to dryness, 150 μl sdt buffer (4% sodium dodecyl sulfate, 100 mm tris/hcl at ph 7.6, 0.1 m dithiothreitol) was added and mixtures were heated in boiling water for 5 min. after centrifugation, supernatants were collected and quantified with a bca protein assay kit (bio-rad, usa). fig. 1 digestion of protein (250 μg for each sample) was performed according to the fasp (filter-aided sample preparation) procedure. briefly, the detergent, dtt, and other low-molecular-weight components were removed using 200 μl ua buffer (8 m urea, 150 mm tris-hcl ph 8.0) by repeated ultrafiltration (microcon units, 10 kd) facilitated by centrifugation. then 100 μl 0.05 m iodoacetamide in ua buffer was added to block reduced cysteine residues and the samples were incubated for 20 min in darkness. the filter was washed with 100 μl ua buffer three times and then with 100 μl 25 mm nh4hco3 twice. finally, the protein suspension was digested with 3 μg trypsin (promega) in 40 μl 25 mm nh4hco3 overnight at 37°c, and the resulting peptides were collected as a filtrate. the peptide content was estimated by uv light spectral density at 280 nm using an extinction coefficient of 1.1 for a 0.1% (g l−1) solution, calculated on the basis of the frequency of tryptophan and tyrosine in vertebrate proteins. [26] the peptides of each sample were desalted on c18 cartridges (empore spe cartridges c18 (standard density), bed id 7 mm, volume 3 ml, sigma), then concentrated by vacuum centrifugation and reconstituted in 40 μl of 0.1% (v/v) trifluoroacetic acid. 
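the uv-based peptide estimate described above amounts to dividing the absorbance at 280 nm by the stated extinction coefficient. the following is a minimal sketch of that arithmetic, not the authors' procedure: it assumes a 1 cm path length and an optional dilution factor (neither is given in the text), and the function names are hypothetical.

```python
# Sketch: estimate peptide concentration from absorbance at 280 nm.
# Assumptions (not stated in the source): 1 cm path length by default,
# and an optional dilution factor applied before measurement.

# From the text: an absorbance of 1.1 corresponds to a 0.1% (1 g/L) solution.
EXTINCTION_PER_G_PER_L = 1.1


def peptide_conc_g_per_l(a280: float, path_cm: float = 1.0, dilution: float = 1.0) -> float:
    """Return the estimated peptide concentration in g/L (numerically equal to ug/uL)."""
    if a280 < 0 or path_cm <= 0 or dilution <= 0:
        raise ValueError("a280 must be >= 0; path_cm and dilution must be > 0")
    return a280 / (EXTINCTION_PER_G_PER_L * path_cm) * dilution


def volume_for_load_ul(conc_g_per_l: float, load_ug: float) -> float:
    """Volume in uL that contains `load_ug` of peptide (1 g/L equals 1 ug/uL)."""
    if conc_g_per_l <= 0:
        raise ValueError("concentration must be > 0")
    return load_ug / conc_g_per_l


if __name__ == "__main__":
    conc = peptide_conc_g_per_l(0.66)
    print(f"estimated concentration: {conc:.2f} g/L")
    print(f"volume for a 5 ug load: {volume_for_load_ul(conc, 5.0):.2f} uL")
```

under these assumptions, a reading of 1.1 corresponds to 1 g l−1, so a 5 μg column load would require 5 μl of that solution.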
ms experiments were performed on a q exactive mass spectrometer that was coupled to easy nlc (proxeon biosystems, now thermo fisher scientific). five micrograms of peptide were loaded onto a c18 reversed-phase column (thermo scientific easy column, 10 cm long, 75 μm inner diameter, 3 μm resin) in buffer a (2% acetonitrile and 0.1% formic acid) and separated with a linear gradient of buffer b (80% acetonitrile and 0.1% formic acid) at a flow rate of 250 nl min−1 controlled by intelliflow technology over 120 min. ms data were acquired using a data-dependent top ten method, dynamically choosing the most abundant precursor ions from the survey scan (300-1800 m/z) for hcd fragmentation. determination of the target value was based on predictive automatic gain control. dynamic exclusion duration was 25 s. survey scans were acquired at a resolution of 70 000 at m/z 200 and resolution for hcd spectra was set to 17 500 at m/z 200. normalized collision energy was 30 ev and the underfill ratio, which specifies the minimum percentage of the target value likely to be reached at maximum fill time, was defined as 0.1%. the instrument was run with peptide recognition mode enabled. ms experiments were performed in triplicate for each sample. [27] the ms data were analyzed using maxquant software version 1.3.0.5. ms data were searched against the uniprot sus scrofa sequence database (including 34 253 sequences downloaded on 12/27/2014). the initial search used a precursor mass window of 6 ppm. the search followed the enzymatic cleavage rule of trypsin/p and allowed a maximum of two missed cleavage sites and a mass tolerance of 20 ppm for fragment ions. carbamidomethylation of cysteines was defined as a fixed modification, while protein n-terminal acetylation and methionine oxidation were defined as variable modifications for database searching. the cutoff of the global false discovery rate for peptide and protein identification was set to 0.01. 
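two of the search settings above, the trypsin/p cleavage rule with up to two missed cleavages and the ppm mass tolerances, can be illustrated with a short sketch. this is not maxquant's implementation; the function names and the toy sequence are hypothetical, and the trypsin/p rule is taken here to mean cleavage c-terminal to lysine or arginine even when proline follows.

```python
# Sketch of two database-search settings; illustrative only, not MaxQuant code.

def trypsin_p_digest(sequence: str, max_missed: int = 2, min_len: int = 1) -> list[str]:
    """In-silico Trypsin/P digest: cleave after K or R (also before P),
    returning peptides with up to `max_missed` missed cleavage sites."""
    seq = sequence.upper()
    # Cleavage positions lie just after each K/R, excluding the protein C-terminus.
    sites = [i + 1 for i, aa in enumerate(seq) if aa in "KR" and i + 1 < len(seq)]
    bounds = [0] + sites + [len(seq)]
    peptides = []
    for i in range(len(bounds) - 1):
        # j - i - 1 is the number of missed cleavages inside the peptide.
        for j in range(i + 1, min(i + 2 + max_missed, len(bounds))):
            pep = seq[bounds[i]:bounds[j]]
            if len(pep) >= min_len:
                peptides.append(pep)
    return peptides


def ppm_window(mz: float, ppm: float) -> tuple[float, float]:
    """Lower and upper m/z bounds for a symmetric tolerance given in ppm."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


def within_ppm(observed: float, theoretical: float, ppm: float) -> bool:
    """True if the observed value lies within `ppm` of the theoretical value."""
    return abs(observed - theoretical) / theoretical * 1e6 <= ppm


if __name__ == "__main__":
    print(trypsin_p_digest("AKRPGK"))          # toy sequence, hypothetical
    print(ppm_window(1000.0, 20.0))            # fragment tolerance from the text
    print(within_ppm(1000.005, 1000.0, 6.0))   # precursor window from the text
```

at m/z 1000, the 20 ppm fragment tolerance corresponds to a window of about ±0.02, and the 6 ppm precursor window to about ±0.006.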
label-free quantification was carried out in maxquant as previously described. protein abundance was calculated on the basis of the normalized spectral protein intensity (lfq intensity). all statistical analyses were performed using unpaired t-tests. a p-value <0.05 and ratio >2 or <0.5 were considered to indicate significant differences. gene ontology (go) annotation and functional classification of identified proteins were performed with blast2go v2.6.2 with the current public database b2g_aug 12 (www.blast2go.com). identified proteins were classified using blast2go steps under default parameters: blast, mapping, and annotation. protein-protein interaction networks were analyzed using string software (string-db.org). confidence view was assigned a score of 0.4, indicating medium confidence. samples of prrsv-infected and dmem-inoculated pams were lysed at 24 h post infection and protein concentrations were determined. samples (20 μg) were separated by 12% sds-page and transferred to 0.22 μm nitrocellulose membranes (bio-rad laboratories, hercules, ca, usa). membranes were blocked with 5% skim milk in tris-buffered saline containing 0.05% tween-20 and incubated overnight at 4°c with monoclonal antibodies against heat shock protein 70 (hsp70; ab5439; abcam plc, cambridge, uk) or kdel receptor (ab69659; abcam plc). after washing three times, membranes were incubated at 37°c for 60 min with horseradish peroxidase-conjugated anti-mouse igg or anti-rabbit igg (abcam plc). detection used chemiluminescence luminol reagents (pierce biotechnology, waltham, ma, usa). porcine vinculin (vcl) was amplified from cdna obtained from pam cells. restriction enzyme sites were incorporated into the primer sequences to facilitate molecular cloning. pcr products were cloned into the pcaggs vector to produce the porcine vcl expression vector. 
for transfection, cells were seeded in 6-well plates (corning) and transfected at 70-80% confluency with the respective constructed plasmid dna using lipofectamine 3000 (life technologies), according to the manufacturer's instructions. marc-145 cells were infected with hp-hun4 or hun4-f112 at 36 h post transfection. empty vector transfection samples served as controls in the experiment. at 48 h post infection, total rnas of prrsv-infected cells were extracted using the rneasy mini kit (qiagen), and the viral rnas in supernatants were isolated using the qiaamp viral rna mini kit (qiagen), according to the instruction manual. all the isolated rnas were used as templates for synthesis of first-strand cdna by reverse transcription primed with an oligo(dt)18 primer using the primescript rt master mix (perfect real time, takara), according to the manufacturer's instructions. the cdna templates were then quantified using prrsv-specific real-time rt-qpcr. [28] 
3. results 
a total of 2849 proteins were detected by lfqp and are displayed in a heatmap (figure 2). statistical significance was determined using unpaired t-tests. for all tests, a p-value of <0.05 and ratio of >2 or <0.5 was considered to indicate a significant difference. glyceraldehyde 3-phosphate dehydrogenase was the internal normalization control and the ratio of glyceraldehyde 3-phosphate dehydrogenase between hp- and ap-prrsv was 0.95 ± 0.55. the "only one exists" group indicated that proteins were detected in one group but not in the other because of low expression level (table 1). correlation analysis indicated good repeatability of the technology (figure 3). a total of 95 differentially expressed proteins were identified (table 2). a control group was used to exclude false-positive interference. data from the control group were also used to elucidate functions of target proteins. 
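the significance criteria stated above (p <0.05 together with a ratio >2 or <0.5, plus the "only one exists" case) can be expressed as a small decision rule. the sketch below is illustrative rather than the authors' pipeline: it assumes the p-value from the unpaired t-test has already been computed elsewhere, treats zero or missing intensities as "not detected", and uses hypothetical function and label names (the h/a and h+/a- labels follow the notation used in the text).

```python
# Sketch of the differential-expression decision rule; illustrative only.
# Assumptions: the unpaired t-test p-value is supplied by the caller, and an
# intensity of 0 or None means the protein was not detected in that replicate.
from statistics import mean
from typing import Optional, Sequence


def _detected(values: Sequence[Optional[float]]) -> list:
    """Keep only replicate intensities that represent a real detection."""
    return [v for v in values if v is not None and v > 0]


def classify_protein(hp: Sequence[Optional[float]],
                     ap: Sequence[Optional[float]],
                     p_value: Optional[float] = None,
                     alpha: float = 0.05,
                     upper: float = 2.0,
                     lower: float = 0.5) -> str:
    """Classify one protein from its HP and AP replicate LFQ intensities."""
    hp_vals, ap_vals = _detected(hp), _detected(ap)
    if not hp_vals and not ap_vals:
        return "not detected"
    if hp_vals and not ap_vals:
        return "H+/A-"   # detected only in HP-PRRSV-infected cells
    if ap_vals and not hp_vals:
        return "H-/A+"   # detected only in AP-PRRSV-infected cells
    ratio = mean(hp_vals) / mean(ap_vals)  # vh/va in the text's notation
    if p_value is not None and p_value < alpha:
        if ratio > upper:
            return "H"   # higher in HP-PRRSV
        if ratio < lower:
            return "A"   # higher in AP-PRRSV
    return "unchanged"


if __name__ == "__main__":
    # Toy intensities and p-values, invented for illustration.
    print(classify_protein([10, 11, 12], [4, 5, 6], p_value=0.01))
    print(classify_protein([5, 5, 5], [0, 0, 0]))
```

requiring both the p-value and the fold-change threshold means that a large ratio with a non-significant test, or a significant test with a small ratio, is reported as unchanged.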
differentially expressed proteins between control and hp-prrsv-infected cells (con/hp) and between control and ap-prrsv-infected cells (con/ap) are in supporting information. 
table notes: p <0.01 and ratio >2 or <0.5 indicated quantitative difference between two groups. the "only one exists" group indicated that proteins were detected three times in one group, but not in the other group. hp-prrsv group means proteins identified in the hp-prrsv-infected pams, while ap-prrsv means proteins detected in the attenuated pathogenic prrsv-infected pams. control was displayed as the mock. 
to extend the molecular characterization of quantitative differences and only-one-exists groups, uniprot and go databases were used to characterize information about biological processes (bps), molecular functions (mf), and cellular components (cc). bps of the h (vh/va >2, including 41 proteins) and a (vh/va <0.5, including 54 proteins) groups are in figure 4a,b (bp). in group h, go annotations were primarily distributed in response to stimulus (70.7%), metabolic process (68.3%), and immune system process (43.9%). ratios were 63.0% response to stimulus, 90.7% metabolic process, and 33.3% immune system process in group a. the most significant difference in bp between pams infected with hp-prrsv and ap-prrsv was seen for metabolic process, which may be the major reason for the large differences among animals challenged with prrsv of different virulence. molecular function categories of the h and a groups are shown in figure 4a,b (mf). most different molecular functions were classified as binding (87.8 and 90.7%) and catalytic activity (31.7 and 44.4%). binding in group h included enzyme binding (31.7%), nucleic acid binding (31.7%), and protein complex binding (26.8%), while group a mainly involved nucleic acid binding (31.5%), nucleoside phosphate binding (27.8%), and nucleotide binding (27.8%) from the analysis of go distribution at level 4. 
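the go percentages reported above sum to more than 100% within each group because one protein can carry several go annotations. a minimal sketch of how such per-term percentages can be computed is shown below; the protein identifiers and annotations are toy values invented for illustration, and the function name is hypothetical.

```python
# Sketch: share of proteins in a group annotated with each GO term.
# A protein counts once per term, but may contribute to several terms,
# so the percentages need not sum to 100.
from collections import Counter
from typing import Dict, Iterable


def go_term_percentages(annotations: Dict[str, Iterable[str]]) -> Dict[str, float]:
    """Map each GO term to the percentage of proteins annotated with it."""
    n_proteins = len(annotations)
    if n_proteins == 0:
        raise ValueError("no proteins supplied")
    counts = Counter(term for terms in annotations.values() for term in set(terms))
    return {term: round(100.0 * count / n_proteins, 1) for term, count in counts.items()}


if __name__ == "__main__":
    # Toy data: hypothetical proteins P1-P4, not taken from the study.
    toy_group = {
        "P1": ["response to stimulus", "metabolic process"],
        "P2": ["metabolic process"],
        "P3": ["response to stimulus", "immune system process"],
        "P4": ["metabolic process"],
    }
    for term, pct in sorted(go_term_percentages(toy_group).items()):
        print(f"{term}: {pct}%")
```

for instance, 29 annotated proteins out of 41 would round to 70.7%, which matches the figure reported for response to stimulus in group h, although the underlying counts are not given in the text.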
enzyme code distribution suggested that five transferases (12.2%), one hydrolase (2.4%), one lyase (2.4%), and two ligases (4.9%) were detected in group h, whereas four oxidoreductases (1.9%), five transferases (9.3%), five hydrolases (9.3%), three lyases (5.6%), and three isomerases (5.6%) were detected in group a. cc categories are illustrated in figure 4a,b (cc). the 95 differentially expressed proteins were annotated and categorized to the ccs of macromolecular complex (58.9%), membrane (57.9%), membrane-enclosed lumen (57.9%), and extracellular region (44.2%). because of technological problems, we were unable to conclude that all the detected proteins were indeed mps. however, 26 were confirmed based on the annotation of the uniprot database, and there were also proteins partially anchored on the membrane or binding to mps. to verify the differentially expressed proteins identified via lc-ms/ms for lfqp, western blots were conducted for two proteins partially located on membranes. expression of hsp70 and the kdel receptor in cell lysates of vhun4-infected, vhun4-f112-infected, and dmem-inoculated pams was tested with antibodies to the proteins. lfqp showed that the ratios between vhun4-infected and vhun4-f112-infected pams reached 0.67 and 0.51, respectively. western blots confirmed the lfqp results (figure 5). to examine whether the differentially expressed proteins detected affect virus infection, the vcl transient overexpression vector was transfected into marc-145 cells, followed by infection with hp hun4 or its attenuated strain. the prrsv-specific rt-qpcr results showed that, compared with the empty vector, vcl protein could inhibit both viruses, especially the hp-hun4 strain. in this study, lfqp of mps of hp-prrsv-infected, ap-prrsv-infected, and control pams was performed. important information about target proteins related to virus infection was obtained. 
the hp-prrsv strain vhun4 and its derivative, the serially cell-passaged attenuated strain vhun4-f112, [29] revealed different infection mechanisms in pams. a total of 2849 proteins were identified among the control, ap-prrsv-infected, and hp-prrsv-infected pams. of these, 2400 were detected in all three groups (figure 7). we focused on 95 differentially expressed proteins of ap-prrsv- and hp-prrsv-infected pams; among these, 43 were detected in only one group. prrsv has been a threat to the global pig economy for several years because of its persistent infection, immune escape, and high mortality from inflammation and high fever. [30] the vhun4-f112 vaccine strain, attenuated from hp-prrsv vhun4 by serial passage, is now used in china. [31] we analyzed mps to identify factors associated with immunological effects and determine differences between hp-prrsv and ap-prrsv. mps classified based on go analysis are in table 3. a heatmap based on the uniprot database was constructed to comprehend the functions and bps of differentially expressed proteins (figure 8b), which revealed clustering and abundance of the 26 detected proteins confidently located on the membrane based on uniprot database annotation. we first focused on proteins associated with immunology and inflammation with higher abundance in ap-prrsv-infected pams. ptpn2 (vh/va = 0.419), a member of the protein tyrosine phosphatase family, functions as a signaling molecule that regulates cellular processes related to the jak-stat, il-3, il-5, and granulocyte-macrophage colony-stimulating factor (gm-csf) signaling pathways; dephosphorylation of nonreceptor kinases including jak1, jak3, stat3, and stat6; and negative regulation of il-2-, il-4-, il-6-, and ifn-mediated signaling. ptpn2 also functions in the response to inflammation via nf-κb. [32] [33] [34] sipa1 (h-/a+) is a mitogen-induced gtpase-activating protein for ras-related regulatory proteins. 
it was related to g-protein signaling and to the blood-brain barrier and immune cell transmigration: vcam-1/cd106 (cluster of differentiation 106) signaling pathways. [35, 36] c5ar1 (h-/a+) is the receptor for the chemotactic and inflammatory peptide anaphylatoxin c5a, which stimulates chemotaxis, granule enzyme release, intracellular calcium release, and superoxide anion production, and participates in the innate and adaptive immune responses to the lectin-induced complement pathway. [37] [38] [39] [40] [41] [42] polyribonucleotide 5′-hydroxyl-kinase clp1 (clp1) (vh/va = 0.136) mainly acts as an atp-binding kinase with 5′-hydroxyl-kinase activity, and functions in mrna cleavage and in sirna loading onto the rna-induced silencing complex involved in rna interference and its destruction. the kinase hclp1 phosphorylates and licenses synthetic sirnas to assemble into an rna-induced silencing complex for cleavage of target rna. [43] expression of these molecules was higher in the ap-prrsv-infected group and was similar to that in the control group. therefore, we concluded that they might be related to the immune escape of hp-prrsv, that the attenuated vaccine strain would not adversely affect expression of these immune factors in the host cell, and that these proteins would contribute to resistance to prrsv. 
table 3. statistical analysis of proteins definitely located on the membrane. 
some proteins related to immunology and inflammation had higher expression in hp-prrsv-infected cells. raftlin (vh/va = 2.554) protein is pivotal for maintenance of lipid rafts and may be involved in regulation of b-cell antigen receptor-mediated signaling. raftlin promotes binding of double-stranded rna and activation of the b cell receptor and toll-like receptor 3 signaling pathways, and is involved in il-17 production to release proinflammatory cytokines. 
hence, the higher abundance of raftlin in the hp-prrsv group compared to the ap-prrsv and control groups may explain the more severe inflammation triggered by hp-prrsv. [44, 45] iars (h+/a-), isoleucyl-trna synthetase, is a target of autoantibodies in autoimmune diseases. [46, 47] vcl (h+/a-), vinculin, is a cytoskeletal protein associated with cell-cell and cell-matrix junctions, and is related to the il-3, il-5, and gm-csf signaling pathways. [48] rap2a (h+/a-) is a small gtp-binding protein of the ras oncogene family, whose active form interacts with several effectors [49] [50] [51] related to the erk and g protein signaling pathways. [52, 53] it was reported that prrsv could induce prostaglandin e2 production through cyclooxygenase 1 and that this is related to erk signaling. [54] the differential expression of rap2a may provide information for future research on the interaction between prrsv and the erk signaling pathway. the abundance of rap2a was lower in ap-prrsv-infected pams than in hp-prrsv-infected pams and control cells, which had similar levels. thus, we hypothesized that rap2a was associated with immunization with the attenuated vaccine. ifn-induced transmembrane protein 3, ifitm3 (vh/va = 2.861), responds to biotic stimulus and had the highest expression levels among these proteins in the hp-prrsv group, as compared with the proteins identified in the ap-prrsv group, but it was not detected in the control group. in humans, this protein negatively regulates the entry of multiple viruses, including influenza a virus, sars coronavirus, marburg virus (marv), ebola virus, dengue virus, west nile virus (wnv), human immunodeficiency virus (hiv) type 1, and vesicular stomatitis virus (vsv), into host cells. 
[55, 56] it could inhibit hemagglutinin protein-mediated entry of influenza virus, gp1,2-mediated viral entry of marburg virus and ebola virus, s protein-mediated viral entry of sars coronavirus, and g protein-mediated viral entry of vsv, thereby playing a critical role in the structural stability and function of vacuolar atpase (v-atpase). establishing physical contact with the v-atpase of endosomes is critical for proper clathrin localization and is required for v-atpase to lower the ph in phagocytic endosomes, thus establishing an antiviral state. [57] high expression in both hp-prrsv- and ap-prrsv-infected pams suggests that ifitm3 may help inhibit prrsv entry and promote resistance to hp-prrsv infection. peptidyl-prolyl cis-trans isomerase b (ppib, vh/va = 0.406) has peptidyl-prolyl cis-trans isomerase activity and is involved in protein folding and protein peptidyl-prolyl isomerization, which can accelerate protein folding. [58] ppib is positively regulated by viral genome replication, is involved with the viral processes of hepatitis c virus (hcv), and interacts with and stimulates the rna-binding activity of hcv ns5b. ppib is critical for efficient replication of the hcv genome. [59] compared with the control group, ppib was downregulated in the hp-prrsv and ap-prrsv groups. the abundance of ppib was lower in the hp-prrsv group than in the ap-prrsv group, suggesting an association with prrsv replication. string network analysis was used to elucidate interactions among differentially expressed proteins. the essential factors and receptors involved in the entry of prrsv are reportedly cd163 (q2vl90), nmhc ii-a (myh9, f1skj1), sialoadhesin (sn, a7lcj3), cd151 (f1ryz1), and vimentin (vim) (p02543). [60] [61] [62] [63] [64] [65] we assessed the involvement of these mps in viral entry together with the detected proteins by string analysis. 
receptor proteins involved in prrsv entry are in figure 9. to better understand the characteristics of pathway-like regions, the genomernai database was analyzed to annotate detected proteins. [66] regions containing proteins with higher abundance in hp-prrsv-infected pams are green and in ap-prrsv-infected pams are red (figure 9a). a search of the genomernai database determined that the linked proteins ik (ik cytokine), wd repeat domain 12, dyskerin pseudouridine synthase 1, and adenylosuccinate lyase (adsl) in the green region of figure 9a shared similar annotations and functioned to decrease expression of nf-κb or il-8, [67] which are related to inflammation. [44, 45] hence, further analysis of these linked proteins may help explain differential mechanisms between hp-prrsv and ap-prrsv infection. other proteins in the green region of figure 9a are reported to influence virus infection. nucleolar and coiled-body phosphoprotein 1 increases sindbis virus infection [68] and adsl increases human papilloma virus 16-gfp infection. [69] the lower abundance of nucleolar and coiled-body phosphoprotein 1 and adsl in the hp-prrsv-infected group may be related to the host immune response repression of prrsv and may be used by hp-prrsv during infection to self-upregulate. the red region of figure 9a contains some linked proteins with some information. uridine monophosphate synthetase decreases hiv-1 infection, [70] while programmed cell death 5 decreases hcv replication; [71] both downregulate nf-κb expression. [67] the lower abundance of these proteins in the hp-prrsv group suggested that hp-prrsv escaped inhibition through an unknown mechanism. 
figure 9. protein-protein interaction network determined by string software showing interactions among differentially expressed proteins between two virulent groups with cd163, vim, and myh9 added. red, protein abundance higher in hp-prrsv than ap-prrsv; green, protein abundance in hp-prrsv lower than in ap-prrsv. line colors represent type of evidence for association: green, neighborhood; red, fusion; purple, experimental; light blue, database; black, expression; blue, co-occurrence; yellow, text mining. gene abbreviations are shown. vh, protein abundance in hp-prrsv; va, protein abundance in ap-prrsv. vh/va = 0, detected only in ap-prrsv; vh/va = null, detected only in hp-prrsv. 
myosin-9 (myh9) appears to function in cytokinesis, cell shape, and specialized functions such as secretion and capping. [72] myh9 is an important factor for prrsv infection and interacts with gp5 of prrsv, [65] although the underlying mechanisms remain unclear because of limited research. as shown by our results, myh9 was related to vcl in ap-prrsv and actl6a in hp-prrsv. vcl is an actin filament (f-actin)-binding protein involved in cell-matrix adhesion and cell-cell adhesion in humans, regulation of cell-surface e-cadherin expression, and potentiation of mechanosensing, and may be important in cell morphology and locomotion by promoting binding with actin, alpha-catenin, cadherin, dystroglycan, and ubiquitin protein ligase. [73] in hiv research, transient overexpression of vcl reduced the susceptibility of human cells to infection with hiv-1, negatively affected paxillin phosphorylation, and limited retroviral infection. [74] as with hiv-1, our study demonstrated that overexpression of vcl could inhibit the replication of both hp-prrsv and the attenuated prrsv at the mrna level. the strength of inhibition differed clearly between hp-prrsv and its attenuated strain. actl6a, actin-like protein 6a, is involved in transcriptional activation and repression of select genes by chromatin remodeling (alteration of dna-nucleosome topology) and mainly functions in chromatin binding and transcription coactivator activities. 
[75] we found that myh9 may regulate ap-prrsv and hp-prrsv infection via different pathways. through super pathways annotation (www.genecards.org), vcl and actl6a were identified as participants in the il-3, il-5, gm-csf, and tnf-α/nf-κb signaling pathways, where they may help with differential mechanisms of prrsv infection with different virulence. using the genomrnai database, some linked proteins were found to be related to viruses or inflammation ( figure 9b ). in the green region, gnas complex locus (gnas) increased human papilloma virus 16-gfp, [69] and serine/arginine repetitive matrix 2 decreased hcv infection, influenza a replication and viral numbers, and il-8 expression, [76, 77] small nuclear ribonucleoprotein u1 subunit 70 decreased influenza a replication and viral numbers. [77] irf-3 decreased infection by hcv, west nile virus, and dengue virus. [76, 78] in the red region, hsf1 decreased hcv replication. [76] protein kinase camp-activated catalytic subunit alpha decreased vsv infection. [79] signal transducer and activator of transcription 6 decreased il-8 expression. rap2a, mink1, and gnas are regulatory proteins in the ras pathway. rap2a and mink1 were detected only in the hp-prrsv group, whereas gnas was detected only in the ap-prrsv group, in accordance with a previous report that stimulation of ras increases the replication ability of hcv by reducing ifn-jak-stat pathway activity. [80] we proposed that prrsv was also related to the ras pathway, and differed between hp-prrsv and ap-prrsv. creb-binding protein (crebbp) is composed of doublestranded rna-activated transcription factor with irf-3, and double-stranded rna-activated transcription factor is activated in many virus-infected cells to promote apoptosis. [81, 82] crebbp may interact with human herpes virus 8 virf-1, which could inhibit the binding of crebbp to irf-3. 
[83] our results showed that irf-3 abundance was tenfold lower in the hp-prrsv than in the ap-prrsv group, which we proposed was a protective mechanism of prrsv to escape from host immunity and ensure survival after viral infection. the ability of hp-prrsv hun4 to induce cell apoptosis is stronger than that of classical prrsv (ch-1a) in immune organs and lungs of piglets. [84] we concluded that hp-prrsv might interact with crebbp and decrease irf-3 expression to escape from host immunity and cause severe damage to the cell, highlighting a difference from ap-prrsv. notable differences exist between infection with ap-prrsv and with hp-prrsv. hp-prrsv could inhibit host immune function and evade the immune response via an unknown mechanism. [85, 86] label-free ms was performed using ap-prrsv-infected and hp-prrsv-infected pams. this is the first attempt to explore the differential characteristics between pams infected with hp-prrsv and with its attenuated strain, focusing on membrane proteins. by analyzing detected proteins in the hp-prrsv-infected, ap-prrsv-infected, and control groups, proteins related to the immune response or virus replication were identified that may elucidate unique pathways used by prrsvs of different virulence for cell entry, virus replication, and immune escape. research on these detected proteins will help to elucidate their identity, expression abundance, and significance. future studies will focus on the functions of these key membrane proteins to deepen our understanding of the differential mechanisms between hp-prrsv and ap-prrsv infection. supporting information is available from the wiley online library or from the author. the authors declare that they have no conflicts of interest associated with this report.
keywords attenuated, highly pathogenic, infection, label-free quantitative proteomics, membrane proteins key: cord-021393-9loesliv authors: meade, h.m.; echelard, y.; ziomek, c.a.; young, m.w.; harvey, m.; cole, e.s.; groet, s.; smith, t.e.; curling, j.m. title: expression of recombinant proteins in the milk of transgenic animals date: 2007-09-02 journal: gene expression systems doi: 10.1016/b978-012253840-7/50015-8 sha: doc_id: 21393 cord_uid: 9loesliv nan introduction of foreign dna into the murine genome by microinjection and the generation of transgenic offspring were first reported by gordon et al. (1980) and gordon and ruddle (1981) . this technique has been applied to the production of transgenic livestock. using mammary gland-specific promoters, a wide range of proteins of biopharmaceutical interest have been expressed in rodents, pigs, and dairy animals. an expression vector, comprising a gene encoding the target protein of interest fused to a milk promoter gene, is introduced by microinjection into the pronucleus of a one-cell embryo. upon germ line integration and expression, the transgene becomes a dominant mendelian genetic characteristic that is inherited by the progeny of the founder animal. the transgenic offspring may express the target protein in gram per liter quantities, most frequently as a soluble whey protein. mammalian mammary epithelial cells have the capacity to carry out complex protein synthesis with a variety of posttranslational modifications and proper folding. coexpression of modifying enzymes in the epithelial cell golgi apparatus may allow the heterologous protein to be engineered to confer specific or desired pharmacokinetic characteristics. the milk of transgenic livestock presents an excellent starting material from which human diagnostic or pharmaceutically active, therapeutic proteins may be purified using established technologies of the dairy and biopharmaceutical industries. 
to date (1997), probably more than 50 proteins have been expressed in the milk of transgenic mice, rats, rabbits, goats, sheep, pigs, and dairy cows. phase i clinical trials with antithrombin iii produced in transgenic goats were completed successfully in 1996, and phase ii clinical trials are currently being performed in the united states. α1-antitrypsin produced in transgenic sheep is currently in phase i clinical trials in the united kingdom. in addition, human lactoferrin and fibrinogen expressed in the milk of transgenic cows and sheep, respectively, are in the late stages of development or preclinical evaluation. the field of transgenic research has been reviewed periodically since 1990. the reader will find the following articles, which trace the development of transgenic technology applied to dairy animals, of interest: henninghausen (1990), henninghausen et al. (1990), bialy (1991), wilmut et al. (1991), wall et al. (1992), jänne et al. (1992, 1994), logan (1993), ebert and schindler (1993), lee and de boer (1994), houdebine (1994, 1995), and echelard (1996). this chapter discusses the expression of therapeutically useful proteins in mammalian milk with a focus on dairy animals as the production systems of choice and reviews expression constructs, milk-specific transgenes, transgene insertion, transgenic animal production, protein biosynthesis and secretion, lactation, milk, protein purification, and quality and regulatory issues. laboratory mice have served as a model expression system for foreign proteins since the inception of animal transgenesis and are used frequently for feasibility studies concomitant with or prior to the generation of larger, founder transgenic animals. many recombinant proteins of interest in human therapy are, by their very nature, biologically active in most mammals.
the murine model is useful in determining, at an early stage of research, the transgene expression characteristics and potential effects on animal health. transgenic mice are generally predictive of what will be observed in larger animals. many transgenic proteins have been expressed in the milk of transgenic livestock. table 1 summarizes the published results of expression in transgenic farm animals during the period from 1990 to 1996. several of these proteins have been expressed at high levels, demonstrating the usefulness of the production system. examination of a great number of transgenic lines has shown that mammary gland-specific expression of a target protein is associated with increased plasma levels of this protein, even in the absence of ectopic expression. the most likely explanation is leakage of the protein from the milk into the circulation through the junctional complexes of the mammary epithelial cells. therefore, certain proteins and peptides, such as highly active hormones and cytokines, cannot be expressed in the mammary system as their secretion into the blood may have severe detrimental effects on the host.

table 1 (fragment): lonnerdal and iyer (1995). a, gene insertion via teat canal using retroviral vectors galv and momlv. b, data from genzyme transgenics corp. (1996).

transgenes containing sequences of several milk protein genes, reviewed by maga and murray (1995) and echelard (1996), have been used to direct the expression of exogenous proteins to the lactating mammary gland. these transgenes are usually chimeric, being derived from the fusion of a target protein gene and mammary-specific regulatory sequences. although both genomic dnas and cdnas coding for target proteins have been used for expression, higher levels are normally obtained with genomic dnas. the incorporation of untranslated exons and introns may contribute to increased expression of the transgene. addition of a signal sequence is necessary if the exogenous protein is not normally secreted.
this will cause the protein to be secreted out of the mammary tissue into the milk. regulatory sequences from several milk-specific genes have been isolated and tested in transgenic animals: ovine β-lactoglobulin; murine, rat, and rabbit whey acidic protein (wap); bovine αs1-casein; rat, rabbit, and goat β-casein; and guinea pig, ovine, caprine, and bovine α-lactalbumin. of these promoters, several have permitted grams per liter expression of target proteins in the milk of transgenic offspring, sometimes in large dairy animals; some of this work is summarized next. the ovine β-lactoglobulin gene contains seven exons and six introns spanning a 4.2-kb region (harris et al., 1988). the first reported ovine β-lactoglobulin chimeric transgenes (archibald et al., 1990) were composed of 4 kb of 5' flanking sequence fused to the α1-antitrypsin genomic sequences. other configurations using variable amounts of 5'- and 3'-flanking sequences have also been used (whitelaw et al., 1992) and reviewed by maga and murray (1995). with the ovine β-lactoglobulin gene, high-level expression (g/liter) of α1-antitrypsin, fibrinogen, and hsa has been reported (wright et al., 1991; shani et al., 1992; prunkard et al., 1996). however, similar results have not been observed with transgenes containing cdna sequences (clark et al., 1989; shani et al., 1992; hansson et al., 1994; yull et al., 1995). rodent wap genes consist of four exons and three introns: the middle two exons encode the two cysteine-rich regions, which probably form separate protein domains (campbell et al., 1984). wap is present in the milk of mouse, rat, rabbit, and camel, but is absent from the milk of cow, sheep, pig, goat, and human. no recognizable wap gene homologues have been isolated from these species.
rat (wei et al., 1995; yarus et al., 1997), mouse (gordon et al., 1987; ebert et al., 1991; reddy et al., 1991; velander et al., 1992; drohan et al., 1994; hansson et al., 1994; limonta et al., 1995), and rabbit (bischoff et al., 1992; devinoy et al., 1994; thépot et al., 1995) wap regulatory sequences have been used to direct expression of exogenous proteins to the mammary gland. high-level expression of the target protein was observed with mouse and rabbit wap transgenes. surprisingly, relatively high expression levels (up to 1 g/liter) were observed in transgenic pigs with a construct containing 2.6 kb of 5' and 1.3 kb of 3' mouse wap sequences linked to a human protein c cdna (velander et al., 1992). this result seems to indicate that mwap regulatory sequences can function efficiently in species that do not have an endogenous wap gene. the bovine αs1-casein gene contains 9 exons and spans 17.5 kb (koczan et al., 1991). originally (meade et al., 1990), a transgene containing 21 kb of 5' and 2 kb of 3' flanking sequence fused to the genomic sequences of human urokinase was shown to direct high milk expression levels (1-2 mg/ml) in mice. promising results were also obtained in transgenic rabbits with a construct containing the human igf-1 cdna (brem et al., 1994), in mice with the human lysozyme cdna (maga et al., 1994), and with human lactoferrin and human granulocyte-macrophage colony-stimulating factor genomic constructs (nuijens et al., 1995; uusi-oukari et al., 1997). conversely, a bovine αs1-casein-human lactoferrin cdna construct only permitted low-level expression in milk of transgenic mice (platenburg et al., 1994), as was the case with a human tpa cdna construct fused to 1.6 kb of bovine αs1-casein 5'-flanking sequences (riego et al., 1993). the bovine α-lactalbumin gene contains four exons and three introns.
early reports (vilotte et al., 1989; stinnakre et al., 1991) indicated that a construct containing 750 bp of 5' and 336 bp of 3' flanking region was sufficient to direct intermediate expression levels in transgenic mouse milk when fused to bovine α-lactalbumin or ovine trophoblastin cdnas. by using a construct containing the same amount of 5'-flanking sequence linked to hgh genomic sequences, higher levels of the target protein (up to 4.3 mg/ml) were obtained in the milk of transgenic rats (ninomiya et al., 1994). however, at this point, results obtained with α-lactalbumin-driven transgenes in large animals have not been reported. the caprine β-casein gene (csn2) has been cloned and sequenced (roberts et al., 1992; persuy et al., 1992). the intron/exon organization of the 9-kb goat gene is similar to that of other csn2 genes and its expression is limited principally to the mammary gland during lactation. high-level expression was observed in goats transgenic for a construct containing 6.2 kb of 5' and 7.1 kb of 3' goat β-casein flanking noncoding sequence fused to a variant of the human tpa cdna. high-level expression with caprine β-casein-containing transgenes has also been observed, in mouse milk, with bovine κ-casein (persuy et al., 1995; gutierrez et al., 1996), antithrombin iii (cdna and genomic, mice and goats), hsa (cdna and genomic), α1-antitrypsin (genomic, mice and goats), and both heavy and light chains of several humanized antibodies (h. meade et al., unpublished data). transgenic animals may be generated by direct microinjection of the foreign gene into the pronuclei of one-cell stage embryos. microinjection techniques have been reviewed exhaustively by hogan et al. (1986) and pinkert (1994), among others.
techniques initially developed for gene insertion into murine pronuclei (gordon et al., 1980; gordon and ruddle, 1981; brinster et al., 1985; palmiter and brinster, 1986) have been adapted to gene transfer into the pronuclei of ruminants and pigs (hammer et al., 1985; pinkert, 1994) . if the microinjected dna integrates into the genome of the recipient before the first cell division occurs, a heterozygous founder can be created. later integration leads to genetic mosaics consisting of normal cells with a normal genome and cells with transgenomes. generally, fertilized eggs are flushed from the oviduct of a superovulated female donor, microinjected with a few hundred copies of the transgene, transferred to the oviduct or uterus of a pseudopregnant recipient animal, and developed to term. the first transgenic offspring, or founder animals, are at best hemizygous as the transgene is not integrated into both copies of a pair of homologous chromosomes. in the case of mosaic founders, germ line transmission is not always observed. in addition, multiple transgene integration sites have been detected in 10-20% of transgenic founders. to optimize the collection and transfer of microinjectable goat embryos, selgrath et al. (1990) established regimens for superovulation/synchronization and timing of pronuclear embryo collection. does were synchronized with norgestomet ear implants and superovulation was induced with pregnant mare serum gonadotropin (pmsg) or follicle-stimulating hormone (fsh-p). does were hand mated with the average female being mated six to eight times by two different males over a 24-to 36-hr period. embryos were recovered surgically and pronuclei could be visualized without centrifugation. after microinjection the embryos were transferred to the reproductive tracts of recipient females using a surgical procedure similar to that used for the embryo collection. 
ewes may be similarly synchronized and superovulated during the breeding season using progestin (30 mg fluorogestone acetate) pessaries and fsh injection (rexroad and wall, 1987). fertilized sheep oocytes are semiopaque and not readily visible. however, differential interference contrast microscopy allows visualization, and successful microinjection may be determined by the swelling of the pronucleus, which occurs on injection of dna (simons et al., 1988). microinjected embryos are then transferred to recipient ewes. bovine oocytes are generally collected from slaughtered heifers or obtained from cows superovulated with pmsg or prostaglandin. to circumvent surgical procedures and in vivo fertilization, krimpenfort et al. (1991) have used an in vitro fertilization and embryo production procedure. the technique for dna microinjection is similar to those described earlier: centrifugation, e.g., at 12,000 × g for 10 min, is necessary to visualize pronuclei. microinjected cow embryos are usually cultured in vitro to the morula-blastocyst age, at which time they are transferred nonsurgically to suitable recipients. the techniques and efficiency of gene transfer in cows have been described by roschlau et al. (1989) and mcevoy and sreenan (1990). synchronization of sows is accomplished with hormonal treatment and superovulation induced with pmsg or human chorionic gonadotropin. synchronization, however, has been shown to affect the farrowing rate (pursel et al., 1990). pig ova are opaque and no nuclear structures can be seen, even using interference contrast microscopy. centrifugation at 15,000 × g for 5 min leaves pronuclei visible in the equatorial segment of the cytoplasm (brem et al., 1985; hammer et al., 1985). microinjected embryos are then transferred to recipient pigs. factors affecting the success of microinjection as a gene insertion technique have been reviewed by rexroad et al. (1990).
experience in microinjection is an important factor as well as dna concentration and the gene construct. the stage of egg development and the quality of eggs may affect the efficiency of producing transgenic animals. three factors contribute to problems of microinjection of livestock embryos compared to mice. cytoplasmic vesicles may obscure the view of pronuclei, fewer embryos are available for microinjection, and there is considerable variability in the stage of embryo development at the time of embryo collection. despite these challenges, there has been considerable success in developing transgenic goats, sheep, pigs, and cows. key reproduction parameters and the success rate of transgenesis are summarized in table 2 . in summary, oocytes may be fertilized in vivo or in vitro. in vivo fertilization may be controlled by artificial insemination of a superovulated animal at the stage where the oocyte has matured in the ovary. an alternative pathway is to isolate follicular oocytes from the ovary of the donor animal and proceed to in vitro maturation and fertilization prior to microinjection and implantation in the recipient female. in goats, sheep, and cows the rate of transgenic births is 5-10%. embryonic stem (es) cells may offer an alternative to pronuclear microinjection for achieving transgenesis. however, pluripotent es cells able to contribute to the germ line have only been described in mice. there have been descriptions of chimeric animals generated with rat, pig, and cow es cells (iannaconne et al., 1994; wheeler, 1994; stice et al., 1996) , but in the case of rats and pigs no evidence of germ line transmission from these cells has been reported. cow fetuses obtained following nuclear transfer of es cell-derived nuclei in recipient oocytes died in utero, exhibiting major defects in placental development (stice et al., 1996) . in sheep, the first large animals derived from cultured cells were described in 1996 by campbell et al. 
two healthy, phenotypically female lambs were born from embryos generated by transferring nuclei isolated from embryo-derived cells into enucleated oocytes. dna analysis demonstrated that all the nuclear transfer lambs and fetuses were derived from the cell line. it is not yet clear whether foreign dna can be introduced in this type of cell line, and there are questions about the health and reproductive fitness of the recovered offspring. nevertheless, this experiment is certainly a step toward the possibility of replacing pronuclear microinjection as the method of choice for the generation of transgenic large animals. another potential alternative to the pronuclear microinjection of the transgene is the use of replication-defective retrovirus vectors. archer et al. (1994) have described a procedure in which the construct is infused directly into the mammary gland, via the teat canal, during a period of hormone-induced mammogenesis. a gibbon ape leukemia virus was used to deliver the structural gene encoding human growth hormone, resulting in expression of the hormone in goat mammary epithelial cells. the advantage of this method is that expression of the recombinant protein is obtained quickly, without the delay caused by the generation interval required to generate a producing transgenic animal (18 months for goats). however, reported production levels (archer et al., 1994) are very low (see table 1) and at this point the method can only be used for analytical purposes. transgenic animals for the production of therapeutic and diagnostic proteins are produced by transferring fertilized, transgene-carrying embryos to recipient animals. following natural gestation and birth, offspring are subject to tissue biopsy and blood sampling: usually ear tissue and blood are screened by polymerase chain reaction and/or southern analysis for the presence of the transgene.
animals identified as being transgenic are mated with nontransgenic animals: transgenic founder females will produce the protein for which the dna codes in their blood or milk depending on the tissue-specific, regulatory promoter sequence of the transgene. subsequently, transgenic female progeny derived from breeding founder females and males will also express the transgenic protein. in goats, 16-18 months are required before the first milk is obtained from a natural lactation of a female transgenic animal. however, milk samples can be obtained from founder transgenic females, as well as approximately 30% of male transgenic animals by hormonally induced lactation at about 13 months after microinjection. induced lactation is useful in checking the expression level and integrity of the heterologous protein. the original precursors of most of the milk constituents are cellulose, starch, protein, fat, minerals, and vitamins of the plant materials of the ruminant diet. water is also a prerequisite, and dairy cows require 3-4 liters of water per liter of milk produced. rumen microorganisms synthesize amino acids, which are adsorbed into the bloodstream. the milk protein precursor amino acids are adsorbed from the bloodstream via the extracellular fluid between the capillaries and the epithelial cells, across the basement membrane of the mammary epithelial cells. protein biosynthesis is carried out in epithelial cells, and milk proteins are discharged into the lumen of the alveolus by exocytosis and then into the ducts. in ruminants, the ducts empty into a single, primary duct or cistern, which provides extra milk storage capacity. up to 30% of the milk in the udder is held in the cistern. 
as measured by the concentration difference in arteriovenous blood, the mammary gland is particularly efficient at extracting amino acids: 80% of the arterial methionine; 70% of the phenylalanine, leucine, and threonine; 60% of the lysine, arginine, and isoleucine; 55% of the histidine; and 50% of valine are adsorbed. more than 25 % of the arterial blood glucose is removed during passage through the mammary gland and is used to power protein synthesis, fat, and lactose production. intermediates required for protein synthesis are produced in the cytosol and mitochondria. protein is synthesized at polyribosomes in the rough endoplasmic reticulum where removal of signal peptides occurs. glycosylation is carried out in the endoplasmic reticulum and the golgi apparatus where phosphorylation, other posttranslational modifications, and the assembly of casein micelles are carried out. studies with radiolabeled amino acids indicate that proteins are synthesized in 3-15 min. radioactivity is seen in the golgi apparatus after 15-30 min, and the label concentration in the lumen increases after a further 30-60 min. immunoglobulins and albumin present in milk are not synthesized in the mammary gland but by plasma cells and hepatocytes, respectively, and enter the milk by active transport or filtration. during lactation, b lymphocytes migrate to the mammary gland where they become plasma cells. plasma cells lodged in the interstitial space may contribute to the high igg concentration present in the colostrum of early lactation. in the cow, approximately 500 liters of blood is required to provide the precursors for 1 liter of bovine milk: the blood flow in the udder is ca. 280 ml per second. the goat uses about 400 liters of blood to produce 1 liter of milk: the blood flow in the udder is ca. 1200 liters per day. the mammary gland is able to secrete about 2 g of milk, containing approximately 18 mg of whey protein, per gram of tissue per day. 
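the blood-flow figures quoted above can be cross-checked with simple arithmetic. the following is a minimal sketch, not part of the original chapter: it only combines the numbers given in the text (blood required per liter of milk and udder blood flow), and the function and variable names are hypothetical.

```python
# Rough consistency check of the blood-flow figures quoted in the text.
# All inputs come from the chapter; names are illustrative only.

SECONDS_PER_DAY = 24 * 60 * 60


def daily_milk_liters(blood_flow_liters_per_day, blood_liters_per_liter_milk):
    """Milk volume per day that a given udder blood flow could support."""
    return blood_flow_liters_per_day / blood_liters_per_liter_milk


# Cow: ca. 280 ml of blood per second, ~500 liters of blood per liter of milk.
cow_flow_liters_per_day = 280 * SECONDS_PER_DAY / 1000
cow_milk = daily_milk_liters(cow_flow_liters_per_day, 500)

# Goat: ca. 1200 liters of blood per day, ~400 liters of blood per liter of milk.
goat_milk = daily_milk_liters(1200, 400)

print(f"cow:  {cow_flow_liters_per_day:.0f} liters blood/day, ~{cow_milk:.1f} liters milk/day")
print(f"goat: 1200 liters blood/day, ~{goat_milk:.1f} liters milk/day")
```

under these figures the cow's udder blood flow corresponds to roughly 48 liters of milk per day and the goat's to about 3 liters per day; these are upper-bound style estimates implied by the quoted ratios, not measured yields.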
a gram of tissue contains about 2 × 10^8 cells and the milk output is therefore of the order of 10^-8 g per cell per day. lymph drains from the udder of the cow at the rate of 1300 ml per hour and in the goat at the rate of 6.5-35 ml per hour. leukocytes, which account for the major part of the somatic cells found in milk, are derived from lymph. morcöl et al. (1994) have calculated the synthesis rate of human recombinant protein c in transgenic swine expressing the protein at 0.1-1 mg/ml milk to be approximately 14 mg per gram of mammary cell per day or about 14 pg/cell/day. in contrast, the rate of normal synthesis in hepatocytes was calculated to be 0.02 mg protein c per gram cells per day. it has been suggested that mammalian species phylogenetically close to humans may be expected to have more elements of the glycosylation machinery in common (jenkins et al., 1996). initial reports indicate that human glycoproteins expressed in the mammary gland of transgenic animals contain glycosylation patterns that differ from those found on human plasma-derived proteins. in general, the glycosylation found on human proteins secreted into transgenic animal milk has been similar to plasma protein glycosylation. sites are mainly biantennary complex oligosaccharides with some variations consistent with the tissue and species of origin. cole et al. (1994) have shown that human antithrombin iii (at iii) and a long-acting form of htpa (latpa) expressed in goat milk had some galnac replacing galactose on complex n-linked oligosaccharides. both latpa and at iii were shown to be more fucosylated than their recombinant or plasma counterparts. goat plasma at iii contains n-glycolylneuraminic acid and n-acetylneuraminic acid, as do transgenically expressed at iii and latpa. an additional difference observed between plasma and transgenic goat at iii is in the degree of sialylation, with the transgenic protein less sialylated than plasma at iii.
denman et al. (1991) have also noted that significantly lower levels of galactose, n-acetylglucosamine, and sialic acid are present in goat transgenic latpa compared to the murine c127 cell line and chinese hamster ovary (cho) cell-derived latpa. the glycosylation of interferon-γ at asparagine 25 and 97 is influenced dramatically by cho cell culture conditions (curling et al., 1990); considerable variations in site occupancy are seen. transgenic mouse-derived interferon-γ has predominantly complex sialylated biantennary n-glycans at asparagine 25, and oligosaccharides are α1-6 core fucosylated similar to the asn25 of cho cell-derived interferon-γ (james et al., 1995). there is an increased incidence of oligomannose at asn97 compared to the cho-derived counterpart, suggesting that murine mammary epithelial cells may be deficient in the α1-2 mannosidase i and glcnac transferase i activities in the endoplasmic reticulum. although an ultimate goal may be to produce proteins with authentic human glycosylation patterns, a more realistic and possibly more desirable objective is the production of proteins with defined, engineered glycosylation characteristics and therefore predictable pharmacokinetics. glycoproteins can be remodeled in situ by the transgenic coexpression of human glycosyltransferase. prieto et al. (1995) have demonstrated that the heterologous, transgenic expression of human α1,2-fucosyltransferase results in expression of both transgene and secondary gene products. their work also suggests that the mammary gland may be a unique bioreactor for the production of biologically active oligosaccharides and glycoconjugates. in studies of the expression of rh protein c in transgenic pigs, subramanian et al. (1996) have shown that there are rate limitations of γ-carboxylation in mice and pigs, partly dependent on the transgene.
their study indicates that a rate limitation of γ-carboxylation in mammary epithelial cells occurs at expression levels of >20 µg/ml in mice and at >500 µg/ml milk in pigs. mammary glands are skin glands that have no counterpart in nonmammals and are located in the inguinal region of cows, sheep, and goats. in the sow they are located along the thoracic, abdominal, and inguinal walls. streak canals link the internal milk secretory system with the external environment. the number of teats, teat orientation, and length of the let-down reflex affect the ease and periodicity of milking. milk is released as a result of a neuroendocrine reflex of the nervous system to tactile, auditory, and visual stimuli. negative psychological and environmental disturbances have a detrimental effect on milk production. milk let-down is a response to the release of oxytocin, synthesized in the hypothalamus, by the posterior pituitary gland. oxytocin is transported to the mammary gland in the arterial blood and binds to myoepithelial cells that contract, causing a release or let-down of the milk. sheep and goats have let-down periods of 1-2 and 2-4 min, respectively. the let-down reflex in the cow lasts for 5-8 min. in the sow, however, the reflex is extremely short, of the order of 10-20 sec, with a frequency of 1 hr or less. from the point of view of primary milk production, goats and cows are preferred animals because of the relatively long let-down reflexes, vertical teat orientation, milk volume, and duration of lactation. however, other factors, such as protein concentration, may favor a choice of sheep. goats, sheep, and cows respond to familiar signals, such as the sound of a milking machine, whereas lactating, nonsuckling sows require injections of oxytocin to elicit the let-down response. at parturition, a series of programmed hormonal changes take place that transform the mammary cells to the fully secretory state.
stage 2 lactogenesis, or the copious production of milk, is brought about by a synchronous drop in progesterone, an increase in estrogen, and the release of prolactin from the anterior pituitary and follows the immediate postpartum production of colostrum. in the cow, milk production increases in the first 3-6 weeks of lactation and then slowly declines. a similar pattern is seen in goats, sheep, and pigs. milk secretion continues as long as milk is regularly withdrawn, although production declines during lactation. dairy animals are capable of undergoing estrus and pregnancy while maintaining their lactation. this enables dairy management to breed the animal such that it will give birth and begin lactation on a yearly basis. in order to maximize production, the animals are allowed to lactate until 2 months before they are due to give birth. they are then "dried off" by cessation of milking. the 2-month rest before the restart of lactation allows the animal to rebuild her energy reserves for the coming birth and lactation. following birth of the progeny, the animal once again begins the yearly lactation cycle. in all species, yield and energy content are related to body size. the expression level of exogenous protein tends to follow the normal milk output, as can be seen from the plots in fig. 1, which shows (a) the production of a recombinant monoclonal antibody over a 220-day lactation of a transgenic goat and (b) the levels of antithrombin iii in the milk of a transgenic goat during a 300-day lactation. it has been noted that "considering the yearly milk output of dairy cattle (6000-8000 liters) and the milk content of αs1-casein (10 g/liter), one cow carrying a transgene under the control of αs1-casein promoter would theoretically produce 60-80 kg/year of the transgene-derived protein" (jänne et al., 1992). at an expression level of 5 g/liter a transgenic goat is capable of producing about 4 kg of target protein per year; at 20 g/liter the output is 16 kg.
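the yield figures above follow from multiplying annual milk volume by expression level. a minimal python sketch of that arithmetic is given here; the function names and the 800 liters/year goat milk volume are assumptions for illustration (the volume is back-calculated from the quoted 4 kg at 5 g/liter and is not stated in the text), and the results are theoretical outputs before any purification losses.

```python
import math


def annual_protein_kg(milk_liters_per_year: float, expression_g_per_liter: float) -> float:
    """Theoretical (unpurified) target protein output in kg per animal per year."""
    return milk_liters_per_year * expression_g_per_liter / 1000.0


def animals_needed(target_kg: float, per_animal_kg: float) -> int:
    """Smallest whole number of animals whose combined output reaches target_kg."""
    return math.ceil(target_kg / per_animal_kg)


# Cow: 6000-8000 liters/year at 10 g/liter (figures quoted in the text).
cow_low_kg = annual_protein_kg(6000, 10)
cow_high_kg = annual_protein_kg(8000, 10)

# Goat: ASSUMED 800 liters/year, inferred from "4 kg at 5 g/liter".
GOAT_MILK_LITERS_PER_YEAR = 800
goat_kg_at_5 = annual_protein_kg(GOAT_MILK_LITERS_PER_YEAR, 5)
goat_kg_at_20 = annual_protein_kg(GOAT_MILK_LITERS_PER_YEAR, 20)

if __name__ == "__main__":
    print("cow, kg/year:", cow_low_kg, "to", cow_high_kg)
    print("goat, kg/year at 5 and 20 g/liter:", goat_kg_at_5, goat_kg_at_20)
    print("goats for 100 kg at 5 g/liter:", animals_needed(100, goat_kg_at_5))
    print("goats for 100 kg at 20 g/liter:", animals_needed(100, goat_kg_at_20))
```

under these assumptions, and ignoring process losses, a 100-kg target corresponds to 25 goats at 5 g/liter or 7 goats at 20 g/liter.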
it is, therefore, quite conceivable to produce 100-kg quantities of target protein in small goat herds or sheep flocks. milk is a multiphasic fluid composed of a fat emulsion, a micellar casein dispersion, a colloidal suspension of lipoproteins, and a solution of proteins, mineral salts, vitamins, organic acids, and minor components. when collected, milk is not sterile and contains bacteria derived from the teat as well as somatic cells derived mostly from the lymphatic ducts of the udder. according to the pasteurized milk ordinance (u.s. department of health and human services, 1993), the bacterial plate count should be less than 100,000 per milliliter. milk composition is species specific; major differences are seen between human and ruminant milk. within a species, the composition varies with breed, diet, and other factors. volume and composition also vary during lactation. table 3 gives the average percentage compositions of livestock milk. milk is approximately 85-90% water; the ph is 6.5-6.7 and as high as ph 6.8 in ewe's milk. it is important from a purification point of view that the fat is present in globular form in the size range of 0.1-10 µm and with a density lower than the other constituents. the fat globule is enclosed in a membrane of polar lipids and proteins. triglycerides make up 97% of the fat. the three subgroups of casein, αs-, β-, and κ-casein, display genetic polymorphism and consist of two to eight variants. casein is present as 10- to 300-nm micelles formed of submicelles held together by phosphate and hydrophobic bonding. each submicelle has a polar core and a heterogeneous distribution of αs- and β-casein with surface κ-casein. αs- and β-caseins are almost insoluble, whereas the glycoprotein κ-casein is highly soluble in water.
casein may be precipitated somewhat below the isoelectric point range (ph 5.1-5.3) at about ph 4.6-4.7 by acidification, or by the addition of chymosin, which attacks the 105 (phenylalanine) and 106 (methionine) peptide bond of the κ-casein. the hydrophilic amino acid terminal peptide (106-169) of κ-casein solubilizes in the whey fraction and all other caseins precipitate. the soluble whey proteins consist primarily of α-lactalbumin and β-lactoglobulin, with serum albumin and immunoglobulins being derived from the bloodstream. β-lactoglobulin is the major protein, accounting for 50% or more of the total whey protein in ruminants and pigs. α-lactalbumin accounts for approximately 25% of the whey protein fraction and is essential for lactose synthesis and the control of milk secretion; α-lactalbumin binds calcium and zinc. both α-lactalbumin and β-lactoglobulin have an amino acid composition close to the nutritional optimum and provide amino acids that are essential for the neonate. whey acid protein is present only in rodent milk. the temperature of raw milk at collection is about 37°c. to prevent bacterial growth, oxidation, and proteolysis, milk should be chilled immediately to 4°c and processed within 48 hr unless it is frozen and stored. in the dairy, milk is standardized with regard to the fat content by the centrifugal separation of cream and skim milk: 100 kg of 4% (fat) bovine milk yields 90.35 kg of 0.05% fat skim milk and 9.65 kg of 40% fat cream. the cream fraction is remixed with the skim milk to the required fat content. in processing for a target protein in the whey it is clear that the bulk of the fat can be removed using this standard procedure. however, other standard dairy procedures, particularly microfiltration, may be used to remove fat, casein, and cellular components in a single step. the high lactose and salt content may also be reduced by ultrafiltration.
membrane techniques are generally the initial recovery methods of choice and, when correctly used, may yield a 60% pure protein in a single step or in a tandem microfiltration-ultrafiltration step, thus providing a clarified whey concentrate for further processing. the use of such techniques also provides a barrier to the entry of adventitious viruses, bacteria, and other microorganisms into the final product. an alternative pathway is to apply expanded bed technology. after initial processing to remove fat, skim milk may be passed through an adsorbent for the target, allowing the casein, lactose, and the bulk of the whey proteins to pass through the bed. partially purified target protein may be recovered by desorption from the matrix in a fixed bed mode. this type of separation or direct feed capture may be used in a totally fixed bed mode applied to the whey fraction after tangential flow filtration. subsequent processing by various chromatographic techniques, including affinity, hydrophobic interaction, ion exchange, and metal chelate chromatography, is applicable to achieve a protein of required purity. the number of steps should be kept to a minimum, as even small step losses can lead to a low yield over an extensive process. a procedure for the purification of human recombinant protein c expressed in porcine milk has been described (degener et al., 1996). milk fat was removed by centrifugation, and casein was precipitated in the presence of zinc. direct feed capture was performed using an expanded bed, and the protein was purified by hydrophobic interaction and anion-exchange chromatography. farm activities associated with transgenic production include founder development, progeny testing, and dairying. to minimize production risks and achieve validatable procedures, "good agricultural practices" (gaps), which are in the spirit of good manufacturing practices and good laboratory practice, should be used.
gaps are based on high standards of animal husbandry, adherence to standard operating procedures, on-site veterinary care, rigorous animal health monitoring programs, and state-of-the-art milking practices to maximize product quality. standard operating procedures should cover areas such as generating founder animals, herd maintenance, herd health, breeding, milk production, and other special procedures. master and working transgenic banks should be kept under strictly controlled conditions. animal feeds should be specially blended free of animal fat and protein. hay for transgenic production animals should be screened for residual chemicals that may have been used in the growing process. both written and computerized records should be kept to track animal lineages, health, and performance records. careful attention should be paid to milk facilities and personnel sanitation. production animals should be observed on a daily basis, and animal side testing of milk can be used to detect early indications of mastitis and other animal illnesses. milk collection and processing areas should be physically separated, and state-of-the-art pharmaceutical grade milking equipment and practices should be used in dedicated milking parlors. "points to consider in the manufacture and testing of therapeutic products for human use derived from transgenic animals" [food and drug administration (fda), 1995] is the result of an iterative process between the fda and the biopharmaceutical industry, and the fda is supportive of transgenic production. the european union's committee on proprietary medicinal products (cpmp) is similarly positive to the use of transgenic animals, stating in its guidance document: "transgenic animals may produce higher quantities of material in more concentrated form than existing culture methods, and therefore have considerable advantages in both the cost of producing the starting material and in its downstream processing. 
in some instances where very large amounts of material are required for therapy the use of transgenic animals may be one of the few viable production strategies" (cpmp, 1995). the fda has addressed topics such as the structure of the transgene, creation, characterization, and maintenance of the transgenic herd, fidelity of transgenic inheritance, consistency of transgene expression, analysis of product identity and purity, and the avoidance of contamination by drugs, chemicals, and adventitious agents. many issues of purity, consistency, safety, and potency of transgenically produced therapeutic proteins overlap with various cber points to consider for monoclonal antibodies and other biologicals produced by recombinant dna technology. to quote the fda: "the considerations that apply are therefore a blend of those relevant to recombinant dna derived materials and materials from less defined sources" (fda, 1995). despite the fact that goats and other farm animals are susceptible to species-specific viral and prion infections, viruses may be more of a concern for animal health than they are a human risk. a cber official has pointed out that prions, such as those responsible for bovine spongiform encephalopathy and scrapie, have "less than a theoretical" risk of transmission via milk because they do not occur in the mammary gland and, if introduced, they do not persist (rudolph, 1995). no transmission of scrapie has ever been reported in humans. milk and semen are noninfectious for scrapie according to the world health organization (1996). although scrapie has been detected in significant numbers of sheep in the united states, only five scrapie cases have occurred in goats, all commingled with scrapie-infected sheep. the use of animals from closed, scrapie-free herds or flocks has been adopted by the leading transgenic production companies to obviate any such concerns. goat viruses relevant in north america are shown in table 4.
the risk of viral and bacterial contamination of human biopharmaceutical products expressed in the milk of transgenic goats can be minimized by a multistage or combinatorial approach at three levels: the goat, the milk, and the final product (ziomek, 1996). minimization of the risk of contamination of the goat production herd can be accomplished by strict selection, animal husbandry, and adherence to gaps. bacterial contamination can be minimized by careful, state-of-the-art milking procedures and rapid gmp processing. purification processes from raw milk are designed in a manner similar to the manufacturing schemes for recombinant therapeutics from fermentation and cell culture, with the inclusion of steps to remove and/or inactivate adventitious agents, microbial contaminants, and pyrogens. for advantages and disadvantages of transgenic expression in the milk of dairy animals, see table 5.

table 4 footnotes: a, from ziomek (1996); b, z, zoonotic; ss, single stranded; ds, double stranded.

table 5 (advantages and disadvantages of transgenic expression in the milk of dairy animals)
advantages:
high level expression at multigram/liter level
expression directed to and located in mammary gland
expression can be tested in rodents
genetic and lactation-to-lactation stability
animal-to-animal consistency
mammary epithelial cells carry out post-translational modification
product recovered at high concentration in milk
aseptic milk processing technology well proven
bulk impurities removed easily
low cost of pre-purification product
low investment costs
capability of combining transgenics with nuclear transfer (cloning)
disadvantages:
possible adventitious virus and prion issues in dairy animals
control of animal environment and feed
microinjection required as a technology for species where cloning is not yet available
time to clinical product slower than cell culture (without process development)

expression of therapeutically beneficial proteins in the milk of transgenic dairy animals presents an unparalleled opportunity for the large-scale production of monoclonal antibodies, plasma proteins, and other proteins.
mammary gland epithelia have a cell density that is 100- to 1000-fold greater than the cell densities used, for example, in mammalian cell culture using cho cells. these cells are among the most productive protein synthesis sources in nature: they produce large amounts of correctly processed protein and are switched on and off by hormonal changes. thirty-five transgenic goats expressing a monoclonal antibody at 8 g/liter in their milk are equivalent to an 8500-liter batch cho cell culture running 200 days/year with a 1-g/liter final expression level (young et al., 1997). in production terms, 170,000 liters of culture is equivalent to 21,000 liters of milk; with the expression levels noted earlier and process yields of 60%, both of these systems would produce 100 kg of purified monoclonal antibody. as discussed, the mammary bioreactor is also capable of most posttranslational modifications and protein folding and can, therefore, be used to produce complex proteins. a current limitation of the technology is the low rate (5-10%) of transgenesis. nuclear transfer techniques, in which embryonic or adult cells are engineered and cultured to produce a master cell bank from which nuclei are transferred to recipient oocytes, are under development; these methods hold great promise for improved rates of transgenesis. because of concerns about transmissible spongiform encephalopathies, animal-sourced proteins and amino acids are currently under scrutiny. however, it should be kept in mind that the expression system described in this chapter is the mammary gland. given the safety of milk and milk products in combination with the quality and regulatory precautions described, there is every reason to consider that proteins derived from transgenic animals should be safe with respect to possible virus and prion transmission.
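the equivalence quoted above between cho cell culture and goat milk can be checked with a short calculation. the python sketch below is only an illustration: the function name is hypothetical, the inputs are the volumes, expression levels, and 60% process yield quoted from young et al. (1997), and the batch count and per-goat milk volume are derived values rather than figures given in the text.

```python
def purified_kg(volume_liters: float, titer_g_per_liter: float, process_yield: float) -> float:
    """Purified product in kg from a given volume, expression level, and fractional yield."""
    return volume_liters * titer_g_per_liter * process_yield / 1000.0


PROCESS_YIELD = 0.60  # 60% overall process yield, as quoted in the text

# CHO cell culture: 170,000 liters/year at 1 g/liter.
cho_kg = purified_kg(170_000, 1.0, PROCESS_YIELD)

# Goat milk: 21,000 liters/year at 8 g/liter.
milk_kg = purified_kg(21_000, 8.0, PROCESS_YIELD)

# Derived values (not stated in the text).
cho_batches_per_year = 170_000 / 8_500   # number of 8500-liter batches
milk_liters_per_goat = 21_000 / 35       # annual milk volume per goat in a 35-goat herd

if __name__ == "__main__":
    print("cho, purified kg/year:", round(cho_kg, 1))
    print("milk, purified kg/year:", round(milk_kg, 1))
    print("cho batches/year:", cho_batches_per_year)
    print("liters of milk per goat per year:", milk_liters_per_goat)
```

both routes come out at roughly 100 kg of purified antibody per year, which corresponds to about twenty 8500-liter cho batches or about 600 liters of milk per goat.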
many of the proteins that are expression targets today are currently available as human plasma derivatives, which have a measurable degree of risk for the transmission of hiv, hepatitis, parvo-, and other viruses, and creutzfeldt-jakob disease. as indicated, several proteins are in phase i and ii clinical trials. this number can be expected to increase dramatically in the near future. transgenic dairy animals provide a bulk protein production system that is capable of making proteins available that are either not available today or are only recoverable from other sources at very low yields. transgenic production may thus enable a move from "on-demand" patient treatment to prophylaxis and far wider indications and use of many proteins. in addition, the production of proteins in the milk of transgenic dairy animals is highly cost effective (young et al., 1997), opening up real possibilities for nutraceutical product development.

human growth hormone (hgh) secretion in milk of goats after direct transfer of the hgh gene into the mammary gland by using replication-defective retrovirus vectors high level expression of biologically active human α1-antitrypsin in the milk of transgenic mice transgenic pharming comes of age a 17.2 kbp region located upstream of the rabbit wap gene directs high level expression of a functional human protein variant in transgenic mouse milk production of transgenic mice, rabbits and pigs by microinjection into pronuclei expression of synthetic cdna sequences encoding human insulin-like growth factor-1 (igf-1) in the mammary gland of transgenic rabbits factors affecting the efficiency of introducing foreign dna into mice by microinjecting eggs dairy processing handbook sheep cloned by nuclear transfer from a cultured cell line comparison of the whey acidic protein genes of rat and mouse transgenics on trial. scrip mag.
november pharmaceuticals from transgenic livestock expression of anti-haemophilic factor ix in the milk of transgenic sheep use of transgenic animals in the manufacture of biological medicinal products for human use recombinant interferon-gamma. differences in the glycosylation and proteolytic processing lead to heterogeneity in batch culture high selectivity purification of recombinant proteins from milk using expanded bed chromatography transgenic expression of a variant of human tissue-type plasminogen activator in goat milk: purification and characterization of the recombinant enzyme high level production of human growth hormone in the milk of transgenic mice: the upstream region of the whey acidic protein (wap) gene targets transgene expression to the mammary gland inefficient processing of human protein c in the mouse mammary gland transgenic farm animals: progress report transgenic production a variant of human tissue-type plasminogen activator in goat milk: generation of transgenic goats and analysis of expression induction of human tissue plasminogen activator in the mammary gland of transgenic goats recombinant protein production in transgenic animals points to consider in the manufacture and testing of therapeutic products for human use derived from transgenic animals integration and stable germ-line transmission of genes injected into mouse pronuclei genetic transformation of mouse embryos by microinjection of purified dna production of human tissue plasminogen activator in transgenic mouse milk expression of a bovine k-cn cdna in the mammary gland of transgenic mice utilizing a genomic milk protein gene as an expression cassette strategies to express factor viii gene constructs in the ovine mammary gland production of transgenic rabbits, sheep and pigs by microinjection expression and characterization of biologically active human extracellular superoxide dismutase in milk of transgenic mice complete nucleotide sequence of the ovine b-lactoglobulin gene 
the mammary gland as a bioreactor: production of foreign proteins in milk transgenic animals--production of foreign proteins in milk manipulating the mouse embryo production of pharmaceutical proteins from transgenic animals the production of pharmaceutical proteins from the milk of transgenic animals pluripotent embryonic stem cells from the rat are capable of producing chimeras n-glycosylation of recombinant human interferon-g produced in different animal expression systems transgenic animals as bioproducers of therapeutic proteins transgenic bioreactors getting the glycosylation right: implications for the biotechnology industry genomic organization of the bovine asl-casein gene generation of transgenic dairy cattle using "in vitro" embryo production production of biomedical proteins in the milk of transgenic dairy cows: state of the art transgenic rabbits as bioreactors for the production of human growth hormone transgenic animals: beyond 'funny milk lactoferrin molecular structure and biological function mammary gland expression of transgenes and the potential for altering the properties of milk expression of human lysozyme mrna in the mammary gland of transgenic mice the efficiency of production, centrifugation, microinjection and transfer of one-and two-cell bovine ova in a gene transfer program bovine as 1-casein gene sequences direct high level expression of active urokinase in mouse milk the porcine mammary gland as a bioreactor for complex proteins functions of milk protein gene 5' flanking region on human growth hormone gene characterization of recombinant human lactoferrin expressed in the milk of transgenic mice germ-line transformation of mice high expression of the caprine b-casein gene in transgenic mice high-level, stage-and mammary-tissue specific expression of a caprine k-casein-encoding minigene driven by a b-casein promoter in transgenic mice transgenic animal technology: a laboratory handbook expression of human lactoferrin in milk of 
transgenic mice remodeling of mouse milk glycoconjugates by transgenic expression of a human glycosyl transferase high-level expression of recombinant human fibrinogen in the milk of transgenic animals integration, expression and germ-line transmission of growth-related genes in pigs expression of human growth hormone in the milk of transgenic mice development of one-cell fertilized sheep ova following microinjection into pronuclei insertion, expression and physiology of growth regulating genes in ruminants production of transgenic mice and rabbits that carry and express the human tissue plasminogen activator cdna under the control of a bovine alpha $1 casein promoter cloning of the goat beta-casein gene and expression in transgenic mice gene transfer experiments in cattle regulatory issues relating to protein production in transgenic animal milk collection and transfer of microinjectable embryos from dairy goats expression of human serum albumin in the milk of transgenic mice gene transfer into sheep pluripotent bovine embryonic cell lines direct embryonic development following nuclear transfer the bovine a-lactalbumin promoter directs expression of ovine trophoblast interferon in the mammary gland of transgenic mice rate limitations in posttranslational processing by the mammary gland of transgenic animals rabbit whey acidic protein gene upstream region controls high level expression of bovine growth hormone in the mammary gland of transgenic mice department of health and human services public health service bovine a-s1 casein gene sequences direct high level expression of human granulocyte-macrophage colony-stimulating factor in the milk of transgenic mice high-level expression of a heterologous protein in the milk of transgenic swine using the cdna encoding human protein c efficient tissue-specific expression of bovine a-lactalbumin in transgenic mice making transgenic livestock: genetic engineering on a large scale production of human surfactant protein c in 
milk of transgenic mice development and validation of swine embryonic stem cells: a review targeting expression to the mammary gland: intronic sequences can enhance the efficiency of gene expression in transgenic mice position-independent expression of the ovine β-lactoglobulin gene in transgenic mice production of pharmaceutical proteins in milk report of a who consultation on public health issues related to human and animal transmissible spongiform encephalopathies high level expression of active human alpha-1-antitrypsin in the milk of transgenic sheep secretion of unprocessed surfactant protein b in milk of transgenic mice production of biopharmaceutical proteins in the milk of transgenic dairy animals fixing human factor ix (fix): correction of a cryptic rna splice enables the production of biologically active fix in the mammary gland of transgenic mice minimization of viral contamination in human pharmaceuticals produced in the milk of transgenic goats

key: cord-003817-k3m72uxw authors: braun, elisabeth; sauter, daniel title: furin‐mediated protein processing in infectious diseases and cancer date: 2019-08-05 journal: clin transl immunology doi: 10.1002/cti2.1073 sha: doc_id: 3817 cord_uid: k3m72uxw proteolytic cleavage regulates numerous processes in health and disease. one key player is the ubiquitously expressed serine protease furin, which cleaves a plethora of proteins at polybasic recognition motifs. mammalian substrates of furin include cytokines, hormones, growth factors and receptors. thus, it is not surprising that aberrant furin activity is associated with a variety of disorders including cancer. furthermore, the enzymatic activity of furin is exploited by numerous viral and bacterial pathogens, thereby enhancing their virulence and spread. in this review, we describe the physiological and pathophysiological substrates of furin and discuss how dysregulation of a simple proteolytic cleavage event may promote infectious diseases and cancer.
one major focus is the role of furin in viral glycoprotein maturation and pathogenicity. we also outline cellular mechanisms regulating the expression and activation of furin and summarise current approaches that target this protease for therapeutic intervention. the human genome encodes more than 550 proteases. these molecular scissors play important roles in essentially all physiological processes. they digest the proteins in our food, degrade misfolded or unwanted proteins and regulate the trafficking and activity of numerous cellular factors. proteolytic cleavage is certainly one of the most important post-translational modifications, generating a plethora of bioactive proteins and peptides with key roles in cell proliferation, immunity and inflammation. not surprisingly, mutations in proteases and/or aberrant protease activity are associated with numerous pathological processes including cancer, cardiovascular disorders and autoimmune diseases. 1 intriguingly, many viral pathogens also exploit cellular proteases for the proteolytic processing and maturation of their own proteins. similarly, activation of bacterial toxins frequently requires cleavage by proteases of the infected or intoxicated host. in recent years, modulation of protease activity has therefore emerged as a potential therapeutic approach in a variety of infectious and noninfectious diseases. one particularly promising target for therapeutic intervention is the cellular protease furin. this protease most likely cleaves and activates more than 150 mammalian, viral and bacterial substrates. 2 among them are viral envelope glycoproteins and bacterial toxins, as well as cellular factors that promote tumor development and growth if they are hyperactivated (figure 1). in this review, we summarise our current knowledge of furin-mediated protein processing in health and disease with a focus on the role of furin in viral protein processing.
furthermore, we describe cellular mechanisms regulating furin activity at the transcriptional and post-transcriptional level. finally, we present approaches that aim at modulating furin activity for therapeutic purposes and discuss their suitability for clinical application. furin is a member of the evolutionarily ancient family of proprotein convertases. their similarity with bacterial subtilisin and yeast kexin proteases has coined the abbreviation pcsk (proprotein convertase subtilisin/kexin type). humans encode nine members of this protease family (pcsk1-9), with pcsk3 representing furin (table 1). pcsks are well known for their ability to activate other cellular proteins. the proteolytic conversion of inactive precursor proteins into bioactive molecules was already described in the 1960s. 3 however, it took more than 20 years until furin was identified as the first mammalian proprotein convertase. 4, 5 to date, more than 200 cellular substrates of pcsks have been described, including hormones, receptors, growth factors and adhesion molecules. although pcsks are frequently coexpressed in the same cell and may cleave the same substrates, there is no complete redundancy and the inactivation of individual pcsks results in specific knock-out phenotypes in mice. 6 pcsk1-7 cleave their substrates after basic residues, with the typical recognition motif k/r-xn-k/r↓ 6 (table 1). in contrast, pcsk8 cleaves after nonbasic residues and is best known for its regulation of cholesterol and lipid metabolism by activating sterol regulatory element-binding protein (srebp) transcription factors. 6, 7 like pcsk8, pcsk9 also plays a key role in cholesterol metabolism as it regulates low-density lipoprotein (ldl) particle levels in the blood. however, this effect does not involve proteolytic cleavage of a specific substrate, but is mediated by a direct binding of pcsk9 to ldl receptors.
8 with two fda-approved inhibitors for the treatment of hypercholesterolaemia, pcsk9 is also the prime example of a protease that is successfully targeted for therapy. 6 the prototypical and best-characterised member of the pcsk family is furin/pcsk3. since it cleaves basic amino acid motifs, it has also been termed pace (paired basic amino acid cleaving enzyme). furin is encoded by the fur (fes upstream region) gene on chromosome 15. although furin is ubiquitously expressed, its mrna and protein levels vary depending on the cell type and tissue. high levels can be found in salivary glands, liver and bone marrow, whereas muscle cells express relatively low amounts of furin. 9 three promoters (p1, p1a and p1b), each harbouring an alternative transcription start site, have been described (figure 2). however, the respective transcripts differ only in the first untranslated exon and are therefore predicted to express the same protein. 10 while the p1a and p1b promoters resemble those of constitutively expressed housekeeping genes, the p1 promoter binds the transcription factor c/ebpβ and can be trans-activated upon cytokine stimulation. 10 in line with this, ifnγ, tgfβ, il-12 and pma induce furin expression. 11-14 upon translation of the mrna, furin enters the secretory pathway as an inactive proenzyme and is integrated into the er membrane via its c-terminal transmembrane domain (figure 2). like most type i transmembrane proteins, it harbours a short n-terminal signal peptide that is cleaved off cotranslationally. similar to other proprotein convertases, furin contains an inhibitory n-terminal 83-amino acid propeptide, whose chaperone function is required for correct folding of the catalytic domain. 15 during the transition of furin from the er to the trans-golgi network (tgn), the inhibitory propeptide is removed in a two-step autoproteolytic process and furin gains its enzymatic activity.
16 at the same time, n-linked oligosaccharides are added and trimmed. although furin accumulates in the tgn, it can be further transported to the cell surface and back via the endosomal pathway. 17, 18 finally, furin can also be shed and released into the extracellular space upon proteolytic separation of the catalytic domain from the membrane-bound c terminus. 19 whether this cleavage step is mediated by furin itself or another protease remains to be determined. 20 the presence of furin in the tgn and endosomal compartments, at the cell surface and in the extracellular space, may explain its ability to process a large variety of intra- and extracellular substrates. the canonical furin cleavage site is frequently described as r-x-k/r-r↓. however, variations of this motif may also be recognised and a stretch of 20 amino acids surrounding the cleavage site as well as post-translational modifications determine interaction with the furin binding pocket. 21 bioinformatic analyses and functional studies uncovered more than 100 furin cleavage sites in mammalian proteins. these comprise growth factors and cytokines (e.g. igf1, igf2, tgfβ, pdgfa, pdgfb, vegf-c, ngf, cxcl10), hormones (e.g. pth, trh, ghrh), adhesion molecules (e.g. integrins, vitronectin), collagens, metalloproteinases, coagulation factors, receptors, membrane channels and albumin. 2 while most of these target factors are activated upon furin-mediated cleavage, furin also exerts inactivating cleavage steps. for example, the furin paralogue pcsk9 and endothelial lipase can be inactivated by furin. 22

figure 2. maturation of the cellular protease furin. furin expression is driven by three different promoters, sharing characteristics of either cytokine-activated (p1) or housekeeping gene (p1a and p1b) promoters. during translation, furin is integrated into er membranes and glycosylated. after the n-terminal signal peptide (red) is removed, an autocatalytic cleavage event occurs, generating a short propeptide (light blue). this propeptide remains associated with furin and acts as an intramolecular chaperone and inhibitor. after transit to the golgi complex, the propeptide is removed and glycans are trimmed before furin gains its proteolytic activity. furin accumulates in the trans-golgi network (tgn), but can also traffic to the plasma membrane and cycle between these two compartments via endosomes. proteolytic cleavage at the c terminus of furin separates the transmembrane domain (orange) from the catalytically active domain. as a result, furin can be shed into the extracellular space as an active enzyme.

the physiological importance of furin is reflected by furin knock-out mice, which die at embryonic day 11 because of cardiac ventral closure defects and hemodynamic insufficiency. 24 similarly, endothelial cell-specific knock-out of furin results in cardiac malformation and death shortly after birth. 25 even mutations in the cleavage site of a single furin target protein may have detrimental effects and result in genetic disorders such as haemophilia b or x-linked hypohidrotic ectodermal dysplasia. 26, 27 because of its pleiotropic effects, variations in furin expression levels and/or its enzymatic activity may have detrimental effects and promote the pathogenesis of a variety of disorders, including rheumatoid arthritis, amyloid dementia and cancer. 17 furin has been termed a 'master switch of tumor growth and progression' 28,29 as its aberrant expression or activation can promote the formation and progression of various malignancies including colon carcinoma, rhabdomyosarcoma, head and neck cancers, lung, skin and brain tumors. 30 in some cases, furin levels positively correlate with aggressiveness, and increased furin expression has been proposed as prognostic marker for advanced cancers. 30 the oncogenic and prometastatic activity of furin has been ascribed to its ability to activate proteins that promote cell proliferation, angiogenesis, migration and tissue invasion.
for example, furin cleaves and activates growth factors such as igfs, pdgfs or ngf that enhance cell proliferation and consequently tumor growth. similarly, angiogenic and lymphangiogenic factors such as vegf-c and vegf-d may be hyperactivated and promote the vascularisation and growth of solid tumors. 30 notably, furin expression is induced by hypoxia, as all three fur promoters harbour binding sites for the hypoxia-inducible factor-1 (hif-1). 31 thus, furin-mediated vascularisation may preferentially occur in otherwise growth-restricted hypoxic tumors. interestingly, hypoxia also results in subcellular relocalisation of furin to the cell surface, which may further enhance processing of growth factors and other extracellular tumorigenic precursor proteins. 32 besides effects on tumor growth, furin can also promote migration and extravasation of malignant cells as it processes adhesion molecules mediating cell-cell and cell-matrix interactions. the cleavage of integrins may be particularly relevant as they not only mediate adhesion of cells to the extracellular matrix, but also act as signal transducers regulating cell growth, division and survival. 33 in addition, furin activates matrix metalloproteinases (e.g. mmp14) that facilitate metastasis by degrading components of the extracellular matrix. 30 finally, increased furin activity may also promote cancer development by suppressing protective antitumor mechanisms. for example, increased furin-mediated activation of tgfβ reduces immune surveillance by promoting the development of suppressive treg cells and inhibiting effector t-cell functions. 34 the key role of furin in immunity is highlighted by t-cell-specific furin knock-out mice, which harbour inherently over-reactive effector t cells that secrete reduced levels of active tgfβ. 35 notably, positive feedback loops can further enhance the oncogenic potential of furin. 
for example, the furin substrate tgfβ not only increases furin mrna expression, but also enhances its proteolytic activity by an unknown mechanism. 36 similarly, furin enhances the secretion of ifnγ, which in turn activates the fur promoter. 11, 14 this mutual enhancement seems particularly important given the key role of ifnγ in tumor development and progression. 37 on the one hand, furin-driven ifnγ release may have beneficial effects as it boosts the tumorlytic activity of natural killer cells and cytotoxic t lymphocytes. furthermore, ifnγ may act as an antiangiogenic factor and directly inhibit tumor cell proliferation by inducing the expression of tumor suppressors such as p21 or p27. on the other hand, however, recent evidence suggests that ifnγ may also exert tumor-promoting effects, for example by selecting for immune evasive phenotypes and promoting an immunosuppressive tumor microenvironment. 37 intriguingly, a study on laryngeal cancer patients suggests that the ifnγ-furin feedback loop may be further boosted iatrogenically since radiotherapy increased furin expression in some patients. 38 in summary, aberrant furin activation affects several steps of cancer development, including cell proliferation, vascularisation, metastasis and antitumor immunity. however, the relative contribution of individual furin substrates to tumor progression and the role of other proprotein convertases remain largely unclear. furin may not only promote disease upon aberrant expression, but also by activating a variety of pathogen-derived proteins. one example of the cleavage of unwanted proteins is the proteolytic activation of bacterial toxins. in particular, the group of ab toxins comprises several well-described furin substrates. these exotoxins are secreted by bacteria and exert their effect in the cytoplasm of the target cell. they usually consist of an enzymatically active a subunit and a b subunit that mediates membrane binding and translocation. 
to exert its toxic effect, the a subunit has to be separated from the membrane-associated b subunit by proteolytic cleavage. 39 in the case of diphtheria toxin and pseudomonas exotoxin a, furin cleaves r-v-r-r↓ and r-q-p-r↓ target sequences, respectively. 40, 41 cleavage most likely occurs in endosomes before the a subunit translocates into the cytosol, where it blocks protein synthesis by inhibiting the elongation factor eef2. 42 in line with an important role of furin in bacterial virulence, toxicity is highest if an optimal furin cleavage site is present. 43 similarly, furin cleaves shiga and shiga-like toxins expressed by certain shigella spp. and escherichia coli strains and enhances their ability to halt protein synthesis. although a furin target sequence is conserved among all shiga-like toxins, mutational analyses suggested that furin-mediated cleavage augments toxin activity, but is not essential. 39 another well-characterised example is anthrax toxin, a three-protein exotoxin consisting of the receptor-binding protective antigen (pa) and the enzymatically active components oedema factor (ef) and lethal factor (lf). upon binding to its receptor, pa is cleaved by furin at the cell surface. this cleavage step triggers the oligomerisation of pa into a prepore that binds ef and lf. subsequently, this toxin complex is endocytosed and pa forms a channel that allows the translocation of ef and lf into the cytoplasm. 39 although pa can be activated by different proprotein convertase family members, furin seems to be the major protease activating anthrax toxin. 44 these examples illustrate that several bacterial pathogens exploit furin and related convertases for the activation of their exotoxins. strictly speaking, however, some toxins produced by bacteria (e.g. diphtheria toxin and shiga toxins) represent viral gene products as they are encoded by bacteriophages. 45 in these cases, the term 'viral exotoxin' may be more appropriate. 
this strongly suggests that furin-mediated toxin activation confers a selection advantage to both the bacterium and its phage. for example, induction of cell death by furin-activated toxins may promote tissue invasion, increase transmission rates (e.g. by causing diarrhoea) or suppress cellular immune responses. without the proteolytic activation of exotoxins, diseases such as dysentery or diphtheria would not occur. like bacterial and viral exotoxins, most viral envelope glycoproteins need to be proteolytically cleaved before they can mediate viral entry into host cells. in many cases, viruses exploit cellular trypsin- or subtilisin-like endoproteases for this purpose. while subtilisin-like proteases such as furin require polybasic cleavage sites, trypsin-like proteases also recognise monobasic motifs and cleave after single arginine or lysine residues. 46 notably, the dependency on specific proteases can also be an important determinant of tissue tropism and viral spread in an infected organism. for example, avirulent newcastle disease virus (ndv) strains harbour a monobasic cleavage site in their fusion (f) protein and cause only local infections (mainly in the respiratory tract) since expression of the respective host proteases is limited to a few cell types. in contrast, the f proteins of virulent ndv strains can be cleaved by furin or related proprotein convertases that are ubiquitously expressed. consequently, these viruses are able to spread systemically and cause high rates of mortality in infected birds. 47 another well-described example is the cleavage of influenza a virus hemagglutinin (ha). in contrast to low pathogenic avian influenza a viruses, their highly pathogenic counterparts harbour a polybasic furin cleavage site in the ha protein. 48 thus, the ability of viruses to exploit furin may have drastic effects on their pathogenicity. 
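the distinction between polybasic and monobasic cleavage sites drawn above can be illustrated with a short python sketch. it is not part of the original review: the function names and category labels are hypothetical, and the regular expression encodes only the strict canonical r-x-k/r-r↓ motif quoted in the text. it therefore ignores the non-canonical variants, flanking residues and glycosylation effects that the review describes as relevant for furin recognition.

```python
import re

# strict canonical furin motif as quoted in the text: R-X-K/R-R,
# with cleavage after the final R (the P1 residue).
# the lookahead lets overlapping motifs be reported separately.
CANONICAL_FURIN = re.compile(r"(?=(R[A-Z][KR]R))")


def find_canonical_furin_sites(sequence):
    """Return the 1-based position of the P1 residue of every canonical motif."""
    seq = sequence.upper()
    # m.start() is the 0-based index of P4, so P1 sits at m.start() + 3
    # (0-based), i.e. m.start() + 4 in 1-based numbering.
    return [m.start() + 4 for m in CANONICAL_FURIN.finditer(seq)]


def classify_site(p4_to_p1):
    """Classify the four residues (P4..P1) that precede a cleavage site.

    the labels are illustrative assumptions, not established nomenclature:
      'canonical_polybasic' -> matches R-X-K/R-R
      'basic_p1_only'       -> K or R at P1, but no canonical motif
      'no_basic_p1'         -> no basic residue at P1
    """
    site = p4_to_p1.upper()
    if len(site) != 4:
        raise ValueError("expected exactly four residues (P4..P1)")
    if re.fullmatch(r"R[A-Z][KR]R", site):
        return "canonical_polybasic"
    if site[-1] in "KR":
        return "basic_p1_only"
    return "no_basic_p1"


if __name__ == "__main__":
    # cleavage sites quoted elsewhere in this review
    examples = [
        ("diphtheria toxin", "RVRR"),
        ("pseudomonas exotoxin a", "RQPR"),
        ("reston virus gp", "KQKR"),
        ("h9n2 ha", "RSKR"),
    ]
    for name, site in examples:
        print(name, site, classify_site(site))
```

under these assumptions, the diphtheria toxin site r-v-r-r and the h9n2 site r-s-k-r are reported as canonical, whereas the pseudomonas exotoxin a site r-q-p-r and the reston virus site k-q-k-r are not, although the review describes both as furin substrates. a strict motif match is therefore best treated as a first-pass filter rather than as a predictor of cleavage.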
to date, furin-mediated cleavage has been described for envelope glycoproteins encoded by numerous evolutionarily diverse virus families, including herpes-, corona-, flavi-, toga-, borna-, bunya-, filo-, orthomyxo-, paramyxo-, pneumo- and retroviridae (table 2). although viral furin substrates generally harbour the canonical polybasic cleavage site, timing and subcellular localisation of furin-mediated activation may differ substantially between virus families. since furin and viral glycoproteins both enter the secretory pathway, proteolytic activation can occur at different steps of the viral replication cycle. while the envelope proteins of some viruses are cleaved in the producer cell, others are processed in the extracellular space or during entry into their target cells (figure 3). the following sections highlight the characteristics of a few selected viral glycoproteins, their processing by furin and the importance of furin-mediated cleavage for infection and pathogenicity. retroviral glycoprotein trimers, such as those of human immunodeficiency, rous sarcoma or murine leukaemia viruses, are proteolytically processed and activated in the producer cells (figure 3a, left panel). in the case of hiv-1, the gp160 precursor of the viral envelope protein (env) is cleaved into gp120 and membrane-anchored gp41 that remain associated through noncovalent interactions. cleavage occurs in intracellular compartments, before the assembly of virions at the plasma membrane. notably, proteolytic processing of env depends on correct n-linked glycosylation as aberrant carbohydrate side chains may result in subcellular mistrafficking or sequestration of env. 49 most likely, hiv-1 takes advantage of the redundancy of several proprotein convertases recognising the polybasic cleavage motif in env. furin, pcsk5, pcsk6 and pcsk7 have all been shown to cleave gp160 in cells, albeit with different efficiencies. 
49 notably, in vitro cleavage experiments using recombinant proteases did not always reflect cleavage efficiency in transfected cells and the relative contribution of individual pcsks to hiv-1 maturation in vivo remains unclear. 49 interestingly, hiv-1 env harbours a second polybasic cleavage site, about eight amino acid residues upstream of the major one. although cleavage at this site does not result in fusogenic env species, about 15% of all gp160 molecules are cleaved at this position, at least in the case of the cell culture-adapted hiv-1 clone lai. 50, 51 in some cases, gp160 escapes intracellular cleavage and may be incorporated as an unprocessed precursor into budding virions. whether these env molecules may be processed extracellularly by shed furin is unknown. in this context, it is noteworthy that membrane-bound plasmin has been shown to convert extracellular gp160 into gp120 and gp41. 52 however, it remains to be determined whether plasmin-mediated cleavage results in fully infectious hiv-1 particles. influenza a virus hemagglutinin (ha) can be cleaved and primed by a variety of cellular proteases. even bacterial proteases may promote influenza virus spread by cleaving ha0 into ha1 and ha2. 53 both subunits remain linked via disulphide bonds and form trimeric structures. while ha1 binds to the sialic acid receptor on viral target cells, ha2 harbours the fusion peptide that mediates fusion of viral and cellular endosomal membranes. ha cleavage can occur within producer cells, upon release of virions from infected cells or directly prior to entry into new target cells. 54 as a general rule, hemagglutinins of mammalian and low pathogenic avian influenza a viruses cannot be cleaved by furin as they usually only harbour a mono- or dibasic cleavage site. instead, they depend on trypsin-like proteases such as transmembrane protease serine s1 member 2 (tmprss2) or human airway trypsin-like protease (hat). 
55 expression of such trypsin-like proteases is largely restricted to the respiratory and gastrointestinal tract. in contrast, h5 and h7 hemagglutinins of a large number of highly pathogenic avian influenza a viruses (hpaiv) can be cleaved by furin or pcsk5, which are present in many cell types. 56, 57 this is because they acquired a polybasic cleavage site upon insertion of additional lysine and/or arginine residues. duplication of lysine and arginine residues in ha is facilitated by polymerase slippage as these amino acids are encoded by purine-rich codons. 58 instead of the prototypical r-x-k/r-r↓ motif, some hpaivs harbour a suboptimal k-x-k/r-r↓ cleavage motif. proteolytic cleavage at this site is only efficient if additional positively charged amino acids upstream of this cleavage motif are present or if attachment sites for masking oligosaccharide chains are missing. 59, 60 notably, a subset of h9n2 low pathogenic avian influenza a virus strains also harbour r-s-k-r↓ or r-s-r-r↓ sites that are not only cleaved by trypsin-like proteases, such as tmprss2 or hat, but also by pcsks. 61 however, their cleavage is only efficient in the presence of very high amounts of furin or upon mutation of a glycosylation site in ha. 62 thus, the ability to exploit furin for efficient ha cleavage and the associated increase in pathogenicity are not only determined by the presence of a furin consensus target site, but also by adjacent residues and the absence of masking oligosaccharide chains.
figure 3. (a) […] the viral envelope (env) glycoprotein precursor (green) migrates through the er to the golgi complex where it is cleaved by furin (pink scissors) into the functional mature env glycoprotein (blue). processed env glycoproteins are transported to the cell surface and incorporated into assembling viral particles. in contrast, dengue viruses bud into the er lumen and incorporate the uncleaved premembrane protein (prm) (right panel). during virus particle transit through the secretory pathway, virion-associated prm proteins can be cleaved by furin (dark blue to light blue). (b) some prm molecules escape furin-mediated cleavage in the producer cells resulting in the release of immature or partially mature dengue virus particles. in this case, processing can also occur in endosomes of new target cells, upon receptor-mediated endocytosis (left panel). during human papillomavirus (hpv) infection (right panel), attachment to heparan sulphate proteoglycans induces a conformational change that allows proteolytic processing of the minor capsid protein l2 (red) by furin, which is present at the cell surface. furin processing induces a structural rearrangement that allows binding to a secondary receptor and subsequent receptor-mediated endocytosis.
furin cleaves the glycoproteins (gps) of marburg virus (marv) and all five ebolavirus species into a large n-terminal subunit (gp1) that mediates receptor binding and a small membrane-anchored c-terminal part (gp2) that contains the fusion peptide. 63, 64 marv and human pathogenic ebolavirus species harbour canonical furin cleavage sites (r-x-k/r-r↓). in contrast, the gp of the closely related reston virus, which causes asymptomatic infections in humans, is processed less efficiently by furin as it carries the suboptimal cleavage site k-q-k-r↓. 63 surprisingly, however, uncleavable gp mutants of highly pathogenic ebola virus (ebov) are able to mediate infection and furin-mediated cleavage is not required for replication in cell culture. 65 furthermore, an ebov mutant lacking the furin cleavage site replicated efficiently in nonhuman primates and showed no differences in disease progression or lethality compared to wild-type viruses. 
66 thus, the high conservation of the furin cleavage site among different ebolavirus species is surprising and it remains to be determined whether furin-mediated gp processing plays a role in the natural reservoir hosts of these viruses. 65 notably, the ebov gp gene harbours an rna editing site sequence and may not only express full-length gp, but also a soluble form of the glycoprotein (pre-sgp) that lacks the c-terminal transmembrane domain. 67 intriguingly, pre-sgp harbours another furin recognition site and is cleaved into mature sgp and a short so-called Δ-peptide. both of them are ultimately released from infected cells. 68 although pre-sgp is produced in higher amounts than gp, its role in viral replication is under debate. among others, sgp has been suggested to serve as a decoy antigen, to act as a structural substitute for gp1 and to induce apoptosis of uninfected lymphocytes. 69
flaviviruses
flavivirus rna is translated into a single large polyprotein that is cleaved by cellular and viral proteases into all structural and nonstructural proteins of the virus. the structural proteins comprise the envelope proteins prm and e that are incorporated as prm/e heterodimers into budding virions. 70 prm acts as a chaperone and facilitates correct folding of the e glycoprotein. 71 many flaviviruses bud into the lumen of the er and enter the secretory pathway. 70 in the acidic milieu of the trans-golgi network (tgn), a furin cleavage site is exposed and prm can be cleaved into mature pr and m proteins. 72 thus, furin-mediated cleavage of the viral glycoprotein occurs only after its incorporation into newly formed virions (figure 3a, right panel). this is in contrast to viruses such as hiv, whose envelope proteins are cleaved before assembly. the pr peptide remains associated with the e protein until the virion is released from the cell, thereby preventing premature unintended fusion with membranes of the producer cell. 
73 furin-mediated prm cleavage is essential for replication of flaviviruses such as tick-borne encephalitis or dengue virus. 74, 75 in some cases, prm molecules escape furin processing in the producer cell, resulting in the release of immature or partially mature viral particles. partially mature virions are still infectious since low amounts of mature m are sufficient to mediate fusion with target cell membranes. 70 furthermore, uncleaved prm may participate in virion attachment to target cells and can be cleaved by furin during the entry process, in the acidic milieu of endosomes 76, 77 (figure 3b, left panel). the relative contribution of prm processing during viral entry into new target cells, however, remains to be determined. the ratio of mature to immature prm depends on a variety of factors, including the producer cell type and the flavivirus species. for example, dengue viruses are known to release high amounts of immature or partially mature viruses, most likely because of a conserved acidic residue within the furin recognition site 78 (table 2). importantly, the content of prm in viral particles also affects antibody recognition and consequently antibody-dependent enhancement of dengue virus infection. 77 thus, furin-mediated protein processing may once again markedly affect the outcome of infection. while furin plays a key role in activating envelope glycoproteins of a variety of viruses, its activity is also exploited for the cleavage of other viral proteins. one example is the cleavage of the l2 protein of papillomaviruses. 79 together with the major capsid protein l1, this minor capsid protein builds the viral capsid. the furin cleavage site is located close to the n terminus of l2 and highly conserved among different human papillomavirus (hpv) strains. 80 cleavage is not required for virus assembly or release, but essential for infection of new target cells. 
for example, lovo and cho cells lacking furin expression are completely resistant to infection with pseudoviruses of hpv16, 79, 80 which is one of the high-risk hpv types causing cervical cancer. in contrast to flavivirus prm, which can be cleaved during egress and entry, papillomavirus l2 seems to be exclusively cleaved on target cells. 80 a model has been proposed in which attachment of papillomaviruses to heparan sulphate proteoglycans induces a conformational change in l2 that exposes the polybasic cleavage site. 81 upon proteolytic processing of l2, l1 may engage a secondary cellular receptor and mediate infection 80 (figure 3b, right panel). furthermore, interaction of cleaved l2 with an unknown intracellular receptor may be required for escape of l2 from the endosomal compartment and its ability to escort viral dna into the nucleus. 80 this illustrates that nonenveloped viruses have also evolved the ability to exploit furin or related pcsks for their own purposes. another nonenvelope protein that is cleaved by furin is the external core antigen (hbeag) of hepatitis b virus (hbv). 82, 83 cleaved hbeag is secreted from infected cells and exerts immunosuppressive effects. 84 it has been suggested to act as a t-cell tolerogen that prevents killing of infected hepatocytes by cytotoxic t lymphocytes. 85 in contrast, uncleaved hbeag may have the opposite effect as it is transported to the plasma membrane where it can trigger antiviral immune responses. 86 thus, furin-mediated cleavage of hbeag may affect the outcome of infection. intriguingly, han chinese frequently harbour a single nucleotide polymorphism in the p1 promoter of the fur gene that is associated with increased risk of developing persistent hbv infection with detectable amounts of hbeag in the serum. 87 this polymorphism increases the binding efficiency of the hepatic transcription factor nf-e2, thereby most likely increasing furin expression. 
87 whether the observed increase in hbv persistence is the result of increased hbeag processing and/or other effects of furin remains to be determined. viral pathogens and their hosts are in a continuous arms race. 88 although viruses have evolved sophisticated strategies to exploit the metabolism, protein synthesis and trafficking pathways of an infected cell, the host is not defenceless. besides innate and adaptive immune responses that directly target components of the virus, infected cells may also restrict viral spread by limiting the availability of cellular factors that are critical for viral replication, so-called 'virus dependency factors'. for example, the host protein samhd1 restricts replication of several viral pathogens by depleting cellular dntp levels. 89 furthermore, ifi16 targets the cellular transcription factor sp1 to suppress viral gene expression. 90 intriguingly, accumulating evidence suggests that inhibition of the virus dependency factor furin represents another efficient and broadly active mechanism of antiviral immunity. in 2013, aerts and colleagues found that protease-activated receptor 1 (par1), a g-protein-coupled receptor, interferes with the expression of furin and furin-mediated processing of the human metapneumovirus f protein. 91 follow-up experiments revealed that par1 harbours an r41xxxxr46 motif that mediates interaction with several pcsks. 92 in line with this, soluble pc5a/pcsk5 and pcsk6 cleave par1 at r46↓ and abrogate its ability to induce calcium signalling upon thrombin-mediated cleavage at the plasma membrane. surprisingly, however, membrane-bound pcsks such as furin fail to cleave par1 at this position. instead, furin traps par1 in the trans-golgi network and prevents its anterograde transport to the cell surface (figure 4a). at the same time, par1 also blocks the proteolytic activity of furin, inhibiting for example the maturation of hiv-1 env. 
this inhibitory activity is not shared by its paralogue par2, which is efficiently cleaved by furin. 93 notably, expression of par1 is induced in proinflammatory environments such as the brain of hiv-1 infected individuals suffering from hiv-associated neurocognitive disorders (hand). 92 thus, par1-mediated furin inhibition may represent a mechanism of innate immunity limiting the spread of hiv-1 and potentially additional furin-dependent viral pathogens. a similar inhibitory activity was recently described for two ifnγ-inducible gtpases, termed guanylate-binding proteins 2 and 5 (gbp2 and gbp5). initially, gbp5 was described in a screening for novel restriction factors of hiv and shown to interfere with the maturation of the retroviral env protein. 94, 95 follow-up experiments revealed that this antiviral activity is shared by its paralogue gbp2 and that both proteins reduce the proteolytically active amount of furin. as a result, cleavage of the env precursor gp160 into mature gp120 and gp41 is reduced and newly formed virions are only poorly infectious (figure 4b). since many viral pathogens rely on furin or related pcsks for the maturation of their own (glyco)proteins, gbp2 and gbp5 exert broad antiviral activity, inhibiting replication of highly pathogenic avian influenza a, measles and zika viruses. in contrast, gbp2 and gbp5 do not decrease infectivity of virions carrying the glycoprotein of vesicular stomatitis virus, which does not require a proteolytic activation step. notably, inhibition of furin in infected cells comes at a cost, since furin-mediated processing of matrix metalloproteinases and other cellular substrates is also reduced in the presence of increased gbp2 or gbp5 levels. 96 future experiments will reveal whether par1- or gbp-mediated inhibition of furin activity may also prevent the development or proliferation of certain cancers. 
remarkably, increased gbp2/5 expression is associated with favorable outcome in patients suffering from melanoma or breast cancer. 97, 98 thus, a better understanding of cellular mechanisms regulating furin activity will help to understand the pathogenesis of infectious diseases and cancer and may uncover novel targets for therapeutic intervention. because of the key role of furin in the pathogenesis of cancer and infectious diseases, its suitability as a therapeutic target has attracted significant interest for several years. many laboratories have explored the possibility of limiting tumor growth, viral replication or bacterial intoxication by reducing the amount or proteolytic activity of furin. initially, most studies focused on peptides or proteins that bind to the active site of furin and inhibit substrate binding in a competitive manner. for example, a variant of the naturally occurring serine protease inhibitor α-1 antitrypsin was modified to harbour a consensus furin cleavage site. this variant, termed α-1 antitrypsin portland (α1-pdx), inhibits furin and pcsk5 and has been shown to prevent the processing of hiv-1 env and measles virus f in vitro. 99, 100 similarly, peptides derived from the cleavage site of influenza a virus hemagglutinin and polyarginines compete with natural furin substrates. 101, 102 even exogenous addition of the autoinhibitory propeptide of furin has been shown to reduce its enzymatic activity, limiting for example the activation of mmp9 in breast cancer cells. 103, 104 however, the therapeutic potential of the propeptide has never been evaluated in vivo and the inhibitory effects are most likely limited as it is known to dissociate from furin in the tgn. several approaches, including incorporation of d- instead of l-amino acids, have been applied to increase the stability and hence efficacy of furin inhibitors. 
for example, hexa-d-arginine (d6r), one of the first furin inhibitors, exhibits good stability and prevents the cytotoxic effects of pseudomonas exotoxin a in vitro and in vivo. 105 similarly, topical application of nona-d-arginine (d9r) has been shown to reduce corneal damage in mice infected with pseudomonas aeruginosa. 106 interestingly, d9r also showed direct bactericidal activity, probably because of its polycationic nature. 107 besides d-amino acids, incorporation of amino acid analogs such as decarboxylated arginine mimetics or 4-amidinobenzylamide (amba) has been used to increase the stability of peptide-derived furin inhibitors. 108 furthermore, the addition of a chloromethyl ketone (cmk) moiety to the c terminus of a polybasic cleavage motif has proven useful as it results in the alkylation of the active site of furin that irreversibly blocks its enzymatic activity. 109 nevertheless, the cytotoxicity of cmk-based inhibitors and the instability of the cmk moiety may limit their use to topical applications such as the treatment of hpv skin infections. 110 finally, the elucidation of the crystal structure of furin enabled the targeted modelling of nonpeptidic inhibitors such as streptamine-based compounds. upon addition of guanidine residues, streptamine derivatives mimic the cationic furin cleavage site and inhibit its enzymatic activity in the nanomolar range in vitro. 111 dahms and colleagues describe an interesting example of a 2,5-dideoxystreptamine-derived inhibitor, where two molecules of the inhibitor form a complex with furin. 112 while the first inhibitor molecule directly interferes with the conformation of the catalytic triad, the second molecule binds to an adjacent planar peptide stretch. besides stability, the subcellular localisation of furin inhibitors is a key determinant of their efficacy in vivo. notably, optimal localisation of the inhibitor strongly depends on the processing event targeted for therapeutic intervention. 
figure 4. par1 as well as gbp2 and gbp5 reduce hiv particle infectivity by inhibiting furin-mediated env processing. human immunodeficiency virus (hiv) particles containing functional mature envelope (env) glycoproteins fuse with the plasma membrane of the target cell to release the capsid core into the host cell cytoplasm. upon reverse transcription and integration of the retroviral genome, viral gene expression is initiated. (a) in a cytokine (e.g. il-1β)-induced inflammatory state, furin and protease-activated receptor 1 (par1) expression are induced. furin and par1 interact with each other and are trapped as inactive proteins in the trans-golgi network (tgn). as a consequence, par1 cannot traffic to the cell surface, where it is usually cleaved by thrombin to induce inflammatory signalling pathways. moreover, production of infectious hiv-1 particles is impaired because of reduced furin-mediated cleavage of hiv env. (b) at the same time, cells of the infected host may induce the expression of interferon-stimulated genes such as guanylate-binding proteins 2 and 5 (gbp2 and gbp5). both proteins colocalise with furin and inhibit its proteolytic activity. as a result, hiv env maturation is impaired and newly forming viral particles are poorly infectious since they incorporate immature env glycoproteins. ifn-i/ii, type i and type ii interferons; stat, signal transducers and activators of transcription.
to inhibit activation of anthrax toxin, for example, the inhibitor does not need to enter the cells, as furin cleaves the toxin precursor at the cell surface. in contrast, penetration of the inhibitor into the cell is essential to prevent the maturation of hiv-1 env and other viral glycoproteins that are cleaved intracellularly. while some inhibitors (e.g. ha-derived peptides) efficiently enter cells, others were modified to increase their intracellular availability. for example, addition of a decanoyl moiety to cmk inhibitors increases their ability to penetrate cells. 
113 streptamine derivatives may be particularly promising for targeted therapy as the positioning of the guanidyl substituents determines the localisation of the inhibitor to distinct subcellular compartments such as endosomes or the golgi complex. 29 while many of the inhibitors described above potently reduce furin activity both in vitro and in vivo, most of them also inhibit other proprotein convertases recognising the same or similar polybasic cleavage sites. this limitation is inherent to competitive inhibitors that aim at mimicking the target sequence of furin and may be overcome by allosteric inhibitors that bind furin-specific motifs outside the active site. one example is the nanobody nb14, which binds to the c-terminal p-domain of furin, thereby blocking the access of larger substrates to the active site. notably, nb14 specifically binds to the p-domain of furin and does not recognise other pcsks. 114 instead of targeting furin at the protein level, therapeutic approaches may also aim at targeting its rna. for example, the endogenous degradation of furin mrna by regnase-1 (zc3h12a) and/or roquin (rc3h1) 115 could be modulated to interfere with the expression and thus proteolytically active amount of furin. however, modulation of regnase-1 and roquin will most likely have off-target effects as both endoribonucleases also degrade additional mrnas. a more selective rna-based approach is the silencing of furin via shrna. in fact, shrna-mediated suppression of furin expression is currently the clinically most advanced therapy targeting this protease. in a phase iii clinical trial, patients suffering from metastatic ewing's sarcoma family of tumors (esft) are treated with an immunotherapy that involves the silencing of furin and simultaneous overexpression of gm-csf. 116 more specifically, tumor cells are extracorporeally transfected and reintroduced as so-called furin knock-down and gm-csf augmented (fang) cancer vaccine, also known as vigil. 
while gm-csf boosts the antitumor response by dendritic cells and t cells, knockdown of furin prevents the proteolytic activation of tgfb, which may otherwise revert the beneficial effects of gm-csf. [117] [118] [119] in a phase ii clinical trial, the autologous fang/vigil vaccine has already proven successful as it increased relapse-free survival of ovarian cancer patients from 481 to 826 days and showed only limited adverse effects. 120 the proprotein convertase furin has become an attractive target for the treatment of various infectious and noninfectious diseases as it regulates the activity of numerous mammalian, bacterial and viral proteins. in recent years, several peptidic and nonpeptidic inhibitors have been developed that block the activation of bacterial toxins, prevent the maturation of viral proteins and suppress tumor growth in vitro. although some of them also yielded promising results in mouse models, there have been only a limited number of clinical trials in humans. one major challenge is the redundancy of furin with related proprotein convertases that also recognise polybasic cleavage sites. on the one hand, selective inhibition of furin may be beneficial as it limits unwanted side effects due to inhibition of other pcsks. on the other hand, treatment of some diseases may require the simultaneous inhibition of several pcsks to efficiently block pathological substrate conversion. to advance current approaches, a better understanding of the relative contribution of individual pcsks to (nonphysiological) proteolytic protein processing is urgently needed. therapeutic intervention needs to specifically target the convertase(s) that drive disease progression. currently, the most promising approaches for selective inhibition are shrna-mediated silencing and nanobodies as they show no or only little off-target effects. 
even if selective inhibition of individual pcsks can be achieved, systemic long-term inhibition will most likely have detrimental effects, as pcsks are required for the activation of hundreds of cellular substrates. thus, local applications such as targeted treatment of tumors or topical treatment of bacterial and viral infections may be more feasible than systemic therapy. finally, the ability of tumor cells or pathogens to evolve resistance or evasion mutations remains poorly investigated. for example, several substrates such as dengue virus prm harbour suboptimal furin target sequences and may optimise their cleavage sites upon therapy to enable sufficient cleavage in the presence of inhibitors. although the therapeutic application of furin inhibitors may be full of pitfalls, it is certainly a promising approach that should be further pursued. future studies will elucidate the role of individual pcsks and their substrates in disease progression and a better understanding of cellular pathways regulating furin activity may uncover additional targets for therapeutic intervention. 
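The distinction drawn above between optimal and suboptimal furin target sequences can be illustrated with a small motif scan. This is only a sketch under stated assumptions: it uses the minimal arg-x-x-arg pattern and the more stringent arg-x-(lys/arg)-arg pattern as stand-ins for "suboptimal" and "preferred" sites, the function names are hypothetical, and the example peptide is invented rather than taken from any viral protein. Real cleavage efficiency depends on a wider sequence context than four residues.

```python
import re

# Lookahead patterns so that overlapping tetrapeptides are all reported.
# MINIMAL: Arg-X-X-Arg; PREFERRED: Arg-X-(Lys/Arg)-Arg (assumed labels).
MINIMAL = re.compile(r"(?=(R..R))")
PREFERRED = re.compile(r"(?=(R.[KR]R))")


def find_sites(seq, pattern):
    """Return (0-based start, tetrapeptide) for every match of pattern in seq."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


def classify_sites(seq):
    """Label each Arg-X-X-Arg site as 'preferred' or 'minimal'."""
    preferred_starts = {pos for pos, _ in find_sites(seq, PREFERRED)}
    return [
        (pos, pep, "preferred" if pos in preferred_starts else "minimal")
        for pos, pep in find_sites(seq, MINIMAL)
    ]


if __name__ == "__main__":
    toy_peptide = "AARSKRGGRAARTT"  # invented sequence for illustration only
    for pos, pep, label in classify_sites(toy_peptide):
        print(pos, pep, label)
```

A scan of this kind only flags candidate sites; whether a flagged site is actually processed by furin or another pcsk has to be shown experimentally.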
pathophysiological aspects of proteases furindb: a database of 20-residue furin cleavage site motifs, substrates and their associated drugs insulin biosynthesis: evidence for a precursor evolutionary conserved close linkage of the c-fes/fps proto-oncogene and genetic sequences encoding a receptor-like protein furin is a subtilisin-like proprotein processing enzyme in higher eukaryotes the biology and therapeutic targeting of the proprotein convertases secreted site-1 protease cleaves peptides corresponding to luminal loop of sterol regulatory element-binding proteins catalytic activity is not required for secreted pcsk9 to reduce low density lipoprotein receptors in hepg2 cells tissue-based map of the human proteome expression of the dibasic proprotein processing enzyme furin is directed by multiple promoters proprotein convertase furin is preferentially expressed in t helper 1 cells and regulates interferon c tgfb1 regulates gene expression of its own converting enzyme furin the convertases furin and pc1 can both cleave the human immunodeficiency virus (hiv)-1 envelope glycoprotein gp160 into gp120 (hiv-1 su) and gp41 (hiv-i tm) processing of human toll-like receptor 7 by furin-like proprotein convertases is required for its accumulation and activity in endosomes intramolecular chaperones and protein folding activation of the furin endoprotease is a multiple-step process: requirements for acidification and internal propeptide cleavage furin at the cutting edge: from protein traffic to embryogenesis and disease intracellular trafficking and activation of the furin proprotein convertase: localization to the tgn and recycling from the cell surface maturation of the trans-golgi network protease furin: compartmentalization of propeptide removal, substrate cleavage, and cooh-terminal truncation shed" furin: mapping of the cleavage determinants and identification of its c-terminus a 20 residues motif delineates the furin cleavage site and its physical properties may 
influence viral fusion proprotein convertases [corrected] are responsible for proteolysis and inactivation of endothelial lipase in vivo evidence that furin from hepatocytes inactivates pcsk9 failure of ventral closure and axial rotation in embryos lacking the proprotein convertase furin loss of endothelial furin leads to cardiac malformation and early postnatal death mutations within a furin consensus sequence block proteolytic release of ectodysplasin-a and cause x-linked hypohidrotic ectodermal dysplasia haemophilia b: database of point mutations and short additions and deletions proprotein convertases: "master switches" in the regulation of tumor growth and progression proprotein convertase inhibition: paralyzing the cell's master switches the proprotein convertase furin in tumour progression: the pc furin in tumour progression hypoxiaenhanced expression of the proprotein convertase furin is mediated by hypoxia-inducible factor-1: impact on the bioactivation of proproteins hypoxia enhances cancer cell invasion through relocalization of the proprotein convertase furin from the trans-golgi network to the cell surface get a ligand, get a life: integrins, signaling and cell survival tgf-b in t cell biology: implications for cancer immunotherapy the dark side of ifnc: its role in promoting cancer immunoevasion radiotherapyassociated furin expression and tumor invasiveness in recurrent laryngeal cancer proteolytic activation of bacterial toxins: role of bacterial and host cell proteases evidence for involvement of furin in cleavage and activation of diphtheria toxin cell-mediated cleavage of pseudomonas exotoxin between arg279 and gly280 generates the enzymatically active fragment which translocates to the cytosol effect of diphtheria toxin on protein synthesis: inactivation of one of the transfer factors cellular processing of the interleukin-2 fusion toxin dab486-il-2 and efficient delivery of diphtheria fragment a to the cytosol of target cells requires arg194 
human furin is a calcium-dependent serine endoprotease that recognizes the sequence arg-x-x-arg and efficiently cleaves anthrax toxin protective antigen bacteriophage control of bacterial virulence host cell proteases controlling virus pathogenicity molecular biology of newcastle disease virus the molecular biology of influenza virus pathogenicity maturation of hiv envelope glycoprotein precursors by cellular endoproteases immunological analysis of human immunodeficiency virus type 1 envelope glycoprotein proteolytic cleavage improved antigenicity of the hiv env protein by cleavage site removal the extracellular processing of hiv-1 envelope glycoprotein gp160 by human plasmin role of staphylococcus protease in the development of influenza pneumonia influenza virus activating host proteases: identification, localization and inhibitors as potential therapeutics proteolytic activation of influenza viruses by serine proteases tmprss2 and hat from human airway epithelium influenza virus hemagglutinin with multibasic cleavage site is activated by furin, a subtilisin-like endoprotease proprotein-processing endoproteases pc6 and furin both activate hemagglutinin of virulent avian influenza viruses virulenceassociated sequence duplication at the hemagglutinin cleavage site of avian influenza viruses different hemagglutinin cleavage site variants of h7n7 in an influenza outbreak in chickens in mutations at the cleavage site of the hemagglutinin alter the pathogenicity of influenza virus a/chick/penn/83 (h5n2) hat, and tmprss2 activate the hemagglutinin of h9n2 influenza a viruses a novel activation mechanism of avian influenza virus h9n2 by furin processing of the ebola virus glycoprotein by the proprotein convertase furin proteolytic processing of marburg virus glycoprotein endoproteolytic processing of the ebola virus envelope glycoprotein: cleavage is not required for function proteolytic processing of the ebola virus glycoprotein is not critical for ebola virus 
replication in nonhuman primates gp mrna of ebola virus is edited by the ebola virus polymerase and by t7 and vaccinia virus polymerases δ-peptide is the carboxy-terminal cleavage fragment of the nonstructural small glycoprotein sgp of ebola virus the multiple roles of sgp in ebola pathogenesis degrees of maturity: the complex structure and biology of flaviviruses fusion activity of flaviviruses: comparison of mature and immature (prm-containing) tick-borne encephalitis virions structure of the immature dengue virus at low ph primes proteolytic maturation association of the pr peptides with dengue virus at acidic ph blocks membrane fusion cleavage of protein prm is necessary for infection of bhk-21 cells by tick-borne encephalitis virus functional importance of dengue virus maturation: infectious properties of immature virions west nile virus discriminates between dc-sign and dc-signr for cellular attachment and infection immature dengue virus: a veiled pathogen? differential modulation of prm cleavage, extracellular particle distribution, and virus infectivity by conserved residues at nonfurin consensus positions of the dengue virus pr-m junction cleavage of the papillomavirus minor capsid protein, l2, at a furin consensus site is necessary for infection the role of furin in papillomavirus infection mechanisms of human papillomavirus type 16 neutralization by l2 cross-neutralizing and l1 type-specific antibodies characterization of genotype-specific carboxyl-terminal cleavage sites of hepatitis b virus e antigen precursor and identification of furin as the candidate enzyme proteolytic processing of the hepatitis b virus e antigen precursor. 
cleavage at two furin consensus sequences the secreted hepatitis b precore antigen can modulate the immune response to the nucleocapsid: a mechanism for persistence a function of the hepatitis b virus precore protein is to regulate the immune response to the core antigen the secretory core protein of human hepatitis b virus is expressed on the cell surface influence of a single nucleotide polymorphism in the p1 promoter of the furin gene on transcription activity and hepatitis b virus infection rules of engagement: molecular insights from host-virus arms races samhd1 restricts hiv-1 by reducing the intracellular pool of deoxynucleotide triphosphates the intracellular dna sensor ifi16 gene acts as restriction factor for human cytomegalovirus replication neuroinflammationinduced interactions between protease-activated receptor 1 and proprotein convertases in hivassociated neurocognitive disorder hiv-induced neuroinflammation: impact of par1 and par2 processing by furin identification of potential hiv restriction factors by combining evolutionary genomic signatures with functional analyses guanylate binding protein (gbp) 5 is an interferon-inducible inhibitor of hiv-1 infectivity the authors guanylate-binding proteins 2 and 5 exert broad antiviral activity by inhibiting furin-mediated processing of viral envelope proteins distinct prognostic value of mrna expression of guanylate-binding protein genes in skin cutaneous melanoma interferoninducible guanylate binding protein (gbp2) is associated with better prognosis in breast cancer and indicates an efficient t cell response inhibition of hiv-1 gp160-dependent membrane fusion by a furindirected a 1-antitrypsin variant engineered serine protease inhibitor prevents furin-catalyzed activation of the fusion glycoprotein and production of infectious measles virus targeting host proteinases as a therapeutic strategy against viral and bacterial pathogens polyarginines are potent furin inhibitors the prosegments of furin and 
pc7 as potent inhibitors of proprotein convertases. in vitro and ex vivo assessment of their efficacy and selectivity opposing function of the proprotein convertases furin and pace4 on breast cancer cells' malignant phenotypes: role of tissue inhibitors of metalloproteinase-1 the furin inhibitor hexa-d-arginine blocks the activation of pseudomonas aeruginosa exotoxin a in vivo nona-d-arginine therapy for pseudomonas aeruginosa keratitis nona-d-arginine amide for prophylaxis and treatment of experimental pseudomonas aeruginosa keratitis potent inhibitors of furin and furin-like proprotein convertases containing decarboxylated p1 arginine mimetics the crystal structure of the proprotein processing proteinase furin explains its stringent specificity therapeutic uses of furin and its inhibitors: a patent review guanidinylated 2,5-dideoxystreptamine derivatives as anthrax lethal factor inhibitors structural studies revealed active site distortions of human furin by a small molecule inhibitor inhibition of proteolytic activation of influenza virus hemagglutinin by specific peptidyl chloroalkyl ketones generation and characterization of non-competitive furin-inhibiting nanobodies regnase-1 and roquin nonredundantly regulate th1 differentiation causing cardiac inflammation and fibrosis vigil + irinotecan and temozolomide in ewing's sarcoma -full text view -clinicaltrials.gov generation of large numbers of dendritic cells from mouse bone marrow cultures supplemented with granulocyte/macrophage colony-stimulating factor processing of transforming growth factor b 1 precursor by human furin convertase contrasting effects of tgf-b1 and tnf-a on the development of dendritic cells from progenitors in mouse bone marrow phase ii study of human cytomegalovirus strain towne glycoprotein b is processed by proteolytic cleavage identification and structure of the gene encoding gpii, a major glycoprotein of varicella-zoster virus epstein-barr virus glycoprotein homologous to herpes 
simplex virus gb coronavirus ibv: partial amino terminal sequencing of spike polypeptide s2 identifies the sequence arg-arg-phe-arg-arg at the cleavage site of the spike precursor propolypeptide of ibv strains beaudette and m41 cleavage inhibition of the murine coronavirus spike protein by a furin-like enzyme affects cell-cell but not virus-cell fusion nucleotide sequence of yellow fever virus: implications for flavivirus gene expression and evolution proteolytic activation of tick-borne encephalitis virus by furin nucleotide sequence of the 26s mrna of sindbis virus and deduced sequence of the encoded virus structural proteins furin processing and proteolytic activation of semliki forest virus mechanism of borna disease virus entry into cells processing of the borna disease virus glycoprotein gp94 by the subtilisinlike endoprotease furin crimean-congo hemorrhagic fever virus glycoprotein precursor is cleaved by furin-like and ski-1 proteases to generate a novel 38-kilodalton glycoprotein molecular analyses of the hemagglutinin genes of h5 influenza viruses: origin of a virulent turkey strain complete nucleotide sequence of an influenza virus haemagglutinin gene from cloned dna structural comparison of the cleavage-activation site of the fusion glycoprotein between virulent and avirulent strains of newcastle disease virus fusion glycoprotein of human parainfluenza virus type 3: nucleotide sequence of the gene, direct identification of the cleavage-activation site, and comparison with other paramyxoviruses cloning and sequencing of the mumps virus fusion protein gene the nucleotide sequence of the mrna encoding the fusion protein of measles virus (edmonston strain): a comparison of fusion proteins from several different paramyxoviruses fusion protein of the paramyxovirus simian virus 5: nucleotide sequence of mrna predicts a highly hydrophobic glycoprotein nucleotide sequence of the gene encoding the fusion (f) glycoprotein of human respiratory syncytial virus 
endoproteolytic cleavage of gp160 is required for the activation of human immunodeficiency virus structural characterization of the avian retrovirus reverse transcriptase and endonuclease domains the role of envelope glycoprotein processing in murine leukemia virus infection furin-mediated cleavage of the feline foamy virus env leader protein we thank frank kirchhoff and dominik hotter for their helpful comments. this work was funded by the dfg priority programme 'innate sensing and restriction of retroviruses' (spp 1923) to ds; eb was supported by the international graduate school in molecular medicine ulm (igradu). the authors declare no conflict of interest. eb and ds performed literature research. eb drafted the figures, and ds wrote the initial version of the manuscript.this is an open access article under the terms of the creative commons attribution-noncommercial license, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes. key: cord-017968-17d37a2z authors: lewinski, martin; köster, tino title: systems approaches to map in vivo rna–protein interactions in arabidopsis thaliana date: 2018-08-30 journal: systems biology doi: 10.1007/978-3-319-92967-5_5 sha: doc_id: 17968 cord_uid: 17d37a2z proteins that specifically interact with mrnas orchestrate mrna processing steps all the way from transcription to decay. thus, these rna-binding proteins represent an important control mechanism to double check which proportion of nascent pre-mrnas is ultimately available for translation into distinct proteins. here, we discuss recent progress to obtain a systems-level understanding of in vivo rna–protein interactions in the reference plant arabidopsis thaliana using protein-centric and rna-centric methods as well as combined protein binding site and structure probing. 
rna-binding proteins (rbps) are a diverse class of proteins that control every step of rna processing and rna function in the cell. they are characterized by dedicated domains involved in rna binding and can have accessory domains engaged in protein-protein interactions or enzymatic activities. in higher plants, rbp function so far has been best studied in the reference plant arabidopsis thaliana. among the rbps present in the arabidopsis genome are 197 proteins with an rna recognition motif (rrm), the most abundant type of rna-binding domain, and 28 k homology (kh) domain proteins first identified in mammalian heterogeneous nuclear protein hnrnp k (silverman et al. 2013). in addition, 26 pumilio (pum) domain proteins, nine dead-box helicases as well as five proteins with cold shock domains (csds) have been identified (silverman et al. 2013). another 450 proteins harbor pentatricopeptide repeat (ppr) domains. ppr domains consist of multiple 35-amino acid repeats of which two are known to be engaged in specific rna recognition (barkan and small 2014). these proteins are imported into mitochondria or chloroplasts and regulate all aspects of rna metabolism, e.g., rna editing, splicing, rna cleavage, and translation in organelles (schmitz-linneweber and small 2008; barkan and small 2014). a suite of arabidopsis rbps have been experimentally characterized, mainly through loss-of-function mutants and transgenic plants ectopically overexpressing rbps. these approaches revealed a crucial role for rbps in development (kalyna et al. 2003; ripoll et al. 2006; kupsch et al. 2012; völz et al. 2012; ferrari et al. 2017; foley et al. 2017; teubner et al. 2017), timing of plant reproduction (macknight et al. 1997; streitner et al. 2008; hornyik et al. 2010), responses to abiotic stress (kim et al. 2007b, c, 2008, 2010; park et al. 2009), pathogen defense (fu et al. 2007; qi et al. 2010; jeong et al. 2011; lyons et al. 2013; nicaise et al. 
2013), responses to phytohormones (lu and fedoroff 2000; hugouvieux et al. 2001; riera et al. 2006; carvalho et al. 2010; hackmann et al. 2014; löhr et al. 2014), and circadian timekeeping (heintzen et al. 1994; staiger 2001; jones et al. 2012; schmal et al. 2013; perez-santángelo et al. 2014). at the biochemical level, an impact of defined rbps on rna processing including pre-mrna splicing, 3′ end processing, processing of microrna precursors, and translation has been described (lopato et al. 1999; simpson et al. 2003; vazquez et al. 2004; dong et al. 2008; stauffer et al. 2010; ren et al. 2012; rühl et al. 2012; juntawong et al. 2013; sorenson and bailey-serres 2014; staiger 2015; carvalho et al. 2016). recent attempts to comprehensively identify rbps, summarized in sect. 2, provided experimental evidence for rna binding for most of the previously identified arabidopsis rbps and identified a plethora of proteins with noncanonical rbds. systems approaches to describe rna-protein interactions globally come in two main flavors (fig. 1). in rna-centric approaches, proteins associated with mrnas are recovered by rna pull-down and identified by mass spectrometry, a technique referred to as mrna interactome capture (baltz et al. 2012; castello et al. 2012) (fig. 1a). in protein-centric approaches, the focus is laid on a particular rbp. the rna complement associated with the rbp of interest, the ribonome, is identified via immunoprecipitation of the rbp from cell lysates and identification of the bound target rnas, initially by microarrays (tenenbaum et al. 2000; galgano and gerber 2011; guerreiro et al. 2014) or more recently via high throughput sequencing (licatalosi et al. 2008; könig et al. 2010; rossbach et al. 2014; müller-mcnicoll et al. 2016) (fig. 1b). of all predicted rbps in arabidopsis, rna binding has only been experimentally confirmed for a limited number of them. 
a first attempt to globally identify proteins based on their ability to interact with mrnas in vivo was made for cultured arabidopsis cells (schmidt et al. 2010). in this study, mrnas and interactors were recovered under native conditions by affinity chromatography on an oligo(dt) cellulose column followed by two-dimensional gel electrophoresis. the protein components were identified via maldi-tof. in the rna-bound proteome were a suite of rrm proteins including members of the family of glycine-rich rna-binding proteins like atgrp2 (arabidopsis thaliana glycine rich rna-binding protein 2), atgrp7 and atgrp8 (lewinski et al. 2016), the two oligouridylate-specific rbp45 and rbp47 proteins (lorkovic et al. 2000), and csd proteins. in 2012, mrna interactome capture was reported to comprehensively identify proteins interacting with mrnas in mammalian cells (baltz et al. 2012; castello et al. 2012). this technique employs in vivo cross-linking of mrna and bound proteins by uv light irradiation. the rna-protein complexes are recovered by pull-down of polyadenylated rnas using magnetic beads coated with oligo(dt). proteins are released by rnase treatment, subjected to tryptic digest and identified via mass spectrometry (fig. 1a). following these pioneering studies, this technique was applied to a wide range of organisms including yeast, drosophila melanogaster, caenorhabditis elegans, leishmania, trypanosomes, and plasmodium (mitchell et al. 2013; beckmann et al. 2015; matia-gonzalez et al. 2015; bunnik et al. 2016; lueong et al. 2016; sysoev et al. 2016; wessels et al. 2016; nandan et al. 2017). a minimal core mrna bound proteome occurring in both human and yeast was defined by beckmann and coworkers (beckmann et al. 2015). lately, mrna interactome capture has also been successfully applied to arabidopsis (marondedze et al. 2016; reichel et al. 2016; zhang et al. 2016). 
[figure caption fragment: xing et al. 2015, or relative to polyadenylated rna, e.g., meyer et al. 2017. in iclip (könig et al. 2010), rna-protein complexes are subjected to rnase treatment. bound proteins are digested with proteinase, leaving a polypeptide at the cross-link site. reverse transcriptase stops there, allowing the detection of the cross-link site at the −1 position of the processed sequencing reads.]
the first mrna interactome capture experiments in arabidopsis employed widely differing tissues to catalog rbps. gueten and coworkers chose protoplasts, cells without a cell wall, assuming that uv cross-linking should occur as efficiently as in mammalian cell monolayers (zhang et al. 2016). leaf mesophyll protoplasts are also widely used in transient assays to study the regulation of gene expression. a mesophyll protoplast mrna interactome was defined with a total of 325 proteins based on enrichment in cross-linked samples vs. non-cross-linked controls with a log2 fold change above 2 (zhang et al. 2016). of these, one class was represented by 123 ribosomal proteins of which 52 were also present in the core mrna-bound proteome of human and yeast cells (beckmann et al. 2015). the second class comprised 70 proteins with a known rbd. for 41 of them, a role in mrna binding and rna biology had already been described while the remaining proteins had a potential role in mrna processing. moreover, 12 of the rbps in the second class overlapped with the rbps identified in the native oligo(dt) affinity chromatography approach (schmidt et al. 2010). the third class comprised 132 candidate rbps. of these, 49 were metabolic enzymes, mainly oxidoreductases. moreover, numerous proteins related to photosynthesis were found. as these are generally strongly expressed, their rna binding activity and the domains involved beg for an independent validation. 
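The enrichment criterion used for the protoplast interactome (cross-linked vs. non-cross-linked, log2 fold change above 2) can be sketched in a few lines. This is a minimal illustration, not the pipeline of the cited study: the function names, the input layout and the pseudocount that guards against division by zero are all assumptions, and the intensities are invented.

```python
import math


def log2_fold_change(crosslinked, control, pseudocount=1.0):
    """log2 ratio of cross-linked to control signal.

    The pseudocount is an assumption made here so that proteins absent
    from the control do not cause a division by zero.
    """
    return math.log2((crosslinked + pseudocount) / (control + pseudocount))


def enriched_proteins(intensities, threshold=2.0, pseudocount=1.0):
    """Keep proteins whose log2 fold change exceeds the threshold.

    intensities maps a protein name to (cross-linked, control) signal.
    Returns a dict of protein name -> log2 fold change.
    """
    kept = {}
    for protein, (crosslinked, control) in intensities.items():
        lfc = log2_fold_change(crosslinked, control, pseudocount)
        if lfc > threshold:
            kept[protein] = lfc
    return kept


if __name__ == "__main__":
    # Invented intensities for three hypothetical proteins.
    toy = {"prot_a": (79, 9), "prot_b": (30, 9), "prot_c": (7, 0)}
    print(enriched_proteins(toy))
```

A fold-change cutoff alone does not account for replicate variability, which is one reason why strongly expressed proteins passing such a filter still call for independent validation.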
one of the enzymes was the arabidopsis ortholog of phosphoglycerate kinase whose rna binding capacity has previously been validated in yeast and human cells (beckmann et al. 2015). another mrna interactome capture experiment employed 4-day-old etiolated arabidopsis seedlings (reichel et al. 2016). this was based on the rationale that uv-absorbing pigments present in green plant tissue may interfere with uv cross-linking in planta and their absence in etiolated tissue may allow more efficient uv cross-linking. around 300 of the 746 proteins identified altogether were significantly enriched in uv cross-linked samples vs. non-cross-linked controls with a false discovery rate below 1% and designated the "at-rbp set." eighty percent of these have a known rbd, and 75% have been linked to rna biology. more than 400 additional proteins did not meet the significance criteria applied for the "at-rbp set" and were classified as "candidate rbps." notably, of the 197 computationally predicted rrm proteins in arabidopsis 160 were detected in the input fraction in etiolated seedlings (silverman et al. 2013). half of these were recovered in the "at-rbp set" and another 50 were present among the "candidate rbps." similarly, seven of the predicted kh proteins were present in the "at-rbp set" and 12 were among the "candidate rbps." of the predicted 450 members of the ppr protein family only 60 were detected in the input fraction, likely due to low abundance (schmitz-linneweber and small 2008; reichel et al. 2016). only six ppr proteins were found in the "at-rbp set" and another twelve in the "candidate rbps," likely because most rnas in the organelles lack poly(a) tails. a comparison of the identified proteins to the mrna interactome in other model organisms revealed that 52 were present in the interactomes of humans (baltz et al. 2012; beckmann et al. 2015), mice (kwon et al. 2013; liao et al. 2016), and yeast (beckmann et al. 
2015) and were assigned to basic functions in rna metabolism such as translation, splicing, and rna unwinding. in addition to rbps with known rbds many arabidopsis proteins emerged that have not been linked to rna binding so far. among novel rbps were proteins harboring a yt521-b homology (yth) domain (li et al. 2014). yth domain proteins have been shown to bind n6-methyladenosine and thus serve as readers of the m6a mark in mammals (wang et al. 2014). in addition, alba domain containing proteins have been identified. alba domain proteins are well characterized in archaebacteria where they act as transcriptional repressors and in other eukaryotes where they control translation (goyal et al. 2016). in plants, they have not yet been functionally characterized. the only observation pointing to rna binding is the recovery of an arabidopsis alba domain protein by rna-affinity chromatography (gosai et al. 2015). whirly domain containing proteins have been characterized as single-stranded dna binding proteins in organelles (krause et al. 2009) and in maize, association of a whirly protein with chloroplast transcripts has been observed (prikryl et al. 2008). the identification of three whirly proteins in the etiolated seedling interactome (reichel et al. 2016) and of whirly1 upon oligo(dt) affinity chromatography in arabidopsis cells (schmidt et al. 2010) now provides evidence for global in vivo rna binding. in addition, a plethora of proteins with potential rna binding activity have been detected. to substantiate their rna-binding properties, independent replication is desirable. among those are proteins with the domain of unknown function 1296, cytoskeletal proteins, and photoreceptors. the identification of plasma membrane intrinsic proteins has led to the speculation that aquaporins may be involved in transport of rnas between cells (reichel et al. 2016). 
another mrna interactome capture experiment was performed on cell suspension cultures generated from roots of the arabidopsis accessions col-0 and landsberg erecta. in parallel, leaves of four-week-old arabidopsis col-0 plants were investigated (marondedze et al. 2016). of 1145 proteins identified altogether in these three samples, 914 appeared only in uv cross-linked samples, and 233 proteins were significantly enriched upon uv cross-linking relative to non-cross-linked samples. more than 350 proteins were known rbps whereas 736 were novel candidate rbps not previously assigned an rna-related function or known rbd, including many enzymes of intermediary metabolism, and thus await further experimental proof (marondedze et al. 2016). the discovery of many novel rbps begs for further investigation of the rna-binding properties of these proteins. accordingly, methods to define rna targets of candidate rbps genome wide using protein-centric methods have recently been adapted for use in arabidopsis, as discussed below. approaches to globally identify in vivo targets of an rbp in arabidopsis mostly rely on transgenic plants expressing an epitope-tagged version of the rbp. immunopurification is performed via an antibody directed against the epitope tag. to mirror the endogenous expression pattern, authentic promoters are used and the constructs are introduced into a loss-of-function mutant (köster and staiger 2014). alternatively, endogenous rbps can be recovered with dedicated antibodies. to freeze the in vivo rna-protein interactions before cell lysis, cross-linking is performed by exposing plants to formaldehyde in rna immunoprecipitation (rip) or by uv irradiation in uv cross-linking and immunoprecipitation (clip) (fig. 1b). formaldehyde efficiently cross-links nucleic acids and proteins in vivo but also cross-links proteins to each other. thus, not only direct targets are recovered. 
this is circumvented by using 254 nm uv light that cross-links proteins directly binding to nucleic acids in the neighborhood of the excited nucleobase but does not cross-link proteins to each other. to date, a comprehensive determination of in vivo targets, the ribonome, has been performed for only a few arabidopsis rbps, both nucleocytoplasmic proteins and chloroplast-localized proteins with different tasks in posttranscriptional regulation. in the subsequent sections, selected examples are presented. hlp1 is an arabidopsis rbp resembling mammalian hnrnp a/b-like proteins (zhang et al. 2015). high throughput sequencing (hits)-clip of hlp1 fused to gfp and expressed under control of the strong, constitutive cauliflower mosaic virus 35s rna promoter identified more than 5500 transcripts bound in vivo (zhang et al. 2015). when endogenous hlp1 protein was precipitated by a specific antibody, 6850 transcripts bound in vivo were detected, more than 3000 of which overlapped with the hlp1-gfp precipitation. the prevalence of cross-linked regions near polyadenylation sites provoked the hypothesis that hlp1 may control polyadenylation. indeed, in more than 2000 transcripts the distal polyadenylation site was preferred over the proximal polyadenylation site in hlp1 mutant plants. around 19% of these transcripts were also recovered by hlp1 hits-clip, pointing to a role for hlp1 in the control of alternative polyadenylation, at least partly by direct binding. in line with this, meme motifs overrepresented in the cross-link regions, namely a-rich (5'-agaaaa-3') and u-rich (5'-uuuucu-3') motifs, resembled motifs enriched in the vicinity of the poly(a) site, 5'-aaagaaaa-3' and 5'-uguuuc-3'. the presence of cross-link regions in other parts of the transcripts apart from the 3' untranslated region (utr) suggests that hlp1 may also affect other aspects of pre-mrna processing in addition to polyadenylation.
atgrp7 (arabidopsis thaliana glycine rich rna-binding protein 7) is another hnrnp-like protein with an n-terminal rrm and a c-terminus enriched in contiguous glycine residues. atgrp7 is regulated by the circadian clock and negatively autoregulates its own oscillations by alternative splicing and nonsense-mediated decay (staiger et al. 2003; schmal et al. 2013). additionally, it is involved in several steps of posttranscriptional regulation including alternative splicing, nucleic acid chaperone function, and pri-mirna processing (kim et al. 2007a; streitner et al. 2012). to gain insights into the breadth of its in vivo targets, individual nucleotide resolution cross-linking and immunoprecipitation (iclip) and rip-seq were performed. atgrp7 fused to gfp was expressed from its own promoter including all regulatory elements (5' utr, intron, and 3' utr) in the atgrp7-1 loss-of-function mutant. in parallel, transgenic plants expressing gfp alone or an rna-binding dead variant of atgrp7 with a single conserved arginine in the rrm mutated to glutamine (atgrp7 r49q) were used as negative controls. iclip identified 858 transcripts with significant iclip hits in four out of five biological replicates for atgrp7-gfp that were not present in the controls. rip-seq identified 2453 transcripts enriched by atgrp7-gfp relative to total polyadenylated rna. the higher number may be due to the higher cross-linking efficiency of formaldehyde compared to uv light, and the recovery of many indirect targets. 452 transcripts were common to both data sets, suggesting that they represent a set of high confidence binders. the iclip cross-link sites were observed in all transcript regions: the utrs, coding sequence and introns. after correcting for the length of the feature in the genome, cross-link sites in the 3' utr prevailed. conserved motifs in the vicinity of the cross-link sites generally were u/c rich.
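the two bookkeeping steps described above, intersecting the iclip and rip-seq target lists and correcting cross-link site counts for feature length, can be sketched as follows. this is a minimal illustration and not the pipeline used in the cited study: the function names, the per-kilobase scaling, and all numbers in the toy example are assumptions made for demonstration only.

```python
from typing import Dict, Set


def high_confidence_binders(iclip_targets: Set[str], ripseq_targets: Set[str]) -> Set[str]:
    """Transcripts recovered by both methods (hypothetical helper)."""
    return iclip_targets & ripseq_targets


def length_normalized_density(site_counts: Dict[str, int],
                              feature_lengths: Dict[str, int]) -> Dict[str, float]:
    """Cross-link sites per kilobase of each transcript feature.

    Raw counts favor long features such as coding sequences; dividing by
    the total annotated length of the feature removes that bias.
    """
    return {feature: 1000.0 * site_counts[feature] / feature_lengths[feature]
            for feature in site_counts}


# Toy numbers, invented for illustration (not data from the study).
counts = {"5utr": 10, "cds": 60, "intron": 20, "3utr": 30}
lengths = {"5utr": 1000, "cds": 12000, "intron": 8000, "3utr": 2000}

density = length_normalized_density(counts, lengths)
top_feature = max(density, key=density.get)

shared = high_confidence_binders({"gene_a", "gene_b"}, {"gene_b", "gene_c"})
```

in this toy example the coding sequence carries the most raw sites, but the 3' utr has the highest density once length is taken into account, which is the kind of shift the length correction is meant to reveal.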
to determine how atgrp7 may impact its downstream targets, the binding targets were cross-referenced against transcriptome data from atgrp7 overexpressing plants or loss-of-function mutants. in both the atgrp7 overexpressors and the mutant, a similar number of transcripts was expressed at elevated or reduced levels compared to wild-type plants. notably, significantly more differentially expressed iclip targets were downregulated in atgrp7-overexpressors than upregulated. in turn, more of the differentially expressed atgrp7 iclip targets were expressed at elevated levels in the mutant than at reduced levels. this indicates a predominantly negative effect of atgrp7 on its targets. among the targets were more circadianly regulated transcripts than expected. in particular, elevated atgrp7 levels lead to damping of circadian oscillations of target transcripts including dormancy/auxin associated family protein2 and ccr-like. this is consistent with the idea that the circadian clock regulated atgrp7 functions as a molecular slave oscillator, conveying temporal information from the core circadian clock within the cell (rudolf et al. 2004). in addition, changes in splicing patterns were observed for iclip and rip-seq targets upon misexpression of atgrp7, confirming a role for atgrp7 in the control of alternative splicing. arabidopsis thaliana serine/arginine rich (sr)-like protein sr45, the counterpart of metazoan rnps1, is an sr-like protein with two rs domains, flanking either side of the rrm (badolato et al. 1995; golovkin and reddy 1999). notably, recombinant arabidopsis sr45 can activate splicing of a β-globin splicing reporter in hela cell s100 extracts (ali et al. 2007). sr45 occurs in two splice isoforms that arise through differential usage of a 3' splice site in intron 6.
this leads to two protein isoforms that differ by seven amino acid residues and in their function: sr45.1 is involved in petal development in flowers, whereas sr45.2 is important for root growth (zhang and mount 2009). genome-wide targets for sr45.1 were determined during early seedling development (xing et al. 2015) and in inflorescences (zhang et al. 2017). in seedlings, rip-seq identified 4361 transcripts from 4262 genes that were enriched upon precipitation of sr45.1-gfp from nuclei of transgenic plants compared to mock precipitation from wild type plants (xing et al. 2015). these were designated sars, for sr45 associated rnas. a gene ontology term analysis showed that 43 of 147 abscisic acid (aba) signaling genes (30%) were among the sars, in line with a function for sr45 in the aba signaling pathway (carvalho et al. 2010). one hundred and forty-eight of the sars had an altered expression in the sr45-1 mutant, suggesting that binding of sr45 has functional consequences. a meme search for sr45 binding motifs revealed four overrepresented motifs within sar genes. two g/a rich motifs are largely positioned within exons and show strong similarity to the binding motifs of two metazoan splicing regulators, transformer 2 (tra2) and serine/arginine-rich splicing factor 10 (srsf10). furthermore, one g/a rich motif closely resembles the gaag motif, a known cis-regulatory element regulating alternative splicing in plants. in contrast, two u/c rich motifs peak within intronic regions near 5' and 3' splice sites, in line with the observation that the majority of sars were from intron-containing genes and with the known role of sr45 as a splicing regulator (xing et al. 2015). to gain insights into a potential role of sr45 in flower development, rip-seq was performed for sr45.1-gfp in inflorescence tissue (zhang et al. 2017).
the resulting reads were analyzed by two different bioinformatics pipelines, one based on mapping reads to the genome and one directly quantifying annotated transcripts. sars in inflorescence were defined based on a twofold enrichment compared to gfp only controls and the identification by both pipelines. of 1812 sars in inflorescence, 677 overlapped with the sars in seedlings. notably, 19 transcripts encoding splicing factors were among the sars including sr45 itself, the three sr proteins sr30, sr34, and scl35, the pre-mrna processing factors prp39, prp40a, prp40b, and prp2, and the rna helicase rh42, pointing to a hierarchical regulation of posttranscriptional regulators (keene 2007). genes upregulated in the sr45-1 mutant are enriched for defense response genes. indeed, the sr45-1 mutant was more resistant to bacterial and fungal pathogens. of 68 upregulated defense response genes in sr45-1, 10 were sars. thus, sr45 has an additional role as a negative regulator of plant immunity. furthermore, 81 of the inflorescence sars were aberrantly spliced in the sr45-1 mutant. determination of potential sr45 binding sites in inflorescence sars uncovered an overrepresentation of the purine-rich motifs ggngg, gngga, and gnggnng. importantly, ggngg and related motifs are enriched in introns and exons that are alternatively spliced in the sr45-1 mutant, irrespective of whether the splicing event is favored or suppressed by sr45. this led to the suggestion that sr45 identifies regions for alternative splicing and acts as a facilitator for other splicing factors. however, the identified binding motifs for sr45 in inflorescences differ from those in seedlings, which might be in part due to the different bioinformatic tools used for motif determination. both rip-seq data sets nevertheless strengthen sr45's key role as an important splicing factor in arabidopsis.
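degenerate motifs such as ggngg, gngga, and gnggnng can be located in a transcript sequence by translating each iupac symbol into a character class. the sketch below is only an illustration of that idea under stated assumptions: the helper names are hypothetical, only a small subset of the iupac code is covered, and it is not the motif-discovery software used in the cited studies.

```python
import re
from typing import List, Pattern

# Subset of the IUPAC nucleotide code, written for RNA (t is read as u).
IUPAC = {
    "a": "a", "c": "c", "g": "g", "u": "u", "t": "u",
    "n": "[acgu]", "r": "[ag]", "y": "[cu]",
}


def motif_to_regex(motif: str) -> Pattern[str]:
    """Compile a degenerate motif into a regex that also finds overlapping hits."""
    body = "".join(IUPAC[symbol] for symbol in motif.lower())
    # A zero-width lookahead lets finditer report overlapping occurrences.
    return re.compile("(?=(" + body + "))")


def find_motif(sequence: str, motif: str) -> List[int]:
    """Return 0-based start positions of every occurrence of the motif."""
    rna = sequence.lower().replace("t", "u")
    return [match.start() for match in motif_to_regex(motif).finditer(rna)]


hits_ggngg = find_motif("aaggaggcc", "ggngg")
hits_overlapping = find_motif("ggggggg", "ggngg")
hits_gngga = find_motif("GTGGA", "gngga")
```

the lookahead matters for purine-rich tracts: in a run of seven g residues the five-nucleotide motif ggngg occurs at three overlapping positions, all of which are reported.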
however, in both rip-seq experiments intron-less transcripts were identified in addition to intron-containing transcripts, pointing to functions of sr45 beyond its known role in pre-mrna splicing. interestingly, a comparison between the u/c-rich motifs of atgrp7 and the u/c-rich motifs of sr45 identified by meme in seedlings revealed a high degree of similarity. the functional significance remains to be tested. in bacteria, cold shock proteins (csps) are upregulated upon cold stress and destabilize rna secondary structure at low temperatures (sommerville 1999). to elucidate a potential involvement of arabidopsis csps in the regulation of cold responsive genes, rip followed by gene chip analysis was performed for csp1 (juntawong et al. 2013). more than 6000 mrnas were identified. comparison of these csp1-associated transcripts in total rna and rna loaded onto polysomes revealed an enrichment of mrnas associated with ribosome biogenesis in the pool of actively translating rnas. the high gc content in 5' utrs of these mrnas suggested that csp1 is involved in removing secondary structures in the 5' utr to facilitate their translation. accordingly, these mrnas were less efficiently loaded onto polysomes at low temperature in the atcsp1-1 mutant compared to wild type plants or csp1 overexpressing plants (juntawong et al. 2013). the highly abundant chloroplast ribonucleoproteins (cprnps) have been well characterized for their role in regulating chloroplast transcripts (ohta et al. 1995). the cprnps comprise an acidic domain and two rrms. they are encoded in the nucleus and imported into chloroplasts. mutants in distinct cprnps are widely affected in processing of transcripts in the chloroplast, leading to defects in chloroplast development and, consequently, plant performance owing to the essential role of the chloroplast in photosynthetic energy (ruwe et al. 2011).
for example, mutants deficient in cp29a (29 kda chloroplast protein a) and cp31a (31 kda chloroplast protein a) showed gross defects at low ambient temperature. rip performed with antibodies against the endogenous proteins and subsequent hybridization of coprecipitated rnas on tiling arrays covering the arabidopsis chloroplast genome (rip-chip) showed that cp29a and cp31a associate with large overlapping sets of chloroplast transcripts including strong enrichment for psbb, psbd, psaa/b, atpb, ndhb and intermediate enrichment for almost all chloroplast mrnas (kupsch et al. 2012). both cp29a and cp31a are required for accumulation of chloroplast mrnas under cold stress. furthermore, binding of cp31a to 3' ends of certain transcripts serves to protect these transcripts against 3' exonuclease activity (kupsch et al. 2012). together with the known role of cp31a in rna editing (tillich et al. 2009) this points to multiple functions in posttranscriptional regulation in chloroplasts. for cp33a (33 kda chloroplast protein a), rip-chip revealed an association with a large body of chloroplast mrnas (teubner et al. 2017). a global reduction in mrnas and proteins making up the photosynthetic apparatus was found in the cp33a mutant. in line with a crucial role for cp33a in the development of the photosynthetic apparatus, cp33a null mutants have an albino phenotype and are not able to survive without external sucrose supply (teubner et al. 2017). in contrast to the broad substrate specificity of the cprnps, a very narrow substrate specificity was found for a representative of the ppr class of nuclear-encoded rbps that are imported into organelles. atcpr1 (arabidopsis thaliana chloroplast rna processing 1) is important for the production of subunits of the thylakoid protein complexes (ferrari et al. 2017). atcpr1 mutants are yellow-white because the subunits of the photosynthetic apparatus do not accumulate. rip-chip was performed for atcpr1 under native conditions.
hybridization of bound targets to chloroplast tiling arrays revealed specific binding of atcpr1 to only a few transcripts: the psac transcript encoding a photosystem i subunit, and petb-petd encoding cytochrome b6 and subunit iv of the cytochrome b6/f complex. because rnase was used during rip to digest unprotected rna, it was possible to delineate the binding regions. binding to the petb-petd intergenic region correlated with a requirement for processing of the polycistronic transcript comprising petb and petd (ferrari et al. 2017), thus providing proof for the functional relevance of the observed in vivo binding. in addition to rna sequence, rna secondary structure also strongly influences the interaction of rbps with their cognate rna binding motifs (cruz and westhof 2009; vandivier et al. 2016). rna structure may facilitate binding of rbds with a preference for double-stranded rna or inhibit binding of rbps with a preference for single-stranded rna. protein interaction profile sequencing (pip-seq) allows simultaneous delineation of in vivo rna secondary structure and protein-protected sites (ppss) (fig. 2) (gosai et al. 2015). to identify ppss, samples are treated with a single-strand specific or double-strand specific rnase. proteins are then denatured before library preparation. to determine the rna secondary structure, proteins are denatured by sds and removed by protease digestion to make sites protected by proteins in vivo accessible for rnases. collectively, motifs that are enriched in the samples used to determine protein protected sites compared to the samples used for structure determination are in vivo target sites of rbds. gregory and coworkers applied pip-seq to the nuclei of two specific cell types in the arabidopsis root that derive from epidermal cells through distinct differentiation, those cells bearing root hairs and those that do not (foley et al. 2017).
distinct protein binding patterns were detected, and binding motifs either specific to hair cells, non-hair cells or common to both cell types were determined. to identify candidate proteins, rna affinity chromatography was performed on immobilized oligonucleotides derived from enriched motifs. a ggn repeat motif enriched in sites protected in both hair cells and non-hair cells recovered serrate (se) from root lysates, a zinc finger containing rbp involved in processing of mirna precursors. a tg rich motif enriched in hair cell-specific protected sites identified atgrp2, atgrp7 and atgrp8. subsequently, atgrp8 was shown to regulate root hair development at the posttranscriptional level. an advantage of pip-seq is that it does not rely on an antibody to identify target sites within bound transcripts. in contrast, subsequent identification of the cognate binding proteins requires in vitro binding techniques. thus, binding in vivo has to be confirmed by independent means. the recent mrna interactome capture studies are very valuable in having established uv cross-linking and oligo(dt) affinity capture to determine the mrna binding proteome also in arabidopsis. a large number of previously predicted rbps in arabidopsis were now identified experimentally and many novel proteins without a previous assignment to rna biology were unearthed. reichel and colleagues noticed a bias toward proteins with higher abundance in the interactome compared to the input (reichel et al. 2016), suggesting that additional proteins with lower expression level may still be identified in the future. only a few of the mrna interacting proteins were present in all three interactomes. this may partly be attributed to the widely differing developmental stages investigated.
among the commonly identified proteins are numerous cytoplasmic ribosomal proteins from the small and large ribosomal subunits, likely due to their high abundance, as well as the ubiquitously expressed glycine-rich rbps atgrp7 and atgrp8. future applications include describing the dynamics of posttranscriptional networks in response to endogenous and exogenous cues through changes in the mrna-bound proteomes. furthermore, as proteins binding to nonpolyadenylated rnas obviously remain elusive in these approaches, transcript-specific approaches have to be developed. transcriptome-wide identification of target transcripts bound by selected rbps in vivo has overcome a major limitation in research on plant rna-based regulation. nevertheless, except for the ppr proteins, we are still far from understanding the exact binding specificity of most proteins and the consequences in vivo binding has for the targets. to correlate in vivo binding with function, the impact of mutated candidate binding motifs on rbp binding and target gene expression has to be determined. most current bioinformatics pipelines for motif discovery are limited to sequence data. current efforts focus on developing bioinformatics pipelines for identifying conserved motifs taking rna structure context into consideration (maticzka et al. 2014). molecular dynamics simulations of rna molecules are still compute intensive but can shed light on possible interaction sites and three dimensional structures (tuszynska et al. 2015; boniecki et al. 2016). finally, heterogeneous datasets and analyses, fusing several kinds of sources, can improve meta-analysis with in silico and in vivo datasets. this is as yet limited in arabidopsis but will improve the information quality in the near future. additionally, it will be important to have comprehensive databases on rbp target sites linked to the arabidopsis information portal (the international arabidopsis informatics consortium 2012).
such resources will be of great value to improve a systems understanding of rna-protein interaction. regulation of plant developmental processes by a novel splicing factor identification and characterisation of a novel human rna-binding protein the mrna-bound proteome and its global occupancy profile on protein-coding transcripts pentatricopeptide repeat proteins in plants the rna-binding proteomes from yeast to man harbour conserved enigmrbps simrna: a coarse-grained method for rna folding simulations and 3d structure prediction the mrna-bound proteome of the human malaria parasite plasmodium falciparum the plant-specific sr45 protein negatively regulates glucose and aba signaling during early seedling development in arabidopsis the arabidopsis sr45 splicing factor, a negative regulator of sugar signaling, modulates snf1-related protein kinase 1 stability insights into rna biology from an atlas of mammalian mrna-binding proteins the dynamic landscapes of rna architecture the rna-binding proteins hyl1 and se promote accurate in vitro processing of pri-mirna by dcl1 crp1 protein: (dis)similarities between arabidopsis thaliana and zea mays a global view of rna-protein interactions identifies post-transcriptional regulators of root hair cell fate a type iii effector adp-ribosylates rna-binding proteins and quells plant immunity rna-binding protein immunopurification-microarray (rip-chip) analysis to profile localized rnas an sc35-like protein and a novel serine/arginine-rich protein interact with arabidopsis u1-70k protein global analysis of the rna-protein interaction and rna secondary structure landscapes of the arabidopsis nucleus the alba protein family: structure and function genome-wide rip-chip analysis of translational repressor-bound mrnas in the plasmodium gametocyte salicylic acid-dependent and -independent impact of an rna-binding protein on plant immunity a light- and temperature-entrained circadian clock controls expression of transcripts encoding nuclear
proteins with homology to rna-binding proteins in meristematic tissue the spen family protein fpa controls alternative cleavage and polyadenylation of rna an mrna cap binding protein, abh1, modulates early abscisic acid signal transduction in arabidopsis structure function analysis of an adp-ribosyltransferase type iii effector and its rna-binding target in plant immunity mutation of arabidopsis spliceosomal timekeeper locus1 causes circadian clock defects cold shock protein 1 chaperones mrnas during translation in arabidopsis thaliana ectopic expression of at rsz33 reveals its function in splicing and causes pleiotropic changes in development rna regulons: coordination of post-transcriptional events cold shock domain proteins and glycine-rich rna-binding proteins from arabidopsis thaliana can promote the cold adaptation process in escherichia coli functional characterization of a glycine-rich rna-binding protein 2 in arabidopsis thaliana under abiotic stress conditions a zinc finger-containing glycine-rich rna-binding protein, at rz-1a, has a negative impact on seed germination and seedling growth of arabidopsis thaliana under salt or drought stress conditions functional characterization of dead-box rna helicases in arabidopsis thaliana under abiotic stress conditions glycine-rich rna-binding proteins are functionally conserved in arabidopsis thaliana and oryza sativa during cold adaptation process iclip reveals the function of hnrnp particles in splicing at individual nucleotide resolution rna-binding protein immunoprecipitation from whole-cell extracts regulation of pri-mirna processing by the hnrnp-like protein atgrp7 in arabidopsis rna-binding proteins revisited: the emerging arabidopsis mrna interactome whirly proteins as communicators between plant organelles and the nucleus?
arabidopsis chloroplast rna binding proteins cp31a and cp29a associate with large transcript pools and confer cold stress tolerance by influencing multiple chloroplast rna processing steps the rna-binding protein repertoire of embryonic stem cells genome-wide identification and phylogenetic analysis of plant rna binding proteins comprising both rna recognition motifs and contiguous glycine residues genome-wide identification, biochemical characterization, and expression analyses of the yth domain-containing rna-binding protein family in arabidopsis and rice the cardiomyocyte rna-binding proteome: links to intermediary metabolism and heart disease hits-clip yields genome-wide insights into brain alternative rna processing a glycine-rich rna-binding protein affects gibberellin biosynthesis in arabidopsis atsrp30, one of two sf2/asf-like proteins from arabidopsis thaliana, regulates splicing of specific plant genes rbp45 and rbp47, two oligouridylate-specific hnrnp-like proteins interacting with poly(a)+ rna in nuclei of plant cells a mutation in the arabidopsis hyl1 gene encoding a dsrna binding protein affects responses to abscisic acid, auxin, and cytokinin gene expression regulatory networks in trypanosoma brucei: insights into the role of the mrna-binding proteome the rna-binding protein fpa regulates flg22-triggered defense responses and transcription factor activity by alternative polyadenylation fca, a gene controlling flowering time in arabidopsis, encodes a protein containing rna-binding domains the rna-binding protein repertoire of arabidopsis thaliana conserved mrna-binding proteomes in eukaryotic organisms graphprot: modeling binding preferences of rna-binding proteins adaptation of iclip to plants determines the binding landscape of the clock-regulated rna-binding protein atgrp7 global analysis of yeast mrnps sr proteins are nxf1 adaptors that link alternative rna processing to mrna export comprehensive identification of mrna-binding proteins of
leishmania donovani by interactome capture pseudomonas hopu1 affects interaction of plant immune receptor mrnas to the rna-binding protein grp7 three types of nuclear genes encoding chloroplast rna-binding proteins (cp29, cp31 and cp33) are present in arabidopsis thaliana: presence of cp31 in chloroplasts and its homologue in nuclei/cytoplasms cold shock domain proteins affect seed germination and growth of arabidopsis thaliana under abiotic stress conditions role for lsm genes in the regulation of circadian rhythms a member of the whirly family is a multifunctional rna- and dna-binding protein that is essential for chloroplast biogenesis a putative rna-binding protein positively regulates salicylic acid-mediated immunity in arabidopsis in planta determination of the mrna-binding proteome of arabidopsis etiolated seedlings regulation of mirna abundance by rna binding protein tough in arabidopsis arabidopsis rna-binding protein uba2a relocalizes into nuclear speckles in response to abscisic acid pepper, a novel k-homology domain gene, regulates vegetative and gynoecium development in arabidopsis crosslinking-immunoprecipitation (iclip) analysis reveals global regulatory roles of hnrnp l slave to the rhythm polypyrimidine tract binding protein homologs from arabidopsis are key regulators of alternative splicing with implications in fundamental developmental processes the rna-recognition motif in chloroplasts a circadian clock-regulated toggle switch explains atgrp7 and atgrp8 oscillations in arabidopsis thaliana a proteomic analysis of oligo(dt)-bound mrnp containing oxidative stress-induced arabidopsis thaliana rna-binding proteins atgrp7 and atgrp8 pentatricopeptide repeat proteins: a socket set for organelle gene expression genomic era analyses of rna secondary structure and rna-binding proteins reveal their significance to post-transcriptional regulation in plants fy is an rna 3' end-processing factor that interacts with fca to control the arabidopsis floral
transition activities of cold-shock domain proteins in translation control selective mrna sequestration by oligouridylate-binding protein 1 contributes to translational control during hypoxia in arabidopsis rna-binding proteins and circadian rhythms in arabidopsis thaliana shaping the arabidopsis transcriptome through alternative splicing the circadian clock regulated rna-binding protein atgrp7 autoregulates its expression by influencing alternative splicing of its own pre-mrna polypyrimidine tract-binding protein homologues from arabidopsis underlie regulatory circuits based on alternative splicing and downstream control the small glycine-rich rna-binding protein atgrp7 promotes floral transition in arabidopsis thaliana an hnrnp-like rna-binding protein affects alternative splicing by in vivo interaction with target transcripts in arabidopsis thaliana global changes of the rna-bound proteome during the maternal-to-zygotic transition in drosophila identifying mrna subsets in messenger ribonucleoprotein complexes by using cdna arrays the rrm protein cp33a is a global ligand of chloroplast mrnas and is essential for plastid biogenesis and plant development taking the next step: building an arabidopsis information portal chloroplast ribonucleoprotein cp31a is required for editing and stability of specific chloroplast mrnas npdock: a web server for protein-nucleic acid docking the conservation and function of rna secondary structure in plants the nuclear dsrna binding protein hyl1 is required for microrna accumulation and plant development, but not posttranscriptional transgene silencing lachesis-dependent egg-cell signaling regulates the development of female gametophytic cells n6-methyladenosine-dependent regulation of messenger rna stability the mrna-bound proteome of the early fly embryo transcriptome-wide identification of rna targets of arabidopsis serine/arginine-rich45 uncovers the unexpected roles of this rna binding protein in rna processing two
alternatively spliced isoforms of the arabidopsis thaliana sr45 protein have distinct roles during normal plant development integrative genome-wide analysis reveals hlp1, a novel rna-binding protein, regulates plant flowering by targeting alternative polyadenylation uv crosslinked mrna-binding proteins captured from leaf mesophyll protoplasts transcriptome analyses reveal sr45 to be a neutral splicing regulator and a suppressor of innate immunity in arabidopsis thaliana the work in tino köster's lab is supported by the dfg through grant ko 5364/1-1. martin lewinski is supported by the dfg through grant sta653/6-1 to dorothee staiger. key: cord-022955-vy0qgtll authors: nan title: proteases date: 2005-06-20 journal: febs j doi: 10.1111/j.1742-4658.2005.4739_4.x sha: doc_id: 22955 cord_uid: vy0qgtll nan the incretin hormones glp-1 and gip are released from the gut during meals, and serve as enhancers of glucose stimulated insulin release from the beta cells. furthermore, glp-1 also stimulates beta cell growth and insulin biosynthesis, inhibits glucagon secretion, reduces free fatty acids and delays gastric emptying. glp-1 has therefore been suggested as a potentially new treatment for type 2 diabetes. however, glp-1 is very rapidly degraded in the bloodstream by the enzyme dipeptidyl peptidase iv (dpp-iv; ec 3.4.14.5). a very promising approach to harness the beneficial effect of glp-1 in the treatment of diabetes is to inhibit the dpp-iv enzyme, thereby enhancing the levels of endogenous intact circulating glp-1. the three dimensional structure of human dpp-iv in complex with various inhibitors creates a better understanding of the specificity and selectivity of this enzyme and allows for further exploration and design of new therapeutic inhibitors.
the majority of the currently known dpp-iv inhibitors consist of an alpha amino acid pyrrolidine core, to which substituents have been added to optimize affinity, potency, enzyme selectivity, oral bioavailability, and duration of action. various compound series and their sar relative to alpha amino acids will be presented. memapsin 2 (β-secretase, bace1) is the membrane-anchored aspartic protease that initiates the cleavage of β-amyloid precursor protein (app) leading to the production of amyloid-β (aβ), a major factor in the pathogenesis of alzheimer's disease (ad). since memapsin 2 is a major target for the development of inhibitor drugs for the treatment of ad, its structure and physiological functions are currently topics of intense research interest. here we discuss the structural features of memapsin 2 and how they contribute to the activity and inhibition of the protease. structural and kinetic evidence support the presence of 11 subsites for substrate or inhibitor binding in the active-site cleft of memapsin 2. subsites p3 to p2' are most useful in the design of transition-state analogue inhibitors. recent data indicated that subsites p7, p6 and p5 have strong influence on hydrolytic rate or inhibition potency. these subsites are, however, too far from the transition-state isostere for the design of drug-like transition-state inhibitors but can be utilized for the design of non-transition-state inhibitors that compete for substrate binding. besides carrying out proteolytic activity, the ectodomain of memapsin 2 also interacts with app leading to the endocytosis of both proteins into the endosomes where app is hydrolyzed by memapsin 2 to produce aβ. a phosphorylated motif in the cytosolic domain of memapsin 2 is responsible for the recognition of gga proteins as part of the recycling mechanism that transports memapsin 2 from endosomes to trans-golgi then back to cell surface.
these interactions may also be considered for the design of small-molecule compounds that interfere with memapsin 2 trafficking and thus reduce the production of aβ. identification of human carnosinase - a brain-specific metalloprotease m. teufel biochemistry, exploratory research, sanofi aventis, strasbourg, france. e-mail: michael.teufel@sanofi-synthelabo.com metalloproteases form a large and diverse family of proteases and are molecular targets that represent an opportunity for therapeutic intervention. in particular, the development of potent inhibitors has made progress for the family of matrix metalloproteases (mmp). the sequencing of the human genome revealed that a significant percentage of the druggable genome is represented by proteases, many of them still with unknown function. in this presentation, data will be presented on the deorphanization of two previously unknown genes by means of bioinformatics and classical biochemistry. this work led to the identification of human carnosinase, a dipeptidase specifically expressed in the human brain and a ubiquitously expressed close homologue, characterized to be a non-specific dipeptidase. stimulating serpins with synthetic tailor-made oligosaccharides: a new generation of antithrombotics m. petitou thrombosis & angiogenesis, sanofi-aventis, toulouse, france. e-mail: maurice.petitou@sanofi-aventis.com we will discuss our research on synthetic oligosaccharides able to selectively activate the inhibitory activity of antithrombin towards various serine proteinases. we first synthesized pentasaccharides closely related to the antithrombin binding domain of heparin [1] (the active site), as well as analogues displaying different pharmacokinetic profiles. selective inhibitors of coagulation factor xa were thus obtained that represent a new class of antithrombotic [2] drugs currently being evaluated worldwide. 
we then designed larger oligosaccharides [3] that inhibit both factor xa and thrombin in the presence of antithrombin. they are devoid of undesired nonspecific interactions with blood proteins, particularly with platelet factor 4. clinical trials are ongoing to prove the therapeutic benefits of this new type of coagulation inhibitors. slow tight binding inhibitors in drug discovery: in the case of dppiv and elastase inhibitors z. kapui, e. boronkay, i. bata, m. varga, e. mikus, k. urban-szabo, s. bátori and p. arányi discovery research, chinoin member of sanofi-aventis group, budapest, hungary. e-mail: zoltan.kapui@sanofi-aventis.com some enzyme inhibitors are extremely potent, causing significant inhibition at very low concentrations that may be comparable to the concentration of the target enzyme. when this inhibition is studied in vitro, complexities arise because the concentration of the inhibitor is so low that it is altered significantly as a result of combination with the enzyme. this situation is referred to as tight-binding inhibition. partly as a result of their low concentrations, tight-binding inhibitors often show slow-binding characteristics. unlike conventional inhibitors that act almost instantaneously (or at least within the ms time scale), slow-binding inhibitors may take several seconds, minutes or even hours for their effect to be fully exhibited. this association between slow-binding and tight-binding is relatively common and slow tight-binding inhibitors are extremely potent and specific. proteolytic enzymes are involved in a multitude of important physiological processes. their intrinsic properties and activities are the focus of wide-ranging research and they have a valuable role in experimental and therapeutic applications. serine proteases are attractive targets for the design of enzyme inhibitors since they are involved in the etiology of several diseases. 
within the class of serine proteases, human leukocyte elastase (hle) is one of the most destructive enzymes in the body. the enzyme dipeptidyl peptidase iv (dppiv) is a serine exopeptidase that cleaves xaa-pro dipeptides from the n-terminus of oligo- and polypeptides. inhibitors of dpp iv are of increasing interest to the pharmaceutical industry, as they may become established as the next member of the oral antidiabetic class of therapeutic agents. the objective of our work was to develop reversible, slow, tight-binding inhibitors against these serine proteases. ssr69071 is a potent inhibitor of hle, the inhibition constant (k_i) and the constant for inactivation process (k_on) being 0.0168 ± 0.0014 nm. this inhibitor is a reversible, slow, tight-binding inhibitor with k_on = (0.183 ± 0.013) × 10^6 m^-1 s^-1 and k_off = (3.11 ± 0.37) × 10^-6 s^-1. ssr69071 inhibits the solubilization of elastin by hle with an ic50 value of 13 nm. this inhibitor is one of the most effective inhibitors of a serine proteinase yet described. ssr162369 is a potent, competitive and slow tight binding type inhibitor of the human dipeptidyl peptidase-iv enzyme (k_i = 2 nm, t½ = 8 h). on the basis of kinetic properties, ssr162369 forms a stable enzyme-inhibitor complex. these slow tight-binding inhibitors have unique inhibitory properties: they are extremely active and selective and form a stable enzyme-inhibitor complex, and therefore they have a long-lasting effect. their oral activity and long-lasting in vivo biological potency agreed very well with the stable enzyme-inhibitor complex. the advantages in drug discovery of slow tight-binding inhibitors are discussed in this presentation. enzyme inhibition trend analysis - a new method for drug design m. shokhen, n. khazanov and a. albeck the julius spokojny bioorganic chemistry laboratory, chemistry, bar ilan, ramat gan, israel. e-mail: albecka@mail.biu.ac.il many of the drugs that are currently in use or at different stages of development are enzyme inhibitors. 
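as a consistency check on the ssr69071 rate constants quoted in the preceding abstract, the short sketch below compares their ratio with the reported inhibition constant. it assumes a simple one-step reversible binding model, in which k_i = k_off / k_on, and it reads the units of the two constants as 10^6 m^-1 s^-1 and 10^-6 s^-1; the function names are illustrative and not taken from the abstract.

```python
import math


def ki_from_rates(k_on, k_off):
    """Inhibition constant (M) for one-step reversible binding: Ki = koff / kon."""
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on


def complex_half_life(k_off):
    """Half-life (s) of the enzyme-inhibitor complex: t1/2 = ln(2) / koff."""
    if k_off <= 0:
        raise ValueError("k_off must be positive")
    return math.log(2) / k_off


# Values as read from the abstract (units are an assumption, see lead-in).
k_on = 0.183e6    # M^-1 s^-1
k_off = 3.11e-6   # s^-1

ki_nM = ki_from_rates(k_on, k_off) * 1e9          # convert M to nM
half_life_h = complex_half_life(k_off) / 3600.0   # convert s to h

print(f"Ki from rate constants: {ki_nM:.4f} nM")
print(f"complex half-life: {half_life_h:.0f} h")
```

under these assumptions the ratio comes out at about 0.017 nm, in line with the reported k_i of 0.0168 nm, and the slow dissociation corresponds to a complex half-life of roughly two and a half days.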
therefore, enzyme mechanism-based inhibitors could be developed into highly selective drugs. our novel enzyme inhibition trend analysis method combines experimental enzyme kinetics data and high level quantum mechanical modeling of enzyme-inhibitor chemical interactions. the method utilizes the principal catalytic reaction scheme of the target enzyme and does not require its 3d structure (a ligand based approach). the method is valid for the prediction of the trend in binding affinity of inhibitors not only for the specific enzyme for which the qsar model was optimized, but also for the whole enzyme family. the methodology would contribute significantly to overcoming the problem of fast mutational resistance developed by pathogens in response to pharmaceutical treatment. it can be used as a computational tool for expert analysis of various hypotheses about structure-activity relationships formulated for the design of new inhibitors. angiotensin-converting enzyme (ace, ec 3.4.15.1) is a key enzyme for blood pressure control and water-electrolyte homeostasis. a large number of highly potent and specific ace inhibitors are used as oral drugs in the treatment of hypertension and congestive heart failure. somatic ace consists of two homologous domains (n- and c-) within a single polypeptide chain, each one containing a catalytic site. the two catalytic sites within the somatic ace molecule were long considered to function independently. however, recent investigations indicate the existence of negative cooperativity between ace active sites. we studied the properties of bovine ace active centers by use of separate ace n-domain (n-ace) obtained by limited proteolysis of parent somatic enzyme and testicular ace, which represents c-domain. these results were compared with the data obtained for full-length somatic ace from bovine lungs. 
the results obtained demonstrate a strongly dependent mechanism of action of ace active centers in the reaction of the hydrolysis of tripeptide substrates. however, the hydrolysis of decapeptide angiotensin i proceeds independently on the n- and c-domains. the mechanism of inhibition of ace activity is also dependent on the length of the inhibitor: (i) random binding of the ''short'' inhibitor molecule (such as captopril, lisinopril) to one of the active sites dramatically decreases binding of another inhibitor molecule to the second site; (ii) ''long'' nonapeptide teprotid binds to both active sites without any difficulties. since the main physiological ace substrates in the organism are ''long'' peptides angiotensin i and bradykinin, the development of a new class of inhibitors with prolonged structure would be beneficial for abolishing ace activity. synthetic peptide studies on severe acute respiratory syndrome coronavirus (sars-cov) extensive proteolytic processing of the replicase polyproteins, pp1a (486 kda) and pp1ab (790 kda), by the sars-cov 3c-like protease (3cl pro ). besides, the structural spike protein of sars-cov contains two heptad repeat regions (hr1 and hr2) that form coiled-coil structures, which play an important role in mediating the membrane fusion process. in this study, we focused on both 3cl pro and the hr regions of sars. previous studies demonstrated that the coronavirus 3cl pro cleaves the replicase polyproteins at no fewer than 11 conserved cleavage sites, preferentially at the lq sequence. the reported crystal structure of sars-cov 3cl pro provides insights into the rational design of anti-sars drugs. in order to understand the molecular basis of the enzyme-substrate binding mechanism, we employ the synthetic peptide and mass spectrometry-based approaches to investigate the significance of selected amino acid residues that are flanking both sides of the sars-cov 3cl pro cleavage site. 
in addition, previous studies indicated that the relatively deep hydrophobic coiled coil grooves on the surface of sars-cov spike protein heptad repeat regions (hr1 and hr2) may be a good target site for the design of viral fusion inhibitors. we have designed and synthesized five truncated peptide analogs derived from hr1 and hr2 peptides based on both bioinformatics and structural analysis. the biological activities of these truncated analogs will be studied using circular dichroism spectroscopy, multidimensional chromatography, protein cross-linking and mass spectrometry-based approach. the above investigation will definitely broaden our knowledge on the sars research and will reveal the feasibility of rational design of synthetic peptide-based drugs in combating sars. ras-transfection-associated invasion: involvement of matrix metalloproteinase(s) confirmed using a chicken embryo model and real time pcr during metastasis tumorigenic cells leave the primary tumour and intravasate into the blood/lymphatic system, exiting at a secondary site to establish a secondary tumour. ras-transfection of a parental, non-invasive mcf-10a cell line, established from a patient suffering with benign fibrocystic disease, gave rise to an invasive derivative cell line (mcf-10a-neot) exhibiting the phenotype of a pre-malignant, invasive tumour. invasion and metastasis are protease-assisted processes, proteases either being secreted by the tumour, or by the stromal cells under the influence of the tumour. here we demonstrate the involvement of matrix metalloproteinase(s) in the invasion of the ras-transfected mcf-10a cell line. tumour cells were inoculated onto the damaged surface of the upper chorioallantoic membrane (cam) of a vasculated 9-day old chick embryo. the tumour cells were allowed to invade, and the number of invading cells quantified using real time pcr. 
inhibitors specific for various proteases were applied to the upper cam, to block invasion, and hence identify the proteinases involved. the number of tumour cells invading into the vascular system was established by sampling the lower cam and quantifying the numbers of alu sequences (present only in human cells) in the dna, isolated from the embryonic tissue, using real-time pcr. using this method, the key role of an mmp was demonstrated. spectrum ptk inhibitor, genistein (100 µm) abolished the release of neutrophil mmp-9, in the presence and absence of extracellular calcium, and reduced the release of timp-1. both pp2 (10 µm), a src family ptk inhibitor, and piceatannol (30 µg/ml), a syk family ptk inhibitor, reduced mmp-9 release substantially, indicating that multiple ptk families might be involved in mmp-9 release. inhibition of either syk or src ptks by piceatannol or pp2 did not appear to influence timp-1 release. low levels of wortmannin (100 nm, inhibition of pi3k) abolished the release of mmp-9 in the absence of calcium, and reduced mmp-9 release in the presence of calcium. investigations into the signaling pathways involved in timp-1 release are continuing. we conclude that mmp-9 release induced by extracellular calcium may be mediated through pi3k and multiple tyrosine kinases, including src and syk family ptks. timp-1 granule release may also be mediated by tyrosine kinases, although src and syk family ptks do not appear to be involved. thermodynamical and structural analysis of cruzain/cruzipain2 complexed with e-64 by molecular modeling and dynamics simulations peptidases represent one of the most relevant enzyme classes targeted by therapeutic intervention. 
to contribute to the assignment of a physiological role to genomic-derived peptidases and to make them more accessible for the drug discovery process, we have undertaken a program consisting of mrna expression profiling, full-length recombinant expression in insect cells, purification and determination of the catalytic activity for the human proteolytic enzymes. a milestone in the process was the construction of a non-redundant comprehensive database for all human peptidases comprising 443 unique annotated entries, by assembling and filtering public domain information and in-house generated data. in order to get an informative picture on their expression profiling, a transcriptome database for 375 human peptidases was created using the microarray (affymetrix™) and taqman® (applied biosystems) technologies. in parallel, we have set up the procedure for pcr amplification and cloning of the peptidase genes in 96 mtp format and we have already created a repository of 231 full-length human cdnas encoding for peptidases. besides, the conditions for miniaturized insect cell cultures have been established. experimental trials have defined a validated, reliable and fully-automated robotic procedure for the purification of recombinantly expressed peptidases in 96 mtp format. in a pilot study using the high-throughput approach, 85% of the chosen reference hydrolases (14) were secreted into the insect cell medium. of them, 66% have been proven to be catalytically active using fluorescent homogeneous assays in 384-well format compatible with the high-throughput screening criteria. the application of this procedure to genomic-predicted peptidases is discussed. comparison of putative glutamate racemases from bacillus species glutamate racemase catalyzes the interconversion between l- and d-glutamic acid and is the cell's source of d-glutamate, a key component in the synthesis of both the bacterial cell wall and the glutamyl capsule. 
bacillus subtilis has two glutamate racemases in its genome, race and yrpc, while b. cereus and b. anthracis have two race genes, race1 and race2. interestingly, race in b. subtilis is the isoform that is essential and has the greater catalytic efficiency, but both race1 and race2 have higher sequence homology to race, 70 and 79% respectively, and share less homology with the yrpc isoform, both at 53%. we have cloned, overexpressed, purified, and are characterizing the kinetic and biophysical properties of the two putative glutamate racemases, race1 and race2 from b. cereus and b. anthracis, and will utilize kinetic and biophysical information to design inhibitors that may result in a novel antibiotic. although these two isoforms share a high sequence similarity, their properties are unique. kinetic data indicate a fivefold difference in catalytic efficiency of race2 compared to that of race1 in the l- to d-glutamate reaction. also, the absence or presence of substrate has an effect on the oligomerization state, details of which will be reported. finally, our collaborators have demonstrated through genetic knock out experiments that only one of the race isoforms is essential for the growth of b. anthracis. we have crystallized the race2 isozyme and x-ray data have been collected to 2.3 å. we are currently solving the structure via heavy-atom derivatives. acknowledgment: this research was funded by nih grant u19 ai056575. anti-inflammatory effects of methionine aminopeptidase 2 inhibition on human b lymphocytes e. janas 1 , r. priest 1 , s. ratcliffe 2 and r. malhotra 1 1 rheumatoid arthritis biology, glaxosmithkline, stevenage, uk, 2 high throughput chemistry, glaxosmithkline, stevenage, uk. e-mail: eva.x.janas@gsk.com processing of n-terminal methionine is an essential post-translational modification in both prokaryotes and eukaryotes regulating the subcellular localization, stability and degradation of proteins. 
the cleavage of the initiator methionine is catalysed by a highly conserved family of metalloproteases, methionine-aminopeptidase 1 and 2 (met-ap2). human met-ap2 is the molecular target of fumagillin, a natural product with antiangiogenic properties, which covalently binds to his 231 in the catalytic site of met-ap2. although fumagillin has been observed to inhibit proliferation and to cause cell cycle arrest in endothelial cells, the mechanism of inhibition is still poorly understood. recent studies describe high expression of met-ap2 in germinal centre b lymphocytes. here, we investigate the effect of the met-ap2 inhibitor fumagillin on b lymphocyte proliferation and cell cycle progression and compare these results to those observed in huvec. in addition our work sheds light on the mechanistic aspects of met-ap2 inhibition by fumagillin and its derivatives. effect of distal mutations on the molecular dynamics of the hiv-1 protease l10i and l90m are the most common distal mutations found in the protease gene of the drug resistant hiv-1 strains. these mutations do not confer resistance by themselves, but induce a large synergy effect when added to active site mutations. understanding the impact of the l90m and l10i mutations on the hiv-1 protease resistance profile is still a challenge. assuming that their contribution to the resistance profile could be mediated by conformational dynamics we have modeled l10i, l90m and l10i/l90m mutants of hiv-1 protease. these unbound mutated and wild type proteases were subjected to 10 ns molecular dynamics simulations and compared using an essential dynamics (ed) analysis protocol. the first eigenvector of the native protease describes the flap opening motion. the following eigenvectors describe ''the catalytic assisting motions'' (cam) of the protease that become dominant upon complex formation with a substrate (piana s et al. j mol biol 2002; 319(2): 567-583). 
mutation of leucine to methionine at position 90 perturbs the protein packing at the dimerization domain. such perturbations affect the dimerization domain motions which correlate with flap opening and the cam. as a result the first eigenvector corresponds to the rotation of one subunit relative to the other along the axis connecting residues 60 and 60'. in other words the l90m mutation mistunes essential motions of the enzyme while retaining its flexibility. this could be the cause of the reduced structural stability of the l90m mutant. in contrast, the l10i mutation causes only a redistribution of the amplitude of the correlated motions. the catalytic assisting motion becomes the most influential, which results in stabilization of the closed conformation. in turn, the flap opening motions are reduced in the l10i mutant. essential dynamics of the double mutant l10i/l90m could be described in the following terms. a strong propagation of the cam induced by the l10i mutation is coupled with the altered conformational space caused by the l90m mutation. as a result the double mutant prefers cam motions that are close to the native protease but also account for the perturbed packing within the dimerization domain. the results presented may help in understanding hiv-1 protease resistance pathways and in developing more efficient inhibitors of known drug resistant mutants. glutamate carboxypeptidase ii as a cancer marker and therapeutical target: two faces of an enzyme glutamate carboxypeptidase ii is a membrane-bound metallopeptidase expressed in a number of tissues such as jejunum, kidney, prostate and brain. the brain form of gcpii (also known as naaladase) is expressed in astrocytes and cleaves n-acetylaspartyl glutamate, an abundant neurotransmitter, to yield free glutamate. gcpii thus represents an important target for the treatment of neuronal damage caused by excess glutamate. 
animal model experiments suggest that specific inhibitors of gcpii could be useful for the treatment of several neuropathic conditions, such as brain stroke, chronic neuropathic pain or amyotrophic lateral sclerosis. at the same time, the enzyme is known as prostate-specific membrane antigen since it is upregulated in prostate cancer. it is used for the diagnosis and experimental therapy of prostate cancer using monoclonal antibodies and specific inhibitors. in order to analyze this important pharmaceutical target, we established an expression system based on drosophila schneider's cells. we have also cloned, expressed and characterized its human homolog gcpiii and homologous carboxypeptidases from pig and rat. using specific monoclonal antibodies, we have been able to study the expression of gcpii in various healthy and malignant tissues. we analyzed the substrate specificity of the enzyme using peptide libraries and identified two novel peptide substrates. availability of a recombinant protein enabled us to introduce a simple fluorescent activity assay and test specific inhibitors. furthermore, we have biochemically characterized the recombinant protein in terms of pharmacologic properties, oligomeric status, ph dependence and activity modulation by metal ions. we have shown that the glycosylation is indispensable for gcpii carboxypeptidase activity and analyzed the role of each specific n-glycosylation site for the gcpii activity and folding. using site-directed mutagenesis, we are able to identify the domains sufficient and necessary for gcpii activity and also suggest a structural explanation for the substrate specificity of the enzyme. dozens of chemicals feature inhibition of proteolytically important tyrosine residue of 20s proteasome by forming covalent bond to hydroxyl group that abolished its catalytic function. in contrast, the approach we utilize here is based on hydrogen and hydrophobic interactions reversibly inactivating all three sites of 20s complexes. 
we performed flexible docking studies of analogues of a natural product tmc-95a using the 1jd2 crystal structure to describe the active site of the protein and the position of the ligand. the search yielded several amide-like derivatives that have been screened for superimposition with tmc-95a. a few of them revealed similar orientation of propylene groups to the active site of 20s. a second screen was performed to reveal the chemicals with the strongest hydrogen-bonding of the ligand to the protein backbone of the receptor. this screen resulted in two chemicals that had strong h-contacts with tyr21, ser129 and, importantly, with the proteolytically active tyr1 residue. to assess the validity of the predicted chemicals we undertook in vitro studies measuring the hydrolysis of fluorogenic substrate by the sds activated 20s proteasome isolated from hela cells. we obtained more than 85% inhibition of 20s proteasome activity upon incubation of the above chemicals (0.5 µg/ml) with proteasomes. we then demonstrated the effectiveness of the obtained chemicals in stabilizing the level of oncosuppressors, including p53, in benign (mcf10a) and highly metastatic (mda231) cell lines. treatment with these compounds greatly restored the level of p53 in cancer cells. finally, we performed a proliferation assay and showed that adding these artificially synthesized chemicals to the mda231 cell line significantly reduced the level of proliferation, whereas mcf10a cells treated under similar conditions did not reveal any abnormal reduction of proliferation below control level. thus, we report a strategy to predict highly suitable proteasome inhibitors that act via inhibition of protease activity and may lead to creation of a new class of drugs for cancer therapy. 
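an inhibition figure such as the one quoted in the preceding abstract is typically derived from the initial rates of fluorogenic substrate hydrolysis measured with and without compound. the sketch below shows that arithmetic only; the function name and the rate values are hypothetical placeholders and are not data from the study.

```python
def percent_inhibition(rate_inhibited, rate_control, rate_blank=0.0):
    """Percent inhibition from initial rates (e.g. fluorescence units per minute).

    rate_blank is the background rate measured without enzyme; it is
    subtracted from both the inhibited and the control rate.
    """
    control = rate_control - rate_blank
    if control <= 0:
        raise ValueError("control rate must exceed the blank rate")
    residual = (rate_inhibited - rate_blank) / control
    return 100.0 * (1.0 - residual)


# Hypothetical rates, chosen only to illustrate the calculation.
example = percent_inhibition(rate_inhibited=14.0, rate_control=92.0, rate_blank=2.0)
print(f"{example:.1f}% inhibition")
```

subtracting the blank from both rates keeps background fluorescence from inflating the apparent residual activity.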
localization and trafficking of prostate specific membrane antigen (psma) and its variant form psm' glutamate carboxypeptidase ii, also known as prostate specific membrane antigen (psma), is a transmembrane glycoprotein highly expressed in malignant prostate tissues. it was shown to represent a very useful diagnostic marker and also a potential therapeutic target for prostate cancer. two forms of the enzyme were identified in the prostate: a full-length transmembrane form consisting of 750 amino acids and a truncated form (called psm'), believed to represent a spliced variant of psma. the cdnas of both forms are identical except for a 266-nucleotide region near the 5' end of psma that is absent in psm'. this deleted region codes for the signal peptide as well as for the intracellular and transmembrane domains. we are able to detect two protein forms in prostate cancer model cells (lncap cells) and we also show that both forms are glycosylated, suggesting that this truncated form might originate from the processing of full length transmembrane psma. a number of methods including differential centrifugation, pulse-chase experiments, immunochemistry and gfp-fusion protein analysis were used to analyze the origin, cell localization and trafficking of psma and psm' in the mammalian cells. we have investigated the substrate specificity of the ns2b(h)-ns3pro protease by using internally quenched synthetic peptides representing both natural cleavage sequences and their recombinant chimeras. synthetic peptides incorporating the o-aminobenzoic acid/3-nitro-l-tyrosine fluorescence donor-quencher pair were used to analyze the minimum substrate length requirement, residue preferences and the contribution of prime side residues for enzymatic cleavage by the ns3 protease. a series of peptides derived from the ns3/ns4a cleavage site was designed for the substrate length mapping study. 
amino acid truncations in the non-prime and prime side region differently affected rates of substrate hydrolysis and binding as shown by their km and kcat values. the optimal substrate identified was a heptapeptide spanning p4-p3'. chimeric substrates with all possible combinations of non-prime and prime side sequences derived from 5 polyprotein cleavage sites (c, 2a/2b, 2b/3, 3/4a and 4b/5) were assayed for reactivity with the ns3 protease. kinetic parameters revealed a strong impact of the non-prime side residues on km, whereas variations in the prime side region had greater effect on kcat. the fluorogenic derivative of tetrabasic peptide rrrr/gtgn (c/ns5) demonstrated the highest affinity, whereas the peptide kkqr/sagm (2b/c) had the highest turnover number. the one with the greatest catalytic efficiency was identified as rrrr/sltl (c/4a). in addition, we have shown that a ser at p1' is the most preferred residue. the discovery of ns3 substrates with maximized reactivity will be useful for inhibitor development in sensitive high-throughput assays. inhibiting the mtor pathway with cci-779 results in decreased production of vascular endothelial growth factor in a head and neck squamous cell cancer cell line c.-a. o. nathan 1,2 , n. amirghahari 1,2 , x. rong 1,2 and y. sun 1,2 1 nathan, otolaryngology, otolaryngology/head and neck surgery, louisiana state university health sciences center, shreveport, la, usa, 2 nathan, cancer center, medicine, feist-weiller cancer center, shreveport, la, usa. e-mail: cnatha@lsuhsc.edu introduction: overexpression of the proto-oncogene eif4e in surgical margins of head and neck squamous cell cancer (hnscc) patients is an independent predictor of recurrence and is associated with increase in vascular endothelial growth factor (vegf) expression. activation of eif4e in margins through the mtor pathway has led us to determine that cci-779 an mtor inhibitor has both in vitro and in vivo growth inhibitory effects in hnscc cell lines. 
we wanted to determine if these effects were associated with a decrease in vegf production. material and methods: a hnscc cell line fadu was treated with 1 and 10 ng/ml of cci-779 (previously established ic50 = 1 ng/ml). elisa was used to determine vegf protein levels in conditioned medium at 30', 1, 2, 4, 6, 24 and 48 h after treatment with the drug and compared to control cells treated with the diluent for each of the time points. results: a significant decrease in vegf production of 70% was noted at 24 h and maintained at 48 h in treated cells when compared to control cells at the same time points. the decrease in vegf levels (39-47%) was noted within 6 h of treatment with the drug. the percent decrease in vegf protein levels was the same for both doses of cci-779. conclusions: overexpression of eif4e in hnscc increases translation of mrnas with long 5'utrs, one of which is an important angiogenic factor vegf. inhibiting the mtor pathway with cci-779 can potentially decrease vegf production. this has future clinical implications for arresting tumor progression in hnscc patients with molecular positive margins identified by cells overexpressing eif4e, also known as minimal residual disease. proteases from cell culture of jacaratia mexicana m. c. oliver-salvador 1 , g. barrera plant proteases are important in food industry and food technology. the latex of jacaratia mexicana, caricaceae, fruits contains a high level of cysteine proteases. in this work, a cell suspension culture of j. mexicana was established. callus culture was initiated from stem explants of j. 
mexicana on medium consisting of ¼-strength and full-strength ms mineral salts (murashige and skoog, 1965), full-strength ms organics and 6 g/l agar supplemented with cytokinins: 6-benzylaminopurine (bap) at 0.5 mg/l and 6-furfurylaminopurine (kinetin) at 0.25 mg/l and various concentrations (0.25, 0.5 and 1.0 mg/l) of auxins: 2,4-dichlorophenoxyacetic acid (2,4-d), 4-amino-3,5,6-trichloropiridin-2-carboxilic acid (picloram), indoleacetic acid (iaa) and α-naphthaleneacetic acid (naa). all of the treatments induced callus except for iaa, naa and the treatment without added phytohormones. the best auxin concentration for callus development was determined to be 0.5 mg/l, and the best medium condition for callus development and proteolytic activity of callus was determined to be 0.5 mg/l 2,4-d + 0.5 mg/l bap. cysteine proteases were produced in callus culture of j. mexicana and liberated into the medium. these enzymes were also secreted in the cell suspension culture. our results support that the synthesis of proteases is possible in in vitro culture of j. mexicana. since protease is a primary metabolite, further improvement in enzyme production is possible by increasing the growth rate and yield of cell culture of j. mexicana. and arginine, from peptides and proteins at neutral ph. it is known to play an important role in the control of peptide hormones, growth factor activity at the cell surface, and in the membrane-localized degradation of extracellular proteins. therefore, the present work was carried out to clone and express carboxypeptidase m in pichia pastoris, aiming at developing specific inhibitors and to evaluate the importance of the enzyme in different physiological and pathological processes. for this purpose, the enzyme's cdna was amplified from total placental rna by rt-pcr and cloned in the vector ppic9, which uses the methanol oxidase promoter and drives the expression of high levels of heterologous proteins in pichia pastoris. 
the results show that the cpm gene, after cloning and transfection, integrated into the yeast genome, and the yeast started to produce the active glycosylated protein. the recombinant protein was secreted into the medium and the enzymatic activity was measured with the fluorescent substrate dansyl-ala-arg. the enzyme was purified by a two-step protocol including gel filtration and ion-exchange chromatography, resulting in a 1761-fold purified active protein at a concentration of 400 mg/l of fermentation medium. sds-page showed that recombinant cpm migrated as a single band with a molecular weight similar to that of the native placental enzyme (62 kda). these results demonstrate for the first time the establishment of a method using pichia pastoris to express human carboxypeptidase m.
mutational analysis of active site of glutamate carboxypeptidase ii
human glutamate carboxypeptidase ii (gcp ii) is a membrane metallopeptidase expressed predominantly in the nervous system, prostate and small intestine. in the brain, gcp ii catalyzes cleavage of the abundant neuropeptide n-acetyl-l-aspartyl-l-glutamate (naag) to n-acetylaspartate and glutamate. gcp ii is a type ii transmembrane glycoprotein with a short cytoplasmic n-terminal region (amino acids 1-18), a transmembrane domain (amino acids 19-43) and a large extracellular domain (amino acids 44-750) where the active site of the enzyme is situated. gcp ii, as a cocatalytic zinc metallopeptidase, has two zn2+ ions in the active site which are necessary for its enzymatic activity. recently, the crystal structure of gcp ii was determined in our laboratory and amino acids arg210, asn257, lys699 and tyr700 were proposed to bind the c-terminal glutamate of naag (mesters et al., manuscript in preparation). in the present study, we carried out site-directed mutagenesis to assess the influence of these amino acid residues on the activity of gcp ii.
in addition, the glutamic acid at position 424, which is proposed to be involved in the proton shift during the catalytic hydrolysis of the peptide bond, was mutated to alanine. all the mutant proteins were expressed in insect cells, purified to near homogeneity and enzymatically characterized. it was shown that a mutation at any of these positions leads to significantly reduced naag-hydrolyzing activity. the substitution of glu424 almost completely abolished the enzymatic activity, suggesting that glu424 is crucial for the enzymatic activity of gcp ii. kinetic characterizations of the mutant proteins and their substrate specificities will be presented in comparison with wild-type gcp ii.
comparative study of mammalian homologues of human glutamate carboxypeptidase ii
glutamate carboxypeptidase ii (gcpii) is a membrane-bound metallopeptidase. in homo sapiens, gcpii was shown to be expressed in various tissues, mostly in the central nervous system, small intestine and prostate. in the brain it hydrolyses n-acetylaspartylglutamate (naag), which is the most prevalent peptide neurotransmitter in the mammalian nervous system, to form glutamate and n-acetylaspartate. in the small intestine gcpii plays an important role in folate absorption. in the prostate its function is still unknown. it was shown that inhibition of gcpii is neuroprotective in many neurodegenerative states. according to current knowledge of this enzyme, its role may also be important in prostate (and possibly other) cancers, where its expression is dramatically changed in comparison with healthy tissue. gcpii is thus becoming an important therapeutic target and diagnostic molecule. in order to analyze structure-activity relationships in related glutamate carboxypeptidases, we set out to study the mammalian homologues of human gcpii: gcpii of rattus norvegicus, sus scrofa and mus musculus, which have approximately 90% dna sequence similarity to human gcpii.
information on the biochemical properties, expression pattern and structural similarity is crucial, e.g. for testing gcpii inhibitors in animal models. we have cloned and expressed recombinant gcpii of r. norvegicus and s. scrofa in insect cells with the aim of obtaining pure recombinant protein sufficient for structural analysis. data on the biochemical comparison of the rat, pig and human gcpii forms will be presented and interpreted in the light of the gcpii structure.
structural analysis of pla protein from y. pestis: docking and molecular dynamics of interactions with mammalian plasminogen system
e. ruback and p. g. pascutti laboratório de modelagem e dinâmica molecular, departamento de biofísica, universidade federal do rio de janeiro, rio de janeiro, r.j. brazil. e-mail: eruback@biof.ufrj.br
the plasminogen (plg) system is an important mechanism for cell migration through the tissues of mammalian organisms. some bacterial agents can activate this system with proteases and lead to uncontrolled degradation of extracellular matrix (ecm) components, which gives these infections an invasive character. the y. pestis protein pla is a plasmid-encoded outer membrane protein with aspartic-protease activity and is closely associated with the proteolytic activation of plg to its serine-protease form, plasmin. exactly how pla activates plg to plasmin remains unclear. in this work we predicted the interaction between plg and the pla protein by rigid-body docking with hex and evaluated the stability of the complex by molecular dynamics (md) using gromacs. to evaluate the docking accuracy we used the crystal structure of the plg-streptokinase complex. the md results show more stability in the docked plg-streptokinase complex than in the crystal complex, as observed by rmsd and rmsf calculations after 2 ns in the simulation box. the pla model was constructed with spdb-viewer using the pdb structure of ompt as template, and the quality of the model was evaluated with procheck.
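the rmsd and rmsf stability measures mentioned above can be illustrated with a minimal sketch. it assumes that all frames are already superimposed on the reference structure and that coordinates are plain (x, y, z) tuples; the function names and the toy coordinates are hypothetical and are not data from the study, and gromacs provides its own dedicated analysis tools for real trajectories.

```python
import math


def rmsd(frame, reference):
    """Root-mean-square deviation between two equally sized coordinate sets."""
    if len(frame) != len(reference):
        raise ValueError("coordinate sets must have the same number of atoms")
    total = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(total / len(frame))


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation around the time-averaged position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # time-averaged position of atom i
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # mean squared displacement from that average
        msf = sum(
            (frame[i][k] - mean[k]) ** 2
            for frame in trajectory
            for k in range(3)
        ) / n_frames
        result.append(math.sqrt(msf))
    return result


# toy two-atom, two-frame trajectory (hypothetical coordinates, in nm)
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.2, 0.0)],
]
rmsd_per_frame = [rmsd(frame, reference) for frame in trajectory]
rmsf_per_atom = rmsf(trajectory)
```

rmsd gives one value per frame and tracks how far the whole structure drifts from the reference, whereas rmsf gives one value per atom and highlights which regions are most mobile.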
the docked plg-pla complex shows the same interaction site as predicted by mutagenesis studies. after 8 ns of md (72 083 atoms in the box), we observed relaxation of the beta-barrel structure of pla and progressive approach and stabilization of the cleavage site of plg within the extracellular loops of pla, followed by an increase in the number of hydrogen bonds. in this study we report the amino acids that may participate in the active site and in the interaction subsites. a full understanding of these interactions can be an important tool for drug design against bacterial proteases.
glutamate carboxypeptidase ii (gcpii), also known as naaladase i, folylpolyglutamate hydrolase (folh) or prostate-specific membrane antigen (psma), is found in a number of tissues. in brain astrocytes, it regulates neurotransmission by cleaving the neurotransmitter n-acetylaspartylglutamate (naag) into n-acetylaspartate and the most common excitatory neurotransmitter, glutamate. inhibition of gcpii activity protects against cell death after brain stroke. in animal models it has also been shown that specific inhibitors of gcpii could be useful for the treatment of chronic neuropathic pain, amyotrophic lateral sclerosis and other pathologic situations in which excess glutamate is neurotoxic. gcpii is identical to prostate-specific membrane antigen (psma), a tumor marker in prostate cancer. gcpii is also found in the membrane brush border of the small intestine, where it acts as a folate hydrolase. this reaction expedites intestinal uptake of folate through hydrolysis of folylpoly-gamma-glutamates to monoglutamyl folates. gcpii inhibitors might thus be useful in the imaging and treatment of tumors that require folate for their growth. therefore it was of interest to investigate whether gcpii might be upregulated in brain tumors as well.
in order to analyze this possibility, we took 57 samples from 49 patients with brain tumors treated in faculty hospital motol during 1999-2004 and determined the expression and activity of gcpii by western blots and immunohistochemistry using monoclonal and polyclonal antibodies developed against extracellular epitopes of gcpii. moreover, we characterized the enzymatic activity of the enzyme in human samples and correlated the expression of gcpii with the type and grade of the tumor.
search for optimal isosteres in beta-secretase peptidic inhibitors
alzheimer's disease is a widespread, neurodegenerative, dementia-inducing disorder. it is ascribed to the presence of a lesion in several brain regions, the neuritic plaques, which are extraneuronal accumulations of β-amyloid protein (aβ), a 42-aa insoluble peptide which, intermingled with the axons and dendrites of neurons, interrupts the synaptic process and causes neuronal death. the peptide aβ is a product derived by proteolytic cleavage from a larger transmembrane cell protein termed amyloid precursor protein, app. two enzymes are involved in this cleavage: β-secretase and γ-secretase. the first one cuts app between met671 and asp672 to generate the n-terminus of aβ in the rate-limiting step of the process, while the second one cleaves at various places within a sequence between amino acids 712 and 717 to generate the respective c-terminus. using a combination of molecular modeling techniques, we have designed a set of novel β-secretase peptidic inhibitors with a variety of isosteres, starting from the available crystallographic structure of this enzyme bound to the inhibitor om99-2. some of the resulting ligands are predicted to have higher affinity for this enzyme than the starting compound. these inhibitors have been synthesized, their β-secretase affinity tested and cell assays performed to determine their ability to preclude the formation of aβ peptides in cell cultures.
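the residue arithmetic behind the two cleavage events can be made explicit with a short sketch. it assumes the numbering used in the abstract, in which the first cut falls between met671 and asp672, so that the peptide begins at residue 672; the function names and the range of c-terminal residues are illustrative only and are not taken from the study.

```python
ABETA_START = 672  # first residue after the cut between met671 and asp672


def abeta_length(last_residue, first_residue=ABETA_START):
    """Number of residues in a peptide spanning first_residue..last_residue inclusive."""
    if last_residue < first_residue:
        raise ValueError("c-terminal residue must not precede the n-terminal residue")
    return last_residue - first_residue + 1


def c_terminal_residue(length, first_residue=ABETA_START):
    """Last residue of a peptide of the given length that starts at first_residue."""
    return first_residue + length - 1


# last residue of a 42-residue peptide that starts at asp672
last_for_42 = c_terminal_residue(42)

# peptide lengths for cuts made after residues 712..716,
# i.e. within the 712-717 window quoted in the abstract
lengths = {last: abeta_length(last) for last in range(712, 717)}
```

with this numbering, the 42-residue peptide described in the abstract ends at residue 713, so the second cut would fall between residues 713 and 714.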
schizophrenia and bipolar affective disorder (bd) are two neuropsychiatric diseases with high social and economic costs. in spite of the prevalence of these diseases, no effective long-term treatments are currently available. the enzyme prolyl oligopeptidase (pop) shows increased activity in both illnesses. this serine protease hydrolyzes peptide hormones and neuropeptides at the carboxyl end of proline residues. because of the relevance of pop as a therapeutic target, many specific inhibitors of this protein have been developed in recent years. the inhibitors ono-1603, jtp-4819 and s-17092-1 are currently in clinical trials. s-17092-1 has been administered safely to humans and has been proposed as a potential treatment for cognitive disorders associated with cerebral aging. our aim is to develop new peptide inhibitors of human pop. to obtain the human brain pop required for our studies, the cdna corresponding to the enzyme was cloned and subsequently expressed in e. coli. pop activity was monitored by 19f-nmr using a newly synthesized pop substrate labeled with 19f. this substrate allowed us to perform the inhibition assay while avoiding the interference problems of colorimetric and fluorimetric assays, and was suitable for high-throughput screening of new pop inhibitors. different strategies were used to find putative human pop inhibitors: in silico screening and solid-phase synthesis of candidates, and screening of extracts of chinese medicinal plants. furthermore, nmr studies were performed with the purified human enzyme by labeling the protein isotopically with 15n and d2o and by selective labeling of the methionine and tryptophan residues with 13c. nmr spectra of the labeled protein were obtained at 800 mhz by applying trosy techniques. nmr will provide structural information for structure-based drug design of new pop inhibitors in the future, as well as for studying the interaction of the candidates with the active site of the enzyme.
the crucial regulatory function of the membrane type 1-matrix metalloproteinase (mt1-mmp or mmp-14) in connective tissue metabolism, pericellular proteolysis of extracellular matrix (ecm) components, zymogen activation and angiogenesis was demonstrated with the severe phenotype of the mt1-mmp-deficient mice. this membrane-anchored enzyme is not only essential for normal development of hard tissues, but is also highly expressed in different human cancers, where its level frequently correlates with malignant parameters. in most cases the high level of mrna or elevated level of protein can be predictive for disease development, but these parameters only partly reflect the expression and forms of mt1-mmp in pathological conditions. biosynthesis, trafficking, intracellular activation, internalization, protein-protein interactions, and the level of physiological inhibitors (timps) strictly influence the activity of mt1-mmp in cells and tissues. in our experimental system, we followed mt1-mmp processing and shedding and characterized the cell-associated and released forms of the enzyme (jbc 2000; 275: 12080-12089; jbc 2002; 277: 26340-26350 and biochem j 2005; 386: 1-10). we found active and inactive truncated forms of mt1-mmp as a result of treatments or experimentally generated imbalance with timps. we have also developed approaches to identify mt1-mmp forms in tumor tissues. here we present and discuss different strategies to identify mmp-14 in diverse biological samples. because mt1-mmp endows tumor cells with the ability to invade and metastasize, these strategies can provide valuable information on the role and function of this key protease.
contribution of calpain to cellular damage in human retinal pigment epithelium cultured with zinc chelator y. tamada 1 , t. nakajima 1 , t. r. shearer 2 and m.
azuma 1,2 1 research laboratories, senju pharmaceutical co., ltd., kobe, hyogo, japan, 2 departments of integrative biosciences, oregon health & science university, portland, or, usa. e-mail: yoshiyuki-tamada@senju.co.jp
purpose: we previously showed involvement of calcium-dependent cysteine proteases (calpains, ec 3.4.22.17) in neural retina degeneration induced by hypoxia and ischemia-reperfusion. age-related macular degeneration (amd) is one of the leading causes of loss of vision. amd involves degeneration of the neural retina due to dysfunction and degeneration of the retinal pigment epithelium (rpe). rpe performs critical functions for the neural retina, such as phagocytosis of shed rod outer segments. the purpose of the present study was to determine the contribution of calpain-induced proteolysis to damage in human rpe. the zinc chelator tpen was used to induce cellular damage since zinc deficiency is a suspected risk factor for amd. methods: third- to fifth-passage cells from human rpe were cultured with tpen. leakage of ldh into the medium was measured as a marker of rpe cell damage. activity of calpains was assessed by casein zymography, and proteolysis of calpain substrates was detected by immunoblotting. to confirm calpain-induced proteolysis, calpain in homogenized rpe was also activated by addition of calcium. results: tpen caused ldh to leak into the medium from rpe cells, and the calpain inhibitor sja6017 inhibited the leakage. casein zymography and immunoblotting for calpain and α-spectrin showed activation of calpain in rpe cultured with tpen. proteolysis by activated calpain was confirmed by addition of calcium to homogenized rpe. conclusion: these results suggest that activation of calpain contributed to rpe damage induced by tpen in vitro.
acknowledgments: dr shearer has substantial financial interest (research contract and consulting fee) in senju pharmaceutical co., ltd., and dr azuma is an employee of senju pharmaceutical co., ltd., a company that may have commercial interest in the results of this research and technology. this potential conflict of interest has been reviewed and managed by the ohsu conflict of interest in research committee.
in vivo and molecular risk factors of chloroquine or pyrimethamine-sulfadoxine treatment failure in children with acute uncomplicated falciparum malaria
the risk factors associated with chloroquine (cq) or pyrimethamine-sulfadoxine (ps) treatment failure were evaluated in 691 children enrolled prospectively in six antimalarial drug trials between july 1996 and july 2004 in a hyperendemic area of southwestern nigeria. following treatment, 149 (39%) of 389 children given cq and 64 (22%) of 302 children given ps failed treatment by day 7 or 14. in a multiple regression model, four factors were found to be independent risk factors for cq treatment failure at enrolment: age <7 years [adjusted odds ratio (aor) = 0.46, 95% confidence interval (ci) 0.26-0.84, p = 0.01], asexual parasitaemia >100 000/µl (aor = 0.46, 95% ci 0.23-0.93, p = 0.03), presence of gametocytaemia (aor = 0.48, 95% ci 0.26-0.88, p = 0.02) and enrolment more than 4 years after commencement of the study, that is, after 2000 (aor = 0.47, 95% ci 0.25-0.77, p = 0.003). following treatment with cq, two factors were independent risk factors for failure of treatment: delay in parasite clearance >3 days (aor = 0.26, 95% ci 0.1-0.7, p = 0.014) and presence of gametocytaemia on day 7 or 14 (aor = 2.5, 95% ci 1.1-6.0, p = 0.03). in those treated with ps, two factors were found to be independent risk factors for ps treatment failure at enrolment: age <1.5 years (aor = 2.9, 95% ci 1.3-6.4, p = 0.009) and presence of fever (aor = 0.3, 95% ci 0.14-0.78, p = 0.01).
following treatment with ps, delay in parasite clearance >3 days (aor = 0.39, 95% ci 0.18-0.84, p = 0.016) was an independent risk factor for failure of treatment. the quintuple mutant, made up of triple dhfr (asn-108, arg-59 and ile-51) mutant alleles and double dhps (gly-437 and glu-540) mutant alleles, was found in isolates obtained from 33% of patients and was significantly associated with ps treatment failure (p = 0.001), while pfcrt and pfmdr-1 mutant genes did not significantly predict cq treatment failure in these patients. these findings may have implications for malaria control efforts in sub-saharan africa, where control of the disease depends almost entirely on antimalarial monotherapy.
development of high-throughput assay of lethal factor using native substrate
m.-y. yoon department of chemistry, hanyang university, seoul, south korea. e-mail: myyoon@hanyang.ac.kr
the design of inhibitors of anthrax lethal factor (lf) is currently of interest as an approach to the treatment of anthrax because lf plays a major role in the cytotoxicity toward target cells. lf is a zinc-dependent metalloprotease that specifically cleaves the mitogen-activated protein kinase kinase (mapkk) family. current assay systems for the screening of lf inhibitors use an optimized synthetic peptide coupled with various kinds of fluorophores, which enables fast, sensitive, and robust assays suited to high-throughput screening. however, several lines of evidence suggest that the regions beside the cleavage site are also involved in the specificity and proteolytic activity of lf. in the present study, we tried to develop a high-throughput assay for lf activity based on the native substrate, mek1. the assay system relies on the ecl signal resulting from a specific antibody against the c-terminal region of the native substrate. a glutathione-coated multiwell plate was used as a solid support to immobilize the native substrate by its n-terminal gst moiety.
immobilizing the substrate increases the specificity and sensitivity of lf-catalyzed substrate hydrolysis compared with the solution-phase assay. this assay system is expected to enable the discovery of a wide spectrum of anthrax inhibitors.
while significant progress has been made over the past decade in elucidating the structure and enzymatic mechanism of the 20s proteasome, our understanding of its assembly pathway and the role of the propeptides in the maturation process is still substantially incomplete. similarly, the mechanisms involved in the translocation of substrates into the central nanocompartment are only dimly understood at present. we have used the rhodococcus proteasome to dissect the assembly pathway, combining mutagenesis and crystallographic studies. for the thermoplasma proteasome we have established a ''host-guest'' interaction system which allows us to follow the translocation of specific substrates into the interior of the proteasome by electron microscopy, mass spectrometry and x-ray crystallography.
transferring substrates to the 26s proteasome in the fission yeast schizosaccharomyces pombe
c. gordon mrc human genetics unit, western general hospital, edinburgh, uk. e-mail: colin.gordon@hgu.mrc.ac.uk
the ubiquitin pathway is found in all eukaryotes. in this pathway, target proteins are covalently modified by the addition of ubiquitin, a 76 amino acid protein, to specific lysine residues. the ability of multi-ubiquitin chains to function as a signal to target proteins for degradation by the 26s proteasome is well documented. a key question is how the multi-ubiquitin chain is recognized as a signal. fission yeast rhp23/rad23 and pus1/rpn10 represent two families of multi-ubiquitin chain binding proteins that can associate with the proteasome as well as with some e3 ubiquitin ligases. they seem to provide a link to shuttle ubiquitinated substrates from the e3 ubiquitin ligases to the 26s proteasome.
a detailed characterization of their proteasome binding will be presented along with their potential role in ubiquitin conjugate dynamics. finally, data will be presented indicating that an additional substrate presentation pathway exists in fission yeast which is also conserved in higher eukaryotes.
non-proteasomal rpn10 raises the threshold for association of a ubiquitin-binding protein with the proteasome
the ubiquitin proteasome pathway is responsible for the removal of the vast majority of short-lived proteins in the cell. in order to be degraded, a protein substrate is tagged with polyubiquitin and delivered to the proteasome, where it is proteolysed. a slew of shuttle proteins is thought to mediate the delivery of polyubiquitinated substrates, although the mechanism remains elusive. one such family of proteins comprises rad23, dsk2 and ddi1, which all bind polyubiquitinated substrates through a ubiquitin-associated domain (uba) and bind the proteasome through their ubiquitin-like domain (ubl). another potential shuttle, structurally unrelated to the ubl-uba family, is rpn10. rpn10 is found as an integral subunit of the proteasome as well as in an unincorporated pool. we characterized the interactions of these proteins with individual proteasomal subunits, as well as among themselves. we find unique relationships between the putative shuttle proteins and the proteasome, pointing to functional dissimilarity among them. strikingly, unincorporated rpn10 interferes with binding of dsk2 to the proteasome. thus, we propose that rpn10 might play a negative role in proteolysis through its action on dsk2.
proteins modified by multi-ubiquitin chains are usually targeted for degradation by the proteasome. in other cases, ubiquitylation mediates protein sorting or regulates other functions. a striking example of a non-proteolytic role of ubiquitin is the rad6 dna damage bypass at stalled replication forks.
key elements of this pathway are two ubiquitin-conjugating enzymes, rad6 and the mms2/ubc13 heterodimer, which are recruited to chromatin by the ring-finger ubiquitin ligases rad18 and rad5, respectively. moreover, the sumo-conjugating enzyme ubc9 is also affiliated with the pathway, and we discovered that proliferating cell nuclear antigen (pcna), a dna-polymerase sliding clamp involved in dna synthesis and repair, is a substrate. pcna is (i) mono-ubiquitylated by rad6/rad18, (ii) modified by lysine (k) 63-linked multi-ubiquitylation, which additionally requires mms2/ubc13/rad5, and (iii) sumoylated by ubc9. all three modifications affect the same lysine residue of pcna, indicating that they label pcna for alternative functions. indeed, we discovered that mono-ubiquitylation of pcna promotes an error-prone replication bypass, whereas k63-linked multi-ubiquitylation mediates error-free replication across the lesions. in contrast, sumoylation, which occurs even in the absence of dna damage, prevents recombination between homologs at the replication fork. these findings indicate that mono-ubiquitin, k63-linked multi-ubiquitin chains, and sumo are crucial for decision making at the replication fork.
ubiquitin-mediated proteolysis is the primary mechanism in eukaryotes for degrading unwanted and misfolded proteins. through the cascade of e1, e2 and e3 enzymes, ubiquitin monomers are attached sequentially to the target proteins, which are then recognized and degraded by the 26s proteasome. the selection and specific timing of polyubiquitination of the target proteins are conferred by different e3 ubiquitin ligases. the anaphase-promoting complex (apc) is one of the most extensively studied e3 ubiquitin ligases and plays an essential role in the cell cycle and in specific developmental processes. the core apc is composed of 11-13 subunits. except for apc2 and apc11, relatively little is known about the role of the other apc subunits or the assembly of the complex.
two wd40-repeat activator proteins, cdc20 and cdh1, determine stage-specific activation of the core apc as well as selection and binding of the apc substrates. in plants, the apc activators are present in multiple copies. arabidopsis contains 5 cdc20 genes and 3 cdh1-type activators known as ccs52a1, ccs52a2 and ccs52b. our work has focused on the function of apc activators in the cell cycle and plant development, on the identification of novel apc substrates and on the assembly of the apc complexes. apc activities, based on the expression profiles of the cdc20 and ccs52 genes, will be presented at the organism level. using detailed protein interaction studies in the yeast two-hybrid system and in arabidopsis protoplasts or transgenic plants, we shall demonstrate how the core apc interacts with the activators and substrates, and propose a model for apc assembly.
characterization of substrate delivery to the saccharomyces cerevisiae proteasome by quantitative shotgun proteomics
the proteasome is the central protein degradation machinery in the eukaryotic cell. in conjunction with the ubiquitin system, it is responsible for constitutive bulk protein turnover as well as the controlled degradation of regulatory proteins. the system is very well characterized, but the mechanism by which poly-ubiquitinated substrates are delivered to the proteasome remains unclear. recently our lab has proposed a number of proteins to be proteasome-based receptors for poly-ubiquitinated substrates in s. cerevisiae (rpn10p, rad23p, dsk2p; verma et al., 2004). others (e.g. richly et al. 2005) have put forward a complex model for the delivery of substrates from the ubiquitinating machinery to the proteasome involving the aaa atpase cdc48p. by analyzing the composition of affinity-purified proteasome complexes from s. cerevisiae cells lacking these factors and/or exposed to specific proteasome inhibition, we hope to further elucidate the substrate delivery pathway.
ubiquitinated proteins recruited to the proteasome are identified utilizing capillary chromatography in-line with electrospray ion trap mass spectrometry (mudpit; link et al. 1999). using a reference strain grown in minimal medium solely providing heavy nitrogen (15n) as an internal standard, we are able to record even gradual fluctuations in sample composition. differences in the recruitment of substrates to the proteasome in varying mutant backgrounds will shed light on the specificity of proteasome substrate receptors and the topology of the substrate delivery mechanism.
oxidative protein damage by reactive oxygen species (ros) produces cross-linking, fragmentation and biochemical modification of the amino acids, resulting in biological dysfunctions. quercetin, a widely distributed bioactive plant flavonoid, possesses anti-cancer, antioxidant and free radical scavenging activities, and it also binds to dna, causing dna fragmentation. little is known about oxidative protein damage and its modification by antioxidants. therefore, the aim of the present work was to investigate the molecular mechanisms of the antioxidant and pro-oxidant activities of quercetin toward proteins. the antioxidant activities of quercetin, such as superoxide dismutase (sod)- and catalase (cat)-mimetic as well as hydroxyl radical (•oh) scavenging activities, were assessed. bovine serum albumin (bsa) was incubated with different concentrations of quercetin. quercetin has high sod- and cat-like and hydroxyl radical (•oh) scavenging activities. its activities are concentration dependent. quercetin fragmented bsa into specific fragments, which were detected by sds/polyacrylamide gel electrophoresis. oxidative protein damage was assessed as tryptophan oxidation and as generation of carbonyl, quinone and advanced oxidation protein products (aopp). the increase in protein oxidation products was concentration dependent.
the carbonyl and quinone contents and aopp were highly significantly elevated in quercetin-treated proteins when compared with the control sample. the tryptophan fluorescence was markedly lower in the treated protein than in the control sample. the mechanisms of the antioxidant and pro-oxidant activities of quercetin are discussed. these results demonstrate that the antioxidant quercetin may potentiate protein damage via generation of oxygen free radicals, particularly •oh radicals.
protein stability mediated by a hyaluronan-binding deubiquitinating enzyme is involved in cell viability
protein degradation by the ubiquitin system plays a crucial role in numerous cellular signaling pathways. deubiquitination, a reversal of ubiquitination, has been recognized as an important regulatory step in the ubiquitin-dependent degradation pathway. we have identified three novel genes encoding deubiquitinating enzymes, vdub1, vdub2, and vdub3 (villi deubiquitinating enzyme 1, 2, and 3), from human chorionic villi by rt-pcr. their cdnas are 1,593 bp in length and encode an open reading frame of 530 amino acids with a molecular weight of approximately 58 kda. expression analysis showed that vdub transcripts are highly expressed in the heart, liver, and pancreas. in addition, they are expressed in various human cancer cell lines. amino acid sequence analysis revealed that they contain the highly conserved cys, his, and asp domains, which are required for the formation of the active site of deubiquitinating enzymes. in vivo and in vitro deubiquitinating enzyme assays indicated that vdub1, vdub2, and vdub3 have deubiquitinating enzyme activity. here, we show that the overexpression of vdub proteins leads to irregular nuclear morphology and apoptosis, suggesting that these vdubs play an important role in regulating signal transduction involved in cell death.
interestingly, the sequence analysis showed that vdub proteins contain putative hyaluronan/mrna-binding motifs, and cetylpyridinium chloride-precipitation analysis confirmed the association between vdubs and intracellular hyaluronan and rna.
chemical cleavage of peptide (amide) bonds usually requires harsh conditions. as a result of side reactions and the lack of specificity, chemical amide bond hydrolysis is not a preferred means of protein digestion. we have discovered selective cleavage of peptide bonds in proteins under milder conditions than any previously reported chemical method. hydrolysis takes place in aqueous buffers in a ph range of 4-10, and occurs c-terminal to the proteogenic non-natural amino acid azido-homoalanine (azhal), effected by a staudinger reaction after addition of the mild and biocompatible reagent tris(carboxyethyl)phosphine (tcep). the key feature of the suggested reaction mechanism is the unprecedented nucleophilic substitution of the resulting gamma-iminophosphorane by the flanking c-terminal backbone amide oxygen atom. after hydrolysis, the new c-terminal peptide is present as a homoserine lactone residue and the n-terminal peptide as its free amine. this new reaction may find application as a very mild and selective bio-orthogonal degradation pathway in biochemistry and biomaterials science.
overexpression of proteasome β5 subunit increases amount of assembled proteasome and confers ameliorated response to oxidative stress and higher survival rates
the proteasome is the major cellular proteolytic machinery responsible for the degradation of both normal and damaged proteins. proteasomes play a fundamental role in retaining cellular homeostasis. alterations of proteasome function have been recorded in various biological phenomena including aging. we have recently shown that the decrease in proteasome activity in senescent human fibroblasts relates to the down-regulation of β-type subunits.
in this study we have followed up our preliminary observation by developing and further characterizing a number of different human cell lines overexpressing the β subunit. stable overexpression of the β5 subunit in wi38/t and hl60 cells resulted in elevated levels of other β-type subunits and increased levels of all three proteasome activities. immunoprecipitation experiments have shown increased levels of assembled proteasomes in stable clones. analysis by gel filtration has revealed that the recorded higher level of proteasome assembly is directly linked to the efficient integration of "free" (non-integrated) β-type subunits that accumulate in vector-transfected cells. in support of this, we have also found low pomp levels in β5 transfectants, revealing an increased rate/level of proteasome assembly in these cells as opposed to vector-transfected cells. functional studies have shown that β5-overexpressing cell lines show enhanced survival following treatment with various oxidants. moreover, we demonstrate that this increased rate of survival is due to higher degradation rates following oxidative stress. finally, as oxidation is considered to be a major factor contributing to aging and senescence, we have overexpressed the β5 subunit in primary imr90 human fibroblasts and observed a delay of senescence by 4-5 population doublings. in summary, these data demonstrate the phenotypic effects following genetic up-regulation of the proteasome and provide insights towards a better understanding of proteasome regulation.

expression levels of the components of the ubiquitin/proteasome pathway in pisum sativum seedlings under anoxia stress

change in gene expression: proteins produced under aerobic conditions are no longer synthesized and are replaced by the so-called anaerobic peptides. among the proteins synthesized under o2 deficiency, some enzymes of the glycolytic and fermentative pathways have been identified in plants.
upon reintroduction of air, the anaerobic mrnas disappear rapidly and the increased levels of those enzymes must return to basal levels. the ubiquitin/proteasome system is a major pathway of proteolysis in eukaryotic cells and may contribute to controlling the intracellular levels of a variety of short-lived regulatory proteins. in this proteolytic pathway, proteins are covalently conjugated to ubiquitin, which flags them for rapid hydrolysis by the 26s proteasome. long polyubiquitin chains must be formed to target a protein for destruction by the proteasome. in plants, the ubiquitin-mediated proteolytic pathway is implicated in a variety of cellular processes, including stress responses. in this study, 3-day-old pisum sativum seedlings were subjected to: (i) 15 h of anoxia stress; (ii) 2 h of aerobic conditions after 15 h of anoxia stress and (iii) 4 h of aerobic conditions after 15 h of anoxia stress. the levels of free and conjugated ubiquitin were detected by immunoblotting using anti-ubiquitin polyclonal antibodies. the changes in the mrna levels of some components of the ubiquitin/proteasome pathway in the seedlings were determined by relative semiquantitative rt-pcr. the results suggest an involvement of the ubiquitin-mediated proteolytic pathway in the anoxia stress response.

b2-012p involvement of the anaphase promoting complex in plant development

controlled degradation of short-lived proteins via ubiquitin-dependent proteolysis by the 26s proteasome is a key mechanism in eukaryotes that regulates nearly all fundamental cellular processes, including the cell cycle. polyubiquitination of the protein substrate is sufficient to target it for degradation by a large atp-dependent multicatalytic protease, the 26s proteasome. the selection and specific timing of ubiquitination of the target proteins are conferred by different e3 ubiquitin ligases.
the anaphase promoting complex (apc) is one of the e3 ubiquitin ligases, which by ordered destruction of various cell cycle proteins has fundamental roles in the regulation of mitotic and endoreduplication cycles. the apc also functions outside the cell cycle. in post-mitotic cells, the cdh1 adaptor protein ensures stage-specific activation and substrate selection of the apc. in plants, two classes of cdh1-type activators have been identified, ccs52a and ccs52b, which display differential regulation during the cell cycle and plant development as well as differences in their substrate specificities. in arabidopsis, transient and complementary expression profiles of the atccs52a1, atccs52a2 and atccs52b genes indicate apc functions during flower development. to identify apc targets, yeast two-hybrid screens were performed in the laboratory. out of about 200 interacting proteins, several were transcription factors, including a key regulator of flower development. data on the interactions of the ccs52 proteins and transcription factors in arabidopsis protoplasts will be presented, as well as a model for the apc-regulated pathways.

novel effects of ubiquitin system and chaperone proteins on the prion "life cycle" in yeast

t. a. chernova 1, k. d. allen 2, e. p. tennant 2, k. d. wilkinson 1 and y. o. chernoff 2. 1 department of biochemistry, emory university, atlanta, ga, usa, 2 school of biology and institute for bioengineering and bioscience, georgia institute of technology, atlanta, ga, usa. e-mail: tcherno@emory.edu

yeast prion [psi+], the self-propagated aggregated isoform of the translation termination factor sup35, is used as a model system to study neural inclusion disorders. prion aggregates and other neural inclusions in mammals were previously reported to sequester ubiquitin (ub). proteasome inhibitors affected the turnover of mammalian prion proteins.
however, the role of ub-dependent proteolysis in the prion "life cycle" has not been clearly defined. chaperone proteins, which are also implicated in ub-dependent proteolysis, have been shown to influence the formation and propagation of prion aggregates. our results uncover a connection between alterations of the ub system and chaperone proteins in their effects on the maintenance of the yeast prion. we have demonstrated that deletions of genes encoding deubiquitinating enzymes that are critical for ub regeneration at the proteasome (ubp6) or the vacuole (doa4) cause pleiotropic phenotypic effects that are primarily due to decreased levels of free ub in the yeast cells. these alterations, as well as deletion of the gene encoding the ub-conjugating enzyme ubc4, decrease [psi+] curing by the overproduced disaggregase hsp104, suggesting that the ub system influences hsp104-dependent clearance of prion aggregates. spontaneous [psi+] formation was also increased in the ubc4-depleted cells. we previously demonstrated that an excess of the cytosolic chaperone ssa of the hsp70 family increases de novo formation of [psi+]. both in vivo and in vitro experiments uncover direct interactions between sup35 and hsp70 proteins. the amount of sup35 bound to hsp70-ssa was increased in the ubc4 deletion strain. we propose a model to explain the roles of hsp104, hsp70 and the ub system in the prion life cycle.

effects of parkinson's disease mimetics on proteasome activity and protein turnover in human sh-sy5y neuroblastoma cells

it has recently been suggested that impairment of the ubiquitin/proteasomal system contributes to the degeneration of dopaminergic neurons (dn) and lewy body (lb) formation in parkinson's disease (pd). mitochondrial dysfunction is also a key factor in pd, and agents such as mpp+ and dopamine, which inhibit mitochondrial electron transport, produce selective degeneration of dn in animal models.
in this study the effects of treating sh-sy5y cells with mpp+ or dopamine over 72 h on proteasomal chymotrypsin-like activity (cla) were monitored. mpp+ (0.1 mm) caused a sustained depletion of glutathione levels followed by a reduction in proteasomal activity. a reduction in atp levels, caused by higher levels of mpp+ (2 mm), exacerbated this effect. exposure to low dopamine concentrations (0.1 mm) led to large reductions in atp without affecting cla or glutathione levels, whilst higher concentrations (2 mm) caused marked reductions in cla, glutathione and atp levels. these results suggest that, under oxidative stress, glutathione levels are important regulators of proteasomal activity in this cell line. our group has shown that mpp+ can destabilize the neurofilament network in sh-sy5y cells, partly due to changes in phosphorylation of neurofilament (nf) chains. as nfs are important components of lbs, and their mode of turnover is uncertain, we tested the effects of proteasome inhibitors on nf levels. treatment with these inhibitors led to nf accumulation, which was enhanced when glutathione levels were artificially depleted, suggesting that nfs can be degraded via the proteasomal pathway. the effects of proteasome impairment on protein accumulation will be discussed.

mitochondria and the hypoxia-inducible factor 1 (hif-1): regulation of hif-1 is independent of a functional mitochondrial respiratory chain

k. doege, w. jelkmann and e. metzen, institute of physiology, university of luebeck, luebeck, germany. e-mail: doege@physio.uni-luebeck.de

the hypoxia-inducible factor hif-1 is the "master regulator" in adaptation to low oxygen concentration and induces the hypoxic expression of several target genes, e.g. erythropoietin and vascular endothelial growth factor (vegf). in normoxia hif-1a is constantly produced but also degraded following oxygen-dependent prolyl hydroxylation.
mitochondria consume most of the oxygen delivered to cells and have been implicated in oxygen sensing. firstly, mitochondria have been proposed to stabilize hif-1a by production of reactive oxygen species (ros) in hypoxia. secondly, inhibition of the respiratory chain, e.g. by nitric oxide, has been proposed to cause redistribution of intracellular oxygen followed by reactivation of the prolyl hydroxylases and inhibition of hif signalling. we have used cells depleted of mitochondrial dna (ρ0) and gas-permeable cell culture dishes to eliminate all oxygen diffusion gradients affecting the cells. we show that these dishes neutralize all effects of mitochondrial inhibition. additionally, cellular hypoxia as assessed by pimonidazole staining has been evaluated in human osteosarcoma cells treated with inhibitors of the respiratory chain under hypoxia. these results demonstrate an elevated po2 under hypoxic conditions after treatment with mitochondrial inhibitors, correlating with an intracellular oxygen concentration which reduces hif-1 activation. thus, neither the absence of ros nor the redistribution of intracellular oxygen supply leads to the destabilization of hif-1a in hypoxia. our experiments provide evidence that an increased intracellular po2 evoked by the absence of mitochondrial oxygen consumption reactivates the prolyl hydroxylases and is therefore responsible for the degradation of hif-1a under hypoxic conditions.

enzyme activity is generally higher in rhizosphere than in bulk soil, as a result of greater microbial activity sustained by root exudates or of the release of enzymes from roots. negative effects of heavy metals on soil microorganisms and enzyme activities have long been recognized.
the aim of this study was to assess the stimulatory effects of different low molecular weight organic compounds commonly present in root exudates (mres) on microbial activity and protease activities, and how high cd concentrations affect such stimulatory effects. soils (arenic udifluvent) were sampled from the agir long-term field trials, contaminated with cd nitrate at rates of 0 (control soil), 20 and 40 mg cd per kg of soil. the mre solutions contained glucose, citric acid, oxalic acid, glutamic acid or a mixture of the four compounds, added to give a rate of 300 mg of mre-c per kg of soil. the effects were measured at 4 mm (bulk soil) distance from the mrs. protease activity was determined by hydrolysis of n-benzoylargininamide (baa). the results showed that different mres had different stimulatory effects on microbial growth and on the protease activities, mostly localized in the rhizosphere soil layer. in the control soil, the dsdna content was significantly increased by the addition of all mres in both rhizosphere and bulk soil layers. cd at 20 and 40 mg per kg of soil negatively affected protease activity. in both rhizosphere and bulk soil layers, glucose, citric acid, oxalic acid, glutamic acid and the mre mixture did not stimulate protease activity. microbial growth and protease activities were drastically reduced by high cd concentrations.

participation of different digestive proteinases of the yellow mealworm, tenebrio molitor, in initial stages of hydrolysis of the main dietary protein

insects generally have a wide spectrum of digestive proteinases. knowledge of the contribution of different proteinases to the initial stages of hydrolysis of dietary proteins is essential for insect control by means of proteinase inhibitors and bacillus thuringiensis toxins. the larvae of a stored grain pest, the yellow mealworm tenebrio molitor, were reared on milled oat flakes. the main dietary protein for these larvae was 12s globulin, the main storage protein of oat seeds.
to study the initial stages of 12s globulin hydrolysis in vitro, the reaction was performed under the physiological conditions of the anterior midgut (am) (ph 5.6) with purified enzyme preparations from am: two fractions of cysteine proteinases, cys ii and cys iii, and chymotrypsin- and trypsin-like proteinases. total hydrolysis of 12s globulin was observed with cys ii. hydrolysis by the chymotrypsin-like enzyme was slightly less effective. cys iii cysteine proteinase and the trypsin-like proteinase produced only partial hydrolysis of seed globulin. in all cases high molecular mass (mm) intermediate products were formed, indicating that hydrolysis of 12s globulin was sequential. incubation with both cysteine proteinase fractions led to formation of a 31 kda product, while serine proteinases pro

in contrast to "classical" bioregulator peptides, peptides can be generated in the course of catabolic degradation of functional proteins. for 15 years, we have been interested in one such group of peptides derived from blood hemoglobin, the hemorphins. hemorphins are a family of opioid receptor-binding peptides of 4 to 10 amino acids that are released by proteolytic processing from the (32-41) segment of the human hemoglobin beta-chain. they are prevalent throughout the peripheral and central nervous system and have been isolated in vivo from tissues or fluids. many in vivo physiological effects have been reported (coronaro-constrictory, anti-tumorous, immunoregulatory activities) and several of the hemorphins interact at various levels of the renin-angiotensin system (ras) by inhibiting angiotensin-converting enzyme (ace), aminopeptidase n (apn) and dipeptidyl peptidase iv (dppiv) activities. in addition, some hemorphins, in particular lvv-hemorphin-7 (lvvypwtqrf), bind with high affinity to the brain (ic50 = 4.15 nm) and renal at4 angiotensin receptor subtype, and lvv-hemorphin-7 is possibly the main endogenous ligand of this receptor.
in an attempt to characterize in vivo the precise mechanisms of their release, our attention is focused on tumoral and central nervous system environments. the latter is particularly interesting as all cellular components implicated in the release of hemorphins are present simultaneously: the haemoglobin precursor and localized brain proteases which might come into contact with blood haemoglobin. for this purpose, examining the potential of this tissue to generate "neuro"-hemorphins would be of interest, since sources of hemorphins in the brain have not yet been definitively established.

later, we showed that sgt1 interacts with calcyclin (s100a6) and other calcium-binding proteins of the s100 family (nowotny et al. j biol chem 2003). moreover, in collaboration with dr chazin's group, we found that in vitro sgt1 binds to hsp90 (lee y-t et al. j biol chem 2004). in this work we studied the expression and subcellular localization of sgt1 in mammalian cells by means of western and northern blots. among the different cell lines examined, human embryonic kidney hek293 and human glioma t98g cells exhibited the highest expression of sgt1 protein. moreover, we found that in mouse and rat cells there is one isoform of sgt1, while in human cells two isoforms of this protein were found. to study the subcellular localization of sgt1 we chose cells containing a moderate level of sgt1, such as human epidermal hep-2 cells. by applying immunocytochemistry we found that this protein is present not only in the cytoplasm but also in the nucleus. we are currently examining the effect of intracellular ca2+ concentration on the subcellular localization of sgt1 and on its co-localization with target proteins. acknowledgements: this work was supported by grants: kbn 3 p04a 043 22 and firca/nih r13 tw006005.
combining reverse genetics, reverse chemogenomics and proteomics to assess the impact of protein n-terminal methionine excision in the cytosol of higher eukaryotes

in living organisms, whatever the cell compartment, proteins are always synthesized with methionine (met) as the first residue. however, this first met is specifically removed from most mature proteins. in the course of protein n-terminal met excision (nme), the free n-terminal met is removed by met aminopeptidase (map) cleavage. three enzymes (map1a, map2a and map2b) have been identified in the cytoplasm of arabidopsis thaliana. by combining reverse genetics and reverse chemogenomics in transgenic plant lines, we have devised specific and reversible switches for the investigation of the role of cytoplasmic nme in a. thaliana and of the respective contributions of the two types of cytoplasmic map throughout development. in the map1a ko context (map1a-1), modulating map2 activity by treatment with various concentrations of the specific drug fumagillin impaired plant development. hence, (i) cytoplasmic nme is essential in plants, (ii) plant map1a and map2s are functionally interchangeable as a complete block of either map type activity does not cause any visible molecular or phenotypic effect, (iii) a minimal level of cytoplasmic map is required for normal development and (iv) the plant a. thaliana appears to be an excellent system to study nme and the associated role of anti-cancer agents like fumagillin. proteomics was used to assess the impact of nme blocking induced by fumagillin. we used a wild-type plant and the map1a-1 variant grown in the presence of 100 nm fumagillin. the map1a-1 variant showed a dwarf phenotype. we compared the patterns of each protein extract by 2d gel electrophoresis. protein spots were identified by tandem mass spectrometry. the data show that fumagillin induces many dedicated pathways, with a prevalence of those related to oxidative stress.
prolyl endopeptidases from the midgut of the yellow mealworm tenebrio molitor

side of proline residues. these enzymes have been found in mammals, several higher plants, fungi and bacteria. it is suggested that the enzymes participate in the in vivo regulation of the action of biologically active peptides. we report for the first time two prolyl endopeptidases in the larval midgut of a stored product pest, the yellow mealworm tenebrio molitor, where they can participate in the proteolysis of one of the main dietary proteins of t. molitor larvae, the proline-rich prolamines. the characteristics of the two prolyl endopeptidases are significantly different. the optimum for hydrolysis of the substrate z-ala-ala-pro-pna (n-carbobenzoxy-l-alanyl-l-alanyl-l-prolyl-p-nitroanilide) by prolyl endopeptidase 1 was at ph 8.5, and by prolyl endopeptidase 2 at ph 5.6. prolyl endopeptidase 1 displayed high ph stability in the ph range 7.0-10.0, and the rate of hydrolysis increased in the presence of kcl and cacl2. prolyl endopeptidase 2 demonstrated low stability over the whole ph range; the rate of hydrolysis strongly decreased in the presence of the above-mentioned salts, but increased in the presence of high concentrations of edta.

the influence of cell growth media on the stability and antitumour activity of methionine enkephalin

studies with cultured tumour cell lines are widely used in vitro to evaluate peptide-induced cytotoxicity as well as molecular and biochemical interactions. the objectives of this study were to investigate the influence of the cell culture medium on peptide metabolic stability and in vitro antitumour activity. the degradation kinetics of the model peptide methionine enkephalin (met-e, tyr-gly-gly-phe-met), demonstrated recently to play an important role in the rate of proliferation of tumour cells in vitro and in vivo, were investigated in cell culture systems containing different amounts of foetal bovine serum (fbs).
the influence of enzyme inhibitors (bestatin, captopril, thiorphan) on met-e degradation was also investigated. the results obtained in dulbecco's modified eagle medium containing 10% fbs indicated a rapid degradation of met-e (t1/2 = 2.8 h). pre-incubation of the medium with a mixture of peptidase inhibitors reduced the hydrolysis of met-e, as shown by an increase in half-life to 10 h. the in vitro activity of met-e against poorly differentiated cells from lymph node metastasis of colon carcinoma (sw620) and human larynx carcinoma (hep-2) cells was determined. tumour cells were grown for 3 weeks prior to the experiment in a medium supplemented with 10, 5 or 2% fbs. statistically significant to mild or no suppression of cell proliferation was observed in all cultures. in both cell lines, a significant suppression of cell growth by a combination of peptidase inhibitors and met-e, compared with cells exposed to the peptide alone and cells grown in the absence of met-e, was observed. this study indicated that caution must be exercised in interpreting the antiproliferative effects of peptide compounds in conventional drug-response assays.

protein metabolism in whole body and skeletal muscle of laboratory rats treated by proteasome inhibitors

proteasome inhibitors are new agents which may be used in the treatment of cancer and other severe disorders. one of the possible side effects of their administration is a disturbance in protein metabolism which may affect the outcome of the illness. two separate studies were performed using wistar rats. in the first study, m. soleus (sol) or m. extensor digitorum longus (edl) were incubated in medium containing 30 mmol/l mg 132 or 30 mmol/l adaahx3l3vs or without inhibitor (control). protein synthesis was evaluated using l-[1-14c]leucine. proteolysis was determined according to the rate of tyrosine release into the medium during incubation.
in the second study, the proteasome inhibitor mg 132 diluted in dimethyl sulfoxide (dms) was administered intraperitoneally at a dose of 10 mg/kg b.w. controls consisted of dms-treated animals. changes in protein and amino acid metabolism were estimated in steady-state conditions using continuous infusion of l-[1-14c]leucine 2 h later. the mann-whitney test (in vivo study) and paired t-test (in vitro study) were used for statistical analysis. in the in vitro study, both mg 132 and adaahx3l3vs significantly decreased protein synthesis and proteolysis. however, in the in vivo study, a significant increase in whole-body protein synthesis and proteolysis was observed in mg 132-treated animals. acknowledgements: the study was supported by a grant of gacr no. 303/03/1512.

bioinformatical evidence for a prokaryotic ubiquitin-like protein modification system

h. scheel, s. tomiuk and k. hofmann, bioinformatics group, memorec biotec gmbh, köln, germany. e-mail: kay.hofmann@memorec.com

until recently, the ubiquitin system has been considered a purely eukaryotic invention. by now, the bacterial moad/moeb and this/thif systems are known to be prokaryotic versions of a rudimentary activation system for ubiquitin-like proteins. however, similarities to the ubiquitin system end after the activation step, as moad and this are not conjugated onto target proteins but rather have a role in the biosynthesis of molybdopterin and thiamin, respectively. the eukaryotic protein urm1 is the closest homolog of moad and this. unlike its bacterial cousins, urm1 is conjugated onto target proteins and thus can be considered the founding member of the diverse eukaryotic ubiquitin family. by using a bioinformatics approach that integrates methods of sequence analysis, phylogenetics, phylogenomics and gene-order analysis, we were able to show that many bacteria possess a third ubiquitin-like activation system that most likely is used for protein modifications.
the novel system uses a moad/this relative, which is more closely related to urm1 than the typical moad and this proteins. these bacterial urm1 (burm1) proteins typically require the proteolytic removal of a c-terminal extension, which masks the gg motif important for activation. many burm1 operons contain an mpn+/jamm domain protein (belonging to a bona fide ubiquitin-specific protease family), which is most likely responsible for this cleavage. as a third component, an e1-like enzyme is also part of typical burm1 operons. the burm1-associated e1 enzymes look more like uba4 (the eukaryotic urm1-e1) than like the bacterial moeb/thif e1 enzymes. interestingly, the mpn+/jamm protease is also conserved in those bacteria whose burm1 end with gg, suggesting that burm1 removal is important not only for the activation step

b2-025p non-hypoxic induction of hypoxia-inducible factors by insulin and 2-deoxy-d-glucose

hypoxia-inducible factors (hifs) are key mediators of the cellular adaptation to hypoxia, but also respond to non-hypoxic stimuli like insulin. to clarify the involvement of all known hif subtypes in conditions resembling diabetes, we determined the distribution of mrnas and proteins in rats subjected to in vivo hypoglycemia and glucoprivation. wistar rats were infused with either saline, insulin, or 2-deoxy-d-glucose (2-dg) to provoke hypoglycemia or impaired glucose assimilation. using real-time qpcr, mrna levels of hif subunits 1a, 2a, 3a, 1b, and of the target gene glut-1 were determined in various organs. cellular distributions of hif-a proteins were examined by immunohistochemistry. treatments with insulin or 2-dg resulted in a widespread increase in hif-3a mrna after 6 h, whereas mrna expression of other hif subunits remained unaffected, except for hif-2a, which increased in lung and heart after 2-dg. in cerebral cortex and kidney, enhanced staining of all hif-a proteins was observed after insulin or 2-dg treatments.
lung, heart and kidney showed enhanced levels of glut-1 mrna. both hypoglycemia and glucoprivation provoke functional activation of the hif system, with transcriptional up-regulation of hif-3a representing a typical response. our data indicate an involvement of the hif system, and hif-3a in particular, in the pathophysiology of diabetes.

fragments of human salivary statherin and pb peptide underlying a furin-like pro-protein convertase action in the pre-secretory salivary fragmentation pathway

the recent analysis of some derivatives of human salivary peptides and proteins [13], such as acidic and basic proline-rich proteins (prp) and histatins, allowed the recognition, in the pre-secretory salivary fragmentation pathway, of the action of a furin-like pro-protein convertase of the kexin-subtilisin family, often followed by a carboxypeptidase action. along the same lines, the present study was carried out to search human saliva for the fragments generated from statherin and pb peptide by the action of furin-like proteinases, utilizing a selected-ion monitoring strategy based on hplc-it ms. the fragments and post-translational derivatives detected with high frequency in multiple samples were the following: (i) statherin (5380 amu), des-phe-43 (5232 amu), des-thr-42-phe-43 (5131 amu), des-asp-1 (5265 amu), mono-phosphor. (5300 amu), statherin sv2 (missing 6-15 residues; 4149 amu), fragm. 10-43 (4128 amu), fragm. 11-43 (3971 amu), fragm. 14-43 (3645 amu). moreover, the fragm. 6-57 (5215 amu) of pb peptide (5793 amu) was identified. the quantity of these fragments in salivary samples was usually <10% of the parent peptide. the identified fragments confirmed the action of a pro-protein convertase on furin-like consensus sequences, with cleavage at arg-9 (ekflr), arg-10 (lrr) and arg-13 (rrigr) for statherin, and at arg-5 (rgpr) for pb peptide.
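the statherin assignments above can be checked arithmetically. the sketch below, offered only as an illustration, compares the reported mass differences with the loss of the named residue or of one phosphate group, and derives the c-terminal fragments expected from cleavage after arg-9, arg-10 and arg-13 of the 43-residue peptide. the average residue masses are standard textbook values and are not taken from the abstract; the function and variable names are hypothetical.

```python
# Cross-check of the statherin derivatives listed in the abstract.
# Assumption: average residue masses (Da) are standard reference values,
# not figures given by the authors.
RESIDUE_MASS = {"phe": 147.18, "thr": 101.10, "asp": 115.09}
PHOSPHATE_MASS = 79.98  # HPO3 gained per phosphorylation

STATHERIN_LENGTH = 43  # the abstract names phe-43 as the C-terminal residue

# (derivative, reported mass, mass of the species it derives from, expected loss)
DERIVATIVES = [
    ("des-phe-43", 5232, 5380, RESIDUE_MASS["phe"]),
    ("des-thr-42-phe-43", 5131, 5232, RESIDUE_MASS["thr"]),
    ("des-asp-1", 5265, 5380, RESIDUE_MASS["asp"]),
    ("mono-phosphorylated", 5300, 5380, PHOSPHATE_MASS),
]


def mass_deviations():
    """Observed mass loss minus expected loss, in Da, per derivative."""
    return {
        name: round((parent - observed) - expected, 2)
        for name, observed, parent, expected in DERIVATIVES
    }


def c_terminal_fragments(cleavage_sites, length=STATHERIN_LENGTH):
    """Fragments (first, last residue) left after cleavage C-terminal to each site."""
    return [(site + 1, length) for site in cleavage_sites]


if __name__ == "__main__":
    for name, deviation in mass_deviations().items():
        print(f"{name}: deviation {deviation:+.2f} Da")
    print(c_terminal_fragments([9, 10, 13]))
```

with these reference masses every reported difference lies within 1 da of the expected loss, and the three cleavage sites give the fragments 10-43, 11-43 and 14-43 listed in the abstract.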
detection of statherin missing n- and c-terminal residues also indicated a pre-secretory exopeptidase action, already observed in other salivary peptides. the function of these statherin and pb derivatives in the oral cavity remains to be elucidated.

cloning and expression of a pepstatin insensitive acid protease from thermoplasma volcanium in e. coli

acid proteases, commonly known as aspartic proteases, are recognized by their specific inhibition by pepstatin. acid proteases are found in microorganisms both as intracellular and extracellular enzymes. only a very limited number of thermostable, pepstatin-insensitive acid proteases have been isolated from bacterial sources. the only example of a purified and cloned acid protease from archaebacteria is thermopsin, produced by sulfolobus acidocaldarius. this thermophilic enzyme represents a new class of acid proteases. to extend our knowledge of microbial acid proteases with thermostable properties, in this study we have undertaken the cloning and expression of a thermostable, pepstatin-insensitive acid protease from the thermoacidophilic archaeon thermoplasma volcanium. a primer set was designed based on the nucleotide sequence of the predicted thermopsin gene, and pcr amplification produced a 3080 bp fragment, which covered the complete thermopsin gene with some upstream and downstream sequences. the amplified thermopsin gene was cloned in e. coli, using the pdrive vector. the alignment of the amino acid sequences of thermopsins from various archaea revealed the highest homology (44%) between the tp. volcanium thermopsin and the putative tp. acidophilum enzyme, thermopsin 1. there was a low degree of similarity (28%) between the tp. volcanium thermopsin and thermopsin from sulfolobus acidocaldarius. expression of the recombinant thermopsin was attempted using the qia expression kit, in which the cloned gene was ligated to pqe expression vectors to be expressed under the control of the t5 promoter.
in this system the protein was tagged with a 6xhis sequence at the n-terminal end so that it could be selectively isolated using ni-nta metal-affinity chromatography.

include various vital proteins with discrete functions in the propagation of apoptosis. our aim is to generate a caspase cleavage site predictor specific for each member of the caspase family in order to make subtype-specific predictions of new caspase substrates. we have used a set of experimentally verified proteins to generate sequence logos and train a neural network in order to predict caspase cleavage sites. machine learning techniques, such as artificial neural networks, are often well suited to integrating the subtleties of sequence variations. this approach also enables integration of structural information in the pattern recognition procedure, which could possibly increase the predictive performance of the neural network. the identification of new caspase substrates can lead to further elucidation of several cellular processes involving caspases, including apoptosis, cell cycle regulation, cellular differentiation, and pro-inflammatory responses. in addition, the generation of caspase inhibitors could be greatly aided by a caspase cleavage site predictor.

regulation of protein synthesis and autophagic-lysosomal protein degradation in isolated pancreatic acini

a. l. kovacs and e. papp, cell physiology laboratory, department of general zoology, eotvos lorand university, budapest, hungary. e-mail: alkova@cerberus.elte.hu

a series of biologically active compounds (wortmannin, ly294002, 3-methyladenine, rapamycin, okadaic acid, theophyllin, insulin, glucagon, cholecystokinin) influencing protein synthesis and autophagic-lysosomal protein degradation by interfering with important signalling pathways were investigated. our results show that in exocrine pancreas cells phosphatidylinositol kinases (pi3k-s) are activators, while the target of rapamycin protein (tor) is an inhibitor of autophagy.
camp is an inhibitor of lysosomal protein degradation that acts through members of the pi3k family. okadaic acid inhibits lysosomal protein degradation without inhibiting the formation of autophagic vacuoles. the inhibition of pi3k-s and tor diminishes protein synthesis; inhibitors of these kinases reduce the synthesis stimulatory effect of insulin. cholecystokinin showed a biphasic stimulatory effect while glucagon was ineffective on protein synthesis. on the basis of these results a possible signaling pathway is suggested for autophagic segregation and lysosomal protein degradation in pancreatic acinar cells. purification and characterization of a bifunctional protease from vibrio vulnificus in this study, we purified and characterized an extracellular protease showing dual functions as prothrombin activator and fibrinolytic enzyme from vibrio vulnificus atcc 29307. the purified enzyme had broad substrate specificity towards various blood clotting-associated proteins such as prothrombin, plasminogen, fibrinogen and factor xa. the cleavage of these proteins could be stimulated by addition of 1 mm mn2+. the protease could activate prothrombin to active thrombin. however, the thrombin activity generated from prothrombin activation by the protease seemed to be transient, with further cleavage resulting in a loss of activity. interestingly, the enzyme could enhance the activity of thrombin during the initial rate of fibrin formation when purified fibrinogen was used as substrate. it could also actively digest fibrin polymer as well as cross-linked fibrin. these results suggest that the secreted protease functions as a prothrombin activator and a fibrinolytic enzyme to interfere with blood clotting as part of the mechanism associated with its pathogenicity in humans. tumor invasion and metastasis are the major causes of treatment failure and death in cancer patients. 
one requisite for neoplastic cell invasion during tumorigenic processes is the remodeling events that occur within the stroma or extracellular matrix (ecm). cysteine cathepsins, most likely along with matrix metalloproteases and serine proteases, degrade the ecm, thereby facilitating growth and invasion into surrounding tissue and vasculature. clinically, the activity levels and localization of cysteine cathepsins and their endogenous inhibitors have been shown to be of diagnostic and prognostic value. the aim of our study was therefore twofold: to determine the prognostic and diagnostic impact of cathepsins b, l and h from human tissue extracts (normal and tumor tissue) and extracellular fluids (such as plasma and urine), and of alpha1-proteinase inhibitor (pi), in the pathogenesis of different types of human brain tumors; and to extract and purify endogenous inhibitors of cysteine cathepsins from normal and tumor brains and study their physicochemical properties. it was found that the increase in the activity levels of cysteine cathepsins b, l and h in brain tumor tissues depends on histostructure, histogenesis and tumor malignancy grade. an increase in the activity levels of cathepsins l and h was found in plasma and urine, depending on histogenesis. at the same time a decrease in pi activity level was registered. in addition, kinetic characteristics of the endogenous inhibitors of cysteine cathepsins extracted from normal brain were determined. the endogenous inhibitors extracted from tumor brain differed in physicochemical properties from those of normal brain. the data obtained contribute to understanding the participation of cysteine cathepsins and their inhibitors in the mechanisms of carcinogenesis; they may be useful for improving tumor therapy and open the possibility of using the activities of these proteins as diagnostic and prognostic markers. 
protein hydrolysates of sea origin as components for microbiological culture media dry hydrolysate was prepared from protein-containing waste of icelandic scallop chlamys islandicus processing (spw) by means of a proteinase complex from king red crab hepatopancreas. the enzyme preparation consists of the proteolytic enzyme complex from crab hepatopancreas, in which serine proteases dominate (collagenase, elastase and trypsin- and chymotrypsin-like proteinases). as proteinases from king red crab hepatopancreas have high enzyme-substrate affinity to icelandic scallop proteins, a high degree of proteolysis can be achieved. the composition and properties of the material were investigated on enzymatic protein hydrolysate from spw obtained under the most technologically suitable conditions: 50-55 °c, ph 7.5, 6 h, the ratio between the protein material and the enzyme preparation being 1000:6. for comparison we examined the composition of commercial pancreatic hydrolysate from poor-quality fish species, mainly boreogadus and micromesistius. it was found that hydrolysate from spw significantly exceeded the commercial analog in the mass percentage of the target product (free amino acids and oligopeptides). the resulting product contains not less than 80% free amino acids and oligopeptides. the predominant amino acids are aspartic acid, leucine, isoleucine, arginine and lysine, which account for >5% of the free amino acids. the potential use of the protein hydrolysate as a nutrient for microorganism cultivation is estimated. microbiological studies have demonstrated that the hydrolysate from spw can be used as a protein component in nutrient media. the tested microbial strains grew satisfactorily on the media. the z variant alpha-1 proteinase inhibitor (a1piz) misfolds in the endoplasmic reticulum (er) and is a substrate for er-associated protein degradation (erad). 
we report here that a1piz degradation is also dependent on vps30/atg6, a gene that encodes a component of two pi3-kinase complexes that regulate membrane traffic; complex i is required for autophagy, complex ii is required for the cpy-to-vacuole pathway. to elucidate why vps30p participates in a1piz degradation, we tested the hypothesis that erad was saturated at elevated levels of a1piz expression and that excess a1piz was targeted to one of these alternative quality control pathways. overexpression of a1piz led to vacuole-dependent degradation, and both complexes were required for delivery of the excess a1piz to the vacuole. when the cpy-to-vacuole pathway was compromised, a1piz was secreted and the distribution of soluble vs. aggregated forms of a1piz was comparable with that of wild type yeast. however, disruption of autophagy led to an increase in levels of aggregated a1piz, suggesting that when erad is saturated the excess soluble a1piz is selectively targeted to the vacuole via the cpy-to-vacuole sorting pathway, while excess a1piz that forms aggregates in the er is targeted to the vacuole via autophagy. together, these results reveal multiple pathways for recognition and removal of aberrant proteins and provide direct evidence that aggregated a1piz is removed by autophagy. our findings may have application in the understanding of, and treatment for, individuals with liver disease caused by the accumulation of er aggregates of a1piz. acknowledgements: the study was supported by national science foundation grants mcb-011079 and mcb-0110331. yeast and lactobacillus association generates peptides from acid goat whey proteins fermentation s. didelot, s. bordenave-juchereau, e. rosenfeld, l. murillo, j. m. piot and f. sannier laboratory of biotechnology and bioorganic chemistry, university of la rochelle, la rochelle, france. e-mail: lmurillo@univ-lr.fr our goal was to produce peptides from fermentation of unsupplemented acid goat whey by dairy micro-organisms. 
we used a lactobacillus, lactobacillus paracasei, and a yeast, candida parapsilosis, both previously isolated from a cheese microflora. when co-cultivated aerobically, both micro-organisms grew on unsupplemented goat whey and led to a medium acidification from ph 6 to ph 3.5. reversed phase (rp)-hplc analysis revealed a total alpha-lactalbumin hydrolysis after 96 h of fermentation, a modification of the beta-lactoglobulin elution peak, and a 2.5-fold increase in peptide level compared with the non-fermented whey. in the absence of c. parapsilosis, l. paracasei grew poorly on whey and only a weak medium acidification from ph 6 to ph 4.5 was observed after 192 h of fermentation. rp-hplc analysis revealed a weak modification of the beta-lactoglobulin elution peak, a truncated form of alpha-lactalbumin and no peptide generation. c. parapsilosis was able to grow on unsupplemented goat whey without modifying the ph of the medium, but only 25% of proteins were hydrolysed (alpha-lactalbumin) or denatured (beta-lactoglobulin) and, again, no peptides were detected. these results suggest that (i) c. parapsilosis is required for l. paracasei growth and (ii) the co-culture of both micro-organisms is needed to generate peptides from alpha-lactalbumin hydrolysis. during co-culture on whey, the use of penicillin g and cycloheximide as bacterial and yeast growth inhibitors, respectively, revealed that l. paracasei growth was required for medium acidification to ph 3.5 and alpha-lactalbumin hydrolysis. however, we demonstrated that the protease(s) responsible for alpha-lactalbumin hydrolysis was (were) synthesized by c. parapsilosis during the first stage of fermentation and that medium acidification (obtained either by l. paracasei growth or chemically) was required for yeast protease(s) activity. dengue virus causes widespread human diseases such as dengue fever, dengue hemorrhagic fever and dengue shock syndrome. 
the viral genome is a positive rna strand that encodes a single polypeptide precursor. processing of the polyprotein precursor into mature proteins is carried out by the host signal peptidase and by the ns3 serine protease. the three-dimensional structure of the ns3 protease domain, ns3pro, has been elucidated [1]. recently a new construct of the recombinant form of ns3pro was engineered [2]. we have expressed in e. coli the his-tag-cf40.gly.ns3pro protein, a new construct of the recombinant form of ns3pro linked to a 40-residue co-factor, corresponding to a part of ns2b, via a non-cleavable, flexible nonapeptide (gly4-ser-gly4), and have optimized the purification procedure. chemically optimized substrates, peptides and depsipeptides, were designed and tested to afford an efficient in vitro activity assay, using hplc and fret spectroscopy. the data suggest that the amino-terminal region of the 40-amino acid co-factor domain is involved in additional charged interactions with ns3 that are essential for activity, as previously described. this form showed catalytic activity, and spectroscopic studies were performed to characterize the folding of the protein. moreover, limited proteolysis experiments have been performed to identify the essential enzymatic domain of the protein and to establish the role of the cofactor in the activity and in folding stabilization of the enzyme. after 2 h of limited proteolysis with endoproteinase asp-n the product was analyzed by sds-page and activity assay, showing a large reduction of the molecular mass and a loss of only 20% of the activity. cd and 15n-1h hsqc spectra of this protein fragment were recorded, and other functional and structural characterizations are in progress in our laboratory. it is intended to obtain the structure in solution of the essential active domain of the uniformly 13c,15n-labeled cf40.gly.ns3pro by high-field 3d nmr spectroscopy. 
the solution structure of the enzyme will be used to answer yet unresolved questions about the mechanism of action, the role of its cofactor ns2b, and the observed substrate specificity. introduction: fish consumption is associated with nutritional benefits due to the presence of proteins of high biological value, minerals, vitamins and polyunsaturated fatty acids. most studies concerning the benefits of fish consumption on cancer prevention have focused on fish fatty acids, but little is known about the potential bioactivity of fish peptides. the present study was therefore designed to assess the antiproliferative activity of various fish protein hydrolysates, in order to further purify and characterize anticancer peptides. methods: twenty-one fish hydrolysates (from seven species) were produced within the framework of the european valbiomar programme. the composition of the fish hydrolysates (protein, fat and salt content) was determined by standard methods (kjeldahl, soxhlet extraction and volhard, respectively). cytotoxic and antiproliferative activities were assayed in vitro on mcf-7/6 and mda-mb-231 human breast adenocarcinoma cell lines, using a cell viability colorimetric assay (promega, france). the antiproliferative activity of the fish hydrolysates was compared with that of reference anticancer molecules with various cellular targets, namely actinomycin d, cytosine-beta-d-arabinofuranoside, cyclophosphamide, etoposide, kenpaullone and roscovitine. results: composition analysis revealed that most hydrolysates contained more than 70% protein. three blue whiting hydrolysates containing 96% protein, 0.5% lipid and 0.2% salt induced a strong growth inhibition of breast cancer cells when tested at 1 g/l for 72 h in cell culture medium. blue whiting hydrolysates 3, 4 and 5, respectively, induced a growth inhibition of 24.5, 22.3 and 26.3% on mcf-7/6, and 13.5, 29.8 and 29.2% on mda-mb-231. 
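as a rough illustration of how growth-inhibition percentages of this kind can be derived from a colorimetric viability assay, the sketch below assumes that inhibition is computed as 100 × (1 − a_treated / a_control) from blank-corrected absorbances. this formula, the function name `percent_growth_inhibition` and its arguments are illustrative assumptions and are not stated in the abstract. the second part only averages the inhibition values reported above for blue whiting hydrolysates 3, 4 and 5.

```python
from statistics import mean


def percent_growth_inhibition(a_treated, a_control, a_blank=0.0):
    """Percent growth inhibition from absorbance readings.

    Assumed formula (not given in the abstract):
        100 * (1 - (a_treated - a_blank) / (a_control - a_blank))
    """
    corrected_control = a_control - a_blank
    if corrected_control <= 0:
        raise ValueError("control absorbance must exceed the blank")
    return 100.0 * (1.0 - (a_treated - a_blank) / corrected_control)


# Inhibition values (%) as reported in the abstract for
# blue whiting hydrolysates 3, 4 and 5, per cell line.
reported = {
    "MCF-7/6": [24.5, 22.3, 26.3],
    "MDA-MB-231": [13.5, 29.8, 29.2],
}

# Mean inhibition per cell line, rounded to one decimal place.
mean_inhibition = {line: round(mean(vals), 1) for line, vals in reported.items()}

if __name__ == "__main__":
    for line, value in mean_inhibition.items():
        print(f"{line}: mean inhibition {value}%")
```

averaging the three hydrolysates is only a convenience for comparing the two cell lines; the abstract itself reports the values individually.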
these in vitro antiproliferative activities are in the range of that observed when the two breast cancer cell lines are treated for 72 h with kenpaullone, roscovitine or cytosine-beta-d-arabinofuranoside at 10⁻⁶ m. further studies are under way to fractionate and characterize the antiproliferative peptides contained in blue whiting hydrolysates. during recent years, it has been established that intracellular proteolysis in eukaryotic cells is largely accomplished by a highly selective non-lysosomal pathway that requires atp and a large (2.5 mda) multisubunit complex known as the 26s proteasome. the proteasome-mediated pathway plays vital regulatory functions. it degrades many important proteins involved in cell cycle control, in signaling pathways, and in general metabolism, including transcription factors and key metabolic enzymes. another function of the proteasomal system is the removal of abnormal, misfolded and oxidized proteins generated under normal and, in particular, stress conditions. to date, proteasomes from cells other than animal or plant cells have been studied only in yeast. recently, in our laboratory, the proteasome-mediated pathway was shown to be involved in the regulation of ligninolytic activities in the white rot fungi trametes versicolor and phlebia radiata upon nutrient starvation (staszczak, enzyme microb technol 2002; 30: 537-540). it was the first report on proteasomes in fungi representing basidiomycota. white rot fungi are able to degrade lignin by the action of secreted enzymes, the best characterized of which are laccases, lignin peroxidases, and manganese peroxidases. the subject of lignin biodegradation has commanded attention for a considerable period of time, mainly because of its ecological significance and the wide industrial applications of bioligninolytic systems. heavy metal ions are important environmental pollutants which affect biodegradation processes performed by white rot fungi. 
in the present study, we investigated whether the proteasomal degradation pathway might be involved in the regulation of laccase production by t. versicolor in response to cadmium exposure. studies of cacybp/sip function using small interfering rna cacybp/sip was discovered as a protein that bound calcyclin (s100a6) in a calcium-dependent manner (filipek and wojda 1996; filipek and kuznicki 1998), and its distribution and some biochemical properties have been studied. for instance, it has been shown that cacybp/sip binds calcyclin via its c-terminal fragment (nowotny et al. 2000) and that, besides calcyclin, it interacts with other calcium binding proteins of the s100 family (filipek et al. 2002). originally, we identified cacybp/sip in ehrlich ascites tumour (eat) cells, but it is also present in other mammalian tissues and cells. in particular, high expression of cacybp/sip was found in neuronal cells of mouse and rat brain (jastrzebska et al. 2000). at present the distribution and structural properties of cacybp/sip are quite well described, but its function remains obscure. there is only one paper published concerning the possible involvement of cacybp/sip in beta-catenin ubiquitination and degradation (matsuzawa and reed 2001). to elucidate the biological role of cacybp/sip we have designed and synthesized sirna (small interfering rna) against this protein. this sirna was then used to transfect neuroblastoma nb-2a and embryonic kidney hek293 cells, expressing high and low amounts of endogenous cacybp/sip, respectively. the level of cacybp/sip was monitored in cell extracts by western blotting. we found that the sirna we designed against cacybp/sip inhibited the expression of this protein, as its level in transfected cells was lower than in control cells. at present, we are checking the effect of diminished expression of cacybp/sip on beta-catenin degradation and other cellular processes. 
acknowledgements: this work was supported by grants: kbn. heavy metals are powerful poisons for living cells. it has been shown that exposure to arsenicals, either in vitro or in vivo, in a variety of model systems, causes the induction of a number of the major stress protein families, such as the heat shock proteins (hsp) (toxicol appl pharmacol 2001; 177: 132). the reasons for heavy metal toxicity in vivo are not fully understood, but heavy metals are known to contribute to the accumulation of aberrant proteins (bba 1995; 1268: 59). in animal cells, arsenite has been reported to cause sulfhydryl depletion, to generate reactive oxygen species and to increase the level of high molecular mass ubiquitin-protein conjugates (toxicol appl pharmacol 2003; 186: 101). in cells subjected to stress conditions, several components of the ubiquitin/proteasome pathway are activated. in this major eukaryotic proteolytic pathway, multiple ubiquitin molecules are enzymatically ligated to proteins destined for catabolism by an enzyme system composed of three types of enzymes, commonly referred to as e1, e2, and e3. the large ubiquitin-protein conjugates thus formed are subsequently degraded by a very large protease complex, the 26s proteasome, in an atp-dependent process. the changes in free ubiquitin (ub) and ubiquitin-protein conjugate (ub-p) levels were followed by immunoblotting during the incubation of the higher plant lemna minor l. (duckweed) in the presence of arsenite (as), at concentrations known to confer thermotolerance to the plants. the observed increase in the amount of large molecular mass ubiquitin-protein conjugates is indicative of a role for the ubiquitin/proteasome pathway in the response of lemna to as stress. 
this outcome is primarily attributed to an increased availability of protein substrates during as treatment, for three main reasons: (i) an increase in protein carbonyl content (a major marker for protein oxidation) detected by immunoblotting; (ii) only moderate increments (as determined by semi-quantitative rt-pcr) in the mrna levels of the coding sequences for the ubiquitin pathway components ubiquitin, e1, e2, and the beta subunit and the atpase subunit of the 26s proteasome; (iii) an identical pattern of variation for the large ubiquitin-protein conjugates observed in the simultaneous presence of as and cycloheximide, indicating that the observed increase in ubiquitin conjugates does not depend on de novo protein synthesis. ageing and autophagy y. stroikin and a. terman experimental pathology, linköping university, linköping, sweden. e-mail: yurst@inr.liu.se life of aerobic cells is associated with continuous oxidative damage resulting in the formation of altered, non-functional macromolecules and organelles. intracellular accumulation of oxidized proteins, defective organelles and lipofuscin inclusions are typical manifestations of ageing that preferentially affect long-lived post-mitotic or growth-arrested cultured cells. autophagy, an important biological mechanism for renewal of damaged intracellular structures, has been found to decrease in ageing. to learn more about the role of autophagy in ageing, we studied the effect of the inhibitor of autophagic sequestration 3-methyladenine (3-ma) on human diploid fibroblasts and astrocytes. inhibition of autophagy in growth-arrested (confluent) fibroblasts for 2 weeks resulted in the accumulation of altered lysosomes displaying lipofuscin-like autofluorescence, especially when 3-ma exposure was combined with hyperoxia. the findings suggest that autophagy is indispensable for normal turnover of lysosomes, and that lysosomal components may be direct sources of lipofuscin. 
the accumulation of oxidatively damaged intracellular structures (so-called biological "garbage") was associated with decreased cell viability. two-week inhibition of autophagy with 3-ma resulted in a significantly increased proportion of dying cells when compared with both untreated confluent cultures and dividing (subconfluent) cells exposed to 3-ma. similar results were obtained when autophagic degradation was suppressed by the protease inhibitor leupeptin. the results support the idea that biological "garbage" accumulation is essential for ageing and age-related death of post-mitotic cells, which can be prevented by cell division. recently two family members of the tumour suppressor gene p53 have been described, p63 and p73, which seem to be necessary for specific p53-induced stress-response pathways. furthermore, p63 and p73 appear to be crucial in determining the cellular sensitivity to anticancer drugs, particularly in tumours lacking functional p53. here, we show that p63 and p73 isoforms are also regulated by proteasomal degradation. we have identified several e3-ubiquitin ligases responsible for the regulation of the stability of p63 and p73. we found that the regulation of p63 and p73 is isoform-specific. furthermore, we demonstrate that ubiquitination of p73 influences the cellular localization of p73 and of the respective e3-ubiquitin ligases. finally, we show that the expression of the various e3-ubiquitin ligases can be differentially induced by p73 isoforms. in addition, the e3-ubiquitin ligases can influence the apoptotic function of p73. our findings demonstrate that p63 and p73 are sent to degradation or stabilized by e3-ubiquitin ligases in an isoform-specific manner, and we suggest a negative feedback loop between p63, p73 and their regulators, as they also influence the function of p63 and p73. an increased level of metalloproteases was shown to accompany tumor angiogenesis and active invasion into adjacent tissue [1]. 
development of different types of tumors is often accompanied by increased protease activity in blood [2, 3]. in the present study we compared the protease activity of plasma and of eluate from the surface of blood cells in healthy donors and patients with breast tumors. we have demonstrated recently that in the blood of healthy donors almost all circulating nucleic acids (cirna) are bound at the surface of blood cells. in patients with fibroadenoma cirna were found at the cell surface, whereas in breast cancer no cell-surface-bound cirna were detected in blood [4]. conjugates of hydrophobic and hydrophilic peptides of the cd34 receptor with biotin were incubated with avidin-coated 96-well eia microplates. the avidin-peptide complex was incubated with the samples under investigation and with serial dilutions of proteinase k solution, which was used for calibration of protease activity. undegraded peptides were visualized by incubation with goat anti-peptide antibodies followed by a conjugate of anti-goat immunoglobulins with peroxidase. blood plasma and eluate from the surface of blood cells of cancer patients demonstrated an increased level of protease activity against the hydrophilic peptide compared with healthy donors. the increase of protease activity against the hydrophilic peptide in blood correlates with the decrease of cell-surface-bound cirna, indicating that blood proteases can affect the concentration and distribution of circulating na. identification of cleavage site and natural substrate specificity of prta, a serralysin-type metalloprotease from the entomopathogenic microorganism photorhabdus prta, a secreted basic metalloprotease of photorhabdus, belongs to the m12b (serralysin) family of proteases. the biological function of these enzymes is not known, but in some cases they are supposed to have a role in virulence. serralysins are generally assumed to have broad substrate side-chain specificity. 
attempts toward the generation of a sensitive and specific substrate of these enzymes have had limited success, and no such substrate is available for prta. through mass spectrometric analysis of prta cleavage products of oxidized insulin a and b chains, we found that prta has a well-defined cleavage site preference. based on this, we developed a sensitive and highly specific oligopeptide substrate through optimization of the amino acid composition and length. the kinetic parameters of prta isolated from photorhabdus luminescens ssp. laumondii strain brecon were measured on the best substrate, dabcyl-glu-val-tyr-ala-val-glu-ser-edans, giving a km of 8.8 × 10⁻⁵, a kcat of 2.1 × 10⁻² /s and a kcat/km of 2.4 × 10⁶. its poor hydrolysis by various proteases proved its specificity, while it was very sensitive in measuring prta activity in hemolymph samples from photorhabdus infected galleria mellonella larvae. the substrate preference of prta was determined by in vivo digestion of hemolymph proteins from manduca sexta. six minor protein components were selectively cleaved, which were provisionally dis[…] the epithelial sodium channel (enac) is an integral component of the pathway for na+ absorption in epithelial cells. enac activity is mainly regulated by mechanisms that control its expression at the cell surface, such as ubiquitination. the ubiquitin ligases nedd4 and nedd4-2 have both been shown to bind to enac and decrease its activity. conversely, the serum- and glucocorticoid-regulated kinase (sgk), a downstream mediator of aldosterone, is able to increase enac activity. this effect is at least partly mediated by direct interaction between sgk and nedd4-2. sgk binds both nedd4 and nedd4-2 but it is only able to phosphorylate nedd4-2. phosphorylation of nedd4-2 reduces its ability to bind to enac, and hence increases enac activity. the impact of the interaction between nedd4 and sgk remains unclear. nedd4-like proteins interact with enac via their ww-domains. 
these domains bind py-motifs (ppxy) present in enac subunits. nedd4 and nedd4-2 both have four highly homologous ww-domains. previous studies have shown that interaction between nedd4 and enac is mainly mediated by ww-domain 3. sgk also has a py-motif; therefore we tested whether the ww domains of nedd4 and nedd4-2 mediate binding to sgk. we show that single or tandem ww domains of nedd4 and nedd4-2 mediate binding to sgk and that, despite their high homology, different ww domains of nedd4 and nedd4-2 are involved. our data also suggest that ww domains 2 and 3 of nedd4-2 mediate the interaction with sgk in a concerted manner, and that in vitro the phosphorylation of sgk at serine residue 422 increases its affinity for the ww domains of nedd4-2. the stimulatory effect of sgk on enac activity is partly mediated via nedd4-2 and will decrease if competition between nedd4 and nedd4-2 for binding to sgk occurs. we show that nedd4 and nedd4-2 are located in the same subcellular compartment and that they compete for binding to sgk in vitro. the concerted or successive action of proteolytic enzymes has been described in a number of important biological processes in which proteins are degraded or matured, such as digestion, turnover (lysosomal, proteasomal...), blood coagulation, developmental remodeling or apoptosis, among others. the complementary action of proteases belonging to different families to achieve a more efficient or a better modulated hydrolytic mechanism is well documented. specific molecular associations or shared scaffolds between the involved proteases and/or protein inhibitors, and defined three-dimensional structures, have also been reported. however, only in a few cases have such structures involved metallocarboxypeptidases or their inhibitors [1]. we shall review this subject and describe, in this context, a new model found in a marine invertebrate organism in which such an association takes place. 
in particular, the characteristics of a novel bifunctional molecule displaying the functionalities and structures of serine- and metallocarboxypeptidases will be presented. its structure is entirely different from the ones previously reported by us and collaborating groups for metallocarboxypeptidase inhibitors [2] [3] [4]. regulating the activity of herpes virus proteases c. s. craik departments of pharmaceutical chemistry, pharmacology, and biochemistry and biophysics, ucsf, san francisco, ca, usa. e-mail: craik@cgl.ucsf.edu herpesviral proteases exist in a monomer-dimer equilibrium in solution. dimerization is required for activity, and a conformational change communicates the oligomerization state of the enzyme to the active site of each intact monomer. each monomer has an active site, which is spatially separate from the dimer interface. kaposi's sarcoma-associated herpesvirus (kshv) encodes a protease (kshv pr), which is necessary for the viral lytic cycle. like those of other herpesvirus proteases, the dimer interface of kshv pr is composed primarily of a helix near the c terminus of the protein. the helix of one monomer interacts with residues in the symmetrically related helix of the other monomer across the dimer interface as well as with neighboring helices. small molecule inhibitors, site directed mutagenesis and 2d nmr spectroscopy were used to compare the monomeric and dimeric forms of kshv pr and to investigate the relationship of the active site and the dimer interface of the enzyme. active site inhibition was shown to strongly regulate the binding affinity of the monomer-dimer equilibrium of the protease, shifting the equilibrium completely to the dimeric form of the enzyme. 
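the monomer-dimer equilibrium described above can be made concrete with a minimal sketch. it assumes a simple two-state model, 2m ⇌ d, with dissociation constant kd = [m]²/[d] and mass balance c_total = [m] + 2[d]; the function name `dimer_fraction` and the numerical kd and concentration values are hypothetical and are not taken from the abstract. the sketch only illustrates how a tighter kd, as would result from active-site inhibition, shifts the population toward the dimer.

```python
import math


def dimer_fraction(c_total, kd):
    """Fraction of protein chains present as dimer for 2M <-> D.

    Model assumptions (illustrative only):
        kd      = [M]**2 / [D]      (dissociation constant)
        c_total = [M] + 2 * [D]     (total chains, same units as kd)
    Substituting [D] = [M]**2 / kd gives the quadratic
        (2 / kd) * [M]**2 + [M] - c_total = 0,
    whose positive root is used below.
    """
    if c_total <= 0 or kd <= 0:
        raise ValueError("c_total and kd must be positive")
    monomer = kd * (math.sqrt(1.0 + 8.0 * c_total / kd) - 1.0) / 4.0
    return 1.0 - monomer / c_total


if __name__ == "__main__":
    c_total = 1.0  # hypothetical total protease concentration (arbitrary units)
    for kd in (10.0, 1.0, 0.1, 0.001):  # hypothetical dissociation constants
        print(f"kd = {kd:g}: dimer fraction = {dimer_fraction(c_total, kd):.3f}")
```

in this model the dimer fraction depends only on the ratio c_total/kd, so lowering kd at fixed concentration has the same effect as concentrating the enzyme.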
a previously undetermined conformational change provided insight into the regulation of protease activity by dimerization, as well as an explanation for the weak dimerization of a family of enzymes with a disparately large dimer interface compared to their measured binding affinities. using this information as a guide, protein grafting of the interfacial helix onto a small stable protein, avian pancreatic polypeptide, generated a small macromolecular inhibitor that successfully disrupted the dimer interface and inhibited enzymatic activity. these results provide direct evidence that peptide bond hydrolysis is integrally linked to the quaternary structure of the enzyme, validate the protease as a therapeutic target and suggest the dimer interface may be an alternative site for antiviral design. abteilung strukturforschung, max-planck-institut für biochemie, martinsried, germany. e-mail: huber@biochem.mpg.de proteolytic enzymes catalyze a very simple chemical reaction, the hydrolytic cleavage of a peptide bond. nevertheless they constitute one of the most diverse and numerous lineages of proteins. the reason lies in their role as components of many regulatory physiological cascades in all organisms. to serve this purpose and to avoid unwanted destructive action, proteolytic activity must be strictly controlled. control is based on different mechanisms, which i will discuss and illustrate with examples of systems and structures determined in my laboratory. the family of serine protease inhibitors known as the serpins is represented in all branches of life and predominates in the higher organisms, including man. they have evolved an extraordinary mechanism to inhibit proteases which distinguishes them from the 20 other families of serine protease inhibitors, and renders them uniquely qualified to control the proteolytic pathways essential to life. 
the mechanism is best described as a spring-loaded mousetrap, where nibbling of the peptide loop bait springs the trap and crushes the unsuspecting protease. as with a mousetrap, the active state of a serpin is metastable, and the energy released upon conversion to its more stable form is used to trap the protease. the complexity of the serpin mechanism provides many advantages over the simpler lock-and-key type mechanism utilized by all other serine protease inhibitor families. serpins provide stoichiometric, irreversible inhibition, and the dependence on serpin and protease conformational change is exploited for signaling and clearance. the potential for regulation is also an inherent part of such a complex mechanism, as illustrated by the heparin activation of the serpins antithrombin and heparin cofactor ii. however, with complexity of mechanism also comes susceptibility to disease-causing mutations: both through loss-of-function, as with thrombosis caused by antithrombin deficiency; and gain-of-function, as with dementia caused by neuroserpin polymerization. many crystallographic structures of serpins have been solved over the past 20 years, and we now have a frame-by-frame cinematic view of the intricate conformational rearrangements involved in protease inhibition, modulation of specificity, and molecular pathology of the remarkable shape-shifting serpins. structural lessons of serine proteases: function and mechanism of the serine protease-like hgf as a growth factor in met signaling hepatocyte growth factor (hgf), a plasminogen-related growth factor, is the ligand for met, a receptor tyrosine kinase implicated in development, tissue regeneration and invasive tumor growth. hgf acquires signaling activity only upon proteolytic cleavage of single-chain hgf into its a/b-heterodimer, similar to zymogen activation of structurally related serine proteases. although both chains are required for activation, only the a-chain binds met with high affinity. 
recently, we reported that the protease-like hgf b-chain binds to met with low affinity. this suggests that additional allosterically linked regions may be involved in the signaling process. furthermore, antibodies directed toward the b-chain or the hgf a-chain result in inhibition of met phosphorylation in a549 cells. these antibodies also inhibit proliferation in bxpc3 cells and baf3 cells. implications for dimerization mechanisms of hgf-dependent met receptor activation and signaling are presented. in addition, mutagenesis of the hgf b active site region has been investigated with respect to imparting enzymatic activity. thus while hgf has the function of a growth factor, the structural and receptor binding aspects of hgf are more akin to those of serine proteases. trypsinogen 4 with a 28 amino acid leader peptide on its n-terminus is the predominant form of the enzyme in human brain gene prss3 on chromosome 9 of the human genome encodes, due to alternative splicing, both mesotrypsinogen and trypsinogen 4. mesotrypsinogen has long been known as a minor component of trypsinogens expressed in human pancreas, while the mrna for trypsinogen 4 has recently been identified in brain and other human tissues. analysis of the gene encoding trypsinogen 4 predicted two isoforms of the zymogen: isoform a may have a 72 amino acid and isoform b a 28 amino acid n-terminal leader sequence. the translation initiation site for isoform a is an atg codon, while the initiation site predicted for isoform b is a ctg codon. we measured the amounts of trypsinogen 4 mrna and protein in 17 selected areas of the human brain. trypsinogen 4 could be localized in glial and neuronal cells using immunohistochemical methods. we purified human trypsinogen 4 by affinity chromatography. our results show that splice isoform b is the predominant if not the exclusive form of the zymogen in human brain. 
the n-terminal residue of the isolated protein was identified by amino acid sequencing as a leucine. at the same time the longest mrna we were able to isolate was barely longer than the one corresponding to splice isoform b. although the most trivial explanation of our results is that isoform a is proteolytically processed to result in isoform b, it cannot be excluded that leucine rather than methionine is used as the translation initiator amino acid. search for endogenous substrates for prolyl oligopeptidase in porcine brain prolyl oligopeptidase (po) is a serine protease present in most tissues, which preferentially cleaves the peptide bond at the carboxyl side of proline residues. the function of po is unknown, but it has been associated with several disorders of the central nervous system, such as depression and alzheimer's disease. the purpose was to look for endogenous substrates for the recombinant porcine po in porcine brain. we adapted a method to extract the proteins from the brain with special attention to the smaller polypeptides since po is not known to cleave peptides larger than 30 amino acids. subsequently we looked for a method to separate the protein mixture into less complex fractions. 2d gel electrophoresis, commonly used in proteomics, is only suitable for proteins with a molecular weight between 10 and 200 kda and an isoelectric point between 4 and 10. two-dimensional chromatography offers a suitable alternative for small peptides. we chose ion exchange chromatography as a first and reversed phase high pressure liquid chromatography as a second step. the resulting fractions were divided into two parts. one part was incubated with the purified po, the other served as a control. by looking for shifts in the mass spectrum between the control sample and the incubated sample, we identified peptides cleaved by po. different methods, such as esi-qtof-ms and maldi-toftof-ms, were used to sequence cleaved peptides by msms. 
these experiments allowed us to deduce the sequence requirements for po cleavage. serine protease subtilisin immobilized on novel mesoporous materials serine proteinases are widely used in protein mapping and peptide or ester bond formation. fixation of an enzyme on a solid support has many advantages, such as high stability, the possibility of recovery and low product contamination by the enzyme. subtilisin carlsberg, a protease from bacillus licheniformis, was immobilized on mesoporous silica (sba-15) and several organosilica supports via physical adsorption. the bifunctional mesoporous organosilicas containing ch2-ch2 or ch=ch bridges in combination with organic tethers bearing amino or hydroxyl functionalities were synthesized using supramolecular templating in the presence of non-ionic triblock copolymers and exhibited high surface area and large pore diameters in the range of 50-70 å suitable for the incorporation of subtilisin. the kinetics of immobilization was examined for six different carriers. it was shown that the enzyme retained hydrolytic activity after immobilization. the dependence of subtilisin loading on the starting concentration of the enzyme during adsorption shows the maximum loading (455 mg protein/g support) at [e] = 20 mg/ml. the ph dependences of loading and activity of the immobilized biocatalysts were bell-shaped. for the organosilica support containing amino and hydroxyl groups the ph dependence was shifted toward alkaline ph by 2 units in comparison with the support containing ch2-ch2 bridges. the adsorbed subtilisin desorbs easily in aqueous media, while no leaching of the enzyme was observed in acetonitrile or in a dmf/acetonitrile mixture (6/4). the immobilized biocatalyst shows high hydrolytic activity after incubation in non-aqueous acetonitrile for 1 week and after 48 h incubation in 60% dmf/acetonitrile mixture. these data indicate a possible application of the obtained biocatalysts in low water media. 
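The abstract does not say how loading was measured. A common approach for physical adsorption is a depletion (mass-balance) calculation, sketched here as an assumption. The function name `adsorption_loading`, the residual concentration, the volume and the support mass are all hypothetical; they were picked only so that the result lands on the reported order of magnitude.

```python
def adsorption_loading(c0_mg_per_ml: float, c_eq_mg_per_ml: float,
                       volume_ml: float, support_g: float) -> float:
    """Loading (mg protein per g support) from a depletion mass balance.

    Assumes all protein that disappears from solution is adsorbed:
    q = (C0 - Ceq) * V / m.
    """
    if support_g <= 0 or volume_ml <= 0:
        raise ValueError("volume_ml and support_g must be positive")
    if c_eq_mg_per_ml > c0_mg_per_ml:
        raise ValueError("equilibrium concentration cannot exceed the initial one")
    adsorbed_mg = (c0_mg_per_ml - c_eq_mg_per_ml) * volume_ml
    return adsorbed_mg / support_g


if __name__ == "__main__":
    # Hypothetical experiment: 1 ml of 20 mg/ml enzyme over 20 mg of support,
    # with 10.9 mg/ml left in the supernatant after adsorption.
    q = adsorption_loading(20.0, 10.9, 1.0, 0.020)
    print(f"loading = {q:.0f} mg protein / g support")
```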
purification, structural and biological characterization of protease inhibitors from acacia plumosa seeds protease inhibitors have been used in many current medicines. therefore, there is considerable interest within the pharmaceutical industry in discovering new compounds and mechanisms of protease inhibition, since these investments have led, for example, to new anti-hiv therapeutic trials, treatments for coagulation diseases and trials of anti-carcinogenic drugs. serine protease inhibitors are found in all plant tissues, mostly in the seeds of the leguminosae subfamilies: mimosoideae, caesalpinoideae and papilionoideae. the genus acacia is one of the most important members of mimosoideae, and the presence of protease inhibitors in this genus has been described in only three species, none of which have been structurally characterized. in this context, we are studying three new protease inhibitors from a. plumosa seeds. the inhibitors were purified from a saline extract of triturated mature seeds and showed anticoagulant activity, serine protease inhibitory activity and action on the growth of phytopathogenic microorganisms in vitro. the purification steps included size exclusion chromatography on a superdex-75 column, equilibrated and eluted with pbs, and ion-exchange chromatography on a mono-s (hr 5/5) column, equilibrated with 50 mm sodium acetate buffer (ph 5.0) and eluted with the same buffer in a gradient of 0-0.5 m nacl. three fractions (eluted around 0.18, 0.22 and 0.33 m nacl) that showed anticoagulant activity and serine protease inhibition were separated and denoted apia, apib and apic. their apparent mws were around 20 kda by sds-page in the absence of reducing agents. in the presence of reducing agents they showed two bands, at 14-22 and 6-8 kda. the n-terminal sequences of the higher mw chains were tyafl (apia), kellvdne (apib) and telhdd (apic). 
the circular dichroism spectra of these inhibitors were very similar, presenting a maximum around 230 nm and a minimum at 202 nm, compatible with the presence of unordered and beta elements of secondary structure. their n-terminal sequences, cd spectra and two polypeptide chains linked by a covalent bond are compatible with kunitz type inhibitors. these inhibitors are probably three different isoforms that show different degrees of inhibition specificity within the serine protease family. the ki values for different serine proteases (trypsin, plasma kallikrein, elastase, chymotrypsin) and the specificity towards phytopathogenic fungi are being investigated. although the proteases were initially described as enzymes involved in the non-specific degradation of dietary proteins, today it is known that they can also act as highly specific enzymes that perform selective cleavage of specific substrates. thus, alterations in the structure, regulation or function of this type of enzyme underlie serious human disorders including cancer. to date, more than 550 proteases and protease homologs are annotated in the human, mouse and rat genomes (www.uniovi.es/degradome). the increasing complexity of the proteolytic systems has led to the introduction of global concepts such as the term degradome to define the complete set of proteases that are produced at a specific moment by a cell, tissue or organism. as part of our studies focused on the characterization of the mammalian degradomes, we have identified and cloned unusual mosaic proteases containing serine protease domains in tandem. the first, called polyserase-1, is synthesized as a transmembrane protein that undergoes post-translational events to generate three independent serine protease domains. the second polyprotease is polyserase-2, a secreted protein that remains as an integral part of the initial protein product. 
to date, it is difficult to understand the putative functional advantages derived from the complex polyproteases and, albeit extremely unusual, it is not an unprecedented situation. thus, the amphibian ovochymase and oviductin are polyserine proteases that contain three serine protease domains in tandem. in humans, angiotensin-converting enzyme and carboxypeptidase d are polymetalloproteases that exhibit some similarities to the polyserases. all these polyproteases constitute examples that illustrate an additional strategy for increasing the complexity of the degradomes. evolution of a genetic locus, expressing several protease inhibitors with homology to whey acidic protein (wap) a. clauss and å. lundwall department of laboratory medicine, lund university, malmö, sweden. e-mail: adam.clauss@klkemi.mas.lu.se we have previously described a locus on human chromosome 20 that gives rise to 14 proteins containing wap four disulphide core (wfdc) domains. among them are the elastase inhibitors elafin and secretory leukocyte proteinase inhibitor (slpi). both slpi and elafin are also known to be important components of the innate immune defence by displaying anti-microbial properties. in order to gain a deeper understanding of the biological role of the locus, we have now extended our investigations of its organization and evolution into non-human mammals. homologous loci were identified on mouse chromosome 2, rat chromosome 3 and dog chromosome 24. transcript sequences were generated by race technology or retrieved from the est databases. as in humans, the murine and canine loci are divided into two sub-loci separated by approximately 200 kb. the majority of genes are conserved in all species, but the comparison also showed gain and loss of genes, e.g. two human pseudogenes were identified due to the discovery of functional rodent genes, and in the rat several duplications have yielded four slpi genes. a most interesting finding was that there is no murine elafin gene. 
the different wfdc domains showed highly variable conservation between species. this was particularly striking in proteins containing multiple domains, where the aminoterminal wfdc generally displayed low conservation, whereas the opposite was true for the carboxyterminal wfdc. the difference could be due to the potential targets of the inhibitors, which might be either highly variable exogenous microbial proteases or conserved endogenous proteases. signaling mechanism of thrombin-induced human gingival fibroblast contraction thrombin is activated during gingival tissue injury and inflammation. thrombin and other bacterial proteases also affect the functions of adjacent periodontal cells via stimulation of protease-activated receptors (pars). we noted that thrombin and par-1 agonist peptide (20 µm) induced contraction of gingival fibroblast (gf)-populated collagen gels within 2 h of exposure. however, par-3 and par-4 agonist peptides (<20 µm) showed little effect on collagen gel contraction. u73122 (phospholipase c inhibitor) and 2-apb (ip3 antagonist) were effective in inhibiting gf contraction. thrombin-induced gf contraction was inhibited by 5 mm egta (an extracellular calcium chelator) and verapamil (an l-type calcium channel blocker). in addition, w7 (10 and 25 µm, a calcium/calmodulin inhibitor), ml-7 (50 µm, a myosin light chain kinase, mlck, inhibitor), and ha1077 (100 µm, a rho kinase inhibitor) completely inhibited the thrombin-induced collagen gel contraction. thrombin also induced the phosphorylation of erk1/erk2 in gf. however, u0126 only partially inhibited the thrombin-induced gf contraction. similarly, wortmannin (100 µm) and ly294002 (20 µm) (two pi3k inhibitors) and genistein also showed partial inhibition. moreover, nac was not able to suppress gf contraction, consistent with the finding that thrombin only slightly decreased reactive oxygen species production in gf. 
these results indicate that thrombin is crucial in periodontal inflammation and wound healing by promoting gf contraction. this event is mainly mediated via par-1 activation, plc activation, extracellular calcium influx via l-type calcium channels, and the calcium/calmodulin-mlck and rho kinase activation pathways. survival of the anticarcinogenic bowman-birk inhibitor from soybean at the terminal ileum of cannulated pigs plant protease inhibitors (pi) of the bowman-birk class, a major pi class in legume seeds, have emerged as highly promising cancer chemopreventive agents, being capable of preventing or suppressing carcinogenic processes in a wide variety of in vitro and in vivo animal model systems. in order to exert their chemopreventive properties in vivo, plant pi have to resist and survive, at least to some extent, degradation by acidic conditions and digestive enzymes during gut passage. in this study, we have evaluated the survival rate of the bowman-birk inhibitor (bbi) in the terminal ileum of cannulated pigs fed defatted soybean. two different quantitative approaches have been used. firstly, a competitive indirect elisa using an antiserum capable of detecting bbi free and/or in complex with digestive proteases; secondly, spectrophotometric measurements of trypsin and chymotrypsin inhibitory activities in ileal samples, where the presence of bbi metabolites and/or single active loops can be detected. according to the elisa method, ileal apparent digestibility of bbi was 58 %, which resulted in a recovery of 0.61 mg out of 1.5 mg/kg feed ingested. significantly higher ileal digestibility values (95 %) were found when trypsin and chymotrypsin inhibitor activities were evaluated. 
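The relation between the digestibility and recovery figures quoted above can be checked with a short calculation. This sketch assumes the simplest definition of apparent digestibility, (ingested - recovered) / ingested, with both amounts expressed per kg of feed; the study may have used a marker-corrected formula, which the abstract does not state. The function name `apparent_digestibility` is hypothetical.

```python
def apparent_digestibility(ingested_mg: float, recovered_mg: float) -> float:
    """Apparent digestibility in percent: share of the ingested amount
    that is no longer recovered at the terminal ileum."""
    if ingested_mg <= 0:
        raise ValueError("ingested_mg must be positive")
    return 100.0 * (ingested_mg - recovered_mg) / ingested_mg


if __name__ == "__main__":
    # Values quoted in the abstract for the elisa method.
    elisa = apparent_digestibility(1.5, 0.61)
    print(f"elisa-based digestibility: {elisa:.1f} %")

    # Recovery implied by the 95 % digestibility of the activity assays.
    implied_recovery = 1.5 * (1.0 - 0.95)
    print(f"activity-based recovery: {implied_recovery:.3f} mg per 1.5 mg ingested")
```

Under this simple definition the elisa figures give about 59 %, close to but not identical with the reported 58 %; the small gap may reflect rounding or a different calculation in the original work.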
the results suggest that the immunoassay may be overestimating the presence of functional pi by detection of inactive bbi, but also that the presence of bbi complexed with digestive proteases, even if protein extraction was carried out under acidic conditions, could make bbi undetectable in activity assays. studies are in progress to overcome these drawbacks. the resistance of bbi to the acidic conditions and digestive enzymes of the upper gastrointestinal tract makes these proteins very interesting candidates for evaluation as chemopreventive agents, in modulating cell viability and tumor progression within the gastrointestinal tract. a single amino acid change in a chymotrypsin prevents plant proteinase inhibitor binding plants have evolved economical strategies to combat insects, which on the one hand involve the production of multi-domain pis that can target multiple enzymes with different specificities and on the other, pis that belong to structurally distinct families. solanaceous plants produce both type i and type ii families of pis, which specifically target serine peptidases. this study showed that type i pis are better inhibitors of a particular class of chymotrypsins within the gut of helicoverpa species that is otherwise unaffected by the type ii class of inhibitors. homology models were used to identify a single amino acid substitution in the helicoverpa chymotrypsin that was likely to confer resistance to the type ii inhibitor. our hypothesis was further supported by recombinant expression and mutagenesis of this single amino acid in the type ii inhibitor-resistant chymotrypsin. we therefore propose that both type i and ii inhibitors are required to protect plants against lepidopteran insects. mobility of the protamine sulphate/low molecular weight heparin complexes in an electrical field the glycosaminoglycan low molecular weight heparin (lmwh) activates plasma serine protease inhibitors. 
serine proteases play an important role in thrombogenesis, the process that leads to blood clotting and to conditions such as heart attack, stroke and other cardiovascular disorders. lmwh is used to render the blood temporarily incoagulable during prophylaxis or treatment of thrombosis, but it sometimes causes serious bleeding, and protamine sulphate is used to neutralize the anticoagulant activity of heparin. we investigated the relationship between the anticoagulant activities of new lmwh-sk derivatives (generated through the controlled cleavage of porcine intestinal mucosa heparin with a chitinolytic enzyme complex from streptomyces kurssanovii) and the mobility of their complexes with protamine sulphate in an electrical field. for this purpose we used biospecific electrophoresis in 1% agarose containing protamine sulphate. precipitation zones (zones of equivalence) in the form of ''rockets'' were generated. the scanned image was saved in jpg format, and the ''rocket'' areas were estimated with the bandscan program. results: lmwh-sk with molecular masses (mm) of 14.0, 5.8, 5.4, 4.7, 4.0 and 3.4 kd showed antithrombin activities (aiia) of 85-264 iu/mg and activities against factor xa (axa) of 100-278 iu/mg, with axa/aiia ratios of 0.8-2.2. correlation coefficients between mm and precipitation zone heights or areas were 0.56-0.73 (p < 0.05), and those between axa activity and precipitation zone heights or areas were 0.37-0.54 (p < 0.05). conclusion: lmwh-sk obtained by hydrolysis with the chitinolytic complex has an axa/aiia ratio of 2.2, which is necessary for antithrombotic preparations. as mm decreases, axa activity increases and the precipitation zone heights or areas of the lmwh-sk complexes with protamine sulphate decrease. the role of extracellular proteases in supplying filamentous fungi with nutrient compounds is well understood and experimentally documented. however, there is no definite answer to the question of the need for and role of these proteases in pathogenesis. 
the study of differences in the spectra of extracellular enzymes of saprotrophic and pathogenic fungi performed on fusarium species revealed that the activity of secreted serine proteinases of a pathogenic f. culmorum strain was much higher (up to 20-fold) than that of a saprotrophic strain. the use of f. culmorum strains differing in pathogenicity (strongly and weakly pathogenic) demonstrated that the activity of secreted serine proteases of the strongly pathogenic strain was significantly higher (1.5-8-fold) than that of the weakly pathogenic strain. this tendency was preserved when activity was calculated relative to protein content and to dry weight of mycelium, indicating purposeful synthesis and secretion of extracellular proteases by strains with high pathogenicity. moreover, these differences were much greater when the substrate for trypsin-like proteinases, bz-arg-pna, was used than with the substrate for subtilisin-like proteinases, glp-ala-ala-leu-pna. according to the data obtained, it is proposed that the level of activity of trypsin-like proteinases secreted by the fungi correlates with the degree of their pathogenicity and apparently plays an important role in pathogenesis. acknowledgment: this work was supported by grants from the russian foundation for basic research. conformational adaptation of a canonical protease inhibitor upon its binding to the target protease increases specificity the atomic resolution crystal structure of sgti in complex with crayfish trypsin provided further data on the molecular basis of the inhibition mechanism of pacifastin type inhibitors. in complex with crayfish trypsin, sgti exhibits more or less continuous contacts in an extended region (through sites p12-p5') of the molecule. the comparison of this complex with a simulated bovine trypsin-sgti complex shows that more than half of the interaction energy surplus originates from the extended region of binding. 
some of these contacts result from a conformational change of sgti that was induced by its binding to the enzyme, which is strongly supported by the critical comparison of the crystal structure of the crayfish trypsin-sgti complex with the free form of sgti. alignment of the nmr structure ensemble with the x-ray structure of complexed sgti and a careful comparison of the backbone φ, ψ angles were carried out. additionally, noe-derived restraints and corresponding distances in the complex were also compared. the local conformation of both the p12-p4 and p4'-p5' regions of the inhibitor shows significant changes upon binding, suggesting that either or both of these regions may act as molecular recognition sites. this comprehensive analysis of the local backbone properties of sgti in the free and in the complexed form made it possible to identify conformational similarities and differences responsible for its efficient binding to the enzyme, and provides a good basis for further studying the structural aspects of protease inhibitor specificity. like most serine proteases, the enteropeptidase light chain contains four disulfide bonds and one unpaired cysteine at position 122 (chymotrypsinogen-derived residue numbering), which forms a disulfide bond linking the pro- and catalytic domains. a cys122ser mutant of the human enteropeptidase light chain was constructed by site-directed mutagenesis. the recombinant wild type and mutant proteins were produced in escherichia coli bl21(de3) with the expression vector pet-32a. the active proteins were obtained after solubilization and renaturation of the fusion protein thioredoxin/human enteropeptidase light chain from inclusion bodies. after autocatalytic cleavage of thioredoxin the active enzyme was purified on agarose-linked soybean trypsin inhibitor. the yield of refolded active enzyme increased from 1.87 to 7.84% in the case of the cys122ser mutant. 
the wild type and c122s mutant showed similar kinetic parameters for cleavage of the small synthetic substrate gly-asp-asp-asp-asp-lys-naphthylamide, the small ester thiobenzyl benzyloxycarbonyl-l-lysinate (z-lys-sbzl) and a fusion protein. both enzymes were inhibited by inhibitors of trypsin-like serine proteases but not by inhibitors of chymotrypsin-like, cysteine- or metallo-proteinases. recombinant human enteropeptidase light chain and its mutant c122s were active between ph 6 and 9 with a broad optimum at about ph 7.54 and demonstrated quite high stability to different denaturing agents. both enzymes demonstrated secondary specificity for the chromogenic substrate z-ala-phe-arg-na with km = 0.067 mm, kcat = 23 s-1. proteinaceous low molecular serine protease inhibitors from wood rotting fungi k. j. grzywnowicz and j. zuchowski biochemistry department, maria curie-sklodowska university, lublin, poland. e-mail: grzyw@hermes.umcs.lublin.pl proteolytic enzymes have been firmly established as main regulatory components in a number of cellular and physiological processes. the most important factors influencing the proteolytic enzymes are natural, proteinaceous protease inhibitors, which form complexes with target proteases. they have been extensively investigated with respect to their physiological functions, as tools for protease enzymology, as models for protein-protein interactions and for potential medical applications. there is growing interest in new inhibitors of proteases from various sources. among known protease inhibitors from fungi are yeast inhibitors of proteinases a (aspartic protease) and b (serine protease), and low molecular inhibitors of serine proteinases from fruiting bodies of the mushrooms pleurotus ostreatus and lentinus edodes, as well as some undefined proteinase inhibitory activities from water extracts of some species of basidiomycetes. 
searching for new, bioactive metabolites of basidiomycetous fungi, we recently isolated and characterized some low molecular, proteinaceous, natural inhibitors of serine proteases from mycelia of the wood rotting fungi trametes versicolor, abortiporus biennis and schizophyllum commune. isolation of the inhibitors was achieved by ion exchange and size exclusion chromatography. their inhibitory activity (against some serine proteases), ph and temperature optima of action, and molecular mass were characterized in a preliminary manner by classical methods. analysis of n-terminal amino acid sequences of these inhibitors suggests a new family of serine protease inhibitors from fungi. more detailed characterization of the inhibitors (including molecular modeling) and preliminary experiments with laboratory animals and with human cell lines are in progress. the role of serine proteases in the lectin pathway of complement activation p. gál 1 , g. ambrus 1 , v. harmat 2 , b. végh 1 , g. náray-szabó 2 , r. b. sim 3 and p. závodszky 1 1 institute of enzymology, hungarian academy of sciences, budapest, hungary, 2 protein modeling group, hungarian academy of sciences, budapest, hungary, 3 department of biochemistry, university of oxford, oxford, uk. e-mail: gal@enzim.hu the complement system is a cascade of serine proteases, and mediates essential functions during infection as a part of the innate immunity. activation of the complement system culminates in the destruction and clearance of invading microorganisms and damaged or altered host cells. our view of the complement system has changed considerably in recent years, due to the discovery of a new activation pathway of complement: the lectin pathway. we have recombinantly expressed and characterized the mannose-binding lectin associated serine proteases masp-1 and masp-2. these are related mosaic serine proteases with similar domain organization but with different enzymatic properties. 
we showed that masp-2 is capable of autoactivation and it can cleave c2 and c4 complement subcomponents. masp-2, therefore, can initiate the complement cascade without the contribution of any other protease. we demonstrated that the complement control protein (ccp) modules, which associate directly with the serine protease domain, stabilize the structure of the catalytic region of masp-2 and contain exosites for the large protein substrates. these results are in agreement with the crystal structures of activated and zymogen forms of masp-2. masp-1 is the most abundant mbl-associated serine protease but it cannot activate the complement system. we demonstrated that masp-1 has a more relaxed substrate specificity compared to masp-2 and the activity of both proteases can be blocked by c1-inhibitor. we concluded that the two mbl-associated serine proteases participate in evolutionarily and functionally different pathways. comparative kinetic study on s2' trypsin variants l. gombos, j. tóth, p. medveczky, a. málnási csizmadia and l. szilágyi laboratory of enzymology, department of biochemistry, eötvös loránd university, budapest, hungary. e-mail: gl@ludens.elte.hu the vast majority of serine proteases have a glycine in position 193, which is part of the s2' subsite (the second subsite on the enzyme surface c-terminal from the scissile bond of the substrate). in contrast, human trypsin 4, the trypsin isoform expressed in human brain, possesses an arginine in that position. the bulky side chain of this amino acid is responsible for the inhibitor resistance, the most striking feature of this isoform, as it interferes with the binding of polypeptide inhibitors to the enzyme surface. a chimpanzee trypsin also has an arginine 193, while rat trypsin v bears a tyrosine in that position. there is also a snake venom plasminogen activator, a trypsin type serine protease, that contains an s2' phenylalanine. 
we created glycine, arginine, tyrosine and phenylalanine s2' variants of human and rat trypsins by site directed mutagenesis in order to investigate the effect of these amino acids on the kinetic behaviour. on small chromogenic substrates and synthetic inhibitors, which do not interact with the s2' residue, there is no significant difference between the various mutants in catalytic efficiency and inhibitory constants, respectively. however, on oligopeptide substrates the catalytic efficiency decreases 20-50-fold in the nonglycine variants. this effect is even more dramatic with polypeptide partners: the catalytic efficiency drops 200-500-fold while inhibitory constants increase by 3-5 orders of magnitude. we conclude that the catalytic mechanism is not fundamentally influenced by the substitution of residue 193, although this amino acid is part of the oxyanion hole. bulky residues in the s2' subsite mainly hinder the binding of interaction partners. structural studies on masp-2: towards the understanding of the mechanism of autoactivation mannose-binding lectin-associated serine protease 2 (masp-2) is the key enzyme of the lectin activation pathway of complement, a major element of innate immunity. a dimer of masp-2 complexed with mannose-binding lectin (mbl) is able to perform its biological functions: upon recognition of the pathogen by mbl, masp-2 undergoes autoactivation, and then initiates the complement cascade by cleaving c2 and c4. masp-2 is a mosaic protein containing a chymotrypsin-like serine protease domain (sp) and further domains with binding sites for mbl or substrates. our present study focuses on the structural background of the ability of the zymogen form of masp-2 to undergo autoactivation. we solved the structures of the catalytic fragment of masp-2 both in its zymogen and activated forms. 
comparison of the two structures reveals characteristic conformational differences in the classical activation domain and in some other loops lining the substrate binding region. loop 1 shows a unique conformation with arg192 blocking the s1 pocket. we docked the activation loop of masp-2 in the active site of the active enzyme and built a model of the complex of the active and zymogen forms. the model reveals extended regions of molecular recognition. while this model represents the second step of autoactivation (active form cleaves zymogen), the first step (zymogen cleaves zymogen) requires the stabilization of the zymogen enzyme in an active-like conformation. we built a model of a zymogen-zymogen complex. favorable and unfavorable contacts between the two zymogen molecules help us to identify possible molecular switches, as well as contact regions stabilizing an active-like conformation of the zymogen enzyme in the complex. the deg/htra proteases are atp-independent serine endopeptidases which are present in most organisms, including bacteria, humans and plants. previous work in our laboratory has shown that the deg2 protease of the model plant arabidopsis thaliana selectively degrades the photodamaged d1 protein in the reaction center of photosystem ii (psii) in vitro. therefore, deg2 is thought to catalyze the primary cleavage of photodamaged d1 protein, which is an important step of the repair mechanism that restores functional psii. our present studies aim to elucidate the regulation of the deg2 protease activity, especially with regard to its d1 degrading activity. we found deg2 associated with the stromal side of the thylakoid membranes and as a soluble protein in the chloroplast stroma. the amount and distribution of deg2 protein remained unchanged after exposure to different light intensities, which suggests either a substrate regulation or a posttranslational regulation of the d1 degrading activity of deg2. 
recent advances on deg2 regulation and complex formation will be presented. novel peptide inhibitors of human kallikrein 2 (hk2) human kallikrein 2 (hk2) is a serine protease produced by the secretory epithelial cells in the prostate. it activates several other proteases that may participate in the proteolytic cascade mediating metastasis of cancer. thus, modulation of hk2 activity is a potential way of preventing tumor growth and metastasis. furthermore, specific ligands for hk2 may be useful for targeting and imaging of prostate cancer. to screen phage display peptide libraries, we used enzymatically active recombinant hk2 captured by a monoclonal antibody that leaves the active site of the enzyme exposed. six different peptides binding to hk2 were identified using libraries expressing linear peptides 10 or 11 amino acids long. three of these peptides were specific and efficient inhibitors of the enzymatic activity of hk2. alanine substitution analysis revealed that motifs of 5-7 amino acids determined the inhibitory activity of the peptides. the peptides are also of potential utility for the development of immunopeptidometric assays for hk2, which is a promising marker for the diagnosis of prostate cancer. furthermore, these peptides are potentially useful for treatment and targeting of prostate cancer. the mechanism of autoactivation of the zymogen masp-2 residues on the surface of pathogens. we managed to recombinantly express and purify two forms of zymogen masp-2. one form is the wild type zymogen enzyme, which can be activated, while the other one is a stable zymogen mutant form of masp-2. we could prepare the zymogen form of wild type masp-2 under certain conditions which enabled us to examine the kinetics of activation. we demonstrated that activation of masp-2 is a true autocatalytic activation without the involvement of any other protease. we characterized the enzymatic properties of zymogen masp-2 using the stable zymogen form. 
we demonstrated that zymogen masp-2 cannot cleave small synthetic substrates but that it can cleave a large protein substrate (c4). a molecular model for the interaction between zymogen and activated masp-2 during activation has also been built based on the available 3d structures of zymogen and activated masp-2. influence of streptokinase on the fibrinolytic system proteins the present study investigates the effect of a protein of bacterial origin, streptokinase (sk), on the activity of fibrinolytic system proteins and on the mechanisms regulating their interactions. the study was carried out using the porcine haemostasis system, whose plasminogen is not activated by sk. we were especially interested in changes in fibrinolytic system parameters such as the activities of tissue-type plasminogen activator (t-pa), plasminogen activator inhibitor (pai-1), plasminogen and α2-antiplasmin. the main parameters of the coagulation system, such as the levels of fibrinogen, soluble fibrin and fibrin degradation products, and thrombin activity and quantity, were also studied. affinity chromatography, electrophoresis, western blotting, elisa and protein activity assays were used. plasminogen consumption was increased by 15% at 4 h after streptokinase injection. the activity and concentration of t-pa increased significantly, 3.5-fold, within 1 h. at later stages of the investigation these parameters tended to return to normal. after sk injection the quantity of pai-1 increased twofold (16.7 ng/ml compared to the normal 8.9 ng/ml). an interesting finding was the activation of prothrombin by sk without activation of the coagulation system in vivo. the injection of sk causes a significant increase in t-pa activity and quantity, possibly due to a direct and/or indirect effect on endothelial cells. we can conclude that sk causes pai-1 secretion through an effect on platelets, as 90% of pai-1 is stored in the α-granules of platelets. 
thus, analysis of the data revealed, besides the well-known function of sk, an influence of sk on the fibrinolytic system potential, possibly due to its effect on endothelial cells and platelets. paracrystalline inclusions in the mitochondrial matrix or intermembrane compartment occur in several biochemically unrelated disorders such as myopathies, paragangliomas and steatohepatitis, and also in various cell types under normal conditions. however, little is known about the composition of the inclusions, the mechanism of their formation and their relation to disease processes. in this study we describe helix-shaped structures in the intracristal compartments of rat liver mitochondria that have undergone ca2+-induced permeability transition. the filaments are anchored in opposing parts of the mitochondrial membranes and appear to support the cristae mechanically. a protein that apparently is a component of these helical filaments has been identified as the serine protease lactb. this protein shows close sequence similarity to the class c bacterial beta-lactamases and is the only member of this class in animals. since lactb has not been studied previously, we cloned its cdna for expression in e. coli as a c-terminal his-tagged fusion protein. lactb underwent proteolytic processing both in e. coli and in isolated mitochondria, resulting in several protein fragments. this is likely to be due to autocleavage and may be an activation/maturation process. 2d blue native gel electrophoresis indicated that lactb was part of a >600 kda protein supercomplex. in summary, the presence of the serine protease motif in lactb and its supposed ability to form helical filaments suggest that lactb might function not only as a component of a 'mitoskeleton' in maintaining and rearranging the mitochondrial ultrastructure under certain conditions, but might also take part in apoptotic processes. 
novel psychrophilic trypsin-type protease from serratia proteomaculans a proteinase with trypsin specificity from the psychrophilic microorganism serratia proteomaculans was partly purified. it was shown that the properties of this enzyme (temperature and ph stability, efficiency of substrate hydrolysis) correspond to its psychrophilic character. inhibitor analysis and the study of substrate specificity indicate that this enzyme is a serine trypsin-type protease. at the same time this enzyme is zinc-dependent. proteases of this type were unknown until now. the secondary specificity of the studied enzyme differs from that of bovine trypsin: this protease hydrolyses short substrates more efficiently. zinc, cadmium (ii) and copper (ii) ions at millimolar concentrations inhibit the enzyme activity. an unusual influence of calcium ions on substrate hydrolysis and on inhibition by the bovine pancreatic trypsin inhibitor (bpti) was observed for the studied enzyme. in vitro by a neutral to basic ph change [2, 3] . the kinetics of the activation process can be followed by stopped flow fluorescence (sff) experiments while the structural features of the transition can be explored by in silico molecular dynamics (md) and targeted molecular dynamics (tmd) [4] simulations. to challenge the activation process, mutants were constructed and studied by sff measurements. subsequently, multiple md/tmd simulations were carried out on these mutants. our results indicate the existence of parallel activation pathways. they demonstrate the absolute necessity of multiple simulations and of proper statistics. they reveal the pros and cons of the tmd method. a simple method for the purification of a novel serine endoprotease from wheat triticum aestivum (cv. giza 164) has been developed. it consists of ion-exchange chromatography and gel filtration. the molecular mass of the enzyme was 58 kda by sds/page under reducing conditions and 57 kda by gel filtration on a sepharose 6b column. 
the enzyme had an isoelectric point and a ph optimum at 4.2 and 4.5, respectively. the substrate specificity of the enzyme was studied using synthetic and natural substrates: azocasein, azoalbumin, hemoglobin, casein, gelatin and egg albumin. the enzyme appears to prefer azocasein, with a km of 2 mg azocasein/ml. the enzyme had a temperature optimum at 50°c with heat stability up to 40°c. while co 2+ and mg 2+ accelerated the enzyme activity by 54 and 56%, respectively, ca 2+ and ni 2+ had very little effect. the enzyme was strongly inhibited by phenylmethylsulphonyl fluoride (pmsf), but not by the other protease inhibitors, suggesting that the enzyme is a serine protease. from this characterization it can be concluded that the t. aestivum serine protease may be suitable for food processing. in vitro effects of a potent, selective dipeptidyl peptidase ii (dppii) inhibitor in leukocytes and u937-cells. the compound was able to penetrate the cell membrane and proved efficacious without evidence of acute cellular toxicity. there was a dose-dependent inhibition of intracellular dppii activity without any effect on dppiv activity (maximal efficacy at 100 nm). these properties make it possible to differentiate between dppii and dppiv in biological systems and allow further investigation of the physiological function of dppii. in a second step, we have been investigating the involvement of dppii in apoptosis in human leukocytes by using this compound. preliminary results based on annexin v-/pi-staining using up to 1 µm inhibitor in u937-cells and pbmc did not show signs of apoptosis while dppii activity was inhibited by 90%. the effect of calcium ions on the hydrolysis of peptide substrates of the general formula a-(asp/glu) n -lys(arg)-b, catalyzed by enteropeptidase (ec 3.4.21.9), differs depending on the substrate type. for specific enteropeptidase substrates (n = 4) calcium ions promote hydrolysis by the natural two-chain enteropeptidase. 
hydrolysis of atypical enteropeptidase substrates (n = 1-2) is as a rule less efficient; in addition, calcium ions are inhibitory in this case. therefore, undesirable side-hydrolysis during the processing of chimeric proteins by full-length enteropeptidase can be regulated by means of calcium ions. in contrast, the hydrolysis of substrates of all types (n = 1-4) by the enteropeptidase light chain, as well as by the enzyme containing a truncated heavy chain (466-800 or 784-800 fragments), is inhibited by calcium ions. hydrolysis of the natural enteropeptidase substrate, trypsinogen, is at least two orders of magnitude more efficient than the hydrolysis of any artificial substrate. we propose that this effect is caused by the participation of an additional secondary substrate binding site and/or a calcium-binding site in the coordination of trypsinogen with the enzyme; both sites are located on the n-terminal half (118-465) of the enteropeptidase heavy chain. one more mechanism of regulation of enteropeptidase activity by calcium ions is the unusual calcium-dependent autolysis of the enteropeptidase heavy chain, which leads to a drastic loss of its activity towards trypsinogen. autolysis of the enteropeptidase heavy chain was compared with the well-known autolysis of trypsin; the latter serves as a natural defense mechanism against the undesirable premature activation of proenzymes in the pancreas, which leads to pancreatitis. the corresponding enteropeptidase inactivation in a low ca 2+ ion environment might be a component of the same protective mechanism. b3-034p human trypsin 4 selectively cleaves myelin basic protein: is this brain protease involved in the pathomechanism of multiple sclerosis? demyelination, the breakdown of the major membrane protein of the central nervous system, myelin, is involved in many neurodegenerative diseases. proteases participating in this process are potential targets of therapy in neurodegenerative diseases. 
in the present in vitro study the proteolytic actions of calpain, human trypsin 1 and human trypsin 4 (the product of gene prss3) were compared on lipid-bound and free human myelin basic protein as substrates. only the digestions by calpain and human trypsin 4 may be of some physiological or pathological relevance, since these two are expressed in human brain. the fragments formed were identified by n-terminal amino acid sequencing and mass spectrometry. the analysis of the degradation products showed that, of these three proteases, human trypsin 4 cleaved myelin basic protein most specifically. it selectively cleaves the arg80-thr81 and arg98-thr99 peptide bonds in the lipid-bound form of human myelin basic protein. based on this information we synthesized region 94-104 of myelin basic protein, the peptide ivtprtpppsq, which contains the specific trypsin 4 cleavage site arg98-thr99. in vitro studies on the hydrolysis of this synthetic peptide by trypsin 4 confirmed our results with intact myelin basic protein. what lends some biological interest to the above finding is that the major autoantibodies found in patients with multiple sclerosis recognize sequence 80-96 of the protein. our results suggest that human trypsin 4 may be one of the candidate proteases involved in the pathomechanism of multiple sclerosis. enteropeptidase is a heterodimeric serine protease of the intestinal brush border that activates trypsinogen by highly specific cleavage of its activation peptide following the sequence asp-asp-asp-asp-lys. its light chain alone is sufficient for effective cleavage of fusion proteins containing an analog of the trypsinogen activation peptide. the human enzyme has a specificity coefficient 10-fold that of the bovine one, and an explanation of this fact could contribute a lot to attempts at improving or modulating its enzymatic properties. 
highly pure and active recombinant human enteropeptidase light chain (l-hep) was obtained by renaturation from inclusion bodies expressed in escherichia coli cells, and the active l-hep was purified on agarose-linked soybean trypsin inhibitor. the enzymatic activity of purified l-hep was studied through the cleavage of synthetic peptide substrates and several fusion proteins. l-hep associated with soybean trypsin inhibitor slowly, and z-lys-sbzl cleavage was inhibited with ki* = 2.3 nm. comparison of the inhibition of l-hep and bovine enteropeptidase by the bovine trypsin inhibitor aprotinin showed a difference of almost an order of magnitude in ki*. the ph dependence of the enzyme activity was measured and the ph optimum was found to be 7.54. the enteropeptidase light chain amino acid sequence and crystal structure were analyzed for the presence of target regions for mono- and bivalent ions. unlike trypsin, with its predicted and experimentally proven calcium-binding sites, and sodium-activated thrombin, l-hep was predicted to lack any such sites, and the influence of these ions on the cleavage of different substrates was found to be confined primarily to substrate binding. as a continuation of our efforts to fully elucidate the antisnake venom properties of mucuna pruriens and to further understand the molecular changes that occurred in the mouse plasma proteome as a result of an in vivo challenge test with venom and mucuna pruriens proteins (mpe), two-dimensional polyacrylamide gel electrophoresis was performed. plasma was pooled and gels were run in triplicate to eliminate both biological and experimental variations. analysis using imagemaster 2d platinum software and other statistical analysis tools showed significant differences in protein expression between all the treatments and the control group. some proteins were down-regulated, some were up-regulated and some disappeared completely, while new protein spots were identified. 
the protein expression in plasma of mice immunized with mpe for 3 weeks before challenge with a lethal dose of venom and of mice injected with venom alone was more complex. some venom proteins like ecarin are serine proteases that activate clotting factors like prothrombin, causing haemorrhage and disseminated intravascular coagulation. on the other hand, the protease inhibitors from mucuna pruriens must have acted to antagonize these effects by direct proteolysis (cleavage products/spots appearing in the protein map) or other immunological mechanisms. the results obtained represent the first proteomics approach to studying all the plasma proteins involved in this phenomenon. we have concentrated only on protein spots showing interesting variations with respect to the control. this is also an important step in the identification of the affected proteins and of the kind of modifications/molecular mechanisms involved, which are likely the basis of the in vivo protection the plant extract showed against the venom. the use of enzymes at low temperatures has great potential in terms of lower energy costs, therapeutic applications and reduced microbial contamination in industrial processes. low temperature proteases (cryophilic, or psychrophilic, proteases) are of particular interest for detergents and as wound debriding agents. at present, we are studying cryophilic proteases from antarctic krill (euphausia superba), which normally lives in the sea at temperatures near 0°c. we have isolated several low temperature proteases by chromatography. enzyme activities and stability were characterized at low temperatures and as a function of ph to find optimum conditions for different applications. one enzyme, named kt1, showed particularly high specific activity at 20°c, several times that of commercial preparations of proteases such as subtilisins. this protein showed a high degree of similarity to digestive trypsins isolated from various arthropod species. 
using mrna molecules obtained from abdominal sections of e. superba and subsequently subjected to a reverse-transcription reaction, we identified, isolated and sequenced a dna molecule that codes for an inactive zymogen of the enzyme. cloning of this dna sequence in escherichia coli strains allowed the recombinant expression of the zymogen, followed by purification and activation of the zymogen, which led to an active cryophilic trypsin. we performed a homology modeling procedure that yielded a molecular model of the mature enzyme. the 3d model thus obtained was refined using energy minimization, hydrogen network optimization and residue-residue contact optimization techniques, leading to a reliable model of the enzyme. we used this model to identify many interesting and novel features of the enzyme molecule that could be related to its cryophilic character, and to propose site-directed mutagenesis strategies that could be used to improve the enzyme's performance at low temperatures, its ph-activity profile, specificity, inactivation resistance and recombinant expression. in addition, the 3d model allowed us to design and experimentally obtain mutants that are resistant to auto-degradation and more readily activated. molecular cloning and expression of lactb, a mitochondrial serine protease mitochondria are thought to have originated from a symbiotic relationship between a bacterium able to perform aerobic metabolism and the ancestor of eukaryotic cells. lactb is the only mammalian protein showing sequence similarity to bacterial serine proteases and belongs to the class c β-lactamases. mouse lactb is 551 amino acids long and comprises a predicted mitochondrial import sequence, a short putative transmembrane segment, a β-lactamase homology domain containing the serine protease motif, -sxxk-, and a c-terminal d-transpeptidase domain. the physiological role of mammalian lactb is unclear. 
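as an illustration of how the -sxxk- motif named in the domain description above can be located in a protein sequence, the following minimal sketch scans a sequence with a regular expression. the function name `find_sxxk` and the toy sequence are assumptions made for this example only; the toy sequence is not the lactb sequence, and the abstract does not say how the motif was identified.

```python
import re

# Hypothetical toy sequence for illustration only; it is NOT the LACTB sequence.
TOY_SEQUENCE = "MAGLSTVKRRPLSAAKQW"


def find_sxxk(sequence):
    """Return (1-based position, tetrapeptide) for every S-x-x-K occurrence."""
    # The lookahead lets overlapping occurrences (e.g. in "SSAKK") all be reported.
    pattern = re.compile(r"(?=(S..K))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    for position, motif in find_sxxk(TOY_SEQUENCE):
        print(position, motif)
```

a pattern this short occurs by chance in many proteins, so a hit is only a starting point and would need support from the surrounding sequence and structural context.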
therefore, the purpose of this research work was to clone the lactb gene for expression of lactb in e. coli for further biochemical and cell biological study. the full-length lactb gene was cloned into the entry plasmid pentr/sd/d-topo. expression clones were created by performing a recombination reaction between the entry clone and four destination vectors. expression constructs resulting in n- or c-terminal gst fusion proteins and in n- or c-terminal his6-tag fusion proteins were transformed into bl21 (de3) competent cells, which are designed for use with bacteriophage t7 promoter based expression systems. when lactb was expressed as an n-terminal gst fusion protein, full-length lactb protein was recovered by glutathione-agarose affinity chromatography. expression of lactb as a c-terminal gst fusion protein or with either an n- or c-terminal his6-tag resulted in proteolytic degradation of the protein and we were not able to detect full-length lactb. these results show that the n-terminal gst fragment protects lactb from proteolytic processing and that lactb can undergo autoproteolysis, which may be part of a physiological maturation or activation process. design and synthesis of retro-binding peptides active site inhibitors of thrombin thrombin is an important pharmaceutical target for the treatment and prevention of arterial and venous thrombosis. biologically active peptides are recognized to have significant therapeutic potential but serious limitations, especially for oral dosing. peptide stereomers can differ in forming productive complexes with an enzyme. moreover, the replacement of the l-amino acid residues forming the hydrolyzed p1-p'1 bond by their enantiomers is known to result in either an uncleavable or a very slowly hydrolyzed analogue. this phenomenon is often used for the synthesis of peptide inhibitors that are stable to degradation by the enzymes of the organism. 
as peptides containing d-amino acids are not subject to enzymatic hydrolysis, the purpose of this research was the synthesis of retro-d-analogues of thrombin substrates constructed from d-amino acids. the di- and tripeptides of the general formula x-d-arg-d-phe-ome [where x = z, tos, ac h, and z-d-arg-d-ala-(d), l-phe-ome (otbu)] were synthesized by conventional methods of peptide synthesis in solution. special features of their interaction with thrombin were investigated. their inhibitory action on the cleavage of fibrinogen by thrombin and on the hydrolysis of baee by thrombin showed that their inactivating action depends on the substituent at the n-terminus of the dipeptides and on the configuration of phenylalanine in the tripeptide molecule. the relationship between the structure and the inhibitory action of the synthesized peptides is discussed. the successful application of d-amino acids in designing biologically active peptide analogues as potential medicinal agents that are stable to enzymatic degradation is shown. substrate specificity of mannose lectin binding associated serine proteinase 3 n. s. quinsey and r. n. pike department of biochemistry and molecular biology, monash university, melbourne, victoria australia. e-mail: noelene.quinsey@med.monash.edu.au the innate complement system is involved in the neutralization of pathogenic microorganisms. it plays a role comparable to that of the classic immune complement cascade. in the innate complement system, the oligomers of mannose lectins are able to bind to microorganisms. these oligomers have been shown to have mannose lectin binding serine proteinases (masps) attached, which once activated lead to the activation of the c3 convertase complex, which finally leads to the formation of the membrane attack complex. three active masps have been identified in the human innate immune system: masp-1, masp-2 and masp-3. 
there is high homology between these three serine proteases especially in the n-ter serpins are protease inhibitors that present their reactive site loop (rsl) to target proteases, followed by drastic conformational changes that inactivate the protease. the sequence of the rsl of serpins determines the target specificity. the drosophila melanogaster gene spn4 encodes multiple serpin isoforms each containing an individual rsl, thus enabling the attack of different proteases. variant spn4a contains a consensus recognition/cleavage sequence of furin within its rsl and is equipped with a signal peptide and an endoplasmic reticulum (er) retrieval signal (hdel). this suggested that the protein resides in the secretory pathway, like furin, a proprotein convertase that activates many cellular proteins and pathogens. our experiments demonstrate that spn4a forms sds-stable complexes with human furin, which is inhibited with a second order rate constant of 5.5 × 10^6 /m/s. the rsl of spn4a is cleaved c-terminally to arg-arg-lys-arg, in accord with the enzyme's cleavage site. furthermore, the serpin is retained in the er of transfected cos7 cells as shown by immunofluorescence staining. an hdel deletion mutant was detected mainly in the medium of transfected cos7 cells, demonstrating the necessity of the hdel signal for the observed cellular localization. further experiments show that furin 1 and 2 of drosophila melanogaster are physiological targets for spn4a, since secreted forms of both enzymes form stable complexes with the serpin. together, the results demonstrate that spn4a is a potent inhibitor of furin that may meet the target at its natural location. experiments with the other rsl variants show that the spn4 gene represents a multipurpose weapon that is directed against different families of proteases. formation of the covalent tetrahedral complex (tc) with substrate is the first step of the catalytic process in the active site of serine proteases. 
his57 (chymotrypsin numbering) plays the role of a general base catalyst, activating the ser195 nucleophile by abstraction of its proton. it was experimentally observed that the pka of his57 nε in the tc formed by serine proteases with transition state analog inhibitors is about 5 units higher than the corresponding pka in the free enzyme. this work demonstrates that the environmental change of his57 in the tc, induced by substrate binding in the enzyme active site, is the dominant factor in the pka increase of his57 nε, and triggers the enzymatic processing of the substrate. these results are based on quantum mechanical modeling of the active site of free chymotrypsin and of the tc complex of chymotrypsin with a trifluoromethyl ketone inhibitor at the dft b3lyp/6-31+g** level of theory. the polar environment of the enzyme active site is accounted for explicitly in the microscopic model. the combined environmental effects of the bulk water solvation and the rest of the protein are implicitly accounted for by our scrf(vs) continuous solvation approach. the role of local polar effects, such as the oxyanion and the asp102-his57 hydrogen bond, on the pka of his57 nε in the tc is analyzed. genome-wide analysis of subtilase (subtilisin-like serine protease) genes in microbial genomes limited to regions surrounding the asp, his and ser catalytic residues. pattern-searching methods using hidden markov models, based on conserved sequences surrounding the catalytic residues, were used to search for subtilases encoded in >200 bacterial and archaeal genomes, representing 177 species. more than 350 subtilases were found to be encoded in 109 genomes. subtilases are more commonly found in gram-positive bacteria than in archaea or gram-negative bacteria, and it is more common to have multiple subtilase-encoding genes than a single gene. 
the majority of the subtilases have a predicted signal peptide for translocation across the cell membrane, and a sub-group of these secreted subtilases are predicted to have a carboxy-terminal cell-envelope anchor, mainly of the lpxtg type for covalent anchoring to peptidoglycan. the genomic context of the subtilase-encoding genes was analyzed to gain insight into putative functions for these proteolytic enzymes. by also taking into account the predicted intracellular or extracellular location of the encoded subtilases, it was possible to predict a function for many subtilases in either nutrition/growth, spore germination, surface protein processing/activation, bacteriocin/toxin processing, or sigma factor activation/regulation. poisoning by bothropic species produces a similar physiological picture; one of the systemic effects is blood coagulation, which the venom toxins cause through several mechanisms, such as direct action on fibrinogen, factor x activation or platelet activation. in recent years serine proteases have been identified in bothropic venoms. these toxins are responsible for the coagulant activity through direct action on fibrinogen. serine proteases are useful for studies of the hemostatic system and for therapeutic use. the search for new model molecules is very important for elucidating the mechanism of action and for finding the structural characteristics responsible for their activities. the present work aimed to purify and characterize a coagulant factor (cf) from b. pirajai venom. the purification used gel filtration, hydrophobic chromatography and affinity chromatography. gel filtration was performed on sephadex g-75 with ammonium bicarbonate buffer (ambic) 0.05 m ph 8.1, yielding four fractions (p1-p4); the coagulant fraction was named p1. 
the p1 fraction was subjected to phenyl sepharose chromatography using tris buffer 10 mm ph 8.6 in a decreasing gradient of nacl (4; 3; 2; 1; 0.5; 0 m), finishing with distilled water, which yielded six subfractions (fp1-fp6); the coagulant subfraction was named fp1. the fp1 subfraction was subjected to benzamidine sepharose chromatography and eluted with the following solutions: distilled water, giving subfraction bfp1; sodium phosphate buffer 20 mm ph 7.8, giving subfraction bfp2; and glycine buffer 20 mm ph 3.2, giving subfraction bfp3, which is the cf. the cf displayed a single band in sds-page (11%), indicating a pure protein; it has a molecular mass of 58 kda, its minimum coagulant dose is 1.75 µg and it acts on the fibrinogen beta chain. the genome of arabidopsis thaliana encodes 16 putative proteases from the deg/htra family. this group of atp-independent serine proteases has been well examined in other organisms, especially e. coli and humans, but only limited data are available for members of this protease family in plants. deg1 and deg2 have been shown to act as proteases in the chloroplast, but no deg/htra proteases from other compartments have been examined so far. the putative protease deg15 is predicted to be localized in the peroxisome. we cloned the gene encoding deg15 (at1g28320) in an overexpression vector for heterologous expression in e. coli. the tagged protein was purified by affinity chromatography and used to raise polyclonal antibodies. with these antibodies we investigated the intracellular localization of deg15 and the protein level under various stress conditions in order to evaluate the in planta function of this protein. the effect of site-directed mutagenesis on cold adaptation of vpr; a subtilisin-like serine proteinase from a psychrophilic vibrio-species psychrophilic enzymes have 3d structures very similar to those of their homologous enzymes from mesophilic and thermophilic organisms. 
the main characteristics of enzymes from psychrophiles are their high catalytic efficiency (kcat/km values) and thermolability. a subtilisin-like serine proteinase from a psychrophilic vibrio-species (vpr) shows these characteristics when compared to homologous enzymes from mesophilic and thermophilic organisms. the vpr gene was cloned, sequenced and expressed in e. coli and recently the crystal structure was determined at 1.84 å resolution [1]. structural comparisons have been carried out which have led to hypotheses about some of the structural factors which may contribute to cold adaptation of vpr. some of these hypotheses have been examined using site-directed mutagenesis. the specific residue exchanges were selected with the objective of incorporating into the cold-adapted enzyme stabilizing interactions that were deemed to be present in related thermostable homologues. these include incorporation of pro into loops, a new potential salt-bridge, as well as substitutions aimed at improving packing in the hydrophobic core and decreasing apolar exposed surface. we have also introduced ser to ala substitutions at three different locations in the cold-adapted enzyme, as these were the most frequent amino acid exchanges observed in sequence comparisons of the enzyme to those of more thermostable homologues. here we report on the catalytic and stability characteristics of the selected mutants.
engineering of gfp for the screening of serine protease inhibitors
site-specific proteolysis has been an attractive target for the development of antiviral therapies based on selective viral inhibitors. it has been previously demonstrated that reporter proteins like beta-galactosidase could be very useful for the high-throughput screening of hiv-1 protease inhibitors through the display of an accessible protease target site on the enzyme surface.
in this work, by using structural analysis, we have engineered the gfp protein from the jellyfish aequorea victoria to accommodate on its surface the hcv ns5a-5b protease cleavage site edvvccsmsytwtg, in such a way that proteolysis results in a decrease in fluorescence. the three resulting gfp constructs, carrying the protease cleavage site in positions 23-24, 102-103 and 172-173, were expressed in soluble form in escherichia coli. moreover, the hcv ns4 cofactor residues 21-34 fused in frame via a short linker to the amino terminus of the hcv ns3 protease domain (residues 2-181) were also expressed in e. coli and, under 1 mm iptg induction, at least 60% of soluble protein was recovered and further purified by means of a histidine tag. the analysis of gfp proteolysis by recombinant hcv protease was performed with both crude bacterial extracts and purified proteins. the results presented here indicate that proper solvent exposure of target sites on the gfp carrier protein may be a critical factor for protease cleavage and for the observation of a change in fluorescence, an aspect highly relevant to the further design and implementation of newer analytical tests.
various kinds of stressors cause the group of metabolic changes defined as the general stress response, initiated by intracellular signals such as production of abnormal or denatured proteins, enhanced generation of reactive oxygen species and others. proteolytic enzymes quickly modify proteins and as a consequence can regulate cellular metabolism. although stress defense mechanisms have been described very often in the recent literature, very few works have assessed the stress response abilities of white-rot basidiomycetes, which produce two kinds of very important ligninolytic enzymes, laccase and peroxidases.
our previous results showed that the addition of menadione to abortiporus biennis idiophasic cultures caused a significant increase in extracellular laccase activity in comparison to the control. the aim of this study was to determine the activities of serine proteinases and natural serine proteinase inhibitors in idiophasic cultures of the basidiomycete a. biennis grown under menadione-mediated oxidative stress conditions. we investigated the changes in intracellular serine proteinase activities in the presence and absence of atp, using hemoglobin and fluorogenic substrates. the level of natural serine proteinase inhibitors in mycelia was also measured. a fungal inhibitor of trypsin was partially purified and used for in vitro experiments. interesting correlations between serine proteinase, serine proteinase inhibitor and laccase activities in prooxidant-treated cultures were also observed. this suggests that proteolytic modifications under oxidative stress conditions can act as a means of regulating laccase activity. serine proteinase, inhibitory and laccase activities were additionally analyzed by native page.
calpain is a ca2+-regulated cytosolic cysteine protease, functioning as a ''modulator protease'', i.e. regulating/modifying functions/activities of substrates by limited proteolysis to modulate cellular functions. humans have 14 calpain genes, and potential substrates extend to various cytosolic proteins such as kinases, transcription factors, cytoskeletal and er proteins. in skeletal muscles, expression of p94 (also called calpain 3) predominates, playing an indispensable role in muscle functions in cooperation with ubiquitously expressed conventional calpains. indeed, a defect of p94 proteolytic activity originating from gene mutations causes muscular dystrophy. p94 localizes in myofibrils by binding to connectin/titin, a gigantic elastic muscle protein that connects the z- and m-lines of the sarcomere, the repetitive unit of the myofibril, with a single molecule.
in mdm (muscular dystrophy with myositis) mice, connectin/titin with a small deletion caused by a natural mutation of the connectin/titin gene is expressed, resulting in severe muscular dystrophy phenotypes such as body weight less than half that of wild type, severely affected limb muscles with impaired walking ability and a life span of only 2-3 months. the deletion in the mdm allele of the connectin/titin gene overlaps one of the binding sites of p94 in the n2-line, another electron-microscopically visible line between the z- and m-lines of the sarcomere. the mdm phenotypes clearly indicate that connectin/titin or p94 or both are essential for proper muscle functions. to elucidate the physiological roles of connectin/titin and p94, we analyzed mdm mice in relation to the calpain system. as a result, marps (muscle ankyrin repeat proteins) were shown to be up-regulated in mdm muscle. marps bind to the n2- and z-line regions of connectin/titin and function as transcriptional regulators that translocate into the nuclei. the binding site of carp (cardiac ankyrin repeat protein), one of the marps, in the n2-line region is close to the p94 binding site, suggesting interactions between the two molecules. possible signal transduction systems that modulate muscle functions, as revealed by these analyses, will be discussed.
inhibition and activation of calpain by its disordered endogenous inhibitor, calpastatin
p. tompa 1 , z. mucsi 2 , o. györgy 2 , c. szász 1 and p. friedrich 1
1 institute of enzymology, biological research center, budapest, hungary, 2 research group of peptide chemistry, university of eötvös loránd, budapest, hungary. e-mail: tompa@enzim.hu
calpains are a family of intracellular calcium-activated cysteine proteinases, implicated in the regulation of key cellular processes, such as cell division and programmed cell death.
their activity is under tight control by an intracellular protein inhibitor, calpastatin, an intrinsically unstructured protein that contains four equivalent inhibitory domains. each of these comprises three conserved subdomains, of which subdomains a and c anchor the inhibitor in a calcium-dependent manner, whereas subdomain b binds at the active site and inhibits the enzyme. in this work it is shown that the consequence of this mode of binding is that isolated a and c peptides promote calcium binding to calpain and thus activate the enzyme. this activation is manifest in the sensitization to calcium ion: the calcium required for half-maximal activity is lowered from 4.3 to 2.4 µm for µ-calpain and from 250 to 140 µm for m-calpain. in the physiologically significant sub-micromolar and low micromolar calcium concentration range this sensitization leads to a more than tenfold activation, which is of potential physiological importance as isolated calpain requires high calcium concentrations never realized in vivo. here we suggest that calpastatin is degraded in vivo in a way that generates the activator peptides. due to the structural disorder of calpastatin, this unprecedented mode of action raises intriguing questions with respect to the generality of this ambivalent behavior. to address this issue, we have collected extreme cases in which the same protein elicits opposing, inhibitory and activatory, responses within the same molecular setting: structural predictions show that these proteins are largely disordered. in conclusion, the possible general implications of this finding are discussed.
meprins are oligomeric, brush border membrane or secreted zinc proteases that have unique and complex structures. they are composed of multidomain, highly glycosylated, evolutionarily related α and β subunits that form disulfide-linked homo- or heterooligomeric dimers.
the homooligomeric form of meprin a forms very high molecular mass multimers of 1 000 000-6 000 000 da, among the largest extracellular proteolytic complexes known. meprins cleave cytokines, growth factors, bioactive peptides and extracellular matrix proteins, important compounds in inflammatory intestinal disease and in cancer metastases. to investigate the role of meprins in intestinal immune responses, inflammation was induced in mice by oral administration of dextran sulfate sodium (dss). the results showed that wild-type mice (c57bl/6 × 129) had a more severe reaction to dss than meprin b null mice on the same genetic background, as determined by body weight loss, intestinal bleeding and mortality. this implies that the presence of meprin b increases host damage caused by dss and that meprin b plays an active role in intestinal pathophysiology. meprins are also expressed in colon cancer cells (e.g. sw480, sw620, and caco-2). expression of meprin a appears to increase with increasing metastatic potential. in addition, meprin a is highly expressed in the human liver hepatoblastoma cell line hepg2 and abundantly secreted into culture media. examination of human tumor samples showed that meprin a is expressed in primary colon tumors and in tumors that have metastasized to the liver. this indicates that meprin a expression in gastrointestinal tumor cells contributes to the progression of the disease.
biochemical pathways mediating necrotic cell death and neurodegeneration in caenorhabditis elegans
n. tavernarakis, p. syntichaki, c. samara and k. troulinaki
institute of molecular biology and biotechnology, foundation for research and technology, heraklion, crete, greece. e-mail: tavernarakis@imbb.forth.gr
necrotic cell death plays a central role in devastating human pathologies such as stroke and neurodegenerative diseases.
elucidation of the molecular events that transpire during necrotic cell death in simple animal models should provide insights into the basic biology of inappropriate neuronal death, and facilitate the characterization of mechanisms underlying degeneration in numerous human disorders. various cellular insults, including hyperactivation of ion channels, expression of human beta-amyloid protein implicated in alzheimer's disease, constitutive activation of certain g proteins, hypoxia and possibly the ageing process, can trigger a degenerative, necrotic cell death in the nematode caenorhabditis elegans. we are genetically and molecularly deciphering the c. elegans necrotic death program. we have isolated mutations in several distinct genetic loci that block degenerative cell death initiated by various genetic and environmental insults. by characterizing such suppressors, we have discovered that neuronal degeneration inflicted by various genetic lesions in c. elegans requires the activity of specific calcium-regulated calpain proteases and acidic ph-dependent aspartyl proteases. although it is believed that these proteases become activated under conditions that inflict necrotic cell death, the factors that govern the erroneous activation of such otherwise benign enzymes are largely unknown. we identified novel factors that modulate cellular ph homeostasis and are required for necrosis, and showed that targeting these factors effectively protects from necrotic cell death in c. elegans. our findings demonstrate that two distinct classes of proteases are involved in necrotic cell death and suggest that perturbation of intracellular calcium levels may initiate neuronal degeneration by compromising ph homeostasis and deregulating proteolysis.
search for regulatory proteins which are controlled by proteolysis in escherichia coli based on microarray analysis
j. m. heuveling
ag hengge, institute of microbiology, fu berlin, berlin, germany.
e-mail: joheuvel@zedat.fu-berlin.de
the impact of controlled proteolysis on regulatory events in prokaryotes has been increasingly recognized over the last decade. as in eukaryotic cells, proteolysis is more than just garbage disposal: it has been found to be implicated in the regulation of many vital functions of bacterial cells, such as the cell cycle, stress responses and development (hengge r and bukau b. mol microbiol 2003). conditional degradation of regulators shows a high potential for integrating a great variety of signals, as is well studied for the degradation of sigma s. this sigma subunit of the rna polymerase, which triggers the general stress response in escherichia coli, is digested rapidly by the clpxp protease in association with the phosphorylated response regulator rssb under non-stress conditions (stuedemann a. embo 2004). several other regulatory proteins have been found to be subject to proteolysis, such as lexa, a regulator of the sos response. lon protease is also involved, for example, in the degradation of rcsa, a regulator of capsule biosynthesis, and of sula, a cell division inhibitor (for a review: hengge-aronis r, jenal u. curr opin microbiol 2003). in order to find other regulatory processes in which proteolysis plays a role we pursued a global approach using the microarray technique. in mutants lacking functional clpp or lon proteases or either one of the clp recognition factors clpa and clpx, we searched for genes that are differentially transcribed compared to the wild type. we found some interesting groups of genes belonging to common regulons governed by known regulators, which are candidates for clp- or lon-mediated proteolysis. after confirmation of these results through lacz fusion studies of representative genes of these regulons, these regulators are presently being examined in in vivo degradation studies using immunodetection methods.
a distinct group of serine peptidases cannot hydrolyze proteins, but can readily cleave peptides that are up to about 30 amino acid residues long. the representative member of the family, prolyl oligopeptidase, is implicated in a variety of disorders of the central nervous system. the enzyme consists of a peptidase domain with an α/β-hydrolase fold, and its catalytic triad is covered by the central tunnel of a seven-bladed β-propeller. this domain makes the enzyme an oligopeptidase by excluding large structured peptides from the active site. in most propeller domains the circular structure is ''velcroed'' together in a mixed blade, where both the amino and carboxy termini are involved in forming a four-stranded antiparallel β-sheet. non-velcroed or ''open topology'' propellers are rare, and prolyl oligopeptidase was the first protein structure exhibiting a domain of this nature. the apparently rigid crystal structure does not explain how the substrate can approach the catalytic groups. two possibilities of substrate access were investigated: either blades 1 and 7 of the propeller domain move apart or the peptidase and/or propeller domains move to create an entry site at the domain interface. engineering disulfide bridges into the expected oscillating structures prevented such movements, which destroyed the catalytic activity and precluded substrate binding. this indicated that concerted movements of the propeller and the peptidase domains are essential for the enzyme action.
biochemical characterization of thermoplasma volcanium recombinant 20s proteasome and its regulatory subunit
g. baydar and s. kocabiyik
molecular genetics, biological sciences, middle east technical university, ankara, turkey.
e-mail: gozde_baydar@hotmail.com
proteasome-associated energy-dependent proteolysis is not only involved in the rapid turnover of specific proteins that could be important during periods of stress, but is also engaged in the turnover of the short-lived proteins that regulate a variety of cellular processes in both procaryotic and eucaryotic cells. the universal distribution of proteasome homologs in archaeal genomes provides insight into the vital role of archaeal proteasomes. the 20s catalytic core of archaeal proteasomes, in combination with various aaa atpases and membrane-associated lon proteases, may play a role in the stress response or in the turnover of regulatory proteins. however, little is known about the potential physiological roles of archaeal proteasomes. this study presents data on the biochemical and biophysical features of the recombinant 20s proteasome of a thermoacidophilic archaeon, thermoplasma volcanium (tpv). pcr was performed to amplify dna fragments containing the tpv genes encoding the α- and β-subunits of the proteasome from tpv genomic dna. the amplified α-gene (tpva) and β-gene (tpvb), together with their upstream sequences, were separately cloned and then combined in the puc18 vector. the resulting recombinant puc-skba plasmid was used for heterologous production of in vivo assembled 20s proteasome in e. coli. the recombinant proteasome was purified by a combination of ammonium sulfate precipitation, gel filtration chromatography (sephacryl s-300) and ion-exchange chromatography (q sepharose). the molecular masses of the purified protein subunits were estimated as 23.71 kda (β-subunit) and 21.13 kda (α-subunit). substantial post-glutamyl peptide hydrolyzing activity and chymotrypsin-like activity were detected in association with the recombinant proteasome. maximum chymotrypsin-like activity was measured at 85°c and ph 8.5.
crystallographic studies of the gtp-dependent transcriptional regulator cody from bacillus subtilis.
cody is a gtp-dependent transcriptional regulator of early stationary phase and sporulation genes in bacillus subtilis. it is activated by gtp; during rapid cell growth it represses several genes whose products allow adaptation to nutrient depletion. when the cells pass from rapid growth to stationary phase, the intracellular concentration of gtp drops, thus releasing the repressed genes. cody is a 259-residue polypeptide containing a helix-turn-helix motif for binding to dna. it also has motifs in common with small gtpases, but cody has a much lower affinity for gtp. crystals of the full-length cody have been grown in the presence and absence of gtp from sodium citrate buffered solutions using lithium sulphate as a precipitant, and diffraction data have been collected to 3.5 å resolution. attempts to solve the structure using anomalous data from the semet derivative crystals of cody have been hampered by the large number (70) of methionines in the asymmetric unit and difficulties in reproducibility of
angiotensin-converting enzyme (ace) is a zinc metallopeptidase critical for the generation of the vasoconstrictor peptide angiotensin ii. a homologue of ace, ace2, has recently been identified, which appears to play a counter-regulatory role to ace by inactivating angiotensin ii. like ace, ace2 is a type i membrane protein with its active site contained within the extracellular domain. the expression of ace2 protein is normally low and restricted primarily to endothelial cells of the heart and kidney, kidney epithelium and testis. recent evidence from ourselves and others indicates that ace2 is significantly upregulated in a number of pathologies, such as myocardial infarction, renal disease and hepatitis c-induced cirrhosis. given that ace can be proteolytically released from the cell surface in culture, ace2 may likewise be shed into plasma or urine.
detection of elevated levels of ace2 in plasma and urine may be a useful biomarker for the diagnosis of hepatic, renal and vascular disease. using a specific quenched fluorescent substrate, we have detected ace2 activity in human urine. in contrast, ace2 activity could not be detected in human plasma; interestingly, however, we noted that plasma markedly inhibited the activity of recombinant ace2, thus compromising the possibility of measuring plasma enzyme activity. we are in the process of purifying this inhibitor, which preliminary results suggest is small and hydrophilic. we are also currently optimizing methods for its removal from plasma samples, thus allowing detection of low levels of soluble ace2 activity in normal human plasma. the identification of a potential endogenous inhibitor of ace2, the first for this family of metallopeptidases, could have significant consequences for ace2 function in vivo and the regulation of angiotensin peptides. future studies will examine whether plasma or urinary levels of ace2 are elevated in cardiovascular, renal or liver disease.
molecular determinants of proteolytic processing of non-structural polyprotein of semliki forest virus.
institute of molecular and cell biology, university of tartu, tartu, estonia. e-mail: lulla@ut.ee
semliki forest virus (sfv) is a positive-stranded rna virus. the replication of sfv is performed by the rna-dependent rna replicase complex (rc) and regulated by proteolytic processing. during the course of the infection the template preference of the rc changes from rna plus-strand to minus-strand. it has been known for several years that this preference switch is due to the proteolytic processing of the sfv non-structural polyprotein p1234, mediated by the viral cysteine protease located in the carboxy-terminal domain of the nsp2 protein. tight temporal regulation of this template specificity switch is crucial for viral replication, but, nevertheless, its mechanism remains unsolved.
therefore, the mapping of the essential molecular determinants of the site-specific cleavage consensus sequences may provide necessary information concerning the regulation of cleavage as well as the regulation of rna replication. the results of our studies indicate that as few as 5 amino acid residues from the c terminus of the nsp3 protein determine the specificity of the proteolytic cleavage of the nsp3/nsp4 junction. at the same time, sequences lying downstream of the cleavage point (in the nsp4 region) have only a minor effect on the cleavage efficiency. the exact region required for the cleavage of the nsp2/nsp3 junction is not yet known, but the sequences required from the c-terminal part of the nsp2 protein are likely to be short as well. in contrast, the sequence lying within the 80-240 n-terminal amino acid residues of nsp3 is vital for cleavage of the nsp2/nsp3 junction. this region may represent the cofactor of the nsp2 protease that activates processing at the nsp2/nsp3 cleavage site. thus, as a result of the current research, a fundamentally new function, regulation of proteolytic processing and rna replication, was mapped to the conserved n-terminal region of nsp3. this finding significantly improves our understanding of the role of nsp3 in the virus life cycle, which was enigmatic until now.
activation occurs within the specific dibasic motif hsiirrsl, suggesting the involvement of the proprotein convertases (pcs) in this process. this family of endoproteases is responsible for the activation of a large variety of regulatory proteins by cleavage at multi-basic recognition sites exhibiting the general motif (k/r)-(x)n-(k/r) (n = 0, 2, 4 or 6). cotransfection of the furin-deficient colon carcinoma cell line lovo with provegf-c and different pc members revealed that furin, pc5 and pc7 are vegf-c convertases. the processing of provegf-c is blocked by the inhibitory prosegments of furin, pc5 and pace4, as well as by furin-motif variants of alpha2-macroglobulin and alpha1-antitrypsin.
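the general recognition motif quoted above, (k/r)-(x)n-(k/r) with n = 0, 2, 4 or 6, can be illustrated with a short scan over a peptide sequence. the following is only a minimal sketch: the function name `find_pc_sites` and its return format are hypothetical and are not taken from the abstract, and a motif match merely flags a candidate site; it does not show that a convertase actually cleaves there.

```python
def find_pc_sites(sequence, spacers=(0, 2, 4, 6)):
    """Scan a peptide for the general motif (K/R)-(X)n-(K/R).

    Returns a list of (start_index, n, matched_segment) tuples, where
    start_index is 0-based and n is the number of spacer residues.
    Illustrative helper only; a match is a candidate site, not proof
    of cleavage.
    """
    seq = sequence.upper()
    basic = {"K", "R"}
    sites = []
    for i, residue in enumerate(seq):
        if residue not in basic:
            continue
        for n in spacers:
            j = i + n + 1  # position of the second basic residue
            if j < len(seq) and seq[j] in basic:
                sites.append((i, n, seq[i:j + 1]))
    return sites


if __name__ == "__main__":
    # Sequences quoted in the abstract: the wild-type site and the mutant
    # in which the two arginines are replaced by serines.
    for peptide in ("hsiirrsl", "hsiisssl"):
        print(peptide, find_pc_sites(peptide))
```

under these assumptions the wild-type segment hsiirrsl yields a single n = 0 candidate (the rr pair), whereas the hsiisssl variant yields none, because replacing the two arginines removes every basic residue from the segment.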
accordingly, mutation of the vegf-c pc-site (hsiirrsl to hsiisssl) inhibited provegf-c processing. following zebrafish caudal fin amputation, the injection of control vector or vector containing wild-type vegf-c did not affect fin regeneration. in contrast, injection of mutated vegf-c (pro-vegf-c) inhibited fin regeneration. these data highlight the importance of vegf-c processing in zebrafish fin regeneration and suggest that zebrafish can be used as a simple and useful model for studying the role of protein maturation by the pcs in physiological processes.
thimet oligopeptidase (top) hydrolyzes a variety of bioactive peptides and is implicated in the regulation of neurological and other physiological processes. top is composed of two ''clamshell'' domains, with the substrate-binding pocket and catalytic site lying between these domains. it is speculated that conformational changes in loops and coil regions connecting the domains lead to changes in substrate specificity. the loop region (residues 599-611) is close enough to the active site to interact with even the smallest substrate. it contains three glycine residues and is expected to be quite flexible. in an effort to trap intermediate conformations of the loop, we have replaced gly 599, 603, or 604 with ala and have compared the activities of the three resulting protein constructs towards two quenched fluorescent substrates. all three enzymes had lower activity than wild type towards a bradykinin analog, with g599a, the most active of the mutants, possessing 1/3 of wild-type activity. however, with a smaller substrate, g603a was the most active, surpassing even wild type (fivefold increase in activity). g604a had little activity towards either substrate.
these results are consistent with data that revealed increases in activity towards the larger substrate when the enzyme is partially denatured and, presumably, more flexible, and with increased accessibility of the binding loop to proteolytic enzymes when the enzyme is partially denatured. acknowledgment: this work was supported by hhmi, nih-ns39892 (mjg).
the proteasome: paradigm of a self-compartmentalizing protease
self-processing of subunits of the proteasome
crystal structures of the rhodococcus proteasome with and without its pro-peptides: implications for the role of the pro-peptide in proteasome assembly
handbook of metalloproteins
bovine chymotrypsinogen-a x-ray crystal-structure analysis and refinement of a new crystal form at 1.8 å resolution
equilibrium and rate constants for the interconversion of two conformations of α-chymotrypsin. the existence of a catalytically inactive conformation at neutral ph
refolding transition of alpha-chymotrypsin - ph and salt dependence
e-mail: saleh38@hotmail.com
reference 1. arnorsdottir j, kristjansson mm, ficner r. crystal structure of a subtilisin-like serine proteinase from a psychrotrophic vibrio species reveals structural aspects of cold adaptation
(which is an excellent substrate for adam-17), we observed a 35% increase in enzymatic activity in the media of 1 mm 5-ht-treated cells compared to untreated cells. however, we did not see any increase in fluorescence when we used ''cate1'', an adam substrate that is not recognized by adam-17. to further support a role for adam-17/tace, we designed silencing rnas against the enzyme, which were introduced into the mesangial cells using lentiviral infection. successful silencing was confirmed by western blotting 4 days after infection. control and tace-silenced human mesangial cells were stimulated with 2-10 µm of serotonin for 5 min, and erk activation was assessed by western blotting
baron-ruppert 2 and e.
heymann 1
1 department of physiological chemistry (fb8/gw)
e-mail: rebecca.lew@med.monash.edu.au
b4-015p functional properties of p94/calpain3 and connectin/titin in mdm mouse skeletal muscle
y bunkyo-ku
the lectin pathway of the complement system is an important component of innate immunity. it provides the first line of defence against infection, since it is activated on the surface of invading pathogens. the activation of the complement system results in the destruction and clearance of foreign microorganisms. mannose-binding lectin-associated serine protease-2 (masp-2) is the enzyme responsible for the initiation of the lectin pathway of complement activation. masp-2 is a multidomain serine protease, which is synthesized as an inactive zymogen and becomes activated when mbl binds to carbohydrate
plant proteinase inhibitors are widespread in different plant species and are a significant component of the plant defense system. in some cases, a significant diversity of proteins belonging to the same structural family of inhibitors may be observed within one species. the family of potato kunitz-type proteinase inhibitors (pkpis) exemplifies a group of proteins with diverse properties and may be divided into three major homology groups: a, b and c. many genes encoding different pkpi proteins of each group have been found in various potato cultivars (solanum tuberosum l.). inhibitory activity towards plant invertase and towards cysteine and serine proteinases was found in proteins of subgroup c. a set of gene copies was isolated by pcr from the genome of potato cv. istrinskii. dna sequencing analysis of these resulted in the identification of 24 different dna sequences with a high similarity to potato kunitz-type inhibitors of group c (pkpi-c). cluster analysis demonstrated that these clones represented multiple copies of six new genes denoted as pkpi-c1, -c2, -c3, -c4, -c5 and -c6.
it can be supposed that at least two alleles containing pkpi-c genes are harbored in the tetraploid genome of potato. one of the new genes, namely pkpi-c5, exhibited 99% identity with a known invertase inhibitor cdna (1423) from cv. provita. another gene, pkpi-c6, was similar (98% identical residues) to a cdna (p340) from potato cv. bintje encoding a putative trypsin inhibitor. the four other new genes showed 89-92% identity with known pkpi-c proteins from other potato cultivars. the n-terminal sequence of the protein encoded by the pkpi-c2 gene was identical to the n-terminal sequence of the specific subtilisin inhibitor pksi isolated from cv. istrinskii.
b3-049p regional distribution of human trypsinogen 4 in human brain determined at mrna and protein level
j. tóth 1 , l. gombos 1 , e. siklódi 1 , p. németh 2 , m. palkovits 3 , l. szilágyi 1 and l. gráf 1
1 laboratory of enzymology, department of biochemistry, eötvös loránd university, budapest, hungary, 2 institute of immunology and biotechnology, university of pécs, pécs, hungary, 3 laboratory of neuromorphology, department of anatomy, semmelweis university, budapest, hungary. e-mail: july@ludens.elte.hu
proteases play an important role in many physiological and pathological processes in the central nervous system such as development, neurite outgrowth, neuronal plasticity and degeneration and cell signaling. a gene coding for such an enzyme might be prss3 on chromosome 9 of the human genome. owing to alternative splicing, it encodes both mesotrypsinogen, which is expressed in the pancreas, and trypsinogen 4, whose mrna has been identified in different human tissues (initially in brain, recently in different epithelial cell lines from prostate, colon and airway). analysis of the prss3 gene predicted two isoforms of the zymogen: isoform a may have a 72 amino acid n-terminal leader sequence, and isoform b a 28 amino acid one.
in order to gain information on the possible role of human trypsinogen 4, we determined its amount at both the mrna and the protein level in 17 selected brain areas using real-time quantitative pcr and elisa. the highest transcript levels were detected in cerebellar cortex, while low amounts were found, e.g., in cerebellar white matter samples. the distribution of the mrna in different brain areas measured by real-time pcr is consistent with the protein levels detected with elisa. the use of different monoclonal antibodies specific for the 28 amino acid leader sequence and for the protease domain allowed separate detection of the zymogen and the active enzyme. in the hypothalamus, for example, the zymogen is the dominant form, while a significant degree of activation was found in the cerebellar cortex. our data indicate that the extent of activation varies between brain areas. as human trypsinogen 4 is ubiquitous in the brain, we conclude that it might play a role in general neurological processes.

serine proteases are enzymes involved in the maintenance of cell homeostasis. enzymes of this type must therefore be tightly regulated, and it has been widely reported that serine proteases are involved in the growth and expansion of different cancers. in this regard, the type ii transmembrane serine proteases (ttsps) constitute a subfamily of membrane-anchored serine proteases that are ideally positioned to carry out different interactions with other cell surface or extracellular proteins. among them, tmprss2 and tmprss3 proteins have been reported to be overexpressed in most prostate and ovarian cancers respectively, matriptase/mt-sp1 is expressed in a wide variety of benign and malignant tumors, and hepsin is overexpressed in ovarian and renal cancers. 
desc-1 (differentially expressed in squamous cell carcinoma gene 1) is a ttsp member and, unlike other ttsps, its expression at the rna level is reduced in tumor tissue relative to normal tissue in head and neck squamous cell carcinoma (hnscc), which suggests a possible tumor-protective function for desc-1. in order to shed light on the role of desc-1 in these processes, we have carried out the molecular cloning of the human full-length cdna and expression of the recombinant protein to delineate the implication of this protease in hnscc.

diffracting crystals. therefore we used limited proteolysis and mass-spectrometry analysis to identify the domain boundaries of the protein and were able to determine the sequence of two principal proteolytic fragments corresponding to the n- and c-terminal domains of cody. these individual domains were successfully cloned in escherichia coli, overexpressed as his-tagged proteins, isolated and purified. both domains have been crystallized. the crystals of the n-terminal domain grow from bis-tris buffered solutions at ph 6.5 containing polyethylene glycol and calcium acetate. the crystals of the c-terminal domain were obtained using ammonium sulphate as a precipitant. the crystals of the n-terminal domain diffract to at least 2.3 å and the crystals of the c-terminal domain to 3.2 å using an in-house diffractometer with a mar research image plate as a detector. progress towards the determination of the cody structure will be presented.

the gram-positive bacterium listeria monocytogenes is a facultative intracellular parasite. interactions of l. monocytogenes with the host cell are mediated by a number of secreted and cell surface proteins. one of the most important virulence factors, the actin-polymerizing protein acta, is attached to the surface via a hydrophobic c-terminal membrane anchor. 
despite the membrane anchor, acta was found in comparable amounts both on the cell surface and in the culture supernatant. the aim of the work was to investigate the mechanism of acta release and the role of this process in l. monocytogenes virulence. maldi-tof ms analysis of trypsin released acta suggested that release is due to proteolytic cleavage between histidine and threonine residues in the close vicinity of the membrane anchor predicted by the htmm analysis. the substitution of histidine with proline prevented acta release into the culture supernatant, although it did not disturb its surface presentation. in silico analysis of eight other l. monocytogenes membrane-anchored surface proteins suggested a role for asparagine and threonine residues in specific proteolysis. the prediction was experimentally tested by substitution of the residues with alanine. a spontaneous l. monocytogenes mutant strain, unable to release membrane-anchored proteins into the culture supernatant, was isolated. the mutation was mapped outside the acta gene and presumably affected the corresponding peptidase. the mutation impaired the invasion of l. monocytogenes into human epithelial-like hela cells, which suggested an effect of the released proteins on signaling events that result in induced phagocytosis of the pathogen by normally non-phagocytic cells.

angiotensin ii (ang ii) has been proposed to act as a regulatory peptide in the epidermal layer of human skin. while the expression of receptors and peptide precursors has been demonstrated in epidermis, the formation of ang ii and its inactivation have not been studied in detail. thus we have established a model system with cultured keratinocytes to examine the metabolism of ang i, ii and related peptides by intact epidermal cells. cultures were incubated with peptides in a minimal medium, which sustained cell viability for at least 24 h, and the metabolism of peptides was monitored by chromatography (rp-hplc). 
with ang i as peptide substrate, five major products were detected in keratinocyte culture media after 12 h incubation. a half-life of about 9 h was estimated for ang i, and the slow degradation supports results of earlier studies revealing low activities of exopeptidases in a microsomal fraction from keratinocytes as compared to fibroblasts. the degradation of ang i was not affected by inhibitors of alanyl aminopeptidase, peptidyl dipeptidase a and neprilysin. since a peptide product formed from ang i in keratinocyte cultures resembled ang ii in hplc analysis, the activity of peptidyl dipeptidase a in these cells was assayed with hip-his-leu and the presence of the peptidase was confirmed by its sensitivity to captopril. further experiments showed that ang ii, iii and related peptides were degraded in keratinocyte cultures at rates similar to ang i and that these reactions interfered severely with the formation of ang ii. immunohistochemical studies showed strong positive staining for neprilysin and alanyl aminopeptidase in the dermal layer of human skin and at the epidermal-dermal junction, confirming the results obtained with the cell cultures.

soluble angiotensin converting enzyme-2 present in human plasma and urine

p94/calpain 3 is the skeletal-muscle-specific calpain and is considered to be a modulator protease in various cellular processes. a defect in the p94 gene causes limb-girdle muscular dystrophy type 2a (lgmd2a), suggesting that p94 functions are indispensable for proper muscle function. in sarcomeres, p94 localizes at the z-, n2- and m-line regions. although the binding partner for p94 at the z-line has not been identified yet, the n2- and m-line localizations of p94 are considered to depend on its interaction with the n2a and m-line regions of connectin/titin, respectively. 
connectin is a gigantic sarcomeric protein playing an important role as a molecular template for sarcomeric organization, an elastic element generating passive tension, a platform for various protein ligands, etc. in this study, we focused on the molecular components associated with the n2a region of connectin/titin to extend our understanding of p94. intriguingly, a recessive mutation in the mouse connectin gene, mdm (muscular dystrophy with myositis), causes muscular dystrophy. the mdm mutation has two remarkable phenotypic consequences. first, it abolishes the p94 binding activity of the connectin n2a fragment. second, in skeletal muscle from mice homozygous for the mdm mutation, upregulation of cardiac ankyrin repeat protein (carp) is observed. carp also binds to n2a connectin, n-terminally adjacent to the region mutated in mdm. the effect of the mdm mutation on p94 activity and the properties of n2a connectin as well as carp were analyzed using both animal model and cell culture systems.

semliki forest virus (sfv) is a well-known model virus, which has been studied for decades. the main topic of this research was the characterization and analysis of the sfv replication machinery using an approach based on conditional-lethal virus mutants. the direct aim of the present study was to sequence and functionally characterize a panel of independent sfv temperature-sensitive (ts) mutants. of all putative ts mutations identified in this study, two were mapped to the nsp1 protein, four were mapped to the nsp2 protein and one was found in the nsp4 region. a number of assays were used to verify the phenotypic effects of the identified mutations: titration of virus stocks at different temperatures, leak yield experiments, and analysis of viral rna synthesis and viral polyprotein processing at different temperatures. nsp2 mutants had a clear viral protease defect and accumulated non-cleaved polyproteins at different stages. in addition, a biotechnological branch of our research is already under development. 
it includes improving the existing sfv-based expression vector system by using ts mutations for temperature regulation of foreign gene expression in mammalian cells.

dipeptidyl peptidase iv activity and/or structure homologues (dash) in brain tumors

the pathogenesis of many diseases, including cancer, often involves improper proteolytic post-translational modification of biologically active peptides. an association of the dysregulated expression pattern of the novel group of "dipeptidyl peptidase (dpp)-iv activity and/or structure homologues" (dash) with cancer development and progression has been suggested by several authors, including us [1] . dpp-iv enzymatic action, a common attribute of most dash members, modifies the signaling potential of their substrates, biologically active peptides, not only quantitatively but, owing to changes in their receptor preferences, also qualitatively. in this study, we investigated the expression (by real-time rt-pcr and immunohistochemistry) and enzymatic activity (by biochemical assays and enzyme histochemistry) of plasma membrane localized dash members, in particular dpp-iv, fibroblast activation protein-alpha (fap) and attractin, in human gliomas. varying quantities of dpp-iv, fap and attractin mrnas and proteins were coexpressed in the studied tumors. the majority of dpp-iv-like activity in the glioma tissue could be attributed to the canonical dpp-iv. this activity, assayed biochemically and expressed per mg of protein, was increased in high grade gliomas. inhibition studies suggested a lack of enzymatically active attractin in the examined glioma tissues. the results of our pilot study demonstrate for the first time that both enzymatically active and inactive dash molecules are coexpressed in gliomas and suggest a prevailing association of increased dpp-iv activity with high grade tumors. 
acknowledgment: this work was supported by iga nr/8105-3 and msmt 0021620808

vegf-c is involved in the neovascularization processes, steps essential for wound healing, cancer progression and many other physiological functions. zebrafish vegf-c processing and

key: cord-016126-i7z0tdrk authors: dangi, mehak; kumari, rinku; singh, bharat; chhillar, anil kumar title: advanced in silico tools for designing of antigenic epitope as potential vaccine candidates against coronavirus date: 2018-10-14 journal: bioinformatics: sequences, structures, phylogeny doi: 10.1007/978-981-13-1562-6_15 sha: doc_id: 16126 cord_uid: i7z0tdrk vaccines are the most economical and potent alternative to available medicines against various bacterial and viral diseases. earlier, killed or attenuated pathogens were employed for vaccine development. in the present era, however, peptide vaccines are increasingly favoured over whole-pathogen vaccines because of their advantages over conventional vaccines. these vaccines are based either on single proteins or on synthetic peptides including several b-cell and t-cell epitopes. the overall mechanism of action nevertheless remains the same: prompting the immune system to activate specific b-cell- and t-cell-mediated responses against the pathogen. rino rappuoli and others have contributed to this field by designing a fully computational approach for the discovery of potential vaccine candidates, popularly known as reverse vaccinology. this is a clear advance in vaccine development, in which one begins with the genome information of the pathogen and ends up with a list of epitopes after applying multiple bioinformatics tools. this book chapter is an effort to bring the reverse vaccinology approach to the notice of readers using the example of coronavirus. it compelled us to apply the well-known reverse vaccinology (rv) approach to the available proteome of coronavirus. 
rv approach has been successfully applied on many prokaryotes, but there are very few known applications on eukaryotes and viruses. so, it is worthwhile to explore the potential of this approach to identify potential vaccine candidates for coronavirus. rv basically does the in silico examination of the viral proteome to hunt antigenic and surface-exposed proteins. this approach was initially applied successfully to neisseria meningitidis serogroup b (kelly and rappuoli 2005) against which none of the prevailing techniques could develop a vaccine. the present book chapter is intended to explore the potential of rv approach to select the probable vaccine candidates against coronavirus and validate the results using docking studies. undoubtedly, the traditional approaches for vaccine development are fortunate enough to efficiently resist the alarming pathogenic diseases of its time. however, the traditional approach suffers from certain limitations like it is very timeconsuming, the pathogens which can't be cultivated in the lab conditions are out of reach, and certain non-abundant proteins are not accessible using this approach (rappuoli 2000) . consequently, a number of pathogenic diseases are left without any vaccine against them. all these limitations are conquered by reverse vaccinology approach utilizing genome sequence information which ultimately is translated into proteins. hence all the proteins expressed by the genome are accessible irrespective of their abundance, conditions in which they expressed. the credit of fame of reverse vaccinology should go to the advancements in the sequencing strategies worldwide. accordingly, improvement in the sequencing technologies has flooded the genome databases with huge amount of data which can be computationally undertaken to reveal the various crucial aspects of the virulence factors of the concerned pathogen. 
reverse vaccinology is based on same approach of computationally analysing the genome of pathogen and proceeds step by step to ultimately identify the highly antigenic, secreted proteins with high epitope densities. the best epitopes are selected as potential vaccine candidates (pizza et al. 2000) . this approach has brought the unapproachable pathogens of interest in spotlight and is evolving as the most reassuring tool for precise selection of vaccine candidates and brought the use of peptide vaccines in trend (sette and rappuoli 2010; kanampalliwar et al. 2013 ). bexsero is the first universal serogroup b meningococcal vaccine developed using rv, and it has currently earned positive judgement from the european medicines agency (gabutti 2014) . whether it is discovery of pili in gram-positive pathogens which were thought to not have any pili or the sighting of factor g-binding protein in meningococcus (alessandro and rino 2010), the reverse vaccinology steals all the credits from other conventional approaches. most of the applications of rv are against prokaryotes and very few against eukaryotes and viruses because of complexity of their genome. corynebacterium urealyticum (guimarães et al. 2015) , mycobacterium tuberculosis (monterrubio-lópez et al. 2015) , h. pylori (naz et al. 2015) , acinetobacter baumannii (chiang et al. 2015) , rickettsia prowazekii (caro-gomez et al. 2014) , neospora caninum (goodswen et al. 2014) and brucella melitensis (vishnu et al. 2017) are the examples of some pathogens that are recently approached using this in silico technique in order to spot some epitopes having potential of being a vaccine candidate. herpesviridae (bruno et al. 2015 ) and hepatitis c virus (hcv) (kolesanova et al. 2015) are the examples of the viruses that are addressed using this approach. (altschul et al. 1990; okonechnikov et al. 2012; golosova et al. 2014) . 
multiple sequence alignment (msa) was done via clustalw, and the phylogenetic tree was constructed using the nj method from the unipro ugene 1.16.1 bioinformatics toolkit (okonechnikov et al. 2012 ). analysis of the secondary structure of the proteins of the seed genome was done by means of the expasy portal. the aim was to predict the solvent accessibility, instability index, theoretical pi, molecular weight, grand average of hydropathicity (gravy), aliphatic index, number of charged residues, extinction coefficient, etc. (http://web.expasy.org/protparam/; gasteiger et al. 2005) . virus-mploc was used to identify the localization of viral proteins in the infected host cells (http://www.csbio.sjtu.edu.cn/bioinf/virus-multi/; hong-bin shen and kuo-chin chou 2010) . this information is important for understanding the destructive role and mechanism of the viral proteins in causing the disease. in total six different subcellular locations, namely host cytoplasm, viral capsid, host plasma membrane, host nucleus, host endoplasmic reticulum and secreted proteins, were covered. these predictions could help in the formulation of better therapeutic options against the virus. as per the rv protocol, secreted and membrane proteins are of special interest and were therefore filtered for further analysis. to predict the number of transmembrane helices, tmhmm server v. 2.0 (http://www.cbs.dtu.dk/services/tmhmm/; krogh et al. 2001 ) was used. signal peptides are known to impact immune responses and to possess high epitope densities. moreover, most of the known vaccine candidates also possess signal peptides. hence, it is worthwhile to predict signal peptides in proteins prior to epitope prediction. the signal-blast web server, reported to give few false predictions, was used to predict the signal peptides (http://sigpep.services.came.sbg.ac.at/signalblast.html; frank and sippl 2008) . the prediction options include best sensitivity, balanced prediction, best specificity and detect cleavage site only. 
we chose to make the predictions using each option, and the proteins predicted to carry a signal peptide by all four options were preferred for further investigation. the most appropriate targets as vaccine candidates are those which possess adhesin-like properties, because they not only mediate the adhesion of the pathogen's proteins to host cells but also facilitate transmission of the virus. adhesins are known to be crucial for virulence and are located on the surface, which makes them readily accessible to antibodies. the stand-alone spaan, with a sensitivity of 89% and specificity of 100%, was used to carry out the adhesin probability predictions, and proteins with adhesin probabilities higher than or equal to 0.4 were selected (sachdeva et al. 2004 ). betawrap motifs are common in virulence factors of pathogens. proteins predicted to possess such motifs are appropriate subjects for reverse vaccinology studies. the betawrap server is the only online web server that makes such predictions. proteins with a p-value lower than 0.1 were anticipated to contain betawraps (http://groups.csail.mit.edu/cb/betawrap/betawrap.html; bradley et al. 2001 ). for additional assessment of their antigenic potential, the proteins were subjected to the vaxijen server version 2.0. it is basically an empirical method for finding antigenic proteins, so proteins that are not found antigenic using other sequence-based methods can still be identified using this method. this step confirms the antigenicity of the proteins selected in the above-mentioned steps (http://www.ddgpharmfac.net/vaxijen/vaxijen/vaxijen.html; doytchinova and flower 2007). to be a probable vaccine candidate, a protein should not exhibit the characteristics of an allergen, as allergens trigger type-1 hypersensitivity reactions causing allergy. 
therefore, to exclude such possibilities, the proteins were also subjected to allergenicity predictions using the allertop (http://www.pharmfac.net/allertop; dimitrov et al. 2014) and algpred tools (http://www.imtech.res.in/raghava/algpred/submission.html; saha and raghava 2006a, b). to check whether the filtered proteins possess any similarity to host proteins, standard blastp (http://blast.ncbi.nlm.nih.gov/blast) searches were performed. in case of sequence similarity, there is a possibility of generating immune responses against the host's own cells. predicting the epitopes that bind to mhc class i is the decisive phase of rv for making valid vaccine predictions. the predicted epitopes were docked with the receptor, hla-a*0201, using cluspro (http://cluspro.bu.edu/login.php; kozakov et al. 2017 ), an automated protein-protein docking web server. literature searches provided the information on the conserved residues of the receptor site. the default parameters were used for docking (comeau et al. 2004a, b; kozakov et al. 2006 ). a total of 40 different sequenced strains of coronavirus are available at ncbi. among them, 7 strains are pathogenic to humans. information regarding the source, host and collection of these strains is presented in tables 15.1 and 15.2. this information can be obtained from ncbi's genome database, the virus pathogen database and analysis resource and the genomes online database (liolios et al. 2006; pickett et al. 2012) . the mers strain is taken as the seed genome as it is the most prevalent and disastrous strain among them. its proteome consists of a total of 11 proteins, as shown in table 15.3. the results of the sequence similarity search for orthologs using blastp are shown in table 15.4. sequences with an identity score greater than 30% are considered homologs. the phylogenetic tree is depicted in fig. 15.1; mers-cov, taken as the seed genome, was found to cluster with different bat coronaviruses. 
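the 30% identity cut-off for calling homologs can be illustrated with a minimal sketch. it assumes that percent identity is counted as identical columns divided by the full alignment length, with gap columns counted as mismatches, which is a simplification of how blastp reports identity; the function names and the toy aligned strings are hypothetical and are not taken from the chapter.

```python
# Minimal sketch: percent identity of a pairwise alignment and the
# "greater than 30% identity" homolog rule described in the text.
# Function names and example sequences are illustrative assumptions.

HOMOLOG_IDENTITY_CUTOFF = 30.0  # percent, as stated in the chapter


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical columns divided by alignment length, as a percentage.

    Both inputs must be rows of the same alignment ('-' marks a gap).
    Gap columns count toward the length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def is_homolog(aligned_a: str, aligned_b: str,
               cutoff: float = HOMOLOG_IDENTITY_CUTOFF) -> bool:
    """True when identity is strictly greater than the cutoff."""
    return percent_identity(aligned_a, aligned_b) > cutoff


if __name__ == "__main__":
    # Toy alignment rows (hypothetical, not real coronavirus proteins).
    query = "ACDE-G"
    hit = "ACDFHG"
    print(percent_identity(query, hit), is_homolog(query, hit))
```

in practice the identity value would be read from the blastp report rather than recomputed; the sketch only makes the thresholding step explicit.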
the results of the analysis of the secondary structure of the proteome using expasy tools are shown in table 15.5. from the analysis of the charge on the residues and the ph values, it is concluded that six of the proteins are basic and positively charged, unlike allergens, which are acidic in nature. however, five proteins are acidic and show negative charge. the negative gravy score of five proteins indicates that they are hydrophilic, with the majority of the residues positioned towards the surface. for the remaining six proteins, the gravy score is positive, which means that these are hydrophobic proteins. the accession number and identity of orthologs obtained in different strains is shown in the table. proteins with an instability index below 40 are more stable than those with higher values. all the proteins have a molecular weight of less than 110 kda except three (yp_009047202.1, yp_009047203.1 and yp_009047204.1). this favours the lightweight proteins as targets, as they can be easily purified because of their low molecular weights. the protein yp_009047204.1 is reported as a spike glycoprotein. it is acidic with a prominent negative charge and has a negative gravy score, which suggests its hydrophilicity. figure 15.2 depicts the subcellular localization of the proteins of the seed genome, i.e. mers-cov. only one protein was predicted to be localized in the host cytoplasm, four in the host membrane, two in both the host cell membrane and the endoplasmic reticulum (er), two in the er only, and two were left unrecognized. the known spike protein is predicted to be localized in the host er. from these results we decided to pick the proteins which are located in the host membrane or were predicted to be localized in both the host membrane and the er. two of these are known from the literature to be the envelope protein and the membrane protein, and the known spike protein was also included in the filtered results. 
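the gravy score interpreted above is the sum of per-residue hydropathy values divided by the sequence length. the following minimal sketch assumes the kyte-doolittle hydropathy scale, which is the scale documented for protparam's gravy value; the function name is hypothetical, and the example peptide is the 9-mer vvcaitllv reported in this chapter's docking results.

```python
# Minimal sketch of the GRAVY (grand average of hydropathicity) score.
# Assumption: Kyte-Doolittle hydropathy values; the function name is
# illustrative and not part of any tool used in the chapter.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues of the sequence."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    unknown = set(residues) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[r] for r in residues) / len(residues)


if __name__ == "__main__":
    peptide = "VVCAITLLV"  # epitope reported in the chapter
    score = gravy(peptide)
    label = "hydrophobic" if score > 0 else "hydrophilic"
    print(f"{peptide}: GRAVY = {score:.3f} ({label})")
```

a positive value is read as hydrophobic and a negative value as hydrophilic, matching the interpretation used for the eleven mers-cov proteins.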
out of the filtered proteins, only two (yp_009047210.1 and yp_009047208.1) contain more than two transmembrane helices and were therefore filtered out. the results of the transmembrane helix prediction are tabulated in table 15.6. figure 15.3 depicts the subcellular localization of the proteins of all four selected genomes using the virus-mploc prediction tool. the proteins predicted to possess signal peptides by the signal-blast web server are yp_009047204.1 and yp_009047205.1. the results of the signal-blast web server are tabulated in table 15.7. this step takes into account the concept of adhesion-based virulence. adhesins cause pathogen recognition and the initiation of inflammatory responses by the host. spaan predicted 2 (yp_009047204.1 and yp_009047205.1) out of the 11 proteins of the mers strain as adhesive (table 15.8). only one protein (yp_009047204.1) was predicted to contain betawrap motifs (table 15.8). hence, it is considered virulent and might be responsible for initiating the infection in the host. a total of 9 out of the 11 proteins of the mers strain were predicted to be antigenic (prediction values greater than 0.4). the proteins with accession numbers yp_009047206.1 and yp_009047208.1 were among the filtered proteins but were not predicted to be antigenic and were therefore filtered out. as a result, only four proteins (yp_009047204.1, yp_009047205.1, yp_009047207.1 and yp_009047209.1) were kept for further analyses. none of the 11 proteins of mers-cov showed any sign of allergenicity according to the prediction results from the algpred and allertop tools, which means that no allergic responses are expected if epitopes from these proteins are adopted as vaccine candidates. none of the proteins of the mers strain shows similarity to host proteins, which indicates that epitopes from these proteins can elicit the required immune response without the hazard of autoimmunity. 
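the selection steps above can be summarised as a small filter over per-protein prediction results. the chapter does not state exactly how the individual predictions are combined, so the sketch below is an assumption: it treats transmembrane helix count, vaxijen antigenicity, allergenicity and host similarity as hard filters, and reports the spaan and betawrap results only as supporting flags. the cut-offs (at most two helices, vaxijen above 0.4, spaan at least 0.4, betawrap p-value below 0.1) come from the text; the class, the function names and all example values are hypothetical.

```python
# Minimal sketch of a reverse-vaccinology filter cascade.
# Assumptions: which steps are hard filters versus supporting flags,
# all names, and every value in the example records are illustrative.
from dataclasses import dataclass

MAX_TM_HELICES = 2        # proteins with more than two helices are dropped
VAXIJEN_CUTOFF = 0.4      # antigenic if the score is greater than this
SPAAN_CUTOFF = 0.4        # adhesin-like if probability >= this
BETAWRAP_P_CUTOFF = 0.1   # betawrap motif if p-value is lower than this


@dataclass
class ProteinPrediction:
    protein_id: str
    tm_helices: int
    vaxijen_score: float
    spaan_probability: float
    betawrap_p_value: float
    is_allergen: bool
    has_host_homolog: bool


def passes_hard_filters(p: ProteinPrediction) -> bool:
    """Apply the filters assumed to be mandatory for a candidate."""
    return (
        p.tm_helices <= MAX_TM_HELICES
        and p.vaxijen_score > VAXIJEN_CUTOFF
        and not p.is_allergen
        and not p.has_host_homolog
    )


def supporting_flags(p: ProteinPrediction) -> dict:
    """Report adhesin and betawrap evidence without filtering on it."""
    return {
        "adhesin_like": p.spaan_probability >= SPAAN_CUTOFF,
        "betawrap_motif": p.betawrap_p_value < BETAWRAP_P_CUTOFF,
    }


def select_candidates(predictions):
    """Return the ids of proteins that pass every hard filter."""
    return [p.protein_id for p in predictions if passes_hard_filters(p)]


if __name__ == "__main__":
    # Hypothetical records, not the chapter's actual prediction values.
    records = [
        ProteinPrediction("protein_a", 1, 0.55, 0.62, 0.01, False, False),
        ProteinPrediction("protein_b", 3, 0.70, 0.10, 0.50, False, False),
        ProteinPrediction("protein_c", 0, 0.35, 0.45, 0.30, False, False),
    ]
    for name in select_candidates(records):
        print(name)
```

keeping the adhesin and betawrap results as flags rather than filters reflects the fact that the four retained proteins were not all predicted adhesive; a stricter pipeline could move those checks into the hard filter.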
in total 12 different 9-mer epitopes with the potential to bind to receptors of both b-cells and t-cells were predicted. the predicted epitopes are listed in table 15.9 and are specific to the mers-cov strain. none of these epitopes was conserved in proteins of other human and non-human pathogenic strains. docking reveals the binding energy, i.e. the strength of the interaction between an epitope and the receptor in an appropriate orientation. the cluspro docking server was used to dock the predicted 90 epitopes against hla-a*0201. the structure of the receptor was available from pdb and was optimized before docking to free it from the complexed self-peptide (4u6y, resolution 1.47 å, bouvier et al. 1998 ). pepstr (peptide tertiary structure prediction server; kaur et al. 2007 ) was used to derive the tertiary structure of the predicted peptides. figure 15.4 depicts the quaternary structure of the receptor hla-a*0201 with its conserved active site known to form complexes with peptides (bouvier et al. 1998 ). the binding energy results obtained from the docking analysis are listed in table 15.9. the 9-mer epitope vvcaitllv at site 21 of protein yp_009047209.1 docked to the receptor with the lowest binding energy (-951.7) and 12 hydrogen bonds. the next epitope in the list, tllvcmafl, was also from the same protein yp_009047209.1, at site 27. the predicted structures of the top 5 epitopes on the basis of docking energy and the snapshots of the docking results are displayed in figs. 15.5, 15.6, 15.7, 15.8 and 15.9 . the chief obstacle to developing a safe and sound vaccine against any virus is the identification of the protective antigens. the present study is an effort to apply the reverse vaccinology approach to a selection of coronavirus proteomes in order to identify possible vaccine targets. this technique has proved to be a competent way to predict 12 different epitopes from the selected seed genome. 
these epitopes are from spike glycoprotein, ns3 protein, ns4b protein and envelope protein. unfortunately none of the epitope is found conserved in other strains, and all are specific to mers-cov. the docking analysis studies revealed perfect binding between hla-a*0201 receptor and epitopes. the conserved residues of the receptor site are also involved in h-bonding with epitope residues. further, the selected antigenic epitopes must be validated using in vitro and in vivo studies to confirm their potential as vaccine candidates. review: reverse vaccinology: developing vaccines in the era of genomics basic local alignment search tool the middle east respiratory syndrome coronavirus -a continuing risk to global health security crystal structures of hla-a*0201 complexed with antigenic peptides with either the amino-or carboxyl-terminal group substituted by a methyl group betawrap: successful prediction of parallel beta -helices from primary sequence reveals an association with many microbial pathogens geminiviridae. 
in: virus taxonomy-ninth report of the international committee on taxonomy of viruses lessons from reverse vaccinology for viral vaccine design discovery of novel cross-protective rickettsia prowazekii t-cell antigens using a combined reverse vaccinology and in vivo screening approach identification of novel vaccine candidates against acinetobacter baumannii using reverse vaccinology cluspro: a fully automated algorithm for protein-protein docking cluspro: an automated docking and discrimination method for the prediction of protein complexes middle east respiratory syndrome coronavirus (mers-cov): announcement of the coronavirus study group allertop v.2-a server for in silico prediction of allergens vaxijen: a server for prediction of protective antigens, tumour antigens and subunit vaccines high performance signal peptide prediction based on sequence alignment techniques meningococcus b: control of two outbreaks by vaccination protein identification and analysis tools on the expasy server unipro ugene ngs pipelines and components for variant calling, rna-seq and chip-seq data analyses discovering a vaccine against neosporosis using computers: is it feasible? 
genome informatics and vaccine targets in corynebacterium urealyticum using two whole genomes, comparative genomics, and reverse vaccinology reverse vaccinology: basics and applications pepstr: a de novo method for tertiary structure prediction of small bioactive peptides reverse vaccinology and vaccines for serogroup b neisseria meningitidis way to the peptide vaccine against hepatitis c piper: an fft-based protein docking program with pair wise potentials the cluspro web server for protein-protein docking predicting transmembrane protein topology with a hidden markov model: application to complete genomes an integrative approach to ctl epitope prediction, a combined algorithm integrating mhc-i binding, tap transport efficiency, and proteasomal cleavage prediction improved method for predicting linear b-cell epitopes the genomes on line database (gold) v.2: a monitor of genome projects worldwide identification of novel potential vaccine candidates against tuberculosis based on reverse vaccinology identification of putative vaccine candidates against helicobacter pylori exploiting exoproteome and secretome: a reverse vaccinology based approach database resources of the national center for biotechnology information unipro ugene: a unified bioinformatics toolkit scheme for ranking potential hla-a2 binding peptides based on independent binding of individual peptide side-chains virus pathogen database and analysis resource (vipr): a comprehensive bioinformatics database and analysis resource for the coronavirus research community identification of vaccine candidates against serogroup b meningococcus by whole-genome sequencing reverse vaccinology spaan: a software program for prediction of adhesins and adhesin-like proteins using neural networks algpred: prediction of allergenic proteins and mapping of ige epitopes prediction of continuous b-cell epitopes in an antigen using recurrent neural network reverse vaccinology: developing vaccines in the era of genomics 
key: cord-011184-ohdukhqt authors: patil, shital p.; goswami, ashutosh; kalia, kiran; kate, abhijeet s. title: plant-derived bioactive peptides: a treatment to cure diabetes date: 2019-07-22 journal: int j pept res ther doi: 10.1007/s10989-019-09899-z sha: doc_id: 11184 cord_uid: ohdukhqt abstract: recent advances in analytical techniques have opened new opportunities for plant-based drug discovery in the field of peptides and proteins. enzymatic hydrolysis of plant parent proteins forms bioactive peptides, which are being explored for the treatment of various diseases. in this review, we discuss the identified plant-based bioactive proteins and peptides and their in vitro and in vivo results for the treatment of diabetes. extraction, isolation, characterization and commercial utilization of plant proteins are a challenge for the pharmaceutical industry because plants contain several interfering secondary metabolites. the market for peptide drugs for the treatment of diabetes is growing at a fast rate. plant-based bioactive peptides might open up new opportunities to discover economical leads for the management of various diseases. biomolecules play an important role in any ligand-target interaction, and their recognition in a biological system starts a cascade of signalling steps that is an important part of normal body function. nucleic acids, carbohydrates, and lipids are the important ligands that bind to targets (receptors) and cause conformational changes in them, but the messengers released after binding and conformational change are peptides.
any defect in these signal transduction processes causes disease (otvos and wade 2014; gautam 1999). proteins are fragmented by proteolytic enzymes at specific sites to form bioactive peptides. bioactive peptides are made up of 2-20 amino acids and, because of their small size, lack protein-protein interactions. they have properties such as tissue affinity, specificity, and efficiency (moller et al. 2008). the structure, overall charge, and hydrophobicity/hydrophilicity of any bioactive peptide depend on the nature of its amino acids and their sequence in the peptide backbone, along with the n and c terminals. the relation between the structure and the biological activity of bioactive peptides is not yet confirmed, but structure is an important determinant of their activity (li and yu 2015). bioactive and nutraceutical peptides have a positive effect on the regulation of an individual's health (kitts and weiler 2003). they improve human health by preventing various ailments and by improving medical conditions related to lifestyle, metabolism, and immunity (vitetta et al. 2005). peptide and protein drugs are available from different sources for therapeutic use but, as with many drugs, they have advantages and disadvantages of their own. peptide drugs are effective and specific to their biological target but lack some of the essential qualities a drug needs to compete in the current therapeutics market. proteins have a large chemical structure, which makes their synthesis complex, tedious and expensive; in addition, their low oral bioavailability limits oral delivery, so they are administered by the parenteral route (uhlig et al. 2014). the worldwide peptide therapeutics market was valued at usd 22,071.5 million in 2017 and is projected to grow at a cagr of 9.1% over the period 2018 to 2026 (2016).
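the growth rate quoted above can be turned into a rough projection. the sketch below is only an illustration: it assumes, as a simplification not stated in the source, that the 2017 valuation is the base and that growth compounds once per year through 2026. the function name is hypothetical.

```python
def project_market(base_value_musd: float, cagr: float, years: int) -> float:
    """Compound a base valuation forward by `years` annual steps."""
    return base_value_musd * (1.0 + cagr) ** years


BASE_2017_MUSD = 22_071.5  # usd million, as quoted in the text
CAGR = 0.091               # 9.1% per year, as quoted in the text

# Assumption: 2017 is year 0 and the rate applies to each year 2018-2026.
for year in range(2018, 2027):
    value = project_market(BASE_2017_MUSD, CAGR, year - 2017)
    print(f"{year}: usd {value:,.1f} million")
```

under these assumptions the market roughly doubles over the nine-year window, which is what a rate of about 9% per year implies in general.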
insulin was the first therapeutic peptide to be used widely and holds a large share of the peptide market. it was first isolated by frederick banting and charles best from pancreatic islet extracts; j.b. collip then developed a method for the extraction and purification of insulin from other animal sources (quianzon and cheikh 2012). eli lilly began producing insulin from animal pancreas; later, biosynthetic insulin was introduced in 1983 under the name humulin. novo nordisk, sanofi, and eli lilly together hold 88.7 percent of the global insulin market. lantus (basal insulin), a product of sanofi, accounts for approximately 22.29% of the total sales of the top 10 antidiabetic drug brands; in 2016, sales of lantus were usd 6.057 billion (dezzani 2017). small-molecule and peptide-based drug candidates belong to the category of new chemical entities (nces) according to the food and drug administration, whereas proteins used for the treatment of various diseases are classified as new biological entities (nbes). in recent years, therapeutic peptides have been approved for the mitigation of various ailments, mainly tumors, immunological disorders, blood disorders and metabolic disorders such as diabetes and obesity (fig. 1) (loganathan 2016). the peptide drug liraglutide (victoza), a glucagon-like peptide-1 receptor (glp-1r) agonist marketed by novo nordisk and used for the treatment of t2dm, has become one of the top-selling drugs in the recent past (lund et al. 2014). glatiramer acetate (copaxone), developed by teva, is an immunomodulator used to treat multiple sclerosis (duda et al. 2000). leuprolide (lupron), developed by abbott, is useful for the treatment of cancer and estrogen-dependent conditions that respond to hormone therapy (brito et al. 2004).
goserelin (zoladex) from astrazeneca is a gonadotropin-releasing hormone agonist (gnrh agonist) that suppresses the release of sex hormones and is useful in the treatment of breast and prostate cancer (magon 2011). other innovator companies, including amgen, eli lilly, roche, and pfizer, have peptide and protein-based drugs in different developmental stages that are expected to reach the market soon for the treatment of various complicated diseases (loganathan 2016). to date, plant secondary metabolites, mainly small molecules, have been the established source of new drugs to treat various diseases (verpoorte 1998); however, advances in analytical techniques, sophisticated purification methodology and in vitro assay systems have led researchers to look beyond small molecules. the identification of plant primary metabolites, including proteins and peptides, for the management of diseases has opened up a new horizon in the development of plant-based peptides as drug candidates (otvos 2008). this review summarizes various plant-based peptides reported for the treatment of diabetes, the challenges in the development of peptide-based drugs, and the future prospects of peptides as drugs. diabetes is a metabolic disorder that can strike at any stage of life. although it is considered one of the leading causes of death, its treatment remains a huge challenge. diabetes is characterized by prolonged hyperglycemia, glucose intolerance, and disturbance of the regulatory systems for storage and utilization of metabolic energy, including the catabolism and anabolism of carbohydrates, lipids, and proteins, due to lack of insulin production, impaired insulin receptor functioning, or both (piero 2015; votey and peters 2005).
fig. 1 uses of therapeutic peptides and proteins in various disease conditions (loganathan 2016)
based on causes and clinical features, diabetes mellitus is categorized into four types, as listed in table 1 (deepthi et al. 2017). type 1 diabetes typically affects people below 18 years of age, whereas type 2 diabetes mainly occurs in adults and the elderly. according to the world health organisation, approximately 10% of all diabetes cases are type 1 and the remaining 90% of cases worldwide are type 2 (baynes 2015). diabetes is damaging lives worldwide. the population suffering from diabetes is increasing, and the prevalence is expected to rise even more because of current lifestyle issues. as of 2017, 422 million people were affected by type 2 diabetes, and the number is expected to increase to 642 million people worldwide by 2040 (fig. 2). these data indicate that the number of diabetic patients will increase by approximately 50% in the next two decades. every 6 s, a person dies from diabetes (roglic 2016). these statistics make it clear that the population suffering from type 2 diabetes is increasing and that more promising therapies are required. no drug is yet available that can completely cure type 2 diabetes. type 2 diabetes is generally treated with oral hypoglycemic agents. the first-line treatment is metformin, which is supplemented with hypoglycemic agents from other categories such as thiazolidinediones, alpha-glucosidase inhibitors, sulphonylureas, dipeptidyl peptidase-iv (dpp-iv) inhibitors, and sodium glucose cotransporter 2 (sglt 2) inhibitors. the risks associated with these therapies include hypoglycemia, weight gain, tiredness, diarrhea, and anemia. over time, these side effects lead to further complications in organ systems such as the cardiovascular system and the central nervous system (nathan et al. 2009).
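the prevalence and mortality figures quoted above can be checked with simple arithmetic. the sketch below uses only the numbers given in the text; the 365-day year and the helper name are assumptions made for illustration.

```python
def percent_increase(start: float, end: float) -> float:
    """Relative change from `start` to `end`, in percent."""
    return (end - start) / start * 100.0


# Figures quoted in the text (millions of people).
affected_2017 = 422
projected_2040 = 642
growth = percent_increase(affected_2017, projected_2040)

# "Every 6 s a person dies": deaths per year, assuming a 365-day year.
SECONDS_PER_YEAR = 365 * 24 * 3600
deaths_per_year = SECONDS_PER_YEAR // 6

print(f"projected increase: {growth:.1f}%")
print(f"implied deaths per year: {deaths_per_year:,}")
```

the computed increase is consistent with the "approximately 50%" stated in the text, and the 6 s interval corresponds to a little over five million deaths per year.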
peptide drugs used for the treatment of diabetes are derived from different sources; for example, exendin-4 was initially isolated from heloderma suspectum (gila monster), and its synthetic version came to the market as exenatide (aramadhaka et al. 2013). liraglutide and semaglutide are structurally very close to glp-1, with improved half-life and resistance to dpp-iv-mediated degradation (edmonds and price 2013). nowadays, these peptides are produced on a large scale using recombinant dna technology. peptides obtained by synthesis or by recombinant technology are more expensive than small molecules and are not affordable for most patients. interest in health-promoting products of natural origin and in pharmaceutical formulations involving bioactive peptides is increasing remarkably (henninot et al. 2018). numerous scientific reports have been published on bioactive peptides obtained from animal proteins. research on bioactive peptides is of great interest because it opens the way to finding economical leads from plants for the better care of diabetic patients (daliri et al. 2017). many bioactive peptides have been reported from animal and plant sources. most peptides with known bioactivity are derived from animal products such as milk, eggs, meat, and fish. plant peptides are still under exploration, and only a few bioactive peptides have been reported from soy, wheat, etc. (hartmann and meisel 2007). animal proteins are associated with high fat content and, if consumed in large amounts, can lead to diseases including high blood pressure and heart disease. plant proteins, on the other hand, are reported to have neither associated fat nor side effects (nehete et al. 2013).
previously, plant hormones were considered the only means by which plant cells communicate, but over the last decade, as research turned towards understanding plant signaling, secreted peptides, small rnas and transcription factors have emerged as new and well-defined players in the cell-to-cell communication network (lindsey et al. 2002; van norman et al. 2011; murphy et al. 2012). it is now clear that signaling peptides of plant and animal origin are biosynthesized through evidently similar pathways, although some aspects of the biosynthesis are still not identified. peptides secreted by plants have well-recognized roles in basic developmental processes such as meristem growth, organ shedding, cell elongation, cell multiplication, cell differentiation, geotropism and defense against invaders (ghorbani 2014). the ubiquitous presence of proteins and peptides as mediator components has led to the proposal that these messenger peptides emerged in microbes and were further integrated into the complicated mediator systems of complex organisms during evolution (roth et al. 1982). the existence of hormone-like peptides in plants, including an insulin-like peptide in spinacia oleracea l. (spinach) and lemna gibba g-3, a prolactin-like inhibitor in alfalfa, a substance with luteinizing hormone-releasing activity from leaves of avena sativa l. (oat) and somatostatin-like material in spinach, supports this assumption; it is speculated that signaling peptides have existed in plants for about 1 billion years (collier et al. 1987; fukushima et al. 1976; leroith et al. 1985; morley et al. 1980). plant proteins containing essential amino acids are important in the food chain to fulfill human physiological needs. plant proteins are derived from leaves, seeds, and fruits, of which seeds are considered an economical source of protein.
compared with vegetables and fruits, seeds (legumes and cereals) contain a higher amount of protein (creighton 1993). protein represents 18-20% of the mature seed in pisum sativum l. (pea) and vicia faba l. (faba bean), and up to 35-45% in lupinus albus l. (lupin) and glycine max l. (soybean) (gueguen and cerletti 1994). many peptides from different plants have been reported for the treatment of diabetes, as shown in tables 2 and 3, acting through various known targets: (a) alpha-glucosidase inhibitors, (b) alpha-amylase inhibitors, (c) dipeptidyl peptidase-iv inhibitors, (d) inhibitors of the glucose transporter system, and (e) insulin mimetics. alpha-glucosidase inhibitors competitively inhibit the small intestinal enzyme alpha-glucosidase, which converts non-absorbable polysaccharides into absorbable monosaccharides; this inhibition postpones and minimizes the rise in postprandial plasma glucose level. inhibitors of this enzyme decrease glucose toxicity (improving insulin sensitivity), decrease stress on beta cells (by decreasing post-meal hyperglycemia), and increase glucagon-like peptide-1 (glp-1) production and hence insulin secretion (scheen 2003; mooradian and thurman 1999). in general, normalizing postprandial hyperglycemia is more difficult than normalizing fasting hyperglycemia; these inhibitors act specifically on the postprandial part of the 24 h blood glucose curve and reduce the diabetes-associated hyperglycemia that is responsible for macrovascular complications in patients (mooradian and thurman 1999; ceriello et al. 2004). the glycaemic control achieved with these agents reduces glucotoxicity, which increases insulin secretion from the beta cells (scheen 2003; salvatore and giugliano 1996). all marketed alpha-glucosidase inhibitors are of natural origin; however, none of them belongs to the peptide class. a few research groups have reported plant peptides with alpha-glucosidase inhibitory activity. peptides obtained from cannabis sativa l.
(hemp) seeds contain hydrophobic amino acids (pro and leu), essential amino acids and branched-chain amino acids in their structure and have been reported to inhibit this enzyme (ren et al. 2016). oat seed protein hydrolysate obtained by protease (alcalase) digestion yields bioactive peptides that inhibit this enzyme. an in vivo study also revealed that oat peptides, at a higher dose, showed a hypoglycaemic effect in stz-induced diabetic mice, reduced food intake, stimulated insulin secretion, improved insulin sensitivity and elevated glycogenesis (zhang et al. 2015). walnut hydrolyzed peptides (whps) from the fruit proteins of juglans mandshurica maxim. (walnut) showed anti-diabetic activity by inhibiting the enzyme. an in vitro study of whps showed that peptides of molecular weight 3-10 kda inhibit alpha-glucosidase with an inhibitory rate of 61.73% and remarkably raise extracellular glucose consumption in insulin-resistant hepg2 cells, whereas in vivo results showed that whps reduce fasting blood glucose level by 64.82%, through increases in insulin secretion, liver glucokinase, and glycogen level of 23.71%, 69.54%, and 76.19% respectively (wang et al. 2018). vigna angularis willd. (adzuki bean) proteins have reported enzyme inhibitory properties in the kk-ay mouse model of diabetes. the extract reduces the postprandial blood glucose in two sucrose challenge models, i.e. normal and streptozocin-treated rats, by 15.6% and 30.9% respectively. in an in vitro study, extruded adzuki bean proteins at a concentration of 10 mg/ml inhibited 60.44% of rat intestinal alpha-glucosidase (yao et al. 2014).
tables 2 and 3 (fragment, rows 10-25 as extracted; fields: no. | plant | reported activity, where given | study type | reference)
10 | hook.f. | in vitro and in vivo | marella et al. (2016)
11 | cucurbita pepo l. | in vitro | vaštag et al. (2014)
12 | lam. | in vitro | gonzalez garza et al. (2017)
13 | l. | in vitro | arise (2016)
14 | theobroma cacao l. | in vivo | sarmadi et al. (2012)
15 | l. | in vitro | mojica et al. (2017b)
16 | willd. | in vitro | nongonierma et al. (2015)
17 | l. | in vitro and in vivo | yibchok-anun et al. (2006)
18 | momordica charantia l. | insulin mimetic | in vivo | khanna et al. (1981)
19 | momordica charantia l. | seed protein expressed in e. coli stimulated the phosphorylation of pdk1 and akt, enhanced expression of glut-4, and stimulated both the uptake of glucose in cells and the clearance of glucose | in vitro and in vivo | lo et al. (2016)
20 | hook.f. | in vivo | rajasekhar et al. (2010)
21 | hook.f. molina. duchesne. | in vivo | teugwa et al. (2013)
22 | zea mays l. | enhances the secretion of glp-1 | in vivo | hira et al. (2009)
23 | glycine max l. | increases glucose uptake; enhances the expression of p-ir, p-irs1, p-akt and membrane glut4 protein; improves insulin resistance; hypoglycemic | in vitro and in vivo | lu et al. (2012)
24 | oryza sativa l. | improves insulin resistance | in vivo | boonloh et al. (2015)
25 | roxb. | in vivo | poovitha et al. (2017)
alpha-amylase (alpha-1,4-glucan-4-glucanohydrolase) is an important enzyme that speeds up the hydrolysis of the (alpha-1,4) glycosidic linkages of the carbohydrate and starch present in the diet and helps in the absorption of non-absorbable polysaccharides after their conversion into absorbable monosaccharides (alagesan et al. 2012). inhibitors of amylase minimize the hydrolysis of (alpha-1,4) glycosidic linkages, slow the digestion of carbohydrate, and thus delay the absorption of glucose and reduce the postprandial glucose level in blood (jayaraj et al. 2013). pinto beans are a rich source of proteins and other nutrients but are not well explored for their therapeutic benefits.
pinto bean protein fraction was digested enzymatically using protease (protamex) at ph 6.5 and 50 °c for an incubation period of 1 h with an enzyme to substrate (e/s) ratio of 1:10; the hydrolysed fraction was dialysed using molecular weight cut-offs, and the fraction with molecular weight < 3 kda showed 62.10 ± 3.49% enzyme inhibition (ngoh and gan 2016). similarly, digestion with protease (bromelain) and dialysis with a molecular weight cut-off of < 1 kda showed 49.9 ± 1.4% enzyme inhibition (oseguera-toledo et al. 2015). the peptide obtained from cuminum cyminum l. (cumin) seeds, cumin seed peptide 1 (csp1), showed 24.54% alpha-amylase inhibition (siow and gan 2016). an in vitro study of triticum aestivum l. (wheat) albumin extract showed alpha-amylase inhibition, and wheat albumin restrained the peak of postprandial blood glucose levels in a dose-dependent manner: postprandial blood glucose was reduced by 31%, 47%, and 50% after administration of 0.25 g, 0.5 g, and 1.0 g of wheat albumin respectively. in a long-term administration study, 0.5 g of wheat albumin did not affect fasting blood glucose levels, but it reduced hemoglobin a1c levels, which reflect the metabolic control of an individual suffering from diabetes (kodama et al. 2005). in another study, different protein fractions isolated from hordeum vulgare l. (barley) flour were subjected to pancreatic hydrolysis, and the peptides obtained showed 57-77% alpha-amylase inhibitory activity (alu'datt et al. 2012). cucurbitin protein was obtained from cucurbita pepo l. (pumpkin) seeds, and its alcalase- and pepsin-hydrolysed fractions showed enzyme inhibition (vaštag et al. 2014). oligopeptides from mulberry showed the highest enzyme inhibition, with an ic50 of 16.25 µg/ml. protein hydrolysates were prepared from the seeds of moringa oleifera lam. using enzymes such as trypsin, chymotrypsin and pepsin-trypsin for 2.5 and 5 h.
the pepsin-trypsin digested fraction showed alpha-amylase inhibition, with ic50 values of 0.195 and 0.123 µg/ml after 2.5 h and 5 h of hydrolysis respectively (gonzalez garza et al. 2017). protein hydrolysates of citrullus lanatus l. (watermelon) seeds were prepared using the enzymes pepsin, trypsin and alcalase; the alcalase hydrolysate gave the highest enzyme inhibition, with an ic50 of 0.149 mg/ml (arise 2016). autolysates of theobroma cacao l. fruits were prepared at ph 3.5 and 5.2. the autolysate at ph 3.5 showed the highest inhibition of alpha-amylase and also helped to release insulin; an in vivo study with the autolysates showed a decrease in blood glucose level (sarmadi et al. 2012). quinoa proteins were enzymatically hydrolyzed to obtain bioactive peptides, and the hydrolyzed fraction showed 6.86% inhibition of alpha-amylase at 250 µm concentration (vilcacundo et al. 2017). two main gut-derived hormones, glp-1 and glucose-dependent insulinotropic polypeptide (gip), also known as incretin hormones, are responsible for the attainment of normoglycemia. they stimulate insulin secretion, decrease glucagon, inhibit gastric emptying, and reduce food intake and appetite (drucker and nauck 2006). in diabetic individuals, reduced secretion of incretin hormones has been observed. these gut hormones bind to their respective receptors to exert their biological action (barnett 2006). treatment of diabetes requires either restoring the normal secretion of these hormones or reducing their degradation. dipeptidyl peptidase-iv (dpp-iv) is a protease that cleaves the glp-1 hormone within minutes of its secretion, rendering it biologically inactive. thus, inhibition of dpp-iv has the potential to reverse the hyperglycaemic condition.
dpp-iv inhibitors block the rapid degradation of glp-1 and enhance the postprandial level of active glp-1, which in turn reduces glucagon production and stimulates the beta cells of the pancreas to increase the insulin level (ahrén 2007). dipeptides obtained from oryza sativa l. (rice) bran protein hydrolysates inhibited the dpp-iv enzyme with an ic50 of 1.45 ± 0.13 mg/ml (hatanaka et al. 2015). in another study, rice bran protein fractions were defatted and hydrolyzed with umamizyme g and bioprase sp. the dipeptides obtained after digestion with umamizyme g inhibited the enzyme with an ic50 of 2.3 ± 0.1 mg/ml (hatanaka et al. 2012). proteins extracted from amaranthus hypochondriacus l. were subjected to simulated gastrointestinal digestion, which yielded bioactive peptides. these peptide fractions showed dose-dependent enzyme inhibition, with an ic50 of 1.1 mg/ml (velarde-salcedo et al. 2013). glycine max l. (soybean) and lupinus albus l. (lupin) protein hydrolysates have been screened for the presence of bioactive peptides, and soy 1 and lup 1 were found to be efficient inhibitors of dpp-iv, with ic50 values of 106 and 228 μm respectively (lammi et al. 2016). the peptides cpgnk and ggglhk, obtained from the protein digestion of common beans, showed enzyme inhibitory activity, and the ic50 values (mg dw ml −1) of the pure peptides were 0.87 ± 0.02 and 0.61 ± 0.10 respectively (mojica et al. 2017b). gastrointestinal digestion of proteins obtained from phalaris canariensis l. (canary) seed showed 43.5% inhibition of dpp-iv (estrada-salas et al. 2014). enzymatic digestion of quinoa proteins with papain showed enzyme inhibition, and the ic50 of the papain hydrolysate was 0.88 ± 0.05 mg/ml (nongonierma et al. 2015).
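the dpp-iv ic50 values above are reported in different units (mg/ml of hydrolysate, mg dry weight per ml, and micromolar for pure peptides), so they can only be ranked within a unit unless molecular weights are known. the sketch below groups the values quoted in the text by unit and sorts each group; the labels, the ascii unit strings and the function name are shorthand introduced here for illustration.

```python
from collections import defaultdict

# (label, ic50, unit) as quoted in the text; "uM" stands for micromolar.
REPORTED_IC50 = [
    ("rice bran dipeptides (hatanaka et al. 2015)", 1.45, "mg/ml"),
    ("rice bran, umamizyme g digest (hatanaka et al. 2012)", 2.3, "mg/ml"),
    ("amaranth digest (velarde-salcedo et al. 2013)", 1.1, "mg/ml"),
    ("soy 1 (lammi et al. 2016)", 106.0, "uM"),
    ("lup 1 (lammi et al. 2016)", 228.0, "uM"),
    ("common bean cpgnk (mojica et al. 2017b)", 0.87, "mg dw/ml"),
    ("common bean ggglhk (mojica et al. 2017b)", 0.61, "mg dw/ml"),
    ("quinoa papain hydrolysate (nongonierma et al. 2015)", 0.88, "mg/ml"),
]


def rank_by_unit(entries):
    """Group entries by unit and sort each group by ascending ic50."""
    groups = defaultdict(list)
    for label, value, unit in entries:
        groups[unit].append((value, label))
    return {unit: sorted(items) for unit, items in groups.items()}


ranked = rank_by_unit(REPORTED_IC50)
for unit, items in ranked.items():
    print(unit)
    for value, label in items:
        print(f"  {value:g}  {label}")
```

a lower ic50 means less material is needed for 50% inhibition, so the first entry in each group is the most potent among values sharing that unit. comparing across groups would need information the text does not give.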
a study of simulated duodenal digestion of quinoa protein found that the fraction (> 5 kda) obtained after 120 min of digestion showed the highest dpp-iv inhibition, with an ic50 of 0.84 ± 0.07 mg protein/ml (vilcacundo et al. 2017). glucose, the metabolic fuel of any living cell, is unable to cross the lipid bilayer of the membrane due to its polar nature; membrane proteins (glucose transporters) associated with the lipid bilayer help glucose to travel across it (abdul-ghani et al. 2011). glut transporters facilitate the passive transport of glucose across the membrane along its concentration gradient and do not require energy to operate; sglt transporters, on the other hand, actively transport glucose across the membrane against its concentration gradient and therefore require energy. to supply this energy, sodium is transported along its concentration or electrochemical gradient, which provides the energy needed to transport glucose across the membrane (musso et al. 2012; brown 2000). in the human body, at least 12 different glut transporters and 7 sglt transporters exist, all of which are membrane proteins. up to 90% of filtered glucose is reabsorbed in the convoluted segment of the proximal tubule via the high-capacity, low-affinity sglt 2 transporter, while the remaining 10% is reabsorbed in the distal segment of the proximal tubule via the high-affinity, low-capacity sglt 1 transporter. in diabetes, hyperglycemia causes a high glucose load; to compensate, expression of the glucose transporter gene in the proximal tubule increases (abdul-ghani et al. 2011). plant peptides targeting glucose transporters are less explored. however, peptides with the amino acid sequences aksplf, atnplf, feeln, and lsvsvl, isolated from black beans, have shown the ability to block the glucose transporters glut-2 and sglt-1.
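since the charge and hydrophobicity of a peptide follow from its amino acid sequence, the four black bean sequences above can be described with two simple indices. the sketch below is not the analysis used in the cited work; it assumes the kyte-doolittle hydropathy scale for the average hydropathy (gravy) and a crude side-chain charge count (k and r as +1, d and e as -1) that ignores termini, histidine and ph. the function names are illustrative.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    residues = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def side_chain_charge(sequence: str) -> int:
    """Crude charge estimate: (K + R) minus (D + E); termini and pH ignored."""
    residues = sequence.upper()
    positive = sum(residues.count(aa) for aa in "KR")
    negative = sum(residues.count(aa) for aa in "DE")
    return positive - negative


# Sequences quoted in the text for the black bean peptides.
for peptide in ("aksplf", "atnplf", "feeln", "lsvsvl"):
    print(f"{peptide}: gravy={gravy(peptide):+.2f}, "
          f"side-chain charge={side_chain_charge(peptide):+d}")
```

a positive gravy value indicates a predominantly hydrophobic peptide and a negative value a hydrophilic one; a more careful treatment would compute the net charge at a given ph from pka values.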
in vitro studies using the caco-2 cell model showed that these bioactive peptides reduced glucose absorption by 21.5% after 24 h of treatment. the oral glucose tolerance test in rats showed a 24.5% decrease in postprandial glucose (p < 0.05) at 50 mg hydrolyzed protein isolate/kg bw (mojica et al. 2017a). the hormone insulin triggers several signaling events that control the metabolic fate of nutrients. insulin binds to the alpha-subunit of the insulin receptor, which causes conformational changes in the β-subunit, and these conformational changes activate the cytoplasmic tyrosine kinase domain. this activation leads to auto-phosphorylation, which in turn trans-phosphorylates various intracellular substrates of the insulin receptor. these effects collectively control metabolic and mitogenic processes (nankar and doble 2013). activation of the tyrosine kinase domain by an insulin mimetic causes auto-phosphorylation of the receptor and triggers the downstream signaling required for the metabolic activity of insulin. another mechanism by which an insulin mimetic (vanadate) works is by inhibiting dephosphorylation of the insulin receptor via inhibition of tyrosine phosphatase (stapleton 2000). reports suggest that plant extracts of spinach and lemna gibba g-3 contain material that mimics mammalian insulin, which was confirmed by radioimmunoassay. one study indicated that the plant insulin-like material binds to insulin receptors on im-9 lymphocytes and stimulates glucose oxidation and lipogenesis in adipocytes isolated from young rats (collier et al. 1987). bio-molecules from vigna unguiculata l. (cowpea) have contributed to reducing the blood glucose level and enhancing the antioxidant status of patients. in a study on l6 rat skeletal muscle cells, the cells were treated with cowpea peptides at different doses (0.1, 1, 10 and 100 ng) for 20 h or with insulin (100 nm) for 30 min.
proteins were then isolated from the treated cells for western blot analysis to check the phosphorylation of akt (a form of protein kinase b; pkb). the results showed that cowpea peptides are able to induce phosphorylation of akt in cell culture. this observation suggests that treating skeletal muscle cells with cowpea peptides can initiate the insulin signaling cascade. it is possible that cowpea peptides mimic insulin by inducing the same signaling cascade (uruakp, 2015). linum usitatissimum l. (flax) seed peptides obtained from the protein hydrolysate (50 kd) increase glucose uptake in l6 cells. the mc2-1-5 peptide fraction obtained from momordica charantia l. showed a hypoglycaemic effect in an in vivo study: it decreased the blood glucose level in alloxan-induced diabetic mice by 61.70% and 69.18% at 2 and 4 h respectively (yuan et al. 2008). a study of protein from the pulp of m. charantia l. showed increased glucose uptake in c2c12 myocytes and 3t3l adipocytes; the in vivo studies showed that the pulp protein helps in the secretion of insulin and also acts like insulin to lower the glucose level (yibchok-anun et al. 2006). polypeptide-p, isolated from m. charantia l., acts as an insulin mimetic (khanna et al. 1981). another protein reported from the plant, a 68-residue insulin receptor (ir)-binding protein (mcirbp), yields mcirbp-19 upon in vitro digestion; this fragment spans residues 50-68 of mcirbp, increases the binding of insulin to its receptor, stimulates the phosphorylation of pdk1 and akt, induces the expression of glucose transporter 4, and stimulates both the uptake of glucose in cells and the clearance of glucose in diabetic mice (lo et al. 2016). a novel protein called mcy, with a molecular weight of 17 kd, was isolated from momordica cymbalaria l. fruit and showed a hypoglycemic effect in an in vivo study (rajasekhar et al. 2010).
the peptide fraction obtained from soybean increased glucose uptake in muscle cells; it is reported that this fraction enhances glucose uptake via ampk activation (roblet et al. 2014). protein extracts of seeds from the cucurbitaceae family, namely telfairia occidentalis hook.f., citrullus lanatus l., lagenaria siceraria molina., and cucurbita moschata duchesne., showed a significant decrease in blood sugar level in an in vivo study (teugwa et al. 2013). a peptide reported from the leaves of bauhinia variegata l. mimics insulin, as shown by its immunological reactivity and by an in vivo study (azevedo et al. 2006). zein hydrolysate from zea mays l. (maize) increases the secretion of glp-1 directly in the ileum and indirectly in the duodenum of the rat, which was also confirmed in a glutag cell line study (hira et al. 2009). aglycine, a bioactive peptide obtained from soybean, increases glucose uptake in c2c12 cells, whereas in an in vivo study on a diabetic model, treatment with aglycine increased the expression of p-ir, p-irs1, p-akt and membrane glut4 protein, resulting in a hypoglycemic effect (lu et al. 2012). soymorphin-5, a µ-opioid peptide obtained from soybean, decreases the blood glucose level via activation of the adiponectin and pparα systems in diabetic kkay mice (yamada et al. 2011). rice bran protein hydrolysate helps to improve insulin resistance and prevent metabolic diseases (boonloh et al. 2015). protein extracted from the fruit pulp of momordica dioica roxb. decreases the blood glucose level in a diabetic model (poovitha et al. 2017). plant scientists believe that many unexplored plants may contain insulin-like peptides or peptides that mimic insulin (xavier-filho et al. 2003). the success of any therapy and its acceptability depend on the ease of its delivery.
the oral route remains the most accepted route of drug delivery, and most of the drugs available on the market are delivered orally (morishita and peppas 2006). an orally active drug should be stable enough to withstand the gastric microenvironment and should also be able to permeate the gastrointestinal membrane. small molecules tolerate the gastric ph well and permeate the membrane, whereas large macromolecules are vulnerable to degradation in the gastrointestinal tract (git) and cannot cross the membrane in their intact form to reach their specific target (morishita and peppas 2006). as the medical field advances, peptide drugs are being widely explored for their therapeutic potential in the treatment of various diseases. though these drugs are specific to their target, efficacious and potent, the oral bioavailability of peptides and peptide-based drugs remains a challenge that limits their wide acceptance and market value. the high cost of therapy is a major limitation associated with peptide drugs (ayoub and scheidegger 2006). with the advancement of biotechnology, the recombinant synthesis of a peptide of interest opens a door to cost-effective peptides. commercial interest in producing bioactive peptides from natural sources has also risen as a way to reduce cost, but large-scale production is hampered by the lack of suitable technologies that retain or increase the activity of the bioactive peptides (craik et al. 2013). another limitation associated with peptide-based drugs is their short half-life. peptides are exposed to the intestinal mucosa, to gastric acid in the stomach and to peptidases in the blood, which break them down into amino acids that are quickly eliminated from the body, thereby reducing their half-life.
in order to achieve prolonged action and to reduce dosing frequency, it is necessary to increase the half-life of peptides. as peptide therapeutics advance, several non-degradable polymers and polymeric matrices have become available to increase the half-life of peptides and proteins, thus reducing dosing frequency (brown 2005). another strategy to increase the half-life of a peptide is to modify the amino acids that are susceptible to enzymatic cleavage (adessi and soto 2002). oral delivery of peptides has been explored in past years, but its use is limited because peptides are degraded in the git (lau et al. 2015). novo nordisk developed an oral semaglutide formulation which is currently in phase 3 clinical trials. other non-invasive routes under investigation for peptide and protein-based drugs include the nasal and pulmonary routes, which deliver the drug directly to the blood and hence reduce the overall drug degradation that is the main drawback of oral peptide delivery (davies et al. 2017). different therapeutic peptides have made their way to the market in recent years, and it is assumed that market demand will increase further as peptides are specific, selective and safe in comparison to small molecules (fosgerau and hoffmann 2015; loffet 2002). traditional synthesis and drug design approaches are used by researchers and pharmaceutical companies for the development of peptide-based drug candidates. recombinant protein expression is another tool for the synthesis of the desired peptide in bacteria, yeast, and fungi. the peptide market has now moved from peptides isolated from animal sources that can mimic the endogenous peptide to synthetic, semi-synthetic and recombinant peptides. further developments in this area are drug-conjugated peptides and multifunctional peptides (fosgerau and hoffmann 2015).
peptides are susceptible to degradation when given orally, which is otherwise the most convenient route of administration. the preferred route for peptide delivery is therefore intravenous; however, more research is being directed at other means of peptide delivery. recently, alternative routes have been explored, which include the oral, transdermal and intranasal routes. as peptides are excellent biomarkers, they can also be utilized for diagnosing diseases. further, peptides have been found to be advantageous in vaccine development (brown 2005; sheridan 2012). plant peptides have been isolated and characterized in different ways, as described in fig. 3. x-ray crystallography and nmr have contributed enormously to the structure elucidation of pure peptides (gomathi and subramanian 1996; krishnan and rupp 2012). other techniques such as electron microscopy, fluorescence resonance energy transfer and chemical cross-linking have emerged in recent years for the characterization of proteins and peptides (sinz et al. 2015). mass spectrometry is a frequently used technique for analyzing peptide mixtures and identifying their proposed structures. amino acid sequencing by mass spectrometry can be performed by two methods, namely top-down sequencing and bottom-up sequencing. in top-down sequencing, the protein is analyzed without hydrolysis, and the molecular weight can be determined from the fragmentation pattern of the whole protein. bottom-up sequencing, in which proteolytic digestion is followed by ms analysis, is the more common approach. software based on amino acid sequence algorithms is available for protein identification (wysocki et al. 2005).
fig. 3 the general scheme of isolation, purification, and characterization of plant peptides (gomathi and subramanian 1996; krishnan and rupp 2012; sinz et al. 2015; wysocki et al. 2005)
plants remain a valuable source for developing new drug candidates. they still contribute a fair share of new drug candidates and leads.
as research interest has now shifted from small molecules to biomolecules by virtue of several added benefits, plants are being explored for proteins and bioactive peptides useful in disease management. plant proteins and their functions were not well defined earlier, and plant hormones were considered the only means through which cells communicate; as our understanding of these signalling pathways has become clearer, several secreted peptides and small rnas are now known to regulate molecular recognition and thus control cell communication. the proteins we consume in our diet are converted by gastrointestinal digestion into bioactive peptides with greater tissue affinity for specific receptors. several research publications in recent years have identified plant proteins and their bioactive peptides as potential candidates for the treatment of various diseases, especially metabolic and lifestyle-borne diseases including cancer, diabetes and obesity. sophisticated instruments and advances in technology allow easy isolation, purification and characterization of proteins and peptides in a short period of time, which further reduces the overall time of discovery. most of the bioactive peptides currently available for the treatment of diabetes have emerged from the synthetic route or are derived from recombinant technology, which adds to the overall cost of treatment. to make peptide therapy economical, there is a demand to explore plant-based biomolecules. this review discusses bioactive peptides isolated from various plant parts such as leaves, fruits and seeds. the trend indicates that most bioactive peptides are obtained from seeds, although there are also reports of bioactive peptides isolated from leaves, whole fruits and pulps.
various in vitro and in vivo studies have shown that protein hydrolysates obtained from aqueous plant extracts are capable of inhibiting the enzymes and transporter systems involved in diabetes. further work to identify the molecular targets and mechanisms of action of plant-based bioactive peptides is needed before they can be used as potential drug candidates. epidemiological data reveal that type 2 diabetes requires promising new therapy, as the synthetic drugs available for its treatment have moderate to severe adverse effects. plant-derived bioactive peptides inhibit enzymes such as alpha-glucosidase, alpha-amylase and dipeptidyl peptidase-iv, as well as the glucose transporter systems involved in type 2 diabetes. in vivo studies of certain plant peptide fractions showed insulin-mimetic action in animal models. it is clear that plant-derived bioactive peptides have not been explored to their full potential in comparison to other classes of natural products. the reason might be the lack of suitable instrumental techniques to identify, purify and characterize peptides from plant sources. however, recent advances in lc-ms, lc-nmr, and x-ray crystallography have bridged this gap, and this would be the right time to embark on plant peptide research. in future, systematic studies including dereplication for the early identification of known peptides, and target identification followed by in vivo studies, would help to speed up plant peptide research.
dipeptidyl peptidase-4 inhibitors: clinical data and clinical implications amylase inhibitors: potential source of anti-diabetic drug discovery from medicinal plants anti-oxidant, anti-diabetic, and anti-hypertensive effects of extracted phenolics and hydrolyzed peptides from barley protein fractions connectivity maps for biosimilar drug discovery in venoms: the case of gila monster venom and the anti-diabetes drug byetta® in vitro antioxidant and α-amylase inhibitory properties of watermelon seed protein hydrolysates peptide drugs, overcoming the challenges, a growing business isolation and intracellular localization of insulin-like proteins from leaves of bauhinia variegata dpp-4 inhibitors and their potential role in the management of type 2 diabetes classification, pathophysiology, diagnosis and management of diabetes mellitus rice bran protein hydrolysates improve insulin resistance and decrease pro-inflammatory cytokine gene expression in rats fed a high carbohydrate-high fat diet a single luteinizing hormone determination 2 hours after depot leuprolide is useful for therapy monitoring of gonadotropin-dependent precocious puberty in girls glucose transporters: structure, function and consequences of deficiency commercial challenges of protein drug delivery postprandial glucose regulation and diabetic complications coherent market insights partial purification and characterization of an insulin-like material from spinach and lemna gibba g3 the future of peptidebased drugs proteins: structures and molecular properties effect of oral semaglutide compared with placebo and subcutaneous semaglutide on glycemic control in patients with type 2 diabetes: a randomized clinical trial a modern review of diabetes mellitus: an annihilatory metabolic disorder bloomberg markets top 20 diabetes drugs 2017 anti-diabetic and antihypertensive activities of two flaxseed protein hydrolysate fractions revealed following their simultaneous separation by electrodialysis with 
ultrafiltration membranes the incretin system: glucagon-like peptide-1 receptor agonists and dipeptidyl peptidase-4 inhibitors in type 2 diabetes glatiramer acetate (copaxone®) induces degenerate, th2-polarized immune responses in patients with multiple sclerosis characterization of antidiabetic and antihypertensive properties of canary seed (phalaris canariensis l.) peptides peptide therapeutics: current status and future directions extraction and purification of a substance with luteinizing hormone releasing activity from the leaves of avena sativa defects in signal transduction proteins leading to disease. in: sitaramayya a (ed) introduction to cellular signal transduction signaling peptides in plants elucidation of secondary structures of peptides using high resolution nmr biofunctional properties of bioactive peptide fractions from protein isolates of moringa seed (moringa oleifera) proteins of some legume seeds: soybean, pea, fababean and lupin food-derived peptides with biological activity: from research to food applications production of dipeptidyl peptidase iv inhibitory peptides from defatted rice bran anti-oxidation activities of rice-derived peptides and their inhibitory effects on dipeptidylpeptidase-iv the current state of peptide drug discovery: back to the future glp-1 secretion is enhanced directly in the ileum but indirectly in the duodenum by a newly identified potent stimulator, zein hydrolysate, in rats amylase inhibitors and their biomedical applications in vitro antioxidant and antidiabetic activity of oligopeptides derived from different mulberry hypoglycemic activity of polypeptide-p from a plant source bioactive proteins and peptides from food sources. 
applications of bioprocesses used in isolation and recovery effects of single and long-term administration of wheat albumin on blood glucose control: randomized controlled clinical trials macromolecular structure determination: comparison of x-ray crystallography and nmr spectroscopy peptides derived from soy and lupin protein as dipeptidyl-peptidase iv inhibitors: in vitro biochemical screening and in silico molecular modeling study discovery of the once-weekly glucagonlike peptide-1 (glp-1) analogue semaglutide somatostatin-like material is present in flowering plants research progress in structure-activity relationship of bioactive peptides peptides: new signalling molecules in plants identification of the bioactive and consensus peptide motif from momordica charantia insulin receptor-binding protein peptides as drugs: is there a market? growth potential in peptide and protein therapeutics to enhance supplier capacities the soybean peptide aglycin regulates glucose homeostasis in type 2 diabetic mice via ir/irs1 pathway glucagon-like peptide-1 receptor agonists for the treatment of type 2 diabetes: differences and similarities gonadotropin releasing hormone agonists: expanding vistas mcy protein, a potential antidiabetic agent: evaluation of carbohydrate metabolic enzymes and antioxidant status optimization of enzymatic production of anti-diabetic peptides from black bean (phaseolus vulgaris l.) proteins, their characterization and biological potential evaluation of the hypoglycemic potential of a black bean hydrolyzed protein isolate and its pure peptides using in silico, in vitro and in vivo approaches characterization of peptides from common bean protein isolates and their potential to inhibit markers of type-2 diabetes, hypertension and oxidative stress bioactive peptides and proteins from foods: indication for health effects drug therapy of postprandial hyperglycaemia is the oral route possible for peptide and protein drug delivery? 
a prolactin inhibitory factor with immunocharacteristics similar to thyrotropin releasing factor (trh) is present in rat pituitary tumors (gh3 and w5), testicular tissue and a plant material, alfalfa small signaling peptides in arabidopsis development: how cells communicate over a short distance a novel approach to control hyperglycemia in type 2 diabetes: sodium glucose cotransport (sglt) inhibitors: systematic review and meta-analysis of randomized trials non-peptidyl insulin mimetics as a potential antidiabetic agent medical management of hyperglycemia in type 2 diabetes: a consensus algorithm for the initiation and adjustment of therapy: a consensus statement of the american diabetes association and the european association for the study of diabetes natural proteins: sources, isolation, characterization and applications enzyme-assisted extraction and identification of antioxidative and alpha-amylase inhibitory peptides from pinto beans (phaseolus vulgaris cv. pinto) quinoa (chenopodium quinoa willd.) protein hydrolysates with in vitro dipeptidyl peptidase iv (dpp-iv) inhibitory and antioxidant properties hard-to-cook bean (phaseolus vulgaris l.) proteins hydrolyzed by alcalase and bromelain produced bioactive peptide fractions that inhibit targets of type-2 diabetes and oxidative stress peptide-based drug design: here and now current challenges in peptide-based drug discovery diabetes mellitus-a devastating metabolic disorder protein extract from the fruit pulp of momordica dioica shows anti-diabetic, anti-lipidemic and antioxidant activity in diabetic rats history of insulin isolation and characterization of a novel antihyperglycemic protein from the fruits of momordica cymbalaria identification and characterization of two novel α-glucosidase inhibitory oligopeptides from hemp (cannabis sativa l.) 
seed protein enhancement of glucose uptake in muscular cell by soybean charged peptides isolated by electrodialysis with ultrafiltration membranes (eduf): activation of the ampk pathway world health organization the evolutionary origins of hormones, neurotransmitters, and other extracellular chemical messengers: implications for mammalian biology pharmacokinetic-pharmacodynamic relationships of acarbose hypoglycemic effects of cocoa (theobroma cacao l.) autolysates is there a role for α-glucosidase inhibitors in the prevention of type 2 diabetes mellitus? proof of concept for next-generation nanoparticle drugs in humans chemical cross-linking and native mass spectrometry: a fruitful combination for structural biology extraction, identification, and structureactivity relationship of antioxidative and α-amylase inhibitory peptides from cumin seeds (cuminum cyminum) selenium: an insulin mimetic anti-hyperglycaemic globulins from selected cucurbitaceae seeds used as antidiabetic medicinal plants in africa the emergence of peptides in the pharmaceutical business: from exploration to exploitation influence of cowpea (vigna unguiculata) peptides on insulin resistance intercellular communication during plant development bioactivity evaluation of cucurbitin derived enzymatic hydrolysates in vitro inhibition of dipeptidyl peptidase iv by peptides derived from the hydrolysis of amaranth (amaranthus hypochondriacus l.) proteins exploration of nature's chemodiversity: the role of secondary metabolites as leads in drug development release of dipeptidyl peptidase iv, α-amylase and α-glucosidase inhibitory peptides from quinoa (chenopodium quinoa willd.) 
during in vitro simulated gastrointestinal digestion mind-body medicine: stress and its impact on overall health and longevity diabetes mellitus, type 2-a review evaluation of the antidiabetic activity of hydrolyzed peptides derived from juglans mandshurica maxim fruits in insulin-resistant hepg2 cells and type 2 diabetic mice mass spectrometry of peptides and proteins plant insulin or glucokinin: a conflicting issue soymorphin-5, a soy-derived μ-opioid peptide, decreases glucose and triglyceride levels through activating adiponectin and pparα systems in diabetic kkay mice alpha-glucosidase inhibitory activity of protein-rich extracts from extruded adzuki bean in diabetic kk-ay mice slow acting protein extract from fruit pulp of momordica charantia with insulin secretagogue and insulinomimetic activities purification and characterisation of a hypoglycemic peptide from momordica charantia l. var. abbreviata ser peptides derived from oats improve insulin sensitivity the authors want to thank niper ahmedabad, under the aegis of the ministry of chemicals and fertilizers, department of pharmaceuticals for providing facilities and research fellowships. key: cord-243806-26n22jbx authors: vandelli, andrea; monti, michele; milanetti, edoardo; ponti, riccardo delli; tartaglia, gian gaetano title: structural analysis of sars-cov-2 and prediction of the human interactome date: 2020-03-30 journal: nan doi: nan sha: doc_id: 243806 cord_uid: 26n22jbx specific elements of viral genomes regulate interactions within host cells. here, we calculated the secondary structure content of>2500 coronaviruses and computed>100000 human protein interactions with severe acute respiratory syndrome coronavirus 2 (sars-cov-2). we found that the 3 and 5 prime ends are the most structured elements in the viral genome and the 5 prime end has the strongest propensity to associate with human proteins. 
the domain encompassing nucleotides 23000-24000 is highly conserved both at the sequence and structural level, while the region upstream varies significantly. these two sequences code for a domain of the viral protein spike s that interacts with the human receptor angiotensin-converting enzyme 2 (ace2) and has the potential to bind sialic acids. our predictions indicate that the first 1000 nucleotides in the 5 prime end can interact with proteins involved in viral rna processing such as double-stranded rna specific editases and atp-dependent rna-helicases, in addition to other high-confidence candidate partners. these interactions, previously reported to be also implicated in hiv, reveal important information on host-virus interactions. the list of transcriptional and post-transcriptional elements recruited by sars-cov-2 genome provides clues on the biological pathways associated with gene expression changes in human cells.
a disease named covid-19 by the world health organization and caused by the severe acute respiratory syndrome coronavirus 2 (sars-cov-2) has been recognized as responsible for the pneumonia outbreak that started in december 2019 in wuhan city, hubei, china 1 and spread in february to milan, lombardy, italy 2 , becoming pandemic. as of april 2020, the virus had infected more than 2,000,000 people in more than 200 countries. sars-cov-2 is a positive-sense single-stranded rna virus that shares similarities with other betacoronaviruses such as severe acute respiratory syndrome coronavirus (sars-cov) and middle east respiratory syndrome coronavirus (mers-cov) 3 . bats have been identified as the primary host for sars-cov and sars-cov-2 4,5 but the intermediate host linking sars-cov-2 to humans is still unknown, although a recent report indicates that pangolins could be involved 6 .
coronaviruses use species-specific proteins to mediate entry into the host cell, and the spike s protein activates the infection in human respiratory epithelial cells in sars-cov, mers-cov and sars-cov-2 7 . spike s is assembled as a trimer and contains around 1,300 amino acids within each unit 8, 9 . the receptor binding domain (rbd) of spike s, which contains around 300 amino acids, mediates the binding with angiotensin-converting enzyme 2 (ace2), attacking respiratory cells. another region upstream of the rbd, present in mers-cov but not in sars-cov, is involved in the adhesion to sialic acid and could play a key role in regulating viral infection 7, 10 . at present, very few molecular details are available on sars-cov-2 and its interactions with the human host, which are mediated by specific rna elements 11 . to study the rna structural content, we used cross 12 , which was previously developed to investigate large transcripts such as the human immunodeficiency virus hiv-1 13 . cross predicts the structural profile of rna molecules (single- and double-stranded state) at single-nucleotide resolution using sequence information only. here, we performed sequence and structural alignments among 62 sars-cov-2 strains and identified the conservation of specific elements in the spike s region, which provides clues on the evolution of domains involved in the binding to ace2 and sialic acid. as highly structured regions of rna molecules have strong propensity to form stable contacts with proteins 14 and promote assembly of specific complexes 15, 16 , sars-cov-2 domains enriched in double-stranded content are expected to establish interactions within host cells that are important to replicate the virus 17 . to investigate the interactions of sars-cov-2 rna with human proteins, we employed catrapid 18, 19 .
catrapid 20 estimates the binding potential of a specific protein for an rna molecule through van der waals, hydrogen bonding and secondary structure propensities, allowing identification of interaction partners with high confidence 21 . the unbiased analysis of more than 100000 protein interactions with sars-cov-2 rna reveals that the 5' of sars-cov-2 has strong propensity to bind to human proteins involved in viral infection, especially those reported to be associated with hiv infection. a comparison between sars-cov and hiv indeed reveals similarities 22 , but the relationship between sars-cov-2 and hiv is still unexplored. interestingly, hiv and sars-cov-2, but neither sars-cov nor mers-cov, have a furin-cleavage site in the spike s protein, which could explain the spread velocity of sars-cov-2 compared to sars-cov and mers-cov 23, 24 . yet, many processes related to sars-cov-2 replication are unknown, and our study aims to suggest relevant protein interactions for further investigation. we hope that our large-scale calculations of structural properties and binding partners of sars-cov-2 will be useful to identify the mechanisms of virus replication within the human host.
structured elements within rna molecules attract proteins 14 and reveal regions important for interactions with the host 25 . indeed, each gene expressed from sars-cov-2 is preceded by conserved transcription-regulating sequences that act as signals for the transcription complex during the synthesis of the rna minus strand to promote a strand transfer to the leader region to resume the synthesis. this process is named discontinuous extension of the minus strand and is a variant of similarity-assisted template switching that operates during viral rna recombination 17 . to analyze sars-cov-2 structure (reference wuhan strain mn908947.3), we employed cross 12 to predict the double- and single-stranded content of rna genomes such as hiv-1 13 .
we found the highest density of double-stranded regions in the 5' (nucleotides 1-253), membrane m protein (nucleotides 26523-27191), spike s protein (nucleotides 23000-24000), and nucleocapsid n protein (nucleotides 28274-29533; fig. 1) 26 . the lowest density of double-stranded regions was observed at nucleotides 6000-6250 and 20000-21500, corresponding to the regions between the nonstructural proteins nsp14 and nsp15 and the upstream region of the spike surface protein s (fig. 1) 26 . in addition to the maximum corresponding to nucleotides 23000-24000, the structural content of spike s protein shows minima at around nucleotides 20500 and 24500 (fig. 1). we used the vienna method 27 to further investigate the rna secondary structure of specific regions identified with cross 13 . employing a 100 nucleotide window centered around cross maxima and minima, we found a good match between cross scores and vienna free energies (fig. 1). strong agreement is also observed between cross and vienna positional entropy, indicating that regions with the highest structural content also have the lowest structural diversity. our analysis suggests the presence of structural elements in sars-cov-2 that have evolved to interact with specific human proteins 11 . our observation is based on the assumption that structured regions have an intrinsic propensity to recruit proteins 14 , which is supported by the fact that structured transcripts act as scaffolds for protein assembly 15, 16 . we employed crossalign 13 to study the structural conservation of sars-cov-2 in different strains (materials and methods). in our analysis, we compared the wuhan strain mn908947.3 with around 2800 other coronaviruses (data from ncbi) with human (fig. 2) or other hosts (supp. fig. 1). when comparing sars-cov-2 with human coronaviruses (1387 strains, including sars-cov and mers-cov), we found that the most conserved region falls inside the spike s genomic locus (fig. 2).
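the window-based comparison described above (100-nucleotide windows centred on maxima and minima of the structural profile) can be sketched in a few lines. this is only an illustration under stated assumptions, not the authors' pipeline: the profile is a toy list of numbers rather than real cross output, the function names `profile_extrema` and `centered_window` are hypothetical, and the folding step that would be run on each window with the vienna method is deliberately left out.

```python
def profile_extrema(profile):
    """return the 0-based indices of the global maximum and minimum of a profile."""
    i_max = max(range(len(profile)), key=profile.__getitem__)
    i_min = min(range(len(profile)), key=profile.__getitem__)
    return i_max, i_min


def centered_window(sequence, center, width=100):
    """return (start, end, subsequence) for a window of `width` nt around `center`.

    coordinates are 0-based and half-open. near the ends of the sequence the
    window is shifted inwards so that it keeps its full width whenever the
    sequence is long enough.
    """
    half = width // 2
    start = max(0, center - half)
    end = min(len(sequence), start + width)
    start = max(0, end - width)  # shift left if the window was clipped at the 3' end
    return start, end, sequence[start:end]


if __name__ == "__main__":
    toy_sequence = "ACGU" * 100           # 400 nt placeholder, not a viral sequence
    toy_profile = [0.0] * 400             # placeholder structural scores
    toy_profile[200], toy_profile[395] = 1.0, -1.0
    for position in profile_extrema(toy_profile):
        start, end, window = centered_window(toy_sequence, position)
        print(position, start, end, len(window))
```

each extracted window could then be passed to a secondary structure predictor, and the resulting free energy compared with the mean profile score over the same window.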
more precisely, the conserved region is between nucleotides 23000-24000 and exhibits an intricate and stable secondary structure. to better investigate the sequence conservation of sars-cov-2, we compared 62 strains isolated from different countries during the pandemic (including china, usa, japan, taiwan, india, brazil, sweden, and australia; data from ncbi and in vipr www.viprbrc.org; materials and methods). our analysis aims to determine the relationship between structural content and sequence conservation. using clustal w for multiple sequence alignments 28 , we observed general conservation of the coding regions (fig. 3a). the 5' and 3' show high variability due to experimental procedures of the sequencing and are discarded in this analysis 29 . one highly conserved region is between nucleotides 23000-24000 in the spike s genomic locus, while sequences up- and downstream are variable (red bars in fig. 3a). we then used crossalign 13 to compare the structural content (materials and methods). high variability of structure is observed for both the 5' and 3' and for nucleotides between 21000-22000 as well as 24000-25000, associated with the s region (red bars in fig. 3a). the rest of the regions are significantly conserved at a structural level (p-value < 0.0001; fisher's test). we then compared protein sequences coded by the spike s genomic locus (ncbi reference qhd43416) and found that both sequence (fig. 3a) and structure (fig. 2) of nucleotides 23000-24000 are highly conserved. the region corresponds to amino acids 330-500 that contact the host receptor angiotensin-converting enzyme 2 (ace2) 30 , promoting infection and provoking lung injury 24, 31 . by contrast, the region upstream of the ace2 binding site, located in correspondence to the minimum of the structural profile at around nucleotides 22500-23000 (fig. 1), is highly variable, as indicated by t-coffee multiple sequence alignments 32 (fig. 3a).
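to make the notion of regional sequence conservation concrete, the sketch below scores each column of an already aligned set of sequences and averages the scores over fixed-size bins. it is a simplified illustration, not the scoring used by clustal w or t-coffee: the conservation measure (fraction of sequences sharing the most common non-gap character) is an assumption chosen for clarity, the function names are hypothetical, and the four short sequences are toy data.

```python
from collections import Counter


def column_conservation(alignment):
    """fraction of sequences sharing the most common non-gap character, per column."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        counts = Counter(char for char in column if char != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / len(column))
    return scores


def binned_mean(scores, bin_size):
    """average the per-column scores over consecutive bins of `bin_size` columns."""
    bins = []
    for start in range(0, len(scores), bin_size):
        chunk = scores[start:start + bin_size]
        bins.append(sum(chunk) / len(chunk))
    return bins


if __name__ == "__main__":
    toy_alignment = ["ACGU", "ACGA", "AC-A", "AGGA"]  # toy data, not viral strains
    per_column = column_conservation(toy_alignment)
    print(per_column)
    print(binned_mean(per_column, 2))
```

with genome-scale alignments, a bin size of 1000 columns would give one conservation value per 1000-nucleotide region, which is the granularity at which the conserved and variable regions are discussed in the text.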
this part of the spike s region corresponds to amino acids 243-302 that in mers-cov bind to sialic acids, regulating infection through cell-cell membrane fusion (fig. 3b; see related manuscript by e. milanetti et al.) 10, 33, 34 . our analysis suggests that the structural region between nucleotides 23000 and 24000 of spike s region is conserved among coronaviruses (fig. 2) and that the binding site for ace2 has poor variation in human sars-cov-2 strains (fig. 3b). by contrast, the region upstream, which has propensity to bind sialic acids 10, 33, 34 , showed poor structural content and high variability (fig. 3b). in order to obtain insights on how the virus replicates in human cells, we predicted sars-cov-2 interactions with the whole rna-binding human proteome. following a protocol to study structural conservation in viruses 13 , we first divided the wuhan sequence into 30 fragments of 1000 nucleotides each, moving from the 5' to the 3', and then calculated the protein-rna interactions of each fragment with catrapid omics (3340 canonical and putative rna-binding proteins, or rbps, for a total of 102000 interactions) 18 . proteins such as polypyrimidine tract-binding protein 1 ptbp1 (uniprot p26599) showed the highest interaction propensity (or z-score; materials and methods) at the 5', while others such as heterogeneous nuclear ribonucleoprotein q hnrnpq (o60506) showed the highest interaction propensity at the 3', in agreement with previous studies on coronaviruses (fig. 4a) 35 . for each fragment, we predicted the most significant interactions by filtering according to the z score. we used three different thresholds in ascending order of stringency (z ≥ 1.50, 1.75 and 2, respectively) and we removed from the list the proteins that were predicted to interact promiscuously with more than one fragment. fragment 1 corresponds to the 5' and is the most contacted by rbps (around 120 with z ≥ 2 high-confidence interactions; fig.
4b), which is in agreement with the observation that highly structured regions attract a large number of proteins 14 . indeed, the 5' contains multiple stem loop structures that control rna replication and transcription 36, 37 . by contrast, the 3' and fragment 23 (spike s), which are still structured but to a lesser extent, attract fewer proteins (10 and 5, respectively), while fragment 20 (between orf1ab and spike s), which is predicted to be unstructured, does not have binding partners. the interactome of each fragment was then analysed using clevergo, a tool for gene ontology (go) enrichment analysis 38 . proteins interacting with fragments 1, 2 and 29 were associated with annotations related to viral processes (fig. 4c; supp. table 1). considering the three thresholds applied (materials and methods), we found 22 virus-related proteins for fragment 1, 2 proteins for fragment 2 and 11 proteins for fragment 29 (fig. 4d). among the high-confidence interactors of fragment 1, we discovered rbps involved in positive regulation of viral processes and viral genome replication, such as double-stranded rna-specific editase 1 adarb1 (uniprot p78563 39 ), 2'-5'-oligoadenylate synthase 2 oas2 (p29728) and 2-5a-dependent ribonuclease rnasel (q05823). interestingly, 2'-5'-oligoadenylate synthase 2 oas2 has been reported to be upregulated in human alveolar adenocarcinoma (a549) cells infected with sars-cov-2 (log fold change of 4.2; p-value of 10^-9 and q-value of 10^-6) 40 . while double-stranded rna-specific adenosine deaminase adar (p55265) is absent in our library due to its length, which does not meet catrapid omics requirements 18 , the omixcore extension of the algorithm specifically developed for large molecules 41 attributes the same binding propensity to both adarb1 and adar, thus indicating that the interactions might occur (materials and methods).
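the fragment-wise filtering described above (split the genome into 1000-nucleotide fragments, keep interactions above a z-score threshold, discard proteins that pass the threshold for more than one fragment) can be sketched as follows. this is a minimal illustration under stated assumptions: the z-scores are toy values with placeholder protein names rather than catrapid omics output, the function names are hypothetical, the genome length of 29903 nucleotides is taken for the reference strain mn908947.3, and applying the promiscuity filter separately at each threshold is an assumption, since the text does not specify it.

```python
from collections import Counter


def split_into_fragments(genome_length, fragment_size=1000):
    """return 1-based inclusive (start, end) coordinates of consecutive fragments."""
    return [
        (start + 1, min(start + fragment_size, genome_length))
        for start in range(0, genome_length, fragment_size)
    ]


def specific_interactors(z_scores, threshold):
    """keep proteins with z >= threshold that pass in exactly one fragment.

    `z_scores` maps fragment number -> {protein name: z-score}.
    """
    passing = {
        fragment: {name for name, z in proteins.items() if z >= threshold}
        for fragment, proteins in z_scores.items()
    }
    # count in how many fragments each protein passes the threshold
    hits = Counter(name for names in passing.values() for name in names)
    return {
        fragment: sorted(name for name in names if hits[name] == 1)
        for fragment, names in passing.items()
    }


if __name__ == "__main__":
    fragments = split_into_fragments(29903)
    print(len(fragments), fragments[0], fragments[-1])

    toy_z = {  # placeholder proteins and values, for illustration only
        1: {"prot_a": 2.3, "prot_b": 2.1, "prot_c": 1.6},
        2: {"prot_d": 1.8, "prot_c": 1.9},
        30: {"prot_e": 2.0, "prot_a": 0.4},
    }
    for threshold in (1.50, 1.75, 2.0):
        print(threshold, specific_interactors(toy_z, threshold))
```

in the toy data, `prot_c` passes the lowest threshold for two fragments and is therefore removed as promiscuous, while `prot_a` is retained for fragment 1 because its score for fragment 30 is below threshold.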
moreover, experimental works indicate that the family of adar deaminases is active in bronchoalveolar lavage fluids derived from sars-cov-2 patients 42 and is upregulated in a549 cells infected with sars-cov-2 (log fold change of 0.58; p-value of 10^-8 and q-value of 10^-5) 40 . we also identified proteins related to the establishment of integrated proviral latency, including x-ray repair cross-complementing protein 5 xrcc5 (p13010) and x-ray repair cross-complementing protein 6 xrcc6 (p12956; fig. 4b and 4e). in accordance with our calculations, comparison of a549 cell responses to sars-cov-2 and respiratory syncytial virus indicates upregulation of xrcc6 in sars-cov-2 (log fold-change of 0.92; p-value of 0.006 and q-value of 0.23) 40 . nucleolin ncl (p19338), a protein known to be involved in coronavirus processing, was also predicted to bind tightly to the 5' (supp. table 1) 43 . importantly, we found proteins related to defence response to viruses, such as atp-dependent rna helicase ddx1 (q92499), that are involved in negative regulation of viral genome replication. some dna-binding proteins such as cyclin-t1 ccnt1 (o60563), zinc finger protein 175 znf175 (q9y473) and prospero homeobox protein 1 prox1 (q92786) were included because they could have potential rna-binding ability (fig. 4e) 44 . as for fragment 2, we found two canonical rbps: more than simple scaffold elements, gag proteins are versatile elements that bind to viral and host proteins as they traffic to the cell membrane (supp. table 1) 45 . analysis of functional annotations carried out with genemania 46 revealed that proteins interacting with the 5' of sars-cov-2 rna are associated with regulatory pathways involving notch2, myc and max that have been previously connected to viral infection processes (fig. 4e) 47, 48 . 
interestingly, some proteins, including ddx1, ccnt1 and znf175 for fragment 1 and trim32 for fragment 2, have been shown to be necessary for hiv functions and replication inside the cell. recently, gordon et al. reported a list of human proteins binding to open reading frames (orfs) translated from sars-cov-2 58 . identified through affinity purification followed by mass spectrometry quantification, 332 proteins from hek-293t cells interact with viral orf peptides. by selecting 274 proteins binding at the 5' with z score ≥ 1.5 (supp. table 1), of which 140 are exclusively interacting with fragment 1 (fig. 4b), we found that 8 are also reported in the list by gordon et al. 58 , which indicates significant enrichment (representation factor of 2.5; p-value of 0.02; hypergeometric test with human proteome in background). the fact that our list of protein-rna binding partners contains elements identified also in the protein-protein network analysis is not surprising, as ribonucleoprotein complexes evolve together 14 and their components sustain each other through different types of interactions 16 . we note that out of 332 interactions, 60 are rbps (as reported in uniprot 39 ), which represents a considerable fraction (i.e., 20%), considering that there are around 1500 rbps in the human proteome (i.e., 6%); this is fully justified by the fact that they involve association with viral rna. comparing the rbps present in gordon et al. 58 table 2) . interestingly, srp72, larp7 and larp4b proteins assemble in stress granules 61-63 that are targeted by rna viruses 53 . we speculate that sequestration of these elements is orchestrated by a viral program aiming to recruit host genes 49 protein kinase a radixin rdx (p35241; in addition to those mentioned above; supp. table 2). in the list of 274 proteins binding to the 5' (fragment 1) with z score ≥ 1.5, we found 10 hits associated with hiv (supp. 
table 3 (q06787) and rna polymerase-associated protein rtf1 homologue (q92541; supp. table 3). by contrast, no significant enrichments were found for other viruses such as, for instance, ebola. 65 , among other targets. in addition, hbv-related targets are nuclear receptor subfamily 5 group a member 2 nr5a2 (chembl3544), interferon-induced, double-stranded rna-activated protein kinase eif2ak2 (chembl5785) and srsf protein kinase 1 srpk1 (chembl4375). we hope that this list can be the starting point for further pharmaceutical studies. a number of proteins identified in our catrapid calculations have been previously reported to coalesce in large ribonucleoprotein assemblies such as stress granules. among these proteins, we found double-stranded rna-activated protein kinase eif2ak2 (p19525), nucleolin ncl (p19338), atp-dependent rna helicase ddx1 (q92499), cyclin-t1 ccnt1 (o60563), signal recognition particle subunit srp72 (o76094), larp7 (q4g0j3) and la-related protein 4b heterogeneous nuclear ribonucleoprotein q hnrnpq (o60506) 63 . to further investigate the propensity of these proteins to phase separate in stress granules, we used the catgranule algorithm (materials and methods) 66 . we found that the 274 proteins binding to the 5' (fragment 1) with z score ≥ 1.5 are highly prone to accumulate in stress granules (the 274 proteins with the lowest z score are used in the comparison; p-value < 0.0001; kolmogorov-smirnov test; fig. 4g; supp. table 4). this finding is particularly relevant because rna viruses are known to antagonize stress granule formation 53 . indeed, the role of stress granules and processing bodies in translation suppression and rna decay has an impact on virus replication 67 . spreading. using advanced computational approaches, we investigated the structural content of sars-cov-2 rna and predicted human proteins that bind to it. 
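the overlap between the predicted 5' binders and the interactions reported by gordon et al. was summarized in the results with a representation factor and a hypergeometric p-value. the sketch below shows how these two quantities are usually computed. it is an illustration only: the function name and the toy numbers are assumptions, and because the background size used by the authors is not stated in the text, the code does not attempt to reproduce the reported values (2.5 and 0.02).

```python
from math import comb


def hypergeom_enrichment(population, successes, draws, overlap):
    """Representation factor and upper-tail hypergeometric p-value.

    population: size of the background (e.g. number of proteins considered)
    successes:  background items belonging to the reference list
    draws:      size of the selected list
    overlap:    items of the selected list that are also in the reference list
    """
    # Overlap expected by chance if the two lists were independent.
    expected = draws * successes / population
    representation_factor = overlap / expected

    # P(X >= overlap) for X ~ Hypergeometric(population, successes, draws).
    upper = min(draws, successes)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(overlap, upper + 1)
    )
    p_value = favourable / comb(population, draws)
    return representation_factor, p_value


# Toy numbers (not taken from the study): 20 proteins in the background,
# 5 in the reference list, 4 selected, 2 of them shared.
rf, p = hypergeom_enrichment(population=20, successes=5, draws=4, overlap=2)
```

a representation factor above 1 means that the overlap is larger than expected by chance; the p-value quantifies how unlikely an overlap at least this large would be under random sampling from the chosen background.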
we employed cross 13, 68 to compare the structural properties of 2800 coronaviruses and identified elements conserved in sars-cov-2 strains. the regions containing the highest amount of structure are the 5' as well as glycoproteins spike s and membrane m. we found that the spike s protein domain encompassing amino acids 330-500 is highly conserved across sars-cov-2 strains. this result suggests that spike s must have evolved to specifically interact with its host partner ace2 30 and mutations increasing the binding affinity are highly infrequent. as the nucleic acids encoding for this region are enriched in double-stranded content, we speculate that the structure might attract host regulatory elements, thus further constraining the variability. the fact that the spike s region is highly conserved among all the analysed sars-cov-2 strains suggests that a specific drug can be designed against it to prevent interactions within the host. by contrast, the highly variable region at amino acids 243-302 in spike s protein corresponds to the binding site of sialic acids in mers-cov (see manuscript by e. milanetti et al.) 7, 10, 34 and could play a role in infection 33 . the fact that the binding region changes in the different strains might indicate a variety of binding affinities for sialic acids, which could provide clues on the specific responses in the human population. interestingly, the sialic acid binding site is absent in sars-cov but present in mers-cov, which represents an important difference between the diseases. both our sequence and structural analyses of spike s protein indicate high conservation among coronaviruses and suggest that human engineering of sars-cov-2 is highly unlikely. using catrapid 18, 19 we computed >100000 protein interactions with sars-cov-2 and found previously reported interactions such as polypyrimidine tract-binding protein 1 ptbp1, heterogeneous nuclear ribonucleoprotein q hnrnpq and nucleolin ncl 43 . 
in addition, we discovered that the highly structured region at the 5' has the largest number of protein partners including atp-dependent rna helicase ddx1 that was previously reported to be essential for hiv-1 and coronavirus ibv replication 54,55 , and the double-stranded rna-specific editases adar and adarb1 that catalyse the hydrolytic deamination of adenosine to inosine 49 . other predicted interactions are xrcc5 and xrcc6, members of the hdp-rnp complex associating with atp-dependent rna helicase dhx9 69 , as well as 2-5a-dependent ribonuclease rnasel and 2'-5'-oligoadenylate synthase 2 oas2 that control viral rna degradation 70, 71 . interestingly, ddx1, xrcc6 and oas2 are upregulated in human alveolar adenocarcinoma cells infected with sars-cov-2 40 . in agreement with our predictions, recent experimental work indicates that the family of adar deaminases is active in bronchoalveolar lavage fluids derived from sars-cov-2 patients 42 . a significant overlap exists with the list of protein interactions reported by gordon et al. 58 , and among the candidate partners we identified akap8l, a dead/h-box rna helicase binding protein involved in hiv infection 59 . in general, proteins associated with retroviral replication are expected to play different roles in sars-cov-2. as sars-cov-2 massively represses host gene expression 49 , we hypothesize that the virus hijacks host pathways by recruiting transcriptional and post-transcriptional elements interacting with polymerase ii genes and splicing factors such as, for instance, a-kinase anchor protein 8-like akap8l and la-related protein 7 larp7, which is upregulated in human alveolar adenocarcinoma cells infected with sars-cov-2 40 . the link to proteins previously studied in the context of hiv and other viruses, if further confirmed, is particularly relevant for the repurposing of existing drugs 65 . 
the idea that sars-cov-2 sequesters different elements of the transcriptional machinery is particularly intriguing and is supported by the fact that a large number of proteins identified in our screening are found in stress granules 63 . indeed, stress granules protect the host innate immunity and are hijacked by viruses to favour their own replication 67 . moreover, as coronavirus transcription uses discontinuous rna synthesis that involves high-frequency recombination 43 , it is possible that pieces of the viruses resulting from a mechanism called defective interfering rnas 72 could act as scaffold to attract host proteins 14, 15 . we predicted the secondary structure of transcripts using cross (computational recognition of secondary structure) 13, 68 . cross was developed to perform high-throughput rna profiling. the algorithm predicts the structural profile (single- and double-stranded state) at single-nucleotide resolution using sequence information only and without sequence length restrictions (scores > 0 indicate double-stranded regions). we used the vienna method 27 to further investigate the rna secondary structure of minima and maxima identified with cross 13 . we used crossalign 13, 68 , an algorithm based on dynamic time warping (dtw), to check and evaluate the structural conservation between different viral genomes 13 . crossalign was previously employed to study the structural conservation of ~5000 hiv genomes. sars-cov-2 fragments (1000 nt, not overlapping) were searched inside other complete genomes using the obe (open begin and end) module, in order to search a small profile inside a larger one. the lower the structural distance, the higher the structural similarity (with a minimum of 0 for almost identical secondary structure profiles). the significance is assessed as in the original publication 13 . 
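to make the open begin and end search more concrete, the following python sketch implements a plain dynamic time warping variant in which a short profile may start and end anywhere inside a longer one. it is a textbook formulation written for illustration, not the crossalign implementation: the function name, the absolute-difference local cost and the absence of any distance normalization or significance estimate are assumptions of this sketch.

```python
def open_begin_end_dtw(query, reference):
    """Align a short numeric profile inside a longer one with DTW.

    The alignment may begin and end at any position of `reference`
    (open begin and end). Returns (distance, end_position), where
    end_position is 1-based in `reference`.
    """
    if not query or not reference:
        raise ValueError("both profiles must be non-empty")

    n, m = len(query), len(reference)
    inf = float("inf")

    # Row 0 is all zeros: starting anywhere in the reference costs nothing.
    previous = [0.0] * (m + 1)
    for i in range(1, n + 1):
        current = [inf] * (m + 1)  # column 0 stays infinite: the query must be consumed
        for j in range(1, m + 1):
            cost = abs(query[i - 1] - reference[j - 1])
            current[j] = cost + min(previous[j],      # stay on the same reference position
                                    current[j - 1],   # advance in the reference only
                                    previous[j - 1])  # advance in both profiles
        previous = current

    # Open end: take the cheapest cell of the last row.
    best_end = min(range(1, m + 1), key=lambda j: previous[j])
    return previous[best_end], best_end


# A profile that occurs exactly inside a longer one gives distance 0.
distance, end = open_begin_end_dtw([1.0, 2.0, 3.0], [0.0, 0.0, 1.0, 2.0, 3.0, 0.0])
```

in this formulation a distance of 0 corresponds to a fragment whose profile is reproduced exactly somewhere in the larger genome, in line with the statement that lower structural distances indicate higher structural similarity.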
the fasta sequences of the complete genomes of sars-cov-2 were downloaded from virus pathogen resource (vipr; www.viprbrc.org), for a total of 62 strains. regarding the overall coronaviruses, the sequences were downloaded from ncbi selecting only complete genomes, for a total of 2862 genomes. the reference wuhan sequence with available annotation (epi_isl_402119) was downloaded from the global initiative on sharing all influenza data (gisaid; https://www.gisaid.org/). interactions between each fragment of target sequence and the human proteome were predicted using catrapid omics 18, 19 , an algorithm that estimates the binding propensity of protein-rna pairs by combining secondary structure, hydrogen bonding and van der waals contributions. we used clustal w 28 for the alignments of the 62 sars-cov-2 strains and t-coffee 32 for the alignments of spike s proteins. the variability in the spike s region was measured by computing shannon entropy on translated rna sequences. the shannon entropy at position i is computed as follows: $s(i) = -\sum_{a} p(a,i) \log p(a,i)$, where a is an amino acid and p(a,i) is the frequency of amino acid a at position i of the alignment. low entropy indicates poor variability: if p(a,i) = 1 for one a and 0 for the rest, then s(i) = 0. by contrast, if the frequencies of all amino acids are equally distributed, the entropy reaches its maximum possible value. catgranule 66 was employed to identify proteins assembling into biological condensates. scores > 0 indicate that a protein is prone to phase separate. structural disorder, nucleic acid binding propensity and amino acid patterns such as arginine-glycine and phenylalanine-glycine are key features combined in this computational approach 66 . fig. 1 a novel coronavirus from patients with pneumonia in china coronaviruses and immunosuppressed patients. the facts during the third epidemic evaluation and treatment coronavirus (covid-19). 
in statpearls isolation and characterization of a bat sars-like coronavirus that uses the ace2 receptor furin cleavage of the sars coronavirus spike glycoprotein enhances cell-cell fusion but does not affect virion entry isolation and characterization of 2019-ncov-like coronavirus from malayan pangolins structures of mers-cov spike glycoprotein in complex with sialoside attachment receptors cryo-electron microscopy structure of a coronavirus spike glycoprotein trimer characterization of spike glycoprotein of sars-cov-2 on virus entry and its immune cross-reactivity with sars-cov identification of sialic acid-binding function for the middle east respiratory syndrome coronavirus spike glycoprotein the structure and functions of coronavirus genomic 3' and 5' ends a high-throughput approach to profile rna structure a method for rna structure prediction shows evidence for structure in lncrnas rna structure drives interaction with proteins an integrative study of protein-rna condensates identifies scaffolding rnas and reveals players in fragile x-associated tremor/ataxia syndrome phase separation drives x-chromosome inactivation: a hypothesis identification of a coronavirus transcription enhancer catrapid omics: a web server for large-scale prediction of protein-rna interactions quantitative predictions of protein interactions with long noncoding rnas predicting protein associations with long noncoding rnas rnact: protein-rna interaction predictions for model organisms with supporting experimental data cloaked similarity between hiv-1 and sars-cov suggests an anti-sars strategy inhibition of furin-mediated cleavage activation of hiv-1 glycoprotein gp160 differential downregulation of ace2 by the spike proteins of severe acute respiratory syndrome coronavirus and human coronavirus nl63 conserved structural rna domains in regions coding for cleavage site motifs in hemagglutinin genes of influenza viruses genome composition and divergence of the novel coronavirus 
(2019-ncov) originating in china viennarna package 2.0 the embl-ebi search and sequence analysis tools apis in 2019 rna-seq methods for transcriptome analysis the proximal origin of sars-cov-2 a pneumonia outbreak associated with a new coronavirus of probable bat origin t-coffee: a web server for the multiple sequence alignment of protein and rna sequences using structural information and homology extension distinct roles for sialoside and protein receptors in coronavirus infection silico evidence for two receptors based strategy of sars-cov-2 host cell proteins interacting with the 3' end of tgev coronavirus genome influence virus replication structural determinants and mechanism of hiv-1 genome packaging an overview of their replication and pathogenesis protein aggregation, structural disorder and rna-binding ability: a new approach for physico-chemical and gene ontology classification of multiple datasets uniprot: a worldwide hub of protein knowledge sars-cov-2 launches a unique transcriptional signature from in vitro, ex vivo, and in vivo systems omixcore: a web server for prediction of protein interactions with large rna evidence for host-dependent rna editing in the transcriptome of sars-cov-2 rna-rna and rnaprotein interactions in coronavirus replication and transcription insights into rna biology from an atlas of mammalian mrna-binding proteins hiv gag polyprotein: processing and early viral particle assembly the genemania prediction server: biological network integration for gene prioritization and predicting gene function viral interactions with the notch pathway what retroviruses teach us about the involvement of c-myc in leukemias and lymphomas the architecture of sars-cov-2 transcriptome. 
biorxiv (now cell in press) drllps: a data resource of liquid-liquid phase separation in eukaryotes liquid nuclear condensates mechanically sense and restructure the genome phase separation of signaling molecules promotes t cell receptor signal transduction a dead box protein facilitates hiv-1 replication as a cellular co-factor of rev the cellular rna helicase ddx1 interacts with coronavirus nonstructural protein 14 and enhances viral replication cyclin t1 domains involved in complex formation with tat and tar rna are critical for tat-activation role of the human and murine cyclin t proteins in regulating hiv-1 tat-activation a sars-cov-2-human protein-protein interaction map reveals drug targets and potential drug-repurposing the role of a-kinase anchoring protein 95-like protein in annealing of trnalys3 to hiv-1 rna the la-related protein larp7 is a component of the 7sk ribonucleoprotein and affects transcription of cellular and viral polymerase ii genes a stimulatory role for the la-related protein 4b in translation larp4b is an au-rich sequence associated factor that promotes mrna accumulation and translation context-dependent and disease-specific diversity in protein interactions within stress granules systematic analysis of the protein interaction network for the human transcription machinery reveals the identity of the 7sk capping enzyme chembl: towards direct deposition of bioassay data a concentration-dependent liquid phase separation can cause toxicity upon increased protein expression viral regulation of rna granules in infected cells a high-throughput approach to profile rna structure dna-dependent protein kinase (dna-pk) phosphorylates nuclear dna helicase ii/rna helicase a and hnrnp proteins in an rnadependent manner the nature of the catalytic domain of 2'-5'-oligoadenylate synthetases enzymatic characteristics of recombinant medium isozyme of 2'-5' oligoadenylate synthetase defective interfering rnas: foes of viruses and friends of virologists 
structure and interactions of sars-cov-2 population; p_bonf = p-value of the enrichment. to correct for multiple testing bias, use bonferroni correction) 38 ; (d) viral processes are the third largest cluster identified in our analysis; (e) protein interactions with the 5' of sars-cov-2 rna (inner circle) and associations with other human genes retrieved from literature (green: genetic associations number of rbp interactions identified by gordon et al. 58 for different sars-cov-2 regions (see panel a for reference). (g) proteins binding to the 5' with z score ≥ 1.5 show high propensity to accumulate in stress-granules. the authors would like to thank dr. mattia miotto, dr. lorenzo di rienzo, dr. alexandros armaos, dr. alessandro dasti and dr. claudia giambartolomei for discussions. we are particularly grateful to prof. annalisa pastore for critical reading, dr. gilles mirambeau for the rt vs rdrp analysis, dr. andrea cerase for discussions on stress granules and dr. roberto giambruno for pointing to ptbp1 and hnrnpq experiments. the research leading to these results has been supported by the european research council (ribomylome_309545 and astra_855923), the h2020 projects iasis_727658 and key: cord-103255-4k13re9y authors: daniell, henry; streatfield, stephen j; wycoff, keith title: medical molecular farming: production of antibodies, biopharmaceuticals and edible vaccines in plants date: 2001-05-01 journal: trends in plant science doi: 10.1016/s1360-1385(01)01922-7 sha: doc_id: 103255 cord_uid: 4k13re9y abstract the use of plants for medicinal purposes dates back thousands of years but genetic engineering of plants to produce desired biopharmaceuticals is much more recent. as the demand for biopharmaceuticals is expected to increase, it would be wise to ensure that they will be available in significantly larger amounts, on a cost-effective basis. currently, the cost of biopharmaceuticals limits their availability. 
plant-derived biopharmaceuticals are cheap to produce and store, easy to scale up for mass production, and safer than those derived from animals. here, we discuss recent developments in this field and possible environmental concerns. research in the past few decades has revolutionized the use of therapeutically valuable proteins in a variety of clinical treatments. because most genes can be expressed in many different systems, it is essential to determine which system offers the most advantages for the production of the recombinant protein. the ideal expression system would be the one that produces the most safe, biologically active material at the lowest cost. the use of modified mammalian cells with recombinant dna techniques has the advantage of resulting in products that are identical to those of natural origin; however, culturing these cells is expensive and can only be carried out on a limited scale. the use of microorganisms such as bacteria permits manufacture on a larger scale, but introduces the disadvantage of producing products that differ appreciably from the products of natural origin. for example, proteins that are usually glycosylated in humans are not glycosylated by bacteria. furthermore, human proteins that are expressed at high levels in e. coli frequently acquire an unnatural conformation accompanied by intracellular precipitation, owing to lack of proper folding and disulfide bridges. the production of recombinant proteins in plants has many potential advantages for generating biopharmaceuticals relevant to clinical medicine. first, plant systems are more economical than industrial facilities using fermentation or bioreactor systems. second, the technology is already available for harvesting and processing plants and plant products on a large scale. third, the purification requirement can be eliminated when the plant tissue containing the recombinant protein is used as a food (edible vaccines). 
fourth, plants can be directed to target proteins into intracellular compartments in which they are more stable, or even to express them directly in certain compartments (chloroplasts). fifth, the amount of recombinant product that can be produced approaches industrial-scale levels. last, health risks arising from contamination with potential human pathogens or toxins are minimized. in the decade since the expression and assembly of immunoglobulin (ig) heavy and light chains into functional antibodies was first shown in transgenic tobacco, plants have proven to be versatile production systems for many forms of antibodies. these include full-sized igg and iga, chimeric igg and iga, secretory igg and iga, single-chain fv fragments (scfv), fab fragments and heavy-chain variable domains. recently, this list has been extended to include bispecific antibodies, which are made by the genetic fusion of two different scfvs via a flexible peptide linker 1 . plants have great potential as a virtually unlimited source of inexpensive monoclonal antibodies (dubbed 'plantibodies') for human and animal therapeutics (table 1). there is not yet a consensus as to the best plant species or tissue for commercial antibody production. most antibodies expressed to date have been in tobacco, although recently potatoes, soybean, alfalfa, rice and wheat have also been used successfully 2-6 . the major advantage of using green tissue (tobacco, alfalfa, soybean) is sheer productivity. both alfalfa and tobacco can support several crops (cuttings) per year, with potential annual biomass yields of 25 tonne ha⁻¹ and >100 tonne ha⁻¹, respectively. by contrast, the maximum yields of wheat, rice and corn seed are ~3 tonne ha⁻¹, 6 tonne ha⁻¹ and 12 tonne ha⁻¹, respectively. other advantages of tobacco include its relative ease of genetic manipulation, production of large numbers of seeds (up to a million per plant) and an impending need to explore alternate uses for this hazardous crop. 
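the productivity argument can be made concrete with a short worked calculation in python that converts a biomass yield and an expression level into a gross antibody yield per hectare. this is a back-of-the-envelope sketch, not a figure from the article: the function name is hypothetical, the expression level of 500 µg per g of tissue is the best level cited in this article for a secretory iga in leaf and is applied to every crop purely for comparison, biomass is treated as the weight on which expression is measured, and purification losses are ignored.

```python
GRAMS_PER_TONNE = 1_000_000
MICROGRAMS_PER_GRAM = 1_000_000


def gross_antibody_yield_g_per_ha(biomass_tonne_per_ha, expression_ug_per_g):
    """Grams of antibody per hectare before any purification loss."""
    # tonne/ha -> g/ha of tissue, times ug of antibody per g of tissue
    micrograms_per_ha = biomass_tonne_per_ha * GRAMS_PER_TONNE * expression_ug_per_g
    return micrograms_per_ha / MICROGRAMS_PER_GRAM


# Biomass yields quoted in the text (tobacco is given as ">100", so 100 is a lower bound);
# the common expression level is an assumption made only to compare crops.
ASSUMED_EXPRESSION_UG_PER_G = 500
yields_tonne_per_ha = {"tobacco": 100, "alfalfa": 25, "corn seed": 12, "rice seed": 6, "wheat seed": 3}
gross_yields = {
    crop: gross_antibody_yield_g_per_ha(tonnes, ASSUMED_EXPRESSION_UG_PER_G)
    for crop, tonnes in yields_tonne_per_ha.items()
}
```

because the two unit conversions cancel, the result in grams per hectare is simply the biomass in tonnes multiplied by the expression level in µg per g, so under these assumptions the crops rank exactly as their biomass yields do.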
however, seeds are likely to have fewer phenolic compounds and a less complex mixture of proteins and lipids than green leaves, which might be an advantage in purification. another advantage of seeds or tubers is their ability to be stored for long periods. levels of scfv in rice seeds did not show a significant decline after storage at room temperature for six months 5 . potato tubers in cold storage for 18 months lost only 50% of functional antibody 2 . for short periods of time (five to seven days), dried tobacco and alfalfa leaves can also be stored with little loss of scfv (ref. 7) or igg antibody 4 . purification of antibody from stored plant material has the advantages that the processing facility need not be near the field and can be used continually all year, rather than for just a few large batches. to date, only four antibodies have been made in plants that are potentially useful as human therapeutics. only one of these has been tested in humans: a chimeric secretory igg-iga antibody against a surface antigen of streptococcus mutans, the primary causal agent of tooth decay. this tobacco-produced antibody was applied topically to teeth and found to be as effective as an igg produced in a murine hybridoma at preventing recolonization by s. mutans 8 . the second antibody, a humanized anti-herpes-simplex virus (hsv) antibody made in soybean, was effective in the prevention of vaginal hsv-2 transmission in a mouse model 3 . its activity was indistinguishable both in vitro and in vivo from the monoclonal antibody produced in cell culture. a third antibody, against carcinoembryonic antigen (cea), has recently been expressed in rice and wheat 5 . cea, a cell-surface glycoprotein, is one of the best-characterized tumor-associated antigens. antibodies against cea are used for in vivo tumor imaging, as well as in antibody-based cancer therapy. levels of scfv in seeds did not show a significant decline after storage at room temperature for six months. 
this same antibody has been expressed in a rice cell culture 6 . the fourth antibody is an example of both a novel use of plant-produced antibodies and an alternative production system. a plant virus vector has been used to produce a tumor-specific vaccine transiently in tobacco for the treatment of lymphoma 9 . the antibody genes for expression of an scfv were derived from a mouse b-cell lymphoma. the plant-produced scfv was used to immunize mice, which generated anti-idiotypic antibodies (antibodies against the binding portion of the antibody). these mice were protected against infection by the lymphoma that produced the original antibody. other groups have used modified plant viral vectors to produce therapeutically useful antibodies in plants, including an antibody against the colorectal-cancer-associated antigen ga733-2 (ref. 10). although these vectors might find limited usefulness if the rapid production of an antibody is necessary (perhaps in greenhouse production), their acceptability to regulatory agencies (e.g. the us food and drug administration, dept of agriculture and environmental protection agency) has not been tested. there are no plantibodies yet in commercial production, therefore estimates of cost are difficult to find and involve many assumptions. the costs of producing an igg from alfalfa grown in a 250 m² greenhouse are estimated to be us$500-600 g⁻¹, compared with us$5000 g⁻¹ for the hybridoma-produced antibody 4 . planet biotechnology (mountain view, ca, usa) has compared the cost per gram of purified iga made by cell culture, transgenic goats, grain (7.5 tonne ha⁻¹) and green biomass (120.0 tonne ha⁻¹) (fig. 1). expression levels will have a significant impact on the costs but, at the best expression level reported [500 µg g⁻¹ leaf for a secretory iga (ref. 11)], the final cost should be well below us$50 g⁻¹. 
this significantly undercuts the costs of cell culture (us$1000 g⁻¹) or transgenic animal production systems (us$100 g⁻¹). the biggest component of cost with plantibodies will be purification. however, expression in seeds of rice and wheat 5 opens up the possibility of oral administration of some therapeutic antibodies without the need for expensive purification. some of the properties of igs depend on their glycosylation (e.g. binding to monocyte fc receptors). there is one conserved n-glycosylation site in the ch2 domain of igg. the structures of n-linked glycans on plant- and murine-produced guy's 13 (an igg1) have been determined and compared 12 . the plantibody n-glycans were more structurally diverse, with 40% being of the high-mannose type. the other 60% of the plantibody oligosaccharides had β-(1,2)-xylose and α-(1,3)-fucose linked to the man3glcnac2 core. these linkages are typical of plants but are not found in mammalian n-glycans. the plantibody also lacked sialic acid, which represented ~10% of the sugar content of the mouse monoclonal antibody. these differences in glycan structure appear to have no effect on antigen binding or affinity in vitro 3,4,11,13 and might not be significant in vivo either. an igg produced in alfalfa had a serum half-life in balb/c mice that was indistinguishable from that of the hybridoma-produced antibody 4 . however, there is some concern about the potential immunogenicity and allergenicity of plantibodies used as human therapeutics. for mucosal applications, this is not likely to present problems for most people because plant glycoproteins are ubiquitous in the human diet. there has been no evidence of allergic reaction or of a human anti-mouse antibody (hama) response in 60 patients receiving topical oral application of a secretory iga specific to s. mutans 8 . proteins of microbial and viral pathogens were some of the earliest examples chosen to show the feasibility of transgenic plant expression systems 14-17 . 
the rationale was that key immunogenic proteins of major pathogens could be synthesized in plant tissues and then fed as edible subunit vaccines to humans or commercially important animals. [fig. 1 caption, partial: costs for plants compare green biomass (120.0 tonne ha⁻¹) and seed production (7.5 tonne ha⁻¹). cost differences are based primarily on production costs, and it was assumed that purification costs and losses during purification will be the same for all systems.] the proof of this concept has since been shown using several bacterial and viral proteins (table 2). the practical aspects of choosing particular foodstuffs in which to deliver defined doses of a vaccine are being explored, and efforts are under way to establish clear regulatory paths for the development of edible vaccines. oral delivery of vaccines is an attractive alternative to injection, largely for reasons of low cost and easy administration. the chances of acquiring mucosal immunity against infectious agents that enter the body across a mucosal surface are also increased with oral vaccines. however, a major concern with oral vaccines is the degradation of protein components in the stomach and gut before they can elicit an immune response. to guard against degradation, several delivery vehicles have been developed to ferry intact proteins to the gut. these include recombinant strains of attenuated microorganisms, bioencapsulation vehicles such as liposomes and transgenic plant tissues. early work with plant-based subunit vaccines used the readily transformed species tobacco, potato and tomato 14-18 . however, the most attractive species for expressing subunit vaccine components should have high levels of soluble protein that is stable during storage; seed crops such as cereals are particularly suitable. the embryo fraction is rich in soluble protein and can easily be separated from other seed tissue to increase the concentration of antigen and thus decrease the dose size. 
the choice of crop defines the type of material to be fed. many plant tissues can be consumed raw but others must be processed. processing facilitates the creation of a homogeneous sample, enabling a defined dose size, but it is important that any heat or pressure treatments involved do not destroy the antigen. alternative processing steps have been applied to a candidate vaccine component against enterotoxigenic strains of e. coli that consists of the b subunit of the heat-labile toxin (lt-b) expressed in corn. a typical 1 mg dose of lt-b could be delivered in an embryo fraction, to decrease the volume of the dose, or in a 'cooked' whole corn snack, to increase palatability and enhance stable storage (fig. 2). in this case, neither treatment degrades the antigen. for commercial animal vaccines, the relevant protein can be expressed in a plant tissue that constitutes a major proportion of the diet, and heat and pressure treatments are not necessary. some key examples that illustrate the range of candidate proteins under investigation and plant expression systems being used are given in table 2. plant-expressed antigens have been shown to be able to induce mucosal and serum immune responses when administered parenterally or orally to experimental animals and, in some test cases, they have offered protection against a subsequent pathogen challenge or challenge model 14,16,19-24. a few of these vaccine candidates have been successfully tested in clinical trials or, where appropriate, in commercial or native animal trials 25-30. thus, edible vaccines delivered in plant tissues or processed plant products show great potential for efficacy in target organisms. the bioencapsulation of lt-b in transgenic corn material results in an increased mucosal immune response compared with that achieved with naked antigen when fed to mice 30. 
presumably, this is because the antigen is protected from degradation in the gut, and it augurs well for the development of plant-based edible vaccines. the quantity of plant tissue constituting a vaccine dose must be of a practical size for consumption. thus, achieving a high level of expression is crucial. the expression of vaccine components in plants has been increased by using a range of leader and polyadenylation signals 31 and by optimizing codon usage for plants 22, 29, 30 . expression could also be raised through crosses of transformed lines to various genetic backgrounds, an approach that has been successfully applied to boost protein production in corn. it is also important that any vaccine component should be present in its native form in the transgenic plant tissue. this has been assessed in several cases by examining the size of the synthesized protein, its ability to form higher-order complexes that mirror microbial or viral structures and, where relevant, by showing an enzymatic or receptor-binding activity 14,16-18, 22, 24, 29 . the stability of heterologous proteins and the assembly of multisubunit structures depend on the cellular environment and therefore on the subcellular location. favored locations for the expression of selected subunit vaccine components are the cell surface and the endoplasmic reticulum and golgi body 14,18,29,31,32 . as with antibodies, transient expression systems (in which candidate vaccine sequences are incorporated into plant viral surface proteins) have also been investigated extensively and high levels of expression have been achieved. a related strategy to that of edible vaccines uses transgenic plants expressing autoantigens, whereby a large oral dose of an autoantigen can inhibit the development of an autoimmune disease through the mechanism of oral tolerance. this approach has been successful in a mouse model for diabetes 33 . 
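The link between expression level and a practical dose size can be made concrete with a short calculation. The sketch below is illustrative only: the 1 mg dose is the typical LT-B dose quoted in the text, while the expression levels in the loop are hypothetical values chosen to show the scaling, not measurements from any study cited here.

```python
def tissue_mass_g(dose_ug: float, antigen_ug_per_g: float) -> float:
    """Grams of plant tissue needed to deliver dose_ug micrograms of antigen.

    antigen_ug_per_g is the antigen content of the tissue in micrograms
    per gram (an assumed, user-supplied expression level).
    """
    if antigen_ug_per_g <= 0:
        raise ValueError("expression level must be positive")
    return dose_ug / antigen_ug_per_g


if __name__ == "__main__":
    dose_ug = 1000.0  # 1 mg dose, as quoted for LT-B in the text
    # Hypothetical expression levels (micrograms of antigen per gram of tissue).
    for level in (1.0, 10.0, 100.0):
        grams = tissue_mass_g(dose_ug, level)
        print(f"{level:6.1f} ug/g -> {grams:8.1f} g of tissue per dose")
```

Because the required tissue mass is inversely proportional to the expression level, each tenfold gain in expression cuts the amount of material to be eaten by a factor of ten, which is why high expression is treated as crucial above.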
generally, levels of pharmaceutical proteins produced in transgenic plants have been less than the 1% of total soluble protein that is needed for commercial feasibility if the protein must be purified 34. plant-derived recombinant hepatitis-b surface antigen induced only a low level serum antibody response in a small human study, probably reflecting the low level of expression (1-5 ng g −1 fresh weight) in transgenic lettuce 27. in spite of recent improvements in expression levels in potato with a view to clinical trials 31, expression levels should be increased further for practical purposes. also, even though norwalk virus capsid protein expressed in potatoes caused oral immunization when consumed as food, expression levels are too low for large-scale oral administration (0.37% of total soluble protein) 16, 26. expression of genes encoding other human proteins in transgenic plants has been disappointingly low: human serum albumin, 0.020% total soluble protein; human protein c, 0.001% total soluble protein; erythropoietin, ~0.003% total soluble protein; and human interferon-β, <0.001% fresh weight (table 3). a synthetic gene coding for the human epidermal growth factor was expressed only up to 0.001% of total soluble protein in transgenic tobacco 35, 36. in spite of several successful reports of high-level expression of non-human proteins (e.g. phytase, glucanase) via the nuclear genome, there is a great need to increase expression levels of human blood proteins to enable the commercial production of pharmacologically important proteins in plants. one alternative approach is to express foreign proteins in chloroplasts of higher plants. foreign genes have been integrated into the tobacco chloroplast genome, giving up to 10 000 copies per cell and resulting in the accumulation of recombinant proteins at up to 47% of the total soluble protein 37. 
chloroplast transformation uses two flanking sequences that, through homologous recombination, insert foreign dna into the spacer region between the functional genes of the chloroplast genome, thus targeting the foreign genes to a precise location. this eliminates the 'position effect' upon expression that is frequently observed in transgenic plants with genes inserted into the nuclear genome. in addition, gene silencing has not been observed with chloroplast transformation, whereas it is a common phenomenon with nuclear transformation. chloroplast genetic engineering is an environmentally friendly approach, minimizing several environmental concerns 38, 39. importantly, chloroplasts can process eukaryotic proteins, including enabling correct folding and the formation of disulfide bridges. chaperonin proteins are present in chloroplasts and might function in the folding and assembly of non-native proteins of both prokaryotic and eukaryotic origins. also, chloroplast proteins are activated by disulfide bond oxidation-reduction cycles using the plastid thioredoxin system 40 or protein disulfide isomerase 41. accumulation of large quantities of a fully assembled form of human somatotropin with the correct disulfide bonds (7% total soluble protein) 42 provides strong evidence for hyperexpression and assembly of pharmaceutical proteins using this approach. such folding and assembly of foreign proteins should eliminate the need for expensive in vitro processing of pharmaceutical proteins produced in recombinant organisms. for example, 60% of the total operating cost for the commercial production of human insulin in e. coli is associated with in vitro processing (formation of disulfide bridges and cleavage of methionine) 43. purification is likely to represent most of the cost of biopharmaceutical production in plants. for the commercial production of insulin in e. coli, chromatography accounts for 30% of operating expenses and 70% of equipment costs 43. 
therefore, new approaches are necessary to minimize or eliminate chromatography in the production of pharmaceutical proteins. one successful recent approach is targeting pharmaceutical proteins to seed oil bodies. this was shown with hirudin, an anticoagulant first isolated from the leech hirudo medicinalis. an oleosin-hirudin fusion protein has been targeted to oil bodies of brassica napus seeds and purified by flotation centrifugation for commercial production in canada 44 . another novel approach is the use of gvgvp as a fusion protein to facilitate single-step purification without the use of chromatography. gvgvp is a protein-based polymer encoded by synthetic genes. at low temperatures, it exists as an extended molecule but, upon raising the temperature above the transition range, the polymer hydrophobically folds into dynamic structures called β-spirals that further aggregate by hydrophobic association to form twisted filaments 45 . using this approach, single-step purification of an insulin-polymer fusion has recently been shown. inverse temperature transition offers several advantages, including facilitating the scale-up of purification from grams to kilograms (o. carmona-sanchez and h. daniell, unpublished) . yet another recent approach is the use of a chaperonin protein to fold foreign proteins into cuboidal crystals, allowing their purification in a single step by centrifugation 37 . one additional advantage of this method is the protection of foreign proteins from cellular proteases. plant-derived biopharmaceuticals should meet the same standards of safety and performance as other production systems. however, many herbal medicines are now exempt from such close scrutiny and are not required to meet the same standards because of their classification as nutritional supplements. 
because several environmental concerns have been raised by interest groups to confuse public perception, it is of paramount importance that regulating agencies distinguish between real and perceived public concerns (scientific versus non-scientific). if biopharmaceuticals that are potentially harmful are capable of persisting in the environment and might accumulate in non-target organisms, precautionary measures should be taken. induction of biopharmaceutical production after harvesting (as was done in the case of glucocerebrosidase 36 ) might be one approach to minimize environmental exposure, provided that the use of viral vectors does not introduce additional environmental or regulatory concerns. expression of potentially harmful proteins in a form that must be treated for activation might minimize the risk of exposure. for example, hirudin is produced as a fusion protein and is inactive in this form; it is activated only after it is purified from seeds 36 . another hotly debated environmental concern has been the outcrossing of transgenic pollen to weeds or related crops 38, 39 . expression of harmful pharmaceutical proteins in non-target plants resulting from such outcrosses might create public concern and negative perception. several gene containment methods are currently being investigated, including apomixis, incompatible genomes, transgenic mitigation, control of seed dormancy or shattering, suicide genes, infertility barriers, male sterility and maternal inheritance. engineering foreign genes via the chloroplast genome has been shown to contain transgenes effectively, although there are a few exceptions in which the chloroplast genome shows biparental inheritance (e.g. pines) 46 . as an example of an alternative strategy, rnase genes have been expressed under the control of a tissue-specific promoter to destroy the tapetum selectively during anther development, resulting in male sterile plants 47 . 
there is also concern over the expression of harmful proteins in transgenic pollen. for example, the controversial observation of the toxic effect of bacillus thuringiensis (bt) corn pollen on milkweeds (asclepias spp.) fed to monarch butterfly larvae had a significant impact on public perception, even though the validity of this study has been repeatedly questioned. engineering biopharmaceuticals via the chloroplast genome might be a solution. although the cry protein of bt was expressed at high levels in leaves (up to 47% of total soluble protein), no toxicity was observed when milkweeds dusted with transgenic pollen were fed to monarch butterfly larvae 37 . however, to date, chloroplast genetic engineering has been shown only in tobacco and potato. more recently, several academic and industrial laboratories have initiated projects to extend this technology to other useful crops. also, there are no reports of the production of glycoproteins in transgenic chloroplasts. another public concern is the presence of antibiotic resistance genes or their products (which are used as selective markers) in edible parts of genetically modified crops. however, several approaches are now available to generate plants with transgenes in their nuclear 48 or chloroplast 49 genomes without the use of antibiotic selection. practical considerations will dictate the choice of biopharmaceutical proteins and the crop in which they are to be produced. these include yield, storage conditions, containment properties, initial set up and running costs, purification strategies, size of the market, environmental concerns, public perception and competing technologies. access to several alternative approaches to optimize protein synthesis in plants in an environmentally sound manner augurs well for the safe production of biopharmaceuticals in transgenic plants and for greater availability of these proteins to populations requiring them. 
toxin b subunit oligomers in transgenic potato plants
efficacy of a food plant-based oral cholera toxin b subunit vaccine
protective immune response to foot-and-mouth disease virus with vp1 expressed in transgenic plants
expression of immunogenic glycoprotein s polypeptides from transmissible gastroenteritis coronavirus in transgenic plants
edible vaccine protects mice against escherichia coli heat-labile enterotoxin (lt): potatoes expressing a synthetic lt-b gene
immunogenicity of transgenic plant-derived hepatitis b surface antigen
induction of a protective antibody response to foot and mouth disease in mice following oral or parenteral immunization with alfalfa transgenic plants expressing the viral structural protein vp1
immunogenicity in humans of a recombinant bacterial antigen delivered in a transgenic potato
human immune responses to a novel norwalk virus vaccine delivered in transgenic potatoes
a plant-derived edible vaccine against hepatitis b virus
immunization with potato plants expressing vp60 protein protects against rabbit hemorrhagic disease virus
immunogenicity of porcine transmissible gastroenteritis virus spike protein expressed in plants
plant-based vaccines: unique advantages
production of hepatitis b surface antigen in transgenic plants for oral immunization
development of biopharmaceuticals in plant expression systems: cloning, expression and immunological reactivity of human cytomegalovirus glycoprotein b (ul55) in seeds of transgenic tobacco
transgenic plants expressing autoantigens fed to mice to induce oral immune tolerance
production of recombinant proteins in transgenic plants: practical considerations
application of transgenic plants as production systems for pharmaceuticals
transgenic plants for therapeutic proteins: linking upstream and downstream technologies
hyper-expression of the bt cry2aa2 operon in chloroplasts leads to formation of insecticidal crystals
gm crops: public perception and scientific solutions
environmentally friendly approaches to genetic engineering
regulation of chloroplast enzyme activities by thioredoxins: activation or relief from inhibition
protein disulfide isomerase as a regulator of chloroplast translational activation
high-yield production of a human therapeutic protein in tobacco chloroplasts
computer-aided process analysis and economic evaluation for biosynthetic human insulin production: a case study
molecular farming in plants: oil seeds as vehicles for production of pharmaceutical proteins
hyperexpression of an environmentally friendly synthetic polymer gene
containment of herbicide resistance through genetic engineering of the chloroplast genome
induction of male sterility in plants by a chimaeric ribonuclease gene
removing selectable marker genes: taking the shortcut
engineering chloroplast genome without the use of antibiotic resistance genes
transgenic plants as factories for biopharmaceuticals
triple helix assembly and processing of human collagen produced in transgenic tobacco plants
expression of full length bioactive antimicrobial human lactoferrin in potato plants

key: cord-258363-gmgbus9i authors: kolla, venkatadri; chakravorty, maharani; pandey, bindu; srinivasula, srinivasa m; mukherjee, annapurna; litwack, gerald title: synthesis of a bacteriophage mb78 late protein by novel ribosomal frameshifting() date: 2000-08-22 journal: gene doi: 10.1016/s0378-1119(00)00264-x sha: doc_id: 258363 cord_uid: gmgbus9i mb78 is a virulent phage of salmonella typhimurium that possesses a number of interesting features, making it a suitable organism to study the regulation of gene expression. a detailed physical map of this phage genome has been constructed and is being extensively studied at the molecular level. here, we demonstrate the expression of two late proteins of bacteriophage mb78 derived from the same gene as a result of possible ribosomal frameshifting. 
in vitro transcription-translation yields a major protein that migrates as 28 kda, whereas in vivo expression using pet expression vectors yields two equally expressed proteins of molecular sizes 28 and 26 kda. a putative slippery sequence tttaaag and a pseudoknot structure, two essential cis elements required for the classical ribosomal frameshifting, are identified in the reading frame. mutations created at the slippery sequence resulted in a single 28 kda protein and completely abolished the expression of 26 kda protein. thus, we have produced the first evidence that ribosomal frameshifting occurs in bacteriophage mb78 of salmonella typhimurium.

bacteriophage mb78 is a virulent phage of salmonella typhimurium (joshi et al., 1982; srinivasula, 1992). morphologically, physiologically and serologically, it is different from the well-known temperate phage p22 and related phages as well as a virulent phage 9na (murthy, 1987). mb78 cannot multiply in minimal medium containing citrate. the chelating agent edta is an effective inhibitor of its dna synthesis, whereas egta and orthophenanthroline have practically no effect on the development of the phage (verma and chakravorty, 1987). mb78 is a dominant phage in that it does not allow phages like p22 and 9na to grow in its presence. mb78 contains a 42 kb linear, double-stranded dna (molecular weight 28×10^6 da), which replicates through concatemer formation, subsequently converted to full-length phage dna through a 'headful' packaging mechanism. like p22, mb78 dna is circularly permuted and terminally redundant (khan et al., 1991a; pandey, 1992; srinivasula, 1992).

it is now known that two proteins can be expressed from a single open reading frame through 'ribosomal frameshifting'. if the ribosome shifts during translation, one base in either direction, i.e. towards 3′ or 5′ ends, the reading frame will be changed. during the process of 'ribosomal frameshifting', two or more proteins can result, starting from a single initiation codon (farabaugh and vimaladithan, 1998 [...] direction (+1 frame shift) has been described in the yeast retrotransposon ty (belcourt and farabaugh, 1990), copia-like elements of drosophila (saigo et al., [...] been demonstrated for retroviruses (jacks et al., 1988; vickers and ecker, 1992), luteoviruses (brault and miller, 1992; prufer et al., 1992), the bacterial transposon is1 (sekine et al., 1992) and in potato leafroll virus (kujawa et al., 1993). a −1 frameshifting event often controls the levels of expression of viral reverse transcriptase relative to viral core proteins in retroviruses. frameshifting is also known to affect gene expression in coronaviruses and even in a bacterial system (chamorro et al., 1992). in all these cases, frameshifting occurs as the ribosome passes a seven nucleotide sequence 5′ xxxyyyz 3′ (x is a, u or g, y is a or u, and z is any nucleotide), known as the 'slippery site'. two of the three base pairs between the anticodons of each of the two trnas and mrnas can be maintained after the slip into the −1 reading frame. the slippery sequence is not the only determinant of frameshifting; secondary signals are also required (larsen et al., 1995; atkinson et al., 1997). secondary signals programmed in the mrna augment shifting at the slippery sequence to give high levels of frameshifting. these signals, called 'stimulators', are very diverse. for example, the +1 shift for decoding rf2 (release factor) of e. coli requires two stimulators: one is a uga terminator at codon 26 flanking the shift site on its 3′ site (weiss et al., 1988); the other is a shine-dalgarno sequence located three nucleotides upstream of the shift site (weiss et al., 1988). these two stimulators act independently with substantial activity, but their effects are synergistic. pseudoknots, a tertiary interaction involving base pairing between two regions of unpaired bases, are also involved in frameshifting. the model for the pseudoknot structure was based on biochemical analysis of the 3′ end of turnip yellow mosaic virus (tymv) rna (pleij et al., 1985; dumas et al., 1987). in e. coli, ribosomal protein s4 represses its own synthesis in addition to the synthesis of other ribosomal proteins viz. s11, s13 and l17 by binding to a pseudoknot structure. the structure resembles a 'double pseudoknot' linking a hairpin upstream of ribosome binding site with sequences 2-10 codons downstream of the initiation codon (tang and draper, 1989). pseudoknots also play a role in the structural mimicry of trna at the 3′ termini of plant viral rnas (pleij et al., 1985). one of the most intriguing functions of the pseudoknot structure in frameshifting occurs during the translation of certain retroviral mrnas (jacks et al., 1988; kujawa et al., 1993; atkinson et al., 1997). mutational analyses in mouse mammary tumor virus (mmtv) (chamorro et al., 1992) and in infectious bronchitis virus (ibv) (brierley et al., 1992) provide strong evidence for the stimulator structural element being a pseudoknot. the autoregulation of gp32 in phage t4 also involves a pseudoknot (shamoo et al., 1993). in this investigation, we provide evidence for the first time that one of the two late genes from bacteriophage mb78 is expressed by ribosomal frameshifting.

abbreviations: bp, base pairs; iptg, isopropyl β-thiogalactoside.
the nucleotide sequence reported in this paper has been deposited in the embl/genbank database under accession no. x87092.

2. materials and methods

2.1. bacterial strains
lt2, a salmonella strain, was originally obtained from dr. myron levine, department of human genetics, university of michigan, ann arbor, mi. e. coli strain kk2186 was a generous gift from dr. p. berget, then at the department of biochemistry and molecular biology, university of texas, houston, tx. all other bacterial strains were purchased commercially from gibco-brl life technologies. all the chemicals were obtained from sigma chemical company, st. louis, usa.

2.2. purification of bacteriophage mb78 and isolation of its dna
phage stocks were prepared as described earlier (kolla and chakravorty, 2000). phage dna was isolated as per the method described by maniatis et al. (sambrook et al., 1989; kolla and chakravorty, 2000).

2.3. isolation of plasmid dna
plasmid dnas were isolated by either alkali lysis method using standard protocols (sambrook et al., 1989; kolla and chakravorty, 2000) or by qiagen and promega columns according to the manufacturer's instructions. dna from the gels was extracted using qiagen columns.

2.4. nested deletions by exoiii
the deletions were created primarily as described by henikoff (1987). briefly, cloned dna fragment (5-10 µg) was digested with two different restriction enzymes e.g. psti and bamhi. the enzyme psti produces a four-base 3′ overhang, resistant to exoiii activity, while the enzyme bamhi generates 5′ protrusion, which is accessible to exoiii (weiss, 1976). after complete digestion with both the enzymes, the dna sample was deproteinized by extracting with phenol-chloroform and precipitated with ethanol. the dna pellet was resuspended in 25 µl of 1× exoiii buffer and the exonuclease treatment carried out as per the recommendations of the manufacturer (promega). finally, the deleted fragments were ligated and used for completing the nucleotide sequence as well as for in vivo expressions.

2.5. in vitro transcription and translation
the coding region for 26 and 28 kda proteins was amplified by pcr and cloned in bacterial expression vectors, pet21a and pet28a. recombinant dnas were in vitro transcribed-translated in the presence of [35s]-methionine in rabbit reticulocyte lysate with a t7-rna polymerase-coupled tnt kit (promega) according to the manufacturer's recommendations. briefly, reactions were set up in 50 µl volume in an eppendorf tube containing 25 µl of rabbit reticulocyte lysate, 2 µl of reaction buffer, 1 µl of amino acid mix minus methionine, 1 µl of rnasin (5 u), 1 µg of template dna, 4 µl of 35s-methionine (1000 ci/mmol) and 1 µl of t7 rna polymerase (20 u). the final reaction was made up to 50 µl with sterile distilled water, and the tubes were incubated at 42°c for 90 min to allow the synthesis of proteins. one to 3 µl samples were applied on to sds-polyacrylamide gels after denaturing in a sample buffer by boiling.

2.6. sequencing
nucleotide sequencing of ecori 'f' fragment was carried out by manual sequencing using sequenase kit (usb) and also by automated sequencing. forward and reverse sequencing primers were obtained from pharmacia (uppsala, sweden).

2.7. preparation of minicells
minicells were prepared as described by reeve (1979). e. coli strain ds410, the minicell producing strain, transformed with desired plasmid was grown overnight to stationary phase in 400 ml of terrific broth (sambrook et al., 1989) in the presence of the relevant antibiotics (ampicillin and tetracycline). the culture was examined under a light microscope to observe the formation of long filamentous cells and minicells. when a reasonable number of minicells were visible under the microscope, the culture was chilled on ice for 20 min and then centrifuged at 4°c for 5 min at 2,000 rpm (675×g) in gs3 rotor of sorvall rc5c centrifuge. the supernatant was transferred to fresh gs3 cups and centrifuged at 7000 rpm (5,278×g) for 20 min in the same centrifuge. the cell pellet was resuspended in 9 ml of m9 minimal medium, and then 4 ml of the suspension were carefully layered on to 10-30% (w/v) sucrose gradients. the gradients were centrifuged at 4°c for 18 min at 5000 rpm (4122×g) in the hb4 rotor of the sorvall rc5c centrifuge. the top two-thirds of the minicell layer was collected into a 30 ml corex cup using pasteur pipette. to this, an equal volume of m9 minimal medium was added, and the suspension was centrifuged again at 10 000 rpm (11 953×g) for 10 min in the ss34 rotor of the sorvall rc5c centrifuge. the supernatant was discarded, and the minicell pellet was suspended in 6 ml of m9 medium. the cell suspension was again layered on to sucrose gradient as before. again, two-thirds of the minicell layer was collected into a 15 ml corex cup, and an equal volume of m9 was added. the optical density of this minicell suspension was measured at 600 nm in a hitachi spectrophotometer to determine the volume in which the minicells would be finally suspended to have a600 = 2.0/ml (2×10^10 cells/ml). the cells were finally suspended in m9 minimal medium containing 30% (v/v) glycerol, aliquoted into a number of tubes (200 µl in each) and stored at −80°c. the minicells thus stored could be used for at least a year.

2.8. expression of plasmid encoded proteins
purified minicells were labeled with 35s-methionine as described previously (reeve, 1979; kolla and chakravorty, 2000). the frozen minicell suspension (0.1 or 0.2 ml) was thawed slowly and centrifuged for 3 min in a microfuge. the pellet was suspended in 200 µl of m9 minimal medium to which 3 µl of 10.5% (w/v) difco methionine assay medium were added and incubated at 37°c for 90 min, to complete the translation of bacterial mrnas in the minicells, received from the mother cell. then, 25 µci of 35s-methionine were added and incubated for 60 min at 37°c, followed by incubation of 5 min after the addition of 10 µl of unlabeled methionine (1%). the cells were then centrifuged at 12 000 rpm for 3 min, the cell pellet was washed with 500 µl of 10 mm tris-hcl, ph 7.6, suspended in 20 µl of the same buffer to which 20 µl of 2× sample buffer were added. the labeled proteins were separated by sds-polyacrylamide gel electrophoresis (sds-page) (laemmli, 1970), at a constant voltage of 50 v.

2.9. fluorography
to detect radiolabeled proteins, the gels were fluorographed using water-soluble sodium salicylate (bonner, 1984). the gel was soaked in methanol, acetic acid, and water (5:1:5) for 60 min, followed by a thorough wash with water (30 volumes of gel), then immersed in 1 m sodium salicylate, ph 7.0 for 1 h with mild shaking. finally, the gel was transferred to whatman no. 1 sheet, dried under vacuum and subjected to fluorography.

2.10. cloning and pcr amplification
the presumed coding regions for 26 and 28 kda proteins were amplified by pcr using the following forward and reverse primers with overhanging restriction sites.
forward primer: 5′ cccggatccatgaatcgttttttacgttac 3′
reverse primer: 5′ cccgaattcggcagggttagattt 3′
the primers were commercially synthesized by gibco-brl life technologies (including the primers designed to create mutations at the 3′ end of the fragment to avoid the frameshift by overlapping pcr). the primers were also used to identify the presence of inserts in the cloned vectors by pcr screening and to determine the nucleotide sequence. the amplified fragments were digested with appropriate restriction enzymes and cloned in-frame into pet21a and pet28a expression vectors at bamhi and xhoi restriction sites. restriction enzyme analyses and dna sequencing confirmed the sequence of all the constructs. 
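As a quick check on the "overhanging restriction sites" mentioned for the two primers, the sketch below scans the printed primer sequences for a few well-known recognition sequences (BamHI GGATCC, EcoRI GAATTC, XhoI CTCGAG). It is an illustration written for this text, not part of the authors' workflow; the function name and the choice of enzymes to scan for are assumptions, and any line-break hyphens in a printed sequence are simply stripped.

```python
# Recognition sequences of three common restriction enzymes.
SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC", "XhoI": "CTCGAG"}

# Primer sequences as printed in section 2.10 (5' to 3').
FORWARD = "cccggatccatgaatcgttttttacgttac"
REVERSE = "cccgaattcggcagggttagattt"


def find_sites(primer: str) -> dict[str, int]:
    """Return {enzyme: 0-based position} for each site present in the primer."""
    # Keep letters only, so hyphens or spaces from typesetting are ignored.
    seq = "".join(ch for ch in primer.upper() if ch.isalpha())
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


if __name__ == "__main__":
    print("forward:", find_sites(FORWARD))
    print("reverse:", find_sites(REVERSE))
    # Position of the first ATG in the forward primer (candidate start codon).
    print("first ATG in forward primer at index", FORWARD.upper().find("ATG"))
```

Run on the sequences as printed, the forward primer carries a BamHI site immediately followed by an ATG, and the reverse primer carries an EcoRI site; no XhoI site is found in either primer, although the text describes cloning at BamHI and XhoI sites. The sketch only reports what is in the printed sequences and does not resolve that difference.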
cruz biotechnology) and resolved by sds-page (10%) and transferred electrophoretically (100 v constant for 1 h) to hybond ecl nitrocellulose paper (amersham life science). the paper was blocked overnight in 10% non-fat milk, and incubated in 5% non-fat milk with horseradish peroxidase-conjugated t7-antibody (1:10 000, novagen) for 1 h at room temperature. after stringent washing, the filter was developed by chemiluminescent ecl, as described (amersham, arlington heights, il).
in order to understand more about the physiology and genetics of the phage mb78, the ecori 'f' fragment (2.3 kb) of the phage was cloned in puc18 vector (data not shown), and the expression in minicells was examined. the ecori 'f' fragment of mb78 codes for four proteins of mass 28, 26, 21 and 11 kda (fig. 1, lane 1). the expression of β-lactamase was not strong in cells carrying the ecori 'f' fragment, suggesting that the presence of a strong promoter (pandey et al., 1997) in the 'f' fragment is interfering with the expression of the β-lactamase gene. to characterize the ecori 'f' fragment, sequencing of the fragment was carried out with a nested set of deletions in the target dna. the clones were named according to the number of bases deleted; for example, +1518 means that 1518 bases were deleted, and +1601 means that 1601 bases were deleted and so on from the full-length construct. to ascertain the sizes of the inserts, the deleted plasmid dnas were digested with hindiii, located 138 bases away from the ecori site (fig. 2). the sequentially deleted dnas were further confirmed by dot-blot hybridization (data not presented). the deleted fragments were cloned into puc18, and the deletions were confirmed by dna sequencing.
in order to determine the expression of various deletion mutants, we performed in vivo minicell expression using 35s-methionine [only a few of the sequentially deleted plasmids are presented here (fig. 1)]. lane 6 (proteins expressed by vector puc18) shows two proteins, including 30 kda β-lactamase protein. in lane 1, four major proteins of molecular weights 28, 26, 21 and 11 kda expressed from the cloned ecori 'f' fragment could be seen. lanes 2-5 show the proteins expressed by different sequentially deleted dnas. the 21 and 11 kda proteins could not be seen when 1438 bases were deleted (lane 2), suggesting that their promoters and orfs (open reading frames) reside within 1438 bp from the original construct. the expression of 28 and 26 kda proteins was unaffected even after the deletion of 1518 bp from the original construct (lane 3). the present study focuses on the characterization of the 28 and 26 kda proteins. the clone f-1518 expressed 28 and 26 kda proteins, suggesting that their promoters and orfs are present in 793 bp (1518-2311) of the ecori 'f' fragment. after deletion of 1595 bp, the expression of both proteins was reduced simultaneously (lane 4), and no expression was detected after the deletion of 1723 bp (lane 5). these results suggest that the expression of these two proteins is driven by a common promoter that resides within 1518-1601 bp and that both the proteins are possibly expressed from overlapping open reading frames.
3.3. nucleotide sequence analyses
the nucleotide sequence of the ecori 'f' fragment was determined by sanger's dideoxy chain termination method (sanger et al., 1977) using a set of deletion mutants produced by exoiii (accession no. x87092). computer analysis of the nucleotide sequence indicated the possibility of encoding four proteins that could be expressed from three orfs (data not presented). the orf for the 28 and 26 kda proteins starts from 1641 and does not have a stop codon in the 'f' fragment; it appears that the vector stop codon, located adjacent to ecori site, might be used. analysis of the nucleotide sequence of ecori 'f' fragment revealed that expression of 28 and 26 kda proteins may have occurred through ribosomal frameshifting. we analyzed the sequence for the formation of possible secondary structures, necessary for the process of classical ribosomal frameshifting (fig. 3a). computation also revealed the presence of a slippery sequence with a possible downstream pseudoknot structure (fig. 3b). the slippery sequence present in bacteriophage mb78 resembles that of the turnip yellow mosaic virus (kujawa et al., 1993).
3.4. effect of 3′ truncation on the expression
we next examined whether the 28 and 26 kda proteins are expressed from overlapping open reading frames by ribosomal frameshifting. if they are expressed from overlapping open reading frames with the same initiation codon, truncation of the gene from the 3′ end should yield only a single protein. this part of the gene has an internal hindiii restriction site, located 138 bp away from the ecori site, at the 3′ end of the fragment. when this portion is deleted, the coding region will be reduced to 555 bp, encoding a protein of approximately 22 kda. to test this, plasmid +1518 was truncated (+1518/138 h) and used to transform the minicell producing strain ds410 to observe the expression of proteins. the expression pattern of the deleted plasmid is presented in fig. 4. deletion of 138 bp from the 3′ end of the ecori 'f' fragment resulted in complete abolition of 26 and 28 kda proteins, but a major protein, smaller in size (22 kda, marked with triangle), was expressed (lane 4). the deletion of 3′ end of the gene, resulting in the synthesis of a single protein instead of two proteins, supports the frameshift notion. the presence of a slippery sequence and a pseudoknot structure downstream to the putative shift site strengthens the argument.
fig. 3. sequence analysis. (a) probable secondary structure of mrna derived from 3′ 138 nucleotides. (b) probable slippery sequence and downstream pseudoknot structure.
the slippery sequence essential for the frameshift is marked in a box, and the bold nucleotides represent the stop codon where mutations were created. nucleotides involved in the pseudoknot structure are connected.
in order to examine the phenomenon of frameshift further, we next performed coupled in vitro transcription and translation using rabbit reticulocyte lysate. the nucleotide sequence starting from atg to the end of the fragment was amplified by pcr with forward and reverse primers, as described in section 2.10. the amplified dnas were cloned in-frame into bacterial expression vectors, pet21a and pet28a, at bamhi and xhoi sites and named kvm21 and kvm28, respectively. these dnas were used to synthesize proteins by in vitro transcription and translation (promega). the results are presented in fig. 5 (lanes 1 and 2). in vitro transcription-translation of the fragment yielded a major protein (90%) of 28 kda, with the synthesis of a minor protein of apparent molecular mass 26 kda (arrows). we observed the formation of dimers from 26 and 28 kda proteins in the absence of the reducing agent dtt (data not shown). these results suggest the synthesis of [...]
[...]-methionine in rabbit reticulocyte lysate, as described in section 2. one to three microliters of these samples, as indicated, were applied on to sds-polyacrylamide gels after denaturing in a sample buffer by boiling. a major translated protein product (28 kda) and a minor protein are marked with arrows. the position of possible dimers is also marked with an arrow. lanes 3 and 4 represent the translation of empty pet28a and pet21a vectors, respectively.
recombinant proteins with c-terminal his.tag (pet21a) and c and n-terminal his.tags (pet28a) were expressed in e. coli bl-21 de3. purified proteins were separated on a 12% sds-polyacrylamide gel and transferred on to a nitrocellulose membrane, and western blotting was performed with t7 specific antibody. a single 28 kda protein was present in cells expressing c-terminal his.tag in pet21a vector (fig. 6a), whereas 26 and 28 kda proteins were present in cells expressing n and c-terminal his.tags in pet28a vector (fig. 6b). this suggests that frameshift occurs at the c-terminal region, resulting in the appearance of a truncated 26 kda protein without a c-terminal his.tag in addition to the expression of a full-length 28 kda protein with a c-terminal his.tag. mutations were created (deletion of three nucleotides) near the putative slippery sequence by overlapping pcr and were cloned into pet28a. recombinant proteins were expressed and separated as described above. we predicted that this mutation should result in the loss of frameshift. as expected, the mutated recombinant constructs did not yield the 26 kda protein, but only the 28 kda protein (fig. 6c, lane 1). these results further support ribosomal frameshifting.
[...]able volume of 2× sample buffer and subjected to sds-page followed by western blot. the presence of 26 and/or 28 kda proteins was detected using the ecl kit (amersham). the molecular weight of the proteins is marked with arrows.
3.7. kinetic expression of 28 and 26 kda proteins
to examine the functions of the 28 and 26 kda proteins, kinetic expression of phage mb78 was performed. the lt2 cells (host) were infected with phage mb78 at a multiplicity of infection (m.o.i.) 10 and pulse-labeled with 35s-methionine for 2 min at different times after infection. the labeled proteins were subjected to 12.5% sds-page followed by fluorography. the kinetic study demonstrated that the two proteins are late proteins of bacteriophage mb78 (fig. 7). the protein pattern of uninfected cells served as a control. the expression of 28 and 26 kda proteins was low at 2 and 5 min after infection, but from 10 min onwards, their synthesis increased significantly until cell lysis. it may therefore be assumed that these two proteins are late proteins of phage mb78.
exponentially growing lt2 cells in minimal medium (m9) at 37°c were infected with phage [...] centrifugation, washed with 500 ml of medium and finally suspended in 25 ml of m9 medium. samples collected at different times as indicated were lysed in an equal volume of 2× sample buffer and applied on to a 12.5% sds-polyacrylamide gel. lane 1, uninfected (ui) lt2 cells; lanes 2-7, cells infected with phage mb78 after 2, 5, 10, 15, 30 and 45 min, respectively. brl low-molecular-weight markers are represented in the left lane. the 28 and 26 kda proteins are marked with two arrows.
4. conclusions
1. we demonstrated that two late proteins of bacteriophage mb78 could be derived from the same gene as a result of ribosomal frameshifting.
2. ribosomal frameshifting has been well established in e. coli phages t2 (du et al., 1997), t4 (groisman and engelberg-kulka, 1995) and t7 (condron et al., 1991; sipley et al., 1991; lewis and matsui, 1996) but not well documented in the case of salmonella bacteriophages except the previously reported observation in the phage p22 (uomini and roth, 1974).
3. two cis elements are essential for ribosomal frameshifting (jacks et al., 1988; blinkowa and walker, 1990; tsuchihashi and kornberg, 1990), but the exact mechanism is not clearly understood. it has been postulated in −1 frameshift that an anti-shine-dalgarno-like sequence, present at 5′ to the shift site (larsen et al., 1994), pairs with the 16s rrna of the elongating ribosome to make the ribosome pause, resulting in a frameshift. this feature is observed mostly in prokaryotes.
4. an aspect of the current study of bacteriophage mb78 is that more than 35% of the ribosomes appear to be involved in frameshifting. reports in the literature indicate that the proteins synthesized as a result of frameshifting are much less numerous, to a maximum of 10-20%.
5. amino acid sequence analysis (e.g. lc/ms, maldi-tof) of the two proteins (28 and 26 kda) encoded by the ecori 'f' fragment is required to confirm the ribosome frameshift hypothesis from the present study.
atkinson ribosomal frameshifting in the yeast retrotransposon ty: trnas induce slippage on a 7 nucleotide minimal site programmed ribosomal frameshifting generates the escherichia coli dna polymerase iii gamma subunit from within the tau subunit reading frame. nucleic acids res fluorography for the detection of radioactivity in gels translational frameshifting mediated phage 9na, a virulent phage of salmonella typhimurium identi-'slippery-sequence' component of a coronavirus ribosomal framefication of a strong promoter of bacteriophage mb78 that lacks shifting signal rohde, sequences stimulating frameshifting in the decoding of gene 10 of w., 1992. ribosomal frameshifting in plants: a novel signal directs bacteriophage t7 use of minicells for bacteriophage-directed polypepthe simian retrovirus-1 gag-pro frameshift site 3-d graphics modelling of the scriptase-like enzyme in a transposable genetic element in drosoph-trna-like 3′-end of turnip yellow mosaic virus rna: structural ila melanogaster effect of frameshift-inducing press, cold spring harbor, ny. mutants of elongation factor 1alpha on programmed +1 frame unidirectional digestion with exonuclease iii in encoded by is1: identification of the site of translational frameshift-dna sequence analysis mb78, a specific recognition of an rna pseudoknot structure repesis and gene 10 frameshifting in escherichia coli showing different lication, maturation and physical mapping of bacteriophage mb78 degrees of ribosomal fidelity molecular analysis of bacteriophage mb78 cloning, sequencing, expression and promoter analysis of a structural protein of bacteriophage mb78. ph unusual mrna pseudoknot structure is recognized by a protein translational repressor translational frameshifting gentural requirements for efficient translational frameshifting in the erates the gamma subunit of dna polymerase iii holoenzyme. synthesis of the putative viral rna-dependent rna polymerase proc cleavage of structural proteins during the mutants of bacteriophage p22 thesis is specifically inhibited by the chelating agent edta. fems rrna-mrna base pairing stimulates a programmed -1 ribosomal microbiol enhancement of ribosomal frame maldoshifting by oligonucleotides targeted to the hiv gag-pol region endonuclease ii of escherichia coli is exonuclease iii. 1123-1129 reading frame switch caused by base-pair formation 70, 2869-2875. between the 3′ end of 16s rrna and the mrna during elongation of protein synthesis in escherichia coli biochemical characterization of the bacterio
the authors are thankful to the department of biotechnology, government of india, the university grants commission and the national institutes of health (nih) (to v.k) for financial assistance and dr. ravikumar rallapalli for discussion.
key: cord-014597-66vd2mdu authors: nan title: abstracts from the 25th european society for animal cell technology meeting: cell technologies for innovative therapies: lausanne, switzerland. 14-17 may 2017 date: 2018-03-15 journal: bmc proc doi: 10.1186/s12919-018-0097-x sha: doc_id: 14597 cord_uid: 66vd2mdu nan
a schematic representation of crispr based synthetic transcription factor technology. b mrna expression levels of protein transport related genes (napg, rab5a and arpc1b).
background an increasing number of biologics are entering the development pipelines of pharmaceutical companies [1]. today, the preferred production host for therapeutic proteins is the cho cell line.
however one of the major hurdles, especially for the production of non-antibody glycoproteins, is host cell-related proteolytic degradation which can drastically impact developability and timelines of pipeline projects. material and methods spike-in: cho cells were cultivated in a chemically defined culture medium at 36.5°c/10% co 2 in shake-flasks. when the cells reached their maximum viable density, they were removed by centrifugation and the conditioned medium was collected. a model mab was spiked into the conditioned medium and incubated at 37°c ± protease inhibitors. the amount of proteolytic degradation was analysed by western blot and lc-ms. transcriptomics: total rna was extracted after 3 days of cell cultivation. rna sequencing libraries were constructed and processed on the hiseq 2000 platform from illumina. generation of matriptase knockout: cho-k1 cells were transfected with mrna encoding "transcription activator-like effector nucleases" or "zinc finger nucleases" targeting matriptase exon 2. the transfected cells were subsequently sorted into single cells and analysed for frameshift mutations in both alleles via sanger sequencing. cell cultivation: fed batch cultivation was performed in 15-ml miniaturized bioreactors (ambr15). approximately 700 proteases are known in rodents. to reduce the number of candidate proteases we showed first that a model mab (prone to proteolytic degradation) incubated in conditioned medium of cho-k1 cells resulted in clipping of the mab, demonstrating the involvement of secreted/shedded proteases (fig. 1a) . broad spectrum inhibitors of the different protease classes revealed that only serine protease inhibitors prevented clipping. serine protease inhibitors of higher specificity highlighted the group of "s1a trypsin-like proteases" (fig. 1a) . comparison of the proteolytic degradation profile of several therapeutic proteins between cho-k1 with another cho cell line (cho-a) revealed less degradation in cho-a. 
therefore, expression of the involved protease(s) is likely lower in cho-a. gene expression profile analysis of both cell lines showed five secreted/shed "s1a trypsin-like serine proteases" expressed more than 1.5-fold lower in cho-a than in cho-k1 (fig. 1b). surprisingly, sirna knockdown experiments of these five candidates identified "matriptase" as the major protease involved in degradation of recombinant proteins expressed in cho-k1 cells (fig. 1c upper panel). next, we generated a cho-k1 matriptase knockout (ko) cell line. no proteolytic degradation product was detected when the model mab was spiked into conditioned medium of the ko cell line (fig. 1c lower panel). also, stable expression of the model mab in the ko cell line resulted in no or significantly less clipping (fig. 1e). the protein titer and the cell growth behaviour of the matriptase ko cells were similar to those of the corresponding wildtype (wt) cells (fig. 1d), as shown by comparative cultivation in the ambr system. conclusions one major challenge for the production of recombinant proteins is cho host cell mediated proteolytic degradation, which can negatively impact projects or even result in their termination [2; 3]. using a variety of techniques such as applying protease inhibitors, transcriptomics and sirna mediated knock-down, we were able to identify "matriptase" as the major protease involved in degradation of recombinant proteins expressed in cho-k1 cells. subsequently, we generated a matriptase deficient cho cell line. protein candidates of diverse formats, severely degraded in the wt cho-k1 cell line, were not cleaved or were significantly less cleaved in the matriptase ko cell line. furthermore, cell growth, viability and productivity levels were comparable between the wt and the matriptase ko cell line.
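the transcriptomic filtering step described above can be sketched in a few lines. this is a minimal illustration, assuming expression is quantified as rpkm (reads × 10^9 divided by gene length in bp times total mapped reads) and that candidates are genes whose cho-k1/cho-a ratio exceeds 1.5; only the 1.5-fold threshold comes from the text. the gene names, read counts, gene lengths and library sizes are invented placeholders, not data from the study, and the function names are hypothetical.

```python
def rpkm(reads, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    return reads * 1e9 / (gene_length_bp * total_mapped_reads)


def higher_in_first(rpkm_first, rpkm_second, threshold=1.5):
    """Genes whose RPKM in the first cell line exceeds the second by more than `threshold`-fold."""
    return sorted(
        gene
        for gene, value in rpkm_first.items()
        if rpkm_second.get(gene, 0) > 0 and value / rpkm_second[gene] > threshold
    )


# Placeholder inputs (NOT data from the study): gene -> (length in bp, read count).
TOTAL_READS = {"cho_k1": 20_000_000, "cho_a": 20_000_000}
COUNTS = {
    "cho_k1": {"protease_x": (2000, 400), "protease_y": (1000, 200)},
    "cho_a": {"protease_x": (2000, 100), "protease_y": (1000, 180)},
}

expression = {
    line: {g: rpkm(reads, length, TOTAL_READS[line]) for g, (length, reads) in genes.items()}
    for line, genes in COUNTS.items()
}
candidates = higher_in_first(expression["cho_k1"], expression["cho_a"])
print(candidates)
```

a ratio filter of this kind only narrows the candidate list; in the abstract, the functional assignment to matriptase came from the sirna and knockout experiments, not from expression levels alone.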
in summary, we have generated a superior platform-compatible cho production host cell line with the same favourable productivity properties as the parental host cell line [4; 6], allowing expression of complex glycoproteins prone to clipping.
background cho cell lines are common hosts for the production of biopharmaceutical proteins. so far, considerable progress has been made in increasing the productivity of cell culture to meet the rapidly growing demand for antibody biopharmaceuticals through increased cell densities and longer culture times. the downside is an increase in process-related impurities, bringing new challenges for process and harvest development. among process-related impurities such as host cell proteins (hcps) or dna, the potential impact of lipid production and release during cell culture is still poorly understood due to the complex nature and diversity of this class of molecules. thanks to recent advances in analytical tools, especially mass spectrometry, lipidomics now makes it feasible to study several thousand lipid species, opening the possibility to understand and potentially control the interactions between high performance bioreactor processes, harvest conditions and purification. in order to analyze and quantify lipids, we developed a three-step method. in a first step, lipids were extracted with methyl tert-butyl ether (mtbe) according to the matyash method [1]. lipids were then separated by liquid chromatography using either a hilic or a reverse phase column prior to detection and quantification by mass spectrometry. all lipid classes were detected by esi-ms/ms except cholesterol (apci-ms/ms). finally, we applied this method to analyze the lipid content of different cell lines, each expressing a different recombinant protein, during a 14-day fed batch process. lipids from cho cells were successfully extracted with a yield between 80% and 95% depending on the lipid class.
stable isotope labeled lipids were used as internal standards in order to have comparable results between batches. the obtained results (fig. 1) show that for a given cell line, lipid distribution changes over the process. moreover, this distribution may vary significantly depending on the cell line: cl-1 and, to a lesser extent, cl-3 show an accumulation of triglycerides from day 6 to the end of the process, while cl-2 does not seem to follow this trend. conclusion interestingly, in some cell lines/experimental conditions, we highlighted an overproduction of triglycerides and cholesterol leading to the accumulation of lipid droplets, known as an energy storage sink. at the metabolic level, these findings suggest a relative overflow of the carbon metabolism. from a process development perspective, these findings can be considered on the one hand as a waste of resources, since the stored energy is not used for protein/biomass biosynthesis, and on the other hand as the root cause of additional process challenges, especially during the harvest and the first capture steps, given the hydrophobic nature of these molecules. implementation of lipidomics analysis enables us to highlight a new type of process variability and to anticipate potential problems for the downstream steps. the application of this methodology on our platform has helped us to design tailor-made solutions (pretreatment selection, filter selection, ...) at the clarification step, which are now implemented in our harvest development platform approach.
matriptase knock-out in cho cells prevents clipping of recombinant proteins. (a) serine protease inhibitors protect model mab from proteolytic degradation in cho-k1 cell derived conditioned medium. the model mab was incubated in conditioned medium for 0 h or 48 h at 37°c; subsequently, samples were analyzed by western blot. broad spectrum serine protease inhibitors (aprotinin, leupeptin) were added during incubation.
aprotinin and leupeptin inhibit proteolytic degradation. the intact mab (upper band) and the clipped mab (lower band) are indicated by arrows. (b) gene expression profiling of cho-k1 versus cho-a by ngs. shown is the gene expression profile of "secreted/shed members of the s1a trypsin-like serine protease family" for cho-k1 and cho-a cell lines using next generation sequencing. the gene expression analysis highlights that five proteases were expressed more than 1.5-fold higher in cho-k1 cells (labelled with a red asterisk). the y-axis shows the transcript abundance as rpkm (reads per kilobase of exon model per million mapped reads). (c) sirna knock-down identifies matriptase as major clipping protease and cho matriptase ko clone shows no detectable clipping activity. upper figure: sirnas directed against the five protease genes and scrambled (scr.) sirna were transfected and conditioned medium was collected three days after transfection. the model mab was incubated in fresh medium as control (first lane) and in conditioned medium from the sirna transfected cells. samples were analyzed by western blot. only sirna targeting matriptase (st14) showed reduced proteolytic degradation. the intact mab (upper band) and the clipped mab (lower band) are indicated by arrows. lower figure: the model mab was incubated for 48 h in conditioned medium collected from wt cho-k1 as well as the matriptase knockout clone. samples were analyzed by western blot. the intact mab (upper band) and the clipped mab (lower band) are indicated by arrows. no proteolytic degradation could be detected in the samples originating from the matriptase ko clone. (d) cell growth, viability and productivity in ambr (fed batch with temperature shift). cell growth, viability and volumetric productivity profiles of wt cho-k1 (red circles, n=2) and matriptase ko clone (blue squares, n=1) cultivated in 15-ml ambr.
no significant differences were seen between wt and matriptase ko clone regarding cell growth and viability. comparable or slightly higher productivity was detected for the matriptase ko clone compared to the wt. (e) significantly reduced proteolytic clipping with the matriptase ko clone. the model mab was stably expressed in cho-k1 (wt) as well as the cho-k1 matriptase ko clone. samples were analyzed by western blot. the intact mab (upper band) and the clipped mab (lower band) are indicated by arrows. significantly reduced proteolytic degradation could be detected in the samples originating from the matriptase ko clone (3 samples each are shown for wt and ko cells).
the glycosylation of therapeutic proteins is a critical quality attribute (cqa) and needs to be analyzed during cell line and bioprocess development. the current methods for analyzing glycosylation are mainly based on the enzymatic release of glycans. they are tedious and offer only limited throughput, which makes them unsuitable for cell line development work. in this study we evaluated a novel paia assay for measuring intact glycoproteins with capture beads and fluorescence labeled plant lectins to analyze glycans in a high throughput 384-well plate format. material and methods analytes: erbitux©, mabthera©, arzerra© and avastin©. two glycoengineered variants of one igg were kindly provided by merck (vevey, switzerland). all analytes were spiked into cho-k1 cell culture supernatant or buffer, diluted 1:1 with a denaturation solution and incubated at 65 or 70°c for 20 minutes to expose the fc glycans. erbitux samples were analyzed under denaturing conditions to detect fab and fc glycosylation and in native conditions for fab glycosylation only. 10 μl of pretreated sample was added to each well of the special 384-well paiaplate, containing labeled lectin and capture beads. the microplate was incubated for 45 minutes at 1800 rpm on an orbital shaker at room temperature and spun down at 500×g.
the read-out was done on a fluorescence microscope (synentec, elmshorn, germany) in less than five minutes. results figure 1a: lectin binding profiles of different iggs. the analysis of different iggs results in lectin binding profiles which show the different degrees of glycosylation. high abundance of a sugar leads to high binding rates of the lectin for the respective sugar. avastin has a very low degree of galactosylation and high mannose species compared to mabthera and arzerra (fig. 1a). only arzerra carries glycans with 2-6 linked sialic acids. these findings are in line with results from the literature [1]. figure 1b: distinction between fc and fab glycosylation in erbitux. without denaturation, only the fab glycans are detectable in erbitux. denaturation leads to additional exposure of the fc glycans and thus higher lectin binding rates compared to native erbitux. gna and npl only bind to denatured erbitux, indicating that the high mannose glycans are only present on the fc part. the equal sna binding rates for both conditions confirm that the 2-6 linked sialic acids are almost exclusively found on the fab part. this is in agreement with published data [2]. figure 1c: lectin binding rates correlate with the levels of galactosylation and fucosylation. increasing degrees of glycosylation in the mixtures of the glycan variants from merck lead to higher lectin binding rates for all galactose and fucose markers, proving that quantitative analysis can be performed with these assays. the cona lectin, which binds to the common core mannose glycan motif, remains at the same level, suggesting that the fc glycans were similarly exposed in all samples. the results demonstrate that paia assays are capable of quickly detecting differences in glycan patterns of different antibodies. in addition, it was shown that glycan variants of the same igg can be analyzed quantitatively. finally, we could confirm the differences in fab and fc glycosylation in erbitux.
we believe that bead-based assays with lectins have a great potential for monitoring product quality early in the development process. background gene-and cell therapy-based medicines are experiencing resurgence due to the introduction of "next generation" transfer viral vectors, which have demonstrated improved safety and efficacy. adeno associated viruses (aav) and lentiviruses are very commonly used in therapeutics and often produced by transient gene expression, using pei-mediated transient transfection in hek-293 or hek-293t cells [1] . the critical raw materials needed for cgmp vector production must be sourced from approved suppliers and should have gone through a rigorous testing program to reduce the risk of introducing adventitious agents into the production process. correspondingly, the pei transfection reagent must also be sourced from a qualified supplier, and have gone through rigorous testing to ensure reliable transfection efficiencies, and hence reproducible virus production yields. here, we present peipro® and peipro®-hq, the unique pei-based transfection reagents suitable for use in process development and in cgmp biomanufacturing, respectively. unlike commercially available peis, peipro® benefits from extensive research and development in polymer chemistry and formulation for mammalian cell transfection. we further demonstrate that peipro® and peipro®-hq are the reagents of choice for virus production runs in most cell culture systems, hence facilitating the transition from initial optimization during process development up to large-scale therapeutic viral vector production in adherent or suspension cells. manufacturing process of peipro® and peipro®-hq reagents. peipro® and peipro®-hq are fully synthetic reagents, free of any animal-origin components. 
in comparison to peipro®, a more extensive set of quality controls is performed on peipro®-hq to enable its use as a qualified raw material in gmp processes for the manufacturing of clinical batches of therapeutic products. lentivirus and aav production. irrespective of the cell culture vessel type, transfection using peipro® was performed following our recommendations. as an example, hek-293t (lentivirus) and hek-293 (aav) cells were thawed directly into each medium and passaged every 3 to 4 days before going into a 2 liter benchtop bioreactor. cells were resuspended and cultured for 3 days before transfection with peipro®. hek-293t cells were transfected with a third-generation system (four plasmids) for lentivirus production. hek-293 cells were co-transfected with three plasmids for aav production. lentiviral and aav titers were measured 48 and 72 hours post-transfection (data kindly provided by généthon). peipro® is the reagent of choice for virus production runs in most adherent and suspension cell culture systems from process development up to large scale clinical-grade virus production. irrespective of the cell culture-based system and production scale, peipro® and peipro®-hq have led to efficient viral vector yields in standard laboratory cell systems, such as flasks, cell factories and roller bottles, as well as in multilayer flasks or fixed-bed culture systems that take into account time and space concerns for the scaling-up process (table 1). for example, high viral vector yields above 10^7 ig/ml and 10^11-10^12 vg/ml were obtained for lentiviruses and aavs, respectively, in suspension hek-293t and hek-293 cells cultured in one of the commercially available synthetic cell culture media, balancd® hek293 (irvine scientific®). conclusion peipro® and its higher quality grade peipro®-hq are the unique pei suitable for efficient and reproducible production of therapeutic viral vectors.
efficient viral vector production yields can be achieved in most cell culture systems, irrespective of the production scale. with appropriate and advanced quality controls, the highest quality grade peipro®-hq is commercially available to accompany academics and biopharmaceutical companies in terms of qualified raw material for their gmp-grade viral vector production needs. b distinction between fc and fab glycosylation in erbitux. erbitux was diluted to a concentration of 200 μg/ml in tris buffer and measured in native and denatured conditions to distinguish fab glycosylation (native erbitux) from fab and fc glycosylation (denatured erbitux). it could be confirmed that sialic acids are almost exclusively present on the fab part of erbitux and that the high mannose glycans are only found in the fc part. c lectin binding rates correlate with the levels of galactosylation and fucosylation. two glycan variants samples of the same igg from merck were mixed in different ratios to yield glycosylation rates of 9 to 55% in terminal β-galactose and 8 to 100% in core-fucose, based on data from 2-ab uplc analysis. the mixtures all contained 0.5 μg igg per well. the measured lectin binding rates for all galactose and fucose markers correlate very well with the respective degree of glycosylation in the mixtures. all measurements were performed in triplicates (irvine scientific®) . hek-293t (lentivirus) and hek-293 (aav) cells were thawed directly into each medium and passaged every 3 to 4 days before going into a 2 liter benchtop bioreactor. cells were seeded and cultured for 3 days before being transfected by peipro® (polyplus). for transfection, four plasmids were used for lentivirus and three plasmids were used for aav. lentiviral and aav titer were measured 48 and 72 hours post transfection (data kindly provided by généthon) table 1 (abstract p-004). peipro®, the reagent of choice for virus production runs in most cell culture systems in both adherent and suspension cells. 
irrespective of the cell culture-based system and production scale, peipro® and peipro®-hq have led to efficient viral vector yields superior to 10 7 ig/ml and 10 9 vg/ml, respectively for lentiviruses and aavs background continuous perfusion process is making a comeback as a competing upstream manufacturing technology for the production of biopharmaceuticals compared to the standard fed batch processes. this is primarily because of cost advantages such as reduced capital cost and increased product yield. the change in status of perfusion process from older perfusion to the new-age perfusion is due to availability of better cell retention devices leading to more efficient processes, improved cell lines, cell culture medium capable of supporting high cell densities and better bioreactor control strategies. in this work, we present the advantages, limitations and challenges of fed batch and perfusion type of processes through case studies. table 1 results the perfusion run yielded 5-fold higher titer compared to fed batch run (fig. 1a) . considering the number of runs that could be executed in a manufacturing facility within the same calendar days, about 1fold increase in product output can be achieved with the perfusion process (fig. 1a) . this difference is attributed to higher ivcc, higher pcd and longevity of cells because of decreased level of toxic metabolite concentrations such as lactate and ammonia. case 2: understanding product retention in perfusion process the new-age perfusion processes utilize hollow fiber filters. this has been observed to cause retention of product within the bioreactor especially towards the end of the production run. 
two types of experiments were conducted to study the factors contributing to product retention: -spiking studies: -role of product titer: product was spiked into chemically defined media -role of different cell viability: different broths with varying viability spiked with same product titer -evaluation of different hollow fiber membrane (m1, m2 and m3) on product retention. from spiking studies, it was evident that cell debris and poor quality cell broth (lower viability) were the major factors contributing to product retention (fig. 1c) . from the different membranes experiments, it was identified that at pilot scale, m1 showed much higher retention from the first perfusion cycle itself and it increased to more than 75% towards the end of the batch. however, with m2 membrane, product retention started only late (after 50% of batch duration) and it remained low (~20-40%). on the contrary, this difference was not observed at 1l scale due to the usage of membranes with larger filter area (2-3 folds higher compared to pilot scale). when the filter area per unit volume of perfusate was decreased by half (m2_batch 4) for the pilot scale, even m2 showed retention profile similar to m1 (fig. 1d ). we presented data to show that perfusion process has 5-fold increase in product yield on a per-batch basis and a 3-fold increase when facility throughput is considered. product retention is a technical challenge that requires optimization (perfusion rates and filter membrane types). we believe it is imperative that labs that develop processes for biologics can now consider both perfusion and fedbatch based processes as both these technologies can now closely compete with each other. the choice of the process format going forward should now solely be dependent on the requirement for the biologic rather than the earlier perception that fed-batch is the preferred choice because of its simplicity. 
background chinese hamster ovary (cho) cell culture has been widely used for production of monoclonal antibodies in the pharmaceutical industry. previous studies have shown that the cell specific productivity in cho cells can be increased by glucose limitation [1] . introducing a productivity enhancing effect it is possible that this also affects the quality of the product such as glycosylation or other posttranslational modifications. in this work, we are focusing on the impact of glucose limitation and increased productivity on the product quality of a monoclonal antibody produced in a fed-batch cultivation of cho cells. materials and methods cho cells were cultivated both under limiting and non-limiting nutrient conditions in fed-batch. for fed-batch cultivation the reduced range for glucose concentration was chosen between 0.2 and 0.5 g/ l. reference cultivation was performed between 1.5 and 3.0 g/l. both cultures were fed with similar volumes of a complex nutrient supplement. all cultivations were performed in chemically-defined, animalcomponent free cho growth media (xell ag). viable cell density and viability were determined using the automated cell counting system cedex (roche diagnostics), glucose and lactate concentrations were detected via ysi (ysi life sciences). amino acid were quantified using hplc-fld, vitamins were quantified using reversed phase chromatography coupled to a triple quadrupol mass spectrometer (varian 320, selected reaction monitoring). amounts of igg1 were quantified via protein a hplc, mab purified from another cho cell clone was used as a standard. the analysis of product quality was performed by intact mass analysis using reversed phase chromatography coupled to a microotof-q ii mass spectrometer (bruker daltonik). the cho cell culture cultivated under low nutrient conditions reached a 54% higher viable cell density than the reference culture (fig. 1a) . the product titer was even increased by 109% (fig. 1b) . 
the spent media analysis shows that some amino acids and vitamins were present at presumably limiting concentrations after day 5/6, mostly in the low nutrient level culture (down to 40 to 190 μm for tyr, gln, arg, and asn, below 1 μm for pyridoxine, data not shown). the product quality showed significant changes for the changed feeding strategy ( fig. 1c and d) . as expected, the glycation level decreased from 3% to 1% compared to the reference culture. the truncation level of c-terminal lysine at the heavy chain of the mab increased from 79% to 88%. the glycosylation was also significantly influenced by the low nutrient level (fig. 1e) : the nonfucosylated variants increased from 3% to 6% (fig. 1f) , the degree of galactosylation increased from 31% to 39% (fig. 1g) . cultivation under low nutrient level led to 54% higher viable cell density and a product titer increased by 109% when compared to reference culture grown under non-limiting nutrient conditions. the analysis of product quality reveals 75% less glycation of light chain for cho cells grown under low nutrient conditions (0.7% vs 2.7% in reference culture). the truncation of c-terminal lysine decreased by 10% (from 88% to 79%), the degree of galactosylation increased by 23% (from 31% to 39%, also observed by takuma et al. [2] ) and non-fucosylated glycans increased by 105% (from 2.8% to 5.8%) under low nutrient conditions. the product quality analysis by intact mass proved to be highly robust (average cv for four replicates = 2%). in summary, cultivation with alternative feed led to higher igg product titer and better product quality (glycation unwanted, higher amount of non-fucosylated glycans leads to higher antibody-dependent cellmediated cytotoxicity (adcc), higher amount of galactosylation to higher complement-dependent cytotoxicity (cdc) and adcc [3, 4] ). background we developed an automated, multiwell plate (mwp) based screening system for suspension cell cultures (fig. 
1a) which is now routinely used in cell culture process development. it is characterized by a fully automated workflow with integrated analytical instrumentation. it uses shaken 6-24 well plates as bioreactors which can be run in batch and fed-batch mode with a capacity of up to 768 reactors in parallel [1] [2] [3] . a wide ranging analytical portfolio to monitor cell culture processes and also a cooperation with internal high throughput (ht) analytic groups to characterize product quality are available. in addition the use and the benefits of spectroscopic methods for cell culture automation were shown in the past [4, 5] . automated cell culture systems enable broader screening within a shorter time frame for many applications in upstream process development. the higher degree of parallelization and automation helps to screen for most promising parameters in a shorter time. the use of broad doe screening design allows in addition the identification of parameters that support high titers while keeping high product quality (multiple factors at the same time). the illustration (fig. 1b) shows an example how this combination can speed up process development steps. main applications of the cell culture automation are for example the identification of product quality levers and media or feed optimization. the application of the cell culture automation is shown for two examples. the goal in the first application was to identify levers to reduce trisulfides. by a screening of 39 conditions in parallel (in 4-fold replication, 158 wells in sum) the reduction of trisulfides by 97.5 % (normalized to start level) was possible. in addition the levers for trisulfide reduction were identified. the best and start conditions were verified in bioreactor scale (fig. 1c) . the goal in the second application was to increase product concentration without an impact on product quality. 
by a screening of 54 conditions in parallel (in 4-fold replication, 216 wells in sum) the increase of titer from 1.5 g/l to 3.7 g/l (> factor 2) was possible by media platform change and media optimization. an impact on product quality could not be shown. the best conditions were also verified in bioreactor scale (fig. 1d). the benefits of using cell culture automation in late stage process development were shown based on two examples of current applications. for this purpose the experimental results of the development work of two late stage projects using the in-house developed automated cell culture system were shown. the first example shows the capability of the automated cell culture system by reducing trisulfides significantly in just one experiment. for the other project the final product concentration could be increased by factor 2.5 by a media screening and changing to the in-house media platform. these two examples show the potential of cell culture automation as a routine tool in process development.

the cell line development process has become faster and is simultaneously generating more clone- and product-related analytical data. in order to select the best producer cell line, extremely heterogeneous data types need to be systematically compared. the timely availability of all data needed to decide which cell line to pursue has become a bottleneck in the cell line development workflow. to ensure sound decision making, new integrated workflow support and data analysis methods are needed. we have developed a new end-to-end platform for bioprocess development, which includes a cell line development workflow system supporting seeding, selection, passaging, analyzing, cryo-conservation, and processing in (micro-) bioreactors. this platform, genedata bioprocess™, enables partially or fully automated cell line selection and assessment processes, and it increases process efficiency and quality.
the system tracks the full history of all clones, from initial transfection all the way to their evaluation in bioreactor runs, and combines this information with analytics data on molecules, clones, and product quality. it can directly integrate with all instruments, such as pipetting robots, bioreactors, and bioanalyzers. the system is designed for a wide range of biologic molecules, including antibodies (iggs, novel formats) and other therapeutic proteins (e.g., fusion proteins).

figure caption (automated cell culture system): a schematic illustration of the automated cell culture system. only the core system is shown with a robotic plate handler as key device connecting cultivation, processing and analytical parts. b illustration of an example how cell culture automation can speed up process development steps. c application in the identification of product quality levers. d application in titer optimization.

highlighted use cases describe the identification of top producer cell lines, decision making support, bioreactor data management, and full clone history report documentation (fig. 1). genedata bioprocess, which was developed in collaboration with top pharmaceutical companies, can flexibly support various (non-linear) workflows and structure the collected information in a way that fosters collaboration across an organization. while increasing throughput is crucial to ensure the timely availability of optimal producer cell lines, high-throughput is only possible when automated processes in the laboratory and the resulting data collection and aggregation can be streamlined. genedata bioprocess helps to establish more productive processes by offering support and integration for automation stations and measurement devices. thanks to the comprehensive workflow support and the possibility to integrate results from cell line stability experiments, product quality assessment, and bioreactor suitability tests, genedata bioprocess provides a unique way to evaluate cell lines.
comprehensive analysis of all data collected in the process helps to ensure the highest possible quality and minimize the time and resources needed for data analysis and management. integration of bioreactor data analysis and visualization with other parameters measured in cell line development streamlines clone evaluation in micro-bioreactors and supports high-throughput operations. genedata bioprocess comprehensively tracks the full clone history from the origin of the host cell line to the generation of the validated monoclonal producer cell line. for promising clones, the clone history report can be generated with one click. besides supporting cell line development, genedata bioprocess is a comprehensive platform capable of tracking the complete bioprocess development process.

in 2012, 14.1 million people suffered from cancer [1], making it a major concern of our society. since common cancer treatment is limited and not effective for late stage carcinoma, alternative methods are needed to reduce the high mortality rate of cancer patients. one alternative approach is the application of the oncolytic measles virus (omv), because omv has a natural affinity for cancer cells. the major drawback of omv is that an extremely high amount of at least 10^11 tcid50 (50 % tissue culture infective dose) is needed per dose [2]. to solve this problem, a high titer process must be established including an efficient downstream processing (dsp). we developed an appropriate upstream processing and are able to produce 10^10 to 10^11 tcid50 ml^-1 in a bioreactor with 0.5 l working volume [3]. now, we focus on the dsp part. the following study tested the application of charged depth filters for the omv clarification. in contrast to common dsp schemes, a depletion of virus particles or a loss of infectivity is not desired. the aim is a reduction of protein content and dna with minimal loss of infective omv.
further, we investigated the influence of the cell culture medium on the depth filtration process. to explore the influence of the surrounding cell culture medium on the depth filtration performance, omv was either produced in serum-free medium (vp-sfm) or serum-containing medium (dmem + 10% fcs). the production was done in a str with a working volume of 0.5 l as described in [3] . cells and carriers were separated with an opticap xl 1-module (polygard-cr; 5 μm; merck). for the depth filtration millistak+ ce50 filters (merck) were used. the filter material was autoclaved and rinsed with 25 ml of 20 mm tris-hcl (ph=7.4). the virus suspension was filtered with a load of 50 l m -2 using a peristaltic pump (ism931c; ismatec) applying a flux of 150 l m -2 h -1 (fig. 1 ). samples were collected at the beginning and end of a filtration run. the omv titer (tcid 50 ml -1 ) of the samples was determined according to kärber and reed [4, 5] . protein content was measured with the pierce bca protein assay kit (thermofisher scientific) according to the manufacturer's instructions. dna was measured by a microtiter assay using quant-it picogreen dsdna reagent (thermofisher scientific) according to the manufacturer's instructions. we found that positively charged depth filters were suitable to clarify omv suspensions. the cell culture medium, in which the omv was produced, influenced the outcome of the depth filtration. a log reduction value (lrv) of 0.87 was determined for omv present in serum-containing medium (scm), whereas the titer of omv in serumfree medium (sfm) was reduced 1.63 log levels. this indicates that without serum in the surrounding liquid, omv will adsorb to the filter material. however, we must evaluate if the missing serum or other components present in sfm are responsible for this effect. total protein was not relevantly reduced by the clarification using charged depth filters. 
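the log reduction values reported above can be related to the fraction of infectious titer that survives the filter. the following is a minimal sketch, assuming the usual definition lrv = log10(titer before filtration / titer after filtration); the abstract does not state the formula, and the function names are hypothetical.

```python
import math

def log_reduction_value(titer_before, titer_after):
    """lrv under the assumed definition log10(titer_before / titer_after)."""
    if titer_before <= 0 or titer_after <= 0:
        raise ValueError("titers must be positive")
    return math.log10(titer_before / titer_after)

def fraction_remaining(lrv):
    """fraction of the starting titer that is still present after filtration."""
    return 10 ** (-lrv)

# lrv values taken from the text: 0.87 (scm) and 1.63 (sfm)
for label, lrv in (("scm", 0.87), ("sfm", 1.63)):
    print(f"{label}: {fraction_remaining(lrv) * 100:.1f} % of titer remaining")
```

under this assumed definition, an lrv of 0.87 corresponds to roughly 13 % of the infectious titer remaining, and an lrv of 1.63 to roughly 2 %.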
for omv present in scm, the residual protein content was slightly less compared to omv present in sfm (table 1). in contrast, host cell dna (hcdna) was bound to the filter material. we achieved a 33% reduction of hcdna for an omv suspension in sfm. after clarifying an omv suspension in sfm, the remaining hcdna content was even lower being only 42 %. conclusions charged depth filters are suitable for the first clarification step of omv downstream processing. residual protein could pass the depth filter almost unhindered, whereas the hcdna content was already reduced to 42% at maximum. however, the omv titer was also reduced by the depth filtration. this undesired effect was stronger for the omv present in sfm. because the agencies require avoiding serum in clinical-grade production processes, this is disadvantageous. nonetheless, because sfm will be soon standard for omv production, further experiments have to be done preventing the omv reduction during clarification. one option can be to reduce the adsorption strength of the virus to the filter material by the addition of salt. moreover, it is important to establish a standardized protocol for the upstream processing. we determined batch-to-batch variations within the clarification indicating a strong impact of upstream processing (usp) on the outcome of the dsp. therefore, further studies must investigate the influence of usp parameter e.g. time of harvest and ph of the harvest solution on the omv.

figure caption (genedata bioprocess): scheme of the complete cell line development workflow support in genedata bioprocess. showcasing integration of data from diverse measurement instruments, data visualization for decision making support as well as, tracking of full clone history.

fed-batch culture is commonly employed to maximize cell and product concentrations in upstream mammalian cell culture processes. typical standard platform processes rely on fixed-volume bolus feeding of concentrated feed supplements at regular intervals.
however, such static approaches might result in over-or underfeeding. to mimic more closely the dynamics of a fed-batch culture, we developed a dynamic feeding strategy responsive to the actual nutrient needs of a mab-producing recombinant cho cell line. results and discussion improvements made at different steps during fed-batch development are shown in fig. 1 . at step 1, all eight cell boost supplements were added to cdm4ns0 according to a doe approach, and batch cultures were performed. this evaluation allowed us to select only those cell boost supplements that were beneficial to the overall culture performance. non-performing cell boost supplements were removed and not considered further. at step 2, the selected cell boost supplements were added daily to the cultures at different ratios according to a doe approach, and fedbatch cultures were conducted. as expected, daily feed additions to replenish consumed nutrients substantially improved mab and peak cell concentrations as well as viable cumulative cell days (vccd) compared to batch cultivation. further, the results enabled us to fine-tune the feed ratio of selected cell boost supplements. at step 3, we further optimized the best performing feed ratio by investigation of static and dynamic feed protocols. most fed-batch protocols rely on constant feed additions on distinct days. however, these approaches often lead to substantial over-or underfeeding during bioprocessing. to improve such "static" protocols, we investigated three different "dynamic" approaches as shown in table 1 by applying the selected cell boost supplements with the optimized feed ratio. this investigation allowed us to further improve the bioprocess performance. the best performing approaches, constant and retrospective feed, were further investigated in fully automated bioreactors under controlled conditions. in general, constant cultivation parameters in the bioreactor slightly enhanced mab titers compared to shake flask cultivation. 
the retrospective feed strategy yielded 10% higher titers than the constant strategy. overall, the established methodology for fed-batch development allowed us to obtain 2.5× higher mab titers (batch mean: 1.9 g/l vs. fed-batch 4.9 g/l) in a short time and three simple steps. in addition, the product quality was investigated. compared to the legacy fedbatch process, fed-batches that were conducted with the newly selected basal and feed media altered the distribution of charge and glycan variants. the amount of aggregated product was not altered. the established methodology for fed-batch development is a rapid protocol to select well-performing feed supplements and optimize their ratio to the culture requirements. in three steps, mab titers were boosted 2.5x from 1.9 g/l to 4.9 g/l. product glycosylation and charge variants could be influenced by the newly selected basal and feed media compared to a legacy fed-batch process. the amount of aggregated product was not altered. the present study investigates the beneficial effect of spiking hyclone™ actipro™ basal medium with hyclone cell boost™ 7a and cell boost 7b feed supplements on growth and productivity of a recombinant cho cell line. to evaluate the impact of feed-spiking compared with cultivation in basal medium only, the cell line was grown in bioreactors under controlled conditions to determine cellspecific metabolic rates, nutrient consumption, and byproduct accumulation over the process time. transcriptome analysis of the cultivated cells, using microarrays on four consecutive days to investigate differential gene expression, revealed the beneficial effect of feed-spiking compared with cells grown in basal medium. model cell line was a mab-expressing cho dg44 (licensed from cellca gmbh) cultivated either in actipro basal medium only (ge healthcare) or in actipro basal medium feed-spiked with additional supplementation with 7% cell boost 7a and 0.7% cell boost 7b (ge healthcare). 
both cultures were grown in batch mode using dasgip™ cellferm-pro™ stirred-tank bioreactors (eppendorf). the beneficial effect of feed-spiking was analysed by transcriptome analysis using microarray technology (8×60 k design, agilent). both basal and feed-spiked processes lasted for seven days with viabilities above 95% until day 6. on day seven, a sharp decline in viability indicated the end of the batch process (fig. 1a) . in feed-spiked medium, cells initially grew slower but reached almost twice as high peak cell concentrations (17.6 × 10 6 c/ml) than in basal medium only (9.79 × 10 6 c/ml). remarkably, the integral of the viable cell concentration over the total process time (viable cumulative cell days [vccd] ) was similar between both process strategies (fig. 1c) . while mab production plateaued after day 4 in basal medium only (final titer 0.8 g/l), a continuous increase to three-fold higher final titers (2.4 g/l) was observed in feed-spiked medium (fig. 1b) . the higher titers could be attributed to generally higher cell-specific productivities (qp), which remained rather constant (~70 pg/cell/day) in feedspiked cultures. in basal medium, the qp continuously dropped by 20% (day 0 to 3), 50% (day 4), and > 90% (day 5 to 7) from 70 to 10 pg/cell/day in basal medium cultures. in average, the qp was 70% higher in feed-spiked cultures (fig. 1d ). transcriptome analysis of differentially expressed genes between cells grown in basal medium or feed-spiked medium were used to identify relevant go terms that indicated a more active proliferative state for feed-spiked cultures (data not shown). the top go terms significantly related to cell cycle and primary metabolism, cellular division, as well as nucleobase formation or regulation. furthermore, gsea revealed several significantly enriched set of genes related to gene transcription, dna replication and repair, cell growth and proliferation, as well as inhibition of apoptosis in feed-spiked cultures. 
thus, feed-spiking increased the proliferative activity of cultivated cells. several of the identified genes appear as promising targets for cell line engineering, but have not yet been described in relation to high-producing recombinant cell lines and will need to be evaluated in future studies. feed-spiking of basal medium is a convenient and easy way to considerably increase product concentrations in a simple batch culture. differential gene expression revealed genes that appear important for high cell-specific production rates, and this knowledge can be leveraged into cell line engineering approaches or the design of high producing cho cell media. in the latter case, a maximized supply of high biosimilarity must be demonstrated by physicochemical and functional characterization for approval requirements of phase i and phase iii studies in terms of efficacy, safety and immunogenicity. in this study, rounds of upstream and downstream processes were run to reach the cqa limits of the originator molecule. after conducting many different development strategies, the mirror plot images of the intact deconvoluted mass were found to be identical corresponding to similar levels of glycoforms. the uv chromatogram of reversed phase ultraperformance liquid chromatography (rp-uplc) of tryptic peptide mapping demonstrated that the primary structure of tur01 is identical to the originator as shown in fig. 1a . post-translational modifications (ptms) such as oxidation, deamidation, n-terminal pyroglutamic acid, c-terminal lysine truncation levels were also comparable for two products. the glycosylation site (hc-asn301) was confirmed by peptide mapping analysis and 100% glycan site occupancy was proven for tur01 and originator. the glycosylation pattern for two products were highly similar in terms of major glycans (g0f, g1f, g0f-gn and etc.). man5 level was lower in tur01 compared to the originator product which may not have any clinical effect on the molecule. 
the secondary structure was determined by atr-ftir spectroscopy. absorption bands (amide i and amide ii) were overlapped completely and amounts of α-helix and β-sheet structures were comparable. furthermore; size-exclusion chromatography (sec) analysis revealed that both products have the same level of purity (>99%) and aggregate (<1%) levels. the level of impurities were determined as below 4% by ce-sds. the capillary isoelectric focusing (cief) experiments showed that the charge variant profiles of two products are indistinguishable and the isoelectric point of main peak is observed at 8.3 for both products. the association/dissociation rate constants and binding affinity for both tur01 and originator were highly similar and similarity score was calculated greater than 99%, as shown in fig. 1b. in this study, state-of-art analytical techniques were used to assess the biosimilarity of tur01 to the originator adalimumab. head-to-head comparison data clearly demonstrated that tur01 is highly similar to the originator adalimumab in terms of physicochemical and functional characteristics. based on the analytical similarities, we believe that tur01 will have comparable pk/pd, potency, and efficacy results to the originator adalimumab.

figure caption (feed-spiking study): process performance of basal medium (black) and feed-spiked (red) bioreactor batch cultures: a cell concentrations and viability, b viable cumulative cell days and specific growth rate, and c antibody concentrations and cell-specific productivity. error bars indicate standard deviation from three independent experiments. the black arrows on day 4 indicate the beginning of decreasing cell-specific productivities and lower cell-specific growth rates in basal medium cultures.

the expanded interest in intensified continuous bioprocessing has highlighted the need to develop a small scale model for perfusion cell culture.
the direction in the industry has been to increase target cell densities to ≥50x10^6 vc/ml and decrease perfusion rates to ≤3 vvd. in order to increase the throughput of our perfusion media development capabilities we sought to develop a small scale model of perfusion using the ambr®15 instrument (sartorius, germany). we used a cell settling model, modified from the one previously published by kreye et al., to achieve the cell retention necessary to reach perfusion relevant viable cell concentrations [1]. in this work, we will show the application of this small scale model for: (1) identification of specific productivity performance over a steady-state for tested media, (2) identification of cspr min for a specific cell line and medium combination, and (3) confirmation of consistent product quality profiles between the small scale model and benchtop perfusion (data not shown). a chozn® cell line producing an igg1 was evaluated in several proprietary chemically defined media prototypes generated during the development of the catalog excell® advanced hd perfusion medium: "fed batch medium", "early prototype", "mid prototype", "intermediate prototype" and "late prototype" [2]. small scale perfusion simulation experiments were run in ambr®15. media exchange was performed 3 times per day in equal amounts. agitation, gassing, and liquid handling were stopped for an optimized period of time to allow cells to settle to the bottom. spent media was removed in an amount equal to 1/3 of the daily exchange volume. agitation, gassing, and liquid handling were resumed and fresh media was added back to the vessels. for benchtop perfusion, cells were inoculated in 3l applikon bioreactors (applikon, netherlands). at a concentration of ~6.0x10^6 vc/ml, perfusion was initiated using the atf2 (repligen, massachusetts). perfusion rate was limited at 1.2 vvd during steady-state. using the cell settling method described above we have been able to achieve ≥90% cell retention efficiency.
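the cell-specific perfusion rate (cspr) named above ties the perfusion rate to the viable cell density. the following is a minimal sketch, assuming the common definition cspr = perfusion rate / viable cell density, with 1 vvd taken as 1 ml of medium per ml of culture per day; the abstract does not spell out the formula, and the function name is hypothetical.

```python
PL_PER_ML = 1e9  # 1 ml = 1e9 pl

def cspr_pl_per_cell_day(perfusion_rate_vvd, viable_cells_per_ml):
    """cspr in pl/cell/day under the assumed definition rate / cell density."""
    if viable_cells_per_ml <= 0:
        raise ValueError("viable cell density must be positive")
    # vvd = ml medium per ml culture per day; dividing by cells/ml gives ml/cell/day
    return perfusion_rate_vvd / viable_cells_per_ml * PL_PER_ML

# illustrative inputs: the industry targets quoted in the text (3 vvd, 50e6 vc/ml)
print(round(cspr_pl_per_cell_day(3.0, 50e6), 1))
# illustrative inputs: 1 vvd at 30e6 vc/ml
print(round(cspr_pl_per_cell_day(1.0, 30e6), 1))
```

with these inputs, 3 vvd at 50x10^6 vc/ml gives 60 pl/cell/day and 1 vvd at 30x10^6 vc/ml gives about 33.3 pl/cell/day, so lowering the perfusion rate at a fixed cell density lowers the cspr proportionally.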
all media tested in this work were able to reach and maintain the 30x10 6 vc/ml target cell density at 1vvd (fig. 1) . performance of each media formulation was ranked based on specific productivity (table 1) . using "intermediate prototype", minimum steady-state cspr was determined to be 33.3pl/c/d for this cell line. n-glycan analysis of ambr®15 and bioreactor samples via intact mass spectrometry displayed only slight differences in product quality profile (data not shown). our work has shown a clear distinction between various prototype perfusion media and demonstrated a 50% increase in specific productivity over "fed batch medium" used in perfusion. additionally, we have shown the application to further characterize the process using this model to determine cspr min for a given medium and cell line. the transient process has been successfully operated at 500l in a sartorius biostat single use bioreactor (sub), yielding 0.4kg of crude product from a two-week expression culture (table 1) . successful scale up of the process to 500l creates the potential to supply transiently expressed products to support toxicology studies or even early gmp clinical supply, enabling accelerated biopharmaceutical development project timelines. the scale up from rocking bioreactors (rbr) to sub scale identified some scalability issues. lower specific productivity due to increased cell growth and decreased titres were observed in the sub ( fig. 1 iii & iv). to improve the predictability of scale up, a new process was developed and evaluated in the sub vessels utilising a modified transfection method, which resulted in comparable expression levels and specific productivity between rbr and sub scales. two sets of expression vectors comprising heavy chain and light chain plasmids expressing a human igg1 kappa mab, as previously described [1, 2] were used in the process optimisation study. 
the cell line used for transient expression and the pei mediated transfection method has been described previously [1] . transfected cultures were run under fed batch conditions for 14 days in 22l ge healthcare wave bioreactors (rbr), hyclone sub using 50l and 250l hyclone bioreactor bags (thermo scientific). the transfection process was modified to address the reduced titres and higher viable cell density (vcd) seen in the sub cultures. shake flask cultures were used to assess the standard (a) and modified transfection processes (b and c) (fig. 1 , i & ii). process c was identified as the process to be studied at sub scale, offering the potential to mitigate the high viable cell densities (vcd) observed. scaling up process c to 50l and 250l sub resulted in cultures producing titres exceeding 1g/l with desired cell growth profiles. scale up of process a into sub vessels resulted in decreased productivity compared to the rbr scale. after optimisation, the sub process c yielded increased specific productivities and expression titres comparable to those seen at rbr scale (table 1) . medimmune has successfully completed the first known successful cho transient culture at 500l scale producing > 800mg/l of mab at harvest. process optimisation has subsequently demonstrated reproducible titres at 50l to 250l scale exceeding 1g/l with comparable glycosylation profiles between sub and rbr cultures across scales. comprehensive analysis of the impact of trace elements in media on clone dependent process performance and product quality background state-of-the-art biopharmaceutical processes are accounting concomitantly for process performance and product quality. even though high yielding, robust processes are the cornerstones of any process development, product quality parameters such as structural integrity, charge variances and post-translational modifications are progressively becoming the focus of the developmental work. 
in conjunction with host cell line selection and process performance parameters, media components are crucial for continued progress in the rational modulation of product quality attributes affecting biological activity, immunogenicity, half-life or stability. among media components, trace elements (te) are of particular interest as they play a pivotal role in various cell metabolism pathways. based on a comprehensive doe approach and an extensive process performance and product quality evaluation combined with metabolic flux analysis, the impact of several trace elements on the biopharmaceutical process is assessed. in a comprehensive i-optimal doe approach ( fig. 1) , the effect of six te at various concentration levels and in various combinations in serum-free media was studied for four different cho-k1 cell lines in an ambr® 15 setup. process performance parameters such as cellular growth, productivity, and amino acid and vitamin consumption rates were examined for each of the conditions. the process performance evaluation was accompanied by extensive product quality analysis including size and charge variants, glycosylation patterns, oxidation and methylation. furthermore, a metabolic flux analysis was performed based on the nitrogen balance. based on extensive analytical data, the obtained response surface model provides a clear insight into the impact of particular te and their combinations on process performance and product quality. the high model quality enables discrimination between clone dependent and clone independent effects. an elevation in titer of up to 25% in the best condition clearly shows that even state-of-the-art media can be outperformed by trace element rebalancing. analyzing specific rates in combination with metabolic flux analysis improves the understanding of metabolic restructuring of the cell lines under distinct te levels and combinations. 
modulation of trace element levels had a pronounced effect on the charge heterogeneity and glycosylation structure of the different proteins. this provides a toolbox for the fine tuning and control of product quality parameters. taken together, the data further pave the way to the rational fine tuning of process performance and product quality attributes. background due to regulatory concerns and economic impact, ensuring product quality and consistency is now one of the main challenges faced by the biopharmaceutical industry. for monoclonal antibodies (mab), glycosylation is one of the most important quality attributes as it impacts mab structural integrity, and ultimately both clinical efficacy and safety. many factors affect mab glycosylation and its inherent heterogeneity, including the host cell, the culture medium, the mode of operation and the operating conditions. in this context, the capacity to monitor and control antibody glycosylation on-line, from early- to late-stage process development, would be of salient interest to reduce the time and cost to market. in order to address this unmet need, we have designed an improved spr biosensor assay to measure the kinetics of interaction between a mab and the extracellular domain of the fcγriiia receptor bound at the biosensor surface [1, 2] . we also demonstrated that various binding kinetic signatures, especially different dissociation kinetics, could be correlated with distinct mab glycosylation patterns and with therapeutic efficacies, as deduced from mass spectrometry and a surrogate adcc assay, respectively. in parallel, we have also harnessed an spr biosensor directly to a bioreactor, which permitted the at-line determination of the concentration of antibodies produced by hybridoma cells during a bioreactor culture. we now plan on combining both approaches to determine on-line the glycosylation profile of the produced mabs. 
our ultimate goal is to design a unique and highly innovative bioprocess control tool that can be readily applied in an industrial bio-manufacturing setting. reducing timelines and costs is a key factor for bio-pharmaceutical industries to accelerate process development and drug delivery to patients. enhancing throughput of bioprocess development has become increasingly important for the screening and optimization of cell culture processes. this challenge requires high throughput tools. in a previous study [1] , we showed that ambr® 15, a robotically driven mini-bioreactor system developed by tap-sartorius, could be advantageous to accelerate process development. the use of the ambr® 15 system allows us to test a large number of experimental conditions in a single experiment. consequently, the number of production samples to be characterized for product quality attributes (pqa) increases as well: the bottleneck has moved from the generation of samples at the production bioreactor step to in-process analysis. for product quality attribute analysis at lab scale, protein purification is generally carried out on >5ml columns, which is incompatible with the size of ambr® 15 bioreactors. moreover, the applied methods are relatively low throughput. the development of a high binding capacity resin (up to 70 mg/ml) [2] , combined with high performing new cell lines which are able to produce up to 5 g/l of recombinant monoclonal antibodies, requires the development of an efficient, high throughput (hts) robotic purification method. the use of robotic equipment for small scale purification purposes is a great opportunity for us to tackle this bottleneck, by enabling high-throughput sample purification at smaller scale (200μl). recombinant monoclonal antibodies were produced by a genetically engineered dihydrofolate reductase (dhfr)-/-dg44 chinese hamster ovary (cho) cell line. 
clarified cell culture fluid (cccf) was obtained from 2 l and 2000 l bioreactors after three filtration steps. minipurifications were performed on a tecan freedom evo® robot with predictor robocolumns® containing 200μl mabselect sure® resin. larger scale purifications were executed using an aktaxpress with a hitrap prota column. to assess monoclonal antibody purification at small scale, we first tested the repeatability of the minipurification, purifying the samples 8 times on the same columns and using different columns, focusing on the yield of the purification and the impact on product quality attributes, especially the hmws. then, we compared those results to those obtained with the aktaxpress at larger scale, comparing the yield of the purification and the pqa of the protein-a eluates obtained with both purification systems. finally, we assessed the capability of the robot to perform hts of buffer and purification conditions, evaluating three different buffers at different concentrations and ph values, and also testing different column loading capacities. in this study we established that the tecan can be used as a robust platform for purification at small scale. we observed similar purification yields, intra and inter run. the analysis of the pqa1a level also showed very high reproducibility, and the ph of the eluate showed strong comparability as well. table 1 shows that the coefficients of variation (cv) of the yield, hmws and eluate ph are low, demonstrating the good reproducibility of the purification. the strong reproducibility obtained between the different purifications showed that the tecan and the aktaxpress are similar in terms of purification performance and pqa (fig. 1a, b) . the tecan is a versatile automated liquid handler allowing the screening of a large number of purification conditions (fig. 1c) and the purification of large numbers of samples even when the sample amount is limited. 
the tecan has the potential to purify more than 150 samples/day, reducing timelines and allowing us to deliver faster to the patients. viable cell density monitoring in bioreactor with lensless imaging geoffrey monitoring cell density and viability of mammalian cell culture bioreactors is a necessary task that still presents a number of challenges. the traditional measurement of bioreactor cell count and viability relies on the trypan blue exclusion method, performed once a day. while automatic cell counters have reduced manual counting error, sampling the bioreactor remains a contamination risk and hinders process control as the sampled volume becomes significant. lensless imaging technology is a new method for accurately determining cell concentration and viability without staining. this technique directly acquires the light diffraction properties of each individual cell through its hologram image, without any objective, lens or focus settings. living and dead cells have distinct holographic patterns that can be distinguished and precisely counted. we compare cell counts and viability between the reference method and our lensless imaging device, the cytonote counter. measurements are performed once a day on samples from 12 bioreactors, from inoculation to the end of the culture. we also assessed the repeatability of our method. another lensless imaging prototype is set up as a measurement chamber directly connected to a perfusion bioreactor, continuously receiving the bioreactor broth and thereby reproducing an in situ measurement. with a concentration range up to 40x10^6 cells/ml ( fig. 
1) and viability range of 75-100%, we obtained a correlation factor of 0.98 between the two compared methods. the large field of view allows the analysis of several thousand cells within a single image, keeping the statistical variability of the measurement as low as 3%. our measurement chamber prototype has demonstrated its capability for continuous viable cell density and viability monitoring. we are now working on designing a steam sterilizable probe, and we envision lensless imaging becoming the future method of choice for on-line monitoring of suspension cell cultures. lensless imaging technology is capable of accurately and precisely monitoring viable cell density and viability, with a combination of significant advantages ranging from low sample volume, label free detection, quick measurement and a simple device to the high number of cells analyzed, which leads us to think that it is a good candidate for very small scale bioreactors and high-throughput measurements. its high repeatability is also a key parameter in the effort to narrow batch to batch deviations. in addition we demonstrate that this technique is potentially powerful for in-line and continuous monitoring of a lab bioreactor. we envision lensless imaging becoming the future method of choice for on-line and in-situ monitoring of suspension cells and a perfect tool for process control in fed-batch or perfusion mode in single-use bioreactors or traditional steam sterilized vessels. it can certainly become the first vcd measurement technique to work from cell line engineering, to process development, pilot scale, and up to manufacturing scale. [figure caption from the preceding purification abstract: comparison between both purification systems and the ability of the system to be used as a high throughput tool for buffer screening. a purification yield (%). b pqa1.a (normalized). c impact of the ph and buffer concentration on the pqa1.a] 
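the link between the number of cells analyzed per image and the statistical variability of a count can be illustrated with a simple counting-statistics sketch. it assumes that cell counts follow poisson statistics, so that the coefficient of variation is 1/sqrt(n); this assumption is ours and is not stated in the abstract, and the reported 3% figure may include other sources of variability (sampling, segmentation, classification).

```python
import math


def poisson_cv(n_cells):
    """Relative counting error (CV) for n_cells counted objects,
    assuming Poisson statistics (an assumption, not the authors' model)."""
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    return 1.0 / math.sqrt(n_cells)


def cells_needed(target_cv):
    """Smallest number of counted cells whose Poisson CV is at or below target_cv."""
    if target_cv <= 0:
        raise ValueError("target_cv must be positive")
    return math.ceil(1.0 / target_cv ** 2)


# Under this assumption, a 3% counting CV requires a little over 1,100 cells,
# so an image containing several thousand cells would sit below that level.
n_for_3_percent = cells_needed(0.03)
cv_for_3000 = poisson_cv(3000)
```

under this simplified model, counting a few hundred cells (as in a typical manual count) would give a cv of several percent, which is why a large field of view helps.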
time-dependent product heterogeneity in mammalian cell fermentation processes background a consistent product quality is a major goal in the production of biotherapeutics, especially recombinant glycoproteins. whereas it is unlikely that the polypeptide chain changes during a production process, posttranslational modifications and protein folding are sensitive to fluctuations in parameters and conditions. here we focus on protein glycosylation as one important indication for product quality [1] . during a batch process conditions change continuously. at the beginning, the supply situation for the cell is excellent, but the secreted material stays a long time in the culture fluid. later during cultivation substrate provision decreases, whereas the exposition time of the protein to the culture fluid is much shorter. altogether this leads to product heterogeneity of the secreted protein during a batch culture. four different cell lines, two of human origin and two cho clones, producing four different recombinant glycoproteins were investigated in this study. together with their respective parental cell line the clones were cultured in three replicates in shakers. supernatant from the cultures were harvested at four time points. the removed culture volume was replaced by culture supernatant of the identically cultured corresponding parental cell line. the product was isolated from the supernatant and the glycans were released. one part of the released glycans was labeled with 2-ab and separated by hilic-fld. the other part of the glycans was permethylated and analyzed by maldi-tof mass spectrometry (fig. 1a) . the investigated proteins were antibody, antithrombin iii from cho clones and α1-antitrypsin, c1-inhibitor from human clones. the antennarity of the glycans is quite stable in all production phases. the degree of core fucosylation is very high in all products. a low fucosylation degree of antibodies may be favorable for a higher adcc performance [2] . 
some of the products showed an antennary fucosylation, which did not seem to change much between cultivation phases. nevertheless, this might be an issue due to an antigenic impact in the patient. the antennary galactosylation changes noticeably for the antibody and α1-antitrypsin. in both cases the degree is highest in the first phase. an incomplete galactosylation leads to truncated glycans. this inevitably leads to undersialylated antennae, as seen for α1-antitrypsin. the sialylation is highest in the early phases and decreases during cultivation time. sialylation of a therapeutic protein is important for its half-life in the patient. therefore highly sialylated products are desired [3] . in further studies the consistency of the galactosylation and the sialylation will be investigated for fed batch and long term continuous cultures in comparison to batch cultures. due to the feed solution or the fresh media being present during such processes, the supply situation should be excellent for the whole cultivation time. the differences between the maldi-tof and hilic-fld data originate from complex and unresolved chromatograms (fig. 1b , chromatograms not shown). for that reason coupling of hilic-fld and ms is strongly recommended. background novel biologics are often selected from a large library of lead candidates in the initial stage of preclinical and clinical developments. for this selection, there is a demand for high-throughput production of recombinant proteins of high quality and in sufficient quantity. transient gene expression offers a rapid approach to the production of numerous recombinant proteins for the initial-stage developments of biologics. mammalian cells are major host cells for transient gene expression, but they have the disadvantages of complicated operations and high cost of culture. insect cells are easy to handle and can be grown to a high cell density in suspension with a serum-free medium. 
insect cells can also produce large amount of recombinant proteins through post-translational processing and modifications of higher eukaryotes. hence, insect cells have been recognized as an excellent platform for the production of functional recombinant proteins [1, 2] . in the present study, the production of an antibody fab fragment through transient gene expression in lepidopteran insect cells was examined. the dna fragments encoding the heavy chain (hc) and light chain (lc) genes of an fab fragment of mouse anti-bovine rnasea [3] were respectively cloned into the plasmid vector pihaneo, which contained the bombyx mori actin promoter downstream of the b. mori nucleopolyhedrovirus (bmnpv) ie-1 transactivator and the bmnpv hr3 enhancer for high-level expression [4] . trichoplusia ni bti-tn-5b1-4 (high five) cells were co-transfected with the resultant plasmid vectors using linear polyethyleneimine (pei; mw 40,000). before transfection, the plasmids and pei were prepared in 150 mm nacl, ph 7.0 and incubated at room temperature for 5 min. when the transfection efficiency was checked, a plasmid vector encoding the enhanced green fluorescent protein (egfp) gene was also co-transfected. transfected cells were incubated with a serum-free medium in a static or shake-flask culture. culture supernatants were analysed by western blotting and enzyme-linked immunosorbent assay (elisa). the numbers of green fluorescent cells and total cells in culture broth was determined using a flow cytometer. western blot analysis and elisa of culture supernatants showed that transfected high five cells secreted the fab fragment with antigenbinding activity. in static cultures, transfection and culture conditions, such as hc:lc gene ratio, a serum-free medium, dna:pei ratio, and dna amount per cell, were successfully optimized by flow cytometry of egfp expression in transfected cells and the yield of the secreted fab fragment measured by elisa. 
the effects of culture temperature and initial cell density were also examined by comparing the cell growth and the production of fab fragments in shake-flask cultures. under optimal conditions (medium, psfm-j1 (wako pure chemical industries, japan); hc:lc gene ratio, 3:7; dna, 5 μg/(10 6 cells); pei, 10 μg/(10 6 cells); initial cell density, 1 x 10 6 cells/cm 3 ; temperature, 24°c), the yield of more than 100 mg/l of fab fragment was achieved in 5 days in a shake-flask culture ( fig. 1) . transfection did not significantly affect the growth of high five cells as compared with untransfected cells. transient gene expression using insect cells may offer a promising approach to high-throughput production of candidate proteins for the development of biologics. the increasing demand for biopharmaceuticals produced in mammalian cells has led industries to increase volumetric productivity of bioprocesses through different strategies [1, 2, 3] . in this context, fedbatch and perfusion cultures have attracted more interest than conventional batch processes. the efficient application of such alternative processes requires the availability of reliable on-line measuring tools for cell density and cell metabolic activity estimation [4] . the comparison of different culture strategies for hek293 cell line producing ifn-γ are presented below: batch, fortified batch and fed-batch. in this context, a new robust feeding strategy based on the monitoring of alkali buffer addition was applied for the estimation of nutrient requirements. this method allows to increase cell density and product titer compared with the other strategies assessed. three different culture strategies were carried out in 2-litre biostat b-dcu ii bioreactor. first, a reference batch and a batch using fortified medium (nutrient enriched medium) were run and assessed in terms of viable cell density (vcd) and product titer, and set as initial references. 
then, a fed-batch was performed applying a feeding strategy based on the estimation of nutrient requirements by monitoring the alkali buffer addition used for the control of ph. results vcd and product titer achieved for the different culture strategies assessed (batch, fortified-batch and fed-batch) are presented in table 1. in fortified batch, an increase of 145% in vcd and of 350% in product titer was obtained compared with batch. in the fed-batch culture carried out (fig. 1) , we observed that the alkali buffer addition profile matched the vcd evolution trend. thus, the monitoring of alkali buffer addition was used for estimating the nutrient requirements (i.e. the volume of feeding medium) at any time during the fed batch phase. the feeding strategy based on alkali buffer addition made it possible to maintain the glucose concentration within a narrow range around the set point during the fed-batch phase (around 20 mm). as a result, a higher vcd (16.6·10^6 cells/ml) was obtained when compared with both batch references: vcd was enhanced by 241% and 39%, and product titer increased by up to 381% and 7%, with respect to batch and fortified batch, respectively. the results show that the fed-batch strategy based on alkali buffer addition is a robust on-line monitoring method that makes it possible to optimize the feeding strategy in fed-batch cultures. three different culture strategies have been tested in bioreactor with a hek293 cell line producing ifn-γ. results show that the higher the vcd reached, the higher the product concentration achieved. therefore, from a bioprocess development point of view, it is very interesting to implement strategies with higher vcd outcome, such as fed-batch operation mode. in this context, a new robust method for vcd estimation in fed-batch was applied. the alkali buffer addition necessary for maintaining the ph set-point is a reliable and easily measured on-line variable that provides information about by-product formation (mainly lactic acid). 
the monitoring of this variable can provide information about the cell concentration, activity and metabolism, to detect changes in culture. besides that, a relationship between alkali buffer addition and vcd can be established since the first is strongly correlated with cell growth and metabolites consumption/formation. the application of alkali buffer addition measure to implement an optimal feeding strategy in fed-batch permits to enhance vcd and product titer when comparing with batch strategies. a novel approach to high throughput screening for perfusion background perfusion systems for suspended mammalian cells raise growing interest in the biomanufacturing industry. continuous manufacturing is growing in the field and is encouraged by health authorities [1, 2, 3] . this work addresses scale down limitations inherent to continuous media exchange and cell retention by using a semi-continuous system. data was generated with a set of different clones that were previously studied in fed-batch mode [4] . materials and methods 4 cho-k1 cell lines expressing the same monoclonal antibody (mab) and issued from the same transfection were used as models. 3.5l bioreactors (sartorius) were used for fed-batch and perfusion production runs. the perfusion bioreactors were run using an alternating tangential flow filtration device (repligen, xcell™ atf 2 system). the cell biomass was controlled by removing cells through a bleed line and was controlled using a biocapacitance probe (hamilton, incyte). the perfusion rate (d) was fixed to one vessel volume a day (vvd -1 ). the semi-continuous runs were made in 50 ml shake tube (tpp, tubespin® bioreactor 50). once a day, the tubes were centrifuged (5 min, 200 g), the supernatant removed (to mimic a perfusion rate of 1 vvd -1 ), replaced with fresh media and cells were re-suspended. the clone's growth potential were preserved across the systems (fig. 1 ). 
clone #3 always reached the highest viable cell density (vcd), followed by clone #1. clone #2 and #4 showed similar growth characteristics. it is interesting to note that in the perfusion bioreactor different patterns in terms of vcd were observed although the cell biomass signals were similar for all 4 runs. this reflects the fact that the capacitance measures the biomass and not the absolute cell count [5] . to estimate the minimum cell specific perfusion rate (cspr min ) in the semi-continuous experiment, the perfusion rate was divided by the maximum viable cell density (vcd max ). this value was compared to the cspr obtained during the 4 th set-point (sp4) of the perfusion runs. as expected, the bleed fraction decreased when the capacitance set-point was increased and went down to 5% or less of the total perfusion rate (data not shown). since the bleed removes the excess biomass, it is an indication of how close to a limitation the system is. therefore, the cspr calculated at sp4 was considered as the minimum cspr. the cspr min obtained in both systems were very close ( table 1 ). the semi-continuous system can therefore be used to identify the cspr min before running a continuous bioreactor, it therefore facilitates the decision making early in the development (to define the target cell density for a defined perfusion rate). the specific productivity (q p ) of the 4 clones was quantified at the maximum vcd (semi-continuous) or at sp4 (perfusion). absolute values are not representative since the cell environment is so different in both systems. nevertheless, a relative ranking proved to be indicative of the respective performances ( table 1 ). the maximum cell growth in fed-batch, semi-continuous and perfusion were also compared, their ranking was always preserved. both indications can be used to assess a performance ranking for different clones. 
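the cspr min estimate described above (perfusion rate divided by the maximum viable cell density) can be written out as a short calculation. this is a minimal sketch with hypothetical function names; the only numbers used are ones quoted in this collection (an earlier abstract reports 33.3 pl/c/d for a culture held at 30x10^6 vc/ml and 1 vvd), and 1 ml is converted to 10^9 pl.

```python
PL_PER_ML = 1e9  # 1 mL = 1e-3 L and 1 pL = 1e-12 L


def cspr_pl_per_cell_day(perfusion_rate_vvd, vcd_cells_per_ml):
    """Cell specific perfusion rate in pL/cell/day.

    perfusion_rate_vvd -- vessel volumes exchanged per day (mL medium per mL culture per day)
    vcd_cells_per_ml   -- viable cell density (use VCD max to estimate CSPR min)
    """
    if vcd_cells_per_ml <= 0:
        raise ValueError("vcd_cells_per_ml must be positive")
    return perfusion_rate_vvd / vcd_cells_per_ml * PL_PER_ML


def target_vcd_cells_per_ml(perfusion_rate_vvd, cspr_min_pl):
    """Highest cell density a given perfusion rate can sustain at CSPR min."""
    if cspr_min_pl <= 0:
        raise ValueError("cspr_min_pl must be positive")
    return perfusion_rate_vvd * PL_PER_ML / cspr_min_pl


# 1 vvd at 30e6 viable cells/mL corresponds to about 33.3 pL/cell/day.
cspr = cspr_pl_per_cell_day(1.0, 30e6)
```

the second function inverts the same relationship, which is how a cspr min obtained in the shake tubes could be used to choose a target cell density for a defined perfusion rate.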
the performance of 4 clones was studied in 3 different cultivation systems: fed-batch/perfusion bioreactors and semi-continuous shake tube. the semi-continuous system was able to precisely predict the cspr min , an important process parameter for perfusion. specific productivity and maximum cell density ranking was preserved across the systems, therefore the scale down experiment can be used to assess a performance ranking for perfusion clone screening. modulating antibody galactosylation through cell culture medium for improved function and product quality jenny y. bang, james-kevin y. tan the production of therapeutic antibodies (abs) requires high titers and excellent product quality to ensure efficient manufacturing and potent drug efficacy. glycosylation, or the attachment of sugars to organic molecules, is a critical quality aspect that can significantly alter ab binding, function, and therapeutic effect [1] . galactose is a key sugar of interest due to its significant impact on ab function and the ability to control galactosylation through cell culture medium. herein, irvine scientific assessed the ability of media components to modulate galactose levels on a model therapeutic ab. various media compositions were able to modulate galactosylation levels without compromising cell growth and ab titers. in addition, an in vitro assay was utilized to evaluate the functional ability of abs to bind and activate complement-dependent cytotoxicity (cdc). differences in galactosylation significantly altered the abs' ability to induce cell cytotoxicity. furthermore, design of experiment analysis determined the optimal ratio of supplements to maximize galactosylation. this "optimized supplement" was verified and evaluated against other suppliers' galactosylation supplements in terms of growth, titer, glycan analysis, and ab function. 
the optimized supplement outperformed all other suppliers' supplements and resulted in the best overall cell growth, titer, galactosylation, and ab function. ab against cd20 were grown in balancd® cho growth a and were fed with balancd® cho feed 4 on days 3-7 of the cultures. viable cell density and cell viability were assessed by a beckman coulter vi-cell xr, ab titer was assessed by a pall fortébio qk e , and glycan analysis was assessed by a perkinelmer labchip gxii. for the functional cdc assay, abs were incubated with daudi b lymphoblast cells and normal human complement serum. cell cytotoxicity was assessed with a promega cytotox-glo kit. various supplements were evaluated in fed-batch cultures and resulted in 15-45% ab galactosylation without compromising cell growth and ab titers. design of experiment analysis determined an optimal composition, deemed "optimized supplement," which was evaluated against a panel of galactose-modulating supplements from other suppliers. the optimized supplement resulted in a similar viable cell density (vcd) and cell viability compared to the fed-batch culture control which had no supplements (fig. 1a) . supplements from supplier 1 resulted in similar to half the vcd of the control while supplements from supplier 2 resulted in very low vcd and percent viability. all of the supplements except those from supplier 2 resulted in ab titers similar to the control (fig. 1b) . due to the poor growth and subsequently low titer from supplier 2's supplement, supplier 2 was not further evaluated. the glycan profiles were analyzed and are presented in (fig. 1c ). all the evaluated supplements were able to raise galactosylation; however, only the optimized supplement and the 2x supplier 1 supplement resulted in over 40% galactosylation. the function of the abs was further evaluated in a cdc assay (fig. 1d) . 
abs from the optimized supplement were more effective than the control abs and had a significantly lower half-maximal effective concentration (ec 50 , 1.19 μg/ml) than the control (1.71 μg/ml). abs from the 2x supplier 1 supplement had a similar ec 50 to the control, which may be due to the higher man5% of these abs. an optimized supplement was produced through fed-batch evaluation and design of experiment analysis. the optimized supplement outperformed all supplements from other suppliers and resulted in the best overall cell growth, glycan profile, and functional ab activity (table 1) . industry practice for mammalian cell culture media and feed development typically employs a high-throughput screening (hts) platform along with large sets of experiments [1] . modern hts systems often include robotic liquid handlers to replace labor intensive steps. to align with advancements in the field, a semi-automated hts platform was developed to facilitate in-house media and feed development for early stage biologics projects. selecting appropriate instruments and integrating them into a seamless system are the keys to an hts platform. the developed hts platform uses 24 deep well plates (dwps) for culture vessels, the liquid handler of the advanced microscale bioreactor (ambr15) for media/feed formulation preparation in an aseptic environment, the nyone cell imager for viability and cell growth analysis, tecan freedom evo's liquid handler for activity assay sample preparation, and the cedex bioht for high-throughput metabolite analysis. 24 dwps offer cell growth comparable to shake flasks and a layout compatible with the ambr15, which makes the 24 dwp an ideal candidate for the platform. in addition, the user friendly design of experiments (doe) interface and liquid handler function of the ambr15 expedite the formulation preparation of varying doe conditions [2] . 
a macro program was written in excel to enable the easy import of doe designs from major statistics software packages, such as jmp and simca, into the ambr15. performance qualification of each component was performed prior to implementing the hts platform. comparable cell growth profiles and productivity were achieved between shake flasks and 24 dwps (fig. 1a ), indicating a comparable cell culture environment for the cells. cell counts using the nyone gave the same cell growth ranking as the traditional count from the vi-cell xr (fig. 1b) . freedom evo's liquid handler was optimized to produce activity results comparable to manual operation while expediting the sample preparation with improved consistency (fig. 1c) . finally, implementing the liquid handler function of the ambr15 to support media and feed formulation significantly reduced the labor for each experiment. a summary of the capability comparison between the hts platform and the traditional method is given in table 1 . a case study of a complex feed screening with a definitive screening design was completed using the semi-automated hts platform. this experiment, containing more than 60 feed formulations in duplicate, was handled by one operator and delivered a 40% improvement in productivity within a 4 week period (data not shown). in addition, implementing the hts platform for this study also resulted in an ~80% reduction in labor while improving the traceability of formulation preparation. a semi-automated hts platform was developed to support media and feed screening and development for early stage biologics projects. the platform utilizes 24 dwps, the nyone cell imager, the ambr15, the freedom evo liquid handler system, and the bioht metabolic analyzer to accelerate the screening process. this screening platform not only improves process throughput, operational precision, and traceability of formulation preparation, but also reduces the labor for media and feed formulation preparation. 
background a perfusion medium requires high concentrations of specific nutrients while balancing other components to support intensified perfusion processes. using a combination of design of experiment (doe), multivariate analysis (mva), and spent media analysis, we developed a catalog "de novo" perfusion medium by working with multiple cho cell lines and proteins. the optimization of the medium in bioreactors using alternating tangential flow (atf) cell retention devices reduced the minimum cspr from over 80pl/cell/d to under 35pl/cell/d for most cell lines while increasing specific productivity during 30 day steady states with stable growth rates, viability, volumetric productivity and product quality. high throughput screening (hts) was performed with seven cell lines, while four were used in bioreactors: cho-s, dg44, and two chozn® gs lines, each producing different monoclonal antibodies and include a fusion protein. for hts experiments, cells were inoculated at 2.0x10 6 vc/ml with a 30 ml working volume in 50 ml tpp® tubes and cultured for 7 days in a multitron shaken at 200rpm, 37°c, 80% rh, and 5% co 2 . for benchtop perfusion, cells were inoculated at 0.4-2.0x10 6 vc/ml in 3l applikon bioreactors (applikon, netherlands) with a 2l working volume. bioreactors were operated at 350 rpm, 37°c, 40% do, and a ph of 6.9 or 7.1±0.05 depending on the cell line. oxygen was supplied through an l-sparger or microsparger as needed, and excell® antifoam (milliporesigma, germany) was added at a maximum rate of 0.25% v/v to control foam. at a cell concentration of~6.0x10 6 vc/ml, perfusion was initiated using the atf2 (repligen, massachusetts), with a bleed set to maintain cell concentrations at 50 or 80*10 6 vc/ml. two "de novo" prototype media were developed using doe and mva in hts with tpps and an ambr®15 [1] and one was chosen for further development after comparing to a basal medium enriched with feed in bioreactors. 
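the cell-specific perfusion rate (cspr) quoted above ties the medium exchange rate to the viable cell density. as a minimal sketch, assuming the common definition cspr = perfusion rate / viable cell density (the abstract does not spell out its formula), the snippet below converts a cspr and a target cell density into vessel volumes per day (vvd). the function name and the unit handling are illustrative assumptions, not part of the original work.

```python
def perfusion_rate_vvd(cspr_pl_per_cell_day: float, vcd_cells_per_ml: float) -> float:
    """Perfusion rate in vessel volumes per day (vvd).

    Assumes CSPR = perfusion rate / viable cell density, so
    perfusion rate = CSPR * VCD (hypothetical helper for illustration).
    """
    if cspr_pl_per_cell_day < 0 or vcd_cells_per_ml < 0:
        raise ValueError("CSPR and cell density must be non-negative")
    # 1 pL = 1e-9 mL, so this is mL of medium per cell per day
    ml_medium_per_cell_day = cspr_pl_per_cell_day * 1e-9
    # mL of medium per mL of culture per day equals vessel volumes per day
    return ml_medium_per_cell_day * vcd_cells_per_ml


if __name__ == "__main__":
    # Values taken from the abstract: 80 and 35 pL/cell/d at 50e6 vc/mL
    for cspr in (80.0, 35.0):
        print(cspr, "pL/cell/d ->", round(perfusion_rate_vvd(cspr, 50e6), 2), "vvd")
```

under this definition, lowering the minimum cspr at a fixed cell density lowers the medium volume consumed per day in direct proportion, which is why cspr is a useful single figure for comparing perfusion media.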
eleven components were identified as significant effectors of critical parameters for perfusion processes across the evaluated cell lines. doe central composite experiments were run and component concentrations were optimized in the selected prototype. in parallel, amino acid specific consumption rates were calculated from bioreactor spent media samples and used to adjust the concentration of amino acids to target a reduced cspr. increasing specific amino acid concentrations resulted in a significant reduction of the minimum cspr across all tested cell lines; for example, the cspr of cho-s was reduced from 60 to 39 pl/cell/day (table 1). however, even at the lower cspr, spent media analysis revealed excess concentrations of some amino acids, so specific accumulating amino acids were reduced and components were streamlined for the final medium: excell® advanced hd perfusion medium. using this medium, a cho-s and a chozn® gs cell line producing a fusion protein were cultured at a cspr of less than 40 pl/cell/day with a vcd of 50*10^6 vc/ml. metabolic profile, productivity, and product quality were constant over the 30 day steady state. the chozn® gs cell line was also tested at 80*10^6 vc/ml with a cspr of 33 pl/cell/day (fig. 1). we have developed a catalog perfusion medium from first principles, ensuring broadness of application by using seven cell lines in scaled-down systems and four in perfusion bioreactors. the final catalog medium showed significant improvements in productivity across all cell lines, with reduced csprs when compared to enriched fed-batch medium or initial prototypes (table 1). there is a rising demand for accelerated process development and for increased efficiency and economics of biopharmaceutical production processes. furthermore, increased process understanding has evolved from the process analytical technology (pat) initiative and the quality by design (qbd) methodology. 
in contrast to one-factor-at-a-time methods, statistical design of experiment (doe) methods are widely used to develop biopharmaceutical processes. even if high-throughput systems can handle these numbers of experiments in parallel, the heuristic restriction of boundaries and the high number of factors result in stepwise iterations with multiple runs. therefore, the combination of model-based simulations with doe methods (mdoe) for the development of sophisticated cell culture processes is a novel tool for process development [1]. it is used to reduce the number of experiments during doe and the time needed for the development of more knowledge-based cell culture processes. this concept was applied to the optimization of the initial glutamine and glucose concentrations of a cho batch process. a mechanistic model was adapted and modified from [2] and used to describe the dynamics of cell metabolism and antibody production of an il-8 antibody producing cho cell line (see abbreviation of fig. 1 for cultivation details). experiments were simulated and compared to a fully experimental doe. as can be seen from table 1, user-defined constraints were chosen to obtain a stable and reproducible process with the aim of maximizing the cell density while decreasing lactate and ammonia production. at first, the experimental space was estimated by simulating the responses for broad concentration ranges and calculating the multiple response desirability function (fig. 1a). this results in a small area (turquoise) suggested as the experimental space. experiments were planned within these boundaries and responses were either simulated (fig. 1b, 4 cultivations for fitting the model) or compared with the purely experimental responses (fig. 1c, 16 cultivations). optimal concentrations for glutamine and glucose with respect to the constraints are in the lower right corner and are similar for both methods (red frame, fig. 1). 
compared with the fully experimental design, mdoe results in a reduction of 75 % in the number of experiments (4 experiments for modelling vs. 16 experiments in the experimental doe). the method is intended to optimize cultivation strategies for mammalian cell lines and to evaluate these before experiments have to be performed at laboratory scale. this results in a significant time and cost reduction during process development and process establishment. the strategy is especially intended for use in multi-single-use-devices to speed up process development.
fig. 1 (perfusion medium abstract): at a target cell density of 50*10^6 vc/ml, volumetric productivity was stable for a 30 day steady state with excell® advanced hd perfusion medium. shorter steady states were tested at 80*10^6 vc/ml.
background for the large-scale production of therapeutic glycoproteins, fed-batch culture has been widely used for its operational simplicity and high titer. however, repeated feeding of medium concentrates and/or addition of a base to maintain optimal ph during fed-batch culture leads to an increase in osmolality. the hyperosmolality affects glycosylation in a protein-specific manner. however, the mechanism behind such osmolality-dependent variations in glycosylation in recombinant chinese hamster ovary (rcho) cells remains unclear. in this study, to better understand the effect of hyperosmolality on the glycosylation of a protein produced from rcho cells, we investigated the expression of 52 n-glycosylation-related genes and the n-linked glycan structure in fc-fusion protein-producing rcho cells exposed to hyperosmotic conditions. furthermore, to validate the effect of hyperosmolality on protein glycosylation, we performed hyperosmotic culture supplemented with betaine, an osmoprotectant, and then analyzed the n-linked glycan structure and mrna levels of n-glycan branching/antennary genes. 
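the multiple response desirability function mentioned above combines several responses (here: high cell density, low lactate, low ammonia) into one score between 0 and 1. the abstract does not state which form was used; the sketch below assumes the widely used derringer-suich formulation with a geometric mean, and all bounds and response values are hypothetical placeholders rather than the constraints of table 1.

```python
import math


def d_maximize(y: float, low: float, target: float, weight: float = 1.0) -> float:
    """Desirability for a response that should be large (0 below low, 1 above target)."""
    if y <= low:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - low) / (target - low)) ** weight


def d_minimize(y: float, target: float, high: float, weight: float = 1.0) -> float:
    """Desirability for a response that should be small (1 below target, 0 above high)."""
    if y <= target:
        return 1.0
    if y >= high:
        return 0.0
    return ((high - y) / (high - target)) ** weight


def overall_desirability(ds: list[float]) -> float:
    """Geometric mean of individual desirabilities; any zero makes the total zero."""
    if not ds:
        raise ValueError("at least one desirability is required")
    if any(d == 0.0 for d in ds):
        return 0.0
    return math.exp(sum(math.log(d) for d in ds) / len(ds))


if __name__ == "__main__":
    # Hypothetical simulated responses for one glucose/glutamine combination
    d_cells = d_maximize(8.0, low=4.0, target=10.0)     # peak cell density, 1e6 cells/mL
    d_lactate = d_minimize(1.5, target=1.0, high=3.0)   # lactate, g/L
    d_ammonia = d_minimize(4.0, target=2.0, high=6.0)   # ammonia, mmol/L
    print(overall_desirability([d_cells, d_lactate, d_ammonia]))
```

evaluating such a score over a grid of simulated glucose and glutamine concentrations is one way to narrow the experimental space before any cultivation is run, in the spirit of the mdoe approach described here.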
after three days of hyperosmotic culture, nine genes (ugp, slc35a3, slc35d2, gcs1, manea, mgat2, mgat5b, b4galt3, and b4galt4) were differentially expressed over 1.5-fold of the control, and all these genes were down-regulated. n-linked glycan analysis by anion exchange and hydrophilic interaction hplc showed that the proportion of highly sialylated (di-, tri-, tetra-) and tetra-antennary n-linked glycans was significantly decreased upon hyperosmotic culture. addition of betaine, an osmoprotectant, to the hyperosmotic culture significantly increased the proportion of highly sialylated and tetra-antennary n-linked glycans (p ≤ 0.05), while it increased the expression of the n-glycan branching/antennary genes (mgat2 and mgat4b). thus, decreased expression of the genes with roles in the n-glycan biosynthesis pathway correlated with reduced sialic acid content of fc-fusion protein caused by hyperosmolar conditions. conclusions taken together, the results obtained in this study provide a better understanding of the detrimental effects of hyperosmolality on n-glycosylation, especially sialylation, in rcho cells. the identified genes, particularly mgat2 and mgat4b, are potential targets for engineering in cho cells to overcome the impact of hyperosmolality on glycoprotein sialylation. disruptive cost-effective antibody manufacturing platform based on cutting-edge purification process v. medvedev, m. duyck, t. albano, j. castillo univercells sa, gosselies, belgium correspondence: v. medvedev (v.medvedev@univercells.com) bmc proceedings 2018, 12(suppl 1):p-093 background demand for high-quality monoclonal antibodies is growing exponentially, calling for new production capacities. overcoming current limitations of conventional manufacturing strategies, namely the high capital investment and production cost, can only be achieved through innovative process designs based on the latest technologies. 
this study presents a process design combining fed-batch technology with continuous multi-column capture. an advanced cell culture clarification method was introduced to simplify downstream operations and increase the overall cost-effectiveness of the process, for an optimized production of recombinant proteins. this study was performed with cho cells expressing a monoclonal antibody targeted against the coronavirus responsible for the middle east respiratory syndrome (mers), developed by organic vaccines™ and the nih, kindly provided to univercells. upstream process: - fed-batch, 12 days culture at 10 l scale with cd-cho chemically defined media and feeds. harvest treatment: - precipitation of impurities in the production bioreactor using organic compounds (<1% v/v) and flocculation by electropositive organics (<0.1% w/v). - acidic ph and physiological conductivity. upstream processing and harvest treatment: the culture reached 0.5 g/l (8x10^6 cells/ml; 90% viability), and harvest treatment was found to be very effective in terms of impurity clearance. capture: capture strategies were evaluated from the point of view of simplification of downstream operations, with hcp impurity content monitored as a key performance indicator. - protein a affinity chromatography: advanced harvest clarification enabled major improvements in affinity capture, in terms of eluate purity and reduction of host cell impurities (<35 ppm in all conditions tested) (fig. 1). - cation exchange chromatography: cex allows higher capacities (>100 g/l) than protein a, whilst being more affordable (2- to 6-fold cheaper). low residual hcp (<500 ppm) was observed with all cex resins tested. 
without harvest treatment and clarification preceding the capture studies (either affinity or cex), results showed a lower binding capacity of the resin, a higher content of hcp in the eluate (up to 2000 ppm), a higher content of hmw species in the elution fraction (up to 3-fold higher) and a significant turbidity of the neutralized eluate. -continuous multicolumn chromatography: further options to increase cost efficiency include using a continuous multicolumn setup (table 1) . two models were assessed based on two different static binding capacities (sbc), demonstrating that 4 to 6 columns of 100ml were able to process a 200l production in less than 24h. this method provides a great opportunity for designing simplified and low footprint mabs dsp processes, while maintaining similar or achieving superior quality profile compared to standard approaches: -harvests treatments followed by depth filtration proved to be a cost-efficient way to obtain pretreated feed and minimize the burden on downstream operations. -protein a resins exhibited advantages of extracting key contaminants during harvest treatment, while caex confirmed to be a competitive capture strategy. -switching from batch to continuous multicolumn mode allowed to process a complete batch in less than 24 hours, requiring lower media and resins volumes. followed by a single polishing step, such process set-up strongly supports the reduction of operations required to deliver a high-quality product. analyses of product quality of complex polymeric igm produced by cho cells background immunoglobulin m (igm) antibodies are secreted by b cells as the first defense against invading pathogens during primary immune response. some igm antibodies already gained the orphan drug status, which shows their unique capability in therapy of rare diseases. potential fields for applications are discovered with increasing knowledge about these molecules. it seems that the most active forms are pentameric and hexameric igms. 
unfortunately, recombinant production of igms is rather difficult, as the demands of secretion and correct polymer formation result in low expression yields and mixtures of polymers. we established stable producing chinese hamster ovary (cho) dg44 cell lines to analyze cellular and extracellular factors that influence the quantity and quality of the produced recombinant polymeric igm in future studies [1]. one quality parameter is polymer distribution, which can be measured directly in cell culture supernatant using densitometric analyses [2]. additionally, we developed a very efficient single-step affinity purification strategy using the poros captureselect igm affinity matrix to analyze pure igms. for more precise measurements of the igm isoform distribution we separated the purified polymers by size exclusion high performance liquid chromatography (sec-hplc). our cho dg44 cell lines grow to a peak cell concentration of 4.5x10^6 cells/ml in erlenmeyer flasks and 4.0x10^6 cells/ml in bioreactors. similar productivity of approximately 50 mg/l was observed for cells cultivated in both cultivation vessels in a non-optimized batch culture using chemically defined media. analysing how cultivation conditions affect the fraction of polymers may offer clues about the assembly of polymers and the challenges of igm production. we quantified the polymeric distribution of igm directly in the supernatant using a densitometric method [2]. cultivated under standard conditions (37°c, ph 7), igm012 is produced as 90% pentamers, whereas igm012_gl consists of only approximately 80% pentamers. the purified igm012_gl was analysed with sec-hplc and contained 81 % pentamer and 19 % dimer, which is comparable to the results achieved with densitometry. the purification of the igm antibodies was quite challenging, as the manufacturer recommends acidic elution, which led to aggregation and inefficient elution of our model igms. 
therefore, we screened for different elution buffers that prevent denaturation and aggregation. by combining high salt concentrations with moderate ph reduction we optimized elution conditions to 88-99% igm recovery, which corresponded to a five- to six-fold improvement compared to the manufacturer's conditions. sds-page analysis and sec-hplc showed that our elution strategy resulted in a very pure product after a single chromatographic step. the purification strategy was verified with igm103, igm104 and igm617. our model igms were produced in a ratio of approximately 4:1 pentameric to dimeric igm, measured concordantly with both analytical methods. process development on igm purification using the poros captureselect human igm affinity matrix enabled the recovery of highly pure fractions. through optimization, by combining mild ph and high salt concentrations, the relatively low elution yields were increased by a factor of 5-6. applying densitometry and sec-hplc, we will investigate how culture conditions influence polymer formation in the future. currently, no small scale (<0.5 l) cell culture system is commercially available for high cell density perfusion cultivations for use in high throughput screening studies. to increase throughput for process characterization activities at janssen vaccines and prevention, a shaker flask-based scale down model was developed. however, the control possibilities of shaker flask cultures are technically very limited and differ from those of a bioreactor-controlled process. in addition, the sensitivity of the shaker flask model should allow the detection of the effects of process parameters on critical quality attributes (cqas) of the vaccine produced at large scale. iterative experiments were performed in shake flasks to evaluate the influence of settings such as shaking speed, working volume, co2% in the incubator and daily base additions on cultivation parameters (such as cell growth, ph and do). 
in addition, a medium exchange was tested to mimic the perfusion mode used in the bioreactor process. the presens shake flask reader was implemented to allow for ph and do monitoring. the conditions for which the performance as reflected in specific virus titer showed the best fit were selected. at these conditions, a series of parallel shaker flask infections were conducted to demonstrate statistical equivalence of performance parameter and cqas (as cell specific iu titer and vp/iu ratio) between the production scale and reduced scale processes and thus to qualify the shake flask as a scale-down model. a daily medium exchange by centrifugation was implemented and cultivation parameters for shake flasks were identified. based on performance parameter (cell specific vp titer) and the cqas of the vaccine (cell specific iu titer and vp/iu ratio), equivalence between the production-scale and scale down systems was confirmed. the scale down model data fall into the 95% prediction intervals calculated on manufacturing data whereas scale down model data from batch mode experiments (using non optimized cultivation conditions) do not. the shaker flask as a scale down model for the 10l bioreactor perfusion process was qualified. this model is a tool to screen a subset of process parameters at a higher throughput, thereby reducing process characterization timelines. background until today, the market for therapeutic proteins, especially monoclonal antibodies, is gaining more and more importance in the pharmaceutical field. to meet the increasing demand for these products, the industry made tremendous efforts to generate highly efficient production systems. one of the pharmaceutical industry's research focuses is the improvement of the secretion process in eukaryotic cells. 
in mammalian cells, the efficiency of protein transportation strongly depends on the translocation of a nascent protein into the er, which is mostly conducted by the signal peptide (sp) coupled to the n-terminus. because signal peptides are interchangeable between products and even species, a large variety can be used to enhance protein expression in already existing production systems. materials and methods at first, the influence of four different natural sps (sp(7), sp(8), sp(9) and sp(10)) on the secreted amount of an igg4 model antibody (product a) was compared in fed batches using a cho dg44 host cell line. in the second part, one promising sp candidate showing improved secretion (sp(9)) was identified and the influence of this sp on four additional antibody products, which varied in their expressibility from good to moderate/poor, was investigated. in both approaches, the standard sp was included for comparison. in the first approach, four signal peptides, sp(7), sp(8), sp(9) and sp(10), were screened for their potential to improve the product secretion of cho dg44 cells expressing a model antibody (product a). the results revealed a 2.4-fold increase in average final fed-batch antibody titer with sp(9) when compared to the standard sp approach (standard sp = 0.44 g/l; sp(9) = 1.50 g/l). in the second approach, the enhancing capacity of sp(9) on the secretion of four other igg products (named product b to e, table 1) was further evaluated. an improved performance was observed for all products when comparing sp(9) and the standard sp in a fed batch process (fig. 1), with an increase in average final fed-batch titers ranging from 28 to 354 % and of up to 290 % in cell-specific productivities. 
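the titer figures above mix two conventions, a "fold increase" and a "percent increase", which are easy to confuse with a plain ratio. as a small worked check using only the two titers quoted in the text (0.44 g/l and 1.50 g/l), the sketch below computes both quantities; reading "2.4-fold increase" as the increase relative to the standard sp is an interpretation, not something the abstract states explicitly.

```python
def fold_ratio(new: float, reference: float) -> float:
    """Plain ratio new / reference (e.g. 3.4 means 3.4 times the reference)."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return new / reference


def percent_increase(new: float, reference: float) -> float:
    """Increase over the reference, expressed in percent of the reference."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return (new - reference) / reference * 100.0


if __name__ == "__main__":
    standard_sp, sp9 = 0.44, 1.50  # g/L, values quoted in the abstract
    print("ratio:", round(fold_ratio(sp9, standard_sp), 2))
    print("percent increase:", round(percent_increase(sp9, standard_sp)))
```

on this reading, the sp(9) titer is about 3.4 times the standard sp titer, i.e. an increase of roughly 241 %, which matches the stated 2.4-fold increase and is on the same scale as the 28 to 354 % range reported for products b to e.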
taken together, with a positive influence on the final concentrations of all tested products, the results obtained with sp(9) contribute to -signal peptide sp(9) was identified as a promising candidate with an average 2.4-fold titer increase during screening of four signal peptides. -sp(9) was able to improve production titers up to 354 % compared to standard sp. -sp(9) was able to improve cell-specific productivities up to 290 % compared to standard sp. -future usage of sp (9) contributes to the further optimization of sartorius stedim cellca's standard cell line development process. new platform for the integrated analysis of bioreactor online and offline data lukasz gricman 1 , milan ganguly 2 , amanda fitzgerald 3 , hans peter more and more experiments are used to assess bioreactor suitability and stability of clones, to evaluate media composition and other process parameters, and to start upscaling campaigns. this has resulted in a major bottleneck due to the increase in data capturing, processing, aggregation, visualization, and statistical analysis. in addition, the association of the data with the experimental context (e.g., fermentation protocols, media recipes, bioreactor control parameters) is not easily accomplished in high throughput. the data generated in the process must not only be analyzed, but also managed and stored to enable easy tracking and relating to historical records. furthermore, the processes are often developed by global teams interacting in complex enterprise it ecosystems. therefore, new and high performing systems for data capture, processing, and analysis need to be integrated in order to enable storage and correlation of experimental context information and various types of time course analytics data. we have developed genedata bioprocess™, a new enterprise platform for bioprocess development. 
the platform enables automatic capture and visualization of all online and offline data (e.g., ph, o2, metabolic data), auto-calculations and aggregations (e.g., ivcd, qp, consumption rates) and multi-parametric assessment of any type of time-series bioreactor data in the context of experimental protocol data (e.g., process parameters, feeds). genedata bioprocess comes with dedicated interfaces for integrating with relevant laboratory instruments, control systems, statistical analysis software packages and custom enterprise solutions. it enables the modeling and tracking of complex nonlinear workflows and supports decision making in bioprocess development. the data can be analyzed in the context of upstream process development, and also be correlated to other unit operations. automation support assists the ever increasing throughput of bioprocess development operations, and the analysis of experimental data and process parameters across unit operations or even different projects. this overall integration enhances process development workflows. highlighted use cases describe the selection of the best producer clones (fig. 1a) , the identification of optimized media feeding strategies (fig. 1b) , and the comparison of clone performance across different fermentation scales (fig. 1c) . a special focus is on the analysis of data from micro-and bench-top bioreactors (such as the ambr15™ and dasgip™ systems) operated in parallel. these bioreactors allow for increased throughput of clone selection and process optimization studies, which in turn leads to an increase in data generation. genedata bioprocess supports integration with such systems and enables a comparison of data regardless of the instrument provider or scale. automated bioreactor data analysis allows development groups to take advantage of even richer datasets and, as data management is built-in to the system, the data can be easily tracked and associated to historical records. 
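the auto-calculations named above (ivcd, qp) follow from standard definitions. the sketch below is not the platform's implementation; it assumes trapezoidal integration of the viable cell density for the integral of viable cell density (ivcd) and a simple end-point estimate qp = final titer / final ivcd. function names and the example data are hypothetical.

```python
def ivcd(times_d: list[float], vcd_1e6_per_ml: list[float]) -> list[float]:
    """Cumulative integral of viable cell density (1e6 cells*day/mL), trapezoidal rule."""
    if len(times_d) != len(vcd_1e6_per_ml) or len(times_d) < 2:
        raise ValueError("need two or more matching time and VCD points")
    total = 0.0
    cumulative = [0.0]
    for i in range(1, len(times_d)):
        dt = times_d[i] - times_d[i - 1]
        if dt <= 0:
            raise ValueError("time points must be strictly increasing")
        total += 0.5 * (vcd_1e6_per_ml[i] + vcd_1e6_per_ml[i - 1]) * dt
        cumulative.append(total)
    return cumulative


def specific_productivity(titer_mg_per_l: float, ivcd_1e6_cell_day_per_ml: float) -> float:
    """End-point qP in pg/cell/day.

    1 mg/L = 1e6 pg/mL and the IVCD is in 1e6 cells*day/mL,
    so the two factors of 1e6 cancel.
    """
    if ivcd_1e6_cell_day_per_ml <= 0:
        raise ValueError("IVCD must be positive")
    return titer_mg_per_l / ivcd_1e6_cell_day_per_ml


if __name__ == "__main__":
    days = [0.0, 1.0, 2.0]        # hypothetical sampling days
    vcd = [1.0, 3.0, 5.0]         # hypothetical VCD, 1e6 cells/mL
    integral = ivcd(days, vcd)
    print(integral, specific_productivity(120.0, integral[-1]))
```

keeping the units explicit in the function names is a deliberate choice: most disagreements between qp values from different tools come from unit conventions rather than from the integration itself.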
another focus is on cross-reactor scale comparisons. data coming from different bioreactor scales can be easily imported into the platform and analyzed to establish the best conditions for upscaling. genedata bioprocess enables the correlation of process parameters (e.g., fermentation protocols, media recipes, bioreactor control parameters), with key performance indicators of the processes (e.g., titer, qp) and the product quality attributes (e.g., aggregation, glycosylation profiles). finally, bioreactor time course data can be tracked together with clone analytics and product quality parameters, which makes the platform uniquely able to support end-to-end biopharma development. upstream bioprocesses are at particular risk of contamination from adventitious agents. the typical 0.1 μm filters used at this step protect bioreactors from bacteria and mycoplasma but offer no protection from viral contaminations. a new polyethersulfone (pes) upstream virus filter, viresolve® barrier, has demonstrated high levels of microorganism retention -full retention for bacteria and mycoplasma (>8.0 lrv -log reduction value) and~5 lrv for small viruses, such as parvoviruses. it also has improved flow and capacity as compared to virus removal filters designed for monoclonal antibody purification. given the small pore size of virus retentive filters, implementing a virus filter upstream of the bioreactor raises the question of whether critical cell culture media components are removed. therefore, it is important to evaluate the cell culture performance and protein quality attributes using virus-filtered media to ensure that filtration does not negatively impact the process. materials and methods ex-cell® cho media and corresponding feeds were processed through either viresolve® barrier filters or 0.22 μm filters (control). 
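the retention figures quoted for the filter are log reduction values (lrv). as a minimal sketch of the standard definition, lrv = log10(organism load in the feed / organism load in the filtrate), the helper below computes an lrv from two titers; the function name and the example titers are hypothetical and are not data from this study.

```python
import math


def log_reduction_value(feed_titer: float, filtrate_titer: float) -> float:
    """LRV = log10(feed / filtrate); both titers must use the same units."""
    if feed_titer <= 0 or filtrate_titer <= 0:
        raise ValueError("titers must be positive (use the assay limit of detection for non-detects)")
    return math.log10(feed_titer / filtrate_titer)


if __name__ == "__main__":
    # Hypothetical challenge: 1e8 infectious units/mL in the feed, 1e3 in the filtrate
    print(log_reduction_value(1e8, 1e3))
```

when no organism is detected in the filtrate, the result is usually reported as a lower bound (for example ">8.0 lrv") based on the assay's limit of detection, which is how the bacteria and mycoplasma figures above should be read.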
media composition post-filtration was evaluated by high performance liquid chromatography (hplc), inductively coupled plasma/ optical emission spectrometry (icp-oes), and nuclear magnetic resonance (nmr). recombinant cho cells were cultured in fed batch culture. cell density and viability were measured by vi-cell tm cell viability analyzer while metabolites were analyzed by bioprofile® flex analyzer. shake flasks and bioreactors were utilized to verify that surfactants, such as poloxamer, (which are essential for shear protection in stirred tank bioreactors and can be difficult to filter) have not been removed during filtration. monoclonal antibody titer was quantitated by protein a hplc. characterization of the antibody product quality was assessed via weak cation-exchange chromatography (charge heterogeneity), size exclusion chromatography (aggregate profile), and 2-ab fluorescent labeling with np-uplc (glycan species). media and feed compositions were unaffected by filtration through the virus barrier filter. no significant differences in concentrations were observed with icp-oes (trace metals) or hplc (amino acids and water soluble vitamins). nmr showed no change in the organic composition of the media including poloxamer. the aromatic region with vitamin and amino acid signals is shown (fig. 1a) . cell cultures showed no differences in cell growth or titer, in either shake flasks or bioreactors (fig. 1b) . cell viability was unaffected, metabolite levels were within limits, and titer was consistent. the protein quality of the secreted antibodies showed no differences in the glycosylation pattern (fig. 1c) , amount of aggregates or charge variants. the risk of virus contamination in the bioreactor remains a concern for biotherapeutic manufacturers as there is no universal technology that provides a reliable, cost effective solution for virus removal that can be applied to all components of cell culture media. 
this study evaluated the viresolve® barrier filter, which provides an efficient and easy way to protect bioprocesses from adventitious virus contamination. study results demonstrated that media and feed compositions, cell culture performance, and product quality were unaffected by filtration through the viresolve® barrier filter. implementation of viresolve® barrier filters provides efficient filtration performance, high virus retention, and minimal cell culture impact, and offers a viable option to improve the overall virus risk mitigation strategy for the manufacture of biotherapeutics.
fig. 1 (genedata bioprocess abstract), panels b and c. b: tracking of process conditions together with online and offline performance analytics. the system allows tracked parameters to be defined flexibly and optimal process conditions to be selected. c: comparison of process performance across different reactor scales. the open architecture makes genedata bioprocess a provider-agnostic system that allows data to be aggregated and compared regardless of provider.
background bi- and multi-specific antibodies, antibody-cytokine fusion proteins, non-immunoglobulin scaffolds, chimeric antigen receptors (cars), engineered t-cell receptors (tcrs) and tcr-based bispecific constructs can provide significant advantages for use in cancer immunotherapy. however, as highly engineered molecules they pose new challenges in design, engineering, cloning, expression, purification, and analytics. we have thus implemented an infrastructure that addresses these challenges and enables the industrialization of these various novel therapeutic platforms. in close collaboration with leading biopharmaceutical companies, we implemented a workflow, data management and analysis support system, genedata biologics™, enabling the automated design, screening, and expression of large panels of therapeutic candidates using these novel technologies. we have also built tools for developability and manufacturability assessments of these complex molecules. 
we have ensured that there is a seamless integration of all data generated and that functionalities such as bulk protein and vector generation using our in silico cloning engine, configurable library of template vectors and cloning strategies, fully annotated in silico protein molecules and dna constructs, and dna synthesis verification support, can be used for the newest protein formats and molecule topologies. we implemented data structures and data handling systems, which mirror how these complex next-generation biologics molecules and cell lines are being designed, screened, and analyzed. the result successfully addresses workflows for tcr optimization and engineering. we exemplified this with the generation and evaluation of a panel of engineered tcrs with an alpha chain cdr3 randomization and successfully supported the analysis and selection of beneficial mutations. the system also successfully supported workflows for the design and generation of a panel of tcr-based bispecifics (tcr coupled with anti-cd3) using automated molecule registration and in silico cloning tools and subsequent capture of expression, purification, and functional and analytical characterization data. on the car-t cell front, the system is able to provide traceability of the work from antibody generation, optimization, car engineering (e.g., attachment to the scfv with cd3-zeta and co-stimulatory domains to mimic the natural tcr complex) to the engineering of the t-cell. the genedata biologics platform successfully enabled automation, increased data integrity and traceability during research and development work, and will contribute towards the industrialization of these very exciting novel approaches for cancer immunotherapy. optimal selection of therapeutic antibodies and production cell lines by assessment of critical quality attributes and developability background the increasing cost of bringing a new drug to the market has put significant pressure on biopharma organizations. 
to increase efficiency in r&d processes and reduce costs, organizations need to evaluate potential drug candidates earlier in the r&d process, eliminate those with undesirable characteristics, and focus on the most promising candidates. after successful candidates have been designed and thoroughly tested, efficient production of new biological entities in mammalian cell lines is necessary. the main goal here is to find a suitable cell line and optimal upstream and downstream processing conditions that lead not only to a satisfactory product yield, but also to a product with the desired biochemical properties. the evaluation of production cell lines, processes, and product quality attributes is performed earlier and in higher throughput for an increasing number of drug candidates. in addition, new methods in molecular and cell biology (e.g., novel genome engineering approaches such as crispr/cas9), in analytics [e.g., process analytical techniques (pats)], in process miniaturization, and in automation promise to make process development more efficient. however, the management and analysis of the increasing amount of experimental data during candidate selection and cell line and process development has become a bottleneck. in addition, quality-compromising steps in biopharma organizations can negatively impact the cost of goods and substantially prolong the drug candidate's time to market. therefore, systems for integrated management and analysis of well-structured and curated data that comprehensively integrate molecule and sample information, manufacturing process parameters, and process and product quality attributes are needed. critical quality attribute (cqa) assessment should be enabled along the whole bioprocess development workflow, including cell line development, upstream and downstream process development, as well as analytical and formulation development. 
we have developed a comprehensive platform, genedata bioprocess™, which supports drug candidate developability and manufacturability assessment and bioprocess development. the platform captures and structures the cell line and process parameters together with analytical data for cell lines, processes, and protein products. the protein analytical data being tracked include biological data (such as bioactivity, immunogenicity), and physicochemical properties. these properties include glycosylation, chemical liabilities (such as deamidation and oxidation), aggregation, stability under different conditions (low ph, low and high temperature), solubility, and impurities. genedata bioprocess™ simplifies and streamlines laborious, manual process and supports tools for molecule, clone and process selection. furthermore, the platform allows for seamless integration with laboratory instruments, statistical software packages, and custom solutions. here, we present use cases showing how to identify and annotate liability sites prone to chemical modifications (fig. 1a) and how to monitor cqas of molecules allowing to assess developability more efficiently. we show how the analytical data generated in the course of a developability assessment are compiled to select the best drug candidate (fig. 1b) . implemented traffic-light systems indicate where molecules harbor issues such as in case of the antibody tpp-86, which is compromised by low temperature and repeated freeze-thaw operations. the same assessment views can also be applied on batches and cell lines. the underlying data can be visualized graphically. as an example, we show glycan types of products obtained from different cell line clones generated in a cell line development campaign for the molecule tpp-86 (fig. 1c) . even though the selected clone cli-35 meets the glycosylation criteria (e. 
g., <13% afucosylation, <40% galactosylation, <2% sialylation), the produced molecule harbors some stability issues as mentioned above. therefore, more attempts would be needed either in formulation or in reengineering of the complementarity determining regions (cdrs) in order to provide a developable tpp-86-like drug candidate.
next-generation biologics molecules are composed of a number of specific subdomains. each type of molecule is composed of a specific set of domains, which must be mirrored in the registration and further research and development workflow. molecule registration and hit-selection using data from a number of assays is shown here using the example of car-t cells. the image is a screenshot from the genedata biologics™ software.
background environmental process variables are often used as tools to optimize the performance of mammalian cell cultures to achieve higher cell densities and high specific productivities of r-proteins (q_p). the manipulation of culture temperature in the range of mild hypothermia (mh) (35-30°c) [1, 2], as well as different glucose availability scenarios [3, 4], has been shown to improve productivity in different cell lines. however, the manipulation of these variables individually or together has a concomitant effect on the rate at which cells grow, masking the net response exhibited by the cells. in order to identify the effects of these variables, we have taken advantage of chemostat culture, in which the growth rate is fixed by the dilution rate. chemostat cultures were performed at two dilution rates (d; 0.010 or 0.018 h^-1), two temperatures (33 or 37°c) and three feed glucose concentrations (20, 30 or 40 mm). the response was analysed considering r-protein production, cell growth and key metabolites. 
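as a hedged illustration of how specific rates follow from chemostat data: at steady state, with a sterile feed and no cell retention, the biomass balance gives μ = d, the product balance gives q_p = d·p/x, and the glucose balance gives q_glc = d·(s_feed − s)/x. the sketch below applies these standard balances; the function name, argument names and example numbers are hypothetical and are not values from this study.

```python
def chemostat_specific_rates(d, x, p, s_feed, s):
    """Steady-state specific rates in a chemostat.

    Assumes sterile feed, no cell retention and a true steady state.
    d      : dilution rate (1/h)
    x      : viable cell concentration
    p      : product concentration in the vessel
    s_feed : substrate concentration in the feed
    s      : residual substrate concentration in the vessel
    Output units follow from the input units (e.g. product per cell per hour).
    """
    if x <= 0:
        raise ValueError("cell concentration must be positive")
    mu = d                       # biomass balance: growth balances washout
    q_p = d * p / x              # product balance
    q_s = d * (s_feed - s) / x   # substrate balance
    return mu, q_p, q_s


# Illustrative numbers only (hypothetical, not measured data).
mu, q_p, q_s = chemostat_specific_rates(d=0.018, x=1.5, p=3.0, s_feed=20.0, s=5.0)
```

because μ is pinned to d, differences in q_p between conditions at the same dilution rate cannot be attributed to a change in growth rate, which is the reason for using the chemostat here.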
r-tpa protein concentration was determined by immunoassay (trinilize tpa kit); cells were counted using a hemocytometer and cell viability was determined by trypan blue exclusion (t8154, sigma, usa); glucose, lactate and glutamate were determined by enzymatic assay using a ysi biochemical analyser (yellow spring instruments). statistical analysis of the results was performed by anova (design-expert 7 for windows). a decrease in cell density was observed in response to an increase of the feed glucose concentration, regardless of the temperature or specific growth rate (in this case μ = d) evaluated. the maximum cell densities were reached at 20 mm, achieving 1.65 and 1.50 x 10^6 cells/ml at 37/33°c and 0.018 h^-1, and 1.10 and 1.33 x 10^6 cells/ml at 37/33°c and 0.010 h^-1, respectively (fig. 1a). the increase in glucose concentration from 20 to 40 mm resulted in a q_p increase of 3- and 3.3-fold at 33°c/0.018 h^-1 and 37°c/0.018 h^-1, respectively. a lower increase of 2.4- and 1.8-fold was reached at 33°c/0.010 h^-1 and 37°c/0.010 h^-1, respectively (fig. 1b). the highest q_p values were reached at 37°c and 0.010 h^-1. however, a positive effect of mh was not observed, in contrast to that observed in batch culture [1, 2, 3]. this behaviour suggests that low μ is a main factor in the increased r-protein production of batch cultures exposed to mh conditions. the specific consumption rate of glucose was significantly increased by the glucose increase from 20 to 40 mm and reduced by mh (fig. 1c). at 0.010 h^-1 the specific production rate of lactate (q_lac) was increased by the glucose increase, independent of the culture temperature used, while at 33°c/0.018 h^-1 q_lac decreased with increasing glucose concentration and at 37°c/0.018 h^-1 a maximum consumption was observed at 30 mm glucose (fig. 1d). the lactate-glucose yield (fig. 1e) did not show relevant changes at 0.010 h^-1, while at 0.018 h^-1 this yield showed a more efficient utilization of glucose as glucose concentration was increased. however, this last behaviour was not reflected in an increase of r-protein production. the concentration of glucose has the greatest impact on the behaviour of the culture, and its increase positively affects protein productivity. mh did not improve the protein productivity of cho cells producing tpa under the different conditions evaluated; a low dilution rate and a high glucose concentration positively affect the protein productivity and the metabolism exhibited by the cells.
background mammalian cell cultures are the most commonly used bioprocess for the production of therapeutic recombinant proteins such as monoclonal antibodies (mabs). faced with the increasing demand for these biopharmaceuticals, the fda has initiated the process analytical technology (pat) framework in order to encourage pharmaceutical industries to use innovative technologies to monitor the critical process parameters (cpps) in real time, and to ensure the final product quality [1]. one of the most important cpps for cell culture bioprocesses is the specific growth rate (μ), which is a direct indicator of cellular physiological state. indeed, μ is sensitive to culture conditions and its value decreases when cells are in an unfavourable environment for growth [2], which may greatly influence mab production and quality. however, to this day, the online monitoring of μ remains a great challenge for mammalian cell culture bioprocesses. igg-producing cho cells were cultured in 2 l stirred bioreactors equipped with an in situ dielectric spectroscopy probe (hamilton). operating conditions were fixed at 90 rpm, 50% of air saturation, ph 7.2 and 37°c. the permittivity of the cell culture was measured every 12 min, which allowed the vcd to be calculated in real time using a previously established linear correlation. 
then, a model for online estimation of μ was developed based on vcd prediction and cell mass balance equations. several signal noise filters and various calculation methods were evaluated to improve model stability. cell cultures were performed in both batch and feed-harvest modes. feed-harvest cultures consisted of sequential renewals of 2/3 of the culture medium volume following different strategies. this study proposed an innovative methodology based on dielectric spectroscopy to monitor the cellular physiological state in real time by estimating the specific growth rate (μ) of cells online. the model for online estimation of μ was developed from cultures in batch mode and was validated by comparing the online estimates of μ with the experimental values calculated at the end of the culture. with this model, the moment when μ started to decrease significantly, which indicated that cells were no longer in the exponential growth phase, was identified as the critical moment. to demonstrate the value of online estimation of μ, the developed model was applied to a feed-harvest culture in which the medium renewals were performed at the critical moments indicated by the model. this culture was then compared with a traditional feed-harvest culture in which medium renewals were based on offline measurements of glucose and glutamine. we found that the online strategy maintained the value of μ by renewing the medium at the right time, whereas μ varied considerably with the offline strategy. moreover, with online estimation of μ, the glycosylation of igg was kept at a high level (about 95%) throughout the whole culture. in the culture using the offline strategy, however, the glycosylation level decreased progressively and was only about 75% at the end of the culture (fig. 1). a model for online estimation of μ was developed using dielectric spectroscopy, which allowed the physiological state of cells to be monitored in cell culture bioprocesses. 
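the abstract does not give the filter or the calculation method that was finally retained, so the following is only a minimal sketch of one plausible approach, under stated assumptions: vcd is obtained from permittivity through a linear correlation (the slope and intercept here are hypothetical), and μ is taken from the batch cell mass balance dx/dt = μ·x as the least-squares slope of ln(vcd) against time over a sliding window, which also acts as a simple noise filter.

```python
import math


def vcd_from_permittivity(permittivity, slope, intercept=0.0):
    """Linear correlation between permittivity and viable cell density.

    slope and intercept are hypothetical calibration constants.
    """
    return slope * permittivity + intercept


def estimate_mu(times_h, vcd, window=5):
    """Sliding-window estimate of the specific growth rate (1/h).

    Batch mass balance dX/dt = mu * X implies ln(X) is linear in t with
    slope mu; the slope is fitted by least squares over `window` points.
    Returns one estimate per time point from index window-1 onwards.
    """
    mus = []
    for i in range(window - 1, len(times_h)):
        t = times_h[i - window + 1 : i + 1]
        y = [math.log(v) for v in vcd[i - window + 1 : i + 1]]
        t_mean = sum(t) / window
        y_mean = sum(y) / window
        sxx = sum((ti - t_mean) ** 2 for ti in t)
        sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
        mus.append(sxy / sxx)
    return mus


# Synthetic signal sampled every 12 min (0.2 h), growing at 0.03 1/h.
times = [0.2 * i for i in range(20)]
signal = [2.0 * math.exp(0.03 * t) for t in times]
vcd = [vcd_from_permittivity(p, slope=0.5) for p in signal]
mu_online = estimate_mu(times, vcd)
```

a longer window gives a smoother but more delayed estimate; detecting the "critical moment" would then amount to flagging a sustained drop of the estimate below its exponential-phase value, with a threshold that has to be chosen for the process.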
implementation of this model in feed-harvest cell culture led to better mab glycosylation, which clearly demonstrates the potential of this methodology in mab production bioprocesses.
background monoclonal antibodies are normally synthesised from transfected mammalian cells as heterogeneous mixtures of glycoforms [1]. however, clinical efficacy may depend upon single glycoforms, which have been difficult to isolate [2]. we have now developed an efficient method for generating single glycoforms by solid-phase re-modelling which is superior to previous methods because it allows a sequential series of enzymatic changes without the need for intermediate purification of the antibody. solid-phase binding exposes the antibody glycans to enable easier access of the transforming glycosylation enzyme. the antibodies subjected to modification were a chimeric human/camelid monoclonal antibody (eg2), a humanized monoclonal antibody (il8), a full size chimeric antibody (cetuximab) and polyclonal antibodies obtained from pooled human serum. the antibodies were bound to a protein a column using conditions typical of mab purification (fig. 1). after washing out non-bound impurities with a neutral ph buffer, each antibody was subjected to enzymatic modification directed to a targeted glycan profile (table 1). the antibodies were then eluted with a low ph buffer and neutralized. the glycan profiles were analysed following glycan removal with pngase f, labelling with 2-aminobenzamide and separation on a hilic-hplc column [3]. prior to enzymatic modification, glycan analysis of all 4 antibodies showed variable galactosylation and sialylation typical of human abs. this included a distribution of fg0, fg1, fg2, fs1 and fs2 with galactosylation indices ranging from 0.22 for il8 to 0.64 for eg2. there was minimal sialylation in il8 but up to 11% in eg2. glycan modifications were made as each antibody was held on a protein a column in accordance with procedures shown in table 1. 
agalactosylated glycans were enriched by treatment with a single addition of galactosidase and neuraminidase. this resulted in 83-95% of agalactosylated structures in the mabs and 65% in the polyclonal antibody. galactosylated antibodies (>95% yield) were produced by a single-stage reaction involving sialidase and galactosyltransferase with udp-gal. breakdown of the glycans to a trimannosyl core was accomplished by treatment of the agalactosylated structures with hexosaminidase. this produced a yield of 76-80% of the fm3 structure with a small remainder of fa1. sialylated antibodies (>95%) were produced by a two-stage reaction involving sialidase and galactosyltransferase, followed by treatment with 2,6 sialyltransferase in the presence of cmp-nana. the latter reaction produced equimolar quantities of monosialylated and disialylated cetuximab and polyclonal antibodies. the results suggest that for human antibodies (150 kda) there may be a limitation for sialylation given the steric constraints between the two ch2 domains of the dimeric structure. the ability to sialylate the smaller camelid antibody (80 kda) was greater, resulting in a high (>90%) level of disialylated glycans. this suggests that the steric constraints for glycosylation may be lower in the smaller antibody. these sialylated antibodies have significant potential clinical importance for their anti-inflammatory activities. we have modified the glycans of antibodies following immobilization on an affinity ligand column. this allows enzymatic transformation in a solid state that has a distinct advantage over the equivalent transformation in solution because the enzymes and buffers can be washed out on completion of the modification, leaving the antibody still attached to the affinity ligand. this enables repeated rounds of an enzymatic reaction or sequential reaction steps without the need for intermediate antibody purification. 
the antibody can be removed eventually from the column by application of an elution buffer once all desired glycan modification have been made. since affinity ligand purification of antibodies is performed routinely as an initial step of purification after cell culture, the glycan modification can easily be incorporated into this process. the enrichment of the resulting antibody for a targeted glycoform can enhance the potential therapeutic efficacy as it is known that specific glycoforms are required for certain biological effects. [1, 2] . this is mainly because microvesicles can be enriched/deprived for specific proteins, based on their functional purpose and their cellular origin. recently, microvesicles purified from the supernatant of t24 bladder cancer cells were reported to be enriched for bcl-2 and cyclin d1 (anti-apoptotic proteins), but deprived for bax and caspase-3 proteins (pro-apoptotic proteins) contributing towards immunity against programmed cell-death [2] . however, impact of microvesicles on cho-based bioprocess has not been evaluated yet. therefore, in this investigation, we aimed to evaluate their impact on cell growth and recombinant protein production from cho cells. materials and methods cho-k1 cells were grown in chemically-defined protein-free culture medium (life technologies-1835273) in shake flask (gx-00125p). the different fractions of spent-media (microvesicles and microvesicle-free spent media) were collected using ultracentrifugation method [1, 3] . quality of different fractions was ensured using western blotting for exosomal marker, cd63 (sc-15363) and coomassie stained gel for loading control (fig. 1a) . to evaluate impact on cell growth, cells were seeded with microvesicles and microvesicle-free fraction collected from log-phase of culture and cell counts were performed by vicell using trypan-blue dye exclusion method. 
for impact on productivity, cell-free supernatant collected at the stationary phase from microvesicle-treated human igg-secreting cho cultures, together with the respective control, was evaluated using elisa (ab100547). in each experiment, the microvesicles used for supplementation were collected from a volume of routine maintenance culture medium equal to 10% of the working volume of the supplemented culture. microvesicle-supplemented cultures had a shorter lag phase and achieved a 1.2-fold higher maximum cell density (1.46 x 10^6 viable cells/ml) compared to the untreated standard culture (1.21 x 10^6 viable cells/ml), and cell density remained higher for the remaining period of the batch culture (fig. 1b). however, the microvesicle-free fraction did not have a significant impact on growth. the viability of microvesicle-supplemented cultures, similar to that of cultures supplemented with microvesicle-free media, was also higher compared to the standard culture, suggesting potential use of microvesicles for regulating cho growth in production cultures. this could possibly be because microvesicles have already been reported to be enriched with cell growth/death-regulating proteins, thereby facilitating cell growth [2, 3]. we have also observed an abundance of cell cycle regulators including cyclin d1 in the microvesicle fraction compared to microvesicle-free spent media in our laboratory (data not shown); however, further investigations are required to prove the hypothesis. the overall productivity of human igg-secreting cho cells was also observed to increase by ~4-fold following supplementation of microvesicles to the culture without significantly affecting per-cell productivity. since microvesicle supplementation facilitates cell growth, the increased number of viable producer cells in the culture could be expected to be the basis of the observed increase in the overall productivity of the culture [2, 3]. 
further work is ongoing to explore in depth the potential of microvesicles for improving recombinant protein production from cho cells. the data indicate that microvesicles secreted from cho cells can improve cell growth and hence recombinant protein production in culture. therefore, strategies need to be developed for sterile isolation of cho microvesicles from routine maintenance cultures and their supplementation into the production culture for improving the performance of cho-based production processes.
the glycosylation profile of a recombinant protein is one of the most important attributes when defining product quality. producing a protein with desired characteristics requires the ability to modify and target specific glycosylation profiles. traditionally the approach to modify the glycosylation profile of a protein involves supplementing a culture with components that can improve galactosylation. experimentation using this supplemental approach resulted in a dramatic increase in terminal galactosylation, but lacked the ability to easily and repeatedly target specific glycosylation profiles. using novel and proprietary technology, we have developed a feed (glycantune™) and a unique feeding process that will maximize growth and titer while being able to modulate glycan profiles. this new feed can be added as a standalone process that can result in a significant shift from g0f to g1f and g2f (maximum galactosylation). using a unique fed-batch process, glycantune can also be used with a standard feed to dial in targeted glycosylation profiles. through process development, we created a method where a transition point is used to switch from a standard feed to a glycan modulating feed. the timing of the transition point will determine the specificity of the glycan profile. 
n-linked glycans were digested with pngase f and quantified using 100 pmole maltohexaose/maltopentaose internal standards labeled with 8-aminopyrene-1,3,6-trisulfonic acid (apts) as described by laroy et al [1] or the user guide for the glycan labeling and analysis kit (glycanassure™ user guide, thermo fisher scientific). all ce separations were performed using the applied biosystems™ 3500xl. the timing of transition from efc+ to gtc+ made it possible to target specific glycosylation profiles, modulating g0f from 75% down to 32% while increasing g1f and g2f (fig. 1). transitioning to gtc+ early in culture resulted in a greater shift from g0f to g1f and g2f. transitioning midway or late in culture resulted in a greater proportion of g0f compared to g1f and g2f. supplementation-based approaches using glycosylation modulating media components to modify and target specific glycosylation profiles proved to be difficult. these approaches were able to increase terminal galactosylation (g1f and g2f), but lacked the ability to fine tune glycan profiles. this could result in numerous rounds of titration experiments to target specific glycan profiles that would likely remain inconsistent between cell lines, culture media and feeds, and process scale. the development of a unique process made it possible to predictably target specific glycosylation profiles. transition from standard feeding to glycantune allowed for precise targeting of glycan profiles. transition to glycantune early in culture resulted in an increased shift from g0f to g1f and g2f. a transition late in culture resulted in increased g0f and decreased g1f and g2f.
growth performance during precultures and batch curves in plain shaking flasks did not show any differences among tested surfactants or lots thereof, and cell densities reached 10-12·10^6 cells/ml (fig. 1a and b). 
experiments with hek 293-f cells at elevated power input in baffled shaking flasks revealed distinct differences between pluronic® f-68, f-127 and kolliphor® p188, with f-127 showing the best performance. peak viable cell densities reached with lots a and b of pluronic® f-68 and with f-127 were comparable to those in plain shaking flasks, while those for kolliphor® p188 and lots c and d of pluronic® f-68 were significantly lower. peak viable cell densities ranged from 2 to 12·10^6 cells/ml (fig. 1c). similar transient transfection efficiency and mean fluorescence of transfected cells, independent of the applied surfactant and lot thereof, indicated no major impact of the respective poloxamer (fig. 1d). interestingly, experiments using fluorescein-labelled pluronic® showed a time-dependent uptake into hek cells. visual tracking revealed an endocytic uptake of poloxamers by the cells (>10-fold increase in signal after 96 h) and their co-localisation with the cell membrane and lysosomes. sec analyses (fig. 1e) showed differences between the tested poloxamers. in particular, the tested lots of pluronic® f-68 revealed notable deviations in the low molecular weight fraction (peak 2, fig. 1e) compared to the other poloxamers. cultures subjected to varying levels of shear stress showed distinct growth differences depending on the poloxamer used. while experiments in plain shake flasks did not show any differences in growth, cultivations under elevated shear stress in baffled shake flasks resulted in lower peak viable cell densities with kolliphor® p188 and some pluronic® f-68 lots. it remains unclear whether this can be explained by different membrane protective activities alone, or if other mechanisms, occurring during and after cellular uptake, contribute to this effect. especially for the tested lots of pluronic® f-68, sec of surfactants showed differences in the low molecular weight fraction. 
this fraction mainly represents polyethylene oxide (peo) (revealed by nmr), which is likely to be a remnant from synthesis. these observations indicate that the use of different poloxamers and lots thereof should be carefully evaluated, especially under elevated shear stress. further experiments will focus on investigating distinct sec fractions of poloxamers.
overcoming (fig. 1). aurintricarboxylic acid (endonuclease inhibitor; enhancer used in e.g. salivary gland transfection) and polyvinylpyrrolidone (polymer; beneficial in electroporation) were both found to negatively impact pei-mediated transfection of cho cells, while another tested polymer enhanced growth as well as transfection efficiency. the use of a strong chelator led to a high transfection efficiency, but impaired cell growth. based on the results of the independent substance tests, the medium formulation was modified by the addition of a weak chelator and further components including vitamins. different osmolalities between 280 mosmol/kg and 340 mosmol/kg were tested for the final formulation, but no major impact was seen on either transfection efficiency or viability 2 days post-transfection. the final cho tf medium formulation supported high cell growth of finally tested cho cell lines 2 and 3 with peak viable cell densities above 10·10^6 cells/ml in batch cultivations with an overall cultivation time of 7-8 days (fig. 1). further improvements of the process might be achieved by adapting the protocol, as the results shown are based on a simple precomplexing of dna-pei. moreover, product yields could potentially be increased by using feeds, temperature shifts or commonly used enhancers (e.g. valproic acid).
scaling of a cell culture process is an essential part of its development. in a typical approach, scaling [1] is performed by keeping a (critical) process parameter constant throughout the complete bioreactor range. 
this can lead to non-beneficial results either on the high or the low end of the range. for instance, the specific power input [p/v] of 30 w/m 3 might result in a good agitation in production scale whereas it leads to a nonturbulent mixing behavior in process development scale. to overcome this issue a new approach for an easy scaling procedure was developed. this "utility function" approach for agitation scaling is based on individual functions with a value-based mapping independent of bioreactor scale. process insight information (established either from doe process investigation or existing experience with a process platform) is directly formalized into a set of mappings which transform bioprocess values into perceived benefits (0 to 1). at each bioreactor scale, parameters (e.g. stirring and gassing) are then chosen to maximize the product of resultant utility functions. the model cho fed-batch process in this trial comprised a cho dg44 cell line that was transfected to produce a humanized antibody igg1. a chemically defined media system was used. the process, including cell line, medium and feeding strategy was designed and developed by sartorius stedim cellca. the aim of the gassing scale-up was to achieve similar cell densities when the addition of pure oxygen starts. for all flexsafe str® bags oxygen was sparged via the micro sparger part of the combi sparger. all other systems used a ring sparger with holes face up. the initial air flow rate was set to an oxygen transfer rate (k l a) of 8 1/h at the corresponding agitation rate and volume. all process engineering characterization parameters were determined according to dechema guidelines [2] . with the use of the utility functions the discrete agitation rate was determined (table 1) . the utility functions led to discrete agitation rates where not only homogeneous mixing but also a turbulent flow pattern and a suitable specific power input was guaranteed. 
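the utility-function idea described above can be sketched as follows. this is a minimal illustration under assumptions, not the authors' implementation: each process value is mapped to a benefit between 0 and 1 by a trapezoidal function, and the agitation rate with the largest product of benefits is selected. the stirrer relations used (p/v = np·ρ·n³·d⁵/v, tip speed = π·n·d, re = ρ·n·d²/η) are standard engineering formulas; the power number, vessel geometry, utility windows and candidate rates are all hypothetical.

```python
import math


def trapezoid_utility(value, low, opt_low, opt_high, high):
    """Map a process value to a perceived benefit in [0, 1].

    0 outside (low, high), 1 inside [opt_low, opt_high], linear in between.
    """
    if value <= low or value >= high:
        return 0.0
    if value < opt_low:
        return (value - low) / (opt_low - low)
    if value <= opt_high:
        return 1.0
    return (high - value) / (high - opt_high)


def stirrer_metrics(n_rps, d_m, volume_m3, power_number, rho=1000.0, eta=1e-3):
    """Specific power input (W/m^3), tip speed (m/s) and Reynolds number."""
    power_per_volume = power_number * rho * n_rps**3 * d_m**5 / volume_m3
    tip_speed = math.pi * n_rps * d_m
    reynolds = rho * n_rps * d_m**2 / eta
    return power_per_volume, tip_speed, reynolds


def best_agitation(candidates_rpm, d_m, volume_m3, power_number):
    """Return (rpm, score) maximizing the product of the utility functions."""
    best_rpm, best_score = None, -1.0
    for rpm in candidates_rpm:
        pv, tip, re = stirrer_metrics(rpm / 60.0, d_m, volume_m3, power_number)
        score = (
            trapezoid_utility(pv, 5.0, 20.0, 60.0, 150.0)    # hypothetical window, W/m^3
            * trapezoid_utility(tip, 0.3, 0.6, 1.5, 2.5)     # hypothetical window, m/s
            * min(1.0, re / 1e4)                             # reward turbulent flow
        )
        if score > best_score:
            best_rpm, best_score = rpm, score
    return best_rpm, best_score


# Hypothetical 2 l vessel: 54 mm impeller, power number 1.3.
rpm, score = best_agitation(range(50, 401, 10), d_m=0.054,
                            volume_m3=0.002, power_number=1.3)
```

because the same utility windows are applied at every scale, only the geometry inputs change between vessels, and a parameter that falls outside its acceptable window drives the whole product to zero rather than being traded off silently.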
the initial gassing rate of air supplied enough oxygen for 5 x 10 6 cells/ml in all bioreactors. due to the used scaling methods the growth patterns in all bioreactor scales were comparable. peak viable cell densities (vcd) of 20 -26 x 10 6 cells/ml were achieved and viability at the point of harvest was above 80 % in all scales. the final product concentration was in an acceptable range of 2.9 -3.6 g/l. product quality attributes show comparability over the complete bioreactor range (fig. 1) . the harvest criteria of 12 days gave a combination of viability and product concentration that made it easy to process the cell broth during cell removal and other downstream steps. the process implementation of the cho production system -expressing mab 2 was successfully performed with the use of utility functions. cell growth, productivity and product quality is comparable over the complete bioreactor range. background endoplasmic reticulum (er), the central part of the secretory pathways in eukaryotic cells, is responsible for controlling the quality of secreted and resident proteins through the regulation of protein translocation, protein folding, and early post-translational modifications [1] . a number of physiological conditions such as oxidative stress, hypoglycemia, acidosis, and thermal instability can disturb the er functions, which triggers er stress [2] . prolonged er stress induces apoptotic cell death [3] . oxidative stress that naturally accumulates in the er as a result of mitochondrial energy metabolism and protein synthesis can disturb the er function [4] . because er has a responsibility on the protein synthesis and quality control of the secreted proteins, er homeostasis has to be well maintained. when h 2 o 2 , an oxidative stress inducer, was added to recombinant chinese hamster ovary (rcho) cell cultures, it reduced cell growth, monoclonal antibody (mab) production, and galactosylated form of mab in a dose-dependent manner. 
antioxidants can reduce the oxidative stress level and suppress apoptotic cell death by scavenging oxygen free radicals, inhibiting the chain reaction of oxidation, and detoxifying peroxide [5]. however, despite the importance of mass production of mabs, the effect of antioxidants on the production and quality of mabs in rcho cell cultures has not been fully characterized. to find a more effective antioxidant for rcho cell cultures, six different antioxidants, including baicalein, which have been used widely in mammalian cell cultures, were evaluated as chemical supplements with two different rcho cell lines producing the same mab in 6-well plates. then, batch and fed-batch cultures were performed in shake flasks with the supplementation of baicalein, which showed the best effect on culture performance among the six antioxidants. the reactive oxygen species (ros) and er stress levels were measured to study the effect of baicalein on mab production and quality. among these antioxidants, baicalein showed the best mab production performance. addition of baicalein significantly reduced the expression levels of bip and chop along with the ros level, suggesting that oxidative stress accumulated in the cells can be relieved using baicalein. as a result, addition of baicalein in batch cultures resulted in a 1.7- to 1.8-fold increase in the maximum mab concentration (mmc), while maintaining the galactosylation of mab (fig. 1 and table 1). likewise, addition of baicalein in fed-batch culture resulted in a 1.6-fold increase in the mmc while maintaining the galactosylation of mab. oxidative stress negatively affected the production and galactosylation of mab in rcho cell cultures. among the various antioxidants tested in this study, baicalein showed the best mab production performance in both batch and fed-batch cultures of rcho cells. baicalein addition significantly enhanced mab production while maintaining galactosylated forms of mab.
thus, baicalein is an effective antioxidant for use in rcho cell cultures for improved mab production. background the production of many biopharmaceuticals (e.g. antibodies and proteins for diagnostic and therapeutic purposes) requires the cultivation of mammalian cell lines, which is demanding in several respects, such as complex cell metabolism, variability in cell behavior, scale dependencies, and the influence of changes in cultivation conditions, medium composition etc. although an increasing number of measurement parameters is available, only some of them are routinely utilized in industrial cell culture processes and their corresponding seed trains. nevertheless, the database grows, statistical investigation of data gains importance and process data are more easily accessible in the context of industry 4.0. cell cultivation has to consider these complex requirements, e.g. for fed-batch control and seed train design. furthermore, cultivation strategies have to be adapted to new products, cell lines and clones as well as to different production plants when transferring processes. one approach to counter the variabilities and to include current information from the process and from data analysis is adaptive model-assisted control [1]. two software tools enabling adaptive model-assisted control applying unstructured, unsegregated models have been developed and implemented using matlab©, winers and fortran, one tool for fed-batch control and another one for seed train simulation and optimization. one key element of adaptive model-assisted control is the underlying process model. in order to provide an adaptive character, model parameters should be easily identifiable from routine cultivation data, which are available during seed train and fed-batch without additional sophisticated measurements. therefore, the usage of unstructured, unsegregated models is recommended.
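a minimal sketch of what an unstructured, unsegregated model with monod-type kinetics can look like is given here, together with a helper that reads a passaging time off the predicted course, as a seed train tool might. it is not the authors' 13-parameter model: the reduced set of equations, the explicit euler integration, the names `simulate_batch` and `time_to_reach` and all parameter values are assumptions for illustration. the lactate uptake switch at 0.5 mm glucose and the relation q_amm = y_amm/gln · q_gln are the only expressions taken from the abstract's table legend.

```python
def simulate_batch(state, params, t_end_h, dt_h=0.01):
    """Integrate a Monod-type batch model with the explicit Euler method.

    state: xv in 1e6 cells/mL; glc, gln, lac, amm in mM.
    Cell-specific rates are in mmol per 1e9 cells per hour.
    Returns a list of (time_h, state_dict) tuples.
    """
    s = dict(state)
    trajectory = [(0.0, dict(s))]
    for step in range(1, int(round(t_end_h / dt_h)) + 1):
        glc, gln, xv = s["glc"], s["gln"], s["xv"]
        # Monod-type growth limited by glucose and glutamine.
        mu = params["mu_max"] * glc / (params["k_glc"] + glc) * gln / (params["k_gln"] + gln)
        q_glc = mu / params["y_x_glc"]        # glucose uptake
        q_gln = mu / params["y_x_gln"]        # glutamine uptake
        q_lac = params["y_lac_glc"] * q_glc   # lactate production
        # Lactate uptake switch (threshold of 0.5 mM glucose from the table legend).
        q_lac_uptake = 0.0 if glc >= 0.5 else params["q_lac_uptake_max"]
        q_amm = params["y_amm_gln"] * q_gln   # q_amm = Y_amm/gln * q_gln
        s["xv"] = xv + dt_h * (mu - params["mu_d"]) * xv
        s["glc"] = max(glc - dt_h * q_glc * xv, 0.0)
        s["gln"] = max(gln - dt_h * q_gln * xv, 0.0)
        s["lac"] = max(s["lac"] + dt_h * (q_lac - q_lac_uptake) * xv, 0.0)
        s["amm"] = s["amm"] + dt_h * q_amm * xv
        trajectory.append((step * dt_h, dict(s)))
    return trajectory


def time_to_reach(trajectory, xv_target):
    """First time (h) at which the viable cell density reaches xv_target, or None."""
    for t, s in trajectory:
        if s["xv"] >= xv_target:
            return t
    return None


# Hypothetical parameter set and starting values, for illustration only.
params = {
    "mu_max": 0.03, "mu_d": 0.002, "k_glc": 0.5, "k_gln": 0.1,
    "y_x_glc": 0.5, "y_x_gln": 2.0, "y_lac_glc": 1.5,
    "y_amm_gln": 0.7, "q_lac_uptake_max": 0.01,
}
start = {"xv": 0.3, "glc": 25.0, "gln": 4.0, "lac": 0.0, "amm": 0.0}
trajectory = simulate_batch(start, params, t_end_h=96.0)
t_pass = time_to_reach(trajectory, xv_target=1.5)  # assumed passaging criterion
print(t_pass)
```

in an adaptive setting, `params` would be re-estimated from routine cultivation data after each scale and the prediction repeated for the next one.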
a) example of an unstructured, unsegregated cell culture model (for adaptive model-assisted control): one example, describing cell growth, cell death, uptake of substrates and production of metabolites via a first-order system of ordinary differential equations and monod-type kinetics, is shown in table 1. this mathematical model includes 13 cell-specific model parameters [2]. b) open-loop control sequence for seed train simulation and optimization [3]: using the model, a priori identified model parameters and starting concentration values, the temporal concentration courses can be predicted for the first scale. subsequently, points in time for passaging and starting values for the next scale can be computed by adding a passaging strategy, seed train conditions and medium concentrations. predictions for the following scales can be obtained iteratively. integrating feedback from the process in terms of cultivation data increases prediction accuracy and allows a response to possible changes in cell behaviour. process design and optimization, e.g. regarding seed train and fed-batch, is realized by adaptive model-assisted software tools using unstructured, unsegregated models. they enable feedback from the process via routine cultivation data and allow adaptation to diverse circumstances such as different cell lines, products, cultivation conditions, plant configurations etc. ) in polyelectrolyte capsules. significant advantages, such as great mechanical stability, good biocompatibility and good mass transfer properties, characterized these capsules based on sodium cellulose sulfate/poly(diallyldimethyl) ammonium chloride (scs/pdadmac) [1, 2]. here, we present the possibility to cultivate human t cells, freshly isolated from blood, to high densities in similar semipermeable polyelectrolyte microcapsules within less than 10 days.
cells were encapsulated in semipermeable scs/pdadmac polyelectrolyte microcapsules or confined in 1.5% alginate/poly-l-lysine (pll) beads, a standard approach for cell immobilization. the permeability of the microcapsules was estimated using dextran-based molecular weight standards (10 and 20 kda) and vitamin b12 (1.6 kda). gentle digestion with endocellulase allows an easy release of the cells from the capsules. cell growth, cytokine production and phenotype were measured in non-encapsulated and encapsulated cells grown under standard culture conditions. moreover, we analyzed the interplay between the secreted cytokines and the scs within the capsules and its putative influence on cell growth. cells mixed in the cellulose sulfate solution under physiological conditions can be safely trapped within a liquid core during capsule formation. encapsulated cells can reach cell densities of up to 40 x 10^6 cells per ml capsule, whereas cells confined in alginate/pll beads and non-encapsulated ones reached 11.3 x 10^6 cells per ml bead and 2.4 x 10^6 cells per ml, respectively. one major advantage of these polyelectrolyte microcapsules (<1 mm) is the low molecular weight cut-off (mwco, <10 kda) (fig. 1a-b). this restricted permeability allows conditioning of the capsule core by autocrine factors, which in turn permits the use of basal cell culture medium instead of expensive specialized t cell media; it therefore does not require high amounts of rhil-2 and reduces the cultivation costs. moreover, co-encapsulation of rhil-2 had a beneficial effect on the growth kinetics in most cases (fig. 1c). some evidence is presented that the scs used to form the polyelectrolyte microcapsules specifically adsorbs il-2 (table 1), a cytokine which provides an essential signal for t-cell proliferation and differentiation [3].
therefore, we postulate that the scs used for encapsulation has biomimetic properties, creating an artificial extracellular matrix mimicking heparin sulfate, which in turn positively affects t cell proliferation via trans-presentation of il-2 (fig. 1d) [4]. primary t lymphocytes can be expanded under appropriate conditions outside the body. in the body, t cells grow and expand in specific environments where the cells are tightly packed, leading to multiple cell-cell contacts and manifold interactions with the extracellular matrix. ex vivo suspension cultures of diluted cells cannot provide such a microenvironment. in the microcapsule-based cultivation system presented, the cells are suspended in a viscous scs solution. the low molecular weight cut-off of the surrounding polyelectrolyte membrane ensures that typical signaling molecules produced by the cells are retained, which facilitates the "conditioning" of the cellular microenvironment, while nutrients and metabolites can pass. expensive additives, such as interleukin-2 (il-2), can be co-encapsulated. expansion then no longer requires specialized t-cell media. moreover, the scs seems to have biomimetic properties, representing an artificial extracellular matrix mimicking heparin sulfate. we consider that the described method may be an appropriate alternative to expand t cells while creating a local microenvironment mimicking in vivo conditions. -175) .
equations of balances and kinetics of an employed process model. symbols: $x_v$, viable cell density; $x_t$, total cell density; $\mu$, cell-specific growth rate; $\mu_d$, cell-specific death rate; $t$, time; $k_s$ and $k$, monod kinetic constant and monod constant for uptake; $k_{lys}$, cell lysis constant; $q$, cell-specific uptake rate or production rate, respectively; $y$, kinetic production constant; $c$, concentration; glc, glucose; gln, glutamine; lac, lactate; amm, ammonia; $f$, feed rate; $v$, volume. columns: balances with fed-batch terms; kinetics. uptake: $q_{lac,uptake} = 0$ if $c_{glc} \geq 0.5$ mm; $q_{lac,uptake} = q_{lac,uptake,max}$ if $c_{glc} < 0.5$ mm; $q_{amm} = y_{amm/gln} \cdot q_{gln}$. background digital manufacturing (dm) is heightening the productivity and robustness of existing processes and facilities. it also enables the efficient development of previously unmanageable products or processes and provides the basis for a wave of innovations. dm is a resident and on-line source of continuous optimization of process performance. it relies upon the comprehensive, real-time interfacing of both human- and machine-sourced information through one centralized system. more than a legacy distributed control system (dcs) or supervisory control and data acquisition (scada) system, it is an integral interconnection of real-time access to divergent sources of information. as such, it can promise deep analysis and predictions leading to shortened product cycles and advanced process control. this comprehensive analysis extends beyond operations performance data from the production floor to data driving such activities as raw materials security of supply (sos) and business continuity management systems (bcms). digital biomanufacturing (db) can be viewed as yet another, larger, embodiment of digital biotechnology. db is similar to digital manufacturing in that it promotes innovations in the manufacturing of biologicals by using such things as computer-aided design, manufacture, verification and deep process analysis using software sensors (fig. 1).
however, the fact that there are living components (cells) involved in the processes gives a distinctly different flavor to the systems employed. it is desirable to use a distinct term here because, as with the terms bioproduction and biopharmacology, db addresses many unique aspects of biologically-based activities. the reasons why the biotech and biopharma industry lags behind other sectors such as the automotive sector in the transformation to digital manufacturing are (i) the complexity and dissipative nature of biological systems, (ii) distributed heterogeneous data and (iii) limited at-line or on-line data sources. however, the costs of genomic sequencing, omics data generation, and computing resources are decreasing rapidly, and at the same time process analytical technologies, computational power and predictive modeling as well as data management infrastructures are greatly improving (table 1). by removing roadblocks that used to limit approaches, these changes have paved the way to transforming the bioeconomy into an industry that is based on digital knowledge. such new and optimized manufacturing technologies as continuous biomanufacturing and 3d bioprinting can actually demand the interfacing of many sources of information, deep data analysis including software sensors for metabolic fluxes, and model-based predictions of digital biomanufacturing. [table note: prior to elisa, the various proteins were incubated at 37°c in scs prepared as for encapsulation. as control, the scs was replaced by pbs. shown are mean values ± sd, n = 3.] the application of predictive models for bioprocess optimization greatly improves established platforms and finally leads to a massively increased mechanistic process understanding. four essential benefits result from the increased bioprocess understanding, development, and control of db. first, personnel are relieved of many manual and repetitive tasks. second, strategic planning and operational efficiency are improved.
third, we see real-time optimization of end-to-end manufacturing based on such high-value criteria as projected product quality and profitability. fourth, it enables previously unmanageable operations and creates innovative solutions. monitoring between-batch behavior of real-time adjusted cell-culture parameters xavier lories, jean-françois michiels arlenda, mont-saint-guibert, 1435, belgium correspondence: xavier lories (xavier.lories@arlenda.com) bmc proceedings 2018, 12(suppl 1):p-192 background cell-culture parameters (ccp), such as ph, may be continuously measured online and subject to real-time automated adjustment (e.g. automated addition of a base to prevent the ph from dropping too low). this is an efficient method to maintain the parameter within specified limits. this type of control constrains the variability within the predefined limits and does not provide any information on the between-batch variability of the process. online measurements of ccp provide time-dependent curves presenting one or more transitions. different types of transition can be observed: -the process can shift from a state in which adjustment is needed to keep the ccp in range to a state in which it is not. typically, the ccp drifts away from a limit. -the process shifts from a state in which adjustment is not needed to one in which it is. for instance, a drifting ccp reaches the lower or upper limit of the accepted range. the timepoints at which those transitions take place are here called changepoints. those are aspects of the process and, as such, should be controlled. in the multiple-changepoint cases, the approach allows the early termination of runs showing a very early or very late first changepoint. the identification of the changepoint positions is based on simple rules rather than complex statistical modeling to keep the identification methodology simple. once the changepoints are identified, a multivariate bayesian model is fitted to the appropriately transformed data.
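the abstract does not state which simple rules were used to locate the changepoints, so the following is only one possible rule, written as a hedged sketch: a sample counts as "adjusted" when it lies within a tolerance of the lower or upper limit, and a changepoint is recorded when the adjusted/free state flips and stays flipped for a minimum number of consecutive samples. the function name `find_changepoints`, the tolerance, the run length and the ph trace are all hypothetical.

```python
def find_changepoints(times, values, lower, upper, tol=0.02, min_run=3):
    """Rule-based changepoint detection for an online-controlled parameter.

    A sample is 'at the limit' (adjustment active) if it lies within tol of
    the lower or upper limit. A changepoint is the first time of a run of
    min_run consecutive samples whose state differs from the current state.
    """
    if not values:
        return []
    at_limit = [v <= lower + tol or v >= upper - tol for v in values]
    state = at_limit[0]
    changepoints = []
    for i in range(1, len(values) - min_run + 1):
        window = at_limit[i:i + min_run]
        if all(flag != state for flag in window):
            changepoints.append(times[i])
            state = not state
    return changepoints


# Hypothetical pH trace: drifts down to the lower limit, is held there by base
# addition, then drifts upwards again (a 2-changepoint case).
times_h = list(range(11))
ph = [7.2, 7.1, 7.0, 6.9, 6.8, 6.8, 6.8, 6.8, 6.9, 7.0, 7.1]
changepoints = find_changepoints(times_h, ph, lower=6.8, upper=7.4)
print(changepoints)  # [4, 8]
```

the changepoint times of each batch, suitably transformed, would then be the multivariate observations on which prediction regions are built.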
prediction regions are obtained and used as control limits [1]. results obtained for a 2-changepoint case are shown in fig. 1. points on the right-hand graph represent new batches. the red triangle represents a failed batch. it appears that the control strategy fails to identify the failed batch. two reasons can be considered: -the limits of the prediction region were established based on 9 points; such a small sample size is likely to be insufficient for the definition of such a control chart. -the tested batches were produced out of set point. to be really relevant, a control chart should be used on a stable process, run under the same conditions. this work was based on available historical data, which is never an ideal situation. the suggested strategy offers a simple approach to the monitoring of between-batch behavior for cell culture. once the limits have been defined, the approach is quite straightforward and usable by non-statisticians. however, such a strategy, like any other of this type, must be based on a sufficient number of batches for the definition of the control limits in order to have a good estimate of the batch-to-batch variability. fig. 1 (abstract p-190). intelligent software applications support digital biomanufacturing process development and control. • databases using data collected online, at-line, and offline from bioprocesses operating worldwide. • process data are used to generate metabolic network models that represent a specific host cell line in a bioprocess. • model-based computational simulations improve process understanding and reduce experimental efforts for media design, clone selection, and metabolic engineering. • automated data import and processing allow for a streamlined and standardized metabolic process analysis. • identification of critical metabolic parameters is used for proactive steering and control of production processes. background rabies is a zoonotic viral disease with a mortality close to 100% [1].
as there is no efficacious treatment available, post-exposure vaccination is recommended for individuals in contact with the virus. on the other hand, the most common source of virus transmission is the saliva of infected animals, mostly dogs, so mass vaccination of pets is the most cost-effective way to reduce human infections. in this context, availability of both human and veterinary vaccines is critical [2, 3]. our group had previously developed an effective vlp-based rabies vaccine candidate produced in high-density hek293 cell cultures with serum-free medium (sfm) [4, 5]. one of the aims in a vaccine production process is to achieve good productivity at a low cost per dose, mainly in the case of vaccines for animal use, in which the sfm is one of the principal expenses. in this work, we show the adaptation of the producer clone to an inexpensive in-house developed culture medium, in order to reduce the global cost of the process and therefore the price per dose. experimental approach first, we compared a direct and a sequential adaptation protocol of our hek293 rv-vlps producer clone, from 100% of the commercial sfm (ex-cell293, safc) to a new formulation with only 50% of the sfm and a minimum essential medium (p2g), developed in our laboratory specifically for rv-vlps production. this new formulation was called rvpm (rabies vaccine production medium). the specific productivity of rv-vlps in culture supernatants was measured by sandwich elisa, which quantifies the glycoprotein content, using the 6th international standard for rabies vaccine (nibsc; expressed in elisa units per ml). further, we evaluated both media for the production of the rabies vaccine, using stirred tank bioreactors operated in continuous mode (biostat qplus, sartorius). the production of the rv-vlps was evaluated daily by elisa and the harvests obtained were analysed by the nih potency test for rabies vaccine.
after the adaptation process, suspension cultures without aggregates or clumps were obtained, with the same specific growth rate. a lower maximum cell density was reached with the rvpm, 5 x 10^6 cells.ml^-1, compared with the sfm, which reached cell densities between 8 and 9 x 10^6 cells.ml^-1 in batch mode. the specific rv-vlps productivity per cell was maintained, with values of 0.88 and 0.90 eu.(10^6 cells)^-1.day^-1 for the clone cultured in sfm and rvpm, respectively. taking into account that this producer clone can be changed directly from one medium to the other without lag phase or cell damage, and that the maximum cell density reached in rvpm was lower, this medium was proposed for analysis at high cell density in perfusion mode for a continuous culture in a bioreactor. therefore, we performed two cultures in parallel to compare the efficacy of each medium formulation in perfusion. as shown in fig. 1, we obtained very similar culture performances in both bioreactors: 14.4 eu.ml^-1 and 16.1 eu.ml^-1 of rv-vlps for the commercial sfm and rvpm, respectively. after that, the harvests were evaluated by the nih potency test, giving a rabies vaccine potency of 1.2 iu.ml^-1 for both cultures (1 iu.ml^-1 being the minimum potency required for animal vaccine). thus, the results obtained represent an interesting advance in the optimization of this vaccine production process, since the use of this new medium formulation represents a reduction of 40% of the total cost, which will be reflected in a considerable reduction of the price of the vaccine dose. background vaccines are one of the most powerful and effective health inventions ever developed, providing tremendous economic and societal value; yet several factors hinder comprehensive immunization coverage. traditional methods of biologics production, based on stainless steel bioreactors, allow pharmaceutical companies to achieve economy of scale, but are limited by high capital expenditures.
such approaches stifle manufacturing innovation and lack long-term cost-effectiveness and sustainability. current innovations can cut biologics' production costs to revolutionize the mainstream use of biologic treatments, focusing on developing fast, potent and cost-effective vaccine production. univercells' mission to make biologics affordable to all initiated a paradigm shift, targeting an innovative single-use manufacturing platform incorporating bioprocess into continuous operations. univercells employs process intensification, using high volumetric productivity bioreactors; and unit steps integration, coupling usp and dsp into continuous operations. the objective is a down-scaled high-productivity process for a cost-effective manufacturing solution. the resulting micro-facilities are easily-deployable in developing countries, breaking entry barriers to biomanufacturing (fig. 1) . manufacturing and distribution advancements, from centralized to distributed, foresee affordable treatments' obtainability via supplying local populations with local production units. -bench-scale fixed-bed bioreactor; -carriers made of 100% pure non-woven hydrophilized pet fibers; -vero cells grown in serum-free and serum containing media; -attenuated polio strains; -cell nuclei on carriers counted by the crystal violet method; -polio virus production estimated by elisa assay (d-antigen content). cultivation of vero cells in medium with serum and in serum-free medium, was carried out in bench-scale compact fixed-bed bioreactors, to determine which culture conditions result in the highest growth rate, the highest cell biomass by carriers and virus production. cells were inoculated at 0.05x10 6 cells/cm 2 and infected during the mid-exponential phase, following a complete media exchange. viral infection took place in serum-free media. in-line clarification and purification is targeted to be performed in only a few steps (maximum one of two) without intermediary diafiltration. 
in such a configuration, we measured that vero cells can reach a cell density of 300-350 x 10^3 cells/cm^2 with a pdl/day of 1.0-1.2 in serum-containing media. this new facility is expected to manufacture any type of viral vaccine at a very low cost and could be deployed at the site of the manufacturer in emerging countries, killing the two birds of cost of manufacturing and distribution with one stone. the presentation will feature the description of the engineering development, but also the preliminary results of cell growth, infections, and product quality, as well as a description of the cogs calculation. univercells developed a disruptive polio vaccine manufacturing technology exceeding expectations when compared to traditional methods, achieving a superior result via its all-in-one solution of a simple, scalable, and fully-disposable vaccine production platform resulting in long-term cost-effectiveness, flexibility and sustainability: -all upstream, downstream and inactivation steps take place within a closed system with all the equipment contained in a low-footprint isolator, creating a confined area for polio virus handling that facilitates the deployment of micro-facilities. -this leads to a dramatic reduction in capital investment and in the time required for development, and increases production capacity. -in conclusion, this is a simple and elegant solution for the industrial production of human vaccines at a low cost in micro-facilities, making polio vaccines available to all. comparison of media formulations for the vaccine production in 1 l stirred tank bioreactors operated in continuous mode. both cultures were performed in parallel using the corresponding medium for the perfusion. (a) feeding was performed with the commercial sfm. (b) the first two days of perfusion feeding was performed with sfm until the cell density reached 10^7 cells.ml^-1 and, after that, the bioreactor was fed using the rvpm formulation.
(↓) on day 10, 20% of the reactor volume was bled once while maintaining the working volume. background vectored vaccines based on modified vaccinia virus ankara (mva) are reported to stably maintain large transgenes, and to be safe, immunogenic and tolerant to pre-existing immunity. mva is usually produced on primary chicken embryo fibroblasts but continuous cell lines are being investigated as more versatile substrates. we have previously reported the development of a continuous suspension cell line (cr.pix) derived from the muscovy duck and an efficient production process for mva in chemically defined media [1, 2]. this process allowed isolation of a hitherto undescribed genotype (mva-cr19) that induced fewer syncytia in adherent cultures and replicated to higher infectious titers in the extracellular volume of suspension cultures [3]. replication of mva-cr19 remained restricted predominantly to avian cells, an important property of mva vectors. homologous recombination in cr.pix cells was used to generate viruses with various expression cassettes in deletion site iii [4] and combinations of the differentiating point mutations of mva-cr19 in a backbone of wildtype virus. all recombinant viruses were plaque-purified. successful introduction of the mutations was confirmed by sequencing and specifically designed restriction fragment length polymorphisms (rflps). viruses were analyzed by serial passaging, diagnostic pcrs across deletion sites [4], replication kinetics, plaque phenotype and electron microscopy. the genome was further investigated by anchored pcr and long pcr. efficiency of spread of recombinant viruses (fig. 1a) could be mapped to a point mutation in one of the genes, a34r. however, although mva-cr19 carries mutations in three structural proteins, we detected no obvious differences to wildtype by electron microscopy (fig. 1b). the replacement of the left viral telomere by the right counterpart was the most surprising result of our new study (fig. 1c).
this extensive rearrangement affects 15% of the viral genome and has also increased the area of complementarity between the two telomeres. the recombination site was precisely located and shown via analysis of earlier and subsequent passages to be a stable property of mva-cr19. various viruses, including those with larger dual (dsred1 and gfp) expression cassettes, were serially passaged at least 20 times. although the genotype of mva-cr19 is advantageous for replication, all genomic and genetic markers of wildtype and mva-cr19 were stably maintained in all passages of the recombinant viruses, independent of wildtype or mva-cr19 backbone. we confirmed our previous results that suggested that mva-cr19 replicates efficiently in single-cell suspensions and were able to connect this property with the d86y mutation in a34, a structural protein on the surface of the virions. mva-cr19 was also found to differ from wildtype mva by a recombination between the left and right viral telomeres. due to this event, several genes encoded at the left terminus have been deleted whereas the gene dosage of those originally encoded only at the right terminus may have increased. we do not currently know how much the various point mutations and changes in genomic structure combine to explain the improved replication of mva-cr19. as several of the affected genes have been reported in the literature to impact interaction of mva with the host, we would expect that in vivo studies may reveal additional novel properties of mva-cr19. an extremely important distinction between our earlier study [3] and this one concerns the source of the viruses. here, we investigated plaque-purified viruses and confirm the high genetic and genomic stability of mva.
different expression cassettes inserted into deletion site iii, all diagnostic rflps and pcrs over various sites of the genome and within the viral telomeres remained unchanged throughout at least 20 serial passages, independent of whether recombinant viruses with wildtype or cr19-derived backbones were characterized. fig. 1 (abstract p-225). (a) one hallmark of mva-cr19 is a significantly reduced tendency to induce syncytia and an increased dispersion of plaques in cr.pix cell monolayers. this property appears to be supported by the mutation in a34r. (b) electron microscopy reveals no obvious differences between the novel genotype and wildtype. background transient gene expression systems using polyethylenimine (pei) are considered to be fast, flexible and cost-efficient for recombinant protein production [1]. transfection efficiency depends on different factors; one of them is the type of medium. production media support cell growth and protein production but not high transfection efficiency (te) mediated by pei [2]. therefore, media were selected for transfection followed by feeding of production media [3] to improve te and protein production. two different transfection strategies are compared: conventional transfection, in which a polyplex of plasmid dna (pdna) and pei is prepared before transfection, and in situ transfection, in which both are added directly to the cell suspension and the polyplex forms spontaneously [4]. cells were seeded in chomacs cd media 24 h before transfection. at the transfection time point an equal number of cells was resuspended in each media type. transfection was applied either in situ or conventionally (polyplex prepared in 100 μl of 150 mm nacl and incubated for 20 min); media addition was performed 5 hours post-transfection (hpt). media types and transfection conditions are listed in table 1.
the media screen showed the highest transfection efficiency, around 50% transfected cells, with opti-mem medium, accompanied by low cell growth and viability. to improve the transfection efficiency, basic parameters including cell density and pdna and pei concentrations were varied; higher transfection efficiency was reached by reducing the amount of media or, accordingly, increasing the cell density and the pei and pdna concentrations for transfection. further optimization showed that transfection of cho-k1 cells in opti-mem (transfection medium) for 5 hours followed by addition of cho-macs cd (production medium) further enhanced the transfection, cell count, and cell viability. the transfection efficiency (te) increased up to 85 ± 2.6%, coinciding with increases in viable cell concentration (vcc), in comparison to transfection and cultivation in opti-mem media alone (fig. 1a). both the conventional and in situ methods successfully transfected cho-k1 cells to a similarly high te, as shown in the fluorescence microscope images of fig. 1b. in situ transfection is preferable for suspension cell transfection with respect to the reduction of handling steps (one step) compared to the conventional way (two steps). in situ transfection avoids the optimization step required for the incubation period to prepare the transfection polyplex but requires a higher amount of pdna and pei than the conventional way, as shown in table 1. in order to deal with the growing demand for large quantities of therapeutic proteins in a timely fashion, expression systems are being optimized to reduce the time needed to generate stable clones as well as to increase the levels of protein secretion. this can be achieved by a combination of expression cassette optimization, cell engineering and selection process. we have previously developed the cumate gene-switch, which is a very efficient expression system for protein production [1].
we have shown that the cumate-inducible promoter (cr5) was the strongest promoter we had tested so far in chinese hamster ovary (cho) cells. with this promoter, we were able to generate stable cho pools capable of producing high levels of a fc fusion protein (900 mg/l), outperforming by 3 to 4 fold those generated with cmv5 and hybrid ef1α-htlv constitutive promoters. besides the strength of the cr5 promoter, we demonstrated that the ability to control both the time and the level of expression during pool generation and maintenance gave a real advantage to the inducible expression system. indeed, we observed that keeping the expression off during selection enabled the generation of pools with superior productivity compared with the pools whose expression was maintained on. moreover, preliminary results suggest that keeping recombinant protein expression down increases the frequency of high producer clones [2] . knowing that one of the main bottlenecks of the successful bioprocessing of recombinant proteins using cho cells is the rapid isolation of a high producer, our data suggest that the cumate gene-switch system could be a valuable platform for the generation of stable clones. the main regulatory authorities and organizations demand proof of monoclonality for biotechnological producer cells. with increasing pressure to shorten timelines and to improve drug safety, technologically advanced methods have to be established to ensure that production cell lines are derived from a single progenitor cell. sartorius stedim cellca's single cell cloning approach is based on one round of fluorescence-activated cell sorting (facs) using becton dickinson (bd) facsariatm fusion cell sorter combined with photodocumentation by synentec cellavista microscopic imaging system. for the approach, critical process parameters such as different cell lines, viability and cell aggregation levels were investigated separately to assess their contribution to the probability of monoclonality. 
immediately after single cell cloning into 384-well plates (1 cell/well), the plates were centrifuged and then imaged using the cellavista (day 0). further cellavista images were taken on day 1, day 2 and on one day between day 5 and 7. outgrowth was defined at day 14. eight cell lines expressing different recombinant products were investigated to calculate the probability p(d) of having ≥ 2 cells/well after facs sorting, the apparent probability p(i) of having ghost cells (cells that are out of focus and thus not visible during initial microscopic imaging), and the apparent probability p(k) of having ghost cells that outgrow the 384-well stage (fig. 1). using these results, the probability of obtaining a monoclonal cell line with sartorius stedim cellca's single cell cloning approach was determined (table 1) by conservative examination, p(monoclonal, conservative) = 1 - (p(d) x p(i)), and by realistic examination, p(monoclonal, realistic) = 1 - (p(d) x p(k)). cell pools with low viability can theoretically affect the probability of monoclonality, e.g. by diminishing microscopic imaging quality (cell debris). therefore, pool cell line 1 with very low viability (≥36 %) was used to demonstrate that the probability of monoclonality is still 99.9 % in case of low viability on the day of sorting: p(monoclonal, conservative) = 1 - (p(d) x p(i)) = 99.9 %; p(monoclonal, realistic) = 1 - (p(d) x p(k)) = 99.9 %. furthermore, cell pools with high aggregation levels can theoretically affect the probability of monoclonality because cells stick together during facs sorting, which increases the probability p(d) of having ≥ 2 cells/droplet.
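the two estimates defined above can be evaluated with a few lines of python. this is a minimal sketch and not the authors' implementation: the function name and all input values are hypothetical and serve only to show how p(d), p(i) and p(k) combine; they are not results from this study.

```python
def p_monoclonal(p_d: float, p_ghost: float) -> float:
    """return 1 - p(d) * p(ghost), following the formulas in the text.

    p_d     -- probability of >= 2 cells per well after sorting, p(d)
    p_ghost -- p(i) for the conservative estimate, p(k) for the realistic one
    """
    for name, value in (("p_d", p_d), ("p_ghost", p_ghost)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return 1.0 - p_d * p_ghost


# hypothetical inputs, for illustration only (not data from the study)
P_D = 0.02  # assumed probability of >= 2 cells/well
P_I = 0.04  # assumed probability of ghost cells at imaging
P_K = 0.01  # assumed probability of ghost cells that also grow out

conservative = p_monoclonal(P_D, P_I)
realistic = p_monoclonal(P_D, P_K)
print(f"conservative: {conservative:.4%}")
print(f"realistic:    {realistic:.4%}")
```

because ghost cells that grow out are a subset of all ghost cells, p(k) cannot exceed p(i), so the realistic estimate is never lower than the conservative one.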
therefore, pool cell line 8 with high aggregation levels (≥11.1 %) was used to demonstrate that the probability of monoclonality is still ≥ 99.9 % in case of highly aggregated cell pools on the day of sorting: p(monoclonal, conservative) = 1 - (p(d) x p(i)) > 99.9 %; p(monoclonal, realistic) = 1 - (p(d) x p(k)) > 99.9 %. conclusions in summary, there is no obvious correlation between protein product type and the determined probabilities for monoclonality. furthermore, pools with a viability as low as 36 % and pools with an aggregation level as high as 11.1 % can be used for scc, resulting in acceptable probabilities of monoclonality. background ich guidance [1] requires that any cell line used to produce biopharmaceuticals originates from a single progenitor cell. recently, there has been increased scrutiny of the method(s) used to achieve this requirement. here, we review the suitability of the legacy capillary aided cell cloning (cacc) method in light of this changing landscape of expectations. the cacc method is based on the 'spotting' technique [2] and relies on independent visual confirmation by two scientists of the presence of a single cell in a 1 μl droplet. this method achieves a high probability of monoclonality in one cloning round. although the method has since been replaced by facs single cell deposition for routine use, it remains a viable cloning method. -performed by trained scientists -dilute culture to 1500 ± 500 cells/ml with ≤2% doublets -draw cell suspension into pipette tip by capillary action; tap tip against the centre of the base of each well of a 48 well plate. -size of resulting droplet = ~1 μl (fig. 1a) -two scientists independently view all wells using a microscope (initially use 40x magnification with the entire rim of the droplet visible within the field-of-view; next, examine particles using 100x or 200x magnification to confirm they are cells) and individually record the number of cells present in each well's droplet (fig. 1b to d).
-exclude droplet from further analysis if full visualisation is hindered (fig. 1e to h) . -add growth medium, and incubate plates. record all wells containing colonies; only progress colonies from wells that both scientists agree contains only one cell. -data analysis: -each scientist's observations categorised as: 0 cells, 1 cell or >1 cell -observed outcome for each well: growth or no growth -probability of monoclonality estimated from data using a statistical model cloning (ldc) increased accuracy of p(monoclonality) with cacc -ldc weakness: no visualisation after seeding (to check both well seeding and subsequent growth of colonies is well described by the poisson distribution), potentially overestimating p(monoclonality) -addressed by cacc: visual examination with colonies arising from wells seeded with 1 cell distinguished from those seeded with >1 cell -visualisation step further strengthened by: using controls for exclusion of wells; measuring errors based on the presence or absence of colonies in wells where two scientists independently reported 0 cells; and formally analysing the data using a suitable statistical model decreased time and resource requirements with cacc -high p(monoclonality) possible in single round as each well examined individually with only those containing a single cell progressed, and because the error rate for incorrect scoring is considered to be low two scientists miss a cell one cell sitting on top of another and the two thus appearing as one an experiment was performed to estimate error frequency [3] . 
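as a rough cross-check of the seeding step, droplet occupancy can be sketched with a poisson model. applying the poisson distribution to cacc droplets is an assumption made here for illustration (the abstract invokes it only for ldc); the mean of 1.5 cells per droplet follows from the stated 1500 cells/ml and ~1 μl droplet volume, and the function and variable names are hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """probability of observing exactly k cells when the expected count is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


CELLS_PER_ML = 1500.0  # target culture density from the method description
DROPLET_ML = 0.001     # ~1 microlitre droplet
mean_cells = CELLS_PER_ML * DROPLET_ML  # expected cells per droplet (1.5)

p_zero = poisson_pmf(0, mean_cells)     # empty droplet
p_one = poisson_pmf(1, mean_cells)      # exactly one cell
p_multi = 1.0 - p_zero - p_one          # two or more cells

# share of occupied droplets that hold exactly one cell
p_one_given_occupied = p_one / (1.0 - p_zero)

print(f"p(0 cells)  = {p_zero:.3f}")
print(f"p(1 cell)   = {p_one:.3f}")
print(f"p(>1 cells) = {p_multi:.3f}")
print(f"p(1 cell | occupied) = {p_one_given_occupied:.3f}")
```

under this assumed model fewer than half of the occupied droplets hold a single cell, which is consistent with the method's reliance on inspecting every droplet rather than on dilution alone.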
conclusion -scientists miss a cell infrequently (in the range 0.4% to 1.3%, [3] ) -error frequency does not invalidate use of direct observation methods for cell cloning -single cell seen by both scientists is highly likely to be monoclonal -during method development, strategies established to control potential sources of error ( table 1 ) use of a contemporaneous visualisation approach, a strict control strategy, and a suitable statistical model (which takes into account potential errors) results in: -the cacc method being at least as robust as the ldc method -the cacc method being a reliable, single-step method for cloning to achieve a high p(monoclonality) background vector design is a key step in cell line development for the expression of therapeutic biologics. it is essential that the vector design results in high, stable expression of the encoded protein. other considerations include ease of cloning, stability for propagation in e. coli as well as in the mammalian host cell line, and ease of sequence amplification for verification of vector construction and for detection of insertion site and copy number in stably expressing cells. for these reasons, use of the same promoters and polya tails in dual cassette vectors, as is common for expression of the heavy and light chains of monoclonal antibodies, can be problematic. in order to minimize sequence similarities between the two expression cassettes, we have modified the promoters, introns, and polya tails of the light chain and heavy chain expression cassettes in the dual expression vector commonly used for the expression of therapeutic antibodies in the chozn® gs -/cell line development platform. gene synthesis and vector construction of igg1 and fluorophore-expressing vectors was done by atum. vectors were transfected into chozn® gs -/cells via electroporation. analysis of gfp and rfp expression was achieved using a macsquant instrument. 
selection and generation of stable pools and single cell clones from transfections with igg1-encoding vectors was performed as described in the chozn® platform technical bulletin. titer analysis was performed in static (96 well plate), in a 7 day tpp assay and in a 14 day fed batch assay using a qk fortebio. initial screening experiments identified a lead vector, #39, and a vector, #37, which produced very low titers and relatively few minipools expressing detectable levels of igg1. analysis of gfp and rfp expression from the modified vectors indicated relatively high expression from the rfp/hc expression cassette of vector #37. a stronger promoter resulting in overabundance of hc, known to be toxic to cells, provides a possible explanation for the poor results with this vector. interestingly, swapping the positions of the lc and hc in #37 resulted in a vector, #77, that outperformed the initially identified lead vector (fig. 1). this same change was made to vector #39 without any resulting improvement in titers (vector #78, fig. 1). interestingly, vector #39 had a smaller difference in relative promoter strengths, based on the mean channel fluorescence ratio of gfp to rfp, suggesting that overabundance of hc was not an impediment to igg1 expression from #39. poor titers were also seen with a modified version of vector #39 (vector #79, fig. 1) in which the glutamine synthetase selection cassette was in the reverse orientation. this second screen identified vector #77 as the lead vector design (fig. 1). a full comparative study of vector #77 and the control vector was performed, culminating in the generation and comparison of single-cell clones from each. these studies have demonstrated the equivalence of these vectors in terms of igg1 titer.
this work has resulted in the identification and characterization of a dual expression vector with minimized similarity between the two expression cassettes, easing the cloning, propagation and analysis of vector integration in stable cell lines while maintaining the high, stable expression of the encoded protein of the original vector design. background traditional cell line engineering strategies mainly include an antibiotics resistance selection. in this process, cells are transfected with the goi (gene of interest) together with an antibiotics resistance gene and those cells are selected that survive treatment with the respective antibiotic [1] . although the gene responsible for the survival of the cell is transfected together with the goi, resistance is not necessarily linked to high goi expression. thus, a significant proportion of resistant cells may not express the goi at all, necessitating the search for alternative, more closely linked selection systems. sirnas (silencing inducing rnas) are short, noncoding rnas that can bind to complementary mrna and inhibit their translation. this function has been used in many approaches to silence the expression of certain genes [2] . with their short length, sirnas can be hidden in introns (non-translating regions) of genes, making it possible to couple the expression of a sirna to a gene. this way a cell produces a correlating amount of sirna when transcribing the gene, without adding any further translational burden on the cell. the co-expression of the sirna can be used as a selective marker by one of the following methods: (1) knock-down of a suicide gene to enable a cell's survival after suicide gene mrna transfection, (2) down-regulation of a surface marker which is used in macs (magnetic cell separation) to filter out wanted or unwanted cells, and (3) inhibition of a fluorophore marker for selection using facs without product specific antibodies. 
for sirna based cell selection systems, sirnas replace the commonly used antibiotics resistance as a marker. cells that produce goi will also produce the sirna that protects the cell from a suicide gene. the selection protein (suicide genes, fluorophores, surface markers, etc.) is transfected as mrna and is only expressed during selection. the general process is outlined in fig. 1. (a) the traditional antibiotics resistance marker is replaced by an sirna, which is cotranscribed with the goi. unlike in antibiotic resistance, the marker here is not a protein, reducing the translational burden and providing more resources for goi production [3] . transfection with the suicide gene proved to be 100% lethal within 2 days, with no outgrowth over two weeks. protection by expression of the sirna was shown to be efficient. currently a comparison of stable cell line development programs based on sirna selection and neomycin selection is ongoing. conclusions the novel selection system should speed up cell line development, as the system kills rapidly and directly selects for cells transcribing the product gene on a high level. we expect to see more high producers earlier in the process, which will allow for an easier and faster selection in the following steps. sirna based selection offers great opportunities. by directly selecting based on goi transcription and not a proxy marker, we expect more relevant cells on a pool level. in addition, the elimination of an antibiotics resistance allows more cellular resources for goi production. the system offers multiple ways of application, either by enriching wanted, or depleting unwanted cells. background single-cell cloning is an essential step used in the upstream development of transformed cell lines for therapeutic protein production. 
while single-cell clones are typically used to ensure product consistency, such low cell density cultures present a survival challenge; cells grow more slowly or may even not survive at low densities in protein-free media, costing the industry time and money and limiting the pool of candidate colonies for choice of production clones [1, 2] . to address this problem, we aimed to develop a highly efficient serum-free medium suitable for optimising single-cell cloning efficiency by studying a range of conditioned media (cm) samples isolated from different chinese hamster ovary (cho) cell lines. materials and methods cho-s, dg44 and cho-k1 were adapted to cho-s sfm-ii (gibco) medium for a minimum of three passages. conditioned media was then collected when the cultures reached a cell density of 1x10 6 cells/ml (typically day 2-day 3 depending on the growth profiles of each cell line and whether they grew in suspension or attached conditions). samples were then centrifuged twice to remove cell pellet/debris and stored at -20°c. the ability of conditioned media to support cho colony formation was then assayed using 96-well plates, seeding the cells at low cell density (1-10cells/well) by diluting down cho cultures in media/conditioned media. after incubation at 37°c for 10 days, cloning efficiency was assayed using a standard xtt assay. initial screening of the nine cm samples was performed using cho-k1 cells due to their widespread use in industrial antibody production. successful media candidates were subsequently screened using additional cho cell lines. table 1) . the k1-sfmii-cm product improved cell cloning efficiency for dg44 cells (avg. increase>1.5-fold) and cho-s cells (avg. increase>3-fold) ( fig. 1) and also the adherent cho-k1 cell line growing in atcc +5%fbs. the ability of conditioned media to support cho growth in limiting-dilution conditions (1, 6 and 10 cells/ml) was investigated. 
from a range of nine conditioned media samples, four promising products have been identified which improve low-cell-density growth of cho-k1 cells compared to sfm-ii control medium. we expect that these early-stage conditioned media products may increase cloning efficiencies during upstream cho cell line development, resulting in financial savings for industry and increasing the chances of identifying particularly high-performing transformed clones. the main rate-limiting step in the upstream stages of protein biomanufacture is the isolation of stable, high producing cell clones. ubiquitous chromatin opening elements (ucoe®s) consist of at least one promoter region with associated methylation-free cpg island from housekeeping genes; they possess a dominant chromatin opening capability and thus confer stable transgene expression. ucoe®-viral promoter (e.g. cmv) based plasmid vectors markedly reduce the time it takes to isolate high, stably producing cell clones. although some ucoe®-viral promoter combinations have been tested, they have not been thoroughly evaluated in chinese hamster ovary (cho) cells. plasmid vectors containing combinations of either the human hnrpa2b1-cbx3 ucoe® (a2ucoe®) or murine rps3 ucoe® linked to different viral promoters (hcmv, gpcmv, sffv) driving expression of an egfp reporter gene were functionally analysed by stable transfection into cho-k1 cells; expression was analysed by flow cytometry, and vector copy number was determined by qpcr. the results at 21 days post-transfection and selection clearly indicate that the rps3 ucoe®-gpcmv and -hcmv combinations give the highest transgene expression, as shown in fig. 1. the a2ucoe®-hcmv/gpcmv constructs were the next most efficacious but gave 2-fold lower expression than the rps3 ucoe® vectors. the sffv promoter linked with either of the two ucoe®s was the least effective, with expression levels 17-fold lower than the rps3-cmv constructs.
the rps3 ucoe®-gpcmv/hcmv constructs are now being further modified to include elements that will provide optimal post-transcriptional pre-mrna processing (splicing, polyadenylation, transcription termination, mrna stability) thereby maximising stable cytoplasmic transgene mrna levels and protein production. in the last 20 years, growing number of innovator biologics and biosimilars have formed a competitive environment, where speed and efficiency of generating robust and highly productive cell lines needs to be improved continuously. through various advances, especially in media development and process optimization, product titers as high as 10 g/l were achieved in the pharmaceutical industry (kim et al., 2012) for standard products such as monoclonal antibodies. nevertheless, other proteins e.g. bispecific antibodies, fc-fusion proteins or fab-related products are difficult-to-express (dte) in chinese hamster ovary (cho) and may result in delays or even in termination of the cell line development process. we developed a new robust pool generation approach (cld 2.0) addressing both, easy-and difficult-to-express molecules, while reducing timelines down to 5 months (cld standard = 6 months), improving reliability of cell line development as well as clearly increasing obtained titers. in order to create stable cell lines, we transfected our cho dg44 host cells by electroporation. cells processed using the standard approach were cultivated in selective medium or medium containing additional 2.5 nm methotrexate (mtx) for three weeks. after an amplification step with 30 nm mtx for three weeks, stable individual cell pools were expanded and clones were generated by facs-sorting. clones were analyzed for growth performance and product concentration in fed-batch studies. in our new cld 2.0 approach, we increased mtx concentrations (2.5 nm, 5 nm and 10 nm mtx) during the first selection phase of three weeks. afterwards we omitted the 30 nm mtx amplification step. 
thereby, pool generation finished four weeks earlier than in the standard approach. to evaluate the stability of cell clones derived from mini pools (mps) generated according to the cld 2.0 approach, stability studies were performed for eight weeks, including stability fed batches at t=2 weeks and t=8 weeks. altogether, three different proteins of interest with six cell clones each were tested. we adapted our cell line development process by increasing the initial selection pressure during the first selection phase, thereby allowing the omission of the 30 nm mtx amplification step. we observed that amplifiability varied between products. cell lines with a protein titer ranging from >1 g/l to 1.5 g/l (dte) in shake flask fed-batch proved to be more susceptible to increased initial mtx levels and were thus not amplifiable with 30 nm mtx. in contrast, cell lines with a high protein titer of >1.5 g/l were observed to adapt to 30 nm mtx easily and were amplifiable. final shake flask fed-batch data with cld 2.0 clones of high-expressing products showed titers comparable to those of clones from the standard approach. cld 2.0 clone titers for dte proteins showed on average a 2.0-fold increase compared to clones generated with the standard approach. titers of top producing clones were in a range of 1.8 g/l to 2.7 g/l (fig. 1). furthermore, stability data of cld 2.0 cell clones from different dte products showed a stable specific productivity in a range of +/-15 % over eight weeks of cultivation. fed-batch titers from t=2 weeks and t=8 weeks were in a normal range of +/-20 % of the standard 30 nm projects. our results demonstrate that cld 2.0 is a robust and reliable process for standard products (mab) and dte proteins. with our new process, we were able to increase the titer of difficult-to-express proteins up to 200%. with the amplification step (30 nm mtx) omitted, 96 % of generated clones were stable over eight weeks of cultivation.
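the +/-15 % stability window can be expressed as a simple check. this is a minimal sketch under stated assumptions: the abstract does not spell out the exact stability criterion, so comparing specific productivity at t=8 weeks against t=2 weeks is an assumption, and the function names, clone names and values are hypothetical rather than data from the study.

```python
def relative_change(reference: float, value: float) -> float:
    """fractional change of `value` relative to `reference`."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return (value - reference) / reference


def is_stable(qp_start: float, qp_end: float, tolerance: float = 0.15) -> bool:
    """true if specific productivity stays within +/- tolerance of its start value."""
    return abs(relative_change(qp_start, qp_end)) <= tolerance


# hypothetical specific productivities (arbitrary units) at t=2 and t=8 weeks
clones = {
    "clone_a": (20.0, 18.5),  # -7.5 %
    "clone_b": (15.0, 11.0),  # about -26.7 %
    "clone_c": (30.0, 33.0),  # +10 %
}

stable = {name: is_stable(start, end) for name, (start, end) in clones.items()}
stable_fraction = sum(stable.values()) / len(stable)

for name, flag in stable.items():
    print(f"{name}: {'stable' if flag else 'unstable'}")
print(f"stable fraction: {stable_fraction:.0%}")
```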
additionally, using the cld 2.0 approach, the timeline from dna to rcb was reduced to 5 months. background cho cells have become the most popular platform for production of therapeutic proteins [1]. however, the generation of high-producer cells is a time-consuming and labor-intensive process that requires the screening of a large number of cells to obtain a clone with high titer and stability. since the expression titer and stability of a clone are highly dependent on the site of integration, we demonstrated a new cell line development strategy that uses ngs to identify integration sites and crispr/cas9 to generate targeted-integration high-producing cell lines [1, 2]. to identify high-expression sites in cho cells, we employed ngs to analyze the integration sites of a high-producing cell line (titer > 3 g/l). paired-end reads with one read mapped to the vector and the other read mapped to the cho reference genome were extracted to identify the integration sites. to test the expression activity of the integration sites, we employed crispr/cas9 to specifically integrate the antibody gene into the cho genome for expression. our data showed 4 integration sites in the high-producing cell line. among the 4 integration sites, the is1 site was tested by crispr/cas9 for targeted integration of the antibody gene. the is1-targeted cell pool showed a higher expression titer than cell pools generated by targeted integration into other integration sites (fig. 1a). the single cell clones derived from the is1-targeted cell pool had a low copy number of the goi (fig. 1b). after normalization to copy number, the single cell clones derived from the is1-targeted cell pool showed a high titer per copy (123~583 mg/l/copy) (fig. 1c). this study demonstrated the generation of high-producing cell lines by crispr/cas9-mediated targeted integration. this approach requires less time and labor than the traditional method.
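the read-pair filter described above (one mate on the vector, the other on the host genome) can be sketched as follows. this is an illustrative outline, not the authors' pipeline: the record layout, reference names, bin size and support threshold are all hypothetical, and a real analysis would read these fields from an alignment file.

```python
from collections import Counter
from typing import NamedTuple, Optional


class ReadPair(NamedTuple):
    """mapped reference name and position for both mates (hypothetical layout)."""
    ref1: str
    pos1: int
    ref2: str
    pos2: int


VECTOR = "vector"  # hypothetical name of the expression vector reference


def genome_anchor(pair: ReadPair, vector_name: str = VECTOR) -> Optional[tuple]:
    """return (contig, position) of the genome mate if exactly one mate maps to the vector."""
    on_vector_1 = pair.ref1 == vector_name
    on_vector_2 = pair.ref2 == vector_name
    if on_vector_1 == on_vector_2:  # both on the vector or both on the genome
        return None
    return (pair.ref2, pair.pos2) if on_vector_1 else (pair.ref1, pair.pos1)


def candidate_sites(pairs, bin_size: int = 1000, min_support: int = 2) -> dict:
    """count vector/genome pairs per genomic bin and keep bins with enough support."""
    counts = Counter()
    for pair in pairs:
        anchor = genome_anchor(pair)
        if anchor is not None:
            contig, pos = anchor
            counts[(contig, pos // bin_size * bin_size)] += 1
    return {site: n for site, n in counts.items() if n >= min_support}


# hypothetical alignments, for illustration only
pairs = [
    ReadPair("vector", 120, "scaffold_7", 45210),
    ReadPair("scaffold_7", 45390, "vector", 88),
    ReadPair("scaffold_7", 100, "scaffold_7", 400),  # genome/genome, ignored
    ReadPair("vector", 10, "vector", 300),           # vector/vector, ignored
    ReadPair("vector", 500, "scaffold_12", 9100),    # single pair, below threshold
]
sites = candidate_sites(pairs)
print(sites)
```

requiring more than one supporting pair per bin is a design choice made here to suppress isolated chimeric reads; the threshold would need tuning to sequencing depth.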
the active integration site will serve as a platform like a cassette player for therapeutic antibody production. background cho, hek and sp2/0 are the dominant host cells for biologics drug production. achieving high level of recombinant protein production by these cell lines still remains a challenge. in order to understand the potential roles of lipids in protein production, secretion, vesicular transport and energy metabolism, we coupled high-throughput transcriptomics and lipidomics technologies. quantitative lipidomics is an emerging 'omics technology which can help us understand the physiological limitations of each cell line. the two types of major lipid groups in cells are non-polar and polar lipids. polar lipids such as glycerophospholipids (pls) include phosphatidylethanolamine (pe), phosphatidylcholine (pc), phosphatidylinositol (pi), phosphatidylserine (ps), phosphatidylglycerol (pg), and phosphatidic acid (pa). in this study; we integrated two dimensional high performance thin layer chromatography (2d-hptlc) and mass spectrometry (ms) lipid analysis of sp2/0, cho, and hek cell lines to understand the major differences in the lipid content of these hosts. bligh-dyer method was used to extract the lipids and extracts were analyzed by hp-tlc and ms. the polar lipids were separated into different categories by 2-d hp-tlc using a chcl 3 -meoh-h 2 o (71:25:2.5, v/v/v) solvent system in the first dimension and a chcl 3 -meoh-acetic acid-h 2 o (76:9:12:2, v/v/v/v) solvent system in the second dimension. non-polar lipids were separated by 1-d hptlc using hexane-diethyl ether-acetic acid. 2,7-dichlorofluorescein dye was used to visualize both polar and non-polar lipids. further detailed analysis was performed on a qqq mass spectrometer (thermo tsq vantage, san jose, ca) using negative-ion and positive-ion esi modes as well as negative-ion esi mode in the presence of lithium hydroxide. 
in this study, quantitative lipidomics was coupled with transcriptomics to further understand the physiological pathways of hek, cho-m and sp2/0 cells. initial hp-tlc analysis indicated that major lipids in these industrial cell lines were pe and pc. other polar lipids such as pi, ps, pg, pa, and sm were lower compared to pc and pe in exponential and stationary phases of each cell line. figure 1 represents 2d hp-tlc results of hek with the relative quantitation of polar lipids. in order to investigate the lipid subgroups, shotgun ms analysis was conducted for both exponential and stationary growth phases of the three cell lines. ms analysis indicated that lysophosphatidylethanolamine (lpe) and lyso-phosphatidylcholine (lpc) amounts were 4 -10 fold and 2-4 fold higher in hek cells compared to sp2/0 and cho cell lines. sphingomyelin (sm) was another lipid subgroup that was shown to have a major difference between sp2/0 and other mammalian cell lines. sm was 30-65 fold lower in sp2/0 cell line compared to cho and hek. to understand these metabolic differences, transcriptomics analysis using illumina highseq and gene expression omnibus was conducted on these mammalian cells. the kyoto encyclopedia of genes and genomes (kegg) database was used to map the transcriptomics data to the lipid synthetic pathways. transcriptomics data mapping to kegg pathways demonstrated that differences in lpe and lpc pathways correlate with the expression profiles of secretory phospholipase a2 (spla2), lysophospholipid acyltransferase (lpeat), lysophosphatidylcholine acyltransferase (lpcat), and lysophospholipase (lypla) [1] . the hp-tlc and lc/ms findings demonstrated that high levels of lpe and lpc existed in the hek cell line and low levels of sm were observed in the sp2/0 cell line. 
coupling lipidomics with transcriptomics provides us with an improved understanding of the physiological differences across sp2/0, cho, and hek cell lines that could be used to guide cell engineering efforts with the goal of increasing the recombinant protein expression capabilities of these three cell lines. biopharmaceuticals are a class of biological macromolecules that include antibodies and antibody derivatives, generally produced from cultured mammalian cell lines via secretion directly into the media. manufacturing at medimmune requires the generation of chinese hamster ovary (cho) clonal cell lines capable of producing the biopharmaceutical product at commercially relevant quantities with optimal product quality. the isolation of cell clones based on random single cell deposition via fluorescence activated cell sorting (facs) provides a heterogeneous panel of expressers. we hypothesize that the application of facs to provide an additional sorting step based on desirable cell attributes that correlate with productivity, product quality or cell growth attributes could lead to the isolation of higher producing cell lines with enhanced product quality attributes. a panel of 20 cell lines expressing a model recombinant monoclonal antibody were characterised in terms of growth, productivity, intracellular recombinant protein and mrna amounts. assays were also developed to investigate cell attributes using the commercially available imagestream instrument, an imaging flow cytometer, which enables the investigation of cellular characteristics that correlate with cell productivity at the single cell level. characterisation revealed the cell lines exhibited a range of values for productivity, growth, and intracellular (ic) antibody mrna and protein expression, ideal for further imagestream characterisation. 
western blot and qrt-pcr analysis demonstrated that final titre correlated with both ic heavy chain (hc) protein and mrna amounts (pearson correlation coefficient (pcc) = 0.70 and pcc = 0.80, respectively). to assess productivity at the single cell level, assays multiplexing ic hc protein and mrna with cell attributes were therefore developed. initial assay development focusing on hc mrna and protein amounts has revealed interesting results; four cell lines displayed two distinct populations, one producing the antibody and another nonexpressing population. the ratio of these populations varied amongst the cell lines. images obtained from the imagestream have shown the cellular localization and expression of hc and lc message and protein (fig. 1) . for both message and protein, hc and lc colocalize in the cell. whether there is any relationship between ic hc protein and cell attributes at the single cell level was then also investigated, as well as correlations with cell culture parameters at the population level. at the population level, correlations were found between titre and ic hc protein and mrna (pcc = 0.84 and pcc = 0.79, respectively) confirming the data obtained by western blot and qrt-pcr analysis. a panel of 20 cell lines has been characterised at the population level and show a wide range of antibody expression profiles at both the mrna and protein level. in parallel, assays have been developed for the imagestream to measure hc and lc message and protein amounts at the single cell level. protein and message quantification with the imagestream are consistent with more traditional approaches, such as western blots and qrt-pcr, that operate at the population level. the developed assays are now being used to investigate single cell productivity attributes and for the isolation of more productive clones. background productivity and stability are key factors for the selection of cell line in protein drugs production. 
a large number of target gene copies integrated in the cell genome can lead to production instability. therefore, cells with low copy numbers of the target gene integrated at high-yield sites could be ideal production cells for manufacturing. the transposon system is known to control the integrated copy number of the target gene and to generate high-yield producing cells; it could therefore be a good approach for generating stable, high-yield producing cell lines carrying low copy numbers of the target gene. we aimed to develop a platform to generate high-yield producing cell lines carrying 1-2 copies of the integrated target gene using a transposon system. two cho cell lines, cho-s and dxb11, were used. cells were co-transfected with transposon and target gene expression plasmids. after drug selection, the cell pool with the highest productivity per target gene copy was subjected to single cell cloning. the productivity and copy number of the cell clones were determined, and the stability of the cell clones was analysed after about 60 generations in culture. in the stable pools of cho-s and dxb11 cells, the productivities per integrated target gene copy were about 11-13 mg/l/copy and 68-75 mg/l/copy in a batch culture, respectively. after single cell cloning, the integrated copy numbers in most cell clones were less than three copies per cell. in cho-s and dxb11 cell clones, the productivities per integrated target gene copy were 20-60 mg/l/copy and 60-150 mg/l/copy in a batch culture, respectively. the productivity per integrated target gene copy in cell clones developed with the transposon system was much higher than that in cell clones developed by random integration (fig. 1a and b). to evaluate the productivity stability of cell clones developed with the transposon system, ten cell clones were analysed at generations 0, 30, 60, and 100.
of interest, about 80% of cell clones were stable at generation 60, but lost productivity at generation 100 (fig. 1c), implying that most cell clones could remain stable within a 2-month period. using the optimized conditions of the transposon system to develop stable gene expression cells, the productivity per integrated target gene was higher than with random integration. these results suggest that our platform is capable of developing high-yield producing cells with 1-2 copies of the integrated target antibody gene and can be applied to identify high-yield integration sites. background mammalian cells show an inefficient metabolism characterized by high glucose uptake and the production of high amounts of lactate, a widely known growth-inhibiting by-product [1]. recently, we have observed a different glucose-lactate metabolism in some cell lines. while some cell lines are unable to metabolize lactate, others can co-metabolize glucose and lactate simultaneously under certain culture conditions, even during the exponential growth phase [2]. these metabolic differences between different mammalian cell lines (cho, hek293 and hybridoma) have been studied by means of flux balance analysis (fba). three different cell lines were cultured in a 2-liter bioreactor: cho-s, hek293sf and hybridoma kb-26.5. for the fba, two adapted genome-scale metabolic models were used: a reconstruction of mus musculus for cho and hybridoma [3], and a reconstruction of the human metabolic model (recon 2) for hek293 [4]. in cultures where ph was not controlled, two different metabolic phases were observed for cho and hek293 cells. during the first phase both cell lines produced large amounts of lactate as a consequence of the high glucose consumption rates. interestingly, when ph dropped below 6.8, due to lactic acid secretion and accumulation, a second metabolic phase was identified, in which concomitant consumption of glucose and lactate was observed even during the exponential growth phase.
conversely, hybridoma cells were unable to co-consume lactate and glucose even under non-controlled ph conditions. therefore, the hybridoma physiological data used for the fba corresponded only to phase 1 of ph-controlled cultures. a summary of the main cell growth and metabolic parameters obtained from the different experiments performed is presented in table 1. fba shows (fig. 1 for hek293 cell culture) that lactate is produced in phase 1 because pyruvate has to be converted to lactate to fulfill nadh regeneration in the cytoplasm, and only a small amount of pyruvate can be transported into the tca cycle through acetyl-coa. cell metabolism in phase 1 is highly inefficient, as the majority of the carbon source is not used for the generation of either energy or biomass. in phase 2, in which mitochondrial ldh was considered, tca fluxes could be maintained as in phase 1 at the maximal rate encountered; hence, the energy available for cells to grow was similar in both phases, yielding a similar growth rate. two different glucose and lactate metabolism behaviors have been observed in cho and hek293 cultures depending on the culture conditions: phase 1) glucose consumption and lactate production, and phase 2) simultaneous glucose and lactate consumption. in contrast, only phase 1 was observed in hybridoma cultures even when ph was not controlled. fba showed that tca fluxes in phase 1 and phase 2 were similar, yielding a similar cell growth rate, but the glucose uptake rate was much lower in phase 2 due to lactate co-consumption. some authors hypothesize that cells metabolize extracellular lactate as a strategy for ph detoxification [2]. glucose and lactate co-metabolization resulted in a better-balanced cell metabolism, as can be seen from the metabolic fluxes calculated, with minor effects on cell growth.
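for reference, flux balance analysis is usually posed as a linear program over the steady-state flux vector. the formulation below is the generic textbook form and not the specific set of constraints used in this work; the objective vector c (for example, biomass formation) and the flux bounds are assumptions, since the abstract does not state them.

```latex
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}
\end{aligned}
```

here S is the stoichiometric matrix (metabolites by reactions) of the genome-scale model and v is the vector of reaction fluxes. measured rates such as glucose uptake and lactate secretion or uptake would typically enter as bounds on the corresponding exchange fluxes, so that phase 1 and phase 2 differ in the sign allowed for the lactate exchange flux.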
the observation of glucose and lactate co-consumption, and its deeper study and characterization, could open the door to novel culturing strategies aimed at increasing bioprocess productivity. background transient protein expression in mammalian cell lines has gained increasing relevance as it enables fast and flexible production of high-quality eukaryotic protein. considerable efforts have thus been made to overcome existing limiting aspects of transient gene expression systems, in terms of cell lines, cell culture-based systems, and protein production in a cost-effective manner. milligram amounts of protein per liter can be produced within several days, allowing a significant shortening of the bioproduction process in comparison to protein production from stable clones. to ensure the robustness of the process, it is essential to have a reliable and easy-to-use transfection method. to address the need for a reliable transfection reagent, we developed peipro®, the only commercially available pei optimised for mid to large-scale transient protein production during process development. peipro® is a non-polydisperse and fully characterised polymer that has become the gold standard pei due to its reliability and reproducibility in high dna delivery efficiency and in the ensuing high protein production yields. here, we present experimental data showing the benefits of using peipro® for protein production in comparison to other peis. we further demonstrate the compatibility of peipro® with recombinant protein production in the most commonly used chemically defined media. materials and methods suspension hek-293 and cho cells were cultured in shaker flasks in various synthetic media, as listed in table 1. hek-293 and cho cells were resuspended at 1×10^6 cells/ml of serum-free medium on the day before transfection.
cells were transfected with 0.5-1 mg of plasmid dna encoding the luciferase reporter gene using peipro®, pei "max" and l-pei 25 kda (polysciences, warrington, pa) resuspended at 1 mg/ml according to the manufacturer's recommendation. protein expression of the luciferase reporter gene was assayed 48 hours post-transfection by affinity chromatography using protein g (hplc). comparison of peipro® to other commercially available peis was achieved by transfecting suspension hek-293 and cho cells with plasmid dna encoding the luciferase reporter gene. luciferase production yields obtained in hek-293 and cho cells were at least 5-fold and 10-fold higher, respectively, when using a similar amount of peipro® in comparison to the other peis (fig. 1). furthermore, peipro® was the only pei that led to similar luciferase production yields when decreasing the amount of plasmid dna per liter of cell culture. conversely, at least 1 mg of plasmid dna and 4-fold more of pei "max" and l-pei 25 kda were needed to obtain a similar luciferase expression range in both hek-293 and cho cells. we further assessed the compatibility and versatility of peipro® by measuring protein production yields obtained in the most commonly used animal-free synthetic media. as shown in fig. 2, peipro® leads to high protein production yields in several commercially available media formulations for hek-293 and cho cell lines. peipro® is the only fully characterised pei transfection reagent that is suitable for reliable and reproducible recombinant protein production, irrespective of the scale of production and of the type of adherent and suspension cell culture system. fig. 1 (abstract p-274). peipro® requires less reagent and similar or lower dna amounts compared to other peis. suspension hek-293 and cho cells were seeded at 1×10^6 cells/ml in serum-free medium and transfected with peipro®, pei "max" and l-pei 25 kda (polysciences, warrington, pa) resuspended at 1 mg/ml.
luciferase expression was assayed 48 h after transfection using a conventional luciferase assay. fig. 2 (abstract p-274). peipro® is optimized for transfection of hek-293 and cho cells in several specific synthetic culture media. suspension hek-293 and cho cells were seeded following the recommended protocol in serum-free media and transfected with peipro® using the standard conditions. igg3-fc production was assayed 48 h after transfection using protein g affinity quantification (hplc). monoclonal antibodies (mabs), which are widely used in anticancer therapies, are mainly produced by mammalian cell lines. mab conjugation to biological molecules to enhance their antitumor activity offers a powerful new tool for anticancer therapies. we have assessed the production of the commercially approved anti-her2 therapeutic antibody trastuzumab (tzmb) [1] and also its fusion with interferon-α2b (ifnα2b). two cloning strategies, consisting of transfecting cho-s and hek293 cell lines with two bicistronic plasmids or with a single tricistronic plasmid, have been assessed. the in vitro efficacy of both antibodies has been tested and compared side by side. tzmb heavy and light chains were cloned in two bicistronic plasmids (pirespuro3 and piresneo3, clontech) and in a tricistronic plasmid derived from pirespuro3. ifnα2b was spliced to the tzmb heavy chain by overlap extension pcr, and the resulting tzmb-ifnα2b fusion protein was also cloned in the expression vectors in the same way as non-modified tzmb. selected cell pools were cultured in 125 ml shake flasks containing sfmtransfx supplemented with 10% v/v of cell boost 5 (hyclone), 4 mm of glutamax (gibco) and 2 μg/ml of puromycin, and also with 700 μg/ml neomycin in the case of the cells transfected with piresneo3. cells were cultivated in the same conditions as described elsewhere [2]. purified products (using protein a chromatography (hitrap mabselect sure, äkta avant 150)) were quantified by both elisa and sds-page.
an antigen binding test was performed in the sk-br-3 breast cancer cell line by means of flow cytometry analysis. the biological activity of the different candidates was tested with an mtt assay. both tzmb and the fusion protein tzmb-ifnα2b have been successfully expressed in cho-s and hek293 cells, whose use for heterologous protein expression has been optimized in prior work [3]. the tricistronic strategy was the most efficient, showing a 3.5-fold increase in productivity with respect to the bicistronic double transfection for tzmb in cho-s cells and a 5-fold increase in hek293 cells (fig. 1a). in the case of tzmb-ifnα2b, the tricistronic strategy also achieved higher productivities than the bicistronic one (fig. 1b). regarding the differences in specific productivity between the two cell lines tested, hek293 emerged as the best production host candidate for both strategies (tricistronic and bicistronic) and for both proteins, showing a 1.5-fold increase in productivity with respect to cho-s cells for tzmb using the tricistronic strategy. tzmb and tzmb-ifnα2b were analysed in terms of their antigen binding capacity, and both were found to bind efficiently to her2+ sk-br-3 cells (fig. 1c). thus, the antibody affinity for the her2 antigen was not affected by fusion to ifnα2b. finally, the antiproliferative activity of tzmb and tzmb-ifnα2b was assessed on the same sk-br-3 cells. at a concentration of 500 nm of tzmb, and after a 72-hour incubation, sk-br-3 cells showed 83% growth with respect to the untreated control. however, no antiproliferative effect was observed for tzmb-ifnα2b (fig. 1d). the tricistronic strategy provides higher productivity in hek293 and cho-s cell lines for both recombinant proteins (trastuzumab and tzmb-ifnα2b). regarding which cell line is the best production host candidate, hek293 achieved higher productivity than cho-s cells for the two proteins tested.
all constructs preserved binding affinity for the antigen: trastuzumab and tzmb-ifnα2b both bind efficiently to the her2 antigen present in sk-br-3 cells. finally, tzmb-ifnα2b does not show an improved antiproliferative effect with respect to trastuzumab when compared by means of an in vitro assay. fig. 1 (abstract p-276). expression of trastuzumab (a) and trastuzumab-ifnα2b (b) from the bicistronic strategy (bc) and the tricistronic strategy (tri) with cho-s and hek293 cells. relative specific productivity units are used for comparing the different strategies. (c) antigen binding analysis of trastuzumab and trastuzumab-ifnα2b. (d) antiproliferative activity of trastuzumab and trastuzumab-ifnα2b on sk-br-3 cells. the genetic engineering of patient-specific t cells with lentiviral vectors (lvv) expressing chimeric antigen receptors (car) for late phase clinical trials requires the large-scale manufacture of high-titer vector stocks. the state-of-the-art production of lvv is based on 10- to 40-layer cell factories transiently transfected in the presence of serum. this manufacturing process is severely limited by its labor intensity, open-system handling operations, its requirements for significant incubator space, plus costs and patient risk due to the presence of serum. to circumvent these limitations, this study aims to develop a stable and serum-free process to produce lvv with pei-mediated transfection. in addition, this study also focuses on the development of a production system using not only a gfp marker but also a therapeutically relevant transgene (cd20-car) [1]. therefore, three different cell lines (hek 293, 293t, 293ft) were investigated concerning their lvv productivity and their growth behavior in the in-house serum-free medium transmacs. as part of this, design of experiment was used to investigate the optimal conditions for pei/dna transfection.
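as an illustration of how such a design of experiment can be laid out, the sketch below enumerates a full factorial design over three transfection factors. the type of design actually used is not stated in the abstract, so the full factorial layout, the factor names and all levels shown are hypothetical.

```python
from itertools import product

# Hypothetical factors and levels for a PEI/DNA transfection screen.
factors = {
    "pei_to_dna_ratio": [1.0, 2.0, 3.0],
    "dna_ug_per_1e6_cells": [0.5, 1.0],
    "cell_density_1e6_per_ml": [1.0, 2.0],
}


def full_factorial(factor_levels):
    """Return one dict per run, covering every combination of levels."""
    names = list(factor_levels)
    return [dict(zip(names, combo)) for combo in product(*factor_levels.values())]


runs = full_factorial(factors)
for run_id, run in enumerate(runs, start=1):
    print(run_id, run)
```

a full factorial design grows multiplicatively with the number of levels (3 × 2 × 2 = 12 runs here), which is why fractional or response-surface designs are often preferred once more factors are added.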
furthermore, this statistical approach was used to find an ideal ratio between the 3rd generation plasmids (transfer plasmid cd20-car or gfp, envelope plasmid, packaging plasmids). in addition, different enhancers (sodium butyrate, lithium acetate, caffeine, trichostatin a, cholesterol, hydroxyurea, valproic acid) were investigated for their effects on productivity by comparing hek cultures producing lvv encoding the gfp marker or cd20-car. concerning productivity and growth behavior, hek 293t was the favored cell line for our serum-free lv manufacturing process. in addition, an additive screen revealed that sodium butyrate alone had the most promising effect on both gfp-lvv and cd20-car-lvv production. after pei/dna titration, we could finally increase lvv productivity by lowering the pei/dna amount at higher cell densities relative to our standard transfection protocol. furthermore, the titration for the optimal plasmid ratio revealed that large transfer constructs require higher amounts of transfer plasmid than smaller constructs to achieve high productivity (fig. 1). the outcome of these experiments enabled the development of a robust hek293t-based process to produce clinically relevant lvv under serum-free conditions. furthermore, it provides insight into how therapeutic genes and the expression of the transgene can influence cell productivity. led to a vast increase in productivity, cho cells yield less than other expression systems like yeast or bacteria [1]. to improve yields and find beneficial bioprocess phenotypes, genetic engineering plays an essential role in recent research. the mir-23 cluster with its genomic paralogues (mir-23a and mir-23b) was first identified as differentially expressed during temperature shift, suggesting its role in proliferation and productivity [2]. the common approach to deplete mirnas is the use of a sponge decoy, which requires the introduction of reporter genes.
as an alternative, this work aims to knock down mirna expression using the recently developed crispr/cas9 system, which does not require a reporter transcript. this system consists of two main components: the single guide rna (sgrna) and an endonuclease (cas9) which induces double strand breaks (dsbs). these dsbs can result in insertions or deletions (indels) of base pairs which can disrupt mirna function and processing [3]. a cho-k1 cell line stably expressing an igg was used for knockdown experiments. sgrnas were designed to target the seed region of each mirna member and stable mixed populations were generated (fig. 1a). total rna from each mixed population was reverse transcribed into cdna using mirna-specific stem-loop primers. the expression was quantified by rt-qpcr. to further analyse the range of indels, the mir-23a and mir-23b clusters were amplified by a standard high-fidelity pcr. amplicons were cloned into the pcr™-topo® vector and positive clones were analysed by sanger sequencing. cell growth was monitored using viacount™ viability stain on a guava™ benchtop flow cytometer. productivity was assessed by elisa. student's t-test was used for statistical analysis. it was shown that mirna expression was significantly reduced in mixed populations. a knockdown of up to 95% was achieved for mir-23a, mir-23b and mir-24. the knockdown in mir-27a and mir-27b expression was considerably less, between 70 and 90% (n=3, * p ≤ 0.05, ** p ≤ 0.01, *** p ≤ 0.001) (fig. 1b). furthermore, it was shown that indels of various sizes were generated by targeting the seed region. smaller indels (+1/+2/-1/-2 bp) seemed to be more common, but larger deletions were detected as well (fig. 1c). cells depleted of mir-23a, mir-23b and mir-27b showed increased viability in late stages of the culture. depletion of mir-27a reduced growth significantly, whereas knockdown of mir-24 increased proliferation and boosted igg titers (table 1).
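knockdown percentages of this kind are commonly derived from rt-qpcr threshold cycles. the sketch below uses the 2^-ΔΔct method, which is an assumption here: the abstract does not state which quantification model or reference transcript was used, and all ct values shown are hypothetical.

```python
def relative_expression(ct_mirna_kd, ct_ref_kd, ct_mirna_ctrl, ct_ref_ctrl):
    """Relative miRNA expression (knockdown vs control) by the 2^-ddCt
    method, assuming roughly 100% amplification efficiency."""
    delta_kd = ct_mirna_kd - ct_ref_kd        # normalise knockdown sample
    delta_ctrl = ct_mirna_ctrl - ct_ref_ctrl  # normalise control sample
    return 2.0 ** -(delta_kd - delta_ctrl)


def knockdown_percent(rel_expr):
    """Percentage reduction in expression relative to the control."""
    return (1.0 - rel_expr) * 100.0


# Hypothetical Ct values: the target miRNA appears 4 cycles later in the
# knockdown pool, while the reference transcript is unchanged.
rel = relative_expression(28.0, 18.0, 24.0, 18.0)
kd = knockdown_percent(rel)
print(f"relative expression {rel:.4f}, knockdown {kd:.2f}%")
```

each additional cycle of delay corresponds to a halving of template under the efficiency assumption, so a shift of about 4.3 cycles would correspond to a knockdown of roughly 95%.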
in this work, we have shown that crispr/cas9 can be successfully applied as a tool to knock down mirna expression in cho cells. the data were generated using mixed pools, and it remains to be established whether both alleles can be successfully targeted, e.g. using next-generation sequencing of individual clones. background chinese hamster ovary (cho) cells are the most widely used host cell line for the production of therapeutic antibodies. pre- and post-translational modifications and optimization of culture methods have contributed to increased productivity, resulting in very high titres [1, 2]. however, it has been pointed out that the intracellular secretion process is a bottleneck in the production of therapeutic antibodies [3]. in addition, the details of the process of secretion of humanized recombinant antibodies from cho cells have not been well investigated. in this study, we thus analysed the detailed process of secretion of therapeutic antibodies using cho cell lines, which have already been established as high producers, with the aim of obtaining information for the more rational and efficient establishment of high-producer cells. we performed 1) chase assay, 2) immunofluorescent microscopy observation, and 3) size exclusion chromatography (sec) analysis to investigate the duration of secretion, bottleneck position, and formation of recombinant igg, respectively. high-producer cho cells expressing humanized igg1 [4] and igg3 were used. for the chase assay, cells were cultivated in shake flasks with serum-free medium containing 50 μg/ml cycloheximide (chx) to stop nascent peptide synthesis. the amounts of igg both remaining in the cell and secreted into the medium at each time point were measured by quantitative western blotting. for immunofluorescent microscopy observation, cells were cultivated on coverslips with chx for 4 h.
immunofluorescent staining against the recombinant igg, endoplasmic reticulum (er), and golgi apparatus was performed after chemical fixation. for sec, cells cultured with chx were re-suspended in a buffer containing triton x-100 and injected into a column. the amount of igg in each fraction was measured by quantitative western blotting. the amount of igg3 in the supernatant increased until 4-6 h after the inhibition of protein synthesis by chx; however, it hardly changed thereafter (fig. 1, upper panel). at this point in time, however, around 40% of igg still remained in the cells (fig. 1, lower panel), meaning that not all of the synthesized igg could be secreted into the medium and that part of it remained in the cells for several hours. this result was almost the same as that of studies using igg1-expressing cells [5, 6]. the localization of igg in the cells was checked before and after the addition of chx, with the results showing that igg1 remained in the er and was hardly seen in the golgi apparatus [5] [6] [7]; igg did not seem to be efficiently transported to the golgi apparatus. the sec experiment showed that most of the igg1 remaining in the cell seemed to form full-sized antibodies [5, 6], but it could not be secreted despite this. the high-producer cells could not secrete all of the synthesized igg, and around 40% of igg remained in the cells for several hours. this incomplete secretion is a common phenomenon among cho cells producing different types of recombinant igg. the igg could not be transported from the er to the golgi despite its formation of full-sized antibodies. solving this bottleneck in the transportation of igg from the er to the golgi and/or achieving more efficient glycosylation of igg after the formation of full-sized antibodies might be the next target to improve productivity. background humanized monoclonal antibodies (mabs) are among the most promising drugs, but defined strategies for their modification are still not available.
our work deals with humanization of murine mab2/ 3h6. the superhumanization approach leads to a loss of binding affinity which was partially restored by a single human-to-mouse backmutation (t98hr). [1] this residue was selected by synergistic combination of sequence analyses of antibody framework regions and structural information using novel in silico simulations. for structural stabilization, a conglomeration of tyrosine residues surrounding t98hr was identified, the so called "tyrosine cage". [2] analysis of the "tyrosine cage" was done by alanine scanning mutations with a double mutation variant t98hr + y27ha (bm09) and a triple mutation variant t98hr + y27ha + y32ha (bm10). in a recent series of experiments we tried to enhance binding affinity by three new variants with backmutations in the variable light chain (vl). originating from t98hr, residues in the vl were selected based on their spatial proximity to the cdr3 loop of the variable heavy chain. affinity improvement of t98hr was evaluated by vl-double backmutation variants t98hr + f46ll (su01) and t98hr + q49ls (su02) and a triple backmutation variant t98hr + f46ll + q49ls (su03). all five variants were expressed transiently in hek293-6e cells and binding affinities were investigated in two individual settings with bio-layer interferometry. in the first approach concentrated cell culture supernatants were directly applied and mabs were captured on protein a tips, blocked with 3d6scfv-fc and the association and dissociation of 2f5 igg was measured. for the second approach, the culture supernatants were purified and the affinity was determined with streptavidin biosensors. first, biotinylated 2f5 igg was bound and then the association/dissociation of the purified 3h6 variants was measured. affinity evaluation of concentrated culture supernatants with protein a sensor tips showed a decrease of binding affinity of bm09 and a loss of binding of bm10. 
the protein a measurement showed an increased binding strength of su01, su02 and su03 compared to su3h6 and bm07. su01 and su03 result in a higher binding affinity compared to su02. these results were confirmed with purified variants by the streptavidin bio-assay (fig. 1). alanine scanning of the tyrosine cage demonstrated a reduction of binding affinity (bm09) and a severe loss of binding (bm10), leading to the conclusion that the tyrosine cage plays an important role in supporting a correct cdr loop conformation. further affinity improvement of the single mutation variant t98hr was achieved via mutations in the vl. this demonstrates the underestimated role of the vl in the interaction with its binding partner. although cho cells are a major expression system for production of recombinant biopharmaceuticals, the molecular and cellular background characterizing a high producer is largely unknown. it has been observed that in producer cell lines important signaling pathways like akt signaling are altered in characteristic ways. thus, analyzing the corresponding signaling events should lead to the identification of key elements characterizing high producer cells. to investigate this, our emphasis lies on the phosphorylation status of the proteins involved, as reversible switches in all signaling pathways. we aimed to establish a workflow for cho-specific phosphoproteomics and focused on igf signaling, as cell culture media are often supplemented with this growth factor. two producer cell lines and the corresponding parental cells were cultivated in a stable isotope labeling with amino acids in cell culture (silac) experiment, followed by quantitative ms phosphoproteomic analysis including cho-specific data evaluation. the chosen cho cell lines were cultivated in triplicates in silac media containing isotopically labeled lysine/arginine (hlys/harg) and in parallel in identical standard media (llys/larg, tcx10d, xell).
cell density, viability, metabolism and cell cycle distribution were monitored during 50 ml batch culture for 7-8 days. at day 3.25 igf was added to hlys/harg cultures. 5 min later a part of the cells was harvested. for ms analysis igf-treated (hlys/harg cultures) and control cultures (llys/larg cultures) were combined. the following ms sample preparation workflow included digestion of whole protein lysate and phosphopeptide enrichment via tio2 beads. nanolc-esi-orbitrap ms (q exactive plus, thermo fisher scientific) of phosphopeptides was executed with subsequent identification and quantification in maxquant [1]. in addition to silac quantification of h/l ratios for investigation of igf effects, the acquired data were also used to perform label-free quantification (lfq) in maxquant [1] for comparison of cell lines. statistical significance was calculated via t-test (p<0.05) or anova (permutation-based fdr<0.05) in perseus [2]. results igf effects on growth and production the igf treatment resulted in a prolonged viability for all cell lines. however, an increased vcd was only observed for producer cell line 1, yielding an enhanced integral of vcd (ivcd). for the parental cells growth was inhibited by igf, although s-phase cells were enriched at least temporarily (fig. 1a). regarding antibody production, igf led to a decreased qp and product titer, concomitantly with an increase in s-phase cells (fig. 1a). this inverse correlation of proliferation and cell specific productivity is known from different productivity enhancing molecules, like butyrate [3]. ms investigation of signaling events the phosphoproteomic experiment resulted in the identification of 10,485 class i phosphorylation sites. statistical evaluation of phosphopeptide abundances in perseus revealed 144 significant differences between the cell lines and led to producer vs. parental classifications (fig. 1b).
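the h/l ratio comparison described in the methods can be sketched as follows for a single phosphosite. the intensities are hypothetical, and welch's unequal-variance t statistic is used as a stand-in: the abstract only states that a t-test was run in perseus, not which variant or settings were applied.

```python
import math
import statistics


def log2_ratios(heavy, light):
    """log2 of heavy/light intensity ratios, one value per replicate."""
    return [math.log2(h / l) for h, l in zip(heavy, light)]


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


# Hypothetical heavy (IGF-treated) and light (control) intensities for one
# phosphosite, three replicates per cell line group.
producer = log2_ratios([400.0, 800.0, 1600.0], [100.0, 100.0, 100.0])
parental = log2_ratios([200.0, 100.0, 400.0], [100.0, 100.0, 100.0])

t_stat = welch_t(producer, parental)
print(producer, parental, round(t_stat, 3))
```

working on log2 ratios makes up- and down-regulation symmetric around zero; converting the t statistic into a p-value additionally requires the t distribution, which is left out of this sketch.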
the quantitative evaluation via silac yielded about 2,408 quantifiable phosphosites in at least 6 biological replicates. rapid phosphorylation changes after growth factor treatment indicated signaling towards protein synthesis, cell cycle and regulation of the actin cytoskeleton, amongst others. for 201 phosphosites, significantly different h/l ratios were calculated between the two groups, parental vs. producer; four of them are listed in table 1. the workflow to study phosphorylation states revealed differences in the related cell lines and gave insights into signal transduction in response to igf. on the one hand, igf treatment resulted in a fast and widespread upregulation of phosphorylation sites within akt and mapk signaling. on the other hand, a different phosphorylation status for producer compared to parental cell lines uncovered distinctions in biological processes like rna and dna binding and regulation of the cytoskeleton. in sum, our successfully established phosphoproteomic approach allows the detection of important signaling key players in cho cells that can subsequently be targeted through cell engineering or small molecule treatment. to improve antibody production in the cho cell expression system, it seems useful to up- or downregulate the expression of genes involved in antibody folding, secretion, and cell metabolism. many cell engineering approaches, including gene introduction, knockout and knockdown, have been employed to enhance recombinant antibody production [1]. however, identifying production enhancer genes is the rate-limiting step for cho cell engineering, because the conventional method requires a series of experiments including genomic integration of the tested genes, selection of stable cell clones and cell culture experiments of several clones. in this study, we propose an approach for rapid evaluation of production enhancer genes based on an episomal expression system.
a plasmid vector carrying the epstein-barr virus (ebv) encoded nuclear antigen 1 (ebna1) was transfected into a cho cell line producing igg1 antibody. after g418 selection and single colony isolation, ebna1 expression was checked with the capillary electrophoresis system wes (proteinsimple). ebv ebna1-antibody (1eb12) was used for detection as the primary antibody. the expression vector for the gene of interest was prepared by inserting 1508 bp of an orip dna sequence into a plasmid vector carrying the cag promoter, resulting in the potc vector. pei max (polysciences, inc.) and balancd transfectory cho (irvine scientific) were used for the transfection. the numbers of viable cells and gfp-positive cells were counted using a countess ii fl automated cell counter (thermo fisher scientific). the transfected cells were cultured in cellstar cellreactor tubes. the tubes were incubated in a climo-shaker isf1-x (kuhner). antibody production was measured using biolayer interferometry with an octet qk system (fortebio). we constructed four cho cell lines stably expressing ebna1, termed igg1-eb01 to eb04. in capillary electrophoresis analysis, we observed a clear peak corresponding to ebna1 expression in all four cell lines. we tested the transfection efficiency with potc-gfp plasmids. in the best transfection condition, a pei/dna ratio of 1/1, igg1-eb01 cells showed the highest gfp-positive cell number (1.07×10^7 cells/ml) and transfection efficiency (95%) among the four cell lines. therefore, the igg1-eb01 cell line was selected for further study. after the transfection, the number of gfp-positive cells continued to increase even after passage (fig. 1), suggesting that the potc-gfp plasmid was stably retained and replicated by the ebna1/orip system in igg1-eb01 cells. in preliminary experiments, we introduced three genes, mdh2, gss and gclm, into igg1-eb01 cells.
cotransfection of these three genes led to an increase in igg1 production from 287±18 mg/l (control) to 334±21 mg/l at day 8 (p<0.05, t-test, n=3). this result suggests that these three genes work as production enhancer genes. conventional methods based on stable cells take up to 6 months to determine whether the gene of interest is beneficial for recombinant igg1 production. in contrast, identification of production enhancer genes is achievable within 10 days by our proposed method based on ebna1/orip system. the proposed method makes it possible to evaluate production enhancer genes in a rapid manner. the proposed method is a promising approach to identify genes enhancing recombinant antibody production. background 2g unic™ (2gun) technology comprises a set of protected genetic elements that improve protein production by acting on transcription as well as on translation. the elements can either be inserted into existing (platform) vectors or be provided as complete ready-to-use vectors. the technology can be used in stable and in transient transfection to boost protein production for product development and is being applied in cld for pharmaceutical proteins. in combination with antibiotic selection or dhfr selection, 2gun technology routinely results in 2-3 fold increase in expression of client antibodies or fusion proteins, both in pools and after clonal selection. previously, we have successfully combined 2gun technology with glutamine synthetase (gs) selection and the cho gs null cells of horizon discovery, resulting in clonal cell lines producing > 6 g/l of a biosimilar mab in fed-batch assay. here we present data on the successful application of the 2gun technology for the enhanced expression of a large (>300 kda) human heterotrimeric glycoprotein, a renowned difficult-to-express (dte) protein. 
all expression vectors comprised a hcmv promoter and bgh polyadenylation sequence in the expression cassettes for the gene of interest, and a selection marker gene with sv40 promoter and sv40 polyadenylation sequence. 2g unic™ vectors also contained genetic elements (2g unic™ technology, proteonic). cho gs null cells (horizon discovery) were transfected in duplicate with reference or 2gun expression vectors and selected in media lacking glutamine and containing the appropriate antibiotics. the bulk pools were seeded at equal viable cell density after obtaining maximum viability and cultured for 9 days without feeding (batch). expression of the target protein in cell culture supernatants of stable bulk pools was measured by elisa. the three protein subunit genes were expressed from vectors with different selection markers. in the reference constructs (without 2gun), the α, β, and γ chains were expressed from vectors with marker genes for zeocin, blasticidin, and gs, respectively. a similar 3 vector combination was also generated with 2gun elements integrated in each vector. in addition, 2 vectors with 2 subunits (γ-α and α-γ), each with a separate 2gun element, promoter and polyadenylation signal, were generated with a gs marker gene. cho gs -/cells were transfected with the 4 appropriate vector combinations in equimolar ratios and selected in bulk in medium lacking glutamine and 1 or 2 antibiotics. the 2-vector transfected cell pools recovered first, due to the presence of only 1 antibiotic in the medium (fig. 1a) . the pools transfected with three 2gun vectors recovered to maximum viability just a few days after the 2-vector 2gun pools. recovery of the reference pools took up to a week longer than the 2gun pools. production of each pool was assessed in a batch production run in shaker flasks. all 2-vector 2gun pools which recovered first produced titers around 0.1 g/l, which is almost 10-fold higher as compared to the production by reference pools (fig. 1b) . 
the highest titers of 0.5 g/l were obtained in the 3-vector 2gun pools. these data show that the 2g unic™ genetic elements can be successfully used to obtain a significant increase in the titer of difficult-to-express proteins. similar results have been obtained with other dte proteins, including fc-fusion proteins and bi-specific antibodies (not shown). the expression of a large, glycosylated, multimeric, difficult-to-express protein can be increased more than ten-fold in cho gs pools by application of 2g unic™ genetic elements. the highest expression is obtained using a separate vector for each subunit. characterization of antibody-producing cho cells with chromosome aneuploidy noriko yamano 1,2 , sho tanaka background chinese hamster ovary (cho) cells are commonly used as host cells to produce biopharmaceuticals. however, the number of chromosomes in cho cells varies. previously, dg44-sc20 and dg44-sc39 cell lines with modal chromosome numbers of 20 and 39 were isolated from parental cho-dg44 cells, from which igg3-expressing cell lines named igg3-sc20 and igg3-sc39 were established, respectively. the igg3-sc39 cell pool showed a higher specific igg3 production rate than the igg3-sc20 cell pool [1] . even though all of the igg3-sc20 clones and half of the igg3-sc39 clones contained the same number of vector integration sites (single integration site), igg3-sc39 cell clones produced more igg3 following the culture of single-cell clones than any of the igg3-sc20 clones [1] . in this study, we performed transcriptome analysis to investigate the characteristics of high-producer cells with chromosome aneuploidy. transcriptome analyses using amplified fragment length polymorphism (aflp)-based high-coverage expression profiling (hicep) and de novo mrna-seq were performed on dg44-sc20, dg44-sc39, igg3-sc20 and igg3-sc39.
to compare cell lines with different numbers of chromosomes, transcriptome data from mrna-seq were adjusted for cell number using rna reference materials (nmij crm 6204-a; national institute of advanced industrial science and technology) mixed at equal amounts per cell. pathways related to differentially expressed genes were searched using keymolnet (km data). high-chromosome-number cho cells showed larger cell diameters, as determined by vi-cell (beckman coulter) measurement. the predicted volume ratios, based on these diameters, are 2.24 (dg44-sc39:dg44-sc20) and 1.59 (igg3-sc39:igg3-sc20). the levels of β-actin and the products of most other genes that were detected by mrna-seq differed by approximately 20% in the comparison between sc39 and sc20 (sc39 > sc20). based on the analysis of gene expression levels per cell volume, approximately 90% of detected genes showed lower expression in both dg44-sc39 and igg3-sc39 compared with the levels in dg44-sc20 and igg3-sc20, respectively. in addition, the number of genes whose expression level was decreased in igg3-sc39 compared with that in dg44-sc39 was larger than those showing the opposite pattern. the results of the comparisons between igg3-sc20 and igg3-sc39 indicate that differentially expressed genes were mainly related to cell growth (e.g. myc, smad), apoptosis (e.g. caspase), lipid metabolism (e.g. srebp, pparγ) and epigenetic histone modification (e.g. brca, hat) pathways. the mrna levels of myc, smad, caspase, brca and hat related genes were lower in igg3-sc39, while those of srebp and pparγ related genes were higher in igg3-sc39. the effects of these pathways on antibody production should be examined in future. in this study, we found that high-chromosome-number cho cells have lower amounts of mrna relative to their volume. a reduction per unit volume in the expression of genes that are required for survival might generate additional energy for recombinant protein production in high-chromosome-number cells. 
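the step from measured diameters to predicted volume ratios, and from per-cell to per-volume expression, can be sketched as follows. this is a minimal illustration that assumes spherical cells; the measured diameters are not given in the text, so the sketch back-calculates diameter ratios from the reported volume ratios (2.24 and 1.59), and the per-cell fold change of 1.2 is only a stand-in for "differed by approximately 20%". the function names are hypothetical.

```python
import math


def sphere_volume(diameter):
    """Volume of a sphere with the given diameter (same length unit cubed)."""
    return math.pi * diameter ** 3 / 6.0


def volume_ratio(diameter_a, diameter_b):
    """Volume ratio of two spherical cells; pi/6 cancels, leaving a cubed ratio."""
    return (diameter_a / diameter_b) ** 3


def diameter_ratio_from_volume_ratio(ratio):
    """Invert the cube law: diameter ratio implied by a given volume ratio."""
    return ratio ** (1.0 / 3.0)


def per_volume_fold_change(per_cell_fold_change, vol_ratio):
    """Convert a per-cell expression fold change into a per-volume fold change."""
    return per_cell_fold_change / vol_ratio


# Reported volume ratios (sc39 : sc20) for the host and the igg3-producing lines.
reported = {"dg44": 2.24, "igg3": 1.59}

# Assumption: a typical gene is ~1.2-fold higher per cell in sc39 than in sc20.
ASSUMED_PER_CELL_FOLD = 1.2

for line, ratio in reported.items():
    d_ratio = diameter_ratio_from_volume_ratio(ratio)
    per_vol = per_volume_fold_change(ASSUMED_PER_CELL_FOLD, ratio)
    print(f"{line}: diameter ratio {d_ratio:.2f}, per-volume fold change {per_vol:.2f}")
```

under these assumptions, a gene that is about 20% higher per cell ends up well below 1 per unit volume in both comparisons, which is in line with the observation that most detected genes show lower per-volume expression in the sc39 lines.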
from an evolutionary perspective, an increased set of chromosomes underlies rapid evolutionary adaptation. although there are issues to be considered, such as stability, there may also be advantages to using high-chromosome-number aneuploid cho cells as production host cells for recombinant proteins. background human growth factors have an enormous therapeutic potential. among them, bone morphogenetic protein-2 (bmp-2) can induce de novo bone formation, which gives the protein a high therapeutic potential. however, finding a suitable recombinant production system for such a protein remains a challenge. recombinant expression of hbmp2 was investigated in transiently transfected hek-293 cells and in stable clones established in cho-k1 cells cultivated in excell and pro-cho5 medium, respectively. protein stability and interaction of hbmp2 with the producer cells were investigated in vitro using commercially available rhbmp2. in addition, we investigated a cell-free protein synthesis system harboring translocationally active microsomal structures, hence having the potential to perform post-translational modifications, as an alternative production method. we showed that growth rates and viabilities of the rhbmp2-producing cells were similar to those of the parent cell line, while entry into the death phase was delayed in the case of the recombinant cells. the maximum rhbmp2 concentration detected in the culture supernatant was low for stable clones but could be greatly improved by combining the hek-293 transient expression system with batch reactor cultivation, which reflects a better compatibility of the codon usage in the human cells (table 1) . hbmp2 protein is sensitive to slightly acidic ph and, to a lesser extent, to proteases (fig. 1a ) and binds to both producer cell lines (fig. 1b) ; all of this could incidentally contribute to the low product titers. cell-free protein synthesis has been proposed as an alternative for "difficult-to-express" proteins.
since native hbmp2 is glycosylated, a cell-free system based on eukaryotic cell lysates is required for its production. cho cell lysates were chosen, since they had previously been established as the most productive eukaryotic system in our hands [1] , while concomitantly enabling a direct comparison to the production of hbmp2 in stable clones established in cho-k1. the ability to perform post-translational modifications is a major advantage of eukaryotic systems. the cho lysates prepared by the protocol used here have previously been shown to contain significant amounts of endogenous microsomes derived from the endoplasmic reticulum during lysis [2] . to enforce translocation of the target protein into the microsomal structures, a melittin signal peptide was fused to the hbmp2 cdna. the glycosylation of the protein was assessed by enzymatic treatment (pngase, endoh) and confirmed using 14c-mannose for the de novo protein synthesis. upon cell-free protein synthesis, the hbmp2 yield was 100-fold higher than the best yield obtained in hek-293 cells. the difference becomes even more dramatic when productivities are considered (table 1) , i.e. when accounting for the fact that maximum product titers are reached within 3 h in the cell-free system compared to 120 h in the cell-based ones. this demonstrates that the cell-free expression system is more suitable than mammalian cell expression for the production of glycosylated human bmp2 (table 1 ) [3] . human growth factors are complex molecules, which makes their production in mammalian cells desirable. however, low product titers caused by a variety of both cell- and process-related effects may hinder the development of highly productive processes. in such cases, cell-free protein production using cho cell lysates containing endogenous microsomes for post-translational processing may eventually present an attractive alternative.
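the productivity argument above can be made explicit with a short calculation. this is only a sketch under the assumption that productivity is taken as maximum titer divided by the time needed to reach it; the values of table 1 are not reproduced here, so only the ratios stated in the text (100-fold yield, 3 h versus 120 h) are used, and the function names are hypothetical.

```python
def productivity(titer, hours):
    """Volumetric productivity as titer per unit time (assumed definition)."""
    if hours <= 0:
        raise ValueError("process time must be positive")
    return titer / hours


def productivity_fold(titer_fold, reference_hours, new_hours):
    """Fold change in productivity given a fold change in titer and both process times."""
    if reference_hours <= 0 or new_hours <= 0:
        raise ValueError("process times must be positive")
    return titer_fold * reference_hours / new_hours


# Ratios stated in the text: 100-fold higher yield, reached in 3 h instead of 120 h.
fold = productivity_fold(titer_fold=100, reference_hours=120, new_hours=3)
print(f"productivity fold change: {fold:.0f}")
```

with these inputs the time advantage alone contributes a factor of 40, so the 100-fold difference in yield corresponds to a several-thousand-fold difference in productivity under the assumed definition.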
these lysates are particularly attractive because they can be used under tightly controlled conditions, assuring a higher degree of reproducibility than, e.g., transient transfection systems. cell-free systems are known to circumvent typical bottlenecks of cell-based ones, e.g. metabolic regulation and cell maintenance mechanisms. in consequence, the production of a recombinant protein is inhibited neither by its accumulation nor by any interaction with the cells, e.g. through the activation of inhibitory signaling pathways. core. preliminary studies showed that the corresponding polyplexes, but also some of the cells that came into contact with them, became magnetic and were manageable by magnetic fields [1] [2] [3] . here, we present a characterization of the influence of structure and composition on the function of these polymers using a library of highly homogeneous, paramagnetic nano-stars with varied arm lengths and densities [4] . the paramagnetic nano-star library was synthesized by coating maghemite nanoparticles (γ-fe2o3) with a thin silica shell functionalized with an atom transfer radical polymerization (atrp) initiator. pdmaema arms were grown from the core particles via atrp. in one case, the pdmaema arms were end-capped with pdegma blocks produced during a second atrp step. all nano-stars were characterized by size exclusion chromatography and thermogravimetry to calculate the number and length of the pdmaema arms. the core diameter was determined by transmission electron microscopy and dynamic light scattering (dls). the different variants (table 1) were analyzed for their ability to complex pdna (pegfp-n1) using various physicochemical methods (dls, zeta sizer). transfection efficiency and cytotoxicity in cho-k1 cells were determined by flow cytometry. transfected cells were placed in a magnetic field and the influence of the polymer architecture on the magnetic separation was investigated.
nonparametric spearman analysis was used to correlate arm length and arm density with the magnetic properties of the cells and the transfection efficiency. based on the hydrodynamic radii of the polyplexes, the investigated nano-stars could be divided into three subgroups (table 1) . middle, but also high arm density nano-stars formed smaller polyplexes with hydrodynamic radii ≤ 300 nm, a size that is considered suitable for endocytosis and transfection. transfection efficiencies and cytotoxicities varied systematically with the nano-star architecture, with viability showing a more pronounced dependency on the characteristics of the transfection agent than the transfection efficiency itself. the arm density was particularly important, with values of approximately 0.06 arms/nm2 yielding the best results (fig. 1a) . end-capping the polycation arms with pdegma significantly improved the serum compatibility (fig. 1b) . the gene delivery potential of a given nano-star and its ability to render the cells magnetic did not correlate. nevertheless, compared to the non-separated cells, egfp-expressing cells were consistently more frequent in the magnetic cell fraction, while the non-magnetic fraction was slightly depleted. when the egfp-expressing cells were further divided into low, middle and high producers, a statistically significant shift towards the high producers was observed in the magnetic cell fraction (fig. 1c) . a nonparametric spearman correlation analysis was used to statistically evaluate possible links between the molecular characteristics of the nano-stars, the physicochemical properties of the corresponding polyplexes, the transfection conditions, and the cellular reactions. the resulting correlogram is shown in fig. 1d . transfection agents with magnetic properties enlarge the toolbox for studying non-viral gene delivery, since cellular magnetism is added as a new parameter.
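the spearman analysis mentioned above ranks each variable and then computes a pearson correlation on the ranks. a minimal pure-python sketch is given below; it is not the authors' analysis pipeline, the function names are hypothetical, and the example values for arm density and viability are invented purely for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return cov / (sx * sy)


# Invented illustration data: arm density (arms/nm2) versus cell viability (%).
arm_density = [0.02, 0.04, 0.06, 0.08, 0.10]
viability = [62.0, 71.0, 88.0, 80.0, 75.0]
print(f"spearman rho = {spearman_rho(arm_density, viability):.2f}")
```

repeating this for every pair of variables yields the matrix that a correlogram visualizes; a rank-based coefficient is a reasonable choice here because it does not assume a linear relationship between polymer architecture and cellular response.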
cellular magnetism allows, inter alia, a distinction between mere cellular interaction and actual uptake, which is otherwise difficult to make. viability showed a much more pronounced dependency on the characteristics of the transfection agent/polyplex than the transfection efficiency itself, which should be taken into account during method optimization. end-capping the polycationic pdmaema arms with pdegma blocks improved the compatibility of the polycationic nano-stars with serum components. in future, optimized, blood-compatible nano-stars that can be retained or directed by magnetic fields could become options for non-viral gene delivery in vivo. the increasing demand for monoclonal antibodies has made it necessary to increase the productivity of current industrial cell lines. in our earlier study [1] , we showed that treatment with the er-stress inducer tunicamycin significantly increased the titers and productivity in recombinant cho cell lines, with a simultaneous upregulation of many genes from the unfolded protein response (upr) pathway. however, the loss in cell viability prevented a sustained increase in titers. in the current study we explore the effect of varying tunicamycin concentrations and treatment times, so as to modulate the increase in protein folding capacity while preventing induction of apoptosis. anti-rhesus igg-secreting cho cells [2] were cultured in sf-cdm in 125 ml shake flasks. the cells were treated with varying concentrations (30-500 ng/ml) of tunicamycin in a batch culture. further, the effect of treatment with tunicamycin for short periods of time (24 hrs) was also evaluated. igg titers and mrna expression levels were quantified using elisa and qrt-pcr (illumina), respectively. results cho cells were treated with different concentrations of tunicamycin and cultured in a batch for 8 days (referred to as continuous treatment/cte). figure 1a presents the maximum vcd and % drop in viability under treatment.
a dose-dependent inhibitory effect is observed on growth and viability of cells in cte-cultures, with minimal inhibition at lower concentrations. in contrast, igg titers (fig. 1b) were higher in treated cultures than in the control in the initial phase of the cultures at all concentrations of tunicamycin. the per-cell productivity (fig. 1c) also showed a significant increase relative to the control at all concentrations of tunicamycin. however, the increased productivity due to tunicamycin was not sustained and levels became similar to the control after day 3 (data not shown). to prevent loss of viability due to tunicamycin, the effect of short-term treatment (ste) with tunicamycin was explored. cells treated with tunicamycin for 24 hours were harvested (corresponding to day 2 of cte cultures) and inoculated in fresh media. the ste-cultures showed improved viability and higher maximum vcd as compared to cte-cultures (fig. 1a) . the fold increase in igg titers was not sustained beyond day 1-2 in ste-cultures (fig. 1d) but a significant increase in productivity was seen in the initial phase (fig. 1e) . further, the cells were adapted over 25 continuous generations under 30 ng/ml tunicamycin. the adapted cells had overall 1.3-fold higher productivity, as compared to control (fig. 1f) , in a batch culture. to understand the molecular basis of the increase in productivity, the mrna expression levels of key genes were determined. xbp1s is a transcription factor involved in activation of chaperones (like grp78, calreticulin) and apoptotic genes (such as chop). a significant increase in the levels of calreticulin was seen on treatment with tunicamycin (fig. 1g) . both xbp1s and grp78 were marginally induced when treated with 30 ng/ml of tunicamycin in both cte- and ste-cultures (fig. 1h) , and significantly up-regulated when treated with 500 ng/ml of tunicamycin. the chop mrna levels also increased with increasing tunicamycin concentrations, with levels in ste-cultures lower than in cte-cultures (fig. 1h) .
the results suggest that upr induction may be important to increase productivity in these cte/ste-cultures. note that tunicamycin had no effect on the expression levels of igg heavy-chain, thus eliminating the involvement of igghc-mrna in increasing productivity (fig. 1i) . tunicamycin-induced er-stress increased productivity in the initial phase of the culture, and enhanced upr-mediated folding capacity can be considered one of the reasons for it. at lower concentrations of tunicamycin, a fine balance between optimum upr induction and apoptosis can be achieved, as seen in 30 ng/ml tunicamycin ste-cultures. in summary, this study demonstrates an alternate approach to enhance productivity of current industrial cell lines. background chinese hamster ovary (cho) cells have been widely used for the large-scale production of biopharmaceuticals [1] . to construct antibody-producing cho cells, exogenous genes encoding antibodies are usually integrated into unspecified regions of chromosomes (random integration). however, the chromatin structure differs depending on the location of the chromosomal region, which affects the expression level of the gene of interest [2] . recently, gene-targeting methods that enable site-specific integration of expression vectors have been developed. however, the regions that are most efficient for exogenous gene expression have not been clarified. we previously constructed a cho genomic bacterial artificial chromosome (bac) library generated from the recombinant cho-dg44 cell line. it was expected to cover the entire cho genome five times. the 20 chromosomes in cho-dg44 cells were aligned in decreasing order of size and assigned letters from a to t [3] . three hundred and four bac clones were mapped to every chromosome of cho-dg44. among the karyotypes of cho-dg44, cho-k1 and primary chinese hamster cells, chromosomes a and b are considered as the sole paired chromosomes corresponding to chromosome 1 in primary chinese hamster cells.
hence, chromosomes a and b are considered to be stable [4] . in this study, we constructed antibody-producing cells by using a gene-targeting method, which focused on the stable chromosomes. a gene map of chromosome 1 was constructed by combining the bac-fluorescence in situ hybridization (fish)-based chromosome physical map and sequence data of mapped bac clones. the sequences of bac clones were searched by blast with ncbi and chogenome.org databases. three different regions on chromosomes a and b were selected based on cho genomic bac library sequences as target sites. cho-k1 cells were stably transfected by lipofection. the target sequences were broken using the clustered regularly interspaced short palindromic repeats (crispr)/crispr-associated protein 9 (cas9) system and humanized igg1 genes were integrated by non-homologous end joining recombination. transfection without using the crispr/ cas9 system was also performed. these cell pools were cultivated for six days with serum-supplemented medium, and their levels of antibody productivity were evaluated by elisa. copy number analysis was also performed using real-time pcr. results and discussion construction of gene map of chromosome 1: eighty-three bac clones were mapped onto chromosomes a and b (each clone contained 100-150 kb of the cho genome sequence). as a result of annotations of 83 bac clone sequences, 91 genes were mapped on chromosome 1. investigation of the differences of productivity among antibodyproducing cells that were constructed by chromosome 1 targeting and/or random integration: cell growth was not affected by the gene targeting site. the specific production rates of antibodyproducing cell pools constructed by gene targeting of chromosome 1 were higher than those of the cell pool constructed by random integration. 
all cell pools constructed by gene targeting showed lower copy numbers of heavy chain and light chain in genomic dna than those in the cell pool constructed by random integration, despite showing high productivity. our results indicate that the high productivity of the cells constructed by gene targeting of chromosome 1 does not depend on an increase in the antibody gene copy number, and that the environments around these target regions are suitable for exogenous gene expression. the approach of using gene targeting to chromosome 1 may be promising for constructing antibody-producing cells. retroviral vectors have been widely used as gene delivery tools in various biotechnology fields. however, the random integration feature of retroviral vectors seems to cause problems such as insertional mutagenesis and gene silencing. we previously demonstrated cre-mediated retroviral transgene insertion into a pre-determined site of the founder cells using integrase-defective retroviral vectors (idrvs), where a cre expression plasmid was transfected into the cells prior to retroviral transduction [1] . recently, we reported novel hybrid idrvs (cre-idrvs) incorporating bioactive cre recombinase protein, and validated site-specific gene integration of an scfv-fc antibody expression unit into the chinese hamster ovary (cho) cell genome [2] . we also developed an accumulative site-specific gene integration system, which enables repeated integration of multiple transgenes into a pre-determined locus of the cell genome [3] . here, we attempted repeated integration of transgenes using cre-idrvs. a viral vector plasmid (pqmscv/hd[scfv-fc]) encoding reporter genes and an scfv-fc expression unit flanked with wild-type and mutant loxps was constructed for the production of idrvs. cre-idrvs were produced as described previously [2] . results and discussion figure 1a shows a schematic drawing of each round of targeted transgene integration using cre-idrvs harboring an scfv-fc expression unit. (fig.
1b) . genomic dna extracted from the cells was subjected to pcr using specific primer pairs α and β, and γ and δ to confirm site-specific integration. dna fragments with expected sizes were amplified in each cell clone (fig. 1c) . these results indicate that site-specific repeated integration was achieved using cre-idrvs. in contrast, scfv-fc productivity in cho/hd[scfv-fc]×2 cells was slightly decreased compared with that of cho/ne[scfv-fc]×1 (data not shown). although the reason remains unclear, repeat-induced gene silencing might occur due to the tandem repeat structure of the expression units. we reported improved recombinant antibody production using a production enhancer element [4] . such a cis-regulatory element might be a feasible approach to enhance the productivity. we demonstrated site-specific repeated transgene integration into a pre-determined chromosomal locus using cre-idrvs for the production of an scfv-fc antibody. although the role of lipids in the cell was long reduced to cell membrane formation, it is now understood that lipids also play a role in energy metabolism, vesicular transport, membrane structure, dynamics and signaling. however, the exact mechanism of how compositional complexity affects cell homeostasis remains unclear. thanks to recent advances in mass spectrometry, it is now possible to study a wide range of lipids, providing a better understanding of lipid homeostasis in high performance cell culture processes. the purpose of this work was to develop a robust lipidomics method applied to mammalian cell cultures in three steps: extraction, separation and detection (fig. 1 ). both the matyash [1] and folch [2] extraction methods were performed on our cells to determine which gives the highest yield. two separation techniques were also tested: hydrophilic interaction liquid chromatography (hilic) and reverse phase chromatography.
finally, identification of lipid classes was achieved by tandem mass spectrometry analysis using structure-specific fragment ions. the yield obtained with the matyash extraction method was higher than with the folch method for each lipid class tested. in addition, the matyash method has the advantage of being less toxic and suitable for high-throughput analysis, since the organic layer is above the aqueous layer. separation of lipids by hilic is based on their polar head. since lipid classes are defined by their polar head, the lipids are eluted class by class, making their identification easier. the separation of lipids by reverse phase was adequate, but the method takes longer and we observed a massive carryover of triglycerides on the column. finally, each lipid class was screened in ms/ms parent ion mode. the target daughter ion was set according to the lipid class structure and fragmentation pattern. this detection technique enabled the identification of 50 different lipids. to ensure the absolute quantification of the detected lipids and to guarantee comparable results between batches, labeled internal standards were added prior to extraction. this method was optimized in a stepwise process to ensure a sensitive and selective measurement of the lipids. lipids were extracted by the matyash method, separated by hilic and detected by tandem mass spectrometry. this method is suitable both for in-process sample lipid analysis, providing information on the cell lipid content, and for harvest samples, making it possible to follow the lipid release during the different harvest steps. this non-targeted lipidomic quantitation method will enable us to better control lipid synthesis during biopharmaceutical fed batch production through clone selection, metabolomics studies and harvest development. background human mesenchymal stem/stromal cells (hmsc) can easily be isolated from e.g.
bone marrow, fat tissue or umbilical cord blood and are therefore a central player in regenerative medicine, gene therapy and cell therapy [1] [2] [3] . the necessary gene shuttle is mainly provided by viruses associated with diseases, like retrovirus or adenovirus [4] [5] [6] [7] . these potentially pathogenic viruses demand high safety standards. also, they are prone to genomic alterations and there is the possibility of virus inactivation triggered by pre-existing immunity in the patient [8] [9] [10] . in this context, the autographa californica multicapsid nucleopolyhedrovirus (acmnpv) is a safe alternative. the virus replication is host-specific for insects [11] , but it has been known since the mid-90s that a temporary transduction of mammalian cells is possible [12] . some modifications of the virus increased the applicability in stem cells. pseudotyping the virus with the vesicular stomatitis virus glycoprotein (vsv-g) led to an expansion of the transducible cell range [13, 14] and the integration of the woodchuck hepatitis virus post-transcriptional regulatory element (wpre) prolonged the recombinant protein expression [15, 16] . for achieving a baculovirus-induced differentiation of hmscs, the promoter and the expression strength of the recombinant protein are crucial factors. still, there are few comparative promoter studies [17, 18] . moreover, a successful virus uptake is the prerequisite for a successful protein expression. we therefore investigated factors significantly influencing the transduction process by applying design of experiments (s. fig. 1a ). the experimental design comprises a two-level factorial screening, set up using design expert v9. for the transduction, 60,000 cells/cm2 were seeded in 24-well plates with dmem + 10% fcs and incubated overnight at 37°c, 8% co2 and humidified atmosphere.
the recombinant baculovirus using an integrated ef1α promoter to control gfp expression, described elsewhere [18] , was diluted to the respective concentrations in the different surrounding fluids. after discarding the cultivation medium of the hmsc-tert, 1 ml of virus-containing solution was added to the cells. the subsequent incubation was varied in duration before the virus solution was replaced with growth medium, followed by incubation overnight. 24 h post transduction (hpt) the cells were washed with pbs, trypsinized with 100 μl trypsin/edta and incubated for 5 min at 37°c. trypsinization was then stopped by applying 100 μl soybean trypsin inhibitor and the cells were analyzed using flow cytometry. as shown in fig. 1a , the virus concentration and incubation time exert the highest influence on the transduction efficiency. a higher concentration of viral particles and a longer incubation of cells with virus increase the probability of encounters between cells and virus particles. additionally, the surrounding fluid can have a negative impact on the transduction. this is due to the interaction of medium components with the baculovirus. therefore, pbs containing ca2+ and mg2+ is recommended as surrounding fluid for transduction experiments. in fig. 1b , the transduction conditions resulting in the highest percentage of gfp+ cells are displayed: 150 virus particles per cell (ppc) and an incubation time of 5 h with hmsc-tert. the experiments show that especially the virus concentration and the incubation time of cells with virus influence the transduction efficiency. based on the results of the screening, further optimization of the transduction conditions will be done using a face-centered central composite design with pbs containing ca2+ and mg2+ as surrounding fluid and at an incubation temperature of 37°c.
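the two-level factorial screening described above can be illustrated with a small sketch. it is not a reconstruction of the design expert set-up: the factor list, the coded levels and the synthetic responses are assumptions chosen only to show how runs are enumerated and how main effects are estimated (mean response at the high level minus mean response at the low level).

```python
from itertools import product


def full_factorial(factors):
    """Enumerate all runs of a two-level full factorial design in coded units (-1, +1)."""
    return [dict(zip(factors, levels)) for levels in product((-1, 1), repeat=len(factors))]


def main_effects(design, responses):
    """Main effect per factor: mean response at +1 minus mean response at -1."""
    if len(design) != len(responses):
        raise ValueError("one response per run is required")
    effects = {}
    for name in design[0]:
        high = [y for run, y in zip(design, responses) if run[name] == 1]
        low = [y for run, y in zip(design, responses) if run[name] == -1]
        effects[name] = sum(high) / len(high) - sum(low) / len(low)
    return effects


# Hypothetical factor names; the actual factor list of the study is not given here.
factors = ["virus_concentration", "incubation_time", "surrounding_fluid"]
design = full_factorial(factors)

# Synthetic responses (e.g. % gfp+ cells) from an invented additive model,
# used only so that the estimated effects can be checked by hand.
responses = [
    10 + 3 * run["virus_concentration"] + 2 * run["incubation_time"] - 1 * run["surrounding_fluid"]
    for run in design
]

for name, effect in main_effects(design, responses).items():
    print(f"{name}: main effect {effect:+.1f}")
```

in a screening of this kind, the factors with the largest absolute main effects are carried forward, which matches the plan stated above to optimize the most influential factors further in a central composite design.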
background breast cancer is the second main cause of cancer related deaths for women worldwide and among them the triple negative subtype (tnbc) represents a clinical challenge by being associated with high mortality and having no effective therapies against it [1] , [2] . accordingly, there is an urgent need to design new and more effective drugs to treat breast cancer. notch signaling is an evolutionary conserved cell-to-cell communication pathway crucial during embryonic and breast development and tissue homeostasis. this pathway is often hyper-activated by overexpression of notch receptors and/or its ligands in several types of cancers, such as breast cancer (tnbc included), where it contributes to its development, progression and drug resistance [3] , [4] , [5] . our aim is to generate a function blocking antibody against the notch delta-like-1 (dll1) ligand with therapeutic efficacy against breast cancer. materials and methods dna of human dll1 full length extracellular domain (dll1-ecd) and a truncated version, containing the minimal binding region to the notch receptor (dll1-egf3), were cloned into pfuse-fc1-igg1, and expressed in hek293e6cells. recombinant proteins were purified from culture media by protein-a affinity and size exclusion chromatography. the human scfv phage display tomlinson i+j library was used to select specific scfv against peptides targeting dll1 binding regions to notch. the binding ability and specificity of the selected scfv clones was evaluated by scfv-on-phage elisa. our strategy allowed us to obtain 20 mg of pure (>95%) and stable dll1-ecd-fc as confirmed by sds page and thermofluor assay. dll1-egf3-fc yield was very low and buffer screenings are ongoing to optimize protein stability. functional studies performed in human breast cancer mcf7 cells showed that both ligands are biologically active as they increased the expression of the notch-dependent genes hes-1, hey-l and hey-1. 
recombinant dll1 and peptides were used to select for monoclonal antibodies by phage display. after three rounds of panning with dll1 peptides we identified 13 scfv-positive clones, 2 of which showed high affinity to dll1-ecd-fc. we are currently performing more phage display selections to increase the number of positive clones. scfv with higher affinities will be reformatted into iggs and their ability to inhibit the notch pathway will be evaluated. the anti-oncogenic effects of anti-dll1 iggs will be assessed in breast cancer cells in viability/apoptosis, proliferation, migration, and invasion assays. an anti-dll1 igg with therapeutic efficacy against breast cancer will demonstrate that targeting dll1 could be one of the key factors for successfully targeting breast cancer.

recombinant adeno-associated virus (raav) approaches have an outstanding reputation in gene therapy and are being evaluated for cancer therapy [1]. advantages include long-term gene expression, targeting of dividing and non-dividing cells, and low immunogenicity. established raav production relies on triple transfection of adherent hek 293 cells, which hardly meets product yield requirements for clinical applications. we transferred the aav production system to hek 293-f suspension cells. this process is scalable and uses serum-free media, which streamlines downstream procedures. after optimization of transfection efficiencies and shaker cultivations, we produced titers of 1×10⁵ viral genomes per cell in a 2 l bioreactor. the suspension-adapted hek-freestyle 293-f cell line was used for the experiments in chemically defined, animal component-free media (hek-tf, hek-gm (xell ag), freestyle f17 (thermo fisher scientific)). samples for viable cell density and viability were taken daily and analyzed using an automated cell counting system (cedex, roche diagnostics). transient transfection of 3×10⁶ cells/ml was carried out with polyethylenimine max in a 1:4 dna-pei ratio (w/w) with 2 μg dna.
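the arithmetic behind such a transfection mix can be sketched as follows. this is an illustration under stated assumptions, not the authors' protocol: the plasmid sizes are hypothetical, the reference volume or cell number to which the 2 μg dna refers is not given in the abstract and is therefore left out, and the split assumes that the three plasmids are combined in equimolar amounts.

```python
def pei_mass_ug(dna_ug, pei_per_dna=4.0):
    """PEI mass (in ug) for a DNA:PEI weight ratio of 1:pei_per_dna."""
    return dna_ug * pei_per_dna


def equimolar_split_ug(total_dna_ug, plasmid_sizes_bp):
    """Split a total DNA mass so every plasmid contributes equal molecule numbers.

    The mass of a double-stranded plasmid is proportional to its length, so
    equimolar amounts correspond to mass shares proportional to size in bp.
    """
    total_bp = sum(plasmid_sizes_bp.values())
    return {name: total_dna_ug * bp / total_bp
            for name, bp in plasmid_sizes_bp.items()}


# Hypothetical plasmid sizes in base pairs, for illustration only.
sizes_bp = {"pgoi": 5000, "prepcap": 7500, "phelper": 12500}

dna_ug = equimolar_split_ug(2.0, sizes_bp)   # mass of each plasmid in ug
pei_ug = pei_mass_ug(2.0)                    # PEI needed for 2 ug DNA at 1:4

print(dna_ug)
print(pei_ug)
```

with these assumed sizes the largest plasmid accounts for half of the dna mass even though all three plasmids are present in the same molar amount.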
three plasmids (pgoi, prepcap, phelper) were applied in a molar 1:1:1 ratio (fig. 1a). pretests were performed in orbital shaking tube spin bioreactors. for scale-up, batch processes were carried out in 125 ml shake flasks as well as in 2 l stirred bioreactors at 30% air saturation and ph 7.1. transfection efficiencies and raav production were quantified by flow cytometry, using a goi coding for a fluorescent protein, and by qpcr of genomic copies, respectively. by optimizing the dna amount for transfection of 293-f cells, more than 90% of the cells were reproducibly transfected. batch cultivations in shaker flasks revealed that raav were produced in the first 24-96 h after transfection. figure 1b shows viable cell densities and viabilities in relation to the genomic titer. genomic titers were determined from raw cell extracts, and up to 10⁹ copies/ml were reproducibly achieved. a decrease in viability marked the decline in genomic copies per ml, showing that a prolongation of the process, e.g. by addition of a feed, would probably not increase yield. in a first scale-up, the raav production was transferred to a 2 l bioreactor (fig. 1c). transfection efficiencies in bioreactors of up to 55% were comparable to those obtained in a simultaneous shaker flask experiment. transfection efficiencies were lower compared to prior experiments due to controlled conditions in the bioreactor. nonetheless, the titer of up to 1×10⁵ genomic copies per cell was elevated compared to that of shaker flasks. first experiments with 293-f cells in hek tf medium showed promising results for transferring raav production from the adherent system to suspension. after transfection was improved by adjusting the dna amounts in small-scale experiments, aav production was analyzed in shaker flasks. the batch process showed the expected increase in cell density with low variability between biological replicates (fig. 1b).
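titers are reported here both per ml and per cell, so a short sketch of the normalization may help when comparing shaker and bioreactor data. the function name and the example values below are hypothetical and are not measurements from this study.

```python
def genomic_copies_per_cell(gc_per_ml, viable_cells_per_ml):
    """Normalize a volumetric genomic titer to the viable cell density."""
    if viable_cells_per_ml <= 0:
        raise ValueError("viable cell density must be positive")
    return gc_per_ml / viable_cells_per_ml


# Hypothetical time course: (hours post transfection, gc/ml, viable cells/ml).
samples = [
    (24, 4e10, 2e6),
    (48, 1e11, 2e6),
    (72, 2e11, 2e6),
]

per_cell = {t: genomic_copies_per_cell(gc, vcd) for t, gc, vcd in samples}
print(per_cell)
```

normalizing to the viable cell density removes the effect of different cell concentrations between vessels, which is why per-cell values are the more suitable basis for comparing cultivation systems.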
the genomic titer increased in line with the viable cell density until day four, when a sudden drop began. this observation was made for aav productions in hek-tf, hek-gm and freestyle f17 medium. for optimal yields, we assume that a slight decrease in viability marks the point in time for harvest. based on the optimized protocols, a batch process in a 2 l bioreactor was carried out. interestingly, the bioreactor cultivation resulted in lower overall viable cell densities but in higher genomic copies per cell compared to shaker flasks (fig. 1c). these results are comparable to already published data for suspension cells [2]. subsequent optimization of the bioreactor protocol will lead to a further increase in raav yield.

genethon and pall have collaborated to assess pall's single-use icellis fixed-bed bioreactor for viral vector production. clinical use of gene therapies to treat formerly incurable genetic diseases is advancing rapidly. viral vectors are an important tool for introducing genes into target cells. many gene therapies have been developed using adherent cells in 2-dimensional flatware or roller bottles, but using these technologies to reach commercial-scale production represents a significant challenge. the icellis bioreactor enables large-scale viral vector production by providing a 3-dimensional matrix for cell growth in a compact configuration (fig. 1). up to 500 m² of surface area is available in a compact bioreactor measuring 88 mm in diameter in a total volume of 75 l with ph, do and temperature control. a key feature of the icellis bioreactor is that it scales by increasing the diameter of the fixed bed while keeping the height constant, with no change in aspect ratios. the height of the fixed bed can be varied (2, 4 and 10 cm), as can the density of carrier packing (96 g/l or 144 g/l). the icellis system comes in two formats, the icellis nano bioreactor (0.53-4.0 m²) and the icellis 500 bioreactor (66-500 m²).
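the quoted surface-area ranges can be related to bed height and packing density with a simple proportionality. the sketch below assumes, as a simplification that is not taken from vendor documentation, that at a fixed bed diameter the available surface area scales linearly with bed height and with carrier packing density, and that the smallest quoted area belongs to the lowest bed (2 cm) at the lower density (96 g/l).

```python
def scaled_area_m2(ref_area_m2, ref_height_cm, ref_density_g_l,
                   height_cm, density_g_l):
    """Scale a reference surface area linearly with bed height and density."""
    return (ref_area_m2
            * (height_cm / ref_height_cm)
            * (density_g_l / ref_density_g_l))


# Assumed reference points: smallest quoted area at 2 cm and 96 g/l.
nano_ref = (0.53, 2, 96)
large_ref = (66, 2, 96)

nano_2cm_high_density = scaled_area_m2(*nano_ref, height_cm=2, density_g_l=144)
nano_10cm_high_density = scaled_area_m2(*nano_ref, height_cm=10, density_g_l=144)
large_10cm_high_density = scaled_area_m2(*large_ref, height_cm=10, density_g_l=144)

print(nano_2cm_high_density, nano_10cm_high_density, large_10cm_high_density)
```

under this assumption the estimates come out close to the upper limits quoted for both formats (about 4.0 m² and 500 m²), which suggests that the quoted ranges are spanned by the bed height and packing density options.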
processes developed in the bench-top icellis nano bioreactor can be directly transferred to the corresponding icellis 500 system. the icellis nano bioreactor provides an efficient platform for process optimization. the genethon raav-8 process was transferred to a 0.8 m² icellis nano bioreactor (2 cm bed height, 144 g/l density) using freestyle media. the initial icellis nano process was established as (1) seed on day 1, (2) transfect on day 5, (3) harvest on day 8, and yielded <1×10⁹ vp/cm² (n=3). media exchange, cell density at transfection, pdna/cell ratio, and lysis method were then changed to determine their effect on productivity. the modified process was then scaled from the 0.8 m² to the 4.0 m² (10 cm bed height, 144 g/l density) icellis nano bioreactor.
-media: a media exchange at 5 hours post transfection, with dmem substituted for freestyle medium, resulted in an 8x increase in specific productivity.
-cell density at transfection: cells were seeded at 6,000 cells/cm² and reached 200,000 cells/cm² at day 5, which was determined to be the optimal cell density for transfection.
-pdna/cell ratio: reducing pdna by 50% had no significant effect on productivity.
-lysis: use of triton x-100 at 0.5% with 100 mm nacl at ph 8 resulted in >100% virus recovery compared to sampled carriers.
-scaling: specific productivity was maintained as the system was scaled from 0.8 m² to 4.0 m².
-overall, an average yield of 4×10¹³ vg/m² was achieved.
the icellis technology is being adopted widely for viral vector production. transferring a process to the icellis nano bioreactor can be easily achieved, and once in place the process can be optimized to provide significant productivity increases and cost savings such as reduced pdna. the icellis nano bioreactor is an efficient bench-top system, the results of which can be readily scaled to the icellis 500 system.

fig. 1 (abstract p-349). (a) schematic overview of raav production in hek293 cells with the triple-transfection system. (b) viable cell densities (vcd), viabilities and genomic copies per ml (gc) of a raav production with 293-f batch cultivations in shaker flasks. genomic copies per ml refer to the titer determined in 1 ml culture volume. error bars represent biological and technical duplicate measurements of samples. (c) viable cell densities and genomic copies per cell of a raav production with 293-f batch cultivation in a 2 l bioreactor. for reasons of comparability between shaker and bioreactor data, genomic copies are given per cell. error bars represent technical duplicate measurements of samples.

background
the tissuse multi-organ-chip (moc) platform contributes to the ongoing advancement in systemic substance testing in vitro. current in vitro and animal tests for drug development fail to emulate the systemic organ complexity of the human body and therefore often do not accurately predict drug toxicity. cardiotoxicity in particular is one of the main reasons why new compounds fail in clinical trials. therefore, we aimed to establish an autologous dynamic multi-organ device integrating cardiomyocytes for substance testing. generic 2d monolayer and 3d suspension differentiation protocols for ipsc-derived cardiomyocytes were established. beating cardiomyocytes were first seen on day 8 in monolayer as well as in spheroid culture. up to 64% of the cells were cardiac troponin t positive and 44% were myosin heavy chain positive by flow cytometry (fig. 1g, h). expression of myosin ii heavy chain, α-actinin, myosin 9/10, myosin 11 and caldesmon was shown by immunohistochemistry (fig. 1a-d). because lactate enrichment of cardiomyocytes was omitted, cardiac fibroblasts are also present in the spheroids, as shown by vimentin staining. these cardiac fibroblasts lead to a physiological, heterogeneous cell population similar to that of the human heart.
beating spheroids were cultivated for 7 days under dynamic culture conditions in the multi-organ-chip. the integrated on-chip micropump provides physiological-like pulsatile circulation at a microliter scale and leads to better nutrition and oxygen supply. the next significant step is to combine multiple autologous 3d organ equivalents in our multi-organ-chip using ipsc differentiation technology. differentiating all cell types from one ipsc donor is crucial to overcome source and rejection problems. combining our multi-organ-chip platform with ipsc differentiation technology will eventually lead to a personalized system for drug and substance testing.

lab as a service - automated cell-based assays
lena schober, moriz walter, andrea traube
laboratory automation and biomanufacturing engineering, fraunhofer ipa, stuttgart, germany
correspondence: lena schober (lena.schober@ipa.fraunhofer.de)
bmc proceedings 2018, 12(suppl 1):p-365

background
the use of cell-based assays in the pharmaceutical industry and in academic research is a growing trend and a driving force for reducing the costs of drug development. academic research is gaining information about intracellular targets and functional mechanisms through a variety of different assays. these benefits can be used in preclinical studies, and costly late-stage drug failures may be reduced by the use of cell-based assays. the use of automated systems is also in great demand and will change the testing of substances and research activities. nevertheless, many barriers currently limit the successful application of automated systems in this field; the lack of flexibility and the demand for skilled computer scientists and engineers are the two main aspects stated by experts. our strong background in automated cell culture technologies and the expertise gained in several projects let us rethink the overall process chain and overcome established principles.
a new service-oriented platform for the execution of commonly used cell-based assays will be introduced. the main idea is to give access to automated infrastructure to academic research groups or spin-offs which cannot afford the special infrastructure.

it is now known that the development of inhibitory antibodies by hemophiliac patients is closely related to immunogenic epitopes present in the coagulation factors. these proteins are produced in hamster cells [1-4], which introduce a posttranslational modification profile that differs from the human profile. patients with high-titer/high-responding inhibitors must be treated with bypassing agents that can achieve hemostasis. activated factor vii (fviia) is an attractive candidate for hemostasis, independent of fviii/fix, making this coagulation factor an alternative for hemophilia patients with inhibitory antibodies. however, recombinant factor vii is produced in bhk-21 cells (baby hamster kidney cells) and, like the other coagulation factors, may contain immunogenic epitopes [5-7]. in this context, it becomes extremely important to produce recombinant proteins with complex posttranslational modifications in a cell line not yet used [8-10]. we have been using the sk-hep-1 human cell line for the production of recombinant fvii. to generate the recombinant cell line we used a bicistronic lentiviral vector, 1054-gfp, containing an fvii gene and the gfp selection marker gene. a master cell bank and a working cell bank were generated under gmp conditions. rfvii was analyzed by elisa, western blot, gene expression quantification and biological activity using the prothrombin time (pt) assay. rfvii was purified by affinity chromatography using a viiselect (ge) column. after purification the rfvii was formulated and freeze-dried for use in in vivo experiments.
under static conditions sk-hep-1 cells showed, for a period of 6 months, stable fvii production with an average of 8.03 iu/ml of fvii, 83% cell viability and 77% of cells expressing the gfp gene. after purification with the viiselect column it was possible to observe a recovery of 65% of the purified protein with a 95% degree of purity (fig. 1). this recombinant purified fvii is being used in in vivo experiments to determine the pharmacokinetic parameters and to evaluate the posttranslational modification profile. in conclusion, this study reports the use of the sk-hep-1 cell line for high-level production of recombinant factor vii. these cells have proven to be effective in the production of recombinant protein and can be used as a new platform for the production of recombinant proteins.

fig. 1 (abstract p-366). (a) determination of protein yield of egfr (epidermal growth factor receptor) synthesized in a cho cecf system. analysis of egfr protein yield obtained in various batches of cecf-formatted reactions. cecf synthesis was performed in the presence of ¹⁴c-leucine for radiolabeling of target proteins. radiolabeled proteins were precipitated using tca, followed by scintillation measurement. (b) detection of radiolabeled egfr by autoradiography. a no template control (ntc) was prepared containing no egfr dna template.

background
the emergence of stem cell-based regenerative medicine has recently led to the need for sustained production of such cells [1]. hence, new bioreactors and carriers have been designed for cell expansion. however, to meet this increasing demand, improvement of both the quality and the quantity of stem cells remains necessary. soft biocompatible microcarriers mimicking the extracellular matrix in terms of structure and stiffness should be valuable, as substrate stiffness strongly influences in vitro stem cell fate and differentiation [2, 3].
our expertise in the field of microbead design using jetcutting technology [4] enabled us to engineer alginate beads of approximately 200 μm with various g/m monomer ratios. we used a jetcutter (genialab gmbh) with a 100 μm nozzle at a maximum speed of 12000 rpm. alginate solutions with concentrations of 2% to 4% were gelled in a 2% cacl2 etoh 50% solution. alginates with estimated viscosity (@1%) from 30 to 720 mpa were tested. a further surface treatment with gelatine (0.1%, 1%) and poly-l-lysine (0.1%) was carried out to reach optimal cell anchoring of human adipose-derived mesenchymal stem cells (atcc-psc-500-011) in mesempro rs medium (gibco). the jetcutter technology allowed us to obtain alginate microcarriers with good size homogeneity around 200 μm and a sphericity comparable to commercial carriers (table 1). the best adhesion of human adipose-derived mesenchymal stem cells was obtained on alginate carriers coated with 0.1% gelatine (fig. 1). we observed limited apoptosis, and the stemness of the human adipose-derived mesenchymal cells was conserved after 14 days in culture (data not shown).

cellular bioassays developed with functionally immortalized cell lines
aileen bleisch¹, aleksandra velkova³, tom wahlicht², dagmar wirth², tobias may¹
¹ inscreenex gmbh, braunschweig, germany; ² msys, helmholtz centre for infection research, braunschweig, germany; ³ greiner bio-one gmbh, frickenhausen, germany
bmc proceedings 2018, 12(suppl 1):p-381

background
a major challenge of current research is the limited availability of physiologically relevant cells [1]. the development of relevant cellular bioassays that are robust, reproducible and scalable is thus hindered. to overcome current limitations we developed an immortalization strategy that allows the efficient and reproducible establishment of novel cell lines showing an in vivo-like phenotype.
the main feature of our ci-screen technology is the ability to combine the advantage of cell lines (the unlimited cell supply) with the advantage of primary cells (the physiological relevance). using this technology we have immortalized, amongst others, a human osteoblast cell line (ci-huob) [2]. in the present study, the in vivo-like phenotype and functionality of the novel ci-huob was examined. to this end, ci-huob cells were used to develop a 3d cell culture model using the magnetic 3d bioprinting technology (nano3d biosciences, houston, tx, usa) [3]. the ci-huob cell line was recently described and was cultivated in huob maintenance medium (inscreenex, germany). for spheroid creation, ci-huobs were grown in a monolayer and magnetized by adding a magnetic nanoparticle assembly (nanoshuttle, ns, nano3d biosciences, houston, tx, usa) at a concentration of 4 μl ns/cm² growth area. after an overnight incubation, magnetized ci-huob were detached and seeded into cellstar® cell-repellent 96-well plates (greiner bio-one, frickenhausen, germany). with the help of mild magnetic forces, cells were printed into spheroids within 2 h. these consisted of 1,000-50,000 cells and were cultured for a period of up to 50 days. cell viability was analyzed by propidium iodide (pi) and calcein am staining. to improve spheroid functionality, spheroids were cultivated with huob differentiation medium (inscreenex, germany). "mini bone" tissue functionality and thus mineralization was analyzed by alkaline phosphatase staining (alkaline phosphatase activity) and alizarin red s staining (ca²⁺ deposits). the combination of ci-huob cells with the magnetic 3d bioprinting technology enabled the establishment of reproducible and consistent 3d spheroids. single spheroids per well were formed independent of the number of cells (1,000-50,000 cells) (fig. 1a). formed spheroids were stable for a culture period of up to 50 days (fig. 1b).
neither cell death nor cell proliferation was observed in the bioprinted spheroids, as indicated by the stable size of the spheroids throughout the cultivation (fig. 1c). after treatment with a differentiation stimulus the 3d bioprinted spheroids became fully functional "mini bones". this was highlighted by the alkaline phosphatase activity and the ca²⁺ deposits within the 3d bioprinted spheroids (fig. 1d,e). taken together, these results demonstrate that the functional immortalization technology provides physiologically relevant cells in sufficient numbers and that the magnetic 3d bioprinting technology enables fast, consistent cell aggregation and the formation of stable, uniform spheroids. importantly, these immortalized cells are capable of differentiating when a suitable stimulus is provided. for differentiation into mini bones, 3d spheroid cultivation and additional stimulation by small molecules are required. the combination of physiologically relevant cell systems with three-dimensional culturing will help to generate in vitro test systems which closely resemble the in vivo physiology and thereby support future drug discovery approaches.

fig. 1 (abstract p-381). characterization of spheroid "mini bones". (a) different numbers (1,000-50,000 cells) of ci-huob cells were printed into spheroids. (b) 20,000 ci-huobs were printed into spheroids and cultivated for the indicated time points. (c) for analyzing spheroid sizes, pictures were taken and quantified with imagej. (d/e) 20,000 ci-huob cells were printed into spheroids and cultivated with (huob differentiation medium) or without a differentiation stimulus for two weeks.
afterwards, bioprinted spheroids were sectioned by a cryo microtome and (d) stained for ca²⁺ deposits (alizarin red s) or (e) stained for alkaline phosphatase activity.

crispr/cas9, a novel genomic tool to knock down microrna in vitro and in vivo degron-tagged dcas9/cpf1 effectors for multi-directional drug-inducible control of synthetic gene regulation assessing the variability of an innovator molecule n-glycan profile correct primary structure assessment and extensive glyco-profiling of cetuximab by a combination of intact, middle-up, middle-down and bottom-up esi and maldi mass spectrometry techniques 2d-dige screening of high productive cho cells under glucose limitation - basic changes in the proteome equipment and hints for epigenetic effects dependence on glucose limitation of the pco2 influences on cho cell growth, metabolism and igg production fc galactosylation modulates antibody-dependent cellular cytotoxicity of therapeutic antibodies fc glycans of therapeutic antibodies as critical quality attributes development of an automated, multiwell plate based screening system for suspension cell culture global cancer statistics remission of disseminated cancer after systemic oncolytic virotherapy screening different host cell lines for the dynamic production of measles virus beitrag zur kollektiven behandlung pharmakologischer reihenversuche a simple method of estimating fifty per cent endpoints webinar: ambr15 as a sedimentation-perfusion model for cultivation characteristics and product quality prediction "de novo" high density perfusion medium: increased productivity and reduced perfusion rates a high-yielding cho transient system: co-expression of genes encoding ebna-1 and gs enhances transient protein expression an integrated vector system for the eukaryotic expression of antibodies or their fragments after selection from phage display libraries towards the development of a surface plasmon resonance assay to evaluate the glycosylation pattern of monoclonal 
antibodies using the extracellular domains of cd16a and cd64 biotinylation of the fc gamma receptor ectodomains by mammalian cell co-transfection: application to the development of a surface plasmon resonance-based assay ambr™ mini-bioreactor as a high-throughput tool for culture process development to accelerate transfer to stainless steel manufacturing scale: comparability study from process performance to product quality attributes maximizing binding capacity for protein a chromatography protein glycosylation and its role in protein folding afucosylated antibodies increase activation of fcγriiia-dependent signaling components to intensify processes promoting adcc sialic acids and other nonulosonic acids production of antibody in insect cells suitability and perspectives on using recombinant insect cells for the production of virus-like particles cloning of cdna and characterization of anti-rnase a monoclonal antibody 3a21 production of functional antibody fab fragment by recombinant insect cells optimization of hek-293s cell cultures for the production of adenoviral vectors in bioreactors using on-line our measurements enhancing heterologous protein expression and secretion in hek293 cells by means of combination of cmv promoter and ifnα2 signal peptide hek293 cell culture media study towards bioprocess optimization: animal derived component free and animal derived component containing platforms comparison of control strategies for fed-batch culture of hybridoma cells based on on-line monitoring of oxygen uptake rate, optical cell density and glucose concentration continuous bioprocessing: the real thing this time? mabs white paper on continuous bioprocessing fda perspective on continuous manufacturing. ifpac annu. 
meet screening and assessment of performance and molecule quality attributes of industrial cell lines across different fed-batch systems amanullah: quantitative modeling of viable cell density, cell size, intracellular conductivity, and membrane capacitance in batch and fed-batch cho processes using dielectric spectroscopy optimal and consistent protein glycosylation in mammcalian cell culture journal of laboratory automation. designs and concept-reliance of a fully automated high content screening platform mini-bioreactor as a highthroughput tool for culture process development to accelerate transfer to stainless steel manufacturing scale: comparability study from process performance to product quality attributes perfusion media development using cell settling in automated cell culture system model-based design of process strategies for cell culture bioprocesses: state of the art and new perspectives model-based strategy for cell culture seed train layout verified at lab scale hyperosmotic stimulus study discloses benefits in atp supply and reveals mirna/mrna targets to improve recombinant protein production of cho cells effects of osmoprotectant compounds on ncam polysialylation under hyperosmotic stress and elevated pco 2 evaluating the bottlenecks of recombinant igm production in mammalian cells igm characterization directly performed in crude culture supernatants by a new simple electrophoretic method effect of culture ph on erythropoietin production by chinese hamster ovary cells grown in suspension at 32.5 and 37.0°c enhancing protein expression in hek-293 cells by lowering culture temperature relationship between tissue plasminogen activator production and specific growth rate in chinese hamster ovary cells cultured in mannose at low temperature a quantitative proteomic analysis of cellular responses to high glucose media in chinese hamster ovary cells process analytical technologies in the pharmaceutical industry: the fda's pat initiative effects of 
ammonium and lactate on growth and metabolism of a recombinant chinese hamster ovary cell culture glycosylation in cell culture structural mechanism of high affinity fcgammari recognition of immunoglobulin g inhibition of glycosylation on a camelid antibody uniquely affects its fcγri binding activity investigation of cho secretome: potential way to improve recombinant protein production from bioprocess bladder cancer cell-derived exosomes inhibit tumor cell apoptosis and induce cell proliferation in vitro exploring packaged microvesicle proteome composition of chinese hamster ovary secretome glycome mapping on dna sequencing equipment iron (iii) citrate inhibits polyethylenimine-mediated transient transfection of chinese hamster ovary cells in serum-free medium efficient high-throughput biological process characterization scale-up of a stirred single-use bioreactor family quality control: er-associated degradation: protein quality control and beyond pharmacological targeting of endoplasmic reticulum stress signaling in cancer when er stress reaches a dead end oxidative stress and antioxidant defense the antioxidant edaravone attenuates er-stress-mediated cardiac apoptosis and dysfunction in rats with autoimmune myocarditis advanced process and control strategies for bioreactors. book chapter model-based strategy for cell culture seed train layout verified at lab scale seed train optimization for cell culture. chapter high cell density cultivation of human leukemia t cells (jurkat cells) in semipermeable polyelectrolyte microcapsules. eng. 
life sci cell retention by encapsulation for the cultivation of jurkat cells in fixed and fluidized bed reactors the role of interleukin-2 during homeostasis and activation of the immune system creating a biomimetic microenvironment for the ex vivo expansion of primary human the era of digital biomanufacturing the multivariate normal distribution recovery from rabies: a call to arms the role of vaccination in rabies prevention eliminating canine rabies, the principal source of human infection: what will it take? rabies virus-like particles expressed in hek293 cells immunogenic virus-like particles continuously expressed in mammalian cells as a veterinary rabies vaccine candidate an avian cell line designed for production of highly attenuated viruses a chemically defined production process for highly attenuated poxviruses a genotype of modified vaccinia ankara (mva) that facilitates replication in suspension cultures in chemically defined medium easy and efficient protocols for working with recombinant vaccinia virus mva large-scale transfection of mammalian cells for the fast production of recombinant protein recombinant protein production by large-scale transient gene expression in mammalian cells: state of the art and future perspectives high density transfection with hek 293 cells allows doubling of transient titers and removes need for a priori dna complex formation with pei regulation of recombinant protein expression during chobri/rcta pool generation increases productivity and stability qc, canada; 2 department of microbiology, infectiology and immunology the cumate gene-switch: a system for regulated expression in mammalian cells rapid protein production from stable cho cell pools using plasmid vector and the cumate gene-switch christoph zehe sartorius stedim cellca gmbh, 88471 laupheim animal cell technology: from target to market recent advances in mammalian protein production molecular mechanisms of rna interference. 
annual review of biophysics ribosome profiling-guided depletion of an mrna increases cell growth rate and protein secretion ubiquitous chromatin-opening elements (ucoes): applications in biomanufacturing and gene therapy the art of cho cell engineering: a comprehensive retrospect and future perspectives a novel bxb1 integrase rmce system for high fidelity site-specific integration of mab expression cassette in cho cells high-troughput lipidomic and transcriptomic analysis to compare sp2/0, cho, and hek-293 mammalian cell lines single cell characterisation of chinese hamster ovary (cho) cells eva pekle 1,2 , guglielmo rosignoli 1 generation of stable chinese hamster ovary pools yielding antibody titers of up to 7.6 g/l using the piggybac transposon system comparison of three transposons for the generation of highly productive recombinant cho cell pools and cell lines effects of ammonia and lactate on hybridoma growth, metabolism, and antibody production lactate and glucose concomitant consumption as a self-regulated ph detoxification mechanism in hek293 cell cultures flux balance analysis of cho cells before and after a metabolic switch from lactate production to consumption reducing recon 2 for steady-state flux analysis of hek cell culture trastuzumab -mechanism of action and use in clinical practice hek293 cell culture media study towards bioprocess optimization: animal derived component free and animal derived component containing platforms enhancing heterologous protein expression and secretion in hek293 cells by means of combination of cmv promoter and ifnα2 signal peptide production of lentiviral vectors references 1. 
wurm f m: production of recombinant protein therapeutics in cultivated mamammalian cells initial identification of low temperature and culture stage induction of mirna expression in suspension cho-k1 cells small indels induced by crispr/cas9 in the 5' region of microrna lead to its depletion and drosha processing retardance production of recombinant protein therapeutics in cultivated mammalian cells improved antibody production in chinese hamster ovary cells by atf4 overexpression the vesicle-trafficking protein munc18b increases the secretory capacity of mammalian cells rapid evaluation of n-glycosylation status of antibodies with chemiluminescent lectin-binding assay intracellular secretion pathway analysis for constructing highly producible engineered cho cells. 16th annual peptalk intracellular secretion analysis of therapeutic antibodies in engineered high-producible cho cells analysis of intracellular recombinant igg secretion in engineered cho cells. the 29th annual and international meeting of the japanese association for animal cell technology humanization strategies for an anti-idiotypic antibody mimicking hiv-i gp41 antibody humanization by molecular dynamics simulations-in-silico guided selection of critical backmutations maxquant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification the perseus computational platform for comprehensive analysis of (prote)omics data hoffrogge: label-free protein quantification of sodium butyrate treated cho cells by esi-uhr-tof-ms the art of cho cell engineering: a comprehensive retrospect and future perspectives gs system for increased expression of difficult-to-express proteins the netherlands correspondence: maurice van der heijden (heijden@proteonic.nl) bmc proceedings increased recombinant protein production owing to expanded opportunities for vector integration in high chromosome number chinese hamster ovary cells ires-mediated translation of 
membrane proteins and glycoproteins in eukaryotic cell-free systems cell-free protein expression based on extracts from cho cells comparison of cell-based vs. cell-free mammalian systems for the production of a recombinant human bone morphogenic growth factor. eng dual-responsive magnetic core-shell nanoparticles for nonviral gene delivery and cell separation pdmaema-grafted core-shell-corona particles for nonviral gene delivery and magnetic cell separation influence of polyplex formation on the performance of starshaped polycationic transfection agents for mammalian cells systematic study of a library of pdmaema-based, superparamagnetic nano-stars for the transfection of cho-k1 cells systems biology of unfolded protein response in recombinant cho cells dynamics of unfolded protein response in recombinant cho cells cells were transfected with np@(pdmaema 1037 ) 46 (n/p 15), separated 24 h post transfection (t = 0) by magnetically-assisted cell sorting and placed into separated cultures. the bars represent the overall transfection efficiency. distribution of low (light green), middle (green), and high (dark green) producers within the egfp-expressing cell fraction. data represent one experiment carried out in duplicate, with random experimental error shown. 
d correlogram between the molecular characteristics of the nano-stars (core diameter, arm density, arm length, number of monomeric units per nano-star), the physicochemical properties of the corresponding polyplexes (hydrodynamic radius, zeta potential), the transfection conditions (n/p ratio, amount of polymer), and the cellular reactions (transfection efficiency, magnetism, viability) production of recombinant protein therapeutics in cultivated mammalian cells position effects on eukaryotic gene expression bacterial artificial chromosome library for genome-wide analysis of chinese hamster ovary cells construction of bac-based physical map and analysis of chromosome rearrangement in chinese hamster ovary cell lines suguru imanishi 1 , akira ito 1 , masamichi kamihira 1,2 1 department of chemical engineering cre recombinase-mediated sitespecific modification of a cellular genome using an integrasedefective retroviral vector targeted transgene insertion into the cho cell genome using cre recombinase-incorporating integrase-defective retroviral vectors an accumulative site-specific gene integration system using cre recombinase-mediated cassette exchange improved recombinant antibody production by cho cells using a production enhancer dna element with repeated transgene integration at a predetermined chromosomal site lipid extraction by methyl-tert-butyl ether for high-throughput lipidomics a simple method for the isolation and purification of total lipids from animal tissues using baculovirus as a gene shuttle in hmsc: optimization of transduction efficacy gundula sprick 1 clinical applications of mesenchymal stem cells concise review: mesenchymal stem cell treatment of the complications of diabetes mellitus wharton's jelly-derived mesenchymal stromal cells as a promising cellular therapeutic strategy for the management of graft-versus-host disease gene therapy: twenty-first century medicine state-of-the-art gene-based therapies: the road ahead viral vectors: a look 
back and ahead on gene transfer technology basic biology of adeno-associated virus (aav) vectors used in gene therapy biosafety challenges for use of lentiviral vectors in gene therapy adenoviral vector-mediated gene therapy for gliomas: coming of age manufacturing of viral vectors for gene therapy: part i. upstream processing the complete dna sequence of autographa californica nuclear polyhedrosis virus efficient gene transfer into human hepatocytes by baculovirus vectors efficient transduction of mammalian cells by a recombinant baculovirus having the vesicular stomatitis virus g glycoprotein recombinant baculoviruses as mammalian cell gene-delivery vectors post-transcriptional regulatory element boosts baculovirusmediated gene expression in vertebrate cells baculoviral vector-mediated transient and stable transgene expression in human embryonic stem cells systematic comparison of constitutive promoters and the doxycycline-inducible promoter baculovirus-induced recombinant protein expression in human mesenchymal stromal stem cells: a promoter study triple-negative breast cancer: an unmet medical need the therapeutic monoclonal antibody market. 
mabs a monoclonal antibody against human notch1 ligand-binding domain depletes subpopulation of putative breast cancer stem-like cells notch activation stimulates migration of breast cancer cells and promotes tumor growth notch-out for breast cancer therapies aav production in suspension: evaluation of different cell culture media and scale-up potential modular adeno-associated virus (raav) vectors used for cellular virusdirected enzyme prodrug therapy production of recombinant adenoassociated virus vectors using suspension hek293 cells and continuous harvest of vector from the culture media for gmp fix and flt1 clinical vector development of a cost-efficient scalable production process for raav-8 based gene therapy by transfection of hek-293 cells simon arias 2 , mustapha hohoud 1 , roel lievrouw 1 , fabien moncaubeig 1 b-1120 brussels, belgium; 2 généthon, rue de l'internationale 1 cell-free systems based on cho cell lysates: optimization strategies, synthesis of "difficult-to-express" proteins and future perspectives cell-free protein expression based on extracts from cho cells comparison of cell-based vs. 
cell-free mammalian systems for the production of a recombinant human bone morphogenic growth factor ires-mediated translation of membrane proteins and glycoproteins in production of recombinant factor vii in sk-hep-1 human cell line zip 14049-900, brazil; 3 department of clinical, toxicological and food science analysis, faculty of pharmaceutical sciences of ribeirão preto human cell lines for the production of recombinant proteins: on the horizon production of recombinant protein therapeutics in cultivated mammalian cells recombinant protein therapeutics from cho cells -20 years and counting establishment of a cell line expressing recombinant factor vii and its subsequent conversion to active form fviia through hepsin by genetic engineering method expression and fast preparation of biologically active recombinant human coagulation factor vii in cho-k1 cells implications of the presence of n-glycolylneuraminic acid in recombinant therapeutic glycoproteins uniquely human evolution of sialic acid genetics and biology production platforms for biotherapeutic glycoproteins. 
occurrence, impact, and challenges of non-human sialylation human cells: new platform for recombinant therapeutic protein production therapeutic glycoprotein production in mammalian cells mastering industrialization of cell therapy products tissue cells feel and respond to the stiffness of their substrate matrix elasticity directs stem cell lineage specification continuous cider fermentation with co-immobilized yeast and leuconostoc oenos cells eternity and functionality -rational access to physiologically relevant cell lines generation and characterization of two immortalized human osteoblastic cell lines useful for epigenetic studies biocompatibility of nanoshuttletm and the magnetic field in magnetic 3d bioprinting authors thankfully acknowledge the biotechnology and biological sciences research council for funding this research work. sns thanks esact 2017 for providing her with the opportunity to present her work at the meeting. we would like to thank moritz frei for his support for the generation of the ngs transcriptomics data. many thanks to valentine chevallier for her valuable advice, to stefanos grammatikos for his support and to the whole upstream process sciences team. we thank david bruehlmann and thomas vuillemin from merck (vevey, switzerland) for providing the igg glycan variants and the 2-ab-uplc glycan data. 
polyplus-transfection would like to thank généthon for their kindly provided data.acknowledgements asmita mukerji, reetesh pm, sasi kumar k, pilot plant team acknowledgment to cedric allier from cea leti, grenoble, france. we would like to thank a. schemel and a. ehrlich for technical assistance. the authors would like to mention that this research was supported by the fi-dgr (2017) from spanish government and the project was led by prof. jordi joan cairó badillo. the authors want to thank the biotech process sciences team at merck in corsier-sur-vevey for their support and also the members of the morbidelli group at eth zürich for their input and collaboration. authors would like to thank dr. benjamin youn in manufacturing science and technology (msat) at biomarin for his help on coding the excel macro program for ambr15, and dr. donald l. traul from tap biosystems (now part of the sartorius stedim biotech group) for his assistance on ambr15 operations. thanks to the bioreactor team of dustin davis, amer al-lozi, and jana mahadevan. we organic vaccines tm and the nih, who kindly provided to univercells. we thank polymun scientific immunbiologische forschung gmbh for providing the antibodies igm103, igm104 and igm617 as a kind gift. this allison kurz and gian andrea signorell, genedata ag, basel, switzerland allison kurz, gian andrea signorell, genedata, basel, switzerland. allison kurz, gian andrea signorell, genedata ag high glucose concentration and low specific cell growth rate improve specific r-tpa productivity in chemostat culture of cho acknowledgements r. boraston for photographs in fig. 1 . we acknowledge atum for their contributions to vector design and construction. austrian bmwfw, bmvit, sfg, standortagentur tirol, government of lower austria and business agency vienna through the austrian ffg-comet-k2. this research is supported by sfi grant 13/ia/1841. 
financial support of the austrian science fund (fwf; grant number p 25056) is gratefully acknowledged. we would like to thank the australian institute for bioengineering and nanotechnology, university of queensland-brisbane, australia (aibn) for providing the cho clones. this research is partially supported by the developing key technologies for discovering and manufacturing pharmaceuticals used for next-generation treatments and diagnoses both from meti and amed, japan. this work was supported in part by grants for developing key technologies for discovering and manufacturing pharmaceuticals used for next-generation treatment and diagnoses, both from the ministry of economy, trade and industry (meti), japan and from the japan agency for medical research and development (amed). many thanks to stefanos grammatikos for his support and to the whole upstream process sciences team. the authors thank the hessen state ministry of higher education, research and the arts within the hessen initiative for scientific and economic excellence (loewe program) for the financial support. the authors thank xell ag, bielefeld, for providing hek serum-free media (hek gm and hek tf) and for fruitful discussions. the authors would like to thank dana wenzel for cho lysate preparation (fraunhofer izi, potsdam-golm, germany). this work is supported by the european regional development fund (efre) and the german ministry of education and research (bmbf, no. 031b0078a). the authors acknowledge são paulo research foundation - fapesp (2015/19017-6), centro de pesquisa, inovação e difusão (cepid), and national institute of science and technology in stem cell and cell therapy - inctc for financial support. this work was supported by grants from the niedersächsisches ministerium für wissenschaft und kultur (80029155) and the german ministry for economic affairs and energy (igf 16153 n). 
the infrastructure, which was built up modularly, consists of automated liquid handling robots, plate and tube handling robots, incubators, a refrigerator and analysis systems such as an imaging system. the aim is to address the need for reproducible and reliable results and to offer access to a highly controlled and automated environment. a web-based configurator makes assay selection and assay parameterization straightforward. after the order process, test items can be shipped to the lab, where the assays are executed on the fully automated platform. capturing in-process data as well as environmental conditions yields a complete data set and supports comprehensive results. as soon as results become available during the process, they can be viewed and analysed in a secure cloud. the service can be used for single experiments in low throughput applications and is therefore a benefit for labs which cannot afford automated infrastructure or the staff to maintain such platforms. extensive monitoring and data capturing during the run lead to a gapless data trail and the possibility of detailed result analysis. automated processing increases reproducibility and directly reduces costs and time. the centralized service paired with specific know-how allows up-scaling of processes at any time. the web-based interface provides flexible guidance for the user, and online ordering gives 24/7 access to the infrastructure, leading to fast and reliable result generation. furthermore, secure interaction with additional services, e.g. other specific data analysis tools, is possible. this dynamic access to automation offers high flexibility for low throughput experiments and will advance high quality research and early-stage drug development. 
development of alternative animal cell technology platforms: cho based cell-free protein synthesis systems for the production of "difficult-to-express" proteins lena thoring 1, 2 background nowadays, animal cell technologies are commonly used for a broad range of medical and pharmaceutical applications. one main topic of these technologies is the production of proteins used for therapeutic purposes. these in vivo production processes are often time consuming and limited in the production of so-called "difficult-to-express" proteins, including the pharmaceutically relevant class of membrane proteins. to overcome these issues, novel cell-free protein synthesis platforms were developed based on the industrial workhorse cho cells [1]. cell lysates provide a basis for this technology by including all components of the translational machinery and enabling protein production within a few hours. microsomal structures present in cho cell lysates enable posttranslational modification of target proteins and insertion of membrane proteins into lipid bilayers. in this study a cell-free protein synthesis platform was developed based on a combination of cho cell lysates and a continuous exchange reaction format. the continuous exchange reactor consists of a two-chamber system, a reaction and a feeding chamber, separated by a semipermeable membrane. due to concentration gradients, energy components can diffuse into the reaction chamber, while inhibitory byproducts are continuously removed. different classes of proteins were selected to evaluate the quality of the cho cecf system, including a transmembrane receptor, a single chain variable fragment and an ion channel. cell-free protein synthesis was performed in the presence of 14c-leucine for radiolabeling of synthesized proteins. protein yield was quantified by tca precipitation of radiolabeled proteins followed by scintillation measurement, and molecular mass was determined by autoradiography. 
posttranslational modifications and activities of proteins were estimated by kinase assays, elisa, endoglycosidase treatment and electrophysiological measurements. the results showed protein yields of up to around 1 g/l, with correct molecular weights confirmed by autoradiography. analysis of the productivity using different lysate batches by the production of the membrane protein egfr revealed only minimal batch-to-batch variations (fig. 1a). posttranslational modifications of proteins, including phosphorylation and glycosylation, were detected using western blot and autoradiography (fig. 1b). evaluation of the localization of membrane embedded eyfp fusion proteins by confocal laser scanning microscopy resulted in the detection of proteins in the microsomal fraction of cho cell lysate. produced single chain variable fragments showed binding specificity in elisa experiments. the activity of synthesized ion channels was confirmed by electrophysiological measurements, which detected single channel activities. a cell-free system based on cho cell lysates for high yield production of proteins was developed that provides a platform for efficient production of "difficult-to-express" proteins. the combination of a cho lysate based cell-free system and a continuous exchange reaction format yields a highly efficient production system for various classes of "difficult-to-express" proteins. this approach opens up a fast and cost-effective process pipeline for the production of "difficult-to-express" proteins and shows a high potential for industrial applications including screening technologies, protein structure determination and just-in-time protein production processes. 
key: cord-011602-hzqayt3n authors: chen, jianlin; liu, xiaorong; chen, jianhan title: targeting intrinsically disordered proteins through dynamic interactions date: 2020-05-11 journal: biomolecules doi: 10.3390/biom10050743 sha: doc_id: 11602 cord_uid: hzqayt3n intrinsically disordered proteins (idps) are over-represented in major disease pathways and have attracted significant interest in understanding if and how they may be targeted using small molecules for therapeutic purposes. while most existing studies have focused on extending the traditional structure-centric drug design strategies and emphasized exploring pre-existing structure features of idps for specific binding, several examples have also emerged to suggest that small molecules could achieve specificity in binding idps and affect their function through dynamic and transient interactions. these dynamic interactions can modulate the disordered conformational ensemble and often lead to modest compaction to shield functionally important interaction sites. much work remains to be done on further elucidation of the molecular basis of the dynamic small molecule–idp interaction and determining how it can be exploited for targeting idps in practice. these efforts will rely critically on an integrated experimental and computational framework for disordered protein ensemble characterization. in particular, exciting advances have been made in recent years in enhanced sampling techniques, graphic processing unit (gpu)-computing, and protein force field optimization, which have now allowed rigorous physics-based atomistic simulations to generate reliable structure ensembles for nontrivial idps of modest sizes. such de novo atomistic simulations will play crucial roles in exploring the exciting opportunity of targeting idps through dynamic interactions. proteins are central components of regulatory networks that dictate virtually all aspects of cellular decision-making [1] . 
demand for more sophisticated signaling in complex multicellular organisms has been met with increasing utilization of proteins that are highly flexible [2-4]. in particular, so-called intrinsically disordered proteins (idps) account for ~50% of signaling-associated proteins in eukaryotes [5]. these proteins have lower sequence complexity compared to folded proteins, lacking large hydrophobic residues and enriched with charged and polar ones [6]. they do not have stable tertiary structures in the unbound state under physiological conditions, even though they frequently undergo folding transitions upon binding to specific targets [7]. the inherent thermodynamic instability of the structural features of this class of proteins allows their conformational properties to respond sensitively to numerous stimuli, including the binding of various small and large molecules, changes in cellular environments (e.g., ph), and post-translational modifications [8-13]. multiple signals could also be naturally integrated through cooperative responses of the dynamic structure ensemble (such as coupled binding and folding) [14]. these properties make idps uniquely suitable for fulfilling the complex signaling needs of higher organisms. at the same time, deregulation of idps has been associated with many human diseases, including cancers, neurodegenerative diseases, heart disease, and diabetes [5, 15-20]. for example, over two-thirds of cancer-associated proteins have been predicted to contain extensive regions of intrinsic disorder [5], and predicted disordered regions have been estimated to house almost one quarter of disease-associated missense mutations [21]. there is thus tremendous interest in determining if and how idps may be targeted for therapeutic purposes. 
the dynamic and heterogeneous nature of unbound idps presents substantial challenges for characterization and this has proven to be a major bottleneck for establishing a reliable sequence-structure-function-disease relationship of idps [14, 22-26]. the lack of a clear understanding of the molecular basis of idp function and deregulation in diseases has created significant ambiguity on the druggability of most idps, including transcription factors [16]. most existing case studies of targeting idps have focused on extending the traditional structure-based screening and drug design strategies and emphasize exploiting residual structures and pre-existing potential binding pockets of the unbound state [27-43]. nonetheless, it is clear that the disordered nature of idps would require novel strategies for targeting as well as new conceptual frameworks for thinking about how small molecule binding could modulate idp structure and function. in particular, it has been recognized that it may be more useful to consider the problem of targeting idps in the context of structural ensemble modulation [44], even though it is generally believed that one still needs to achieve specific interactions, such as by exploiting pre-existing structural features [45]. many outstanding reviews have already been dedicated towards existing examples along these lines and they also provide extensive discussion of the successes, opportunities, and challenges of targeting idps via specific interactions of small molecules in neurodegenerative diseases, cancers, and other diseases [18, 45-55]. 
in this review, we will first summarize important recent advances in physics-based de novo simulations of disordered protein ensembles, including graphic processing unit (gpu) computing, enhanced sampling, and re-balanced protein force fields, and then focus on emerging examples that suggest the exciting possibility of targeting idps by directly modulating the disordered ensembles through dynamic and transient interactions. we will discuss the promise of such a broader view of how idps may be targeted as well as key challenges and required methodological developments to support targeting idps via dynamic interactions. a principal challenge in understanding the druggability and best targeting strategy of idps resides in the difficulty of detailed characterization of disordered protein states [14, 23, 24, 56]. these states need to be represented using heterogeneous structure ensembles and are not amenable to traditional high-resolution structure determination methods. here, we briefly discuss the current status and challenges of disordered protein ensemble determination, which has a direct impact on the ability to devise effective strategies of designing idp binders and optimizing leads identified from traditional screening efforts. experimentally, a wide range of biophysical methods can be applied to characterize disordered protein states, including nmr, circular dichroism (cd), small-angle x-ray scattering (saxs), förster resonance energy transfer (fret), hydrogen/deuterium (h/d) exchange, mass spectrometry, and others [57, 58]. these methods can provide complementary information on the local, intermediate, and long-range structural organizations of idps. nmr in particular is arguably the most powerful technique for structural studies of idps. many nmr observables can be measured at residue and atomic levels to infer secondary and tertiary structural properties. 
saxs and fret are highly complementary to nmr and provide information on the long-range global organization of the disordered ensemble. yet, a fundamental limitation is that these experimental measurements generally reflect ensemble-averaged properties, which alone are not sufficient to uniquely define the underlying heterogeneous ensemble due to the severely underdetermined nature of the structural calculation problem [22, 23, 25, 59-61]. at present, the most robust methods generally involve first generating a large number of candidate random structures and then using experimental structural restraints to select and construct optimal sub-ensembles according to various statistical criteria [62-71]. nonetheless, these methods rely critically on the ability to generate initial candidate structures that are not only diverse enough to cover the range of accessible states of the protein but also specific enough to contain any nontrivial local (and long-range) structure features associated with a particular protein state. these two requirements are difficult to satisfy simultaneously for most proteins of moderate sizes (e.g., 50-100 residues or longer) and complexity. as a consequence, disordered ensembles constructed using these approaches depend critically on the underlying protein model and/or coil library. furthermore, these ensembles are generally not proper thermodynamic ensembles. they should be considered just as ensemble models and cannot be used to reliably quantify statistical properties and extract thermodynamic parameters. as such, these experimental restraint-based ensemble construction approaches are likely inadequate for capturing potentially subtle effects of ligand binding on the disordered ensemble. 
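the generate-then-select idea described above can be sketched in a few lines. this is a minimal illustration and not any of the cited methods: it assumes that each candidate structure already comes with back-calculated observables, and it greedily picks structures (with replacement) so that the running average best matches the experimental averages. the function names and the toy numbers are hypothetical.

```python
def chi2(avg, exp, sigma):
    """Sum of squared, error-normalized deviations between averages and data."""
    return sum(((a - e) / s) ** 2 for a, e, s in zip(avg, exp, sigma))


def greedy_select(pool, exp, sigma, size):
    """Greedily build a sub-ensemble of `size` structures (with replacement).

    pool  : list of per-structure observable lists (hypothetical
            back-calculated values, one list per candidate structure)
    exp   : experimental ensemble-averaged observables
    sigma : experimental uncertainties
    Returns the chosen pool indices and the final chi-square.
    """
    chosen, totals = [], [0.0] * len(exp)
    best_score = float("inf")
    for _ in range(size):
        best_idx, best_score = None, float("inf")
        k = len(chosen) + 1
        for idx, obs in enumerate(pool):
            # Average the observables as if this candidate were added.
            trial_avg = [(t + o) / k for t, o in zip(totals, obs)]
            score = chi2(trial_avg, exp, sigma)
            if score < best_score:
                best_idx, best_score = idx, score
        chosen.append(best_idx)
        totals = [t + o for t, o in zip(totals, pool[best_idx])]
    return chosen, best_score


# Toy pool: two candidate structures, one observable each.
toy_pool = [[0.0], [1.0]]
toy_choice, toy_chi2 = greedy_select(toy_pool, exp=[0.5], sigma=[1.0], size=2)
```

in this toy case a mixture of the two candidates reproduces the average exactly, yet many other mixtures drawn from a larger pool could fit equally well, which is the underdetermination problem noted above.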
given the fundamental challenges of disordered ensemble modeling based on experimental restraints alone, physics-based atomistic simulations have a crucial role to play in helping elucidate the conformational properties of idps and establishing a reliable molecular basis of their function and regulation [72-76]. a particularly attractive approach is to first generate the atomistic ensemble using a transferable physics-based force field in the absence of any experimental restraints and then use the experimental data for independent evaluation of the quality of the simulated ensemble. such a de novo simulation approach can effectively overcome the under-determined nature of disordered ensemble calculation, by leveraging the laws of physics that govern the nature of conformational fluctuation of the protein. successful simulations of disordered protein ensembles have proven to be very challenging, requiring both accurate description of the conformational dependence of energy and sufficient sampling of relevant conformational space of the protein. early idp simulations suffered from both systematic biases in the general-purpose protein force fields and severe limitations in conformational sampling [14]. nonetheless, these physics-based simulation approaches promise to provide rigorous thermodynamic ensembles required for reliable description of idp-ligand interactions. the accuracy and capability of de novo idp simulations can be expected to improve continuously over time, benefiting from robust advances in molecular simulation methodologies and high-performance computing hardware. indeed, important breakthroughs have been made in the last few years in force field accuracy and sampling capability, which arguably have now allowed reliable de novo simulations of at least moderate-sized idps in general. 
one of the most important recent advances in molecular dynamics (md) simulations is the widespread availability of efficient gpu-enabled algorithms in virtually all major molecular simulation packages [77-82]. modern gpus can process thousands of threads in parallel to accelerate explicit solvent md simulations by up to 100× compared to traditional cpu computing, significantly boosting the sampling capability. for example, the most efficient gpu-enabled md codes can yield 100-200 ns per day for systems of ~1,000,000 atoms on a single nvidia rtx 2080ti gpu card that costs only ~$1000. the ability to efficiently sample the protein conformational space has further benefitted from the emergence of enhanced sampling techniques [83-90], particularly various replica exchange (rex)-based methods. among the various rex methods, replica exchange with solute tempering (rest) [86, 91] is particularly suitable for atomistic simulations of idps in explicit solvent. in rest, a selected region of the system (e.g., the protein solute or a flexible segment of the protein) is subjected to tempering (i.e., random walk in the temperature space) while the rest of the system is maintained at a constant temperature (e.g., room temperature). this is achieved by scaling the interactions within the selected region and between it and the rest of the system. because the number of required replicas of replica exchange simulations scales as the square root of the number of atoms, rest can significantly reduce the number of replicas required for covering the needed temperature space (by ~3-fold), thus overcoming a critical drawback in traditional temperature rex simulation in explicit solvent. the role of enhanced sampling in the recent successes of simulating dynamic idp-ligand interactions will become evident in the later sections of this review. 
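as a rough illustration of the interaction scaling described above, the sketch below builds a geometric ladder of effective solute temperatures and the corresponding energy scaling factors. it assumes a rest2-style convention in which solute-solute energies are scaled by t0/tm and solute-solvent energies by sqrt(t0/tm), with solvent-solvent energies left unscaled; this convention, the function names, and the 300-450 k range are assumptions made for illustration and are not specified in the text.

```python
import math


def temperature_ladder(t_min, t_max, n_replicas):
    """Geometrically spaced effective temperatures from t_min to t_max (kelvin)."""
    if n_replicas < 2:
        return [t_min]
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** i for i in range(n_replicas)]


def rest_scaling(t0, tm):
    """Scaling factors for a replica at effective solute temperature tm.

    Assumed REST2-style convention: solute-solute energies are scaled by
    t0/tm, solute-solvent energies by sqrt(t0/tm), solvent-solvent unscaled.
    """
    lam = t0 / tm
    return {
        "solute_solute": lam,
        "solute_solvent": math.sqrt(lam),
        "solvent_solvent": 1.0,
    }


def scaled_energy(e_pp, e_pw, e_ww, t0, tm):
    """Total potential energy of a replica after applying the scaling factors."""
    s = rest_scaling(t0, tm)
    return (s["solute_solute"] * e_pp
            + s["solute_solvent"] * e_pw
            + s["solvent_solvent"] * e_ww)


# Hypothetical ladder: 8 replicas spanning 300 K to 450 K.
ladder = temperature_ladder(300.0, 450.0, 8)
factors = [rest_scaling(300.0, t) for t in ladder]
```

only the solute-containing terms change across the ladder, so every replica keeps its solvent at the base temperature. this is the property that lets the method cover the same effective temperature range with fewer replicas than full-system temperature replica exchange.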
the importance of enhanced sampling for the generation of converged ensembles cannot be overemphasized. for example, figure 1a shows the evolution of the per-residue β-sheet structure during a 30-µs conventional md simulation of an intrinsically disordered aβ40 peptide in explicit solvent at 300 k, performed with anton specialized hardware using the well optimized a99sb-disp force field [92]. note that this trajectory is one to two orders of magnitude longer than typical md simulations performed using general purpose cpu- or gpu-based high-performance computing platforms. yet, very few reversible transitions are observed. for example, a transient β-hairpin spanning residues 15 to 36 persists for about 5 µs from ~10 to 15 µs and never appears again for the rest of the simulation. as a result, the final average residue helix and β-sheet probability profiles calculated from the first and second halves of the trajectory differ greatly, reflecting a very limited level of convergence. there is thus danger in relying on standard md simulations in deriving quantitative characterizations of disordered protein ensembles. it should also be emphasized that achieving a sufficient level of convergence required for resolving potentially subtle changes in the disordered ensemble, such as upon ligand binding, can be extremely challenging even with enhanced sampling. it is critical to carefully analyze and establish the level of convergence for proper interpretation of simulated ensembles. ideally, one should perform two or more independent simulations using distinct initial conformations and compare the resulting ensembles. the simulated ensemble from a single continuous run may appear to stop changing with respect to simulation time due to trapping in numerous local energy minima, giving rise to a misleading impression of convergence. 
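the half-trajectory comparison mentioned above can be expressed as a simple check. the sketch below is a minimal illustration under stated assumptions: secondary structure has already been assigned per frame and per residue as 0/1 flags (for example, 1 for β-sheet), and the function names and toy trajectories are hypothetical.

```python
def per_residue_probability(frames):
    """Fraction of frames in which each residue is in the state of interest.

    frames : list of per-frame lists of 0/1 flags, one flag per residue
             (e.g. 1 if the residue is assigned beta-sheet in that frame).
    """
    n_frames = len(frames)
    n_res = len(frames[0])
    return [sum(f[i] for f in frames) / n_frames for i in range(n_res)]


def half_trajectory_rmsd(frames):
    """RMS difference between per-residue probabilities of the two halves."""
    mid = len(frames) // 2
    first = per_residue_probability(frames[:mid])
    second = per_residue_probability(frames[mid:])
    sq = [(a - b) ** 2 for a, b in zip(first, second)]
    return (sum(sq) / len(sq)) ** 0.5


# Toy data for three residues: a structure that forms once and never returns,
# versus one that forms and breaks reversibly throughout the run.
trapped = [[1, 1, 0]] * 4 + [[0, 0, 0]] * 4
reversible = [[1, 1, 0], [0, 0, 0]] * 4

rmsd_trapped = half_trajectory_rmsd(trapped)
rmsd_reversible = half_trajectory_rmsd(reversible)
```

a large difference between halves signals poor convergence. a small difference is necessary but not sufficient, since a single run trapped in local minima can also look stationary, which is why the same comparison is better applied between independent runs started from distinct conformations.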
the dramatically improved sampling capability has facilitated extensive efforts to reparametrize general-purpose protein force fields to achieve a better balance in describing protein conformational equilibria. studies of disordered protein states have been a key driver of these developments, and several well-characterized idps have been widely used as training systems and/or benchmarks for force field optimization [92] [93] [94] [95] [96] [97] [98] [99]. many force field variants have been developed in recent years, including amber ff99sb [100], ff99sb*-ildn [101] and variants [95, 102], ff03ws [103], ff14sb [104], ff99sbnmr [105], charmm22* [106], charmm36m (and c36mw) [93], a99sb-disp [92], and others. a key focus of these optimization efforts has been to rebalance the protein-protein, protein-water, and water-water interactions. earlier versions of general-purpose protein force fields consistently over-stabilize nonspecific protein-protein interactions and lead to overly compact conformational ensembles for disordered protein states [107, 108]. it was demonstrated that such bias could be effectively compensated by directly increasing the strengths of protein-water dispersion interactions [95, 103, 109], even though other components of the force field should also be reparametrized for self-consistency. the latest charmm36m and a99sb-disp force fields, in particular, have been systematically reparametrized based on extensive simulations of tens of globular and disordered proteins and achieve impressive levels of accuracy for describing both structured and unstructured proteins.
in a recent benchmark study, six of the latest protein force fields were evaluated using the 61-residue n-terminal transactivation domain (tad) of tumor suppressor p53, which is a very challenging system due to its size and complex conformational features [99]. it has been extensively characterized by nmr, saxs, and single-molecule fret and shown to contain a range of nontrivial local and long-range residual structures [110]. the disordered ensemble of p53-tad was calculated with rest2-enhanced sampling using gpu-accelerated gromacs 5.1.4 [80, 111] patched with plumed 2.3.0 [112] [113] [114]. each rest2 simulation lasted 1.0 µs per replica, representing one of the most extensive atomistic simulations of idps of similar sizes. the results show that the ensembles generated using the a99sb-disp force field yield the best agreement with the experimental data at both secondary and tertiary structure levels. for example, the back-calculated nmr paramagnetic relaxation enhancement (pre) effects are highly consistent with the experimental results for all four available labelling sites (figure 2). this suggests that the simulated ensembles not only have the proper overall chain dimension but also recapitulate much of the transient long-range ordering within the unbound state of p53-tad. the latter is an extremely challenging task. the fact that this could be achieved by a99sb-disp represents an exciting breakthrough, suggesting that de novo atomistic simulations are now ready to provide a reliable approach for detailed characterization of the disordered ensembles of at least moderately sized idps.
figure 2. calculated (lines) and experimental (grey bars) nmr paramagnetic relaxation enhancement (pre) effects induced by paramagnetic spin labelling at residues d7, e28, a39, and d61 of p53-tad. red and green traces were calculated from independent control and folding rest2 simulations of p53-tad in a99sb-disp, respectively, to evaluate the level of convergence. control and folding simulations were initiated from helical and fully unstructured structures, respectively, and the length of each rest2 simulation was 1 µs per replica. this figure was adapted from [99]. see [99] for details on the simulation and analysis.
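to illustrate how pre effects can be back-calculated from a simulated ensemble, the sketch below uses the widely used solomon-bloembergen-type expression for a nitroxide spin label, with the transverse rate Γ2 proportional to the ensemble average of r⁻⁶ and the intensity ratio given by r2·exp(−Γ2·t)/(r2 + Γ2). this is a generic approach and not necessarily the protocol of [99]. the correlation time, intrinsic relaxation rate, inept delay, and spectrometer frequency are hypothetical placeholder values.

```python
import math

# Constant for a nitroxide label interacting with a proton, in cm^6 s^-2.
K_PRE = 1.23e-32


def gamma2(distances_angstrom, tau_c=4e-9, larmor_hz=600e6):
    """Ensemble-averaged transverse PRE rate (s^-1).

    distances_angstrom: label-to-proton distance in each conformation (Å).
    tau_c and larmor_hz are illustrative assumptions.
    """
    omega = 2.0 * math.pi * larmor_hz
    # Average r^-6 over the ensemble, converting Å to cm.
    r6_inv = sum((d * 1e-8) ** -6 for d in distances_angstrom) / len(distances_angstrom)
    spectral = 4.0 * tau_c + 3.0 * tau_c / (1.0 + (omega * tau_c) ** 2)
    return K_PRE * r6_inv * spectral


def intensity_ratio(g2, r2_dia=10.0, t_inept=0.010):
    """Paramagnetic/diamagnetic peak intensity ratio for a given PRE rate.

    r2_dia (s^-1) and t_inept (s) are illustrative assumptions.
    """
    return r2_dia * math.exp(-g2 * t_inept) / (r2_dia + g2)
```

because of the r⁻⁶ averaging, a small population of compact conformations dominates the predicted signal loss, which is why pre data are sensitive to transient long-range contacts.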
we note that implicit solvent protein force fields have also been developed and deployed for atomistic simulations of idps with various levels of success [75, 96, 98, 115]. implicit treatment of solvent reduces the simulation system size ~10-fold by estimating the solvation free energy directly. it could provide important advantages for satisfying the simultaneous requirements of adequate sampling and sufficient force field accuracy for simulating disordered protein states. the absinth model, in particular, has demonstrated significant successes in mapping the sequence-conformational space relationship of idps [25]. an improved version named absinth-c was recently developed by including backbone torsion cross-terms optimized based on experimentally derived statistics [98]. independently, the generalized born with molecular volume 2 (gbmv2) model, which is considered one of the most accurate implicit solvent models, was recently implemented on gpu within the charmm/openmm interface [116]. gpu-accelerated gbmv2 is about 60-fold faster and provides a competitive alternative to explicit solvent simulations for studying idp structure and interactions. this model was previously optimized based on enhanced sampling of model peptides and shown to be capable of accurately describing the conformational properties of both folded and unfolded peptides [96]. the development of the gpu-accelerated version thus removed a key bottleneck to broader application of gbmv2 for atomistic simulations of idp structure and interactions. nonetheless, whether these implicit solvent models could provide a viable alternative to explicit solvent simulations in studies of idp-ligand interactions is yet to be demonstrated.
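the idea of estimating the solvation free energy directly can be illustrated with the generic generalized born pairwise formula. this is a minimal sketch and not the gbmv2 implementation: gbmv2 derives effective born radii from a molecular volume integration, whereas here the radii are simply passed in. the dielectric constants and the function names are illustrative assumptions.

```python
import math

# Coulomb constant in kcal*Å/(mol*e^2).
COULOMB = 332.0637


def f_gb(r, radius_i, radius_j):
    """Still-type interpolation between the Born (r -> 0) and Coulomb (r -> inf) limits."""
    rr = radius_i * radius_j
    return math.sqrt(r * r + rr * math.exp(-r * r / (4.0 * rr)))


def gb_solvation_energy(coords, charges, born_radii, eps_in=1.0, eps_out=78.5):
    """Electrostatic solvation free energy (kcal/mol) from the generalized Born formula.

    coords: list of (x, y, z) in Å; charges in e; born_radii in Å.
    The double sum includes i == j (self terms) and counts each pair twice,
    which the prefactor of 1/2 accounts for.
    """
    prefactor = -0.5 * COULOMB * (1.0 / eps_in - 1.0 / eps_out)
    total = 0.0
    for i in range(len(charges)):
        for j in range(len(charges)):
            r = math.dist(coords[i], coords[j])
            total += charges[i] * charges[j] / f_gb(r, born_radii[i], born_radii[j])
    return prefactor * total
```

for a single ion the expression reduces to the born equation, which is a convenient sanity check; the cost and accuracy of a real gb model are dominated by how the effective born radii are computed.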
to date, small-molecule targeting of idps has mostly focused on proteins involved in neurodegenerative diseases, such as amyloid β (aβ) peptides, α-synuclein, and tau protein, and disordered regions of cancer-associated transcription factors, such as p53, c-myc, ews-fli1, klf5, and others [16, 117]. advances in experimental and computational methods for studying disordered protein ensembles have allowed the interactions between idps and small molecules to be examined in greater detail. examples have emerged showing that small molecules could modulate the disordered ensemble itself through nonspecific and dynamic interactions and achieve specific functional effects, in complete contrast to the traditional paradigm of drug binding that emphasizes strong specific interactions. such a dynamic mode of idp-small molecule interactions is reminiscent of "fuzzy complexes" in protein-protein interactions involving idps [118] [119] [120]. the observation that small molecules could induce substantial effects through dynamic interactions is fascinating and suggests a broader and more effective strategy for targeting idps in general. one example is provided by small-molecule inhibitors of the oncoprotein c-myc that disrupt its heterodimerization with max [27, 28]. nmr, cd, and fluorescence studies initially suggested that these inhibitors bound specifically to multiple independent sites in the monomeric and disordered c-myc, which induced max-binding-incompatible conformations to disrupt the c-myc-max interaction [27]. subsequent explicit solvent simulations by michel and cuchillo in 2012 [121] revealed that the interaction between c-myc and one of the inhibitors, 10058-f4, was actually dynamic and involved many short-lived contacts. such a dynamic and nonspecific nature of the interaction could explain the observation that small modifications to the ligand had limited effects on the binding affinity [122]. a similar mode of interaction was proposed by jin et al.
in 2013 [123] for another c-myc inhibitor, 10074-a4, where atomistic simulations in explicit and implicit solvent force fields revealed the ligand to form a "ligand cloud" and interact dynamically with the disordered "protein cloud". importantly, the dynamic nature of the interaction between 10058-f4 and c-myc 402-412 can confer sequence specificity, even though the interaction involves many transient contacts instead of well-defined specific ones [124]. it has been further suggested that such dynamic interactions may be driven by entropic expansion of the idp conformational space [51]. indeed, thermodynamic analysis showed that binding of 10058-f4 to c-myc 402-412 was dominated by the entropic contribution (−20.7 ± 4.2 kj/mol out of the total binding free energy of −27.6 ± 8.5 kj/mol) [124]. the caveat, however, is that binding of hydrophobic ligands is generally associated with large entropic contributions due to the release of restricted water molecules near the hydrophobic surface. it is thus not clear if c-myc indeed undergoes entropic expansion upon 10058-f4 binding. an in-depth nmr study of the interaction of a small molecule with intrinsically disordered p27kip1 found little evidence of ligand-induced conformational space expansion [31]. instead, ligand binding was shown to mainly shift the populations of pre-existing states. dynamic interactions were also found to underlie the mechanism of α-synuclein aggregation inhibition by analogs of cyclized nordihydroguaiaretic acid (cndga) [125]. the structural basis of cndga inhibition was characterized using an array of biochemical and biophysical methods, including nmr and fluorescence correlation spectroscopy. the results revealed that cndga induced modest compaction of the conformational ensemble of monomeric α-synuclein, apparently mediated by dynamic and transient interactions with the protein and without hindering membrane association.
cndga-treated α-synuclein is resistant to aggregation even when seeded with α-synuclein aggregates. importantly, cndga was further shown to be effective in reducing α-synuclein-driven neurodegeneration in c. elegans. the observation that dynamic interactions between a small molecule and an idp could be functionally effective both in vitro and in vivo is very encouraging and supports the promise of targeting idps using dynamic interactions for therapeutics. induced compaction of the disordered conformational ensemble has also been predicted to underlie the inhibition mechanism of two drugs under clinical trials for treating alzheimer's disease [126]. the disordered ensembles of aβ42 with and without tramiprosate (homotaurine; ht) and scyllo-inositol (si) were calculated using rest2 simulations in the charmm36m force field that lasted 10 µs per replica, making them the most extensive atomistic simulations of aβ42 to date. the resulting ensembles are well converged and appear consistent with the nmr chemical shifts. comparing the ensembles with and without the ligand showed that both ht and si mainly reduced the β propensity in the c-terminal region with minimal secondary structure perturbation in the rest of the peptide. intriguingly, both ht and si were found to induce modest compaction of the conformational ensemble, particularly in the c-terminal segment that is known to be important for amyloid fibril formation (figure 3a-c). detailed analysis further revealed that the effects of both ht and si binding were achieved via dynamic and nonspecific interactions with various backbone and sidechain moieties of the peptide. it is noteworthy that the conformational modulation effects of both ht and si can be very difficult to detect at the ensemble level using bulk measurements, highlighting a critical role for reliable atomistic simulations that leverage recent advances in both protein force field quality and sampling capability.
nonetheless, additional validation is needed to support the predicted conformational shifts induced by drugs and to establish the roles of such conformational changes in the mechanisms of drug action.
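ligand-induced compaction of the kind discussed above is usually quantified with simple chain-dimension metrics averaged over the ensembles with and without the ligand. the sketch below computes the end-to-end distance (one of the coordinates used for aβ42 in figure 3) and an unweighted radius of gyration from cartesian coordinates. the function names are illustrative, masses are ignored, and each conformation is assumed to be a list of (x, y, z) positions, for example of cα atoms.

```python
import math


def end_to_end_distance(coords):
    """Distance between the first and last positions of a chain."""
    return math.dist(coords[0], coords[-1])


def radius_of_gyration(coords):
    """Unweighted radius of gyration (all positions given equal mass)."""
    n = len(coords)
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    return math.sqrt(sum(math.dist(c, center) ** 2 for c in coords) / n)


def ensemble_mean(ensemble, metric):
    """Average a per-conformation metric over an ensemble of conformations."""
    values = [metric(conf) for conf in ensemble]
    return sum(values) / len(values)
```

comparing `ensemble_mean(apo, radius_of_gyration)` with `ensemble_mean(bound, radius_of_gyration)` gives a first indication of compaction, although differences between such averages are only meaningful when both ensembles are demonstrably converged.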
figure 3. conformational ensembles of aβ42 with and without the ligands (a-c) and p53-tad with and without ligands (d,e). the conformational space of aβ42 is projected onto the number of backbone hydrogen bonds and end-to-end distance and that of p53-tad is projected onto the first two principal components. the conformational ensembles were calculated using long timescale rest2 simulations in explicit solvent (10 and 1 µs per replica for aβ42 and p53-tad, respectively). representative conformations are shown in backbone traces. this figure was adapted from [126, 127]. see [126, 127] for details on the simulation and analysis. de novo atomistic simulations have also been integrated with nmr and biophysical experiments to examine how an anticancer drug, epigallocatechin gallate (egcg), modulates the disordered unbound state of p53-tad [127].
egcg is a major active ingredient of green tea and has been reported to have anticancer effects in both animal studies and clinical trials [128] [129] [130]. the results suggested that egcg also interacted dynamically with p53-tad through numerous transient and nonspecific interactions, which appeared consistent with nmr chemical shift titration results. multiple hydrophobic and particularly aromatic sidechains contribute significantly to egcg binding. the dynamic interaction with egcg was predicted to induce significant conformational compaction of p53-tad in the n-terminal region (figure 3d,e), which appears consistent with the saxs measurements. the compaction could shield the p53-tad site required for interacting with mdm2 and thus inhibit p53 degradation to promote its anticancer activities. it is noteworthy that egcg has also been shown to be active in inhibiting the aggregation of multiple proteins, including aβ peptides and α-synuclein [131, 132]. the implication is that dynamic interactions of egcg could provide a molecular basis for promiscuous selectivity, which is a fascinating property to investigate further using a combination of experiments and simulation. recently, neira et al. screened a set of fda-approved drugs and identified 15 compounds that could bind nupr1, a multi-functional idp involved in pancreatic ductal adenocarcinoma [133, 134]. nmr chemical shift analysis suggested that nupr1 remained disordered in complex with all compounds, which is consistent with an inhibition mechanism involving transient and dynamic idp-ligand interactions. importantly, these compounds showed efficacy in cell-based assays, and the most effective of them was found to completely arrest tumor growth in a mouse model. idps have remained an extremely challenging class of proteins to target using small molecules.
albeit limited in number, successful inhibitors have been discovered and designed for several idps involved in cancers and neurodegenerative diseases, suggesting that idps are not undruggable. nonetheless, the unstructured and dynamic nature of idps is distinct from typical protein targets with well-defined binding pockets. new conceptual frameworks are required to guide the development of novel strategies for discovering and designing small molecules that can modulate idp functions. traditional structure-based screening and lead optimization strategies are clearly inadequate, even though some success has been demonstrated in deploying existing tools to identify possible binding pockets and target pre-existing structural elements. such structural elements are lightly populated and generally too small to harbor significant pockets for small-molecule binding that relies on specific interactions to achieve high affinity. in fact, there is great uncertainty about whether high-affinity binding to idps is feasible with small molecules (e.g., to meet the typical industrial standard of nanomolar or lower dissociation constants). it is encouraging that examples are emerging in which small molecules modulate idp ensembles entirely through dynamic nonspecific interactions. importantly, there is evidence that high-affinity binding may not be necessary to induce functional responses in vitro and in vivo. this may reflect a fundamental aspect of how idps mediate function in biology, in that the disordered ensemble of an idp is poised to respond sensitively to a wide array of cellular signals to support signal transduction and cellular regulation [10]. therefore, there is great potential and promise for targeting idps through dynamic interactions with small molecules. it is noteworthy that idps have been found to play central roles in mediating liquid-liquid phase separation (llps) that underlies a range of cellular processes [135] [136] [137].
how small molecules may modulate the equilibrium and properties of these biological condensates is essentially unknown at this point. however, it is conceivable that llps may also be ideally targeted using dynamic interactions with small molecules that modulate the conformational flexibility and preference of disordered regions, which in turn modify the multivalent interaction profiles and entropic contribution that affect the condensation process as well as the properties of the condensate itself. elucidating the molecular details of dynamic interactions between idps and small molecules will require further development and integration of new experimental and computational methodologies. the capability for reliable disordered ensemble characterization with and without small molecules will almost certainly be required for any future design and optimization strategy to discover drugs that target idps through dynamic interactions. this unfortunately remains a formidable task. bulk experimental measurements on average properties alone are not sufficient to uniquely define the disordered ensemble. the dynamic and transient nature of molecular contacts can be very difficult to detect and resolve experimentally [138, 139] . leveraging significant recent advances in the protein force field quality, sampling techniques, and gpu computing, de novo atomistic simulations are now poised to help meet these challenges and play a pivotal role in establishing the molecular basis of dynamic idp-small molecule interactions. author contributions: data analysis, x.l.; writing-draft and edit, j.c. (jianhan chen); writing-review and edit, j.c. (jianlin chen) and x.l.; all authors have read and agreed to the published version of the manuscript. funding: this work is partially supported by national institutes of health (gm114300 to jhc). 
targeting transcription factors in cancer-from undruggable to reality
intrinsically unstructured proteins: re-assessing the protein structure-function paradigm
intrinsically unstructured proteins
flexible nets-the roles of intrinsic disorder in protein interaction networks
intrinsic disorder in cell-signaling and cancer-associated proteins
sequence complexity of disordered protein
coupling of folding and binding for unstructured proteins
showing your id: intrinsic disorder as an id for recognition, regulation and cell signaling
intrinsically unstructured proteins and their functions
sending signals dynamically
intrinsic disorder as a mechanism to optimize allosteric coupling in proteins
ensemble allosteric model: energetic frustration within the intrinsically disordered glucocorticoid receptor
a comprehensive ensemble model for comparing the allosteric effect of ordered and disordered proteins
towards the physical basis of how intrinsic disorder mediates protein function
intrinsically disordered proteins in human diseases: introducing the d-2 concept
targeting intrinsically disordered transcription factors: changing the paradigm
intrinsically disordered side of the zika virus proteome
untapped potential of disordered proteins in current druggable human proteome
intrinsic disorder associated with 14-3-3 proteins and their partners
intrinsic disorder here, there, and everywhere, and nowhere to escape from it
disease mutations in disordered regions-exception to the rule?
atomic-level characterization of disordered protein ensembles
constructing ensembles for intrinsically disordered proteins
intrinsically disordered proteins in a physics-based world
relating sequence encoded information to form and function of intrinsically disordered proteins
molecular recognition features (morfs) in three domains of life
multiple independent binding sites for small-molecule inhibitors on the oncoprotein c-myc
discovery of novel myc-max heterodimer disruptors with a three-dimensional pharmacophore model
targeting the disordered c terminus of ptp1b with an allosteric inhibitor
sci rep
a small molecule causes a population shift in the conformational landscape of an intrinsically disordered protein
binding cavities and druggability of intrinsically disordered proteins
presmo target-binding signatures in intrinsically disordered proteins
identification of small-molecule binding pockets in the soluble monomeric form of the a beta 42 peptide
identification of small molecule inhibitors of tau aggregation by targeting monomeric tau as a potential therapeutic approach for tauopathies
a fragment-based method of creating small-molecule libraries to target the aggregation of intrinsically disordered proteins
structure-based inhibitor design for the intrinsically disordered protein c-myc
a moving target: structure and disorder in pursuit of myc inhibitors
conservation of potentially druggable cavities in intrinsically disordered proteins
epi-001, a compound active against castration-resistant prostate cancer, targets transactivation unit 5 of the androgen receptor
targeting the intrinsically disordered structural ensemble of alpha-synuclein by small molecules as a potential therapeutic strategy for parkinson's disease
intrinsic disorder within akap79 fine-tunes anchored phosphatase activity toward substrates and drug sensitivity
proteus: a random forest classifier to predict disorder-to-order transitioning binding regions in intrinsically disordered proteins
structural ensemble modulation upon small-molecule binding to disordered proteins
targeting intrinsically disordered proteins at the edge of chaos
intrinsically disordered proteins: from sequence and conformational properties toward drug discovery
drugs for 'protein clouds': targeting intrinsically disordered transcription factors
intrinsically disordered proteins are potential drug targets
intrinsically disordered proteins and novel strategies for drug discovery
how to design a drug for the disordered proteins
targeting disordered proteins with small molecules using entropy
druggability of intrinsically disordered proteins
targeting protein-protein interactions (ppis) of transcription factors: challenges of intrinsically disordered proteins (idps) and regions (idrs)
targeting intrinsically disordered proteins in rational drug discovery
recent insights into the development of therapeutics against coronavirus diseases by targeting n protein
pe-db: a database of structural ensembles of intrinsically disordered and of unfolded proteins
intrinsically disordered proteins in cellular signalling and regulation
intrinsically disordered proteins studied by nmr spectroscopy
structural interpretation of paramagnetic relaxation enhancement-derived distances for disordered protein states
principles of protein structural ensemble determination
a critical assessment of methods to recover information from averaged data
improved structural characterizations of the drkn sh3 domain unfolded state suggest a compact ensemble with native-like and non-native structure
nmr characterization of long-range order in intrinsically disordered proteins
structure of tumor suppressor p53 and its intrinsically disordered n-terminal transactivation domain
recovering a representative conformational ensemble from underdetermined macromolecular structural data
atomistic ensemble modeling and small-angle neutron scattering of intrinsically disordered protein complexes: applied to minichromosome maintenance protein
atomistic modelling of scattering data in the collaborative computational project for small angle scattering (ccp-sas)
combined monte carlo/torsion-angle molecular dynamics for ensemble modeling of proteins, nucleic acids and carbohydrates
conformational propensities of intrinsically disordered proteins from nmr chemical shifts
modeling intrinsically disordered proteins with bayesian statistics
conformational space of flexible biological macromolecules from average data
simulations of disordered proteins and systems with conformational heterogeneity
a preformed binding interface in the unbound ensemble of an intrinsically disordered protein: evidence from molecular simulations
net charge per residue modulates conformational ensembles of intrinsically disordered proteins
atomistic details of the disordered states of kid and pkid. implications in coupled binding and folding
residual structures, conformational fluctuations, and electrostatic interactions in the synergistic folding of two intrinsically disordered proteins
the biomolecular simulation program
openmm 4: a reusable, extensible, hardware independent library for high performance molecular simulation
high performance molecular simulations through multi-level parallelism from laptops to supercomputers
scalable molecular dynamics with namd
routine microsecond molecular dynamics simulations with amber on gpus. 1. generalized born
accelerate sampling in atomistic energy landscapes using topology-based coarse-grained models
scalable free energy calculation of proteins via multiscale essential sampling
replica-exchange molecular dynamics method for protein folding
replica exchange with solute tempering: a method for sampling biological systems in explicit water
hamiltonian switch metropolis monte carlo simulations for improved conformational sampling of intrinsically disordered regions tethered to ordered domains of proteins
a hybrid md-kmc algorithm for folding proteins in explicit solvent
enhanced sampling and applications in protein folding in explicit solvent
practically efficient and robust free energy calculations: double-integration orthogonal space tempering
replica exchange with solute scaling: a more efficient version of replica exchange with solute tempering (rest2)
developing a molecular dynamics force field for both folded and disordered protein states
charmm36m: an improved force field for folded and intrinsically disordered proteins
improved peptide and protein torsional energetics with the opls-aa force field
water dispersion interactions strongly influence simulated structural properties of disordered protein states
optimization of the gbmv2 implicit solvent force field for accurate simulation of protein conformational equilibria
optimizing solute-water van der waals interactions to reproduce solvation free energies
improvements to the absinth force field for proteins based on experimentally derived amino acid specific backbone conformational statistics
residual structures and transient long-range interactions of p53 transactivation domain: assessment of explicit solvent protein force fields
comparison of multiple amber force fields and development of improved protein backbone parameters
improved side-chain torsion potentials for the amber ff99sb protein force field
optimizing protein-solvent force fields to reproduce intrinsic conformational preferences of
model peptides balanced protein-water interactions improve properties of disordered proteins and non-specific protein association improving the accuracy of protein side chain and backbone parameters from ff99sb balanced amino-acid-specific molecular dynamics force field for the realistic simulation of both folded and disordered proteins how robust are protein folding simulations with respect to force field parameterization? optimization of the additive charmm all-atom protein force field targeting improved sampling of the backbone φ, ψ and side-chain χ1 and χ2 dihedral angles extending the treatment of backbone energetics in protein force fields: limitations of gas-phase quantum mechanics in reproducing protein conformational distributions in molecular dynamics simulations computational and theoretical advances in studies of intrinsically disordered proteins modulation of the disordered conformational ensembles of the p53 transactivation domain by cancer-associated mutations tackling exascale software challenges in molecular dynamics simulations with gromacs plumed 2: new feathers for an old bird on easy implementation of a variant of the replica exchange with solute tempering in gromacs hamiltonian replica exchange in gromacs: a flexible implementation balancing solvation and intramolecular interactions: toward a consistent generalized born force field accelerating the generalized born with molecular volume and solvent accessible surface area implicit solvent model using graphics processing units targeting the intrinsically disordered proteome using small-molecule ligands fuzzy complexes: polymorphism and structural disorder in protein-protein interactions potential conformational heterogeneity of p53 bound to s100b(betabeta) dynamic multivalent interactions of intrinsically disordered proteins the impact of small molecule binding on the energy landscape of the intrinsically disordered protein c-myc improved low molecular weight myc-max inhibitors ligand clouds 
around protein clouds: a scenario of ligand binding with intrinsically disordered proteins sequence specificity in the entropy-driven binding of a small molecule and a disordered peptide cyclized ndga modifies dynamic alpha-synuclein monomers preventing aggregation and toxicity modulation of amyloid-beta42 conformation by small molecules through nonspecific binding modulation of p53 transactivation domain conformations by ligand binding and cancer-associated mutations epigallocatechin gallate (egcg) is the most effective cancer chemopreventive polyphenol in green tea primary cancer prevention by green tea, and tertiary cancer prevention by the combination of green tea catechins and anticancer compounds green tea extracts for the prevention of metachronous colorectal polyps among patients who underwent endoscopic removal of colorectal adenomas: a randomized clinical trial egcg remodels mature alpha-synuclein and amyloid-beta fibrils and reduces cellular toxicity green tea epigallocatechin-3-gallate (egcg) reduces beta-amyloid mediated cognitive impairment and modulates tau pathology in alzheimer transgenic mice designing and repurposing drugs to target intrinsically disordered proteins for cancer treatment: using nupr1 as a paradigm identification of a drug targeting an intrinsically disordered protein involved in pancreatic adenocarcinoma considerations and challenges in studying liquid-liquid phase separation and biomolecular condensates polymer physics of intracellular phase transitions methods of probing the interactions between small molecules and disordered proteins characterization of the binding of small molecules to intrinsically disordered proteins this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license we thank d. e. shaw research for giving us access to the md trajectories of aβ40. the authors thank chungwen liang for helpful discussions. 
we would also like to thank the anonymous reviewers for their careful reading and insightful suggestions that have greatly improved both the content and writing of this review. the authors declare no conflict of interest. key: cord-255371-o9oxchq6 authors: nguyen, thanh thi; pathirana, pubudu n.; nguyen, thin; nguyen, henry; bhatti, asim; nguyen, dinh c.; nguyen, dung tien; nguyen, ngoc duy; creighton, douglas; abdelrazek, mohamed title: genomic mutations and changes in protein secondary structure and solvent accessibility of sars-cov-2 (covid-19 virus) date: 2020-07-10 journal: biorxiv doi: 10.1101/2020.07.10.171769 sha: doc_id: 255371 cord_uid: o9oxchq6 severe acute respiratory syndrome coronavirus 2 (sars-cov-2) is a highly pathogenic virus that has caused the global covid-19 pandemic. tracing the evolution and transmission of the virus is crucial to respond to and control the pandemic through appropriate intervention strategies. this paper reports and analyses genomic mutations in the coding regions of sars-cov-2 and their probable protein secondary structure and solvent accessibility changes, which are predicted using deep learning models. prediction results suggest that mutation d614g in the virus spike protein, which has attracted much attention from researchers, is unlikely to change protein secondary structure or relative solvent accessibility. based on 6,324 viral genome sequences, we create a spreadsheet dataset of point mutations that can facilitate the investigation of sars-cov-2 from many perspectives, especially in tracing the evolution and worldwide spread of the virus. our analysis results also show that coding genes e, m, orf6, orf7a, orf7b and orf10 are the most stable and are potentially suitable targets for vaccine and drug development. biological investigations of the novel coronavirus sars-cov-2 are important to understand the virus and help to propose appropriate responses to the pandemic.
scientists have been able to obtain genomic sequences of sars-cov-2 and have started analysis of these data. the reference genome of sars-cov-2 deposited in the national center for biotechnology information (ncbi) genbank sequence database (isolate wuhan-hu-1, accession number nc_045512) shows that sars-cov-2 is an rna virus with a genome length of 29,903 nucleotides. comparative genomic analysis results obtained in [1] [2] [3] suggest that the covid-19 virus may have originated in bats. other studies show that pangolins may have served as the hosts for the virus [4; 5]. andersen et al. [6] furthermore believe that sars-cov-2 is not a purposefully manipulated virus or constructed in a laboratory but has a natural origin. a study in [7] using machine learning unsupervised clustering methods corroborates previous findings that sars-cov-2 belongs to the sarbecovirus subgenus of the betacoronavirus genus within the coronaviridae family [8; 9]. the whole genome analysis results also indicate that bats are more likely the reservoir hosts for the virus than pangolins. another study in [10] demonstrates that sars-cov-2 may have resulted from a recombination of a pangolin coronavirus and a bat coronavirus, and pangolins may have acted as an intermediate host for the virus. since the first cases were detected, the covid-19 virus has spread to almost every country in the world and has been linked to the deaths of more than 404,000 people among over 7 million confirmed cases [11]. tracing the evolution and spread of the virus is important for developing vaccines and drugs as well as proposing appropriate intervention strategies. monitoring and analysing the viral genome mutations can be helpful for this task. due to strong immunological pressure in humans, the virus may have mutated over time to circumvent responses of the human immune system. this leads to the creation of virus variants with possibly different virulence, infectivity, and transmissibility [25].
this paper reports all point mutations occurring so far in sars-cov-2 and presents illustrative implications obtained from the analysis of these mutation pattern data. four types of mutations, namely synonymous, nonsynonymous, insertion and deletion, are detected. we use 6,324 sars-cov-2 genome sequences collected in 45 countries and deposited to the ncbi genbank so far and create a spreadsheet dataset of all mutations that occurred across different genes. eleven protein coding genes of sars-cov-2 have been identified, namely orf1ab, spike (s), orf3a, envelope (e), membrane (m), orf6, orf7a, orf7b, orf8, nucleocapsid (n) and orf10. the order of these genes and their corresponding lengths are illustrated in fig. 1. the genes s, e, m, and n produce structural proteins that play important roles in the virus functions. for example, the receptor-binding domain (rbd) region of the s protein can bind to a receptor of a host cell, e.g. the human and bat angiotensin-converting enzyme 2 (ace2) receptor, enabling the entrance of the virus into the cell [12]. predictions of protein structures may help understand the virus's functions and thus contribute to developing vaccines and therapeutics against the virus. in this paper, to evaluate the possible impacts of genomic mutations on the virus functions, we propose the use of the sspro/accpro 5 methods to predict protein secondary structure and relative solvent accessibility [13]. these predictors were built using deep learning one-dimensional bidirectional recurrent neural networks incorporated in the scratch-1d software suite (version 1.2, 2018) [14]. by comparing the prediction results obtained on the reference genome and mutated genomes, we are able to assess whether the detected mutations have the potential to change the protein structure and solvent accessibility, and thus lead to possible changes of the virus characteristics.
because of the functional importance of structural proteins, we only report the prediction results of these proteins in this study. the next section reviews related works in the literature. we then present materials and methods for sars-cov-2 mutation detection, and protein secondary structure and solvent accessibility prediction. next we summarize statistics of sars-cov-2 mutations so far and implications of these mutations. details of mutations in nonstructural orf genes and structural s, e, m and n genes are presented after that. a full sars-cov-2 mutation spreadsheet report is provided in the supplemental information. since the first genomes were collected in december 2019, there have been many findings on the mutations of sars-cov-2. for example, phan [15] analysed 86 genomes of sars-cov-2 downloaded from the global initiative on sharing all influenza data (gisaid) database (https://www.gisaid.org/) and found 93 mutations over the entire viral genome sequences. among them, there are three mutations occurring in the rbd region of the spike surface glycoprotein s, namely n354d, d364y and v367f, with the numbers showing amino acid (aa) positions in the protein. that study also reveals three deletions in the genomes of sars-cov-2 obtained from japan, usa and australia. two of these deletion mutations are in the orf1ab polyprotein and one deletion occurs in the 3' end of the genome. likewise, a study in [16] shows that the sars-cov-2 genomes may have undergone recurrent, independent mutations at 198 sites, 80% of which are of the nonsynonymous type. tang et al. [17] investigated 103 genomes of covid-19 patients and discovered mutations in 149 sites of these genomes. the study also shows that the spike gene s consistently has larger ds values (synonymous substitutions per synonymous site) than other genes. in addition, two major lineages of the virus, denoted as l and s, have been specified based on two tightly linked snps.
the l lineage is found to be more prevalent than the s lineage among the examined sequences. korber et al. [18] tracked the mutations of spike protein s of sars-cov-2 because it plays an important role in mediating infection of targeted cells and is the focus of vaccine and antibody therapy development efforts [19]. they detected 14 mutations in the spike protein that are growing in frequency, especially the mutation d614g, which rapidly becomes the dominant form when it spreads to a new geographical region. likewise, hashimi [20] analysed the mutation frequency in the spike protein s of 796 sars-cov-2 genomes downloaded from the gisaid and genbank databases. the study found 64 mutations occurring in the s protein sequences obtained from multiple countries. it suggests that the virus is spreading in two forms: the d614 form (residue d at position 614 in the s protein) accounts for 68.5% of the examined isolates while the g614 form accounts for 31.5%. koyama et al. [21] on the other hand found several variants of sars-cov-2 that may cause drifts and escape from immune recognition by using the prediction results of b-cell and t-cell epitopes in [22]. in particular, the mutation d614g occurring in the spike protein is found to be prevalent in the european population. this mutation may have caused antigenic drift, resulting in vaccine mismatches that lead to a high mortality rate in this population. a recent situation report [23] by nextstrain [24] on genomic epidemiology of novel coronavirus using 5,193 publicly shared covid-19 genomes shows that sars-cov-2 on average accumulates changes at a rate of 24 substitutions per year. this is approximately equivalent to 1 mutation per 1,000 bases in a year. this evolutionary rate of sars-cov-2 is typical for a coronavirus, and it is smaller than that of influenza (average 2 mutations per 1,000 bases per year) and hiv (average 4 mutations per 1,000 bases per year). shen et al.
[25] conducted metatranscriptome sequencing for bronchoalveolar lavage fluid samples obtained from 8 patients with covid-19. they found no evidence for the transmission of intrahost variants, but did find a high evolution rate of the virus, with the number of intrahost variants ranging from 0 to 51 around a median of 4. pachetti et al. [26] examined 220 genomic sequences of covid-19 patients from the gisaid database and discovered 8 novel recurrent mutations at nucleotide locations 1397, 2891, 14408, 17746, 17857, 18060, 23403 and 28881. mutations at locations 2891, 3036, 14408, 23403 and 28881 are mostly found in europe while those at locations 17746, 17857 and 18060 occur in sequences obtained from patients in north america. likewise, a study in [27] on 95 sars-cov-2 complete genome sequences discovered 116 mutations. among them, the mutations c8782t in the orf1ab gene, t28144c in the orf8 gene and c29095t in the n gene are common. we use 6,324 sequence records downloaded from the ncbi genbank database on 2020-06-17. the latest collection date for the samples from which the sequences were derived was 2020-06-05. the data, which were collected in 45 countries, include both nucleotide sequences and protein translations of coding genes. a proportion of the 6,324 records have sequences of only a few proteins, i.e. these records do not annotate all 11 proteins (orf1ab, orf3a, orf6, orf7a, orf7b, orf8, orf10, s, e, m and n). the number of available sequences thus differs from one protein to another (see column "avai num" in table 1). genome sequences that do not specify a country and aa sequences that contain the letter "x", representing an unknown aa, are excluded from our calculations. we use the genome obtained from the isolate wuhan-hu-1, accession number nc_045512, as the reference genome.
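the two exclusion rules above can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the authors' code: the `Record` container, its field names and the toy sequences are hypothetical, and a record is assumed to carry an empty country string when genbank gives no country.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical container for one protein record (not a GenBank or Biopython type)."""
    accession: str
    country: str   # assumed to be "" when the record does not specify a country
    protein: str   # amino-acid sequence of one coding gene


def keep_record(rec: Record) -> bool:
    """Return True if the record passes both exclusion rules described in the text."""
    has_country = bool(rec.country.strip())
    has_unknown_aa = "X" in rec.protein.upper()  # "X" marks an unknown amino acid
    return has_country and not has_unknown_aa


# Toy records, invented for illustration only.
records = [
    Record("r1", "usa", "MKTAYIAK"),
    Record("r2", "", "MKTAYIAK"),        # no country: excluded
    Record("r3", "india", "MKTXYIAK"),   # unknown residue: excluded
]
kept = [r.accession for r in records if keep_record(r)]
```

in this toy input only the first record survives both filters.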
for mutation detection, we apply a dynamic programming algorithm to protein aa sequences to get global pairwise alignments between a reference sequence and a query sequence. specifically, we use the python bio.pairwise2.align.globalms function (https://biopython.org/docs/dev/api/bio.pairwise2.html) where a match is given 2 points, a mismatch is deducted 0.5 points, 2 points are deducted when opening a gap, and 1 point is deducted when extending it. gaps are then inserted into the nucleotide sequences according to the resulting protein sequence alignments. using the resulting pairwise alignments, we are able to compare query sequences with the reference sequence at each position and identify locations of insertion, deletion, synonymous and nonsynonymous mutations. virus protein structure plays a key role in its functions and a change in structural shape may affect its functions, virulence, infectivity and transmissibility, possibly resulting in non-functional proteins. protein secondary structure is defined by hydrogen bonding patterns, which make an intermediate form before the protein folds into a three-dimensional shape composing its tertiary structure. the eight types of protein secondary structure defined by the dictionary of protein secondary structure (dssp) are 3₁₀ helix (g), α helix (h), π helix (i), hydrogen bonded turn (t), extended strand in parallel and/or anti-parallel β-sheet conformation (e), residue in isolated β-bridge (b), bend (s) and coil (c). the dssp tool assigns every residue to one of the eight possible states. these 8 conformational states can be collapsed into 3 states: h = {h, g, i}, e = {e, b} and c = {s, t, c} [28]. the protein secondary structure represents interactions between neighboring or nearby aas as its functional three-dimensional shape is created through the polypeptide folding.
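the column-by-column comparison used for mutation calling can be sketched as follows. this is an illustrative sketch only: it assumes the global alignment has already been computed and that gaps have been propagated to the codon level, so no alignment library is called here. the function name, the `---` gap codon and the way insertion positions are numbered (reference position plus the length of the current insertion run) are assumptions made for this example; the label format imitates the notation used in this paper (for instance `d614g` for a substitution and `l77-` for a deletion).

```python
def classify_mutations(ref_aa, qry_aa, ref_codons, qry_codons):
    """Classify each alignment column as insertion, deletion,
    nonsynonymous or synonymous relative to the reference.

    ref_aa / qry_aa: gapped protein strings of equal length ('-' is a gap).
    ref_codons / qry_codons: one codon per column ('---' at gap columns).
    Returns a list of (kind, label) tuples.
    """
    if not (len(ref_aa) == len(qry_aa) == len(ref_codons) == len(qry_codons)):
        raise ValueError("all four alignments must have the same number of columns")

    calls = []
    ref_pos = 0   # 1-based position in the ungapped reference protein
    ins_run = 0   # length of the current run of inserted residues
    for r, q, rc, qc in zip(ref_aa, qry_aa, ref_codons, qry_codons):
        if r == "-":
            # Column exists only in the query: an insertion.
            ins_run += 1
            if q != "-":
                calls.append(("insertion", f"-{ref_pos + ins_run}{q}"))
            continue
        ref_pos += 1
        ins_run = 0
        if q == "-":
            calls.append(("deletion", f"{r}{ref_pos}-"))
        elif r != q:
            calls.append(("nonsynonymous", f"{r}{ref_pos}{q}"))
        elif rc != qc:
            # Same amino acid encoded by a different codon.
            calls.append(("synonymous", f"{r}{ref_pos}{q}"))
    return calls


# Toy alignment, invented for illustration.
calls = classify_mutations(
    "MKDLA-",
    "MKG-AR",
    ["ATG", "AAA", "GAT", "CTT", "GCT", "---"],
    ["ATG", "AAG", "GGT", "---", "GCT", "CGT"],
)
```

for the toy input, the four calls are a synonymous change at position 2, a nonsynonymous change at position 3, a deletion at position 4 and an insertion after the last reference residue.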
we thus determine a change in protein secondary structure if any change happens in the structures of the mutated aa and its 10 neighboring aas compared to those of the reference sequence. in detail, we consider 5 aas ahead of and 5 aas behind the mutated aa. the same approach is applied when considering a change of the protein relative solvent accessibility. solvent-accessible surface area is the surface area of a biomolecule that is accessible to a solvent. accordingly, a residue is considered exposed, denoted as the "e" state, if at least 25% of its surface is exposed. otherwise, the residue is classified as buried, i.e. the "b" state. there have been various protein secondary structure prediction programs in the literature and many of those were developed based on artificial intelligence models using protein aa sequences such as jpred4 [29], spider2 [30], porter 5 [31], raptorx [32], psspred [33], yasspp [34] and sspro [13]. in this paper, we use the protein secondary structure and relative solvent accessibility prediction methods sspro/accpro 5 [13] within the scratch-1d software suite (release 1.2, 2018) [14]. these predictors were built using bidirectional recurrent neural networks and a combination of the sequence similarity and sequence-based structural similarity to sequences in the protein data bank [35]. prediction results of 8-class structure (sspro8 predictor) and 25%-threshold relative solvent accessibility (accpro predictor) are used for statistics on protein secondary structure and accessibility changes. we however also report in the spreadsheet supplemental information prediction results of 3-class structure (sspro predictor) and relative solvent accessibility on 20 thresholds, ranging from 0% to 95% with a 5% step (the accpro20 predictor within the scratch-1d software). table 1 summarizes statistics of sars-cov-2 mutations so far.
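the 8-to-3 state reduction, the 11-residue window rule and the 25% exposure threshold described above can be sketched together. this is a hedged illustration, not the scratch-1d implementation: the function names are hypothetical, the per-residue state strings are assumed to come from an external predictor, and the sketch assumes substitution-only comparisons in which reference and query strings have the same length.

```python
# DSSP 8-state to 3-state reduction: H = {H, G, I}, E = {E, B}, C = {S, T, C}.
DSSP8_TO_3 = {
    "H": "H", "G": "H", "I": "H",
    "E": "E", "B": "E",
    "S": "C", "T": "C", "C": "C",
}


def reduce_to_3_state(ss8: str) -> str:
    """Map a string of 8-state DSSP letters to the reduced 3-state alphabet."""
    return "".join(DSSP8_TO_3[s] for s in ss8)


def window(states: str, pos: int, flank: int = 5) -> str:
    """Return the states of the residue at 1-based `pos` plus up to `flank`
    residues on each side (truncated at the sequence ends)."""
    start = max(0, pos - 1 - flank)
    return states[start:pos + flank]


def has_change(ref_states: str, qry_states: str, pos: int, flank: int = 5) -> bool:
    """True if any predicted state differs inside the window around `pos`."""
    return window(ref_states, pos, flank) != window(qry_states, pos, flank)


def exposure_state(rsa_percent: float, threshold: float = 25.0) -> str:
    """'e' (exposed) if relative solvent accessibility is at least the
    threshold, otherwise 'b' (buried)."""
    return "e" if rsa_percent >= threshold else "b"


# Toy predictions, invented for illustration.
ref_ss = "CCCCHHHHHHHCCCC"
qry_ss = "CCCCHHHHHHCCCCC"
changed_near_8 = has_change(ref_ss, qry_ss, pos=8)   # differing residue lies inside the window
changed_near_2 = has_change(ref_ss, qry_ss, pos=2)   # differing residue lies outside the window
```

the same `has_change` function applies unchanged to strings of "e"/"b" accessibility states.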
"aa length" indicates the length of the protein aa sequence derived from the sars-cov-2 reference genome. "avai num" denotes the number of records among 6,324 ncbi genbank records that have the complete sequence of the corresponding protein. "no mu" refers to the number of sequences that do not have any mutations compared to the reference sequence. "delete" means the number of deletion mutations occurring in the aa sequences of the protein. this number may be larger than the number of sequences having deletion mutations because an aa sequence may have more than one deletion. likewise, "insert", "nonsyn" and "syn" show the number of insertion, nonsynonymous and synonymous mutations occurring in the protein aa sequences. "nonsyn/syn" is the ratio of the number of nonsynonymous mutations to the number of synonymous mutations. "struct change" means the number of nonsynonymous mutations that have protein secondary structure change potential based on the sspro8 predictor of the scratch-1d software. similarly, "acc change" refers to the number of nonsynonymous mutations that have potential to change the protein relative solvent accessibility based on the accpro predictor of the scratch-1d software. insertion and deletion mutations alter protein secondary structure and solvent accessibility by default, so they are not included in the structure and solvent accessibility change statistics. table 1 shows that the orf3a and orf8 proteins have a number of nonsynonymous mutations significantly larger than that of the synonymous mutations. in contrast, this ratio in proteins e, m, orf7b and orf10 is very small (less than 1). these proteins could be targeted for vaccine and drug development as they have fewer variations than other proteins. these findings are supported by results presented in figs (fig. 3) , entire regions before and after the spike at position 614 are almost unchanged. fig. 4 presents variations of multiple proteins.
in addition to proteins e, m, orf7b and orf10, we find that proteins orf6 and orf7a are also relatively stable without a large number of variations at any particular locations. protein n has 1,927 nonsynonymous mutations and 1,678 of them are likely to make changes in protein secondary structure, a proportion of 87.08%. this is considerably larger than those of protein s (4.42%), protein m (24.79%) and protein e (64.29%). the number of solvent accessibility changes of protein s is larger than the number of its structure changes: 184 vs 164. the opposite holds for the other structural proteins: e (7 vs 18), m (8 vs 30) and n (37 vs 1,678). the orf1ab polyprotein has 7,096 aas. among 6,324 records deposited to the ncbi genbank database, only 3,726 genomes have the complete coding sequence (cds) of protein orf1ab, with 1,024 unique aa sequences. this is quite a large number compared to other proteins but understandable because orf1ab is the longest protein of sars-cov-2 and thus has a large number table 2 . table 2 ). the genbank accession numbers are presented on the left while isolate names and collected dates are on the right. the numbers on top show the positions of aas in the protein and isolates are ordered by collected dates. the first isolate having these deletions is usa-ca6/2020 (record mt044258 in second row), collected on 2020-01-27 in usa: ca. this is also the isolate having the largest number of deletions: five sequentially at g82-, h83-, v84-, m85-, v86- and three at k141-, s142-, f143-. the patients of the later isolates were possibly infected by this first case but more data such as travel history are needed to confirm this hypothesis. (18) germany (16) taiwan (9) the orf3a protein has 275 aas with its complete cds appearing in 5,527 isolates (146 unique aa sequences). among these, 2,321 sequences have no mutation or only synonymous mutations, and 3,206 sequences have insertion, deletion or nonsynonymous mutations. table 4 .
notably, the mutation q57h occurs in 2,795 sequences collected in many countries. this is an emerging and active mutation, which requires further investigation as the latest case of this mutation was on 2020-06-05, the same as the latest collection date of the entire downloaded dataset. the mutation g251v occurring in 206 sequences is also a prevalent mutation in the orf3a protein. the orf6 protein has 61 aas, appearing in 5,792 isolates with 25 unique aa sequences. among these, 5,719 sequences have no mutation or only synonymous mutations and 73 sequences have insertion, deletion or nonsynonymous mutations. two insertion mutations occur in record mt520188 at positions -62r and -63t (end of the sequence). the same nine consecutive deletions occur in 2 sequences: mt547814 (collected in hong kong on 2020-01-22 from an adult male patient [37]) and mt609561 (usa: virginia in 2020-04). these deletions are f22-, k23-, v24-, s25-, i26-, w27-, n28-, l29- and d30-. alignment of these sequences with the reference genome is displayed in fig. 6. the isolate mt547814 thus may have transmitted the virus to mt609561 but this implication needs to be corroborated by patients' travel history. there are 23 distinct nonsynonymous mutations and those occurring in 2 or more sequences are presented in table 5. the orf7a protein has 121 aas in length, found in 5,321 isolates with 34 unique aa sequences. among these, 5,215 sequences have no mutations or only synonymous mutations, while the rest orf7a. there are 15 deletion mutations occurring in 2 records: mt520425 (collected in usa: massachusetts on 2020-03-27) and mt507795 (usa on 2020-04-06). the mt520425 sequence has 1 deletion at position l77- while the mt507795 sequence has 14 sequential deletions f63-, a64-, f65-, a66-, c67-, p68-, d69-, g70-, v71-, k72-, h73-, v74-, y75- and q76-. alignment of these sequences with that of the reference genome is shown in fig. 7.
there are 32 distinct nonsynonymous mutations; those occurring in 2 or more sequences are reported in table 6. the orf7b protein has 43 aas with its complete cds appearing in 5,175 isolates, forming a set of 11 unique aa sequences. there are 5,151 sequences having no mutations or only synonymous mutations and 24 sequences having nonsynonymous mutations. no insertion or deletion mutations are found in gene orf7b. this, along with a small number of nonsynonymous mutations, indicates that orf7b is a stable gene. distinct nonsynonymous mutations (10 of them) include f19l, f28y, f30l, s31l, l32f, t40i, c41f, c41s, h42y and a43t. a summary of nonsynonymous mutations in gene orf7b occurring in 2 or more sequences is shown in table 7. the orf10 protein has 38 aas in length, appearing in 5,891 isolates with only 9 unique aa sequences. among them, 5,872 sequences have no mutation or only synonymous mutations and the remaining 19 sequences have nonsynonymous mutations. no insertion or deletion mutations are found in gene orf10. similar to orf7b, this is a stable gene. there are 8 distinct nonsynonymous mutations, including i4l, a8v, s23f, r24l, r24c, a28v, d31y and v33i. those occurring in 2 sequences or more are presented in table 9 . the virus transmission may have happened between these two isolates but this needs further investigation. alignment of these sequences is shown in fig. 9. the number of nonsynonymous mutations in gene s is 3,711, with 240 distinct mutations. mutations that occur in 10 or more cases are reported in table 10. the number of synonymous mutations is 670, giving a ratio of nonsynonymous to synonymous mutations of 5.54. among the nonsynonymous mutations, mutation d614g is extremely common as it happens in 3,089 sequences, mostly collected in usa (2,340), india (210) and australia (132).
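the ratios and percentages quoted for the structural proteins follow directly from the reported counts, as the short arithmetic check below shows. the helper names are hypothetical and only counts stated in the text are used; this is a worked check of the arithmetic, not part of the authors' pipeline.

```python
def nonsyn_syn_ratio(nonsyn: int, syn: int) -> float:
    """Ratio of nonsynonymous to synonymous mutation counts."""
    if syn == 0:
        raise ValueError("ratio is undefined when there are no synonymous mutations")
    return nonsyn / syn


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, expressed as a percentage."""
    return 100.0 * part / whole


# Gene s: 3,711 nonsynonymous and 670 synonymous mutations.
gene_s_ratio = round(nonsyn_syn_ratio(3711, 670), 2)

# Share of nonsynonymous mutations with predicted secondary structure change.
protein_n_struct_pct = round(percent(1678, 1927), 2)   # protein n
protein_s_struct_pct = round(percent(164, 3711), 2)    # protein s
```

these evaluate to 5.54, 87.08 and 4.42 respectively, matching the figures reported in the text.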
the first collected date of the d614g mutation cannot be identified precisely because some sequences deposited to the ncbi genbank did not record the full date details. the current data show that one of the following sequences, all of which have the d614g mutation, was collected first: mt326173 in usa in 2020, or mt270104, mt270105, mt270108 and mt270109 all in germany: bavaria in 2020-01, or mt503006 in thailand on 2020-01-04. it is however important to note that the first patient having the d614g mutation and his/her location may never be known because the genome of that patient might not have been sequenced and reported. therefore, the information reported here can support further investigation. our statistics show that among 4,434 sequences of the s protein, 3,089 sequences have the mutation d614g, accounting for 69.67%. this proportion has considerably increased compared to 31.5% in the previous analysis in [20] on a dataset downloaded on 2020-03-22. on the other hand, there are 37 a829t mutations that all occur in thailand. the first case of this mutation was collected on 2020-01-23 and its latest case was on 2020-04-07. this may indicate that the first case probably transmitted the virus to the other cases having the same mutation a829t in thailand. similarly, mutations h146y (24 cases), v483a (11 cases), e554d (14 cases), p681l (16 cases) and s939f (11 cases) all occur only in usa, and mutation l8v (4 cases) occurs only in hong kong (refer to the attached spreadsheet). the "latest date" in tables 2-15 may be used to infer which mutations are inactive or still active. for example, in gene s (table 10), the latest date of d614g was 2020-06-05 (the same as the latest collection date of the entire dataset), which indicates that this mutation is still active. the latest date of p681l was 2020-04-03, indicating that this mutation may no longer occur.
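the "latest date" reasoning above can be made explicit with a small date comparison. the text gives no numerical cutoff for calling a mutation inactive, so the 30-day gap used below is purely a hypothetical choice for illustration, as is the function name.

```python
from datetime import date


def is_still_active(latest_seen: str, dataset_latest: str, max_gap_days: int = 30) -> bool:
    """Treat a mutation as still active if its latest collection date lies
    within `max_gap_days` of the latest collection date in the dataset.
    The 30-day default is a hypothetical cutoff, not a value from the paper."""
    gap = (date.fromisoformat(dataset_latest) - date.fromisoformat(latest_seen)).days
    return gap <= max_gap_days


DATASET_LATEST = "2020-06-05"   # latest collection date of the downloaded dataset

d614g_active = is_still_active("2020-06-05", DATASET_LATEST)   # last seen on the final day
p681l_active = is_still_active("2020-04-03", DATASET_LATEST)   # last seen 63 days earlier
```

with this cutoff d614g is flagged as active and p681l as inactive, in line with the interpretation given in the text; a different cutoff could change the call for mutations whose latest date falls between the two.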
this kind of information may be useful for further research on vaccine and drug development, as ongoing changes in the viral proteins need to be monitored and addressed. we identify the rbd region as the residue range arg319-phe541 of protein s, based on a study in [36]. in the rbd region only, the number of nonsynonymous mutations is 53 and that of synonymous mutations is 46, giving a ratio of 1.15. this is much smaller than the ratio of 5.54 for the entire gene s, suggesting that the rbd region may have been optimized for binding to a receptor of a host cell. this is complemented by fig. 9, which shows that all deletion mutations in gene s lie outside the rbd region. note that the difference between these ratios is partly due to the large number of d614g mutations (3,089), which lie outside the rbd region. table 11 summarizes nonsynonymous mutations in the rbd region occurring in 2 or more sequences. a notable mutation in this region is v483a, occurring in 11 isolates all collected in usa. the first and latest collection dates of these isolates were 2020-03-05 and 2020-04-05 respectively, suggesting that the first isolate may have spread to others having the same mutation v483a. likewise, the mutation g476s occurs in 6 isolates all collected in usa: wa from 2020-03-10 to 2020-03-25. in contrast, the mutation y453f occurs in 5 sequences all in netherlands, but the first collection date was 2020-04-25 and the latest collection date was 2020-04-29. these dates are very close together, suggesting that all the reported y453f cases may have been infected from another case whose genome had not been sequenced and reported to the ncbi genbank. it is important to note that all the transmission implications need further investigation with more data from other aspects such as travel history, physical contacts and so on. for the entire protein s, 134 nonsynonymous mutations (48 unique) have both structure and solvent accessibility change potentials.
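the ratios and shares quoted for gene s can be reproduced from the reported counts. the sketch below uses only numbers from the text; the helper names (`ratio`, `parse_mutation`, `in_rbd`) are introduced here for illustration. note that this is a raw count ratio as used in the text, not a site-normalized dn/ds estimate.

```python
import re

# Counts quoted in the text.
S_NONSYN, S_SYN = 3711, 670          # whole gene S
RBD_NONSYN, RBD_SYN = 53, 46         # RBD region only
D614G_COUNT, S_SEQUENCES = 3089, 4434
RBD_START, RBD_END = 319, 541        # arg319-phe541


def ratio(nonsyn, syn):
    """Raw ratio of nonsynonymous to synonymous mutation counts."""
    return nonsyn / syn


def parse_mutation(label):
    """Split a label such as 'd614g' into (reference aa, position, variant aa)."""
    match = re.fullmatch(r"([a-z])(\d+)([a-z])", label.lower())
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def in_rbd(label):
    """True if the mutated position lies inside the RBD residue range."""
    _, position, _ = parse_mutation(label)
    return RBD_START <= position <= RBD_END


print(round(ratio(S_NONSYN, S_SYN), 2))                # 5.54, whole gene S
print(round(ratio(RBD_NONSYN, RBD_SYN), 2))            # 1.15, RBD only
print(round(100 * D614G_COUNT / S_SEQUENCES, 2))       # 69.67 (% with d614g)
print(round(ratio(S_NONSYN - D614G_COUNT, S_SYN), 2))  # 0.93 without d614g
print(in_rbd("v483a"), in_rbd("d614g"))                # True False
```

the last ratio illustrates the remark that the gap between 5.54 and 1.15 is driven in part by d614g: once its 3,089 occurrences are set aside, the whole-gene count ratio drops below 1.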
the mutations with both change potentials that occur in 2 or more sequences are reported in table 12. mutation h146y occurs in 24 cases and mutation p681l occurs in 16 cases, all collected in usa. the most common mutation, d614g, does not have the potential to change either protein secondary structure or relative solvent accessibility. the envelope protein e has 75 aas and is found in 5,852 genbank records with 15 unique aa sequences. among them, 5,824 sequences have no mutation or only synonymous mutations while 28 sequences have nonsynonymous mutations. gene e is thus relatively stable and could be targeted for vaccine and drug development. this is supported by the fact that no insertion or deletion mutations are found within gene e. there are 14 distinct nonsynonymous mutations in gene e, and those occurring in 2 or more sequences are presented in table 13. five distinct nonsynonymous mutations in gene e have protein structure change potential: s68c, s68f, p71l, d72y and l73f. in addition, 4 distinct mutations have the potential to change relative solvent accessibility: l37h, l37r, d72y and l73f. therefore, d72y and l73f are the two mutations in gene e that have the potential to change both protein structure and solvent accessibility.
table 12. gene s - nonsynonymous mutations that have both structure and solvent accessibility change potentials occurring in 2 or more sequences. the "query structure" (and "query accessibility") shows the unique structure (and accessibility) changes based on the prediction results. the structure letter in parentheses is the predicted structure of the residue at the corresponding mutation position. the five letters before and after the parentheses are the structures of neighbouring residues. likewise, the letter "b" or "e" in parentheses shows the accessibility status of the residue at the mutation position.
2 ccccc(c)cbeee ccccc(c)ceeee bebbb(b)bbbbb beebb(b)bbbbb
d253g 7 eecct(t)ccctc ccccc(c)cceec cccct(c)cceec bbbeb(e)bbbeb bbbeb(b)bbbbb
s254f 2 ecctt(c)cctcc ecttc(e)eeecc bbebe(b)bbebb bbbbb(b)bbbbb
w258l 4 tccct(c)cccse cccee(e)eeese ebbbe(b)bbbeb bbbbb(b)bbbeb
g261d 4 ctccc(c)seeee eeccc(c)seeee bebbb(b)ebbbb bbbbb(b)
the m protein has 222 aas and its complete cds appears in 5,677 genbank records, with 37 unique aa sequences. there are 5,557 sequences with no mutation or only synonymous mutations, while the other 120 sequences have nonsynonymous mutations. no insertion or deletion mutations are found in gene m. the number of distinct nonsynonymous mutations in gene m is 37, with those occurring in 5 or more sequences shown in table 14. among these, 10 mutations are likely to change protein secondary structure: c64f, a69s, a69v, v70f, n113b, r158l, v170i, d190n, d209y and s214i. in addition, 6 mutations have the potential to change solvent accessibility: n113b, p123l, p132s, h155y, d190n and t208i. n113b and d190n are thus the two mutations in gene m with the potential to change both protein structure and solvent accessibility. the n protein has 419 aas and its complete cds appears in 5,281 isolates, with 178 unique aa sequences. among them, 4,315 sequences have no mutation or only synonymous mutations while the remaining 966 sequences have deletions or nonsynonymous mutations. there are no insertions in gene n. the sequence in mt434815 (collected in usa: ny on 2020-03-09) has three sequential deletions at q390-, t391- and v392-, while the sequence in mt370992 (usa: ny on 2020-03-20) has six sequential deletions at t366-, e367-, p368-, k369-, k370- and d371-. two other sequences, mt605818 and mt560525 (both collected in turkey on 2020-04-16), have three sequential deletions at r195-, n196- and s197-. there are 1,927 nonsynonymous mutations with 156 distinct ones, and those occurring in 10 or more sequences are presented in table 15.
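the statements that certain mutations in genes e and m affect both predictions amount to a set intersection of the two lists given for each gene. a minimal sketch, using only the mutation labels quoted in the text (the variable names are introduced here):

```python
# Gene E: mutations with structure change potential and with relative
# solvent accessibility change potential, as listed in the text.
e_structure = {"s68c", "s68f", "p71l", "d72y", "l73f"}
e_accessibility = {"l37h", "l37r", "d72y", "l73f"}

# Gene M: the corresponding two lists.
m_structure = {"c64f", "a69s", "a69v", "v70f", "n113b",
               "r158l", "v170i", "d190n", "d209y", "s214i"}
m_accessibility = {"n113b", "p123l", "p132s", "h155y", "d190n", "t208i"}

# Mutations predicted to change both properties are the intersection.
both_e = e_structure & e_accessibility
both_m = m_structure & m_accessibility

print(sorted(both_e))  # ['d72y', 'l73f']
print(sorted(both_m))  # ['d190n', 'n113b']
```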
notable mutations are r203k occurring in 871 sequences and g204r occurring in 433 sequences. there are 15 mutations in this protein having the potential to change both protein structure and solvent accessibility, including g18v, d22y, g34w, r40c, r40l, r185c, a211s, p365h, t391i, t393i, a398s, d399e, d399h, d401y and d402y. analysing the virus genome sequences and their proteins is crucial for understanding the virus and proposing appropriate approaches to respond to and control the pandemic. this paper has reported all point mutations of sars-cov-2 since the virus's first genomes were obtained in december 2019. a sars-cov-2 mutation database is built using a large number of genome sequences (6,324) obtained across 45 countries. this database can enable scientists to monitor the evolution and spread of the virus although the use of these data needs to be corroborated with patients' clinical data and travel history for substantiated confirmations. we also predict the secondary structure and relative solvent accessibility of the virus proteins to evaluate whether the detected mutations have a potential to change the virus characteristics. these protein secondary structure and solvent accessibility change potentials are predicted results based on deep learning recurrent neural networks, which need to be experimentally verified. they however provide important insights about the virus and prompt further experimental biochemistry and molecular biology research into the genomic regions of these mutations. among 3,089 d614g mutations, our prediction results show that none of these mutations is likely to make changes in the protein secondary structure and relative solvent accessibility. in addition, we have shown regions of the sars-cov-2 genomes that have small variations such as those coding for proteins e, m, orf6, orf7a, orf7b and orf10. these regions could be targeted for vaccine and drug development. 
usa (10) australia (6) bangladesh (3) hong kong (2) taiwan (1) germany (1) kazakhstan (1) s202n 25 2020-01-30 china australia (203) greece (122) bangladesh (88) japan (38) czech republic (34) poland (26) germany (18) india (12) taiwan (10) turkey (8) france (8) thailand (6) serbia australia (101) greece (61) bangladesh (44) japan (19) czech republic ) serbia (3) italy (3) spain (2) russia (2) sri lanka (1) puerto rico (1) peru (1) nigeria (1)
a new coronavirus associated with human respiratory disease in china genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding a pneumonia outbreak associated with a new coronavirus of probable bat origin identifying sars-cov-2 related coronaviruses in malayan pangolins probable pangolin origin of sars-cov-2 associated with the covid-19 outbreak the proximal origin of sars-cov-2 origin of novel coronavirus (covid-19): a computational biology study using artificial intelligence. biorxiv a novel coronavirus from patients with pneumonia in china the species severe acute respiratory syndrome-related coronavirus: classifying 2019-ncov and naming it sars-cov-2 isolation of sars-cov-2-related coronavirus from malayan pangolins who coronavirus disease (covid-19) dashboard characterization of the receptor-binding domain (rbd) of 2019 novel coronavirus: implication for development of rbd protein as a viral attachment inhibitor and vaccine sspro/accpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity scratch: a protein structure and structural feature prediction server genetic diversity and evolution of sars-cov-2. infection emergence of genomic diversity and recurrent mutations in sars-cov-2. infection on the origin and continuing evolution of sars-cov-2 spike mutation pipeline reveals the emergence of a more transmissible form of sars-cov-2. biorxiv a short review on antibody therapy for covid-19 emergence of mutations and possible antigenic drift in the surface glycoprotein of sars-cov-2 (covid-19) emergence of drift variants that may affect covid-19 vaccine development and antibody treatment a sequence homology and bioinformatic approach can predict candidate targets for immune responses to sars-cov-2 genomic analysis of covid-19 nextstrain: real-time tracking of pathogen evolution genomic diversity of sars-cov-2 in coronavirus disease 2019 patients emerging sars-cov-2 mutation hot spots include a novel rna-dependent-rna polymerase variant genomic characterization of a novel sars-cov-2 multi-output interval type-2 fuzzy logic system for protein secondary structure prediction jpred4: a protein secondary structure prediction server spider2: a package to predict secondary structure, accessible surface area, and main-chain torsional angles by deep neural networks porter 5: fast, state-of-the-art ab initio prediction of protein secondary structure in 3 and 8 classes. biorxiv raptorx: exploiting structure information for protein alignment by statistical inference a comparative assessment and analysis of 20 representative sequence alignment methods for protein structure prediction yasspp: better kernels and coding schemes lead to improvements in protein secondary structure prediction the protein data bank structure of the sars-cov-2 spike receptor-binding domain bound to the ace2 receptor a rare deletion in sars-cov-2 orf6 dramatically alters the predicted three-dimensional structure of the resultant protein
key: cord-253844-y6xdcf20 authors: yesudhas, dhanusha; srivastava, ambuj; gromiha, m.
michael title: covid-19 outbreak: history, mechanism, transmission, structural studies and therapeutics date: 2020-09-04 journal: infection doi: 10.1007/s15010-020-01516-2 sha: doc_id: 253844 cord_uid: y6xdcf20 purpose: the coronavirus outbreak emerged as a severe pandemic, claiming more than 0.8 million lives across the world and raised a major global health concern. we survey the history and mechanism of coronaviruses, and the structural characteristics of the spike protein and its key residues responsible for human transmissions. methods: we have carried out a systematic review to summarize the origin, transmission and etiology of covid-19. the structural analysis of the spike protein and its disordered residues explains the mechanism of the viral transmission. a meta-data analysis of the therapeutic compounds targeting the sars-cov-2 is also included. results: coronaviruses can cross the species barrier and infect humans with unexpected consequences for public health. the transmission rate of sars-cov-2 infection is higher compared to that of the closely related sars-cov infections. in sars-cov-2 infection, intrinsically disordered regions are observed at the interface of the spike protein and ace2 receptor, providing a shape complementarity to the complex. the key residues of the spike protein have stronger binding affinity with ace2. these can be probable reasons for the higher transmission rate of sars-cov-2. in addition, we have also discussed the therapeutic compounds and the vaccines to target sars-cov-2, which can help researchers to develop effective drugs/vaccines for covid-19. summary: the overall history and mechanism of entry of sars-cov-2 along with structural study of spike-ace2 complex provide insights to understand disease pathogenesis and development of vaccines and drugs. 
abbreviations: ace2, angiotensin-converting enzyme 2; ccl2, chemokine (c-c motif) ligand 2
sars-cov-2 is a single, positive-strand rna virus that causes severe respiratory syndrome in humans [1]. the coronavirus disease 2019 (covid-19) has emerged as a severe pandemic, claiming more than 0.8 million lives worldwide between december 2019 and august 2020 [2, 3]. compared to sars-cov, sars-cov-2 is more readily transmitted from human to human and has spread to almost all continents, leading to the who's declaration of a public health emergency of international concern (pheic) on january 30, 2020 [3] [4] [5]. generally, coronaviruses can cause respiratory, gastrointestinal, and central nervous system diseases in humans and animals, threatening human life and causing economic loss [6, 7]. these viruses also have the capacity to adapt to a new environment through mutations and are programmed to modify host tropism; thus, the threats are constant and long-term [6, 8, 9]. like sars-cov-2, other coronaviruses have crossed the species barrier into humans, leading to outbreaks of severe and fatal respiratory diseases. sars-cov was first identified in bats, and spread to other animals in different geographic regions. the sars-cov outbreak first emerged in humans in 2003, through transmission from animals in open-air markets in china [10, 11]. thereafter, a larger number of genetically related viruses were also identified in chinese horseshoe bats (rhinolophus sinicus) [11] [12] [13]. coronaviruses belong to the family coronaviridae and are divided into alpha (α-cov), beta (β-cov), gamma (γ-cov), and delta (δ-cov) coronaviruses. the alpha and betacoronaviruses can infect mammals, and the viruses found in humans are genetically similar to the β-cov genus.
the β-covs are further divided into different lineages (a, b, c, and d): sars-cov and sars-cov-2 are grouped in lineage b, which has approximately 200 published virus sequences, whereas mers-cov belongs to lineage c, which has ~500 viral sequences [11]. hcov-229e and hcov-nl63 are alphacoronaviruses, whereas hcov-oc43, hcov-hku1 and sars-cov are betacoronaviruses [14] [15] [16]. the phylogenetic analysis shows that sars-cov-2 protein is firmly rooted in the β-genus lineage of bat coronaviruses [14]. the whole genome of sars-cov-2 shares 80% identity with that of sars-cov and is 96% identical to the bat coronavirus batcov-ratg13 [17]. the spike protein sequence similarity between sars-cov-2 and sars-cov is around 76-78%. the rbd alone shares a similarity of 73-76%, and the rbm shares 50-53%. in contrast, the human mers-cov is related to tylonycteris bat coronavirus hku4, shares less sequence similarity (~54%) and recognizes dpp4 as its receptor. the sequence similarity between the sars-cov-2 and sars-cov spike proteins explains the possibility of binding to the same receptor, angiotensin-converting enzyme 2 (ace2), in the host cell [14]. coronaviruses have one of the largest genomes among all rna viruses, ranging from 27 to 32 kb. receptor-mediated endocytosis is the main process of virus entry into host cells. for viral infection, sars-cov-2 uses ace2, a cell-surface receptor that is present in the kidney, blood vessels, heart, and importantly, in the lung at2 alveolar respiratory tract epithelial cells [18]. the spike protein, which is responsible for viral entry, has n-terminal and c-terminal domains, and two major subunits, s1 and s2, are present in almost all coronaviruses [6]. one of these s1 or s2 subunits binds to the host receptor and acts as a receptor-binding domain (rbd) (fig. 1a). next-generation sequencing (ngs) technology has revolutionized the biological sciences, including virus discovery.
ngs has made it easy to recognize thousands of novel virus sequences from wild animal and human populations around the world. although a vast number of coronavirus sequences have been published, very little follow-up work has been performed on them. further, the lack of tools to test these novel viruses and their ability to infect humans has hindered efforts to predict subsequent zoonotic viral outbreaks [11]. understanding the virology of coronaviruses and the methods to control their spread is currently a necessary task to maintain global health and economic stability. in this review, we discuss the history of coronaviruses in both humans and animals, their transmission, the mechanism of host cell entry, and structural studies that explain the active and inactive receptor-binding states of the spike protein and the key residues that play an important role in receptor binding. drug repurposing and the therapeutic targets for sars-cov-2 are also discussed. until the first identification of the human coronaviruses 229e and oc43 in the late 1960s, coronavirus infections were regarded as harmless to humans [14, 15]. the outbreak of sars-cov in southern china in the winter of 2002 had a fatality rate of 10% among infected patients [18] [19] [20]. the virus spread rapidly throughout the world, especially in asia, and was brought under control after july 2003 [21]. viral analysis of the sars outbreak showed that bats are natural reservoirs for sars-covs, and that civet cats and raccoon dogs are the intermediate hosts. in 2012, a novel, highly pathogenic middle east respiratory syndrome coronavirus (mers-cov) was identified in humans, demonstrating that coronaviruses can be transmitted from animals to humans at any time and with unexpected consequences for public health [22]. mers-cov, a slow-spreading virus, has affected ~1700 people with a fatality rate of ~36% [6, 22].
the animal sources of sars-cov-2 infections are bats, and sars-cov-2 can be transmitted to cats, pangolins, and dogs [23]. beyond humans, coronaviruses have a large impact on the animal kingdom. animal coronaviruses can pose a severe threat to their hosts; from 1984, an unrecognized infection spread massively among the swine population in europe, and it was later identified as porcine epidemic diarrhea coronavirus (pedv), which is derived from the porcine enteric coronavirus tgev [24]. in 2013, the same pedv caused, in less than a year, a 100% fatality rate in piglets and wiped out more than 10% of america's pig population [5, 25, 26]. the total number of human coronaviruses identified has been increasing over the years. hcov-229e and hcov-oc43 were the first two human coronaviruses, discovered in the 1960s; hcov-nl63 and hcov-hku1 were identified after the sars-cov outbreak [22]. the story continues with the identification of sars-cov-2 in december 2019 at the seafood wholesale market in wuhan, china. sars-cov-2 is the seventh member of the family of coronaviruses that infects humans, and it is different from both mers-cov and sars-cov. although some infections caused by human coronaviruses are mild and associated with common colds, certain animal and human coronaviruses can have a severe impact on the human population. the infections can be lethal, especially in young children, elderly people, and immune-deficient patients [9, [27] [28] [29]. therefore, it is important to understand the mechanism by which viruses invade their hosts, how they are transmitted, and how these processes can be prevented. respiratory droplets are the main route of transmission; sars-cov-2 can be transmitted to a healthy person through contact with an infected person or with any of that person's belongings, including clothes, doorknobs, etc.
studies have reported that aerosol transmission (airborne transmission) is also possible for sars-cov-2, but there is no conclusive study on neonatal infection (mother to child) [30] [31] [32] [33] [34] [35] [36]. transmission can be reduced by keeping a distance of 2 m between two people, wearing masks when going out, and isolating infected people. during the initial phase of the covid-19 outbreak, a dataset was obtained from 1099 patients with laboratory-confirmed covid-19 from 552 hospitals in 30 provinces of china as of january 29, 2020. only 2% of the patients had a history of contact with animals; more than three quarters had either visited wuhan or were residents of the city. hence, the outbreak patterns and the source of infection could not be predicted from that study. the incubation period of sars-cov-2 ranged from 1 to 12 days, with a median of 4 days [32]. the most common symptoms are fever (43% on admission, and 88.7% during hospitalization), cough (67.8%), diarrhea (3.8%), and fatigue [32, 37]. sars-cov-2 was detected in saliva, blood, sputum, and urine before the development of viral pneumonia, and some patients do not develop pneumonia at all. asymptomatic persons are potential sources of sars-cov-2 infection and may shape the transmission dynamics of the current outbreak [32, 38]. sars-cov-2, first identified in wuhan, china, has spread all over the globe. as of august 20, 2020, more than 22 million confirmed infection cases and 0.8 million deaths had been reported across the world, in almost all countries. the basic reproduction number (r0), the average number of people infected by one infected individual, was 2.75 for the sars pandemic in 2003. the r0 value of ebola in 2014 was in the range of 1.51-2.53, that of h1n1 influenza in 2009 was from 1.46 to 1.48, and that of mers was around 1. the r0 value of sars-cov-2 was estimated to be in the range of 1.5-3.5.
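the r0 figures quoted above can be tabulated and compared directly. the sketch below is illustrative only: single values are stored as degenerate ranges, "around 1" for mers is approximated as exactly 1.0, and the generation-based growth function assumes unchecked spread in a fully susceptible population, which is a textbook simplification rather than a model from the text.

```python
# R0 values or ranges quoted in the text, stored as (low, high).
# "around 1" for MERS is approximated as (1.0, 1.0) for this sketch.
r0_ranges = {
    "sars (2003)": (2.75, 2.75),
    "ebola (2014)": (1.51, 2.53),
    "h1n1 influenza (2009)": (1.46, 1.48),
    "mers": (1.0, 1.0),
    "sars-cov-2": (1.5, 3.5),
}


def overlaps(a, b):
    """True if two closed intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


def cases_in_generation(r0, generation):
    """Expected new cases in a given transmission generation from one index
    case, assuming every case infects r0 others (no control, no immunity)."""
    return r0 ** generation


target = r0_ranges["sars-cov-2"]
for name, interval in r0_ranges.items():
    if name != "sars-cov-2":
        print(name, interval, "overlaps sars-cov-2 range:", overlaps(interval, target))

# Spread of plausible outcomes after 5 generations at the two ends of the range.
print(cases_in_generation(target[0], 5), cases_in_generation(target[1], 5))
```

even a modest difference in r0 compounds across generations, which is why ranges that look similar on paper can still correspond to very different outbreak sizes.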
comparison of the r0 values of these viruses shows that the differences are small. however, the difficulties arising for sars-cov-2 infection are due to the following: (1) the basic properties of the viral infection and the infection periods are uncertain, (2) most of the infected individuals do not show symptoms, but are capable of spreading the infection, (3) how the changing susceptibility of the population affects the spread of infection remains unanswered. in addition, there are no control measures for this spread [39]. coronaviruses consist of four structural proteins: the nucleocapsid protein (n) forms the helical capsid that accommodates the genome. the whole structure is further surrounded by a lipid envelope, which is made of s (spike), e (envelope) and m (membrane) proteins. the membrane and envelope proteins are needed for virus assembly, and the spike protein is needed for virus entry and host cell recognition [6]. the spike protein forms large protrusions (peplomers) on the virus surface, giving the virus a crown-like appearance, and hence the name "corona" (a latin word meaning crown). the spike protein comprises three segments: (1) a large ectodomain, (2) a transmembrane domain and (3) an intracellular tail. the receptor-binding subunit s1 and the membrane-fusion subunit s2 are located in the ectodomain region. during infection, s1 binds to the host receptor, and s2 fuses the host and viral membranes, thereby releasing the viral genome into the cell (fig. 1). the spike protein is a clove-shaped trimer with three s1 heads and a trimeric s2 stalk [6]. during viral infection, the spike protein (~1300 amino acid residues) is cleaved by host proteases into the receptor-binding subunit s1 and the membrane-fusion subunit s2. during cell entry, the s1 subunit binds directly to the sugar receptors [40] and ace2 on the host cell surface, and the s2 subunit undergoes conformational changes and adopts the post-fusion state [41].
during this state, the three pairs of heptad repeat regions hr-n and hr-c in trimeric s2 form a six-helix bundle structure [42]. the buried hydrophobic fusion peptides become exposed and insert into the target host membrane. these fusion peptides and the transmembrane anchors are positioned at the end of the six-helix bundle structure, bringing the viral and host membranes together to fuse [42, 43]. during this process, a large amount of energy is released, which drives membrane fusion forward. along with this, receptor binding and low ph can also trigger membrane fusion [6]. since the spike protein has a good binding affinity for sugar receptors of human cells, it uses them as a mechanism of cell entry [6]. notably, sars-cov-2 has a higher affinity for human ace2 than the sars-cov virus strain. the ectodomain of the sars-cov-2 spike protein binds to the peptidase domain (pd) of ace2 with a kd (equilibrium dissociation constant) of ~15 nm [4]. spike protein priming is done by transmembrane protease serine 2 (tmprss2), which is also essential for the entry of sars-cov-2 [44]. generally, sars-cov enters host cells through endocytosis, where its spike protein is processed by the lysosomal proteases cathepsin l and cathepsin b. extracellular proteases, including elastase in the respiratory tract and tmprss2 on the surface of lung cells, are also known to activate spike membrane fusion [6]. viral entry into host cells can occur via: (1) the endocytic pathway and (2) the non-endosomal pathway [45] (fig. 1b). the endocytic pathway, especially clathrin-dependent endocytosis, has been extensively studied for sars-cov and mers-cov viral entry. since sars-cov-2 uses the same receptor as sars-cov, it is reported that sars-cov-2 also uses the same viral entry mechanism. wang et al. [46] reported clathrin- and caveolae-independent endocytic pathways for sars-cov entry.
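the dissociation constant of ~15 nm quoted above for the spike ectodomain and the ace2 peptidase domain can be converted into an approximate standard binding free energy using the standard relation ΔG° = RT ln(Kd). this is a back-of-the-envelope sketch: the temperature of 298.15 k and the 1 m standard state are assumptions made here, and the function name is illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kcal/mol from a dissociation constant,
    using dG = R*T*ln(Kd) with a 1 M standard state (assumed conditions)."""
    return R_KCAL * temperature_k * math.log(kd_molar)


kd = 15e-9  # ~15 nM, as quoted in the text
dg = binding_free_energy(kd)
print(f"{dg:.1f} kcal/mol")  # about -10.7 kcal/mol under these assumptions
```

a more negative value corresponds to tighter binding; a tenfold change in kd shifts the free energy by roughly 1.4 kcal/mol at this temperature.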
although the endocytic pathway is commonly used as a viral entry mechanism, discrepancies among the reports are unavoidable. thus, the exact nature of viral entry is context-dependent and varies with the type of virus and host cell [47]. in addition, macrophages can act as a viral reservoir and minimally support sars-cov-2 entry and replication. although dendritic cells and other immune cells are not infected by sars-cov-2, they may serve as transporters for the virus, which also contributes to the pathogenesis [48, 49]. the s1 subunit of the sars-cov spike glycoprotein consists predominantly of β-strand structures and is composed of an n-terminal domain (ntd, residues 14-294) and three c-terminal domains (ctd1, residues 320-516; ctd2, residues 517-578 and ctd3, residues 579-663). the ntd is linked to ctd1 through the linker residues 295-319, and ctd1 acts as the rbd for sars-cov, binding specifically to the ace2 receptor. gui et al. [43] reported that the sars-cov spike glycoprotein trimer can adopt different conformations, which are necessary for effective binding to ace2. all these conformations are named according to the position of ctd1 of the spike glycoprotein (fig. 2). the three spike glycoprotein monomers intertwine with each other and form a densely packed homotrimer. the head portion of this trimer is formed by the ntds and ctd1s of the s1 subunits, where the ctd1s are located at the center and the ntds on the outside of this triangular head. the s2 subunits form the stem of this trimer, which is further surrounded by the ctd2s and ctd3s of the s1 subunits (fig. 2). in the inactive state (fig. 2a, b), the s2 subunits (stem portion) are completely covered by the "down" position of the ctd1s (head portion), which causes steric clashes that prevent binding between the spike protein and ace2. in the active state (fig.
2c), two ctd1s adopt the "down" conformation and one ctd1 rotates outward and adopts the "up" conformation, which does not cover the s2 subunit and allows interactions between the spike protein and ace2. the "up" position also paves the way for the s2 subunit to expose its fusion peptides and insert them into the host cell membrane [43] (fig. 2). the sequence similarity between the sars-cov-2 and sars-cov spike proteins explains the possibility of their having the same receptor, ace2, in the host cell [14]. sequence and structural studies have revealed the key residues involved in spike-ace2 interactions. the key residue at position 493 in the rbd of sars-cov-2 is gln, whereas in sars-cov (where the corresponding residue is 479) it is lys in civets and asn in humans. since residue 479 of the rbd is near the virus-binding hotspot residue lys31 of human ace2, the lys residue present in civet causes steric clashes and disfavors binding to the human ace2 receptor. however, the lys479asn mutation revealed that the asn present at position 479 of human sars-cov enhances viral binding to human ace2 receptors. gln493 in the sars-cov-2 rbd is compatible with the human ace2 receptor hotspot residue lys31, which explains its target cell identification [14]. similarly, residue 501 in the sars-cov-2 rbd is asn, whereas the corresponding residue in sars-cov (487) is ser in civets and thr in humans. this residue also plays an important role in interacting with the hotspot residue lys353 of the ace2 receptor. analysis of the ser487thr mutation shows that it favors human ace2 receptor binding and plays a role in human-to-human transmission. thus, the interactions with lys353 are more favorable for thr (human sars-cov) and asn (sars-cov-2) than for ser (civet) [14]. the residues leu455, phe486, and ser494 in the sars-cov-2 rbd (tyr442, leu472, and asp480 in human and civet sars-cov) are considered important for human ace2 receptor binding.
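the residue correspondences described above can be checked against the s1 domain boundaries quoted earlier for the sars-cov spike. the sketch below uses only positions given in the text; the table layout and the helper `domain_of` are introduced here for illustration.

```python
# S1 domain boundaries of the SARS-CoV spike as quoted in the text (inclusive).
SARS_COV_S1_DOMAINS = [
    ("ntd", 14, 294),
    ("linker", 295, 319),
    ("ctd1", 320, 516),   # acts as the RBD
    ("ctd2", 517, 578),
    ("ctd3", 579, 663),
]

# SARS-CoV-2 RBD position -> corresponding SARS-CoV position, from the text.
KEY_RESIDUE_MAP = {455: 442, 486: 472, 493: 479, 494: 480, 501: 487}


def domain_of(position):
    """Return the S1 domain containing a SARS-CoV residue position, or None."""
    for name, start, end in SARS_COV_S1_DOMAINS:
        if start <= position <= end:
            return name
    return None


# Locate each key residue using its SARS-CoV numbering.
domains = {cov2: domain_of(cov1) for cov2, cov1 in KEY_RESIDUE_MAP.items()}
for cov2, name in domains.items():
    print(f"sars-cov-2 {cov2} <-> sars-cov {KEY_RESIDUE_MAP[cov2]}: {name}")
```

with these boundaries, all five sars-cov counterparts fall inside ctd1, consistent with ctd1 being described as the receptor-binding domain.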
tyr442 of human and civet sars-cov rbds shows unfavorable interactions with the hotspot residue lys31 of human ace2; however, leu455 of sars-cov-2 provides favorable interactions. compared with leu472 of human and civet sars-cov rbds, phe486 of sars-cov-2 provides better interaction with the hotspot residue lys31 of human ace2. although ser494 provides positive support for the human ace2 receptor hotspot residue lys353, the sars-cov asp480 also makes a favorable interaction with lys353. throughout this viral entry process, lys31 and lys353 of the human ace2 receptor are termed "hot spot" residues; they form salt bridges buried in a hydrophobic environment and contribute critically to the binding between the virus and the host cell receptor. thus, this comparison of the key residues of sars-cov-2 with those of civet and human sars-cov explains how effectively sars-cov-2 selects and binds to human ace2 receptors, which is likely to cause human-to-human transmission [14]. the heterogeneity of amino acids in ace2 receptors is also responsible for the varying binding affinities between the host and the virus, which is associated with viral transmission. variants in the host cell receptor (ace2) can, however, confer resistance against the invading pathogens. hussain et al. reported that the mutations s19p and e329g in ace2 disrupt the intermolecular interactions and have low binding affinity for the viral spike protein. thus, in addition to the variations in the viral spike protein, ace2 allelic variants can also drive potential resistance against sars-cov-2 infection [50]. intrinsically disordered proteins (idps) and intrinsically disordered regions (idrs) also play major roles in a number of biological functions, including dna/rna binding, protein binding, and facilitating access to the binding sites between binding partners [51, 52].
rna-protein recognition often requires conformational changes in both the rna and the protein, which are facilitated by the structural flexibility of disordered residues [51]. the functions of intrinsically disordered regions in proteins also extend to transcription, translation, post-translational modification, and cell signaling [51, 53]. categorizing coronaviruses by their intrinsic disorder propensities can therefore provide useful clues about the viral life cycle and pathogenicity. the idrs in the sars-cov nucleocapsid protein comprise three segments: residues 1-44, 182-247 and 366-422 [54, 55]. the highly flexible, intrinsically disordered linker region that connects the ntd and ctd is rich in serine and arginine residues. an intrinsically disordered domain that flanks the ctd (the c-terminal tail peptide) plays a significant role in dimer-dimer association in human coronavirus 229e (hcov-229e). likewise, coronavirus hku1 has a conserved, partially disordered linker loop structure (amino acids 428-587) [56]. the hku1 s2 subunit also contains disordered residues at its protease cleavage site [57]. in sars-cov-2, the attachment of the spike protein to the host cell is activated by the host cell enzymes trypsin, cathepsin l, furin and tmprss2 (fig. 1b). sequence comparison of sars-cov-2 with other lineage b betacoronaviruses shows that the unique amino acid pattern "rrar", which is cleaved by furin, is present at the s1/s2 junction of the spike protein. moreover, the structure reported for the sars-cov-2 spike protein (pdb code: 6vsb) shows that the s1/s2 junction lies in a disordered, solvent-exposed loop [4]. hence, it has been hypothesized that the unique "rrar" sequence in sars-cov-2 is responsible for its effective transmission [58] [59] [60].
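locating a short pattern such as "rrar" in a protein sequence is a simple string search. the following is a minimal sketch, not the procedure used in the cited studies; the function name `find_motif` is hypothetical, and the fragment in the usage example is a toy string written for illustration rather than a database record.

```python
import re


def find_motif(sequence: str, motif: str = "RRAR") -> list[int]:
    """Return 1-based start positions of every occurrence of `motif`.

    A zero-width lookahead is used so that overlapping occurrences
    are also reported. Matching is case-insensitive.
    """
    pattern = re.compile(f"(?={re.escape(motif.upper())})")
    return [match.start() + 1 for match in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Toy fragment for illustration only (an assumption, not a curated sequence).
    toy_fragment = "tnsprrarsvas"
    print(find_motif(toy_fragment))   # [5]
    print(find_motif("RRARRAR"))      # [1, 4] (overlapping hits)
    print(find_motif("tnspsvas"))     # []
```

positions are reported relative to the fragment that is passed in; mapping them to full-length spike numbering would require the offset of the fragment, which is not assumed here.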
binding to ace2 is governed by the intrinsically flexible receptor binding motif, and the binding interfaces and key residues have been reported in the literature [61, 62]. in addition, based on our analysis (fig. 3), the monomeric structure of the spike protein in the free form (pdb id: 6vsb; green) [4] has a number of missing residues. these missing residues attain a stable conformation upon binding to the ace2 receptor (pdb id: 6lzg; light golden) and are hence termed disorder-to-order transition (dot) residues. specifically, the regions leu455 to pro491 and asn501 to val503 show disorder-to-order transitions [61]. these dot residues facilitate better shape complementarity and affinity between ace2 and the spike protein. interestingly, the key residues responsible for transmission and for interaction with ace2 receptors (already discussed in section "key residues of sars-cov-2 and sars-cov") overlap with these dot residues. based on these observations, the disorder-to-order conformational change appears necessary for the spike protein to bind its receptor. thus, an in-depth analysis of these disordered residues will provide additional insights into the viral recognition and transmission mechanism. the studies by goh et al. [63] on the 1918 h1n1 and h5n1 viruses suggest that disordered regions are very likely important for host specificity and recognition, e.g., across species of birds. they also explained how disorder-related changes in crucial regions increase the virulence of both the h5n1 and the 1918 h1n1 viruses. therefore, the growing number of reports of intrinsically disordered regions in coronaviruses deserves attention, and the importance and functional role of these regions need to be thoroughly studied.
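the idea behind dot residues can be expressed as a set difference: residues resolved in the bound structure but missing in the free one. the sketch below assumes that the resolved residue numbers of each structure have already been extracted (for example from the coordinate files); the function names and the toy residue sets are hypothetical, and only the two dot ranges and the key residue numbers come from the text.

```python
def expand_ranges(ranges: list[tuple[int, int]]) -> set[int]:
    """Expand inclusive (start, end) residue ranges into a set of residue numbers."""
    return {pos for start, end in ranges for pos in range(start, end + 1)}


def disorder_to_order(free_resolved: set[int], bound_resolved: set[int]) -> set[int]:
    """Residues missing in the free form but resolved in the bound form."""
    return bound_resolved - free_resolved


# Ranges and key residues quoted in the text.
DOT_RANGES = [(455, 491), (501, 503)]
KEY_RESIDUES = [455, 486, 493, 494, 501]

if __name__ == "__main__":
    dot = expand_ranges(DOT_RANGES)

    # Toy example (assumed data): the bound form resolves residues 450-505,
    # the free form lacks exactly the dot residues.
    bound = set(range(450, 506))
    free = bound - dot
    assert disorder_to_order(free, bound) == dot

    print(len(dot), "dot residues")
    print("key residues inside the dot ranges:",
          [pos for pos in KEY_RESIDUES if pos in dot])
```

with the ranges as quoted, three of the five key residues (455, 486 and 501) fall inside the dot regions, while 493 and 494 lie just outside the first range; a comparison against the actual coordinate files would be needed to confirm the exact boundaries.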
a thorough study of these regions could identify additional drug targets for combating coronaviruses through the disruption of their packing and assembly process. receptor binding, together with membrane fusion, is the initial and essential step in coronavirus infection and serves as a primary target for inhibiting viral entry. genome sequencing of sars-cov-2 and comparison with the genomes of related viruses suggested that the anti-hiv combination of lopinavir plus ritonavir is likely to be effective [15, 64, 65]. sars-cov-2 uses the ace2 receptor of at2 cells in the lung as its primary target. since viral entry is governed by receptor-mediated endocytosis, ap2-associated protein kinase 1 (aak1), a known regulator of endocytosis, can be a good target for interrupting virus entry. the janus kinase inhibitor baricitinib, which also binds cyclin g-associated kinase (another endocytosis regulator), is reported to inhibit aak1 [66, 67]. similarly, the oncology drugs sunitinib and erlotinib have been shown to inhibit viral infection of cells through the inhibition of aak1 [68]. however, these compounds have serious side effects and cannot be considered a safe therapy.

fig. 3 structure of the monomeric spike protein (green)-ace2 receptor (blue) complex. the interface residues are shown in light golden. the disorder-to-order transition residues (leu455 to pro491 and asn501 to val503) are marked in the figure

based on the pathogenicity studies of sars-cov and mers-cov, an increased amount of inflammatory cytokines in serum is associated with inflammation and extensive lung damage [69]. similarly, patients infected with sars-cov-2 have high levels of il1b, ifnγ, ip10, and mcp1/ccl2, which lead to t-helper-1 (th1) activation. however, sars-cov-2 infection also increases the t-helper-2 (th2) cytokines (il4 and il10) that suppress inflammation, which differs from sars-cov and mers-cov infection.
in view of the cytokines induced by sars-cov-2, sars-cov and mers-cov, corticosteroids were frequently used to reduce inflammation-induced lung injury [2]. arabi et al. [70] examined the combination of interferon beta-1b, lopinavir, and ritonavir for mers infection in saudi arabia. the antiviral nucleotide prodrug remdesivir showed potent efficacy against mers-cov and sars-cov infections. since sars-cov-2 is an emerging virus, no effective treatment has been developed so far, but the already available combination of lopinavir and ritonavir is being used [2]. since sars-cov enters the cell through endocytosis and lysosomal proteases prime the spike protein, inhibiting endosomal acidification or the lysosomal cysteine proteases can block sars-cov entry [6]. the activation of sars-cov and sars-cov-2 by tmprss2 might also have therapeutic implications. camostat mesylate, a protease inhibitor that has been studied as a tmprss2 inhibitor and has been used to treat humans, is available and could be employed against respiratory viruses [71, 72]. autophagy has been implicated in viral replication and is also responsible for the formation of double-membrane vesicles (dmvs) in host cells [73]. since autophagosomes are degraded by lysosomes, inhibitors such as lysosomotropic agents have been proposed for sars-cov-2 [74]. the antimalarial drugs hydroxychloroquine (hcq) and chloroquine (cq) have been reported to show antiviral activity against sars-cov-2. however, the data supporting the use of hcq and cq for covid-19 are incomplete [75]. these lysosomotropic agents neutralize the acidic ph of the endosome-lysosome system, thereby blocking protease activity and, subsequently, viral entry. how autophagy is implicated in the infection of covs is still under debate [47, 74, 76]. table 1 describes the different inhibitors used for blocking viral entry.
the most studied pathway, and the one commonly proposed for all coronaviruses, is the endocytic pathway, so blocking it is a major strategy for treating the disease. identifying potential drugs for sars-cov-2 is an urgent task, and drug repurposing is a suitable approach. small molecules that are already in clinical trials or used for other diseases, along with molecules suggested by expert opinion, are listed in table 2. the table is categorized into three groups: (1) compounds involving drug repurposing, (2) compounds in clinical and pre-clinical study, (3) compounds targeting the mechanism pathway. researchers in government and private sectors are making huge efforts to develop effective vaccines for sars-cov-2. the vaccine development approaches are mainly based on inactivated and attenuated viral protein particles, viral vectors and viral dna/rna. a novel rna-based vaccine, mrna-1273, uses a part of the genetic code of the spike protein (clinicaltrials.gov: nct04283461) [139]. several other mrna-based vaccines, including one by curevac (tübingen, germany) and bnt162 by biontech (mainz, germany) and pfizer (new york, ny, usa), are in different stages of development [139, 140]. cansino biologics (tianjin, china), the company that developed a vaccine for ebola, is also developing a vaccine named ad5-ncov for sars-cov-2. it is a spike protein-based vaccine that is undergoing phase i clinical trials in healthy individuals in wuhan, china (clinicaltrials.gov: nct04313127) [141, 142]. the current status of the vaccines under development is available at the milken institute treatment and vaccine tracker: https://milkeninstitute.org/sites/default/files/2020-03/covid19%20tracker%20032020v3-posting.pdf [141]. sars-cov-2 causes a continuously spreading, life-threatening disease. this coronavirus has rapidly evolved and spread all over the world, claiming more than 0.8 million lives so far.
the exact origin, mechanism of attack and spatial distribution of the virus are still not completely explored. the viral and host cell proteins that assist in the invasion process can be targets for treating the infection. the available crystal structures and the binding mechanisms proposed for other coronaviruses can help in studying sars-cov-2. structural studies of the viral particles are also helpful for identifying drug targets. the positional changes (up and down conformations) of the spike protein trimer determine its active and inactive states. thus, small molecules or peptides that target these conformations could also stop viral entry. in addition, phylogenetic analysis and structural studies revealed that the hot spot residues, the role of gln493 in human ace2 receptor binding and the critical residue asn501 of the rbd explain the human-to-human transmission. the intrinsically disordered regions and a precise furin-like cleavage site may be responsible for the viral life cycle and pathogenicity. however, in-depth studies are needed to address this issue, which may represent a potential antiviral strategy. small molecules in clinical trials and drug compounds currently used to treat other diseases can also help to identify (screen) potential drug candidates.
development of suitable mouse models to understand this novel virus infection and comparison with other coronaviruses will speed up drug discovery. drug testing techniques also need to be accelerated. recognizing the risks and commercial benefits, researchers are developing effective vaccines for sars-cov-2 infection.

abbreviations: hdac2 histone deacetylase 2, csnk2a2 casein kinase 2 alpha 2, comt catechol-o-methyltransferase, ptges2 prostaglandin e synthase 2, ndufs1 nadh-ubiquinone oxidoreductase 75 kda subunit, gla alpha-galactosidase a, impdh2 inosine-5′-monophosphate dehydrogenase 2, mark2/3 map/microtubule affinity-regulating kinase 2/3, abcc1 atp-binding cassette subfamily c member 1, fkbp fk506-binding protein, larp1 la ribonucleoprotein 1, sigmar1 sigma non-opioid intracellular receptor 1, nek9 nima-related kinase 9, brd2 bromodomain containing 2, tmem97 sigma-2 receptor, prkaca protein kinase camp-activated catalytic subunit alpha, dnmt1 dna (cytosine-5)-methyltransferase 1, dctpp1 dctp pyrophosphatase 1, f2rl1 f2r-like trypsin receptor 1, eif4e2 eukaryotic translation initiation factor 4e family member 2, rae1 ribonucleic acid export 1, cep250 centrosomal protein 250, cul2 cullin 2, lox lysyl oxidase, nups nucleoporins
sars coronavirus: a new challenge for prevention and therapy clinical features of patients infected with 2019 novel coronavirus in wuhan a familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster cryo-em structure of the 2019-ncov spike in the prefusion conformation early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia structure, function, and evolution of coronavirus spike proteins coronaviruses post-sars: update on replication and pathogenesis receptor recognition and cross-species infections of sars coronavirus recombination, reservoirs, and the modular spike: mechanisms of coronavirus cross-species transmission sars-cov infection in a restaurant from palm civet functional assessment of cell entry and receptor usage for sars-cov-2 and other lineage b betacoronaviruses isolation and characterization of a bat sars-like coronavirus that uses the ace2 receptor discovery of a rich gene pool of bat sars-related coronaviruses provides new insights into the origin of sars coronavirus receptor recognition by the novel coronavirus from wuhan: an analysis based on decade-long structural studies of sars coronavirus proteolytic activation of the sars-coronavirus spike protein: cutting enzymes at the cutting edge of antiviral research the species severe acute respiratory syndrome related coronavirus: classifying 2019-ncov and naming it sars-cov-2 a pneumonia outbreak associated with a new coronavirus of probable bat origin tissue distribution of ace2 protein, the functional receptor for sars coronavirus. 
a first step in understanding sars pathogenesis severe acute respiratory syndrome sars: understanding the virus and development of rational therapy severe acute respiratory syndrome coronavirus as an agent of emerging and reemerging infection isolation of a novel coronavirus from a man with pneumonia in saudi arabia the risk of sars-cov-2 transmission to pets and other wild and domestic animals strongly mandates a one-health strategy to control the covid-19 pandemic. one health porcine respiratory coronavirus: molecular features and virus-host interactions deadly pig virus slips through us borders emergence of porcine epidemic diarrhea virus in the united states: clinical signs, lesions, and viral genomic sequences severity and outcome associated with human coronavirus oc43 infections among children human coronavirus nl63 infection and other coronavirus infections in children hospitalized with acute respiratory disease in hong kong coronavirus infections in hospitalized pediatric patients with acute respiratory tract disease the novel coronavirus originating in wuhan, china: challenges for global health governance clinical analysis of 10 neonates born to mothers with 2019-ncov pneumonia clinical characteristics of coronavirus disease in 2019 in china airborne transmission route of covid-19: why 2 meters/6 feet of inter-personal distance could not be enough consideration of the aerosol transmission for covid-19 and public health asymptomatic carriage and transmission of sars-cov-2: what do we know? small droplet aerosols in poorly ventilated spaces and sars-cov-2 transmission clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in wuhan transmission of 2019-ncov infection from an asymptomatic contact in germany round s/how-scien tists -quant ifyinten sity-of-an-outbr eak-like-covid -19. 
accessed human coronavirus hku1 spike protein uses o-acetylated sialic acid as an attachment receptor determinant and employs hemagglutinin-esterase protein as a receptor-destroying enzyme host cell proteases: critical determinants of coronavirus tropism and pathogenesis cellular entry of the sars coronavirus cryoelectron microscopy structures of the sars-cov spike glycoprotein reveal a prerequisite conformational state for receptor binding return of the coronavirus: 2019-ncov coronaviruses-drug discovery and therapeutic options sars coronavirus entry into host cells through a novel clathrin-and caveolae-independent endocytic pathway targeting the endocytic pathway and autophagy process as a novel therapeutic strategy in covid-19 the lung macrophage in sars-cov-2 infection: a friend or a foe? sars-coronavirus replication in human peripheral monocytes/macrophages structural variations in human ace2 may influence its binding with sars-cov-2 spike protein intrinsically unstructured proteins and their functions role of disordered regions in transferring tyrosine to its cognate trna the coronavirus nucleocapsid is a multifunctional protein structure of the n-terminal rna-binding domain of the sars cov nucleocapsid protein the sars coronavirus nucleocapsid protein-forms and functions receptor recognition mechanisms of coronaviruses: a decade of structural studies pre-fusion structure of a human coronavirus spike protein the spike glycoprotein of the new coronavirus 2019-ncov contains a furin-like cleavage site absent in cov of the same clade the role of furin cleavage site in sars-cov-2 spike protein-mediated membrane fusion in the presence or absence of trypsin a review on the cleavage priming of the spike protein on coronavirus by angiotensin-converting enzyme-2 and furin structural and functional basis of sars-cov-2 entry by using human ace2 structural basis for the recognition of sars-cov-2 by full-length human ace2 protein intrinsic disorder and influenza virulence: 
the 1918 h1n1 and h5n1 viruses computational studies of drug repurposing and synergism of lopinavir, oseltamivir and ritonavir binding with sars-cov-2 protease against covid-19 therapeutic targets and computational approaches on drug development for covid-19 baricitinib as potential treatment for 2019-ncov acute respiratory disease epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in wuhan, china: a descriptive study feasibility and biological rationale of repurposing sunitinib and erlotinib for dengue treatment plasma inflammatory cytokines and chemokines in severe acute respiratory syndrome treatment of middle east respiratory syndrome with a combination of lopinavir-ritonavir and interferon-β1b (miracle trial): study protocol for a randomized controlled trial simultaneous treatment of human bronchial epithelial cells with serine and cysteine protease inhibitors prevents severe acute respiratory syndrome coronavirus entry identification of the first synthetic inhibitors of the type ii transmembrane serine protease tmprss2 suitable for inhibition of influenza virus activation coronavirus replication complex formation utilizes components of cellular autophagy chloroquine is a potent inhibitor of sars coronavirus infection and spread use of hydroxychloroquine and chloroquine during the covid-19 pandemic: what every clinician should know sars coronavirus, but not human coronavirus nl63, utilizes cathepsin l to infect ace2-expressing cells rvx-208, an inducer of apoa-i in humans, is a bet bromodomain antagonist discovery and sar of 5-(3-chlorophenylamino) benzo [c][2, 6] naphthyridine-8-carboxylic acid (cx-4945), the first clinical stage inhibitor of protein kinase ck2 for the treatment of cancer ck2α and ck2α' subunits differ in their sensitivity to 4,5,6,7-tetrabromo-and 4,5,6,7-tetraiodo-1h-benzimidazole derivatives cx-4945, an orally bioavailable selective inhibitor of protein kinase ck2, inhibits prosurvival and 
angiogenic signaling and exhibits antitumor efficacy determination of the class and isoform selectivity of small-molecule histone deacetylase inhibitors remdesivir and chloroquine effectively inhibit the recently emerged novel coronavirus (2019-ncov) in vitro valproic acid defines a novel class of hdac inhibitors inducing differentiation of transformed cells the histone deacetylase inhibitor valproic acid selectively induces proteasomal degradation of hdac2 effect of inhibiting histone deacetylase with short-chain carboxylic acids and their hydroxamic acid analogs on vertebrate development and neuronal chromatin biochemical and pharmacological properties of a peripherally acting catechol-o-methyltransferase inhibitor entacapone catechol-o-methyltransferase: variation in enzyme activity and inhibition by entacapone and tolcapone inhibition of prostaglandin synthesis as a mechanism of action for aspirin-like drugs metformin inhibits mitochondrial complex i of cancer cells to reduce tumorigenesis in vitro inhibition and intracellular enhancement of lysosomal α-galactosidase a activity in fabry lymphoblasts by 1-deoxygalactonojirimycin and its derivatives mycophenolic acid: an anti-cancer compound with unusual properties a quantitative analysis of kinase inhibitor selectivity comprehensive analysis of kinase inhibitor selectivity doxorubicin-and daunorubicin-glutathione conjugates, but not unconjugated drugs, competitively inhibit leukotriene c4transport mediated bymrp/gs-xpump r)-and (s)-verapamil differentially modulate the multidrug-resistant protein mrp1 characterization of the cloned full-length and a truncated human target of rapamycin: activity, specificity, and enzyme inhibition as studied by a high capacity assay larp1 is a major phosphorylation substrate of mtorc1. 
biorxiv sigma receptors as endoplasmic reticulum stress "gatekeepers" and their modulators as emerging new weapons in the fight against cancer dabrafenib inhibits the growth of braf-wt cancers through cdk16 and nek9 inhibition synthesis and biological evaluation of the 1-arylpyrazole class of σ1 receptor antagonists: identification of 4-{2-[5-methyl-1-(naphthalen-2-yl)-1 h-pyrazol-3-yloxy] ethyl} morpholine (s1ra, e-52862) the pharmacology of the novel and selective sigma ligand, pd 144418 novel sigma receptor ligands: synthesis and biological profile cyclohexylpiperazine derivative pb28, a σ2 agonist and σ1 antagonist receptor, inhibits cell growth, modulates p-glycoprotein, and synergizes with anthracyclines in breast cancer inhibition of forskolininduced neurite outgrowth and protein phosphorylation by a newly synthesized selective inhibitor of cyclic amp-dependent protein kinase, n-[2-(p-bromocinnamylamino) ethyl]-5-isoquinolinesulfonamide (h-89), of pc12d pheochromocytoma cells the structure of inosine 5′-monophosphate dehydrogenase and the design of novel inhibitors discovery of xl413, a potent and selective cdc7 inhibitor anti-metastatic inhibitors of lysyl oxidase (lox): design and structure-activity relationships discovery of the first potent and selective inhibitors of human dctp pyrophosphatase 1 identification of triazolothiadiazoles as potent inhibitors of the dctp pyrophosphatase 1 diverse heterocyclic scaffolds as dctp pyrophosphatase 1 inhibitors. 
part 2: pyridone-and pyrimidinone-derived systems synthesis and structure-activity relationships of a novel series of pyrimidines as potent inhibitors of tbk1/ikkε kinases discovery of potent and selective small-molecule par-2 agonists structural insight into allosteric modulation of protease-activated receptor 2 novel agonists and antagonists for human protease activated receptor 2 selective inhibition of the bd2 bromodomain of bet proteins in prostate cancer targetable bet proteins-and e2f1-dependent transcriptional program maintains the malignancy of glioblastoma selective small molecule induced degradation of the bet bromodomain protein brd4 identification of a benzoisoxazoloazepine inhibitor (cpi-0610) of the bromodomain and extra-terminal (bet) family as a candidate for human clinical trials atpcompetitive inhibitors of mtor: an update eft226, a potent and selective inhibitor of eif4a, is efficacious in preclinical models of lymphoma verdinexor (kpt-335), a selective inhibitor of nuclear export, reduces respiratory syncytial virus replication in vitro an inhibitor of nedd8-activating enzyme as a new approach to treat cancer impdh2 is an intracellular target of the cyclophilin a and sanglifehrin a complex ternatin and improved synthetic variants kill cancer cells by targeting the elongation factor-1a ternary complex blocking eif4e-eif4g interaction as a strategy to impair coronavirus replication structure-based design of pyridine-aminal eft508 targeting dysregulated translation by selective mitogen-activated protein kinase interacting kinases 1 and 2 (mnk1/2) inhibition translation control of the immune checkpoint in cancer and its therapeutic targeting discovery of a potent and orally bioavailable cyclophilin inhibitor derived from the sanglifehrin macrocycle design and structural characterization of potent and selective inhibitors of phosphatidylinositol 4 kinase iiiβ comparative flavivirus-host protein interaction mapping reveals mechanisms of dengue and 
zika virus pathogenesis a sars-cov-2-human protein-protein interaction map reveals drug targets and potential drug-repurposing ester prodrugs of ihvr-19029 with enhanced oral exposure and prevention of gastrointestinal glucosidase interaction enhancing the antiviral potency of er α-glucosidase inhibitor ihvr-19029 against hemorrhagic fever viruses in vitro and in vivo substrate dependence of angiotensin i-converting enzyme inhibition: captopril displays a partial selectivity for inhibition of n-acetylseryl-aspartyl-lysyl-proline hydrolysis compared with that of angiotensin i crystal structure of the human angiotensin-converting enzyme-lisinopril complex the novel coronavirus 2019 (2019-ncov) uses the sars-coronavirus receptor ace2 and the cellular protease tmprss2 for entry into target cells identification of nafamostat as a potent inhibitor of middle east respiratory syndrome coronavirus s protein-mediated membrane fusion using the split-proteinbased cell-cell fusion assay oxazolidinones inhibit cellular proliferation via inhibition of mitochondrial protein synthesis sars-cov-2 vaccines: status report the pandemic pipeline sars-cov-2/covid-19: viral genomics, epidemiology, vaccines, and therapeutic interventions the covid-19 vaccine development landscape acknowledgements we thank indian institute of technology madras, key: cord-256340-w4z5avld authors: bailer, sm; haas, j title: connecting viral with cellular interactomes date: 2009-07-24 journal: curr opin microbiol doi: 10.1016/j.mib.2009.06.004 sha: doc_id: 256340 cord_uid: w4z5avld genome-scale screens for intraviral and virus–host protein interactions and the analysis of literature-curated datasets are able to provide a novel, comprehensive perspective of viruses, and virus-infected cells. 
until now, large-scale interaction screens were predominantly performed with the yeast-two-hybrid (y2h) system; however, alternative high-throughput technologies detecting binary protein interactions or protein complexes have been developed. although many of the previous studies suffer from rather poor validation of the results and offer few biological implications, these technologies potentially lead to a plethora of novel hypotheses. here, we will give an overview of current approaches and their technical limitations, and present recent examples and novel developments.

the y2h system as the standard assay for the evaluation of interactomes

essentially all high-throughput approaches to identify binary protein interactions on a genome scale currently rely on the gal4-based yeast-two-hybrid (y2h) system (figure 1) developed in 1989 [1]. in principle, two proteins are fused to separately expressed and nonfunctional domains of the gal4 transcription factor, either the gal4 dna binding domain (bait) or the gal4 activation domain (prey). upon interaction of the two proteins of interest, the transcription factor activity is reconstituted in the yeast nucleus, leading to the activation of one or several reporter genes. a major improvement was the introduction of a mating protocol in which pretransformed haploid yeast cells form diploids that carry both the bait and prey vector [2]. compared to the previously used transformation protocol, this strategy is easier to perform and allows automated screening and the cross-combination of a large number of pretransformed bait and prey pairs. another major improvement that further boosted genome-scale y2h approaches was the combination with the highly efficient gateway recombinational cloning system for the generation of larger clone collections [3]. over the last 10 years, the interactomes of several previously sequenced organisms like h. pylori [4], c. jejuni [5], m. tuberculosis [6], p. falciparum [7], t. pallidum [8], s.
cerevisiae [9, 10], c. elegans [11], d. melanogaster [12], and humans [13, 14] could be generated using the y2h system. in most cases arrays were generated to test either individual or defined pools of open reading frames (orfs) for interaction with each other (matrix screens). alternatively, a number of individual baits were used to screen cdna libraries. more recently, several viral interactomes have been generated, with the herpesvirus family being by far the largest group analyzed (refs. [15, 16, 17, 18, 19] and e fossum et al., in revision). the reliability and the biological relevance of the y2h system in general have been challenged repeatedly. despite certain limitations, the y2h system is used by the majority of groups because of its enormous efficacy; the data discussed in this review are all based on y2h screens, as are all currently published large-scale studies on intraviral or virus-host protein interactions. clearly, since it is based on the nuclear localization of a transcriptional reporter system, it is limited in the analysis of transcriptional activators and proteins localized to membrane compartments. however, although the proteins to be tested are forced into the yeast nucleus for interaction, no bias between nuclear and non-nuclear proteins was observed [20]. interestingly, while the yeast system is not expected to provide post-translational modifications comparable to those of the mammalian cell, it is nonetheless able to introduce modifications and report modification-dependent interactions to a certain extent. thus, the yeast cell offers an environment sufficiently natural for the analysis of protein interactions of other species [20]. a major concern with the y2h system, particularly in its high-throughput application, is the small overlap of identified interactions in comparative studies. this is exemplified by two s.
cerevisiae proteome-wide y2h approaches in which the screening of 6000 gene products identified 682 (uetz-screen) and 843 (ito-core) binary interactions [9, 10]. however, only 19% of the uetz-screen and 15% of the ito-core interactions were found in the respective other screen [9, 10]. a similarly small overlap was observed in several independently performed genome-wide herpesviral protein interaction screens [15, 16, 19, 21, 22] (e fossum et al., in revision). recent large-scale efforts, which reinvestigated a set of random protein interactions from the s. cerevisiae and human interactomes, demonstrated that the low coverage is the result of low sensitivity rather than low specificity [20, 23, 24]. several validation protocols performed in parallel (see below) confirmed 25-30% of the y2h interactions, suggesting that the y2h approach does not yield more false positives than other assays that detect binary interactions and is comparable in quality to literature-curated data [25]. several reasons are likely to account for the low sensitivity of y2h approaches [23]. first, only a fraction of all possible pairwise combinations is actually tested in a given screening situation. second, depending on the assay or screening protocol applied, different search spaces are screened. this is demonstrated by studies on hepatitis c virus (hcv) in which the same set of proteins was screened by a yeast mating protocol (imapi) as well as a transformation protocol (imapii) using two different human cdna libraries [18]. although performed in the same laboratory, imapi and imapii shared only 22 interactions, indicating that different screening protocols, vectors, and yeast strains as well as the quality and composition of cdna libraries have a greater impact on the screening success than generally assumed. third, high-throughput approaches are often hampered by technical limitations, which can be mitigated, for example, by multiple screenings of the same set of proteins.
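two quantities in this discussion lend themselves to a short sketch: the overlap between two interaction screens, and the coverage gained by screening the same search space repeatedly. the code below is illustrative only. the function names and the toy interaction lists are hypothetical, and the coverage formula assumes that each screening round detects every detectable interaction independently with the same probability, which is an idealization and not a claim made in the cited studies; the value 0.6 is the single-round figure reported in [23].

```python
def overlap_fraction(screen_a, screen_b) -> float:
    """Fraction of interactions in screen_a that also appear in screen_b.

    Interactions are treated as unordered pairs, so (x, y) equals (y, x).
    """
    pairs_a = {frozenset(pair) for pair in screen_a}
    pairs_b = {frozenset(pair) for pair in screen_b}
    if not pairs_a:
        return 0.0
    return len(pairs_a & pairs_b) / len(pairs_a)


def cumulative_coverage(p_single: float, rounds: int) -> float:
    """Expected coverage after `rounds` independent screens.

    Assumes each round finds any given interaction with probability
    p_single, independently of the other rounds (idealized model).
    """
    return 1.0 - (1.0 - p_single) ** rounds


if __name__ == "__main__":
    # Toy interaction lists (assumed data, not taken from either screen).
    screen_1 = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
    screen_2 = [("b", "a"), ("x", "y"), ("c", "d"), ("e", "f"), ("g", "h")]
    print(overlap_fraction(screen_1, screen_2))  # 0.5
    print(overlap_fraction(screen_2, screen_1))  # 0.4

    for n in (1, 2, 3):
        print(n, round(cumulative_coverage(0.6, n), 3))
```

the overlap fraction is asymmetric because each screen has a different size, which is why two percentages are quoted for one shared set: 19% of 682 and 15% of 843 both correspond to roughly 130 shared interactions. under the independence assumption, two rounds at 60% per round would give 84% coverage, which falls within the 80-90% range reported for multiple sampling.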
multiple sampling of the same search space allowed the identification of 80-90% of all possible y2h interactions, while a single round revealed only 60% [23]. the orfeomes of five evolutionarily related herpesviruses were used to systematically address the low coverage of y2h screens by comparing the interactions identified in individual species followed by secondary methods [15] (e fossum et al., in revision). in the initial y2h screens 283 interactions of the 41 core orthologous proteins were observed and 113 interactions were found in more than one species (e fossum et al., in revision). on the basis of 55 y2h interactions detected in kshv, 59 of 92 interactions predicted for the corresponding orthologs in hsv-1, mcmv, and ebv could be identified by coprecipitation. in conclusion, the low coverage of the y2h system, currently its major drawback, can be addressed either by technical improvements or by combination with other assays. to address the biological significance of an interaction identified by y2h, validation by one or more biochemical methods is required (figure 1).
[figure: high-throughput technologies to detect protein interactions. graphic depiction of several approaches for the detection of binary protein interaction including the yeast-two-hybrid (y2h) system, the nucleic acid programmable protein array (nappa), the lumier (luminescence-based mammalian interactome mapping), the protein fragment complementation assay (pca), and the mappit (mammalian protein-protein interaction trap). multicomponent protein interactions can be analyzed by the tap (tandem affinity purification) of prota-tagged bait proteins followed by mass spectrometry of associated proteins. bait (x) and prey (y) proteins are indicated, fusion tags are shown in orange (ad: activation domain; dbd: dna binding domain; atg: start codon).]
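the two quantitative points made above, the small overlap between independent screens and the gain from repeated sampling, can be illustrated with a short sketch. as a rough check on the yeast figures, 19% of 682 and 15% of 843 both correspond to roughly 130 shared interactions. the code below is a minimal illustration and not the method of the cited studies: the protein names are hypothetical toy data, and the saturation formula assumes that each screening round detects any given interaction independently with the same probability, which is our simplifying assumption.

```python
def normalize(pairs):
    """Store each interaction as an unordered pair, so (a, b) equals (b, a)."""
    return {frozenset(pair) for pair in pairs}


def overlap_fractions(screen_a, screen_b):
    """Return (shared count, shared / |A|, shared / |B|) for two binary screens."""
    a, b = normalize(screen_a), normalize(screen_b)
    if not a or not b:
        raise ValueError("both screens must contain at least one interaction")
    shared = len(a & b)
    return shared, shared / len(a), shared / len(b)


def cumulative_detection(single_round_rate, rounds):
    """Fraction of detectable interactions found after `rounds` repeats.

    Assumption (ours, for illustration only): every round detects a given
    interaction independently with probability `single_round_rate`.
    """
    if not 0.0 <= single_round_rate <= 1.0:
        raise ValueError("single_round_rate must lie between 0 and 1")
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    return 1.0 - (1.0 - single_round_rate) ** rounds


# Hypothetical toy screens; p1..p11 are placeholder protein names.
screen_a = [("p1", "p2"), ("p2", "p3"), ("p4", "p5"), ("p6", "p7")]
screen_b = [("p2", "p1"), ("p8", "p9"), ("p4", "p5"), ("p10", "p11"), ("p3", "p6")]
shared, frac_a, frac_b = overlap_fractions(screen_a, screen_b)

if __name__ == "__main__":
    print(f"shared interactions: {shared}")
    print(f"fraction of screen a recovered in b: {frac_a:.2f}")
    print(f"fraction of screen b recovered in a: {frac_b:.2f}")
    for n in (1, 2, 3):
        print(f"rounds={n}: cumulative detection {cumulative_detection(0.6, n):.3f}")
```

under this independence assumption, a single-round rate of 60% gives 84% after two rounds, which falls within the 80-90% range quoted for multiple sampling; real screens need not behave independently, so this is only a plausibility check.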
technologies similar to the y2h, for example an assay based on the closeness of two proteins if not their direct interaction, are likely to give high confirmation rates but to be little complementary [20]. coexpression in mammalian cells of two proteins fused to distinct tags followed by immuno-coprecipitation or affinity-coprecipitation (cop) was successfully applied to validate several y2h studies [15, 17] (e fossum et al., in revision). this method, however, is time consuming and thus not applicable to the analysis of large sets of interactions. recently, a tool-box for high-throughput validation of y2h interactions was developed (figure 1; [20]). expression of proteins on array-printed template dna using a coupled in vitro transcription-translation reaction (nucleic acid programmable protein array, nappa, [26]) follows a principle similar to the cop. however, since it is performed in vitro in a defined but artificial environment, it is technically challenging. a second method used on a rather large scale is the lumier (luminescence-based mammalian interactome mapping) pull-down assay, in which two proteins are expressed in mammalian cells. while the bait protein is expressed as a fusion to the protein a-tag or flag-tag to immobilize the complex, the prey protein is expressed as a fusion with luciferase, allowing the detection of the coisolated prey [27]. protein fragment complementation assays (pca), for example the split-yfp system in which interacting bait and prey proteins are fused to yfp domains, reconstitute an enzyme or a fluorescent protein and generally do not require an enrichment of the interactors. thus, they might be more easily performed in an automated large-scale fashion [28].
finally, the mappit (mammalian protein-protein interaction trap) uses a bait protein fused to a hybrid erythropoietin-leptin receptor located at the plasma membrane and a prey protein fused to gp130, which drives a signaling cascade resulting in the read-out of an endogenous transcriptional reporter [29]. proteomic approaches that are able to detect indirect interactions are complementary to the y2h system rather than confirmatory, and are therefore well suited to increase the coverage of a y2h interactome. in the tandem affinity purification (tap) approach individual proteins are fused to a cleavable tandem tag composed for example of protein a or g and a calmodulin binding peptide (cbp), and the isolated proteins present in the pulled-down complex are subsequently identified by mass spectrometry as done for yeast (figure 1; [30] [31] [32]). similarly, smaller subsets of human proteins have been analyzed in higher eukaryotic cells [33] [34] [35]. since the systematic tagging of chromosomal genes is currently not feasible in these cells, tagged proteins have been introduced in addition to the endogenous proteins and thus compete with them in the pull-down analysis. genetic systems are now available for many viral genomes (including large dna viruses), and systematic functional tap-tagging of viral proteins might be addressed in the near future. proteomic analyses of purified virus particles have recently been performed for a few virus species and may provide further evidence for the interaction between viral proteins, particularly in conjunction with y2h data (figure 2) [36, 37]. initial 'genome-wide' studies concentrated on intraviral protein interactions of small rna and dna viruses. however, owing to the rather small number of proteins and protein interactions identified (figure 3), a detailed network analysis could not be applied (reviewed by [38]).
more recent approaches on intraviral interactomes include several members of the herpesvirus family [15, 16, 19, ...].
[figure: scheme of hsv-1 virus particle with protein interactions detected in a genome-wide y2h screen. capsid, tegument, and glycoproteins are indicated depending on their localization in the virus particle. proteins are colored according to their conservation, purple: conserved in all herpesviruses, red: in alpha herpesviruses, grey: in herpes simplex virus, pink: in alpha and gamma herpesviruses.]
first combined virus-host interactomes were investigated in hcv [18], ebv [16], vzv and kshv (haas and collaborators). virus-host protein interactions have been included in several public databases, and novel databases specific for virus-host interactions such as virhostnet have been set up [39]. in a systematic screen of 27 full-length proteins or domains of hcv against two human cdna libraries, 314 virus-host interactions were identified [18]. taking published interactions into account, this is the first analysis on rna viruses producing a large enough dataset to constitute a virus-human interaction network, composed of 481 interactions involving 11 hcv and 421 human proteins. the most highly connected proteins were the ns3, ns5a, and core proteins with 214, 96, and 76 interactors, respectively. intriguingly, the insulin, jak/stat, and tgfb signaling pathways were particularly enriched, which might be consistent with the metabolic disorders observed during chronic hcv infection. focal adhesion complexes could represent a novel target of hcv, suggesting a role of several hcv proteins in viral spreading, cell-cell interaction, and tissue reorganization. this analysis may thus help to identify potential new targets for hcv therapy. the largest high-throughput approaches involving viruses have been performed with herpesviruses.
equipped with moderately large-sized genomes encoding a manageable but complex set of genes, these viruses represent ideal candidates for genome-wide y2h approaches followed by bioinformatical analysis. intraviral interactomes have been generated for several herpesviruses including the alpha-herpesviruses [...]
[figure: global view of the interaction of two herpesviruses, vzv and kshv, with the human proteome. two experimental y2h datasets were used to connect the two viral interactomes of vzv (red) and kshv (pink) into a predicted high-confidence human interaction network consisting of 10 636 edges between 3169 nodes. interactions between viral proteins are depicted in red or orange, interactions to cellular proteins in blue.]
[...] of five herpesviral species including 1007 intraviral interactions a core set of highly conserved protein interactions has been identified. intriguingly, the interactions between the orthologous proteins were found to be conserved independent of their sequence homology. the topology of all herpesviral networks differed from cellular networks; however, it is difficult to judge whether this really reflects biological differences or artefacts caused by the different setups used to evaluate them. in ebv [16], hcv [18], vzv and kshv (haas and collaborators), viral proteins tend to interact with highly connected cellular proteins, which could be a general hallmark of many pathogen-host interactions [40]. the datasets available to date are hampered by the low coverage of the y2h screens performed, which makes it difficult to draw general biological conclusions. to reveal significant differences in how different pathogens interact with the host proteome, protein interaction data with a considerably higher coverage of the screening space have to be generated.
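the observation that viral proteins tend to interact with highly connected cellular proteins can be checked on any interaction dataset by comparing the connectivity (degree) of viral targets with that of all host proteins. the following is a minimal sketch of that comparison on hypothetical toy data; the identifiers h1..h6 and v1, v2 are placeholders, the edge list is assumed to be free of duplicates, and a real analysis would add a statistical test, which is omitted here.

```python
from collections import defaultdict
from statistics import mean


def degrees(edges):
    """Count interaction partners per protein in an undirected edge list.

    Assumes the edge list contains no duplicate edges.
    """
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


def mean_degree(degree, proteins):
    """Mean degree over `proteins`; proteins absent from the network count as 0."""
    return mean(degree.get(p, 0) for p in proteins)


# Hypothetical host-host network and virus-host interactions (toy data).
host_edges = [("h1", "h2"), ("h1", "h3"), ("h1", "h4"), ("h2", "h3"), ("h5", "h6")]
virus_host = [("v1", "h1"), ("v2", "h1"), ("v2", "h2")]

host_degree = degrees(host_edges)
viral_targets = {host for _virus, host in virus_host}

mean_all = mean_degree(host_degree, host_degree.keys())
mean_targets = mean_degree(host_degree, viral_targets)

if __name__ == "__main__":
    print(f"mean degree, all host proteins: {mean_all:.2f}")
    print(f"mean degree, viral targets:     {mean_targets:.2f}")
```

in this toy network the viral targets have a higher mean degree than the host proteins overall, which is the pattern described in the text; whether such a difference is meaningful in real data depends on the coverage and bias of the underlying screens.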
to provide an example of the power of this approach, a comparison of the five herpesviral core networks identified the highly connected hsv-1 ul33 ortholog (vzv orf25, mcmv m51, ebv bfrf4, and kshv orf67.5), which interacted with 14 tegument proteins (figure 2, e fossum et al., in revision). interestingly, 11 out of 14 interactions were found in more than one species, a majority of which were confirmed by cop. in addition, the interaction between ul33 orthologs and the hsv-1 ul31 orthologs (vzv orf27, mcmv m53, ebv bflf2, and kshv orf69), which in turn were found and previously published to interact with the hsv-1 ul34 orthologs (vzv orf24, mcmv m50, ebv bfrf1, and kshv orf67), was highly conserved [41] [42] [43] [44] [45]. likely these proteins form a large protein complex that mediates budding of capsids at the inner nuclear membrane of the host (figure 4). the role of ul33 in dna packaging and cleavage could point to a role in connecting capsid maturation with nuclear egress [46]. this is in line with these proteins being crucial for capsid formation and nuclear egress [41] [46] [47] [48] [49]. the identification of this protein complex thus demonstrates the potential of systems virology to reveal targets for alternative herpesviral therapies. this could be achieved by peptides or other small molecules introduced to interfere with targeting or assembly of the complex components.
[figure: nuclear egress of hsv-1 capsids. herpesviral capsids are formed in the host nucleus and released to the cytoplasm by budding through the nuclear envelope. primary envelopment at the inner nuclear membrane (inm) requires the membrane anchored ul34/ul31 family of proteins. the ul33 protein family interacts with this nuclear egress complex and may connect capsid packaging and nuclear egress (er: endoplasmic reticulum; inm: inner nuclear membrane; onm: outer nuclear membrane; npc: nuclear pore complex).]
[...] these could be confirmed by cop.
the intraviral interaction network revealed network parameters similar to herpesviruses [15], and the combined published and experimental virus-host interactions suggest that sars-cov targets host functions like apoptosis, cell communication, and signaling. large-scale protein interaction screens are able to provide large amounts of novel and unbiased data, but the extraction of biological implications from these screens is difficult, particularly if they are based on a technology with a rather low coverage such as the y2h system. in the near future, however, improved 'deep' screening technologies will lead to more comprehensive interaction maps of both intraviral and virus-host protein interactions, which, in combination with other genome-scale technologies like sirna knock-down screens, transcriptional profiling, and spatial/temporal distribution studies, may allow improved models of virus-infected cells to be set up. the comparison of different viruses and, possibly, other pathogens may allow the identification of common strategies for infection and replication used by divergent pathogen groups, and the identification of targets for novel broad-spectrum antibiotics. on the other hand, it might reveal strategies that are specific for individual pathogens and help explain the characteristics of the infection with this particular pathogen. in combination with a genetic profiling of the infected host this approach will be even more powerful and might potentially lead to a step change in our understanding of viral infections. a tool-box of high-throughput methods was tested for the validation of y2h interactions including the mammalian protein-protein interaction trap (mappit), the luminescence-based mammalian interactome (lumier), the yellow fluorescent protein (yfp) protein complementation assay (pca), and the nucleic acid programmable protein array (nappa). this analysis aims at defining a confidence score for binary protein-protein interactions.
these two sets of data define high-quality binary interaction maps of yeast and human protein networks. in particular, these datasets suggest that the y2h data are more reliable than generally believed and that the y2h screens currently performed suffer from low sensitivity rather than low specificity. moreover, the distinct nature of binary versus cocomplex derived interactions is pointed out. curation of protein interactions from literature is a method applied to obtain comprehensive protein networks. however this analysis critically evaluates the current literature curation and shows that it is more error-prone and possibly of lower quality than generally assumed. a novel genetic system to detect protein-protein interactions toward a functional analysis of the yeast genome through exhaustive two-hybrid screens gateway recombinational cloning: application to the cloning of large numbers of open reading frames or orfeomes the protein-protein interaction map of helicobacter pylori a proteome-wide protein interaction map for campylobacter jejuni mycobacterium tuberculosis interactome analysis unravels potential pathways to drug resistance a protein interaction network of the malaria parasite plasmodium falciparum the binary protein interactome of treponema pallidum -the syphilis spirochete a comprehensive analysis of protein-protein interactions in saccharomyces cerevisiae a comprehensive two-hybrid analysis to explore the yeast protein interactome a map of the interactome network of the metazoan c.
elegans a protein interaction map of drosophila melanogaster towards a proteome-scale map of the human protein-protein interaction network a human protein-protein interaction network: a resource for annotating the proteome herpesviral protein networks and their interaction with the human proteome see annotation to epstein-barr virus and virus human protein interaction maps in revision) this study provides important insights into intraviral interactions of herpesviruses. moreover, it presented the first virus-host interactome of a member of the herpesvirus family self-assembling protein microarrays highthroughput mapping of a dynamic signaling network in mammalian cells capturing protein interactions in the secretory pathway of living cells design and application of a cytokine-receptor-based interaction trap functional organization of the yeast proteome by systematic analysis of protein complexes systematic identification of protein complexes in saccharomyces cerevisiae by mass spectrometry global landscape of protein complexes in the yeast saccharomyces cerevisiae a physical and functional map of the human tnf-alpha/nf-kappa b signal transduction pathway an efficient tandem affinity purification procedure for interaction proteomics in mammalian cells large-scale mapping of human protein-protein interactions by mass spectrometry viral proteomics: global evaluation of viruses and their interaction with the host comprehensive characterization of extracellular herpes simplex virus type 1 virions from orfeomes to protein interaction maps in viruses virhostnet: a knowledge base for the management and the analysis of proteome-wide virus-host interaction networks the landscape of human proteins interacting with viruses and other pathogens this paper integrates protein interaction networks of several pathogens and identifies certain cellular processes that participate in interactions with different pathogen groups. 
this global view on infectious diseases may be a first step toward the identification of multiviral or multibacterial treatments budding events in herpesvirus morphogenesis the epstein-barr virus bfrf1 and bflf2 proteins interact and coexpression alters their cellular localization characterization and intracellular localization of the epstein-barr virus protein bflf2: interactions with bfrf1 and with the nuclear lamina deletion of epstein-barr virus bflf2 leads to impaired viral dna packaging and primary egress as well as to the production of defective viral particles identification and characterization of the product encoded by orf69 of kaposi's sarcoma-associated herpesvirus temperature-sensitive mutations in the putative herpes simplex virus type 1 terminase subunits pul15 and pul33 preclude viral dna cleavage/packaging and interaction with pul28 at the nonpermissive temperature comprehensive mutational analysis of a herpesvirus gene in the viral genome context reveals a region essential for virus replication bfrf1 of epstein-barr virus is essential for efficient primary viral envelopment and egress functional domains of murine cytomegalovirus nuclear egress protein m53/p38 the work of the authors has been supported by grants provided by baygene (bayerisches staatsministerium fü r wissenschaft, forschung und kunst jh), dfg (sfb 576 jh, ba 1165/5-1 sb/jh), bmbf (ngfn+ jh), mrc (g050145, jh), and scottish funding council (ichair, jh). key: cord-018265-twp33bb6 authors: becker, pablo d.; guzmán, carlos a. title: community-acquired pneumonia: paving the way towards new vaccination concepts date: 2007 journal: community-acquired pneumonia doi: 10.1007/978-3-7643-7563-8_10 sha: doc_id: 18265 cord_uid: twp33bb6 despite the availability of antimicrobial agents and vaccines, community-acquired pneumonia remains a serious problem. 
severe forms tend to occur in very young children and among the elderly, since their immune competence is eroded by immaturity and immune senescence, respectively. the main etiologic agents differ according to patient age and geographic area. streptococcus pneumoniae, haemophilus influenzae, respiratory syncytial virus (rsv) and parainfluenza virus type 3 (piv-3) are the most important pathogens in children, whereas influenza viruses are the leading cause of fatal pneumonia in the elderly. effective vaccines are available against some of these organisms. however, there are still many agents against which vaccines are not available or the existent ones are suboptimal. to tackle this problem, empiric approaches are now being systematically replaced by rational vaccine design. this is facilitated by the growing knowledge in the fields of immunology, microbial pathogenesis and host response to infection, as well as by the availability of sophisticated strategies for antigen selection, potent immune modulators and efficient antigen delivery systems. thus, a new generation of vaccines with improved safety and efficacy profiles compared to old and new agents is emerging. in this chapter, an overview is provided about currently available and new vaccination concepts. the mucosa of the human respiratory tract represents a primary target for a large number of microbial pathogens. typically, colonization is an asymptomatic process, resulting from the interplay between bacterial factors and host clearance mechanisms. clinical illness may result from either the local release of bacterial toxins or the systemic dissemination of the pathogen after breaching the mucosal barrier. in the course of respiratory infections adaptive immune responses could be significantly impaired. this might lead to more severe forms of disease or to super-infections, which in turn complicate the clinical management of the patient. 
the most severe forms of respiratory infection tend to occur in very young children and among the elderly, in whom immune competence is eroded by immaturity or immune senescence, respectively. in addition, patients who are immunocompromised, as a result of disease or therapeutic interventions, have the greatest risk of developing a fatal infection. despite the availability of new antimicrobials and effective vaccines, community-acquired pneumonia remains a common and serious illness. in fact, it is a leading contributor to the nearly 4 million deaths occurring each year due to respiratory infections, especially in children from developing countries [1, 2]. the main causative agents of pneumonia differ according to the patient age and the geographic area. in addition, there are relatively few comprehensive studies on the specific aetiology of pneumonia [2] due to (i) overlaps in the clinical manifestations of the different syndromes, (ii) difficulties in establishing the precise aetiology, and (iii) frequent occurrence of co-infections. however, streptococcus pneumoniae, haemophilus influenzae, respiratory syncytial virus (rsv), and parainfluenza virus type 3 (piv-3) have been identified as the main agents responsible for acute respiratory infections in children, whereas influenza virus related pneumonia is the leading cause of disease-related deaths in the elderly. in addition, the availability of new and more sensitive diagnostic tests has contributed to the identification of hitherto unknown lower respiratory pathogens, such as the human metapneumovirus (hmpv) and novel coronaviruses causing the severe acute respiratory syndrome (sars). significant efforts have been invested in the last two decades to develop new diagnostic tools, to elucidate the molecular mechanisms of microbial pathogenesis and to understand host clearance mechanisms.
this resulted in an improved knowledge on host responses to infection and immuno-pathogenesis, which in turn has facilitated the establishment of new prophylactic and therapeutic interventions. however, despite our accomplishments in vaccine development, there are many pathogens for which vaccines or adequate therapies are not available or the existent ones are suboptimal. the main approach applied for vaccine development has radically changed in recent years. whole cell vaccines are systematically being replaced by subunit vaccines, in which purified antigens or their coding genes are exploited in combination with new adjuvants and/or delivery systems. as a result, many of the vaccines under development will exhibit consistently improved stability, safety and efficacy profiles. they will also be amenable to mucosal administration, thereby mimicking natural infections. influenza a viruses are most commonly responsible for severe respiratory illness in humans, followed by influenza b. the population's susceptibility to infection is renewed annually, because of the rapid antigenic variation of this virus. the antigenic variation is due to the accumulation of point mutations in the two major surface glycoproteins of the virus, haemagglutinin (ha) and neuraminidase (na). this can lead to an antigenic drift of the virus, which often leaves current influenza vaccines outdated and ineffective. antigenic shift can also occur due to the segmented nature of the viral genome that favours the emergence of re-assorted strains, in which an entire glycoprotein can be acquired from a different animal influenza virus. both types of variation represent a critical bottleneck for the establishment of a robust vaccination strategy against influenza. in fact, when an influenza virus with the capacity to spread from person-to-person and a completely new glycoprotein subtype suddenly emerges, a worldwide pandemic outbreak can result [3].
the earliest vaccines against influenza were whole cell vaccines obtained in the 1940s by inactivating viruses grown in the allantoic cavity of embryonated chicken eggs with formalin. while contemporary inactivated influenza vaccines are still produced in embryonated eggs, improvements in manufacturing have resulted in a highly purified and less-reactogenic detergent-split product. three viral strains are selected on the basis of the previous year's surveillance data on the most prevalent subtypes; therefore, vaccine composition may vary from year-to-year. vaccination has a high benefit:cost ratio, since influenza-related illness (e.g., hospitalizations and deaths) is effectively prevented [1]. the world's total vaccine production is approximately 300 million doses, with a maximum capacity of 900 million doses. however, the world health organization (who) estimates that there are about 1.2 billion people at high risk for severe influenza outcomes (e.g., elderly over 65 years of age, infants, health care workers, children and adults with underlying cardiopulmonary disease). furthermore, the global infrastructure would not be able to handle the timely manufacturing and distribution of a vaccine for a pandemic outbreak [4]. one alternative would be to lower the quantity of antigen per dose and add adjuvants to the vaccine formulation, but this needs to be tested in clinical trials [5]. another solution would be to improve current vaccine production technologies (i.e., egg-derived vaccines). however, there is a limited number of egg producers, and viral strains can emerge that cannot easily be adapted to embryonated eggs. to overcome these problems, several pharmaceutical companies have embarked on projects for the development of vaccines produced by growing the virus on cell lines. the influenza virus can be adapted to grow on a variety of mammalian cell lines, including vero, per.c-6, and madin-darby canine kidney (mdck) cells [6] [7] [8].
this strategy would also improve the possibility of up-scaling vaccine production in the face of a pandemic spread. alternatively, it would be possible to develop a vaccine against any influenza virus, such as the avian h5n1 strain, by using reverse genetics techniques [9] (see below in advances in vaccinology). cold adaptation was found to be a reliable and efficient procedure for the derivation of live attenuated viral vaccine strains for humans. cold-adapted (ca) virus strains can grow in primary chick kidney cells or embryonated eggs at 25-33°c; however, they exhibit a reduced replication at 37°c. the process of genetic re-assortment with the transfer of the six internal genes from a stable attenuated ca master donor strain of influenza a or b to the new prevailing wild-type epidemic strain has been used to generate attenuated cold-reassorted vaccines with the proper level of attenuation, genetic stability and immunogenicity, which show low or absent transmissibility [10]. medimmune and wyeth have developed along these lines a trivalent live ca vaccine (flumist) for intranasal spray delivery, which was licensed in 2003. in contrast to parenterally-administered vaccines, this formulation triggers immune responses resembling those observed after natural infections [11]. despite the moderate hemagglutination-inhibiting antibody titres observed in vaccinees, flumist showed 92% efficacy over a 2-year period in children, including protection against antigenic variants that circulated during the second year [12] [13] [14]. this ca vaccine also stimulated the production of nasal iga, as well as t-cell and interferon responses [15]. the cell-mediated immunity against virus matrix and nucleoprotein antigens may favour viral clearance and early recovery from illness [3].
the advisory committee on immunization practices has recommended its use only in persons from 5 to 49 years of age, since side-effects were observed in young children (wheezing, nasal congestion) and there are no data available for the elderly [16, 17]. despite its remarkable genetic stability, this vaccine has to be kept at -18°c. thus, a new heat stable derivative has recently been developed, which showed good efficacy in clinical trials [1]. a live vaccine based on a master virus strain developed at the institute of applied microbiology (austria) by growing wild influenza virus in vero cells at 25°c was also demonstrated to be safe, well-tolerated and immunogenic after intranasal immunization in young adults [18]. a number of subunit- or dna-based vaccines are also in various stages of development. an influenza vaccine formulated in virosomes has been commercialized by berna biotech (inflexal® v); it contains the surface spikes of the three currently circulating influenza virus strains inserted in vesicle membranes of the three corresponding virus types (for more details see section "pseudoviruses as antigen delivery systems") [19]. this company has also developed a virosome-based nasal formulation. however, it was withdrawn from the market due to the presence of side-effects (i.e., bell's palsy), which were assumed to be linked to the presence of the escherichia coli heat labile toxin (lt) as adjuvant. two companies, yeda and biondvax, are also developing a peptide-based influenza vaccine for nasal administration, which showed protective efficacy in humanized mice [1]. a subunit vaccine containing recombinant ha protein produced using a baculovirus system was successfully tested in a phase ii trial in 64 to 89-year-old volunteers. an epidermal dna-based influenza vaccine, which contained the ha gene from a/panama/2007/99 delivered by particle-mediated epidermal delivery, was also tested in humans by powderject [20].
serum haemagglutination-inhibition antibody responses were observed in volunteers receiving a single dose of 1, 2 or 4 µg of dna, with the strongest and most consistent responses in subjects vaccinated with the highest dosage. some immunization approaches aim at the development of a universal vaccine with a broad spectrum of protective activity against different influenza strains [21]. among them, the use of the highly conserved transmembrane m2 protein of the virion can be mentioned. a recombinant particulate vaccine has been engineered by genetically fusing copies of the m2 to the hepatitis b core antigen (hbc). the m2-hbc fusion protein spontaneously assembled into virus-like particles (vlp), which provided complete protection against a lethal challenge with influenza virus a in mice [22, 23]. promising results were also obtained after vaccination with an m2 peptide conjugated with a neisseria meningitidis outer membrane protein complex (ompc) in monkeys [24]. the human piv (hpiv) comprise four serotypes, with hpiv-3 being the second leading cause of bronchitis and pneumonia in infants. no vaccine has been licensed to date against piv; however, several approaches are currently under investigation. the initial attempts to provide protection by using vaccines based on formalin-inactivated viruses failed. subsequent work demonstrated that the glycoproteins haemagglutinin-neuraminidase (hn) and f, which are responsible for virus attachment and fusion, are able to stimulate the elicitation of neutralizing antibodies in animals. however, their poor immunogenicity in naïve subjects led to the currently favoured approach, which is based on the use of live attenuated piv. live attenuated piv vaccines have been developed from both human and bovine strains, which are amenable to delivery by the intranasal route. candidate vaccines should be able to replicate and induce a protective immune response in young infants, even in the presence of maternally acquired antibodies.
two main attenuated strains have been studied in detail. one is the hpiv-3 strain cp45, which was selected after 45 passages of the virus in african green monkey cells at low temperature. the other is a bovine piv (bpiv)-3 strain, which is antigenically related to the hpiv-3, and replicates poorly in humans. both cp45 and bpiv-3 have been evaluated in phase i/ii trials in sero-positive and sero-negative children and in young infants. they were found to be over-attenuated in sero-positive children, but immunogenic in sero-negative children and infants [25] . however, the magnitude of the anti-hn response was lower in children who received the bpiv-3 vaccine [25]. this prompted the engineering of chimeric bovine/human piv-3 candidates (e.g., hpiv-nb strain in which the human nucleocapsid is replaced by the bovine counterpart, or a bpiv-3 strain that expresses the f and hn proteins of hpiv-3). attenuated, chimeric viruses that contain piv-3 cp45 internal genes with the f and hn genes from either piv-1 or piv-2 have also been tested in hamsters [26] . berna biotech is also developing a virosomal formulation of the piv-3 [1] . using the successful approach of the influenza vaccine, a formalin-inactivated candidate against the respiratory syncytial virus (rsv) was tested in children in the 1960s. the consequence was the hospitalization of 80% of vaccinees and two deaths [1] . moreover, vaccinated children also suffered more severe disease on subsequent exposure to the virus, as compared to unvaccinated controls [27] . this demonstrated that the elicitation of a strong immune response is not sufficient to confer protection against disease, and can even lead to immuno-pathological reactions. thus, it is essential to stimulate the "right" type of immune response. in the particular case of rsv, host responses play an important role in the pathogenesis of the disease, thereby making the development of a preventive vaccine extremely difficult. 
in addition, naturally acquired immunity to rsv is neither complete nor long-lasting, and recurrent infections often occur [28]. however, older children and adults are usually protected, suggesting that protection against severe disease develops after several consecutive infections. passive immunization with rsv-neutralizing immune globulins was also shown to prevent rsv infection in newborns with underlying cardiopulmonary disease [29]. this demonstrates that antibodies play a major role in protection against this disease, whereas t-cell immunity targeted to internal viral proteins appears to contribute to clearance. although live attenuated vaccines seem to be preferable for immunization of naïve infants, subunit vaccines may be the option of choice for the elderly, high-risk children and pregnant women. candidate subunit vaccines based on purified f proteins (pfp-1, -2 and -3) were demonstrated to be safe and immunogenic, even during pregnancy [30]. maternal immunization using a pfp-based vaccine could be an interesting strategy to protect infants younger than 6 months of age [25]. however, no significant protection was reported in a phase iii trial performed on children 1-12 years of age with cystic fibrosis after vaccination with a subunit vaccine based on pfp-3 [31]. a formulation based on surface glycoproteins f and g together with the virion matrix protein m from rsv-a was tested in healthy adult volunteers in the presence of either alum or polyphosphazene as an adjuvant. short-lived neutralizing antibody responses to rsv-a and rsv-b were detected in 76-93% of the vaccinees, suggesting that annual boosting will be needed [32] [33] [34]. the central domain of the g protein of rsv-a is relatively conserved among viruses from groups a and b. thus, a recombinant vaccine candidate, bbg2na, was developed by fusing the g2na domain to the albumin binding region of streptococcal protein g. 
this candidate was shown to be moderately immunogenic in adult human volunteers, but its clinical development was interrupted due to the appearance of purpura in vaccinees [1]. the two main difficulties associated with the generation of live attenuated vaccines against rsv are over- or under-attenuation of the virus and limited genetic stability. temperature-sensitive (ts), cold-adapted (ca) and cold-passaged (cp) mutant viral strains have been generated. despite the attenuation shown in adults and sero-positive children, cpts mutants still caused moderate congestion in the upper respiratory tract of sero-negative infants (1-2 months old) [35]. recombinant rsv vaccines with deletions in non-essential genes (e.g., sh, ns1 or ns2), which also carry cp and ts mutations in essential genes, are currently being evaluated [1]. through recombinant dna technology, chimeric viruses were engineered which contain the genes of hpiv-3 surface glycoproteins f and hn together with those of rsv glycoproteins f and g in a bpiv-3 genetic background. one of these candidates was found to be attenuated and able to induce immune responses against both hpiv-3 and rsv in rhesus monkeys [36]. similarly, a bpiv-3 genome was engineered to express hpiv-3 f and hn proteins and either native or soluble rsv f protein [37]. the resulting strain, which induced rsv neutralizing antibodies and protective immunity against rsv challenge in african green monkeys, needs to be tested for safety and efficacy against rsv and piv-3 in infants. sars, an emerging disease, was originally described in the guangdong province of china in 2002. although the global outbreak of sars was under control in 2003, new infections were reported in persons who had contacts with animals in 2003 and 2004 [38]. the typical sars-cov-like virus is not transmitted from animals to humans. 
however, under certain conditions the virus can evolve into the early human sars-cov, which has the ability to be transmitted from animals to humans or even from human to human, thereby leading to localized outbreaks of mild disease. the early human sars-cov, under selective pressure in humans, may further evolve into the late human sars-cov, which can cause local or global outbreaks of typical sars [39]. sars-cov can be easily grown in cell cultures [38]. thus, there is an urgent need for vaccines, not only to prevent naturally occurring epidemic outbreaks, but also as a tool against the threat of biological weapons. several structural proteins are expressed by sars-cov, including nucleocapsid, envelope and spike (s) proteins [38]. the latter is a type i trans-membrane glycoprotein, which is responsible for virus binding, fusion and entry, and is the major target of neutralizing antibodies [38, 40]. the extracellular domain of the s protein consists of two subunits, s1 and s2 [40]. the s1 subunit possesses a receptor-binding domain (rbd), which is responsible for viral binding to one of its receptors [41, 42]. vector-based vaccines expressing the s protein, as well as dna vaccines encoding full-length s protein, have been assessed in preclinical studies [43, 44]. when modified vaccinia virus ankara (mva) coding for full-length s protein was administered by either the intranasal or the intramuscular route, neutralizing antibodies were elicited [45]. however, vaccination of ferrets resulted in liver damage after challenge, raising some concerns about the safety of this approach [46]. vaccines formulated using different synthetic peptides encompassing linear b cell epitopes from the s protein, which were identified using sera from convalescent patients, stimulated high antibody titres. nevertheless, none of them elicited neutralizing activity. 
on the other hand, some studies demonstrated that although antibodies against s protein of the late sars-cov (urbani strain) exhibit neutralizing activity, they can also enhance infection by an early human sars-cov isolate (gd03t0013) and the civet sars-cov-like viruses. a derivative of the s protein with a truncation at amino acid (aa) 1153 fails to cause antibody-dependent enhancement of infection, but retains the ability to induce neutralizing antibodies. these findings suggest that the elimination of the putative heptad repeat 2 (hr2, aa 1153-1194), which is implicated in viral fusion, might abrogate the stimulation of virus infection-enhancing antibodies [47, 48]. the use of the nucleoprotein of the coronavirus in a dna vaccination protocol also led to the stimulation of a protective response [49]. in contrast, protection was not achieved when a recombinant piv-3 expressing the nucleoprotein alone or together with the matrix protein was used [50]. this demonstrates that the selection of the delivery system and immunization strategy plays a critical role in vaccine efficacy. the human adenoviruses are divided into six subgroups (a-f). the adenovirus can cause large-scale epidemics of acute respiratory disease, and dissemination is especially favoured under conditions in which persons are housed communally. the subgroup a viruses, such as ad31, have been associated with pneumonia in immunocompromised patients. neutralizing antibodies directed against the capsid (hexon and fiber proteins) seem to be the main effector mechanism to prevent re-infections by adenovirus [3]. until 1998, military recruits in the usa were administered enteric-coated capsules containing live viruses from the serotypes 4 and 7. the virus, which was not attenuated if delivered by the respiratory route, was able to replicate in the gastrointestinal tract without causing disease, thereby stimulating protective responses in the respiratory tract [51]. 
when the vaccine went out of production, outbreaks of respiratory diseases caused by adenovirus reemerged among the military recruits [3]. since serotypes 1, 2, 3 and 5 cause 80% of adenovirus-associated respiratory disease in young children, the development of a tetravalent vaccine similar to the one mentioned above might solve the problem in children [52]. however, the implementation of a vaccine (live or attenuated) against adenovirus should be carefully evaluated, since recombinant adenoviruses are proposed both as vaccine vectors and as tools for the transfer of foreign genes in gene therapy protocols.

polysaccharide-based vaccines against s. pneumoniae

in 1945, macleod et al. [53] reported the protective efficacy of a capsular polysaccharide (ps) vaccine in military personnel during an outbreak of pneumococcal pneumonia. the immunization with purified ps showed a drastically reduced reactogenicity, in comparison with the previously used inactivated whole cell vaccines. this was a major breakthrough, not only in terms of safety, but also because it demonstrated that a specific virulence factor can be purified and effectively implemented for the prevention of an infectious disease, thereby paving the road for modern non-toxoid-based subunit vaccines. although the serological correlates of immunity are poorly defined, type-specific anti-capsular antibodies are responsible for protective immunity. however, immunity is serotype specific, which makes the development of a universal vaccine extremely difficult. this is in part due to the high number of serotypes, the regional variations in dominant serotypes and the lack of updated sero-prevalence data for certain regions. these problems have been partially solved by the use of ps-based polyvalent vaccines. the currently licensed formulations contain 23 serotypes of s. pneumoniae, which cover approximately 90% of serious pneumococcal disease, but only in western industrialized countries. 
relatively good antibody responses (60-70%) are elicited in healthy adults 2-3 weeks after a single intramuscular or subcutaneous immunization [54]. unfortunately, these vaccines are poorly immunogenic in children aged less than 2 years, in immunocompromised individuals (e.g., aids patients) and in elderly people with concomitant disease, and they do not induce good immunological memory. randomized controlled trials in healthy elderly and young men also failed to show a beneficial effect against pneumonia [55]. however, vaccination is recommended for healthy people over 65 years of age to confer protection against invasive disease [54]. ps-based vaccines can also be used in pregnant women to stimulate the production of antibodies, which are transferred to the foetus via the placenta or to the newborns by breast-feeding. however, it is still a matter of controversy whether maternal vaccination can indeed protect newborns against pneumococcal infections [56]. the second generation of ps-based conjugate vaccines stimulates stronger antibody responses, even in infants, young children and immunodeficient individuals, as well as immunological memory. these vaccines also suppress nasopharyngeal carriage of the pathogen and reduce bacterial transmission in the community, leading to herd immunity, which adds considerable value to their implementation. the introduction of these vaccines in the usa in 2000 resulted in a dramatic decline in the rates of invasive pneumococcal disease [1, 57, 58]. a significant reduction in the incidence rates among non-vaccinated individuals was also observed as a result of herd immunity [59, 60]. however, the licensed seven-valent vaccine does not contain some of the serotypes that cause severe disease in developing countries (i.e., serotypes 1 and 5). new conjugate vaccines including more serotypes, such as the nine-valent vaccine (wyeth) and two 11-valent vaccines (glaxosmithkline and sanofi-pasteur), should provide better serotype coverage. 
new approaches to develop protein-based subunit vaccines against s. pneumoniae are currently being pursued by different research groups. this is expected to enable the generation of a universal vaccine conferring protective immunity against a large number of serotypes, as well as to avoid the complexity of manufacturing a conjugate vaccine [61]. there are different pneumococcal candidate antigens, such as pneumolysin, neuraminidase, autolysin, pneumococcal surface protein a (pspa) and adhesin a (psaa), which are in an early phase of clinical development [1]. in addition, several promising candidates have been identified, which are currently being tested in pre-clinical experimental models [1]. among them are the two iron uptake abc transporters of s. pneumoniae (piaa and piua), which trigger protective immunity against invasive pneumococcal disease in mice. through the screening of s. pneumoniae genomic expression libraries with sera from convalescent patients, bacterial surface proteins were identified (e.g., bvh-3 and bvh-11) that promote the elicitation of protective anti-pneumococcal antibodies in mice [1]. a recombinant hybrid protein, bvh3/11v, has successfully been tested in toddlers and elderly volunteers. this candidate vaccine should be able to trigger serotype-independent responses, since the bvh3 and bvh11 antigens are common to all serotypes of s. pneumoniae. the major obstacle to developing an effective vaccine against h. influenzae capsular ps was related to the inherently poor immunogenicity of this t-cell-independent antigen. antibody responses against ps are age-related, with extremely poor immunogenicity in infants during the first 18 months of life. unfortunately, this age group exhibits the highest risk for invasive infections caused by h. influenzae. a ps-based vaccine against h. influenzae type b (hib) was licensed in the united states in 1985, for children more than 18 months old [62, 63]. 
post-licensure efficacy studies showed that this vaccine was ineffective not only in infants, but also in older children [64]. this problem was solved by the generation of a conjugate hib vaccine. to this end, the hib ps (i.e., polyribosylribitol phosphate; prp) was covalently linked to an immunogenic carrier protein, thereby leading to t-cell-dependent responses against the ps. different conjugate hib vaccines currently exist. these vaccines are hboc, prp-t and prp-omp, which make use of the mutant diphtheria toxin crm197, the tetanus toxoid and the outer membrane protein from group b n. meningitidis as carriers, respectively. all of them trigger similar immune responses at the recommended doses. however, the dynamics of the elicited response may vary for each of them [65, 66]. efficacy studies of these vaccines showed that they confer protection not only against meningitis, but also against pneumonia [67] [68] [69]. although hib vaccines are highly effective, their cost is still prohibitive for the world's poorest nations. however, with the establishment of the global alliance for vaccines and immunization (gavi), steady progress has been made in making them available to developing countries as well. gavi has approved the establishment of a hib initiative to support countries wishing to sustain hib vaccination, as well as those exploring whether its introduction could be considered a priority in the near future. although the introduction of conjugated ps vaccines has significantly decreased the prevalence of invasive hib disease, paediatric infections due to non-typeable h. influenzae (nthi) are still highly prevalent. nthi is most often associated with otitis media, sinusitis and bronchitis. in addition, nthi is an important cause of lower respiratory infection in adults with chronic obstructive pulmonary disease (copd). thus, the development of a vaccine against nthi is considered an important goal in public health. 
in contrast to hib, vaccines against the non-encapsulated nthi strains must be directed against alternative virulence factors. the lipoproteins d and p6 are widely distributed and antigenically conserved among h. influenzae strains, and also trigger the elicitation of protective immunity in animals vaccinated by the mucosal route [70] [71] [72] [73]. thus, their incorporation in vaccine candidates might facilitate the generation of a universal vaccine against all typeable and non-typeable h. influenzae. even in the age of vaccine availability, b. pertussis continues to be a major cause of childhood morbidity and mortality (i.e., approximately 50 million cases and 300,000 deaths occur annually worldwide). since the late 1940s, the incidence of whooping cough has dramatically decreased in most developed countries, as a result of widespread immunization. the first vaccine formulations, which are still in use, consist of preparations based on killed b. pertussis. the frequent incidence of minor adverse effects (e.g., fever, protracted crying and local erythematous reactions), as well as concerns raised by reports of serious neurological side-effects, resulted in a decline in vaccine acceptance and use [74]. this in turn led to a re-emergence of whooping cough and its complications. this serious problem prompted the development of a new generation of acellular vaccines. in 1981 japan was the first country to successfully introduce acellular vaccines against whooping cough in its immunisation programme [75], leading to a consistent reduction in the reported side-effects. in the mid-1980s a major phase iii trial of acellular vaccines was undertaken in sweden, at a time when the banning of the whole cell vaccine had resulted in a pertussis epidemic in that country [76]. the first vaccines tested contained chemically detoxified pertussis toxin (pt) and filamentous haemagglutinin (fha), or detoxified pt alone. 
the results of these trials showed that whilst producing good antibody responses, the vaccines failed to give an adequate level of protection in infants. the mono-component vaccine conferred no protection against infection, whereas the use of the two component candidate only gave incomplete protection against infection [77] . the results obtained in japan and sweden stimulated vaccine companies in the usa and europe to establish vigorous research programmes aimed at the development of a new generation of acellular vaccines with higher efficacy. currently available vaccines have incorporated chemically or genetically inactivated pt and additional virulence factors, such as fha, the outer membrane protein pertactin (prn) and fimbrial proteins (fims). the efficacy studies of this second generation of acellular vaccines have demonstrated that they confer levels of protection equivalent to the whole cell vaccines. the advent of improved techniques for antigenic characterisation and the introduction of acellular vaccines containing genetically defined components also resulted in a reduction of lot-to-lot variation in comparison with conventional whole cell vaccines and the acellular formulation originally introduced in japan. however, despite the wide implementation of vaccination campaigns in infants and children, the disease continues to be endemic. in addition, in countries with high vaccine coverage we are now observing a consistent increment in the cases of pertussis in adolescents and adults [78] [79] [80] . these patients can then transmit the disease to infants, thereby now representing a primary reservoir for bacterial transmission and cycling in the community. 
the above-mentioned observations can be explained by one or more of the following factors: (i) improved detection techniques, (ii) greater awareness of the possibility that the bacteria may affect these age groups, (iii) vaccine-driven antigenic changes in circulating isolates, and (iv) reduction in vaccine efficacy over time. in this context, concerns have been raised about genetic variation between the strains used for vaccine preparation and circulating isolates. this seems to be true, since the currently used whole cell and acellular vaccines are prepared with strains that were isolated before mass vaccine introduction and show clear mismatches with respect to circulating strains. there is a steady tendency towards decreased diversity in recent isolates, together with clonal expansion during epidemic outbreaks [81, 82]. over time, at least two surface proteins (pt and prn) may have changed sufficiently to allow for an increase in the incidence of disease. unfortunately, our global information on antigenic variation and disease in adults and adolescents is extremely limited. thus, despite widespread introduction of pertussis vaccines, it is essential to continue surveillance studies and collection of circulating strains. the present view is that successful control of pertussis in the community may require routine immunization of adolescents and adults with the new acellular vaccines, perhaps in combination with the diphtheria and tetanus toxoids (dtap). this intervention might help in turn to reduce the burden of disease and transmission to infants. chlamydia pneumoniae is an intracellular bacterium transmitted person-to-person via respiratory droplets. this pathogen is a common cause of pneumonia, with infections usually being oligosymptomatic or asymptomatic in young age groups. however, the rate of asymptomatic carriage in the normal population is unknown. there is also a tremendous gap in our understanding of host response to infections caused by c. pneumoniae. 
most of the studies have focused on the development of efficient diagnostic methods. however, less work has been done on vaccine development, and there is a paucity of knowledge on the microbial components which may serve as target antigens. in fact, at present there are no licensed vaccines against c. pneumoniae. however, the potential of different antigens, such as the major omp2 [83], has been assessed in experimental animal models. nevertheless, mice vaccinated with omp2 using a protocol based on priming with dna and boosting with recombinant vlp showed only partial protection [84]. recent studies also suggested that ctl responses play a role in protection and clearance [85]. animals immunized with a mini-gene encoding seven h-2(b)-restricted ctl epitopes fused to an endoplasmic reticulum-translocation signal showed protection following intranasal challenge with a virulent c. pneumoniae [85]. the current view is that a multi-component vaccine will be required in order to induce a protective response [86]. using the promising approach of reverse vaccinology combined with proteomics (see section "reverse vaccinology"), the whole genome of c. pneumoniae was screened for vaccine candidate antigens among exposed and immune-accessible surface proteins [87]. the selected candidates were then expressed in a heterologous system and used in immunization studies. approximately 53 proteins were able to trigger the elicitation of c. pneumoniae-binding antibodies. when tested in secondary screenings, six of them were also able to neutralize bacteria in vitro, and four inhibited systemic dissemination of c. pneumoniae in a hamster model [86]. moraxella catarrhalis is the third most common bacterial etiologic agent of otitis media in children. furthermore, m. catarrhalis is an important cause of respiratory infections in patients with copd. thus, different studies have been carried out to characterize potential protective antigens. 
in this context, two major omp (cd and e) have been identified, which are considered prime candidate antigens for vaccine development. these proteins are expressed on the surface and show a high degree of conservation among circulating strains. both omp triggered the elicitation of bactericidal antibodies and protective immunity in preclinical models [88]. additional candidates are the uspa1 protein [89], which seems to be required for bacterial colonization of the human upper respiratory tract, the iron-induced omp b1 and lbp, and the iron-repressed omp b2 [90]. a conjugate vaccine based on detoxified lipo-oligosaccharide was also tested in mice by the intranasal route with encouraging results [91, 92]. some of these candidates are planned to be tested in clinical studies soon [90]. mycoplasmas are commensal microorganisms, as well as opportunistic pathogens. mycoplasma pneumoniae is one of the causative agents of acute and chronic human respiratory diseases and the main agent responsible for primary atypical pneumonia, accounting for approximately 20-30% of all community-acquired pneumonia [93]. there is considerable underreporting of m. pneumoniae-associated diseases. this is in part due to the wide diversity of clinical manifestations, the difficulties associated with its cultivation from clinical specimens and the lack of adequate diagnostic tools. no vaccines are currently available against this pathogen. however, studies conducted in human volunteers in the late 1960s demonstrated that a formalin-inactivated whole cell vaccine and an acellular extract were able to confer moderately protective immunity against m. pneumoniae [94]. unfortunately, immune pathological reactions were observed following challenge with live organisms. therefore, studies are still needed to understand the mechanisms underlying the observed autoimmune responses [95]. 
more specifically, we need to elucidate the specific role played by humoral and cellular responses in protection against m. pneumoniae. m. pneumoniae is one of the smallest self-replicating prokaryotic pathogens (approximately 800 kb). the complete genome sequence is now available. this is expected to expand our knowledge on the physiological and virulence properties of this agent, as well as to provide new hints for vaccine development. a previously unrecognized bacterium was isolated after the outbreak of legionnaires' disease in 1976, which was designated legionella pneumophila [96, 97]. the spreading of l. pneumophila is increasing due to the use of air-conditioners and humidifiers, since infections can occur by inhalation of aerosolized contaminated water sources. several approaches have been developed in the fight against this facultative intracellular pathogen. infection and immunization induce a rapid increase of antibody titres. however, antibodies do not seem to play a significant role in host resistance, particularly after aerosol challenge [98] [99] [100]. some authors also suggested that these antibodies can promote bacterial phagocytosis, thereby favouring invasion and subsequent intracellular replication [101]. in contrast, cellular responses appear to be important for protection. different vaccine candidates were tested in the past. heat-, acetone- and formalin-killed l. pneumophila vaccines were not able to confer protective immunity in guinea pigs, whereas animals immunized with l. pneumophila membranes survived an aerosol challenge with virulent bacteria [98, 99]. additional work demonstrated that purified antigens, such as the major secretory protein [98], the major cytoplasmic membrane protein [102], the peptidoglycan-associated lipoprotein [103], omps [104] and flagella [100], can also confer protection against challenge with virulent l. pneumophila. finally, different live attenuated mutants of l. 
pneumophila were used in animal infection models with promising results [105]. cystic fibrosis (cf) patients are particularly susceptible to severe bacterial infections of the lung, with pseudomonas aeruginosa being one of the most prominent etiologic agents. thus, significant efforts have been invested to develop a vaccine against this pathogen. surface ps are among the antigens that were most intensively assessed. berna biotech has developed an octavalent vaccine against the eight most prevalent serotypes based on o-ps conjugated with the exotoxin a [106] [107] [108] [109] [110] [111] [112] [113]. a consistent reduction in the number of cf patients with chronic p. aeruginosa lung infection was observed in a cohort receiving the basic immunization protocol, followed by yearly boosters over a period of 10 years [112, 113]. the conjugate vaccine induced the production of specific igg antibodies and increased the number of igg memory b cells. it is still unclear whether cellular responses contribute to the overall protection conferred by this vaccine. however, strong proliferative responses of lymphocytes with a th1 phenotype were observed in vaccinated individuals in response to the carrier exotoxin a protein [113]. alternative vaccination strategies are currently being tested in clinical trials. among them are formulations based on a fusion protein between the outer membrane proteins f and i, which have been administered by parenteral and mucosal routes [114, 115]. these formulations were demonstrated to be safe in volunteers and conferred increased protection against p. aeruginosa in cf patients. cell-surface alginate, flagella, components of the type iii secretion system, inactivated toxins and proteases are other proposed target antigens [116]. some of them are already in clinical trials alone or in combination [116]. 
when pasteur returned from his summer holidays in 1881 to continue with his studies on chicken cholera, he inoculated chickens with an old culture of pasteurella multocida, which had been left on his bench during the whole summer. the animals that received the preparation were protected against a challenge performed with a fresh isolate. thus, pasteur developed the hypothesis that pathogens could be attenuated by exposure to environmental insults (e.g., high temperature, oxygen and chemicals) [117]. the strategy was then successfully extrapolated to the development of anthrax vaccines for livestock in the 1880s, with significant economic benefits. this was followed by the generation of attenuated vaccines against rabies and other important pathogens towards the end of the nineteenth century. pasteur's approach for "attenuating" or "inactivating" pathogenic organisms still constitutes a cornerstone in vaccine technology [117]. this exemplifies that until recently the major achievements in vaccinology have been facilitated by technological (e.g., adjuvants, delivery systems, reverse vaccinology, genetic engineering) rather than immunological advances [117] [118] [119]. however, it is expected that the impressive knowledge accumulated in recent years in the fields of immunology, immune pathology and microbial pathogenesis will pave the road to a new golden era in vaccinology, in which knowledge and technology will enable rational vaccine design. in the 20th century, pertussis vaccines progressed from crude bacterial preparations to the highly purified antigens used for acellular vaccines. a similar quantum jump in technology allowed the development of subunit vaccines against influenza, hib and s. pneumoniae, as well as the production of antigens by recombinant dna techniques (e.g., genetically inactivated pt). 
despite the fact that these techniques enable the production of almost any foreseeable antigen, the identification of suitable targets has remained a major bottleneck for vaccine development [120]. the advent of genomics and its exploitation in vaccinology have made possible a systematic and holistic approach to the screening, identification and prioritisation of candidate antigens. this new approach, called "reverse vaccinology" [121], does not require cultivation of the original pathogen and is therefore applicable to highly pathogenic or non-culturable micro-organisms. the most promising candidates are predicted and selected by in silico analysis of genomic sequences, and are then cloned and expressed in heterologous systems. the resulting proteins are used in immunological and/or functional studies to select the most promising candidates (e.g., those able to induce the production of microbicidal or neutralizing antibodies, or to confer protective immunity). flanking studies are usually carried out, such as molecular epidemiological analysis to assess the degree of conservation of the candidates among circulating strains, or transcriptional profiling to evaluate their expression during natural infections [122]. reverse vaccinology thereby overcomes a major disadvantage of the conventional approach, namely the time-consuming process in which highly expressed components of an in vitro cultivable organism are identified and separated one at a time. the conventional method usually requires 15-20 years to reach clinical trials, whereas reverse vaccinology reduces the process to approximately 5 years. reverse vaccinology also allows the identification of hundreds of potential candidates in a few days, in comparison with the small number of antigens that conventional approaches have provided after decades of research.
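the in silico selection and prioritisation step described above can be illustrated with a short python sketch. it is only a minimal illustration under stated assumptions, not a published pipeline: the `Candidate` fields, the `prioritise` function, the surface-exposure criterion and the 0.9 conservation threshold are all hypothetical, and the example values are invented for demonstration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """One predicted open reading frame (all fields are hypothetical)."""
    name: str
    surface_exposed: bool    # assumed output of an in silico localisation predictor
    conservation: float      # fraction of circulating strains carrying the gene (0 to 1)
    expressed_in_vivo: bool  # assumed result of transcriptional profiling


def prioritise(candidates: List[Candidate],
               min_conservation: float = 0.9) -> List[Candidate]:
    """Keep candidates that pass all filters, most conserved first."""
    kept = [
        c for c in candidates
        if c.surface_exposed
        and c.expressed_in_vivo
        and c.conservation >= min_conservation
    ]
    return sorted(kept, key=lambda c: c.conservation, reverse=True)


if __name__ == "__main__":
    # Invented example data, for illustration only.
    orfs = [
        Candidate("orf_001", True, 0.98, True),
        Candidate("orf_002", False, 0.99, True),  # not predicted to be surface exposed
        Candidate("orf_003", True, 0.60, True),   # poorly conserved among strains
        Candidate("orf_004", True, 0.93, True),
    ]
    for c in prioritise(orfs):
        print(c.name, c.conservation)
```

in practice, the inputs to such a filter would come from genome annotation and from the flanking studies (conservation among circulating strains, expression during infection), and the surviving candidates would still have to be cloned, expressed and tested experimentally.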
moreover, reverse vaccinology offers the possibility to select potential candidates independently of their expression levels or ease of purification. the approach has proved its usefulness for both viral and bacterial pathogens (e.g., hepatitis c virus, group b meningococci, group b streptococci) [123, 124]. reverse vaccinology has also become an essential tool for several vaccine development projects against agents causing community-acquired pneumonia (e.g., c. pneumoniae, streptococci). the potential and speed of genomic-based approaches were also shown when the nucleotide sequence of the coronavirus causing sars was made available in less than one month. in addition, the increasing number of available genomes from bacteria and viruses allows comparative genomic studies, thereby providing hints on conserved protein families and/or functional domains. this would facilitate the generation of vaccines using immunogens covering multiple micro-organisms [125]. despite its considerable potential, reverse vaccinology also has some important limitations (tab. 1). among them is the fact that it cannot identify non-protein antigens (e.g., ps, glycolipids), which are the cornerstone of many successful vaccines (e.g., pneumococcal and hib vaccines).

currently available influenza vaccines (see above) are based on inactivated viruses and, more recently, attenuated ca viruses and virosomes. all these vaccines exploit the same starting material (wild-type virus), which is inactivated or attenuated. the latter approach consists of the co-infection of chicken eggs with the new isolate and a master attenuated strain, followed by selection for re-assorted viruses with the desired genotype/phenotype. however, the virulence of certain virus strains, such as h5n1, makes this traditional strategy difficult to implement.
the use of reverse genetics represents a valid alternative for the generation of vaccines against rna respiratory viruses, such as influenza virus, piv and rsv. it consists of the production of the virus from cloned dna [126], thereby allowing the development of vaccines against any pandemic viral strain. in some cases (e.g., avian h5n1) an additional mutagenesis step would be required to attenuate virulence [127]. then, the new ha and na segments would be transferred into an appropriate influenza a virus master strain adapted to grow in a cell line. the final re-assorted virus will have the antigenic specificity of the pandemic strain and the growth characteristics of the master strain [128, 129]. this technology would also allow production of the influenza vaccine in cells that are co-transfected with plasmids encoding different fragments [130]. the complete genome is therefore present inside the cell, and the virus can be produced and assembled. one of the main advantages is that a plasmid encoding ha and na can be easily replaced, so that re-assortment and selection become unnecessary. this method would considerably reduce the time for vaccine production, from many months to only a few weeks. another advantage would be the simple manipulation of the genome (contained in plasmids), which would enable detoxification of specific virulence factors. similar approaches can be implemented for other viruses, such as rsv, piv and sars-cov. however, intellectual property and liability issues are still obstacles to the industrial development of reverse-genetics-based vaccines [131]. furthermore, since the resulting viruses are considered genetically modified organisms, additional problems may arise from the regulatory standpoint [131].

most infective agents are either limited to the mucosal membranes, or need to transit across them in order to cause disease.
therefore, it is highly desirable to elicit an efficient immune response at the local site where the first line of defence is laid. the stimulation of a pathogen-specific response at the portal of entry is expected to impair infection (i.e., colonization), thereby reducing the risk of transmission to susceptible hosts. parenterally administered vaccines mainly stimulate systemic responses, whereas vaccines given by the mucosal route mimic natural infections, thereby leading to efficient mucosal and systemic responses. thus, there is considerable interest in the development of mucosal vaccines. however, antigens administered by this route are usually poorly immunogenic. different strategies are being pursued to overcome this bottleneck, including the use of (i) advanced synthetic delivery systems, (ii) live attenuated bacterial or viral vectors, (iii) bacterial ghosts, (iv) pseudoviruses and (v) mucosal adjuvants [132] [133] [134] [135].

advanced synthetic mucosal delivery systems

particulate antigens are more immunogenic than antigens in solution, since the latter are vulnerable to degradation by enzymes and extreme ph. thus, it is helpful to incorporate antigens into a protective vehicle. often, these vehicles not only protect the antigens, but also enhance their uptake, promote targeting to antigen-presenting cells and serve as adjuvants [136]. the most commonly exploited delivery systems are: (i) gelatine capsules, which dissolve at the alkaline ph of the intestine but not in the stomach, (ii) muco-adhesive polymers that are highly viscous inert ps, (iii) eldexomer and carboxymethyl cellulose, which have been used for oral, nasal and vaginal delivery, (iv) lipid-based structures with entrapped antigens, such as immune stimulating complexes (iscoms) and liposomes, and (v) biodegradable micro/nano-spheres based on biocompatible materials such as starch or copolymers of lactic or glycolic acid [137, 138].
some of these approaches are currently being explored to develop vaccines against agents causing community-acquired pneumonia. encouraging results have been obtained in preclinical models using, among others, surface antigens from s. pneumoniae encapsulated in micro-spheres [139] and an iscom-adjuvanted vaccine against the influenza virus obtained by reverse genetics [140].

attenuated viruses and bacteria can be used not only as vaccine candidates per se, but also as delivery systems for heterologous antigens. thus, many attenuated microorganisms have been exploited as a scaffold for the development of subunit vaccines against other agents, under the premise that the expression of the recombinant antigen(s) does not increase their pathogenic potential for humans or animals. the most frequently exploited bacterial vectors are attenuated derivatives of salmonella enterica and shigella spp., and the bacille calmette-guérin (bcg). for example, vaccination with an attenuated salmonella expressing oprf-opri was shown to confer protection against p. aeruginosa in a murine experimental infection model [141]. in addition, it was demonstrated that a recombinant bcg-based vaccine expressing pspa confers protection against s. pneumoniae in an animal infection model [142]. the use of commensals (e.g., lactobacilli) represents an alternative to attenuated organisms. in this context, it was demonstrated that oral administration of lactobacillus expressing proteins from coronavirus can protect against a gastric infection [143]. thus, this approach has also been proposed to combat sars. promising results were also obtained using x chlamydia psittaci [144]. on the other hand, different attenuated viruses, such as mva, bovine or attenuated hpiv-3 and adenovirus, can be used as delivery systems for heterologous antigens [25, 145]. in fact, mva has recently been exploited for antigens of the sars-associated coronavirus [146].
an alternative approach to the use of live attenuated carriers is the use of bacterial ghosts. ghosts are generated by the conditional expression of the lethal lysis gene e from bacteriophage phix174 in gram-negative bacteria [147] [148] [149] [150] [151]. this leads to the formation of a trans-membrane tunnel through the bacterial cell envelope [147]. due to the high internal osmotic pressure, the cytoplasmic content is expelled through the tunnel, leaving an empty bacterial cell envelope [152]. the presence of envelope components in the ghosts provides a strong danger signal through the activation of pattern recognition receptors [153]. in addition, bacterial ghosts are efficiently taken up by antigen-presenting cells, stimulating their maturation and activation [154]. bacterial ghosts retain all morphological, structural and antigenic features of the cell wall and can be used as vaccine candidates per se. ghosts can also be externally loaded with purified antigens. alternatively, ghosts can be generated from recombinant bacteria expressing heterologous antigens, hence avoiding the difficulties associated with the purification steps. this technology also offers the possibility to manipulate the topology of the recombinant antigen (e.g., the antigen can be bound to the inner membrane, secreted into the periplasmic space or associated with the surface). encouraging results have been obtained in preclinical models using ghosts expressing chlamydial antigens [135, 155].

promising results have been reported using different types of pseudoviruses, such as virosomes and virus-like particles (vlp), which are non-replicating viral-like structures. virosomes are based on the principle of reconstituting empty viral envelopes through integration of viral envelope proteins in liposomes. they offer the versatility of liposomes in terms of lipid composition, with the advantage of including viral membrane proteins.
virosomes are produced by disassembling the viral membrane envelope with detergents. then, the viral nucleocapsid is removed by ultracentrifugation before reconstitution (fig. 1). in contrast, vlp exploit the capacity of recombinant viral coat proteins to spontaneously self-assemble, thereby mimicking the viral capsid at the structural level. vlp can be isolated after protein expression in eukaryotic cells or by in vitro assembly from subunits produced in a heterologous system [156]. the main advantages of pseudoviruses are that they lack the viral genetic material while retaining an "intact" envelope, and that they are significantly more immunogenic than soluble proteins. they can be used as vaccines per se, as delivery systems for protein- or nucleic acid-based vaccines, or as carriers for small molecules. foreign antigens can be expressed on their surface, or can simply be encapsulated. in addition, amphiphilic adjuvants can be incorporated into their membranes, thereby combining an adjuvant and the antigen in one entity without a covalent attachment. pseudoviruses are especially attractive for mucosal vaccination protocols, since they offer the opportunity to use the natural route of transmission of the agents. induction of serum antibodies, secretory iga, t helper and ctl responses, and protection against mucosal pathogen challenge have been reported from studies in animals and humans [157] [158] [159]. virosomes generated from the influenza virus retain membrane fusion properties very similar to those of the native virus. therefore, they are able to deliver material to the cytosol of target cells, offering the possibility to access the mhc class i-restricted pathway of antigen presentation to prime ctl activity [160] [161] [162].

bacterial toxins and their derivatives are among the first molecules that have been used as mucosal adjuvants.
they are characterized by the presence of an a moiety with enzymatic activity, and a b moiety that mediates toxin binding to the target cells. cholera toxin and the closely related escherichia coli heat-labile toxin showed potent adjuvant activity when co-administered with different antigens by the mucosal route [163] [164] [165]. however, their use in humans is hampered by their intrinsic toxicity. thus, mutated derivatives were developed, in which the a subunit was modified to remove the adp-ribosylating activity. the resulting polypeptides retain their adjuvanticity in the absence of detectable toxicity [166] [167] [168]. however, additional studies have demonstrated that even these derivatives can lead to potentially severe side-effects, such as retrograde homing of adjuvant and antigen to neural tissues [169]. this might explain, at least in part, the side-effects (i.e., bell's palsy) observed after intranasal vaccination against influenza with a virosome-based formulation containing heat-labile toxin, which in turn led to its withdrawal from the market. however, chimeric derivatives lacking the targeting moiety for neural tissues (i.e., b subunit) are now available [170]. they might allow the exploitation of the high potential of these molecules for the development of vaccines against respiratory pathogens. in fact, preclinical studies provided the proof-of-concept for the usefulness of derivatives of bacterial toxins in the generation of acellular vaccines against microorganisms such as s. pneumoniae and h. influenzae [171, 172]. other bacterial components have also been explored for their activity as adjuvants. monophosphoryl lipid a retains much of the immune stimulatory properties of lps, without the inherent toxicity [165]. on the other hand, extracellular matrix binding proteins, such as the fibronectin binding protein i of streptococcus pyogenes, also exhibit adjuvant activity [173].
this offers the possibility of using them as dual antigen/adjuvant moieties in the same formulation. recent reports also demonstrate that vaccine formulations containing adamantylamide dipeptide, a non-toxic compound obtained by linking the l-alanine-d-isoglutamine residue of the muramyl dipeptide to the antiviral drug amantadine, confer protection against non-typeable h. influenzae in preclinical models [73].

the innate immune system plays a critical early role in host defence against pathogenic microorganisms through the recognition of pathogen-associated molecular patterns [174]. this is achieved through the stimulation of pattern-recognition receptors (prr) that sense a broad range of exogenous and endogenous danger signals [153, 174]. toll-like receptors (tlr) represent the best-characterized family of prr. natural and synthetic tlr agonists are being used as immune modulators to optimize responses after vaccination. since the identification of tlr4, many mammalian tlr homologues have been identified (i.e., 10 in humans and 13 in mice) [175]. each tlr member binds specifically to different ligands (tab. 2), alone or in combination (e.g., heterodimers formed by tlr2 with either tlr1 or tlr6). examples of tlr agonists are bacterial (but not vertebrate) dna and synthetic oligodeoxynucleotides containing unmethylated cpg motifs. they act on tlr9, thereby inducing strong th1 responses through the activation of dendritic cells [176, 177]. cpg motifs have been successfully used as adjuvants in preclinical studies of different candidate vaccines against agents causing community-acquired pneumonia [178] [179] [180]. another important adjuvant with tlr-binding capacity is the mycoplasma-derived macrophage-activating lipopeptide malp-2, which acts at the level of the tlr2/6 heterodimer [181, 182]. malp-2 promotes a global activation of cells from the innate and adaptive immune system [183, 184], such as macrophages, dc, and t- and b-lymphocytes [183, 185].
when co-administered with an antigen by either the parenteral or the mucosal route, malp-2 promotes the elicitation of humoral and cellular responses at systemic and mucosal level [186]. preclinical studies suggested that malp-2 could be exploited in vaccine formulations against the sars-associated coronavirus, m. catarrhalis and influenza virus, among others (unpublished data).

dna vaccination offers some advantages over conventional antigen-based vaccination, such as the fact that it is not necessary to produce the antigen beforehand. instead, the biosynthetic machinery present in the cells of the vaccinees takes care of this work. furthermore, since eukaryotic cells are in charge of protein synthesis, glycosylation and folding of the antigens are optimal. however, the large-scale purification of dna might be associated with high costs. this can be solved by the use of attenuated or inactivated bacteria or viruses as delivery systems [187]. this approach can also lead to an enhanced induction of antibodies, which is otherwise poor with conventional naked dna vaccines. we have recently demonstrated that bacterial ghosts can also be exploited as a delivery system for dna vaccines for both in vivo and ex vivo applications [188]. the potential of dna vaccination is demonstrated by the fact that performance can be optimized by a broad range of manipulations, such as (i) choice of optimal promoters, (ii) use of codon-optimized genes for expression in mammalian cells, (iii) addition of nuclear localization signals or ubiquitination signals to improve expression and processing, and (iv) co-delivery of dna constructs coding for immune modulatory molecules [189]. in addition, owing to the presence of immune stimulatory cpg motifs, dna vaccine constructs have built-in adjuvant properties. this vaccination approach is particularly suited to the stimulation of cellular immune responses [190].
interestingly, several reports suggest that dna vaccines may represent a valid alternative to prime the neonatal immune system, even in the presence of passively transferred maternal antibodies [191, 192]. in fact, promising results were also obtained in preclinical models of community-acquired pneumonia, such as influenza [193] and s. pneumoniae [194]. furthermore, dna coding for vaccine antigens appears to induce excellent immunological memory, which can be reawakened by later immunization or exposure to the pathogen.

the knowledge generated in several basic disciplines, such as immunology and microbial pathogenesis, has allowed the identification of critical bottlenecks for establishing a successful vaccination strategy. it is expected that in the coming years customized approaches will be developed to address each of them, in order to stimulate efficient protection against infective agents under specific clinical settings (i.e., newborns, aging individuals, immunocompromised patients).

the importance of immunological memory

b lymphocytes that have differentiated into plasma cells are the producers of antigen-specific igg antibodies. bone-marrow (bm) plasma cells have a short life; therefore, the bm reservoir needs to be replenished by the stimulation of memory b cells [195, 196]. the maximal life span of bm plasma cells is still debated. only a few factors have been identified that control the differentiation of antigen-specific b cells toward short- or long-lived plasma cells or toward memory b cells [119]. besides the requirement for cd4+ t cells, the nature of the antigen [197] and the dose are also important. higher antigen doses, as well as rapid vaccination schedules (closely spaced vaccine doses), tend to favour the rapid induction of short-term effectors, whereas lower doses of antigen preferentially support the induction of immune memory [198] [199] [200] [201].
it was demonstrated that neonatal vaccination (priming) and infant boosting might be effective even when pathogen exposure occurs very early in life. in children in whom vaccine-induced hib antibody titres have fallen to undetectable levels, memory is readily demonstrated [202]. however, immune memory per se is not enough to protect against pathogens that require high levels of neutralizing antibodies. the delay between memory b-cell reactivation and differentiation may limit the ability to interrupt pathogen invasion. therefore, it is important to establish vaccination protocols in which the population is boosted at different ages in order to maintain the required levels of antibodies. this is particularly important in diseases in which antibodies play a central role in microbial clearance or toxin neutralization. in the particular case of community-acquired pneumonia, it should be considered that aging individuals are neglected in many vaccination programs. however, the strategies proposed for the elderly would be different from those used for small children, since the main factors affecting vaccine efficacy are immune senescence and immaturity, respectively. the attempts to give a rational solution to this issue are discussed in the next sections.

the immune system in children

immune responses to bacterial and viral antigens usually increase with age in a stepwise manner [203]. prompt immunization after birth is required to induce active immunity against diseases that may occur early in life. unfortunately, this strategy is limited by the relative immaturity of the neonatal and infant immune system. some factors implicated in this poor response are the limited switch from igm to igg2 antibodies, impaired complement-mediated reactions and deficient organization of the splenic marginal zone.
vaccination studies performed in newborn mice suggested that limited germinal centre reactions may result from the delayed development of follicular dc and may limit plasma cell differentiation [204]. it was also shown that the neonatal bm has a limited capacity to support the establishment of long-lived antibody-secreting plasma cells [205]. thus, the responses to glyco-conjugates and to most t-cell-dependent antigens are usually affected [119]. therefore, only a few highly immunogenic vaccines show significant protective efficacy after a single dose in infants. the limited igg responses persist throughout the first year of life. in addition, the immune responses, particularly antibodies, elicited by vaccination in the first year of life decline rapidly [203]. however, the problems observed in infants in terms of magnitude and duration of the immune response do not seem to affect efficient priming. in fact, the immune memory generated in neonates may be recalled later in life [119]. nevertheless, strategies to generate strong and long-lasting protective responses in infants are still needed. the difficulty is in part due to the presence of maternal antibodies, which inactivate and clear the vaccine antigens, thereby hindering the stimulation of an immature immune system [203]. in addition, the effects of adjuvants reported in adults cannot be extrapolated to neonates [206]. a potential strategy to overcome these problems would be to implement vaccination during pregnancy, to provide the required antibodies via the placenta and later through breastfeeding [30, [207] [208] [209]. this could be complemented with an early priming of the "immature" immune system of the newborn by dna vaccination, followed by a boost during the second half of the first year or later in life [203].

poly-pathology and multiple organ failure are the rule rather than the exception in aging individuals.
thus, many systems are affected (e.g., endocrine, cardiovascular), and the immune system is not an exception. the mechanisms involved in the immune senescence process, which in turn may lead to poor response to vaccination, are not fully understood. however, it is clear that responses against certain vaccines are more affected by immune senescence than others (e.g., ps-based vaccines against s. pneumoniae) [210]. in contrast, the responses to a boost dose of the anti-tetanus vaccine are hardly affected by age [211]. a rapid decline of antibody responses, together with a relative restriction of the t-cell repertoire, is characteristic of the immune senescence process. this restriction and the reduction in the pool of naïve cells can explain the poor cd4+ t-cell responses against antigens that cross-react with proteins seen earlier in life. in contrast, t-cell responses of healthy elderly individuals to new antigens are often unaffected. nevertheless, the overall response to vaccination in the elderly is less efficient than in young adults, making more vigorous approaches necessary (fig. 2).

figure 2. factors affecting the responses in young adults and aging individuals after vaccination. the process of immune senescence impairs host response to both infection and vaccination. this critical issue needs to be considered during vaccine design and will require the development of special approaches.

in the case of influenza, the current strategy is annual re-vaccination. however, there are concerns regarding the capacity to raise antibodies with proper specificity against re-assorted viruses in aging adults who have been repeatedly infected or immunized. after exposure to a new, but cross-reacting antigenic variant, such individuals may respond by producing antibodies. however, these antibodies could be primarily directed against influenza strains that were encountered earlier in life.
for example, individuals previously exposed to the "old" h1n1 influenza strain (i.e., 50 years ago) may respond differently from naïve adults when vaccinated with a "new" h1n1 strain which has accumulated different mutations. the former might produce antibodies against the ha of the "old" h1n1 strain rather than against the cross-reacting epitopes of the new strain [212]. this phenomenon is the so-called "original antigenic sin" [119]. on the basis of these observations, it was proposed that variations in vaccine efficacy might be due to differences in the antigenic distance between the vaccine strains and the epidemic strains responsible for influenza outbreaks [213]. however, this hypothesis was not confirmed by epidemiologic studies [214]. moreover, individuals aged 65 years or older who were annually vaccinated showed a significantly reduced mortality risk. therefore, it seems so far that the original antigenic sin does not represent a major practical obstacle in influenza vaccination and that additional strategies may not be required.

despite the broad availability of vaccines against agents causing community-acquired pneumonia, they still represent an important cause of death, human suffering and economic losses. however, we have dramatically expanded our knowledge of the pathophysiology of diseases caused by respiratory pathogens, their virulence factors and the effector mechanisms responsible for their clearance. it is becoming clearer which microbial components are attractive as vaccine targets, as well as the type of immune response needed to confer protection against disease. thus, it is now possible to address vaccine development using rational rather than empiric approaches. this is facilitated by powerful bioinformatics tools for the accurate prediction of epitopes and proteasome trimming [215] [216] [217], as well as by the availability of a broad palette of immune modulators and delivery systems.
therefore, we can predict that new and improved vaccines against the etiologic agents of community-acquired pneumonia will considerably reduce the global impact of this disease in the coming years.

a review of vaccine research and development: human acute respiratory infections estimates of world-wide distribution of child deaths from acute respiratory infections respiratory viral vaccines public health. will vaccines be available for the next influenza pandemic novel generations of influenza vaccines safety, reactogenicity and immunogenicity of madin darby canine kidney cell-derived inactivated influenza subunit vaccine. a meta-analysis of clinical studies development of a vero cell-derived influenza whole virus vaccine influvac: a safe madin darby canine kidney (mdck) cell culture-based influenza vaccine structure of influenza haemagglutinin at the ph of membrane fusion influenza vaccine-live influenza virus: immunity and vaccination strategies. comparison of the immune response to inactivated and live, attenuated influenza vaccines current status of live attenuated influenza virus vaccine in the us evaluation of trivalent, live, cold-adapted (caiv-t) and inactivated (tiv) influenza vaccines in prevention of virus infection and illness following challenge of adults with wild-type influenza a (h1n1), a (h3n2), and b viruses safety, efficacy, and effectiveness of live, attenuated, cold-adapted influenza vaccine in an indicated population aged 5-49 years correlates of ratory syncytial virus purified fusion protein-2 vaccine in pregnant women sequential annual administration of purified fusion protein vaccine against respiratory syncytial virus in children with cystic fibrosis enhanced pulmonary immunopathology following neonatal priming with formalin-inactivated respiratory syncytial virus but not with the bbg2na vaccine candidate safety and immunogenicity of a novel recombinant subunit respiratory syncytial virus vaccine (bbg2na) in healthy young adults the 
this work was supported in part by grants from the dfg (gu482/2-3) and the bmbf ("pathogenomik" -competence center for genome research of pathogenic bacteria "pathogenomik", 031u213b) to cag. we are particularly grateful to d. felnerova from etna biotech, who provided us with a micrograph from a transmission electron microscopy of a virosome, and to m. höfle for critical reading of the manuscript. 
key: cord-013178-li1x1m25 authors: hung, ling-chu title: the monoclonal antibody recognized the open reading frame protein in porcine circovirus type 2-infected peripheral blood mononuclear cells date: 2020-08-29 journal: viruses doi: 10.3390/v12090961 sha: doc_id: 13178 cord_uid: li1x1m25 this study examined the open reading frame 3 (orf3) protein of porcine circovirus type 2 (pcv2), especially its location and its relation to the capsid protein and the apoptosis protein in pcv2-infected porcine peripheral blood mononuclear cells (pbmcs). to detect the orf3 protein, monoclonal antibodies (mabs) were generated in this study. the mab 7d3 binds to the orf3 peptide (residues 35–66) and the native orf3 protein in pcv2-infected pbmcs, as shown by immunofluorescence assay (ifa). the data show that 3–5% of pbmcs were positive for orf3 protein or p53 protein. further, 78–82% of pbmcs were positive for the capsid. this study confirmed that the orf3 protein colocalized not only with the capsid protein but also with the p53 protein in pbmcs. immunoassays were conducted in this study to detect the capsid protein, the orf3 protein, anti-capsid igg, and anti-orf3 igg. the data show a correlation (r = 0.758) between the orf3 protein and the capsid protein in the blood samples from the pcv2-infected herd. however, the igg against each viral protein followed a different profile in the same herd after vaccination. overall, this study provides a blueprint to explore the orf3 protein in pcv2-infected pbmcs. the porcine circovirus (pcv) is a small virus and contains closed circular single-stranded dna [1]. previous studies indicated that porcine circovirus type 1 (pcv1) is non-pathogenic to pig herds [2,3]. however, porcine circovirus type 2 (pcv2)-infected pigs showed dullness, thinness, lymphadenopathy, jaundice, hepatomegaly, and other manifestations [4-8]. 
lymphocyte depletion and apoptosis of lymphocytes were the significant histopathological lesions in lymphoid tissues of pcv2-infected pigs [9,10]. pcv2 infection causes a reduction of t- and b-lymphocytes in the pbmc [11]. likewise, pcv2-associated disease (pcvad) manifested as severe swine herd problems [4,6-8]. recently, pcvad has become one of the significant swine diseases worldwide, and it impacts pork production [7]. at least four commercial pcv2a vaccines have been widely used to reduce viremia and pcv2 infectious pressure in swine herds [12-15]. that might have caused the global genotype shift, first from pcv2a to pcv2b [16-19], and then to pcv2d (a pcv2b mutant strain), which has become a predominant genotype globally [20-27]. the open reading frame 1 (orf1), open reading frame 2 (orf2), and open reading frame 3 (orf3) genes are the three principal orfs among 11 orfs in pcvs [28]. the orf1 encodes for the replicase

peptide c3 and peptide n1 were appended with an n-terminal cysteine during synthesis, which was required for conjugation with maleimide-activated carriers. animal experiments in this study were performed following the current legislation on ethical and welfare recommendations. the experiments followed the standards of the guide for the care and use of laboratory animals. all animal work was approved by the institutional animal care and use committee (iacuc) of the livestock research institute, and the iacuc of the animal health research institute. the iacuc approval numbers lriiacuc100-33 (for swine and murine experiments), a00027 (for murine and rabbit experiments), and a02023 (for murine and rabbit experiments) were assigned to this study. antisera against pcv2 were generated by immunizing new zealand white rabbits (from livestock research institute, council of agriculture, taiwan). rabbits were maintained in isolation rooms, and the room temperature was kept at 20-26 °c. 
they were fed with a commercial pelleted diet (fwusow in., taiwan), and pure water was available ad libitum. then, animals were immunized with peptide immunogen (the peptide n1 conjugated with klh) or commercial vaccine five times at two-week intervals. for peptide immunization, an initial dose of 150 µg of the conjugated peptide was mixed with complete freund's adjuvant (sigma-aldrich, st. louis, mo, usa) at the first injection. for each subsequent injection, each rabbit received the same amount of immunogen mixed with incomplete freund's adjuvant (sigma-aldrich, st. louis, mo, usa). for immunization with virus-like particles (vlp) of pcv2, the rabbit was injected intramuscularly in the legs with 0.5 ml of the pcv2 vaccine (circoflex ®, boehringer ingelheim). the blood was harvested two weeks after the last injection. the antibody titers were measured by indirect enzyme-linked immunosorbent assay (ielisa) [48], performed as described previously for mouse anti-pcv2 elisa, with the exception that a peroxidase-conjugated donkey anti-rabbit igg (h + l; jackson immunoresearch, west grove, pa, usa) secondary antibody was used. hybridomas were generated following established methods, as previously described [49]. briefly, four five-week-old female balb/cbyjnarl mice were purchased from a specific pathogen-free (spf) colony (the national applied research laboratories, taiwan). the mice were maintained in isolation rooms in filter-top cages, and the room temperature was kept at 20-26 °c. mice were fed with a commercial pelleted diet (labdiet, 5002, st. louis, mo, usa), and pure water was available ad libitum. mice were inoculated with 65 µg of immunogen (the peptide n1 conjugated with klh) emulsified in freund's adjuvant. subsequently, mice were boosted fortnightly with the same amount of immunogen. three days after the final booster, the spleens of the mice were harvested for hybridoma generation. 
hybridomas were screened for the secretion of anti-orf3-specific mabs by ielisa [49] using the peptide n1 as a coating antigen. the secondary antibody solution (hrp-conjugated goat anti-mouse iga/igg/igm, h/l chain; novus, saint charles, mo, usa) was used at a 1:2500 dilution for hybridoma screening. the hybridomas that secreted anti-n1 mabs were subcloned by limiting dilution of the cells. then, ascites containing mabs were collected as previously described [49]. subsequently, the mabs were purified from the ascites by using the nab protein g spin column (thermo scientific, rockford, il, usa); the collected solutions were then buffer-exchanged and concentrated using the amicon ultra 50k (millipore, burlington, ma, usa). the concentration of mabs was determined using the nanophotometer ® (implen, munich, germany). the heavy chain and the light chain of the two mabs secreted by each hybridoma were determined by using an sba clonotyping system/hrp (southern biotechnology, birmingham, al, usa). it was performed according to the manufacturer's instructions and a previously published method [49]. briefly, the nunc maxisorp flat-bottom 96-well plate (thermo fisher scientific, inc., waltham, ma, usa) was coated with the capture antibody (goat anti-mouse ig, 5 µg/ml) in 0.05 m bicarbonate buffer (sigma-aldrich, st. louis, mo, usa) at 4 °c overnight. after three washes with pbs containing 0.05% tween 20 (pbst), the plate was blocked with the blocking buffer (pbst containing 5% casein hydrolysate) at 37 °c for 30 min. after washing, 100 µl of each hybridoma culture supernatant were added sequentially, and the plate was incubated at 37 °c for 2 h. after washing, 100 µl of hrp-labeled goat anti-mouse igg1, -igg2a, -igg2b, -igg3, -iga, -igm, -κ light chain, and -λ light chain were added to appropriate wells of the plate. pbst served as the background control. the plate was then incubated at 37 °c for 1 h and after that washed with pbst. 
finally, 100 µl of 2,2'-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid) diammonium salt (abts, sigma-aldrich, st. louis, mo, usa) buffer were added to each well of the plate. after 30 min, the optical density (od) of each well was measured at 405 nm using a spectramax m5 microplate reader (molecular devices, san jose, ca, usa). the epitope mapping of these mabs was performed by ielisa [49] using truncated peptides (as shown in table 1) as coating antigens. briefly, the peptides comprised the sequence of the orf3 protein of pcv2b between residues 35 and 66, associated 10-mer peptides, truncated derivatives, and the control (peptide c3). ninety-six-well maxisorp plates were coated with each peptide (5 µg/ml) in bicarbonate buffer and incubated at 4 °c overnight. after three washes in pbst, the plates were blocked with the blocking buffer at 37 °c for 30 min. after washing, the culture supernatant of hybridoma was added, and plates were again incubated at 37 °c for 2 h. the anti-peptide n1 mouse serum (at a 1:1000 dilution) was used as the positive control, and the anti-orf2 (peptide c3) mab (1h3) served as the negative control. after rinsing three times with pbst, the secondary antibody solution was applied to the appropriate wells. peroxidase-conjugated goat anti-mouse igg (subclasses 1 + 2a + 2b + 3, fcγ, jackson immunoresearch, west grove, pa, usa) was used at a 1:2500 dilution for binding to the mab 7d3. hrp-conjugated goat anti-mouse lambda light chain (novus, saint charles, mo, usa) was used at a 1:5000 dilution for binding to the mab 6d10. after 1 h, the plates were washed three times. the colorimetric reaction was developed by using the abts solution. following a 30-min incubation, the od was measured at 405 nm. both the isotype and the molecular weight of the mab 6d10 were determined by western blotting. the culture supernatant of the hybridoma (clone 6d10) was separated on a bolt™ bis-tris plus gel (invitrogen, carlsbad, ca, usa). 
one gel was stained with fastain protein staining solution (yeastern biotech, taiwan). the proteins on the other gel were transferred to a polyvinylidene difluoride (pvdf) membrane (millipore, burlington, ma, usa) with fast semi-dry transfer buffer (thermo, waltham, ma, usa) by using a yrdimes semi-dry transfer system (wealtec, new taipei city, taiwan). the membrane was blocked with the blocking buffer for 30 min at 37 °c. after three washes in pbst, each strip of the membrane was incubated with one appropriate horseradish peroxidase (hrp)-conjugated antibody to detect mab at 4 °c overnight and then at room temperature for 2 h. peroxidase-conjugated goat anti-mouse iga/igg/igm, h/l chain (novus, saint charles, mo, usa) at 1:2500, hrp-conjugated goat anti-mouse iga (southern biotechnology, birmingham, al, usa) at 1:2500, hrp-conjugated goat anti-mouse lambda light chain (novus, saint charles, mo, usa) at 1:5000, hrp-conjugated goat anti-mouse igm, µ chain (jackson immunoresearch, west grove, pa, usa) at 1:5000, and hrp-conjugated goat anti-mouse igg (subclasses 1 + 2a + 2b + 3, fcγ, jackson immunoresearch, west grove, pa, usa) at 1:2500 were used in this assay. then, the strips were washed three times with pbst, 5 min each wash. each strip was incubated with 3,3'-diaminobenzidine (dab, horseradish peroxidase substrate, millipore, burlington, ma, usa) buffer for 5 min. protein molecular weight markers were included in each western blot analysis. after development with the dab substrate, the target protein was visualized, and its molecular weight was estimated by using the molecular weight markers. homemade immunofluorescence assay (ifa) slides were prepared as previously described [49]. pcv2-infected peripheral blood mononuclear cells (pbmcs) were collected from the pcv2-infected conventional piglet. the pig serum was detected as seropositive for pcv2 by anti-capsid peptide (c3) specific immunoassay [48]. 
the whole blood sample was confirmed as positive for pcv2 antigen by using the ingezim pcv das kit (ingenasa, madrid, spain). briefly, pbmcs were isolated from ethylenediaminetetraacetic acid (edta)-treated whole blood samples by ficoll-paque plus (ge healthcare bio-sciences, uppsala, sweden) according to the manufacturer's instructions. then, mononuclear cells were drawn from the interface of the ficoll-paque plus-containing tube and resuspended in 45 ml pbs for washing and centrifugation (100× g for 10 min at 4 °c). after washing three times with pbs, the cell pellet was resuspended in 0.5 ml of fetal bovine serum. an aliquot (25 µl) of cell suspension was loaded onto a glass slide for the cell smear. after the cell smear, the slides were air-dried at room temperature, so that the pbmcs adhered to the slides. these slides were fixed with pbs containing 2% paraformaldehyde at 4 °c for 10 min and then washed three times with pbs. following that, the slides were soaked in pbs containing 0.1% triton x-100 at 4 °c for 3 min. subsequently, slides were washed with pbs and prepared for ifa. two anti-capsid (the peptide c3) mabs (1h3 and 6b8) [49], one anti-pcv2 mab 36a9 (ingenasa, madrid, spain), one anti-p53 protein (wt-p53) rabbit polyclonal antibody (bioss, woburn, ma, usa), the homemade anti-orf3 peptide (n1) mab (7d3), and rabbit antisera (generated as described above) were used in this study. the anti-capsid peptide mabs (1h3 and 6b8) recognized native capsid proteins and reacted with core peptides (p59, 227 kdpplnp 233); however, the mab 1h3 bound the three minimal linear epitopes (dpplnp, dpplnpk, and lkdpplkp) and the mab 6b8 bound two minimal linear epitopes (kdpplnp and kdpplnpk). these mabs produced a variable definite staining pattern in pbmcs by ifa [49]. the pcv2-infected pbmcs slides were incubated with a 1:100 dilution of rabbit antiserum and a 1:100 dilution of mab (ascites). 
after incubation at 37 °c for 1 h, the slides were gently rinsed briefly in pbs and then soaked for 15 min in pbs at 4 °c. the slides were then incubated at 37 °c with tritc-conjugated goat anti-mouse igg (subclasses 1 + 2a + 2b + 3, fcγ) at a 1:100 dilution and fitc-conjugated goat anti-rabbit igg (h + l) at a 1:100 dilution (both from jackson immunoresearch, west grove, pa, usa, each with minimal cross-reaction to the serum proteins of other animals). after 30 min of incubation, the slides were washed with pbs and then incubated with 4',6-diamidino-2-phenylindole dihydrochloride (dapi, aat bioquest, sunnyvale, ca, usa) at a 1:2300 dilution in pbs at room temperature for 15 min. the slides were mounted under 50% glycerol and observed with an olympus bx51 fluorescence microscope and spot flex camera (diagnostic instrument, model 15.2 64mp, usa). terminal deoxynucleotidyl transferase dutp nick end labeling (tunel) assay was performed according to the manufacturer's instructions (the cell meter™ tunel apoptosis assay kit, aat bioquest ®, inc., sunnyvale, ca, usa) to detect excessive dna breakage in the pbmc. briefly, the pcv2-infected pbmcs slides were incubated with the tf3-dutp/reaction buffer. after 60 min of incubation, the slides were washed with pbs. they were then incubated with hoechst solution at room temperature for 13 min. the slides were mounted under 50% glycerol and observed with an olympus bx51 fluorescence microscope and spot flex camera. blood samples were collected from a pcv2-infected herd in which pigs had been vaccinated at two weeks of age with a pcv2 vaccine (2 ml, porcilis ® pcv, msd animal health), on a conventional pig farm with a farrow-to-finish operation. pigs of different ages (1, 3, 6, 9, 12, 15, 18, 21, 24, and ≥34 weeks of age) were selected at one time. then, four samples were randomly chosen per age group. blood was collected in vacutainer ® edta tubes (becton and dickinson bv) and stored at −20 °c. 
all whole-blood samples were subjected to two freeze-thaw cycles before the test. the capsid protein was detected in 50 µl of whole-blood sample by using the ingezim pcv das kit (ingenasa, madrid, spain) according to the manufacturer's instructions, and the absorbance of each well was measured at 450 nm. meanwhile, a homemade antigen-elisa was used in this study to detect the orf3 protein in whole-blood specimens. it was based on the double-antibody-sandwich principle. ninety-six-well maxisorp plates were coated with 1 µl/ml (100 µl per well) of the trapping sera (rabbit anti-orf3 peptide) in bicarbonate buffer. the plates were incubated at 4 °c overnight and washed three times with pbst. then, the plates were blocked with the blocking buffer at 37 °c for 30 min. after the plates were washed, 50 µl of pbst and 50 µl of whole-blood sample were mixed and added to each well. the plates were incubated at 37 °c for 2 h and then washed in pbst. then, 100 µl/well of mab 7d3 at 1 µg/ml was added to the plates, and the plates were incubated at 37 °c for 1 h. after the plates were washed, 100 µl/well of peroxidase-conjugated goat anti-mouse igg fcγ at a 1:2500 dilution was added, and the plates were incubated at 37 °c for 1 h. after washing, abts buffer was added to each well. following a 30-min incubation, the od was measured at 405 nm.

2.9. detection of the specific antibodies against the capsid protein or the orf3 protein in plasma samples from the pcv2-infected herd with a pcv2 vaccine

plasma samples were separated from the aforementioned fresh whole-blood samples and stored at −20 °c. the antibody titers were measured by ielisa [48] (performed as described previously for pig anti-peptide elisa, with the exception that the capsid protein (circoflex ® , boehringer ingelheim) or peptide p126 was used as the coating antigen, and pig plasma at a 1:200 dilution was used).
the subsequent procedures (blocking, pig antibody incubation, washing, secondary antibody incubation, substrate addition, and the od reading) were the same as described previously.

the data were analyzed by using one-way analysis of variance (anova) and tukey's studentized range (hsd, honestly significant difference) multiple comparisons test using the sas enterprise guide 7.1 ® software (sas institute inc., cary, nc, usa). a p-value < 0.05 was considered significant.

the orf3 peptide n1 or vlp of pcv2 was capable of inducing each specific antibody. the antibody titer of each post-immune serum (two weeks after the fourth booster) was detected at the dilution 1:10000 (figure s1). the binding of the rabbit antiserum to the virus was confirmed by using an immunofluorescence assay (ifa) (figure s2). although pre-immune serum had a low background (od 405 value less than 0.4) at a dilution of 1:100, the pre-immune serum did not bind the virus as detected by ifa (data not shown). the author checked for the cross-reactivity of post-immune antiserum to other antigens. the result shows a minor cross-reaction at a dilution of 1:100 (figure s1), but the od 405 value of the cross-reaction was not significantly higher than that of the negative control serum (p ≥ 0.05) (figure s1). following the fusions, hybridoma supernatants were screened for the presence of anti-orf3 (peptide n1) specific antibodies by ielisa. among them, 34 hybridoma supernatants reacted with the peptide n1 at the first screening (data not shown). after repeated subcloning by limiting dilution and selection, two stable hybridomas secreting anti-orf3 mabs (7d3 and 6d10) were obtained. one hybridoma produced an igg1 mab (7d3) with kappa light chains. the other produced a mab (6d10) with a lambda light chain (figure 1a); however, no heavy chain was detected in the supernatant of clone 6d10 by the sba clonotyping system/hrp assay.
the supernatant of a hybridoma (clone 6d10) was separated directly by bolt™ bis-tris plus gel to confirm the heavy chain of this mab. the gel showed a broad staining region within 40-98 kda. the author speculated that the supernatant of a hybridoma contains abundant serum proteins from the fetal bovine serum-containing medium. that is to say, the supernatant of the hybridoma (clone 6d10) contains only a small amount of mab 6d10 (light chain). further, western blotting was used to confirm the heavy chain of mab 6d10 and its molecular weight. the hrp-conjugated antibodies (goat anti-mouse iga/igg/igm, h/l chain, and goat anti-mouse lambda light chain) gave specific reactions with the supernatant. two bands were observed at about 33 (between 28 and 38) and 6 kda, respectively (figure 1). the lower band (6 kda) implied some degradation of this mab. however, other anti-ig antibodies (goat anti-mouse iga (α chain), goat anti-mouse igm (µ chain), and goat anti-mouse igg (fcγ)) gave negative reactions with the supernatant. this study confirmed that mab 6d10 contains a lambda light chain but does not include any intact heavy chain. since the molecular weight of the light chain was 25 kda, it may be assumed that the mab 6d10 might contain a short fragment of a protein.

a peptide scan analysis was performed to determine the binding site for each mab (figure 2). the mab 7d3 bound to two linear peptides (peptide n1 and p126). the peptide n1 was appended with a cysteine residue at the n-terminal of the orf3 peptide (residues 35-65, p126). both of them contained the sequence of the orf3 protein of pcv2b between residues 35 and 66. however, the non-igg mab (6d10) not only reacted with peptide n1 but also showed minor reactions against other peptides when compared with the mab 7d3 in the elisa test (figure 2). interestingly, the anti-peptide n1 mouse serum reacted with all test peptides and bound firmly to six linear peptides (peptide n1, p32, p33, p69, and p126).
notably, anti-n1 polyclonal antibodies reacted with these peptides (except p33), which contained the common core peptide (p32, 45 tllhfpahfq 54 ). nevertheless, this study did not obtain any mab which bound firmly to the peptide p32.

figure 2. anti-orf3 mabs (7d3 and 6d10) bound the linear peptide spanning from residues 35 to 66. the anti-peptide n1 mouse serum was used as the positive control, and the anti-orf2 (peptide c3) mab (1h3) served as the negative control. the mab binding was tested by using an ielisa. peptides contained the sequence of the orf3 protein between residues 35 and 66, associated 10-mer peptides, truncated derivatives, and the control (peptide c3) (as shown in table 1). the experiments were performed three times. data represent the mean ± standard error (se).

for these homemade pbmc slides, all slides were made from the same batch and contained the same proportion of pcv2-infected cells.
all wells contained both negative and positive cells (pcv2-infected cells) confirmed by ifa, as previously noted in a similar study [49]. in this study, pcv2 viral protein (orf3 protein and capsid protein) locations were initially assessed by ifa. the previous study indicated that the mab 1h3 bound the three minimal linear epitopes (dpplnp, dpplnpk, and lkdpplkp), which were located at the carboxyl-terminus (c-terminus) of the capsid proteins of pcv2b, pcv2d, and pcv2a, respectively [49]. likewise, the mab 6b8 bound two minimal linear epitopes (kdpplnp and kdpplnpk), which were located at the c-terminus of the capsid proteins of pcv2b and pcv2d, respectively [49]. first, the percentage was determined by counting the number of positive cells per 100 nuclei in an ifa slide. the data show that 78-82% of pbmcs were positive for anti-capsid peptide mab (1h3) staining (pcv2a-, pcv2b-,
[...] of pbmcs were positive for anti-orf3 mab (7d3) staining (figure s3). this difference might be due to the different abundance of the proteins, the degradation of the protein, different pcv2 strains, or the conformational or linear epitopes that interacted with each antibody. interestingly, the anti-vlp of pcv2 rabbit serum produced cytoplasmic staining in pcv2-infected pbmcs (figure 3b,f). when pcv2-infected pbmcs were co-stained with anti-vlp of pcv2 rabbit serum and the anti-capsid peptide mab (6b8), some vlp/capsid peptide dual-positive cells were shown (figure 3d). the overlap between the anti-vlp rabbit serum staining and the anti-orf3 mab (7d3) staining is presented in figure 3h, and the anti-orf3 polyclonal antibody and anti-capsid mab (36a9) overlapped similarly in figure 3p; however, the polyclonal antibody staining was brighter than the mab staining. this result shows that the orf3 protein overlaps the capsid protein (or vlp) of pcv2 (figure 3h,p). both the capsid proteins and the orf3 protein were detected mainly in the cytoplasm. according to the previous study, the n-terminal half of orf3 protein (residues 1-53) was discovered in the cytoplasm, and the c-terminal half of orf3 protein (residues 53-104) was mainly located in the nucleus [50]. the medial region of orf3 protein (residues 35-66) might be primarily located in the cytoplasm in this study.
a previous study suggested that orf3 protein led to increased p53 protein levels and apoptosis of the infected cells [42]. to identify the subcellular location of the orf3 protein in pcv2-infected pbmcs, anti-capsid, anti-orf3, and anti-p53 antibodies were used. the p53 protein also colocalized with the capsid peptide (figure 3l) and the orf3 peptide (figure 3x). the p53 protein was mainly distributed in the cytoplasm and around the nucleus (figure 3j,v). the orf3 protein, as detected with mab 7d3, appeared to colocalize with the p53 protein (figure 3x). the p53/orf3 dual-positive cell presented a segmented nucleus (figure 3w,y). the nuclei of negative cells were round or oval, and nuclear segmentation was very rare in negative cells (figure 3y).

the apoptosis in pcv2-infected pbmcs was observed by using a terminal deoxynucleotidyl transferase dutp nick end labeling (tunel) assay (figures 4 and s4). the data show that 3-4% of pbmcs were positive for tunel staining. this is very similar to the percentage of anti-orf3 mab (7d3) staining. however, these similar percentages are not sufficient proof to show that the orf3 protein of pcv2 causes dna damage. a more careful analysis would be needed to reveal whether pcv2 infection causes dna damage. unfortunately, it was not easy to label both the orf3 protein and the signals of dna breakage simultaneously by ifa and tunel assay (data not shown). further research would be needed to determine whether the orf3 protein colocalizes with the signals of tunel.
[...] 6b8 (a); mab 1h3 (i); and mab 36a9 (m)), and the orf3 peptide of pcv2 (mab 7d3 (e,u)). staining with spf mouse serum was used as the negative control (q). (b,f,j,n,v) staining with rabbit antiserum (green) was used to identify the vlp of pcv2 (b,f), the p53 (j,v), and the orf3 peptide of pcv2 (n). (c,g,k,o,s,w,y) nuclei were stained with dapi (blue). the arrow points to the positive staining cell (z) and its irregularly shaped nucleus (y), and the enlarged image in the sixth row (x,w). (d,h,l,p,t,x,z) the merged images are also shown. the scale bar is 10 μm.

the viral proteins were detected in the whole-blood samples using antigen-elisa techniques. the cut-off values in the commercial pcv2 capsid elisa and the orf3 protein elisa were 0.15 and 0.22, respectively. the capsid protein of pcv2 was mainly detected in pigs at 1, 3, 6, and ≥34 weeks of age on the study farm (figure 5a).
noticeably, these groups (at 1, 3, 6, and ≥34 weeks of age) showed a higher standard error of the od 405 value than other groups in this statistical analysis. this result implies that pigs at those ages would be susceptible to pcv2 infection. interestingly, non-vaccinated piglets showed the highest od 450 value of capsid protein at one week of age. after piglets were inoculated with the pcv2 vaccine at two weeks of age, the od 450 value of capsid protein gradually decreased until 24 weeks of age. however, the od 450 value of capsid protein rose at ≥34 weeks of age (two gilts and two sows). there is evidence to suggest that the vaccine could not protect pigs against pcv2 infection at 32 weeks post-immunization. the orf3 protein of pcv2 showed the same trend, except that the od values of the orf3 protein were lower than those of the capsid protein. the correlation coefficient (r) was 0.7583. the mean od 405 values of the orf3 protein were significantly higher for suckers at one week of age than for postweaning pigs at 9-24 weeks of age (p < 0.05; figure 5b). it appears likely that the amount of orf3 protein was lower than the amount of capsid protein in pcv2-infected blood from suckers at one week of age.

figure 5. assessment of the capsid protein and the orf3 protein in the whole-blood samples from pigs of different age groups.
this study used the commercial capsid kit (a) and the orf3 protein elisa (b) to measure each specimen, and the absorbance was measured at 450 or 405 nm, respectively. the peptide n1 and pbst were used as a positive (pos) control and a negative (neg) control, respectively, in the orf3 protein elisa. the data represent the mean ± se. treatments with different letters have statistically significant differences (p < 0.05).

the anti-viral protein-specific antibodies were detected in the plasma samples using ielisa techniques. the cut-off values in the anti-capsid igg elisa and the anti-orf3 peptide (p126) igg elisa were 0.20 and 0.18, respectively. non-vaccinated suckers showed the highest mean od 405 value of anti-capsid igg and anti-orf3 igg at one week of age. this may mean that non-vaccinated suckers absorbed large amounts of maternally derived antibodies after birth. notably, the non-vaccinated group showed the highest standard error of the od 405 value (anti-capsid igg) in this statistical analysis. a gradual decrease of the mean od 405 value of anti-capsid igg for piglets from the suckling period (one week of age) to the weaning period (six weeks of age) was observed. from 9 to 12 weeks old, the mean od 405 value of anti-capsid igg increased significantly (p < 0.05; figure 6).
the mean od 405 value of anti-capsid igg also increased significantly for pigs from 15 to 18 weeks old (p < 0.05) and reached a plateau at 18 weeks old. the same trend was observed in the result of the anti-orf3 igg elisa test, except that the od 405 values of anti-orf3 igg were lower than those of anti-capsid igg. notwithstanding, it is worth mentioning that there was a sharp decrease of the mean od 405 value of anti-orf3 igg for suckers from one to three weeks old. the mean od 405 value of anti-orf3 igg increased significantly for pigs from six to nine weeks old (p < 0.05; figure 6) and reached a plateau at nine weeks old. there was a weak correlation (r = 0.3771) between the od 405 value of anti-capsid igg and the od 405 value of anti-orf3 igg in the plasma samples. this suggests that the amount of anti-capsid igg reflects humoral immunity in pcv2-infected pigs before and after the injection of a pcv2 vaccine. in contrast, the amount of anti-orf3 igg reflects humoral immunity in pigs that was caused by pcv2 infection only.
figure 6. assessment of the anti-capsid igg and the anti-orf3 peptide (p126) igg in the plasma samples from pigs of different age groups. the experiments were repeated three times, and data represent the mean ± se. treatments with different letters within the same kind of antibody assay have statistically significant differences (p < 0.05).

according to a previous study, the orf3 peptide (residues 35-65) of pcv2 interacts with the p53-binding domain of ppirh2 [42].
the peptide n1 (residues 35-66) of the orf3 protein was synthesized and used in this study. subsequently, anti-orf3 mabs were generated and characterized in this study. this work created one hybridoma producing the mab 7d3 with igg1, and the other producing the defective-ig mab (6d10) with a lambda light chain. the mab 7d3 (igg1) bound to the linear peptide n1 (c 35 hndvyislpi tllhfpahfq kfsqpaeisdkr 66 ) and p126 ( 35 hndvyislpi tllhfpahfq kfsqpaeisdkr 66 ). however, the defective-ig mab 6d10 reacted weakly with all truncated peptides. this finding is similar to the previous study [49]. the defective-ig mabs likely have broad binding, moderate specificity, and low affinity with the associated peptide. there were concerns about this defective-ig mab since this seems to be very rare. further, the molecular weight of the mab 6d10 was determined by western blotting. two bands were observed at about 33 and 6 kda, respectively. the small one (6 kda) implied some degradation of this mab. since the molecular weight of the light chain was 25 kda, the author suggested that the mab 6d10 might contain a short fragment of a protein. interestingly, the author also tested the molecular weight of another defective-ig mab, 6c3 [49] (raised against peptide c3), and two molecules at about 33 and 6 kda were again shown (data not shown). according to previous reports, abnormal serum immunoglobulin free light chain production in human patients has been attributed to monoclonal plasmaproliferative disorders (including multiple myeloma, light chain myelomas, and light chain amyloidosis) [51] [52] [53]. arguably, the defective-ig mab (6d10) might be related to the phenotype of the splenocytes which fused with the myeloma cell. further research should be carried out using advanced techniques (such as liquid chromatography with mass spectrometry and spectroscopic techniques) to investigate the constitution of the mab 6d10.
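the residue numbering quoted above for peptide p126 (residues 35-66 of the orf3 protein) and for the core peptide 45 tllhfpahfq 54 can be checked with a short script. the following is a minimal sketch, not the procedure used in this study: the helper names (`residues`, `windows`) and the window step are hypothetical, and the actual peptide set of the scan is the one listed in table 1.

```python
# Residues 35-66 of the PCV2b ORF3 protein, as quoted in the text for peptide P126.
P126 = "HNDVYISLPITLLHFPAHFQKFSQPAEISDKR"
P126_START = 35  # residue number of the first amino acid of P126


def residues(seq, seq_start, first, last):
    """Return the sub-peptide covering residue numbers first..last (inclusive)."""
    seq_end = seq_start + len(seq) - 1
    if first < seq_start or last > seq_end or first > last:
        raise ValueError("requested residues fall outside the peptide")
    return seq[first - seq_start:last - seq_start + 1]


def windows(seq, seq_start, length=10, step=1):
    """Yield (first, last, peptide) for overlapping windows.

    The window length and step are illustrative assumptions only.
    """
    for i in range(0, len(seq) - length + 1, step):
        yield seq_start + i, seq_start + i + length - 1, seq[i:i + length]


# Peptide N1 is P126 with a cysteine added at the N-terminus.
N1 = "C" + P126

core = residues(P126, P126_START, 45, 54)
print(core)  # TLLHFPAHFQ, the core peptide quoted in the text
print(P126_START + len(P126) - 1)  # 66, the last residue of P126
```

indexing by residue number rather than by string offset keeps the peptide labels consistent with the numbering of the full-length orf3 protein.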
based on previous reports, lymphocyte depletion and apoptosis in lymphoid tissues are histological hallmarks in pcv2-infected pigs [54] [55] [56]. interestingly, these histopathological lesions are similar to those of marek's disease and african swine fever [57, 58]. more recent evidence shows that these viruses caused an early cytolytic infection in lymphocytes by apoptosis [57] [58] [59]. therefore, this study hypothesized that lymphocytes and pbmc-lineage cells should be the primary target cells for pcv2 infection. to the best of the author's knowledge, the study of the orf3 protein by indirect immunofluorescence assay in pcv2-infected pbmcs has never been performed, and only a single article mentions the transient expression of the orf3 in porcine pbmcs and the detection of apoptosis with a tunel assay [50]. therefore, the author explored the relation of the capsid, p53 protein, and the orf3 protein by ifa and showed that the p53 protein (a marker of apoptosis) accompanied the peptide (residues 35-66, p126) of the orf3 protein in pcv2-infected pbmcs. it is worth highlighting that 3-5% of pbmcs were positive in several assays, including anti-orf3 mab (7d3) staining, anti-p53 protein rabbit polyclonal antibody staining, and the tunel assay (figure s3). there is a strong probability that 3-5% of pbmcs were undergoing the process of apoptosis. however, the data revealed that a variable percentage (4-82%) of pbmcs was positive for anti-capsid mabs (1h3, 6b8, and 36a9) staining and anti-vlp rabbit serum staining. this was because these mabs or antiserum recognized different epitopes of proteins presented in pcv2-infected cells. in other words, the interactions between the antigen-binding sites of antibodies and their epitopes vary with linear epitopes, conformational epitopes, protein degradation, and different pcv2 strains.
curiously, this study (figure s3) showed that the percentage of pcv2-infected pbmcs (by mab 1h3 staining) was about 20-fold the rate of orf3-positive pbmcs (by mab 7d3 staining) or the percentage of p53-positive pbmcs. it seems likely that not all pcv2-infected cells contained orf3 proteins or p53 protein. this finding concurs with the previous study, which indicated that apoptosis statistically decreases in the initial pcv2-infected pigs [60]. on the other hand, this study showed that the orf3 protein colocalized with the capsid protein marker of pcv2. moreover, the p53 protein also colocalized with the orf3 protein. remarkably, the p53/orf3 dual-positive cell presented a segmented nucleus (figure 3w). however, these p53/orf3 dual-positive cells were very few in these samples. this finding should be confirmed by flow cytometry in the future. although a previous study indicated that pcv2-infected pigs had a reduction of lymphocytes in the peripheral blood [11], the mechanism of lymphocyte lysis was still unclear. this study confirmed that the orf3 protein was related to the p53 protein (apoptosis marker) in pcv2-infected pbmcs, while other factors (proliferative activity [60], caspases 8 and 3 [34, 61], granzymes [62], or corticosteroids [63, 64]) causing lymphocyte lysis or depletion need to be considered. although the experiment on apoptosis in pcv2-infected pbmcs was carried out by the tunel assay (figure 4), it did not clarify the relationship between the tunel result and the orf3 protein in pcv2-infected cells. only the ifa data confirm previous reports that exogenous orf3 protein was related to the accumulation of p53 protein [37, 42], and more evidence is needed to establish that the orf3 protein is a major factor leading to apoptosis in pcv2-infected cells [34, 37, 42] and depletion of lymphocytes.

pcv2 is one of the most critical pathogens in modern swine production and causes endemic disease in pig farms [7].
the commercial vaccines were confirmed to be protective in the field against pcv2 and are mainly administered to suckers in herds with pcv2 infection [13, 14, [65] [66] [67]. most researchers utilized quantitative real-time pcr to detect pcv2 nucleic acid, and then they evaluated the efficacy of vaccination in regard to pcv2 viremia in pigs [13, 14, 66]. however, little is known about the capsid protein or the orf3 protein in blood from the pcv2-infected herd with vaccination. although the capsid antigen-elisa could detect capsid protein, this assay could not differentiate native viral proteins from vaccine-derived ones in blood. the orf3 protein-elisa, by contrast, only identified native orf3 proteins in pig blood since no commercial vaccine contains the orf3 protein of pcv2. for these purposes, this study used the commercial capsid antigen-elisa and the homemade orf3 protein-elisa (based on anti-n1 polyclonal antibodies and mab 7d3) to detect viral proteins in pig blood. this study found that non-vaccinated suckers had the highest levels of pcv2 proteins in blood at one week of age. the author suggests that in these suckers this was caused by pcv2 infection, and that these viruses mainly stemmed from sow-to-newborn transmission [68]. after inoculation with the pcv2 subunit vaccine, the capsid protein and the orf3 protein gradually decreased. however, these proteins were detected again in gilts and sows (≥34 weeks of age). that means the levels of pcv2 viral proteins (the capsid protein or the orf3 protein) in gilts and sows were higher than those of pigs at 9-24 weeks of age. to put it another way, vaccinated pigs showed reduced viremia compared to non-vaccinated suckers, and the immunity could not continue to protect vaccinated pigs after 32 weeks post-vaccination.
since pcv2 is highly resistant to environmental conditions, being able to remain in the farm environment and thus represent a risk for infection maintenance [69], even mass pcv2 vaccination (without implementing further farm management practices or biosafety measures) was not able to clear out pcv2 infection, and the virus became detectable again when vaccination was stopped [12, 67]. another key point to mention is that pcv2-specific antibody responses play a role in the pcv2-infected herd. it is worth noting that there were two different antibody profiles in this pcv2-infected herd with the pcv2 vaccine. the data show that the pcv2-infected herd had a higher od 405 value of anti-capsid igg at age 1, 3, and ≥12 weeks, compared with 6-9 weeks. in addition, the pcv2-infected herd had a higher od 405 value of anti-orf3 igg at age 1 and ≥9 weeks, compared with 3-6 weeks. according to previous studies, newborn suckers received colostral antibodies from seropositive sows and showed various levels of maternally derived antibodies [48, 68]. these maternally derived antibodies might interfere with the viral protein-specific antibody response when suckers were immunized with the pcv2 subunit vaccine. it may be assumed that the artificial capsid protein (pcv2 vaccine) elicited anti-capsid igg production while maternally derived antibodies progressively degraded. the total level of anti-capsid igg decreased slowly but still remained high during the age of 1-6 weeks. this phenomenon could neutralize the pcv2 virus and prevent piglets from suffering pcv2 infection. in contrast with dual-source capsid proteins (pcv2 virus and subunit vaccine), the piglets' immune response to the orf3 protein was only caused by pcv2 virus infection. the total level of anti-orf3 igg decreased quickly at 1-3 weeks of age. then, it increased significantly at 6-9 weeks of age.
the weak correlation (r = 0.3771) between anti-capsid igg and anti-orf3 igg is worth mentioning. it might reflect two kinds of antibody response, one to virus infection and one to immunization. these data help to interpret the interaction of the viral proteins with the host immune system. according to previous reports, orf2 encodes a 30-kda protein that is involved in viral capsid formation and self-assembles into virus-like particles (vlp) [32]. the monomer structures were assembled into a vlp model consisting of 60 capsid subunits forming an icosahedron [70]. vaccination with this vlp (subunit vaccine) induced both humoral and cell-mediated responses against pcv2 [71, 72]. in contrast, orf3 encodes an 11.9-kda protein [28] that is non-structural and involved in pcv2-induced apoptosis [34, 37, 42]. until now, no report has described the orf3 protein being used alone as a vaccine against pcv2 to reduce viremia or pathological lesions in pcv2-infected herds. these differences have led to the two proteins being applied in different ways [36, 73]. in this study, the percentage of orf3-positive pbmcs was significantly lower than that of capsid-positive pbmcs. similarly, the antigen-elisa result shows that the amount of orf3 protein was less than the amount of capsid protein in pcv2-infected blood from suckers at one week of age. that might be because mab 7d3 recognizes only the orf3 protein (peptide p126) of the pcv2b strain (genbank: aac35332.1), which results in low detection. however, pcv2d (a pcv2b mutant strain) and pcv2b are still the predominant genotypes on farms in the field. in addition, similar percentages of pbmcs were positive by staining with these antibodies (mab 7d3 and anti-p53 protein rabbit polyclonal antibody) or by tunel assay. in contrast to the orf3 protein, detection of the capsid protein in pcv2-infected pbmcs could vary between antibodies because of different pcv2 strains [49, 74], conformational epitopes [75], or linear epitopes [49, 76].
previous studies indicated that antibody-binding residues were often on the exterior of the pcv2 capsid [48, 49, 70, 74, 77, 78]. although an epitope might be on the surface of a single capsid unit, it could be buried in the vlp and inaccessible to the antibody [79, 80]. in general, these results confirm that mab 7d3 recognized the native orf3 protein in pcv2-infected pbmcs. this study used mab 7d3 or its minimal linear epitope to design various immunoassays. these assays evaluate the viral load and the immune response to viral protein stimuli, and they may serve as surveillance tools for monitoring natural pcv2 infection in the herd. overall, the author generated anti-orf3 mabs against the orf3 peptide (residues 36-66) of pcv2. this study confirmed that the defective-ig mab (6d10) had a lambda light chain, and its molecular weight was about 33 kda. mab 7d3 contained heavy chains (γ 1 ) and kappa light chains, and it bound to one minimal linear epitope (hndvyislpitllhfpahfq kfsqpaeisdkr). the data show that 3-5% of pbmcs were positive for orf3 protein or p53 protein, whereas 78-82% of pbmcs were positive for anti-capsid peptide mab (1h3) staining. this study confirmed that the orf3 protein colocalized with the p53 protein in pcv2-infected pbmcs. the author devised the orf3 protein-elisa (anti-orf3 antisera and mab 7d3-based) to detect the orf3 protein in blood samples. the results show that the amount of orf3 protein was less than the amount of capsid protein in pcv2-infected blood from suckers at one week of age. the correlation between the orf3 protein and the capsid protein is worth noting (r = 0.7583). furthermore, the levels of anti-capsid igg and anti-orf3 igg could indicate the immune response of the pig herd, but the sample size needs to be considered. the author declares no conflict of interest.
a very small porcine virus with circular single-stranded dna characterization of novel circovirus dnas associated with wasting syndromes in pigs studies on epidemiology and pathogenicity of porcine circovirus experimental reproduction of severe wasting disease by co-infection of pigs with porcine circovirus and porcine parvovirus reproduction of lesions of postweaning multisystemic wasting syndrome by infection of conventional pigs with porcine circovirus type 2 alone or in combination with porcine parvovirus enteritis associated with porcine circovirus 2 in pigs porcine circovirus type 2 associated disease: update on current terminology, clinical manifestations, pathogenesis, diagnosis, and intervention strategies porcine circovirus type 2 (pcv2) infections: clinical signs, pathology and laboratory diagnosis porcine circovirus induces b lymphocyte depletion in pigs with wasting disease syndrome postweaning mulstisystemic wasting syndrome (pmws) in pigs. a review changes in peripheral blood leukocyte populations in pigs with natural postweaning multisystemic wasting syndrome (pmws) can porcine circovirus type 2 (pcv2) infection be eradicated by mass vaccination? 
comparative efficacy of three commercial pcv2 vaccines in conventionally reared pigs efficacy of the porcine circovirus 2 (pcv2) vaccination under field conditions evaluation of natural porcine circovirus type 2 (pcv2) subclinical infection and seroconversion dynamics in piglets vaccinated at different ages a new emerging genotype subgroup within pcv-2b dominates the pmws epizooty in switzerland epidemiology and horizontal transmission of porcine circovirus type 2 (pcv2) genotypic shift of porcine circovirus type 2 from pcv-2a to pcv-2b in spain from efficacy and future prospects of commercially available and experimental vaccines against porcine circovirus type 2 (pcv2) genetic variation analysis of chinese strains of porcine circovirus type 2 porcine circovirus type 2 (pcv2): genetic variation and newly emerging genotypes in china genetic analysis of porcine circovirus type 2 (pcv2) in queensland detection and genotyping of porcine circovirus 2 (pcv-2) and detection of porcine circovirus 3 (pcv-3) in sera from fattening pigs of different european countries genomic analysis of porcine circovirus type 2 from southern china genetic diversity and prevalence of porcine circovirus type 3 (pcv3) and type 2 (pcv2) in the midwest of the usa during global molecular genetic analysis of porcine circovirus type 2 (pcv2) sequences confirms the presence of four main pcv2 genotypes and reveals a rapid increase of pcv2d nucleotide sequence of porcine circovirus associated with postweaning multisystemic wasting syndrome in pigs the essential and nonessential transcription units for viral protein synthesis and dna replication of porcine circovirus type 2 molecular biology of porcine circovirus: analyses of gene expression and viral replication replacement of the replication factors of porcine circovirus (pcv) type 2 with those of pcv type 1 greatly enhances viral replication in vitro open reading frame 2 of porcine circovirus type 2 encodes a major capsid protein a porcine 
circovirus type 2 (pcv2) mutant with 234 amino acids in capsid protein showed more virulence in vivo, compared with classical pcv2a/b strain characterization of a previously unidentified viral protein in porcine circovirus type 2-infected cells and its role in virus-induced apoptosis the orf3 protein of porcine circovirus type 2 is involved in viral pathogenesis in vivo attenuation of porcine circovirus 2 in spf piglets by abrogation of orf3 function the orf3 protein of porcine circovirus type 2 interacts with porcine ubiquitin e3 ligase pirh2 and facilitates p53 expression in viral infection pirh2, a p53-induced ubiquitin-protein ligase, promotes p53 degradation pcdh10, a novel p53 transcriptional target in regulating cell migration maf transcriptionally activates the mouse p53 promoter and causes a p53-dependent cell death p53 and apoptosis: it's not just in the nucleus anymore porcine circovirus type 2 orf3 protein competes with p53 in binding to pirh2 and mediates the deregulation of p53 homeostasis the orf3 protein of porcine circovirus type 2 promotes secretion of il-6 and il-8 in porcine epithelial cells by facilitating proteasomal degradation of regulator of g protein signalling 16 through physical interaction orf3 of porcine circovirus 2 enhances the in vitro and in vivo spread of the of the virus t lymphocyte epitope mapping of porcine circovirus type 2. 
viral immunol characterization of specific antigenic epitopes and the nuclear export signal of the porcine circovirus 2 orf3 protein analysis of putative orf3 gene within porcine circovirus type 2 peptides mimicking viral proteins of porcine circovirus type 2 were profiled by the spectrum of mouse anti-pcv2 antibodies versatile carboxyl-terminus of capsid protein of porcine circovirus type 2 were recognized by monoclonal antibodies with pluripotency of binding the porcine circovirus type 2 nonstructural protein orf3 induces apoptosis in porcine peripheral blood mononuclear cells light chain multiple myeloma: an evaluation of its biochemical investigations international myeloma working group guidelines for serum-free light chain analysis in multiple myeloma and related disorders treatment of al amyloidosis suppression of lymphocyte apoptosis in spleen by cxcl13 after porcine circovirus type 2 infection and regulatory mechanism of cxcl13 expression in pigs apoptosis in lymphoid organs of pigs naturally infected by porcine circovirus type 2 immunosuppression in postweaning multisystemic wasting syndrome affected pigs pathology of african swine fever: the role of monocyte-macrophage atrophy of primary lymphoid organs induced by marek's disease virus during early infection is associated with increased apoptosis, inhibition of cell proliferation and a severe b-lymphopenia transcriptomic characteristics of bronchoalveolar lavage fluid and peripheral blood mononuclear cells in covid-19 patients apoptosis and proliferative activity in lymph node reaction in postweaning multisystemic wasting syndrome (pmws). 
veterinary immunology and immunopathology porcine circovirus type 2 (pcv2) causes apoptosis in experimentally inoculated balb/c mice granzyme b-activated p53 interacts with bcl-2 to promote cytotoxic lymphocyte-mediated apoptosis calcium dependence of glucocorticoid-induced lymphocytolysis stress-induced enhancement of activity of lymphocyte lysosomal enzymes in pigs of different stress-susceptibility studies on porcine circovirus type 2 vaccination of 5-day-old piglets inactivated pcv2 one shot vaccine applied in 3-week-old piglets: improvement of production parameters and interaction with maternally derived immunity ten years of pcv2 vaccines and vaccination: is eradication a possibility? fetal infections and antibody profiles in pigs naturally infected with porcine circovirus type 2 (pcv2) environmental distribution of porcine circovirus type 2 (pcv2) in swine herds with natural infection the 2.3-angstrom structure of porcine circovirus 2 one dose of a porcine circovirus 2 (pcv2) sub-unit vaccine administered to 3-week-old conventional piglets elicits cell-mediated immunity and significantly reduces pcv2 viremia in an experimental model immunogenicity of empty capsids of porcine circovius type 2 produced in insect cells rüütel boudinot, s. 
porcine circovirus type 2 orf3 protein induces apoptosis in melanoma cells antigenic subtyping and epitopes' competition analysis of porcine circovirus type 2 using monoclonal antibodies identification of one critical amino acid that determines a conformational neutralizing epitope in the capsid protein of porcine circovirus type 2 fine mapping of antigenic epitopes on capsid proteins of porcine circovirus, and antigenic phenotype of porcine circovirus type 2 in vitro and in silico studies reveal capsid-mutant porcine circovirus 2b with novel cytopathogenic and structural characteristics structure analysis of capsid protein of porcine circovirus type 2 from pigs with systemic disease antibody recognition of porcine circovirus type 2 capsid protein epitopes after vaccination, infection, and disease viruses 2020, 12 key: cord-017999-saxwqc2j authors: travers, andrew a. title: gene regulation by hmga and hmgb chromosomal proteins and related architectural dna-binding proteins date: 2005 journal: dna conformation and transcription doi: 10.1007/0-387-29148-2_11 sha: doc_id: 17999 cord_uid: saxwqc2j the eukaryotic abundant high mobility group hmga and hmgb proteins can act as architectural transcription factors by promoting the assembly of higher-order protein-dna complexes which can either activate or repress gene expression. the structural organisation of both classes of protein is similar with either a single or repeated dna binding domain preceding a short negatively charged c-terminal tail. in the hmgb class of proteins the hmg dna-binding domain binds non-specifically and introduces a sharp bend into dna whereas the at-hook in the hmga protein binds preferentially to a/t rich regions of dna and stabilises a b-dna structure. the acidic tails are hypothesised to facilitate the interaction of the proteins with nucleosomes by binding to the positively charged histone tails.
both classes of protein also interact with a large number of transcription factors that bind to specific dna sequences. the eukaryotic nucleus contains three classes of abundant chromatin-associated proteins, the high mobility group proteins of the hmga, hmgb and hmgn classes (originally termed hmgi/y, hmg 1/2 and hmg 14/17; for recent nomenclature changes see ref. 1), so called because they were initially identified on the basis of their rapid migration through starch gels. the hmga and hmgb classes of chromosomal proteins in general share some common characteristics, notably a conserved acidic region and the ability to interact with several different transcription factors. a major role is to organise the structure of dna-protein complexes in the context of chromatin. in both the eukaryotic nucleus and the bacterial nucleoid the trajectory of the dna double helix is normally tightly constrained so that not only can the dna be compacted without entanglement but also to provide an appropriate environment for the enzymatic machinery involved in dna transcription, replication and recombination. this organisation is normally effected by abundant dna binding proteins, termed architectural dna-binding proteins, that either induce dna bending or facilitate the formation of multicomponent dna-protein complexes. the term 'architectural' in this context implies that the protein is required for organising dna but the proteins that fall within this definition are often otherwise functionally distinct and would include, for example, the histone octamer, abundant eukaryotic chromosomal proteins, such as the hmga and hmgb proteins, abundant proteins associated with the prokaryotic nucleoid, such as fis, h-ns, ihf and hu, as well as bona fide transcription factors exemplified by the tata-binding protein (tbp). dna conformation and transcription, edited by takashi ohyama. ©2005 eurekah.com and springer science+business media.
some of these proteins have more than one architectural function. for example, fis can stabilise particular configurations of supercoiled dna plasmids and also act to promote the assembly and activity of transcription, replication and recombination complexes. many of the more generalised architectural proteins may be regarded as facilitators and are often not essential for viability, while some more 'specialised' proteins, such as the histone octamer and tbp, are clearly essential. the bending of dna by transcription factors and by other protein complexes is a major component in the establishment of the overall morphology of protein-dna complexes. this bending is usually a consequence of indirect readout, a mechanism by which the selectivity of binding is dependent not on making direct contacts between the amino acids and bases, i.e., direct readout, but instead on the physicochemical properties of the dna molecule itself. recognition of dna by transcription factors often involves both direct and indirect readout. however, the principles of indirect readout are well illustrated by the histone octamer which, although not a transcription factor itself, completely lacks direct contacts between the amino acid side chains and the bases of the bound dna. the octamer binds 147 bp of dna which are wrapped in a left-handed superhelix with a total curvature of approximately 10 radians. this curvature contrasts with the stiffness of dna in solution where the average persistence length (p), defined as the length over which the average deflection of the polymer axis caused by thermal agitation is one radian, is 140-150 bp, i.e., the same length as that bound by the histone octamer. for dna molecules that are not anisotropically curved the affinity of the dna for the octamer is directly proportional to the flexibility (the inverse of the stiffness). however, the dependence of the binding energy on p is some 10-fold lower than the dependence of the bending energy in solution on p.
this implies that the histone octamer increases the apparent flexibility substantially to compensate for the average increase in dna curvature on binding. how might this change in flexibility be effected? the histone core provides a dna binding surface in the form of a positively charged ramp. on binding to this ramp the negative charges on one side of the dna are neutralised. this asymmetric neutralisation, which can be mimicked in free dna, creates an imbalance in charge distribution on opposite sides of the double helix so that repulsion between the opposing sugar-phosphate backbones on the unneutralised side facilitates bending by increasing the width of the grooves. concomitantly, the reduction in this repulsion on the inside of the bend permits greater freedom in the motions of the base-pairs, with a corresponding reduction in the width of the grooves. the greater flexibility of the motions between base-pairs is reflected in the periodic variation of twist and roll with groove width such that the ranges of values assumed for both are substantially larger than the corresponding ranges observed for dna molecules free in solution. the correlation between flexibility and affinity for the histone octamer only applies strictly when a dna molecule does not possess intrinsic anisotropic curvature. when it does the affinity may be relatively higher or lower. for example, the intrinsically curved tata dna sequence whose curvature is compatible with the surface of the histone octamer binds with an affinity that would normally be characteristic of a substantially more flexible isotropic binding site.
in this case binding is favoured by the lower entropic penalty on binding relative to an isotropically flexible molecule. however, if the intrinsic bend is too great and therefore less compatible with the protein binding surface the affinity is reduced relative to an isotropically flexible molecule. an extension of this principle of asymmetric alteration of the ionic environment of dna is provided by the transcription factor tbp and the hmg-domain, found in hmgb proteins, a class of abundant chromosomal proteins, and certain transcription factors such as sry and lef-1. the hmgb proteins consist essentially of a small l-shaped protein domain with a cluster of hydrophobic residues on its inner surface and an extended unstructured basic region. when these proteins bind to dna they produce a bend of 95-120° over about six base-pairs and decrease both the axial and torsional stiffness. on the outer surface of the bend the hydrophobic 'wedge' towards the apex of the l binds in and widens the minor groove, concomitantly untwisting the dna. this effect is believed to be facilitated by a local reduction in the dielectric constant which increases the repulsion between opposing sugar-phosphate backbones on the approach of the protein to the dna. at the same time the basic region neutralises the phosphates bounding the major groove on the inside of the bend, thus decreasing the repulsive forces and permitting the narrowing of the groove. additionally the protein inserts, or intercalates, hydrophobic amino acids into either a single base-step or into two base-steps that are themselves separated by a single base-step. the extent to which this intercalation increases or simply stabilises the induced bend is unclear. the bend induced by the intercalation contrasts with the smooth dna bending induced by the histone octamer since the intercalation effectively introduces a kink in the dna such that the stacking interactions between adjacent base-pairs are very substantially reduced.
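as a rough numerical illustration of the stiffness argument, the sketch below estimates the elastic cost of bending free dna smoothly, using the textbook worm-like chain expression E/kT = P·θ²/(2L). this expression, the choice of p = 150 bp, and the treatment of an hmg-box bend as 100° spread evenly over 6 bp are assumptions made for illustration only; they are not calculations from the chapter, and the function name is hypothetical.

```python
import math


def wlc_bending_energy_kt(persistence_bp, length_bp, angle_rad):
    """Energy (in units of kT) to bend a uniform elastic rod of length
    length_bp smoothly through angle_rad, using the worm-like chain
    approximation E/kT = P * theta^2 / (2 * L)."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return persistence_bp * angle_rad ** 2 / (2.0 * length_bp)


# Assumed persistence length, upper end of the 140-150 bp range in the text.
P_BP = 150.0

# Nucleosomal wrapping: 147 bp bent through about 10 radians (from the text).
nucleosome_kt = wlc_bending_energy_kt(P_BP, 147.0, 10.0)

# HMG-box bend: assumed 100 degrees (within the 95-120 degree range quoted)
# spread smoothly over 6 bp; the real bend is kinked, so this is an upper-
# bound style estimate for a purely elastic, unkinked deformation.
hmg_box_kt = wlc_bending_energy_kt(P_BP, 6.0, math.radians(100.0))

print(f"nucleosome wrap: {nucleosome_kt:.1f} kT")
print(f"hmg-box bend:    {hmg_box_kt:.1f} kT")
```

under these assumptions both deformations would cost tens of kt in free dna, which is in line with the argument that the proteins must change the local flexibility (by asymmetric charge neutralisation or by intercalation) rather than simply bend a stiff rod.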
in the tata-binding protein this same principle of hydrophobic interactions predominates. here two pairs of phenylalanine residues are intercalated at steps separated by 6 bases, kinking the dna by ~45° at each intercalation site. between these pairs of phenylalanine residues a hydrophobic surface rests snugly within the minor groove. again the minor groove is widened and untwisted. however, unlike the hmgb proteins there is no charge neutralisation on the opposing major groove face of the bent dna and indeed the sharpness of the induced curvature is less than that for the hmgb proteins. in other transcription factors there is substantial variation in the degree of induced bending. the escherichia coli cap (aka crp) factor is a good example of mixed direct and indirect readout. this dimeric protein induces a bend of ~45° per monomer. in this case the major bend occurs where the recognition helix of the helix-turn-helix motif binds in the major groove on the inside of the bend, concomitantly making direct contacts with the dna bases and neutralising the sugar-phosphate backbone in the immediate vicinity. flanking the central recognition palindrome is a basic ramp which binds dna and increases the overall dna bend by indirect readout in a manner analogous to the histone octamer. although one of the principal roles of dna bending in the living cell is to maintain the compaction of dna, it also has important functions in transcriptional control and, in particular, in the assembly of regulatory complexes. a major consequence of introducing a tight bend into dna is to bring dna sequences which are far apart on a linear representation of a dna molecule into close spatial proximity. this effect, which is also characteristic of plectonemically supercoiled dna, is mediated in chromatin by the hmg-domain transcription factors, such as tcf-1, lef-1 and sry.
in the case of tcf-1 acting at the enhancer of the tcr promoter, the bend induced by the factor brings together a normally unstable complex of the ets-1 and pebp2a dna-binding proteins and atf/creb activator proteins to form a stable complex. this example is probably a particular case of the more general phenomenon in which the dna between a transcription factor and its target protein partner must be bent for protein-protein contacts to occur. the ease of bending will depend critically on the distance and the helical phase difference between the binding sites of the factor and its target. normally, unless one or both of the partner proteins are flexible, contact will be facilitated when the binding sites are in helical phase, primarily because of the constraints on the torsional flexibility of dna. however, at least in vitro, the constraints imposed by both torsional and axial rigidity can be overridden by the abundant dna-bending proteins of the hmgb class. in the presence of one of these proteins a requirement for an integral number of helical turns between binding sites is no longer crucial. furthermore the involvement of the hmgb protein in the formation of the complex need only be transient. to what extent are variations in dna flexibility reflected in genomic organisation? an excellent example of the dependence of biological function on dna structure is provided by the genome of the enteric bacterium e. coli. in this organism the strongest promoters for dna transcription, often those directing the synthesis of rrna and trna, are almost invariably associated with a/t rich, and hence flexible, dna sequences extending upstream for 100-300 base-pairs from the transcription startpoint. the activity of many of these promoters is strongly dependent on a high negative superhelical density stored in the dna.
this would in principle favour both dna untwisting at sequences such as tataat close to the startpoint and also left-handed dna wrapping around the protein complex responsible for initiating transcription. in many of these highly active promoters the dna sequence also imparts curvature to the region, a feature that correlates with the presence of multiple activating binding sites for the abundant dna-bending protein fis (factor for inversion stimulation). these sites are often organised in helical phase such that the binding of fis could constrain a negative superhelical loop. indeed in the rrna p1 regulatory region the presence of a far upstream fis site centred at position -222 from the transcription startpoint results in the constraint of an additional supercoil in the initiation complex. a primary function of this fis-induced dna looping is to promote the wrapping of dna around the rna polymerase prior to the initiation of transcription, a phenomenon that has also been proposed for the activation of the lac promoter by the cap dna-bending transcription factor, and consequently to facilitate the extended wrapping characteristic of the open complex. in turn the fis-induced constraint of negative superhelicity buffers this type of promoter against changes in the unconstrained superhelical density. in other such promoters an alternative model proposes that the upstream activating sequence contains regions that are highly susceptible to dna untwisting. for both dna wrapping and untwisting in the upstream region, the models predict that the topological unwinding is transmitted to the tata sequence and promotes its untwisting. dna bending may also be required for the establishment of repressive regulatory complexes. here, a dna loop is often formed by the binding of an oligomeric repressor to two sites that are distant from each other along the dna sequence.
this loop, which can be as tightly bent as nucleosomal dna, prevents the binding of rna polymerase to the regulated promoter. examples of this mode of regulation include repression by the arac, laci and galr proteins. the vertebrate hmga proteins are small proteins of ~100-110 amino acids and contain tandem copies, usually three, of a characteristic dna-binding domain, the at-hook, together with a c-terminal acidic region (fig. 1). the at-hook is not restricted to the hmga proteins as such, as it is also found in a related drosophila chromosomal protein, d1, which contains multiple copies of the motif, in the motor subunits of various atp-dependent chromatin remodelling complexes and in certain transcription factors that also contain a primary sequence-specific dna-binding domain, where it is assumed to act as an auxiliary dna-binding element. the at-hook is an unstructured short motif with the consensus sequence arg/lys-pro-arg-gly-arg-gly-pro-arg/lys and selectively binds in the minor groove of a/t-rich regions of dna. the central arg-gly-arg core adopts an extended conformation deep within the groove with the arginine side chains making extensive hydrophobic contacts along the base of the groove. the proline residues change the trajectory of the backbone, allowing the basic residues flanking the core to mediate electrostatic and hydrophobic contacts with the dna backbone. when bound to dna the surface of the core motif contacting the dna is concave and resembles that of the dna binding drug netropsin, which has a similar selectivity for a/t-rich sequences. for both netropsin and an at-hook one consequence of dna binding is a modest widening of the minor groove with a concomitant stabilisation of b-dna structure. in some cases this widening results in a change in the direction of bending of dna, particularly when that bending is dependent on a narrow minor groove width in a/t-rich sequences.
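the at-hook consensus quoted above can be turned into a simple pattern search. the sketch below is only an illustration: it encodes the consensus exactly as written in the text (arg/lys-pro-arg-gly-arg-gly-pro-arg/lys) in one-letter code, the function name is hypothetical, and the test sequence is an invented toy string, not a real hmga protein.

```python
import re

# Consensus as stated in the text, in one-letter amino acid code:
# [R or K]-P-R-G-R-G-P-[R or K]
AT_HOOK_PATTERN = re.compile(r"[RK]PRGRGP[RK]")


def find_at_hooks(protein_seq):
    """Return (0-based start, matched motif) for each non-overlapping
    occurrence of the consensus in an upper-cased protein sequence."""
    seq = protein_seq.upper()
    return [(m.start(), m.group()) for m in AT_HOOK_PATTERN.finditer(seq)]


# Invented toy sequence containing two copies of the consensus.
toy_seq = "MSEK" + "RPRGRGPK" + "GSAA" + "KPRGRGPR" + "TT"
hooks = find_at_hooks(toy_seq)
for start, motif in hooks:
    print(start, motif)
```

a strict match of this kind would miss diverged copies of the motif, so a real survey would more likely use a position-specific profile than a fixed regular expression.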
thus a small intrinsic bend of ~20° towards the minor groove in the ifn-β enhancer is reversed on binding hmga1. this could then facilitate recognition of the opposing major groove by transcription factors binding to specific sequences. nevertheless, although the hmga proteins induce only small changes in dna structure, they bind tightly to dna ligands with distorted or unusual features. these include supercoiled dna, four-way junctions and base-unpaired regions of at-rich dna (reviewed in ref. 34). strikingly, in vitro these proteins can also introduce supercoils into relaxed dna, possibly by stabilising cross-overs and thereby stabilising dna loops. this ability has been suggested as an explanation of the observation that in vitro hmga1 represses the chick globin β gene promoter in the absence of the 3' enhancer but strongly activates transcription in its presence, regardless of whether the substrate is free dna or is assembled into nucleosomes. this stabilisation of loops could be mediated by the presence of multiple at-hooks on each protein. in mammals the hmga proteins are encoded by two functional genes, hmga1 and hmga2. alternative splicing of the transcripts of these genes increases the variety of protein products, of which the most abundant are hmga1a (hmg-i) and hmga1b (hmg-y). these proteins appear to perform a variety of functions, of which the most studied are related to chromatin structure and to the facilitation or inhibition of transcription factor binding. in vitro hmga proteins bind to nucleosomes, notably at the exit and entry points to the nucleosome core particle where they are in close proximity to histones h2a, h2b and h3. the proteins can also bind to internal sites where they can induce local changes in the rotational setting of the wrapped dna.
the binding to the nucleosomal dna is mediated by the at-hooks but (by analogy to the hmgb class of proteins) it is conceivable that the acidic tail may also be involved in contacting the histone octamer and could perform a similar role to that of the hmgb proteins. in mitotic chromosomes the hmga proteins are associated with particular bands and these proteins are localised to the base of large chromatin loops in close proximity to scaffold-attachment regions (sars). as a consequence it has been suggested that the hmga proteins are involved in the maintenance of the condensed mitotic chromosome structure in these regions. evidence supporting this view was adduced from the observation that synthetic 'math' proteins containing at-hooks interfere with chromosome condensation during mitosis. in contrast another model suggests that co-operative binding of hmga molecules to a looped chromatin domain in interphase nuclei will facilitate the formation of an 'open' chromatin structure that is competent for transcription by competing with histone h1. although some experiments demonstrate that the math proteins can counteract the spreading of heterochromatin, as shown in particular by suppressing position-effect variegation in flies, the mechanism by which this is accomplished remains to be established. however, the expression of the hmga proteins is strongly correlated with cell growth and is characteristically high in neoplastically transformed cells. hmga1 has been shown to interact directly with a large variety of transcription factors including at-1, atf-3, nf-y, irf-1, srf, nf-kb, p50, tst-l/oct-6 and c-jun. in some cases the protein regulates the formation of an enhanceosome.
thus at the virus-inducible β-interferon enhancer a complex containing both hmga proteins and transcription factors forms and then acts to recruit rna polymerase ii and its associated general transcription factors. in other cases hmga proteins can block enhanceosome formation. this modulation of transcription factor binding may be integrated with the regulation of chromatin organisation. thus hmga1a enhances the binding of atf-3 to a site at the edge of a nucleosome positioned on the hiv-1 promoter. this combination of bound proteins can then recruit the remodelling complex hswi/snf. the interactions of hmga with these factors can be modulated by covalent modifications including phosphorylation and methylation. hmgb proteins are characterised by the hmg-box, a dna-binding domain specific to eukaryotes. a major characteristic of this domain is to introduce a sharp bend into dna (fig. 2). accordingly the domain also binds preferentially to a variety of distorted dna structures, especially those in which the distortion itself induces a bend. these include negatively supercoiled dna, small dna circles, cruciforms, dna bulges and cisplatin-modified dna. the hmg-box domain is also found in several related types of protein, for example transcription factors such as sry and lef-1, and subunits of many chromatin remodelling complexes. all these proteins are predominantly nuclear and appear to act primarily as architectural facilitators in the manipulation of nucleoprotein complexes; for example, in the assembly of complexes involved in recombination and the initiation of transcription, as well as in the assembly and organisation of chromatin.
the archetypal hmgb proteins are highly abundant (--10-20 copies per nucleosome in the mammalian nucleus) and often occur in two major forms, hmgb1 and hmgb2, originally termed hmg1 and hmg2, in vertebrates. the two distinguishing features of these highly homologous proteins are two similar, but distinct, tandem hmg-box domains (a and b), and a long acidic c-terminal 'tail', consisting of ~30 (hmg1) or 20 (hmg2) acidic (aspartic and glutamic acid) residues, linked to the boxes by a short, predominantly basic linker (fig. 1). however, the most abundant hmg-box domain proteins in saccharomyces cerevisiae, nhp6ap and nhp6bp (non-histone proteins 6a and 6b respectively), contain only a single hmg box and lack an acidic tail. likewise the two major hmg-box domain proteins in drosophila melanogaster, hmg-d and hmg-z, have only a single hmg box but, unlike the yeast proteins, contain a short c-terminal acidic tail in addition to a basic region (fig. 1). these abundant proteins in yeast and drosophila may be the general functional counterparts of hmgb1 and 2 in vertebrates. the precise functions of the chromosomal hmgb proteins in vivo remained obscure for a long time. however, there is now substantial evidence that they interact directly both with transcription factors and with the histone octamer. these interactions can affect transcription factor access to chromatin either directly or by promoting chromatin remodelling. in the latter case the proteins may facilitate repression or activation. there are two established cases in which the assembly of nucleoprotein complexes containing sequence-specific dna-binding proteins is promoted by the dna-bending properties of hmgb1 and 2, i.e., the proteins have a classical architectural role.
first, in v(d)j recombination the lymphocyte-specific proteins rag1 and rag2 (human recombination activating genes 1 and 2) appear to recruit hmgb1 and 2 to the appropriate sites in chromatin, presumably by protein-protein contacts with the rag1 homeodomain. here they ensure the "12/23 rule". this requires that v(d)j recombination occurs only between specific recombination signal sequences (rss). each rss is made up of a conserved heptamer and nonamer sequence separated by a non-conserved spacer of either 12 or 23 base pairs. hmgb1 (in concert with rag1 and rag2) facilitates recombination, probably by bending the dna between the two conserved sequences spaced by 23 bp and stabilising a nucleoprotein complex. the hmgb protein plays the dual role of bringing critical elements of the 23-rss heptamer into the same phase as the 12-rss to promote rag binding and of assisting in the catalysis of 23-rss cleavage. recent footprinting experiments indicate that the hmgb1 (or hmgb2) protein is positioned 5' of the nonamer in 23-rss complexes, interacting largely with the side of the duplex opposite the one contacting the rag proteins. a second instance in which an abundant hmgb protein may facilitate nucleoprotein complex assembly is in the formation of an enhanceosome containing the epstein-barr virus replication activator protein zebra and hmgb1. the two proteins bind cooperatively, hmgb1 binding to, and presumably bending, a specific dna sequence between two zebra recognition sites. bending of dna by hmgb1 and 2 has also been invoked to explain the essential role of these proteins in initiating dna replication by loop formation at the mvm (minute virus of mice) parvovirus origin of replication. in vitro, hmgb proteins can enhance the binding of various transcription factors (e.g. adenovirus mltf, oct-1 and 2, hoxd9, p53, steroid hormone receptors, rel proteins, p73, dof2 and the epstein-barr activator rta) to their cognate dna binding sites (reviewed in ref gg).
similarly rat ssrp1 has been shown to facilitate the dna binding of serum response factor, and human ssrp1 is associated with the γ isoform of p63 in vivo at the endogenous mdm2 and p21waf1 promoters. in most of these cases, the interaction of the hmg protein with the transcription factor has been detected in vitro and could, in principle, serve as the mechanism for recruitment of hmgb1 or 2 to particular dna sites. in some cases transfection experiments indicate functional interactions in vivo. direct interactions between nhp6p and the gal4p and tup1p transcription factors have also been inferred in vivo by a split-ubiquitin screen and confirmed by a pull-down assay. although the interactions demonstrated in vitro so far involve an hmgb protein and a single transcription factor, it is entirely possible that in vivo, in a natural regulatory context, the bending of dna by hmgb1 and 2 could allow the recruitment of a second transcription factor to the complex, in a manner analogous to the action of sequence-specific hmg-box transcription factors such as lef-1 in the enhanceosome at the t cell receptor alpha (tcra). hmgb1 may play a catalytic, chaperone role, since it does not appear to be stably incorporated into the final complex. although a role for hmgb1- and hmgb2-induced dna bending in the facilitated binding of transcription factors, while entirely plausible, has not been directly established, it is strongly suggested by the ability of hu to substitute for hmgb1 in stimulating the binding of the epstein-barr virus transactivator rta to its cognate binding sites. a possible role for an hmgb1-induced change in dna conformation in the facilitation of transcription factor binding is also suggested by the observation that hmgb1 promotes binding of p53 to linear dna but not to gg bp dna circles. however, in this case the data do not distinguish between possible effects of dna bending or untwisting.
the biological roles of the hmgb proteins have been studied using gene knock-outs. in mice the loss of hmgb1 but not of hmgb2 is lethal, although in the former knock-out there are pleiotropic effects on glucose metabolism while in the latter spermatogenesis is impaired. this suggests a functional redundancy between members of the hmgb1 and 2 family. a similar situation occurs with nhp6ap and nhp6bp in yeast. however, the different phenotypes of the hmgb1 and hmgb2 null mice probably reflect specific roles for the two proteins in different tissues. in s. cerevisiae the transcriptional effects of nhp6 are not general but gene-specific. at the cha1 locus, loss of nhp6 results both in an increase in the basal level of transcription and in a substantial decrease in the induced level. this suggests an effect at the level of chromatin. the cha1 regulatory region contains a positioned nucleosome which occludes the tata box under non-inducing conditions. on induction the tata region becomes accessible. however, in the mutant strain, consistent with the increased basal level of transcription, the chromatin structure of the tata region in the uninduced state is similar to that in the induced wild-type strain. nhp6 thus appears to be required for establishment of the organised chromatin structure characteristic of the uninduced state. the rsc remodelling complex is also required for this process, suggesting that rsc and nhp6p may cooperate to remodel chromatin. further insights into how nhp6ap and nhp6bp function were provided by studies on the ho gene. loss of nhp6 function can be suppressed by mutations that increase nucleosome accessibility and mobility, and enhanced by those with the opposite effect.
mutations both in the sin3 and rpd3 genes, encoding components of a histone deacetylase complex, and in sin4, partially restore wild-type function in cells lacking both nhp6ap and nhp6bp, while loss of the histone acetylase gcn5p (also a component of the saga histone acetylase complex) in the same cells results in a more severe phenotype. rpd3p and gcn5p contribute to the dynamic balance between histone acetylation and deacetylation. both histone acetylation and the sin (swi/snf independence) phenotype are correlated with chromatin unfolding and/or enhanced nucleosome accessibility, while histone deacetylation would be expected to favour folding. on this argument one role of the nhp6p proteins would be to antagonise folding and possibly promote nucleosome accessibility. the ability of the hmgb proteins to promote both transcription factor binding to their cognate sites and chromatin remodelling implies that these activities could be coordinated to alter chromatin structure in the vicinity of a factor binding site. like the hmga proteins, the abundant hmgb proteins bind to nucleosomes at sites close to the dna exit and entry points. an insight into how hmgb proteins might alter the accessibility of nucleosomal dna was provided by the observation that hmgb1 could facilitate the binding and subsequent remodelling function of the acf remodelling complex in vitro. further observations showed that hmg-d, a drosophila hmgb protein, when bound to nucleosome core particles increased the accessibility of nucleosomal dna to restriction endonucleases at particular sites. these sites were asymmetrically distributed, one site being located at one end of the bound dna and the other in the vicinity of the nucleosome dyad. this effect required the acidic tail of the hmgb protein: without it hmg-d reduced accessibility at all sites tested on the nucleosome.
this result argues that certain hmgb proteins can alter the structure of nucleosomes and presumably do so by interacting with an available basic region of the histone octamer. from the distribution of the sites with increased accessibility a prime candidate would be one (but not both) of the n-terminal tails of histone h3 or one of the c-terminal tails of histone h2a. it is important to note that the yeast nhp6 proteins lack an acidic region and so could not interact with histones directly in this way. however, they can associate with two other proteins, pob3p and spt16p, to form a complex, spn, involved in chromatin remodelling. both these proteins contain extensive acidic regions and so, in principle, could substitute for the lack of an acidic region in nhp6p. the abundant hmga and hmgb chromosomal proteins share several common features. both interact with nucleosomes, both can also bind a set of transcription factors, both are involved in enhanceosome formation and both can facilitate the recruitment of chromatin remodelling complexes. interestingly the hmgn class of hmg proteins shares with the hmga and hmgb classes the ability to interact with nucleosomes and also possesses a c-terminal region with a net negative charge.
key: cord-257802-vgizgq2y authors: uttamchandani, mahesh; neo, jia ling; ong, brandon ngiap zhung; moochhala, shabbir title: applications of microarrays in pathogen detection and biodefence date: 2008-11-12 journal: trends biotechnol doi: 10.1016/j.tibtech.2008.09.004 sha: doc_id: 257802 cord_uid: vgizgq2y the microarray is a platform with wide-ranging potential in biodefence. owing to the high level of throughput attainable through miniaturization, microarrays have accelerated the ability to respond in an epidemic or crisis.
extending beyond diagnostics, recent studies have applied microarrays as a research tool towards understanding the etiology and pathogenicity of dangerous pathogens, as well as in vaccine development. the original emphasis was on dna microarrays, but the range now includes protein, antibody and carbohydrate microarrays, and research groups have exploited this diversity to further extend microarray applications in the area of biodefence. here, we discuss the impact and contributions of the growing range of microarrays and emphasize the concepts that might shape the future of biodefence research. natural outbreaks and the wanton use of pathogenic organisms for acts of terror have had tremendous impact on human populations, especially when considering the events over the past decade. modern travel and trade have further dissipated traditional geographical boundaries that once curbed the spread of disease. infectious agents capable of spreading from human to human, such as the severe acute respiratory syndrome (sars) coronavirus, have shown us just how quickly (in a matter of weeks) a threat anywhere could become a threat everywhere [1]. the lessons learnt from sars, avian flu and the anthrax letter attacks have emphasized the need for improved preparedness for, response to and treatment of both known and emergent biological threats [2, 3]. this has also prompted heightened funding initiatives worldwide to build up national and international biodefence capabilities [3]. among the systems and protocols to be put in place, platforms that facilitate the rapid and accurate identification of agents are particularly vital, both in confirming whether an attack has occurred and in instituting prompt measures to secure public health. the intrinsic ability of microarrays to perform multiplexed, low-volume and sensitive biological assays in a highly scalable manner is a significant advantage in biological threat analysis [4].
glossary
biodefence: defensive measures against biological threats, including natural/emerging pathogens and bioterror agents, that have significant potential to endanger public health.
detection: identifying the presence of target pathogen(s) from clinical or environmental samples.
diagnostics: tests used to detect a medical condition, for example to test for the causative pathogen responsible for an infection.
sandwich immunoassay: a biochemical assay for detecting the presence and/or abundance of a target substance using the antibody-antigen reaction. two antibodies are used; the first is immobilized and the other, free antibody carries a reporter group, thereby providing a positive readout when the target is recognized and 'sandwiched' between the two antibodies.
sensitivity: probability of a positive result when the pathogen is indeed present; high sensitivity is related to a low type 2 error.
serovars: a group of microorganisms distinguished by the presence of specific surface antigens.
specificity: probability of a negative result when the pathogen is not present; high specificity is related to a low type 1 error.
tiling arrays: a type of dna microarray in which short probe segments that have been designed to cover the entire genome are used. the extent of probe overlap will translate to the mapping resolution and might range from highly overlapped probes across each individual nucleotide base (for resequencing applications) to non-overlapping probes (for genome-wide expression analysis).
vaccinia virus: a poxvirus that is closely related to the virus that causes cowpox. it was the first human vaccine and has been extensively used for vaccination against smallpox.
corresponding author: uttamchandani, m. (mahesh@dso.org.sg).
developed in the early to mid 1990s, dna microarrays stirred a technological revolution that continues to propel genomics research today [5]. applications included the identification and comparison of mrna or dna from across tissues or organisms by hybridization against thousands of oligomeric dna probes immobilized on planar surfaces, such as glass slides (figure 1). this provided researchers with an unprecedented ability to quantify genome-wide differences in gene expression or sequence changes using minimal amounts of sample. in the context of biodefence, these studies have dramatically extended our capability beyond merely detecting known pathogens; now we are able to rapidly profile and characterize pathogens, as well as identify novel strains or 'genetic islands' that manifest changes in virulence (or the evolution of resistance). such comparative phylogenomic profiling using microarrays facilitates the understanding of pathogen diversification by providing a bird's-eye view of the gene content present or absent within a given microbial genome [6]. microarray technology has also been brought outside the laboratory to the point-of-care, a development that has widening implications for threat detection. dna-based detection systems generally rely on the ability of pcr to amplify and fluorescently tag the tiny amounts of target dna present in the specimen. advances in miniaturizing this initial pcr step, for instance the development of micro-pcr (μpcr), followed by hybridization and identification against probes on dna microarrays have spun off viable chip-based platforms that are able to successfully perform biological threat detection and analysis. such portable systems, as will be described, are already appearing on the market and competing to attract lucrative biodefence funding. alternative detection platforms have exploited quantitative pcr in 'fieldable' real-time detectors, where positive pcr amplification would indicate the presence of the corresponding target dna [4].
these have included devices such as genexpert (cepheid), rapid (idaho technologies) or bioseeq (smiths detection), but these systems provide relatively limited throughput [7]. microarray systems are, by contrast, more definitive and highly scalable because hundreds to tens of thousands of possible dna elements can be interrogated in a single experiment. their performance nevertheless hinges on both the adoption of robust panels of probes that can accurately identify dna from organisms of interest and the successful extraction and amplification of pathogen dna from the relevant clinical sample or isolates [8, 9]. more recent developments in microarray technologies have significantly broadened the horizon for their use in the biodefence arena well beyond the confines of dna-based profiling or detection. newer microarray formats developed at the turn of the 21st century provide a host of other biomolecules that can be presented on chip, including proteins (whole proteomes, enzymes and antibodies) [10, 11], small molecules (drug-like molecules, peptides and carbohydrates) [12, 13] and even whole cells and tissues for simultaneous, multiplexed experimentation [14]. each format offers unique ways in which microarrays can be harnessed towards improving our understanding of pathogen biology (figure 1). the microarray paradigm has over the years contributed significantly towards these goals by not only providing a robust tool for molecular diagnostics but also by advancing biodefence capabilities in protection and therapy. this review will describe microarray-based applications for biodefence, as well as exciting new concepts and approaches that will drive future advancement and growth.
figure 1. (a) as discussed in this article, dna microarrays can be applied to test for dna from pathogenic organisms or for the resequencing of pathogen genomes. (b) antibody microarrays can be used to detect pathogen proteins or antigens that might be present in environmental samples as an indication of contamination or for diagnostic purposes to determine pathogen infection in human tissues. (c) small-molecule microarrays offer novel approaches for differentiating between pathogens, for example by clustering the binding signatures obtained for each pathogen. they can also be used to identify therapeutics that could potentially disrupt the infection cycle.
trends in biotechnology vol. 27 no.1
probe selection and design is usually an important first step in microarray-based pathogen detection, and many issues associated with probe design for dna microarrays can impact the overall fidelity of the assay, in particular with regard to the levels of specificity and sensitivity attained [15]. these issues and considerations include cross-hybridizations, orthogonal probe binding to target dna from the specific organism(s) of interest and vice versa, uniformity of annealing temperatures (or gc content) and probe length. the occurrence of false positives and false negatives is highly problematic because they might cause an unwarranted response (and potentially panic) or a missed response (and a lost opportunity for intervention), which are unacceptable when dealing with deadly biological threats. for this reason, biodefence detection platforms strive to reach near perfect accuracies, which are comparable to, if not better than, the usual gold-standard assays, typically culture-based methods. with microarrays, redundancy can, however, be easily engineered into the system to improve accuracy and analytical power simply by over-representation.
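the probe-design considerations listed above (probe length and uniformity of gc content or annealing temperature) can be illustrated with a minimal sketch. everything in it is an assumption made for illustration: the function names, the 40-60% gc window, the 20-mer length and the candidate sequences are hypothetical, and the wallace rule used for the melting temperature is only a rough estimate for short oligonucleotides, not the method used by any of the studies cited here.

```python
def gc_content(probe: str) -> float:
    """Fraction of G and C bases in a probe sequence."""
    probe = probe.upper()
    return (probe.count("G") + probe.count("C")) / len(probe)


def wallace_tm(probe: str) -> float:
    """Rough melting temperature (deg C): 2 per A/T base plus 4 per G/C base."""
    probe = probe.upper()
    at = probe.count("A") + probe.count("T")
    gc = probe.count("G") + probe.count("C")
    return 2.0 * at + 4.0 * gc


def screen_probes(candidates, length=20, gc_min=0.40, gc_max=0.60):
    """Keep candidates of the requested length whose GC content is in range."""
    kept = []
    for probe in candidates:
        if len(probe) != length:
            continue  # enforce a uniform probe length
        if gc_min <= gc_content(probe) <= gc_max:
            kept.append(probe.upper())
    return kept


# Hypothetical candidate probes, not taken from any published array.
candidates = ["ACGTACGTACGTACGTACGT", "AAAAAAAAAAAAAAAAAAAA", "ACGT"]
for probe in screen_probes(candidates):
    print(probe, gc_content(probe), wallace_tm(probe))
```

a real design pipeline would additionally screen each surviving probe for cross-hybridization against non-target genomes, which this sketch does not attempt.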
the us centers for disease control (cdc) have identified a list of 36 agents that, if spread uncontrollably, have the potential to cause a major public health crisis with heavy morbidity and mortality rates (box 1). these pathogens are further subclassified into categories a, b and c according to the degree of risk imposed (table 1) [16]. in 2002, wilson et al. fabricated a customized affymetrix microarray containing 53 660 probes to detect dna amplified from 18 different pathogenic microorganisms simultaneously, including pathogens from the us cdc's list of bioterrorism agents, such as bacillus anthracis (which causes anthrax), clostridium botulinum (which generates the botulinum toxin), yersinia pestis (which causes bubonic plague) and the ebola virus [17]. specific multiplexed primer sets were designed to amplify unique diagnostic regions specific to each organism for hybridization against tiling arrays comprising 20-mer probes. the highly redundant system facilitated the accurate identification of each organism tested, with over 91% of the probes working as predicted. impressively, as little as 10 fg, equivalent to the 'mass' of just two genomes (or dna from two cells), of b. anthracis could be detected after multiplexed pcr on these microarrays. even though culture methods should in theory be able to amplify and detect a single organism, the procedures are usually time-consuming and challenging: detection could take from 18 h to several days depending on the microbe type, and even longer for fastidious microbes (e.g. mycobacterium tuberculosis, a very slow-growing microbe) that might require specialized, or as yet undiscovered, laboratory growth conditions. molecular identification using random pcr followed by multiplexed resolution over a microarray offers a particularly attractive alternative for detection and diagnostics, generating results in 2-6 h once assays are optimized (figure 2a). wang et al.
[18] developed such a microarray approach for the screening of viral pathogens from across broad viral families. a randomized primer was used to amplify any viral rna that was present in the sample using reverse transcriptase-pcr followed by hybridization on a microarray comprising 1600 70-mer probes, representing nearly 140 virus genomes. degeneracy of probes on the microarray and cross-hybridization of certain viruses across expected patterns indicated the emergence of novel, uncharacterized strains, which could hence be identified with the aid of microarrays. similarly, sengupta and colleagues [19] developed microarrays with 476 probes to distinguish among various influenza viruses. primers were designed against characteristic sequences specific to influenza that targeted the viral fusion protein haemagglutinin and the glycosidase neuraminidase segments. using the affymetrix respiratory pathogen microarray, lin et al. [20] detected the respiratory viruses influenza a and adenovirus, which were then further differentiated according to strain and species. this setup made use of a so-called 'resequencing' microarray that contained one perfectly matched and three mismatched probes per base and thus was able to identify genetic mutations at the sequence level [20]. the array's large screening capacity meant that both the forward and reverse strands of the dna targets could be sequenced to provide added sequencing accuracy. pooled primer pairs have also been optimized for use in detecting both dna and rna targets for pathogens that cause upper respiratory tract infections, including viruses (influenza, coronaviruses and others) and bacteria (bordetella pertussis, streptococcus pyogenes and others) [21]. the small subunit ribosomal rna (ssu rrna) gene is used widely as a microbial marker for taxonomy and species classification [22].
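the resequencing layout described above (one perfectly matched and three mismatched probes per base) can be sketched as follows. this is an illustrative sketch only: the function names, the default 25-mer probe length and the toy sequence are assumptions and do not describe the actual affymetrix design. probes are written in the sense of the reference sequence for readability, whereas a physical array would carry the complementary strand.

```python
BASES = "ACGT"


def resequencing_probes(reference: str, probe_len: int = 25):
    """Build four probes per interrogated base of a reference sequence.

    Each probe is a window centred on the interrogated position; the centre
    base is set to A, C, G or T in turn, so exactly one of the four probes
    matches the reference perfectly and the other three carry a mismatch.
    Returns {position: {centre_base: probe_sequence}}.
    """
    if probe_len % 2 == 0:
        raise ValueError("probe_len must be odd so that a centre base exists")
    reference = reference.upper()
    half = probe_len // 2
    layout = {}
    # Positions closer than `half` bases to either end cannot be centred.
    for centre in range(half, len(reference) - half):
        window = reference[centre - half:centre + half + 1]
        layout[centre] = {
            base: window[:half] + base + window[half + 1:] for base in BASES
        }
    return layout


def call_base(intensities: dict) -> str:
    """Call the base whose probe gave the strongest hybridization signal."""
    return max(intensities, key=intensities.get)


# Toy reference sequence and made-up intensities, for illustration only.
layout = resequencing_probes("ACGTACGTA", probe_len=5)
print(layout[2])
print(call_base({"A": 0.1, "C": 0.2, "G": 0.9, "T": 0.1}))
```

comparing the called base with the reference base at each position is what allows such an array to flag sequence-level mutations; tiling the reverse strand in the same way gives a second, independent call per position.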
box 1. bioterror agent classification
biohazardous agents that pose a significant threat require special attention, especially in the implementation of regulatory measures and controls to limit their access and distribution and to prevent them from falling into the wrong hands. in addition, such measures also aim to raise awareness within national healthcare systems and amongst healthcare providers, especially in cases where these bioterror threats are only rarely seen. the us centers for disease control (cdc) have compiled a list of 36 biohazardous agents, which are divided into the categories a, b and c according to the level of threat they impose, and their characteristics are described here. examples of pathogens from these three categories are provided in table 1.
the cdc defines category a agents as those that are of the highest priority for biodefence research. these organisms can be easily disseminated from person to person. they result in high mortality rates and have the potential to cause a major impact on public health. the accidental or deliberate release of these agents could cause public panic and social disruption. these agents thus require particular attention in ensuring public health preparedness because they pose a significant risk to national security.
the cdc defines category b agents as those that are moderately easy to disseminate. these pathogens result in moderate morbidity and low mortality. these agents would hence require specific improvements of diagnostic capabilities as well as enhanced measures for disease surveillance.
the cdc defines category c agents as those that have the potential for mass dissemination in the future because of their availability, ease of production and dissemination. this category also includes emergent threats that have the potential of causing high morbidity and mortality rates, as well as a major impact on public health.
further details on the classification and prioritization of biological threat agents are available on the cdc website: http://www.bt.cdc.gov/agent/agentlist-category.asp.

desantis and colleagues [23] generated a high-density tiling microarray with 62 358 oligonucleotide probes of ssu rrna with sufficient coverage to detect 18 different orders of microbes from environmental samples, as well as novel variants that exhibited mutations in their ssu rrna. the array fluorescence intensities correlated well with the spiked pathogen concentrations, providing a means of quantifying the pathogen levels in the original sample. a similar method was used to detect staphylococci. out of 201 isolates taken from 33 staphylococci species, 185 were correctly assigned by using only 16s rrna sequences on microarrays, conferring a sensitivity of 92% [24]. the ability to refine and expand on the existing probe set, for example by the incorporation of additional informative genetic loci, is a key advantage provided by the dna microarray platform to increase redundancy and hence improve assay performance and fidelity. miller and colleagues [25] applied microarrays in clinical diagnostics and were able to identify pathogens in a panel of 36 patient specimens with a 94% accuracy score, calculated from 76% sensitivity and 100% specificity. the microarray was designed to detect up to 35 rna viruses, including the sars coronavirus and dengue viruses, using 40-mer probes. a total of 53 555 such probes in replicates of seven were printed on nimblegen tiling microarrays. the authors studied ways to minimize pcr bias to ensure uniform amplification and developed a unique statistical approach to infer pathogen identity from the resulting microarray fingerprints.
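the sensitivity, specificity and accuracy figures quoted for these assays follow the usual confusion-matrix definitions. the sketch below is illustrative only and is not taken from the cited studies: the function name `diagnostic_metrics` is hypothetical, and treating the 16 unassigned staphylococcal isolates as false negatives is an assumption made to reproduce the 92% figure.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity and accuracy from confusion-matrix counts.

    A metric is None when its denominator is zero (e.g. no negatives tested).
    """
    def ratio(num, den):
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),              # true positives / all positives
        "specificity": ratio(tn, tn + fp),              # true negatives / all negatives
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),  # correct calls / all calls
    }


# Staphylococci example from the text: 185 of 201 isolates correctly assigned.
# Assumption: the remaining 16 isolates are counted as false negatives.
staph = diagnostic_metrics(tp=185, fp=0, tn=0, fn=16)
print(f"sensitivity = {staph['sensitivity']:.0%}")  # sensitivity = 92%
```

because no negative samples enter this particular calculation, the specificity is returned as undefined (`None`) rather than as a number.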
apart from the high-density microarrays described here, more focused biodefence platforms have been developed for selected panels of environmental [26] or blood-borne pathogens [27], using in-house microarrays that comprise only several hundred relevant probes. such panels are designed to selectively identify targets of interest with enough redundancy to reduce the occurrence of false positives, which is infrequently controlled in other diagnostic platforms that often use either one or a handful of primer sets specific to each pathogen of interest (as is the case in the genexpert, rapid and cepheid real time pcr platforms).

there are several instructive examples where dna microarrays have advanced our understanding of disease etiology and epidemiology through their use in studies into molecular evolution, host-pathogen interactions and modes of virulence. in the following section, we will review salient studies that have provided such fundamental insight, focusing on agents from the cdc categories a to c. one important application of microarrays is in the resequencing of pathogen genomes, which has been successfully applied to interrogate inter-species and/or inter-strain differences (figure 2b). this provides a vital alternative to traditional shotgun sequencing approaches because it is quicker and more cost-effective; however, the disadvantage is that the sequence information for the organism of interest must be available beforehand. this is not a severe limitation given that many pathogenic organisms have been or are being sequenced, and with the advances in de novo sequencing capabilities, new pathogens such as sars can be readily sequenced in a matter of weeks from the time of discovery [28]. subsequent resequencing using microarrays would only take several days.
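the resequencing principle, in which each position of a known reference is interrogated by one perfectly matched and three mismatched probes, can be sketched as follows. this is a minimal illustration under stated assumptions, not the algorithm of any cited platform: the function names, the 25-base probe length and the `min_ratio` ambiguity threshold are hypothetical, and probes are written in the sense of the reference for readability (real probes would be complementary to the labelled target).

```python
BASES = "ACGT"


def probes_for_position(reference, pos, length=25):
    """Return the four probes centred on `pos`, one per candidate base.

    `length` is assumed to be odd so that `pos` sits exactly in the middle.
    """
    half = length // 2
    start, end = pos - half, pos + half + 1
    if start < 0 or end > len(reference):
        raise ValueError("position too close to a sequence end for this probe length")
    window = reference[start:end]
    return {b: window[:half] + b + window[half + 1:] for b in BASES}


def call_base(intensities, min_ratio=1.5):
    """Call the base whose probe is brightest; return 'N' when the call is ambiguous."""
    ranked = sorted(intensities.values(), reverse=True)
    top, second = ranked[0], ranked[1]
    if top <= 0 or top < min_ratio * second:
        return "N"
    return max(intensities, key=intensities.get)


def resequence(reference, intensity_table, min_ratio=1.5):
    """List (position, reference_base, called_base) wherever the call differs.

    `intensity_table` maps a position to a {base: signal} dictionary.
    """
    variants = []
    for pos, signals in sorted(intensity_table.items()):
        called = call_base(signals, min_ratio)
        if called != "N" and called != reference[pos]:
            variants.append((pos, reference[pos], called))
    return variants
```

the dependence on `reference` in every function reflects the limitation noted above: a probe set can only be designed once the sequence of the organism is already known.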
early experiments on resequencing arrays showed high rates of false discovery of up to 12-45% [29], but experimental improvements and the development of robust algorithms, such as the abacus (adaptive background genotype calling scheme) software package [30], have now greatly improved sequencing quality. adopting this approach for variation analysis, wong and colleagues [31] tracked strains of the sars coronavirus that were rapidly resequenced using high-density microarrays. a total of 383 102 probes (27-mers) were designed to cover the complete 29.7 kb genome on nimblegen microarrays [31]. virus samples from cell cultures and patient tissues could be amplified by reverse-transcriptase pcr using optimized primers and resequenced with accuracy greater than 99.99%. the platform was used to confirm that a lone case of a sars virus infection of a graduate student had most likely come from a laboratory source because a unique 47-bp deletion was detected that was not present in other sars isolates. wang et al. [32] also developed a sars microarray that helped to characterize a novel coronavirus variant. the microarray platform is thus ideal for virus tracking should outbreaks occur and, furthermore, is readily applicable to other pathogens.

evolutionary perspectives lend unique insight into the genetic changes that result in differences in pathogenicity. plague-causing y. pestis is thought to have diverged from the non-lethal yersinia pseudotuberculosis an estimated 1500-20 000 years ago [34]. because the two species bear identical 16s rrna sequences, more comprehensive genetic analysis was required to understand the differences in disease morbidity manifested by these two bacterial species, as well as in the different modes of transmission to humans: y. pestis by flea bites and y. pseudotuberculosis by food or water. a microarray specific for y.
pestis revealed three unique genomic islands and the inactivation of several genes, including those encoding the cell-adhesion proteins adhesin (yada) and invasin (inv), as well as the o-antigen biosynthetic operon. these unique characteristics potentially led to y. pestis' enhanced virulence and its adaptation from an enteric organism (y. pseudotuberculosis) to a mammalian blood-borne pathogen [35]. equivalent findings were reported by hinchliffe et al. [36] using a similar y. pestis gene-specific microarray. these also showed that, of the three major y. pestis subspecies, the antiqua and mediaevalis biovars showed greater divergence from the orientalis strains [36].

figure 2. the use of microarrays for studying emergent pathogens, exemplified by severe acute respiratory syndrome (sars)-associated coronavirus and avian influenza (h5n1). various approaches have been developed for detection and diagnostics with dna and protein microarrays. (a) specific dna probes, for instance co-v (orange) for sars and h5 (green) for h5n1, can be used to detect the presence of pathogen dna or rna using pcr or reverse transcriptase (rt)-pcr, thus enabling multiple pathogens to be detected simultaneously [26]. (b) tiling arrays can be used to resequence pathogens so that any mutations in evolving pathogens can be rapidly detected and tracked. as illustrated, four probes, one for each nucleotide base, are used to determine the genotype at a given locus. a set of probes designed for an arbitrary position xyz is shown to reveal the unknown base to be adenine (a). multiple probe sets are 'tiled' across the whole pathogen genome and, upon hybridization and analysis, can confer the complete sequence information of the pathogen with great accuracy [17, 23, 25]. (c) antibody arrays make use of the sandwich immunoassay to screen for the presence of a pathogen. as depicted, the pathogen sample is first applied to the microarray, followed by the reporter antibody. as a result, the pathogen is sandwiched in between two antibodies: an immobilized antibody (blue) and a tagged antibody (purple) capable of reporting the presence of a pathogen on the microarray [43, 45]. (d) proteome microarrays, which contain immobilized proteins from the target pathogen, can be screened against sera obtained from infected individuals. if the individual has been exposed to the pathogen, the sera will contain antibodies (blue) against specific antigens of the pathogens that will react with the immobilized protein and that can be detected using tagged secondary antibodies (purple) [49]. this procedure also facilitates the identification of specific immunodominant antigens present in the pathogen proteome. these proteins are considered to be largely responsible for triggering the host immune response and thus have a high potential for the development of vaccines against this particular pathogen.
trends in biotechnology vol.27 no.1

similarly, variation analysis has been performed on burkholderia pseudomallei microarrays to differentiate amongst virulent and avirulent burkholderia species, specifically bioterror agents such as b. pseudomallei (which causes melioidosis), b. mallei (which causes equine glanders) and the avirulent (but closely related) b. thailandensis [37]. in an interesting approach to reveal molecular mechanisms for pathogenesis, weiss and colleagues [38] developed a microarray-based negative-screening approach for francisella and obtained surviving bacterial samples from infected mice to reveal genes responsible for bacterial survival and growth in vivo. in a related study, genome-wide microarrays also revealed loci that are only present in the highly virulent francisella tularensis subspecies tularensis strain [39]. these loci could be used to identify potential pathogenicity islands as well as to provide unique genetic markers that could be used for strain typing and identification.
microarrays have also been applied to distinguish among variants of influenza a that are resistant to the antiviral drug amantadine by detecting mutations in the viral ion channel m2 protein [40]. such molecular forensics experiments using microarrays have extended our capabilities to profile new or emergent threats. the differences identified might provide an understanding of the causes of increased virulence, as well as reveal vulnerabilities that can be exploited for building therapeutic defences.

beyond laboratory-based applications, dna microarrays have been tested successfully in epidemic outbreak surveillance (eos). project silent guardian was a 10-week trial launched by the naval research laboratory during the 2005 us presidential inauguration [41]. the challenge was to take laboratory-based microarray technology to a production-scale system capable of operationally screening up to 300 samples per day. a custom dna microarray was designed with the capability of detecting 20 natural (including avian flu) and biothreat agents with strain-level resolution. in total, 10 000 samples were collected and screened by civilian and military laboratory personnel within the stipulated period. the trial included blinded samples that were spiked with pathogens, which were all successfully detected. this exercise showcased the feasibility of implementing microarrays as a screening tool for eos and demonstrated their use as a rapid and robust means of sample processing and pathogen identification. recommendations from the study included the development of a more user-friendly system with greater automation of the steps involved. these features are being incorporated by several recently launched commercial platforms. microarray-based platforms for biothreat screening from akonni biosystems (http://www.akonni.com) and veredus laboratories (http://www.vereduslabs.com) are now on the market for multiplexed threat detection.
new advances include (i) greater integration of the sample preparation (which is considered to be one of the major bottlenecks), (ii) the use of mpcr and (iii) the development of small handheld microfluidic chips and portable readers to take microarray technology to the point-of-care. veredus has also recently launched an influenza test chip that enables the amplification and discrimination of influenza a and b subtypes, including avian flu (h5n1), within just two hours. this great improvement in detection time has been attributed to the mpcr step, which, through rapid thermocycling, significantly reduces the duration of pcr to under 30 min. similar devices might be developed in the future for protein and other types of non-dna microarrays for applications in serodiagnostics. the national institute of allergy and infectious diseases (niaid) and the j. craig venter institute fund a programme that enables researchers to obtain, through an approval process and material transfer agreement (mta), pathogen dna microarrays for select category a-c threats at no cost to promote and accelerate research on these pathogens. the pathogen functional genomics resource centre (pfgrc) currently fabricates and distributes a range of these 70-mer microarrays covering 39 agents, including b. anthracis, coronaviruses, f. tularensis, y. pestis and b. pseudomallei (http://pfgrc.tigr.org) (table 1). such collaborative and valuable initiatives augur favourably for the future of the biodefence community worldwide and will hopefully lead to greater sharing of resources in the pursuit of knowledge on global and regional pathogens.

emerging roles of protein, peptide and carbohydrate microarrays

newer microarray formats have already significantly extended the scope of microarray applications. the advantages of parallelization, miniaturization and automation hold true for whichever biomolecule is presented on high-density microarrays.
it was initially difficult to believe that proteins would retain their functionality once anchored on a solid support. however, schreiber and colleagues [42] dispelled this concern by developing the first small-molecule microarray in 1999 and the first protein microarray in 2000, simultaneously demonstrating that proteins could retain their activity despite being covalently attached to glass slides. soon afterwards, other array types were developed, such as cell arrays, carbohydrate arrays and proteome arrays. some of these newer applications have brought vital capabilities to pathogen detection and biodefence, as will be described below. as before, examples are provided for key biothreat agents and are categorized according to their consequent applications, either in detection or profiling.

pathogen detection using non-dna-based microarrays

antibodies have become an established denominator for disease detection, through the time-honoured use of enzyme-linked immunosorbent assays (elisas), agglutination and lateral-flow assays. as a next-generation tool, antibody microarrays are increasing in popularity, and not without reason, for they offer unparalleled throughput, minimal reagent consumption and sensitive detection of multiple targets simultaneously [10]. the accuracy conferred is nevertheless closely linked with the quality of antibodies employed, and there are issues relating to cross-reactivity and antibody availability, both of which are particularly crucial for finer and more accurate strain typing. notwithstanding these matters, a plethora of antibody microarrays have been developed that have growing potential for clinical, biothreat and point-of-care applications (figure 2c). rucker et al.
[43] applied antibody microarrays for the detection of toxins with low-nanomolar sensitivities, such as anthrax lethal factor (lf, a metalloendopeptidase), protective antigen (pa) and tetanus toxins (endopeptidases), through the co-application of a competitive fluorescently-labelled toxin reporter. in this setting, the samples can be tested without the need for labelling by monitoring the depletion and competitive displacement of reporter signals generated by the labelled reporter. thirty-five antibodies were also arrayed to subtype the 20 most common salmonella serovars [44] , and a similar study was undertaken for escherichia coli [45] . in an impressive demonstration of the detection throughput attainable, huelseweh and colleagues [46] built an antibody microarray to detect several category a and b agents simultaneously, including f. tularensis, y. pestis, brucella melitensis (which causes brucellosis) and b. mallei, using sandwich immunoassays (figure 2c) . conversely, antigen microarrays have also been used to identify seropositive individuals by using the presence of defensive antibodies in the serum as a means of detecting exposure to a pathogen. pathogen proteins are typically orthologously expressed using high-throughput cloning and expression and then affinity purified to provide protein sets. in special cases, it might be necessary to consider the post-translational modifications or the way the pathogen presents itself to the host to identify and use the correct immunogenic protein variants in serodiagnostics. it is also desirable to purify the protein and check for efficient and generally uniform immobilization on the microarrays as a quality control, especially when comparisons are made across slides. in one such example, sundaresh et al. [47] narrowed down an initial panel of 1741 f. 
tularensis antigens to just the top 244 in terms of their ability to predictively discriminate seropositive sera and then used these antigens to establish a diagnostic microarray comprising whole proteins. mezzasoma et al. [48] also developed protein microarrays for simultaneous diagnostics using parasitic and viral antigens. zhu et al. [49] monitored the antibody profiles of sars patients using protein microarrays containing 82 purified coronavirus proteins. antibodies present in human serum samples were detected on the protein microarrays with fluorescently labelled goat anti-human immunoglobulins. it was found that immunoreactivity against the coronavirus nucleocapsid proteins remained high for 120-320 days post infection. this provided a means to check for exposure long after infection might have occurred. over 400 human sera samples were screened with an accuracy of 91% in differentiating infected and normal specimens [49]. nevertheless, caution should be applied in protein-based diagnostics because incubation periods vary greatly amongst diseases. during the incubation period, the patient might not express antibodies at detectable levels but might still be capable of spreading the infection. in the case of sars, the incubation period (the time between exposure and onset of symptoms) has a median of 4-5 days (up to a maximum of 10 days). antibodies might be detectable only 6-7 days after first symptoms are presented, but pcr testing might be able to pick up the virus more quickly, just 1-2 days after symptoms become apparent. it might become possible in the future to integrate dna- and protein-based microarray methods to extend the range of rapid clinical diagnostics from detection of current pathogen infections to testing for exposure long after the actual infection has occurred.
as an alternative, short peptides have also been applied for serodiagnostics, and this was facilitated by the robust and routine procedures established to chemically synthesize peptide sequences up to 50 amino acids in length. peptide microarrays have been applied for detecting immunoreactive sera against sars, amongst other pathogens, enabling the detection of low-picomolar concentrations of antibodies [50]. lipopolysaccharide, carbohydrate-based and whole-cell microarrays have also been used for antibody-based detection of pathogens such as f. tularensis [51, 52], b. anthracis [53] and b. pseudomallei [54]. the surface antigens on these microbes are frequently responsible for immunoreactivity, so these methods rely on detecting the presence of such immunogenic antigens. further improvements in fabrication techniques and greater knowledge of surface chemistries might lead to the development of hybrid microarray platforms in which multiple types of antigens, such as peptides, proteins and carbohydrates, are presented simultaneously to detect host antibodies with improved efficiency and sensitivity.

pathogen profiling using non-dna-based microarrays

protein microarrays with a high pathogen proteome content offer a valuable platform for high-throughput serology. proteins that are immunoreactive to patient sera represent antigens that can be applied as markers in serodiagnostics or as therapeutic proteins for vaccine development. in one such example, the immunogenic epitopes of the sars coronavirus were traced to the c-terminal fragments of the nucleocapsid protein by screening against 52 human sera samples from infected individuals (figure 2d) [55, 56]. to address the bottleneck of protein expression, davies and colleagues [57, 58] applied pcr recombinant cloning to accelerate the process of cloning and expression. this enabled more than 190 proteins from vaccinia virus and over 1700 proteins from f.
tularensis to be expressed and profiled using microarrays [57, 58]. these arrays have also helped to identify antibodies against the vaccinia virus h3l envelope protein, which confers protection in vivo (in a mouse model) [59]. the vaccinia protein microarrays have also been used to derive antibody profiles in humans inoculated with the licenced dryvax smallpox vaccine [60], and these profiles were found to be very similar to those induced by the attenuated modified vaccinia virus ankara strain, an alternative vaccine candidate [61]. various protein microarrays have similarly been generated for identification of the immunodominant antigens of y. pestis [62, 63].

other microarray platforms have demonstrated valuable potential in biodefence. for instance, small-molecule microarrays have been applied to functionally screen anthrax lf through activity-based binding signatures against a library of 1400 immobilized peptide-hydroxamate inhibitors. putative drug candidates were discovered that bound selectively to the anthrax lf (with a low dissociation constant, kd = 0.81 µm) and inhibited its activity in vitro [64]. carbohydrate microarrays have also been developed to elucidate the receptor preferences of influenza viruses. stevens and colleagues [65, 66] applied glycan arrays to establish the α2-3 and α2-6 binding selectivity of the h5 and h1 haemagglutinins, which determine influenza virulence. the selectivity was altered by specific mutations in the protein, providing a useful way of functionally monitoring viral adaptation. non-dna microarray formats have thus contributed unique ways in which we can explore pathogen biology and develop countermeasures in the event of an outbreak. it is clear that microarrays have significantly strengthened our biodefence capabilities.
although the upstream cost of library creation and microarray fabrication, as well as the initial infrastructure investment of several hundreds of thousands of dollars (for microarray spotters and scanners), might represent significant hurdles for small laboratories with tight budgets, the provision of printed microarray slides through commercial or research avenues heralds a favourable trend towards the reduction of exclusivity. greater access to the technology, such as the increased availability of microarrays together with established array designs, is helping researchers worldwide to move more quickly towards the downstream application phase, for example to perform regional studies of endemic variants. the applications of microarrays described here are broad-ranging and span from dna- or protein-based detection of pathogens to epidemic outbreak surveillance and vaccine development, thus showcasing the full spectrum of microarray potential (table 1). the progress made in the past several years has brought microarrays to the forefront of rapid diagnostics and medical research. further integration of upstream sample preparation with downstream data processing is expected to transform microarrays into compact, field-expedient solutions for analysis and monitoring in the near future. the large assortments of biomolecules now utilized in microarrays provide novel opportunities in molecular forensics and comparative profiling, especially for the newer and less well-understood pathogens. as we equip ourselves with better capabilities and countermeasures against potential biological threats, we hope to become more agile and responsive whenever such threats emerge in the future. microarrays have improved our confidence in this respect and will continue to play a decisive part in biodefence research and technology.
[1] molecular evolution of the sars coronavirus during the course of the sars epidemic in china
[2] chemical and biological weapons: current concepts for future defenses
[3] turning biodefense dollars into products
[4] current and developing technologies for monitoring agents of bioterrorism and biowarfare
[5] microarrays - status and prospects
[6] comparative phylogenomics of pathogenic bacteria by microarray analysis
[7] nucleic acid approaches for detection and identification of biological warfare and infectious disease agents
[8] potential applications of dna microarrays in biodefense-related diagnostics
[9] oligonucleotide microarrays in microbial diagnostics
[10] progress in protein and antibody microarray technology
[11] advances in functional protein microarray technology
[12] small molecule microarrays: recent advances and applications
[13] peptide microarrays: next generation biochips for detection, diagnostics and high-throughput screening
[14] microarrays in infection and immunity
[15] highly parallel microbial diagnostics using oligonucleotide microarrays
[16] microbiological threats to homeland security
[17] sequence-specific identification of 18 pathogenic microorganisms using microarray technology
[18] microarray-based detection and genotyping of viral pathogens
[19] molecular detection and identification of influenza viruses by oligonucleotide microarray hybridization
[20] broad-spectrum respiratory tract pathogen identification using resequencing dna microarrays
[21] identification of upper respiratory tract pathogens using electrochemical detection on an oligonucleotide microarray
[22] potential of dna microarrays for developing parallel detection tools (pdts) for microorganisms relevant to biodefense and related research needs
[23] rapid quantification and taxonomic classification of environmental dna from both prokaryotic and eukaryotic origins using a microarray
[24] high-density dna probe arrays for identification of staphylococci to the species level
[25] optimization and clinical validation of a pathogen detection microarray
[26] multipathogen oligonucleotide microarray for environmental and biodefense applications
[27] a multiplex polymerase chain reaction microarray assay to detect bioterror pathogens in blood
[28] the genome sequence of the sars-associated coronavirus
[29] microarray-based resequencing of multiple bacillus anthracis isolates
[30] high-throughput variation detection and genotyping using microarrays
[31] tracking the evolution of the sars coronavirus using high-throughput, high-density resequencing arrays
[32] viral discovery and sequence recovery using dna microarrays
[33] genechip resequencing of the smallpox virus genome can identify novel strains: a biodefense application
[34] microevolution and history of the plague bacillus, yersinia pestis
[35] dna microarray analysis of genome dynamics in yersinia pestis: insights into bacterial genome microevolution and niche adaptation
[36] application of dna microarrays to study the evolutionary genomics of yersinia pestis and yersinia pseudotuberculosis
[37] patterns of large-scale genomic variation in virulent and avirulent burkholderia species
[38] in vivo negative selection screen identifies genes required for francisella virulence
[39] genome-wide dna microarray analysis of francisella tularensis strains demonstrates extensive genetic conservation within the species but identifies regions that are unique to the highly virulent f. tularensis subsp. tularensis
[40] detection of adamantane-resistant influenza on a microarray
[41] nrl uses microarray technology to detect natural and bio-threat pathogens during silent guardian project
[42] printing proteins as microarrays for high-throughput function determination
[43] antibody microarrays for native toxin detection
[44] development of a novel protein microarray method for serotyping salmonella enterica strains
[45] use of miniaturized protein arrays for escherichia coli o serotyping
[46] a simple and rapid protein array based method for the simultaneous detection of biowarfare agents
[47] from protein microarrays to diagnostic antigen discovery: a study of the pathogen francisella tularensis
[48] antigen microarrays for serodiagnosis of infectious diseases
[49] severe acute respiratory syndrome diagnostics using a coronavirus protein microarray
[50] functional peptide microarrays for specific and sensitive antibody diagnostics
[51] lipopolysaccharide microarrays for the detection of antibodies
[52] bacterial cell microarrays for the detection and characterization of antibodies against surface antigens
[53] photogenerated glycan arrays identify immunogenic sugar moieties of bacillus anthracis exosporium
[54] polysaccharide microarray technology for the detection of burkholderia pseudomallei and burkholderia mallei antibodies
[55] antigenicity analysis of different regions of the severe acute respiratory syndrome coronavirus nucleocapsid protein
[56] screening of specific antigens for sars clinical diagnosis using a protein microarray
[57] immunodominant francisella tularensis antigens identified using proteome microarray
[58] profiling the humoral immune response to infection by using proteome microarrays: high-throughput vaccine and diagnostic antigen discovery
[59] vaccinia virus h3l envelope protein is a major target of neutralizing antibodies in humans and elicits protection against lethal challenge in mice
[60] proteome-wide analysis of the serological response to vaccinia and smallpox
[61] antibody profiling by proteome microarray reveals the immunogenicity of the attenuated smallpox vaccine modified vaccinia virus ankara is comparable to that of dryvax
[62] protein microarray for profiling antibody responses to yersinia pestis live vaccine
[63] quorum sensing affects virulence-associated proteins f1, lcrv, katy and ph6 etc. of yersinia pestis as revealed by protein microarray-based antibody profiling
[64] quantitative inhibitor fingerprinting of metalloproteases using small molecule microarrays
[65] structure and receptor specificity of the hemagglutinin from an h5n1 influenza virus
[66] glycan microarray technologies: tools to survey host specificity of influenza viruses

the authors acknowledge funding support from dso national laboratories.

key: cord-000642-mkwpuav6 authors: moreira, rebeca; balseiro, pablo; planas, josep v.; fuste, berta; beltran, sergi; novoa, beatriz; figueras, antonio title: transcriptomics of in vitro immune-stimulated hemocytes from the manila clam ruditapes philippinarum using high-throughput sequencing date: 2012-04-19 journal: plos one doi: 10.1371/journal.pone.0035009 sha: doc_id: 642 cord_uid: mkwpuav6 background: the manila clam (ruditapes philippinarum) is a worldwide cultured bivalve species with important commercial value. diseases affecting this species can result in large economic losses. because knowledge of the molecular mechanisms of the immune response in bivalves, especially clams, is scarce and fragmentary, we sequenced rna from immune-stimulated r. philippinarum hemocytes by 454-pyrosequencing to identify genes involved in their immune defense against infectious diseases. methodology and principal findings: high-throughput deep sequencing of r. philippinarum using 454 pyrosequencing technology yielded 974,976 high-quality reads with an average read length of 250 bp. the reads were assembled into 51,265 contigs, and 44.7% of the nucleotide sequences translated into protein were annotated successfully.
the 35 most frequently found contigs included a large number of immune-related genes, and a more detailed analysis showed the presence of putative members of several immune pathways and processes, such as apoptosis, the toll-like receptor signaling pathway and the complement cascade. we found sequences for molecules never described in bivalves before, especially in the complement pathway, where almost all the components are present. conclusions: this study represents the first transcriptome analysis of r. philippinarum focused on its immune system and conducted using 454-pyrosequencing. our results will provide a rich source of data to discover and identify new genes, which will serve as a basis for microarray construction and the study of gene expression as well as for the identification of genetic markers. the discovery of new immune sequences was very productive and resulted in a large variety of contigs that may play a role in the defense mechanisms of ruditapes philippinarum. the manila clam (ruditapes philippinarum) is a cultured bivalve species with important commercial value in europe and asia, and its culture has expanded in recent years. nevertheless, diseases produced by a wide range of microorganisms, from viruses to metazoan parasites, can result in large economic losses. among clam diseases, the majority of pathologies are associated with the vibrio and perkinsus genera [1] [2] [3]. although molluscs lack a specific immune system, the innate response involving circulating hemocytes and a large variety of molecular effectors seems to be an efficient defense method to respond to external aggressions by detecting the molecular signatures of infection [4] [5] [6] [7] [8]; however, not many immune pathways have been identified in these animals. although knowledge of bivalve immune-related genes has increased in the last few years, the available information is still scarce and fragmentary. 
most of the data concern mussels and eastern and pacific oysters [9] [10] [11] [12] [13] [14], and very limited information is available on the expressed immune genes of r. philippinarum. recently, the expression of 13 immune-related genes of ruditapes philippinarum and ruditapes decussatus was characterized in response to a vibrio alginolyticus challenge [15]. also, a recent 454 pyrosequencing study was carried out by milan et al. [16], who sequenced two normalized cdna libraries representing a mixture of adult tissues and larvae from r. philippinarum. even more recently, ghiselli et al. [17] assembled the r. philippinarum gonad transcriptome de novo using illumina technology. moreover, a few transcripts encoded by genes putatively involved in the clam immune response against perkinsus olseni have been reported by cdna library sequencing [18]. as of 19/12/2011, there are 5,662 ests belonging to r. philippinarum in the genbank database. the european marine genomics network has increased the number of ests for marine mollusc species, particularly for ecologically and commercially important groups that are less studied, such as mussels and clams [19]. unfortunately, most of the available resources are not annotated or well described, limiting the identification of important genes and genetic markers for future aquaculture applications. the use of 454-pyrosequencing is a fast and efficient approach for gene discovery and enrichment of transcriptomes in non-model organisms [20]. this relatively low-cost technology facilitates the rapid production of a large volume of data, which is its main advantage over conventional sequencing methods [21]. in the present work, we undertook a substantial effort to increase the number of r. philippinarum ests in the public databases. 
specifically, the aim of this work was to discover new immune-related genes using pyrosequencing on the 454 gs flx (roche-454 life sciences) platform with the titanium reagents. to achieve this goal, we sequenced the transcriptome of r. philippinarum hemocytes previously stimulated with different pathogen-associated molecular patterns (pamps) to obtain as many immune-related transcripts as possible. the raw data are accessible in the ncbi short read archive (accession number: sra046855.1). the r. philippinarum normalized cdna library was sequenced with 454 gs flx technology as shown in figure 1. sequencing and assembly statistics are summarized in table 1. briefly, a total of 975,190 raw nucleotide reads averaging 284.1 bp in length were obtained. of these, 974,976 exceeded our minimum quality standards and were used in the mira assembly. a total of 842,917 quality reads were assembled into 51,265 contigs, corresponding to 29.9 megabases (mb). the length of the contigs varied from 40 to 5565 bp, with an average length of 582.4 bp and an average coverage of 5.7 reads. singletons were discarded, resulting in 37,093 contigs formed by at least 2 ests, and 26,675 of these contigs were longer than 500 bp. clustering the contigs resulted in 1,689 clusters with more than one contig. the distribution of contig length and the number of ests per contig, as well as the contig distribution by cluster, are all shown in figure 2. even though the knowledge of expressed genes in bivalves has increased in the last few years, it is still limited. indeed, only 41,598 nucleotide sequences, 362,149 ests, 24,139 proteins and 704 genes from the class bivalvia have been deposited in the genbank public database (19/12/11), and the top entries are for the mytilus and crassostrea genera. for ruditapes philippinarum, these numbers are reduced to 5,662 ests, 612 proteins and 12 genes. 
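the assembly summary reported above (number of contigs, total size, length range, mean length, contigs longer than 500 bp) can be recomputed from a list of contig lengths. the following python sketch is only an illustration: the function name `summarize_contigs` and the toy lengths are assumptions and are not part of the original pipeline. it also checks that the reported figures are mutually consistent, since 51,265 contigs with a mean length of 582.4 bp give about 29.86 mb, in line with the reported 29.9 mb.

```python
def summarize_contigs(lengths, min_long=500):
    """Return basic assembly statistics for a list of contig lengths (bp).

    Hypothetical helper for illustration; not the authors' code.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    return {
        "n_contigs": len(lengths),
        "total_bp": total,
        "total_mb": total / 1e6,
        "min_bp": min(lengths),
        "max_bp": max(lengths),
        "mean_bp": total / len(lengths),
        # contigs strictly longer than the threshold (500 bp in the text)
        "n_long": sum(1 for n in lengths if n > min_long),
    }


# toy lengths, chosen only to exercise the function
toy_lengths = [40, 250, 501, 900, 5565]
stats = summarize_contigs(toy_lengths)

# consistency check of the reported values: contigs x mean length, in Mb
reported_total_mb = 51265 * 582.4 / 1e6
```

with real data, the lengths would be read from the assembler output (for example a fasta file of contigs); that parsing step is left out here to keep the sketch short.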
this scarcity of sequence data has prompted the recent efforts to increase the number of annotated bivalve sequences in the databases. for non-model species, functional and comparative genomics is possible after obtaining good est databases. these studies seem to be the best resource for deciphering the putative function of novel genes, which would otherwise remain ''unknown''. ncbi swissprot, ncbi metazoan refseq, the ncbi nonredundant and the uniprotkb/trembl protein databases were chosen to annotate the contigs that were at least 100 bp long (49,847 contigs). the percentage of contigs annotated with a cut-off e-value of 10e-3 was 44.7%. contig sequences and annotations are included in table s1. of these contigs, 3.26% matched sequences from bivalve species and the remainder matched non-bivalve mollusc classes (4.13%), other animals (81.38%), plants (2.58%), fungi (1.78%), protozoa (1.50%), bacteria (4.95%), archaea (0.20%), viruses (0.21%) and undefined sequences (0.01%). as shown in figure 3a, the species with the most sequence matches was homo sapiens with 3,106 occurrences. the first mollusc in the top 35 list was lymnaea stagnalis at position 11. the first bivalve, meretrix lusoria, appeared at position 17. r. philippinarum was at position 25 with 124 occurrences. notably, a high percentage of the sequences had homology with chordates, arthropods and gastropods (figure 3b and c), and only 343 contigs matched sequences from the veneroida order (figure 3d). these values can be explained by the higher representation of those groups in the databases as compared to bivalves and by the quality of the annotation in the databases, as has been reported in another bivalve transcriptomic study [22]. the data shown highlight, once again, the necessity of enriching the databases with bivalve sequences. a detailed classification of predicted protein function is shown for the top 35 blastx hits (figure 4a). 
the list is headed by actin with 903 occurrences, followed by ferritin, an angiopoietin-like protein and lysozyme. an abundance of proteins directly involved in the immune response was predicted for this 454 run; ferritin, lysozyme, c1q domain containing protein, galectin-3 and hemagglutinin/amebocyte aggregation factor precursor are immune-related proteins present on the top 35 list. ferritin has an important role in the immune response. it captures circulating iron to overcome an infection and also functions as a proinflammatory cytokine via the iron-independent nuclear factor kappa b (nf-κb) pathway [23]. lysozyme is a key protein in the innate immune responses of invertebrates against gram-negative bacterial infections and could also have antifungal properties. in addition, it provides nutrition through its digestive properties, as it is a hydrolytic protein that can break the glycosidic bonds of the peptidoglycans of the bacterial cell wall [24]. the c1q domain containing proteins are a family of proteins that form part of the complement system. the c1q superfamily members have been found to be involved in pathogen recognition, inflammation, apoptosis, autoimmunity and cell differentiation. in fact, c1q can be produced in response to infection and it can promote cell survival through the nf-κb pathway [25]. galectin-3 is a central regulator of acute and chronic inflammatory responses through its effects on cell activation, cell migration, and the regulation of apoptosis in immune cells [26]. the hemagglutinin/amebocyte aggregation factor is a single chain polypeptide involved in blood coagulation and adhesion processes such as self-nonself recognition, agglutination and aggregation. the hemagglutinin/amebocyte aggregation factor and lectins play important roles in defense, specifically in the recognition and destruction of invading microorganisms [27]. 
other proteins that are not specifically related to the immune response but could play a role in defense mechanisms include the following: angiopoietin-like proteins, apolipoprotein d and the integral membrane protein 2b. in other animals, angiopoietin-like proteins (angptl) potently regulate angiogenesis, but a subset also function in energy metabolism. specifically, angptl2, the most represented angptl, promotes vascular inflammation rather than angiogenesis in skin and adipose tissues. inflammation occurs via the α5β1 integrin/rac1/nf-κb pathway, which is evidenced by an increase in leukocyte infiltration, blood vessel permeability and the expression of inflammatory cytokines (tumor necrosis factor-α, interleukin-6 and interleukin-1β) [28]. apolipoprotein d (apod) has been associated with inflammation. pathological and stressful situations involving inflammation or growth arrest have the capacity to increase its expression. this effect seems to be triggered by lps, interleukin-1, interleukin-6 and glucocorticoids and is likely mediated by the nf-κb pathway, as there are several conserved nf-κb binding sites in the apod promoter (apre-3 and ap-1 binding sites are also present). the highest affinity ligand for apod is arachidonic acid, which apod traps when it is released from the cellular membrane after inflammatory stimuli and, thus, prevents its subsequent conversion into pro-inflammatory eicosanoids. within the cell, apod could modulate signal transduction pathways and nuclear processes such as transcription activation, cell cycling and apoptosis. in summary, apod induction is specific to ongoing cellular stress and could be part of the protective components of mild inflammation [29] [30] [31]. finally, the short form of the integral membrane protein 2b (itm2bs) can induce apoptosis via a caspase-dependent mitochondrial pathway [32]. to avoid redundancy, the longest contig of each cluster was used for gene ontology terms assignment. 
a total of 23.05% of the representative clusters matched at least one go term. concerning cellular components (figure 4b), the highest percentage of go terms were in the groups of cell and cell part with 25.9% in each; organelle and organelle part represented 19.67% and 11.38%, respectively. within the molecular function classification (figure 4c), the most represented group was binding with 49.25% of the terms, followed by catalytic activity (29.12%) and structural molecule activity (4.60%). with regard to biological process (figure 4d), cellular and metabolic processes were the most represented groups with 16.78% and 12.43% of the terms, respectively, followed by biological regulation (10.18%). similarities between the r. philippinarum transcriptome and sequences from four other bivalve species were analyzed by comparative genomics (crassostrea gigas of the family ostreidae, bathymodiolus azoricus and mytilus galloprovincialis of the family mytilidae and laternula elliptica of the family laternulidae). this analysis could identify specific transcripts that are conserved in these five species. a venn diagram was constructed using unique sequences from these databases according to the gene identifier (gi id number) of each sequence in its respective database: 207,764 from c. gigas, 76,055 from b. azoricus, 121,318 from m. galloprovincialis and 1,034,379 from l. elliptica. c. gigas was chosen because it is the most represented bivalve species in the public databases. the other three species are bivalves that have been studied in transcriptomic assays. figure 5 shows that of the total 29,679 clusters, 72% were found exclusively in the r. philippinarum group, while only 7.59% shared significant similarity with all five species. the number of coincidences among other groups was very low (4.14% to 0.31% of sequences), suggesting that 21,454 new sequences were discovered within the bivalve group. 
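the venn diagram counts described above amount to grouping clusters by the set of species in which they have a significant hit. the python sketch below illustrates that bookkeeping under stated assumptions: the species labels, the input dictionary `hits_by_cluster` and the helper names are hypothetical, and the blastn step that produces the hits is not shown. the last line reproduces the reported proportion of clam-only clusters (21,454 of 29,679, about 72%).

```python
from collections import Counter

# hypothetical labels for the four comparison species
SPECIES = ("c_gigas", "b_azoricus", "m_galloprovincialis", "l_elliptica")


def venn_regions(hits_by_cluster):
    """Count clusters per Venn region.

    hits_by_cluster maps a cluster id to the set of species in which
    that cluster has at least one significant hit (empty set = clam only).
    """
    regions = Counter()
    for cluster_id, species in hits_by_cluster.items():
        unknown = set(species) - set(SPECIES)
        if unknown:
            raise ValueError(f"{cluster_id}: unknown species {sorted(unknown)}")
        regions[frozenset(species)] += 1
    return regions


def percent(part, whole):
    """Percentage of `part` in `whole`."""
    return 100.0 * part / whole


# toy input: two clam-only clusters, one shared with all species
toy_hits = {
    "cluster1": set(),
    "cluster2": {"c_gigas"},
    "cluster3": set(SPECIES),
    "cluster4": set(),
}
regions = venn_regions(toy_hits)
clam_only = regions[frozenset()]
shared_by_all = regions[frozenset(SPECIES)]

# proportion reported in the text: 21,454 of 29,679 clusters
reported_clam_only_pct = percent(21454, 29679)
```

using `frozenset` keys keeps every one of the possible regions distinct without naming them one by one.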
the percentage of new sequences is very high compared to previous transcriptomic studies [33] [34], in which the fraction of new transcripts was approximately 45%. one possible explanation for this discrepancy is the low number of nucleotide and est sequences currently available in public databases for r. philippinarum, but these transcripts could also correspond to regions with low sequence conservation, such as 5' and 3' untranslated regions, or to genes with a high mutation rate. on the other hand, a comparison between our 454 results and the milan et al. [16] transcriptome using a blastn approach is summarized in table 2. immune-related sequences. r. philippinarum hemocytes were subjected to immune stimulation using several different pamps to enrich the est collection with immune-related sequences. the objective was to obtain a more complete view of clam responses to pathogens. a keyword list and go immune-related terms were used to find proteins putatively involved in the immune system. after this selection step, we found that more than 10% of the proteins predicted from the contig sequences had a possible immune function. some sequences were found to be clustered in common, well-recognized immune pathways, such as the complement, apoptosis and toll-like receptor pathways, indicating conserved ancient mechanisms in bivalves (figures 6, 7, 8). the complement system is composed of over 30 plasma proteins that collaborate to distinguish and eliminate pathogens. c3 is the central component in this system. in vertebrates, it is proteolytically activated by a c3 convertase through the classical, lectin-induced and alternative routes [35]. although the complement pathway has not been extensively described in bivalves, there is evidence that supports the presence of this defense mechanism. ests with homology to the c1q domain have been detected in the american oyster, c. 
virginica [36], the tropical clam codakia orbicularis [37], the zhikong scallop chlamys farreri [38] and the mussel m. galloprovincialis [39] [40]. more recently, novel c1q adiponectin-like, c3 and factor b-like proteins have been identified in the carpet shell clam r. decussatus [41] [42]. these data support the putative presence of the complement system in bivalves. our pyrosequencing results, using the blastx similarity approach, showed that the complement pathway in r. philippinarum was almost complete as compared to the kegg reference pathway (figure 6). only the complement components c1r, c1s, c6, c7 and c8 were not detected. i. lectins. lectins are a family of carbohydrate-recognition proteins that play crucial self- and non-self-recognition roles in innate immunity and can be found in soluble or membrane-associated forms. they may initiate effector mechanisms against pathogens, such as agglutination, immobilization and complement-mediated opsonization and lysis [43]. several types of lectins have been cloned or purified from the manila clam, r. philippinarum [44] [45] [46], and their function and expression were also studied [18, 47]. also, a manila clam tandem-repeat galectin, which is induced upon infection with perkinsus olseni, has been characterized [46]. lectin sequences have been found in the stimulated hemocytes studied in our work: 23 of the contigs are homologous to c-type lectins (calcium-dependent carbohydrate-binding lectins that have characteristic carbohydrate-recognition domains), 115 are homologous to galectins (characterized by a conserved sequence motif in their carbohydrate recognition domain and a specific affinity for β-galactosides), 4 contigs have homology with ficolin a and b (a group of oligomeric lectins with subunits consisting of both collagen-like and fibrinogen-like domains) and 34 contigs have homology with other groups of lectins such as lactose-, mannose- or sialic acid-binding lectins. ii. 
β-glucan recognition proteins. β-glucan recognition proteins are involved in the recognition of invading fungal organisms. they bind specifically to β-1,3-glucan, stimulating short-term immune responses. although these receptors have been partially sequenced in several bivalves, there is only one complete description of them, in the scallop chlamys farreri [48]. two contigs with homology to the beta-1,3-glucan-binding protein were found in our study. iii. peptidoglycan recognition proteins. peptidoglycan recognition proteins (pgrps) specifically bind peptidoglycan, which is a major component of the bacterial cell wall. this family of proteins influences host-pathogen interactions through their pro- and anti-inflammatory properties that are independent of their hydrolytic and antibacterial activities. in bivalves, they were first identified in the scallops c. farreri and a. irradians [49, 50] and the pacific oyster c. gigas, and from the latter four different types of pgrps were identified [51]. peptidoglycan-recognition proteins and a peptidoglycan-binding domain containing protein have been found for the first time in r. philippinarum in our results and were present 4 and 1 times, respectively. iv. toll-like receptors. toll-like receptors (tlrs) are an ancient family of pattern recognition receptors that play key roles in detecting non-self substances and activating the immune system. the only bivalve tlr described so far was identified and characterized in the zhikong scallop, c. farreri [52]. tlr 2, 6 and 13 were present among the pyrosequencing results. tlr2 and tlr6 form a heterodimer, which senses and recognizes various components from bacteria, mycoplasma, fungi and viruses [53]. tlr13 is a novel and poorly characterized member of the toll-like receptor family. although the exact role of tlr13 is currently unknown, phylogenetic analysis indicates that tlr13 is a member of the tlr11 subfamily [54], suggesting that it could recognize uropathogenic e. coli [55]. 
it has been demonstrated that tlr13 colocalizes and interacts with unc93b1, a molecule located in the endoplasmic reticulum, which strongly suggests that tlr13 might be found inside cells and might play a role in recognizing viral infections [56]. figure 7 summarizes the tlr signaling pathway with the corresponding molecules found in the r. philippinarum transcriptome. pathogen proteases are important virulence factors that facilitate infection, diminish the activity of lysozymes and quench the agglutination capacity of hemocytes. because protease inhibitors play important roles in invertebrate immunity by protecting hosts through the direct inactivation of pathogen proteases, many bivalves have developed protease inhibitors to regulate the activities of pathogen proteases [1]. some genes encoding protease inhibitors were identified in c. gigas [57], a. irradians [58], c. farreri [59] and c. virginica; in the latter a novel family of serine protease inhibitors was also characterized [60] [61] [62]. a total of 23 contigs with homology to serine, cysteine, kunitz- and kazal-type protease inhibitors and metalloprotease inhibitors were found among our results. lysozyme was one of the most represented groups of immune genes in this transcriptome study with 208 contigs present. it is an antibacterial molecule present in numerous animals including bivalves. although lysozyme activity was first reported in molluscs over 30 years ago, complete sequences were published only recently, including those of r. philippinarum [24]. antimicrobial peptides (amps) are small, gene-encoded, cationic peptides that constitute important innate immune effectors from organisms spanning most of the phylogenetic spectrum. amps alter the permeability of the pathogen membrane and cause cellular lysis [63]. in bivalves, they were first purified from mussel hemocyte granules [64, 65]. 
in mussels, the amp myticin c was found to have a high polymorphic variability as well as chemotactic and immunoregulatory roles [66, 67]. in clams, two amps with similarity to mussel myticin and mytilin [68] and a big defensin [69] are known. we were able to detect 36 contigs with homology to different defensins: defensin-1 (american oyster defensin), defensin mgd-1 (mediterranean mussel defensin) and the big defensin previously mentioned. four contigs were similar to an unpublished defensin sequence from venerupis (= ruditapes) philippinarum. the primary role of heat shock proteins (hsps) is to function as molecular chaperones. their up-regulation also represents an important mechanism in the stress response [70], and their activity is closely linked to the innate immune system. hsps mediate the mitochondrial apoptosis pathway and affect the regulation of nf-κb [71]. hsps are well studied in bivalves. for r. philippinarum, several assays have been developed to better understand the hsp profile in response to heavy metals and pathogen stresses [72] [73] [74]. the most important and well-studied groups of hsps were present in our r. philippinarum transcriptome (hsp27, hsp40/dnaj, hsp70 and hsp90), but other, less common hsps were also represented (hsp10, hsp22, hsp83 and some members from the hsp90 family). recently, several genes related to the inflammatory response against lps stimulation have been detected in bivalves. such is the case of the lps-induced tnf-α factor (litaf), which is a novel transcription factor that critically regulates the expression of tnf-α and various inflammatory cytokines in response to lps stimulation. it has been described in three bivalve species: pinctada fucata [75], c. gigas [76] and c. farreri [77]. 
other tnf-related genes have been identified in the zhikong scallop, such as a tnfr homologue [78] and a tumor necrosis factor receptor-associated factor 6 (traf6), which is a key signaling adaptor molecule common to the tnfr superfamily and to the il-1r/tlr family [79]. figure 7 shows several components of the tlr signaling pathway that are present in our transcriptomic sequences (myd88, irak4, traf-3 and -6, tram, btk, rac-1, pi3k, akt and tank). a total of 1,918 contigs, 8.43% of those annotated, had homology with the main groups of putatively pathogenic organisms such as viruses (47 hits), bacteria (1,126 hits), protozoa (341 hits) and fungi (404 hits). figure 9 displays the taxonomic classification of these sequences and table 3 summarizes a list of the known bivalve pathogens found in our results. bacteria constitute the main group found among the sequences not belonging to the clam. as filter-feeding animals, bivalves can concentrate a large amount of bacteria, which could be one of their sources of food [24]. because vibrio spp. are ubiquitous in aquatic ecosystems, it was expected that the vibrionales order, with 141 hits, would be the most predominant. several species of the vibrio genus are among the main causes of disease in bivalves, specifically causing bacillary necrosis in larval stages [80]. it is notable that sequences belonging to vibrio tapetis, the causative agent of brown ring disease in adult manila clams, were not found. perkinsus marinus, with 2 matches, is the only bivalve pathogen found within the protozoa (alveolata) group. perkinsosis is produced by species from the genus perkinsus. both p. marinus and p. olseni have been associated with mortalities in populations of various groups of molluscs around the world and are catalogued as notifiable pathogens by the oie. viruses were the least represented among pathogens. 
the baculoviridae family was the most predominant with 21 matches, but the corresponding sequences were inhibitors of apoptosis (iaps) [81] that could also be part of the clam's transcriptome. five other viral families were found in our transcriptome study: iridoviridae, herpesviridae, malacoherpesviridae, picornaviridae and retroviridae. a well-known bivalve pathogen was also identified, the ostreid herpesvirus 1, which has previously been found to infect clams [82]. fungi had 404 matches in our results. it is known that bivalves are sensitive to fungal diseases, which can degrade the shell or affect the larval bivalve stages [83, 84]. this study represents the first r. philippinarum transcriptome analysis focused on its immune system using a 454-pyrosequencing approach and complements the recent pyrosequencing assay carried out by milan et al. [16]. the discovery of new immune sequences was effective, resulting in an enormous variety of contigs corresponding to molecules that could play a role in the defense mechanisms. more than 10% of the predicted proteins were related to immunity. this new resource is now deposited in the ncbi short read archive with the accession number sra046855.1. our results will provide a rich source of data to discover and identify new genes, which will serve as a basis for microarray construction and gene expression studies as well as for the identification of genetic markers for various applications including the selection of families in the aquaculture sector. we have found sequences from molecules never described in bivalves before, such as c2, c4, c5, c9, aif, bax, akt, tlr6 and tlr13, among others. as a part of this work, three immune pathways in r. philippinarum have been characterized (apoptosis, the toll-like receptor signaling pathway and the complement cascade), which could help us to better understand the resistance mechanisms of this economically important aquaculture clam species. 
animal sampling and in vitro stimulation of hemocytes. r. philippinarum clams were obtained from a commercial shellfish farm (vigo, galicia, spain). clams were maintained in open circuit filtered sea water tanks at 15 °C with aeration and were fed. a total of 100 clams were notched in the shell in the area adjacent to the anterior adductor muscle. a sample of 500 µl of hemolymph was withdrawn from the adductor muscle of each clam with an insulin syringe, pooled and then distributed in 6-well plates, 7 ml per well, in a total of 7 wells, one for each treatment. hemocytes were allowed to settle to the base of the wells for 30 min at 15 °C in the darkness. then, the hemocytes were stimulated with 50 µg/ml of polyinosinic:polycytidylic acid (poly i:c), peptidoglycans, β-glucan, vibrio anguillarum dna (cpg), lipopolysaccharide (lps), lipoteichoic acid (lta) or 1×10^6 cfu/ml of heat-inactivated vibrio anguillarum (one stimulus per well) for 3 h at 15 °C. all stimuli were purchased from sigma. pyrosequencing. after stimulation, hemolymph was centrifuged at 1700 g at 4 °C for 5 minutes, the pellet was resuspended in 1 ml of trizol (invitrogen) and rna was extracted following the manufacturer's protocol. after rna extraction, samples were treated with turbo dnase free (ambion) to eliminate dna. next, the concentration and purity of the rna samples were measured using a nanodrop nd1000 spectrophotometer. the rna quality was assessed in a bioanalyzer 2010 (agilent technologies). from each sample, 1 µg of rna was pooled and used for the production of normalized cdna for 454 sequencing in the unitat de genòmica (sct-ub, barcelona, spain). full-length-enriched double stranded cdna was synthesized from 1.5 µg of pooled total rna using the mint cdna synthesis kit (evrogen, moscow, russia) according to the manufacturer's protocol, and was subsequently purified using the qiaquick pcr purification kit (qiagen usa, valencia, ca). 
the amplified cdna was normalized using the trimmer kit (evrogen, moscow, russia) to minimize differences in representation of transcripts. the method involves denaturation-reassociation of cdna, followed by digestion with a duplex-specific nuclease (dsn) enzyme [85, 86]. the enzymatic degradation occurs primarily on the highly abundant cdna fraction. the single-stranded cdna fraction was then amplified twice by sequential pcr reactions according to the manufacturer's protocol. normalized cdna was purified using the qiaquick pcr purification kit (qiagen usa, valencia, ca). to generate the 454 library, 500 ng of normalized cdna were used. cdna was fractionated into small, 300- to 800-basepair fragments and the specific a and b adaptors were ligated to both the 3' and 5' ends of the fragments. the a and b adaptors were used for purification, amplification, and sequencing steps. [figure 8 legend, abbreviations (partial): ... domain; pkc: protein kinase c; pten: phosphatidylinositol-3,4,5-trisphosphate 3-phosphatase and dual-specificity protein phosphatase pten; raidd: caspase and rip adapter with death domain; tnf r1: tumor necrosis factor receptor 1; tnf-α: tumor necrosis factor alpha; tradd: tnf receptor type 1-associated death domain protein; traf2: tnf receptor-associated factor 2; trail: tnf-related apoptosis-inducing ligand; trail decoy: decoy trail receptor without death domain; trail-r: trail receptor. doi:10.1371/journal.pone.0035009.g008] one sequencing run was performed on the gs-flx using titanium chemistry. 454 sequencing is based on sequencing-by-synthesis: the addition of one or more nucleotides complementary to the template strand results in a chemiluminescent signal recorded by the ccd camera within the instrument. the signal strength is proportional to the number of nucleotides incorporated in a single nucleotide flow. all reagents and protocols used were from roche 454 life sciences, usa. 
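the proportionality between signal strength and the number of incorporated nucleotides is what allows a series of flow signals to be turned into a base sequence. the python sketch below illustrates the idea only. it assumes a cyclic flow order of t, a, c, g and signals already normalized so that one incorporated nucleotide gives a value near 1.0; the function name `call_bases` and the toy signal values are hypothetical, and the real roche software uses more elaborate signal processing and quality scoring.

```python
def call_bases(flow_values, flow_order="TACG"):
    """Translate normalized flow signals into a nucleotide string.

    Each flow tests one nucleotide; the rounded signal is taken as the
    length of the homopolymer incorporated in that flow. This simple
    rounding is an assumption made for illustration.
    """
    bases = []
    for i, signal in enumerate(flow_values):
        if signal < 0:
            raise ValueError("flow signals must be non-negative")
        n_incorporated = int(signal + 0.5)  # round to nearest integer
        nucleotide = flow_order[i % len(flow_order)]
        bases.append(nucleotide * n_incorporated)
    return "".join(bases)


# toy flowgram: two full cycles of the four nucleotide flows
toy_flows = [1.02, 0.05, 2.10, 0.00, 0.10, 0.97, 0.03, 3.05]
toy_read = call_bases(toy_flows)
```

because a homopolymer run is read from a single signal, values far from an integer (for example 2.5) are ambiguous, which is one reason homopolymers are screened before assembly.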
pyrosequencing raw data, comprising 975,190 reads, were processed with the roche quality control pipeline using the default settings. seqclean (http://compbio.dfci.harvard.edu/tgi/software/) software was used to screen for and remove normalization adaptor sequences, homopolymers and reads shorter than 40 bp prior to assembly. a total of 974,973 quality reads were subjected to mira, version 3.2.0 [87], to assemble the transcriptome. by default, mira takes into account only contigs with at least 2 reads. the other reads go into debris, which might include singletons, repeats, low complexity sequences and sequences shorter than 40 bp. ncbi blastclust was used to group similar contigs into clusters (groups of transcripts from the same gene). two sequences were grouped if at least 60% of the positions had at least 95% identity. the 51,265 contigs were grouped into a total of 29,679 clusters. an iterative blast workflow was used to annotate the r. philippinarum contigs with at least 100 bp (49,847 contigs out of 51,265). blastx [88], with a cut-off e-value of 10e-3, was used to compare the r. philippinarum contigs with the ncbi swissprot, the ncbi metazoan refseq, the ncbi nr and the uniprotkb/trembl protein databases. after annotation, blast2go software [89] was used to assign gene ontology terms [90] to the longest contig of each representative cluster (minimum of 100 bp). this strategy was used to avoid redundant results. default values in blast2go were used to perform the analysis and ontology level 2 was selected to construct the pie charts. to make a comparison between r. philippinarum and other bivalve species, the nucleotide sequences and ests from c. gigas, m. galloprovincialis, l. elliptica and b. azoricus were obtained from genbank and from dedicated databases, when available [93]. unique sequences (based on gi number) from each of these databases were used. 
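the grouping rule stated above (at least 60% of the positions aligned with at least 95% identity) can be illustrated with a small single-linkage clustering in python. this is a sketch under stated assumptions and not a reimplementation of ncbi blastclust: the pairwise alignments are taken as given, coverage is measured here against the longer of the two sequences, and the names `cluster_contigs` and `longest_per_cluster` are hypothetical. the second helper picks the longest contig of each cluster, as was done for the go assignment and the cross-species comparison.

```python
def cluster_contigs(lengths, alignments, min_cov=0.60, min_ident=95.0):
    """Single-linkage clustering of contigs from pairwise alignments.

    lengths:    dict contig name -> length in bp
    alignments: iterable of (name_a, name_b, aligned_length, percent_identity)
    Coverage relative to the longer sequence is an assumption of this sketch.
    """
    parent = {name: name for name in lengths}

    def find(x):
        # union-find lookup with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, aligned_len, ident in alignments:
        coverage = aligned_len / max(lengths[a], lengths[b])
        if coverage >= min_cov and ident >= min_ident:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for name in lengths:
        clusters.setdefault(find(name), []).append(name)
    return sorted((sorted(members) for members in clusters.values()),
                  key=lambda members: members[0])


def longest_per_cluster(clusters, lengths):
    """Return the longest contig of each cluster as its representative."""
    return [max(members, key=lambda name: lengths[name]) for members in clusters]


# toy data: only contig1 and contig2 satisfy both thresholds
toy_lengths = {"contig1": 1000, "contig2": 800, "contig3": 500, "contig4": 600}
toy_alignments = [
    ("contig1", "contig2", 700, 97.0),  # coverage 0.70, identity 97% -> grouped
    ("contig2", "contig3", 300, 99.0),  # coverage 0.375 -> not grouped
    ("contig3", "contig4", 550, 94.0),  # identity below 95% -> not grouped
]
toy_clusters = cluster_contigs(toy_lengths, toy_alignments)
toy_representatives = longest_per_cluster(toy_clusters, toy_lengths)
```

single linkage means that two contigs can end up in the same cluster through a chain of pairwise matches even if they do not match each other directly.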
these sequences were compared by blastn against the longest contig from each of the 29,679 r. philippinarum clusters with a cut-off e-value of 10e-05. hits to r. philippinarum sequences were represented in a venn diagram. the comparison between our 454 results (the longest contig from each of the 29,679 clusters) and the milan et al. [16] transcriptome (contigs downloaded from ruphibase, http://compgen.bio.unipd.it/ruphibase/query/) was made by blastn with a cut-off e-value of 10e-05. a further analysis compared only the longest contig from each of the 2,005 clusters identified as immune-related with the milan et al. contigs. the results were summarized in table 2. the percentage of coverage is the average % of query coverage by the best blast hit, and the percentage of hits is the % of queries with at least one hit in the database; the total number of hits is given in parentheses.

identification of immune-related genes

all the contig annotations were reviewed against an immunity and inflammation-related keyword list (e.g. apoptosis, bactericidal, c3, lectin, socs...) developed in our laboratory to select the candidate sequences putatively involved in immune response. the presence or absence of these words in the blastx hit descriptions was checked to identify putative immune-related contigs. the remaining non-selected contigs were reviewed using those go terms at levels 2, 3 and 4, assigned to each sequence after the annotation step, that had a direct relationship with immunity. selected contigs were checked again to eliminate non-immune ones and distributed into functional categories. immune-related genes were grouped in three reference immune pathways (complement cascade, tlr signaling pathway and apoptosis) to describe each route indicated by our pyrosequencing results. to identify and classify the groups of organisms that had high similarity with our clam sequences, the uniprot taxonomy [94] was used except for the protozoa group.
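the keyword screen of blastx hit descriptions described in this section could be sketched as below. only the five example keywords quoted in the text are used; the full laboratory list is not given in the source, and the matching rule (case-insensitive match at the start of a word) is an assumption made for illustration.

```python
import re

# Only the example keywords quoted in the text; the complete list used by
# the authors is not available here.
EXAMPLE_KEYWORDS = ["apoptosis", "bactericidal", "c3", "lectin", "socs"]


def is_putative_immune(hit_description, keywords=EXAMPLE_KEYWORDS):
    """Return True if any keyword starts a word in the hit description.

    Matching at a word start (assumed rule) lets "lectin" match "lectins"
    but not "galectin", and keeps "c3" from matching inside other tokens.
    """
    text = hit_description.lower()
    return any(re.search(r"\b" + re.escape(k), text) for k in keywords)


descriptions = [
    "Complement C3 precursor",
    "C-type lectin domain family 4 member E",
    "60S ribosomal protein L3",
]
for description in descriptions:
    print(description, "->", is_putative_immune(description))
```

a screen of this kind only proposes candidates, which is why the text describes a second manual check to remove non-immune contigs.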
because protozoa are a highly complex group, a specific taxonomy [95] was followed. briefly, after the blastx annotation step all the hit descriptions included the species name (i.e. homo sapiens) or a code (i.e. human) meaning that protein has been previously identified as belonging to that species. with such information sequences were classified in taxonomical groups and represented in pie charts. table s1 list of contigs (e-value,10-3) of ruditapes philippinarum including sequence, length, description (hit description), accession number of description (hit acc), e-value obtained and database used for annotation (blast). study of diseases and the immune system of bivalves using molecular biology and genomics bacterial disease in marine bivalves, review of recent studies. trends and evolution perkinsosis in molluscs: a review bacteria-hemocyte interactions and phagocytosis in bivalves role of lectins (c-reactive protein) in defense of marine bivalves against bacteria modulation of the chemiluminescence response of mediterranean mussel (mytilus galloprovincialis) haemocytes immune parameters in carpet shell clams naturally infected with perkinsus atlanticus nitric oxide production by carpet shell clam (ruditapes decussatus) hemocytes generation and analysis of a 29,745 unique expressed sequence tags from the pacific oyster (crassostrea gigas) assembled into a publicly accessible database, the gigasdatabase immune gene discovery by expressed sequence tags generated from hemocytes of the bacteria-challenged oyster, crassostrea gigas sequence variability of myticins identified in haemocytes from mussels suggests ancient host-pathogen interactions mytibase, a knowledgebase of mussel (m. 
galloprovincialis) transcribed sequences insights into the innate immunity of the mediterranean mussel mytilus galloprovincialis development of expressed sequence tags from the pearl oyster, pinctada martensii dunker gene expression analysis of clams ruditapes philippinarum and ruditapes decussatus following bacterial infection yields molecular insights into pathogen resistance and immunity transcriptome sequencing and microarray development for the manila clam, ruditapes philippinarum: genomic tools for environmental monitoring de novo assembly of the manila clam ruditapes philippinarum transcriptome provides new insights into expression bias, mitochondrial doubly uniparental inheritance and sex determination analysis of est and lectin expression in hemocytes of manila clams (ruditapes phylippinarum) (bivalvia, mollusca) infected with perkinsus olseni increasing genomic information in bivalves through new est collections in four species, development of new genetic markers for environmental studies and genome evolution rapid transcriptome characterization for a nonmodel organism using 454 pyrosequencing sequencing technologies -the next generation transcriptomic analysis of the clam meretrix meretrix on different larval stages ferritin functions as a proinflammatory cytokine via iron-independent protein kinase c zeta/nuclear factor kappab-regulated signaling in rat hepatic stellate cells cloning and characterization of an invertebrate type lysozyme from venerupis philippinarum c1q and tumor necrosis factor superfamily: modularity and versatility the regulation of inflammation by galectin-3 isolation, cdna cloning, and characterization of an 18-kda hemagglutinin and amebocyte aggregation factor from limulus polyphemus angiopoietin-like proteins: emerging targets for treatment of obesity and related metabolic diseases modulation of apolipoprotein d expression and translocation under specific stress conditions neuroprotective effect of apolipoprotein d against human 
coronavirus oc43-induced encephalitis in mice apolipoprotein d itm2bs regulates apoptosis by inducing loss of mitochondrial membrane potential transcriptomic signatures of ash (fraxinus spp.) phloem transcriptomics of the bed bug (cimex lectularius) complement and its role in innate and adaptive immune responses potential indicators of stress response identified by expressed sequence tag analysis of hemocytes and embryos from the american oyster, crassostrea virginica analysis of a cdna-derived sequence of a novel mannose-binding lectin, codakine, from the tropical clam codakia orbicularis a novel c1q-domaincontaining protein from zhikong scallop chlamys farreri with lipopolysaccharide binding activity the c1q domain containing proteins of the mediterranean mussel mytilus galloprovincialis: a widespread and diverse family of immune-related molecules mgc1q, a novel c1q-domain-containing protein involved in the immune response of mytilus galloprovincialis differentially expressed genes of the carpet shell clam ruditapes decussatus against perkinsus olseni characterization of a c3 and a factor b-like in the carpet-shell clam, ruditapes decussatus structural and functional diversity of lectin repertoires in invertebrates, protochordates and ectothermic vertebrates purification and characterisation of a lectin isolated from the manila clam ruditapes philippinarum in korea characterization, tissue expression, and immunohistochemical localization of mcl3, a c-type lectin produced by perkinsus olseni-infected manila clams (ruditapes philippinarum) noble tandem-repeat galectin of manila clam ruditapes philippinarum is induced upon infection with the protozoan parasite perkinsus olseni lectin from the manila clam ruditapes philippinarum is induced upon infection with the protozoan parasite perkinsus olseni cdna cloning and mrna expression of the lipopolysaccharide-and beta-1,3-glucan-binding protein gene from scallop chlamys farreri molecular cloning and characterization of 
a short type peptidoglycan recognition protein (cfpgrp-s1) cdna from zhikong scallop chlamys farreri molecular cloning and mrna expression of peptidoglycan recognition protein (pgrp) gene in bay scallop (argopecten irradians, lamarck 1819) distribution of multiple peptidoglycan recognition proteins in the tissues of pacific oyster, crassostrea gigas molecular cloning and expression of a toll receptor gene homologue from zhikong scallop, chlamys farreri pattern recognition receptors and inflammation the evolution of vertebrate toll-like receptors a tolllike receptor that prevents infection by uropathogenic bacteria unc93b1 delivers nucleotide-sensing toll-like receptors to endolysosomes cg-timp, an inducible tissue inhibitor of metalloproteinase from the pacific oyster crassostrea gigas with a potential role in wound healing and defense mechanisms molecular cloning, characterization and expression of a novel serine proteinase inhibitor gene in bay scallops (argopecten irradians, lamarck 1819) molecular cloning and expression of a novel kazal-type serine proteinase inhibitor gene from zhikong scallop chlamys farreri, and the inhibitory activity of its recombinant domain a novel slowtight binding serine protease inhibitor from eastern oyster (crassostrea virginica) plasma inhibits perkinsin, the major extracellular protease of the oyster protozoan parasite perkinsus marinus evidence indicating the existence of a novel family of serine protease inhibitors that may be involved in marine invertebrate immunity serine protease inhibitor cvsi-1 potential role in the eastern oyster host defense against the protozoan parasite perkinsus marinus antimicrobial peptides: pore formers or metabolic inhibitors in bacteria? innate immunity. 
isolation of several cysteine-rich antimicrobial peptides from the blood of a mollusc, mytilus edulis a member of the arthropod defensin family from edible mediterranean mussels (mytilus galloprovincialis) evidence of high individual diversity on myticin c in mussel (mytilus galloprovincialis) mytilus galloprovincialis myticin c: a chemotactic molecule with antiviral activity and immunoregulatory properties analysis of differentially expressed genes in response to bacterial stimulation in hemocytes of the carpetshell clam ruditapes decussatus: identification of new antimicrobial peptides molecular characterization of a novel big defensin from clam venerupis philippinarum heat shock proteins: facts, thoughts, and dreams heat shock proteins, cellular chaperones that modulate mitochondrial cell death pathways djla, a membrane-anchored dnaj-like protein, is required for cytotoxicity of clam pathogen vibrio tapetis to hemocytes alternation of venerupis philippinarum hsp40 gene expression in response to pathogen challenge and heavy metal exposure identification of two small heat shock proteins with different response profile to cadmium and pathogen stresses in venerupis philippinarum molecular characterization and expression analysis of a putative lps-induced tnf-alpha factor (litaf) from pearl oyster pinctada fucata cloning, characterization and expression analysis of the gene for a putative lipopolysaccharide-induced tnf-alpha factor of the pacific oyster molecular cloning and characterization of a putative lipopolysaccharide-induced tnf-alpha factor (litaf) gene homologue from zhikong scallop chlamys farreri first molluscan tnfr homologue in zhikong scallop: molecular characterization and expression analysis identification and expression of traf6 (tnf receptor-associated factor 6) gene in zhikong scallop chlamys farreri diversity and pathogenecity of vibrio species in cultured bivalve molluscs an apoptosis-inhibiting baculovirus gene with a zinc finger-like motif 
detection of ostreid herpesvirus 1 dna by pcr in bivalve molluscs: a critical review synopsis of infectious diseases and parasites of commercially exploited shellfish a fungus disease in clam and oyster larvae a novel method for snp detection using a new duplex-specific nuclease from crab hepatopancreas simple cdna normalization using kamchatka crab duplex-specific nuclease using the miraest assembler for reliable and automated mrna transcript assembly and snp detection in sequenced ests basic local alignment search tool blast2go, a universal tool for annotation, visualization and analysis in functional genomics research gene ontology, tool for the unification of biology. the gene ontology consortium pyrosequencing of mytilus galloprovincialis cdnas: tissue-specific expression patterns insights into shell deposition in the antarctic bivalve laternula elliptica: gene discovery in the mantle transcriptome using 454 pyrosequencing highthroughput sequencing and analysis of the gill tissue transcriptome from the deep-sea hydrothermal vent mussel bathymodiolus azoricus newt, a new taxonomy portal the new higher level classification of eukaryotes with emphasis on the taxonomy of protists key: cord-022262-ck2lhojz authors: gromeier, matthias; wimmer, eckard; gorbalenya, alexander e. title: genetics, pathogenesis and evolution of picornaviruses date: 2007-09-02 journal: origin and evolution of viruses doi: 10.1016/b978-012220360-2/50013-1 sha: doc_id: 22262 cord_uid: ck2lhojz the discovery of viruses heralded an exciting new era for research in the medical and biological sciences. it has been realized that the cellular receptor guiding a virus to a target cell cannot be the sole determinant of a virus's pathogenic potential. comparative analyses of the structures of genomes and their products have placed the picornaviruses into a large “picorna-like” virus family, in which they occupy a prominent place. 
most human picornavirus infections are self-limiting, yet the enormously high rate of picornavirus infections in the human population can lead to a significant incidence of disease complications that may be permanently debilitating or even fatal. picornaviruses employ one of the simplest imaginable genetic systems: they consist of single-stranded rna that encodes only a single multidomain polypeptide, the polyprotein. the rna is packaged into a small, rigid, naked, and icosahedral virion whose proteins are unmodified except for a myristate at the n-termini of vp4. the rna itself does not contain modified bases. the key to ultimately understanding picornaviruses may be to rationalize the huge amount of information about these viruses from the perspective of evolution. it is possible that the replicative apparatus of picornaviruses originated in the precellular world and was subsequently refined in the course of thousands of generations in a slowly evolving environment. picornaviruses cultivated the art of adaptation, which has allowed them to “jump” into new niches offered in the biological world.

the discovery of viruses heralded an exciting new era for research in the medical and biological sciences. many contemporary virologists do not know, however, that the first animal virus described was a picornavirus, the etiological agent of the dreaded foot-and-mouth disease in cloven-footed animals. the discovery of foot-and-mouth disease virus (fmdv) by f. loeffler and p. frosch in 1898 (loeffler and frosch, 1898) occurred at the same time as m.w. beijerinck described the amazing "contagium vivum fluidum" in 1898. this liquid was a filtered leaf extract derived from tobacco plants suffering from tobacco mosaic disease. although free of bacteria, it was still able to transmit the disease to uninfected plants. already in 1892, i. 
ivanovski had made a similar observation with tobacco mosaic virus but apparently he was unable to fully convince his peers of the significance of his discovery (waterson and wilkinson, 1978). research on viruses, now formally in its hundred-and-first year, has yielded an immense harvest of biochemical and biological information. the studies were driven not only by an urgent need to understand, and possibly prevent, viral disease; they were also fueled by a strong curiosity about the minute biologicals called viruses, which we can view as chemicals on the one hand and as "living" entities on the other. poliovirus is an exquisite example of a chemical with a known empirical formula (molla et al., 1991) that can be crystallized (schaffer and schwerdt, 1955) yet causes a devastating disease in humans. poliovirus was discovered 90 years ago by landsteiner and popper (1909) to be the causative agent of poliomyelitis. the current knowledge of its chemical and three-dimensional structure and of its life cycle and pathogenesis is second to none. indeed, the intense research efforts on poliovirus over a period of nine decades will lead to its demise in the near future: global eradication of poliovirus is considered possible by the year 2000. following the identification of fmdv and poliovirus, a deluge of other viruses with similar properties was uncovered. these viruses have now been classified as picornaviridae, a large family of small (lat. pico) rna (rna) viruses. currently, picornaviridae consists of six genera: enterovirus, rhinovirus, hepatovirus, parechovirus, cardiovirus and aphthovirus (table 12.1). the first four genera include predominantly human pathogens, which cause a bewildering array of disease syndromes. 
although a disease syndrome may be considered characteristic for a specific picornavirus group, the same syndrome can possibly also be produced by other picornaviruses.

table 12.1 (fragment; rows only partly recoverable, listed as virus, common human syndromes, receptor and reference number):
b: coxsackieviruses a9 (see cav above), poliomyelitis, αvβ3 (1)
coxsackieviruses b1-6: myocarditis, pleurodynia, meningitis; hcar*, daf (2)
hand-foot-and-mouth disease, respiratory disease, neonatal infections, meningitis, encephalitis, pleurodynia, exanthema
echoviruses 1-9, 11-21, 24-27, 29-33: vla-2 (=α2β1), daf (=cd55)
enterovirus 69: nd
c: poliovirus types 1-3: poliomyelitis, meningitis; cd155 (5)
coxsackieviruses 1, 11, 13, 15, 17, 18-22, 24: common cold, infantile diarrhea; icam-1†

the following viruses have been recognized as picornaviruses on the basis of their genome sequences and physico-chemical properties as well as the result of comparative sequence analyses (see the section on evolution): equine rhinovirus types 1 and 2, aichi virus, porcine enterovirus, avian encephalomyelitis virus, infectious flacherie virus of silkworm. clusters of enteroviruses refer to groups of enteroviruses arranged predominantly according to genotypic kinship (hyypia et al., 1997). more clusters, including mainly animal enteroviruses, have been proposed. list of human syndromes adapted from melnick, 1996. common syndromes in humans caused predominantly by one and/or other member(s) of the cluster but member viruses of other clusters may cause the same syndrome. receptors may be specific for specific serotypes. for details, see text. references describing the identification of receptors: (1) roivainen et al., 1994; (2) tomko and philipson, 1997; shafren et al., 1997; (3) bergelson et al., 1992; (4) bergelson et al., 1994; ward et al., 1994; (5) mendelsohn et al., 1989; (6) shafren et al., 1997; (7) staunton et al., 1989; greve et al., 1989; (8) hofer et al., 1994; (9) feigelstock et al., 1998; (10) neff et al., 1998; berinstein et al., 1995; (11) jackson et al., 1996; (12) huber, 1994. * shared with adenovirus type 2. † daf (decay accelerating factor) may function as non-essential (infection-augmenting) coreceptor. coxsackie virus a24v is a genetic variant of coxsackie virus a24. ** pringle, 1996.

it has been realized that the cellular receptor guiding a virus to a target cell cannot be the sole determinant of a virus's pathogenic potential. indeed, it is a major challenge of the day to decipher the molecular mechanism(s) that determine viral tissue tropism and disease. what is the identity of picornaviruses? it relates to ancestral viruses whose identity we will never know. comparative analyses of the structures of genomes and their products, however, have placed the picornaviruses into a large "picorna-like" virus family, in which they occupy a prominent place (discussed in the section on evolution). these same analyses have led to an evolutionary tree of picornaviruses that reveals the extent of kinship (figure 12.1a). one result of these phylogenetic investigations was a radical reorganization of the taxonomy of enterovirus, a genus of picornaviridae comprising numerous members infecting the gastrointestinal tract. the enteroviruses have now been divided into clusters (table 12.1; figure 12.1b) grouping the viruses mainly corresponding to genotypes (hyypia et al., 1997). earlier classifications were based (1) on specific properties of the virions, (2) on disease patterns, (3) on the apparent absence of pathogenesis (echo is an acronym for "enteric cytopathic human orphan" because no disease was originally correlated with these viruses), or (4) in reference to the site of discovery (e.g. the town of coxsackie in new york state) and pathogenesis in suckling mice. as the number of known enteroviruses increased and the properties of these new isolates were elucidated, the need for a modified classification became apparent. however, even the latest dendrograms are likely to be modified again. 
principally, viruses that have been classified as belonging to a specific genus may be further divided into serotypes. a serotype is defined by the virus's ability to elicit a set of neutralizing antibodies ("antiserum") in a host animal; this set of neutralizing antibodies will generally not neutralize any other virus, regardless of the origin of the antiserum. neutralizing antibodies, in turn, are elicited by structures specific for a virus's capsid, and these structures have been referred to as neutralization antigenic determinants (or sites). the poliovirion carries at its surface four distinct neutralization antigenic determinants (minor, 1990). however, poliovirus expresses only three unique sets of these four determinants; hence poliovirus occurs in three serotypes. hepatitis a virus, on the other hand, expresses only one set of neutralization antigenic determinants; hence, it occurs in only one serotype. in contrast, human rhinoviruses (hrv) can express more than 100 unique sets of four antigenic determinants. rhinoviruses, therefore, occur in more than 100 serotypes. it should be noted that a poliovirus has been constructed that expresses neutralization antigenic determinants of all three serotypes. this virus, which is severely handicapped in proliferation, is trivalent as it can be neutralized by all three serotype-specific antibodies (murdin et al., 1992). a genus consisting of viruses that cause the same disease syndrome can be subdivided further on the basis of receptor use. for example, all member viruses of the genus rhinovirus cause the common cold, yet they use two different receptors (icam-1 and ldl receptor; table 12.1). on the basis of genotypes, however, this division no longer holds up (figure 12.1). as mentioned, the enteroviruses have now been subdivided into clusters based on genotypes (table 12.1, figure 12.1b). for example, the c-cluster consists of the three serotypes of poliovirus and of coxsackie a virus (cav) serotypes 1, 11, 13, 15, 17, 18-22 and 24 (the c-cavs). 
originally the c-cavs were not considered related to polioviruses because of the profound difference in pathogenesis (common cold and poliomyelitis, respectively) and the different use of receptor (icam-1 and cd155, respectively). however, their very close kinship was revealed by genome sequence. this proximity has led to the interesting question of whether the c-cavs are genetic variants of poliovirus (harber et al., 1995) or vice versa (discussed in detail in the section on evolution). an interesting recent variant of cav24 is cav24v, an agent that emerged in the early 1970s and that causes acute hemorrhagic conjunctivitis. this syndrome is also associated with a new variant of enterovirus 70, a d-cluster enterovirus (table 12.1; yin-murphy, 1973). the phenomenon of the sudden appearance of enterovirus strains causing human diseases not previously associated with picornaviruses is of greatest interest with respect to the dynamics of picornavirus diversification, particularly in view of the eradication of poliovirus. what are the mechanisms by which the picornaviruses and other rna virus families have diversified? clearly, the genetic program inscribed into the viral genome is being changed as the viruses acquire new genetic traits. the predominant driving force of the changes in the genotype is adaptation to new opportunities to proliferate. in the following, we will discuss some mechanisms and rules of genetic diversification and evolution of picornaviruses.

sequences involved in rna replication, and the internal ribosomal entry site (ires), controlling translation. the virus-encoded 5'-terminal protein, vpg (viral protein genome-linked), is covalently linked to the 5'-terminal uridylic acid via an o4-(5'-uridylyl)tyrosine bond (lee et al., 1977; nomoto et al., 1977b; rothberg et al., 1978). picornavirus vpgs are 22-24 amino acids long; their third amino acid (from the n-terminus) is always a tyrosine, the residue linking vpg to the genome. 
genome-linked proteins are quite common amongst viruses belonging to the picorna-like superfamily (see figure 12.10). picornavirus vpgs are attached to 5'-terminal nucleotide sequences that form complex structures typical for entero- and rhinoviruses on the one hand, or cardio-, aphtho- and hepatoviruses on the other. these sequences are important signals in genome replication. entero- and rhinoviruses share a cloverleaf structure (rivera et al., 1988; andino et al., 1990) that has been subject to intense studies (see below). relatively little is known about the role of corresponding structural elements (which do not form cloverleaves) of cardio-, aphtho- and hepatovirus genomes. the cloverleaf is followed by the internal ribosomal entry site (ires), arguably the most complex cis-acting element in any rna virus genome known (figures 12.2, 12.3; wimmer et al., 1993). picornavirus ires elements, which are approximately 400 nt long, regulate the initiation of polyprotein synthesis. in contrast to cap-dependent "scanning", ireses promote internal ribosomal entry, i.e. they allow initiation of translation independently of a capping group and even a free 5' end (jang et al., 1988, 1989; pelletier and sonenberg, 1988; molla et al., 1992; chen and sarnow, 1995). remarkably, ires elements are defined by their function, not by their sequences or apparent higher-order structure(s). this is illustrated in figure 12.3, which depicts the sequence and folding pattern of the ires elements of poliovirus and encephalomyocarditis virus (emcv; pilipenko et al., 1989a,b). although the two elements differ in sequence and folding pattern, the poliovirus ires has been exchanged with that of emcv, leading to a novel chimeric virus with excellent growth properties. similarly, the ires of hepatitis c virus (hcv), a flavivirus, was found to functionally substitute for the poliovirus ires, yielding a polio/hcv chimeric virus (lu and wimmer, 1996; zhao et al., 1999). 
finally, a construct in which the ires of human rhinovirus type 2 (hrv2) replaced that of poliovirus yielded a pv/hrv chimeric virus (pv1(ripo)) that is indistinguishable from wt poliovirus with respect to replication in hela cells yet is highly attenuated in poliovirus-receptor-transgenic mice and in monkeys (gromeier et al., 1996, 1999a; discussed in the section on pathogenesis). the properties of this interesting novel virus will be discussed in a later section.

figure 12.3 sequences and secondary structures of ires elements of poliovirus and encephalomyocarditis virus. a. poliovirus ires; individual domains have been labeled with roman numerals. b. encephalomyocarditis virus (emcv) ires; domains have been labeled with capital letters. both ireses contain a conserved ynxmaug motif, of which the oligopyrimidine stretch (yn) and the aug triplet are indicated by solid bars. note that in the emcv ires, the aug triplet of the ynxmaug motif is the initiating codon of the polyprotein. in the poliovirus ires, this aug triplet is silent and is separated from an aug codon initiating the synthesis of the polyprotein by a "spacer sequence" of 154 nt (jang et al., 1990). single attenuating mutations in the poliovirus vaccine strains map to domain v (wimmer et al., 1993).

the mechanism by which ires elements function is still obscure. translation of picornavirus mrna is initiated downstream of the ires to yield an unstable "polyprotein" that is rapidly cleaved by virus-encoded proteinases to proteins involved in viral proliferation (figure 12.4; see also the section on evolution and figure 12.10). it is important to note that the mrna found in viral polyribosomes that encodes the polyprotein differs from virion rna in one important aspect: it is terminated with pup ... (hewlett et al., 1976; nomoto et al., 1976). apparently, the terminal protein vpg has been cleaved from incoming or from newly synthesized rna. 
it has been suggested that the enzyme cleaving the vpg-pup phosphodiester bond is of cellular origin, but the reason for the removal of the protein and the nature of the enzyme catalyzing it remain unknown. moreover, it is not clear whether the incoming vpg-linked virion rna will be processed immediately after entry or whether the removal of vpg will occur only after the first round(s) of viral protein synthesis. entero- and rhinoviruses encode the two proteinases 2apro and 3c/3cdpro, aphthoviruses the two proteinases lpro and 3cpro, and cardioviruses only the proteinase 3cpro. interestingly, both cardioviruses and aphthoviruses have evolved a peculiar cleavage mechanism between 2a and 2b that occurs only in cis and is an enzyme-independent reaction (reviewed by ryan and flint, 1997). a similar, as yet unknown, mechanism of proteolytic cleavage is that between vp4 and vp2 (figure 12.4d), which occurs only during maturation of the virion (maturation cleavage) and appears also to be proteinase-independent (harber et al., 1991; see below). the origin of these fascinating enzymes and of specific cleavage events is discussed in the section on evolution. since most details of proteolytic processing have been accumulated for poliovirus, much of the following discussion will center on this viral system. the two poliovirus proteinases 2apro and 3c/3cdpro cleave at different sites, as determined by the sequences of the scissile bond (figure 12.4b, c). theoretically, the poliovirus polyprotein could give rise to 77 different cleavage products if proteolytic processing by these enzymes and the maturation cleavage were entirely random (wimmer et al., 1993). in fact, only roughly 29-30 cleavage products have been identified in poliovirus-infected cells (nicklin et al., 1986). 
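the figure of 77 theoretical products is consistent with a simple combinatorial count, sketched below. this is our reading and not a derivation given in the text: it assumes 11 cleavage sites on the linear polyprotein (the maturation cleavage, the 2apro sites including the one inside 3d that yields 3c' and 3d', and the 3c/3cdpro sites), that any contiguous stretch between two boundaries counts as a possible product, and that the uncleaved polyprotein itself is excluded. the site list is taken from the standard poliovirus processing map and should be treated as an assumption.

```python
from itertools import combinations

# Assumed cleavage sites (junction, responsible activity); not a list
# given explicitly in the text.
SITES = [
    ("vp4/vp2", "maturation"),
    ("vp2/vp3", "3c/3cd"),
    ("vp3/vp1", "3c/3cd"),
    ("vp1/2a", "2a"),
    ("2a/2b", "3c/3cd"),
    ("2b/2c", "3c/3cd"),
    ("2c/3a", "3c/3cd"),
    ("3a/3b", "3c/3cd"),
    ("3b/3c", "3c/3cd"),
    ("3c/3d", "3c/3cd"),
    ("within 3d (3c'/3d')", "2a"),
]


def count_random_products(n_sites):
    """Count contiguous fragments of a chain with n_sites cleavage sites."""
    # Boundaries are the n-terminus, every cleavage site and the c-terminus.
    boundaries = range(n_sites + 2)
    # Each pair of distinct boundaries delimits one contiguous fragment.
    fragments = list(combinations(boundaries, 2))
    # Exclude the full-length, uncleaved polyprotein.
    return len(fragments) - 1


print(count_random_products(len(SITES)))  # 77
```

under these assumptions, 12 segments give 12 x 13 / 2 = 78 contiguous fragments, or 77 once the intact polyprotein is left out, against the roughly 29-30 products actually observed.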
it has thus been concluded that processing of the picornavirus polyproteins is not random but follows a pathway that is determined by protein folding (masking of cleavage sites) and by the amino-acid sequences surrounding the scissile bond (figure 12.4b, c; harris et al., 1990). for example, the precursor 3cdpro can be cleaved into 3cpro and 3dpol by a (cis?) cleavage in which the 3c/3cdpro proteinase is involved. both 3cpro and 3dpol are quite stable end-products of processing. however, in the case of poliovirus type 1 (mahoney) (pv1(m)), 3cdpro can also be efficiently processed in trans by 2apro to 3c' and 3d' (figure 12.4c), two polypeptides with no apparent function in viral proliferation (lee and wimmer, 1988). just like 3cpro and 3dpol, 3c' and 3d' are quite stable end-products of processing, even though 3dpol harbors a perfect cleavage site for 2apro and 3c' harbors a cleavage site for 3c/3cdpro (figure 12.4). indeed, in pv1(m)-infected cells, nearly equal amounts of the four cleavage products of precursor 3cdpro are observed. it is assumed that structural constraints mask one or the other cleavage site from recognition and processing once the cleavage product has been formed (lee and wimmer, 1988). the preferred cleavage sequence for 3c/3cdpro in the poliovirus polyprotein is axxq*g; hence, cleavage sites with this sequence are usually rapidly processed. numerous mutational studies have supported the identity of this 3c/3cdpro cleavage motif (reviewed in dougherty and semler, 1993, and wimmer et al., 1993). an intriguing genetic analysis has made use of a viral construct that mutated this axxq*g cleavage motif at a specific site in order to avoid proteolytic processing. the amino acids placed by the mutants into the motif confirmed the proposed interaction between substrate and enzyme during cleavage (cao and wimmer, 1996, and references therein).

figure 12.4 processing scheme and cleavage sites of the poliovirus polyprotein. a. proteolytic cleavages of the polyprotein. triangles indicate cleavage by 3cpro and/or 3cdpro. note that both enzymatic entities can efficiently cleave the non-structural proteins. in contrast, the p1 capsid precursor can be processed by 3cdpro only. solid triangles represent efficient cleavage sites, whereas open triangles represent slowly cleaved sites resulting in stable precursor proteins. the 2apro cleavages are depicted with circles. only the cleavage between p1 and p2-p3 (solid circle) is essential, whereas the cleavage of 3cdpro to 3c' and 3d' is dispensable (open circle). the maturation cleavage is indicated by the open diamond. the mechanism by which this cleavage occurs is unknown. numbers in brackets indicate the molecular weight in kda. b-d. amino-acid residues at sites cleaved (b) by 3cpro and/or 3cdpro, (c) by 2apro and (d) during the maturation cleavage are shown in single-letter code. the positions of the amino-acid residues are designated p1, p2, p3, ... at the newly generated c-termini, or p1', p2', p3', ... at the newly generated n-termini. the fastest cleavages catalyzed by 3cpro/3cdpro occur at sites in which the p4 position is a small aliphatic amino acid (e.g. axxq*g). cleavage at tqsq*g between 3c and 3d is slow, giving rise to the 3cdpro cleavage intermediate with a long half-life (cao and wimmer, 1996).

as will be discussed later, poliovirus is a purist with respect to cleavage signals, since the scissile bond in all cleavages catalysed by 3c/3cdpro is q*g (kitamura et al., 1981; semler et al., 1981a, 1981b). in other picornaviruses, or viruses of the large picorna-like superfamily, the cleavage site may differ from the canonical q*g signal. a most important observation in studies of picornavirus proliferation is that cleavage intermediates may have important functions that in some cases may even be distinct from that of their end-products (e.g. 
3cdpro yielding 3cpro and 3dpol). the structure of picornavirus 3cpro enzymes was accurately predicted by gorbalenya et al. (1986), leading to the genetic analyses alluded to above. the structures were proved to be correct by x-ray crystallographic studies of 3cpro of human hepatitis a virus (allaire et al., 1994) and human rhinovirus 14 (matthews et al., 1994). following the orf, there is a heteropolymeric region that differs in length (72-126 nt) and structure among picornavirus genomes (xiang et al., 1997). however, all picornavirus genomes terminate with poly(a), as was shown first for poliovirus (yogo and wimmer, 1972). the role these sequences play in replication will be discussed below. the genomic rna of picornaviruses can serve as mrna and, consequently, it is of the same polarity as cellular mrna. by convention, this polarity has been designated plus-strand polarity (baltimore, 1971). fortunately, the genomic rna of picornaviruses is infectious; that is, upon transfection into suitable host cells, virion rna will initiate a complete infectious cycle (wimmer et al., 1993). interestingly, poliovirus and its purified genome will replicate even in enucleated cells (morgan-detjen et al., 1978), an observation suggesting that the nucleus does not contribute factors essential for viral proliferation. using reverse transcriptase, racaniello and baltimore (1981) generated full-length "complementary" dna (cdna) that contained the entire genetic information of the viral genome (currently, cdna refers to double-stranded dna generated from the original complementary dna strands). transfection into hela cells of this cdna, which contained heterologous dna sequences at either end of the virus-specific sequences, surprisingly generated poliovirus. with this experiment, "reverse genetics" of rna viruses was born, as the rna genome was now amenable to manipulations developed for dna.
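the 3c/3cdpro cleavage rule described earlier in this section (a q*g scissile bond, processed rapidly when p4 is a small aliphatic residue as in axxq*g, and slowly at a site such as tqsq*g) can be sketched as a simple sequence scan. the following is a minimal illustration only: the function name, the "fast"/"slow" labels, the restriction of "small aliphatic" to alanine, and the toy peptide are assumptions made for the example, and the scan ignores the masking of sites by protein folding that the text emphasizes.

```python
import re


def scan_qg_sites(protein, fast_p4=("a",)):
    """List candidate q*g sites as (p1_position, p4_to_p1prime, label).

    p1_position is the 1-based position of the glutamine (P1).
    The label is a heuristic based on the P4 residue only; it is an
    assumption of this sketch, not a measured cleavage rate.
    """
    seq = protein.lower()
    sites = []
    for match in re.finditer("qg", seq):
        q = match.start()                 # 0-based index of the P1 glutamine
        if q < 3:
            # too close to the N-terminus: no P4 residue to inspect
            context, label = seq[:q + 2], "unclassified"
        else:
            context = seq[q - 3:q + 2]    # P4 P3 P2 P1 P1'
            label = "fast" if context[0] in fast_p4 else "slow"
        sites.append((q + 1, context, label))
    return sites


# hypothetical toy peptide containing one axxq*g site and one tqsq*g site
toy_peptide = "mkaelqgpptqsqgeiq"
for site in scan_qg_sites(toy_peptide):
    print(site)
```

a scan of this kind only proposes candidate sites; whether a candidate is actually used depends on the structural context discussed above.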
the efficiency with which the original cdna clones induced an infectious cycle in hela cells was very low (about 10 pfu/µg dna; racaniello and baltimore, 1981). construction of plasmids that could replicate in transfected cells dramatically increased the specific infectivity to 10^3 pfu/µg dna (semler et al., 1984). however, reverse genetics was made more practical when the cdna was cloned downstream of the phage t7 rna transcriptase promoter and, using purified t7 transcriptase, virtually unlimited amounts of highly infectious transcript rna could be produced in a simple test-tube experiment (>10^5 pfu/µg of transcript rna; van der werf et al., 1986). this was important because mutant genomes with highly debilitating replication phenotypes could not be recovered by the inefficient cdna transfection method. it was known before that vpg is not required to be at the 5' end for poliovirus rna to be infectious (nomoto et al., 1977a). the 5' end of the t7 transcripts is pppgguuaaaa..., whereas that of virion rna is vpg-puuaaaa... the extra g residues do not prevent transfection but they reduce the specific infectivity of the transcript. in any event, picornavirus rna is quite tolerant of modifications of the 5' end of its genome and, in all cases, the virion rnas isolated after transfections have the authentic terminus restored (wimmer et al., 1993). infectious cdnas have now been generated from members of all picornaviruses. the method of choice to generate virus remains transfection of t7 transcripts. recently developed methods of rt/pcr allow researchers to generate infectious cdna clones in less than 1 month (tellier et al., 1996). in general terms, genome replication proceeds in two steps: synthesis of a complementary rna strand (-strand) that then serves as template for plus rna strands (+strands; figure 12.5).

figure 12.5 steps in the replication of the poliovirus genome. parental, positive-stranded virion rna (solid line) is transcribed, yielding -rna (broken line) after protein (vpg)-priming by the viral rna-dependent rna polymerase 3dpol (enzyme or any other proteins involved are not shown). a replicative intermediate (ri) form consisting of a single +strand template and multiple nascent -rna strands (a) has not been detected, so that, more probably, intermediates in -rna synthesis are either mainly single-stranded (b) or double-stranded (c). elongation of the nascent -rna (c) yields the replicative form (rf) double-stranded rna (d). available evidence suggests that the rf is an intermediate in genome replication (discussed in xiang et al., 1997). accordingly, a cloverleaf/rnp is formed at the end of the rf that promotes vpg-primed synthesis of +rna (e). the structures formed after multiple initiation could either be "closed" (entirely base-paired; f) or "open" (g). available evidence suggests that structure f is the correct intermediate (note that 3dpol is an unwindase). for details, see wimmer et al., 1993, and xiang et al., 1997. modified from wimmer et al., 1993.

the validity of this scheme has been known for almost three decades, yet only very few details of the individual steps have been elucidated. because the vast majority of studies have been carried out with poliovirus, this review will concentrate predominantly on this viral system. with the exception of the capsid proteins, all viral non-structural proteins and even processing intermediates have been implicated in genome replication (xiang et al., 1997). the evidence for the involvement of these proteins (2apro, 2b, 2bc, 2c, 3a, 3ab, vpg, 3c/3cdpro, 3dpol) is based largely on genetic data or on biochemical experiments assumed to be indicative of genome replication (wimmer et al., 1993; xiang et al., 1997).
for example, genetic and biochemical analyses of 3ab strongly suggest that this protein, a non-specific rna binding protein, and the proteinase 3cdpro participate in the formation of an initiation complex for +strands (xiang et al., 1997). another example is the involvement of 2c in rna replication. briefly, poliovirus rna synthesis is highly sensitive to the presence of 2 mm guanidine hydrochloride (gua hcl); poliovirus mutants resistant to 2 mm gua hcl harbor a single amino-acid exchange (n179a/g) in polypeptide 2c. it has recently been established that 2c is an atpase (and not a gtpase) and we now refer to it as 2c atpase (pfister and wimmer, 1999). the atpase activity of purified 2c atpase is inhibited by 2 mm gua hcl, whereas that of purified 2c atpase with an n179a/g mutation is resistant to this concentration of the drug (pfister and wimmer, 1999). on the basis of these considerations, it can be assumed that the atpase activity of 2c atpase is essential for genome replication. just as with 3ab or 3cdpro, however, the step(s) at which 2c atpase exerts its essential function are still unknown. the only proteins whose role in genome replication has been firmly established are vpg and 3dpol. the crystal structure of 3dpol has recently been solved (hansen and schultz, 1997), a result that will greatly advance our (limited) understanding of this important enzyme. importantly, 3dpol was established already in 1977 as being a primer-dependent and rna-dependent rna polymerase. although a deluge of circumstantial evidence suggested that a uridylylated form of vpg might serve as primer for 3dpol (nomoto et al., 1977b; wimmer, 1982; takeda et al., 1986; toyoda et al., 1987), direct evidence for this mechanism has been obtained only very recently (paul et al., 1998b). briefly, vpg is uridylylated to vpg-pu(pu) by the viral rna polymerase 3dpol in the presence of template (poly(a)).
vpg-pu(pu) then primes the transcription of poly(a), leading to the synthesis of poly(u), which is the 5' terminus of -strands (paul et al., 1998b). in spite of these seemingly simple experiments (paul et al., 1998b), the mechanism of initiation of rna synthesis was a matter of controversy for almost two decades. baltimore's and flanegan's groups presented evidence favoring "hairpin priming", whereas wimmer's group accumulated data suggesting "protein priming" (reviewed by richards and ehrenfeld, 1990). the controversy has finally been settled in favor of protein priming. at low concentration of enzyme, poliovirus polypeptide 3ab stimulates the transcriptional activity of 3dpol up to 100-fold (plotch et al., 1989; paul et al., 1994). indeed, biochemical and genetic evidence suggests that 3dpol and 3ab form a complex in solution (molla et al., 1994). the significance of these observations is not yet known. an important additional property of 3dpol is its ability to unwind double-stranded rna. that is, the enzyme, while transcribing a template, can replace a dormant rna strand that is hybridized to the template with the new strand that is just being synthesized (cho et al., 1993). it should be noted, however, that 3dpol is not a helicase, as it will not separate two strands without transcribing one of them (cho et al., 1993). the participation in picornavirus replication of cellular proteins, referred to by investigators as "host factors", has also had a history of controversies. several polypeptides were proposed to be involved in replication (e.g. a kinase or a uridylic acid transferase) but these proteins have disappeared after further analysis (richards and ehrenfeld, 1990).
ehrenfeld's and semler's groups have recently identified a cellular 38 kda rna binding protein, poly(rc) binding protein 2 (pcbp2), that is not only required for the function of the poliovirus ires but also has the propensity to bind, together with 3cdpro, to the poliovirus 5'-terminal cloverleaf (blyn et al., 1996). pcbp2 (or pcbp1, a protein related to pcbp2; gamarnik and andino, 1997) is undoubtedly the "host factor p36" that was originally proposed by baltimore's group to effect the binding of 3cdpro to the poliovirus cloverleaf (andino et al., 1990, 1993). andino et al. (1993) provided the first evidence suggesting that the formation of a specific protein/cloverleaf rnp complex consisting of viral protein 3cd, a cellular protein ("p36") and the viral rna is required for the initiation of +strand synthesis. this hypothesis has been further supported by the discovery of pcbp2 (gamarnik and andino, 1997; parsley et al., 1997). pcbp2 is therefore a sensible candidate for a "host factor" involved in poliovirus rna replication. however, poliovirus protein 3ab can replace pcbp2 in all biochemical reactions characteristic of the formation of a 5'-terminal rnp. moreover, 3ab and 3cdpro, both cleavage products of the p3 precursor (figure 12.4a), are associated in solution (molla et al., 1994). finally, the phenotypes of mutants of 3ab in vivo and in vitro support the conjecture that 3ab is involved in the formation of a cloverleaf/3cdpro/3ab complex (xiang et al., 1995a,b). currently, there is no compelling evidence in favor of the cloverleaf/3cdpro/pcbp2 complex over that of cloverleaf/3cdpro/3ab with respect to poliovirus genome replication (see a discussion in xiang et al., 1997). recognition of rna signals located somewhere in the rna genome is a prerequisite for specificity in genome replication.
this review will concentrate only on cis-acting elements of entero- and rhinoviruses because, as mentioned earlier, the overwhelming number of experiments deal with these viruses. currently, only the 5'-terminal cloverleaf has been firmly established as a cis-acting signal in enterovirus genome replication (see previous section), although the mechanism by which it functions is still obscure. clearly, the formation of a specific rnp plays a role, the significance of which will be discussed below. more complicated is the recognition of the +strand template for the initiation of -strands. since replication of picornavirus rnas commences at the 3'-terminal poly(a), a homopolymeric sequence found also in most cellular mrnas, poly(a) alone cannot be a determinant for virus-specific -strand synthesis. vpg-pu(pu) can be synthesized in the presence of poly(a), and vpg-poly(u), the 5' end of -strands, will follow the synthesis of the primer (paul et al., 1998). this reaction, however, does not reveal the mechanism of specificity. mutational analysis of the heteropolymeric sequence of the 3'ntr of enteroviruses indicated that this region was critically important for replication (pilipenko et al., 1996; melchers et al., 1997). however, the poliovirus 3'-terminal heteropolymeric sequence can be replaced with that of hrv14, a hairpin with no apparent homology with the poliovirus structure, and the resultant poliovirus/hrv14 hybrid genome replicated with wt kinetics (rohll et al., 1995). even more startling was a report from semler's group that presented evidence that the heteropolymeric region could be deleted altogether without loss of viability (todd et al., 1997). currently, the paradox intrinsic to these findings remains unsolved. it is possible that the 3' heteropolymeric region plays an important role in the efficient formation of an initiation complex for replication but to a much lesser extent in +strand template recognition.
the authentic recognition signal may reside in rna-internal sequences, as proposed by mcknight and lemon (1996). these authors reported that, surprisingly, a stem-loop structure mapping to the coding region of the hrv14 capsid proteins was absolutely necessary for genome replication. fittingly, a stem-loop rna structure that has been uncovered in poliovirus rna also appears to play a role in genome replication; it maps to the coding region of 2c atpase (goodfellow et al., 1998). the mechanism by which these new elements influence replication has yet to be resolved. finally, evidence has been presented suggesting that sequences within the ires play a role in genome replication (borman et al., 1994; shiroki et al., 1995). this is difficult to comprehend if one considers chimeric ires viruses. as mentioned above, the cognate ires of poliovirus can be replaced with ires elements from different viruses whose ires are merely related (hrv2, hrv14, cbv4, cav9, cav24, ev71; gromeier et al., 1996, 1999a and unpublished results) or entirely different (emcv, alexander et al., 1994; hcv, lu and wimmer, 1996; zhao et al., 1999) without loss of genome replication. defective interfering particles (di particles; see below) of poliovirus are naturally occurring variants with deletions (of varying sizes) in the p1 region, encoding the capsid proteins. di particles can replicate their rna without helper function but they need wt virus for encapsidation. sequence analyses of genomic rnas of di particles led nomoto and his colleagues to the surprising observation that in all cases the deletions were in-frame of the polyprotein coding sequence. on the other hand, artificial genomes engineered with out-of-frame deletions were unable to replicate their rna, even in the presence of wt helper virus (kuge et al., 1986; hagino-yamagishi and nomoto, 1989). it was concluded that translation was necessary for the cognate genome to replicate.
that is, translation had a cis effect on replication that could not be complemented in trans by a helper genome. these observations were later confirmed and extended (wimmer et al., 1993; novak and kirkegaard, 1994; agol et al., 1999). there are several hypotheses that are used to explain the phenomenon. the least likely is that certain replication proteins can only function in cis. if so, only viral mrna could serve as template in rna synthesis. since viral mrnas lack vpg (hewlett et al., 1976; nomoto et al., 1976; see above), every +strand rna that functions as template in rna synthesis should also lack vpg. available evidence, however, suggests that all rna templates involved in replication are terminated with vpg (nomoto et al., 1977b; petterson et al., 1977; wu et al., 1978; larsen et al., 1980). furthermore, rna replication occurs in a tight membranous environment (bienz et al., 1992). thus, it is unlikely that these genome-replicating membranous complexes also harbor viral polysomes (wimmer et al., 1993). indeed, crude membranous replication complexes that can replicate poliovirus rna, yet are free of ribosomes, can be isolated from infected cells (takegami et al., 1983; takeda et al., 1986; toyoda et al., 1987). an alternative explanation is that the observed cis effect operates only during the very first round of translation at the onset of infection. clearly, translation of an infecting genome will have to be somehow arrested to allow the template to switch from translation to transcription. it is possible that, once the switch has been made, replication can proceed independently of translation. this does not exclude the possibility that viral proteins, perhaps intermediates with a short half-life or short-lived protein complexes, must be continuously supplied to the rna synthesizing machinery. the question of the switch from translation to rna synthesis of infecting +stranded genomic rna has been the subject of much speculation.
the classical study of kolakofsky and weissmann (1971) on phage qβ replication solved the dilemma by showing that the phage replicase (a complex of four proteins) can repress translation of viral mrna. a similar model has been proposed for poliovirus by gamarnik and andino (1998): the formation of an rnp consisting of cloverleaf/3cdpro at the 5' end of the viral mrna inhibits further translation, thereby switching the template to replication. one problem with this model is that at the peak of poliovirus replication, translation and rna synthesis occur concomitantly in the presence of an excess of 3cdpro molecules (note that for each virus particle, 60 molecules of 3cdpro are synthesized; the ratio of viral +strand rna to unprocessed 3cdpro may be 1:100 through most of the replicative cycle). moreover, if the genome has to be translated for replication to occur, how can inhibition of translation promote rna synthesis? a very schematic representation of steps in genome replication is shown in figure 12.5 (wimmer et al., 1993). the possible rna structures involved in replication have been divided into three categories: (1) we have argued before that the cumulative evidence favors the "closed forms" for rf and ri but this view may not be shared by others (wimmer et al., 1993; xiang et al., 1997). since 3dpol is an "unwindase" (cho et al., 1993; see above), the scheme does not necessarily require a helicase. indeed, so far no picornaviral helicase has been identified, and purified 2c atpase has stubbornly refused to exhibit such activity (pfister and wimmer, 1999). briefly, vpg will be uridylylated at the 3'-terminal poly(a). vpg-pu(pu), in turn, will then prime synthesis of -strands (figure 12.5c). it is unlikely that multiple initiation of -strands on the same template (prior to completion of the first -strand) occurs, since an ri with multiple -strands (such as in figure 12.5a) has not been found in infected cells (bishop and koch, 1969).
it is even possible that initiation at the poly(a) tail of poliovirus rna occurs only once. completion of the -strand will thus yield rf (figure 12.5d), which we consider an intermediate in replication and not a byproduct (wimmer et al., 1993). one compelling argument in favor of this assumption is that in the rf the 5' end of +strands is in the close vicinity of the 3' end of -strands, a prerequisite first proposed by baltimore and his colleagues (andino et al., 1993; harris et al., 1994). destabilization of this end of the rna will lead to the formation of an rnp consisting either of cloverleaf/3cdpro/pcbp2 or cloverleaf/3cdpro/3ab which, in turn, will free the 3' end of the -strand for vpg-primed +strand synthesis to occur (figure 12.5e). multiple initiation at this end will lead to the multistranded ri (figure 12.5f), the nascent or full-length +strands being displaced during transcription by the 3dpol unwindase. initiation of +strands may be more efficient than initiation of -strands; hence the large excess of +strands over -strands in infected cells. note that a reconstituted replication system of purified viral and cellular components capable of synthesizing +strands from input +strands has not been achieved; thus many of the hypotheses put forward in this scheme have not yet been tested. gamarnik and andino (1996) have described a novel system to study poliovirus replication in xenopus oocytes by injecting poliovirus rna into these cells. however, virus will replicate only if a hela cell s10 extract is co-injected with the rna. interestingly, the authors have been able to separate the hela supporting activities (s10) into two factors, one necessary for poliovirus ires-driven translation, the other for poliovirus rna synthesis. this system offers an excellent opportunity to separate and characterize viral and cellular factors involved in virus replication.
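the two-step scheme discussed above (a -strand copied from the +strand starting at the 3'-terminal poly(a), then +strands copied back from that -strand) implies that the 5' end of every -strand is poly(u). the short sketch below illustrates only this base-pairing consequence. the function name and the toy sequence are assumptions for the example: the first six bases follow the virion 5' end quoted earlier (uuaaaa), whereas the internal filler and the length of the poly(a) tail are arbitrary.

```python
# Watson-Crick pairing for RNA, written in lowercase like the rest of the text
PAIR = {"a": "u", "u": "a", "g": "c", "c": "g"}


def complementary_strand(strand):
    """Return the complementary strand, written 5'->3'.

    The template is read 3'->5', so the copy is the reversed complement.
    """
    return "".join(PAIR[base] for base in reversed(strand.lower()))


# toy +strand: authentic-looking 5' end, arbitrary filler, poly(a) tail
plus_strand = "uuaaaa" + "gcaucg" + "a" * 10

minus_strand = complementary_strand(plus_strand)      # step 1: -strand
progeny_plus = complementary_strand(minus_strand)     # step 2: +strand

print("+strand :", plus_strand)
print("-strand :", minus_strand)
print("progeny :", progeny_plus)
```

copying the -strand once more restores the input +strand exactly, which is all the sketch is meant to show; it says nothing about priming, unwinding, or the proteins involved.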
viruses, lacking the genetic information as well as the tools to provide most of the essential components to replicate, are obligatory intracellular parasites. the complexity of viral proliferation (macromolecular synthesis of polypeptides and genomic nucleic acid, and encapsidation) has led to the textbook wisdom that viruses are obligatory intracellular parasites unable to proliferate outside living cells. however, poliovirus rna (obtained either from virions or by transcription with phage t7 rna polymerase from plasmid dna), when incubated in an extract of uninfected hela cells void of nuclei, mitochondria and cellular mrna, will direct translation, genome replication and genome encapsidation such that infectious particles are formed. these newly synthesized virions are indistinguishable from poliovirus isolated from tissue cultures. thus, a picornavirus (poliovirus) is the first virus that has been synthesized de novo in a cell-free extract of mammalian cells (molla et al., 1991). this experiment has nullified the notion that viruses can proliferate exclusively in living cells. moreover, the novel approach can be used to study individual steps of viral replication in the absence of cell-membrane barriers. several interesting observations regarding protein-protein interactions, the role of membranes, of cellular membranous components or soluble cellular factors, or of inhibitors of viral rna synthesis, have been published (barton and flanegan, 1993; molla et al., 1993c, 1994; barton et al., 1995; parsley et al., 1997; cuconati et al., 1998; towner et al., 1998). the use of the cell-free cellular extract for studies of poliovirus rna replication, however, is still in the early stages of exploitation. nevertheless, it has even been possible to achieve genetic recombination of poliovirus in cell-free hela extracts (duggal et al., 1997; tang et al., 1997; duggal and wimmer, 1999; see below).
in the course of transcription, all template-dependent nucleic acid polymerases make errors in incorporating nucleotides with roughly the same frequency (10^-3 to 10^-4). as is discussed in chapter 7, this phenomenon has profound biological consequences for rna viruses. because rna viruses have chosen not to develop mechanisms by which misincorporations of nucleotides can be recognized and corrected, the average number of "spontaneous" mutations per replication of the genome, referred to as error rate, is around 10^-4. the high error rate in the absence of mechanisms of proofreading and editing has several consequences. first, the average genome length of animal rna viruses is small (12 000 nt). notwithstanding the genome of the exceptional coronavirus (30 000 nt), rna viruses with genomes exceeding 150 kb (a size reached by dna viruses such as herpesviruses, poxviruses and iridoviruses) are inconceivable because of the high probability that each genome would carry multiple mutations after each round of synthesis. it should be noted that these considerations by no means imply that dna viruses with very small genomes do not exist. in fact, the animal virus with the smallest known genome is hepatitis b virus (3.2 kb). as to picornaviruses, their average genome length is 8000 nt (see also wimmer et al., 1993). second, rna viruses replicate near the threshold of error catastrophe (holland et al., 1990). that is, the artificial increase of misincorporation of nucleotides (e.g. by chemical mutagens) may lead to a rapid decline of the viability of the entire virus population. third, plaque-purified clones of rna viruses are not homogeneous but populations of many different, albeit very closely related genotypes; hence the term "quasispecies" (eigen, 1993). fourth, the genetic heterogeneity allows an rna virus to adapt rapidly to a changing environment. a simple example should demonstrate the ease with which a drug-resistant mutant of poliovirus can be isolated.
as mentioned, poliovirus rna replication is highly sensitive to the presence of 2 mm guanidine hydrochloride (gua hcl). after plating a stock of plaque-purified poliovirus on a monolayer of hela cells in the presence of 2 mm gua hcl, a few plaques will arise corresponding to resistant variants (gr) with mutations mapping to 2c atpase (pincus et al., 1986; tolskaya et al., 1994). in the case of the selection of gr poliovirus mutants, it should be noted that the resistant variants already existed in the population of the inoculating virus. if the virus inoculum had been entirely free of gr variants, no selection of gr mutants could have occurred since the drug inhibits rna synthesis; hence, there would have been no misincorporation of nucleotides to generate the gr mutations in 2c atpase. although it may be a trivial thing to repeat, it is important to remember that genetic variation by misincorporation of nucleotides (just as recombination) requires replication. no replication, no mutants. the genetic plasticity of genotypes and the dynamics of genetic variation can be studied conveniently when transcript rna, produced by transcription of cdna with t7 rna polymerase, is transfected on to hela cell monolayers and the corresponding plaque phenotype of progeny virus is analysed. in the case of wt virus, the plaques are, by convention, "large". if mutant rnas are analysed in plaque assays, one may observe only "small plaques" with a rare "large" plaque emerging on the plate. this rare large plaque may signal a reversion (either directly or through suppressor mutations) to a fast-replicating genotype. passage of the population of small and large plaque phenotype viruses (at multiplicities of infection of more than 5) will rapidly yield populations of only the faster-growing virus because the impaired genomes are eliminated by competition.
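the genome-size argument made earlier in this section can be made concrete with a small calculation. the sketch below assumes that the quoted error rate of about 10^-4 applies per nucleotide copied and that misincorporations at different positions are independent; both are simplifying assumptions of this example, as are the function names. under these assumptions the expected number of mutations per copied genome is the error rate times the genome length, and the fraction of error-free copies is (1 - error rate) raised to the genome length.

```python
def expected_mutations(error_rate, genome_length):
    """Mean number of misincorporations per copied genome."""
    return error_rate * genome_length


def fraction_error_free(error_rate, genome_length):
    """Probability that one copy carries no misincorporation at all,
    assuming independent errors at every position."""
    return (1.0 - error_rate) ** genome_length


ERROR_RATE = 1e-4  # assumed per-nucleotide value, taken from the text above

# genome lengths quoted in the text; the 150 kb rna genome is hypothetical
genomes = [
    ("picornavirus", 8_000),
    ("average animal rna virus", 12_000),
    ("coronavirus", 30_000),
    ("hypothetical 150 kb rna genome", 150_000),
]

for name, length in genomes:
    mean = expected_mutations(ERROR_RATE, length)
    clean = fraction_error_free(ERROR_RATE, length)
    print(f"{name:32s} {length:>7d} nt  "
          f"mean mutations/copy = {mean:5.1f}  error-free fraction = {clean:.2e}")
```

with these assumed numbers a picornavirus-sized genome is copied without error a substantial fraction of the time, whereas a 150 kb rna genome would almost never be, which is the sense in which such genomes are described above as inconceivable.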
an example of such displacement of slowly growing genomes by faster-growing revertants has been described by lu et al. (1995b), who analysed a hybrid poliovirus in which the cognate 2apro coding sequence was exchanged for that of coxsackie b4 virus. a special case of a genetic phenomenon is that of a "quasi-infectious" genome. this term was originally introduced by agol and his colleagues (gmyl et al., 1993) to describe the following phenomenon. genetically engineered poliovirus variant rna was transfected into hela cells. progeny virus was harvested, sometimes only after prolonged incubations of the transfected tissue cell cultures. analysis of the genotypes of progeny virus genomes (by rt/pcr) revealed only revertant or pseudorevertant rnas. none of the original mutant genotypes were detectable. this phenomenon can be explained if the original mutant genotype was able to replicate its rna, albeit only at levels too low for virus production or even for the development of cpe. nevertheless, the slowly replicating mutant genome allowed for mutation (misincorporations, deletions or insertions), eventually leading to fast-growing genotypes. by definition, the progeny of quasi-infectious genomes will not yield virus with the parental genotype. if a mutation (point mutation, linker insertion, etc.) engineered into the genome rna is lethal, the lesion may effect complete abrogation of genome replication. hence, reversion to viability cannot be expected. an interesting example of quasi-infectious versus lethal mutations in the poliovirus genome was described when mutations in vpg were studied (kuhn et al., 1988; reuer et al., 1990; cao and wimmer, 1995). as mentioned, vpg is linked to the genome via an o4-(5'-uridylyl)tyrosine (the tyrosine in position three of vpg). a mutation of tyrosine to phenylalanine (y3f) was originally described as being lethal (reuer et al., 1990). this conclusion made sense, since phenylalanine lacks an o4 hydroxyl group for phosphodiester formation.
however, cao and wimmer (1995) later observed that cells transfected with vpg(y3f) variant rna produced viable virus, albeit only at very low frequency and only after prolonged incubation of the cultures. all of the progeny genomes carried an f3y reversion. the possibility of contamination of the cultures with wt virus was excluded. the only explanation for this surprising result was that the vpg(y3f) variant was quasi-infectious, presumably in that the threonine residue in position four of vpg may have served as a (poor) surrogate acceptor for uridylylation and protein priming of rna synthesis (it should be noted that genome-linked terminal proteins are often attached to serine residues; salas, 1991). further analyses supported this hypothesis. a vpg(t4a) variant was found to be viable, expressing good growth kinetics. in contrast, vpg(y3f, t4a) variant rna never yielded progeny virus and was, therefore, considered unable to replicate its rna. this double mutation can then be considered to be lethal. genetic analysis of mutant genomes and their revertants has been an invaluable tool to study the structure and function of picornavirus genetic elements and picornavirus proteins (wimmer et al., 1993). genetic recombination of picornaviruses is the exchange of genetic elements between two viruses that may occur during replication in the same cell. recombination was discovered by hirst (1962) and used first by cooper and his colleagues (cooper, 1977) to map poliovirus genetic units; lingering skepticism about the phenomenon was dispelled through biochemical analyses of poliovirus recombinant proteins (romanova et al., 1980; tolskaya et al., 1983) or fmdv recombinant genomes (king et al., 1982). (for a detailed review of recombination, the reader is referred to wimmer et al., 1993.)
picornavirus recombinants have been detected because (1) they acquired genetic traits from the parental strains allowing them to proliferate under conditions restricting the growth of either parent and (2) they arose in excess over (replication-competent) revertants of the parental strains (cooper, 1977). restricting conditions for the selection of recombinants can include specific drugs, such as 2 mm guanidine hcl, monoclonal neutralizing antibodies (emini et al., 1984) or host-cell specificity (duggal et al., 1997). an elegant method to study poliovirus recombination under normal growth conditions (without selection) has been developed by jarvis and kirkegaard (1992). a wealth of experimental data has shed light on the most important steps in recombination. details of individual steps in recombination, however, remain to be elucidated. the current knowledge can be summarized as follows.
1. recombination is homologous and it occurs by copy choice; i.e. an incomplete (nascent) rna strand may switch template strands during genome replication. the probability of crossover depends strongly upon the degree of homology between the two recombining viral genomes.
2. genetic analyses have indicated that template switching occurs (predominantly?) during (-)strand synthesis (kirkegaard and baltimore, 1986).
3. recombination is precise: no deletions or insertions at sites of homologous recombination have been observed. this is true even if crossover occurred in the 154 nt long noncoding region (spacer region) between the ires and the initiating aug of poliovirus (jarvis and kirkegaard, 1992), even though the sequence of this "spacer" is not conserved amongst the three poliovirus serotypes (toyoda et al., 1984). indeed, deletions in the "spacer" could conceivably be tolerated in view of the observations by kuge and nomoto (1987) and others that the spacer can be partially or completely deleted without loss of viability.
4. template switching requires that the replication complex pauses, allowing a heterologous (invading) (+)strand to offer its service as template. an unsolved question is whether pausing and crossover are random (jarvis and kirkegaard, 1991) or non-random (romanova et al., 1986). the latter is in all probability true. agol and his colleagues have proposed that higher order structures formed on template rnas may favor pausing of rna synthesis and crossover (romanova et al., 1986). in addition, king (1988) suggested that there are preferred sites of recombination in poliovirus rna, and that crossover may be favored immediately after synthesis of two uridylate residues (uu) in the nascent strand. duggal and wimmer (1999) observed that crossover patterns changed significantly when recombination occurred at different temperatures. specifically, crossover between two genetically marked rna strands at 34°c occurred over a wide range of the genome with preference for sequences coding for structural proteins in the 5'-terminal half of the genome. in contrast, recombination in vivo at 37°c and 40°c yielded crossover patterns that had shifted dramatically to a region encoding nonstructural proteins (duggal and wimmer, 1999). preferential selection of recombinants at 37°c and 40°c was ruled out by analyses of the growth kinetics of the recombinants. the reason for the temperature effect is unknown. temperature-dependent stability of higher order rna structures seems possible.
recombination frequencies are calculated by dividing the yield of the recombinant virus by the sum of the yields of the parental viruses. for picornaviruses with linear genomes, the recombination frequency is proportional to the distance between the genetic markers used to determine recombination. as mentioned above, the degree of homology between the parental genomes strongly influences the probability of crossover.
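the frequency calculation described above can be sketched in a few lines of python. this is a minimal illustration and not a procedure taken from the cited studies: the function names and the titers in the usage lines are hypothetical, and the per-nucleotide estimate simply assumes the linear relationship between marker distance and recombination frequency stated in the text.

```python
def recombination_frequency(recombinant_yield, parental_yields):
    """Yield of recombinant virus divided by the summed yields of the parents."""
    total_parental = sum(parental_yields)
    if total_parental <= 0:
        raise ValueError("parental yields must sum to a positive value")
    return recombinant_yield / total_parental


def frequency_per_nucleotide(frequency, marker_distance_nt):
    """Per-nucleotide estimate, ASSUMING frequency scales linearly with distance."""
    if marker_distance_nt <= 0:
        raise ValueError("marker distance must be positive")
    return frequency / marker_distance_nt


# Hypothetical titers (pfu/ml), chosen only to illustrate the arithmetic.
freq = recombination_frequency(2e5, [5e7, 5e7])
rate = frequency_per_nucleotide(freq, 600)  # markers 600 nt apart
print(freq, rate)
```

the linear per-nucleotide estimate is only meaningful for closely spaced markers; it ignores multiple crossovers and any site preference for crossover.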
the frequency of recombination between homologous genomes is remarkably high (2 x 10^-3 between markers only 600 nt apart; jarvis and kirkegaard, 1992). it has been estimated that 10-20% of the homologous viral genomes may undergo genetic recombination within a single growth cycle (king, 1988). this would mean an unprecedented genetic shuffling between genotypes of which the fittest retain the "wt" phenotype. experimental results that support a high frequency of recombination between sibling strands were obtained with engineered, quasi-infectious poliovirus genomes carrying two adjacent vpg sequences (cao and wimmer, 1996). after transfection, all progeny viruses had lost the downstream vpg, most probably by homologous recombination during (-)strand synthesis. on studying recombination using genetically marked genomes, 90% of the recombination events occurred between sibling strands. finally, jarvis and kirkegaard (1992) have demonstrated that the frequency of recombination increases with the progression of the infectious cycle; i.e. the higher the concentration of intracellular viral rna, the higher the probability of recombination. recombination in a cell-free extract of uninfected hela cells (molla et al., 1991) has recently been reported by two groups. recombination of parental viruses in the cell-free medium was detected either by rt/pcr in the absence of selection (tang et al., 1997), or by plating the progeny virus under conditions that were restrictive for either parent (duggal et al., 1997). the recombination frequencies were found to be roughly the same as those in vivo (in tissue culture cells). the crossover pattern of recombination in vitro and in vivo at 34°c was the same, lending credibility to the cell-free system as reflecting an in-vivo environment (duggal and wimmer, 1999).
the in-vitro approach has the potential to decipher some basic steps in recombination, as, for example, the invasion of heterologous template strands into the replication complex. as mentioned before, the pattern of recombination changed significantly when recombination was carried out in vivo at 37°c or 40°c. unfortunately, this effect at higher temperature cannot be analysed in vitro because cell-free synthesis of poliovirus is highly inefficient or completely absent at temperatures above 36°c (molla et al., 1993c).
picornavirus genomes are extremely "plastic" in that any change in their genotype can lead to unexpected nucleotide rearrangements. this has been observed, for example, in analyses of ires elements, where deletions or insertions lead to unexpected new genotypes with excellent growth properties (see, for example, dildine and semler, 1989; gmyl et al., 1993; pilipenko et al., 1992; alexander et al., 1994; charini et al., 1994; cao and wimmer, 1995). genetic plasticity is particularly apparent when poliovirus genomes are constructed harboring foreign sequences. the analysis of genetic rearrangements and deletions is especially important when picornavirus genomes are to be used as vectors for the delivery of foreign genes. generally, polioviruses respond to the insertion of foreign sequences by rapidly deleting these sequences either partially or completely. the driving force behind the selection of deletion variants may be: (1) excessive length of the genome, restricting encapsidation; (2) interference with efficient processing of the polyprotein; (3) interference with initiation of translation; (4) alteration of rna structures necessary for replication; or others. there are examples, however, where poliovirus, or deletion mutants thereof, appear to tolerate a foreign sequence inserted into the genome.
an interesting approach to studying ires elements was the insertion of the emcv ires into the orf of poliovirus, thereby making unnecessary the primary cleavage between p1*p2 normally catalyzed by 2a^pro (figure 12.6b; molla et al., 1992). the resulting rna transcripts proved highly infectious (molla et al., 1992). although this dicistronic virus expressed a small plaque phenotype, neither the plaque size nor the genotype surrounding the insertion changed over six passages. these observations suggested that the insertion was stable, at least under the conditions studied. the emcv ires has no apparent sequence homology with any sequence of the poliovirus genome, the poliovirus ires included. this eliminated the possibility that the emcv ires was rapidly removed by homologous recombination. perhaps illegitimate recombination (see below) that would delete the ires without debilitating the virus was a very rare event and not apparent in progeny virus. it should be noted in parenthesis that the viability of the dicistronic virus shown in figure 12.6b proved for the first time the function of the emcv ires as a true internal ribosomal entry site (molla et al., 1992). on the basis of these experiments, a specific version of novel expression vectors was constructed (figure 12.6c,d) that included, in addition to the foreign ires, a foreign orf (lu et al., 1995a). none of these vectors was genetically stable over extended numbers of passages, and in some cases the deletion occurred during first passage (lu et al., 1995a).
figure 12.6 (caption, beginning missing): the foreign gene (dark stippled box) was inserted upstream of 2a^pro, which now delivers the foreign protein by a cis cleavage. d. dicistronic poliovirus generated by inserting a foreign gene and the emcv ires into the 5'ntr. in this case the foreign gene is synthesized independently from the polyprotein. e. generation of an expression vector by fusing the coding sequence of a foreign gene to the n-terminus of the poliovirus polyprotein. the foreign gene product is liberated through trans cleavage, by either 3c^pro/3cd^pro or 2a^pro. f. expression vector based on mengovirus, a cardiovirus that carries a small leader sequence preceding the p1 region of the polyprotein (see figure 12.9). in this case, the foreign gene is inserted into the l coding sequence. note that the organization of the genomes in e and f is identical. g. encapsidation-incompetent poliovirus expression vector in which a portion of the p1 coding sequence has been replaced by a foreign gene. this genome can be encapsidated in trans but, by itself, it can only go through one cellular cycle of replication.
nevertheless, the insertion of a foreign orf between two ireses in the 5'ntr, as shown in figure 12.6d, yielded a replicating poliovirus vector that efficiently expressed the cat gene over several passages. no deletion was apparent after the first passage. remarkably, the genome of this construct is 17% larger than that of the wt genome, an observation indicating that the capsid of naturally occurring polioviruses is not "full". however, attempts to encapsidate and express the larger luciferase gene instead of the cat gene in the context of the dicistronic virus failed. the luciferase activity was clearly detectable in cells transfected with the appropriate dicistronic transcript, but the genome harboring luciferase was not encapsidated. apparently, an increase of genome length by 31% (luciferase gene plus emcv ires) was not tolerable for encapsidation. a different strategy to convert poliovirus to an expression vector was the fusion of a foreign orf directly to the poliovirus polyprotein (figure 12.6e; andino et al., 1994). in these experiments, the strategy of altmeyer et al. (1994) was mirrored, which made use of the genetic make-up of cardioviruses (mengovirus or emcv).
specifically, the cardiovirus polyprotein is preceded by a small leader protein (67 aa) that is cleaved from the capsid region p1 by the viral 3c^pro proteinase, thereby allowing maturation and encapsidation of the virion. altmeyer et al. (1994) inserted a foreign gene into the leader sequence of the mengovirus genome (figure 12.6f) and expressed the product of this fusion protein over several passages in tissue culture (altmeyer et al., 1994, 1995). andino et al. (1994) generated a similar "leader" protein in front of the poliovirus polyprotein. in this case, however, it was necessary to engineer a novel 3c/3cd^pro cleavage site between the foreign orf (the new "leader") and the viral polyprotein such that the foreign polypeptide can be cleaved from the poliovirus capsid precursor (figure 12.6e). although these poliovirus constructs were originally claimed to express excellent growth properties and, more importantly, were reported to be genetically highly stable (andino et al., 1994), the poliovirus-based vectors proved, in fact, impaired in replication and prone to rapid deletions, at least if the insert was more than 500 nt in size (mueller and wimmer, 1998, and references therein; see below). it should be noted that the cardiovirus-based vectors also suffered from loss of the inserts of a foreign gene upon repeated passage, an observation suggesting that even cardioviruses do not tolerate an extended leader protein for the purpose of gene therapy (altmeyer et al., 1994, 1995). in a third strategy for the construction of picornavirus vectors, the p1 capsid region of the picornavirus genome is partially replaced with a foreign orf (figure 12.6g), yielding proliferation-incompetent replicons that appear to be genetically quite stable (see, for example, porter et al., 1993).
for a possible application as vectors in gene therapy, the replicons are trans-encapsidated via a vaccinia virus-based p1 expression vector, with relatively low yields of proliferation-incompetent virions. the apparent genetic stability of these replicons may be due to the fact that the rnas are similar in size to the wt genome, and that the naturally occurring cis cleavage (between p1*p2) catalyzed by 2a^pro is highly efficient, placing no restriction on this step of polyprotein processing. in addition, the rapid selection of faster growing variants that have lost the foreign gene is unlikely, since the trans-encapsidated replicons can only proceed through a one-step infectious cycle. this is very different from the selection pressure in proliferation-competent vectors, which engage in second-round infections.
what is the mechanism by which poliovirus may eliminate foreign sequences? homologous recombination cannot function because there is not enough sequence homology to engage in crossover. pilipenko et al. (1995) have nevertheless proposed that short sequences may serve as parting and anchoring sites for template switching in illegitimate (non-homologous) recombination. an alternative mechanism is "loop-out" deletion, in which the nascent strand skips endogenous sequences, jumping to an upstream sequence that serves as anchoring sequence. given the high frequency with which recombination occurs among sibling strands, a crossover mechanism may be favored, but experiments that decide between these two mechanisms are lacking. in any case, a detailed study of genetic variations of polyprotein fusion vectors (figure 12.6d) strongly supports the model of parting and anchoring sites for template switching or loop-out deletion (mueller and wimmer, 1997; see below).
briefly, when expression vectors (figure 12.6e) carrying a gag gene (encoding p17-p24; 1161 nt) of human immunodeficiency virus fused to the n-terminus of the poliovirus polyprotein (andino et al., 1994; mueller and wimmer, 1998) were analysed after transfection into hela cells, the genomes were found to be not only severely impaired in viral replication but also genetically unstable (mueller and wimmer, 1997). upon replication, the inserted sequences were rapidly deleted as early as the first growth cycle in hela cells. interestingly, the vector viruses did not readily revert to wt sequences but rather retained some of the insert plus the artificial 3c/3cd^pro cleavage site (to allow processing at the n-terminus of the polyprotein). thus, variants of different genotypes that replicated nearly as well as wt poliovirus had followed an evolutionary pathway towards the genetic organization of cardioviruses (mueller and wimmer, 1998). that is, the poliovirus polyprotein of these variants was preceded by gag-derived "leader" proteins of different but distinct sizes (predominantly between 20 and 50 aa long), the most prominent leader size reflecting the length of that in cardioviruses (67 aa). in the immediate vicinity of the deletion borders of several isolates, short direct sequence repeats were observed that are likely to allow alignment of rna strands for non-homologous (illegitimate) recombination during (-)strand synthesis (figure 12.7; mueller and wimmer, 1998). interestingly, the selection of the leader size occurred during the very first rounds of replication of the transfected rna; in most cases, a sequential shortening of the leader sequence was not observed.
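the role of short direct repeats at deletion borders can be illustrated with a small python sketch. it is a toy model under stated assumptions, not the analysis used by mueller and wimmer: the function names and the 16-nt example sequence are hypothetical, coordinates are given on the (+)strand for simplicity even though the event is thought to occur during (-)strand synthesis, and the same product is expected whether the nascent strand re-anneals on the same template (loop-out) or on a sibling strand (template switching).

```python
def find_direct_repeats(seq, k):
    """Return all (i, j) with i < j where the k-mer at i is repeated at j."""
    seen = {}    # k-mer -> list of earlier start positions
    pairs = []
    for j in range(len(seq) - k + 1):
        kmer = seq[j:j + k]
        for i in seen.get(kmer, []):
            pairs.append((i, j))
        seen.setdefault(kmer, []).append(j)
    return pairs


def delete_between_repeats(seq, i, j):
    """Remove seq[i:j]: one repeat copy plus the intervening sequence is lost,
    leaving a single copy of the repeat at the deletion border."""
    return seq[:i] + seq[j:]


# Hypothetical toy sequence with one 4-nt direct repeat (GCUA).
toy = "AAGCUACCCCGCUAUU"
repeats = find_direct_repeats(toy, 4)
products = [delete_between_repeats(toy, i, j) for i, j in repeats]
print(repeats, products)
```

in a real analysis the repeat length and the allowed distance between repeat copies would have to be chosen from sequence data; the exhaustive pair listing used here is only practical for short inserts.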
defective interfering particles
an interesting class of naturally occurring deletion mutants of picornaviruses is that of defective interfering particles (di particles), which can be (rarely) discovered in laboratory stocks of virus or generated (with difficulty) by passage of virus at high multiplicities (reviewed in wimmer et al., 1993). all naturally occurring di particles carry deletions in the p1 capsid precursor region (cole et al., 1971; cole and baltimore, 1973; lundquist et al., 1979; nomoto et al., 1979; kajigaya et al., 1985; kuge et al., 1986). as mentioned before, nomoto and his colleagues have found that the deletions in all di particles are in-frame (kuge et al., 1986). since genetically engineered di genomes (replicons) with an out-of-frame deletion in the p1 region are unable to replicate, nomoto and his colleagues (kuge et al., 1986; hagino-yamagishi and nomoto, 1989) correctly concluded that poliovirus rna replication requires, at some stage of the replicative cycle, translation of the replicating rna (cis requirement of translation), and that di particles with an out-of-frame deletion cannot be complemented in trans. novak and kirkegaard (1994) confirmed this hypothesis in that replicons with translation termination codons downstream of the p1 coding region could not be rescued in trans. for hypotheses to explain this phenomenon, see above.
chetverin et al. (1997) have recently made the startling observation that certain rna fragments can join to one another via a molecular pathway determined by the intrinsic chemical properties of the rna molecules. the fragments that formed chemically stable duplexes were selected by the replicase of phage qβ: only those dimers that had acquired signals from two different rna molecules were able to replicate. gmyl et al. (1999) have now reported that viable recombinants could also be generated from non-replicating and non-translatable segments of the poliovirus genome.
these fragments by themselves were unable to generate the viral rna-dependent rna polymerase necessary for a replication-dependent recombination event. the "crossovers" were targeted to the highly variable segment of 154 nucleotides located within the 5'ntr upstream of the initiating aug codon for the polyprotein. a great number of recombinants have been obtained by transfection of mixtures of rna fragments. analyses of viruses that evolved after the mixture of the rna species strongly suggested that the connection between two fragments was the result of chemical reactions between the fragments rather than of template-switching (gmyl et al., 1999). the mechanism by which the chemical linkage between two fragments is formed is obscure but it could involve structures reminiscent of ribozyme-like activities in the viral rnas. nevertheless, this observation could have profound implications for the generation of novel genotypes in nature (gmyl et al., 1999).
figure 12.7 two models of illegitimate recombination during (-)rna synthesis, as observed with the expression vector shown in figure 12.6e (mueller and wimmer, 1998). both models require a partial dissociation of the nascent (-)rna from the template (+)rna, caused presumably by pausing of the rna polymerase. the free 3' end of the nascent (-)rna can re-anneal to a short complementary sequence further upstream on the same template strand, thereby looping out the intervening sequence (a), or it can re-anneal to the same complementary sequence but on a sibling (+)strand, and complete synthesis on this second template (b; strand switching). in both cases, the resulting strands would have excised the same sequence and could now, in turn, give rise to truncated (+)rna genomes. note that this deletion event can occur even during the first round of replication of the expression vector, leading to partial or complete deletion of the foreign coding sequences. reproduced, with permission, from mueller and wimmer, 1998.
genetic complementation is the compensatory action of gene products of two homologous genetic systems to alleviate defects of mutant genes. genetic complementation has been firmly established in picornavirus replication. however, because of the complexity of diverse function(s) of precursor proteins and their cleavage products, it has not been possible to define complementation groups (wimmer et al., 1993). complementation groups are indicative of genetic elements that can function independently, and they have been the basis of the definition of cistrons (benzer, 1957). a cistron, therefore, may be equated with a gene, i.e. a functional unit of genetic material specifying a single protein. on the basis of these definitions, the picornavirus genome, encoding only the polyprotein whose products function in many cases in overlapping or even opposing fashions, cannot be called multicistronic. in general genetics, mutations affecting the same polypeptide can occasionally complement each other, a phenomenon referred to as intracistronic complementation (schlesinger and levinthal, 1963). based on these considerations, we have suggested that the picornavirus genome be considered "monocistronic" (wimmer et al., 1993). it follows that the genome encodes only one gene product, the polyprotein. the polyprotein, in turn, contains multiple genetic units whose products may or may not be capable of intracistronic complementation. if this definition is accepted, one should avoid referring to individual coding regions of the picornavirus genome as "genes". thus, there would be no "3d^pol gene".
this convention makes good sense if one considers that a "gene for 3d^pol" is for the most part also the gene for 3cd^pro, a proteinase with properties unrelated to the polymerase 3d^pol. it should be noted that theiler's virus is an exception to the monocistronic nature of picornaviruses in that it encodes a small protein in a separate reading frame, mapping towards the n-terminus of the polyprotein. it is interesting to consider that there is no absolute requirement that picornaviruses must exist as monocistronic (single-polyprotein-producing) entities. for example, the insertion of a second ires into the genome can yield a viable dicistronic virus (figure 12.6b), an entity artificially generated by molla et al. (1992, 1993b). apparently, during the evolution of picornaviruses, the elimination of genetic elements regulating the expression of different picornavirus proteins was favored over retaining them. similarly, there was no pressure to generate such regulatory sequences and insert them into the genome. in other words, proteolytic processing of a single polyprotein evolved not only to be highly efficient but also as a means to regulate the temporal appearance of viral proteins (e.g. precursor proteins versus end-products of proteolytic cleavage). in contrast, in prokaryotic rna phages or in (-)strand rna viruses, the expression of proteins is regulated by sequence elements located between different cistrons. interestingly, the dicistronic poliovirus depicted in figure 12.6b resembles the genetic composition of cowpea mosaic virus (cpmv), a plant virus (hellen et al., 1989; molla et al., 1992). indeed, cpmv and the dicistronic poliovirus shown in figure 12.6b have similar gene order and amino acid sequences, the main difference being that the genome of cpmv is bipartite.
that is, rather than inserting a sequence such as an ires into the genome, cpmv preferred to divide the genome into two portions, one coding for the capsid proteins, the other for the replication proteins. such a genetic arrangement, which requires two particles to initiate a complete infectious cycle, may be suitable for a plant virus (where the yield of virus per host can be extremely high and, equally importantly, the local concentration of host organisms can be high) but it would be highly disadvantageous for an animal virus. the fascinating topic of the structure and evolution of polyproteins within the rna-like virus superfamily is discussed later.
evidence for genetic complementation in vivo has existed for decades, the best-known involving guanidine-generated mutants. complementation of guanidine mutants seemed unidirectional (wimmer et al., 1993). bernstein et al. (1986) provided the first conclusive evidence for symmetric complementation, using mutants that were generated in 2a^pro and 3a. these authors, however, also made the unexpected observation that mutants mapping to 2b or 3d^pol (which they had also generated by genetic engineering) could not be complemented (bernstein et al., 1986). on the other hand, charini et al. (1991) clearly showed that mutations in 3d^pol mapping to a different region of the coding sequence could be rescued in trans. this example and many others (wimmer et al., 1993) support the notion that the polyprotein is a single genetic unit that does not consist of non-overlapping genes whose functions can be separated by complementation grouping. a special case of complementation was tested using dicistronic polioviruses. briefly, cao and wimmer (1995) constructed a virus with a genotype shown in figure 12.6b, the extra cistron being the coding region for poliovirus 3ab. the dicistronic construct yielded a virus expressing a small plaque phenotype but it was genetically unstable, losing its inserts after several passages.
nevertheless, if the lethal mutation vpg(y3f, t4a) was engineered into the second cistron of the dicistronic virus (see above), the first cistron (3ab) could rescue the genome, albeit inefficiently. a much more efficient rescue of lesions in the 3ab coding sequence was reported by towner et al. (1998), who used the cell-free system of poliovirus replication developed by molla et al. (1991). apparently, the supply in trans of p3 polypeptides in vitro is more efficient than in vivo, a phenomenon that remains as yet unexplained. interestingly, it appeared as if a mutation in 3ab could be rescued only if the complementing polypeptide was a precursor of 3ab, preferably p3 (see figure 12.4). perhaps efficient complex formation between 3ab and 3cd^pro (see above) in this system depends on the cleavage of the p3 precursor in situ.
picornaviridae combines species that infect animals with exceedingly varied pathogenic features affecting almost every organ system (table 12.1). in the following, we will concentrate only on human picornavirus infections, which range in severity from protean symptoms associated with the common cold (e.g. rhinoviruses), mild gastroenteritis (e.g. echoviruses) and hepatitis (e.g. hepatitis a virus) to fatal cns manifestations (e.g. pv, enterovirus 71) or lethal myocarditis (coxsackievirus b group). despite the enormous variety in organ tropism observed with different species of the picornavirus group, pathogenic features of every single species are recognized in the form of highly distinct disease syndromes (exceptionally, coxsackieviruses can cause fatal disseminated infections in neonates with widespread viral propagation in multiple organs).
we will define pathogenic properties of picornaviruses as a combination of different viral traits: (1) those that affect tropism (determining the target cell type of, and influencing spread in, the host); (2) those that affect virulence (determining kinetics of particle propagation); and (3) those determining the progression of a disease syndrome ("pathogenicity proper", the propensity to cause clinical symptoms). there is a fourth parameter, which is strictly related to a condition of the host. an example is injury-provoked ("provocation") poliomyelitis, which will also be discussed below. surprisingly, the disparity in pathogenic properties may be contrasted with a high degree of sequence conservation among certain groups of picornaviruses. this is most evident with the cluster c enteroviruses (figure 12.1b). for example, on the basis of sequences of 3d^pol, poliovirus serotype 2 (lansing) (pv2(l)) shares more than 90% sequence homology with its close relative coxsackievirus a24 (cav24). indeed, their sequence similarity exceeds that between pv2(l) and pv1 (mahoney) or pv3 (leon) (figure 12.1b). yet, whereas all pv serotypes are associated with poliomyelitis, a severe and frequently fatal infection of the cns, cav24 causes mild upper respiratory tract infections only. since minor sequence variations of picornaviruses can account for drastically different disease syndromes, it may be assumed that the pathogenic phenotype of picornaviruses is encrypted within a few crucial genetic determinants. these basic determinants of pathogenic features appear to be dynamic, leading occasionally to the emergence of novel virus variants causing clinical syndromes not previously observed with their ancestors. this was the case when widespread epidemics of acute hemorrhagic conjunctivitis ravaged africa and the pacific rim (yin-murphy, 1973) and quickly expanded worldwide.
two picornaviruses were associated with this previously unknown clinical syndrome, coxsackievirus a24 variant (cav24v) and enterovirus 70 (ev70; table 12.1). the former evolved from its ancestral cav24, causing mild upper respiratory tract infections, whereas ev70 was primarily recognized for its association with a poliomyelitis-like neurological disorder (melnick et al., 1974). the deviation of tropism toward ocular tissues resulting in acute hemorrhagic conjunctivitis suggests a switch or an expansion in receptor specificity. this hypothesis, however, awaits confirmation, since the cellular receptor(s) of ev70 is unknown. the circumstances and conditions that may favor a switch in host cell tropism with resulting changes in the pathogenic phenotype are unknown. a detailed discussion of our current view of the evolution of enteroviruses, however, is presented in the following section on evolution. this is particularly relevant in the context of the imminent global eradication of poliovirus. it is known that coxsackieviruses a7 and a9 (cav7, cav9) as well as enteroviruses 70 and 71 (ev71) occasionally cause a clinical syndrome with striking resemblance to poliomyelitis (melnick et al., 1974). occurrence of poliomyelitis caused by these virus species has only rarely been reported in epidemic proportions (voroshilova and chumakov, 1959; melnick et al., 1980); generally, cases occur as isolated incidents. fortunately, preliminary evidence (da silva et al., 1996) suggests that to date no surge in non-pv-caused poliomyelitis has occurred in response to the eradication of poliovirus in latin america. however, the time elapsed since the eradication of wt polioviruses in the western hemisphere is too short in terms of evolution to conclude that the incidence of non-pv-caused poliomyelitis and poliovirus eradication are unrelated (see section on evolution).
the observation of diverse specific clinical syndromes caused by closely related picornaviruses (particularly enteroviruses) has sparked interest amongst virologists in identifying those factors that may determine the clinical outcome of picornaviral infections. generally, signals for pathogenic phenotypes can be found in all parts of the viral genome. however, factors that determine cell and tissue tropism are not necessarily the same as factors determining virulence or attenuation. for example, the capsid mutations of the live attenuated strains of pv (the sabin strains) have an attenuating effect without altering the tropism of the sabin strains for the prime target of pv: spinal anterior horn motor neurons. a different effect of capsid proteins on an extended host tropism has been reported for several poliovirus type 2 strains, e.g. pv2(l). a small segment in capsid protein vp1 of pv2(l) (the b-c loop) has been identified as carrying determinants of host range extension from primates to rodents (murray et al., 1988) . however, whereas pv2(l) infection causes poliomyelitis in primates and in cd155 tg mice, normal mice developed histopathology indicative of panencephalomyelitis, which was radically distinct from poliomyelitis observed in primates and cd155 tg mice (gromeier et al., 1995) . this observation suggests pv2(l) tropism toward a cell type in mice that is not targeted in primates. it is likely that the mouse-adapted pv2(l) acquired additional receptor specificity but the nature of the receptor for pv2(l) in normal mice is unknown. it should be noted that although pv2(l) causes disease in mice after intracerebral injection, cultivated mouse l cells cannot be infected with this strain (gromeier et al., 1995) . the fact that single determinants of pathogenicity (e.g. the pv capsid) can carry signals that influence either tropism (e.g. pv2(l)) and/or virulence (sabin strains of poliovirus) indicates different dimensions of picornaviral pathogenesis. 
taking multiple determinants of tissue tropism and virulence (shared between the capsid, non-structural viral proteins and non-coding sequences) into account, the enormous complexity of the molecular basis of picornaviral pathogenesis comes into perspective. capsid protein structure determines the interaction of a virus with its cellular receptor. as pointed out for the poliovirus sabin strains and pv2(l), small differences in the structure of viral capsids are critical for cell and organ tropism as they ultimately determine the pathognomonic features of the resulting infection. similarly, the capsid was determined to harbor sequences critical for cardiotropism (tracy et al., 1995; cameron-wilson et al., 1998) as well as diabetogenicity (kang et al., 1994) of coxsackie b viruses (cbv). moreover, diabetogenicity of emcv in mice mapped to the capsid (jun et al., 1998). as pointed out, changes in the capsid may affect pathogenesis through differences in the interaction between virus and receptor. apart from direct receptor switching (or extending receptor specificity to more than one cell surface molecule), virus capsid alterations may affect the kinetics of virus/receptor binding or particle stability. this would influence the virulence of that virus without a concurrent change in host cell tropism. reduced particle integrity has been proposed to participate in the attenuation phenotype of the sabin strains of poliovirus (filman et al., 1989). however, theories linking capsid mutations within the sabin strains with structural elements important for protomer cohesion and capsid integrity remain inconclusive. in contrast to a likely relationship between capsid structure and pathogenic phenotype, the role of non-structural viral gene products in the determination of disease has been less obvious. non-structural viral proteins are cell-internal and, hence, do not influence tropism in a strict sense.
cell-type-specific restrictions in viral replication regulated by non-coding regions of the viral genome or by non-structural proteins will not change the spectrum of target cells infected but may critically influence virulence. viruses normally evolve to adapt to host cells offering adequate portals of entry (receptors), thereby exposing the viral particles to an intracellular milieu supportive of particle propagation. thus, favorable cell-internal conditions for viral replication would ideally be matched by a suitable viral receptor to avoid virus entry into cells that do not permit replication. for most viruses invading a host organism, the match is not perfect and, hence, the viruses are restricted to replication in fewer cells or organs than the distribution of receptor molecules would suggest. cell-internal determinants mapping to the viral genome of various picornaviruses have been suggested to influence virulence. these can be divided into loci mapping to viral proteins or to non-coding regions (e.g. the 5'ntr). for example, mutations within the coding region for the rna-dependent rna polymerase 3dpol of the pv1 sabin vaccine have been implicated in the attenuation phenotype (toyoda et al., 1987; tardy-panit et al., 1993). it was found that these mutations contributed to the ts phenotype of this sabin vaccine strain (toyoda et al., 1987). mutations in the p2 region of hepatitis a virus have been correlated with an attenuated phenotype of hav (raychaudhuri et al., 1998). in many instances the genetic loci of pathogenesis mapping to viral non-structural proteins have been identified through sequence comparison. this approach, of course, did not reveal mechanisms to account for reduced virulence. the non-coding genetic elements of picornaviruses have also been shown to carry signals determining virulence (evans et al., 1985; duke et al., 1990; gromeier et al., 1996; tracy et al., 1996).
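the sequence-comparison approach mentioned above can be illustrated with a minimal sketch. it assumes two already aligned sequences of equal length and simply lists the positions at which they differ; the function name `point_mutations` and the short toy sequences are hypothetical and are not taken from any real viral genome.

```python
from typing import List, Tuple


def point_mutations(reference: str, variant: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference residue, variant residue) for each mismatch.

    Assumes the two sequences are already aligned and of equal length;
    insertions and deletions are not handled in this sketch.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to equal length")
    return [
        (position, ref_residue, var_residue)
        for position, (ref_residue, var_residue) in enumerate(zip(reference, variant), start=1)
        if ref_residue != var_residue
    ]


# Hypothetical toy sequences for illustration only (not real viral sequences).
wild_type = "AUGGCUUCA"
attenuated = "AUGACUUCG"

for position, ref_residue, var_residue in point_mutations(wild_type, attenuated):
    print(f"position {position}: {ref_residue} -> {var_residue}")
```

as the text notes, a comparison of this kind only locates candidate loci; it does not by itself reveal the mechanism by which a substitution alters virulence.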
as has been discussed in the section on genetics, the 5'ntr harbors the internal ribosomal entry site (ires) that, on the basis of the early observation by evans et al. (1985) with the sabin type 3 vaccine strain, has been identified as a major determinant of virulence for a number of picornaviruses. sequence comparison of the sabin strains of poliovirus with their wt progenitors revealed point mutations within a confined region of the 5'ntr in all three serotypes, known as domain v (figure 12 .3; reviewed in wimmer et al., 1993) . in analyses of viral strains recovered from patients who acquired paralytic polio after vaccination, a point mutation at position 472 (direct reversion in domain v) of the ires of pv3(sabin) was proposed to contribute to the neurovirulence of the isolate (evans et al., 1985) . recent analyses led to a different hypothesis stressing a co-operative attenuating effect of capsid mutations with mutations in the ires and other locations in the sabin vaccine genomes (mcgoldrick et al., 1995) . the mechanism responsible for ires-mediated attenuation of neurovirulence remains obscure. analyses of cell-type-specific growth restrictions in cell lines of neuronal origin and biochemical studies of cell-type-specific ires function suggested impairment of initiation of translation in a cell-type-specific manner (haller et al., 1996; agol et al., 1989; la monica and racaniello, 1989) . following the example of poliovirus, ires elements or surrounding sequences of a large number of picornaviruses were found to contain genetic markers with a role in the determination of a pathogenic phenotype. this was most impressively demonstrated by the drastic attenuating effect of a deletion within the poly(c) tract of the 5'ntr of emcv (duke et al., 1990) . how do the multiple mechanisms alluded to interlace to produce a specific picornavirus disease syndrome? 
the intricacies of dual cell-external and cell-internal determinants of viral pathogenic features are best illustrated using the example of poliovirus. this most thoroughly studied prototype picornavirus is characterized by pathogenic properties of a specificity atypical of viral pathogens of the cns. poliovirus host range is limited to primates. within the primate organism poliovirus replicates within an unknown site in the gastrointestinal tract and within associated lymphatic structures, leading to viremia (bodian, 1972). at this stage, the virus causes hardly any disease symptoms. however, viremia may, in only about 1% of infections, lead to cns invasion, presumably through passive passage of the blood-brain barrier (yang et al., 1997). on muscle injury, the virus may also reach the cns via retrograde axonal transport (wimmer, 1998, 1999). within the cns, poliovirus uniquely targets spinal cord anterior horn motor neurons. lytic destruction of anterior horn motor neurons results in flaccid paralysis, the hallmark clinical sign of paralytic poliomyelitis (bodian, 1972). whereas efficient poliovirus proliferation occurring in the human gastrointestinal tract produces few or no symptoms, this virus's pathological potential is expressed in a small and relatively inaccessible subpopulation of neurons in the cns. this peculiar restriction has been the subject of research interest for many years. available evidence clearly suggests that the restriction of the host range of poliovirus to primates is determined predominantly by the receptor. in humans, this receptor is the immunoglobulin superfamily molecule cd155 (mendelsohn et al., 1989), and two proteins closely related to cd155 function as poliovirus receptors in simians (koike et al., 1990). clearly, the observed organ and cell tropism are co-determined by the virus's dependence on the cellular receptor.
we believe that, at least in part, the expression of cd155 must also determine the highly restrictive cell tropism of poliovirus within the cns, because mice transgenic for the cd155 gene develop a neurological condition with pathologic and clinical features identical to those observed in primates (ren et al., 1990; koike et al., 1991; gromeier et al., 1996). further support for this hypothesis comes from studies of transcriptional control (solecki et al., 1999) and developmental expression of the cd155 gene (gromeier et al., 1999b). it has been reported that cd155 expression is restricted to structures in close anatomical and functional relationship with spinal cord anterior horn neurons during embryonic development of the cns (gromeier et al., 1999b). it is thus likely that the restrictive expression pattern of cd155 may indeed direct poliovirus tropism toward a specific cellular compartment of the cns. analyses of pathogenesis related to genetic determinants mapping to the capsid proteins, ires, or 3dpol have recently been extended using poliovirus hybrid viruses. specifically, polioviruses have been constructed in which the cognate ires was replaced by that of other picornaviruses (gromeier et al., 1996). by exchanging the cognate ires element of pv for that of other picornavirus species, it could be shown that neuropathogenicity of pv can be eliminated without affecting growth properties in non-neuronal cell types normally susceptible to poliovirus (gromeier et al., 1996). thus, it was determined that the ires of rhinovirus type 2, a virus species never associated with neurological disease, confers the attenuation phenotype to poliovirus. significantly, this chimeric virus, called pv1(ripo), did not carry any attenuating mutations in the poliovirus-specific sequence of its genome. the neuropathogenic potential of a picornavirus ires cannot be predicted but it is innate, of course, in its sequence.
certainly, ires elements of enteroviruses known to cause poliomyelitis (cav7, cav9, ev70, and ev71) are candidates for "neurovirulent ireses" and, indeed, a corresponding pv/cav chimera has confirmed this hypothesis (gromeier et al., unpublished results). c-cluster enteroviruses, on the other hand, never cause poliomyelitis. however, because of their close genetic kinship with pv, the ireses of c-cluster coxsackieviruses confer a highly neurovirulent phenotype to the pv/cav chimeras (gromeier et al., unpublished). finally, ires elements of the genus rhinovirus ablate neuropathogenesis in poliovirus chimeric viruses (gromeier et al., 1996). these observations indicate that, indeed, cell-external restriction in cell tropism as well as cell-internal factors exert powerful limitations on enterovirus pathogenesis. many relatives of poliovirus of the enterovirus genus (particularly the c cluster; table 12.1) presumably would equal the neuropathogenic properties of pv if their capsid structure allowed interaction with neuronal cells. thus, non-neurovirulent c-cluster enteroviruses with high sequence homology to pv and "neurovirulent" ires elements may gain tropism for neurons in the future (see under evolution). in addition to virus-encoded factors of pv neuropathogenicity, the circumstances within the host organism at the time of infection or shortly thereafter may influence the outcome of poliovirus infection. trivial muscle injury has been shown to increase the probability of neurological complications of concurrent poliovirus infection (mccloskey, 1950). a proposed pathogenic mechanism for provocation polio identified a deviation of the route of cns invasion toward retrograde axonal transport to account for the increased risk of polio among individuals who received intramuscular injections (wimmer, 1998, 1999). picornaviruses have adapted to a wide variety of cellular components of their hosts.
the diverse spectrum of disease syndromes associated with human picornaviruses provides an excellent field of study to examine the factors that determine the clinical outcome of a viral infection. the enormous amount of sequence information, combined with a broad knowledge of the molecular biology of many of these agents, have sparked hopes of a rapid elucidation of the molecular basis for their pathogenic properties. initial optimism that sequence comparison of virulent strains with their attenuated variants alone would rapidly identify those elements responsible for a pathogenic phenotype and unravel mechanisms of pathogenesis is, however, unjustified. this is particularly true for poliovirus, the most thoroughly studied picornavirus. progress in the analysis of poliovirus neuropathogenicity has revealed that the interactions of poliovirus with the host are characterized by a degree of complexity not previously appreciated. mechanistic concepts of viral pathogenesis, combined with one-dimensional views of virus replication and its relation to the host organism, have helped little in increasing our understanding of the selective susceptibility to poliovirus of motor neurons. viral infections result in complex clinical syndromes that are difficult to explain in terms of single viral genetic elements. using poliovirus as an example, picornavirus-induced disease is the result of an intricate interplay of numerous factors, of both viral and host origin, that coordinately affect the ability of the virus to propagate in any particular cell type or organ. numerous investigations, particularly those of j.j. holland, e. domingo and their colleagues (see chapter 7), have led to the realization that the rapid evolution of rna viruses results from high mutation rates combined with exceedingly large heterogenic populations. 
this is true for picornaviridae that encode variants of conserved protein folds as well as catalytic systems not found in the cellular world and, hence, have explored an enormous evolutionary space. it is less appreciated, although equally true, that it is the host environment that has provided new opportunities for viruses to proliferate and select new variants out of a huge number of mutants. without this "co-operation" between host and parasites, picornavirus evolution would not have resulted in the tremendous diversity of genotypes whose number, now counting in the hundreds, is currently biased towards those infecting mammals. the key role of virus-host interaction is evident in the evolution of rna viruses. indeed, the relatively slow pace by which the cellular environment is changing imposes severe restrictions on the mode of rna virus evolution. a model has been proposed suggesting that, in each moment of protein evolution, mutations could only be accepted in a very limited number of positions of a polypeptide chain; otherwise, protein structure and function would have been severely compromised. these limited places of acceptable variation have been called covarions (fitch, 1971) . it has been argued that the fast evolution of rna viruses in a constrained environment has to proceed through the exploitation of constantly emerging but vastly overlapping covarions . this model of virus evolution provides a sensible hypothesis as to how picornaviral proteins have managed to accept a heavy load of mutations that are quite frequently unique and rarely seen at sites in cellular proteins, while still not entirely losing some discernible similarity with other viral and cellular homologs. this has important practical consequences, since structural similarities can be used to reconstruct the evolution of rna viruses, an undertaking impossible just 20 years ago. 
apart from the biological properties generally associated with all rna viruses, the current end-product of evolution makes each virus species unique. this is engraved in the virus' genetic plan: the organization of the genome and the mode of gene expression. during the past years we have learned that +strand rna viruses employ variations of surprisingly few basic genetic plans. the genetic organization and genetic expression of picornaviruses has been outlined in detail above. in the following, the genetic organization of different picornaviruses will be put into an evolutionary perspective. in addition, we will discuss in a rather speculative manner some hypothetical implications of evolutionary consequences of the eradication of poliovirus. numerous rna viruses have been combined into a picorna-like supergroup of which the picornaviridae comprise a rather compact domain (figure 12 .8). the viruses of the picorna-like supergroup, a taxon not yet recognized as higher-than-family rank, share a conserved array of replicative proteins (see below). they infect plants, insects, birds and mammals. most of the established members of picornaviridae are mammalian viruses, and they fall principally into six genera (table 12 .1). the newly characterized avian encephalomyelitis virus (aev), which is a member of picornaviridae, is most closely related to hepatitis a virus (marvil et al., 1999) . the latest revision of the classification of picornaviruses, although closely related to the original version, has a clear evolutionary flavor since it tends to combine viruses in accord with phylogenetic kinship rather than relying on phenotypic properties (see the introduction). the largest number of picornaviruses fall into the most closely related genera, enterovirus and rhinovirus (table 12 .1). cardiovirus and aphthovirus comprise two other genera that have most probably emerged from a common ancestor. 
it appears that the two pairs of picornavirus genera diverged after hepato- and parechoviruses split from the main trunk of the picornavirus tree (figures 12.1, 12.8). the exact phylogenetic interrelationship between hepato- and parechoviruses remains somewhat uncertain and may be different from that shown in figure 12.1 (see also below). in addition, three other picornaviruses, equine rhinovirus types 1 and 2 (erv1 and 2 respectively) and aichi virus (aiv), remain to be classified. erv1 was shown to be a distant relative of fmdv (wutz et al., 1996), while erv2 (li et al., 1996; wutz et al., 1996) and aiv (yamashita et al., 1998) appear to be related to the cardio- and cardio/aphthovirus branches respectively. they are not recognized, however, as cardio- or aphthoviruses. in addition to mammalian viruses, a number of insect viruses have been previously included into the picornaviridae on the basis of phenotypic criteria. during the last 2 years, however, the complete genome structure of six picorna-like insect viruses has been reported (van der wilk et al., 1997; isawa et al., 1998; johnson and christian, 1998; moon et al., 1998; sasaki et al., 1998). some of these viruses feature different genome organizations, and only infectious flacherie virus of silkworm (infv; isawa et al., 1998) was shown to possess a genotype and gene organization that may justify placing it in the picornaviridae (a.e. gorbalenya, unpublished results). it is likely that an ancestor of infv separated from the main branch of picornaviridae before the radiation of mammalian viruses (figure 12.8). the rapid accumulation of new virus genotypes has not been matched by an understanding of its evolutionary meaning. therefore, the basis of picornavirus classification may need to be revisited. moreover, the relationship between picornaviridae and other genetic systems may have to be defined within a new classification.
although confusing for virologists at first sight, any reclassification into hierarchically organized taxa will ultimately aid our understanding of evolution, host range of viruses and pathogenesis. it should be kept in mind, however, that any classification is, at best, an approximation of true phylogenetic relationships, and the current classification of picornaviridae should be treated as such. picornaviridae have evolved by speciation from a common ancestor. this plausible statement has been supported by computer analyses of nucleotide and protein sequences as well as by studies of the tertiary structure of capsid proteins and 3cpro proteinases. there is every reason to believe that the putative ancestral viral entity had a genetic organization that has been largely conserved in contemporary picornaviruses. its signature is a long 5'ntr-long open reading frame-3'ntr-poly(a) (figure 12.2). with the sole exception of a strain of theiler's virus (tmev; see below), all picornavirus proteins are generated by autocatalytic processing of the gigantic polyprotein (figure 12.3). the backbone of the polyprotein is formed by a set of polypeptides conserved in all known picornaviruses. in addition, the backbone may be decorated with a few optional proteins unique to a particular virus or virus group. among the proteins, the conservation increases in the order (figures 12.4, 12.9): this order can be deduced from analyses of virus groups belonging to different phylogenetic ranks: clusters of closely related viruses, distinct genera, an entire family, or the entire picorna-like supergroup (figure 12.10; a.e. gorbalenya, unpublished results). slight deviations can only be seen upon analysis of some small taxonomic groups. as already noted, tmev encodes a small, unique l* polypeptide outside the main reading frame at a locus overlapping the l reading frame.
additionally, the insect picornavirus infv was predicted to encode unique domains as part of the polyprotein (isawa et al., 1998; a.e. gorbalenya, unpublished results). a closer look at these proteins reveals the following. the l polypeptide preceding the capsid coding region (figure 12.9) is encoded by all picornaviruses except the entero- and rhinoviruses. the l protein is the most variable of all picornavirus proteins, and it exists in five different versions. l proteins with proteinase activity are encoded only by fmdv, erv1 and erv2 (skern, 1998). cardioviruses and aiv encode three different versions of l polypeptides containing a putative zn finger (a.e. gorbalenya, unpublished observations), while hepato-, parechoviruses and infv appear to encode unique l proteins (najarian et al., 1985; hyypia et al., 1992; isawa et al., 1998). some evolutionary characteristics of the 2a polypeptides parallel those of the l proteins. entero- and rhinoviruses encode 2a of the same protein family, known as 2a cysteine chymotrypsin-like proteinases (bazan and fletterick, 1988), whereas each of the three groups of hepato- and parechoviruses and infv encodes a unique 2a with unknown function(s). the other picornaviruses, cardio- and aphthoviruses and aiv, encode a 2a protein having a characteristic c-terminal motif (or a derivative thereof) that has been implicated in the spontaneous separation of 2a and 2b proteins during polyprotein synthesis (reviewed in ryan and flint, 1997).
figure 12.9 primary cleavage in picornavirus polyproteins. open boxes at the left end depict l proteins, of which only that of aphthoviruses is a proteinase. of the 2a coding sequences, only 2apro of entero- and rhinoviruses is a proteinase. in cardio- and aphthoviruses, processing at the c-terminus of 2a is strictly a cis cleavage event. in hepatoviruses, even this cleavage is catalyzed by 3cpro. modified from ryan and flint, 1997.
besides l and 2a, the only other protein of the family not conserved at the level of primary structure is vp4 (palmenberg, 1989; a.e. gorbalenya, unpublished observation). it may therefore not come as a surprise that vp4 has been poorly resolved in x-ray analyses of picornavirions whose structure has been solved (lentz et al., 1997). it is interesting to note that this small protein, which occupies a position upstream of vp2, has moved its position to between vp2 and vp3 in insect infv. other picornavirus proteins are conserved, albeit to varying degrees. these polypeptides therefore may play similar role(s) in the life cycle of different picornaviruses. for example, 2b and 3a are of variable sizes but they contain hydrophobic regions thought to be involved in the anchoring of these proteins to membranes in rna replication complexes. only the hydrophobic patches of 2b and 3a polypeptides, however, have been conserved (a.e. gorbalenya, unpublished results). in capsid proteins, the most pronounced conservation is evident in residues critically important for fold maintenance. finally, in the key replicative enzymes 2c atpase, 3cpro and 3dpol as well as in 3b vpg, the active site residues are amongst the most highly conserved (gorbalenya and koonin, 1993a).
figure 12.10 comparison of the genome organizations of the main groups of the picorna-like supergroup. for each picorna-like family group, excluding apv, the conserved organization of an "averaged" genome typical for this group is shown and compared with that of picornaviruses. the genomes are aligned with respect to the position of the 3d(-like) locus (the rna polymerase). "averaging" was carried out with respect to genome size so that most conserved genome features could be shown. note that bymoviruses, comprising a genus of potyviridae, have a bipartite genome. it is believed that all picorna-like viruses contain vpg at the 5'-end, although a vpg has not yet been demonstrated for every group of viruses. in picorna-like viruses, proteins were designated so as to reflect their similarity to the prototype picornavirus enzymes, although other nomenclatures may be in use by other investigators. apart from studies with picornaviruses, enzymatic activities have been ascribed to some proteins of como-, poty-, calici- and sequiviruses, but the complete processing map of polyproteins has been established only for como- and potyviruses. the conserved α/β rossmann fold and palm-like fold comprise only one of the domains of 2c and 3d, respectively, or their homologs. for further details, see legends to figures 12.2 and 12.4, and the text.
the majority of proteins of picornaviruses, regardless of how well they have been conserved within their own family, have homologs among cellular and other viral proteins. first, the three capsid proteins, vp1, vp2 and vp3, have adopted different versions of an eight-stranded antiparallel beta-barrel fold, dubbed "jelly-roll" (rossmann and johnson, 1989). amongst rna and dna viruses of different families, this fold is the one most commonly used to build icosahedral capsids (rux and burnett, 1998). it is also conserved in a number of cellular proteins (rossmann, 1987; orengo et al., 1997). second, the core domain of 3dpol, containing several highly conserved sequence motifs, is related to a number of polynucleotide polymerases including rna-dependent rna polymerases of rna viruses, reverse transcriptases of viral and cellular origins, and dna-dependent dna polymerases (hansen et al., 1997). an analysis of the crystal structure of pv 3dpol has also identified a (palm) subdomain adopting an rrm-like fold conserved among a number of functionally different proteins, including ribosomal proteins l7/l12 and s6 as well as the u1a splicing factor (hansen et al., 1997).
third, two picornavirus proteinases, the ubiquitous 3cpro and the entero/rhinovirus-specific 2apro, have adopted 12-stranded antiparallel two-beta-barrel folds, conserved in cellular serine proteases with chymotrypsin as the prototype (reviewed in skern, 1998). these picornavirus proteinases also have relatives that are encoded by (+)rna viruses belonging to dozens of different species (gorbalenya and snijder, 1996; ryan and flint, 1997). unlike cellular proteases, the picornavirus 3cpro and 2apro employ cysteine as the principal catalytic nucleophile and, in some lineages, have another unique replacement: instead of the catalytic asp they use a glu. the other small family of picornavirus proteinases, lpro of aphthoviruses, erv1 and erv2, is related to cellular papain-like proteases (gorbalenya et al., 1991; skern, 1998) whose homologs have been identified in many animal and plant rna viruses as well (gorbalenya and snijder, 1996). finally, 2c atpase, whose structure is yet to be solved, belongs to the so-called helicase superfamily iii. this protein group includes polynucleotide-stimulated atpases, some with helicase activity, which are encoded by (+)rna and small dna viruses as well as proteins of cellular origin (gorbalenya and koonin, 1990, 1993b). the 2c atpase has been predicted to be a three-domain protein. two α/α domains flank an atp-binding domain adopting a variation of the α/β "rossmann" fold, which is widespread in the protein world (teterina et al., 1997). with respect to details, our current understanding of the function of picornavirus proteins is rather fragmentary. nevertheless, a preliminary functional profile of picornavirus proteins fits patterns of conservation evident at the structural level. the most conserved non-structural proteins provide the basic enzymatic activities needed for the synthesis and expression of viral rnas inside the cell.
the three conserved capsid proteins form the scaffold of virions shielding virus rna from the detrimental environment outside the cell. all these activities appear to be virus-specific, although they may be modulated by cell-encoded components. in contrast, non-conserved viral proteins seem to sense and modify the host environment in addition to serving basic biosynthetic processes programmed by the viral genomes (for instance, see piccone et al., 1995; zoll et al., 1996; svitkin et al., 1998; ventoso et al., 1998). virions also have host-dependent functions, such as the recognition of the cellular receptor, entry and, possibly, virion maturation. different lines of evidence have shown that the least conserved regions of three capsid proteins, as well as vp4, may mediate these early activities of host cell entry (for recent work, see hadfield et al., 1997, and lentz et al., 1997). much of what has been said about proteins applies also to the terminal ntrs of the picornavirus genomes. these regions are conserved within related genera but they may diverge when groups are compared (e.g. enterovirus/rhinovirus versus cardiovirus/aphthovirus) even though they play identical roles in viral proliferation (discussed in the section on genetics). variants of two very different conserved secondary structure organizations of ires elements are shown in figure 12.3, prototyped by those of pv and emcv. it is unclear what type of 5'ntr was encoded by an ancestor of picornaviruses: one resembling one of the contemporary prototypes or rather a "consensus" one (le and maizel, 1998). in contrast, the 3'ntr region has diverged profoundly amongst picornaviruses and it is not conserved even within the otherwise closely related entero- and rhinoviruses (poyry et al., 1996).
the polyprotein of numerous +strand rna viruses has evolved such that its organization reveals an additional level of conservation: the order of mature proteins in this large precursor (figure 12.10; see also the order of protein domains in the prototype pv in figure 12.4). this order is inflexible and none of the picornaviruses violates it, although entero- and rhinoviruses do not encode l proteins (see above). despite a near absolute conservation of the order of protein domains, there is some plasticity (figure 12.10). upon computer sequence analyses of the picorna-like viruses, it has become evident that the polyprotein can be divided into two parts, one comprising capsid proteins and the other the non-structural proteins. these parts are expressed rather independently. in a group of picorna-like insect viruses, rhopalosiphum padi virus (rhpv), drosophila c virus (dcv), plautia stali intestine virus (psiv) and cricket paralysis virus (crpv), non-structural and capsid proteins are encoded by two orfs, separated by an ntr (koonin and gorbalenya, 1992; johnson and christian, 1998; moon et al., 1998; nakashima et al., 1998; sasaki et al., 1998). in comoviridae, a family of plant viruses, the capsid and non-structural proteins are encoded by two distinct rnas, rna2 and rna1 (goldbach, 1986). remarkably, in the dendrogram shown in figure 12.8, comoviridae, sequiviridae (a plant virus family having the same polyprotein organization as picornaviridae; turnbull-ross et al., 1993) and picorna-like insect viruses form a division immediately adjacent to the picornaviridae. it is important to stress that comparison of the sequences of many viruses of the picorna-like supergroup has revealed a profile of sequence conservation that parallels that observed for picornaviridae. thus, two groups of highly conserved clusters can be distinguished in polyproteins. the first group comprises the capsid proteins vp2-vp3-vp1, the second the non-structural proteins 2catpase-(vpg)-3cpro-3dpol (or equivalents).
the functions assigned to the individual members of the group of non-structural proteins remain provisional for the majority of known viruses. they have been inferred largely on the basis of sequence similarities with proteins of well-characterized viruses like poliovirus. among the positionally highly conserved non-structural proteins, the genome-linked protein vpg has a special standing (highlighted by bracketing) since it is conserved functionally rather than structurally in the picorna-like viruses (figure 12.10; gorbalenya and koonin, 1993a). the combination of conserved non-structural proteins of picorna-like viruses has been termed the "replicative module" (goldbach, 1986). a comparable module of related proteins has also been recognized as the "capsid module", built of three "jelly-roll" proteins. animal caliciviridae, plant potyviridae and insect acyrthosiphon pisum virus, all of which are distantly related to picornaviridae, encode a distinct variety of the replicative module that is associated with one of three unique sets of capsid protein(s) encoded in the 3'-region of viral genomes (domier et al., 1986; meyers et al., 1991; van der wilk et al., 1997; figure 12.10). both the order of proteins in the picornavirus polyprotein and the patterns of expression (proteolytic processing) have been conserved. pairs of neighboring proteins are separated at scissile bonds cleaved by a virus proteinase or, in the case of the vp4/vp2 junction, by an unknown mechanism. it could be hypothesized that the position of protein domains could be changed as long as the corresponding proteins were released more or less independently from the precursor. this, however, is not the case, as the pathway of proteolytic processing in picornavirus polyproteins is not random. furthermore, at least some intermediate precursors, e.g. 2bc, 3ab and 3cdpro, have essential functions that differ from those of the end-products of processing (see the section on genetics).
these considerations provide a biological reasoning for the observed conservation of the protein order in polyproteins. we have already pointed out that the order of two conserved units within the polyprotein, the capsid precursor and replicative modules, is flexible. the two least conserved proteins, l and 2a, flank the capsid precursor at the n- and c-termini, respectively, and bring additional plasticity to the organization of the polyprotein. this is reflected also in terms of expression, i.e. the mechanism of proteolytic processing. processing of the capsid precursors as well as of the replicative module at junctions separating conserved proteins involves exclusively the conserved 3cpro/3cdpro proteinases, a mechanism functioning not only in picornaviruses but also in other picorna-like viruses (figure 12.10; ryan and flint, 1997). in contrast, the three junctions separating the poorly conserved l and 2a proteins from the neighboring polypeptide chains (l/vp4, vp1/2a and 2a/2b) are processed by a range of mechanisms in a genus-specific manner. furthermore, whereas picornaviruses use two general pathways of cleavage (3cpro/3cdpro versus distinct mechanisms involving l and 2a), this genetic repertoire may be further diversified in some picorna-like viruses. for instance, in comoviridae and insect viruses, capsid precursor and non-structural proteins are encoded by distinct orfs (figure 12.10), which eliminates the need for cleavages separating these polypeptide chains. 3cpro/3cdpro have emerged as the major enzymatic factors in the regulation of protein expression in all picorna- and related viruses. interestingly, the primary structure of sites recognized by these proteases is virus-specific rather than position-specific. among picornaviruses, entero- and rhinoviruses employ sets of structurally uniform sites while viruses of the other genera use more diversified sets. poliovirus and hav exemplify the two extremes.
in poliovirus, all eight cleavage sites have the same ("canonical") q/g structure (figure 12.4), whereas in hav, six variations of this structure were described in different sites (palmenberg, 1990). poliovirus proteins produced from its replicative module seem to have been exceptionally strongly constrained not only with respect to the type of the terminal amino acids but also with respect to size. mature poliovirus proteins (except 3cpro), as well as processing intermediates, have sizes that can be divided by 11 without remainder or with only a small remainder (gorbalenya et al., 1986). this feature separates poliovirus proteins from the overwhelming majority of cellular and viral proteins. the latter are heterogeneous both in size and sequence, particularly at their termini, because of a relative abundance of mutations, including insertions and deletions. structural regularities documented for poliovirus can be visualized in the form of weak primary structure periodicities with the common denominator of 11 comprising the major portion of the replicative module. on the basis of these observations, it has been proposed that the replicative module of picornaviruses has originated from a primitive self-replicating rna molecule through consecutive multistep duplications (gorbalenya, 1995; gorbalenya et al., 1986).

how it is likely to evolve in the future?

we have briefly described different levels of evolutionary conservation in picornaviruses by using results of comparative sequence analyses. the conservation of different properties is the result of a long evolutionary process, accompanied by numerous radiations. does the history of the polyprotein determine how picornaviruses may evolve in the future? we are unaware that this question has ever been directly addressed in experimental studies, although many results obtained by using genetic engineering seem to be quite relevant.
these data can technically be separated into two sets: those obtained in studies using site-directed mutagenesis and those aimed at constructing chimeras. in numerous studies of the first type, it has been observed that different regions of the picornavirus genome show a differential tolerance to replacements (wimmer et al., 1993). it can be predicted that a profile of the "accepted" mutability, drawn over the entire genome, would fit the conservation profiles described above. such a result would support the hypothesis that the past of picornaviruses influences their future in terms of evolution. however, mutagenesis saturating the genome has never been systematically carried out. therefore, the available "mutagenesis profile" can only be used as a rough approximation of the yet-to-be-defined "accepted" mutability profile in relation to the conservation of modules. the "resolution" of the mutagenesis studies that remained unresolved is potentially relevant to an understanding of the evolution of contemporary picornaviruses, regardless of whether this relates to recent evolutionary events or to the complete historical past. the second group of data involving genome engineering complements the mutagenesis studies and helps to address the question posed above. in wt genomes of entero- and rhinoviruses, the orf of the capsid precursor is preceded by the cognate 5'ntr (figure 12.2). as was observed in studies of poliovirus expression vectors (figure 12.5), genetically stable variants of poliovirus have been selected (mueller and wimmer, 1997) in which an additional leader peptide is encoded that is fused to the n-terminus of the polyprotein, just downstream from the 5'ntr. this organization may look unique at first sight, but in fact it resembles that of all other picornaviruses distantly related to entero- and rhinoviruses.
these terminal appendices in the poliovirus variants resemble the l proteins and, hence, these poliovirus chimeras have a "cardio-like" organization (figure 12.6e). we can speculate that pv has "accepted" an artificial l peptide because a similar event has already happened in the past history of its ancestors. in a different set of experiments, several poliovirus chimeras have been generated in which the heterologous emcv ires was placed into the sequences specifying scissile bonds of the polyprotein, thereby dividing the polyprotein into two parts (figure 12.6b). this insertion radically modified the conserved protein expression mechanism of picornaviruses, since it functionally replaced a proteolytic cleavage event by an event of internal initiation of translation directed by the alien ires. in all, poliovirus genomes were constructed in which the emcv ires was placed at the y*g cleavage site of 2apro (figure 12.6b) or at all possible q*g cleavage sites of the 3cpro/3cdpro proteinase (molla et al., 1992, 1993b; paul et al., 1998a). only two poliovirus-emcv dicistronic chimeras, specifically those carrying the emcv ires between vp1 and 2a and between 2a and 2b, have given rise to viable and stable virus progeny (molla et al., 1992, 1993b; paul et al., 1998a). although the genome organizations of these chimeras do not match anything found in nature, immediate parallels come to mind with genomes of picorna-like viruses in which capsid and replicative modules are encoded by different orfs (for example comoviridae, see above and the section on genetics). these considerations imply that the conserved and non-conserved features in organization and structure of genomes of picornaviruses and even picorna-like viruses are indicative of an evolutionary plasticity and of possible future changes of a picornavirus. perhaps an "evolutionary space" of a picornavirus can be approximated from the past.
mechanistically, this can be seen as if the past evolution of the entire family has been "imprinted" in the organization of the genome of each of the contemporary picornaviruses. phylogenetic trees that have been built for different picornaviral proteins (most often vp3, 2catpase and 3dpol) by employing parsimony and maximum-likelihood methods proved roughly topologically equivalent even though different regions of the polyproteins have definitely evolved at different rates (stanway, 1990; rodrigo and dopazo, 1995; hyypia et al., 1997). these observations strongly favor a concerted evolution of (the majority of) the picornavirus proteins. this conclusion is not compromised by some incongruity in the tree topology of closely related viruses, e.g. the c cluster of the enteroviruses (poyry et al., 1996), or very distantly related groups, e.g. hepato- and parechoviruses. it is likely that some trees generated for different regions look different as a result of technical limitations related to phylogenetic and biopolymer sequence analyses as well as a biased representation of some groups. also, possible recombination events between closely related viruses may have complicated phylogenetic analyses. we shall analyse sequence alignments of picornavirus proteins and polynucleotides with the aim of deducing the mechanisms functioning in picornavirus evolution. uniform sizes of each of the vp3, 2catpase, 3cpro or 3dpol polypeptides have been maintained in all picornaviruses. the diversity of the proteins is therefore most probably the result of numerous in-frame mutations. for the other proteins, some additional mechanism of diversification may have been functioning in the course of evolution. among the viruses encoding 2a proteins sharing the npgp motif, the two viruses fmdv and erv1 encode a 2a consisting of only 16 amino acids, whereas the cardioviruses, erv2 and aiv encode a 2a ranging between 67 and 150 residues.
it can be speculated that deletion events in the 2a coding region of fmdv and erv1 are the result of "jumping" of 3dpol, perhaps by loop-out deletion or by illegitimate recombination (figure 12.7). on the other hand, the three adjacent coding regions for vpg uniquely found in all strains of fmdv suggest duplication events. in other viruses, e.g. erv2 or tmev, genetic events such as local duplication and deletions may have occurred, leading to considerable size heterogeneity of the corresponding vpgs and adjacent sequences (wutz et al., 1996; a.e. gorbalenya, unpublished results). duplications have also been discovered in the 5'ntr of enteroviruses (pilipenko et al., 1989a). picornavirus genomic redundancy, known as duplications, may have been generated by intragenomic recombination. after duplications, however, the sequences must have undergone some variation so as to avoid elimination by homologous recombination. indeed, the nucleotide sequences (and to a small extent also the amino acid sequences) of the three vpgs of fmdv differ such that homologous recombination at this locus is unlikely (cao and wimmer, 1996). in spite of the lack of evidence, duplications by intragenomic recombination might have been involved in the production of large differences in size found in capsid proteins vp1 and vp2, or in non-structural 3a and 2b proteins of some picornaviruses. the capsid proteins contain long extra loops while the 2b protein of erv1 has an enormous size relative to the 2b proteins of all other picornaviruses (283 versus 100-150 amino acids; wutz et al., 1996). on the other hand, charini et al. (1994) have reported that, surprisingly, a viable poliovirus isolate they selected from a swarm of revertants had captured a short segment of cellular ribosomal rna. thus, capture of entirely foreign rna sequences, although very rare, cannot be excluded from the mechanisms of diversification.
at least two different mechanisms could have given rise to the contemporary diversity of 2a and l protein families. the diversity includes, amongst others, chymotrypsin-like proteinase and npgp motif-containing polypeptides for 2a and papain-like proteinase and zn-finger proteins for l. phylogenetic analyses suggest that "new" unrelated 2a and l proteins have emerged in the course of evolution of picornaviruses on several occasions, following the split of the major groups of the picornavirus tree. it is logical to assume that, following each split, one of the two descendants has arisen from an ancestral viral source, the other from an "independent" source. as to the latter, the coding sequence of either 2a or l could have recombined with a gene of either another virus or of the cell, leading to the replacement of the ancestral coding sequence. for example, this replacement mechanism could have resulted in the capture of cellular chymotrypsin-like (2apro) or papain-like (lpro) activities. this hypothesis is, of course, purely speculative since no potential partners in recombination have been identified as yet. alternatively, the diversity of the 2a and l families may be the result of frame-shifting events. for example, enteroviruses have a "spacer sequence" between the ires and the orf of the polyprotein. this spacer commences with an unused ("silent") aug at the 3' border of the ires. in poliovirus, it is 154 nt long and represents a small out-of-frame orf terminating inside the polyprotein orf. if the silent aug at the 5' end of the spacer were to trigger initiation of translation and, in addition, a frame-shift mutation connected the small orf with the main orf, a small "leader" peptide would be created fused to the polyprotein. all that is then necessary is a 3cpro cleavage site to sever the "leader" from vp4, and a genetic arrangement would have been created resembling that of cardio- and aphthoviruses (jang et al., 1990).
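the reading-frame argument for the poliovirus spacer can be made concrete with a little modular arithmetic. the following python sketch is illustrative only. it assumes that the 154 nt quoted in the text measures the separation between the silent aug and the polyprotein initiation codon (counting or not counting the 3-nt aug itself leaves the remainder modulo 3 unchanged), and the function names are hypothetical.

```python
SPACER_NT = 154  # spacer length quoted in the text for poliovirus


def frame_offset(distance_nt: int) -> int:
    """Reading-frame offset (0, 1 or 2) between two start codons
    separated by `distance_nt` nucleotides; 0 means 'in frame'."""
    return distance_nt % 3


def restoring_indels(distance_nt: int, max_size: int = 4) -> list:
    """Net insertion (+) or deletion (-) sizes, up to `max_size` nt,
    that would bring the upstream start codon into the main frame."""
    return [
        d
        for d in range(-max_size, max_size + 1)
        if d != 0 and (distance_nt + d) % 3 == 0
    ]


if __name__ == "__main__":
    print("frame offset of the silent aug:", frame_offset(SPACER_NT))
    print("net indels restoring the frame:", restoring_indels(SPACER_NT))
```

under these assumptions the silent aug lies one nucleotide out of frame, so a net deletion of one nucleotide or a net insertion of two (or any change equivalent modulo 3) would fuse the small orf to the polyprotein orf. the sketch says nothing about where such a change must occur; according to the text the small orf terminates inside the polyprotein orf, so the shift would have to take place upstream of that terminator.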
indeed, the silent aug of poliovirus can be turned on by changing its kozak context (pestova et al., 1994) , and stable poliovirus variants can be isolated that carry short foreign leaders (see above; mueller and wimmer, 1998) . thus, the conversion of an enterovirus to a cardiovirus genotype with respect to an l protein can be envisioned by relatively simple genetic changes. similarly, it should be possible to convert a cardiovirus genotype in this region into an enterovirus genotype by silencing its l orf. it is relevant that, as already mentioned, a strain of theiler's virus has been identified that, just like the normal cardioviruses, synthesizes a polyprotein-fused l protein and, in addition, a polypeptide l* in a separate orf. l* synthesis is initiated at its own aug initiation codon (takata et al., 1998) . apparently, the synthesis of l* may present the virus with an advantage in the natural host, a fact that may have contributed to its selection. by comparison with the l protein region in tmev, two 2a proteins may have existed in the ancestral picornavirus genome, one active in the polyprotein, the other "silent". in the course of subsequent speciation, each of these 2a variants may have been used in separate picornavirus lineages. the activation of the "silent" 2a may have led to a concomitant inactivation of the other 2a. it should be mentioned that the presence of multiple alternative orfs in ancestral picornavirus genomes may have been the rule rather than the exception, particularly if the polyprotein evolved by amplification of 11-mers (gorbalenya, 1995 ; see also above). ohno (1984) has demonstrated that periodicity-organized polynucleotides with a period that cannot be divided by 3 (11-long periodicity included) have an identical coding capacity in each frame. in other words, if one orf is open the two other frames are open also. 
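ohno's observation about periodic polynucleotides can be checked with a short python sketch. it is an illustration under stated assumptions, not a reconstruction of ohno's analysis: the sequence is idealized as a perfect, endless tandem repeat of one unit, and the two example units below are arbitrary strings chosen for the demonstration, not viral sequences. when the period is not divisible by 3, reading in any frame cycles through every position of the unit, so all three frames see the same set of codons and are therefore open or closed together.

```python
from math import gcd

STOP_CODONS = {"TAA", "TAG", "TGA"}


def cyclic_codon_set(unit: str, frame: int) -> frozenset:
    """Codons read in `frame` (0, 1 or 2) from an endless tandem repeat of `unit`."""
    period = len(unit)
    # number of nucleotides after which the codon pattern repeats: lcm(period, 3)
    cycle = period * 3 // gcd(period, 3)
    # one extra copy so that shifted frames never run off the end
    tandem = unit * (cycle // period + 1)
    return frozenset(tandem[i:i + 3] for i in range(frame, frame + cycle, 3))


def frame_is_open(unit: str, frame: int) -> bool:
    """True if no stop codon is ever encountered in this frame."""
    return cyclic_codon_set(unit, frame).isdisjoint(STOP_CODONS)


if __name__ == "__main__":
    unit_11 = "GCTGCAGGTCC"   # arbitrary 11-mer (period not divisible by 3)
    unit_12 = "GCTGAAGGTCCA"  # arbitrary 12-mer (period divisible by 3)
    for name, unit in (("11-mer", unit_11), ("12-mer", unit_12)):
        status = [frame_is_open(unit, f) for f in range(3)]
        print(name, "frames open:", status)
```

with the 11-mer, all three frames contain the same eleven codons, so the openness of one frame implies the openness of the others. with the 12-mer, each frame reads its own four codons, and in this example frames 0 and 1 are open while frame 2 runs into a stop codon.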
in the course of evolution, two out of three reading frames may have deteriorated or may have given rise to genetic variation as speculated for the generation of the diversity in 2a and l proteins. numerous studies attest to a remarkable stability of the picornavirus genotype if grown under identical conditions (wimmer et al., 1993). on the other hand, if exposed to altered conditions in the environment, a shift to new variants can be readily observed. as for other biological systems, it can be assumed that picornavirus speciation has been driven by a changing environment. circumstances in which a picornavirus may encounter a "new" environment include: (1) horizontal or vertical transfer to a new (different) host; (2) entering a natural host through a non-natural gate; (3) infecting immunized (natural) hosts previously exposed to the same virus. although there is no proof, it is intuitively highly likely that all three scenarios have played a role in picornavirus speciation. in the following, a speculative reconstruction of forces will be presented that may have contributed to the evolution of picornaviruses. picornaviruses belonging to a genus or a cluster may have almost identical phenotypes with respect to growth properties and even in regard to pathogenic potential. a most important characteristic, however, does further divide a group of very closely related picornaviruses (e.g. polioviruses): the susceptibility to inactivation by different neutralizing antibodies and, hence, the separation into serotypes (see the introduction). it is logical to assume that the (negative) pressure of the immune system may be largely accountable for serotype diversification of picornaviruses. that is, the immune response can lead to the selection of viral variants resistant to the neutralizing immune response produced by the surviving host. such variants would form a pool from which a new serotype could be further selected.
in fact, such a mechanism of virus evolution seems to dominate in the case of influenza a virus or human immunodeficiency virus (hiv). however, the almost unlimited degree of serotype diversification observed in influenza viruses or hiv is an exception rather than the rule amongst viruses. indeed, not all picornaviruses seem to be able to easily produce new serotypes. for example, the genus hepatovirus encompasses only one serotype while others are restricted to a few serotypes (e.g. poliovirus). new viral variants that have escaped the immune surveillance must, of course, interact with multiple host components at virtually every stage of their reproduction in order to survive. this includes virus entry into the host cell, translation and processing, genome replication, encapsidation and maturation, and spread in the host. each of these steps is a checkpoint and every new viral variant must be fit to pass these barriers. the earliest events in the infectious cycle (receptor interaction, uptake, uncoating) and the mechanisms of neutralization are amongst the least understood in the molecular biology of picornaviruses. the crystal structures of some member viruses of four picornavirus genera have been solved; examples are: enterovirus, poliovirus 1 and 3 (hogle et al., 1985); rhinovirus, human rhinoviruses 2, 3, 14, 16 (rossman et al., 1985); cardiovirus, mengovirus (luo et al., 1987), theiler's virus (luo et al., 1996); aphthovirus, fmdv (acharya et al., 1989) (for a complete list, see lentz et al., 1999). however, the precise localization and structures of different neutralization antigenic sites (the structures interacting with neutralizing antibodies) are known only for polioviruses, rhinoviruses and aphthoviruses. for aphthoviruses and for polioviruses, the available evidence suggests that the same structures that determine in part the serotype identity are also involved in receptor recognition (domingo et al., 1993; mason et al., 1994; harber et al., 1995).
thus, immune-escaping viral mutants are likely to be enriched in those variants that have maintained the ability to efficiently interact with the cognate receptor and follow the pathway of uptake and uncoating. this is, of course, only speculative but, if correct, it would explain in part serotype restriction (harber et al., 1995). in this respect it may be informative to compare receptor specificities with serotype diversities of human enteroviruses, on the one hand, and rhinoviruses on the other. these two genera encompass viruses that have diverged from an immediate common ancestor and radiated during the same time period (figure 12.1). in the course of evolution, different serotypes in roughly the same numbers have been generated in these two picornavirus branches: there are about 66 enterovirus and over 100 rhinovirus serotypes. this implies that viruses of the two genera are similarly prone to accumulation of changes in those capsid structures giving rise to new serotypes. but what about receptor specificity of these viruses? at the time of writing, two receptors have been assigned for human rhinoviruses (which is probably all that will be found) and six receptors for human enteroviruses (at least four more are awaiting identification; table 12.1). thus, in contrast to the quite similar extent of serotype diversification in both genera, adaptation to new receptors is significantly more restricted in rhinoviruses than in the closely related enteroviruses. importantly, there is an overlap between the two receptor patterns and, taken together, the icam-1 receptor specificity appears to be dominant among entero- and rhinoviruses. this can be interpreted to mean that the immediate common ancestor of both entero- and rhinoviruses may have used a receptor related to icam-1.
regardless of whether this is true or not, the subsequent evolution of icam-1-recognizing picornaviruses has proceeded differently, as seen in the disparity of the current use of this cellular receptor (>90 for rhinoviruses versus 11 for c-cluster coxsackieviruses). given that the serotype diversification has proceeded at a similar pace in entero-and rhinoviruses, enteroviruses may have had greater opportunities -or a greater need -to adapt to new receptors in order to initiate an infection. this may be related to the function(s) of receptors in viral docking and uncoating: whereas rhinoviruses may need the receptor only for docking and uptake (because of their inherent sensitivity to the acidic ph inside late endosomes), the exceedingly stable enteroviruses do need the receptor (and possibly a co-receptor) for docking, uptake and uncoating. with poliovirus, a particle stable to detergents, proteases and low ph (ph 2), this is exemplified in the formation of a-particles, a labile product of receptor/virion interaction and an intermediate in uncoating . a-particle formation appears to involve also sequences of neutralization antigenic sites (harber et al., 1995) . thus, the intercourse between receptor and enterovirion may be much more complex than that between receptor and rhinovirion. consequently, a change in the serotype may have forced enteroviruses to search for new receptors to retain the uncoating capacity of the cellular receptor. the unusually large serotype diversity of the major receptor group human rhinoviruses may then be explained as follows. it seems possible that the initiation of an infectious cycle of hrv does not critically require an interaction between structures of the neutralization antigenic sites of the virion and icam-1. that is, the n-terminal domain of icam-1, by inserting itself into the virion's canyon, can effect docking, uptake and uncoating of the particle. 
progression through any of these events is not critically dependent on sequences of the neutralization antigenic sites. if correct, it follows that variation of the antigenic sites does not restrict viral proliferation and serotype evolution. consistently, in other picornaviruses the neutralization antigenic sites and the determinants recognizing the receptor would be much more overlapping and mutually dependent. it is likely that an initial immune-driven selection might also finally result in a virus variant with changed or extended tissue tropism. this might have happened with cav24v, a c-cluster human enterovirus. immune pressure might have initiated the selection of the cav24v mutant derived from a cav24 swarm. as mentioned before, cav24v is a very recent variant of cav24 and, unlike its parent and the other members of the c-cluster, it can cause acute hemorrhagic conjunctivitis. apart from the possibility that cav24v emerged through immune selection, it could also have been selected from a swarm when the parental cav24 was accidentally inoculated into the eye. another type of selection might have been responsible for the emergence of swine vesicular disease virus (svdv). phylogenetic analysis of genomes of human enteroviruses identified svdv as being interleaved with human viruses comprising the cbv-like cluster (poyry et al., 1996). this observation is strongly indicative of selection of svdv from a mutant of a human coxsackie-b virus entering the new host through frequent contacts of these domestic animals with (infected) humans. we have discussed different aspects relevant to picornavirus evolution, but we did not address one crucial question: are picornaviruses a successful family? we believe that the answer is: yes. in discussing this issue, we will also formulate considerations regarding the worldwide eradication of poliovirus. one of the strongest criteria of biological prosperity is the diversity of a taxonomic group.
despite some bias inherent in current analyses, phylogenetic studies of picornaviral genomes suggest that picornaviridae have radiated densely over the course of evolution, at both early and late stages (figure 12.1). furthermore, picornaviruses are members of a superfamily with numerous distant relatives (figure 12.8) that infect a wide range of organisms, including both plants and animals. some of these viruses, like sequiviridae, employ a genetic plan that is basically a variation of the genetic plan used by picornaviruses (figure 12.10). prosperity of the host is another prerequisite for a virus to be successful. by this criterion also, picornaviruses are successful, since the majority of them, representing different branches of the picornavirus tree, infect humans. humans are arguably one of the most successful species in the biological world. in truth, picornaviruses are relatively harmless even though few humans, if any, can escape picornavirus infections. this too can be viewed as evidence that these viruses have adapted well to their host, as they have not significantly undermined human affairs. this is true even for poliovirus, an agent that is commonly regarded as a deadly virus following epidemics of poliomyelitis. however, prior to this century, poliovirus did not cause epidemics, even though it infected humans at rates approaching 100%. epidemics emerged because human behavior changed through the invention of modern hygiene. hygiene broke the chain of natural immunization through infant infection combined with infant protection by maternal antibodies. even in this century's devastating epidemics, however, only 1-2% of infected individuals developed poliomyelitis. the poliovirus-human relationship alluded to above deserves to be discussed in more detail. humans, who occupy a unique niche in the biological world (because they care about each human life), did not accept their potential defeat as poliomyelitis became an epidemic.
unprecedented efforts combining medical research with modern technologies led to the development of two highly effective poliovirus vaccines, the inactivated poliovirus vaccine by jonas salk and the live attenuated vaccine by albert sabin (wimmer et al., 1993). through education of the populace and advanced healthcare measures, mass vaccinations have gradually eliminated wild-type poliovirus, first in the developed countries and later in most of the world. incredibly, the few cases of poliomyelitis in the western hemisphere now result from vaccination with the live sabin strains. overall, polio vaccination is a success story of the greatest consequence. indeed, through worldwide efforts led by the world health organization, it is likely that wild-type polioviruses will be eradicated globally by the turn of the century (who, 1985). do these considerations allow us to safely conclude that, after its global eradication, poliovirus will have no chance to re-emerge through enterovirus evolution? for discussion of this issue, we will first summarize hypotheses about the possible origin of polioviruses and their closest relatives, the c-cluster coxsackieviruses. the three serotypes of poliovirus belong to the c-cluster of enteroviruses (table 12.1; figure 12.1b). the most comprehensive analysis of the c-cluster has been performed with sequences of the vp4-vp2 capsids and with sequences of the 3dpol rna polymerase (pulli et al., 1995). results of these analyses are consistent with data obtained in a study of the other regions of the viral genome using a less representative set of sequences (poyry et al., 1996). therefore, the relationships shown in figure 12.1b can be assumed to be quite reliable. a phylogenetic analysis of the capsid vp4-vp2 region of c-cluster viruses indicated that the tree has split at least twice, perhaps before the emergence of an immediate ancestor of polioviruses.
the first split led to the separation of a branch encompassing cav1, cav21 and cav24 from the main c-cluster trunk, and the second, more recent one resulted in the separation of the ancestor for pv and the ancestor for cav11, cav13, cav17, cav18, cav20 and cav20b. the results obtained with sequences of 3dpol favor an even more complex evolutionary history of poliovirus, including more than five intermediate steps (pulli et al., 1995). consistent with the results of the analysis of the capsid region, cav21 and cav24 were among those viruses that diverged from the main trunk relatively early in evolution while the three poliovirus serotypes clustered together with cav11, cav13, cav17 and cav18. remarkably, in the tree based on 3dpol sequences the latter four coxsackieviruses (as well as several other coxsackieviruses) are interleaved with, rather than separated from, the three poliovirus serotypes (figure 12.1). this stands in contrast to the tree of the capsid region. assuming the most parsimonious scenario of evolution, the combination of these results strongly implies that coxsackieviruses that recognize the icam-1 receptor formed a pool from which polioviruses, interacting with the cd155 receptor, have evolved. this conclusion is compatible with a hypothesis of the immune-driven evolution of entero- and rhinoviruses presented above. furthermore, the analyses do not indicate that the three polioviruses comprise a monophyletic subgroup within the c-cluster enteroviruses and, hence, have emerged from an ancestral virus by speciation, as one could expect from the distinct phenotypic profile of these viruses. we have previously hypothesized that the coxsackieviruses may have derived from polioviruses by switching receptors from cd155 to icam-1 (harber et al., 1995). this possibility may be supported by the fact that the ireses of c-cluster coxsackieviruses are highly "neuropathogenic".
on the other hand, the assessment presented above favors an evolutionary relationship in the opposite direction. regardless of the direction in which these viruses emerged, the receptor switch has profound consequences for their pathogenic properties: whereas the c-cluster coxsackieviruses cause respiratory disease, poliovirus can cause deadly neurological disease. these considerations may also have important practical implications. for the sake of the argument, we will assume that the poliovirus eradication campaign has been successfully completed and no more poliovirus particles, including those of the vaccine strains, are circulating worldwide. furthermore, we will assume that all vaccination against poliovirus (including vaccination by inactivated vaccines) has been terminated, a scenario that has been envisioned to be a reality by the end of the next decade. these measures would mark the beginning of a new era in the history of mankind: there will be no human exposure to polioviruses and their antigens. generations of humans will be born that have not been infected with wild-type or vaccine polioviruses and, gradually, they will replace the older generations who carry anti-poliovirus antibodies. at that point, the world will not only be free of poliovirus, but its human population will also no longer carry anti-poliovirus antibodies. thus, a new environment will emerge for human viruses, in particular for c-cluster coxsackieviruses, which are the closest genetic relatives of poliovirus. these c-cavs are expected to circulate widely in the human population, exploring a new evolutionary space. within the human space populated by the c-cavs, there will then exist also a free space that was previously occupied by the three (extinct) poliovirus serotypes. it is possible that mutations in antigenic sites of the c-cavs may (re)generate affinity to cd155. 
prior to eradication, c-cavs carrying such mutations could conceivably be eliminated by anti-poliovirus antibodies (harber et al., 1995) but in the poliovirus-free world they may remain unchecked. this means that, once emerged, these new viruses carrying poliovirus-like neutralization antigenic sites with cd155 receptor affinity are less likely to be eliminated from the human population after eradication than before. since all enteroviruses, the variants included, lead to enteric infections, these variants may find a passage to the cns and, mediated by their affinity to cd155, may cause neurological disease. it is relevant to point out (gromeier and wimmer, unpublished results) that chimeric polioviruses in which the poliovirus ires has been replaced with c-cluster ires elements have been found to be highly neurovirulent in cd155 tg mice (see the section on pathogenesis). thus, there is reason to fear that in a poliovirus-free world new coxsackievirus-related, poliovirus-like pathogens that can cause poliomyelitis may emerge in the course of natural viral evolution. the time frame, however, cannot be predicted. it could be one generation or 1000 years. the human condition favors an increasing rate of diversity of human viruses simply because of the increasing size of the human population (estimated to stabilize at 8-12 billion during the next century). this population explosion will lead to a dramatic increase of human contacts, either in cities, particularly megacities (harboring more than 50% of the world's population), through travel or otherwise. clearly, this presents a fertile ground for proliferation and diversification of the highly infectious human picornaviruses. thus, the possibilities of genetic variation of picornaviruses leading to new or renewed human pathogens, such as cav24v, must always be kept in mind. 
at this point, however, our considerations of the possible re-emergence of poliovirus-like pathogens in the post-eradication era pale in the face of mankind's heroic attempt to eradicate an rna virus for the first time. after all, poliovirus has caused, and is still causing, terrible human suffering. picornaviruses have been discovered because they cause diseases in animals and humans. fortunately, most human picornavirus infections are self-limiting. yet the enormously high rate of picornavirus infections in the human population can lead to a significant incidence of disease complications that may be permanently debilitating or even fatal. the case of poliovirus has taught us that a change of human behavior, which, paradoxically, was the invention of modern hygiene, has greatly aggravated the impact of infection by this specific agent. clearly, this scenario could repeat itself with other human picornavirus species. the terror of this century's poliomyelitis epidemics has driven picornavirus research forward more than any other factor. this work has led to a wealth of discoveries in biology in general, and to an abundance of data describing the unique biology of picornaviruses and their evolution in particular. picornaviruses employ one of the simplest imaginable genetic systems: they consist of single-stranded rna that encodes only a single multidomain polypeptide, the polyprotein. the rna is packaged into a small, rigid, naked, icosahedral virion whose proteins are unmodified except for a myristate at the n-termini of vp4. the rna itself does not contain modified bases. thus, picornaviruses travel with light baggage. on first sight, the replication of picornaviruses is exceedingly simple. after having chosen a receptor from a large menu of cell-surface proteins, the virion enters the cytoplasm and immediately translates its genome, controlled by its ires element. thereafter, the polyprotein is processed by its own proteinases. 
rna replication occurs by a unique, protein-primed mechanism catalyzed by the rna-dependent rna polymerase. assembly appears to be linked to rna synthesis, and release of the progeny virions follows a passive mechanism. there is no need for a cellular nucleus. indeed, the entire replication cycle can occur in a cell-free system devoid of nuclei, mitochondria and perhaps of all other cellular organelles. yet as of now we understand only a small fraction of these viruses' life cycle, and we are awed by the sophistication with which the viruses express their genetic information. the ires, arguably one of the most complex cis-acting signals known in rna systems, has freed picornaviruses from the cellular constraint of cap-dependent translation. this, in turn, allows the primer-dependent rna polymerase, an enzyme with properties generally ascribed only to dna polymerases or reverse transcriptase, to prime with vpg and leave the rna uncapped. polyprotein processing proceeds in a controlled manner yielding cleavage intermediates and end-products that can be used for different functions. thus, the menu of gene products is expanded through the temporal regulation of proteolytic processing. details of all of these steps in replication are still obscure. the key to ultimately understanding picornaviruses may be to rationalize the huge amount of information about these viruses from the perspective of evolution. it is possible that the replicative apparatus of picornaviruses originated in the precellular world and was subsequently refined in the course of thousands of generations in a slowly evolving environment. picornaviruses cultivated the art of adaptation, which has allowed them to "jump" into new niches offered in the biological world. also, by having chosen humans as an additional host, they were offered an abundance of opportunities to proliferate in different tissues, which has contributed to their diversification. 
these opportunities have further increased through the human population explosion and through changes in human behavior. we suggest that, in addition to drastic and expansive measures such as global eradication, strategies should be developed that aim at predicting the possible evolution of new picornavirus pathogens and preparing for their control. the results reviewed in this article may contribute to achieving this tantalizing and desirable goal.
the three-dimensional structure of foot-and-mouth disease virus at 2.9 å resolution restricted growth of attenuated poliovirus strains in cultured cells of a human neuroblastoma paradoxes of the replication of picornaviral genomes polioviruses containing picornavirus type 1 and/or type 2 internal ribosomal entry site elements: genetic hybrids and the expression of a foreign gene picornaviral 3c cysteine proteinases have a fold similar to chymotrypsin-like serine proteinases attenuated mengo virus as a vector for immunogenic human immunodeficiency virus type i glycoprotein 120 attenuated mengo virus: a new vector for live recombinant vaccines a functional ribonucleoprotein complex forms around the 5' end of poliovirus rna poliovirus rna synthesis utilizes an rnp complex formed around the 5'-end of viral rna engineering poliovirus as a vaccine vector for the expression of diverse antigens expression of animal viral genomes coupled translation and replication of poliovirus rna in vitro: synthesis of functional 3d polymerase and infectious virus complete replication of poliovirus in vitro: preinitiation rna replication complexes require soluble cellular factors for the synthesis of vpg-linked rna viral cysteine proteases are homologous to the trypsin-like family of serine proteases: structural and functional implications genetic fine structure decay-accelerating factor (cd55), a glycosylphosphatidylinositol-anchored complement regulatory protein, is a receptor for several echoviruses identification of the integrin vla-2 as a 
receptor for echovirus 1 antibodies to the vitronectin receptor (integrin alpha v beta 3) inhibit binding and infection of foot-andmouth disease virus to cultured cells genetic complementation among poliovirus mutants derived from an infectious cdna clone structural and functional characterization of the poliovirus replication complex infectious replicative intermediate of poliovirus: purification and characterization poly(rc) binding protein 2 binds to stem-loop iv of the poliovirus rna 5' noncoding region: identification by automated liquid chromatography-tandem mass spectrometry poliomyelitis sequences within the poliovirus internal ribosome entry segment control viral rna synthesis nucleotide sequence of an attenuated mutant of coxsackievirus b3 compared with the cardiovirulent wildtype: assessment of candidate mutations by analysis of a revertant to cardiovirulence intragenomic complementation of a 3ab mutant in dicistronic polioviruses genetic variation of the poliovirus genome with two vpg coding units trans rescue of a mutant poliovirus rna polymerase function transduction of a human rna sequence by poliovirus initiation of protein synthesis by the eukaryotic translational apparatus on circular rnas a picornaviral protein synthesized out of frame with the polyprotein plays a key role in a virus-induced immune-mediated demyelinating disease nonhomologous rna recombination in a cell-free system: evidence for a transesterification mechanism guided by secondary structure rna duplex unwinding activity of poliovirus rna-dependent rna polymerase 3d p~ defective interfering particles of poliovirus defective interfering particles of poliovirus. 1. 
isolation and physical properties genetics of picomaviruses brefeldin a inhibits cell-free, de novo synthesis of poliovirus role of enterovirus 71 in acute flaccid paralysis after the eradication of poliovirus in brazil the deletion of 41 proximal nucleotides reverts a poliovirus mutant containing a temperaturesensitive lesion in the 5' noncoding region of genomic rna the nucleotide sequence of tobacco vein mottling virus rna new observations on antigenic diversification of rna viruses: antigenic variation is not dependent on immune selection expression of virus-encoded proteinases: functional and structural similarities with cellular enzymes temperaturedependent alteration of cross-over sites in poliovirus recombination. virology, submitted attenuation of mengo virus through genetic engineering of the 5' noncoding poly(c) tract the origin of genetic information: viruses as models recombinants of mahoney and sabin strain poliovirus type 1: analysis of in vitro phenotypic markers and evidence that resistance to guanidine maps in the nonstructural proteins increased neurovirulence associated with a single nucleotide change in a noncoding region of the sabin type 3 poliovaccine genome the human homolog of ha vcr-1 codes for a hepatitis a virus cellular receptor structural factors that control conformational transitions and serotype specificity in type 3 poliovirus rate of change of concomitantly variable codons poliovirus-specific primer-dependent rna polymerase able to copy poly(a) covalent linkage of a protein to a defined nucleotide sequence at the 5'-terminus of virion and replicative intermediate rnas of poliovirus replication of poliovirus in xenopus oocytes requires two human factors two functional complexes formed by kh domain containing proteins with the 5' noncoding region of poliovirus rna switch from translation to replication in a positivestranded rna virus functional and genetic plasticities of the poliovirus genome: quasi-infectious rnas modified in the 
5'-untranslated region yield a variety of pseudorevertants molecular evolution of plant rna viruses abstract. europic'98 origin of rna viral genomes: approaching the problem by comparative sequence analysis superfamily of uvra-related ntp-binding proteins. implications for rational classification of recombination/repair systems comparative analysis of the amino acid sequences of the key enzymes of the replication and expression of positive-strand rna viruses. validity of the approach and functional and evolutionary implications helicases: amino acid sequence comparisons and structure-function relationships viral cysteine proteinases poliovirus induced proteinase 3c: a possible evolutionary link between cellular serine and cysteine proteinase families cysteine proteases of positive strand rna viruses and chymotrypsin-like serine proteases: a distinct protein super-family with a common structural fold putative papain-related thiol proteases of positive-strand rna viruses. identification of rubi-and aphthovirus proteases and delineation of a novel conserved domain associated with proteases of rubi-, alpha-and coronaviruses interaction of rhinovirus with its receptor, icam-1 mechanism of injury-provoked poliomyelitis prophylactic injections and the onset of paralytic poliomyelitis mouse neuropathogenic poliovirus strains cause damage in the central nervous system different from poliomyelitis internal ribosomal entry site substitution eliminates neurovirulence in intergeneric poliovirus recombinants dual stem loops within the poliovirus internal ribosomal entry site control neurovirulence the human poliovirus receptor/cd155 promoter directs reportergene expression in floor plate and optic nerve of transgenic mice in vitro construction of poliovirus defective interfering particles attenuation stem-loop lesions in the 5' noncoding region of poliovirus rna: neuronal cell-specific translation defects structure of the rna-dependent rna polymerase of poliovirus the catalysis 
of the poliovirus vpo maturation cleavage is not mediated by serine 10 of vp2 serotype polymorphism of poliovirus-cellular receptor interaction: separation of events of viral attachment and uptake proteolytic processing in the replication of picornaviruses interaction of the polioviral polypeptide 3cd pr~ with the 5' and 3' termini of the poliovirus genome: identification of viral and cellular cofactors necessary for efficient binding proteolytic processing of viral polyproteins in the replication of rna viruses 1976) 5'-terminal structure of poliovirus polyribosomal rna is pup genetic recombination with newcastle disease virus, polioviruses and influenza. cold spring harbor symp members of the low density lipoprotein receptor family mediate cell entry of a minor-group common cold virus the three dimensional structure of poliovirus at 2.9 a resolution mutation frequencies at defined single codon sites in vesicular stomatitis virus and poliovirus can be increased only slightly by chemical mutagenesis vcam-1 is a receptor for encephalomyocarditis vitrus on murine vascular endothelial cells a distinct picornavirus group identified by sequence analysis classification of enteroviruses based on molecular and biological properties analysis of genetic information of an insect picorna-like virus, infectious flacherie virus of silkworm: evidence for evolutionary relationships among insect, mammalian and plant picorna(-like) viruses efficient infection of cells in culture by type o foot-and-mouth disease virus requires binding to cell surface heparan sulfate a segment of the 5' nontranslated region of encephalomyocarditis virus rna directs internal entry of ribosomes during in vitro translation initiation of protein synthesis by internal entry of ribosomes into the 5' nontranslated region of encephalomyocarditis virus rna in vitro cap-independent translation of picornavirus rnas: structure and function of the internal ribosomal entry site the polymerase in its labyrinth: 
mechanisms and implications of rna recombination poliovirus rna recombination: mechanistic studies in the absence of selection the novel genome organization of the insect picoma-like virus drosophila c virus suggests this virus belongs to a previously undescribed virus family determination of encephalomyocarditis viral diabetogenicity by a putative binding site of the viral capsid protein isolation and characterization of defective-interfering particles of poliovirus sabin i strain complete nucleotide sequence of a strain of coxsackie b4 virus of human origin that induces diabetes in mice and its comparison with nondiabetogenic coxsackie b4 jbv strain preferred sites of recombination in poliovirus rna: an analysis of 40 intertypic cross-over sequences recombination in rna the mechanism of rna recombination in poliovirus primary structure, gene organization and polypeptide expression of poliovirus rna the poliovirus receptor protein is produced both as membrane-bound and secreted forms transgenic mice susceptible to poliovirus q[3 replicase as repressor of q[3 rna-directed protein synthesis evolution of rna genomes: does the high mutation rate necessitate high rate of evolution of viral proteins? 
an insect picornavirus may have genome organization similar to that of caliciviruses construction of viable deletion and insertion mutants of the sabin strain type i poliovirus: function of the 5' noncoding sequence in viral replication primary structure of poliovirus defective interfering particle genomes and possible generation mechanism of the particles mutational analysis of the genome-linked protein vpg of poliovirus properties of purified recombinant poliovirus protein 3ab as substrate for viral proteinases and as co-factor for viral polymerase 3d p~ differences in replication of attenuated and neurovirulent poliovirus in human neuroblastoma cell line sh-sy5y ubertragung der poliomyelitis acuta auf affen the structure of poliovirus replicative form evolution of a common structural core in the internal ribosome entry sites of picornavirus proteolytic processing of poliovirus polyproteins: elimination of 2a pro-mediated, alternative cleavage of polypeptide 3cd by in vitro mutagenesis the genome of poliovirus is an exceptional eukaryotic mrna the genome-linked protein of picornaviruses. i. 
a protein covalently linked to poliovirus genome rna structure of poliovirus type 2 lansing complexed with antiviral agent sch48973: comparison of the structural and biological properties of three poliovirus serotypes equine rhinovirus i is more closely related to foot-and-mouth disease virus than to other picornaviruses berichte der kommission zur erforschung der maul-und klauenseuche bei dem institut fuer infektionskrankheiten in berlin poliovirus chimeras replicating under the translational control of genetic elements of hepatitis c virus reveal unusual properties of the internal ribosomal entry site of hepatitis c virus construction and genetic analysis of dicistronic polioviruses containing open reading frames for epitopes of human immunodeficiency virus type 1 gp 120 analysis of picornavirus 2a(pro) proteins: separation of proteinase from translation and replication functions characterization of a new isolate of poliovirus defective interfering particles the atomic structure of mengo virus at 3.0 a resolution the structure of a highly virulent theiler's murine encephalomyelitis virus (gdvii) and implications for determinants of viral persistence the relation of prophylactic inoculations to the onset of poliomyelitis role of mutations g-480 and c-6203 in the attenuation phenotype of sabin type 1 poliovirus capsid coding sequence is required for efficient replication of human rhinovirus 14 rna avian encephalomyelitis virus is a picornavirus and is most closely related to hepatitis. 
a virus rgd sequence of foot-and-mouth disease virus is essential for infecting cells via the natural receptor but can be bypassed by an antibody dependent enhancement pathway structure of human rhinovirus 3c protease reveals a trypsin-like polypeptide fold, rna-binding site, and means for cleaving precursor polyprotein kissing of the two predominant hairpin loops in the coxsackie b virus 3' untranslated region is the essential structural feature of the origin of replication required for negative-strand rna synthesis enteroviruses: polioviruses, coxsackieviruses, echoviruses, and newer enteroviruses identification of bulgarian strain 258 of enterovirus 71 enteroviruses 69, 70, and 71 cellular receptor for poliovirus: molecular cloning, nucleotide sequence, and expression of a new member of the immunoglobulin superfamily rabbit hemorrhagic disease virus - molecular cloning and nucleotide sequencing of a calicivirus genome antigenic structure of picornaviruses cell-free, de novo synthesis of poliovirus cardioviral internal ribosomal entry site is functional in a genetically engineered dicistronic poliovirus inhibition of proteolytic activity of poliovirus and rhinovirus 2a proteinases by elastase specific inhibitors studies on dicistronic polioviruses implicate viral proteinase 2apro in rna replication effects of temperature and lipophilic agents on poliovirus formation and rna synthesis in a cell free system stimulation of poliovirus proteinase 3cpro-related proteolysis by the genome-linked protein vpg and its precursor 3ab nucleotide sequence analysis shows that rhopalosiphum padi virus is a member of a novel group of insect-infecting rna viruses poliovirus single-stranded and double-stranded rna: differential infectivity in enucleated cells expression of foreign proteins by poliovirus polyprotein fusion: analysis of genetic stability reveals rapid deletions and formation of cardiovirus-like open reading frames 
expression of foreign proteins by poliovirus polyprotein fusion: analysis of genetic stability reveals rapid deletions and formation, of cardiovirus-like open reading frames poliovirus antigenic hybrids simultaneously expressing antigenic determinants from all three serotypes poliovirus host range is determined by a short amino acid sequence in neutralization antigenic site 1 primary structure and gene organization of human hepatitis a virus properties of a new picorna-like virus of the brown-winged green bug, plautia stali foot-and-mouth disease virus virulent for cattle utilizes the integrin alpha(v)beta3 as its receptor proteolytic processing in the replication of polio and related viruses genetic studies of the antigenicity and the attenuation phenotype of poliovirus the 5' end of poliovirus mrna is not capped with m7g(5')pppnp the 5' terminal structures of poliovirion rna and poliovirus mrna differ only in the genome-linked protein vpg the location of the polio genome protein in viral rnas and its implication for rna synthesis defective interfering particles of poliovirus: mapping of the deletion and evidence that the deletions in the genome of di coupling between genome translation and replication in an rna virus repeats of base oligomers as the primordial coding sequences of the primeval earth and their vestiges in modern genes cath -a hierarchic classification of protein domain structures sequence alignments of picornaviral capsid proteins proteolytic processing of picornaviral polyprotein poly (rc) binding protein 2 forms a ternary complex with the 5' terminal sequences of poliovirus rna and the viral 3cd proteinase studies with poliovirus polymerase 3dp~ stimulation of poly (u) synthesis in vitro by purified poliovirus c protein 3ab internal ribosomal entry site scanning of the poliovirus polyprotein: implications for proteolytic processing protein-primed rna synthesis by purified poliovirus rna polymerase internal initiation of translation of eukaryotic 
mrna directed by a sequence derived from poliovirus rna a conserved aug triplet in the 5' nontranslated region of poliovirus can function as an initiation codon in vitro and in vivo 1977) 5'-terminal nucleotide sequences of polio-virus polyribosomal rna and virion rna are identical characterization of the nucleotide triphosphatase activity of poliovirus protein 2c reveals a mechanism by which guanidine inhibits replication of poliovirus the foot-and-mouth disease virus leader proteinase gene is not required for viral replication conserved structural domains in the 5'-untranslated region of picornaviral genomes: an analysis of the segment controlling translation and neurovirulence conservation of the secondary structure elements of the 5'-untranslated region of cardioand aphothovirus rnas prokaryotic-like cis elements in the cap-independent internal initiation of translation on picornavirus rna a model for rearrangements in rna genomes cis-element, orir, involved in the initiation of (-) strand poliovirus rna: a quasiglobular multi-domain rna structure maintained by tertiary guanidine-selected mutants of poliovirus: mapping of point mutations to polypeptide 2c purification and properties of poliovirus rna polymerase expressed in escherichia coli encapsidation of genetically engineered poliovirus minireplicons which express human immunodeficiency virus type i gag and pol proteins upon infection genetics and phylogenetic clustering of enteroviruses virus taxonomy 1997 molecular comparison of coxsackie a virus serotypes cloned poliovirus complementary dna is infectious in mammalian cells utilization of chimeras between human (hm-175) and simian (agm-27) strains of hepatitis a virus to study the molecular basis of virulence transgenic mice expressing a human poliovirus receptor: a new model for poliomyelitis characterization of poliovirus clones containing lethal and nonlethal mutations in the genome-linked protein vpg poliovirus rna replication comparative sequence 
analysis of the 5' noncoding region of the enteroviruses and rhinoviruses evolutionary analysis of the picornavirus family the 3"untranslated region of picornavirus rna: features required for efficient genome replication intestinal trypsin can significantly modify antigenic properties of polioviruses: implications for the use of inactivated poliovirus vaccine biochemical evidence for intertypic genetic recombination of polioviruses the primary structure of intertypic poliovirus recombinants: a model of recombination between rna genomes the evolution of rna viruses icosahedral rna virus structure structure of a human common cold virus and functional relationship to other picornaviruses the genome-linked protein of picornaviruses v. 04-(5' uridylyl)-tyrosine is the bond between the genome-linked protein and the rna of poliovirus spherical viruses virus-encoded proteinases of the picornavirus super-group protein-priming of dna replication an insect picorna-like virus, plautia stali intestine virus, has genes of capsid proteins in the 3' part of the genome crystallization of purified mef-i poliomyelitis virus particles hybrid protein formation of e. 
coli alkaline phosphatase leading to in vitro complementation cleavage sites in the polypeptide precursors of poliovirus protein p2-x poliovirus replication proteins: rna sequence encoding p3-1b and the site of pioteolytic processing production of infectious poliovirus from cloned cdna is dramatically increased by sv40 transcription and replication signals a decay-accelerating factor-binding strain of coxsackievirus b3 requires the coxsackievirus-adenovirus receptor protein to mediate lytic infection of rhabdomyosarcoma cells a new cis-acting element for rna replication within the 5' noncoding region of poliovirus type 1 rna picornain 3c identification and characterization of the cis-acting elements of the human cd155 gene core promoter structure, function and evolution of picornaviruses a cell adhesion molecule, icam-1, is the major surface receptor for rhinoviruses rapamycin and wortmannin enhance replication of a defective encephalomyocarditis virus l* protein of the da strain of theiler's murine encephalomyelitis virus is important for virus growth in a murine macrophagelike cell line initiation of poliovirus plus-strand rna synthesis in a membrane complex of infected hela cells membrane fractions active in poliovirus rna replication contain vpg precursor polypeptides poliovirus rna recombination in cell-free extracts a mutation in the rna polymerase of poliovirus type 1 contributes to attenuation in mice amplification of the full-length hepatitis a virus genome by long reverse transcription-pcr and transcription of infectious rna directly from the amplicon poliovirus 2c protein determinants of membrane binding and rearrangements in mammalian cells translation and replication properties of the human rhinovirus genome in vivo and in vitro intertypic recombination in poliovirus: genetic and biochemical studies genetic studies on the poliovirus 2c protein, an ntpase. 
we are indebted to leena kinnunen for providing figure 12 .1, and to steffen mueller for figure 12 .6. we thank astrid wimmer for editing parts of the manuscript. work by m.g. and e.w. described here has been supported in part by grants from the national institutes of health, the national cancer institute, and the centers for disease control. key: cord-016448-7imgztwe authors: frishman, d.; albrecht, m.; blankenburg, h.; bork, p.; harrington, e. d.; hermjakob, h.; juhl jensen, l.; juan, d. a.; lengauer, t.; pagel, p.; schachter, v.; valencia, a. title: protein-protein interactions: analysis and prediction date: 2009-10-01 journal: modern genome annotation doi: 10.1007/978-3-211-75123-7_17 sha: doc_id: 16448 cord_uid: 7imgztwe proteins represent the tools and appliances of the cell: they assemble into larger structural elements, catalyze the biochemical reactions of metabolism, transmit signals, move cargo across membrane boundaries and carry out many other tasks. for most of these functions proteins cannot act in isolation but require close cooperation with other proteins to accomplish their task. often, this collaborative action implies physical interaction of the proteins involved.
accordingly, experimental detection, in silico prediction and computational analysis of protein-protein interactions (ppi) have attracted great attention in the quest for discovering functional links among proteins and deciphering the complex networks of the cell. proteins do not simply clump together; binding between proteins is a highly specific event involving well defined binding sites. several criteria can be used to further classify interactions (nooren and thornton 2003). protein interactions are not mediated by covalent bonds and, from a chemical perspective, they are always reversible. nevertheless, some ppi are so persistent as to be considered irreversible (obligatory) for all practical purposes. other interactions are subject to tight regulation and only occur under characteristic conditions. depending on their functional role, some protein interactions remain stable for a long time (e.g. between proteins of the cytoskeleton) while others last only fractions of a second (e.g. binding of kinases to their targets). protein complexes formed by physical binding are not restricted to so called binary interactions which involve exactly two proteins (dimer) but are often found to contain three (trimer), four (tetramer), or more peptide chains.
another distinction can be made based on the number of distinct proteins in a complex: homo-oligomers contain multiple copies of the same protein while hetero-oligomers consist of different protein species. sophisticated "molecular machines" like the bacterial flagellum consist of a large number of different proteins linked by protein interactions. the focus of this chapter is on the computational methods for analyzing and predicting protein-protein interactions. nevertheless, some basic knowledge about experimental techniques for detecting these interactions is highly useful for interpreting results, estimating potential biases, and judging the quality of the data we use in our work. many different types of methods have been developed but the vast majority of interactions in the literature and public databases come from only two classes of approaches: co-purification and two-hybrid methods. co-purification methods (rigaut et al. 1999) are carried out in vitro and involve three basic steps. first, the protein of interest is "captured" from a cell lysate, e.g. by attaching it to an immobile matrix. this may be done with specific antibodies, affinity tags, epitope tags along with a matching antibody, or by other means. second, all other proteins in the solution are removed in a washing step in order to purify the captured protein. under suitable conditions, protein-protein interactions are preserved. in the third step, any proteins still attached to the purified protein are detected by suitable methods (e.g. western-blot or mass spectrometry). hence, the interaction partners are co-purified, as the name of the method implies. the two-hybrid technique (fields and song 1989) uses a very different approach: it exploits the fact that transcription factors such as gal4 consist of two distinct functional domains.
the dna-binding domain (bd) recognizes the transcription factor (tf) binding site in the dna and attaches the protein to it while the activation domain (ad) triggers transcription of the gene under the control of the factor. when expressed as separate protein chains, both domains remain fully functional: the bd still binds the dna but lacks a way of triggering transcription. the ad could trigger transcription but has no means of binding to the dna. for a two-hybrid test, two proteins x and y are fused to these domains resulting in two hybrids: x-bd and y-ad. if x binds to y, the resulting protein complex turns out to be a fully functional transcription factor. accordingly, an interaction is revealed by detecting transcription of the reporter gene under the control of the tf. in contrast to co-purifications, the interaction is tested in vivo in the two-hybrid system (usually in yeast, but other systems exist). the above description refers to small-scale experiments testing one pair of proteins at a time, but both approaches have successfully been extended to large-scale experiments testing thousands of pairs in very short time. while such high-throughput data is very valuable, especially for computational biology which often requires comprehensive input data, a word of caution is necessary. even with the greatest care and a maximum of thoughtful controls, high-throughput data usually suffer from a certain degree of false-positive results as well as false-negatives compared to carefully performed and highly optimized individual experiments. the ultimate source of information about protein interactions is provided by high-resolution three-dimensional structures of interaction complexes, such as the one shown in fig. 1 . spatial architectures obtained by x-ray crystallography or nmr spectroscopy provide atomic-level detail of interaction interfaces and allow for mechanistic understanding of interaction processes and their functional implications. 
additional kinetic, dynamic and structural aspects of protein interactions can be elucidated by electron and atomic force microscopy as well as by fluorescence resonance energy transfer.

fig. 1 structural complex between rhoa, a small gtp protein belonging to the ras superfamily, and the catalytic gtpase activating domain of rhogap (graham et al. 2002)

3 protein interaction databases

a huge number of protein-protein interactions has been experimentally determined and described in numerous scientific publications. public protein interaction databases that provide interaction data in the form of structured, machine-readable datasets organized according to well documented standards have become invaluable resources for bioinformatics, systems biology and researchers in experimental laboratories. the data in these databases generally originate from two major sources: large-scale datasets and manually curated information extracted from the scientific literature. as pointed out above, the latter is considered substantially more reliable and large bodies of manually curated ppi data are often used as the gold standard against which predictions and large-scale experiments are benchmarked. of course, these reference data are far from complete and strongly biased. many factors, including experimental bias, preferences of the scientific community, and perceived biomedical relevance influence the chance of an interaction being studied, discovered and published. in the manual annotation process it is not enough to simply record the interaction as such. additional information such as the type of experimental evidence, citations of the source, experimental conditions, and more needs to be stored in order to convey a faithful picture of the data. annotation is a highly labor intensive task carried out by specially trained database curators.
ppi databases can be roughly divided in two classes: specialized databases focusing on a single organism or a small set of species and general repositories which aim for a comprehensive representation of current knowledge. while the former are often well integrated with other information resources for the same organism, the latter strive for collecting all available interaction data including datasets from specialized resources. the size of these databases is growing constantly as more and more protein interactions are identified. as of writing (november 2007) , global repositories are approaching 200,000 pieces of evidence for protein interactions in various species. all of these databases offer convenient web interfaces that allow for interactively searching the database. in addition, the full datasets are usually provided for download in order to enable researchers to use the data in their own computational analyses. table 1 gives an overview of some important ppi databases. until relatively recently, molecular interaction databases like the ones listed in table 1 acted largely independently from each other. while they provided an extremely valuable service to the community in collecting and curating available molecular interaction data from the literature, they did so largely in an uncoordinated manner. each database had its own curation policy, feature set, and data formats. in 2002, the proteomics standards initiative (psi), a work group of the human proteome organization (hupo), set out to improve this situation, with contributions from a broad range of academic and commercial organizations, among them bind, cellzome, dip, glaxosmithkline, hybrigenics sa, intact, mint, mips, serono, and the universities of bielefeld, bordeaux, and cambridge. in a first step, a community standard for the representation of protein-protein interactions was developed, the psi mi format 1.0 (hermjakob et al. 2004) . 
recently, version 2.5 of the psi mi format has been published , extending the scope of the format from protein-protein interactions to molecular interactions in general, allowing to model for example protein-rna complexes. the psi mi format is a flexible xml format representing the interaction data to a high level of detail. n-ary interactions (complexes) can be represented as well as experimental conditions and technologies, quantitative parameters and interacting domains. the xml format is accompanied by detailed controlled vocabularies in obo format (harris et al. 2004 ). these vocabularies are essential for standardizing not only the syntax, but also the semantics of the molecular interaction representation. as an example, the "yeast two-hybrid technology" described above is referred to in the literature using many different synonyms, for example y2h, 2h, "yeast-two-hybrid", etc. while all of these terms refer to the same technology, filtering interaction data from multiple different databases based on this set of terms is not trivial. thus, the psi mi standard provides a set of now more than 1000 well-defined terms relevant to molecular interactions. figure 2 shows the intact advanced search tool with a branch of the hierarchical psi mi controlled vocabulary. figure 3 provides a partial graphical representation of the annotated xml schema, combined with an example dataset in psi mi xml format, reprinted from kerrien et al. (2007b) . for user-friendly distribution of simplified psi data to end users, the psi mi 2.5 standard also defines a simple tabular representation (mitab), derived from the biogrid format (breitkreutz et al. 2003) . while this format necessarily excludes details of interaction data like interacting domains, it provides a means to efficiently access large numbers of basic binary interaction records. the psi mi format is now widely implemented, with data available from biogrid, dip, hprd, intact, mint, and mips, among others. 
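the role of a controlled vocabulary in filtering can be illustrated with a minimal python sketch. it assumes a toy vocabulary in which each term has a preferred name, a list of synonyms and an optional parent; the term identifiers (T1, T2, ...), the record layout and all function names are invented for this illustration and are not real psi mi accessions or part of any psi mi library. the sketch maps free-text method labels onto one term and optionally extends a filter to all child terms in the hierarchy.

```python
from typing import Dict, List, Optional, Set

# toy vocabulary: (term id, preferred name, synonyms, parent id)
# the ids are made up for this sketch and are NOT real psi mi accessions
VOCABULARY = [
    ("T1", "interaction detection method", [], None),
    ("T2", "two hybrid", ["y2h", "2h", "yeast-two-hybrid"], "T1"),
    ("T3", "two hybrid pooling approach", [], "T2"),
    ("T4", "gal4 vp16 complement", [], "T2"),
    ("T5", "affinity purification", ["co-purification"], "T1"),
]


def build_label_index(vocabulary) -> Dict[str, str]:
    """map every name and synonym (lowercased) to its term id."""
    index = {}
    for term_id, name, synonyms, _parent in vocabulary:
        for label in [name, *synonyms]:
            index[label.lower()] = term_id
    return index


def build_children(vocabulary) -> Dict[str, List[str]]:
    """map each parent id to the ids of its direct children."""
    children: Dict[str, List[str]] = {}
    for term_id, _name, _synonyms, parent in vocabulary:
        if parent is not None:
            children.setdefault(parent, []).append(term_id)
    return children


LABELS = build_label_index(VOCABULARY)
CHILDREN = build_children(VOCABULARY)


def normalize(label: str) -> Optional[str]:
    """return the term id for a free-text label, or None if unknown."""
    return LABELS.get(label.strip().lower())


def with_descendants(term_id: str) -> Set[str]:
    """return the term itself plus all terms below it in the hierarchy."""
    found: Set[str] = set()
    stack = [term_id]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(CHILDREN.get(current, []))
    return found


def filter_by_method(records, method_label, include_children=False):
    """keep records whose method maps to the requested term (or its children)."""
    term_id = normalize(method_label)
    if term_id is None:
        return []
    accepted = with_descendants(term_id) if include_children else {term_id}
    return [r for r in records if normalize(r["method"]) in accepted]


# hypothetical records as they might arrive from different databases
RECORDS = [
    {"a": "P1", "b": "P2", "method": "Y2H"},
    {"a": "P1", "b": "P3", "method": "yeast-two-hybrid"},
    {"a": "P2", "b": "P4", "method": "two hybrid pooling approach"},
    {"a": "P3", "b": "P4", "method": "co-purification"},
]

print(len(filter_by_method(RECORDS, "two hybrid")))
print(len(filter_by_method(RECORDS, "two hybrid", include_children=True)))
```

without the shared vocabulary, the first two records would look like two different methods; with it, both resolve to the same term, and the hierarchy lets a query for the parent term pick up the more specific child method as well.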
visualization tools like cytoscape (shannon et al. 2003) can directly read and visualize psi mi formatted data. comparative and integrative analysis of interaction data from multiple sources has become easier, as has the development of analysis tools which do not need to provide a plethora of input parsers any more. the annotated psi mi xml schema, a list of tools and databases implementing it, as well as further information, are available from http://www.psidev.info/. however, the development and implementation of a common data format is only one step towards the provision of consistent molecular interaction data to the scientific community. another key step is the coordination of the data curation process itself between different molecular interaction databases. without such synchronization, independent databases will often work on the same publications and insert the data into their systems, according to different curation rules, thus doing redundant work on some publications, while neglecting others. recognizing this issue, the dip, intact, and mint molecular interaction databases are currently synchronizing their curation efforts in the context of the imex consortium (http://imex.sf.net). these databases are now applying the same curation rules to provide a consistent high level of curation quality, and are synchronizing their fields of activity, each focusing on literature curation from a non-overlapping set of scientific journals. for these journals, the databases aim to insert all published interactions into the database shortly after publication. regular data exchange of all newly curated data between imex databases is currently in the implementation phase.
to support the systematic representation and capture of relevant molecular interaction data supporting scientific publications, the hupo proteomics standards initiative has recently published "the minimum information required for reporting a molecular interaction experiment (mimix)", detailing data items considered essential for the authors to provide, as well as a practical guide to efficient deposition of molecular interaction data in imex databases. the imex databases are also collaborating with scientific journals and funding agencies, to increasingly recommend that data producers deposit their data in an imex partner database prior to publication. database deposition prior to publication not only ensures public availability of the data at the time of publication, but also provides important quality control, as database curators often assess the data in much more detail than reviewers. the psi journal collaboration efforts are starting to show first results. nature biotechnology, nature genetics, and proteomics are now recommending that authors deposit molecular interaction data in a relevant public domain database prior to publication, a key step to a better capture of published molecular interaction data in public databases, and to overcome the current fragmentation of molecular interaction data. as an example of a molecular interaction database implementing the psi mi 2.5 standard, we will provide a more detailed description of the intact molecular interaction database, accessible at http://www.ebi.ac.uk/intact. intact is a curated molecular interaction database active since 2002.
intact follows a full text curation policy: publications are read in full by the curation team, and all molecular interactions contained in the publication are inserted into the database. the records contain basic facts like the database accession numbers of the proteins participating in an interaction, but also details like experimental protein modifications, which can have an impact on assessments of confidence in the presence or absence of interactions. each database record is cross-checked by a senior curator for quality control. on release of the record, the corresponding author of the publication is automatically notified (where an email address is available), and requested to check the data provided. any corrections are usually inserted into the next weekly release. while such a detailed, high quality approach is slow and limits coverage, the provision of high quality reference datasets is an essential service both for biological analysis, and for the training and validation of automatic methods for computational prediction of molecular interactions. as it is impossible for any single database, or even the collaborating imex databases, to fully cover all published interactions, curation priorities have to be set. any direct data depositions supporting manuscripts approaching peer review have highest priority. next, for some journals (currently cell, cancer cell, and proteomics) intact curates all molecular interactions published in the journal. finally, several special curation topics are determined in collaboration with external communities or collaborators, where intact provides specialized literature curation and collaborates in the analysis of experimental datasets, for example around a specific protein of interest (camargo et al. 2006). as of november 2007, intact contains 158,000 binary interactions supported by ca. 3,000 publications.
the intact interface implements a standard "simple search" box, ideal for search by uniprot protein accession numbers, gene names, species, or pubmed identifiers. the advanced search tool (fig. 2) provides field-specific searches as well as a specialized search taking into account the hierarchical structure of controlled vocabularies. a default search for the interaction detection method "2 hybrid" returns 30,251 interactions, while a search for "2 hybrid" with the tickbox "include children" activated returns more than twice that number, 64,589 interactions. the hierarchical search automatically includes similarly named methods like "two hybrid pooling approach", but also "gal4 vp16 complement". search results are initially shown in a tabular form based on the mitab format, which can also be directly downloaded. each pairwise interaction is only listed once, with all experimental evidence listed in the appropriate columns. the final column provides access to a detailed description of each interaction as well as a graphical representation of the interaction in its interaction neighborhood graph. for interactive, detailed analysis, interaction data can be loaded into tools like cytoscape (see below) via the psi 2.5 xml format. all intact data is freely available via the web interface, for download in psi mi tabular or xml format, and computationally accessible via web services. intact software is open source, implemented in java, with hibernate (www.hibernate.org/) for the object-relational mapping to oracle or postgres, and freely available under the apache license, version 2 from http://www.ebi.ac.uk/intact. on a global scale, protein-protein interactions participate in the formation of complex biological networks which, to a large extent, represent the paths of communication and metabolism of an organism. these networks can be modeled as graphs making them amenable to a large number of well established techniques of graph theory and social network analysis.
even though interaction networks do not directly encode cellular processes nor provide information on dynamics, they do represent a first step towards a description of cellular processes, which is ultimately dynamic in nature. for instance, protein-interaction networks may provide useful information on the dynamics of complex assembly or signaling. in general, investigating the topology of protein interaction, metabolic, signaling, and transcriptional networks allows researchers to reveal the fundamental principles of molecular organization of the cell and to interpret genome data in the context of large-scale experiments. such analyses have become an integral part of the genome annotation process: annotating genomes today increasingly means annotating networks. a protein-protein interaction network summarizes the existence of both stable and transient associations between proteins as an (undirected) graph: each protein is represented as a node (or vertex), an edge between two proteins denotes the existence of an interaction. interactions known to occur in the actual cell ( fig. 4a ) can thus be represented as an abstract graph of interaction capabilities (fig. 4b ). as such a graph is limited by definition to binary interactions, its construction from a database of molecular interactions may involve arbitrary choices. for instance, an n-ary interaction measured by co-purification can be represented using either the clique (all binary interactions between the n proteins are retained) or the spoke model (only edges connecting the "captured" protein to co-purified proteins are retained). once a network has been reconstructed from protein interaction data, a variety of statistics on network topology can be computed, such as the distribution of vertex degrees, the distribution of the clustering coefficient and other notions of density, the distribution of shortest path length between vertex pairs, or the distribution of network motifs occurrences (see for a review). 
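the clique and spoke models, and two of the topology statistics named above, can be sketched in a few lines of python. the sketch assumes one hypothetical co-purification experiment with a captured protein ("bait") p1 and three co-purified proteins; all function names are invented for illustration, and only the standard library is used.

```python
from collections import Counter
from itertools import combinations


def spoke_edges(bait, preys):
    """spoke model: only edges between the captured protein and each co-purified protein."""
    return {frozenset((bait, p)) for p in preys if p != bait}


def clique_edges(bait, preys):
    """clique model: all binary interactions between the n proteins of the complex."""
    members = sorted({bait, *preys})
    return {frozenset(pair) for pair in combinations(members, 2)}


def adjacency(edges):
    """turn a set of undirected edges into a node -> neighbor-set mapping."""
    adj = {}
    for edge in edges:
        u, v = tuple(edge)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_distribution(adj):
    """count how many nodes have each degree."""
    return Counter(len(neighbors) for neighbors in adj.values())


def clustering_coefficient(adj, node):
    """fraction of pairs of neighbors of `node` that are themselves connected."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# hypothetical purification: p1 was captured, p2 to p4 were co-purified
bait, preys = "P1", ["P2", "P3", "P4"]
spoke = adjacency(spoke_edges(bait, preys))
clique = adjacency(clique_edges(bait, preys))

print(dict(degree_distribution(spoke)), clustering_coefficient(spoke, "P1"))
print(dict(degree_distribution(clique)), clustering_coefficient(clique, "P1"))
```

the same experiment yields three edges and an apparent hub under the spoke model, but six edges and a fully connected cluster under the clique model, which is why the choice of representation affects every downstream statistic.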
these measures can be used to describe networks in a concise manner, to compare, group or contrast different networks, and to identify properties characteristic of a network or a class of network under study. some topological properties may be interpreted as traces of underlying biological mechanisms, shedding light on their dynamics, their evolution, or both and helping connect structure to function (see the "network modules" section below). for instance, most interaction networks seem to exhibit scale-free topology (jeong et al. 2001; yook et al. 2004), i.e. their degree distribution (the probability that a node has exactly k links) approximates a power law $P(k) \sim k^{-\gamma}$, meaning that most proteins have few interaction partners but some, the so-called "hubs", have many. as an example of derived evolutionary insight, it is easy to show that networks evolving by growth (addition of new nodes) and preferential attachment (new nodes are more likely to be connected to nodes with more connections) will exhibit scale-free topology (degree distribution approximates a power-law) and hubs (highly connected nodes). a simple model of interaction network evolution by gene duplication, where a duplicate initially keeps the same interaction partners as the original, generates preferential attachment, thus providing a candidate explanation for the scale-free nature and the existence of hubs in these networks.

fig. 4 interacting proteins are denoted as p1, p2, etc. (b) a graph representation of the protein interactions shown in a. each node represents a protein, and each edge connects proteins that interact. (c) information on protein interactions obtained by different methods. (d) protein interaction network derived from experimental evidence shown in c. as in a, each node is a protein, and edges connect interactors. edges are colored according to the source of evidence: red: 3d, green: apms, brown: y2h, magenta: prof, yellow: lit, blue: loc

a corresponding functional interpretation of hubs and scale-free topology has been proposed in terms of robustness. scale-free networks are robust to component failure, as random failures are likely to affect low degree nodes and only failures affecting hub nodes will significantly change the number of connected components and the length of shortest paths between node pairs. deletion analyses have, perhaps unsurprisingly, confirmed that highly connected proteins are more likely to be essential (winzeler et al. 1999; giaever et al. 2002; gerdes et al. 2003). most biological interpretations that have been proposed for purely topological properties of interaction networks have been the subject of heated controversies, some of which remain unsolved to this day (e.g. (he and zhang 2006; yu et al. 2007) on hubs). one often cited objection to any strong interpretation is the fact that networks reconstructed from high-throughput interaction data constitute very rough approximations of the "real" network of interactions taking place within the cell. as illustrated in fig. 4c, interaction data used in a reconstruction typically result from several experimental methods, often complemented with prediction schemes. each specific method can miss real interactions (false negatives) and incorrectly identify other interactions (false positives), resulting in biases that are clearly technology-dependent (gavin et al. 2006; legrain and selig 2000). assessing false-negative and false-positive rates is difficult since there is no gold standard for positive interactions (protein pairs that are known to interact) or, more importantly, for negative interactions (protein pairs that are known not to interact).
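to make the two error types concrete, the following python sketch compares a set of detected interactions with a reference set. it assumes that "false positives" are reported as the fraction of detected pairs absent from the reference and "false negatives" as the fraction of reference pairs that were missed; the protein names, both toy datasets and the function names are invented. because the reference itself is incomplete, a pair counted here as a false positive may still be a real interaction, which is exactly the difficulty described above.

```python
def as_pairs(interactions):
    """treat interactions as unordered pairs so that (a, b) equals (b, a)."""
    return {frozenset(pair) for pair in interactions}


def error_fractions(detected, reference):
    """return (false-positive fraction, false-negative fraction) relative to a reference set."""
    detected_pairs = as_pairs(detected)
    reference_pairs = as_pairs(reference)
    not_in_reference = detected_pairs - reference_pairs
    missed = reference_pairs - detected_pairs
    fp = len(not_in_reference) / len(detected_pairs) if detected_pairs else 0.0
    fn = len(missed) / len(reference_pairs) if reference_pairs else 0.0
    return fp, fn


# toy reference set (stands in for an imperfect benchmark, not a true gold standard)
REFERENCE = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5"), ("P1", "P5")]
# toy high-throughput result: two pairs agree with the reference, two do not
DETECTED = [("P2", "P1"), ("P3", "P4"), ("P1", "P3"), ("P2", "P5")]

fp_fraction, fn_fraction = error_fractions(DETECTED, REFERENCE)
print(f"false positives: {fp_fraction:.0%}, false negatives: {fn_fraction:.0%}")
```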
using less-than-ideal benchmark interaction sets, estimates of 30-60% false positives and 40-80% false negatives have been proposed for yeast-two-hybrid and co-purification based techniques (aloy and russell 2004). in particular, a comparison of several high-throughput interaction datasets on yeast, showing low overlap, has confirmed that each study covers only a small percentage of the underlying interaction network (von mering et al. 2002) (see also "estimates of the number of protein interactions" below). integration of interaction data from heterogeneous sources towards interaction network reconstruction can help compensate for these limitations. the basic principle is fairly simple and rests implicitly on a multigraph representation: several interaction networks to be integrated, each resulting from a specific experimental or predictive method, are defined over the same set of proteins. integration is achieved by merging them into a single network with several types of links, or edge colors, each drawn from one of the component networks. some edges in the multigraph may be incorrect, while some existing interactions may be missing from the multigraph, but interactions confirmed independently by several methods can be considered reliable. figure 4d shows the multigraph that corresponds to the evidence from fig. 4c and can be used to reconstruct the actual graph in fig. 4b. in practice, integration is not always straightforward: networks are usually defined over subsets of the entire gene or protein complement of a species, and meaningful integration requires that the overlap of these subsets be sufficiently large. in addition, if differences of reliability between network types are to be taken into account, an integrated reliability scoring scheme needs to be designed (jansen et al. 2003; von mering et al. 2007) with the corresponding pitfalls and level of arbitrariness involved in comparing apples and oranges.
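the multigraph idea can be sketched as follows. the python example assumes three hypothetical evidence sources labelled as in fig. 4 (y2h, apms, lit) with invented protein pairs, and uses a plain count of independent methods as the reliability criterion. this is a deliberately naive stand-in for the integrated scoring schemes cited above, not a reimplementation of them, and the function names are illustrative.

```python
from collections import defaultdict


def integrate(evidence):
    """merge per-method interaction lists into one multigraph.

    returns a mapping from an unordered protein pair to the set of
    methods ("edge colors") that support it.
    """
    multigraph = defaultdict(set)
    for method, pairs in evidence.items():
        for a, b in pairs:
            if a != b:  # ignore self-pairs in this simple sketch
                multigraph[frozenset((a, b))].add(method)
    return dict(multigraph)


def supported_by(multigraph, min_methods=2):
    """keep only the pairs confirmed by at least `min_methods` different methods."""
    return {
        pair: methods
        for pair, methods in multigraph.items()
        if len(methods) >= min_methods
    }


# invented evidence; the method labels follow the legend of fig. 4
EVIDENCE = {
    "y2h": [("P1", "P2"), ("P2", "P3"), ("P1", "P4")],
    "apms": [("P2", "P1"), ("P3", "P4")],
    "lit": [("P1", "P2"), ("P3", "P4")],
}

multigraph = integrate(EVIDENCE)
reliable = supported_by(multigraph, min_methods=2)

for pair, methods in sorted(reliable.items(), key=lambda item: sorted(item[0])):
    print(sorted(pair), sorted(methods))
```

raising `min_methods` trades coverage for reliability: fewer edges survive, but each surviving edge has been observed independently more often.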
existing methods can significantly reduce false positive rates on a subset of the network, yielding a subnetwork of high-reliability interactions. the tremendous amounts of available molecular interaction data raise the important issue of how to visualize them in a biologically meaningful way. a variety of tools have been developed to address this problem; two prominent examples are visant (hu et al. 2005) and cytoscape (shannon et al. 2003). a recent review of further network visualization tools is provided by suderman and hallett (2007). in this section, we focus on cytoscape (http://www.cytoscape.org) and demonstrate its use for the investigation of protein-protein interaction networks. for a more extensive protocol on the usage of cytoscape, see (cline et al. 2007). cytoscape is a stand-alone java application that is available for all major computer platforms. this software provides functionalities for (i) generating biological networks, either manually or by importing interaction data from various sources, (ii) filtering interactions, (iii) displaying networks using graph layout algorithms, (iv) integrating and displaying additional information like gene expression data, and (v) performing analyses on networks, for instance, by calculating topological network properties or by identifying functional modules. one advantage of cytoscape over alternative visualization software applications is that cytoscape is released under the open-source lesser general public license (lgpl). this license basically permits all forms of software usage and thus helps to build a large user and developer community. third-party java developers can easily enhance the functionality of cytoscape by implementing their own plug-ins, which are additional software modules that can be readily integrated into the cytoscape platform.
currently, there are more than forty plug-ins publicly available, with functionalities ranging from interaction retrieval and integration across topological network analysis, detection of network motifs, protein complexes, and domain interactions, to visualization of subcellular protein localization and bipartite networks. a selection of popular cytoscape plug-ins is listed in table 2. in the following, we will describe the functionalities of cytoscape in greater detail. the initial step of generating a network can be accomplished in different ways. first, the user can import interaction data that are stored in various flat file or xml formats such as biopax, sbml, or psi-mi, as described above. second, the user can directly retrieve interactions from several public repositories from within cytoscape. a number of plug-ins support this (table 2). third, the user can utilize a text-mining plug-in that builds networks based on associations found in publication abstracts (agilent literature search; table 2). while these associations are not as reliable as experimentally derived interactions, they can be helpful when the user is investigating species that are not well covered yet in the current data repositories. fourth, the user can directly create or manipulate a network by manually adding or removing nodes (genes, proteins, domains, etc.) and edges (interactions or relationships). in this way, expert knowledge that is not captured in the available data sets can be incorporated into the loaded network. generated networks can be further refined by applying selections and filters in cytoscape. the user can select nodes or edges by simply clicking on them or framing a selection area. in addition, starting with at least one selected node, the user can incrementally enlarge the selection to include all direct neighbor nodes.
cytoscape also provides sophisticated search and filter functionality for selecting particular nodes and edges in a network based on different properties; in particular, the enhanced search plug-in (table 2) improves the built-in search functionality of cytoscape. filters select all network parts that match certain criteria, for instance, all human proteins or all interactions that have been detected using the yeast two-hybrid system. once a selection has been made, all selected parts can be removed from the network or added to another network. the main purpose of visualization tools like cytoscape is the presentation of biological networks in an appropriate manner. this can usually be accomplished by applying graph layout algorithms. sophisticated layouts can assist the user in revealing specific network characteristics such as hub proteins or functionally related protein clusters. cytoscape offers various layout algorithms, which can be categorized as circular, hierarchical, spring-embedded (or force-directed), and attribute-based layouts (fig. 5). further layouts can be included using the cytoscape plug-in architecture, for example, to arrange protein nodes according to their subcellular localization or to their pathway assignments (bubblerouter, cerebral; table 2). some layouts may be more effective than others for representing molecular networks of a certain type. the spring-embedded layout, for instance, has the effect of exposing the inherent network structure, thus identifying hub proteins and clusters of tightly connected nodes. it is noteworthy that current network visualization techniques have limitations, for example, when displaying extremely large or dense networks. in such cases, a simple graphical network representation with one node for each interaction partner, as it is initially created by cytoscape, can obfuscate the actual network organization due to the sheer number of nodes and edges.
one potential solution to this problem is the introduction of meta-nodes (metanode plug-in; table 2 ). a meta-node combines and replaces a group of other nodes. meta-nodes can be collapsed to increase clarity of the visualization and expanded to increase the level of detail (fig. 6 ). an overview of established and novel visualization techniques for biological networks on different scales is presented in (hu et al. 2007 ). all layouts generated by cytoscape are zoomable, enabling the user to increase or decrease the magnification, and they can be further customized by aligning, scaling, or rotating selected network parts. additionally, the user can define the graphical network representation through visual styles. these styles define the colors, sizes, and shapes of all network parts. a powerful feature of cytoscape is its ability to visually map additional attribute values onto network representations. both nodes and edges can have arbitrary attributes, for example, protein function names, the number of interactions (node degree), expression values, the strength and type of an interaction, or confidence values for interaction reliability. these attributes can be used to adapt the network illustration by dynamically changing the visual styles of individual network parts (fig. 7) . for example, this feature enables highlighting trustworthy interactions by assigning different line styles or sizes to different experiment types (discrete mapping of an edge attribute), spotting network hubs by changing the size of a node according to its degree (discrete or continuous mapping of a node attribute), or identifying functional network patterns by coloring protein nodes with a color gradient according to their expression level (continuous mapping of a node attribute). hence, it is possible to simultaneously visualize different data types by overlaying them with a network model. in order to generate new biological hypotheses and to gain insights into molecular mechanisms, it is important to identify relevant network characteristics and patterns. for this purpose, the straightforward approach is the visual exploration of the network. table 2 lists a selection of cytoscape plug-ins that assist the user in this analysis task, for instance, by identifying putative complexes (mcode), by grouping proteins that show a similar expression profile (jactivemodules), or by identifying overrepresented go terms (bingo, golorize). however, the inclusion of complex data such as time-series results or diverse gene ontology (go) terms into the network visualization might not be feasible without further software support. particularly in case of huge, highly connected, or dynamic networks, more advanced visualization techniques will be required in the future. 
fig. 6 [...] (table 2 ). all protein nodes with subcellular localizations different from plasma membrane are combined into meta-nodes. these meta-nodes can be collapsed or expanded to increase clarity or level of detail, respectively. 
fig. 7 visual representation of a subset of the gal4 network in yeast. the protein nodes are colored with a red-to-green gradient according to their expression value; green represents the lowest, red the highest value, and blue a missing value. the node size indicates the number of interactions (node degree); the larger a node, the higher is its degree. 
the colors and styles of the edges represent different interaction types; solid black lines represent protein-protein, dashed red lines protein-dna interactions. 
in addition to the visual presentation of interaction networks, cytoscape can also be used to perform statistical analyses. for instance, the networkanalyzer plug-in (assenov et al. 2008 ) computes a large variety of topology parameters for all types of networks. the computed simple and complex topology parameters are represented as single values and distributions, respectively. examples of simple parameters are the number of nodes and edges, the average number of neighbors, the network diameter and radius, the clustering coefficient, and the characteristic path length. complex parameters are distributions of node degrees, neighborhood connectivities, average clustering coefficients, and shortest path lengths. these computed statistical results can be exported in textual or graphical form and are additionally stored as node attributes. the user can then apply the calculated attributes to select certain network parts or to map them onto the visual representation of the analyzed network as described above (fig. 7) . it is also possible to fit a power law to the node degree distribution, which can frequently indicate a so-called scale-free network with few highly connected nodes (hubs) and many other nodes with a small number of interactions. scale-free networks are especially robust against failures of randomly selected nodes, but quite vulnerable to defects of hubs (albert 2005) . 
how many ppis exist in a living cell? the yeast genome encodes approximately 6300 gene products which means that the maximal possible number of interacting protein pairs in this organism is close to 40 million, but what fraction of these potential interactions is actually realized in nature? 
for a given experimental method, such as the two-hybrid assay, the estimate of the total number of interactions in the cell is given by $N_{\mathrm{total}} = N_{\mathrm{measured}} \cdot R_{\mathrm{fp}} \cdot R_{\mathrm{fn}}$, where $N_{\mathrm{measured}}$ is the number of interactions identified in the experiment, and $R_{\mathrm{fp}}$ and $R_{\mathrm{fn}}$ are correction factors accounting for the false positive and false negative rates of the method. $R_{\mathrm{fn}}$ can be roughly estimated based on the number of interactions known with confidence (e.g., those confirmed by three-dimensional structures) that are being recovered by the method. assessing $R_{\mathrm{fp}}$ is much more difficult because no experimental information on proteins that do not interact is currently available. since it is known that proteins belonging to the same functional class often interact, one very indirect way of calculating $R_{\mathrm{fn}}$ is as the fraction of functionally related proteins not found to be interacting. an even more monumental problem is the estimation of the total number of unique structurally equivalent interaction types existing in nature. an interaction type is defined as a particular mutual orientation of two specific interacting domains. in some cases homologous proteins interact in a significantly different fashion while in other cases proteins lacking sequence similarity engage in interactions of the same type. in general, however, interacting protein pairs sharing a high degree of sequence similarity (30-40% or higher) between their respective components almost always form structurally similar complexes (aloy et al. 2003) . this observation allows utilization of available atomic resolution structures of complexes for building useful models of closely related binary complexes. the total number of interaction types can then be estimated as follows: $N_{\mathrm{types}} = N_{\mathrm{total}} \cdot C \cdot E_{\mathrm{all\text{-}species}}$, where the interaction similarity multiplier $C$ reflects the clustering of all interactions of the same type, and $E_{\mathrm{all\text{-}species}}$ extrapolates from one biological species to all organisms. 
aloy and russell (2004) derived an estimate for $C$ by grouping interactions between proteins that share high sequence similarity, as discussed above. $C$ depends on the number of paralogous sequences encoded in a given genome. for small prokaryotic organisms it is close to 1 while for larger and more redundant genomes it adopts smaller values, typically in the range of 0.75-0.85. the multiplier for all species, $E_{\mathrm{all\text{-}species}}$, can be derived by assessing what fraction of known protein families is encoded in a given genome. based on the currently available data this factor is close to 10 for bacteria, which means that a medium size prokaryotic organism contains around one tenth of all protein families. for eukaryotic organisms $E_{\mathrm{all\text{-}species}}$ lies between 2 and 4. for the comprehensive two-hybrid screen of yeast (uetz 2000), in which 936 interactions between 987 proteins were identified, aloy and russell (2004) estimated $C$, $R_{\mathrm{fp}}$, $R_{\mathrm{fn}}$, and $E_{\mathrm{all\text{-}species}}$ to be 0.85, 3.92, 0.55, and 3.35, respectively, leading to an estimated 1715 different interaction types in yeast alone, and 5741 over all species. based on the two-hybrid interaction map of the fly (giot 2003 ) the number of all interaction types in nature is estimated to be 9962. it is thus reasonable to expect the total number of interaction types to be around 10,000, of which only 2000 are currently known. 
beyond binary interactions, proteins often form large molecular complexes involving multiple subunits (fig. 8) . these complexes are much more than a random snapshot of a group of interacting proteins: they represent large functional entities which remain stable for long periods of time. many such protein complexes have been elucidated step by step over time and recent advances in high-throughput technology have led to large-scale studies revealing numerous new protein complexes. 
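returning to the estimate of interaction types quoted above, the arithmetic can be checked with a short python sketch. it assumes that the correction factors simply multiply the measured number of interactions, an assumption that reproduces the quoted yeast figure; the function name `interaction_types` is hypothetical, and the factor labels follow the order given in the text (the product does not depend on which factor carries which label).

```python
def interaction_types(n_measured, c, r_fp, r_fn, e_all_species=1.0):
    """Scale a measured interaction count by multiplicative correction factors.

    Assumption: all factors combine as a plain product.
    """
    return n_measured * c * r_fp * r_fn * e_all_species


# Values quoted in the text for the yeast two-hybrid screen (936 interactions).
yeast = interaction_types(936, c=0.85, r_fp=3.92, r_fn=0.55)
all_species = interaction_types(936, c=0.85, r_fp=3.92, r_fn=0.55,
                                e_all_species=3.35)

print(round(yeast))        # matches the 1715 interaction types quoted for yeast
print(round(all_species))  # close to, but not exactly, the quoted 5741
```

the small difference in the all-species value is presumably due to rounding of the published factors.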
the preferred technology for this kind of experiment is initial co-purification of the complexes followed by the identification of the member proteins by mass spectrometry. as the baker's yeast s. cerevisiae is one of the most versatile model organisms used in molecular biology, it is not surprising that the first large-scale complex datasets were obtained in this species (gavin et al. 2002; ho et al. 2002; gavin et al. 2006; krogan et al. 2006 ). the yeast protein interaction database mpact (guldener et al. 2006 ) provides access to 268 protein complexes, derived from careful literature annotation and composed of 1237 different proteins, plus over 1000 complexes from large-scale experiments, which contain more than 2000 distinct proteins. these numbers contain some redundancy with respect to complexes, due to slightly different complex composition found by different groups or experiments. nevertheless, the dataset covers about 40% of the s. cerevisiae proteome. while many complexes comprise only a small number of different proteins, the largest of them features an impressive 88 different protein species. a novel manually annotated database, corum (ruepp et al. 2008 ), contains literature-derived information about 1750 mammalian multi-protein complexes. over 75% of all complexes contain between three and six subunits, while the largest molecular structure, the spliceosome, consists of 145 components (fig. 9 ). modularity has emerged as one of the major organizational principles of cellular processes. functional modules are defined as molecular ensembles with an autonomous function (hartwell et al. 1999) . proteins or genes can be partitioned into modules based on shared patterns of regulation or expression, involvement in a common metabolic or regulatory pathway, or membership in the same protein complex or subcellular structure. modular representation and analysis of cellular processes allows for interpretation of genome data beyond single gene behavior. 
in particular, analysis of modules provides a convenient framework for studying the evolution of living systems (snel and huynen 2004) . multiprotein complexes represent one particular type of functional modules in which individual components engage in physical interactions to execute a specific cellular function. algorithmically, modular architectures can be defined as densely interconnected groups of nodes on biological networks (for an excellent review of available methods see sharan et al. 2007 ). statistically significant functional subnetworks are characterized by a high degree of local clustering. the density of a cluster can be represented as a function $Q(m,n) = 2m/(n(n-1))$, where $m$ is the number of interactions between the $n$ nodes of the cluster (spirin and mirny 2003) . $Q$ thus takes values between 0 for a set of unconnected nodes and 1 for a fully connected cluster (clique). the statistical significance of $Q$ strongly depends on the size of the graph. it is obvious that random clusters with $Q = 1$ involving just three proteins are very likely while large clusters with $Q = 1$ or even with values below 0.5 are extremely unlikely. in order to compute the statistical significance of a cluster with $n$ nodes and $m$ connections, spirin and mirny calculate the expected number of such clusters in a comparable random graph and then estimate the likelihood of having $m$ or more interactions within a given set of $n$ proteins given the number of interactions that each of these proteins has. significant dense clusters identified by this procedure on a graph of protein interactions were found to correspond to functional modules most of which are involved in transcription regulation, cell-cycle/ cell-fate control, rna processing, and protein transport. 
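the cluster density defined above is straightforward to compute. the following python sketch is illustrative only: the function name `cluster_density` and the toy protein labels are assumptions, and the significance test of spirin and mirny is not implemented here.

```python
def cluster_density(nodes, edges):
    """Return Q = 2m / (n(n - 1)) for the subgraph induced by `nodes`.

    `edges` is an iterable of undirected pairs; pairs that reach outside
    the cluster, self-loops, and duplicate pairs are not counted.
    """
    members = set(nodes)
    n = len(members)
    if n < 2:
        return 0.0
    internal = {
        frozenset(edge)
        for edge in edges
        if len(set(edge)) == 2 and set(edge) <= members
    }
    m = len(internal)
    return 2 * m / (n * (n - 1))


# Toy network: a triangle (A, B, C) plus one edge leading to protein D.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
print(cluster_density(["A", "B", "C"], edges))       # fully connected clique
print(cluster_density(["A", "B", "C", "D"], edges))  # 4 of 6 possible edges
```

a clique of only three proteins reaches the maximal density easily, which is why the density has to be judged against the cluster size, as noted above.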
however, not all of them constitute physical protein complexes and, in general, it is not possible to predict whether a given module corresponds to a multiprotein complex or just to a group of functionally coupled proteins involved in the same cellular process. the search for significant subgraphs can be further enhanced by considering evolutionary conservation of protein interactions. with this approach protein complexes are predicted from binary interaction data by network alignment, which involves comparing interaction graphs between several species (sharan et al. 2005) . first, proteins are grouped by sequence similarity such that each group contains one protein from each species, and each protein is similar to at least one other protein in the group. then a composite interaction network is created by joining with edges those pairs of groups that are linked by at least one conserved interaction. again, dense clusters on such a network alignment graph are often indicative of multiprotein complexes. an alternative computational method for deriving complexes from noisy large-scale interaction data relies on a "socio-affinity" index which essentially reflects the frequency with which proteins form partnerships detected by co-purification (gavin et al. 2006) . this index was shown to correlate well with available three-dimensional structure data, dissociation constants of protein-protein interactions, and binary interactions identified by the two-hybrid techniques. by applying a clustering procedure to a matrix containing the values of the socio-affinity index for all yeast protein pairs found to associate by affinity purification, 491 complexes were predicted, with over half of them being novel and previously unknown. however, depending on the analysis parameters, distinct complex variants (isoforms) are found that differ in terms of their subunit composition. 
those proteins present in most of the isoforms of a given complex constitute its core while variable components present only in a small number of isoforms can be considered "attachments" (fig. 10) . furthermore, some stable, typically smaller protein groups can be found in multiple attachments in which case they are called "modules". stable functional modules can thus be flexibly used in the cell in a variety of functional contexts. proteins frequently associated with each other in complex cores and modules are likely to be co-expressed and co-localized. 
fig. 10 definitions of complex cores, attachments, and modules. redrawn and modified with permission from (gavin et al. 2006). 
in this section, we offer a computational perspective on utilizing protein network data for molecular medical research. the identification of novel therapeutic targets for diseases and the development of drugs has always been a difficult, time-consuming and expensive venture (ruffner et al. 2007) . recent work has charted the current pharmacological space using different networks of drugs and their protein targets (paolini et al. 2006; keiser et al. 2007; kuhn et al. 2008; yildirim et al. 2007 ) based on biochemical relationships like ligand binding energy and molecular similarity or on shared disease association. above all, since many diseases are due to the malfunctioning of proteins, the systematic determination and exploration of the human interactome and homologous protein networks of model organisms can provide considerable new insight into pathophysiological processes (giallourakis et al. 2005) . knowledge of protein interactions can frequently improve the understanding of relevant molecular pathways and the interplay of various proteins in complex diseases (fishman and porter 2005) . this approach may result in the discovery of a considerable number of novel drug targets for the biopharmaceutical industry, possibly affording the development of multi-target combination therapeutics. 
observed perturbations of protein networks may also offer a refined molecular description of the etiology and progression of disease in contrast to phenotypic categorization of patients (loscalzo et al. 2007 ). molecular network data may help to improve the ability to catalog diseases unequivocally and to further individualize diagnosis, prognosis, prevention, and therapy. this will require a network-based approach that includes not only protein interactions to differentiate pathophenotypes, but also other types of molecular interactions as found in signaling cascades and metabolic pathways. furthermore, environmental factors like pathogens interacting with the human host or the effects of nutrition need to be taken into account. after large-scale screens identified enormous amounts of protein interactions in organisms like yeast, fly, and worm (goll and uetz 2007) , which also serve as model systems for studying many human disease mechanisms (giallourakis et al. 2005) , experimental techniques and computational prediction methods have recently been applied to generate sizable networks of human proteins (cusick et al. 2005; stelzl and wanker 2006; assenov et al. 2008; ramírez et al. 2007 ). in addition, comprehensive maps of protein interactions inside pathogens and between pathogens and the human host have been compiled for bacteria like e. coli, h. pylori, c. jejuni, and other species (noirot and noirot-gros 2004) , for many viruses such as herpes viruses, the epstein-barr virus, the sars coronavirus, hiv-1, the hepatitis c virus, and others (uetz et al. 2004) , and for the malaria parasite p. falciparum (table 3) . those extensive network maps can now be explored to identify potential drug targets and to block or manipulate important protein-protein interactions. 
furthermore, different experimental methods are also used to expand the known interaction networks around pathway-centric proteins like epidermal growth factor receptors (egfrs) (tewari et al. 2004; oda et al. 2005; jones et al. 2006) , smad and transforming growth factor-β (tgfβ) (colland and daviet 2004; tewari et al. 2004; barrios-rodiles et al. 2005) , and tumor necrosis factor-α (tnfα) and the transcription factor nf-κb (bouwmeester et al. 2004 ). all of these proteins are involved in sophisticated signal transduction cascades implicated in various important disease indications ranging from cancer to inflammation. the immune system and toll-like receptor (tlr) pathways were the subject of other detailed studies (oda and kitano 2006) . apart from that, protein networks for longevity were assembled to research ageing-related effects (xue et al. 2007 ). high-throughput screens are also conducted for specific disease proteins causative of closely related clinical and pathological phenotypes to unveil molecular interconnections between the diseases. for example, similar neurodegenerative disease phenotypes are caused by polyglutamine proteins like huntingtin and over twenty ataxins. although they are not evolutionarily related and their expression is not restricted to the brain, they are responsible for inherited neurotoxicity and age-dependent dementia only in specific neuron populations (ralser et al. 2005) . yeast two-hybrid screens revealed an unexpectedly dense interaction network of those disease proteins forming interconnected subnetworks (fig. 11) , which suggests common pathways affected in disease (goehler et al. 2004; lim et al. 2006) . some of the protein-protein interactions may be involved in mediating neurodegeneration and thus may be tractable for drug inhibition, and several interaction partners of ataxins could additionally be shown to be potential disease modifiers in a fly model (kaltenbach et al. 2007) . 
a number of methodological approaches concentrate on deriving correlations between common topological properties and biological function from subnetworks around proteins that are associated with a particular disease phenotype like cancer. recent studies report that human disease-associated proteins with similar clinical and pathological features tend to be more highly connected among each other than with other proteins and to have more similar transcription profiles (xu and li 2006; goh et al. 2007 ). this observation points to the existence of disease-associated functional modules. interestingly, in contrast to disease genes, essential genes whose defect may be lethal early on in life are frequently found to be hubs central to the network. further work focused on specific disease-relevant networks. for instance, to analyze experimental asthma, differentially expressed genes were mapped onto a protein interaction network. here, highly connected nodes tended to have smaller expression changes than peripheral nodes. this agrees with the general notion that disease-causing genes are typically not central in the network. similarly, a comprehensive protein network analysis of systemic inflammation in human subjects investigated blood leukocyte gene expression patterns in response to an inflammatory stimulus, a bacterial endotoxin, to identify functional modules perturbed by this stimulus (calvano et al. 2005) . topological criteria and gene expression data were also used to search protein networks for functional modules that are relevant to type 2 diabetes mellitus or to different types of cancer (jonsson and bates 2006; cui et al. 2007; lin et al. 2007; pujana et al. 2007 ). 
moreover, it was recently demonstrated that the integration of gene expression profiles with subnetworks of interacting proteins can lead to improved prognostic markers for breast cancer outcome that are more reproducible between patient cohorts than sets of individual genes selected without network information (chuang et al. 2007 ). in drug discovery, protein networks can help to design selective inhibitors of protein-protein interactions which target specific interactions of a protein, but do not affect others (wells and mcclendon 2007) . for example, a highly connected protein (hub) may be a suitable target for an antibiotic whereas a more peripheral protein with few interaction partners may be more appropriate for a highly specific drug that needs to avoid side effects. thus, topological network criteria are not only useful for characterizing disease proteins, but also for finding drug targets. the diversity of interactions of a targeted protein could also help in predicting potential side effects of a drug. apart from that, it is remarkable that some potential drugs have been found to be less effective than expected due to the intrinsic robustness of living systems against perturbations of molecular interactions (kitano 2007) . furthermore, mutations in proteins cause genetic diseases, but it is not always easy to distinguish protein interactions impaired by mutated binding sites from other disease causes like structural instability induced by amino acid mutations. nowadays many genome-wide association and linkage studies for human diseases suggest genomic loci and linkage intervals that contain candidate genes encoding snps and mutations of potential disease proteins (kann 2007) . since the resultant list of candidates frequently contains dozens or even hundreds of genes, computational approaches have been developed to prioritize them for further analyses and experiments. 
in the following, we will demonstrate the variety of available prioritization approaches by explicating three recent methods that utilize protein interaction data in addition to other sequence and function information. all methods capitalize on the above-described observation that closely interacting gene products often underlie polygenic diseases and similar pathophenotypes (oti and brunner 2007) . using protein-protein interaction data annotated with reliability values, lage et al. (2007) first predict human protein complexes for each candidate protein. they then score the pairwise phenotypic similarity of the candidate disease with all proteins within each complex that are associated with any disease. the scoring function basically measures the overlap of the respective disease phenotypes as recorded in text entries of omim (online mendelian inheritance in man) (hamosh et al. 2005 ) based on the vocabulary of umls (unified medical language system) (bodenreider 2004) . lastly, all candidates are prioritized by the probability returned by a bayesian predictor trained on the interaction data and phenotypic similarity. therefore, this method depends on the premise that the phenotypic effects caused by any disease-affected member in a predicted protein complex are very similar to each other. another prioritization approach by franke et al. (2006) does not make use of overlapping disease phenotypes and primarily aims at connecting physically disjoint genomic loci associated with the same disease using molecular networks. at the beginning, their method prioritizer performs a bayesian integration of three different network types of gene/protein relationships. the latter are derived from functional similarity using gene ontology annotation, microarray coexpression, and protein-protein interaction. this results in a probabilistic human network of general functional links between genes. 
prioritizer then assesses which candidate genes contained in different disease loci are closely connected in this gene-gene network. to this end, the score of each candidate is initially set to zero, but it is increased iteratively during network exploration by a scoring function that depends on the network distance of the respective candidate gene to candidates inside another genomic locus. this procedure finally yields separate prioritization lists of ranked candidate genes for each genomic locus. in contrast to the integrated gene-gene network used by prioritizer, the endeavour system (aerts et al. 2006 ) directly compares candidate genes with known disease genes and creates different ranking lists of all candidates using various sources of evidence for annotated relationships between genes or proteins. the evidence can be derived from literature mining, functional associations based on gene ontology annotations, co-occurrence of transcriptional motifs, correlation of expression data, sequence similarity, common protein domains, shared metabolic pathway membership, and protein-protein interactions. at the end, endeavour merges the resultant ranking lists using order statistics and computes an overall prioritization list of all candidate genes. finally, it is important to keep in mind that current datasets of human protein interactions may still contain a significant number of false interactions and thus biological and medical conclusions derived from them should always be taken with a note of caution, in particular, if no good confidence measures are available. a comprehensive atlas of protein interactions is fundamental for a better understanding of the overall dynamic functioning of living organisms. these insights arise from the integration of functional information, dynamic data and protein interaction networks. 
in order to fulfill the goal of enlarging our view of the protein interaction network, several approaches must be combined and a crosstalk must be established among experimental and computational methods. this has become clear from comparative evaluations which show similar performances for both types of methodologies. in fact, over the recent years this field has grown into one of the most appealing fields in bioinformatics. evolutionary signals result from restrictions imposed by the need to optimize the features that affect a given interaction and the nature of these features can differ from interaction to interaction. consequently, a number of different methods have been developed based on a range of different evolutionary signals. this section is devoted to a brief review of some of these methods. these techniques are based on the similarity of absence/presence profiles of interacting proteins. in their original formulation (gaasterland and ragan 1998; huynen and bork 1998; pellegrini et al. 1999; marcotte et al. 1999a ) the phylogenetic profiles were codified as 0/1 vectors for each reference protein according to the absence/presence of proteins of the studied family in a set of fully sequenced organisms (see fig. 12a ). the vectors for different reference sequences are compared by using the hamming distance (pellegrini et al. 1999) between vectors. this measure counts the number of differences between two binary vectors. the rationale for this method is that both interacting proteins must be present in an organism and that reductive evolution will remove unpaired proteins in the rest of the organisms. proposed improvements include the inclusion of quantitative measures of sequence divergence (marcotte et al. 1999b; date and marcotte 2003) and the ability to deal with biases in the taxonomic distribution of the organisms used (date and marcotte 2003; barker and pagel 2005) . 
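the comparison of binary phylogenetic profiles by hamming distance can be sketched in a few lines of python. the protein names and the six-organism profiles below are invented for illustration, and the function name `hamming_distance` is an assumption rather than part of any published tool.

```python
from itertools import combinations


def hamming_distance(profile_a, profile_b):
    """Count the organisms in which exactly one of the two proteins is present."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same set of organisms")
    return sum(a != b for a, b in zip(profile_a, profile_b))


# Hypothetical presence (1) / absence (0) profiles over six genomes.
profiles = {
    "P1": [1, 0, 1, 1, 0, 1],
    "P2": [1, 0, 1, 1, 0, 0],
    "P3": [0, 1, 0, 0, 1, 1],
}

distances = {
    (a, b): hamming_distance(profiles[a], profiles[b])
    for a, b in combinations(sorted(profiles), 2)
}

# Pairs with the smallest distance are the strongest interaction candidates.
for pair, dist in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, dist)
```

in this toy example p1 and p2 differ in a single genome and would be the top-ranked candidate pair, whereas p3 shows an almost complementary profile.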
these biases are due to the intuitive fact that evolutionarily similar organisms will share a higher number of protein and genomic features (in this case presence/absence of an orthologue). to reduce this problem, date and marcotte (2003) used mutual information from sequence divergent profiles for measuring the amount of information shared by both vectors. mutual information is calculated as: $MI(P_1, P_2) = H(P_1) + H(P_2) - H(P_1, P_2)$, where $H(P_1) = -\sum p(P_1) \ln p(P_1)$ is the marginal entropy of the probability distribution of protein $P_1$ sequence distances and $H(P_1, P_2) = -\sum \sum p(P_1, P_2) \ln p(P_1, P_2)$ is the joint entropy of the probability distributions of both protein $P_1$ and $P_2$ sequence distances. the corresponding probabilities are calculated from the whole distribution of orthologue distances for the organisms. in this way, the most likely evolutionary distances between orthologues from a pair of organisms will produce smaller entropies and consequently smaller values of mutual information. this formulation should implicitly reduce the effect of taxonomic biases. in an interesting work, published recently by barker et al. (2007) , the authors showed that detection of correlated gene-gain/gene-loss events improves the predictions by reducing the number of false positives due to taxonomic biases. the phylogenetic profiling approach has been shown to be quite powerful, because its simple formulation has allowed the exploration of a number of alternative interdependencies between proteins. this is the case for enzyme "displacement" in metabolic pathways detected as anti-correlated profiles (morett et al. 2003) , and for complex dependence relations among triplets of proteins (bowers et al. 2004) . phylogenetic profiles have also been correlated with bacterial traits to predict the genes related to particular phenotypes (korbel et al. 2005) . 
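the mutual information between two profiles, as defined above, can be illustrated with a minimal python sketch. it assumes that the sequence distances have already been discretized into bins (here simply labelled 0 and 1) and that probabilities are estimated as plain frequencies over the organisms; both choices are simplifications for illustration and are not taken from the original method.

```python
from collections import Counter
from math import log


def entropy(values):
    """Shannon entropy (natural log) of the empirical distribution of `values`."""
    total = len(values)
    return -sum((count / total) * log(count / total)
                for count in Counter(values).values())


def mutual_information(profile_a, profile_b):
    """MI = H(A) + H(B) - H(A, B) for two equally long discretized profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same set of organisms")
    joint = list(zip(profile_a, profile_b))
    return entropy(profile_a) + entropy(profile_b) - entropy(joint)


# Hypothetical binned distance profiles over four organisms.
identical = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
unrelated = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
print(identical, unrelated)
```

two identical profiles share all of their information, so the mutual information equals the entropy of either profile, while statistically independent profiles give a value of zero.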
the main drawbacks of these methods are the difficulty of dealing with essential proteins (where there is no absence information) and the requirement for the genomes under study to be complete (to establish the absence of a family member). 
fig. 12 prediction of protein interactions based on genomic and sequence features. information coming from the set of close homologs of the proteins p1 and p2 from the organism 1 in other organisms can be used to predict an interaction between these proteins. (a) phylogenetic profiling. presence/absence of a homolog of both proteins in different organisms is coded as the corresponding two 1/0 profiles (the simplest approach) and an interaction is predicted for very similar profiles. (b) similarity of phylogenetic trees. multiple sequence alignments are built for both sets of proteins and phylogenetic trees are derived from the proteins with a possible partner present in its organism. proteins with highly similar trees are predicted to interact. (c) gene neighbourhood conservation. genome closeness is checked for those genes coding for both sets of homologous proteins. interaction is predicted if gene pairs are recurrently close to each other in a number of organisms. (d) gene fusion. finding the proteins containing different sequence regions homologous to each of the two proteins is used to predict an interaction between them. 
similarity in the topology of phylogenetic trees of interacting proteins has been qualitatively observed in a number of cases (fryxell 1996; pages et al. 1997; goh et al. 2000) . the extension of this observation to a quantitative method for the prediction of protein interactions requires measuring the correlation between the similarity matrices of the explored pairs of protein families (goh et al. 2000) . this formulation allows systematic evaluation of the validity of using the original observation as a signal of protein interaction (pazos and valencia 2001) . 
the general protocol for these methods is illustrated in fig. 12b. it includes the building of the multiple sequence alignment for the set of orthologues (one per organism) related to every query sequence, the calculation of all protein pair evolutionary distances (derived from the corresponding phylogenetic trees) and finally the comparison of evolutionary distance matrices of pairs of query proteins using pearson's correlation coefficient. protein pairs with highly correlated distance matrices are predicted to be more likely to interact. although this signal has been shown to be significant, the underlying process responsible for this similarity is still controversial (chen and dokholyan 2006). there are two main hypotheses for explaining this phenomenon. the first hypothesis suggests that this evolutionary similarity comes from the mutual adaptation (co-evolution) of interacting proteins and the need to retain interaction features while sequences diverge. the second hypothesis implicates external factors. in this scenario, the restrictions imposed by evolution on the functional process implicating both proteins would be responsible for the parallelism of their phylogenetic trees. although the relative importance of both factors is still not clear, the predictive power of similarities in phylogenetic trees is not affected. indeed, a number of developments have improved the original formulation (pazos et al. 2005; sato et al. 2005). the first advance involved managing the intrinsic similarity of the trees because of the common underlying taxonomic distribution (due to the speciation processes). this effect is analogous to the taxonomic biases discussed above. in these cases, the approach followed was to correct both trees by removing this common trend. for example, pazos et al. subtracted the distances of the 16s rrna phylogenetic tree from the corresponding distances for each protein tree.
the correlations for the resulting distance matrices were used to predict protein interactions. additionally, some analyses have focused on the selection of the sequence regions used for the tree building (jothi et al. 2006; kann et al. 2007). for example, it has been shown that interacting regions, defined either as interacting residues (using structural data) or as the sequence domain involved in the interaction, show clearer tree similarities than the whole proteins (mintseris and weng 2005; jothi et al. 2006). other interesting work showed that prediction performance can be improved by removing poorly conserved sequence regions. finally, in a very recent work (juan et al. 2008) the authors have suggested a new method for removing noise in the detection of tree similarity signals and detecting different levels of evolutionary parallelism specificity. this method introduces the new strategy of using the global network of protein evolutionary similarity for a better calibration of the evolutionary parallelism between two proteins. for this purpose, they define a protein co-evolutionary profile as the vector containing the evolutionary correlations between a given protein tree and all the rest of the protein trees derived from sequences in the same organism. this co-evolutionary profile is a more robust and comparable representation of the evolution of a given protein (it involves hundreds of distances) and can be used to deploy a new level of evolutionary comparison. the authors compare these co-evolutionary profiles by calculating pearson's correlation coefficient for each pair. in this way, the method detects pairs of proteins for which high evolutionary similarities are supported by their similarities with the rest of proteins of the organism.
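the tree-similarity comparison described above, including the correction for the shared species tree, can be sketched in a few lines of python. this is a simplified illustration under stated assumptions: distances are supplied as dictionaries keyed by species pairs, the background (e.g. 16s rrna) distances are subtracted without any rescaling, and all names and numbers are hypothetical rather than taken from the cited methods.

```python
from itertools import combinations
from math import sqrt

def flatten(dist, species):
    """Return the distances for all species pairs in one fixed order."""
    return [dist[pair] for pair in combinations(sorted(species), 2)]

def pearson(x, y):
    """Pearson's correlation coefficient of two equally long lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def tree_similarity(dist_a, dist_b, species, background=None):
    """Correlate two distance matrices, optionally removing a common trend."""
    a, b = flatten(dist_a, species), flatten(dist_b, species)
    if background is not None:
        # assumption: plain subtraction of the species-tree distances
        bg = flatten(background, species)
        a = [x - g for x, g in zip(a, bg)]
        b = [y - g for y, g in zip(b, bg)]
    return pearson(a, b)

# toy data for four species (assumed values, for illustration only)
species = ["sp1", "sp2", "sp3", "sp4"]
pairs = list(combinations(species, 2))
family_a = dict(zip(pairs, [0.1, 0.4, 0.5, 0.4, 0.5, 0.3]))
family_b = dict(zip(pairs, [0.2, 0.8, 1.0, 0.8, 1.0, 0.6]))
rrna = dict(zip(pairs, [0.1, 0.3, 0.4, 0.3, 0.4, 0.2]))

raw = tree_similarity(family_a, family_b, species)
corrected = tree_similarity(family_a, family_b, species, background=rrna)
```

in the toy data both families simply follow the species tree, so the uncorrected correlation is essentially perfect and drops once the background distances are removed, which is the effect the correction is meant to expose.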
this approach significantly improves the predictive performance of the tree similarity-based methods so that different degrees of co-evolutionary specificity are obtained according to the number of proteins that might be influencing the co-evolution of the studied pair. this is done by extending the approach of sato et al. (2006), which uses partial correlations and a reduced set of proteins for determining specific evolutionary similarities. juan et al. calculated the partial correlation for each significant evolutionary similarity with respect to the remaining proteins in the organism and defined levels of co-evolutionary specificity according to the number of proteins that are considered to be co-evolving with each studied protein pair. with this strategy, it is possible to detect a range of evolutionary parallelisms from the protein pairs (for very specific similarities) up to subsets of proteins (for more relaxed specificities) that are highly evolution dependent. interestingly, if specificity requirements are relaxed, protein relationships among components of macro-molecular complexes and proteins involved in the same metabolic process can be recovered. this can be considered as a first step in the application of higher orders of evolutionary parallelisms to decode the evolutionary impositions over the protein interaction network.

the gene neighbourhood method exploits the well-known tendency of bacterial organisms to organize proteins involved in the same biochemical process by clustering them in the genome. this observation is obviously related to the operon concept and the mechanisms for the coordination of transcription regulation of the genes present in these modules. these mechanisms are widespread among bacterial genomes. therefore the significance of a given gene proximity can be established by its conservation in evolutionarily distant species (dandekar et al. 1998; overbeek et al. 1999).
the availability of fully sequenced organisms makes computing the intergenic distances between each pair of genes easy. genes with the same direction of transcription and closer than 300 bases are typically considered to be in the same genomic context (see fig. 12c). the conservation of this closeness must be found in more than two highly divergent organisms to be considered significant because of the taxonomic biases. while this signal is strong in bacterial genomes, its relevance is unclear in eukaryotic genomes. this is the main drawback of these methodologies. in fact, this signal can only be exploited for eukaryotic organisms by extrapolating genomic closeness of bacterial genes to their homologues in eukaryotes. obviously, this extrapolation leads to a considerable reduction in the confidence and number of obtained predictions for this evolutionary lineage. however, conserved gene pairs that are transcribed from a shared bidirectional promoter can be detected by similar methods and can be found in eukaryotes as well as prokaryotes (korbel et al. 2004).

a further use of evolutionary signals in protein function and physical interaction prediction has been the tendency of interacting proteins to be involved in gene fusion events. sequences that appear as independently expressed orfs in one organism become fused as part of the same polypeptide sequence in another organism. these fusions are strong indicators of functional and structural interaction that have been suggested to increase the effective concentration of interacting functional domains (enright et al. 1999; marcotte et al. 1999b). this hypothesis proposes that gene fusion could remove the effect of diffusion and relative correct orientation of the proteins forming the original complex. these fusion events are typically detected when sequence searches for two nonhomologous proteins obtain a significant hit in the same sequence.
cases matching to the same region of the hit sequence are removed (these cases are schematically represented in fig. 12d). in spite of the strength of this signal, gene fusion does not seem to be a frequent event in bacterial organisms. the difficulty of distinguishing protein interactions belonging to large evolutionary families is the main drawback of the automatic application of these methodologies.

13 integration of experimentally determined and predicted interactions

as described above, there are many experimental techniques and computational methods for determining and predicting interactions. to obtain the most comprehensive interaction networks possible, as many as possible of these sources of interactions should be integrated. the integration of these resources is complicated by the fact that the different sources are not all equally reliable, and it is thus important to quantify the accuracy of the different evidence supporting an interaction. in addition to the quality issues, comparison of different interaction sets is further complicated by the different nature of the datasets: yeast two-hybrid experiments are inherently binary, whereas pull-down experiments tend to report larger complexes. to allow for comparisons, complexes are typically represented by binary interaction networks; however, it is important to realize that there is not a single, clear definition of a "binary interaction". for complex pull-down experiments, two different representations have been proposed: the matrix representation, in which each complex is represented by the set of binary interactions corresponding to all pairs of proteins from the complex, and the spoke representation, in which only bait-prey interactions are included (von mering et al. 2002).
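the difference between the matrix and the spoke representation can be made concrete with a small python sketch. the function names and the example purification (one bait, three preys) are hypothetical and serve only as an illustration.

```python
from itertools import combinations

def spoke(bait, preys):
    """Spoke representation: only bait-prey pairs are reported."""
    return {frozenset((bait, prey)) for prey in preys if prey != bait}

def matrix(bait, preys):
    """Matrix representation: all pairs of proteins in the purified complex."""
    members = {bait, *preys}
    return {frozenset(pair) for pair in combinations(sorted(members), 2)}

# one hypothetical pull-down with bait "A" and three co-purified preys
spoke_edges = spoke("A", ["B", "C", "D"])
matrix_edges = matrix("A", ["B", "C", "D"])
```

for a complex of n proteins the spoke representation yields n - 1 binary interactions and the matrix representation n(n - 1)/2, so the spoke edges are always a subset of the matrix edges.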
the binary interactions obtained using either of these representations are somewhat artificial as some interacting proteins might in reality never touch each other and others might have too low an affinity to interact except in the context of the entire complex bringing them together. even in the case of yeast two-hybrid assays, which inherently report binary interactions, not all interactions correspond to direct physical interactions. the database string ("search tool for the retrieval of interacting genes/ proteins") (von mering et al. 2007) represents an effort to provide many of the different types of evidence for functional interactions under one common framework with an integrated scoring scheme. such an integrated approach offers several unique advantages: 1) various types of evidence are mapped onto a single, stable set of proteins, thereby facilitating comparative analysis; 2) known and predicted interactions often partially complement each other, leading to increased coverage; and 3) an integrated scoring scheme can provide higher confidence when independent evidence types agree. in addition to the many associations imported from the protein interaction databases mentioned above (bader et al. 2003; salwinski et al. 2004; guldener et al. 2006; mishra et al. 2006; stark et al. 2006; chatr-aryamontri et al. 2007 ), string also includes interactions from curated pathway databases (vastrik et al. 2007; kanehisa et al. 2008 ) and a large body of predicted associations that are produced de novo using many of the methods described in this chapter (dandekar et al. 1998; gaasterland and ragan 1998; pellegrini et al. 1999; marcotte et al. 1999c) . these different types of evidence are obviously not directly comparable, and even for the individual types of evidence the reliability may vary. to address these two issues, string uses a two-stage approach. 
first, a separate scoring scheme is used for each evidence type to rank the interactions according to their reliability; these raw quality scores cannot be compared between different evidence types. second, the ranked interaction lists are benchmarked against a common reference to obtain probabilistic scores, which can subsequently be combined across evidence types. to exemplify how raw quality scores work, we will here explain the scoring scheme used for physical protein interactions from high-throughput screens. the two fundamentally different types of experimental interaction data sets, complex pull-downs and binary interactions, are evaluated using separate scoring schemes. for the binary interaction experiments, e.g. yeast two-hybrid, the reliability of an interaction correlates well with the number of non-shared interaction partners for each interactor. string summarizes this in the following raw quality score: $\log((N_1 + 1) \cdot (N_2 + 1))$, where $N_1$ and $N_2$ are the numbers of non-shared interaction partners. this score is similar to the ig1 measure suggested by saito et al. (2002). in the case of complex pull-down experiments, the reliability of the inferred binary interactions correlates better with the number of times the interactors were co-purified compared to what would be expected at random. the corresponding raw score is computed from $N_{12}$, the number of purifications containing both proteins, $N_1$ and $N_2$, the numbers of purifications containing either protein 1 or 2, and $N$, the total number of purifications. for this purpose, the bait protein was counted twice to account for bait-prey interactions being more reliable than prey-prey interactions. these raw quality scores are calculated for each individual high-throughput screen. scores vary within one dataset, because they include additional, intrinsic information from the data itself, such as the frequency with which an interaction is detected.
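as an illustration of the raw quality score for binary interaction screens, the following python sketch evaluates the expression given above for one pair of proteins. how non-shared partners are counted (partners of one protein that are neither the other protein nor one of its partners) is an assumption made here, and the edge list is invented; the sketch does not claim to reproduce the string implementation, and the ranking of the resulting values is left to the subsequent benchmarking step.

```python
from collections import defaultdict
from math import log

def neighbours(edges):
    """Build an undirected adjacency map from a list of binary interactions."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def raw_binary_score(a, b, graph):
    """log((N1 + 1) * (N2 + 1)) with N1, N2 the non-shared partner counts."""
    n1 = len(graph[a] - graph[b] - {b})  # partners of a not shared with b
    n2 = len(graph[b] - graph[a] - {a})  # partners of b not shared with a
    return log((n1 + 1) * (n2 + 1))

# hypothetical yeast two-hybrid screen
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "E")]
graph = neighbours(edges)
score_ab = raw_binary_score("A", "B", graph)  # A and B share C; D and E are non-shared
```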
for medium sized data sets that are not large enough to apply the topology based scoring schemes, the same raw score is assigned to all interactions within a dataset. finally, very small data sets are pooled and considered jointly as a single interaction set. we similarly have different scoring schemes for predicted interactions based on coexpression in microarray expression studies, conserved gene neighborhood, gene fusion events and phylogenetic profiles. based on these raw quality scores, a confidence score is assigned to each predicted association by benchmarking the performance of the predictions against a common reference set of trusted, true associations. string uses as reference the functional grouping of proteins maintained at kegg (kyoto encyclopedia of genes and genomes) (kanehisa et al. 2008). any predicted association for which both proteins are assigned to the same "kegg pathway" is counted as a true positive. kegg pathways are particularly suitable as a reference because they are based on manual curation, are available for a number of organisms, and cover several functional areas. other benchmark sets could also be used, for example "biological process" terms from gene ontology (ashburner et al. 2000) or reactome pathways (vastrik et al. 2007). the benchmarked confidence scores in string generally correspond to the probability of finding the linked proteins within the same pathway or biological process. the assignment of probabilistic scores for all evidence types solves many of the issues of data integration. first, incomparable evidence types are made comparable by assigning a score that represents how well the evidence type can predict a certain type of interactions (the type being specified by the reference set used). second, the separate benchmarking of interactions from, for example, different high-throughput protein interaction screens accounts for any differences in reliability between different studies.
third, use of raw quality scores allows us to separate more reliable interactions from less reliable interactions even within a single dataset. the probabilistic nature of the scores also makes it easy to calculate the combined reliability of an interaction given multiple lines of evidence. it is computed under the assumption of independence for the various sources, in a naïve bayesian fashion. in addition to having a good scoring scheme, it is crucial to make the evidence for an interaction transparent to the end users. to achieve this, the string interaction network is made available via a user-friendly web interface (http://string.embl.de). when performing a query, the user will first be presented with a network view, which provides a first, simplified overview (fig. 13). from here the user has full control over parameters such as the number of proteins shown in the network (nodes) and the minimal reliability required for an interaction (edge) to be displayed. from the network, the user also has the ability to drill down on the evidence that underlies any given interaction using the dedicated viewer for each evidence type. for example, it is possible to inspect the publications that support a given interaction, the set of proteins that were co-purified in a particular experiment and the phylogenetic profiles or genomic context based on which an interaction was predicted.

fig. 13 protein interaction network of the core cell-cycle regulation in human. the network was constructed by querying the string database (von mering et al. 2007) for very high confidence interactions (conf. score > 0.99) between four cyclin-dependent kinases, their associated cyclins, the wee1 kinase and the cdc25 phosphatases. the network correctly recapitulates that cdc2 interacts with cyclin-a/b, cdk2 with cyclin-a/e, and cdk4/6 with cyclin-d. it also shows that the wee1 kinase and the cdc25 phosphatases regulate cdc2 and cdk2 but not cdk4 and cdk6. moreover, the network suggests that cdc25a phosphatase regulates cdc2 and cdk2, whereas cdc25b and cdc25c specifically regulate cdc2.

protein binding is commonly characterized by specific interactions of evolutionarily conserved domains (pawson and nash 2003). domains are fundamental units of protein structure and function, which are incorporated into different proteins by genetic duplications and rearrangements (vogel et al. 2004). globular domains are defined as structural units of fifty and more amino acids that usually fold independently of the remaining polypeptide chain to form stable, compact structures (orengo and thornton 2005). they often carry important functional sites and determine the specificity of protein interactions (fig. 14).

fig. 14 exemplary interaction between the two human proteins hhr23b and ataxin-3. each protein domain commonly adopts a particular 3d structure and may fulfill a specific molecular function. generally, the domains responsible for an observed protein-protein interaction need to be determined before further functional characterizations are possible. in the depicted protein-protein interaction, it is known from experiments that the ubiquitin-like domain ubl of hhr23b (yellow) forms a complex with the de-ubiquitinating josephin domain of ataxin-3 (blue) (nicastro et al. 2005).

essential information on the cellular function of specific protein interactions and complexes can often be gained from the known functions of the interacting protein domains. domains may contain binding sites for proteins and ligands such as metabolites, dna/rna, and drug-like molecules (xia et al. 2004).
widely spread domains that mediate molecular interactions can be found alone or combined in conjunction with other domains and intrinsically disordered, mainly unstructured, protein regions connecting globular domains (dunker et al. 2005). according to apic et al. (2001) multi-domain proteins constitute two thirds of unicellular and 80% of metazoan proteomes. one and the same domain can occur in different proteins, and many domains of different types are frequently found in the same amino acid chain. much effort is being invested in discovering, annotating, and classifying protein domains both from the functional (pfam (finn et al. 2006), smart (letunic et al. 2006), cdd (marchler-bauer et al. 2007), interpro (mulder et al. 2007)) and structural (scop (andreeva et al. 2004), cath (greene et al. 2007)) perspective. notably, it may be confusing that the term domain is commonly used with two slightly different meanings. in the context of domain databases such as pfam and smart, a domain is basically defined by a set of homologous sequence regions, which constitute a domain family. in contrast, a specific protein may contain one or more domains, which are concrete sequence regions within its amino acid sequence corresponding to autonomously folding units. domain families are commonly represented by hidden markov models (hmms), and highly sensitive search tools like hmmer (eddy 1998) are used to identify domains in protein sequences. different sources of information about interacting domains with experimental evidence are available. experimentally determined interactions of single-domain proteins indicate domain-domain interactions. similarly, experiments using protein fragments help to identify interaction domains, but this knowledge is frequently hidden in the text of publications and not contained in any database. however, domain databases like pfam, smart, and interpro may contain some annotation obtained by manual literature curation.
in the near future, high-throughput screening techniques will result in even larger amounts of protein fragment interaction data to delineate domain borders and interacting protein regions (colland and daviet 2004) . above all, three-dimensional structures of protein domain complexes are experimentally solved by x-ray crystallography or nmr and are deposited in the pdb database (berman et al. 2007) . structural contacts between two interacting proteins can be derived by mapping sequence positions of domains onto pdb structures. extensive investigations of domain combinations in proteins of known structures (apic et al. 2001 ) as well as of structurally resolved homo-or heterotypic domain interactions (park et al. 2001) revealed that the overlap between intra-and intermolecular domain interactions is rather limited. two databases, ipfam (finn et al. 2005 ) and 3did (stein et al. 2005) , provide pre-computed structural information about protein interactions at the level of pfam domains. analysis of structural complexes suggests that interactions between a given pair of proteins may be mediated by different domain pairs in different situations and in different organisms. nevertheless, many domain interactions, especially those involved in basic cellular processes such as dna metabolism and nucleotide binding, tend to be evolutionarily conserved within a wide range of species from prokaryotes to eukaryotes (itzhaki et al. 2006) . in yeast, pfam domain pairs are associated with over 60% of experimentally known protein interactions, but only 4.5% of them are covered by ipfam (schuster-bockler and bateman 2007) . domain interactions can be inferred from experimental data on protein interactions by identifying those domain pairs that are significantly overrepresented in interacting proteins compared to random protein pairs (deng et al. 2002; ng et al. 2003a; riley et al. 2005; sprinzak and margalit 2001) (fig. 15) . 
however, the predictive power of such an approach is strongly dependent on the quality of the data used as the source of information for protein interactions, and the coverage of protein sequences in terms of domain assignments. basically, the likelihood of two domains, d i and d j , to interact can be estimated as the fraction of protein pairs known to interact among all proteins in the dataset containing this domain pair. this basic idea has been improved upon by using a maximum-likelihood (ml) approach based on the expectation-maximization (em) algorithm. this method finds the maximum likelihood estimator of the observed protein-protein interactions by an iterative cycle of computing the expected likelihood (e-step) and maximizing the unobserved parameters (domain interaction propensities) in the m-step. when the algorithm converges (i.e. the total likelihood cannot be further improved by the algorithm), the ml estimate for the likelihood of the unobserved domain interactions is found (deng et al. 2002; riley et al. 2005 ). riley and colleagues further improved this method by excluding each potentially interacting domain pair from the dataset and recomputing the ml-estimate to obtain an additional confidence value for the respective domain-domain interaction. this domain pair exclusion (dpea) method measures the contribution of each domain pair to the overall likelihood of the protein interaction network based on domain-domain interactions. in particular, this approach enables the prediction of specific domain-domain interactions between selected proteins which would have been missed by the basic ml method. another ml-based algorithm is insite which takes differences in the reliability of the protein-protein interaction data into account (wang et al. 2007a) . it also integrates external evidence such as functional annotation or domain fusion events. 
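the basic estimate mentioned above, i.e. the fraction of interacting protein pairs among all protein pairs that contain a given domain pair, can be sketched as follows. this python example covers only this simple association estimate, not the em-based maximum-likelihood or dpea methods; the protein names, domain names and interactions are invented for illustration.

```python
from collections import Counter
from itertools import combinations, product

def domain_pair_scores(domains, interactions):
    """Fraction of interacting protein pairs among all pairs containing a domain pair.

    domains: dict mapping protein -> set of domain names
    interactions: set of frozenset({protein1, protein2})
    """
    seen, hits = Counter(), Counter()
    for p, q in combinations(sorted(domains), 2):
        # every domain pair that can be formed between the two proteins
        domain_pairs = {frozenset((a, b)) for a, b in product(domains[p], domains[q])}
        interacting = frozenset((p, q)) in interactions
        for pair in domain_pairs:
            seen[pair] += 1
            if interacting:
                hits[pair] += 1
    return {pair: hits[pair] / seen[pair] for pair in seen}

# hypothetical toy dataset
domains = {"p1": {"X"}, "p2": {"Y"}, "p3": {"X", "Z"}, "p4": {"Y"}}
interactions = {frozenset(("p1", "p2")), frozenset(("p3", "p4"))}
scores = domain_pair_scores(domains, interactions)
```

in this toy set the domain pair x-y occurs in four protein pairs, two of which interact, giving an estimate of 0.5; as noted in the text, such estimates depend strongly on the quality and coverage of the underlying interaction and domain data.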
an alternative method for deriving domain interactions is through co-evolutionary analysis that exploits the notion that mutations of residue pairs at the interaction interfaces are correlated to preserve favorable physico-chemical properties of the binding surface (jothi et al. 2006). the pair of domains mediating interactions between two proteins p1 and p2 may therefore be expected to display a higher similarity of their phylogenetic trees than other, non-interacting domains (fig. 16). the degree of agreement between the evolutionary history of two domains, $d_i$ and $d_j$, can be computed by pearson's correlation coefficient $r_{ij}$ between the similarity matrices of the domain sequences in different organisms:

$$r_{ij} = \frac{\sum_{p=1}^{n-1} \sum_{q=p+1}^{n} (M^i_{pq} - \bar{M}^i)(M^j_{pq} - \bar{M}^j)}{\sqrt{\sum_{p=1}^{n-1} \sum_{q=p+1}^{n} (M^i_{pq} - \bar{M}^i)^2} \, \sqrt{\sum_{p=1}^{n-1} \sum_{q=p+1}^{n} (M^j_{pq} - \bar{M}^j)^2}}$$

where $n$ is the number of species, $M^i_{pq}$ and $M^j_{pq}$ are the evolutionary distances between species $p$ and $q$, and $\bar{M}^i$ and $\bar{M}^j$ are the mean values of the matrices, respectively. in figure 16 the evolutionary tree of the domain d2 is most similar to those of d5 and d6, corroborating the actual binding region. a well-known limitation of the correlated mutation analysis is that it is very difficult to decide whether residue co-variation happens as a result of functional co-evolution directed at preserving interaction sites, or because of sequence divergence due to speciation. to address this problem, it has been suggested to distinguish the relative contribution of conserved and more variable regions in aligned sequences to the co-evolution signal, based on the hypothesis that functional co-evolution is more prominent in conserved regions. finally, interacting domains can be identified by phylogenetic profiling, as described above for full-chain proteins. as in the case of complete protein chains, the similarity of evolutionary patterns shared by two domains may indicate that they interact with each other directly or at least share a common functional role (pagel et al. 2004). as illustrated in fig.
17, clustering protein domains with similar phylogenetic profiles allows researchers to build domain interaction networks which provide clues for describing molecular complexes. similarly, the domainteam method (pasek et al. 2005) considers chromosomal neighborhoods at the level of conserved domain groups.

fig. 16 co-evolutionary analysis of domain interactions. two orthologous proteins from different organisms known to interact with each other are shown. the first protein consists of two domains, d1 and d2, while the second protein includes the domains d3, d4, d5, and d6. evolutionary trees for each domain are shown, their similarity serves as an indication of interaction likelihood that is encoded in the interaction matrix.

a number of resources provide and combine experimentally derived and predicted domain interaction data. interdom (http://interdom.i2r.a-star.edu.sg/) integrates domain-interaction predictions based on known protein interactions and complexes with domain fusion events (ng et al. 2003b). dima (http://mips.gsf.de/genre/proj/dima2) is another database of domain interactions, which integrates experimentally demonstrated domain interactions from ipfam and 3did with predictions based on the dpea algorithm and phylogenetic domain profiling. recently, two new comprehensive resources, domine (http://domine.utdallas.edu) (raghavachari et al. 2008) and dasmi (http://www.dasmi.de) (blankenburg et al. 2008, submitted), were introduced and are available online. these resources contain ipfam and 3did data and predicted domain interactions taken from several other publications. predictions are based on several methods for deriving domain interactions from protein interaction data, phylogenetic domain profiling data and domain coevolution. with the availability of an increasing number of predictions the task of method weighting and quality assessment becomes crucial.
a thorough analysis of the quality of domain interaction data can be found in schlicker et al. (2007). beyond domain-domain contacts, an alternative mechanism of mediating molecular recognition is through binding of protein domains to short sequence regions (santonico et al. 2005), typically from three to eight residues in length (zarrinpar et al. 2003; neduva et al. 2005). such linear recognition motifs can be discovered from protein interaction data by identifying amino acid sequence patterns overrepresented in proteins that do not possess significant sequence similarity, but share the same interacting partner (yaffe 2006). web services like elm (http://elm.eu.org) (puntervoll et al. 2003) support the identification of linear motifs in protein sequences. as described above, specific adapter domains can mediate protein-protein interactions. while some of these interaction domains recognize small target peptides, others are involved in domain-domain interactions. as short binding motifs have a rather high probability of being found by chance and the exact mechanisms of binding specificity for this mode of interaction are not understood completely, prediction of protein-protein interactions based on binding domains is currently limited to domain-domain interactions for which reliable data are available. predicting ppis from domain interactions may simply be achieved by reversing the ideas discussed above, that is, by using the domain composition of proteins to evaluate the interaction likelihood of proteins (bock and gough 2001; sprinzak and margalit 2001; wojcik and schachter 2001). in a naive approach, domain interactions are treated as independent, and all protein pairs with a matching pair of interacting domains are predicted to engage in an interaction.
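the naive approach described above can be written down in a few lines of python. the first function predicts an interaction whenever any domain pair of the two proteins is in a set of known interacting domain pairs. the second function is an additional assumption that is not spelled out in the text: if each domain pair carries an interaction probability and the pairs act independently, the protein-level probability is one minus the probability that no domain pair interacts. all names and values are hypothetical.

```python
from itertools import product

def predict_interaction(domains_p, domains_q, interacting_pairs):
    """Naive rule: predict an interaction if any domain pair is known to interact."""
    return any(frozenset((a, b)) in interacting_pairs
               for a, b in product(domains_p, domains_q))

def interaction_probability(domains_p, domains_q, pair_probability):
    """Assumed independence model: 1 - prod(1 - probability of each domain pair)."""
    domain_pairs = {frozenset((a, b)) for a, b in product(domains_p, domains_q)}
    no_interaction = 1.0
    for pair in domain_pairs:
        no_interaction *= 1.0 - pair_probability.get(pair, 0.0)
    return 1.0 - no_interaction

# hypothetical domain-level knowledge
pair_probability = {frozenset(("X", "Y")): 0.5, frozenset(("Y", "Z")): 0.5}
interacting_pairs = set(pair_probability)

predicted = predict_interaction({"X", "Z"}, {"Y"}, interacting_pairs)
not_predicted = predict_interaction({"X"}, {"Z"}, interacting_pairs)
probability = interaction_probability({"X", "Z"}, {"Y"}, pair_probability)
```

the independence assumption is exactly what the more advanced methods relax, for example by considering domain combinations or several interacting domain pairs at once.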
given that protein interactions may also be mediated by several domain interactions simultaneously, more advanced statistical methods take into account dependencies between domains and exploit domain combinations (han et al. 2004) and multiple interacting domain pairs (chen and liu 2005). exercising and validating these prediction approaches revealed that the most influential factor for ppi prediction is the quality of the underlying data. this suggests that, as for most biological predictions in other fields, the future of prediction methods for protein and domain interactions may lie in the integration of different sources of evidence and weighting the individual contributions based on calibration to gold-standard data. further methodological improvements may include the explicit consideration of cooperative domains, that is, domain pairs that jointly interact with other domains (wang et al. 2007b).

interactions between two or up to a few biomolecules are the basic elements of the complex molecular interaction networks that enable the processes of life and, when thrown out of their intended equilibrium, manifest the molecular basis of diseases. such interactions are at the basis of the formation of metabolic, regulatory or signal transduction pathways. furthermore, the search for drugs boils down to analyzing the interactions between the drug molecule and the molecular target to which it binds, which is often a protein. for the analysis of a single molecular interaction, we do not need complex biological screening data. thus it is not surprising that the analysis of the interactions between two molecules, one of them being a protein, has the longest tradition in computational biology of all problems involving molecular interactions, dating back over three decades. the basis for such analysis is the knowledge of the three-dimensional structure of the involved molecules.
to date, such knowledge is based almost exclusively on experimental measurements, such as x-ray diffraction data or nmr spectra. there are also a few reported cases in which the analysis of molecular interactions based on structural models of proteins has led to successes. the analysis of the interaction of two molecules based on their three-dimensional structure is called molecular docking. the input is composed of the three-dimensional structures of the participating molecules. (if the involved molecule is very flexible, one admissible structure is provided.) the output consists of the three-dimensional structure of the molecular complex formed by the two molecules binding to each other. furthermore, usually an estimate of the differential free energy of binding is given, that is, the energy difference ΔG between the bound and the unbound conformation. for the binding event to be favorable, that difference has to be negative. the term protein-ligand docking, a slight misnomer, describes the binding between a protein molecule and a small molecule. the small molecule can be a natural substrate such as a metabolite or a molecule to be designed to bind tightly to the protein such as a drug molecule. protein-ligand docking is the most relevant version of the docking problem because it is a useful aid in searching for new drugs. also, the problem lends itself especially well to computational analysis, because in pharmaceutical applications one is looking for small molecules that bind very tightly to the target protein, and that do so in a conformation that is also a low-energy conformation in the unbound state. thus, subtle energy differences between competing ligands or binding modes are not of prime interest. for these reasons there is a developed commercial market for protein-ligand docking software. usually the small molecule has a molecular weight of up to several hundred daltons and can be quite flexible.
typically, the small molecule is given by its 2d structure formula, e.g., in the form of a smiles string (weininger 1988). if a starting 3d conformation is needed, there is special software for generating such a conformation (see, e.g., pearlman 1987; sadowski et al. 1994). challenges of the protein-ligand docking problem are (i) finding the correct conformation of the usually highly flexible ligand in the binding site of the protein, (ii) determining the subtle conformational changes in the binding site of the protein upon binding of the ligand, which are termed induced fit, and (iii) producing an accurate estimate of the differential energy of binding or at least ranking different conformations of the same ligand and conformations of different ligands correctly by their differential energy of binding. methods tackling problem (ii) can also be used to rectify smaller errors in structural models of proteins whose structure has not been resolved experimentally. the solution of problem (iii) provides the essential selection criterion for preferred ligands and binding modes, namely those with lowest differential energy of binding. challenge (i) has basically been conquered in the last decade as a number of docking programs have been developed that can efficiently sample the conformational space of the ligand and produce correct binding modes of the ligand within the protein, assuming that the protein is given in the correct structure for binding the ligand. several methods are applied here. the most brute-force method is to just try different (rigid) conformations of the ligand one after the other. if the program is fast enough, one can run through a sizeable number of conformations per ligand (mcgann et al. 2003). a more algorithmic and quite successful method is to build up the ligand from its molecular fragments inside the binding pocket of the protein (rarey et al. 1996).
yet another class of methods samples ligand conformations inside the protein binding pocket by methods such as local search heuristics, monte carlo sampling or genetic algorithms (abagyan et al. 1994; jones et al. 1997; morris et al. 1998). there are also programs exercising combinations of different methods (friesner et al. 2004). the reported methods usually can compute the binding mode of a ligand inside a protein within fractions of a minute to several minutes. the resulting programs can be applied to screening through large databases of ligands involving hundreds of thousands to millions of compounds and are routinely used in the pharmaceutical industry in the early stages of drug design and selection. they are also repeatedly compared on benchmark datasets (kellenberger et al. 2004; englebienne et al. 2007). more complex methods from computational biophysics, such as molecular dynamics (md) simulations that compute a trajectory of the molecular movement based on the forces exerted on the molecules, take hours on a single problem instance and can only be used for final refinement of the complex. challenges (ii) and (iii) have not been solved yet. concerning problem (ii), structural changes in the protein can involve redirections of side chains in or close to the binding pocket and more substantial changes involving backbone movement. while recently methods have been developed to optimize side-chain placement upon ligand binding (claußen et al. 2001; sherman et al. 2006), the problem of finding the correct structural change upon binding involving backbone and side-chain movement is open (carlson 2002). concerning problem (iii), there are no scoring functions to date that are able to estimate the differential energy of binding with sufficient accuracy on a diverse set of protein-ligand complexes (huang and zou 2006).
this is especially unfortunate as an inaccurate estimate of the binding energy causes the docking program to disregard correct complex structures even though they have been sampled by the docking program because they are labeled with incorrect energies. this is the major problem in docking which limits the accuracy of the predictions. recent reviews on protein-ligand docking have been published in sousa et al. (2006) and rarey et al. (2007). one restriction with protein-ligand docking as it applies to drug design and selection is that the three-dimensional structure of the target protein needs to be known. many pharmaceutical targets are membrane-bound proteins for which we do not have the three-dimensional structure. for such proteins there is a version of drug screening that can be viewed as the negative imprint of docking: instead of docking the drug candidate into the binding site of the protein, which is not available, we superpose the drug candidate (which is here called the test molecule) onto another small molecule which is known to bind to the binding site of the protein. such a molecule can be the natural substrate for the target protein or another drug targeting that protein. let us call this small molecule the reference molecule. the suitability of the new drug candidate is then assessed on the basis of its structural and chemical similarity with the reference molecule. one problem is that now both the test molecule and the reference molecule can be highly flexible. but in many cases largely rigid reference molecules can be found, and in other cases it suffices to superpose the test molecule onto any low-energy conformation of the reference molecule. there are several classes of drug screening programs based on this molecular comparison, ranging from (i) programs that perform a detailed analysis of the three-dimensional structures of the molecules to be compared (e.g., lemmen et al. 1998; krämer et al. 2003) through (ii) programs that perform a topological analysis of the two molecules (rarey and dixon 1998; gillet et al. 2003) to (iii) programs that represent both molecules by binary or numerical property vectors which are compared with string methods (mcgregor and muskal 1999; xue et al. 2000). the first class of programs requires fractions of seconds to fractions of a minute for a single comparison, the second can perform hundreds of comparisons per second, the third up to several tens of thousands of comparisons per second. reviews of methods for drug screening based on ligand comparison are given in lengauer et al. (2004) and kämper et al. (2007).
in protein-protein docking, both binding partners are proteins. since drugs tend to be small molecules, this version of the docking problem is not of prime interest in drug design. also, the energy balance of protein-protein binding is much more involved than for protein-ligand binding. optimal binding modes tend not to form troughs in the energy landscape that are as pronounced as for protein-ligand docking. the binding mode is determined by subtle side-chain rearrangements of both binding partners that implement the induced fit along typically quite large binding interfaces. the energy balance is dominated by difficult-to-analyze entropic terms involving the desolvation of water within the binding interface. for these reasons, the software landscape for protein-protein docking is not as well developed as for protein-ligand docking and there is no commercial market for protein-protein docking software. protein-protein docking approaches are based either on conformational sampling and md, which can naturally incorporate molecular flexibility but suffers from very high computing demands, or on combinatorial sampling with both proteins considered rigid, in which case handling of protein flexibility has to be incorporated through methodical extensions. for space reasons we do not detail methods for protein-protein docking.
a recent review on the subject can be found in hildebrandt et al. (2007) . a variant of protein-protein docking is protein-dna docking. this problem shares with protein-protein docking the character that both binding partners are macromolecules. however, entropic aspects of the energy balance are even more dominant in protein-dna docking than in protein-protein docking. furthermore dna can assume nonstandard shapes when binding to proteins which deviate much more from the known double helix than we are used to when considering induced fit phenomena. icm-a method for protein modeling and design: applications to docking and structure prediction from the distorted native conformation gene prioritization through genomic data fusion scale-free networks in cell biology the relationship between sequence and interaction divergence in proteins ten thousand interactions for the molecular biologist structural systems biology: modelling protein interactions scop database in 2004: refinements integrate structure and sequence family data domain combinations in archaeal, eubacterial and eukaryotic proteomes gene ontology: tool for the unification of biology. 
the gene ontology consortium computing topological parameters of biological networks bind: the biomolecular interaction network database network biology: understanding the cells functional organization constrained models of evolution lead to improved prediction of functional linkage from correlated gain and loss of genes predicting functional gene links from phylogenetic-statistical analyses of whole genomes high-throughput mapping of a dynamic signaling network in mammalian cells the worldwide protein data bank (wwpdb): ensuring a single, uniform archive of pdb data predicting protein-protein interactions from primary structure the unified medical language system (umls): integrating biomedical terminology superti-furga g (2004) a physical and functional map of the human tnf-alpha/ nf-kappa b signal transduction pathway use of logic relationships to decipher protein network organization the grid: the general repository for interaction datasets interaction network containing conserved and essential protein complexes in escherichia coli epstein-barr virus and virus human protein interaction maps a network-based analysis of systemic inflammation in humans disrupted in schizophrenia 1 interactome: evidence for the close connectivity of risk genes and a potential synaptic basis for schizophrenia protein flexibility is an important component of structure-based drug discovery mint: the molecular interaction database on evaluating molecular-docking methods for pose prediction and enrichment factors prediction of protein-protein interactions using random decision forest framework the coordinated evolution of yeast proteins is constrained by functional modularity network-based classification of breast cancer metastasis flexe: efficient molecular docking considering protein structure variations integration of biological networks and gene expression data using cytoscape integrating a functional proteomic approach into the target discovery process identification of the 
helicobacter pylori anti-sigma28 factor a map of human cancer signaling interactome: gateway into systems biology conservation of gene order: a fingerprint of proteins that physically interact discovery of uncharacterized cellular systems by genome-wide analysis of functional linkages inferring domain-domain interactions from protein-protein interactions flexible nets. the roles of intrinsic disorder in protein interaction networks profile hidden markov models evaluation of docking programs for predicting binding of golgi alpha-mannosidase ii inhibitors: a comparison with crystallography protein interaction maps for complete genomes based on gene fusion events a novel genetic system to detect protein-protein interactions ipfam: visualization of protein-protein interactions in pdb at domain and amino acid resolutions pfam: clans, web tools and services pharmaceuticals: a new grammar for drug discovery a genomic approach of the hepatitis c virus generates a protein interaction map reconstruction of a functional human gene network, with an application for prioritizing positional candidate genes glide: a new approach for rapid, accurate docking and scoring. 1. 
method and assessment of docking accuracy the coevolution of gene family trees microbial genescapes: phyletic and functional patterns of orf distribution among prokaryotes analysis of the human protein interactome and comparison with yeast, worm and fly interaction datasets proteome survey reveals modularity of the yeast cell machinery functional organization of the yeast proteome by systematic analysis of protein complexes experimental determination and system level analysis of essential genes in escherichia coli mg1655 disease gene discovery through integrative genomics similarity searching using reduced graphs a protein interaction map of drosophila melanogaster a protein interaction network links git1, an enhancer of huntingtin aggregation, to huntingtons disease co-evolution of proteins with their interaction partners analyzing protein interaction networks the human disease network mgf(3)(-) as a transition state analog of phosphoryl transfer the cath domain structure database: new protocols and classification levels give a more comprehensive resource for exploring evolution mpact: the mips protein interaction resource on yeast online mendelian inheritance in man (omim), a knowledgebase of human genes and genetic disorders prespi: a domain combination based prediction system for protein-protein interaction the gene ontology (go) database and informatics resource from molecular to modular cell biology why do hubs tend to be essential in protein networks? the hupo psis molecular interaction format -a community standard for the representation of protein interaction data modeling protein-protein and protein-dna docking systematic identification of protein complexes in saccharomyces cerevisiae by mass spectrometry towards zoomable multidimensional maps of the cell visant: data-integrating visual framework for biological networks and modules an iterative knowledge-based scoring function to predict protein-ligand interactions: ii. 
validation of the scoring function measuring genome evolution evolutionary conservation of domain-domain interactions a bayesian networks approach for predicting protein-protein interactions from genomic data lethality and centrality in protein networks development and validation of a genetic algorithm for flexible docking a quantitative protein interaction network for the erbb receptors using protein microarrays global topological features of cancer proteins in the human interactome co-evolutionary analysis of domains in interacting proteins reveals insights into domain-domain interactions mediating protein-protein interactions high-confidence prediction of global interactomes based on genome-wide coevolutionary networks huntingtin interacting proteins are genetic modifiers of neurodegeneration lead identification by virtual screaning kegg for linking genomes to life and the environment protein interactions and disease: computational approaches to uncover the etiology of diseases predicting protein domain interactions from coevolution of conserved regions relating protein pharmacology by ligand chemistry comparative evaluation of eight docking tools for docking and virtual screening accuracy intact-open source resource for molecular interaction data broadening the horizon -level 2.5 of the hupo-psi format for molecular interactions a robustness-based approach to systems-oriented drug design systematic association of genes to phenotypes by genome and literature mining analysis of genomic context: prediction of functional associations from conserved bidirectionally transcribed gene pairs fast 3d molecular superposition and similarity search in databases of flexible molecules stitch: interaction networks of chemicals and proteins a protein interaction network of the malaria parasite plasmodium falciparum a human phenome-interactome network of protein complexes implicated in genetic disorders genome-wide protein interaction maps using two-hybrid systems flexs: a 
method for fast flexible ligand superposition novel technologies for virtual screening smart 5: domains in the context of genomes and networks a protein-protein interaction network for human inherited ataxias and disorders of purkinje cell degeneration a multidimensional analysis of genes mutated in breast and colorectal cancers network-based analysis of affected biological processes in type 2 diabetes models human disease classification in the postgenomic era: a complex systems approach to human pathobiology hubs in biological interaction networks exhibit low changes in expression in experimental asthma cdd: a conserved domain database for interactive domain family analysis detecting protein function and protein-protein interactions from genome sequences a combined algorithm for genome-wide prediction of protein function a combined algorithm for genome-wide prediction of protein function gaussian docking functions pharmacophore fingerprinting. 1. application to qsar and focused library design structure, function, and evolution of transient and obligate protein-protein interactions human protein reference database-2006 update systematic discovery of analogous enzymes in thiamin biosynthesis automated docking using a lamarckian genetic algorithm and an empirical binding free energy function new developments in the interpro database systematic discovery of new recognition peptides mediating protein interaction networks integrative approach for computationally inferring protein domain interactions interdom: a database of putative interacting protein domains for validating predicted protein interactions and complexes the solution structure of the josephin domain of ataxin-3: structural determinants for molecular recognition protein interaction networks in bacteria diversity of protein-protein interactions a comprehensive map of the toll-like receptor signaling network a comprehensive pathway map of epidermal growth factor receptor signaling submit your interaction data 
the imex way: a step by step guide to trouble-free deposition the minimum information required for reporting a molecular interaction experiment (mimix) protein families and their evolution-a structural perspective the modular nature of genetic diseases use of contiguity on the chromosome to predict functional coupling a database and tool, im browser, for exploring and integrating emerging gene and protein interaction data for drosophila the mips mammalian protein-protein interaction database dima 2.0 predicted and known domain interactions a domain interaction map based on phylogenetic profiling species-specificity of the cohesin-dockerin interaction between clostridium thermocellum and clostridium cellulolyticum: prediction of specificity determinants of the dockerin domain global mapping of pharmacological space mapping protein family interactions: intramolecular and intermolecular protein family interaction repertoires in the pdb and yeast a proteome-wide protein interaction map for campylobacter jejuni identification of genomic features using microsyntenies of domains: domain teams assembly of cell regulatory systems through protein interaction domains assessing protein co-evolution in the context of the tree of life assists in the prediction of the interactome similarity of phylogenetic trees as indicator of protein-protein interaction rapid generation of high quality approximate 2-dimension molecular structures assigning protein functions by comparative genome analysis: protein phylogenetic profiles network modeling links breast cancer susceptibility and centrosome dysfunction elm server: a new resource for investigating short functional sites in modular eukaryotic proteins domine: a database of protein domain interactions an integrative approach to gain insights into the cellular function of human ataxin-2 computational analysis of human protein interaction networks docking and scoring for structure-based drug design feature trees: a new molecular similarity 
measure based on tree matching a fast flexible docking method using an incremental construction algorithm a generic protein purification method for protein complex characterization and proteome exploration inferring protein domain interactions from databases of interacting proteins corum: the comprehensive resource of mammalian protein complexes human protein-protein interaction networks and the value for drug discovery comparison of automatic three-dimensional models builders using 639 x-ray structures interaction generality, a measurement to assess the reliability of a protein-protein interaction the database of interacting proteins: 2004 update methods to reveal domain networks partial correlation coefficient between distance matrices as a new indicator of protein-protein interactions the inference of protein-protein interactions by co-evolutionary analysis is improved by excluding the information about the phylogenetic relationships functional evaluation of domain-domain interactions and human protein interaction networks reuse of structural domain-domain interactions in protein networks cytoscape: a software environment for integrated models of biomolecular interaction networks conserved patterns of protein interaction in multiple species network-based prediction of protein function novel procedure for modeling ligand/ receptor induced fit effects quantifying modularity in the evolution of biomolecular systems protein-ligand docking: current status and future challenges protein complexes and functional modules in molecular networks correlated sequence-signatures as markers of protein-protein interaction biogrid: a general repository for interaction datasets 3did: interacting protein domains of known three-dimensional structure the value of high quality protein-protein interaction networks for systems biology protein-protein interactions: analysis and prediction tools for visually exploring biological networks systematic interactome mapping and genetic 
perturbation analysis of a c. elegans tgf-b signaling network a comprehensive analysis of protein-protein interactions in saccharomyces cerevisiae herpesviral protein networks and their interaction with the human proteome from orfeomes to protein interaction maps in viruses reactome: a knowledge base of biologic pathways and processes structure, function and evolution of multidomain proteins analysis of intraviral protein-protein interactions of the sars coronavirus orfeome string 7 -recent developments in the integration and prediction of protein interactions comparative assessment of large-scale data sets of protein-protein interactions insite: a computational method for identifying protein-protein interaction binding sites on a proteome-wide scale comparative evaluation of 11 scoring functions for molecular docking analysis on multi-domain cooperation for predicting protein-protein interactions smiles, a chemical language and information system. 1. introduction and encoding rules reaching for high-hanging fruit in drug discovery at protein-protein interfaces protein-protein interaction map inference using interacting domain profile pairs analyzing cellular biochemistry in terms of molecular networks discovering disease-genes by topological features in human protein-protein interaction network a modular network model of aging evaluation of descriptors and mini-fingerprints for the identification of molecules with similar activity bits" and pieces drug-target network functional and topological characterization of protein interaction networks the importance of bottlenecks in protein networks: correlation with gene essentiality and expression dynamics the structure and function of proline recognition domains protein-protein interactions: analysis and prediction key: cord-104279-choywmwd authors: nan title: membrane protein sorting in the yeast secretory pathway: evidence that the vacuole may be the default compartment date: 1992-10-01 journal: j cell biol doi: nan 
sha: doc_id: 104279 cord_uid: choywmwd the targeting signals of two yeast integral membrane dipeptidyl aminopeptidases (dpaps), dpap b and dpap a, which reside in the vacuole and the golgi apparatus, respectively, were analyzed. no single domain of dpap b is required for delivery to the vacuolar membrane, because removal or replacement of either the cytoplasmic, transmembrane, or lumenal domain did not affect the protein's transport to the vacuole. dpap a was localized by indirect immunofluorescence to non-vacuolar, punctate structures characteristic of the yeast golgi apparatus. the 118-amino acid cytoplasmic domain of dpap a is sufficient for retention of the protein in these structures, since replacement of the cytoplasmic domain of dpap b with that of dpap a resulted in an immunolocalization pattern indistinguishable from that of wild type dpap a. overproduction of dpap a resulted in its mislocalization to the vacuole, because cells expressing high levels of dpap a exhibited vacuolar as well as golgi staining. deletion of 22 residues of the dpap a cytoplasmic domain resulted in mislocalization of the mutant protein to the vacuole. thus, the cytoplasmic domain of dpap a is both necessary and sufficient for golgi retention, and removal of the retention signal, or saturation of the retention apparatus by overproducing dpap a, resulted in transport to the vacuole. like wild type dpap b, the delivery of mutant membrane proteins to the vacuole was unaffected in the secretory vesicle-blocked sec1 mutant; thus, transport to the vacuole was not via the plasma membrane followed by endocytosis. these data are consistent with a model in which membrane proteins are delivered to the vacuole along a default pathway. many proteins that reside in the organelles of the secretory pathway of eukaryotic cells have targeting information that directs retention in or sorting to the appropriate compartment (pfeffer and rothman, 1987).
in the absence of retention or sorting signals, soluble proteins of the secretory pathway are secreted; thus, the default pathway for these proteins is secretion (burgess and kelly, 1987; pelham, 1989). in saccharomyces cerevisiae, mutations in the targeting signal of the soluble vacuolar protein, carboxypeptidase y (valls et al., 1990), or the retention signal of the soluble er protein, bip (hardwick et al., 1990), result in the secretion of these proteins. likewise, the flow of membrane proteins to the cell surface of nonpolarized mammalian cells is apparently by default, because mutations that disrupt the retention of er or golgi-retained membrane proteins (machamer et al., 1987; machamer, 1991; jackson et al., 1990) or the sorting of a lysosomal membrane protein (williams and fukuda, 1990) result in localization to the plasma membrane. little is known regarding membrane protein sorting in s. cerevisiae, although a previous study suggested that the cell surface is the default compartment for membrane proteins (fuller et al., 1989b). in this paper, we characterize the targeting of two membrane proteins of the yeast secretory pathway, the dipeptidyl aminopeptidases (dpaps) dpap a and dpap b of the golgi apparatus and vacuole, respectively. the biogenesis of two membrane proteins of the yeast vacuole, dpap b (see fig. 1 a) and alkaline phosphatase (alp), has been characterized (klionsky and emr, 1989; roberts et al., 1989). both dpap b and alp are type ii membrane glycoproteins (nomenclature of singer, 1990), consisting of nh2-terminal cytoplasmic domains of approximately 30 amino acids, single hydrophobic membrane anchors, and cooh-terminal lumenal catalytic domains. these proteins transit the early compartments of the secretory pathway (i.e., er and golgi), but not the later compartments (i.e., secretory vesicles), indicating that the proteins do not transiently reside at the plasma membrane before delivery to the vacuole.
the localization signals of these two proteins have not been identified, although the lumenal domain of alp has been shown to be unnecessary for vacuolar targeting (klionsky and emr, 1990). this paper is dedicated to the memory of david merrill stevens. the biosynthesis of several membrane proteins that reside in the yeast golgi apparatus has also been examined. in yeast, three membrane-bound proteases, kex2p, kex1p, and dpap a (see fig. 1 a), process the mating pheromone α-factor precursor polypeptide as it traverses the secretory pathway (bussey, 1988; fuller et al., 1988). the biosynthetic pathways of kex2p and kex1p have been characterized (fuller et al., 1989a,b; cooper and bussey, 1989), and kex2p has been shown to function in a late golgi compartment (julius et al., 1984; graham and emr, 1991). kex2p has been localized by indirect immunofluorescence to three to six punctate structures per cell that exhibit a somewhat random distribution within the cytoplasmic compartment (franzusoff et al., 1991; redding et al., 1991). thus, the golgi apparatus of yeast is not localized to a perinuclear location, as in mammalian cells, but rather is dispersed throughout the cell. both kex1p (a. cooper and h. bussey, manuscript submitted) and dpap a (see below) have been immunolocalized to punctate bodies that are similar to those containing kex2p in size, abundance, and distribution. in this paper, a combined gene fusion and mutational analysis was used to show that no single domain of dpap b is required for vacuolar localization. furthermore, both overproduction of dpap a and a mutation in the cytoplasmic domain of dpap a resulted in mislocalization of this protein to the vacuole. finally, the fusion proteins analyzed in this work are shown to be transported directly to the vacuole from the golgi, and not to the plasma membrane. the bacterial strain mc1061 (casadaban and cohen, 1980) was used for all subcloning steps.
oligonucleotide-directed mutagenesis was carried out in strain cj236 (kunkel et al., 1987). the yeast strains used were jhry20-1aaia3 (mata, dap2Δ::his3, ste13Δ::leu2, ura3-52, leu2-3, leu2-112, his3-Δ200, pep4-3; roberts et al., 1989), sey2012zs (matα, dap2::leu2; emr et al., 1983), cjry25-6b (mata, dap2Δ::leu2, mnn9, ura3-52, leu2-3, leu2-112), and sey-5016 (mata, sec1-1, dap2::leu2, ura3-52, leu2-3, leu2-112). yeast cultures were grown in yepd or minimal (sd) medium supplemented with the appropriate nutrients as previously described (sherman et al., 1982). for the simultaneous induction of the gal1 promoter and the sec1-1 secretion defect, cells were grown to log phase in minimal media plus raffinose and then harvested and resuspended in yep-raffinose. after a 1-h incubation at 25°C, galactose was added directly to the cultures and at the same time the cultures were shifted to 34°C. after 2 h, the cells were fixed immediately and prepared for indirect immunofluorescence as described below. oligonucleotides for mutagenesis were prepared by the university of oregon biotechnology laboratory on an applied biosystems 380b dna synthesizer (foster city, ca) as described (ito et al., 1982). tran35s-label and zymolyase 100t were from icn biomedicals (irvine, ca), endo h was from boehringer mannheim (indianapolis, in), ultra-pure sds was from bdh biochemicals (san francisco, ca), glusulase was from dupont pharmaceuticals (wilmington, de), and all antibodies (except anti-dpap b, anti-dpap a, anti-alp, and anti-vat2p antibodies) used for indirect immunofluorescence were from jackson immunoresearch (west grove, pa), cappel products (malvern, pa), or promega biotech (madison, wi). all other reagents were from sigma chemical co. (st. louis, mo). dpap and invertase assays were performed as previously described (goldstein and lampen, 1975; roberts et al., 1991). restriction endonuclease digests and ligations were performed as recommended by the suppliers.
plasmid purification, agarose gel electrophoresis, fill-in reactions of sticky-ended dna fragments using t4 dna polymerase, and dna-mediated transformation of escherichia coli were done according to standard procedures (maniatis et al., 1982). lithium acetate transformations of yeast were performed as described (ito et al., 1983). a disruption of the chromosomal ste13 locus was constructed by one-step gene disruption (rothstein, 1983), using the plasmid pslk349 (kindly provided by dr. george sprague). psl349 consists of pbr322 containing a 7.2-kbp bamhi ste13 fragment, from which a 1.6-kbp bcli fragment within the coding region of the dpap a lumenal domain (c. a. flanagan, d. a. barnes, m. c. flessel, and j. thorner, manuscript submitted for publication) was replaced by the 2.9-kbp bglii leu2 fragment. to create a strain lacking both dpap a and dpap b, a disruption of the chromosomal dap2 locus with the his3 gene was made using the plasmid pgp6, which contains the 1.2-kbp ecori-bamhi his3 fragment (sikorski and hieter, 1989) in place of the 1.3-kbp bsteii-kpni portion of the coding region of dap2 (roberts et al., 1989). the α-factor signal sequence was fused to the lumenal domain of dpap b as follows: a sali linker was inserted at the hincii site of plasmid p771, which contains a portion of the 5' region of the mfα1 gene (kurjan and herskowitz, 1982), including 70 bp of non-coding region (not including the uas) and 66 bp of coding region, including the signal sequence, signal peptidase cleavage site (waters et al., 1988), and 3 nh2-terminal residues of pro-α-factor, fused to the kre1 gene (boone et al., 1990). the acci site at position +140 of the dap2 gene was changed to a sali site using oligonucleotide-directed mutagenesis (kunkel et al., 1987). mutagenesis of dap2 was performed using the vector pcjr27, which contains the 3.3-kbp bamhi-hindiii dap2 fragment in the plasmid ks+ (stratagene, san diego, ca).
the 2.7-kbp sali-hindiii fragment, encoding the lumenal domain of dpap b, was inserted into the sali-hindiii sites of p771/sali, fusing the coding regions of mfα1 and dap2 in frame. αfss-b was placed under the control of the gal1 promoter (johnston and davis, 1984) by inserting the 2.9-kbp bamhi-hindiii fragment from this plasmid into the bamhi-hindiii sites of pcjr52, which contains the 822-bp ecori-bamhi gal1 promoter fragment inserted into ecori-bamhi sites of the cen plasmid pseyc68 (a modified version of pseyc58; emr et al., 1983). the resulting fusion protein consists of the nh2-terminal 22 residues of prepro-α-factor and two residues arising from linker sequences fused to residue 49 of the lumenal domain of dpap b (fig. 1 b; amino acid sequence nh2-mrfpsiftavlfaassala-apvgrphh . . . ). the bb-inv fusion protein expression vectors, pcjr13 and pcjr15 (2μ and cen plasmids, respectively), were constructed by inserting the 0.6-kbp bamhi-acci fragment (acci blunt ended) into the invertase fusion vectors psey304 (a derivative of the 2μ plasmid psey303; emr et al., 1986) and the cen plasmid pseyc306 (johnson et al., 1987) at the bamhi and hindiii sites (hindiii blunt ended). the resulting fusion protein contains the 48 nh2-terminal residues of dpap b fused to residue three of cytoplasmic invertase (fig. 1 b). bb-inv was placed under the control of the gal1 promoter by cutting pcjr15 with ecori and hindiii and ligating in the 822-bp gal1 fragment (johnston and davis, 1984). this fused the 3' end of the gal1 fragment at nucleotide -92 of the dap2 sequence. vectors encoding Δ20-bb and Δ27-bb were constructed as follows. oligonucleotide-directed mutagenesis (kunkel et al., 1987) was used to create 60- and 81-bp deletions in dap2 (Δ20-bb and Δ27-bb, respectively) in pcjr27. the ~3.2-kbp bamhi-hindiii mutant dap2 fragments were inserted into the bamhi-hindiii sites of the cen plasmid pseyc58, creating pmnh1 (Δ20-bb) and pcjr45 (Δ27-bb).
Δ20-bb and Δ27-bb were overproduced by replacing the dap2 promoter with the gal1 promoter as follows. the hindiii site at -92 of dap2 was blunt ended and religated, creating an nhei site. the ~3-kbp nhei-hindiii fragments of pcjr43 and pcjr44 were inserted at the xbai-hindiii sites of pcjr52, creating pcjr56 and pcjr54, respectively. the Δ20 and Δ27 deletions removed residues 2-21 and 2-27 of dpap b, changing the nh2-terminal amino acid sequence of dpap b from nh2-meggeeeveripdelfdtkkkhlldklirv30 to nh2-mhlldklirv and nh2-mirv, respectively (fig. 1 b). the 2μ plasmid encoding dpap a, pcjr46, was constructed as follows: the 5.9-kbp xbai-bamhi fragment, containing the ste13 gene (c. a. flanagan, d. a. barnes, m. c. flessel, and j. thorner, manuscript submitted for publication), was isolated from the plasmid p13-3 (julius et al., 1983) and inserted into puc13. the 5.9-kbp sali-bamhi fragment from this vector was inserted into the xhoi-bglii sites of the 2μ plasmid pckr201 (c. raymond and t. stevens, unpublished results). the cen plasmid pcjr78 was made by inserting the 4.5-kbp eagi-pvuii ste13 fragment into the eagi-ecorv sites of prs316 (sikorski and hieter, 1989). the plasmid pcjr64, encoding the fusion protein aa-b, was constructed by inserting the 2.5-kbp bamhi-mlui (mlui blunt ended) fragment from p13-3, encoding the ste13 promoter and the cytoplasmic and transmembrane domains of dpap a, and the 2.7-kbp sali-hindiii fragment of dap2 (sali blunt ended), encoding the lumenal domain of dpap b, into the bamhi-hindiii sites of pseyc58. the 5.2-kbp bamhi-hindiii fragment of the resulting vector (pcjr41) was cloned into the bamhi-hindiii sites of the 2μ plasmid, psey18 (emr et al., 1986). the resulting plasmid encodes aa-b, which consists of the nh2-terminal 150 residues of dpap a fused to residue 47 of dpap b (fig. 1 c).
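the effect of the two cytoplasmic-domain deletions on the nh2-terminal sequence of dpap b can be reproduced with a few lines of python. this is only an illustrative sketch: the function name `delete_residues` is hypothetical, and the only inputs are the 30-residue sequence and the residue ranges (2-21 and 2-27) quoted in the text above.

```python
# Illustrative sketch only; the helper name is hypothetical.
# Residue numbering is 1-based and inclusive, as in the text.

# NH2-terminal 30 residues of DPAP B as quoted in the methods.
DPAP_B_NTERM = "MEGGEEEVERIPDELFDTKKKHLLDKLIRV"


def delete_residues(seq: str, first: int, last: int) -> str:
    """Return seq with residues first..last (1-based, inclusive) removed."""
    if not 1 <= first <= last <= len(seq):
        raise ValueError("deletion range lies outside the sequence")
    return seq[: first - 1] + seq[last:]


# Residue ranges stated in the text for the two deletion variants.
delta20 = delete_residues(DPAP_B_NTERM, 2, 21)
delta27 = delete_residues(DPAP_B_NTERM, 2, 27)

print(delta20)  # MHLLDKLIRV
print(delta27)  # MIRV
```

both results match the nh2-terminal sequences given in the text for the deletion variants (nh2-mhlldklirv and nh2-mirv).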
for the construction of the fusion protein a-bb, xbai sites were created in both the dap2 and ste13 genes just upstream of the coding regions of the transmembrane domains of dpap b and dpap a, respectively, using oligonucleotide-directed mutagenesis (kunkel et al., 1987). nucleotides 69 and 70 of the dap2 gene were changed from gt to tc (roberts et al., 1989), and nucleotides 342-344 of ste13 were changed from gcc to aga (with the a of the initiation codon as 1) (c. a. flanagan, d. a. barnes, m. c. flessel, and j. thorner, manuscript submitted for publication). a 1.1-kbp saci-xbai fragment (encoding the ste13 promoter and the cytoplasmic domain of dpap a) and a 2.8-kbp xbai-hindiii dap2 fragment (encoding the transmembrane and lumenal domains of dpap b) were ligated into the saci-hindiii sites of psey18. the resulting plasmid (pcjr67) encodes a protein consisting of 113 residues of the nh2-terminal cytoplasmic domain of dpap a fused to amino acid 24 of dpap b (fig. 1 c). the fusion protein b-a-b was constructed as follows: the 3.3-kbp bamhi-hindiii dap2 fragment, including the xbai site at +70, was ligated into the bamhi-hindiii sites of pseyc58. a 4.3-kbp xbai-hindiii fragment of ste13, encoding the transmembrane and lumenal domains of dpap a, was inserted into the xbai-hindiii sites, resulting in the plasmid pcjr58. the 2.7-kbp sali-hindiii dap2 fragment (sali blunt ended), encoding the dpap b lumenal domain, was inserted into the mlui-hindiii sites (mlui blunt ended) of pcjr58, resulting in pcjr69, a cen plasmid encoding b-a-b. the 4.8-kbp bamhi-hindiii fragment from pcjr69 was inserted into the same sites of psey18, resulting in pcjr70. b-a-b was placed under the control of the gal1 promoter by cloning the nhei-hindiii b-a-b fragment into the xbai-hindiii sites of pcjr52, creating pcjr79. the b-a-b fusion protein consists of residues 116-150 of dpap a in place of residues 26-46 of dpap b (fig. 1 c).
oligonucleotide-directed mutagenesis of the portion of ste13 encoding the 118 amino acid cytoplasmic domain of dpap a (fig. 1 a) was performed in pcjr71, which consists of the 0.65-kbp eagi-psti fragment of ste13 inserted into the eagi-psti sites of ks+. a 66-bp in-frame deletion, removing the amino acids 85-106 (Δ22), was created, and the saci-mlui fragment from this plasmid was inserted into the saci-mlui sites of pcjr64, creating the 2μ plasmid psn58, which encodes Δ22-aa-b (fig. 1 c). the same saci-mlui fragment was ligated with the 4-kbp mlui-hindiii fragment (encoding the lumenal domain of dpap a) into the saci-hindiii sites of psey18, creating psn59, which encodes Δ22-aaa (fig. 1 c). Δ22-aa-b was placed under the control of the gal1 promoter by fusing a saci-eagi fragment from pcjr125 (contains the 360-bp hindiii-ecori gal1 fragment in pgem-5zf) to an eagi-hindiii fragment from psn58, both of which were ligated into the saci-hindiii sites of psey18, creating psn89. the fus1-laczp expression plasmid pcjr114 was constructed as follows: a 6-kb nhei-hindiii fragment, encoding the 254 nh2-terminal amino acids of the fus1 protein fused to β-galactosidase, under the gal1 promoter (isolated from pcjr113, a derivative of psb231; trueheart and fink, 1989) was cloned into the xbai-hindiii sites of pvt105u, a 2μ plasmid (vernet et al., 1987), creating pcjr114. for the production of dpap a antigen to be used to generate dpap a antiserum, the plasmid pcjr24 was constructed by inserting the 0.8-kbp mlui-hpai (mlui blunt ended) ste13 fragment into the smai site of pexp1, an e. coli expression vector containing the tac promoter just upstream of the translational start codon of t4 lysozyme and a multiple cloning site (raymond et al., 1990). a 27 amino acid peptide, corresponding to the 26 nh2-terminal residues of dpap b followed by a cooh-terminal cysteine residue, was synthesized on an applied biosystems peptide synthesizer.
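the arithmetic behind the 66-bp in-frame deletion described above (amino acids 85-106 of dpap a, Δ22) can be checked with a short python sketch. the function names are hypothetical and serve only to show that an inclusive residue range of n codons corresponds to 3n base pairs, and that a deletion preserves the reading frame when its length is a multiple of three.

```python
# Illustrative sketch only; function names are hypothetical.

def residues_removed(first_residue: int, last_residue: int) -> int:
    """Number of residues in an inclusive, 1-based residue range."""
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("invalid residue range")
    return last_residue - first_residue + 1


def deletion_length_bp(first_residue: int, last_residue: int) -> int:
    """Base pairs removed when whole codons first..last are deleted."""
    return 3 * residues_removed(first_residue, last_residue)


def is_in_frame(deletion_bp: int) -> bool:
    """A deletion keeps the reading frame if it removes whole codons."""
    return deletion_bp % 3 == 0


n_res = residues_removed(85, 106)   # residue range quoted in the text
n_bp = deletion_length_bp(85, 106)
print(n_res, n_bp, is_in_frame(n_bp))  # 22 66 True
```

the result (22 residues, 66 bp, in frame) agrees with the Δ22 designation and the deletion size stated in the text.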
the peptide was coupled to carboxymethylated bsa through the cooh-terminal cysteine with the bifunctional crosslinking agent, mbs, following the manufacturer's recommendations (pierce chemical co., rockford, il), and through lysine residues with glutaraldehyde as previously described (kagen and glick, 1979). a 1:1 mixture of bsa-peptide conjugates prepared by the two methods was used to immunize rabbits as described previously (vaitukaitis, 1981). for affinity purification of the antibody, the peptide was cross-linked to tresyl-activated sepharose 4b (pharmacia fine chemicals), and affinity purification was carried out as described (raymond et al., 1990). dpap a antigen was produced in e. coli cells containing the plasmid pcjr24. induction with iptg resulted in the production of a 31-kd protein that corresponds to the nh2-terminal portion of the lumenal domain of dpap a (see fig. 6) fused to seven residues of t4 lysozyme and seven residues encoded by the pexp1 polylinker. antigen purification was performed as described previously (raymond et al., 1990). immunoprecipitations were performed by growing cultures to log phase in supplemented minimal media lacking methionine and cysteine, then pulse labeling in the same media with tran35s-label and chasing by adding 50 μg/ml methionine and 50 μg/ml cysteine, followed by the addition of sodium azide to 10 mm. the cells were converted to spheroplasts (stevens et al., 1986), which were lysed in 1% sds, 8 m urea plus a protease inhibitor cocktail (0.5 mm pmsf, 1 μg/ml leupeptin, and 1 μg/ml pepstatin) for 5 min at 100°c and adjusted to 1 ml in ip buffer (pbs, 0.1% sds, 0.1% triton x-100, 1 mm edta). for precipitating αfss-b from the extracellular fractions, the medium was supplemented with 2 mg/ml bsa and 50 mm potassium phosphate, ph 5.7. after spheroplasting, the periplasmic and media fractions were pooled and adjusted to 1 ml in ip buffer plus protease inhibitors.
after pre-adsorption to 0.5% iggsorb, anti-dpap b cooh-terminal antibody (roberts et al., 1989) or anti-dpap a antibody was added, and samples were incubated for 1 h on ice. iggsorb was added to 0.5%, followed by 1 h on ice, and the immune complexes were precipitated and washed twice with ip buffer. the immune complexes were solubilized, and half of the samples were treated with endoglycosidase h (endo h) overnight at 37°c as described (orlean et al., 1991). samples were analyzed by sds-page and fluorography as described previously (stevens et al., 1986). the secretion of αfss-b was quantified using an ambis radioanalytic imaging system (ambis systems, inc., san diego, ca). the fractionation of membranes in the presence of high ph sodium carbonate was performed as described (roberts et al., 1989). preparation of fixed, spheroplasted cells for indirect immunofluorescence was carried out essentially as previously described (roberts et al., 1991), except that the fixed spheroplasts were treated with 1% sds for 1-5 min. antibody adsorption against fixed spheroplasts harboring null mutations in either dap2 or ste13 was performed as described elsewhere (raymond et al., 1990; roberts et al., 1991). the fixed spheroplasts were stained with a 1:10 dilution of adsorbed anti-dpap a or anti-dpap b affinity-purified antibody in pbs-bsa (roberts et al., 1991). for co-localization with the 60-kd subunit of the vacuolar h+-atpase (vat2p, the product of the vat2 gene; yamashiro et al., 1990), the antibody solution also contained a 1:10 dilution of the mab, 13d11 (kane et al., 1992). the dpap b or dpap a staining pattern was amplified by subsequent incubations with goat anti-rabbit antibody conjugated to biotin, followed by a streptavidin-fitc conjugate. the last antibody incubation also contained goat anti-rabbit antibody conjugated to rhodamine.
co-detection of alp and fus1-laczp was performed by staining cells with a 1:10 dilution of a rabbit polyclonal anti-alp antibody (raymond et al., 1990) and a 1:1,000 dilution of a mouse monoclonal anti-β-galactosidase antibody, followed by antibody amplification identical to that used for dpap a and dpap b. the cells were mounted in media containing dapi for staining nuclei, and photomicrographs were made as described previously (roberts et al., 1989). the role of the lumenal domain in the sorting of dpap b to the vacuole was addressed in two ways. first, to determine if the lumenal domain was sorted to the vacuole when expressed in the secretory pathway as a soluble protein, a gene fusion was used to create the protein αfss-b, consisting of the nh2-terminal er-targeting signal sequence of prepro-α-factor fused to the lumenal domain of dpap b at residue 49 (fig. 1 b). the signal sequence should direct the translocation of the protein into the er lumen and be cleaved, rendering the lumenal domain a soluble protein in the secretory pathway.

figure 2. immunoprecipitations of αfss-b from MNN9 and mnn9 strains. jhry20-1a (dap2Δ MNN9) or cjry25-6b (dap2Δ mnn9) cells were labeled with tran35s-label for 15 min and chased for 60 min in the presence of 50 μg/ml methionine and 50 μg/ml cysteine. cultures were separated into two fractions, intracellular (i.e., spheroplasts) and extracellular (i.e., periplasmic and media), and immunoprecipitated with affinity purified dpap b antibody. half of each immunoprecipitation sample was treated with endo h, and equal amounts were analyzed by sds-page and fluorography. positions of the molecular weight standards are indicated (in kd).

the construct was analyzed in a strain which contains a null allele of dap2, the structural gene for dpap b (suarez rendueles and wolf, 1987). fig.
2 shows the results of immunoprecipitating αfss-b from intracellular and extracellular (i.e., combined periplasmic and medium) fractions of 35s-labeled cells using an antibody that recognizes the cooh-terminal half of dpap b (roberts et al., 1989). analysis of the samples by sds-page and fluorography showed that 64% of αfss-b was secreted as a heterogeneous population of highly glycosylated species (fig. 2, lane 2), similar to the secreted protein invertase (esmon et al., 1981), whereas the intracellular fraction contained a tightly migrating species (fig. 2, lane 1). treatment of the immunoprecipitates with endo h to remove n-linked carbohydrate demonstrated that the difference in the apparent mobilities was due to glycosylation (fig. 2, lanes 3 and 4). the glycosylation pattern of the secreted material differs from that of wild type dpap b, which receives only modest glycosyl modifications of the core oligosaccharides in the golgi apparatus (roberts et al., 1989). to test whether the alteration in glycosylation caused the secretion of the lumenal domain, αfss-b was expressed in an mnn9 mutant, which is deficient in the addition of the extensive α1,6 outer chain glycosyl groups (kukuruzinska et al., 1987). fig. 2, lanes 5 and 6, show that αfss-b was secreted to the same extent (59%) from mnn9 cells, even though the protein was not aberrantly glycosylated. indirect immunofluorescence microscopy showed that the portion of the lumenal domain that remained intracellular was retained in the er, and no staining of the vacuole was observed (data not shown). unlike the secreted material, the er-retained material was enzymatically inactive (data not shown), and thus may be unfolded and incompetent to exit the er (rose et al., 1989; rothman, 1989).

figure 3. ... (suc2-Δ9) cells containing a plasmid encoding bb-inv (pcjr13) were labeled with tran35s-label for 30 min and chased for 60 min in the presence of 50 μg/ml methionine and 50 μg/ml cysteine. the cells were converted to spheroplasts, and extracts immunoprecipitated with either affinity-purified dpap b antibody (lanes 1-8) or affinity-purified invertase antibody (lanes 9 and 10). half of each immunoprecipitation sample was treated with endo h, and equal amounts were analyzed by sds-page and fluorography. positions of the molecular mass standards for lanes 1-8 are indicated (in kd) on the left, whereas standards for lanes 9 and 10 are indicated on the right.

the role of the lumenal domain of dpap b in vacuolar targeting was also tested by constructing the fusion protein bb-inv, in which the 48 nh2-terminal residues of dpap b were fused to the non-vacuolar protein, invertase (fig. 1 b). immunoprecipitations of bb-inv using anti-invertase antibody, followed by sds-page and fluorography, showed that the fusion protein was glycosylated, and treatment with endo h showed that the protein was of the expected size (fig. 3, lanes 9 and 10). bb-inv fractionated with membranes under high ph carbonate conditions, consistent with bb-inv being an integral membrane protein (data not shown). invertase enzyme assays of either permeabilized or non-permeabilized whole cells deleted for the invertase structural gene and expressing bb-inv from a cen plasmid showed that <2% of the total invertase activity was extracellular. the subcellular localization of bb-inv was determined by indirect immunofluorescence microscopy using an antibody that recognizes the cytoplasmic domain of dpap b (see materials and methods). fig. 4 shows the staining pattern of wild type dpap b (fig. 4, a-c) and bb-inv (fig. 4, g-i) when expressed from the high copy number (2μ) plasmids, pgp3 (roberts et al., 1989) and pcjr13, respectively.
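the comparison of permeabilized and non-permeabilized cells used above to estimate extracellular invertase activity can be sketched as a simple ratio. this is a minimal illustration under stated assumptions, not the authors' protocol: it assumes that activity measured in non-permeabilized cells reflects enzyme accessible from outside the cell, that activity in permeabilized cells reflects total enzyme, and the numerical values are hypothetical, chosen only to fall below the <2% figure reported in the text.

```python
# Illustrative sketch only. Assumptions (not from the source):
#   - activity in non-permeabilized cells = extracellular enzyme
#   - activity in permeabilized cells     = total enzyme
#   - the example numbers are hypothetical

def percent_extracellular(intact_activity: float,
                          permeabilized_activity: float) -> float:
    """Percent of total activity detected without permeabilization."""
    if permeabilized_activity <= 0:
        raise ValueError("total activity must be positive")
    if intact_activity < 0:
        raise ValueError("activity cannot be negative")
    return 100.0 * intact_activity / permeabilized_activity


def percent_intracellular(intact_activity: float,
                          permeabilized_activity: float) -> float:
    """Remainder of the total activity, taken to be inside the cell."""
    return 100.0 - percent_extracellular(intact_activity,
                                         permeabilized_activity)


# Hypothetical activities in arbitrary units.
extracellular = percent_extracellular(1.5, 100.0)
intracellular = percent_intracellular(1.5, 100.0)
print(f"{extracellular:.1f}% extracellular, {intracellular:.1f}% intracellular")
```

the same ratio applies to the dpap activity assays reported for the other constructs, with the caveat raised in the text that a small degree of cell lysis during the assay would inflate the apparent extracellular fraction.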
for these and the other 2μ plasmid constructs used in this study, the proteins were overproduced 10-20-fold as determined by enzyme activity. both dpap b and bb-inv were localized to the vacuolar membrane as judged by differential interference contrast (nomarski) optics and co-localization with a marker for the vacuolar membrane, the 60-kd subunit of the vacuolar h+-atpase (vat2p; yamashiro et al., 1990; table i). dpap b and bb-inv were also localized to the vacuolar membrane when expressed from a single copy (cen) plasmid (data not shown). aside from the differences in signal intensity, no difference in subcellular localization was observed when bb-inv was expressed from cen or 2μ plasmids. the results from the αfss-b and bb-inv fusions indicate that, similar to alp (klionsky and emr, 1990), the lumenal domain of dpap b is neither necessary nor sufficient for vacuolar targeting. the role of the transmembrane domain in the vacuolar targeting of dpap b was tested by constructing the fusion protein b-a-b (fig. 1 b), in which the membrane anchor of dpap b was replaced by that of the non-vacuolar membrane protein, dpap a. the role of the cytoplasmic domain of dpap b was tested by constructing in-frame deletions in this domain using oligonucleotide-directed mutagenesis of the dap2 gene. two deletion variants of dpap b were constructed, Δ20-bb and Δ27-bb, in which 20 and 27 amino acids, respectively, were removed from the 29 residue cytoplasmic domain (fig. 1 b; see materials and methods). immunoprecipitations of b-a-b, Δ20-bb, and Δ27-bb from 35s-labeled cells, followed by sds-page and fluorography, showed that the mutant proteins were glycosylated, and that the deglycosylated proteins were of the predicted size (fig. 3, lanes 3-8). as with bb-inv, these proteins behaved as integral membrane proteins in high ph carbonate fractionation experiments (data not shown).
dpap activity assays of cells expressing b-a-b, Δ20-bb, and Δ27-bb showed that the proteins were fully enzymatically active, and that 88-98% of the total activity was intracellular. indirect immunofluorescence microscopy of cells expressing b-a-b, Δ20-bb, and Δ27-bb from either 2μ plasmids (fig. 5, c-h) or cen plasmids, using an antibody that recognizes the cooh-terminal half of dpap b, showed that each of the mutant proteins was predominantly localized to the vacuolar membrane as determined by nomarski optics (fig. 5) and co-localization with vat2p (data not shown). no staining of the plasma membrane was observed for any of these proteins. thus, the cytoplasmic and transmembrane domains of dpap b are not necessary for targeting to the vacuolar membrane. a significant fraction of cells expressing Δ20-bb and Δ27-bb also showed some er localization, as judged by staining of the perinuclear space and long cisternal compartments (table i; rose et al., 1989), whereas b-a-b showed only vacuolar labeling. the Δ20-bb and Δ27-bb cells in fig. 5 show predominantly vacuolar staining. in many cases, the plane of focus had to be adjusted to observe staining of both vacuolar and er structures within a given cell. the increased residence time of Δ20-bb and Δ27-bb in the er was corroborated by immunoblotting analysis, which showed that a small amount of the er forms of Δ20-bb and Δ27-bb was seen in the steady state, along with the golgi-modified forms, whereas only the golgi-modified form was seen for wild type dpap b (data not shown). the increased er retention of Δ20-bb and Δ27-bb could be due to an impaired ability to achieve a native conformation competent for exiting the er, as has been observed with mutant membrane proteins in mammalian cells (gething et al., 1986; doms et al., 1988).
upon exiting the er, however, Δ20-bb and Δ27-bb were transported through the golgi complex to the vacuole, indicating that the cytoplasmic domain of dpap b is not necessary for vacuolar localization. these results, combined with those from the analysis of αfss-b, bb-inv, and b-a-b, demonstrate that, aside from a membrane anchor, no single domain of dpap b is required for transport to the vacuole.

table i footnotes:
* for each construct, the percentages refer to the fraction of stained cells that exhibited staining of a particular organelle. depending on the protein being monitored, some cells showed staining of more than one class of organelle; thus, the percentages for a given protein may add up to more than 100.
§ vacuolar localization was determined both by coincidence of staining with the vacuole membrane as determined by nomarski optics, and by co-localization with vat2p.
‖ golgi localization was defined as punctate, non-vacuolar, and non-er staining, characteristic of dpap a, kex2p, and kex1p (cooper and bussey, manuscript submitted for publication).
** er localization was determined by staining of perinuclear (as determined by dapi staining of nuclei) and extended cisternal structures.

several different models can explain the data presented above (see discussion), including a simple model in which the vacuolar membrane, not the plasma membrane, is the default compartment for membrane proteins of the yeast secretory pathway. to distinguish among these models, we analyzed the retention signal of the golgi membrane protein, dpap a. mutations in the retention signal of this protein should result in dpap a becoming localized to the default compartment for membrane proteins. dpap a is a protein that resides in the golgi apparatus, where it processes the α-factor precursor polypeptide (julius et al., 1983). the structural gene for dpap a (ste13) has been cloned and sequenced (julius et al., 1983; c. a. flanagan, d. a. barnes, m. c. flessel, and j.
thorner, manuscript submitted for publication), and the predicted structure of dpap a is similar to that of dpap b in several regards. both are type ii integral membrane proteins, and both have lumenal enzymatic domains of ~800 residues that share a high degree of sequence identity, including 48% identical residues over the cooh-terminal 250 residues (fig. 1 a); however, there is no significant sequence similarity between the cytoplasmic and transmembrane domains of dpap a and dpap b. to assess the subcellular distribution of dpap a, an antibody specific for dpap a was generated (see materials and methods). the specificity of the antibody is demonstrated in fig. 6. immunoprecipitation of dpap a was carried out with 35s-labeled cells that varied with regard to the dosage of the ste13 gene. sds-page and fluorography showed that the antibody immunoprecipitated a protein of 112 kd from wild type cells but not from ste13Δ cells, and that this polypeptide was overproduced ~20-fold in a strain containing the ste13 gene on a 2μ plasmid (fig. 6 a). a protein of ~40 kd was also immunoprecipitated; however, the level of the 40-kd protein did not vary with the dosage of ste13. the ste13 dna sequence predicts three possible sites for addition of n-linked carbohydrate (c. a. flanagan, d. a. barnes, m. c. flessel, and j. thorner, manuscript submitted for publication). treatment of dpap a with endo h resulted in a ~5-kd decrease in apparent molecular weight (fig. 6 b), suggesting that at least two of the three asn-x-ser/thr sites of dpap a were modified, assuming that the core oligosaccharides added were of the typical structure (kukuruzinska et al., 1987), and that the carbohydrate moieties were only slightly modified in the golgi apparatus, as is the case for the glycosyl groups of kex2p (fuller et al., 1989a,b) and kex1p (cooper and bussey, 1989).
the apparent molecular weight of deglycosylated dpap a, 107 kd, is consistent with the molecular weight of dpap a predicted from the ste13 dna sequence, 107,200 d (c. a. flanagan, d. a. barnes, m. c. flessel, and j. thorner, manuscript submitted for publication). indirect immunofluorescence microscopy of cells containing the ste13 gene on a 2μ plasmid showed that dpap a was localized to several non-vacuolar punctate patches dispersed throughout the cell, as judged by nomarski optics and in double staining experiments with vat2p antibody (fig. 7, d-u). this signal was absent in a ste13Δ strain (fig. 7, a-c). this staining pattern is typical of the golgi apparatus, as determined by the localization of kex2p and kex1p (a. cooper and h. bussey, manuscript submitted for publication). attempts to localize dpap a in cells containing only the chromosomal copy of ste13 were unsuccessful due to the low abundance of the protein. because the copy number of yeast 2μ plasmids can vary from 10-40 copies/cell within a population (rose and broach, 1991), the fluorescence signal corresponding to dpap a when expressed from a 2μ plasmid also varied from cell to cell, ranging from a weak signal to a very strong signal, whereas the intensity of the vat2p signal was consistent from cell to cell. all of the cells that showed a signal with the dpap a antibody displayed a golgi-staining pattern (fig. 7, d-u; table i); however, cells exhibiting a very strong signal, presumably due to a high dosage of the ste13 gene, also showed vacuolar staining (5% of the cells of a pep4-3 strain; table i). an example of this is shown in fig. 7, (~ s, and u). a single cell exhibiting a strong dpap a signal is shown, and the signal stains both vacuolar and punctate non-vacuolar structures, as judged by co-localization with vat2p.
the percentage of cells showing vacuolar localization of dpap a decreased significantly in an otherwise isogenic PEP4 strain, which contains the full complement of vacuolar proteases (1%; table i). thus, cells producing high levels of dpap a showed mislocalization of the protein to the vacuolar membrane, with the mislocalized protein degraded in a vacuolar protease-dependent fashion. dpap enzyme assays on permeabilized and non-permeabilized cells expressing dpap a from either cen or 2μ plasmids showed that ~88% of the total activity was intracellular; thus, it is possible that a small percentage of dpap a is present at the cell surface, even though no plasma membrane staining was detected by indirect immunofluorescence. however, the apparent extracellular dpap a enzyme activity could in part be due to a small degree of cell lysis during the assay.

the journal of cell biology, volume 119, 1992

to determine whether a specific domain of dpap a contained the signal for golgi retention, the fusion proteins aa-b and a-bb were constructed (fig. 1 c), in which the lumenal domain of dpap a was exchanged for that of dpap b (aa-b), and the cytoplasmic domain of dpap b was exchanged for that of dpap a (a-bb). immunoprecipitation of aa-b and a-bb from 35s-labeled cells, followed by sds-page and fluorography, showed that both fusion proteins were glycosylated, and treatment of the immunoprecipitates with endo h showed that the proteins were of the expected size (fig. 6 b). the broadly migrating species observed for aa-b and a-bb indicate that the proteins are more extensively modified in the golgi apparatus than dpap a. dpap activity assays on permeabilized or non-permeabilized whole cells demonstrated that the aa-b and a-bb proteins were enzymatically active, and that ~80% of the activity was intracellular. a-bb fractionated with membranes in the presence of high ph sodium carbonate, and thus behaves biochemically as an integral membrane protein (data not shown).
indirect immunofluorescence microscopy using anti-dpap b lumenal domain antibody showed that aa-b and a-bb exhibited non-vacuolar punctate staining patterns indistinguishable from wild type dpap a (fig. 8, a-f). again, no plasma membrane staining was detected. as with dpap a, all cells that showed a signal displayed a punctate staining pattern, and a subset of those cells that showed a strong signal also exhibited vacuolar localization (table i). an example is shown in fig. 8 (a-c) for cells expressing aa-b. the cell in the bottom right corner shows a very intense fluorescein signal relative to the other cells in the panel, and clearly exhibits co-localization with vat2p. thus, as was seen for wild type dpap a, overproduction of these proteins resulted in some mislocalization to the vacuole. these experiments demonstrate that the cytoplasmic domain of dpap a is sufficient for the retention of an otherwise vacuolar membrane protein in punctate structures typical of the golgi apparatus in s. cerevisiae. the cytoplasmic domain of dpap a also has been shown to be sufficient for golgi retention when fused to the transmembrane and lumenal domains of the vacuolar membrane protein alp (s. nothwehr and t. stevens, unpublished data). to determine if the cytoplasmic domain of dpap a was also necessary for the retention of dpap a in the golgi, a series of in-frame deletion mutations in the cytoplasmic domain were generated using oligonucleotide-directed mutagenesis (see materials and methods). deletion variants that lacked (fig. 9), although both proteins also showed a small amount of er staining (table i). immunoblotting analysis of crude extracts prepared from cells expressing Δ22-aa-b showed a broadly migrating species on sds-page similar to aa-b and a-bb (data not shown; fig. 6 b), suggesting that the Δ22-aa-b protein transits the golgi apparatus before its transport to the vacuole. similar to the deletions in the cytoplasmic domain of dpap b, this deletion delays exit of the fusion proteins from the er; however, upon exit from the er, the enzymatically active Δ22-aaa and Δ22-aa-b fusion proteins were transported to the vacuole. thus, the cytoplasmic domain is both necessary and sufficient for the retention of dpap a in the golgi apparatus, and the signal for the retention of dpap a maps to a 22-amino acid segment within the 118-residue cytoplasmic domain.

figure 8. immunolocalization of aa-b and a-bb. jhry20-1a (ste13Δ dap2Δ) cells containing plasmids expressing either aa-b (pcjr64, a-c) or a-bb (pcjr67, d-f) were fixed, spheroplasted, and stained with dpap a antibody and vat2p mab. the cells were viewed by nomarski optics (a and d) and by epifluorescence using filter sets specific for fluorescein (b and e) and rhodamine (c and f) fluorescence.

we have previously shown that the transport of dpap b to the vacuole does not involve delivery to the plasma membrane followed by endocytic targeting to the vacuole (roberts et al., 1989). this was demonstrated by expressing dpap b in a sec1 mutant, which at 34°c is blocked at a late stage of the secretory pathway and accumulates secretory vesicles (salminen et al., 1987). to address the possibility that the mutant constructs analyzed in this study were mislocalized to the plasma membrane followed by rapid endocytic uptake to the vacuole, indirect immunofluorescence experiments were performed on bb-inv, b-a-b, Δ20-bb, and Δ22-aa-b expressed in a sec1 strain at 34°c. as a positive control for accumulation in secretory vesicles, the localization of the fusion protein fus1-laczp was analyzed (trueheart et al., 1987). the fus1 gene product is required for the breakdown of the cell walls after the fusion of a and α cells during conjugation (mccaffrey et al., 1987; trueheart et al., 1987), and has been shown by protease treatment of whole yeast cells to be a plasma membrane protein (trueheart and fink, 1989).
the fusion protein fus1-laczp, consisting of the 254 nh2-terminal amino acids of fus1p fused to the lacz gene product (β-galactosidase), has been shown by immunofluorescence microscopy to be localized to the plasma membrane after induction during mating (trueheart et al., 1987). for these experiments, the expression of these proteins was under the control of the inducible gal1 promoter (see materials and methods; johnston and davis, 1984). none of fus1-laczp, bb-inv, b-a-b, or a20-bb showed any signal before addition of galactose (data not shown). fig. 10 (a-f) shows that fus1-laczp was localized to the plasma membrane in a sec+ strain after 2 h of induction at 34°c but accumulated intracellularly, presumably in secretory vesicles, in a sec1 strain under the same conditions. no colocalization with the vacuolar marker, alp, was detected. typically, fus1-laczp accumulated in a large patch in the bud and bud neck region of budded cells, and near the cell periphery of unbudded cells. bb-inv, b-a-b, and a20-bb were each localized to the vacuole under these conditions (e.g., fig. 10, g-l; table ii), as defined by co-localization with vat2p.
figure 10. immunolocalization in sec+ and sec1 strains. jhry20-1a (sec+) and sey-5016 (sec1) cells containing plasmids encoding fus1-laczp (pcjr114) or b-a-b (pcjr79) under the control of the gal1 promoter were grown to log phase in media containing raffinose, then shifted to 34°c at the same time that galactose was added to the cultures. after 2 h, the cells were fixed, spheroplasted, and prepared for immunofluorescence. cells expressing fus1-laczp were stained with anti-β-galactosidase mab and anti-alp polyclonal antibody, and were viewed using filter sets specific for fluorescein and rhodamine fluorescence, respectively. cells expressing b-a-b were stained as in fig. 5.
the intracellular, non-vacuolar signal occasionally observed for b-a-b (23%; table ii) was different from that of fus1-laczp, in that the signal looked reminiscent of the golgi signal observed for dpap a (data not shown). this was also observed for b-a-b in sec+ cells at 34°c, suggesting that the kinetics of delivery of this protein are slowed at elevated temperatures. the analysis of a22-aa-b in the sec1 strain was complicated by the fact that expression of the protein was leaky; ~50% of the cells in the population showed a weak vacuolar staining pattern before galactose addition. however, addition of galactose to the culture after shifting the cells to 34°c resulted in ~90% of the cells exhibiting an intense vacuolar signal (data not shown). these data show that the bb-inv, b-a-b, a20-bb, and a22-aa-b proteins are targeted directly from the golgi apparatus to the vacuole without transient appearance at the cell surface; thus, these proteins follow the same route to the vacuole as wild type vacuolar membrane proteins (roberts et al., 1989). the search for the vacuolar localization determinant of dpap b led to the surprising conclusion that neither the cytoplasmic, transmembrane, nor lumenal domain of the protein was necessary for vacuolar delivery. analysis of the golgi retention signal of dpap a led to the equally surprising observations that both overproduction of the protein and mutations in the cytoplasmic domain resulted in mislocalization of dpap a to the vacuole, not the plasma membrane. these results were unexpected, given that the targeting of soluble vacuolar proteins in yeast and of soluble and membrane proteins of the lysosomes of animal cells requires targeting information to prevent delivery to the cell surface (valls et al., 1990; kornfeld and mellman, 1989; williams and fukuda, 1990; machamer, 1991). the fusion proteins analyzed in this study are appropriate tools for these experiments for the following reasons.
all of the proteins were stable, enzymatically active, membrane-bound, and glycosylated when expressed in yeast, indicating that they have the correct membrane topology. in addition, all of the fusion proteins containing the dpap b lumenal domain, i.e., a20-bb, a27-bb, b-a-b, and a22-aa-b, were transported to the vacuole after receiving glycosyl modifications in the golgi apparatus. finally, each of these fusion proteins was transported to the vacuole directly from the golgi apparatus, without prior delivery to the plasma membrane followed by endocytic uptake to the vacuole. other golgi membrane proteins have also been shown to be mislocalized to the vacuole rather than the plasma membrane. both mutations in the cytoplasmic domain of kex1p and its overproduction result in mislocalization of the protein to the vacuolar membrane (a. cooper and h. bussey, manuscript submitted for publication). similarly, a single amino acid change in the cytoplasmic domain of kex2p results in its transport to the vacuole (c. wilcox, k. redding, r. wright, and r. fuller, manuscript submitted for publication). whereas this result might appear to conflict with earlier published data on kex2p, more recent analysis indicates that membrane-bound forms of kex2p that fail to be retained in the golgi are found in the vacuole rather than the plasma membrane, when these proteins are expressed at wild type or modestly elevated levels (c. wilcox, k. redding, r. wright, and r. fuller, manuscript submitted for publication). given that several golgi membrane proteins all show missorting to the vacuole when either mutant or overproduced, the model we favor to explain these data is the vacuolar default model, which states that vacuolar membrane proteins do not require sorting information, because the default pathway for membrane proteins of the secretory pathway leads to the vacuole.
alternatively, one could argue that, given the similarity between dpap b and dpap a, both proteins could contain vacuolar targeting information in their transmembrane domains, even though no significant similarity is apparent from the amino acid sequence data (c. a. flanagan, d. a. barnes, m. c. flessel, and j. thorner, manuscript submitted for publication). mutations in the golgi retention signal of dpap a, or saturation of the retention apparatus, would then result in the delivery of dpap a to the vacuole. this model requires that both kex2p and kex1p also have cryptic vacuolar targeting signals. with regard to the cryptic vacuolar targeting signal model, we have found that replacement of the membrane spanning domains of dpap a and a22-aaa with a 21 residue hydrophobic sequence, l(lalv)5, creating the proteins axa and a22-axa, results in the retention of axa in the golgi and the transport of a22-axa to the vacuole (s. nothwehr and t. stevens, unpublished data). these data suggest that the transmembrane domain of dpap a does not contain vacuolar targeting information, and thus support the vacuolar default model. the vacuolar default model is directly testable by analyzing potential localization signals of yeast plasma membrane proteins. in apparent conflict with the vacuolar default model is the observation that the majority of kex2p (70%) is missorted to the plasma membrane in cells deficient for clathrin heavy chain (payne and schekman, 1989). however, dpap a is missorted to the cell surface to a lesser extent (30%; seeger and payne, 1992a). it is possible that the majority of dpap a and a significant percentage of kex2p are missorted to the vacuole in cells lacking clathrin heavy chain. mutations in clathrin may affect both the retention of golgi membrane proteins and the functional integrity of the sorting pathway (seeger and payne, 1992b).
this could obscure the default pathway for membrane proteins, resulting in the transport of golgi membrane proteins to the plasma membrane as well as the vacuole. an alternative explanation of the data is that one or more of the dpap a and dpap b fusion or mutant proteins analyzed in this study were in a partially unfolded state, and thus recognized as abnormal and transported directly from the er to the vacuole via a "garbage" pathway. there is no clear evidence for the existence of such a pathway in yeast or animal cells. however, several cases have been described in animal cells in which non-lysosomal membrane proteins were transported to lysosomes, either because of mutations (armstrong et al., 1990) or, in the case of the heptameric t-cell antigen receptor, because complexes lacking the ζ subunit are degraded in lysosomes (minami et al., 1987). in these cases, the proteins were transported through the golgi apparatus, and it is unclear whether the proteins were transported to lysosomes by a "garbage" pathway, or by the uncovering of a cryptic lysosomal targeting signal. in the great majority of cases, mislocalized membrane proteins of the secretory pathway in animal cells accumulate at the plasma membrane, which is presumed to be the default destination for these proteins (e.g., williams and fukuda, 1990). it is well established that proteins that are slow to reach the folded state are retained in the er due to the action of proteins such as bip (pelham, 1989; rothman, 1989), and proteins that are unable to fold or oligomerize properly are degraded in the er (klausner and sitia, 1990). several of the proteins analyzed in this study (i.e., a20-bb, a27-bb, a22-aaa, and a22-aa-b) showed increased retention in the er. however, all of these proteins eventually exited the er, were enzymatically active, and were transported through the golgi complex to the vacuole.
the 22-amino acid segment of the cytoplasmic domain of dpap a presumably contains the recognition domain for a "retention protein" of the golgi apparatus. this protein could be a permanent resident of the golgi apparatus; alternatively, the "retention protein" could reside in a post-golgi compartment and function as a salvage receptor, such as the proposed receptor for soluble er proteins (pelham, 1989). it is interesting to note that the 22-amino acid stretch identified as the dpap a golgi retention signal contains five phenylalanine residues (fig. 1). in animal cells, one or more aromatic amino acids in the cytoplasmic domains of cell surface receptors have been shown to comprise part of the signal for the clustering into coated pit regions of the plasma membrane (chen et al., 1990; johnson et al., 1990; lobel et al., 1989; mcgraw et al., 1991). that the phenylalanine residues in the a22 region may play a direct role in golgi retention is supported by the observation that mutations in just two of these residues result in a substantial level of missorting of dpap a to the vacuole (s. nothwehr and t. stevens, unpublished data). the results of this study suggest that the default pathway for membrane proteins of the yeast secretory pathway may be different from that of certain mammalian cell lines that have been examined, where mutant forms of er, golgi, and lysosomal membrane proteins are mislocalized to the plasma membrane with the bulk flow of membrane (machamer and rose, 1987; jackson et al., 1990; williams and fukuda, 1990; wieland et al., 1987; karrenbauer et al., 1990; machamer, 1991). however, in polarized epithelial cells, it remains unclear which membrane serves as the default destination for membrane proteins (simons and wandinger-ness, 1990; mostov et al., 1992).
even if the default compartments for membrane proteins of the secretory pathways of yeast and certain animal cells are different, the same mechanistic considerations apply, that is, positive sorting information is required for proteins to avoid delivery to the default compartment. according to the vacuolar default model for yeast membrane protein sorting, not only must er and golgi membrane proteins have sorting signals specifying their retention in the appropriate compartment, but plasma membrane proteins are predicted to have sorting information that prevents their localization to the vacuole. we are currently investigating whether yeast plasma membrane proteins have targeting signals that prevent their transport to the vacuole. lysosomal sorting mutants of coronavirus e 1 protein, a golgi membrane protein yeast kre genes provide evidence for a pathway of cell wall/~ glucan assembly constitutive and regulated secretion of proteins proteases and the processing of precursors to secreted proteins in yeast npxy, a sequence often found in cytoplasmic tails, is required for coated pit mediated internalization of the ldl receptor characterization of the yeast kex1 gene product" a carboxypeptidase involved in processing secreted precursor proteins differential effects of mutations in three domains on folding, quaternary structure, and intracellular transport of vesicular stomatitis virus g protein an mfod-suc2 (~-factor-invertase) gene fusion for study of protein localization and gene expression in yeast the amino terminus of the yeast ft-atpase ~-subunit precursor functions as a mitochondrial import signal compartmentalized assembly of oligosaccharides on exported glycoprotein in yeast localization of components involved in protein transport and processing through the yeast golgi apparatus enzymes required for yeast prohormone processing yeast prohormone processing enzyme (ke, x2 gene product) is a ca~+-dependent serine protease intracellular targeting and structural 
conservation of a prohormone-processing endoprotease expression of wild type and mutant forms of influenza hemagglutinin: the role of folding in intracellular transport /3-d-fructofuranoside fructohydrolase from yeast compartmental organization of golgispecific protein modification and vacuolar protein sorting events defined in a yeast secl8 (nsf) mutant erdi, a yeast gene required for the retention of luminal endoplasmic reticulum proteins, affects glycoprotein processing in the golgi apparatus solid phase synthesis of polynucleotides vi. further studies on polystyrene copolymers for the solid support transformation of intact yeast ceils treated with alkali cations identification of a consensus motif for retention of transmembrane proteins in the endoplasmic reticulum cation-dependent mannose 6-pbosphate receptor contains two internalization signals in its cytoplasmic domain distinct sequence determinants direct intracellular sorting and modification of a yeast vacuolar proteinase sequences that regulate the divergent gal1-galio promoter in saccharomyces cerevisiae yeast a factor is processed from a larger precursor polypeptide: the essential role of a membrane-bound dipeptidyl aminopeptidase glycosylation and processing of prepro-a-factor through the yeast secretory pathway assembly and targeting of peripheral and membrane subunits of the yeast vacuolar h+-atpase the rate of bulk flow from the golgi to the plasma membrane protein degradation in the endoplasmic reticulum membrane protein sorting: biosynthesis, transport, and processing of yeast vacuolar alkaline phosphatase a new class of lysosomal/vacuolar protein sorting signals protein glycosylation in yeast rapid and efficient sitespecific mutagenesis without phenotypic selection structure of a yeast pheromone gene (mfa): a putative a-factor precursor contains four tandem copies of mature c~-factor mutations in the cytoplasmic domain of the 275 kd mannose 6-phosphate receptor differentially alter lysosomal 
enzyme sorting and endoeytosis golgi retention signals: do membranes hold the key? a specific transmembrane domain of a coronavirus el glycoprotein is required for its retention in the golgi region molecular cloning: a laboratory manual. cold spring harbor laboratory identification and regulation of a gene required for cell fusion during mating of the yeast saccharomyces cerevisiae mutaganesis of the human transferrin receptor: two cytoplasmic phenylalanines are required for efficient internalization and a second-site mutation is capable of reverting an internalization-defective phenotype building a multichain receptor: synthesis, degradation and assembly of the t-cell antigen receptor plasma membrane protein sorting in polarized epithelial cells order of events in the yeast secretory pathway analysis ofglycoproteins from saccharomyces cerevisiae clathrin: a role in the intracellular retention of a golgi membrane protein control of protein exit from the endoplasmic reticulum biosynthetic protein transport and sorting by the endoplasmic reticulum and golgi molecular analysis of the yeast vps3 gene and the role of its product in vacuolar protein sorting and vacuolar segregation during the cell cycle immunolocalization of kex2 protease identifies a putative late golgi compartment in the yeast saccharomyces cerevisiae structure, biosynthesis, and localization of dipeptidyl aminopeptidase b, an integral membrane glycoprotein of the yeast vacuole methods for studying the yeast vacuole cloning genes by complementation in yeast kar2, a karyogamy gene, is the yeast homolog of the mammalian bip/grp78 gene polypeptide chain binding proteins: catalysts of protein folding and related processes in ceils one step gene disruption in yeast a ras-like protein is required for a post-golgi event in yeast secretion selective and immediate effects of clathrin heavy chain mutations on golgi membrane protein retention in saccharomyces cerevisiae a role for clathrin in the sorting of 
vacuolar proteins in the golgi complex of yeast methods in yeast genetics a system of shuttle vectors and yeast host strains designed for efficient manipulation of dna in saccharomyces cerevisiae polarized sorting in epithelia the structure and insertion of integral proteins in membranes gene dosage-dependent secretion of yeast vacuolar carboxypeptidase y identification of the structural gene for dipeptidyl aminopeptidase yscv (dap2) of saccharomyces cerevisiae the yeast cell fusion protein fusi is o-glycosylated and spans the plasma membrane two genes required for cell fusion in yeast: evidence of a pheromone-induced surface protein yeast carboxypeptidase y vacuolar targeting signal is defined by four propeptide amino acids a family of yeast expression vectors containing the phage fl intergenic region prepro-a-factor has a cleavable signal sequence the rate of bulk flow from the endoplasmic reticulum to the cell surface accumulation of membrane glycoproteins in lysosomes requires a tyrosine residue at a particular position in the cytoplasmic tail we acknowledge cathy flanagan, jeremy thorner, antony cooper, howard bussey, celeste wilcox, robert fuller, and greg payne for communication of results prior to publication; scott ernr, charlie boone, and george sprague for plasmids; margaret ho and joe horecka for help in the construction of the a20-bb and fusl-laczp constructs, respectively; and jerry gleason and scan poston for the photographic work. we especially thank christopher raymond for numerous insightful contributions to this work, and nick davis, charlie boone, george sprague, antony cooper, carol vater, cynthia bauerle, and margaret ho for comments on the manuscript. received for publication 10 february 1992 and in revised form 30 june 1992. key: cord-256325-q70rky3r authors: stewart, cameron r.; deffrasnes, celine; foo, chwan hong; bean, andrew g. 
d.; wang, lin-fa title: a functional genomics approach to henipavirus research: the role of nuclear proteins, micrornas and immune regulators in infection and disease date: 2017-07-04 journal: roles of host gene and non-coding rna expression in virus infection doi: 10.1007/82_2017_28 sha: doc_id: 256325 cord_uid: q70rky3r hendra and nipah viruses (family paramyxoviridae, genus henipavirus) are zoonotic rna viruses that cause lethal disease in humans and are designated as biosafety level 4 (bsl4) agents. moreover, henipaviruses belong to the same group of viruses that cause disease more commonly in humans such as measles, mumps and respiratory syncytial virus. due to the relatively recent emergence of the henipaviruses and the practical constraints of performing functional genomics studies at high levels of containment, our understanding of the henipavirus infection cycle is incomplete. in this chapter we describe recent loss-of-function (i.e. rnai) functional genomics screens that shed light on the henipavirus–host interface at a genome-wide level. further to this, we cross-reference rnai results with studies probing host proteins targeted by henipavirus proteins, such as nuclear proteins and immune modulators. these functional genomics studies join a growing body of evidence demonstrating that nuclear and nucleolar host proteins play a crucial role in henipavirus infection. furthermore these studies will underpin future efforts to define the role of nucleolar host–virus interactions in infection and disease. paramyxoviruses (order mononegavirales) are single-stranded rna viruses of negative polarity that can cause diseases in humans (rabies, measles virus, mumps virus, respiratory syncytial virus, human parainfluenza virus, ebola virus) and animals (newcastle disease virus, canine distemper virus, borna disease virus). 
the family paramyxoviridae is divided into two subfamilies (paramyxovirinae and pneumovirinae), with hendra virus (hev) being the foundation member of the genus henipavirus in the subfamily paramyxovirinae. the discovery of the hev and nipah virus (niv) had a striking impact on our understanding of paramyxovirus biology. henipaviruses have a much wider host range and a significantly larger genome than other paramyxoviruses, and to date are the only biosafety level (bsl)-4 agents within the family. with mortality rates of human infection between 50 and 100%, hev and niv are among the most deadly viruses known to infect humans. hev emerged in 1994 in the brisbane suburb of hendra, queensland, australia, where it caused an outbreak of severe respiratory disease in horses that led to the natural death or euthanasia of 14 out of 21 affected animals. two people who had close contact with the infected horses were infected and one of these patients died (murray et al. 1995) . extensive sampling demonstrated that australian mainland flying foxes (family pteropodidae, genus pteropus) were seropositive for neutralising antibodies against hev (young et al. 1996) , while the virus was subsequently isolated from flying fox uterine fluid and urine (halpin et al. 2000) , providing strong evidence for australian mainland flying foxes as the hev reservoir. sporadic hev incidents occurred in horses between 1994 and 2010, with 14 events identified. an alarming number of hev incidents (34 in total) occurred between 2011 and 2013, with 18 of those occurring in 2011 alone, highlighting the unpredictable nature of hev outbreaks. seven human cases of hev disease have been observed, four of which resulted in fatal disease. all recorded cases of hev transmission to humans have occurred directly from affected horses. the horses are believed to have acquired hev infection following direct exposure to secretions from flying foxes. 
more recently, the decline of reported human cases of hev infection is potentially due to the development of a vaccine to inhibit hev disease in horses (middleton et al. 2014) . niv was first identified during a disease outbreak on the west coast of peninsular malaysia in late 1998. commercial pig farmers suffered disease characterised by febrile encephalitis that was linked to mild respiratory and neurological disease in pigs (mohd nor et al. 2000 ; from the centers for disease control and prevention 1999) . nucleotide sequencing demonstrated the virus was closely related to hev, whilst fruit bats of the pteropodidae family, pteropus genus, were confirmed as the natural reservoir (yob et al. 2001) . epidemiological evidence suggested that human infections were caused by transmission from pigs which likely had prior contact with fruit bats (update: outbreak of nipah virus-malaysia and singapore 1999). by mid-1999, cases of human infection were reported in singapore, where abattoir workers developed niv infection associated with contact with pigs imported from malaysia. this initial outbreak of niv in malaysia resulted in 265 human cases reported with 105 deaths. since 2001, niv outbreaks have been reported almost every year in selected districts of bangladesh (hossain et al. 2008; luby et al. 2009a) . unlike hev, human-to-human transmission of niv has been documented (luby et al. 2009b) , including in a hospital setting. an increasing focus on flying foxes as viral reservoirs has led to the discovery of new henipaviruses. the genus was expanded in 2012 upon the isolation and characterisation of cedar virus (cedpv), isolated from bat urine samples from a flying fox colony in cedar grove, south east queensland. cedpv shows a remarkably similar genome organisation to hev and niv, antigenic cross-reactivity of the nucleocapsid protein between henipaviruses, and shares the same predominant entry receptor molecule, ephrin-b2 (marsh et al. 2012 ). 
however, a critical difference between cedpv and hev and niv is that the cedpv p gene lacks coding capacity for the immune antagonising v protein, whilst the cedpv p protein shows an impaired capacity to bind and inhibit ifn signalling via signal transducer and activator of transcription (stat)1 and stat2 (lieu et al. 2015) . accordingly, cedpv infection induces a robust type i interferon (ifn) response in human cells in vitro and does not cause clinical disease in ferret and guinea pig models of disease. such findings highlight the importance of immune evasion in the context of henipavirus pathogenicity and demonstrate the diverse range of pathogenicity within the same genus. in addition to these three viruses, the henipavirus genus is likely to be expanded in the future to accommodate the discovery and characterisation of emerging viruses from bats and other reservoirs. west african fruit bats harbour neutralising antibodies against hev and niv in particular, demonstrating a wider geographical range for henipaviruses not limited to pteropid bats (hayman et al. 2008 ). furthermore, a novel henipa-like virus, mojiang paramyxovirus, was isolated from rats in the yunnan province of china in 2012 and may have caused fatal disease in three individuals (wu et al. 2014) . alarmingly, a recent study looking at bat and human serum samples from cameroon found that 3-4% of human samples were seropositive for henipaviruses, and that this was almost exclusively among individuals who reported butchering bat meat, providing the first evidence of human henipavirus spillover infections in africa (pernet et al. 2014 ). there are currently no licensed therapies to treat human cases of henipavirus infection. therefore, gaining a deeper understanding of host pathways exploited by henipaviruses for infection may identify targets for new antiviral therapies. 
viruses rely on the cell host machinery for completion of their infection cycle and therefore have adapted to interact with or exploit host molecules. retroviruses, most dna viruses, and many orthomyxoviruses replicate their genomes in the host nucleus. conversely, most positive-sense single-stranded viruses such as picornaviruses and flaviviruses and negative-sense, single-stranded viruses such as filoviruses, rhabdoviruses, and paramyxoviruses are perceived as cytoplasmic viruses and therefore are believed to not have a nuclear stage in their life cycle, replicating their genome entirely in the cytoplasm (lamb and parks 2007) . however, proteins of some of these viruses can traffic into nuclear compartments during infection (peeples 1988; yoshida et al. 1976; ghildyal et al. 2003; monaghan et al. 2014; wang et al. 2010) and this movement is sometimes critical for efficient infection (wang et al. 2010 ). this evidence indicates that the host nucleus may play a significant role in the infection cycle of henipaviruses and that the dynamics of virus-host interactions that occur in the nuclear compartments is an understudied area of molecular biology and virology. furthermore, since important discoveries in cell biology often follow studies of how viruses exploit normal host machinery, investigations into these nuclear interactions may reveal interesting novel insights into the cell biology of the mammalian nucleus. with this in mind, functional genomics provides a powerful and unbiased approach to study these biological questions. functional genomics refers to the development and application of global (genome-wide or system-wide) experimental approaches to assess gene function by making use of the information and reagents provided by sequenced genomes (hieter and boguski 1997) . 
a wide range of laboratory techniques can be considered as functional genomics, including genome interaction mapping (at the dna level), microarrays, transcriptomics and serial analysis of gene expression (sage) (at the rna level), yeast 2 hybrid systems and affinity chromatography and mass spectrometry (at the protein level) and loss-of-function studies such as mutational studies, rna interference (rnai) and clustered regularly interspaced short palindromic repeats (crispr) studies. functional genomics has demonstrated much power in its ability to dissect the dynamic interplay between host and viral factors during a virus infection, paving the way for novel drug targets. for instance, a haploid genetic screen resulted in the discovery of the once elusive entry receptor for ebola virus (carette et al. 2011). there have been many full- or partial-genome rnai screens of host-virus interactions, including orthomyxoviruses (brass et al. 2009; hao et al. 2008; karlas et al. 2010; konig et al. 2010; shapira et al. 2009), retroviruses (zhou et al. 2008; konig et al. 2008; brass et al. 2008) and flaviviruses (ang et al. 2010; sessions et al. 2009). until recently, such information was lacking for henipaviruses, and perhaps surprisingly, for paramyxoviruses generally. functional genomics screens can be technically challenging, laborious and involve the use of robotics and advanced imaging equipment. consequently there are technical and practical challenges to performing high-throughput screens at higher levels of containment. hev and niv are classified as bsl-4 agents due to their association with lethal human disease and the absence of preventive measures and effective treatments to combat infections.
bsl-4 facilities feature additional precautions to protect workers from infections and prevent exposure, such as infectious work being conducted within class ii biosafety cabinets, limited access by secure, locked doors, hepa filtration of laboratory air, and additional primary containment (positive pressure air suits or class iii biosafety cabinets). due to these limitations, previous genome-wide screens for bsl-4 viruses used surrogate viruses, such as pseudotyped particles, and were performed under bsl-2 conditions (kouznetsova et al. 2015; kleinfelter et al. 2015). functional genomics approaches have been employed to study henipavirus infection. for instance, the entry receptor of hev and niv, ephrin-b2, was identified by microarray analysis of infection-permissive and infection-resistant cell lines (bonaparte et al. 2005). transcriptomics and proteomics have been utilised to uncover key differences in cellular responses to hev infection in hev disease-susceptible (human) and disease-resistant (bat) cells, and suggest that activation of apoptosis pathways via the innate immune pathway may contribute to the tolerance of henipaviruses by flying foxes (wynne et al. 2014). here we largely focus on findings from two recent rnai screens to identify protein-coding genes and host-encoded micrornas impacting the henipavirus infection cycle in human cells. not only can these findings be compared to published rnai screens of host-virus interactions, but the identification of host genes required for infection (as opposed to those that are merely differentially expressed during infection) may deliver new targets for the development of antiviral therapies. the large number of hev incidents in australia from 2011 to 2013 prompted researchers at our laboratory to establish the capability to perform genome-wide rnai screens at bsl-4.
central to this work was the development of a recombinant hev expressing the renilla luciferase construct, which allowed for high-throughput and rapid measurement of virus infection (marsh et al. 2013). this recombinant virus was shown to be lethal in the ferret model of henipavirus disease and exhibited a pathogenesis profile comparable to the wild-type virus. functional genomics at high containment also required the establishment of protocols and/or safe work procedures for the operation and decontamination of liquid handling robots. a genome-wide analysis of host protein-coding genes required for henipavirus infection involved a primary screen assaying 18,120 protein-coding genes, followed by a secondary deconvolution screen and a tertiary screen determining whether screen results obtained using recombinant hev could be recapitulated using wild-type hev and niv (deffrasnes et al. 2016). applying a robust z score normalisation method often used to interpret sirna screen results (birmingham et al. 2009; zhang et al. 2006), 585 and 630 genes were identified that promoted or suppressed hev infection, respectively, without adversely impacting cell numbers. at the completion of the primary screen, 200 proviral genes were selected based on rank for the secondary deconvolution screen. in the deconvolution screen, 20 high- and 46 medium-confidence genes (>2 standard deviations from mean mock values for 4/4 or 3/4 sirnas, or for 2/4 sirnas, respectively) were identified as being required for hev infection. the apparent reliance of henipavirus infection on nuclear or nucleolar host proteins was particularly striking, as over 40% of high-confidence hits localise in the nucleus or nucleolus, with many involved in ribosome biogenesis (table 1). the nucleus is the site of gene expression and dna transcription into mrna, and houses the early steps of the rnai pathway.
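as a hedged illustration of the screen analysis described above, the following python sketch shows one common way to compute robust z scores (median and median absolute deviation, with the usual 1.4826 scaling constant) and to assign the deconvolution confidence tiers quoted in the text (4/4 or 3/4 sirnas for high confidence, 2/4 for medium). the function names, the "low" label and the example values are assumptions made for illustration; this is not the pipeline used by deffrasnes et al. (2016).

```python
from statistics import median


def robust_z_scores(values):
    """Return robust z scores: (x - median) / (1.4826 * MAD).

    MAD is the median absolute deviation from the median; the constant
    1.4826 makes MAD comparable to a standard deviation for normal data.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; robust z scores are undefined")
    scale = 1.4826 * mad
    return [(v - med) / scale for v in values]


def confidence_tier(sirna_hits):
    """Assign a tier from per-sirna outcomes of a 4-sirna deconvolution.

    sirna_hits: iterable of booleans, one per individual sirna, True when
    that sirna alone reproduced the phenotype (>2 SD from mock in the text).
    The 'low' label is a hypothetical name for everything below 2/4.
    """
    n_hits = sum(1 for hit in sirna_hits if hit)
    if n_hits >= 3:
        return "high"
    if n_hits == 2:
        return "medium"
    return "low"


if __name__ == "__main__":
    # Made-up luciferase readings; the last well is an obvious outlier.
    readings = [1.0, 2.0, 3.0, 4.0, 100.0]
    print(robust_z_scores(readings))
    print(confidence_tier([True, True, True, False]))
```

because the median and mad are insensitive to extreme wells, a single outlier does not inflate the scale estimate the way it would inflate a standard deviation, which is the usual motivation for robust z scores in sirna screens.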
the nucleus is separated from the cell cytoplasm by the nuclear envelope, which contains nuclear pores and import/export proteins allowing the passage of small molecules such as mrna. nuclear import/export proteins such as xpo1 and kpna3, which are required for trafficking of larger molecules like proteins, were identified by rnai screen as required for henipavirus infection (deffrasnes et al. 2016). the nucleolus is a highly dynamic structure and has increasingly been shown to play a critical role in virus-host interactions (rawlinson and moseley 2015; xu et al. 2016). the nucleolus contains three regions: the fibrillar centre (fc) in the middle, surrounded by the dense fibrillar component (dfc) and the granular component (gc). this membrane-less structure contains a high concentration of proteins and rnas and is the site of ribosomal rna (rrna) synthesis and ribosome production, but is also a multifunctional structure in eukaryotic cells. cell cycle progression, stress response, genetic silencing, regulation of apoptosis, cell migration and invasion are all functions associated with the nucleolus or partly regulated in this compartment (rawlinson and moseley 2015; xu et al. 2016; pederson 2010). fibrillarin is the main nucleolar protein responsible for the chemical modification of rrna. this 34-38 kda 2′-o-methyltransferase transfers methyl groups from its substrate, s-adenosylmethionine (sam), to the 2′-hydroxyl groups of target riboses in rrna. fibrillarin has also been shown to methylate glutamine residue 104 of human histone h2a, weakening its binding to the fact (facilitator of chromatin transcription) complex and impacting chromatin remodelling and rdna transcription by rna pol i (tessarz et al. 2013), which points to an additional role for fibrillarin in ribosome biogenesis and translation.
table 1 (recovered entries only): pre-ribosome processing, modification of pre-rrna (gene name not recovered); rpl7a, component of the 60s ribosomal subunit; sp7, transcriptional regulation; xpo1, nuclear export.
fibrillarin itself is methylated on several arginine residues by protein arginine n-methyltransferase 1 (prmt1), which is thought to influence its activity (rodriguez-corona et al. 2015). expression levels of fibrillarin have been shown to be regulated by p53 through direct binding to fibrillarin intron 1. abnormal levels of fibrillarin have been detected in p53-inactivated cancer cells; a decrease in p53 levels has been associated with an increase in fibrillarin expression, and conversely an increase in p53 expression results in decreased fibrillarin expression (marcel et al. 2013). high levels of fibrillarin lead to changes in the rrna methylation pattern, diminished translation fidelity and an increase in ires-mediated translation of some cancer genes. moreover, ribosome biogenesis is often dysregulated and over-activated in cancer cells that have decreased or absent p53 expression (marcel et al. 2013). in its n-terminal region, fibrillarin contains a glycine- and arginine-rich region (the gar domain) enabling interaction with cellular and viral proteins, and acting as a nucleolar retention signal. its c-terminal region (mtase) contains multiple rna-binding domains and a catalytic site allowing for fibrillarin methyltransferase function, and is the site for nop56/58 interaction. fibrillarin is a part of at least one small nucleolar ribonucleoprotein (snornp) complex comprising the nop56, nop58 and 15.5k nucleolar proteins. x-ray data have suggested that the methylation of rrna requires the formation of this complex, with involvement of four fibrillarin molecules interacting with different regions of the target rrnas.
the yeast equivalent of fibrillarin, nop1, has been more extensively studied than the human counterpart, but fibrillarin is a well-conserved protein in most organisms, reinforcing the notion that all post-transcriptional processes involving fibrillarin, such as chemical modification (methylation) of rrna, pre-rrna cleavage and ribosome assembly, are essential for proper cellular functioning (rodriguez-corona et al. 2015). in eukaryotes, ribosome biogenesis involves numerous nucleolar proteins and accessory factors, around 80 ribosomal proteins, many small nucleolar rnas (snornas), three rna polymerases (rna polymerase i, ii and iii) and four different species of rrnas. the process of assembly of elongation-competent 80s ribosomes is divided into three major steps: (1) ribosomal dna (rdna) transcription into precursor rrnas (pre-rrnas), (2) processing of pre-rrnas into mature rrnas, and then (3) assembly of rrnas with ribosomal proteins into functional ribosomes. in the nucleolus, rna polymerase i (rna pol i) is responsible for transcribing the 18s, 5.8s and 28s rrnas from a single polycistronic pre-rrna, while rna pol iii transcribes the 5s rrna in the nucleus (xue and barna 2012). the pre-rrnas are then cleaved and modified during the pre-rrna processing phase. all ribosomal protein (rp) genes are transcribed by rna pol ii, and the rps are translated in the cytoplasm before migrating to the nucleolus. these rps, along with nucleolar proteins such as fibrillarin and rpl13a, are responsible for modifying the rrnas (ribose 2′-o-methylation, pseudouridylation, etc.), with the activity of more than 100 snornas guiding the process in a site-specific manner. the main nucleolar protein involved in rrna modification is fibrillarin, which methylates more than 100 sites essential for ribosome biogenesis and stability. although these post-transcriptional modifications are crucial for ribosome functions, their roles are not yet fully understood.
in eukaryotes, the large 60s subunit of ribosomes is made of the 5s, 5.8s, and 28s rrnas along with multiple large subunit ribosomal proteins (rpl), while the small 40s subunit is made of the 18s rrna along with multiple small subunit ribosomal proteins (rps). the two subunits are assembled in the nucleolus into the 80s ribosomes before being transferred into the cytoplasm. deffrasnes and colleagues showed that sirna-mediated knockdown of fibrillarin expression dramatically reduced hev protein production and viral genome replication but did not impact viral fusion, and that fibrillarin catalytic activity was essential to henipavirus infection. on the other hand, overexpression experiments did not lead to an increase in viral titers, suggesting that a simple reduction or increase in overall ribosome production is unlikely to explain the reliance of henipaviruses on fibrillarin activity (deffrasnes et al. 2016). the requirement of fibrillarin and several other proteins from the ribosomal biogenesis pathway for henipavirus infection points to a reliance on translation for efficient infection. however, while we tend to view ribosomes as homogeneous, new studies reveal a more heterogeneous nature of ribosomes due to differences in the ribosomal proteins recruited, post-transcriptional modifications of rrna and rrna composition. moreover, ribosomal proteins have been found to have additional functions outside of their primary roles in ribosomes and to be involved in other nucleolar functions such as regulation of cell proliferation, tumorigenesis and dna damage response (xu et al. 2016; xue and barna 2012; au and jan 2014). in eukaryotes, most messenger rnas (mrnas) harbour a 5′ 7-methylguanosine cap structure and a 3′ poly(a) tail, which are both required for canonical, cap-dependent translation. a cap-independent translation mechanism also utilised by a subset of host proteins is called internal ribosome entry site (ires)-mediated translation.
it is believed that most genes translated via an ires are related to stress response, cell proliferation and cell death/survival, and that ires-mediated translation happens when canonical cap-dependent translation is inhibited by the host reaction to environmental factors, damage, stress or infections. however, a group recently suggested that thousands of human genes are translated via this cap-independent mechanism, representing a 50-fold increase in the number of sequences previously associated with this translation pathway (weingarten-gabbay et al. 2016). recently, a new type of translation has been described in vesicular stomatitis virus (vsv)-infected cells. this non-canonical cap-dependent protein translation involves the ribosomal protein rpl40 acting as a constituent of the large subunit of ribosomal complexes and suggests a novel ribosome-specialised translation initiation pathway benefiting viral mrna translation (lee et al. 2012). translation of viral proteins from several other mononegaviruses, including the paramyxoviruses measles virus (mev) and newcastle disease virus (ndv), and of a subset of cellular transcripts, is also rpl40-dependent. how henipavirus mrnas are translated is not fully understood. whilst the rpl40-dependent form of cap-dependent translation remains to be characterised in detail, one could speculate that fibrillarin, like rpl40, acts as a novel initiation factor for henipavirus mrnas. the fact that depleting cells of fibrillarin did not impact synthesis of influenza a viral proteins (which occurs via the canonical cap-dependent pathway) suggests that henipavirus mrna translation occurs via a non-canonical pathway, perhaps one used by a subset of cellular transcripts. such a concept would allow henipavirus protein synthesis to proceed in an environment where viruses may induce cellular translation shutdown in order to suppress host antiviral immune responses.
there are several reports of paramyxoviruses blocking canonical translation pathways, including the mev n protein binding to the eukaryotic initiation factor 3 (eif3-p40) (sato et al. 2007), whilst the p and v proteins of simian virus 5 (sv5) limit activation of the double-stranded rna (dsrna)-dependent protein kinase (pkr) to limit both host and viral protein translation (gainey et al. 2008). similar to sv5, sirna-mediated depletion of pkr results in increased hev growth (robust z score 1.46), consistent with the notion that shutdown of host protein translation inhibits henipavirus infection. if future studies do indeed demonstrate a role of fibrillarin in influencing the synthesis of ribosome subtypes required for viral protein translation, this may explain the targeting of fibrillarin by several viral proteins. fibrillarin binds the hev matrix (m) protein during the early stages of infection, whilst the hiv-tat protein has been reported to bind fibrillarin and u3 snorna, both required for pre-rrna processing, and this interaction reduces the pool of cytoplasmic ribosomes (ponti et al. 2008). intriguingly, the nucleoprotein of porcine reproductive and respiratory syndrome virus, the non-structural protein 1 (ns1) of an h3n2 influenza virus (melen et al. 2012) and the non-structural protein 3b of the severe acute respiratory syndrome coronavirus (yuan et al. 2005) all bind and co-localise with fibrillarin in the nucleolus; however, the reasons for this binding are yet to be determined. many negative-strand viruses encode viral proteins that localise in the nucleus and/or nucleolus at some point in their infection cycle [reviewed in (rawlinson and moseley 2015; hiscox 2003; oksayan et al. 2012; flather and semler 2015; watkinson and lee 2016)]. within the paramyxoviridae, nuclear localisation of matrix (m) protein has previously been described for ndv (peeples 1988), sendai virus (sev) (yoshida et al. 1976), human respiratory syncytial virus (ghildyal et al.
2003), hev (monaghan et al. 2014) and niv (wang et al. 2010). during the early stages of henipavirus infection or when expressed ectopically (monaghan et al. 2014; wang et al. 2010), the hev and niv m proteins traffic through the nucleolus to the cytoplasm. it has recently been shown that nuclear traffic is required for the henipavirus m protein to coordinate viral budding. the henipavirus m protein is a structural protein that mediates viral assembly and budding (liljeroos and butcher 2012; takimoto and portner 2004; eaton et al. 2007). indeed, for both hev-m and niv-m, overexpression of these proteins alone is sufficient to trigger the formation of virus-like particles (vlps) that bud into the supernatant. wang and colleagues (2010) demonstrated that mutation of niv-m nuclear localisation signals (nls) or nuclear export signals (nes) blocks nuclear/cytoplasmic traffic and impairs viral budding. furthermore, a highly conserved lysine residue in the nls (k258) serves two functions: its positive charge mediates niv-m nuclear import, while it is also a potential site for monoubiquitination, which regulates niv-m nuclear export. niv infection (deffrasnes et al. 2016). this raises the question: do henipavirus m proteins traffic through the nucleolus for other reasons? the multi-faceted roles of paramyxovirus proteins in replication-specific roles and various cellular processes, particularly immune evasion, would suggest so. to explore whether m binds to host proteins associated with infection efficiency, results from the genome-wide rnai hev screen were cross-referenced against a proteomics study by pentecost and colleagues cataloguing host proteins that bind hev-m and niv-m, among other paramyxovirus m proteins (pentecost et al. 2015). that study revealed that the henipavirus m interactome spans hundreds of host proteins, with interactions with nuclear pore complex proteins, nuclear transport receptors and nucleolar proteins particularly prevalent.
interestingly, the niv-m and hev-m interactomes show notable overlap with those of other paramyxovirus m proteins, including sev and ndv, with over 60% of the proteins found in any single interactome also found in the interactomes of one or more of the other three viruses (pentecost et al. 2015). whilst the binding of fibrillarin to hev-m was demonstrated by co-immunoprecipitation assays (deffrasnes et al. 2016), this was not observed by proteomics; however, interactions were observed between hev-m and numerous nucleolar proteins such as nop58 (pentecost et al. 2015), which forms a complex with fibrillarin, supporting a functional interaction between fibrillarin and hev-m. the relative hev growth (presented as robust z scores) in cells depleted of the 389 hev-m-binding host proteins is shown in fig. 1a. of the 327 candidates assayed, hev-m binds to 22 proteins whose depletion has a large impact (robust z score ≤ −2 or ≥ 2) on hev infection, roughly evenly distributed between proviral (12) and antiviral (10) candidates. designating all candidate genes with z scores <0 as proviral and genes with z scores >0 as antiviral, host proteins that bind hev-m appear to be pro- and antiviral at approximately equal ratios, with a slight enrichment of proviral genes (174 proviral candidates vs. 146 antiviral candidates). an assessment of whether the relative abundance of hev-m-host protein interactions indicated a likelihood of that host protein adopting a proviral or antiviral function was also carried out (fig. 1b). the relative abundance of host proteins within the proteomics dataset is represented as the normalised spectral abundance factor (nsaf), with higher nsaf values representing more abundant interactions. plotting nsaf values against robust z scores demonstrates that host proteins that bind hev-m with high abundance (nsafe5 scores between 250 and 938) were more proviral (11 candidates) than antiviral (4 candidates; z score sums: proviral 13.9, antiviral 2.6).
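the cross-referencing described above can be sketched in python as follows. the nsaf formula used here is the standard definition (each protein's spectral count divided by its length, normalised by the sum of that ratio over all proteins in the sample); the function names, the protein identifiers and the numbers in the example are hypothetical, and reporting proviral sums as absolute values is an assumption made to match the positive sums quoted in the text.

```python
def nsaf(spectral_counts, lengths):
    """Normalised spectral abundance factor for each protein.

    spectral_counts: dict of protein -> spectral count (SpC)
    lengths: dict of protein -> protein length in residues (L)
    NSAF_i = (SpC_i / L_i) / sum_j (SpC_j / L_j)
    """
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


def tally_by_sign(z_scores):
    """Count candidates and sum z score magnitudes by sign.

    Genes with z < 0 are treated as proviral and genes with z > 0 as
    antiviral, as in the text; genes with z == 0 are left unassigned.
    Returns {'proviral': (count, sum of |z|), 'antiviral': (count, sum of z)}.
    """
    proviral = [z for z in z_scores.values() if z < 0]
    antiviral = [z for z in z_scores.values() if z > 0]
    return {
        "proviral": (len(proviral), sum(abs(z) for z in proviral)),
        "antiviral": (len(antiviral), sum(antiviral)),
    }


if __name__ == "__main__":
    # Hypothetical proteins and values, for illustration only.
    counts = {"protA": 10, "protB": 20}
    lengths = {"protA": 100, "protB": 100}
    print(nsaf(counts, lengths))
    print(tally_by_sign({"g1": -2.5, "g2": 1.0, "g3": -0.5, "g4": 0.0}))
```

dividing by protein length before normalising prevents long proteins, which yield more peptides, from appearing more abundant than they are.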
these candidates are listed in table 2 and include several ribosomal proteins, further implicating m in host translation. akin to fibrillarin, the critical role of host molecules in henipavirus infection and pathogenesis can be inferred by their specific targeting by viral proteins. this is particularly true in the context of immune evasion, as the innate antiviral immune response is a known target for several henipavirus proteins. the henipavirus genome contains six transcriptional units, n, p, m, f, g and l, coding for nine proteins (eaton et al. 2007). the p gene alone codes for at least four of the proteins: p, w, v and c. all four of these proteins are involved in modification of the immune response in the host cell, through inhibition of the type i interferon (ifn) responses [reviewed in (audsley and moseley 2013)]. intracellular detection of pathogen-associated molecular patterns (pamps) is mediated by membrane-bound toll-like receptors (tlrs) or cytoplasmic retinoic acid-inducible gene i (rig-i)-like receptors (rlrs) and nucleotide-binding oligomerisation domain containing (nod)-like receptors (nlrs). engagement of these receptors with their agonists results in the activation of complex signalling pathways culminating in the production of cytokines and anti-microbial compounds. a critical component of this response is the type i ifn system, which induces a local antiviral state upon detection of viruses or intracellular bacteria or molecules associated with their replication (schoggins and rice 2011).
fig. 1 caption (partial), panel a: genes with z scores <0 were designated proviral, while genes with z scores >0 were designated antiviral. values represent the sum of all the z scores. it should be noted that 43 genes were excluded from analysis due to ambiguous gene identification listings in the proteomics study, whilst the silencing of 19 additional gene targets resulted in cell death that prevented the measurement of virus growth.
fig. 1 caption, panel b: plot of the z score of hev-m-binding proteins (x-axis) and relative abundance of hev-m interactions, represented by normalised spectral abundance factor (y-axis).
viral replication is typically detected by tlrs 3 and 7/8 in endosomal compartments (alexopoulou et al. 2001; lund et al. 2004), whilst rig-i and/or melanoma differentiation-associated gene 5 (mda5) recognise short or long viral dsrna intermediates in the cytosol (yoneyama et al. 2004; triantafilou et al. 2012). tlr3 activates the tir-domain-containing adapter-inducing ifn-β (trif) (matsumoto et al. 2011), whilst rig-i/mda5 interact via their caspase recruitment domains (cards) with mavs (mitochondrial antiviral-signalling protein) (seth et al. 2005) to induce signalling. activation of trif or mavs promotes recruitment of multiple cytosolic effectors, resulting in the phosphorylation and dimerisation of interferon regulatory factor (irf) 3 or liberation of nf-κb from its inhibitory complex. these transcription factors then shuttle into the nucleus to form part of a large multiprotein complex that binds to the promoter region of ifn-β and initiates transcription (honda and taniguchi 2006). the c-terminus of the hev v protein binds and sequesters mda5, thereby impairing ifn-β transcription in response to double-stranded rna (andrejeva et al. 2004). this binding appears to be conserved amongst most paramyxoviruses, including niv, sv5 and mumps virus (childs et al. 2007). intriguingly, rig-i is not targeted by paramyxovirus v proteins and, perhaps consistent with this, the genome-wide rnai screen suggested that depleting cells of mda5 increased hev infection (robust z score 2.02), whilst targeting rig-i had very little impact (z score −0.37).
similar to the rlr-mediated cytoplasmic antiviral immune responses, tlr3-dependent antiviral signalling is also inhibited by henipaviruses, with the w protein localising to the nucleus via the importin molecules kpna3 and kpna4 to block irf3-responsive promoter activation by virus and intracellular dsrna (shaw et al. 2005). transfecting niv-w into cells sequesters inactive irf3 in the nucleus in a dose-dependent manner, thus depleting the pool of irf3 available for phosphorylation and activation. in the genome-wide screen, down-regulating tlr3 (z score 1.16) and irf3 (0.97) produced a moderately antiviral phenotype. the best-characterised targets of henipavirus immune evasion are the stat proteins, critical signalling molecules in the context of type i ifn cytokine production conferring the antiviral state [reviewed in (platanias 2005)]. the binding of type i ifn (ifn-α and ifn-β) and type ii ifn (ifn-γ) to their respective receptor complexes leads to the phosphorylation and association of stat1 and stat2 heterodimers (for type i ifn signalling), or stat1 homodimers (type ii ifn). this prompts the formation of stat1-stat2-irf9 (ifn-regulatory factor 9) complexes that translocate to the nucleus and bind ifn-stimulated response elements (isres) in dna to initiate transcription of ifn-stimulated genes (isgs). whilst there are hundreds, potentially thousands, of isgs that collectively confer antiviral immunity, very few isgs have been functionally characterised in the context of henipavirus infection. one isg, cholesterol 25-hydroxylase (ch25h), inhibits infection by niv and a range of other rna viruses by blocking membrane fusion between host and viral membranes (liu et al. 2013a). consistent with this observation, ch25h blocked hev infection in the genome-wide rnai screen (robust z score 1.05). henipaviruses, like other paramyxoviruses, generate multiple alternative mrnas from the p gene locus: p, v and w (thomas et al. 1988).
a fourth protein, c, is generated by alternate translation initiation site selection from all these mrnas and does not share sequence homology with the other proteins. the p, v, and w proteins share 407 amino acids in their n termini and all three proteins bind to stat1 and stat2 via this n-terminal region (ciancanelli et al. 2009; rodriguez et al. 2004). virus-host interactions in this context prevent stat1/2 phosphorylation and activation, and lead to their sequestration in high molecular weight complexes (rodriguez et al. 2003; rodriguez et al. 2002; shaw et al. 2004). interestingly, the sirna-mediated inhibition of stat1 increased hev infection in the genome-wide screen, but inhibition of stat2 did not (robust z scores of 1.01 and −0.67, respectively). this preliminary observation suggests that stat1 activity may have a greater impact on henipavirus infection than stat2, and may implicate type ii ifn in antiviral immunity against henipaviruses. although the role of henipavirus p gene products in immune evasion is well established, a recent study demonstrates the surprising ability of niv-m to antagonise the antiviral type i ifn response (bharaj et al. 2016). the study by bharaj and colleagues shows that niv-m binds to the e3-ubiquitin ligase trim6 and targets it for degradation. trim6 catalyses the synthesis of unanchored polyubiquitin chains that are used as a substrate for the activation of iκb kinase-ε (ikkε), which phosphorylates irf3 and activates irf3-dependent transcription of type i ifn and tnf-α. trim6 targeting by niv-m occurs in the cytoplasm via an unknown mechanism not involving the proteasome or the lysosome, and requires nuclear/cytoplasmic trafficking of niv-m. similar to viral budding, this function of m is dependent on nuclear traffic, as k258 mutants of niv-m do not target trim6 for degradation. the study expands our understanding of immune antagonism and highlights the potential purpose of henipavirus m protein nuclear trafficking.
although far less frequent, mirna binding may also cause an increase in target mrna translation and thus up-regulation of protein expression (vasudevan et al. 2007). in terms of target complementarity, mirnas do not require perfect base pairing (tenoever 2013). as a result, one mirna has the potential to regulate a surprisingly broad network of genes (skalsky and cullen 2010; zhang et al. 2013), with certain mirnas found to have binding sites located on several hundred different mrna sequences (guo and steitz 2014). despite the potential for widespread impacts, studies have described the effects of mirna gene regulation on protein expression levels as generally 'subtle' (tenoever 2013) or 'typically relatively mild' (selbach et al. 2008). this is due to the fact that, in general, mirnas do not entirely silence but rather moderately repress translation and, hence, effectively fine tune rather than knock out gene expression (baek et al. 2008). the role of mirnas in the infection cycle of rna viruses is becoming increasingly apparent. certain mirnas may promote virus replication by directly interacting with the viral genome or, alternatively, by down-regulating the expression of host genes that suppress virus infection (skalsky and cullen 2010; roberts et al. 2011). inhibiting specific 'proviral' mirnas, therefore, may have a direct negative impact on the viral life cycle (janssen et al. 2013) or alternatively render the intracellular environment unfavourable for virus replication (stewart et al. 2013). in an example of the latter, mir-146a has been found to promote hev infection by repressing ring finger protein 11, a negative regulator of nf-κb activity (stewart et al. 2013). furthermore, inhibiting mir-146a has been found to significantly reduce hev replication in vitro (stewart et al. 2013).
on the other hand, mir-122 is an example of a mirna that promotes hepatitis c virus (hcv) replication by directly interacting with the viral genome; this activity is the basis of the first mirna inhibitor drug to enter phase ii clinical trials (janssen et al. 2013; wilson and sagan 2014). the functional genomics platform established as part of the screen of protein-coding genes associated with hev infection was recently adapted to study the impact of host-encoded mirnas on hev growth (foo et al. 2016). the screen involved the use of synthetic mirna mimics and inhibitors targeting 834 micrornas. mimic and inhibitor screens identified 35 and 61 micrornas, respectively, that promoted hev infection, and 19 and 83 micrornas, respectively, that inhibited virus infection. a major finding from this study was that all four members of the mir-181 family (-a to -d) promote infection by hev and niv. infection promotion was primarily mediated via the ability of mir-181 to significantly enhance henipavirus-induced membrane fusion. cell signalling receptors of ephrins, namely epha5 and epha7, were identified as novel negative regulators of henipavirus fusion. the expression of these receptors, as well as ephb4, was suppressed by mir-181 overexpression, suggesting that simultaneous inhibition of several ephs by the mirna contributes to enhanced infection and fusion. to our knowledge, this study represented the first evidence of a host-encoded mirna promoting virus cell entry. previous studies have reported that members of the mir-181 family are involved in different aspects of immune regulation (hutchison et al. 2013; galicia et al. 2014; zietara et al. 2013). specifically, mir-181 has been found to play a central role in the regulation of b cell differentiation and t cell selection, maturation and sensitivity (sun et al. 2014).
for instance, induction of mir-181a has been found to occur at the cd4(+)-cd8(+) double-positive stage of t cell development, inhibiting the expression of cd69, bcl-2 and t cell receptor, all involved in positive selection and t cell maturation (neilson et al. 2007). in addition, mir-181c has been found to suppress cd4+ t cell activation by targeting interleukin 2 (il-2) (sun et al. 2014; xue et al. 2011). furthermore, mir-181a expression levels have been shown to correlate with pro-inflammatory signals (e.g. il-1β, il-6 and tnf-α) in blood and various tissues of humans with chronic inflammation, as well as in the blood of lps-treated mice (xie et al. 2013). consistent with the notion that mir-181 expression is immune-responsive, levels of mir-181 were up-regulated in the biofluids of ferrets and horses infected with hev, suggesting that the host innate immune response may promote henipavirus spread and exacerbate disease severity. the study of both mirnas and protein-coding genes associated with hev infection allows an assessment of whether genes required for virus infection (i.e. proviral genes) are regulated by mirnas that inhibit virus infection (i.e. antiviral micrornas). multiple members of the let-7 mirna family inhibited hev infection. there are 10 mature let-7 sequences in humans, with multiple roles described, including negative regulation of tumorigenesis (shi et al. 2008; esquela-kerscher and slack 2006). in a transcriptome-wide study in hela cells, genes significantly down-regulated by let-7b at either the mrna level, protein level or both included fourteen validated genes required for wild-type hev infection, including akt1 (selbach et al. 2008). furthermore, six proviral genes contain putative let-7b binding sites in their 3′ utr (akt1, c6orf106, eif2s3, hmga1, ifitm3 and serpinh1), as identified by diana-mirextra (alexiou et al. 2010).
collectively, these data suggest that let-7 mirnas inhibit hev by suppressing host proteins required for virus infection. cross-referencing results from the protein-coding screen study showed that the majority of verified target genes for mir-181 and mir-17-92 mirnas (proviral in the mirna screen) were predominantly antiviral, demonstrating a level of congruency between mirna and protein-coding gene screens. in contrast to let-7, all six members of the mirna precursor mir-17 family (mir-17, -20a, -20b, -106a, -106b and -93), part of the oncogenic mir-17-92 polycistron, strongly promoted hev infection. interestingly, other mirnas of the mir-17-92 cluster with distinct "seed" families (based on sequence identity at positions 2-7), namely mir-18, mir-19 and mir-92, did not impact virus replication to a similar extent. the mir-17-92 cluster is a known oncogene locus: it is amplified in b cell lymphomas (ota et al. 2004) and accelerates tumour development in a mouse b cell lymphoma model (he et al. 2005). members of the mirna precursor mir-17 family are expressed in almost all human tissues (liang et al. 2007). in addition, mir-106a and -106b are expressed in peripheral blood mononuclear cells (pbmcs), platelets and exosomes derived from peripheral blood (hunter et al. 2008). henipaviruses are dangerous pathogens, and control of disease caused by these viruses will critically rely on the development of new antiviral therapeutics and vaccination strategies. currently, there is a requirement for renewed research into the host immune responses to henipavirus infection and how competent immune responses may fight disease. a major challenge is to ascertain the molecular mechanisms of virus replication and immunity associated with protection against infection. improved knowledge of functional genomics approaches and of the immune response to viral infection means that we now have the tools to further our understanding.
nevertheless, this must be implemented to develop advanced infection control approaches. the diana-mirextra web server: from gene expression data to microrna function recognition of double-stranded rna and activation of nf-kappab by toll-like receptor 3 the v proteins of paramyxoviruses bind the ifn-inducible rna helicase, mda-5, and inhibit its activation of the ifn-beta promoter small interference rna profiling reveals the essential role of human membrane trafficking genes in mediating the infectious entry of dengue virus novel viral translation strategies paramyxovirus evasion of innate immunity: diverse strategies for common targets the impact of micrornas on protein output the matrix protein of nipah virus targets the e3-ubiquitin ligase trim6 to inhibit the ikkepsilon kinase-mediated type-i ifn antiviral response statistical methods for analysis of high-throughput rna interference screens ephrin-b2 ligand is a functional receptor for hendra virus and nipah virus identification of host proteins required for hiv infection through a functional genomic screen the ifitm proteins mediate cellular resistance to influenza a h1n1 virus, west nile virus, and dengue virus ebola virus entry requires the cholesterol transporter niemann-pick c1 mda-5, but not rig-i, is a common target for paramyxovirus v proteins nipah virus sequesters inactive stat1 in the nucleus via a p gene-encoded mechanism genome-wide sirna screening at biosafety level 4 reveals a crucial role for fibrillarin in henipavirus infection hendra and nipah viruses: different and dangerous oncomirs -micrornas with a role in cancer downregulation of microrna-24 and -181 parallels the upregulation of ifn-gamma secreted by activated human cd4 lymphocytes picornaviruses and nuclear functions: targeting a cellular compartment distinct from the replication site of a positive-strand rna virus ):e1005974 from the centers for disease control and prevention (1999) outbreak of hendra-like virus-malaysia and singapore 
paramyxovirus-induced shutoff of host and viral protein synthesis: role of the p and v proteins in limiting pkr activation mirna-181a regulates toll-like receptor agonist-induced inflammatory response in human fibroblasts the matrix protein of human respiratory syncytial virus localises to the nucleus of infected cells and inhibits transcription virus meets host microrna: the destroyer, the booster, the hijacker isolation of hendra virus from pteropid bats: a natural reservoir of hendra virus drosophila rnai screen identifies host genes important for influenza virus replication evidence of henipavirus infection in west african fruit bats a microrna polycistron as a potential human oncogene functional genomics: it's all how you read it the interaction of animal cytoplasmic rna viruses with the nucleus to facilitate replication irfs: master regulators of signalling by toll-like receptors and cytosolic pattern-recognition receptors clinical presentation of nipah virus infection in bangladesh detection of microrna expression in human peripheral blood microvesicles evidence for mir-181 involvement in neuroinflammatory responses of astrocytes treatment of hcv infection by targeting microrna genome-wide rnai screen identifies human host factors crucial for influenza virus replication haploid genetic screen reveals a profound and direct dependence on cholesterol for hantavirus membrane fusion global analysis of host-pathogen interactions that regulate early-stage hiv-1 replication human host factors required for influenza virus replication identification of 53 compounds that block ebola virus-like particle entry via a repurposing screen of approved drugs a ribosome-specialized translation initiation pathway is required for cap-dependent translation of vesicular stomatitis virus mrnas characterization of microrna expression profiles in normal human tissues the non-pathogenic henipavirus cedar paramyxovirus phosphoprotein has a compromised ability to target stat1 and stat2 
matrix proteins as centralized organizers of negative-sense rna virions mechanism of t cell regulation by micrornas interferon-inducible cholesterol-25-hydroxylase broadly inhibits viral entry by production of 25-hydroxycholesterol recurrent zoonotic transmission of nipah virus into humans transmission of human infection with nipah virus recognition of single-stranded rna viruses by toll-like receptor 7 acts as a safeguard of translational control by regulating fibrillarin and rrna methylation in cancer cedar virus: a novel henipavirus isolated from australian bats recombinant hendra viruses expressing a reporter gene retain pathogenicity in ferrets antiviral responses induced by the tlr3 pathway influenza a h3n2 subtype virus ns1 protein targets into the nucleus and binds primarily via its c-terminal nls2/nols to nucleolin and fibrillarin hendra virus vaccine, a one health approach to protecting horse, human, and environmental health nipah virus infection of pigs in peninsular malaysia detailed morphological characterisation of hendra virus infection of different cell types using super-resolution and conventional imaging a morbillivirus that caused fatal disease in horses and humans activin and tgfbeta regulate expression of the microrna-181 family to promote cell migration and invasion in breast cancer cells dynamic regulation of mirna expression in ordered stages of cellular development subcellular trafficking in rhabdovirus infection and immune evasion: a novel target for therapeutics identification and characterization of a novel gene, c13orf25, as a target for 13q31-q32 amplification in malignant lymphoma the nucleus introduced differential detergent treatment allows immunofluorescent localization of the newcastle disease virus matrix protein within the nucleus of infected cells evidence for ubiquitin-regulated nuclear and subnuclear trafficking among paramyxovirinae matrix proteins evidence for henipavirus spillover into human populations in africa 
mechanisms of type-i-and type-ii-interferon-mediated signalling the hiv tat protein affects processing of ribosomal rna precursor the nucleolar interface of rna viruses the role of micrornas in viral infection identification of the nuclear export signal and stat-binding domains of the nipah virus v protein reveals mechanisms underlying interferon evasion nipah virus v protein evades alpha and gamma interferons by preventing stat1 and stat2 activation and nuclear accumulation hendra virus v protein inhibits interferon signaling by preventing stat1 and stat2 nuclear accumulation fibrillarin from archaea to human measles virus n protein inhibits host translation by binding to eif3-p40 interferon-stimulated genes and their antiviral effector functions widespread changes in protein synthesis induced by micrornas discovery of insect and human dengue virus host factors identification and characterization of mavs, a mitochondrial antiviral signaling protein that activates nf-kappab and irf 3 a physical and regulatory map of host-influenza interactions reveals pathways in h1n1 infection nipah virus v and w proteins have a common stat1-binding domain yet inhibit stat1 activation from the cytoplasmic and nuclear compartments, respectively nuclear localization of the nipah virus w protein allows for inhibition of both virus-and toll-like receptor 3-triggered signaling pathways cancerous mirnas and their regulation viruses, micrornas, and host interactions promotion of hendra virus replication by microrna 146a role of mir-181 family in regulating vascular inflammation and immunity molecular mechanism of paramyxovirus budding rna viruses and the host micro rna machinery glutamine methylation in histone h2a is an rna-polymerase-i-dedicated modification two mrnas that differ by two nontemplated nucleotides encode the amino coterminal proteins p and v of the paramyxovirus sv5 visualisation of direct interaction of mda5 and the dsrna replicative intermediate form of positive strand 
rna viruses update: outbreak of nipah virus-malaysia and switching from repression to activation: micrornas can up-regulate translation ubiquitin-regulated nuclear-cytoplasmic trafficking of the nipah virus matrix protein is important for viral budding nipah virus matrix protein: expert hacker of cellular machines comparative genetics. systematic discovery of cap-independent translation sequences in human and viral genomes hepatitis c virus and human mir-122: insights from the bench to the clinic novel henipa-like virus, mojiang paramyxovirus, in rats proteomics informed by transcriptomics reveals hendra virus sensitizes bat cells to trail-mediated apoptosis ) mir-181a and inflammation: mirna homeostasis response to inflammatory stimuli in vivo the role of ribosomal proteins in the regulation of cell proliferation, tumorigenesis, and genomic integrity human activated cd4(+) t lymphocytes increase il-2 expression by downregulating microrna-181c specialized ribosomes: a new frontier in gene regulation and organismal biology nipah virus infection in bats (order chiroptera) in peninsular malaysia the rna helicase rig-i has an essential function in double-stranded rna-induced innate antiviral responses membrane (m) protein of hvj (sendai virus): its role in virus assembly serologic evidence for the presence in pteropus bats of a paramyxovirus related to equine morbillivirus nucleolar localization of non-structural protein 3b, a protein specifically encoded by the severe acute respiratory syndrome coronavirus robust statistical methods for hit selection in rna interference high-throughput screening experiments progress in microrna delivery genome-scale rnai screen for host factors required for hiv replication critical role for mir-181a/b-1 in agonist selection of invariant natural killer t cells key: cord-009636-5kddituy authors: shirbaghaee, zeinab; bolhassani, azam title: different applications of virus‐like particles in biology and medicine: vaccination and delivery 
systems date: 2015-12-22 journal: biopolymers doi: 10.1002/bip.22759 sha: doc_id: 9636 cord_uid: 5kddituy virus‐like particles (vlps) mimic the overall structure of virus particles but are devoid of the viral genome, and are used in subunit vaccine design. vlps can elicit efficient protective immunity as direct immunogens, compared with soluble antigens co‐administered with adjuvants in several booster injections. to date, several prokaryotic and eukaryotic systems, such as insect, yeast, plant, and e. coli, have been used to express recombinant proteins, especially for vlp production. recent studies have also generated vlps in plants using different transient expression vectors for edible vaccines. vlps and viral particles have been applied for different functions such as gene therapy, vaccination, nanotechnology, and diagnostics. herein, we describe vlp production in different systems as well as its applications in biology and medicine. © 2015 wiley periodicals, inc. biopolymers 105: 113–132, 2016. virus-like particles (vlps), known as viral "empty shells", maintain the structural properties of virions but lack a genome. these constructs are considered very efficient as vaccine platforms and therapeutic delivery systems. 1 many antigens can readily be displayed on the surface of vlps. these antigens can be genetically or chemically fused to the vlp. 2 according to reports, immune stimulation by vlps includes: (a) stimulation of innate immunity through tlrs and other pattern recognition receptors (prrs) due to the expression of multivalent structures; (b) induction of a strong humoral response, including igm, in a t-cell-independent way; and (c) enhancement of uptake, processing and presentation by apcs through the mhc i and mhc ii cross-presentation pathways due to the particulate nature of vlps. 3 vlps can be injected subcutaneously or intramuscularly. their small diameter facilitates entry into lymphatic vessels and direct drainage into local lymph nodes.
once in the lymph node, vlps are taken up by lymph node-resident dendritic cells (dcs). this uptake is enhanced by the size and form of vlps. vlps stimulate cd4 t cells via the mhc ii pathway and also undergo highly efficient cross-presentation via the mhc class i pathway. 3 generally, viral-like particles are considered vaccine candidates because their natural properties, such as multimeric antigens and specific structures, are suitable for stimulating efficient humoral and cellular immunity. the development of recombinant subunit vaccines (suvs) using heterologous expression systems has increased significantly. antigens derived from many bacterial, viral, fungal and parasitic pathogens have been used for safe and effective vaccination. five vlp-based vaccines have already been approved, including three for hbv and two for hpv, while in the veterinary field a vlp-based vaccine against porcine circovirus type 2 (pcv2) has been approved. some vlp-based vaccines targeting human and animal diseases are currently in late stages of clinical trials. vlps are valuable as academic, industrial, and commercial systems, especially in gene therapy and the design of nanomaterials. however, studies of vlp-based applications (vaccination, gene and drug delivery, and imaging) must continue in order to demonstrate the reliability and cost efficiency of this technology. furthermore, expression systems should be improved to identify the best strategy for vlp production from different viral genes. this review focuses on vlp characteristics and applications, especially as vaccines or delivery systems for dna, sirna and drugs. it should be noted that in the vaccination field, a viral-like particle that carries genetic material is called a "vectored vaccine", 4 and in gene therapy such particles are called viral vectors. however, for simplicity, in this review we refer to all of these particles as viral-like particles (vlps).
viral-like particles (vlps) have been generated for more than thirty different infectious viruses of animals and humans. 5 vlps are composed of one or more structural (capsid) proteins that naturally self-assemble, and are morphologically similar to authentic viruses. 5, 6 compared with live viruses, vlps are non-replicating and non-infective because they lack infectious genetic material. 5 virus-like particles have the potential to be used as safe vaccine candidates without the need for any adjuvant. 7, 8 different viruses provide different structures for the generation of viral-like particles: a. simple viral capsids with one or two major proteins (e.g., parvoviruses, papillomaviruses, circoviruses, caliciviruses, hepatitis e virus (hev) and polyomaviruses). b. complex viral capsids with various protein layers, encoded by many distinct mrnas or generated from a single polyprotein (e.g., picornaviruses). c. viral capsids with lipid envelopes, including a lipid bilayer obtained from the host cell as well as viral glycoprotein spikes (e.g., influenza, hiv and hepatitis c). 5, 9 figure 1 shows the general model of vlp along with its applications. the selection of the expression vector is one of the major factors in vlp generation. reports of the successful production of 174 vlps indicate that bacterial, yeast and insect systems were used in 28%, 20%, and 28% of cases, respectively. in addition, mammalian cells (15%) and plants (9%) were usually applied to produce vlps with special properties. 8 bacterial systems often involve commercial e. coli strains and expression vectors, which produce non-enveloped vlps at high levels compared with other systems (table i). 5 in addition, bacterial cells have been used to generate vlps that require several types of structural proteins, such as the avibirnavirus ibdv vp2, vp4, and vp3-polyproteins. 54 reports indicated that the expression of the hepatitis b virus (hbv) capsid protein in e.
coli leads to the formation of structures similar to the hbv core (hbc) particle. 55

figure 1 general model of vlp along with its applications: the picture shows the recombinant hpv16 l1 pentamers assembled in vitro into capsid-like structures. self-assembly of recombinant viral coat proteins into empty capsids is a promising strategy for production of virus-like particles (vlps) in vaccine design. the resulting vlps can induce a protective immune response by mimicking the authentic epitopes of virions.

bacterial systems are not always the preferred choice for vlp production because of several factors, such as (a) the inability to generate recombinant proteins with mammalian-like post-translational modifications (ptm), (b) failure to form the correct disulfide bonds, (c) problems with protein solubility, and (d) the presence of lipopolysaccharides (lps), or endotoxins, in recombinant protein (rp) preparations. 5 viral coat proteins (cps) can be efficiently produced as insoluble inclusion bodies, purified under denaturing conditions, refolded, and self-assembled, as shown for parvovirus b19 and the plant viruses ccmv and cmv. 56 a simple change in the cultivation conditions, such as a lower temperature, can solve the problem of inclusion bodies and induce the formation of soluble vlps, as shown for two viral systems, the densovirus ihhnv 57 and the potyvirus pvy. 58 other factors, including the resistance markers of the expression plasmids and the composition of the cultivation medium, can also affect vlp assembly (e.g., bacteriophage qβ). 59 another strategy applied to increase expression levels and solubility involves the use of different fusion protein systems, e.g., glutathione-s-transferase (gst) fusion proteins, as for the papillomavirus l1, the polyomavirus mupyv, and the picornavirus fmdv. [60] [61] [62] [63] other prokaryotic hosts have recently been used to generate vlps, e.g., lactobacillus.
8 the intracellular assembly of hpv16 l1 vlps was reported in lactobacillus casei, a lactose-inducible expression strain. 5 furthermore, the production of l1 vlps in lactobacillus has led to the development of new live mucosal prophylactic vaccines (table i). 29, 30 a pseudomonas fluorescens (p. fluorescens) expression system is an efficient alternative to e. coli because of its simple manipulation, high yields of active and soluble proteins, and suitability for large-scale cultivation. some differences between p. fluorescens and e. coli, including genome size and metabolic pathways, can influence the generation of recombinant proteins. 64 the capsid protein of a plant bromovirus, the cowpea chlorotic mottle virus (ccmv), has recently been expressed in a soluble form in p. fluorescens and assembled into vlps in vivo. this construct was structurally similar to the natural viral particles obtained from plants. 64 eukaryotic expression systems are an attractive alternative to bacteria, especially for solving the problem of bacterial endotoxins in vaccine development. some structural genes of mammalian viruses expressed in yeast are able to form vlps. this expression host was used efficiently to generate the first licensed hbv vaccine. 65 hbsag is one of the antigens commonly used for production of vlp-based hbv vaccines. hbsag has been expressed in pichia pastoris (p. pastoris), saccharomyces cerevisiae (s. cerevisiae) and hansenula polymorpha (h. polymorpha) (table i). 5, 16 it is important to note that viral-like particles are not always formed during cultivation of the yeast cells. studies showed that self-assembly of vlps in the pichia system is completed during protein purification. 16, [66] [67] [68] the expression and self-assembly of recombinant bacteriophage qβ coat protein (qβ-cp) was demonstrated in saccharomyces cerevisiae and pichia pastoris.
the yeast-derived qβ-vlps were highly immunogenic in mice, similar to e. coli-derived qβ-vlps. 69 ms2 vlps produced in saccharomyces cerevisiae could package functional heterologous mrnas. for example, linking the ms2 packaging sequence to the human growth hormone mrna allowed packaging of the mrna in ms2 vlps. indeed, the high stability of ms2 vlps suggests that they could be an efficient delivery system for rna-based vaccines. 70 the p. pastoris system was also used as a potent alternative for expression of ccmv coat protein vlps because of its easy manipulation and high expression levels. 71 in addition, this system has been used to efficiently express the premembrane and envelope glycoproteins of dengue virus type 2 (denv-2), 72 hbsag, 73, 74 and hccag, 75 resulting in the generation of vlps. 72 the major advantage of yeast systems is ptm, including phosphorylation and glycosylation, as shown for hbv vlps. 8 studies indicated that hbc phosphorylation plays a major role in viral replication and capsid formation. such yeast-derived hbc vlps are valuable for vaccination and diagnostics. 76 furthermore, potent multigene expression systems have been constructed in yeasts. for example, the expression of three rotavirus structural genes from a single plasmid vector led to the generation of triple-layered vlps in saccharomyces cerevisiae. 8, 77 however, the multimerization of protein into vlps is not supported for enveloped viruses (e.g., gag vlps of hiv-2), suggesting that yeast lacks essential host factors. 8 thus, enveloped hiv-1 pr55gag vlps, morphologically similar to immature viral particles, have been generated using s. cerevisiae spheroplasts. 5, 78 in general, the construction of yeast expression systems, especially with hansenula and pichia strains, is more difficult than that of bacterial vectors. in addition, the yield of vlp production is lower than that in e. coli.
8 another limitation of the yeast system is its dissimilarity to mammalian cells in the ptm of proteins, especially glycosylation. 79,80 therefore, this system is more suitable for the generation of non-enveloped viral-like particles. another attractive system used broadly for production of vlps is the baculovirus-insect cell expression system, owing to advantages such as rapid growth rates, large-scale culture preparation, and ptm of the target proteins similar to that in mammalian cells. [81] [82] [83] both yeast and insect cells have been used for vp1 expression of several polyomaviruses and its assembly into viral-like particles. 84 in addition, insect cells have been used to produce vlp-based vaccines, e.g., the approved hpv vaccine cervarix. indeed, insect cells are able to generate both vlp types (i.e., enveloped and non-enveloped). there are enveloped vlps in clinical trials. 9 the main limitation of the insect cell system is contamination of the product with enveloped baculovirus particles, which calls for the development of efficient strategies for purification of vlps. 85 recently, co-expression of four genes of human influenza h3n2 virus (i.e., ha, na, m1, and m2) in insect cells generated influenza vlps that protected mice against h3n2 virus challenge. 86 these data suggested that viral-like particles are a promising vaccine candidate for h9n2 influenza and probably other subtypes of virulent avian influenza viruses. 87 non-infectious viral-like particles of the alphavirus sav were also generated using recombinant baculoviruses expressing the sav capsid protein and two major immunodominant viral glycoproteins (e1 and e2) in insect cells. 88 moreover, the baculovirus expression system was used to generate vlps from cowpea mosaic virus (cpmv), tomato bushy stunt virus, and enterovirus 71 (ev71).
8, 89, 90 recently, non-replicative baculoviruses have been developed to cope with the problem of baculovirus contamination. 91 stable systems using insect cells have also been tested. 92 moreover, silkworm expression systems have been efficiently applied to generate vlps, and the surface of the vlps could be modified by several strategies, irrespective of whether the constructs are enveloped or not. silkworms show a high capacity for production of recombinant proteins in comparison with insect cells, as well as easy and inexpensive protein preparation similar to the e. coli expression system. 81 for over two decades, different mammalian cell lines have been developed as a source of commercial therapeutic proteins for clinical applications, 93 because of their capacity for proper protein folding, assembly and ptm (e.g., the correct glycosylation pattern). 8, 93 however, high production costs and potential safety concerns remain a challenge for these systems. mammalian cells have been increasingly used to produce vlp-based vaccines, 5,94 e.g., for influenza viruses. for instance, the generation of a stable mammalian cell line (e.g., vero cells) expressing four influenza structural proteins (ha, na, m1, and matrix 2 (m2)) led to the formation of hybrid vlps containing matrix proteins and surface glycoproteins of the h3n2 and h5n1 influenza types, respectively. 8, 95 other examples are the produc97 and hiv-1 vlp in cos-7/vero cells, 98 and hbv vlp in cho cells. [99] [100] [101]

plant systems

plants have been successfully used to express specific gene products. the feasibility of using recombinant plants to generate vaccine antigens has been shown in tobacco plants, potato tubers, and others. 102 this approach yields vaccine strategies that can stimulate mucosal as well as systemic immune responses. in addition, it can be delivered orally as part of a normal biologic function in humans.
102 the antigen expressed in plant systems shows extensive disulphide crosslinking and oligomerization for formation of virus-like particles. for example, the hepatitis b major surface antigen has been expressed in several plant systems. 103 plants are able to express and assemble both types of vlps (i.e., enveloped and non-enveloped) as multimeric and chimeric proteins. high expression of vlps in plants is easy and rapid (e.g., 1-2 weeks) using a tobacco mosaic virus (tmv) rna replicon system and/or a bean yellow dwarf virus (beydv) dna replicon system. 104 another advantage of plants is the use of plant virus particles as a delivery system to present foreign epitopes. furthermore, the problem of plant-specific glycans has been partially solved by the development of transgenic plants with "humanized" glycosylation pathways. 104 plant-derived vlps can be used for oral delivery of vaccines. viral-like particles are more resistant to digestive enzymes in the body than soluble proteins because of their highly ordered and packed structures. for example, vlps derived from gastrointestinal viruses, including noroviruses and rotaviruses, have been used orally as potent candidates for mucosal immunization. 105 plant-derived vlps showed the same structures as vlps generated in other expression systems, with comparable or higher immunogenicity. some plant-derived vlps induced protective humoral and cellular immunity and were also safe in clinical studies. 105 studies showed that the level of protein expressed in recombinant plants is variable and often low. therefore, a further increase in expression will be necessary for practical and efficient products. 102 recent progress in the glyco-engineering of plants allows human-like glyco-modification and optimization of desired glycan structures to increase the safety and functionality of recombinant pharmaceutical glycoproteins.
1 some plant-based systems can stabilize antigen and thus reduce storage and distribution costs. 103 toxoplasma gondii (t. gondii) is an obligate intracellular parasite that infects the nucleated cells of warm-blooded vertebrates. this parasite is able to stimulate strong humoral, cellular and mucosal immunity, and thus it can be used as an efficient delivery system for heterologous antigens. t. gondii has been applied as a vector for live vaccination against infectious pathogens. [104] [105] [106] [107] [108] [109] recently, a non-pathogenic kinetoplastid, leishmania tarentolae, has been used to express heterologous proteins. studies showed that expression of mammalian glycoproteins in this parasite leads to their modification with mammalian-like oligosaccharides. [110] [111] [112] [113] recently, our group has focused on its use as a live vector or killed vaccine, [114] [115] [116] and also on the generation of viral coat proteins and their assembly as vlps in this system. 117 virus-like particles offer an efficient strategy to deliver antigens to the immune system, inducing both arms of adaptive immunity. 118 indeed, vlps present antigenic epitopes in the proper conformation, leading to the induction of humoral responses. 5 for example, preclinical trials with influenza vlps indicated their capacity to induce both humoral and cellular immune responses at low antigen doses. several authors have reported antibody responses to parenterally or orally administered plant-derived antigens. 119, 120 as exogenous antigens, vlps are taken up by professional antigen-presenting cells (apcs), especially dcs, followed by antigen processing and presentation via mhc class ii molecules, dc activation and maturation through up-regulation of co-stimulatory molecules and cytokine production, and stimulation of cd4+ t helper cells. all these events can efficiently induce both humoral and cellular immunity.
5 in addition, exogenous vlps can enter the cytosol of dcs and be processed and presented by mhc class i molecules to cytotoxic t lymphocytes (ctls) through cross-presentation. 5, 121, 122 furthermore, b-cell activation by vlps is robust enough to induce t cell-independent igm antibodies. 7, 8 dcs loaded with yeast-derived hiv vlps can convert gag-specific memory cd8+ t cells into effector cells through cross-presentation in chronically hiv-infected individuals, although some gag-specific t cells in these patients did not show any response. 123 reports showed that the expression system used for generation of vlps might significantly affect the direction, type and outcome of immune responses. 121 for example, potent and specific immunomodulatory effects were attributed to yeast-derived hiv vlps in comparison with those from other expression systems. 123 on the other hand, plant- and insect-derived vlps consisting of the l1 capsid protein of hpv were immunogenic to an equal degree. half of the mice fed transgenic potatoes expressing hpv vlps developed l1-specific antibodies. 124 studies indicated that vlps are taken up by clathrin-dependent macropinocytosis and phagocytosis before being degraded in acidic lysosomal compartments. vlp-derived peptides are loaded onto mhc i molecules that have been recycled from the cell surface. 125 one study showed that uptake of vlps by dcs and dc activation involve proteoglycan receptors, tlr4 and nf-κb, and can be inhibited by heparin. 126 several data suggest different routes of vlp uptake by dcs and langerhans cells (lcs). 127 for example, lcs and dcs internalize similar amounts of the hpv-vlps used in vaccine design, albeit through different uptake mechanisms. 128, 129 vlp uptake by dcs results in activation and cross-presentation of mhc i-restricted peptides with co-stimulation to t cells. on the other hand, vlp uptake by lcs leads to cross-presentation in the absence of co-stimulation.
efficient vlp cross-presentation by lcs with co-stimulation can be achieved by addition of cd40 ligand. 128 the lack of a protective immune response after viral contact with lcs may explain why some women fail to mount an immune response against the virus. lcs endocytose hpv vlps via a non-clathrin, non-caveolae, actin-independent pathway, whereas dcs take up hpv vlps both by a clathrin-mediated mechanism and via macropinocytosis in an actin-dependent manner. as a result of this difference in endocytosis, lcs process and present hpv vlp peptides on their surface as dcs do, but in the absence of co-stimulation. with the addition of cd40l, lcs incubated with hpv vlps produced the pro-inflammatory cytokine il-12 and could stimulate an hpv-specific immune response after incubation with t cells. 128 despite these differences, vlps taken up by dc and lc were able to prime naive cd8+ t cells and induce cytolytic effector t cells in vitro. 127 furthermore, hiv-1 pr55 gag virus-like particles could stimulate strong humoral and cellular immune responses. vlps expressed by recombinant baculoviruses activated human pbmc to release pro-inflammatory (il-6, tnf-α), anti-inflammatory (il-10) and th1-polarizing (ifn-γ) cytokines as well as gm-csf and mip-1α in a dose- and time-dependent manner. furthermore, vlp-induced monocyte activation was shown by up-regulation of molecules involved in antigen presentation (mhc ii, cd80, and cd86) and cell adhesion (cd54). exposure of vlps to serum inactivated their capacity to stimulate cytokine production. 130 the linking of vlps to adjuvant molecules was also shown to improve the immunogenicity of the nano-bioparticles. 131 adjuvanted vlps [e.g., cpg odn1826 or poly (i:c) adjuvants] elicited a higher titer of total specific igg compared to vlps alone.
furthermore, while vlps alone induced a balanced th2 pattern, vlps formulated with adjuvant elicited th1-biased igg subclasses (igg2a and igg3), with poly (i:c) more potent than cpg odn1826 in an animal model. 118 in addition, immunization of mice with chimeric simian immunodeficiency virus (siv) vlps containing gm-csf induced siv env-specific antibodies as well as neutralizing activity at significantly higher levels than those elicited by standard siv vlps, siv vlps containing cd40l, or standard vlps mixed with soluble gm-csf. on the other hand, the incorporation of immunostimulatory molecules significantly increased cd4+ and cd8+ t-cell responses to siv env, compared to standard siv vlps. 132 formulation of vlps with rough lps (r-lps) adjuvant as well as a dna prime-vlp boost regimen increased specific immune responses as compared to vlps alone, with vlp/r-lps giving the strongest enhancement. 133 recent studies demonstrated the potential of the hbc vlps as an oral immunogen. intraperitoneal immunization with the hbc vlp induced a strong, mixed th1/th2 response. in contrast, oral administration of the hbc vlp generated a high humoral response with mainly igg2a antibodies, directing toward a th1 response which is essential in the control of intracellular pathogens. 134 in addition, the intranasal monovalent adjuvanted norwalk vlp vaccine was well tolerated and highly immunogenic. 135 the studies showed that chimeric hpv-vlps are able to elicit potent ctl responses in mice against hpv16-transformed tumors; however, the mechanism of t cell priming has remained obscure. hpv vlp could bind to human mhc class ii-positive apcs through interaction with fcγriii, and immature dcs were activated after incubation with hpv vlp. 136 it was shown that binding and uptake of vlp by dc from fcγrii, fcγriii, and fcγrii/iii deficient mice are reduced by up to 50% compared with wild-type mice.
in addition, maturation of murine dc from fcγrii/iii-deficient mice by vlp is also reduced, indicating that dc maturation, and thus ag presentation, is diminished in the absence of fcγr expression. 136 poor immunogenicity of mucosally administered proteins has been a major barrier to the development of efficient oral vaccines. one way to overcome this obstacle is the use of appropriate adjuvants. also, delivery of antigen to mucosal surfaces as vlp provides an efficient way of inducing mucosal immunity. after oral or intranasal immunization with norwalk vlp, or rotavirus vlp without adjuvant, intestinal iga was detected in immunized mice, which were protected from virus challenge. 137 in addition, the plasma cell precursors that migrate to the genital tract are derived primarily from mucosal lymphoid tissues and often secrete iga. 138 the studies indicated that immune responses generated by mucosal administration of vlp were generally weaker than those generated by systemic administration. vlp-specific iga was higher in intestinal washes following intrarectal (i.r.) than intravaginal (i.va.) immunization, and higher in vaginal washes following intramuscular (i.m.) than i.r. or i.va. immunization. some studies suggested that the immunogenicity of virus particles at mucosal surfaces is probably a property of particulate antigens assembled as multimers of subunits. indeed, vlp might be actively taken up by mucosal apc through the integrin receptors. 137 lipoparticles are stable, highly purified, homogeneous, and specialized vlps containing high concentrations of an integral membrane protein. integral membrane proteins are involved in different biological functions and are targeted by 50% of existing therapeutic drugs. however, because of their hydrophobic domains, membrane proteins are difficult to manipulate outside of living cells. lipoparticles can incorporate a wide variety of membrane proteins, including g protein-coupled receptors, ion channels, and viral envelopes.
lipoparticles provide a platform for different applications such as antibody screening, production of immunogens, and ligand binding assays. [139] [140] [141] during the assembly of enveloped viruses, lipid-ordered domains of the host cell plasma membrane, known as lipid rafts, frequently function as a natural target for viral proteins. the role of lipid rafts in the organization of complex combinations of immune receptors during antigen presentation and t cell signaling is extensively recognized. 142 on the other hand, in order to improve the immunogenicity of hiv-1 envelope glycoproteins, gp120 was fused to a carrier protein, hepatitis b surface antigen (hbsag), which is capable of spontaneous assembly into virus-like particles. the hbsag-gp120 hybrid proteins assembled efficiently into 20-30 nm particles. the particles resembled native hbsag particles in size and density, consistent with a lipid composition of about 25% and a gp120 content of about 100 per particle. particulate gp120 folded in its native conformation and was biologically active, as shown by high affinity binding of cd4. because the particles are lipoprotein micelles, an array of gp120 on their surface closely mimics gp120 on the surface of hiv-1 virions. these gp120-rich particles can enhance the quality as well as the quantity of antibodies elicited by a gp120 vaccine. 143 virus-like particles show an expanding spectrum of applications such as gene therapy, nanotechnology, vaccination, and diagnostics. 55, 77 recently, the studies showed a pattern of direct conjugation of some ligands, including nucleic acids and proteins, to the vlp surface. 144, 145 in addition, because of the superior accessibility of cysteine and lysine residues on vlps, bio-conjugation has been performed with commercial homo- or hetero-bifunctional linkers. [146] [147] [148] [149] for example, three foreign proteins were chemically conjugated to the vlp surface of cpmv by proper bifunctional cross-linkers.
147 on the other hand, the researchers could produce an alphavirus vlp surrounding a functional gold nanoparticle. 150 vlps have also been used to stimulate immune responses and generate antitumor responses, e.g., alphavirus-based virus-like replicon particles (vrp) expressing various melanoma antigens. [151] [152] [153] it is interesting that the first viral-associated cancer vaccines were founded on hbv vlp and hpv vlp to prevent hbv-associated hepatocellular carcinoma (hcc) and hpv-associated cervical carcinoma, respectively. 153, 154 it should be noted that these vlp formulations are viral vaccines that prevent a viral infection that may progress to carcinoma after a long time. some applications of vlps against viral diseases are as follows: in several studies, specific vaccine antigens were generated by various expression systems to induce protective immune responses and were applied in licensed recombinant viral vaccines. 100, 155 examples of preventive vlp-based vaccines commercialized worldwide include glaxosmithkline's engerix® (hbv) and cervarix® (hpv), and merck and co., inc.'s recombivax hb® (hbv) and gardasil® (hpv). other vlp-based vaccines are undergoing preclinical evaluation or clinical trials, including parvovirus-, influenza- and norwalk-derived vlps as well as different chimeric vlps. 156 for generation of immunogenic vlps, eukaryotic expression hosts including yeast (s. cerevisiae, p. pastoris and h. polymorpha) and mammalian cells (chinese hamster ovary cell line [cho]) were used. the studies indicated that the recombinant hbsag generated in cho and h. polymorpha have significant differences in size, molecular weight (mw), and monomer number. furthermore, the cho-derived viral-like particles include a combination of glycosylated and non-glycosylated hbsag proteins, similar to those in patients' sera, while yeast-derived antigens were reported to be non-glycosylated.
cho-based vaccines were provided by pasteur-mérieux aventis in france (genhevac b®) and scigen in israel (sci-b-vac™). both vaccines contained not only the hbsag s protein but also the m protein (genhevac b) or the m and l proteins (sci-b-vac). 156 on the other hand, gardasil, approved by the fda in 2006, is a quadrivalent hpv types 6/11/16/18 l1 vlp vaccine produced in s. cerevisiae. cervarix is the other licensed hpv vaccine, approved by the fda in 2009. 156 cervarix is a bivalent hpv types 16/18 l1 vlp vaccine expressed via a recombinant baculovirus vector. 25, 157 different experiments have concentrated on hpv vlp vaccination in mouse and human models including: (a) activation of immature human dcs by chimeric hpv16 vlps, (b) determination of systemic cytokine pattern elicited by hpv l1 vlps, (c) identification of gene expression signatures in hpv16 l1 vlp-induced human pbmcs, (d) generation of potent and prolonged neutralizing l1 antibodies using a single intramuscular (im) injection of mice with recombinant adeno-associated virus encoding hpv16 l1 protein (raav-16l1), (e) augmentation of immunogenicity of hpv l1 dna vaccines using genetic linkage to a chemokine and secretory signal peptide sequences, (f) potent stimulation of systemic and mucosal immune responses to vlp vaccines using the encapsulation of a genetic cytokine adjuvant (e.g., il-2), (g) improvement of hpv16 vlp immunogenicity by linkage to the modified adjuvant, and (h) nasal immunization of mice with hpv16 vlps. 158 hpv16 l1-e7 chimeric virus-like particles (cvlp) could induce e7- and l1-specific ctls. the therapeutic use of the cvlp also showed considerable safety in patients with high-grade cervical intraepithelial neoplasia (cin 2/3). 159 several improvements in vaccine design by vlp are still in preclinical trials. some main examples are as follows: a. co-injection of the hpv16 l1 vlp with e.
coli heat-labile enterotoxin (lt) as an adjuvant significantly increased the levels of serum igg and vaginal iga after nasal or bronchial immunization of mice. 160 antigens (hiv-1 p17/p24: ty vlp) was also immunogenic and well-tolerated in phase i clinical trials. 24,169 g. several groups have focused on improving bacteriophage-based vlp vaccines, e.g., rna bacteriophage qβ. these chimeric vlp vaccines were targeted against noninfectious disorders including hypertension, allergy, neurodegenerative and autoimmune diseases (e.g., diabetes mellitus type ii and alzheimer's disease), and cancer (e.g., melanoma). the vaccine candidate against alzheimer's disease (cad-106) was constructed to display the chemically coupled amyloid beta (aβ1-6) peptide, derived from the n-terminal b cell epitope of the aβ protein, on the surface of qβ-cp vlps. this vaccine could stimulate aβ-specific igg and decrease amyloid accumulation in animal models expressing aβ precursor protein, without eliciting t-cells or inflammatory reactions in brain tissue. 170, 171 in addition, the angiotensin ii vaccine was synthesized by covalent conjugation of a peptide derived from angiotensin ii to the rna bacteriophage qβ vlp capsid. this modified vlp could decrease blood pressure in spontaneously hypertensive rats. 172 table i shows preclinical and clinical studies of vlps in vaccine development. generally, a major application of vlps is the stimulation of immunity against foreign protein epitopes by genetically fusing or chemically conjugating them to vlps, yielding chimeric vlps (cvlps). 173 antigens can be attached to vlps through either covalent or non-covalent bonds. the most common covalent bond is generated by heterobifunctional chemical cross-linkers with amine- and sulfhydryl-reactive arms. 104 for instance, cysteine-containing antigens can be conjugated to lysine residues on the vlp surface at a high density (e.g., three peptides per coat protein molecule).
the non-covalent conjugation strategy involves the use of streptavidin as a linker to attach biotinylated antigens to vlps through their efficient and specific interactions. 104 sv40 vlps can also encapsulate various materials such as dna (5 kb) and proteins as antigens. insertion of a special exogenous peptide into the surface loops of vp1 produced sv40 vlps with the ability of cell targeting. moreover, sv40 vlps stimulated innate immunity as a natural adjuvant. indeed, sv40 vlps may be a promising vaccine candidate to deliver heterologous antigens followed by the induction of ctls without synthetic adjuvants. 174 several chimeric vlp vaccines have entered clinical trials, such as the anti-influenza a m2-hbcag vlp vaccine (hbcag vlps displaying m2 epitope of influenza a), the anti-hiv p17/p24: ty vlp, two anti-malaria vaccines (hbcag vlps displaying malaria epitopes), the nicotine-qβ vlp and the anti-ang ii qβ vlp. 175 genetic linkage provides a stable bond between vlp and antigen. the studies showed that only peptides shorter than 30 amino acids (small peptides) can be presented without interfering with the correct assembly of vlps. other limitations include the improper folding of displayed antigens and the formation of cvlps with heterogeneous size. to prevent these issues (e.g., assembly problems), structural studies have identified domains in different vlps such as hbcag, hbsag and hiv gag that are not necessary for vlp assembly and that allow insertion of foreign antigens. 104 the simplest way to generate single-component cvlps is the insertion of peptides at the n- or c-terminal regions of chimeric vlps. multiple fusion positions should be identified to produce multicomponent cvlps inducing broad immune responses. 104 the direction and intensity of the immune responses are significantly influenced by the vlp type, the foreign antigen density, and its accessibility on or within the vlp.
furthermore, preexisting immunity against the epitopes of the vlp as a delivery system may importantly change the response against the heterologous antigen. for example, hbcag was also utilized to display a neutralizing epitope of hpv16 l2 protein. the nasal delivery of hbcag-hpv16 l2 epitope cvlps expressed in tobacco induced antigen-specific antibody responses in a mouse model. on the other hand, an hpv16 l1-based chimeric vlp was generated in transgenic tomato to present several t-cell epitopes from hpv16 e7 and e6 proteins. the hpv l1-e7/e6 vlps elicited a neutralizing antibody response similar to that from an equal amount of the commercial vaccine (gardasil) in a preclinical study. moreover, the chimeric vlps induced ctl responses against the e7 and e6 epitopes. chimeric hpv l1 vlps were also designed using genetic fusion to display epitopes of influenza m2 protein. 104 to overcome the problems associated with genetic fusion, including the antigen size, conformation and vlp assembly, chemical conjugation approaches were applied to construct cvlps. in this strategy, target antigens and native vlps were generated individually and coupled together by attachment of the antigen to the surface of the pre-assembled vlps. two main advantages of this strategy include: (a) various sizes and types of antigens can be exposed, and (b) the antigen-vlp binding site can be manipulated for further presentation of the conjugated antigen. for example, vlps were used to display full-length and correctly folded proteins, such as interleukin-17 (il-17). 104 generally, vlps were used for delivery of protein/peptide, dna, sirna and drugs, as briefly described below. viral-like particles were used as a peptide/protein carrier, in vitro and in vivo. there are several examples of protein/peptide delivery using vlps, as follows: a.
chimeric vlp vaccines have been improved based on rna bacteriophage ap205, presenting peptides of self-antigens or pathogens fused to either the n- or c-terminal regions of ap205 coat protein. ap205-derived vlps were highly immunogenic in mice. furthermore, influenza m2 vlps stimulated an efficient m2-specific antibody response and full protection against lethal influenza virus challenge. 176 b. vlps containing flt3 ligand (fl-vlps), a dc growth factor, could effectively increase immunogenicity in mice. dcs exposed to vlps also produced high levels of il-6. 177 c. a plant vlp-based approach was used to develop a respiratory syncytial virus (rsv) vaccine. a target peptide displaying amino acids 170-190 of the rsv g protein was delivered on the surface of recombinant alfalfa mosaic virus (almv) particles. this construct induced high pathogen-specific immune responses in immunized animals. 178,179 d. in a recent study, a peptide from an external loop of mouse ccr5 protein was inserted into a neutralizing epitope of hpv l1. the particles generated by this chimeric l1 could elicit high levels of ccr5 antibodies that specifically recognized the surface of ccr5-transfected cells and blocked in vitro infection of an m-tropic hiv strain in mice. 161 in addition, chimeric vlps containing the full-length hpv16 e7 oncoprotein linked to l2, or the n-terminal region of e7 fused to l1, could induce antigen-specific protection of mice from lethal challenge with e7-expressing tumor cells. 180-182 e. a pre-s1 epitope of hbv was also inserted into the ef loop of hpv vlp and was recognized by hbv-specific antibody. 6 chimeric vlps produced in e. coli carried a virus-neutralizing hbv pre-s1 epitope in the major immunodominant region (mir) and a highly conserved n-terminal hcv core epitope (aa 1 to 60) at the c-terminal region of the truncated hbv core vlps (hbc).
the presence of two different foreign epitopes within the hbc molecule did not interfere with its vlp-forming potential, with the hbv pre-s1 epitope exposed on the surface and the hcv core epitope buried within the vlps. vaccination of mice showed specific t cell activation by both foreign epitopes and a high-level antibody response against the pre-s1 epitope, whereas an antibody response against the hbc carrier was inhibited. 183 f. the researchers have shown that the nanosized hbc-vlps bearing mycobacterial antigen cfp-10 (hbc-vlp: cfp-10 fusion protein) induced an increased immune response in balb/c mice compared to mixtures of native antigen. 184 g. chimeric papillomavirus vlps based on the bovine papillomavirus type 1 (bpv-1) l1 protein were designed by replacing the 23 carboxyl-terminal amino acids of the bpv1 major protein l1 with a synthetic "polytope" minigene, containing known ctl epitopes of human pv16 e7 protein, hiv iiib gp120 p18, nef, and reverse transcriptase (rt) proteins, and an hpv16 e7 linear b epitope. the chimeric l1 protein assembled into vlps in insect cells. polytope vlps could deliver multiple b and t epitopes as immunogens to the mhc class i and class ii pathways. this study has demonstrated that hybrid vlps can be used as an efficient antigen delivery system to transfer more than one ctl epitope through mhc class i pathways. 185 h. chimeric hpv vlps were generated in which hpv16 l2 neutralization epitopes (l2 residues 69-81 or 108-120) are inserted within an immunodominant surface loop (between residues 133 and 134) of the l1 major capsid protein of bpv1. immunization of rabbits with assembled particles elicited high l2-specific serum antibody responses. 186 193 l. the studies showed that the c-terminal region of gag fused with t cell epitopes from human cytomegalovirus pp65 led to the formation of hybrid vlps activating antigen-specific cd8+ memory t cells ex vivo.
161 according to previous studies, the gag polyprotein is the only retroviral protein required for vlp formation. [194] [195] [196] vlps derived from an avian retrovirus were applied to deliver proteins to cells, either as part of gag fusion proteins (intracellular delivery) or on the surface of vlps. the construct is an effective system because the vlps are completely made of the gag fusion protein, and a single vlp will deliver 2000-5000 copies of gag fusion protein into a transduced cell. 197 delivery of foreign genes to the digestive tract mucosa by oral administration of non-replicating gene transfer vectors would be a very useful method for vaccination and gene therapy. 198 the studies indicated that plasmid dna could be packaged in vitro into a vlp composed of open reading frame 2 (orf2) of hev, which is an orally transmissible virus. these vlps could deliver this foreign dna to the intestinal mucosa in vivo, eliciting high mucosal and systemic immunity in mice, without the use of adjuvants. an orally administered hiv dna vaccine encapsulated in hev-vlps could induce mucosal and systemic cellular and humoral immune responses. 198 moreover, the ability of hpv vlps to mediate delivery and expression of dna plasmids was examined in vitro and in vivo. 199 hpv pseudoviruses were produced by disassembling hpv vlps, mixing them with dna plasmids and reassembling them into pseudoviruses (vlps with plasmids inside). the pseudovirus induced more potent immune responses than dna vaccines. the pseudovirus could be used in gene therapy by transferring therapeutic genes into lymphoid tissues in humans. 5 in addition, the recombinant hpv16 l1 vlps, produced in insect cells, could efficiently encapsulate a plasmid harboring a gene for either gfp or β-galactosidase during in vitro disassembly-reassembly of vlps. 200 vlp-mediated delivery of a gfp reporter construct in vitro was shown to be highly dependent on the presence of full-length l2 protein within the vlps.
similarly, expression of gfp and luciferase reporter plasmids in vivo was efficiently enhanced by co-administration of l1/l2 vlps. in addition, co-administration of vlps with an hpv16 e6-expressing plasmid significantly increased e6-specific cellular immune responses. 201 the reports indicated that the recombinant major structural protein of the bk polyomavirus (bkv vp1) self-assembles into vlps with a diameter of 45-50 nm. the potential of bkv vp1 vlps to transfer genes into cos-7 cells was investigated using three methods for the formation of pseudovirions: disassembly/reassembly, osmotic shock and direct interaction between vlps and plasmid dna. the most efficient method was the direct interaction between vlps and linearized plasmid dna. the findings generally demonstrated that bkv vlps have exogenous dna-binding activity, making them a promising vehicle for gene transfer studies. 200
sirna delivery
there is a major challenge to identify novel approaches for specific and effective delivery of new types of drugs like sirnas and peptides. systemic delivery of small interfering rna (sirna) was restricted by its poor stability and low cell-penetrating properties. to overcome these limitations, an efficient sirna delivery system was designed using polyethyleneimine (pei)-coated vlps derived from adeno-associated virus type 2 (pei-aav2-vlps). generally, one of the strategies to integrate sirna into nanoparticles was to coat these particles with positively charged polymers, including pei, poly(β-amino ester), or poly-l-lysine. electrostatic coating could increase the efficiency of systemic sirna delivery due to its protective effects and improved cellular uptake. an insect/baculovirus expression system was used to generate aav2-vlps. pei-aav2-vlps could condense sirna, protect it from enzymatic degradation, transfer it with high efficiency and induce cell death in mcf-7 breast cancer cells, for breast cancer therapy.
201 furthermore, micrornas (mirnas) play an essential role in immunoregulation and may be involved in the pathogenesis of systemic lupus erythematosus (sle). among these sle-related mirnas, mir-146a acts as a significant inhibitor of autoimmunity, myeloproliferation, and cancer. a novel mirna-delivery approach was described via bacteriophage ms2 vlps for evaluation of the therapeutic effects of mir-146a in bxsb lupus-prone mice. treatment with ms2-mir-146a vlp increased the level of mature mir-146a, leading to a significant reduction in the expression of autoantibodies and total igg. furthermore, the levels of inflammatory cytokines, including ifn-α, il-1β and il-6, were decreased in mice. the stimulation of dysregulated mirnas by an ms2 vlp-based delivery system may be considered as a novel therapy. [202] [203] [204] the use of ms2 vlps was reported for selective delivery of nanoparticles, chemotherapeutic drugs, sirna cocktails, and protein toxins to human hcc. 205 in addition, the researchers used jc virus (jcv) vlps as a vector for delivering rnai to silence the il-10 cytokine gene. jcv vlps were non-toxic, and showed therapeutic potential as a gene therapy approach for autoimmune diseases (aid) including sle. 206, 207
drug delivery
a major challenge in pharmacology is to find methods by which drugs (especially anti-cancer drugs) can be delivered specifically to target tissues. a potential strategy would be to package or encapsulate the drug molecules inside a particle which binds to the cancerous tissue. such encapsulation would protect the drug from degradation in blood. for this purpose, it will be necessary to develop particles which can be modified on their outer surface to carry drug molecules into the target cells. novel nanocarriers such as dendrimers, liposomes, polymersomes, micelles, and vlps indicated high potency in improving drug delivery and targeting strategies.
all of these delivery systems make drugs more biocompatible, water-soluble, or colloidal, indicating low toxicity and high uptake in cells. 208 different virus-based materials were studied for drug delivery, such as the ccmv, the cpmv, the red clover necrotic mosaic virus (rcnmv), the ms2 rna-containing bacteriophage, the bacteriophage qβ, the m13 bacteriophage, and the tmv. 208 drug cargo can be loaded through covalent attachment of drugs or their analogs to particular reactive residues on the capsid proteins. 209 several cancer cell targeting ligands were attached to different types of vlps, including small molecules, antibodies, peptides and proteins, as well as dna aptamers. folic acid (fa) was broadly used in drug delivery targeted to cancer cells. uptake of fa into cells is mediated by the folate receptor (fr). 210 recently, lactobionic acid (la) was applied for the specific targeting of a rotavirus capsid vp6 to hepatocytes or hepatoma cells bearing asialoglycoprotein receptors (asgprs). 211 human holo-transferrin (tfn) is essential for iron homeostasis. tfn is specifically recognized by the tfn receptor (tfnr), which is over-expressed on the surface of various tumor cells, and is efficiently taken up by cells through clathrin-mediated endocytosis. 212, 213 tfn has been conjugated to cpmv 214 and bacteriophage qβ. 215 the cellular uptake of the qβ-tfn particles depended on the tfn density, while the internalization was prevented by comparable concentrations of free tfn. antibodies are another group of targeting proteins that could be chemically linked to vlps. for instance, a single-chain (scfv) antibody that recognizes the carcinoembryonic antigen (cea), which is over-expressed in a variety of tumor cells, has been attached to cpmv. 216 an important strategy to improve cellular uptake of therapeutic molecules is the use of cell-penetrating peptides (cpps). 217 the hiv-1 tat peptide is one of the cpps that were extensively used in the delivery of vlps.
218, 219 in general, virus-like particles represent an attractive system for drug delivery in vitro. 220 the efficient delivery of hydrophobic drugs into target cells without the use of organic solvents or chemical linkage to delivery carriers is a critical issue in the biological field. recently, the intracellular delivery of hydrophobic dyes or drugs encapsulated in vlps through cyclodextrins (cds) showed high efficiency. as a model anticancer drug, paclitaxel (ptx)-cd complexes encapsulated inside vlps exhibited a dose-dependent cytotoxic effect with a 20-fold smaller ic50 than that of free ptx dissolved in dmso. 221 cell targeting aims at effective uptake of a therapeutic and/or diagnostic reagent at a specific location such as a tumor. 222 targeting can also be achieved using proteins (mainly antibodies), peptides, nucleic acids (aptamers), small molecules, vitamins and carbohydrates. by attachment of targeting ligands, specificity for cell targeting was obtained through receptor-mediated endocytosis. for instance, bacteriophage ms2 vlps were chemically conjugated to a targeting peptide (sp94) for the selective delivery of nanoparticles, chemotherapeutic drugs, sirna cocktails and protein toxins to human hcc. [223] [224] [225] recently, the chemical conjugation of human epidermal growth factor (egf) to simian virus 40 vlps allowed for cell-selective targeting. 226 simian virus 40 vlps have attracted great attention in gene delivery due to their high stability and low toxicity in blood. 172 in the design of polymeric nanoassemblies, chemical modification is necessary to conjugate the dye or probe for in vitro and in vivo imaging. however, in the case of nanobioassemblies, chemical or genetic modification can be applied for bioconjugation of fluorescent dyes or other probes. another advantage of nanobioassemblies such as vlps for bioimaging is their biological compatibility.
quantum dots (qds) and gfp were used broadly for in vitro and in vivo imaging as alternatives to labelling. for example, fluorescent chimeric vlps of canine parvovirus were expressed in insect cells. 227 to create the fluorescent chimeric vlps of canine parvovirus, gfp was genetically engineered onto the n-terminal region of the viral protein vp2, as a visualization tool to understand mechanisms of viral infections. gfp was also used to design chimeric hiv vlps, allowing the protein to be followed during assembly and transmission using live-cell imaging. 228, 229 advantages of vlps include: (a) no need to propagate pathogenic organisms, (b) repetitive and ordered surface structures, (c) multivalent as well as particulate in nature, (d) safer than other vaccines because of non-infectious and non-replicating properties: the studies showed that there is no risk of disease progression in groups vaccinated with vlp-based vaccines as compared to attenuated viral vaccines, because vlps lack the genomic material needed for the replication and the spread of the viruses, (e) stable in extreme environmental conditions, depending on vlp structure (i.e., enveloped or non-enveloped), and (f) use as a carrier to express foreign antigen. 230 the potential of vlps to target dcs is a main advantage of vlp vaccines, for activating the innate and adaptive immune responses. they have a special benefit over other delivery systems in size, stability, and capacity to transfer biological molecules across cell barriers. particles in the 20-200 nm range can stimulate cd4+ and cd8+ cells and especially generate th1 responses. in addition, despite a limited number of vlp vaccines approved for human use, they represent a promising platform for the development of novel mucosal vaccine strategies. indeed, vlps are sufficiently small, and the composition of their surface chemistry can be designed to minimize hydrophobic and electrostatic adhesive interactions with mucus.
they can also be engineered for recombinant expression of multiple antigenic epitopes and for incorporation of co-stimulatory and immuno-regulatory proteins. however, vlp technology can be limited by difficulties of scale-up and the need for purification from the expression systems. 231 another limitation of chimeric vlp vaccines is the need to determine the compatibility of the inserted peptide with vlp assembly and its immunogenicity. under pressure from the host immune defence, pathogens undergo mutations that can render a vlp vaccine ineffective, so such vaccines will be effective only for highly conserved b or t cell epitopes. 230 the major challenge is to develop novel production platforms that can deliver vlp vaccines while significantly reducing production times and costs. 104 virus-like particles (vlps) have shown high ability for the improvement of vaccines against infectious and non-infectious diseases. several recombinant expression systems were successfully applied for vlp production, with different efficiencies. the use of vlps in vaccine development showed that they are considered safe. in addition, nano-sized vlps can act as an adjuvant as well as an antigen delivery system by increasing antigen uptake by apcs. thus, it is not necessary to use adjuvants along with vlps to stimulate potent immune responses. vlps have shown a natural affinity to target host cells, and this property has been used for cell-targeting applications. given the advantages of vlps, further studies are needed in various aspects, especially the easy and low-cost purification of vlps as well as their application as a delivery system in vivo.
the authors are grateful to elnaz agi and negar zohrei (dept. of hepatitis and aids, pasteur institute of iran) for technical assistance. key: cord-023865-6rafp3x3 authors: surjit, milan; lal, sunil k. title: the nucleocapsid protein of the sars coronavirus: structure, function and therapeutic potential date: 2009-07-22 journal: molecular biology of the sars-coronavirus doi: 10.1007/978-3-642-03683-5_9 sha: doc_id: 23865 cord_uid: 6rafp3x3 as in other coronaviruses, the nucleocapsid protein is one of the core components of the sars coronavirus (cov). it oligomerizes to form a closed capsule, inside which the genomic rna is securely stored thus providing the sars-cov genome with its first line of defense from the harsh conditions of the host environment and aiding in replication and propagation of the virus. in addition to this function, several reports have suggested that the sars-cov nucleocapsid protein modulates various host cellular processes, so as to make the internal milieu of the host more conducive for survival of the virus. this article will analyze and discuss the available literature regarding these different properties of the nucleocapsid protein. towards the end of the article, we will also discuss some recent reports regarding the possible clinically relevant use of the nucleocapsid protein, as a candidate diagnostic tool and vaccine against sars-cov infection. by definition, nucleocapsid is a viral protein coat that surrounds the genome (either dna or rna). nucleocapsid protein is the major constituent of a viral nucleocapsid.
it is capable of associating with itself and with the genome, thus packaging the genome inside a closed cavity. in some viruses, nucleocapsid protein may also be assisted by other viral cofactors to form the capsid. however, in coronaviruses (including sars-cov), the nucleocapsid protein alone is capable of forming the capsid. the primary advantage of the virus for encoding the nucleocapsid protein is that the latter encloses and protects the viral genome from coming into direct contact with the harsh environment in the host. in fact, in some simple viruses like hepatitis e virus and polio virus, the nucleocapsid protein is the only coat that protects the genome from the outside world. however, in complex viruses, like hepatitis b virus and coronaviruses (including sars-cov), the nucleocapsid is covered by an additional coat composed of other viral proteins (spike protein is a major component of this coat). besides this property, nucleocapsid proteins of several viruses have been demonstrated to play multiple regulatory roles during viral pathogenesis. they are equipped with specific structural motifs and/or signature sequences, by which they associate with other viral/ host factors and skew the host cellular machinery in such a manner that it becomes more favorable for the survival of the virus. nucleocapsid protein is also one of the most abundantly expressed viral proteins and it is the major antigen recognized by convalescent antisera. hence, it is tempting to evaluate its potential as a candidate diagnostic tool or vaccine against the virus. therefore, understanding the properties of the nucleocapsid protein is of utmost importance to any virologist in order to understand the biology of the virus and develop effective tools to control the infection. since the identification and isolation of sars-cov in 2003, several laboratories around the world have focussed their research on characterization of various properties of the nucleocapsid protein. 
an indirect measure of the curiosity among sars-cov researchers to study the nucleocapsid protein is revealed by the fact that in pubmed the number of sars-cov research publications focussed on nucleocapsid protein is second only to those on spike protein. evidence accumulated from these articles has helped us gain substantial understanding of the properties of this protein. in this article, we will provide a comprehensive description of all the different properties of the nucleocapsid protein, as established by independent workers from several laboratories. we will conclude this article with the discussion of some of the remaining challenges in this field that need to be addressed in the future. the nucleocapsid (n) protein is encoded by the ninth orf of sars-cov. the same orf also codes for another unique accessory protein called orf9b, though in a different reading frame, whose function is yet to be defined. the n-protein is a 46-kda protein composed of 422 amino acids (rota et al. 2003). its n-terminal region consists mostly of positively charged amino acids, which are responsible for rna binding. a lysine-rich region is present between amino acids 373 and 390 at the c-terminus, which is predicted to be the nuclear localization signal. besides these, an sr-rich motif is present in the middle region encompassing amino acids 177-207. biophysical studies done by chang et al. (2006) have suggested that this protein is composed of two independent structural domains and a linker region. the first domain is present at the n-terminus, inside the putative rna binding domain, and the second domain consists of the c-terminal region that is capable of self-association. between these two structural domains, there lies a highly disordered region, which serves as a linker. this region has been reported to interact with the membrane (m) protein and human cellular hnrnpa1 protein (fang et al. 2006; luo et al. 2005).
besides, this region is also predicted to be a hot spot for phosphorylation. hence, in summary, the n-protein can be classified into three distinct regions (fig. 9.1), which may serve completely different functions during different stages of the viral life-cycle. a similar mode of organization has been reported for other coronavirus nucleocapsid proteins. in-vitro thermodynamic studies done by luo et al. (2004b) using purified recombinant n-protein have shown it to be stable between ph 7 and 10, with maximum conformational stability near ph 9. further, it was observed to undergo irreversible thermal-induced denaturation. it starts to unfold at 35 °c and is completely denatured at 55 °c. however, denaturation of the n-protein induced by chemicals such as urea or guanidinium chloride is a reversible process. as in other coronavirus n-proteins, sars-cov n-protein has been predicted and later experimentally proven to undergo various posttranslational modifications such as acetylation, phosphorylation, and sumoylation. acetylation is the first modification of the n-protein to be experimentally proven. by mass spectrometric analysis of convalescent sera from several sars patients, it has been shown that the n-terminal methionine of n is removed and all other methionines are oxidized and the resulting n-terminal serine is acetylated. however, the functional relevance of this modification, if any, remains to be elucidated (krokhin et al. 2003). another unique modification of the n-protein is its ability to become sumoylated. studies done by li et al. (2005a) have clearly established that heterologously expressed n in mammalian cells is sumoylated. using a site-directed mutagenesis approach, the sumoylation motif has been mapped to the 62nd lysine residue, which is present in a putative sumo-modification domain (gk62ee).
their data further suggest that sumoylation may play a key role in modulating homo-oligomerization, nucleolar translocation and the cell-cycle deregulatory property of the n-protein. further experimental support regarding sumoylation of n-protein came from another independent study carried out by fan et al. (2006), wherein they demonstrated an association between the n-protein and hubc9, which is a ubiquitin-conjugating enzyme of the sumoylation system. they have also mapped the interaction domain to the sr-rich motif, which is in agreement with the earlier report. however, they failed to detect the involvement of the gkee motif in mediating this interaction (fan et al. 2006). initially, the sars-cov n-protein was predicted to be heavily phosphorylated. later on, from results obtained in our laboratory as well as by other researchers, it became clear that the n-protein is a substrate of multiple cellular kinases. the first experimental evidence for the phosphorylation status of the n-protein came from the study done by zakhartchouk et al. (2005) in which, using [32p]orthophosphate labelling, they were able to observe phosphorylation of adenovirus-vector-expressed n-protein in 293t cells. further studies done in our laboratory clearly confirmed this observation. the majority of the n-protein was found to be phosphorylated at its serine residues (the involvement of threonine and tyrosine residues could not be detected, although such phosphorylation may occur in vivo). in addition, using a variety of biochemical assays, it was proved that, at least in vitro, the n-protein could become phosphorylated by mitogen-activated protein kinase (map kinase), cyclin-dependent kinase (cdk), glycogen synthase kinase 3 (gsk3), and casein kinase 2 (ck2). also, these data provided a preliminary indication of phosphorylation-dependent nucleo-cytoplasmic shuttling of the n-protein (surjit et al. 2005). a recent report published by wu et al.
(2008) has further confirmed that n-protein is a substrate of gsk3 enzyme, both in vitro and in vivo. using a variety of biochemical and genetic assays, it was clearly demonstrated that the serine 177 residue of n-protein was phosphorylated by gsk3. an antibody specific to the phosphorylated serine 177 residue of the n-protein could efficiently detect the phospho n-protein both in vitro and in sars-cov infected cells. interestingly, biochemically mediated inhibition of gsk3 activity in sars-cov infected cells also leads to around 80% reduction in viral titer and subsequent induction of a virus-induced cytopathic effect. the authors proposed that gsk3 may be a major regulator of sars-cov replication, possibly by virtue of its ability to phosphorylate the n-protein. however, phosphorylation of other viral and/or host proteins by gsk3 may also be a determinant of the observed cytopathic effect. in contrast to the n-protein of many other coronaviruses, the sars-cov n-protein is predominantly distributed in the cytoplasm, when expressed heterologously or in infected cells (surjit et al. 2005; you et al. 2005; rowland et al. 2005). in infected cells, a few cells exhibited nucleolar localization (you et al. 2005). as reported by you et al. (2005), the n-protein contains pat4, pat7 and bipartite-type nuclear localization signals. it has also been predicted to possess a potential crm-1-dependent nuclear export signal. however, no clear experimental evidence could be obtained regarding the involvement of these signature sequences in regulating the localization of the n-protein. interestingly, studies done in our laboratory revealed that the majority of n-protein localized to the nucleus in serum-starved cells. this phenomenon could be reproducibly observed in both biochemical fractionation and immunofluorescence studies.
in addition, treatment of cells with specific inhibitors of different cellular kinases such as ck2 inhibitor and cdk inhibitor resulted in retention of a fraction of the n-protein in the nucleus, whereas gsk3 and mapk inhibitors had very little effect. further, n-protein was found to be efficiently phosphorylated by the cyclin-cdk complex, which is known to be active only in the nucleus. the n-protein was also found to associate with 14-3-3 protein in a phospho-specific manner and inhibition of the 14-3-3θ protein level by sirna resulted in nuclear accumulation of the n-protein. although these experiments are too preliminary to provide a conclusive answer regarding the intracellular localization of n-protein, they do provide substantial clues regarding the physical presence of the n-protein in the nucleus, under certain circumstances, which may be a very dynamic phenomenon. another study done by timani et al. (2005) using different deletion mutants of the n-protein fused to egfp showed that the n-terminal region of n-protein, which contains nls 1 (aa 38-44), localizes to the nucleus, whereas the c-terminal region containing both nls 2 (aa 257-265) and nls 3 (aa 369-390) localizes to the cytoplasm and nucleolus. using a combination of different deletion mutants, they concluded that the n-protein may act as a shuttle protein between the cytoplasm, nucleus and nucleolus. taken together, all these results further suggest that the n-protein per se has the physical ability to localize to the nucleus. whether this localization is regulated through phosphorylation-mediated activation of a potential nls or piggy-backing by association with another cellular nuclear protein or through any other mechanism remains to be established. being the capsid protein, the primary function of the n-protein is to package the genomic rna in a protective covering.
in order to achieve this structure, the n-protein must be equipped with two characteristic properties: (1) the ability to recognize the genomic rna and associate with it, and (2) the ability to self-associate into an oligomer to form the capsid. the n-protein of sars-cov has been experimentally proven to possess these properties in vitro, as discussed below.

9.6.1 recognition and binding with the genomic rna

the first experimental evidence regarding the rna binding property of the n-protein came from the work of huang et al. (2004), in which, by nmr studies, they proved the ability of the n-terminal domain to associate with several viral 3′ untranslated rna sequences. additionally, chen et al. (2007) reported the presence of another rna binding domain at the c-terminal region (residues 248-365) of the n-protein, whose interaction with rna was proposed to be stronger than that at the n terminus. based on structural analysis of the rna-protein interaction, they have further suggested that the genomic rna is packaged in a helical manner by the n-protein. in another report published by luo et al. (2006), the rna binding motif of the n-protein was mapped to amino acid residues 363-382. in summary, the rna binding ability of the n-protein was attributed to its two distinct structural domains: the n-terminal domain (residues 45-181) and the c-terminal dimerization domain (residues 248-365). these two domains are spatially separated by long stretches of disordered region. a recent study done by chang et al. (2008) has demonstrated rna binding ability of these disordered regions. they have proposed that different rna binding domains of the n-protein may cooperate to enhance the overall rna binding efficiency of the n-protein and may also serve as interaction hubs for the association of n-protein with other viral and/or host nucleic acid and/or proteins.
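the residue ranges quoted above for the n-protein can be collected into a small lookup table, which makes it easier to see where the reported regions overlap. the following is only an illustrative sketch: the names `Region`, `REGIONS`, `overlap` and `regions_at` are hypothetical helpers introduced here, and the coordinates are simply those cited in this chapter (422 residues in total), not an authoritative annotation.

```python
from typing import List, NamedTuple, Optional, Tuple


class Region(NamedTuple):
    """A named stretch of the n-protein (1-based, inclusive coordinates)."""
    name: str
    start: int
    end: int


# Total length of the n-protein as stated in the text (rota et al. 2003).
N_PROTEIN_LENGTH = 422

# Residue ranges exactly as quoted in the chapter; labels are illustrative.
REGIONS: List[Region] = [
    Region("n-terminal rna binding domain", 45, 181),
    Region("sr-rich motif", 177, 207),
    Region("c-terminal dimerization domain", 248, 365),
    Region("rna binding motif (luo et al. 2006)", 363, 382),
    Region("lysine-rich putative nls", 373, 390),
]


def overlap(a: Region, b: Region) -> Optional[Tuple[int, int]]:
    """Return the shared residue range of two regions, or None if disjoint."""
    lo, hi = max(a.start, b.start), min(a.end, b.end)
    return (lo, hi) if lo <= hi else None


def regions_at(position: int) -> List[str]:
    """List the names of all tabulated regions that contain a residue."""
    if not 1 <= position <= N_PROTEIN_LENGTH:
        raise ValueError(f"residue {position} is outside 1-{N_PROTEIN_LENGTH}")
    return [r.name for r in REGIONS if r.start <= position <= r.end]


if __name__ == "__main__":
    # Report every pair of tabulated regions that shares residues.
    for i, a in enumerate(REGIONS):
        for b in REGIONS[i + 1:]:
            shared = overlap(a, b)
            if shared is not None:
                print(f"{a.name} / {b.name}: residues {shared[0]}-{shared[1]}")
```

such a table only restates the published coordinates; it says nothing about whether overlapping regions are functionally coupled, which remains an experimental question.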
perhaps the most convincing proof to date regarding the ability of the n-protein to package the genomic rna came from the work of hsieh et al. (2005). they have established a system to produce sars-cov vlps by cotransfection of spike, membrane, envelope and nucleocapsid cdnas into vero e6 cells. while testing the packaging of an rna bearing gfp fused to the sars-cov packaging signal into this particle, they observed that the presence of the n-protein is an absolute requirement. however, the n-protein was not essential for the assembly of the empty particle per se. further, by performing a filter binding assay using recombinant n-protein, they were able to identify two independent rna binding domains in the n-protein; one at the n terminus (aa 1-235) and the other at the c terminus (aa 236-384). these results are in agreement with previous findings and further suggest that these two regions may be functional in vivo. future experiments using a model infection system will be needed to confirm these observations. one of the most crucial properties required by the n-protein for genome encapsidation is its ability to self-associate. therefore, many laboratories have focused on characterizing this phenomenon, with an eye on developing possible interference strategies that may help in limiting virus propagation. initial studies done in our laboratory using a yeast two-hybrid assay revealed that n-protein is able to self-associate through its c-terminal 209 amino acid residues (surjit et al. 2004a). a parallel study done by he et al. (2004) using the mammalian two-hybrid system and sucrose gradient fractionation also proved the ability of the n-protein to self-associate to form an oligomer. they further mapped the interaction region to amino acid residues 184-196, encompassing the sr-rich motif. however, there were some discrepancies regarding the interaction domain mapped in these two studies.
later on, extensive biophysical and biochemical analysis done by chen's laboratory and jiang's laboratory (luo et al. 2006, 2005) has enriched our understanding of the oligomerization process of the n-protein. in summary, the sr-rich motif does possess binding affinity, but this is specific for the central region (aa 211-290) of another molecule of n-protein, instead of the sr-rich motif itself. the c-terminal region (aa 283-422) possesses binding affinity for itself and can associate into a dimer, trimer, tetramer or hexamer, in a concentration-dependent manner. the essential sequence for oligomerization of the n-protein was identified to be residues 343-402. interestingly, this region also encompasses the rna binding motif of the n-protein, which prompts us to speculate that there might be mutual interplay between rna binding and oligomerization activities of the n-protein. further, the oligomerization was observed to be independent of electrostatic interactions and addition of single strand dna to the reaction mixture containing tetramers of the n-protein promoted oligomerization. thus, it has been proposed that once the tetramer is formed by protein-protein interaction between nucleocapsid molecules, binding with genomic rna prompts further assembly of the complete nucleocapsid structure. besides being the capsid protein of the virus, the n-protein of many coronaviruses is known to double up as a regulatory protein. the n-protein of the sars-cov too has been shown to modulate the host cellular machinery in vitro, thereby indicating its possible regulatory role during the viral life-cycle. some of the major cellular processes perturbed by heterologous expression of the n-protein are discussed below. three different groups have reported the ability of the n-protein to interfere with the host cell cycle in vitro. work done by li et al. (2005a, 2005b) proved that mutation of the sumoylation motif in the n-protein leads to cell cycle arrest.
work done in our laboratory has shown the inhibition of s phase progression in cells expressing the n-protein (surjit et al. 2006). further, s-phase specific gene products like cyclin e and cdk2 were found to be downregulated in sars-cov infected cell lysate, which suggested that the observed phenomenon may be relevant in vivo. in an attempt to further characterize the mechanism of cell cycle blockage induced by the n-protein, several biochemical and mutational analyses were carried out. results thus obtained demonstrated that the n-protein directly inhibits the activity of the cyclin-cdk complex, resulting in hypophosphorylation of retinoblastoma protein with a concomitant downregulation of e2f1-mediated transactivation. analysis of rxl and cdk phosphorylation mutant n-protein identified the mechanisms of inhibition of cdk4 and cdk2 activity to be different. whereas the n-protein could directly bind to cyclin d and inhibit the activity of the cdk4-cyclin d complex, inhibition of cdk2 activity appeared to be achieved in two different ways: indirectly by downregulation of protein levels of cdk2, cyclin e, and cyclin a, and by direct binding of n-protein to the cdk2-cyclin complex. a third piece of evidence supporting the ability of n-protein to deregulate the host cell cycle came from the work of zhou et al. (2008). they observed slower transition from s to g2/m phase and slower growth rate in n-protein-expressing 293t cells. they also observed a similar phenomenon in human peripheral blood lymphocytes and k562 cells infected with a retrovirus expressing sars-cov n-protein. while searching for interaction partners for the c terminus of n-protein (aa 251-422) by following a yeast two-hybrid library screening approach, zhou et al. (2008) discovered human elongation factor 1 alpha (ef1a) as a candidate partner. the specificity of the interaction was confirmed by various in-vitro and in-vivo assays. further, expression of n-protein induced aggregation of ef1a.
it is known that the majority of cellular ef1a is bound to f-actin and promotes f-actin bundling, which is a key event during cytokinesis (kurasawa et al. 1996; yang et al. 1990). hence, the authors tested whether n-protein-induced aggregation of ef1a affected f-actin bundling and cytokinesis. as expected, they observed significantly fewer f-actin bundles in n-protein-expressing cells. in fact, a similar f-actin distribution pattern was also observed by surjit et al. (2004b) in cos-1 cells. further, the authors observed multinucleated cells among n-protein-expressing cells at a later time point (72 h post-transfection), indicating inhibition of cytokinesis in those cells. specificity of the above data has been confirmed by the use of different deletion mutants of the n-protein, in which only the c-terminal domain of the n-protein (responsible for binding with ef1a) was able to reproduce the above results. thus, it has been suggested that ef1a binding by the n-protein leads to its aggregation, resulting in inhibition of f-actin bundling and subsequent blocking of cytokinesis. ef1a is known to play a key role during the peptide elongation stage of translation. therefore, it is an attractive target for pathogen proteins, which can manipulate its activity in order to skew the host translation machinery. for example, hiv-type 1 gag polyprotein has been shown to interact with ef1a and impair translation in vitro (cimarelli and luban 1999). since zhou et al. (2008) observed an interaction between ef1a and sars-cov n-protein, they further tested whether it interfered with the host translation machinery. indeed, presence of the n-protein inhibited total cellular translation, both in vitro and in vivo, in a dose-dependent manner. moreover, exogenous addition of excess ef1a could reverse the n-protein-induced translation inhibition, thus suggesting that n-protein exerts its effect by interfering with ef1a function.
however, it remains to be confirmed whether a similar effect is recapitulated in vivo. production of interferon (ifn) is one of the primary host defense mechanisms. however, sars-cov infection does not result in ifn production. nevertheless, pretreatment of cells with ifn blocks sars-cov infection (spiegel et al. 2005; zheng et al. 2004). based on this observation, palese's laboratory has studied the ifn inhibitory property of different sars-cov proteins, which revealed that orf3, orf6 as well as the n-protein have the ability to independently inhibit ifn production through different mechanisms. the n-protein was found to inhibit the activity of irf3 and nfkb in host cells, resulting in inhibition of ifn synthesis. irf3 activity was also blocked by the orf3 and orf6 proteins, but inhibition of nfkb activity was a property unique to the n-protein. in addition, the orf3 and orf6 proteins were able to block stat1 activity through different mechanisms (kopecky-bromberg et al. 2007). all these data suggest that sars-cov may employ multiple factors to check the activity of the host immune system and n-protein may be one of the major partners in this process. it may be possible that these different factors act independently during different stages of the viral life cycle. in that case, regulatory activity of the n-protein will be as indispensable as its structural activity. during the sars outbreak, a large number of patients developed severe inflammation of the lungs, which subsequently led to acute respiratory distress syndrome (ding et al. 2003; nicholls et al. 2003). acute respiratory distress syndrome is characterized by pulmonary fibrosis, which results in lung failure and subsequent death of the patient. the tgfb signaling pathway plays a critical role in pulmonary fibrosis (roberts et al. 2006; border and noble 1994).
it enhances the expression of extracellular matrix (ecm) proteins, accelerates the secretion of protease inhibitors and reduces the secretion of proteases, thereby leading to deposition of ecm proteins. tgfb may also induce pulmonary fibrosis directly by stimulating chemotactic migration and proliferation of fibroblasts as well as by fibroblast-myofibroblast transition. hence, it is worth speculating that some of the sars-cov encoded factors may be modulating the tgfb signaling pathway. in fact, proteins of several other viruses, such as hepatitis c virus core, ns3 and ns5 protein, adenovirus e1a, human papilloma virus e7, human t-lymphotropic virus tax and epstein-barr virus lmp1, have been reported to modulate the tgfb pathway. in general, these proteins directly bind with smad proteins and alter the innate signaling pathway. interestingly, a recent report published by zhao et al. (2008) revealed that n-protein of sars-cov also interacts with smad3 and modulates the activity of the tgfb pathway. by performing a smad binding element (sbe)-driven reporter assay, rt-pcr and immunohistological analysis of tgfb target genes such as pai-1 (plasminogen activator inhibitor 1) and collagen in a variety of cell lines and sars patients, the authors have clearly proved that n-protein indeed enhanced the activity of the tgfb signaling pathway. further, they observed that the effect of n-protein on tgfb signaling was mediated through smad3 only (independent of the involvement of smad4). while trying to unravel the mechanism behind this phenomenon, they observed that n-protein specifically associated with the mh2 domain of smad3 (stronger binding affinity for phospho smad3) interrupted the interaction between smad3 and smad4, and enhanced the interaction between smad3 and transcriptional coactivator p300 in a dose-dependent manner. 
to further confirm the above data, they performed a chromatin immunoprecipitation assay at the sbe region of pai-1 promoter in hpl1 cells and detected the presence of n-protein in the complex of smad3 and p300. interestingly, however, n-protein inhibited tgfb-induced apoptosis of hpl1 cells (it is a well established fact that smad3 activation induces apoptosis of hpl1 cells). thus, n-protein appears to employ a clever mechanism whereby, on the one hand, it enhances the activity of the tgfb signaling pathway, thus leading to enhanced expression of a subset of genes (such as ecm protein coding genes), and on the other hand, it blocks the programmed cell death of the host cell. it would be interesting to unravel the mechanism behind this unique property of the n-protein. another major proinflammatory factor induced during viral infection is the cyclooxygenase-2 (cox2) protein. using 293t cells expressing the n-protein, yan et al. (2006) have shown that expression of the n-protein leads to upregulation of cox2 protein production in a transcriptional manner. they have further demonstrated that the n-protein directly binds to the nfkb response element present in the cox2 promoter through a 68 aa residue binding domain (aa 136-204) and activates its transcription. although the n-protein is known to associate with stretches of nucleic acids, to date there is no other documentation or prediction of its sequence-specific dna binding activity (as a transcription factor). in such a scenario, the above observation, if reproducible in vivo, may really be a unique property of the n-protein and may further add to the established regulatory functions of the n-protein. exogenously expressed n-protein has been reported to enhance the dna binding activity of c-fos, atf-2, creb-1, and fos b in an elisa-based assay, thus suggesting an increase in ap1 activity in these cells (he et al. 2003) . the mechanistic details and functional significance of this phenomenon remain to be elucidated. 
earlier work done in our laboratory has shown that n-protein, when expressed in cos-1 monkey kidney cells, induces apoptosis in the absence of growth factors. attempts to understand the mechanism of programmed cell death revealed that the n-protein downmodulated the activity of prosurvival factors such as extracellular regulated kinase, akt and bcl 2, and upregulated the activity of proapoptotic factors like caspase-3 and caspase-7 (surjit et al. 2004b). however, this phenomenon was not observed in another cell line of epithelial lineage (huh7). the above observation was further confirmed by zhang et al. (2007). they reported that serum starvation-induced apoptosis of n-protein-expressing cos-1 cells involved activation of the mitochondrial pathway. another elegant study done by diemer et al. (2008) has further extended our understanding regarding the apoptotic property of the n-protein. through a series of experiments involving both a model infection system of sars-cov and transient transfection of n-protein, the authors have confirmed that n-protein induces an intrinsic apoptotic pathway resulting in activation of caspase-9, which further leads to activation of caspase-3 and -6. their data further revealed that these activated caspases cleave the n-protein at residues 400 and 403 and that nuclear localization of n-protein is an absolute requirement for cleavage. in addition, the authors have reported that the apoptosis-inducing ability of the n-protein is highly cell type specific. only in cells where n-protein localizes to both nucleus and cytoplasm (vero e6 and a549 cells) is it able to activate caspases and become cleaved; however, in cell lines where it localizes to the cytoplasm only (caco2 and n-2a cells), no activation of caspases is observed. it remains to be studied whether this phenomenon is actually recapitulated in vivo. a recent report by han et al.
(2008) revealed that, of all the sars-cov structural proteins, only n-protein specifically induced the transcription of the prothrombinase gene in thp-1 and vero cells. by performing a luciferase reporter assay of the hfgl2 promoter in n-protein-expressing cells and an electrophoretic mobility shift assay using n-protein-transfected cell lysate, they demonstrated that n-protein expression induced the binding of transcription factor c/ebpa to its cognate response element present in the hfgl2 promoter, leading to enhanced transcription of the hfgl2 gene. since lungs of sars patients have been shown to contain a high amount of fibrin, the authors proposed that n-protein-mediated enhanced production of prothrombinase may contribute to the development of thrombosis in sars patients. luo et al. (2005) have reported the interaction between hnrnpa1 and n-protein by using a variety of biochemical and genetic assays. the interaction was found to be mediated through the middle region (aa 161-210) of n-protein. if relevant in vivo, this interaction may play a significant role in regulation of viral rna synthesis. another interesting study done by luo et al. (2004a) has reported association between the n-protein and human cyclophilin a. by spr (surface plasmon resonance) analysis they have shown it to be a high affinity interaction. although the significance of this interaction is not known in vivo, they have proposed that this interaction might be crucial for viral infection. notable is the fact that hiv-1 gag also binds human cyclophilin a and this interaction is crucial for hiv infection (gamble et al. 1996). recently, zeng et al. (2008) have reported that n-protein associates with b23, a phosphoprotein in the nucleus. by performing in vivo coimmunoprecipitation in hela cells and a gst pull-down assay using purified recombinant n-protein, the authors have demonstrated direct interaction between b23 and n-protein.
the interaction domain has been mapped to amino acid residues 175-210 of n-protein, which include the sr-rich motif. b23 plays a key role in centrosome duplication during cell division. phosphorylation of b23 at the threonine-199 residue is known to regulate its function (okuda et al. 2000; tokuyama et al. 2001). in order to demonstrate the functional significance of n-protein interaction with b23 protein, the authors tested the phosphorylation status of the threonine-199 residue of b23 in the presence of n-protein. interestingly, n-protein was able to block threonine-199 phosphorylation. based on this observation, the authors have proposed that n-protein exerts its effect on cell cycle deregulation by modulating the activity of b23 protein. in summary, although several regulatory roles have been proposed for the sars-cov n-protein using a variety of in-vitro experimental systems, no clear evidence exists for their occurrence in vivo. in the absence of a suitable in-vivo experimental system, all these functions remain speculative. one of the most essential steps to limit the outbreak of any infectious disease is the ability to diagnose the causative agent at the earliest possible time, which can be achieved by detecting some of the markers that are specifically expressed by the pathogen or by identifying some of the host factors that are specifically produced during infection. the n-protein is one of the predominantly expressed proteins at the early stage of sars-cov infection and elicits a strong antibody response in the host; it has therefore been proposed as an attractive diagnostic tool. in serum of sars-cov patients, the n-protein has been detected as early as day one of infection by elisa using monoclonal antibodies against it (che et al. 2004).
further, a comparative study to detect sars-cov-specific igg, sars-cov rna, and the n-protein during early stages of infection has demonstrated that the detection efficiency of the n-protein is significantly higher than the other two markers (li et al. 2005b) . researchers have been mainly focussing on two different strategies by which nucleocapsid can be used as a diagnostic tool: (1) development of efficient monoclonal antibodies against the n-protein, and (2) production of recombinantly expressed, highly purified n-protein for detection of n-protein-specific antibody in the host. using a phage display approach, flego et al. (2005) have identified human antibody fragments that recognize distinct epitopes of the n-protein. these may help develop efficient reagents to detect n-protein in the infected host. further, several laboratories have been trying to develop efficient monoclonal antibodies against the major immunodominant epitopes of the n-protein, that can be used in elisa to detect sars-cov at an early stage of infection (shang et al. 2005; liu et al. 2003; he et al. 2005; woo et al. 2005) . in another interesting study, liu et al. (2005) have developed an immunofluorescence assay using antirabbit n-protein antibody that can specifically detect n-protein from throat wash samples of sars-cov patients at day two of illness. several other workers have focussed on economical production of highly purified recombinant n-protein using a variety of heterologous expression systems that can be used in elisa to detect n-protein-specific antibody in the patient sample. n-protein has been produced in abundant quantity using a codon-optimized gene in e. coli (das and suresh 2006) . saijo and coworkers have successfully expressed recombinant n-protein using a baculovirus expression system, which was found to be 92% efficient in neutralizing antibody assay (saijo et al. 2005) . in another study, liu et al. 
have expressed full-length n-protein using a yeast expression system. however, the diagnostic use of recombinant n-protein has been problematic for several reasons, as discussed below. bacterially expressed n-protein has been reported to produce false seropositivity owing to interference of bacterially derived antigens (leung et al. 2006; yip et al. 2007). in addition, several studies have shown cross-reactivity between full-length n-protein of sars and polyclonal antisera of group 1 animal coronaviruses, which may lead to faulty detection (sun and meng 2004). another study done by woo et al. also reported cross-reactivity of full-length recombinant n-protein with antisera of hcov-oc43 and hcov-229e infected patients, thus giving false positive results. they were able to minimize this false positivity by further verifying the elisa results with a western blot assay using recombinant n and spike proteins of sars-cov (woo et al. 2004). later, studies done by qiu et al. and bussmann et al. showed that the recombinantly expressed c-terminal region of the n-protein acts more specifically in detecting sars-cov-specific antisera in comparison to full-length n-protein (qiu et al. 2005; bussmann et al. 2006). it is noteworthy that this region is predicted to encompass major antigenic sites of the n-protein. in a recent report, shin et al. (2007) demonstrated significantly higher efficacy of phosphorylated n-protein as a diagnostic antigen. they expressed the n-protein in insect cells, where it was phosphorylated by posttranslational modification. when the antigenicity of this protein was compared to that of a bacterially expressed n-protein (unphosphorylated) or to that of a dephosphorylated n-protein (by treatment with protein phosphatase 1) using sars-positive or -negative patient serum, phosphorylated n-protein did not show any cross-reactivity with sars-negative serum, thereby reducing the number of false positives.
also, the phosphorylated protein showed considerably stronger cross-reactivity with an n-protein-specific monoclonal antibody. based on these observations, the authors have proposed the use of a phosphorylated n-protein as a better diagnostic agent. also, several reports have been published dealing with the detection of n-protein-specific igm by elisa or indirect immunofluorescent assay (chang et al. 2004; hsueh et al. 2004; woo et al. 2004 ). however, in these studies, igm antibodies became detectable later than igg antibodies, which is in contrast to the phenomena observed in most other pathogens. a recent report published by yu et al. (2007) attempted to solve this problem by using a truncated n-protein (aa 122-422) as an antigen in igm elisa. they found the igm response appeared three days before detection of the igg response, which is in agreement with the results obtained from other known pathogens. further, their results showed 100% specificity and sensitivity of the truncated protein in detecting n-protein-specific igm from patients with laboratory confirmed sars cases in comparison to healthy volunteers. the authors have suggested that the igm capture elisa using this truncated n-protein may be more effective in serodiagnosis of sars-cov at an earlier time. in another interesting report, woo et al. (2005) carried out comparative studies to evaluate the relative diagnostic efficacy of recombinantly expressed n and spike proteins. they observed sensitivity of recombinant n-igg elisa to be significantly higher than that of recombinant s-igg elisa. the reverse was true in the case of igm elisa using recombinant n and s proteins. based on this data, they have suggested the practise of elisa for detection of igm against both s and n proteins instead of n alone (woo et al. 2005) . taken together, all this data does support the notion that the n-protein may be used as an efficient diagnostic tool for detection of sars-cov infection. 
nevertheless, production scale-up and further validation of specificity using patient samples will determine the possible clinical use of these reagents. one of the most clinically relevant uses of the n-protein could be as a protective vaccine against sars-cov infection. n-protein is one of the major antigens of sars-cov. also, n-protein analyzed from different patient samples shows the least variation in the gene sequence (tong et al. 2004), indicating that it is a stable protein, which is a primary requirement for an efficient vaccine candidate. earlier studies carried out in collins', rao's, and li's laboratories have clearly shown that antiserum to the n-protein does not contain neutralizing antibodies against sars-cov (buchholz et al. 2004; pang et al. 2004; liang et al. 2005). this may be attributed to the localization of n-protein inside the viral envelope, where it is not accessible to antibody during infection. it is noteworthy that the most effective sars-cov structural protein that can induce neutralizing antibody production is the s-protein (buchholz et al. 2004). the s-protein antibody could block viral infection with 100% efficiency. on the other hand, although unable to induce humoral immunity, expression of n-protein induced a significant cytotoxic t-lymphocyte (ctl) response (buchholz et al. 2004; gao et al. 2003; zhu et al. 2004). induction of n-protein-specific ctls will help limit the infection by lysing virus-infected cells. this will also limit the spread of virus. thus, n-protein-based vaccines may further augment the protection efficiency when coadministered with an s-protein-based vaccine. several laboratories have been exploring various strategies to evaluate the potential of n-protein as a vaccine candidate. in an elegant work done by kim et al. (2004), calreticulin-fused n-protein expressing vaccinia virus has been shown to generate potent n-protein-specific humoral and t-cell immune responses in mice.
as reported by the authors, fusion with calreticulin specifically enhanced the efficiency and significantly reduced the titer of the challenging vector (vaccinia virus). the authors have proposed that n-protein may be the logical choice as a target antigen in the event of s-protein antibody-dependent enhancement (ade) of infection. however, the ade phenomenon has not been observed during spike-mediated vaccination (buchholz et al. 2004). another study done by wang et al. (2005) has attempted to use plasmid dna expressing s, m, and n proteins as an efficient vaccine candidate. although they report the production of some b-cell and t-cell responses against n-protein, a strong immune response was obtained for the s and m proteins, thus scaling down the choice of n-protein as a suitable candidate vaccine. a similar plasmid-mediated vaccination approach has also been reported by zhao et al. (2004), in which they immunized mice with a dna construct (pci vector) expressing the n-protein. they too reported the generation of a robust b-cell and t-cell immune response in animals. another group of workers has also reported successful use of the n-protein as a dna vaccine. they immunized mice by intramucosal injection of the n-protein-expressing plasmid vector and were able to obtain specific humoral and t-cell responses (zhu et al. 2004). the n-protein has also been reported to be of potential interest as a peptide-based vaccine. a systematic study done by liu et al. (2006) has revealed the immunodominant epitopes of the n-protein which could efficiently stimulate an immune response. they have also deduced some conserved immunodominant epitopes in mouse, monkey, and humans, which may help in design of the vaccine. a recent report published by gao's laboratory provides further evidence regarding the efficiency of an n-protein-based vaccine (zhao et al. 2007).
by using overlapping synthetic peptides spanning the n-protein, they have identified dominant helper t-cell epitopes in the n-protein of sars-cov. immunization of mice with peptides encompassing these dominant th cell epitopes resulted in strong cellular immunity in vivo. priming with the helper peptides significantly accelerated the immune response induced by the n-protein. further, by fusing with a conserved neutralizing epitope from the spike protein of sars-cov, two of the th cell epitope-bearing peptides assisted in the production of higher titer neutralizing antibodies in vivo, in comparison to the spike epitope alone or its mixture with the th epitope of n. thus, it is practically possible to generate a better immune response by using a fusion of n and s protein epitopes. however, the th epitopes identified in their report are specific to mouse, and will therefore not be useful for humans. nevertheless, their data provide useful information for the design of peptide-based anti-sars-cov vaccines. another interesting study conducted by pei et al. (2005) reports the possible use of the n-protein as a mucosal vaccine candidate. they expressed the n-protein in lactococcus lactis, which is a food-grade bacterium, and challenged the mice either orally or intramucosally. as preliminary evidence, they were able to observe significant n-protein-specific igg in the sera of orally challenged animals. it is a significant achievement for the research community that, within a short span of time, we have been able to obtain a more-or-less clear understanding regarding the structural and functional properties of the n-protein. however, it is worth mentioning that all the studies discussed here were performed with in-vitro experiments, using recombinantly expressed n-protein, in isolation.
so at present, all we can conclude is that the n-protein per se has the physical ability to perform the above described functions, in other words n-protein does bear the necessary signature sequence or motifs or conformation to perform these functions under suitable circumstances. whether a similar event is recapitulated in vivo during viral infection will be dependent on several criteria: (1) the net effect of other viral factors on the activity of n-protein, (2) the net translation and turnover rate of n-protein, (3) a conducive intracellular milieu, and (4) the net modulation of an already skewed cellular pathway by other viral factors. hence, it will be interesting to reevaluate the properties of n-protein in a sars-cov infection model. however, owing to the limited user-friendliness and accessibility of an infection system, we must probably still resort to in vitro systems for further analysis of the characteristics of n-protein. one of the better experimental systems has already been established by chang's laboratory (hsieh et al. 2005) , in which all the structural proteins were coexpressed to form vlp in 293t cells. if this system can be further improved to optimize the rate of synthesis of these different proteins to a level near that in vivo, it will at least enable us to study the net effect of the n-protein with respect to other viral proteins. further establishment of a replicon system may also be helpful. in addition, some of the interesting preliminary observations reported by several laboratories need to be analyzed in detail. to begin with, the reported interaction of the n-protein with the genomic rna packaging signal needs to be further characterized and mapped. since the oligomerization domain and the rna binding regions of the n-protein overlap with each other, the suggested possibility of regulated genome incorporation and capsid assembly should be further characterized with the aid of a replicon system or a particle assembly system. 
in addition, the reported ability of the n-protein to modulate different cellular pathways should be further characterized in the particle assembly system or at least in the presence of other viral accessory proteins. the most unique and significant property of the n-protein revealed by preliminary studies is its ability to act as a sequence-specific dna binding factor. it has been shown to bind the nfkb response element of cox2 promoter and to enhance cox2 gene expression. this activity may be further empowering the n-protein to manipulate the entire gene expression programme of the infected cell. therefore, studies should be initiated to analyze this phenomenon in detail. it seems to deserve so much attention because another study done by palese's laboratory has proved the ability of the n-protein to inhibit nfkb activity, which results in inhibition of ifn synthesis. further, liao et al. (2005) have reported the activation of nfkb by n-protein in vero e6 cells and he et al. (2005) failed to detect any change in nfkb activity in the same cells. therefore it needs to be clarified whether n-protein enhances nfkb activity and, if so, whether upregulation of cox2 transcription by direct dna binding is a property specific to that promoter or whether it is a global phenomenon. in such a scenario, there may be complicated cross-talk between the ability of n-protein to deregulate the expression of cox2 and ifn in infected cells. lastly, the n-protein is known to be the most abundantly expressed protein of the sars-cov. therefore, any information generated from the analysis of this protein, whether in vivo or ex vivo, will definitely help to increase our understanding of the biology of sars-cov and may someday help to design better protective tools against it. 
transforming growth factor beta in tissue fibrosis
contributions of the structural proteins of severe acute respiratory syndrome coronavirus to protective immunity
antigenic and cellular localisation analysis of the severe acute respiratory syndrome coronavirus nucleocapsid protein using monoclonal antibodies
modular organization of sars coronavirus nucleocapsid protein
multiple nucleic acid binding sites and intrinsic disorder of sars coronavirus nucleocapsid protein - implication for ribonucleocapsid protein packaging
nucleocapsid protein as early diagnostic marker for sars
structure of the sars coronavirus nucleocapsid protein rna-binding dimerization domain suggests a mechanism for helical packaging of viral rna
translation elongation factor 1-alpha interacts specifically with the hiv-1 gag polyprotein
copious production of sars-cov nucleocapsid protein employing codon optimized synthetic gene
generation of human antibody fragments recognizing distinct epitopes of the nucleocapsid (n) sars-cov protein using a phage display approach
crystal structure of human cyclophilin a bound to the amino-terminal domain of hiv-1 capsid
effects of a sars-associated coronavirus vaccine in monkeys
the nucleocapsid protein of sars-cov induces transcription of hfgl2 prothrombinase gene dependent on c/ebp alpha
activation of ap-1 signal transduction pathway by sars coronavirus nucleocapsid protein
analysis of multimerization of the sars coronavirus nucleocapsid protein
characterization of monoclonal antibody against sars coronavirus nucleocapsid antigen and development of an antigen capture elisa
assembly of severe acute respiratory syndrome coronavirus rna packaging signal into virus-like particles is nucleocapsid dependent
chronological evolution of igm, iga, igg and neutralisation antibodies after infection with sars-associated coronavirus
structure of the n-terminal rna-binding domain of the sars cov nucleocapsid protein
generation and characterization of dna vaccines targeting the nucleocapsid protein of severe acute respiratory syndrome coronavirus
severe acute respiratory syndrome coronavirus open reading frame (orf) 3b, orf 6, and nucleocapsid proteins function as interferon antagonists
mass spectrometric characterization of proteins from the sars virus: a preliminary report
characterization of f-actin bundling activity of tetrahymena elongation factor 1 alpha investigated with rabbit skeletal muscle actin
extremely low exposure of a community to severe acute respiratory syndrome coronavirus: false seropositivity due to use of bacterially derived antigens
sumoylation of the nucleocapsid protein of severe acute respiratory syndrome coronavirus
detection of the nucleocapsid protein of severe acute respiratory syndrome coronavirus in serum: comparison with results of other viral markers
sars patients-derived human recombinant antibodies to s and m proteins efficiently neutralize sars-coronavirus infectivity
activation of nf-kappab by the full-length nucleocapsid protein of the sars coronavirus
the c-terminal portion of the nucleocapsid protein demonstrates sars-cov antigenicity
high-yield expression of recombinant sars coronavirus nucleocapsid protein in methylotrophic yeast pichia pastoris
immunofluorescence assay for detection of the nucleocapsid antigen of the severe acute respiratory syndrome (sars)-associated coronavirus in cells derived from throat wash samples of patients with sars
immunological characterizations of the nucleocapsid protein based sars vaccine candidates
nucleocapsid protein of sars coronavirus tightly binds to human cyclophilin a
in vitro biochemical and thermodynamic characterization of nucleocapsid protein of sars
the nucleocapsid protein of sars coronavirus has a high binding affinity to the human cellular heterogeneous nuclear ribonucleoprotein a1
carboxyl terminus of severe acute respiratory syndrome coronavirus nucleocapsid protein: self-association analysis and nucleic acid binding characterization
lung pathology of fatal severe acute respiratory syndrome
nucleophosmin/b23 is a target of cdk2/cyclin e in centrosome duplication
protective humoral responses to severe acute respiratory syndrome-associated coronavirus: implications for the design of an effective protein-based vaccine
expression of sars-coronavirus nucleocapsid protein in escherichia coli and lactococcus lactis for serodiagnosis and mucosal vaccination
use of the cooh portion of the nucleocapsid protein in an antigen-capturing enzyme-linked immunosorbent assay for specific and sensitive detection of severe acute respiratory syndrome coronavirus
smad3 is key to tgf-beta-mediated epithelial-to-mesenchymal transition, fibrosis, tumor suppression and metastasis
characterization of a novel coronavirus associated with severe acute respiratory syndrome
intracellular localization of the severe acute respiratory syndrome coronavirus nucleocapsid protein: absence of nucleolar accumulation during infection and after expression as a recombinant protein in vero cells
recombinant nucleocapsid protein-based igg enzyme-linked immunosorbent assay for the serological diagnosis of sars
characterization and application of monoclonal antibodies against n protein of sars-coronavirus
antigenic characterization of severe acute respiratory syndrome-coronavirus nucleocapsid protein expressed in insect cells: the effect of phosphorylation on immunoreactivity and specificity
inhibition of beta interferon induction by severe acute respiratory syndrome coronavirus suggests a two-step model for activation of interferon regulatory factor 3
antigenic cross-reactivity between the nucleocapsid protein of severe acute respiratory syndrome (sars) coronavirus and polyclonal antisera of antigenic group i animal coronaviruses: implication for sars diagnosis
the nucleocapsid protein of the sars coronavirus is capable of self-association through a c-terminal 209 amino acid interaction domain
the sars coronavirus nucleocapsid protein induces actin reorganization and apoptosis in cos-1 cells in the absence of growth factors
the severe acute respiratory syndrome coronavirus nucleocapsid protein is phosphorylated and localizes in the cytoplasm by 14-3-3-mediated translocation
the nucleocapsid protein of severe acute respiratory syndrome-coronavirus inhibits the activity of cyclin-cyclin-dependent kinase complex and blocks s phase progression in mammalian cells
nuclear/nucleolar localization properties of c-terminal nucleocapsid protein of sars coronavirus
specific phosphorylation of nucleophosmin on thr(199) by cyclin-dependent kinase 2-cyclin e and its role in centrosome duplication
direct sequencing of sars-coronavirus s and n genes from clinical specimens shows limited variation
low stability of nucleocapsid protein in sars virus
immune responses with dna vaccines encoded different gene fragments of severe acute respiratory syndrome coronavirus in balb/c mice
longitudinal profile of immunoglobulin g (igg), igm, and iga antibodies against the severe acute respiratory syndrome (sars) coronavirus nucleocapsid protein in patients with pneumonia due to the sars coronavirus
differential sensitivities of severe acute respiratory syndrome (sars) coronavirus spike polypeptide enzyme-linked immunosorbent assay (elisa) and sars coronavirus nucleocapsid protein elisa for serodiagnosis of sars coronavirus pneumonia
glycogen synthase kinase-3 regulates the phosphorylation of sars-coronavirus nucleocapsid protein and viral replication
nucleocapsid protein of sars-cov activates the expression of cyclooxygenase-2 by binding directly to regulatory elements for nuclear factor-kappa b and ccaat/enhancer binding protein
identification of an actin-binding protein from dictyostelium as elongation factor 1a
naturally occurring anti-escherichia coli protein antibodies in the sera of healthy humans cause analytical interference in a recombinant nucleocapsid protein-based enzyme-linked immunosorbent assay for serodiagnosis of severe acute respiratory syndrome
subcellular localization of the severe acute respiratory syndrome coronavirus nucleocapsid protein
recombinant severe acute respiratory syndrome (sars) coronavirus nucleocapsid protein forms a dimer through its c-terminal domain
crystal structure of the severe acute respiratory syndrome (sars) coronavirus nucleocapsid protein dimerization domain reveals evolutionary linkage between corona- and arteriviridae
recombinant truncated nucleocapsid protein as antigen in a novel immunoglobulin m capture enzyme-linked immunosorbent assay for diagnosis of severe acute respiratory syndrome coronavirus infection
severe acute respiratory syndrome coronavirus nucleocapsid protein expressed by an adenovirus vector is phosphorylated and immunogenic in mice
the nucleocapsid protein of sars-associated coronavirus inhibits b23 phosphorylation
sars-cov nucleocapsid protein induced apoptosis of cos-1 mediated by the mitochondrial pathway
immune responses against sars-coronavirus nucleocapsid protein induced by dna vaccine
identification and characterization of dominant helper t-cell epitopes in the nucleocapsid protein of severe acute respiratory syndrome coronavirus
severe acute respiratory syndrome-associated coronavirus nucleocapsid protein interacts with smad3 and modulates transforming growth factor-beta signaling
potent inhibition of sars-associated coronavirus (scov) infection and replication by type i interferons (ifn-alpha/beta) but not by type ii interferon (ifn-gamma)
the nucleocapsid protein of severe acute respiratory syndrome coronavirus inhibits cell cytokinesis and proliferation by interacting with translation elongation factor 1alpha
induction of sars-nucleoprotein-specific immune response by use of dna vaccine

acknowledgments
the authors wish to thank ms. alisha lal for helping out in typing and formatting this review. we apologize to all those colleagues whose work we might have omitted to cite in this article.
key: cord-257465-9yrf7ofy authors: finlay, william j. j.; bloom, laird; grant, joanne; franklin, edward; shúilleabháin, deirdre ní; cunningham, orla title: phage display: a powerful technology for the generation of high-specificity affinity reagents from alternative immune sources date: 2016-05-23 journal: protein chromatography doi: 10.1007/978-1-4939-6412-3_6 sha: doc_id: 257465 cord_uid: 9yrf7ofy antibodies are critical reagents in many fundamental biochemical methods such as affinity chromatography, enzyme-linked immunosorbent assays (elisa), flow cytometry, western blotting, immunoprecipitation, and immunohistochemistry techniques. as our understanding of the proteome becomes more complex, demand is rising for rapidly generated antibodies of higher specificity than ever before. it is therefore surprising that few investigators have moved beyond the classical methods of antibody production in their search for new reagents. despite their long-standing efficacy, recombinant antibody generation technologies such as phage display are still largely the tools of biotechnology companies or research groups with a direct interest in protein engineering. in this chapter, we discuss the inherent limitations of classical polyclonal and monoclonal antibody generation and highlight an attractive alternative: generating high-specificity, high-affinity recombinant antibodies from alternative immune sources such as chickens, via phage display. the rapid expansion of the genomics, proteomics, and biotechnology fields has led to a growing demand for affinity reagents that can specifically recognize proteins, peptides, carbohydrates, and haptens. affinity reagents of high specificity are routinely required for diverse protein drug targets, members of newly discovered biochemical pathways, posttranslationally modified proteins, protein cleavage products, and even small molecules such as drugs of abuse and toxins.
individual biomedical researchers will often need to monitor, quantify, and purify proteins of interest via affinity chromatography, but there may not be any commercially available antibody reagents to allow them to do so [1]. indeed, even in situations where there are commercially available antibodies, these reagents are often expensive, poorly characterized, and/or simply not appropriate for demanding applications. compounding this problem, the technical difficulty of monoclonal antibody generation by the untrained researcher and the high cost (~$15,000) of a commercial monoclonal antibody generation program lead many researchers to the default solution of producing polyclonal hyperimmune sera in hosts such as rabbits. the net result of this is that researchers often settle for reagents that lack the necessary specificity to perform the applications for which they were intended. in this review, we will outline the limitations of classical antibody generation technologies and illustrate an attractive alternative: the use of phage display libraries of recombinant antibodies built on immunoglobulin repertoires from nonmammalian animals. in particular, we will highlight the advantages of libraries derived from the domestic chicken gallus gallus, which offers a relatively inexpensive and technically accessible route to high-quality monoclonal reagents [2]. if, like many people, you have purchased (or paid to generate) a costly and "specific" antibody, but subsequently found that it is actually polyreactive and of dubious quality, phage display from immunized chickens may offer an attractive alternative. hyper-immune sera from rabbits, sheep, or other mammals may be produced in large quantities, but they do not offer the consistency of monoclonal antibodies and need to be regularly replenished and recharacterized.
serum antibodies are also polyclonal and frequently polyspecific, even when purified over an antigen column, rendering them suboptimal for the specific recognition of a single component in a complex matrix. one illuminating study has demonstrated that when used to probe a comprehensive yeast proteome chip, unpurified polyclonal antibody preparations could recognize up to 1770 different proteins, with some monoclonal antibodies and antigen column-purified polyclonal antibodies also recognizing multiple proteins (related and unrelated) [3]. the arrival of monoclonal antibody technology [4] was a major step forward in generating high-specificity reagents, but the reliance on the murine immunoglobulin system frequently leads to a number of practical difficulties: (1) monoclonal antibodies are raised on the basis of an inefficient fusion of splenic b-cells to an immortalized mouse myeloma line, followed by limiting dilution of the cell population. target-specific antibodies are randomly identified, often by a simple direct elisa, where few preconditions can be set to determine which antibodies are identified and one must "take what one can get" during the screening process. (2) it is often desirable to have multiple monoclonal antibodies with specificity for different epitopes on the same target molecule, but the difficulty in sequencing monoclonals does not allow the rapid identification of unique clones early in the screening process. (3) humans and rodents are relatively closely related phylogenetically. many proteins of interest are highly conserved among mammals and this can frequently lead to thymic tolerance, restricting the antibody response after immunization. (4) when an immune response to a human protein is raised in mice, the large regions of sequence similarity between murine and human proteins may lead to a restricted number of immunogenic epitopes.
(5) generating antibodies that cross-react with orthologues from multiple mammalian species is particularly tricky, as the common epitopes among mammals are the very ones that are unlikely to provoke a strong immunoglobulin response in the mouse. (6) tolerance issues can become even harder to circumvent when the protein of interest is from a mouse or rat. creating "knockout" mice, in which the endogenous copy of the gene for the target protein has been disabled, can often break tolerance, but this is a highly laborious and time-consuming process that few laboratories have the resources to undertake. these factors all hinder the generation of high-quality antibody reagents and thereby limit one's experimental options when developing antibodies for purifying or tracking novel proteins. to bypass the limitations in polyclonal and monoclonal antibody generation, several groups have turned to in vitro display technologies such as phage display, ribosome display, or yeast display libraries to generate recombinant antibodies. the more recent technologies of yeast and ribosome display are becoming highly established, but phage display is currently the most robust, well characterized, and reliable of these methods. antibodies derived from these technologies are cloned in microbial hosts and are therefore monoclonal from the start, with their production being easily scaled up. critically, selection and screening efforts can be directed toward specific epitopes or species cross-reactivity and away from polyspecific binding. phage display allows a researcher to do in a single tube what would be unfathomable in traditional hybridoma work: interrogate libraries of millions to tens of billions of antibodies on the basis of their binding specifically to a target of interest. all of this is possible due to the ingenious concept of "genotype-to-phenotype linkage," which is exemplified by phage display.
phage display was originally described as a rapid method for cloning gene fragments that encode a specific protein [5]. by cloning gene fragments into the genome of a filamentous e. coli bacteriophage, smith et al. were able to generate libraries of gene fusions with the key phage coat protein p3. after transformation into e. coli, the viral replication system packaged the genome (and therefore the cloned gene) into a highly stable complex carrying the gene product on the tip of the phage particle, as a fusion protein with p3. by subsequently selecting expressed virions on an immobilized antibody with specificity for a known protein that had been cloned in the phage genome, they were able to show 1000-fold enrichment of the gene product. this set of experiments showed that "genotype-to-phenotype linkage" could be achieved and thereby defined the basis of all display technologies developed since. the subsequent development of a method for the effective cloning of antibody v-gene sequences via pcr allowed the capture of antibody sequences in a recombinant form [6], removing the need to immortalize b-cells via hybridoma fusion as required in traditional monoclonal antibody generation. this discovery was combined with the phage display process to make an efficient method of isolating antibodies and their corresponding gene sequences simultaneously [7]. in the antibody phage display process, libraries of diverse v-gene sequences are cloned into an appropriate expression vector in e. coli, creating an in-frame fusion with the p3 protein or, as favored in most recently described libraries, a truncated form of p3. the phage display of protein libraries was originally performed using "phage" vectors (i.e., built upon the phage genome itself), but due to practical difficulties in handling these libraries, more recent phage methods have mostly used so-called phagemid expression vectors.
in this case, the dna backbone is a stable, small plasmid such as puc, and the p3 gene plus f1 phage packaging origin are the only phage-derived dna sequences [8]. these phagemid libraries are more easily handled and more stable than phage libraries, as the plasmid is incapable of causing phage production by itself. the libraries can therefore be more simply cloned, expanded, and controlled than phage libraries. when a growing culture of e. coli harboring a phagemid library is infected with "helper" phage (based on m13), the phage propagation machinery is provided, but due to a mutation introduced in the origin of replication on the helper phage genome, preferential packaging of the phagemid dna and p3 fusion occurs during phage replication [8]. phage production is thereby induced, genotype-to-phenotype linkage is created, and the expressed phage particles are interrogated for the presence of useful protein sequences via target binding. this selective step may be performed by simply immobilizing the target protein on, for example, a protein-binding plastic surface such as an elisa plate, adding phage and allowing binding to occur via the antibody-p3 fusion proteins. nonbinding phages are removed by washing the immobilized surface, and the remaining bound phages are eluted. the eluted phages are then reinfected back into a fresh culture of e. coli to retrieve the selected gene sequences. this process is not perfect, however, and two to four rounds of selection/re-expression/selection are normally performed iteratively to remove all unwanted clones and enrich the binding population from the background of the library. nonetheless, this process can be spectacularly powerful, massively enriching specific antibodies in a single selection round [9]. furthermore, the use of multiple forms of the target antigen in sequential selection rounds and the inclusion of competitor proteins can drive the selected pool toward a highly specific set of epitopes.
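the arithmetic behind iterative selection can be illustrated with a minimal sketch. it assumes a deliberately simple model that is not taken from the chapter: every round multiplies the ratio of binding to nonbinding clones by a constant enrichment factor. the function name `binder_fraction`, the starting frequency of one binder per million clones, and the 100-fold per-round enrichment are hypothetical values chosen only for illustration; real selections vary widely from round to round.

```python
def binder_fraction(initial_fraction, enrichment, rounds):
    """Return the fraction of target-binding clones after each selection round.

    Assumed toy model (hypothetical, for illustration only): each round
    multiplies the binder:nonbinder ratio by the same `enrichment` factor.
    `initial_fraction` must lie strictly between 0 and 1.
    """
    ratio = initial_fraction / (1.0 - initial_fraction)
    fractions = []
    for _ in range(rounds):
        ratio *= enrichment                     # one round of selection and regrowth
        fractions.append(ratio / (1.0 + ratio)) # convert the ratio back to a fraction
    return fractions


if __name__ == "__main__":
    # hypothetical inputs: 1 binder per million clones, 100-fold enrichment per round
    for round_number, fraction in enumerate(binder_fraction(1e-6, 100.0, 4), start=1):
        print(f"round {round_number}: {fraction:.4%} of clones bind the target")
```

under these assumed numbers the binders remain a small minority after two rounds and dominate the pool only after the fourth, which is consistent with the two to four rounds described above.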
much of the evolution and progression of phage display technology has been driven by the remarkable discovery that this method can be used to mine large libraries of combinatorial human antibody diversity, theoretically removing the need for animals in antibody production [10]. by random recombination of vh and vl sequences from human lymphocyte cdna [10], or by using degenerate oligonucleotides to create diversity of dna sequence in the loops associated with target binding [11], several groups have created very large libraries of antibody gene sequences for phage display. these libraries are analogous to the naïve antibody repertoire in an animal, and selecting from them can result in the identification of antibody fragments that exhibit high specificity and occasionally high affinity for the target protein. several major studies have proven that with appropriate application of this technology, specific antibodies can be raised to proteins, peptides, haptens and even carbohydrates. in the most exemplary studies, antibodies have been raised with equivalent affinities to those associated with a strong humoral immune response [12-15]. additionally, antibodies with suboptimal affinities can provide useful starting molecules for the construction of mutagenized libraries which can be further screened for affinity matured variants [16-18]. in addition to the affinity maturation process, phage display also provides a useful tool for the production of ultra-humanized antibodies from nonhuman sources. it is widely accepted that there are intrinsic difficulties associated with the humanization of nonhuman antibodies using cdr grafting. frequently, such grafted antibodies suffer from immunogenicity associated with high non-germ-line amino acid content, the propensity for v-domain destabilization, and very often downstream expression and formulation issues.
in a recent study, ultra-humanized antibodies from three nonhuman sources were generated via single-step complementarity-determining region (cdr) germ-lining in a process termed "augmented binary substitution" (abs) [19]. this process utilizes phage display for selection from germline-targeting combinatorial libraries, which results in the identification of cdr residues amenable to human germ-lining without compromising on specificity and affinity. a further degree of complexity is incorporated into the library through the random substitution of cdr-h3 residues in tandem with the germ-lining process. this simple single-pass process yields cdr grafts from nonhuman hosts with significantly lower non-germline sequence content, rendering them virtually indistinguishable from fully human antibodies. this process opens up new possibilities for the affinity maturation and humanization of antibodies from alternative sources such as chickens. recombinant antibodies are typically displayed and expressed in "fragment" forms. the simplest and most commonly used fragment is the single-chain fragment variable (scfv), where a flexible peptide sequence links the v-regions of antibodies between the c-terminus of one domain and the n-terminus of the other, thereby combining both v-domains into a single polypeptide [20]. the scfv may be assembled in vl-vh or vh-vl orientations, with vh-vl being the most heavily used format historically. the flexible linker helps to make the scfv simple to express, but must be sufficiently long and flexible to allow effective association of the v-regions to form a functional antigen-combining site. as long as this is true, the classical hydrophobic pairing of the v-regions will stabilize the structure. by far, the most common linkers are based on glycine-serine repeat structures such as ggggs × 3. the second most commonly used recombinant antibody format is the fab (fragment antigen binding) molecule.
this structure is a complete binding "arm" of an antibody and is composed of the full immunoglobulin light chain, expressed in conjunction with the vh-ch1 region from the heavy chain [21]. fabs obligately form predominantly monomeric, monovalent fragments. they are the most "natural" of the recombinant antibody fragments and it has been shown that the presence of the constant regions can often help to stabilize antibody variable regions [22]. the fab format is the less commonly used of the two main recombinant antibody formats, however, as its dual polypeptide structure is generally more difficult to express and display in e. coli than the scfv [22]. a recently engineered scfab platform, however, attempts to address some of the limitations associated with scfab expression [23]. much of the under-use of phage display may be due to experiences with the early libraries derived from naïve or synthetic human antibody diversity, which were donated to academic laboratories that were not specifically invested in antibody engineering. unfortunately, these forays into display technologies have often left investigators somewhat disappointed. many people have accepted the viewpoint of recombinant antibody technology experts that these libraries can yield useful high-affinity antibodies to any form of antigen. in general (mostly for those highly skilled in the field), this is indeed true, but the average antibody generated from these libraries is often of disappointingly low affinity to those who are used to high-sensitivity antibodies from immunized sources. human recombinant antibodies from naïve library sources can require technically challenging in vitro molecular evolution if they are to perform the demanding "real world" functions required of many reagent antibodies [9]. molecular evolution is far from trivial to perform and is usually beyond both the scope and interest level of the average researcher.
for such reasons, it is little wonder that most people either ignore or, at worst, disparage phage display technology itself. nevertheless, phage display can be a relatively simple technology to use and, when employed to harness natural repertoires of antibodies from immunized animals, it can offer a rapid path to highly specific, high-affinity antibodies against problematic antigens. while the most successful naïve antibody libraries contain over 10^10 members and are often the domain of biotechnology companies, typical immune libraries are in the 10^7-10^8 range and are easily assembled by a single investigator [24, 25]. when an immunized rabbit or sheep has raised a significant serum immunoglobulin titer, the common end-point to the experiment is to exsanguinate the animal and harvest the serum. however, harvesting b-cell rich lymphoid tissues from the animal, such as the spleen and bone marrow, allows the isolation of total rna and the subsequent generation of cdna [24]. this is a simple method with which many biomedical researchers are familiar, and commercial kits are available to simplify most steps of the process. the immunoglobulin gene sequences of many animals are now known and the cdna from immune tissues can subsequently be used for the rt-pcr amplification and cloning of the animal's variable region sequence repertoire [24]. these cloned variable region sequences can then be assembled into a display library format such as scfv or chimeric fab (using human ch1 and cκ/λ regions) [26]. these targeted immune libraries thereby offer a potentially huge advantage over monoclonal antibodies, as libraries of >10^8 variants may be built, allowing the effective sampling of a much broader range of antibodies than the hundreds (occasionally thousands) of clones usually examined in a monoclonal antibody screen.
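the sampling advantage of a large library can be made concrete with a short calculation. the sketch below is an illustration under stated assumptions, not an analysis from the chapter: clones are treated as independent random draws from the immune repertoire, and a desired antibody specificity is assumed, hypothetically, to occur once per million b-cell sequences. the function name `prob_clone_sampled` is introduced here for illustration only.

```python
import math


def prob_clone_sampled(frequency, clones_screened):
    """Probability that at least one copy of a given clone type is sampled.

    Implements 1 - (1 - f)^N under the assumption of independent random
    sampling. log1p and expm1 keep the result accurate for very small f.
    """
    return -math.expm1(clones_screened * math.log1p(-frequency))


if __name__ == "__main__":
    rare_frequency = 1e-6  # hypothetical: one desired clone per million sequences
    for label, screened in [("hybridoma screen (1,000 clones)", 1e3),
                            ("immune phage library (10^8 clones)", 1e8)]:
        chance = prob_clone_sampled(rare_frequency, screened)
        print(f"{label}: {chance:.4f}")
```

with these assumed inputs, a screen of a thousand clones has roughly a 0.1% chance of containing the rare specificity, whereas a library of 10^8 clones is almost certain to contain it; whether that clone is then recovered depends on the selection steps.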
the resulting library can be interrogated for specific binding proteins via phage display and the retrieved antibody fragments expressed very simply in bacteria [26]. this process has been used to successfully harness the antibody repertoires of a large number of immune host species, including mice [24], rabbits [25], sheep [27], camelids [28], and sharks [29]. of greater interest to us, however, is to exploit this approach to harvest the novel immunoglobulin repertoires of the domestic chicken (gallus gallus), which is as simple to use as mice and rabbits, but also highly phylogenetically distant from mammals. avians can circumvent many of the common problems encountered with mammalian immunizations described above. as fully domesticated small animals, chickens are attractive hosts for immunization as they are highly accessible, very affordable, and easily housed in a generic animal house. most importantly, however, the amino acid homology between the mammalian and avian orthologues of a given protein is typically lower than between the mammals commonly used for antibody generation, and indeed, some mammalian proteins may not even exist in avians. the immunoglobulin response of chickens to highly conserved mammalian proteins is reliably robust, generally exhibits high avidity, and potentially targets a broad spectrum of epitopes on protein immunogens [30-32]. chickens therefore have a potentially major advantage over other common immune hosts: they can produce a high-affinity cross-reactive antibody response targeting an epitope that is conserved across multiple orthologues of a mammalian protein. this can lead to significant savings in time and resources: if a single, broadly applicable cloned reagent can be identified, it can then be used to generate a single affinity column for the capture of the target protein from multiple species.
chicken immunoglobulins have also shown beneficial biophysical properties: they exhibit high stability to changes in ph and temperatures up to 70 °c [33, 34], provide functional coating on latex microspheres [35], and demonstrate functional direct covalent coupling to a dextran layer for the detection of serum proteins by surface plasmon resonance [36]. furthermore, as chickens are small animals, very little protein immunogen is required to raise a strong immunoglobulin response. approximately 200 μg/bird of purified protein is sufficient to carry out a full immunization regime [37, 38]. these observations have led to the regular use of chickens as an immune host for production of the polyclonal antibody termed igy (egg yolk antibody), in both research and commercial settings. laying hens will export significant quantities of polyclonal igy into the egg (~100 mg of igy per yolk), in a process analogous to mammalian placental igg transfer, which allows direct screening of their antibody response without the need for serum sampling [38]. once a strong immune response has been raised, large quantities of polyclonal antibody are easily prepared from the yolk. these polyclonal antibodies have been successfully applied in research immunochemistry [39], diagnostics [40], and affinity column purification [41]. indeed, immunodepletion resins based on chicken igy can be used to remove high abundance proteins from serum and are now commercially available (sigma-aldrich seppro® igy14 spin columns). unfortunately, polyclonal igy does still suffer from the same issues of ill-defined specificity that all polyclonal antibody preparations do.
in addition, there have been several studies describing successful chicken hybridoma monoclonal antibody generation to antigens such as human peptides [42], sporozoite proteins [43], and prion protein [44], but the low antibody expression and instability associated with chicken myeloma cell lines [42, 45] led to the under-use of this species as a source of monoclonal antibodies. today, however, the progress in chicken antibody phage display has circumvented these problems and made recombinant chicken antibody reagents readily accessible, as we describe below. the chicken immunoglobulin repertoire is almost ideally suited to antibody phage display, as chickens generate their immunoglobulin repertoire from a single set of vh and vl germline sequences [46], with the majority of positions in the framework regions maintained as germline [2]. diversity in the v-regions is created by both v-d-j recombination and somatic hypermutation, with the additional influence of "gene conversion," where multiple upstream pseudogenes are recombined into the functional sequence. this germline v-gene system means that the entire chicken antibody repertoire can be captured using only four pcr primers [47], making chicken libraries highly representative of the induced immunoglobulin response. this is in direct contrast with immune hosts such as mice, which have diverse germline v-gene sequences and therefore require complex mixes of pcr primers [24]. additionally, the two v-gene germline sequences found in chickens are highly homologous to the human vλ and vh3 germline families [48], which are both associated with creating v-domains with high stability and solubility. indeed, chicken scfvs can be stable in crude bacterial culture supernatants for up to 1 month at room temperature [49]. the initial work of davies et al. [47] showed that a simple recombinant chicken antibody library could be displayed on phage.
while this small library was nonimmune and derived from the bursa cells of a single young chicken, the group was able to select target-specific scfv sequences recognizing lysozyme, serum albumin and thyroglobulin. the potential of chicken recombinant antibodies was further highlighted by a study [50] which used an scfv library derived from the spleens of immunized chickens and successfully generated highly specific scfv antibodies that targeted both mouse and rat serum albumins, where tolerance issues limit the ability to generate murine monoclonal antibodies. in addition, an in-depth analysis of the avian immunoglobulin vh repertoire demonstrated that naïve repertoire features were fully replicated in the resulting target-selected, phage display repertoire [2]. a major study [26] subsequently demonstrated that chickens could be a useful source of scfv and chimeric fab antibodies with specificity for hapten molecules. however, none of these early studies characterized the antibodies for their affinity, or their function as practical reagents. more recent studies have shown that scfv antibodies derived from immunized chickens are highly effective reagents in diverse settings such as diagnostic elisa for infectious bursal disease virus [51], the diagnosis of prion disease [52], immunodetection of haptenic shellfish toxins [53], immunostaining of sars-infected cells [54], biosensing of cardiac biomarkers [55], and the measurement of apob protein in mouse and human sera [56]. raats et al. [57] have also illustrated that anti-idiotype scfvs from an immune chicken scfv library were of considerably higher sensitivity than those derived from a human antibody library in the same study.
the generation of highly selective scfvs toward the prp protein, which is highly conserved in mammals, demonstrates the advantage of the chicken as an immune model, as isolated scfvs were shown to react with murine, ovine, and bovine orthologues of the protein [52]. the cloning of chicken antibodies via phage display has also allowed the precise dissection of the specificity and affinity of the chicken immunoglobulin response. high-throughput affinity measurements for panels of chicken scfvs to the inflammatory biomarker c-reactive protein have identified clones that preferentially recognize the multimeric and monomeric forms of the protein [50]. in the same study, clones with affinities as high as 350 pm were generated from an immune phage display library of only 3 × 10^7 total clones. in addition, chicken anti-prp scfvs have been reported to have affinities up to 15 pm, making them among the highest affinity scfvs reported to date [58]. what may be of particular practical interest to many researchers is that chickens can serve as a host for simultaneous immunization with multiple proteins of interest, with as many as eight proteins being used successfully in a single immunization scheme [37, 49, 59]. the target proteins of interest are mixed in a single adjuvant preparation and each immunized animal receives all multiplexed proteins simultaneously. spleen and bone marrow tissues from the immunized animals are then used to generate relatively small phage display libraries and specific antibodies are derived via selection of the library separately on the individual proteins originally used for immunization [37]. the immunized chickens appear to react to the proteins fully independently, as the phage display libraries generate individual scfv antibody clones that are fully specific by western blot and elisa, showing no reactivity to their co-immunogens [37, 59].
this approach has major benefits practically and ethically, as it allows the use of a single library to derive high-affinity antibodies to a group of proteins of interest. multi-immunization methods also simultaneously minimize animal use and raise the likelihood of success in generating an immediately useful reagent [37, 59]. multi-target immunization regimes should be designed with one of two objectives in mind. firstly, the simplest scenario combines multiple unrelated proteins, which leads to unrelated b-cell responses after immunization. to derive antibodies of greatest specificity during a multi-target immunization of this kind, it is important to ensure that each of the protein immunogens is highly purified and that no closely related proteins are co-immunized into a single animal. secondly, to derive antibodies that are cross-reactive to orthologues of a conserved protein from multiple species, it is likely to be beneficial, but not necessarily essential, to include each orthologue in the mix of immunogens given to each animal. iterative selection rounds that change orthologue each time can then be used to bias toward the isolation of cross-reactive antibodies. the generation and selection of chicken recombinant antibodies is extremely reliable using the methods described in detail in the accompanying chapter. the subsequent identification and sequencing of antibodies displaying the characteristics desired can also be performed simply. in general, scfvs isolated from immunized chicken libraries exhibit high affinity and can be assayed via a direct elisa, using crude periplasmic extracts from the protein-expressing e. coli clones. the level of further downstream analysis carried out on positive hits identified during the binding elisa depends on what the end user requires. specific binding function may suffice for scfvs that are to be used simply as reagent antibodies for in vitro analysis of samples via elisa or western blotting.
for antibodies to be used in affinity chromatography, the antibody fragments must be purified and tested for their function after being coupled to a solid matrix and for their specificity during purification. few studies have been performed using antibody fragments in affinity chromatography, but several potential approaches have been described. for the antibody binding site(s) to be fully solvent-exposed and active, the antibody fragment must be directionally captured onto the solid matrix. even for full-length igg, nonspecific adsorption or covalent coupling onto solid supports can lead to denaturation, reducing or negating antigen-binding function [60]. antibody fragments may be slightly more prone to chemical or physical denaturation than full-length immunoglobulins, but scfv and fab have been used successfully in affinity chromatography, and in the creation of spr sensing surfaces which can go through serial rounds of binding and regeneration [61]. mcelhinney et al. [62] created a simple scfv-based affinity column for the concentration and cleanup of microcystin toxins from environmental samples, by transiently coupling the his-tagged scfv to a disposable nickel chelate column. the scfv was thereby coupled directionally, maximizing the functional antibody content on the column, and the analyte for purification was co-eluted with the scfv before quantification by reverse-phase hplc. however, this method can only be used under a limited number of conditions, as the interaction of the his-tag with nickel is noncovalent and ph dependent. other possibly useful low-affinity expression tags include e. coli maltose binding protein and glutathione s-transferase, which have both been successful in protein purification [63, 64] (see also chapter 8 for a discussion on protein tagging).
high-affinity, highly stable linkage via affinity tagging may also be achieved by site-specific biotinylation of antibody fragments and their immobilization onto a matrix that has been passively or covalently coated with avidin. bacterial expression vectors are now available which introduce biotin into specific peptide tags (avitag), which can be produced on the termini of recombinant proteins. a similar method was proven to be efficient in the production of fabs that are specifically biotinylated in vivo during bacterial expression, via c-terminal fusion of the fabs to the e. coli acetyl-coa carboxylase [65]. importantly, these biotinylated fabs were successfully used to purify recombinant tnf-alpha from bacterial lysates, via a streptavidinated column. the peptide tagging method has also been used successfully to label both fab and scfv antibodies for their oriented immobilization and use as capture antibodies in clinical diagnostic elisas [66, 67]. these studies suggest that biotin-streptavidin coupling is a simple and rapid method for the stable, directional capture of recombinant antibody fragments. while the covalent coupling of recombinant antibody fragments via their reactive lysine side chains is likely to be disruptive to their function, some alternative covalent coupling methods have been identified. in the simplest example, the disulfide bonds linking the two constant regions of a fab can be reduced using a mild agent to expose cysteine thiols. these thiol groups can then be used to covalently couple the fragment to a thiol-activated surface [68]. more elegant versions of this approach have expressed antibody fragments with a c-terminal cysteine group, then gently applied the same chemistry, to preferentially reduce the exposed disulfide groups [61]. the exposed terminal thiols are again an efficient reactive group for covalent attachment.
it is also possible to express scfv fused to the constant regions of human igg light chains as another source of usable cysteine residues external to the v-regions [69]. whether any of the above attachment methods are appropriate for a given affinity purification application may be decided upon by the individual investigator. in cases where stable linkage has been achieved and the column is to be reused, it is prudent for the investigator to examine multiple clones for their stability under repeated cycles of elution and regeneration. while chicken scfvs are built upon naturally stable frameworks, the stability of different clones cannot be taken for granted. in cases where stability remains an issue, the appropriate chicken v-regions can be cloned into an fc-fusion [70] or igg [71] expression vector to produce full-length antibody in mammalian, yeast, and even plant culture systems [72]. in conclusion, this chapter demonstrates that phage display is a powerful technology that can be used to generate highly specific reagents from immune sources. it is an attractive alternative to classical antibody generation technologies that can be employed in many fields for reliable antibody and antibody fragment generation.
antibody microarrays: promises and problems fundamental characteristics of the immunoglobulin vh repertoire of chickens in comparison with those of humans, mice, and camelids analyzing antibody specificity with whole proteome microarrays continuous cultures of fused cells secreting antibody of predefined specificity filamentous fusion phage: novel expression vectors that display cloned antigens on the virion surface cloning immunoglobulin variable domains for expression by the polymerase chain reaction phage antibodies: filamentous phage displaying antibody variable domains multi-subunit proteins on the surface of filamentous phage: methodologies for displaying antibody (fab) heavy and light chains selecting and screening recombinant antibody libraries human antibodies from v-gene libraries displayed on phage semisynthetic combinatorial antibody libraries: a chemical solution to the diversity problem human antibodies with sub-nanomolar affinities isolated from a large non-immunized phage display library fully synthetic human combinatorial antibody libraries (hucal) based on modular consensus frameworks and cdrs randomized with trinucleotides generation of high-affinity human antibodies by combining donor-derived and synthetic complementarity-determining-region diversity high-throughput generation of synthetic antibodies from highly functional minimalist phage-displayed libraries picomolar affinity antibodies from a fully synthetic naive library selected and evolved by ribosome display affinity maturation of a humanized rat antibody for anti-rage therapy: comprehensive mutagenesis reveals a high level of mutational plasticity both inside and outside the complementarity-determining regions in vitro selection and affinity maturation of antibodies from a naive combinatorial immunoglobulin library augmented binary substitution: single-pass cdr germ-lining and stabilization of therapeutic antibodies protein engineering of antibody binding sites: recovery of 
specifi c activity in an anti-digoxin single-chain fv analogue produced in escherichia coli human monoclonal fab fragments derived from a combinatorial library bind to respiratory syncytial virus f glycoprotein and neutralize infectivity domain interactions in the fab fragment: a comparative evaluation of the singlechain fv and fab format engineered with variable domains of different stability an improved single-chain fab platform for effi cient display and recombinant expression reliable cloning of functional antibody variable domains from hybridomas and spleen cell repertoires employing a reengineered phage display system high affi nity scfvs from a single rabbit immunized with multiple haptens methods for the generation of chicken monoclonal antibody fragments by phage display the isolation of super-sensitive anti-hapten antibodies from combinatorial antibody libraries derived from sheep high throughput ranking of recombinant avian scfv antibody fragments from crude lysates using the biacore a100 determination of lox-1-ligand activity in mouse plasma with a chicken monoclonal antibody for apob generating recombinant anti-idiotypic antibodies for the detection of haptens in solution humanization of chicken monoclonal antibody using phagedisplay system multipleantigen immunization of chickens facilitates the generation of recombinant antibodies to autoantigens profi ling receptor tyrosine kinase activation by using ab microarrays oriented immobilisation of engineered single-chain antibodies to develop biosensors for virus detection rapid isolation of a single-chain antibody against the cyanobacterial toxin microcystin-lr by phage display and its use in the immunoaffi nity concentration of microcystins from water advances in recombinant antibody microarrays comparison of affi nity tags for protein purifi cation in vivo biotinylated recombinant antibodies: construction, characterization, and application of a bifunctional fab-bccp fusion protein produced in 
escherichia coli in vitro enzymatic biotinylation of recombinant fab fragments through a peptide acceptor tail use of an in vivo biotinylated single-chain antibody as capture reagent in an immunometric assay to decrease the incidence of interference from heterophilic antibodies sitedirected immobilisation of antibody fragments for detection of c-reactive protein recombinant antibody expression vectors enabling double and triple immunostaining of tissue culture cells using monoclonal antibodies production of anti-prion scfv-fc fusion proteins by recombinant animal cells whole igg surface display on mammalian cells: application to isolation of neutralizing chicken monoclonal anti-il-12 antibodies plant expression of chicken secretory antibodies derived from combinatorial libraries key: cord-003020-q69f57el authors: farhadi, tayebeh; hashemian, seyed mohammadreza title: computer-aided design of amino acid-based therapeutics: a review date: 2018-05-14 journal: drug des devel ther doi: 10.2147/dddt.s159767 sha: doc_id: 3020 cord_uid: q69f57el during the last two decades, the pharmaceutical industry has progressed from detecting small molecules to designing biologic-based therapeutics. amino acid-based drugs are a group of biologic-based therapeutics that can effectively combat the diseases caused by drug resistance or molecular deficiency. computational techniques play a key role to design and develop the amino acid-based therapeutics such as proteins, peptides and peptidomimetics. in this study, it was attempted to discuss the various elements for computational design of amino acid-based therapeutics. protein design seeks to identify the properties of amino acid sequences that fold to predetermined structures with desirable structural and functional characteristics. peptide drugs occupy a middle space between proteins and small molecules and it is hoped that they can target “undruggable” intracellular protein–protein interactions. 
peptidomimetics, the compounds that mimic the biologic characteristics of peptides, present refined pharmacokinetic properties compared to the original peptides. here, the elaborated techniques that are developed to characterize the amino acid sequences consistent with a specific structure and allow protein design are discussed. moreover, the key principles and recent advances in currently introduced computational techniques for rational peptide design are spotlighted. the most advanced computational techniques developed to design novel peptidomimetics are also summarized. different diseases may be caused by pathogens or malfunctioning organs, and using therapeutic agents to heal them has an old recorded history. small molecules are conventional therapeutic candidates that can be easily synthesized and administered. however, many of these small molecules are not specific to their targets and may lead to side effects. 1 moreover, a number of diseases are caused due to deficiency in a specific protein or enzyme. thus, they can be treated using biologically based therapies that are able to recognize a specific target within crowded cells. 2 under the biologic conditions, some macromolecules such as proteins and peptides are optimized to recognize specific targets. 3 therefore, they can override the shortcomings of small molecules. 3 recently, pharmaceutical scientists have shown interest in engineering amino acid-based therapeutics such as proteins, peptides and peptidomimetics. [4] [5] [6] theoretical and experimental techniques can predict the structure and folding of amino acid sequences and provide an insight into how structure and function are encoded in the sequence. such predictions may be valuable to interpret genomic information and many life processes. moreover, engineering of novel proteins or redesigning the existing proteins has opened the ways to achieve novel biologic macromolecules with desirable therapeutic functions. 
7 protein sequences comprise tens to thousands of amino acids. besides, the backbone and side chain degrees of freedom lead to a large number of configurations for a single amino acid sequence. protein design techniques achieve minimal frustration through precise identification of sequences and their characteristics. [8] [9] [10] [11] according to energy landscape theory, natural proteins are minimally frustrated when their native state is sufficiently low in energy. 7 the de novo design of a sequence is difficult because there are huge numbers of possible sequences: 20^n for n-residue proteins with only the 20 natural amino acids. 12 peptide design should incorporate computational approaches. it can benefit from the more advanced methods developed for small-molecule and protein design. 13 however, the straightforward adoption of computational approaches applied to small-molecule and protein design has not been accepted as a reasonable solution to the peptide design problem. [14] [15] [16] in peptide drug design, the conformational space accessible to peptides challenges the small-molecule computational approaches. besides, the necessity for nonstandard amino acids and various cyclization chemistries challenges the available tools for protein modeling. 13 furthermore, the aggregation of peptide drugs during production or storage can be an unavoidable problem in the peptide design procedure. rational design of a peptide ligand is also challenging because of the elusive affinity and intrinsic flexibility of peptides. 17 peptide-focused in silico methods have been increasingly developed to make testable predictions and refine design hypotheses. consequently, the peptide-focused approaches narrow the chemical space of theoretical peptides to more focused "drug-like" spaces and reduce the problems associated with aggregation and flexibility. 
13, 18 for the discussions that follow, peptides can be defined as relatively small (2-30 residues) polymers of amino acids. 18 in physiological conditions, several problems such as degradation by specific or nonspecific peptidases may limit the clinical application of natural peptides. 19 moreover, the promiscuity of peptides for their receptors emerges from high degrees of conformational flexibility that can cause undesirable side effects. 20 besides, some properties of therapeutic peptides, such as high molecular mass and low chemical stability, can result in a weak pharmacokinetic profile. therefore, peptidomimetic design can be a valuable solution to circumvent some of undesirable properties of therapeutic peptides. 21, 22 in the biologic environment, peptidomimetics can mimic the biologic activity of parent peptides with the advantages of improving both pharmacokinetic and pharmacodynamic properties including bioavailability, selectivity, efficacy and stability. a wide range of peptidomimetics have been introduced, such as those isolated as natural products, 23 synthesized from novel scaffolds, 24 designed based on x-ray crystallographic data 25 and predicted to mimic the biologic manner of natural peptides. 26 using hierarchical strategies, it is possible to change a peptide into mimic derivatives with lower undesirable properties of the origin peptide. 27 over the past 10 years, computational methods have been developed to discover peptidomimetics. 28 in a part of this review, novel computational methods introduced for peptidomimetic design have been summarized. peptidomimetics can be categorized as follows: peptide backbone mimetics (type 1), functional mimetics (type 2) and topographical mimetics (type 3). 29 the first generation of peptidomimetics (type 1) mimics the local topography of amide bond. it includes amide bond isosteres, 30 pyrrolinones 31 or short fragments of secondary structure, such as beta-turns. 
32 such mimetics generally match the peptide backbone atom-for-atom, and comprise chemical groups that also mimic the functionality of the natural side chains of amino acids. a number of successful examples of type 1 peptidomimetics have been reported. 33 the second type of peptidomimetics is described as functional mimetics or type 2 mimetics, which include small, non-peptide compounds that are able to recognize the biologic targets of their parent peptide. 34 at first, they were assumed to be conservative structural analogs of parent peptides. however, when their binding sites on biologic targets were investigated using site-directed mutagenesis, the results indicated that type 2 peptidomimetics routinely bind to protein sites that are different from those selected by the original peptide. 35 therefore, type 2 mimetics maintain the ability to interfere with the peptide-protein interaction process without the necessity to mimic the structure of the natural peptide. 28 type 3 peptidomimetics best embody the concept of peptidomimetics. they consist of the necessary chemical groups that act as topographical mimetics and contain novel chemical scaffolds that are unrelated to natural peptides. 36 here, theoretical and computational techniques to design proteins, peptides and peptidomimetics are reviewed. however, the current review does not deeply highlight the computational aspects of amino acid-based therapeutic design, but only discusses the methods used to design the mentioned therapeutics. figure 1 summarizes the key concepts presented in this study. as some examples, the structures of aldesleukin, leuprolide and spaglumic acid, important amino acid-based therapeutics approved by the us food and drug administration (fda), are shown in figure 2. the structures of aldesleukin (figure 2a) and leuprolide (pdb id: 1yy2; figure 2b) were obtained from the protein data bank (pdb; http://www.rcsb.org/) and visualized with pymol. 
the structure of spaglumic acid was retrieved (in mol format) from the pubchem database (https://pubchem.ncbi.nlm.nih.gov/) with the pubchem id 188803 (figure 2c) and visualized using pymol. aldesleukin, a lymphokine, is a recombinant protein used to treat adults with metastatic renal cell carcinoma (https://www.drugbank.ca/drugs/db00041). leuprolide, a synthetic nine-residue peptide analog of gonadotropin releasing hormone, is used to treat advanced prostate cancer (https://www.drugbank.ca/drugs/db00007). spaglumic acid is used in allergic conditions such as allergic conjunctivitis. the drug belongs to a class of peptidomimetics known as hybrid peptides. hybrid peptides contain at least two dissimilar types of amino acids (alpha, beta, gamma or delta) linked to each other via a peptide bond (https://www.drugbank.ca/drugs/db08835). in the current study, all fda-approved therapeutics (in 2018) were retrieved from drugbank (https://www.drugbank.ca/biotech_drugs) and an analysis was conducted to compare their percentages. protein-based therapies, gene or nucleic acid-based therapies, vaccines, allergenics and cell transplant therapies made up 8.05%, 0.17%, 2.64%, 16.20% and 0.14% of total approved therapeutics, respectively. small-molecule drugs made up 72.76% of the approved therapeutics (figure 3). computational designing of proteins can be classified as follows: 1) template-based designing in which the three-dimensional (3d) structure of a predefined template is adapted to design a sequence and 2) de novo designing in which the amino acids' arrangement is changed to generate both sequence and 3d structure of a completely novel protein. 3 the problem of predicting the fold of an unknown sequence could be solved by utilizing templates. since the fold is unaltered, the backbone atoms are directly located on this framework. 
3 moreover, to generate a functional protein, the side chains that can effectively stabilize the structure are added to the backbone. 37, 38 routine concerns and methods for template-based protein design are reviewed below. selecting the template (scaffold) protein the template (also named as scaffold protein) contains a group of backbone atom coordinates. the coordinates can be retrieved from an available x-ray crystal structure or cautiously from a nuclear magnetic resonance (nmr) structure. 39 fixing the backbone decreases the computational complexity, but it may prevent the main chain adjustments needed to accommodate sequence changes. 7 backbone flexibility can generate designed functionalities beyond the protein's normal function. the backbone flexibility is introduced through incorporating other closely associated conformations to an existing structure. [40] [41] [42] recently, new functionalities were effectively introduced into the tim-barrel topology. 43 this fold has been detected as one of the most shared structures in 21 distinct protein superfamilies. 44 sequence search and characterization in a design procedure, a protein sequence is selected such that it meets the energetic and geometric constraints established by the chosen fold. sequence search techniques sample different sequences and estimate their energies to find the one with the minimum energy. 3 in order to identify the sequences subject to an objective function or a specific energy, diverse strategies including optimization and probabilistic approaches have been developed. 45 optimization processes may recognize candidate sequences using stochastic or deterministic methods. 45 probabilistic approaches focus on characterizing the sequence space probabilistically. deterministic methods: to achieve a sequence folded into a global minimum energy conformation, deterministic methods search the whole sequence space and identify the global optima. 
3, 7 these methods include dead-end elimination (dee), 46 self-consistent mean field, 47 graph decomposition and linear programming. 48 stochastic algorithms search the sequence space in an exploratory manner. 3 these algorithms include monte carlo algorithms (simulated annealing), 49 graph search methods 50 and genetic algorithms. 51 some of the most commonly used methods are discussed below. dee has been considered as a thorough search algorithm. to find and remove sequence-rotameric positions that are not portions of the global minimum energy conformation, dee compares two amino acid rotamers and removes the one with greater interaction energy. 52 interaction energies are computed for each rotamer of the test amino acid, along with all rotamers of every other amino acid. 3 the situation is repetitively examined for total amino acid states as well as their rotamers until it no longer holds true. 52, 53 expanding the sequence length increases the combinatorial complication of dee exponentially. therefore, to design sequences of 30 amino acids or larger, application of dee may be restricted. 54 details of the theorem are explained elsewhere. 3, 7 stochastic search algorithms: as mentioned before, deterministic approaches are perfect to design proteins with small sizes, but show the applied disadvantages with extension of sequence size. stochastic or heuristic methods are valuable to design large proteins. 3 the most widely used method for protein design includes monte carlo sampling. 3, 7 monte carlo method samples positions of complicated proteins in a way related to a selected probability distribution such as boltzmann distribution. boltzmann distribution specially weighs low-energy configurations. the monte carlo algorithm performs iterative series of calculations. at the primary step of each search, a partially accidental test sequence is generated, and its energy is calculated via a physical potential. 
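the dead-end elimination step described earlier in this passage can be illustrated with a minimal sketch. it uses the commonly cited elimination criterion, which is an assumption here rather than a formula taken from this review: a rotamer r at site i is removed when its best-case energy (self energy plus the lowest pair energy with every other site) is still higher than the worst-case energy of some competing rotamer t at the same site. the function names, the data layout and the toy energies below are hypothetical and chosen only for illustration.

```python
from itertools import product

def pair(pair_energy, i, r, j, s):
    """Symmetric lookup of the pair energy between rotamer r at site i and rotamer s at site j."""
    if i < j:
        return pair_energy[(i, j)][r][s]
    return pair_energy[(j, i)][s][r]

def dee_prune(self_energy, pair_energy):
    """Repeatedly remove rotamers that cannot be part of the global minimum."""
    n = len(self_energy)
    alive = [set(range(len(site))) for site in self_energy]

    def bound(i, r, pick):
        # pick=min gives the best case for rotamer r, pick=max the worst case
        return self_energy[i][r] + sum(
            pick(pair(pair_energy, i, r, j, s) for s in alive[j])
            for j in range(n) if j != i
        )

    changed = True
    while changed:  # repeat until the criterion no longer removes anything
        changed = False
        for i in range(n):
            for r in sorted(alive[i]):
                best_r = bound(i, r, min)
                if any(best_r > bound(i, t, max) for t in alive[i] if t != r):
                    alive[i].discard(r)
                    changed = True
    return alive

def brute_force(self_energy, pair_energy):
    """Exhaustive search, only feasible for tiny toy problems; used as a check."""
    n = len(self_energy)
    best = None
    for combo in product(*[range(len(site)) for site in self_energy]):
        e = sum(self_energy[i][combo[i]] for i in range(n))
        e += sum(pair(pair_energy, i, combo[i], j, combo[j])
                 for i in range(n) for j in range(i + 1, n))
        if best is None or e < best[0]:
            best = (e, combo)
    return best

# Invented toy energies (arbitrary units): 3 sites with 2, 3 and 2 rotamers.
SELF_E = [[0.0, 5.0], [1.0, 0.0, 4.0], [0.5, 0.2]]
PAIR_E = {
    (0, 1): [[-1.0, 0.5, 0.0], [0.2, 0.1, 0.3]],
    (0, 2): [[0.0, -0.5], [0.1, 0.0]],
    (1, 2): [[-0.3, 0.2], [0.0, -0.2], [0.5, 0.4]],
}
```

on this toy problem the rotamers that survive pruning still contain the exhaustively determined minimum, which is the property that makes the method deterministic. the exhaustive check grows exponentially with the number of sites, which is why it is used here only for verification.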
during the primary step, both rotamer state and amino acid identity are adjusted and an effective temperature controls the probable energy alterations. in the next step, named simulated annealing, the temperature gradually decreases and permits favorable sampling of lower-energy configurations. 55 multiple independent calculations are carried out to converge the system to a global minimum. 3, 7 for more explanation about the theorems and details of the formulation of the probability distribution and weights, readers are referred to previous reports. 3, 7 probabilistic approach: probabilistic approaches are frequently employed when thorough information is not accessible for protein design. in a probabilistic approach, site-specific amino acid probabilities may be utilized, rather than particular sequences. the procedure is partially motivated by the uncertainties in finding sequences consistent with a specific structure. briefly, the backbone atoms are fixed or greatly constrained, side chain conformations are discretely handled, energy functions are estimated and solvation is handled by simple models. 7 however, in order to offer valuable sequence information for design experiments and to find structurally significant amino acids, probabilistic techniques leverage structural characteristics of interatomic interactions. 7 generally, monte carlo methods give a probabilistic sampling of sequences. 49, 55 in addition, an entropy-based formalism has been defined to predict amino acid probabilities for a certain backbone structure. 56, 57 the method employs concepts from statistical thermodynamics to assess the site-specific probabilities. because it addresses the whole space of available compositions, the theory is not restricted by computational enumeration and sampling. large protein structures with >100 variable residues can be handled readily. 
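the monte carlo search with simulated annealing described above can be sketched as follows. this is a minimal illustration under stated assumptions, not the procedure of any particular design package: the energy function is a hypothetical toy that only counts violations of a hydrophobic/polar pattern, moves change one residue at a time, acceptance follows the metropolis rule with boltzmann weights, and the temperature is lowered geometrically. all names and parameter values are invented for the example.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AVILMFW")  # simplified grouping, assumed for this toy

def toy_energy(seq, pattern):
    """Toy energy: one unit per position that violates the H/P pattern."""
    return sum(
        (want == "H") != (res in HYDROPHOBIC)
        for res, want in zip(seq, pattern)
    )

def anneal(pattern, steps=5000, t_start=2.0, t_end=0.05, seed=0):
    """Simulated annealing over sequences for a fixed H/P pattern."""
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in pattern]  # random starting sequence
    energy = toy_energy(seq, pattern)
    best_seq, best_energy = list(seq), energy
    for step in range(steps):
        # geometric cooling from t_start down to t_end
        temp = t_start * (t_end / t_start) ** (step / (steps - 1))
        pos = rng.randrange(len(seq))
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)  # trial mutation at one site
        delta = toy_energy(seq, pattern) - energy
        # Metropolis rule: always accept downhill, sometimes accept uphill
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            energy += delta
            if energy < best_energy:
                best_seq, best_energy = list(seq), energy
        else:
            seq[pos] = old  # reject the move
    return "".join(best_seq), best_energy
```

calling `anneal("HPPHHPPHHPPH")` returns a sequence and its toy energy. in a realistic setting the energy would come from a physical potential over rotamer states as well as residue identities, and several independent runs would be compared.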
7 sampling sequence space to generate conformations the chemical variability of a sequence and the number of various amino acids permitted at each position are defined as "degrees of freedom for each amino acid". moreover, an unrestricted search considers each of the 20 natural residues at every position. 58 to decrease the degrees of freedom for each amino acid and narrow the search of the sequence space, diverse approaches such as hydrophobic patterning have been proposed. 58 monomers other than the naturally occurring amino acids 61 can be used to probe a protein structure 59 and improve its function. 60 sampling of side chain conformational space to form conformations side chain conformations are typically consistent with the energy minima of molecular potentials and can be obtained from a structural database. 62 rotamer states correspond to the frequently observed values of dihedral angles in the side chain of each amino acid. for example, the simplest amino acids including alanine and glycine have only one rotamer state, while the bigger amino acids have >80 diverse rotamer states. 62 a variety of rotamer libraries including backbone-dependent, secondary structure-dependent and backbone-independent libraries have been developed for protein design. 62, 63 by using a rotamer library, one can discretize a meaningful state space to decrease the computational difficulty. rotamer libraries can be extended beyond the 20 natural amino acids. the effective rotamers can model cofactors, ligands, water and posttranslational modifications. for example, to improve the modeling of protein-protein interactions and model water within protein interiors, structurally defined water molecules can be inserted as a solvated rotamer library. 61 energy functions have been employed to quantify sequence-structure compatibilities. 
64 they include linear combinations of hydrogen bonds made by backbone atoms, repulsion among atoms, hydrophobic attraction among non-polar groups and electrostatic interactions among sequential neighbors. 65 the sequence of a protein is selected so that it can satisfy the energetic and geometric constraints enforced by the target fold. constraints typically contain several intramolecular interactions such as van der waals, hydrophobic, polar and electrostatic interactions, as well as hydrogen bonds. generally, a scoring function is used to take the energetic contributions of the mentioned parameters into account. 3, 7, 65 de novo design: designing the sequence and 3d structure through assembly of protein fragments 66, 67 or secondary-structure elements, 68, 69 novel structures can be modeled de novo. in the design procedures, the backbone coordinates are generally constrained. summary and important findings of some proteins designed using computational approaches, including a retroaldol enzyme, 43 the kemp elimination enzyme, 70 a novel βαβ protein, 71 a redesigned procarboxypeptidase, 72 a novel α/β protein structure and the top7, 73 are shown in table 1. peptide design methods have been categorized as ligand- and target-based design methods. in the ligand-based designing procedure, information derived from peptides is used to design novel therapeutic peptides. in the target-based method, information derived from target proteins is specifically utilized. typically, a hybrid approach including both ligand- and target-based design is utilized. 13 ligand-based peptide design the ligand-based design has been classified as follows: 1) sequence-based, 2) property-based and 3) conformation-based design. sequence-based approach uses the information of conserved regions and analyzes the multiple sequence alignments. this method is directed by the hypothesis that conserved regions are functionally and structurally significant. 
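as a small illustration of how conserved regions can be located in a multiple sequence alignment, the sketch below scores each alignment column by one minus its normalized shannon entropy. this particular score is an assumption chosen for simplicity and is not the method of any tool cited here; the aligned peptides are invented toy data.

```python
import math
from collections import Counter

def column_conservation(alignment):
    """Return one score per column: 1.0 = fully conserved, lower = more variable."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        residues = [c for c in column if c != "-"]  # ignore gap characters
        if not residues:
            scores.append(0.0)
            continue
        total = len(residues)
        entropy = -sum(
            (n / total) * math.log2(n / total)
            for n in Counter(residues).values()
        )
        # normalize by the maximum entropy for 20 amino acid types
        scores.append(1.0 - entropy / math.log2(20))
    return scores

# Invented toy alignment of four peptides (hypothetical data).
TOY_ALIGNMENT = ["GPAGKDGE", "GPQGRDGE", "GPSGKEGE", "GPTGRDGD"]
TOY_SCORES = column_conservation(TOY_ALIGNMENT)
```

in this toy alignment the first two columns are fully conserved and score 1.0, while the third column, which holds four different residues, scores lowest. columns with high scores would be the candidates for functionally or structurally important positions under the hypothesis stated above.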
13 computational tools allow the ligand-based peptide design, although they lag behind bioinformatics strategies developed for protein designing. 13 recently, using a method based on a pam250 matrix, the relationship between a series of 35 collagen peptides and antiangiogenic activity including proliferation, migration and adhesion was analyzed. 74 the pam250 matrix captured information of mutation rates among all pairs of amino acids. based on the results, regions at the c and n termini of the peptides were detected to be significant for an ideal activity and suggested as two distinct binding sites. the approach showed the potential worth of the sequence-based peptide design. 74 in another report, a computational platform called sarvision was developed to support sequence-based design. sarvision represents an important step for peptide sequence/activity relationship (sar) analysis. moreover, it combines improved visualization abilities with advanced sequence/activity analysis. 75 compared to small molecules, property-based design methods for peptides are in the early stages of development. in a recent study, the δg decomposition per residue and the physicochemical characteristics of amino acids, such as hydrophilicity, hydrophobicity and volume, were used to model peptide binding to targets of interest. 76, 77 finally, a model was built to estimate peptide δg values for binding to the class i major histocompatibility complex (mhc) protein hla-a*0201. 78 furthermore, in a wide range of studies, antimicrobial peptides were successfully analyzed by using the property-based approach. 79 for example, a machine-learning method was employed to design novel antimicrobial peptides. 80 the success of the property-based methods with antimicrobial peptides may be explained by the fact that the desired biologic activity of membrane disruption is relatively nonspecific. 
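to make the idea of property-based descriptors concrete, the sketch below turns a peptide sequence into a few simple physicochemical numbers. the hydropathy values are the widely used kyte-doolittle scale; the net charge is a deliberately crude assumption (lysine and arginine count +1, aspartate and glutamate count -1, histidine and the termini are ignored). the function name and the choice of descriptors are hypothetical and are not those of the studies cited above.

```python
# Kyte-Doolittle hydropathy index per residue.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def peptide_descriptors(peptide):
    """Return simple property descriptors for a peptide of standard residues."""
    peptide = peptide.upper()
    unknown = set(peptide) - set(KD_HYDROPATHY)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    # crude charge model: K/R = +1, D/E = -1, everything else neutral
    positive = sum(peptide.count(res) for res in "KR")
    negative = sum(peptide.count(res) for res in "DE")
    return {
        "length": len(peptide),
        "mean_hydropathy": sum(KD_HYDROPATHY[r] for r in peptide) / len(peptide),
        "net_charge": positive - negative,
    }
```

for a short cationic, amphipathic toy sequence such as `"KKLLKK"` this gives a net charge of +4 and a negative mean hydropathy. descriptor vectors of this kind are the usual input to the machine-learning models mentioned above.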
13 in the case of conformation-based peptide design, computational techniques were developed to predict the conformational ensembles or structure of peptides and analyze the sars. 81,82 pep-fold is an online tool used to predict the 3d structures of peptides of length 9-36 residues. 81 a remarkable suggestion from the data is that pep-fold seems to solve the conformational sampling problem. 13, 81 in order to search conformational spaces of a peptide, long timescale molecular dynamic simulations have been employed. 83, 84 besides, quantum mechanical calculations are promising to address the scoring deficiency in the peptide conformational examination. 85 apparently, to affect the peptide design processes positively, improving the major theoretical and technical issues is necessary before such computationally sophisticated and costly procedures. conformation of a peptide may be modeled to generate a 3d pharmacophore hypothesis. a certain pharmacophore hypothesis is useful to determine the adme/tox activities or particular potencies of a peptide. 86 for example, screening of a peptide library was jointed to generate a pharmacophore hypothesis to identify potent agonists of melanocortin-4 receptor isoforms. a combinatorial tetrapeptide library was screened, and sar and ligand-derived pharmacophore templates were generated. the pharmacophore hypothesis was proposed to allow continuous attempts in the rational design of melanocortin receptor molecules. 86 target-based peptide design compared to ligand-based peptide design, target-based design appears to be in a more improved level. 13 targetbased design is initiated with the computer-aided survey of a ligand-bound or unbound protein target to recognize its potential binding sites, prospective specificity surfaces and other pharmacologic activity elements. the phase is generally followed by an in silico design phase where computational methods perform, refine and evaluate peptide design ideas. 
some recently developed computational methods for target-based peptide design are reviewed below. recently, an increase in the number of protein-peptide 3d structures deposited in the pdb has helped to explore the molecular mechanism and structural basis of peptide recognition and binding. 87 information of crystal structures of protein-peptide complexes can improve our knowledge of the chemical forces involved in the binding and special modes of binding. dynamic data of the complexes can be partially extracted from the solution nmr structures deposited in the pdb. to record the structures and functions of various protein-peptide complexes, the experimentally resolved structure data were gathered, annotated and analyzed, and several distinctive databases such as pepx, 88 pepbind 89 and peptiddb were generated. 90 the pepx database, derived from the pdb, comprises unique protein-peptide interface collections. 88 the pepbind database contains 4,986 protein-peptide complex structures from the pdb. 89 peptiddb is a curated database of 103 protein-peptide complexes. 90 the abundance of the structural information specifically on monomeric proteins could be gathered to design protein-peptide interactions with no requirement for their sequence homology. 91 protein-peptide docking precise docking of a highly flexible peptide is a major challenge. 18 traditional docking protocols, such as autodock, vina 92,93 and moe-dock, 94 developed for docking of small molecules, were also used to dock a peptide to a protein receptor. however, comparative studies revealed that these techniques tend to fail if the docked peptides are >3 residues long. 95 therefore, development of peptide-focused docking protocols is very important. 96 other protein-protein docking tools such as z-dock and hex have been used for the computational peptide design in some studies. 96 below, details of recently developed peptide-focused docking approaches are discussed. 
first, heuristic evolution procedures were applied to search the large conformational space of linear peptides before the binding. 97 however, these procedures were not efficient and their use was limited. 18 then, a scheme based on conformational sampling became common in the peptide docking. besides, several illustrative approaches were proposed to balance between the accuracy and efficiency of the flexible peptide docking. in this aspect, docscheme, 98 dynadock 99 and pepspec 100 were integrated into online user-friendly interfaces and introduced. recently, pepcrawler 101 and flexpepdock 102 were developed as the peptide docking tools. 18 it is reported that flexpepdock 102 has sub-angstrom accuracy in reproducing the crystal structures of protein-peptide complexes. 103 all of the flexpepdock-based methods assume previous information about the peptide-binding site. 13 anchordock, a recently described algorithm, allows powerful blind docking calculations through relaxing the constraint. 104 the program predicts anchoring origins on a protein surface. following recognition of the anchoring origins, an assumed peptide conformation is refined using an anchor-constrained molecular dynamic process. 105 haddock, a well-known protein-protein docking tool, has been recently expanded to run the flexible peptide-protein docking. 105 to handle a docking procedure, haddock uses ambiguous interaction restraints based on the experimental information about intermolecular interactions. this rigid body peptide docking is followed by a flexible simulated annealing process. the novel haddock strategy initiates docking computations from an ensemble of three dissimilar peptide conformations (eg, α-helix, extended and polyproline-ii) that are highly informative inputs. 105 cabs-dock is a recently introduced protein-peptide docking tool and runs a primary docking procedure whose outcomes can be refined by other tools such as flexpepdock. 
106 in the primary phase of the procedure, random conformations of a peptide are predicted and located around the protein target of interest. the process is followed by replica exchange monte carlo dynamics. subsequently, 10 models are selected for the last optimization using the modeller tool to gain accurate scoring and ranking poses. 13, 106 galaxypepdock was developed to use experimentally resolved protein-peptide structures for running the template-based docking pooled by flexible energy-based optimization. 107 atomistic simulation atomistic monte carlo and molecular dynamics simulations are accurate, but they are meticulous techniques to investigate peptide-protein binding interactions. these techniques can also detect the thermodynamic profile and trajectory included in protein-peptide identification. these methods predict the association among conformations of a peptide in solution or protein. 108 in a study, in order to describe the binding of a decapeptide to the cognate sh3 receptor, a long-term molecular dynamic simulation was used and a two-state model was built. 109 in the first step, a relatively quick diffusion phase, nonspecific encounter complexes were generated and stabilized by using electrostatic energy. the secondary step was a slow modification phase, in which the water molecules were emptied out from the space between the peptide ligand and the receptor. 109 in another report, by using monte carlo method, the mentioned two-state model was verified to trace some oligopeptide routes for binding to various pdz (post synaptic density protein, drosophila disc large tumor suppressor, and zonula occludens-1 protein) domains. 
110 the affinity of bh3 peptides for the bcl-2 protein was investigated, and the results showed that peptides bound with higher affinity when they were less disordered in the unbound state, and vice versa. 111 these results showed that highly structured peptides could increase their affinity by reducing the entropic loss associated with binding. overall, in addition to electrostatic and hydrophobic forces, protein-peptide interactions can be affected by entropic effects and conformational flexibility, which can be readily examined with atomistic simulations. 111 very recently, the energetic and dynamic features of protein-peptide interactions were studied using a fast molecular dynamics simulation. in most cases, the native binding sites and native-like poses of protein-peptide complexes were recapitulated. additional investigation showed that incorporating mobility and flexibility into the simulation could meaningfully improve the accuracy of protein-peptide binding prediction. 112
peptide affinity prediction
most aspects of computational peptide design depend on the accuracy and efficacy of affinity prediction. hence, fast and reliable prediction of peptide-protein affinity is important for rational peptide design. 18 in this respect, two categories of prediction algorithms were developed: sequence-based and structure-based approaches. the sequence-based method uses information derived from primary polypeptide sequences to estimate and evaluate the binding affinity. the structure-based method uses information derived from 3d structures of protein-peptide complexes to predict the binding affinity.
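a sequence-based predictor of the kind described above can be sketched in a few lines of python. the sketch below is a deliberately simplified illustration, not a method from the cited studies: it encodes each peptide by its mean kyte-doolittle hydropathy and fits a one-variable ordinary least-squares line, whereas real qsar models use many descriptors and richer learners. the peptide sequences and affinity values are synthetic and exist only to exercise the code.

```python
# Kyte-Doolittle hydropathy index for the 20 standard residues.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide):
    """Encode a peptide sequence as a single descriptor: its mean hydropathy."""
    return sum(KD[res] for res in peptide.upper()) / len(peptide)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(peptide, slope, intercept):
    """Predict the (synthetic) affinity of a peptide from its sequence alone."""
    return slope * mean_hydropathy(peptide) + intercept


# Synthetic training set: affinities are generated from a known linear rule,
# so the fit should recover slope 2.0 and intercept 1.0.
train_peptides = ["LLVF", "KDEG", "AISW", "GGSS"]
train_affinity = [2.0 * mean_hydropathy(p) + 1.0 for p in train_peptides]

slope, intercept = fit_line([mean_hydropathy(p) for p in train_peptides], train_affinity)
```

once fitted, `predict("AG", slope, intercept)` scores a peptide that was never measured, which is the sense in which such correlations are used to infer the properties of untested sequences.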
113 at the sequence level, quantitative structure-activity relationships (qsars) have been widely used to forecast the binding affinity of peptides and to infer their biologic function. 114 to model the statistical correlation between sequence patterns and the biologic activities of experimentally assessed peptides, machine-learning methods such as partial least squares (pls), artificial neural networks (ann) and support vector machines (svm) have been used. the resulting correlations have been used to infer the activities of experimentally uncharacterized peptides. 115 the relationship between biologic activity and molecular structure is an important issue in biology and biochemistry. qsar is a well-established method in pharmaceutical chemistry and has become a standard tool for drug discovery. however, the predictive capacity of qsar techniques is generally weaker than that of statistics-based approaches. therefore, combining the qsar method with a statistics-based technique may bring out the best in each and may become a trend in future drug discovery. 114 at the structural level, numerous reports on affinity prediction have addressed mhc-binding peptides. many mhc-peptide complex structures have been deposited in the pdb. 116 the significance of domain-peptide recognition has recently been illustrated in metabolic pathways and cell signaling. 117 to predict protein-peptide binding potency, a number of rigorous theories based on free energy perturbation were suggested. these theories were used to compute the change in free energy upon interaction between a phosphotyrosine tetrapeptide (pyeei) and the human lck sh2 domain. 118 furthermore, to gain deeper insight into the structural and energetic aspects of peptide recognition by the sh3 domain, a number of molecular modeling techniques such as homology modeling, molecular docking and molecular dynamics were used.
119 peptide array strategies confirmed that some peptide candidates may be potent binders of the abl sh3 domain. 120 very recently, an approach combining quantum mechanics/molecular mechanics, semiempirical poisson-boltzmann/surface area and empirical conformational free energy analysis was developed to quantify the energetic contributions involved in the affinity loss of the pdz domain and oppa protein toward their peptide ligands. 121, 122
de novo peptide design
recently, two notable methodologies for de novo target-based peptide design were introduced: the vital method and an approach developed by bhattacherjee and wallin. the vital method combines the viterbi algorithm with autodock to design peptides for the binding sites of a target. 123 the "bhattacherjee and wallin" approach explores peptide sequence and conformational space around a protein target at the same time. 124 this approach was tested on three dissimilar peptide-protein domains to assess its ability. 13 a brief list of the existing computational resources employed in peptide design is presented in table 2.
in recent years, some computational methods have been proposed to design peptidomimetics. these methods can be classified based on their specificity in translating peptides into peptidomimetics. 28 to select the best method, awareness of the structure of peptide-protein complexes is important. 28, 96 herein, recently introduced methods for computer-aided design of peptidomimetics are presented. growmol is a combinatorial algorithm employed in peptidomimetic design. growmol searches a variety of probable ligands for the binding sites of a target protein 125 and produces molecules with chemical and steric complementarity to the 3d structure of the binding sites.
this method was used to generate peptidomimetic inhibitors of thermolysin, hiv protease and pepsin. using the x-ray crystal structures of pepstatin-pepsin complexes, growmol predicted therapeutic peptidomimetics against the aspartic proteases. the algorithm created some cyclic inhibitors bridging the side chains of cysteine residues in the p1 and p3 inhibitor subsites. the binding modes were checked using x-ray crystallography. 125, 126 ludi is another software tool based on the de novo methodology. 127 using natural and non-natural amino acids as building blocks, the software designed peptidomimetics against renin, thermolysin and elastase. 127 the conformational flexibility of each novel peptidomimetic was explored by sampling multiple conformers of each amino acid. 127 the peptide-driven pharmacophoric hypothesis is the most intuitive computational technique in peptidomimetic design. the method is especially useful when x-ray structures of protein-protein complexes exist. 28 the main idea is to translate the hot spot concept into the associated pharmacophoric feature concept. with a pharmacophore-based virtual screening process, this strategy can identify novel type 3 mimetics. 128 in fact, the side chain of each amino acid can be simply categorized according to conventional pharmacophoric characteristics, such as hydrogen bond donors and acceptors, aromatic rings and charged and hydrophobic centers. for example, in one report, pharmacophore model-directed synthesis of non-peptide analogs of a cationic antimicrobial peptide identified anti-staphylococcal activity. 129 to build a pharmacophore hypothesis, a model of rna iii-inhibiting peptide (rip), a well-known heptapeptide inhibitor of staphylococcal pathogenesis, was used. through virtual screening of 300,000 commercially available small molecules against the rip-based pharmacophore, hamamelitannin was discovered as a non-peptide mimetic of rip.
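the categorization of side chains by pharmacophoric characteristics described above can be sketched as a lookup table in python. the feature assignments below are a simplified assumption made for illustration (for example, histidine is treated as neutral and cysteine as hydrophobic); they are not taken from the cited studies, and the function names are hypothetical.

```python
from collections import Counter

# Simplified side-chain pharmacophore features (illustrative assumption).
SIDE_CHAIN_FEATURES = {
    "A": {"hydrophobic"},
    "C": {"hydrophobic"},
    "D": {"acceptor", "negative"},
    "E": {"acceptor", "negative"},
    "F": {"aromatic", "hydrophobic"},
    "G": set(),  # no side chain
    "H": {"acceptor", "aromatic", "donor"},
    "I": {"hydrophobic"},
    "K": {"donor", "positive"},
    "L": {"hydrophobic"},
    "M": {"hydrophobic"},
    "N": {"acceptor", "donor"},
    "P": {"hydrophobic"},
    "Q": {"acceptor", "donor"},
    "R": {"donor", "positive"},
    "S": {"acceptor", "donor"},
    "T": {"acceptor", "donor"},
    "V": {"hydrophobic"},
    "W": {"aromatic", "donor", "hydrophobic"},
    "Y": {"acceptor", "aromatic", "donor"},
}


def peptide_pharmacophore(peptide):
    """Return (position, residue, sorted feature list) for each residue."""
    return [
        (pos, res, sorted(SIDE_CHAIN_FEATURES[res]))
        for pos, res in enumerate(peptide.upper(), start=1)
    ]


def feature_counts(peptide):
    """Count how often each pharmacophoric feature occurs in the peptide."""
    counts = Counter()
    for _, _, features in peptide_pharmacophore(peptide):
        counts.update(features)
    return counts
```

a real pharmacophore hypothesis would also need the 3d positions of these features in the bound peptide; the table only covers the feature-typing step.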
hamamelitannin is a tannin derivative extracted from hamamelis virginiana. 28, 129 in another study, two rounds of in silico screening were performed to discover potential peptidomimetics able to mimic a cyclic peptide (cyclo[cpfvktqlc]) that is known to bind the αvβ3 integrin receptor. 130 at the end of the process, the most potent representatives were at least 2,000 times more potent than the original cyclopeptide (around 2 mm). 130 in a successful example, virtual screening was performed using multi-conformational forms of a large commercial library. a target-based pharmacophoric model mapped the cd4-binding site on hiv-1 gp120. the pharmacophore hypothesis was built on a homology model of the protein cavity. in a cell-based assay, two of the top scoring molecules were found to be micromolar inhibitors of hiv-1 replication. 131 pharmacophore-based screening was also used to find novel alzheimer's therapeutics as mimetics of neurotrophins. 132 the therapeutic use of neurotrophins may be restricted by several deficiencies, such as their poor central nervous system penetration, low stability and potential to enhance neuronal death through interaction with the p75ntr receptor. mimicking particular nerve growth factor domains could inhibit neuronal death. peptidomimetics of the loop 1 and loop 4 domains of nerve growth factor can prevent neuronal death induced by p75ntr-dependent and trk-related signaling. 132 in another study, a fully computational pharmacophore-based approach assessed fda-approved drugs as candidates to inhibit protein-protein interactions. 133 peptide structures were described in terms of pharmacophores and searched against the fda-approved drugs to detect similar molecules. the top ranking drug matches contained several nuclear receptor ligands and matched allosterically to the binding site on the target protein.
the top ranking drug matches were docked to the peptide-binding site. the majority of the top-ranking matches showed a negative free energy change upon binding that was comparable to that of the standard peptide. 133
geometry similarity method
geometry similarity methods establish a geometric similarity between non-peptide templates and peptide patches. in one study, the supermimic tool was developed to recognize peptide mimetics. 134 the program holds a complex library of peptidomimetics together with several protein structure libraries. moreover, supermimic includes d-peptides, synthetic components (reported as beta-turn or gamma-turn mimetics) and peptidomimetic ligands obtained from the pdb. 134 the search process allows scanning a library of small molecules that mimic the tertiary structure of a query peptide, followed by scanning a protein library for positions where a query small molecule can be fitted into the backbone. 28, 134
sequence-based method
recently, a method has been developed to rank peptide-compound matches that are limited to short linear motifs in proteins and compounds with amino acid substituents. 135 the algorithm maps the side chain-like substituents on every compound of a large chemical library. the complete molecule can then be represented by a short sequence, with each fragment in the molecule represented by a distinct letter abbreviation. 28 a cross-search between the pubchem database (about 5.4 million molecules) and a non-redundant collection of 11,488 peptides obtained from the pdb demonstrated that the algorithm can be useful for high-throughput measurements. 28 to recognize true positives, the method screened identified protein motifs against the national cancer institute developmental therapeutic program compound database. 135 in another study, the similarity of amino acid motifs to compounds (saamco) web server was developed to ease screening of identified motif structures against bioactive compound databases.
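once each compound has been reduced to a short letter string as described above, screening a motif against a library becomes a string search. the python sketch below illustrates this with exact matching and with a mismatch-tolerant variant. it is an assumption-laden illustration and not the algorithm of the cited method or web server: the compound names, their letter strings and the hamming-distance criterion are all hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_matches(motif, compound_string, max_mismatches=0):
    """Slide the motif along a compound's letter string and collect hits.

    Each hit is (start index, matched window, number of mismatches).
    """
    hits = []
    width = len(motif)
    for start in range(len(compound_string) - width + 1):
        window = compound_string[start:start + width]
        distance = hamming(motif, window)
        if distance <= max_mismatches:
            hits.append((start, window, distance))
    return hits


def screen(motif, library, max_mismatches=0):
    """Return only the compounds that contain at least one hit."""
    results = {}
    for name, compound_string in library.items():
        hits = find_matches(motif, compound_string, max_mismatches)
        if hits:
            results[name] = hits
    return results


# Hypothetical library: compound name -> side chain-like substituent letters.
library = {"cmpd_1": "AFVK", "cmpd_2": "GFLK", "cmpd_3": "SSGG"}

exact_hits = screen("FVK", library)
tolerant_hits = screen("FVK", library, max_mismatches=1)
```

allowing one mismatch retrieves `cmpd_2` in addition to `cmpd_1`, showing how a looser matching criterion enlarges the pool of candidate mimetics.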
136 the methodology was reported to be efficient because the compound databases were preprocessed to maximize the accessible data, and the necessary input data was minimal. 136 in similarity of amino acid motifs to compounds, motif matching can be full or partial, which may decrease or increase the number of potential mimetics, respectively. using a novel search algorithm, the web service can perform a fast screening of known or putative motifs against ready-made compound libraries. the classified results can be examined by linking to appropriate databases. 28, 136
fragment-based method
replacement with partial ligand alternatives through computational enrichment (replace) is a fragment-based approach. 137 using structures of peptide-bound proteins as design anchors, the program can computationally find non-peptide mimetics for specific determinants of known peptide ligands. 137
hybrid peptide-driven shape and pharmacophoric method
the development and application of strategies for pharmacophore modeling indicate that the medicinal chemistry community has broadly accepted the intuitive nature of the pharmacophore concept. in addition, shape complementarity has been identified as a significant element in molecular recognition between ligands and their targets. 28 in virtual screening efforts, using pharmacophore- and shape-based techniques separately may increase the rate of false-positive results. 128 therefore, incorporating both pharmacophore- and shape-matching techniques into one program can potentially reduce the rate of false positives. 128 recently, to discover novel peptidomimetics, a web-oriented virtual screening tool named pepmmsmimic 138 was developed to combine conventional pharmacophore matching with shape complementarity. a library of 17 million conformers was extracted from 3.9 million commercially available chemicals and gathered in the mmsinc database. the database was used as a skeleton to develop pepmmsmimic.
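the idea of requiring both pharmacophore and shape agreement can be sketched with a simple set-based similarity. the python example below uses the tanimoto coefficient on pharmacophore feature sets and on voxelized shapes; this is an illustrative assumption, not the scoring scheme of pepmmsmimic, which the text above does not specify. the dictionary layout, cutoffs and example molecules are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient |A & B| / |A | B| between two collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hybrid_score(query, candidate, weight=0.5):
    """Weighted mix of pharmacophore and shape similarity.

    Returns (combined score, pharmacophore similarity, shape similarity).
    """
    pharm = tanimoto(query["features"], candidate["features"])
    shape = tanimoto(query["voxels"], candidate["voxels"])
    return weight * pharm + (1.0 - weight) * shape, pharm, shape


def passes_both(query, candidate, pharm_cutoff=0.5, shape_cutoff=0.5):
    """Keep a candidate only if it satisfies both criteria at once."""
    _, pharm, shape = hybrid_score(query, candidate)
    return pharm >= pharm_cutoff and shape >= shape_cutoff


# Hypothetical query peptide and candidate: identical features, different shape.
query = {
    "features": {"donor", "aromatic", "positive"},
    "voxels": {(0, 0, 0), (1, 0, 0), (2, 0, 0)},
}
candidate = {
    "features": {"donor", "aromatic", "positive"},
    "voxels": {(0, 0, 0), (5, 5, 5)},
}

combined, pharm_sim, shape_sim = hybrid_score(query, candidate)
```

here the candidate matches the pharmacophore perfectly yet fails the shape test, so `passes_both` rejects a compound that a pharmacophore-only screen would have reported as a hit.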
139 in the pepmmsmimic interface, the 3d structure of a protein-bound peptide is used as input. then, chemical structures that mimic the pharmacophore and shape of the original peptide are proposed as candidates for involvement in protein-protein recognition. 139 a list of in silico methods used to design potential peptidomimetics, along with their strengths and weaknesses, is presented in table 3.
overall, the design and development of therapeutics are tedious, expensive and time-consuming procedures. therefore, modern approaches including computer-aided design methods can shorten the testing phase and reduce the cost and failure rate of therapeutics discovery. computational methods used to design amino acid-based therapeutics can increase the range of available biotherapeutics. benefiting from the dramatic advances in bioinformatics, computational tools can be used to find and develop therapeutic proteins, peptides and peptidomimetics. 140, 141 moreover, using computational tools decreases the cost of therapeutics development, from concept to market, by up to 50%. 140 however, computational protein design faces some challenges, such as our inadequate knowledge of folding and of the physical forces that stabilize protein structures. moreover, sequences and local structures have many degrees of freedom, which can complicate the sequence search. therefore, effective methods are required to find sequences compatible with a particular structure and to measure essential protein folding criteria. overall, in silico design of amino acid-based therapeutics involves many challenges that should be overcome to improve the overall performance of the design processes. for example, although structure determination of all disease-related proteins through crystallography and nmr is a laborious task, it is necessary to gather much structural information on peptide-protein interactions.
besides, the development of robust algorithms to calculate protein-protein binding energies is essential. estimating the binding constant between two macromolecules with an appropriate speed-accuracy tradeoff requires millisecond-scale molecular dynamics. moreover, understanding of both protein-protein and protein-peptidomimetic recognition processes at the molecular level can be improved using more accurate force fields, such as quantum mechanical polarizable force fields. in recent years, a growing number of monoclonal antibodies (therapeutic antibodies) have been approved by the fda for the treatment of various diseases. this important area of amino acid-based therapeutics has been covered in more depth elsewhere. 142, 143 for further explanation of the theory and details of antibody informatics for drug discovery, as well as computer-aided antibody design, readers are referred to previous reports. 142, 143 the authors report no conflicts of interest in this work.
what is the future of targeted therapy in rheumatology: biologics or small molecules protein therapeutics: a summary and pharmacological classification in silico methods for design of biological therapeutics constructing novel chimeric dna vaccine against salmonella enterica based on sopb and groel proteins: an in silico approach in silico phylogenetic analysis of vibrio cholera isolates based on three housekeeping genes designing of complex multi-epitope peptide vaccine based on omps of klebsiella pneumoniae: an in silico approach theoretical and computational protein design annu evaluation of in silico protein secondary structure prediction methods by employing statistical techniques inhibition of mycobacterial cyp125 enzyme by sesamin and β-sitosterol: an in silico and in vitro study theory of protein folding: the energy landscape perspective toward an outline of the topography of a realistic protein-folding funnel artificial diiron enzymes
with a de novo designed four-helix bundle structure computer-enabled peptide drug design: principles, methods, applications and future directions docking small peptides remains a great challenge: an assessment using autodock vina empirical estimation of local dielectric constants: toward atomistic design of collagen mimetic peptides recent work in the development and application of protein-peptide docking rational design of peptide drugs: avoiding aggregation computational peptidology: a new and promising approach to therapeutic peptide design strategies employed in the design and optimization of synthetic antimicrobial peptide amphiphiles with enhanced therapeutic potentials multifaceted roles of disulfide bonds. peptides as therapeutics peptidomimetics, a synthetic tool of drug discovery an in silico pipeline for the design of peptidomimetic proteinprotein interaction inhibitors (order no. 10188557) natural products as sources of new drugs over the last 25 years diversity-oriented synthesis of macrocyclic peptidomimetics structure-based design, synthesis, and biological evaluation of peptidomimetic sars-cov 3clpro inhibitors advances in amino acid mimetics and peptidomimetics a hierarchical approach to peptidomimetic design mimicking peptides… in silico peptidomimetic design rational design for peptide drugs peptidomimetics as a cutting edge tool for advanced healthcare an unusual functional group interaction and its potential to reproduce steric and electrostatic features of the transition states of peptidolysis low molecular weight, non-peptide fibrinogen receptor antagonists neurotrophin small molecule mimetics: candidate therapeutic agents for neurological disorders design of peptides, proteins, and peptidomimetics in chi space molecular technology. 
designing proteins and peptides molecular engineering: an approach to the development of general capabilities for molecular manipulation x-ray versus nmr structures as templates for computational protein design high-resolution protein design with backbone freedom prediction of protein-protein interface sequence diversity using flexible backbone computational protein design backbone flexibility in computational protein design de novo computational design of retro-aldol enzymes one fold with many functions: the evolutionary relationships between tim barrel families based on their sequences, structures and functions search and sampling in structural bioinformatics the dead-end elimination theorem and its use in protein side-chain positioning application of a self-consistent mean field theory to predict protein sidechains conformation and estimate their conformational entropy design of protein-interaction specificity gives selective bzip-binding peptides computational methods for protein design and protein sequence variability: biased monte carlo and replica exchange exploring the conformational space of protein side chains using dead-end elimination and the a* algorithm side-chain and backbone flexibility in protein core design dead-end elimination with a polarizable force field repacks pcna structures improved prediction of protein side-chain conformations with scwrl4 trading accuracy for speed: a quantitative comparison of search algorithms in protein sequence design using self-consistent fields to bias monte carlo methods with applications to designing and sampling protein sequences computational design and characterization of a monomeric helical dinuclear metalloprotein statistical theory of combinatorial libraries of folding proteins: energetic discrimination of a target structure achieving stability and conformational specificity in designed proteins via binary patterning photophysics of a fluorescent non-natural amino acid: p-cyanophenylalanine an expanded 
eukaryotic genetic code a "solvated rotamer" approach to modeling watermediated hydrogen bonds at protein-protein interfaces rotamer libraries in the 21st century improved side-chain prediction accuracy using an ab initio potential energy function and a very large rotamer library potential energy functions for protein design de novo design of foldable proteins with smooth folding funnel: automated negative design and experimental verification assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and bayesian scoring functions structure by design: from single proteins and their building blocks to nanostructures computational de novo design and characterization of a four-helix bundle protein that selectively binds a nonbiological cofactor using α-helical coiled coils to design nanostructured metalloporphyrin arrays kemp elimination catalysts by computational enzyme design de novo design of a βαβ motif high-resolution structural and thermodynamic analysis of extreme stabilization of human procarboxypeptidase by computational protein design design of a novel globular protein fold with atomic-level accuracy novel peptide-specific quantitative structure activity relationship (qsar) analysis applied to collagen iv peptides with antiangiogenic activity development of an informatics platform for therapeutic protein and peptide analytics two-level qsar network (2l-qsar) for peptide inhibitor design based on amino acid properties and sequence positions recent development of peptide drugs and advance on theory and methodology of peptide inhibitor design predicting the affinity of epitope-peptides with class i mhc molecule hla-a*0201: an application of amino acid-based peptide prediction a brief overview of antimicrobial peptides containing unnatural amino acids and ligand-based approaches for peptide ligands machine learning assisted design of highly active peptides for drug discovery pep-fold: an updated de novo 
structure prediction server for both linear and disulfide bonded cyclic peptides in silico predictions of 3d structures of linear and cyclic peptides with natural and nonproteinogenic residues long-timescale molecular dynamics simulations of protein structure and function how fastfolding proteins fold bond distances in polypeptide backbones depend on the local conformation identification of tetrapeptides from a mixture based positional scanning library that can restore nm full agonist function of the l106p, i69t, i102s, a219v, c271y, and c271r human melanocortin-4 polymorphic receptors (hmc4rs) the protein data bank protein design with fragment databases pepbind: a comprehensive database and computational tool for analysis of protein-peptide interactions the structural basis of peptide-protein binding strategies protein-peptide interactions adopt the same structural motifs as monomeric protein folds highly flexible protein-peptide docking using cabs-dock virtual screening for potential inhibitors of ctx-m-15 protein of klebsiella pneumoniae in silico panning for a non-competitive peptide inhibitor comparative evaluation of eight docking tools for docking and virtual screening accuracy in silico designing of peptide inhibitors against pregnane x receptor: the novel candidates to control drug metabolism computation of the binding of fully flexible peptides to proteins with flexible side chains a flexible docking procedure for the exploration of peptide binding selectivity to known structures and homology models of pdz domains dynadock: a new molecular dynamics-based algorithm for protein-peptide docking including receptor flexibility structure-based prediction of proteinpeptide specificity in rosetta pepcrawler: a fast rrt-based algorithm for high-resolution refinement and binding-affinity estimation of peptide inhibitors rosetta flexpepdock ab-initio: simultaneous folding, docking and refinement of peptides onto
their receptors sub-angstrom modeling of complexes between flexible peptides and globular proteins anchordock: blind and flexible anchordriven peptide docking a unified conformational selection and induced fit approach to proteinpeptide docking cabs-dock web server for the flexible docking of peptides to proteins without prior knowledge of the binding site galaxypepdock: a proteinpeptide docking tool based on interaction similarity and energy optimization predicting peptide structures in native proteins from physical simulations of fragments mechanism of fast peptide recognition by sh3 domains binding free energy landscape of domain peptide interactions molecular dynamics simulations of pro-apoptotic bh3 peptide helices in aqueous medium: relationship between helix stability and their binding affinities to the anti-apoptotic protein bcl-xl structural and dynamic determinants of protein-peptide recognition quantitative sequenceactivity model (qsam): applying qsar strategy to model and predict bioactivity and function of peptides, proteins and nucleic acids recent advances in qsar and their applications in predicting the activities of chemical molecules, peptides and proteins for drug design comprehensive comparison of eight statistical modelling methods used in quantitative structure retention relationship studies for liquid chromatographic retention times of peptides generated by protease digestion of the escherichia coli proteome prediction of mhc-peptide binding: a systematic and comprehensive overview domain mediated protein interaction prediction: from genome to network calculation of absolute protein-ligand binding free energy from computer simulations prediction of binding affinities between the human amphiphysin-1 sh3 domain and its peptide ligands using homology modeling, molecular dynamics and molecular field analysis characterization of domain-peptide interaction interface: a generic structure-based model to decipher the binding specificity of sh3 domains 
why oppa protein can bind sequence-independent peptides? a combination of qm/mm, pb/sa, and structure-based qsar analyses characterization of pdz domain-peptide interactions using an integrated protocol of qm/mm, pb/sa, and cfea analyses computational design of peptide ligands exploring protein-peptide binding specificity through computational peptide screening multiple highly diverse structures complementary to enzyme binding sites: results of extensive application of a de novo design method incorporating combinatorial growth transformation of peptides into non-peptides. synthesis of computer-generated enzyme inhibitors towards the automatic design of synthetically accessible protein ligands: peptides, amides and peptidomimetics structure-based pharmacophores for virtual screening antimicrobial activity of small β-peptidomimetics based on the pharmacophore model of short cationic antimicrobial peptides small molecule inhibitors of hantavirus infection a dynamic target-based pharmacophoric model mapping the cd4 binding site on hiv-1 gp120 to identify new inhibitors of gp120-cd4 protein-protein interactions alzheimer's therapeutics approved drug mimics of short peptide ligands from protein interaction motifs supermimic-fitting peptide mimetics into protein structures identification of potential small molecule peptidomimetics similar to motifs in proteins web server to identify similarity of amino acid motifs to compounds (saamco) replace: a strategy for iterative design of cyclin-binding groove inhibitors swimming into peptidomimetic chemical space using pepmmsmimic mmsinc: a large-scale chemoinformatics database computational drug discovery developability assessment as an early de-risking tool for biopharmaceutical development antibody informatics for drug discovery computer-aided antibody design tumorhope: a database of tumor homing peptides drug-permeability and transporter assays in caco-2 and mdck cell lines pepx: a
structural database of non-redundant protein-peptide complexes rosetta flexpepdock web server -high resolution modeling of peptide-protein interactions pdock: a new technique for rapid and accurate docking of peptide ligands to major histocompatibility complexes predicting peptide binding sites on protein surfaces by clustering chemical interactions protein-peptide complex prediction through fragment interaction patterns pep-sitefinder: a tool for the blind identification of peptide binding sites on protein surfaces vital: viterbi algorithm for de novo peptide design drug design, development and therapy 2018:12
key: cord-104282-90t1m430 authors: nan title: microsomal aldehyde dehydrogenase is localized to the endoplasmic reticulum via its carboxyl-terminal 35 amino acids date: 1994-09-02 journal: j cell biol doi: nan sha: doc_id: 104282 cord_uid: 90t1m430 rat microsomal aldehyde dehydrogenase (msaldh) has no amino-terminal signal sequence, but instead it has a characteristic hydrophobic domain at the carboxyl terminus (miyauchi, k., r. masaki, s. taketani, a. yamamoto, a. akayama, and y. tashiro. 1991. j. biol. chem. 266:19536-19542).
this membrane-bound enzyme is a useful model protein for studying posttranslational localization to its final destination. when expressed from cdna in cos-1 cells, wild-type msaldh is localized exclusively in the well-developed er. the removal of the hydrophobic domain results in the cytosolic localization of truncated proteins, thus suggesting that this portion is responsible for membrane anchoring. the last 35 amino acids of msaldh, including the hydrophobic domain, are sufficient for targeting of e. coli beta-galactosidase to the er membrane. further studies using chloramphenicol acetyltransferase fusion proteins suggest that two hydrophilic sequences on either side of the hydrophobic domain play an important role in er targeting.
a great deal of attention has been paid to resolving the mechanism for sorting and targeting of newly synthesized proteins to their final destinations, and this is one of the fundamental problems in cell biology. newly synthesized proteins are destined to follow two distinct routes, depending on the presence or absence of a signal sequence at their amino termini (37). the signal recognition particle (srp) in the cytosol recognizes and binds to a signal sequence of nascent peptides. the resulting srp-ribosome-nascent peptide complexes are targeted to the er membrane through their interactions with the docking protein complex (35, 36).
after translocation across the er membrane, secretory and plasma membrane proteins follow a common pathway from the er, through the golgi complex, to the cell surface by default with bulk flow of lipids (33). resident proteins in the central vacuolar system are localized and retained in their final destinations with the aid of either specific targeting (18) or retention signals (25-27, 44). on the other hand, proteins without a signal sequence at their amino termini are synthesized on free polysomes and are posttranslationally directed to intracellular organelles such as mitochondria, peroxisomes, the nucleus, and the er, or they remain in the cytosol. much is known regarding targeting signals (14, 16, 39) and cytosolic protein factors (1, 15) by which proteins are imported into these organelles. however, less is known regarding posttranslational targeting of integral membrane proteins to the outer mitochondrial membrane, the peroxisomal membrane, or the er membrane.
1. abbreviations used in this paper: cat, chloramphenicol acetyltransferase; fp2, nadph-cytochrome p-450 reductase; msaldh, microsomal aldehyde dehydrogenase; pbs(+), pbs containing 1 mm cacl2 and 0.5 mm mgcl2; pdi, protein disulfide isomerase; ptp, protein tyrosine phosphatase; ste, sucrose solution containing 10 mm tris-hcl, ph 7.4, 1 mm edta, 10 μg/ml leupeptin, 10 μg/ml pepstatin a, 0.5 mm pmsf, and 10 u/ml trasylol; srp, signal recognition particle.
instead of an amino-terminal signal sequence, cytochrome b5 has a hydrophobic domain at its carboxyl terminus (42). for a long time, cytochrome b5 has been regarded as a typical membrane protein inserted posttranslationally into microsomal membranes through the hydrophobic domain. in addition, it has been reported that cytochrome b5 is localized in any membranes of intracellular organelles, including the er membrane, the golgi membrane, the plasma membrane, and the outer mitochondrial membrane (47). 
experiments using srp-depleted or -supplemented in vitro systems have shown that this protein does not require srp for insertion into microsomal membranes (3). therefore, the carboxyl-terminal hydrophobic sequence was termed an "insertion" sequence through which spontaneous integration of cytochrome b5 into any exposed membranes could occur not only in vitro, but also in vivo (5). however, recent studies have shown that cytochrome b5 is restricted to the er in vivo (11), and that the last 10 amino acids of this protein adjacent to the hydrophobic domain are important for its targeting to the er (23). recently, we have isolated and sequenced a full-length cdna for rat microsomal aldehyde dehydrogenase (msaldh) (24). the deduced amino acid sequence of this enzyme predicts a molecular structure similar to that of cytochrome b5: a bulky amino-terminal domain without an amino-terminal signal sequence and a short hydrophobic domain at the carboxyl terminus. we report here that msaldh is localized exclusively in the er in cos-1 cells when expressed from cdna, and that expression of this protein apparently alters the structure of the er from a reticular to a large vesicular one. in addition, we report that the hydrophobic domain is responsible for membrane anchoring and that msaldh is likely to have two er targeting sequences on either side of the membrane anchoring domain. our results suggest a novel mechanism for the posttranslational er targeting of this tail-anchored protein.
(toyama, japan), respectively. since the β-subunit of prolyl 4-hydroxylase has been shown to be identical to protein disulfide isomerase (pdi) (34), this mab is referred to as pdi mab in this paper. rabbit antibodies to rat msaldh, nadph-cytochrome p-450 reductase (fp2), and pdi have been prepared and characterized as described (2, 21, 24). rabbit antiserum to bovine mitochondrial complex iii was a generous gift from dr. takamasa ozawa (nagoya university, nagoya, japan). 
the eukaryotic assay vectors, pch110 and psv2cat, were from pharmacia lkb biotechnology, inc. (uppsala, sweden) and stratagene (la jolla, ca), respectively. the eukaryotic expression vector pmiw (43) was kindly provided by dr. akihiro inoue (national institute for physiological sciences, okazaki, japan). restriction enzymes and dna modifying enzymes were purchased from nippon gene (toyama, japan) and takara co., ltd. (kyoto, japan). all other chemicals were of the highest purity commercially available. all constructions were verified by the dideoxy chain termination method (38). insertion of a full-length cdna encoding rat msaldh into the unique ecori site of the sv-40-based vector pcd (30) has been described previously (24). the gapped duplex method of oligonucleotide-directed mutagenesis (19)
a chimeric cdna for e. coli β-galactosidase fusion protein was constructed as follows. first, a dna fragment encoding the carboxyl-terminal 53 amino acids (amino acids 994-1046) of β-galactosidase, followed by five amino acids (amino acids 450-454) of msaldh, was created by pcr using oligonucleotides no. 6 (5' agccatcgccatctgctg 3') and no. 7 (5' ggaagaatttcgaccatrttw_tacaccagacc 3'). similarly, a second dna fragment corresponding to the last five amino acids (amino acids 1042-1046) of β-galactosidase followed by the last 35 amino acids (amino acids 450-484) and 3' untranslated sequence of msaldh was amplified using oligonucleotides no. 8 (5' c_~'ictc.~tg'ic_aaaaa'iugtcgaaat-tcticc 3') and no. 9 with mutated nucleotides underlined for introduction of an ecori site (5' aacaacttagaattcac~gttc 3'). these two fragments were then used as the templates to create a chimeric cdna with oligonucleotides nos. 6 and 9. the resultant pcr fragment was digested with ecori and ligated in-frame to the ecori site (amino acids 1029-1030) of pch110 to construct pch110/aldh. 
since it was difficult to construct a series of various fusion genes using β-galactosidase cdna according to the method described above, we used cat cdna for further analyses on the role of the carboxyl-terminal portion of msaldh in the intracellular localization. chimeric cdnas for cat fusion proteins (cat/aldh chimeras) were constructed essentially by a combination of oligonucleotide-directed mutagenesis and pcr. a 1.8-kbp hindiii/bamhi cdna fragment encoding cat in psv2cat was cloned into m13tv18. oligonucleotides no. 10 (5' ag~ggc~ggtacct-aattttttta 3') and no. 11 (5' ataagtgatatcaagcggatga 3') with mutated nucleotides underlined were used for generation of a kpni site at the carboxyl terminus and an ecorv site in the 3' untranslated sequence of cat, respectively. the mutated cdna was cloned into the hindiii/bamhi-digested pmiw expression vector to construct pmiwcat. pcr was then used to generate dna fragments corresponding to the carboxyl-terminal regions of msaldh, including the 3' untranslated sequence. pcr reactions used the following primers and templates to amplify dna fragments termed aldh1-7 and 9: aldh1, oligonucleotides no. 12 (5' gagtccaagggtacctggtcgaaattc 3') with mutated nucleotides underlined for introduction of a kpni site and no. 9 (pcdaldh as template); aldh2, oligonucleotides no. 12 and no. 9 (pcdaldhΔ481-484 as template); aldh3, oligonucleotides no. 13 (5' aaacagticaac-ggtaccagc~tgcagctg 3') and no. 9 (pcdaldh as template); aldh4, oligonucleotides no. 13 and no. 9 (pcdaldhΔ481-484 as template); aldh5, oligonucleotides no. 14 (5' gaaattcttcggtacc-aaacagttcaac 3') and no. 9 (pcdaldh as template); aldh6, oligonucleotides no. 14 and no. 9 (pcdaldhΔ481-484 as template); aldh7, oligonucleotides no. 12 and no. 9 (pcdaldhΔ457-459 as template); and aldh9, oligonucleotides no. 12 and no. 9 (pcdaldhΔ460-463 as template). these amplified fragments were then digested with kpni and hpai and ligated into the kpni-ecorv site of pmiwcat. 
the resultant plasmids were designated pmiwcat/aldh1-7 and pmiwcat/aldh9, respectively. for construction of pmiwcat/aldh8, pmiwcat/aldh2, which has the unique psti site in the coding sequence for the hydrophobic domain of msaldh, was digested with psti and bamhi, and the resulting fragment was ligated into psti/bamhi-digested pmiwcat/aldh7. similarly, the same fragment was ligated into psti/bamhi-digested pmiwcat/aldh9 to construct pmiwcat/aldh10.
sequences of the carboxyl termini of wild-type and truncated forms of msaldh. the single amino acid code is used, and the amino acid numbers of msaldh are shown at the top. positively and negatively charged amino acids are marked + and - at the bottom, respectively. the hydrophobic domains are underlined. the cdna for msaldh was converted to code for three truncated proteins, msaldhΔ450-484, msaldhΔ471-484, or msaldhΔ481-484, by oligonucleotide-directed mutagenesis.
cos-1 cells were maintained in dme with 10% fbs, 50 u/ml penicillin, and 50 μg/ml streptomycin at 37°c in a 5% co2 incubator, and were replated by trypsinization the day before transfection. transfection was performed 4 h after the medium was replaced with fresh medium. for subcellular fractionation experiments, cells plated in a 100-mm dish (50-70% confluent) were transfected with an expression plasmid (20 μg) using the calcium phosphate precipitation method (48). for immunofluorescence experiments, cells were grown on a 22 × 22-mm coverslip in a 35-mm dish (10-20% confluent) and were transfected with 4 μg plasmid dna per 35-mm dish. for immunogold localization of cat/aldh chimeras, cells plated in a 60-mm dish (50-70% confluent) were transfected with 10 μg plasmid dna per dish. 
4 h after application of dna-calcium phosphate precipitate at 37°c, cells were shocked with 15% glycerol for 2 min at room temperature, then incubated again at 37°c for an additional 44 h before harvesting for subcellular fractionation or fixation for indirect immunofluorescence microscopy or immunoelectron microscopy.
cell fractionation was performed essentially as described previously for cos-1 cells by clark and waterman (10) with slight modifications. briefly, cells were washed once with pbs and harvested in 5 ml of ice-cold 0.5 m sucrose containing 10 mm tris-hcl, ph 7.4, 1 mm edta, 10 μg/ml leupeptin, 10 μg/ml pepstatin a, 0.5 mm pmsf, and 10 u/ml trasylol (0.5 m ste). after centrifugation at 800 g for 5 min, the pellet was suspended in 0.5 ml of 0.5 m ste, homogenized with a teflon-glass homogenizer, then diluted with an equal volume of 10 mm tris-hcl, ph 7.4, 1 mm edta, 10 μg/ml leupeptin, 10 μg/ml pepstatin a, 0.5 mm pmsf, and 10 u/ml trasylol to obtain isotonic conditions. the total homogenate (designated h) was layered over 0.5 ml of 0.5 m ste and centrifuged at 800 g for 10 min at 4°c using a swing-out bucket, yielding a pellet (p1) consisting mainly of nuclei and unbroken cells. the supernatant and the interface were again layered over 0.5 m ste and centrifuged as above at 9,000 g for 10 min to isolate the mitochondrial fraction (p2). the resultant supernatant was centrifuged at 88,000 g for 80 min at 4°c to sediment the microsomal fraction (p3, mostly er membrane). the final supernatant, consisting mostly of cytosol, was designated s3. membrane fractions p1, p2, and p3 were resuspended by hand homogenization in 0.25 m ste. endogenous enzyme activities of fp2 (an er marker) and succinate-cytochrome c reductase (a mitochondrial marker) were assayed by the methods of omura and takesue (31) and king (17), respectively. protein was measured by bradford's method (7). 
membrane fractions (100 μg protein) were resuspended in 0.8 ml of 100 mm na2co3 (ph 11.5), and were incubated for 30 min at 0°c (13). the suspension was then centrifuged at 88,000 g for 80 min at 4°c, and the pellet (p) was suspended in 100 μl of sds-page sample buffer (20). the supernatant (s) was precipitated with 10% tca, washed twice with 90% ethanol and once with diethyl ether, and dried. the resultant pellet was suspended in 100 μl of sds-page sample buffer.
all procedures were done at room temperature. proteins were separated on 8.5% polyacrylamide gels (20) and electrophoretically transferred to a durapore membrane according to the method of burnette (8) using a semidry transfer blotter for 2.5 h at 36 v. after blocking with 3% skim milk in tbs for 1.5 h, blots were incubated with primary antibody in 3% skim milk/tbs for 1.5 h, and washed four times (5 min each) with 0.05% tween-20 in tbs. they were then incubated with secondary antibody (peroxidase-conjugated goat anti-rabbit igg or anti-mouse igg) in 0.05% tween-20/tbs for 1.5 h, followed by four washes (5 min each) with 0.05% tween-20/tbs. blots were stained using the enhanced chemiluminescence western blotting detection system (amersham corp., arlington heights, il). protein bands were quantitated using an enhanced laser densitometer (lkb instruments inc., bromma, sweden) to evaluate the relative level of proteins in each subcellular fraction. the densitometric scan value was then estimated by multiplying the relative level by the protein yield (milligrams of protein) in the corresponding subcellular fraction. in calculating the percent distribution of each immunodetectable protein, the densitometric scan value of the homogenate was defined as 100%.
all procedures, except for the incubation with antibodies or fitc-conjugated lectin at 37°c, were carried out at room temperature. 
cells grown on coverslips were washed gently three times (5 min each) with pbs, fixed with 4% paraformaldehyde in pbs containing 1 mm cacl2 and 0.5 mm mgcl2 [pbs(+)] for 20 min, and permeabilized with 0.1% triton x-100 in pbs(+) for 1 h. they were then rinsed twice with pbs(+), and incubated for 1 h with 2% fbs in pbs(+) to block nonspecific binding of primary antibodies, followed by a 45-min incubation with primary antibody in 2% fbs/pbs(+). after washing four times (5 min each) with pbs(+), cells were incubated with secondary antibody in 2% fbs/pbs(+) for 45 min. for localization of the golgi complex, cells were incubated with fitc-conjugated wheat germ agglutinin in 2% fbs/pbs(+) for 45 min. after washing with pbs(+), they were then mounted on glass slides with 90% glycerol in pbs containing 1 mg/ml paraphenylenediamine, examined, and photographed on a microscope (bh-2; olympus corp., tokyo, japan) with ektachrome 400 film (kodak, rochester, ny).
frozen ultramicrotomy was performed as described by tokuyasu (46). transfected cells were harvested by centrifugation at 1,000 g for 3 min, and the pellet was fixed with 4% paraformaldehyde and 0.1% glutaraldehyde dis
cell fractionation was performed as described in materials and methods. in calculating the percent distribution, each value of the homogenate was defined as 100%. the distribution of fp2 and succinate-cytochrome c reductase (scr) is determined from the specific activities measured. the specific activities of the homogenate fractions are 52.3 ± 11.0 nmol cytochrome c reduced/min per mg protein (mean ± sd, n = 4) and 26.9 ± 2.8 nmol cytochrome c reduced/min per mg protein for fp2 and succinate-cytochrome c reductase, respectively.
in the previous study (24), we cloned and sequenced the full-length cdna for rat msaldh. 
the nucleotide sequence predicts a polypeptide of 484 amino acids, and the most characteristic feature of this membrane-bound aldh is the carboxyl-terminal 35 amino acids, consisting of a stem region (amino acids 450-463) and a hydrophobic domain (amino acids 464-480) followed by a short hydrophilic tail region (amino acids 481-484) as shown in fig. 1. since rat cytosolic tumor-associated aldh, which is 65.5% identical to msaldh (24), lacks the carboxyl-terminal portion, we asked whether this portion played an important role in the intracellular localization of msaldh. for this purpose, we constructed three mutant proteins truncated at either amino acid 450 (msaldhΔ450-484, which corresponds to the tumor-associated aldh in size), 471 (msaldhΔ471-484, deletion of more than half of the hydrophobic domain together with the tail region), or 481 (msaldhΔ481-484, deletion of solely the tail region) by oligonucleotide-directed mutagenesis (fig. 1). these mutant proteins and wild-type msaldh were expressed transiently in cos-1 cells under the control of the sv-40 promoter in the pcd expression vector. this transient expression system in cos-1 cells was chosen as the most rapid method for evaluating the intracellular localization of expressed proteins. cells were allowed to express these proteins for 44 h and harvested, and the expressed proteins were analyzed by immunoblotting using anti-msaldh antibody. as shown in fig. 2 a, msaldh and three truncated proteins were expressed efficiently in the total homogenates. as expected, molecular masses of the truncated proteins were smaller than that of msaldh (54 kd). in addition, no crossreactive protein was detected in untransfected cos-1 cells (data not shown), indicating the absence of endogenous msaldh. these results allowed us to investigate the intracellular localization of msaldh by transfection experiments. 
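the region boundaries and truncation mutants described above can be summarized in a short python sketch. the residue numbers are taken from the text; the names `REGIONS`, `MUTANTS`, `retained_fraction`, and `describe` are illustrative assumptions introduced here and are not part of the original study.

```python
# carboxyl-terminal regions of rat msaldh (1-based, inclusive residue numbers from the text)
REGIONS = {
    "stem": (450, 463),
    "hydrophobic": (464, 480),
    "tail": (481, 484),
}

# truncation mutants: name -> first deleted residue (everything from here to 484 is removed)
MUTANTS = {"Δ450-484": 450, "Δ471-484": 471, "Δ481-484": 481}


def retained_fraction(region, last_residue):
    """Fraction of a region still present when the protein ends at last_residue."""
    start, end = REGIONS[region]
    kept = max(0, min(end, last_residue) - start + 1)
    return kept / (end - start + 1)


def describe(first_deleted):
    """Retained fraction of each region for a mutant truncated before first_deleted."""
    last_residue = first_deleted - 1
    return {name: retained_fraction(name, last_residue) for name in REGIONS}


for name, first_deleted in MUTANTS.items():
    print(name, describe(first_deleted))
```

under these boundaries, the Δ471-484 mutant keeps 7 of the 17 hydrophobic residues, which matches the description of it as deleting more than half of the hydrophobic domain.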
figure 3. effect of sodium carbonate treatment on the membrane association of wild-type msaldh and two truncated proteins. (a) each p3 fraction was treated with 100 mm na2co3 for 30 min at 0°c, and centrifuged at 88,000 g for 80 min to separate the pellets (p) from the supernatants (s). the distribution of msaldh, msaldhΔ481-484, msaldhΔ471-484, fp2, or pdi in the p and s fractions was assayed by immunoblotting. polyclonal anti-pdi antibody was used for immunodetection of endogenous pdi. (b) the relative level of each protein in the p and s fractions was quantitated by a scanning densitometer. in calculating portions recovered in the p fraction, the total level of each protein recovered in the p and s fractions was defined as 100%. the bars show the mean ± sd (n = 3).
we analyzed the intracellular distribution of msaldh and three truncated proteins by subcellular fractionation according to the method of clark and waterman (10). the separation of er membranes from mitochondria was checked by assaying two typical marker enzymes, fp2 (nadph-cytochrome p-450 reductase), which has been shown to be an integral er membrane protein (22), and a mitochondrial marker, succinate-cytochrome c reductase. as shown in table i, fp2 and succinate-cytochrome c reductase were enriched in the p3 fraction and the p2 fraction, respectively. in addition, when equal amounts of protein were immunoblotted, the highest levels of fp2 and mitochondrial rieske iron-sulfur protein were found in the p3 fraction and the p2 fraction, respectively (fig. 2 b). the percent distribution of the two proteins in each subcellular fraction also confirmed a good separation of er membranes from mitochondria (fig. 2 c). the subcellular distribution of msaldh and msaldhΔ481-484 was almost identical to that of fp2 (fig. 2, b and c), suggesting the er localization of these proteins in transfected cos-1 cells. on the other hand, msaldhΔ450-484 was found exclusively in the s3 fraction. 
this result indicated that the mutant protein lacking the last 35 amino acids of msaldh was no longer associated with intracellular membranes. curiously, msaldhΔ471-484 was recovered not only in the s3 fraction, but also in the p3 fraction, showing an intermediate distribution between msaldh and msaldhΔ450-484. we explored the nature of the association of the expressed proteins with microsomal membranes. upon treatment of the p3 fraction with 100 mm na2co3 (ph 11.5) (13), fp2 (the integral er membrane protein) remained attached to microsomal membranes as judged by immunoblotting (fig. 3, a and b). wild-type msaldh and msaldhΔ481-484 were also resistant to alkali extraction, although ~35% of msaldhΔ481-484 was released. under the same conditions, >80% of msaldhΔ471-484 and pdi (a luminal er resident) were released, indicating a loose association of these proteins with microsomal membranes. similar results were obtained upon triton x-114 cloud point extraction (6) (data not shown). these data, together with those from subcellular fractionation, suggested that the carboxyl-terminal portion of msaldh including the hydrophobic sequence (amino acids 464-480) was necessary for both its er localization and the tight association with the er membrane. we next determined the intracellular localization of msaldh and the truncated proteins using indirect immunofluorescence microscopy. cells transfected with cdnas in the pcd vector were fixed and permeabilized 44 h after transfection, and the expressed proteins were detected by incubation with anti-msaldh antibody followed by tritc-conjugated secondary antibody. as shown in fig. 4 a, anti-msaldh antibody stained a number of large vesicular structures that surrounded the nucleus, in addition to diffuse reticular ones, in cos-1 cells transfected with msaldh cdna. no staining was seen at the plasma membrane or in the nucleoplasm. 
to explore this strange structure in more detail, we used double indirect immunofluorescence microscopy with two mabs or fitc-conjugated lectin. expressed msaldh was found to colocalize with endogenous pdi (fig. 4 b, arrow), although pdi displayed a characteristic reticular staining pattern in cells not expressing msaldh (fig. 4 b, arrowhead). in contrast, msaldh colocalized neither with a mitochondrial 65-kd protein (fig. 4, c and d) nor with the perinuclear golgi complex visualized by fitc-conjugated wheat germ agglutinin (fig. 4, e and f). these results strongly suggest the er localization of msaldh and the drastic morphological change of the er by expression of msaldh. a similar er staining pattern was observed for msaldhΔ481-484 (fig. 4, g and h), whereas both msaldhΔ471-484 and msaldhΔ450-484 were found distributed diffusely throughout the cytoplasm, and they did not colocalize with pdi (fig. 4, i-l). in addition, a significant amount of msaldhΔ450-484 appeared to enter the nucleus (fig. 4 k). these immunolocalization data were consistent with those from subcellular fractionation, suggesting strongly the important role of the carboxyl-terminal portion of msaldh in its er localization.
we focused our attention on the role that the carboxyl-terminal portion of msaldh might play in er targeting, and we asked whether this portion could direct a heterologous protein to the er membrane. for this purpose, we constructed an expression plasmid shown in fig. 5 a. the control vector (pch110) contains e. coli β-galactosidase, which was supposed to remain in the cytoplasm when expressed in cos-1 cells under the control of the sv-40 promoter. the constructed expression plasmid (pch110/aldh) contains the last 35 amino acids of msaldh cloned in-frame to the 3' end of β-galactosidase. cos-1 cells were transfected with these dnas and subjected to a crude subcellular fractionation. in this case, the postnuclear supernatant was centrifuged at 88,000 g for 80 min to separate the membrane fraction (p, containing mostly mitochondria and microsomes) from the cytosol fraction (s). immunoblotting using mab to e. coli β-galactosidase revealed that wild-type β-galactosidase was recovered, as expected, in the s fraction, whereas the β-galactosidase/aldh chimera was found concentrated in the p fraction (fig. 5 b). in addition, the chimera in the p fraction was resistant to alkali extraction, indicating a tight anchoring of this protein to intracellular membranes (fig. 5 b). indirect immunofluorescence microscopy showed that β-galactosidase was distributed diffusely throughout the cytoplasm (fig. 6 a). however, attachment of the carboxyl-terminal 35 amino acids of msaldh to β-galactosidase resulted in the reticular pattern of staining that surrounded the nucleus and extended throughout the cytoplasm (fig. 6 b). double indirect immunofluorescence microscopy using polyclonal anti-pdi antibody showed the colocalization of the β-galactosidase/aldh chimera and pdi in the er (fig. 6 c). these results suggested that the carboxyl-terminal 35 amino acids were sufficient for targeting of e. coli β-galactosidase to the er membrane.
the chimeric cdna fragment with ecori sites shown by the bar was amplified by pcr and cloned in-frame to the unique ecori site in pch110 as described in materials and methods. to the right, the intracellular location of each protein is noted as e for er or c for cytosolic. (b) cos-1 cells were transfected with pch110 (βgal) or pch110/aldh (βgal-aldh), and harvested 44 h after transfection. the membrane (p) and cytosol (s) fractions were prepared by centrifugation of the postnuclear fractions at 88,000 g for 80 min. the membrane fraction containing βgal-aldh was treated with 100 mm na2co3 at 0°c for 30 min, and was centrifuged at 88,000 g for 80 min to separate the pellet (p) from the supernatant (s). each fraction was assayed by immunoblotting using mab to β-galactosidase. 
we attempted to define the sequence requirement for er targeting within the last 35 amino acids of msaldh, which are composed of three regions as shown in fig. 1, and we constructed a series of cat/aldh chimeras (cat/aldh1-4) (fig. 7). cat/aldh1 contains the last 35 amino acids of msaldh at the carboxyl terminus of cat, while cat/aldh2-4 chimeras lack either the tail, stem, or both regions of msaldh. cos-1 cells were transfected with these cdnas in the pmiw expression vector, which possesses the β-actin promoter and rous sarcoma enhancer. the intracellular localization of the expressed proteins was determined by indirect immunofluorescence microscopy using anti-cat antibody. wild-type cat was distributed, as expected, throughout the cytoplasm. in addition, it was also detected in the nucleoplasm (fig. 8 a). cat/aldh1 showed the characteristic er staining pattern (fig. 8 b), which was confirmed by double indirect immunofluorescence microscopy using pdi mab (fig. 8 c). this result was similar to that obtained with β-galactosidase/aldh (fig. 6, b and c). both cat/aldh2 and cat/aldh3 were also found to colocalize with endogenous pdi in the er (fig. 8, d-g). on the other hand, cat/aldh4, containing only the hydrophobic domain of msaldh at its carboxyl terminus, showed a pattern of staining similar to that of wild-type cat. this chimera remained not only in the cytoplasm, but in the nucleoplasm as well (fig. 8 h), and it did not colocalize with pdi (fig. 8 i). the simplest interpretation of these results is that both the stem and tail regions contain er-targeting information of msaldh. to define more narrowly the sequence requirement within the stem region for er targeting, we constructed an additional series of cat/aldh chimeras (cat/aldh5-10) (fig. 9), in which three to seven amino acids of the stem region of either cat/aldh1 or cat/aldh2 are deleted. 
the intracellular localization of these chimeras expressed in cos-1 cells was analyzed by indirect immunofluorescence microscopy. three chimeras, cat/aldh5, cat/aldh7, and cat/aldh9, were expected to be directed to the er by virtue of the tail region, even if they lacked an er targeting sequence within the stem region, and indeed they showed the reticular staining pattern including the nuclear membrane (fig. 10, a, c, and e). among the mutant proteins lacking the tail region, cat/aldh6 and cat/aldh10 showed the reticular pattern (fig. 10, b and f). these reticular compartments were confirmed to be the er by colocalization with pdi (data not shown). in contrast, cat/aldh8, which lacks both the lys-gln-phe sequence (amino acids 457-459) within the stem region and the tail region, was found distributed diffusely throughout the cytoplasm and the nucleoplasm (fig. 10 d). we investigated in more detail the intracellular localization of cat and cat/aldh chimeras in transfected cos-1 cells using an igg-gold immunoelectron microscopic technique on frozen ultrathin sections. gold particles were distributed in the cytoplasm and in the nucleoplasm in cos-1 cells transfected with pmiwcat (fig. 11 a). a similar distribution of gold particles was observed in cos-1 cells expressing either cat/aldh4 (data not shown) or cat/aldh8 (fig. 11 d). in marked contrast, gold particles were predominantly detected on the cytoplasmic surface of the er membrane in cos-1 cells expressing cat/aldh1 (fig. 11 b), cat/aldh7 (fig. 11 c), or the other cat/aldh chimeras (data not shown). these immunoelectron microscopic data are consistent with the indirect immunofluorescence data, and these immunolocalization data together strongly suggest that the er-targeting sequences of msaldh exist within the lys-gln-phe sequence in the stem region and within the lys-asp-gln-leu sequence of the tail region. 
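the reasoning behind this conclusion can be checked with a minimal python sketch: if either the lys-gln-phe sequence or the lys-asp-gln-leu tail is enough for er targeting, a chimera should reach the er whenever it retains at least one of them. the `Chimera` class, its field names, and `predicted_location` are hypothetical names introduced for illustration. the presence or absence of each sequence is read from the construct descriptions in materials and methods, and the observed locations are those reported above. cat/aldh5 and cat/aldh6 are left out because the exact residues they delete are not spelled out in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chimera:
    name: str
    has_kqf: bool   # lys-gln-phe (amino acids 457-459) present in the stem region
    has_tail: bool  # lys-asp-gln-leu tail (amino acids 481-484) present
    observed: str   # "er" or "cytosol" (cytoplasm and nucleoplasm), as reported


def predicted_location(chimera):
    """Working hypothesis: either targeting sequence alone is sufficient for the er."""
    return "er" if (chimera.has_kqf or chimera.has_tail) else "cytosol"


# every chimera listed here carries the complete hydrophobic domain (amino acids 464-480)
CHIMERAS = [
    Chimera("cat/aldh1", True, True, "er"),          # full carboxyl-terminal 35 amino acids
    Chimera("cat/aldh2", True, False, "er"),         # lacks the tail
    Chimera("cat/aldh3", False, True, "er"),         # lacks the stem
    Chimera("cat/aldh4", False, False, "cytosol"),   # hydrophobic domain only
    Chimera("cat/aldh7", False, True, "er"),         # Δ457-459, tail present
    Chimera("cat/aldh8", False, False, "cytosol"),   # Δ457-459, tail absent
    Chimera("cat/aldh9", True, True, "er"),          # Δ460-463, tail present
    Chimera("cat/aldh10", True, False, "er"),        # Δ460-463, tail absent
]

for chimera in CHIMERAS:
    agrees = predicted_location(chimera) == chimera.observed
    print(chimera.name, predicted_location(chimera), chimera.observed, agrees)
```

the sketch only encodes a logical rule and says nothing about mechanism; it shows that the simple "either sequence suffices" hypothesis is consistent with the reported localization of these eight constructs.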
we have shown here by subcellular fractionation and indirect immunofluorescence experiments that not only msaldh, but also msaldhΔ481-484, colocalizes with two er marker proteins, fp2 and pdi, when expressed from cdnas in cos-1 cells. surprisingly, indirect immunofluorescence microscopy revealed the apparent alteration of the er structure in cells expressing either msaldh or msaldhΔ481-484. the altered er structure was characterized by large vesicular structures mainly located near the nucleus. since the unusual staining pattern was observed upon expression of msaldh or msaldhΔ481-484, it appeared that the er anchoring of these proteins led to the morphological alteration of the er. we are now investigating the altered er structure in more detail by immunoelectron microscopy, and we have recently found that the er tubules in cos-1 cells expressing msaldh are packed in crystalloid hexagonal arrays (yamamoto, a., r. masaki, and y. tashiro, manuscript in preparation). the elucidation of the altered er structure would provide interesting information on the morphological change of the intracellular organelle induced by overexpression of a resident membrane protein. we showed previously that msaldh has no amino-terminal signal sequence, but instead has a longer carboxyl-terminal portion than cytosolic tumor-associated aldh (24). our truncation and domain replacement experiments have shown here that the last 35 amino acids of msaldh are necessary for its er localization and also sufficient for localization of two heterologous proteins, e. coli β-galactosidase and cat, to the er. we therefore conclude that the carboxyl-terminal portion contains an er-targeting sequence of msaldh. further analyses on the intracellular distribution of various cat/aldh chimeras have defined the er targeting sequence of msaldh. 
both cat/aldh4 and cat/aldh8 were distributed in the cytoplasm and nucleoplasm similarly to wild-type cat in spite of possessing the whole hydrophobic domain (amino acids 464-480) of msaldh at their carboxyl termini, whereas the other chimeras were predominantly localized to the er. the two chimeras lack both the lys-gln-phe sequence within the stem region and the lys-asp-gln-leu sequence of the tail region, thus suggesting that both sequences represent the er-targeting sequences of msaldh. we are now in the process of determining through mutagenesis which amino acids within these two regions are required for er targeting. on the other hand, it seems likely that the hydrophobic sequence of msaldh functions mainly as the er membrane anchor because this portion is essential for membrane anchoring. to the best of our knowledge, our data characterize the first intracellular protein that appears to have two targeting sequences separated by a membrane anchor. the ambiguous behavior of msaldhΔ471-484 could be explained as follows. a portion of this truncated protein could be targeted to the er by virtue of the lys-gln-phe sequence within the stem region, but because of the shortened hydrophobic domain, the portion would be either released into the cytosol or maintain only a loose association with the membrane. the significance of the presence of two separated targeting sequences is not yet clear. a small portion of msaldhΔ481-484 was, in fact, recovered in the s3 fraction, whereas only insignificant amounts of msaldh or fp2 (fig. 2, b and c), which is efficiently targeted to the er through the amino-terminal signal sequence, were recovered in the s3 fraction. therefore, it is likely that the presence of two targeting sequences in the same protein might be a means to ensure the more efficient er localization of msaldh. 
figure 12 (caption; opening words missing in the source): ... (24). the single amino acid code is used. positively and negatively charged amino acids are marked + and - at the top, respectively. each hydrophobic domain is underlined and the er-targeting sequences of either cytochrome b5 (23) or msaldh are indicated by broken lines. (b) two possible models for er targeting of msaldh. in model i, the aggregation of the newly synthesized msaldh polypeptides in the cytoplasm is prevented by the hydrophilic er-targeting sequences adjacent to the hydrophobic anchoring sequence. the monomeric msaldh would be inserted into the er membrane with the aid of a receptor protein on the er membrane. in model ii, the complex of msaldh and a cytosolic receptor protein would be targeted to the er, and msaldh would be inserted correctly into the er membrane spontaneously or through the interaction with a receptor protein on the membrane.
in addition, the hydrophilic tail region might also serve to increase the strength of the membrane anchoring, as indicated by the partial membrane extraction of msaldhΔ481-484 with na2co3 (fig. 3). although it is not yet clear whether the hydrophobic segment fully transverses the membrane, this transmembrane disposition seems more likely to explain the tight anchoring of the protein in the er membrane than a loop model where the hydrophilic tail remains on the cytoplasmic surface. the lys-asp-gln-leu sequence of the tail region strongly resembles the well-characterized lys-asp-glu-leu retention signal found in the er luminal proteins (25), suggesting that the lys-asp-gln-leu sequence, if translocated into the lumen of the er, serves as an er retention signal of msaldh. however, andres et al.
have shown that a proneuropeptide y mutant bearing the lys-asp-gln-leu sequence at the carboxyl terminus is processed and secreted like wild-type proneuropeptide y when expressed in att-20 cells, despite the intracellular retention of the unprocessed proneuropeptide y with the lys-asp-glu-leu sequence (4), thus suggesting the important role of the acidic glu residue in er retention. we are now studying the membrane topology of msaldh and the intracellular localization of both msaldh and msaldhΔ481-484 in more detail by immunoelectron microscopy to check the potential role of the tail region for er retention.
novel er-targeting sequences have recently been reported for two other tail-anchored proteins. frangioni et al. have found that the er-targeting sequence of protein tyrosine phosphatase 1b (ptp 1b) exists within the carboxyl-terminal 35 amino acids (12). mitoma and ito have shown that the last 10 amino acids of cytochrome b5, which include the hydrophilic tail of this protein, are important for its targeting to the er (23). fig. 12 a shows an alignment of the last 35 amino acids of ptp 1b (9), cytochrome b5 (32), and msaldh (24), where each hydrophobic domain is underlined and the er-targeting sequences of cytochrome b5 and msaldh are indicated by the broken line. although the sequence required for er targeting within the last 35 amino acids of ptp 1b has not been defined, ptp 1b and cytochrome b5 have the common leu-x-tyr-arg motif (fig. 12 a) within their last 10 amino acids. in addition, ptp 1b and cytochrome b5 have tails of similar length (seven and eight amino acid residues, respectively), which are longer than that (four amino acid residues) of msaldh. it therefore seems likely that ptp 1b possesses an er-targeting sequence similar to that of cytochrome b5. we have not found in the carboxyl-terminal portion of msaldh a motif similar to that in those two proteins, despite the fact that the er-targeting sequences of cytochrome b5 and msaldh are both hydrophilic and adjacent to the membrane anchors. in addition, the presence in msaldh of two er-targeting sequences separated by the membrane-anchoring domain suggests that this protein, with respect to the er-targeting sequence, belongs to a different class than the other two.
in addition to the three proteins described above, several other er membrane proteins are known that also have carboxyl-terminal anchors, such as heme oxygenase (40) and ribosome-binding protein p34 (28), but er-targeting sequences of these proteins have not yet been identified. cytochrome b5, heme oxygenase, and msaldh are synthesized on free polysomes (29, 41, 45) and posttranslationally targeted to the er. indeed, the srp-independent integration of cytochrome b5 into microsomal membranes has been demonstrated (3). our present study, together with these findings, suggests a novel pathway by which a class of tail-anchored proteins with a novel targeting sequence is directed to the er.
how might the er-targeting sequence function? fig. 12 b shows two possible models to explain the er targeting of msaldh. one possibility (i) is that the hydrophilic targeting sequences serve to prevent the aggregation of the newly synthesized msaldh polypeptides through their hydrophobic anchoring sequences, which makes it easier for them to find a specific receptor on the er membrane. another possibility (ii) is that a cytosolic receptor binds to the targeting sequences of msaldh to form a complex that effects the correct insertion of the carboxyl-terminal anchor into the er membrane spontaneously or through their interactions with a receptor on the membrane. the elucidation of the mechanisms by which er localization of msaldh occurs should provide valuable clues for an understanding of the posttranslational targeting and insertion of a class of membrane proteins with a carboxyl-terminal anchor.
we thank kimie masaki for valuable technical assistance, dr. shigeru taketani for helpful comments on the manuscript, dr. akihiro inoue, national institutes for physiological science, okazaki, japan, for the generous gift of the pmiw vector. this work was supported in part by a grant-in-aid for scientific research from the ministry of education, science, and culture of japan, and by a grant from naito foundation.
received for publication 12 january 1994 and in revised form 1 june 1994.
1. nuclear protein import in permeabilized mammalian cells requires soluble cytoplasmic factors
2. distribution of protein disulfide isomerase in rat hepatocytes
3. mechanisms of integration of de novo-synthesized polypeptides into membranes: signal-recognition particle is required for integration into microsomal membranes of calcium atpase and of lens mp26 but not of cytochrome b5
4. variants of the carboxyl-terminal kdel sequence direct intracellular retention
5. intracellular protein topogenesis
6. phase separation of integral membrane proteins in triton x-114 solution
7. a rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding
8. "western blotting": electrophoretic transfer of proteins from sodium dodecyl sulfate-polyacrylamide gels to unmodified nitrocellulose and radiographic detection with antibody and radioiodinated protein a
9. cloning of a cdna for a major human protein-tyrosine-phosphatase
10. the hydrophobic amino-terminal sequence of bovine 17α-hydroxylase is required for the expression of a functional hemoprotein in cos-1 cells
11. the specific subcellular localization of two isoforms of cytochrome b5 suggests novel targeting pathways
12. the nontransmembrane tyrosine phosphatase ptp 1b localizes to the endoplasmic reticulum via its 35 amino acid c-terminal sequence
13. isolation of intracellular membranes by means of sodium carbonate treatment: application to endoplasmic reticulum
14. a conserved tripeptide sorts proteins to peroxisomes
15. a mitochondrial import factor purified from rat liver cytosol is an atp-dependent conformational modulator for precursor proteins
16. sequence requirements for nuclear location of simian virus 40 large t antigen
17. preparations of succinate-cytochrome c reductase and the cytochrome b-c1 particle, and reconstitution of succinate-cytochrome c reductase
18. the biogenesis of lysosomes
19. oligonucleotide-directed construction of mutations via gapped duplex dna
20. cleavage of structural proteins during the assembly of the head of bacteriophage t4
21. cytochrome p-450 and nadph-cytochrome p-450 reductase are degraded in the autolysosomes in rat liver
22. immunoelectron microscope localization of cytochrome p-450 reductase on microsomes and other membrane structures of rat hepatocytes
23. the carboxy-terminal 10 amino acid residues of cytochrome b5 are necessary for its targeting to the endoplasmic reticulum
24. molecular cloning, sequencing, and expression of cdna for rat liver microsomal aldehyde dehydrogenase
25. a c-terminal signal prevents secretion of luminal er proteins
26. sequences within and adjacent to the transmembrane segment of α-2,6-sialyltransferase specify golgi retention
27. short cytoplasmic sequences serve as retention signals for transmembrane proteins in the endoplasmic reticulum
28. ribosome-binding protein p34 is a member of the leucine-rich-repeat-protein superfamily
29. studies on the biosynthesis of microsomal proteins: site of synthesis and mode of insertion of cytochrome b5, cytochrome b5 reductase, cytochrome p-450 reductase and epoxide hydrolase
30. a cdna cloning vector that permits expression of cdna inserts in mammalian cells
31. a new method for simultaneous purification of cytochrome b5 and nadph-cytochrome c reductase from rat liver microsomes
32. chemical structure of rat cytochrome b5: isolation of peptides by high-pressure liquid chromatography
33. biosynthetic protein transport and sorting by the endoplasmic reticulum and golgi
34. molecular cloning of the β-subunit of human prolyl 4-hydroxylase. this subunit and protein disulfide isomerase are products of the same gene
35. protein translocation across the er requires a functional gtp binding site in the α subunit of the signal recognition particle receptor
36. transport of proteins across the endoplasmic reticulum membrane
37. mechanisms for the incorporation of proteins in membranes and organelles
38. dna sequencing with chain-terminating inhibitors
39. signals guiding proteins to their correct locations in mitochondria
40. cloning and expression of cdna for rat heme oxygenase
41. mechanism of increase of heme oxygenase by hemin in cultured pig alveolar macrophages
42. the binding of cytochrome b5 to liver microsomes
43. a mouse embryonic stem cell line showing pluripotency of differentiation in early embryos and ubiquitous β-galactosidase expression
44. a golgi retention signal in a membrane-spanning domain of coronavirus e1 protein
45. biogenesis of microsomal aldehyde dehydrogenase in rat liver
46. application of cryoultramicrotomy to immunocytochemistry
47. protein translocation across membranes
48. biochemical transfer of single-copy eucaryotic genes using total cellular dna as donor
key: cord-193489-u6ewlh16 authors: wang, rui; hozumi, yuta; yin, changchuan; wei, guo-wei title: decoding sars-cov-2 transmission, evolution and ramification on covid-19 diagnosis, vaccine, and medicine date: 2020-04-29 journal: nan doi: nan sha: doc_id: 193489 cord_uid: u6ewlh16 tremendous effort has been given to the development of diagnostic tests, preventive vaccines, and therapeutic medicines for coronavirus disease 2019 (covid-19) caused by severe acute respiratory syndrome coronavirus 2 (sars-cov-2). much of this development has been based on the reference genome collected on january 5, 2020. based on the genotyping of 6156 genome samples collected up to april 24, 2020, we report that sars-cov-2 has had an alarming 4459 mutations, which can be clustered into five subtypes. we introduce mutation ratio and mutation $h$-index to characterize the protein conservativeness and unveil that sars-cov-2 envelope protein, main protease, and endoribonuclease protein are relatively conservative, while sars-cov-2 nucleocapsid protein, spike protein, and papain-like protease are relatively non-conservative. in particular, the nucleocapsid protein has more than half its genes changed in the past few months, signaling devastating impacts on the ongoing development of covid-19 diagnosis, vaccines, and drugs. the ongoing pandemic of coronavirus disease 2019 (covid-19) caused by severe acute respiratory syndrome coronavirus 2 (sars-cov-2) has posed serious threats to public health and the world economy since it was detected in wuhan, china in december 2019 [1]. as of april 24, 2020, more than 2.6 million cases of covid-19 have been reported in 185 countries and territories, resulting in more than 184,000 deaths [2]. tragically, there is no sign of slowing down nor relief at this moment, partially because there are no specific anti-sars-cov-2 drugs or effective vaccines. sars-cov-2 is a positive-strand rna virus and belongs to the beta coronavirus genus.
the genomic information underpins the development of antiviral medical interventions, preventative vaccines, and viral diagnostic tests. the first sars-cov-2 genome was reported on january 5, 2020 [3]. it has a genome size of 29.9 kb, which encodes multiple non-structural and structural proteins. the leader sequence and orf1ab encode non-structural proteins for rna replication and transcription. among various non-structural proteins, viral papain-like (pl) proteinase, main protease (or 3cl protease), rna polymerase, and endoribonuclease are common targets in antiviral drug discovery. however, it typically takes more than ten years to bring an average drug to the market. the downstream regions of the genome encode structural proteins including spike (s) protein, envelope (e) protein, membrane (m) protein, and nucleocapsid (n) protein. notably, the s protein uses one of its two subunits to bind directly to the host receptor angiotensin-converting enzyme 2 (ace2), enabling virus entry into host cells [4]. the nucleocapsid (n) protein, one of the most abundant viral proteins, can bind to the rna genome and is involved in replication processes, assembly, and the host cellular response during viral infection [5]. the e protein is a small integral membrane protein and a virulence factor that regulates cell stress response and apoptosis and promotes inflammation [6]. the structural proteins, especially the spike protein and the n protein, are the candidate antigens for vaccine development. safe and effective vaccines are urgently needed to prevent the spread of sars-cov-2. however, it typically takes over one year to design and test a new vaccine. furthermore, the sars-cov-2 genome undergoes rapid mutations, partially stimulated as a response to the challenging immunological environments arising from covid-19 patients of different races, ages, and medical conditions.
sars-cov-2 exists as heterogeneous and dynamic populations because of its error-prone replication [7]. a vaccine developed at one time may not be effective for mitigating infection by newly mutated virus isolates. an alarming fact is that many of these mutations may devastate the ongoing effort in the development of effective medicines, preventive vaccines, and diagnostic tests. accurate identification of the antigens and their mutations represents the most important roadblock in developing effective vaccines against covid-19. for example, different vaccines may be needed for different geographic locations due to the predominant mutations in the corresponding regions. in covid-19 diagnosis, the diagnosis kits are designed using two major methods, i.e., specific serological tests and molecular tests. serological tests detect specific covid-19 proteins. molecular diagnoses test specific covid-19 pathogenic genes and usually rely on the polymerase chain reaction (pcr). because of the fast mutations of the sars-cov-2 genome, genotyping analysis of sars-cov-2 may optimize the pcr primer design to detect sars-cov-2 reliably and reduce the risk of false negatives caused by genome sequence variations. in addition, the genotyping analysis may also reveal those regions that are highly conserved with very few mutations, which can be selected as target sequences for reliable drug therapy and general diagnosis. the evolution pattern arising from the highly frequent mutations of sars-cov-2 can be observed on short time scales. in the early infection period (i.e., february 2020), the sars-cov-2 variants were clustered as s and l types [8]. recent genotyping analysis reveals a large number of mutations in various essential genes encoding the s protein, the n protein, and rna polymerase in the sars-cov-2 population [9]. monitoring the evolutionary patterns and spread dynamics of sars-cov-2 is of great importance for covid-19 control and prevention.
although mutations occur randomly, most preserved mutations can be regarded as the virus's response to host immune system surveillance. as a result, the faster and wider sars-cov-2 spreads, the more frequent and diverse the mutations will be. the tracking and analysis of covid-19 dynamics, transmission, and spread are of paramount importance for winning the ongoing battle against covid-19. genetic identification and characterization of the geographic distribution, intercontinental evolution, and global trends of sars-cov-2 is the most efficient approach for studying covid-19 genomic epidemiology and offers the molecular foundation for region-specific sars-cov-2 vaccine design, drug discovery, and diagnostic development [10]. for example, different vaccines for the shell can be designed according to predominant mutations. this work provides the most comprehensive genotyping to date to reveal the transmission trajectory and spread dynamics of covid-19. based on genotyping 6156 sars-cov-2 genomes from around the world as of april 24, 2020, we trace the covid-19 transmission pathways and analyze the distribution of the subtypes of sars-cov-2 across the world. we use k-means methods to cluster sars-cov-2 mutations, which provides updated molecular information for the region-specific design of vaccines, drugs, and diagnoses. our clustering results show that globally there are at least five distinct subtypes of sars-cov-2 genomes, while in the u.s. there are at least three significant sars-cov-2 genotypes. we introduce the mutation h-index and mutation ratio to characterize conservative and non-conservative proteins and genes. we unveil unexpectedly non-conservative genes and proteins, which sounds an alarming warning for the current development of diagnostic tests, preventive vaccines, and therapeutic medicines. tracking the sars-cov-2 transmission pathways and analyzing the spread dynamics are critical to the study of genomic epidemiology.
temporospatially clustering the genotypes of sars-cov-2 in transmission provides insights into diagnostic testing and vaccine development in disease control. in this work, we retrieve and genotype 6156 sars-cov-2 isolates from around the world as of april 24, 2020. there are 4459 single mutations in the 6156 sars-cov-2 isolates. based on these mutations, we classify and track the geographical distributions of the 6156 genotyped isolates by k-means clustering. the sars-cov-2 genotypes, represented as snp variants, are clustered into five groups in the world (table 2). the genotypes in the u.s. are clustered into three groups (table 2). the optimal number of clusters is established using the elbow method in the k-means clustering algorithm (see supporting material). the detailed distribution of the snp variants from the world for each cluster is provided in the supporting material. the snp variant clusters from the 11 countries that have the highest number of cases recorded are listed in table 2 (see also figure 1). the geographic distribution of the snp variant clusters reflects the approximate transmission pathways and spread dynamics across the world. several findings can be read from table 2: 1. two early subtypes of sars-cov-2 (clusters i and ii) are epidemic in the asian countries (cn, jp, kr). 2. the subtypes of sars-cov-2 in cluster iii are not spreading in the european countries (uk, de, fr, it). 3. the subtypes of sars-cov-2 in all five clusters can be found in the us, au, and canada. moreover, we analyze the statistics of snp variants located in different states of the united states. in table 3, we list the number of cases in three different clusters with respect to the west coast states. we note that cluster c in the u.s. is derived from cluster iii in the world, with an additional mutation at the leader sequence 241. the high spread in new york is consistent with the high transmission of sars-cov-2 in the european countries, where the subtype in cluster iii predominates.
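the clustering step described above can be illustrated with a minimal python sketch. it assumes that each isolate is encoded as a binary vector over snp positions (1 if the mutation is present); that encoding, the toy data, the function names, and the farthest-point initialisation are assumptions made here for reproducibility and are not taken from the paper. the elbow method then amounts to computing the within-cluster sum of squares (wcss) for increasing k and looking for the point where the decrease levels off.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Vector], k: int, iters: int = 50) -> Tuple[List[int], float]:
    """Plain Lloyd's algorithm; returns (labels, within-cluster sum of squares)."""
    # Deterministic farthest-point initialisation (an assumption of this sketch,
    # chosen only so that the example is reproducible).
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))

    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels

    wcss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, wcss


# Hypothetical toy genotypes: rows are isolates, columns are SNP positions.
toy_genotypes = [
    [1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0], [0, 0, 0, 1, 1, 0], [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1], [0, 0, 1, 0, 0, 1],
]

# Elbow scan: print the WCSS for k = 1..5 and look for the bend.
for k in range(1, 6):
    _, wcss = kmeans(toy_genotypes, k)
    print(k, round(wcss, 3))
```

on real data one would normally use an established implementation with several random restarts; the pure-python version is kept only to make the two steps (assignment and center update) and the elbow scan explicit.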
table 5 presents the statistics of single mutations on various sars-cov-2 proteins that occurred in the recorded genomes between january 5, 2020, and april 24, 2020. the papain-like protease has the highest number of mutations, 599, while the envelope protein has the lowest number of mutations, 13. since the sizes of proteins vary dramatically, from 1945 residues for the papain-like protease to 75 for the envelope protein, it is useful to consider the mutation ratio, i.e., the number of mutations per residue. in this category, the envelope protein still has the lowest score of 0.17, whereas the nucleocapsid protein has the highest score of 0.56, i.e., 235 mutations on its 419 residues. note that 3cl protease has the second-lowest mutation ratio of 0.22, indicating its conservative nature. another relatively conservative protein judged by the mutation ratio is the rna-dependent rna polymerase. it has 223 mutations over its 932 residues. counting the number of single mutations and the mutation ratio does not reflect the fact that some mutations occur numerous times over genome samples while other mutations may happen only on a few genome samples. to account for the frequency effect of mutations, we introduce a mutation h-index to measure both the number of mutations and the frequency of mutations of a given protein or genetic section. it is defined as the maximum value of h such that the given protein or genetic section has h single mutations that have each occurred at least h times. it is very interesting to note from table 5 that the mutation h-index correlates very well with the number of mutations per residue. specifically, the nucleocapsid protein has both the highest number of mutations per residue, 0.56, and the highest h-index, 27, suggesting that it is the most non-conservative protein in sars-cov-2 genomes. in contrast, the envelope protein has the lowest number of mutations per residue, 0.17, and the lowest h-index, 5, indicating its relatively conservative nature.
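the two measures defined above are simple enough to state as code. the following python sketch is an illustration only (the function names are hypothetical): the mutation ratio is the number of distinct single mutations divided by the number of residues, and the mutation h-index is the largest h such that at least h mutations each occur at least h times. the counts used for the ratio are those quoted in the text, and the frequency list is the "total frequency" column of the main protease table (table 7); because that table lists the most frequent mutations, the remaining ones cannot raise the index.

```python
from typing import Iterable


def mutation_ratio(num_mutations: int, num_residues: int) -> float:
    """Number of distinct single mutations per residue."""
    return num_mutations / num_residues


def mutation_h_index(frequencies: Iterable[int]) -> int:
    """Largest h such that at least h mutations each occurred at least h times."""
    ranked = sorted(frequencies, reverse=True)
    h = 0
    for rank, freq in enumerate(ranked, start=1):
        if freq >= rank:
            h = rank
        else:
            break
    return h


# (mutations, residues) as quoted in the text.
print(round(mutation_ratio(235, 419), 2))   # nucleocapsid protein
print(round(mutation_ratio(13, 75), 2))     # envelope protein
print(round(mutation_ratio(68, 306), 2))    # main protease

# Total frequencies of the top 11 main protease mutations (table 7).
main_protease_freqs = [52, 51, 44, 19, 15, 11, 11, 9, 9, 8, 8]
print(mutation_h_index(main_protease_freqs))
```

with these inputs the sketch gives ratios of 0.56, 0.17, and 0.22 and an h-index of 9 for the main protease, in agreement with the values reported in the text.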
by combining the number of mutations per residue and the mutation h-index, we report that the three most conservative sars-cov-2 proteins are 1) the envelope protein, 2) the main protease, and 3) the endoribonuclease. it is found that the most non-conservative sars-cov-2 proteins are 1) the nucleocapsid protein, 2) the spike protein, and 3) the papain-like protease. real-time rt-pcr (rrt-pcr) is routinely used in the qualitative detection of nucleic acid from sars-cov-2 for covid-19 diagnostic testing [11, 12]. the primers used in rrt-pcr are critical for the precise diagnosis of covid-19 and the discovery of new strains. the primer sequences are specially designed to amplify the conserved regions across the different existing strains for high specificity and sensitivity, and are also subject to genotype changes as sars-cov-2 evolves. in covid-19 diagnostic testing, many rrt-pcr primers are designed to detect three perceived conservative sars-cov-2 regions: (1) the rna-dependent rna polymerase (rdrp) gene in the orf1ab region, (2) the e protein gene, and (3) the n protein gene [11]. our genotyping statistics given in table 5 indicate that the nucleocapsid protein is the worst choice. among the four structural proteins of sars-cov-2 (the spike surface glycoprotein (s) of 1273 amino acid residues, nucleocapsid protein (n) of 419 amino acid residues, membrane protein (m) of 222 amino acid residues, and envelope protein (e) of 75 amino acid residues), the s protein is the most divergent, with 385 unique mutations among the 6156 sars-cov-2 genomes. the n protein has 235 unique mutations, and the e protein has 13 mutations. considering the lengths of the proteins, all four structural proteins undergo high mutation. the rdrp gene, which is often used in covid-19 diagnostic testing, also has 223 mutations. therefore, all three regions targeted in routine rrt-pcr, namely the rdrp, n protein, and e protein genes, have significant mutations.
precise and robust diagnosis tools must be re-established according to the conserved regions and predominant mutations in the sars-cov-2 genomes detailed in the supporting material. notably, sars-cov-2 has a unique furin cleavage site, where four amino acid residues (prra) are inserted into the s1-s2 junction region 681-684 of the s protein [13]. the furin cleavage site is crucial for zoonotic transmission of sars-cov-2 [14]. this study reveals crucial mutations near the s1-s2 junction region in the s protein, including 23403a>g. moreover, these mutations of the sars-cov-2 s protein are located at the epitope region (corresponding to the regions 469-882 and 599-620 in sars-cov) [15]. additionally, many mutated amino acids are on the surface of the s protein as shown in fig. 5. unfortunately, the s protein is the second most non-conservative protein in the genome based on the number of mutations per residue and the mutation h-index. in fact, about half of the receptor-binding domain residues of the s protein have had mutations in the past few months as shown in fig. 6. because the surface accessibility of an epitope is also important for the interaction of antibody and antigen, these mutations are critical for the antigenicity of the s protein. convalescent covid-19 patients show a neutralizing antibody response after infection, which is directed against the s protein or the n protein [16]. the neutralizing antibody responses against sars-cov-2 could give some defense against sars-cov-2 infection and thus have implications for preventing sars-cov-2 outbreaks. the divergence of the spike protein, particularly in its non-conserved regions, might contribute to the antigenicity. the high-frequency mutations identified in the s protein and the n protein must be considered when designing a vaccine. unfortunately, there is no specific effective drug for sars-cov-2 at this point. much of the drug discovery effort focuses on sars-cov-2 non-structural proteins.
among the major non-structural proteins of sars-cov-2, the main protease of 306 amino acids has 68 mutations, with 0.22 mutations per residue and a mutation h-index of 9; rna polymerase of 932 amino acids has 223 mutations, with 0.24 mutations per residue and a mutation h-index of 13; and papain-like protease of 1945 amino acids has 599 mutations, with 0.31 mutations per residue and a mutation h-index of 15. in fact, the main protease is the most popular drug target because there are no similar known genes in the human genome, which implies that sars-cov-2 main protease inhibitors are likely to be less toxic [17]. the present study suggests that the main protease is the second most conservative protein. therefore, it remains the most attractive target for drug discovery. the sars-cov-2 spike glycoprotein, or s protein, is comprised of two subunits, s1 and s2, of very different properties [13]; see fig. 5. among them, the s1 subunit, as shown in fig. 5, contains the receptor-binding domain (rbd) responsible for binding to the host cell receptor angiotensin-converting enzyme 2 (ace2). the rbd is also the common binding domain for antibodies. the s2 subunit offers the structural support of the s protein and mediates fusion between the viral and host cell membranes. after the fusion, the virus releases the viral genome into the host cell. the s1 rbd plays key parts in the induction of neutralizing-antibody and t-cell responses, as well as protective immunity. however, s2 and the extracellular domain (ecd) of the spike protein and their combination are commonly used in recombinant proteins in sars-cov-2 antibody development. as shown in table 5, the s protein is the most heterogeneous structural protein, with a significant number of mutations as shown in figs. 5 and 6 and table 6. the divergence of the spike protein, particularly in its non-conserved regions, might contribute to the antigenicity difference among sars-cov-2 isolates.
we found that most of the high-frequency mutations of the s protein are located in the s1 subunit. figure 6 indicates that nearly half of the amino acid residues have had mutations since january 5, 2020. one of the important mutations at s1 is 23010 (v483a) within the rbd for ace2 binding. the structural study revealed that the amino acids 442-487 in the s1 subunit may impact viral binding to human ace2 [18, 19]. the mutations identified in this study imply a change in ace2 binding affinity and the transmissibility of sars-cov-2 as well as negative impacts on preventive vaccine and diagnostic test development.
table 7: top 11 mutations of the sars-cov-2 main protease.
rank   position  mutation  total frequency  cluster i  cluster ii  cluster iii  cluster iv  cluster v
top1   10323     k90r      52               1          8           43           0           0
top2   10097     g15s      51               0          1           0            50          0
top3   10851     a266v     44               0          2           16           0           26
top4   10582     d176d     19               0          0           5            0           14
top5   10771     y239y     15               15         0           0            0           0
top6   10507     n151n     11               0          10          1            0           0
top7   10948     r298r     11               0          0           0            11          0
top8   10265     g71s      9                0          0           0            9           0
top9   10870     l272l     9                0          0           1            4           4
top10  10319     l89f      8                0          1           4            0           3
top11  10450     p132p     8
main protease
sars-cov-2 main protease, or 3cl protease, is essential for cleaving the polyproteins that are translated from the viral rna [17]. it operates at multiple cleavage sites on the large polyprotein through the proteolytic processing of replicase polyproteins and plays a pivotal role in viral gene expression and replication. sars-cov-2 main protease is one of the most attractive targets for anti-cov drug design because its inhibition would block viral replication and its inhibitors are unlikely to be toxic, since no similar human proteases are known. another reason for the focused drug discovery efforts in developing sars-cov-2 main protease inhibitors is that this protein is relatively conservative as shown in table 5. figure 7 illustrates the main protease mutation patterns. figure 8 further highlights the inhibitor binding domain (bd). indeed, the main protease is relatively conservative compared to the spike protein. table 7 lists the top 11 mutations and their frequency in our dataset.
it is interesting to see that many mutations, such as y239y, n151n, r298r, l272l, and p132p, are degenerate ones. one possible explanation is that nondegenerate mutations are non-silent and are likely to cause unsurvivable disruption to the virus. note that mutation g15s mostly occurs in cluster iv. mutation y239y is restricted to cluster i. some other mutations, such as r298r, g71s, and p132p, are specific to certain clusters. nonetheless, some mutations at the bd shown in fig. 8 are worth noting. they can undermine the ongoing drug discovery effort.
table 8: top ten mutations of the sars-cov-2 papain-like protease.
rank   position  mutation  total frequency  cluster i  cluster ii  cluster iii  cluster iv  cluster v
top1   3037      f106f     3889             0          80          1800         964         1045
top2   2891      a58t      120              0          119         0            1           0
top3   3177      p153l     72               0          69          2            1           0
top4   4540      y607y     60               0          60          0            0           0
top5   7011      a1431v    45               0          43          2            0           0
top6   6312      t1198k    44               0          42          1            1           0
top7   7438      y1573y    34               0          9           21           4           0
top8   3373      d218d     29               0          3           0            26          0
top9   4002      t428i     26               0          1           0            25          0
top10  6040      f1107f    26               0          10          12           0           4
figure 9: illustration of sars-cov-2 papain-like protease mutations using 6w9c as a template [20].
papain-like protease
sars-cov-2 papain-like protease (plpro) is a cysteine cleavage protein located within the non-structural protein 3 (ns3) section of the viral genome [20]. like the main protease, plpro activity is required to cleave the viral polyprotein into functional, mature subunits and thereby contributes to the biogenesis of the virus replication. additionally, plpro possesses a deubiquitinating activity. the sars plpro is also a major therapeutic and diagnostic target. as shown in table 5, the sars plpro is prone to mutations. figure 9 shows that mutations are spread all over plpro. table 8 lists the top ten mutations in plpro. five of these mutations are degenerate ones, including the most frequent mutation. note that none of the top mutations occurred in cluster i. in contrast, cluster ii has many different mutations.
[21] . as one of the non-structural proteins, rdrp is located in the early part of the orf1b section.
like most other rna viruses, sars-cov-2 rdrps are considered to be highly conserved to maintain viral functions and are thus targeted in antiviral drug development as well as in diagnostic tests. on the other hand, the sars-cov-2 rna polymerase lacks proofreading capability, so mutations are bound to happen, as shown in table 5. figure 10 illustrates the sars-cov-2 rdrp mutations since january 5, 2020. surprisingly, there are many mutations in sars-cov-2 rdrp. table 9 describes the top ten mutations. as in other cases, five of these mutations are degenerate ones. cluster i has no nondegenerate mutations.

endoribonuclease (nendou) protein is a nidoviral rna uridylate-specific enzyme that cleaves rna [22]. it contains a c-terminal catalytic domain belonging to the endou family, which is involved in rna processing. the nendou protein is present among coronaviruses, arteriviruses, and toroviruses. many aspects of the detailed function and activity of the sars-cov-2 nendou protein are yet to be revealed. figure 11 depicts sars-cov-2 nendou protein mutations. as in most other sars-cov-2 proteins, mutations have occurred in different parts of the protein. table 5 shows that nendou is relatively conservative. table 10 lists the top twelve high-frequency mutations of the sars-cov-2 nendou protein that occurred in the past few months. three of these mutations are degenerate ones. the frequencies of these mutations range from 38 down to 6. note that cluster i does not have any of these mutations.

rank   position  mutation  total frequency  cluster i  cluster ii  cluster iii  cluster iv  cluster v
top1   26319     v25v      8                0          7           1            0           0
top2   26340     a32a      7                0          5           2            0           0
top3   26326     l28l      5                0          5           0            0           0
top4   26256     f4f       4                0          0           2            2           0
top5   26301     l19l      3                0          2           1            0           0
top6   26433     k63k      2                0          0           1            0           1
top7   26370     y42y      1                0          1           0            0           0
top8   26392     s50g      1                0          1           0            0           0
top9   26313     f23f      1

the sars-cov-2 envelope (e) protein is one of sars-cov's four structural proteins.
as a transmembrane protein, it is involved in ion channel activity and thus facilitates viral assembly, budding, envelope formation, pathogenesis, and release of the virus [23]. the e protein may not be essential for viral replication, but it is essential for pathogenesis. figure 12 illustrates the e protein as a very small pentamer with a few mutations. table 11 shows its top thirteen mutations. note that the first 7 mutations are degenerate ones. all other mutations have very low frequencies. as shown in table 5, the sars-cov-2 e protein is very conservative.

rank   position  mutation  total frequency  cluster i  cluster ii  cluster iii  cluster iv  cluster v
top1   28881     r203k     989              1          20          4            964         0
top2   28882     r203r     983              0          18          1            964         0
top3   28883     g203r     983              0          18          1            964         0
top4   28657     d128d     125              1          124         0            0           0
top5   28311     p13l      102              0          101         1            0           0
top6   28688     l139l     91               0          90          1            0           0
top7   29045     p258t     67               0          65          2            0           0
top8   29046     p258r     67               0          65          2            0           0
top9   29047     p258p     67               0          65          2            0           0
top10  29049     r259l     67               0          65          2            0           0
top11  29050     r259r     67               0          65          2            0           0
top12  29051     q260e     67               0          65          2            0           0
top13  29052     q259r     67               0          65          2            0           0
top14  29053     q260h     67               0

its primary function is to encapsidate the viral genome. to do so, it is heavily phosphorylated (or charged) and can thereby bind to rna. additionally, the sars-cov-2 n protein tethers the viral genome to the replicase-transcriptase complex (rtc) and plays a crucial role in viral genome encapsidation. therefore, it may function completely differently at different stages of the viral life cycle. the sars-cov-2 n protein is considered in the literature to be one of the most conservative sars-cov-2 proteins and is a popular target for diagnosis and vaccine development [11]. the present work, as shown in table 5, indicates that the sars-cov-2 n protein is the worst target for any drug, vaccine, or diagnostic development. table 12 presents the top fourteen mutations of the sars-cov-2 n protein since january 5, 2020.
note that only three of the fourteen top mutations are degenerate ones, which is a significantly lower ratio than that of other proteins. the frequency of the 14th mutation is 67, which suggests that there are many mutations associated with this medium-sized protein. most top mutations occurred in clusters ii and iv. cluster v has none of the top fourteen mutations.

membrane protein
sars-cov-2 membrane (m) protein is another structural protein and plays a central role in viral assembly and viral particle formation. it exists as a dimer in the virion and has certain geometric shapes that enable a certain membrane curvature and binding to nucleocapsid proteins. similar to other sars-cov proteins, the m protein is also a popular target for viral diagnosis and vaccines. table 5 gives the sars-cov-2 m protein a middle ranking for its conservation. table 13 details the top eleven mutations in the sars-cov-2 m protein that occurred in the past few months. seven of these mutations are degenerate. clusters i and v have relatively few of these mutations.

on january 5, 2020, the complete genome sequence of sars-cov-2 was first released on genbank (accession number: nc_045512.2) by zhang's group at fudan university [3]. since then, there has been a rapid accumulation of sars-cov-2 genome sequences. in this work, 6156 complete genome sequences with high coverage of sars-cov-2 strains from infected individuals around the world were downloaded from the gisaid database [25] (https://www.gisaid.org/) as of april 24, 2020. records in gisaid without an exact submission date were not taken into consideration. to rearrange the 6156 complete genome sequences according to the reference sars-cov-2 genome, multiple sequence alignment (msa) was carried out using clustal omega [26] with default parameters. snp genotyping measures the genetic variations between different members of a species.
establishing the snp genotyping method for the investigation of the genotype changes during the transmission and evolution of sars-cov-2 is of great importance. by analyzing the rearranged genome sequences, snp profiles, which record all of the snp positions in terms of the nucleotide changes and their corresponding positions, can be constructed. the snp profile of a given genome of a covid-19 patient captures all the differences from a complete reference genome sequence and can be considered as the genotype of the individual sars-cov-2. the jaccard distance measures dissimilarity between sample sets. the jaccard distance of snp variants is widely employed in the phylogenetic analysis of human or bacterial genomes [9]. in this work, we utilize the jaccard distance to compare the difference between the snp variant profiles of sars-cov-2 genomes. the jaccard similarity coefficient, also known as the jaccard index, is defined as the size of the intersection divided by the size of the union of two sets $A$ and $B$ [27]:

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}.$$

the jaccard distance of two sets $A$ and $B$ is scored as the difference between one and the jaccard similarity coefficient and is a metric on the collection of all finite sets:

$$d_J(A, B) = 1 - J(A, B) = \frac{|A \cup B| - |A \cap B|}{|A \cup B|}.$$

therefore, the genetic distance of two genomes corresponds to the jaccard distance of their snp variants. in principle, the jaccard distance measure of snp variants takes account of the ordering of snp positions, i.e., the transmission trajectory, when an appropriate reference sample is selected. however, one may fail to identify the infection pathways from the mutual jaccard distances of multiple samples. in this case, the dates of the sample collections offer useful information. additionally, clustering techniques, such as the k-means method described below, enable us to characterize the spread of covid-19 in communities.
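as an illustration of the jaccard distance between snp profiles described above, the following is a minimal sketch rather than the authors' implementation. it assumes that each snp profile is stored as a python set of (position, nucleotide change) pairs; the function names and the three toy profiles are hypothetical and are not taken from the dataset. the distance between two empty profiles is set to 0 by convention, which is also an assumption.

```python
def jaccard_distance(a, b):
    """Jaccard distance 1 - |a & b| / |a | b| between two SNP profiles (sets)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumed convention: two empty profiles are identical (distance 0).
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(profiles):
    """Pairwise Jaccard distances between all profiles, as a list of rows."""
    n = len(profiles)
    return [[jaccard_distance(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical toy profiles: each element is (genome position, nucleotide change).
profiles = [
    {(241, "C>T"), (3037, "C>T"), (23403, "A>G")},
    {(241, "C>T"), (3037, "C>T")},
    {(8782, "C>T"), (28144, "T>C")},
]

D = distance_matrix(profiles)
for row in D:
    print(["%.3f" % d for d in row])
```

the first two toy profiles share two of three distinct snps, so their distance is 1 - 2/3; the third profile shares nothing with the others and lies at distance 1 from both.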
k-means clustering is one of the fundamental unsupervised algorithms in machine learning. it aims to partition a given data set $X = \{x_1, x_2, \cdots, x_n, \cdots, x_N\}$, $x_n \in \mathbb{R}^d$, into $K$ clusters $\{C_1, C_2, \cdots, C_K\}$, $K \le N$, such that a specific clustering criterion is optimized. more specifically, the standard k-means clustering algorithm starts by picking $K$ points at random as cluster centers and then assigns each data point to its nearest cluster. the cluster centers are updated iteratively by minimizing the within-cluster sum of squares (wcss), which is defined by:

$$\mathrm{WCSS} = \sum_{k=1}^{K} \sum_{x_i \in C_k} \| x_i - \mu_k \|_2^2,$$

where $\mu_k = \frac{1}{N_k} \sum_{x_i \in C_k} x_i$ is the mean of the points located in the k-th cluster $C_k$ and $N_k$ is the number of points in $C_k$. here, $\| \cdot \|_2$ denotes the $L_2$ distance. the algorithm above only provides a way to obtain the optimal partition for a fixed number of clusters. however, we are interested in finding the best number of clusters for the snp variants. therefore, the elbow method is applied. by varying the number of clusters $K$, a set of wcss values can be calculated in the k-means clustering process, and the wcss can then be plotted against the number of clusters $K$. the location of the elbow in this plot is taken as the optimal number of clusters. note that the wcss measures the variability of the points within each cluster and is therefore influenced by the number of points $N$: as the total number of points $N$ increases, the value of the wcss becomes larger. additionally, the performance of k-means clustering depends on the choice of the specific distance. in this work, we propose to implement k-means clustering with the elbow method for analyzing the optimal number of subtypes of sars-cov-2 snp variants. the jaccard distance-based and location-based representations are considered as the input features for the k-means clustering method. suppose we have a total of n snp variants concerning a reference genome in a sars-cov-2 sample.
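the k-means procedure and the elbow method described above can be sketched as follows. this is a minimal pure-python illustration under stated assumptions, not the authors' code. two choices are assumptions made here: (1) the random choice of initial centers is replaced by a deterministic farthest-first initialization so that the example is reproducible, and (2) the elbow is located as the number of clusters with the largest second difference of the wcss curve, whereas the text only states that the elbow is read from the wcss plot. the toy data are three artificial 2-d groups, not snp features, and all function names are hypothetical.

```python
def squared_l2(p, q):
    """Squared L2 distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(coords) / len(points) for coords in zip(*points)]


def farthest_first_centers(points, k):
    # Assumption: deterministic stand-in for the random initialization.
    centers = [list(points[0])]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(squared_l2(p, c) for c in centers))
        centers.append(list(nxt))
    return centers


def kmeans(points, k, max_iter=100):
    """Lloyd's iterations; returns (labels, centers, wcss)."""
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: squared_l2(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = centroid(members)
    wcss = sum(squared_l2(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, wcss


def elbow(wcss_by_k):
    """Assumed heuristic: k with the largest second difference of the WCSS."""
    ks = sorted(wcss_by_k)
    best_k, best_bend = None, float("-inf")
    for prev, cur, nxt in zip(ks, ks[1:], ks[2:]):
        bend = wcss_by_k[prev] - 2 * wcss_by_k[cur] + wcss_by_k[nxt]
        if bend > best_bend:
            best_k, best_bend = cur, bend
    return best_k


# Toy data: three well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (20, 0), (20, 1), (21, 0), (21, 1)]

wcss_by_k = {k: kmeans(points, k)[2] for k in range(1, 6)}
best_k = elbow(wcss_by_k)
print(wcss_by_k)
print("elbow at k =", best_k)
```

on this toy data the wcss drops sharply up to three clusters and flattens afterwards, so the heuristic selects three clusters. for real snp data, each point would be a row of the jaccard distance-based or location-based representation, and the elbow would normally be checked against the wcss plot.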
the location of the mutation sites for each snp variant will be saved in the set $S_i$, $i = 1, 2, \cdots, N$. the jaccard distance between two different sets (or samples) $S_i$, $S_j$ is denoted as $d_J(S_i, S_j)$. therefore, the $N \times N$ jaccard distance-based representation will be:

$$\begin{pmatrix} d_J(S_1, S_1) & d_J(S_1, S_2) & \cdots & d_J(S_1, S_N) \\ d_J(S_2, S_1) & d_J(S_2, S_2) & \cdots & d_J(S_2, S_N) \\ \vdots & \vdots & \ddots & \vdots \\ d_J(S_N, S_1) & d_J(S_N, S_2) & \cdots & d_J(S_N, S_N) \end{pmatrix}.$$

suppose we have $N$ snp variants with respect to a reference genome in a sars-cov-2 sample. among them, $M$ different mutation sites can be counted. for the i-th snp variant, $V_i = [v_i^1, v_i^2, \cdots, v_i^M]$, $i = 1, 2, \cdots, N$, is a $1 \times M$ vector which satisfies:

$$v_i^j = \begin{cases} 1, & \text{if the } j\text{-th mutation site is mutated in the } i\text{-th snp variant}, \\ 0, & \text{otherwise}. \end{cases}$$

therefore, an $N \times M$ location-based representation will be:

$$\begin{pmatrix} v_1^1 & v_1^2 & \cdots & v_1^M \\ v_2^1 & v_2^2 & \cdots & v_2^M \\ \vdots & \vdots & \ddots & \vdots \\ v_N^1 & v_N^2 & \cdots & v_N^M \end{pmatrix}.$$

hundreds of complete genome sequences are deposited in gisaid every day, which results in an ever-growing, massive quantity of high-dimensional data representations for the k-means clustering. for example, if the dataset of an organism involves 10,000 snps, the initial representation will be a 10,000-dimensional vector for each sample, which can be computationally difficult for a simple k-means clustering algorithm. therefore, a dimensionality reduction method is used to pre-process the data. the essential idea of pca-based k-means clustering is to invoke pca to obtain a reduced-dimensional representation of each sample before performing the k-means clustering. in practice, one can select the first few principal components as the k-means input for each sample. in ref. [28], the authors proved that the principal components are the continuous solution of the cluster indicators in the k-means clustering method, which provides us with a rigorous mathematical tool to embed our high-dimensional data into a low-dimensional pca subspace.

the rapid global transmission of coronavirus disease 2019 (covid-19) has offered some of the most heterogeneous, diverse, and challenging mutagenic environments to stimulate dramatic genetic evolution and response from severe acute respiratory syndrome coronavirus 2 (sars-cov-2).
this work provides the most comprehensive genotyping of sars-cov-2 transmission and evolution to date, based on 6156 genome samples, and reveals five clusters of the covid-19 genomes and associated mutations on eight different sars-cov-2 proteins. we introduce the mutation h-index and the mutation ratio to quantify an individual protein's degree of non-conservativeness. we find that the sars-cov-2 envelope protein, main protease, and endoribonuclease protein are relatively the most conservative, whereas the sars-cov-2 nucleocapsid protein, spike protein, and papain-like protease are relatively the most non-conservative. we report the alarming fact that all of the sars-cov-2 proteins have undergone intensive mutations since january 5, 2020, and that some of these mutations may seriously undermine ongoing efforts on covid-19 diagnostic testing, vaccine development, and drug discovery.

the nucleotide sequences of the sars-cov-2 genomes used in this analysis are available, upon free registration, from the gisaid database (https://www.gisaid.org/). eighteen tables are provided in the supporting material for snp variants of 6156 sars-cov-2 samples across the world, snp variants of 1625 sars-cov-2 samples in the us, snp variants in five global clusters, snp variants in three us clusters, and mutation records for eight sars-cov-2 proteins. the acknowledgments of the sars-cov-2 genomes are also given in the supporting material.

the species severe acute respiratory syndrome-related coronavirus: classifying 2019-ncov and naming it sars-cov-2
who. coronavirus disease 2019 (covid-19) situation report 93.
coronavirus disease (covid-2019) situation reports
a new coronavirus associated with human respiratory disease in china
the sars-cov s glycoprotein: expression and functional characterization
the coronavirus nucleocapsid is a multifunctional protein
severe acute respiratory syndrome coronavirus envelope protein regulates cell stress response and apoptosis
the population genetics and evolutionary epidemiology of rna viruses
origin and evolution of the 2019 novel coronavirus
genotyping coronavirus sars-cov-2: methods and implications
models of rna virus evolution and their roles in vaccine design
detection of 2019 novel coronavirus (2019-ncov) by real-time rt-pcr
diagnosing covid-19: the disease and tools for detection
structure, function, and antigenicity of the sars-cov-2 spike glycoprotein
furin cleavage of the sars coronavirus spike glycoprotein enhances cell-cell fusion but does not affect virion entry
a strategy for searching antigenic regions in the sars-cov spike protein
coronavirus infections: epidemiological, clinical and immunological features and hypotheses
structure of mpro from covid-19 virus and discovery of its inhibitors. biorxiv
sars-cov-2 cell entry depends on ace2 and tmprss2 and is blocked by a clinically proven protease inhibitor
receptor recognition by the novel coronavirus from wuhan: an analysis based on decade-long structural studies of sars coronavirus
the crystal structure of papain-like protease of sars cov-2
structure of the rna-dependent rna polymerase from covid-19 virus
structural model of the sars coronavirus e channel in lmpg micelles
crystal structure of rna binding domain of nucleocapsid phosphoprotein from sars coronavirus 2. center for structural genomics of infectious diseases (csgid)
gisaid: global initiative on sharing all influenza data-from vision to reality
clustal omega.
current protocols in bioinformatics
distance between sets
k-means clustering via principal component analysis

this work was supported in part by nih grant gm126189, nsf grants dms-1721024, dms-1761320, and iis-1900473, michigan economic development corporation, bristol-myers squibb, and pfizer. the authors thank the ibm tj watson research center, the covid-19 high performance computing consortium, and nvidia for computational assistance.

key: cord-014852-6friw2ek authors: chumakov, s. p.; prassolov, v. s. title: organization and regulation of nucleocytoplasmic transport date: 2010-04-24 journal: mol biol doi: 10.1134/s0026893310020020 sha: doc_id: 14852 cord_uid: 6friw2ek

separation of dna replication and transcription, which occur in the nucleus, from protein synthesis, which occurs in the cytoplasm, allows a more precise regulation of these processes. selective exchange of macromolecules between the two compartments is mediated by proteins of the nuclear pore complex (npc). receptor proteins of the karyopherin family interact with npc components and transfer their cargos between the nucleus and cytoplasm. nucleocytoplasmic transport pathways are regulated at multiple levels by modulating the expression or function of individual cargoes, transport receptors, or the transport channel. the regulatory levels have increasingly broad effects on the transport pathways and affect a wide range of processes from gene expression to development and differentiation.

since eukaryotic genetic material is localized in a separate compartment (the nucleus), dna replication and transcription are spatially separated from protein synthesis, which occurs in the cytoplasm. the spatial separation provides the cell with additional opportunities to finely regulate the processes and, on the other hand, requires the highly selective exchange of many macromolecules between the nucleus and cytoplasm.
exchange is due to a variety of receptor transport proteins, which interact with components of the nuclear pore complex (npc) to transport bound proteins between the two compartments. each of the receptors is involved in its specific transport pathway, transferring a certain set of substrates (cargoes). the state of a substrate also plays an important role in the transfer, since a cargo must have the necessary signal sequences to bind properly to its receptor. the majority of cell signaling pathways are involved in signal transduction between the nucleus and cytoplasm. thus, detailed knowledge of the organization and regulation of nucleocytoplasmic transport is essential for the understanding of cell metabolism and functional activity. macromolecular exchange between the nucleus and cytoplasm is mediated by nuclear pores. selective transport of molecules is due to the npc. nuclear pores were first identified in 1950, when the nuclear membrane was examined by electron microscopy [1]. further studies showed that holes in the nuclear membrane are occupied by the npc, whose structure is evolutionarily conserved [2]. the npc is a protein complex of 40-60 mda that has eightfold symmetry and consists of more than 30 proteins known as nucleoporins (nup). the spatial structure of the npc was established by tomographic reconstruction. the npc consists of a cylindrical central framework, eight microfilaments attached to it at the cytoplasmic side, and a nuclear basket of eight filaments, which are attached to the framework at the nuclear side and are distally bound with each other. the central part of the npc has a channel, which resembles an hourglass in shape and has a diameter of approximately 45 nm in the narrowest region (fig. 1) [3]. the cytoplasmic filaments of the npc are approximately 35 nm in length; the nuclear filaments, which form the nuclear basket, are 60 nm.
the length of the npc, including the central framework of 50 nm, is approximately 150 nm; its outer diameter is 125 nm [4]. npc components vary slightly in size among organisms, but the general structure and proportions are the same [5]. the number of npcs on the nuclear membrane varies broadly depending on the cell size and activity. one yeast cell has approximately 200 npcs, while actively proliferating human cells have an npc density of 10-20 npcs/μm², corresponding to 2000-5000 npcs per nucleus. mature oocytes of the spur-toed frog xenopus have 60 npcs/μm², corresponding to 5 × 10^7 npcs per cell [6]. the npc density increases as the cell progresses through the cell cycle [7]. one npc transmits 1000 molecules per second on average; i.e., at least ten molecules simultaneously pass through one nuclear pore. proteins of up to 39 nm can pass through the nuclear pore. proteins of less than 30-40 kda freely diffuse through the nuclear pore, while larger proteins (40-100 kda) must have a nuclear localization signal (nls) or a nuclear export signal (nes) to be transferred from the cytoplasm into the nucleus or backwards. the signals are bound by transport receptors, which are capable of passing through the npc both alone and in complex with another protein [8]. highly efficient (more efficient than simple diffusion) transfer of a protein of any size into the nucleus requires npc-mediated active transport mechanisms [9]. this transport depends on either the affinity of the target protein for npc components or its complexation with receptor proteins that have such affinity. karyopherins, which are the largest class of transport receptors, include importins, exportins, and transportins, the last of which are involved in both nuclear import and export. importin β (karyopherin 95 in yeasts) is one of the best studied karyopherins.
this protein binds with cargo molecules through the adap tor protein importin α (karyopherin 60 in yeasts) and is capable of interacting with several nup proteins [10] . according to a typical mechanism of interactions of karyopherins with a target protein and nup, a short motif (nls or nes) is recognized in the target pro tein and is bound by karyopherins or intermediate adaptor proteins, which then interact with karyo pherins. the resulting protein complex may interact with the nup proteins contained in the npc to enter the nucleus or leave it. proteins of the same size that lack a nls/nes are incapable of passing across the npc barrier. energy for transportation of complexes through the npc is generated by a high gradient of the ran gtpase between the nucleus and cytoplasm. nuclear rangtp is capable of releasing the target protein from an imported complex upon binding with karyo pherins. exportins occurring in the nucleus interact with their targets in the presence of rangtp, which is involved in transferring the complex through the npc into the cytoplasm, where rangtp is hydrolyzed to disassemble the transport complex and to replenish the cytoplasmic rangdp pool [11] . proteins of the karyopherin family are involved in the majority of specific nucleocytoplasmic transport pathways. there are at least 20 karyopherins in human cells and 14 in yeast cells [12] . target proteins that are transferred by karyopherins into or out of the nucleus usually have an nls or nes. the main, or classical, nls structure was established in studies of the nls of the sv40 large t antigen. this nls consists of one or two positively charged amino acid clusters, which are connected via a neutral linker [6] . further studies revealed many nlss that lack this classical structure. functional activity of an nls may be regulated via posttranslational modification or signal masking as a result of conformational changes. 
the classical nls binds with the adaptor protein karyopherin α (importin α), which forms a heterodimer with importin β. in turn, importin β ensures nuclear import [13]. nls binding is due to a karyopherin α region that contains ten arm repeats, each consisting of 40 amino acid residues forming three α helices. the region assumes a superhelical structure with a shallow groove that accommodates an nls. an nls is accommodated in the acceptor pocket of karyopherin α owing to its particular charge and hydrophobicity. when bound to karyopherin α, one nls contacts several arm repeats [14, 15]. most proteins of the karyopherin β family bind with a target protein directly. the primary structures of nlss recognizable by karyopherin β are far more diverse than the structures of nlss interacting with importin α. an nls often has a consensus sequence of several positively charged amino acid residues. such sequences were found in the nlss of histones, ribosomal proteins, and certain rna-binding proteins. in some cases, a minimal functional nls is rather large. for instance, the m9 nls consists of 38 amino acid residues, is enriched in glycine, and includes only a few positively charged residues. even longer nlss are known, indicating that a certain spatial structure of the karyopherin-binding region plays an important role in complex formation [16]. karyopherins that mediate nuclear export recognize nes sequences [9, 11]. minimal functional nes sequences have been identified for several proteins. a moderately conserved short motif with three or four hydrophobic amino acid residues (lpplerltl in the rev nes) has been studied in most detail to date. this nes is utilized in all eukaryotes and serves as a binding site for the karyopherin crm1.
the nes was found in at least 75 proteins, including many tran scription factors, cell cycle regulators, and virus pro teins, such as rev and proteinase k inhibitor of the human immunodeficiency virus type 1 (hiv 1), where this signal was identified for the first time. like importin β1, crm1 is capable of transferring com plexes through the npc, binding its target via an adaptor protein [18] . certain proteins have nonhydrophobic ness. yeast msn5p is the best known karyopherin that binds to such sites [12] . this transportin is capable of medi ating both nuclear export and import of various pro teins [19] . export of all known targets of msn5p is induced by their phosphorylation, indicating that a phosphate group is contained in the nes or that phos phorylation indirectly affects the recognition of the nes by a receptor [20] . some exportins utilize other karyopherins as tar gets. for instance, cas exports karyopherin α from the nucleus, thus, bringing it back into the cytoplasm [21] . in addition to proteins, certain rnas are subject to karyopherin dependent export. at least two export ins, exportin t and exportin 5, are capable of directly binding rna. exportin t exports trna from the nucleus by binding to a trna structural element act ing as a signal. in higher eukaryotes, exportin 5 trans fers microrna precursors from the nucleus upon recognizing their hairpin structure with a protruding 3' end [22] . the main properties that allow karyopherins to ensure efficient nuclear transport are their capabilities of binding with a target protein (directly or through an adaptor) and interacting with nup or rangtp [12] . almost all karyopherins known in humans and yeasts are involved in either export or import. two proteins (human importin 15 and yeast msn5p) are exceptions, having both exporting and importing activities. known importins are more numerous than exportins. 
for instance, ten importing, two exporting, and one universal karyopherins were identified in yeast cells. karyopherin targets are far more numerous than karyopherins. a clear view of the situation is just emerging, since karyopherins recognize the localiza tion signals having low structural homology ( table 1) . all karyopherins are similar in molecular weight (96-146 kda), charge, and domain structure. a ran binding domain is usually at the n end, a nup binding domain is in the center, and a target protein binding domain is at the c end. structural studies of four dif ferent karyopherins showed that a karyopherin mole cule contains approximately 20 heat repeats, each consisting of 40 amino acid residues that form two antiparallel α helices linked by a turn. the repeats stack together to form superhelical arches at both ends of the protein [23] . one karyopherin may form differ ent complexes with different target proteins; i.e., the karyopherin molecule has many binding sites, each recognizing a certain motif. in addition, karyopherins may change their conformation depending on the tar get protein [24] . this circumstance explains why one karyopherin transfers several proteins lacking homol ogous regions [25] . transport both importins and exportins bind with rangtp. importins bind with rangtp with high affinity, and the target protein is consequently released from its complex with an importin. exportins, whose function requires the formation of a ternary complex with rangtp and a cargo, have low affinity for rangtp when not bound with a target protein [9, 11] . a study of the atomic structure for the complexes of importin β1 and karyopherin β2 with rangtp showed that rangtp binds similarly with both receptors. rangtp interacts with a concave on the n terminal arch of importins, and its binding site has two spatially separate domains, one being closer to the n end of the arch, and the other closer to its c end [26] . 
the c terminal domain that contacts rangtp is a negatively charged loop between two heat domains. the same loop was found to bind with proteins transported by karyo pherin β2. this circumstance may explain why the target protein is released from its complex with impor tin once transferred into the nucleus (and in contact with rangtp). mutations of the c terminal rangtp binding domain result in a loss of the asso ciation between karyopherin binding with rangtp and the release of a cargo. thus, a contact of rangtp with this region may regulate the formation of the importin-cargo complex [23] . structural studies of the exportin cse1p in complex with rangtp and its cargo karyopherin α clearly demonstrate the difference in rangtp binding between import and export complexes. cse1p forms a superhelix with n and c terminal arches, which clamp around rangtp on both sides. simulta neously, cse1p binds with karyopherin α and fixes it in a conformation that prevents cargo binding. karyo pherin α also forms bonds with rangtp in this com plex. a feature of cse1p-rangtp interaction is that the n terminal arch alone can form a weak bond with rangtp, but the complex becomes sufficiently stable only when rangtp binds with both n and c termi nal arches, which is conformationally possible only when cse1p is bound with cargo. the resulting ternary complex resembles a tight spring, which stretches in the cytoplasm upon rangtp hydrolysis to release the cargo karyopherin α [27] . the exportin crm1 is similar in several structural features to karyopherin β2. like the c terminal rangtp binding site of karypherin β2, crm1 has a flexible loop in the middle of the polypeptide chain. the domain is thought to shield the binding site for target proteins and, simultaneously, to prevent crm1 from forming a stable complex with rangtp. how ever, when crm1 binds with a target protein or rangtp, the loop changes its position and stimulates the formation of a stable ternary complex. 
it is possible that conformational changes of a flexible loop underlie a general mechanism of the formation and dissociation of transport complexes [28]. at the same time, the exportin cse1p lacks regions similar to the crm1 flexible loop, indicating that the mechanism is not universal [27]. structural studies of karyopherin β1 in complex with the phenylalanine-glycine (fg) repeat-containing fragment of yeast nup showed that the interactions between the two proteins are mostly hydrophobic and involve the phenylalanine residues of nup. the karyopherin molecule forms two hydrophobic pockets: between heat repeats 5 and 6 and between repeats 6 and 7. the pockets capture the amino groups of the phenylalanine residues contained in the fg repeats of nup [29]. a similar nup-binding region occurs at the c end of karyopherin β1 [30]. there is evidence that importing karyopherins have higher affinity for the nup proteins located at the nuclear side of the npc, while exporting karyopherins have higher affinity for cytoplasmic nup proteins [31]. a deletion analysis of all fg-containing nup proteins (with a yeast model) made it possible to construct npcs having minimal sets of the fg domains. it was found by this means that asymmetrical fg domains (which occur exclusively at the cytoplasmic or nuclear side of the npc) have no functional role. the role of asymmetrical nup proteins in nuclear transport is still unclear. the main role in nuclear transport is ascribed to the ran gtpase, which controls the formation and dissociation of transport complexes. ran activity is strongly regulated by evolutionarily conserved proteins, which regulate both the intracellular distribution of ran and the transition between the gtp- and gdp-associated forms [6]. although ran occurs predominantly in the nucleus, it is continuously delivered into the cytoplasm at a high rate (10^5 molecules per second), mostly as a component of export complexes [32]. re-import of ran is due to ntf2.
ntf2 interacts with rangdp, which is the main cytoplasmic form of ran. the ntf2-rangdp complex is transferred into the nucleus, which is due to the ability of ntf2 to interact with low affinity with fg containing nup proteins, as karyopherins do. the rangdp transfer is unidirec tional because rangdp is rapidly converted to rangtp on the nuclear surface of the npc. since ntf2 binds to the switchii region of rangdp and this region has another conformation in rangtp, the ntf2-rangtp transport complex dissociates, and ntf2 is recycled into the cytoplasm [33] (fig. 2) . rangdp is converted to rangtp in the nucleus, where ran interacts with the ran guanine exchange factor (rangef). rangef is associated with chro matin, occurs in the nucleus at approximately one copy per nucleosome, and directly interacts with nucleosomal histones h2a and h2b. rangef stimu lates the exchange of gdp for gtp because a rangef loop (β wedge) penetrates into ran and releases gdp. binding with rangef fixes ran in the free form for a period long enough for ran to bind gtp contained in the nucleoplasm [34] . although no preference for gtp is observed for nucleotide exchange in the presence of rangef in vitro, ran-gtp binding in vivo is more likely owing to a higher gtp : gdp ratio in the nucleoplasm. rangef does not always occur in a chromatin associated form. its free form is detectable in the cell both in the interphase and during mitosis [35] . how ever, only chromatin associated rangef is capable of stimulating gtp-gdp exchange in complex with ran. the t42n ran mutant, which is incapable of nucleotide exchange, leads to a fixation of free rangef on chromatin [35] . it is most likely that gtp binding to ran is necessary for a detachment of the two proteins from chromatin. 
in addition, the association of such exchange complexes with chromatin plays an important role in mitosis, since a high local concentration of rangtp is necessary for the proper assembly of microtubules in the vicinity of the chromosome surface [36]. high-affinity interactions between rangtp and karyopherins leaving the nucleus take place both when cargoes are transported into the cytoplasm (with exportins) and when import karyopherins are recycled into the cytoplasm. in either case, a rangtp-karyopherin complex is moved onto the cytoplasmic surface of the npc and then interacts with rangap, which substantially (by approximately five orders of magnitude in vitro) increases the originally weak gtpase activity of ran. this interaction causes gtp hydrolysis to yield rangdp. rangtp in complex with a karyopherin is not fully accessible to rangap, and ranbp1 facilitates rangtp-rangap binding either by generating a ranbp1-rangtp-karyopherin intermediate complex or by releasing rangtp from the karyopherin [37]. the cytoplasmic localization of rangap and the nuclear localization of rangef underlie the mechanism that maintains a steep gradient of rangtp between the nucleus and cytoplasm [38]. a fluorescence resonance energy transfer (fret) analysis showed that the rangtp concentrations in the nucleus and cytoplasm differ by more than two orders of magnitude [36], which agrees well with modeling results. rangtp plays an important role in nuclear transport. a high rangtp concentration in the cytoplasm would disrupt nuclear import by causing premature dissociation of import complexes. the cell has mechanisms to ensure the proper location of the ran regulators. rangef-mediated exchange of gdp for gtp occurs exclusively in the nucleus because rangef is imported into the nucleus by two pathways [39]. rangtp hydrolysis occurs in the cytoplasm, since rangap is too large to enter the nucleus through the npc.
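the way in which spatially separated regulators can create such a gradient can be illustrated with a deliberately simple two-state calculation. the sketch below is not taken from the cited modeling work: it ignores transport between the compartments, and the rate constants `INTRINSIC_HYDROLYSIS`, `INTRINSIC_EXCHANGE`, and `GEF_EXCHANGE` are hypothetical values in arbitrary units. only the roughly 10^5-fold stimulation of gtp hydrolysis by rangap is taken from the text.

```python
# Toy estimate of the RanGTP gradient from compartmentalized regulators.
# Assumption: within each compartment Ran cycles between a GDP and a GTP
# form, so the steady-state GTP fraction is k_exchange / (k_exchange + k_hydrolysis).
# All rate constants below are hypothetical (arbitrary units).


def gtp_fraction(k_exchange: float, k_hydrolysis: float) -> float:
    """Steady-state fraction of Ran in the GTP form for a two-state cycle."""
    return k_exchange / (k_exchange + k_hydrolysis)


INTRINSIC_HYDROLYSIS = 1e-5  # assumed weak intrinsic GTPase activity of Ran
INTRINSIC_EXCHANGE = 1e-3    # assumed slow nucleotide exchange without RanGEF
GEF_EXCHANGE = 1.0           # assumed RanGEF-stimulated exchange (nucleus only)
GAP_STIMULATION = 1e5        # fold stimulation of hydrolysis by RanGAP (from the text)

# Nucleus: RanGEF present, RanGAP absent.
nuclear = gtp_fraction(GEF_EXCHANGE, INTRINSIC_HYDROLYSIS)
# Cytoplasm: RanGAP present, RanGEF absent.
cytoplasmic = gtp_fraction(INTRINSIC_EXCHANGE, INTRINSIC_HYDROLYSIS * GAP_STIMULATION)
gradient = nuclear / cytoplasmic

print(f"nuclear RanGTP fraction:     {nuclear:.5f}")
print(f"cytoplasmic RanGTP fraction: {cytoplasmic:.5f}")
print(f"nucleus/cytoplasm ratio:     {gradient:.0f}")
```

with these assumed constants the gtp-bound fraction is close to one in the nucleus and very small in the cytoplasm; the absolute numbers carry no meaning beyond showing that separating the exchange factor from the gtpase activator is sufficient, in principle, to produce a steep gradient.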
rangap can be sumoylated (covalently bound with a protein of the sumo family), and the resulting form has affinity for cytoplasmic nup358 [40]. ranbp1 is small enough to enter the nucleus through the npc via diffusion. however, ranbp1 has an nes recognizable by the exportin crm1, which ensures that ranbp1 entering the nucleus by diffusion is continuously and actively exported back into the cytoplasm. rangtp and karyopherins are the main components of the nuclear transport machinery. it should be noted, however, that the eukaryotic cell has several accessory factors that affect the transport efficiency. for instance, the efficiency of karyopherin α-dependent nuclear import is improved upon npap60 binding, which increases karyopherin α affinity for importin β1. once in the nucleus, the npap60-importin β1-karyopherin α-nls cargo complex is affected by rangtp and cas, which acts as an exportin to recycle karyopherin α into the cytoplasm. the import complex is split into two complexes, npap60-importin β1-rangtp and cas-karyopherin α-rangtp. a feature of the npap60 effect is that npap60 binds with importin β1 in different ways at different steps of import. it is thought that npap60 has the structure of a tri-stable switch [41]. ranbp3, which is an npap60 analog involved in export, is structurally similar to npap60 and binds with the exportin crm1 to increase its affinity for rangtp and nes-containing proteins, thus improving the efficiency of crm1-dependent export [42].

nucleoporins: their structure and functions

the npc consists of more than 30 different nup proteins, each occurring in at least eight copies [43]. some nup proteins are predominantly included in the npc, while some others shuttle between the nucleus and cytoplasm and are only associated with the npc for a short while [44]. the fg repeats, which are found in one third of all nup proteins, are particularly important for the functioning of the npc. there are approximately 128 fg domains containing 3500 fg repeats in one npc.
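the figures quoted above imply some simple averages, which the following lines work out. the variable names are introduced here for illustration only; the inputs are the numbers given in the text, and the results are averages and lower bounds rather than measured values.

```python
# Back-of-the-envelope arithmetic on the NPC composition figures quoted in the text.
FG_DOMAINS_PER_NPC = 128    # approximate number of FG domains per NPC
FG_REPEATS_PER_NPC = 3500   # approximate number of FG repeats per NPC
MIN_NUP_TYPES = 30          # "more than 30 different nup proteins"
MIN_COPIES_PER_NUP = 8      # "at least eight copies" of each

# Average number of FG repeats carried by one FG domain.
repeats_per_domain = FG_REPEATS_PER_NPC / FG_DOMAINS_PER_NPC

# Lower bound on the number of nup polypeptides in one NPC.
min_polypeptides = MIN_NUP_TYPES * MIN_COPIES_PER_NUP

print(f"average FG repeats per FG domain: {repeats_per_domain:.1f}")
print(f"nup polypeptides per NPC: more than {min_polypeptides}")
```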
fg domains are not distinctly structured and are thought to line the inner wall of the nuclear channel. all known transport receptors bind with fg repeats, which are essential for nuclear transport [11]. the interaction with a transport receptor involves mostly the phenylalanine ring of the core fg region; the ring binds with hydrophobic amino acid residues on the surface of a transport receptor [45]. hydrophilic regions, which occur between individual fg motifs and account for a major part of an fg domain, allow several fg regions of one domain to bind with a receptor [46] and, presumably, are necessary for modulating the receptor binding (table 2). structural studies of the fg domains in yeast nup proteins by biophysical methods showed that the domains are unfolded and lack a distinct secondary structure under native conditions [47]. similar data were obtained for the fg domains of nup proteins of other organisms [48]. as was revealed by electron microscopy, structures containing fg domains are flexible and are capable of moving along the npc. atomic force microscopy (afm) showed that the human nup153 fg domain (700 amino acid residues) occurs as a long (180 nm) unstructured chain [49]. the spatial structure of the npc changes dynamically, adapting to nuclear transport requirements. two main types of npc conformations were identified by cryoelectron microscopy. conformations of one type are characterized by cytoplasmic filaments extended towards the central channel of the npc. the filaments apparently interact with a protein complex transferred through the npc. conformations of the other type have stretched, unfolded cytoplasmic filaments [50]. the arrangement of certain functional domains of nup proteins changes in the course of nuclear transport.
for instance, the c-terminal end of the fg domain of nup214 moves onto the nuclear side of the npc when polyadenylated rna is introduced into the nucleus, while the n-terminal end remains associated with cytoplasmic fibrils [50]. the fg domain of nup153 similarly moves along the npc as cargoes are transported [51]. in addition to their role in nucleocytoplasmic transport, the fg domains may perform other functions. for instance, the fg repeats of the rrm domain of mouse nup35 have a distinct secondary structure and do not interact with transport receptors. these domains interact with the ndc1 transmembrane protein and, possibly, are involved in forming the central framework of the npc [52]. nup proteins may play a role in transcriptional regulation by interacting with active genes. this assumption is supported by the fact that npcs are not regularly distributed through the nuclear membrane, but their distribution corresponds to the distribution of active chromatin within the nucleus. such a distribution is necessary for a correct rearrangement of the chromosomes and nuclear envelope during mitosis and for mrna export through specific npcs in interphase [53]. further studies revealed that yeast genes whose transcription is increasing move to the periphery of the nucleus, which is due to their binding with certain nup proteins [54]. in addition, several nup proteins, such as nup2p, bind to loci at the boundary between euchromatin and heterochromatin, thus preventing heterochromatin from spreading to transcriptionally active regions. as was revealed more recently, nup2p binds to chromatin within a complex consisting of nup2p, nup60p (an element of the npc nuclear basket), prp20 (a yeast analog of rangef), and the htz1 histone protein [55]. htz1 is responsible for the prevention of heterochromatin spreading [56].
nup2p interacts with the promoters of functionally active genes, and this interaction depends on transcriptional activators and the tata box located 5' of them [57]. thus, the npc may serve as a multifunctional regulator of gene expression by distributing transcription activation signals and checking the quality of spliced mrna. nup proteins can act as a platform for the attachment of various transport factors. for instance, nup358, which is a component of the cytoplasmic fibrils of the npc, provides a platform for the disassembly of transport complexes and the subsequent recycling of transport factors. nup358 (ranbp2) has four ran binding sites and a binding site for the sumoylated ran activator (rangap1). to bind to nup358, rangap1 must be sumoylated, and ranbp2 itself has sumoylating activity. acting together with the e2 enzyme ubc9 in a manner similar to an e3 ligase, ranbp2 stabilizes rangap1 on the cytoplasmic fibrils [40]. rangap1 is capable of stimulating hydrolysis of rangtp associated with export complexes, which is essential for nuclear export. some nup proteins bind with proteins possessing desumoylating activity, such as senp2 (yeast ulp1) [58]. there is evidence that ulp1 binds with nup60p and the mlps located on the nuclear side of the npc [59]. other data indicate that ulp1 is anchored on the npc by npc-associated karyopherins [58]. this possibly facilitates desumoylation of hnrnps that were sumoylated on the cytoplasmic side of the npc and then transferred into the nucleus. yet this assumption disagrees with the data that ubc9 occurs on the nuclear side of the npc as well, possibly binding with nup153p [59]. apart from sumoylation, the dead-box helicase dbp5 plays a substantial role in separating hnrnp from mrna. dbp5 also occurs on the cytoplasmic surface of the npc in complex with nup214 (yeast nup159p).
the atpase activity of dbp5 depends on the binding of gle1, an mrna export factor capable of binding to the npc, and is stimulated by inositol hexaphosphate (ip6) [54]. in yeast cells, dbp5 is bound with mex67, which is an analog of tap/nxf1. the level of mex67-bound rna is elevated in cell lines with mutant dbp5, implicating dbp5 in the recycling of mex67 to the nucleus [60]. thus, both karyopherin-dependent export and mrna export depend on a protein stimulating ntpase activity (rangap1 in the case of the ran gtpase and gle1/ip6 in the case of the atpase activity of dbp5). moreover, the interaction of these activators with a substrate is determined by specific binding sites of cytoplasmic nup proteins and is regulated by additional cofactors (sumoylation and ip6).

hierarchic regulation of nuclear transport

nuclear transport is regulated by several mechanisms, which are organized hierarchically. the flow of proteins transferred between the nucleus and cytoplasm changes in response to various signals, such as hormones, cytokines, and growth factors, as well as signals regulating the cell cycle, differentiation, and the immune response, and under stress. modification of signal molecules via phosphorylation or dephosphorylation, which is the most clearly understood mechanism regulating nuclear transport, may involve many kinases and phosphatases [61]. kinases and phosphatases are regulated by many cell signals, directly linking external signaling factors and intracellular signals with changes in the nuclear import or export of signal molecules, such as cell cycle regulators, kinases, and transcription factors. many of these proteins have both an nes and an nls, which allow a fine regulation of their intracellular localization by changing the efficiency of nuclear export or import [62]. one of the key steps of nuclear transport is the interaction of importins and exportins with the nls or nes of a target protein.
changes in nls or nes accessibility via inter- or intramolecular masking are among the most common mechanisms modulating the efficiency of nuclear transport of a particular protein. intramolecular masking occurs when the accessibility of the nls or nes decreases as a result of conformational changes in the protein containing the given site. an example of such regulation is provided by the nf-κb p50 transcription factor. the factor occurs in the cytoplasm in the form of a p105 precursor, which has an nls inaccessible for binding with importin α. during the immune response, phosphorylation and degradation of the c-terminal fragment of p105 unmasks the nls, which then binds with importin α to allow active transfer of nf-κb into the nucleus [63]. conformational changes that result from disulfide bonding between the thiol groups of cysteine residues in one protein may also mask or unmask the nes or nls. for instance, under stress a disulfide bond is formed in the yeast transcription factor yap1p between cys598 and cys620, which belong to a cysteine-rich region. an nes, which is in the same region, consequently becomes inaccessible for xpo1p [64]. intermolecular masking takes place when binding with another protein makes the localization signals inaccessible for transport receptors. an example is provided by the nf-at4 transcription factor. at a higher ca2+ concentration, nf-at4 binds with the ca2+-responsive phosphatase calcineurin, which masks the crm1-binding nes of nf-at4. when the ca2+ concentration decreases, calcineurin dissociates from nf-at4, the nes becomes accessible for crm1, and crm1 transfers nf-at4 from the nucleus [65]. the nuclear localization of the p53 tumor suppressor is regulated by several mechanisms. one of these is p53 homotetramerization in the nucleus, which occurs in response to dna damage [66] and masks the c-terminal nes. for nuclear export of p53, it is essential that the tetramer dissociate and the c-terminal nes be unmasked [67].
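the masking examples above share a simple logic, which can be written down as a small decision rule. the python sketch below is a minimal illustration under strong assumptions: the `Cargo` class, its fields, and the function `predicted_location` are hypothetical names introduced here, each signal is treated as either accessible or masked, and other influences (retention factors, transport kinetics) are ignored.

```python
from dataclasses import dataclass


@dataclass
class Cargo:
    """Hypothetical description of the transport signals of a cargo protein."""
    name: str
    location: str          # current compartment: "nucleus" or "cytoplasm"
    nls_accessible: bool   # True if an NLS is present and unmasked
    nes_accessible: bool   # True if an NES is present and unmasked


def predicted_location(cargo: Cargo) -> str:
    """Return the compartment favoured by the accessible signals."""
    if cargo.nls_accessible and cargo.nes_accessible:
        return "shuttling"
    if cargo.nls_accessible:
        return "nucleus"
    if cargo.nes_accessible:
        return "cytoplasm"
    return cargo.location  # no usable signal: the protein stays where it is


# Examples paraphrased from the text. Signals the text does not mention
# (for instance the NLS of p53) are simply set to False here; that is an
# assumption of this sketch, not a statement about the proteins.
examples = [
    Cargo("p105 precursor (NLS masked)", "cytoplasm", False, False),
    Cargo("NF-kB p50 after p105 processing", "cytoplasm", True, False),
    Cargo("p53 tetramer (NES masked)", "nucleus", False, False),
    Cargo("p53 after tetramer dissociation", "nucleus", False, True),
]

for cargo in examples:
    print(f"{cargo.name}: {predicted_location(cargo)}")
```

the rule treats masking as a switch that removes a signal from consideration, which is the common element of the intramolecular and intermolecular cases described above.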
ligand binding may also mask the nls or nes of a receptor. for instance, the nes of the androgen receptor is close to its ligand-binding domain. upon binding with the ligand, the nes becomes inaccessible for crm1, and nuclear export is suppressed until the ligand dissociates from the receptor [68]. intermolecular masking of an nls or nes is also possible upon dna or rna binding. hiv-1 rev, which transfers unspliced viral mrna from the nucleus into the cytoplasm, masks its own importin β2-dependent nls upon mrna binding [69]. a release of mrna restores the ability of rev to bind with importin β2, which then recycles rev into the nucleus. rev binds again with viral mrna, this binding facilitates rev dissociation from its complex with importin β2, and a new round of mrna nuclear export starts. the yeast gal4 transcription factor and the human sry chromatin remodeling factor have dna-binding domains that overlap their nlss [70]. binding to dna prevents their association with importin β2 and vice versa. it is possible that this mechanism is an alternative to rangtp-mediated release of cargoes from transport complexes. the mechanism is effective when the local rangtp concentration is too low or ran activity is suppressed by high ca2+ concentrations [71]. many different rnas are expressed in the nucleus. to be exported from the nucleus, rnas undergo posttranscriptional modification, which is necessary for successful interactions with proteins of the transport complex. for instance, trnas acquire the capability of binding with exportin t at the last step of their maturation in the nucleus, and this ensures that only mature trnas are exported from the nucleus. several cell mrnas are exported owing to cis-regulatory elements. in mouse cells, certain retroviral transcripts leave the nucleus owing to the constitutive transport element (cte), which is capable of binding directly to the mnxf1 export receptor [9].
the mrnas of several genes involved in cell cycle control have a 3' untranslated region (utr) that is recognized by the translation initiation factor eif4e; eif4e binds to the 3' utr and mediates the mnxf1-dependent export of these transcripts [72]. in addition to an export signal, an mrna may contain a nuclear retention signal, which provides for an additional mechanism of gene expression regulation. the 3' utr of the msf cytokine gene contains a sequence that holds this mrna in the nucleus. further posttranscriptional modification in response to tgf-β1 activates the export of the msf mrna and msf synthesis [73]. signal elements may regulate rna nuclear import as well. for instance, a hexanucleotide sequence is necessary and sufficient for the nuclear import of the mir-29b microrna [74]. the export of mature mrnas and ribosomal subunits results from a series of standard modifications essential for the binding with export receptors. during maturation, mrna interacts with various proteins to form an mrnp. the composition of this complex changes in the course of mrna splicing, capping, and polyadenylation. mature mrna is capable of interacting with the mnxf1-nxt1 (yeast mex67-mtr2) transport complex, which is necessary for export into the cytoplasm [75]. the mrnp transfer through the npc depends on ip6, thus allowing phospholipase c to regulate mrna nuclear export. similar modifications necessary for export from the nucleus occur in rrna as well [75]. phosphorylation of proteins at sites close to an nls or nes is capable not only of masking these signals, but also of improving protein affinity for importins and exportins. for instance, phosphorylation of the t antigen by the ck2 kinase at ser111/112, which are adjacent to the nls, results in a 100-fold increase in nls affinity for importin α, and this leads to a 50-fold increase in the nuclear import of the t antigen [76]. phosphorylation may affect nuclear export as well.
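a gain in import that is smaller than the gain in affinity, as in the t antigen example above, is what one would expect if receptor binding is saturable. the sketch below illustrates this general point only; it is not the analysis performed in [76]. it assumes that the import rate is proportional to the fraction of cargo bound to the receptor in a simple one-site binding model, and the function names, concentrations, and dissociation constants are hypothetical values in arbitrary units.

```python
# Illustration: with saturable binding, the fold change in import can be
# smaller than the fold change in affinity. All numbers are hypothetical.


def occupancy(receptor_conc: float, kd: float) -> float:
    """Fraction of cargo bound to the receptor (one-site binding model)."""
    return receptor_conc / (receptor_conc + kd)


def fold_change_in_import(receptor_conc: float, kd_before: float,
                          affinity_gain: float) -> float:
    """Ratio of occupancies after/before a given fold increase in affinity.

    Assumes the import rate is proportional to occupancy.
    """
    kd_after = kd_before / affinity_gain
    return occupancy(receptor_conc, kd_after) / occupancy(receptor_conc, kd_before)


# Three assumed receptor concentrations, from far below to well above
# the improved dissociation constant (kd_after = 1.0 here).
for conc in (0.01, 1.02, 100.0):
    gain = fold_change_in_import(conc, kd_before=100.0, affinity_gain=100.0)
    print(f"receptor concentration {conc:>6}: import changes {gain:.1f}-fold")
```

in this model the gain in import approaches the gain in affinity only when binding is far from saturation, and it falls towards one as the receptor concentration rises well above the dissociation constant.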
for instance, phosphorylation of pho4 at ser114 and ser128 improves the recognition of its nes by the msn5p exportin [77]. in addition to phosphorylation, target proteins may undergo other types of posttranslational modification, such as methylation and ubiquitination, which are also capable of regulating the intracellular location of the protein. for instance, the proper nuclear import of rna helicase a requires methylation of its nls, and the pten phosphatase does not efficiently enter the nucleus until it is monoubiquitinated at certain lysine residues [78]. the ubiquitin-conjugating enzyme ubcm2 appears in the nucleus only in an activated, ubiquitin-loaded form. this example supports the idea that the functional state of an enzyme may affect its intracellular localization [79]. another mechanism regulating nuclear transport is the binding of nls- or nes-containing proteins with cytoplasmic or nuclear factors that anchor the bound proteins and retain them in the cytoplasm or nucleus. for instance, p53 is retained in the cytoplasm in the absence of stress signals owing to its binding with the parc ubiquitin ligase. excess synthesis or inhibition of parc modulates the cytoplasmic localization of p53 [80]. the hiv-1 transactivator tat, the vascular growth factor angiogenin, and the interferon-dependent transcription factor ifi16 are retained in the nucleus via nls-dependent binding with nuclear or nucleolar components. angiogenin is small enough to enter the nucleus via passive diffusion, and its nls does not interact with importins, but binds with nuclear proteins. as a result, angiogenin is retained in the nucleus, and its backward diffusion into the cytoplasm is prevented [81]. such anchorage is due to nls phosphorylation in some cases. for instance, the affinity of the ifi16 nls for nuclear proteins increases as a result of phosphorylation by the ck2 kinase.
in contrast to these two proteins, tat has an nls that has affinity for both nuclear and cytoplasmic anchoring proteins, and its affinity depends on protein modification [82]. different members of the importin family, especially importins α, have affinity for different groups of transport targets in higher eukaryotic cells. moreover, different transport cargoes, for instance, β-catenins or stat family proteins, pass through the npc via different pathways, binding exclusively with the fg repeats of a certain set of nup proteins [83]. thus, the presence or absence of a particular nup or karyopherin can determine whether a certain protein is transferred through the nuclear pore. differences in the tissue distribution of transport proteins affect the efficiency of nuclear transport of the same proteins. an example is provided by the drosophila melanogaster heat shock protein dhsf, which is transferred into the nucleus by importin α3. importin α3 is absent in early embryo development, and, consequently, dhsf does not enter the nucleus [84]. differences in the tissue distribution of importins α are especially clear in higher eukaryotes. for instance, importin α4 accounts for more than 1% of total protein in human skeletal muscle cells (i.e., its content is 100-fold higher than that of importin α5) and is almost absent in heart, spleen, and kidney cells [85]. in contrast, importin α1 is abundant in the heart, testis, skeletal muscle, and ovary, while importins α3 and α7 occur at a high content in the ovary and brain. the levels of mrna expression in one tissue vary greatly among different importins α; however, a high content of importin α6 is only characteristic of ovarian cells [86]. competition between different karyopherins β for binding sites on the npc surface may also be subject to regulation. the contents and composition of karyopherins β and their targets change during the life of the cell, affecting the efficiency of nucleocytoplasmic transport.
a higher content of a particular karyopherin β increases the transport efficiency of its targets [87], the saturation point differing between individual karyopherins β. since different transport receptors are capable of interacting with the same sites on the npc surface, an excess of one karyopherin β may inhibit the transport of proteins associated with other karyopherins. for instance, import in cultured cells is approximately tenfold lower than in in vitro systems, possibly because artificially reconstructed systems lack the majority of competing karyopherins β [87]. the regulation of nucleocytoplasmic transport pathways changes in certain diseases. overproduction of karyopherin β and karyopherin α family proteins was observed in several colorectal, breast, and lung cancers. deregulation of karyopherin α2, which is frequent in melanoma and breast cancer cells, correlates with poor survival. it is still unclear which karyopherin α2-dependent proteins are redistributed to determine the high malignant potential of these tumors. a possible cause is that karyopherin β1 is sequestered by overproduced truncated karyopherin α2, which alters the karyopherin α1-dependent import of p53 into the nucleus [88]. the antiviral response is similarly altered in cells infected with the ebola virus or the sars coronavirus (sars-cov) [89]. hodgkin's lymphomas are characterized by excess phosphorylation and degradation of i-κb, which results in an extremely high nuclear content of nf-κb p65 because the intermolecular masking of its nls is deregulated [90]. gametogenesis is well understood and is known to depend, to a substantial extent, on the nonuniform expression of various importin α genes. each of the three d. melanogaster importins α has its own expression pattern during spermatogenesis. importin α3 is synthesized predominantly in the postmeiotic phase, reaching its maximum during spermatid elongation.
the level of importin α1 remains low throughout the mitotic phase of spermatid development and is completely suppressed at spermatid elongation. importin α2 is actively synthesized in spermatogonia during the first four mitoses and during the two subsequent meioses. flies with a mutant importin α2 (imp-α2 d14) have a low fertility. an elevated level of importins α1 and α3 in transgenic flies is capable of restoring male fertility, indicating that these importins may functionally substitute for importin α2 to a certain extent during spermatogenesis. however, the fertility of imp-α2 d14 mutant females is not restored at higher levels of other importins, suggesting a key role in oogenesis for importin α2 [91]. caenorhabditis elegans importins are synthesized differentially in different cells. importins α1 and α2 occur mostly in germline cells, while importin α3 is found in somatic cells as well. suppression of importin α3 synthesis via rna interference blocks meiosis in pachytene. it seems that importins α1 and α2 are incapable of compensating for the lack of importin α3, indicating that importin α3 is essential for successful meiosis. inhibition of importin α2 leads to aneuploidy, an improper chromatin organization during cell division, and an incomplete restoration of the nuclear membrane after meiosis [92]. in lower eukaryotes, different importins are critical at different stages of gametogenesis, indicating that the proper organization of nuclear transport is important for meiosis. an essential developmental role of crm1-dependent nuclear export was demonstrated in experiments with leptomycin b, which acts as a specific inhibitor of crm1. when nondifferentiated gonad explants from female mouse embryos were treated with leptomycin b, the sox9 chromatin remodeling factor was redistributed into the nucleus, as is characteristic of nondifferentiated male gonads [93].
the sox9 content in the nucleus increased, which was apparently related to a role of crm1 in the export of sox9 into the cytoplasm. while crm1 and importins β2 and β3 occur in all cells in drosophila, the exportin dcas, which is responsible for the recycling of importin α from the nucleus into the cytoplasm, is differentially expressed in different tissues. importin α synthesis is almost absent at the mid-blastodermal stage. its level increases at subsequent stages, especially in embryonic nervous cells. mutations of dcas result in either a lethal phenotype or, in the case of hypomorphic dcas, developmental defects of the nervous system. such developmental defects are possibly caused by a lack of cytoplasmic importin α3 and a consequent decrease in the concentration of the notch-regulated su(h) protein in the nucleus [94]. nup proteins differ in affinity for importins and exportins, and, consequently, changes in the levels of certain nup proteins may affect the efficiency of nuclear transport. lack of nup98b alters the nuclear import of all proteins possessing the classical nls, except for the spliceosomal factor u1a [95]. nup bs-63, which is functionally associated with the ranbp2/nup358 complex, is synthesized exclusively in spermatids [96]. nup bs-63 is capable of interacting with the af10 chromatin remodeling factor, which is contained in postmeiotic cells. factors similar to af10 may enter the nucleus via direct interactions with certain nup proteins, as is the case with β-catenins and smad family proteins. thus, nup bs-63 possibly provides a specific binding site for af10 in haploid sperm cells [97]. the npc, a dynamically changing macromolecular complex, can mediate both passive transport and active transport of specific complexes. some nup proteins act as active components of a transport pathway. during mitosis, the npc presumably dissociates into several subcomplexes differing in nup composition.
a functional npc is reassembled from these subcomplexes in telophase [53]. in lower eukaryotes, whose mitosis proceeds without a disassembly of the nuclear membrane, certain nup proteins (gle2/rae1 and nup98 in aspergillus) are separated from the npc to improve its permeability [98]. in saccharomyces cerevisiae, the npc is not disassembled in mitosis even partly, but nup53, which is bound with nup170 in interphase, is phosphorylated and transferred onto nic96. as a result, a high-affinity binding site for karyopherin 121 is opened in nup53, and karyopherin 121-dependent nuclear transport is suppressed [99]. during mitosis, several nup proteins (nup107-160, nup358/ranbp2, and gle2/rae1-nup98) bind with the kinetochores and play a role in regulating the interactions between the kinetochores and microtubules, which are also necessary for npc formation after mitosis [100]. as was revealed in experiments with xenopus, gle2/rae1 induces the assembly of microtubules into a mitotic spindle. according to other data, gle2/rae1-nup98 inhibits the anaphase-promoting complex (apc) to delay anaphase [101]. dynamic changes in the npc composition affect the transmission of intracellular signals. the regulation of nuclear transport is usually modulated by changes in the karyopherin interactions with target proteins. in particular, conformational changes in a protein or posttranslational modification of sites in the vicinity of or within the nls (or nes) may mask these signals. however, npcs are possibly capable of serving several independent transport pathways at the same time [102]. if so, structural and functional changes in the npc may provide for a fine regulation of the transmission of certain signals. this hypothesis is supported by the results of studying the independent changes in nuclear import of various proteins (fig. 3). the npc contains approximately 128 fg domains with several thousands of fg repeats [103].
the fg repeats harbor various binding sites for transport proteins. the multiplicity of fg-associated transport pathways is evident from the fact that only some of the nuclear import pathways are altered in yeast cells with functionally defective fg nup nsp1 s5 [104]. transport pathways that involve different karyopherins need different sets of fg domains [105]. this finding correlates well with in vitro data. therefore, different pathways of transport through the npc have a common mechanism, but utilize different karyopherin sites and may be regulated independently.

[fig. 3. nucleocytoplasmic transport may be regulated at the levels of npc components, transport receptors, or individual cargoes. the regulations of higher hierarchic levels exert broader and less specific effects on transport processes. levels shown: npc, transport receptors, cargo. labels: posttranscriptional modification (mrna splicing, rnp maturation, trna maturation); nuclear retention due to the signal sequences of the 3' utr; maturation of ribosomal subunits; intra- and intermolecular interactions (nls or nes masking via homo- and heterodimerization).]

[1] experimental studies on amphibian oocyte nuclei: 1. investigation of the structure of the nuclear membrane by means of the electron microscope
[2] the ultrastructure of the nuclear envelope of amphibian oocytes: a reinvestigation: 2. the immature oocyte and dynamic aspects
[3] the nuclear pore complex: nucleocytoplasmic transport and beyond
[4] snapshots of nuclear pore complexes in action captured by cryo-electron tomography
[5] yeast nuclear pore complexes have a cytoplasmic ring and internal filaments
[6] transport between the cell nucleus and the cytoplasm
[7] cell cycle-dependent dynamics of nuclear pores: pore-free islands and lamins
[8] the nuclear pore complex: oily spaghetti or gummy bear?
[9] nucleocytoplasmic transport: taking an inventory
[10] mechanisms of receptor-mediated nuclear import and nuclear export
[11] regulating access to the genome: nucleocytoplasmic transport throughout the cell cycle
[12] karyopherins: from nuclear transport mediators to nuclear function regulators
[13] importin alpha: a multipurpose nuclear transport receptor
[14] crystallographic analysis of the recognition of a nuclear localization signal by the nuclear import factor karyopherin alpha
[15] autoinhibition by an internal nuclear localization signal revealed by the crystal structure of mammalian importin alpha
[16] nuclear import and the evolution of a multifunctional rna-binding protein
[17] the hiv-1 rev activation domain is a nuclear export signal that accesses an export pathway used by specific cellular rnas
[18] nuclear export of ribosomal subunits
[19] the karyopherin kap142p/msn5p mediates nuclear import and nuclear export of different cargo proteins
[20] regulation of nuclear localization: a key to a door
[21] export of importin alpha from the nucleus is mediated by a specific nuclear transport factor
[22] microrna precursors in motion: exportin-5 mediates their nuclear export
[23] karyopherins and nuclear import
[24] molecular basis for the recognition of a nonclassical nuclear localization signal by importin beta
[25] conformational variability of nucleocytoplasmic transport factors
[26] structure of the nuclear transport complex karyopherin beta2-ran x gppnhp
[27] structural basis for the assembly of a nuclear export complex
[28] architecture of crm1/exportin1 suggests how cooperativity is achieved during formation of a nuclear export complex
[29] structural basis for the interaction between fxfg nucleoporin repeats and importin beta in nuclear trafficking
[30] importin beta contains a cooh-terminal nucleoporin binding region important for nuclear transport
[31] gradient of increasing affinity of importin beta for nucleoporins along the pathway of nuclear import
[32] characterization of ran-driven cargo transport and the rangtpase system by kinetic measurements
and computer simulation structural basis for the interaction between ntf2 and nucleoporin fxfg repeats structural basis for guanine nucleotide exchange on ran by the regulator of chromosome condensation (rcc1) a mechanism of coupling rcc1 mobility to rangtp production on the chromatin in vivo visualization of a ran gtp gradient in interphase and mitotic xenopus egg extracts co activation of rangtpase and inhibition of gtp dissociation by ran gtp binding protein ranbp1 the asymmetric distribution of the constituents of the ran system is essential for transport into and out of the nucleus nuclear import of the ran exchange factor, rcc1, is mediated by at least two distinct mechanisms the nucleoporin ranbp2 has sumo1 e3 ligase activity npap60/nup50 is a tri stable switch that stimulates importin alpha:beta mediated nuclear protein import ran binding protein 3 is a cofactor for crm1 mediated nuclear protein export the yeast nuclear pore complex: composition, architecture, and transport mechanism nup98 is a mobile nucleoporin with transcription dependent dynamics association of nuclear pore fg repeat domains to ntf2 import and export complexes structural basis for the high affinity binding of nucleoporin nup1p to the saccharomyces cerevisiae importin beta homologue disorder in the nuclear pore complex: the fg repeat regions of nucleoporins are natively unfolded rapid evolution exposes the boundaries of domain structure and function in natively unfolded fg nucleoporins from the trap to the basket: getting to the bottom of the nuclear pore complex nuclear pore complex structure and dynamics revealed by cryoelectron tomography nucleoporin domain topology is linked to the transport status of the nuclear pore complex the crystal structure of mouse nup35 reveals atypical rnp motifs and novel homodimerization of the rrm domain pushing the envelope: structure, function, and dynamics of the nuclear periphery dynamic nuclear pore complexes: life on the edge the mobile
nucleoporin nup2p and chromatin bound prp20p function in endogenous npc mediated transcriptional control conserved histone variant h2a.z protects euchromatin from the ectopic spread of silent heterochromatin nup pi: the nucleopore promoter interaction of genes in yeast unconventional tethering of ulp1 to the transport channel of the nuclear pore complex by karyopherins enzymes of the sumo modification pathway localize to filaments of the nuclear pore complex the dead box protein dbp5p is required to dissociate mex67p from exported mrnps at the nuclear rim nuclear targeting signal recognition: a key control point in nuclear transport? regulation of nuclear transport: central role in development and transformation? processing of the precursor of nf kappa b by the hiv 1 protease during acute infection regulation of the yeast yap1p nuclear export signal is mediated by redox signal induced reversible disulfide bond formation nf at activation requires suppression of crm1 dependent export by calcineurin construction of chimeric tumor suppressor p53 resistant to the dominant negative interaction with p53 mutants a leucine rich nuclear export signal in the p53 tetramerization domain: regulation of subcellular localization and p53 activity by nes masking identification and characterization of a ligand regulated nuclear export signal in androgen receptor inhibition of nuclear import mediated by the rev arginine rich motif by rna molecules the c terminal nuclear localization signal of the sex determining region y (sry) high mobility group domain mediates nuclear import through importin beta 1 a sox9 defect of calmodulin dependent nuclear import in campomelic dysplasia/autosomal sex reversal eif4e is a central node of an rna regulon that governs cellular proliferation the expression of migration stimulating factor, a potent oncofetal cytokine, is uniquely controlled by 3' untranslated region dependent nuclear sequestration of its precursor messenger rna a hexanucleotide element directs microrna nuclear import
exporting rna from the nucleus to the cytoplasm the protein kinase ck2 site (ser111/112) enhances recognition of the simian virus 40 large t antigen nuclear localization sequence by importin roles of phosphorylation sites in regulating activity of the transcription factor pho4 ubiquitination regulates pten nuclear import and tumor suppression ubiquitin charging of human class iii ubiquitin conjugating enzymes triggers their nuclear import parc: a cytoplasmic anchor for p53 novel properties of the nucleolar targeting signal of human angiogenin the hiv 1 tat nuclear localization sequence confers novel nuclear import properties nucleocytoplasmic shuttling by nucleoporins nup153 and nup214 and crm1 dependent nuclear export control the subcellular distribution of latent stat1 developmental regulation of the heat shock response by nuclear transport factor karyopherin alpha3. development cloning and characterization of hsrp1 gamma, a tissue specific nuclear transport factor evidence for distinct substrate specificities of importin alpha family members in nuclear protein import simple kinetic relationships and nonspecific competition govern nuclear import rates in vivo.
in vivo truncated form of importin alpha identified in breast cancer cell inhibits nuclear import of p53 severe acute respiratory syndrome coronavirus orf6 antagonizes stat1 function by sequestering nuclear import factors on the rough endoplasmic reticulum/golgi membrane nuclear transport and cancer: from mechanism to intervention drosophila melanogaster importin alpha1 and alpha3 can replace importin alpha2 during spermatogenesis but not oogenesis a role for caenorhabditis elegans importin ima 2 in germ line and embryonic mitosis a nuclear export signal within the high mobility group domain regulates the nucleocytoplasmic translocation of sox9 during sexual determination dcas is required for importin alpha3 nuclear export and mechanosensory organ cell fate specification in drosophila disruption of the fg nucleoporin nup98 causes selective changes in nuclear pore complex stoichiometry and function characterization and potential function of a novel testis specific nucleoporin bs 63 expression pattern and cellular distribution of the murine homologue of af10 partial nuclear pore complex disassembly during closed mitosis in aspergillus nidulans cell cycle regulated transport controlled by alterations in the nuclear pore complex the nuclear pore comes to the fore the rae1 nup98 complex prevents aneuploidy by inhibiting securin degradation nuclear transport of single molecules: dwell times at the nuclear pore complex pores for thought: nuclear pore complex proteins analysis of nucleocytoplasmic transport in a thermosensitive mutant of nuclear pore protein nsp1 minimal nuclear pore complexes define fg repeat domains essential for transport
key: cord-016419-v1f6dx3e authors: gupta, varsha; sengupta, manjistha; prakash, jaya; tripathy, baishnab charan title: production of recombinant pharmaceutical proteins date: 2016-10-23 journal: basic and applied aspects of biotechnology doi: 10.1007/978-981-10-0875-7_4 sha: doc_id: 16419 cord_uid: v1f6dx3e the proteins produced
in the body control and mediate metabolic processes and help in its routine functioning. any impairment in protein production, such as production of a mutated or misfolded protein, disrupts the pathway controlled by that protein. this may manifest as disease. however, these diseases can be treated by supplying the protein from outside, that is, exogenously. the supply of active exogenous protein requires its production on a large scale to fulfill the growing demand. the process is complex, requiring high protein expression, purification, and processing. each product needs unique settings or standardizations for large-scale production and purification. because only large-scale production can fulfill the growing demand, it needs to be cost-effective. the tools of genetic engineering are utilized to produce proteins of human origin in bacterial, fungal, insect, or mammalian hosts. the use of recombinant dna technology for large-scale production of proteins requires ample time, labor, and resources, but it also offers many opportunities for economic growth. after reading this chapter, readers will be able to understand the basics of recombinant protein production in various hosts, the advantages and limitations of each host system, and the properties and production of some important pharmaceutical compounds and growth factors. the proteins produced in the body control and mediate metabolic processes and help in its routine functioning. any impairment in protein production, such as production of a mutated or misfolded protein, disrupts the pathway controlled by that protein. this may manifest as disease. however, these diseases can be treated by supplying the protein from outside, that is, exogenously. the supply of active exogenous protein requires its production on a large scale to fulfill the growing demand.
the process is complex, requiring high protein expression, purification, and processing. each product needs unique settings or standardizations for large-scale production and purification. because only large-scale production can fulfill the growing demand, it needs to be cost-effective. the tools of genetic engineering are utilized to produce proteins of human origin in bacterial, fungal, insect, or mammalian hosts (fig. 4.1). the use of recombinant dna technology for large-scale production of proteins requires ample time, labor, and resources, but it also offers many opportunities for economic growth. after reading this chapter, readers will be able to understand the basics of recombinant protein production in various hosts, the advantages and limitations of each host system, and the properties and production of some important pharmaceutical compounds and growth factors. in all living cells, gene expression occurs as the genetic information contained in dna is passed on to rna in the process of transcription and from rna to protein in the process of translation. thus, the body synthesizes rna and then proteins according to the instructions from the dna. dna-dependent rna polymerase, or simply rna polymerase, carries out transcription. eukaryotes have three rna polymerases: polymerase i transcribes ribosomal rna genes (18s, 5.8s, and 28s rrna), polymerase ii transcribes all protein-coding genes (mrna and small rna), and polymerase iii transcribes the genes for 5s rrna and trna. in prokaryotic cells (e. coli), rna polymerase consists of five subunits: two identical α subunits and one each of the β, β′, and σ subunits. the σ subunit dissociates after polymerization ensues. thus, the term "holoenzyme" is used for the complete enzyme and "core enzyme" for the enzyme without the σ subunit. the process of transcription is initiated by binding of rna polymerase to the dna molecule at a very specific site called the "promoter."
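the flow of information described above (dna to rna in transcription, rna to protein in translation) can be illustrated with a minimal python sketch. it is not taken from the chapter: the function names `transcribe` and `translate` and the example sequence are hypothetical, the standard genetic code is assumed, and the input is assumed to be the coding (sense) strand, so that transcription reduces to replacing t with u.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG ordering
# (first, second, third base). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand (assumption: sense strand given)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG up to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical 19-nt example: GG | ATG GCT TGC AAA TAA | GG
mrna = transcribe("GGATGGCTTGCAAATAAGG")
print(mrna)             # GGAUGGCUUGCAAAUAAGG
print(translate(mrna))  # MACK
```

in this hypothetical example, translation starts at the first aug and stops at the first in-frame stop codon (uaa), giving the four-residue peptide m-a-c-k. real expression constructs involve promoters, ribosome-binding sites, and processing steps that this sketch deliberately ignores.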
[figure caption: for expression of the cloned gene, the gene is provided with all the essential regulatory elements required for its transcription. the gene is attached to a selectable gene, which helps in the selection of clones carrying the gene of interest. the cells are screened for synthesis of the desired product and then processed for large-scale production in a bioreactor under optimum conditions for high yields.]
the promoters are critical for the start of transcription. promoter sites are nearly 40 bp long and are mostly located before the first base that is copied into rna (fig. 4.2a). this first base is called the start point of transcription and is denoted by +1. promoters for rna pol ii allow differential expression of genes and determine the rate at which the genes are transcribed. some promoters cause the inserted genes to be expressed all the time and in all parts of the system; they are known as "constitutive" promoters. others allow expression only at certain stages, in certain tissues or organs of individuals, and at certain time points. gene expression is thus under temporal and spatial regulation. in prokaryotes, the sequences at positions −10 and −35 upstream of the start site can interact with the σ subunit of the rna polymerase holoenzyme. in eukaryotes, the sequence tataaaa (analogous to the −10 consensus sequence of prokaryotes) is present at the −30 position in the promoter region. a eukaryotic promoter is shown in fig. 4.2a, with the tata box forming the core promoter at the −30 position (from −30 to −100), upstream of the transcription start site. the caat box and gc box are at approximately −70 to −90 and −100, respectively. promoters are always located on the same dna molecule as the genes they regulate and are therefore referred to as cis-acting elements. the spacing of the various elements is important, and much depends on locus-specific activators acting either at the core promoter or at distant sites. other signals are also involved, such as enhancers, which may lie far from the target gene.
enhancers exert stimulatory effects on promoter activity and can be upstream, downstream, or in the middle of the gene. promoters of housekeeping genes or of genes with complex patterns of expression have cpg islands rather than a tata box. the gene to be transcribed has a 5ʹ-untranslated region (5ʹ utr) and exons (coding regions) separated by introns (noncoding regions); the end has a 3ʹ-untranslated region (3ʹ utr). in cloning for gene expression, an intronless version of the gene is usually used. in eukaryotes, transcription is further enhanced by enhancers, which may be 2,000-3,000 bases away from the promoter region but are still able to affect the rate of transcription. the simplest host for recombinant dna technology is the prokaryotic bacterial system. in the early 1980s, the first recombinant fda-approved pharmaceutical, human insulin (humulin-us/humuline-eu), was obtained from genetically engineered escherichia coli (e. coli) for the treatment of diabetes. due to increasing demand, many strains of microbial species are being designed with increased throughput and better recovery of the therapeutic protein from large-scale culture. the recombinant proteins approved by the fda are obtained from escherichia coli or other prokaryotes; from saccharomyces cerevisiae or other fungal species; from insect, mammalian, or human cells; or from transgenic plants or animals. the choice of a host system for cloning and protein production depends on the ability of the host to clone and express protein-encoding genes of the desired size; to produce correctly modified, folded, and functional protein; to give a high yield of the protein; and to do so at low cost. the system that best fulfills these requirements should be chosen [17]. the advantages of producing proteins using recombinant dna technology are:
• because the human gene itself may be cloned and expressed, the risk of an immune reaction is minimized and the specific activity of the protein is high.
• the therapeutic protein can be produced efficiently, maintaining its cost-effectiveness.
• it minimizes the risk of transmission of unknown pathogens present in animal and human sources.
• it allows appropriate modification for higher specificity, increased half-life, and improved functionality.
• it allows critical changes to be made for better specificity and activity.
expression and production of a eukaryotic protein require cloning of its cdna in an expression vector and subsequent transfer of the vector into a suitable host. because the dna is modified, cloned, and expressed in a foreign host, optimum production conditions are required for each host. yields vary between expression systems, but high-level expression of the protein may be achieved by considering the following points:
• though various host systems are available for production of recombinant proteins, microbial hosts offer several advantages over other systems, as production is fast and cheap:
1. their molecular biology and physiology are well characterized and documented.
2. they are easy to maintain and manipulate.
3. they utilize inexpensive nutrient sources.
4. they grow rapidly and accumulate biomass to high cell densities.
5. scale-up is easy and convenient.
6. their expression machinery can be combined with a variety of strong inducible promoters.
given their cellular structure, bacteria can rapidly adapt to culture conditions and have a very short replication time (20 min). the media requirement of bacterial cells is simple, consisting of simple carbon and nitrogen sources. thus, the overall inputs for bacteria are 90 % lower than for mammalian cells. an inducible promoter may require an inducer, the depletion or addition of a specific nutrient, a ph change, or changes in physicochemical factors in order to initiate gene expression.
inducible systems suffer from the disadvantage that chemical inducers may be expensive and toxic and must be eliminated during downstream processing when the product is intended for human use. thus, thermoregulated systems have been used for production of recombinant pharmaceutical proteins: expression depends on a strong heat-regulated promoter, which avoids the risks associated with adding any chemical agent. approved bacteria-derived therapeutics (approved by either the european union or the fda, usa) include hormones (human insulin and insulin analogs, calcitonin, human growth hormone, glucagon, parathyroid hormone, somatropin, and insulin-like growth factor 1), interferons (alfa-1, alfa-2a, alfa-2b, and gamma-1b), interleukins 11 and 2, light and heavy chains raised against vascular endothelial growth factor-a, tumor necrosis factor alpha, cholera b subunit protein, granulocyte colony-stimulating factor, and plasminogen activator [3]. the microbe of first choice for production of recombinant protein is the enterobacterium e. coli. the system offers quick and easy modification, ease of growth under manageable environmental conditions, and a short life cycle. the bacterial cell can tolerate and adapt to changes in the environment rapidly; thus scale-up is easier. however, the system suffers from some disadvantages [11, 16]:
1. human or mammalian genes cloned in bacteria cannot undergo splicing because bacteria lack the splicing machinery; thus, an intron-less version of the gene is cloned for optimum results (fig. 4.2b).
2. the signals involved in transcription of genes may vary; thus, the gene of interest is usually fused with a bacterial gene under the control of its promoter, and the protein is obtained as a fusion product, which can later be cleaved, purified, and used.
3. the lac promoter is one of the most popular bacterial promoters. however, for high-level expression, the t7 promoter (present in pet vectors) is also preferred.
4. codon bias: because several amino acids are encoded by more than one codon, each species prefers particular codons for a given amino acid during translation (fig. 4.2c). a human gene that uses codons rare in e. coli may therefore be translated inefficiently.
5. complexity: eukaryotic cells have the advantage of producing fully functional and properly folded proteins. antibodies with four subunits may be secreted by eukaryotic cells in fully functional form. on the other hand, it is very difficult to obtain multidomain proteins from e. coli. even if the protein is obtained, renaturation and folding under laboratory conditions may be very expensive, or the protein may lack activity.
6. lack of posttranslational modifications (ptms): ptms are mandatory for the activity of many therapeutic proteins, and their absence is a problem that cannot be readily solved. glycosylation is the most common modification; others are phosphorylation and disulfide bond formation, which are essential for the full functional capability of many human proteins. ptms play an important role in proper protein folding, processing, stability, tissue targeting, activity, immune reactivity, and half-life of the protein. their absence results in an insoluble, unstable, or inactive product. however, the n-linked glycosylation system of campylobacter jejuni has been successfully transferred to e. coli, opening the possibility of producing glycosylated proteins in it. certain mutant e. coli strains are being developed to promote disulfide bond formation (ad494, origami, rosetta-gami) or with reduced protease activity (bl21).
7. overproduction of recombinant protein in bacteria might result in the loss of solubility and deposition of many protein species as protein aggregates or inclusion bodies. alteration in growth conditions might render the product insoluble. many eukaryotic proteins are found trapped in inclusion bodies and are resistant to further processing.
success has been obtained in the purification of insulin and betaferon from inclusion bodies. however, the retrieval of proteins using denaturing conditions with subsequent refolding and renaturation is not always easy and may prove extremely expensive.
8. with the e. coli host, it is very difficult to obtain proteins larger than 60 kda in soluble form.
because of these limitations of e. coli, other host systems are being explored for protein production.
the pentasaccharide core (three mannose and two n-acetylglucosamine residues) may attach to different oligosaccharides to form different glycoproteins. pentasaccharide core: (man)3-glcnac-glcnac-asn.
glycosylation is one of the important posttranslational modifications; it occurs inside the lumen of the endoplasmic reticulum (er) and in the golgi complex. the er and golgi complex are important in protein targeting and transport. n-linked glycosylation starts in the endoplasmic reticulum and continues in the golgi complex after the polypeptide is synthesized on ribosomes. o-linked glycosylation, however, occurs exclusively in the golgi complex. in this process, the oligosaccharide to be attached to the protein is first linked to dolichol phosphate, a specialized lipid present in the er that consists of about 20 isoprene (c5) units. from the phosphate of dolichol phosphate, the oligosaccharide is transferred to a specific asparagine residue of the polypeptide chain on ribosomes. the enzymes responsible for glycosylating the protein and the activated oligosaccharide are located on the lumenal side of the er. the glycoproteins are then transported to the golgi complex, where the carbohydrate units are altered and finalized. the golgi complex is responsible for o-linked sugar attachment and modification of n-linked sugars. the proteins are then targeted and transported to their destinations. covalent attachment of a carbohydrate group to a protein to form a glycoprotein is called glycosylation. in glycoproteins, protein constitutes the major fraction.
these play important roles in various physiological processes and are components of cell membranes. the carbohydrates commonly attached to proteins are fucose (fuc), galactose (gal), n-acetylgalactosamine (galnac), glucose (glc), n-acetylglucosamine (glcnac), mannose (man), and sialic acid (sia). these sugar moieties may be linked through the amide nitrogen atom in the side chain of asparagine (asn), termed n-linked glycosylation, or through the oxygen atom in the side chain of serine (ser) or threonine (thr), termed o-linked glycosylation. not every asparagine in a polypeptide can accept a carbohydrate moiety. residues within the sequence asn-x-ser or asn-x-thr, where x is any amino acid except proline, are targets for glycosylation. not only this sequence but also other aspects of the protein structure and the cell type determine the glycosylation site. all n-linked sugar residues have a common pentasaccharide core, which consists of three mannose and two n-acetylglucosamine residues.
because of the problems encountered in e. coli with the production of larger or modified proteins, the next cost-effective, fast, high-density, and easy-to-handle system is fungi. saccharomyces cerevisiae (yeast) became the system of choice when it was difficult to obtain a therapeutic protein in soluble form and with appropriate posttranslational modifications in a bacterial host. in yeast, mutants are available that give high yields. the approved products obtained from yeast are hormones, vaccines, recombinant granulocyte macrophage colony-stimulating factor (gm-csf), albumin, hirudin, and platelet-derived growth factor (pdgf). the advantages of s. cerevisiae are the following: (1) it secretes recombinant protein into the culture medium, (2) the protein is properly folded, and (3) it performs most posttranslational modifications.
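the sequon rule stated above (asn-x-ser or asn-x-thr, with x any residue except proline) lends itself to a simple sequence scan. the following python sketch is illustrative only: the function name `find_n_glyc_sequons` and the test peptide are hypothetical, and, as the text notes, a matching sequon marks only a candidate site, since protein structure and cell type also determine whether it is actually glycosylated.

```python
import re

# N, followed by any residue except proline, followed by S or T.
# The lookahead keeps matches zero-width after the N, so overlapping
# sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_glyc_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues in Asn-X-Ser/Thr sequons (X != Pro)."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical peptide in one-letter code:
#   N2-G-S   matches
#   N6-P-T   is excluded because X is proline
#   N10-A-T  matches
#   N13-K-S  matches
print(find_n_glyc_sequons("MNGSANPTQNATNKS"))  # [2, 10, 13]
```

a scan of this kind is only a first filter for candidate n-linked sites; it says nothing about o-linked glycosylation, which the text describes as occurring on serine or threonine without such a sequence rule.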
with the yeast system, high amounts of recombinant protein are obtained. yeast is also capable of performing posttranslational modifications such as o-linked glycosylation, phosphorylation, acetylation, and acylation, but differs drastically in its patterns of n-linked glycosylation (fig. 4.3).
chaperones are a family of highly conserved, diverse proteins. the important functions of chaperones are:
• they prevent aggregation and misfolding of the newly synthesized polypeptide chain.
• they prevent irreversible aggregation of nonnative conformations and keep the protein on the productive folding pathway.
• they prevent nonproductive interactions with other components of the cell.
• they help and guide the assembly of multisubunit protein complexes and larger proteins.
• the chaperones involved in folding recognize nonnative substrate proteins mainly via their exposed hydrophobic residues.
the major classes of molecular chaperones are the heat shock proteins, which are present in a variety of systems and prevent damage to proteins under high heat.
n-glycosylation in yeast differs in being of the high-mannose or hypermannose type, which is highly immunogenic. unmodified proteins are therefore suitable for production in yeast. other fungal hosts are pichia pastoris, pichia methanolica, candida boidinii, and pichia angusta, which are facultative methylotrophic yeasts with great potential. pichia pastoris is favored because high cell densities can be obtained, the protein is secreted at high concentration (1 g/l), and hypermannosylation is less extensive than in s. cerevisiae, making the product less immunogenic. however, the disadvantage is that it requires methanol to induce gene expression, as the transgene is under the control of the promoter of the alcohol oxidase 1 (aox1) gene. methanol is flammable and is toxic to cells and to humans if not thoroughly removed. because of hypermannose-type glycosylation, fungi are also unsuitable for production of many recombinant proteins.
• it is a bacterial chaperone.
• it binds partially folded and misfolded proteins.
• for its function, groel requires its cofactor groes.
• it is a tetradecameric mitochondrial chaperonin.
• it is implicated in protein import and macromolecular assembly.
• it is required for folding of precursor polypeptides in an atp-dependent manner.
• it prevents aggregation and mediates refolding of proteins after heat shock.
• they are central components of the cellular network of folding catalysts and molecular chaperones.
• they assist in different types of protein folding processes in the cell by transient association of their substrate-binding domain with short hydrophobic peptide segments.
• they bind and release their substrates by switching between the low-affinity atp-bound state and the high-affinity adp-bound state.
• they form a complex network of folding machines [15].
• it is a highly abundant chaperone.
• it plays an important role in many cellular processes, for example, cell cycle control, cell survival, and hormone and other signaling pathways.
• it is a key player in maintaining cellular homeostasis during stress.
• it has atpase activity; atp binding and hydrolysis affect the conformational dynamics of the protein.
• it has become a major therapeutic target for cancer, and its role is being explored in neurodegenerative disorders and infectious diseases [10].
• this is the eukaryotic chaperonin tailless complex polypeptide 1 (tcp1) ring complex (tric).
• it facilitates the proper folding of many cellular proteins.
insect cells: insect cells can be infected with baculoviruses, which are double-stranded circular dna viruses with arthropods as hosts. baculovirus-mediated gene expression in insect cells is a method of choice and is cost-effective, giving a much higher yield of recombinant protein than other systems. it is possible to produce large proteins that are correctly processed and biologically active. the baculovirus autographa californica nuclear polyhedrosis virus (acmnpv) is used as a cloning vector for insect cell lines. the system exploits the viral polyhedrin protein, which is required in the virus's natural habitat and is transcribed at a high rate but is not needed in cell culture. thus, the coding sequence of the polyhedrin gene is replaced with foreign dna, which is then transcribed under the control of the powerful polyhedrin promoter with high yields (~30 % of total cell protein). the observed yield may vary with the course of virus infection and the viral titer. production of recombinant protein in insect cells is time consuming (compared to the bacterial system), as cell growth is slow, and the cost of the medium is high. fresh cells are required every time because viral infection is lethal to the cells. the system also has limitations in posttranslational modification, as it performs non-sialylated n-linked glycosylation. all other parameters must be carefully optimized, as yield depends upon the virus titer and the time from infection to expression. insect cells are preferred when active protein is difficult to obtain in the e. coli system. genetic engineering has been used to develop mimic™ (invitrogen) and sfswt-3, transgenic cell lines that express all the enzymes necessary to obtain a humanized, complex n-linked glycosylation pattern. the system has been used extensively for structural studies, as correctly folded eukaryotic proteins may be obtained in secreted form, simplifying purification protocols. approved biopharmaceuticals from the infected insect cell line hi five include cervarix (recombinant papillomavirus c-terminal truncated major capsid protein l1 types 16 and 18, used as a cancer vaccine). glycosylation is a problem encountered when insect cells are used for production of recombinant human glycosylated proteins.
a great deal of genetic engineering is required to produce humanlike glycosylation in insect cells. thus, the preferred system for production of therapeutic human proteins is the mammalian system (chinese hamster ovary cell line). because of the time and difficulty involved in maintaining insect cells, mammalian cells were explored for production of recombinant protein. mammalian cells, because of their capacity for protein folding, assembly, and posttranslational modification, have become the preferred system for protein production and now account for the major share of recombinant protein production. mammalian cells: production of complex proteins requires extensive processing and posttranslational modification. mammalian cells have the advantage of performing ptms (fig. 4.3) correctly; they secrete recombinant protein into the medium in its natural form, thus skipping the critical steps of renaturation and refolding, which sometimes lead to inactive proteins. therefore, most therapeutic proteins (60-70 %) are produced in mammalian cells, primarily chinese hamster ovary (cho) cells and baby hamster kidney (bhk) cells (fig. 4.4). cho cells are relatively easy to manipulate, and their properties favor large-scale production [24]. the proteins produced are safe to use in humans with no adverse reactions because of the similar glycosylation pattern.
the recombinant therapeutic products used for clinical applications were produced from mammalian cells. mammalian cells were maintained in serum-, blood-, or plasma-based medium; therefore, the presence of any infectious agent in the product might be deleterious if not properly removed.
infectious agents may range from hiv, coronavirus (severe acute respiratory syndrome (sars)), non-lipid-enveloped (nle) viruses such as circoviruses (torque teno virus (ttv) and torque teno mini virus (ttmv)), hbv, hcv, and htlv (human t-cell lymphotropic virus) to west nile virus. prions, which are self-replicating infectious proteins, may also be present and may lead to variant creutzfeldt-jakob disease (vcjd). pathogen transmission was a major concern in the manufacture of blood-derived coagulation factor. in the early 1980s, the factor replacement products derived from plasma, which were used to treat hemophilia, were found to be contaminated with hiv, hbv, and hcv. in 1984, up to 78 % of us-based hemophilic patients were infected with hiv and 74-90 % were infected with hcv. parvoviruses, b19 (b19v) and parv4, were present as con-

in mammalian cells, genes can be expressed either transiently or stably. obtaining stably transformed cell lines requires the use of a selectable marker. another major advantage of these cells is that they can be grown in suspension, in serum-free (sf), protein-free, and chemically defined media. the product is safe, without the risk of prions of bovine spongiform encephalopathy (bse) from bovine serum albumin or of variant creutzfeldt-jakob disease (vcjd) from human serum. at present, because of the risk of viruses and prions in donor plasma or blood samples, manufacturers are opting for plasma-/serum-free growth medium for culturing the cell lines. cells require some of these components (albumin, transferrin) for their growth. recombinant human albumin is now available, like human insulin (produced in e. coli and yeast), and yeast-derived animal-free recombinant human transferrin is also available. these support plasma-free mammalian cell culture. alternatively, cho cells may be engineered to produce their own transferrin or insulin-like growth factor.
the therapeutic protein obtained from mammalian cells is treated and tested for the presence of any pathogenic agents or viruses as contaminants. the therapeutic products obtained during processing are treated with various agents to remove or inactivate virus, but the presence of asymptomatic virus poses a serious risk. a common virus inactivation technique uses solvent/detergent, which is effective against lipid-enveloped viruses such as human immunodeficiency virus (hiv), hepatitis b and c virus (hbv and hcv), human t-cell lymphotropic virus (htlv), and west nile virus (wnv). non-lipid-enveloped viruses such as parvoviruses, enteroviruses, and circoviruses are resistant to inactivation by solvent/detergent treatment. human herpesvirus 8, responsible for kaposi's sarcoma, has been shown to be transmitted through blood and blood products. following outbreaks, strict requirements were imposed on manufacturers of biologics and medical devices. such concerns prompted manufacturers to switch to sugar-based final formulations and to develop recombinant plasma-free albumin produced in yeast for use in biopharmaceutical manufacturing. plasma-free manufacturing involves the elimination of plasma derivatives in every step of the process (cell line development, upstream processing, downstream processing, and final formulation), with appropriate postproduction checks. manufacturers shifted from serum to serum-free cell culture media, then to animal product-free media, and ultimately to protein-free, completely synthetic, chemically defined media. the media consist of protein hydrolysates derived from yeast, soy, and wheat with amino acids, peptides, carbohydrates, vitamins, and essential elements, which are ultrafiltered to remove any unwanted contaminants [8].

and replagal (alfa-galactosidase a, a lysosomal hydrolase) have been approved by the european union (eu) or the food and drug administration (fda, usa).
these products are fully glycosylated when expressed in human cell lines and are used as therapeutics in human beings.

transgenics for protein production

transgenic animals are successfully used for production of recombinant proteins (for details refer to chap. 5). protein production poses great risk in terms of safety, as transmission of infectious agents, allergic responses, immune reactivity, and autoimmune responses might occur. atryn was the only approved recombinant biopharmaceutical produced in transgenic animals (approved in 2006 by the european medicines agency and in 2009 by the fda). it contains human antithrombin (432 aa) with 15 % glycosylated moieties and is secreted into the milk of transgenic goats. rhucin, intended for acute attacks of angioedema in patients with congenital c1 inhibitor activity deficiency and obtained from transgenic rabbits, was denied approval.

transgenic plants are being explored as recombinant protein producers for research and diagnostic uses. obtaining therapeutic proteins from their natural source poses a threat of spreading disease. therefore, alternative systems for the production of therapeutic agents have their own benefits. molecular farming in plants has been widely explored for production of recombinant pharmaceutical proteins. its advantages are low cost, high mass production, ease of scale-up, lack of human pathogens, and addition of eukaryotic ptms. the first recombinant protein obtained from tobacco plants, in 1986, was human growth hormone. however, plant-specific ptms might sometimes result in adverse immune reactivity. production of recombinant heterologous proteins in plants is simple and is used for production of non-naturally occurring proteins such as single-chain fv fragments (scfvs). high yield of recombinant protein is the main goal of a production system in transgenic plants. therefore, appropriate expression vectors and constructs are designed to achieve high yields of the engineered gene products [7, 12].
taminant in plasma-derived factor viii. therefore, production urgently required regulatory measures. then came advate (baxter), a recombinant factor viii manufactured without added plasma proteins, in the usa in 2003. advate was produced in cho cells grown in serum-free and protein-free medium with ultrafiltered soybean peptides, with subsequent purification by immunoaffinity chromatography. the use of advate helped to eliminate the risk of transmitting emerging blood pathogens. prions pose a serious risk, as they are highly resistant to physical/chemical inactivation. the early stage of prion infection is almost impossible to detect in a plasma donor. iatrogenic transmission of prions has occurred in patients who received human-derived pituitary hormones such as human growth hormone (hgh) and gonadotropins. cjd was transmitted to over 160 recipients of cadaveric pituitary hgh before its withdrawal. cadaveric pituitary-derived gonadotropins for infertility were also associated with iatrogenic transmission of cjd. cadaveric pituitary hgh and gonadotropins were later replaced with recombinant gh (produced in a microbial system) and recombinant gonadotropins (produced in cho cell lines).

plant systems are now efficiently engineered to produce human growth hormone, human serum albumin, erythropoietin, α-interferon, antibodies and scfvs, toxins, subunit vaccines, and insulin [20]. with increasing demand from consumers, companies are trying to increase productivity [17]. a few challenges with large-scale production of proteins are as follows. the transgene in the expression construct is a chimeric structure, as it is surrounded by various active regulatory elements. polyadenylation sites play an important role in high-level expression of the transgene. the cauliflower mosaic virus (camv) 35s promoter works well in dicots. it is a strong constitutive promoter that is made more active by duplicating the enhancer region. in monocots, however, the maize ubiquitin-1 promoter is preferred.
the presence of an intron in the 5′-untranslated region (5′-utr) enhances transcription in monocots. to obtain a high yield of the protein, several factors should be considered:
• polyadenylation sites may be incorporated from camv 35s transcripts, the agrobacterium tumefaciens nos gene, or the pea ssu gene.
• the yield can be controlled by placing the gene under the control of a promoter that is active in a particular tissue, developmental stage, or environment (e.g., rice glutelin, pea legumin).
• use of an inducible promoter (e.g., tomato 3-hydroxy-3-methylglutaryl coa reductase-2 (hmgr2)), as in the mechanical gene activation system developed by cramer (crop tech corp., virginia, usa), in which transcription starts when harvested tobacco leaves are sheared during processing.
• codon bias in the host plant may be overcome by engineering the transgene at positions that might otherwise lead to truncation, misincorporation, or slowing of translation.
• subcellular targeting is also a very important factor, which affects folding, assembly, and posttranslational modification and can be efficiently achieved by inclusion of an n-terminal signal peptide.
• position of transgene integration.
• structure of the transgene locus.
• gene copy number.
• presence of truncated or rearranged transgene copies.
• affinity tags such as his or the flag epitope can be used to ease purification; however, these modifications affect not only the primary structure but also the properties of the protein.

figure 4.5 shows the general vector pcambia that is used for transgene expression in plants. it is small (7-12 kb) and maintained in high copy number, with chloramphenicol or kanamycin resistance for selection and a pvs1 replicon that imparts high stability in agrobacterium. the modified vector has shown success in insulin production.
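the codon bias point in the list above can be illustrated with a short python sketch that splits a coding sequence into codons and flags those assumed to be rare in the host. the set of rare codons and the function names below are hypothetical placeholders chosen for illustration; a real analysis would use a measured codon usage table for the host plant.

```python
from collections import Counter

# Hypothetical set of codons assumed to be rare in the host plant.
# Placeholder values for illustration only, not measured usage data.
RARE_CODONS = {"CGA", "ATA", "CTA"}


def split_codons(cds):
    """Split a coding sequence (DNA or RNA) into a list of codons."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def rare_codon_positions(cds, rare=RARE_CODONS):
    """Return (codon_number, codon) pairs, 1-based, for codons in the rare set."""
    return [(i + 1, c) for i, c in enumerate(split_codons(cds)) if c in rare]


def codon_counts(cds):
    """Count how often each codon occurs in the coding sequence."""
    return Counter(split_codons(cds))
```

positions reported by `rare_codon_positions` are candidates for synonymous substitution, which changes the codon but leaves the encoded amino acid unchanged.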
4.9 challenges of production of therapeutic proteins

important that gene of interest should give adequate protein production. however, expression may be lost due to many factors, such as structural changes in the recombinant gene or inactivation or disappearance of the gene from the host cell. other factors influencing yield may be an increase in the copy number of the insert, maintenance of optimum temperature, and toxicity of the expressed protein to the host. chromosomal integration of the foreign gene might overcome the problem of expression stability, but in a plasmid-based system, high copy number leads to increased yield.

2. posttranslational processing: protein folding requires foldases (which accelerate protein folding) and chaperones (which prevent formation of nonnative, insoluble folding intermediates). glycosylation is a complex ptm requiring consecutive steps and enzymes. glycosylation is important as it determines protein stability, solubility, antigenicity, folding, localization, biological activity, and circulation half-life. correctly glycosylated and folded protein is required for therapeutic use. in prokaryotes, following the discovery of the n-glycosylation system in campylobacter jejuni, several other systems of o-glycosylation were unraveled in both pathogenic and symbiotic bacteria. recombinant proteins are commonly produced using e. coli, yeast, or cell lines derived from insects (sf9), mice (sp2/0), or cho, but obtaining fully human ptms is a challenging task.

fig. 4.5 the figure shows the structure of the pcambia vector (cambia.org) used for plant transformation. the vector has the camv 35s promoter, a multiple cloning site, and a reporter gene (gus or gfp may be used).
the vector can be modified to express genes for insulin (tomato) or hep-b surface antigen (hbsag).

for recombinant therapeutics, providing a human glycosylation pattern has received increased attention, and efforts are being made to develop novel glycoengineered cell lines for production of fully glycosylated protein therapeutics.

3. overexpression of therapeutic protein might result in the formation of inclusion bodies in a prokaryotic system. rapid intracellular protein accumulation and expression of large proteins increase the probability of aggregation. aggregation protects proteins from proteolysis and can facilitate protein recovery. when the expressed protein is toxic to the host, its presence in the inclusion body tends to protect the host.

precautions

recombinant protein production requires precautions against problems that result in a loss of yield and/or product:
1. contamination: contamination might occur at any stage and from any source. it poses a big challenge and may have adverse effects when contaminated material is put to human use.
2. immune response: the body can mount an immune reaction against the therapeutic agent, which leads to deposition of immune complexes in various tissues, and anaphylactic shock might occur (e.g., when an essential agent such as factor viii has been missing since birth, the patient might raise an antibody response against the treatment). this also occurs when antibodies are used as therapeutic agents in the treatment of a variety of cancers.
3. protein aggregation: any nonfunctional condition might result in aggregation or loss of activity of the recombinant protein. after purification, the proper modifications and refolding are required for therapeutics.
5. disulfide bond formation: it stabilizes protein structure; thus, strategies such as a specific extracellular excretion pathway or overexpression of chaperones are required for optimum production.
6.
degradation: sometimes agents that are active in host cells, such as proteases or protein-modifying chemicals, might degrade the recombinant protein (e.g., peg interferon is modified with polyethylene glycol, which reduces enzymatic degradation and renal clearance, thus prolonging its presence in the body with lower immunogenicity) [17].

ischemic stroke and myocardial infarction are among the leading causes of cardiovascular morbidity and mortality in the world. important thrombolytic agents are urokinase (obtained from urine), tissue plasminogen activator (t-pa), and streptokinase (obtained from bacteria). among these, t-pa is the most widely used commercially. plasminogen activators are serine proteases that convert the inactive proenzyme plasminogen (plg, a single-chain glycoprotein) to the serine protease plasmin (plm). plasmin degrades the fibrin network of blood clots (fig. 4.6a). there are two immunologically unrelated groups of plasminogen activators, the 55 kda urokinase-type pa (u-pa) and the 72 kda tissue-type pa (t-pa) (ec 3.4.21.68). t-pa is the physiological vascular activator and consists of a single polypeptide chain of 72 kda comprising 527 amino acids. it shows strong activity in the presence of fibrin [23]. the potential inhibitors of the thrombolytic cascade are type i plasminogen activator inhibitor (pai-1 or serpin e1) and pai-2 (secreted by the placenta and present in significant amounts during pregnancy). they act by competing with t-pa for binding sites on fibrin, thus preventing the fibrinolytic cascade. pai-1 complexes with t-pa for binding to fibrin. thus, a truncated t-pa was designed in which the residues responsible for interacting with pai-1 (296-299) are replaced with four alanine residues, three domains (finger, epidermal growth factor (egf), and kringle 1) are deleted, and a chimeric tetrapeptide gly-his-arg-pro (ghrp) with high affinity for fibrin is added.
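the replacement of residues 296-299 with four alanines described above is a simple position-based substitution. a minimal python sketch of that operation follows; the function name and the toy sequence are hypothetical and are not the real t-pa sequence, so the positions used in the example are scaled down accordingly.

```python
def substitute_residues(seq, start, end, replacement):
    """Replace residues start..end (1-based, inclusive) with `replacement`.

    The replacement must have the same length as the span it replaces,
    so the numbering of all downstream residues is preserved.
    """
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("positions out of range")
    if len(replacement) != end - start + 1:
        raise ValueError("replacement must match the length of the replaced span")
    return seq[:start - 1] + replacement + seq[end:]


# Toy 12-residue sequence (hypothetical placeholder, one-letter amino acid codes).
toy = "MKTLLKHRRSPG"

# Replace residues 6-9 with four alanines, analogous to the 296-299 change.
variant = substitute_residues(toy, 6, 9, "AAAA")
```

because the substitution keeps the sequence length unchanged, residue numbering in the variant still matches the parent molecule, which is why such mutants are described by the original positions.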
reteplase is a deletion mutant with a prolonged half-life, in which the finger, egf, and kringle 1 domains of the full-length molecule are all deleted; thus, it is not inhibited. deficiency of pai leads to excessive fibrinolysis and hemorrhagic diathesis (like deficiency of clotting factors). tiplaxtinin is an inhibitor of pai-1 and is used for remodeling of blood vessels [2]. plasminogen activator has great clinical relevance for the management of stroke and myocardial infarction. for production of t-pa, the e. coli and yeast systems did not work properly, owing to lack of posttranslational modification and overglycosylation, respectively. a novel truncated form of t-pa with improved fibrin affinity and increased resistance to pai-1 was expressed in a cho dg44 expression system. the therapeutic protein was produced in stably transfected cho dg44 cell lines. these cell lines were maintained in serum-free medium with glutamine, hypoxanthine, and thymidine in a stirred tank bioreactor. the cells were grown at 37 °c and 140 rpm with 5 % co2 and 85 % humidity. the protein was then purified, with higher yield (fig. 4.6b).

factor viii is one of the important blood clotting factors. deficiency of factor viii causes the bleeding disorder hemophilia a. the hemophilia may be mild to severe depending on the factor viii concentration in the body. in moderate and severe factor viii deficiency, there can be spontaneous bleeding episodes in the joints. hemophilia a affects 1 in 5,000-10,000 males. replacement therapy, with either human plasma-derived factor viii (pdfviii) or recombinant fviii (rfviii), is the treatment option for hemophiliacs [21]. transfection of hek 293 cell cultures in serum-free suspension is being tried for optimal yield. recombinant factor viii (rfviii) is produced by culturing mammalian cells such as baby hamster kidney (bhk) or chinese hamster ovary (cho) cells in large-scale bioreactors. standardizations are done to maximize yields.
insulin is a peptide hormone consisting of 51 amino acids. it is secreted by the β cells of the islets of langerhans of the pancreas. the hormone is responsible for maintaining the normal glucose level in blood. insulin is stored in the form of proinsulin, in which the two polypeptide chains, a and b, are connected by a third peptide (c-chain); before secretion, proinsulin is cleaved to produce insulin and the c-chain. the cleavage removes the c-chain, and the a chain (21 amino acids) and b chain (30 amino acids) are linked by disulfide linkages to form mature insulin. in the beginning, efforts were made to isolate mrna for pre- and proinsulin from rat islets of langerhans and to synthesize cdna. the cdna was then inserted into a plasmid. the recombinant plasmids were transferred into e. coli cells, which secreted proinsulin [4, 5, 6, 19]. scientists have also chemically synthesized dna sequences for the two chains of insulin, a and b, and inserted them separately into two pbr322 plasmids beside the β-galactosidase gene. the recombinant plasmids were separately transferred into e. coli cells, which secreted fused β-galactosidase-a chain and β-galactosidase-b chain separately. these chains were isolated in pure form by detaching them from β-galactosidase, with yields of about 10 mg/24 g of healthy and transformed cells. production of recombinant insulin is shown in fig. 4.7a, b.

growth hormone is produced by the anterior lobe of the pituitary gland and is released in multiple pulses. hgh is encoded by the ghn gene cluster (an array of five closely related genes), which is localized on chromosome 17. it belongs to a diverse gene family that has evolved by gene duplication events and whose members share considerable structural similarity and some common functions. of the various forms, the predominant form of hgh is a 22 kda protein of 191 amino acids with two disulfide bonds.
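the residue counts quoted above can be cross-checked with simple arithmetic. the python sketch below adds the insulin chain lengths and estimates protein mass from residue count using an average residue mass of about 110 da, which is a common rule of thumb and an assumption here, not a value taken from the text. the function name is hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein.
# This is an approximation assumed for illustration; exact masses depend
# on the actual amino acid composition and on any attached glycans.
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues):
    """Estimate the mass of an unmodified polypeptide in kDa."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# Mature insulin: a chain (21 residues) plus b chain (30 residues).
insulin_residues = 21 + 30

# hGH: 191 residues, quoted in the text as a 22 kDa protein.
hgh_estimate_kda = approx_mass_kda(191)
```

the estimate for hgh is close to the 22 kda quoted in the text. for glycoproteins the same estimate gives only the polypeptide contribution, since attached sugar chains add to the observed mass.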
gh does not control these functions directly, but acts through certain hormones or somatomedins, for example, insulin-like growth factor 1 (igf-1). it has a wide spectrum of roles, such as promotion of long bone growth, promotion of normal sex organ development through puberty, regulation of metabolism, stimulation of tissue growth and repair, anabolic/anti-catabolic effects via improved nitrogen retention, modulation of bone mineral density and metabolism throughout life, proliferation of some cell types of the immune system, appetite stimulation, and breakdown of fat (lipolysis). clinically, growth hormone therapy is given in the treatment of dwarfism, bone fractures, skin burns, bleeding ulcers, and aids. recombinant human growth hormone (rhgh) is a 22 kda protein consisting of a long chain of amino acids. it is used in growth hormone deficiency disorders. in children, it is used for growth abnormalities such as short stature and in chronic renal insufficiency. growth hormone therapy is also approved for adult growth hormone deficiency. gh is one of the most widely used hormones, with an estimated market of more than 1.7 billion usd. long experience in its administration has proven the therapy safe and effective in various conditions of growth abnormality. pituitary-derived hgh was used earlier, but it was later prohibited when it was found to be associated with creutzfeldt-jakob disease. with recombinant dna technology, safe and abundant recombinant hgh was produced in various heterologous systems. because non-glycosylated human growth hormone is biologically active, the preferred system for its production is e. coli, which allows rapid and economical production in large amounts [14]. recombinant hgh (rhgh) is now used for:
• gh-deficient (ghd) short-stature children.
• acceleration of wound healing.
• increase in insulin-like growth factor (igf)-1 levels.
• rhgh increases igf-1, osteocalcin, type i procollagen pro-peptide (picp), and bone density when administered to children with ghd.

for optimal productivity, strong inducible promoters such as λpl, λpr, trc, and t7 are preferred in e. coli. they are advantageous as they drive overproduction of recombinant proteins. apart from e. coli, expression of human somatotropin (hst) in a biologically active, disulfide-bonded form was tried in tobacco chloroplasts. the hormone is used for the treatment of hypopituitary dwarfism in children; additional indications are turner syndrome, chronic renal failure, hiv wasting syndrome, and possibly treatment of the elderly. growth hormone deficiency in humans occurs in both children and adults [18, 22].

hematopoietic growth factors consist of cytokines and protein hormones produced by the body, which govern the production and maturation of the various cells produced from hematopoietic stem cells during hematopoiesis in the bone marrow. in the presence of a particular growth factor, the precursor cells differentiate and become a specialized kind of cell such as a monocyte, macrophage, lymphocyte, or red blood cell. one of the important growth factors is erythropoietin, a protein hormone produced by a specific type of cell in the kidney. in the presence of erythropoietin, progenitor cells in the bone marrow are stimulated to form mature erythrocytes (red blood cells). patients with chronic kidney disease are therefore unable to maintain an adequate amount of erythropoietin for normal development of erythrocytes, resulting in low numbers of red blood cells and subsequent anemia. these patients require either blood transfusion or erythropoietin from an outside source. as supply from the natural source, kidney cells, is limited, a recombinant human erythropoietin, epogen® (amgen's trade name for epoetin alfa), is marketed for anemic conditions involving erythropoietin.
the human gene encoding erythropoietin was cloned into the chinese hamster ovary cell line for production of the human protein. this cell line continues to be used today for the production of epogen®. the half-life of erythropoietin can be increased by increasing the glycosylation of the protein. thus, darbepoetin-α is an analog engineered to contain two additional amino acid sites that are substrates for glycosylation. production is done in cho cell lines; the product has five n-linked sugar chains and a half-life almost three times longer than that of erythropoietin.

pdgf regulates cell growth and division and plays a significant role in blood vessel formation (angiogenesis), the growth of blood vessels from already existing blood vessels in tissue, and may act in autocrine and paracrine stimulation of cell growth in vivo. pdgf plays a role in development, cell proliferation, cell migration, and angiogenesis and has been linked to atherosclerosis, fibrosis, and malignant diseases. pdgf has five different isoforms: pdgfa, pdgfb, pdgfc, pdgfd, and the ab heterodimer. pdgf-a and pdgf-b have 60 % similarity in amino acid sequence, but experiments suggest different biological functions for the two chains, and the two genes lie at different locations under different transcriptional controls. pdgf-aa is released into the medium, whereas pdgf-bb is inefficiently secreted and remains attached to the plasma membrane; of these, pdgf-bb and pdgf-ab are strong mitogens and are probably responsible for the biological roles of pdgf. the pdgf receptor (pdgfr) is a receptor tyrosine kinase (rtk) of alpha and beta types. upon activation by pdgf, these receptors dimerize, leading to autophosphorylation of several sites on their cytosolic domains. pdgf, being a mitogen, promotes the proliferation of fibroblasts and smooth muscle cells in vitro. pdgf shows considerable heterogeneity, with sizes of 27-31 kda; however, purified pdgf is a cationic protein of 30 kda.
recombinant human platelet-derived growth factor (rh-pdgf) was the first recombinant protein to be approved by the us food and drug administration for treatment of chronic foot ulcers in diabetic patients (regranex, ethicon inc., somerville, nj). it has potential for use in bone regeneration and in increasing bone density in long bones and the spine. pdgf is commercially produced using e. coli and mammalian cells.

human egf protein has 53 aa and three intramolecular disulfide bonds and plays an important role in the regulation of cell growth and proliferation. it shows strong sequence and functional homology with human type-alpha transforming growth factor (htgf alpha), which is a competitor for the egf receptor site. egf acts by binding with high affinity to egfr on the cell surface and stimulating its intrinsic protein tyrosine kinase activity. egf has many biological activities. initial observations centered on its proliferative effects on fibroblasts, keratinocytes, and epithelial cells. egf modulates luteinizing hormone and thyroid hormone. egf is produced commercially by engineered e. coli. other systems are also being explored for optimum egf production [9].

fgfs are commonly mitogens and are multifunctional proteins with a wide variety of regulatory, morphological, and endocrine effects. there are 18 mammalian fgfs (1-10 and 16-23), which affect the growth and functions of a wide variety of mesenchymal, endocrine, and nerve cells. the functions of fgfs in developmental processes include mesoderm induction, anteroposterior patterning, limb formation, neural tube induction, and brain development; in mature tissues and systems, they include angiogenesis, keratinocyte organization, and wound healing. fgfs are very important during normal development of both vertebrates and invertebrates, and any irregularities in their function lead to a range of developmental defects [1].
fgf1 and fgf2 show strong angiogenic properties, promoting endothelial cell proliferation, the physical organization of endothelial cells into tubelike structures, and the growth of new blood vessels from the preexisting vasculature. fgf7 and fgf10 (also known as keratinocyte growth factor (kgf) and kgf2, respectively) stimulate the repair of injured skin and mucosal tissues by stimulating the proliferation, migration, and differentiation of epithelial cells, and they have direct chemotactic effects on tissue remodeling. most fgfs are secreted proteins that bind heparan sulfates and can therefore be retained in the extracellular matrix of tissues that contain heparan sulfate proteoglycans. this allows them to act locally in a paracrine fashion. however, the fgf19 subfamily (including fgf19, fgf21, and fgf23), which binds less tightly to heparan sulfates, can act in an endocrine fashion on distant tissues such as the intestine, liver, kidney, adipose tissue, and bone. for example, fgf19 is produced by intestinal cells but acts on fgfr4-expressing liver cells to downregulate key genes in the bile acid synthesis pathway; fgf23 is produced by the bone but acts on fgfr1-expressing kidney cells to regulate the synthesis of vitamin d and in turn affect calcium homeostasis. fgf may be synthesized using e. coli as the host system. fgf is involved in stimulating collateral vascularization and recovery from ischemia, as well as in enhancing wound healing, nerve regeneration, and repair of cartilage. because of their multiple actions on multiple cell types, fgfs have been referred to both as "pluripotent" growth factors (capable of developing into more than one cell type or tissue) and as "promiscuous" growth factors (the biochemistry and pharmacology concept of how a variety of molecules can bind to and elicit a response from a single receptor).
the fgfs and small-molecule fgf receptor kinase inhibitors are used in the treatment of cancer and cardiovascular disease and have potential in the treatment of metabolic syndrome and hypophosphatemic diseases:
• the receptor tyrosine kinase inhibitor sunitinib is approved for indications in renal cell carcinoma and gastrointestinal stromal tumors.
• the small fgfr inhibitors su5402, pd173074, and nordihydroguaiaretic acid are effective in multiple myeloma cell lines.
• pd173074 can induce cell cycle arrest in endometrial cancer cells with mutated fgfr2.
• antibody against fgfr3 has been shown to cause apoptosis effectively in mouse models of multiple myeloma and bladder cancer.
thus, fgfr inhibition can be very effective in the treatment of cancer.

nerve growth factor (ngf) is an important member of the family of neurotrophins. five protein nerve growth factors of the neurotrophin family are important. they regulate the development of the nervous system and play an important role in maintaining the structure, plasticity, and repair of the adult nervous system. all the neurotrophins are basic proteins of about 120 aa, share 50 % sequence homology, and are highly conserved in mammalian species. ngf is a small secreted protein that induces the differentiation and survival of particular target neurons (nerve cells). little is known of the biological action of the neurotrophins other than ngf. the nucleotide sequence of the cdna predicts that ngf is synthesized as pre-pro-ngf. upon removal of the hydrophobic signal, either a 34 kda or a 27 kda pro-ngf is generated, depending on the size of the transcript. however, processing of the precursors in different tissues is not well understood. neurotrophins are essential for normal development, growth, and differentiation of the sympathetic and sensory neurons and are also essential for maintaining the normal function of these cells in adults.
thus, ngf is important for the maturation and survival of neurons and prevents degeneration of adult neurons. apart from its important role in the nervous system, it has been shown to have a protective action against human pressure ulcer, corneal ulcer, and glaucoma. reduced sensation may be observed in leprosy, wound healing, nerve injury, and diabetes. ngf may help to regulate sensory fiber sensitivity and function directly or indirectly by stimulating other effectors. administration of recombinant ngf may improve sensation and pain [13]. cholinergic neurons of the basal forebrain show receptors for ngf; specific mrnas for various ngfs have been identified in different areas of the brain. cholinergic neuron loss is a cardinal feature of alzheimer's disease. ngf stimulates cholinergic function, improves memory, and prevents cholinergic degeneration in animal models of injury, amyloid overexpression, and aging. ngf acts on intracellular calcium through a tyrosine kinase receptor mechanism. ngf enhances early regeneration of severed axons and is also important in maintaining the biochemical and morphological phenotype of mature basal forebrain cholinergic neurons (bfcns) after lesions or injury of the central nervous system (cns). thus, ngf may provide a therapeutic option for preventing death of cholinergic neurons and for other clinical conditions, and it is produced using e. coli.

tgf-α exerts its function in an autocrine and endocrine fashion on various cell types of ectodermal origin, including most epithelial cells. it is normally a transmembrane protein and functions in cell communication through its ability to activate a receptor tyrosine kinase. the ectodomain of tgf-α is cleaved in a highly regulated manner, releasing soluble tgf-α, which activates paracrine signaling. the receptor is the egf/tgf alpha receptor; therefore, the focus is on understanding the important roles of tgf-α and egf receptor signaling in carcinoma development.
tgf-β is a large group of related proteins. its family members include the bone morphogenetic proteins (bmps) and the growth and differentiation factors (gdfs). it affects tissue remodeling, wound repair, hematopoiesis, morphogenesis, embryonic development, adult stem cell differentiation, immune regulation (it is a switch factor for iga), and inflammation. it exists in multiple forms: tgf-β1, tgf-β2, and tgf-β3. it acts through transmembrane serine/threonine receptor kinases, leading to the activation of smads. it acts as a tumor suppressor during cancer initiation but as a promoter during tumor progression. it has a role in the control of embryonic development, cellular differentiation, hormone secretion, and immune function. its role as a mesenchymal differentiation factor, with a focus on muscle, fat, and bone cells, might provide insights into its deregulation in skeletal and developmental diseases and is an area of active research. it is produced using cho cell lines.
there is huge potential for future therapies using proteins as therapeutic agents. recombinant proteins are not only beneficial in themselves; researchers can further engineer them to improve their activity and prolong their stay in the body, for example, by engineering monoclonal antibodies to carry a toxin or radioisotope or by generating bispecific antibodies. still, the technology has a long way to go before diseases are confined to textbooks and society is free of diseases and pathogens.
• the production of proteins for therapeutic purposes is very important, not only because of their specific action with minimal side effects but also because of their unique forms and functions.
• nowadays therapeutics (such as pancreatic enzymes from hog and pig pancreas or alpha-1-proteinase inhibitor from pooled human plasma) are not extracted from their native sources (humans, animals) because of the risk of transmission of pathogens.
other problems faced in extraction from native sources were the scarce availability of animal tissue for production, the high cost of purification with low yield, and immunological reactions in the recipients.
• the majority of biological therapeutic agents are produced using various advanced tools and technologies such as cloning, selection, purification, and stability monitoring. it is also very important to monitor the risks and side effects of therapeutics (safety analysis) obtained from a wide range of cells.
• various biological systems are available for the production of recombinant proteins: bacteria, fungal systems (yeast), insect cells, mammalian cells, human cells, and transgenic plants and animals.
• the system of choice depends on the total cost of production and on obtaining a fully functional, properly folded protein with the appropriate posttranslational modifications (ptms), such as glycosylation and phosphorylation. all of these are essential for the biological activity of the protein.
• the bacterial system is the simplest, with short cell division times, easy maintenance, easy modification, and cost-effective production. bacteria are limited in the production of large-sized proteins and are unable to perform posttranslational modifications such as glycosylation. glycosylation is very important as it affects the activity, half-life, and immunogenicity of the recombinant protein.
• because of these limitations, other host systems have been used for production that can give a better product with minimal immune reactions and are cost-effective. fungal and insect systems are utilized for protein production but sometimes result in inappropriate glycosylation. improper glycosylation may result in a nonfunctional and highly immunogenic protein. mammalian cells are the most preferred system for production of these proteins. the cho system is the system of choice for therapeutic protein production. it accounts for nearly 70 % of all products available in the market.
(c) growth factors are required for progression of tumors (d) all of the above
11. darbepoetin is used for the treatment of:
the fgf family: biology, pathophysiology and therapy. nature reviews
combined tge-sge expression of novel pai-1-resistant t-pa in cho dg44 cells using orbitally shaking disposable bioreactors
microbial factories for recombinant pharmaceuticals
the preparation and characterization of human insulin of recombinant dna origin
chemistry and clinical use of insulin
biosynthetic human proinsulin: review of chemistry, in vitro and in vivo receptor binding, animal and human pharmacology studies, and clinical trial experience
biopharmaceuticals derived from genetically modified plants
emerging trends in plasma-free manufacturing of recombinant protein therapeutics expressed in mammalian cells
clinical applications of epidermal growth factor
hsp90: structure and function
engineering of therapeutic proteins production in escherichia coli
the production of recombinant pharmaceutical proteins in plants
nerve growth factor: basic studies and possible therapeutic applications
genetic engineering and biotechnology of growth hormones, genetic engineering - basics, new applications and responsibilities
hsp70 chaperones: cellular functions and molecular mechanism
microbial factories for recombinant pharmaceuticals
production of recombinant proteins
optimization of production of recombinant human growth hormone in escherichia coli
bacterial production of human insulin
cloning, transformation and expression of proinsulin gene in tomato (lycopersicum esculentum mill)
transient transfection of serum-free suspension hek 293 cell culture for efficient production of human rfviii
heat-induced production of human growth hormone by high cell density cultivation of recombinant escherichia coli
the structure of the human tissue-type plasminogen activator gene: correlation of intron and exon structures to functional and structural domains
production of recombinant protein therapeutics in cultivated mammalian cells
key: cord-024193-khdvj6t5 authors: zhang, hong; pelech, steven; ruijtenbeek, rob; felgenhauer, thomas; bischoff, ralf; breitling, frank; stadler, volker title: peptide arrays date: 2012-01-17 journal: microarrays in diagnostics and biomarker development doi: 10.1007/978-3-642-28203-4_7 sha: doc_id: 24193 cord_uid: khdvj6t5 despite the concern over the potential loss of structural information as a result of the use of peptides as opposed to proteins as molecular probes, peptide arrays have been implemented in a broad range of applications including antibody screening and epitope mapping, characterization of molecular interactions, and enzymatic activity profiling, and they have become a valuable tool for proteomics research. in this chapter, we first (sect. 7.1) recapitulate the development of these arrays and highlight a couple of key improvements in the array production and the application in proteomics research. for clinical and biomarker development applications, it is important to measure entities that are directly related to physiological function (and dysfunction).
in this respect, the assessment of enzymatic activities is obviously preferable to genotyping, expression profiling, or even measurement of protein amounts. in sect. 7.2, an original technology based on peptides arrayed onto a porous support allows detailed profiling of kinase activities in a biological sample. the applications described range from kinase characterization to inhibition profiles, detection of off-target effects, and drug response prediction in a clinical setting, allowing rational choice of the drug to be used. such directly functional approaches will have an important role in the transition to more personalized medicine. finally, in sect. 7.3, a recently developed method for “laser printing” of peptide arrays that will make these approaches much more practical is presented.
although dna microarray technology has proved to be unparalleled in its power in profiling gene expression and identifying single nucleotide polymorphisms (snps) in a high-throughput manner, there is a poor correlation between mrna and protein expression and activity. this arises from regulation of mrna translation, for example, by micrornas, and from the involvement of posttranscriptional and posttranslational modifications (ptms), protein-protein interactions, and differences in subcellular localization. these confounding factors have driven the development and use of array technologies to directly study proteomes and have set the stage for the arrival of microarray-based proteomics. similarly to dna/oligonucleotide microarrays, arrays for proteomics studies feature a wide range of molecules including recombinant proteins, complex protein samples, antibodies, peptides, or small molecules that are assembled in an addressable fashion on planar surfaces to allow parallel interrogations for activity and interactions associated with biomolecules at the protein level.
among various proteomics array types, protein microarrays and antibody microarrays have attracted most of attention in the field in the past decade, as illustrated by the rapid technological advancements and broad applications in numerous basic research and clinical studies (reviewed in chap. 6). even though synthetic peptides together with oligonucleotides were among the first to be explored as molecular probes in array format, the development and application of peptide arrays has been sluggish and has lagged far behind arrays of many other types (frank et al. 1983; geysen et al. 1984) . one of the concerns with the use of peptides as opposed to proteins as molecular probes is the potential loss of information as a result of the missing structural context (mahrenholz et al. 2010 ). there is a much higher degree of entropy in the structures of peptides than with macromolecules such as proteins which are much more constrained. arrays of proteins have the distinct advantage in being able to mimic their physiological counterparts through presenting full-length proteins that are likely in their native 3d conformations and with proper ptms and associated proteins, to facilitate investigation of protein-protein interactions and assay of physiological activities. however, the molecular cloning and expression of tens of thousands of proteincoding genes with optimal covalent modifications and in combination with activating subunits has been proven to be technologically challenging, which significantly hampers the development and application of protein microarrays. in contrast, the chemistry of solid-phase peptide synthesis (spps) was well defined about half a century ago (merrifield 1965a, b) . 
further technological advancement in peptide synthesis over the years coupled with the recent influx of genome sequencing information upon the completion of a number of genome-wide sequencing projects has made designing and synthesizing peptides with defined sequences corresponding to the segments of a protein easier than ever. furthermore, given the fact that the biological activity of a protein is often carried out through the coordinated actions of individual protein domains, it is reasonable to expect that the various functions of the protein can be recapitulated through the study of its constituent peptides representing individual functional domain sequences. since its advent about two decades ago, the peptide array has been implemented in a broad range of applications including antibody screening and epitope mapping, characterization of molecular interactions, and enzymatic activity profiling and has since become an increasingly important and versatile tool for proteomics research. more recently, with the improvements in array production techniques, peptide microarrays with higher peptide density and diversity can be produced en masse using the peptides generated from the parallel synthesis approaches such as the spot technology. the utility of these peptide microarrays has been demonstrated in some large-scale systems biology studies on the dynamics of protein interactions in cell signaling networks as well as kineome (also known as kinome) activity profiling. however, the clinical application of peptide arrays is still at its infancy despite potential and promises shown in early exploratory work. this chapter summarizes the development of peptide array technology and highlights the progress made toward its applications in proteomics research in recent years. a perspective of its future implementation in clinical practice is also presented. 
despite the fact that the concept of the array-format peptide synthesis was introduced about three decades ago, it was not until the introduction of the spot synthesis of peptides in the early 1990s that the peptide array finally took its shape (geysen et al. 1984; frank 1992) . currently, there are two main strategies (in addition to very new approaches, see sect. 7.2) being used to produce peptide arrays: in situ parallel on-chip synthesis or immobilization of presynthesized peptides on the array surface. each has its own advantages and shortcomings. hence, the choice of the approach for peptide array production is largely determined by the downstream applications of the resulting arrays. the in situ on-chip synthesis of peptides is affordable as a result of the requirement of small amounts of reagents and of the fact that no purification of individual peptides is required. however, by the same token, the resulting peptides from in situ synthesis may suffer from low purity due to the variability in coupling efficiency between amino acid residues, especially in the case of long peptides (>20 residues) or those with residues such as cys or met, or containing multiple hydrophobic residues or phosphorylated amino acid residues. the two most common techniques for in situ synthesis that have been described and routinely used are spot synthesis and photolithography. the spot synthesis was introduced by ronald frank back in 1992 and is essentially a stepwise synthesis of peptides through sequentially delivering small amounts of activated amino acids on functionalized cellulose or polypropylene membranes using standard fmoc-based peptide chemistry (frank 1992) . the resulting membrane-based arrays are usually at low to medium density and can be used directly for downstream assays such as antibody epitope mapping. 
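the purity problem noted above for long peptides made in situ follows from simple compounding of per-step losses. the sketch below is only an illustration under stated assumptions: every coupling is taken to succeed independently with the same efficiency, the first residue is taken to be already attached to the support, and the efficiency values are hypothetical rather than measured for any particular chemistry.

```python
def full_length_fraction(n_residues: int, coupling_efficiency: float) -> float:
    """Fraction of chains on a spot that reach full length.

    Assumption: each coupling succeeds independently with the same
    efficiency, and the first residue is already on the support, so a
    peptide of n residues needs n - 1 couplings.
    """
    if n_residues < 1:
        raise ValueError("n_residues must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must be between 0 and 1")
    return coupling_efficiency ** (n_residues - 1)


if __name__ == "__main__":
    # Hypothetical per-step efficiencies, chosen only to show the trend.
    for efficiency in (0.99, 0.97, 0.95):
        for length in (10, 15, 20, 30):
            fraction = full_length_fraction(length, efficiency)
            print(f"efficiency={efficiency:.2f} length={length:2d} "
                  f"full-length fraction={fraction:.2f}")
```

because nothing is purified on the chip, the remainder of each spot consists of truncated or deletion sequences, which is why the loss grows quickly beyond about 20 residues.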
recently, improvements have been introduced allowing peptides either to be cleaved from the membrane using strong bases or to be recovered from individual spots as soluble peptides. for this purpose, they are synthesized on acid-labile cellulose membranes (e.g., trifluoroacetic acid (tfa)-soluble), allowing postsynthesis printing of the recovered peptides on selected array surfaces at a higher density (zander et al. 2005; hilpert et al. 2007) . another technique used in preparing peptide arrays in situ is the photolithographic synthesis developed by fodor et al. (1991a, b) on the basis of the addressable surface activation concept. compared to spot synthesis, photolithographic synthesis of peptides on array surface is more suitable for generating high-density arrays. however, it is both laborious and expensive. it requires the use of special photolabile protected amino acid derivatives as building blocks and of photomasks through which a laser is then used to activate specific areas on the array to cleave photolabile protecting groups. even though the technique was initially developed for peptide synthesis, it was more easily adopted for the production of oligonucleotide arrays, due to the complexity of making masks for each of the 20 amino acids for every coupling cycle, as opposed to only four bases in oligonucleotide array production (pease et al. 1994; mcgall and fidanza 2001) . in recent years, a number of modifications have been introduced, including the uses of conventional amino acids and photogenerated reagents, resulting in the improvement of efficiency and the reduction in cost for array production (singh-gasson et al. 1999; gao et al. 2003; bhushan 2006; shin et al. 2010) . thanks to technological advances in microarray printing and array substrate production in the field of genomics, it has become a common practice to spot presynthesized peptides onto reactive planar array surfaces. 
the approach is particularly useful when multiple copies of the same array with high density are required. furthermore, peptides can be purified after synthesis and prior to array printing to avoid any complications that may stem from peptide impurity. various chemistries have been utilized for immobilizing peptides onto the array surfaces. one of the popular approaches is to attach n-terminal biotinylated peptides onto avidin/ streptavidin-coated microarray slides (lesaicherre et al. 2002) . covalently immobilizing peptides through a terminal cys onto a surface functionalized with maleimide groups or disulfide is also a method that is being used quite commonly (inamori et al. 2008) . the suitability of immobilization chemistry varies according to the nature of the peptide as well as the downstream applications of the array. in our hands, attaching peptides through the n-terminal free amino groups to a surface functionalized with epoxide groups worked very well for assaying protein kinase activity (see below). thus, it is very important and necessary to determine which strategy should be used for preparing peptide arrays already prior to peptide synthesis, according to the intended application of the resulting arrays. since the advent of the technology more than two decades ago, peptide arrays have been applied in a broad range of investigations including antibody epitope mapping, protein domain-mediated interaction screening, and enzymatic activity profiling. in recent years, the utility of peptide arrays has been further extended to system-wide proteomics studies, fuelled by the advances in genomics and proteomics. this can be exemplified by their roles in cell signaling studies. 
kineome activity profiling
reversible phosphorylation and dephosphorylation of proteins mediated by protein kinases and protein phosphatases, respectively, are recognized as one of the most important and widespread molecular mechanisms in regulating cell signaling pathways involved in cell proliferation, division, differentiation, adherence, angiogenesis, and apoptosis (brognard and hunter 2011). according to our most recent tallies, the total number of phosphorylation sites in the human proteome is estimated to exceed 650,000, which encompass over 100,000 phosphosites that have been experimentally characterized and those predicted based on evolutionary conservation (http://www.phosphonet.ca). the biological importance and potential clinical significance of protein phosphorylation, exemplified by the implication of deregulation of both kinase and phosphatase activity in a wide range of human diseases including cancer, autoimmune diseases, neurodegenerative diseases, and diabetes, has driven the development of strategies for the identification of physiological substrates for each of over 500 protein kinases within the human kineome, as well as for systematic profiling of kinase activity in biological samples. for the identification of physiological substrates of kinases, a protein microarray featuring all the proteins representing the entire proteome would seemingly be an ideal platform. however, it is technically challenging to create such a comprehensive array encompassing all of the proteins encoded by about 23,000 genes in the human genome with current cloning and expression technologies. the most comprehensive protein microarray currently available commercially, trademarked as protoarray® by life technologies (carlsbad, ca, http://www.lifetechnologies.com), consists of only 9,000 human proteins that have been expressed and purified from a baculovirus-based expression system.
moreover, issues with protein conformation, autophosphorylation, and stability on the array surface, as well as complications in data interpretation as a result of the presence of multiple phosphorylation sites within a protein potentially targeted by different kinases, have limited the practicality of this approach. as a result, a number of peptide-based strategies have been devised including peptide libraries and peptide arrays, based on the notion that the substrate specificity of the kinase is largely defined by the flanking linear amino acid sequences around its target phosphorylation site(s) on the substrates. based on the consensus recognition sequences for protein kinases derived from the peptide-based studies, one can deduce potential physiological substrates for each of the kinases in a proteome, coupled with information about protein-protein interactions, subcellular colocalization, and correlations in expression or activation. back in 1995, luo and colleagues originally used the peptide array approach to identify and optimize substrate sequences for protein kinase a (pka) and transforming growth factor (tgf) b receptors (luo et al. 1995) . since then, the substrate specificities for a number of protein kinases have been elucidated and refined sequentially using peptide arrays (schutkowski et al. 2005) . there are two main strategies to be deployed for determining consensus phosphorylation sequences for protein kinases. on the one hand, peptide macroarrays featuring combinatorial peptide libraries or random peptide libraries such as those on the spot cellulose membranes were indispensable tools for elucidating the recognition sequences targeted by the kinases for which little information on their physiological substrates is available. the availability of expanding collections of recombinant active protein kinases in the past several years has facilitated the effort in this front. 
on the other hand, incorporation of increasing numbers of physiological phosphorylation sites uncovered through recent large-scale mass spectrometry-based phosphoproteomics studies into peptide microarrays has also significantly improved the efficiency of the substrate peptide screening process. currently, many large-scale peptide microarrays comprising a large number of experimentally verified phosphosites as well as those identified and optimized using the peptide library approach are readily available through various commercial sources such as pepstar™ from jpt peptide technologies (jpt peptide technologies gmbh, berlin, germany, http://www.jpt.com), pamchip® from pamgene (pamgene international b.v., hertogenbosch, the netherlands, http:// www.pamgene.com), and pepchip™ from pepscan presto (pepscan presto, lelystad, the netherlands, http://www.pepscanpresto.com), either as products or services, for profiling kinase activities in biological samples. various experimental protocols based on the same approach have been developed (schutkowski et al. 2005; thiele et al. 2010) . the approach was also adapted to kineome activity profiling in bovine samples by utilizing information gathered through bioinformatics analysis of the phosphorylation sites conserved in evolution (jalal et al. 2009 ). however, inferring endogenous kinase activities based on the data from such peptide microarrays is less straightforward than initially thought. one of the main issues associated with the current approach is the overlapping specificity among protein kinases dictated by the promiscuity in substrate recognition, especially for the kinases from the same or related families. phosphorylation of a peptide on each spot may represent the sum of activity of all the kinases targeting this particular peptide. 
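the statement that a spot reports the summed activity of every kinase able to phosphorylate its peptide can be made concrete with a toy additive model. everything in the sketch below is hypothetical: the peptide and kinase names, the relative rates, and the lysate compositions are invented for illustration, and real signals need not be strictly additive.

```python
# Hypothetical relative rates at which each kinase phosphorylates each peptide.
RATES = {
    "peptide_1": {"kinase_A": 1.0, "kinase_B": 0.5},   # shared substrate
    "peptide_2": {"kinase_A": 0.0, "kinase_B": 1.0},   # selective for kinase_B
}


def spot_signal(peptide: str, activities: dict) -> float:
    """Signal on one spot under a simple additive model:
    sum over kinases of (relative rate) x (kinase activity in the sample)."""
    return sum(rate * activities.get(kinase, 0.0)
               for kinase, rate in RATES[peptide].items())


# Two invented samples with very different kinase content.
lysate_1 = {"kinase_A": 2.0, "kinase_B": 0.0}
lysate_2 = {"kinase_A": 0.0, "kinase_B": 4.0}

if __name__ == "__main__":
    for name, lysate in (("lysate_1", lysate_1), ("lysate_2", lysate_2)):
        signals = {p: spot_signal(p, lysate) for p in RATES}
        print(name, signals)
```

in this toy case the shared peptide gives the same signal for both samples even though different kinases are active, while the selective peptide separates them; this is the reason selective substrate sequences are worth searching for.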
indeed, a specific phosphosite sequence may be optimized through evolution to be recognized by a panel of kinases and phosphatases and not be optimized for an individual kinase or phosphatase. thus, under most circumstances, it is impossible to directly correlate the level of peptide phosphorylation on the array with the activity of a specific kinase. it is even more challenging when the activity of kinases in crude cell or tissue lysates is to be assessed, where the cell compartmentalization has been destroyed and the proper subcellular localization of proteins cannot be maintained. in light of these challenges, we set out to identify the optimal peptide substrate sequences unique to each kinase by combining the high-throughput capability of peptide microarrays with the power of a proprietary kinase-substrate prediction algorithm developed at kinexus (fig. 7.1). the algorithm was built based on the information gathered through manual analysis of close to 10,000 confirmed kinase-substrate pairs for 229 typical kinases. coupled with an alignment of the primary amino acid sequences of the catalytic domains of protein kinases, this allowed the specificity-determining residues (sdrs) to be identified, and a position-specific scoring matrix (pssm) was generated for each of the kinases for predicting their respective recognition sequences around the phosphorylation site. the pssms were then used to derive the optimal substrate peptide sequences. in total, 445 15-mer peptides corresponding to the predicted sequences with a single phosphorylatable residue (ser, thr, or tyr) in the middle were synthesized and immobilized onto an epoxysilane-coated glass microarray surface. the resulting peptide microarray was made up of four identical subfields to allow four kinase assays to be run in parallel.
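the kinexus algorithm itself is proprietary and is not described in enough detail here to reproduce. as a generic illustration of what a position-specific scoring matrix is and how an optimal sequence can be read off it, the sketch below builds a log-odds pssm from a few aligned substrate peptides. the training peptides, the uniform background, and the pseudocount are all assumptions made for the example, and 7-mers are used instead of 15-mers to keep it short.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(peptides, pseudocount=1.0):
    """One dict per position: log2-odds of each residue versus a uniform
    background (the uniform background is an assumption of this sketch)."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm, peptide):
    """Sum of the per-position scores of a peptide."""
    return sum(column[aa] for column, aa in zip(pssm, peptide))


def best_peptide(pssm, center_residue):
    """Highest-scoring residue at each position, with the central
    position fixed to the chosen phosphoacceptor (S, T, or Y)."""
    residues = [max(AMINO_ACIDS, key=column.get) for column in pssm]
    residues[len(pssm) // 2] = center_residue
    return "".join(residues)


# Hypothetical aligned substrates with the phosphoacceptor in the middle.
training = ["RRASLGA", "RRGSLPA", "KRASVGA", "RRASLGE"]
pssm = build_pssm(training)

if __name__ == "__main__":
    print(best_peptide(pssm, "S"), round(score(pssm, best_peptide(pssm, "S")), 2))
```

a sequence derived this way is only a prediction; as the text goes on to describe, it still has to be tested against the purified kinase on the array.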
phosphorylation of the peptides on the array was carried out by applying active protein kinases individually into each field under their respective assay conditions, and the extent of peptide phosphorylation was then detected with pro-q diamond (life technologies), a fluorescent dye that had been validated to bind specifically to phosphorylated residues including ser, thr, and tyr, regardless of the sequence context they are in. so far, over 200 protein kinases have been assayed. many highly reactive and selective peptides have been identified as substrates for the kinases tested. while most of the sequences conformed to those reported previously, some novel motifs were also uncovered. detailed analysis of the "hit" peptide sequences is expected to reveal the prototype optimal substrate peptides unique to each kinase, which will then be further optimized for their reactivity and selectivity using an oriented peptide library approach. it is expected that in the near future a peptide microarray spotted with substrate peptides that are preoptimized for each of the kinases will become available from kinexus for kineome profiling in complex biological samples.
fig. 7.1 a bioinformatics algorithm-guided identification of the optimal peptide substrates unique to each kinase on peptide microarrays. panel a. schematic description of the workflow from peptide substrate sequence prediction using the kinase predictor 1.0 algorithm developed by kinexus to select test peptides for phosphorylation by kinases on the peptide microarray and, finally, to the deduction of the optimal substrate sequences for individual kinases. panel b. scanned image of the full kinex™ kinase substrate peptide microarray phosphorylated with three different kinases. the second field was incubated with atp in the absence of added kinase as a control. each peptide featured a phosphorylatable residue (ser, thr, or tyr) in the middle. the strong spots common among all four fields are the orientation markers designed for easy peptide localization. panel c. close-up scanned image of one field of the kinex™ kinase substrate peptide microarray. panel d. alignment of the top phosphorylated peptides detected following incubation with amp-dependent protein kinase alpha 1. peptides were ranked according to their respective phosphorylation signal intensity, and an optimal substrate peptide sequence is shown in the bottom row
compared to protein kinases, protein phosphatases have been less well characterized with respect to their regulation and physiological substrates. this can be attributed to the misconception that phosphatases are promiscuous in substrate recognition and regulated in a less stringent fashion in vivo, which might have arisen from the observation that a relatively small number of protein-serine/threonine- (ser/thr-) specific phosphatases are able to catalyze a myriad of dephosphorylation events, and that most protein phosphatases have not been found to recognize well-defined linear sequences or consensus motifs within their substrates so far. despite prevailing evidence that short synthetic phosphopeptides are poor phosphatase substrates compared to their parent proteins (zhao and lee 1997), as supported by the role of regulatory subunits in forming the substrate-binding sites required for substrate recognition according to crystallography studies (virshup and shenolikar 2009), several phosphopeptide-based studies have been reported that aimed at the delineation of substrate preferences using either activity- or interaction-based approaches (sun et al. 2009; wang et al. 2002).
among the two main classes of protein phosphatases, protein-tyrosine (tyr-) phosphatases (ptps), not protein-ser/thr phosphatases, had been the focus of early studies on substrate specificities, due to the availability of better characterized phospho-tyr antibodies than phospho-ser/thr antibodies. in those studies, phosphatase substrate specificities were commonly delineated using individually synthesized phosphopeptides (cho et al. 1993; zhang et al. 1994). in recent years, peptide arrays, peptide microarrays in particular, have been demonstrated for their utility in protein phosphatase specificity mapping and activity profiling. in 2008, waldmann's and yao's groups independently used phosphopeptide microarrays for large-scale, high-throughput characterization of ptp and protein-ser/thr phosphatase substrate specificities, respectively (köhn et al. 2007; sun et al. 2008), for the first time. while a fluorescently labeled phospho-tyr antibody was employed to monitor dephosphorylation of tyrosine in the peptides in waldmann's study, yao and coworkers used pro-q diamond dye to detect dephosphorylation of ser/thr, circumventing the detection problem that results from the lack of well-characterized generic antibodies for phospho-ser/thr. the use of the dye has recently been extended by the same group to the detection of dephosphorylation of tyrosine on peptide microarrays, in place of anti-phospho-tyr antibodies (gao et al. 2010). a phosphopeptide microarray featuring the most evolutionarily conserved human phosphorylation sites is now being explored in our laboratory for its potential for the determination of phosphatase specificities and activity profiling. in addition to protein kinases and phosphatases, peptide microarrays have also been successfully used to characterize protease specificity, based on the notion that proteolytic cleavage can be monitored by the changes in fluorescent signals on fluorogenic peptides immobilized on the array upon the action of proteases.
salisbury et al. (2002) used a fluorogenic peptide substrate array with 361 spatially addressable peptides to decipher the specificity of thrombin. gosalia and colleagues employed a solution-phase fluorogenic peptide microarray, in which peptides were spotted as spatially separate nanodroplets, to reveal the evolutionary conservation of substrate specificity of thrombin from human, bovine, and salmon (gosalia and diamond 2003) . the approach was also applied to determine substrate preferences of 13 serine and 11 cysteine proteases (gosalia et al. 2005) . winssinger et al. (2004) generated a library of 192 peptides tagged with peptide nucleic acid (pna) molecules and incubated it with protease in solution, followed by spatial deconvolution on a dna microarray to profile the substrate specificities of thrombin, plasmin, and caspase-3. the peptide array-based protease specificity profiling approach has now become an essential part of protease characterization platform, complementary and synergistic to other proteomic approaches used to detect alterations of substrate abundance and to identify and quantitate proteolytically generated neo amino-or carboxy-termini (auf dem keller and schilling 2010). the application of peptide arrays for protein-protein interaction characterization has been well documented since the advent of spot peptide synthesis. it is applicable to characterizing protein-protein interactions where the interface between the two interacting proteins can be recapitulated by linear peptide sequences derived from the parent proteins. it is even more advantageous compared to other proteomics techniques such as protein arrays when the protein-protein interactions mediated by ptms such as phosphorylation are concerned, as amino acids carrying corresponding ptms can be readily incorporated into specific sites during peptide synthesis. 
not only can the peptide array-based approach be used to map the consensus sequences recognized by these domains, it can also provide dynamic information on signal-dependent change in molecular networks for proteins defined by the peptides on the array and the proteins for which the binding is monitored (sinzinger and brock 2010). intracellular signal networks are organized through the interactions of proteins, which are often mediated by a group of diverse modular protein interaction domains (pids) with defined specificity. among them, the src homology 2 (sh2) domains are the largest family recognizing tyrosine phosphorylated sequences, and thus play pivotal roles in relaying information flow emanating from receptor protein-tyr kinases (pawson 2007). a phospho-tyr-oriented peptide library with only one amino acid introduced at the defined positions at a time, and a mixture of amino acids at the randomized position, was spotted on the array and was interrogated with 120 bacterially expressed human sh2 domains, and the phosphotyrosine-containing peptide sequence motifs for 76 of them were defined (huang et al. 2008). combining the power of phage-displayed libraries, spot technology, and bioinformatics, the peptide array-based approach was also successfully used to deduce the consensus sequences of yeast sh3 domains (tonikian et al. 2008, 2009). a peptide microarray featuring peptides with inverted configuration representing 6,223 c-terminal sequences of human proteins was probed with a pdz domain to screen for putative interaction partners (boisguerin et al. 2004). with the knowledge of consensus sequences for the pids, peptide microarrays that carry peptides corresponding to known sequences recognized by sh2, sh3, pdz, and other pids have been employed to profile the binding of proteins from complex biological samples to detect the differences in molecular interactions between different physiological states (stoevesandt et al. 
2005; sinzinger and brock 2010). a peptide microarray populated with peptides, in which kinase consensus sequences and caspase cleavage recognition motifs (identified through a search of the human proteome) are overlapping, was employed in a study to investigate the role of phosphorylation in the regulation of caspase signaling pathways. protein kinase ck2 emerged as the kinase with the largest number of substrates that contained kinase consensus sequences that overlapped with caspase-3 cleavage motifs, indicating a role of phosphorylation in the inhibition of caspase-mediated apoptosis signaling pathways (duncan et al. 2011). as a natural extension to their classical application in antigenic epitope mapping, peptide arrays displaying a collection of biologically active synthetic peptides have been demonstrated in recent years to be a very versatile tool for profiling the antibody repertoire in complex biological samples such as serum, urine, saliva, and other types of body fluids for the diagnosis of pathogen infections, allergy reactions, and autoimmunity, based on the notion that the immune response to pathogens, allergens, or autoantigens can be captured by the presence or absence of specific populations of antibodies. hence, serological mapping has become one of the most sought after applications of the peptide array technology, as it appears to have the greatest clinical potential. peptide libraries featuring fragments derived from autoantigens, allergens, or viral proteins presented on either the spot membrane-based peptide macroarrays or glass slide-based peptide microarrays have been used for antibody profiling in clinical samples. the clinical potential of such analyses has been shown by their use for antibody spectrum profiling in the sera from patients infected with hepatitis b and c, with a simian-human immunodeficiency virus (shiv), the severe acute respiratory syndrome (sars) coronavirus, and with herpes virus. 
this provides crucial information not only for infection diagnostics but also for the development of vaccines (neuman de vegvar et al. 2003; duburcq et al. 2004; guo et al. 2004; andresen and grötzinger 2009). the use of peptide arrays in kineome profiling has also inspired the exploration of their application in the studies of human diseases. as increasing numbers of kinase substrate peptides have been identified in recent years, peptide microarrays with the capability of screening a broad range of protein kinases have been established and used to profile the aberrant kinase activity in clinical samples as well as for monitoring the response to kinase inhibitory compounds in a high-throughput manner. this underscores the potential of peptide arrays in disease diagnosis and drug discovery (piersma et al. 2010). among the handful of studies reported so far in this area, schrage et al. (2009) recently reported the activation of multiple pathways in relation to akt/gsk3b, pdgfrb, and src protein kinases in chondrosarcoma cells on a kinase substrate peptide array containing 1,024 peptides. supplemented with the cell viability data in vitro, the study indicated that the src inhibitor dasatinib is a potential treatment option for patients who are inoperable (schrage et al. 2009). tuynman and colleagues investigated the molecular mechanism underlying anticarcinogenic activity of celecoxib (celebrex), a selective cyclooxygenase-2 (cox-2) inhibitor, against colorectal cancer (crc) using a kinase substrate peptide array with 1,176 different kinase substrate consensus sequences and found that celecoxib represses c-met-dependent signaling, which in turn led to downregulation of oncogenic wnt signaling in crc, supporting the potential of targeting c-met and wnt signaling in crc therapy (tuynman et al. 2008). 
recently, a cellulose membrane-based peptide array of 70 peptides derived from p160 peptide, a cancer cell targeting peptide identified by phage display, was employed to optimize the affinity of the peptides for human cancer cells using peptide-whole cell interaction assay (ahmed et al. 2010) . the binding of the three peptides with the highest affinity and selectivity for cancer cells was further confirmed using fluorescence imaging and flow cytometry. the study revealed the potential of the peptide array-based whole cell binding assay for screening and identifying cancer cell targeting peptides for cancer diagnosis and drug targeted delivery. peptide microarrays broaden a new field of research and applications often referred to as functional proteomics (thiele et al. 2011) . while dna and protein arrays mostly focus on determination of abundance of rna or protein molecules, peptide arrays allow the functional analysis of multiple proteins or protein families (fig. 7.2) . by functional analysis, we mean the detection of protein activity. clear examples are the detection of enzymatic activities, for example, of kinases, phosphatases, and proteases in lysates from cells or tissues. however, nonenzymatic functions, like the responses to hormone binding of nuclear receptors in terms of specific coregulator protein recruitment, are also currently being studied on peptide arrays (heneweer et al. 2007; koppen et al. 2009 ). peptide arrays enable the miniaturization and multiplexing of activity-based assays. in the context of pharmaceutical research, and in the field of translational medicine in particular, such array-based approaches are emerging. this makes sense since the majority of the drugs being developed target protein activity and function. these new and more targeted drugs act by effecting protein function rather than targeting dna or rna or interfering with the modulation of protein levels. 
because functional profiling of the interaction of drugs with cellular or tissue samples is of specific interest in pharmaceutical research, peptide microarrays are proving to be very useful with their ability to profile protein activity and its modulation by drugs. we focus here on the drug class of kinase inhibitors which have been reshaping the oncology field due to their high success rate. these molecules inhibit kinase function by reducing kinase activity, which can be monitored on a peptide array. kinases play a pivotal role in cellular biology by being the key regulators of signal transduction. signals being detected by a membrane-bound receptor are transduced to the inner parts of the cell to result in an appropriate response. this happens via highly complex cascades of events in which the signal is received and propagated using transphosphorylation reactions (fig. 7.3) . these reactions are catalyzed by protein kinases together with the crucial atp molecule. atp is important as not only does it provide the kinase's energy source, it also supplies the phosphate moiety, vital to the whole signal transduction cascade. a kinase becomes activated and places this phosphate group on a substrate protein; this being the subsequent link in the signal transduction pathway. often this substrate protein is a kinase as well. a signal transduction event can be compared to a relay in athletics, where each kinase gets activated by an upstream event and subsequently passes on the baton to the next member downstream in the pathway. most protein kinases have distinct preferences for the aromatic hydroxyl groups of tyrosine residues or for the aliphatic serines or threonines. it is this characteristic which divides this family of more than 500 members into two kinase subfamilies: proteintyrosine kinases (ptks) and protein-serine/threonine kinases (stks). 
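the division into ptks and stks described above follows directly from the phospho-acceptor residue of a substrate site. as a minimal sketch (the function name and the example peptide strings are hypothetical and chosen only for illustration), a peptide on an array could be assigned to the kinase subfamily able to phosphorylate it as follows:

```python
def kinase_class_for_site(peptide: str, site_index: int) -> str:
    """Return the kinase subfamily implied by the residue at site_index (0-based).

    Illustrative only: tyrosine (Y) sites are assigned to protein-tyrosine
    kinases (PTK), serine/threonine (S/T) sites to protein-serine/threonine
    kinases (STK). Dual-specificity kinases are not modeled here.
    """
    residue = peptide[site_index].upper()
    if residue == "Y":
        return "PTK"
    if residue in ("S", "T"):
        return "STK"
    raise ValueError(f"residue {residue!r} is not a phospho-acceptor (S, T, or Y)")


# hypothetical example peptides, given in one-letter amino acid code
print(kinase_class_for_site("EEEIYGEFF", 4))
print(kinase_class_for_site("LRRASLG", 4))
```

the sketch deliberately ignores the flanking residues, which in practice determine which individual kinase within a subfamily prefers a given peptide.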
while in classic kinase assays the activity is detected by the phosphorylation of a single substrate, multiple substrates can now be immobilized and monitored on a microarray. instead of placing multiple protein substrates on a chip, only the phosphosites (the sites within the protein which become phosphorylated) are immobilized in a peptide microarray. thus, the peptides represent the protein substrates. as has been discussed in the previous chapters, this can be done in a variety of ways, but all are based on a solid support. in most cases, the sequences are derived from known phosphorylation sites in the human proteome. as the human proteome is estimated to comprise more than a million proteins, of which more than two-thirds can be phosphorylated, this indicates the huge amount of different phosphosites that can be investigated by peptide arrays. the principle of the assay is that the kinase activities in the sample of interest phosphorylate the peptides. the phosphorylation event is detected by either radiography or fluorescence imaging of the array. in radio assays, the peptide is phosphorylated using radioactive atp as the phosphate source. this approach is increasingly being replaced by the use of fluorescence assays. in the latter case, the phosphorylation of the peptide is detected by a fluorescently labeled molecule which is either a chelator (e.g., phosphotag) or an antibody. ideally, the antibody needs to detect the phosphoamino acid in all the available peptide sequences on the chip equally well and independently of the adjacent amino acids. antibodies like py20 work very well in detecting tyrosine phosphorylated peptides, but for serine/ threonine phosphorylated peptides, cocktails are needed for full coverage of detection. the first peptide microarray applications for kinase profiling used glass as a solid support and radioactivity for readout at a single time point. 
later, protocols were developed based on the fluorescent readout of labeled antibodies (or cocktails of antibodies) binding to phosphorylated peptides. a second generation of this technology was developed by researchers in the netherlands and is referred to as the pamchip® technology (lemeer et al. 2007; hilhorst et al. 2009; versele et al. 2009) (fig. 7.4). with this technology, antibody-based fluorescence detection has been combined with a change of solid support from glass slides to a porous ceramic. in this format, the sample is pumped up and down through the porous aluminum oxide ceramic sheets, in which the peptides are immobilized at designated spots. each spot comprises thousands of separated pores with diameters of 0.2 µm in which the peptides are site-specifically immobilized. each time the sample is below the solid support, the degree of phosphorylation is monitored by imaging the fluorescence intensities caused by the antibody binding to the phosphorylated peptides alone. these time curves, or kinetic readouts, appear to be instrumental in enzymatic studies; a kinase is after all an enzyme, which accelerates the rate (the kinetics) of a phosphorylation reaction. in addition, the kinetic and multi(time) step readout for each of the 144 or 256 peptides on each single array allows much more comprehensive statistical analysis of the signals than data from a single time point per peptide spot on a glass array (thilakarathne et al. 2011). the application of peptide arrays in biological, pharmaceutical, and medical studies often requires the analysis of many samples under variable conditions. for example, lysates from cells should be analyzed using a range of time points, varying concentrations, and with multiple different drugs. for this reason, a system has been developed which has the capability of analyzing 96 arrays at once. 
this latest technology for kinase activity profiling is based on a 96-well plate format, in which each well comprises a peptide microarray. bioinformatics for analysis of the vast datasets from such studies has been evolving in parallel. thilakarathne et al. (2011) developed a new method based on semiparametric mixed linear models to further enhance the amount of information that can be obtained from the multiparallel kinetic readouts from each microarray. a straightforward application is substrate identification using recombinant kinases. such studies have indicated that different kinases have their own preferences for the peptide sequences they phosphorylate. clear differentiation between the ptks and stks has been confirmed, although dual specificity kinases have also been found. in addition, it has become apparent that although each kinase has a preference for particular peptide sequences, they can also be promiscuous, resulting in multiple peptides being phosphorylated to different degrees in diverse peptide sets. in short, the degree of phosphorylation by purified kinases varies from peptide to peptide and can be profiled in hundreds per array, resulting in phosphorylation fingerprints. substrate profiling studies have revealed important biological information as described in the paper by schutkowski et al. where they showed that for optimal recognition by gsk3b, a peptide substrate should be prephosphorylated or primed (schutkowski et al. 2004). fig. 7.4 (caption, partly recovered): due to this porosity, the sample can be pumped up and down through this solid support. every time the sample is positioned below the microarray, an image is taken of the microarray by a ccd camera (middle panel). via this real-time imaging of the microarray, the signal developing in the peptide spots, due to binding of fluorescent antibody detecting peptides phosphorylated by the kinases, can be monitored. another interesting application was explored by poot et al. 
who identified an optimal substrate for pkc isozymes and coupled this peptide to an atp-binding site inhibitor to generate a bisubstrate (poot et al. 2009; van ameijde et al. 2010). the peptide microarrays were subsequently used to evaluate the resulting inhibitors, which were potent and selective toward the theta isozyme. with the development of protocols for profiling cell lysates or tissue homogenates, the application area has been broadened to signal transduction and pathway studies [well reviewed by the group of schutkowski (thiele et al. 2011)]. the effect of a stimulus or a kinase-inhibiting drug on cultured cells can now be investigated at the complexity level of a cell, where multiple kinases can be active in the context of the interacting networks that exist. at this point, peptide arrays provide a welcome extension to classical methods like (phospho) western blots and elisas, which monitor drug effects on a (single) kinase by detecting the variation in abundance of the downstream phosphorylated substrate. the peptide arrays monitor the enzyme activity of multiple kinases at once and not only the end result of this activity. an interesting feature of functional proteomics is found in the ability to study direct effects of the investigative drug. because activities of kinases can be monitored in cell lysates or tissues, drugs can be characterized in a complex, and probably more realistic, context than in classical single readout (singleplex) assays. this latter type of assay is limited as it can only investigate the activity of the isolated drug target. drug selectivity profiling is a clear example of an application which benefits from the combination of multiplexing and miniaturization (fig. 7.5). another example of an application area is the unraveling of a drug's cellular mechanism of action. in this application, the peptides on the chip represent multiple different proteins involved in complex cellular pathways and signal transduction networks. 
in a small lysate sample, derived from just 10,000 to 100,000 cells or less than a tenth of a cubic millimeter of tissue, multiple diverse interactions can be studied at once. a recent development has been the application of peptide microarrays to new drugs in pharmaceutical research and clinical development. in the field of oncology in particular, fundamental progress has been made by so-called targeted medicine. previously, anticancer drugs were targeting cellular processes, like cell division, more globally. new insights into cell signaling and signal transduction cascades have changed the way novel oncological drugs are being developed, and it is the kinase enzyme class which is playing a crucial role in this progress. many of its members play a pivotal role in the mechanisms of tumorigenesis, and some kinases are even the active protein products of oncogenes. examples of successful cancer drugs targeting protein kinases, or the signaling pathways they are involved with, are imatinib, erlotinib, gefitinib, and the previously mentioned sunitinib and sorafenib. these are all molecules that block the catalytic activity of protein kinases. a related class of therapeutics is antibodies, which intervene in a different way with cellular signaling: they act by blocking the initiation of receptor signaling. examples of the latter are trastuzumab and bevacizumab, which block egfr kinase and vegfr signaling, respectively. these drugs can be studied comprehensively with peptide arrays. in these studies, two formats are currently being used. using cell line models, the cells are either treated with the drug in culture or on the chip. in the first case, lysates are prepared from the cells before and after treatment and profiled for activities on the chip. in the second case, lysates can be treated directly by spiking the drug into the solution just before application onto the chip. 
in the latter instance, cell lines, tissue homogenates from animal models, or even clinical samples can all be used. the effects of the inhibitors on the kinase activities in these samples can all be directly assessed. although the highly important context of the cellular architecture is lost, which is surely a downside, the potential to profile all detectable, full-length kinases, with their relevant posttranslational modifications, in the same sample opens up vast new fields of applications. in drug discovery, researchers screen for kinase-inhibiting compounds in chemical libraries. during such studies, they often use an abstracted model, the purified protein, but this protein is frequently truncated to its domain only. a major disadvantage of this approach is the absence of other domains, including those with a regulatory function. fig. 7.5 selectivity profiling of kinase inhibitor drugs using peptide microarrays. here sorafenib inhibition profiles are compared with sunitinib in extracts from both normal and tumor tissue from a renal cell carcinoma patient (panel labels: inhibition / activation; sorafenib; sunitinib). in the recently developed protocols for peptide array analysis of kinases in cell and tissue lysates, the drug target can now be studied more naturally as a full-length protein, in the way it is actually expressed in cells or tissues. at first, this was shown in a model system, but interestingly, this approach appears to be translatable to patient-derived tissues. this means that the kinase drug targets can now be studied in the same form as they are expressed in a patient's tumor, thus full length, fully decorated with all relevant posttranslational modifications and in the presence of stabilizing or activating cofactors (e.g., heat-shock proteins). in addition, they can be studied in the presence of all other kinases expressed in the cell or tissue being investigated. 
such analysis of patient-derived tumor samples can result in the identification of tumor-specific kinase activities. when linked to pathological, diagnostic, and/or clinical data, this can lead to the identification of diagnostic or prognostic biomarkers. while the on-target effects are being monitored, the researchers can also obtain insights into the drug's effect(s) on other kinase targets, which are either intended-in the case of multitarget inhibitors-or unintended and can putatively cause side effects. the latter opens up new opportunities for the toxicologist in investigating and understanding adverse drug reactions leading to toxicological biomarkers. a very typical feature of activity-based assays is the capability of drug testing. with peptide microarray-based kinase assays, not only can multiple kinases in a patient sample be studied at once, their response to their inhibiting drugs can also be studied. this possibility links the presence and activity of the kinase drug targets to their responsiveness to the drug. drug response is a leading parameter in the clinical development of a drug. in the development of kinase inhibitors in cancer, the response rates are often very low, even in case of effective drugs. these drugs are developed against specific kinase targets, but these targets are not always equally present or active in the whole treated population. furthermore, in a subset of nonsensitive or resistant patients, the role of this target in tumorigenesis and growth or metastasis is not essential and can be overruled by other mechanisms. in order to identify the patient subpopulation that is likely to respond, tests need to be developed that match the right patient to the right drug and vice versa as more drugs are being developed. there are already examples of such companion diagnostic tests. for the prediction of response to trastuzumab (moelans et al. 
2011), targeting the receptor tyrosine kinase her2/neu, patients are tested for the presence of this target on their tumor cells, before they receive this breast cancer therapy. a recent example is the test for the eml4-alk translocation to select patients for pharmacotherapy with crizotinib (kwak et al. 2010), an alk kinase inhibitor. if the whole lung cancer patient population had been treated, only an extremely low percentage would have shown a clinical response because only 4% have this mutation. the availability of the companion diagnostic test was therefore essential for the success of the clinical trial. the identification of predictive biomarkers appears to become essential in many drug development programs. the classical technologies for biomarker discovery are based on testing for dna mutation or rna or protein expression levels. molecular data are obtained in biopsies taken before the patient is treated. if these data can be correlated or associated to the therapy response, this can be the start of generating a companion diagnostic test. peptide microarrays are also currently being used in this effort. while classical methods cannot involve the drug of interest in predose tissue samples, kinase microarrays can, as discussed above. in addition, the drug can actually be used in the test. this means that drug-specific data and information can be generated using predose biopsies. proof of concept of this approach was shown by versele et al. in a multiple cell line study. analogous to the way it is aimed to work in a clinical setting, they profiled the lysate of a cell line on a peptide microarray in the presence and absence of their drug candidate. the inhibition profiles were used to predict the response of the cell line to drug treatment in culture. from these profiles, they could identify a set of peptide phosphorylations of which the response (inhibition) on the chip was predictive for the tumor cell proliferation (versele et al. 2009). 
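the on-chip comparison described above, profiling the same lysate with and without the drug, can be sketched as a per-peptide log ratio. this is only an illustration under stated assumptions: the function name, the peptide identifiers, the signal values, and the choice of a log2 ratio with a signal floor are all hypothetical and are not taken from versele et al. or from any vendor software.

```python
import math


def inhibition_profile(signal_without_drug, signal_with_drug, floor=1.0):
    """Per-peptide log2 ratio of the signal with drug over the signal without drug.

    Negative values indicate that phosphorylation of the peptide was inhibited
    on the chip. `floor` is an assumed lower bound that avoids taking the log
    of zero or of background-subtracted negative signals.
    """
    profile = {}
    for peptide, control in signal_without_drug.items():
        treated = signal_with_drug[peptide]
        profile[peptide] = math.log2(max(treated, floor) / max(control, floor))
    return profile


# hypothetical fluorescence signals for three peptide spots
control = {"pep_A": 800.0, "pep_B": 400.0, "pep_C": 100.0}
treated = {"pep_A": 200.0, "pep_B": 400.0, "pep_C": 50.0}

profile = inhibition_profile(control, treated)
# peptides ordered from most to least inhibited
most_inhibited = sorted(profile, key=profile.get)
print(profile)
print(most_inhibited)
```

in a real study, such profiles from many cell lines or patient samples would then be related to the observed drug response by a statistical model; that step is not shown here.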
this concept (fig. 7.6 ) is now being explored in clinical studies by my research group in collaboration with the netherlands cancer institute (nki), the vu medical center, and other cancer centers in both the usa and japan. in a study presented at asco in 2011 on neoadjuvant treatment of non-small-cell lung cancer with the egfr kinase inhibitor erlotinib, we showed that candidate biomarkers could be identified. on-chip peptide phosphorylations and inhibitions were correlated to clinical responses. with no information on the pathological assessment of the resection tissues available to the testers, a model built on those profiles could still predict the pathological response (hilhorst et al. 2011) . it should be noted that resection tissue was used and not pretreatment biopsies which is needed to make this into a it could be possible to apply this principle of drug testing on patient-derived tissues to other targeted therapies as well. in addition, if other protein classes are targeted, for example, phosphatases, proteases, nuclear receptors, acetyltransferases, histone deacetylases, and methyltransferases, the target responses in patient-derived samples could be tested using a peptide microarray. the nonfocused, nonbiased, and global profiling nature of the arrays allows parallel monitoring of drug targets and class-related nontargets. these nontargets can be functional proteins involved in the mechanisms of resistance and are therefore possibly very useful markers for predicting resistance to targeted therapies. finally, nontargeted therapies such as chemoradiation could also be accompanied in the future by such testing methods, as was shown in a recent publication by a norwegian group. they generated kinase activity profiles of tens of biopsies taken before patients were treated and could identify peptide phosphorylation patterns that correlated to the tumor regression grade after therapy. 
they generated a response prediction model that could predict the responses of a newly tested set of patients with promising accuracy (folkvord et al. 2010).

thomas felgenhauer, ralf bischoff, frank breitling, and volker stadler

several sophisticated methods are in use worldwide to produce peptide microarrays. each of these methods has its special advantages and drawbacks. high amounts of identical oligomers are achievable on cellulose supports via spot synthesis (frank 1992; dikmans et al. 2006), but spot densities are very low due to droplet handling. with photochemical methods where chain growth is induced by a laser beam, very small spot sizes and high spot densities are possible (fodor et al. 1991a, b; lipshutz et al. 1999). in this case, the drawback is in the sequential use of monomer solutions which might be acceptable in dna synthesis (four monomers), but the yield is dramatically reduced when the number of needed coupling cycles increases as in the case of standard peptide synthesis where a minimum of 20 individual cycles are needed to complete a fully combinatorial layer. the use of a laser printer as synthesis machine makes it possible to overcome the obstacles of the methods described above. solid particles (toners) carrying the reactive building blocks are printed in parallel in high resolution to a desired support. a full combinatorial layer is developed at once, like a color picture printout, and the coupling cycles are reduced from 20 to a single one per layer (stadler et al. 2008). a commercial laser printer uses small solid toner particles (~10 µm) that are triboelectrically charged by friction inside a toner cartridge drum system. because of the materials involved, this procedure leads to strong electrical charges on the particle surfaces, which enables the directional movement of the particles within electrical fields. a laser beam or an led row translates 2d light patterns into electrical patterns on top of an organic photoconductor drum. 
these images are developed with the charged toner particles that are finally transferred to the support. at office applications, a color laser printer system delivers four different color toners (black, cyan, yellow, and magenta) on a sheet of paper with a resolution of 1,200 up to 2,400 dpi. the polymer-based toner particles are fixed to the cellulose support by heat. the main challenge in combinatorial synthesis is to deliver different kinds of monomers with high accuracy to their designated reaction partner or reaction site. whereas a color laser printer delivers only four toners, a peptide synthesizer based on the xerographic technique should be able to handle at least 20 different building blocks for basic peptide synthesis or other feasible monomers for the production of peptide mimetics (amino acids in d-form, methylated, phosphorylated derivatives, nonnatural versions). in addition to the great flexibility of the synthesizer, an exact positioning of consecutively printed layers is the basic requirement for the parallel elongation of combinatorial assembled oligomer chains. with increasingly better printing accuracy, the spot density also increases, as well as the diversity of synthesized peptides. to benefit from the laser printer as delivery machine for monomers in combinatorial chemistry, the toner particles (delivery packages) have to be modified for this chemical purpose. in addition to their properties as solid, electrically charged particles, they also need the attributes of a solvent once melted. this change of properties happens after the particles have been addressed to their designated reaction site, where they are transformed into a liquid sphere simply by melting. thereby, activated monomers are mobilized, which allows them to diffuse to their reaction partner for chain elongation. these very special solid/liquid characteristics of the toner particle depend on the choice of the appropriate matrix material. 
on the one hand, this material should withstand the harsh mechanical treatment inside the printer (e.g., friction, charging, transport); on the other hand, the liquefaction at moderate temperatures (<100 °C) is fundamental in order to perform as a solvent for a chemical reaction. in addition, the matrix material should protect the reactive monomers from ambient conditions during long-term storage in cartridges, and finally, the material itself must be inert toward the components inside. since all the different monomer particles are addressed in parallel by the printer, all the different activated amino acid derivatives within a completed layer of amino acid particles are activated at once in a single melting step. this feature is the main advantage of our technique. washing and deprotection steps that follow after the coupling step finish the cycle, and result, if repeated, in the combinatorial synthesis of a peptide array (fig. 7.7). our method uses conventional fmoc chemistry (chan and white 2000) and differs from the spot synthesis only in the solvent we employ: it is solid at room temperature, which allows for the intermittent immobilization of chemically highly activated amino acid derivatives within particles. this activation is due to a c-terminal pentafluorophenyl ester in combination with n-terminal fmoc protection and standard side chain protecting groups. surprisingly, when embedded in particle matrix, these very reactive chemicals proved to be stable for months at room temperature, an exception being fmoc-arginine-opfp that shows a decay of 4% per month. however, if compared to the much faster decay of activated arginine esters in solution, this decay is negligible. the surface-coated solid support must provide free amino groups that react with preactivated amino acid derivatives. 
it must withstand harsh conditions during peptide synthesis (solvents, bases, strong acids during final cleavage of side chain protecting groups) and, postsynthesis, it must allow for the incubation of arrays with an analyte, for example, an antibody solution. we employ 30-100 nm thick 3d polymer coatings that have a high loading capacity (high density of amino groups). alternatively, essentially 2d layers are used as solid supports for peptide synthesis. these are generated from functionalized silanes that also withstand the conditions during peptide synthesis and, depending on the assay system used, sometimes perform better than the polymer coating. such surfaces are described in detail elsewhere (beyer et al. 2006; stadler et al. 2007). the peptide laser printer at pepperprint gmbh (see fig. 7.8a, b) has 24 printing units assembled in a row. twenty of these toner cartridges, which are based on the oki system, are equipped with fmoc-amino acid esters; the four remaining cartridges are used for nonstandard amino acid particles. the printer works with micron resolution, which currently allows for the synthesis of 270,000 peptides on a (20 × 20) cm² glass substrate. this corresponds to a spot density of ~800 spots per cm². currently, the machine is used for the synthesis of up to 20meric peptides. our particle-based synthesis method makes available to the scientific community, for the first time, very high-density peptide arrays. this is, in technical terms, the main novelty of this method (table 7.1). as such, this method will certainly soon approach the number of different molecules that nature's screening systems employ. these use, for example, millions of randomly generated antibodies to screen for binders against virtually any target molecule, among them nonnatural molecules that have never been encountered by evolution. 
peptide arrays with natural amino acids (l-form) have been used in the past mainly as protein fragment libraries in proteome research, or as diagnostic tools for serum screening and for antibody profiling. for the development of novel therapeutics, often nonnatural amino acids (e.g., d-amino acids) were integrated at critical positions within the peptide sequence in order to increase the metabolic stability of the peptides. however, due to the use of only low-density peptide arrays, up to now, such screens usually had to rely on extensive previously available knowledge. typically, a peptide sequence that was already known to bind to the target was then modified to improve the stability or the binding affinity, or an already known antigenic protein sequence was used to generate many overlapping peptides in order to narrow down an antibody's binding epitope. in the near future, we will certainly see that very high-density and affordable peptide arrays will be used to find binders without extensive previous knowledge about the sequence of a potential binder. similar to nature's screening systems, a vast number of different peptides should be sufficient to find binders against nearly any target molecule, and different from surface display techniques, the array format will allow for an easy and unequivocal discrimination of specific from nonspecific binders. very high-density peptide arrays will be used for such screens that have been cleared from all those peptides that were found to bind to more than a few different target proteins. it is practically impossible to avoid such nonspecific binders in all the surface display methods. 
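the overlapping-peptide approach to epitope mapping mentioned above can be sketched in a few lines of python. this is only an illustrative sketch: the function name, the 15-residue peptide length, the offset of 3 residues and the 30-residue antigen sequence are all assumptions chosen for the example and are not taken from the text.

```python
def overlapping_peptides(protein, length=15, offset=3):
    """Tile a protein sequence into peptides of `length` residues.

    Consecutive peptides are shifted by `offset` residues, so neighbours
    overlap by (length - offset) residues. Both defaults are illustrative.
    """
    if length <= 0 or offset <= 0:
        raise ValueError("length and offset must be positive")
    if len(protein) < length:
        return []
    last_start = len(protein) - length
    return [protein[i:i + length] for i in range(0, last_start + 1, offset)]


# Hypothetical 30-residue antigen, used only to show the tiling.
antigen = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKL"
peptides = overlapping_peptides(antigen, length=15, offset=3)

for start, peptide in zip(range(0, len(antigen), 3), peptides):
    print(start, peptide)
```

a smaller offset gives finer epitope resolution at the cost of more spots per antigen, which is where a higher spot density becomes useful.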
thus, the high combinatorial diversity given by the laser printer method should increase the possibility to discover potent [...]. low-density peptide arrays have been used previously to narrow down the binding epitope(s) of binding antibodies by synthesizing many overlapping peptides derived from the sequence of a known antigen that has been used, for example, to immunize an animal; however, such experiments used to be prohibitively expensive. high-density peptide arrays are cheaper; they simply allow more of these experiments. it should be feasible in the future, for example, to routinely monitor the kind of antibodies that evolved in a mouse that was immunized with a protein. the experimenter could then use only those mice that evolved an antibody specific for an especially interesting epitope within the protein that was used for immunization, thus saving a lot of time and money by using only selected mice for the generation of hybridomas. as an example, we show the results obtained with sera from four different rabbits that were immunized with protein a or the closely related protein b (fig. 7.9). only one of the two rabbits immunized with protein a, and one of the two immunized with protein b, generated specific antibodies, while the other two rabbits did not generate protein a- or protein b-specific antibodies. the staining pattern revealed that rabbit 1 developed several antibodies that were specific for the c-terminal region of protein a, while rabbit 3 developed antibodies that targeted the n-terminal region of protein b. when scrutinizing the peptide sequences, neither productive serum revealed cross-reacting antibodies against determinants from the closely related proteins a and b. a biomarker is a substance that is used as an indicator of a biological state. the biomarker serves as an indicator to measure and evaluate normal or pathogenic biological processes, or pharmacologic responses to a therapeutic intervention. 
especially useful are biomarkers that are found in human sera. these are a rich and accessible source for the detection of diagnostic markers in human diseases. serum antibodies have been used extensively for diagnosis, for example, to detect flu antibodies in a patient. such antibodies are often found in a patient decades after infection and can easily be analyzed with peptide arrays. especially interesting are those (peptide-)antigens (and their corresponding antibodies) that are targeted by an immune response. these are useful as biomarkers for drug discovery and diagnostics. however, the most interesting scientific question in biomarker discovery is as follows: can we find novel antigens and their corresponding antibodies without previous knowledge about the antigen? figure 7.10a shows that such a scientific question could be answered with very high-density peptide arrays. a randomized high-density array of stochastically chosen 15meric peptides (only a detail view of ~5,000 structures is shown) was used to find peptide binders for the flag m2 antibody with its known binding epitope nnndyknnnd/ennn. indeed, we could find six weak binders in a first screen. sequences from all of these initial hits were then used in a follow-up screen that stained the completely permutated sequences from these initial binders. figure 7.10b shows such a permutation screen that started with the sequence "ecwgdyksmecadwh" found as an initial hit. this sequence, and all the other five hits from the initial screen, then revealed either the sequence nnndyknnnennn or nnndyknnndnnn, i.e., the flag m2 epitope. all amino acid positions depicted by "n" could be exchanged for other amino acids. it remains to be seen whether such a screen could also be employed to find novel biomarkers when staining stochastically chosen very high-density peptide arrays with serum antibodies derived from patients with enigmatic diseases. many protein-protein interactions are mediated by short linear motifs. 
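the follow-up screen on "completely permutated sequences" can be illustrated with a short python sketch. it assumes, as an interpretation that the text does not spell out, that the permutation screen is a single-position substitution scan in which every position of the initial hit is replaced in turn by each of the other 19 standard amino acids. the function name and the restriction to the 20 standard residues are assumptions of the example.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def substitution_scan(peptide):
    """Return all single-residue substitution variants of `peptide`.

    Each position is exchanged for every standard amino acid except the
    one already present, so a peptide of length n yields 19 * n variants.
    """
    peptide = peptide.upper()
    variants = []
    for position, wild_type in enumerate(peptide):
        for residue in AMINO_ACIDS:
            if residue != wild_type:
                variants.append(peptide[:position] + residue + peptide[position + 1:])
    return variants


# Initial hit reported in the text (printed there in lower case).
hit = "ECWGDYKSMECADWH"
variants = substitution_scan(hit)
print(len(variants), "variants for a", len(hit), "residue peptide")
```

positions at which most substitutions abolish staining would then mark the residues that the antibody actually requires, while tolerant positions correspond to the "n" positions of the motif.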
some of these interactions are deregulated in diseases and thus are potential targets for modulating peptide-based drugs. these peptides should interfere with the protein-protein interactions by binding to one of the partners, either activating or inhibiting the signals that depend on the respective proteins. historically, and as stated above, peptide drugs have been based, for example, upon the optimization of natural peptide hormones but, more recently, novel peptides are being developed that have been isolated from combinatorial recombinant libraries. the idea is to offer such a large number of different potential peptide-based binders that simply by chance any protein would "find" at least one binder among a plurality of different peptides. however, the current size limits of peptide arrays and the associated costs had made it, until now, unrealistic to conduct such comprehensive profiling screens. the production of microarrays by laser printing uses a novel chemical concept where an activated monomer is encapsulated within solid particles that are sent to different "addresses" on a surface displaying reactive groups. thereby, an established technology is used for the rapid construction of a densely spaced pattern of different kinds of particles. these comprise different building blocks that are initially "frozen" at room temperature within solid particles. thawing these particles at once leads to a single coupling step per layer, which is the main advantage of this method when compared to sequential coupling, for example, in lithographic methods. our particle-based method is particularly well suited for automation and, thereby, results in a drastically reduced cost per peptide spot. it brings affordable high-density peptide arrays within reach of normal laboratories and may have an impact similar to the one that high-density oligonucleotide arrays had in the field of genomics. 
peptide arrays for screening cancer specific peptides deciphering the antibodyome-peptide arrays for serum antibody biomarker diagnostics functional peptide microarrays for specific and sensitive antibody diagnostics proteomic techniques and activity-based probes for the system-wide study of proteolysis a novel glass slide-based peptide array support with high functionality resisting non-specific protein adsorption light-directed maskless synthesis of peptide arrays using photolabile amino acid monomers an improved method for the synthesis of cellulose membrane-bound peptides with free c termini is useful for pdz domain binding studies protein kinase signaling networks in cancer fmoc solid phase peptide synthesis-a practical approach substrate specificities of catalytic fragments of protein tyrosine phosphatases (hptp beta, lar, and cd45) toward phosphotyrosylpeptide substrates and thiophosphotyrosylated peptides as inhibitors peptide-protein microarrays for the simultaneous detection of pathogen infections a peptide-based target screen implicates the protein kinase ck2 in the global regulation of caspase signaling solas d (1991a) light-directed, spatially addressable parallel chemical synthesis prediction of response to preoperative chemoradiotherapy in rectal cancer by multiplex kinase activity profiling spot-synthesis: an easy technique for the positionally addressable, parallel chemical synthesis on a membrane support bl€ ocker h (1983) a new general approach for the simultaneous chemical synthesis of large numbers of oligonucleotides: segmental solid supports light directed massively parallel on-chip synthesis of peptide arrays with t-boc chemistry activity-based high-throughput determination of ptps substrate specificity using a phosphopeptide microarray use of peptide synthesis to probe viral antigens for epitopes to a resolution of a single amino acid printing chemical libraries on microarrays for fluid phase nanoliter reactions high throughput substrate 
specificity profiling of serine and cysteine proteases using solution-phase fluorogenic peptide microarrays sars corona virus peptides recognized by antibodies in the sera of convalescent cases estrogenic effects in the immature rat uterus after dietary exposure to ethinylestradiol and zearalenone using a systems biology approach peptide microarrays for detailed, high-throughput substrate identification, kinetic characterization, and inhibition studies on protein kinase a peptide arrays on cellulose support: spot synthesis, a time and cost efficient method for synthesis of large numbers of peptides in a parallel and addressable fashion defining the specificity space of the human src homology 2 domain optimal surface chemistry for peptide immobilization in on-chip phosphorylation analysis genome to kinome: species-specific peptide arrays for kinome analysis a microarray strategy for mapping the substrate specificity of protein tyrosine phosphatase nuclear receptor-coregulator interaction profiling identifies trip3 as a novel peroxisome proliferatoractivated receptor gamma cofactor anaplastic lymphoma kinase inhibition in non-small-cell lung cancer proteintyrosine kinase activity profiling in knock down zebrafish embryos developing site-specific immobilization strategies of peptides in a microarray high density synthetic oligonucleotide arrays the specificity of the transforming growth factor beta receptor kinases determined by a spatially addressable peptide library a study to assess the cross-reactivity of cellulose membrane-bound peptides with detection systems: an analysis at the amino acid level photolithographic synthesis of high-density oligonucleotide arrays automated synthesis of peptides solid-phase peptide syntheses current technologies for her2 testing in breast cancer microarray profiling of antibody responses against simian-human immunodeficiency virus: postchallenge convergence of reactivities independent of host histocompatibility type and vaccine 
regimen dynamic control of signaling by modular adaptor proteins light-generated oligonucleotide arrays for rapid dna sequence analysis strategies for kinome profiling in cancer and potential clinical applications: chemical proteomics and array-based methods development of selective bisubstrate-based inhibitors against protein kinase c (pkc) isozymes by using dynamic peptide microarrays blind prediction of response to erlotinib in early-stage non-small cell lung cancer (nsclc) in a neoadjuvant setting based on kinase activity profiles peptide microarrays for the determination of protease substrate specificity kinome profiling of chondrosarcoma reveals src-pathway activity and dasatinib as option for treatment high-content peptide microarrays for deciphering kinase specificity and biology peptide arrays for kinase profiling automated maskless photolithography system for peptide microarray synthesis on a chip maskless fabrication of light-directed oligonucleotide microarrays using a digital micromirror array peptide microarrays for a network analysis of changes in molecular interactions in cellular signalling multifunctional cmos microchip coatings for protein and peptide arrays combinatorial synthesis of peptide arrays with a laser printer peptide microarrays for the detection of molecular interactions in cellular signal transduction peptide microarray for highthroughput determination of phosphatase specificity and biology high-throughput screening of catalytically inactive mutants of protein tyrosine phosphatases (ptps) in a phosphopeptide microarray high density peptide microarrays for proteome-wide fingerprinting of kinase activities in cell lysates deciphering enzyme function using peptide arrays the use of semi-parametric mixed models to analyze pamchip peptide array data: an application to an oncology experiment identifying specificity profiles for peptide recognition modules from phage-displayed peptide libraries bayesian modeling of the yeast sh3 domain 
interactome predicts spatiotemporal dynamics of endocytosis proteins cyclooxygenase-2 inhibition inhibits c-met kinase activity and wnt activity in colon cancer preparation of novel alkylated arginine derivatives suitable for click-cycloaddition chemistry and their incorporation into pseudosubstrate-and bisubstrate-based kinase inhibitors response prediction to a multitargeted kinase inhibitor in cancer cell lines and xenograft tumors using high-content tyrosine peptide arrays with a kinetic readout from promiscuity to precision: protein phosphatases get a makeover screening combinatorial libraries by mass spectrometry. 2. identification of optimal substrates of protein tyrosine phosphatase shp-1 pnaencoded protease substrate microarrays a special cellulose membrane support for the combinatorial and parallel synthesis of peptide libraries suitable for the sc2-type manufacturing of high density multi-purpose chemical micro-arrays protein tyrosine phosphatase substrate specificity: size and phosphotyrosine positioning requirements in peptide substrates a protein phosphatase-1-binding motif identified by the panning of a random peptide display library key: cord-258468-52gej3co authors: marcekova, zuzana; psikal, ivan; kosinova, eva; benada, oldrich; sebo, peter; bumba, ladislav title: heterologous expression of full-length capsid protein of porcine circovirus 2 in escherichia coli and its potential use for detection of antibodies date: 2009-08-05 journal: j virol methods doi: 10.1016/j.jviromet.2009.07.028 sha: doc_id: 258468 cord_uid: 52gej3co a capsid protein of porcine circovirus 2 (pcv 2) serves as a diagnostic antigen for the detection of pcv 2-associated disease known as a postweaning multisystemic wasting syndrome (pmws). in this report, a bacterial expression system was developed for the expression and purification of the full-length pcv 2 capsid (cap) protein from a codon-optimized cap gene. 
replacement of rare arginine codons located at the 5′ end of the cap reading frame with codons optimal for e. coli was found to overcome the poor expression of the viral protein in the prokaryotic system. the cap protein was purified to greater than 95% homogeneity by a single cation-exchange chromatography step at a yield of 10 mg per litre of bacterial culture. despite the failure of the e. coli-expressed cap protein to self-assemble into virus-like particles (vlps), the immunization of mice with recombinant cap yielded antibodies with the same specificity as those raised against native pcv 2 virions. in addition, the antigenic properties of the purified cap protein were employed in a subunit-based indirect elisa to monitor the levels of pcv 2 specific antibodies in piglets originating from a herd which was experiencing pcv 2 infection. these results pave the way for a straightforward large-scale production of the recombinant pcv 2 capsid protein and its use as a diagnostic antigen or a pcv 2 subunit vaccine. porcine circoviruses (pcv) are small non-enveloped, single-stranded circular dna viruses of the circoviridae family. two distinct types of pcv have been described: the non-pathogenic pcv type 1 (pcv 1) (tischer et al., 1982) and the pathogenic pcv type 2 (pcv 2) (nayar et al., 1997), which is associated with a newly emerged disease called postweaning multisystemic wasting syndrome (pmws) (clark, 1997). four- to twelve-week-old piglets which are affected with pmws display clinical symptoms of wasting, respiratory distress, anaemia, diarrhoea, jaundice and enlarged lymph nodes (for review, see chae, 2004). pmws is considered to be an important porcine disease worldwide which is reported to have a serious economic impact on the global pig farming industry. the 1.77 kb pcv 2 genome contains three functional open reading frames (orfs). 
orf1 encodes several forms of non-structural replicase proteins (mankertz and hillenbrand, 2001; mankertz et al., 1998) , orf2 encodes the capsid protein (nawagitgul et al., 2000) , and orf3 encodes a 105-amino acid protein which appears to be involved in virus-induced apoptosis of infected cells (liu et al., 2005) . the capsid protein is a unique structural protein of the viral coat (nawagitgul et al., 2000) that is formed by 60 protein subunits in an icosahedral t = 1 capsid structure (crowther et al., 2003) . the n-terminus of the capsid protein is rich in basic amino acid residues and displays a nuclear localization signal (nls) which is important for capsid assembly (liu et al., 2001a) . the capsid protein is highly immunogenic and reacts strongly with serum from pcv 2-infected pigs (mahe et al., 2000; fenaux et al., 2003) , and thus, it is the preferred antigen in a variety of serological tests (nawagitgul et al., 2002; blanchard et al., 2003; liu et al., 2004; shang et al., 2008) . the expression of recombinant capsid protein was achieved successfully in the baculovirus system (nawagitgul et al., 2000; mahe et al., 2000; blanchard et al., 2003; fan et al., 2008) , where the expressed capsid protein assembled spontaneously into virus-like particles (vlps) (nawagitgul et al., 2000; liu et al., 2008) . vlps represent non-infectious structures which completely lack the dna or rna genome of the virus. therefore, vaccines based on vlps trigger an immune response to the host immune cells by mimicking native virus morphology (ludwig and wagner, 2007) . however, the production of vlps for vaccination in eukaryotic systems is costly for veterinary applications. on the other hand, heterologous protein expression in e. coli offers an alternative to the production of large amounts of protein. the expression of full-length capsid protein in a standard bacterial expression system, such as e. coli bl21 (de3), has not been reported. 
only certain regions of the cap protein (wu et al., 2008), a fusion protein with maltose-binding protein (liu et al., 2001b) or a truncated variant of cap lacking the nls (zhou et al., 2005; trundova and celer, 2007) have been expressed in e. coli. in this study, a genetic approach which takes advantage of a codon-optimized synthetic oligonucleotide and a synthetic codon-optimized gene is described to obtain the high-level production of the full-length pcv 2 capsid protein in e. coli bl21 (de3) cells. the purified capsid protein is used as antigen to develop an indirect elisa for monitoring the levels of pcv 2 specific antibodies in piglets originating from a herd experiencing pcv 2 infection. the czech field-strain isolate of porcine circovirus type 2 (l-14181, brno, czech republic) was used in this study. the virus stock was prepared from the supernatant of organ homogenate from a pig which fulfilled the diagnostic criteria for pmws (sorden, 2000). samples of enlarged lymph nodes were pooled and homogenised in a fivefold volume of phosphate-buffered saline (pbs, ph 7.2). two volumes of chloroform were added to 10 volumes of the homogenate and the mixture was shaken at 20 °c for 10 min and centrifuged at 3000 × g for 15 min. the supernatant was layered onto a cscl density cushion (1.3 g/ml) and centrifuged at 60,000 × g for 4 h in a beckman sw 60ti rotor (beckman coulter, fullerton, usa). the purified pcv 2 virions were resuspended in pbs and subsequently used to infect the circovirus-free pk15 cells which were maintained in a d-mem medium (paa laboratories, pasching, austria) supplemented with 10% heat-inactivated fetal calf serum (gibco, invitrogen, carlsbad, usa) at 37 °c with 5% co2. the viral dna was purified from pk15-infected cells using a dnazol genomic dna isolation reagent (molecular research center, cincinnati, usa) according to manufacturer's instructions. 
the genomic dna of the pcv 2 isolate was sequenced with an abi prism 3130xl analyzer (applied biosystems, foster city, usa) using the bigdye terminator 3.1 cycle sequencing kit (applied biosystems, foster city, usa). the nucleotide sequence encoding the cap protein was 99.9% identical with that of orf2 of pcv 2 strain fd4 (genbank accession no. ay321986) (de boisséson et al., 2004). a 707 bp sequence encoding the cap protein was amplified using polymerase chain reaction (pcr) with the following primers: the upstream primer 5′-ccccatggcgatgacgtatccaaggaggc-3′ containing the ncoi site and the downstream primer 5′-tgtctcgagagggttggggggtc-3′ containing the xhoi site. the purified pcr product was cut with ncoi and xhoi (new england biolabs, ipswich, usa), cloned into the pet28b expression vector (novagene, merck kgaa, darmstadt, germany), and the vector was designated as pet28b-cap-his (fig. 1b). to generate a truncated version of the cap protein (Δcap-his) lacking the first n-terminal 16 amino acid residues, the pet28b-cap-his was used as a template for pcr with the following primers: the upstream primer 5′-agaccatgggcagccatcttggccagatc-3′ containing the ncoi site and the downstream primer 5′-tgtctcgagagggttggggggtc-3′ containing the xhoi site. the 669 bp pcr product was cut with ncoi and xhoi, cloned into the pet28b vector, and designated as pet28b-Δcap-his (fig. 1b). for the construction of a vector carrying the codon-optimized nucleotide sequence of the first 19 amino acid residues of the cap protein, the pet28b-cap-his expression vector was cut with ncoi and msci. the resulting fragment was ligated with a dsdna adaptor (sense oligonucleotide 5′-catgacctatccgcgccgccgttaccgccgccgccgccaccgcccgcgcagccatctggg-3′ and anti-sense oligonucleotide 5′-cccagatggctgcgcgggcggtggcggcggcggcggtaacggcggcgcggataggt-3′) creating the ncoi and msci overhangs at the 5′ and 3′ ends of the adaptor molecule, respectively. this construct was designated as pet28b-o-cap-his (fig. 1b). 
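the primer and adaptor design described above can be sanity-checked with a short python sketch. the sequences are those printed in the text with the line-break hyphens removed; the variable and function names are assumptions of the example. the check confirms that the primers carry the stated ncoi (ccatgg) and xhoi (ctcgag) recognition sites and that the two adaptor oligonucleotides anneal to leave a 5′ catg overhang on the sense strand, which is what an ncoi-cut end requires. msci is generally described as a blunt cutter, so the other end of the annealed adaptor would be expected to be flush.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Reverse complement of a lower-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


NCOI_SITE = "ccatgg"
XHOI_SITE = "ctcgag"

# Primers as printed in the text (5' to 3').
cap_upstream = "ccccatggcgatgacgtatccaaggaggc"
cap_downstream = "tgtctcgagagggttggggggtc"
truncated_upstream = "agaccatgggcagcc" "atcttggccagatc"

# Adaptor oligonucleotides, joined at the printed line breaks.
adaptor_sense = "catgacctatccgcgccgccgttacc" "gccgccgccgccaccgcccgcgcagccatctggg"
adaptor_antisense = "cccagatggctgcgcgggcggtggcggcggcg" "gcggtaacggcggcgcggataggt"

sites_ok = (
    NCOI_SITE in cap_upstream
    and NCOI_SITE in truncated_upstream
    and XHOI_SITE in cap_downstream
)

# Bases of the sense strand left unpaired at its 5' end after annealing.
overhang_length = len(adaptor_sense) - len(adaptor_antisense)
overhang = adaptor_sense[:overhang_length]
duplex_ok = adaptor_sense[overhang_length:] == reverse_complement(adaptor_antisense)

print("restriction sites present:", sites_ok)
print("5' overhang of sense strand:", overhang)
print("remaining bases fully paired:", duplex_ok)
```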
a codon-optimized gene encoding the pcv 2 capsid protein was obtained from genscript (piscataway, usa) and the nucleotide sequence was deposited in genbank (accession no. eu376523). the plasmid carrying the nucleotide sequence of the cap protein was cut with ncoi and xhoi and cloned into the pet28b expression vector. to enhance the expression of the synthetic cap gene in e. coli, a pet42b expression vector was modified by replacement of the nucleotide sequence between the ndei and saci sites with a dna adaptor molecule annealed from a pair of synthetic oligonucleotides (sense oligonucleotide 5′-tatgcaccaccaccaccaccacgccatgggagct-3′ and anti-sense oligonucleotide 5′-cccatggcgtggtggtggtggtggtgca-3′) which encoded the n-terminal histidine tag of the synthetic cap gene. this construct was cut with ncoi and xhoi and ligated with the synthetic cap gene which had been cut with the same restriction enzymes to obtain the expression vector carrying s-cap with the flanking n- and c-terminal histidine tags (pet42b-s-cap) (fig. 1b). all the constructs were confirmed by dna sequence analysis with an abi prism 3130xl analyzer (applied biosystems, foster city, usa) using a bigdye terminator cycle sequencing kit. recombinant proteins were expressed in e. coli bl21 (de3) (novagene, merck kgaa, darmstadt, germany). cells containing the expression plasmid were grown at 37 °c in luria-bertani (lb) medium supplemented with 60 µg/ml kanamycin to an optical density of 0.6 at 600 nm and then isopropyl β-d-thiogalactopyranoside (iptg) (alexis corporation, lausen, switzerland) was added to a final concentration of 1 mm. after 4 h of growth, cells were harvested by centrifugation (4000 × g for 20 min) and resuspended in 100 mm tris-hcl (ph 8.0) containing 1 mm edta. the expression in e. 
coli bl21-codonplus(de3)-ripl strain (stratagene, agilent technologies, santa clara, usa) was done under the same conditions, except for the antibiotic concentrations: kanamycin (30 µg/ml), chloramphenicol (34 µg/ml) and spectinomycin (75 µg/ml). this strain contains extra copies of the argu, iley, prol, and leuw genes for rare trnas, which can rescue protein production restricted by either agg/aga or ccc codons. the expression of desired proteins was documented by sds-polyacrylamide gel electrophoresis (sds-page) and western blot analysis. the bacterial pellet was washed and resuspended in sonication buffer (50 mm mes, ph 5.8, 50 mm nacl) containing 1 mm phenylmethylsulfonyl fluoride. the cells were disrupted by sonication (45 w, misonix sonicator 3000, misonix, farmingdale, usa) on ice and the homogenate was centrifuged at 15,000 × g at 4 °c for 30 min. solid urea was added to a final concentration of 8 m, and mixed gently for 30 min. after centrifugation at 15,000 × g for 10 min at 4 °c, the supernatant was loaded onto a unosphere-s cation-exchange column (bio-rad, hercules, usa) which was connected to an akta prime chromatography system (ge healthcare, chalfont st. giles, united kingdom). after washing with 10 column volumes of sonication buffer, the cap protein was eluted with a continuous gradient of nacl (0-1 m) in a buffer containing 50 mm mes, ph 5.8 and 8 m urea. pooled fractions of purified cap protein were concentrated by using centricons (amicon mw 10000 cutoff, millipore, billerica, usa), and stored at −20 °c for further use. total protein concentration was determined by the bradford assay (bio-rad, hercules, usa) using bovine serum albumin as a standard. 
proteins were separated by sds-page using 12% polyacrylamide gels and either stained with coomassie brilliant blue r-250 or transferred to a nitrocellulose membrane (pall corporation, new york, usa) in a transfer buffer (20 mm tris-hcl, ph 8.3, 190 mm glycine, 0.1% sds, 20% methanol) using a te 70xp semi-dry transfer unit (hoefer, holliston, usa) at 1 v/cm² for 1 h. the membranes were blocked with 5% non-fat milk in pbs-t (phosphate-buffered saline containing 0.05% tween-20) and incubated with anti-his monoclonal antibody (1:5000) (sigma, st. louis, usa) followed by a peroxidase-conjugated anti-mouse antibody. the signal was developed using an enhanced chemiluminescence system (ge healthcare, chalfont st. giles, united kingdom). the experimental work with mice was conducted according to the certificate of animal welfare authority commission of the ministry of agriculture of the czech republic no. mze 926/08 in accordance with current czech legislation on animal welfare. a total of 15 balb/c mice (8 weeks of age) were used in the experiment. a group of 5 mice was bled before the immunization to obtain pre-immune serum. a group of 10 mice received an intramuscular injection of recombinant s-cap (25 µg protein per mouse) mixed with complete freund's adjuvant (sigma, st. louis, usa). three weeks later, the mice were boosted intramuscularly with the same dose of recombinant cap protein mixed with incomplete freund's adjuvant (sigma, st. louis, usa). the mice were euthanised and bled 24 days after the last injection, and the sera were screened for the presence of the cap-specific antibodies by using an immunoperoxidase monolayer assay (ipma). for ipma, confluent monolayers of pk15 cells were mixed with a reference pcv 2 isolate (stoon-1010, after 36 passages) and incubated at 37 °c for 48 h prior to fixation with 4% paraformaldehyde. 
the fixed cells were then incubated with mouse sera (1:50 dilution in pbs-t buffer) at 37 °c for 1 h, followed by incubation with a peroxidase-conjugated goat anti-mouse secondary antibody (1:500 dilution in pbs-t buffer) and the peroxidase activity was detected with 4-chloro-1-naphthol (bio-rad, hercules, usa) as a substrate. e. coli cells were washed twice in 50 mm phosphate buffer (ph 7.4) and fixed in a solution containing 2% glutaraldehyde in 50 mm phosphate buffer (ph 7.4) for 2 h. after three washes, the cells were post-fixed in buffered 1% oso4 (50 mm phosphate buffer, ph 7.4) for 1 h, washed three times in 50 mm phosphate buffer (ph 7.4), dehydrated in an ethanol series, and then transferred into absolute acetone and embedded in vestopal w resin (sigma, hercules, usa). ultrathin sections were cut with a glass knife using an lkb ultratome 1 (lkb, bromma, sweden) and mounted on formvar-coated copper grids. the grids were stained in saturated aqueous uranyl acetate followed by lead citrate. samples were examined under a philips cm100 electron microscope (fei, eindhoven, the netherlands) at 80 kv. images were recorded by using a megaviewii slow scan camera controlled by analysis 3.2 software (olympus soft imaging solutions, münster, germany). pcv 2 antibodies were determined with an indirect enzyme-linked immunosorbent assay (elisa) using the purified s-cap and cap-his proteins as coating antigens (cap-elisa). the 96-well elisa plates (nunc, thermo fisher scientific, waltham, usa) were coated with 50 µl of 100 mm sodium carbonate buffer (ph 8.9) containing 10 µg/ml of the proteins and incubated overnight at 4 °c. the plates were washed with phosphate-buffered saline (pbs) containing 0.05% tween-20 (pbs-t) and blocked with a blocking buffer (pbs containing 1% bovine serum albumin) at 37 °c for 1 h. 
after washing, 100 µl of each serum sample (diluted 1:100) was added into each well and tested in quadruplicate: two wells coated with a negative control antigen, glutathione s-transferase (sigma), and two parallel wells coated with the capsid antigens. the plates were washed with pbs-t, and 50 µl of blocking buffer containing a peroxidase-conjugated goat anti-swine antibody (bethyl laboratories, montgomery, usa) was added into each well and incubated for 1 h. finally, the plates were washed with pbs-t three times and the colour reaction was developed with 3,3′,5,5′-tetramethylbenzidine (tmb) as a substrate. the optical density at 450 nm (od450nm) was read using a microplate reader. the final od450nm value was calculated by subtracting the mean od450nm of the wells containing the negative antigen from that of the parallel wells containing the capsid proteins. the cut-off value was determined by using specific pathogen free (spf) pig serum samples diluted 1:40, which were negative for anti-pcv2 antibodies as determined by ipma. the mean od450nm value was 0.098 with a standard deviation (s.d.) of 0.035, and the final cut-off value was calculated as the mean od450nm value + 3 s.d., giving a value of approximately 0.200. pig serum with a high titre of pcv2 antibodies (1:5120, as determined by ipma) was used as a positive control (kindly provided by annette mankertz from the robert-koch institut, berlin, germany). the spf pig sera were negative for pcv2 and were used as negative controls in each plate. a limited serological and genomic load survey of specific pcv2 antibodies and equivalent pcv2 genome copy numbers, respectively, was carried out on pig sera collected at 4, 8, 12, 16 and 20 weeks of age. twenty-two piglets were selected from a pig herd experiencing pcv2 infection, marked with tags and monitored during their nursery and fattening period. at regular intervals, a minimum of 2 ml of blood was collected from each piglet and the sera were stored at −20 °c for further use.
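the od450nm correction and the cut-off rule described for cap-elisa can be sketched in a few lines of python. this is only an illustration under stated assumptions: the function names and the example well readings are hypothetical, the mean (0.098) and s.d. (0.035) are the values reported in the text, and treating a value at or above the cut-off as positive is an assumption, since the text does not say how ties are handled.

```python
from statistics import mean


def corrected_od(antigen_wells, control_wells):
    """mean od450nm of the capsid-antigen wells minus the mean od450nm
    of the parallel wells coated with the negative control antigen."""
    return mean(antigen_wells) - mean(control_wells)


def cutoff_from_negatives(neg_mean, neg_sd, k=3.0):
    """cut-off computed as mean + k * s.d. of the negative (spf) sera."""
    return neg_mean + k * neg_sd


def is_seropositive(od, cutoff=0.200):
    # assumption: a value at or above the cut-off is called positive
    return od >= cutoff


# values reported in the text: mean 0.098, s.d. 0.035
raw_cutoff = cutoff_from_negatives(0.098, 0.035)

# hypothetical duplicate readings for one serum sample
example_od = corrected_od([0.85, 0.81], [0.10, 0.12])

print(f"mean + 3 s.d. = {raw_cutoff:.3f}")
print(f"corrected od450nm = {example_od:.2f}, positive: {is_seropositive(example_od)}")
```

with the reported values, mean + 3 s.d. evaluates to 0.203, which is consistent with the rounded cut-off of 0.200 used in the assay.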
total genomic dna was extracted from a 200 µl volume of each serum sample using a nucleospin blood isolation kit (macherey-nagel, düren, germany) according to the manufacturer's instructions. quantification of pcv2 dna levels was performed by real-time pcr on a roche light cycler 480 (roche, basel, switzerland) in a taqman format as previously described (brunborg et al., 2004). in short, the taqman probe was labelled with 5′-fam and 3′-bhq1 fluorophores (generi biotech, hradec kralove, czech republic) and the primers were designed to amplify a 100 bp dna segment within the nucleotide sequence of the pcv2 cap gene. the absolute quantification of pcv2 dna was carried out using calibration curves generated by means of external standard dna obtained by cloning the pcv2 cap gene into a pcr 2.1 vector (invitrogen, carlsbad, usa). standard curves for pcv2 dna quantification were generated using tenfold dilutions of the linearised plasmids over a range of 8 log10. statistical analyses were carried out using the one-way analysis of variance (anova) test in spss software (base 14.0, spss, chicago, usa). values of p < 0.05 were considered significant. the open reading frame encoding the cap protein was amplified by pcr from genomic dna of pcv2 (l14181 isolate) and the 707 bp long dna fragment was cloned into the pet28b expression vector (fig. 1b). the expression was analysed in crude cell lysate by sds-page and western blotting. as demonstrated in fig. 2, no expression of cap-his was detected in coomassie-stained gels or by western blot using the anti-his antibody. frequently, an insufficient production of heterologous proteins in e. coli is due to the presence of codons that are used rarely in bacterial proteosynthesis (zahn, 1996).
forced high-level expression of genes with rare codons may lead to the depletion of the endogenous pools of the corresponding trnas, resulting in the abortion of the translation process and the degradation of the mrna (mcnulty et al., 2003). examination of the nucleotide sequence of the cap gene revealed a cluster of rare codons within the first 16 codons of the gene (fig. 1c). the presence of such rare codon tandems near the 5′ end of a coding sequence has been reported to cause ribosomal frameshifts and codon skipping with a strong inhibitory effect on protein translation (gurvich et al., 2005). in order to eliminate the cluster of rare codons from the 5′ end of the cap gene, an expression vector encoding a truncated variant of the cap protein (Δcap-his), which lacked the first 16 amino acid residues, was constructed (fig. 1b). expression of the truncated gene in iptg-induced e. coli bl21 (de3) cells resulted in the production of a protein with an apparent molecular weight of 26 kda (fig. 2a), which corresponded to the Δcap-his protein. this suggested that the removal of the first 16 codons from the 5′ end of the cap gene allowed the production of the cap protein in e. coli. expression of a truncated variant of the cap protein lacking the nuclear localization signal (nls) has been reported previously (zhou et al., 2005; trundova and celer, 2007). the nls consists of 41 amino acid residues at the n-terminus of cap, and the nls-defective constructs have also been expressed as fusion proteins with glutathione s-transferase (zhou et al., 2005) or maltose-binding protein (liu et al., 2001b). based on these results, the authors concluded that the expression of the capsid protein in e. coli was inhibited by the nucleotide sequence encoding the nls domain. however, the expression of the truncated cap protein lacking only the first 16 amino acids was achieved here, which suggested that the production of the full-length capsid protein in e. 
coli was hindered only by the occurrence of rare arginine codons near the 5′ end of the cap gene. the heterologous expression of the entire cap gene was also examined in e. coli bl21-codonplus(de3)-ripl cells. this strain contains extra copies of the argu, iley, prol, and leuw genes for rare trnas, which rescue protein production restricted by either agg/aga or ccc codons. the expression of the cap gene in iptg-induced e. coli bl21-codonplus(de3)-ripl yielded a protein with an apparent molecular weight of 27 kda, which corresponded to the full-length cap protein (fig. 2a). this indicated that the expression of rare aminoacyl-trnas allowed the production of the full-length cap protein. however, the use of e. coli bl21-codonplus(de3)-ripl for high-yield production of recombinant proteins is inconvenient, as the expression system depends on a combination of three different antibiotics: one for the maintenance of the expression vector, and the others for the maintenance of plasmids carrying trna genes. moreover, the presence of high amounts of antibiotics reduces the growth rate of the bacterial cells and makes cultivation expensive. a promising approach to maximise heterologous production of proteins in e. coli is a codon optimization strategy, which adapts the codon usage of the gene of interest to match the available trna pool within the cells (jana and deb, 2005; peti and page, 2007; burgess-brown et al., 2008). to facilitate cap gene expression in e. coli bl21 (de3), the sequence of the first 19 codons was optimized as described in fig. 1c. ten codons, rare in e. coli, were replaced by the most frequent ones, including six arginine, two proline, and single leucine and threonine codon substitutions. the codon-optimized sequence was introduced into pet28b-cap-his using ncoi and msci restriction sites to obtain pet28b-o-cap-his (fig. 1b). the expression of the o-cap-his gene in e. 
coli bl21 (de3) cells yielded a protein with a molecular weight of 27 kda (fig. 2a). the corresponding band was recognised by the anti-his antibody, which confirmed the presence of the o-cap-his protein (fig. 2b). these data indicated that the replacement of the first 19 codons of the cap gene with a codon-optimized sequence enabled the production of the cap protein in a conventional prokaryotic expression system, such as e. coli bl21 (de3). different bioinformatics tools are available for codon optimization, which consider many factors, such as rna secondary structure, gc content, and repetitive and rare codons (grote et al., 2005). these approaches were used to design a synthetic cap gene with a nucleotide sequence optimized for prokaryotic expression in e. coli (genbank accession no. eu376523). the synthetic gene with introduced ncoi and xhoi restriction sites was cloned into a pet28b vector carrying the c-terminal polyhistidine tag and the expression of this construct was analyzed by sds-page. however, no protein product was detected in the coomassie-stained gel (data not shown). to force the expression of the capsid protein from the synthetic gene, additional codons encoding a polyhistidine tag were added at the 5′ end of the gene. the expression of this construct (pet42b-s-cap) in e. coli bl21 (de3) yielded the s-cap protein with a molecular mass of 28 kda, which was slightly higher (+1 kda) than that of the o-cap-his protein due to the presence of the additional polyhistidine tag at the n-terminus of the protein (fig. 2a). the levels of s-cap protein expression were comparable to those of the o-cap-his protein, which indicated that codon optimization of the entire sequence of the cap gene did not significantly enhance the heterologous expression of the capsid protein in e. coli. the analysis of protein distribution in e. coli cells revealed that all the cap proteins were recovered in the soluble (cytoplasmic) fraction of the iptg-induced cell lysate.
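the idea of locating rare codons near the 5′ end and replacing them with synonymous frequent codons can be sketched as follows. this is a minimal illustration, not the procedure used for fig. 1c: the codon table is an assumed subset limited to the codons served by the argu, iley, prol and leuw trnas mentioned above (it does not cover the threonine substitution), the replacement codons are commonly cited frequent e. coli codons chosen for illustration, and the toy sequence and function names are hypothetical.

```python
# assumed, illustrative subset of codons that are rare in e. coli (dna alphabet),
# mapped to a synonymous codon that is frequently used in e. coli
RARE_TO_COMMON = {
    "AGA": "CGT",  # arginine
    "AGG": "CGT",  # arginine
    "ATA": "ATC",  # isoleucine
    "CCC": "CCG",  # proline
    "CTA": "CTG",  # leucine
}


def split_codons(cds):
    """split a coding sequence into codons, starting at position 1."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def rare_codon_positions(cds, first_n=None):
    """1-based positions of rare codons within the first `first_n` codons
    (all codons when first_n is None)."""
    codons = split_codons(cds)[:first_n]
    return [i + 1 for i, codon in enumerate(codons) if codon in RARE_TO_COMMON]


def optimize_prefix(cds, first_n):
    """replace rare codons in the first `first_n` codons only,
    leaving the rest of the gene untouched."""
    codons = split_codons(cds)
    head = [RARE_TO_COMMON.get(codon, codon) for codon in codons[:first_n]]
    return "".join(head + codons[first_n:])


# hypothetical toy sequence, not the pcv2 cap gene
toy = "ATGAGGAGACCCCTAATAGGTAGA"
print(rare_codon_positions(toy, first_n=6))
optimized = optimize_prefix(toy, first_n=6)
print(optimized)
```

restricting the substitution to a prefix mirrors the partial optimization of the first 19 codons; a whole-gene design would also need to weigh factors such as rna secondary structure and gc content, which this sketch ignores.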
however, initial attempts to purify native cap proteins gave unsatisfactory results. using standard chromatographic procedures, no significant binding of cap proteins was detected when the material was loaded on either cation-exchange or nickel-immobilised affinity columns under native buffer conditions (data not shown). to enable purification of the cap proteins, denaturing conditions in the presence of 8 m urea were used. the cytosolic extract of the cell lysates was supplemented with solid urea to a final concentration of 8 m and the mixture was loaded onto a unosphere-s cation-exchange column at ph 5.8. a typical elution profile of protein fractions from the unosphere-s column is shown in fig. 3a. fractions 1-14 contained large amounts of unrelated bacterial proteins, while the later fractions, eluting after 38 min, were enriched with cap proteins (fig. 3b). the identity of the cap proteins was also confirmed by western blotting with the anti-his antibody (fig. 3c). however, some apparent differences between the s-cap protein and the remaining capsid proteins were observed during the purification procedures. while the s-cap protein was recovered in high purity in a distinct peak eluting at 40 min (fig. 3b, fraction 17), both the o-cap-his and Δcap-his proteins were eluted from the column in a shorter time (about 38-39 min). moreover, these fractions were partially enriched with a protein with a molecular weight of about 30 kda and some low molecular weight proteins below 15 kda (data not shown). the original aim of using the polyhistidine tag was to isolate the cap proteins on a metal affinity chromatography column. however, the presence of two polyhistidine tags within the s-cap protein increased the basicity of the protein, which enabled simple purification of the protein by single-step cation-exchange chromatography.
the next purification steps did not increase the purity of the proteins (data not shown) and these steps were omitted from the purification protocol. the purity of the s-cap protein was greater than 95% after the cation-exchange chromatography and approximately 10 mg of purified protein was obtained from 1 l of bacterial cell culture. these data suggested that production of the s-cap protein is much more efficient than that of the o-cap-his and Δcap-his proteins in terms of purity, time and costs for large-scale production of the recombinant pcv2 capsid protein. to examine the immunogenicity of the s-cap protein, mice were immunized with the purified protein and the collected sera were analyzed for the presence of cap-specific antibodies. western blot analyses revealed that a single 28-kda band in the bacterial lysate enriched in the s-cap protein was recognised by the cap-positive sera, but not by the pre-immune sera (data not shown). moreover, a typical dense nuclear staining of pcv2-infected pk15 cells was seen using the cap-positive sera compared to the pre-immune sera by in situ immunohistochemistry (data not shown). these results documented the immunogenicity of the s-cap protein isolated under denaturing conditions. the majority of vlps have been produced using insect or mammalian cell expression systems, but some vlps have also been found to assemble in e. coli cells (noad and roy, 2003). in order to determine whether the pcv2 capsid protein is able to self-assemble into vlps in e. coli, the ultrastructure of the bacterial cells expressing the o-cap protein, which lacks any additional tag (fig. 1), was examined by electron microscopy (em) of ultrathin sections. as shown in fig. 4, no vlps were observed in either non-induced cells or e. coli cells expressing the o-cap protein.
in contrast, in vivo vlp assembly was detected unambiguously following the expression of a fusion protein containing the capsid and nucleocapsid protein obtained from the gag polyprotein of mason-pfizer monkey virus (ulbrich et al., 2006) (fig. 4a and b). in addition, sucrose density gradient fractionation of the cytosolic content of e. coli cells expressing the o-cap protein did not reveal any o-cap protein in the form of vlps. the absence of vlps was also corroborated by em of negatively stained preparations (data not shown). these results demonstrated that the pcv2 capsid protein was unable to self-assemble into vlps in the cytosol of e. coli cells. in contrast, self-assembly of the pcv2 capsid protein into vlps has been observed repeatedly in baculovirus expression systems (nawagitgul et al., 2000; kim et al., 2002; liu et al., 2008). the vlps were of similar morphology to intact pcv2 virions and appeared to be empty capsids with a less ordered structure (nawagitgul et al., 2000). however, the question of why the pcv2 capsid protein forms vlps in insect cells but not in e. coli remains to be addressed. sequence analysis of the pcv2 capsid protein revealed amino acid residues with consensus patterns for potential posttranslational modifications (n-glycosylation, phosphorylation) (liu et al., 2001b), which are known to occur in eukaryotic but not in prokaryotic expression systems. another explanation could be a specific need for chaperones, scaffolding proteins, or ssdna, which might play important roles in proper pcv2 assembly (ludwig and wagner, 2007). a plausible hypothesis is that the n-terminal portion of the pcv2 capsid protein is enriched with basic amino acids that could bind preferentially to the bacterial genomic dna and therefore prevent a regular arrangement of the capsid subunits into vlps. this hypothesis would be supported by the observation that the e. 
coli cells expressing cap protein produced large amounts of outer membrane vesicles (fig. 4d) compared to the non-induced cells or the cells expressing the capsid protein of mason-pfizer monkey virus. indeed, the release of outer membrane vesicles has been attributed to a novel envelope stress response (mcbroom and kuehn, 2006), which might result from the binding of the cap protein to genomic dna. the specificity and sensitivity of both the full-length capsid protein (s-cap) and its truncated variant (Δcap-his) for pcv2-positive pig sera were determined using an indirect elisa format (cap-elisa). using a chequer-board titration of pcv2-positive serum (ipma titre of 5120), the optimal antigen concentration and serum dilution were selected to be 10 µg/ml and 1:100, respectively. when the positive/negative cut-off value was set to 0.2 (see section 2), the detection of all the pcv2-positive sera tested was 100% specific (data not shown). cap-elisa showed a low background, and significant differences between pcv2-negative and pcv2-positive pig sera were detected (fig. 5). cap-elisa did not exhibit any cross-reactivity with pig sera obtained from spf pigs which were infected experimentally with the coronavirus causing transmissible gastroenteritis (tge) or porcine epidemic diarrhoea (ped), respectively. moreover, no significant differences in the specificity and sensitivity of cap-elisa were observed between the s-cap and Δcap-his proteins when used as coating antigens, indicating that the n-terminus of the capsid protein is not specifically recognised by antibodies in the pcv2-positive sera (wu et al., 2008). this showed that the recombinant full-length cap protein could be used as a coating antigen to develop the indirect cap-elisa for specific and sensitive detection of pcv2 antibodies.
fig. 5 (legend, partially recovered). ... cap-his proteins (10 µg/ml) was analyzed by a colorimetric reaction at the optical density at 450 nm (od450nm). od450nm of 0.2 represents a cut-off value for seropositivity of the samples. tge-specific and ped-specific pig sera were obtained from spf pigs which were experimentally infected with the coronavirus causing transmissible gastroenteritis (tge) and porcine epidemic diarrhoea (ped), respectively. data represent the means ± s.d. for the three independent experiments.
pcv2 antibodies are currently detected by indirect immunofluorescence, ipma, competitive elisa (walker et al., 2000) and by indirect elisa based on either pcv2 viral particles (nawagitgul et al., 2002) or recombinant pcv2 capsid protein expressed in baculovirus (nawagitgul et al., 2002; blanchard et al., 2003; liu et al., 2004). however, while both the indirect immunofluorescence and ipma assays are highly demanding and not suitable for large-scale surveys of pcv2 infection, current elisa formats are less specific and display antigenic cross-reactivity to non-pathogenic pcv1 (magar et al., 2000). recently, an indirect elisa based on the recombinant nls-truncated pcv2 capsid protein expressed in e. coli has been established by shang et al. (2008). although a direct comparison of the sensitivity and specificity of the full-length (this study) and the nls-truncated (shang et al., 2008) proteins as antigens would be interesting, it is very likely that both elisa formats are comparable in the detection of pcv2 antibodies. considering these data, the cap-elisa based on the recombinant full-length cap protein as coating antigen would provide a simple and reliable tool for standard serodiagnosis of pcv2 infection. to further analyze the capacity of cap-elisa to detect pcv2-specific antibodies, a limited serological and genomic load survey of pcv2 infection was carried out in piglets during the first 20 weeks of age.
the piglets' antibody response along with the total amount of pcv2 dna per 1 ml of serum was determined by cap-elisa and quantitative pcr, respectively. as shown in fig. 6a, the levels of pcv2-specific antibodies showed an initial drop within the first 12 weeks of age; from then on, the levels increased gradually until the 20th week. in contrast, the dna copy numbers of pcv2 increased gradually during the first weeks of age to reach the maximum level in the 12th week, after which they decreased slightly until the 20th week (fig. 6b). these results are in good agreement with the time course dynamics of pcv2 infection in newborn piglets (mckeown et al., 2005). a dramatic decrease of the pcv2-specific antibodies in the 12th week of age corresponds to the period in which the passive intake of maternal antibodies from breast milk during the weaning period (weeks 1-8) is discontinued. this results in the onset of pcv2 infection after the 8th week of age prior to the active production of the intrinsic pcv2-specific antibodies (weeks 12-20). taken together, these data confirmed that cap-elisa was developed with a high specificity and sensitivity for detection of pcv2 antibodies in pig sera and could be used for large-scale surveys of pcv2 infection at low cost.
fig. 6 (legend, partially recovered). ... the final values were expressed as a signal-to-positive (s/p) ratio. the s/p value was determined according to the following formula: (od450nm of sample − od450nm of negative control)/(od450nm of positive control − od450nm of negative control). an s/p value of 0.3 represents the cut-off signal for seropositivity of the samples. (b) time course analysis of the pcv2 dna copy numbers per 1 ml of serum as detected by the quantitative pcr. total genomic dna was extracted from 200 µl of each serum sample and quantified by the pcr amplification of a 100 bp dna segment within the nucleotide sequence of the pcv2 cap gene using a taqman format. results of statistical analysis using one-way analysis of variance (anova) between serum samples obtained at given time intervals from 22 piglets are represented by the box plots. the median value for each dataset is indicated by the black center line. the vertical height of each box indicates the 25-75% data range. the upper and lower bars denote the largest and smallest data values. the marker ( ) denotes the extreme value of the individual serum sample that is >1.5 times the inter-quartile range from the upper and lower quartile. the marker (●) denotes the extreme value of the individual serum sample that is >3 times the inter-quartile range from the upper quartile. data represent the means ± s.d. for the three independent experiments.
in summary, a bacterial expression system has been developed for the production of the full-length recombinant capsid protein of porcine circovirus type 2. here, the codon optimization strategy was used to obtain high yields of the recombinant capsid protein in an inexpensive cultivation system. a purification protocol based on single-step cation-exchange chromatography provided a cost-effective procedure to obtain substantial quantities of the cap protein in high purity. although the recombinant capsid protein expressed in e. coli did not self-assemble into vlps, the antigenic properties of the cap protein resembled those of intact pcv2 virions. in addition, the recombinant full-length cap protein was used as antigen to develop the indirect cap-elisa for specific and sensitive detection of pcv2 infection.
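the absolute quantification of pcv2 dna from an external standard curve, as used for the copy numbers in fig. 6b, can be illustrated with a short python sketch. everything numeric here is hypothetical: the threshold-cycle (ct) readings are invented to lie on a perfect line, the 8 log10 dilution series is assumed to run from 10^1 to 10^8 copies per reaction, and the function names are illustrative. the text does not state how the instrument software fitted the curve, so an ordinary least-squares fit is assumed.

```python
def fit_standard_curve(log10_copies, ct_values):
    """ordinary least-squares fit of ct = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """invert the standard curve to estimate copies per reaction."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """efficiency from the slope; 1.0 corresponds to doubling every cycle."""
    return 10 ** (-1.0 / slope) - 1.0


# hypothetical tenfold dilution series of the linearised plasmid standard
standards_log10 = [1, 2, 3, 4, 5, 6, 7, 8]
standards_ct = [36.7, 33.4, 30.1, 26.8, 23.5, 20.2, 16.9, 13.6]

slope, intercept = fit_standard_curve(standards_log10, standards_ct)
print(f"slope = {slope:.2f}, intercept = {intercept:.1f}")
print(f"efficiency = {amplification_efficiency(slope):.3f}")
print(f"copies at ct 30.1 = {copies_from_ct(30.1, slope, intercept):.0f}")
```

converting copies per reaction into copies per 1 ml of serum additionally requires the extraction, elution and template volumes, which are not all given in the text and are therefore left out of the sketch.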
porcine circoviruses: a review isolation of porcine circovirus-like viruses from pigs with a wasting disease in the united states of america and europe an orf2 protein-based elisa for porcine circovirus type 2 antibodies in post-weaning multisystemic wasting syndrome quantitation of porcine circovirus type 2 isolated from serum/plasma and tissue samples of healthy pigs and pigs with postweaning multisystemic wasting syndrome using a taqman-based realtime pcr codon optimization can improve expression of human genes in escherichia coli: a multi-gene study postweaning multisystemic wasting syndrome: a review of aetiology, diagnosis and pathology post-weaning wasting syndrome comparison of the structures of three circoviruses: chicken anemia virus, porcine circovirus type 2, and beak and feather disease virus molecular characterization of porcine circovirus type 2 isolates from post-weaning multisystemic wasting syndrome-affected and non-affected pigs isolation of circovirus from lesions of pigs with postweaning multisystemic wasting syndrome construction and immunogenicity of recombinant pseudotype baculovirus expressing the capsid protein of porcine circovirus type 2 in mice immunogenicity and pathogenicity of chimeric infectious dna clones of pathogenic porcine circovirus type 2 (pcv2) and nonpathogenic pcv1 in weanling pigs jcat: a novel tool to adapt codon usage of a target gene to its potential expression host expression levels influence ribosomal frameshifting at the tandem rare arginine codons agg-agg and aga-aga in escherichia coli strategies for efficient production of heterologous proteins in escherichia coli characterization of the recombinant proteins of porcine circovirus type 2 field isolate expressed in the baculovirus system nuclear localization of the orf2 protein encoded by porcine circovirus type 2 bacterial expression of an immunologically reactive pcv2 orf2 fusion protein development of an elisa based on the baculovirus-expressed capsid protein of 
porcine circovirus type 2 as antigen characterization of a previously unidentified viral protein in porcine circovirus type 2-infected cells and its role in virus-induced apoptosis efficient production of type 2 porcine circovirus-like particles by a recombinant baculovirus virus-like particles-universal molecular toolboxes retrospective serological survey of antibodies to porcine circovirus type 1 and type 2 differential recognition of orf2 protein from type 1 and type 2 porcine circoviruses and identification of immunorelevant epitopes replication of porcine circovirus type 1 requires two proteins encoded by the viral rep gene identification of a protein essential for replication of porcine circovirus release of outer membrane vesicles by gramnegative bacteria is a novel envelope stress response effects of porcine circovirus type 2 (pcv2) maternal antibodies on experimental infection of piglets with pcv2 mistranslational errors associated with the rare arginine codon cgg in escherichia coli characterization of novel circovirus dnas associated with wasting syndromes in pigs detection and characterization of porcine circovirus associated with postweaning multisystemic wasting syndrome in pigs modified indirect porcine circovirus (pcv) type 2-based and recombinant capsid protein (orf2)-based enzyme-linked immunosorbent assays for detection of antibodies to pcv open reading frame 2 of porcine circovirus type 2 encodes a major capsid protein virus-like particles as immunogens strategies to maximize heterologous protein expression in escherichia coli with minimal cost development and validation of a recombinant capsid protein-based elisa for detection of antibody to porcine circovirus type 2 update on porcine circovirus and post-weaning multisystemic wasting syndrome (pmws). j. swine health prod a very small porcine virus with circular single-stranded dna expression of porcine circovirus 2 orf2 gene requires codon optimized e. 
coli cells distinct roles for nucleic acid in in vitro assembly of purified mason-pfizer monkey virus canc proteins overexpression of an mrna dependent on rare codons inhibits protein synthesis and cell growth in vitro expression, monoclonal antibody and bioactivity for capsid protein of porcine circovirus type ii without nuclear localization signal expression of the porcine circovirus type 2 capsid protein subunits and application to an indirect elisa development and application of a competitive enzyme-linked immunosorbent assay for the detection of serum antibodies to porcine circovirus type 2 the excellent technical help of sona charvatova, hana kubinova, zuzana vecerkova and petra klodnerova (proteix, s.r.o.) is acknowledged. we also wish to thank pavel ulbrich (institute of chemical technology, prague, czech republic) for providing mason-pfizer monkey virus canc expression vectors. this work was supported by grants gacr 310/07/p115 (l.b.), npvii 2b06161 (p.s.) of the ministry of education, youth and sports of the czech republic, npv 1b53016 (i.p.) and 0002716201 (i.p.) of the ministry of agriculture of the czech republic, and institutional research concept av0z50200510 and av0z50520701. key: cord-262904-0b0ljjq1 authors: lon, jerome rumdon; bai, yunmeng; zhong, bingxu; cai, fuqiang; du, hongli title: prediction and evolution of b cell epitopes of surface protein in sars-cov-2 date: 2020-10-29 journal: virol j doi: 10.1186/s12985-020-01437-4 sha: doc_id: 262904 cord_uid: 0b0ljjq1 background: in order to obtain antibodies that recognize natural proteins, it is possible to predict the antigenic determinants of natural proteins, which are eventually embodied as polypeptides. the polypeptides can be coupled with corresponding vectors to stimulate the immune system to produce corresponding antibodies, which is also a simple and effective vaccine development method. the discovery of epitopes is helpful to the development of sars-cov-2 vaccine. 
methods: the analyses were related to epitopes on 3 proteins, including spike (s), envelope (e) and membrane (m) proteins, which are located on the lipid envelope of sars-cov-2. based on the ncbi reference sequence nc_045512.2, the conformational and linear b cell epitopes of the surface proteins were predicted separately by various prediction methods. furthermore, the conservation of the epitopes, the adaptability and other evolutionary characteristics were also analyzed; the whole-genome sequences of sars-cov-2 were obtained from gisaid. results: 7 epitopes were predicted, including 6 linear epitopes and 1 conformational epitope. one of the linear epitopes and the conformational epitope consist of an identical sequence, but represent different forms of epitopes. it is worth mentioning that all 6 identified epitopes were conserved in nearly 3500 sars-cov-2 genomes, which shows that stable and long-acting epitopes can be obtained even under a high frequency of amino acid mutation and which deserves further study at the experimental level. conclusion: the findings would facilitate vaccine development and have the potential to be directly applied to the prevention of this disease, and also to prevent possible threats caused by other types of coronavirus. in late december 2019, a novel coronavirus, officially named sars-cov-2 by the international committee on taxonomy of viruses (ictv), was identified as the pathogen causing outbreaks of sars-like and mers-like illness in the chinese city of wuhan, which was a zoonotic disease. as of august 13, 2020, the outbreak of sars-cov-2 has been reported in many areas of the world, with more than 20,423,000 people infected [1]. with an alarming epidemicity, the reproductive number of sars-cov-2 has been computed to be around 3.28 [2].
according to the data in the national genomics data center (ngdc, https://bigd.big.ac.cn/ncov/), 15,118 genomic variations of sars-cov-2 had been reported as of 13:00 (gmt+8) on august 13, 2020, which has aroused widespread concern. the b cell epitope of a viral surface protein can specifically bind to the host's b cell antigen receptor and induce the body to produce protective antibodies and a humoral immune response. the discovery of epitopes is helpful to the development of a sars-cov-2 vaccine and the understanding of the pathogenesis of sars-cov-2 [3]. 3 proteins embedded in the virus envelope of sars-cov-2 have been identified, including spike (s), envelope (e) and membrane (m) proteins. at present, due to the lack of studies of the crystal structures of the surface proteins of sars-cov-2, the study of epitopes is time-consuming, labor-intensive, costly and difficult [4], especially for the conformational epitopes that depend on accurate protein structures. in this work, we analyzed the surface proteins (s, e and m protein) of sars-cov-2 and predicted their structures with bioinformatics methods. on this basis, we predicted the linear and conformational b cell epitopes and analyzed the conservation of the epitopes, the adaptability and other evolutionary characteristics of the surface proteins, which provides a theoretical basis for the vaccine development and prevention of sars-cov-2. however, the results still need experimental confirmation to ensure the validity of the application. all of the analyses and predictions were based on the ncbi reference sequence nc_045512.2. on the basis of previous research of our group, 3624 genome sequences from gisaid (up to april 6th, 2020) were downloaded to construct a dataset for conservation analysis (additional file 4: table s1) [5].
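a conservation check over such a dataset can be sketched as counting how often an epitope segment is identical to the reference across aligned protein sequences. this is a simplified illustration and not the pipeline used in the study: the sequences, coordinates and function names below are hypothetical, and the sketch assumes the proteins have already been extracted and aligned so that residue positions correspond.

```python
from collections import Counter


def epitope_variants(aligned_proteins, start, end):
    """count the distinct residue strings observed at positions
    start..end (1-based, inclusive) across aligned sequences."""
    return Counter(seq[start - 1:end] for seq in aligned_proteins)


def epitope_conservation(aligned_proteins, reference, start, end):
    """fraction of sequences whose epitope segment equals the reference segment."""
    ref_segment = reference[start - 1:end]
    variants = epitope_variants(aligned_proteins, start, end)
    return variants[ref_segment] / len(aligned_proteins)


# hypothetical toy alignment; real input would be thousands of aligned proteins
reference = "MKTAYIAK"
aligned = ["MKTAYIAK", "MKTAYIAK", "MKTGYIAK", "MKTAYIAK"]

print(epitope_variants(aligned, 3, 6))
print(epitope_conservation(aligned, reference, 3, 6))
```

an exact-match fraction is a deliberately strict measure; per-site scores such as those reported by dedicated conservation servers carry more information about which positions vary.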
the structure of the s protein (pdb id: 6x6p) was downloaded from rcsb pdb [6] , which has a resolution of 3.22 å [7] . the data sets for the s, e and m proteins were obtained by extracting the corresponding locations of the reference genome. the physical and chemical properties of the target proteins were analyzed with the protparam tool in expasy (expert protein analysis system) [8] , an online analysis toolkit for proteomics, covering the primary structure of the target protein, molecular formula, theoretical isoelectric point, protein instability index (an index < 40 means the protein is stable) and location information. the online tool protscale was used to analyze the hydrophilicity and hydrophobicity of the target proteins and their distribution along the polypeptide chains [8] . because sars-cov-2 carries the s, e and m proteins in the virus envelope, the transmembrane regions of the proteins were predicted online with tmhmm 2.0 [9] . with the amino acid sequences of the surface proteins of sars-cov-2 from nc_045512.2 as the query sequences, we predicted the 3d structures of the e and m proteins by homology modeling with the online server swiss-model [10] , selected the optimal structure based on the template identity and gmqe value [10] , and evaluated the plausibility of the structure with a ramachandran plot [11] on the pdbsum server. the structures were displayed and analyzed with swiss-pdb viewer v4.10 [12] . based on the structures, the conformational b cell epitopes were predicted with seppa 3.0 [13] and ellipro [14] separately, and the conformational b cell epitopes predicted by both methods were selected for further analysis. the protean module of dnastar was used to predict the flexibility [15] , surface probability [16] and antigenic index [17] of the target proteins of sars-cov-2.
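the hydrophilicity and hydrophobicity analysis described above can be illustrated with a minimal sketch. it assumes the kyte-doolittle hydropathy scale and a sliding window of 9 residues; the text does not state which scale or window was selected in protscale, so both are assumptions, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values per residue.
# Assumption: the source does not say which ProtScale scale was used.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(seq):
    """Average hydropathy of a sequence (positive = hydrophobic)."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def hydropathy_profile(seq, window=9):
    """Return (centre_position, mean_score) for every full window.

    Positions are 1-based. The window size of 9 is an assumed default.
    """
    seq = seq.upper()
    if window < 1 or window % 2 == 0 or window > len(seq):
        raise ValueError("window must be odd and no longer than the sequence")
    half = window // 2
    profile = []
    for start in range(len(seq) - window + 1):
        centre = start + half + 1
        profile.append((centre, mean_hydropathy(seq[start:start + window])))
    return profile


def extremes(profile):
    """Most hydrophobic and most hydrophilic window of a profile."""
    most_hydrophobic = max(profile, key=lambda p: p[1])
    most_hydrophilic = min(profile, key=lambda p: p[1])
    return most_hydrophobic, most_hydrophilic
```

under these assumptions, a positive average suggests an overall hydrophobic protein and a negative average an overall hydrophilic one, and the extremes of the profile correspond to the maximum and minimum scores reported per protein.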
the linear b cell epitopes were predicted with abcpred [18] and bepipred 2.0 [19] separately, and the linear b cell epitopes predicted by both methods were selected for further analysis. taking into account the secondary structure, the tertiary structure and the glycosylation sites [20] , the linear b cell epitopes were finally determined. based on the pdb model and the multiple alignment result, we used the consurf server to analyze the conservation of the amino acid sites of the epitopes online [21] . the conservation of the epitopes on the surface proteins of sars-cov-2 was analyzed by multiple alignment with mafft, and sequence logos were drawn with weblogo [22, 23] . the primary structure and physicochemical properties of the s, e and m proteins were analyzed. the results revealed that the s, e and m proteins have average hydrophilic indexes of −0.079, 1.128 and 0.446, respectively. on the basis of hydrophilicity, the s and m proteins showed amphipathic properties, and the e protein was hydrophobic (additional file 1: figure s1 ). according to the prediction, the s protein has an outside-in transmembrane helix of 23 residues from position 1214 to position 1236 at the n-terminal, which is almost consistent with a study indicating that the transmembrane domain of the s protein spans positions 1213 to 1237 [24] , and the e protein has an inside-out transmembrane helix of 23 residues from position 12 to position 34 at the n-terminal. for the m protein, two outside-in transmembrane helices were predicted, one of 20 residues from position 20 to position 39 and the other of 23 residues from position 78 to position 100, together with an inside-out transmembrane helix of 23 residues from position 51 to position 73 at the n-terminal (additional file 2: figure s2 ). each transmembrane region corresponds to a hydrophobic peak in the hydrophilic index curve.
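the step of keeping only the linear epitopes predicted by both tools can be sketched as an interval intersection. this is an illustration only: it assumes each tool's output has already been reduced to a list of non-overlapping, inclusive, 1-based residue ranges, and the function name and the `min_len` parameter are hypothetical, not part of abcpred or bepipred.

```python
def intersect_intervals(a, b, min_len=1):
    """Residue ranges predicted by both methods.

    a, b: lists of (start, end) ranges, inclusive and 1-based, each list
    assumed to contain no overlapping ranges.
    min_len: hypothetical minimum length for a shared region to be kept.
    """
    a = sorted(a)
    b = sorted(b)
    i = j = 0
    shared = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if end - start + 1 >= min_len:
            shared.append((start, end))
        # Advance whichever range ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared
```

for example, with made-up ranges `[(595, 610), (650, 662)]` from one tool and `[(601, 605), (656, 670)]` from the other, the shared regions are `(601, 605)` and `(656, 662)`.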
the protein instability indexes of the s, e and m proteins were 33.01, 38.68 and 39.14, respectively, which indicated that all three proteins were stable. the optimal template for homology modeling of the e protein of sars-cov-2 was the e protein of sars (pdb id: 5x29.1), with a sequence identity of 91.38% and a gmqe score of 0.73. according to the evaluation of the structure by ramachandran plot (fig. 1a) , 100% of the residues were located in the most allowed regions ( table 1 ), indicating that the structure was reliable. the e protein of sars-cov-2 is a pentamer (fig. 1b) , which can be divided into the concentrated transmembrane part and the head located outside the envelope. the head is mainly composed of α-helix, random coil and turns, is exposed outside the envelope and contributes to the formation of epitopes. the tail is mainly composed of long α-helices, most of which are embedded in the envelope, hindering the formation of epitopes. the optimal template for homology modeling of the m protein of sars-cov-2 was the effector protein zt-kp6-1 (pdb id: 6qpk.1.a), with a sequence identity of 20.00% and a gmqe score of 0.06. both the sequence identity between the optimal template and the m protein of sars-cov-2 and the gmqe score are too low for the template to be suitable for homology modeling. all linear b cell epitopes of the surface proteins were filtered according to the following criteria: (1) a region with high surface probability (≥ 0.75), strong antigenicity (≥ 0) and high flexibility; (2) exclusion of regions with α-helix, β-sheet or glycosylation sites (fig. 2) ; (3) agreement with the predictions by bepipred 2.0 (cutoff 0.35) and abcpred (cutoff 0.51). based on the results obtained with these methods and manual optimization, we removed epitopes that were too long to be suitable for application, and 4 potential linear b cell epitopes of the s protein were predicted ( table 2 , fig.
3a ), including 601-605 aa, 656-660 aa, 676-682 aa and 808-813 aa, which were named epitopes a, b, c and d, respectively; 1 epitope of the e protein was selected (60-65 aa) and named epitope e ( table 2 , fig. 3c ); 1 epitope of the m protein was selected (211-215 aa) and named epitope g ( table 2 ).
fig. 1 the 3d structure prediction and ramachandran plot analysis of the e protein. a the ramachandran plot analysis of the 3d structure of the e protein (without gly and pro). all of the residues are located in the allowed region, indicating that the structure is reliable from a thermodynamic point of view. b the 3d structure of the e protein predicted by homology modeling. it is a pentamer with ion channel activity [38] . its head is short, and the middle of the tail is a transmembrane region which helps the e protein embed in the envelope of sars-cov-2.
table 1 (surviving row) residues in disallowed regions: 0, 0.00
with the structure of the s protein (pdb id: 6x6p), the conformational b cell epitopes of the surface proteins were predicted with ellipro and seppa 3.0 with the default thresholds of 0.063 and 0.5, respectively. one conformational b-cell epitope (60-65 aa) of the e protein was predicted (table 2) , which is consistent with the linear epitope e. this region is likewise located on the outside (fig. 3c) , and we selected it as a dominant conformational epitope and named it f. however, the conformational epitope of the m protein could not be predicted because no credible homology model was obtained. the consurf server was used to predict conserved sites of the epitopes from the structures of the surface proteins and the alignment results in our dataset.
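the three filtering criteria for linear epitopes listed earlier can be expressed as a short sketch. the thresholds (surface probability ≥ 0.75, antigenic index ≥ 0, bepipred 2.0 cutoff 0.35, abcpred cutoff 0.51) are taken from the text; the `Candidate` record, its field names and the `max_length` parameter are hypothetical, and treating a score equal to the cutoff as passing is an assumption. the text says overly long epitopes were removed but gives no length limit, so none is set by default.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one candidate linear epitope region."""
    start: int
    end: int
    surface_probability: float
    antigenic_index: float
    flexible: bool
    in_helix_or_sheet: bool
    has_glycosylation_site: bool
    bepipred_score: float
    abcpred_score: float


def passes_filters(c, max_length=None):
    """Apply criteria (1)-(3) from the text to one candidate."""
    # (1) exposed, antigenic and flexible
    if c.surface_probability < 0.75 or c.antigenic_index < 0 or not c.flexible:
        return False
    # (2) not in an alpha-helix or beta-sheet, no glycosylation site
    if c.in_helix_or_sheet or c.has_glycosylation_site:
        return False
    # (3) supported by both predictors at the stated cutoffs
    if c.bepipred_score < 0.35 or c.abcpred_score < 0.51:
        return False
    # Optional length limit; the source gives no numeric value.
    if max_length is not None and (c.end - c.start + 1) > max_length:
        return False
    return True
```

a candidate is kept only if every criterion holds, which mirrors the intersection-style selection described in the text rather than any weighted scoring.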
due to the lack of a crystal structure of the m protein, only the data sets were used to calculate the conservation of epitope g. in the dataset, the epitopes of the s, e and m proteins were basically conserved (table 3 , additional file 3: figure s3 ). the conservation of the epitopes in the dataset was then calculated further, and the average score of every epitope was less than 1, so they could be considered conserved epitopes. epitope e and epitope f from the e protein had the lowest scores and showed the highest conservation. however, it is worth noting that this value is an overall assessment of each epitope. residues no. 808 and 809 of epitope d and residue 214 of epitope g each showed a conservation score greater than 1 as single residues, which revealed a risk of mutation.
(figure caption, partial) interestingly, the high antigenicity peaks of all three proteins were in the region where the α-helix is relatively sparse, which may be related to the fact that the α-helix structure prevents continuous residues from being located on the surface.
table 2 the composition and the antigenic index of the epitopes of sars-cov-2. the scores of the epitope e and the epitope g were calculated by ellipro, the others were calculated by bepipred 2.0. the epitopes a, b, c and d belong to the s protein, the epitopes e and f belong to the e protein and coincide, and the epitope g belongs to the m protein.
sars-cov-2 has had a huge impact on human production, daily life and even survival, and has become a major challenge confronting the whole world. vaccine development is one of the effective means of long-term prevention of the virus. epitope vaccines are a trend in vaccine development owing to their strong specificity, fewer toxic and side effects, and ease of transportation and storage [25] . a group founded in march 2020 by preston estep, calling themselves "the rapid vaccine partnership" (radvac), has developed a very simple vaccine.
in early july, radvac published a white paper detailing the vaccine they developed (https://radvac.org/). the radvac vaccine is a "subunit" vaccine because it is composed of fragments of a pathogen, in this case peptides, which are essentially short fragments of a protein that match a section of sars-cov-2 but do not cause disease. subunit vaccines are already used for diseases such as hepatitis b and human papillomavirus, and a number of companies are developing subunit vaccines for covid-19, including novavax biotechnology. reliable epitopes are particularly important for the development of subunit vaccines (https://www.technologyreview.com/2020/07/29/1005720/george-church-diy-coronavirus-vaccine/). the determination of epitopes is the basis of the development and application of vaccines and of clinical diagnosis. herrera et al. [7] reported an antigenic analysis of the s protein obtained by elisa, but did not study the epitopes. the conserved epitopes predicted from our calculation provide further reference for the immunological study of the s protein. vashi et al. predicted some epitopes based on the structure of the s protein [26] . although their study predicted both b-cell and t-cell epitopes of the s protein, it did not discuss the conservation of the epitopes. we supplement the study of epitopes with a conservation analysis based on a large amount of data, which can help ensure the long-term effect and stability of the epitopes in application. at the same time, their study is limited to the s protein, while our study of the e and m proteins provides more options. moreover, walls et al. [27] reported that the use of a conserved glycosylation sequence in the s protein of sars-cov can stimulate neutralizing antibodies against sars-cov-2 (figure caption fragment: the epitope g is the linear epitope and the f is the conformational epitope, which coincide), and yuan et al.
[28] reported that they researched the recognition of epitopes and antibodies by parsing the structure of antibody cr3022 from rehabilitation in patients with sars. wang et al. [29] reported a kind of human monoclonal antibodies, which could neutralize the sars-cov-2, from the cell culture. what these studies have in common is that they are based on some immune responses that have already occurred. in contrast, our calculation in the computer environment is faster, but the accuracy still needs to be verified experimentally. the two methods form an effective complement. currently, the methods which were mainly used are x-ray scattering method, immune experiment method and bioinformatics method. the first two are time-consuming and laborious, while the bioinformatics method is gaining more and more credibility among researchers [3, 25, 30] . there are many factors to be considered in the prediction of epitopes by bioinformatics method, such as the surface probablity and flexibility of the epitopes. at the same time, it is necessary to exclude the structurally stable and non-deformable α-helix, β-sheet, glycosylation sites which may obscure the epitopes or alter the antigenicity, etc. [31] . even so, the predicted epitopes are still inaccurate [4] . our work takes the intersection of above methods to predict, which greatly improves the stability of the prediction. compared with the current study on sars-cov-2, this work adopted various prediction methods and 3d structure databases developed in recent years, which were based on artificial neural network, hidden markov model (hmm), support vector machine(svm), etc., such as abcpred, bepipred2.0, seppa 3.0, iedb, etc. compared with prediction by a single method [32] , on the basis of a single protein [33] or on the basis of epitopes of sars [28] , these methods and databases greatly improved the accuracy of prediction and had more bioinformatic meaning. 
we comprehensively analyzed the prediction results from the tools which were widely used, set up screening criteria on the basis of primary structure, secondary structure and tertiary structure, so that the prediction results would more accurate and reliable. the s protein, the e protein and the m protein are surface proteins of sars-cov-2 that form the outer table 3 the the calculation was independent and based on the sars-cov-2 data set layer of the coronavirus and protect the internal rna, which have the potential as antigenic molecules. however, considering the current study on the epitopes prediction of sars-cov-2 [34] and due to the fact that s protein has been reported to be the directly binding molecule of sars-cov-2 to ace2 [35] , the prediction of epitopes is mainly focusing on the s protein, with few studies on the e protein and the m protein. in this work, we analyzed the s protein, the e protein and the m protein and predicted their epitopes. on this basis, 7 b cell epitopes were predicted, including 1 conformational and 6 linear b cell epitopes, one of the conformational and one of the linear are coincide. all of the epitope a, b, c, d located on the surface of the tail of the s protein, which is relatively easy to bind. the epitope e and the epitope f located at the end of the head of the e protein coincide, and this may be explained by the fact that they are all consecutive and the secondary structure avoiding the α-helix and the β-sheet. the epitope g is derived from the m protein, and the structure and conservation could not be determined due to the inability to predict reliable structure. however, it could be inferred from the surface probability scores that the epitope g is more likely to be located on the surface of the m protein. the higher the conservation score calculated by the consurf server is, the more likely the site is to be mutated in the evolutionary process. 
when the score < 1, the site is likely to be a conservative site; when the score is between 1 and 2, the site is a site which is likely to be a relatively easy mutation; when the score > 2, the site is likely to be an easy mutation site [36] . in the 7 epitopes obtained, all the epitopes of the s, e, m protein were absolute conservative among all sars-cov-2 sequences. the conservation of the epitope g could not be calculated by the pdb file. our work provides identified and conserved sites for further study. mutations that occur during the spread of the virus can cause significant resistance to vaccine development. for example, the recently reported mutation of amino acid 614 of s protein [37] not only affects the ability of the virus to transmit, but also may affect the efficacy of vaccines involving this site. our work provides reliable candidates for the development of epitope vaccines, but the application value of the epitopes needed further experimental verification. for example, the antigenicity of the epitope could be tested. although the epitopes could be integrally considered to be conservative, the independent residues of these epitopes could still easy to mutate. epitopes d and e had two and one residues, respectively, with conservative scores greater than 1, meaning that they were at risk for a single point mutation. more attention should be paid to these two epitopes in application. the epitope detection in glycoproteins is significant to the study of the immunoreaction of sars-cov-2, but its challenge is less reliable than the epitope detection due to the presence of glycan [33] . in addition, sars-cov-2 would mutate frequently, and the epitopes predicted might mutate too, so conservative epitopes analyzed in the present study might be more reliable. according to the data from ngdc, the variation frequencies of s, e, and m proteins were 0.83, 1.02, and 0.73, respectively. 
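the score thresholds described above can be turned into a small classification sketch. it assumes per-residue scores are already available as plain numbers; how consurf output is parsed is not covered. the text does not say how scores of exactly 1 or 2 are handled, so placing them in the middle class is an assumption, and the function names and example values are hypothetical.

```python
def classify_site(score):
    """Map a per-residue score to the three classes described in the text."""
    if score < 1:
        return "conserved"
    if score <= 2:  # boundary handling is an assumption
        return "relatively mutable"
    return "mutable"


def summarize_epitope(scores):
    """Summarize an epitope from {residue_position: score}.

    Returns the mean score, the class of the mean, and the sorted positions
    of residues that are not individually classed as conserved.
    """
    mean_score = sum(scores.values()) / len(scores)
    at_risk = sorted(pos for pos, s in scores.items()
                     if classify_site(s) != "conserved")
    return mean_score, classify_site(mean_score), at_risk


# Made-up scores for a six-residue epitope, for illustration only.
example = {808: 1.2, 809: 1.1, 810: 0.3, 811: 0.2, 812: 0.4, 813: 0.4}
mean_score, label, at_risk = summarize_epitope(example)
```

the sketch separates the overall assessment of an epitope (its mean score) from the per-residue view, which is the distinction the text draws when it notes that an epitope can be conserved overall while single residues still carry a mutation risk.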
under the condition of relatively high variation frequency, the conservation of the proteins was analyzed to identify epitopes with low mutation risk, which are important for the development of long-term and stable vaccines. however, this work has limitations. because no molecular dynamics analysis was performed, the binding between epitopes and antibodies was not simulated to further determine the availability of the epitopes, but research from different perspectives can provide more epitope choices for subsequent studies. in this work, we predicted 7 reliable epitopes: a, b, c, d, e/f and g. the reliability of the epitopes of the s protein was relatively better than that of the epitopes of the e protein and the m protein, indicating that the s protein is still the optimal choice for the prediction of epitopes and the development of vaccines. all of the 7 epitopes were highly conserved in sars-cov-2; therefore, the epitopes not only have the potential to be directly applied to the treatment of this disease, but also have the potential to prevent possible threats caused by other types of coronavirus. in addition, although various prediction factors were integrated in this work, more experimental data are needed, owing to the limited data set and other factors, to further verify whether all 7 epitopes can induce the body to produce corresponding antibodies and generate specific humoral immunity. supplementary information accompanies this paper at https://doi.org/10.1186/s12985-020-01437-4. additional file 1. figure s1 : deep analysis of hydrophilicity and hydrophobicity of the surface proteins of sars-cov-2. the online software protscale was used to predict the hydrophilicity and hydrophobicity of the surface proteins. a. the s protein has a maximum hydrophobicity score of 3.222 at the 7th site, which revealed strong hydrophobicity, and a minimum hydrophobicity score of -2.589 at the 679th site, which revealed strong hydrophilicity.
the score of hydrophilicity and hydrophobicity along the polypeptide chain of the s protein fluctuates constantly, with most of the scores being negative, which suggested that the protein has amphipathic properties on the basis of hydrophilicity. b. the e protein has a maximum hydrophobicity score of 3.489 at the 21st and the 25th sites, which revealed strong hydrophobicity, and a minimum hydrophobicity score of -1.550 at the 65th site, which revealed strong hydrophilicity. most of the scores of the residues are positive, which suggested that the protein is clearly hydrophobic. c. the m protein has a maximum hydrophobicity score of 2.978 at the 84th site, which revealed strong hydrophobicity, and a minimum hydrophobicity score of -1.956 at the 211th and the 212th sites, which revealed strong hydrophilicity. the scores of hydrophilicity and hydrophobicity along the polypeptide chain of the m protein showed large fluctuations, and the numbers of positive and negative scores were similar, with the positive scores in the majority, which suggested that the protein has amphipathic properties on the basis of hydrophobicity. additional file 2. figure s2 : the transmembrane regions of the surface proteins of sars-cov-2. the s, e and m proteins are embedded in the envelope of sars-cov-2, and the transmembrane helices were predicted with the tmhmm 2.0 server. all three amino acid indexes were higher than 18, indicating the reliability of the prediction. a. for the s protein, an outside-in transmembrane helix was predicted in the 23 residues from position 1214 to position 1236 at the n-terminal. the amino acid index was 23.97303. b. for the e protein, an inside-out transmembrane helix was predicted in the 23 residues from position 12 to position 34 at the n-terminal. the amino acid index was 25.72521. c.
for the m protein, 2 outside-in transmembrane helices were predicted, which were a helix in the 20 residues of amino acids from position 20th to position 39th and a helix in the 23 residues of amino acids from position 78th to position 100th at the n-terminal. an inside-out helix was predicted in the 23 residues of amino acids from position 51st to position 73rd at the n-terminal. the amino acid index was 64.90522. the calculation of the transmembrane pattern and data has clarified the position and direction of the protein in the virus, which is of great significance for the understanding of the availability of the antigen when predicting the epitopes, the epitopes located outside the virus has significant application advantages. additional file 3. figure s3 : the antigenic conservation of the surface protein in sars-cov-2. the overall height of each stack is proportional to the sequence conservation, measured in bits, at that position, while the height of symbols within the stack indicates the relative frequency of each nucleic acid at that position. all the epitopes in the data set are highly conservative, and the serial numbers (a-g) in the figure represent the epitopes a-g respectively. additional file 4. table s1 : the list of the id of genomes in dataset. 
coronavirus 2019-ncov: a brief perspective from the front line
the reproductive number of covid-19 is higher compared to sars coronavirus
recent advances in b-cell epitope prediction methods
bioinformatics resources and tools for conformational b-cell epitope prediction
comprehensive evolution and molecular characteristics of a large number of sars-cov-2 genomes reveal its epidemic trends
rcsb protein data bank: biological macromolecular structures enabling research and education in fundamental biology, biomedicine, biotechnology and energy
characterization of the sars-cov-2 s protein: biophysical, biochemical, structural, and antigenic analysis
the proteomics protocols handbook
sequence and structure-based prediction of eukaryotic protein phosphorylation sites
swiss-model: homology modelling of protein structures and complexes
procheck: a program to check the stereochemical quality of protein structures
swiss-model and the swiss-pdbviewer: an environment for comparative protein modeling
seppa 3.0-enhanced spatial epitope prediction enabling glycoprotein antigens
ellipro: a new structure-based tool for the prediction of antibody epitopes
prediction of chain flexibility in proteins
induction of hepatitis a virus-neutralizing antibody by a virus-specific synthetic peptide
the antigenic index: a novel algorithm for predicting antigenic determinants
prediction of continuous b-cell epitopes in an antigen using recurrent neural network
improved method for predicting linear b-cell epitopes
prediction of n-glycosylation sites in human proteins
consurf 2016: an improved methodology to estimate and visualize evolutionary conservation in macromolecules
weblogo: a sequence logo generator
sequence logos: a new way to display consensus sequences
severe acute respiratory syndrome coronavirus-2 (sars-cov-2), a newly emerged pathogen: an overview
immuno-informatics: mining genomes for vaccine components
understanding the b and t cell epitopes of spike protein of severe acute respiratory syndrome coronavirus-2: a computational way to predict the immunogens
structure, function, and antigenicity of the sars-cov-2 spike glycoprotein
a highly conserved cryptic epitope in the receptor binding domains of sars-cov-2 and sars-cov
a human monoclonal antibody blocking sars-cov-2 infection
an introduction to epitope prediction methods and software
the evolutionary pattern of glycosylation sites in influenza virus (h5n1) hemagglutinin and neuraminidase
a sequence homology and bioinformatic approach can predict candidate targets for immune responses to sars-cov-2
immunoinformatics-aided identification of t cell and b cell epitopes in the surface glycoprotein of 2019-ncov
cross-reactive antibody response between sars-cov-2 and sars-cov infections
potent binding of 2019 novel coronavirus spike protein by a sars coronavirus-specific human monoclonal antibody
mutation feature analysis on epitope and receptor binding sites of influenza a h1n1 hemagglutinin
spike mutation pipeline reveals the emergence of a more transmissible form of sars-cov-2
analysis of sars-cov e protein ion channel activity by tuning the protein and lipid charge
publisher's note springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations
the student entrepreneurship and innovation center of the school
of biology and biological engineering, south china university of technology, also provided a lot of help during the preparation of the project. authors' contributions jl conceived the study and participated in its design and coordination. hd participated in the design of the study and helped draft the manuscript. yb participated in analysis of conservation, sequence alignment and manuscript drafting. bz participated in antigenic prediction. fc participated in drafting the manuscript. all authors read and approved the final manuscript. the viral genomes described in detail here were deposited in ncbi, genbank and gisaid. not applicable. not applicable. the authors declare that they have no competing interests.received: 1 may 2020 accepted: 20 october 2020 key: cord-023143-fcno330z authors: nan title: molecular aspects of viral immunity date: 2004-02-19 journal: j cell biochem doi: 10.1002/jcb.240591009 sha: doc_id: 23143 cord_uid: fcno330z nan mechanisms of t-cell mediated clearance of viruses from the central nervous system are poorly understood, but likely to differ from those employed in the periphery because the cns lacks lymphatic drainage and constitutive expression of mhc class i antigen, and the unique structure of the cns vasculature imposes constraints on access by leukocytes and soluble immune mediators. to study the mechanism by which viruses are cleared from neurons in the central nervous system, we have developed a mouse model involving infection with a neurotropic variant of mouse hepatitis virus (oblv60). grew preferentially in the olfactory bulbs of balbk mice. using in situ hybridization, we found viral rna localized primarily in the outer layers of the olfactory bulb, including neurons of the mitral cell layer. virus was cleared rapidly from the olfactory bulb between 5 and 11 days. athymic nude mice failed to eliminate the virus demonstrating a requirement for t lymphocytes. 
immunosuppression of normal mice with cyclophosphamide also prevented clearance. both cd4+ and cd8+ t-cell subsets were important as depletion of either of these subsets delayed viral clearance. gliosis and infiltrates of cd4+ and cd8+ cells were detected by immunohistochemistry at 6 days. the role of cytokines in clearance was investigated using an rnase protection assay for il-1α, il-1β, il-2, il-3, il-4, il-5, il-6, tnfα, tnfβ and ifnγ. in immunocompetent mice there was upregulation of rna for il-1α, il-1β, il-6, tnfα and ifnγ at the time of clearance. nude mice had comparable increases in these cytokine messages with the exception of ifnγ. induction of mhc-i molecules on cells in infected brains was demonstrated by immunohistochemistry in normal and nude mice, suggesting that ifnγ may not be necessary for induction of mhc-i on neural cells in vivo.
luca g. guidotti, kazuki ando, tetsuya ishikawa, lisa tsui and francis v. chisari. the scripps research institute, la jolla, ca 92037
although cytotoxic t lymphocytes (ctl) are known to clear viral infections by killing infected cells, recent studies suggest that they can also suppress the replication of certain viruses by noncytolytic mechanisms. we have examined this area by monitoring the immunopathological and antiviral consequences of antigen recognition by hepatitis b virus (hbv) specific ctl in hbv transgenic mice that express the viral gene products in their hepatocytes. we have shown that intravenously injected ctl rapidly trigger their target hepatocytes to undergo apoptosis, but that the direct cytopathic effect of the ctl is minimal in comparison with the cytopathic effects of the antigen-nonspecific intrahepatic inflammatory response that they activate. in addition to killing the hepatocyte, the same ctl also downregulate hbv gene expression and completely abolish hbv replication in the hepatocytes that they don't destroy.
this noncytolytic antiviral ctl effect is mediated by at least two distinct processes in these animals. first, the ctl cause a quantitative reduction in the steady state content of all hbv mrna species in the hepatocyte, and this is followed by disappearance of all of the corresponding viral proteins in the liver and serum. the ctl initiate this process by secreting ifnγ and tnfα when they are activated by antigen recognition, since the regulatory effect of the ctl can be prevented completely by prior administration of the corresponding antibodies. nuclear run-on experiments reveal that viral mrna transcription is unaffected despite the profound reduction in hbv mrna content in the liver, suggesting that the ctl-derived cytokines accelerate viral mrna degradation in the hepatocyte. a second noncytolytic antiviral pathway is also activated by the ctl. we have recently shown that hbv nucleocapsid particles, and the replicative hbv dna intermediates that they contain, disappear from the transgenic mouse liver following either ctl administration or partial hepatectomy, the latter of which triggers hepatocellular regeneration without any change in hepatocellular hbv mrna content. these results suggest that preformed hbv nucleocapsid particles may be actively degraded during hepatocyte turnover, and they raise the possibility that similar events might also occur in nondividing hepatocytes that are activated by noncytolytic signals delivered by the ctl. we propose that, in addition to their pathogenetic effect, the combined effects of the ctl response at the hbv mrna, nucleocapsid and replicative dna levels may represent a curative antiviral stimulus during hbv infection. since the virus must contain molecular elements that respond to these ctl-induced antiviral signals,
inactivating mutations at these loci could be very efficiently selected by immune pressure, because a single mutation could abrogate the antiviral effect of a wide spectrum of t cell responses, irrespective of epitope specificity. identification of these viral response elements and the intracellular pathways that interact with them may lead to the development of new strategies for antiviral drug design.
human fibroblasts infected with hsv are resistant to lysis by cd8+ cytotoxic t lymphocytes (ctl), yet human b cell lines can be efficiently lysed by these ctl. the effect on human fibroblasts is rapid (within 2 hr of infection of cells), occurring before synthesis of mhc class i is altered by virus infection. a recombinant hsv, f-usbmhc, which expresses mouse mhc class i proteins does not render human fibroblasts sensitive to lysis by mouse ctl. mhc class i molecules are retained in the er of hsv-infected fibroblasts in a misfolded, unstable form and stability of the mhc complex can be restored by addition of exogenous peptides. using a panel of hsv mutants and ad expression vectors we demonstrated that the hsv ie protein icp47 was both necessary and sufficient to cause retention of class i and icp47 expression in fibroblasts caused the cells to resist lysis by cd8+ t lymphocytes. icp47 is a soluble, cytosolic protein and we have found no evidence of membrane association. therefore, it appears that icp47 inhibits cytosolic stages of the antigen presentation pathway so that antigenic peptides do not reach the er. to date, polyclonal and monoclonal antibodies directed to icp47 have not specifically precipitated any of the previously described components of the antigen presentation pathway and we have not found icp47 associated with tap transporter proteins or proteosomes in these experiments. the effects of icp47 are being assessed in proteosome and tap transporter assays.
gst-icp47 fusion proteins tightly bind an 8.5 kda cytosolic cellular protein which is found in a number of adherent human cell lines but not lymphocytes. the protein has been purified and sequencing is in progress. in addition, radiolabelled icp47 binds to a single cellular protein of ≈55 kda on ligand blots. these proteins are good candidates as cellular targets of icp47 and as novel components of the antigen presentation pathway. preliminary experiments support the hypothesis that icp47 is very effective in blocking cd8+ t lymphocyte responses in vivo, perhaps explaining the predominance of cd4+ vs. cd8+ anti-hsv ctl in vivo. we expect that icp47 may be very useful, not only to elucidate antigen presentation pathways, but also to prevent immune recognition of gene transfer vectors and as an immunosuppressive agent. susceptibility to polyoma virus-induced tumors is conferred by an endogenous mmtv superantigen. aron e. lukacher1, yupo ma2, john p. carroll2, sara r. abromson-leeman2, joseph c. laning2, martin e. dorf2, and thomas l. benjamin2. 1department of pathology, emory university school of medicine, atlanta, ga 30322, and 2department of pathology, harvard medical school, boston, ma 02115. susceptibility to tumors induced by mouse polyoma virus varies among inbred mouse strains. we have previously shown that polyoma tumor susceptibility is controlled by products of mhc as well as non-mhc genes. in crosses between mhc-nonidentical strains differing in tumor susceptibility, resistance correlates with dominant/codominant inheritance of the resistant h-2 haplotype. we have observed the opposite pattern of inheritance of susceptibility in crosses between mhc-identical strains. in crosses between the highly susceptible c3h/bida mouse and the highly resistant but mhc-identical (h-2k) c57br/cdj mouse, polyoma tumor susceptibility is conferred by a single autosomal dominant gene, which we have designated pyvs.
pyvs does not encode cell receptors for the virus, affect viral dissemination or anti-viral antibody responses, or affect intracellular events essential for productive infection or cell transformation by the virus. whole-body irradiation renders c57br/cdj mice fully susceptible to polyoma-induced tumors, indicating an immunological basis for this strain's resistance. we hypothesized that pyvs encodes an mtv superantigen (sag) that confers susceptibility to c3h/bida mice by deleting precursors of polyoma-specific t cells. we found that tumor susceptibility in (c3h/bida x c57br/cdj) x c57br/cdj backcross mice cosegregated with mtv-7. inheritance of mtv-7 showed perfect concordance with absence of peripheral vβ6+ t cells. genotyping of backcross mice using markers of simple sequence repeat polymorphisms flanking mtv-7 showed no evidence of recombination between pyvs and mtv-7. strongly biased usage of vβ6 by (a) polyoma-specific cd8+ ctl from virus-infected c57br/cdj mice and by (b) cd8+ t cells infiltrating a polyoma tumor in a virus-immune c57br/cdj host provides further evidence that t cells bearing this mtv-7 sag-reactive vβ domain are critical anti-polyoma tumor effector cells. these results indicate identity between pyvs and mtv-7 sag, and demonstrate a novel mechanism of inherited susceptibility to virus-induced tumors based on effects of an endogenous superantigen on the host's t cell repertoire. infection of mice with lymphocytic choriomeningitis virus (lcmv) causes a transient to long-lasting immunosuppression dependent upon the virus isolate, the dose of virus, and the age, h-2, non-h-2 background, levels of cd4+ and of cd8+ t cells, and kinetics of neutralizing antibodies of the host. the immunohistological analysis suggests that cd8+ t cell dependent disappearance of marginal zone macrophages, of follicular dendritic cells and of virus infected cells in general correlates with immunosuppression. the details of the mechanisms responsible for these findings are now being analysed.
a role of this cd8+ t cell dependent immunosuppression in the establishment of an lcmv carrier state in immunocompetent mice is suggested by the following experiments: the otherwise slow and low neutralizing antibody response against lcmv is accelerated and enhanced by cd8+ t cell depletion at the time of infection, suggesting that virus-specific immunopathology is at least partially responsible. the elisa antibody response is not significantly altered under the same conditions but is abrogated if lcmv-specific t cell receptor transgenic mice are infected with high doses of lcmv, indicating that suppression of the specific antibody response depends upon the relative kinetics of ctl versus antibody responses. whether exhaustion of specific ctl responses is enhanced by similar mechanisms remains to be tested. the roles of interleukins and of the relative distribution of virus in the mouse in the various aspects of immunosuppression are now being studied. immunosuppression caused by cd8+ t cell-dependent immunopathology may also be operational in hiv infection in humans. such a pathogenesis of hiv-triggered aids could explain several aspects of the disease process not readily fitting the (unproven) conventional idea that hiv is causing immunodeficiency via direct viral pathogenicity. the cellular immunity against two dna tumor viruses (i.e. human adenovirus type 5 (ad5) and human papillomavirus type 16 (hpv16)) was studied with respect to possible immune escape mechanisms and to the development of ctl epitope based peptide vaccines. after identifying an immunorelevant ctl epitope in the ad5 e1a protein, to which ctl clones were directed that could eradicate ad5 e1 induced tumors in nude mice, an amino acid replacement study of this epitope revealed a point mutation that totally eliminated recognition of the mutant peptide by the ctl clones directed against the wild-type peptide sequence.
new viral constructs were made that contained this point mutation and used to transform mouse embryo cells. however, these mutant tumor cells were still immunogenic, and ctl clones specific for these mutant tumor cells were shown to react with a peptide derived from the ad5 e1b protein. these ad5 e1b specific ctl clones, however, were as effective as the ad5 e1a specific ctl clones in the eradication of ad5 e1 induced tumors in nude animals, indicating that a choice can be made of immunorelevant epitopes to which an immunization strategy could be developed. in addition, we discovered that supertransfection of ad5 e1 induced tumor cells with the activated ras oncogene eliminated the ability of ad5 e1b specific ctl to recognize the ad5 e1 induced tumors, whereas the ad5 e1a specific ctl could still kill these tumor cells. this might indicate a new mechanism by which tumors escape ctl. in an hpv16 induced mouse tumor model an immunosubdominant ctl epitope was identified in the e7 protein that could, upon immunization with that peptide, protect mice against a subsequent challenge with hpv16 induced tumor cells. by changing the anchor residues in that peptide an even more immunoprotective peptide could be generated. combined, these data indicated a successful use of a ctl epitope based peptide vaccine in the prevention of hpv16 induced tumors in mice. subsequently this led us to identify relevant ctl epitopes of hpv16, which is highly associated with cervical carcinoma in humans, for the major hla-a alleles (i.e. hla-a*0101, a*0201, a*0301, a*1101 and a*2401). together these alleles cover a majority of all humans. ctl epitopes were identified through peptide-mhc binding assays followed by in vitro peptide immunizations with high affinity binding peptides to induce primary ctl responses and immunogenicity studies in hla-a transgenic mice. thereafter, memory ctl responses were measured in cervical cancer patients against selected peptides.
combined, these data led us to develop a ctl epitope based peptide vaccine that could be of use in hpv16 induced cervical cancer patients. a clinical trial for this disease is scheduled to start in the fall of 1994. class ii presentation of an endogenously synthesized glycoprotein. carol s. reiss, shirley m. bartido, miriam stein, and stephanie diment, biology department, and center. in contrast to class i presentation, which is well characterized to use peptide fragments of proteins synthesized in the cytoplasm, exogenously administered experimental antigens enter the class ii mhc pathway through endocytosis. we have been studying the recognition of the glycoprotein of vesicular stomatitis virus (vsv), which can enter either the exogenous or endogenous pathways for presentation to cd4+ t cells. investigations of the intracellular sites and the proteolytic processes involved in epitope generation will be discussed. the glycoprotein studied in detail is a truncated form of the wt type 1 glycoprotein, termed poison tail (gpt). expressed with a vaccinia virus vector, the gpt remains endo h sensitive and never becomes endo d sensitive, indicating that it is restricted to the endoplasmic reticulum. gpt is degraded in the er, and we believe the degradation products include the immunogenic epitopes recognized by a panel of i-ad and i-ed t cell clones and hybridomas. immunofluorescence studies have confirmed the er localization. flow cytometric evaluations show that the gpt never appears on the cell surface, in contrast to the wt g. the peptides generated are not secreted; using an innocent bystander assay, gpt-infected cells are incapable of sensitizing 51cr-labeled uninfected apc. this contrasts with the rapid ability of supernatants from wt g-vaccinia virus-infected cells to sensitize apc for t cell recognition.
investigations of the characteristics of the enzymes contributing to the degradation of the gpt have shown that a reducing environment is essential, as diamide treatment of cells prevents degradation. lysosomotropic drugs (e.g. nh4cl and leupeptin) do not alter the half-life of the protein, but do prevent presentation of the peptides; this is inconsistent with an autophagic component to the proteolysis. ph optima are physiological, as a ph 8 environment inhibits the enzyme activity. inhibitors of enzyme classes are consistent with a trypsin-like, and not a cysteine-, cathepsin b-, or chymotrypsin-like class. supported by nih grant ai 18083 to csr. encephalomyocarditis virus (emcv) and mengovirus are related members of the cardiovirus genus of picornaviruses. their rna genomes encode a large polyprotein which is cleaved proteolytically in co- and post-translational reactions to yield all mature viral proteins necessary to establish an infection. although originally thought to be exclusively murine in host range, both viruses actually infect a wide range of mammals. emcv has caused devastating epizootics in captive primates (e.g. macaques, chimps and baboons), domestic pigs and exotic zoo collections (elephants, lions and tigers). death, following ingestion of virus-contaminated material, is rapid, and caused by extensive meningoencephalitis and virus-induced damage to the cns. myocarditic lesions are common in older animals. when administered intracerebrally, the ld50 for emcv strain r is about 1 pfu. we are studying the pathogenesis of emcv and mengo with engineered cdna plasmids containing infectious viral sequences. many plasmids contain truncated versions of the unusual 5' noncoding homopolymeric poly(c) tract that is a hallmark of these cardioviruses. short poly(c) mengoviruses grow very well in tissue culture but are 10^6- to 10^12-fold less pathogenic to mice than the wild-type strains.
animals receiving sublethal doses of short-tract mengo strains develop high titers of neutralizing antibodies, exhibit potent ctl responses and acquire lifelong protective immunity against challenge with wild-type virus. the genetic stability of the short-tract strains, even upon serial brain passage, marks them as safe, efficacious live vaccines. currently, we believe the poly(c) phenomenon is due to interference by the wild-type virus sequences (long poly(c) tract) with normal cellular cytokine induction mechanisms (i.e. ifn-α and ifn-β) during the initial stages of animal infection. the targeted cells are probably macrophages, and their singular ability to correctly respond or not respond to poly(c) tract length during the first few hours of infection determines whether an inoculated animal will live (protectively vaccinated) or die. the short-tract viruses probably induce ifn in the macrophages, and are consequently killed then rapidly cleared from the host. in related experiments we've found that attenuated mengo strains can easily carry large heterologous insertions within their genomes, and express these sequences as protein during replication in animals. the resulting immune response (b cell and ctl) to the chimeras is directed towards the foreign sequences (epitopes) as well as towards the mengo proteins. a chimeric hiv vaccine, a rabies vaccine and an lcmv vaccine have been developed and tested. the lcmv chimera seems especially effective, as a single pfu of this engineered mengo strain, administered orally to a mouse, is sufficient for complete immunogenic protection against intracerebral challenge with wild-type lcmv. rsv is the most common cause of serious viral lower respiratory tract disease in infants and children. we have recently renewed our efforts to generate a safe and effective live attenuated rsv vaccine for topical administration that will overcome the deficiencies of previously studied live and non-living rsv vaccines.
this vaccine will be a bivalent vaccine consisting of subgroup a and b live attenuated virus components. since the peak incidence of severe disease caused by rsv is in the 2-month old infant, an rsv vaccine will need to be effective when given to 1-month old infants. based on the success of live poliovirus vaccines given early in infancy, it is anticipated that the intranasally administered live virus vaccine will infect and induce a protective local and systemic immune response even in infants with passively acquired maternal antibodies. the main approach that we have taken in this effort to develop the live rsv vaccine is to introduce one or more ts mutations by chemical mutagenesis into a cold-passaged virus (cprsv) that had been partially attenuated by the acquisition of host-range mutations selected by passage in cells of a heterologous host species. we have developed a large set of cprsv subgroup a ts mutants (termed cpts mutants) that contain the host-range mutations selected during cold passage and two or more ts mutations introduced by chemical mutagenesis. these mutants have been evaluated in vitro for their level of temperature sensitivity and in vivo in rodents, chimpanzees, and humans. a large set of rsv subgroup b cpts mutants has been similarly produced and evaluated. the immunogenicity and protective efficacy of three candidate live attenuated rsv vaccine strains that represent a spectrum of attenuation were evaluated in chimpanzees. prior to infection some of these animals were given rsv immune globulin by the iv route to simulate the condition of the very young infant who possesses passively-acquired maternal rsv antibodies. the three candidate vaccine strains were immunogenic and induced significant resistance to rsv challenge in both groups of chimpanzees.
interestingly, the chimpanzees infused with rsv antibodies prior to immunization were primed more effectively for an unusually high serum neutralizing antibody response to infection with challenge virus than chimpanzees which did not receive such antibodies. this high booster response occurred despite marked restriction of replication of the challenge virus. the evaluation of two candidate vaccines in seronegative human infants will also be described. rs virus is immunologically interesting for at least two reasons: 1) upper respiratory reinfection occurs despite previous exposure and demonstrable immunological memory; 2) humans or rodents previously immunised against virus infection can show enhanced disease during reinfection. others have shown that passive transfer of antiviral antibody either protects against virus infection or has no effect, and there is no evidence of antibody enhancement of disease in vivo. by contrast, t cell immunity appears closely associated with disease augmentation. we have focused on examining the immunological mechanisms of disease enhancement in mice. initial studies showed that transfer of cd8+ cytotoxic t lymphocytes (ctl) causes rapid virus clearance from the lungs of rs virus-infected mice, but also increased disease severity with alveolar haemorrhage and polymorphonuclear (pmn) cell recruitment to the lung. this disease (reminiscent of shock lung) could sometimes be fatal, whereas normal mice recover well from similar doses of rs virus. next, we compared the effects of cd4+ and cd8+ t cells, using polyclonal t cells separated immunomagnetically from mixed lines grown in vitro with viral antigen. we found that cd4+ t cells were more pathogenic than cd8+ t cells in a dose-for-dose comparison, but that the type of pathology varied depending on the type of cell injected.
while testing recombinant vaccinia viruses expressing single rs viral proteins for their ability to protect mice against infection, we observed that animals sensitised to the major surface glycoprotein g (attachment protein) developed lung eosinophilia after challenge with rs virus intranasally. t cell lines from the spleens of mice sensitised with various recombinant vaccinia viruses were established. those from mice primed with the m2 (22k) protein were predominantly cd8+ ctl that produced few cytokines. those from mice primed with fusion protein (f) were mixed t cell lines containing both th1 cd4+ t cells and ctl. mice primed to g protein gave rise to predominantly cd4+ t cells producing th2 cytokines. in vivo transfer of these cell lines into naive rsv infected mice reproduces the patterns of disease seen in mice sensitised in vivo with the respective antigens. the mouse model of rs virus disease therefore has excellent potential for illustrating mechanisms of lung immunopathology. the eye is a complex organ whose function is to transmit light images through different cell and tissue layers and liquid media to a neurosensory retina. elements as could occur when invading pathogens arrive and an inflammatory response with its swelling, plasma protein extravasation, leukocyte infiltration and tissue damage results. thus it is in the eye's functional interest to limit inflammatory responses when possible and rely on immune defenses which do not involve tissue distortion and damage. restricting tissue damaging responses is not always effective and the process is best developed in response to agents delivered to locations such as the anterior chamber. where an inflammatory response is initiated which may result in ocular impairment. such herpetic stromal keratitis (hsk) is a common cause of blindness in man. animal model studies indicate that hsk is a multi-step process initiated by virus in an avascular structure. hsk fails to occur in the absence of t cells or replicating virus.
disappears several days before a visible inflammatory response becomes evident. evidence will be presented that the secondary agonists which drive the inflammatory response may not be viral antigen(s) per se. multiple cell types are involved in hsk, with the respective role of functional sets of lymphocytes changing according to the clinical phase of the disease. in addition, nonspecific inflammatory cells such as neutrophils and nk cells also influence the severity of lesions. basically the reaction begins with t cells that produce type 1 cytokines, particularly ifn-γ, dominating the scene, but during remission type 2 cytokines, notably il-10, appear to be mechanistically involved. from the use of knockout mice for various immunological parameters, evidence will be presented that numerous mechanisms of pathogenesis may be at play during hsk. damage to corneal tissues in all systems appears to involve tnf-α. a second ocular damaging event in which immunopathology is at least partially involved is herpetic retinal necrosis. evidence that this disease may involve an immunopathological role of cd4+ t cells and protective effects of cd8+ t cells will be presented, as will suggestions as to how the pathological events are mediated at the molecular level. acute viral infections and live vaccines often confer long-term immunity. the nature of t and b cell memory is different. b cell memory is manifested not only by the presence of memory b cells but also by continuous antibody production. in contrast, the effector phase of the t cell response is short-lived, and long-term t cell memory is due to the presence of 'quiescent' antigen specific memory t cells that are present at higher frequencies and are able to respond faster upon re-exposure to virus due to increased levels of adhesion molecules. in this talk i will present our results on:
(i) the bone marrow as a major site of long-term antibody production after acute viral infection, (ii) the role of cd4+ t cells and b cells (immune complexes) in maintaining cd8+ t cell memory, (iii) the role of fas antigen in regulating t cell responses, and (iv) the efficacy of various antigen delivery systems in inducing long-term t cell memory. sendai virus is a natural respiratory viral pathogen of mice. intranasal infection of mice with the virus provokes a virus specific antibody-forming-cell (afc) reaction that exhibits a distinct kinetic pattern in the lymph nodes that drain the respiratory tract, in the spleen, and in the bone marrow. the bone marrow afc population is extremely long-sustained, and supports an active humoral response that essentially persists for the lifetime of the infected animal. thus the conventional categories of "primary" and "secondary" response may not apply to the humoral response of mice naturally exposed to respiratory viruses. paradoxically, the population of b cells that reacts most rapidly to sendai virus infection does not itself secrete antibody, but can be demonstrated by the recovery of hybridomas that secrete "polyspecific" antibodies. the activation of this polyspecific b cell population is, like the humoral response, extremely persistent. viral infection thus sets in train multiple b cell "memory" processes. variation in the rules of development and turnover of different b cell populations constrains the mechanisms that may operate to generate these different forms of memory. establishment and maintenance of t cell memory to respiratory viruses, peter c. doherty, sam hou, christine ewing, david topham, anthony mcmickle, james houston, and ralph tripp, department of immunology, st. jude children's research hospital, memphis, tn 38105.
the development and memory phases of the cd4+ "helper" (th) and cd8+ cytotoxic t lymphocyte (ctl) responses to the respiratory pathogens influenza virus and sendai virus (parainfluenza type 1) have been characterized by a combination of limiting dilution analysis (lda), for determining th and ctl precursor (p) frequency, and facs separation of lymphocytes with different activation phenotypes. the interpretation at this stage, largely based on the analysis of the ctl response, is that the development phase of t cell memory and the primary response are synonymous. virus-specific ctlp are produced in considerable excess of the numbers required to provide the effector ctl that terminate the primary infection, with only a fairly small proportion localizing to the target organ (the lung) that supports virus growth. even when many of the proliferating ctlp are killed by administration of a small dose (20 mg/kg) of the dna-targeted drug cyclophosphamide (cy), there is no indication of immune exhaustion. the cd4+ th response has, at this stage, not been analyzed through the course of the primary infection. use of the lda approach to determine thp frequencies is inherently more difficult, as the "read-out" is lymphokine production and there is considerable "bystander" activation in these primary responses to respiratory viruses. memory thp and ctlp are characterized initially by the expression of an "activated" phenotype: cd44-high, l-selectin-low, cd49d (vla-4) high. after some months, an increasing proportion of the memory t cells revert to the l-selectin-high cd49d-low form typical of naive ctlp. the change, which is never absolute, seems to occur first with cd49d and the rate varies for different viruses.
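as an illustration of how a precursor frequency can be read out of limiting dilution data, the following is a minimal sketch assuming the standard single-hit poisson model, in which the fraction of negative wells is exp(-f * n) for precursor frequency f and n cells per well. the abstract does not state which estimator was used, and the function name and the well counts in the example are hypothetical.

```python
import math


def precursor_frequency(cells_per_well: float,
                        negative_wells: int,
                        total_wells: int) -> float:
    """Estimate precursor frequency from one dilution of an LDA plate.

    Assumes the single-hit Poisson model: the fraction of wells with no
    responding precursor is F0 = exp(-f * n), so f = -ln(F0) / n.
    """
    if not 0 < negative_wells < total_wells:
        # F0 of 0 or 1 gives an infinite or zero estimate; such dilutions
        # carry no usable information under this model.
        raise ValueError("need at least one negative and one positive well")
    f0 = negative_wells / total_wells
    return -math.log(f0) / cells_per_well


# Hypothetical example: 1000 cells per well, 37 of 100 wells negative.
freq = precursor_frequency(1000, 37, 100)
print(f"about 1 precursor per {1 / freq:.0f} cells")
```

in practice several dilutions are usually fitted together; the single-dilution form above is only meant to show the underlying relationship.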
current experiments are addressing the possibility that intercurrent infection, particularly with the mouse γ-herpesvirus 68 which causes persistent infection of lymphoid tissue, may be inducing a switch back to the activated pattern, as a consequence of "bystander" effects, or "low affinity" stimulation via the clonotypic tcr in responding lymphoid tissue. the question of such cross-reactivity and/or exposure to "high lymphokine" environments for the long-term maintenance of memory is also being addressed. to study the factors which regulate the generation and persistence of specific t cell memory we have used model systems utilizing t cell receptor transgenic mice as a source of enriched naive cells which can be either cultured in vitro to generate effector populations or restimulated in adoptive hosts. in either case one can visualize the development of an expanded effector population. we have documented that the proliferation and il-2 production of the naive t cells depends on their activation by apc expressing high levels of co-stimulatory molecules. we find that b7.1 and icam-1 as costimulators strongly synergize and that increased t cell receptor triggering can both increase the magnitude of the response and decrease its dependence on costimulation. when the cytokines il-4 vs il-12/ifn-γ are present at the initiation of the response of either cd8 or cd4 cells, they dictate that the effectors generated will be polarized either towards il-4 and il-5 secretion or il-2 and ifn-γ secretion, respectively. the fate of the effector population generated and followed in vitro is also tightly regulated by ag, cytokines and probably by costimulation. cd4 effector cells not re-exposed to ag produce no cytokines and they die within 3-4 days. effectors restimulated with ag make massive amounts of cytokines, regardless of the presence of cytokines, at low densities of ag and with little dependence on costimulation.
when there is little il-2 produced and no cytokines added, effectors die rapidly by apoptosis. however, the combination of il-2 and tgf-β blocks apoptosis and supports expansion of the effector population, which is greatly enhanced by periodic ag stimulation. some conditions favor the reversion of effector-like cells to a more resting memory phenotype and these are being further explored. we have also examined the development and maintenance of memory after transfer of effector cells to adoptive hosts. long-lived polarized memory populations are generated from the polarized effectors and these persist for prolonged periods in the absence of apparent ag stimulation. this supports the idea that factors other than antigenic stimulation, present in situ, can support the expansion and maintenance of memory cells. the rabies glycoprotein (g) is the only external protein of the virion and is therefore responsible for any interaction that rabies makes with the host cell during the first steps of the virus cycle. the g protein is also the target of neutralizing antibodies. there are around 450 trimers of g at the virion surface which constitute the spikes visible by electron microscopy. upon exposure to slightly acidic ph, the glycoprotein undergoes a conformational change which results in longer and less regular spikes. strikingly and quite differently from influenza hemagglutinin, this conformational change is reversible: if the ph is raised back to 7.0, the spikes regain their neutral configuration (1). probably as a consequence, the viral infectivity is totally preserved after an exposure of 2 hours at ph 6.4 and 37°c, which induces the conformational change, followed by an incubation at neutral ph. since the conformational change is reversible, there is a ph-dependent equilibrium between the native and the low-ph conformation: the higher the ph, the more spikes are in their native configuration.
two main antigenic sites and several minor sites have been identified on the native rabies glycoprotein (2). specific amino acids belonging to each of the two major antigenic sites are important or essential for viral virulence. for instance a lysine in position 198, which is part of antigenic site ii, is important, although not essential, for the viral virulence. similarly, the arginine 333, which belongs to antigenic site iii, is essential for pathogenicity while dispensable for multiplication in cell culture (reviewed in 3). viral strains mutated at arginine 333 have lost the capability to penetrate certain categories of neurons, suggesting that this mutation affected the recognition of specific receptors or subsequent interactions necessary for the penetration of the virus at nerve terminals. therefore the two main antigenic sites are regions of the glycoprotein which also interact specifically with neurons in the animals. we have found that neutralization requires the fixation of at least one or two igg for every three spikes, irrespective of the antigenic site recognized by the antibody (4). most neutralizing antibodies recognize conformational epitopes which are accessible on the native configuration of the protein. some epitopes remain accessible also on the acidic configuration while others are not. in addition, a minority of antibodies recognize epitopes which are only accessible on the acidic conformation. this is not unlikely, given that each spike has a certain probability of undergoing a conformational change, even at neutral ph. in consequence the surface of the virus probably fluctuates, and g epitopes which are not accessible on the native glycoprotein could be transiently exposed. conformational flexibility at neutral ph and physiological temperatures has also been observed for poliovirus (5). structural flexibility of external proteins could have important implications in virus-host interactions.
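the neutralization stoichiometry reported in the rabies abstract above can be turned into a rough per-virion estimate. the sketch below is only back-of-the-envelope arithmetic on the two figures given in the text (around 450 spikes per virion, and at least one or two igg per three spikes); the function name is hypothetical and the spike count is approximate, so the result is an order-of-magnitude guide rather than a measured threshold.

```python
import math

# "around 450 trimers of g" per virion, as stated in the abstract (approximate).
SPIKES_PER_VIRION = 450


def igg_needed(spikes: int, igg_per_three_spikes: float) -> int:
    """Smallest whole number of IgG meeting a given ratio per three spikes."""
    return math.ceil(spikes / 3 * igg_per_three_spikes)


low = igg_needed(SPIKES_PER_VIRION, 1)   # one IgG per three spikes
high = igg_needed(SPIKES_PER_VIRION, 2)  # two IgG per three spikes
print(f"roughly {low} to {high} igg per virion")
```

under these assumptions, neutralization would correspond to a few hundred bound igg molecules per virion.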
karpus, northwestern university medical school, chicago, il 60611. theiler's murine encephalomyelitis viruses (tmev) are endemic enteric pathogens of wild and colony-reared mice. intracerebral inoculation of susceptible mouse strains leads to a chronic, progressive inflammatory demyelinating disease of the central nervous system (cns) characterized clinically by an abnormal gait, progressive spastic hind limb paralysis and urinary incontinence, and histologically by parenchymal and perivascular mononuclear cell infiltration and demyelination of cns white matter tracts. demyelination is related to persistent cns viral infection. due to the similarity in clinical and histological presentation, tmev-induced demyelination is considered to be a highly relevant model of multiple sclerosis (ms). our current interests are in determining the phenotype, fine specificity, lymphokine profile and tcr usage of cns-infiltrating cells involved in the effector stages of tmev-induced demyelination. based on a variety of experimental evidence, it is clear that demyelination induced in sjl/j mice by infection with the bean strain of tmev is a th1-mediated event: (a) disease induction is suppressed in t cell-deprived mice and by in vivo treatment with anti-i-a and anti-cd4 antibodies; (b) disease susceptibility correlates temporally with the development of tmev-specific, mhc class ii-restricted dth responses and with a predominance of anti-viral igg2a antibody; (c) activated (i.e., il-2r+) t cells infiltrating the cns are exclusively of the cd4+ phenotype; and (d) proinflammatory cytokines (ifn-γ and tnf-β) are predominantly produced in the cns. we have mapped the predominant th1 epitope on the virion to amino acids 74-86 of the vp2 capsid protein. a th1 line specific for vp2 74-86 exacerbates the onset of demyelination in recipient mice infected with a suboptimal dose of tmev.
tmev-infected sjl/j mice fail to exhibit peripheral dth and t cell proliferative responses to the major myelin proteins, mbp and plp, and pre-tolerization with neuroantigens has no effect on the incidence or severity of tmev-induced demyelinating disease, whereas neuroantigen-specific tolerance prevents the induction of relapsing experimental autoimmune encephalomyelitis (eae). in contrast, tolerance induced with intact tmev virions specifically anergizes virus-specific th1 responses and results in a dramatic reduction of the incidence and severity of clinical disease and cns demyelination in sjl/j mice subsequently infected with tmev. these results have important implications for a possible viral trigger in ms, as they indicate that chronic demyelination in tmev-infected mice is initiated in the absence of demonstrable neuroantigen-specific autoimmune responses, and they are consistent with a model wherein early myelin damage is mediated primarily by mononuclear phagocytes recruited to the cns and activated by pro-inflammatory cytokines produced by tmev-specific th1 cells.
the concept that prions are novel pathogens which are different from both viroids and viruses has received increasing support from many avenues of investigation over the past decade. enriching fractions from syrian hamster (sha) brain for scrapie prion infectivity led to the discovery of the prion protein (prp). prion diseases of animals include scrapie and mad cow disease; those of humans present as inherited, sporadic and infectious neurodegenerative disorders. the inherited human prion diseases are genetically linked to mutations in the prp gene that result in non-conservative amino acid substitutions. transgenic (tg) mice expressing both sha and mouse (mo) prp genes were used to demonstrate that the "species barrier" for scrapie prions resides in the primary structure of prp.
this concept was strengthened by the results of studies with mice expressing chimeric mo/sha transgenes from which "artificial" prions have been synthesized. similar chimeric mo/human (hu) prp transgenes were constructed which differ from moprp by 9 amino acids between residues 96 and 167. all of the tg(mhu2m) mice developed neurologic disease ~200 days after inoculation with brain homogenates from three patients who died of creutzfeldt-jakob disease (cjd). inoculation of tg(mhu2m) mice with cjd prions produced mhu2mprpsc; inoculation with mo prions produced moprpsc. the patterns of mhu2mprpsc and moprpsc accumulation in the brains of tg(mhu2m) mice were different. about 10% of tg(huprp) mice expressing huprp and non-tg mice developed neurologic disease >500 days after inoculation with cjd prions. the different susceptibilities of tg(huprp) and tg(mhu2m) mice to human prions indicate that additional species-specific factors such as chaperone proteins are involved in prion replication. diagnosis, prevention and treatment of human prion diseases should be facilitated by tg(mhu2m) mice. in other studies, tg mice expressing wt and mutant moprp were compared. overexpression of the wtmoprp-a transgene ~8-fold was not deleterious to the mice but it did shorten scrapie incubation times from ~145 d to ~45 d after inoculation with murine scrapie prions. in contrast, overexpression at the same level of a moprp-a transgene mutated at codon 101 (corresponding to codon 102 in huprp) produced spontaneous, fatal neurodegeneration between 150 and 300 d of age in two lines of tg(moprp-p101l) mice designated 2866 and 2247. genetic crosses of tg(moprp-p101l)2866 mice with gene-targeted mice lacking both prp alleles (prn-p0/0) produced animals with a highly synchronous onset of illness between 150 and 160 days of age.
the tg(moprp-p101l)2866/prn-p0/0 mice had numerous prp plaques and widespread spongiform degeneration, in contrast to the tg2866 and 2247 mice that exhibited spongiform degeneration but only a few prp amyloid plaques. another line of mice designated tg2862 overexpresses the mutant transgene ~32-fold and develops fatal neurodegeneration between 200 and 400 d of age. tg2862 mice exhibited the most severe spongiform degeneration and had numerous, large prp amyloid plaques. while mutant moprpc(p101l) clearly produces neurodegeneration, wtmoprpc profoundly modifies both the age of onset of illness and the neuropathology for a given level of transgene expression. our findings and those from other studies suggest that mutant and wtprp interact, perhaps through a chaperone-like protein as noted above in studies of tg(mhu2m) mice, to modify the pathogenesis of the dominantly inherited prion diseases.
anton, heidi t. link, and jonathan w. yewdell, laboratory of viral diseases, niaid, bethesda, md 20892-0440.
cd8+ lymphocytes (tcd8+) play an important role in host immunity to viruses and other intracellular parasites. virus-specific tcd8+ recognize mhc class i molecules in association with peptides of 8 to 10 residues derived from viral proteins. this presentation will focus on how and where antigenic peptides are generated by cells. to begin to characterize the nature of proteases involved in the generation of antigenic peptides from cytosolic proteins, we used a panel of recombinant vaccinia viruses expressing different forms of influenza virus nucleoprotein (np). we found that the efficiency of generation of two np peptides is related to the metabolic stability of the source gene product. there has been considerable speculation that such short-lived proteins are degraded by proteasomes in a ubiquitin-targeted process.
our observations, however, call into question the importance of ubiquitin-targeted proteolysis in generating antigenic peptides from exogenously provided or endogenously synthesized viral proteins. we also examined the extent to which antigenic peptides can be generated in the endoplasmic reticulum (er). we found that antigenic peptides could be produced from short precursors (17 residues) but not from a number of full-length proteins (influenza virus hemagglutinin, np, ovalbumin) that are targeted to the er by an nh2-terminal signal sequence. peptides were generated much more efficiently from the cooh-terminus of the 17-residue precursor than from the nh2-terminus. these findings indicate that the er has a much more limited capacity than the cytosol to generate antigenic peptides, but that er proteases (particularly aminopeptidases) could perform the final proteolytic steps in the generation of class i binding peptides from precursors imported from the cytosol by tap, the mhc-encoded peptide transporter.
potential advantages of synthetic peptide or engineered recombinant vaccines are that they can be limited to contain only the specific antigenic determinants for desired responses without other determinants that elicit unwanted responses, and that the sequences of the determinants themselves can be modified to enhance potency or breadth of crossreactivity. however, they can have the disadvantage that any single determinant may be presented by only a limited selection of major histocompatibility complex (mhc) molecules of the species. to overcome the problem of mhc polymorphism, we have identified determinants presented by multiple mhc molecules, and have also located multideterminant regions of the hiv-1 envelope protein that contain overlapping determinants each presented by different class ii mhc molecules, so that the whole multideterminant region is presented by multiple mhc molecules of both mouse and human.
we have made use of "cluster peptides" spanning these multideterminant regions of the hiv-1 envelope to provide help for neutralizing antibody (ab) and cd8+ cytotoxic t lymphocyte (ctl) responses to peptides attached to these helper regions. these synthetic peptide vaccine constructs, containing the p18 peptide from the v3 loop of hiv-1 iiib or mn, elicited both neutralizing ab and ctl in multiple strains of mice. the cluster peptides inducing helper t cells were essential for elicitation of ab and ctl to the p18 segment of both iiib and mn strains of hiv-1 in mice of several mhc haplotypes. several adjuvants were compared for their ability to elicit both ctl and ab simultaneously, without one response inhibiting the other. a single formulation in incomplete freund's adjuvant (ifa) could elicit all 3 responses: neutralizing ab, ctl, and th1 helper cells. the ctl specific for the mn strain p18 peptide crossreacted with strains sc, sf2,2321, and cdc4. the peptides in ifa also elicit high titers of antibodies in rabbits. boosting was found to enhance ctl responses as well as ab responses. these constructs are being prepared for a human immunotherapy trial. these vaccine constructs are potent and also avoid sites on gp160 that are known to elicit enhancing antibodies or autoimmune responses that might contribute to disease pathogenesis. however, we can potentially improve on these by tinkering with the internal structure of the individual epitopes. we have found that replacing a negatively charged glutamic acid residue with an uncharged amino acid in one of the helper determinants makes it 10- to 100-fold more potent in binding to the class ii mhc molecule and in eliciting murine helper t cells that still recognize the natural hiv-1 sequence. thus, such a modified peptide should be more potent as a vaccine, while retaining the ability to elicit t cells that will respond to hiv proteins that of course do not have the altered sequence.
we are currently mapping the critical residues for presentation of one of these peptides by human hla-a2, with the intent of developing modified peptides that will be more potent as components of a human vaccine. thus, by learning how these peptides bind to mhc molecules and t-cell receptors, we can design internally modified determinants to construct more potent or more crossreactive second-generation vaccines. we are testing these vaccine approaches in a mouse model in which mice can be protected against tumor cells expressing hiv proteins, as would an hiv-infected cell.
dna vaccines, comprised of non-replicating plasmids encoding viral proteins, are capable of generating protective immunity in animal models of several viral diseases. in preclinical models of influenza infection, reduced viral shedding was observed in dna-vaccinated ferrets after challenge with the human clinical virus strain, a/georgia/93. cross-strain protection was conferred by dna encoding the major internal proteins (nucleoprotein, np, and matrix, m1) and the surface protein haemagglutinin (ha) from the antigenically distinct previous virus strains, a/beijing/89 and a/hawaii/91. this protective efficacy was greater than that seen by immunization with the widely used clinical vaccine composed of killed a/beijing/89 virus. thus, compared to a killed virus vaccine, protection seen with the dna vaccine against a drifted virus strain was greater. we previously demonstrated that immunization of mice with np dna generated mhc class i-restricted cytotoxic t lymphocytes. mice likewise were protected from death and morbidity following cross-strain challenge (1). ha dna vaccines generated neutralizing antibodies in mice, ferrets and primates, and provided protection in murine and ferret models of influenza. in animal models of other viral diseases, immune responses and protection against viral challenge have been seen after immunization with dna encoding viral proteins.
dna encoding hiv gp120 generated ctl and neutralizing antibodies in monkeys. antigen-specific proliferative responses and, in mice, secretion of high levels of ifn-γ relative to levels of il-4, months after immunization, were also observed. immunization of rabbits with dna encoding l1, the major viral capsid protein of cottontail rabbit papilloma virus (crpv), resulted in neutralizing antibodies and protected against the development of warts after inoculation with crpv. mice immunized with dna encoding the glycoprotein gd from herpes simplex virus type 2 (hsv-2) developed neutralizing antibodies and were protected from death when subsequently challenged with hsv-2. dna vaccines were protective in animal models of various viral diseases. neutralizing antibodies, helper t cells (th1) and cytotoxic t cells were generated. cross-strain protection due to cellular immunity was demonstrated. (1) science, 1993, 259:1745-1749; (2) dna cell biol, 1993
the profile of a neurovirulent virus is determined by its mechanism of entry into the cns (neuroinvasion), the type of cns cell in which it replicates (neurotropism) and its ability to cause pathologic effects in the brain (neurovirulence). whereas neuroectodermal cells, especially neurons, are the target cells of most neurovirulent viruses, the main target cell in the brain for siv and other lentiviruses is the macrophage. infection in, expression of viral antigens by, and products of siv replication exported from these cells result in inflammation and degenerative changes in the brain and concomitant loss of neurons. siv strains that are mainly t-cell tropic cause transient activation of t-cells, and during this period infected t-cells cross the blood-brain barrier and localize in the brain, causing persistent but minimally productive infection and minimal neuropathologic effects. viral proteins but not virions are produced continuously.
by virtue of the tropism of the virus for cd4 t cells, many infected animals eventually become immunosuppressed and develop aids, but not classical neurological disease. viruses which are macrophage-tropic presumably also invade the brain in t lymphocytes, and the viruses infect macrophages in the brain. however, productive virus replication is minimized by antiviral cd8 t cells which suppress (kill?) all virus-producing cells throughout the body, including the cns. productive virus replication in brain macrophages and accompanying inflammatory changes develop only when cd8 cells fail, i.e. after profound immunosuppression sets in. the neurological disease that results from productive virus replication in macrophages in the brain therefore depends on the presence of an appropriate macrophage-tropic viral phenotype invading the neuropil and the development of immunosuppression in the host. the neurological disease could therefore be defined as one of the aids syndromes.
the adenovirus (ad) early transcription region (e3) codes for more than 7 polypeptides, four of which have already been shown to alter the immune response to ad infection. the amount of the class i major histocompatibility complex (mhc) on the plasma membrane can be reduced by the binding of the ad e3 gp19k protein to the mhc heavy chain, which prevents transport of the complex out of the endoplasmic reticulum. this process interferes with presentation of viral peptides to cytotoxic t lymphocytes. cytolysis by tumor necrosis factor-α (tnf) is inhibited by 4 distinct viral polypeptides, 3 of which (the ad e3 14.7k or the complex of the 10.4k and 14.5k proteins) are coded in the e3 region. the e3 polypeptides are translated from a family of viral mrnas that are synthesized from a single viral promoter and processed by alternative splicing. we have studied the functions of the e3 polypeptides in several murine models.
the goals of these experiments were to determine the effects of the ad e3 polypeptides in acute and persistent viral infections as well as in a transplantation model designed to measure whether these viral immunoregulatory proteins would abrogate allogeneic graft rejection. in a vaccinia virus (v.v.) pneumonia model, in which the isolated ad e3 14.7k or ad e3 gp19k genes were inserted into the v.v. pathogen, the ad anti-tnf polypeptide increased viral virulence but the ad anti-mhc polypeptide had no effect. in addition to manipulating the ad e3 genes in viral constructs, several transgenic mouse lines containing the ad e3 genes have been constructed for these experiments. the e3 genomic dna behind the rat insulin promoter (rip) has been used to generate transgenic animals. islets from rip-e3 transgenic animals (h-2b/d) have been transplanted allogeneically to h-2d recipients and remained viable, secreting insulin until the end of the experiment at 94 days; in contrast, control nontransgenic islets of the same genotype were rejected by 21-28 days. the e3 genes behind the native e3 promoter have been inserted into mouse embryos to generate transgenic animals, and the expression of the transgene has been monitored in multiple organs. the e3 promoter of the transgene is responsive to stimulation by the ad e1a following infection with an e3 minus ad 7001 and can also be upregulated by administration of bacterial lipopolysaccharide. the effects of this transgene on ad pathogenesis are currently being studied. thus, these viral immunoregulatory genes have been shown to alter viral pathogenicity during acute infection and to downregulate the host immune response sufficiently to permit islet cell transplantation. these results on manipulating the ad e3 genes for the control of the host immune response also have implications for designing adenovirus vectors for gene therapy.
"emerging" infections can be defined as infectious diseases that either have newly appeared in the population, or that are rapidly increasing their incidence or expanding their geographic range. recent viral examples include aids, ebola, and hantavirus pulmonary syndrome (fnst identified in a 1993 outbreak in the southwestern u.s.). emerging viral infections show a number of common features. most "new" viruses derive from existing viruses that move into new areas or acquire new hosts ("viral traffic"). many are zoonotic (originating from animal sources) (even pandemic influenza appears usually to be a reassortant originating in wildfowl). ecological or environmental changes (either natural, or, often, man-made) may precipitate emergence of new diseases by placing people in contact with a previously unfamiliar zoonotic reservoir or by increasing the density of a mtud host or vector of a pathogen, increasing the chances of human exposure. upon introduction into a human population from a zoonotic reservoir, the newly introduced virus may cause localized outbreaks of disease. some may show rapid variation and evolution upon introduction, and some evidence suggests a role for immune selection in this process. a few viruses (such as hiv) may succeed in establishing themselves and disseminating in the human population, becoming truly "human" infections. human activities can also play an important role in establishment and dissemination. migrations from rural areas to cities, now an accelerating worldwide phenomenon, or other displacements, can introduce remote viruses to a larger population; the virus may then spread along highways and (globally) by air travel. the development of an effective system of surveillance and rapid response is essential, but resources for this are presently inadequate. vaccine development, production, and deployment problems also need to be addressed. 
immunopathology may be a key feature of many of these infections, a number of which manifest as hemorrhagic fevers. many of the life-threatening complications are due to increased vascular permeability. the resemblances to septic shock suggest that cytokines (such as tnf) are likely to be important in the pathogenesis of these infections. the response of cells, such as the macrophage, that induce or synthesize key cytokines may be an important element, and the ability to infect these cells may be one common denominator. why some viruses elicit this response, while other closely related viruses do not, cannot yet be predicted from molecular data. better understanding of these aspects of the immune response should lead to additional therapeutic strategies. (supported by nih grant r01 rr03121.)
genetic approaches have been used to detect and characterize numerous previously unidentified hantaviruses. puumala/prospect hill/sin nombre-like viruses or virus variants are present throughout north and south america, europe and russia. several of the american viruses identified are associated with the newly recognized hantavirus pulmonary syndrome (hps), a severe respiratory illness with high mortality. the genetic relationships of these and previously characterized hantaviruses have been studied by phylogenetic analysis of the nucleotide sequence differences located in pcr fragments amplified from the g2-encoding region of the virus m segments. the relationships observed are consistent with a long-term association of viruses with their primary rodent reservoirs and suggestive of coevolution of host and virus. a sin nombre virus isolate is now available and its genetic characterization has been completed. various virus antigens have been expressed and are being used to probe the interaction of the virus with the host immune system.
hantaviruses cause significant morbidity and mortality throughout the world.
more than 200,000 cases of hemorrhagic fever with renal syndrome (hfrs) are reported annually in asia, europe and scandinavia. the etiologic agents of hfrs are hantaan, seoul and puumala viruses, with hantaan virus causing the most severe form of the disease. in 1993, a new hantavirus was discovered in the united states (initially termed four corners virus), and was identified as the etiologic agent of hantavirus pulmonary syndrome (hps). vaccines for hantaviruses are not readily available, although a number of inactivated viral preparations have been made and tested in asia. recurrent problems with inactivated hantaviral vaccines have been lot-to-lot variability, the need for repeated immunizations, and their inability to elicit long-lasting neutralizing antibody responses in immunized volunteers. because of such limitations on traditional vaccine development for these viruses, as well as the viruses' hazardous nature and slow, low-titer replication in cell culture, we used a recombinant dna approach to develop a vaccine for hfrs. our vaccine is a recombinant vaccinia virus expressing the m segment of hantaan virus under control of the vaccinia virus 7.5 k promoter and the s segment under control of the 11 k promoter. the m segment, which encodes the g1 and g2 envelope proteins, was included because of our findings that: (1) immunization with vaccinia- or baculovirus-expressed g1 and g2 induced a neutralizing and protective immune response in hamsters; and (2) neutralizing antibodies to g1 or g2 could passively protect hamsters from challenge with virulent virus. the s segment, which encodes the nucleocapsid protein (n), was included because of our finding that hamsters immunized with baculovirus-expressed n also were protected from subsequent infection. although the protective immune response to n is probably cell-mediated, the importance of such a response is presently not well defined.
assessment of our vaccine in preclinical studies indicated that immunized hamsters developed neutralizing antibodies and were protected from displaying viral antigen in their lungs after challenge. in a phase i, dose-escalation clinical study, the vaccine induced neutralizing antibodies in individuals immunized subcutaneously with approximately 10^7 pfu of the recombinant virus. in addition to humoral responses, immunized volunteers developed a cell-mediated immune response as indicated in lymphocyte proliferation assays. larger clinical studies, including alternate routes or booster immunizations, are planned. based on these studies, we anticipate that the vaccine will be efficacious for preventing hfrs caused by hantaan and the antigenically closely related seoul virus. we are studying the cross-protective properties of this vaccine with more distantly related hantaviruses such as puumala virus. although we expect this vaccine to be safe as well as effective, we also are investigating the use of more attenuated pox-viruses as vaccine vectors.
infection of mice with lymphocytic choriomeningitis virus (lcmv) results in a profound expansion in the number of spleen cd8 t cells and in the induction of virus-specific ctl activity. thereafter, the cd8 t cell number declines, and the ctl activity diminishes, though the frequency of lcmv-specific precursor ctl per cd8 cell, as assessed by limiting dilution assays (lda), is remarkably stable throughout long-term immunity. the decline in t cell and total spleen leukocyte number at the late stages of acute infection is associated with high levels of apoptosis, as detected by the in situ nucleotidyl transferase assay. apoptosis occurred in both the t cell and b cell populations, with the b cells dying in clusters. this apoptosis was also seen in transgenic mice ectopically expressing bcl-2 in the t and b cells and in c57bl/6 lpr/lpr mice, which have a mutation in the fas gene.
t cells from the infected animal underwent apoptosis in vitro when stimulated through the tcr with anti-cd3, thereby explaining some of the immunosuppression seen during acute viral infections. memory cells persisted for over a year and could be found in blast-size cell populations. challenge of lcmv-immune mice with either pichinde virus, vaccinia virus, or murine cytomegalovirus led to the reactivation of the lcmv-specific ctl response. lda analyses showed unexpectedly that these heterologous viruses crossreacted with subpopulations of lcmv-specific memory t cells. this memory t cell response to virus from an earlier infection was associated with enhanced immunopathology and enhanced clearance of virus during a heterologous virus challenge. over the course of the acute infection, ctl specific for the second virus were preferentially expanded over the crossreactive ctl, and after the acute infection, when the t cell response had subsided, ctl memory to the first infection had decreased. there is therefore a network of memory t cells which contribute to and are modulated by infections with putatively unrelated viruses, and apoptosis plays a homeostatic role in the course of these t cell responses.
immune responses to live attenuated retroviral vaccines, r. paul johnson*†, cara wilson†, kelledy manson§, michael wyand§, bruce walker†, ronald c. desrosiers*. *new england regional primate research center, southborough, ma 01772; †infectious disease unit, massachusetts general hospital, boston, ma 02114; §tsi/mason, worcester, ma
immunization of rhesus macaques with live attenuated retroviruses deleted in nef can induce protective immunity against challenge with pathogenic siv. development of protective immunity in these vaccinated animals occurs only after several months of infection, with maximal protection observed after one year.
the specific immune responses responsible for mediating protection have not been defined, and little is known about the cellular immune responses in animals vaccinated with these live attenuated retroviruses. we have analyzed cellular and humoral immune responses in rhesus macaques and chimpanzees infected with live attenuated retroviruses. siv-specific neutralizing antibodies were present in vaccinated animals, but did not clearly correlate with protection against challenge. ctl specific for envelope and gag were identified in vaccinated macaques studied 12 or more months after vaccination. quantitation of siv-specific ctl activity in one of these animals using limiting dilution analysis revealed a relatively high precursor frequency of cytotoxic t lymphocytes, up to moo0 for gag and 1/8500 for envelope. cd8+ lymphocytes obtained from vaccinated macaques were also able to suppress siv replication in autologous cd4+ cells. suppression mediated by unstimulated autologous cd8+ cells was maximal when cells were in direct contact with siv-infected lymphocytes, but cd8+ cells activated by an anti-cd3-specific monoclonal antibody were able to release a potent soluble inhibitor of siv replication. in contrast to the relatively vigorous ctl response present in vaccinated macaques, we were not able to detect consistent ctl activity in chimpanzees infected with an hiv-1 molecular clone (nl43) or attenuated viruses at periods up to one year after infection, despite the use of a variety of stimulation techniques. proliferative responses to hiv p24 and gp160 were observed in chimpanzees infected with nl43 and attenuated variants. although the relative contribution of these immune responses to protective immunity is not known, the relative vigor of the cellular immune responses observed in vaccinated macaques suggests they may play a role in mediating resistance to challenge.
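the precursor frequencies quoted above come from limiting dilution analysis. the abstract does not say how they were calculated; the sketch below assumes the conventional single-hit poisson model, in which the fraction of negative wells at a dose of n cells is f0 = exp(-f·n), so that f is the slope of -ln(f0) against n. the function name, the doses and the well fractions are hypothetical and chosen only for illustration.

```python
import math


def precursor_frequency(cells_per_well, negative_fraction):
    """Estimate precursor frequency f under the single-hit Poisson model.

    Assumes F0 = exp(-f * n) and returns the least-squares slope, forced
    through the origin, of -ln(F0) plotted against cell dose n.
    """
    numerator = 0.0
    denominator = 0.0
    for n, f0 in zip(cells_per_well, negative_fraction):
        if not 0.0 < f0 <= 1.0:
            raise ValueError("negative-well fraction must lie in (0, 1]")
        numerator += n * (-math.log(f0))
        denominator += n * n
    if denominator == 0.0:
        raise ValueError("at least one non-zero cell dose is required")
    return numerator / denominator


# Hypothetical data, generated from a known frequency of 1/8500 so the
# estimate can be checked against the value used to build it.
doses = [2000, 4000, 8000, 16000]
fractions = [math.exp(-n / 8500) for n in doses]
frequency = precursor_frequency(doses, fractions)
print(f"estimated precursor frequency: 1/{1 / frequency:.0f}")
```

with real plates the negative fractions are noisy, and maximum-likelihood or chi-square minimization fits are generally preferred over this simple regression; the sketch only shows how a figure such as 1/8500 relates to the well counts.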
objectives: to analyze the magnitude and specificity of the ctl response to hiv-1, and to determine the tcr usage by clonal ctl responses in infected persons, including persons with documented infection of up to 15 years with cd4 cells > 500/mm3. methods: hiv-1-specific ctl activity was evaluated in pbmc as well as in pbmc stimulated in vitro with hiv-1-infected autologous cd4 cells, using target cells infected with recombinant vaccinia viruses expressing hiv-1 proteins. ctl epitopes recognized by these individuals were determined using cloned effector cells. quantitative cultures were performed by endpoint dilution, and viral quantitation was determined by qc-pcr. tcr analysis was performed by pcr, using both family-specific primers and anchored pcr, followed by sequencing. sequence analysis of ctl epitopes in autologous viruses was determined by pcr amplification and sequencing. clonal frequency was analyzed in pbmc by oligonucleotide probe to the cdr3 region of the tcr. studies performed in long-term non-progressing persons indicate the presence of a vigorous and broadly directed ctl response. detailed epitope mapping in a person infected for 15 years, who by qc-pcr had i00 ngld induce profound immunotoxicities characterized as almost complete inhibition of virus-induced cd8+ t cell expansion and ctl activation, and up to 2 log increases in viral replication [orange, wolf, and biron, j. immunol. 152:1253, 1994]. serum tumor necrosis factor (tnf) is also observed under these conditions. the studies reported here further characterize the expression and function of tnf in this context. northern blot and in situ hybridization analyses demonstrated that il-12 induced tnf-α expression and that lcmv infection synergized with il-12 for this induction. administration of antibodies neutralizing tnf reversed the il-12-induced immunotoxicities in lcmv-infected mice and restored anti-viral defenses.
the tnf-mediated immunotoxicities appeared to result from an induced cellular sensitivity to the factor, as splenic leukocytes and cd8+ t cells isolated from lcmv-infected mice were more sensitive to tnf-mediated cytotoxicity in culture than were equivalent populations prepared from uninfected mice. additional physiological changes were observed in il-12-treated uninfected mice and were dramatically elevated in il-12-treated virus-infected mice, including: 1) decreases in body weights; 2) elevation of circulating glucocorticoid levels; and 3) decreases in thymic mass. these changes were also reversed by anti-tnf. the results delineate a unique tnf-mediated immunotoxicity and have significant implications concerning detrimental consequences of in vivo tnf and/or il-12 for protective anti-viral responses.
lactate dehydrogenase-elevating virus (ldv), a naturally occurring virus, causes a persistent infection in mice and presents an ideal model for the study of immune modulation during acute and persistent virus infections. within a few days following infection with ldv there is a pronounced polyclonal activation of b cells followed by the suppression of primary b cell responses to t-dependent ag. we investigated the effect of acute and persistent ldv infection on the development of a memory b cell response to the model protein antigen, horse cytochrome c (cyt), by employing a modification of the splenic fragment assay. about a 50% decrease in the frequency of responding ag-specific memory b cells was observed in balb/c mice infected with ldv, whether the mice were immunized with cyt at the time of ldv infection or three weeks later. this may be due in part to a defect in t cell help, since in cultures of normal memory b cells and t cells derived from ldv acutely infected mice the frequency of responding b cells was also decreased two-fold. in situ hybridization using a cdna probe specific for ldv revealed two patterns of ldv rna within the spleen. twenty-four hr p.i.
ldv rna was located within the marginal zone, surrounding each follicle. this pattern is consistent with permissive macrophages. during persistence viral rna could no longer be detected in the marginal zone, but was located within the follicles. the absence of ldv-permissive cells within the follicular region suggests that the source of ldv rna is not due to ongoing viral replication. one possibility is that circulating virus is trapped by a specific cell population within the follicle. the effect of virus trapping within the spleen provides a mechanism by which ldv and other viruses can modulate immune cell function during persistent infections. ifn-γ can be produced by activated nk cells. this cytokine enhances immune responses by augmenting macrophage antigen presentation. viral infection induces ifn-α/β and nk cell activation. changes in splenic architecture, cell trafficking, and cytokine expression were examined during viral infections of c57bl/6 mice. at times coinciding with ifn-α/β production and nk cell activation, there was a redistribution of nucleated cells from red pulp to white pulp regions in spleens isolated from mice infected with either lymphocytic choriomeningitis virus (lcmv) or murine cytomegalovirus (mcmv). cell transfer experiments with dioctadecyl-3,3,3',3'-tetramethyl indocarbocyanine perchlorate- or pkh26-gl-labeled bone marrow cells isolated from normal mice demonstrated an infection-induced accumulation of non-t/non-b cell populations along recipient splenic marginal zones. flow cytometric analyses demonstrated that approximately 10% of the transferred bone marrow cells accumulated in spleens after 20 hrs and 30% of these expressed the nk cell marker, nk1.1. in vivo antibody treatment procedures, to eliminate cell subsets in donor mice, demonstrated that the cells localizing at the marginal zone were derived from agm1+ and nk1.1+ populations. 
a small subpopulation of marginal zone cells in infected mice was shown to be expressing high levels of ifn-γ mrna by in situ hybridization. treatment with anti-agm1 or anti-nk1.1 antibodies eliminated both endogenous nk cells and the ifn-γ mrna positive cells. these data demonstrate that newly derived nk cells accumulate along marginal zones. the results also suggest that this trafficking pattern may act to enhance immune responses by facilitating delivery of cytokines to specialized antigen presenting cells. david segal, janet ruby, alistair ramsay and ian ramshaw. department of cell biology, john curtin school of medical research, po box 334, canberra, act, 2601 australia cytokine expression has been shown to correlate with protective or ineffective immune responses in a number of disease models. recently there has been the suggestion that immunity to some retroviruses is associated with the production of certain patterns of cytokines. to explore this further we have used rauscher murine leukemia virus (r-mulv) infection of c57bl/6 (resistant) and balb/c (susceptible) mice to elucidate the role of cytokines in immunity to retroviruses. initially the in vitro proliferation of spleen and lymph node cells from infected mice was examined. in response to stimulation with immobilised anti-cd3 antibodies the proliferation of spleen but not lymph node cells from infected mice was found to be rapidly suppressed. susceptible balb/c mice exhibited a much greater suppression than resistant c57bl/6 mice. the cause of this suppression is under investigation; however, the immunosuppressive molecules nitric oxide and prostaglandins are not involved. in vitro cytokine production by spleen and lymph node cells from r-mulv infected mice was determined. in response to stimulation with immobilised anti-cd3 antibodies, spleen cells from infected balb/c mice produced diminishing amounts of ifn-γ and il-2. 
in contrast, spleen cells from infected c57bl/6 mice produced ifn-γ and il-2 to levels that were only slightly less than uninfected controls. il-6 production by spleen cells from infected mice of both strains was at levels higher than uninfected controls. anti-cd3 stimulated lymph node cells from infected mice produced elevated ifn-γ, suggesting that suppressed cytokine production is spleen specific. expression of cytokine genes in vivo is currently being investigated using rt-pcr to detect cytokine mrna in the spleens of infected mice. we have previously shown that primary resting murine b lymphocytes are non-permissive for vesicular stomatitis virus (vsv); however, a productive infection can be induced when infected b cells are activated with anti-immunoglobulin (α-ig) plus il-4 or lipopolysaccharide (lps). we posit that vsv in unactivated primary b cells provides a paradigm of persistently infected lymphocytes and activation-dependent recall of an active infection. analysis of the behavior of virus in unstimulated b cells during long term culture and the requirements for subsequent induction of productive infection has been limited by the poor survival of primary cells in culture. we circumvented this limitation by using highly purified small b cells from mice transgenic for the bcl-2 proto-oncogene, whose expression markedly extends in vitro survival of unstimulated primary b cells. overexpression of bcl-2 does not alter b cell infection or induction of a productive infection by activators during acute infection. infection does not affect b cell survival in culture. unstimulated virus-infected b cells produce primary viral mrnas but not viral proteins or infectious particles (pfu) during culture. persistently infected b cells stimulated with α-ig plus il-4 produced a fully productive vsv infection at all times analyzed, up to 3 weeks post infection. 
in contrast, vsv production in persistently infected b cells activated with lps markedly declined relative to acutely infected activated cells (50-100 fold by week 1 and 1,000 fold by week 2). cells were not completely refractory to lps activation as vsv protein was produced. the selective lps deficiency is unique to persistently infected cells as uninfected cultured b cells proliferate and differentiate to produce antibody upon lps activation. these data show that a persistent infection may selectively alter the host cell response to previously productive activators, which may as a consequence interfere with immune regulation. rsv-g glycoprotein specific t cells preferentially secrete il-5 and predispose to pulmonary eosinophilia. anon srikiatkhachorn and thomas j. braciale, the beirne b. carter center for immunology research and the departments of microbiology, pathology, and pediatrics, university of virginia health sciences center, charlottesville, va 22908 we studied the immune responses to two different glycoproteins of respiratory syncytial virus (rsv) in a murine model. balb/c mice were immunized with recombinant vaccinia virus expressing either rsv fusion glycoprotein (vac-f), attachment glycoprotein (vac-g) or β-galactosidase (as a control). these mice were given rsv intranasally three weeks after priming and then sacrificed 5 or 14 days later. spleens and bronchial lymph nodes were harvested for in vitro culture and lungs were harvested for histologic studies. we found that bulk cultures obtained from both vac-f and vac-g immunized animals secreted both th1 and th2 type cytokines when stimulated with rsv infected spleen cells. however, the levels of il-5 and ifn-γ were higher in bulk cultures derived from vac-g primed animals while the levels of il-2 were higher in the bulk culture from vac-f primed animals. 
the il-4 and il-5 production was relatively short lived, since spleen cells and bronchial lymph node cells obtained from mice sacrificed 14 days after intranasal inoculation produced much lower levels of il-4 and il-5 while the levels of il-2 and ifn-γ production were comparable to bulk cultures obtained from mice at the peak of infection. there was little inflammatory response in the lungs obtained from mice immunized with the control vaccinia. in contrast, lungs from mice immunized with vac-f or vac-g showed significant infiltration of inflammatory cells. there was a striking infiltration of eosinophils in the lungs from mice primed with vac-g. these eosinophils could be detected around major bronchi and blood vessels, as well as, in some cases, in lung parenchyma. this study suggests that the immune responses to different viral glycoproteins may be distinct and may play important roles in viral pathogenesis. during infection of normal mice with lymphocytic choriomeningitis virus (lcmv), nk cell responses peak on day 3 and subside as cd8+ t cell responses are activated at day 7 post-infection. in contrast, β2m-/- mice, lacking cd8+ t cells, have dramatically elevated nk cell responses on day 7 post-infection. the β2m-/- response is evidenced by increased nk cell activity, as well as up to 5-fold increases in blast and total nk1.1+cd3- cell numbers. nk cell responses in normal mice are cyclosporin a (csa)-resistant and interleukin (il)-2-independent, whereas day 7 nk cell responses in β2m-/- mice are csa-sensitive and il-2-dependent. to investigate the role of additional cytokines in regulating cellular responses during acute viral infections, production and function of il-4 and transforming growth factor-β (tgf-β) were examined. induction of il-4 mrna, at late times post-infection of normal mice, was shown by in situ hybridization of t cell-enriched splenic leukocytes and polymerase chain reaction (pcr) amplification of cdna from rna. 
elisas of media conditioned with cells isolated on days 0, 3, 5, 7, 9, and 14 post-infection demonstrated delayed induction of il-4 protein as compared to ctl activation. tgf-β, evaluated in biological and elisa assays, was induced maximally at days 7 to 9 post-infection. the kinetics of tgf-β production by cells from infected β2m-/- mice was similar to that of normal mice. however, cells from β2m-/- mice produced il-4 at early but not at late times post-infection. together, these results suggest that either il-4 is a critical cytokine for shutting off nk cells during normal responses to viral infection, or that the β2m-/- context modulates responsiveness of nk cell subsets to other late cytokines. studies are in progress to distinguish between these two possible mechanisms. the induction of fever in response to infection is an important host defense mechanism that enhances aspects of the immune response and restricts the replication of some microorganisms. vaccinia virus, a member of the poxvirus family, is a complex cytoplasmic dna virus that encodes a variety of proteins that interfere with host immune functions, such as complement regulatory factors and soluble receptors for il-1β, tnf and ifn-γ. here we show that expression of the vaccinia virus il-1β receptor (vil-1βr) in the wr strain prevents the febrile response and reduces the severity of infection in intranasally inoculated mice. fever was recorded on days 1-6 after infection of mice with a vil-1βr deletion mutant, but not in animals infected with wild type wr or a virus revertant. these studies were extended to other virus strains that were used as smallpox vaccines, and expression of the vil-1βr was consistently found to prevent the onset of fever. vaccinia virus induced a severe hypothermia after 6 days in infected mice that was independent of vil-1βr expression and correlated with virus replication in the brain, the organ that controls body temperature. 
these results represent the first example of a virus mechanism to inhibit the host febrile response and suggest a central role for soluble il-1β in the induction of fever in poxvirus infections. measles virus (mv) infection can depress cell-mediated immune responses for months following clinical disease. mv is known to infect the thymus during human illness and this may contribute to immune suppression. we have used the scid-hu mouse with co-implants of human fetal thymus and liver to determine the effect of virulent and avirulent strains of mv on the thymus. scid-hu mice were infected by direct inoculation of the graft with 10^3 pfu of either a wild type strain of mv (chicago-1, chi-1) or an attenuated strain (moraten, mor) and sacrificed at intervals over 28 days. peak viral titers, as judged by plaque assay on vero cells, were reached by chi-1 on d4 (10^5.7 pfu/third of implant), and mor on d21 (10^3.2 pfu/third of implant). hematoxylin/eosin stained sections of chi-1-infected thymuses showed marked distortion of the cortex and medulla by d4 with thymocyte poilolosis and decreased cellularity. by d14, these implants were mostly devoid of normal thymocytes. mor-infected thymuses showed relatively preserved architecture and cellularity. suspensions of the cells from implants stained with mabs to cd3, cd4 and cd8 were analyzed by flow cytometry. there were significant decreases in the cd4+cd8+ cell population by d10 with complete loss of all such cells by d28 with chi-1, and only modest reductions with mor. immune fluorescence staining of sections with a mv mab to hemagglutinin (ha) and abs for either human cytokeratins (ae1/ae3) or cd15 co-localized mv predominantly to epithelial and monocytic cells. additionally, mv antigen was present diffusely by d4 in both cortex and medulla in chi-1 infection whereas mor-infected implants had only patchy distribution by d21. only rare cells stained both with mv ha and cd2 or cd4. 
mv ha was not expressed over background on any cd4+ cells judged by facs. we conclude that mv replicates in the scid-hu thymic implant primarily in epithelial and monocytic cells, and that the attenuated virus reproduces more slowly and with less cellular disruption. little mv ha could be demonstrated in thymocytes; therefore the data suggest that significant infection of the thymic epithelial stroma disrupts the thymic microenvironment which normally supports and aids in selection of immature t cells. part of the long-term immune suppression seen in mv infection may be due to infection of the thymic epithelial stroma with subsequent loss of thymocytes. it is becoming increasingly evident that many poxviruses contain genes that enable the virus to evade the host's immune system. myxoma virus is a leporipoxvirus and is the causative agent of myxomatosis, a rapidly lethal disease in the european rabbit (oryctolagus cuniculus). one possible mechanism of immune evasion is virus-induced downregulation of cell-surface receptors important for an immune response. cell-surface levels of several receptors on a rabbit t cell lymphoma cell line (rl-5) were monitored by flow cytometry. following infection with myxoma virus, cell-surface levels of cd4 were found to drop dramatically. other cell surface antigens such as cd18, cd43, and cd45 were unaffected during infection with myxoma virus. furthermore, the downregulation of cd4 by myxoma virus could be inhibited by treating cells for an extended period of time with pma, suggesting that the downregulation was not simply a masking of the epitope via viral antigens. analysis of cd4 levels in the presence of cytosine arabinoside indicates that late gene expression is not necessary for the modulation. since the tyrosine specific protein kinase p56lck associates with the cytoplasmic domain of cd4 we have also examined the association of p56lck with cd4 as well as steady state levels of p56lck during viral infection. 
the modulation of surface cd4 has also been described in hiv infected t cells, suggesting that the loss of cell-surface cd4 may be a common viral immune evasion tactic by lymphotropic viruses. in addition, stably-transfected cell lines expressing either us11 or us2-6 gene products significantly reduced levels of mhc class i heavy chain. studies are in progress to further define the mechanism by which these viral gene products alter immune recognition. cytotoxic t lymphocytes (ctl) may play a significant role in containing the spread of hiv in infected individuals. although hiv infection is associated with immune suppression, a vigorous ctl response has been detected in infected adults. hiv can be transmitted from mother to child. one third of vertically infected children have a rapid evolution toward disease, with onset of aids before 18 months. the other two thirds remain asymptomatic for years. the bimodal course of disease evolution in hiv-infected children could be related to differences in the host immune control of viral replication. hiv-specific ctl response from fresh and in vitro activated pbmc of hiv-infected children was measured. the vast majority of infected children had detectable hiv-specific ctl, which were cd3+cd8+. we previously showed that among children with a slow disease progression, fresh ctl were more frequent in the p2a (paucisymptomatic) group than in the p1 (asymptomatic) and the p2b-f (symptomatic) groups. the cohort of children has now been followed for 4 years, and 46 children have been tested at least once. we found that ctl responses were less frequent in the children with a rapid disease progression than in the children with a slow disease progression at the same age. our data suggest that ctl response is an important factor in delaying disease evolution. we, as well as others, 
have proposed that sag function is critical to the ability of milk-borne mmtv to infect mice. to determine whether this is the case, we created transgenic mice (hyb pro/cla) with a frameshift mutation in the sag gene. young hyb pro/cla mice (< 10 weeks of age) showed no deletion of their cognate vβ14+ t cells, unlike transgenic mice carrying a functional sag gene; however, a slow, progressive loss was seen in the hyb pro/cla mice as they aged, indicating that it was due to expression of wild type sag protein. thus, as the hyb pro/cla mice aged, there was production of virus that appeared to lose the cla mutation. the hyb pro/cla mice produced transgene rna in their lactating mammary gland and shed virus in their milk. their nontransgenic offspring showed infection with transgene-encoded mmtv because they had the typical slow deletion of vβ14+ t cells characteristic of c3h mmtv infection and because we detected transgene-derived mmtv rna in their mammary glands. cloning and sequencing of the viral rna produced by the nontransgenic offspring of the hyb pro/cla mice showed that recombination between the mtv-1 endogenous viral rna and the transgene-encoded rna occurred, such that the frameshift introduced by the cla mutation was repaired. these results show that there is selection of infectious virus that contains a functional sag gene. thus, it appears that the only virus that is capable of being transmitted by the milk-borne infection pathway is that which encodes a functional sag protein. hepatitis b virus (hbv) causes acute and chronic liver diseases and is closely associated with hepatocellular carcinoma. in order to understand the cellular immune response against hbv in chronic hbv infection, t cell proliferation, cytotoxicity and cytokine production were studied. 
we found that although the majority of asymptomatic hbsag carriers and patients with chronic hepatitis b (chb) had no proliferative response to hbsag, some individuals in both groups showed significant t cell proliferation against hbsag. in contrast, the proliferative t cell response to hbcag in asymptomatic hbsag carriers was significantly stronger than that in patients with chb with acute exacerbation. in addition, the frequency of hbcag-reactive t cell precursors measured by limiting dilution assay was much higher in asymptomatic hbsag carriers than in patients with chb. therefore, t cell responses against hbsag and hbcag are regulated differently in chronic hbv infection. furthermore, we demonstrated hbsag- and hbcag-specific cytotoxic t lymphocyte (ctl) activity in asymptomatic hbsag carriers, using autologous hbsag- and hbcag-expressing lymphoblastoid cell lines (lcl) as target cells, respectively. the cloned ctl were able to produce ifn-γ, tnf-α or gm-csf after stimulation. these findings demonstrate that the t cell response to hbv is not completely suppressed in asymptomatic hbsag carriers. most of them have a strong hbcag-specific response and some of them have an hbsag-specific response. transcription and tax the human t-lymphotropic virus type i (htlv-i) promoter contains the structural features of a typical rna polymerase ii (pol ii) template. the promoter contains a tata box 30 bp upstream of the transcription initiation site and binding sites for several pol ii transcription factors, and long poly a+ rna is synthesized from the integrated htlv-i proviral dna in vivo. consistent with these characteristics, htlv-i transcription activity was reconstituted in vitro using tbp, tfiia, rtfiib, rtfiie, rtfiif, tfiih and pol ii. in hela whole cell extracts, however, the htlv-i ltr also contains an overlapping transcription unit (otu). 
htlv-i otu transcription is initiated at the same nucleotide site as the rna isolated from the htlv-i-infected cell line, mt-2, but was not inhibited by the presence of α-amanitin at concentrations which inhibited the adenovirus major late pol ii promoter (6 µg/ml). htlv-i transcription was inhibited when higher concentrations of α-amanitin were used (60 µg/ml), in the range of a typical polymerase iii (pol iii) promoter (va-i). purified tax1 transactivates this promoter 5- to 10-fold in vitro. interestingly, basal and tax1-transactivated transcriptional activity of the htlv-i ltr could be reconstituted with the 0.5 m phosphocellulose fraction. these observations suggest that the htlv-i ltr contains overlapping tax1-responsive promoters, a typical pol ii promoter and a unique pol iii promoter which requires a distinct set of transcription factors. tax1 further transactivates in vitro a polymerase ii template containing the 21 base pair repeats cloned upstream of the ovalbumin promoter and g-free cassette. tax1-transactivated transcription was concentration dependent and inhibited by low concentrations of α-amanitin. flaviviruses are arthropod-borne viruses whose route of infection is via the skin. they are mostly neurotropic and responsible for significant human morbidity and mortality. the classic cell-mediated immune response to a viral infection may be influenced by the ability of these viruses to modify expression of cell-surface molecules involved in the presentation of antigen to, and activation of, t cells. the skin langerhans cell is the prototypic nonlymphoid dendritic cell and as such is uniquely placed to participate in a response against epidermally-acquired viral infections. the migratory properties of these cells contribute to their role as initiators of t cell-mediated immune responses within the draining lymph node. 
we have previously shown that infection of epidermal cells in vitro by the flavivirus west nile (wnv) results in an increase in mhc class i and ii expression on the majority of epidermal cells and langerhans cells respectively. in this study a technique for infecting the epidermis with wnv in vivo was developed. time-dependent increases in the surface expression of a number of antigens which are involved either directly or in a co-stimulatory capacity in initiating a cell-mediated immune response were detected on both the majority of epidermal cells and the langerhans cell population using flow cytometry. these increases were detectable as early as 16 hours after infection. a significant decrease in the percentage of langerhans cells remaining in the epidermis was observed within 48 hours of infection. the phenotypic changes observed in vivo are analogous to those described following in vitro culture of langerhans cells. these results, together with the reduction in langerhans cell numbers, may represent the in situ maturation and concomitant migration of these cells as a consequence of virus-induced cytokines within the skin microenvironment. which cause a wide variety of illnesses with high morbidity and mortality in humans throughout the world. their high genomic stability argues for a survival strategy related more to interaction with the vertebrate host immune response than to a dependence on viral genetic mutation. our previous work has shown that west nile virus (wnv) infection of many cell types directly induces functional increases in class i and ii mhc expression. we report here that wnv infection of human embryonic fibroblasts (hef) results in the increased expression of cd54 by two distinct mechanisms. an early, direct cytokine-independent mechanism operates within 2 h of virus infection, while an indirect mechanism, regulated by type 1 interferon (ifn), operates within 24 h of virus infection. 
cd54 expression increased by 4-5-fold within 2 h of wnv infection on hef, and by 6-7-fold within 24 h. wnv-inactivated, conditioned supernatants removed from infected hef cultures after 4 h incubation did not alter cd54 expression on unstimulated hef, whereas conditioned supernatants from 24 h-infected cultures increased cd54 expression by about 1.5-2-fold after incubation for 24 h, but not after 4 h, similar to cd54 induction by 200 u/ml of ifn-β. increased cd54 expression on hef by wnv was also cell-cycle dependent. cd54 increased only in quiescent, contact-inhibited infected hef in g0 phase. in contrast, induction of cd54 by types 1 and 2 ifn was not cell-cycle dependent. other viruses, including the double-stranded dna viruses vaccinia and adenovirus 2 and 5, and the single, positive-stranded rna alphavirus, semliki forest virus, did not induce cd54 expression on hef after 24 h. another alphavirus, ross river, was able to induce cd54 but only by the indirect mechanism of type 1 ifn-dependent release. poly i:c also increased cd54 expression to the same extent as ifn-β after 24 h, making it unlikely that the early increase was due to a nonspecific viral effect. the closely related flavivirus, kunjin, induced increased cd54 expression in a manner similar to wnv. the ability of flaviviruses to induce increased cd54 expression directly within a few hours of infection may be an important virus-host survival strategy promoting cell-cell adhesion and hence possible further viral infection/replication. recognition of viral peptides presented on the cell surface in association with class i mhc molecules leads to lysis by cytotoxic t cells (ctl) and forms an important part of the immune response to hiv infection. hiv has a high mutation rate and variation in the region of the viral epitope may allow evasion of this immune response. 
variation could theoretically affect processing of the antigen, binding of the epitope to the hla molecule or recognition of the presented epitope on the cell surface. we have studied proviral sequence variation in gag and ctl responses in a number of hla b8 patients infected with hiv. amino acid substitutions, such as a lysine to arginine change at position 3 of the p17 gag nonamer ggkkkyklk, lead to loss of recognition of the peptide by ctl from the patient whose provirus contained this sequence. these variant peptides bind to hla b8 with comparable affinity to the index peptide, suggesting that this loss of recognition is likely to be caused by changes in the interaction between the hla-peptide complex and the t cell receptor. other changes, such as lysine to arginine or glutamine at position 7, not only cause loss of recognition, but also lead to inhibition of lysis of targets bearing the index peptide. thus it appears that in addition to loss of recognition by cytotoxic t cells, naturally occurring epitope variants may act as "antagonists", as has been demonstrated in mhc class ii systems. antagonism may be an important mechanism allowing immune escape by hiv. genes. subsequent complex formation between peptide, class i and β2-microglobulin in the er results in stable cell surface expression of the trimeric mhc-i molecule. in previous studies we showed that in hpv-16 positive cervical carcinomas there was a loss of mhc-i protein expression, which correlated at the single cell level with loss of tap protein. in this study we investigated whether loss of tap and mhc-i is mediated by an hpv-16 encoded protein. 
human keratinocytes were transfected with various hpv-16 constructs including pat16, the full length genome; pat16esx, the full length genome with a premature stop codon in e5; puc.et16, the e6 and e7 oncogenes only; and pkve5, expressing e5 from mouse moloney ltr. the different constructs were transfected into primary keratinocytes, and cloned cells were grown in medium with or without γ-interferon (γ-ifn) for 48 hours. cells were harvested and total rna and protein prepared for northern and western blots respectively. western blots showed very low steady state levels of tap-1 and mhc-i heavy chains in the cells with pat16 as well as those containing e5 alone, which were marginally increased by γ-ifn. in contrast, primary keratinocytes, pat16esx and puc.et16 lines showed comparable tap-1 and mhc-i protein levels, which increased after γ-ifn treatment. northern blots showed no differences in the amounts of tap-1 and mhc-i mrna between the different cell lines. the data indicate that expression of hpv-16 e5 leads to post-transcriptional loss of mhc-i, presumably by interfering with tap. to map and characterize functional differences between e1a of ad5 and ad12, we previously constructed a series of hybrid ad5/12 e1a genes and used them with ad12 e1b to transform primary hooded lister rat kidney cells. at least two regions within the first exon of ad12 e1a were identified which influenced tumorigenicity. this study further examines the role of these regions in tumorigenicity by analyzing their effect on cell surface mhc class i expression and sensitivity to class i-restricted cd8+ as well as to non-class i-restricted nks. the bcrf1 open reading frame of epstein-barr virus exhibits remarkable sequence homology with the coding sequences of interleukin-10 from a variety of organisms. many of the numerous immunological properties ascribed to interleukin-10 are shared by the product of bcrf1 and this has led to it being termed viral interleukin-10. 
in order to investigate the activity of viral interleukin-10 (vil-10) and its interactions with the human interleukin-10 receptor we have expressed the protein in a bacterial system and in the eukaryotic cos-7 expression system. the bacterially expressed vil-10 was partially purified and used to set up two assays to measure il-10 activity: i) the increase in igm secretion from an ebv transformed b cell line -mt4.l and ii) the downregulation of class ii hla expression on the human monocytic cell line thp-1. a series of deletion mutants (both n- and c-terminal as well as an internal deletion to remove a putative heparin binding domain) were constructed to identify possible domains within the vil-10 protein that interact with the hil-10 receptor and confer its biological activity. a number of these mutants have been expressed in the cos-7 expression system and their structure and biological activity are currently being assessed. the identification of the domains within vil-10 that interact with the receptor or accessory proteins may aid in the understanding of the possible role of vil-10 within the ebv life cycle and in the pathogenesis of the numerous diseases associated with the virus. generation. to further test the role of ctl in ad pathogenesis, viruses lacking the ctl epitopes were tested. when mutants that lack the immunodominant ctl epitope in e1a were used, a second immunorecessive epitope in e1b became the predominant target of ctl. these findings are important since human ad is currently being tested as a vector for gene therapy of cystic fibrosis. our data suggest that when constructing ad vectors to be used for gene therapy, one must retain either the 10.4k or 14.7k genes to decrease pathology and that deleting the genes that encode the antigens that are recognized by ctl does not prevent the generation of ad specific ctl. the interferons (ifns) are a family of cytokines whose functions include the protection of cells against viral infection. 
type i ifns include the 15 ifn-α subtypes and ifn-β that compete for binding to the same cell surface receptor, while type ii ifn (ifn-γ) binds to a different receptor. the orthopoxviruses, of which vaccinia virus (vv) is the prototypic member, have developed a number of anti-ifn strategies. the vv e3l protein competitively binds dsrna and prevents the activation of the ifn-induced and dsrna-activated protein kinase (pkr), while the vv k3l protein shows sequence similarity to the eukaryotic initiation factor 2α (eif2α) that is phosphorylated and inactivated by pkr. the k3l protein competitively binds the kinase and blocks host eif2α phosphorylation and hence ifn-induced inhibition of host protein synthesis. orthopoxviruses also suppress cytokine action by expressing soluble cytokine receptors that bind and sequester the ligand; to date soluble receptors for interleukin-1β, tumour necrosis factor and ifn-γ have been described. supernatants from vv-infected cells were found to contain a soluble inhibitor of type i ifn that was conserved in most of the orthopoxviruses tested. the inhibitor was produced early in infection and did not inhibit ifn-γ. the ifn-α/β inhibitor was mapped and the gene expressed from recombinant baculovirus. the inhibitor blocked the binding of 125i-ifn-α to u937 cells, and binding of 125i-ifn-α to supernatants from baculovirus- and vv-infected cells demonstrated that the inhibitor functioned as a soluble receptor for ifn-α/β. direct binding of 125i-ifn-α to vv wr supernatants revealed that the soluble ifn-α/β receptor had a high affinity for type i ifn. deletion of the gene from the vv genome and ligand blotting of the soluble receptor demonstrated that ifn binding was encoded by a single protein. competitive binding curves using ifn-α from other species revealed that the poxvirus soluble ifn-α/β receptor bound human and bovine ifn with high affinity but murine ifn with relatively low affinity.
interestingly, the soluble ifn-α/β receptor is highly conserved in variola virus. given the importance of ifn in antiviral defense it is likely that the soluble ifn-α/β receptor plays an important role in the virulence of the orthopoxviruses.

endogenous processing of a viral glycoprotein for presentation to cd4+ t cells has defined a previously under-investigated pathway in antigen processing and presentation. it may be important not only for pathogens, but also for self-proteins, and thus may be involved in self-tolerance. we have been characterizing the processing of the er-restricted gpt glycoprotein of vesicular stomatitis virus (vsv) biochemically and enzymatically, by cellular localization using confocal immunofluorescence, by cellular fractionation, and by t cell recognition assays. by flow cytometry, gpt is undetected on the plasma membrane; in contrast, the wild type protein (g) is readily found following infection of a20 cells with a vaccinia virus vector, leading to endogenous synthesis. the gpt can be found exclusively in the er compartment using co-localization with markers for er (signal peptide binding protein, calnexin), and not in the golgi compartment (α-mannosidase ii, wheat germ agglutinin), endosome, lysosome, or surface plasma membrane. this is consistent with the characteristics of the localization of the proteases which appear to be responsible for its degradation. work is in progress to localize the site of peptide binding to mhc heterodimers. supported by nih grant ai18083 to csr.

presentation of an out-of-frame class i restricted epitope. t.n.j. bullock and l.c. eisenlohr, department of immunology, thomas jefferson university, philadelphia, pa 19107.
antigen presentation by class i mhc molecules is thought to require the degradation of fully formed proteins in the cytosol. this degradative process supplies oligopeptide epitopes for transport into the endoplasmic reticulum (er) where they can interact with and stabilize class i molecules.
stable class i molecules, associated with β2-microglobulin, can then proceed to the cell surface where they present the epitopes to t cell receptors. the generally accepted model for protein translation, the scanning hypothesis proposed by kozak, is thought to describe the traditional method of translation for the majority of proteins. we wished to test the hypothesis that any internal methionine that is in good translation initiation context can be a source of short peptides, which may then be processed into class i epitopes. we have created a frameshift mutation in the influenza pr8 nucleoprotein gene (np), the target of the ctl response of several inbred mouse strains. np contains three class i restricted epitopes at amino acids 50-57 (h2-kk), 147-155 (h2-kd) and 366-374 (h2-db). the frameshift was introduced 26 amino acids upstream of the h2-kd epitope. the mutated genes were then recombined with vaccinia virus and tested for presentation using ctl restricted to each of the epitopes described above. we found that, whilst presentation of the h2-kk epitope was unaffected by the frameshift, the epitope proximal to the frameshift (h2-kd) was no longer presented to appropriately restricted ctl. however, presentation of the distal h2-db epitope was retained. therefore we have shown, using a viral protein and a viral expression system, that out-of-frame epitopes can be processed and presented to ctl. work is ongoing to confirm that internal methionines are capable of providing a platform for the initiation of translation for in-frame and out-of-frame epitopes.

the fine specificity of t cell recognition of peptide analogues of the influenza nucleoprotein epitope np 383-391 (srywairtr) was studied using hla b27-restricted influenza-specific cytotoxic t cell (ctl) clones, of defined t cell receptor (tcr) usage, derived from unrelated individuals following natural infection.
synthetic analogue peptides were synthesized containing single amino acid substitutions, and tested both for binding to hla b*2705 in vitro, and for presentation to ctl clones by hla b27-positive targets. even conservative amino acid substitutions of the peptide residues p4, p7, and p8 profoundly influenced ctl recognition, without affecting binding to hla b*2705. these amino acid side chains are thus probably directly contacted by the tcr. ctl clones which used the tcr vα14 gene segment (but not those using tcr vα12) were also sensitive to p1 substitutions, suggesting that the tcr alpha chain of these clones lies over the n terminus of bound peptide, and that the "footprint" of certain tcrs can span all exposed residues of a peptide bound to mhc class i. these results, taken together with previous structural and functional data, suggest that, for nonamer peptides bound to hla b27, p1, p4 and p8 are "flag" residues with tcr-accessible side chains.

the e3/19k protein of human adenovirus type 2 (ad2) is a resident transmembrane glycoprotein of the endoplasmic reticulum. its capacity to associate with class i histocompatibility (mhc) antigens abrogates cell surface expression and the antigen presentation function of mhc antigens. at present, it is unclear exactly which structure of the e3/19k protein mediates binding to mhc molecules. apart from a stretch of approximately 20 conserved amino acids in front of the transmembrane segment, e3/19k molecules from different adenovirus subgroups (b and c) share little homology. remarkably, the majority of cysteines are conserved. in this report, we examined the importance of cysteine residues (cys) for structure and function of the ad2 e3/19k protein. we show that e3/19k contains intramolecular disulfide bonds. by using site-directed mutagenesis, individual cysteines were substituted by serines and alanines, and mutant proteins were stably expressed in 293 cells.
based on the differential binding of monoclonal antibody tw1.3 and cyanogen bromide cleavage experiments, a structural model of e3/19k is proposed, in which cys 11 and cys 28 as well as cys 22 and cys 83 are linked by disulfide bonds. both disulfide bonds (all four cysteines) are absolutely critical for the interaction with human mhc antigens. this was demonstrated by three criteria: loss of e3/19k coprecipitation, lack of transport inhibition and normal cell surface expression of mhc molecules in cells expressing mutant e3/19k molecules. mutation of the three other cysteines at positions 101, 109 and 122 had no effect. this indicates that a conformational determinant based on two disulfide bonds is crucial for the function of the e3/19k molecule, namely, to bind and to inhibit transport of mhc antigens.

previous studies have suggested that several abundant cmv proteins are major immunogenic targets in seropositive adults. we are interested in defining the major viral protein targets of a cd8+ ctl response, in order to derive a vaccine strategy for individuals who are unable to mount lymphokine-dependent immune responses because of immunosuppression. hla-typed and cmv-positive normal volunteers who have hla-a alleles that represent ~75% of the u.s. population are being tested to determine which of 5 abundant cmv proteins they recognize by a cd8+ ctl response: p28, p65, p150, ie, and gb. t cell lines will be derived in order to unambiguously determine the hla restriction of the cd8+ ctl response to each of these proteins. proteins which are recognized by the most hla-diverse population will be further characterized by mapping of class i epitopes through the use of t cell clones derived from the polyclonal cell lines by limiting dilution. the defined epitopes will form the basis of a vaccine strategy to augment the memory responses of seropositive volunteers against cmv.
these epitopes will be used to boost the ctl precursor frequency of bone marrow transplant donors as a means to transfer cellular immunity to immunosuppressed hematologic transplant recipients. an alternative strategy is to immunize seropositive individuals with recombinant viral proteins as a means to boost immunologic memory. we are pursuing that strategy in a transgenic murine model of hla-a2.1 developed by dr. l. sherman (scripps institute, la jolla). we are vaccinating the transgenic mice with two well defined cmv proteins, p65 and gb, together with either of two lipid-based adjuvants, commercially available dotap™ (boehringer-mannheim) or mf59™ (chiron, emeryville, ca). our preliminary studies with hsv-2 gb demonstrate that both adjuvants are effective at eliciting murine class i restricted responses against the protein. current studies are evaluating the recognition properties of the adjuvant-cmv protein complexes by hla-a2 as a restriction element in the transgenic model.

the ctl response to sendai virus in c57bl/6 mice is directed almost exclusively to a single h-2kb-restricted epitope derived from the virus nucleoprotein, np324-332 (sev-9). analysis of 18 independent t cell hybridomas generated from c57bl/6 mice following primary sendai virus infection has shown that a very diverse repertoire of tcr is selected in response to this epitope. crystallographic analysis of sev-9 bound to kb has shown that the side chains of peptide residues phep1, glyp4, asnp5, and alap8 protrude towards the solvent and are potentially available for recognition by the tcr. notably, residues glyp4 and asnp5 protrude prominently from the peptide binding site due to their location on a bulge in the center of sev-9. to determine the importance of each of these residues for t cell recognition, we analyzed hybridoma responses to sev-9 analogs substituted at each of these four positions. preliminary data showed that there generally appeared to be dominant recognition of glyp4 and asnp5.
however, individual hybridomas exhibited distinct patterns of fine specificity for residues phep1 and alap8. thus, individual hybridomas were dependent on one, both, or neither of these residues for recognition of sev-9. these data are consistent with a critical role for glyp4 and asnp5 in governing tcr-sev-9/kb recognition and suggest a structural basis for the diversity of the tcr repertoire selected by this epitope.

previous results from this laboratory demonstrated that the dominant influenza a epitope recognized by hla-a2.1 restricted ctl from hla-a2.1 transgenic mice was the m1 peptide epitope that is immunodominant in human ctl responses. however, analysis of a large number of ctl lines revealed a subset of influenza a/pr/8/34-specific murine ctl that recognized an hla-a2.1 restricted epitope distinct from m1. using recombinant vaccinia viruses encoding different influenza gene segments, the epitope recognized by these ctl was shown to be derived from the a/pr/8 ns1 protein. because these ctl did not recognize targets infected with the a/alaska/6/77 strain of influenza, candidate peptide epitopes were synthesized based on sequences that included an hla-a2.1 specific binding motif and that differed between a/pr/8 and a/alaska. all of these ctl recognized a nonamer and a decamer peptide which contained a common 8 amino acid sequence and two distinct sets of binding motif residues. however, the nonamer peptide was able to sensitize ctl for half maximal lysis at 80-2500 fold lower doses than either the octamer or decamer. the homologous peptide derived from a/alaska ns1 contained conservative amino acid changes at positions 4 and 8 and was not recognized at any tested concentration, although it bound with higher affinity to hla-a2.1 than the peptide from a/pr/8. the a/pr/8 ns1 nonamer epitope was also recognized by human influenza a specific ctl derived from two individuals.
these results substantiate the general utility of hla class i transgenic mice for the identification of human ctl epitopes for other pathogens.

furthermore, the recombinant dhfr was functional in the induction of a gb epitope-specific ctl response upon immunization of c57bl/6 mice. these results indicate that a viral epitope expressed in a cellular protein can be efficiently processed, presented and recognized by epitope-specific ctl, and suggest that cellular proteins can be used to express ctl epitopes for induction of cd8+ immune responses.

virus-specific cytotoxic t lymphocytes (ctl) were generated a day later at this site. to determine which apc was capable of stimulating virus-specific ctl precursors in the mln, b, t and dendritic cells from the mln of influenza-infected mice were separated and examined for the presence of virus. the predominant cell type which contained infectious virus was the dendritic cell. b and t cells from the mln contained little, if any, virus. the apc capacity of these populations was tested by their ability to stimulate virus-specific t cell hybridomas. only dendritic cells from the mln of influenza-infected mice were able to stimulate virus-specific t cell hybridomas, although all apc populations from both naive and influenza-infected mice were effective stimulators after in vitro pulsing with the appropriate influenza peptide. potential apc populations were also separated from the lung. virus was detected in bronchoalveolar macrophages and dendritic cells but not b or t cells. both macrophages and dendritic cells isolated from influenza-infected lungs could stimulate virus-specific t cell hybridomas. the ability of the mln and lung apc populations to stimulate naive cd8+ t cells and generate virus-specific ctl is currently being examined.
virus infected cells present to ctl only a very limited number of peptides intracellularly processed from a viral protein, even when many peptides bearing the mhc class i-restricted binding motif are present in the protein. infection of h-2b mice with lymphocytic choriomeningitis virus (lcmv) induces a cd8+ ctl response directed against three well-characterized epitopes presented by h-2db molecules: np396-404 (fqpqngqfi), gp33-43 (kavynfatcgi) and gp276-286 (sgvenpggycl). the h-2db motif is characterized by a sequence of 9 to 11 a.a. with two anchor residues: asn at position 5 and a hydrophobic residue (met, ile, leu) at the c-terminus. the lcmv np and gp proteins contain thirty-one other peptides exhibiting the db motif. however, no ctl response against one (or more) of these peptides has been characterized. peptide binding to mhc is a critical step in antigen presentation. the aim of this study was therefore to analyze the binding properties of the potential db lcmv peptides. the 34 lcmv peptides and 11 known db-selective peptides were synthesized and their mhc binding affinities measured in two db-specific binding assays. most of the lcmv peptides (28/34) did not bind to db. the other 6 (including the 3 epitopes) and all the known db peptides showed good affinity. comparison of the sequences (good vs. non binders) allowed the identification of auxiliary anchors required for high binding affinity and of negative elements hampering mhc binding. in addition to the main anchors, the positive and negative factors at secondary residues play a crucial role in governing peptide-mhc interactions. knowledge of such factors might be of importance for the prediction of mhc-restricted ctl epitopes.

etienne joly, andrea gonzalez, carol clarkson, jonathan c. howard and geoffrey w. butcher. laboratory of immunogenetics, department of immunology, the babraham institute, cambs cb2 4at, uk.
tap transporters from rats can be divided into two allelic groups, depending on their capacity to provide the rt1.aa molecule with an appropriate level of suitable peptides [1]. recent results suggest that this might correlate with the rt1.aa molecule requiring arginine-ended peptides (powis et al., manuscript submitted), which the tapb allele of the transporter is unable to translocate across the er membrane efficiently [2,3]. rt1.a alleles are naturally linked with the tapa or the tapb allelic group [4]. we have set out to characterise various alleles for the rt1.a molecule, and find that, for the majority of tapa-associated rt1.a molecules, 3 acidic residues line the c/e pocket, dictating arginine as c-terminal anchor residue for the bound peptides. on the other hand, in tapb-associated rt1.a molecules, one acidic residue at the most is found in the c/e pocket, which certainly results in a different anchor residue for the bound peptides. the selective pressure of viral infections must have driven this coevolution, which dramatically affects the array of peptides presented to cytotoxic t lymphocytes.

cytotoxic t lymphocyte responses in hiv infection can be impaired due to variation in the epitope regions of viral proteins such as gag. we show here an analysis of variant epitope peptides in three gag epitopes presented by hla b8. seventeen variant peptides were examined for their binding to hla b8; all but one bound at concentrations comparable to known epitopes. all except two could be seen by ctl clones grown from hla b8 positive hiv-1 infected patients and were therefore immunogenic. however, in one haemophiliac patient studied in detail, there was a failure to respond to some of the peptides that represented virus present as provirus in his peripheral blood. in one case his ctl had previously responded to the peptide. thus there was a selective failure of the ctl response to variant epitopes.
this impaired reaction to new variants and failure to maintain responses to some epitopes late in hiv infection could contribute to the loss of immune control of the infection.

pira, anna ferraris, daniele saverino, peifang sun and annalisa kunkl; dept. immunology, san martino hosp. univ. of genoa, 16132 genoa, italy.
th epitopes present on viral proteins can be recognized by specific th cells if appropriately expressed by antigen presenting cells (apc) as a result of uptake and processing. since viral epitopes are not simply present in the context of viral proteins, but also in the context of whole viral particles, it is important to determine the role of the molecular and/or structural context in antigen uptake, processing and presentation. therefore we have generated panels of cd4+ human t cell lines and clones specific for different hiv antigens (gp120, p66, p24), in order to test their ability to respond to the same epitopes present within synthetic peptides, recombinant proteins or inactivated virions (provided by g. lewis, dept. microbiology, univ. maryland, baltimore). we could identify t cell lines and clones that were able to discriminate the molecular and structural context of the epitopes. certain t cells, in fact, responded to peptides and proteins, but not to viral particles, whereas other t cells were also able to proliferate when challenged in vitro with autologous apc and viral particles. the data suggest that the human th cell repertoire specific for viral antigens contains t cells that can discriminate the molecular and structural context of th epitopes. it will be interesting to ascertain whether t cells specific for epitopes that can only be recognized when provided in the context of a soluble molecule, but not of a viral particle, have any relevance in in vivo protection, or are a simple by-product of the cellular immune response.

eric g. pamer, merceditas s.
villanueva, section of infectious diseases, yale university school of medicine, new haven, ct 06520
listeria monocytogenes is a gram positive bacterium that infects macrophages and secretes proteins into the host cell cytosol. the murein hydrolase p60 is secreted by l. monocytogenes and is required for complete bacterial septation. in the infected macrophage, secreted p60 is processed by the host cell into the nonamer peptide p60 217-225, which is presented to cytotoxic t lymphocytes by the h-2kd mhc class i molecule. we have used strains of l. monocytogenes that secrete different amounts of p60 to show that the rate of p60 217-225 production is proportional to the amount of antigen secreted into the host cell cytosol. p60 is degraded in the host cell cytosol with a half life of 90 minutes. the appearance of p60 217-225 is coupled to the degradation of newly synthesized p60. we have determined the rate of intracellular p60 secretion, and by accounting for the rate of p60 degradation we estimate that approximately 35 p60 molecules are degraded to produce one p60 217-225 epitope. this ratio is maintained over a range of intracellular antigen concentrations. our findings provide an estimate of the efficiency of antigen processing and demonstrate the remarkable capacity of the mhc class i antigen processing pathway to accommodate new epitopes.

we have isolated and characterized three cytotoxic t lymphocyte (ctl) clones from the peripheral blood of two acute seroconversion patients and one patient in the first trimester of pregnancy. these clones were cd8+ and class i hla-restricted by the b7 molecule. all three clones recognized iiib and rf but not mn strains of hiv-1. using vaccinia vectors expressing truncated versions of the hiv-1 envelope, the clones were found to recognize an epitope within amino acids 287-364, but not including 312-328, of gp120. further mapping of the epitope with synthetic 20-mer peptides overlapping by 10, or 25-mers overlapping by 8, was unsuccessful.
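the peptide-scanning step just described (20-mers overlapping by 10, or 25-mers overlapping by 8) amounts to a sliding window over the protein sequence. the following is a minimal sketch in python, not the authors' procedure: the function name `overlapping_peptides` and the 78-residue placeholder sequence are hypothetical, and the placeholder is not the gp120 sequence.

```python
def overlapping_peptides(sequence, length, overlap):
    """Tile a sequence with peptides of a fixed length and overlap.

    Returns a list of (start, end, peptide) tuples using 1-based,
    inclusive positions. The final peptide may be shorter than `length`
    if the sequence does not divide evenly.
    """
    if not 0 <= overlap < length:
        raise ValueError("overlap must satisfy 0 <= overlap < length")
    step = length - overlap  # how far the window advances each time
    peptides = []
    start = 0
    while start < len(sequence):
        end = min(start + length, len(sequence))
        peptides.append((start + 1, end, sequence[start:end]))
        if end == len(sequence):
            break  # the sequence is fully covered
        start += step
    return peptides


# Hypothetical 78-residue placeholder (same length as a 287-364 region).
placeholder = "ACDEFGHIKLMNPQRSTVWY" * 3 + "ACDEFGHIKLMNPQRSTV"

twenty_mers = overlapping_peptides(placeholder, length=20, overlap=10)
twentyfive_mers = overlapping_peptides(placeholder, length=25, overlap=8)
```

a window-based set like this can miss an epitope when the minimal peptide is split across adjacent windows or is recognized poorly in a longer context, which is one reason a motif-guided search with shorter peptides can succeed where scanning does not.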
the sequence of the region of gp120 recognized by these clones was compared to the predicted hla-b7 peptide binding motif and a possible matching region was found. using shorter peptides corresponding to this potential epitope recognition site, the minimum epitope recognized by the clones was determined to be the 10 aa sequence rpnnntrksi spanning amino acids

we have further pursued a strategy to define a minimal cytotoxic epitope for a vaccine against cmv infection using t cell clones derived from individuals who have the mhc b35 gene (kind gifts of drs. riddell and greenberg, fred hutchinson cancer research center and dr. robert siliciano, johns hopkins university medical center). we tested by chromium release assay (cra) the recognition of a series of b35 allelic variants of ebv-lcl by b35-restricted and cmv- or hiv-specific t cell clones. several conclusions quickly became apparent. the previously described b*3501 peptide epitope from pp65 was not able to prime the autologous b35 ebv-lcl for killing by the pp65-specific ctl, whereas a recombinant vaccinia virus expressing whole pp65 could cause the same cell line to be recognized and killed in the same experiment. in addition, an hiv gp41-specific cd8+ ctl which has a defined minimal cytotoxic epitope will only recognize and kill a subset of b35 ebv-lcl. the two t cell clones will not recognize each other's autologous ebv-lcl. the resolution of this interesting phenomenon comes from sequence analysis of the hla class i b genes from both ebv-lcl. ebv-lcl which contain the b*3502 allele are recognized and killed by the pp65-specific t cell clone, and cell lines carrying b*3501 alleles are recognized by the hiv gp41-specific t cell clone. we conclude that the reported cmv pp65 b*3501 restricted epitope is not correct, since the ctl in question will only recognize b*3502 alleles in combination with the correct pp65 epitope.

fragments with or without a signal sequence sensitize rma-s/kd to a similar limited extent.
these data are consistent with an inefficient movement of peptides from the cytoplasm into the er by a tap-independent mechanism and do not reveal a processing-competent compartment within the secretory pathway.

peptide transport by the transporter associated with antigen processing (tap) was studied using a microsome system as previously reported by heemels et al. in this system, a radiolabeled synthetic peptide which can undergo n-linked glycosylation is used as the indicator peptide for the transport studies. the transport efficiency of synthetic peptides corresponding to antigenic peptides restricted to the murine kd molecule was measured by inhibition of labeled peptide transport into the microsomes. the transport efficiency of three kd epitopes in the type a influenza virus, np147-155, ha204-212 and ha210-219, was found to be similar. an 11 amino acid peptide corresponding to ha204-214, which contains the 204-212 epitope, was transported at a similar efficiency to the 9 amino acid minimum epitope. however, when the peptide sequence is further extended by one amino acid to residue 215, this peptide is poorly transported. these results suggest that the flanking region of an epitope can dramatically influence the transport of the epitope. when the transport kinetics of tap were studied using the microsome system, the vmax for transporting the indicator peptide (a variant of the np epitope that has the sequence tynrtrali) was found to be 260.8 fmol/minute (±30.5). the km for this peptide was found to be 231.9 nm (±31.8).

bypassing a block in antigen processing for class i-restricted cytotoxic t cell recognition. amy j. yellen-shaw and laurence c. eisenlohr. thomas jefferson university, philadelphia, pa., 19107.
previous work from our laboratory showed that processing of an influenza nucleoprotein (np) epitope (amino acids 147-155) expressed endogenously from a recombinant vaccinia virus "minigene" is severely impaired when a flanking sequence (the dipeptide threonine-glycine) is appended to the c-terminus of the construct (147-158/r-). the inhibition of processing is overcome by placing the unprocessed peptide in the context of the full-length np molecule, demonstrating that regions of a protein outside the epitope itself critically affect the ability of the proteolytic machinery to fragment the protein appropriately. to determine the requirements for bypassing the block in antigen processing, we have constructed an array of "minigene"-expressing vaccinia recombinants in which the unprocessed epitope is extended by varying lengths toward either the c-terminus or the n-terminus of the np molecule. our results show that while an extension of the c-terminus by only one amino acid restores processability, a much longer extension of the n-terminus (75 < n < 100 amino acids) will also allow the substrate to be processed. it is therefore clear that a full-length, properly folded molecule is not required for liberation of the blocked epitope, and that probably more than one mechanism can contribute to enhancement of substrate proteolysis. we hypothesize that the c-terminal extension allows recruitment of an endopeptidase versus exopeptidase ("trimming") activity which is capable of cleaving the difficult bond. we considered the possibility that the n-terminal extension rescues processing by recruitment of the ubiquitin-dependent degradation system. to address this possibility we replaced all available ubiquitination sites (lysine residues) in one of the rescued constructs (50-158/r-) to see if the construct would still be processed and presented. the six available lysine residues were changed to arginine using pcr-based mutagenesis.
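at the sequence level, the lysine-to-arginine replacement described above is a simple substitution that removes every candidate ubiquitination site while preserving the positive charge at each position. the sketch below is illustrative only: the function name `lysine_to_arginine` and the short example peptide are hypothetical, and the example is not the 50-158/r- construct.

```python
def lysine_to_arginine(sequence):
    """Replace every lysine (K) with arginine (R) in a one-letter protein sequence.

    Returns the mutant sequence and the 1-based positions that were changed,
    so the number of substituted sites can be checked against expectation.
    """
    sequence = sequence.upper()
    positions = [i + 1 for i, residue in enumerate(sequence) if residue == "K"]
    mutant = sequence.replace("K", "R")
    return mutant, positions


# Hypothetical example peptide (not taken from the nucleoprotein sequence).
mutant, changed = lysine_to_arginine("MAKTKGLKR")
```

listing the changed positions makes it easy to confirm that the intended number of lysines (six in the construct described here) were found before primers are designed.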
the resulting construct (termed 6r) was recombined into vaccinia virus and tested for presentation to np-specific ctl. the 6r construct was presented at a level equivalent to that seen with the wild-type 50-158/r- construct. this result provides clear evidence that entry into the ubiquitin-dependent degradation pathway is not responsible for rescue of presentation in this system and, more importantly, that ubiquitination is not required for processing of all large substrates.

chia-chi ku, li-jung chien, and chwan-chuen king, institute of epidemiology, national taiwan university, taipei, taiwan, r.o.c.
dengue virus (den) can cause dengue fever (df) and dengue hemorrhagic fever (dhf) / dengue shock syndrome (dss), and den-2 was the most common serotype found in dhf outbreaks globally. current hypotheses suggest that dhf may be associated either with antibody-dependent enhancement (ade) or with viral virulence. den can replicate predominantly in monocytes/macrophages (m/m), but whether peripheral blood lymphocytes (pbls) are target cells of den still remains controversial. in order to compare whether various clinically derived den-2 isolates interact with m/m and lymphocytes in different manners, we used two isolates: the plo46 strain (obtained from a df patient during the taiwan 1981 outbreaks) and the 16681 strain (isolated from a dhf patient in thailand by cdc, usa) to infect primary m/m and lymphocytes as well as several types of cell lines. the primary lymphocyte culture consisted of nonadherent cells obtained after 24 hr adherence of pbmcs, whereas the primary m/m culture was collected by depletion of lymphocytes using anti-cd3/cd19 mab and complement prior to the adherence procedure, and the purity of the m/m culture was checked by cd14 surface marker staining. supernatants (sn) of virus were harvested at various time points post infection, with or without several treatments.
our preliminary data showed that the dhf-associated den-2 strain had a higher viral yield in m/m of certain ages and in a promonocytic cell line (hl-cz) than the taiwan df-associated den-2 strain. in addition, this dhf-den2 strain was more likely to infect the promonocytic (hl-cz) than the well differentiated monocytic (ctv-1) and lymphocytic (h9) cell lines and also had higher peak yields than den-1 virus in hl-cz cells. interestingly, the dhf-den2 strain replicated much more efficiently in primary lymphocytes whether or not these cells were activated with pha, whereas the taiwan df-den2 strain was hardly detectable in sn of both activated and non-activated lymphocyte cultures. therefore we conclude that (1) different strains of dengue virus could interact quite differently with immune cells, (2) the stage of m/m differentiation might be an important permissive determinant for dengue virus infection and replication, and (3) den virus strain virulence, a more important factor than lymphocyte activation status, seemed to determine whether a strain would infect human pbls. further studies should focus on the detailed mechanisms of virus and immune cell interactions. (2) when viral yields were enhanced earlier than day 5 post infection, it provided tremendous opportunity to attack the immune system and finally may lead to severe disease.

hiv-1 using recombinant immunoglobulin molecules, marie-claire gauduin, graham p. allaway, paul j. maddon, carlos f. barbas, dennis r. burton, and richard a. university school of medicine, new york, ny 10016.
primary isolates of hiv-1 have been shown to be less sensitive to neutralization by immune sera, monoclonal antibodies and cd4-based molecules than t cell line-adapted strains of hiv-1. we studied two immunoglobulin molecules for their ability to neutralize primary isolates of hiv-1. igg12 is an immunoglobulin molecule created from a combinatorial phage expression library and reacts with the cd4 binding site (cd4-bs) on gp120.
cd4-igg2 is a recombinant molecule in which the variable domains of both heavy and light chains of igg2 were replaced with the first and second immunoglobulin-like domains of human cd4. both molecules have previously been shown to effectively neutralize hiv-1 in vitro. ex vivo neutralizations were performed as follows: igg12 and cd4-igg2 were added at 25 µg/ml to wells containing serial dilutions of plasma from hiv-1-infected patients and pha-stimulated peripheral blood mononuclear cells from seronegative donors. p24 production was measured over 14 days of culture and an end-point titer of hiv-1 in the presence and absence of added antibody was determined. both igg12 and cd4-igg2 were found to reduce the original hiv titer from seven plasma samples with high virus titer (>250 tcid50/ml) by up to 625-fold. this is in comparison to soluble cd4, which only reduced viral infectivity by 55-fold at the same concentration. in vitro binding and neutralization assays on isolates recovered from plasma confirm the potency and breadth of neutralization by these two molecules. these studies suggest that recombinant antibodies directed at the cd4-bs of hiv-1 gp120 are able to effectively neutralize primary isolates of hiv-1 and may be useful in dissecting the mechanisms of resistance to neutralization by other antibodies.
dillner and p. heino, microbiology & tumor biology center, karolinska institute, stockholm, sweden. hpv 16 is the major cause of anogenital precancers in man. the search for neutralizing epitopes that could form the basis for a preventive vaccine has shown that the surface-exposed immunodominant epitopes of the capsid are strongly conformation-dependent, which has precluded detailed epitope analysis. similarly, immunization with whole, denatured capsid proteins has only identified linear immunodominant epitopes positioned on the inside of the capsid.
reasoning that linear surface-exposed epitopes should exist, but might be cryptic, a set of 66 overlapping synthetic peptides corresponding to the entire hpv16 capsid proteins was used to generate hyperimmune sera. several antisera against 3 different peptides were reactive with intact hpv16 capsids at titers up to 1:150,000.
hiv-1 serum antibodies and mucosal iga. basil golding, john inman, paul beining, jody manischewitz, robert blackburn and hana golding. div. of hematology and viral products, cber, fda, and lab. of immunology, niaid, bethesda md 20892. previously, we showed that hiv-1 proteins conjugated to b. abortus (ba) could generate anti-hiv-1 neutralizing antibodies in mice even after depletion of cd4+ t cells. in this study a 14-mer peptide from the v3 loop of hiv-1 (mn) was synthesized (v3) and coupled to ba and klh. balb/c mice were immunized twice i.p. with these conjugates at two week intervals. v3-klh induced mainly igg1, whereas v3-ba induced all igg isotypes but igg2a predominated. fecal extracts from mice immunized with v3-ba were shown by elisa to contain iga antibodies. sera from these mice bound gp120 expressed on the surface of infected cells. sera from mice immunized with v3-ba inhibited syncytia formed between cd4+ t cells and chronically infected [hiv-1 (mn)] h9 cells. inhibition of syncytia formed by other hiv-1 lab. strains correlated with the degree of their homology with the v3 region of hiv-1 (mn). to mimic the effect of hiv-1, mice were depleted of cd4+ cells using anti-l3t4 at the time of primary or secondary immunization. following primary immunization, cd4+ t cell depletion abrogated v3-klh antibody responses, whereas responses to v3-ba were retained and sera from these mice were able to inhibit gp120-mediated syncytia. in secondary responses, cd4+ t cell depletion prevented boosting to v3-klh, but v3-ba increased anti-v3 and syncytia-inhibiting antibodies. these results suggest that: 1. b.
abortus can provide carrier function for a peptide and induce both serum and mucosal antibody responses, and 2. that infection with hiv-1 with subsequent impairment of cd4+ t cell function would not abrogate anti-hiv-1 antibody responses if b. abortus is used as a carrier to stimulate memory responses.
nucleotide sequence analysis of the vh genes revealed the usage of one particular vh germline element (vh61-1p) in all clones. this finding allowed the determination of somatically mutated positions in the vh regions. two vsv-ind neutralizing antibodies expressed vh and vl genes in complete germline configuration, whereas the rest of the clones showed somatic mutations which were evidently selected in an antigen-dependent manner. however, binding affinities of mutated and unmutated antibodies were comparably high. in order to determine the influence of somatic point mutations on one single antibody we generated a monovalent single chain antibody (fv-ck) of a mutated clone and reverted it stepwise to germline configuration by means of site-directed mutagenesis. surprisingly, even the germline configuration of fv-ck could neutralize vsv-ind, even though its binding affinity was lower than that of the mutated fv-ck. every single somatic point mutation tested improved the binding avidity although some mutations reduced affinity. thus, during the course of vsv-ind infection some antibodies are subjected to avidity maturation although this is not required for the generation of high affinity, efficiently virus neutralizing antibodies.
lisa hyland, sam hou, and peter c. doherty. department of immunology, st. jude children's research hospital, memphis, tn 38101, departments of immunology and microbiology, and pathology, university of otago, dunedin, new zealand. the b and t cell responses in c57bl/6j (b6) mice treated with the mab mel-14 to l-selectin have been analysed following i.n. infection with sendai virus.
mel-14 treatment caused a 70-90% decrease in lymphocyte recruitment to the mediastinal (mln) and cervical (cln) lymph nodes following infection with sendai virus. the cellularity of the spleen was unchanged. the clonal expansion of cd8+ ctl precursors in the mln was slightly delayed, but potent ctl effectors were present in the virus-infected lung by day 10 after infection and the overall magnitude of the response was not compromised. the prevalence of iga antibody forming cells (afcs) was greatly increased in both the mln and the cln of the mice given the mel-14 antibody. the igm response was prolonged and the igg response, particularly igg1, was delayed compared to controls. the altered pattern of the antibody response may reflect the limited availability in mel-14-treated mice of th cells secreting lymphokines which are involved in ig class switching, because the entry of cd4+ th precursor cells into lymph nodes is blocked. facs sorting for l-selectin+, b220+ and l-selectin-, b220+ cell populations from the mln and the cln of normal b6 mice 9 days post sendai virus infection showed that the afcs were from the l-selectin-, b220+ cell population, a population which comprised 6-10% of the total cell population.
we have distinguished targets of broadly neutralizing antibodies present in hiv-1 infected individuals by immunoselection in vitro and by the use of chimeric virus. one target of neutralizing antibodies, defined by an escape mutant with an ala to thr substitution at position 582 in gp41, is resistant to human monoclonal antibodies that map to a site closely congruent with that for cd4 binding. substitutions of gly, ser, and val fail to confer resistance. a second, defined by an ala to val substitution at position 281, upstream from the v3 loop, does not involve the same site and does not involve v3. substitution of thr or ile also confers resistance.
replacement of the v3 loop of hiv-1(mn) into a clone of hiv-1(iiib) allows the detection of two other broadly neutralizing targets. one recognizes the v3 peptide of mn but is affected by regions outside v3. the other appears to be conformational and outside v3, but its functional recognition is influenced by the v3 loop. all of these sites seem to depend on the overall conformation of the envelope protein rather than a single discrete linear epitope.
antibodies against amino acids 579-613 of the hiv transmembrane (tm) glycoprotein have been shown to enhance hiv infection in vitro in the presence of complement. there has been no study demonstrating that enhancing antibodies to this region of hiv, despite increasing levels of infectious virus 10 to 100 fold in vitro, adversely affect disease pathogenesis. in two separate studies reported herein, it is shown that animals which have high levels of antibody against this region of siv, amino acids 603-622 of the envelope, fare poorly compared to animals with lower antibody levels against this region when subsequently challenged with siv. when actively immunized with a synthetic peptide from this region of siv, animals died earlier and failed to clear antigen at two weeks after infection compared to animals that received a control peptide (p<0.05). when animals were passively immunized with antibodies from a long-term survivor of siv infection, those animals that received higher levels of antibody against the tm peptide died within six months compared to longer intervals for those animals that had lower levels of antibody to this region. taken together, these data suggest that antibodies to the tm region of siv and hiv in general, and to this highly conserved peptide in particular, are detrimental to the host. therefore, immunization strategies that minimize the immune response against tm or treatment protocols that decrease antibody levels against tm may lead to prolonged survival following exposure to lentiviruses.
we have developed a mouse model to examine the immune response to hpv 16 proteins when these proteins are presented to the immune system via the epithelial route. in this model animals are grafted with keratinocytes expressing hpv e6 and e7 genes using a transplantation procedure which permits epithelial reformation. animals so grafted, when challenged intradermally with e7 either as protein or via a recombinant vaccinia virus, exhibit a delayed type hypersensitivity response which is e7-specific and cd4+ t cell mediated. animals grafted with a sub-optimal priming inoculum of cells develop immune non-responsiveness and have an abrogated dth response when challenged subsequently with a priming cell graft. in the present study we have examined the antibody status in these animals. the e7 protein of hpv 16 was expressed in e. coli as a maltose binding fusion protein using the plasmid vector pmalc. after cleavage and affinity purification this protein was used in an elisa assay to measure antibody levels in 4 groups of mice: (1) those not challenged with e7, (2) mice not grafted but challenged with e7 protein in the ear, (3) mice primed by grafting with 10^7 hpv e7 expressing cells and challenged with e7 protein, and (4) mice primed by grafting with 5 x 10^5 hpv 16 e7 cells on day 7, grafted again with 10^7 hpv 16 e7 cells on day 14 and challenged with e7 protein in the ear. mice optimally grafted and challenged (group 3) exhibited high titres of igg antibodies, particularly elevated levels of igg2a. mice sub-optimally grafted (group 4) exhibited igg antibody levels comparable to the control group (1). the possible mechanisms of this immune attenuation are discussed.
the hepatitis c virus is a frequent cause of chronic liver disease. a proposed mechanism responsible for virus persistence is evasion of the host immune response through a high mutation rate of crucial regions of the viral genome.
the portion of the hcv genome coding for the amino-terminal part of the putative envelope protein (gp70) undergoes frequent mutation during the course of infection. we have cloned and sequenced the hypervariable region (hvr1) of the virus isolated from an asymptomatic hcv patient at three time points during 18 months of follow up. sequence analysis has allowed the identification of variants of this region, and multiple antigenic peptides (map) corresponding to three hvr1 variants sequentially found in the blood stream of the patient have been synthesized. maps have been used as antigens for detection of specific antibodies in elisa. our results show that anti-hvr1 antibodies and their cognate viral sequence coexist in the blood stream but a viral sequence becomes undetectable when the specific antibodies reach maximum levels of reactivity. thus humoral immunity against the hvr1 may play a role in virus clearance. the presence of anti-hvr1 antibodies was also investigated in 100 hepatitis c viremic individuals and 25 non-viremic patients. a high frequency of positive reaction (90%) against at least one of the three hvr1 variants analysed in this study was detected in the viremic patients. finally, competition experiments show that antibodies cross-reacting with more than one hvr1 variant are produced by hcv infected individuals. these results suggest that complex cross-reactivity exists between hcv isolates for antibodies against the hvr1 region, as described for antibodies against the gp120 v3 loop of hiv. we propose as the mechanism for viral escape in chronic hcv infections the one described as "original antigenic sin", observed first in influenza, then in togavirus, paramyxovirus and enterovirus, and recently in hiv infection.
using an adult mouse model to study active immunity against rotavirus infection, it was previously shown that oral immunization with some, but not all, animal rotavirus strains induced protection against subsequent infection following oral challenge with the murine rotavirus strain edim (ward et al., 1992). to determine if a specific rotavirus protein could be associated with protection in this model, mice were immunized with a series of 18 reassortants between the fully protective edim strain and a partially protective heterologous rotavirus strain (rrv-g). reassortants that contained genes for edim proteins responsible for protection were anticipated to provide complete protection; however, no edim proteins were found to be both necessary and sufficient for full protection. instead, protection was found to be highly correlated with viral shedding (p = .005) and with serum rotavirus iga titers stimulated by the different reassortants (p < .001). this indicated that protection was related to the intestinal replication properties of the different reassortants rather than to specific immunogenic properties of edim proteins. this conclusion was supported by the finding that the titers of serum rotavirus iga, but not igg, stimulated in mice following oral immunization with a series of animal rotaviruses were directly related to protection against edim. if these findings can be extended to humans, they suggest that the efficiency of intestinal replication following oral inoculation with a live rotavirus vaccine candidate may be the primary determinant of successful immunization.
hall medical center, 55 individuals with adequate serum samples were identified as either rapidly progressing (rp) or slowly progressing (sp) by clinical and surrogate marker criteria. anti-v3 profiles were determined using synthetic peptides derived from the amino acid sequences of the v3 region of 5 laboratory strains of hiv-1 in standard capture elisa format.
serum obtained from each patient at multiple time points was screened against these peptides. the majority of individuals in both groups demonstrated broad recognition, with reactivity to peptides corresponding to the v3 regions of mn, sf2, ny5 and han/sc. less than 50% of individuals in each group recognized the v3 peptide derived from iiib (p=ns between groups). as the rp progressed to aids there was significant nonspecific narrowing of response, while the sp remained broadly reactive (p<.001). in vitro neutralizing activity against the homologous laboratory isolates was determined with cytotoxicity, cytopathic effect and p24 ag inhibition assays. although most patient serum was capable of inhibiting p24 ag production in homologous lab strains while aids-free, there was no relationship between the ability to inhibit homologous virus effects on target cells and anti-v3 profiles.
model, we show that after resolution of the acute infection, when antiviral plasma cells in the spleen decline, a population of virus-specific plasma cells appears in the bone marrow and constitutes the major source of long-term antibody production. following infection of adult mice, lcmv-specific antibody secreting cells (asc) peaked in the spleen at 8 days post-infection, but were at this time undetectable in the bone marrow. the infection was essentially cleared by 15 days and the asc numbers in the spleen rapidly declined while an increasing population of lcmv-specific asc appeared in the bone marrow. when compared to the peak response at 8 days post-infection, timepoints from 30 days to more than one year later demonstrated greater than a 10-fold reduction in splenic asc. in contrast, lcmv-specific plasma cells in the bone marrow remained at high numbers and correlated with the high levels of antiviral serum antibody.
the presence of antiviral plasma cells in the bone marrow was not due to a persistent infection at this site, since virus was cleared from both the spleen and bone marrow with similar kinetics as determined by infectivity and pcr assays. the igg subclass profile of antibody secreting cells derived from bone marrow and spleen correlated with the igg subclass distribution of lcmv-specific antibody in the serum. upon rechallenge with lcmv, the spleen exhibited a substantial increase in virus-specific plasma cell numbers during the early phase of the secondary response, followed by an equally sharp decline. bone marrow asc populations and lcmv-specific antibody levels in the serum did not change during the early phase of the reinfection but both increased about 2-fold by 15 days post-challenge. after both primary and secondary viral infection, lcmv-specific plasma cells were maintained in the bone marrow, showing that the bone marrow is a major site of long-term antibody production after acute viral infection.
"memory" t cells, and associated with responsiveness to soluble and recall antigens. cd4+ lymphocytes staining bright, dim, or negative (equivalent to an isotype control) for cd29 were evaluated in 49 uninfected controls (group 1), 84 hiv-1 positive patients with ≥20% cd4+ t cells (group 2), and 47 hiv-1-infected patients with <20% cd4+ t cells (group 3). most of these subjects also had 3-color staining for cd4\cd45ro\cd45ra. the appearance of positive cd29 and cd45ro on hiv-infected and uninfected cells correlated well (r=.82, p<.001). the percentage of cells staining cd4+\cd29+ (bright plus dim) was 43.3 (95% ci 37.3-49.4) in group 1, 28.9 (27.5-30.4) in group 2, and 10.2 (8.6-11.9) in group 3. the respective values for these groups that were cd4+\cd29bright were 30.6 (26.9-34.3), 20.7 (19.3-22.2), and 7.4 (6.3-8.6). values for cd4+\cd45ro+ were 33.7 (31.8-35.5), 21.8 (20.5-23.1), and 9.9 (8.5-11.3), respectively.
in single factor discriminant function tests, the %cd4+\cd29+ cells best predicted subject group (87% correct), proving to be a better discriminator than %cd4+\cd29bright (77% correct), cd4+\cd29dim (51%), cd4+\cd45ro+ (75%) and cd4+\cd45ro+\cd45ra- (63%). overall, no advantage was seen to splitting the cd4+\cd29+ cells into bright and dim positive subsets in the subjects studied for the purpose of stratifying early vs. late hiv infection. likewise, splitting the cd4+\cd45ro+ compartment into cd45ra+ subsets did not improve the ability to distinguish between uninfected and early or late hiv-1 infected patients.
the relationship between the virus-specific cytotoxic response in hiv infected patients and disease progression supports the concept that a vaccine candidate should also induce virus-specific ctl activity. immunization of uninfected adult volunteers with an hiv-gp160 recombinant canarypox virus was carried out in a phase i trial. two injections of a recombinant canarypox expressing the hiv-1/mn gp160 were performed at months 0 and 1, and two boosts of recombinant gp160mn/lai were given at months 3 and 6 in alum or incomplete freund adjuvant (ifa). hiv-envelope specific cytotoxic activities were detected in ctl lines derived from pbmc after specific stimulation with autologous hiv infected blasts. ctl lines were obtained from 18 out of 20 donors: seven out of eighteen (39%) were found to present envelope specific cytotoxic activity at months 2, 4, 7 or 12 post immunization; this activity was characterized as a cd3+, cd8+, mhc class i restricted cytotoxic activity, and for at least two volunteers, this activity was still present two years after the first canarypox/env injection.
because avian poxviruses are incapable of complete replication and undergo abortive replication in mammalian cells, this is an example of the persistence of long-term memory cd8+ cytotoxic t lymphocytes in the absence of the priming antigen, indicating that t-cell memory might be independent of continued antigenic exposure.
the university of alabama at birmingham, al 35294. mhc class i restricted cd8+ ctl activity plays an important role in the control of influenza virus infection, as indicated in studies in mice and humans. cytokines such as il-2 and ifn-γ regulate the generation of virus-specific ctl responses. we recently demonstrated a good correlation between the induction of influenza virus-specific ctl activity and the production of ifn-γ by the cd8+ t cells at the single cell level using an ifn-γ-specific elispot assay, secreted ifn-γ by an elisa, and ifn-γ specific mrna expression by rt-pcr. several recent studies have characterized cd4+ and cd8+ t cells by their surface expression of distinct cd45r isoforms. cd45ra is expressed on naive or virgin t cells, while cd45ro is expressed on memory t cells. in the present study, pbmc of healthy young adult subjects were stimulated with influenza a virus and then enriched for cd8+ t cells. the cd8+ cells were stained for cd45ro+ (pe) and cd45ra+ (fitc) cells and sorted. ctl activity against virus-infected autologous target cells was determined in a 4 hour 51cr release assay, while ifn-γ production and expression were assessed by elispot and quantitative rt-pcr, respectively. cd8+/cd45ro+ (memory) cells exhibited significant mhc class i restricted ctl activity while cd8+/cd45ra+ cells exhibited no lytic activity. no activity was exhibited by freshly isolated or unstimulated cd8+/cd45ro+ t cells. similarly, cd8+/cd45ro+ t cells contained significantly higher numbers of ifn-γ spot forming cells and a higher quantity of ifn-γ-specific mrna than cd8+/cd45ra+ cells.
these data support our previous findings that ifn-γ may serve as a useful surrogate marker for influenza virus-specific ctl activity in humans.
in studying the kinetics of the cd8+ t cell response in lcmv infection we have observed a profound activation and proliferation of cd8+ t cells with a 10-40 fold increase in total number peaking at day 8-9 post infection. in c57bl/6 mice, most of the viral antigen is cleared by day seven, and after day 9 the total cd8+ number per spleen drops about 10-fold. however, the relative specificity of the viral peptide-specific precursor ctl frequencies (pctl/f) per cd8+ cell remains remarkably stable between day 7-8 of the acute infection and for many months thereafter. thus, the decline in the cd8+ t cell number is not a function of the tcr specificities but is rather an across-the-board event. in contrast, we found that subsequent to the decline of the ctl response to a second heterologous virus infection, such that the mouse was in a "resting, immune" state, there often was a reduction in pctl/f to the first virus. for example, infections with vv or mcmv substantially reduced the pctl/f to lcmv or pv in all memory compartments, including spleen, lymph nodes, and peritoneal exudate cells. reinfection with the original virus substantially elevated its pctl/f and restored the pctl/f that had been reduced by a heterologous viral infection. analyses of the progression of ctl responses during a heterologous virus challenge of a virus-immune mouse indicated a high frequency of cross-reactive ctl appearing early during infection, but as the infection progressed there was a higher proportion of ctl specific only for the second virus. thus, we believe that when the across-the-board apoptosis of t cells occurs late in the infection, ctl specific for the first virus are diluted by those responding to the second virus.
this may cause the reduction in memory to the first virus and may be one of the mechanisms contributing to the waning of secondary immune responses to certain viruses over time if there is no re-exposure to the original infectious agent.
t-cells which arise after virus infection will aid our understanding of t-cell memory and be useful in the design of vaccines which augment the memory response. to estimate the sendai virus specific precursor frequency in memory mice, cd4+ cells from c57bl/6 female mice which had been infected with sendai virus intranasally (i.n.) more than two months earlier were subjected to limiting dilution analysis. responder cell populations were enriched for cd4+ cells either by magnetic bead depletion of non-cd4+ cells, or by facs after staining with anti-cd4 monoclonal antibody. these enriched (>90% cd4+) responders were cultured with sendai virus-infected, irradiated, t-cell depleted splenic antigen presenting cells (apc). supernatants from these cultures were tested for activity on the cytokine-dependent ctll cell line. duplicate cultures of responders on uninfected apc were used to set the level of rejection (mean cpm + 3x std. dev.). using this type of analysis we were able to demonstrate a frequency of memory thp of 1/1600 cd4+ cells, compared to a frequency greater than 1/100000 in naive controls. the memory cd4+ cells were further characterized as cd45rb-low (1/472), cd44-high (1/294), l-selectin-low (1/364), and cd49d-high (vla-4-high) (1/102). this is in close agreement with other phenotyping studies on cd4+ memory cells specific for soluble antigens.
t cells, ralph a. tripp, sam hou, anthony mcmickle, james houston and peter c. doherty, department of immunology, st. jude children's research hospital, memphis, tn 38105.
the immune responses of influenza a and sendai virus-specific memory cd8+ cytotoxic t lymphocyte precursors (ctlp) have been analyzed in c57bl/6 mice infected intranasally with unrelated or cross-reactive respiratory viruses. the numbers of influenza a-specific memory t cells increased in the regional lymph nodes (ln), spleen and bronchoalveolar lavage through the course of an irrelevant infection (influenza b). memory t cells showed evidence of enhanced steady-state activation. profiles of ctlp recruitment were analyzed in association with t cell proliferation and activation to determine whether signaling via the t cell receptor is necessary to induce "bystander" stimulation of the memory t cell pool. the extent of t cell proliferation was addressed by treating mice with low doses of cyclophosphamide (cy). "resting" sendai virus-specific memory t cells were unaffected by cy treatment; however, when mice were challenged with influenza and treated 5 or 6 days later, the emergence of influenza-specific ctlp was severely diminished. cell cycle analysis showed that cy eliminated the majority of cd8+ t cells from the ln and spleen, resulting in dna fragmentation of 12-18% of this lymphocyte subset. a decrease (though smaller) in the numbers of sendai virus-specific ctlp indicated that some of the cycling cells killed by cy were memory t cells, presumably activated in a "bystander" manner. the decrease in ctlp numbers for both influenza and sendai virus-specific ctlp was still apparent 9 days after cy treatment, long after viral elimination. thus, immune responses to unrelated antigens may be a mechanism involved in maintaining the pool of memory t cells.
experimentally, vsv can result in an acute cns infection of mice. data from our in vitro experiments indicate that nitric oxide (no) has an inhibitory effect on productive vsv infection.
vsv infection of neuroblastoma nb41a3 cells was significantly inhibited by 100 µm of the no donor s-nitroso-n-acetylpenicillamine (snap), while 100 µm of the control compound n-acetylpenicillamine (nap) had no effect. when vsv infected nb41a3 cells were treated with 500 µm of a constitutive no synthase (cnos) activator, n-methyl-d-aspartate (nmda), a significant inhibition of vsv production was observed. inhibition by 500 µm of nmda was reversed by 300 µm of the nos inhibitor n-methyl-l-arginine (l-nma). work is in progress to determine the effects of inducible nos (inos) in a glioma cell line, c6, on vsv infection. levels of no and expression of both cnos in neurons and inos in glial cells in the cns following vsv infection will be further investigated. supported by nih grant ai18083 to carol s. reiss.
pediatrics, university of iowa, iowa city, ia 52242. mouse hepatitis virus, strain jhm (mhv-jhm), is a neurotropic coronavirus which causes acute encephalitis and acute and chronic demyelinating encephalomyelitis in susceptible rodents. 40-90% of suckling c57bl/6 (kbdb) mice inoculated intranasally with mhv-jhm at 10 days and nursed by dams immunized against the virus develop a chronic demyelinating encephalomyelitis, characterized clinically by hindlimb paralysis, at 3-8 weeks postinoculation. the chronic demyelinating encephalomyelitis nor the clinical symptoms. recently, it was shown that lymphocytes isolated from the central nervous system (cns) of c57bl/6 mice both acutely and persistently infected with mhv-jhm display a cytotoxic t lymphocyte (ctl) response to the s protein of mhv-jhm. this response was further characterized by identifying the ctl epitopes that are recognized by a bulk population of ctls from the cns of mhv-jhm infected c57bl/6 mice. three epitopes were identified using synthetic peptides and truncated forms of the s protein in primary ctl assays. the epitopes recognized were amino acids 510-518 (cslwngphl, db), 598-605 (rcqifani, kb), and 1143-1151 (nfcgngnhi, db).
thus, the results indicate that cytotoxic t lymphocytes responsive to the s protein of mhv-jhm in c57bl/6 mice recognize both kb- and db-restricted ctl epitopes. ctl lines and clones specific to these peptides and the entire s protein are being developed to test their biological significance in vivo with respect to the acute encephalitis and chronic demyelinating disease caused by mhv-jhm.
a marked change in susceptibility to some neurotropic viruses during the first few postnatal weeks has long been recognised in rodents. infection of neonatal or suckling mice with the neurotropic alphavirus semliki forest virus results in lethal encephalitis. infection of weaned animals is not lethal. earlier investigations focusing on changes in specific immunity have shown this not to be the explanation. infection of 3-4 week old mice with severe combined immunodeficiency does not result in acute, rapidly fatal encephalitis. we have studied mortality, neuroanatomical distribution and spread of infection in mice of different ages and the effect of gold compounds on rendering infection of 3-4 week old mice lethal. neuroanatomical distribution of infection correlates with synaptogenesis. as this is completed in different systems within the first two postnatal weeks, systems no longer transmit virus and infection switches from disseminated to focal and restricted. complete productive replication and transmission of infection require smooth membrane synthesis, which is present in neurones undergoing synaptogenesis, absent in mature neurones but inducible by administration of gold compounds. infection of neurones undergoing synaptogenesis is productive and virus is transmitted along neural pathways; infection spreads rapidly around the brain and destroys cells, and animals die of a fulminant encephalitis. in mice infected after 14 days of age, replication in mature neurones is restricted and nonproductive; the infection cannot be transmitted, does not spread, and is non-destructive and non-lethal.
as a consequence, in the absence of immune responses virus can persist in isolated cns cells for life and can even be detected by reverse transcriptase pcr in immunocompetent mice months after infection. in the presence of an immune response, cd8+ t-cells recognise and destroy infected glial cells, leading to demyelination.

ackermann, virology swine, virology cattle, and avian diseases research units, national animal disease center, usda, agricultural research service, ames, ia 50010

a recombinant pseudorabies virus (prv) (lltβΔ2) was constructed which contains a 3.0 kb deletion spanning the standard recombination junction of the unique long and internal repeat sequences, replaced by a lacz expression cassette. this deletion interrupted the large latency transcript gene (llt) and truncated one copy of the diploid immediate early ie180 gene. replication and viral gene expression of lltβΔ2 in madin-darby bovine kidney cells was similar to that of the parental virus and a virus rescued for the deleted sequences (lltβres). when inoculated intranasally in 4-week-old or 4-day-old pigs, lltβΔ2 replicated efficiently at the site of inoculation yet caused markedly reduced fatality when compared to the parent or lltβres viruses. in particular, the lltβΔ2-infected pigs did not exhibit neurological symptoms characteristic of prv infection. to further examine the pathogenesis of lltβΔ2, 4-day-old pigs were infected intranasally with lltβΔ2 or lltβres and necropsied at various times postinfection. virus isolation from the nasal turbinate, tonsils, and trigeminal ganglia was comparable between the two viruses. although both viruses spread to the brain and induced an inflammatory response in cns tissues, virus isolation from brain tissues was reduced about 20-fold for lltβΔ2. abundant prv antigen was detected in the cerebrum and cerebellum of lltβres-infected pigs, but only a few antigen-positive neurons were observed in the cerebrum of lltβΔ2-infected pigs.
while replication of lltβres in the brain progressed until death at 7 days post-infection, replication of lltβΔ2 in the brain ceased by 9 days post-infection and the pigs exhibited only mild clinical signs. since lltβΔ2 is capable of spread to the cns, reduced neurovirulence of lltβΔ2 is likely the result of its decreased ability to replicate in cns tissues.

the cns is a target for hiv infection, and in individuals with aids this can lead to a devastating dementia. only certain viral variants appear capable of invading the cns and infecting microglia and brain macrophages. in order to determine whether the virus entering the brain may be particularly pathogenic to the cns, we isolated microglia from the brains of siv-infected rhesus monkeys. transfer of these cells into naive animals indicated that productive siv infection could indeed be transferred. furthermore, cns infection occurred within a relatively short time span, and was associated with viral gene expression in the brain and pathology characteristic of hiv encephalitis. serial transfer of microglia into additional animals also resulted in successful transfer of infection, neuroinvasion, and neuropathology. behavioral analysis in a trained group of animals is ongoing. this result demonstrates that neuropathogenic virions partition into the cns during natural siv infection, likely driven by mutational events that occur during the course of infection. molecular characterization of the microglia-associated virus has revealed that a distinct pattern of sequence changes in the envelope gene occurs concomitantly with this in vivo selection. our approach will allow the dissection of functional neuropathogenic elements present in these viruses.

in non-specific host defense mechanisms. ifn-γ-induced nitric oxide (no) in murine macrophages was previously shown to inhibit the replication of poxviruses and herpes simplex virus type 1 (hsv-1).
we now demonstrate that murine macrophages activated as a consequence of vaccinia virus (vv) infection in vivo express inducible nitric oxide synthase (inos). the vv-elicited macrophages were resistant to infection with vv and efficiently blocked the replication of vv and hsv-1 in infected bystander cells of epithelial and fibroblast origin. this inhibition was arginine dependent, correlated with no production in cultures and was reversible by the nos inhibitor ng-monomethyl-l-arginine. the mechanism of no-mediated inhibition of virus replication was studied by treating vv-infected 293 cells with the no-producing compound s-nitroso-n-acetyl-penicillamine. antibodies specific for temporally expressed viral proteins, a vv-specific dna probe and transmission electron microscopy were employed to show that no inhibited late gene protein synthesis, viral dna replication and virus particle formation, but not expression of the early proteins analyzed. further, we have also identified putative enzymatic targets whose inactivation by no results in inhibition of vv replication. although antiviral ctl are important for virus elimination, they can only halt further virus spread, and cannot reduce the number of infectious particles already present. the beneficial effect of ctl-mediated lysis is apparent only if infected cells are lysed before assembly of progeny virus. if infectious virus was released from infected cells in solid tissues before the generation of neutralizing antibody or in sites where antibody did not readily penetrate, then recruitment of mononuclear phagocytes, which phagocytose and destroy infectious material and/or become non-productively infected, would definitely help control virus dissemination. in this context, inos induction in macrophages may be an important antiviral strategy.
in addition, the inhibition of virus replication in infected contiguous cells by inos-expressing macrophages at infectious foci would prevent release of mature viral particles after lysis by nk cells and ctl. since viral early proteins are expressed in such infected cells, their recognition and subsequent lysis by ctls will not be hindered.

cns persistence, tropism and genetic j. pedro simas, anthony a. nash and john k. fazakerley, department of pathology, university of cambridge, cb2 1qp, uk

theiler's murine encephalomyelitis virus, a naturally occurring enteric pathogen of mice, is a picornavirus belonging to the cardiovirus genus. following intracerebral inoculation of 3-4 week old cba or balb/c mice, the bean strain causes a chronic persistent cns demyelinating infection in a proportion of the cba mice that survive acute infection. balb/c mice are resistant to chronic demyelinating disease. we have studied the tropism, persistence and genetic variability of bean in cba and balb/c mice in the chronic phase of this disease. by in situ hybridisation and by reverse transcription (rt) pcr with southern blot analysis, no viral rna could be detected in the cns of any balb/c mice later than day 60 post-infection. in contrast, in a large group of cba mice studied up until 393 days post-infection, viral rna could be detected by both techniques in 50% of mice until as late as 268 days post-infection. by employing a combination of in situ hybridisation for viral genome followed by immunocytochemistry for cell phenotypic markers, bean rna was observed predominantly in oligodendrocytes and occasionally in astrocytes during persistent infection, in both brain and spinal cord. in the persistently infected mice, the striking total destruction of the pyramidal layer of the hippocampus, substantia nigra and anterior thalamic nuclei indicated that these were the mice that had had the greatest dissemination of virus and the highest virus titers during the preceding acute phase of infection.
direct pcr thermal cycle sequencing of uncloned rt-pcr products revealed that during persistent infection, loops i and ii of the vp1 capsid protein gene did not undergo any genetic variability. furthermore, no changes were detected in this region in sequenced pcr products amplified from the cns of mice with severe combined immunodeficiency, in which no selective immunological pressure would have been operative.

infection, thomas e. lane, michael j. buchmeier, dorota jakubowski, debbie d. watry, and howard s. fox, department of neuropharmacology, the scripps research institute, la jolla, ca 92037

our laboratory is interested in the effects of siv infection in the central nervous system of rhesus macaques. to enrich for neuroinvasive and neurovirulent viruses, microglia were isolated from infected monkeys and used to infect new, uninfected monkeys. such microglia-mediated infection resulted in the production of neuropathological changes, including giant cells, macrophage infiltrates and microglial nodules, in recipient animals within 4 months. microglial cells isolated from siv-infected monkeys produced virus in vitro as measured by reverse transcriptase (rt) activity and p27 production. treatment of microglia with recombinant human interferon alpha (rhuifn-α) resulted in a sharp decrease in viral activity (both rt and p27 production), suggesting that rhuifn-α is able to modulate viral activity in infected microglia. we have analyzed siv env sequences by pcr amplification directly from microglia dna preparations from monkeys. nucleotide sequence analysis revealed an enrichment of unique sequences in the v1 region of the siv env gene. the majority (>95%) of nucleotide changes encoded amino acid changes, indicating that these envelope sequences evolved as a result of selection. moreover, sequential passage of siv-associated microglia resulted in an increase in potential n-linked glycosylation sites within the v1 region of the env gene when compared with the parental virus.
these data suggest that sequential passage of microglia-associated siv may select for neuroinvasive, neurovirulent variants.

the adoptive transfer of ctl specific for an ld-restricted epitope within the nucleocapsid protein of the jhmv strain of mouse hepatitis virus both protects from acute infection and reduces virus replication in the mhc class i positive cells within the cns. the source of these ctl and the route of their delivery are critical to the outcome of this protection. for example, 10-fold fewer spleen cells activated in vitro with the pn peptide are required for protection via the direct i.c. route than via the i.v. route. in addition, ctl clones are unable to protect via the i.v. route but are very efficient via the i.c. route. these data suggested the possibility that the cd4+ t cells within the polyclonal activated spleen cell population derived from in vitro culture on the pn peptide were facilitating access to the cns. to examine this question, either polyclonal pn-specific t cells were depleted of cd4+ t cells prior to transfer to infected recipients, or untreated cells were transferred to recipients depleted of cd4+ t cells with monoclonal antibody gk1.5. both of these treatments eliminated the ability of the ctl to reduce virus replication within the cns, suggesting that cd4+ t cells in the peripheral compartment are required for the entry of ctl into the parenchyma of the cns during acute cns encephalomyelitis.

division of retrovirology, walter reed army institute of research and henry m. jackson foundation, rockville, md 20850; department of retrovirology, armed forces research institute of medical sciences, bangkok, thailand

background: the hiv-1 epidemic in thailand is largely due to two highly divergent subtypes of virus, b and e. dual infection with distinct hiv-1 subtypes, which has not been reported previously, would suggest that antiviral immunity evoked by one subtype can be incompletely protective against a second.
methods: pcr typing and serologic typing were used to screen a panel of non-random convenience specimens from hiv-1 infected subjects in thailand. specimens that showed dual subtype reactivity in these assays were subjected to differential probe hybridization and nucleotide sequence analysis of multiple molecular clones to confirm the presence of dual infection. results: two individuals were shown to simultaneously harbor hiv-1 of env subtypes b and e (table). additionally, both subtypes were identified in co-cultured pbmc from one individual. conclusions: these data provide the first evidence of dual hiv-1 infection in humans and reinforce the need for polyvalent vaccines.

infection by herpes simplex virus (hsv) induces in man and in mice cytolytic t lymphocytes (ctl) which recognize the immediate-early protein icp27. because of its early expression during the hsv replication cycle, icp27 represents a prime target for specific t cell responses capable of controlling virus replication. we have expressed in e. coli a recombinant construct coding for a fusion protein consisting of a fragment of influenza virus non-structural protein 1 (ns1) and the icp27 sequence of hsv-2. the ns1-icp27 protein was purified by preparative electrophoresis and formulated in oil-in-water emulsions with monophosphoryl lipid a (mpl) and qs21 adjuvants. balb/c mice were immunized by two intrafootpad injections of formulations containing 5 µg of ns1-icp27. responder cells obtained from draining lymph nodes were re-stimulated in vitro with p815 cells transfected with icp27 and then tested for cytolytic activity on icp27-p815 and control p815 cells. the induction of icp27-specific ctl by different formulations was observed and will be discussed.

the induction of heterologous cytotoxic t lymphocytes (ctl) using cassettes of multiple conserved t cell epitopes derived from different proteins and/or virus strains is envisioned as a promising vaccine approach.
to study the effects of antigen processing on peptide presentation from chimeric epitope precursors we are using a model system comprising two distinct viral epitopes which are immunodominant in the h-2d haplotype: a dd-restricted epitope from the gp160 protein of hiv-1 and an ld-restricted epitope from the murine hepatitis virus nucleocapsid protein (mhv n). the influence of proximity and flanking sequences of epitopes on antigen presentation was analyzed using vaccinia virus (vv) recombinants in which the epitopes were expressed as chimeras containing the individual epitopes in reverse order or separated by different spacer residues. whereas individually expressed epitopes were efficiently recognized by protein-specific ctl, recognition of peptides derived from tandem constructs varied significantly with closer epitope proximity and sequential order. following immunization with the recombinant viruses, the chimeras were all able to induce antiviral ctl specific for the native proteins. however, ctl frequency analysis indicated that the number of responder cells to the same epitope depends dramatically on its context within the chimera and correlates with antigen recognition in vitro. the profound effect of flanking regions on ctl induction suggests that the context of an epitope will require careful evaluation in the design of recombinant multivalent minigene vaccines to induce an optimal t cell mediated immune response.

and robert e. johnston, departments of microbiology & immunology and biochemistry, univ. of north carolina, chapel hill, nc 27599

a full-length cdna clone of venezuelan equine encephalitis virus (vee) has been altered to contain two strongly attenuating mutations and a second subgenomic rna promoter immediately downstream of the structural gene region. expression of the influenza ha protein from this second promoter in baby hamster kidney (bhk) cells was approximately 50%
of the level in influenza virus-infected cells, as measured by immunoprecipitation. four-week-old cd-1 mice were inoculated subcutaneously with 2 x 10' pfu of the ha vector, vector alone or diluent. expression of ha mrna was detected in the draining lymph node of ha vector-inoculated mice by in situ hybridization, consistent with the organ tropism of vee. mice were challenged three weeks after immunization by intranasal administration of 10^5 ed, of influenza virus. all 24 control mice suffered severe disease and 50% died. only one of 12 ha vector-inoculated mice died, and another exhibited signs of disease for one day and recovered. the geometric mean elisa titer of anti-ha serum igg in the ha vector-inoculated mice was 246, while only three control mice had measurable serum reactivity, and that was at the lowest dilution tested, 150. in a parallel experiment, no influenza infectivity was detected in the lungs of 12 ha vector-immunized mice at 4 days postchallenge. in contrast, 8/12 pbs-inoculated mice and 5/12 inoculated with vector alone were positive for influenza infectivity and had geometric mean titers of 3.04 and 1.93 x 10^6 pfu/gm, respectively. this vector also has been used to express the hiv ma/ca protein in a form recognized by patient sera and a specific antibody on western blots. these experiments demonstrate the feasibility of using vectors based on attenuated vee cdna clones for protective immunization against heterologous human and animal pathogens.

dose/response curves have been used to compare different routes of immunization with plasmid dna encoding the h1 hemagglutinin glycoprotein of influenza virus. routes of inoculation included intramuscular, intradermal and gene gun delivery of dna. from 100 to 0.1 ug of dna was inoculated by intramuscular and intradermal routes. from 0.4 ug to 0.0004 ug of dna was inoculated by gene gun. each route was evaluated for single and boosted immunizations.
antibody titers were followed over a 20 week period, following which animals were evaluated for protection against a lethal challenge. each of the routes raised both antibody and protective responses. gene gun delivery of dna required 250 to 2,500 times less dna to raise responses than the intramuscular and intradermal inoculations. boosts did not have much of an effect on antibody titer or protection except at low dose inoculations (4 ng and lower for the gene gun). for each of the routes, antibody responses showed good persistence over the 20 weeks of the experiment.

inoculation of mice with plasmid vectors carrying a microbial gene under the control of an appropriate promoter results in a full spectrum of immune responses to the vector-encoded antigen. using a murine rabies model, a plasmid termed psg5rab.gp, expressing the full-length rabies virus glycoprotein regulated by an sv40 promoter, was shown to induce upon intramuscular inoculation a rabies virus-specific t helper cell response of the th1 type, cytolytic t cells and virus neutralizing antibodies, resulting in protection against a subsequent challenge with live rabies virus given either peripherally or directly into the central nervous system. a response comparable in magnitude was also induced upon inoculation of a vector expressing a secreted form of the rabies virus glycoprotein. the immune response to the dna vaccine could be modulated by co-injection of the rabies virus glycoprotein-expressing vector with plasmids expressing mouse cytokines. inoculation of mice with the psg5rab.gp vector and a vector expressing granulocyte/macrophage colony stimulating factor (gm-csf) enhanced both the t helper and the b cell response to rabies virus, thus improving vaccine efficacy. co-inoculation with vectors expressing interferon-γ failed to improve the response.
co-inoculation of the antigen-expressing vector with a plasmid encoding mouse il-4 caused a reduction of both the t helper cell response and the b cell response to rabies virus.

hpv16 e7

hpv-associated cervical cancer cells express hpv16 e7 protein, and antibody to hpv16 e7 can be detected in the blood of cancer patients, yet the tumours are not rejected. a mouse transgenic for the e7 protein of hpv16, and expressing e7 protein in the skin, has recently been described (1), and these mice develop spontaneous humoral immunity to e7 protein similar to patients with cervical cancer (2). to determine whether immunisation could induce immunity to e7 sufficient to allow tumour rejection, we first demonstrated that immunisation of h-2b mice with hpv16 e7 protein with quil a as adjuvant could induce cytotoxic t cells able to kill hpv16 e7-expressing tumour cells in vitro. we then used similar immunisation with e7/quil a to induce e7-specific immunity in fvb (h-2q) mice. h-2q skin grafts expressing e7 were not rejected by e7-immunised h-2q mice, though immunisation induced antibody to e7, and similar grafts were rejected, as expected, across an alloantigen mismatch in h-2b mice. we conclude either that hpv16 e7 lacks a tc epitope in the context of h-2q, or that expression of e7 in the skin from the e7 transgenic mice is insufficient for recognition by primed effector cells, and further experiments will address this distinction.

cervical carcinoma is strongly associated with infection by human papillomavirus (hpv) types 16 or 18, and continued expression of the e6 and e7 gene products. this provides an opportunity for an immunotherapeutic approach to the treatment of cervical carcinoma by activation of immune responses directed against these virally encoded tumour-specific antigens. we have constructed a recombinant vaccinia virus expressing e6 and e7 from hpv16 and 18 with the aim of inducing e6- and e7-specific hla class i restricted cytotoxic t lymphocytes (ctl).
the sequences have been inserted into the wyeth vaccine strain of vaccinia virus at a single locus in the form of two separate fused e6/e7 reading frames, each under the control of an early vaccinia promoter, and each modified to inactivate the rb binding site. the virus has been characterised with respect to its ability to synthesise the expected hpv proteins, its genetic stability, and growth and virulence in a mouse model prior to use in human clinical trials. analysis of hpv16 e7-specific ctl from c57bl/6 mice immunised with this recombinant virus shows the response to be equivalent to that generated by a control vaccinia recombinant expressing non-modified hpv16 e7 alone, with similar recognition of the defined immunodominant h-2db restricted epitope, e7 residues 49-57.

ability of mice to resist influenza challenge, arthur friedman, douglas martinez, john j. donnelly and margaret a. liu, department of virus and cell biology research, merck research laboratories, west point, pa 19486

mice infected with the laboratory strains of a/pr/8/34 (h1n1) or the mouse adapted a/hk/68 (h3n2) show complete protection against challenge with a different strain of influenza a. humans, however, undergo multiple influenza infections, as previous infections appear to provide weak or short-lived protection against the continual antigenic change of strains. we have previously shown that immunization of naive mice with dna encoding the conserved internal antigen nucleoprotein (np) provides protection against both h1 and h3 strains of a/influenza. although such mice became infected, they were resistant to weight loss and death. this differed substantially from a/pr8 and a/hk recovered mice, which were resistant to subsequent infection. to produce a more representative model of human infection, we infected the lungs of mice with currently circulating strains of human influenza.
mice that had been given lung infections with a/beijing/92 were susceptible to subsequent infection with the a/hk/68 strain, although they were resistant to weight loss and death. other strains such as a/beijing/89 or a/georgia/93 provided only marginal protection against weight loss and death after a/hk challenge. mice that were immunized with np dna had greater resistance to weight loss and death after a/hk/68 challenge than mice previously infected with a/bei/89 and a/ga/93, and were similar to mice that had been previously infected with a/bei/92. thus, infection with different virus strains provides various levels of cross-strain protection, and the level of protection provided by immunization with dna can exceed that induced by live influenza infection.

the development of sendai virus-specific cytotoxic t lymphocyte (ctl) effectors and precursors (p) has been compared for mice that are homozygous (-/-) for a disruption of the h-2i-ab class ii major histocompatibility complex (mhc) glycoprotein, and for normal (+/+) controls. the generation of cd8+ ctlp was not diminished in the (-/-) mice, although they failed to make virus-specific igg class antibodies. while the cellularity of the regional lymph nodes was decreased, the inflammatory process assayed by bronchoalveolar lavage (bal) of the infected lung was not modified and potent ctl effectors were present in bal populations recovered from both groups at day 10 after infection. there was little effect on virus clearance. as found previously with cd4-depleted h-2b mice, the absence of a concurrent class ii mhc-restricted response does not compromise the development of sendai virus-specific cd8+ t cell-mediated immunity.

the importance of cytotoxic t lymphocytes in defense against acute and chronic viral infections is gaining increasing recognition.
our approach to investigating the structure-function relationship between immunogens and their in vivo ability to elicit cytotoxic t lymphocyte responses has been to formulate simple, well-defined structures that vary in their ability to introduce associated antigens directly into the cytoplasm of antigen presenting cells. we have introduced methods for the preparation of unique, lipid-matrix based immunogens, which are highly effective in mice and monkeys for stimulating strong cd8+ cytotoxic t cell (ctl) responses. antigens used have been proteins or peptides derived from influenza, parainfluenza, and hiv viruses, and whole formalin-fixed siv. ctl can be induced by parenteral as well as oral administration. comparing the physical and chemical nature of our formulations with those from other laboratories which have reported the use of subunit preparations to induce cd8+ ctl leads us to propose that a minimal immunogenic formulation capable of eliciting cd8+, mhc class i restricted cytotoxic t lymphocytes includes: i) a peptide that represents an mhc class i epitope; ii) a component that enhances the affinity of the immunogen for mhc class i positive antigen presenting cells; iii) properties that can compromise the integrity of a lipid bilayer, facilitating delivery of the antigen directly into the cytoplasm for class i presentation. the induction of cd8+ responses to peptides, glycoproteins, and even whole fixed viruses makes these formulations attractive candidates for diseases where clearance of infected cells is important in protection and recovery.

canine rabies is uncontrolled. rabies also is epizootically active in several species in most areas of the world. thus, vaccination of animals, both wild and domestic, as well as postexposure treatment of humans remains a global concern.
unfortunately, in those countries in which people most need postexposure prophylaxis, the best vaccines are expensive and in limited supply, whereas available vaccines are of questionable immunogenic efficiency, are often contaminated and may produce neurological complications. the goal of this study was to determine whether a rabies vaccine for global use is

viruses which lack an essential gene and thus can only complete one round of replication have the potential to be used as vaccines. we have previously reported the ability of a gh-deleted herpes simplex virus type 1 (hsv-1) to protect mice and guinea-pigs from subsequent challenge with wild-type hsv. this virus, which we have called disc (disabled infectious single cycle) virus, can infect normal cells but the absence of gh in the progeny virus prevents further rounds of infection. as disc hsv clearly has potential as a vaccine, it is important to determine the durability of the immune response elicited by this virus. we have investigated the ability of disc hsv-1 to protect mice from a wild-type virus challenge six months post vaccination using the ear model of hsv infection. two immunisations on day 0 and day 21 resulted in a considerable reduction in virus titres in the challenged ears, and an almost complete absence of virus in the dorsal root ganglia. hsv-specific antibody titres as determined by neutralisation and elisa were maintained for the six month period. it was possible to demonstrate an hsv-specific cytotoxic t-cell response in the disc hsv-1 vaccinated mice following challenge; this ctl activity was similar to that observed in mice vaccinated with wild-type virus and challenged after the same time period. animals vaccinated with inactivated virus or control mock-vaccinated mice showed a low level of ctl activity typical of a primary ctl response following challenge. these results indicate that an effective cell-mediated and humoral anti-hsv immune response can be maintained for at least six months following vaccination with disc hsv-1.

the immunogenicity of two ctl epitopes, influenza np147-158 and plasmodium berghei cs protein 252-260, was studied in balb/c mice. peptides were formulated as a) a lipopeptide, peptide conjugated to tripalmitoyl-s-glyceryl cysteine (pam3cys) and dissolved in a 1% dmso/glycerol solution; b) microparticles prepared with poly(d,l-lactide-co-glycolide) using a solvent evaporation technique, the microparticles being administered as a suspension in phosphate buffered saline; or c) an emulsion prepared with egg lecithin and 10% soya oil in water. 100 µg of peptide or controls (the weight equivalent of blanks) were administered to groups of 3 mice intraperitoneally or subcutaneously at 1, 10 and 20 days. 7 days following the last immunization, splenocytes were cultured in vitro in the presence of appropriate peptide or control with rat con a supernatant as a source of growth factors. ctl activity was measured in a standard 4 hour chromium release assay and results expressed as % specific lysis. ctl could be elicited in vivo with all three formulations. at an effector:target ratio of 100:1 the plasmodium berghei peptide encapsulated in microparticles gave 47% lysis on peptide-pulsed target cells. levels of lysis were similar for the peptide in emulsions. the lipopeptide p3ccs252-260 gave a level of lysis of 82% at an e:t ratio of 100:1. these results demonstrate that peptides administered in a variety of formulations can induce a systemic ctl response in vivo. peptide vaccines using such formulations could be used to stimulate ctl responses as part of prophylactic vaccines or as immunotherapeutics.

the attenuated sabin strains of poliovirus have been used for many years to elicit protective immunity to poliovirus. oral vaccination with replicating polioviruses generates both mucosal and systemic immunity.
therefore, use of recombinant polioviruses expressing heterologous antigens as vaccine delivery vectors should provide a system for generating protective immunity to those antigens. cdna copies of the poliovirus genome have been used to construct vectors containing a multiple cloning site for insertion of heterologous genes.

a pilot enhanced-potency inactivated poliovirus vaccine (e-ipv) with presumably improved immunogenicity, containing trypsin-treated type 3 poliovirus (strain saukett) together with the regular type 1 and type 2 components, was subjected to standard safety and potency tests in the laboratory and taken through phase i and ii clinical trials. in balb/c mice, the trypsin-modified e-ipv (tryipv) was found to induce antibodies targeted outside the trypsin-sensitive bc-loop of capsid protein vp1, as previously shown for trypsin-treated type 3 poliovirus alone. trypsin used to modify the type 3 component at the bulk phase was removed by the vaccine manufacturer (rivm) in the regular purification process. absence of trypsin in the final product was further confirmed by immunizing mice and rabbits with 10-fold concentrated type 3 component of tryipv. assays for trypsin antibodies using eia and western blot techniques were negative. in the clinical phase i trial six adult volunteers with existing immunity to poliovirus were given increasing doses of tryipv. already one tenth of the regular dose induced a booster effect in neutralizing antibodies to both intact and trypsin-treated type 3 poliovirus. no unexpected side effects were recorded. phase ii trials comprised 50 adult volunteers with at least 5 years since the last dose of poliovirus vaccine and 50 children who were due to receive the third dose of the regular immunization schedule at about 2 years. in both groups, 25 individuals received tryipv and 25 were injected with the regular enhanced potency ipv (e-ipv).
serum specimens drawn before injection and one month after were tested for neutralizing antibodies using standard microneutralization assays (all serotypes) and the racina test (intact and trypsin-treated type 3 poliovirus). in all volunteers tryipv was at least as immunogenic as the regular e-ipv according to all assays. no statistically significant differences in side effects were reported.
a murine/influenza virus model has been used to evaluate the longevity of antibody and protective responses raised by gene gun delivery of a hemagglutinin-expressing dna. mice were immunized and boosted at one month with 0.4 µg of an h1-expressing plasmid dna (pcmvri1). antibody responses and protection against a lethal challenge were followed over the next year. antibody responses had good longevity, exhibiting comparable titers at one year post boost as at 10 days post boost. protection against the lethal challenge was complete at 10 days, 1 month and four months post boost, but only partial at one year.
a transgenic mouse model for identifying htlv-1 t-cell epitopes: generation of hla-b*3501-restricted ctl directed against synthetic peptides and naturally processed viral antigens, christian schönbach*, ai kariyone*, kiyoshi nokihara"+, karl-heinz wiesmüller$ and masafumi takiguchi#, departments of tumor biology* and immunology#, institute of medical science, university of tokyo, tokyo; "tokyo university of agriculture and technology, tokyo; +biotechnology instruments department, shimadzu corp., kyoto, japan; $natural and medical science institute at the university of tubingen, reutlingen, germany
the majority of human t-cell leukemia virus type-i (htlv-1), hla class i-restricted t-cell epitopes have been identified by cloning htlv-1 patient-derived t cells.
here we describe for the first time a rapid method (reverse immunogenetics) for identifying t-cell epitopes, together with a transgenic mouse model as a guide for testing the cellular immune response to a mixture of the lipohexapeptide immunoadjuvant pam3cys-ser-(lys)4 and synthetic htlv-1 peptides which seem suitable for vaccine design. htlv-1 amino acid sequences were searched for eight to 14mer patterns carrying the anchor residues of the hla-b*3501 peptide motif at positions two and eight to fourteen. 65 candidate peptides were synthesized according to the matched sequence patterns. their hla-b*3501 affinity was quantitatively analyzed in an indirect immunofluorescence peptide binding assay using rma-s-b*3501 cells.
the fourth group (controls) were inoculated with h3n2 (in), thereby providing heterotypic ctl immunity in the context of a natural infection without the confounding effects of humoral immunity against surface antigens. all four types of inoculations have been shown to protect normal (class i expressing) mice from a lethal challenge with influenza, presumably mediated by class i restricted cytotoxic t cells. the two groups inoculated via the intranasal route may gain additional protection by activating the mucosal immune system (iga). none of these types of inoculations has been evaluated in the context of class ii restricted cytotoxic t cells, the only ctls found in class i deficient mice. for all four types of inoculations, mhc class i deficient mice lost significantly more weight than the class i expressing control groups (seven mice per group), indicating the importance of class i restricted t cells in protection. within the class i expressing groups, there was no significant difference between the four types of inoculations; within the class i deficient groups the vac-np im immunized mice lost significantly more weight than the h3n2 group; the other two groups, the vac-np in and genetically immunized groups, had intermediate results.
these data lend support for a protective role for mucosal immunity. results on both class i and class ii ctl activity for the four types of inoculations will also be presented.
we tested the pbmcs of patients participating in two vaccine therapy trials for their ability to recognize overlapping peptides of the gp120 lai sequence. seventeen patients participating in a phase i gp160 protocol and 13 patients participating in a phase i gp120 protocol had their pbmcs isolated by ficoll separation of heparinized venous blood. the fresh pbmcs were plated, in triplicate, into 96 well plates containing peptides overlapping the lai sequence of gp120, pulsed on day 7 with tritiated thymidine and harvested and counted on day 8. results: the percentage of patients' pbmcs from each trial with an lsi ≥ 5 to each peptide are depicted below. conclusions: the pbmcs of hiv-infected volunteers who have been multiply immunized with either gp160 or gp120 proliferate to multiple peptides within the gp120 molecule. reactivity from the end of c1 through early c2 (lai #112-211) is particularly prominent and contains previously undescribed th epitopes (asterisks). conspicuously missing is reactivity to the v3 loop peptide (lai #300). although the percent reactivity to the entire gp120 molecule is similar between the immunization groups, there is differential recognition of some of the individual peptides, particularly peptides in early c3 (lai #319-348).
the intracytoplasmic lifecycle of listeria monocytogenes (lm) enables it to be a convenient vaccine vehicle for the introduction of foreign proteins into the mhc class i pathway of antigen presentation. taking advantage of these properties, we have inserted the nucleoprotein (np) gene from lymphocytic choriomeningitis virus (lcmv) into the lm chromosome by site-specific homologous recombination. infection of mice with recombinant lm expressing lcmv-np elicited a virus-specific ctl response.
we were able to recover lcmv-np specific ctl precursors from recombinant lm vaccinated mice, as shown by vigorous secondary ctl responses after in vitro stimulation. in contrast to mice immunized with wild type lm, mice vaccinated with np-recombinant lm were protected against challenge with immunosuppressive lcmv variants. protection was demonstrated by reduced viral titers or complete clearance of lcmv from serum and various organs including spleen, liver, lung, kidney, and brain. the kinetics of the lcmv challenge indicate that mice vaccinated with recombinant lm were able to arrest viral growth early in the infection due to a strong ctl response and did not exhibit the immunopathology associated with infection of naive mice. since lm not only delivers antigens into the mhc class i pathway but also induces il-12 production, it has the potential to function simultaneously as a vehicle for expressing foreign antigens and as an adjuvant promoting cell mediated immunity.
, latent membrane protein [lmp] 1 and 2a) in chromium release assays. we were fortunate in identifying one child from whom cryopreserved pbmc samples were available before and during ebv seroconversion. ebv-specific ctl activity was demonstrated concurrent with initial detection of virus in the peripheral blood by ebv-dna pcr, in the absence of detectable serum antibody. ctl lines from all nine children recognized one or more ebv latent gene product(s). all children demonstrated ctl responses against one or more ebna 3 proteins (3a, 3b, 3c), and ebna 3c was recognized most frequently. no ctl responses were detected against the ebv latent proteins ebna 1, 2, lp or lmp 1. the ebv-specific ctl lines expressed cd3/cd8, and mab blocking experiments demonstrated that the majority of target cell lysis was inhibited by antibody against mhc class i but not by antibody against mhc class ii. these results represent one of the first reports characterizing ebv-specific ctl responses in young children.
the striking similarity between ebv-specific ctl responses described here in young children and those reported for adults suggests that the ebna 3 family of proteins and lmp 2a should be considered for inclusion in candidate ebv vaccines.
evaluation of cellular immune responses to adenovirus vectors in the cotton rat. soonpin yei, gary kikuchi, ke tang and bruce c. trapnell. departments of virology and immunology, genetic therapy, inc., gaithersburg, maryland 20878
replication deficient recombinant adenovirus (av) vectors are efficient gene delivery vehicles currently being developed for a variety of in vivo gene therapy strategies, such as for the fatal pulmonary component of cystic fibrosis. the cotton rat (sigmodon hispidus) is one of the most widely accepted animal models for studying these av vectors because wild type human adenovirus replicates in cotton rats and because of the histopathologic similarities of infected respiratory epithelial tissues from humans and cotton rats. despite this, methods for studying immunologic responses in the cotton rat have not been developed. importantly, recent studies in the cotton rat (gene ther. 1:192-200; 1994) in our laboratory suggest that a dose-dependent specific immune response to av vectors can limit expression of the transgene. in this context, we have established methods to evaluate cytotoxic lymphocyte (ctl) responses to av vectors in the cotton rat. to accomplish this, a ctl target cell line was established consisting of primary cotton rat lung fibroblasts (crlf). splenocytes from cotton rats exposed previously to an av vector were harvested and cultured in vitro with irradiated, addb27-infected crlf. cultured (effector) splenocytes were then incubated with 51cr-labelled crlf (target) cells at effector:target (e:t) ratios of 100, 50 and 10. in parallel, splenocytes from naive cotton rats served as negative controls.
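the chromium release readout used in assays of this kind is conventionally reported as percent specific lysis, computed from experimental, spontaneous and maximum release counts. the abstract does not state its exact calculation, so the following is only a minimal sketch of the standard formula; the function names and the example counts are illustrative assumptions, not values from the study.

```python
def percent_specific_lysis(experimental, spontaneous, maximum):
    """Standard 51Cr-release formula (counts per minute in, percent out).

    experimental: release from targets incubated with effector cells
    spontaneous:  release from targets incubated in medium alone
    maximum:      release from targets lysed completely (e.g. with detergent)
    """
    if maximum <= spontaneous:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (experimental - spontaneous) / (maximum - spontaneous)


def mean_and_sem(values):
    """Mean and standard error of the mean for replicate wells (n >= 2)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
    return mean, (variance / n) ** 0.5


if __name__ == "__main__":
    # Hypothetical triplicate counts at one E:T ratio (not data from the abstract).
    wells = [percent_specific_lysis(cpm, spontaneous=100, maximum=900)
             for cpm in (480, 500, 520)]
    print(mean_and_sem(wells))
```

reporting each e:t ratio as mean and s.e.m. of replicate wells, for immune and naive effectors side by side, matches the way lysis values are usually tabulated for such experiments.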
results demonstrated vector-specific ctl lysis of target cells significantly greater than controls: 80.3 ± 1.3% vs 6.2 ± 0.5%, 49.6 ± 1.6% vs 5.7 ± 0.4%, and 22.8 ± 3.5% vs 4.8 ± 0.5% (mean ± s.e.m., n=3; p
500 cells/µl after more than 8 years of infection were selected from the amsterdam cohort study on aids versus 10 subjects who progressed to aids < 5 years. ctl activity was measured on 51cr labelled hla matched or autologous b-lcl, infected with rvv expressing hiv-1 ag. both bulk and limiting dilution ctl assays were performed longitudinally with pbmc after ag-specific stimulation. sequences of ctl epitopes were determined in homologous virus isolates. results: different kinetics of anti-gag ctl responses were observed in rapid progressors. in any case ctl responses disappeared during progression to aids. in long-term asymptomatic subjects persistent ctl responses were observed together with low viral load. conclusions: sustained, broad anti-hiv cellular immunity may correlate with maintenance of the asymptomatic state in long-term survival by controlling viral replication.
enteroviruses are a large group of positive stranded rna viruses known to be responsible for a number of distinct disease entities. recombination is thought to be capable of generating new enterovirus strains that cause significant morbidity. for example, enterovirus 70, which was responsible for a pandemic of haemorrhagic conjunctivitis and poliomyelitis, is thought to have originated by recombination between a coxsackie b like virus and another unidentified enterovirus. we are studying a group of echovirus 11 isolates from an outbreak of disease in southern india. sequence analysis within the 5' untranslated region reveals that these isolates fall into two groups that differ by ~20% (equivalent diversity to that seen between published sequences of poliovirus 1 and coxsackie a 9 virus). these two groups of viruses also differ in their cell tropism.
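a figure such as the ~20% divergence between the two 5' untranslated region groups is a pairwise nucleotide difference. the abstract does not say how it was calculated; as a hedged illustration, the sketch below computes an uncorrected percent difference between two pre-aligned sequences, skipping gapped columns. the function name, the gap convention and the toy sequences are assumptions made for the example.

```python
def percent_difference(seq_a, seq_b, gap="-"):
    """Uncorrected pairwise difference (%) between two aligned sequences.

    Columns where either sequence has a gap are ignored; comparison is
    case-insensitive. No substitution model is applied.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip alignment gaps
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


if __name__ == "__main__":
    # Toy 10-nt fragments with two mismatches (not sequences from the study).
    print(percent_difference("ACGTACGTAC", "ACGTTCGTAA"))
```

an uncorrected percentage like this understates divergence when multiple substitutions hit the same site, so published comparisons often apply a correction model instead.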
isolates defined as group 1 by their 5'utr sequence grow equally well on ht29 cells (a human colon carcinoma cell line) and vero cells. isolates of group 2, with one exception, grow only on ht29 cells. analysis of the structural proteins of these isolates revealed differences in migration that correlated with their cellular tropism. thus, significant genotypic and biological diversity exists amongst these virus isolates. one virus isolate had the 5' untranslated region sequence of a group 1 virus but the protein profile and cellular tropism of a group 2 virus. the best explanation of these findings is that this anomalous isolate is a natural recombinant between the parental strains. both the ease with which viable recombinants are generated and the diversity present within this one enterovirus serotype increase the potential for the production of novel pathogenic enterovirus strains.
dominant susceptibility to polyoma tumors in inbred wild mice, sharon r. nahill, yupo ma, john carroll and thomas l. benjamin, department of pathology, harvard medical school, boston, ma 02115
polyoma virus (py) is a mouse dna tumor virus which, under appropriate conditions, causes tumors in a wide variety of cell types. generation of tumors is a function of both the viral and host genomes. lukacher et al. have recently described a dominant gene, pyvs, carried by the c3h/bi mouse strain, which confers susceptibility to py-induced tumors. mapping and immunological analyses indicate that pyvs is the mouse mammary tumor virus 7 superantigen (mtv 7 sag) gene, which deletes t cells required for py tumor immunosurveillance in h-2k mice. to determine the generality of endogenous superantigens as determinants of susceptibility and to reveal potentially novel mechanisms of susceptibility, we have looked for dominant susceptibility (ds) gene(s) in newly established and genetically diverse inbred wild mouse strains, czech ii and peru/atteck (peru).
both strains are susceptible to py, as 100% of infected animals develop a full profile of tumors. crosses between c57br, whose resistance is contributed by the major histocompatibility (mhc) locus, and susceptible peru or czech ii yield f1 progeny which are fully susceptible, indicating a dominant inheritance pattern of susceptibility. the incidence of tumor-bearing backcross animals [((peru x c57br) x c57br) and ((czech ii x c57br) x c57br)] suggests that ds is due to at least one, but not more than two genes. amplification of genomic dna from the czech ii and peru mice by pcr using primers specific for mtv 7 sag indicates that both strains are negative for proviral mtv 7 sag. furthermore, the mechanism of ds in these mice may be independent of all mtv sag, as pcr using primers specific for the highly conserved region of mtv sag is unable to amplify mtv dna from peru or czech ii genomic dna. these results indicate that, like the c3h/bi, the peru/atteck and czech ii contain gene(s) which override the resistance to py-induced tumors contributed by the mhc of the c57br parent and which may cause tumors via a novel, mtv sag-independent mechanism. we have initiated efforts to map the ds in peru and czech ii mice using pcr and primer pairs flanking simple sequence length polymorphisms.
fis-2 is a low leukemogenic, but relatively strong immunosuppressive variant of friend murine leukemia virus (f-mulv). this variant was originally isolated from t-helper cells of flc-infected adult nmri mice. compared to f-mulv, fis-2 suppresses primary antibody response more efficiently in infected mice. some of the fis-2 infected adult nmri mice developed a disease resembling the acquired immunodeficiency syndrome induced by hiv. restriction mapping and nucleotide sequence analysis of fis-2 show a high degree of homology between this variant and the prototype f-mulv clone 57.
in this study we have attempted to localize the genomic determinant of fis-2 which is responsible for induction of a strong suppression of primary antibody response. six chimeric viruses of fis-2 and f-mulv were constructed. the primary antibody response of the mice infected with these chimeric viruses was investigated. the results of these experiments will be presented.
anti-fmdv antibodies, as measured in an elisa capture assay, were cross reactive. b) cellular: proliferative (cd4) t cell responses of peripheral blood mononuclear cells (pbmc) were low or undetectable during primary responses to vaccine or virus, and frequently low during secondary responses. for good t cell proliferation in vitro, multiple immunisation is required. this may reflect preferential stimulation of the th2 cd4 t cell subset. interestingly, when cd4 responses were observed, cd8 t cell responses were also detectable. 2. recognition of individual viral proteins a) expression cloning: structural and non-structural protein pseudogenes were cloned from cdna by pcr, expressed in pgex-3xuc, and purified by sds-page. b) humoral: structural and non-structural proteins were recognised by infected animals. a good anamnestic anti-non-structural response was only observed when the boosting serotype differed from the serotype stimulating the primary response. c) cellular: both structural and non-structural proteins were recognised and some were cross reactive. interestingly, vp1 was strain specific, and the polymerase (3d) was the most immunogenic and cross reactive. d) a construct comprising 3d and the immunodominant vp1 epitopes was prepared and tested.
in common with other herpesviruses, the envelope glycoproteins of equine herpesvirus 1 (ehv-1; equine abortion virus) are major determinants of the infectious process and pathogenicity, and are inducers of humoral and cell-mediated immune responses. as such, they are candidates for components of subunit vaccines against ehv-1.
to generate useful amounts of individual ehv-1 glycoproteins, we have constructed recombinant baculoviruses capable of expressing glycoproteins c, d and h (gc, gd, gh) in insect cells, and have evaluated the recombinant products as immunogens in a murine model of ehv-1 infection. all three glycoproteins induced serum (elisa) antibodies to ehv-1, and ehv-1 gc and gd also induced neutralizing antibody responses. following intranasal challenge with infectious ehv-1, protective immunity, as demonstrated by accelerated clearance of virus from respiratory tissues to below detectable levels, was evident in mice immunized with either recombinant gc or gd. in contrast, gh-immunized mice did not develop detectable neutralizing antibody, and did not clear challenge virus more rapidly than controls. delayed type hypersensitivity and lymphoproliferation responses to ehv-1 antigen were observed for each of the ehv-1 glycoproteins, and in experiments with gd-immunized mice, a role for cell mediated immunity in protection was confirmed by adoptive transfer and t-cell depletion experiments. the data provide support for the potential of glycoproteins c and d as a subunit vaccine against ehv-1.
molecular pathogenesis of viral infections
52-331 enterovirus-immune cell interactions: implications in enterovirus-induced diseases
we have also evaluated the effect of virus infection on the humoral immune response to cvb3 infection in adolescent c3h/hesnj mice. antigen presenting cell, t-helper cell and b-cell function were evaluated utilizing a sheep red blood cell (srbc) plaque assay. mice were injected intraperitoneally (ip) with 10^5 plaque forming units of cvb3 at day 0 and with 10^7 srbcs at days 0, 2, 3 and 4 post-cvb3 infection. splenocytes were harvested 4 days post-srbc injection, mixed with target srbcs and guinea pig complement and incubated. plaques were then quantitated.
results: cvb3 was associated with 12.9% to 17.4% of cd8 positive t cells and with 11% to 26% of adherent splenocytes. after mitogen (lps and con a) stimulation, b cells and adherent cells were demonstrated to be permissive for viral replication. a 248% and 738% under non-stimulated conditions. an average of 1% of virus is cell-associated (plaque
north america. bruce anderson, terry yates, norah torrez-martinez, wanmin song, brian hjelle. university of new mexico, albuquerque, n.m.
we recently identified a new species of hantavirus (hmv) associated with the harvest mouse reithrodontomys megalotis (hjelle b et al, j. virol. 1994, in press). an arizona woodrat (neotoma mexicana) was found to be infected with hmv, presumably through "spillover". hmv is most closely related to the four corners hantavirus (fcv) of deer mice (genus peromyscus). the nucleocapsid gene and protein of hmv differ from those of fcv by 24% and 15% of residues, and the 1896 nt s genome is shorter by 163 nt. we surveyed 174 reithrodontomys animals captured in the u.s. and mexico for hantavirus antibodies; 27 (15.6%) were positive. s segment cdnas were amplified and sequenced from seropositive animals captured in california (4), arizona (3), new mexico (1), and mexico (2). a monophyletic clade of hmv-like agents was identified at all sites, although an r. megalotis infected with an fcv-like virus was also identified in the state of zacatecas, mexico. nucleotide sequence distances among members of the hmv clade were up to 15.5%, but amino acid distances were less than 2%. hmv is enzootic in harvest mice throughout much of north america, and can also infect wood rats.
htlv i-associated myelopathy/tropical spastic paraparesis (ham/tsp) is a slowly progressive neurological disease characterized by perivascular mononuclear infiltrates in the cns.
htlv i-specific cd8+ ctl are found in pbl and csf of infected patients with htlv i-associated neurological disease but not in htlv i seropositive individuals without neurological involvement. previous studies have shown that in hla-a2+ patients, htlv i-specific cd8+ ctl restricted by hla-a2 recognize a peptide derived from the htlv i tax protein (tax 11-19 llfgypvyv). in the present study, we have analyzed the potential of these tax-specific ctl to recognize additional peptides. our results demonstrate that a subpopulation of high affinity cd8+ tax 11-19 specific ctl clones cross-react on a self peptide derived from the sequence of myelin-associated glycoprotein (mag 556-564 vl&sdfri) presented by hla-a2. these observations suggest that the demyelination process in ham/tsp may be due, in part, to virus-specific ctl recognition of a self myelin component that is independent of htlv i infection.
development of pathology varies widely between different strains of mice after intracerebral inoculation with the so-called 'docile' isolate of lymphocytic choriomeningitis (lcm) virus. the c3heb/fej and b10.br/sgsnj mouse strains have been of special interest because they display autoimmune hemolytic anaemia with varying degrees of apparent immunological involvement. in this study, we examined the role of cd4+ t helper cells in this autoimmune response by treating mice with the cd4-specific gk1.5 monoclonal antibody. we also determined if polyclonal activation of b lymphocytes, induced either by lcm virus or by lactate dehydrogenase-elevating virus, another well known b cell activator, correlated with the development of anaemia in these mice. our results strengthened the central role of the immune system in the anaemia in c3h mice by showing that depletion of cd4+ cells largely, if not completely, abrogated this anti-erythrocyte autoimmune reaction. as reported by others, we found that the anaemia was milder in b10.br mice than in c3h mice.
however, we could not confirm the difference in the degree of b lymphocyte polyclonal activation between these mice. furthermore, lactate dehydrogenase-elevating virus had no apparent effect on erythrocytes, even though this virus also induced a sharp increase in plasma igg levels.
one of the two class i mhc (h-2kd)-restricted immunogenic sites identified on the influenza strain a/japan/57 (h2n2) hemagglutinin (ha) encompasses two distinct partially overlapping epitopes, mapping to residues 204-212 and 210-219. when we investigated the magnitude of the ctl responses of balb/c mice to the two overlapping epitopes, we found that while the nh2-terminal nonamer epitope is immunodominant, eliciting vigorous ctl responses in a/japan/57-immunized balb/c mice, the ctl responses to the cooh-terminal decamer epitope are weak and variable. the c-terminal epitope subdominance seems to be due to factors other than inefficient processing of the epitope in vivo, because ctls generated by priming mice with recombinant sindbis viruses expressing only one of the ha 204-219 subsites displayed patterns of responsiveness similar to that of influenza virus primed ctls. limiting dilution ctl assays showed that the ctl precursor frequency (pctl) of the n-terminal epitope is at least tenfold higher than the pctl of the c-terminal epitope, implying that the low and variable pattern of c-terminal specific responsiveness was due to the limited t cell precursors in the c-terminal specific ctl repertoire. this was further confirmed by the limited heterogeneity in the cross reactivity patterns displayed by the c-terminal specific ctl for an ig vh fragment and the ha 210-219 epitope of influenza strain a/jap/57 in short term bulk cultures, and by facs analysis of tcr vβ chain usage. taking these together with our previous observation that some ha 210-219 specific ctls can also cross-recognize an ig vh fragment,
these studies provide strong evidence that ig gene products may influence t lymphocyte function and repertoire development.
we have previously described the identification of homologous regions in the c-terminus of hiv-1 gp41 and in the n-terminus of hla class ii beta chains. forty percent of patients infected with hiv-1 virus were shown to have antibodies which bind to the homologous sequences, as well as to native hla class ii molecules. affinity purified crossreactive antibodies (crab) were shown to have direct blocking effects on normal t cell responses to recall antigens, and could mediate adcc of hla class ii+ cell lines. in order to determine the contribution of such antibodies to disease progression, we obtained longitudinal plasma samples from patients in the macs study. in a first study, it was found that the presence of high-titer crab correlated with a more rapid disease progression (p = 0.027 by fisher two tail analysis). in a second, 7 year longitudinal study of 12 progressors and 12 stable patients we found: (1) the production of crab was seen in 70-80% of rapid progressors, while the true stables produce only infrequent low-titer crab. (2) in rapid progressors, production of crab preceded by 2-3 years the marked drop in cd4 counts. (3) crab production did not correlate with the degree of hyperglobulinemia in these patients. (4) the presence of crab during the asymptomatic stage correlated with early loss of t-helper responses to recall antigens. we are currently establishing whether periodic measurements of crab in patients' sera could be valuable in predicting a drop in cd4 counts and disease progression.
the lymphokine ifn-γ is a pleiotropic immunomodulator and possesses intrinsic antiviral activity. we studied its significance in the development of antiviral immune responses using ifn-γ receptor deficient (ifn-γr-/-) mice.
after inoculation with live attenuated pseudorabies virus (prv) the mutant mice showed no infectivity titers in various tissues and transient viral ag expression only in the spleen, similar to wild-type mice. however, the absence of the ifn-γr resulted in increased proliferative splenocyte responses. the prv-immune animals showed a normal ifn-γ and il-2 production, without detectable il-4, and with decreased il-10 secretion in response to viral ag or con a. immunohistochemically, an increased ratio of ifn-γ/il-4 producing spleen cells was found. after immunization with either live attenuated or inactivated prv, ifn-γr-/- mice produced significantly less antiviral antibody (ab), and more succumbed to challenge infection than the intact control animals. the reduction in ab titers in the mutant mice correlated with lower protection by their sera in transfer experiments. these findings are in line with the strong enhancing effect of exogenous ifn-γ on rabies virus- and prv-specific igg responses. our data demonstrate that a physiological ifn-γ system is surprisingly not critical for the generation of antiviral th-1-type and the suppression of th-2-type cytokine responses. the lymphokine, however, is an important mediator in the generation of protective antiviral ab.
key: cord-020101-5rib7pe8 authors: nan title: cumulative author index for 2008 date: 2008-11-17 journal: virus res doi: 10.1016/s0168-1702(08)00367-5 sha: doc_id: 20101 cord_uid: 5rib7pe8 nan dettori, g., see medici, m.c. (137) 163 devi, s., see osman, o.
(135) murine leukemia virus reverse transcriptase: structural comparison with hiv-1 reverse transcriptase the gprlqpy motif located at the carboxy-terminal of the spike protein induces antibodies that neutralize porcine epidemic diarrhea virus detection of ovine herpesvirus 2 major capsid gene transcripts as an indicator of virus replication in shedding sheep and clinically affected animals genetic characterization of equine influenza viruses isolated in italy between a new living cell-based assay system for monitoring genome-length hepatitis c virus rna replication unraveling the puzzle of human anellovirus infections by comparison with avian infections with the chicken anemia virus the contribution of feathers in the spread of chicken anemia virus cloning and subcellular localization of the phosphoprotein and nucleocapsid proteins of potato yellow dwarf virus, type species of the genus nucleorhabdovirus the p26 gene of the autographa californica nucleopolyhedrovirus: timing of transcription, and cellular localization and dimerization of product complete genomic sequence of turkey coronavirus recombinant l and p protein complex of rinderpest virus catalyses mrna synthesis in vitro molecular divergence of grapevine virus a (gva) variants associated with shiraz disease in south africa sequence analysis of a reovirus isolated from the winter moth operophtera brumata (lepidoptera: geometridae) and its parasitoid wasp phobocampe tempestiva (hymenoptera: ichneumonidae sars coronavirus replicase proteins in pathogenesis virus-induced gene silencing in medicago truncatula and lathyrus odorata evaluating the 3c-like protease activity of sars-coronavirus: recommendations for standardized assays for drug discovery hbx modulates iron regulatory protein 1-mediated iron metabolism via reactive oxygen species pathogenetic mechanisms of severe acute respiratory syndrome detection of a novel circovirus in mute swans (cygnus olor) by using nested broadspectrum pcr chimaeric hiv-1 
key: cord-260225-bc1hr0fr authors: sirpilla, olivia; bauss, jacob; gupta, ruchir; underwood, adam; qutob, dinah; freeland, tom; bupp, caleb; carcillo, joseph; hartog, nicholas; rajasekaran, surender; prokop, jeremy w. title: sars-cov-2-encoded proteome and human genetics: from interaction-based to ribosomal biology impact on disease and risk processes date: 2020-07-20 journal: j proteome res doi: 10.1021/acs.jproteome.0c00421 sha: doc_id: 260225 cord_uid: bc1hr0fr sars-cov-2 (covid-19) has infected millions of people worldwide, with lethality in hundreds of thousands. the rapid publication of information, both regarding the clinical course and the viral biology, has yielded incredible knowledge of the virus. in this review, we address the insights gained for the sars-cov-2 proteome, which we have integrated into the viral integrated structural evolution dynamic database, a publicly available resource.
integrating evolutionary, structural, and interaction data with human proteins, we present how the sars-cov-2 proteome interacts with human disorders and risk factors ranging from cytokine storm, hyperferritinemic septic, coagulopathic, cardiac, immune, and rare disease-based genetics. the most noteworthy human genetic potential of sars-cov-2 is that of the nucleocapsid protein, where it is known to contribute to the inhibition of the biological process known as nonsense-mediated decay. this inhibition has the potential to not only regulate about 10% of all biological transcripts through altered ribosomal biology but also associate with viral-induced genetics, where suppressed human variants are activated to drive dominant, negative outcomes within cells. as we understand more of the dynamic and complex biological pathways that the proteome of sars-cov-2 utilizes for entry into cells, for replication, and for release from human cells, we can understand more risk factors for severe/lethal outcomes in patients and novel pharmaceutical interventions that may mitigate future pandemics. the sars-cov-2 (covid-19) pandemic has impacted every component of life, including research and medicine. in just a few months from the onset of infections to the writing of this review, 10573 papers/objects have been published on sars-cov-2 (figure 1). this body of literature primarily focuses on infectious diseases, the respiratory system, public environmental occupational health, biochemistry molecular biology, virology, immunology, pharmacology, microbiology, and healthcare science services, to name a few fields (figure 1a). title extraction of these papers reveals mainly clinically connected terms (figure 1b). the extensive infectious disease and clinical base of this literature has yielded knowledge of viral entry, replication, immune response, and transmission.
however, in a short window of time, biochemical and molecular biology insights into sars-cov-2 have yielded a smaller body of literature that continues to grow (1267 out of the 10573 items), taking more time for data generation than clinical descriptions. of these 1267 biochemistry/molecular biology items, 934 are primary articles (figure 1c). title and abstract word extraction from these biochemistry/molecular biology items, followed by counting mentions of all human (20368) or sars-cov-2 proteins, shows a heavy focus on ace2 and spike (s) proteins (figure 1d). the virus primarily enters cells through the interaction of the sars-cov-2 surface glycoprotein, spike (s), with the human-encoded ace2, similar to the sars virus. 1,2 from the title/abstract terms, we identified 51/346 usages of ace2 and 76/295 of spike. other human proteins with repeated mentions include tmprss2 (13 titles/49 abstracts), ace (1/32), furin (2/14), dpp4 (2/11), and c3 (2/2). additional sars-cov-2 proteins with mentions include nsp12 (rna-directed rna polymerase, 20/71), nucleocapsid (n, 17/71), membrane (m, 5/48), envelope (e, 4/31), nsp5 (3clpro/mpro, 7/26), nsp8 (3/19), nsp16 (2′-o-methyltransferase, 3/14), orf8 (1/10), nsp10 (3/9), nsp14 (guanine-n7 methyltransferase, 1/8), nsp3 (papain-like protease, 16/6), and nsp15 (uridylate-specific endoribonuclease, 16/4). only nsp6 and nsp11 for sars-cov-2 have no mentions within any of these titles or abstracts for biochemistry-linked papers on sars-cov-2. overall, this suggests that a few papers specifically related to sars-cov-2 proteins have been published; however, a large body of literature exists for the original sars and other coronaviruses that can give interpretation of the diverse functions performed by the viral-coded proteins and how they interact with human biology.
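the title/abstract tally described above can be sketched roughly as follows. this is a minimal illustration under stated assumptions, not the authors' pipeline: the `count_mentions` helper, the record layout (dicts with `title` and `abstract` keys), and the two toy records are all hypothetical. each paper is counted at most once per field, using case-insensitive whole-word matching so that `ace` is not counted inside `ace2`.

```python
import re
from collections import Counter


def count_mentions(records, terms):
    """Count how many titles and abstracts mention each term at least once.

    records: iterable of dicts with 'title' and 'abstract' strings
             (a hypothetical layout chosen for this sketch).
    terms:   protein names such as 'ace2', 'spike', or 'nsp12'.
    Returns a dict {term: (n_titles, n_abstracts)}.
    """
    title_hits, abstract_hits = Counter(), Counter()
    # Whole-word, case-insensitive patterns: 'ace' will not match 'ACE2'.
    patterns = {
        term: re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for term in terms
    }
    for rec in records:
        for term, pattern in patterns.items():
            if pattern.search(rec.get("title", "")):
                title_hits[term] += 1
            if pattern.search(rec.get("abstract", "")):
                abstract_hits[term] += 1
    return {term: (title_hits[term], abstract_hits[term]) for term in terms}


# Toy records, invented for illustration only.
records = [
    {
        "title": "Structure of the spike glycoprotein bound to ACE2",
        "abstract": "We report how spike engages ACE2; TMPRSS2 primes entry.",
    },
    {
        "title": "ACE inhibitors and outcomes",
        "abstract": "ACE inhibitor use and ACE2 expression are discussed.",
    },
]
terms = ["ace2", "ace", "spike", "tmprss2"]

result = count_mentions(records, terms)
for term, (n_titles, n_abstracts) in result.items():
    # Reported in the same title/abstract order as the counts in the text.
    print(f"{term}: {n_titles}/{n_abstracts}")
```

counting each paper once per field, rather than every occurrence, is a design choice of this sketch; the source does not state which convention was used for the reported counts.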
the advancement of knowledge of the sars-cov-2 proteome has been slower than clinical insights due to the need for experimental work that is slow and that is being hampered by social isolation. the 29903 base-pair single-stranded rna genome of sars-cov-2 (ncbi nc_045512.2) has a 265 base-pair 5′ utr, multiple protein-coding segments, and a 228 base-pair 3′ utr. sars-cov-2 has a 79% genomic similarity with sars-cov, a known human pathogen, with both known to enter cells through the binding of human ace2. 3, 4 in addition to sars-cov and sars-cov-2, five other coronaviruses are capable of human-to-human transmission and infection (hku1, nl63, oc43, 229e, and mers-cov). 5 hundreds of coronaviridae family member genomes have been sequenced in human and other vertebrate hosts, 6,7 and many structures have been solved for coronaviridae species proteins, allowing for systematic assessments of the knowledge base. our group implemented a sequence-to-structure-to-function analysis 8, 9 to understand sars-cov-2 proteins, developing a robust understanding of protein conservation, structure, and molecular dynamics. 10 the data generated for each protein was then developed into the viral integrated structural evolution dynamic database (vistedd), a publicly released database of multiple tools for the virus. the database can be accessed at https://prokoplab.com/vistedd/. these tools consist of educational resources for the proteins coded by sars-cov-2 (molecular videos, 3d protein model prints, amino acid details of conservation, and dynamics), the mapping of critical sites to each protein, and the insights into how sars-cov-2 interacts with human proteins. generating this database has given our team a diverse understanding of sars-cov-2, particularly for host protein interactions of each of the viral proteins. multiple studies have begun building systemic insights for sars-cov-2 infections. 
multiple groups have performed systematic data assessment of ace2 expression and protein staining, suggesting the physiological cell types that can be targeted by the virus. they have shown expression in many tissues throughout humans, with expression within the lung found on the apical surface of polarized bronchial secretory epithelial cells. 11−14 once the virus enters the cells, it results in the alteration of broad biological pathways, including translation, splicing, protein homeostasis, and nucleic acid metabolism. 15 epithelial organoid cultures exposed to the virus produce a robust change in rna expression patterns for cytokine and interferon intracellular immune responses that give rise to tissue signals. 16 single-cell profiling within the lungs of patients shows the intracellular cytokine/interferon response results in the recruitment of macrophages in severe cases and t-cells in moderate cases, with a high potential for therapeutic intervention. 17, 18 overactivation of the cytokine/interferon response is connected to poor outcomes within patients, correlating with macrophage activation syndrome. 19 additional adverse outcomes for the activation of apoptosis within lymphocytes have been observed and may contribute to the noted lymphopenia. 20 proteomics and metabolomics of patient sera show the same macrophage dysfunction, while also elucidating platelet and complement dysregulation with the identification of severity classifiers. 21−23 in totality, the physiological response to the virus is likely mediated by a combination of immune system activation and the direct human interaction partners, altering cellular processes. an understanding of these detailed biological interactions can shed light on potential therapeutic opportunities while building a fundamental knowledge of viral biology. to date, few studies have been performed that systematically look at mapping how the sars or coronavirus proteins physically interact with human proteins.
structural-level insights for coronavirus proteins are surprisingly deficient in human interaction partners. 10 a few of these proteins have been targeted for interaction assessments, such as the nucleocapsid protein 24,25 (shown below). it has been speculated that the understanding of virus−host interactions represents a major untapped potential of viral inhibitors. 26 a 2018 review highlights the literature of viral−host interactions for coronaviruses, focused on synergizing the knowledge of independent experiments for virus receptors, translation, membrane dynamics, immune regulation, cell cycle control, and replication. 27 the more recent work by gordon et al., 28 covering the systematic affinity purification of 26 different sars-cov-2 proteins within human cells, has elucidated many mechanisms and drug compounds for the regulation of viral processes. bringing this data together with our vistedd tools, we provide a current snapshot of sars-cov-2 viral proteins (figure 2). orf1ab is a large protein that is proteolytically cleaved to produce 16 different proteins, many involved in rna replication. the nmr structure of nsp1 (2gdt) has been solved, 29 and 250 sequences have been identified by basic local alignment search tool (blast). nsp1 interacts with proteins of the alpha dna polymerase (figure 2) and is involved in regulating endonucleolytic rna cleavage of mrna, allowing the virus to enrich viral rna within a cell. 30, 31 nsp1 has been shown to interact with ribosomal subunits, resulting in the inhibition of translation, 5′ mrna capping changes, and mrna destabilization. 32−34 from a sars-cov yeast two-hybrid screen, nsp1 was identified to interact with immunophilins, showing that it alters the intracellular immune response. 35 expression of nsp1 drives changes in interferon signaling. 36 these processes make nsp1 a potential virulence factor. 36−38 see prokoplab.com/nsp1 for additional information.
nsp2 has no solved protein structure, with itasser-generated predictions 39 that are mostly (67%) coiled, and 246 sequences have been identified by blast. all of the protein interaction partners are acetylated (figure 2). the protein has been suggested to be dispensable to viral replication but does impact rates of replication. 40 see prokoplab.com/nsp2 for additional information. nsp3 (papain-like proteinase) has hundreds of solved x-ray crystal structures with a c4 zinc finger, and 3180 sequences have been identified by blast. the papain-like proteinase cleaves the first four nsp proteins, 41 where inhibition can block viral replication. 42 the proteinase can cleave proteins and has been shown to have deubiquitinase activity. 43−45 this deubiquitinase function has been linked to the regulation of immune system cytokine response, 46,47 specifically the type-i interferon signaling pathway, 48 and has a connection to virulence. 49 see prokoplab.com/papain_like_proteinase for additional information. nsp4 has no solved protein structure, with itasser-generated predictions 39 that are mostly (58%) coiled, and 3325 sequences have been identified by blast. nsp4 interacts with several proteins involved in mitochondrial import for inner membrane insertion (figure 2). nsp4 and nsp3 interact and form within the membrane and are involved in transcription complex assembly anchoring. 50, 51 the complex is involved in the double membrane secretory vesicle formation 52, 53 in the endoplasmic reticulum, 54 conferring with human protein interaction partners. 28 see prokoplab.com/nsp4 for additional information. nsp5 has hundreds of solved x-ray crystal structures, with the protein found in a dimer form with a cysteine protease function, 55, 56 and 3397 sequences have been identified by blast. the enzyme cleaves most of the proteins of the larger rep protein with a highly conserved specificity, where inhibition is one of the most studied interventions.
57−59 see prokoplab.com/3c-like_proteinase for additional information. nsp6 has no solved protein structure, with itasser-generated predictions 39 that are mostly (63%) coiled, and 2558 sequences have been identified by blast. nsp6 interacts with multiple proteins involved in atp hydrolysis-coupled cation transmembrane transport (figure 2). the protein is likely transmembrane-localized, along with nsp3/nsp4, 60 and is involved in autophagosome formation. 61−63 the few papers discussing nsp6 suggest a major future area of understanding and pharmaceutical intervention potential. see prokoplab.com/nsp6 for additional information. there are several solved structures for nsp7 that interact with nsp12/nsp8 (6nur, 2ahm, and 3ub0), 64−66 and 3256 sequences have been identified by blast. the nsp7 protein interacts with multiple small gtpases of the ras complex, many of which are prenylation-regulated (figure 2). the nsp7/nsp8/nsp12 complex is a viral rna-directed rna polymerase unit, where nsp12 is enhanced through the binding of nsp7/nsp8. 67 see prokoplab.com/nsp7 for additional information. there are several solved structures for nsp8 that interact with nsp12/nsp7 (6nur and 3ub0), 64−66 and 3339 sequences have been identified by blast. the nsp8 protein interacts with proteins involved in translation, snrna 3′-end processing, 7s rna binding, and ribonucleoproteins (figure 2). in addition to the information provided for nsp7, nsp8 has been suggested to also interact with the orf6 protein. 68 see prokoplab.com/nsp8 for additional information. nsp9 has many known protein structures, with the protein requiring dimerization to function, 69 and 3386 sequences have been identified by blast. nsp9 interacts with multiple proteins of structural constituents of the nuclear pore (figure 2).
nsp9 and nsp10 interact with the nuclear factor-κb repressing factor (nkrf) and may cause an interleukin (il)-8/il-6-mediated chemotaxis of neutrophils and an overexuberant host inflammatory response. 70 nsp9 is involved in viral rna synthesis and rna binding, which likely evolved from a protease. 71, 72 see prokoplab.com/nsp9 for additional information. nsp10 has many known protein structures, including those interacting with nsp14 and nsp16, and contains two zinc binding motifs; 73 3344 sequences have been identified by blast. nsp10 stimulates nsp14 3′−5′ exoribonuclease/mismatch excision 74, 75 and nsp16 2′-o-methyltransferase activities. 76, 77 the interface of interaction with nsp14 and nsp16 overlaps, suggesting a dynamic regulation process 78 that may involve the linkage of functions through a spherical dodecameric structure. 79 a peptide-based inhibition of the nsp10 interaction has been proposed as a potential viral regulator. 80 see prokoplab.com/nsp10 for additional information. nsp11 is a little-known small 1.3 kda peptide with few interaction partners. 28 nsp12 has multiple known protein structures with a zinc active site and a structure that interacts with nsp7/nsp8, 64−66 and 5086 sequences have been identified by blast. nsp12 is involved in the replication of plus-strand rna through complement strand synthesis and then viral rna synthesis. 81 the enzyme is highly targeted for therapeutic inhibition of viruses. it is also known as rdrp and is the target of the drug remdesivir. see prokoplab.com/rna-directed_rna_polymerase for additional information. nsp13 has multiple known protein structures with a zinc active site, and 5598 sequences have been identified by blast. nsp13 interacts with multiple proteins involved in the centrosome−golgi apparatus and centrosome (figure 2) and has an rna and dna duplex-unwinding ability to separate strands with 5′ to 3′ polarity. 82−84 see prokoplab.com/helicase for additional information.
nsp14 has multiple known protein structures with a zinc active site, and 2794 sequences have been identified by blast. nsp14 has an s-adenosyl-l-methionine (sam)-binding pocket and an exoribonuclease function that is involved in rna capping, 85−87 and it interacts with nsp10 and is known to interact with the human ddx1 rna helicase to enhance the virus replication. 88 see prokoplab.com/guanine-n7_methyltransferase for additional information. nsp15 has multiple known protein structures, and 2489 sequences have been identified by blast. nsp15 is a mn2+-dependent toric monomer-to-hexamer enzyme involved in uridylate-specific cleavage 89,90 that may be regulated by nsp7/nsp8, 91 and it interacts with the retinoblastoma protein to impact the cell cycle 92 and is also known as nendou. see prokoplab.com/uridylate-specific_endoribonuclease for additional information. nsp16 has multiple known protein structures with na, mg, and s-adenosyl-l-methionine (sam), and 2495 sequences have been identified by blast. nsp16 is an sam-based enzyme for the methylation of ribose 2′-oh in viral rna capping 93, 94 and interacts with nsp10. 77 the protein is a critical component in the inhibition of the host type-i interferon response 95 and is also known as 2′-o-mtase. see prokoplab.com/2-omethyltransferase for additional information. the spike surface glycoprotein has multiple known protein structures that are heavily glycosylated and form a trimer complex, 96, 97 and a structure is known of its interaction with the dimer of heterodimers ace2/slc6a19 (6m17); 3 6612 sequences have been identified by blast. s is a class-i viral fusion protein 98 and drives the specificity of cell targets through the interaction with ace2 to enter human cells. 99, 100 following binding to the receptor, s undergoes a conformational change to allow viral entry.
101 for the protein to function correctly, it must be proteolytically cleaved by trypsin and, upon cell binding, by proteases such as tmprss2, which elevate entry by mediating tropism. 102 s is of interest to the development of immunizations and rapid detection of coronaviruses, as it is surface-exposed. 103, 104 see prokoplab.com/spike for additional information. orf3a has no solved protein structure, with itasser-generated predictions, 39 and 65 sequences have been identified by blast. orf3a is a three transmembrane helix protein where the extracellular component localizes the protein to the golgi apparatus with a caveolin-1 binding potential 105 and is involved in the formation of viral particles. 106, 107 orf3a has been shown to impact the cell cycle. 108 see prokoplab.com/3a for additional information. the envelope protein (e) has no solved protein structure, with itasser-generated predictions, 39 and 94 sequences have been identified by blast. e is required for viral particle formation 109 with transmembrane helix-forming pentameric α-helical bundles with channel activity 110 that can contribute to the membrane permeability. 111 see prokoplab.com/e for additional information. the membrane protein (m) has no solved protein structure, with itasser-generated predictions, 39 and 1507 sequences have been identified by blast. m has human interaction partners that are involved in the mitochondrial matrix (figure 2) and is a critical component of viral membranes that are involved in viral budding. 112 see prokoplab.com/m for additional information. orf6 has no solved protein structure, with itasser-generated predictions, 39 and 31 sequences have been identified by blast. two of the interaction partners are involved in the transcription-dependent tethering of rna polymerase (figure 2).
orf6 can function toward the inhibition of beta interferons 113 through the regulation of the signal transducer and activator of transcription 1 (stat1) 114 and endoplasmic reticulum (er) stress 115 and can interact with nsp8. 68 see prokoplab.com/orf6 for additional information. orf7a has multiple known protein structures, and 42 sequences have been identified by blast. orf7a has protein interaction partners involved in ribosomal large subunit biogenesis (figure 2) and localizes to the er and golgi network. 116 it can regulate the cell cycle in g0/g1 progression. 117 see prokoplab.com/7a for additional information. orf8 has no solved protein structure, with itasser-generated predictions, 39 and 35 sequences have been identified by blast. orf8 has multiple interaction partners involved in the er lumen (figure 2) and is a protein shared with sarsr-batcov, with a high positive selection. 118 see prokoplab.com/orf8 for additional information. the nucleocapsid protein (n) has multiple known protein structures, and 2261 sequences have been identified by blast. n has protein interaction partners involved in mrna binding, the ribonucleoprotein complex, and the mrna surveillance pathway (figure 2) and is critical for the viral replication 119 in multiple processes, including viral rna stability, replication, and packaging. 120 the protein is modified within the cell, including phosphorylation and adp-ribosylation. 121, 122 the protein consists of three domains, with the n-terminal domain involved in rna binding, the internal dynamic multimer structured unit, and the c-terminal domain, an acidic dimerization region. 123−125 the protein can interact with rna by serving as an rna chaperone 126 while also interacting with the m protein and human hnrnpa1 through the internal multimerization domain. 127, 128 see prokoplab.com/n for additional information. orf10 has no solved protein structure, with itasser-generated predictions, 39 and is unique to sars-cov-2.
very little is known of its molecular function or cellular expression. see prokoplab.com/orf10 for additional information.
■ sars-cov-2 risk factors and genetics based on human protein interactions
sars-cov-2 infection exhibits more adverse effects and outcomes in those with other comorbidities, including hypertension, diabetes mellitus, and coronary heart disease. the other risk factors for mortality include older age, elevated d-dimer levels, and a higher sequential organ failure assessment (sofa) score. 129 the mortality associated with sars-cov-2 infection is tied to the patient's progression to multiorgan dysfunction. the elderly are particularly susceptible to severe sars-cov-2 infection, which is most likely due to the immunosuppression and underlying comorbidities associated with advanced age. advanced age has been shown to have a depressive effect on both the innate and the adaptive immune system, known as immunosenescence. this is associated with decreased phagocytosis and the bactericidal effects of neutrophils 130 and is also associated with the downregulation of cytokine signaling 131,132 and innate immune receptors. 133 with sars-cov-2 infection, emphasis is placed on the adaptive immune system to aid in clearing virally infected cells. the elderly population has been shown to have a shift toward inhibitory pathways, particularly in cd8+ t cells and to a lesser degree in cd4+ t cells, 134 which may play a role in allowing disseminated viral spread. this reduction of t cell activity is also joined by the involution of the thymus with age, leading to less naive t cell output, 135 which further depresses immune functions. these cumulative effects on the immune system render the elderly population particularly susceptible to dispersed viral infection at baseline levels, which may ultimately result in viral sepsis.
with the immunosenescence and increased prevalence of comorbidities associated with older age, it makes sense that this population is being hit the hardest by sars-cov-2; however, many younger adults who lack the above immunosenescence have also died from the infection, some of whom displayed no prior medical history. this aspect points to the idea that genetics may play a role in determining the severity of sars-cov-2 infection. the immune response to sars-cov-2 infection in severe cases characteristically induces lymphopenia, particularly of cd8+ t cells, and increases il-2, il-6, il-10, and interferon (ifn)-γ levels. 136 this work is backed by multiple proteomic studies identifying biomarkers of severity that connect to the immune system. 21, 22 the cytokine storm induced by sars-cov-2 is not a new phenomenon and has been demonstrated in the pathogenesis of other novel human coronaviruses, including mers and sars-cov-1. 137 similar consequences in severe coronavirus infections appear to stem from the cytokine storm of proinflammatory chemokines and cytokines, eventually resulting in acute respiratory distress syndrome (ards) and multiorgan dysfunction. 138 a previous study on sepsis and cytokine storm indicates the presence of genetic variants in multiple pathways that have a polygenic contribution. 139 hyperferritinemic sepsis is often identified in patients with severe sars-cov-2 infection. fever developed at day 1, sepsis developed at day 10, admission to the intensive care unit occurred at day 12 (for acute respiratory distress syndrome), and death occurred at day 19. critically ill patients, defined as those with septic shock, multiple organ dysfunction/failure, and/or respiratory failure, accounted for approximately 5% of the study population, yet the study population displayed a case fatality rate of 49.0% in early reports from wuhan, china.
129 hyperferritinemia on day 4 and day 7 predicts mortality long before the development of sepsis and intensive care unit admission. hyperferritinemia has been suggested to have genetic associations through pathogenic variants in genes targetable by il1rap and anti-c5 antibodies. 140 type-1 interferonopathies, like heterozygous null variants in irf7, have been shown to result in severe manifestations of seasonal influenza virus. 141 similar monogenic variants likely exist that lead to the individual risk of severe disease onset from sars-cov-2 in previously healthy patients. much of the genetics around the immune activation leading to a cytokine storm and hyperferritinemic sepsis remains poorly defined and requires future initiatives and cohorts to define these genetic contributions adequately. initial sars-cov-2 infection is commonly associated with fever, cough, malaise, and fatigue. 142 in more severe cases, disseminated intravascular coagulation has been noted with elevated d-dimer levels in the serum of severe covid-19 patients, placing them at thromboembolic risk. 143 recent recommendations have been made to utilize thromboprophylaxis or full-anticoagulation therapy for patients in the thromboembolic risk category. 144 a specific protein−protein interaction was discovered between sars-cov-2's orf8 and the tissue plasminogen activator (tpa) protein of hosts. 28 the tpa, which is encoded by the plat gene, plays a crucial role in thrombolysis by catalyzing the conversion of plasminogen to plasmin, the major enzyme involved in lysis of blood clots. increased activity of tpa can lead to excessive bleeding, whereas decreased activity is associated with thromboembolus formation, 145 increasing the chances of pulmonary embolism, stroke, and myocardial infarction. the extent to which orf8 interacts with tpa is not well understood, but its involvement may render a patient at risk for thromboembolism, as has been seen in the clinical setting.
in a study by ladenvall et al., the eight single nucleotide polymorphisms and the alu insertion polymorphism discovered at the plat locus were found not to be significant contributors to plasma tpa levels. 146 this finding indicates that inherited variants of the plat gene may not be directly involved with the coagulopathy in sars-cov-2 patients; however, the polymorphisms may render the host's tpa protein susceptible to tighter binding by orf8, yielding greater repression during infection and placing the patient at higher risk for thromboembolism. it has also been shown that sepsis involves upregulation of platelet adhesion molecules and increased circulation of platelet−leukocyte aggregates. 147 this may point toward more of an immune-system-catalyzed coagulopathy, resulting in the presentation of strokes, 148 pulmonary embolisms, 149 myocardial infarctions, 150 and microvascular injury, 151 which impact severe sars-cov-2 patients. as coagulopathy has mainly been investigated in both viral and bacterial sepsis, there may be a dual effect of both the immune-mediated response and the protein−protein interaction of orf8 with the host tpa in cases of sars-cov-2 infection. further investigation is warranted to determine the extent of the interaction of orf8 with the host tpa and whether it plays into the pathogenesis of sars-cov-2-related coagulopathy. sars-cov-2 has been associated with cardiac dysfunction, including myocardial infarction and heart failure. the underlying mechanisms for cardiac injury currently being hypothesized are indirect cardiac injury from the cytokine storm and inflammatory response, 152 severe hypoxia as a result of ards, 153 and direct viral invasion of cardiomyocytes. 154 interestingly, ace2, the host receptor for sars-cov-2, is expressed in the heart, 155 indicating direct viral invasion could be a potential cause of myocardial dysfunction.
sars-cov-2's nonstructural protein 9 (nsp9) was found to interact with the e3-ubiquitin ligase mindbomb homologue 1 (mib1). 28 this ubiquitin ligase is a positive regulator of the delta-mediated notch signaling pathway, which is involved in multiple processes during cardiac development. 156 mutations in mib1 have been associated with left ventricular noncompaction (lvnc) characterized by left ventricular trabeculations and reductions in cardiac systolic function. lvnc can range from being asymptomatic to presenting with heart failure, depending on the extent of the mutation's effect on the notch pathway. 157 the prevalence of lvnc in the general population is estimated to be around 1/5000 to 1/30000. patients with the asymptomatic form of lvnc may be at higher risk for exacerbation of cardiac dysfunction following sars-cov-2 due to involvement of this pathway, especially if they are unaware that they have this mutation. this may play a role in the cardiac dysfunction seen in younger sars-cov-2 patients who lack underlying comorbidities. aside from cardiac development, the notch pathway has also been implicated in cardiac repair, which was demonstrated in rat models where the notch 3 and notch 4 pathways were upregulated, thereby reducing postmyocardial infarctions in the setting of heart failure. 158 the mechanism behind the repair process is still under investigation; however, the disruption of the notch signaling pathway by the interaction of nsp9 with mib1 may prove to play a role in the cardiac involvement of sars-cov-2 infection. furthermore, although vertical transmission of sars-cov-2 has not been seen, 159 neonates who have tested positive for the infection may need to have their cardiac function assessed over time due to mib1's role in cardiogenesis and repair. while we present a detailed assessment of two interaction partners' connections to pathology and risk factors, many more likely exist.
we postulate that if the function of any protein diverges from normal biology to contribute to sars-cov-2 biology, it could result in a similar disease state within the cell as a loss of function or deleterious genetic mutation. thus, to understand the sars-cov-2-connected diseases through the human protein interactions, we assessed clinvar, a database of clinically identified variants. a query of the 332 sars-cov-2 human interaction partners through clinvar reveals 8311 protein-based variants within the list (figure 3a). in total, 188 of the queried 332 genes have a clinvar submission. of these clinvar-connected genes, there are a total of 111 that have a clinical annotation of pathogenic (pathogenic or likely pathogenic), with a total of 2386 different variants (figure 3b). the gene with the most pathogenic variants is fbn1, which is known to interact with sars-cov-2 nsp9 and is involved in autosomal dominant familial thoracic aortic aneurysms and aortic dissections and marfan syndrome. a further analysis of clinical disorders connected with those genes with 10 or more pathogenic-associated variants, excluding fbn1 (pkp2, acadm, ppt1, wfs1, col6a1, pcnt, fbn2, bcs1l, ngly1, cyb5r3, acad9, neu1, gnb1, nars2, tcf12, npc2, pigo, cdk5rap2, cenpf, ggcx, fkbp10, tbk1, fbln5, exosc3, por, gpaa1, and rhoa), reveals a high connection to cardiac, neurological, diabetic, and syndromic biology. the sars-cov-2 orf8 has the most genes connected by interactions to pathogenic clinvar returns from queried genes, followed by protein m, nsp13, nsp7, and orf9c (figure 3c). orf8 is connected to 18 genes associated with human genetic diseases (col6a1, ngly1, neu1, npc2, fkbp10, dnmt1, plod2, smoc1, il17ra, adam9, sil1, lox, pofut1, tor1a, hyou1, edem3, emc1, and hs6st2), with significant enrichment of these genes to protein folding (false discovery rate (fdr) = 0.00095) and endoplasmic reticulum lumen (fdr = 2.99 × 10^−5).
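the tallying step described above can be illustrated with a minimal python sketch. it is not the pipeline used for figure 3: the record list, the variant counts, and the function names are hypothetical and chosen only for illustration, and a real analysis would read a clinvar export rather than a hard-coded list. the gene-to-viral-protein pairings shown (fbn1 with nsp9, col6a1 and npc2 with orf8, upf1 with n) are the ones named in the text.

```python
from collections import Counter

# Hypothetical ClinVar-style records: (gene, clinical significance).
# The counts are toy values, not the numbers reported in figure 3.
RECORDS = [
    ("fbn1", "Pathogenic"),
    ("fbn1", "Likely pathogenic"),
    ("fbn1", "Benign"),
    ("col6a1", "Pathogenic"),
    ("npc2", "Uncertain significance"),
    ("upf1", "Likely benign"),
]

# Viral protein -> human interaction partners (small subset named in the text).
INTERACTIONS = {
    "nsp9": ["fbn1"],
    "orf8": ["col6a1", "npc2"],
    "n": ["upf1"],
}

# Annotations grouped as "pathogenic" in the text.
PATHOGENIC = {"pathogenic", "likely pathogenic"}


def pathogenic_counts(records):
    """Count pathogenic or likely pathogenic variants per gene."""
    counts = Counter()
    for gene, significance in records:
        if significance.strip().lower() in PATHOGENIC:
            counts[gene] += 1
    return counts


def pathogenic_partners(interactions, counts):
    """For each viral protein, list partners with at least one pathogenic variant."""
    return {
        bait: sorted(gene for gene in preys if counts.get(gene, 0) > 0)
        for bait, preys in interactions.items()
    }


if __name__ == "__main__":
    counts = pathogenic_counts(RECORDS)
    print(counts.most_common())
    print(pathogenic_partners(INTERACTIONS, counts))
```

ranking viral proteins by the length of each returned list would give the kind of ordering summarized for orf8, m, nsp13, nsp7, and orf9c, under the assumption that each gene is counted once per viral protein.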
while only associated with 3 pathogenic interaction partners, the nucleocapsid (n) protein has interesting disease genetics based on previous observations of a process known as viral-induced genetics. 160 the nucleocapsid (n) protein has the potential to impact and change cellular landscapes through the direct regulation of ribosomal biology. 161 the protein−protein interaction map by gordon et al. 28 supports the hypothesis that sars-cov-2 n proteins interact with multiple mrna-binding proteins and ribonucleoprotein complex proteins ( figure 2 ). in multiple viruses, proteins have been shown to interact with these complexes to regulate a process known as nonsense-mediated decay (nmd). 162 nmd is a cellular process involved in the removal of mrna that does not conform to the bulk of cellular mrna, where proteins accumulate on the transcript and direct the cellular degradation of the mrna. 163 the process is primarily used within cells to degrade mrna molecules with nonsense and frameshift genetic variants and those with improper splicing to prevent the cell from producing truncated proteins that can drive dominant-negative or deleterious gain of function outcomes. 164 viral rna is usually suppressed and degraded within cells through nmd, acting as a cellular immune system process. 165−167 thus, an evolutionary arms race has arisen where a virus can propagate more efficiently if it has a protein that can suppress nmd, keeping its rna levels elevated. 168−171 multiple lines of evidence for both sars-cov-2 and sars-cov suggested that the n protein is used to suppress nmd and evade cellular immune processes. nearly all of the coronaviruses and the larger nidovirales order genomes contain the n protein, which has been shown to interact with multiple ribosomal proteins, including crucial nmd factors. 28 directly inhibited by nmd, with the n protein expression blocking this inhibition. 
162 positive-sense single-stranded rna viruses, including coronaviruses, are likely targets of nmd due to their many overlapping reading frames, retained introns, and long 3′ utrs present within the cytoplasm of human cells. the n protein interacts with three proteins annotated to the mrna surveillance pathway (upf1, pabpc1, and pabpc4) and several proteins involved in mrna binding and the ribonucleoprotein complex that are all known to have cellular interactions (figure 2). while the fine details of the n protein interaction with these factors are poorly understood, the three mrna surveillance genes are well-connected to nmd biology. pabpc1 is known to be critical for nmd, with its removal suppressing nmd. 175, 176 from plants to humans, upf1 is considered a key regulator of nmd through its recruitment of multiple proteins to rna. 177−180 upf1 is known to be activated through phosphorylation at various sites, which allows its protein interactions, 181−183 while the n protein has been shown in multiple viral species to be phosphorylated 184−188 and is likely dynamic in modifications throughout the rna replication and viral lifecycle. 122, 189 these phosphorylation switches and the interaction of n with nmd proteins are potential sites of pharmaceutical or biological regulation that have been undervalued to this point. other notable interactions of the n protein are ras gtpase-activating protein-binding protein homologues (g3bp1/2) and casein kinase 2 alpha (csnk2a2), 28 suggesting the regulation of stress granule formation. nmd is found at the intersection of a variety of cellular pathways beyond mrna surveillance and viral control. notably, it is closely associated with the integrated stress response requiring translation initiation factor eif2s1 for function. 190 cellular stresses such as hypoxia and er stress lead to the inhibition of nmd via phosphorylation of eif2s1.
190 this phosphorylation typically induces stress granule formation as well, which has been cited to aid viral replication in some cases and weaken it in others. 191 when g3bp1 is depleted within cells, there is a significant impairment of the replication for coronaviruses and respiratory syncytial viruses (rsvs). 191, 192 multiple viruses have been shown to regulate phosphorylation of eif2s1 at varying time points of infection with connection into nmd regulation. 193 stress granule formation can enhance nmd inhibition, such as under hypoxic conditions that modulate upf1 and eif2s1. 194, 195 the interaction of the sars-cov-2 n protein with its human interactors promotes the inhibition of nmd and enhancement of both viral replication and truncated host polypeptides that can enhance viral pathogenicity (figure 4). the regulation of nmd by viral proteins is crucial for allowing the viral rna to survive, but nmd processes within cells also regulate multiple endogenous transcripts, several in normal biology, and some based on genetic disease regulation. many genes, including isoforms with early truncation (frameshifts and nonsense codons) and genes involved in amino acid homeostasis, tumorigenesis, and cell cycle control, are activated when nmd is inhibited within a cell. 196−199 in total, this amounts to about 10% of transcripts within a cell that are regulated by nmd processes and could be altered within the cell by sars-cov-2. 196 on top of this, most individuals contain at least one gene where a nonsense or frameshift variant within the genome is being suppressed, being either inherited or somatic. assessments of human genomes reveal that every person has at least one variant regulated by nmd. 200 recently, our group has shown a complex involvement of this regulation with rare human variants, driving adverse outcomes through a process we have termed viral-induced genetics (vig).
160 a patient with an epstein−barr virus (ebv) infection had an adverse immune response of classical hyperferritinemic sepsis like that of severe sars-cov-2 patients. this individual had whole-exome sequencing and multiple blood-based rnaseq experiments performed throughout their clinical course. sequencing revealed a heterozygous splicing variant in the gene rnaseh2b, which is associated with recessive aicardi−goutieres syndrome 201 and has been connected to type-i ifn-mediated autoimmune disease. 202, 203 rnaseq of the patient, when healthy, showed that the splicing variant was present at very low levels, suggesting that the copy was being inhibited through nmd. while the patient was healthy for 16 years of life, the ebv was shown by rnaseq to inhibit nmd, resulting in the presence of the splice variant, which resulted in a dominant-negative rnaseh2b protein that drove cell dysfunction. this suggests that many human variants within genes connected to the immune system and viral response, which are usually suppressed by nmd and result in no cellular dysfunction, are activated by the virus through the inhibition of nmd and can give rise to severe viral outcomes. the situation resembles a computer virus, which often targets the computer's antivirus software. when additional code poses a risk of system failure that the antivirus normally holds in check, shutting down the antivirus exposes those vulnerabilities, which often contribute to the computer's failure. the full extent of these variants and the disease process remains to be elucidated but is a promising avenue for exploration of viral-induced outcomes in sars-cov-2. the sars-cov-2 pandemic represents a unique challenge to scientists. unlike in previous pandemics, our knowledge of the genome and its coded proteins was gleaned within weeks of the outbreak, with thousands of sequences now available within a short window.
this level of information has allowed a pivot to more robust insight into the viral proteome and how it interacts with host proteins. the advancement of protein-based bioinformatics and previous coronavirus research studies have proved useful in defining the function of each protein coded by the virus. here, we show how many of these viral proteins interact with human proteins connected to biological pathways and diseases, including numerous risk factors from immune to cardiovascular systems. most notably, we highlight literature on the role of the viral nucleocapsid (n) protein in nmd regulation, where the inhibition of nmd allows for viral rna stability while simultaneously activating genetics of cellular processes and viral-induced genetics. while we have seen thousands of publications on sars-cov-2 and other coronaviruses, the limited detail of the proteome-wide knowledge base of sars-cov-2-coded proteins limits our ability to realize the potential of preventing and mitigating the current pandemic and future pandemics with a larger therapeutic toolset.
characterization of coding/noncoding variants for shroom3 in patients with ckd molecular modeling in the age of clinical genomics, the enterprise of the next generation sars-cov-2 (covid-19) structural and evolutionary dynamicome: insights into functional evolution and human genomics expression of the sars-cov-2 cell receptor gene ace2 in a wide variety of human tissues ace2 receptor expression and severe acute respiratory syndrome coronavirus infection depend on differentiation of human airway epithelia receptor ace2 and tmprss2 are primarily expressed in bronchial transient secretory cells hca lung biological network. sars-cov-2 entry factors are highly expressed in nasal epithelial cells together with innate immune genes proteomics of sars-cov-2-infected host cells reveals therapy targets single-cell landscape of bronchoalveolar immune cells in patients with covid-19 eils, r.
covid-19 severity correlates with airway epithelium-immune cell interactions identified by single-cell analysis imbalanced host response to sars-cov-2 drives development of covid-19 transcriptomic characteristics of bronchoalveolar lavage fluid and peripheral blood mononuclear cells in covid-19 patients proteomic and metabolomic characterization of covid-19 proteomic fingerprints for potential application to early diagnosis of severe acute respiratory syndrome the sars coronavirus nucleocapsid protein-forms and functions nucleocapsid phosphorylation and rna helicase ddx1 recruitment enables coronavirus transition from discontinuous to continuous transcription virus-host interactomes-antiviral drug discovery host factors in coronavirus replication novel beta-barrel fold in the nuclear magnetic resonance structure of the replicase nonstructural protein 1 from the severe acute respiratory syndrome coronavirus sars coronavirus nsp1 protein induces template-dependent endonucleolytic cleavage of mrnas: viral mrnas are resistant to nsp1-induced rna cleavage suppression of host gene expression by nsp1 proteins of group 2 bat coronaviruses a two-pronged strategy to suppress host protein synthesis by sars coronavirus nsp1 protein severe acute respiratory syndrome coronavirus protein nsp1 is a novel eukaryotic translation inhibitor that represses multiple steps of translation initiation severe acute respiratory syndrome coronavirus evades antiviral signaling: role of nsp1 and rational design of an attenuated strain unique sars-cov protein nsp1: bioinformatics, biochemistry and potential effects on virulence coronavirus nonstructural protein 1: common and distinct functions in the regulation of host and viral gene expression protein structure and sequence reanalysis of 2019-ncov genome refutes snakes as its intermediate host and the unique similarity between its spike protein insertions and hiv-1 the nsp2 proteins of mouse hepatitis virus and sars coronavirus are dispensable 
for viral replication identification of severe acute respiratory syndrome coronavirus replicase products and characterization of papain-like protease activity a noncovalent class of papain-like protease/deubiquitinase inhibitors blocks sars virus replication deubiquitinating activity of the sars-cov papainlike protease the papain-like protease from the severe acute respiratory syndrome coronavirus is a deubiquitinating enzyme severe acute respiratory syndrome coronavirus papain-like protease: structure of a viral deubiquitinating enzyme sars coronavirus papain-like protease inhibits the tlr7 signaling pathway through removing lys63-linked polyubiquitination of traf3 and traf6 coronavirus papain-like proteases negatively regulate antiviral innate immune response through disruption of sting-mediated signaling sars coronavirus papain-like protease inhibits the type i interferon signaling pathway through interaction with the sting-traf3-tbk1 complex the papain-like protease determines a virulence trait that varies among members of the sars-coronavirus species structure and cleavage specificity of the chymotrypsin-like serine protease (3clsp/nsp4) of porcine reproductive and respiratory syndrome virus (prrsv) mutation in murine coronavirus replication protein nsp4 alters assembly of double membrane vesicles localization and membrane topology of coronavirus nonstructural protein 4: involvement of the early secretory pathway in replication replication is supported by a reticulovesicular network of modified endoplasmic reticulum 3c-like proteinase from sars coronavirus catalyzes substrate hydrolysis by a general base mechanism dissection study on the severe acute respiratory syndrome 3c-like protease reveals the critical role of the extra domain in dimerization of the enzyme: defining the extra domain as a new target for design of highly specific protease inhibitors coronavirus main proteinase (3clpro) structure: basis for design of anti-sars drugs biosynthesis, 
purification, and substrate specificity of severe acute respiratory syndrome coronavirus 3c-like proteinase the substrate specificity of sars coronavirus 3c-like proteinase topology and membrane anchoring of the coronavirus replication complex: not all hydrophobic domains of nsp3 and nsp6 are membrane spanning coronavirus nsp6 proteins generate autophagosomes from the endoplasmic reticulum via an omegasome intermediate evolutionary analysis of sars-cov-2: how mutation of non-structural protein 6 could affect viral autophagy coronavirus nsp6 restricts autophagosome expansion structure of the sars-cov nsp12 polymerase bound to nsp7 and nsp8 co-factors insights into sars-cov transcription and replication from the structure of the nsp7-nsp8 hexadecamer nonstructural proteins 7 and 8 of feline coronavirus form a 2:1 heterotrimer that exhibits primer-independent rna polymerase activity the sars-coronavirus nsp7+nsp8 complex is a unique multimeric rna polymerase capable of both de novo initiation and primer extension the nonstructural protein 8 (nsp8) of the sars coronavirus interacts with its orf6 accessory protein severe acute respiratory syndrome coronavirus nsp9 dimerization is essential for efficient viral growth virus-host interactome and proteomic survey of pmbcs from covid-19 patients reveal potential virulence factors influencing sars-cov-2 the nsp9 replicase protein of sars-coronavirus, structure and functional insights the severe acute respiratory syndrome-coronavirus replicative protein nsp9 is a single-stranded rna-binding subunit unique in the rna virus world crystal structure of nonstructural protein 10 from the severe acute respiratory syndrome coronavirus reveals a novel fold with two zinc-binding motifs rna 3′-end mismatch excision by the severe acute respiratory syndrome coronavirus nonstructural protein nsp10/ nsp14 exoribonuclease complex structural basis and functional analysis of the sars coronavirus nsp14-nsp10 complex in vitro reconstitution of 
sars-coronavirus mrna cap methylation crystal structure and functional analysis of the sars-coronavirus rna cap 2′-o-methyltransferase nsp10/nsp16 complex coronavirus nsp10, a critical co-factor for activation of multiple replicative enzymes dodecamer structure of severe acute respiratory syndrome coronavirus nonstructural protein nsp10 coronavirus nsp10/nsp16 methyltransferase can be targeted by nsp10-derived peptide in vitro and in vivo to reduce replication and pathogenesis molecular anatomy of viral rna-directed rna polymerases multiple enzymatic activities associated with severe acute respiratory syndrome coronavirus helicase the severe acute respiratory syndrome (sars) coronavirus ntpase/helicase belongs to a distinct class of 5′ to 3′ viral helicases the human coronavirus 229e superfamily 1 helicase has rna and dna duplex-unwinding activities with 5′-to-3′ polarity structure-function analysis of severe acute respiratory syndrome coronavirus rna cap guanine-n7-methyltransferase functional screen reveals sars coronavirus nonstructural protein nsp14 as a novel cap n7 methyltransferase characterization of the guanine-n7 methyltransferase activity of coronavirus nsp14 on nucleotide gtp the cellular rna helicase ddx1 interacts with coronavirus nonstructural protein 14 and enhances viral replication crystal structure and mechanistic determinants of sars coronavirus nonstructural protein 15 define an endoribonuclease family crystal structure of a monomeric form of severe acute respiratory syndrome coronavirus endonuclease nsp15 suggests a role for hexamerization as an allosteric switch structural and biochemical characterization of endoribonuclease nsp15 encoded by middle east respiratory syndrome coronavirus binding of the methyl donor s-adenosyl-l-methionine to middle east respiratory syndrome coronavirus 2′-o-methyltransferase nsp16 promotes recruitment of the allosteric activator nsp10 coronavirus nonstructural protein 16 is a cap-0 binding enzyme possessing 
(nucleoside-2′o)-methyltransferase activity ribose 2′-o-methylation provides a molecular signature for the distinction of self and non-self mrna dependent on the rna sensor mda5 assembly of coronavirus spike protein into trimers and its role in epitope expression cryo-electron microscopy structure of a coronavirus spike glycoprotein trimer the coronavirus spike protein is a class i virus fusion protein: structural and functional characterization of the fusion core complex coronavirus spike proteins in viral entry and pathogenesis structure of sars coronavirus spike receptor-binding domain complexed with receptor characterization of severe acute respiratory syndrome-associated coronavirus (sars-cov) spike glycoprotein-mediated viral entry mechanisms of coronavirus cell entry mediated by the viral spike protein severe acute respiratory syndrome coronavirus spike protein expressed by attenuated vaccinia virus protectively immunizes mice potent binding of 2019 novel coronavirus spike protein by a sars coronavirus-specific human monoclonal antibody severe acute respiratory syndrome coronavirus orf3a protein interacts with caveolin severe acute respiratory syndrome coronavirus 3a protein is a viral structural protein the severe acute respiratory syndrome coronavirus 3a is a novel structural protein g1 phase cell cycle arrest induced by sars-cov 3a protein via the cyclin d3/prb pathway nucleocapsid-independent assembly of coronavirus-like particles by co-expression of viral envelope protein genes structure and inhibition of the sars coronavirus envelope protein ion channel expression of sars-coronavirus envelope protein in escherichia coli cells alters membrane permeability coronavirus particle assembly: primary structure requirements of the membrane protein severe acute respiratory syndrome coronavirus open reading frame (orf) 3b, orf 6, and nucleocapsid proteins function as interferon antagonists severe acute respiratory syndrome coronavirus orf6 antagonizes stat1 
key: cord-256156-mywhe6w9 authors: clausen, thomas mandel; sandoval, daniel r.; spliid, charlotte b.; pihl, jessica; perrett, hailee r.; painter, chelsea d.; narayanan, anoop; majowicz, sydney a.; kwong, elizabeth m.; mcvicar, rachael n.; thacker, bryan e.; glass, charles a.; yang, zhang; torres, jonathan l.; golden, gregory j.; bartels, phillip l.; porell, ryan; garretson, aaron f.; laubach, logan; feldman, jared; yin, xin; pu, yuan; hauser, blake; caradonna, timothy m.; kellman, benjamin p.; martino, cameron; gordts, philip l.s.m.; chanda, sumit k.; schmidt, aaron g.; godula, kamil; leibel, sandra l.; jose, joyce; corbett, kevin d.; ward, andrew b.; carlin, aaron f.; esko, jeffrey d. 
title: sars-cov-2 infection depends on cellular heparan sulfate and ace2 date: 2020-09-14 journal: cell doi: 10.1016/j.cell.2020.09.033 sha: doc_id: 256156 cord_uid: mywhe6w9 we show that sars-cov-2 spike protein interacts with both cellular heparan sulfate and angiotensin converting enzyme 2 (ace2) through its receptor binding domain (rbd). docking studies suggest a heparin/heparan sulfate-binding site adjacent to the ace2 binding site. both ace2 and heparin can bind independently to spike protein in vitro and a ternary complex can be generated using heparin as a scaffold. electron micrographs of spike protein suggest that heparin enhances the open conformation of the rbd that binds ace2. on cells, spike protein binding depends on both heparan sulfate and ace2. unfractionated heparin, non-anticoagulant heparin, heparin lyases, and lung heparan sulfate potently block spike protein binding and/or infection by pseudotyped virus and authentic sars-cov-2 virus. we suggest a model in which viral attachment and infection involve heparan sulfate-dependent enhancement of binding to ace2. manipulation of heparan sulfate or inhibition of viral adhesion by exogenous heparin presents new therapeutic opportunities. the covid-19 pandemic, caused by the novel respiratory coronavirus 2 (sars-cov-2), has swept across the world, resulting in serious clinical morbidities and mortality, as well as widespread disruption to all aspects of society. as of september 1, 2020, the virus had spread to 215 countries, causing more than 25.4 million confirmed infections and at least 851,000 deaths (world health organization). current isolation/social distancing strategies seek to flatten the infection curve to avoid overwhelming hospitals and to give the medical establishment and pharmaceutical companies time to develop and test antiviral drugs and vaccines. 
currently, only one antiviral agent, remdesivir, has been approved for adult covid-19 patients (beigel et al., 2020) and vaccines may be 12-18 months away. understanding the mechanism of sars-cov-2 infection could reveal other targets to interfere with viral infection and spread. the glycocalyx is a complex mixture of glycans and glycoconjugates surrounding all cells. given its location, viruses and other infectious organisms must pass through the glycocalyx to engage receptors thought to mediate viral entry into host cells. many viral pathogens, including influenza virus, herpes simplex virus, human immunodeficiency virus, and different coronaviruses (sars-cov-1 and mers-cov), have evolved to utilize glycans as attachment factors, which facilitate the initial interaction with host cells (cagno et al., 2019; koehler et al., 2020; stencel-baerenwald et al., 2014). several viruses interact with sialic acids, which are located on the ends of glycans found in glycolipids and glycoproteins. other viruses interact with heparan sulfate (hs) (milewska et al., 2014), a highly negatively charged linear polysaccharide that is attached to a small set of membrane or extracellular matrix proteoglycans (lindahl et al., 2015). in general, glycan-binding domains on membrane proteins of the virion envelope mediate initial attachment of virions to glycan receptors. attachment in this way can lead to the engagement of protein receptors on the host plasma membrane that facilitate membrane fusion or engulfment and internalization of the virion. like other macromolecules, hs can be divided into subunits, which are operationally defined as disaccharides based on the ability of bacterial enzymes or nitrous acid to cleave the chain into disaccharide units (esko and selleck, 2002). 
the basic disaccharide subunit consists of β1-4 linked d-glucuronic acid (glca) and α1-4 linked n-acetyl-d-glucosamine (glcnac), which undergo various modifications by sulfation and epimerization as the copolymer assembles on a limited number of membrane and extracellular matrix proteins (only 17 heparan sulfate proteoglycans are known) (lindahl et al., 2015). the variable length of the modified domains and their pattern of sulfation create unique motifs with which hs-binding proteins interact (xu and esko, 2014). different tissues and cell types vary in the structure of hs, and hs structure can vary between individuals and with age (de agostini et al., 2008; feyzi et al., 1998; han et al., 2020; ledin et al., 2004; vongchan et al., 2005; warda et al., 2006; wei et al., 2011). these differences in hs composition may contribute to the tissue tropism and/or host susceptibility to infection by viruses and other pathogens. in this report, we show that the ectodomain of the sars-cov-2 spike (s) protein interacts with cell surface hs through the receptor binding domain (rbd) in the s1 subunit. binding of heparin to sars-cov-2 s protein shifts the structure to favor the rbd open conformation that binds ace2. spike binding to cells requires engagement of both cellular hs and ace2, suggesting that hs acts as a coreceptor priming the spike for ace2 interaction. therapeutic unfractionated heparin (ufh), non-anticoagulant heparin and hs derived from human lung and other tissues block binding. ufh and heparin lyases also block infection of cells by s protein pseudotyped virus and authentic sars-cov-2. these findings identify cellular hs as a necessary co-factor for sars-cov-2 infection and emphasize the potential for targeting s protein-hs interactions to attenuate virus infection. the trimeric s proteins from sars-cov-1 and sars-cov-2 viruses are thought to engage human ace2 with one or more rbd in an "open" active conformation (fig. 
1a) (kirchdoerfer et al., 2018; walls et al., 2020; wrapp et al., 2020). adjacent to the ace2 binding site and exposed in the rbd lies a group of positively charged amino acid residues that represents a potential site that could interact with heparin or heparan sulfate (fig. 1a and suppl. fig. s1). we calculated an electrostatic potential map of the rbd (from pdb id 6m17 (yan et al., 2020)), which revealed an extended electropositive surface with dimensions and turns/loops consistent with a heparin-binding site (fig. 1b) (xu and esko, 2014). docking studies using a tetrasaccharide (dp4) fragment derived from heparin demonstrated preferred interactions with this electropositive surface, which based on its dimensions could accommodate a chain of up to 20 monosaccharides (fig. 1b and 1c). evaluation of heparin-protein contacts and energy contributions using the molecular operating environment (moe) software suggested strong interactions with the positively charged amino acids r346, r355, k444, r466 and possibly r509 (figs. 1a, 1d, and 1e). other amino acids, notably f347, s349, n354, g447, y449, and y451, could coordinate the oligosaccharide through hydrogen bonds and hydrophobic interactions. notably, the putative binding surface for oligosaccharides is adjacent to, but separate from, the ace2 binding site, suggesting that a single rbd could simultaneously bind both cell surface hs and the ace2 protein receptor. the putative hs binding site is partially obstructed in the "closed" inactive rbd conformation, while fully exposed in the open state (suppl. fig. s1). the amino acid sequence of the rbd of sars-cov-2 s is 73% identical to the rbd of sars-cov-1 s (fig. 1f), and these domains are highly similar in structure with an overall cα r.m.s.d. of 0.929 å (fig. 1g). however, an electrostatic potential map of the sars-cov-1 s rbd does not show an electropositive surface like that observed in sars-cov-2 (fig. 1h). 
most of the positively charged residues comprising this surface are conserved between the two proteins, with the exception of sars-cov-2 k444, which is a threonine in sars-cov-1 (fig. 1f). additionally, the other amino acid residues predicted to coordinate with the oligosaccharide are conserved with the exception of asn354 in sars-cov-2, which is a negatively charged glutamate residue in sars-cov-1. sars-cov-1 has been shown to interact with cellular hs in addition to its entry receptors ace2 and transmembrane protease, serine 2 (tmprss2) (lang et al., 2011). our analysis suggests that the putative heparin-binding site in sars-cov-2 s may mediate an enhanced interaction with heparin or hs compared to sars-cov-1, and that this change evolved through as few as two amino acid substitutions, thr→lys444 and glu→asn354. to test experimentally if the sars-cov-2 s protein interacts with heparin/hs, recombinant ectodomain and rbd proteins were prepared and characterized. initial studies encountered difficulty in stabilizing the s ectodomain protein, a problem that was resolved by raising the concentration of nacl to 0.3 m in hepes buffer. under these conditions, the protein could be stored at room temperature, 4 °c or -80 °c for at least two weeks. sds-page showed that each protein was ~98% pure. recombinant s ectodomain and rbd proteins were applied to a column of heparin-sepharose. elution with a gradient of sodium chloride showed that the rbd eluted at ~0.3 m nacl, with a shoulder that eluted with higher salt (fig. 2a). recombinant s ectodomain also bound to heparin-sepharose, but it eluted across a broader concentration of nacl. the elution profiles suggest that the preparations contained a population of molecules that bind to heparin, but that some heterogeneity in affinity for heparin occurs, which may reflect differences in glycosylation, oligomerization or the number of binding sites in the open conformation. 
the rbd protein from sars-cov-2 also bound in a saturable manner to heparin-bsa immobilized on a plate (fig. 2b). the rbd from sars-cov-1 showed significantly reduced binding to heparin-bsa and a higher kd value (640 nm [95% c.i. 282-1852 nm] for sars-cov-1 rbd vs. 150 nm [95% c.i. 123-173 nm] for sars-cov-2 rbd), in accordance with the difference in electropositive potential in the proposed hs binding regions (fig. 1h). a monomeric form of sars-cov-2 s ectodomain protein also bound in a saturable manner to heparin immobilized on a plate (suppl. fig. s3a). the trimeric protein bound to heparin-bsa with an apparent kd value of 3.8 nm [95% c.i. 3.1-4.6 nm] (fig. 2c). binding of recombinant s ectodomain, mutated either to lock the rbds into a closed conformation (mut2) or to favor an open conformation (mut7), showed that the heparin binding site in the rbd is accessible in both conformations (fig. 2d). however, the kd value for mut7 is lower (4.6 nm [95% c.i. 3.8-5.5 nm] vs. 9.9 nm [95% c.i. 8.7-11.3 nm] for mut2), which is in line with the partial obstruction of the site in the closed conformation (suppl. fig. s1). as expected, only trimer with an open rbd conformation bound to ace2 (fig. 2e). in contrast to spike protein, ace2 did not bind to heparin-bsa (fig. 2c). ace2 also had no effect on binding of s protein to heparin-bsa at any of the concentrations tested (fig. 2c, inset). biotinylated ace2 bound to immobilized s protein (suppl. fig. s3b) and a ternary complex of heparin, ace2 and s protein could be demonstrated by titration of s protein bound to immobilized heparin-bsa with ace2 (fig. 2f). binding of ace2 under these conditions increased in proportion to the amount of s protein bound to the heparin-bsa. 
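the kd values quoted above can be put side by side with a short calculation. the sketch below is illustrative only: it assumes a simple one-site saturable binding model (fraction bound = [protein] / (kd + [protein])), which the text does not state was the model used for fitting, and the function and dictionary names are hypothetical.

```python
from typing import Dict

# apparent kd values (nM) for binding to immobilized heparin-bsa, as quoted in the text
REPORTED_KD_NM: Dict[str, float] = {
    "sars-cov-2 rbd": 150.0,
    "sars-cov-1 rbd": 640.0,
    "s trimer": 3.8,
    "s trimer mut7 (open)": 4.6,
    "s trimer mut2 (closed)": 9.9,
}


def fraction_bound(protein_nm: float, kd_nm: float) -> float:
    """Fractional occupancy under an ASSUMED one-site saturable binding model."""
    if protein_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and kd must be > 0")
    return protein_nm / (kd_nm + protein_nm)


def kd_ratio(weaker: str, stronger: str) -> float:
    """Fold difference in apparent kd between two of the reported proteins."""
    return REPORTED_KD_NM[weaker] / REPORTED_KD_NM[stronger]


if __name__ == "__main__":
    # compare predicted occupancy of each protein at one arbitrary concentration
    for name, kd in REPORTED_KD_NM.items():
        print(f"{name}: {fraction_bound(150.0, kd):.2f} of sites occupied at 150 nM")
    print(f"sars-cov-1 / sars-cov-2 rbd kd ratio: "
          f"{kd_ratio('sars-cov-1 rbd', 'sars-cov-2 rbd'):.1f}")
```

under this assumed model, a protein is half-bound when its concentration equals its kd, so the roughly 4-fold higher kd of the sars-cov-1 rbd corresponds to a binding curve shifted toward higher protein concentrations.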
collectively, these findings show that (i) spike protein can engage both heparin and ace2 simultaneously and (ii) the heparin binding site is somewhat occluded in the closed conformation, but it can still bind heparin, albeit with reduced affinity. the simultaneous binding of ace2 to spike protein and heparin suggested the possibility that heparin binding might affect the conformation of the rbd, possibly increasing the open conformation that can bind ace2. to explore this possibility, spike protein was mixed with ace2 (6-fold molar ratio) with or without dp20 oligosaccharides derived from heparin (9-fold molar ratio). the samples were then stained and analyzed by transmission electron microscopy, and the images were deconvoluted and sorted into 3d reconstructions to determine the number of trimers with 0, 1, 2, or 3 bound ace2 (fig. 2g-h and suppl. fig. s3c-d). the different populations were counted and the percentage of particles belonging to each 3d class was calculated. two time points were evaluated after mixing ace2 and trimeric s: at 15 min, 29,600 and 31,300 particles were analyzed in the absence or presence of dp20 oligosaccharides, respectively; at 60 min, 17,000 and 21,000 particles were analyzed in the absence or presence of dp20 oligosaccharides, respectively. at both time points, the presence of dp20 increased the total amount of ace2 protein bound to spike. after 15 minutes in the absence of dp20, very few of the trimers had conformations with 1 or 2 bound ace2 (5% each), whereas the inclusion of dp20 oligosaccharides greatly increased the proportion of trimers bearing one (37%) or two (21%) ace2, with a proportional drop in the unbound conformers from 90% in the absence of heparin to 42% in its presence (fig. 2g). extending the incubation to 60 minutes resulted in a mixture of trimers containing 1 (45%), 2 (11%) and 3 ace2 (13%) in the absence of heparin. 
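the 15-minute class percentages quoted above can be condensed into a single occupancy figure. the sketch below is a minimal illustration, not part of the reported analysis: it takes only the percentages stated in the text, and the helper names are hypothetical.

```python
from typing import Mapping

# percentage of spike trimers carrying 0, 1 or 2 ace2 at 15 min, as quoted in the text
NO_DP20_15MIN = {0: 90.0, 1: 5.0, 2: 5.0}
WITH_DP20_15MIN = {0: 42.0, 1: 37.0, 2: 21.0}


def mean_ace2_per_trimer(class_percent: Mapping[int, float]) -> float:
    """Population-weighted mean number of ace2 molecules bound per trimer."""
    total = sum(class_percent.values())
    if total <= 0:
        raise ValueError("class percentages must sum to a positive value")
    return sum(n * pct for n, pct in class_percent.items()) / total


def percent_with_ace2(class_percent: Mapping[int, float]) -> float:
    """Percentage of trimers carrying at least one ace2."""
    total = sum(class_percent.values())
    if total <= 0:
        raise ValueError("class percentages must sum to a positive value")
    return 100.0 * sum(pct for n, pct in class_percent.items() if n > 0) / total


if __name__ == "__main__":
    for label, classes in (("no dp20", NO_DP20_15MIN), ("with dp20", WITH_DP20_15MIN)):
        print(f"{label}: {mean_ace2_per_trimer(classes):.2f} ace2 per trimer, "
              f"{percent_with_ace2(classes):.0f}% of trimers bound")
```

expressed this way, the 15-minute data correspond to roughly 0.15 ace2 per trimer without dp20 and 0.79 with it.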
inclusion of dp20 further increased the proportion of bound spike trimers bearing 2 (19%) and 3 (27%) ace2 (fig. 2h). the imaging studies suggest that, under these experimental conditions, heparin may stabilize the ace2 interaction, increasing the proportion of spike bound to ace2 as well as the occupancy of individual spikes. the sars-cov-2 spike protein depends on cellular heparan sulfate for cell binding. to extend these studies to hs on the surface of cells, s ectodomain protein was added to human h1299 cells, an adenocarcinoma cell line derived from type 2 alveolar cells (fig. 3a). spike ectodomains bound to h1299 cells, with half-maximal binding achieved at ~75 nm. treatment of the cells with a mixture of heparin lyases (hsase), which degrades cell surface hs, dramatically reduced binding (fig. 3a). the s ectodomain also bound to human a549 cells, another type 2 alveolar adenocarcinoma line, as well as human hepatoma hep3b cells (fig. 3b). removal of hs by enzymatic treatment dramatically reduced binding in both of these cell lines as well (fig. 3b). recombinant rbd protein also bound to all three cell lines in an hs-dependent manner (fig. 3c). a melanoma cell line, a375, was tested independently and also showed hs-dependent binding (fig. 3d). the extent of binding across the four cell lines varied ~4-fold. this variation was not due to differences in hs expression, as illustrated by staining of cell surface hs with mab 10e4, which recognizes a common epitope in hs. we also measured binding of the s ectodomain and rbd proteins to a library of mutant hep3b cells carrying crispr/cas9-induced mutations in biosynthetic enzymes essential for synthesizing hs (anower et al., 2019). inactivation of ext1, a subunit of the copolymerase required for synthesis of the backbone of hs, abolished binding to a greater extent than enzymatic removal of the chains with hsases (fig. 3f and suppl. fig. 
s4), suggesting that the hsase treatment may underestimate the dependence on hs. targeting ndst1, a glcnac n-deacetylase-n-sulfotransferase that n-deacetylates and n-sulfates n-acetylglucosamine residues, and hs6st1 and hs6st2, which introduce sulfate groups in the c6 position of glucosamine residues, significantly reduced binding (fig. 3f and suppl. fig. s4). although experiments with other sulfotransferases have not yet been done, the data suggest that the pattern of sulfation of hs affects binding to s and rbd. to further examine how variation in hs structure affects binding, we isolated hs from human kidney, liver, lung and tonsil. the samples were depolymerized into disaccharides by treatment with hsases, and the disaccharides were then analyzed by lc-ms (experimental methods). the disaccharide analysis showed that lung hs has a larger proportion of n-deacetylated and n-sulfated glucosamine residues (grey bars) and more 2-o-sulfated uronic acids (green bars) than hs preparations from the other tissues (fig. 4a). the different hs preparations also varied in their ability to block binding of rbd to h1299 cells (fig. 4b). interestingly, hs isolated from lung was more potent than kidney and liver hs, consistent with the greater degree of sulfation of hs from this organ (suppl. table 1). hs from tonsil was as potent as hs from lung, but the overall extent of sulfation was not as great, supporting the notion that the patterning of the sulfated domains in the chains may affect binding. unfractionated heparin is derived from porcine mucosa and possesses potent anticoagulant activity due to the presence of a pentasaccharide sequence containing a crucial 3-o-sulfated n-sulfoglucosamine unit, which confers high affinity binding to antithrombin. 
heparin is also very highly sulfated compared to hs, with an average negative charge of -3.4 per disaccharide (the overall negative charge density of typical hs is -1.8 to -2.2 per disaccharide). mst cells, which were derived from a murine mastocytoma, make heparin-like hs that lacks the key 3-o-sulfate group and anticoagulant activity (gasimli et al., 2014; montgomery et al., 1992). the anticoagulant properties of heparin can also be removed by periodate oxidation, which oxidizes the vicinal hydroxyl groups in the uronic acids, resulting in what is called "split-glycol" heparin (casu et al., 2004). all of these agents significantly inhibited binding of the s protein to h1299 and a549 cells (fig. 4c and 4d), yielding ic50 values in the range of 0.01-0.12 µg/ml (suppl. table 1). interestingly, the lack of 3-o-sulfation, crucial for the anticoagulant activity of heparin, had little effect on its inhibition of s binding. in contrast, cho cell hs (containing 0.8 sulfates per disaccharide) only weakly inhibited binding (ic50 values of 18 and 139 µg/ml for a549 and h1299, respectively) (suppl. table 1). these data suggest that inhibition by heparinoids is most likely charge dependent and independent of anticoagulant activity per se. the experiments shown in fig. 2g-h indicate that binding of heparin to spike protein can increase binding to ace2. to explore if hs, ace2 and spike interact at the cell surface, we investigated the impact of ace2 expression on s protein cell binding. initial attempts were made to measure ace2 levels by western blotting or flow cytometry with different mabs and polyclonal antibodies, but a reliable signal was not obtained in any of the cell lines tested (a375, a549, h1299, and hep3b). nevertheless, expression of ace2 mrna was observed by rt-qpcr (suppl. fig. 5a). transfection of a375 cells with ace2 cdna resulted in robust expression of ace2 (fig. 5a) and increased s ectodomain protein binding ~4-fold (fig. 5b). 
interestingly, the enhanced binding was hs-dependent, as illustrated by the loss of binding of s protein after hsase treatment (fig. 5b). crispr/cas9-mediated deletion of the b4galt7 gene, which is required for glycosaminoglycan assembly (suppl. fig. s5b), also reduced binding of spike protein (fig. 5b) despite the overexpression of ace2 (fig. 5a). to explore the impact of diminished ace2 expression, we examined spike protein binding to a549 cells and to two crispr/cas9 gene-targeted clones, c3 and c6, bearing biallelic mutations in ace2 (suppl. fig. s5c). binding of s ectodomain protein was greatly reduced in the ace2-/- clones and the residual binding was sensitive to hsases (fig. 5c). these findings show that binding of spike protein on cells requires both hs and ace2, consistent with the formation of a ternary complex (figs. 2f-h). assays using purified components provide biochemical insights into binding, but they do not recapitulate the multivalent presentation of the s protein as it occurs on the virion membrane. thus, to extend these studies, pseudotyped vesicular stomatitis virus (vsv) was engineered to express the full-length sars-cov-2 s protein and gfp or luciferase to monitor infection. vero e6 cells are commonly used in the study of sars-cov-2 infection, due to their high susceptibility to infection. spike protein binding to vero cells also depends on cellular hs, as binding was sensitive to hsases, heparin and split-glycol heparin (fig. 6a). interestingly, hsase treatment reduced binding to a lesser extent than the level of reduction observed in a549, heparin very potently reduced infection more than ~4-fold at 0.5 µg/ml and higher concentrations (fig. 6g). in contrast, studies of sars-cov-1 s protein pseudotype virus showed that hsase treatment actually increased sars-cov-1 infection by more than 2-fold, suggesting that hs might interfere with binding of sars-cov-1 in this cell line (fig. 6h). 
infection of h1299 and a549 cells by sars-cov-2 s pseudotype virus was too low to obtain accurate measurements, but infection of hep3b cells could be readily measured (fig. 6i). hsase and mutations in ext1 and ndst1 dramatically reduced infection 6- to 7-fold. inactivation of the 6-o-sulfotransferases had only a mild effect, unlike its strong effect on s protein binding (fig. 3f), possibly due to the high valency conferred by multiple copies of s protein on the pseudovirus envelope. hep3b cells were not susceptible to infection by sars-cov-1 s protein pseudotyped virus, but were infected by mers-cov s protein pseudotyped virus, and this infection was independent of hs (suppl. fig. s6). studies of pseudovirus were then extended to authentic sars-cov-2 virus infection using strain usa-wa1/2020. infection of vero e6 cells was monitored by double staining of the cells with antibodies against the sars-cov-2 nucleocapsid (n) and s proteins ( heparin inhibition (maroon and blue symbols). to rule out that the treatments caused a decrease in ace2 expression or a reduction in cell viability, vero cells were treated with heparin lyases and 100 µg/ml ufh, and ace2 expression was measured by western blotting and cell viability by celltiter-blue® (suppl. fig. s7a-b). no effect on ace2 expression or cell viability was observed. these findings further emphasize the potential for using unfractionated heparin or other non-anticoagulant heparinoids to prevent viral attachment. these findings were then extended to hep3b cells and mutants altered in hs biosynthesis using a viral plaque assay. virus was added to wildtype, ndst1-/- and hs6st1/2-/- cells for 2 hr, the virus was removed, and after 2 days of incubation a serial dilution of the conditioned culture medium was added to monolayers of vero e6 cells. the number of plaques was then quantitated by staining and visualization. 
as a control, culture medium from infected vero e6 cells was tested, which showed robust viral titers. hep3b cells also supported viral replication, but to a lesser extent than vero cells. inactivation of ndst1 in hep3b cells abolished virus production, whereas inactivation of hs6st1/2 reduced infection more mildly, ~3-fold (fig. 7d). hsase and ufh reduced infection more than 5-fold, but had no effect on cell viability (suppl. in this report, we provide compelling evidence that hs is a necessary host attachment factor that promotes sars-cov-2 infection of various target cells. the receptor binding domain of the sars-cov-2 s protein binds to heparin/hs, most likely through a docking site composed of positively charged amino acid residues aligned in a subdomain of the rbd that is separate from the site involved in ace2 binding (fig. 1). competition studies, enzymatic removal of hs, and genetic studies confirm that the s protein, whether presented as a recombinant protein (figs. 2-5), in a pseudovirus (fig. 6), or in authentic sars-cov-2 virions (fig. 7), binds to cell surface hs in a cooperative manner with ace2 receptors. mechanistically, binding of heparin/hs to spike trimers enhances binding to ace2, likely increasing multivalent interactions with the target cell. these data provide crucial insights into the pathogenic mechanism of sars-cov-2 infection and suggest hs-spike protein complexes as a novel therapeutic target to prevent infection. the glycocalyx is the first point of contact for all pathogens that infect animal cells, and thus it is not surprising that many viruses exploit glycans, such as hs, as attachment factors. for example, the initial interaction of herpes simplex virus with cells involves binding to hs chains on one or more hs proteoglycans (shieh et al., 1992; wudunn and spear, 1989) through interactions with the viral glycoproteins gb and gc. 
viral entry requires the interaction of a specific structure in hs with a third viral glycoprotein, gd (shukla et al., 1999), working in concert with membrane proteins related to tnf/ngf receptors (montgomery et al., 1996). similarly, the human immunodeficiency virus binds to hs by way of the v3 loop of the viral glycoprotein gp120 (roderiquez et al., 1995), but infection requires the chemokine receptor ccr5 (deng et al., 1996; dragic et al., 1996). other coronaviruses also utilize hs; for example, nl63 (hcov-nl63) binds hs via the viral s protein in addition to ace2 (lang et al., 2011; milewska et al., 2018; milewska et al., 2014; naskalska et al., 2019). in these examples, initial tethering of virions to the host cell plasma membrane appears to be mediated by hs, but infection requires transfer to a proteinaceous receptor. the data presented here show that sars-cov-2 requires hs in addition to ace2. we imagine a model in which cell surface hs acts as a "collector" of the virus and a mediator of the rbd-ace2 interaction, making viral infection more efficient. hs varies in structure across cell types and tissues, as well as with gender and age (de agostini et al., 2008; feyzi et al., 1998; ledin et al., 2004; vongchan et al., 2005; warda et al., 2006; wei et al., 2011). variation in competition by hs from different tissues supports this conclusion and raises the possibility that hs contributes to the tissue tropism and the susceptibility of different patient populations, in addition to levels of expression of ace2. coronaviruses can utilize a diverse set of glycoconjugates as attachment factors. 
human coronavirus oc43 (hcov-oc43) and bovine coronavirus (bcov) bind to 5-n-acetyl-9-o-acetylneuraminic acid (hulswit et al., 2019; tortorici et al., 2019), middle east respiratory syndrome virus (mers-cov) binds 5-n-acetyl-neuraminic acid (park et al., 2019), and guinea fowl coronavirus binds biantennary di-n-acetyllactosamine or sialic acid capped glycans (bouwman et al., 2019). whether sars-cov-2 s protein binds to sialic acid remains unclear. mapping the binding site for sialic acids in other coronavirus s proteins has proved elusive, but modeling studies suggest a location distinct from the hs binding site shown in fig. 1 (park et al., 2019; tortorici et al., 2019). the s protein in murine coronavirus contains both a hemagglutinin domain for binding and an esterase domain that cleaves sialic acids, which aids in the liberation of bound virions (rinninger et al., 2006; smits et al., 2005). whether sars-cov-2 s protein, another viral envelope protein, or a host protein contributes hs-degrading activity to aid in the release of newly made virions is unknown. the repertoire of proteins in organisms that bind to hs makes up the so-called "hs interactome" and consists of a variety of different hs-binding proteins (hsbps) (xu and esko, 2014). unlike lectins, which have a common fold that helps define the glycan binding site, hsbps do not exhibit a conserved motif that allows accurate predictions of binding sites based on primary sequence. instead, the capacity to bind heparin appears to have emerged through convergent evolution by juxtaposition of several positively charged amino acid residues arranged to accommodate the negatively charged sulfate and carboxyl groups present in the polysaccharide, while hydrophobic and h-bonding interactions stabilize the association. the rbd domains from the sars-cov-1 and sars-cov-2 s proteins are highly similar in structure (fig.
1g), but the electropositive surface in sars-cov-1 s rbd is not as pronounced as in sars-cov-2 s rbd (fig. 1h). in accordance with this observation, recombinant rbd protein from sars-cov-2 showed significantly higher binding to heparin-bsa, compared to rbd from sars-cov-1 (fig. 2b). a priori we predicted that the evolution of the hs binding site in the sars-cov-2 s protein might have occurred by the addition of arginine and lysine residues to its ancestor, sars-cov-1. instead, we observed that four of the six predicted positively charged residues that make up the heparin-binding site are present in sars-cov-1, as well as most of the other amino acid residues predicted to interact with heparin (fig. 1). sars-cov-1 has been shown to interact with cellular hs in addition to its entry receptors ace2 and transmembrane protease, serine 2 (tmprss2) (lang et al., 2011). our analysis suggests that the putative heparin-binding site in sars-cov-2 s may mediate an enhanced interaction with heparin compared to sars-cov-1, and that this change evolved through as few as two amino acid substitutions, thr444lys and glu354asn. further studies are underway to define the amino acid residues in the combining site for heparin/hs to test this hypothesis. the ability of heparin and hs to compete for binding of the sars-cov-2 s protein to cell surface hs and the inhibitory activity of heparin towards infection of pseudovirus and authentic sars-cov-2 illustrate the therapeutic potential of agents that target the virus-hs interaction to control infection and transmission of sars-cov-2. there is precedent for targeting protein-glycan interactions with therapeutic agents. for example, tamiflu targets influenza neuraminidase, thus reducing viral transmission, and sialylated human milk oligosaccharides can block sialic acid-dependent rotavirus attachment and subsequent infection in infants (hester et al., 2013; von itzstein, 2007).
covid-19 patients typically suffer from thrombotic complications ranging from vascular micro-thromboses to venous thromboembolic disease and stroke, and often receive unfractionated heparin or low molecular weight heparin (thachil, 2020). the findings presented here and elsewhere suggest that both of these agents can block viral infection (courtney mycroft-west, 2020; kim et al., 2020; liu et al., 2020; mycroft-west et al., 2020; tandon et al., 2020; wu et al., 2020). effective anticoagulation is achieved with plasma levels of heparin of 0.3-0.7 units/ml. this concentration is equivalent to 1.6-4 µg/ml heparin (assuming that the activity of ufh is 180 units/mg). although this is sufficient to block spike protein binding to cells (fig. 4), it would not be expected to prevent viral infection, but it should attenuate infection depending on the viral load (fig. 7). the anticoagulant activity of heparin, which is typically absent in hs, is not critical for its antiviral activity, based on the observation that mst derived heparin and split-glycol heparin are nearly as potent as therapeutic heparin (figs. 4 and 6). additional studies are needed to address the potential overlap in the dose response profiles for heparin as an anticoagulant and antiviral agent and the utility of non-anticoagulant heparins. antibodies directed to heparan sulfate or the binding site in the rbd might also prove useful for attenuating infection. in conclusion, this work revealed hs as a novel attachment factor for sars-cov-2 and suggests the possibility of using hs mimetics, hs degrading lyases, and metabolic inhibitors of hs biosynthesis for the development of therapy to combat covid-19.

further information and requests for resources should be directed to the lead contact, thomas mandel clausen (tmandelclausen@health.ucsd.edu).

all sars-cov-2 expression plasmids produced in this study can be made available upon request to the lead contact.
this study did not generate any unique datasets or code.

cell lines nci-h1299, a549, hep3b, a375 and vero e6 cells were from the american type culture collection (atcc). nci-h1299 and a549 cells were grown in rpmi medium, whereas the other lines were grown in dmem. hep3b cells carrying mutations in hs biosynthetic enzymes were previously derived from the parent hep3b line as described (anower et al., 2019). all cell media were supplemented with 10% (v/v) fbs, 100 iu/ml of penicillin and 100 µg/ml of streptomycin sulfate, and the cells were grown under an atmosphere of 5% co2 and 95% air. cells were passaged at ~80% confluence and seeded as explained for the individual assays. protein was produced in expicho or hek293-6e cells that were acquired from thermo fisher and grown according to the manufacturer's specifications. human bronchial epithelial cells were acquired from lonza. they were cultured in pneumacult-ex plus medium or pneumacult-ali medium according to the manufacturer's instructions (stemcell technologies). specific details on the culture methods are described in the methods section. the collection of human tissue in this study abided by the helsinki principles and the

an electrostatic potential map of the sars-cov-2 spike protein rbd domain was generated from a crystal structure (pdb:6m17) and visualized using pymol (version 2.0.6 by schrödinger). a dp4 fully sulfated heparin fragment was docked to the sars-cov-2 spike protein rbd using the cluspro protein docking server (https://cluspro.org/login.php) (kozakov et al., 2013; kozakov et al., 2017; vajda et al., 2017). heparin-protein contacts and energy contributions were evaluated using the molecular operating environment (moe) software (chemical computing group).
recombinant sars-cov-2 spike protein, encoding residues 1-1138 (wuhan-hu-1; genbank: mn908947.3) with proline substitutions at amino acid positions 986 and 987, a "gsas" substitution at the furin cleavage site (amino acids 682-685), a twinstreptag and a his 8x tag, was produced in expicho cells by transfection of 6 x 10^6 cells/ml at 37 °c with 0.8 µg/ml of plasmid dna using the expicho expression system transfection kit in expicho expression medium (thermofisher). one day later the cells were refed, then incubated at 32 °c for 11 days. the conditioned medium was mixed with complete edta-free protease inhibitor (roche).

samples of the recombinant trimeric spike protein ectodomain were diluted to 0.03 mg/ml in 1x tbs ph 7.4. carbon coated copper mesh grids were glow discharged and 3 µl of the diluted sample was placed on a grid for 30 sec then blotted off. uniform stain was achieved by depositing 3 µl of uranyl formate (2%) on the grid for 55 sec and then blotting it off. grids were transferred to a thermo fisher morgagni operating at 80 kv. images at 56,000x magnification were acquired using a megaview 2k camera via the radius software. a dataset of 138 micrographs at 52,000x magnification and -1.5 µm defocus was collected on a fei tecnai spirit (120 kev) with a fei eagle 4k by 4k ccd camera. the pixel size was 2.06 å per pixel and the dose was 25 e−/å². the leginon (suloway et al., 2005) software was used to automate the data collection and the raw micrographs were stored in the appion (lander et al., 2009) database. particles on the micrographs were picked using dogpicker, stacked with a box size of 200 pixels, and 2d classified with relion 3.0 (scheres, 2012).

secreted human ace2 was transiently produced in suspension hek293-6e cells. a plasmid encoding residues 1−615 of ace2 with a c-terminal hrv-3c protease cleavage site, a twinstreptag and a his 8x tag was a gift from jason s. mclellan, university of texas at austin.
briefly, 100 ml of hek293-6e cells were seeded at a cell density of 0.5 × 10^6 cells/ml 24 hr before transfection with polyethyleneimine (pei). for transfection, 100 µg of the ace2 plasmid and 300 µg of pei (1:3 ratio) were incubated for 15 min at room temperature. transfected cells were cultured for 48 hr and fed with 100 ml fresh media for an additional 48 hr before harvest. secreted ace2 was purified from culture medium by ni-nta affinity chromatography (qiagen). filtered media was mixed 3:1 (v/v) with 4x binding buffer (100 mm tris-hcl, ph 8.0, 1.2 m nacl) and loaded onto a self-packed column, pre-equilibrated with washing buffer (25 mm tris-hcl, ph 8, 0.3 m nacl, 20 mm imidazole). bound protein was washed with buffer and eluted with 0.2 m imidazole in washing buffer. the protein containing fractions were identified by sds-page.

sars-cov-2 spike protein in dpbs was applied to a 1-ml hitrap heparin-sepharose column (ge healthcare). the column was washed with 5 ml of dpbs and bound protein was eluted with a gradient of nacl from 150 mm to 1 m in dpbs.

for binding studies, recombinant spike protein and ace2 were conjugated with ez-link™ sulfo-nhs-biotin (1:3 molar ratio; thermo fisher) in dulbecco's pbs at room temperature for 30 min. glycine (0.1 m) was added to quench the reaction and the buffer was exchanged for pbs using a zeba spin column (thermo fisher).

heparin ( and incubated with s protein (100 nm). ace2 binding was measured to bound spike protein as described above.

mixtures of stabilized (mut7) spike protein, 6x molar excess soluble ace2 ectodomain, with or without 9x molar excess of an icosasaccharide (dp20) fragment derived from heparin were incubated at 4 °c for 15 min or 1 hr. samples were diluted to 0.02 mg/ml with respect to spike protein in 1x pbs ph 7.4. carbon coated copper mesh grids were glow discharged at 20 ma for 30 s and 3 µl sample was applied for 20 s and blotted off.
grids were washed five times in 10 µl 1x tbs ph 7.4 for 15 sec then stained and blotted twice with 3 µl 2% uranyl formate for 15 sec. grids were imaged with an fei tecnai spirit (120 kev) or fei tecnai f20 (200 kev) with an fei eagle ccd (4k) camera. data were collected on the fei tecnai f20 at 62,000x magnification, -1.5 µm defocus with a pixel size of 1.77 å per pixel. these datasets employed a box size of 256 and comprised 167 to 331 micrographs. data were collected on the fei tecnai spirit as described above. data collection on both microscopes was automated through leginon (suloway et al., 2005), the data were stored in the appion (lander et al., 2009) database, and particles were picked with dog picker. particles were 2d classified with relion 3.0 (scheres, 2012). trimeric 2d classes were selected for iterative 3d classification with relion 3.0. classifications were performed until 3d classes demonstrated ace2 occupancy throughout the relevant threshold-level of the spike protein as visualized using chimerax (goddard et al., 2018). particle counts of final 3d classes were obtained with relion 3.0 (scheres, 2012) and the percentages of particles bound to 0, 1, 2, or 3 ace2 were calculated and visualized in graphpad prism 8.

cells at 50-80% confluence were lifted with pbs containing 10 mm edta (gibco) and

fresh human tissue was washed in pbs, frozen, and lyophilized. the dried tissue was crushed into a fine powder, weighed, resuspended in pbs containing 1 mg/ml pronase with 90% ethanol (esko, 1993). for hs quantification and disaccharide analysis, purified hs was digested with a mixture of heparin lyases i-iii (2 mu each) for 2 hr at 37 °c in 40 mm ammonium acetate buffer containing

the ace2 expression plasmid (addgene, plasmid #1786) (li et al., 2003)

qpcr

mrna was extracted from the cells using trizol (invitrogen) and chloroform and purified using the rneasy kit (qiagen).
cdna was synthesized from the mrna using random primers and the superscript iii first-strand synthesis system (invitrogen). sybr green master mix (applied biosystems) was used for qpcr following the manufacturer's instructions, and the expression of tbp was used to normalize the expression of ace2 between the samples. the qpcr primers used were as follows: ace2 (human) forward: 5'-cgaagccgaagacctgttcta-3' and reverse: 5'-gggcaagtgtggactgttcc-3'; and tbp (human) forward: 5'-aacttcgcttccgctggccc-3' and reverse: 5'-gaggggaggccaagccctga-3'.

to generate the cas9 lentiviral expression plasmid, 2.5 x 10^6 hek293t cells were seeded to a 10-cm diameter plate in dmem supplemented with 10% fbs. the following day, the cells were co-transfected with the pspax2 packaging plasmid (addgene, plasmid #12260), pmd2.g envelope plasmid (addgene, plasmid #12259), and lenti-cas9 plasmid (addgene, plasmid #52962) (sanjana et al., 2014) in dmem supplemented with fugene6 (30 µl in 600 µl dmem). media containing the lentivirus was collected and used to infect a549 wt and a375 wt cells, which were subsequently cultured with 5 µg/ml and 2 µg/ml blasticidin, respectively, to select for stably transduced cells. a single guide rna (sgrna) targeting ace2 (5'-tggatacatttgggcaagtg-3') and one targeting b4galt7 (5'-tgacctgctccctctcaacg-3') were cloned into the lentiguide-puro plasmid (addgene plasmid #52963) following a published procedure (sanjana et al., 2014). the lentiviral sgrna construct was generated in hek293t cells, using the same protocol as for the cas9 expression plasmid, and used to infect a549-cas9 and a375-cas9 cells to generate crispr knockout mutant cell lines. after infection, the cells were cultured with 2 µg/ml puromycin to select for cells with stably integrated lentivirus. after 7 d, the cells were serially diluted into 96-well plates.
single colonies were expanded and dna was extracted using the dneasy blood and tissue dna isolation kit (qiagen). proper editing was verified by sequencing (genewiz inc.) and gene analysis using the online ice tool from synthego (suppl. fig. 5).

vesicular stomatitis virus (vsv) pseudotyped with spike proteins of sars-cov-2 was generated according to a published protocol (whitt, 2010). briefly, hek293t cells, transfected to express full length sars-cov-2 spike proteins, were inoculated with vsv-g pseudotyped ∆g-luciferase or gfp vsv (kerafast, ma). after 2 hr at 37 °c, the inoculum was removed and cells were refed with dmem supplemented with 10% fbs, 50 u/ml penicillin, 50 µg/ml streptomycin, and vsv-g antibody (i1, mouse hybridoma supernatant from crl-2700; atcc). pseudotyped particles were collected 20 hr post-inoculation, centrifuged at 1,320 × g to remove cell debris and stored at −80 °c until use.

briefly, 100 µl of luciferin lysis solution was added to the cells and incubated for 5 min at room temperature. the solution was transferred to a black 96-well plate and luminescence was detected using an enspire multimodal plate reader (perkin elmer). data analysis and statistical analysis were performed in prism 8.

fluor 594 labeling kits (invitrogen), respectively. zombie uv™ was used to gate for live cells in the analysis. cells were then analyzed using an ma900 cell sorter (sony).

for 4 days. fresh medium, 100 µl in the apical chamber and 500 µl in the basal chamber, was added daily. at day 7, the medium in the apical chambers was removed, and the basal chambers were changed every 2-3 days with apical washes with pbs every week for 28 days.

the apical side of the hbec ali culture was gently washed three times with 200 µl of phosphate buffered saline without divalent cations (pbs-/-). heparinase was added to the apical side for half an hour prior to infection.
an moi of 0.5 of authentic sars-cov-2 live virus (usa-wa1/2020 (bei resources, #nr-52281)) in 100 µl total volume of pbs was added to the apical chamber with either dmso, heparinase (2.5 mu/ml heparin lyase ii and 5 mu/ml heparin lyase iii (ibex)) or 100 µg/ml of unfractionated heparin. cells were incubated at 37 °c and 5% co2 for 4 hours. unbound virus was removed, the apical surface was washed and the compounds were re-added to the apical chamber. cells were incubated for another 20 hours at 37 °c and 5% co2. after inoculation, cells were washed once with pbs-/- and 100 µl tryple (thermofisher) was added to the apical chamber, then incubated for 10 min in the incubator. cells were gently pipetted up and down and transferred into a sterile 15 ml conical tube containing neutralizing medium of dmem + 3% fbs. tryple was added again for 3 rounds of 10 minutes, for a total of 30 min, to clear the transwell membrane. cells were spun down and resuspended in pbs with zombie uv viability dye for 15 min at room temperature. cells were washed once with facs buffer then fixed in 4% pfa for 30 min at room temperature. pfa was washed off and cells were resuspended in pbs. zombie uv™ was used to gate for live cells in the analysis. infection was analyzed by flow cytometry as explained above.

cell viability was assessed using the celltiter-blue® assay (promega). briefly, vero cells were seeded into a 96 well plate. the cells were treated with hsase mix (2.5 mu/ml hsase ii and 5 mu/ml hsase iii; ibex) or 100 µg/ml ufh for 16 hrs. the viability of the cells was measured using celltiter-blue® according to the manufacturer's protocol. briefly, the celltiter-blue® reagent was added directly to the cell culture and the cells were incubated overnight. fluorescence was read at excitation 560 nm and emission 590 nm, using an enspire multimodal plate reader (perkin elmer). data analysis was performed in prism.
the human bronchial epithelial cells were grown at an air-liquid interface as explained above. cell viability after treatment with hsase mix (2.5 mu/ml hsase ii and 5 mu/ml hsase iii; ibex) or 100 µg/ml ufh for 16 hrs was measured by adding celltiter-blue® reagent directly to the transwell inserts and developed as explained above.

all statistical analyses were performed in prism 8 (graphpad). all experiments were performed in triplicate and repeated as indicated in the figure legends. data were analyzed statistically using unpaired t-tests when two groups were being compared or by one-way anova without post-hoc correction for multiple comparisons. ic50 values and confidence intervals were determined by non-linear regression using the inhibitor vs. response least squares fit algorithm. the error bars in the figures refer to mean plus standard deviation (sd) values. the specific statistical tests used are listed in the figure legends and in the methods section. experiments were evaluated by statistical significance according to the following scheme: ns: p > 0.05, *: p ≤ 0.05, **: p ≤ 0.01, ***: p ≤ 0.001, ****: p ≤ 0.0001.

after 48 hr, cell culture supernatants were collected and stored at -80 °c. virus titers were determined by plaque assays on vero e6 monolayers

greiner bio-one, #662160) and rocked for 1 hr at room temperature. the cells were subsequently overlaid with mem containing 1% cellulose

the plaques were visualized by fixation of the cells with a mixture of 10% formaldehyde and 2% methanol (v/v in water) for 2 hr. the monolayer was washed once with pbs and stained with 0.1% crystal violet (millipore sigma # v5265) prepared in 20% ethanol

the pennsylvania state university, following the guidelines approved by the institutional biosafety committees.
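the significance scheme listed in the statistics paragraph above (ns, *, **, ***, ****) amounts to a small threshold lookup. the sketch below is illustrative only: the function name is hypothetical and it is not the prism implementation; it simply encodes the thresholds as stated in the text.

```python
def significance_label(p):
    """Map a p-value to the label used in the figures (ns, *, **, ***, ****)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    # test the strictest threshold first so that each p-value
    # receives the largest number of stars it qualifies for
    thresholds = ((0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*"))
    for threshold, label in thresholds:
        if p <= threshold:
            return label
    return "ns"


for p in (0.2, 0.05, 0.003, 0.00005):
    print(p, significance_label(p))
```

the thresholds are inclusive (p ≤ 0.05 is already "*"), matching the scheme as written.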
human bronchial epithelial cell air-liquid interface generation and infection

human bronchial epithelial cells (hbecs, lonza) were cultured in t75 flasks in pneumacult-ex plus medium according to manufacturer instructions (stemcell technologies). to generate air-liquid interface (ali) cultures, hbecs were plated on collagen i-coated 24 well transwell inserts with a 0.4-micron pore size (costar, corning) at 5 x 10^4 cells/ml. cells were maintained for 3-4 days in pneumacult-ex plus medium until confluence, then changed to pneumacult-ali medium

triglyceride-rich lipoprotein binding and uptake by heparan sulfate proteoglycan receptors in a crispr/cas9 library of hep3b mutants
remdesivir for the treatment of covid-19 - preliminary report
guinea fowl coronavirus diversity has phenotypic consequences for glycan and tissue binding
heparan sulfate proteoglycans and viral attachment: true receptors or adaptation bias? viruses 11
undersulfated and glycol-split heparins endowed with antiangiogenic activity
the 2019 coronavirus (sars-cov-2) surface protein (spike) s1 receptor binding domain undergoes conformational change upon heparin binding
identification of a major co-receptor for primary isolates of hiv-1
hiv-1 entry into cd4+ cells is mediated by the chemokine receptor cc-ckr-5
special considerations for proteoglycans and glycosaminoglycans and their purification
order out of chaos: assembly of ligand binding sites in heparan sulfate
age-dependent modulation of heparan sulfate structure and function
bioengineering murine mastocytoma cells to produce anticoagulant heparin
ucsf chimerax: meeting modern challenges in visualization and analysis
structural analysis of urinary glycosaminoglycans from healthy human subjects
human milk oligosaccharides inhibit rotavirus infectivity in vitro and in acutely infected piglets
human coronaviruses oc43 and hku1 bind to 9-o-acetylated sialic acids via a conserved receptor-binding site in spike protein domain a
loss of bcl-6-expressing t follicular helper cells and germinal centers in covid-19
stabilized coronavirus spikes are resistant to conformational changes induced by receptor recognition or proteolysis
initial step of virus entry: virion binding to cell-surface glycans
how good is automated protein docking?
the cluspro web server for protein-protein docking
appion: an integrated, database-driven pipeline to facilitate em image processing
inhibition of sars pseudovirus cell entry by lactoferrin binding to heparan sulfate proteoglycans
evolutionary differences in glycosaminoglycan fine structure detected by quantitative glycan reductive isotope labeling
heparan sulfate structure in mice with genetically modified heparan sulfate production
assessing ace2 expression patterns in lung tissues in the pathogenesis of covid-19
angiotensin-converting enzyme 2 is a functional receptor for the sars coronavirus
proteoglycans and sulfated glycosaminoglycans
sars-cov-2 spike protein binds heparan sulfate in a length- and sequence-dependent manner
entry of human coronavirus nl63 into the cell
human coronavirus nl63 utilizes heparan sulfate proteoglycans for attachment to target cells
stable heparin-producing cell lines derived from the furth murine mastocytoma
herpes simplex virus-1 entry into cells mediated by a novel member of the tnf/ngf receptor family
heparin inhibits cellular invasion by sars-cov-2: structural dependence of the interaction of the surface protein (spike) s1 receptor binding domain with heparin
membrane protein of human coronavirus nl63 is responsible for interaction with the adhesion receptor
structures of mers-cov spike glycoprotein in complex with sialoside attachment receptors
localisation and distribution of o-acetylated n-acetylneuraminic acids, the endogenous substrates of the hemagglutinin-esterases of murine coronaviruses, in mouse tissue
mediation of human immunodeficiency virus type 1 binding by interaction of cell surface heparan sulfate proteoglycans with the v3 region of envelope gp120-gp41
improved vectors and genome-wide libraries for crispr screening
relion: implementation of a bayesian approach to cryo-em structure determination
cell surface receptors for herpes simplex virus are heparan sulfate proteoglycans
a novel role for 3-o-sulfated heparan sulfate in herpes simplex virus 1 entry
nidovirus sialate-o-acetylesterases: evolution and substrate specificity of coronaviral and toroviral receptor-destroying enzymes
the sweet spot: defining virus-sialic acid interactions
automated molecular microscopy: the new leginon system
effective inhibition of sars-cov-2 entry by heparin and enoxaparin derivatives. biorxiv
the versatile heparin in covid-19
structural basis for human coronavirus attachment to sialic acid receptors
the war against influenza: discovery and development of sialidase inhibitors
structural characterization of human liver heparan sulfate
dog picker and tiltpicker: software tools to facilitate particle selection in single particle electron microscopy
function, and antigenicity of the sars-cov-2 spike glycoprotein
isolation and characterization of heparan sulfate from various murine tissues
site-specific glycan analysis of the sars-cov-2 spike
a comprehensive compositional analysis of heparin/heparan sulfate-derived disaccharides from human serum
generation of vsv pseudotypes using recombinant deltag-vsv for studies on virus entry, identification of entry inhibitors, and immune responses to vaccines
cryo-em structure of the 2019-ncov spike in the prefusion conformation
vaccines and therapies in development for sars-cov-2 infections
initial interaction of herpes simplex virus with cells is binding to heparan sulfate
demystifying heparan sulfate-protein interactions
structural basis for the recognition of sars-cov-2 by full-length human ace2

cov-2 spike protein interacts with heparan sulfate and ace2 through the rbd
• heparan sulfate promotes spike-ace2 interaction
• sars-cov-2 infection is co-dependent on heparan sulfate and ace2
• heparin
and non-anticoagulant derivatives block sars-cov-2 binding and infection

in brief provide evidence that heparan sulfate is a necessary co-factor for sars-cov-2 infection. they show that heparan sulfate interacts with the receptor binding domain of the sars-cov-2 spike glycoprotein

we thank scott selleck (the pennsylvania state university), eugene yeo (uc san diego), john guatelli (uc san diego), mark fuster (uc san diego) and stephen schoenberger (la jolla institute for immunology) for many helpful discussions, and annamaria naggi and giangiacomo torri from the ronzoni institute for generously providing split-glycol heparin. this

key: cord-272467-8heg5iql authors: armstrong, john; mccrae, malcolm; colman, alan title: expression of coronavirus e1 and rotavirus vp10 membrane proteins from synthetic rna date: 2004-02-19 journal: j cell biochem doi: 10.1002/jcb.240350206 sha: doc_id: 272467 cord_uid: 8heg5iql

some viruses acquire their envelopes by budding through internal membranes of their host cell. we have expressed the cloned cdna for glycoproteins from two such viruses, the e1 protein of coronavirus, which buds in the golgi region, and vp10 protein of rotavirus, which assembles in the endoplasmic reticulum. messenger rna was prepared from both cdnas by using sp6 polymerase and either translated in vitro or injected into cultured cv1 cells or xenopus oocytes. in cv1 cells, the e1 protein was localised to the golgi region and vp10 protein to the endoplasmic reticulum. in xenopus oocytes, the e1 protein acquired post-translational modifications indistinguishable from the sialylated, o-linked sugars found on viral protein, while the vp10 protein acquired endoglycosidase-h-sensitive n-linked sugars, consistent with their localisation to the golgi complex and endoplasmic reticulum, respectively. thus the two proteins provide models with which to study targeting to each of these intracellular compartments.
when the rnas were expressed in matured, meiotic oocytes, the vp10 protein was modified as before, but the e1 protein was processed to a much lesser extent than in interphase oocytes, consistent with a cessation of vesicular transport during cell division.

in the eukaryotic cell, the rough endoplasmic reticulum is the site of synthesis for both secretory proteins and integral proteins of the plasma membrane. to reach the cell surface, both types of protein must traverse the golgi complex. during this transport process, a series of post-translational modifications may occur. since the modifying enzymes are localised along the secretory pathway, the endoplasmic reticulum and golgi complex constitute a series of membrane-limited compartments, each apparently with a distinct complement of proteins [reviewed in ]. how are these proteins confined to their appropriate destinations, rather than migrating onward to the plasma membrane? we are investigating two viral model proteins, one for the endoplasmic reticulum and one for the golgi complex, with a view to determining the features of each molecule responsible for its correct localisation. rotaviruses are a class of animal viruses which characteristically assemble at the endoplasmic reticulum [5]. although assembly appears to be a budding process, the matured virion surprisingly lacks a visible envelope. the bovine uk strain encodes two glycoproteins: vp7, which forms part of the virion, and vp10, which does not [8, 9]. the latter is an integral membrane glycoprotein [9, 10] bearing "immature" n-linked sugars which imply a failure to leave the endoplasmic reticulum [ w . by contrast, with coronaviruses the envelope is acquired by budding initially at smooth membranes close to the golgi complex but continuous with the endoplasmic reticulum; as infection progresses budding may also occur in the golgi cisternae or the rough endoplasmic reticulum [11, 12].
the strain mhv-a59 bears two glycoproteins, the smaller of which, e1, is restricted within the cell and does not appear to reach the plasma membrane except as part of a budded virion [12, 13]. thus we have adopted the proteins vp10 and e1 as potential models for the class of membrane proteins which localise within compartments of the secretory pathway. previously we have reported the cloning and sequence analysis of both proteins [9, 14]. here we show that both cdnas can be expressed via artificial mrna prepared using sp6 polymerase [15, 16]. the mrna may be translated either in vitro or after injection into xenopus oocytes or cultured cv1 cells. expressed in this way, vp10 and e1 prove to be valid model proteins for the endoplasmic reticulum and golgi complex, respectively. in addition, we have exploited the fact that xenopus oocytes can be matured into a meiotic state [17, 18] to investigate the transport of the two proteins to their destinations during cell division.

all restrictions, ligations, and transformations were carried out according to standard methods [19]. full-length cdna for bovine rotavirus vp10 (assembled and kindly provided by h. baybutt) was excised with the enzymes ahaiii and sali, to give a fragment corresponding to nucleotides 8-588 [in figure 3 of 9]. this was inserted into the bglii site of the transcription vector psp64t [15]. e1 cdna [14, 20] was excised with ahaiii and fom, representing nucleotides 59-780 [in 14], plus the following 13 bases of the adjacent nucleocapsid gene [21], and inserted into the bglii site of psp64t [15]. both plasmids were linearised with ecori and rna transcribed with sp6 polymerase in the presence of 0.5 mm m7g(5')ppp(5')g exactly as described [16]. since the vector psp64t contains a poly-a tract between the bglii and ecori sites, this method results in the synthesis of capped, polyadenylated mrna in a single reaction.
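as a quick check on the fragment sizes implied by the nucleotide coordinates above, the sketch below computes the length of each excised cdna insert. it assumes the quoted coordinates are inclusive at both ends, which the text does not state explicitly; the function name is illustrative.

```python
def span_length(first, last):
    """Number of nucleotides in the inclusive coordinate range first..last."""
    if last < first:
        raise ValueError("last must not be smaller than first")
    return last - first + 1


# vp10 fragment: nucleotides 8-588
vp10_insert = span_length(8, 588)
# e1 fragment: nucleotides 59-780 plus 13 bases of the adjacent nucleocapsid gene
e1_insert = span_length(59, 780) + 13

print(vp10_insert, e1_insert)
```

under that assumption the vp10 insert is 581 nucleotides long and the e1 insert 735 nucleotides, not counting vector-derived sequence or the poly-a tract.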
translation in vitro in reticulocyte lysates, injection of xenopus oocytes, immunoprecipitation of proteins, microinjection of cv1 cells, and immunofluorescence analysis were all performed as before [16]. for analysis of vp10 protein, rabbit antiserum to a fusion protein of bacterial β-galactosidase and vp10, whose preparation will be described in detail elsewhere, was used. to detect e1, we used a rabbit serum to e1 purified by detergent extraction of virus [22], kindly provided by s. tooze, embl, heidelberg, who also provided stocks of coronavirus mhv-a59 and its host, sac-cells. radiolabelled virus was produced by infection of cells at a ratio of 10 pfu/cell, incubation for 16 hr in methionine-free medium to which was added 50 μci/ml 35s-methionine (amersham), and collection of the culture supernatant. endoglycosidase h digestions were carried out as before [18]. xenopus oocytes were matured into second meiotic metaphase by incubation in 10 μg/ml progesterone as described [18].
previously we showed that rna prepared with sp6 polymerase could be translated in cultured cells, provided it has both a 5' cap structure and a 3' poly-a tract added using poly-a polymerase [16]. e1 rna prepared in this way was found to be translated in both cv1 cells and xenopus oocytes (not shown). we have further simplified this method by using a vector which includes a sequence of 23 a's in the transcribed region [15], eliminating the requirement for a second reaction. rnas prepared by this method for vp10 and e1 were translated efficiently in reticulocyte lysates, xenopus oocytes, and cultured cv1 cells (figs. 1, 2). rough endoplasmic reticulum; these structures may be cleaved by endoglycosidase h [23]. vp10 protein which had been immunoprecipitated from injected oocytes was found to be completely sensitive to endoglycosidase h, yielding a species which approximately comigrated with the unprocessed form produced in the reticulocyte lysate (fig. 1, lanes a-d).
thus the vp10 protein is likely to be localised within the endoplasmic reticulum of the oocyte. e1 protein was synthesized in oocytes as a spectrum of forms, the most mobile of which comigrated with the unmodified protein from the reticulocyte lysate (fig. 1, lanes e,f). the oocyte proteins were very similar in both mobilities and relative abundance to the species of e1 found in virus particles (fig. 1, lane g). these have been shown to contain o-linked sugars as their only posttranslational modification [24, 25]. since at least the later stages of o-linked glycosylation are thought to occur in the golgi complex [26], most or all of the e1 protein would appear to have reached this organelle in the oocyte. with both antibodies, no significant labeled proteins were precipitated from oocytes in the absence of the appropriate mrna.
sp6 rnas for vp10 and e1 were microinjected into cv1 monkey kidney cells, and the resulting proteins were detected by immunofluorescence. vp10 protein was found in an elaborate pattern around the nucleus and throughout the cytoplasm, characteristic of the rough endoplasmic reticulum (fig. 2a). by changing the plane of focus, it became clear that all of the nuclear envelope was labeled (not shown). this was expected as the outer nuclear membrane is continuous with, and usually considered as part of, the endoplasmic reticulum. in contrast, e1 protein showed a much more localised fluorescence pattern (fig. 2b). labeling was concentrated in a perinuclear area corresponding to the golgi region and coincident with the intracellular pattern shown by fluorescent wheat germ agglutinin, a golgi marker (not shown). the pattern also closely resembled the distribution of e1 protein in virally infected cells [12, 13]. if injected cells were not permeabilised with detergent, no labeling of the cell surface was detected with either protein (not shown).
incubation of xenopus oocytes in progesterone causes them to mature from their initial stage, first meiotic prophase, to second meiotic metaphase [17]. in this state, the golgi apparatus is broken down and protein secretion blocked [18]. we investigated the synthesis of vp10 and e1 in such matured oocytes. vp10 was electrophoretically indistinguishable from the form produced in nonmatured oocytes (fig. 3, lanes a,b). e1, in contrast, showed a striking reduction in the extent to which it was processed in comparison to nonmatured oocytes (fig. 3, lanes c,d), implying that most of the protein is denied access to the modifying enzymes of the golgi complex under these conditions.
fig. 3. expression of vp10 (a,b) and e1 (c,d) in normal (a,c) or progesterone-matured (b,d) oocytes. proteins were analysed as for figure 1, and the fluorograph was exposed for 1 wk.
transcription of rna with sp6 polymerase has been used as a route for expression of cloned cdnas either in vitro or in the xenopus oocyte [15]. if the rna incorporates a 5' cap structure and a 3' poly-a tract, it may also be expressed in cultured cells [16]. by using the vector psp64t [15], such rna can be prepared in a single reaction. we assume that the presence of an encoded poly-a sequence in the vector is the critical factor in allowing expression of the rna in cultured cells [16]; however, the transcribed region also contains 5' and 3' untranslated regions from a globin cdna, and a sequence of c bases at the extreme 3' end [15], and any of these elements may contribute to the stability of the mrna or the efficiency with which it is translated. whichever feature is important, the system provides a quick and versatile approach to the expression of cloned dnas. its principal limitation, at present, is the lack of a technique for the efficient introduction of rna into large populations of cultured cells.
it appears that both of the proteins we have studied, vp10 and e1, are localised in a similar fashion whether they are expressed in isolation in cell types as diverse as oocytes and cv1 cells, or in the course of viral infection. thus each is likely to be a legitimate model with which to study targeting of proteins to its respective intracellular destination. in the case of vp10, this is the rough endoplasmic reticulum, as judged by fluorescence microscopy and sensitivity of its n-linked oligosaccharides to endoglycosidase h (figs. 1, 2). the corresponding protein of rotavirus sa11, termed ncvp5, bears oligosaccharides of the form man8glcnac2 [10]. in contrast, the principal location of the e1 protein appears to be the golgi complex. as the bulk of the protein made in oocytes has been modified to give forms of similar mobilities to the viral protein (fig. 1, lanes f,g; see also [27]), it has presumably acquired o-linked oligosaccharides, an indication of having reached the golgi complex [26]. most of the oligosaccharides on viral e1 contain terminal sialic acid [25], whose presence is detectable by changes in electrophoretic mobility of the glycoprotein after neuraminidase digestion. however, we have repeatedly failed to observe any effect on mobility of oocyte e1 by treatment with neuraminidases from various sources and under various conditions (not shown). the explanation for this is not clear; perhaps the oocyte adds exotic sialic acids which are not susceptible to cleavage. immunofluorescence of cultured cells expressing e1 clearly showed a very localised perinuclear distribution (fig. 2b) characteristic of the golgi region and similar to the pattern observed during viral infection [12]. however, not all of the viral protein is restricted to the flattened cisternae of the golgi complex; some at least is found in smooth membranes which are in the same region of the cell but are in fact continuous with the rough endoplasmic reticulum [12, 22].
these membranes are reminiscent of the transitional elements of specialised secretory cells [28] and are the earliest site of budding during viral infection [12]. at the resolution of light microscopy it is impossible to say whether the pattern in figure 2b includes this compartment. it will be of interest to resolve this point by immunoelectron microscopy, which should also allow us to determine how many of the golgi cisternae are labelled. nevertheless, it is probably safe to conclude that the behaviour of the e1 protein is largely responsible for determining the intracellular budding site of coronavirus. the localisation of vp10 might argue for a similar function in rotavirus morphogenesis. however, this protein's precise role in viral infection remains enigmatic. the other rotavirus glycoprotein, vp7, also appears to be restricted to the endoplasmic reticulum when it is expressed in the absence of other viral proteins [29]. thus both proteins may be involved in determining the viral assembly site, but the subsequent fate of vp10 is unclear, since it is not detectable in purified virions [9]. perhaps the simplest hypothesis is that the newly budded virus has a lipid envelope including vp10 and vp7, which is somehow shed at a later stage to leave vp7 connected directly to components of the capsid [7].
an aspect of the biogenesis of the endoplasmic reticulum and golgi complex which has recently attracted attention is their fate during cell division. in mammalian cells, both organelles are thought to break up into vesicles which then partition randomly between the daughter cells and fuse together; concomitant with this disruption is a cessation of traffic of secretory and plasma membrane proteins to and through the golgi complex [reviewed in 30]. the same phenomena appear to occur in the xenopus oocyte, which has the experimental advantage that its cell cycle state may be manipulated with hormones [mi.
the results presented in figure 3 are entirely consistent with this model; vp10 is synthesized and processed normally in meiotic oocytes, but e1 is processed to a far lesser extent than in interphase oocytes, and thus has probably failed to traverse what is assumed to be the first vesicle-mediated step of membrane traffic, from the endoplasmic reticulum to the golgi complex. it might be of interest to study the vesicles in which e1 becomes trapped under these conditions. what features of each protein might be involved in determining its intracellular destination, and what are the mechanisms involved in each case? the two proteins share a superficial topological similarity, in having a very short domain on the lumenal side of the membrane which is n-terminal but does not arise from cleavage of a signal sequence; e1 has the additional unusual feature of hydrophobic regions large enough to span the membrane three times [9, 10, 14, 31, 32]. two mechanisms have been proposed for retention of proteins in the endoplasmic reticulum. one is that a species variously known as bip or grp78 attaches to improperly folded or assembled molecules and prevents their exit to the golgi complex [33-35]. the second is that all of the membrane proteins of the endoplasmic reticulum form a continuous interacting network and are unable to diffuse in the lipid bilayer [34, 37]. in respect of either model, we have never observed any host species which coprecipitates from oocytes with vp10, although admittedly our labelling schedule is perhaps not optimal for detection. (note that the unidentified band in figure 1, lanes b-d is made only in response to vp10 mrna.) neither have we observed such species with our golgi model protein, e1, whose sorting mechanism is even more obscure. however, with the availability of cloned cdna and a reliable expression system, it is now possible to dissect each molecule and identify the characteristics involved in its correct targeting within the cell.
molecular cloning: a laboratory manual molecular biology and pathogenesis of coronaviruses this work was supported by grants from the cancer research campaign (to a. colman) and the wellcome trust (to a. colman and m. mccrae). key: cord-254909-8zgvovu4 authors: srivastava, rajneesh; ray, sandipan; vaibhav, vineet; gollapalli, kishore; jhaveri, tulip; taur, santosh; dhali, snigdha; gogtay, nithya; thatte, urmila; srikanth, rapole; srivastava, sanjeeva title: serum profiling of leptospirosis patients to investigate proteomic alterations() date: 2012-12-05 journal: j proteomics doi: 10.1016/j.jprot.2012.04.007 sha: doc_id: 254909 cord_uid: 8zgvovu4 leptospirosis is a zoonotic infectious disease of tropical, subtropical and temperate zones, which is caused by the pathogenic spirochetes of genus leptospira. although this zoonosis is generally not considered as fatal, the pathogen can eventually cause severe infection with septic shock, multi-organ failure and lethal pulmonary hemorrhages leading to mortality. in this study, we have performed a proteomic analysis of serum samples from leptospirosis patients (n = 6), febrile controls (falciparum malaria) (n = 8) and healthy subjects (n = 18) to obtain an insight about disease pathogenesis and host immune responses in leptospiral infections. 2de and 2d-dige analysis in combination with maldi-tof/tof ms revealed differential expression of 22 serum proteins in leptospirosis patients compared to the healthy controls. among the identified differentially expressed proteins, 8 candidates exhibited different trends compared to the febrile controls. functional analysis suggested the involvement of differentially expressed proteins in vital physiological pathways, including acute phase response, complement and coagulation cascades and hemostasis. 
this is the first report of analysis of human serum proteome alterations in leptospirosis patients, which revealed several differentially expressed proteins, including α-1-antitrypsin, vitronectin, ceruloplasmin, g-protein signaling regulator, apolipoprotein a-iv, which have not been reported in context of leptospirosis previously. this study will enhance our understanding about leptospirosis pathogenesis and provide a glimpse of host immunological responses. additionally, a few differentially expressed proteins identified in this study may further be investigated as diagnostic or prognostic serum biomarkers for leptospirosis. this article is part of a special issue entitled: integrated omics. leptospirosis is caused by a gram-negative like, obligate, aerobic spirochete of genus leptospira, which is sustained in nature within a wide range of reservoir and carrier animals, including rodent and non-rodents. it is a contagious disease, transmitted through water or soil contaminated with urine of animal hosts containing a large number of this pathogenic spirochete. this zoonosis generally appears as a seasonal infection that initiates at the beginning of monsoon, occurs frequently during the rainy season and disappears when the rains recede; except few sporadic cases that may happen throughout the year [1, 2] . due to the increasing number of severe leptospirosis infections in developed and developing countries, it has been recognized as a major health problem [3] . additionally, this emerging infectious disease can adversely affect agricultural industry [4] . consequently, this fatal infectious zoonosis has a devastating impact on human health and corresponding impediment to economic improvement worldwide. leptospirosis is a systemic disease of humans and broad range of feral and domestic animals, which is characterized by fever, myalgia, conjunctival suffusion, renal and hepatic insufficiency, pulmonary manifestations and reproductive failure [5] . 
renal involvement is very common in leptospirosis patients and severe cases often lead to acute renal failure. pulmonary alveolar haemorrhage is the prime cause of leptospirosis related casualties [6] . successful completion of the genome sequence of leptospira interrogans [7] has paved the emergence of omics-based approaches in leptospirosis research [8] . proteome level analysis of different biological fluids has been found to be effective for investigation of disease pathobiology as well as identification of surrogate markers with diagnostic and prognostic significance [9, 10] . recent proteomic studies have generated valuable information regarding the global/differential proteome and immunome profile of pathogenic leptospira spp. [11] [12] [13] , protein expression and multiple modification system of the pathogen [14] , and analysis of leptospires excreted in urine of chronically infected hosts [15, 16] . to this end, notable achievements have been accomplished in deciphering the global and sub-cellular proteome profile of this pathogenic spirochete using quantitative mass spectrometric approaches [17, 18] . however, the molecular pathogenesis of leptospirosis is not clearly understood and requires further comprehensive studies at the molecular levels. the omics-based investigations can reveal the mechanism of disease pathogenesis and the pathogen induced alterations in host physiological system as well as identification of leptospiral immunogens useful for diagnostic applications and vaccine development. in the present study, we have performed a comprehensive proteomic analysis of serum samples from patients suffering from leptospirosis using two dimensional gel electrophoresis (2de) and 2d fluorescence difference gel electrophoresis (2d-dige), and compared with febrile (falciparum malaria) and healthy controls. identification of differentially expressed proteins was performed using high-sensitivity maldi-tof-tof mass spectrometry. 
in this differential proteomics analysis, we have identified 22 proteins showing statistically significant (p < 0.05) differential expression in leptospirosis patients compared to the healthy subjects. proteins showing differential expression in leptospirosis patients were subjected to further functional clustering by using the database for annotation, visualization and integrated discovery (david) and protein analysis through evolutionary relationships (panther). the possible involvement of differentially expressed proteins in various physiological pathways and the host immune response against the pathogen has been studied. to the best of our knowledge, this is the first report of human serum proteome alterations in leptospiral infection, and our results provide valuable insight into the host immune response against this acute bacterial infection and the underlying molecular mechanisms of the disease pathogenesis.
materials and methods
(table s1.1). blood specimens were also collected from age and sex matched healthy (n = 18) (voluntary blood donors) and febrile controls (n = 8) with written informed consent to perform comparative analysis. patients with non-severe, uncomplicated, falciparum malaria diagnosed by microscopic examination and confirmed by rapid diagnostic tests (rdt) were selected as febrile controls (fc) for this proteomic study (table s1.2). serum separation tubes (bd vacutainer®; bd biosciences) were used to collect blood samples from the antecubital vein of the subjects. subsequent to blood collection, samples were allowed to clot by keeping the tubes in ice for 30 min. after clotting, the samples were centrifuged at 2500 rpm at 20°c for 10 min to separate serum from the clotted blood. collected serum samples were labeled and stored in multiple aliquots at −80°c. to minimize any pre-analytical variations, identical collection, processing and storage conditions were maintained for all control and diseased samples. previously described [19] .
in brief, a combination of tca-acetone protein precipitation, sonication, albumin and igg depletion and desalting was used for processing serum samples from leptospirosis patients and controls for proteomic analysis. the 2d gel images were scanned by using labscan software version 6.0 (ge healthcare) and analysis was performed by using imagemaster 2d platinum 7.0 software (ge healthcare). 2d-dige gels were scanned using an ettan dige imager scanner (ge healthcare) with the appropriate wavelengths and filters for cy2, cy3 and cy5 dyes, keeping the resolution at 40 μm. accurate cropping of the gel images was performed using imagequant software version 5.0 (ge healthcare). the cropped gel images were imported to imagemaster 2d platinum 7.0 dige software (ge healthcare) and subjected to comparative analysis for relative protein quantification across all the leptospirosis and control samples. the average ratio of expression was analyzed by student's t-test. protein spots showing differential expression with reproducibility and statistical significance (p < 0.05) were selected and excised for further ms analysis.
in-gel trypsin digestion and mass spectrometry
protein spots identified in 2de and 2d-dige gels exhibiting differential expression with statistical significance (p < 0.05) in leptospirosis patients compared to the healthy subjects were excised for in-gel digestion and further ms analysis. in-gel digestion of the proteins was performed as described previously [19]. the identity of differentially expressed proteins was established using an ab sciex 4800 maldi-tof/tof mass spectrometer. the combined ms and ms/ms peak lists were searched using the gps™ explorer software version 3.6 (ab sciex). the mascot version 2.1 (http://www.martixscience.com) was used as the search engine for protein identification against the swiss-prot database.
during the database search following parameters were specified: human taxonomy, trypsin digestion with one missed cleavage, carbamidomethyl (c) as fixed modification, oxidation (m) as variable modification, peptide mass tolerance set at 75 ppm and ms/ms tolerance of 0.4 da. western blot analysis was performed with serum samples from controls (healthy and febrile) and leptospirosis patients (n = 6) to validate the differential expression of some of the target proteins identified in 2de and 2d-dige experiments. prior to the western blotting experiment protein concentration in each sample (leptospirosis patients and controls) was estimated using the 2d-quant kit (ge healthcare) and bca protein assay kit (thermo fisher scientific) following the manufacturer's instructions. the serum proteins were separated on a 12% sds-page (50 μg of total proteins per lane) and then transferred onto pvdf membranes under semi-dry conditions by using an ecl semi-dry transfer unit (ge healthcare). protein migration was tracked by using pre-stained protein standard (fermentas). western blotting was performed by using monoclonal/polyclonal antibody against clusterin (santacruz biotechnology, sc-8354), ceruloplasmin (santacruz biotechnology, sc-365206) and appropriate secondary antibody conjugated with hrp (genei (merck)-621140380011730 or 621140680011730). imagequant software version 5.0 (ge healthcare) was used for quantitation of the signal intensity of the bands in western blots. the differentially expressed proteins in leptospirosis patients were subjected to functional pathway analysis using panther software, version 7 (http://www.pantherdb.org) [20] and david database version 6.7 (http://david.abcc.ncifcrf.gov/home.jsp) [21] for better understanding of the biological context of the identified proteins, their connection with disease pathobiology and participation in various physiological pathways. 
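the 75 ppm peptide mass tolerance specified for the database search above can be made concrete with a short calculation. the sketch below is illustrative only: the function names and the example masses are hypothetical, and nothing here is taken from the mascot software or from the study's data. it simply shows how a parts-per-million window translates into an absolute mass window that grows with peptide mass.

```python
def ppm_error(observed_mass: float, theoretical_mass: float) -> float:
    """Signed mass error of an observed peptide mass, in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(observed_mass: float, theoretical_mass: float,
                     tolerance_ppm: float = 75.0) -> bool:
    """True if the observed mass lies inside the +/- ppm window.

    The default of 75 ppm mirrors the peptide mass tolerance quoted in the text.
    """
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tolerance_ppm


def tolerance_window_da(theoretical_mass: float,
                        tolerance_ppm: float = 75.0) -> float:
    """Half-width of the ppm window expressed in daltons."""
    return theoretical_mass * tolerance_ppm / 1e6


if __name__ == "__main__":
    # Hypothetical peptide of 2000 Da: 75 ppm corresponds to +/- 0.15 Da.
    print(tolerance_window_da(2000.0))
    print(within_tolerance(2000.10, 2000.0))  # 50 ppm off, inside the window
    print(within_tolerance(2000.20, 2000.0))  # 100 ppm off, outside the window
```

a relative (ppm) tolerance is the natural choice for precursor masses because instrument error scales with mass, whereas the ms/ms tolerance in the text is given as a fixed 0.4 da.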
uniprot accession numbers of the 22 differentially expressed proteins identified in our study were uploaded and mapped against the homo sapiens reference dataset to extract and summarize functional annotation associated with individual or group of genes/proteins and to identify gene ontology terms, molecular function, biological process and important pathways for each dataset. this comparative proteomic profiling was performed to identify differentially expressed serum proteins in leptospiral infection. serum proteome profiles of healthy subjects, febrile controls and leptospirosis patients were analyzed using two gel-based proteomic platforms: classical 2de and advanced 2d-dige. in each analysis individual samples were studied (n = 32 for classical 2de and n = 18 for 2d-dige); sample pooling was not performed, since pooled samples cannot effectively reflect the true biological variability. differentially expressed protein spots identified in gel-based analysis which fulfilled the statistical criteria (t-test; p < 0.05) were subjected to further mass spectrometric and functional analysis. over 600 protein spots were detected in each 2d gel stained with gelcode blue safe protein stain using imp7 software. in 2de analysis, 13 statistically significant (p < 0.05) differentially expressed protein spots were identified, 7 down-regulated and 6 up-regulated, with fold changes ranging from −4.39-fold to +2.7-fold (fig. s1 and table s2). among the 6 up-regulated spots, 5 spots were between 1.1 and 2-fold, and 1 spot exhibited over 2-fold increase in expression level; while in the case of down-regulation, 2 spots were between 1.1 and 2-fold, 1 spot was between 2- and 3-fold, and 4 spots were found to have over 3-fold reduced expression levels. representative 2de images of serum proteome profile of leptospirosis patients and healthy individuals, and bar-diagrammatic representation of the fold change and 3d views of some selected differentially expressed proteins are shown in fig.
1a and b. a second level of gel-based proteomic analysis was performed using the 2d-dige approach, where the test (leptospirosis) and control (healthy/febrile) samples, after pre-electrophoresis labeling with cydyes (ge healthcare), were separated on the same gel to reduce gel-to-gel variations. additionally, the superior sensitivity of cydyes compared to routinely used cbb stains allowed visualization of less abundant protein spots in 2d gels, which were not detectable in classical 2de profiling, and thereby increased the overall coverage of the serum proteome. approximately 1000 protein spots were detected on each 2d-dige gel in the image master 2d platinum 7.0 dige software analysis. in 2d-dige profiling, a total of 98 (around 10% of the total detected spots) differentially expressed spots satisfied the statistical criteria (t-test and 1-way anova; p < 0.05), among which 48 were up-regulated (range from 1.15 to 2.71-fold), while the remaining 50 protein spots showed reduced expression level (with changes from 1.19 to 7.5-fold) in the leptospirosis patients (table s3). 37 up-regulated spots were in the 1.1 to 2-fold change range, and for the remaining 11 spots the range was between 2 and 3-fold; while among the 50 down-regulated spots, 33 were between 1.1 and 2-fold, 12 were between 2 and 3-fold, and 5 spots exhibited over 3-fold reduced expression level. owing to the superior sensitivity and reproducibility of the dige technique, we obtained a much higher number of differentially expressed protein spots from the 2d-dige experiment (fig. 2a). fig. 2b depicts 3d views and graphical representation of a few selected differentially expressed protein spots.
identification of differentially expressed proteins using maldi-tof/tof ms
alteration of human serum proteome due to the leptospiral infection was reflected by the differential expression of multiple serum proteins in leptospirosis patients detected by gel-based profiling.
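the fold-change ranges used above to group spots (1.1 to 2-fold, 2 to 3-fold, over 3-fold) can be sketched as a small binning routine. this is a minimal illustration under stated assumptions, not the procedure of the imagemaster software: it assumes the common convention that a patient/control ratio below 1 is reported as a negative fold change equal to the negative reciprocal, it assigns boundary values of exactly 2 and 3 to the lower bin, and all function names and example ratios are hypothetical.

```python
def signed_fold_change(ratio: float) -> float:
    """Convert a patient/control expression ratio to a signed fold change.

    Assumed convention: ratio >= 1 is reported as +ratio (up-regulation),
    ratio < 1 is reported as -1/ratio (down-regulation).
    """
    if ratio <= 0:
        raise ValueError("expression ratio must be positive")
    return ratio if ratio >= 1.0 else -1.0 / ratio


def fold_bin(fold: float) -> str:
    """Assign a signed fold change to one of the magnitude ranges in the text."""
    magnitude = abs(fold)
    if magnitude < 1.1:
        return "below 1.1-fold"
    if magnitude <= 2.0:
        return "1.1 to 2-fold"
    if magnitude <= 3.0:
        return "2 to 3-fold"
    return "over 3-fold"


def summarise(ratios):
    """Count spots per (direction, magnitude range) for a list of ratios."""
    counts = {}
    for ratio in ratios:
        fold = signed_fold_change(ratio)
        direction = "up" if fold > 0 else "down"
        key = (direction, fold_bin(fold))
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    # Hypothetical ratios for three spots, not values from the study.
    print(summarise([2.71, 0.25, 1.5]))
```

with this convention a ratio of 0.25 is reported as a 4-fold reduction, which is how a value such as −4.39-fold would arise from a ratio of roughly 0.23.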
among the several differentially expressed protein spots detected in 2de, those 13 spots that satisfied the statistical criteria (p < 0.05) were subjected to in-gel trypsin digestion and subsequent maldi-tof/tof analysis to establish protein identity. ms and ms/ms analysis results revealed that 13 differentially expressed protein spots identified in regular 2de experiment correspond to 6 proteins, among which 2 were upregulated (α-1-antitrypsin and α-1b-glycoprotein precursor) and the remaining 4 were down-regulated (apolipoprotein a1 precursor, serum albumin precursor, apo-lipoprotein a-iv precursor, complement c4 precursor) ( table 1 and s4.1). in case of 2d-dige, selected spots were excised manually from preparative 2d gels stained with gelcode blue safe protein stain (thermo scientific, usa) containing higher amount of serum protein. among the 98 differentially expressed protein spots, 57 spots that could be excised from the gels were subsequently subjected to ms and ms/ms analysis, which successfully established the identity of 42 spots ( fig. 2a ; table s4 .2). for the remaining spots we were unable to establish ms identity most likely due to the very low intensity of those spots and insufficient amount of detectable peptides, probably beyond the sensitivity limit of the instrument. the 42 protein spots identified by ms correspond to 21 (10 down-regulated and 11 up-regulated) differentially expressed proteins in patients suffering from leptospiral infection ( fig. s2 ; table 2 ). in this study, we have performed two levels of gel-based proteomic analysis; initially classical 2de and latter more advanced 2d-dige technology using sub-sets of the patient and control (febrile and healthy) populations studied by 2de. almost all of the proteins (except complement c4 precursor) identified in 2de were again identified in 2d-dige profiling and exhibited similar trends of differential expressions which enhanced the confidence level of our study. 
the numbers of identified proteins in 2de and 2d-dige are 6 and 21 respectively, while 5 of the identified proteins (apolipoprotein a-i, serum albumin precursor, apolipoprotein a-iv, alpha-1-antitrypsin and alpha-1b-glycoprotein) were found to overlap between 2de and 2d-dige, and overall 22 differentially expressed proteins (p < 0.05) have been identified in our study (tables 1 and 2). pre-electrophoresis labeling of protein samples with highly sensitive cy dyes in 2d-dige yielded an increase of around 55% in spot number compared to the cbb stained 2de gels, which enhanced the overall coverage of the whole serum proteome. in the 2d-dige experiment we obtained 16 more differentially expressed proteins, including complement c3 precursor, clusterin precursor, complement factor h, regulator of g-protein signaling 7 (down-regulated) and vitronectin precursor, ceruloplasmin precursor, leucine-rich α-2-glycoprotein (up-regulated) (table 2), which were not identified in classical 2de due to lower sensitivity and reproducibility issues. differentially expressed serum proteins identified in leptospiral infection (compared to the healthy subjects) were further investigated in falciparum malaria patients (febrile controls). around 60% of the identified proteins were found to be commonly modulated in both of the infectious diseases. however, for most of the proteins the levels of altered expression were found to be different in leptospirosis compared to malaria. interestingly, 3 of the identified proteins exhibited an opposite trend of differential expression in leptospirosis compared to p. falciparum infection. moreover, 5 proteins were found to be differentially expressed in leptospiral infection, but not in febrile controls (table s5). a bar-diagrammatic representation of altered expression levels of different serum proteins in leptospirosis and falciparum malaria is depicted in fig. s3.
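the protein counts reported above for the two platforms can be cross-checked with a simple inclusion-exclusion tally. the symbols A (proteins identified by 2de) and B (proteins identified by 2d-dige) are introduced here purely for illustration and do not appear in the original analysis:

```latex
|A \cup B| = |A| + |B| - |A \cap B| = 6 + 21 - 5 = 22
```

the same arithmetic gives 21 − 5 = 16 proteins observed only in 2d-dige and 6 − 5 = 1 protein observed only in 2de, which matches the single exception noted in the text (complement c4 precursor).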
two selected differentially expressed proteins; clusterin and ceruloplasmin identified in leptospirosis patients were measured using western blotting to confirm the results of proteomic analysis. equal loading of the samples was verified by cbb staining of the sds-page gels and ponceau staining of the transferred blots containing the resolved proteins ( fig. s4a 3a ) and 1.49-fold up-regulation of ceruloplasmin in the leptospirosis patients compared to the hc (fig. 3b) . additionally, comparative analysis performed between leptospirosis patients and febrile controls (n = 6 each), revealed 1.1-fold down-regulation of clusterin and 1.13-fold up-regulation of ceruloplasmin in leptospiral infection compared to falciparum malaria ( fig. s4c and d) . western blot analysis of each sample (l, fc and hc) were performed in duplicate to check the reproducibility and minimize technical artifacts. functional pathway analysis was performed with 22 differentially expressed serum proteins identified in the leptospirosis patients using gel-based and ms-based analysis to understand their biological context, involvement in diverse physiological pathways and association with leptospiral infections. panther analysis revealed involvement of identified proteins in blood coagulation system (33.3%), heterotrimeric g-protein signaling pathway-gq alpha and go alpha mediated pathway (33.3%) and heterotrimeric g-protein signaling pathway-gi alpha and gs alpha mediated pathway (33.3%) (table s6. 1; fig. 4a ). concerning biological process, the identified proteins were involved in 10 biological processes: the major five processes include metabolic process (17.3%), immune system process (16%), response to stimulus (13.3%), cell communication (13.3%) and cellular process (13.3%), (table s6. 1; fig. 4b ). 
five major go functions: binding (31.4%), receptor activity (17.1%), enzyme regulator activity (22.9%), catalytic activity (11.4%) and transporter activity (17.1%) were identified in panther analysis for the differentially expressed proteins in leptospirosis patients (table s6.1; fig. s5). in david analysis, the kegg category revealed complement and coagulation cascades (1.01e−09; 30.43%). the reactome category revealed three biological pathways: signaling in immune system (p = 0.010079; 21.74%), hemostasis (p = 0.035589; 17.39%) and metabolism of lipids and lipoproteins (p = 0.01712; 17.39%), while, different complement cascades (fig. 4c)
leptospirosis has been categorized as a globally important infectious disease due to the frequent occurrence of leptospiral infections in both developing and developed countries, specifically the large outbreaks in tropical and subtropical regions leading to considerable adverse effects on human health and economy [3, 22]. the incidence of this acute bacterial infection has been found to be high (annual incidence per 100,000 > 10) in quite a few countries of the asia pacific region, including india, indonesia, bangladesh, sri lanka and thailand [23]. because india is one of the most flood-stricken countries in the south asian region owing to its geographical location and climate, leptospirosis remains an important endemic environmental infectious disease in this country, with multiple devastating outbreaks such as those in orissa (1999), mumbai (2005) and several other parts of the country [23]. serum and plasma are attractive biological fluids that contain a diversity of proteins released by diseased tissue, and serum/plasma proteomics has gained considerable interest for clinical studies, particularly in disease biomarker discovery [10]. both serum and plasma are components of blood; the liquid portion of blood is referred to as plasma, while removal of the fibrinogen and other clotting factors from plasma results in serum.
applications of serum and plasma as biological fluid for clinical research have their own advantages and limitations. serum does not contain fibrinogen and other blood clotting factors and has a lesser total protein content compared to plasma, which reduces the unnecessary complexity to some extent making serum more ideal for disease diagnostic purposes. very recently, zimmerman et al. have performed a paired comparison of plasma and serum samples prepared from the same subjects and demonstrated negligible differences in the numbers of peptide and protein identifications or in the overall percentages of semi-tryptic peptides or methionine oxidized peptides between these two biological fluids indicating there is no significant difference between heparinized plasma and serum in terms of overall protein diversity except the depletion of fibrinogen in serum [24] . earlier reports suggest that proteome level analysis of different biological fluids specifically serum/plasma is very effective in studying disease pathogenesis, pathogen-induced alteration in host, and host responses in different vectorborne infectious diseases like malaria [19] , dengue [25] and leishmaniasis [26] . however, over the last decade, only few proteomic studies have been reported on this zoonotic infectious disease. again, most of the previous proteomic studies have focused on the proteome and immunome profile of the pathogen [11, 17, 18] , and very few studies have investigated the effect of leptospiral infection on host proteome. to this end, two recent studies have reported proteomic analysis of leptospires excreted in urine of chronically infected natural reservoir host rattus norvegicus [15, 16] , but, no proteome level analysis has been reported hitherto to describe alterations of human serum proteins and related biological pathways due to leptospiral infection. 
investigation of the pathogen-induced alterations in the host serum proteome has immense clinical relevance for diagnosis and prognosis. the present proteomic study aimed to perform human serum proteome analysis of leptospirosis patients from an endemic area in india, and a comparative analysis with healthy subjects, to investigate disease pathobiology and host responses against leptospiral infections. additionally, another clinically relevant infectious disease, falciparum malaria, was analyzed to identify the generic febrile responses. leptospirosis and malaria have overlapping geographical distributions, and coexistence of these two pathogenic infections has been reported from different parts of the world, particularly in the tropics [27] [28] [29] [30]. although these two infections have many similar clinical presentations, quite a few serum proteins, including complement c3 precursor, ceruloplasmin precursor, complement component c9, complement factor h and inter-alpha-trypsin inhibitor heavy chain h4 precursor, exhibited altered expression levels in leptospirosis patients but not in falciparum malaria (febrile controls) (table s5; fig. s3). again, a few proteins, such as complement factor b precursor, ig kappa chain c region and ig mu chain c region, were found to exhibit opposite trends of differential expression between leptospirosis and falciparum malaria compared to the healthy controls. however, many of the identified targets were found to be similarly modulated in leptospirosis and falciparum malaria (table s5). all of the leptospirosis patients selected for this study were suffering from preliminary infection with 2-4 days of fever and had been treated with antibiotics and antipyretics prior to sample collection. the mean age of the leptospirosis patients selected for this proteomic analysis was 30.5 years (sd = 8.31; range 23-42; median 26.5) (table s1).
selection of healthy and febrile control populations with comparable age distributions, with mean ages of 28.7 years (sd = 5.98; range 21-34; median 26.5) and 30.4 years (sd = 9.60; range 20-45; median 28.5), respectively, allowed uniform population profiles to be maintained for differential protein expression analysis. since rigid inclusion criteria were maintained (table s1.1) during the selection of patients, and subjects with any significant past history of disease were excluded, minimal variations were observed in the serum proteome profiles of the patients suffering from leptospiral infection. aiming at analysis of host serum proteome alterations due to leptospirosis, we have identified several differentially expressed proteins and modulation of multiple physiological processes and pathways, including inflammation-mediated acute phase responses, complement pathways, heterotrimeric g-protein signaling pathways, the coagulation cascade and hemostasis, in patients suffering from leptospiral infection. some of our identified proteins, such as ig mu chain c, clusterin precursor and different complement factors, and associated pathways like acute phase response signaling and complement and coagulation cascades, have been correlated with leptospiral infection earlier, although not through proteomic studies, which supports our findings and enhances confidence in this study. further, proteomic analysis revealed differential expression of a few serum proteins that have not been reported previously in the context of leptospirosis (tables 1 and 2): α-1-antitrypsin precursor (2.29- to 1.81-fold), vitronectin precursor (2.26-fold), α-1-antichymotrypsin precursor (1.40- to 1.68-fold) and ceruloplasmin precursor (1.19- to 1.59-fold) were up-regulated, whereas regulator of g-protein signaling 7 (1.62-fold) and apolipoprotein a-iv (1.91-fold) were down-regulated.
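the fold changes quoted above are given as magnitudes of at least 1 together with a direction (up- or down-regulated). a minimal sketch of one common way to express an abundance ratio in that form is shown below; it assumes that a ratio below 1 is reported as its reciprocal with the label "down", which is a convention of several gel-analysis tools and is not confirmed by the text as the procedure used in this study. the function name and the intensity values are hypothetical.

```python
from typing import Tuple

def fold_change(patient: float, control: float) -> Tuple[float, str]:
    """Return (magnitude >= 1, direction) for a patient/control abundance ratio.

    Assumed convention: ratios below 1 are reported as their reciprocal
    and labelled "down"; this is an illustration, not the study's method.
    """
    if patient <= 0 or control <= 0:
        raise ValueError("abundances must be positive")
    ratio = patient / control
    if ratio >= 1:
        return ratio, "up"
    return 1 / ratio, "down"

# Hypothetical normalized intensities, chosen only to illustrate the convention.
up_mag, up_dir = fold_change(patient=149.0, control=100.0)
down_mag, down_dir = fold_change(patient=50.0, control=100.0)
print(round(up_mag, 2), up_dir)
print(round(down_mag, 2), down_dir)
```

reporting magnitude and direction separately keeps up- and down-regulated proteins on the same scale, which is how the values are listed in the text.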
altered expression of multiple acute phase proteins (apps), including ceruloplasmin, α-1-antichymotrypsin, α-1-antitrypsin, apolipoprotein a-i precursor and inter-alpha-trypsin inhibitor heavy chain h4 precursor, has been identified in leptospirosis patients (table 2). apps are a group of serum proteins that exhibit differential expression during various acute phase responses and are involved in host-adaptive and host-defense mechanisms, including opsonizing activities, stimulation of phagocytic killing of invading pathogens and many other specific actions such as protein transport, antioxidant activity and inhibition of serum serine proteinases [31]. inflammation-mediated acute phase signaling has been reported previously in various parasitic and viral infectious diseases like malaria [19], dengue [25] and severe acute respiratory syndrome [32]. cytokines play a pivotal role in stimulating the production of apps as a part of the host immune response against the infection. investigation of the precise biological significance of these apps and their association with leptospiral infection may provide some insight into the disease pathogenesis. our proteomic analysis reveals altered expression of a few members of the blood coagulation cascade, such as alpha-1-antitrypsin precursor and complement factors b and h, in the patients suffering from leptospiral infection (table s6). although thrombocytopenia has been found to be associated with the majority of leptospirosis patients, the precise reasons and pathophysiological mechanisms responsible for bleeding in this zoonosis are not clear [33]. previous studies have demonstrated disseminated intravascular coagulation (dic) as an important feature of leptospirosis [34]. inflammation-induced coagulation activation leading to thrombocytopenia is a consistently detectable phenomenon in infectious diseases and has been reported earlier in the context of falciparum malaria [35] and dengue virus infection [36].
bleeding tendency in leptospirosis may arise from an imbalance in the hemostatic equilibrium, and the toxic effect exerted by the pathogen on bone marrow leads to thrombocytopenia [37, 38]. however, the precise role of activation of the coagulation system in the pathogenesis of leptospirosis needs to be established. another interesting finding is the modulation of the alternative and lectin complement pathways in this zoonotic infectious disease (fig. s6). after being activated by the innate immune system of the host, the complement cascade plays an important role in recognition and destruction of invading pathogens. previous studies have also shown activation of the lectin pathway of the complement system through elevated serum levels of mannose-binding lectin, which correlate well with the severity of clinical signs of leptospirosis [39]. in order to survive within host cells and escape the defense system, pathogenic microorganisms adopt versatile mechanisms, including inhibition of complement cascades [40]. in this proteomic analysis we have identified down-regulation of a few complement factors, including complement c3 precursor (2.21-fold) and complement c4 precursor (1.89-fold), and of some complement regulatory proteins like clusterin (2.16-fold) in the leptospirosis patients (table 2). in leptospiral infection, the soluble complement regulator factor h prevents complement activation at the c3 stage by inhibiting the c3-convertase and inactivating c3b into ic3b [41]. to escape clearance and destruction by the host complement system, virulent pathogens sometimes acquire fluid-phase regulators of complement pathways, such as factor h and c4b-binding protein (c4bp), on the cellular surface [42, 43]. previous studies have demonstrated immune evasion of leptospira species by acquisition of human complement regulator factor h, factor h-related protein 1 (fhr-1) and c4bp [41, 44, 45].
the decreased expression levels of complement factors and regulatory proteins might be a consequence of the leptospiral infection, probably through the deposition of complement inhibition molecules on the pathogen cell surface or by some other unknown mechanisms employed by this pathogenic spirochete after invasion. detailed investigation of the functional aspects of complement factors and regulatory proteins might provide interesting insights into the disease pathogenesis and aid in identification of potential therapeutic targets. although leptospiral infection in humans generally remains at a subclinical level or results in mild self-limiting systemic illness, it may cause severe infection with life-threatening complications like septic shock, multi-organ failure and lethal pulmonary hemorrhages, leading to mortality if not diagnosed and treated in time. in the developing countries, lack of awareness of the disease and inaccessibility of appropriate laboratory diagnostic facilities are the major causes behind the rapid transmission and mortality associated with this vector-borne infectious disease [46]. diagnosis of leptospirosis is generally performed through the detection of igm antibodies against the pathogen by rapid screening tests [47]. the microscopic agglutination test (mat) and elisa-based immunoassays are the most common approaches for diagnosis of this acute bacterial infection and are carried out at different health facilities. since serological tests become positive only after one week, hematology and urine analysis are also performed to identify different physiological abnormalities, such as increased erythrocyte sedimentation rate, breathlessness, bleeding tendencies, proteinuria, hematuria, thrombocytopenia and elevated levels of alkaline phosphatase, serum creatinine phosphokinase, sgot and sgpt, that can be utilized as indicators of leptospiral infection [48, 49].
misdiagnosis of leptospirosis sometimes occurs because of its broad spectrum of symptoms, which may mimic the clinical signs of many related infectious diseases, such as dengue fever, hantavirus infection and malaria [23]. moreover, mixed infections with malaria or dengue along with leptospirosis also complicate the diagnostic process. establishment of a panel of serological markers that can confirm leptospiral infection with high accuracy and efficiently discriminate it from other related infectious diseases would be extremely valuable from the diagnostic point of view. while the major emphasis of the study was to provide better insight into the underlying molecular mechanisms of the disease pathogenesis and host immune response in leptospirosis, one possible outcome of this study could be the establishment of early detection surrogates for the disease to meet the need for better diagnostics and effective therapy. among the identified differentially expressed serum proteins, apolipoprotein a-i, clusterin, vitronectin precursor and α-1b-glycoprotein precursor could further be investigated as inflammation-related biomarkers of leptospiral infection. in summary, this is the first study to investigate leptospira-induced alterations in the human serum proteome. comprehensive proteomic analysis using classical 2de and 2d-dige in combination with maldi-tof/tof ms revealed differential expression of several serum proteins in leptospirosis patients compared to healthy individuals. our results suggest that the identified proteins are associated with various essential physiological processes and biological functions.
further investigation of functional properties of proteins identified in this study is likely to elucidate better understanding of disease pathobiology and may help to establish candidate surrogate markers proteins for detection of leptospiral infection and discrimination from other related infectious diseases. clinical presentation of leptospirosis: a retrospective study of 201 patients in a metropolitan city of brazil urban epidemic of severe leptospirosis in brazil salvador leptospirosis study group leptospirosis: a zoonotic disease of global importance epidemic leptospirosis associated with pulmonary hemorrhage-nicaragua unique physiological and pathogenic features of leptospira interrogans revealed by whole-genome sequencing proteomics in leptospirosis research: towards molecular diagnostics and vaccine development discovery of urinary biomarkers proteomic technologies for the identification of disease biomarkers in serum: advances and challenges ahead proteome and immunome of pathogenic leptospira spp. 
revealed by 2de and 2de-immunoblotting with immune serum global proteome analysis of leptospira interrogans analysis of differential proteomes in pathogenic and non-pathogenic leptospira: potential pathogenic and virulence factors high-coverage proteome analysis reveals the first insight of protein modification systems in the pathogenic spirochete leptospira interrogans comparative proteomic analysis of differentially expressed proteins in the urine of reservoir hosts of leptospirosis proteomic analysis of leptospira interrogans shed in urine of chronically infected hosts proteome-wide cellular protein concentrations of the human pathogen leptospira interrogans visual proteomics of the human pathogen leptospira interrogans serum proteome analysis of vivax malaria: an insight into the disease pathogenesis and host immune response applications for protein sequence-function evolution data: mrna/protein expression analysis and coding snp scoring tools systematic and integrative analysis of large gene lists using david bioinformatics resources leptospirosis: an emerging global public health problem leptospirosis in the asia pacific region global stability of plasma proteomes for mass spectrometry-based analyses two dimensional difference gel electrophoresis (dige) analysis of plasmas from dengue fever patients two-dimensional difference gel electrophoresis (dige) analysis of sera from visceral leishmaniasis patients severe sepsis due to severe falciparum malaria and leptospirosis co-infection treated with activated protein c co-infection of malaria and leptospirosis study of coinciding foci of malaria and leptospirosis in the peruvian amazon area coexistence of leptospirosis with falciparum malaria relation of serum retinol to acute phase proteins and malarial morbidity in papua new guinea children plasma proteome of severe acute respiratory syndrome analyzed by two-dimensional gel electrophoresis and mass spectrometry thrombocytopenia in leptospirosis activation of the 
coagulation cascade in patients with leptospirosis does activation of the blood coagulation cascade have a role in malaria pathogenesis? activation of coagulation and fibrinolysis during dengue virus infection erythroid hypoplasia associated with leptospirosis what role do coagulation disorders play in the pathogenesis of leptospirosis high levels of serum mannose-binding lectin are associated with the severity of clinical signs of leptospirosis complement-resistance mechanisms of bacteria regulation of complement activation at the c3-level by serum resistant leptospires c4bp binding to porin mediates stable serum resistance of neisseria gonorrhoeae role of the hypervariable region in streptococcal m proteins: binding of a human complement inhibitor leptospiral immunoglobulin-like proteins interact with human complement regulators factor h, fhl-1, fhr-1, and c4bp immune evasion of leptospira species by acquisition of human complement regulator c4bp leptospirosis in madras-a clinical and serological study elisa for the detection of specific igm and igg in human leptospirosis world health organization. guidelines for prevention and control of leptospirosis strategies for diagnosis and treatment of suspected leptospirosis: a cost-benefit analysis we are grateful to all the leptospirosis patients and healthy volunteers who participated in our study. the active support from dr. srinivasarao chennareddy, wipro ge healthcare, mumbai 400093, india in data analysis process is gratefully acknowledged. we would also like to thank dr. priyanka parte and sumit bhutada, national institute for research in reproductive health (nirrh), parel, mumbai for support in performing the scanning part of 2d-dige experiment.funding: this research was supported by a start-up grant 09ircc007 from the iit bombay to ss. 
key: cord-000372-wzwpyvll authors: castelló, alfredo; álvarez, enrique; carrasco, luis title: the multifaceted poliovirus 2a protease: regulation of gene expression by picornavirus proteases date: 2011-04-14 journal: j biomed biotechnol doi: 10.1155/2011/369648 sha: doc_id: 372 cord_uid: wzwpyvll after entry into animal cells, most viruses hijack essential components involved in gene expression. this is the case of poliovirus, which abrogates cellular translation soon after virus internalization. abrogation is achieved by cleavage of both eif4gi and eif4gii by the viral protease 2a. apart from the interference of poliovirus with cellular protein synthesis, other gene expression steps such as rna and protein trafficking between nucleus and cytoplasm are also altered. poliovirus 2a(pro) is capable of hydrolyzing components of the nuclear pore, thus preventing an efficient antiviral response by the host cell. here, we compare in detail poliovirus 2a(pro) with other viral proteins (from picornaviruses and unrelated families) as regard to their activity on key host factors that control gene expression. it is possible that future analyses to determine the cellular proteins targeted by 2a(pro) will uncover other cellular functions ablated by poliovirus infection. further understanding of the cellular proteins hydrolyzed by 2a(pro) will add further insight into the molecular mechanism by which poliovirus and other viruses interact with the host cell. a great variety of animal viruses encode for proteases that accomplish crucial functions during the biological cycle of the virus [1] . usually, the main function of these proteases is to proteolyze viral polypeptide precursors to render mature viral proteins that form part of viral capsids or participate in virus vegetative processes [2] . 
although both dna and rna viruses can encode proteases, the proteolytic tailoring of polypeptide precursors is most common among viruses with positive single-stranded rna genomes, such as picornaviruses, flaviviruses, caliciviruses, and retroviruses [3] [4] [5] [6] [7]. this mechanism of gene expression by proteolytic processing serves to compress the genetic information of viruses into the limited space provided by the genome. in this manner, viruses reduce the genetic space occupied by 5′ and 3′ untranslated regions (utrs): the signals devoted to mrna transcription and to initiation of translation are minimal, such that, for instance, in the case of picornaviruses or flaviviruses, only one 5′ and one 3′ utr are necessary for viral replication, transcription, translation, and morphogenesis, despite the fact that several viral proteins are synthesized by the infected cells. in addition, a number of polypeptide precursors may exhibit functions that differ from those of their mature products. in the case of poliovirus (pv), eleven mature proteins are produced from a single translation initiation event, and at least two precursors, 2bc and 3cd, accomplish functions which are not present in their mature proteins. taken together, these considerations indicate that the "proteolytic strategy" provides the small rna viruses with an advantageous and efficient mechanism for distributing the genome to accomplish all the viral biological functions within the smallest genetic space. apart from the generation of active viral proteins that participate in capsid morphogenesis and genome replication, viral proteases may also target a number of cellular proteins. proteolysis of these cellular substrates can greatly affect a variety of cellular processes and play an important role in virus-induced cytopathogenesis [8, 9]. in this regard, productive poliovirus infection induces rapid morphological alterations in the host cell.
among them, the most prevalent is the accumulation of numerous membranous vesicles in the cytoplasm, derived from the endoplasmic reticulum, in which the viral proteins 2c and 2bc play a central role [10]. in addition, cellular shape is modified upon viral replication, giving rise to cell rounding, which is most probably induced by disorders in the cytoskeletal network [11]. finally, chromatin condenses at late times postinfection, associated with the nuclear envelope except at sites where nuclear pores are placed [11]. interestingly, individual expression of the viral proteases 2a pro and 3c pro leads to the induction of most of these cytopathic effects, supporting the idea that these proteases actively contribute to the virus-induced morphological changes [12]. indeed, long-term expression of either 2a pro or 3c pro triggers the activation of caspases and, thus, cell death by apoptosis [11, 12], reflecting the strong cytotoxicity of both proteases. in addition to the cytopathic effects induced by 2a pro and 3c pro, hydrolysis of host proteins may impact other cellular functions such as the antiviral responses to virus infection. activation of innate immunity pathways, as well as the establishment of an antiviral response, is absolutely dependent on signals traversing the nuclear membrane through the nuclear pore complex. therefore, many viruses block cellular gene expression at different levels, that is, translation, transcription or protein and rna trafficking between nucleus and cytoplasm. the blockade of active trafficking can inhibit the nuclear import of antiviral signals or prevent the export of cellular mrnas detrimental to virus processes. all these effects can be achieved by hydrolysis of specific cellular proteins. the precise number of cellular proteins degraded by a viral protease, which is known as the "degradome," still remains unknown for any given viral protease. perhaps one of the best-studied proteases in this respect is pv 2a pro.
the discovery that pv 2a pro bisects the translation initiation factor eif4g, leading to the regulation of translation in the infected cells, has attracted much attention from many laboratories during the past three decades [13, 14]. more recently, 2a pro has been implicated in the alteration of rna and protein trafficking between the nucleus and the cytoplasm upon proteolysis of several nucleoporins [15] [16] [17]. the present paper focuses on the multifaceted activities of 2a pro and its regulation of different viral and cellular processes. pv is a prototype member of the picornaviridae family that infects cells of human or simian origin cytolytically or persistently and is responsible for poliomyelitis in humans [18]. the rna genome is housed in a naked capsid formed by 60 copies of each of the four structural proteins: vp1, vp2, vp3, and vp4. the infectious cycle commences with the attachment of a viral particle to cellular receptors present at the cell surface [19, 20]. this interaction leads to virion internalization and destabilization of the capsid, which adopts a less compact structure. once the rna is released in the cytoplasm, it interacts with the translational machinery, directing the synthesis of viral proteins during the early phase of infection. the pv genome is composed of a single-stranded rna copy of positive polarity of about 7.4 kb [21, 22]. this rna molecule is uncapped and contains a poly(a) tail at its 3′ end and a single open reading frame, which encodes a polyprotein of about two thousand amino acid residues. this polyprotein is proteolytically processed, giving rise to the mature viral proteins [23] (figure 1(a)). three different cleavages can be distinguished on the viral polyprotein: (i) polysomal cleavages that are produced on the nascent polypeptide chain.
the first of these cleavages is catalyzed by 2a pro at its amino terminus, separating the p1 precursor that encodes the structural proteins from the rest of the nonstructural polypeptides (figure 1(b)). the second cleavage, still on polysomes, is performed by 3c pro, releasing the p2 precursor (2abc) from p3 (3abcd); (ii) cytoplasmic cleavages that are mostly exerted by 3c pro; and (iii) hydrolysis of vp0 (vp4-vp2), which is concomitant with the morphogenesis of virus particles [2, 23]. all these hydrolytic events lead to the formation of eleven mature proteins and several precursors such as p1, p2, p3, vp0, vp3, vp1, 2bc, 3ab, and 3cd. this last precursor, 3cd, can be used as substrate by 2a pro or 3c pro. the alternative cleavage carried out by 2a pro renders the mature products 3c′ and 3d′, whereas 3c pro generates the canonical proteins 3c pro and 3d pol. however, the biological significance of this alternative cleavage is obscure because pv mutated at the 2a pro cleavage site on 3cd does not exhibit defects in virus replication [24]. the nonstructural proteins that are generated participate in the replication of viral genomes [25, 26]. to this end, the positive rna genome is recognized at its 3′ end by proteins of the replication complex to synthesize the complementary rna strand of negative polarity. in this process, the 3b protein, also known as vpg, acts as a primer to initiate viral rna transcription [27]. this leads to the formation of a double-stranded rna molecule, also known as the replicative form. the negative rna synthesized serves in turn as a template to direct the synthesis of several copies of positive rna, so this process leads to the production of several nascent rna molecules with a vpg molecule bound to their 5′ ends on the negative rna molecule, forming a replicative intermediate.
the positive rna molecules synthesized may participate in three processes: (i) as templates for synthesizing more negative rna molecules; (ii) as mrnas that will be engaged in translation; and (iii) as genomes that will be encapsidated in new viral particles. in picornaviruses, the only type of mrna molecule known is exactly the same as the genome. once the synthesis of several thousands of positive rna molecules is performed, the late phase of translation takes place, giving rise also during this period to a great amount of viral proteins, some of which will participate in virus morphogenesis. this late phase of infection is preceded by the abrogation of cellular mrna translation, such that only viral proteins are being synthesized late in the pv life cycle [14]. in the case of picornaviruses, transcription is dependent on continuous viral protein synthesis [28]. thus, inhibition of viral mrna translation provokes the sudden blockade of viral rna synthesis. moreover, translation is coupled to transcription, such that viral rnas transfected into picornavirus-infected cells are not able to direct protein synthesis [29]. therefore, these two processes of viral macromolecular biosynthesis are tightly coupled, making it difficult to determine exactly the function affected in some pv mutants. notably, continuous lipid and cellular membrane synthesis is also necessary for pv rna synthesis [30]. the morphogenesis of progeny virions in pv-infected cells is observed concomitantly with viral rna translation and replication. the release of new viral particles takes place by cell lysis, due to membrane permeabilization that occurs at the late phase of infection [31]. viroporin 2b and its precursor 2bc are responsible for this permeabilization upon the formation of pore channels in cellular membranes [32, 33]. pv has represented a useful model to gain insight into diverse aspects of molecular biology and gene expression. a number of discoveries concerning animal viruses with rna genomes were initially made in pv. for example, the presence of uncapped mrna, the sequencing and development of an infectious cdna clone, the three-dimensional structure of a virus particle, the discovery of the ires elements, the synthesis of an infectious virus in a cell-free system, and the chemical synthesis of a complete viral genome, among others, were initially reported in pv [34] [35] [36] [37] [38]. in addition, the first time that eif4g was found proteolytically cleaved was in pv-infected cells [39].
figure 1: (a) structure of poliovirus genome and proteolytic processing of its polyprotein. the pv genome consists of a single-stranded, positive-sense polarity rna molecule, which encodes a single polyprotein. the 5′ nontranslated region (ntr) is covalently linked to the viral protein vpg. the pv genome is polyadenylated (a n) at its 3′ end. the polyprotein contains four structural (p1) and seven nonstructural (p2 and p3) proteins that are released from the polypeptide chain by proteolytic processing mediated by the viral-encoded proteinases 2a pro and 3c pro /3cd pro. the intermediate products of processing 2bc, 3cd, and 3ab exhibit functions distinct from those of their respective final cleavage products. the alternative cleavage carried out by 2a pro rendering the mature products 3c′ and 3d′ is also shown. (b) once the ribosome has synthesized the pv 2a pro sequence and continues translation on the p2 region, the autocatalytic activity of pv 2a pro is manifested by cleaving itself at its amino terminus still on the nascent polypeptide chain. this cleavage liberates the p1 precursor that will render the capsid proteins on subsequent proteolytic events catalyzed by 3c pro or 3cd pro. cleavage at the carboxy terminus of pv 2a pro on the p2 precursor is accomplished by pv 3c pro, leaving free 2a pro and generating the 2bc precursor.
picornaviruses encode different proteases depending on the virus species, although all of them encode 3c pro and its precursor 3cd pro (figure 2). in pv, both of these exhibit protease activity, and they execute most of the hydrolytic events on the viral polyprotein [2, 23, 40]. apart from these two proteases, 3c pro and 3cd pro, picornaviruses also contain a 2a gene, whose product in some species exhibits proteolytic activity, as is the case for pv (figure 2). 2a pro has a limited proteolytic effect on the polyprotein and its function is most probably one of altering cellular functions by the cleavage of a number of cellular proteins. in this regard, the best studied of these cleavages is the bisection of eif4g (figure 3) [14]. the general organization of picornavirus genomes is to encode p1-p2-p3 precursors giving rise to 4-3-4 mature products. some picornavirus species, in addition, encode a leader protein (l) placed before p1 (figure 2) [22]. in the case of aphthoviruses, such as foot-and-mouth disease virus (fmdv), the l protein has proteolytic activity and is known as l pro [42, 43]. during polyprotein synthesis l pro is the first protein synthesized and its autoproteolytic activity releases it from the rest of the polypeptide chain. thus, the only known hydrolysis executed by l pro on the polyprotein is between its carboxy terminus and the amino terminus of vp4 (figure 2). because l pro does not play a direct role in viral replication [44] and its protease activity has a limited impact on the viral polyprotein, this protease may be involved in the interaction with the host cell [45] [46] [47]. indeed, l pro also exhibits proteolytic activity on eif4g, acting at a position close to that of pv 2a pro (figure 3) [48] [49] [50].
in the case of fmdv, the 2a protein is reduced to a small peptide of 18 residues that does not hydrolyze eif4g; instead it induces the release of 2a at its carboxy terminus by a ribosomal skip mechanism [51, 52]. this model proposes that fmdv 2a modifies the activity of the ribosome to promote hydrolysis of the peptidyl(2a)-trna(gly) ester linkage at the c-terminus of 2a, thereby releasing the polypeptide from the translational complex. however, not all l or 2a proteins from picornaviruses exhibit protease activity: in the case of emcv, which encodes both proteins, neither has been demonstrated to possess proteolytic activity [53]. all known proteases have been classified in four classes and many subgroups according to three parameters: (i) their catalytic center, (ii) their substrate specificity, and (iii) their three-dimensional structure. the classification of the different picornavirus proteases initially relied upon the effect of protease inhibitors. compounds that blocked sulphydryl groups abrogated the proteolytic activity of 2a pro and 3c pro, suggesting that the nucleophilic amino acid in the active site was cysteine [54, 55]. however, another cysteine protease inhibitor, e64, had no effect on these proteases, while l pro activity was inhibited not only by sulphydryl-active compounds but also by e64 [56, 57]. these findings together with structural observations imply that l pro belongs to the class of papain-like cysteine proteinases [43, [58] [59] [60]. comparison of the structure of picornavirus proteases with prototypes of cellular ones revealed that both 2a pro and 3c pro are similar in structure to the chymotrypsin-like group [61] [62] [63]. picornavirus 3c pro shows similarities to the staphylococcus aureus proteinase, whereas 2a pro is more akin to streptomyces griseus proteinase a. pv 2a pro is a protein composed of 149 amino acids that belongs to the cysteine protease group [54].
pv 2a pro is autocatalytically processed at its amino terminus between the capsid protein vp1 and 2a [64] (see figure 1(b)). the determinants of substrate specificity of pv 2a pro have been investigated in detail by identification of cleavage sites by n-terminal edman degradation, by mutational analysis, and by using synthetic peptides as substrates [34, [65] [66] [67]. pv 2a pro can recognize a wide variety of amino acid residues at the p1 position. the determinants of substrate specificity for pv 2a pro lie at positions p4, p2, p1', and p2', which are preferentially occupied by ile/leu, thr/ser, gly, and pro, respectively. moreover, the determinants of substrate specificity of hrv and coxsackievirus 2a pro are very similar to those found for pv 2a pro [65] [66] [67]. the yeast two-hybrid system has been used to identify the substrate sequences interacting with pv 2a pro. all the sequences identified contain the leu-x-thr-z motif (x for any amino acid; z for a hydrophobic residue) in positions p4 to p1, suggesting the presence of a common interacting site on pv 2a pro substrates [68]. several 2a pro variants have been generated in the entire pv genome or in the isolated 2a gene. generation of pv 2a pro mutants was initially used to identify cys 106, his 18, and asp 35 as the residues that form the catalytic triad of 2a pro [69, 70]. these data have been confirmed in the structure of hrv2 2a pro [71]. the role of the conserved cys and his residues in the structure-function relationship has also been studied by mutagenesis. the residues cys 55, cys 57, cys 115, and his 117 play a critical role in the cis and trans proteolytic activity by maintaining 2a pro structure [72]. the structure of hrv2 2a pro shows that the zn 2+ ion is coordinated tetrahedrally by the side chains of these conserved cys and his residues.
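The residue preferences described above can be expressed as simple sequence patterns. The following is a minimal sketch, not a validated predictor: it assumes one-letter amino acid codes, and the set of residues counted as "hydrophobic" (z) is an illustrative assumption, since the text does not enumerate them. The function name `find_sites` and the test peptide used below are hypothetical.

```python
import re

# Assumption: "hydrophobic" (z) is taken here as A, V, L, I, M, F, W.
# The text does not list the residues, so this set is illustrative only.
HYDROPHOBIC = "AVLIMFW"

# leu-x-thr-z occupying positions P4 to P1 (yeast two-hybrid result).
# A lookahead is used so that overlapping matches are also reported.
TWO_HYBRID_MOTIF = re.compile(rf"(?=(L.T[{HYDROPHOBIC}]))")

# Preferences from the peptide studies: P4 ile/leu, P2 thr/ser,
# P1' gly, P2' pro; P3 and P1 are left unconstrained.
CLEAVAGE_PREFERENCE = re.compile(r"(?=([IL].[TS].GP))")


def find_sites(pattern, sequence):
    """Return (0-based start index, matched text) for every match."""
    return [(m.start(1), m.group(1)) for m in pattern.finditer(sequence.upper())]
```

For a made-up peptide such as `AAITTYGPAALSTMGG`, `find_sites(CLEAVAGE_PREFERENCE, ...)` reports the stretch `ITTYGP`, in which the scissile bond would lie between the fourth and fifth residues (P1 and P1'). A pattern match only indicates compatibility with the stated preferences, not that cleavage actually occurs.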
this zn 2+ ion is tightly bound near the c-terminal domain and may be important for the stability of 2a pro [71, 73, 74]. the yeast saccharomyces cerevisiae has been used as a system to obtain pv 2a pro variants [75]. the fact that this protease is very toxic for yeast has been exploited to generate 2a pro variants devoid of this cytotoxicity. using this approach, several pv 2a pro variants unable to cleave eif4g have been obtained. the characterization of these mutants revealed a region in 2a pro involved in the interaction with substrates, but none of the mutations were found in the catalytic triad. a parallelism has been observed between the ability of these pv 2a pro variants to block protein synthesis and to cleave eif4g [76]. pv mutants in the 2a gene that lack trans but not cis proteolytic activity have also been identified. normal processing of the viral polyprotein is observed upon infection with these pv variants, whereas eif4g remains intact in these cells [77]. interestingly, rna replication of those mutant viruses is hampered, suggesting that there is a correlation between pv rna replication and the trans activity of 2a pro. it is controversial whether pv 2a pro contributes directly to viral replication or not. although for many years it was thought that pv 2a pro plays a direct role in pv replication, more recent studies have shown that a full-length dicistronic pv construct lacking 2a pro is capable of giving rise to progeny viruses. moreover, the virus yield of pv variants lacking the p1 coding region is partially restored when p1 is expressed in trans, suggesting that cleavage of the viral polyprotein by pv 2a pro is not essential for viral replication [78]. however, it is known that 2a pro is important for inducing the cytopathic effect and for avoiding the inhibition of pv replication in interferon (ifn) α treated cells [79].
in agreement with the idea that 2a pro participates in viral rna replication, a fraction of this protease localizes in pv replicative foci, although the majority of 2a pro is associated with the matrix structure in the cytoplasm of infected cells [80]. however, the presence of 2a pro in the proximity of replication complexes does not demonstrate that it participates directly in the replication process.
5.1. structure and functioning of eif4g.
the process of translation can be divided into different steps: initiation, elongation, termination and ribosome recycling. the synthesis of cellular proteins is highly regulated, and the most precisely controlled step is the initiation of translation (for a recent review, see [81, 82]). for most eukaryotic mrnas, the initiation of translation commences with the recognition of the cap structure (m7gpppn) and the poly(a) tail by the heterotrimeric complex eif4f and the poly(a)-binding protein (pabp), respectively, followed by the recruitment of the 43s preinitiation complex containing the 40s ribosomal subunit, the ternary complex met-trnai-eif2-gtp, and the eukaryotic initiation factors (eifs) 1, 1a, 3, and 5. then, the preinitiation complex scans along the 5' untranslated region (5' utr) until an aug initiation codon is encountered in a favourable context. the perfect complementarity between the aug start codon and the anticodon of met-trnai leads to the arrest of scanning and the hydrolysis of gtp in the ternary complex. the release of eif2-gdp and other factors triggers the interaction of the preinitiation complex with the 60s ribosomal subunit to form the 80s initiation complex, which proceeds to translate the mrna coding region. a number of eukaryotic initiation factors participate in both mrna binding and scanning of the 5' utr of the mrna by the small ribosomal subunit.
the cap-binding protein eif4e, together with the dead-box helicase eif4a and the translation initiation factor eif4g, forms the protein complex eif4f. the heterotrimeric complex eif4f is required for recruiting the 43s preinitiation complex onto the cap structure located at the 5' end of the mrna [83]. eif4g, the largest polypeptide of eif4f, functions as an adaptor molecule that bridges mrnas to ribosomes via interactions with eif4e (which binds the 5' cap structure), pabp (which binds the poly(a) tail), and eif3, which interacts with the 40s ribosomal subunit (figure 3) [81, [84] [85] [86]. in addition, eif4g also contains binding sites for other polypeptides involved in translation, such as the rna helicase eif4a (figure 3) [87], which is required to unwind the secondary structure within the mrna 5' leader sequence that would otherwise inhibit ribosome scanning. the simultaneous interaction of eif4g with eif4e and pabp promotes circularization of the mrna in a closed loop that facilitates the initiation of new rounds of translation through the proximity of the 5' and 3' ends [88, 89]. furthermore, eif4g also interacts with the mitogen-activated protein kinase-interacting kinase 1 (mnk1) (figure 3), which phosphorylates eif4e, although the role of this phosphorylation in the initiation of translation is still unclear [90] [91] [92] [93] [94]. two forms of eif4g, known as eif4gi and eif4gii, have been identified in mammalian cells. the two forms show only 46% amino acid sequence identity, but they are thought to be functionally interchangeable due to the high homology in key domains that interact with other factors [88]. evidence obtained by specific depletion of each eif4g form, or by differential cleavage of each of them by specific proteases (see below), indicates that both factors must be lacking for complete abolition of protein synthesis [95, 96].
eif4gi is the dominant form in hela cells, in which the ratio between eif4gi and eif4gii is 9 : 1 [97]. however, the specific role of each form of eif4g in the initiation of translation remains unknown. it was proposed that the two eif4g forms are differentially regulated by different kinases, supporting the hypothesis that eif4gi and eif4gii could drive translation initiation differentially [98]. eif4gi is phosphorylated in response to serum and in a rapamycin-dependent manner at ser 1148, 1188, and 1232, although the role of these posttranslational modifications is still under investigation [99]. in addition, eif4gi is phosphorylated by p21-activated protein kinase (pak-2), which is induced under stress conditions. this phosphorylation takes place in the eif4e-binding site of eif4g and prevents the interaction between these two factors, inhibiting cap-dependent initiation of translation [100]. on the other hand, eif4gii is phosphorylated during mitosis [101] by calmodulin-dependent kinase i at ser 1156 [102]. therefore, the activity of eif4gi and eif4gii might be tightly and reversibly regulated by phosphorylation under different physiological conditions. nevertheless, this factor is also subjected to irreversible modifications such as caspase-mediated proteolysis, which is triggered during apoptosis and leads to shutoff of protein synthesis. caspase-3 directly cleaves eif4gi at positions 532 and 1175, removing the pabp-, mnk1-, and one of the eif4a-binding domains from the eif4gi core (figure 3) [103, 104]. in contrast, eif4gii is degraded during apoptosis with delayed kinetics relative to eif4gi proteolysis, correlating with the shutoff of protein synthesis [104]. furthermore, many eif4gi isoforms have been detected in hela cells and these are synthesized from several distinct mrnas via alternative promoter usage and alternative splicing [105].
the largest is the eif4gi-a isoform, which contains 1,600 residues, while the eif4gi-b, -c, -d, and -e isoforms are shorter variants [106]. it has been described that the longer isoforms are more active in translation initiation, most probably because they contain the pabp-binding site [107]. pv infection results in a rapid shutoff of host-cell protein synthesis, whereas viral mrna translation takes place efficiently [108]. it was initially observed that the inhibition of host-cell translation in pv-infected cells correlated with the proteolysis of a component of the eif4f complex with a molecular mass of about 220 kda (later identified as eif4g) [39, 109]. this cleavage is exerted by 2a pro and can be prevented both by mutations that abolish the protease activity and by addition of 2a inhibitors [76, 110, 111]. interestingly, this proteolysis is more effective when eif4e is interacting with eif4g, suggesting that pv 2a pro preferentially acts on the eif4g pool involved in translation [112]. cleavage of eif4gi also occurs in cells infected with other picornaviruses such as hrv, coxsackievirus, and fmdv [48, 113, 114]. interestingly, 2a pro from pv, hrv, and coxsackieviruses cleaves eif4gi at position 681/682 (figure 3), suggesting that the specificity of enterovirus 2a proteases for the substrate determinants present in eif4gi is conserved [114, 115]. cleavage at position 681/682 separates the eif4e- and eif3-binding sites of eif4gi, contained in the n-terminal and c-terminal fragments respectively, thus decoupling the mrna and ribosome recruiting activities. pv 2a pro cleaves eif4gi directly and does not require any additional proteins for this process to occur [116, 117]. however, the fact that pv 2a pro is not copurified with eif4gi fragments from pv-infected cell extracts suggested that 2a pro induces the activation of a host protease, which in turn cleaves eif4g during pv infection [112].
in addition, it has been proposed that eif3 and an unknown host-cell protein could act as cofactors for eif4gi cleavage by pv 2a pro [118]. in this sense, zamora and colleagues suggested that pv infection activates at least two host-cell proteases, which together with pv 2a pro cleave eif4gi [119]. nevertheless, no additional evidence has been put forward to support this hypothesis and the identity of these host proteases has not yet been determined. many reports have demonstrated that the kinetics of protein synthesis shutoff and eif4gi cleavage are not correlated in pv-infected cells [120] [121] [122]. these data indicate that additional translation factors may be cleaved to achieve an efficient inhibition of cellular mrna translation. in this regard, additional reports showed that eif4gii is also proteolyzed by 2a pro during pv and hrv infections and that this cleavage takes place between amino acids 699/700, leading to a proteolytic pattern similar to that of eif4gi. interestingly, eif4gii is significantly more resistant to 2a pro -mediated cleavage than eif4gi, and the kinetics of protein synthesis shutoff closely correlates with eif4gii cleavage in pv- and hrv-infected cells [122, 123]. in those studies, gradi and colleagues proposed that hydrolysis of both eif4gi and eif4gii is required for achieving pv- and hrv-mediated inhibition of host-cell mrna translation and that the cleavage of eif4gii is the rate-limiting step in the shutoff of host-cell translation after infection with those viruses [122, 123]. a variety of approaches have been devised to express pv 2a pro in order to cleave eif4g in cultured cells or in cell-free systems. of these approaches, the most straightforward has been the addition of purified pv 2a pro, usually as a hybrid protein such as mbp-2a pro, to cell-free systems such as rabbit reticulocyte lysates (rrls) and hela and krebs-2 extracts [76, 118, [124] [125] [126] [127].
in this sense, the addition of about 1 to 5 μg mbp-2a pro suffices to hydrolyze eif4g in those cell-free systems [76, [125] [126] [127]. an alternative method to cleave eif4g in an in vitro system is the translation of an mrna encoding pv 2a pro in translation-competent extracts [50]. this assay has the advantage of providing genuine and freshly made pv 2a pro, leading to total cleavage of eif4g in the test tube after several minutes of translation. many different approaches have been explored in cultured cells, the most popular being transfection of plasmids encoding pv 2a pro into different eukaryotic cell types [75, 76, [128] [129] [130] [131] [132] [133] [134]. several plasmids have been utilized in this respect, and perhaps the most successful one is ptm1-2a, which is transfected into mammalian cells that transiently express t7 rna polymerase through infection with a recombinant vaccinia t7 virus [76, 130, 131, 133, 134]. the amount of protease synthesized in this system is similar to that found in pv-infected cells at late times of infection, but these amounts are reached 1-2 hours after transfection. similar results have been obtained in cells constitutively expressing t7 polymerase, which constitutes a less pleiotropic system because vaccinia virus proteins are not expressed (unpublished data). since pv 2a pro targets a number of different cellular proteins, which affect several cellular functions depending on the amount of protease synthesized (see below), in some instances it is useful to express 2a pro at low levels. we have explored many alternative methods in search of a system that allows us to control the levels of pv 2a pro in the cells. novoa and colleagues developed a protocol based on the addition of hybrid proteins bearing pv 2a pro. these recombinant proteins enter the cytoplasm upon cell membrane permeabilization by different methods, such as addition of mbp-2a pro mixed with replicationally inactive chicken adenovirus particles [135].
cleavage of eif4g following these protocols takes place after incubation for 8-10 hours, suggesting that the amount of protease internalized is probably low but sufficient to hydrolyze eif4gi in virtually all cultured cells [136]. probably one of the most attractive systems is a stable cell line that inducibly expresses pv 2a pro, obtained in two different laboratories including ours [137, 138]. in these cell lines, pv 2a pro is synthesized when tetracycline is removed from the culture medium, leading to low expression of pv 2a pro that induces efficient cleavage of eif4g by 13 h post induction, correlating with a potent inhibition of cellular translation. finally, long-term expression of pv 2a pro in tet-off cell lines triggers apoptosis [12, 137, 138]. the main drawback of these cell lines is the low-level escape of pv 2a pro expression under repression conditions, which gives rise to a basal cytotoxicity. probably the most efficient method is electroporation of an mrna encoding 2a pro under the control of the emcv leader sequence (ires-2a) [95]. the biggest advantage of this method is the capacity to regulate the level of 2a pro expression by controlling the amount of ires-2a transfected. for example, electroporation of 9 μg of ires-2a into ∼1.5 × 10^6 hela cells leads to total cleavage of both eif4gi and eif4gii in only 2 h, resulting in an almost complete shutoff of cellular protein synthesis. in contrast, electroporation of a low amount of ires-2a (1 μg) into hela cells induces efficient cleavage of eif4gi, whereas eif4gii remains largely intact. therefore, 9-fold more ires-2a mrna is required to cleave eif4gii compared to eif4gi [95]. based on the ires-2a mrna electroporation method we were able to induce the differential proteolysis of eif4gi and eif4gii in a time- and dose-dependent manner. the kinetics of protein synthesis shutoff and eif4gii cleavage are closely correlated in hela cells, resembling what was found in pv-infected cells [122].
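The doses quoted above can be put on a per-cell basis with a short calculation. This is only a sketch of the arithmetic: it assumes that the 1 μg dose was electroporated into the same number of cells as the 9 μg dose (the text states the cell number only for the higher dose), and it describes mrna input per cell, not the amount actually taken up. The function and constant names are illustrative.

```python
def mrna_input_per_cell_pg(mrna_ug, n_cells):
    """Average mRNA input per cell in picograms (1 ug = 1e6 pg)."""
    return mrna_ug * 1e6 / n_cells


N_CELLS = 1.5e6        # HeLa cells per electroporation (from the text)
HIGH_DOSE_UG = 9.0     # cleaves both eIF4GI and eIF4GII within 2 h
LOW_DOSE_UG = 1.0      # cleaves eIF4GI; eIF4GII largely intact
                       # (assumed here to use the same cell number)

high_pg = mrna_input_per_cell_pg(HIGH_DOSE_UG, N_CELLS)  # about 6 pg per cell
low_pg = mrna_input_per_cell_pg(LOW_DOSE_UG, N_CELLS)    # about 0.67 pg per cell
fold_difference = HIGH_DOSE_UG / LOW_DOSE_UG             # 9-fold
```

Under these assumptions the two regimes correspond to roughly 6 pg and 0.7 pg of ires-2a mrna supplied per cell, which is the 9-fold difference referred to in the text.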
in agreement with what was observed with the addition of exogenous recombinant proteins [136], translation of de novo synthesized mrnas showed higher susceptibility to low doses of pv 2a pro than that of mrnas already engaged in the translational machinery [95]. these results suggested a possible specific role of eif4gi in the pioneer round of translation, in agreement with a previous report [139]. however, specific ablation of eif4gi using sirnas induced only a moderate inhibition of luciferase synthesis from both de novo synthesized and preexisting mrna (about 40% in both cases) [96]. these findings reported by welnowska and colleagues indicated that the higher susceptibility of de novo synthesized mrna translation to low doses of ires-2a might be produced by an additional effect of pv 2a pro on another step of gene expression. in this regard, further studies demonstrated that the stronger impact of 2a pro on de novo synthesized mrnas is due to the concomitant inhibition of rna nuclear export by nucleoporin 98 cleavage, which is also achieved under these conditions (see below) [17]. interestingly, cellular mrnas are able to initiate translation after a polysome runoff induced by high salt treatment when eif4gi is totally cleaved by pv 2a pro, whereas initiation is completely abolished when both forms of eif4g are proteolyzed [95]. taken together, these data from cells expressing 2a pro [95, 136] as well as from pv-infected cells [120] [121] [122] allow us to conclude that complete shutoff of protein synthesis induced by pv 2a pro is achieved when both eif4gi and eif4gii are completely cleaved. therefore, when the level of one of the two populations of eif4g remains unaffected, either because it is not cleaved by pv 2a pro [95] or because it is not depleted by sirnas [96], extensive host protein synthesis takes place. infection with pv and coxsackievirus also leads to hydrolysis of pabp [140, 141].
this cleavage is carried out by pv 3c pro and by coxsackievirus 2a pro and 3c pro, and it might actively contribute to the host translational shutoff induced by these viruses [142, 143]. in conclusion, the proteolysis of different components of the translation initiation machinery by picornavirus proteases can account for the shutoff of host translation induced after infection, although the specific contribution of the hydrolysis of eif4gi, eif4gii, and pabp remains unclear. infection of animal cells with fmdv also leads to proteolysis of eif4g and to rapid inhibition of cellular translation [48]. the proteolysis of eif4g is carried out by the two virally encoded proteases l pro and 3c pro [144, 145]. l pro cleaves both eif4gi and eif4gii extremely rapidly at positions 674 (figure 3) and 700, respectively, located seven and one amino acids upstream of the 2a pro cleavage sites on eif4gi and eif4gii [50, 146]. the cleavage of eif4g by fmdv l pro results in the rapid shutoff of host-cell protein synthesis [145]. although the initial cleavage of eif4gi can be carried out by fmdv l pro in the absence of virus replication [145], a sequential cleavage of the c-terminal fragment of eif4gi by fmdv 3c pro also occurs in bhk cells at early stages of infection, concomitant with the shutdown of viral translation. the 3c pro cleavage site on eif4gi has been located at position 712, 38 amino acids downstream of the l pro cleavage site [147], although the role of this sequential cleavage is still unclear. the amino acid segment of eif4g located between the l pro and 3c pro cleavage sites binds rna and was suggested to be critical for mrna scanning by the preinitiation complex [148]. interestingly, this secondary cleavage does not occur in human cell lines due to an amino acid substitution at the cleavage site on eif4gi [147]. infection of cells with other picornaviruses such as cardioviruses (emcv and mengovirus) leads to a shutoff of host-cell protein synthesis.
however, eif4g remains unaffected in these cells. these findings indicate that apart from eif4g cleavage, there are other mechanisms by which picornaviruses may block host translation. in addition to picornavirus l pro and 2a pro, proteases from other viruses can also cleave eif4g. the protease of human immunodeficiency virus type-1 (hiv-1) hydrolyzes eif4gi during infection of human cd4+ cells [149]. the cleavage of eif4gi takes place at positions 718, 721, and 1125, separating it into three domains (figure 3) [149, 150]. interestingly, hiv-1 protease efficiently cleaves eif4gi, but not eif4gii, both in cell-free systems and in mammalian cells [151]. the differential sensitivity of eif4gi and eif4gii to hiv-1 protease is more selective than that observed with picornaviral 2a proteases [122, 123]. hiv-1 protease also cleaves pabp at positions 237 and 477, separating the first two rna-recognition motifs from the c-terminal domain of pabp [152]. cleavage of eif4gi and pabp by hiv-1 protease is sufficient to inhibit the translation of capped and polyadenylated mrnas in cell-free systems, as well as in transfected cells [127, 151]. in contrast, ires-driven translation is unaffected or even enhanced by hiv-1 pr after cleavage of both eif4gi and pabp [127, 151]. moreover, the translation of capped and polyadenylated hiv-1 genomic mrna remains unaffected in hela extracts under these conditions, suggesting that viral protein synthesis might persist at late phases of hiv-1 infection when those factors are cleaved [127]. in contrast, a previous report claimed that the hydrolysis of eif4gi impaired the translation of both capped and ires-driven mrnas in reticulocyte lysate assays [150]. however, the different effect observed by these authors on ires-driven translation may be due to differences already reported between hela extracts and rrl [143, 153].
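The eif4gi cleavage positions quoted in the text can be collected in a small lookup table to check the stated distances between sites. The sketch below uses only positions given in the text; the dictionary keys, the helper names, and the use of 1,600 residues as a protein length in the usage note are illustrative assumptions (the text does not say which isoform numbering the cleavage positions refer to).

```python
# Residue after which each protease is reported to cut eIF4GI
# (a site given as "681/682" is recorded as 681).
EIF4GI_CUTS = {
    "enterovirus 2A pro": [681],
    "FMDV L pro": [674],
    "FMDV 3C pro": [712],
    "HIV-1 protease": [718, 721, 1125],
}


def offset(site, reference):
    """Signed distance in residues; negative means upstream (N-terminal)."""
    return site - reference


def fragments(length, cut_sites):
    """Split residues 1..length into inclusive (start, end) fragments."""
    bounds = [0] + sorted(cut_sites) + [length]
    return [(a + 1, b) for a, b in zip(bounds, bounds[1:]) if b > a]


l_vs_2a = offset(EIF4GI_CUTS["FMDV L pro"][0], EIF4GI_CUTS["enterovirus 2A pro"][0])
c3_vs_l = offset(EIF4GI_CUTS["FMDV 3C pro"][0], EIF4GI_CUTS["FMDV L pro"][0])
```

Here `l_vs_2a` evaluates to -7 and `c3_vs_l` to 38, matching the "seven amino acids upstream" and "38 amino acids downstream" statements. As a purely illustrative use of `fragments`, a hypothetical 1,600-residue chain cut after residue 681 would yield fragments spanning residues 1-681 and 682-1600.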
in addition, eif4g cleavage is executed by proteases from other retrovirus species, such as hiv-2, simian immunodeficiency virus (siv), human t-cell leukemia virus (htlv-1), moloney murine leukemia virus (momlv), and mouse mammary tumor virus (mmtv). these proteases hydrolyze eif4gi and eif4gii with different cleavage patterns and kinetics [134]. indeed, several retroviruses, including hiv, siv, and momlv, promote the translation of their gag gene products by internal ribosome entry, indicating that eif4g cleavage could be compatible with viral protein synthesis in infected cells [154] [155] [156]. furthermore, cleavage of eif4gi and eif4gii also occurs in feline calicivirus-infected cells, although the cleavages occur at sites different from those observed for picornavirus proteases [157]. in addition, the 3c-like protease of two caliciviruses, like pv 3c pro, cleaves pabp, perhaps as a complementary strategy to inhibit cellular translation [158]. the fact that proteases from many picornaviruses, retroviruses and caliciviruses target eif4g and, in some cases, pabp strongly suggests that those viruses may share a common mechanism to regulate cellular and viral translation. however, further investigation is required to determine the specific contribution of eif4gi, eif4gii and pabp to the shutoff of host-cell translation and to virus protein synthesis. the biological cycle of picornaviruses is confined to the cytoplasm of infected cells. however, some of the pv proteins are able to target nuclear proteins such as transcription and splicing factors and proteins involved in nuclear-cytoplasmic trafficking. one of the best studied cases in this regard is the cleavage of nucleoporins (nups), components of the nuclear pore complex (npc), by pv and hrv 2a pro, which directly impacts the nuclear-cytoplasmic trafficking of proteins and rnas [15, 16, 159].
in a complementary manner, emcv l protein affects the phosphorylation status of nup62 and induces effects similar to those described for pv 2a pro on the transport of macromolecules through the npc [160]. nevertheless, nups are targets not only for picornavirus proteins but also for the matrix (m) protein from vesicular stomatitis virus (vsv) [161] and the nonstructural protein 1 (ns1) from influenza virus [162], which also impair components of the host nuclear export-import machinery by analogous mechanisms. taking into account that proteins from different positive- and negative-strand rna viruses target nups, we highlight in this paper the npc as a key target for viral proteins, although the possible role of the npc in the viral biological cycle is still under intensive research. nucleus and cytoplasm are physically separated by a semipermeable barrier known as the nuclear envelope. due to this compartmentalization, a large number of macromolecules traverse the nuclear envelope to reach their biological destination. for example, proteins are synthesized by ribosomes in the cytoplasm, but some of them, such as polymerases, transcription factors, nucleosome components and splicing factors, have to traverse the npc to reach the nucleoplasm. conversely, all rna species are transcribed and processed in the nucleus and, later, most mature rnas are exported through the npc to the cytoplasm, where they perform their biological roles. therefore, the regulation of rna and protein trafficking between nucleus and cytoplasm directly impacts gene expression [163, 164]. the npc forms large structures (∼125 mda) embedded in the nuclear envelope with a polarized eightfold symmetrical core. it is sandwiched between a cytoplasmic and a nuclear ring, which project eight filaments of about 50 nm into the cytoplasm and a basket-like structure of about 100 nm into the nucleoplasm [165, 166].
the npc is composed of multiple copies (8, 16 or 32) of ∼30 different proteins, called nups, that are grouped in three major classes: (i) the phenylalanine-glycine (fg)-containing nucleoporins, which actively work in the nuclear-cytoplasmic trafficking of macromolecules; (ii) the structural components, which lack fg-rich domains; and (iii) the integral membrane proteins, which anchor the npc to the nuclear envelope [163, 167]. whereas the last two groups of nucleoporins play a role in the architecture and localization of the npc, the fg-nucleoporins directly regulate the transport of rnas and proteins through the npc, and they are the main nucleoporin class targeted by viruses (figure 4). movement of ions, metabolites and other small molecules between nucleus and cytoplasm takes place by passive diffusion; however, transport of cargos larger than 40 kda requires the participation of specific receptors and carriers [168, 169]. fg nucleoporins are placed on both the cytoplasmic and nucleoplasmic sides of the npc and play a central role in the active transport of macromolecules. the fg domains of nucleoporins are unfolded regions that participate in energy-independent transient interactions with the cargo receptors during the docking and translocation processes [170]. nevertheless, the delivery of the cargo and some of the directional steps require hydrolysis of gtp. trafficking of proteins through the npc is mediated by a family of conserved transport receptors named karyopherins, which recognize short peptides known as nuclear localization signals (nls) and nuclear export signals (nes) [169, 171]. in addition, karyopherins also recognize nucleotide sequences during the export of some classes of rnas [163]. according to their role, karyopherins involved in cargo import are known as importins and those involved in export are known as exportins. the rangtp cycle also plays a central role in karyopherin activity and, therefore, in trafficking of macromolecules through the npc.
importins bind the cargo in the cytoplasm and release it on binding to rangtp in the nucleus. in contrast, exportins bind the cargo in the nucleus together with rangtp; and then the ran-associated gtp is hydrolyzed in the cytoplasm by rangap and cargo is liberated [163, 164] . proteins containing constitutively active nls are predominantly nuclear; but in some cases, the accessibility of nls or the nls itself is modified to selectively regulate the localization of the protein [172] . a similar regulatory mechanism is also exerted for control of nes activity [173] . for example, the nls of the transcription factor nfκb is masked by the interaction with the inhibitor iκbα. however, iκbα is degraded on proinflammatory stimuli, exposing the nls of nfκb for importin recognition. therefore, under proinflammatory conditions, nfκb is imported to the nucleus, where it triggers a specific gene response [172] . in addition to "masking strategies," phosphorylation, ubiquitination or methylation of nls or nes also influences (negatively or positively) their recognition by importins or exportins. therefore, trafficking of proteins between nucleus and cytoplasm could be finely regulated by posttranslational modifications in the nls and nes [164, 174] . conceptually, export of most of nuclear rna follows a similar mechanism to that described above for protein trafficking. this process also involves cargo receptors, export factors, and nucleoporins to deliver mature rna to the cytoplasm, but in this case, structure and function of nuclear export signals are not well understood. aminoacylated trnas are necessary in the cytoplasm for protein synthesis, but trnas are transcribed in the nucleus. trna export to the cytoplasm is mediated by exportin-t, which belongs to the karyopherin family. 
exportin-t forms a complex together with rangtp in the nucleus, and once the exportin-t-rangtp-trna complex reaches the cytoplasm, rangap induces the hydrolysis of the ran-associated gtp and the release of the trna [175, 176] . it has been proposed that exportin-5 could act as an auxiliary protein in this process [177] . however, later reports in yeast have opened the possibility of alternative pathways for trna nuclear export [163] . traffic of snrna follows a complex mechanism involving adaptor proteins. snrnas are transcribed by rna polymerase (pol) ii (with the exception of u6 snrna, which is produced by pol iii) and then they are capped but not polyadenylated. cap-binding proteins (cbp) 80 and 20 interact with the cap of snrnas in the nucleus and recruit an export adaptor known as phax [178] . this adaptor is phosphorylated and, in this state, it is able to interact with crm1, another member of the karyopherin family. this interaction, together with the joining of rangtp, is essential for snrna trafficking [163] . hydrolysis of the ran-associated gtp and dephosphorylation of phax lead to the release of the snrna in the cytoplasm. ribosomal proteins are assembled together with the different rrnas in the nucleolus following a complex process of maturation to give rise to the 40s and 60s ribosomal subunits. although it is known that the preribosomal subunits are exported by separate routes that involve crm1 and rangtp, the exact mechanism followed by 40s preribosomal subunits to leave the nucleus remains unclear [163] . in contrast, the 60s preribosomal subunit relies on the nmd3 adaptor, which mediates the interaction with crm1 [179, 180] . the release of the 60s preribosomal subunit requires two gtpase steps: (i) the hydrolysis of the ran-associated gtp, which induces the liberation of crm1; and (ii) the hydrolysis of gtp mediated by the cytoplasmic gtpase lsg1, which induces the release of nmd3 [181] . finally, mrnas constitute the most heterogeneous group of rnas, varying in length and structure. 
thus, different export factors and adaptor proteins associate with each subpopulation of mrnas [163] . mrnas are transcribed by pol ii and, concomitantly with this process, a number of rna-binding proteins assemble with them. these rna-binding proteins carry out different modifications of immature mrnas such as polyadenylation, splicing and capping. in addition, export factors and adaptor proteins are also recruited to nascent pre-mrnas, playing a further function in nuclear export. the trex (transcription-coupled export) complex is recruited to the 5′ end of nascent pre-mrnas in a splicing-dependent manner by means of the interaction of one of its components, namely aly/ref, with cbp80, a subunit of the nuclear cap-binding complex [182] . the trex complex also recruits the conserved rna helicase uap56, which is important for mrnp biogenesis [183] . tap-p15 (also known as nxf1-nxt1) directly binds the mrna immediately after splicing and actively participates in mrna export. aly/ref, tap-p15 and uap56 associate with exon junction complexes (ejc), which are deposited in a splicing-dependent manner 20-24 nt upstream of every exon-exon junction [184, 185] . this interaction network makes spliced mrnas more susceptible to export and couples splicing and mrna trafficking. in fact, unspliced mrnas are exported by an alternative and less efficient pathway that involves crm1 and rangtp. this alternative route is also followed by mrnas encoding some proto-oncogenes and cytokines [186, 187] . it is important to mention that although mrna can follow different nuclear export pathways, in all cases the interaction of export receptors with nucleoporins plays an essential role in the transport of the mrnps through the npc [170] . protein and rna trafficking by pv 2a pro . pv infection strongly impacts host-cell protein localization, giving rise to an unusual cytoplasmic distribution of nuclear proteins [188] . 
this particular effect has been characterized by different laboratories for a number of nuclear factors involved in several cellular processes and containing different types of nlss [189] . nevertheless, not all nuclear proteins are re-localized after pv infection, evidencing the presence of a viral-specific mechanism affecting protein subcellular distribution [188] . gustin and colleagues proposed the inhibition of nuclear protein import machinery as the cause of the cytoplasmic accumulation of nuclear proteins in pv-infected cells (figure 4 ). they demonstrated that pv impairs protein trafficking across the npc by expressing gfp proteins encoding classical or transporting nlss in mock and infected cultured cells [15, 16, 159] . these recombinant proteins accumulate in the cytoplasm after pv infection, being almost completely depleted from the nucleus. nevertheless, pv does not affect gfp distribution when the nls is mutated or deleted, since the small size of gfp allows its inefficient efflux throughout the npc [15] . interestingly, cell-free nuclear import assays demonstrated that nls-containing gfp is unable to traverse the npc when cells are previously infected with pv [15] . in agreement with these findings, shuttling endogenous proteins, such as heterogeneous nuclear ribonucleoprotein (hnrnp) a1 and hnrnp k, are detected in the cytoplasm of infected cells from 3 hpi but are undetectable in the nucleus at 4.5 hpi. however, cytoplasmic accumulation of nuclear resident proteins such as hnrnp c requires longer times of infection [15] . these findings support the idea that the distribution of a protein that shuttles between nucleus and cytoplasm may be strongly altered by the disruption of protein trafficking pathways, as compared to nuclear resident proteins. however, not all the nuclear factors are redistributed to the cytoplasm of pv-infected cells. 
this is the case for sc35 (a serine/arginine-rich splicing factor), fibrillarin or tata-binding protein (tbp), which remain in the nucleus for the duration of infection. the different behaviour of these groups of proteins could arise as a consequence of different turnover; thus, the distribution of highly stable nuclear proteins might be less affected by the inhibition of protein nuclear import than those proteins with low stability. alternatively, pv might not impair all protein import pathways, and some might remain operative, thereby allowing the nuclear import of several families of nuclear proteins. interestingly, some nuclear factors such as la antigen [190] , ptb [191] , sam68 [192] , or nucleolin [188] have been shown to interact with pv rna or viral proteins. these proteins accumulate in the cytoplasm of pv-infected cells and, consequently, their availability for viral replication is increased [188, 193-195] . cytoplasmic accumulation of nuclear factors might be produced mostly by the blockade of nuclear-cytoplasmic trafficking. however, loss of the nls after the cleavage of ptb and la proteins by pv 3c pro may also contribute to the subcellular relocalization of those proteins in pv-infected cells [193, 195] . redistribution of nuclear proteins as well as impairment of cellular import machinery was also observed in cells infected with other picornaviruses such as hrv [159] or emcv [160] . these findings indicate that most picornaviruses might share similar strategies to impair nuclear-cytoplasmic trafficking machinery. an alternative hypothesis has been proposed as the cause of the redistribution of nuclear factors. belov and colleagues reported that the nuclear envelope is permeabilized on pv infection, allowing nuclear proteins to diffuse across the nuclear membrane [196] . indeed, electron microscopy revealed that pv 2a pro induces severe structural damage in npc [196] . 
however, this hypothesis did not clarify why other proteins resident in the nucleus do not diffuse to the cytoplasm after pv infection. both the inhibition of protein nuclear import and the permeabilization of the nuclear envelope could be integrated as sequential steps in the pv biological cycle. protein import blockade is detected early after infection [16] , correlating with the first modifications in the npc (see below). however, prolonged expression of viral proteins might induce nuclear membrane leakiness, reflected by stronger alterations in npc architecture [196] . nevertheless, what might the biological relevance of these events be for a cytoplasmic virus? the most evident answer, discussed extensively above, is that inhibition of nuclear protein import and subsequent nuclear envelope leakiness might increase the presence of nuclear proteins in the cytoplasm of infected cells, which has been proposed in some cases to play a relevant role in pv replication. in addition, many transcription factors are retained in the cytoplasm of uninfected cells (e.g., nf-κb, irf7, and irf3), but they are immediately activated and imported to the nucleus after proinflammatory extracellular signals or on the activation of intracellular sensors as a consequence of viral replication. once in the nucleus, these factors trigger the transcription of a set of genes involved in the antiviral response [197] . by inhibiting the import of these transcription factors, pv might prevent, or at least attenuate, the establishment of a hostile intracellular environment. further effort will be made in the future to explore this attractive hypothesis. interestingly, nup98, nup153 and nup62, components of the npc belonging to the fg nup family, were found to be degraded in pv- as well as hrv-infected cells (figure 4 ) [15, 16, 159] . 
these proteins are essential factors of the nuclear-cytoplasmic trafficking machinery since their n-terminal fg-rich domains serve as docking sites for soluble transport factors [163, 198] . nup98, nup153 and nup62 are proteolyzed in pv-infected cells following different kinetics. thus, nup98 is cleaved early after infection (from 1 hpi), whereas nup153 and nup62 are targeted at late times after infection (from 4 hpi) [15, 16] . in agreement with these findings, cleavage of nup98 is induced even in the presence of inhibitors of pv replication, suggesting that small amounts of viral proteins are sufficient for this proteolysis to occur. however, cleavage of nup153 and nup62 is efficiently prevented on arrest of viral replication, probably because it is only efficiently achieved when large amounts of viral proteins are produced [16] . therefore, nup98, nup153, and nup62 exhibit different susceptibilities to pv replication, and thus pv might have a gradual impact on the nuclear-cytoplasmic trafficking machinery. nup153 is also proteolyzed by caspase-3 and caspase-9 during apoptosis induction; however, the involvement of these cellular proteases in pv-induced nup cleavage has been ruled out by different laboratories. first, nup153 cleavage products generated upon caspase activation differ from those found in pv-infected cells [159] . in addition, nup62 is cleaved in pv-infected cells, but it remains intact despite caspase activation [199] . most importantly, pv-induced npc structural damage takes place in cells lacking caspase 3 and 9 [196] , and nup153 and nup62 are efficiently cleaved in pv-infected cells even in the presence of the caspase inhibitor z-vad [159] . altogether, these data support the idea that one or more viral proteins play a direct role in the cleavage of those nups. 
in agreement with this hypothesis, pv-induced npc damage is prevented by pv 2a pro inhibitors such as elastatinal, elastase, and mpcmk, suggesting an involvement of this viral protease in the alteration of npc [196] . indeed, individual expression of pv 2a pro in hela cells as well as addition of this protease to cell-free systems gives rise to nup98, nup62, and nup153 cleavage [16, 17, 200] . in agreement with the data obtained from pv-infected cells, on pv 2a pro expression in hela cells, nup98 is cleaved faster than nup62 and nup153, which suggests the presence of optimal cleavage sites in this protein [17] . proteolysis of nup98 in pv-infected cells as well as in cell-free systems generates two different cleavage products of around 50-65 kda and 35 kda [16] . there are two optimal cleavage sites in nup98 for pv 2a pro , located between amino acids 373-374 and 551-552, containing gly at p1′, thr at p2, and leu at p4. hydrolysis at both sites results in n- and c-terminal products with predicted molecular masses of 37 and 53 or 55 and 35 kda, in good agreement with the size of the peptides detected experimentally in pv-infected cells and 2a-treated hela extracts [16, 17] . an explanation for the delayed kinetics of nup62 and nup153 cleavage with respect to nup98 is that optimal pv 2a pro cleavage sites were not found in these nups (unpublished data). recently, park and colleagues have reported that pv 2a pro directly cleaves nup62 at six different positions, rendering multiple proteolytic products. these cleavage sites are located between amino acids 103 and 298, thus releasing the fg-rich region from the protein core [200] . functionally, loss of the fg-rich region might make nup62 inactive for interaction with cargo receptors. this hypothetical mechanism of nup62 functional decoupling could be extrapolated to nup98 and nup153 (figure 4) . however, it remains unknown whether pv 2a pro is able to directly cleave nup98 and nup153 or where cleavage might occur. 
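the residue preferences quoted above for the nup98 sites (leu at p4, thr at p2, and gly at the position immediately after the scissile bond) can be turned into a simple sequence scan. the following python sketch is only an illustration under stated assumptions: the glycine position is read as p1′, the function name `find_2a_like_sites` and the toy peptide are hypothetical, and matching three positions is far cruder than any real cleavage-site predictor.

```python
from typing import List


def find_2a_like_sites(seq: str) -> List[int]:
    """Return 1-based positions of the P1 residue for every window that has
    Leu at P4, Thr at P2 and Gly at P1' (assumed reading of the motif).
    A hit at position n means cleavage would fall between residues n and n+1."""
    seq = seq.upper()
    hits = []
    # Window layout: i = P4, i+1 = P3, i+2 = P2, i+3 = P1, i+4 = P1'
    for i in range(len(seq) - 4):
        if seq[i] == "L" and seq[i + 2] == "T" and seq[i + 4] == "G":
            hits.append(i + 4)  # 0-based index i+3 of P1, converted to 1-based
    return hits


# Hypothetical toy peptide, not a real nucleoporin sequence.
toy_peptide = "MKLATYGSAALQTRGPP"
print(find_2a_like_sites(toy_peptide))
```

a scan of this kind only flags candidate positions; whether a candidate is actually cleaved still has to be tested experimentally, for example by mutating the glycine.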
nup98, nup153 and nup62 are also involved in rna export from the nucleus and, therefore, cleavage by pv 2a pro might also impact this process (figure 4) . however, oligo d(t) hybridization studies showed that pv infection does not affect distribution of the polyadenylated mrna bulk after 3 hpi [16] . a more detailed analysis revealed that expression of pv 2a pro in hela cells induces a number of disorders in rna location. nuclear export of cellular mrnas is inhibited in 2a pro -expressing cells in a dose-dependent manner concomitantly with nup98, nup62 and nup153 cleavage [17] . interestingly, export of constitutively expressed mrnas such as β-actin is less affected than that of newly synthesized mrnas. for example, tetracycline-induced luciferase mrna was almost totally retained in the nucleus when these nups were cleaved by pv 2a pro . this effect was also observed for endogenous mrnas such as il-6, c-myc or p53 mrnas, which are induced on pv 2a pro expression. therefore, pv 2a pro could counteract the induction of proapoptotic (c-myc and p53) and proinflammatory (il-6) responses by accumulating the c-myc, p53 and il-6 mrnas in the nucleus [17] . this export blockage may prevent the establishment of a host-cell response against pv infection. these findings could explain why pv 2a pro is essential for replication of pv in cells pre-treated with ifn-α [79] . furthermore, impairment of mrna export strongly alters the localization of mrnas with high turnover as compared to constitutively expressed and highly stable mrnas such as β-actin [17] . as observed in pv-infected cells, oligo d(t) hybridization revealed that pv 2a pro expression hardly affects the distribution of the polyadenylated mrna pool after short times of expression (8 h). however, nuclear accumulation of polyadenylated mrna bulk is detected when cells are exposed to pv 2a pro for longer times (16 and 24 h) . 
in this regard, the progression of the alterations of polyadenylated mrna localization in 2a pro -expressing cells takes place as follows: (i) disruption of nuclear mrna-containing foci, (ii) appearance of mrna-containing granules in the cytoplasm (most probably stress granules), and (iii) depletion of cytoplasmic mrnas. these events were more clearly observed when high amounts of pv 2a pro are synthesized, reflecting that nuclear accumulation of mrnas is a time- and dose-dependent process [17] . nevertheless, pv 2a pro is able to block not only mrna export, but also rrna and snrna transport. both 18s rrna and u2 snrna accumulate in the nucleus of 2a pro -expressing cells in a dose-dependent manner. most probably, cleavage of nup98, nup62 and nup153 is involved in these effects, since rrna, snrna, and mrna are exported using different cargo receptors and auxiliary proteins (see above) but all of them rely on nup activity to traverse the npc [163] . importantly, trnas (val-trna) are exported normally despite pv 2a pro expression, indicating that some rna nuclear export pathways are not affected by this viral protease. in fact, nup98, nup62, and nup153 are not directly involved in trna export [163] , reinforcing the idea that nucleoporin cleavage plays a central role in the impairment of protein and rna trafficking by pv (figure 4) . notably, ifn-γ induced a specific increase of nup98 levels in hela cells that counteracts the inhibition of mrna export by pv 2a pro [17, 201] . collectively, these findings reflect the central role of nup98 in pv infection and in the antiviral response, since its overexpression by itself prevents, at least in part, blockade of nuclear rna export. therefore, secretion of ifn-γ by immune cells might allow the induction of an antiviral response by neighbouring cells by increasing the levels of nup98 in order to protect nuclear-cytoplasmic protein and rna trafficking pathways. 
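the export routes described earlier and their reported sensitivity to pv 2a pro can be collected in a small lookup structure. the python sketch below only restates what the text reports (18s rrna, u2 snrna and mrna accumulate in the nucleus, val-trna does not); the class name `ExportRoute`, its fields and the helper `blocked_classes` are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ExportRoute:
    rna_class: str
    factors: Tuple[str, ...]   # receptors/adaptors named in the text
    blocked_by_2a: bool        # nuclear accumulation reported on 2A pro expression


# Summary of the text only; evidence is per-species (18S rRNA, U2 snRNA, Val-tRNA).
ROUTES = (
    ExportRoute("tRNA", ("exportin-t", "RanGTP"), False),
    ExportRoute("snRNA", ("CBP80/CBP20", "PHAX", "CRM1", "RanGTP"), True),
    ExportRoute("rRNA", ("CRM1", "RanGTP", "NMD3 (60S only)"), True),
    ExportRoute("mRNA", ("TREX", "TAP-p15"), True),
)


def blocked_classes(routes: Tuple[ExportRoute, ...] = ROUTES) -> List[str]:
    """List the RNA classes whose export is reported as impaired by PV 2A pro."""
    return [r.rna_class for r in routes if r.blocked_by_2a]


print(blocked_classes())
```

laid out this way, the pattern in the text is easy to see: the affected classes use different receptors and adaptors, so the shared dependence on fg nucleoporins is the most economical explanation for the common block.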
the physiological relevance of the crosstalk between nup98-pv 2a pro in pv (and other viruses) infection might be studied in the future with cellular and animal systems. that target the nuclear pore. as mentioned above, hrv also induces cleavage of nup62, nup153 and most probably nup98, leading to the impairment of nuclear protein import [159] . nevertheless, pv and hrv 2a pro exhibit high homology and both proteases are therefore expected to share common targets. in contrast, the 2a gene of cardioviruses encodes a short peptide with autoproteolytic activity but lacks trans-protease activity. however, emcv and mengovirus are able to damage npc [202] . as occurs in pv-infected cells, these cardioviruses induce both protein nuclear import inhibition and late membrane leakiness but they do not induce the cleavage of nups [160, 202, 203] . cardioviruses encode an additional protein known as l protein, which is highly cytopathic although it lacks protease activity (in contrast to aphthovirus l pro ). l protein contains a zinc finger domain, and an acidic region, which is proposed to be phosphorylated in infected cells [204] . individual expression of this protein in cultured cells or in cell-free systems induces several cellular disorders including the inhibition of protein nuclear import that resembles that observed in emcv-infected cells [160, 203] . several studies reported emcv l protein mutations that resulted in defective virus growth phenotypes in cell culture [202, 204] . in particular, mutations in the zinc finger domain (cys19ala and cys22ala) or in the acidic region (thr47ala) partially avoid blockade of protein trafficking between the nucleus and the cytoplasm [202] . taken together, these data support the involvement of cardiovirus l protein in npc damaging and in the inhibition of protein nuclear import, inducing nups phosphorylation rather than their cleavage. 
indeed, nup62 is quickly and strongly phosphorylated after emcv infection (2 hpi), and it was clearly detected by conventional western blotting. nevertheless, analysis with pro-q diamond phosphoprotein stain revealed that nup153 and nup214 are also phosphorylated to a certain extent upon emcv infection. notably, nups phosphorylation was avoided when cys19ala mutation was inserted in the l protein sequence, suggesting that the zinc finger domain is essential for this posttranslational modification to occur [160] . however, the exact role of nups phosphorylation in nuclear-cytoplasmic trafficking is still unknown. the idea that nups phosphorylation could regulate protein import and rna export as a switch that turns the different pathways on/off should be pursued in more detail. as mentioned above, ran is essential for the regulation of most of the nuclear export and import pathways, because it acts as a cofactor modulating the affinity of importins and exportins for the cargo. the rangtp cycle is described in detail in section 6.1. it has been described that emcv l directly interacts with ran, and this interaction is abrogated by the insertion of c19a mutation in l [203] . however, the potential role of l/ran interaction in the modulation of rangtp cycle and its impact in nuclear import and export pathways have not yet been studied. alteration of nuclear-cytoplasmic trafficking is not only restricted to picornaviruses but has also been observed with negative strand viruses such as vesicular stomatitis virus (vsv) and influenza virus. her and collaborators reported that rna export and protein import are strongly inhibited by vsv matrix (m) protein, by microinjection of oocytes with radiolabeled rnas and proteins. radiolabeled trnas, mrnas, u snrnas, and rrnas were injected directly into the nucleus of xenopus laevis oocytes, and the subcellular localization of those rnas was monitored by autoradiography. 
the conclusion of this work is that mrna, u snrna, and rrna but not trna trafficking is blocked by vsv m protein [205] , in agreement with our findings in 2a pro -expressing hela cells [17] . in addition, protein nuclear import was monitored by microinjection of radiolabeled proteins containing an nls into the oocyte cytoplasm. this assay revealed that vsv m protein abrogates protein nuclear import to the same extent as treatment with specific inhibitors of this pathway such as wga [205] . these interesting findings support the idea that pv 2a pro and vsv m protein could target similar host proteins to impair macromolecule trafficking between nucleus and cytoplasm. in agreement with this possibility, it was found that the vsv m protein interacts with nup98 [161] , which is one of the primary targets of pv 2a pro [16] . the n-terminal domain of vsv m is sufficient to block rna nuclear export, and aa 52-54 may play an essential role in this blockade, because their mutation to ala completely abrogates this inhibitory effect. indeed, the n-terminal domain of m protein is involved in the interaction with nup98 and, in particular, aa 52-54, because their mutation to ala blocks the binding of vsv m to nup98 [161] . in addition, binding of m to nup98 requires active mrna export pathways, since treatment with inhibitors such as wga hampers this interaction. vsv m is also able to induce the accumulation of endogenous polyadenylated mrnas in the nucleus of hela cells, and this effect is again prevented by mutations in aa 52-54 of m [161] . furthermore, vsv m interacts with rae1, which plays an essential role in mrna nuclear export by its interaction with nup98 and mrnps. overexpression of either nup98 or rae-1 prevents the nuclear accumulation of polyadenylated mrnas, suggesting that both factors may play a role in the blockade of mrna nuclear export by vsv m protein [206] . 
interestingly, ifn-γ specifically increases the levels of both nup98 and rae-1, which indicates a potential antiviral effect of these proteins. indeed, overexpression of both proteins by ifn-γ treatment counteracts the inhibitory effects of vsv m protein on mrna nuclear export, highlighting the possibility of a crosstalk between m and ifn-γ that might control the fate of viral replication in infected animals [201, 206] . influenza virus replication also impacts nuclear-cytoplasmic trafficking and leads to the nuclear accumulation of host mrnas [162, 207] . influenza virus ns1 protein is a major virulence factor that is essential for pathogenesis, because it impairs innate and adaptive immunity by inhibiting host signal transduction and gene expression [208, 209] . ns1 forms a complex with nxf1/tap, p15/nxt [162] , rae1, and e1b-ap5, which are components of the mrna nuclear export machinery (see above). individual expression of ns1 in 293t cells induces the accumulation of polyadenylated mrnas in the nucleus, suggesting that the interaction of ns1 with these export factors yields a complex that is inactive for mrna export. influenza virus also induces a strong reduction of nup98 steady-state levels, although the viral mechanisms involved in this process are still unknown [162] . expression of reporter luciferase mrna synthesized from a nuclear plasmid is inhibited by ns1. however, this inhibition is overcome by overexpression of nxf1, p15, rae-1 or nup98, evidencing the role of ns1 interaction with these factors in the impairment of mrna nuclear export [162] . furthermore, mouse cells expressing low levels of nup98 and/or rae-1 show greater susceptibility to influenza infection, resulting in a significant increase in cell death and virus production. in addition, mrnas encoding antiviral factors or immunomodulators such as irf-1, mhc i and icam1 accumulated more in the nucleus of those cells than in cells expressing normal levels of nup98 or rae-1 [162] . 
all these data support the physiological role of ns1 interaction with rna export factors as well as the reduction of nup98 levels in influenza pathogenicity. interestingly, vsv m, influenza ns1 and pv 2a pro expression gives rise to similar effects on mrna trafficking. all these viral proteins target nup98 and other components of the cellular machinery involved in nuclear-cytoplasmic trafficking. the survival of motor neurons (smn) complex is composed of smn and a class of proteins called gemins, which localize in both cytoplasm and nucleoplasm [210, 211] . gemin7 and gemin8 constitute the core of the complex, with which the other gemins associate from the periphery by means of numerous protein-protein interactions. the smn complex is involved in the biogenesis of uridine-rich small nuclear ribonucleoprotein (u snrnp) in the cytoplasm, and the u snrnp then carries out the splicing of pre-mrnas in the nucleus [212-214] . the snrnps are composed of the major u snrnas u1, u2, u4, u5, and u6 as well as a group of seven proteins known as sm ribonucleoproteins that collectively make up the extremely stable sm core of the snrnp. gemins (except gemin-2) associate with sm proteins to form a heptameric ring structure in the presence of u snrnas [211] . after sm core assembly, the u snrnps are imported to the nucleus, localizing in foci known as cajal bodies, where further maturation processes take place [215] . gemin-3, one of the main components of the smn complex, is cleaved in pv-infected hela cells, leading to a 50 kda cleavage product. scission of gemin-3 negatively impacts the kinetics of sm core assembly and is prevented in the presence of inhibitors of pv replication [216] . these results indicated that high levels of pv proteins are required for this process to occur, as is the case for nup153, nup62, and eif4gii [16, 122] . pv 2a pro is able to hydrolyze purified gemin-3 in vitro, rendering a cleavage product similar to that found in pv-infected hela cells [216] . 
only one potential 2a pro cleavage site, between the amino acids tyr 462 and gly 463 (vhtyg), was found in this smn complex component. proteolysis of gemin-3 at this position would render two cleavage products of about 50 and 30 kda, in agreement with the polypeptide of about 50 kda found in pv-infected and 2a pro -expressing cells. in addition, the g463e mutation prevents direct hydrolysis of gemin-3 by pv 2a pro in vivo and in vitro [216] . taken together, these findings support the notion that vhtyg is the cleavage site for pv 2a pro in gemin-3. although hydrolysis of gemin-3 occurs in cells transfected with a plasmid encoding pv 2a pro , it does not take place when this protease is expressed from exogenous mrnas, contrary to that found with eif4gi, eif4gii, nup98, nup153 and nup62 [17] (and unpublished data). a probable explanation for this difference is that pv 2a pro is expressed at lower levels from transfected mrnas than from plasmids [95] (castello et al., 2006) . thus, gemin-3 cleavage may be a very late event in pv-infected cells because it requires expression of high amounts of pv 2a pro . gemin-3 hydrolysis may directly impact pre-mrna splicing, since this event reduces the availability of smn complexes, which are involved in u snrnp biogenesis. nevertheless, alstead and colleagues could not detect any apparent effect of gemin-3 proteolysis on splicing of cellular pre-mrnas [216] . therefore, the physiological relevance of gemin-3 cleavage in the pv biological cycle remains unknown. in addition to eifs, nups and proteins from the smn complex, pv 2a pro is able to cleave proteins involved in other cellular processes, such as transcription. tbp is cleaved by pv 2a pro between amino acids tyr 34 and gly 35 in vitro, although this cleavage only removes the first 34 aa located at the n-terminus and does not inhibit transcription carried out by rna polymerase ii [217, 218] . 
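the fragment sizes quoted above for gemin-3 and tbp can be sanity-checked with a back-of-the-envelope estimate. the python sketch below assumes an average residue mass of about 110 da, a common rule of thumb that is not taken from the cited studies, and the function name `n_terminal_fragment_kda` is hypothetical. it estimates only the n-terminal fragment, because that requires nothing beyond the cleavage position given in the text.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rough average per amino acid; an assumption


def n_terminal_fragment_kda(p1_position: int) -> float:
    """Approximate mass (kDa) of the fragment spanning residues 1..p1_position,
    i.e. everything upstream of a cleavage between p1_position and p1_position + 1."""
    if p1_position < 1:
        raise ValueError("p1_position must be a positive residue number")
    return p1_position * AVG_RESIDUE_MASS_DA / 1000.0


# Gemin-3: cleavage between Tyr 462 and Gly 463
print(round(n_terminal_fragment_kda(462), 1))
# TBP: cleavage between Tyr 34 and Gly 35
print(round(n_terminal_fragment_kda(34), 1))
```

under this approximation the gemin-3 n-terminal fragment comes out close to the roughly 50 kda product reported in infected cells, while the piece removed from tbp is only a few kda, in line with the observation that tbp function in transcription is retained.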
these findings are in agreement with the fact that host mrna transcription takes place in 2a pro -expressing cells when both translation and rna nuclear export are inhibited, upon cleavage of eif4g and nups [17] . one attractive hypothesis is that pv 2a pro could cleave specific initiation factors affecting specific rather than general mrna transcription in order to modulate host-cell response to viral infection. further studies in this direction can be carried out using microarray platforms to detect precise alterations in cellular transcriptome after pv-infection or 2a pro expression. these studies could be complemented by screening for new host factors cleaved by pv 2a pro using different in silico and experimental approaches. the study of viral proteases is crucial to understand the mechanism used by animal viruses to replicate their genomes and to translate viral mrnas at the molecular level. in addition, we wish to draw attention to the concept that viral proteases can be used as tools to reveal the exact functioning of their target cellular proteins. in this regard, pv 2a pro has been very useful for examining the requirements for eif4g to translate different cellular or viral mrnas. in addition to this, in this paper, we have highlighted the role of pv 2a pro not only in the processing of the poliovirus polyprotein but also in the interaction with the host-cell. interestingly, a single viral protein is able to modulate many steps of gene expression in order to generate an optimal intracellular environment for the viral biological cycle. in particular, pv 2a pro cleaves cell proteins involved in transcription, pre-mrna splicing, nucleus/cytoplasm transport and translation, in order to hijack those host functions and to concentrate the cellular resources on the production of the viral progeny. 
for example, pv 2a pro inactivates the host translational machinery for capped cellular mrnas by cleaving eif4g, whereas viral protein synthesis proceeds under those conditions by ires-driven translation [14]. because cellular mrna translatability decreases in pv-infected cells, host ribosomes become available for viral protein synthesis. in addition, the inhibition of host protein synthesis may prevent the production of antiviral proteins. similarly, the blockade of nucleus/cytoplasm transport of macromolecules might isolate nuclear processes from cytoplasmic ones, hampering the arrival of specific proinflammatory transcription factors in the nucleus and of mrnas encoding proinflammatory, antiviral or proapoptotic proteins in the cytoplasm of infected cells. finally, transcription and pre-mrna splicing could also be modulated by pv 2a pro, which might reduce the availability of mature mrnas. nevertheless, very little is known about the potential role of pv 2a pro in transcription and pre-mrna splicing, and further studies of these steps of gene expression in pv-infected and 2a pro -expressing cells should be carried out. using "omics" technologies, it would be possible to identify changes in the host transcriptome in those cells, allowing us to understand how host gene expression readjusts to viral infection and to the cytotoxic effect of pv 2a pro. gene ontology tools can be used to cluster the pathways (kegg), molecular activities and biological functions of the genes that are transcriptionally up- or downregulated, or unaffected, after these unfavourable stimuli. in silico analysis could then be carried out to identify whether these gene clusters belong to particular networks controlled by specific transcription factors. this approach will give us additional information about potential host targets for pv 2a pro.
complementarily, deep sequencing (rna-seq), specific microarray types and conventional rt-pcr could be used to screen for unspliced or abnormally spliced mrna variants in pv-infected and 2a pro -expressing cells. finally, microarrays can be employed to characterize the cytoplasmic and nuclear transcriptomes of those cells, to determine whether the inhibition of mrna nuclear export induced by pv 2a pro affects the distribution of the bulk of host mrna or only specific mrna pools. taken together, this information may provide a general and deep view of the modifications induced by pv 2a pro at the different steps of mrna metabolism. additionally, the physiological role of nup and gemin-3 cleavage might be studied using different models. one interesting approach is to engineer stable cell lines expressing noncleavable versions of nup98, nup62, nup153 and/or gemin-3 and then to analyze the fitness of pv in the different cell types, especially in the presence of extracellular antiviral stimuli such as ifns or interleukins, which will activate different epigenetic programs. these studies will provide essential information to help us understand the specific role that those host factors play in pv infection. the total number of cellular proteins targeted by pv 2a pro (the degradome) remains unknown, but it can be anticipated that, with the expanding use of proteomic methodologies, it will soon be known not only for pv 2a pro but also for other viral proteases of interest. furthermore, the analysis of the pv 2a pro -induced degradome in human cells will be of general interest for many researchers, including virologists and cell biologists. this goal could be achieved by combining in silico prediction of 2a pro cleavage sites with experimental tools such as proteomics. in the first case, blom and collaborators developed a bioinformatics tool using neural network algorithms to predict cellular targets for picornavirus proteases [219].
this approach has been successfully used to predict the cleavage of dystrophin by coxsackievirus 2a pro [220], although most of the predicted human targets for rhinovirus and enterovirus 2a pro have not yet been confirmed. in addition, the algorithm did not predict the cleavage of cellular targets that were later demonstrated to be proteolyzed by picornaviral 2a pro, such as nup98 and cytokeratin 8 [16, 17, 221]. thus, it would be necessary to develop an improved algorithm able to find optimal cleavage sequences in the host proteome by incorporating the proteolytic sites known for newly described 2a pro targets. many parameters have to be taken into account, including the protein localization (cytoplasmic and nuclear proteins will be considered, but not proteins resident in the lumen of other organelles such as the er or peroxisomes), the exposure of the cleavage site to the solvent (the sequence must be accessible to the protease), and the secondary structure in which the proteolytic site is located (optimally, unstructured regions). potential targets could be ranked by their degree of similarity to optimal cleavage sequences, as well as by the degree to which they fulfill the above prerequisites. on the other hand, novel pv 2a pro targets can be identified by proteomic tools such as two-dimensional differential gel electrophoresis (dige) or quantitative proteomics such as stable isotope labeling by amino acids in cell culture (silac) coupled to one-dimensional electrophoresis. with these two methods, it will be feasible to identify proteins with reduced levels upon 2a pro expression and, in addition, to detect the cleavage products, which will appear as polypeptides of lower size. by uncovering novel 2a pro targets we will be able to map the cellular networks impacted by pv 2a pro and to integrate them in the context of pv infection.
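the filtering and ranking scheme outlined above (localization, solvent accessibility, local structure, similarity to known cleavage sequences) can be sketched in a few lines of python. this is only an illustrative sketch under stated assumptions, not the neural network method of blom and collaborators: the only known site used is vhtyg from gemin-3, similarity is a plain fraction of identical residues, the requirement for gly after the scissile bond simply mirrors the two tyr/gly sites mentioned in the text, and the input fields (`compartment`, `accessible`, `disordered`) and the toy proteins are hypothetical annotations that would have to come from other tools.

```python
from dataclasses import dataclass

# Known cleavage windows written P4..P1' (the scissile bond lies between the
# last two residues). Only VHTYG (gemin-3, Tyr462/Gly463) comes from the text;
# further entries of the same length would have to be supplied by the user.
KNOWN_SITES = ["VHTYG"]

# Compartments assumed to be reachable by the protease.
ALLOWED_COMPARTMENTS = {"cytoplasm", "nucleus"}


@dataclass
class Candidate:
    protein: str
    p1_position: int   # 1-based position of the residue before the cut
    window: str
    similarity: float  # best fraction of identical positions (0..1)
    disordered: bool   # True if the site lies in an unstructured region


def similarity(window: str, known: str) -> float:
    """Fraction of positions at which the two windows are identical."""
    return sum(a == b for a, b in zip(window, known)) / len(known)


def rank_candidates(proteins, min_similarity=0.6, require_p1_prime="G"):
    """Return candidate sites, best first.

    `proteins` is an iterable of dicts with the hypothetical keys
    name, sequence, compartment, accessible and disordered, where the last
    two are sets of 1-based residue positions.
    """
    k = len(KNOWN_SITES[0])
    hits = []
    for p in proteins:
        # Filter 1: skip proteins outside cytoplasm and nucleus.
        if p["compartment"] not in ALLOWED_COMPARTMENTS:
            continue
        seq = p["sequence"].upper()
        for i in range(len(seq) - k + 1):
            window = seq[i:i + k]
            p1 = i + k - 1  # 1-based position of P1
            # Assumption: require a given residue (Gly) at P1'.
            if require_p1_prime and window[-1] != require_p1_prime:
                continue
            # Filter 2: the site must be solvent accessible.
            if p1 not in p["accessible"]:
                continue
            best = max(similarity(window, s) for s in KNOWN_SITES)
            if best >= min_similarity:
                hits.append(Candidate(p["name"], p1, window, best,
                                      p1 in p["disordered"]))
    # Rank by similarity, then prefer sites in unstructured regions.
    return sorted(hits, key=lambda c: (-c.similarity, not c.disordered))


# Toy input, invented purely for illustration.
toy_proteins = [
    {"name": "toy_a", "sequence": "AAVHTYGAAA", "compartment": "cytoplasm",
     "accessible": {6}, "disordered": {6}},
    {"name": "toy_b", "sequence": "AAVHTYGAAA", "compartment": "er_lumen",
     "accessible": {6}, "disordered": {6}},
    {"name": "toy_c", "sequence": "AAVHSYGAAA", "compartment": "nucleus",
     "accessible": {6}, "disordered": set()},
]

for c in rank_candidates(toy_proteins):
    print(c.protein, c.p1_position, c.window, c.similarity)
```

in a real screen the identity score would presumably be replaced by a position-specific scoring matrix or a trained classifier, and the accessibility and disorder annotations by structure-based predictions; the sketch only shows how the individual criteria could be combined into a single ranked list.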
in fact, the role of 2a pro hijacking host processes could be potentially expanded to other cellular pathways with direct impact on control of viral infection. such knowledge will provide more insight into our understanding of the cytopathogenicity of viral proteases at the molecular level. viral proteases structure and function of picornavirus proteinases hiv-1: fifteen proteins and an rna the structures of picornaviral proteinases organization and expression of calicivirus genes proteolytic processing of foamy virus gag and pol proteins new insights into flavivirus nonstructural protein 5 poliovirus, pathogenesis of poliomyelitis, and apoptosis poliovirus and apoptosis induction of membrane proliferation by poliovirus proteins 2c and 2bc two types of death of poliovirus-infected cells: caspase involvement in the apoptosis but not cytopathic effect individual expression of poliovirus 2a and 3c induces activation of caspase-3 and parp cleavage in hela cells picornavirus rna translation: roles for cellular proteins translational control by viral proteinases effects of poliovirus infection on nucleo-cytoplasmic trafficking and nuclear pore complex composition differential targeting of nuclear pore complex proteins in poliovirusinfected cells rna nuclear export is blocked by poliovirus 2a protease and is concomitant with nucleoporin cleavage the pathogenesis of poliomyelitis: what we don't know poliovirus cell entry: common structural themes in viral cell entry pathways one hundred years of poliovirus pathogenesis genetics of poliovirus picornavirus genome: an overview processing determinants and functions of cleavage products of picornavirus polyproteins proteolytic processing of poliovirus polyprotein: elimination of 2a(pro)-mediated, alternative cleavage of polypeptide 3cd by in vitro mutagenesis poliovirus rna-dependent rna polymerase (3dpol): structure, function, and mechanism possible unifying mechanism of picornavirus genome replication protein-primed rna 
synthesis by purified poliovirus rna polymerase inhibitors of poliovirus uncoating efficiently block the early membrane permeabilization induced by virus particles viral translation is coupled to transcription in sindbis virus-infected cells phospholipid biosynthesis and poliovirus genome replication, two coupled phenomena effects of viral replication on cellular membrane metabolism and function viroporinmediated membrane permeabilization: pore formation by nonstructural poliovirus 2b protein viroporins primary structure, gene organization and polypeptide expression of poliovirus rna three-dimensional structure of poliovirus at 2.9å resolution internal initiation of translation of eukaryotic mrna directed by a sequence derived from poliovirus rna cell-free, de novo synthesis of poliovirus synthetic viruses: a new opportunity to understand and prevent viral disease inhibition of hela cell protein synthesis following poliovirus infection correlates with the proteolysis of a 220,000-dalton polypeptide associated with eucaryotc initiation factor 3 and a cap binding protein complex virus-encoded proteinases of the picornavirus super-group genome organization and encoded proteins structure of the foot-and-mouth disease virus leader protease: a papain-like fold adapted for self-processing and eif4g recognition structural and biochemical features distinguish the foot-and-mouth disease virus leader proteinase from other papain-like enzymes the foot-and-mouth disease virus leader proteinase gene is not required for viral replication the leader proteinase of foot-and-mouth disease virus inhibits the induction of beta interferon mrna and blocks the host innate immune response a conserved domain in the leader proteinase of foot-and-mouth disease virus is required for proper subcellular localization and function differential ifnα/β production suppressing capacities of the leader proteins of mengovirus and foot-and-mouth disease virus leader protein of foot-and-mouth disease 
virus is required for cleavage of the p220 component of the cap-binding protein complex relationship of p220 cleavage during picornavirus infection to 2a proteinase sequencing extremely efficient cleavage of eif4g by picornaviral proteinases l and 2a in vitro foot-and-mouth disease virus 2a oligopeptide mediated cleavage of an artificial polyprotein analysis of the aphthovirus 2a/2b polyprotein 'cleavage' mechanism indicates not a proteolytic reaction, but a novel translational effect: a putative ribosomal 'skip the cleavage activities of aphthovirus and cardiovirus 2a proteins purification and partial characterization of poliovirus protease 2a by means of a functional assay poliovirus proteinase 3c: large-scale expression, purification, and specific cleavage activity on natural and synthetic substrates in vitro multiple proteases in foot-and-mouth disease virus replication antiviral effects of a thiol protease inhibitor on foot-and-mouth disease virus putative papain-related thiol proteases of positive-strand rna viruses. 
identification of rubi-and aphthovirus proteases and delineation of a novel conserved domain associated with proteases of rubi, α-and coronaviruses identification of the active-site residues of the l proteinase of foot-and-mouth disease virus identification of critical amino acids within the foot-and-mouth disease virus leader protein, a cysteine protease structural similarity of poliovirus cysteine proteinase p3-7c and cellular serine proteinase of trypsin cysteine proteases of positive strand rna viruses and chymotrypsin-like serin proteases: a distinct protein superfamily with a common structural fold viral cysteine proteases are homologous to the trypsin-like family of serine proteases: structural and functional implications a second virus-encoded proteinase involved in proteolytic processing of poliovirus polyprotein substrate requirements of a human rhinoviral 2a proteinase determinants of substrate recognition by poliovirus 2a proteinase cleavage specificity on synthetic peptide substrates of human rhinovirus 2 proteinase 2a genetic selection of poliovirus 2a(pro)-binding peptides identification of essential amino acid residues in the functional activity of poliovirus 2a protease mutational analyses support a model for the hrv2 2a proteinase the structure of the 2a proteinase from a common cold virus: a proteinase responsible for the shut-off of host-cell protein synthesis characterization of the roles of conserved cysteine and histidine residues in poliovirus 2a protease the 2a proteinase of human rhinovirus is a zinc containing enzyme spectroscopic characterization of rhinoviral protease 2a: zn is essential for the structural integrity the yeast saccharomyces cerevisiae as a genetic system for obtaining variants of poliovirus protease 2a mutational analysis of poliovirus 2a(pro): distinct inhibitory functions of 2a(pro) on translation and transcription defective rna replication by poliovirus mutants deficient in 2a protease cleavage activity 2a protease is 
not a prerequisite for poliovirus replication proteinase 2a is essential for enterovirus replication in type i interferontreated cells viable polioviruses that encode 2a proteins with fluorescent protein tags regulation of translation initiation in eukaryotes: mechanisms and biological targets the mechanism of eukaryotic translation initiation and principles of its regulation eif4 initiation factors: effectors of mrna recruitment to ribosomes and regulators of translation molecular mechanisms of translational control eif4g: a multipurpose ribosome adapter elf4g: translation's mystery factor begins to yield its secrets human eukaryotic translation initiation factor 4g (eif4g) possesses two separate and independent binding sites for eif4a a newly identified n-terminal amino acid sequence of human eif4g binds poly(a)-binding protein and functions in poly(a)-dependent translation circularization of mrna by eukaryotic translation initiation factors mnk1, a new map kinaseactivated protein kinase, isolated by a novel expression screening method for identifying protein kinase substrates mitogen-activated protein kinases activate the serine/threonine kinases mnk1 and mnk2 human eukaryotic translation initiation factor 4g (eif4g) recruits mnk1 to phosphorylate eif4e phosphorylation and dephosphorylation events that regulate viral mrna translation regulation of capdependent translation by eif4e inhibitory proteins differential cleavage of eif4gi and eif4gii in mammalian cells: effects on translation translation of mrnas from vesicular stomatitis virus and vaccinia virus is differentially blocked in cells with depletion of eif4gi and/or eif4gii eukaryotic translation initiation factor 4g is targeted for proteolytic cleavage by caspase 3 during inhibition of translation in apoptotic cells selective modification of eukaryotic initiation factor 4f (eif4f) at the onset of cell differentiation: recruitment of eif4gii and long-lasting phosphorylation of eif4e serum-stimulated, 
rapamycin-sensitive phosphorylation sites in the eukaryotic translation initiation factor 4gi inhibition of cap-dependent translation via phosphorylation of eif4g by protein kinase pak2 suppression of cap-dependent translation in mitosis phosphorylation screening identifies translational initiation factor 4gii as an intracellular target of ca2+/calmodulin-dependent protein kinase i cleavage of polypeptide chain initiation factor eif4gi during apoptosis in lymphoma cells: characterisation of an internal fragment generated by caspase-3-mediated cleavage cleavage of eukaryotic translation initiation factor 4gii correlates with translation inhibition during apoptosis translation of eukaryotic translation initiation factor 4gi (eif4gi) proceeds from multiple mrnas containing a novel cap-dependent internal ribosome entry site (ires) that is active during poliovirus infection generation of multiple isoforms of eukaryotic translation initiation factor 4gi by use of alternate translation initiation codons specific isoforms of translation initiation factor 4gi show differences in translational activity regulation of protein synthesis in hela cells. 3. inhibition during poliovirus infection control of protein synthesis in extracts from poliovirus-infected cells. i. 
mrna discrimination by crude initiation factors poliovirus proteinase 2a induces cleavage of eucaryotic initiation factor 4f polypeptide p220 a poliovirus 2a(pro) mutant unable to cleave 3cd shows inefficient viral protein synthesis and transactivation defects direct cleavage of elf4g by poliovirus 2a protease is inefficient in vitro human rhinovirus 14 infection of hela cells results in the proteolytic cleavage of the p220 cap-binding complex subunit and inactivates globin mrna translation in vitro mapping the cleavage site in protein synthesis initiation factor eif-4γ of the 2a proteases from human coxsackievirus and rhinovirus 2a proteinases of coxsackie-and rhinovirus cleave peptides derived from eif-4 gamma via a common recognition motif the eif4g-eif4e complex is the target for direct cleavage by the rhinovirus 2a proteinase poliovirus 2a proteinase cleaves directly the eif-4g subunit of eif-4f complex relationship of eukaryotic initiation factor 3 to poliovirus-induced p220 cleavage activity multiple eif4gi-specific protease activities present in uninfected and poliovirus-infected cells lack of direct correlation between p220 cleavage and the shut-off of host translation after poliovirus infection monensin and nigericin prevent the inhibition of host translation by poliovirus, without affecting p220 cleavage proteolysis of human eukaryotic translation initiation factor eif4gii, but not eif4gi, coincides with the shutoff of host protein synthesis after poliovirus infection eukaryotic initiation factor 4gii (eif4gii), but not eif4gi, cleavage correlates with inhibition of host cell protein synthesis after human rhinovirus infection high level expression in escherichia coli cells and purification of poliovirus protein 2a(pro) cleavage of p220 by purified poliovirus 2a(pro) in cell-free systems: effects on translation of capped and uncapped mrnas poly(a)-binding protein interaction with elf4g stimulates picornavirus ires-dependent translation hiv-1 protease 
inhibits cap-and poly(a)-dependent translation upon eif4gi and pabp cleavage human immunodeficiency virus tat-activated expression of poliovirus protein 2a inhibits mrna translation the effect of poliovirus proteinase 2a(pro) expression on cellular metabolism. inhibition of dna replication, rna polymerase ii transcription, and translation expression of poliovirus 2a(pro) in mammalian cells: effects on translation efficient cleavage of p220 by poliovirus 2a(pro) expression in mammalian cells: effects on vaccinia virus hybrid proteins between pseudomonas exotoxin a and poliovirus protease 2a(pro) effects of poliovirus 2a(pro) on vaccinia virus gene expression the eukaryotic translation initiation factor 4gi is cleaved by different retroviral proteases hybrid proteins between pseudomonas aeruginosa exotoxin a and poliovirus 2apro cleave p220 in hela cells cleavage of eukaryotic translation initiation factor 4g by exogenously added hybrid proteins containing poliovirus 2a(pro) in hela cells: effects on gene expression a stable hela cell line that inducibly expresses poliovirus 2a(pro): effects on cellular and viral gene expression poliovirus 2a protease induces apoptotic cell death eif4g is required for the pioneer round of translation in mammalian cells cleavage of poly(a)-binding protein by enterovirus proteases concurrent with inhibition of translation in vitro cleavage of poly(a)-binding protein by coxsackievirus 2a protease in vitro and in vivo: another mechanism for host protein synthesis shutoff? 
efficient cleavage of ribosome-associated poly(a)-binding protein by enterovirus 3c protease cleavage of poly(a)-binding protein by poliovirus 3c protease inhibits host cell translation: a novel mechanism for host translation shutoff foot-andmouth disease virus leader proteinase: purification of the lb form and determination of its cleavage site on eif-4γ foot-and-mouth disease virus 3c protease induces cleavage of translation initiation factors eif4a and eif4g within infected cells cleavage of eukaryotic translation initiation factor 4gii within foot-and-mouth disease virus-infected cells: identification of the l-protease cleavage site in vitro sequential modification of translation initiation factor elf4gl by two different foot-andmouth disease virus proteases within infected baby hamster kidney cells: identification of the 3c cleavage site conducting the initiation of protein synthesis: the role of eif4g hiv-1 protease cleaves eukaryotic initiation factor 4g and inhibits cap-dependent translation in vitro cleavage of eif4gi but not eif4gii by hiv-1 protease and its effects on translation in the rabbit reticulocyte lysate system cleavage of eif4g by hiv-1 protease: effects on translation hiv protease cleaves poly(a)-binding protein cap-poly(a) synergy in mammalian cellfree extracts. 
investigation of the requirements for poly(a)-mediated stimulation of translation initiation characterization of an internal ribosomal entry segment within the 5' leader of avian reticuloendotheliosis virus type a rna and development of novel mlv-rev-based retroviral vectors an internal ribosome entry segment promotes translation of the simian immunodeficiency virus genomic rna the leader of human immunodeficiency virus type 1 genomic rna harbors an internal ribosome entry segment that is active during the g/m phase of the cell cycle cleavage of eukaryotic initiation factor elf4g and inhibition of host-cell protein synthesis during feline calicivirus infection calicivirus 3c-like proteinase inhibits cellular translation by cleavage of poly(a)-binding protein inhibition of nuclear import and alteration of nuclear pore complex composition by rhinovirus leader-induced phosphorylation of nucleoporins correlates with nuclear trafficking inhibition by cardioviruses vesicular stomatitis virus matrix protein inhibits host cell gene expression by targeting the nucleoporin nup98 influenza virus targets the mrna export machinery and the nuclear pore complex exporting rna from the nucleus to the cytoplasm crossing the nuclear envelope: hierarchical regulation of nucleocytoplasmic transport the yeast nuclear pore complex: composition, architecture, transport mechanism proteomic analysis of the mammalian nuclear pore complex the part and the whole: functions of nucleoporins in nucleocytoplasmic transport nuclear pore complex is able to transport macromolecules with diameters of about 39 nm nucleocytoplasmic transport: taking an inventory the nuclear pore complex: bridging nuclear transport and gene regulation nuclear localization signal and protein context both mediate importin α specificity of nuclear import substrates regulation of nuclear transport: central role in development and transformation inhibition of crm1-p53 interaction and nuclear export of p53 by 
poly(adp-ribosyl)ation arginine methylation of rna helicase a determines its subcellular localization identification of a trna-specific nuclear export receptor identification of a nuclear export receptor for trna exportin-5-mediated nuclear export of eukaryotic elongation factor 1a and trna phax, a mediator of u snrna nuclear export whose activity is regulated by phosphorylation nmd3p is a crm1p-dependent adapter protein for nuclear export of the large ribosomal subunit 60s pre-ribosome formation viewed from assembly in the nucleolus until export to the cytoplasm release of the export adapter, nmd3p, from the 60s ribosomal subunit requires rpl10p and the cytoplasmic gtpase lsg1p human mrna export machinery recruited to the 5' end of mrna splicing factor sub2p is required for nuclear mrna export through its interaction with yra1p the exon-exon junction complex provides a binding platform for factors involved in mrna export and nonsense-mediated mrna decay magoh, a human homolog of drosophila mago nashi protein, is a component of the splicing-dependent exon-exon junction complex a ranindependent pathway for export of spliced mrna delineation of mrna export pathways by the use of cell-permeable peptides viral ribonucleoprotein complex formation and nucleolar-cytoplasmic relocalization of nucleolin in poliovirus-infected cells inhibition of nucleo-cytoplasmic trafficking by rna viruses: targeting the nuclear pore complex a cellular protein that binds to the 5'-noncoding region of poliovirus rna: implications for internal translation initiation a cytoplasmic 57-kda protein that is required for translation of picornavirus rna by internal ribosomal entry is identical to the nuclear pyrimidine tract-binding protein human protein sam68 relocalization and interaction with poliovirus rna polymerase in infected cells intracellular redistribution of truncated la protein produced by poliovirus 3c(pro)-mediated cleavage kh domain integrity is required for wild-type localization 
of sam68 translation of polioviral mrna is inhibited by cleavage of polypyrimidine tractbinding proteins executed by polioviral 3c(pro) bidirectional increase in permeability of nuclear envelope upon poliovirus infection and accompanying alterations of nuclear pores cytoplasmic nucleic acid sensors in antiviral immunity versatility at the nuclear pore complex: lessons learned from the nucleoporin nup153 caspasedependent proteolysis of integral and peripheral proteins of nuclear membranes and nuclear pore complex proteins during apoptosis specific cleavage of the nuclear pore complex protein nup62 by a viral protease role of nucleoporin induction in releasing an mrna nuclear export block nucleocytoplasmic traffic disorder induced by cardioviruses a picornavirus protein interacts with ran-gtpase and disrupts nucleocytoplasmic transport leader protein of encephalomyocarditis virus binds zinc, is phosphorylated during viral infection, and affects the efficiency of genome translation inhibition of ran guanosine triphosphatase-dependent nuclear transport by the matrix protein of vesicular stomatitis virus vsv disrupts the rae1/mrnp41 mrna nuclear export pathway influenza virus ns1 protein inhibits pre-mrna splicing and blocks mrna nucleocytoplasmic transport rig-i-mediated antiviral responses to single-stranded rna bearing 5 -phosphates inhibition of retinoic acid-inducible gene i-mediated induction of beta interferon by the ns1 protein of influenza a virus gemin8 is a novel component of the survival motor neuron complex and functions in small nuclear ribonucleoprotein assembly gemin8 is required for the architecture and function of the survival motor neuron complex a multiprotein complex mediates the atp-dependent assembly of spliceosomal u snrnps smnrp is an essential pre-mrna splicing factor required for the formation of the mature spliceosome specific sequence features, recognized by the smn complex, identify snrnas and determine their fate as snrnps biogenesis of 
small nuclear rnps inhibition of u snrnp assembly by a virus-encoded proteinase poliovirus-encoded protease 2a(pro) cleaves the tata-binding protein but does not inhibit host cell rna polymerase ii transcription in vitro the interaction of cytoplasmic rna viruses with the nucleus cleavage site analysis in picornaviral polyproteins: discovering cellular targets by neural networks enteroviral protease 2a cleaves dystrophin: evidence of cytoskeletal disruption in an acquired cardiomyopathy 2a proteinase of human rhinovirus cleaves cytokeratin 8 in infected hela cells our group is supported by a dgicyt grant (no. bfu2009-07352). the institutional grant awarded to the centro de biología molecular "severo ochoa" by the fundación ramón areces is also acknowledged. a. castelló is a beneficiary of a marie curie ief fellowship (seventh framework programme). alfredo castelló and enrique álvarez contributed equally to this work. key: cord-018018-2yyv8vuy authors: rybicki, ed title: history and promise of plant-made vaccines for animals date: 2018-07-04 journal: prospects of plant-based vaccines in veterinary medicine doi: 10.1007/978-3-319-90137-4_1 sha: doc_id: 18018 cord_uid: 2yyv8vuy plant-made vaccines are now a well-established and well-tested concept in veterinary medicine, yet the only product so far licenced was never produced commercially. this is puzzling, given the breadth of exploration of plant-made animal vaccines, and their immunogenicity and efficacy, over more than twenty years of research. the range of candidate vaccines that have been tested in laboratory animal models includes vaccines for e. coli, salmonella, yersinia pestis, foot and mouth disease virus, rabbit haemorrhagic disease virus, rabbit and canine and bovine papillomaviruses, mink enteritis and porcine circovirus, and lately also bluetongue virus, among many others. there are many proofs of efficacy of such vaccines, and regulatory pathways appear to have been explored for their licencing. 
this review will briefly explore the history of plant-made vaccines for use in animals, and will discuss in detail the unique advantages of plant-made vaccines in a veterinary medicine setting, with a proposal of their relevance within the “one health” paradigm. plant-produced vaccines for veterinary medicine are an exciting prospect, largely because of the possibilities of producing protein-based vaccines, including edible vaccines, at low cost, at almost any scale, and potentially locally and on demand. they have also been controversial because of the very real possibilities of contamination of the human food supply with vaccine-producing transgenic plants, and because of concerns around the possibility of immunological tolerance developing to oral or edible vaccines. however, one set of problems that many foresaw, namely regulatory and production problems, has not eventuated, and in fact the environment now seems primed very favourably for their introduction. the main justifications for plant-made vaccines are that vaccine antigen production in plants is safe; that it is both cheap and highly scalable; that plants produce and process eukaryote-derived proteins much better than can bacteria or even yeasts; and that use of plants would allow for production of vaccines in the developing world where they are needed most, and of vaccines or therapeutics that will never be produced economically by other technologies. however, despite more than twenty years of development, there are still no plant-produced vaccines or biologics available for animals, although there are in fact products licenced for and in use in humans. 
this review will explore the early history of plant-produced vaccines with an emphasis on proofs of principle and of efficacy, what the recent development of robust, stable transient plant production systems for vaccine antigens could mean for veterinary medicine, and the potential of plant-produced vaccines to advance both animal and potentially human health, under the banner of the one health movement. while viral proteins have probably been the most common vaccine candidates made in plants (reviewed in rybicki 2014), it was expression of a bacterial protein, escherichia coli heat-labile enterotoxin (lt-b), that first proved that veterinary-relevant antigens could be produced in plants, and provided the first proof of principle for edible vaccines. lt-b produced in transgenic tobacco or potatoes (haq et al. 1995) was functionally equivalent to e. coli-produced protein in specific assays, and immunisation of mice by oral gavage with plant material elicited systemic and mucosal toxin-neutralising antibodies. moreover, fresh potato containing lt-b was immunogenic in mice when eaten. an early virus vaccine candidate was one against mink enteritis virus (mev) disease: this was novel in that it comprised chimaeric cowpea mosaic virus (cpmv) virions incorporating a short linear epitope from mev vp2 capsid protein and displaying it on the surface of virions, produced by inoculation of bean plants with an infectious cdna clone of rcpmv (dalsgaard et al. 1997). this conferred protection against clinical disease and virtually abolished virus shedding, and given that the epitope sequence used is found in mev, canine parvovirus, and feline panleukopenia virus, the same vaccine could potentially also protect against these viruses. 
another early virus vaccine candidate was against rabbit haemorrhagic disease virus (rhdv): this was made by expressing the whole rhdv vp60 capsid protein in transgenic potatoes; parenteral immunisation with plant extracts was protective in rabbits (castanon et al. 1999) . subsequently, another study demonstrated that an edible vaccine consisting of leaves of transgenic plants containing presumably partially-assembled vp60 subunits, was an effective priming vaccine for later baculovirus-derived parenterally-delivered vaccine (gil et al. 2006 ). the first report of a foot and mouth disease virus (fmdv) plant-made antigen was of expression in plant protoplasts of a vp1-derived peptide of fmdv as an insertion into the minor coat protein of a replicating cpmv as a demonstration of antigen presentation (usha et al. 1993) . however, the first proof of efficacy was done using transgenic arabidopsis thaliana expressing whole vp1: parenteral immunisation of mice with leaf extracts elicited antibodies that bound to vp1 and to intact fmdv particles, and all immunised mice were protected against virulent fmdv challenge (carrillo et al. 1998) . the wigdorovitz group went on to demonstrate that mice could be protected against fmdv challenge after oral or parenteral vaccination with extracts of transgenic alfalfa expressing vp1 (wigdorovitz et al. 1999) , or immunisation with leaf extracts of tobacco plants expressing vp1 via a recombinant tobacco mosaic virus vector (wigdorovitz et al. 1999) . a refinement of these achievements included transgenic expression in alfalfa of amino acid residues 135-160 of vp1 (vp135-160) fused to glucuronidase (gus), which both allowed selection of strongest expressers by assay of enzyme activity, and was protective in mice (dus santos et al. 2002) . 
another novel application of carrier technology was the insertion of vp1 amino acids 140-160 (g-h loop) in an interior region of the hepatitis b virus core antigen gene (hbcag), and expression of the chimaera in transgenic nicotiana tabacum. the chimaeric protein formed virus-like particles (vlps) in the tobacco leaves, and mice immunised intraperitoneally with a soluble extract were protected against viral challenge (huang et al. 2005 ). an early attempt at showing the feasibility of making an anthrax vaccine was the expression in transgenic n. tabacum of the protective antigen (pa) protein of bacillus anthracis, possibly the best target for a subunit vaccine because it alone is protective (aziz et al. 2002) , although it went no further than showing cytolytic activity of the protein. soon after, the same group went on to express pa in transplastomic n. tabacum, with significant yield increases but still no efficacy trial (aziz et al. 2005) . another investigation of transplastomic tobacco by henry daniell's group was more thorough: yields were high (2.5 g/kg in fresh leaf tissue), the protein was protected in chloroplasts from protease cleavage and was stable when stored in leaves or as crude extracts, and was biologically active (watson et al. 2004 ). while they did not show immunogenicity or protection, the authors speculated that "with an average yield of 172 mg of pa per plant using an experimental transgenic cultivar grown in a greenhouse, 400 million doses of vaccine (free of contaminants) could be produced per acre". the daniell group subsequently showed that chloroplast-derived pa was equal in potency to the natural product from b anthracis, and that mice immunised subcutaneously with partially purified chloroplast-derived pa with adjuvant produced high igg titres and survived challenge with lethal doses of toxin (koya et al. 2005) . 
a different sort of approach to anthrax, and one of the first attempts at making a therapeutic antibody in plants, was taken by vidadi yusibov's group, who used the technique of transient agrobacterium infiltration-mediated expression in n. benthamiana to produce a human-derived pa-specific monoclonal antibody (hull et al. 2005) . the antibody neutralised toxin activity both in vitro and in vivo at a comparable level to hybridoma-produced antibodies. the yusibov group at what became fraunhofer usa center for molecular biotechnology later used the same transient expression technology to separately express artificial antigens comprising domain 4 of pa or domain 1 of b anthracis lethal factor (lf), fused in-frame with lichenase (lickm), a thermostable enzyme from clostridium thermocellum (chichester et al. 2007) . mice immunised with a combination of the two antigens produced high titres of mainly igg1, and sera could neutralise the effects of anthrax lethal toxin (letx) in vitro. rabies vaccines made in plants included an early yet highly sophisticated candidate that was composed of the alfalfa mosaic virus (amv) cp fused to an artificial polypeptide containing rabies virus g protein amino acids 253-275, and n protein amino acids 404-418, and expressed either in n. tabacum plants transgenic for amv replicase, or via rtmv in either n. benthamiana or spinach (yusibov et al. 2002) . the plants made particles containing amv-derived rna, encapsidated with chimaeric cp: raw spinach leaves were orally immunogenic in mice and in human volunteers. a simpler candidate was the g protein alone, with plant signal peptide and er retention signal, made in transgenic n. tabacum (ashraf et al. 2005) . while yields were relatively low (0.38% of total soluble leaf protein), purified protein injected peritoneally in mice elicited protective immunity against lethal intracerebral challenge with live rabies virus-an excellent proof of both principle and efficacy. 
plant-made animal rotavirus vaccines were an early target, with a stand-out study by yu and langridge (2001) providing evidence that transgenic potato could produce fusion proteins consisting of cholera toxin (ct) b and a2 subunits fused with murine rotavirus enterotoxin and enterotoxigenic e coli fimbrial antigen, respectively. fusion antigens assembled in potato tubers into cholera holotoxin-like structures that bound enterocytes, and elicited serum and intestinal antibodies after oral immunisation in mice. moreover, passively immunised mouse neonates were partially protected against diarrhoea after rotavirus challenge, demonstrating that combination vaccines for viral and bacterial pathogens may be made in plants. a simpler approach to rotavirus prevention was expression of a his-tagged vp8* fragment of bovine rotavirus (brv) vp4 in n. benthamiana via recombinant tmv, purification of the antigen by ni2+ chromatography, and intraperitoneal immunisation of adult female mice. eighty-five percent of suckling mice born from these mothers were protected from brv challenge, compared to 35% of those born to mothers immunised with an irrelevant control antigen. the same group also showed that a fusion protein made in transgenic alfalfa consisting of a short peptide derived from brv vp4 fused to gus was immunogenic both when given intraperitoneally and orally to adult female mice, and their sucklings were protected against challenge. another group used transgenic alfalfa to produce human rotavirus vp6, and showed that female mice gavaged with alfalfa extract containing oligocpg as an adjuvant developed high titres of antibodies both systemically and mucosally, and their pups were partially protected against simian rotavirus challenge. the same animal model first used to show the efficacy of insect cell-made papillomavirus virus-like particle (vlp)-based vaccines (breitburd et al.
1995) was also used to demonstrate the efficacy of two very different plant-made papillomavirus vaccines, a few years after the demonstration that human papillomavirus l1 major capsid protein virus-like particles could be produced in transgenic tobacco or potato (biemelt et al. 2003; varsani et al. 2003; warzecha et al. 2003) . cottontail rabbit papillomavirus (crpv), the cause of the famous "jackalope" sightings in the usa, provides an excellent model system in domestic rabbits for investigation of prophylactic and therapeutic papillomavirus vaccines (breitburd et al. 1997) . accordingly, in the first study crpv l1 major capsid protein-containing extracts were prepared either from transgenic n. tabacum or n. benthamiana infected with recombinant tmv, and used with freund's incomplete adjuvant to immunise rabbits that were subsequently challenged with live virus (kohl et al. 2006) . although the vaccines appeared to contain small aggregates of crpv l1 rather than intact vlps, and immune rabbit sera failed to neutralise crpv infectivity in an in vitro assay, the rabbits were protected from wart development (kohl et al. 2006 ). in the second study, infectious recombinant tmv was used to surface display, via fusion to the capsid protein, a peptide consisting of amino acids 94-122 of the l2 minor capsid protein from either crpv or the rabbit oral papillomavirus (ropv) (palmer et al. 2006) . groups of rabbits received either or both vaccines, and were challenged with live crpv or ropv. immune rabbit sera reacted with whole l2 protein, and crpv-specific sera neutralised crpv pseudovirion infectivity. rabbits receiving the crpv or crpv + ropv vaccines were completely protected against crpv infection, and those receiving ropv alone were weakly protected against crpv. 
these studies demonstrated for the first time that plant-made papillomavirus vaccines based on l1 protein or l2-derived peptide had real potential as prophylactic vaccines, for use in animals as well as in humans. strangely, given that bovine papillomaviruses (bpv) had been used for many years as model systems for anti-wart vaccination, it was not until 2012, with transient agroinfiltration-mediated expression of bpv-1 vlps in n. benthamiana, that a candidate plant-made bpv l1 vlp-based vaccine was successfully made, although no efficacy trials were done (love et al. 2012). expression of animal vaccine components in seeds of transgenic plants was attempted quite early on, with lamphear et al. (2002) reviewing their own earlier work on maize seed expression of the b subunit of e coli heat-labile enterotoxin and the tgev s protein, with data on the potency, efficacy, and stability of these vaccines. another report followed in 2002 on the expression in maize seed of the s envelope protein of transmissible gastroenteritis coronavirus (tgev) of swine, and its protective efficacy in piglets fed with the seed (jilka 2002). this followed an earlier demonstration of oral immunogenicity of the s protein n-terminal domain in transgenic potato tubers (gomez et al. 2000). rabies too was a target for maize seed expression, with a report of g protein expression in transgenic maize seed to 1% of total soluble protein, and complete protection in a heterologous rabies strain challenge of mice orally immunized with one dose of ~50 µg of g protein in seed extract (loza-rubio et al. 2008). the same group later showed that sheep orally given a single dose of the transgenic maize seed containing ~2 mg of the g protein were protected to the same extent as those immunized with a commercial inactivated vaccine.
the authors claimed that "this is the first study in which an orally administered edible vaccine showed efficacy in a polygastric model", which was an important development. maize was a popular target for both production and storage of recombinant proteins in early molecular farming times (see streatfield et al. 2003) ; however, other hosts were used too. for instance, the haemagglutinin (h) protein of rinderpest virus was expressed in transgenic pigeon pea to 0.49% of total soluble protein (satyavathi et al. 2003) , and also in peanuts for a product that was both orally and parenterally immunogenic in mice (khandelwal et al. 2004 ); so too was glycoprotein b (gb) of human cytomegalovirus in seeds of transgenic tobacco (tackaberry et al. 2003) , the fusion (f) glycoprotein of newcastle disease virus in transgenic rice seed (yang et al. 2007) , and the serotype-specific vp2 protein of bluetongue virus in transgenic peanuts (athmaram et al. 2006) . most of these efforts were negated, however, by the one big scandal to have hit molecular farming as far as the use of food plants for vaccine production is concerned. in 2002, aphis inspectors found volunteer tgev cp-expressing maize growing in soybean fields in two locations that were used to grow prodigene inc's tgev transgenic maize in the previous season (aphis 2008)-and in one, the soybeans were harvested with the maize plants still standing and sent to a storage facility, where they were mixed with a large volume of other seeds. the company was fined and paid substantial cleanup costs, had to develop a new compliance implementation programme, and the us dept of agriculture issued new guidelines for trials of such products. this had an unfortunate knock-on effect for molecular farming, in that it resulted in an effectively voluntary moratorium on the use of food crops for recombinant protein production worldwide. 
the one major success story of early work on veterinary vaccines was the approval by the us department of agriculture's center for veterinary biologics of dow agrosciences' injectable newcastle disease virus (ndv) haemagglutinin-based vaccine for poultry, which had been made in a suspension-cultured n. tabacum cell line. sadly, the product was never sold: the company only wanted '… to demonstrate that our concert™ plant-cell-produced system is capable of producing a vaccine that is safe and effective and to demonstrate that it meets the requirements for approval under the rigorous usda regulatory system. ndv is well known and understood by the regulatory agency, so it served as an excellent model to prove this new technology' (rybicki 2009). the early historical account of molecular farming for veterinary vaccines given above gives an idea of the array of technologies available and used up to the mid-2000s: transgenic and transplastomic expression of subunit proteins; recombinant plant viruses either used to express whole vaccine candidate genes, or to display chosen peptides fused to their capsid proteins; fusion of vaccine protein genes to carrier proteins to improve immunogenicity, including by inherent adjuvant properties; candidate parenteral and oral vaccines to both viruses and bacteria; therapeutics for animals made in plants; use of plant cell cultures to make antigens. many proofs of principle were obtained, for candidate vaccines against a wide range of viral and bacterial disease agents; and proofs of efficacy for vaccines delivered orally or parenterally, in whole plant material or as extracts.
while all of these aspects are still currently used in molecular farming, developments that have revolutionised the field were first, the widespread adoption of agrobacterium-mediated transient expression (agroinfiltration) of recombinant proteins; and second, the use of "deconstructed" plant virus-derived vectors delivered via agrobacterium to amplify expression (reviewed in rybicki 2010). these innovations enabled the advent of high-throughput testing of expression constructs, coupled with very rapid and generally higher yield production of vaccine antigens once optimal construct design had been determined. for example, our group investigated, via agroinfiltration techniques, three different codon usage schemes and three different intracellular localization strategies for optimization of human papillomavirus type 16 l1 protein expression in n. benthamiana, in one large experiment over only 7 days (maclean et al. 2007). use of deconstructed tmv-based vectors delivered by agrobacterium has routinely allowed significant increases of antigen yield, up to grams per kilogram fresh tissue weight (gleba et al. 2014; klimyuk et al. 2014). the so-called tmv-based "launch vectors" of fraunhofer usa have also allowed significant yield increases and rapid production of antigens (chichester et al. 2013; shamloul et al. 2014). improved non-replicating hyper-translational (ht) expression vectors derived from cowpea mosaic virus rna2 have also allowed significantly higher yields via agroinfiltration (sainsbury et al. 2008, 2009) and the possibility of expressing multiple genes from the same vector (saxena et al. 2016); so too has the use of a ssdna geminivirus-derived set of vectors by different groups (huang et al. 2009; regnard et al. 2010), and other ssdna plant (or other host) virus-derived vectors (rybicki and martin 2014).
the number of peptide display vectors/chimaeric protein fusion partners has multiplied: while self-replicating rtmv was once state of the art, now one may choose between tmv- and potato virus x (pvx)-based vectors (lico et al. 2015), cucumber mosaic virus (cmv) cp (nemchinov and natilla 2007; zhao and hammond 2005), bamboo mosaic virus (yang et al. 2007), pvx-vectored alternanthera mosaic virus (altmv) cp gene (tyulkina et al. 2011), lichenase (lickm), cholera toxin b subunit (ctb), amv cp, and gus, as mentioned earlier. plant virus virions in particular are now seen as easily-made nanoparticles suitable for a number of vaccine-relevant purposes (steele et al. 2017), including as self-adjuvanting peptide-based vaccine display vehicles (lebel et al. 2015; leclerc 2014), and excellent inducers of cross-presentation by mhc receptors (hanafi et al. 2010). the use of tags or small peptide fusion partners is now also considerably more sophisticated, with a variety of specialized tags to choose from. these include the now-ubiquitous 6xhis tag, used for ni2+-based or other immobilised metal affinity chromatography (imac) protein purification techniques; a new "cysta-tag" for the same purpose; the n-terminal proline-rich domain of maize seed gamma zein (zera) that induces the formation of er-located protein bodies (torrent et al. 2009); and elastin-like polypeptides (elps) with repeating pentapeptide 'vpgxg' sequences, or hydrophobins (small fungal proteins which alter the hydrophobicity of the fusion partner), both of which also form protein bodies (conley et al. 2011). as examples, our group has recently successfully used elp fusion to the cp of beak and feather disease virus (bfdv) of parrots to aid in both accumulation and purification of the protein as a candidate vaccine (duvenage et al. 2013).
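as a concrete illustration of the repeating 'vpgxg' pentapeptide motif of elps mentioned above, the following minimal sketch builds an elp-like repeat sequence and counts the repeats in a given sequence. it assumes the usual convention that the guest residue x may be any amino acid except proline; the function names and the example guest residues are hypothetical, and the sketch does not describe the constructs used in any of the cited studies.

```python
import re

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

# VPGXG repeat; assumption: the guest residue X is any residue except proline.
ELP_REPEAT = re.compile(r"VPG[ACDEFGHIKLMNQRSTVWY]G")


def build_elp(guest_residues):
    """Return an ELP-like sequence with one VPGXG repeat per guest residue."""
    guests = guest_residues.upper()
    for residue in guests:
        if residue not in AMINO_ACIDS or residue == "P":
            raise ValueError(f"invalid guest residue: {residue!r}")
    return "".join(f"VPG{residue}G" for residue in guests)


def count_elp_repeats(sequence):
    """Count non-overlapping VPGXG repeats anywhere in a protein sequence."""
    return len(ELP_REPEAT.findall(sequence.upper()))


# Hypothetical example: three repeats with guest residues V, A and G.
example_tag = build_elp("VAG")
```

a helper of this kind could be used, for instance, to check that a designed fusion tag contains the intended number of repeats before it is joined to an antigen sequence.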
we have also used the zera tag as a protein body display vehicle for an ectopic m2e moiety common to all influenzavirus a types, which could serve as a universal vaccine for these viruses (mbewana et al. 2015). another potentially veterinary use of zera was in the enhancement of yersinia pestis f1-v antigen fusion protein accumulation: this was ~3× higher than f1-v alone in three different host plant systems, namely n. benthamiana, alfalfa and n. tabacum nt1 suspension-cultured cells (alvarez et al. 2010). the expression vehicles themselves have also been subject to engineering: it is now possible to precisely control glycosylation of plant-made proteins. this can be done by knock-out modification via rna interference (rnai) technology of the plant glycosyltransferases beta1,2-xylosyltransferase (xylt) and core alpha1,3-fucosyltransferase (fuct). these enzymes are responsible for the transfer of beta1,2-linked xylose and core alpha1,3-linked fucose residues to glycoprotein n-glycans, which are plant-specific modifications not found in mammalian glycoproteins (strasser et al. 2008). it is also possible to use transient co-expression technologies to modify glycosylation (castilho and steinkellner 2016), as well as to achieve almost completely native sialylated recombinant proteins by expression of whole mammalian glycosylation pathways in plants (castilho et al. 2010; steinkellner and castilho 2015). it is possible to abolish n-glycosylation entirely, by co-expression of bacterial pngase f (mamedov and yusibov 2013). one can also control endogenous plant proteases that may limit recombinant protein accumulation: for example, transient co-expression of secreted a1/s1 protease inhibitor tomato cathepsin d inhibitor (slcdi) significantly lowered a1 and s1 protease activities in the n. benthamiana apoplast, while increasing recombinant protein content by ~45% (goulet et al. 2012).
it was found that co-expression of tomato cystatin slcys8, which inhibits c1a proteases, increased the transient expression yield of a monoclonal antibody in n. benthamiana by nearly 40% (robert et al. 2013). it is also possible to reduce protease activity in cell suspension cultures by expression of specific antisense rnas, resulting in significantly increased accumulation of recombinant antibodies (mandal et al. 2014). while suspension-cultured plant cells have been used for many years for molecular farming, and in fact were used for the only usda-licenced plant-produced animal vaccine (against ndv), new developments have made them an even more attractive prospect for low-cost vaccine production. use of flow cytometry with cell sorting, formerly the province of mammalian cell culture work only, has allowed the isolation of high-expressing mab-producing tobacco by-2 cell lines from a heterogeneous population of cells by selecting for the co-expressed fluorescent marker protein dsred (kirchhoff et al. 2012). however, one of the most exciting recent developments with this technology is the advent of the "cell pack": this is a technique for getting highly efficient (up to 100%) agrobacterium-mediated transient transformation of suspension-cultured cells that have been captured by suction onto a filter (rademacher 2014). cell packs can be tiny (eppendorf tube tips) or large (e.g. centimetres deep in a 20 cm buchner funnel); protein expression occurs in immobilised cells in the presence of minimal liquid media, and can continue for days (https://tinyurl.com/k22da6q). the technology is ideal for rapid and high-throughput screening of expression, and the possibility exists for taking cells back into culture and selecting for permanent transfection. these are important developments, because of the acceptability of the products of plant cell cultures for production of biologics to regulatory bodies (see below).
another production host highly suited to industrial-scale production is microalgae: they are easier to establish and use than plant cell cultures, and share all the same advantages of scalability, contained growth, and consistent transgene expression levels (specht and mayfield 2014) . a very important development for molecular farming has been the development of protocols for increasing yields and implementing industrial-scale production and downstream processing of vaccines and biologics, without which no large-scale trials could take place, or routine manufacturing occur. a useful development was use of a transgenic n. tabacum/n. glauca hybrid that does not synthesize alkaloids, is highly vigorous, can easily be propagated by vegetative cuttings and does not produce viable pollen, which greatly aids biocontainment (ling et al. 2012 ). the application of techniques more familiar to chemical engineers is also advantageous: for example, it proved possible, by sequential use of fractional factorial designs and response surface methodology, to optimize culture media for mab production in transgenic tobacco by-2 cells, and to increase mab yields up to 31-fold after 10 days of culture compared to use of standard media (vasilev et al. 2013 ). the fraunhofer ime group have described generic chromatography-based strategies focusing on the binding behaviours of host cell proteins to chromatography resins under varying conditions of ph and conductivity (buyel and fischer 2014) . another useful technique from that group is a comprehensive description of the use of heat treatment of either intact leaves or of plant extracts to facilitate the industrial-scale removal of host cell proteins, optimised by a design-of-experiments approach that will also be familiar to engineers (buyel et al. 2016 ). many of these and other strategies used to optimise yields in molecular farming are reviewed here (twyman et al. 2013) . 
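the fractional factorial screening step mentioned above can be illustrated with a small sketch. this is not the design used by vasilev et al. (2013): it assumes four hypothetical two-level medium factors and builds a half-fraction 2^(4-1) design in which the fourth factor is set to the product of the other three (generator d = abc), so that eight runs rather than sixteen are enough to estimate the main effects. the factor names are placeholders, and a response surface step would normally follow for the factors found to matter.

```python
from itertools import product


def half_fraction_design(base_factors, generated_factor):
    """Build a two-level half-fraction design.

    The base factors get a full factorial in coded levels (-1, +1); the
    generated factor is set to the product of the base-factor levels.
    """
    runs = []
    for levels in product((-1, 1), repeat=len(base_factors)):
        generated_level = 1
        for level in levels:
            generated_level *= level
        run = dict(zip(base_factors, levels))
        run[generated_factor] = generated_level
        runs.append(run)
    return runs


def main_effect(runs, responses, factor):
    """Mean response at the high level minus mean response at the low level."""
    high = [y for run, y in zip(runs, responses) if run[factor] == 1]
    low = [y for run, y in zip(runs, responses) if run[factor] == -1]
    return sum(high) / len(high) - sum(low) / len(low)


# Hypothetical medium components, for illustration only.
design = half_fraction_design(["nitrogen", "sucrose", "phosphate"], "hormone")
```

each entry of `design` is one culture condition in coded units; once a yield has been measured for every run, `main_effect(design, yields, "nitrogen")` gives a first estimate of how strongly that component influences yield.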
the establishment by various companies and institutes of facilities suitable for manufacture of animal and clinical trial material is also a very welcome development. as examples, the long-established kentucky bioprocessing inc (kbp) is a contract manufacturer capable of production from transgenic plants or transiently transfected plants, using the u.s. food and drug administration's current good manufacturing practices (cgmp) for pharmaceuticals, at scales up to thousands of kilograms of plants per week (https://www.kentuckybioprocessing.com/). they have recently produced and stockpiled mabs against ebolaviruses. another contract manufacturing firm with large production capacity is ibio inc: like kbp, they have a wide range of patents on their proprietary gene expression technology (holtz et al. 2015). they are also partnering with a range of agencies and companies, including with the brazilian oswaldo cruz foundation for plant-made yellow fever vaccine, and the us dept of defense and the bill & melinda gates foundation for influenza vaccines (http://www.ibioinc.com/). the fraunhofer usa center for molecular biotechnology (http://www.fhcmb.org/) is a not-for-profit research and development organisation that offers "… plant-based protein production, purification, scale-up and gmp manufacturing to support the development of vaccines, therapeutics and diagnostics", also with proprietary expression platforms, and can take products right through to fill and finish. the fraunhofer ime in aachen also has a state-of-the-art mechanised plant production facility still under construction as of 2017.
the regulatory environment has changed for the better, even though it was not in truth as inimical as first supposed: this was borne out by the fact that as early as 2006, the cuban regulatory agencies and the usda had approved plant-made mabs for the purification of an already-licenced yeast-made hepatitis b vaccine, and the tobacco cell-made ndv vaccines, respectively (rybicki 2009). as another early example, the fraunhofer ime molecular farming group published in 2004 that use of whole plants for biologics production lacks intrinsic benefits of cell culture techniques, such as precise control over growth conditions, batch-to-batch product consistency and sterile containment, and that compliance with good manufacturing practice (gmp) is much harder to achieve (hellwig et al. 2004). they pointed out that plant cell suspension cultures have all the merits of microbial and animal cell cultures, have an established track record for secondary metabolite production, and are far cheaper to use. these justifications notwithstanding, the same group later noted, in a review on gmp issues for plant-made proteins in whole plants, that: "when [plant-derived] recombinant proteins are intended for medical use… they fall under the same regulatory guidelines for manufacturing that cover drugs from all other sources, and when such proteins enter clinical development this includes the requirement for production according to [gmp]. in principle, the well-characterized gmp regulations that apply to pharmaceutical proteins produced in bacteria and mammalian cells are directly transferrable to plants". they subsequently were able to get gmp manufacturing authorisation from german authorities for making mabs from transgenic n. tabacum for a phase i clinical trial (ma et al. 2015).
other entities have also scaled and regularised production to allow production of materials for animal and clinical trial-and one of the most successful has been medicago inc., who presently has routine large-scale production of influenzavirus a haemagglutinin (ha)-based vlps for use in advanced human clinical trial (d'aoust et al. 2010) . in 2012, medicago inc. succeeded in manufacturing 10 million doses of an h1n1 vlp-based influenza vaccine candidate in one month, by phase 1-appropriate cgmp, as part of the us defense advanced research projects agency (darpa)-funded challenge (darpa 2012) . a group in japan has also recently developed a gmp-compliant production process for a transgenic rice seed-based cholera vaccine-mucorice-ctb-which is simply polished, powdered seed, now in clinical trial (kashima et al. 2016) . as evidence of the increasing maturity of veterinary molecular farming, one of the editors of this book has co-authored a recent article on regulatory and commercial hurdles hampering the advance to market of plant-produced veterinary vaccines, covering developing business plans, assessing market opportunities, manufacturing scale-up, financing, protecting and using intellectual property, and regulatory approval (macdonald et al. 2015) . at first sight, molecular farming appears the ideal way to make recombinant protein-based veterinary vaccines: production of active ingredients is markedly cheaper per unit mass than by use of any animal tissue-culture system, and generally cheaper than yeast or bacterial culture (rybicki 2010 ); partially-purified or unprocessed extracts are highly unlikely to contain any animal pathogens; edible and oral vaccines appear highly feasible; the financial barrier to entry for manufacture appears far lower than for conventionally-made vaccines. 
it is possible to efficiently make bacterial proteins using bacterial-derived translational machinery in chloroplasts in transplastomic plants, as well as to make other proteins at very high yield; conventional transgenics have been used to make many vaccine candidates, with many proofs of efficacy; transient expression technologies have revolutionised the field in terms of providing high yields and very rapid development times from concept to product. and yet, only one product (dow's ndv vaccine) is registered for use, and that is not sold. it is possible that heavy investment by big industry players in conventional manufacturing technologies has stalled their uptake of molecular farming technology for veterinary vaccines and biologics: this has certainly been true for human biologics. however, perhaps developments from the human field could be used as a spur for uptake of plant-made veterinary vaccines and biologics: an example here is the licencing of protalix biotherapeutics' elelyso®, a recombinant glucocerebrosidase used as a therapeutic for a genetic lysosomal enzyme deficit called gaucher disease, made using transgenic carrot cell lines in 800 litre plastic bag fermenters (http://protalix.com/about/elelyso/). a contamination of genzyme's mammalian cell production facilities in 2009 with a mammalian calicivirus led to the fda allowing protalix to supply the drug to patients who needed it, and to accelerated licensure (bethencourt 2009). the company has also successfully tested oral administration of drugs in plant cells, which would be a highly welcome development: they claim that "oral delivery of protein therapies [is] possible due to the unique cellulose wall of plant cells that makes them resistant to degradation when passing through the digestive tract" (protalix 2017).
another apposite example was the fortuitous availability of a plant-made anti-zaire ebolavirus mab cocktail known as zmapp™, at the height of the recent west african ebola disease outbreak (reviewed in rybicki 2014). this was made by transient expression in n. benthamiana, and only a few clinical trial doses were available: these were used under the humanitarian principle, and later the mabs were cleared for use by the fda in an efficacy trial just before the end of the epidemic (leafbio 2016). both these examples are of niche products that were not being made at large scale or for a large market by conventional techniques, and for which there was a sudden, pressing need that could not be supplied by other means. this could provide motivation for small companies to either develop inexpensive vaccines for emerging diseases, or to target niche vaccines or niche therapeutics, in the knowledge that large established entities are unwilling to take the risk. one example for the former possibility comes from the recent emergence of bluetongue virus (btv) disease in sheep and small ruminants in europe, due to northward spread of the insect vector with climate change (purse et al. 2008) : while attenuated live vaccines are available-south africa presently uses a cocktail of 24 such viruses-concerns in europe about reassortment of virus dsrna genome components between vaccinated and naturally diseased animals, as well as of the safety of the vaccines in terms of possible under-attenuation which may result in disease development in certain sheep breeds (niedbalski 2011) , mean these are not being used. the irregular occurrence of outbreaks, and the limited number of strains involved, mean that stockpiling vaccines is desirable. however, killed vaccines still require growing potentially dangerous viruses, and while it is possible to make vlps in cell cultures and these are effective (pearson and roy 1993; roy et al. 1994) , the technology is too expensive for farm animal use. 
it is fortunate, therefore, that it is also possible to make btv-8 vlps via transient expression in n. benthamiana, and these are as effective in a single injected dose as the commercial vaccine (thuenemann et al. 2013). there are currently no plans to manufacture this or other plant-produced btv vaccines for the european or other markets; however, this may soon change. an example of a niche vaccine product comes from our own and others' work on beak and feather disease virus (bfdv) vaccines: psittacines are highly valued companion animals; however, there are very few vaccines for their diseases, and none yet available for bfdv. while some recent work in this area has shown that recombinant cp can be made in e. coli and in insect cells (heath et al. 2006; patterson et al. 2013; stewart et al. 2007), that it appears to be protective (bonne et al. 2009) and that it can apparently form vlps (sarker et al. 2015), it still appears that the protein is too expensive to produce for use as a vaccine. while initial work with plant production of bfdv cp was disappointing due to low yields, recent work from our group (duvenage et al. 2013) showed a significant increase in bfdv yield due to fusion with elastin-like polypeptide (elp), and good immunogenicity in mice. this, coupled with a very simple purification protocol enabled by elpylation (conley et al. 2009), could allow scalable, cheap production of bfdv vaccines. while therapeutics such as mabs or other biologics for veterinary use are generally limited to high-value companion animals, plant production could open up a hitherto neglected market niche. one excellent example is the manufacture in japan of canine interferon-α (tabayashi and matsumura 2014): this is done via transgenic strawberries in a completely enclosed gmp-compliant facility, and the product is powdered strawberry extract given orally, to combat canine periodontal disease. 
another very recent example in dogs, albeit with them being used as a model for human disease, was the proof that lyophilised transplastomic lettuce leaves expressing ctb fusions of coagulation factor ix (fix) could be used orally in feed for >300 days in haemophilia b dogs with no ill effects-and that this treatment resulted in robust suppression of igg/inhibitor and ige formation against intravenously-provided fix, and a marked shortening of blood coagulation times (herzog et al. 2017 ). an example for agricultural use is the oral dosing of pigs with transgenic arabidopsis thaliana seeds containing designer igas against enterotoxigenic e coli (etec) (virdi et al. 2013) : this product consisted of dimeric llama-derived heavy chain variable region fused to the fc portion of a porcine iga and the porcine iga j chain and secretory component, which allowed production of dimeric secretory iga-like antibodies (vhh-iga). in a piglet feed-challenge experiment with etec, dosing piglets with 20 mg/d per pig vhh-iga produced a progressive decline in bacterial shedding and a significantly higher weight gain than seen in control or other experimental pigs. a highly novel plant-made therapeutic product was the receptor binding domain of the tailspike protein gp9 from the p22 bacteriophage: this is known to reduce salmonella colonisation in the chicken gut (miletic et al. 2015) . purified elp-fused gp9 bound to salmonella enterica serovar typhimurium in vitro, and feeding lyophilized leaves containing gp9-elp to newly hatched chickens showed that it has the potential to control salmonella contamination in commercially-raised fowl. these and other experiments are reviewed here (juarez et al. 2016; topp et al. 2016) , in articles that make an excellent case for plant-made immunotherapeutics for veterinary use. 
the one health concept has as one of its central themes the integration of opportunities for vaccine-based approaches for the prevention of zoonotic and emerging diseases across veterinary and human medicine (monath 2013). a set of disease agents which exemplify the potential strength of the one health approach are influenza viruses, and they have in fact been the focus of a number of international meetings and planning sessions (chien 2013; dwyer and kirkland 2011; kahn et al. 2014; ludwig et al. 2014; powdrill et al. 2010; short et al. 2015). the unique mix of hosts that occurs in intensive agricultural environments that could give rise to pandemics (swine, birds and humans) is a major cause of international concern; so too is the development of suitable vaccines for the prevention of infection in domesticated birds, farmed swine, and humans. plants have been shown to be highly useful for the production of influenza vaccines, and indeed possibly the fastest ever production at scale of an influenzavirus a strain vaccine (1 month for 10 million doses) was done by medicago inc. for h1n1pdm 2009 ha vlps in 2012 (rybicki 2014). medicago also managed in 2013, as an exercise to demonstrate preparedness, to produce grams of cgmp-grade plant-made h7 ha-only vlps only 19 days after accessing the h7 ha gene cdna sequence, in response to an outbreak in china in the same year. the fact that plant-made influenza vaccines have worked very well in animal models means that they should be trialled extensively in domestic fowl and swine, to see if the maintenance of the viruses in these hosts can be curbed. as for companion animals, there is even a canine influenza vaccine candidate: following a 2004 h3n8 outbreak in the us, a group in canada used the plant-derived filamentous malva mosaic virus (mamv) nanoparticles as a vaccine platform to display the highly conserved ectopic m2e peptide and to increase its immunogenicity. 
together with the adjuvant ompc derived from salmonella typhi, the vaccine was protective against both the homologous virus and a heterosubtypic strain of influenza in mice, as well as eliciting antibodies reactive with m2e peptides derived from h9n2, h5n1 and h1n1 strains and being immunogenic in dogs (leclerc et al. 2013). given that brucellosis is listed as a one health priority, it is worth noting that a transgenic plant-produced b. abortus outer membrane protein (u-omp19) was an effective oral vaccine in mice against a systemic challenge, eliciting an adaptive il-17 immune response (pasquevich et al. 2011). the protein also has significant adjuvant activity: oral vaccination of mice with u-omp19 plus salmonella antigens was protective against virulent challenge with s. typhimurium (risso et al. 2017). it is important to realise that, while vaccines are the target of this review, one health products can also be reagents to be used in more effective or cheaper diagnostic kits, and in particular for point-of-care devices, or for research laboratory use, and especially proteins that could be both a reagent and used as a candidate vaccine in animals and possibly humans. a few of the best potential one health targets for plant-made dual-function proteins would be proteins from middle east respiratory syndrome (mers) coronavirus (wirblich et al. 2017), nipah and hendra viruses (landford and nunn 2012; mackenzie et al. 2003), and diagnostic/vaccine candidate proteins from rift valley fever and crimean-congo haemorrhagic fever viruses (kortekaas 2014; monath 2013). inexpensive and abundant proteins made from these agents could first serve as reagents in the development of cheap point-of-care diagnostics, and then as vaccine candidates in animals, if appropriate, and then possibly in humans. a useful example here is of the expression both by agroinfiltration in n. benthamiana as a reagent, and in transgenic n. 
tabacum roots and leaves as a vaccine, of a fused gcgn envelope glycoprotein-encoding gene from crimean-congo haemorrhagic fever virus (ghiasi et al. 2011 ). the protein yield was 1-2 mg/kg fresh plant weight. transgenic material was orally immunogenic, and elicited humoral and mucosal antibody responses, and antibodies bound inactivated virus used as a vaccine booster in some experiments. agroinfiltration-produced gngc was used as a reagent in elisa to detect immune responses. another study from our group was of the production of cchfv n protein in n. benthamiana by agroinfiltration specifically as a reagent for use in diagnostic tests (atkinson et al. 2016 ): a plant codon-optimised and 6xhis tagged n protein gene was found to accumulate best as a soluble protein in the cytoplasm, from which it could be easily purified by ammonium sulphate fractionation and immobilised ni 2+ column chromatography. purified np was used in a validated indirect elisa to detect anti-cchfv igg in sera from convalescent human patients: this was successful for 13/13 samples, with no readings for samples from patients with no history of cchfv infection. the results were 100% concordant with those from a commercially available immunofluorescent assay. given that soluble n protein is hard to produce and difficult to purify from insect cell cultures, the plant-made product would seem to be a desirable replacement. while the same has been said in many venues over more than twenty years now, the field of molecular farming really does seem to be near to meeting its initial promise for veterinary use. 
all of the technology that is required for efficient, high-yield production of biologics is in place; downstream processing modalities have been well worked out by a number of near-and cgmp-compliant facilities; many candidate vaccines for a wide variety of pathogens have been tested; therapeutic biologics too for veterinary use are now feasible; regulatory agencies seem agreeable to considering plant-made products. the generally shorter regulatory path, the possibility of using less stringently purified products, and the very real possibility of using oral vaccines and therapeutics, should also be highly attractive for product developers. i sincerely hope, then, that realisation of the promise comes very soon.
higher accumulation of f1-v fusion recombinant protein in plants after induction of protein body formation noncompliance history high level expression of surface glycoprotein of rabies virus in tobacco leaves and its immunoprotective activity in mice integration and expression of bluetongue vp2 gene in somatic embryos of peanut through particle bombardment method plant-produced crimean-congo haemorrhagic fever virus nucleoprotein for use in indirect elisa expression of protective antigen in transgenic plants: a step towards edible vaccine against anthrax transformation of an edible crop with the paga gene of bacillus anthracis virus stalls genzyme plant production of human papillomavirus type 16 virus-like particles in transgenic plants assessment of recombinant beak and feather disease virus capsid protein as a vaccine for psittacine beak and feather disease immunization with viruslike particles from cottontail rabbit papillomavirus (crpv) can protect against experimental crpv infection the rabbit viral skin papillomas and carcinomas: a model for the immunogenetics of hpv-associated carcinogenesis generic chromatography-based purification strategies accelerate the development of downstream processes for biopharmaceutical proteins produced in plants 
comparison of tobacco host cell protein removal methods by blanching intact plants or by heat treatment of extracts immunization with potato plants expressing vp60 protein protects against rabbit hemorrhagic disease virus in planta protein sialylation through overexpression of the respective mammalian pathway transient expression of mammalian genes in n. benthamiana to modulate n-glycosylation immunogenicity of a subunit vaccine against bacillus anthracis a plant-produced protective antigen vaccine confers protection in rabbits against a lethal aerosolized challenge with bacillus anthracis ames spores how did international agencies perceive the avian influenza problem? the adoption and manufacture of the 'one world, one health' framework optimization of elastin-like polypeptide fusions for expression and purification of recombinant proteins in plants protein body-inducing fusions for high-level production and purification of recombinant proteins in plants the production of hemagglutinin-based virus-like particles in plants: a rapid, efficient and safe response to pandemic influenza plant-derived vaccine protects target animals against a viral disease a novel methodology to develop a foot and mouth disease virus (fmdv) peptide-based vaccine in transgenic plants expression in tobacco and purification of beak and feather disease virus capsid protein fused to elastin-like polypeptides influenza: one health in action gmp issues for recombinant plant-derived pharmaceutical proteins mice orally immunized with a transgenic plant expressing the glycoprotein of crimean-congo hemorrhagic fever virus successful oral prime-immunization with vp60 from rabbit haemorrhagic disease virus produced in transgenic plants using different fusion strategies plant viral vectors for delivery by agrobacterium oral immunogenicity of the plant derived spike protein from swine-transmissible gastroenteritis coronavirus a protease activity-depleted environment for heterologous proteins migrating 
towards the leaf cell apoplast two distinct chimeric potexviruses share antigenic cross-presentation properties of mhc class i epitopes oral immunization with a recombinant bacterial antigen produced in transgenic plants the capsid protein of beak and feather disease virus binds to the viral dna and is responsible for transporting the replication-associated protein into the nucleus plant cell cultures for the production of recombinant proteins oral tolerance induction in hemophilia b dogs fed with transplastomic lettuce commercial-scale biotherapeutics manufacturing facility for plant-made pharmaceuticals immunogenicity of the epitope of the foot-and-mouth disease virus fused with a hepatitis b core protein as expressed in transgenic tobacco a dna replicon system for rapid high-level production of virus-like particles in plants human-derived, plant-produced monoclonal antibody for the treatment of anthrax an oral vaccine in maize protects against transmissible gastroenteritis virus in swine biomanufacturing of protective antibodies and other therapeutics in edible plant tissues for oral applications swine and influenza: a challenge to one health research good manufacturing practices production of a purification-free oral cholera vaccine expressed in transgenic rice plants systemic and oral immunogenicity of hemagglutinin protein of rinderpest virus expressed by transgenic peanut plants in a mouse model monoclonal tobacco cell lines with enhanced recombinant protein yields can be generated from heterogeneous cell suspension cultures by flow sorting production of recombinant antigens and antibodies in nicotiana benthamiana using 'magnifection' technology: gmp-compliant facilities for small-and large-scale manufacturing plant-produced cottontail rabbit papillomavirus l1 protein protects against tumor challenge: a proof-of-concept study one health approach to rift valley fever vaccine development plant-based vaccine: mice immunized with chloroplast-derived anthrax 
protective antigen survive anthrax lethal toxin challenge delivery of subunit vaccines in maize seed good governance in 'one health' approaches leafbio announces conclusion of zmapp™ clinical trial plant viruses as nanoparticle-based vaccines and adjuvants a novel m2e based flu vaccine formulation for dogs plant viral epitope display systems for vaccine development the two-faced potato virus x: from plant pathogen to smart nanoparticle an interspecific nicotiana hybrid as a useful and cost-effective platform for production of animal vaccines in planta production of a candidate vaccine against bovine papillomavirus type 1 induction of a protective immune response to rabies virus in sheep after oral immunization with transgenic maize, expressing the rabies virus glycoprotein influenza, a one health paradigm-novel therapeutic strategies to fight a zoonotic pathogen with pandemic potential regulatory approval and a first-in-human phase i clinical trial of a monoclonal antibody produced in transgenic tobacco plants bringing plant-based veterinary vaccines to market: managing regulatory and commercial hurdles managing emerging diseases borne by fruit bats (flying foxes), with particular reference to henipaviruses and australian bat lyssavirus optimization of human papillomavirus type 16 (hpv-16) l1 expression in plants: comparison of the suitability of different hpv-16 l1 gene variants and different cell-compartment localization in vivo deglycosylation of recombinant proteins in plants by co-expression with bacterial inhibition of protease activity by antisense rna improves recombinant protein production in nicotiana tabacum cv. 
bright yellow 2 (by-2) suspension cells production of h5n1 influenza virus matrix protein 2 ectodomain protein bodies in tobacco plants and in insect cells as a candidate universal influenza vaccine a plant-produced bacteriophage tailspike protein for the control of salmonella vaccines against diseases transmitted from animals to humans: a one health paradigm transient expression of the ectodomain of matrix protein 2 (m2e) of avian influenza a virus in plants bluetongue vaccines in europe protection of rabbits against cutaneous papillomavirus infection using recombinant tobacco mosaic virus containing l2 capsid epitopes an oral vaccine based on u-omp19 induces protection against b. abortus mucosal challenge by inducing an adaptive il-17 immune response in mice differential expression of two isolates of beak and feather disease virus capsid protein in escherichia coli genetically engineered multi-component virus-like particles as veterinary vaccines passive protection to bovine rotavirus (brv) infection induced by a brv vp8* produced in plants using a tmv-based vector one health approach to influenza: assessment of critical issues and options invasion of bluetongue and other orbivirus infections into europe: the role of biological and climatic processes high level protein expression in plants through the use of a novel autonomously replicating geminivirus shuttle vector u-omp19 from brucella abortus is a useful adjuvant for vaccine formulations against salmonella infection in mice protection of recombinant mammalian antibodies from development-dependent proteolysis in leaves of nicotiana benthamiana long-lasting protection of sheep against bluetongue challenge after vaccination with virus-like particles: evidence for homologous and partial heterologous protection plant-produced vaccines: promise and reality plant-made vaccines for humans and animals plant-based vaccines against viruses virus-derived ssdna vectors for the expression of foreign proteins in plants 
expression of multiple proteins using full-length and deleted versions of cowpea mosaic virus rna-2 peaq: versatile expression vectors for easy and quick transient expression of heterologous proteins in plants a chimeric affinity tag for efficient expression and chromatographic purification of heterologous proteins from plants an efficient approach for recombinant expression and purification of the viral capsid protein from beak and feather disease virus (bfdv) in escherichia coli expression of hemagglutinin protein of rinderpest virus in transgenic pigeon pea [cajanus cajan (l.) millsp.] plants virus-derived vectors for the expression of multiple proteins in plants optimization and utilization of agrobacteriummediated transient protein production in nicotiana algae-based oral recombinant vaccines synthetic plant virology for nanobiotechnology and nanomedicine n-glyco-engineering in plants: update on strategies and major achievements baculovirus expression of beak and feather disease virus (bfdv) capsid protein capable of self-assembly and haemagglutination generation of glyco-engineered nicotiana benthamiana for the production of monoclonal antibodies with a homogeneous human-like n-glycan structure corn as a production system for human and animal vaccines forefront study of plant biotechnology for practical use: development of oral drug for animal derived from transgenic strawberry increased yield of heterologous viral glycoprotein in the seeds of homozygous transgenic tobacco plants cultivated underground a method for rapid production of heteromultimeric protein complexes in plants: assembly of protective bluetongue virus-like particles the case for plant-made veterinary immunotherapeutics protein body induction: a new tool to produce and recover recombinant proteins in plants optimizing the yield of recombinant pharmaceutical proteins in plants new viral vector for superproduction of epitopes of vaccine proteins in plants expression of an animal virus antigenic 
site on the surface of a plant virus particle expression of human papillomavirus type 16 major capsid protein in transgenic nicotiana tabacum cv optimization of by-2 cell suspension culture medium for the production of a human antibody using a combination of fractional factorial designs and the response surface method orally fed seeds producing designer igas protect weaned piglets against enterotoxigenic escherichia coli infection oral immunogenicity of human papillomavirus-like particles expressed in potato expression of bacillus anthracis protective antigen in transgenic chloroplasts of tobacco, a non-food/feed crop induction of a protective antibody response to foot and mouth disease virus in mice following oral or parenteral immunization with alfalfa transgenic plants expressing the viral structural protein vp1 protection of mice against challenge with foot and mouth disease virus (fmdv) by immunization with foliar extracts from plants infected with recombinant tobacco mosaic virus expressing the fmdv structural protein vp1 protective lactogenic immunity conferred by an edible peptide vaccine to bovine rotavirus produced in transgenic plants one-health: a safe, efficient, dual-use vaccine for humans and animals against middle east respiratory syndrome coronavirus and rabies virus induction of protective immunity in swine by recombinant bamboo mosaic virus expressing foot-and-mouth disease virus epitopes expression of the fusion glycoprotein of newcastle disease virus in transgenic rice and its immunogenicity in mice a plant-based multicomponent vaccine protects mice from enteric diseases expression in plants and immunogenicity of plant virus-based experimental rabies vaccine development of a candidate vaccine for newcastle disease virus by epitope display in the cucumber mosaic virus capsid protein key: cord-266481-9afb0yvt authors: naskalska, antonina; dabrowska, agnieszka; szczepanski, artur; milewska, aleksandra; jasik, krzysztof piotr; pyrc, krzysztof 
title: membrane protein of human coronavirus nl63 is responsible for interaction with the adhesion receptor date: 2019-07-17 journal: journal of virology doi: 10.1128/jvi.00355-19 sha: doc_id: 266481 cord_uid: 9afb0yvt human coronavirus nl63 (hcov-nl63) is a common respiratory virus that causes moderately severe infections. we have previously shown that the virus uses heparan sulfate proteoglycans (hspgs) as the initial attachment factors, facilitating viral entry into the cell. in the present study, we show that the membrane protein (m) of hcov-nl63 mediates this attachment. using viruslike particles lacking the spike (s) protein, we demonstrate that binding to the cell is not s protein dependent. furthermore, we mapped the m protein site responsible for the interaction with hspg and confirmed its relevance using a viable virus. importantly, in silico analysis of the region responsible for hspg binding in different clinical isolates and the amsterdam i strain did not exhibit any signs of cell culture adaptation. importance it is generally accepted that the coronaviral s protein is responsible for viral interaction with a cellular receptor. here we show that the m protein is also an important player during early stages of hcov-nl63 infection and that the concerted action of the two proteins (m and s) is a prerequisite for effective infection. we believe that this study broadens the understanding of hcov-nl63 biology and may also alter the way in which we perceive the first steps of cell infection with the virus. the data presented here may also be important for future research into vaccine or drug development. navirus (bcov), and avian infectious bronchitis virus (ibv), reportedly use sialic acids for initial attachment to the cell (summarized in reference 10). the coronaviral particle consists of a dense core formed by a nucleocapsid (n) protein with viral genomic rna and an envelope decorated with the membrane (m), envelope (e), and spike (s) proteins. 
some coronaviruses contain other structural proteins, such as hemagglutinin esterase (he) or accessory open reading frame (orf) proteins (orf3, orf4a, and orf7) (59, 69) . the s protein is a class i viral membrane glycoprotein responsible for the interaction with the entry receptor and fusion (11) . hcov-nl63 employs ace2 as an entry receptor (3) . however, we recently reported that heparan sulfate (hs) proteoglycans (hspgs) are required for effective adhesion of the virus to the cell surface and that such an interaction enhances the infection process (12) . hspg binding was also demonstrated for mhv (13) , sars-cov (14) , and porcine epidemic diarrhea virus (pedv) (15) . hspgs are glycoproteins that are ubiquitous at the surface of the mammalian cell. binding to hspg is the initial event promoting subsequent recognition of a secondary receptor by increasing the local concentration of pathogens or triggering conformational changes of proteins involved in viral entry. as an example, binding to hspg induces structural rearrangements of proteins responsible for infection by adenoassociated virus 2 (16) , adenovirus types 2 and 5 (17) , human papillomavirus 16 (18) , and several herpesviruses (19) . furthermore, the majority of oncogenic viruses (hepatitis b and c viruses, kaposi's sarcoma-associated herpesvirus, human papillomaviruses, merkel cell polyomavirus, and human t cell lymphotropic virus type 1) initially attach to the hspgs (20) . hspgs can also enhance virulence by binding accessory viral factors necessary for viral replication. this is illustrated by hspg binding of the tat protein of human immunodeficiency virus, which after internalization activates transcription of the viral rna (21) . to better address the nature of the virus-hspg interaction, we constructed viruslike particles (vlps) composed of hcov-nl63 proteins. 
vlps structurally mimic the native virus and thus constitute a good model for studying virus-host interactions in the context of additional capsid components. importantly, vlps can be relatively easily tailored by molecular biotechnology techniques, facilitating the assessment of the role of individual viral proteins in receptor recognition. in the present study, we demonstrate that hcov-nl63 binds hspg via the m protein, which is to some extent responsible for virus attachment. we thus show that viral entry is an outcome of the concerted action of the m and s proteins. the presented findings improve the understanding of viral biology and may facilitate the development of improved antivirals and neutralizing vaccines. production and purification of vlps. hcov-nl63 vlps composed of the m, e, n, and (optionally) s proteins were produced in insect cells, as previously described (22), with the modification that the n protein was also included. to evaluate protein expression, insect cells infected with bicistronic recombinant baculovirus (rbv) coding for m and e proteins (m+e) and monocistronic n rbv (and optionally monocistronic s rbv) were immunostained and examined under a confocal microscope. colocalization of the m and n proteins was observed, suggesting the formation of a protein complex within the producer cells (fig. 1a). to verify whether vlps were effectively assembled and released, the culture medium harvested from insect cells expressing m, e, n, and s proteins was analyzed by western blotting. the m (26-kda), n (42-kda), and s (150-kda) proteins were detected in the secreted fraction when insect cells were coinfected with (m+e) rbv, n rbv, and, optionally, s rbv (fig. 1b). in agreement with our previous studies, the m protein migrated as two bands, reflecting its two different glycosylation states. similarly, two bands of the n protein were observed upon electrophoresis, resulting from protein degradation under denaturing conditions (23). 
in contrast, the apparent molecular weight of the s protein band was higher than expected, most likely because of the glycosylation and/or incomplete denaturation of the protein trimer. both vlps, composed of m, e, and n (men) and m, e, n, and s (mens) proteins, were concentrated and purified using a heparin column according to a previously developed protocol (22) (fig. 1c ). the analysis of purified samples by transmission electron microscopy (tem) showed spherical particles of different diameters, which might reflect the amount of the n protein incorporated into vlps (fig. 1d ). spike protein is dispensable for hcov-nl63 vlp adhesion to llc-mk2 cells. previously, we have shown that hcov-nl63 vlps composed of the m, e, and s proteins penetrate llc-mk2 cells, which are naturally permissive to hcov-nl63 infection, and that this process is ace2 dependent. we have also observed that vlps composed of m and e proteins adhere to but do not enter the target cell (data not shown). in the present study, we evaluated the binding and endocytosis of men and mens hcov-nl63 vlps. llc-mk2 cells were incubated with the purified men and mens vlps, and we found that men vlps localize to the cell surface, whereas mens vlps were present on the cell surface and in the cytoplasm (fig. 2 ). this confirmed that the s protein is required for particle internalization but dispensable for vlp adhesion. to further validate this result, we blocked the virus-ace2 interaction with anti-ace2 antibody, which resulted in inhibition of mens vlp internalization but not adhesion to the cell surface (fig. 3) . interestingly, for mens vlps, we observed a higher number of viral particles on cells preincubated with anti-ace2 antibodies. we believe that this is an artifact, as during entry, vlps fuse with cellular membranes, which results in diffusion of the fluorescence signal and a decrease of the score. 
blocking of the interaction with the cellular receptor hampers vlp-cell fusion, and the scored number of particles remains high. adhesion of men vlps to cells was not affected by inhibition of the vlp-ace2 interaction. altogether, these observations indicate that s protein of hcov-nl63 is not involved in virus attachment to target cells. using flow cytometry, we next investigated whether the adhesion of men and mens vlps to the cell surface involves interaction with hspgs, as has been shown for hcov-nl63. for this, purified vlps were preincubated with soluble hs and used to inoculate llc-mk2 cells. identical samples were prepared using untreated vlps. the cells were then immunostained to label the vlps, and the signal was recorded. native, iodixanol-concentrated hcov-nl63 was used as a control. we observed that both men and mens vlps bound to the cell surface in the absence of hs and that hs pretreatment of vlps hampered adhesion (fig. 4). peptides of the membrane protein are recognized by heparin. men vlps contain the n protein encapsulated in an envelope formed by the m and e proteins. we assumed a priori that the m protein serves as a partner for cellular hspg during viral or vlp adhesion, and we verified this hypothesis experimentally. first, an array of synthetic, overlapping peptides covering the complete m protein was used to test binding with the labeled heparin. the m protein regions potentially interacting with heparin were hence identified as amino acids (aa) 25 to 51, aa 93 to 119, and aa 153 to 207 (fig. 5). most of the peptides are arginine and lysine rich, which is in agreement with previously reported data (24, 25). next, we predicted the m protein topology in silico, and based on these data and previous reports, we concluded that regions spanning aa 25 to 51 and aa 93 to 119 may not be involved in the interactions with hspgs, as they are localized inside the virion or in transmembrane domains (tmds) (fig. 6). 
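the overlapping-peptide scan described above can be illustrated with a minimal sketch. the peptide length, the step between peptides, the toy sequence and the function names (overlapping_peptides, basic_fraction) are all assumptions made for illustration; the array design actually used is not given in the text, and the sequence below is not the hcov-nl63 m protein.

```python
def overlapping_peptides(sequence, length=15, step=3):
    """Yield (start, end, peptide) tuples tiling a protein sequence.

    Coordinates are 1-based and inclusive, as in the "aa 25 to 51" notation.
    The length and step defaults are illustrative assumptions, not the
    values used for the peptide array in the study.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    last_start = max(len(sequence) - length, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # add a final peptide so the C terminus is covered
    for start in starts:
        peptide = sequence[start:start + length]
        yield start + 1, start + len(peptide), peptide


def basic_fraction(peptide):
    """Return the fraction of residues that are lysine (K) or arginine (R)."""
    if not peptide:
        return 0.0
    return sum(peptide.count(residue) for residue in "KR") / len(peptide)


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the output format.
    toy_sequence = "MSTAVLGGKRKRLAAGTSPLLVVKKRAAGSTLLPRRKGAV"
    for start, end, peptide in overlapping_peptides(toy_sequence, length=10, step=5):
        print(f"aa {start:>2} to {end:>2}  {peptide}  K/R fraction = {basic_fraction(peptide):.2f}")
```

peptides with a high k/r fraction would be the natural candidates to compare against the heparin-binding signal, although basic-residue content alone does not establish binding.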
this led us to the conclusion that the hspg binding site is localized in the c-terminal region of aa 153 to 226, predicted to be exposed on the virion surface. data are presented as means ± 95% confidence intervals (ci). the total number of particles and the number of cells were quantified using the imagej fiji tool 3d objects counter, and internalized particles were counted manually (*, p < 0.05; ****, p < 0.0001; ns, not significant). ab, antibody. to further evaluate the interaction between the c-terminal domain of the m protein and hspgs, we expressed and purified a his-tagged fragment of the m protein consisting of amino acids 153 to 226 (6×his-m 153-226) in bacteria. to investigate whether the protein adhered to cellular hspg, we performed an in situ enzyme-linked immunosorbent assay (elisa), with different amounts of the protein added to cells seeded in a microplate. the 6×his-m 153-226 protein and an equal molar amount of 6×his-n hku1, as a control, were serially diluted and incubated with llc-mk2 cells. a signal developed using an anti-his-horseradish peroxidase (hrp) antibody was detected only with the 6×his-m 153-226 protein and was proportional to its concentration (fig. 7a). importantly, when the 6×his-m 153-226 protein was preincubated with soluble hs, the interaction with llc-mk2 cells was diminished (fig. 7b), suggesting that the putative epitope is localized within the region from aa 153 to 226 of the m protein. inhibition of virus adhesion to the cell surface by anti-m antibody binding. to further validate the notion that hcov-nl63 employs the m protein to bind to the cell, we performed a "pseudoneutralization" assay using a polyclonal anti-m rabbit serum, obtained after immunizing rabbits with peptides corresponding to aa 181 to 195 cell infection is a complex process that involves viral attachment, which leads to viral enrichment on the cell surface and subsequent internalization.
while initial binding is usually nonspecific and mediated by ubiquitous molecules, such as sugars or glycoconjugates (glycoproteins, glycolipids, and proteoglycans), the second step requires a highly specific interaction with the entry receptors, often involving their proteolytic processing. in some viruses, a single protein is responsible for both steps of this interaction, whereas in other viruses (e.g., paramyxoviruses), different structural elements of the virion are employed for the attachment to and fusion with the cellular membrane (26). this sophisticated interplay of distinct receptor entities and their sequential engagement increases viral avidity and coordinates in time the key events for efficient cellular uptake. for coronaviruses, it is generally believed that s protein is responsible for both virion attachment and internalization (reviewed in references 27 and 28). this large (~150-kda) glycoprotein consists of three segments: an ectodomain, a transmembrane anchor, and a short endodomain. the ectodomain can be further divided into s1 and s2 subunits, although not all coronaviruses undergo enzymatic cleavage of the s ectodomain (29). different experimental approaches demonstrating that the s1 domain encompasses one or more receptor binding sites (rbss), located at its c terminus, have been reported for almost all studied covs. while the c-terminal part of the s1 domain (s1-ctd) is highly divergent and responsible for the interaction with entry receptors, the more conserved n-terminal region of s1 (s1-ntd) is thought to function as a ligand for the initial attachment factors. indeed, it was demonstrated that the s1-ntds of several coronaviruses bind carbohydrates. examples include the alphacoronavirus (tgev), betacoronavirus (hcov-hku1 and bcov), and gammacoronavirus (ibv) genera (30) (31) (32) (33). in contrast, the s2 domain presumably participates in the fusion of the cellular membrane with the viral envelope (34).
this was confirmed for a number of coronaviruses, including hcov-229e (35), sars-cov (36), mhv (37) , and hcov-nl63 (38) . the structure of the hcov-nl63 s protein and its role in viral infection were previously described (3, (39) (40) (41) (42) . the receptor binding domain (rbd) within the s1 subunit and the heptad repeat region within the s2 subunit were mapped (38) . some authors speculate that s1 also contains a domain responsible for binding to the attachment receptors on the cell surface, which triggers binding with ace2 (27, 41) . of note, it has been shown that purified full-length s protein of hcov-nl63 exhibits surprisingly low-affinity binding to its putative receptor (43) . at the same time, there is evidence that the rbd of the hcov-nl63 s protein binds ace2 with a higher affinity than its full-length counterpart and with efficiency comparable to that of s1 of sars-cov (40, 41, 44) . these observations raise questions about the existence of an additional, s-independent stimulus prerequisite for this interaction. in the present study, we provide evidence that the m protein may be responsible for the interaction of the virus with cellular hspg. for this, we have used previously developed vlps, produced in insect cells, which enable efficient and scalable expression of complex macromolecular structures. first, using confocal microscopy, we showed that adhesion to the cell surface was not abrogated in vlps missing the s protein. moreover, blocking the vlp-ace2 interaction with specific antibodies did not affect adhesion. next, using flow cytometry, we confirmed that adhesion is mediated by hspg, since soluble hs blocks the interaction. to identify the ligand involved in this interaction, we decided to first screen the m protein for the presence of putative heparin binding domains. using a simple peptide array overlay, we identified three regions that may potentially be responsible for such an interaction. 
the experimental data were consistent with the notion that these regions are rich in positively charged amino acids, lysine, and arginine (45, 46) . subsequently, we verified this result using the predicted topology of the m protein and literature data. only one site was predicted to localize on the virion surface, and consequently, we assumed that this particular site is responsible for the interaction. this assumption was then proven by an in situ elisa, which demonstrated that the region from aa 153 to 226 of the m protein binds to cellular hspg. of note, topology prediction algorithms detected four transmembrane domains (tmds) within the m protein, which is somewhat atypical for coronaviruses (fig. 6) . the c terminus of the m protein is essential for the m-m, m-n, and m-s interactions, and it is thought to be hidden within the virion envelope (6, (47) (48) (49) (50) (51) . however, the exact region of the m protein responsible for these interactions has not been defined for most coronaviruses. recently, it was suggested that this interaction might rely on structural motifs consisting of several residues dispersed throughout the protein (52) . furthermore, we have previously demonstrated that a c-terminally tagged m protein is assembly competent for vlp formation, which supports the four-tmd model of the m protein and the resulting n exo -c exo topology. this conclusion was validated by the observation that hcov-nl63 and nl63 vlp adhesion is hampered by an antibody raised against two peptides corresponding to the distal part of the m protein. interestingly, this observation was consistent with works of enjuanes and colleagues (53, 54) , who demonstrated a similar m protein topology of tgev cov belonging to the same genus (alphacoronavirus) as hcov-nl63. hence, we propose that such a unique organization of the nl63-hcov m protein might be one explanation for its engagement in hcov-nl63 infection. 
it has to be mentioned that some coronaviruses acquire the ability to bind hspg in the course of cell culture adaptation, as described for mhv and ibv (13, (55) (56) (57) . to exclude this possibility, we compared the m protein sequence used in the present study (amsterdam i strain) with those of clinical isolates, and no differences in this region were identified (not shown). we believe that the engagement of the m protein in cell attachment is hence a natural characteristic of hcov-nl63. the involvement of proteins other than the s protein in carbohydrate binding by coronaviruses has been previously described. for instance, he of an mhv strain binds cellular sialic acids (58) . he also appears to mediate the attachment of bcov and hcov-oc43 to the cell surface (33, 59, 60) . more generally, the involvement of multiple capsid proteins was described for a number of viruses, i.e., influenza virus (paramyxoviruses) or herpesviruses (reviewed in references 61 and 62). it is also worth mentioning that the presented results do not preclude the participation of s protein in the recognition of cellular hspg during hcov-nl63 infection. the observations presented here provide new insight into the understanding of cell receptors and their interplay with the viral ligands during hcov-nl63 infection. considering the role of the m protein, one could suggest novel strategies for inhibiting or preventing infection by this and potentially other coronaviruses. specifically, some experimental anticoronaviral vaccines that induce anti-s protein humoral responses cause serious adverse effects, probably related to an antibody-dependent enhancement mechanism (63) (64) (65) . in this context, vaccines, immunotherapeutics, or antivirals developed to inhibit the interaction between the m protein and cellular hspg may offer an interesting alternative. cell lines. 
sf9 (spodoptera frugiperda; atcc crl-1711) and hf (high five) (trichoplusia ni; atcc crl-7701) cells were cultured in esf medium (expression systems, ca, usa) supplemented with 2% fbs (fetal bovine serum; thermo fisher scientific, poland), 100 μg/ml streptomycin, 100 iu/ml penicillin, 10 μg/ml gentamicin, and 0.25 μg/ml amphotericin b. the culture was maintained in a humidified incubator at 27°c. sf9 cells were used for baculovirus (bv) generation and amplification, while hf cells were used for recombinant protein expression. llc-mk2 cells (macaca mulatta kidney epithelial cells; atcc ccl-7) were maintained in minimal essential medium (mem) (2 parts hanks' mem and 1 part earle's mem; thermo fisher scientific, poland) supplemented with 3% fbs, 100 μg/ml streptomycin, 100 iu/ml penicillin, and 5 μg/ml ciprofloxacin. the culture was maintained at 37°c under 5% co2. vlp production. vlps composed of membrane (m), envelope (e), and spike (s) proteins of hcov-nl63 were produced as described previously (22), but nucleocapsid (n) protein was additionally included for this work. for this, the n gene was subcloned from pet duet (23) to pfastbac dual, under the control of the polyhedrin promoter. recombinant baculoviruses (rbvs) were generated using the bac-to-bac system (thermo fisher scientific, poland) and titrated using a plaque assay. subsequently, hf cells were coinfected with (m+e) bicistronic rbv, n monocistronic rbv, and s monocistronic rbv at a multiplicity of infection (moi) of 4 and cultured for 72 h. secreted vlps were then harvested by centrifugation (5,000 × g for 30 min). for purification of vlps, the harvested culture medium was diluted (1:1) with binding buffer (20 mm k2hpo4-kh2po4 [ph 6.2], 70 mm nacl) and loaded onto a 5-ml heparin ht column (ge healthcare, poland) using an äkta fast-performance liquid chromatography (fplc) system (äkta, sweden). before purification, the column was equilibrated with binding buffer.
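As a worked example of the coinfection step above, the inoculum volume for a plaque-titrated baculovirus stock follows from the definition of moi (infectious units per cell). This is a minimal sketch: the cell number and titer are hypothetical values, not taken from the study, and the function name is illustrative.

```python
def inoculum_volume_ml(moi, n_cells, titer_pfu_per_ml):
    """Volume of virus stock (ml) needed to infect n_cells at the given MOI.

    MOI = plaque-forming units added per cell, so
    volume = MOI * n_cells / titer.
    """
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return moi * n_cells / titer_pfu_per_ml


# Hypothetical numbers: 2e6 cells and a stock of 1e8 PFU/ml, at the
# MOI of 4 stated in the text.
volume = inoculum_volume_ml(moi=4, n_cells=2e6, titer_pfu_per_ml=1e8)
print(f"{volume * 1000:.0f} µl of stock")
```

The text does not say whether the moi of 4 applies to each recombinant baculovirus or to the combined inoculum, so the calculation would be repeated per stock under whichever convention was used.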
proteins were eluted with a linear nacl gradient (50 mm to 2 m nacl in binding buffer), and collected peak fractions were analyzed using sds-page and western blotting. sds-page and western blotting. insect cells or culture media were harvested, resuspended in denaturing buffer to final concentrations of 1.5% sds and 2.5% β-mercaptoethanol, and resolved by laemmli sds-page using 4-to-20% gradient precast gels (bio-rad, poland). a pageruler prestained plus protein ladder (thermo fisher scientific, poland) was used in this study as a protein size marker. gels were stained with coomassie brilliant blue or subjected to electrotransfer in buffer containing 25 mm tris, 192 mm glycine, and 20% methanol onto an activated polyvinylidene difluoride (pvdf) membrane. following transfer, the membrane was blocked with 5% skim milk in tris-buffered saline supplemented with 0.05% tween 20 (tbs-t), followed by 1.5 h of incubation with rabbit polyclonal anti-m serum (1:15,000; kindly provided by lia van der hoek and generated by rabbit immunization with peptides spanning aa 180 to 195 and aa 212 to 226 of the m protein), mouse monoclonal anti-n antibody (1:1,000; ingenansa, spain), and mouse polyclonal anti-s serum (1:250; eurogentec, belgium) and 1 h of incubation with anti-rabbit (1:20,000; dako, denmark) and anti-mouse (1:20,000; dako, denmark) secondary antibodies conjugated with horseradish peroxidase (hrp), respectively. the signal was developed using the immobilon western chemiluminescent hrp substrate (millipore, poland) and visualized by exposing the membrane to an x-ray film (thermo fisher scientific, poland). confocal microscopy. for assessment of protein colocalization in insect cells, hf cells were grown in 6-well culture plates on glass coverslips coated with 0.01% poly-l-ornithine (sigma-aldrich, poland).
cells were infected with rbvs at an moi of 1, fixed at 48 h postinfection with 4% formaldehyde, permeabilized with 0.2% triton x-100 in phosphate-buffered saline (pbs), and blocked for 1 h at room temperature with 5% bovine serum albumin (bsa) in pbs. expression of m and n proteins was detected with rabbit polyclonal anti-m serum (the same as described above; 1:1,000) and mouse monoclonal anti-n antibody (the same as described above; 1:2,000), respectively, followed by detection with alexa fluorophore secondary antibodies at a 1:400 dilution (santa cruz biotechnology, usa). cell nuclei were stained with dapi (4′,6-diamidino-2-phenylindole) (0.1 μg/ml in pbs; sigma-aldrich, poland). coverslips were mounted on glass slides with prolong diamond antifade mountant (sigma-aldrich, poland). for transduction and adhesion analyses of vlps, llc-mk2 cells were grown to 80% confluence for 48 h in 6-well culture plates on glass coverslips. cells were then washed with ice-cold pbs and inoculated with 1 ml of purified vlps. next, llc-mk2 cells were incubated for 2 h at 4°c and subsequently for 90 min at 32°c under 5% co2 and further washed three times with pbs. subsequently, cells were fixed, permeabilized, and stained with anti-m polyclonal serum, as described above. additionally, actin filaments were visualized with phalloidin conjugated with alexa 647 (0.132 μm; sigma-aldrich, poland). to verify the role of the ace2 protein during vlp entry, llc-mk2 cells were incubated with anti-ace2 polyclonal antibodies (catalog number af933; r&d systems) or an appropriate isotype control antibody (catalog number gtx35039; genetex) for 1 h at 37°c (5 μg/ml). the anti-ace2 antibody concentration was determined based on hcov-nl63 neutralization experiments in llc-mk2 cells (not shown). furthermore, cells were overlaid with purified men or mens vlps and incubated for 2 h at 4°c and subsequently for 90 min at 32°c under 5% co2 in the presence of antibodies.
next, cells were washed three times with pbs, fixed, permeabilized, and stained with anti-m polyclonal serum, as described above. fluorescent images were acquired using a zeiss lsm 710 confocal microscope (carl zeiss microscopy gmbh), deconvolved using autoquant x3 software, and processed in imagej fiji (national institutes of health, bethesda, md, usa) (66) . the number of particles and number of cells were quantified using the built-in imagej fiji tool "3d objects counter." the numbers of internalized particles were counted manually from orthogonal views. for this study, the actin cortex was assumed to indicate the cell surface. electron microscopy. samples were prepared as described previously (22) . briefly, purified vlps were fixed in karnovsky solution and loaded onto copper grids coated with a support film (formvar 15/95e; sigma-aldrich, st. louis, mo, usa). after drying, the material was stained with uranyl acetate (polyscience, inc., warrington, pa, usa) and lead citrate (sigma-aldrich, st. louis, mo, usa). subsequently, the grids were washed with water and dried in air at room temperature. ultrastructural observations were performed by using a hitachi h500 transmission electron microscope at an accelerating voltage of 75 kv. flow cytometry. to evaluate adhesion of vlps to llc-mk2 cells, cells were grown for 48 h to reach 100% confluence in 6-well culture plates on glass coverslips. cells were washed twice with pbs and incubated with purified vlps, iodixanol-concentrated hcov-nl63 (12, 67) , mock supplemented with heparan sulfate (hs) (1 mg/ml; sigma-aldrich, poland), or pbs for 4 h at 4°c. the cells were then washed three times with pbs, fixed with 4% paraformaldehyde, permeabilized with 0.1% triton x-100 in pbs, and incubated overnight in 5% bovine serum albumin and 0.5% tween 20 in pbs. 
to examine hcov-nl63 or vlp adhesion, cells were incubated for 2 h at room temperature with a mouse monoclonal anti-n antibody (the same as described above; 1:1,000 in 2.5% bsa with 0.5% tween 20 in pbs), followed by 1 h of incubation with an alexa fluor 488-labeled goat anti-mouse antibody (1:400). cells were then washed, mechanically detached from the glass coverslips, resuspended in pbs, and analyzed by flow cytometry (facscalibur; becton, dickinson). data were processed using cellquest software (becton, dickinson) and flowjo v10. expression of the c-terminal domain of the m protein. the region coding for the c-terminal fragment (aa 153 to 226) of the m protein was pcr amplified (5′ primer cca gga tcc gga tgg cca taa gat tgc tac tcg tg and 3′ primer gca ctc gag tta gat taa atg aag caa ctt), digested with bamhi and xhoi enzymes (thermo fisher scientific, poland), gel purified, and cloned into the pet duet plasmid in a manner to include the 6×his tag in frame at the n terminus. the sequence (6×his-m 153-226) was verified by sequencing. the escherichia coli bl21 strain was transformed with the recombinant plasmid and precultured overnight at 37°c. lb medium (1 liter; bioshop-labempire, poland) was inoculated with the preculture, induced with isopropyl-β-d-thiogalactopyranoside (iptg) (0.5 mm; bioshop-labempire, poland) at an optical density of 0.6, and harvested at 4 h postinduction. bacterial cell pellets were subsequently resuspended in 50 ml of lysis buffer (50 mm tris [ph 7.5], 5 mm urea, 250 mm nacl, 5 mm dithiothreitol [dtt], 1 mm edta, 1% triton x-100) and subjected to 2 cycles of cell disruptor (constant systems, uk) operation at 25 lb/in2. lysates were then centrifuged (40 min at 5,000 × g), and the supernatant was diluted (1:1) with immobilized-metal affinity chromatography (imac) binding buffer (20 mm k2hpo4-kh2po4 [ph 7.4], 500 mm nacl, 20 mm imidazole).
diluted supernatants were loaded onto a 1-ml imac column (ge healthcare, poland) charged with ni2+ and connected to an äkta fplc system (äkta, sweden) and preequilibrated with binding buffer. proteins were eluted with 500 mm imidazole in imac binding buffer. collected peak fractions were pooled and fractionated again to remove imidazole and excess nacl. for this purpose, a 26/10 desalting column (ge healthcare, poland) (preequilibrated with 20 mm na2hpo4-nah2po4 [ph 7.7], 250 mm nacl, and 5% glycerol) was used. purified 6×his-m 153-226 protein was analyzed by sds-page and western blotting.
aminopeptidase n is a major receptor for the entero-pathogenic coronavirus tgev feline aminopeptidase n serves as a receptor for feline, canine, porcine, and human coronaviruses in serogroup i human coronavirus nl63 employs the severe acute respiratory syndrome coronavirus receptor for cellular entry angiotensin-converting enzyme 2 is a functional receptor for the sars coronavirus mouse hepatitis virus receptor as a determinant of the mouse susceptibility to mhv infection incorporation of spike and membrane glycoproteins into coronavirus virions the receptor binding domain of the new middle east respiratory syndrome coronavirus maps to a 231-residue region in the spike protein that efficiently elicits neutralizing antibodies identification of spike protein residues of murine coronavirus responsible for receptor-binding activity by use of soluble receptor-resistant mutants hla class i antigen serves as a receptor for human coronavirus oc43 coronavirus spike protein and tropism changes receptor recognition mechanisms of coronaviruses: a decade of structural studies human coronavirus nl63 utilizes heparan sulfate proteoglycans for attachment to target cells heparan sulfate is a binding molecule but not a receptor for ceacam1-independent infection of murine coronavirus inhibition of sars pseudovirus cell entry by lactoferrin binding to heparan sulfate proteoglycans porcine 
epidemic diarrhea virus uses cell-surface heparan sulfate as an attachment factor membrane-associated heparan sulfate proteoglycan is a receptor for adeno-associated virus type 2 virions heparan sulfate glycosaminoglycans are receptors sufficient to mediate the initial binding of adenovirus types 2 and 5 different heparan sulfate proteoglycans serve as cellular receptors for human papillomaviruses herpesviruses and heparan sulfate: an intimate relationship in aid of viral entry interaction of human tumor viruses with host cell surface receptors and cell entry proteoglycans in host-pathogen interactions: molecular mechanisms and therapeutic implications novel coronavirus-like particles targeting cells lining the respiratory tract the nucleocapsid protein of human coronavirus nl63 pattern and spacing of basic amino acids in heparin binding sites importance of specific amino acids in protein binding sites for heparin and heparan sulfate 2013. fields virology a structural view of coronavirus-receptor interactions structure, function, and evolution of coronavirus spike proteins mechanisms of coronavirus cell entry mediated by the viral spike protein point mutations in the s protein connect the sialic acid binding activity with the enteropathogenicity of transmissible gastroenteritis coronavirus crystal structure of bovine coronavirus spike protein lectin domain mapping of the receptor-binding domain and amino acids critical for attachment in the spike protein of avian coronavirus infectious bronchitis virus human coronavirus hku1 spike protein uses o-acetylated sialic acid as an attachment receptor determinant and employs hemagglutinin-esterase protein as a receptordestroying enzyme coronavirus spike proteins in viral entry and pathogenesis identification of a receptor-binding domain of the spike glycoprotein of human coronavirus hcov-229e severe acute respiratory syndrome coronavirus (sars-cov) infection inhibition using spike protein heptad repeat-derived peptides 
cooperative involvement of the s1 and s2 subunits of the murine coronavirus spike protein in receptor binding and extended host range core structure of s2 from the human coronavirus nl63 spike glycoprotein crystal structure of nl63 respiratory coronavirus receptor-binding domain complexed with its human receptor characterization of the spike protein of human coronavirus nl63 in receptor binding and pseudotype virus entry glycan shield and epitope masking of a coronavirus spike protein observed by cryo-electron microscopy interaction between the spike protein of human coronavirus nl63 and its cellular receptor ace2 differential downregulation of ace2 by the spike proteins of hcov-nl63 adhesion to cells journal of virology severe acute respiratory syndrome coronavirus and human coronavirus nl63 identification of residues in the receptor-binding domain (rbd) of the spike protein of human coronavirus nl63 that are critical for the rbd-ace2 receptor interaction cell surface heparan sulfate and its roles in assisting viral infections molecular modeling of proteinglycosaminoglycan interactions mapping of the coronavirus membrane protein domains involved in interaction with the spike protein a conserved domain in the coronavirus membrane protein tail is important for virus assembly a single tyrosine in the severe acute respiratory syndrome coronavirus membrane protein cytoplasmic tail is important for efficient interaction with spike protein identifying sars-cov membrane protein amino acid residues linked to virus-like particle assembly a structural analysis of m protein in coronavirus assembly and morphology analyses of coronavirus assembly interactions with interspecies membrane and nucleocapsid protein chimeras organization of two transmissible gastroenteritis coronavirus membrane protein topologies within the virion and core membrane protein molecules of transmissible gastroenteritis coronavirus also expose the carboxy-terminal region on the external surface of the 
virion murine coronavirus with an extended host range uses heparan sulfate as an entry receptor cleavage of group 1 coronavirus spike proteins: how furin cleavage is traded off against heparan sulfate binding upon cell culture adaptation heparan sulfate is a selective attachment factor for the avian coronavirus infectious bronchitis virus beaudette attachment of mouse hepatitis virus to o-acetylated sialic acid is mediated by hemagglutinin-esterase and not by the spike protein structure, function and evolution of the hemagglutininesterase proteins of corona-and toroviruses the acetyl-esterase activity of the hemagglutinin-esterase protein of human coronavirus oc43 strongly enhances the production of infectious virus viral entry mechanisms: cellular and viral mediators of herpes simplex virus entry receptor binding and membrane fusion in virus entry: the influenza hemagglutinin evaluation of modified vaccinia virus ankara based recombinant sars vaccine in ferrets ezrin interacts with the sars coronavirus spike protein and restrains infection at the entry stage sars cov subunit vaccine: antibody-mediated neutralisation and enhancement fiji: an opensource platform for biological-image analysis entry of human coronavirus nl63 into the cell predicting transmembrane protein topology with a hidden markov model: application to complete genomes supramolecular architecture of the coronavirus particle key: cord-257584-v38tjof3 authors: fahmi, muhamad; kubota, yukihiko; ito, masahiro title: nonstructural proteins ns7b and ns8 are likely to be phylogenetically associated with evolution of 2019-ncov date: 2020-03-03 journal: infect genet evol doi: 10.1016/j.meegid.2020.104272 sha: doc_id: 257584 cord_uid: v38tjof3 the seventh novel human infecting betacoronavirus that causes pneumonia (2019 novel coronavirus, 2019-ncov) originated in wuhan, china. the evolutionary relationship between 2019-ncov and the other human respiratory illness-causing coronavirus is not closely related. 
we sought to characterize the relationship of the translated proteins of 2019-ncov with other species of orthocoronavirinae. a phylogenetic tree was constructed from the genome sequences. a cluster tree was developed from the profiles retrieved from the presence and absence of homologs of ten 2019-ncov proteins. the combined data were used to characterize the relationship of the translated proteins of 2019-ncov to other species of orthocoronavirinae. our analysis reliably suggests that 2019-ncov is most closely related to batcov ratg13 and belongs to subgenus sarbecovirus of betacoronavirus, together with sars coronavirus and bat-sars-like coronavirus. the phylogenetic profiling cluster of homolog proteins of one annotated 2019-ncov protein against other genome sequences revealed two clades of ten 2019-ncov proteins. clade 1 consisted of a group of conserved proteins in orthocoronavirinae comprising orf1ab polyprotein, nucleocapsid protein, spike glycoprotein, and membrane protein. clade 2 comprised six proteins exclusive to sarbecovirus and hibecovirus. two of six clade 2 nonstructural proteins, ns7b and ns8, were exclusively conserved among 2019-ncov, betacov_ratg, and batsars-like cov. ns7b and ns8 have previously been shown to affect immune response signaling in the sars-cov experimental model. thus, we speculated that knowledge of the functional changes in the ns7b and ns8 proteins during evolution may provide important information to explore the human infective property of 2019-ncov. in december 2019, the seventh human coronavirus, termed 2019 novel coronavirus (2019-ncov) or severe acute respiratory syndrome coronavirus 2 (sars-cov-2), was found in wuhan, china. on february 8, 2020, the total number of infections and deaths due to 2019-ncov globally was 34,439 and 720, respectively, according to the johns hopkins university center for systems science and engineering. 
coronaviruses are enveloped rna viruses that infect many species, including humans, other mammals, and birds. after infection, the host may develop respiratory, bowel, liver, and neurological diseases (weiss and leibowitz, 2011; cui et al., 2019). coronaviruses are members of the order nidovirales and subfamily orthocoronavirinae. this subfamily is divided into four genera: alphacoronavirus, betacoronavirus, gammacoronavirus, and deltacoronavirus. generally, alphacoronavirus and betacoronavirus tend to infect mammals, while gammacoronavirus and deltacoronavirus typically infect birds. however, some gammacoronavirus and deltacoronavirus can infect mammals under specific conditions (woo et al., 2012). in immunocompromised individuals, infection with one of the four human coronaviruses, namely human coronavirus nl63 (hcov-nl63), human coronavirus 229e (hcov-229e), human coronavirus oc43 (hcov-oc43), and human coronavirus hku1 (hcov-hku1), usually results in cold-like symptoms. these viruses can cause severe infections in some infants and the elderly. due to the frequent interaction between wild animals and humans, wild animals are a common source of human zoonotic infections. sars-cov and middle east respiratory syndrome coronavirus (mers-cov) are zoonotic coronaviruses that can cause severe respiratory diseases in humans; both belong to betacoronavirus (su et al., 2016; forni et al., 2017; cui et al., 2019; luk et al., 2019; ramadan and shaib, 2019). 2019-ncov is the seventh coronavirus discovered that infects humans. it causes acute respiratory disease. immediately after its discovery, the complete genome sequence of 2019-ncov was determined. the sequence (mn908947) was released by genbank on 05 january 2020 (lu et al., 2020). the sequence of 2019-ncov is 96% identical, at the whole-genome level, to a bat coronavirus (zhou et al., 2020). the genomic characteristics and epidemiology of 2019-ncov have been analyzed (lu et al., 2020).
nine inpatient culture isolates were subjected to next-generation sequencing, and individual complete and partial 2019-ncov genomic sequences were obtained. phylogenetic analysis of these 2019-ncov genomes and other coronaviruses was performed to determine the evolutionary history of the virus and to explore the origin of 2019-ncov. initially, homology modeling was used to investigate the potential receptor-binding properties of the virus. however, sars-cov and mers-cov showed approximate similarities of 79% and 50% with 2019-ncov, respectively. these findings indicated that there is not a close evolutionary relationship of 2019-ncov with sars-cov and mers-cov. thus, 2019-ncov is considered the seventh novel human betacoronavirus (lu et al., 2020). in this study, we comprehensively characterized the relationship of the translated proteins of 2019-ncov to other species of orthocoronavirinae. this was done using a combination of the phylogenetic tree constructed from the genome sequences and the cluster tree developed from the profiles retrieved from the presence and absence of homologs of ten 2019-ncov proteins. the genomes and the combination of genome and protein sequences were used to develop a phylogenetic tree and phylogenetic profiling, respectively. the dataset of the genomes of the orthocoronavirinae subfamily was collected from the refseq database using the orthocoronavirinae ncbi taxonomy id (txid2501931). this dataset contains representative complete genomes from each species of that subfamily (pruitt et al., 2007; federhen, 2012) (supplementary table 1). additionally, we collected genome sequences from bat sars-like coronavirus (mg772934 and mg772933) from ncbi and betacov/bat/yunnan/ratg13/2013 (epi_isl_402131) from gisaid (http://www.gisaid.org). one species of the okanivirinae subfamily, the yellow head virus, was also collected as an outgroup (supplementary table 1).
the genome sequences were aligned using the mafft multiple sequence alignment program provided at the xsede portal in the cipres science gateway with an automatic selection strategy (miller et al., 2012; katoh and standley, 2013). a phylogenetic tree was constructed using the maximum likelihood method with raxml-hpc blackbox in the cipres science gateway (stamatakis, 2006). the analysis used an automatic bootstrapping option and a general time-reversible substitution model with a gamma-shape parameter (gtr+g). the model was selected as the best-fit model under the akaike information criterion using modeltest-ng (darriba et al., 2020). phylogenetic trees were viewed using figtree v1.4 (http://tree.bio.ed.ac.uk/software/figtree/). the annotated protein sequences of 2019-ncov were collected from the data of one representative genome from ncbi (mn996527). we built a blast database with the retrieved genome sequence data using blast+ version 2.2.30 (camacho et al., 2009). we then determined the presence or absence of homologs of one representative set of annotated 2019-ncov proteins in the other genome sequences in the database using tblastn, with bit-score thresholds of > 50 and > 25 for protein sequences > 50 amino acids (aa) and < 50 aa in length, respectively. the presence and absence results were converted into a binary matrix and used to build a clustering tree using the ward hierarchical clustering method (ward jr, 1963) (supplementary table 2). nonstructural protein (ns) 7b and ns8 local alignments were only positive in the sarbecovirus subgenus sample, excluding the sars coronavirus. additionally, we predicted the structural properties of the 2019-ncov ns7b protein, including the secondary structure and order-disorder propensity, using jpred4 and dichot, respectively (fukuchi et al., 2014; drozdetskiy et al., 2015).
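the length-dependent bit-score cutoff and the conversion to a binary presence/absence matrix described above can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the genome labels and all scores are hypothetical, the best tblastn bit score per query and genome is assumed to have been parsed already, and a query of exactly 50 aa (a case the text does not specify) is treated as short.

```python
from typing import Dict, List


def is_homolog(bit_score: float, query_len: int) -> bool:
    """Apply the length-dependent bit-score cutoff described in the text."""
    # > 50 bits for queries longer than 50 aa, > 25 bits for shorter ones.
    # Assumption: a query of exactly 50 aa falls in the short class.
    threshold = 50.0 if query_len > 50 else 25.0
    return bit_score > threshold


def presence_absence_matrix(
    query_lengths: Dict[str, int],
    best_hits: Dict[str, Dict[str, float]],
) -> Dict[str, List[int]]:
    """Return one 0/1 row per genome; columns follow query_lengths order."""
    matrix = {}
    for genome, hits in best_hits.items():
        row = []
        for protein, length in query_lengths.items():
            score = hits.get(protein, 0.0)  # no hit -> score 0 -> absent
            row.append(1 if is_homolog(score, length) else 0)
        matrix[genome] = row
    return matrix


# Hypothetical values for illustration only, not results from the study.
# Only the ns7b length (43 residues) is taken from the text.
query_lengths = {"long_protein": 1200, "ns7b": 43}
best_hits = {
    "genome_A": {"long_protein": 2100.0, "ns7b": 80.5},
    "genome_B": {"long_protein": 350.0},  # no ns7b hit reported
}
matrix = presence_absence_matrix(query_lengths, best_hits)
```

the rows of such a matrix are the binary profiles that a hierarchical clustering routine would take as input; if scipy is available, `scipy.cluster.hierarchy.linkage(rows, method="ward")` is one way to obtain a ward tree from them.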
we also predicted the structure using the contact-assisted protein structure prediction (c-i-tasser) composite approach (zhang et al., 2018). additionally, we specifically collected the sequences that produced significant alignments of ns7b using the mega x software (kumar et al., 2018). the phylogenetic analysis using complete genome sequences showed that 2019-ncov was most closely related to batcov ratg13 and belonged to the sarbecovirus subgenus of betacoronavirus, together with sars coronavirus and bat-sars-like coronavirus (bat-sl-covzxc21 and bat-sl-covzc45), with full support (fig. 1). additionally, hibecovirus, with bat hp-betacoronavirus/zhejiang2013 as the representative species, was the subgenus of betacoronavirus most closely related to sarbecovirus, as compared to the other subgenera, including merbecovirus (under which mers-cov has been classified), nobecovirus, and embecovirus. these findings agree with previous phylogenetic tree and similarity plot data (paraskevis et al., 2020). 2019-ncov was found to be more closely related to the bat-infecting sarbecovirus species, bat sars-like coronavirus, and betacov ratg13 than to the sars coronavirus that infects humans. this indicated that 2019-ncov more likely originated from bats. however, the wuhan outbreak was first detected in december, which is a time of year when most bat species hibernate. moreover, the huanan seafood market, which is considered ground zero of the outbreak, does not sell bats. instead, it has been suggested that an intermediate animal host mediates virus transmission from bats to humans, similar to the previous cases of sars-cov and mers-cov, wherein the masked palm civet (paguma larvata) and dromedary camel (camelus dromedarius) act as intermediate hosts, respectively (lu et al., 2020).
although coronaviruses can exchange genetic material during coinfection, a recent report described the lack of a mosaic relationship of 2019-ncov to the closely related sarbecovirus, indicating the lack of a recombination event in the emergence of 2019-ncov (paraskevis et al., 2020) . hence, 2019-ncov likely emerged from the accumulation of mutations responding to altered selective pressures or from the infidelity of rna polymerase perpetuated as replication-neutral mutations. these speculations need to be studied further. a previously reported comprehensive similarity plot revealed notable mutational hotspots and conserved regions of the genome nucleotide positions of 2019-ncov against closely related coronaviruses (lu et al., 2020; paraskevis et al., 2020) . the present findings provide a different perspective of the similarity among orthocoronavirinae species, using a cluster tree developed from the profiles retrieved from the presence and absence of homologs of ten 2019-ncov proteins. this cluster was combined with the cladogram of a previously constructed phylogenetic tree (fig. 2) . both the trees were consistent in their heatmap distributions. the tree of 2019-ncov proteins comprised two clades. the first, indicated with a blue bar in fig. 2 , contained a group of conserved proteins in most orthocoronavirinae species. these comprised orf1ab polyprotein, nucleocapsid protein, spike glycoprotein, and membrane protein. spike and orf1a regions of 2019-ncov were previously shown to have the lowest sequence identity as compared to the closely related coronavirus species (lu et al., 2020; paraskevis et al., 2020) . however, since the translated spike glycoprotein and orf1ab polyprotein from these regions are very long, the sequence similarity is still sufficient to classify them as homologs. in contrast, another clade, indicated by the green bar in fig. 
2, comprised proteins specific to sarbecovirus for all proteins in this clade and hibecovirus for envelope protein only. this clade included proteins that were not completely conserved across all orthocoronavirinae. two (ns7b and ns8) of five nonstructural proteins were specific for 2019-ncov and its closely related species, batcov ratg13 and bat-sars-like coronavirus (bat-sl-covzxc21 and bat-sl-covzc45). the other three nonstructural proteins (ns3, ns6, and ns7a) were also detected in the sars coronavirus. based on these results, we propose that the comprehensive analysis of nonstructural proteins, especially ns7b and ns8, may provide new insight into the properties of 2019-ncov. as shown in fig. 2, ns7b and ns8 of 2019-ncov, batcov ratg13, and bat-sars-like coronavirus were distinct from other species of orthocoronavirinae. in sars-cov, ns7b is an integral protein localized in the golgi compartment. the protein is packaged into sars-cov particles (schaecher et al., 2007). interestingly, open reading frame (orf) 7b, but not orf 7b deletion, induces interferon (ifn)-dependent reporter gene expression as well as apoptosis and the type i ifn response (pfefferle et al., 2009). moreover, the deletion of orf 7b enhances virus growth (pfefferle et al., 2009). thus, we speculate that the property of the non-conserved ns7b in 2019-ncov may affect the human infective property of the virus. similarly, the existence of a 29-nucleotide deletion in orf 8b has been described in sars-cov (oostra et al., 2007). a study involving mers-cov described that orf 8b strongly antagonizes the ifn-beta (ifn-β) promoter and that orf4b and 8b significantly suppress ifn induction (lee et al., 2019b). accessory proteins 8b and 8ab of sars-cov can suppress the ifn-β signaling pathway (and thus interferon production) by their participation in the ubiquitin-mediated rapid degradation of ifn regulatory factor 3 (irf3) (wong et al., 2018).
in contrast, when we focused on mers-cov from bats and camels, orf 8b antagonized melanoma differentiation-associated protein 5-mediated nuclear factor kappa b (nf-κb) activation. orf 8b strongly inhibited tank-binding kinase 1-mediated induction of nf-κb signaling, but not iκb kinase epsilon and irf3-mediated activations (lee et al., 2019a). thus, we speculate that the properties of the accessory proteins, ns7b and ns8, in 2019-ncov may affect its ability to infect humans. further studies are required to confirm this speculation. ns7b is a short peptide of 43 residues. a three-dimensional structure is often difficult to obtain from such a short peptide. we predicted the three-dimensional structure of the queried ns7b amino acid sequence using dichot and c-i-tasser (supplementary fig. 1, fig. 3) (fukuchi et al., 2014; zhang et al., 2018); a protein family (pf11395) was found, but no known three-dimensional structure was found. the secondary structure of this query was also predicted using jpred4 (supplementary fig. 2) (drozdetskiy et al., 2015). the secondary structure was predicted to be an α-helix, although, depending on the environment, this helix very likely does not always form (fig. 3). the alignment of this protein revealed three polymorphism sites between 2019-ncov, batcov ratg13, and bat-sars-like coronavirus sequences (bat-sl-covzxc21 and bat-sl-covzc45) (supplementary fig. 3). in summary, some nonstructural proteins were conserved and others were not conserved between 2019-ncov and sars-cov. by focusing on the 2019-ncov-specific proteins, ns7b and ns8, we proposed a combination of phylogenetic profiling analysis and structural characterization of the genes that were specifically expressed in 2019-ncov and the closely related bat coronavirus. the data provide insight for further characterization of the infective properties of this virus. (fahmi et al., infection, genetics and evolution 81 (2020) 104272.)
supplementary data to this article can be found online at https:// doi.org/10.1016/j.meegid.2020.104272. this work was supported by the mext-supported program for the strategic research foundation at private universities (grant number s1511028 to t.i) and the takeda science foundation. the authors declare no competing interest. okavirus as the outgroup. the heatmap indicates the binary matrix of the homolog proteins of 2019-ncov against other species in the dataset, with black and white colors as presence and absence, respectively. the bit pattern was arranged following the vertical and horizontal trees. the vertical tree is a phylogenetic profiling tree constructed from a binary matrix of the presence and absence of homolog proteins. it has two clades, indicated by blue and green bars. the horizontal tree is the cladogram of the maximum likelihood tree, as shown in fig. 1 , with a collapsed clade of 2019-ncov. (for interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) fig. 3 . the model of nonstructural-structural transition of 2019-ncov nonstructural protein 7b. the predicted protein structure of 2019-ncov nonstructural protein 7b by c-i tasser is shown as the helix structure protein. 
blast+: architecture and applications origin and evolution of pathogenic coronaviruses modeltest-ng: a new and scalable tool for the selection of dna and protein evolutionary models jpred4: a protein secondary structure prediction server the ncbi taxonomy database molecular evolution of human coronavirus genomes ideal in 2014 illustrates interaction networks composed of intrinsically disordered proteins and their binding partners mafft multiple sequence alignment software version 7: improvements in performance and usability mega x: molecular evolutionary genetics analysis across computing platforms middle east respiratory syndrome coronavirus-encoded accessory proteins impair mda5-and tbk1-mediated activation of nf-κb middle east respiratory syndrome coronavirusencoded orf8b strongly antagonizes ifn-β promoter activation: its implication for vaccine design genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding molecular epidemiology, evolution, and phylogeny of sars coronavirus the cipres science gateway: enabling highimpact science for phylogenetics researchers with limited resources the 29-nucleotide deletion present in human but not in animal severe acute respiratory syndrome coronaviruses disrupts the functional expression of open reading frame 8 full-genome evolutionary analysis of the novel corona virus (2019-ncov) rejects the hypothesis of emergence as a result of a recent recombination event reverse genetic characterization of the natural genomic deletion in sars-coronavirus strain frankfurt-1 open reading frame 7b reveals an attenuating function of the 7b protein in-vitro and in-vivo ncbi reference sequences (refseq): a curated non-redundant sequence database of genomes, transcripts and proteins middle east respiratory syndrome coronavirus (mers-cov): a review the orf7b protein of severe acute respiratory syndrome coronavirus (sars-cov) is expressed in virus-infected cells and incorporated 
into sars-cov particles raxml-vi-hpc: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models epidemiology, genetic recombination, and pathogenesis of coronaviruses hierarchical grouping to optimize an objective function coronavirus pathogenesis accessory proteins 8b and 8ab of severe acute respiratory syndrome coronavirus suppress the interferon signaling pathway by mediating ubiquitin-dependent rapid degradation of interferon regulatory factor 3 discovery of seven novel mammalian and avian coronaviruses in the genus deltacoronavirus supports bat coronaviruses as the gene source of alphacoronavirus and betacoronavirus and avian coronaviruses as the gene source of gammacoronavirus and deltacoronavirus template-based and free modeling of i-tasser and quark pipelines using predicted contact maps in casp12 a pneumonia outbreak associated with a new coronavirus of probable bat origin we want to thank dr. motonori ota, dr. satoshi fukuchi, dr. kota kasahara, and dr. takeshi kikuchi for their support and helpful comments. key: cord-196265-mvnkkcow authors: mészáros, bálint; sámano-sánchez, hugo; alvarado-valverde, jesús; čalyševa, jelena; martínez-pérez, elizabeth; alves, renato; kumar, manjeet; rippmann, friedrich; chemes, lucía b.; laboratory, toby j. gibson . 
european molecular biology; heidelberg,; germany,; embl, collaboration for joint phd degree between; university, heidelberg; biosciences, faculty of; estructural, laboratorio de bioinformática; leloir, fundación instituto; aires, buenos; argentina,; chemistrybiology, computational; kgaa, merck; darmstadt,; biotecnológicas, instituto de investigaciones; martín, universidad nacional de san title: short linear motif candidates in the cell entry system used by sars-cov-2 and their potential therapeutic implications date: 2020-04-21 journal: nan doi: nan sha: doc_id: 196265 cord_uid: mvnkkcow the primary cell surface receptor for sars-cov-2 is the angiotensin-converting enzyme 2 (ace2). recently it has been noticed that the viral spike protein has an rgd motif, suggesting that cell surface integrins may be co-receptors. we examined the sequences of ace2 and integrins with the eukaryotic linear motif resource, elm, and were presented with candidate short linear motifs (slims) in their short, unstructured, cytosolic tails with potential roles in endocytosis, membrane dynamics, autophagy, cytoskeleton and cell signalling. these slim candidates are highly conserved in vertebrates. they suggest potential interactions with the ap2 mu2 subunit as well as i-bar, lc3, pdz, ptb and sh2 domains found in signalling and regulatory proteins present in epithelial lung cells. several motifs overlap in the tail sequences, suggesting that they may act as molecular switches, often involving tyrosine phosphorylation status. candidate lir motifs are present in the tails of ace2 and integrin beta3, suggesting that these proteins can directly recruit autophagy components. we also noticed that the extracellular part of ace2 has a conserved midas structural motif, which is commonly used by beta integrins for ligand binding, potentially supporting the proposal that integrins and ace2 share common ligands. 
the findings presented here identify several molecular links and testable hypotheses that might help uncover the mechanisms of sars-cov-2 attachment, entry and replication, and strengthen the possibility that it might be possible to develop host-directed therapies to dampen the efficiency of viral entry and hamper disease progression. the strong sequence conservation means that these putative slims are good candidates: nevertheless, slims must always be validated by experimentation before they can be stated to be functional. the covid-19 pandemic is caused by severe acute respiratory syndrome coronavirus 2 (sars-cov-2), an enveloped, single-stranded rna virus. it had infected more than two and a half million people and caused circa 170,000 deaths globally by mid-april 2020. sars-cov-2 belongs to the coronaviridae family, whose members are common human pathogens responsible for the common cold, as well as for some emerging severe respiratory diseases. among them are the sars-cov and the middle east respiratory syndrome coronavirus (mers-cov), the former of which caused over 8,000 cases in 2003 with a fatality rate of ~10%, and the latter caused about 2,500 infections in 2012 with a fatality rate of 37% (de wit et al., 2016) . another coronavirus, infectious bronchitis virus (ibv), infects birds and has been used as a model in coronavirus research (sisk et al., 2018) . sars-cov-2, like sars-cov (li et al., 2003) , uses the angiotensin converting enzyme 2 (ace2) as a receptor to attach to the host cells (hoffmann et al., 2020; wrapp et al., 2020; p. zhou et al., 2020) . ace2 is a single pass type i membrane protein with a short cytosolic c-terminal region for which the functionality, however, is mostly unknown. earlier results clearly show that the sars-cov-2 receptor binding domain (rbd) of the spike protein interacts with ace2 for cellular entry. 
however, type ii alveolar cells (at2), the main targets of sars-cov-2 in the lung (zou et al., 2020), express a relatively low amount of ace2, which points to the existence of co-receptors being targeted by the virus in parallel. integrins, which bind a large variety of ligands harbouring an rgd sequence motif, are one such candidate, as recent analysis of the rbd identified a possibly functional rgd motif (sigrist et al., 2020). integrins are major cell attachment receptors, which are known to be targeted by a range of viruses (including hiv, herpes simplex virus-2, epstein-barr virus (ebv) and the foot and mouth virus) for cell entry and activation of linked intracellular pathways (hussein et al., 2015; stewart and nemerow, 2007; triantafilou et al., 2001). integrins are special types of receptors, as they propagate signals in both directions: extracellular ligands can induce cytoplasmic pathway activation, but intracellular binding on the cytosolic tails can influence the structure of the ectodomains and hence ligand-binding affinity. the complexity of integrin signalling stems from the dimeric structure of integrins, as they are composed of two subunits, α and β. for the rgd-binding integrins, the ligand binding surface lies at the interface of the two integrin subunits, with both subunits making contacts with the ligand. these rgd motifs are recognized by at least 8 out of the 24 human integrins, and the flanking residues next to the core rgd motif are known to play a major role in selectivity (kapp et al., 2017). several viral proteins contain rgd (or rgd-like) short linear motifs (slims) for integrin modulation; in addition, some viruses not only use integrins on the host cell surface: hiv and siv can also incorporate integrins into their own membranes for mediating interactions with the host (guzzo et al., 2017). therefore, integrins can potentially be targeted at both the extracellular and the intracellular side to combat pathogenic hijacking.
viruses, as obligate intracellular entities, need to interfere with major cellular processes like vesicular trafficking, cell cycle, cellular transport, protein degradation or signal transduction to satisfy their replication, enzymatic, metabolic and transport needs (davey et al., 2011). to achieve this, a large number of host processes are hijacked using slims, often located in intrinsically disordered regions, to establish protein-protein interactions with host proteins or undergo post-translational modifications, like tyrosine phosphorylation. for example, cellular signalling relies heavily on the use of slims (van roey et al., 2013). the low affinity and cooperativity of slim-based molecular processes allow reversible and transient interactions that can work as switches between distinct functional states and are regulated both in time and space (gibson, 2009; scott and pawson, 2009). conditional switching of slims, for example through phosphorylation, can induce the exchange of binding partners for a protein, thus mediating molecular decision-making in response to signals reporting on the cell state (van roey et al., 2012). the central resource for slims is the eukaryotic linear motif database (elm; http://elm.eu.org/), which also serves as an exploratory server for over 280 manually curated slim classes with experimental evidence, each defined by a posix regular expression (kumar et al., 2020). as explained above, a major strategy of viruses is to abuse the host system by using mimics of eukaryotic slims to compete with extracellular or intracellular binding partners or to sequester host proteins (davey et al., 2011). this dependence of viruses and many other pathogens on slim-mediated functions suggests that there is an opportunity to drug the cell systems where these interactions are being hijacked (sámano-sánchez and . 
for example, tyrosine kinase inhibitors, often used in anticancer therapy, have shown promising coronavirus replication inhibition in infectious cell culture systems (coleman et al., 2016; dong et al., 2020; shin et al., 2018; sisk et al., 2018) . in the remainder of the introduction, we will describe some of the major pathways hijacked by viruses to accomplish cell attachment, entry and replication, which are suggested by our results to be relevant to sars-cov-2 infection. indicating that acidification is not a requirement per se, but acts by inducing the endosomal spike protein cleavage required for viral fusion (matsuyama et al., 2005; simmons et al., 2005) . spike protein cleavage can be done by the transmembrane protease serine 2 (tmprss2) at the cell surface, or by cathepsin l within endosomes (de wilde et al., 2018) . the same entry route and proteases are utilized by sars-cov-2, and the main entry route also seems to be endocytic (hoffmann et al., 2020; ou et al., 2020) . autophagy is an evolutionarily conserved process in eukaryotes with multiple cellular roles that include the regulation of cellular homeostasis through the catabolism of cell components, immune development and the host cell response to infection through pathogen phagocytosis (deretic and levine, 2009) . viruses have evolved mechanisms to block the host cell antiviral response, and can further hijack autophagy components to promote their survival and replication. this can be done through viral mimicry of host proteins coordinating autophagy or through the direct inhibition of the host autophagy machinery (kudchodkar and levine, 2009) . coronaviruses (covs) exploit the autophagy machinery through different mechanisms (cong et al., 2017; maier and britton, 2012) . for example, mers-cov targets the becn1 autophagy regulator for degradation, blocking the fusion of autophagosomes and lysosomes and protecting the virus from degradation (gassen et al., 2019) . 
coronaviruses repurpose cellular membranes to create double membrane vesicles (dmvs) onto which the replication-transcription complex (rtc) is assembled, a process that involves recruitment of multiple autophagy components (cong et al., 2017; prentice et al., 2004; v'kovski et al., 2019). betacoronavirus mouse hepatitis virus (mhv) rtcs assemble by recruiting lc3-i, a nonlipidated form of the lc3 autophagy protein (cong et al., 2017; reggiori et al., 2010), and sars-cov rtcs also colocalize with lc3 (prentice et al., 2004). proximity-based mass spectrometry on the mhv replication complex further revealed that the rtc environment repurposes components from the host autophagy, vesicular trafficking and translation machineries (v'kovski et al., 2019). in the present work, we identify a set of conserved slim candidates in the ace2 and integrin proteins, which are likely to act in the cell entry system of sars-cov-2. these motifs can provide molecular links to understand how the virus recognizes target membranes, enters into cells, and how it repurposes intracellular membrane components to drive its replication. these molecular links might provide novel clues towards drugging sars-cov-2 infections. we first focus on the extracellular slims, before moving across the membrane to examine the cytosolic potential of the receptor tails. the ability of sars-cov-2 rbd to bind integrins via the rgd motif (see table 1) has not yet been assessed directly. however, there are several features that make the spike-integrin interaction plausible, including sequence- and structure-level information, expression profiles, the presence of accessory motifs and protein-protein interactions. aligning close homologues of the rbd from the coronaviridae family (figure 1a) shows that the motif candidate is located in a locally less conserved region, hinting at the rapid evolvability of the site.
several coronavirus rbds contain kgd at this site, which is known to be a lower affinity integrin binding site, first identified in snake venom disintegrins, such as barbourin (minoux et al., 2000). a kgd motif residing in the ebv gh/gl protein has also been shown to be essential for entry into epithelial and b cells. figure 1b shows the tree derived from the spike protein sequence alignment, highlighting that the sars-cov-2 rgd motif might have evolved from an earlier kgd motif, and might represent a distinct step of adaptation if the motif is indeed an integrin attachment site. (kgd and rgd) shown in orange and red boxes, respectively. c) structure of the sars-cov-2 rbd as seen in the ace2-bound form (pdb id:6m17). the rgd motif is shown in red sticks. regions in direct contact with ace2 are shown in blue. residues with missing atomic coordinates (indicating flexibility) in the unbound trimeric spike protein structures (pdb ids: 6vsb, 6vxx and 6vyb) are shown in transparency. alignment and tree prepared in jalview (waterhouse et al., 2009) with clustal colours. structure was visualized using ucsf chimera (pettersen et al., 2004). at the time of reporting the rgd motif, no sars-cov-2 spike structures were available, so the authors used structural homology modelling to determine that the rgd motif is surface accessible (sigrist et al., 2020). since then, several rbd structures have been determined, both in unbound (walls et al., 2020; wrapp et al., 2020) and ace2-complexed forms, using electron microscopy and x-ray diffraction (lan et al., 2020), allowing for the direct structural assessment of the possibility of binding to integrins. figure 1c shows the rgd motif together with the residues involved in direct binding of ace2. the rgd motif is located in the vicinity of the ace2 binding site; however, based on uncomplexed structures of the rbd, the residues that surround the rgd site are flexible.
this indicates that while ace2 binding blocks the rgd motif, without ace2 the rgd is surface accessible and interaction with integrins is not sterically blocked. as spike exists as a trimer on the virion surface, different copies of the rbd can in theory interact with ace2 and integrins at the same time. there is no solved structure of the ace2-integrin complex. however, further structural consideration may indicate whether the spike-ace2 and the spike-integrin interactions can coexist. the ectodomains of both ace2 and integrins in the open conformation are roughly the same size measured from the membrane, being in the 150-200 å range (based on available structures pdb:6m17 and pdb:3ije (xiong et al., 2009)). this means that the rgd-binding site of integrins and the rbd binding regions of ace2 are relatively close in space. in addition, in the more open 'up' conformation of the rbds, the distance between pairs of rbds is about 100 å (based on the structure pdb:6vsb reported in (wrapp et al., 2020)), which is probably larger than the distance between the ace2 binding region and the integrin ligand binding site, estimated from the individual integrin and ace2 structures. the rgd motif is recognized by several integrins, and specificity is determined mostly by the flanking residues around the core motif. as evidenced by crystallized integrin dimer:ligand complexes, the residue preceding rgd is in contact with the α subunit, while the residue after the core motif interacts with the β subunit. the immediate context of the sars-cov-2 rbd motif is 402-irgde-406, which can give an indication about possible integrin targets. irgd can be found in several native integrin-binding partners, including frem1, mfap4 (pilecki et al., 2015) and igfbp1/2 (cavaillé et al., 2006; wang et al., 2006). these extracellular matrix proteins target integrins with αv, α5 and α8 subunits.
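because selectivity is attributed to the residues flanking the core motif, it can be convenient to report each rgd or kgd match together with one residue on either side, in the same 1-based numbering style as 402-irgde-406. the following python sketch does this for an arbitrary sequence. it is an illustration only: the function name and the toy sequence are hypothetical, and the toy sequence is not the spike rbd.

```python
import re
from typing import List, Tuple


def motif_with_flanks(seq: str, core: str = "[RK]GD") -> List[Tuple[int, str]]:
    """Return (1-based start, five-residue context) for each core motif.

    The context is the residue before the core (the position described in
    the text as contacting the alpha subunit), the three core residues, and
    the residue after the core (described as contacting the beta subunit).
    """
    # A lookahead is used so that overlapping contexts are all reported.
    pattern = re.compile(rf"(?=(.{core}.))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq.upper())]


# Hypothetical toy sequence, used only to exercise the function.
toy = "AAIRGDEAAKGDS"
hits = motif_with_flanks(toy)
```

a core motif sitting at the very start or end of the sequence has no flanking residue on one side and is therefore not reported by this pattern; that behaviour is a design choice of the sketch, not a biological statement.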
rgde is present in the native human integrin ligands tgf-β1, osteolectin, collagen α-1(vi) chain, psbg-9 and polydom, and in vitro and in vivo binding studies on the specificity profiles of these proteins (cescon et al., 2015; rattila et al., 2019; sato-nishiuchi et al., 2012; shanley et al., 2013; shen et al., 2019; tumbarello et al., 2012) highlighted a post-rgd glu to be efficient in binding to β1, β2 and β3 integrin subunits. correlating these preferences with possible α and β integrin subunit pairings points to the most likely candidate target integrins for sars-cov-2 being αvβ1, αvβ3, α5β1 and α8β1. motif-domain interactions are typically under heavy spatio-temporal regulation. hence the sars-cov-2 rbd-integrin binding can only occur if the possible target integrins are expressed on at2 cells. both α5β1 (pilewski et al., 1997) and αvβ3 (caccavari et al., 2010; nakamura et al., 2002; singh et al., 2000) have been observed in lung epithelial cells and have been shown to be implicated in disease emergence and progression, including emphysema, non-small cell lung cancer and mechanical injury of the lungs (teoh et al., 2015) , marking these two integrins as prime suspects for targets of the rbd. it has been shown that in heart tissues, ace2 is able to bind the β1 subunit of integrins in an rgd-independent manner, enhancing cell adhesion and regulating integrin signalling via the focal adhesion kinase (fak) (clarke et al., 2012) . the rgd independence of the interaction means that while ace2 and integrins are in complex, the rgd binding site of the integrin is unoccupied, further supporting a potential integrin:ace2:spike ternary interaction. apart from the known interplay between ace2 and integrins, there are additional features that indicate an even tighter crosstalk between the two receptors. 
rgd-mediated binding to integrins is metal mediated (via divalent cations like mg2+ or mn2+), and all integrins have a so-called 'metal ion-dependent adhesion site' (midas) motif (dxsxs) (lee et al., 1995). the integrin midas structural motif is located near the ligand binding site on the β subunit and is essential for binding, as sidechains belonging to the motif and an acidic residue from the ligand coordinate the metal ion together (zhang and chen, 2012). ace2 also has a similar dxsxs motif (see table 1), that might facilitate interactions with ligands that are recognized by integrins, possibly creating an overlap between the ligand binding profiles and regulation of the two receptors. in the known structures where spike is bound to ace2 the rgd motif is not in contact with the ace2 midas. however, the midas motif is highly conserved across species (see figure 2), and surface exposed. consequently it may still be involved in mediating an interaction with an rgd-like motif, potentially serving as a parallel mechanism for binding the spike protein. ace2 and several integrin subunits require proteolytic cleavage for biological activity (see table 1). integrin subunits α3, α5, α6 and αv are cleaved by furin or furin-like proprotein convertases (pcs) during maturation (lehmann et al., 1996; lissitzky et al., 2000). nearly all pcs contain an rgd motif, and while its role in integrin binding is not clear, the motif has been shown to be required for proper functioning for several pcs (lou et al., 2007; lusson et al., 1997; rovère et al., 1999). the sars-cov-2 spike protein contains a furin-like cleavage site that is absent from closely related spike proteins, immediately following the rbd (coutard et al., 2020). this cleavage results in increased virulence, possibly by allowing greater movement of the rbd potentially aiding in exploring a larger space around the rbd-binding region of ace2.
ace2 is cleaved by several proteases, including tmprss2 (heurich et al., 2014) . ace2 binds to tmprss2, forming a receptor-protease complex (shulla et al., 2011) . tmprss2 is also known to cleave the spike protein of both sars-cov and mers-cov (iwata-yoshikawa et al., 2019), augmenting their entry into the host cell (heurich et al., 2014) . furthermore, similar results have been found for sars-cov-2, where tmprss2 was found to be fundamental for cell entry (hoffmann et al., 2020) . this dependence is most probably two-fold: on one hand tmprss2 is needed for ace2 activation, on the other hand, sars-cov-2 spike protein also contains a tmprss2 cleavage site (meng et al., 2020) . the ace2 sequence (uniprot:ace2_human) was entered in the elm server (kumar et al., 2020) and returned several relevant candidate slims in the short cytosolic c-terminal tail. because slims are so short, it is difficult to obtain significant results in sequence searches. contextual information, including cell compartment localisation and functional relevance, is important in deciding whether a motif candidate is worth testing experimentally (gibson et al., 2015) . furthermore, in intrinsically unstructured protein sequence, amino acid conservation is indicative of functional interactions. therefore, an alignment was prepared of vertebrate ace2 proteins. all of the detected motif matches (shown in table 1 together with potential binding partner domains defined using pfam and interpro (mitchell et al., 2019) ) were conserved in mammals, most were conserved with birds and mammals and some were conserved with extant reptiles ( figure 3 ). these groups diverged from one another >300 million years ago (kemp, 2005) indicating a strong conservation of all candidate motifs. in addition, the functional contexts of these motifs are biologically coherent, involving signalling by tyrosine kinases, endocytosis, autophagy and actin filament induction (table 1 ). 
in the following subsections we briefly summarise each of the conserved motifs and their possible role in the viral entry mechanism. [figure caption: diagram of the ace2 protein. figure prepared with jalview using clustal colours.] the yxxphi motif binds the μ2 subunit (uniprot:ap2m1_human) of the endocytosis ap-2 adaptors by β-augmentation (owen and evans, 1998). it is found in numerous cell surface receptors which have intrinsically disordered c-terminal tails (bonifacino and traub, 2003). a small selection is listed in the database entry elm:trg_endocytic_2, and while the motif has not been validated in ace2, it is highly conserved (figure 3). when the tyr is phosphorylated, this motif becomes an sh2 binding site, while in the apo form it binds the μ2 adapter. therefore, this motif can operate as a molecular switch. the residue following the tyr makes a β-strand interaction and therefore cannot be a proline (pdb:1bxx). the phi position requires a bulky hydrophobic residue. the motif pattern can be represented by the regular expression y[^p].[lmvif], and this motif is conserved in ace2 from all mammals except monotremes. thus the mammalian ace2, which internalises the coronavirus, has a slim candidate for internalisation appropriately located within its cytosolic tail. the region encompassing the yxxphi motif overlaps with a src homology 2 (sh2) domain binding motif (figure 3) that is created upon phosphorylation of tyr 781. sh2 domain binding motifs are characterized by an invariant phosphotyrosine (py) that is created following tyrosine kinase activation, and allows binding to more than 100 types of sh2 domains present in human proteins (tinti et al., 2013). the py residue is accompanied by additional binding determinants that frequently involve hydrophobic residues at the py+3 position, but can also involve other combinations, such as asn at py+2 in grb2-specific sh2 motifs, or hydrophobic residues at py+4 in stap-1 sh2 motifs (kaneko et al., 2010).
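the regular expression given for the yxxphi motif (tyr, then any residue except pro, then any residue, then a bulky hydrophobic residue) can be applied directly to a protein sequence. the following is a minimal python sketch and not the elm server's implementation: the function name `find_yxxphi`, the `offset` parameter and the toy peptide are assumptions made for illustration, and only the yasid segment with its numbering (781-785) is taken from the text. a lookahead is used so that overlapping candidates are all reported.

```python
import re

# pattern from the text, written with upper-case one-letter residue codes;
# the lookahead lets overlapping matches be reported as well
YXXPHI = re.compile(r"(?=(Y[^P].[LMVIF]))")


def find_yxxphi(sequence, offset=1):
    """Return (position, match) pairs for every yxxphi candidate.

    `offset` is the residue number of the first character of `sequence`,
    so the reported position is the residue number of the tyr.
    """
    seq = sequence.upper()
    return [(m.start() + offset, m.group(1)) for m in YXXPHI.finditer(seq)]


# hypothetical peptide built around the yasid segment quoted in the text;
# the flanking residues are placeholders, not the real ace2 sequence
toy_tail = "GSNPYASIDGS"
print(find_yxxphi(toy_tail, offset=777))  # [(781, 'YASI')]

# a proline directly after the tyr is rejected by the pattern
print(find_yxxphi("YPSI"))  # []
```

a match found this way is only a candidate: as the text stresses, conservation and cellular context are needed before such a hit is worth testing experimentally.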
most motifs are also characterized by the exclusion of residues at certain positions following the py, and in general, sh2-binding motifs show a high degree of cross-specificity (liu et al., 2010). cell culture infection assays with different coronaviruses, including sars-cov, have shown susceptibility to tyrosine kinase inhibitors, indicating the involvement of host tyrosine phosphorylation (coleman et al., 2016; dong et al., 2020; shin et al., 2018; sisk et al., 2018). the sequence found in ace2 (781-yasid-785) best matches the binding specificity for the sh2 domain present in nck1/2 proteins, which belong to the class ia sh2 domains (kaneko et al., 2010). proteins known to contain this motif are listed in entry elm:lig_sh2_nck1_1. nck proteins are adaptor proteins that modulate actin cytoskeleton dynamics (buday et al., 2002). for example, nck is recruited to the cell membrane by the nephrin protein, following tyrosine signalling that creates several nck sh2 binding motifs in nephrin (blasutig et al., 2008). once recruited to the membrane, nck activates wiskott-aldrich syndrome (wasp) family proteins through the use of a helical binding motif (elm:lig_gbd_chelix_1) that relieves wasp autoinhibition and allows the recruitment of the actin regulatory protein complex arp2/3, which leads to the initiation of actin polymerization (okrut et al., 2015). the nck binding motif is exploited by two known human pathogens, namely enteropathogenic escherichia coli and vaccinia virus (frese et al., 2006). the residues present at py+1, py+2 and py+4 rule out that the ace2 yasid motif can be a strong grb2, crk or stap1 sh2 domain binder, and binding to stat1/2/3 sh2 domains is also unlikely due to the lack of adequate specificity determinants. other sh2 domains (e.g. src-related) could be recruited by ace2, and experimental validation will be required to test these hypotheses. tyrosine 781 in ace2 also overlaps a phosphorylation-independent npy motif (elm:lig_ibar_npy_1).
this motif was initially described in the bacterial secreted protein translocated intimin receptor (tir) from pathogenic strains of e. coli like enterohaemorrhagic e. coli (ehec). the npy tripeptide recognizes and binds with a 60 μm affinity to inverse bin-amphiphysin-rvs (i-bar) domains in adaptor proteins like insulin receptor substrate protein of 53 kda (irsp53) and its homolog insulin receptor tyrosine kinase substrate (irtks) (campellone et al., 2006; de groot et al., 2011). i-bar domains bind to the plasma membrane to favour weak membrane protrusions, and the preference of i-bar domains for negative membrane curvatures enables a positive feedback loop that can result in the formation of lamellipodia, filopodia and other types of membrane protrusions (prévost et al., 2015; zhao et al., 2011). irsp53 and irtks are modular proteins that contain sh3 domains which in turn recognize pxxp slims in actin filament regulators like mena, eps8 and mdia1 (ahmed et al., 2010), resulting in the formation of membrane protrusions through actin filament formation (campellone et al., 2006; chen et al., 2015; prévost et al., 2015; zhao et al., 2011). moreover, irsp53 has an additional cdc42-binding motif that can result in a direct wasp activation (ahmed et al., 2010). during ehec infection, the bacterium uses the npy motif in the transmembrane protein tir to recruit irsp53 (campellone et al., 2006). irsp53 acts as a scaffold to localize the injected bacterial protein espfu to the cytosolic side of the bacterial attachment site, through the binding of a pxxp motif in espfu to the irsp53 sh3 domain. through the use of the same helical slim present in nck (elm:lig_gbd_chelix_1), espfu acts as a potent wasp activator, inducing the actin polymerization that contributes to the pedestal formation characteristic of ehec infections (cheng et al., 2008; sallee et al., 2008).
the npy slim, although not yet experimentally validated in any human protein, is potentially functional in proteins like shank2 or the microtubule-binding clip-associating protein 1 (clasp1), based on protein conservation and functional association (de groot et al., 2011). the putative npy motif in ace2 is conserved in all analysed mammalian and bird homologs (figure 3), suggesting a direct interaction with host i-bar-containing proteins such as irsp53 or irtks, which are expressed in lung tissues (uhlén et al., 2015). the i-bar domain-binding motif in the cytosolic region of ace2 could be relevant for sars-cov-2 infection in the following scenario. during viral cell entry, the npy motif could recruit i-bar-containing proteins such as irsp53 or irtks, resulting in membrane protrusion formation that could be exploited for viral entry or in cell to cell transmission. it is known that the hijack of the filopodia formation network is beneficial for the entry and spreading of many enveloped viruses (reviewed in chang et al., 2016), but whether this process is active during coronavirus infection is still unclear. a second route might cooperate with the npy motif in the recruitment of actin cytoskeleton components. a direct interaction between the sars-cov spike protein cytosolic side c-terminal domain and the ezrin ferm domain can occur during the opening of the viral fusion pore and has been proposed to restrain viral infection (millet et al., 2012). ezrin is a protein involved in cell morphology and apical membrane remodelling that acts as a membrane-cytoskeleton linker. ezrin recruits f-actin through its c-terminal domain, and can also bind to irsp53 located at negatively curved membranes (saleh et al., 2009; tsai et al., 2018), suggesting that while the npy motif acts at earlier stages of viral attachment, the spike protein/ezrin interaction might work during or after viral fusion, to promote the recruitment of actin regulatory components to viral fusion sites.
certain members of the ptb domain family were discovered to bind to phosphorylated npxy motifs, hence the designation phospho-tyrosine binding domain (zhou et al., 1995). the npxy motifs in cytosolic tails of receptors, including integrins, are regarded as endocytosis sorting signals (bonifacino and traub, 2003). it was later discovered that ptb domains in the internalisation adapter protein dab1 could also bind non-phosphorylated nxx[fy] motifs (apoptb motif) and that this might be the case for the majority of ptbs (uhlik et al., 2005). representative receptors with apoptb motifs are in the database entry elm:lig_ptb_apo_2. in ace2, the core nxxf motif is conserved in all land vertebrates (figure 3). for dab1 apoptb motifs, there is a hydrophobic requirement two residues before the asn. in ace2, this is predominantly a charged residue: therefore, if this strikingly conserved nxxf is an apoptb motif, it should bind a protein other than dab1 and proteins with related specificities. the apoptb motif binds as a short β-strand (β-augmentation) followed by a β-turn. proline is rejected at one strand-forming position and therefore the regular expression for this motif would be [^p].n..f for land vertebrates or [eq].n..f for mammals. as with the phosphorylated versions, the apo-motifs are tightly connected to endocytosis (uhlik et al., 2005). autophagy, the recycling of cellular material, is vital for cellular homeostasis. many pathogens must control the autophagy response to establish productive infection (deretic and levine, 2009). it has been shown that coronaviruses, including human covs, subvert autophagy components to promote viral replication at dmvs associated to the rtc (cottam et al., 2014; fung and liu, 2019; gassen et al., 2019; reggiori et al., 2010). the lir motif is required for the interaction of a target protein with atg8 or its homologues lc3 and gabarap to facilitate autophagy of the target via the autophagosome (birgisdottir et al., 2013).
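the two apoptb expressions quoted above differ only in their first position, and the difference is easy to see on small examples. the sketch below is illustrative only: the helper `scan`, the constant names and the three toy peptides are hypothetical and are not ace2 sequence; only the two patterns come from the text.

```python
import re

# patterns from the text, written with upper-case one-letter residue codes
APOPTB_LAND_VERTEBRATES = re.compile(r"(?=([^P].N..F))")  # proline rejected
APOPTB_MAMMALS = re.compile(r"(?=([EQ].N..F))")           # glu or gln required


def scan(pattern, sequence):
    """Return (1-based start position, matched segment) for every candidate."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq)]


# hypothetical peptides that differ only two residues before the asn
with_glu = "AAEKNAGFAA"
with_asp = "AADKNAGFAA"
with_pro = "AAPKNAGFAA"

for peptide in (with_glu, with_asp, with_pro):
    print(peptide,
          scan(APOPTB_LAND_VERTEBRATES, peptide),
          scan(APOPTB_MAMMALS, peptide))
```

the glu peptide satisfies both patterns, the asp peptide only the broader land vertebrate pattern, and the pro peptide neither, which mirrors the statement that proline is rejected at the strand-forming position.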
the lir motif has been catalogued in the elm resource entry elm:lig_lir_gen_1 which detected a candidate motif in the human ace2 cytosolic tail sequence (figure 3). after the lir motif was annotated in elm, a more recently solved lc3-lir structure (pdb:5cx3) showed that the interacting peptide is longer, with one or two more hydrophobic interactions (olsvik et al., 2015). lir enters a hydrophobic groove bordered by positively charged residues. a core [wfy]xx[ilmv] enters the deepest part of the groove. on either side of the core, the interacting residues can be flexibly spaced. the core must be preceded by a negatively-charged residue (which might be enabled by phosphorylation). further, the motif core is followed by a flexibly spaced hydrophobic residue. there is often a negatively-charged residue preceding this hydrophobic position: it can make favourable interactions with counter charges but is not an absolute requirement, so is not included in the revised motif pattern. a revised motif pattern based on the structure (pdb:5cx3) and some spot arrays (alemu et al., 2012; olsvik et al., 2015; rasmussen et al., 2019) matches the motifs annotated in elm. the revised motif is conserved in the mammalian ace2 cytosolic tail, but not in birds or reptiles. the ace2 lir motif candidate can potentially enable the incoming coronavirus to attract autophagy elements such as lc3 to the structures where the virus replicates and assembles. in line with this, a nonlipidated form of the lc3 protein has been shown to be associated with the rtcs of mhv and sars-cov (cong et al., 2017; prentice et al., 2004; reggiori et al., 2010). this brings up the interesting possibility that ace2 remains associated with the membranous structures where sars-cov-2 replicates at later infection stages, assisting in the repurposing of autophagy components required for viral replication. amongst other motif-binding modules, pdz domains are highly abundant in humans and other multicellular animals (ernst et al., 2009).
pdz domains take part in a variety of biological processes including cellular signalling and activity at the synapse (manjunath et al., 2018). these domains bind slims called pdz-binding motifs (pbms) by β-strand augmentation; pbms are most commonly found at the c-terminus of fully or partially disordered proteins. these interactions are widely studied and their link to various diseases and infections has been previously established (christensen et al., 2019). a pbm candidate is also found at the very c-terminus of the cytosolic tail of vertebrate ace2 proteins (figure 3). motifs following the pattern [st].[acvilf]$ are a common pdz-binding motif variant, described in the elm resource entry elm:lig_pdz_class_1. there are multiple functional examples of this motif. however, in the ace2 protein the matching sequence is not characterized. ace2 has a disordered tail facing the cytosol, where a number of different pdz domains could be its potential binders (manjunath et al., 2018). two pdzs in two different adapter proteins, na(+)/h(+) exchange regulatory cofactor nherf3 and sh3 and multiple ankyrin repeat domains protein 1 (shank1), have previously been identified as able to bind txf$ sequences ("$" stands for the c-terminal end) (ernst et al., 2014), which makes them both candidates for an interaction with the ace2 c-terminus. nherf3 is co-localised with ace2 in intestinal tissue, and its pdz domains were previously validated to interact with pbms in transmembrane proteins on the cytosolic side of the membrane (gisler et al., 2003), so it is possible they come in proximity with the ace2 tail containing the txf$ motif, and possibly bind it as a part of ion exchange regulation of small molecule transport activities.
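because the class 1 pbm pattern is anchored with "$", it can only match the last three residues of a chain. a minimal python sketch of that check follows; the function name `c_terminal_pbm` and the example peptides are hypothetical, and only the pattern [st].[acvilf]$ is taken from the text.

```python
import re

# class 1 pdz-binding motif from the text; "$" anchors it to the c-terminus
PBM_CLASS_1 = re.compile(r"[ST].[ACVILF]$")


def c_terminal_pbm(sequence):
    """Return the c-terminal tripeptide if it fits the class 1 pattern, else None."""
    # strip whitespace so that a trailing newline cannot shift the anchor
    seq = sequence.strip().upper()
    match = PBM_CLASS_1.search(seq)
    return match.group() if match else None


# hypothetical peptides, not real protein sequences
print(c_terminal_pbm("AAQTSF"))  # ends in t-x-f, fits the pattern
print(c_terminal_pbm("TSFAA"))   # same tripeptide, but not at the c-terminus
print(c_terminal_pbm("AATSG"))   # last residue is not in [acvilf]
```

the second example shows why position matters: an internal txf segment is not reported, consistent with pbms being recognised at the free c-terminal end.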
nherf3 is known for its involvement in sodium ion-dependent transporter activity (srivastava et al., 2019), and ace2 was also shown to interact with a sodium-dependent transporter, which could be one of the leads towards unravelling the possible interaction between nherf3 and ace2. nherf3 (gene name pdzk1) and ace2 share a network of experimentally validated protein-protein interactions localized at the cell membrane (figure 4). these proteins share localisation and function characteristics, and some of them could turn out to be the missing puzzle pieces to connect ace2 and nherf3 as possible interactors. all in all, whether nherf3 is the pdz-containing protein interacting with ace2 or not is an open question, but there should be very little doubt that ace2 exhibits pdz-binding activity in the cell, since the motif is highly conserved, very specific and already known to appear in membrane-associated proteins with c-terminal tails facing the cytosol. [figure 4 caption: c-terminal pbms are displayed in white boxes. common functionalities of the proteins are marked by different colours. thickness of nodes is proportional to the confidence behind the experimental evidence. the network was built using the string resource (https://string-db.org) (szklarczyk et al., 2019).] tyrosine 781 is a part of the motif patterns for four of the motifs listed above but must be phosphorylated to act as an sh2-binding motif. therefore, we searched the ace2 literature for reports of phosphorylation but were unable to find any with strong site identification. examination of the human ace2 entry in the database phosphositeplus (hornbeck et al., 2019) revealed that high-throughput (htp) phosphoproteomic studies, but no low-throughput (ltp) studies, identify ptyr781. as shown in figure 5, thirteen htp measurements identified phosphorylation at tyr781 and this residue is the only ace2 phosphosite that is highly reproducible across multiple htp datasets.
for example, ptyr781 was one of 318 unique phosphopeptides belonging to 215 proteins analysed from an erlotinib-treated breast cancer cell line model (tzouros et al., 2013). therefore, this site indeed fulfils the phosphorylation requirement to be an sh2-binding motif. [figure 5 caption, fragment: ... have only been reported once each and therefore are likely to be misidentified peptides.] as described above, four sequence motifs overlap in the region surrounding tyr781: the yxxphi endocytic sorting signal (elm:trg_endocytic_2), an nck sh2 binding motif (elm:lig_sh2_nck_1), an npy i-bar binding motif (elm:lig_ibar_npy_1) and the lir autophagy motif (elm:lig_lir_gen_1). while the yxxphi, npy and lir motifs require a non-phosphorylated state of tyr781, the nck motif requires tyr781 phosphorylation, creating the opportunity for a four-way molecular phospho-switch acting in this region of ace2 that directs different steps of the sars-cov-2 infection cycle. the state of this switch can be controlled by tyrosine kinase activity involving the src/abl and other tyrosine kinases known to be upregulated during endosomal processes (reinecke and caplan, 2014) and viral infection (davey et al., 2011), including in coronaviruses (coleman et al., 2016; dong et al., 2020; shin et al., 2018; sisk et al., 2018). similar two-way switches have been described before, as with the ctla-4 receptors, where src-family tyrosine kinases dictate the binding preferences of overlapping yxxphi and sh2 binding motifs. in the non-phosphorylated state endocytosis is favoured, whereas t cell activation brings about tyr phosphorylation, shutting down endosomal recycling and initiating signalling through the recruitment of sh2-domain containing proteins (bradshaw et al., 1997; miyatake et al., 1998; ohno et al., 1996; owen and evans, 1998; shiratori et al., 1997).
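the proposed tyr781 switch can be summarised as a lookup from phosphorylation state to the motifs that are available for binding. the python sketch below only restates the requirements given in the text; the dictionary and function names are hypothetical, and the model is deliberately binary, ignoring kinetics, partial occupancy and any additional regulation.

```python
# True: the motif needs phosphorylated tyr781
# False: the motif needs non-phosphorylated tyr781 (requirements as stated in the text)
REQUIRES_PTYR781 = {
    "yxxphi endocytic sorting signal": False,
    "npy i-bar binding motif": False,
    "lir autophagy motif": False,
    "nck sh2 binding motif": True,
}


def available_motifs(tyr781_phosphorylated):
    """List the overlapping motifs compatible with the given tyr781 state."""
    return sorted(
        name
        for name, needs_phospho in REQUIRES_PTYR781.items()
        if needs_phospho == tyr781_phosphorylated
    )


print(available_motifs(False))  # the three phosphorylation-independent motifs
print(available_motifs(True))   # only the nck sh2 binding motif
```

in this simplified view a single ace2 tail exposes either the three non-phosphorylated motifs or the nck sh2 motif, never both sets at once.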
during early stages of viral infection, and following viral attachment to the host cell membrane, the yxxphi motif could activate the early events of receptor-mediated endocytosis by binding the μ2 subunit of the ap2 complex, which in turn recruits clathrin and other endocytic components to the viral attachment sites. following the formation of the clathrin coat and initial invagination, actin polymerization is required to assist in the internalization of the endocytic vesicles. in addition, some viruses can 'surf' along filopodia by myosin-mediated actin cytoskeleton movements that transport the viral particles to the entry sites at the cell body, ultimately increasing their entry rate (reviewed in chang et al., 2016). the actin hijack related to endocytic uptake and the formation of membrane protrusions could be coordinated in one or several stages by the nck sh2 and npy motifs respectively, by recruiting wasp and i-bar proteins to the viral attachment site. this might be enacted by a tyr781 phosphorylation switch that is regulated in time, with early attachment characterized by non-phosphorylated tyr781 that allows the yxxphi and npy motifs to be active, and a later phase where tyr781 becomes phosphorylated, inactivating the yxxphi and npy motifs and bringing the nck sh2 motif into play. an alternative scenario might be enabled by the multimeric nature of the spike protein and by attachment of several viral particles to a membrane domain, leading to adjacent ace2 tails on the intracellular side that expose both phosphorylated and non-phosphorylated motifs, allowing these three signalling steps to take place simultaneously. the presence of several parallel routes for the recruitment of cytoskeleton components involving the npy and nck motifs could provide the robustness needed to ensure the actin reorganization required for the uptake of virus-containing vesicles into the cytosol.
following endocytosis and fusion, viral components are released into the cell and viral replication takes place. during this phase, the last component of the switch could come into play, when the ace2 protein that remains bound to spike protein-coated membranes could promote the hijack of autophagy components necessary to assemble the viral replication factories. at this time, non-phosphorylated ace2 tails would recruit lc3 to replication structures through their lir motifs. integrin β tails are short cytosolic c-terminal intrinsically disordered regions, similar to the analysed region of ace2. the two most probable integrin β subunit candidates at play in sars-cov-2 viral entry are β3 and β1. the c-terminal tail of both subunits share a high degree of sequence similarity, and similarly to ace2, contain several known and candidate slims (see table 1 and figure 6 ) that propagate signals in the cytoplasm and regulate integrin activity not just through intracellular pathways, but also changing the structural state of the ectodomains determining ligand binding capacity (anthis and campbell, 2011) . integrin β tails contain a highly charged patch in their membrane proximal region. this region is indispensable for the interaction between integrins and tyrosine kinases, including the src-family kinase fyn (reddy et al., 2008) and the focal adhesion kinase (fak), most probably via the direct interaction with paxillin (liu et al., 2002) . through these interactions, integrins regulate cytoskeletal remodelling (lv et al., 2016) and the promotion of cell survival (tang et al., 2007) , as well as regulation of focal adhesion assembly and cell protrusion formation (hu et al., 2014) . in turn, fak regulates integrin recycling and endosomal trafficking (mana et al., 2020; nader et al., 2016) . 
currently, there is no consensus sequence motif describing these interactions, although a definition of hdr[kr]e has been proposed (legate and fässler, 2009), fitting integrins β1, β3, β5 and β6. this motif is under heavy regulation by several mechanisms. first, the interaction with tyrosine kinases seems to involve additional residues n-terminal of the charged motif core, most notably the conserved lysine preceding the hydrophobic patch (reddy et al., 1998), that are only accessible in the active state of the integrin dimer, as these regions are buried in the membrane otherwise (stefansson et al., 2004). second, the d residue of the motif forms a salt bridge with the cytosolic tail of the α subunit of the integrin in the inactive conformation of the receptor. thus, this motif region is dependent on integrin activation regulated by ligand binding and intracellular interactions mediated by the c-terminally located npxy motifs. both the β1 and β3 tails contain two regions that match the apoptb motif defined in the elm resource (elm:lig_ptb_apo_2). furthermore, these regions are known to have tyr phosphorylation, matching the phosphorylated motif definition as well (elm:lig_ptb_phospho_1). these regions are known to be able to form β-turns, and are recognition sites for phosphotyrosine-binding domains. npxy motifs (or nxxy in the case of the second motif of the β3 tail) are the major sorting signals mediating interactions with ferm domains for regulating endosomal trafficking (ghai et al., 2013). in integrin β tails these motifs recruit adaptor proteins and clathrin, serving as sorting signals (ohno et al., 1995), and the npxy motifs in the β1 tail have a direct connection to viral entry for reovirus (maginnis et al., 2008). the first npxy motif binds talin-1, serving as a connection between the plasma membrane and the major cytoskeletal structures (horwitz et al., 1986).
considering the expression profiles of talins, the most likely interaction partner of lung-expressed integrins is talin-1. talin-1 contains a ferm domain, similarly to ezrin, which establishes a direct interaction with the sars-cov spike protein upon viral fusion (millet et al., 2012) . however, the interaction between the rbd and integrins offers the virus an earlier point of interference with the cytoskeletal system, being able to modulate it cooperatively with the ace2 actin-regulatory elements (npy and nck sh2 motifs) before and during cellular entry. the talin/integrin interaction however presents a feedback loop: the binding of talin on the cytoplasmic side induces a structural rearrangement on the ectodomains of integrins, enabling a higher affinity interaction with rgd motif containing ligands (tadokoro et al., 2003) . the first npxy motif is also a binding site for dok1, a negative regulator of integrin activation. dok1 is in direct competition with talin for binding integrins (tadokoro et al., 2003) . the competition is fundamentally influenced by phosphorylation on tyr773 of the npxy motif. the non-phosphorylated motif has a higher affinity towards talin, whereas phosphorylation prefers dok1 (oxley et al., 2008) ; thus tyr773 acts as a phospho-switch that regulates integrin activation. the first npxy motif also presents a site for a largely phosphorylation-independent interaction with the integrin cytoplasmic domain-associated protein-1 (icap-1). icap-1 is a fundamental regulator of the assembly of focal adhesions (fa) and icap-1 knockdown reduced fa assembly (alvarez et al., 2008) , possibly working in conjunction with the membrane-proximal charged motif. icap-1 seems to be specific for β1, and hence the therapeutic considerations for targeting this pathway requires the verification of the type of integrins expressed on at2 cells (and other related cell types). the second npxy motif is a binding site for kindlin . 
this interaction requires the integrin tail to be non-phosphorylated and phosphorylation on tyr785 (for β3) can switch off the interaction with kindlin-2 (bledzka et al., 2010) . kindlin binding (together with talin binding) is a crucial step in integrin activation, and hence regulates the availability of integrins for extracellular ligands (herz et al., 2006) , and was also suggested to play a role in tgf-β1 signalling (kloeker et al., 2004) . the two npxy(-like) motifs in the integrin β tails not only constitute two separate phospho-switches, but also act in synergy to give rise to more complex regulation. filamin and the ptb domain region of shc1 each bind to both npxy motifs (deshmukh et al., 2010; liu et al., 2015) . shc is an adaptor protein playing a key role in mapk and ras signalling pathways, and its interaction with integrin β3 requires both phosphorylations on tyr773 and tyr785 (audero et al., 2004; deshmukh et al., 2010) . in contrast, binding of the ig domain of filamin-a requires both tyrosines to be in a non-phosphorylated state. the filamin-a interaction can be considered as a main shutdown switch in integrin signalling, as this interaction induces the closed conformation of the integrin ectodomains, decreasing the chance of ligand binding (liu et al., 2015) . in addition, binding partners utilizing both npxy motifs may also serve as stronger modulators of endosomal trafficking, switching on enhanced signals. the connection between autophagy and cell adhesion has already been described, showing that both reduced fak signalling (sandilands et al., 2011) and detachment from the extracellular matrix via integrins (vlahakis and debnath, 2017) enhances autophagy. 
atg deficient cells have enhanced migration properties, and at the molecular level there seems to be a direct connection between atg proteins and integrins as well: autophagy stimulation increases the co-localization of β1 integrin-containing vesicles with lc3-stained autophagic vacuoles, while autophagy inhibition decreases the degradation of internalized β1 integrins (tuloup-minguez et al., 2013). in drosophila cells it has been shown that the wiskott-aldrich syndrome protein and scar homolog (wash) plays a connecting role between integrin recycling and the efficiency of phagocytic and autophagic clearance (nagel et al., 2017). however, molecular details about how this connection is brought about are unclear. sequence analysis of integrin β3 tails shows a potential atg-interacting lir motif, similarly to the ace2 tail. low throughput phosphorylation assays have determined that both tyr773 and tyr785 (required for the functional ptb npxy motif) are in fact phosphorylated in vivo. however, such assays have also determined additional phosphorylation sites in the β3 tail: thr777, ser778, thr779 and thr784. these phosphorylations are not connected to the npxy motif switches in any known way. however, the three phosphorylations between residues 777-779 and the following sequence region match the lir autophagy motif also found in the ace2 tail. while the current motif definition does not exactly fit the β1 tail, there is also low throughput phosphorylation assay data (wennerberg et al., 1998) for the existence of these phosphorylations in the corresponding residues, hinting at the possibility of the presence of a slightly modified motif. for both β1 and β3 tails, the phosphorylation provides the negative charge required n-terminal of the fxxixy lir motif hydrophobic core.
phosphopeptides spanning the candidate region should reveal whether the lir motif-like region is a functional atg-binding site in integrin β1, and also whether the multiple phosphorylations act as a rheostat, modulating the affinity of the interaction. the motif found in integrin β3 is also present in integrin β2, and the motif candidate identified in integrin β1 is also present in integrin β6. bringing together the candidate slims identified in the integrin β and ace2 tails potentially strengthens the functional links between them, and provides an emergent picture of slim-driven cooperative switches driving viral attachment, entry and replication (figure 7) . following attachment of spike to the receptors, the two npxy motifs in the integrin β subunit could act cooperatively with the apoptb and yxxphi motifs in ace2 as sorting signals that mediate the internalization of viral particles into endosomes. the presence of several endocytic motifs in close proximity would strengthen the interaction with the endocytosis apparatus creating a high-avidity environment for recruitment of rme components (bonifacino and traub, 2003) . during this time, the phosphorylated integrin npxy motifs would also reinforce viral attachment through inside out signalling, stabilizing the integrin ectodomain in the open, high affinity ligand-binding conformation. as discussed previously, rme also involves the recruitment of adaptor molecules that activate rearrangements of the actin cytoskeleton required for the internalization of the endocytic vesicle. at this stage, the npy and nck sh2 motifs in ace2 would recruit several molecules that mediate actin polymerization signalling, prominently i-bar containing proteins irsp53 and irtks, and the wasp-arp2/3 complex. while most of this actin signalling would serve to allow viral entry, additional actin recruitment processes could occur following viral fusion, such as that initiated by the interaction between the spike protein and ezrin. 
finally, at later stages of infection, both integrins and ace2 might remain attached to virus-associated dmvs and other replication-competent membranes where the rtc assembles. at this stage, ace2 and integrins might cooperatively mediate the recruitment of autophagy components such as lc3, through the lir motifs located in the cytosolic tails of both molecules.

figure legend: elements shown in one of the monomers of a homotrimer (spike) or homodimer (ace2) are also present in the other proteins forming that complex. lines below motif boxes represent each of the overlapping motifs in that specific region. arrows indicate the related cellular process, and the protein known to interact with each motif is indicated in parentheses. phosphorylation sites are shown as red circles with the respective sequence position indicated. for the integrin β tail, the ptb/apoptb phospho switch is depicted as two separate versions of the same motif region, and the subscripts represent the motif order in the sequence. the colour code is as follows: cleavage sites (brown); motifs: apoptb/ptb (orange), endocytic sorting signal (purple), i-bar binding (red), lir (blue), midas (yellow), nck (green), pbm (magenta), rgd (light red). slims mediating interactions are marked with coloured squares, protease cleavage sites are marked with hexagons, structural motifs are marked with ovals. motifs marked with † are experimentally validated.

the analysis of candidate slims in ace2 and integrins suggests that sars-cov-2 hijacks both receptors, co-opting their slims to drive viral attachment, entry and replication. this creates an opportunity for drugging these interactions, or the processes they control, through host directed therapies (hdts) to prevent viral entry. a list of potentially useful drugs is presented in table 2, together with chembl accessions (mendez et al., 2019). the sequence rgd is used by a large number of viruses for cell attachment, via integrins (hussein et al., 2015) .
rgd mimics have been developed as inhibitors of integrin-extracellular matrix protein interaction for a variety of diseases. a cyclic rgd peptide (c-rgdfv, cilengitide) has been developed clinically for glioblastoma treatment and other cancers. it proved safe, but did not enhance the survival benefit (stupp et al., 2014). sars-cov-2 has a unique rgd sequence in the ace2 binding region of its spike protein. it has been speculated that integrins may play a role in infectivity (sigrist et al., 2020). therefore, integrin inhibitors like rgd mimetics might be able to block the rgd binding site(s) on target cells and block the attachment of the virus. another application that has been suggested is bacterial sepsis (sepsis is also a dreaded complication in covid-19 patients), and experimental evidence in animals is available (garciarena et al., 2017). cilengitide is relatively specific for integrin αvβ3 and also active on αvβ5 and αvβ6. the antibody abituzumab (aka di 17e6) is a pan-αv antibody, i.e. also active against other αv integrins, and may consequently be better suited for blocking virus entry. it has been clinically tested in several cancer indications (élez et al., 2015; hussain et al., 2016). cilengitide and abituzumab are made available for in vitro testing in the context of the sars-cov-2 pandemic by the company that initially developed them (contact: compound_donation@merckgroup.com). as discussed above, tyrosine kinase mediated phosphorylation plays an important role in virus entry and maturation, and several tyrosine kinase inhibitors have been developed in the clinic and show effects on viral infection. for example, saracatinib, a src and abl inhibitor which has completed several clinical trials, including some in cancer, inhibited replication of different coronaviruses including mers-cov, sars-cov and hcov-229e in cell culture infection experiments (shin et al., 2018) .
after internalization and endosomal trafficking, imatinib, an abl inhibitor, prevented fusion of sars-cov and mers-cov virions at the endosomal membrane in infected cell culture experiments (coleman et al., 2016). using the avian model virus, ibv, imatinib and two other abl inhibitors, gnf2 and gnf5, prevented the fusion of the spike protein to the membrane of the target cell as well as cell-cell fusion and syncytia formation (sisk et al., 2018). more recently, tyrphostin a9, a platelet-derived growth factor receptor (pdgfr) tyrosine kinase inhibitor, emerged from a high-throughput screen using cytopathic effect as readout; it showed in vitro inhibitory activity against transmissible gastroenteritis virus (tgev), an alpha coronavirus that infects pigs (dong et al., 2020). the authors also showed that tyrphostin a9 has a broad spectrum, being active against three other tested coronaviruses: mhv in l929 cells, porcine epidemic diarrhea virus in vero cells and feline infectious peritonitis virus in ccl-94 cells. the mode of action was found to be through p38 mapk, at the post-adsorption stage. as fak has been implicated in viral entry for other viruses including influenza a (elbahesh et al., 2014), experimental drugs targeting fak, including some in clinical trials (de jonge et al., 2019), can be considered for studying the potential spike-induced integrin signalling. currently, 39 tyrosine kinase inhibitors are approved by the fda: eleven target non-receptor protein-tyrosine kinases and 28 inhibit receptor protein-tyrosine kinases (roskoski, 2020). consequently, tyrosine kinase inhibitors may be good candidates to test for their effect on sars-cov-2. a number of protease inhibitors are currently being discussed for sars-cov-2 treatment. the serine protease inhibitor camostat mesylate is active against tmprss2 and blocks cell entry (hoffmann et al., 2020). among coronaviruses of the same clade, only the spike protein of sars-cov-2 contains a furin cleavage sequence (prrars|v).
consequently, furin convertase inhibitors are considered as antiviral agents (shiryaev et al., 2007) . many viruses enter the cell via endocytosis, and a number of candidate slims relevant for sars-cov-2 infection are related to endocytosis (see above). chlorpromazine, an antipsychotic (via dopamine d2 antagonism) dating from the 1950s, is also a potent endocytosis inhibitor (which may explain some of its marked side effects, which can include low white blood cell levels). it promotes the assembly of adaptor proteins and clathrin on endosomal membranes thus depleting them from the plasma membrane, leading to a block in clathrin-mediated endocytosis (wang et al., 1993) . the potential use of endocytosis inhibitors such as amiodarone (stadler et al., 2008) and chlorpromazine in coronavirus infection is further discussed here . the situation with targeting autophagy seems unclear. autophagy activators might help the cell to consume incoming virus -or speed up the establishment of the viral replication complexes and accelerate disease. autophagy inhibitors might work in later stages of infection to dampen viral production, but this will depend on whether autophagy is active at the time or if the constituent components have been captured and effectively shut down. several inhibitors/activators have been reported which can target autophagy and multiple auxiliary signals feeding into the process of autophagy (table 2 ). one such axis is via the mtorc1 complex. active mtorc1 keeps the autophagy process inhibited by phosphorylating the ulk complex which is a key regulator in autophagy. inhibition of mtorc1 activates autophagy. multiple fda approved mtor inhibitors are known and include rapamycin and everolimus. rapamycin has been shown to be effective in cell culture for countering mers-cov infection and might assist in tackling sars-cov-2 infection (kindrachuk et al., 2015) although the stage of infection might be crucial for the desired outcome. 
simvastatin is another drug which is known to upregulate autophagy via the mtor pathway (wei et al., 2013). simvastatin has also been reported to alleviate airway inflammation in a mouse asthma model (gu et al., 2017). another autophagy modulator is niclosamide, which regulates autophagy by targeting the autophagy regulator beclin1 via skp2 e3-ligase in mers-cov infection. in this case, reduced beclin1 levels block the fusion of autophagosomes and lysosomes, and hence the virus protects itself in the host (gassen et al., 2019). overall, inhibiting skp2 by niclosamide relieves beclin1, allowing autophagosome-lysosome fusion and resumption of autophagy to reduce mers-cov production. in addition, niclosamide and valinomycin have been shown to target sars-cov in cell cultures as well (wu et al., 2004). a recent proteomic study expressed 26 tagged sars-cov-2 proteins individually to create a viral-human protein-protein interaction map (gordon et al., 2020). as a result, 69 compounds, some being fda approved, are candidates for drug repurposing. the spike protein interacted with the golga7-zdhhc5 acyl-transferase complex, possibly for palmitoylation on spike's cytosolic tail. spike did not pull down ace2 or any cell surface receptor protein, but these experiments are not an assay for viral entry, and therefore many protein interactions related to the viral entry mechanism are likely missing from these data. our observations of slim candidates in the viral attachment, entry and replication system therefore reflect additional areas of the cell where drug repurposing for host-directed therapy might be explored. although tyrosine kinase inhibitors have frequently been shown to dampen pathogen invasion and disease progression in cell culture, there has been little effort to move these findings into the clinic (sámano-sánchez and .
because of their widespread use in cancer, the safety profiles of tyrosine kinase inhibitors are well known and we wonder whether this might be a neglected opportunity. drugging the cell to cure the pathogen using hdts is unlikely to fully remove a virus. this would also be undesirable, because the immune system must mount a defence in order to prevent viral reinfection. rather, dampening viral load during viral invasion or replication should be the target, to give the host defences time to respond. it is well known that drugs like tamiflu, which slow influenza exit and therefore entry into uninfected cells, can only have a strong effect when taken prophylactically or early in infection (bassetti et al., 2019). depending on the importance of integrins in sars-cov-2 lung cell entry, reducing viral entry is a possible role for cilengitide or other molecules that hamper integrin or ace2 binding. an endocytosis inhibitor might play a similar role and is independent of receptor type. however, for any such inhibitor that passes the blood-brain barrier, effects on mood and other brain operations are an inevitable side effect: even so, the endocytosis inhibitor chlorpromazine is a widely used drug with a well-known safety profile (solmi et al., 2017). due to the presence of the cell attachment motif rgd in sars-cov-2, integrin inhibitors seem worthwhile to explore further. cilengitide, a relatively selective integrin αvβ3 and αvβ6 inhibitor (a cyclic peptide that proved safe in patients, but failed to show a survival benefit in glioblastoma (stupp et al., 2014)), might be useful in two phases: it could block virus attachment to target cells, and it has also been proposed as a potential treatment in sepsis (garciarena et al., 2017). sepsis was the most frequently observed complication among covid-19 patients in wuhan (f. .
another potential application of integrin inhibitors, especially integrin αvβ6, would be lung fibrosis -patients on respirators tend to develop lung fibrosis, and show increased αvβ6 levels (horan et al., 2008) . the antibody abituzumab (aka di 17e6) is a pan-αv antibody with high potency on αvβ6, i.e. also active against several αv integrins, and may consequently be better suited for blocking virus entry, or may be suitable for lung fibrosis or sepsis protection. it has been clinically tested in several cancer indications (élez et al., 2015; hussain et al., 2016) , and proved safe, but did not achieve a survival benefit in cancer. whether the enzymatic function of ace2 has a role in sars-cov-2 infection is unknown. however, it would be readily testable with available ace2 inhibitors, like captopril, on the market since the 1970s (enalapril would not be suited for in vitro testing, as it needs to be activated in the liver to become the active ingredient). it might be productive to test for synergy between molecules that block viral binding to ace2 and molecules that block binding to integrins. we have presented evidence at the sequence level for slims in ace2 and β integrins with the potential to function in viral attachment, entry and replication for sars-cov-2. we identified several candidate molecular links and testable hypotheses that might help uncover the (still poorly understood) mechanisms of sars-cov-2 entry and replication. because these motifs belong to host proteins acting as viral receptors, they are not revealed by virus-centred proteomic assays. most of these putative motifs lack direct experimental evidence. that they may well be functional, however, is indicated by sequence conservation, in some cases for hundreds of millions of years. in addition, these motifs are in appropriate cellular contexts to interact with their respective partner proteins. 
experimental validation will yield insights into receptor-mediated endocytosis for sars-cov-2 virus and, in addition, for the role of ace2 in the normal cell, where it surely has much more functionality than being an angiotensin converting enzyme. overall, the collection of candidate motifs in this system suggests that a range of host directed therapies might be explored including rgd inhibition, tyrosine kinase inhibition, endocytosis inhibition and autophagy inhibition and/or activation.

table notes (motif candidates): or interpro (mitchell et al., 2019), where applicable; 2: x = any residue, p; 5: pc = proprotein convertases; 3: '*' marks cleavage points for protease-recognition motifs; 6: not a slim but a structural motif.

i-bar domains, irsp53 and filopodium formation atg8 family proteins act as scaffolds for assembly of the ulk complex: sequence requirements for lc3 -interacting region (lir) motifs integrin cytoplasmic domain-associated protein-1 (icap-1) promotes migration of myoblasts and affects focal adhesions the tail of integrin activation adaptor shca protein binds tyrosine kinase tie2 receptor and regulates migration and sprouting but not survival of endothelial cells neuraminidase inhibitors as a strategy for influenza treatment: pros, cons and future perspectives the lir motif -crucial for selective autophagy phosphorylated ydxv motifs and nck sh2/sh3 adaptors act cooperatively to induce actin reorganization tyrosine phosphorylation of integrin beta3 regulates kindlin-2 binding and integrin activation signals for sorting of transmembrane proteins to endosomes and lysosomes interaction of the cytoplasmic tail of ctla-4 (cd152) with a clathrin-associated protein is negatively regulated by tyrosine phosphorylation the nck family of adapter proteins: regulators of actin cytoskeleton integrin signaling and lung cancer enterohaemorrhagic escherichia coli tir requires a c-terminal 12-residue peptide to initiate espf-mediated actin assembly and harbours n-terminal sequences that
influence pedestal length igfbp-1 inhibits egf mitogenic activity in cultured endometrial stromal cells collagen vi at a glance filopodia and viruses: an analysis of membran e processes in entry mechanisms the kgd motif of epstein-barr virus gh/gl is bifunctional, orchestrating infection of b cells and epithelial cells regulation of membrane-shape transitions induced by i-bar domains structural mechanism of wasp activation by the enterohaemorrhagic e. coli effector espf(u) pdz domains as drug targets angiotensin converting enzyme (ace) and ace2 bind integrins and ace2 regulates integrin signalling abelson kinase inhibitors are potent inhibitors of severe acute respiratory syndrome coronavirus and middle east respiratory syndrome coronavirus fusion the interaction between nidovirales and autophagy components coronavirus nsp6 restricts autophagosome expansion the spike glycoprotein of the new coronavirus 2019-ncov contains a furin-like cleavage site absent in cov of the same clade how viruses hijack cell regulation structural basis for complex formation between human irsp53 and the translocated intimin receptor tir of enterohemorrhagic phase i study of bi 853520, an inhibitor of focal adhesion kinase, in patients with advanced or metastatic nonhematologic malignancies host factors in coronavirus replication sars and mers: recent insights into emerging coronaviruses autophagy, immunity, and microbial adaptations integrin {beta}3 phosphorylation dictates its complex with the shc phosphotyrosine-binding (ptb) domain receptor tyrosine kinase inhibitors block proliferation of tgev mainly through p38 mitogen-activated protein kinase pathways novel roles of focal adhesion kinase in cytoplasmic entry and replication of influenza a viruses abituzumab combined with cetuximab plus irinotecan versus cetuximab plus irinotecan alone for patients with kras wild-type metastatic colorectal cancer: the randomised phase i/ii poseidon trial the pfam protein families database in 2019 a 
structural portrait of the pdz domain family rapid evolution of functional complexity in a domain family the phosphotyrosine peptide binding specificity of nck1 and nck2 src homology 2 domains human coronavirus: host-pathogen interaction pre-emptive and therapeutic value of blocking bacterial attachment to the endothelial alphavbeta3 integrin with cilengitide in sepsis skp2 attenuates autophagy through beclin1-ubiquitination and its inhibition reduces mers-coronavirus infection structural basis for endosomal trafficking of diverse transmembrane cargos by px-ferm proteins cell regulation: determined to signal discrete cooperation experimental detection of short regulatory motifs in eukar yotic proteins: tips for good practice as well as for bad pdzk1: i. a major scaffolder in brush borders of proximal tubular cells systems biology the cell biology of receptor-mediated virus entry simvastatin alleviates airway inflammation and remodelling through up-regulation of autophagy in mouse models of asthma virion incorporation of integrin α4β7 facilitates hiv-1 infection and intestinal homing membrane remodeling in clathrin-mediated endocytosis kindlin-1 is a phosphoprotein involved in regulation of polarity, proliferation, and motility of epidermal keratinocytes tmprss2 and adam17 cleave ace2 differentially and only proteolysis by tmprss2 augments entry driven by the severe acute respiratory syndrome coronavirus spike protein sars-cov-2 cell entry depends on ace2 and tmprss2 and is blocked by a clinically proven protease inhibitor partial inhibition of integrin alpha(v)beta6 prevents pulmonary fibrosis without exacerbating inflammation. 
a m 15 years of phosphositeplus®: integrating post-translationally modified sites, disease variants and isoforms interaction of plasma membrane fibronectin receptor with talin--a transmembrane linkage fak and paxillin dynamics at focal adhesions in the protrusions of migrating cells defining the specificity space of the human src homology 2 domain sars coronavirus, but not human coronavirus nl63, utilizes cathepsin l to infect ace2-expressing cells differential effect on bone lesions of targeting integrins: randomized phase ii trial of abituzumab in patients with metastatic castration -resistant prostate cancer beyond rgd: virus interactions with integrins tmprss2 contributes to virus spread and immunopathology in the airways of murine models after coronavirus infection mechanisms of clathrin-mediated endocytosis loops govern sh2 domain specificity by controlling access to binding pockets a comprehensive evaluation of the activity and selectivity profile of ligands for rgd-binding the origin and evolution of mammals, oxford biology antiviral potential of erk/mapk and pi3k/akt/mtor signaling modulation for middle east respiratory syndrome coronavirus infection as identified by temporal kinome analysis basement membrane assembly of the integrin α8β1 ligand nephronectin requires fraser syndrome-associated proteins the kindler syndrome protein is regulated by transforming growth factor-beta and involved in integrin-mediated adhesion viruses and autophagy elm-the eukaryotic linear motif resource in 2020 crystal structure of the 2019-ncov spike receptor-binding domain bound with the ace2 receptor crystal structure of the a domain from the alpha subunit of integrin cr3 (cd11b/cd18) mechanisms that regulate adaptor binding to beta-integrin cytoplasmic tails lack of integrin alpha-chain endoproteolytic cleavage in furin-deficient human colon adenocarcinoma cells lovo angiotensin-converting enzyme 2 is a functional receptor for the sars coronavirus endoproteolytic processing 
of integrin pro-alpha subunits involves the redundant function of furin and proprotein convertase (pc) 5a, but not paired basic amino acid converting enzyme (pace) 4, pc5b or pc7 sh2 domains recognize contextual peptide sequence information to determine selectivity structural insight into the mechanisms of targeting and signaling of focal adhesion kinase structural mechanism of integrin inactivation by filamin the transmembrane domain of the prohormone convertase pc3: a key motif for targeting to the regulated secretory pathway the integrity of the rrgdl sequence of the proprotein convertase pc1 is critical for its zymogen and c-terminal processing and for its cellular trafficking fyn mediates high glucose-induced actin cytoskeleton reorganization of podocytes via promoting rock activation in vitro kindlin-2 (mig-2): a co-activator of beta3 integrins npxy motifs in the beta1 integrin cytoplasmic tail are required for functional reovirus entry involvement of autophagy in coronavirus replication conformationally active integrin endocytosis and traffic: why, where, when and how? 
structure function relations in pdz-domain-containing proteins: implications for protein networks in cellular signalling protease-mediated enhancement of severe acute respiratory syndrome coronavirus infection chembl: towards direct deposition of bioassay data the insert sequence in sars-cov-2 enhances spike protein cleavage by tmprss (preprint) ezrin interacts with the sars coronavirus spike protein and restrains infection at the entry stage structural analysis of the kgd sequence loop of barbourin, an alphaiibb eta3-specific disintegrin interpro in 2019: improving coverage, classification and access to protein sequence annotations src family tyrosine kinases associate with and phosphorylate ctla-4 (cd152) fak, talin and pipkiγ regulate endocytosed integrin activation to polarize focal adhesion assembly drosophila wash is required for integrin-mediated cell adhesion, cell motility and lysosomal neutralization fibulin-5/dance is essential for elastogenesis in vivo structural determinants of interaction of tyrosine-based sorting signals with the adaptor medium chains interaction of tyrosine-based sorting signals with clathrin-associated proteins allosteric n-wasp activation by an inter-sh3 domain linker in nck fyco1 contains a c-terminally extended, lc3a/b-preferring lc3-interacting region (lir) motif required for efficient maturation of autophagosomes during basal autophagy characterization of spike glycoprotein of sars-cov-2 on virus entry and its immune crossreactivity with sars-cov a structural explanation for the recognition of tyrosine-based endocytotic signals an integrin phosphorylation switch: the effect of beta3 integrin tail phosphorylation on dok1 and talin binding ucsf chimera--a visualization system for exploratory research and analysis microfibrillar-associated protein 4 modulates airway smooth muscle cell phenotype in experimental asthma expression of integrin cell adhesion receptors during human airway epithelial repair in vivo coronavirus 
replication complex formation utilizes components of cellular autophagy irsp53 senses negative membrane curvature and phase separates along membrane tubules use of peptide arrays for identification and characterization of lir motifs interaction of pregnancy-specific glycoprotein 1 with integrin α5β1 is a modulator of extravillous trophoblast functions identification of an interaction between the m-band protein skelemin and beta-integrin subunits. colocalization of a skelemin-like protein with beta1-and beta3-integrins in nonmuscle cells analysis of fyn function in hemostasis and alphaiibbeta3-integrin signaling coronaviruses hijack the lc3-i-positive edemosomes, er-derived vesicles exporting short-lived erad regulators, for replication endocytosis and the src family of non-receptor tyrosine kinases properties of fda-approved small molecule protein kinase inhibitors: a 2020 update the rgd motif and the c-terminal segment of proprotein convertase 1 are critical for its cellular trafficking but not for its intracellular binding to integrin alpha5beta1 properties of an ezrin mutant defective in f-actin binding the pathogen protein espf(u) hijacks actin polymerization using mimicry and multivalency mimicry of short linear motifs by bacterial pathogens: a drugging opportunity autophagic targeting of src promotes cancer cell survival following reduced fak signalling polydom/svep1 is a ligand for integrin α9β1 cell signaling in space and time: where proteins come together and when they're apart regulation of actin dynamics by pi(4,5)p2 in cell migration and endocytosis pregnancy-specific glycoproteins bind integrin αiibβ3 and inhibit the platelet-fibrinogen interaction integrin alpha11 is an osteolectin receptor and is required for the maintenance of adult skeletal bone mass saracatinib inhibits middle east respiratory syndrome-coronavirus replication in vitro tyrosine phosphorylation c ontrols internalization of ctla-4 by regulating its interaction with 
clathrin-associated adaptor complex ap-2 targeting host cell furin proprotein convertases as a therapeutic strategy against bacterial toxins and viral pathogens a transmembrane serine protease is linked to the severe acute respiratory syndrome coronavirus receptor and activates virus entry a potential role for integrins in host cell entry by sars-cov-2 inhibitors of cathepsin l prevent severe acute respiratory syndrome coronavirus entry characterization of severe acute respiratory syndrome-associated coronavirus (sars-cov) spike glycoprotein-mediated viral entry vascular expression of the alpha(v)beta(3)-integrin in lung and other organs coronavirus s protein-induced fusion is blocked prior to hemifusion by abl kinase inhibitors safety, tolerability, and risks associated with first-and second-generation antipsychotics: a state-of-the-art clinical review identification of the multivalent pdz protein pdzk1 as a binding partner of sodium-coupled monocarboxylate transporter smct1 (slc5a8) and smct2 (slc5a12) amiodarone alters late endosomes and inhibits sars coronavirus infection at a post-endosomal level determination of n-and c-terminal borders of the transmembrane domain of integrin subunits cell integrins: commonly used receptors for diverse viral pathogens cilengitide combined with standard treatment for patients with newly diagnosed glioblastoma with methylated mgmt promoter (centric eortc 26071-22072 study): a multicentre, randomised, open-label string v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets talin binding to integrin beta tails: a final common step in integrin activation src-family tyrosine kinase fyn phosphorylates phosphatidylinositol 3-kinase enhancer-activating akt, preventing its apoptotic cleavage and promoting cell survival integrins as therapeutic targets for respiratory diseases the sh2 domain interaction landscape mechanisms of integrin-mediated virus 
attachment and internalization process ezrin enrichment on curved membranes requires a specific conformation or interaction with a curvature-sensitive partner autophagy modulates cell migration and β1 integrin membrane recycling ß3 integrin modulates transforming growth factor beta induced (tgfbi) func tion and paclitaxel response in ovarian cancer cells development of a 5-plex silac method tuned for the quantitation of tyrosine phosphorylation dynamics proteomics. tissue-based map of the human proteome structural and evolutionary division of phosphotyrosine binding (ptb) domains the switches.elm resource: a compendium of conditional regulatory interaction interfaces motif switches: decision-making in cell regulation determination of host proteins composing the microenvironment of coronavirus replicase complexes by proximity-labeling the interconnections between autophagy and integrin-mediated cell adhesion structure, function, and antigenicity of the sars-cov-2 spike glycoprotein an interaction between insulin-like growth factor-binding protein 2 (igfbp2) and integrin alpha5 is essential for igfbp2-induced cell mobility mis-assembly of clathrin lattices on endosomes reveals a regulatory switch for coated pit formation jalview version 2--a multiple sequence alignment editor and analysis workbench enhancement of autophagy by simvastatin through inhibition of rac1-mtor signaling pathway in coronary arterial myocytes mutational analysis of the potential phosphorylation sites i n the cytoplasmic domain of integrin beta1a. 
requirement for threonines 788-789 in receptor activation cryo-em structure of the 2019-ncov spike in the prefusion conformation small molecules targeting severe acute respiratory syndrome human coronavirus crystal structure of the complete integrin alphavbeta3 ectodomain plus an alpha/beta transmembrane fragment structural basis for the recognition of sars-cov-2 by full-length human ace2 targeting the endocytic pathway and autophagy process as a novel therapeutic strategy in covid-19 the regulation of integrin function by divalent cations i-bar domain proteins: linking actin and plasma membrane dynamics clinical course and risk factors for mortality of adult inpatients with covid-19 in wuhan, china: a retrospective cohort study structure and ligand recognition of the phosphotyrosine binding domain of shc a pneumonia outbreak associated with a new coronavirus of probable bat origin single-cell rna-seq data analysis on the receptor ace2 expression reveals the potential risk of different human organs vulnerable to 2019-ncov infection

bm has received funding from the european union's horizon 2020 research and innovation programme under the marie skłodowska-curie grant agreement no. 842490 (mimic). jč is supported by the european union's horizon 2020 research and innovation programme under the marie skłodowska-curie grant agreement no. 675341 (pdznet). em-p is a phd student of conicet, argentina. ra is supported by bmbf-funded heidelberg center for human bioinformatics (hd-hub) within the german network for bioinformatics infrastructure (de.nbi #031a537b) and elixir germany. lbc is a national research council investigator (conicet, argentina). the work was supported by agencia nacional de promoción científica y tecnológica (pict 2017-1924) grant to lbc. this paper is part of a project that has received funding from the european union's horizon 2020 research and innovation programme under the marie skłodowska-curie grant agreement no.
778247 (idpfun) to lbc and tjg. idpfun also funded em-p's placement at embl.

key: cord-150183-zzzyewjb authors: phillips, j. c. title: synchronized attachment and the darwinian evolution of coronaviruses cov-1 and cov-2 date: 2020-08-27 journal: nan doi: nan sha: doc_id: 150183 cord_uid: zzzyewjb

cov2019 has evolved to be much more dangerous than cov2003. experiments suggest that structural rearrangements dramatically enhance cov2019 activity. we identify a new first stage of infection which precedes structural rearrangements by using biomolecular evolutionary theory to identify sequence differences enhancing viral attachment rates. we find a small cluster of mutations which show that cov-2 has a new feature that promotes much stronger viral attachment and enhances contagiousness. the extremely dangerous dynamics of human coronavirus infection is a dramatic example of the evolutionary approach of self-organized networks to criticality. it may favor a very successful vaccine. the identified mutations can be used to test the present theory experimentally.

the standard tool for comparing two or more proteins is blast, which compares different protein sequences site-by-site. it is not enough to profile ψ(aa) site by site for two proteins, as this produces hundreds of oscillations, with no clear differences between the two proteins. instead we plot ψ(aa,w), where w is a rectangular box of length w = 35 over which ψ(aa) is averaged. this w value is chosen to maximize the hydropathic shape differences between cov-1 and cov-2, as measured by their variance ratio, which is dominated by extrema [11]. smoothing the profile over w enhances resolution by reducing the number of extrema to the level which corresponds to critical domain motion differences. choosing the best value of w to enhance resolution of evolutionary differences is similar to adjusting the focal plane of a microscope. our main result is shown in fig. 1, which displays the 400-800 central region of the spikes.
this region is part of the n-terminal domain (s1) responsible for cellular attachment [14]. fig. 1 shows the ψ(aa,w) hydropathic profiles for cov-1 and cov-2, using the ψ(aa) values of mz [4]. while the two ψ(aa,35) profiles are similar, there are important differences in their extrema. the maxima are most hydrophobic and are located in the globular interior, while the minima are most hydrophilic and are located on the globular surface. the two cleavage sites s1/s2 and s2' [2] of cov-1 have moved lower (hydrophilically, further outside) in cov-2 (fig. 1), consistent with the very accurate mz scale. when a cleavage segment is further outside, there is more space for cleavage and reassembly, which will occur more rapidly. the insertion prra in cov-2 was identified [2] with blast (w = 1) as unique to cov-2, but with blast alone one cannot show that this change has made cov-2 more dangerous. a characteristic feature of second order phase transitions is that they break what physicists often call a "hidden" symmetry, here the key symmetry of extrema [15]. proteins are typically composed of domains of ~100 amino acids, and these domains can rotate from the resting state to the functional state, and then return reversibly to the resting state. this phase transition requires a balance between stability and flexibility [16, 17]. to maximize the reversibility and extend protein life one can synchronize this domain motion by leveling the pivotal hydrophobic (inside) extrema, forming level sets [10, 11, 18-21]. viruses must act rapidly before being destroyed by antibodies, and they could achieve synchronized motion differently, by leveling their hydrophilic (outside) extrema. as shown in fig. 1, such a leveling of minima 1-3 occurs in cov-2, while it is absent from cov-1. the change in minimum 2 is especially striking: it is caused by a cluster of four critical mutations from cov-1 to cov-2.
these are (cov-1 site numbering from uniprot p59594): 546 gln to leu; 556 and 561 ser to ala; and 568 ser to leu. the differences associated with each of these mutations are hydropathically large (~50-100 in the mz scale [4]; all 20 amino acids span a range from most hydrophilic to most hydrophobic of 170). how important is the fractal mz scale? in fig. 2 the mz panoramic profiles of cov-1 and cov-2 are compared across the entire 1255 amino acid spike. we see immediately a strong hydrophobic peak above 1200, which is responsible for anchoring the spike to the central cushion; it is almost unchanged from cov-1 to cov-2. the central hydrophilic level set, absent from cov-1 and present in cov-2, is our main result. there is an excellent review of the principles of self-organized criticality in living matter [3]. these can be compared to relaxation of homogeneous glass alloys, whose composition has been adjusted to bring the glass network to a critical point. there the glass (commercial gorilla glass) is strong and flexible, not brittle [21]. another broad and deep review includes fractals and their many implications for quantifying protein dynamic criticality [22]. in the genomic age abstract tools have many applications to analyzing the evolution of protein functions. some readers may be interested in the connections between hydropathic scaling theory of proteins and the more general synchronization of complex networks. the analogy is closest for networks that are scale-free because they are near a second-order phase transition. social networks contain many such examples, usually described by one or two fractals [23]. generally it is found that synchronizability is robust against random removal of nodes, but it is fragile to specific removal of the most highly connected nodes. in proteins the most highly connected nodes are hydrophobic extrema (deeply interior), where the number of van der waals contacts is largest.
in coronavirus spikes, it is the hydrophilic extrema that are most exposed to water and most likely to attach to protein targets. thus the symmetry of extrema is broken between hydrophobic (proteins) and hydrophilic (cov-2). one can also regard darwinian selection as a kind of percolation [24]. further extensions to multiphase hydrodynamic level sets require a special theory [15]. our primary focus here has been on enhanced attachment caused by the synchronization of cov ... these grouped sites are all close to the hydrophilic minimum 1 in fig. 1 (mz scale) and fig. 3 (kd scale). fragmental binding is a feature common to both scales, and thus not sensitive to whether the phase transition is first- or second-order. instead, by synchronizing minima 2 and 3 with minimum 1, cov-2 is able to greatly increase its attachment to ace2 (angiotensin-converting enzyme 2) and probably other receptors as well; synchronization is specifically sensitive to allosteric interactions and is definitely a second-order effect. in conclusion, the answer to the question posed in [1] is that, compared to cov-1, cov-2 has moved closer to its functional critical point [3]. cov-2 has an added synchronized feature in its central attachment region that contributes to its extreme contagiousness in an asymptomatic phase [26]. the attachment of the s1 region precedes cleavage [2], and it is likely that both mechanisms enhance the infectiveness of cov-2. an early estimate of the basic contagious reproductive number (r0) of coronavirus 2019 was 2.2-2.7, but this was later revised to 5.7 [27]. by comparison, the numbers for h1n1 flu (1918, 1.8; 2009, 1.5) were much smaller [28]. the uncontrolled r0 for cov-1 was 3.6, and this dropped to 0.7 after the rapid implementation of control measures [29]. the new feature explains the unexpected yet often asymptomatic behavior of early stages of cov-2 infection, when the new feature is acting alone.
it should be possible to desynchronize this added attachment feature and reduce r0 by attaching a small molecule in the region 530-575. in any case, the "hidden" symmetry of the cov-2 sequence should be of interest to both biologists and physicists. the effects of the four critical mutations on cov contagiousness could be tested on mice. if the predictions of their importance should prove to be correct, then the general principle of darwinian evolution of proteins would be confirmed. the level hydrophilic extrema of the spikes of cov-2 may be unique among viruses. these level extrema bring cov-2 contagion very close to a critical point. there, very small differences in dna can cause large fluctuations in infection levels, not only between individuals, but also even in neighboring countries [30]. the quasi-linear spike morphology, with its large surface/volume ratio, is ideal for strong protein-water interactions, explaining [1]. the oxford vaccine is based on the cov-2 spike protein hydrophobically anchored to an adenovirus vaccine vector [31-33]. it seems likely that this vaccine will be more effective for cov-2 than for cov-1 because the synchronized attachment mechanism is stronger for cov-2 and thus more easily disrupted by induced antibodies. thus vaccines based on the entire spike are expected to be more effective than vaccines based on only part of the spike [34]. the most dangerous type of flu virus is h3n2, and it does not exhibit level sets [35]. flu viruses mutate significantly annually, while cov-2 has shown only a few mutations [36]. the spike stability against mutations is consistent with maximized attachment. there are many examples of scale-free networks. readers unfamiliar with the general properties of scale-free networks, such as (often only 1 or 2) fractals, preferential attachment, and synchronization, may find early reviews interesting [37, 38].
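the windowed profile ψ(aa,w) discussed above, a per-residue hydropathy value averaged over a rectangular box of length w, can be illustrated with a minimal sketch. two assumptions are made here and should be read as such: the numerical mz values are not reproduced in this text, so the widely published kyte-doolittle (kd) scale is swapped in, and the function names `windowed_profile` and `local_minima` are hypothetical helpers, not the author's spreadsheet implementation.

```python
# Kyte-Doolittle hydropathy values per amino acid (one-letter code).
# ASSUMPTION: KD is used as a stand-in; the paper's main results use the MZ scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def windowed_profile(seq, w=35, scale=KD):
    """Average the scale over every full window of length w (hypothetical helper).

    Returns len(seq) - w + 1 values; entry i covers residues i .. i + w - 1.
    """
    if w < 1 or w > len(seq):
        raise ValueError("window must satisfy 1 <= w <= len(seq)")
    values = [scale[aa] for aa in seq.upper()]
    total = sum(values[:w])
    profile = [total / w]
    # Slide the box one residue at a time, updating the running sum.
    for i in range(w, len(values)):
        total += values[i] - values[i - w]
        profile.append(total / w)
    return profile


def local_minima(profile):
    """Indices of strict interior local minima of a profile (hypothetical helper)."""
    return [
        i
        for i in range(1, len(profile) - 1)
        if profile[i - 1] > profile[i] < profile[i + 1]
    ]
```

with a real spike sequence, `local_minima(windowed_profile(seq, w=35))` would list candidate hydrophilic extrema, whose profile values could then be compared to judge how level they are. results on the kd scale are not expected to match the mz-scale figures.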
the sequences used here for cov-1 and cov-2 are the ones used in [2] , that is, aap13441.1 and yp_009724390.1. many other cov sequences are aligned (w = 1) in their supp. table [2] . similar w = 1 sequence alignments have been reported by others [39, 40] , but the genetic origin of cov-2 remains mysterious [1, 40] . historically it has long been the case that the amount of information that could be obtained from hydropathic scales was limited, because the fractals describing second-order phase transitions were not known. now that we are in the genomic age, with a very large sequence data base available for many proteins and many species, the discovery of these 20 fractals [5, 13] opens a new biophysics field of accurate thermodynamic analysis of small but medically important evolutionary differences. the methods used here on cov-1 and cov-2 do not require elaborate calculation to obtain new results. the hydropathic scaling methods are easily implemented on a spread sheet, but are still not routine, and vary between proteins. the modular nature of protein domains responsible for protein assembly is well known [41, 42] . here we maximized small and otherwise mysterious evolutionary differences by using hydropathic scaling to optimize w at 35. this enables us to focus on domain-scale motion responsible for stronger attachment in cov-2. evidence for large-scale motion has been found in the evolution of many proteins [4] [5] [6] [7] 9] , including molecular motors [11] , while it is not accessible to traditional w = 1 blast-based alignment methods [2, [43] [44] [45] . from detailed genomic studies it has been suggested that "a more comprehensive theory of molecular evolution must be sought" [46] . readers wishing to explore these novel scaling methods can obtain copies of sample excel spread sheets from the author. many physicists have long suspected that proteins must function near a critical point very near thermodynamic equilibrium [3] . 
however, to implement this idea in practice with protein sequences, one needs to compress the existing vast array of three-dimensional structural data into a one-dimensional form. if you are one in a million, namely among the small community of coauthors of the 500+ references of [3], what comes next may not surprise you. as discussed above, this near-miracle compression has been achieved in the 21st century by two brazilian physicists [13], located far from the main stream. [13] described the discovery of 20 (not just one) universal amino-acid specific fractals in the solvent accessible surface areas (sasa) of proteins. this amazing 21st century discovery was achievable only for proteins. it has made possible the quantitative analysis of darwinian evolution of many different proteins [4-11], which brings us to the present application.
prra, inserted at 681 in cov-1 (the s1/s2 cleavage interface) [2], has an average mz hydropathicity of 108.5. this lowers ψ(aa,35) to 140.9 at minimum 3, aligning it with minima 1 and 2 (~140). the three minima span ~250 amino acid sites, which makes their water-driven synchronization for cov-2, but not cov-1, outside the range of most simulation or modeling methods.
why the coronavirus has been so successful; structure, function, and antigenicity of the sars-cov-2 spike glycoprotein; criticality and dynamical scaling in living systems; scaling and self-organized criticality in proteins: lysozyme c; fractals and self-organized criticality in proteins; fractals and self-organized criticality in anti-inflammatory drugs; similarity is not enough: tipping points of ebola zaire mortalities; evolution of the ubiquitin-activating enzyme uba1 (e1) phys. a 483; thermodynamic scaling of interfering hemoglobin strain field waves; quantitative molecular scaling theory of protein amino acid sequences, structure, and functionality; self-organized networks: darwinian evolution of dynein rings, stalks and stalk heads; a simple method for displaying the hydropathic character of a protein; amino acid hydrophobicity and accessible surface area; influence of hydrophobic and electrostatic residues on sars-coronavirus s2 protein stability: insights into mechanisms of general viral fusion and inhibitor design; estimates of the reproduction number for seasonal, pandemic, and zoonotic influenza: a systematic review of the literature; different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures; virus batters some areas, why does it spare others?; oxford covid-19 vaccine begins human trial stage; future prospects for the development of cost-effective adenovirus vaccines; atomic structure of human adenovirus by cryo-em reveals interactions among protein networks; subunit vaccines against emerging pathogenic human coronaviruses; prediction (early recognition) of emerging flu strain clusters; controlling the sars-cov-2 outbreak, insights from large scale whole genome sequences generated across the world; collective dynamics of 'small-world' networks; statistical mechanics of complex networks; the proximal origin of sars-cov-2; a genomic perspective on the origin and emergence of sars-cov-2; assembly of cell regulatory systems through protein interaction domains; arrangements in the modular evolution of proteins; immunoinformatics-aided identification of t cell and b cell epitopes in the surface glycoprotein of 2019-ncov; novel antibody epitopes dominate the antigenicity of spike glycoprotein in sars-cov-2 compared to sars-cov; phylogenetic analysis and structural modeling of sars-cov-2 spike protein reveals an evolutionary distinct and proteolytically sensitive activation loop; the neutral theory in light of natural selection
with the mz scale, panoramic spike profiles of cov-1 and cov-2 reveal a set of hydrophilic level extrema in cov-2, but not in cov-1. the new level set was identified with a ... even on the panoramic picture, it is clear that with the kd scale the three minima that were aligned in cov-2 are not level. this is shown in more detail in fig ...
key: cord-259412-l8uta7du authors: mattossovich, rosanna; merlo, rosa; miggiano, riccardo; valenti, anna; perugino, giuseppe title: o(6)-alkylguanine-dna alkyltransferases in microbes living on the edge: from stability to applicability date: 2020-04-20 journal: int j mol sci doi: 10.3390/ijms21082878 sha: doc_id: 259412 cord_uid: l8uta7du the genome of living cells is continuously exposed to endogenous and exogenous attacks, and this is particularly amplified at high temperatures. alkylating agents cause dna damage, leading to mutations and cell death; for this reason, they also play a central role in chemotherapy treatments. a class of enzymes known as agts (alkylguanine-dna-alkyltransferases) protects the dna from mutations caused by alkylating agents, in particular in the recognition and repair of alkylated guanines in o(6)-position. the peculiar irreversible self-alkylation reaction of these enzymes triggered numerous studies, especially on the human homologue, in order to identify effective inhibitors in the fight against cancer. in modern biotechnology, engineered variants of agts are developed to be used as protein tags for the attachment of chemical ligands. in the last decade, research on agts from (hyper)thermophilic sources proved useful as a model system to clarify numerous phenomena, also common for mesophilic enzymes. this review traces recent progress in this class of thermozymes, emphasizing their usefulness in basic research and their consequent advantages for in vivo and in vitro biotechnological applications.
monofunctional alkylating agents, a class of mutagenic and carcinogenic agents present in the environment, induce dna alkylation in several positions including the o6 of guanine (o6-mg; 6% of adducts formed), the n7 of guanine (n7-mg; 70%), and the n3 of adenine (n3-ma; 9%) [1]. alkylation of guanine (o6-ag) is a cytotoxic lesion, although the specific mechanism of this cytotoxicity is not yet fully understood. it was proposed that the toxic effect occurs after dna replication, because the o6-ag incorrectly base-pairs with thymine, generating a transition from g:c to a:t [2]. the mutations caused by o6-mg that occur at the time of replication are recognized by the post-replication mismatch repair system with potential harmful implications for cell viability. apart from conventional dna repair pathways such as mismatch excision repair (mmr), nucleotide excision repair (ner), and base excision repair (ber), alkylated-dna protein alkyl-transferases (called o6-alkyl-guanine-dna-alkyl-transferase (agt or ogt) or o6-methyl-guanine-dna-alkyl-transferase (mgmt); ec: 2.1.1.63) perform the direct repair of alkylation damage in dna [3, 4]. they represent the major factor in counteracting the effects of alkylating agents that form such adducts [4]. these are small enzymes (17-22 kda) that are widely present in organisms of the three kingdoms (bacteria, archaea, eukaryotes) but apparently absent from plants, schizosaccharomyces pombe, thermus thermophilus, and deinococcus radiodurans. the reaction mechanism of agts is based on the recognition of the damaged nucleobase on dna [5], followed by a one-step sn2-like mechanism, in which the alkyl group of the damaged guanine is irreversibly transferred to a cysteine residue in its active site [5-8] (figure 1, blue path).
figure 1 (caption): in the c-terminal domain, a helix-turn-helix motif (in light green, or in orange in the snap-tag) is responsible for the dna-binding activity. the peculiar irreversible reaction mechanism of these enzymes plays a pivotal role in the physiological dna repair (blue path), and it has important repercussions in cancer cell treatment (red path) and biotechnological applications (green path). atoms are coloured by the cpk colour convention.
for these reasons, they are also called suicide or kamikaze proteins, showing a 1:1 stoichiometry of their reaction with the natural substrate. the disadvantage of this elegant catalysis is that, upon alkylation, the protein is self-inactivated and destabilized, triggering its recognition by cellular systems to be degraded by the proteasome [8, 9]. alkylation damage to dna occurs in various living conditions, and for this reason the widespread presence of agt protects cells from killing by alkylating agents. however, human agt (hmgmt) is a double-edged sword: on the one hand, it protects healthy cells from these genotoxic and carcinogenic effects, but on the other it counteracts alkylating agent-based chemotherapy by protecting cancer cells from the killing effect of these drugs [10, 11]. consequently, hmgmt has emerged as a crucial factor in anticancer therapies [12]: an inverse relationship has been discovered between the presence of hmgmt and the sensitivity of cells to the cytotoxic effects of alkylating agents, such as temozolomide (tmz), in different types of cancer cells, including prostate, breast, colon, and lung cancer cells [13]. the resistance to chemotherapy may be reduced by inhibition of these enzymes; as described before, after removing the lesion, the alkylated form of the protein is inactivated and enters the intracellular degradation pathways. hence, in order to counteract the action of hmgmt in chemotherapy regimens, a large number of studies aimed to develop hmgmt inhibitors to be used in combination with alkylating agents.
in view of this therapeutic relevance, much success has been obtained through the design of hmgmt pseudo-substrates, namely, o6-benzylguanine (o6-bg) and the strong inactivator o6-[4-bromothenyl]-guanine (o6-btg, lomeguatrib) [13, 14]. these compounds mimic damaged guanine on dna and react with the protein by the covalent transfer of the alkyl adduct to the active site cysteine residue, thus irreversibly inactivating the enzyme (figure 1, red path). therapeutically, o6-bg is not toxic on its own, but renders cancer cells 2 to 14 times more sensitive to the effects of alkylating agents. oligonucleotides containing several o6-bg residues are potent inhibitors and represent a valid alternative to the use of free modified guanines, thereby improving the activity of the alkylating chemotherapy drug in the treatment of some tumours [15-17]. the specific labelling of proteins with synthetic probes is an important advance for the study of protein function. to achieve this, the protein of interest is expressed as a fusion with additional genetically encoded polypeptides, called tags, which mediate the labelling. the first example of an autofluorescent tag was the aequorea victoria green fluorescent protein (gfp), allowing the in vivo localization of fusion proteins in cellular and molecular biology [18, 19]. among affinity tags, of particular importance are the poly(his)-tag, the chitin-binding protein, the maltose-binding protein [20], the strep-tag [21], and the glutathione-s-transferase (gst-tag) [22], which allow fast and specific purification of proteins of interest from their crude biological source using affinity techniques. solubilization tags are especially useful to assist the proper folding of recombinant proteins expressed in chaperone-deficient species such as escherichia coli, avoiding protein precipitation and the use of alternative expression protocols [23, 24]; these include thioredoxin [25] and poly(nanp).
however, all the tags listed above are limited by the fact that each of them can be used for one or only a few applications. the need therefore emerged to develop a universal tag that could cover a wide range of applications. in 2003, the group headed by kai johnsson pioneered the use of an engineered hmgmt variant as a fusion protein for in vitro and in vivo biotechnology applications, which then led to its commercialization as the snap-tag (new england biolabs) [26-29]. they started from the knowledge that hmgmt tolerates the presence of groups conjugated to the pseudo-substrate o6-bg (o6-bg derivatives); the unusual covalent bond with the benzyl moiety can therefore be exploited for "biotech" purposes (figure 1, green path). thanks to its small size, the engineered hmgmt (snap-tag) can be fused with other proteins of interest. the expression of the fusion protein inside the cells followed by incubation with suitable fluorescent derivatives leads to in vivo labelling of fusion proteins with the probe, which can be used for localization studies [26]. the same principle has also been used for the immobilization of tagged fusion proteins in vitro [30]. this offers mild conditions for fixing a wide range of proteins/enzymes on a surface in a better orientation. the snap-tag technology was successfully applied to surface plasmon resonance (spr) for the covalent immobilization of proteins of interest [31]. another interesting application of this protein-tag is the possibility to produce new antibody fragments (scfv-snap) to be employed in the spr analysis [32]. despite the need to use a specific substrate, snap-tag offers many applications: the possibility to covalently link a desired chemical group (conjugated to the o6-bg) to a protein of interest (genetically fused to it) makes it decidedly advantageous compared to traditional protein tags currently in use.
table 1 shows a brief comparison between some examples of protein tags and the snap-tag in several application fields. as for organisms living under mesophilic conditions, environmental and endogenous alkylating agents also attack the genome of thermophilic and hyper-thermophilic organisms. additionally, high temperatures accelerate the process of alkylation, leading to dna breaks [33], because alkylating agents are chemically unstable at the physiological conditions of these organisms, and their collateral decomposition may worsen the formation of dna alkylation products [34]. the presence of agts and methylpurine glycosylases in hyperthermophilic organisms implies that they are naturally exposed to endogenous methylating agents [34], supporting the crucial role of agts [35, 36]. apart from some studies on archaea using cell-free extracts, the few examples of biochemical studies of agts from thermophilic sources include the enzyme from pyrococcus sp. kod1 [35], studied by imanaka and co-workers, and those from aquifex aeolicus and archaeoglobus fulgidus, studied by the group of prof. pegg in 2003 (figure 2) [34]. intriguingly, the agt of a. aeolicus, an organism identified as the most primitive bacterium, is closer to the mammalian agts than other bacterial homologues in terms of o6-bg sensitivity [34]. despite the different primary structures (figure 2a), thermophilic enzymes show a typical agt protein architecture, consisting of two domains [37]: a highly conserved c-terminal domain (ctd), surprisingly superimposable for all available agt structures (figure 2b), and an n-terminal domain (ntd), which is very different among agts and whose function is not well understood (likely involved in regulation, cooperative binding, and stability [6, 38, 39]).
the ctd contains the dna binding helix-turn-helix motif (hth); the asn hinge, which precedes the -v/ipchrvv/i- amino acid sequence containing the conserved catalytic cysteine (except the caenorhabditis elegans agt-2, which has the -pchp- sequence [40, 41]); and the active site loop, responsible for the substrate specificity. a comparative structural analysis performed on agt proteins whose structures are in the protein data bank revealed significant differences in the intrinsic structural features that have been considered to be relevant for thermostability, such as helix capping, intramolecular contacts (hydrogen bonds, ion-pairs), and solvent-accessible surface areas. helix capping plays a central role in the stability of α-helices, due to the lack of intra-helical hydrogen bonds in the first and last turn [42, 43], and its effect results in an overall structural stabilization of protein folding [44]. by inspecting the crystal structure of ssogt (pdb id: 4zye), considered here as the thermophilic reference agt protein, we verified that the five α-helices composing the protein tertiary structure are characterized by the presence of helix capping, possibly increasing the thermal stability. in particular, the helix h1 in the ntd is stabilised by a peculiar double serine sequence (s40-s41) and by a glutamic acid (e54) at its c-terminal end; the latter is strictly conserved in all agts from thermophilic organisms (see figure 2a). the hth motif, built on helices h3 and h4, is stabilized at the level of h3 by a highly conserved threonine residue (t89) as n-cap and a serine (s96), distinctive of ssogt, as c-cap. furthermore, helix h4 contains two serine-based caps, of which the one placed at the n-terminal end (s100) is strictly conserved in all thermophilic agts and is followed by a proline (p101) that fits well in the first turn of the helix thanks to its own backbone conformation.
finally, the helix h5 is protected by glutamic acid capping that is present in all the agts from different species. another feature contributing to thermal stability is the solvent-accessible surface area (sasa). indeed, the decrease of sasa and the increase of hydrophobic residues that are buried from the solvent have stabilizing principles for thermostable protein [45] . as described in table 2 , ssogt shows the smaller total sasa value, in line with its exceptional stability. on the contrary, ogt from mycobacterium tuberculosis [38, 46] has a higher value due to the peculiar conformation of both the active site loop and the c-terminal tail that are exposed to the bulk solvent and are less heat stable (table 2) . finally, by comparing hyperthermophilic agts with the orthologs from mesophilic organisms, in terms of atomic contacts between charged residues as well as intramolecular hydrogen bonds (table 2 ), significant differences emerged in the number of charged residues contacts. as expected for thermostable proteins [47] , ssogt, as well as the proteins from sulfurisphaera tokodaii and pyrococcus kodakaraensis, shows a larger number of electrostatic contacts, characterized by higher bond-dissociation energy, with respect to hydrogen bonds for which we did not detect significant differences among the analysed structures, apart from mgmt of p. kodakaraensis (pk-mgmt) [48] . although the number of h-bonds is approximately similar across the agts from different organisms, there should be differences in the position-related role of such bonds, supporting overall stability of thermophilic variants. with reference to pk-mgmt, hashimoto and co-workers detected the same number of ion-pairs between the extremophilic protein and e. coli ada-c [49] ; however, more intra-and inter-helix ion pairs were found in pk-mgmt. 
although no correlation between ion-pair position and stabilization exists in ada-c, in pk-mgmt the intra-helix ion pairs act on the secondary structure, stabilizing helices, and the inter-helix ion pairs consolidate the inter-domain interactions, enhancing the stability of the tertiary structure packing. in the last decade, ssogt has been characterized through detailed physiological, biochemical, and structural analysis. due to its intrinsic stability, the ssogt protein has proven to be an outstanding model for clarifying the relationships between function and structural characteristics. saccharolobus solfataricus (previously known as sulfolobus solfataricus) is a microorganism first isolated in 1980 in the solfatara volcano (pisciarelli-naples, italy) [51], which thrives in volcanic hot springs at 80 °c and in a ph range of 2.0-4.0. in order to protect its genome in these harsh conditions, s. solfataricus evolved several efficient protection and repair systems [33, 52]. s. solfataricus is highly sensitive to the alkylating agent methyl methane sulfonate (mms), showing a transient growth arrest when treated with mms concentrations in the range of > 0.25 mm to 0.7 mm [33, 52]. interestingly, although the ogt rna level increases after mms treatment, the relative enzyme concentration decreases, suggesting its degradation in cells in response to the alkylating agent and, in general, to cellular stress [52]. under these treatment conditions, however, the protein level rises after a few hours, and, in parallel, the growth of saccharolobus starts again [52], indicating a role of ssogt in the efficient repair of alkylation damage to dna. various assays to measure agt activity are reported in the literature. the first methods were based on the use of oligonucleotides carrying radioactive (3h or 14c) o6-alkylguanine groups.
proteinase k digestion was then carried out to measure the levels of labelled s-methyl-cysteine in the lysate in an automatic amino acid analyser [53]. a very similar, but simpler and faster radioactive assay was used in another procedure with a 32p-terminal labelled oligonucleotide containing a modified guanine in a methylation-sensitive restriction enzyme sequence (such as mboi). the agt dna repair activity thereby allowed the restriction enzyme to cut [54]. this procedure was also used by ciaramella's group to identify for the first time the activity of ssogt [52]. this test has the advantage of analysing the digested fragment directly by electrophoresis on a polyacrylamide gel [55]. it was subsequently improved in terms of precision by separating the digested oligonucleotides by hplc. the chromatographic separation allowed the calculation of the concentration of active agt after measuring the radioactivity of the peak corresponding to the digested fragment [55]. similarly, moschel's group developed the analysis of hmgmt reaction products based on hplc separation in 2002. this test investigated the degree of inhibition by oligonucleotides carrying o6-mg or o6-bg in different positions, varying from the 3' to the 5' end, and whether they could be used as chemotherapy agents. ic50 values were obtained by quantifying the remaining active protein after the radioactive dna reaction [56]. although these assays measure the protein activity, the use of radioactive materials and chromatographic separations made them long, tedious, and unsafe. an alternative approach was proposed in 2010 by the group of carme fàbrega, who set up an assay based on the thrombin dna aptamer (tba), a single-stranded 15-mer dna oligonucleotide identified via systematic evolution of ligands by exponential enrichment (selex), which in its quadruplex form binds thrombin protease with high specificity and affinity [57].
in this assay, a fluorophore and a quencher are attached to the tba. the quadruplex structure of this oligonucleotide is compromised if a central o6-mg is present, preventing the two probes from coming into proximity. agt repair activity on the oligonucleotide allows the quadruplex to fold, so that förster resonance energy transfer (fret) takes place, resulting in a decrease of the fluorescence intensity [58]. recently, the introduction of fluorescent derivatives of o6-bg (such as snap vista green, new england biolabs) made possible the development of a novel dna alkyl-transferase assay. because agt covalently binds the benzyl-fluorescein moiety of its substrate after reaction, it is possible to load the protein product immediately on an sds-page gel; gel-imaging analysis of the fluorescence intensity gives a direct measure of the protein activity because of the 1:1 stoichiometry of protein/substrate (figure 3). signals of fluorescent protein (corrected for the amount of loaded protein, as determined by coomassie staining) obtained at different times are plotted, and a second-order reaction rate is determined [38, 39, 46, 52, 59, 60]. this method can be applied to all agts that bind o6-bg, with the exception of the e. coli ada-c [61, 62].

innovative fluorescent agt assay. the substrate could be used alone for the determination of the agt catalytic activity, or in combination with a competitive non-fluorescent substrate (alkylated dna). in the latter case, an indirect measure of the dna repair activity on natural substrates is determined (adapted from [63]).

furthermore, an alkylated double-strand dna (dsdna) oligonucleotide can be included in a competition assay with the fluorescein substrate. this non-fluorescent substrate lowers the final fluorescent signal on gel imaging analysis, depending on its concentration. 
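the two read-outs of the fluorescent assay described above (a second-order rate from the time course, and an ic50 from the competition with alkylated dsdna) can be sketched numerically. the following is only a minimal illustration under stated assumptions, not the procedure used in the cited works: it assumes pseudo-first-order conditions (fluorescent substrate in large excess, so that the labelled fraction follows 1 - exp(-k_obs t) and k2 = k_obs / [substrate]), and it estimates the ic50 by simple log-linear interpolation. all function and parameter names are hypothetical.

```python
import math


def second_order_rate(times_s, fluorescence, coomassie, plateau, substrate_molar):
    """Estimate a second-order rate constant (M^-1 s^-1) from gel band intensities.

    Assumption (not from the source): pseudo-first-order conditions, i.e.
    labelled fraction f(t) = 1 - exp(-k_obs * t) with k_obs = k2 * [substrate].
    `plateau` is the loading-corrected signal of fully labelled protein.
    """
    num = 0.0
    den = 0.0
    for t, fluo, coom in zip(times_s, fluorescence, coomassie):
        # Correct the fluorescent band for the amount of protein loaded.
        fraction = (fluo / coom) / plateau
        if t > 0 and 0.0 <= fraction < 1.0:
            y = -math.log(1.0 - fraction)  # linearised progress curve: y = k_obs * t
            num += t * y
            den += t * t
    if den == 0.0:
        raise ValueError("no usable time points")
    k_obs = num / den  # least-squares slope of y versus t through the origin
    return k_obs / substrate_molar


def ic50_by_interpolation(competitor_molar, signals, control_signal):
    """Competitor concentration that halves the uninhibited fluorescent signal.

    Uses log-linear interpolation between the two points bracketing 50 %;
    all competitor concentrations must be greater than zero.
    """
    half = 0.5 * control_signal
    pairs = sorted(zip(competitor_molar, signals))
    for (c_lo, s_lo), (c_hi, s_hi) in zip(pairs, pairs[1:]):
        if s_lo >= half >= s_hi and s_lo != s_hi:
            frac = (s_lo - half) / (s_lo - s_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("signal never crosses 50 % of the control")
```

in practice a non-linear fit of the full progress curve or dose-response curve would usually be preferred; the linearised versions are used here only to keep the sketch dependency-free.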
with this competition assay, it is possible to measure the activity of agts on their natural substrate, giving an indirect measure of methylation repair efficiency (figure 3) [38, 39, 46, 52, 59, 60]. using this methodology, it was even possible to discriminate the ssogt activity according to the position of the o6-mg on dna (see below; [39]), in line with previous data on hmgmt [64]. the recombinant ssogt protein, heterologously expressed in e. coli, has been fully characterized using the fluorescent assay described in section 3.1, and some results are compiled in table 3. in agreement with its origin, the protein showed optimal catalytic activity at 80 °c, although it retained residual activity at lower temperatures (table 3), and in a ph range between 5.0 and 8.0. like most thermophilic enzymes, ssogt is resistant over a wide range of reaction conditions, such as ionic strength, organic solvents, common denaturing agents, and proteases [52, 59]. interestingly, chelating agents do not affect the activity of this enzyme. crystallographic data clarified this observation, as the archaeal enzyme lacks a zinc ion in the structure [39], whereas this ion is important for correct folding of hmgmt [6]. all catalytic steps of agt activity (alkylated dna recognition, dna repair, irreversible trans-alkylation of the catalytic cysteine, and recognition and degradation of the alkylated protein) have been structurally characterized. most information comes from the classic studies on hmgmt, as well as on ada-c and ogt from escherichia coli [5-8, 49]. other agt structures are also available in the protein data bank (figure 2a). as shown in figure 1, all agts are inactivated after the reaction and degraded via the proteasome, whereas in higher organisms the degradation is preceded by protein ubiquitination [9]. 
it is a common view that the recognition of alkylated agts is due to a conformational change; however, data on the structure and properties of alkylated agts are limited because alkylation greatly destabilizes their folding [39]. the 3d structures of methylated and benzylated hmgmt were only obtained from flash-frozen crystals, showing that alkylation of the catalytic cysteine (c145) induces subtle conformational changes [6, 7, 65]. consequently, these structures might not reflect the physiological conformation of the alkylated hmgmt [39]. concerning the interactions with dna, ssogt binds methylated oligonucleotides. however, the repair activity depends on the position of the alkyl group [39]. to efficiently repair the alkylated base on the dna double helix, the protein requires at least three bases from either the 5' or the 3' end. this is due to the necessary interactions formed with the double helix. structural analysis confirmed these data [39]. to overcome the serious limitations in obtaining structural data from mesophilic agts after reaction, studies have moved to thermostable homologues, based also on the knowledge that all agts share a common ctd domain structure (figure 2b). in contrast to the human counterpart, alkylated ssogt was soluble and relatively stable, thus allowing in-depth analysis of the protein in its post-reaction form [39]. structural and biochemical analysis of the archaeal ogt, including after reaction with a bulkier adduct in the active site (benzyl-fluorescein; [66]), suggested a possible mechanism of alkylation-induced ssogt unfolding and degradation (figure 4). 
on the basis of their data, perugino and co-workers suggested a general model for the mechanism of post-reaction agt destabilization: the so-called active-site loop moves towards the bulk solvent as a result of the covalent binding of the alkyl adduct to the catalytic cysteine, and the extent of the loop movement and dynamics correlates with the steric hindrance of the adduct [39, 66] (figure 4). the destabilization of this protein region then triggers the recognition of the alkylated protein by the degradation pathway. as described in section 1.2, the introduction of the snap-tag technology enabled a wide variety of in vivo and in vitro labelling for biological studies by fusing any protein of interest (poi) to this protein tag [67]. however, because snap-tag originates from hmgmt, its extension to extremophilic organisms and/or harsh reaction conditions is seriously limited.

[39] ), or with snap vista green substrate (in green; [66] ).

following the same approach used for hmgmt by kai johnsson's group [26-30], an engineered version of ssogt was produced [52, 59]. this protein, called ssogt-h5, contains five mutations in the helix-turn-helix domain, abolishing any dna-binding activity [52]. in addition, a sixth mutation was made in the active site loop, where the serine residue at position 132 was replaced by a glutamic acid (s132e). this modification increased the catalytic activity of ssogt [52, 59], as was observed in the engineered version of hmgmt during the snap-tag development [26]. ssogt-h5 shows slightly lower heat stability than the wild-type protein (table 3), whereas the resistance to other denaturing agents is maintained. moreover, ssogt-h5 is characterised by a surprisingly high catalytic activity at lower temperatures, with a reaction rate comparable to that at physiological temperatures (table 3) [52, 59]. these characteristics make this mutant a potential alternative to snap-tag for in vivo and in vitro biotechnological applications. 
the stability against thermal denaturation allowed miggiano and co-workers to obtain the structure of the protein after the reaction with the fluorescent substrate snap-vista green, revealing the peculiar destabilization of the active site loop after the alkylation of the active cysteine [66]. the saccharolobus ogt mutant was first tested as a protein tag fused to two thermostable s. solfataricus proteins heterologously expressed in e. coli. the chimeric proteins were correctly folded, and the tag did not interfere with the enzymatic activity of the tetrameric s. solfataricus β-glycosidase (ssβgly) [59], nor with that of the hyperthermophile-specific dna topoisomerase reverse gyrase [68-72]. furthermore, the stability of h5 made it possible to heat-treat the cell-free extract to remove most of the e. coli proteins and to perform the β-glycosidase assay at high temperatures without the need to remove the tag [60]. as the applicability of the thermostable tag under in vivo conditions is very important, ssogt-h5 was also expressed in thermophilic organisms. the fluorescent agt assay allows the detection of ssogt-h5 both in living cells and in vitro in cell-free extracts [59, 72]. to assay the activity of ssogt-h5, it was necessary to choose models in which the endogenous agt activity is suppressed. thermus thermophilus is an ogt-deficient species, showing only one agt homologue (ttha1564), whose annotation corresponds to an alkyltransferase-like protein (atl) [73]. atls are a class of proteins present in prokaryotes and lower eukaryotes [74], presenting amino acid motifs similar to those of the agt ctd, in which a tryptophan residue replaces the cysteine in the active site [75]. like agts, atls use a helix-turn-helix motif to bind the minor groove of the dna, but they do not repair it, as they only recruit and interact with proteins involved in the nucleotide excision repair system [76, 77]. whereas t. 
thermophilus is a natural ogt knockout organism, sulfolobus islandicus possesses an ogt gene very similar to that of s. solfataricus; this gene was silenced by a clustered regularly interspaced short palindromic repeats (crispr)-based technique, and the resulting strain was used as a host organism [72]. the fluorescent signal obtained by sds-page gel imaging revealed that ssogt-h5 is not only efficiently expressed in these thermophilic microorganisms but also correctly folded and active, demonstrating that ssogt-h5 might be used as an in vivo protein tag at high temperatures [59, 72]. as is the case with snap-tag in human cells, the use of ssogt-h5 with different fluorescent substrates gives the opportunity to perform a multi-colour fluorescence study (see table 1), by following a poi inside living "thermo cells" at different stages and localizations. as most biotechnological processes require harsh operational conditions, the immobilization of very robust enzymes on solid supports is often essential [78]. by definition, an immobilized enzyme is a "physically confined biocatalyst, which retains its catalytic activity and can be used repeatedly" [79]. protein immobilisation offers several advantages, such as the recovery and reuse of the catalysts, as well as the physical separation of the enzymes from the reaction mixture. currently, different immobilisation strategies are available, from physical adsorption to covalent coupling [80-83]. however, all these procedures require purified biocatalysts and suffer from problems related to steric hindrance between the catalyst, the substrate, and the solid support, increasing the costs and time of the production processes. the introduction of "cell-based" immobilisation systems resulted in a significant improvement, reducing both the time and the costs of the process. 
one of the most widely used display strategies is the simultaneous heterologous expression of enzymes and their in vivo immobilisation on the external surface of gram-negative bacterial cells, by means of the ice nucleation protein (inp) from pseudomonas syringae [84, 85]. most recently, the n-terminal domain of inp (inpn) was used to produce a novel anchoring and self-labelling protein tag (hereinafter asl tag). the asl tag consists of two moieties, the inpn and the engineered ssogt-h5 mutant (figure 5) [86].

the novel anchoring and self-labelling protein tag (asl tag) system. a protein of interest (poi) is genetically fused to the tag, which in turn anchors it in the outer membrane and makes it accessible for the covalent linkage to a desired chemical group (magenta sphere) by the activity of ssogt-h5 (adapted from [87]).

the inpn allows the in vivo immobilisation of enzymes of interest on the e. coli outer membrane and their exposure to the solvent. this significantly reduces the costs of purification and immobilization and also overcomes problems of enzyme recovery, which can be achieved by simple filtration or centrifugation [88]. ssogt-h5, in turn, gives the unique opportunity to label immobilized enzymes with any desired chemical group (suitably conjugated to the benzyl-guanine; in magenta in figure 5) [27, 59], dramatically expanding the biotechnological applications of this new tool. depending on the chemical group of choice, it may be possible to modulate the activity of enzymes fused with the asl tag by introducing activator or inhibitor molecules (figure 5). the asl tag system was successfully employed for the expression and immobilization of monomeric biocatalysts, such as the thermostable carbonic anhydrase from saccharolobus solfataricus (sspca), as well as the tetrameric ssβgly, without affecting their folding and catalytic activity [86]. 
moreover, sspca fused to the asl tag showed an increase in residual activity of up to 30 % for a period of 10 days at 70 °c [87], a considerable advantage for reactions in bioreactors and for the reuse of biocatalysts. to extend the snap-tag technology to hyperthermophilic microorganisms for in vivo studies, an o6-alkylguanine-dna alkyltransferase has been recently characterized from the archaeon pyrococcus furiosus [89]. this extremophilic microorganism was originally isolated from hot marine sediments in vulcano island (italy) [90], with an optimum growth temperature around 100 °c, thus thriving under extremely harsh conditions. like those of other thermophilic archaea, its enzymes are extremely thermostable and can be used in various biotechnological applications. for example, dna polymerase i, also known as pfu dna polymerase, is one of the most famous and frequently used enzymes from p. furiosus because of its high activity, thermostability, and strong 3'-5' proof-reading activity [91]. the first demonstration of an ogt activity in p. furiosus was in 1998, when margison and co-workers identified a protein of 22 kda, whose catalytic activity was abolished by the o6-bg pseudo-substrate. the pf1878 orf encodes a protein of 20.1 kda. from its primary structure, the encoded polypeptide appears to be closely related to the mgmt from pyrococcus kodakarensis kod1 (pk-mgmt) [48, 89, 92]. the extreme thermostability was confirmed by in vitro biochemical studies on the heterologously expressed and characterized ogt protein from p. furiosus (pfuogt). this enzyme was active on bg-fluorescent substrates, thus allowing the competitive assay with methylated dsdna. however, the experiments were performed at 65 °c instead of the standard 50 °c described for ssogt [39, 59, 60], due to the strong thermophilicity of this enzyme. 
this behaviour was confirmed by differential scanning fluorimetry analysis, in which the melting temperature (tm) of pfuogt was found to be 80 °c, much higher than that of ssogt (68 °c) [89]. it is worth noting that, in order to obtain a sigmoidal melting curve for pfuogt, a slower heating rate (10 min/°c × cycle) was set up, whereas the tm measurement is usually performed at 1 min/°c × cycle [93]. thermotoga neapolitana is a hyperthermophilic gram-negative bacterium of the order thermotogales [94-96], whose members are excellent models for genetic engineering and biotechnological applications [97-100]. the ctn1690 orf shows a clear homology to o6-alkylguanine-dna-alkyltransferases. sds-page gel imaging analysis of lyophilized t. neapolitana cells incubated with the agt fluorescent substrate showed a strong fluorescent band at a molecular weight close to that of ssogt. the observed molecular weight and, above all, the sensitivity to the o6-bg derivative, led to the cloning and heterologous expression of the thermotoga neapolitana ogt protein (tnogt) in e. coli [89]. this protein, like most agts, has a role in dna repair, as confirmed by the competitive fluorescent assay in the presence of methylated dsdna. as shown in figure 3, the ic50 value was similar to that obtained for ssogt. surprisingly, the enzyme from t. neapolitana exhibited a very high activity at low temperatures [89], similar to that of the mutant ssogt-h5 (table 3) [52, 59]. superimposition analysis between a tnogt 3d model and the free form of ssogt (pdb id: 4zye) revealed in both structures the presence of a serine residue in the active site loop (s132 in ssogt, see figure 2a), which was replaced in ssogt-h5 by a glutamic acid to improve its activity at lower temperatures. interestingly, some residues that play an important role in stabilizing ssogt are missing in tnogt. 
in particular, the ionic interactions that play a crucial role in the stability of the saccharolobus enzyme at high temperatures, such as the pair r133-d27 [39] and the k-48 network [60], are largely replaced by hydrophobic residues in the thermotoga homolog. evidently, different residues and mechanisms of stabilization may contribute to its exceptional catalytic activity at moderate temperatures and to its high thermal stability. the important insights gained from this class of small proteins have led to novel biotechnological applications [101]. studies on thermophilic agts represent a unique opportunity for structural analysis and, in the case of the s. solfataricus protein, for the identification of conformational changes after the trans-alkylation reaction, which are hardly detectable with mesophilic agts, as their alkylated forms are rapidly destabilized [6]. these results could have a wide impact, especially in medical fields, for the design of novel hmgmt inhibitors to be used in cancer therapy [102]. furthermore, given their small size, thermophilic enzymes are very useful for studying general stabilization mechanisms at high temperatures (as for pk-mgmt and ssogt), which can then be applied to mesophilic enzymes. searching for alternative ssogt homologues was clearly useful, leading to the identification of agts that are more resistant to thermal denaturation (pfuogt) or that have a higher reaction rate at all tested temperatures (tnogt). concerning biotechnology, the use of a modified hmgmt as a protein tag opened the possibility of generalising this method: targeted mutagenesis of a thermostable ogt, following a rational approach, led to the characterization of ssogt-h5, applicable to harsh in vitro reaction conditions and to in vivo (hyper)thermophilic model organisms. 
on the other hand, by a non-rational approach (random mutagenesis) it is also possible to enhance the catalytic activity of these enzymes [103] or to modify their substrate specificity, making them active on benzyl-cytosine (o2-bc) derivatives, as was done for the production of the clip-tag [104]. this knowledge could be the starting point for developing a new engineered thermo-snap-tag to be employed in particular biotechnological fields, from in vivo studies in (hyper)thermophilic microorganisms (such as the in vivo crispr-cas immune system in p. furiosus [105, 106]) to industrial processes that require high temperatures or, in general, harsh reaction conditions.

targeted modulation of mgmt: clinical implications high-resolution structure of a mutagenic lesion in dna alkylation damage in dna and rna-repair mechanisms and medical significance multifaceted roles of alkyltransferase and related proteins in dna repair, dna damage, resistance to chemotherapy, and research tools the structure of the human agt protein bound to dna and its implications for damage detection active and alkylated human agt structures: a novel zinc site, inhibitor and extrahelical base binding dna binding and nucleotide flipping by the human dna repair protein agt dna binding, nucleotide flipping, and the helix-turn-helix motif in base repair by o 6 -alkylguanine-dna-alkyltransferase and its implications for cancer chemotherapy degradation of the alkylated form of the dna repair protein, o 6 -alkylguanine-dna alkyltransferase its role in cancer aetiology and cancer therapeutics exploiting the role of o 6 -methylguanine-dna-methyltransferase (mgmt) in cancer therapy effects of o 6 -methylguanine-dna methyltransferase (mgmt) polymorphisms on cancer: a meta-analysis dna repair in personalized brain cancer therapy with temozolomide and nitrosoureas the therapeutic potential of o 6 -alkylguanine dna alkyltransferase inhibitors disulfiram is a direct and potent inhibitor of human 
o6-methylguanine-dna methyltransferase (mgmt) in brain tumor cells and mouse brain and markedly increases the alkylating dna damage molecular mechanisms of resistance and toxicity associated with platinating agents targeting o 6 -methylguanine-dna methyltransferase with specific inhibitors as a strategy in cancer therapy green fluorescent protein as a marker for gene expression the green fluorescent protein vectors that facilitate the expression and purification of foreign peptides in escherichia coli by fusion to maltose-binding protein molecular interaction between the strep-tag affinity peptide and its cognate target walid qoronfleh, m. glutathione s-transferase pull-down assays using dehydrated immobilized glutathione resin structural investigations on orotate phosphoribosyltransferase from mycobacterium tuberculosis, a key enzyme of the de novo pyrimidine biosynthesis biochemical and structural investigations on phosphorribosylpyrophosphate synthetase from mycobacterium smegmatis a thioredoxin gene fusion expression system that circumvents inclusion body formation in the e. 
coli cytoplasm directed evolution of o 6 -alkylguanine-dna alkyltransferase for efficient labeling of fusion proteins with small molecules in vivo a general method for the covalent labeling of fusion proteins with small molecules in vivo covalent and selective immobilization of fusion proteins directed evolution of o 6 -alkylguanine-dna alkyltransferase for applications in protein labeling how to obtain labeled proteins and what to do with them spr-based interaction studies with small molecular weight ligands using hagt fusion proteins snap-tag technology: a useful tool to determine affinity constants and other functional parameters of novel anti-body fragments selective degradation of reverse gyrase and dna fragmentation induced by alkylating agent in the archaeon sulfolobus solfataricus alkylation damage repair protein o 6 -alkyl-guanine-dna alkyltransferase from the hyperthermophiles aquifex aeolicus and archaeoglobus fulgidus the o 6 -methylguanine-dna methyltransferase from the hyperthermophilic archaeon pyrococcus sp. 
kod1: a thermostable repair enzyme thermostable archaeal o 6 -alkylguanine-dna alkyltransferases function of domains of human o 6 -alkyl-guanine-dna alkyltransferase biochemical and structural studies of the mycobacterium tuberculosis o 6 -methylguanine methyltransferase and mutated variants structure-function relationships governing activity and stability of a dna alkylation damage repair thermostable protein novel dna repair alkyltransferase from caenorhabditis elegans the dna alkylguanine dna alkyltransferase-2 (agt-2) of caenorhabditis elegans is involved in meiosis and early development under physiological conditions helix signals in proteins amino acid preferences for specific locations at the ends of α-helices importance of alpha-helix n-capping motif in stabilization of betabetaalpha fold structure of a hyperthermo-philic tungstopterin enzyme, aldehyde ferredoxin oxidoreductase crystal structure of mycobacterium tuberculosis o 6 -methylguanine-dna methyltransferase protein clusters assembled on to damaged dna insight into the molecular basis of thermal stability from the structure determination of pyrococcus furiosus glutamate dehydrogenase hyperthermostable protein structure maintained by intra and inter-helix ion-pairs in archaeal o 6 -methylguanine-dna methyltransferase crystal structure of a suicidal dna repair protein: the ada o 6 -methylguaninedna methyltransferase from e. coli structural studies of mj1529, an o 6 -methylguanine-dna methyltransferase the sulfolobus-"caldariella" group: taxonomy on the basis of the structure of dna-dependent rna polymerases activity and regulation of archaeal dna alkyltransferase: conserved protein involved in repair of dna alkylation damage repair of alkylated dna in escherichia coli. 
methyl group transfer from o 6 -methylguanine to a protein cysteine residue measurement of o 6 -alkylguanine-dna-alkyltransferase activity in human cells and tumor tissues by restriction endonuclease inhibition assay for o 6 -alkylguanine-dna-alkyltransferase using oligonucleotides containing o 6 -methylguanine in a bamhi recognition site as substrate repair of oligodeoxyribonucleotides by o(6)-alkylguanine-dna alkyltransferase selection of single-stranded dna molecules that bind and inhibit human thrombin development of a novel fluorescence assay based on the use of the thrombin binding aptamer for the detection of o 6 -alkyl-guanine-dna alkyltransferase activity a novel thermostable protein-tag: optimization of the sulfolobus solfataricus dna-alkyl-transferase by protein engineering interdomain interactions rearrangements control the reaction steps of a thermostable dna alkyltransferase differential inactivation of mammalian and escherichia coli o 6 -alkylguanine-dna alkyltransferases by o 6 -benzylguanine repair of o 6 -benzylguanine by the escherichia coli ada and ogt and the human o 6 -alkylguanine-dna alkyltransferase every ogt is illuminated... 
by fluorescent and synchrotron lights interactions of human o(6)-alkylguanine-dna alkyltransferase (agt) with short double-stranded dnas mechanism to trigger unfolding in o 6 -alkylguanine-dna alkyltransferase crystal structure of a thermophilic o 6 -alkylguanine-dna alkyltransferase-derived self-labeling protein-tag in covalent complex with a fluorescent probe labeling of fusion proteins with synthetic fluorophores in live cells dissection of reverse gyrase activities: insight into the evolution of a thermostable molecular machine inhibition of translesion dna polymerase by archaeal reverse gyrase reverse gyrase and genome stability in hyperthermophilic organisms positive supercoiling in thermophiles and mesophiles: of the good and evil in vivo and in vitro protein imaging in thermophilic archaea by exploiting a novel protein tag an o 6 -methylguanine-dna methyltransferase-like protein from thermus thermophilus interacts with a nucleotide excision repair protein alkyltransferase-like proteins: brokers dealing with alkylated dna bases flipping of alkylated dna damage bridges base and nucleotide excision repair atl1 regulates choice between global genome and transcription-coupled repair of o(6)-alkylguanines mycobacterium tuberculosis uvrb forms dimers in solution and interacts with uvra in the absence of ligands progress in enzyme immobilization in ordered mesoporous materials and related applications chibata, i. studies on continuous enzyme reactions. i. 
screening of carriers for preparation of water-insoluble aminoacylase an overview of technologies for immobilization of enzymes and surface analysis techniques for immobilized enzymes an overview of techniques in enzyme immobilization a general overview of support materials for enzyme immobilization: characteristics, properties chapter nine-enzyme immobilization: an overview on methods, support material, and applications of immobilized enzymes bacterial ice nucleation: significance and molecular basis ice crystallization by pseudomonas syringae an agt-based protein-tag system for the labelling and surface immobilization of enzymes on e. coli outer membrane thermostability enhancement of the α-carbonic anhydrase from sulfurihydrogenibium yellowstonense by using the anchoring-and-selflabelling-protein-tag system (asl tag ) a one-step procedure for immobilising the thermostable carbonic anhydrase (sspca) on the surface membrane of escherichia coli a journey down to hell: new thermostable protein-tags for biotechnology at high temperatures pyrococcus furiosus sp. nov. represents a novel genus of marine heterotrophic archaebacteria growing optimally at 100 • c. arch. microb high-fidelity amplification using a thermostable dna polymerase isolated from pyrococcus furiosus mutational effects on o 6 -methylguanine-dna methyltransferase from hyperthermophile: contribution of ion-pair network to protein thermostability the use of differential scanning fluorimetry to detect ligand interactions that promote protein stability a new sulfur-reducing, extremely thermophilic eubacterium from a submarine thermal vent thermotoga neapolitana sp. nov. 
of the extremely thermophilic, eubacterial genus thermotoga microbial biochemistry, physiology, and biotechnology of hyperthermophilic thermotoga species site-directed mutagenesis of a hyperthermophilic endoglucanase cel12b from thermotoga maritima based on rational design engineering of tm1459 from thermotoga maritima for increased oxidative alkene cleavage activity engineering a switch-based biosensor for arginine using a thermotoga maritima periplasmic binding protein development of a pyre-based selective system for thermotoga sp. strain rq7 repair of o 6 -alkylguanine by alkyltransferases variability and regulation of o 6 -alkylguanine-dna alkyltransferase directed evolution of the suicide protein o 6 -alkylguanine-dna alkyltransferase for increased reactivity results in an alkylated protein with exceptional stability an engineered protein-tag for multi-protein labeling in living cells rna-guided rna cleavage by a crispr rna-cas protein complex the rna-and dna-targeting crisp-cas immune systems of pyrococcus furiosus this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license acknowledgments: g.p. would like to thank all the authors, elena and elisa perugino for their efforts in writing this work and in their technical assistance, but mainly for their human support during the difficult and delicate period of staying at home following the covid-19 outbreak. all authors strongly thank the reviewers for their important corrections and changes, increasing the scientific level of this manuscript. the authors declare no conflict of interest. key: cord-260440-e63pgcir authors: dinjaski, nina; prieto, m. 
auxiliadora title: smart polyhydroxyalkanoate nanobeads by protein based functionalization date: 2015-02-24 journal: nanomedicine doi: 10.1016/j.nano.2015.01.018 sha: doc_id: 260440 cord_uid: e63pgcir the development of innovative medicines and personalized biomedical approaches calls for new generation easily tunable biomaterials that can be manufactured applying straightforward and low-priced technologies. production of functionalized bacterial polyhydroxyalkanoate (pha) nanobeads by harnessing their natural carbon-storage granule production system is a thrilling recent development. this branch of nanobiotechnology employs proteins intrinsically binding the pha granules as tags to immobilize recombinant proteins of interest and design functional nanocarriers for wide range of applications. additionally, the implementation of new methodological platforms regarding production of endotoxin free pha nanobeads using gram-positive bacteria opened new avenues for biomedical applications. this prompts serious considerations of possible exploitation of bacterial cell factories as alternatives to traditional chemical synthesis and sources of novel bioproducts that could dramatically expand possible applications of biopolymers. from the clinical editor: in the 21st century, we are coming into the age of personalized medicine. there is a growing use of biomaterials in the clinical setting. in this review article, the authors describe the use of natural polyhydroxyalkanoate (pha) nanoparticulates, which are formed within bacterial cells and can be easily functionalized. the potential uses would include high-affinity bioseparation, enzyme immobilization, protein delivery, diagnostics etc. the challenges of this approach remain the possible toxicity from endotoxin and the high cost of production. health-focused nanotechnologies have put under screening a growing spectrum of materials whose properties can be modified during fabrication. 
merging synthesis and smart functionalization of natural polymers allows straightforward cost-effective production of novel materials specifically designed for target application. 1, 2 the performance of polymers of synthetic origin has been investigated for nanotechnology applications, as well. 3 however, in this case production and functionalization are usually two separate processes. among natural polymers, polyhydroxyalkanoates (phas), the highly tunable bacterial polyesters, play an important role in the development of next generation biomaterials (figure 1). their properties are greatly influenced by the type (e.g., short chain length pha, scl-pha; medium chain length pha, mcl-pha) and homogeneity of hydroxyalkanoic monomer building blocks, and others (figure 2). 4 the ability to edit and redirect the bacterial cell system through metabolic or genetic engineering enables the construction of platforms to produce versatile materials carrying a wide range of functional groups which confer desired properties to the polymer. 4-6 alternatively, the direct use of highly structured natural pha nanoparticulate entities formed within bacterial cells opened new avenues for attractive biomaterial design where tailor-made beads are functionalized using the intrinsic bacterial granule producing system. 1, 7, 8 these possibly phospholipid-coated inclusions carry granule-associated proteins (gaps) on their surfaces, such as: i) pha synthases, involved in the polymerization of the biopolyester; ii) pha depolymerases, responsible for pha mobilization; iii) phasins, the main structural components of gaps and iv) other proteins such as enzymes related to the synthesis of pha monomers, as well as transcriptional regulators not classified as gaps (figure 3). 1, 9, 10 the implementation of these new assets, aside from broadening the potential, allows customizing and fine tuning to improve polymer performance for each specific application (figure 4).
nanostructured materials produced by bacteria are becoming increasingly recognized as functionalized beads with great biotechnological and biomedical potential. 11, 12 the functionally complex architecture of pha inclusions, based on interacting proteins embedded in or attached to the pha core, 13 has been exploited as a toolbox to display molecules carrying out specific functions (figure 4). under a wide scope of applications the performance of such engineered pha beads has been demonstrated in high-affinity bioseparation, 14 enzyme immobilization, 7, 15, 16 protein delivery to natural environments, 17, 18 diagnostics, 19 as an antigen delivery system 20 and many others (table 1). 2, 20 herein, we review the diversity of cell systems available to produce functionalized pha nanobeads and underline specific properties in the context of their suitability for different applications. we highlight the advantages of different granule-associated proteins (gaps) and address the possible gaps that need to be filled. importantly, a powerful combination of synthetic biology and microengineering can create an appropriate framework for future application of pha nanobeads. finally, we compare the properties of nanoparticles based on bacterial and selected synthetic polyesters.

figure caption: schematic representation of the currently used strategies for pha functionalization centered around added-value pha production. in vivo pha modification based on peptide functionalization of pha nano-beads using gaps for recombinant protein anchoring to the pha granule or nonspecific binding and in vivo chemical modification through incorporation of functional group in the side chain of the polymer applying metabolic engineering and systems biology approach. similarly to in vivo, in vitro approach for peptide functionalization can be based on the use of gaps or nonspecific binding, while the underlying principle of in vitro chemical modification might be based on polymer synthesis or modification.

despite the fact that naturally occurring nanoparticles have been present for millions of years, nanotechnology is first and foremost focused on in vitro man-made particles. 11 nevertheless, depending on the target application, an in vivo biological or an in vitro synthetic approach for fusion protein immobilization to the pha granule surface might better meet the requirements (table 2). the in vivo pha granule functionalization consists of gap fusion immobilization onto the granule surface simultaneously with the granule formation inside the pha-producing host (figure 4). 7, 49 on the other hand, the production of these bioinspired constructs in vitro is based on pha extraction, followed by in vitro bead production and in vitro gap fusion protein immobilization via gap-bead interaction (figure 4). 30 the main advantages of this in vitro cell-free system are: i) the possibility of tight control of nanoparticle disassembly and reassembly process; ii) absence of competition among the recombinant gap-fusion and wild type proteins; iii) tight control over particle size and immobilized protein/active agent concentration; iv) possibility of endotoxin removal, crucial for the design of every biomedical setup. nevertheless, pha isolation and in vitro nanobead production require more tedious methodology (e.g., to avoid pha particle aggregation) in comparison to isolation of in vivo produced pha granules. also, the use of non-environmentally friendly solvents is needed for in vitro technology. all of these factors significantly increase the costs of in vitro pha nanobead production and make the technology suitable mainly for added-value applications where tight control over particle size and active agent concentration is needed.
30 in the line of safety, the in vitro approach is highly convenient for nanomedical purposes, including nanofabrication, imaging, drug delivery and tissue engineering, where the use of endotoxin-free pha is requisite. 4 importantly, the fabrication of endotoxin-free pha vehicles can also be achieved using in vivo settings (see below). some applications such as protein delivery to natural environments do not necessarily require endotoxin free pha and can benefit from an in vivo approach where bacterial naturally produced nanoscale particulate entities can be used in a straightforward manner. 16 furthermore, as bacterial polymeric particles can be functionalized in vivo before isolation there is a clear environmental and economic advantage over those produced chemically. particle functionalization is achieved through the recombinant expression of fusion proteins, where natural gaps are used as anchoring tag for foreign protein immobilization. a perfect example is the biof tag from pseudomonas putida, based on the use of intrinsic p. putida pha granules as scaffold to immobilize fusion proteins in vivo. once fermentation under optimal pha production conditions is accomplished, granules decorated with the biof-protein fusions are obtained as the end product (figure 5). 8, 16 depending on the protein release treatment, up to 100% of fusion protein can be recovered with a good purity, since the phasins represent major gaps. 7 additionally, the possibility of minimizing the presence of gap proteins to increase the yield of fusion protein binding and purity has been investigated. 8 the biof system was proven efficient for in vivo coating of mcl-pha granules with cry1ab derived insect-specific toxin protein. generation of the bioplastic-biof-insect specific toxin complex indicated excellent performance of the biof tag as a device for spreading active polypeptides to the environment without the need for active agent release and purification.
16 similarly, organophosphohydrolase from agrobacterium radiobacter immobilized on polyester inclusions of recombinant escherichia coli was shown suitable for bioremediation applications. 17 testing these new in vivo assets and analyzing their limits indicated the possible room for improvement. current trends deal with implementation of new methodological platforms, such as synthetic biology, to improve the production process and productivity. 57 this highlights the importance of re-programming approaches to optimize the system and design strategies focused on meeting the necessities of each specific application. in the line of fine tuning of biological interfaces and the use of pha as vehicles, addressing the key factors of pha machinery permitted overcoming biological barriers to reach maximal in vivo coating of pha nanobead and at the same time avoid side effects concerning disordered granule biodistribution after cell division (see below). 8

table 2. comparison of pha nanoparticles in vitro and in vivo production process, their applications and costs.
in vivo | in vitro | ref.
production by bacteria | synthetic production | 2, 20
use of renewable sources for production | harsh chemical needed for polymer isolation and particle production | 30, 53
simultaneous production and functionalization | functionalization posterior to nanobead production | 8, 20, 30
nanobead assembly and disassembly cannot be tightly controlled | tight control over bead assembly and disassembly | 10, 54
competition of recombinant and wild type gaps | functionalization with target protein only, no other gaps | 8, 30, 54
particle size can be controlled by biotechnological production process | tight control over particle size | 32, 54
immobilized protein concentration variation might represent challenge | tight control over immobilized protein concentration | 7, 30
production cost:
total production cost includes in vivo particle production cost and particle purification, lower production cost compared to in vitro produced particles, since additional functionalization is not needed | higher production costs compared to in vivo produced particles, total price accounts for polymer synthesis, isolation, endotoxin removal, in vitro particle synthesis and functionalization | 30, 54, 56

different gaps, different advantages: hydrophobic vs. covalent binding

the diversity of gaps offers gentle alternatives through flexible and highly tunable design of specific tags suitable for personalized requirements of different applications. thus, the window of possibilities that each specific gap offers implies different modes to connect recombinant protein and pha nanobeads (covalent, hydrophobic or non-specific) (figure 4). although so far very little is known about their structure and interaction with the pha granules, 58 phasins are highly attractive among gaps, largely due to the wide assortment of structurally different compositions compared to other gaps (figure 4). phasins have been utilized as affinity tags and, through protein engineering, designed to build a recombinant protein purification system. this provides a low cost method for production and purification of high added value proteins in a continuous way. 49 significant improvements in bio-separation technology were made by upgrading the system interconnecting phasins and target proteins via self-cleaving intein.
47 this approach enabled in vivo recombinant protein immobilization onto the granule and the release of purified proteins once the native scl-pha particles were recovered, which in turn pushed bio-separation technology several steps ahead, toward convenience and economic production. correctly folded eukaryotic proteins immobilized in vivo on the surface of pha granules via a phasin protein have been used for fluorescence activated cell sorting (facs) based diagnostics. 18 in a completely different context to in vivo tag binding, in vitro synthesized pha nanoparticles and in vitro hydrophobic binding of phap fusion proteins with protein ligands (e.g., mannosylated human α1-acid glycoprotein (hagp) and human epidermal growth factor (hegf)) have been reported as another outstanding application of phasins for receptor-mediated drug delivery. 30 the most utilized phasins are phap of ralstonia eutropha that bind scl-pha, 20 while the exclusive example of mcl-pha binding p. putida phaf phasin is for environmental application (biof system). 8, 16 other identified phasins, such as phap proteins of aeromonas hydrophila, phap of haloferax mediterranei, paracoccus denitrificans, bacillus megaterium, and others (reviewed in 10), have not been deeply studied for nanobiotechnology purposes. likewise, applying the in vitro approach the substrate binding domain of pha depolymerase has been used to hydrophobically anchor fusion proteins to pha nano and microbeads. 22, 59, 60 a different strategy to immobilize recombinant proteins in vivo onto the pha nanobead surface relies on the advantage of covalent gap-pha binding using p. aeruginosa, p. putida, r. eutropha or b. megaterium pha synthase as a tag. 14, 61-63 phasin-pha interaction usually results in a slow non-triggered protein release over time under physiologic conditions. moreover, specific environmental conditions can alter release rates.
64 in contrast, covalent attachment enables unique natural cross-linking of a protein and polymeric support and allows better control over protein release kinetics. pha synthase offers the possibility of covalent protein-pha conjugation. both the n- and c-terminus of pha synthase were shown suitable for in vivo assembly of functionalized polyester beads. 14, 17, 26, 31, 44, 62, 65, 66 this approach based on pha nanobead functionalization through phac helps to circumvent the washing off of non-covalently bound fusion proteins during the process. 67 the particles with an intrinsic label can be tailored to covalently display proteins for applications in antibody capture-based diagnostic (e.g., immunochromatographic strips or batch-and-elute bioseparation applications). the modular arrangement of the protein domains provides a large design space for the production of custom-made materials. 20

figure 5. in vivo immobilization of fusion proteins to bioplastics by biof tag. the procedure consists of: 1, the fermentation in p. putida under optimal pha production conditions; 2, 3, isolation of the granules carrying the biof-proteins fusions from the crude cell lysate by a simple centrifugation step; 4, release of fusion proteins via detergent treatment (modified from 16).

by introducing an enterokinase digestion site between the tag and target protein the latter can be efficiently released from polymer support, providing efficient and cost-effective methodology to obtain added value product. 67 similarly, to facilitate target protein release from bio-bead, a thrombin cleavage site was used as a linker, 68 as well as the previously mentioned autolytic intein. this enables straightforward liberation of target protein. 52, 69 in addition, proteins can be unspecifically absorbed to pha. 59, 70 an alternative route to intracellularly produce enzyme decorated pha beads consists of simultaneous synthesis of insoluble protein inclusion bodies and pha granules.
charged particles are created by introducing an acidic coil via the n-terminus of phac. this structure has been used to capture an enzyme of interest that was co-expressed in the same host cell and contains a basic coil fused to its c-terminus. coils are held together by hydrophobic and electrostatic interactions. 65 therefore, it follows that understanding protein-pha interactions from a biophysical point of view will undoubtedly widen the biotechnological and clinical potential of these bioplastics. in fact, in some cases there are indications that phasin-pha interaction is influenced not only by the nature of these two components but also by the presence of other gaps that interfere and play the role of mediation elements facilitating the binding. 8, 10 for instance, the optimization of the biof system by minimizing the dosage of natural phasins in p. putida kt2440 illustrates the importance of understanding the molecular basis underlying the pha-phasin interaction and its biological consequences. 8 also, the mechanistic study of the pha granule producing machinery functioning, the dynamics and factors that direct gap-pha binding together assist in overcoming technical hurdles and indicate bottlenecks important for the design of bioinspired nanoparticles (see "editing, streamlining and refactoring wild type strains for enhancement of protein immobilization" section for details).

bug systems for scaling up: wild type over recombinant cells

success in producing pha naturally or recombinantly in a broad range of bacteria showed that many microorganisms with desirable properties could perform the function of cell factory for production of functionalized pha beads. e. coli is the default host microorganism for recombinant protein production and often the first choice. the fact that this strain serves as a workhorse of basic and applied research worldwide is largely due to the possibility of high recombinant protein yield achievement. remarkably, e. coli, previously a non-pha producer, through pathway engineering has been set up to produce up to 150 g/l cell dry weight (cdw) with final pha content of more than 80%. 4 this was used to co-produce several tagged proteins (maltose binding protein (mbp), β-galactosidase (lacz), chloramphenicol acetyltransferase (cat)) with polyhydroxybutyrate (phb) granules in the e. coli cells. proteins were purified with yields of 3.17-7.96 mg/g cdw. 47 currently applying recombinant e. coli cells allows covering of the granule surface up to 20% of total proteins associated with the bead, 19 while using a wild type strain such as p. putida as much as 2% can be achieved. 8 it should be noted that different bacterial strains have different pha producing capacities regarding polyester type (scl- or mcl-pha) and relative amount to cdw. besides, the altered final recombinant protein yield might be a consequence of the type of gaps used to immobilize recombinant protein, affecting the specific recombinant protein-pha interaction. importantly, r. eutropha naturally produces more than 200 g/l of phb, which gets to 80% of cdw similarly to recombinant production in e. coli, 40 while yields of mcl-pha obtained with p. putida reach 65%. 71 p. putida productivity can be upgraded to 84% of intracellular mcl-pha, incorporating knock-out mutations of beta-oxidation genes fada and fadb. 72 recombinant e. coli is able to produce 20% of mcl-pha when beta-oxidation is impaired due to the deletion of fadb, 4 whereas qi et al. used a metabolic routing strategy to inhibit fatty acid beta-oxidation by acrylic acid in recombinant e. coli (fadr) and produce 60% mcl-pha. 73 additionally, phaj, encoding (r)-specific enoyl-coa hydratase, was demonstrated to supply 3-hydroxyacyl-coa of c4-c6 for pha biosynthesis via the beta-oxidation pathway. 74, 75 its co-expression with phac in e. coli led to production of pha with monomer composition containing c4, c6, c8, and c10 from unrelated carbon source.
76, 77 though, e. coli remains the most commercially valuable host for phb large-scale production as the polymer degradation is avoided, the down sides as endotoxin contamination and previously mentioned relatively low yields of mcl-pha, substantially limit its use for biomedical purposes. also, the overexpression of foreign genes over physiological rates usually triggers a spectrum of conformational stress responses and causes the accumulation of insoluble protein versions that do not reach their native conformation. 78 these pseudospherical protein aggregates, inclusion bodies, are considered undesired byproducts of protein production processes. other bottlenecks as the loss of the plasmid due to the instability of introduced genes, use of antibiotics and gene expression expensive inducers have been partially solved, however they still represent a challenge (reviewed in 53 ). taking all this together, the advantages of using wild type strains as host should not be overlooked. specific strategies applied on the components of pha machinery can drive productivities of high contents of pha immobilized recombinant proteins in wild type strains as reported for e. coli. 8 on the positive side, a great understanding of pha synthesis in model mcl-pha producer strains such as p. putida, has been gained through systems biology ("omics" data, genome-scale metabolic models, etc.). 57, [79] [80] [81] [82] [83] powerful genetic tools based on synthetic biology 84 support bottom-up approaches and might be used to design p. putida strains that generate added-value bioproducts, such as active mcl-pha based nanobeads. the great value of this bacterium as an autolytic specialized strain for mcl-pha production has also been demonstrated. 85 due to its broad metabolic versatility and genetic plasticity, which allow a variety of renewable carbon sources to be used for pha production, p. putida is one of the most prominent candidates for protein production. 
aside from pseudomonas, many other gram-positive and gram-negative eubacterial genera such as bacillus, ralstonia, aeromonas, rhodobacter, rhodospirillum, rhodococcus were shown suitable for production of pha nanobeads. 4, 86

editing, streamlining and refactoring wild type strains for enhancement of protein immobilization

the complex subcellular architecture and self-organizing nano- and micro-compartments of the bacterial cell hold great promise, largely due to the possibility for their biofunctionalization. disturbing these highly coordinated systems might easily imbalance the physiology of the bacterial cell. pha granules take over the control of the carbon and energy storage and thus represent an important element of the bacterial metabolic network. 83 therefore, from an energy flow and survival physiology standpoint, balanced distribution of pha between daughter cells after division has fundamental importance as competitive setting. understanding the pha machinery and interplay of its components was shown crucial for optimization of the in vivo system for production of protein functionalized pha nanobeads. 8, 9 different scenarios involving different molecular events and interactions as well as granule localization have been proposed by the micelle, budding and scaffold models of granule formation. 7 in contrast to the micelle model, where pha granules are assumed to be randomly distributed in the cytoplasm, the budding and scaffold models suggest defined localization, proposing granule-cell membrane interaction or phac-scaffold molecule interplay, respectively. the recently proposed scaffold model suggests cooperative work of phac and phasins in granule formation. since phasin-phac interaction has been spotted in some bacterial strains (e.g., pham, phasin-like protein that interacts with phac in r. eutropha), phasins were proposed as the main components forming a network that interconnects granules, dna and enzymes involved in pha metabolism.
9, 87, 88 this network should serve as a mediation element responsible for granule localization within the cell and their balanced segregation between daughter cells during cell division. the activity of some gaps depends on their interactions, while the function of other interactions is still to be discovered. for instance, homo-oligomerization of r. eutropha phac1 reu and phar reu 89, 90 and p. putida phac1 and hetero-oligomerization of phac bmeg with phar bmeg are known to be essential for accomplishing the function. meanwhile, the interaction of certain phasins with other pha players was identified, 90 but their exact function is to be unraveled. namely, p. putida phaf was proposed to form homo- and hetero-tetramers interacting with phai through a short leucine zipper. 58 another suggested role of phasins is the control of the access of pha depolymerases. indeed, weak phap2-phaz interaction was reported in r. eutropha. 90 all these interactions are thought to contribute to the formation of the net-like structure found in the vicinity of pha granules 91 and provide a window into the system functioning. phaf has been shown to have a role as a central player in the machinery, controlling pha granule segregation and localization in the cell, since it shows a unique ability to bind at least two ligands (the pha granules and the nucleoid). 7, 9, 58, 92 the peculiar structural organization of phaf into two domains performing diverse functions (i.e., c-terminal histone-like domain, n-terminal phasin-like domain) supplies an explanation for its biological role. 8, 9 moreover, whether or not p. putida cytoskeletal or other gap proteins facilitate the organization of granules in a needle array-like structure (figure 4), by direct or indirect interaction with phaf, is still an open question and currently the precise mechanisms by which intermediary phaf positions the pha granules are still unknown. 9 similarly, pham of r. eutropha can bind both dna and pha.
93 therefore, to refine the system it is needed to unravel the puzzle of how functionally diverse, or even a multifunctional set of gaps, should be combined to generate an optimal yield of in vivo immobilized protein onto the granule surface and engender a coherent cell phenotype. in a further step toward the use of pha granules as nanocarriers decorated with functionalized phasins, the information on phasin physiological function provided important insights into the critical factors needed to be targeted to improve existing models. 8, 9 for instance, phasin binding prevents unspecific attachment of not only proteins unrelated to the pha metabolism to the granules surface, but also limits the space for recombinant proteins to anchor. 94 therefore, the absence of wild-type phasins favors binding of recombinant tagged protein molecules anchoring to the granule surface. 8 this could be explained by limited surface for recombinant proteins to anchor wild-type pha granules and the need to compete with natural phasins. in this respect, the key phasin factors have been identified for optimal pha production in p. putida addressing the minimum amount of complete phasin proteins necessary to achieve adequate pha production and higher yield of immobilized recombinant protein. 8 applying this strategy maximum biof (n-terminal of phaf) fusion protein concentration was in vivo immobilized onto the pha beads (2.2% of recombinant protein/pha) without compromising phasins' intrinsic function. 8 also, this demonstrated the swappable nature of phai phasin and biof pha binding modules in terms of their physiological function and illustrated the utility of the phaf/phai structure redundancy, being autonomous modular cooperatively working units. 
8, 58 altogether, these examples show that the escalating drive to identify the connections within the complex system of the gaps network is fueled by the need to develop new strategies that will lead to improvement of protein immobilization onto the pha beads. metabolic and biotechnology capacities of p. putida, as well as global understanding of the capabilities of this strain, are facilitated by metabolic models that enabled integration of experimental along with genomic and high-throughput data. 57

endotoxin free pha nanobead production

bacterial lipopolysaccharides (lps) or endotoxins, also designated as pathogen associated molecular patterns (pamps) recognized by the innate immune system, are the most potent identified microbial mediators implicated in the pathogenesis of sepsis and septic shock. lps is the most prominent 'alarm molecule' sensed by the host's early warning system of innate immunity presaging the threat of invasion by gram-negative bacterial pathogens. 95 thus, the presence of lps endotoxins in pha nanobeads produced in gram-negative bacteria makes these in vivo naturally produced particles unsuitable for biomedical applications. 96, 97 the problem occurs because co-purification of pyrogenic outer lps together with pha granules cannot be avoided. the in vitro approach on the other hand offers the possibility of endotoxin removal from pha polymer. the concentration of endotoxins in pha is greatly influenced by purification strategy and might vary from more than 10^4 eu/g to less than 1 eu/g. 55, 98 the methodology for endotoxin elimination depends on the type of pha (e.g., scl-pha, mcl-pha, presence of functional groups, etc.) and each results in different rates of polymer recovery. 55, 98 however, the in vitro strategy remains hampered by the necessity of extensive and tedious purification methodology to achieve the levels in compliance with the endotoxin requirements for biomedical application according to the u.s. food and drug administration (fda).
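as a rough illustration of how the endotoxin figures in this section relate to each other, the sketch below multiplies an endotoxin concentration in pha (eu/g) by the mass of pha in a device and compares the result with the per-device limits quoted for the fda requirements (20 eu/device for cardiovascular or lymphatic contact, 2.15 eu/device for cerebrospinal fluid contact). the device mass of 0.5 g, the function names and the assumption that the polymer is the only endotoxin source are hypothetical choices made for this example only; this is not a regulatory calculation.

```python
# Per-device endotoxin limits quoted in this section (eu/device).
LIMITS_EU_PER_DEVICE = {
    "cardiovascular_lymphatic": 20.0,
    "cerebrospinal": 2.15,
}


def endotoxin_load_eu(pha_mass_g, endotoxin_eu_per_g):
    """Total endotoxin in a device, assuming the pha is the only source."""
    if pha_mass_g < 0 or endotoxin_eu_per_g < 0:
        raise ValueError("mass and concentration must be non-negative")
    return pha_mass_g * endotoxin_eu_per_g


def within_limit(pha_mass_g, endotoxin_eu_per_g, contact):
    """True if the device load does not exceed the limit for this contact type."""
    load = endotoxin_load_eu(pha_mass_g, endotoxin_eu_per_g)
    return load <= LIMITS_EU_PER_DEVICE[contact]


if __name__ == "__main__":
    mass_g = 0.5  # hypothetical amount of pha in one device
    # The two ends of the concentration range reported for purified pha.
    for eu_per_g in (1.0, 1e4):
        for contact in LIMITS_EU_PER_DEVICE:
            ok = within_limit(mass_g, eu_per_g, contact)
            print(eu_per_g, contact, "within limit" if ok else "exceeds limit")
```

under these assumptions, polymer at the clean end of the reported range stays below both limits, while polymer at the upper end exceeds them by orders of magnitude, which is why the purification strategy dominates suitability for biomedical use.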
generally, for products that directly or indirectly contact the cardiovascular system and lymphatic system the limit is 0.5 eu/ml or 20 eu/device, while for devices in contact with cerebrospinal fluid the limit is 0.06 eu/ml or 2.15 eu/device. 99 all mentioned factors together with the bacterial growth conditions significantly influence the total cost of the production of endotoxin-free polymer. to get around this limitation, alternative sources of functionalized pha granules free of lps contamination are gram-positive bacteria. they offer a platform for production of lps free tailored beads due to the difference in the structure of their cell envelopes compared to gram-negative bacteria. 100 even so, other pamps, such as lipoteichoic acid (lta) and peptidoglycan (pg), found in gram-positive bacterial pathogens are now appreciated to activate many of the same or similar host defense networks induced by lps. 95 consequently, their presence in phas isolated from gram-positive bacteria might have immunogenic activities similar to lps. 101 among pamps, lta predominates in bacillus, whereas actinomycete bacteria typically synthesize lipoglycans. 102 importantly, certain gram-positive pha producing strains (e.g., bacillus circulans, bacillus polymyxa) lack both lta and lipoglycans. 103 clostridium and staphylococcus citreus were reported to lack lta and may be considered for recombinant pha production. 104 hence, an emerging area to be investigated is the mechanisms triggered by pamps of gram-positive pha producing bacteria in the mammalian immune system. remarkably, gram-positive genera corynebacterium, nocardia and rhodococcus are the only wild-type bacteria which naturally synthesize the commercially important copolymer poly(3-hydroxybutyrate-co-3-hydroxyvalerate), p(3hb-co-3hv), from simple carbon sources such as glucose.
105, 106 the genus bacillus, in common with many other pha-accumulating gram-positive bacteria, accumulates co-polymers of 3hb when grown on different substrates. 98 for instance, copolymers of p(3hb-co-3hv) are accumulated when the cultures are fed with odd-chain-length n-alkanoic acids such as propionic acid, valeric acid and heptanoic acid. 107 the generally-regarded-as-safe (gras) bacterium lactococcus lactis has been genetically engineered to produce pha beads. unfortunately this recombinant strain did not show feasibility for commercial-scale production, since the beads were both smaller in size and contributed less pha per cdw (6%) than other pha producing bacteria. 29 therefore, this platform was designated for added value medical product synthesis (e.g., vaccine development) instead of large scale production. 25 the improvement of the yield would likely require re-engineering metabolic flux to push carbon utilization away from lactate production and toward the pha biosynthesis pathway. 29 interestingly, the platform based on pha functionalized granules was used to develop a phap-based system for endotoxin removal from protein solution. an endotoxin receptor protein was fused with r. eutropha phasin, attached in vitro to phb beads and used to remove lps from the solution. 108

functionalized pha nanobead in vivo performance, cytotoxicity and biocompatibility

numerous in vivo studies have clearly demonstrated that endotoxin and bacterial protein free phas provoke mild host reactions in different animal models, 96 which is not surprising when considering the fact that [r]-3-hydroxybutyric acid is a normal blood constituent 109 and is found in the cell envelope of eukaryotes. 110 in vitro based approaches have focused on enhancing growth of different eukaryotic cell lines using arginyl-glycyl-aspartic acid (rgd) tailored pha in form of a scaffold.
as such, it showed excellent in vitro performance in supporting and promoting the adhesion and growth of neural stem cells, human bone marrow mesenchymal stem cells and fibroblasts. [111] [112] [113] immobilization of a phap-rgd fusion made it possible to avoid tedious cross-linking processes and chemical immobilization, which easily damage the biological activity of the attached protein. new approaches based on nanoparticulate carriers with targeting capability for imaging and drug delivery to cancer cells are slowly replacing longstanding concepts. with this aim, after synthesis of loaded pha particles, surface modification was performed via hydrophobic interaction between the particle surface and a pha chain growing from a phac enzyme fused with rgd, which stabilized the core-shell structure. 31 however, little attention was paid to endotoxin removal and scaffold performance in vivo. alternatively, pha micelles were synthesized in vitro by mixing phac-rgd and 3hb-coa, thereby avoiding the incorporation of endotoxins. 66 bacterial polyester inclusions have also been engineered to display fusion proteins of phac and components involved in the immune response to an infectious agent, and used as a vaccine delivery system. 19 remarkably, particle-based carriers very closely mimic the physicochemical characteristics of natural pathogens, enhancing delivery of the particle-displayed protein to the immune system. [114] [115] [116] however, very few in vivo studies address the essential issue of the immunogenicity of soluble and pha granule-bound gaps, considering that the main objective when using biomaterials and nanocarriers is to generate the most appropriate beneficial cellular or tissue response without eliciting any undesirable local or systemic effects in the recipient of the therapy. as the immune response and repair functions in the body are exceptionally complex, the biocompatibility of a material should not be described in relation to a single cell type or tissue.
nevertheless, it is essential to consider in vitro and in vivo cellular behavior for a comprehensive biocompatibility evaluation of biopolymers. several studies report neither toxic nor pyrogenic effects of wild-type or functionalized non-endotoxin-free pha beads in mice, 19 which suggests that, owing to the profound differences between the mouse and human immune systems, another animal model should be considered for this type of study. 117 given the breadth of these functional differences, the discrepancies surely limit the usefulness of mouse models in such studies and should be taken into account when choosing preclinical animal models. 118 the results of a study comparing the immune response to pha beads for vaccine application produced in l. lactis and e. coli support this hypothesis, since no higher inflammation was observed for e. coli-produced particles. 26, 29 however, this might be due to pamps, present in both gram-positive and gram-negative bacteria, that induce a similar immune reaction. in addition, the overall impact of functionalized pha nanobeads on the eukaryotic organism, including levels of ketone bodies and other possible secondary effects, is unknown. in vivo tracking of pha nanocarriers might give insight into environmentally triggered structural changes of nanoparticles and provide additional information about their localization and pathway. in a very different context, complexed phas (cphas) were discovered, representing a different type of pha structure. unlike bacterial phas, which play a major role in carbon and energy storage, the cphas found in mammalian cells are assumed to be involved in the regulation of various cell functions through modification of target molecules. 119 a complex of cpha with ca2+ and inorganic polyphosphate is involved in the formation of ion-conducting channels in mitochondrial membranes. 120 furthermore, cpha can interact with membrane proteins through hydrophobic and perhaps covalent interactions.
121, 122 it has been suggested that in the case of protein channels these interactions might play an important role in the regulation of channel function and selectivity. 123 previous studies indicate that cpha can be found in various subcellular compartments of eukaryotic organisms 124 as well as associated with specific proteins. 125, 126 although these structures are at a very early stage of investigation and have not yet been explored in depth, they offer great possibilities for functionalization and exploitation. additionally, they might provide critical information on pha metabolism, uptake and intracellular pathways in the eukaryotic cell, which is essential when dealing with functionalized pha nanobeads designed for biomedical application. besides natural polyesters such as pha, several synthetic polyesters have attracted considerable attention as materials for biomedical purposes due to their attractive properties (e.g., biocompatibility and biodegradability). currently the majority of synthetic polyester systems used in medicine are based on poly(lactic acid) (pla), poly(glycolic acid) (pga) and their copolymer poly(lactic-co-glycolic acid) (plga). this is mainly due to their well-described formulations and production methods, as well as their low toxicity and immunogenicity. even though such polyesters have been extensively used for resorbable sutures, bone implants, screws and other devices, 127 only a small number of commercially available products are designed for nanoparticle-based drug delivery. 128 nevertheless, synthetic polyesters such as plga have been thoroughly tested for this application (reviewed in 128, 129). synthetic polyesters are considered promising candidates for the development of nanoparticle delivery systems to release, target, uptake, retain, activate and localize drugs at the right time, place and dose.
130 although natural and synthetic polyesters share many common properties (e.g., biocompatibility and biodegradability), their specific characteristics can make one or the other more suitable depending on the application. the main characteristics of synthetic and natural polyesters that are significant for nanoparticle production and drug delivery systems are outlined in table 3. degradation of both synthetic and natural polyesters results in biologically compatible and metabolizable moieties. however, their degradation rates and patterns differ considerably. synthetic polyesters are suitable for sustained release due to their slow degradation rates. importantly, in the case of natural polyesters the drug release kinetics can be more easily controlled by conventional engineering of the pha matrix parameters to reach the desired degradation rates. for instance, scl-phas are crystalline and hydrophobic, but many pores are formed on the surface and the drugs are released quickly without any polymer degradation. mcl-pha copolymers, on the other hand, have a low melting point and low crystallinity and are therefore more suitable for drug delivery. plga has found many applications in the biomedical field, such as treatment of cancer, inflammatory diseases, cerebral diseases and cardiovascular disease, as well as in regenerative medicine, infection treatment, vaccination and many others. 128, 133 it has also been used for diagnostic purposes in magnetic resonance and cancer-targeted imaging 136, 137 and as an ultrasound contrast agent. 138 similarly, the good performance of phas in a variety of biomedical applications has been proven (table 1). nevertheless, the main advantages of synthetic plga over natural phas are its fda approval as a drug delivery platform and its lower production costs.
currently, the only fda-approved pha is poly(4-hydroxybutyrate), p(4hb), for suture application, which might open the possibility for other phas to be tested and considered for fda approval. this would significantly influence the development of pha-based drug delivery systems and broaden their application. at present, due to its wide availability on the market and its relatively low price, pla shows one of the highest potentials among polyesters, particularly for packaging and medical applications. for instance, cargill has developed processes that use corn and other feedstock to produce different pla grades (natureworks). 139 the company's current production is estimated at 140,000 tons/year, presently the highest production of any biodegradable polyester worldwide. its price is lower than 2 €/kg. 140 although the cost of production of phas is still quite high (3-5 €/kg), current advances in fermentation, extraction and purification technology, as well as the development of superior bacterial strains, are likely to lower the price of phas close to that of other biodegradable polymers such as polylactide and aliphatic polyesters. 141 engineering biomaterial nanobeads has attracted much attention from the research community. ongoing efforts to push the boundaries are reflected in the design of a wide range of nanostructured bacterial materials for innovative medicines. 1 apart from pha, biologically produced nanoparticles are highly diverse and omnipresent in prokaryotic (e.g., magnetosomes, storage particles) and also in eukaryotic (e.g., exosomes, lipoproteins) systems, providing grounds for the further development of bionanotechnology. 11 the smart pha nanoparticles described in this review show how these bacterial polymers, traditionally considered for industrial or conventional clinical applications, are progressively entering the most innovative biomedical fields as promising and highly flexible materials.
the fact that pha can be produced from inexpensive waste carbon sources has enhanced commercial interest in these polymers. on the other hand, interest in functionalized pha nanobead technology has been hampered by existing legislation on the endotoxin concentration allowed for biomedical application. 99 importantly, these technical hurdles were successfully surmounted by following an in vitro approach or by using certain gram-positive strains for in vivo functionalized bead assembly. nevertheless, to date phas are produced on a large scale exclusively using gram-negative bacteria. 4 for simplicity and cost control, the goal is to adapt the approach to a system in which maximal coverage of the pha granule surface with recombinant protein is achieved. different module swapping strategies and fine tuning have proven effective in reaching this goal. 8 to meet these challenges, new tendencies point toward multi-functionality. multi-functional beads would allow the design of a variety of biomedical systems with the unique advantage of adaptability, and thus of responding to current trends in biomedicine. pha nanoparticles allow multifunctional tuning because a variety of gaps, as well as both their n- and c-terminal domains, can be used to immobilize diverse proteins simultaneously. nevertheless, many nanotoxicological tests of their safety have to be performed before they can overtake the current stage of synthetic polyesters. aside from fda approval for biomedical applications, the production costs should be reduced.

table 3 comparison of synthetic and natural polyesters production, processing, properties and application.
synthetic polyesters | bacterial polyesters (pha) | ref.
bio-production of la and chemical synthesis of pla, plga | completely biosynthesized | 4, 96, 131
no possibility of in vivo production and functionalization | in vivo functionalization; one-step production of active agent and carrier, no need to produce, purify and conjugate active agent | 26, 54, 131
use of harsh chemicals for production | production from renewable sources | 4, 132
difficulty to scale-up | similar to bioprocesses for pha production; certain difficulties to scale-up | 132, 133
production cost comparable with conventional plastics like pet | high cost of production; at least twice that of pla | 4, 131
high risk due to flammable and toxic solvents | low risk level | 132
production completed within days | production duration 1-2 weeks | 132
endotoxin contamination less probable due to synthetic origin | endotoxins can be efficiently removed; use of gram+ strains allows endotoxin free production | 20
properties
lower number of copolymers that can be produced; only d- and l-lactic acids (la) | more than 150 monomeric building blocks for polymer design | 4, 131
approved by fda and european medicine agency as drug delivery system | not approved by fda as drug delivery system | 131, 133, 3
low drug loading | no limitations regarding drug loading | 32, 131, 133
protection of drug from degradation | protection of drug from degradation | 133, 3, 134
biodegradable, biocompatible, low cytotoxicity | biodegradable, biocompatible, low cytotoxicity | 30, 32, 96, 3
material properties poor, could be adjusted by regulating d- and l-la ratios | good thermomechanical properties from brittle, flexible to elastic, fully controllable, easy processability | 4, 30, 96, 135
degradation rate can be controlled | degradation rate can be controlled | 130, 3
drug delivery kinetics can be controlled | drug delivery kinetics can be controlled | 32, 130
easy particle size control | size of in vitro produced particles might be controlled, in vivo production limits control over particle size | 30, 32, 34, 134
application
wide variety of biomedical applications | applicable to a range of diseases | 26, 133
lowering ph at the site of implantation that might lead to sterile sepsis | no detected side effect of pha degradation | 130, 131
best chance for clinical application due to fda approval. packaging, printing, coating, yet limited by tg of 65-75°c | almost all areas of conventional plastic industry, limited by current higher cost and availability | 4, 20, 131, 134

the big challenges that the pha industry has to overcome 132 for the successful commercialization of pha nanobeads are: i) reduction of production costs; ii) construction of functional pha production strains to precisely control the structure of pha molecules, increasing the consistency of structure and properties to reach the level of competing synthetic polymers; iii) reaching the simplicity of synthetic polymer processing; iv) use of alternative renewable sources for production to avoid the use of expensive glucose; v) development of high value-added applications.
nanostructured bacterial materials for innovative medicines engineering bacteria to manufacture functionalized polyester beads biodegradable nanoparticles for drug and gene delivery to cells and tissue a microbial polyhydroxyalkanoates (pha) based bio-and materials industry second-generation functionalized medium-chain-length polyhydroxyalkanoates: the gateway to high-value bioplastic applications phacos, a functionalized bacterial polyester with bactericidal activity against methicillin-resistant staphylococcus aureus in vivo immobilization of fusion proteins on bioplastics by the novel tag biof swapping of phasin modules to optimize the in vivo immobilization of proteins to medium-chain-length polyhydroxyalkanoate granules in pseudomonas putida nucleoid-associated phaf phasin drives intracellular location and segregation of polyhydroxyalkanoate granules in pseudomonas putida kt2442 new insights in formation of polyhydroxyalkanoate (pha) granules (carbonosomes) and novel functions of
poly(3-hydroxybutyrate) (phb) biological nanoparticles and their influence on organisms polyhydroxyalkanoate granules are complex subcellular organelles (carbonosomes) zz polyester beads: an efficient and simple method for purifying igg from mouse hybridoma supernatants in vivo enzyme immobilization by use of engineered polyhydroxyalkanoate synthase in vivo immobilization of dhydantoinase in escherichia coli new tool for spreading proteins to the environment: cry1ab toxin immobilized to bioplastics immobilization of organophosphohydrolase opda from agrobacterium radiobacter by overproduction at the surface of polyester inclusions inside engineered escherichia coli recombinant escherichia coli produces tailor-made biopolyester granules for applications in fluorescence activated cell sorting: functional display of the mouse interleukin-2 and myelin oligodendrocyte glycoprotein bacterial polyester inclusions engineered to display vaccine candidate antigens for use as a novel class of safe and efficient vaccine delivery agents bacterial polyhydroxyalkanoate granules: biogenesis, structure, and potential use as nano-/micro-beads in biotechnological and biomedical applications protein engineering towards biotechnological production of bifunctional polyester beads selective immobilization of fusion proteins on poly(hydroxyalkanoate) microbeads new skin test for detection of bovine tuberculosis on the basis of antigendisplaying polyester inclusions produced by recombinant escherichia coli in vivo production of scfv-displaying biopolymer beads using a self-assembly-promoting fusion partner production of functionalized biopolyester granules by recombinant lactococcus lactis novel particulate vaccines utilizing polyester nanoparticles (bio-beads) for protection against mycobacterium bovis infection-a review vaccines displaying mycobacterial proteins on biopolyester beads stimulate cellular immunity and induce protection against tuberculosis polymeric particles in vaccine 
delivery production of a particulate hepatitis c vaccine candidate by an engineered lactococcus lactis strain a specific drug targeting system based on polyhydroxyalkanoate granule binding protein phap fused with targeted cell ligands tumor-specific hybrid polyhydroxybutyrate nanoparticle: surface modification of nanoparticle by enzymatically synthesized functional block copolymer application of polyhydroxyalkanoates nanoparticles as intracellular sustained drug-release vectors rifampicin carrying polyhydroxybutyrate microspheres as a potential chemoembolization agent preparation and characterization of triamcinolone acetonide-loaded poly(3-hydroxybutyrate-co-3-hydroxyhexanoate) (phbhx) microspheres fate and effect of ccnu-loaded microspheres made of poly(d, l)lactide (pla) or poly-β-hydroxybutyrate (phb) in mice controlled production of poly (3-hydroxybutyrate-co-3-hydroxyhexanoate) (phbhhx) nanoparticles for targeted and sustained drug delivery the use of polymeric microcarriers loaded with anti-inflammatory substances in the therapy of experimental skin wounds preparation and characterization of poly(3-hydroxybutyrate-co-3-hydroxyhexanoate) (phbhhx) based nanoparticles for targeted cancer therapy sustained pdgf-bb release from phbhhx loaded nanoparticles in 3d hydrogel/stem cell model poly(3-hydroxybutyrate-co-r-3-hydroxyhexanoate) nanoparticles with polyethylenimine coat as simple, safe, and versatile vehicles for cell targeting: population characteristics, cell uptake, and intracellular trafficking in vivo monitoring of pha granule formation using gfp-labeled pha synthases the inherent property of polyhydroxyalkanoate synthase to form spherical pha granules at the cell poles: the core region is required for polar localization multifunctional inorganicbinding beads self-assembled inside engineered bacteria tolerance of the ralstonia eutropha class i polyhydroxyalkanoate synthase for translational fusions to its c terminus reveals a new mode of functional 
display recombinant escherichia coli strain produces a zz domain displaying biopolyester granules suitable for immunoglobulin g purification protein engineering of streptavidin for in vivo assembly of streptavidin beads novel and economical purification of recombinant proteins: intein-mediated protein purification using in vivo polyhydroxybutyrate (phb) matrix association integrated recombinant protein expression and purification platform based on ralstonia eutropha a novel selfcleaving phasin tag for purification of recombinant proteins based on hydrophobic polyhydroxyalkanoate nanoparticles one-step production of immobilized alphaamylase in recombinant escherichia coli design of a single-chain multi-enzyme fusion protein establishing the polyhydroxybutyrate biosynthesis pathway self-cleaving fusion tags for recombinant protein production current trends in polyhydroxyalkanoates (phas) biosynthesis: insights from the recombinant escherichia coli biogenesis of microbial polyhydroxyalkanoate granules: a platform technology for the production of tailor-made bioparticles efficient recovery of low endotoxin medium-chain-length poly([r]-3-hydroxyalkanoate) from bacterial biomass bacterial polymers: biosynthesis, modifications and applications a genome-scale metabolic reconstruction of pseudomonas putida kt2440: ijn746 as a cell factory a new family of intrinsically disordered proteins: structural characterization of the major phasin phaf from pseudomonas putida kt2440 use of extracellular medium chain length polyhydroxyalkanoate depolymerase for targeted binding of proteins to artificial poly microarray of dna-protein complexes on poly-3-hydroxybutyrate surface for pathogen detection use of snares for the immobilization of poly-3-hydroxyalkanoate polymerase type ii of pseudomonas putida ca-3 in secretory vesicles of saccharomyces cerevisiae atcc 9763 bioengineering of bacterial polymer inclusions catalyzing the synthesis of n-acetylneuraminic acid phac and phar are 
required for polyhydroxyalkanoic acid synthase activity in bacillus megaterium caged protein nanoparticles for drug delivery in vivo enzyme immobilization by inclusion body display enzymatic synthesis of a drug delivery system based on polyhydroxyalkanoate-protein block copolymers recombinant protein production by in vivo polymer inclusion display expression of active recombinant human tissue-type plasminogen activator by using in vivo polyhydroxybutyrate granule display a method and kit for purification of recombinant proteins using self-cleaving protein intein molecular characterization of extracellular medium-chain-length poly(3-hydroxyalkanoate) depolymerase genes from pseudomonas alcaligenes strains pseudomonas: a model system in biology production of polyhydroxyalkanoates with high 3-hydroxydodecanoate monomer content by fadb and fada knockout mutant of pseudomonas putida kt2442 metabolic routing towards polyhydroxyalkanoic acid synthesis in recombinant escherichia coli (fadr): inhibition of fatty acid beta-oxidation by acrylic acid molecular cloning of polyhydroxyalkanoate synthesis operon from aeromonas hydrophila and its expression in escherichia coli molecular characterization and properties of (r)-specific enoyl-coa hydratases from pseudomonas aeruginosa: metabolic tools for synthesis of polyhydroxyalkanoates via fatty acid beta-oxidation expression of 3-ketoacyl-acyl carrier protein reductase (fabg) genes enhances production of polyhydroxyalkanoate copolymer from glucose in recombinant escherichia coli jm109 functional expression of the pha synthase gene phac1 from pseudomonas aeruginosa in escherichia coli results in poly(3-hydroxyalkanoate) synthesis protein folding and conformational stress in microbial cells producing recombinant proteins: a host comparative overview industrial biotechnology of pseudomonas putida and related species the metabolic response of p. 
putida kt2442 producing high levels of polyhydroxyalkanoate under single-and multiple-nutrientlimited growth: highlights from a multi-level omics approach in-silico-driven metabolic engineering of pseudomonas putida for enhanced production of poly-hydroxyalkanoates new insights on the reorganization of gene transcription in pseudomonas putida kt2440 at elevated pressure the polyhydroxyalkanoate metabolism controls carbon and energy spillage in pseudomonas putida the standard european vector architecture (seva): a coherent platform for the analysis and deployment of complex prokaryotic phenotypes controlled autolysis facilitates the polyhydroxyalkanoate recovery in pseudomonas putida kt2440 biochemical and genetic analysis of pha synthases and other proteins required for pha synthesis identification of a multifunctional protein, pham, that determines number, surface to volume ratio, subcellular localization and distribution to daughter cells of poly(3-hydroxybutyrate), phb, granules in ralstonia eutropha h16 pham is the physiological activator of poly(3-hydroxybutyrate) (phb) synthase (phac1) in ralstonia eutropha pha) hemeostasis: the role of pha synthase interaction between poly(3-hydroxybutyrate) granule-associated proteins as revealed by two-hybrid analysis and identification of a new phasin in ralstonia eutropha h16 phap is involved in the formation of a network on the surface of polyhydroxyalkanoate inclusions in cupriavidus necator h16 phaf, a polyhydroxyalkanoate-granule-associated protein of pseudomonas oleovorans gpo1 involved in the regulatory expression system for pha genes phb granules are attached to the nucleoid via pham in ralstonia eutropha binding of the major phasin, phap1, from ralstonia eutropha h16 to poly(3-hydroxybutyrate) granules endotoxemia and endotoxin shock: disease, diagnosis and therapy the application of polyhydroxyalkanoates as tissue engineering materials biomedical applications of polyhydroxyalkanoates: an overview of animal 
testing and in vivo responses removal of endotoxin during purification of poly(3-hydroxybutyrate) from gram-negative bacteria guideline on validation of the limulus amebocyte lysate test as an end-product endotoxin test for human an animal parenteral drugs, biological products, and medical devicesin: u.s. department of health and human services fada polyhydroxyalkanoates in gram-positive bacteria: insights from the genera bacillus and streptomyces pathogen associated molecular pattern motifs from gram-positive and gramnegative bacteria induce different inflammatory mediator profiles in equine blood the lipoteichoic acids and lipoglycans of gram-positive bacteria: a chemotaxonomic perspective structure and glycosylation of lipoteichoic acids in bacillus strains atypical lipoteichoic acids of gram-positive bacteria accumulation of a poly(hydroxyalkanoate) copolymer containing primarily 3-hydroxyvalerate from simple carbohydrate substrates by rhodococcus sp. ncimb 40126 accumulation and mobilization of storage lipids by rhodococcus opacus pd630 and rhodococcus ruber ncimb 40126 production of poly-d(-)-3-hydroxybutyrate and poly-d(-)-3-hydroxyvalerate by strains of alcaligenes latus endotoxin removing method based on lipopolysaccharide binding protein and polyhydroxyalkanoate binding protein phap treatment of diabetic ketoacidosis using normalization of blood 3-hydroxybutyrate concentration as the endpoint of emergency management. 
a randomized controlled study transmembrane ion transport by polyphosphate/poly-(r)-3-hydroxybutyrate complexes chondrogenic differentiation of human bone marrow mesenchymal stem cells on polyhydroxyalkanoate (pha) scaffolds coated with pha granule binding protein phap fused with rgd peptide enhanced proliferation and differentiation of neural stem cells grown on pha films coated with recombinant fusion proteins the improvement of fibroblast growth on hydrophobic biopolyesters by coating with polyhydroxyalkanoate granule binding protein phap fused with cell adhesion motif rgd pathogen-like particles: biomimetic vaccine carriers engineered at the nanoscale ovalbumin peptide encapsulated in poly(d, l lactic-co-glycolic acid) microspheres is capable of inducing a t helper type 1 immune response nanoparticles and microparticles as vaccine-delivery systems of mice and not men: differences between mouse and human immunology animal models have little to teach us about type 1 diabetes: 1. in support of this proposal identification of the polyhydroxybutyrate granules in mammalian cultured cells a large, voltage-dependent channel, isolated from mitochondria by waterfree chloroform extraction low molecular weight complexed poly(3-hydroxybutyrate): a dynamic and versatile molecule in vivo posttranslational modification of e. coli histone-like protein h-ns and bovine histones by short-chain poly-(r)-3-hydroxybutyrate (cphb) insight into the selectivity and gating functions of streptomyces lividans kcsa poly-beta-hydroxybutyrate/calcium polyphosphate complexes in eukaryotic membranes isolation and 1h-nmr spectroscopic identification of poly(3-hydroxybutanoate) from prokaryotic and eukaryotic organisms. determination of the absolute configuration (r) of the monomeric unit 3-hydroxybutanoic acid from escherichia coli and spinach gomes me, reis rl. biodegradable polymers and composites in biomedical applications: from catgut to tissue engineering. part 1. 
available systems and their properties plga nanoparticles in drug delivery: the state of the art nanoparticles-a review poly(3-hydroxyalkanoate)s: diversification and biomedical applications. a state of the art review biodegradable polyhydroxyalkanoate implants for osteomyelitis therapy: in vitro antibiotic release polyhydroxyalkanoates, challenges and opportunities plga-based nanoparticles: an overview of biomedical applications biodegradable polymeric nanoparticles as drug delivery devices polyhydroxyalkanoates: biodegradable polymers with a range of applications multifunctional nanoparticles for photothermally controlled drug delivery and magnetic resonance imaging enhancement designed fabrication of a multifunctional polymer nanomedical platform for simultaneous cancer-targeted imaging and magnetically guided drug delivery current advances in research and clinical applications of plga-based nanotechnology nano-biocomposites: biodegradable polyester/nanoclay systems environmental silicate nano-biocomposites, green energy and technology production of polyhydroxyalkanoates: the future green materials of choice key: cord-266543-ng9zr299 authors: klebe, gerhard title: virtual ligand screening: strategies, perspectives and limitations date: 2006-06-20 journal: drug discov today doi: 10.1016/j.drudis.2006.05.012 sha: doc_id: 266543 cord_uid: ng9zr299 in contrast to high-throughput screening, in virtual ligand screening (vs), compounds are selected using computer programs to predict their binding to a target receptor. a key prerequisite is knowledge about the spatial and energetic criteria responsible for protein–ligand binding. the concepts and prerequisites to perform vs are summarized here, and explanations are sought for the enduring limitations of the technology. target selection, analysis and preparation are discussed, as well as considerations about the compilation of candidate ligand libraries. 
the tools and strategies of a vs campaign, and the accuracy of scoring and ranking of the results, are also considered. in the late 1980s and early 1990s, experimental high-throughput screening (hts) and combinatorial chemistry were aggressively developed to overcome the lead discovery bottleneck in drug development. using sophisticated large-scale automation, it was anticipated that this would generate an unprecedented number of novel leads, resulting in a substantial increase in novel drug entities launched to the market per year. however, in reality, the opposite was the case [1, 2] . frequently, the discovered hits could not be validated and further optimized into actual leads and preclinical candidates. thus, the initial euphoria surrounding these approaches has subsided owing to the disappointingly low hit rates and significant costs involved [3, 4] . such situations fuel the consideration and development of alternative techniques. the expression 'virtual screening' (vs) was coined in the late 1990s; however, the techniques involved are much older. in an effort to show that searching for lead candidates using a computer is a serious alternative to hts, the term 'vs' was adopted by the community. in contrast to hts, which is largely phenomenological and technology driven, in vs, compounds are selected by predicting their binding to a macromolecular target using computer programs (in drug discovery, the term 'target' or 'receptor' is used frequently to describe the macromolecule to which a drug binds, which is usually a protein but can also be dna or rna). the compounds studied do not necessarily exist, and their 'testing' does not consume valuable substance material. experimental deficiencies, such as limited solubility, aggregate formation or any sort of influence that could possibly interfere with experimentally applied assay conditions do not need to be considered in the initial computational screen. 
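the idea of selecting compounds by predicted binding, and of comparing the resulting hit rate with that of a random screen, can be illustrated with a short python sketch. it is a minimal illustration and not a method taken from the studies cited here: the function name enrichment_factor, the convention that a lower score means stronger predicted binding, and the example scores and activity labels are all hypothetical.

```python
from typing import Sequence


def enrichment_factor(
    scores: Sequence[float],
    is_active: Sequence[bool],
    top_fraction: float = 0.1,
) -> float:
    """Hit rate in the best-scored fraction divided by the overall hit rate.

    Assumption (hypothetical convention): lower score = better predicted binding.
    """
    if len(scores) != len(is_active) or len(scores) == 0:
        raise ValueError("scores and is_active must be non-empty and equal in length")
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must lie in (0, 1]")

    n_total = len(scores)
    n_active = sum(1 for flag in is_active if flag)
    if n_active == 0:
        raise ValueError("at least one active compound is needed")

    # Number of compounds that would be 'selected' for experimental testing.
    n_top = max(1, round(n_total * top_fraction))

    # Rank compound indices from best (lowest) to worst (highest) score.
    ranked = sorted(range(n_total), key=lambda i: scores[i])
    hits_in_top = sum(1 for i in ranked[:n_top] if is_active[i])

    hit_rate_selected = hits_in_top / n_top
    hit_rate_random = n_active / n_total  # expectation for random picking
    return hit_rate_selected / hit_rate_random


# Hypothetical ten-compound library: two actives, both ranked at the top.
example_scores = [-9.0, -8.5, -7.0, -6.5, -6.0, -5.5, -5.0, -4.5, -4.0, -3.5]
example_actives = [True, True] + [False] * 8
ef = enrichment_factor(example_scores, example_actives, top_fraction=0.2)
```

an enrichment factor above 1 means the top-ranked fraction contains proportionally more actives than the library as a whole, which is the hit rate expected from random selection.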
in contrast to hts, vs requires as a key prerequisite knowledge about the spatial and energetic criteria responsible for the binding of a particular candidate ligand to the receptor under investigation. in consequence, either the three-dimensional (3d) structure of the macromolecular target (as given by crystal structure analyses, nmr or sophisticated homology modelling) or, at the very least, a rigid reference ligand with a known bioactive conformation mapping out the putative receptor binding site must be available [5]. this defines vs as a knowledge-driven approach. even though it is likely that multiple screening campaigns have been performed in parallel in industry [6] [7] [8] [9], it was only recently that mcmaster university launched an open and unbiased competitive screening against escherichia coli dihydrofolate reductase to detect inhibitors [10], to assess how well vs can enrich candidate ligands compared with hts random screening [10, 11]. the original sample set of compounds used for screening was split into two, and one fraction was tested by hts. obtained hits were reported and served as a training set to validate and tailor applied vs tools in different research groups, who entered into a competition to retrieve hits also detected by the experimental screen in the second portion of the screening sample [12] [13] [14] [15] [16]. subsequently, in a totally unbiased fashion, the second part of the data sample was evaluated by hts and vs in parallel. unexpectedly, the second portion did not show any hits as competitive inhibitors, as would be expected for ligands docked into the substrate binding site of a target protein. interestingly, brenk et al. [16] reported on some promising hits found by docking that were not detected as inhibitors in the initial hts. retesting at higher concentration indicated weak inhibition.
this observation could be seen as supporting the view that vs and hts are complementary approaches and that each might find candidates missed by the other [7, 9]. as mentioned above, the methods involved in vs are much older than the approach itself. initial attempts to find ligands by docking or by mapping them onto ligand-based pharmacophore models were the generic prototypes of a vs approach. however, this term was not used at that time, probably because computers and algorithms were not fast enough to enable large scale applications. focusing on approaches that actually make use of an available protein structure, one of the first systems to be studied by docking was hiv protease. initial versions of the program dock, developed over many years by kuntz's group [17, 18], tried to dock rigid entries from the cambridge crystallographic database into the protein receptor, focusing primarily on shape complementarity and later considering chemical complementarity. in 1990, the kuntz group retrieved the neuroleptic drug haloperidol as a potential 'lead' from a docking screen using a database of known drug molecules as input. however, this compound would have had to be administered at a very high dose, far beyond a toxicologically tolerable concentration, to be an effective inhibitor of the protease [19]. nevertheless, haloperidol as a 'lead' gave rise to some new ideas for developing a derivative possessing 15 μM inhibition of hiv protease [20]. later, at dupont-merck, a 3d database search retrieved a substituted terphenyl derivative as a putative lead for inhibiting this protease. further optimization via six- and seven-membered rings resulted in the class of cyclic ureas that are able to replace the crucial structural water molecule in the protease, at the same time targeting the carboxy groups of the two catalytic aspartates via two appropriately placed hydroxy functionalities [21].
since these early vs attempts, a plethora of case studies has been performed and the list of success stories is steadily growing. despite the fact that vs is still a young discipline, it has been reviewed frequently [22] [23] [24] [25] [26] [27] [28] [29] [30] , most recently in a comprehensive overview by kubinyi [31] . if one excludes purely retrospective studies, in which the potential of a method is demonstrated by its ability to enrich putatively active molecules from a sample of anticipated nonactive ones, 50 targets have been studied to date, and reports on the discovery of mostly micromolar binding ligands in a truly predictive fashion are available (table 1) . still under development, and far from mature, the number of strategies followed in vs is nearly as large as the number of reported screening campaigns. owing to space constraints, this review is unable to give a comprehensive overview of success stories or investigated targets, or to review all details of currently applied vs protocols; accordingly, the reader is recommended to consult the abovementioned surveys [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] . instead, this review attempts to summarize and comment on the agreed concepts and prerequisites for performing successful vs runs. furthermore, it attempts to analyse the limitations of the approach in a frank and plain manner and seeks to explain why such limitations still exist. accordingly, in the following sections, first the target selection, analysis and preparation are discussed, followed by considerations about the compilation of a candidate ligand screening sample. this is followed by a discussion of the actual requirements, tools and strategies of a vs campaign, and, finally, remarks concerning the accuracy of the scoring and ranking of the screening results. being a knowledge-driven approach, the scope of vs strongly depends on the amount and quality of information available about the system under investigation. 
clearly, the availability of the target receptor is of great benefit compared with situations where only a rigid reference ligand is known. the target receptor could be any macromolecular biomolecule, either a protein, rna or dna. in terms of the rigid reference ligand, either a substrate, inhibitor molecule, agonist or antagonist (e.g. a steroid hormone) could serve. because the structures of target receptors are becoming increasingly available, this review concentrates solely on such situations. however, it does not mean that vs cannot be applied to situations in which no structural information about the target receptor is available [5]. with respect to dna and rna as target receptors, the development of appropriate docking and scoring tools has been initiated only recently, and, accordingly, vs applications on such targets are still rare [32, 33]. by contrast, the scope of applications addressing proteins ranges from enzymes to g-protein-coupled receptors and ion channels (table 1).

assessing the druggability of the target receptor

the first issue to be considered is the druggability of the selected target. does the selected protein exhibit a binding pocket that can be successfully targeted by small molecule ligands? clearly, characteristics such as pocket size and geometry, surface complexity and roughness, exposure of recognition properties and their complementarity in shape and polarity with respect to a putative druglike ligand, are of importance. because such criteria are not immediately available, a pragmatic, but hardly generally applicable, approach would be to correlate gene families. if one member of such a family is able to bind a drug, other members might also be able to bind druglike ligands with related physicochemical properties [34]. this assumption is based simply on the fact that members of a gene family usually operate on related substrates or recognize similar endogenous ligands.
however, the setup for a vs study requires more conclusive information about the actual binding-site architecture. recently, hajduk et al. [35] suggested several very decisive indices that help to discriminate druggable from nondruggable binding pockets. for druggability, the total surface area and a portion of a polar contact area below 75 Å² appear to be beneficial, along with an appropriate pocket compactness, surface roughness and complexity. this suggests that there is an optimal size and composition of a protein-binding pocket that is best suited to recognize and accommodate small organic ligands. however, the analysis by hajduk et al. also showed that no single index consistently dominates the correlation. this fact points to the complexity of the interrelationships among the various discriminatory indices and explains why, at present, our concepts for predicting druggability are still rudimentary.

selection of the most relevant geometry of the target receptor

the selection of an appropriate 3d geometry for the target is another important issue when setting up a vs run. the most powerful method for learning about the spatial structure of proteins is crystal structure analysis. the accuracy and reliability of this method strongly depends on the resolution of the diffraction data. an alternative experimental technique to determine protein structures is nmr; there the accuracy of the structural determination depends strongly on the local distribution of the nuclear overhauser effect distance information along the protein chain (this effect involves transfer of magnetization through space and gives information about distances between nuclei in a 3d structure). furthermore, in the past, significant methodological enhancements in homology modelling have improved the quality of protein models [36] [37] [38]; accordingly, several successful vs screening campaigns have been reported based on model-built protein structures [39] [40] [41].
homology modelling strongly depends on the availability of related proteins for which a crystal structure has been determined. the model will be particularly precise in those areas where the homology with the experimentally determined references is high, which is usually the case in the conserved core regions. however, functional binding sites, to be addressed by vs, are generally located in loop regions where, even among homologues of a gene family, significant differences are experienced (apart from catalytic residues in enzymes that are spatially highly conserved). to enhance the accuracy of model-built structures in regions where the binding sites are found, new algorithms have been suggested that consider the putative binding orientation of known ligands during the homology modelling process. these algorithms feed the information about the binding properties of possibly bound ligands back into the homology building process. improved binding site geometries can be expected from such approaches [42, 43] . another obstacle that complicates vs attempts is molecular flexibility. ligands and proteins possess internal degrees of freedom and can adopt various conformational states. with respect to the target protein, several methods have been described to simulate flexibility [44] . for vs, it is important to obtain an estimate of the protein conformers competent at accommodating a ligand. once these conformers have been determined, a vs run can either be performed by considering the flexibility of the protein instantly during the calculation, or by addressing an ensemble of several rigid receptor conformations [45] [46] [47] [48] . ligand-binding competent conformers can either be sampled by exhaustive conformational searches (e.g. using molecular dynamics simulations [49, 50] ) or by examining multiple conformational states, observed in crystal structures with different ligands bound, to obtain insight into the relevant conformations [51, 52] . 
the latter approach appears tempting because it can be assumed that a sample of experimentally observed protein conformers corresponds to stable, low free-energy states. even though the crystal structure is usually considered as the 'gold standard' for learning about the geometry of a protein, it is highly questionable how representative a single structure determination really is. mcgovern and shoichet [53] reported on increasing information decay, irrespective of whether the geometry of a ligand-bound or ligand-free crystal structure was used or a model-built structure was considered. principally, crystalline protein-ligand complexes can be prepared by exposing preformed crystals of a protein to the solution of a ligand, which can then diffuse into the crystals and find its way to the binding site. this process is called 'soaking'. alternatively, in a cocrystallization experiment, the protein-ligand complex is formed by equilibrating them in solution, and then the assembled complex is crystallized. we recently observed, depending on the soaking protocol applied, different conformational states of the protein aldose reductase complexed by the inhibitor zopolrestat [54]. in one structure, the protein forms a hydrogen bond with the ligand via one of its binding site-exposed amide bonds (figure 1). in a second structure, the amide bond is rotated off from the binding site, and the same ligand can no longer form this hydrogen bond. such differences have a significant impact on docking and scoring results in vs. depending on the crystallization conditions, ligands have been observed to accommodate a binding pocket with reversed orientation [55]; in other cases, ligand-induced conformational adaptations of the protein are observed [56] (figure 2). increasing the concentration of a ligand in the soaking buffer will enhance the opportunity to accommodate it in the binding pocket.
however, it is likely that soaking depicts some kind of kinetic trap for ligands; consequently, surprising differences in cocrystallization have been observed (several ligands bound at one time, multiple binding modes [57]). before selecting a particular crystal structure as a reference for a vs run, detailed analysis of parameters such as population of the bound ligand, b-factors next to the binding site (b-factors indicate thermal motion in a crystal structure; however, they are highly correlated with the population of a ligand in the crystal) or consistency of the hydrogen bond network is advisable. in addition, most programs used for the actual computer screens require properly defined protonation states of the active-site residues. this is by no means easy to perform because local dielectric conditions can modulate pKa values of functional groups by several orders of magnitude. even more complex and, at present, difficult to calculate are pKa shifts of residues during the course of ligand binding. isothermal titration calorimetry [58] measurements can make such changes apparent; these changes probably occur more frequently than we presently admit [59], and can easily turn an acceptor functional group into a donor or a charge-assisted hydrogen bond into a neutral one. these changes are of significance during various validation steps of a vs run. we have recently observed that even the protonation states of the residues in the binding pocket of inhibitor-bound aldose reductase can change depending on whether the cofactor nadph/nadp+ is present in the oxidized or reduced state (figure 3) (o. krämer and g. klebe, unpublished results).

drug discovery today volume 11, numbers 13/14 july 2006 reviews

figure 1. binding mode differs with soaking and co-crystallization conditions. crystal structure of aldose reductase with the inhibitor zopolrestat is shown. the crystal structure determined after one day of soaking (yellow) differs from the structure obtained after six days of soaking (magenta) but is identical to that obtained by cocrystallization (light blue). whereas in the one day-soaked structure (yellow) the amide bond between ala299 and leu300 orientates its nh group towards the inhibitor to form a hydrogen bond (dotted yellow line) with the bound ligand, in the latter two structures (magenta and light blue) this amide bond rotates away from the inhibitor and no such hydrogen bond is observed [54].

figure 2. induced-fit adaptations in two different crystal forms. two crystal structures of benzamidine (in both cases, accommodated deeply buried in the s1 pocket, which is indicated as a deep depression in the left- and right-hand images) bound to a trypsin mutant exhibiting a binding pocket related to factor xa. both crystal structures differ in terms of the conformational state of phe174 (in the left- and right-hand images at the left rim of the binding pocket with the red surface), which is exposed ('up') in one structure (orange and on the left) and buried ('down') in the other (green and on the right). this adaptation involves partial unwinding of a short three-turn helix (in the centre, helix at the left-hand side of the binding pocket) and rearrangement of a disulfide bond (in the centre, yellow bond to the back) [56, 60].

with respect to protein flexibility, the decision has to be taken as to whether a vs run should consider one single protein conformer or multiple binding-competent states of the receptor. different strategies with respect to docking and scoring will be the consequence (see below). experience shows that conformational adaptations observed in a protein-binding pocket are usually related to the structural modulations potentially needed by the protein to accomplish its functional role [51].
such functional adaptations must correspond to low-energy transformations, otherwise dramatic shortcomings in the functional performance of the protein would be the consequence. accordingly, enzymes that operate on a large palette of structurally diverse substrates (cf. aldose reductase and short chain dehydrogenases) or perform substantial conformational adaptations during catalysis (cf. kinases) will experience multiple binding-competent states that have to be considered in vs. thus, a profound understanding of the functional properties of a protein might be the best option for predicting the conformational adaptability of a protein. such information is not usually available at the beginning of a drug development project, when vs is used. because proteins occur in gene families of closely related members, the analysis across different entries of the family might provide some insight into the flexibility properties inherently shared by the members of the family. however, it might also be that some family members display rigid solutions, as required by their function (e.g. trypsin and factor xa among the serine proteases), whereas others possess pronounced flexible behaviour, as appropriate to their function (e.g. the serine protease factor viia) [60].

figure 3. protonation states can change upon ligand binding. depending on the oxidation state of the bound cofactor nadph (left) and nadp+ (right), aldose reductase binds to a carboxylate-type inhibitor (e.g. idd594) by placing its acid functional group (red) into the catalytic centre; the complex thereby formed remains either in an unchanged protonation state (left) or picks up protons upon binding (right). in the centre, the crystal structure of the complex is shown. a short contact distance between the carboxylate function of the inhibitor (shown with a yellow surface) and the nicotinamide portion of the cofactor (shown with an atom-type coloured surface, where oxygen is red, nitrogen is blue, carbon is white, sulfur is yellow and phosphorus is orange) is formed. high resolution x-ray crystal structure analysis and neutron diffraction have provided evidence that the inhibitor binds with its carboxylate function in the deprotonated state, and the active site his is present in the neutral state. it remains unresolved where the proton goes in the case of the nadp+-bound complex, or whether the residues of the catalytic centre exhibit deviating protonation states in the uncomplexed and inhibited situation. with respect to vs, a unique and precise assignment of the protonation states of the ligand and protein functional groups in a pKa range between 3 and 11 is essential because, for example, for docking it is important whether such a group is considered as donor or acceptor of a hydrogen bond (krämer and klebe, unpublished).

after the reliability and relevance of the protein conformer(s) of the selected target protein have been assessed, its binding-site properties should be mapped before blindly starting a vs run (see below). this strategy helps the user to evaluate the properties of the target protein better and subsequently to examine the relevance of docking solutions suggested by vs. several tools have been described to elucidate the 'hot spots' of binding in a particular binding pocket [61]. these methods are either based on thoroughly parameterized force fields [62] or on well-selected empirical information (superstar [63], drugscore [64]; figure 4). most importantly, this analysis has to be performed on all multiple conformational states of a binding pocket because this will provide a composite picture of how the molecular recognition properties of a binding site might change upon protein adaptation.
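the sensitivity of protonation-state assignment to pKa shifts, discussed above, can be illustrated with the henderson-hasselbalch relation. the following is a minimal sketch, not a method used in this review: the function names, the assumed ph of 7.4 and the two pKa values are illustrative assumptions, and real binding-site pKa values would have to come from calculation or experiment.

```python
def protonated_fraction(pka: float, ph: float = 7.4) -> float:
    """Fraction of a titratable group in its protonated form.

    Henderson-Hasselbalch: [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    The default pH of 7.4 is an assumption (roughly physiological).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def dominant_state(pka: float, ph: float = 7.4) -> str:
    """Name the state that holds at least half of the population."""
    return "protonated" if protonated_fraction(pka, ph) >= 0.5 else "deprotonated"


# Hypothetical example: a carboxylic acid with a solution pKa of 4.0 whose
# pKa is shifted to 8.0 by the local environment of a binding pocket.
for label, pka in (("solution", 4.0), ("shifted", 8.0)):
    print(label, pka, round(protonated_fraction(pka), 4), dominant_state(pka))
```

in this hypothetical case a shift of four pKa units turns a predominantly deprotonated acceptor group into a predominantly protonated donor group, which is exactly the kind of change a docking program has to be told about in advance.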
water, the nasty third binding partner in protein-ligand complexes

finally, a very crucial decision with respect to the setup of the protein reference for a vs run concerns the consideration of water molecules in a binding site. the analysis of several thousand crystal structures of ligand-protein complexes using the waterbase module in relibase [65, 66] revealed that, in about two-thirds of all cases, a water molecule is involved in ligand binding, frequently mediating contacts between protein and ligand. thus, any approach based on the prediction of binding modes of putative candidate ligands, as required in vs, must take water into consideration. one possible approach for learning about water molecules tightly bound to the protein is the analysis of crystallographic data with respect to the repeated occurrence of water molecules in structurally related binding sites (e.g. in a gene family) or multiple structural determinations of the same protein with many distinct ligands [66]. if water is present in all structures analysed at more or less the same location, this is a strong indication that this water molecule is tightly bound, and in a vs run could be considered as an integral part of the target structure.

as with hts, vs needs a thoughtfully designed and thoroughly compiled sample of small molecule candidate ligands for screening. pharmaceutical companies will primarily screen their own proprietary compound collections, with the advantage that detected hits will be exclusive and will cover molecules for which the synthesis is well-established at their site. however, are such sample pools biased? how well is the available chemical space covered? these considerations have led large pharma companies to complement their in-house collections with compounds offered through commercial suppliers.
starting with the available chemicals directory as the initial prototype (see mdl, http://www.mdli.com), there are currently more than 10 million unique purchasable compounds on offer [67]. how well do they cover chemical space, and which portion of this space represents druglike molecules? some dramatic figures have been proposed concerning the number of organic molecules that can be considered as druglike. usually, molecules meet these criteria if they are composed only of the elements h, c, n, o, p, s, cl and br, and possess a molecular weight <500 da [68]. how diverse should the entries of a compound database used for screening be? what physicochemical properties have to be met by the candidates to guarantee sufficient bioavailability? the concept of a well-balanced and homogeneously populated space of diverse druglike molecules appears very tempting; however, there is no proper definition of what descriptors to use as coordinates on the axes of such a compound space. what is 'diverse' in this context? as with the criterion 'similarity', the expression 'diversity' is a relative measure that relates to a reference point. in vs, the reference point is the target receptor and the difference in binding affinity of two ligands with respect to this structure. this defines whether two molecules should be classified as 'similar' or 'diverse'. it can be imagined that, for one target, a correctly placed methyl group on a ligand might have a dramatic effect on binding affinity, whereas for another it will be tolerated with no change in binding affinity. in the first case, the ligands with and without a methyl group would have to be termed 'diverse', whereas in the second, they will be called 'similar'. however, consulting a 'diversity' scoring that concentrates solely on the topology of a ligand, the methylated and unsubstituted derivatives will probably be ranked as 'highly similar'.

figure 4. mapping the hot spots of binding. hotspot analysis using drugscore for the binding pocket of t-rna guanine transglycosylase (surface of the binding pocket indicated in white). regions energetically favourable for the binding of a hydrogen bond donor group (represented by an nh group) or an acceptor group (represented by a c=o group), or a hydrophobic molecular portion (represented by aromatic carbon atoms, c.ar) have been analysed and contoured on three subsequent levels (indicated using three different colours) above the deepest energy minimum found for each atom type in the maps [88]. the crystallographically determined binding mode of an inhibitor is shown superimposed.

is there an optimal size for candidate ligands to be submitted to screening? in the past, combinatorial chemistry in particular has enabled pharmaceutical companies to develop screening libraries of several million candidate molecules; thus, in principle, the sheer number of test compounds is no longer an issue. although such libraries are large in total count, their individual members are usually also large in terms of size and molecular weight. they are mostly in the range of typical drug-size molecules, having been synthesized in other drug development projects. however, if they turn out to be a micromolar screening hit, they still require optimization to improve their affinity towards the target protein by two or three orders of magnitude. this has to be accomplished while maintaining reasonable molecular size and appropriate absorption, distribution, metabolism and excretion properties. in consequence, it involves stripping down to a scarcely decorated core skeleton while subsequently building up this core again with novel well-tailored side chains. this is a challenging task, and is often difficult to achieve. experience from drug optimization programs has shown that small core fragments ('privileged templates' or 'fragments') known to bind with significant affinity are ideal starting points for further optimization [69, 70].
accordingly, the compounds selected for vs should leave some room for optimization, thus matching the range of so-called leadlike or fragment-like molecules [71]. verdonk et al. [72] have presented a very insightful discussion on approaches to be taken when constructing databases for vs, and have suggested that the methods involved should be validated (e.g. by assessing the enrichment rates of known binders). for this purpose, in a vs campaign, known binders are pooled with the sample set of screening candidates to assess how well the known binders 'enrich' during the course of the screening process. such considerations are important for the development of new methods; however, in an actual drug development project, at the end of the day, the medicinal chemist will ask for novel leads that are worthwhile pursuing with synthesis and optimization. impressive enrichment rates of known actives will not be convincing. in light of these considerations, what is a suitable compound collection? some general criteria have to be matched either by in-house proprietary compounds or by substances offered by commercial suppliers. compounds can be validated with respect to drug likeness considering lipinski's rule of five, or lead likeness applying more stringent criteria (as defined by the rule of three) [73] [74] [75]. recently, martin tested lipinski's frequently applied rule with respect to experimentally determined bioavailability in the rat [76]. this study showed that the rule of five has predictive ability for neutral and positively charged molecules; however, anions obey different rules, and the size of their polar surface area confers some predictive power. similarly to the considerations concerning the druggability of binding sites, the criteria for bioavailability are multifactorial. irwin and shoichet took the initiative to set up the freely available database zinc with validated compounds for vs [67].
it is built from 2d compound information; 3d coordinates are generated and, where possible, stereo- and regioisomeric ambiguities are resolved. multiple states with respect to protonation, charges and tautomers are enumerated. however, as described for proteins, the properties of compounds might be altered upon protein binding. insoluble, reactive and aggregating compounds are not represented in the database. the rule of five is a straightforward filter to discard compounds with putatively undesired properties from the screening sample. depending on the strategy pursued in the subsequent vs campaign, multiple conformers can be precalculated and stored as separate entries in the database. limiting the search sample to purchasable compounds in vs is a pragmatic approach because screening hypotheses can be tested rapidly [67, 77]. however, vs can also scan over virtual compound libraries, and synthesis can be postponed to a later stage, considering only the most promising hits. reymond's group attempted to generate a database of all possible organic molecules up to 160 da under the constraints of defined chemical stability and synthetic feasibility [78]. this database contains 13.9 million entries. it is possible that such a sample could be used for fragment screening. for larger compounds, exhaustive sampling of possible molecular skeletons will end up in a combinatorial explosion with too many possible solutions. however, proper design criteria, defined by the architecture of the binding site used in vs as the target, might guide the generation of target-tailored virtual libraries for vs. in particular, considering the criteria of combinatorial chemistry and parallel synthesis, such vs strategies can actually help to synthesize only the most promising entries of a large virtual combinatorial library. principally, two strategies can be followed in a vs campaign: forward or backward filtering of hits obtained by docking.
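the rule-of-five and rule-of-three filters mentioned above for compiling the screening sample can be sketched as follows. this is a minimal illustration under stated assumptions: the descriptor values are taken as precomputed, the class and function names are hypothetical, the thresholds are the commonly quoted ones (molecular weight 500 da, log P 5, 5 donors, 10 acceptors; and 300 da, 3, 3, 3 for the rule of three), and the tolerance of one violation is a frequent convention rather than something stated in this review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties of one compound (values are assumed inputs)."""
    mol_weight: float  # molecular weight in Da
    log_p: float       # logarithm of the octanol/water partition coefficient
    h_donors: int      # number of hydrogen bond donors
    h_acceptors: int   # number of hydrogen bond acceptors


def rule_violations(d: Descriptors, max_mw: float, max_logp: float,
                    max_donors: int, max_acceptors: int) -> int:
    """Count how many of the four thresholds the compound exceeds."""
    return sum([
        d.mol_weight > max_mw,
        d.log_p > max_logp,
        d.h_donors > max_donors,
        d.h_acceptors > max_acceptors,
    ])


def passes_rule_of_five(d: Descriptors, allowed_violations: int = 1) -> bool:
    """Drug-likeness filter; tolerating one violation is an assumed convention."""
    return rule_violations(d, 500.0, 5.0, 5, 10) <= allowed_violations


def passes_rule_of_three(d: Descriptors) -> bool:
    """Stricter lead- or fragment-likeness filter with no violations allowed."""
    return rule_violations(d, 300.0, 3.0, 3, 3) == 0
```

such a filter only discards compounds with putatively undesired properties; as noted above for anions, passing it is no guarantee of bioavailability.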
the most crucial step in vs is the docking of candidate molecules to the target protein [79] [80] [81] [82] [83] [84] [85] [86]. in forward filtering, various criteria are used to reduce the initial data sample, which might comprise several millions of test compounds, to the several hundred or thousand most promising candidates to be docked [87]. in backward filtering, all entries from the data sample are docked to the target protein, and filter criteria are subsequently applied to rank the generated docking solutions. nowadays, the speed of computers is no longer the limiting factor in selecting the strategy. the forward technique requires fewer computational resources. this is mainly because, at each hierarchical filtering step, a significant portion of the original data sample is discarded. however, with a decreasing number of compounds, the filtering becomes computationally increasingly demanding and sophisticated. flexible docking is the computationally most intensive step of all; thus, the fewer candidates to be considered here, the more effort can be spent in controlling, validating and assessing docking results. this is clearly an advantage of this strategy. forward filtering eliminates compounds initially according to simple descriptors such as molecular weight, number of rotatable bonds, lipophilicity (usually expressed by the logarithm of the partition coefficient, log P) or crude shape descriptors, such as the ellipticity of the overall structure. subsequently, information about the receptor's binding site is exploited. once a hotspot analysis of the most likely anchoring positions in the binding pocket has been performed, a protein-based pharmacophore can be derived [88]. this pharmacophore sets the constraints for the minimal requirement of functional groups to be matched by putative ligands (e.g. number of hydrogen bond donors, acceptors or hydrophobic groups).
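the first two stages of the forward filtering described above (simple descriptors, then the minimal functional-group counts demanded by a protein-based pharmacophore) can be sketched as a small funnel. this is an illustrative sketch only: the candidate fields, the stage functions and all cut-off values are hypothetical assumptions, and a real campaign would derive them from the target and follow them with pharmacophore matching in 3d and docking.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    """One library entry with precomputed descriptors (assumed inputs)."""
    name: str
    mol_weight: float
    rotatable_bonds: int
    log_p: float
    donors: int
    acceptors: int
    hydrophobic_groups: int


Stage = Callable[[Candidate], bool]


def simple_descriptor_stage(c: Candidate) -> bool:
    # Cheap first stage; the cut-offs are illustrative assumptions.
    return c.mol_weight <= 350.0 and c.rotatable_bonds <= 7 and c.log_p <= 4.0


def pharmacophore_count_stage(c: Candidate) -> bool:
    # Minimal functional-group counts demanded by a hypothetical
    # protein-based pharmacophore: 1 donor, 2 acceptors, 1 hydrophobic group.
    return c.donors >= 1 and c.acceptors >= 2 and c.hydrophobic_groups >= 1


def forward_filter(candidates: Iterable[Candidate],
                   stages: Sequence[Stage]) -> Tuple[List[Candidate], List[int]]:
    """Apply the stages in order; return survivors and the sample size after each stage."""
    survivors = list(candidates)
    counts = [len(survivors)]
    for stage in stages:
        survivors = [c for c in survivors if stage(c)]
        counts.append(len(survivors))
    return survivors, counts
```

recording the sample size after every stage mirrors the point made in the text that hierarchical filtering lets the user track the performance at each filter level; only the final survivors would be passed on to the computationally expensive docking step.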
molecules satisfying such criteria can be retrieved by any database engine capable of a functional group substructure search. once the topographical arrangement of the protein-based pharmacophore has been incorporated into the search and the remaining candidates are requested to match this pattern, the study can be further focused. unity (unity chemical information software, version 4.1, tripos) and catalyst [89] are prototypes of such database engines supporting this screening step. an alternative tool is featuretrees, which can retrieve molecules of similar topology in feature space [90] . in this context, 'features' are considered as being similar types of functional groups or molecular building blocks. similarity can be considered as a further filter criterion in vs. for example, if candidate molecules are requested to match with a predefined protein-based pharmacophore, this condition already involves a selection in terms of similarity criteria; however, they are regarded very generally. furthermore, information about known binders can be used for filtering, although the danger then exists that, owing to biased filters and preconceived concepts, some unexpected and novel chemistry is discarded during the early filtering steps. finally, docking is pursued, usually considering only 1-10% of the initial sample collection. an advantage of the forward filtering approach is that it enables more elaborate docking protocols to be performed -for example, taking multiple protein conformational states into account or reflecting a protein-based pharmacophore as a restraint in docking. most importantly, this hierarchical filtering strategy enables the tracking of the performance at the various filter levels by human intervention, and, in particular, visual inspection of docking solutions remains feasible. backwards filtering starts with high-throughput docking and analyses the generated docking modes as subsequent steps. 
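The hierarchical character of forward filtering, and the possibility of tracking performance at each filter level, can be sketched as follows. This is only an illustration under stated assumptions: the dictionary keys, the thresholds and the three stages are hypothetical, and a real pharmacophore match would test the spatial arrangement of groups rather than mere counts.

```python
def forward_filter(compounds, stages):
    """Apply (name, predicate) stages in order.

    Returns the surviving compounds and a log of how many entries
    remain after each stage, so every filter level can be inspected.
    """
    remaining = list(compounds)
    log = [("initial", len(remaining))]
    for name, keep in stages:
        remaining = [c for c in remaining if keep(c)]
        log.append((name, len(remaining)))
    return remaining, log


# Made-up descriptor records; keys and values are illustrative only.
sample = [
    {"id": "c1", "mw": 320, "rot_bonds": 4, "donors": 2, "acceptors": 3},
    {"id": "c2", "mw": 610, "rot_bonds": 5, "donors": 2, "acceptors": 4},
    {"id": "c3", "mw": 280, "rot_bonds": 12, "donors": 1, "acceptors": 2},
    {"id": "c4", "mw": 410, "rot_bonds": 6, "donors": 0, "acceptors": 5},
    {"id": "c5", "mw": 390, "rot_bonds": 3, "donors": 1, "acceptors": 2},
]

# Cheap descriptor filters first, the pharmacophore-derived requirement last.
stages = [
    ("molecular weight <= 500", lambda c: c["mw"] <= 500),
    ("rotatable bonds <= 10", lambda c: c["rot_bonds"] <= 10),
    ("pharmacophore: >= 1 donor, >= 2 acceptors",
     lambda c: c["donors"] >= 1 and c["acceptors"] >= 2),
]

to_dock, stage_log = forward_filter(sample, stages)
for name, count in stage_log:
    print(f"{name:45s} {count}")
```

Only the entries left in `to_dock` would be passed on to the expensive docking step; the per-stage log is what allows a human to check whether a filter is discarding more than intended.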
this approach is especially challenging because docking returns multiple solutions for most candidate molecules. strongly discriminative and reliable scoring functions must be available for analysing the computed results in an automated fashion because visual inspection for the large number of diverse compounds to be examined is hardly feasible. however, it is questionable whether the existing scoring functions are reliable enough to succeed with such heroic demands (see below). nevertheless, to proceed with the multiple docking solutions from a high-throughput run, the generated docking modes can be filtered with respect to achieved matching of the protein-based pharmacophore, contact complementarity of protein and ligand surfaces or the remaining residual unoccupied voids along the protein-ligand interface. because nature probably avoids gaps in molecular assemblies, the latter criterion could be a powerful indicator for irrelevant binding modes or the putative accommodation of interstitial water molecules. docking is the crucial step in vs. the seminal program dock, originally described in 1982 by kuntz et al. [17] , has evolved as the first vs tool. later, other programs were successfully applied in vs, such as gold [91] , flexx [92] , glide [93, 94] or autodock [95] , to name the most popular prototypes. these have been recently reviewed [96] [97] [98] [99] [100] [101] [102] . all docking tools follow slightly different concepts. this might give individual programs a particular advantage in one task with respect to another -for example, incorporating aspects such as flexibility of ligand and receptor or restraining the docking search engine to particular regions in configuration space (e.g. mapping a protein-based pharmacophore) [103] . however, all docking tools are still far from perfect. an even more challenging, but carelessly disregarded, aspect of docking is the appropriate consideration of water molecules. 
as indicated above, water molecules are involved in binding in about two-thirds of the known protein-ligand complexes. however, most docking tools ignore them simply because conclusive concepts of how to consider them correctly are missing. if structural evidence is given, preplacement of water molecules in a docking run is a feasible strategy [66]. the docking tool slide [104] treats preplaced water molecules in a way that still enables their replacement by ligand atoms in docking. the particle concept in flexx [105] enables the placement of water molecules 'on the fly' during the course of the generation of individual docking solutions. in the docking tool gold, water molecules can be switched on or off, and can spin around their principal axes to achieve good contacts with a docked ligand [106]. recently, the popular docking tools dock and flexx have been equipped with features that enable docking on a preconceived pharmacophore or property distributions [107, 108]. consideration of such criteria will drive docking solutions especially into regions either frequently trapped by other bound ligands or featured by complementary analytical tools as being particularly relevant for binding. the program autodock [95] performs docking on a precalculated grid, storing potential values from any sort of interaction field [109]. in the original implementation, lennard-jones and coulomb potential values are used. sotriffer et al. [110] replaced these by knowledge-based potentials, originally implemented in drugscore. the latter potentials have proven to be powerful for discriminating and ranking among multiple ligand poses. the potential grid approach in autodock also enables one to average across the fields produced by various protein frames, so that the conformational degrees of freedom of a protein can be considered.
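The idea of averaging potential grids over several protein frames can be illustrated with a short sketch. It assumes that every frame was mapped onto the same grid (same origin, spacing and point order) and that a plain arithmetic mean is used; the text does not specify the weighting, so this is an assumption, and the flat-list grid layout is hypothetical rather than the AutoDock file format.

```python
def average_grids(grids):
    """Point-wise arithmetic mean of several potential grids.

    Each grid is a flat list of potential values; all grids must list
    the same grid points in the same order.
    """
    if not grids:
        raise ValueError("at least one grid is required")
    size = len(grids[0])
    if any(len(g) != size for g in grids):
        raise ValueError("all grids must have the same number of points")
    n = len(grids)
    return [sum(values) / n for values in zip(*grids)]


# Two made-up frames with two grid points each (arbitrary energy units).
frame_a = [1.0, -2.0]
frame_b = [3.0, 0.0]
mean_grid = average_grids([frame_a, frame_b])
```

A ligand docked against `mean_grid` then sees a softened field in regions where the frames disagree, which is one simple way of letting protein mobility enter a rigid-grid calculation.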
in addition, adapted fields optimized with respect to the binding properties of some known actives in the comparative molecular field-type approach 'adaptation of fields for molecular comparison' (afmoc) [111] can be used as target potential values in autodock [112]. the latter docking tool performs multiple stochastic searches on the potential hypersurface; accordingly, the frequency of occurrence of certain docking solutions can be used as an additional figure of merit for their relevance [113]. the issue of an optimal ligand size for screening has been addressed above. the complexity of the docking problem increases with the size of the ligand and its number of rotatable bonds. thus, smaller molecules in the typical range of ligand fragments should be simpler to dock. perplexingly, present experience indicates the opposite to be the case. fragments are easily scattered over a binding site by docking; reliably successful docking can only occur if the binding site itself is restricted in size and shows dimensions similar to the fragments [48]. interestingly, experimental approaches show the opposite: small molecular fragments (<200 da) usually populate, in a well-defined manner, in a very limited number of sites in binding pockets [69, 70].
enrichment rates to control the achievements of virtual screening
vs runs are usually monitored and validated by comparing the performance of a set of known actives with a large number of 'randomly' picked compounds. the actives are pooled with the random entries. all compounds are submitted to the selected vs protocol, and the performance ranks of the known actives with respect to the remaining pool are converted into enrichment plots. such a process is essential for keeping control over the performance and achievements of vs. however, the choice of the random compound library is crucial and can strongly affect the enrichments obtained [72].
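The conversion of the ranks of known actives into an enrichment plot, as described above, can be sketched as follows. The function names and the toy ranking are assumptions made for illustration; the enrichment factor is the commonly used ratio of the hit rate in the top fraction to the hit rate in the whole pool.

```python
def enrichment_curve(ranked_ids, actives):
    """Return (fraction of pool screened, fraction of actives found) pairs."""
    actives = set(actives)
    total = len(ranked_ids)
    found = 0
    curve = []
    for i, cid in enumerate(ranked_ids, start=1):
        if cid in actives:
            found += 1
        curve.append((i / total, found / len(actives)))
    return curve


def enrichment_factor(ranked_ids, actives, top_fraction):
    """Hit rate among the top-ranked fraction divided by the overall hit rate."""
    actives = set(actives)
    n_top = max(1, round(top_fraction * len(ranked_ids)))
    hits_top = sum(1 for cid in ranked_ids[:n_top] if cid in actives)
    return (hits_top / n_top) / (len(actives) / len(ranked_ids))


# Toy result list, best score first: two known actives pooled with eight decoys.
ranked = ["a1", "d1", "a2", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]
known_actives = {"a1", "a2"}

curve = enrichment_curve(ranked, known_actives)
ef_top20 = enrichment_factor(ranked, known_actives, 0.2)
```

An enrichment factor of 1 corresponds to random selection. Note that the value depends directly on which decoys were pooled with the actives, which is the caveat raised in the text.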
accordingly, it has to be examined whether the pooled, randomly picked decoy structures are actually nonbinders. in a real-life scenario, it should be noted that merging known actives with a set of candidate ligands should result in a gradually declining enrichment rate at the subsequent hierarchical filter steps because novel actives, retrieved by vs, will populate in prominent ranks and dilute the set of predefined known actives. independent of the actual vs strategy applied, docking and subsequent scoring of the suggested solutions is the key performance-determining factor in vs. the discriminatory power of the applied scoring function is of utmost importance for ranking, and hopefully enriching, potentially active binders at the top of the list of docking solutions. in consequence, a myriad of scoring functions has been developed over the past few years [114] [115] [116]. approaches have been taken not simply to rely on one single function but to take on board the consensus picture of several scoring schemes [117, 118]. to characterize the binding affinity of putative lead candidates experimentally, the binding constant or its inverse, the dissociation constant, is determined (or approximated by values such as the ic50). if we assume that the basic rules of equilibrium thermodynamics can be applied, we can define, according to the law of mass action, an equilibrium constant that describes the formation of a protein-ligand complex. this equilibrium constant is logarithmically related to the gibbs free energy, comprising both an enthalpic and entropic contribution. whereas the former relates to energetic features, the latter concerns configurational and ordering phenomena. the entropic contribution estimates how the energy content of the system is distributed over internal and external molecular degrees of freedom. up until now, three strategies have been followed to predict binding affinities on the basis of a given protein-ligand binding geometry.
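The logarithmic relation between the equilibrium constant and the Gibbs free energy mentioned above can be made concrete with a small worked example. It uses the standard relations ΔG° = RT ln Kd (Kd expressed relative to a 1 M standard state) and ΔG = ΔH - TΔS; the function names and the temperature of 298.15 K are choices made here for illustration.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)


def binding_free_energy(kd_molar, temperature=298.15):
    """Standard Gibbs free energy of binding in kJ/mol from a dissociation constant.

    Uses dG = R * T * ln(Kd), with Kd in mol/L relative to a 1 M standard state.
    """
    return R * temperature * math.log(kd_molar) / 1000.0


def entropic_term(delta_g, delta_h):
    """Return -T*dS (same units as the inputs) from dG = dH - T*dS."""
    return delta_g - delta_h


dg_nanomolar = binding_free_energy(1e-9)           # roughly -51.4 kJ/mol
per_log_unit = binding_free_energy(1e-8) - dg_nanomolar  # roughly 5.7 kJ/mol
```

At room temperature, one logarithmic unit in the binding constant therefore corresponds to about 5.7 kJ/mol, which is a useful yardstick when judging the accuracy of any affinity prediction.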
the most rigorous and theoretically most solid approaches are first-principle methods [59]. using quantum mechanics or computationally less demanding (however approximate) force fields, the partition function of a system is computed and the free energy differences between the bound and unbound state are determined. with the increasing speed of computers, such methods are becoming more accurate and obtaining a growing relevance for scoring [119]. however, screening large samples of docked solutions to estimate binding affinities is still far beyond tractability. nevertheless, first-principle methods do not need any calibration or training in experimentally determined affinity data; thus, they will not suffer from inherent experimental shortcomings or accuracy limits. this differs from the other two approaches described in the following section, the regression- and knowledge-based scoring functions, which are based on empirical concepts [59]. regression-based approaches assume additivity of individual terms considered in a master equation to describe the total gibbs free energy of binding. in this context, a 'term' can reflect any physicochemical property of relevance for the protein-ligand binding process - for example, the number of charged or uncharged hydrogen bonds, the size of the polar or nonpolar surface portion, the number of rotatable bonds or the enthalpy required to desolvate either the ligand or the protein, among others. they are assumed to be independent from each other, and their individual contribution in reproducing the known affinities of some training set ligands is extracted by regression analysis, partial least-squares analysis or neural networks [59]. however, independence of the 'terms' is unlikely, and fair to strong correlations between terms are probable. in the correlation analysis, this fact might point out that another, at first glance surprising, property pops up as the best explanation.
however, it might be the case that the latter property is not on its own essential in a physical sense, but is highly correlated with another property that is actually crucial. we recently parameterized a new regression-based scoring function on the basis of a large sample of crystal structures with known affinities for the bound ligands. depending on the composition of the training set, different terms are found to be relevant in the analysis. this clearly shows that such empirical scoring functions reflect a best fit with respect to the training set used but they rarely achieve generality. an alternative to regression-based approaches is provided by knowledge-based scoring functions [59]. these evaluate the occurrence frequencies of some properties of interest - for example, the mutual distance between particular atom types found across the protein-ligand interface. the sample distributions describe occurrence probabilities and can be compared with a statistical mean reference situation. any deviations from this average can be translated using some type of mathematical relationship into statistical preferences that can be used to determine the geometry of protein-ligand complexes. conceptually, the knowledge-based functions appear to be more general because no master equation with preconceived 'terms' is required. however, the data selected to derive the function, and the definition of atom types, together with many adjustable parameters needed to actually establish the method, also attenuate the generality of this approach. to estimate binding constants, both the regression- and knowledge-based scoring functions require experimental affinity data for internal calibration. thus, their prediction accuracy can never be better than the precision by which binding data can be measured.
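The translation of occurrence frequencies into statistical preferences, as described for knowledge-based scoring functions, can be sketched with an inverse-Boltzmann-type relation. The text only speaks of 'some type of mathematical relationship', so the negative logarithm of the observed-to-reference probability ratio used here is an assumption (a form commonly used for such potentials), as are the binning, the pseudo-count and the made-up counts.

```python
import math


def statistical_preferences(pair_counts, reference_counts, pseudo=1e-6):
    """Convert distance histograms into per-bin preference values.

    pair_counts:      occurrences of one atom-type pair per distance bin
    reference_counts: occurrences in the mean reference distribution per bin
    Returns -ln(p_observed / p_reference) for each bin; negative values mark
    distances that occur more often than in the reference (favourable).
    """
    total_pair = sum(pair_counts)
    total_ref = sum(reference_counts)
    prefs = []
    for obs, ref in zip(pair_counts, reference_counts):
        p_obs = obs / total_pair
        p_ref = ref / total_ref
        # The small pseudo-count avoids log(0) for empty bins.
        prefs.append(-math.log((p_obs + pseudo) / (p_ref + pseudo)))
    return prefs


# Made-up histograms over three distance bins (short, medium, long).
observed = [10, 60, 30]
reference = [30, 40, 30]
prefs = statistical_preferences(observed, reference)
```

Summing such per-bin values over all protein-ligand atom pairs of a pose gives a score that can rank alternative geometries; turning that score into a binding constant still requires calibration against experimental affinities, as noted in the text.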
it is interesting to note that the estimated standard deviations of such empirical scoring functions are smaller if data for a selected number of targets, determined in one laboratory based on assay data recorded under strictly conserved conditions, are used than if broad-range data covering many targets, based on assay data determined in various laboratories, are considered. the relative differences in binding data within a series of compounds can be determined precisely - usually much better than across data from various systems for which the comparison has to be performed on an absolute scale. in consequence, the standard deviations of predicted binding affinities of presently available functions range between 0.7 and 1.5 logarithmic units in binding affinity. this range matches the experimental accuracy achieved for binding data across training sets of growing heterogeneity covering a broad spectrum of targets. in our experience, knowledge-based scoring functions are better able to extract binding geometries generated by a docking program that closely approximate the experimentally confirmed binding mode from a sample set of decoy placements. by contrast, regression-based scoring functions are better in the actual affinity prediction, provided that a fairly accurate binding geometry is given. unlike the professed opinion that in docking the geometry problem has been resolved to a sufficient extent, and that the scoring problem remains an open question [120, 121], it appears that both are intimately related. we recently developed a knowledge-based scoring function based on accurate contact data from small molecule crystal data [122]. this function reliably recognizes the experimentally determined binding mode found in a crystal structure among a set of decoy poses. it appears self-evident that the scoring problem can only be alleviated if more relevant, near-native binding poses are produced by docking programs and reliably recognized by a scoring function.
accordingly, it appears advisable to drive docking solutions as close as possible to the native geometries - that is, as they would show up in a corresponding crystal structure. this can also be achieved by optimizing them towards the near-native situation. at best, this involves minimization with respect to the function used for scoring. as a disadvantage, this process would be computationally demanding.
binding affinity: a sufficiently well understood property?
any discussion of scoring and ranking must ask the question as to how well we understand the target value gibbs free energy. is it advisable to focus scoring on free energy or would it be better to treat enthalpy and entropy as separate terms in the scoring [123]? there is a mutual compensation between enthalpy and entropy owing to the fact that both entities scatter over much larger ranges than does the free energy itself. considering the binding of a ligand to a protein, enthalpy (ΔH) and entropy (-TΔS) can easily scatter over a 3-4-fold larger range than the spread of gibbs free energy (ΔG). the mutual compensation of enthalpy and entropy can even be found across closely related molecules [124] (figure 5). furthermore, many physicochemical phenomena of relevance for the binding process are not yet fully understood and, therefore, have not yet been correctly incorporated into scoring functions (e.g. the role of water, change in protonation states or an appropriate consideration of entropy). interestingly, microcalorimetry indicates that with increasing temperature, the protein-ligand binding process becomes more exothermic and entropically less favourable [58, 59]. because this observation applies to all targets, it suggests that general phenomena are involved which are not yet understood at a molecular level.
despite our present deficiencies in understanding the physics of the binding process, scoring still works satisfactorily - most likely because we consider the binding of ligands to a protein on a relative scale to each other. accordingly, any unappreciated phenomena, similar across all complexes in the analysis, will simply be cancelled out. for example, as mentioned above, one approach for reflecting protein flexibility applies parallel docking into several rigid binding-competent conformers of the protein. the disadvantage of this strategy is the fact that additional degrees of scoring are created: what discriminates the scoring against different protein conformers? cancellation of unreflected internal protein energy contributions is no longer certain. studies have shown that a special scoring scheme or protocol is required [48, 52]. because dramatic energy differences between low-energy conformers are unlikely, a modulated pocket size for the different protein conformers might require individual scoring of the altered desolvation properties of the binding site.
required accuracy of scoring: enrichment or hit identification?
vs is used to enrich putative actives from a large sample set of test ligands. accordingly, the desired accuracy of scoring depends on whether, at first glance, prioritization of the sample set for testing is anticipated or whether putative actives are expected among the top ten or 100 of a hit list. the latter requires very powerful discrimination of actives over inactive decoy binders. present scoring functions have been optimized to discriminate, for a particular ligand, decoy binding modes from near-native ones. the discrimination of binders from decoy nonbinders still remains a major challenge for vs protocols [125]. at worst, in these cases the above-mentioned poorly understood contributions to binding become overwhelmingly important.
perhaps the consideration of similarity criteria in the search [87, 107, 108, [110] [111] [112] enables the problem to be alleviated to some degree because this drives the search towards more closely related ligands for which some of the disregarded effects in scoring cancel themselves out.
ultimate proof of concept: crystal structure analysis of vs hits
only rarely has the crystal structure of detected vs hits been determined and subsequently published. such case studies have recently been reviewed by shoichet [24]. cases have been described for which vs has correctly predicted the subsequently found binding mode. other examples point to deficiencies arising from the superficially understood phenomena described above. finally, it has also been reported that vs inappropriately identified an active compound because binding actually occurs in a totally different region of the protein surface. vs has been established as a powerful alternative and complement to hts. when performed optimally, impressive hit rates have been reported, which have been significantly higher (by a factor of 100-1000 [24]) than those for hts. comparative studies of hts and vs indicate that the methods can capture alternative and complementary ligands. undoubtedly, vs is not yet a fully mature technology following a well-established process line. few of the foundations of protein-ligand recognition are understood well enough to be deployed in a large scale, multi-compound effort such as that commonly undertaken in vs. this calls for further in-depth research. in particular, protein flexibility and induced-fit adaptations, the role of water in solvation, desolvation and ligand binding, and the electrostatics involved, including changes in protonation states and an appropriate consideration of entropic changes, will need to be better understood to improve the hit retrieval rates in vs.
frequently, experimental work performed in parallel or as a follow-up to a vs campaign provides a whole bunch of unexpected results pointing towards manifold deficiencies of the concepts applied. many more experimental studies are required. nevertheless, vs has proven successful and as a valuable alternative to hts, in particular if it is used as a tool to support and complement hit discovery. compared with hts, it is significantly cheaper and faster to use. it can be easily rerun under modified conditions - for example, if additional information about the target protein under consideration becomes available or novel filter criteria are taken into account. it is interesting to note that an experienced modeller or medicinal chemist can often figure out whether a particular binding pocket appears druggable or a certain molecule obeys the rules of drug likeness; however, putting such knowledge into computer algorithms makes the multifactorial nature of these rules apparent, complicating attempts to generalize them. the same multifactorial nature holds true for docking and scoring. it is therefore highly advisable to refrain from fully automated strategies in vs. experience and human intervention are of the utmost importance for keeping control over the various filter steps in a vs run. used in such a way, vs can be successful; in particular, if information about molecular similarity is considered in terms of generic physicochemical properties and not simply as chemical formulae. considering similarity concepts takes the risk that highly diverse molecules remain undetected; however, it probably makes the searches simpler because many parameters that actually matter in vs, and which are not properly considered, simply cancel each other out in a relative comparison.
figure 5. similar ligands decompose differently into enthalpic and entropic binding contributions. crystal structures of two closely related thrombin inhibitors bearing a cyclopentyl or cyclohexyl moiety as terminal substituent to accommodate the s3/s4 pocket of the catalytic site (surface of the binding pocket is indicated in blue). whereas the five-membered ring (left) gives rise to a well-defined difference in electron density (white 'chicken-wire' contouring), the six-membered ring (right) cannot be assigned to any density (see inside white circles). it is likely that the latter fragment shows enhanced residual mobility and is scattered around several conformational states. interestingly, this deviating behaviour is well reflected in the thermodynamic properties (centre). both compounds exhibit the same free energy of binding (ΔG, blue columns). however, the cyclohexyl derivative (right) with the enhanced residual mobility is entropically (-TΔS, red columns) more favoured than the 'less-well clamped' five-ring derivative (left). the latter experiences better enthalpic contributions (ΔH, green columns) to binding [124].
considering all of the mentioned limitations, and taking molecular similarity as some kind of work-around to evade existing shortcomings, a critical reviewer might raise the question of whether, under such restrained conditions, vs can really contribute something novel that an experienced medicinal chemist would not have thought about. only successful examples can convince. we applied vs to a trna-modifying enzyme for which the inhibitor replaces either guanine or preq1 (a precursor to the modified base queuine) as a substrate in the enzyme recognition pocket (figure 6). initial vs searches retrieved structures that, admittedly, an experienced medicinal chemist might also have suggested as being 'substrate-like'.
however, a follow-up vs screen in which several water molecules were allowed to be replaced in the binding site suggested a cyclic urea structure with a different orientation in the binding pocket [126]. synthetically, this lead appeared tempting and presently serves as a starting point for a new series of compounds (figure 6). this unbiased approach, resulting, in chemical terms, in an unexpected lead, demonstrates that vs can make important contributions to drug discovery, providing some unexpected candidates. nevertheless, there is still a long way to go until it becomes an established tool for routine lead discovery. interestingly, nowadays, most aspects of contemporary drug discovery are optimized towards high throughput; by contrast, our current increase in the knowledge and understanding of protein-ligand recognition principles is still proceeding at a low-output rate.
figure 6. an unexpected ligand skeleton from virtual screening. virtual screening (vs) has been used to search for putative inhibitors of t-rna guanine transglycosylase [88]. initial hits such as 1-4 (lower left box), which were followed up by chemical synthesis, showed structural similarity with the natural substrates of this enzyme guanine, preq0 and preq1 (upper box). these searches were based on the protein-based pharmacophore shown on the left (contours for hydrogen bond acceptor group) and in figure 4. an experienced medicinal chemist could possibly have come up with similar suggestions for potential leads such as 1-4. however, in a second vs campaign we focused on the replacement of two water molecules at the lower right rim of the binding pocket of t-rna guanine transglycosylase (right, contours for hydrogen bond acceptor group [126]). this screen suggested 5 as a potential hit.
its cyclic urea-type skeleton is distinct in its structure and binding mode from any known natural substrate and it can serve as a synthetically easily accessible lead. several derivatives (e.g. 6-8) have been synthesized and show a reasonable structure-activity relationship (stengl et al., unpublished). it is unlikely that this novel lead structure would have been suggested by comparative substrate considerations. this example underlines the power of vs as an alternative source for novel lead discovery.
trends in development cycles a new grammar for drug discovery how many leads from hts? comment: how many leads from hts? chemical feature-based pharmacophores and virtual library screening for discovery of new leads virtual screening to enrich hit lists from high-throughput screening: a case study on small-molecule inhibitors of angiogenin molecular docking and high-throughput screening for novel inhibitors of protein tyrosine phosphatase-1b comparing performance of computational tools for combinatorial library design inhibitors of dihydrodipicolinate reductase, a key enzyme of the diaminopimelate pathway of mycobacterium tuberculosis high throughput screening identifies novel inhibitors of escherichia coli dihydrofolate reductase that are competitive with dihydrofolate experimental screening of dihydrofolate reductase yields a "test set" of 50,000 small molecules for a computational data-mining and docking competition mcmaster university data-mining and docking competition: computational models on the catwalk evaluating the high-throughput screening computations screening for dihydrofolate reductase inhibitors using molprint 2d, a fast fragment-based method employing the naïve bayesian classifier: limitations of the descriptor and the importance of balanced chemistry in training and test sets virtual ligand screening against escherichia coli dihydrofolate reductase: improving docking enrichment using 
the author is grateful to current and former members of his research group for their joint efforts in developing tools and strategies for virtual screening and in performing the experimental proof of concept. this article was written during a very stimulating sabbatical at the university of california, san francisco, usa, and many discussions with the author's host, brian shoichet, and his colleagues and research group are kindly acknowledged. endless bus rides on muni, san francisco, are appreciated for providing the opportunity to do the required reading for this review.
key: cord-262268-gm99cadh authors: wang, jingqiang; wen, jie; li, jingxiang; yin, jianning; zhu, qingyu; wang, hao; yang, yongkui; qin, e’de; you, bo; li, wei; li, xiaolei; huang, shengyong; yang, ruifu; zhang, xumin; yang, ling; zhang, ting; yin, ye; cui, xiaodai; tang, xiangjun; wang, luoping; he, bo; ma, lianhua; lei, tingting; zeng, changqing; fang, jianqiu; yu, jun; wang, jian; yang, huanming; west, matthew b; bhatnagar, aruni; lu, youyong; xu, ningzhi; liu, siqi title: assessment of immunoreactive synthetic peptides from the structural proteins of severe acute respiratory syndrome coronavirus date: 2003-12-01 journal: clin chem doi: 10.1373/clinchem.2003.023184 sha: doc_id: 262268 cord_uid: gm99cadh background: the widespread threat of severe acute respiratory syndrome (sars) to human life has spawned challenges to develop fast and accurate analytical methods for its early diagnosis and to create a safe antiviral vaccine for preventive use. consequently, we thoroughly investigated the immunoreactivities with patient sera of a series of synthesized peptides from sars-coronavirus structural proteins. methods: we synthesized 41 peptides ranging in size from 16 to 25 amino acid residues of relatively high hydrophilicity. the immunoreactivities of the peptides with sars patient sera were determined by elisa. results: four epitopic sites, s599, m137, n66, and n371-404, located in the sars-coronavirus s, m, and n proteins, respectively, were detected by screening synthesized peptides. notably, n371 and n385, located at the cooh terminus of the n protein, inhibited binding of antibodies to sars-coronavirus lysate and bound to antibodies in >94% of samples from sars study patients. n385 had the highest affinity for forming peptide-antibody complexes with sars serum. 
conclusions: five peptides from sars structural proteins, especially two from the cooh terminus of the n protein, appear to be highly immunogenic and may be useful for serologic assays. the identification of these antigenic peptides contributes to the understanding of the immunogenicity and persistence of sars coronavirus. the worldwide threat of severe acute respiratory syndrome (sars) becoming an epidemic creates urgent challenges for the scientific community. several laboratories have unraveled the genetic information of the sars virus (1-4). the genome of the sars coronavirus is ~29 kb in size and has 11 open reading frames, composed of a stable region encoding an rna-dependent rna polymerase with 2 open reading frames, a variable region representing 4 coding sequences for viral structural genes [spike (s protein), envelope (e protein), membrane (m protein), and nucleocapsid (n protein)], and 5 putative uncharacterized proteins. its gene order is similar to that of other known coronaviruses; however, phylogenetic analyses and sequence comparisons indicate that this virus does not closely resemble any of the previously characterized coronaviruses. the incubation period of sars is usually 2-7 days, but could be as long as 10 days. the disease progresses with unusual severity within a short time once a patient exhibits obvious clinical symptoms. therefore, an urgent task is to develop accurate and sensitive diagnostic tools for identifying sars, specifically for early diagnosis. a noninvasive diagnostic test for sars coronavirus has been reported recently that uses quantitative reverse transcription-pcr with detection of sybr green fluorescence (5). the pcr primer design was based on a unique region within the rna-dependent rna polymerase-encoding sequence of the virus. the amplification was very specific and showed no cross-reaction with two serogroups of human coronavirus, 229e and oc43. 
in a total of 29 sars patients and 58 uninfected controls, the pcr assay showed positive identification for 79% of sars cases and negative identification for 98% of controls. however, this technique is limited in its clinical use. the samples for pcr were collected from nasopharyngeal aspirates of sars patients. because sars is a respiratory infection, any attempt to obtain a sample from a patient's pharynx and larynx increases the risk of infection to the healthcare worker. furthermore, a 79% detection rate is below an acceptable level when one takes into account that patients in the early phase of infection, showing no symptoms, were among the ~20% who were not detected. this is cause for concern because these people could still transmit the virus. serologic assays are used extensively for diagnosis of viral infection in a host (6). in urgent situations, a common way to perform a serologic assay is to use complex viral lysates as antigens. the use of viral lysates, however, presents several disadvantages. viral lysates consist of many viral antigens that cannot be clearly purified and classified. the lysates are prepared from cells infected with the virus; thus, cellular proteins can contaminate the preparation. moreover, sars-coronavirus lysates present a considerable risk of infection to laboratory workers. to overcome these problems, the discovery of antigenic fragments in sars-coronavirus proteins is expected to lead to the rapid development of new assays for the diagnosis of sars infection. it has been shown in many cases that several epitopes located in viral proteins can be successfully mimicked with synthetic peptides (7). to expedite epitope mapping of the sars coronavirus, we have synthesized a group of peptides representing the most hydrophilic, as well as the most accessible, residue regions of the s, m, and n structural proteins of sars coronavirus. using sera from sars patients, we probed these peptides by elisa. 
sera from 31 sars patients from eight different hospitals in beijing and from 49 uninfected volunteers were collected for study. the clinical diagnostic criteria for sars followed the clinical description of sars released by who. confirmation of sars infection was evidenced by the presence of antibodies against sars coronavirus in the serum. the control sera were divided into two groups: 24 samples obtained from healthy volunteers and 25 samples obtained from patients suffering from respiratory symptoms but not infected with the sars coronavirus. on the basis of the published genome sequence of the sars coronavirus, we downloaded structural proteins s, m, and n into the protscale program at the swiss institute of bioinformatics to analyze the physical characteristics of the proteins, such as hydrophilicity, hydrophobicity, accessible residues, buried residues, molecular mass, and pi values. a total of 41 peptides ranging in size from 16 to 25 amino acid residues and in molecular mass from 2500 to 3000 da were selected for synthesis (table 1). all of the peptides were synthesized commercially by chinese peptide co. the synthesized peptides were characterized by hplc and mass spectrometry. the sars coronavirus isolated from sars case bj01, whose genomic sequence was determined by the beijing genomics institute, was used as a viral source and propagated in vero-e6 cells as described previously (8). after viral propagation, the cells were harvested and placed at 70°c for 2 h to inactivate the virus. the inactivated infected vero-e6 cells were completely lysed by freeze-thaw cycles followed by centrifugation. after the cell pellet was removed, the supernatant was loaded on a sephadex g-150 column for virus purification. the elution fractions were collected at 1 ml/min, and viral proteins in each fraction were detected by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (sds-page) with coomassie blue staining and confirmed by western blot using sars serum. 
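the hydrophilicity profiling used above to pick candidate peptides can be illustrated with a sliding-window average over a per-residue scale. the following python sketch is only an approximation of that idea: it assumes the kyte-doolittle hydropathy scale and an arbitrary window size, whereas the scale and window actually chosen in protscale are not stated in the text. the function names and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
# Assumption: this scale stands in for whichever ProtScale scale was used.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_scores(sequence, window=9):
    """Mean hydropathy of every full window along the sequence."""
    values = [KYTE_DOOLITTLE[res] for res in sequence.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=9):
    """Return (start_index, mean_score) of the lowest-scoring window."""
    scores = window_scores(sequence, window)
    start = min(range(len(scores)), key=scores.__getitem__)
    return start, scores[start]


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from any SARS protein.
    toy = "LLVVAAILKDERKDNQSLLVV"
    start, score = most_hydrophilic_window(toy, window=7)
    print(start, toy[start:start + 7], round(score, 2))
```

in such a profile, the windows with the lowest mean score are the most hydrophilic stretches and would be the first candidates for synthesis; accessibility scales could be handled the same way by swapping the lookup table.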
before elisa the fractions containing virus were sonicated and concentrated with a centricon-10 biological filter. the reactivities of the various peptides with sars sera were determined by elisa. in brief, peptides (1 mg/l, 100 μl/well) were adsorbed to duplicate wells of 96-well microplates in 0.5 mol/l carbonate buffer (ph 12.5) and incubated at 4°c for 12 h. after the wells were washed with phosphate-buffered saline (pbs), they were coated with 2 g/l bovine serum albumin at 37°c for 2 h and then incubated at 37°c for 30 min with 10 μl of sera in sample buffer (total volume of buffer, 100 μl). each well was then washed and incubated with peroxidase-conjugated goat anti-human igg at 37°c for 20 min. finally the wells were washed again with pbs containing 5 ml/l tween 20. the peroxidase reaction was visualized by use of o-phenylenediamine solution as substrate. after 10 min of incubation at 37°c, the reaction was stopped by the addition of 50 μl of 4 mol/l sulfuric acid, and the absorbance at 450 nm (reference wavelength, 630 nm) was measured by an automatic elisa plate reader (multiskan microplate photometer). to determine the background resulting from nonspecific binding of viral lysates or peptides to normal sera, the cutoff value was calibrated for each viral preparation or peptide by incubation with negative control serum. a panel of 24 sera from healthy individuals was examined routinely for the calibrations. the mean absorbance was considered as the cutoff value for a viral preparation or a peptide. the cutoff values were ~0.2. western blot. for each sample, 50 μg of protein was subjected to 12% sds-page under reducing conditions and transferred to a bio-rad pvdf membrane. the membrane was blocked with pbs containing 50 g/l nonfat milk powder in tris-buffered saline at 37°c for 30 min before incubation with diluted sars serum. 
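the cutoff calibration described above can be sketched in a few lines of python. this is a minimal illustration, assuming that a serum is called reactive when its absorbance/cutoff ratio exceeds 1; the text defines the cutoff as the mean absorbance of the negative panel but does not state the decision threshold, so the `threshold` parameter is an assumption, and all absorbance values below are hypothetical.

```python
from statistics import mean


def cutoff_from_negatives(negative_absorbances):
    """Cutoff for one antigen: mean absorbance of the negative-control sera."""
    return mean(negative_absorbances)


def ratio_to_cutoff(absorbance, cutoff):
    """Absorbance/cutoff ratio, the quantity reported for each serum."""
    return absorbance / cutoff


def is_reactive(absorbance, cutoff, threshold=1.0):
    # Assumption: a ratio above `threshold` is scored as a positive reaction.
    return ratio_to_cutoff(absorbance, cutoff) > threshold


if __name__ == "__main__":
    # Hypothetical readings (A450 minus A630) for one peptide.
    negatives = [0.18, 0.22, 0.20, 0.21, 0.19]
    cutoff = cutoff_from_negatives(negatives)
    for reading in (0.15, 0.85):
        print(reading, round(ratio_to_cutoff(reading, cutoff), 2),
              is_reactive(reading, cutoff))
```

keeping the cutoff per antigen, as the text describes, means each peptide and each viral preparation is judged against its own nonspecific background.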
bound antibodies were detected by use of an appropriate alkaline phosphatase-coupled secondary antibody with nitroblue tetrazolium chloride and 5-bromo-4-chloro-3′-indolylphosphate p-toluidine salt added as the substrate for visualization. estimation of binding affinities of synthesized peptides to sera from sars patients. to compare the affinities of the antigenic peptides to sars-coronavirus antibodies, we estimated the binding constants by the scatchard equation. each antibody has a limited number of binding sites that become saturated as the concentration of antigen increases. this can be quantified by the following equation: $$\alpha = \frac{n\,[p]}{k_d + [p]}$$ where α is the proportion of bound antigens per antibody based on the elisa measurement at 450 nm; n is the number of binding sites on the antibody; k_d is the dissociation constant of the antibody-antigen complex; and [p] is the antigen concentration. we used the student t-test to calculate whether the reactivities of the synthetic peptides and sars coronavirus preparations with the same serum in the elisa were significantly different. p values <0.05 were considered statistically significant. results and discussion. epitope mapping of the s, m, and n peptides. to effectively develop synthetic peptide immunogens from sars-coronavirus structural proteins, we used several theoretical calculations for peptide design. for a potential epitopic peptide, we specifically looked for high local hydrophilicity, charged residues on the exposed protein surface, and accessible surface area. statistical analysis based on the linear amino acid sequences of the three proteins suggests that, in contrast to the other two viral structural proteins, the n protein contains a high percentage of accessible residues and low hydrophobicity (see figs. 1-3 in the data supplement that accompanies the online version of this article at http://www.clinchem.org/content/vol49/issue12/). 
a total of 41 peptides spanning multiple possible epitopic sites along the s, m, and n proteins were designed and synthesized in a solid-phase peptide synthesizer. these peptides were purified by hplc, incubated with panels of sera from both controls and sars patients, and then subjected to elisa for measurement of the immunoreactivities. the correlations between the length of amino acid sequence and immunoreactivity are shown in fig. 1. the peptides representing the cooh terminus of the n protein, in particular n371 and n385, had high absorbance/cutoff value ratios with the highest positive detection rate and the lowest hydrophobicity score among all of the synthesized peptides (fig. 1c, and fig. 3 in the online data supplement). peptide n66 is located in a hydrophilic region and reacts with sars serum antibodies. however, compared with the peptide region n371-n404, which has a score of 9.4 for accessible residues and a hydrophobicity score of −3.5, peptide n66 has relatively low antigenicity because of a correspondingly lower score for accessible residues (6.2) and higher score for hydrophobicity (−1.8). these differences may partially explain the low number of true positives (58%) that we obtained when we used n66 as an antigen. according to the hydrophobicity analysis (figs. 1 and 2 in the online data supplement), most regions in both the s and m proteins are very hydrophobic. although many factors influence antigen-antibody interactions, two major factors, charge-charge interactions and hydrogen bonding caused by hydrophilicity, have been hypothesized to be crucial for an epitopic site (9). of the 41 synthesized peptides, only 5 displayed significantly high immunoreactivity (table 1). interestingly, all five immunoreactive peptides are located in regions with a relatively low hydrophobicity score (figs. 1-3 in the online data supplement). 
moreover, the number of polar amino acids in each peptide correlates well with the percentage of true positives; e.g., n371 and n385 reacted with >94% of the samples from sars patients and contain nine and seven polar amino acids, respectively, whereas s599, m137, and n66 reacted with ≤80% of the sera from sars patients and have four, three, and three polar amino acids, respectively. our findings thus support the view that both hydrophilicity and peptide charge are important in determining immunoreactive sites in sars-coronavirus structural proteins. the sars coronavirus propagated in vero-e6 was used as an antigen to test whether sars-coronavirus antibodies were raised in sera. we collected sera from 24 healthy controls and measured their immunoreactivities against sars coronavirus by elisa. none of the control sera reacted with the sars coronavirus. the same results were observed for 25 other control sera obtained from patients with respiratory diseases who did not have sars. however, sera from all 31 sars patients reacted with the sars coronavirus, with high absorbance/cutoff value ratios [mean (sd), 4.23 (1.85)]. the results of the peptide screening are summarized in table 1. two key parameters are listed in table 1: p values, which indicate whether there was a significant difference between the synthesized peptide and sars coronavirus as antigen in the elisa, and x/31, which represents the detection rate when particular peptides were used as antigen. for the s protein, 18 peptides spanning the protein sequence were synthesized. only one peptide, s599, elicited a response in the elisa that was not significantly different from the response elicited by the sars coronavirus: ~80% (25 of 31) of the sera from sars patients reacted with this peptide. the other 17 peptides reacted only slightly with the sera from sars patients and gave low detection rates, suggesting that the regions of the s protein covered by these peptides have no epitopic site. 
for the m protein, a relatively small viral structural protein, nine peptides were synthesized. compared with the sars coronavirus, one of these peptides, m137, reacted with 80% (25 of 31) of the sera (p = 0.1), indicating that it may contain a weak epitopic site. the highest immunoreactivities were for the peptides located within the n protein. three of 14 synthesized n-protein peptides, n66, n371, and n385, showed reactivities comparable to that of the sars coronavirus. interestingly, n66 is located at the nh2 terminus of the n protein, whereas n371 and n385 are neighboring fragments, both at the cooh terminus. we thus reasoned that n protein has at least two epitopic regions. these three peptides showed low y/24 values (0/24, 1/24, and 0/24, respectively) and high x/31 values: n371 and n385 reacted with 97% (30 of 31) and 94% (29 of 31) of the sera from sars patients, respectively, whereas n66 reacted with ~58% (18 of 31). taken together, these data suggest that the most immunoreactive epitopic site in the sars coronavirus is located at the cooh terminus of the n protein. members of the coronavirus family all have three major structural proteins, s, m, and n, but the antigenicities and roles of these viral proteins in immunity remain unclear. different viruses often cause different immunoresponses. for example, in infectious bronchitis virus (10), the s glycoprotein is more antigenic than either the n or the m protein. because s protein is exposed on the outer viral surface, it is often used as the antigenic group for serologic testing. the m protein, embedded in the viral envelope, has highly variable sequences and is considered to exhibit weak antigenicity, but glycosylation of this protein may elicit antibody production. one common phenomenon is that the n proteins have been shown to be strong immunogens in several coronaviruses, such as murine coronavirus (11), turkey coronavirus (12), and porcine reproductive and respiratory syndrome virus (13). 
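the detection rates quoted above for the reactive peptides follow directly from the x/31 and y/24 counts given in the text. the short python sketch below simply redoes that arithmetic; the dictionary layout and the helper name `percent` are illustrative choices, and only the counts themselves come from the text.

```python
PATIENT_SERA = 31   # sera from sars patients (x/31)
HEALTHY_SERA = 24   # sera from healthy controls (y/24)

# Counts of reactive sera as reported in the text.
reactive_patient_sera = {"s599": 25, "m137": 25, "n66": 18,
                         "n371": 30, "n385": 29}
reactive_healthy_sera = {"n66": 0, "n371": 1, "n385": 0}


def percent(count, total):
    """Share of reactive sera, in percent."""
    return 100.0 * count / total


if __name__ == "__main__":
    for peptide, count in reactive_patient_sera.items():
        print(f"{peptide}: {count}/{PATIENT_SERA} = "
              f"{percent(count, PATIENT_SERA):.1f}% of patient sera")
    for peptide, count in reactive_healthy_sera.items():
        print(f"{peptide}: {count}/{HEALTHY_SERA} = "
              f"{percent(count, HEALTHY_SERA):.1f}% of healthy control sera")
```

with only 31 patient sera, a single serum changes a detection rate by about 3 percentage points, which is worth keeping in mind when comparing n371 and n385.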
in these viruses, the n protein is a highly abundant and relatively conserved antigen. importantly, the n protein may be involved in stimulating cytotoxic t lymphocytes. it has been reported that the n protein accumulates even before it is packaged in the mature virus (14). a cellular immune response elicited early in infection by internal viral proteins could therefore be an important defense mechanism. on the basis of the genomic sequences published to date, most of the sars-coronavirus n protein is highly conserved among strains. thus, the n protein, which has high antigenicity, is likely to have a great impact on the development of diagnostic tests for sars. the proteins extracted from vero-e6 cells infected by sars coronavirus were separated by sds-page (12% polyacrylamide), and the separated proteins were probed with antibodies in sera from sars patients. as shown in fig. 2, some immunostained bands appeared only on the membranes incubated with patient sera, suggesting that these proteins are associated with sars. on the basis of genomic data, the structural proteins of the virus have molecular masses of 139 (s protein), 46 (n protein), 25 (m protein), and 8 (e protein) kda (4), respectively. in all of the western blots performed with sera from three sars patients as the primary antibody, no protein band <40 kda was immunostained, suggesting that the m and e proteins either do not elicit antibody production or generate low antibody titers in sars patients. of the protein bands >40 kda, most were located around two regions (fig. 2), 49 and 120-240 kda. the protein band slightly smaller than 49 kda reacted consistently with all sars sera. because its location is close to the theoretical molecular mass of the n protein, 46 kda, we believe that this protein is the n protein. the immunostained bands around 120-240 kda did not display a consistent pattern. 
although the theoretical value of the s protein is 139 kda, it has been reported to be glycosylated during infection of the host cells. it therefore is not surprising that sars-coronavirus s proteins have significantly higher molecular masses than the theoretical value. we thus deduced that these immunostained proteins with high molecular masses are the s protein. to further confirm the identities of these protein bands, we conducted competition experiments to determine whether the peptides from the s or n protein would inhibit the binding of the sera from sars patients in the elisa. the patient sera preincubated with 4 mg/l s599 or n385 gave a 25-30% lower response in the elisa (data not shown), suggesting that the two peptides could compete with sars coronavirus for binding to the antibodies in sars serum. these observations suggest that the s and n proteins account for the antigenicity of the sars-coronavirus structural proteins. affinity of the peptides located at the cooh terminus of n protein. protscale analysis suggested that the cooh terminus of the n protein has fewer buried residues and higher hydrophilicity than the other regions of this protein. we designed four peptides around this region: n355, n371, n385, and n401. of these, n401, located at the end of the n protein, had low reactivity in the elisa; conversely, the other three peptides were significantly more immunoreactive in the elisa. to determine which peptides had higher affinities for antibodies from sars sera, we quantitatively measured the binding of the peptides to the antibodies in the elisa. in this experiment, the concentration of the antibodies was kept constant and the binding capacity was estimated as a function of peptide concentration (fig. 3). using the scatchard equation, we calculated the constants n and k_d; the binding curves obtained for the three peptides gave similar n values but different k_d values. 
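the estimation of n and k_d from a binding curve can be sketched as follows. this python example is an illustration under stated assumptions, not the authors' procedure: it assumes the single-site saturation form α = n[p]/(k_d + [p]) and uses the classical scatchard linearization, α/[p] = n/k_d − α/k_d, fitted by ordinary least squares. the function names and the synthetic data are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bound_fraction(p, n_sites, kd):
    """Assumed saturation model: alpha = n[p] / (k_d + [p])."""
    return n_sites * p / (kd + p)


def scatchard_fit(concentrations, bound):
    """Estimate (n, k_d) from the Scatchard plot of alpha/[p] versus alpha.

    The plot has slope -1/k_d and intercept n/k_d.
    """
    xs = list(bound)
    ys = [a / p for a, p in zip(bound, concentrations)]
    slope, intercept = linear_fit(xs, ys)
    kd = -1.0 / slope
    return intercept * kd, kd


if __name__ == "__main__":
    # Synthetic, noise-free data generated from hypothetical n and k_d.
    peptide_conc = [0.1, 0.25, 0.5, 1.0, 2.0, 4.0]
    alpha = [bound_fraction(p, 2.0, 0.5) for p in peptide_conc]
    n_est, kd_est = scatchard_fit(peptide_conc, alpha)
    print(round(n_est, 3), round(kd_est, 3))
```

in this picture, two peptides that give the same intercept on the α axis (same n) but different slopes differ only in k_d, which is how similar n values with different k_d values would appear on a scatchard plot. with noisy elisa data a direct nonlinear fit is often preferred, because the linearization distorts the error structure.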
on the basis of the definitions of the two parameters in materials and methods, the fact that the curves give similar n numbers indicates that the peptides share similar epitopic sites, whereas the different k_d values indicate that the three peptides bind to the same epitope with different affinities. of the three peptides, the k_d value for n385 was much lower than the k_d values for n355 and n371, approximately one-fourth of the k_d value for n371 and one-thirteenth of the k_d value for n355, suggesting that n385 is able to bind more strongly with antibodies in sars sera. thus, the region around n385 is most likely a strong epitopic site within the n protein. among the 31 sars cases that were identified with use of the sars coronavirus as antigen, 1 and 2 cases were not detected by peptides n371 and n385, respectively, although both were the most immunoreactive antigens of the 41 synthesized peptides. a possible explanation for this failure may relate to the specific recognition of the antibodies raised from a sars case. in this case, the antibodies may not actively recognize the cooh terminus of the n protein. hence, combining the peptides that represent different structural proteins could improve the detection coverage. we selected n371 and n385 to use in combination with peptides s599 and m137, respectively, and used these combinations as antigens for the elisa. contrary to our expectations, the use of peptide combinations did not improve the detection coverage, whereas single peptides displayed higher immunoreactivities in the elisa (table 1 in the data supplement). this unexpected behavior may be attributable to interactions between peptides n385 and s599 or between n371 and m137. if this hypothesis is correct, then generating a chimeric peptide by linking the two peptides could be a good solution for reducing interactions. we are currently pursuing this line of investigation. 
in summary, the data presented indicate that (a) in three sars-coronavirus structural proteins, a total of four epitopic sites, s599, m137, n66, and n371-404, were detected by screening with synthesized peptides; (b) of the five peptides with high reactivity against sars sera, n371 and n385 gave elisa responses comparable to those of the sars coronavirus lysate as well as the highest positive detection rates, suggesting that the cooh terminus of the n protein could be extremely antigenic and useful in clinical diagnosis based on the elisa; (c) of four peptides located at the cooh terminus of the n protein that were synthesized and tested for their binding with sars serum antibodies, n385 was confirmed to have the highest affinity for forming the peptide-antibody complex. taken together, these results show that synthesized peptides could be important in exploring epitopes of immunogens, specifically in urgent situations such as that presented by the emergence of the sars virus. the identification of these reactive peptides could aid in the development of diagnostic techniques for sars. 
1. the genome sequence of the sars associated coronavirus
2. characterization of a novel coronavirus associated with severe acute respiratory syndrome
3. comparative full-length genome sequence analysis of 14 sars coronavirus isolates and common mutations associated with putative origins of infection
4. a complete sequence and comparative analysis of a sars-associated virus (isolate bj01)
5. rapid diagnosis of a coronavirus associated with severe acute respiratory syndrome (sars)
6. new approaches and perspectives in cytomegalovirus diagnosis
7. use of peptide synthesis to probe viral antigens for epitopes to a resolution of a single amino acid
8. isolation and identification of a novel coronavirus from patients with sars
9. prediction of protein antigenic determinants from amino acid sequences
10. evolution of avian coronavirus ibv: sequence of the matrix glycoprotein gene and intergenic region of several serotypes
11. an immunodominant cd4+ t cell site on the nucleocapsid protein of murine coronavirus contributes to protection against encephalomyelitis
12. nucleocapsid protein gene sequence analysis reveals close genomic relationship between turkey coronavirus and avian infectious bronchitis virus
13. identification of a common antigenic site in the nucleocapsid protein of european and north american isolates of porcine reproductive and respiratory syndrome virus
14. cytotoxic t lymphocytes are critical in the control of infectious bronchitis virus in poultry
key: cord-252576-1ec545o2 authors: wu, xiangli; sun, jian; zhang, guoqing; wang, hexiang; ng, tzi bun title: an antifungal defensin from phaseolus vulgaris cv. ‘cloud bean’ date: 2011-01-15 journal: phytomedicine doi: 10.1016/j.phymed.2010.06.010 sha: doc_id: 252576 cord_uid: 1ec545o2 an antifungal peptide with a defensin-like sequence and exhibiting a molecular mass of 7.3 kda was purified from dried seeds of phaseolus vulgaris ‘cloud bean’. 
the isolation procedure entailed anion exchange chromatography on deae-cellulose, affinity chromatography on affi-gel blue gel, cation exchange chromatography on sp-sepharose, and gel filtration by fast protein liquid chromatography on superdex 75. although the antifungal peptide was unadsorbed on deae-cellulose, it was adsorbed on both affi-gel blue gel and sp-sepharose. the antifungal peptide exerted antifungal activity against mycosphaerella arachidicola with an ic(50) value of 1.8 μm. it was also active against fusarium oxysporum with an ic(50) value of 2.2 μm. it had no inhibitory effect on hiv-1 reverse transcriptase when tested up to 100 μm. proliferation of l1210 mouse leukemia cells and mbl2 lymphoma cells was inhibited by the antifungal peptide with an ic(50) of 10 μm and 40 μm, respectively. fungi inflict tremendous damage to humans, other animals, and plants. some fungal diseases are devastating to plants, leading to crop destruction and enormous economic losses. antifungal proteins have attracted the attention of investigators because of the tremendous economic implications. to date, many different types of plant antifungal proteins are known. among them are thaumatin-like proteins (chu and ng 2003; pressey 1997; wang and ng 2002; ye et al. 1999), glucanases (vogelsang and barz 1993), chitinases and chitinase-like proteins (lam et al. 2000; vogelsang and barz 1993), ribosome inactivating proteins (roberts and selitrennikoff 1986), defensins (thevissen et al. 2003; wong and ng 2003), peroxidases, allergen-like proteins (ye et al. 2001a), protease inhibitors (chilosi et al. 2000; ye et al. 2001a,b), lectins (ye et al. 2001b), lipid transfer proteins (cammue et al. 1995; wang et al. 2004), embryo abundant protein-like proteins, cyclophilin-like proteins (ye and ng 2000), and others (wang and ng 2005). many of the aforementioned antifungal proteins are also referred to as pathogenesis-related proteins (van loon 1990). 
antifungal proteins have also been purified from animals including insects (iijima et al. 1993). the aim of the present investigation was to isolate an antifungal protein from the seeds of the cloud bean cultivar of phaseolus vulgaris and to ascertain which type of antifungal protein it belongs to. its characteristics and activities were compared with those of other antifungal proteins. dried seeds of phaseolus vulgaris 'cloud bean' (250 g) from mainland china were purchased from a vendor in hong kong. they were authenticated by prof. shiu ying hu, honorary professor of chinese medicine, the chinese university of hong kong and then deposited in laboratory 302, school of biomedical sciences, under the voucher number pvcb 195. they were homogenized in distilled water (6 ml/g) and the homogenate was centrifuged at 12 000 × g for 20 min at 4 °c. the supernatant was collected and loaded on a 5 cm × 20 cm column of deae-cellulose (sigma) in 10 mm tris-hcl buffer (ph 7.2). following removal of unadsorbed proteins in fraction d1, adsorbed proteins were eluted from the column with 0.2 m nacl and then with 1 m nacl added to the 10 mm tris-hcl buffer, to yield fractions d2 and d3, respectively. fraction d1 was applied to a column (5 cm × 18 cm) of affi-gel blue gel (bio-rad) in 10 mm tris-hcl buffer (ph 7.2). following elution of unadsorbed proteins in fraction b1, adsorbed proteins were desorbed initially with 0.2 m nacl in the tris-hcl buffer to yield fraction b2, and subsequently with 1 m nacl in the tris-hcl buffer to yield fraction b3. fraction b2 was further purified by ion exchange chromatography on a column (2.5 cm × 20 cm) of sp-sepharose (ge healthcare) in 10 mm nh4oac buffer (ph 5.0). after elution of unadsorbed proteins (fraction s1), adsorbed proteins were desorbed with a linear concentration (0-1 m) gradient of nacl. 
the fraction with antifungal activity (s2) was then further fractionated by fast protein liquid chromatography on a superdex 75 hr 10/30 column (ge healthcare) in 0.2 m nh4hco3 buffer (ph 8.5). the last absorbance peak (su3) represented purified antifungal peptide. sodium dodecyl sulfate-polyacrylamide gel electrophoresis (sds-page) was conducted as described by laemmli and favre (1973) using an 18% gel. following electrophoresis under non-reducing conditions, the gel was stained with coomassie brilliant blue. the molecular mass of the antifungal peptide was estimated by comparison of its electrophoretic mobility with those of molecular mass marker proteins from ge healthcare including horse myoglobin peptides of different molecular mass: 16 949, 14 404, 10 700, 8159 and 6214 da. the molecular mass was also determined using gel filtration on a superdex peptide column (ge healthcare). the amino acid sequence of the antifungal peptide was analyzed by means of automated edman degradation. microsequencing was carried out using a hewlett packard 1000a protein sequencer equipped with a high performance liquid chromatography system (lam et al. 1998). protein concentration was determined using the dye binding reagent (bio-rad). the assay for antifungal activity toward mycosphaerella arachidicola, physalospora piricola and fusarium oxysporum was conducted in 100 mm × 15 mm petri plates containing 10 ml of potato dextrose agar. after the mycelial colony had developed, sterile blank paper disks (0.625 cm in diameter) were laid at a distance of 0.5 cm away from the rim of the mycelial colony. an aliquot (15 μl) of the antifungal peptide was added to a disk. the plates were then left at 23 °c for 72 h until mycelial growth had surrounded the disks containing the control and had produced crescents of inhibition around disks containing samples with antifungal activity. 
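the molecular mass estimate by gel electrophoresis described above rests on the roughly linear relation between log10(molecular mass) and the electrophoretic mobility of the marker proteins. a minimal python sketch of that calculation is given here. the marker masses are those listed in the text; the relative mobilities and the sample mobility are invented purely for illustration and are not data from the study, and the function names are hypothetical.

```python
import math

# marker masses in da, as listed in the text
MARKER_MASSES_DA = [16949, 14404, 10700, 8159, 6214]
# hypothetical relative mobilities (0 = top of gel, 1 = dye front); NOT measured values
MARKER_MOBILITIES = [0.30, 0.38, 0.52, 0.65, 0.78]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mass_da(marker_masses_da, marker_mobilities, sample_mobility):
    """Estimate a mass from a log10(mass) versus mobility calibration line."""
    log_masses = [math.log10(m) for m in marker_masses_da]
    slope, intercept = fit_line(marker_mobilities, log_masses)
    return 10 ** (slope * sample_mobility + intercept)


if __name__ == "__main__":
    # 0.70 is an assumed sample mobility chosen only to exercise the function
    estimate = estimate_mass_da(MARKER_MASSES_DA, MARKER_MOBILITIES, 0.70)
    print(f"estimated mass: {estimate:.0f} da")
```

the calibration is only trustworthy inside the range spanned by the markers, which is why a marker set bracketing the expected mass is used.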
to determine the ic50 value for the antifungal activity, three doses of the antifungal peptide were added separately to three aliquots each containing 4 ml potato dextrose agar at 45 °c, mixed rapidly and poured into three separate small petri dishes. after the agar had cooled down, the same amount of mycelia was added to each plate. buffer without the antifungal peptide was used as a control. after incubation at 23 °c for 72 h, the area of the mycelial colony was measured and the inhibition of fungal growth was calculated. from a graph plotting % reduction in area of mycelial colony versus concentration of antifungal peptide, the concentration bringing about 50% reduction in area of mycelial colony (ic50) compared with the control was determined (wong and ng 2003). membrane permeabilization was assayed by the uptake of sytox green, a high-affinity nuclear stain that penetrates cells with compromised membranes, as detailed by thevissen et al. (2003). briefly, fungi were cultured in the presence or in the absence of cloud bean defensin. sytox green (invitrogen) was added to the fungal cultures (0.5 mm final concentration). after incubation for 10 min, the fungal cells were examined for the presence of the dye by using a fluorescence microscope (nikon te2000). an excitation wavelength of 500-540 nm was used. the assay for hiv-1 reverse transcriptase inhibitory activity was performed according to instructions provided with the assay kit from boehringer-mannheim (germany). the assay makes use of the ability of reverse transcriptase to synthesize dna, starting from the template/primer hybrid poly(a)·oligo(dt)15. the digoxigenin- and biotin-labeled nucleotides in an optimized ratio are incorporated into one and the same dna molecule, which is freshly synthesized by the reverse transcriptase (rt). the detection and quantification of synthesized dna as a parameter for rt activity follows a sandwich elisa protocol. biotin-labeled dna binds to the surface of microtiter plate modules that have been precoated with streptavidin. 
next, an antibody to digoxigenin conjugated to peroxidase binds to the digoxigenin-labeled dna. finally, the peroxidase substrate is added. the peroxidase enzyme catalyzes cleavage of the substrate, producing a colored reaction product. the absorbance of the samples at 405 nm, which is directly correlated to the level of rt activity, can be measured with a microtiter plate (elisa) reader. a fixed amount (4-6 ng) of recombinant hiv-1 reverse transcriptase was used. the inhibitory activity of the isolated peptide was calculated as percent inhibition as compared to a control without the peptide (ng and wang 2001). the defensin gymnin (wong and ng 2003) was used as a positive control. the plasmid that expressed his-tagged wild-type hiv-1 integrase, pt7-7-his (y|tx)-hiv-1-in, was a generous gift from professor s.a. chow (school of medicine, ucla). to express the protein, a 1-l culture of e. coli bl21 (de3) cells containing the expression plasmid was grown at 37 °c until od600 reached 0.7-0.8. cells were induced by addition of 0.8 mm iptg (isopropyl-β-d-thiogalactopyranoside) and harvested, after 4 h of incubation, by centrifugation at 6000 × g for 10 min at 4 °c. cells were suspended at a concentration of 10 ml/g wet cell paste in 20 mm tris-hcl buffer (ph 8.0) containing 0.1 mm edta, 2 mm β-mercaptoethanol, 0.5 m nacl and 5 mm imidazole. lysozyme was added to a concentration of 0.2 mg/ml. after incubation at 4 °c for 1 h, the lysate was sonicated and centrifuged at 40 000 × g at 4 °c for 20 min. the pellet was homogenized in 50 ml buffer a (20 mm tris-hcl, ph 8.0, 2 m nacl, 2 mm β-mercaptoethanol) containing 5 mm imidazole. the suspension was rotated at 4 °c for 1 h, and cleared by centrifugation at 40 000 × g at 4 °c for 20 min. the supernatant was loaded onto a 1-ml chelating sepharose (ge healthcare) column charged with 50 mm imidazole. 
the column was washed with five column volumes of buffer a containing 5 mm imidazole, and the protein was eluted with three column volumes of buffer a containing 200 mm and 400 mm imidazole, respectively. protein-containing fractions were pooled, and edta was added to a final concentration of 5 mm. the protein was dialyzed against buffer b (20 mm hepes, ph 7.5, 1 mm edta, 1 m nacl, 20% glycerol) containing 2 mm β-mercaptoethanol, and then against buffer b containing 1 mm dithiothreitol. aliquots of the protein were stored at −70 °c (loizidou et al. 2009). a non-radioactive elisa-based hiv-1 integrase assay was performed according to the dna-coated plate method. in this study, 1 μg of smai-linearized pbluescript sk was coated onto each well in the presence of 2 m nacl as target dna. the donor dna was prepared by annealing vu5br (5′-biotin-gtgtggaaaatctctagcagt-3′) and vu5 (5′-actgctagagattttccacac-3′) in 10 mm tris-hcl, ph 8.0, 1 mm edta and 0.1 m nacl at 80 °c followed by 30 min at room temperature. the integrase reaction was performed in 20 mm hepes (ph 7.5) containing 10 mm mncl2, 30 mm nacl, 10 mm dithiothreitol and 0.05% nonidet-p40 (sigma). after the integrase reaction, the biotinylated dna immobilized on the wells was detected by incubation with streptavidin-conjugated alkaline phosphatase (boehringer-mannheim, mannheim, germany), followed by colorimetric detection with 1 mg/ml p-nitrophenyl phosphate in 10% diethanolamine buffer (ph 9.8) containing 0.5 mm mgcl2. the absorbance due to the alkaline phosphatase reaction was measured at 415 nm. the ribosome inactivating protein trichosanthin was used as a positive control (loizidou et al. 2009). the antiproliferative activity of the purified peptide was determined as follows. 
the cell lines l1210 (mouse leukemia) and mbl2 (lymphoma) from american type culture collection were maintained in dulbecco's modified eagle's medium (dmem) supplemented with 10% fetal bovine serum (fbs), 100 mg/l streptomycin and 100 iu/ml penicillin, at 37 °c in a humidified atmosphere of 5% co2. cells (1 × 10^4) in their exponential growth phase were seeded into each well of a 96-well culture plate (nunc, denmark) and incubated for 3 h before addition of the peptide. incubation was carried out for another 48 h. radioactive precursor, 1 μci [methyl-3h]thymidine (ge healthcare), was added to each well and incubated for 6 h. the cultures were then harvested by a cell harvester. the incorporated radioactivity was determined by liquid scintillation counting (wong and ng 2003). the activity of sars coronavirus (cov) protease was indicated by cleavage of a designed substrate composed of two proteins linked by a cleavage site for sars cov protease. the reaction was performed in a mixture containing 5 μm sars cov protease, 5 μm sample, 20 μm substrate and buffer [20 mm tris-hcl (ph 7.5), 20 mm nacl and 10 mm β-mercaptoethanol] for 40 min at 37 °c. after 40 min, the reaction was stopped by heating at 100 °c for 2 min. the reaction mixture was then analysed by sds-page. if sars cov protease is inhibited by the test sample, only one band, the intact substrate, is seen in sds-page (leung et al. 2008). the extract of cloud beans was fractionated into three fractions of approximately equal size: an unadsorbed fraction d1 with antifungal activity and two adsorbed fractions d2 and d3 without activity (table 1). fraction d1 was resolved on affi-gel blue gel into a larger unadsorbed fraction b1 devoid of antifungal activity, an adsorbed fraction b2 with antifungal activity eluted with 0.2 m nacl in the tris-hcl buffer, and an adsorbed fraction b3 eluted with 1.0 m nacl in the tris-hcl buffer but without antifungal activity (table 1). 
fraction b2 was separated on sp-sepharose into a broad unadsorbed fraction s1 and three sharp adsorbed fractions s2, s3 and s4 (fig. 1). antifungal activity was detected only in the largest fraction s2 (table 1). fraction s2 was separated by gel filtration on superdex 75 into three fractions su1, su2 and su3 (fig. 2). su3, which displayed a molecular mass of 7.3 kda, was the only fraction with antifungal activity (table 1). its amino acid sequence, which manifests pronounced homology to plant defensins, is recorded in table 2. fraction su3 appeared in sds-page as a single band with a molecular mass of 7.3 kda (fig. 3). the inhibitory action of the purified antifungal peptide represented by fraction su3 on the fungi mycosphaerella arachidicola and fusarium oxysporum is shown in figs. 4 and 5, respectively. there was a dose-dependent inhibition of mycelial growth. the antifungal peptide suppressed mycelial growth in m. arachidicola with an ic50 value of 1.8 μm (fig. 6) and in fusarium oxysporum with an ic50 value of 2.2 μm. the antifungal peptide did not reduce the activity of hiv-1 reverse transcriptase, nor did it affect the activities of hiv-1 integrase and sars coronavirus proteinase (data not shown). it did, however, inhibit the proliferation of l1210 and mbl2 tumor cells with an ic50 of 1.0 μm and 40 μm, respectively (table 3). in the assay of sytox green uptake, cloud bean defensin (10 μm) could induce membrane permeabilization in m. arachidicola and f. oxysporum (wong et al. 2008a,b; lin et al. 2010). table 4 presents a comparison of the characteristics and activities of cloud bean defensin with those of white cloud bean defensin. they are grossly similar. the antifungal peptide isolated from cloud beans in the present study is a defensin as judged by the remarkable homology of its n-terminal amino acid sequence to plant defensins (thevissen et al. 2003). 
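the ic50 values reported above are read off a dose-response graph of % reduction in mycelial colony area against peptide concentration. a minimal python sketch of that read-out, assuming straight-line interpolation between the two doses that bracket 50% reduction, is shown here. the doses and areas in the example are hypothetical and are not measurements from the study, and the function names are illustrative only.

```python
def percent_reduction(control_area, treated_area):
    """Percent reduction in colony area relative to the buffer-only control."""
    if control_area <= 0:
        raise ValueError("control area must be positive")
    return 100.0 * (control_area - treated_area) / control_area


def interpolate_ic50(concentrations, reductions):
    """Linearly interpolate the concentration giving 50% reduction.

    Assumes the tested doses bracket the 50% point; raises otherwise.
    """
    points = sorted(zip(concentrations, reductions))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo <= 50.0 <= r_hi and r_hi != r_lo:
            fraction = (50.0 - r_lo) / (r_hi - r_lo)
            return c_lo + fraction * (c_hi - c_lo)
    raise ValueError("50% reduction is not bracketed by the tested doses")


if __name__ == "__main__":
    # hypothetical three-dose experiment (concentration units are arbitrary here)
    doses = [1.0, 2.0, 4.0]
    control = 10.0
    treated_areas = [7.0, 4.5, 2.0]
    reductions = [percent_reduction(control, a) for a in treated_areas]
    print(interpolate_ic50(doses, reductions))
```

with only three doses, as in the assay described, the interpolated value is a coarse estimate; a fitted sigmoidal curve would need more dose points.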
the peptide has a molecular mass and chromatographic behavior on cationic and anionic exchangers and affi-gel blue gel similar to those of plant defensins reported earlier. it is unadsorbed on deae-cellulose but adsorbed on sp-sepharose and affi-gel blue gel. this chromatographic behavior is also characteristic of antifungal proteins in general (lam et al. 2000; wang and ng 2000, 2005; wang et al. 2004; wong and ng 2003; ye et al. 2001a,b, 2002, 1999; ng 2002, 2000). cloud bean defensin potently inhibits proliferation of tumor cells. antiproliferative activity toward tumor cells is also an attribute of defensins (wong and ng 2003), defensin-like peptides and other antifungal proteins including ribosome inactivating proteins (lam et al. 1998) and chitinase-like proteins (lam et al. 2000). this activity of cloud bean defensin may be a consequence of the protein synthesis inhibitory activity which is a characteristic of antifungal proteins (ng and ye 2003) including ribosome inactivating proteins (ng and parkash 2002). it is noteworthy that cloud bean defensin is inhibitory to both l1210 and mbl2 tumor cells. the antifungal protein passiflin (lam and ng 2009) inhibits proliferation in breast cancer mcf7 cells, but not in hepatoma hepg2 cells. some antifungal proteins including defensin-like peptides (wong and ng 2003), protease inhibitors (ye et al. 2001a), thaumatin-like proteins (wang and ng 2002), chitinase-like proteins (lam et al. 2000) and ribosome inactivating proteins (wong et al. 2008a,b) demonstrate hiv-1 reverse transcriptase inhibitory activity. it is somewhat surprising that cloud bean defensin is devoid of inhibitory activity toward the retroviral enzyme. its lack of inhibition of hiv-1 integrase and sars cov protease is in accord with previous reports on other defensins (leung et al. 2008). 
the present findings on cloud bean defensin are reminiscent of the observation that mungin, an antifungal protein from mung beans, is without hiv-1 reverse transcriptase inhibitory activity (ye and ng 2000). the isolated defensin-like antifungal peptide potently inhibits fungal growth in both fungal species examined. some antifungal proteins are able to inhibit only one out of the several fungi tested. other antifungal proteins have a lower antifungal potency than cloud bean defensin (ye et al. 1999). to recapitulate, a defensin with potent antifungal and antiproliferative activities was isolated from cloud beans. like french bean defensin (leung et al. 2008), it is devoid of inhibitory activity toward hiv-1 integrase and sars coronavirus proteinase. white cloud bean and cloud bean are two cultivars of phaseolus vulgaris. their defensins are similar in molecular mass, n-terminal sequence and antifungal potency. however, white cloud bean defensin, but not cloud bean defensin, has inhibitory activity toward hiv-1 reverse transcriptase. the former also appears to be more potent in inhibitory activity toward tumor cells. hence similar, but not identical, proteins are produced by different cultivars. 
a potent antimicrobial protein from onion seeds showing sequence homology to plant lipid transfer proteins
antifungal activity of a bowman-birk type trypsin inhibitor from wheat kernel
mollisin, an antifungal protein from the chestnut castanea mollissima
purification, characterization, and cdna cloning of an antifungal protein from the hemolymph of sarcophaga peregrina (flesh fly) larvae
maturation of the head of bacteriophage t4. i. dna packaging events
passiflin, a novel dimeric antifungal protein from seeds of the passion fruit
purification and characterization of novel ribosome inactivating proteins, alpha- and beta-pisavins, from seeds of the garden pea pisum sativum
a robust cysteine-deficient chitinase-like antifungal protein from inner shoots of the edible chive allium tuberosum
concurrent purification of two defense proteins from french bean seeds: a defensin-like antifungal peptide and a hemagglutinin
a defensin with highly potent antipathogenic activities from the seeds of purple pole bean
analysis of binding parameters of hiv-1 integrase inhibitors: correlates of drug inhibition and resistance
inhibitory effects of antifungal proteins on human immunodeficiency virus type 1 reverse transcriptase, protease and integrase
hispin, a novel ribosome inactivating protein with antifungal activity from hairy melon seeds
panaxagin, a new protein from chinese ginseng possesses anti-fungal, anti-viral, translation-inhibiting and ribonuclease activities
fabin, a novel calcyon-like and glucanase-like protein with mitogenic, antifungal and translation-inhibitory activities from broad beans
two isoforms of np24: a thaumatin-like protein in tomato fruit
isolation and partial characterization of two antifungal proteins from barley
interactions of antifungal plant defensins with fungal membrane components
the nomenclature of pathogenesis-related proteins
purification, characterization and differential hormonal regulation of a beta-1,3-glucanase and two chitinases from chickpea (cicer arietinum l.)
ginkbilobin, a novel antifungal protein from ginkgo biloba seeds with sequence similarity to embryo-abundant protein
isolation of a novel deoxyribonuclease with antifungal activity from asparagus officinalis seeds
isolation of an antifungal thaumatin-like protein from kiwi fruits
purification of chrysancorin, a novel antifungal protein with mitogenic activity from garland chrysanthemum seeds
an antifungal peptide from the coconut
a non-specific lipid transfer protein with antifungal and antibacterial activities from the mung bean
an antifungal protein from bacillus amyloliquefaciens
gymnin, a potent defensin-like antifungal peptide from the yunnan bean (gymnocladus chinensis baill)
marmorin, a new ribosome inactivating protein with antiproliferative and hiv-1 reverse transcriptase inhibitory activities from the mushroom hypsizigus marmoreus
isolation of a novel peroxidase from french bean legumes and first demonstration of antifungal activity of a non-milk peroxidase
mungin, a novel cyclophilin-like antifungal protein from the mung bean
a bowman-birk-type trypsin-chymotrypsin inhibitor from broad beans
cicerin and arietin, novel chickpea peptides with different antifungal potencies
isolation of a homodimeric lectin with antifungal and antiviral activities from red kidney bean (phaseolus vulgaris) seeds
first chromatographic isolation of an antifungal thaumatin-like protein from french bean legumes and demonstration of its antifungal activity
this work was financially supported by national grants of china (2006bad07a01 and nyhyzx07-008). 
key: cord-254107-02bik024 authors: hillisch, alexander; pineda, luis felipe; hilgenfeld, rolf title: utility of homology models in the drug discovery process date: 2004-08-31 journal: drug discovery today doi: 10.1016/s1359-6446(04)03196-4 sha: doc_id: 254107 cord_uid: 02bik024 abstract advances in bioinformatics and protein modeling algorithms, in addition to the enormous increase in experimental protein structure information, have aided in the generation of databases that comprise homology models of a significant portion of known genomic protein sequences. currently, 3d structure information can be generated for up to 56% of all known proteins. however, there is considerable controversy concerning the real value of homology models for drug design. this review provides an overview of the latest developments in this area and includes selected examples of successful applications of the homology modeling technique to pharmaceutically relevant questions. in addition, the strengths and limitations of the application of homology models during all phases of the drug discovery process are discussed. the majority of drugs available today were discovered either from chance observations or from the screening of synthetic or natural product libraries. the chemical modification of lead compounds, on a trial-and-error basis, typically led to compounds with improved potency, selectivity and bioavailability and reduced toxicity. however, this approach is labor- and time-intensive and researchers in the pharmaceutical industry are constantly developing methods with a view to increasing the efficiency of the drug discovery process [1]. two directions have evolved from these efforts. the 'random' approach involves the development of hts assays and the testing of a large number of compounds. combinatorial chemistry is used to satisfy the need for extensive compound libraries. 
the 'rational', protein structure-based approach relies on an iterative procedure of the initial determination of the structure of the target protein, followed by the prediction of hypothetical ligands for the target protein from molecular modeling and the subsequent chemical synthesis and biological testing of specific compounds (the structure-based drug design cycle). the rational approach is severely limited to target proteins that are amenable to structure determination. although the protein data bank (pdb; http://www.rcsb.org/pdb) is growing rapidly (~13 new entries daily), the 3d structure of only 1-2% of all known proteins has as yet been experimentally characterized. however, advances in sequence comparison, fold recognition and protein-modeling algorithms have enabled the partial closure of the so-called 'sequence-structure gap' and the extension of experimental protein structure information to homologous proteins. the quality of these homology models, and thus their applicability to, for example, drug discovery, predominantly depends on the sequence similarity between the protein of known structure (template) and the protein to be modeled (target). despite the numerous uncertainties that are associated with homology modeling, recent research has shown that this approach can be used to significant advantage in the identification and validation of drug targets, as well as for the identification and optimization of lead compounds. in this review, we will focus on the application of homology models to the drug discovery process. homology, or comparative, modeling uses experimentally determined protein structures to predict the conformation of another protein that has a similar amino acid sequence. the method relies on the observation that in nature the structural conformation of a protein is more highly conserved than its amino acid sequence and that small or medium changes in sequence typically result in only small changes in the 3d structure [2] . 
generally, the process of homology modeling involves four steps: fold assignment, sequence alignment, model building and model refinement (figure 1). the fold assignment process identifies proteins of known 3d structure (template structures) that are related to the polypeptide sequence of unknown structure (the target sequence; not to be confused with the drug target). next, a sequence database of proteins with known structures (e.g. the pdb-sequence database) is searched with the target sequence using sequence similarity search algorithms or threading techniques [3]. following identification of a distinct correlation between the target protein and a protein of known 3d structure, the two protein sequences are aligned to identify the optimum correlation between the residues in the template and target sequences. the next stage in the homology modeling process is the model-building phase. here, a model of the target protein is constructed by substituting amino acids in the 3d structure of the template protein and inserting and/or deleting amino acids according to the sequence alignment. finally, the constructed model is checked with regard to conformational aspects and is corrected or energy minimized using force-field approaches. several improvements and modifications of this general homology modeling strategy have been developed and applied to the prediction of protein structures. to subject the available structure prediction methods to a blind test, community-wide experiments on the critical assessment of techniques for protein structure prediction (casp 1-5) have been performed and their results presented and published. as a result, the current state-of-the-art in protein structure prediction has been established, the progress made has been documented and the areas where future efforts might be most productively concentrated have been highlighted [4, 5]. 
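the sequence alignment step described above can be illustrated with a textbook global alignment. the python sketch below uses the needleman-wunsch dynamic programming algorithm with a simple match/mismatch/gap scheme; this is a swapped-in teaching example, not the algorithm or scoring used by any of the modeling tools named in this review, which rely on substitution matrices and profile methods. the scoring values, function names and the choice to compute identity over all aligned columns (gaps included) are assumptions made for illustration.

```python
def needleman_wunsch(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences; returns the two gapped strings."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # fill the score matrix
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + step,  # align the two residues
                score[i - 1][j] + gap,       # gap in seq_b
                score[i][j - 1] + gap,       # gap in seq_a
            )
    # trace back from the bottom-right corner
    aligned_a, aligned_b = [], []
    i, j = len(seq_a), len(seq_b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                aligned_a.append(seq_a[i - 1])
                aligned_b.append(seq_b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(seq_a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(seq_b[j - 1])
            j -= 1
    return "".join(reversed(aligned_a)), "".join(reversed(aligned_b))


def percent_identity(aligned_a, aligned_b):
    """Identical columns as a percentage of all aligned columns (gaps included)."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    target, template = needleman_wunsch("ACGT", "ACT")
    print(target, template, percent_identity(target, template))
```

the percent identity obtained from such an alignment is the quantity on which the model-quality considerations in this review are based; note that other conventions for the denominator (for example the length of the shorter sequence) give different values.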
homology modeling techniques are dependent on the availability of high-resolution experimental protein structure data. the development of effective protein expression systems and major technological advances in the instrumentation used for structure determination (x-ray crystallography and nmr spectroscopy) has contributed to an exponential growth in the number of experimental protein 3d structures. by may 2004, the pdb contained ~23,000 experimental protein structures for ~7400 different proteins (proteins with less than 90% sequence identity). a recent analysis of all protein chains in the pdb shows that these proteins can be grouped into 2500 protein families comprising 900 unique protein folds [6] (updates can be found at http://scop.mrc-lmb.cam.ac.uk). the majority of the structures in the pdb (84%) were determined by x-ray crystallography, with 15% of the structures being characterized by nmr spectroscopy. the pdb database encompasses experimental information on an extensive array of ligands (small organic molecules and ions) bound to more than 50,000 different binding sites that can be analyzed using programs including relibase (http://relibase.ebi.ac.uk) [7], ligbase (http://alto.compbio.ucsf.edu/ligbase) [8] and pdbsum (http://www.biochem.ucl.ac.uk/bsm/pdbsum) [9]. 
ddt vol. 9, no. 15, august 2004, www.drugdiscoverytoday.com
figure 1. the steps involved in the prediction of protein structure by homology modeling. structure modeling of the bacterial transcriptional repressor copr is shown [28]. although the model is based on a low sequence identity of only 13.8% between copr and the p22 c2 repressor, several experimental methods support this homology model. reproduced, with permission, from ref. [84]. abbreviation: copr, plasmid copy control protein. (panel labels: target sequence; homologs 1 and 2, no 3d structure; templates 1 and 2, 3d structure.) 
although the experimental structure database is growing rapidly, there is still a substantial gap between the number of known annotated sequences [1,182,126 unique sequences in swiss-prot-trembl (http://www.expasy.org/sprot) as of 29 august 2003] and known protein 3d structures (23,000). if only significantly different proteins are considered (~7400), which omits muteins, artificial proteins and multiple structure determinations of the same proteins (e.g. hiv-protease and carbonic anhydrase ii), then less than 1% of the 3d structures of known protein sequences have been elucidated. this sequence-structure gap can partly be filled with homology models. for example, the queryable database modbase (http://alto.compbio.ucsf.edu/modbase-cgi/index.cgi) provides access to an enormous number of annotated comparative protein structure models [10]. the program psi-blast was used to assign protein folds to all 1,182,126 unique sequence entries in swiss-prot-trembl. for 56% of these sequences, comparative models with an average model size of 235 amino acids could be built using the program modeller [11]. thus, by august 2003, 659,495 3d structure models of proteins were accessible via the internet. the models are predicted to have at least 30% of their cα atoms superimposed within 3.5 å of their correct positions. information on binding sites and ligands can be retrieved from this database using ligbase [8]. however, the majority of the models are built on a low sequence identity and it should be realized that this level of accuracy is, in most cases, not sufficient for a detailed structure-based ligand design. the swiss-model repository (http://swissmodel.expasy.org/repository) [12] is also a database of annotated comparative protein 3d structure models, which have been generated using the fully automated homology-modeling pipeline swiss-model. 
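the size of the sequence-structure gap and the modbase coverage quoted above follow from simple ratios of the figures given in the text. a short python check of that arithmetic is sketched here; the variable and function names are illustrative, and the counts are taken from the passage itself.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# counts quoted in the text
UNIQUE_SEQUENCES = 1_182_126      # swiss-prot-trembl, 29 august 2003
PDB_STRUCTURES = 23_000           # experimental structures
DISTINCT_PDB_PROTEINS = 7_400     # significantly different proteins
MODBASE_MODELS = 659_495          # comparative models, august 2003

all_structures_share = percent(PDB_STRUCTURES, UNIQUE_SEQUENCES)
distinct_share = percent(DISTINCT_PDB_PROTEINS, UNIQUE_SEQUENCES)
modbase_coverage = percent(MODBASE_MODELS, UNIQUE_SEQUENCES)

if __name__ == "__main__":
    print(f"all pdb structures / sequences: {all_structures_share:.2f}%")
    print(f"distinct pdb proteins / sequences: {distinct_share:.2f}%")
    print(f"modbase models / sequences: {modbase_coverage:.1f}%")
```

the ratios are consistent with the statements that less than 1% of known sequences have a distinct experimental structure and that models could be built for about 56% of them.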
as of august 2003, the swiss-model repository contained models for 282,096 different protein sequence entries (26%) from the swiss-prot-trembl databases (1,073,566 sequences), with an average model size of ~200 amino acids. researchers from eidogen (http://www.eidogen.com) have created a database system called target informatics platform™ [13] that currently includes homology models for 55,000 proteins. homology modeling of 26,279 human protein sequences resulted in the construction of 17,442 models for 13,114 different sequences (50%). thus, putative and known ligand binding pockets can be detected, analyzed and compared and the resulting data used to support target prioritization and lead discovery and/or optimization procedures. accelrys (http://www.accelrys.com) produces discovery studio (ds) atlasstore™ as a complete oracle®-based protein and ligand structural data management solution. currently, ds atlasstore™ contains 2,052,000 homology models that have been automatically generated from the sequences of 195,000 proteins from 33 different genomes. in conjunction with homology models, cengent therapeutics (http://www.cengent.com) offers dynamic structural information generated from molecular dynamics simulations for 5500 human drug target proteins. this structural information can be used for target prioritization and virtual screening. the quality of the homology models is dependent on the level of sequence identity between the protein of known structure and the protein to be modeled [14]. for a sequence identity that is greater than 30%, homology can be assumed; the two proteins probably have a common ancestor and are, therefore, evolutionarily related and are likely to share a common 3d structure. in this case, pairwise and multiple sequence alignment algorithms are reliable and can be used for the generation of homology models (figure 2). if the sequence identity is below 15%, structure modeling becomes speculative, which could lead to misleading conclusions. 
when the sequence identity is between 15% and 30%, conventional alignment methods are not sufficiently reliable and only sophisticated, profile-based methods are capable of recognizing homology and predicting fold. for regions of low sequence identity, threading methods [15] are often applied. protein models that are built on such low sequence identities can be used for the assignment of protein function and for the direction of mutagenesis experiments ( figure 2 ). models that have a sequence identity between ~30% and 50% could facilitate the structure-based prediction of target drugability, the design of mutagenesis experiments and the design of in vitro test assays ( figure 2 ). if sequence identity is greater than ~50%, the resulting models are frequently of sufficient quality to be used in the prediction of detailed protein-ligand interactions, such as structure-based drug design and prediction of the preferred sites of metabolism of small molecules ( figure 2 ). there are numerous applications for protein structure information and, hence, homology models at various stages of the drug discovery process [16] . the most spectacular successes are clearly those where protein structural information has helped to identify or to optimize compounds that were subsequently progressed to clinical trials or to the drug market [17] . the applications of homology models that had an impact on target identification and/or validation, lead identification and lead optimization are reviewed here ( figure 3 ). it is clear that only a minute fraction of the entire proteome can be affected by drug-like (preferentially orally bioavailable) small molecules. based on the total numbers of known genes, disease-modifying genes and drugable proteins, the number of drug target proteins, for humans, has been estimated at 600-1500 [18] . 
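the sequence-identity ranges discussed above can be restated as a small lookup. the following python sketch is illustrative only: the function name `model_applications` and the handling of the boundary values are assumptions made here, because the text gives the ~30% and ~50% cutoffs as approximate guides rather than sharp limits.

```python
def model_applications(identity_pct: float) -> str:
    """Map target-template sequence identity (%) to the typical uses of a
    homology model, following the approximate ranges given in the text."""
    if not 0.0 <= identity_pct <= 100.0:
        raise ValueError("sequence identity must be between 0 and 100")
    if identity_pct < 15.0:
        # Below 15%: structure modeling becomes speculative.
        return "speculative: modeling may lead to misleading conclusions"
    if identity_pct < 30.0:
        # 15-30%: profile-based or threading methods are needed.
        return "function assignment and direction of mutagenesis experiments"
    if identity_pct <= 50.0:
        # ~30-50% (boundary handling is an assumption of this sketch).
        return "drugability prediction, mutagenesis design and assay design"
    # Above ~50%.
    return "detailed protein-ligand interactions and structure-based drug design"


# Sequence identities quoted elsewhere in this article.
for name, identity in [("erβ-lbd vs erα", 59), ("sars-cov m pro vs tgev m pro", 44)]:
    print(f"{name}: {identity}% -> {model_applications(identity)}")
```

in practice the identity near the binding site can matter more than the overall value, so a lookup of this kind is only a first filter.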
for small molecules, sets of properties have been established that differentiate drugs from other compounds [19, 20] ; these properties can be used to identify compounds with, for example, poor oral absorption properties [21] . drug molecules and their corresponding target proteins are highly complementary, which suggests that some rules that distinguish good target proteins from others should be deducible [22] . deep lipophilic pockets that comprise distinct polar interaction sites are clearly superior to shallow highly charged protein surface regions. the inhibition of protein-protein interfaces as a valuable therapeutic principle has recently been shown with inhibitors of the p53-murine double minute clone 2 (mdm2) interaction [23, 24] . the binding site for these inhibitors is a distinct lipophilic pocket that normally interacts with the α-helical surface patches of the p53 tumor suppressor transactivation domain. advances in the rapid detection, description and analysis of ligand-binding pockets [25] [26] [27] , together with the availability of more than 0.5 million homology models, will open new possibilities for the prioritization of proteins with regards to drugability. in the pharmaceutical industry, structural aspects are being increasingly implemented as additional decision criteria on the drugability of potential drug targets. companies such as inpharmatica (http:// www.inpharmatica.com) have developed an integrated suite of informatics-based discovery technologies that contain software tools for the structure-based assessment of target drugability. the design of site-directed mutant proteins is one further important option for the application of homology models to target validation. introducing point mutations and subsequently studying the effects in vitro or in vivo is a common approach in molecular biology. 
this strategy enables the identification of amino acids that are functionally or structurally important in the protein under investigation, which ultimately contributes to biological knowledge on, for example, potential target proteins.

figure 2. relationship between target and template sequence identity and the information content of resulting homology models. arrows indicate the methods that can be used to detect sequence similarity between target and template sequences. applications of the homology models in drug discovery are listed to the right. the higher the sequence identity, the more accurate the resulting structure information. homology models that are built on sequence identities above ~50% can frequently be used for drug design purposes. superimpositions of x-ray crystal structures of the ligand-binding domains of members of the nuclear receptor family are shown to the left. these x-ray structures illustrate the increase in structure deviation with a decreased sequence identity. the pr is red, the gr is green, the erα is blue and the trβ is cyan. sequence identities: pr:gr, 54%; pr:erα, 24%; and pr:trβ, 15%. abbreviations: erα, estrogen receptor α; gr, glucocorticoid receptor; pr, progesterone receptor; trβ, thyroid receptor β.

typically, the amino acids that are to be modified in these studies are selected on the basis of sequence alignments by focusing on conserved residues. however, if at least some structure information is available, the selection of the amino acids that are to be mutated can be much more precise and successful [28] . this approach is even more powerful when applied in conjunction with pharmacologically active compounds. site-directed mutants of the target protein can be made to render that target sensitive to an existing pharmacological agent. based on homology models, some members of the mitogen-activated protein (map) kinase family were mutated to make them sensitive to a kinase inhibitor from the pyridinyl imidazole class [29] .
this enabled the use of the compound for broader target validation studies. one of the most attractive ways to validate a target protein is to administer a pharmacologically active compound that selectively acts on that protein and to study the effects in a relevant animal model. similar strategies have been described under the term 'chemogenomics' [30] . it has recently been shown that it is possible to design small molecules based on homology models and then to use these compounds as tools to study the physiological role of the respective target protein [31] . eight years after the discovery of estrogen receptor β (erβ), the distinct roles of the two er isotypes, erα and erβ, in mediating the physiological responses to estrogens are not completely understood. although knockout animal experiments have provided an insight into estrogen signaling, additional information on the function of erα and erβ was gained by the application of isotype-selective er agonists. based on the crystal structure of the erα ligand-binding domain (lbd) and a homology model of the erβ-lbd (59% sequence identity to erα), hillisch et al. [31] designed steroidal ligands that exploit the differences in size and flexibility of the two ligand-binding cavities ( figure 4 ). compounds that were predicted to bind preferentially to either erα or erβ were synthesized and tested in vitro. this approach led directly to highly er isotype-selective (200-250-fold) ligands that were also highly potent. to unravel the physiological roles of each of the two receptors, in vivo experiments with rats were conducted using the erα- and erβ-selective agonists in parallel with the natural ligand of er, 17β-estradiol. the erα agonist was shown to be responsible for most of the known estrogenic effects (e.g. induction of uterine growth and bone-protection), in addition to pituitary (e.g. reduction of luteinizing hormone plasma levels) and liver (e.g.
increase in angiotensin i plasma levels) effects [31] . however, the erβ agonist had distinct effects on the ovary, for example, the stimulation of early folliculogenesis [32] , which possibly presents clinicians with a new option for tailoring classical ovarian stimulation protocols. a comparison of the homology model with the x-ray crystal structure of the erβ-lbd complexed with genistein [33] revealed that the homology model had a root-mean-square deviation (rmsd) of the backbone atoms (not considering helix 12) of 1.4 å. the x-ray crystal structure confirmed the presence of essential interactions between the ligand and the erβ and did not reveal, at least in this case, any new aspects for the design of erβ agonists that were not covered by the homology model. these studies show that it is possible to design highly selective compounds, if structure information on all of the relevant homologs of the target is available.

663 ddt vol. 9, no. 15 august 2004 reviews research focus www.drugdiscoverytoday.com

figure 3. applications of homology models in the drug discovery process. the enormous amount of protein structure information currently available could not only support lead compound identification and optimization, but could also contribute to target identification and validation. reproduced, with permission, from [84] .

there are numerous examples where protein homology models have supported the discovery and the optimization of lead compounds with respect to potency and selectivity. currently, the structures of 40 of the 518 known different human protein kinases have been characterized by x-ray crystallography [34] . homology model-based drug design has been applied to epidermal growth factor-receptor tyrosine kinase protein [35, 36] , bruton's tyrosine kinase [37] , janus kinase 3 [38] and human aurora 1 and 2 kinases [39] . using the x-ray crystal structure of cyclin-dependent kinase 2 (cdk2), honma et al. [40] generated a homology model of cdk4.
this model guided the design of highly potent and selective cdk4 inhibitors that were targeted towards the atp binding pocket. the diarylurea class of compounds was subsequently synthesized and tested. in an in vitro inhibition assay, the most potent compound had an ic50 of 42 nm. the predicted binding mode of the lead compound was verified by co-crystallization with cdk2 [40] . vangrevelinghe et al. [41] identified a protein kinase ck2 inhibitor using a homology model of the protein and high-throughput docking. siedlecki et al. [42] have demonstrated the utility of homology modeling in the prediction of pharmacologically active compounds. alterations in dna methylation patterns play an important role in tumorigenesis; therefore, inhibitors of dna methyltransferase 1 (dnmt1), which is the protein that represents the major dna methyltransferase activity in human cells, are desired. known inhibitors from the 5-azacytidine class were docked into the active site of a dnmt1 homology model, which led to the design of n4-fluoroacetyl-5-azacytidine derivatives that acted as highly potent inhibitors of dna methylation in vitro. thrombin-activatable fibrinolysis inhibitor (tafi) is an important regulator of fibrinolysis, and inhibitors of this enzyme have potential use in antithrombotic and thrombolytic therapy. based on a homology model of tafi (~50% sequence identity to carboxypeptidases a and b), appropriately substituted imidazole acetic acids were designed and were subsequently found to be potent and selective inhibitors of activated tafi [43] . homology models of the voltage-gated k+ channel kv1.3 and the ca2+-activated channel ikca1 were used to design selective ikca1 inhibitors that were based on the polypeptide toxin charybdotoxin. comparison of the two models revealed a unique cluster of negatively charged residues in the turret of kv1.3 that were not present in ikca1.
to exploit this difference, the homology model was used to design novel analogs, which were then synthesized and tested. the novel compounds blocked ikca1 with ~20-fold higher affinity than kv1.3 [44] .

figure 4. (a,b) comparison of the two isotypes of the estrogen receptor. in the homology model, erα (blue) and erβ (green) ligand-binding pockets are shown in complex with the natural ligand of the er, 17β-estradiol. the binding of 8β-ve2, a highly potent and selective erβ agonist, modeled into the erβ ligand-binding niche is depicted to the right. reproduced, with permission, from ref. [31] . (c,d) a model of the antiprogestin ru 486 (mifepristone) bound to hpr. a single amino acid mutation renders this compound inactive at the cpr and hamster pr. steric clashes between ru 486 and cpr are shown on the right side. abbreviations: er, estrogen receptor; her, human estrogen receptor; cpr, chicken progesterone receptor; hpr, human progesterone receptor; pr, progesterone receptor; rba, relative binding affinity; 8β-ve2, 8β-vinylestra-1,3,5(10)-triene-3,17β-diol.

the key proteinase (m pro, or 3cl pro) of the new coronavirus (cov) that caused the severe acute respiratory syndrome (sars) outbreak of 2003 (sars-cov) is another example of successful homology model building; in this case, success is defined as the ability to use the model to propose an inhibitor that has significant affinity for the target enzyme. x-ray crystal structures of the m pro of transmissible gastroenteritis virus (tgev, a porcine coronavirus) and of human coronavirus 229e [45, 46] have been determined. these proteinases have 44 and 40% sequence identity, respectively, with the key proteinase of sars-cov. following publication of the genome sequence of the new virus, first on the internet and a few weeks later in print [46, 47] , the level of sequence identity between the proteinases enabled anand et al.
[46] to construct a 3d homology model for the m pro of human cov. however, the 3d homology model generated was insufficient for the design of inhibitors with reasonable confidence. to establish the structural basis of the interaction with the polypeptide substrate of the m pro , anand and co-workers [46] synthesized a substrate-analogous hexapeptidyl chloromethylketone inhibitor that was complexed with tgev m pro . the x-ray crystal structure of the complex was then determined, which revealed that, as expected, the chloromethylketone moiety had covalently reacted with the active-site cysteine residue of the proteinase. the p1, p2, and p4 side chains of the inhibitor had bound to, and thereby defined, the specificity binding sites of the target enzyme. the experimentally determined structure of the inhibitor-tgev m pro complex was then compared with all inhibitor complexes of cysteine proteinases in the pdb, which revealed a surprisingly similar inhibitor binding mode in the complex of human rhinovirus type 2 (hrv2) 3c proteinase with ag7088 ( figure 5 ) [48] . at that time, ag7088 was in late phase ii clinical trials as a drug for the treatment of the strain of the common cold that is caused by human rhinovirus. the comparison of the crystal structures of hrv2 in complex with ag7088 and tgev m pro in complex with the hexapeptidyl chloromethylketone inhibitor revealed little similarity between the two target enzymes, except in the immediate neighborhood of the catalytic cysteine residue, but an almost perfect match of the inhibitors. to investigate these findings further, ag7088 was docked into the substrate-binding site of the sars-cov m pro model without much difficulty, although it was noted that there could potentially be steric problems with the p-fluorobenzyl group in the s2 pocket, and also with the ethylester moiety in s1′. 
therefore, it was proposed that, although ag7088 was not an ideal inhibitor, this compound should be a good starting point for the design of anti-sars drugs. indeed, only a few days after the on-line publication of these results in sciencexpress [46] , it was confirmed that ag7088 had anti-sars activity in vitro. derivatives of ag7088 with modified p2 residues have since been shown to have k i values in the lower µmolar range (rao et al., pers. commun.) . the crystal structure of the authentic sars-cov key proteinase was determined a few months later [49] . although the dimeric structure showed the expected similarity to the homologous enzymes of tgev and human cov 229e, there were interesting differences in detail. in particular, one of the monomers in the dimer was observed to be in an inactive conformation, which was thought to be the result of the low ph of crystallization. the overall rmsd for the entire dimer from the homology model of anand et al. [46] was >3.0 å (i.e. no residues excluded from the comparison), which dropped to 2.1 å when a few outliers at the carboxy terminus were excluded from the comparison, and to <1.8 å for each of the three individual domains of the enzyme. other homology models were generated (d. debe, unpublished and [50] ) and virtual screening has been performed using a sars-cov m pro model [51] . taken together, these findings confirm that homology modeling is often inadequate for the prediction of the mutual orientation of domains in multidomain proteins. however, the homology model generated by anand et al. [46] also shows that a reasonable model of a substrate-binding site can serve to develop useful ideas for inhibitor design that can inspire medicinal chemists to start a synthesis program long before the 3d structure of the target enzyme is experimentally determined. 
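the rmsd values quoted above depend on which atoms enter the comparison. as a minimal python sketch, assuming the two coordinate sets have already been superimposed and paired atom by atom (the superposition step itself is not shown), the effect of excluding a few outliers can be illustrated as follows. the function name `rmsd`, the `exclude` parameter and the toy coordinates are hypothetical and are not taken from the studies cited.

```python
import math


def rmsd(coords_a, coords_b, exclude=()):
    """Root-mean-square deviation (same length unit as the input, e.g. angstrom)
    between two paired, pre-superimposed lists of (x, y, z) coordinates.
    Indices listed in `exclude` are left out of the comparison."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    skipped = set(exclude)
    squared = [
        sum((p - q) ** 2 for p, q in zip(a, b))
        for i, (a, b) in enumerate(zip(coords_a, coords_b))
        if i not in skipped
    ]
    if not squared:
        raise ValueError("no atoms left to compare")
    return math.sqrt(sum(squared) / len(squared))


# Toy data: the last atom plays the role of a terminal outlier.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
xray = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 3.0)]

print(rmsd(model, xray))               # all atoms included
print(rmsd(model, xray, exclude=[2]))  # outlier excluded
```

a single badly placed residue dominates the overall value in this toy case, which mirrors the drop in rmsd reported when the carboxy-terminal outliers were left out.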
in the case of g-protein-coupled receptors (gpcr), homology-modeling approaches are limited by the lack of experimentally determined structures and by the low sequence similarity between the structures that have been characterized and pharmacologically important target proteins. the x-ray crystal structure of only one gpcr, bovine rhodopsin, has been determined [52] . this structure is complemented by bacteriorhodopsin, which is a transmembrane protein that comprises seven helices and is also of relevance for modeling approaches, even though this protein is not a gpcr. some examples of homology models for gpcrs and their utility have recently been reviewed [53] . high-throughput docking has been applied to verify the ability of homology models to identify glucocorticoid receptor agonists [54] , antagonists of retinoic acid receptor α [55] and of the d3-dopamine, m1-muscarinic acetylcholine and v1a-vasopressin receptors [56] , and inhibitors of thrombin [57] . in the identification of thrombin inhibitors, homology models of thrombin were built retrospectively and were based on homologous serine proteases (28%-40% sequence identity); the best docking solutions were obtained with those models that were derived from proteins of higher sequence identity. recently, the performance of docking studies into protein active sites that had been constructed from homology models was assessed using experimental screening datasets of cdk2 and factor viia [58] . when the sequence identity between the model and the template near the binding site was greater than ~50%, approximately five times more active compounds were identified than would have been detected by random selection. this performance is comparable to docking to crystal structures. a further application of homology models is the design of test assays for the in vitro pharmacological characterization of compounds or hts.
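the roughly fivefold gain over random selection reported above for docking into homology models is commonly expressed as an enrichment factor. the python sketch below shows one widely used definition, assuming a ranked screening list from which a top-scoring subset is selected; the function name `enrichment_factor` and the numbers in the example are illustrative assumptions and are not data from [58].

```python
def enrichment_factor(hits_selected, n_selected, actives_total, n_total):
    """Ratio of the hit rate in the selected subset to the hit rate expected
    from random selection out of the whole library."""
    if n_selected <= 0 or actives_total <= 0 or n_total <= 0:
        raise ValueError("counts must be positive")
    if n_selected > n_total or actives_total > n_total:
        raise ValueError("subset and actives cannot exceed the library size")
    if not 0 <= hits_selected <= min(n_selected, actives_total):
        raise ValueError("hits_selected is out of range")
    # (hits / selected) / (actives / total), rearranged to a single division.
    return (hits_selected * n_total) / (n_selected * actives_total)


# Hypothetical screen: 1000 compounds, 20 actives, top 100 docked compounds
# contain 10 of the actives.
print(enrichment_factor(10, 100, 20, 1000))  # 5.0
```

a value of 1.0 corresponds to random selection, so the hypothetical screen above matches the fivefold case described in the text.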
based on the structure of the coiled-coil domain of c-jun, models for α-helical proteins were designed such that they can be used as affinity-tagged proteins that incorporate protease cleavage sites [59] . the resulting 10.5 kda recombinant proteins were synthesized and used as molecularly defined and uniform substrates for in vitro detection of hiv-1 and iga endoprotease activity, which enabled the surface plasmon resonance-based screening of inhibitors. the enormous volume of structure information on entire target protein families that is available might also have an impact on screening cascades. many drug discovery projects endeavor to identify ligands that are highly selective for particular drug targets. selective compounds are expected to be superior because such compounds typically lead to fewer adverse side effects (e.g. cox-2 inhibitors). however, it is not always clear, particularly within the larger target protein families, which homologs of the actual target are the most important ones for the desired drug to avoid. the sequence similarity of the full-length proteins or entire domains might not always be representative of the conservation of the ligand-binding pockets. comparison of the shape and features of the binding pockets within a protein family could indicate which homologs should be included in the screening cascade for so-called 'counter screening'. the structure information that is currently available on entire protein families (e.g. proteases, kinases and nuclear receptors) could contribute to the design of selective compounds or better screening cascades, both of which could potentially advance the design of drugs that have fewer side effects. a detailed structural knowledge of the ligand-binding sites of target proteins was also shown to facilitate the selection of animal models for ex vivo or in vivo experiments.
the proposal is that animals having target proteins with significantly different binding sites compared with human orthologs should be excluded as pharmacological models. many promising compounds showing high-potency in human in vitro assays have not reached clinical trials because efficacy could not be demonstrated in animal models. single amino acid differences between humans and animals might, in some cases, be sufficient to cause such effects. the er selectivity of ligands described by hillisch et al. [31] was shown by homology models and in vitro assays to be crucially dependent on the interaction of ligand substituents with one particular amino acid that differs between erα and erβ (figure 4a ) [31] . to ensure that this important interaction is present in estrogen receptors of all animal models that are used to characterize compounds [32] , homology models of murine, rat and bovine erβ were built and compared with the binding pocket of human erβ (herβ). a complete conservation of amino acids within the binding pockets of human, murine and rat erβ was observed. however, bovine erβ showed one amino acid difference at the exact position that was determined to be crucial for erβ ligand selectivity. the prediction that the herβ selective compounds should not bind to bovine erβ was later verified using transactivation experiments (unpublished results). thus, the implementation of uninterpretable experiments could be avoided at an early stage and the otherwise attractive bovine tissues (later available in larger amounts) could be excluded from ex vivo investigations. similarly, information on the structure of progesterone receptors (pr) can be used to explain the abolished binding of the progesterone antagonist mifepristone (ru 486) to chicken pr and hamster pr [60] . a single point mutation (human pr gly722 to chicken pr cys575) prevents antiprogestins containing 11β-aryl substituents (e.g. 
ru 486) from binding to chicken (and hamster) pr (figure 4c) , which therefore excludes hamsters, for example, from pharmacological studies with antiprogestins [61] . in the future, such effects could be predicted and particular species could then be excluded from pharmacological studies at an early stage, which would ultimately reduce attrition rates in the drug discovery process. one of the challenges in lead optimization is to identify compounds that not only show a high potency at the desired target protein but also have adequate physical properties to reach systemic circulation, to resist metabolic inactivation for a specific time period and to avoid undesired pharmacological effects. knowledge of the structure of the proteins that are involved in these processes, such as drugmetabolizing enzymes, transcription factors or transporters, could help to design molecules that do not interact with these 'non-target' proteins. the cytochrome p450s (cyp) are an extremely important class of enzymes that are involved in phase i oxidative metabolism of structurally diverse chemicals. only ~10 hepatic cyps are responsible for the metabolism of 90% of known drugs. recently, the x-ray crystal structures of three mammalian cyps, cyp2c5 [62] , cyp2c8 [63] and cyp2c9 [64] , have been solved and represent a solid basis for the homology modeling of this entire superfamily. models of cyp1a2, cyp2a6, cyp2b6, cyp2c8, cyp2c9, cyp2c19, cyp2d6, cyp2e1, cyp3a4 and cyp4a11 have been generated using different structure templates. these models have been used to explain and to predict the probable sites of metabolic attack in a variety of cyp substrates [65] [66] [67] [68] [69] [70] [71] [72] . however, the large lipophilic and highly flexible character of some cyp binding cavities renders pure in silico approaches towards the prediction of the occurrence and site of small molecule metabolism extremely difficult. 
if protein structure information is combined with pharmacophoric patterns and quantum mechanical calculations, some predictions concerning the preferred sites of metabolism within small molecules are possible [73] . regarding this aspect of homology modeling, cyp2d6 is a particularly interesting cyp because 5-9% of the caucasian population does not produce this polymorphic member of the cyp superfamily. the resulting deficiencies in drug oxidation can lead to severe side effects in these individuals. predictions on whether or not a lead compound could act as a cyp2d6 substrate could help to identify problematic cases early in drug discovery. combined homology modeling and quantitative sar approaches are also able to predict cyp inhibitors [74] . thus, in the future, protein structure information in conjunction with high-throughput docking and pharmacophore-based methods could be used to decide which compounds have the potential to inhibit particular cyps. this approach could facilitate the detection of potential drug-drug interactions early in the drug discovery process and measures could then be taken to avoid such interactions [75] . cyp substrates and inhibitors are not the only compounds to have been studied using homology models. these approaches have been used recently to investigate cyp inducers. the induction of cyps is primarily mediated via the activation of ligand-dependent transcription factors, such as the aryl hydrocarbon receptor (ahr) for the cyp1a family, the constitutive androstane receptor (car) for the cyp2b family and the pregnane x receptor (pxr), glucocorticoid receptor (gr) and vitamin d receptor (vdr) for the cyp3a family [76] . in principle, the in silico prediction of drug-metabolizing enzyme induction could be reduced to predicting the binding and activation of transcription factors (e.g. ahr and car).
however, recent x-ray structure analyses of pxr have shown that the lbd of this nuclear receptor contains a large lipophilic and flexible binding pocket [77] . this renders pure in silico structure-based predictions concerning whether or not a small molecule will activate pxr difficult. the homology modeling of car [78, 79] and other members of the nuclear receptor family involved in cyp induction [80] has recently been described. these models predict reasonably shaped potential ligand binding pockets. however, further results on the utility of these models are needed. with respect to the structure-based prediction of adverse health effects, progress has been described with the potassium channel encoded by the human ether-a-go-go-related gene (herg). this tetrameric potassium channel contributes to phase three repolarization of heart muscle cells by opposing the depolarizing ca2+ influx during the plateau phase. inhibition of this protein results in cardiovascular toxicity (qt-prolongation) and has caused several drugs to be withdrawn from the market. therefore, in silico predictions of the probability that a drug interacts with herg have gained enormous attention and have recently been reviewed [81] . homology models of herg, which are based on the x-ray crystal structures of the bacterial kcsa [82] and mthk channels [83] , have already shed light on some details of the molecular interactions that initiate herg inhibition. however, the complexity of this potassium channel means that detailed x-ray structure analyses of the protein in the open and closed states are required before these molecular interactions can be fully understood and predicted, which has implications for the prediction of cardiotoxicity.

numerous examples for the successful application of homology modeling in drug discovery are described here. in the absence of experimental structures of drug target proteins, homology models have supported the design of several potent pharmacological agents.
one of the advantages of homology models is that these models can be generated relatively easily and quickly. furthermore, such models could support the hypotheses of medicinal chemists on how to generate biologically active compounds in the important early conceptual phase of a drug discovery project. the design of compounds that are selectively directed at particular drug target proteins is one of the strengths of this technique. such selective compounds can even be applied to gain insights into the physiological role of novel drug targets. the in silico protein structure-based prediction of metabolism and toxicity of small molecules, particularly cyp inhibition and induction and herg inhibition, is currently in its infancy and predictive capabilities could be limited to classification only. however, while complete experimental structures of pharmacologically important proteins are missing, the homology modeling technique provides one approach to bridge the gap until this information becomes available.
[1] modern methods of drug discovery: an introduction
[2] the response of protein structures to amino-acid sequence changes
[3] fold recognition methods
[4] progress in protein structure prediction
[5] assessment of homology-based predictions in casp5
[6] scop: a structural classification of proteins database for the investigation of sequences and structures
[7] databases for protein-ligand complexes
[8] ligbase: a database of families of aligned ligand binding sites in known protein sequences and structures
[9] pdbsum: summaries and analyses of pdb structures
[10] modbase, a database of annotated comparative protein structure models, and associated resources
[11] comparative protein modelling by satisfaction of spatial restraints
[12] the swiss-model repository of annotated three-dimensional protein structure homology models
[13] supporting your pipeline with structural knowledge
[14] the relation between the divergence of sequence and structure in proteins
[15] casp5 assessment of fold recognition target predictions
[16] the role of protein 3d structures in the drug discovery process
[17] the impact of structure-guided drug design on clinical agents
[18] the druggable genome
[19] prediction of 'drug-likeness'
[20] a scoring scheme for discriminating between drugs and nondrugs
[21] drug-like properties and the causes of poor solubility and poor permeability
[22] structural bioinformatics in drug discovery
[23] inhibition of the p53-hdm2 interaction with low molecular weight compounds
[24] inhibition of the p53-mdm2 interaction: targeting a protein-protein interface
[25] a new method to detect related function among proteins independent of sequence and fold homology
[26] inferring functional relationships of proteins from local sequence and spatial surface patterns
[27] structural classification of protein kinases using 3d molecular interaction field analysis of their ligand binding sites: target family landscapes
[28] transcriptional repressor copr: structure model-based localization of the deoxyribonucleic acid binding motif
[29] use of a drug-resistant mutant of stress-activated protein kinase 2a/p38 to validate the in vivo specificity of sb 203580
[30] chemogenomics: an emerging strategy for rapid target and drug discovery
[31] dissecting physiological roles of estrogen receptor alpha and beta with potent selective ligands from structure-based design
[32] impact of isotype-selective estrogen receptor agonists on ovarian function
[33] structure of the ligand-binding domain of oestrogen receptor beta in the presence of a partial agonist and a full antagonist
[34] the protein kinase complement of the human genome
[35] design and synthesis of novel tyrosine kinase inhibitors using a pharmacophore model of the atp-binding site of the egf-r
[36] rational design of potent and selective egfr tyrosine kinase inhibitors as anticancer agents
[37] rational design and synthesis of a novel antileukemic agent targeting bruton's tyrosine kinase (btk), lfm-a13
[38] structure-based design of specific inhibitors of janus kinase 3 as apoptosis-inducing antileukemic agents
[39] targeting aurora2 kinase in oncogenesis: a structural bioinformatics approach to target validation and rational drug design
[40] structure-based generation of a new class of potent cdk4 inhibitors: new de novo design strategy and library design
[41] discovery of a potent and selective protein kinase ck2 inhibitor by high-throughput docking
[42] establishment and functional validation of a structural homology model for human dna methyltransferase 1
[43] synthesis and evaluation of imidazole acetic acid inhibitors of activated thrombin-activatable fibrinolysis inhibitor as novel antithrombotics
[44] structure-guided transformation of charybdotoxin yields an analog that selectively targets ca2+-activated over voltage-gated k+ channels
[45] structure of coronavirus main proteinase reveals combination of a chymotrypsin fold with an extra alpha-helical domain
[46] coronavirus main proteinase (3clpro) structure: basis for design of anti-sars drugs
[47] characterization of a novel coronavirus associated with severe acute respiratory syndrome
[48] structure-assisted design of mechanism-based irreversible inhibitors of human rhinovirus 3c protease with potent antiviral activity against multiple rhinovirus serotypes
[49] the crystal structures of severe acute respiratory syndrome virus main protease and its complex with an inhibitor
[50] evaluation of homology modeling of the severe acute respiratory syndrome (sars) coronavirus main protease for structure-based drug design
[51] a 3d model of sars cov 3cl proteinase and its inhibitors design by virtual screening
[52] crystal structure of rhodopsin: a g-protein-coupled receptor
[53] modeling the 3d structure of gpcrs: advances and application to drug discovery
[54] nuclear hormone receptor targeted virtual screening
[55] rational discovery of novel nuclear hormone receptor antagonists
[56] protein-based virtual screening of chemical databases. ii. are homology models of g-protein-coupled receptors suitable targets?
[57] docking ligands onto binding site representations derived from proteins built by homology modelling
[58] performance of 3d database molecular docking studies into homology models
[59] design of helical proteins for real-time endoprotease assays
[60] a single amino acid that determines the sensitivity of progesterone receptors to ru486
[61] ru486 is not an antiprogestin in the hamster
[62] mammalian microsomal cytochrome p450 monooxygenase: structural adaptations for membrane binding and functional diversity
[63] structure of human microsomal cytochrome p450 2c8. evidence for a peripheral fatty acid binding site
[64] crystal structure of human cytochrome p450 2c9 with bound warfarin
[65] molecular modeling of human cytochrome p450-substrate interactions
[66] modelling human cytochromes p450 involved in drug metabolism from the cyp2c5 crystallographic template
[67] homology modelling of human cyp1a2 based on the cyp2c5 crystallographic template structure
[68] homology modelling of cyp2a6 based on the cyp2c5 crystallographic template: enzyme-substrate interactions and qsars for binding affinity and inhibition
[69] molecular modelling of cyp2b6 based on homology with the cyp2c5 crystal structure: analysis of enzyme-substrate interactions
[70] a molecular model of cyp2d6 constructed by homology with the cyp2c5 crystallographic template: investigation of enzyme-substrate interactions
[71] investigation of enzyme selectivity in the human cyp2c subfamily: homology modelling of cyp2c8, cyp2c9 and cyp2c19 from the cyp2c5 crystallographic template
[72] prediction of drug metabolism: the case of cytochrome p450 2d6
[73] a novel approach to predicting p450 mediated drug metabolism.
cyp2d6 catalyzed n-dealkylation reactions and qualitative metabolite predictions using a combined protein and pharmacophore model for cyp2d6 competitive cyp2c9 inhibitors: enzyme inhibition studies, protein homology modeling, and three-dimensional quantitative structure-activity relationship analysis molecular basis of p450 inhibition and activation: implications for drug development and drug therapy prediction of human drug metabolizing enzyme induction coactivator binding promotes the specific interaction between ligand and the pregnane x receptor a structural model of the constitutive androstane receptor defines novel interactions that mediate ligandindependent activity insights from a three-dimensional model into ligand binding to constitutive active receptor molecular modelling of the human glucocorticoid receptor (hgr) ligand-binding domain (lbd) by homology with the human estrogen receptor alpha (heralpha) lbd: quantitative structure-activity relationships within a series of cyp3a4 inducers where induction is mediated via hgr involvement predicting undesirable drug interactions with promiscuous proteins in silico a structural basis for drug-induced long qt syndrome characterization of herg potassium channel inhibition using comsia 3d qsar and homology modeling approaches modern methods of drug discovery we gratefully acknowledge fruitful discussions with mario lobell (bayer healthcare; http://www.bayerhealthcare.com), derek debe, sean mullen (eidogen) and sunil patel (accelrys). 
key: cord-256316-1odgm6hm authors: godet, murielle; l'haridon, rene; vautherot, jean-francois; laude, hubert title: tgev corona virus orf4 encodes a membrane protein that is incorporated into virions date: 1992-06-30 journal: virology doi: 10.1016/0042-6822(92)90521-p sha: doc_id: 256316 cord_uid: 1odgm6hm abstract the coding potential of the open reading frame orf4 (82 amino acids) of transmissible gastroenteritis virus (tgev) has been confirmed by expression using a baculovirus vector. five monoclonal antibodies (mabs) raised against the 10k recombinant product immunoprecipitated a polypeptide of a similar size in tgev-infected cells. immunofluorescence assays performed both on insect and mammalian cells revealed that orf4 was a membrane-associated protein, a finding consistent with the prediction of a membrane-spanning segment in orf4 sequence. two epitopes were localized within the last 21 c-terminal residues of the sequence through peptide scanning and analysis of the reactivity of a truncated orf4 recombinant protein. since the relevant mabs were found to induce a cell surface fluorescence, these data suggest that orf4 may be an integral membrane protein having a cexo-nendo orientation. anti-orf4 mabs were also used to show that orf4 polypeptide may be detected in tgev virion preparations, with an estimated number of 20 molecules incorporated per particle. comparison of amino acid sequence data provided strong evidence that other coronaviruses encode a polypeptide homologous to tgev orf4. our results led us to propose that orf4 represents a novel minor structural polypeptide, tentatively designated sm (small membrane protein). virology 188, 666-675 (1992)
transmissible gastroenteritis virus (tgev), an important pathogen of swine, is a member of the coronaviridae, a family of enveloped viruses with a large (~30 kb) continuous, positive rna genome. sequencing data have led to the identification of a number of large open reading frames (orfs). orf1a and b, which account for the 5' two-thirds of the coronavirus genome, are assumed to encode nonstructural proteins including the viral replicase/transcriptase. the seven to eight remaining orfs are expressed through a set of 3' coterminal, subgenomic size mrnas of which only the unique region is translationally active.
these include the orfs coding for the virion structural proteins, i.e., the nucleocapsid (n) and two or three envelope glycoproteins: the spike (s) and the membrane (m) proteins, and the hemagglutinin-esterase (he) present in a coronavirus subset. these orfs are distributed following the consensus gene order 5' pol-(he)-s-m-n 3'. the other orfs are interspersed within the genome and their number and position differ among coronavirus members (reviewed by spaan et al., 1988; and lai, 1990). they have been shown to be expressed by functionally mono-, di-, or tricistronic mrnas and were generally assumed to encode nonstructural proteins, the function of which is still unknown or conjectural. in tgev genome, four such orfs have been deduced. two of them are expressed by the same mrna (mrna 3) in two out of the three virus strains sequenced (rasschaert et al., 1987; kapke et al., 1988; britton et al., 1989; wesley et al., 1989). the predicted product of orf3a is 61 to 71 codons long, with a variable c-terminal end. it appears to be dispensable for virus replication since it was found to be absent in a tgev variant strain sp (wesley et al., 1990) as well as in the closely related porcine respiratory coronavirus (prcv) (rasschaert et al., 1990). orf3b (expressed by a separate mrna species numbered 3-1 in miller strain) has a constant length (244 residues); however, one clone of purdue-115 strain was reported to have an orf3b which is shortened by 79 codons at the 5' end and in several cdna clones by 67 codons at the 3' end (rasschaert et al., 1987). orf4 was predicted to encode an 82 amino acid long hydrophobic polypeptide (rasschaert et al., 1987; kapke et al., 1988; britton et al., 1989; wesley et al., 1989). so far, the putative products of the three above-mentioned orfs have not been identified in infected cells.
the last orf, orf7, is located downstream of the n gene (kapke and brian, 1986; rasschaert et al., 1987; britton et al., 1988), an unusual feature among coronaviruses. a polypeptide of mr 14k, reacting with antibodies produced against an orf7 synthetic peptide, has been characterized in tgev-infected cells (garwes et al., 1989). in this study we report the identification of a product of one of these orfs (orf4) in infected cells and its preliminary characterization. in particular, we show evidence that orf4 represents a novel virion-associated polypeptide with a possible counterpart in other coronaviruses. autographa californica nuclear polyhedrosis virus (acnpv) and recombinant baculoviruses were grown and assayed in confluent monolayers of spodoptera frugiperda (sf9) cells in medium containing 10% (v/v) fetal bovine serum, according to the procedures described by brown and faulkner (1977). propagation of the high cell passage purdue-115 strain of tgev in swine cell lines pd5 or st was done as previously described (laude et al., 1986). manipulations of plasmid dna were performed according to the procedures described by sambrook et al. (1989). restriction enzymes, t4 dna ligase, and calf intestine alkaline phosphatase (cip) were purchased from boehringer-mannheim. the baculovirus transfer plasmid containing the full-length cdna copy of tgev orf4 coding sequence was constructed following the general scheme outlined in fig. 1. the ndei-sspi fragment (0.7 kbp) derived from plasmid ptg2-15 (rasschaert et al., 1987) was digested with the ddei restriction enzyme. the ddei-sspi dna fragment was repaired with the klenow large fragment of dna polymerase i and cloned into the bamhi site of the pvl941 vector (luckow and summers, 1989). the resulting plasmid, named pvlorf4, contained a 0.33 kbp insert.
a second plasmid, pvlorf4a, was constructed by inserting an orf4 gene in which the last 63 nt at the 3' end were deleted through pcr mutagenesis (scharf et al., 1986) on ptg2-15 using the oligonucleotides 5' gaagaagggatccatacctatgac and 5' clta-tagggatcctaagcatg as 5' and 3' amplimers, respectively. these amplimers were designed to introduce an additional stop codon tag at the 3' end of the gene and a bamhi cloning site at each end. the amplification product was digested by bamhi and ligated into the pvl941 cloning site. the orientation and sequence of the orf4 and orf4a inserts relative to the acnpv polyhedrin leader were determined by restriction analysis and partial dna sequencing. in these constructs, the initiation codons of the orf4 and orf4a sequences were positioned at 54 and 7 bp from the bamhi cloning site, respectively. transfer of the tgev orf4 gene into the acnpv genome was accomplished by transfection of sf9 cells using the calcium phosphate precipitation technique as described by summers and smith (1987). recombinant baculoviruses were screened by dot blot hybridization using an orf4-specific [32p]-labeled dna fragment as a probe. four polyhedrin-negative clones were tested for orf4 expression. the tgev orf4a gene was introduced into a linear form of acnpv dna (kitts et al., 1990). circular acrp6-sc dna was linearized by digestion at the unique bsu36i site. two hundred nanograms of bsu36i or mock digest viral dna was mixed with 1 µg of pvlorf4a dna and transfected into sf9 cells using the lipofectin method (kitts et al., personal communication) according to the procedure of the manufacturer (gibco-brl). after a 2-day incubation the culture supernatants were harvested and plated. a dozen well-isolated plaques were picked out and screened for orf4a expression using [35s]methionine-labeled cultures.
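the arithmetic behind the orf4a construct can be checked with a short sketch. it uses only figures printed above (a 63 nt deletion, an 82-residue product, and the 5' amplimer as printed, with the extraction space removed); the function names are hypothetical, and the garbled 3' amplimer is deliberately left out. the sketch assumes the deletion is in frame.

```python
# illustrative check of the orf4a primer design; helper names are hypothetical.
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence

# 5' amplimer as printed in the text (space removed, upper-cased).
FIVE_PRIME_AMPLIMER = "GAAGAAGGGATCCATACCTATGAC"

ORF4_RESIDUES = 82   # length of the orf4 product given in the text
DELETED_NT = 63      # nucleotides removed from the 3' end of the gene


def deleted_codons(deleted_nt: int) -> int:
    """Number of whole codons removed by an in-frame 3' deletion."""
    if deleted_nt % 3 != 0:
        raise ValueError("deletion is not a whole number of codons")
    return deleted_nt // 3


def nt_between_site_and_atg(primer: str, site: str = BAMHI_SITE) -> int:
    """Nucleotides separating the end of the first `site` from the next ATG."""
    primer = primer.upper()
    site_start = primer.find(site)
    if site_start < 0:
        raise ValueError("restriction site not found in primer")
    site_end = site_start + len(site)
    atg_start = primer.find("ATG", site_end)
    if atg_start < 0:
        raise ValueError("no ATG found downstream of the site")
    return atg_start - site_end


removed = deleted_codons(DELETED_NT)       # codons lost at the 3' end
remaining = ORF4_RESIDUES - removed        # residues left in the truncated product
spacer = nt_between_site_and_atg(FIVE_PRIME_AMPLIMER)
print(removed, remaining, spacer)
```

the 21 codons removed match the 21 c-terminal residues said to be missing from the truncated protein, and the six nucleotides found between the bamhi site and the atg place the initiation codon at the seventh position after the site, which appears consistent with the 7 bp quoted for orf4a.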
three balb/c mice were injected threefold intraperitoneally at a 1 month interval with 1 x 10' acorf4-infected cells (disrupted in freund complete adjuvant for the first injection). three days before fusion, the mice were boosted both intraperitoneally and subcutaneously with orf4 protein purified from 4 x 10' infected cells by 15% sds-page. splenocytes from one mouse that tested positive by immunoprecipitation assay were fused with sp2/0 myeloma cells. supernatants of hybrid clones were tested in a comparative immunofluorescence assay (see below) using acorf4- or acnpv-infected sf9 monolayers. subcloning of orf4-specific antibody-producing hybridomas and ig isotyping were done as described elsewhere (l'haridon et al., 1991). iggs purified from ascites fluids by ammonium sulfate precipitation and gel permeation on a sephacryl-s200 column were used in all experiments. screening of hybridomas was performed on sf9 cell monolayers established in 96-well microplates, infected with baculovirus (m.o.i. 10 pfu) and fixed with acetone/ethanol (v/v) at 38 hr p.i. for surface fluorescence analysis, aliquots of cells in suspension were stained with mabs at 100 µg/ml in grace medium and then with fitc conjugate (each step 1 hr at 4°) and spotted onto glass slides. alternatively, spotted cells were fixed with 4% paraformaldehyde and permeabilized or not with 0.1% triton x-100 before staining (1 hr at 37°). similar experiments were performed on st cell monolayers infected by tgev at a m.o.i. of 0.1 pfu and fixed 15 hr p.i. the procedures for metabolic labeling of insect and mammalian cells were as reported previously (godet et al., 1991). monolayers of 4 x 10^5 sf9 cells or 6 x 10^6 pd5 cells labeled with [35s]methionine were washed with pbs and lysed in 4 ml of pbs-triton (tris, 50 mm, ph 8.5, 1% triton x-100, 10^3 kallikrein units of aprotinin per milliliter).
resulting cytosols of sf9 cells and pd5 cells were centrifuged 30 min at 10,000 g or 1 hr at 30,000 rpm in a 50 ti rotor (beckman), respectively, and stored in aliquots at -70°. immunoprecipitation assay. aliquots of radiolabeled cytosols or virions were adjusted to 0.5 ml with pbs-triton buffer containing the appropriate mab (100 µg/ml) or 3 µl of ascites fluid from a feline infectious peritonitis virus-infected cat (used as a source of anti-tgev polyclonal antibodies) and protein a-sepharose beads (50 µl of a 50% suspension); after a 2-hr incubation at room temperature with agitation, the immune complexes were extensively washed with pbs-triton, then with 0.5 m nacl + 50 mm tris (ph 8). beads were treated for 3 min at 100° in sample buffer. the immunoprecipitated material thus released was analyzed by 15% or 15-20% sds-page. virions in the supernatant of cultures labeled as above were purified following a described procedure (laude et al., 1986). the material pelleted by ultracentrifugation at 35,000 rpm in a 45 ti rotor was resuspended in distilled water and was applied to a linear sucrose gradient (16 ml, 20 to 45% sucrose w/v in distilled water). centrifugation was performed in an sw27 rotor for 3 hr at 25,000 rpm and 4°. gradients were collected from top to bottom into 500-µl fractions. half of each fraction was run at 100,000 rpm for 30 min and the resulting pellets were analyzed by sds-page on an 8-20% gradient gel. the material remaining in the virus-containing fractions was pooled, pelleted as above, and split into three parts; one was analyzed directly by 15% sds-page; the two others were solubilized in pbs-triton, immunoprecipitated by a different mab each, and analyzed as above. in one experiment, virion-associated material was subjected to a second round of gradient purification (laude et al., 1986) prior to gel analysis.
peptides were synthesized on polyethylene pins (geysen et al., 1984) by using a commercially available kit, according to the procedure given by the manufacturer (cambridge research biochemicals). immunoreactivity of the immobilized peptides was assayed by elisa using anti-orf4 mabs (25 µg/ml) as primary antibody and an anti-mouse igg (h + l) peroxidase conjugate (biosys). insertion of the entire orf4 coding sequence into the genome of acnpv baculovirus was performed by using the transfer plasmid pvl941 (see methods). recombinant viruses that hybridized with the orf4-specific probe and showed a polyhedrin-negative phenotype were selected, amplified, and tested for the expression of orf4. among four selected orf4-expressing clones, one, designated acorf4, was retained for subsequent studies. analysis of [35s]methionine-labeled acorf4-infected cells revealed the presence of a major polypeptide of mr 10k (fig. 2), in good agreement with the predicted m.w. of orf4 product (9.2k). the time course for its synthesis was found to be from 24 to 48 hr p.i. screening of positive hybridoma clones was achieved by comparative indirect immunofluorescence assay on acorf4- or acnpv-infected cells. among 268 hybridomas tested, 5 positive clones, all producing mabs of igg1 isotype, were subcloned and used for the production of ascites fluids. when assayed by immunoprecipitation of acorf4-infected cell extracts, all 5 mabs recognized a single species of mr 10k, identical to the target protein (fig. 3a, lane 4). an immunoreactive species comigrating with recombinant orf4 protein was detected in tgev-infected mammalian cells as well (lane 2). analysis of the recombinant protein in nonreducing conditions revealed the existence of oligomeric forms, which were not observed with the authentic orf4 protein (fig. 3b). orf4 product is a membrane-associated protein. indirect immunofluorescence assays were performed to determine the subcellular location of orf4 protein.
in acetone-fixed cultures of both acorf4-infected sf9 and tgev-infected cells, all anti-orf4 mabs induced a strong fluorescence, which was polarized in a juxtanuclear region consistent with golgi localization (fig. 4a). in addition, a bright fluorescence was observed on unfixed (fig. 4b) [...] positively stained cells were consistently observed (fig. 4d), whereas virtually all the cells showed the presence of orf4 antigen after permeabilization (not shown). the fluorescence observed in nonpermeabilized cells was unlikely to be due to cell damage since only a few of them were stained positively with a mab directed against an intracellular antigen (tgev n protein; data not shown). these data were interpreted as reflecting a late accumulation of orf4 at the outer membrane of tgev-infected cells. these observations showing that orf4 protein is found in association with cellular membranes are consistent with the prediction of a membrane-spanning domain in its amino acid sequence (fig. 5). anti-orf4 monoclonal antibodies recognize the c-terminal domain of the protein. assuming that orf4 was an integral membrane protein, it was of interest to determine whether exposed epitopes were located within the carboxy or amino domain of the polypeptide chain. the peptide scanning method (geysen et al., 1984) was used since two anti-orf4 mabs were shown to be reactive toward denatured and reduced protein. a set of peptides encompassing all overlapping linear nonapeptides homologous with orf4 sequence was synthesized and tested against each of the five mabs. as shown in fig. 6, mab v27 strongly recognized a linear epitope centered on residues ayknf (positions 64 to 68; see fig. 5). mab s2 gave comparable results, although a lesser reactivity was observed when compared to mab v27. no reactivity was observed with the three remaining mabs, possibly because of a lower avidity.
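the peptide set used for scanning can be sketched as a sliding window over the sequence. the real orf4 sequence is not reproduced in this text, so the example below uses a placeholder of 82 residues in which only the ayknf core at positions 64 to 68 is taken from the results; every other residue is a dummy "X", and the function names are hypothetical.

```python
def overlapping_peptides(sequence: str, length: int = 9) -> list[tuple[int, str]]:
    """Return (1-based start position, peptide) for every window of `length` residues."""
    if length <= 0:
        raise ValueError("peptide length must be positive")
    return [
        (i + 1, sequence[i:i + length])
        for i in range(len(sequence) - length + 1)
    ]


def peptides_covering(peptides, first: int, last: int):
    """Peptides that contain the whole residue range first..last (1-based, inclusive)."""
    return [
        (start, pep)
        for start, pep in peptides
        if start <= first and start + len(pep) - 1 >= last
    ]


# placeholder sequence: 82 residues, with only the AYKNF core (positions 64-68)
# taken from the text; all other residues are dummy 'X' characters.
placeholder_orf4 = "X" * 63 + "AYKNF" + "X" * 14

nonapeptides = overlapping_peptides(placeholder_orf4, length=9)
core_peptides = peptides_covering(nonapeptides, first=64, last=68)
print(len(nonapeptides), [start for start, _ in core_peptides])
```

under these assumptions an 82-residue protein yields 74 nonapeptides, and five consecutive peptides (starting at residues 60 to 64) contain the complete five-residue core, which is the kind of run of adjacent reactive pins from which a core epitope is read off.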
to test the possibility that these three anti-orf4 mabs may also recognize the carboxy-subterminal region of the molecule, a second recombinant baculovirus designated acorf4a, which expressed a truncated form of orf4 lacking the 21 c-terminal amino acids, was constructed via pcr mutagenesis. immunoprecipitation analysis revealed that a polypeptide of slightly reduced size was expressed by acorf4a-infected cells and recognized by polyclonal antibodies but not by any of the anti-orf4 mabs (partial data in fig. 7). fig. 6. epitope mapping of anti-orf4 mab v27. a series of nonapeptides (overlapping 1) spanning the length of the entire orf4 sequence was tested for elisa reactivity toward mab v27. n-terminus is on the left. in order to determine whether the protein is incorporated within the virion as a possible envelope protein, labeled tgev particles were purified by centrifugation and analyzed by gel electrophoresis (fig. 8). five well-defined major bands were visible, which correspond to the previously recognized virion-associated polypeptides: (i) a 220k band (s protein), (ii) a 47k band (n protein), and (iii) three bands corresponding to different species of m protein: 30-36k identified as complex type glycosylated forms, 29k (high mannose form) and 26k (unglycosylated form) (delmas and laude, 1991). a single additional minor band of mr 10k, similar to that of orf4 polypeptide, was detected (fig. 8a). the fact that very few, if any, other polypeptides copurified renders unlikely the possibility that the 10k polypeptide remained nonspecifically associated with the sedimenting virus particles. true association of the 10k band with virions was confirmed by showing that the observed polypeptide pattern remained unchanged after an additional round of purification by isopycnic centrifugation.
immunoprecipitation with mab v27 of detergent-dissociated material from virus-containing fractions confirmed the identity between the 10k virion-associated and orf4 polypeptides (fig. 8b). densitometric tracing of autoradiograms at different times of exposure was performed to evaluate the relative amounts of the virion-associated polypeptides. the resulting values were corrected according to the predicted number of met residues in the respective sequences. this led to an estimated molar ratio of 1:20:300 for orf4, s, and m polypeptides. a common feature of coronavirus genomes appears to be the existence, immediately upstream from the membrane protein m gene, of an orf 250 to 330 nucleotides long, which is predicted to encode a hydrophobic polypeptide. furthermore, recent studies on mouse hepatitis virus (mhv), infectious bronchitis virus (ibv), and bovine coronavirus (bcv) have shown that the products translated from these orfs were associated with the membrane of infected cells (leibowitz et al., 1988; abraham et al., 1990; smith et al., 1990), as evidenced now for tgev orf4. these observations prompted us to reexamine the relevant amino acid sequences for possible similarities that earlier comparative analysis by us and others failed to detect (except for the closely related mhv and bcv viruses). figure 9 shows a tentative alignment of the orf4-like sequences from 5 coronavirus members, which was outlined using the program multalin (corpet, 1988) and refined manually. on the basis of the deduced consensus sequence, in which 37 residues are conserved in at least 3 out of the 5 sequences aligned (87 positions), we conclude that these proteins share significant similarities.
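the consensus criterion stated above (a residue identical in at least 3 of the 5 aligned sequences) can be sketched as a column-wise count. this is a minimal illustration, not the multalin procedure itself; the toy alignment is hypothetical and does not reproduce the sequences of fig. 9, and the function name is an assumption.

```python
from collections import Counter


def consensus(aligned: list[str], min_count: int = 3, gap: str = "-") -> str:
    """Column-wise consensus: keep a residue shared by >= min_count sequences, else '.'."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    result = []
    for column in zip(*aligned):
        counts = Counter(res for res in column if res != gap)  # gaps never count
        if counts:
            residue, n = counts.most_common(1)[0]
            result.append(residue if n >= min_count else ".")
        else:
            result.append(".")
    return "".join(result)


# hypothetical toy alignment of five sequences (not the sequences of fig. 9)
toy_alignment = [
    "MFL-KC",
    "MFLAKC",
    "MYL-RC",
    "MFIARS",
    "LFLSRC",
]

cons = consensus(toy_alignment)
conserved = sum(1 for res in cons if res != ".")
print(cons, conserved)
```

with five sequences and a threshold of three, at most one residue per column can qualify, so no tie-breaking rule is needed. counting the non-dot positions of the consensus line is how a figure such as "37 residues conserved over 87 positions" would be obtained.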
as a striking feature, a 19-20 residue-long hydrophobic stretch, followed by a cluster of 2 or 3 cysteines, is present 15 to 20 residues from the n terminus. the c-terminal part of the polypeptides seems to be more distantly related; in particular, ibv protein extended the consensus sequence by 14 to 28 residues, depending on the strain (see liu et al., 1991 for ibv orf3c sequence data). fig. 9. [...] (cavanagh et al., 1990). pairwise homologies are marked by dots and bold letters; gaps are indicated by dashes. bottom line, consensus sequence showing residues identical in at least three of the five sequences. sequence data from rasschaert et al., 1987; raabe and siddel, 1989; skinner et al., 1985; woloszyn et al., 1990; and boursnell et al., 1985. the 22 last n-terminal residues of ibv orf3c (beaudette strain) are not shown.
in this study we have confirmed the coding potential of the orf encoded by tgev mrna 4 and characterized several properties of its translation product. baculovirus-vectored expression of orf4 resulted in the synthesis of a 10k polypeptide. the same species was identified in tgev-infected cells by immunoprecipitation using monoclonal antibodies (mabs) raised against the recombinant polypeptide. examination of its subcellular localization by immunostaining showed that orf4 translation product is a membrane-associated polypeptide. this finding is consistent with the prediction in the second n-terminal quarter of a stretch of uncharged residues with properties of a membrane-spanning domain (fig. 5). immunofluorescence data also suggested that orf4 may enter the exocytic pathway. in tgev-infected cells, however, orf4 could be detected in association with the cell surface only at a late stage of the infection. assuming that the observed surface fluorescence was related to externally exposed determinants, it was of interest to map the relevant epitopes on the amino acid sequence of the molecule. peptide scanning led to the identification of residues 64-ayknf-68 as the core sequence of the binding site of 2 of the mabs studied. furthermore, a recombinant orf4 protein lacking its last 21 c-terminal amino acids was no longer recognized by any of the 5 mabs. these results support the view that an antigenic, possibly immunodominant site is expressed in the c-subterminal part of orf4 protein. in addition, they led us to speculate that the region of the molecule which is translocated across the membrane would correspond to its carboxy domain. recently, a striking correlation has been reported between the transmembrane orientation of eukaryotic proteins and the disposition of charged residues surrounding the most n-terminal membrane-spanning sequence. it has been proposed that the difference in the charge of the 15 residues flanking the presumed anchor segment determines its orientation with the more positive portion facing the cytosol (hartmann et al., 1989). applying such a rule to tgev orf4 sequence gives charges of -1 and +2 for the n and c flanking segments, respectively, which predicts a nexo-cendo orientation, in contrast to the available experimental evidence. thus further experiments, including the production of antibodies directed to the n-terminal part of the protein, are needed for a definite assignment of the transmembrane orientation of orf4. the possible role of the cysteine cluster immediately downstream of the hydrophobic segment was also examined. gel analysis of recombinant orf4 protein under nonreducing conditions revealed the presence of multimeric, predominantly dimeric forms. in contrast, orf4 synthesized in tgev-infected cells could be detected in a monomeric form only, suggesting that the formation of disulfide-bridged species is an artifact, presumably linked to the high level of expression of orf4 in the insect cells (kiefhaber et al., 1991).
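the charge-difference rule discussed above can be sketched as follows. this is a simplified reading of the rule, assuming that only lys/arg (+1) and asp/glu (-1) are counted in the 15 residues on each side of the anchor segment (histidine and the terminal groups are ignored here). the toy sequence is hypothetical, built only to reproduce the -1 and +2 charges quoted for orf4; it is not the orf4 sequence, and the function names are assumptions.

```python
POSITIVE = set("KR")   # residues counted as +1 (assumption of this sketch)
NEGATIVE = set("DE")   # residues counted as -1 (assumption of this sketch)


def net_charge(segment: str) -> int:
    """Net charge of a segment, counting K/R as +1 and D/E as -1."""
    return sum((res in POSITIVE) - (res in NEGATIVE) for res in segment.upper())


def flank_charges(sequence: str, anchor_start: int, anchor_end: int, flank: int = 15):
    """Net charges of the N- and C-terminal flanks of an anchor (1-based, inclusive)."""
    n_flank = sequence[max(0, anchor_start - 1 - flank):anchor_start - 1]
    c_flank = sequence[anchor_end:anchor_end + flank]
    return net_charge(n_flank), net_charge(c_flank)


def predicted_orientation(n_charge: int, c_charge: int) -> str:
    """The more positive flank is predicted to face the cytosol."""
    if c_charge > n_charge:
        return "nexo-cendo"   # C flank cytosolic, N terminus outside
    if n_charge > c_charge:
        return "cexo-nendo"   # N flank cytosolic, C terminus outside
    return "undetermined"


# hypothetical toy sequence: 15-residue N flank, 20-residue anchor, 15-residue C flank
toy = "MSDAEKAGSTAGSAG" + "L" * 20 + "KRAGSDAKGSAGSTA"
n_q, c_q = flank_charges(toy, anchor_start=16, anchor_end=35)
print(n_q, c_q, predicted_orientation(n_q, c_q))
```

with flank charges of -1 and +2 the rule returns nexo-cendo, the orientation that the text notes is at odds with the surface accessibility of the c-terminal epitope.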
the possibility was tested that the cysteine residues could serve as an acylation site, as demonstrated for coronavirus s protein (schmidt, 1982). however, no incorporation of palmitic acid chains could be detected in recombinant orf4 protein (data not shown), as would be expected if the target residues belong to the orf4 ectodomain. an important implication of the above findings was that orf4 could represent a structural polypeptide present in the virus envelope. indeed, purified preparations of labeled tgev virions revealed the presence of a previously unrecognized 10k polypeptide which was specifically immunoprecipitated by anti-orf4 mabs. based on the estimated molar ratio and assuming that coronavirions bear 100 (roseto et al., 1982) to 200 spikes, each composed of 3 s molecules (delmas and laude, 1990), it can be inferred that approximately 15-30 copies of orf4 protein are incorporated into tgev virions (purdue strain). such a small number of molecules in virus particles does not seem to reflect a selective exclusion since s and orf4 accumulated in infected cells at a ratio comparable to that found in virions (data not shown). these results lead us to conclude that orf4 may represent a minor structural polypeptide, which we propose to designate by the tentative acronym sm, standing for "small membrane" protein. several lines of evidence lend support to the view that a gene encoding an sm-like protein is a common feature of the coronavirus genomes: (i) an orf predicting a polypeptide with striking similarities to tgev orf4 was identified in the genome sequence of each of the 5 coronaviruses examined (fig. 9), and the fact that tgev sm was recognized by anti-fipv antibodies argues for the presence of a related gene also in feline infectious peritonitis virus genome; (ii) the product expressed from the relevant mhv, bcv, and ibv orfs was reported to have properties of a transmembrane polypeptide (leibowitz et al., 1988; smith et al., 1990; abraham et al., 1990); and (iii) although expressed through a mono-, di-, or tricistronic mrna (abraham et al., 1990; budzilowicz and weiss, 1987; liu et al., 1991), the assumed sm-encoding genes are all located upstream and adjacent to the m protein gene. therefore, not only do the sequences show significant similarities, but the gene order 5' ... s-sm-m ... 3' is conserved, as would be expected for a structural protein. finally, preliminary experiments in this laboratory allowed us to detect orfs-encoded polypeptide in association with bcv particles (n. woloszyn, p. boireau, and j. f. vautherot, unpublished results). small integral membrane proteins have been described in several other enveloped rna viruses, including sindbis and semliki forest togaviruses, influenza a and b viruses, simian virus 5, and respiratory syncytial paramyxoviruses (garoff et al., 1980; welch and sefton, 1980; lamb et al., 1985; hiebert, 1985; olmsted and collins, 1989). the influenza virus m2 protein (15k) and the alphavirus 6k protein are both acylated, nexo-cendo transmembrane polypeptides. m2 and 6k have been shown to represent minor structural polypeptides, with an estimated number of 40 ± 25 and 24 ± 4 molecules per virion, respectively (zebedee [...]; gaedigk-nitschko and schlesinger, 1990). m2 has been reported to form tetrameric channels within the membrane and to be a target of the antiviral drug amantadine and of the ctl response to influenza virus infection (hay et al., 1985; lamb et al., 1985; sugrue and hay, 1991).
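for comparison with the m2 and 6k figures, the estimate of 15-30 sm copies per virion given earlier follows from simple arithmetic on values stated in the text: 100 to 200 spikes per virion, 3 s molecules per spike, and an orf4:s molar ratio of 1:20. the sketch below only restates that calculation; the function name is hypothetical.

```python
def copies_per_virion(spikes_per_virion: int,
                      s_per_spike: int = 3,
                      s_per_target: float = 20.0) -> float:
    """Copies of a minor protein per virion, from spike count and the S:target molar ratio."""
    s_molecules = spikes_per_virion * s_per_spike   # total S molecules per virion
    return s_molecules / s_per_target               # molar ratio target:S = 1:20


low = copies_per_virion(100)    # lower bound on spike number quoted in the text
high = copies_per_virion(200)   # upper bound on spike number quoted in the text
print(low, high)
```

the result, 15 to 30 copies, is of the same order as the numbers quoted for m2 and 6k, and the abstract's figure of about 20 molecules per particle falls inside this range.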
site-directed mutagenesis studies on sindbis and semliki forest viruses have demonstrated that 6k protein is dispensable for virus production but exerts a role late in the assembly, possibly during virus budding (gaedigk-nitschko and schlesinger, 1990; liljestrom et al., 1991). the sh protein of sv5 has been reported to be orientated in the membrane with its n-terminus domain exposed at the cytoplasmic face, as might be the case for tgev sm. whether sh is incorporated into virions is still questioned, as well as its potential role (hiebert et al., 1988). the apparent conservation of sm gene in the coronavirus genome strongly implies that its product is essential for an efficient replication of the virus. based on its location and its low copy number in particles, we speculate that sm would more likely play a role in modulating assembly and/or release of the virion. thus, elucidating the function of the coronavirus sm protein might contribute to a better understanding of an important aspect of the biology of enveloped viruses. finally, sm may be a potent surface antigen since murine antibodies recognized a domain of the protein possibly exposed on live infected cells. preliminary experiments indicated that anti-sm antibodies are readily detected in the serum from infected swine. therefore, the role of sm protein in humoral and cellular immune response to tgev infection should be worth investigating in the future. we thank j. gelfi for technical assistance, j. levin for revising the manuscript, and m. nezondb for the artwork. part of this work was carried out with the support of the e.e.c program eclair. note added in proof. during the submission process of this article, a communication by d. x. liu and s. c. inglis (1991, virology 185, 911-917) reported the association of ibv orf3c protein with the virion envelope. this strengthens the view that sm-like proteins are a general feature of coronaviruses.
5 kda encoded between the spike and membrane protein genes of the bovine coronavirus sequencing of coronavirus ibv genomic rna: three open reading frames in the 5' "unique" region of mrna d sequence of the nucleoprotein gene from a virulent british field isolate of transmissible gastroenteritis virus and its expression in saccharomyces cerevisiae sequence of the coding regions from the 3.0 kb and 3.9 kb mrna subgenomic species from a virulent isolate of transmissible gastroenteritis virus a plaque assay for nuclear polyhedrosis viruses using a solid overlay in vitro synthesis of two polypeptides from a nonstructural gene of coronavirus mouse hepatitis virus strain a59 recommendations of the coronavirus study group for the nomenclature of the structural proteins, mrnas, and genes of coronaviruses multiple sequence alignment with hierarchical clustering assembly of coronavirus spike protein into trimers and its role in epitope expression four major antigenic sites of the coronavirus transmissible gastroenteritis virus are located on the amino-terminal half of spike glycoprotein s carbohydrate-induced conformational changes strongly modulate the antigenicity of coronavirus tgev glycoproteins s and m the sindbis virus 6k protein can be detected in virions and is acylated with fatty acids nucleotide sequence of cdna coding for semliki forest virus membrane glycoproteins the polypeptide of mr 14000 of porcine transmissible gastroenteritis virus: gene assignment and intracellular location use of peptide synthesis to probe viral antigens for epitopes to a resolution of a single amino acid processing and antigenicity of entire and anchor-free spike glycoprotein s of coronavirus tgev expressed by recombinant baculovirus predicting the orientation of eukaryotic membrane-spanning proteins the molecular basis of the specific anti-influenza action of amantadine identification and predicted sequence of a previously unrecognized small hydrophobic protein, sh, of the paramyxovirus 
simian virus 5. 1. viral cell surface expression and orientation in membranes of the 44-aminoacid sh protein of simian virus. 5 sequence analysis of the porcine transmissible gastroenteritis coronavirus nucleocapsid protein gene nucleotide sequence between the peplomer and matrix protein genes of the porcine transmissible gastroenteritis coronavirus identifies three large open reading frames protein aggregation in vitro and in viva: a quantitative model of the kinetic competition between folding and aggregation linearization of baculovirus dna enhances the recovery of recombinant virus expression vectors coronavirus: organization, replication and expression of genome influenza virus m, protein is an integral membrane protein expressed on the infected-cell surface antigenic structure of transmissible gastroenteritis virus. i. properties of monoclonal antibodies directed against virion proteins. 1 production of an hybridoma library to recombinant porcine alpha i interferon: a very sensitive assay (isba) allows the detection of a large number of clones detection of a murine coronavirus nonstructural protein encoded in a downstream open reading frame in vitro mutagenesis of a full-length cdna clone of semliki forest virus: the small 6,000.molecular-weight membrane protein modulates virus release a polycistronic mrna specified by the coronavirus infectious bronchitis virus high level expression of nonfused foreign genes with autographa californica nuclear polyhedrosis virus expression vectors the 1a protein of respiratory syncytial virus is an integral membrane protein present as multiple, structurally distinct species nucleotide sequence of the human coronavirus hcv 229e mrna 4 and mrna 5 unique regions porcine respiratory coronavirus differs from transmissible gastroenteritis virus by a few genomic deletions enteric coronavirus tgev: partial sequence of the genomic rna, its organization and expression bovine enteric coronavirus structure as studied by a freeze-drying 
technique molecular cloning: a laboratory manual direct cloning and sequence analysis of enzymatically amplified genomic sequences acylation of viral spike glycoproteins: a feature of enveloped rna viruses coronavirus mhv-jhm mrna 5 has a sequence arrangement which potentially allows translation of a second, downstream open reading frame identification of a new membrane-associated polypeptide specified by the coronavirus infectious bronchitis virus coronaviruses: structure and genome expression m2 protein of influenza a viruses: evidence that it forms a tetrameric channel a manual of methods for baculovirus vectors and insect cell culture procedures. texas agricultural experiment station bulletin no. 1555 characterization of a small, nonstructural viral polypeptide present late during infection of bhk cells by semliki forest virus nucleotide sequence of coronavirus tgev genomic rna: evidence for 3 mrna species between the peplomer and matrix protein genes genetic basis for the pathogenesis of transmissible gastroenteritis virus nucleotide sequence of the bovine enteric coronavirus becv f15 mrna 5 and mrna 6 unique regions influenza a virus m2 protein: monoclonal antibody restriction of virus growth and detection of m2 in virions
key: cord-260708-l9w5jhsw authors: lasecka, lidia; baron, michael d. title: the molecular biology of nairoviruses, an emerging group of tick-borne arboviruses date: 2013-12-11 journal: arch virol doi: 10.1007/s00705-013-1940-z sha: doc_id: 260708 cord_uid: l9w5jhsw the nairoviruses are a rapidly emerging group of tick-borne bunyaviruses that includes pathogens of humans (crimean-congo hemorrhagic fever virus [cchfv]) and livestock (nairobi sheep disease virus [nsdv], also known as ganjam virus), as well as a large number of viruses for which the normal vertebrate host has not been established.
studies on this group of viruses have been fairly limited, not least because cchfv is a bsl4 human pathogen, restricting the number of labs able to study the live virus, while nsdv, although highly pathogenic in naive animals, is not seen as a threat in developed countries, making it a low priority. nevertheless, recent years have seen significant progress in our understanding of the biology of these viruses, particularly that of cchfv, and this article seeks to draw together our existing knowledge to generate an overall picture of their molecular biology, underlining areas of particular ignorance for future studies. new viral diseases appear with increasing frequency. just in the last two years we have seen the appearance of a new livestock virus (schmallenberg) and a new human pathogen (mers coronavirus). in some cases these viruses appear to be completely new, in others they have 'emerged' into our awareness as human use of different habitats changes, leading to increased contact with carriers of disease, whether those carriers are ''bush meat'' or the many insect and tick species that can act as vectors of disease. one such group of viruses that is rapidly becoming more important is the genus nairovirus in the family bunyaviridae. this genus includes a number of human and livestock pathogens, as well as a collection of other viruses about which little is known, not even the host in which they naturally circulate. the nairoviruses show a number of unique features, and the purpose of this article is to summarise our current knowledge of the molecular biology of these viruses and highlight areas where it is to be expected that research will soon bear fruit. a primary characteristic of the viruses of the genus nairovirus, distinguishing them from most of the other members of the family bunyaviridae, is that they are all transmitted by ticks [109] . 
based on antibody cross-reactivity, nairoviruses are classified into seven serogroups (table 1) [32, 39, 190], of which the most important are the crimean-congo haemorrhagic fever (cchf) group, which includes the human pathogen crimean-congo haemorrhagic fever virus (cchfv), and the nairobi sheep disease (nsd) group, to which belong nairobi sheep disease virus (nsdv), dugbe virus (dugv) and kupe virus (kupv). kupv [43] and finch creek virus [102] are the most recently discovered viruses of this genus. cchfv is arguably the most important from a human perspective, causing haemorrhagic fever in humans, with mortality up to 30 % (reviewed in refs. [58] and [182]). after dengue virus, this is the second most widespread of the arboviruses that are pathogenic to humans; cases of cchf have been reported in sub-saharan africa, the former soviet union, bulgaria, turkey, china, pakistan, india, the arabian peninsula, northern greece, iraq and iran [26, 53-55, 115, 121-123, 183]. although cchfv can infect several mammalian species, it appears to cause disease only in man [124, 158]. because of its importance as a human pathogen, most of the available data on nairoviruses come from studies on this virus, though studies on other members of the group have also contributed, notably on nsdv, a virus first identified nearly a hundred years ago as a tick-borne virus that causes severe haemorrhagic gastroenteritis in sheep and goats, with mortality rates of up to 90 % in susceptible populations [113, 167, 181]. interestingly, an asian virus causing a similar disease, ganjam virus (gv), was recently identified, based on genetic and serological studies, as being the same virus as that causing nsd in east africa [47, 106, 187]. it is not possible to say yet whether gv is a variant of nsdv or vice versa, but it is clear that this virus also has a wide distribution.
table 1 (partial; virus: associated disease)
nairobi sheep disease group
nsdv/gv: haemorrhagic gastroenteritis in sheep and goats [106, 113]. antibodies to nsdv/gv have been reported in humans, but otherwise only laboratory-acquired infections have been seen [13, 45, 133]
dugbe virus (dugv): frequently isolated from ticks infesting livestock, in which dugv appears to be apathogenic [27, 46, 146]. while one case of human infection has been described [27], the link to dugv was circumstantial
kupe virus (kupv)
crimean-congo haemorrhagic fever group
crimean-congo haemorrhagic fever virus (cchfv): haemorrhagic fever in man [58, 179, 182]
hazara virus (hazv)
nairoviruses, like the members of the other genera of the family bunyaviridae, are enveloped viruses that appear spherical in the electron microscope, with a diameter of approximately 100 nm [21, 46, 142]. as with all bunyaviruses, the genome consists of three segments of negative-sense rna [39] (reviewed in refs. [109] and [175]). these are termed the small (s) segment, encoding the nucleocapsid (n) protein, which forms a complex with each rna segment [24, 39]; the medium (m) segment, which encodes a polyprotein that is processed into two mature glycoproteins (gn and gc) as well as one or more non-structural proteins [3, 17, 39, 107, 144]; and the large (l) segment, which encodes the viral rna-dependent rna polymerase (rdrp) [85, 108]. viral replication takes place in the cytoplasm, and viral budding occurs in the golgi [57, 142] (fig. 1). during viral replication, the genome segments are used as the template for synthesis of both mrna (transcription) and complementary rna (crna) (replication); the crnas are then used as the template for the synthesis of progeny genomic viral rna (vrna) [16]. the 5' and 3' untranslated regions (utrs) of the s, m and l segments contain the minimum cis-acting elements necessary for transcription, replication and packaging [16, 67].
the terminal 5' and 3' non-coding regions of each segment are complementary to each other and highly conserved among different nairoviruses [37] . the first nine nucleotides are usually the same in different segments, which suggests that this sequence is recognised by the viral polymerase to initiate viral transcription/ replication [107] . it is not clear whether the complementarity between 5' and 3' ends reflects an interaction between the ends of segments or the requirement for the same 3' sequence in genome and anti-genome rnas to act as the attachment site for the viral polymerase [14, 66] (reviewed in refs. [109] and [147] ). as with other bunyaviruses, the nairoviruses utilise capped primers snatched from host mrna for initiation of their mrna transcription [90] . the initiation of crna and vrna synthesis occurs by a different mechanism, which has not yet been fully worked out. the fact that polymerase slippage occurs during mrna synthesis [90] , coupled with the observation that the rna segments of cchfv, dugv and nsdv contain a 5'-terminal pyrimidine [90, 144, 187] , while viral rna polymerases can initiate rna synthesis only by attaching purines [12] , suggests that initiation of rna replication occurs in a prime-and-realign manner. the two membrane glycoproteins (gn and gc) are believed to determine cell tropism and the ability of the viruses to infect susceptible cells via recognition and binding of one or more cellular receptors. the specific cellular receptor(s) used by nairoviruses are currently unknown, but the gc protein is thought to be involved in virus attachment to the target cell receptor, as antibodies against gc, but not gn, appear to protect cells from cchfv infection in plaque reduction neutralisation assays [1, 17] . the ectodomain of cchfv gc contains an epitope that is highly conserved among strains [1] . 
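returning briefly to the complementarity between the 5' and 3' termini noted above: the idea can be illustrated with a minimal python sketch that counts how many terminal bases of a segment could pair when the two ends anneal in antiparallel fashion. the segment used here is a short toy sequence, not a real nairovirus genome segment, and only watson-crick pairs are counted (an assumption; g-u wobble pairs are ignored).

```python
# Watson-Crick partners for RNA bases (G-U wobble pairs are deliberately ignored).
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def terminal_complementarity(segment: str) -> int:
    """Count consecutive base pairs formed between the 5' and 3' termini.

    Base i counted from the 5' end is compared with base i counted from the
    3' end, as in a panhandle where the two ends anneal antiparallel.
    Counting stops at the first mismatch.
    """
    seq = segment.upper()
    paired = 0
    for i in range(len(seq) // 2):
        if PAIRS.get(seq[i]) != seq[-1 - i]:
            break
        paired += 1
    return paired


# Hypothetical toy segment: nine complementary terminal bases around a filler core.
toy_segment = "UCUCAAAGA" + "GGGGCCAAAG" + "UCUUUGAGA"
print(terminal_complementarity(toy_segment))
```

a count of this kind says nothing about which of the two explanations discussed above (end-to-end interaction or a shared polymerase attachment site) is correct; it only quantifies the complementarity itself.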
this fragment is probably exposed in the virus, as antibodies against this epitope are neutralising for different strains of cchfv and protect mice from challenge with cchfv [1] . similarly, the gc of dugv was also demonstrated to be targeted by neutralising antibodies [107] . antibodies to the gn protein were found to be non-neutralising in vitro, though could still be protective in vivo [17] . based on the finding that an independently folding section of this gc protein bound to cchfv-susceptible cells, and using this part of gc as a probe [185] , nucleolin has been suggested as a putative cchfv receptor. nucleolin is a cellular protein that is abundant in the nucleus but can also be expressed on the cell surface of several cell types [36, 83, 148, 149, 151] , including mononuclear phagocytes, endothelial cells and hepatocytes which are known to be targets for cchfv [28, 185] . although nucleolin has also been suggested to act as a receptor for several other viruses, such as parainfluenza virus type 3 [22] , human immunodeficiency virus (hiv) [118] , coxsackie b virus [49] and respiratory syncytial virus (rsv) [162] , further studies are required to determine whether nucleolin acts as the primary receptor for cchfv; in particular, infection studies with wild-type (not cell-culture-adapted) viruses are required to confirm the biological relevance of any potential receptor candidates. after binding to their receptor, viruses fuse with the plasma membrane or internalise through one of several endocytosis pathways to gain entry into the intracellular environment (reviewed in refs. [40] and [153] ). cchfv entry is dependent on the low ph of the endosomal compartment, which appears to be required in the first steps of the virus entry post-internalisation [74, 156] , and the virus has been shown to use clathrin-dependent endocytosis, while caveolin-1 is dispensable for its entry [74, 156] . 
studies using dominant negative rab5 and rab7 indicate that cchfv fuses with early, but not late, endosomes to gain access to the cytoplasm [74]. cholesterol also appears to be important for cchfv replication, especially at early stages after binding and internalisation; it is possible that depletion of cholesterol from cells traps viral particles in endosomes [88] or interferes with clathrin-mediated endocytosis [140]. the internalisation of cchfv virions was shown to be dependent on intact microtubules [155]. further investigation of cchfv infection showed that disruption of the cell microtubules inhibited viral rna replication [155], and both intact actin and microtubules are essential for correct distribution of the n protein to the perinuclear area [5, 155]. as nairoviruses enter via endocytosis and bud in the golgi compartment, the observation of a redistribution of viral proteins and suppression of cchfv assembly/egress by microtubule modification (depolymerisation and stabilisation) [155] is not surprising, and this might be caused by interference with the endogenous secretory pathway, which also occurs along microtubules [101] (reviewed in ref. [87]). disruption of actin filaments in cchfv-infected cells also drastically reduced replication of the virus [5]. while the n protein is often located in close proximity to the golgi apparatus, where generation of new viral particles occurs [16], n does not interact directly with golgi membranes [5, 134]. the n protein appears to interact with the l protein, even in the absence of viral rna, as expressed n protein seems to redistribute most of the l protein into the perinuclear area containing n in transfected cells [16]. the n protein of nairoviruses, at approximately 53 kda, is almost twice the size of the n proteins of other bunyaviruses, with the exception of those of hantaviruses, which, at approximately 48 kda, are similar in size [39, 110].
the n protein is the most abundant protein in the virion and encapsidates newly synthesised vrna and crna; this process is necessary for completion of the replication cycle and packaging of the genome into virions [16, 65, 100]. viruses of different genera in the family bunyaviridae appear to adopt different mechanisms of rna encapsidation, e.g., some bunyavirus n proteins recognise generic ssrna, while others recognise specific structures in the vrna [110, 120, 135, 137, 150]. there is a need to understand the mechanism by which nairoviruses encapsidate genomic and antigenomic rna; inhibition of vrna encapsidation could be a target for potential antiviral drugs, e.g., using an rna decoy to bind the viral n protein, as has been suggested as an antiviral treatment for hepatitis b virus (hbv) [62], or selecting rna-binding proteins that target specific packaging sequences in the viral genome [193]. the recently determined crystal structure of the n protein of cchfv shows that it is built from two major domains: a globular head and an extended stalk [31, 77, 174]. the globular domain is the larger and is formed from both n-terminal and c-terminal helices, while the stalk domain is formed by a group of internal helices (the exact numbering of the helices varies between the several papers publishing the structure of this protein). three potential rna-binding regions were identified in the n protein structure. two positively charged grooves are located in the globular domain: the smaller one is under the stalk domain, and the larger one is on the opposite side of the globular domain. the third positive groove is located in the stalk domain. the positively charged residues forming the rna-binding site appear to be well conserved across the genus nairovirus, and several residues were identified as crucial for rna binding and virus replication, e.g., k132 and q300 in the smaller groove and k411 and h456 in the larger groove of the head domain [31, 77].
the n protein of cchfv was also crystallised as a linear oligomer, where three monomeric n subunits were organised in head-to-tail manner, with the stalk domain of one monomer interacting with an oligomeric groove located at the base of the head domain of an adjacent molecule [174]. in addition, the linear oligomers of n were predicted to interact in a dimeric manner to form an antiparallel double superhelix [31, 174]. further predictions suggest that there are nine n molecules per turn of the superhelix, which is 210 å in diameter [174]. additionally, a positively charged crevice located on the outside of the double superhelix is predicted to serve as an additional rna binding cleft [174]. studies performed by guo and collaborators [77] suggest that the cchfv n protein, when expressed without rna, predominantly exists as a monomer, and the nairovirus n protein is only a weak binder of nonspecific rna in this monomeric state [77, 80, 111, 135]. this suggests that nairovirus n binds rna only in the oligomeric state and/or that n recognises specific structures of viral rna, as was shown for viruses of the genus hantavirus [110, 120]. however, the fact that the nairovirus n protein does not form oligomers without rna suggests that its oligomerisation is rna-stabilised, where monomeric n protein requires binding to rna to form an oligomer, a mechanism that was previously suggested for influenza a virus [160, 161]. in a similar fashion, the n protein of rift valley fever virus (rvfv) can form only a short oligomeric form in the absence of rna [64].
superimposing the structures obtained for the cchfv n protein by two different groups, using two cchfv isolates, revealed the head and stalk with very similar folds, but with a transposition of the stalk domain when comparing these two molecules; results which suggest a flexibility of the stalk domain and the possibility of different conformations of n for different functions or different states of the n-rna complexes (e.g., transcription vs replication) [31] . additionally, comparison of the crystal structure of the n protein in the monomeric and oligomeric forms suggests that the stalk domain changes its conformation upon oligomerisation, probably by binding to the oligomerisation groove on the head domain of the adjacent molecule [174] . such flexibility of the stalk domain has also been shown for the n proteins of rvfv and lasv, which, during binding to an oligomerisation groove on an adjacent n molecule, undergo conformational changes exposing an rna-binding groove [64, 80] . these structural data have led to suggestions of possible models for the initiation of transcription and replication. transcription of viral mrnas by nairoviruses utilises short capped rna fragments (10-20 nt in length) derived from host mrnas as primers [90] . incubation of the cchfv n protein with primer-length ssrna resulted in a conformational change of the stalk domain, which resulted in disruption of the oligomeric interactions and release of the monomeric n protein from the antiparallel double superhelix [174] . as discussed by wang et al., presentation of the capped primers to the ribonucleoprotein (rnp) may initiate conformational changes in the stalk domain, leading to the release of monomeric n and exposing vrna to the primer and the viral polymerase. given that head-to-stalk interactions between two adjacent n molecules have also been observed for rvfv, lasv, and bunyamwera virus (bunv) [9, 64, 80] , it seems likely that the model proposed for cchfv may be biologically valid. 
the n protein of cchfv has also been shown to have nuclease activity specific for dsdna and ssdna (but not for rna); the importance of this dnase activity and its function in the virus life cycle are unknown [77] . the residues involved in the nuclease activity are located in the globular head domain of the n protein and are conserved in all nairoviruses [77] . in contrast, the n protein of lasv, the head domain of which exhibits high structural homology with that of cchfv, has rna-specific nuclease activity [79, 80, 130] . the n protein of nairoviruses also contains a conserved sequence signature specific for catalytic motif ii (cmii) of n-6 adenine-specific dna methylases (lasecka and baron, unpublished) (fig. 2) , where the conserved motif nppw could be involved in substrate binding or in catalytic activity [97, 164] . the motif is located on an exposed loop of the stalk domain, and the function and potential importance of this motif still need to be determined. the methylation of dna is used for regulation of gene expression, but so far there is no indication that the n protein of nairoviruses travels to the nucleus; the protein does not contain a classical nuclear localisation signal (nls), nor does it accumulate in the nucleus of infected or transfected cells. n6 methylation is also used as a posttranscriptional modification of mrna, and rna methyltransferases appear to contain motifs very similar to the cmi and cmii motifs identified in dna methyltransferases (reviewed in ref. [119] ); further studies are required to determine if the nairovirus n protein can modify its own or host cell rnas in this way. like those of members of the other genera of the family bunyaviridae, the m segment of nairoviruses contains a single open reading frame (orf) encoding a polyprotein that is co-and post-translationally cleaved into the mature viral glycoproteins [39] . 
the glycoproteins of most nairoviruses are still poorly characterised, and most of the available data come from studies on cchfv, largely through studies on proteins expressed from plasmids. the processing of the cchfv m polyprotein to generate the mature glycoproteins appears to be more complex than that of other bunyaviruses, as it involves first the generation of glycoprotein precursors through the action of the signal protease in the endoplasmic reticulum (er) followed by further cleavages to give rise to the full set of mature glycoproteins, a process that employs other cellular proteases [15, 144, 145, 172] (fig. 3). the virions of most of the nairoviruses contain two mature glycoproteins, gn and gc [15, 33, 38, 107, 144, 145, 172]; however, two nairoviruses, hazara virus and clo mor virus, have been shown to contain three structural glycoproteins [68, 175]. the full nairovirus m-encoded polyprotein appears to have six hydrophobic regions (tm0-tm5), which could function as transmembrane helices [3, 144] and act either as classic secretory signal peptides (tm0), membrane anchors (tm1, tm3 and tm5), or a combination of both (tm2 and tm4) (fig. 3). signal cleavage motifs are found after tm0 (releasing the amino terminus of the gn precursor, pregn) and tm4 (releasing the gc precursor, pregc). sequence inspection revealed a signal cleavage motif immediately after tm2, suggesting that there might be a separate protein released, consisting of the sequence between the distal ends of tm2 and tm4; such a protein has been shown to be produced by cchfv and has been termed nsm [3]. further non-structural glycoproteins have been identified in cchfv-infected cells, referred to as the mucin-like domain (a highly o-glycosylated peptide), gp38, gp85 and gp160, all of which are released from infected cells; gp85 and gp160 contain both the gp38 protein and the mucin-like domain [145].
a model for the generation of the cchfv glycoproteins, based on a number of studies [3, 144, 145, 172], is summarised in figure 3. co-translational cleavage in the er by cellular signalase generates the 140-kda pregn (containing the mucin-like domain, gp38, and gn), the predominantly cytoplasmically oriented nsm, and the 85-kda pregc. the pregn is further cleaved, either in the er or the golgi compartment, to separate the mucin-like domain/gp38 from gn. this cleavage occurs at a conserved rrll↓ motif and is effected by the host's subtilisin kexin isozyme-1/site-1 protease (ski-1/s1p) [145, 172]. this cleavage has been shown to be critical for virus replication and subsequent infectivity, and lack of the cleavage prevents incorporation of the glycoproteins into viral particles [15]. a similar tetrapeptide (rkpl) is found 41 amino acids downstream of the signalase cleavage site in pregc; although it is not itself processed by ski-1/s1p, it is predicted to be utilised by a related subtilisin-like protease in the er/cis-golgi to generate the mature gc (75 kda) [144, 172]. the mature gn interacts (directly or indirectly) with the gc, and both are translocated to the virus-assembly sites in the golgi [17, 59, 145]. the interaction of mature gn with gc is essential for the gc to travel from er to golgi, as gn, but not gc, contains a golgi localisation signal [17]. the ectodomains of gn and gc appear to be sufficient for heterodimer formation and transport to the golgi, indicating that at least partial golgi-targeting information is located in the gn ectodomain [17]. the final processing of the glycoproteins takes place in the trans-golgi, where the mucin-like domain and gp38 are separated from each other by a furin-like proprotein convertase at the rskr↓ cleavage site [116, 145].
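the processing steps of the model just described can be laid out as a small data structure. the python sketch below simply transcribes the cleavage events named in the text (substrate, protease, compartment, products) and walks back from any product to the m polyprotein; the class and function names are illustrative choices made here, and the sketch encodes the model as stated rather than any additional experimental detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cleavage:
    substrate: str
    protease: str
    compartment: str
    products: tuple


# Cleavage events as summarised in the text for the CCHFV M polyprotein.
CCHFV_M_PROCESSING = (
    Cleavage("m polyprotein", "signalase", "er", ("pregn", "nsm", "pregc")),
    Cleavage("pregn", "ski-1/s1p", "er or golgi",
             ("mucin-like domain/gp38", "gn")),
    Cleavage("pregc", "subtilisin-like protease (predicted)", "er/cis-golgi",
             ("gc",)),
    Cleavage("mucin-like domain/gp38", "furin-like convertase", "trans-golgi",
             ("mucin-like domain", "gp38")),
)


def proteases_needed(product: str) -> list:
    """List the proteases, in order of action, that lead to a given product."""
    chain = []
    current = product
    while True:
        # Find the cleavage event that releases the current species, if any.
        step = next((c for c in CCHFV_M_PROCESSING if current in c.products), None)
        if step is None:
            break
        chain.append(step.protease)
        current = step.substrate
    return chain[::-1]


print(proteases_needed("gn"))
print(proteases_needed("gp38"))
```

laying the steps out in this way makes it easy to see that gn and gp38 share their first two cleavages, and that only the last, furin-like step distinguishes the free gp38 from the uncleaved mucin-like domain/gp38 species.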
this cleavage is not required for cchfv gn maturation [145], and it appears that not the entire pool of mucin-like domain/gp38 polyprotein is cleaved, as the nonstructural proteins termed gp85 and gp160 contain both the mucin-like domain and gp38 and are resistant to resolution into smaller proteins by denaturation in sds and urea [145]. resistance to denaturation also suggests that gp160 is probably not a dimer of gp85 [145]. interestingly, the mucin-like domain/gp38 and gp38 can fold independently of the rest of gn and are secreted even when expressed on their own in transfected cells [145]. no specific biological function has been assigned to the mucin-like domain, gp38, gp85 or gp160. the mucin-like domain of the ebola virus glycoprotein has been shown to play a major role in pathogenesis, including involvement in the observed increase of endothelium permeability [154, 188], and it will be important to see if this is also true of cchfv, especially given the changes visible in the endothelial cells of cchfv patients. the nsm protein, when expressed in transfected cells on its own, is transported to the golgi [3]. while the function(s) of the nairovirus nsm have still to be determined, the fact that gn requires nsm for maturation may mean that nsm is necessary for virus replication [3]. glycosylation is an important post-translational modification of secreted and membrane proteins that can influence protein folding, transport and function. n-linked glycosylation in particular is known to regulate protein folding, association with chaperones [112], transport, cellular localisation [81, 82], and even virus infectivity (reviewed in ref. [171]). the n-terminal part of the nairovirus m polyprotein, which contains the mucin-like domain, is heavily o-glycosylated, while the adjacent gp38 domain appears to have few o-glycosylation sites [144, 145].
both domains are also n-glycosylated, the mucin-like domain containing five potential sites and gp38 two [145]. of two predicted n-glycosylation sites in each of the mature gn and gc proteins, only one is functional in gn (n557), while both sites in gc (n1054 and n1563) are glycosylated [59]. however, only the glycosylation of the gn is essential for gn maturation, correct localisation, and transport of itself and other cchfv glycoproteins [59]. given that the gn glycosylation sites are conserved among cchfv strains [51], it is likely that correct glycosylation of the cchfv gn is critical for virus viability. recently, nmr has been used to determine a solution structure for the c-terminal (cytoplasmic) tail of cchfv gn, showing that this region contains a dual zinc-finger domain, the sequence of which is highly conserved among nairoviruses [60, 61]. classical ββα zinc-finger domains have been shown to take part in protein-protein interactions (reviewed in ref. [71]), and it is possible that the dual zinc-finger domain in the c-terminus of the nairovirus gn protein is involved in interaction with rnps, which would help drive assembly/budding of the virus. analysis of the sequences of m segments of other nairoviruses suggests that the general model proposed for processing of the m polyprotein described above for cchfv holds true for the other nairoviruses; m polyproteins show similar membrane topology, with six transmembrane regions and conserved signalase cleavage sites after the transmembrane domains tm0, tm2 and tm4 (fig. 3). glycoprotein maturation from precursor proteins has also been observed for dugv, where pregn is around 70 kda and pregc around 85 kda [107]. an exception to this general similarity is erve virus (ervev), which does not contain tm3 and tm4, its m polyprotein lacking the entire amino acid sequence between tm2 and the gc ectodomain. this suggests that ervev lacks an nsm protein.
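on the prediction of n-glycosylation sites mentioned above: such predictions are usually made by scanning for the n-x-s/t sequon, where x is any residue except proline. the python sketch below shows this scan on a made-up toy sequence (not a nairovirus protein); it is offered only as an illustration of how "predicted" sites are obtained, and, as the gn example in the text shows, a predicted sequon is not necessarily glycosylated in practice.

```python
import re

# N-X-S/T sequon, where X is any residue except proline.
# The lookahead lets overlapping sequons (e.g. N-N-T-S) both be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def predicted_n_glycosylation_sites(protein: str) -> list:
    """Return 1-based positions of asparagines that sit in an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical toy sequence containing three sequons and one N-P-T non-site.
toy_protein = "MKNGSAANPTLLNNTSQ"
print(predicted_n_glycosylation_sites(toy_protein))
```

here the asparagine followed by proline is skipped, while the two adjacent asparagines near the end are both reported because each is followed by a valid x-s/t pair.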
most of the nairoviruses for which we have sequence data appear to have a pregn domain that is about 120 amino acids shorter than that seen in cchfv, due to a much shorter mucin-like domain, as was previously described for dugv [144]. another difference is that other nairoviruses do not contain a furin-cleavage site (rskr) following the o-glycosylated mucin-like domain. the ski-1/s1p-like protease cleavage tetrapeptides in many nairoviruses appear to be different to those proposed for cchfv; however, all appear to fit the consensus subtilisin cleavage sequence (r/k)x(hydrophobic)z (where x is any amino acid and z is preferably f, k, l or t but not v, p, e, d, c) [56]. this may reflect differences in adaptation to specific tick or mammalian hosts, and further studies on the importance of these various stages in the maturation of the glycoproteins for different hosts are clearly required.
the l segment of nairoviruses, which contains a single open reading frame (orf) of approximately 12 kb, encoding a protein of approximately 450 kda, is almost twice as long as the l proteins of most other bunyaviruses, with the exception of the tospoviruses, in which the l segment is approximately 9 kb in length [48, 85, 94, 108] (fig. 4). despite this difference in length and sequence, nairovirus l proteins still show the four conserved functional regions previously described for other bunyavirus l proteins [94, 108, 131, 139, 177]. the bunyavirus l proteins contain the rna-dependent rna polymerase (rdrp), and the most conserved region of the l segment among nairoviruses is the region corresponding to the coding sequence for the core catalytic domains of the rdrp [8, 85, 108]. within this polymerase module (also called region 3) can be distinguished six conserved motifs [94, 108] (fig. 4).
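
the consensus subtilisin cleavage sequence quoted above for the ski-1/s1p-like sites, (r/k)x(hydrophobic)z, and the cchfv furin site (rskr) can both be expressed as simple sequence patterns. the following is a minimal sketch under stated assumptions: the review does not define "hydrophobic", so the residue set `HYDROPHOBIC` is an illustrative choice; the function names are hypothetical; and the test peptide is a toy sequence, not taken from any nairovirus. a pattern match marks a candidate tetrapeptide only and says nothing about whether cleavage occurs.

```python
import re

# Assumed hydrophobic set (the review only says "hydrophobic").
HYDROPHOBIC = "AVLIMFWY"
# Z may be any residue except V, P, E, D, C (per the consensus quoted in the text).
ALLOWED_Z = "ARNQGHILKMFSTWY"
# Z is "preferably" F, K, L or T.
PREFERRED_Z = set("FKLT")

SKI1_LIKE = re.compile(rf"(?=([RK][A-Z][{HYDROPHOBIC}][{ALLOWED_Z}]))")
FURIN_RSKR = re.compile(r"(?=RSKR)")  # the literal CCHFV furin site named in the text


def ski1_like_sites(protein: str) -> list[tuple[int, str, bool]]:
    """Return (1-based start, tetrapeptide, Z is a preferred residue) per match."""
    hits = []
    for m in SKI1_LIKE.finditer(protein.upper()):
        tetrapeptide = m.group(1)
        hits.append((m.start() + 1, tetrapeptide, tetrapeptide[3] in PREFERRED_Z))
    return hits


def furin_site_positions(protein: str) -> list[int]:
    """Return 1-based start positions of the literal RSKR tetrapeptide."""
    return [m.start() + 1 for m in FURIN_RSKR.finditer(protein.upper())]


if __name__ == "__main__":
    toy = "GSRSLKDDRSKRGG"  # hypothetical toy peptide
    print(ski1_like_sites(toy))
    print(furin_site_positions(toy))
```

keeping the "allowed" and "preferred" z residues separate lets a scan report every tetrapeptide that fits the consensus while still flagging the ones with a preferred residue in the fourth position.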
motifs a through d are conserved among all rna-dependent polymerases [50, 129]; motif pre-a, upstream of motif a, is present in all rna-dependent rna polymerases and reverse transcriptases, and motif e, which is downstream of motif d, is conserved in segmented negative-strand rna viruses [114, 129, 186]. motifs a, c and d are predicted to bind nucleoside triphosphates (ntps) and are therefore likely to be involved in the catalytic functions of the polymerase [50, 114], while motifs b and e are predicted to take part in template and/or primer positioning [114]. motif pre-a is also predicted to be involved in template positioning [94, 114]. the fact that the inter-motif distances are more or less constant suggests that the polymerase module functions in a structurally dependent manner [94]. upstream of the polymerase module are regions 1 and 2 (fig. 4), which are conserved in bunyaviruses and arenaviruses [94, 114], while downstream of the polymerase module is region 4 (fig. 4), which, although originally suggested to be specific for bunyaviruses [8], appears to be conserved in other segmented negative-strand rna viruses [94]. protein sequence analysis shows that the distances between regions appear to be conserved among bunyaviruses, with the exception of the interval between regions 2 and 3, where nairoviruses appear to have much longer amino acid sequences than other bunyaviruses (fig. 4). region 1, based on sequence similarity with other viruses, appears to be responsible for cap-snatching endonuclease activity [52, 136, 189]; however, this needs to be confirmed experimentally. in the case of nairoviruses, the rdrp accounts for only one-third of the entire l protein (fig. 4), with regions of unidentified function in both the amino (n) and carboxy (c) termini [94].
all bunyaviruses have a significant c-terminal section after the rdrp, although the function of this region is so far unknown in any of the viruses of this family, and there appear to be no cross-genera-conserved motifs [136]. n-terminal to the rdrp motifs, the l proteins of nairoviruses contain several additional domains that are unique to this genus, of which the ovarian-tumour (otu)-like protease domain is the most studied [2, 29, 85, 89, 94]. this domain belongs to a larger papain-like cysteine protease family also found in other viruses (e.g., blueberry scorch virus (blscv) of the genus carlavirus, and equine arteritis virus (eav) and porcine respiratory and reproductive syndrome virus (prrsv) of the genus arterivirus), in saccharomyces cerevisiae, in drosophila melanogaster and in mammalian cells [11, 103]. curiously, the region containing the otu domain also contains a sequence resembling a topoisomerase-like domain, located at amino acids , and therefore lying between amino acids that form part of the otu catalytic site (cysteine 40 and histidine 151) [85, 94]. the consensus topoisomerase motif (skxxy) is not conserved across nairoviruses, being mostly slxxy in cchfv and nsdv/gv, but the active-site tyrosine is conserved across all nairoviruses so far sequenced. given the structural similarities between topoisomerases and strand-specific recombinases [35], this motif may indicate a role for this region of l in rna strand manipulation as well as its function as a protease. alternatively, nairovirus l proteins may include their own topoisomerase activity, rather than having to recruit the host-cell topoisomerase i, as has been shown for at least one non-segmented rna virus [159].
[figure caption fragment] the approximate size, expressed as the number of amino acids (aa), is indicated for each protein, and the diagram is drawn approximately to scale. see text for details of the various motifs. adapted from ref. [91]
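
the comparison of the consensus topoisomerase motif (skxxy) with the slxxy variant described above amounts to a pattern search anchored on the conserved tyrosine. a minimal sketch of such a search follows; the function name `topo_like_motifs` and the toy peptide are hypothetical, and the sketch only labels matches as "consensus" or "variant" without implying any enzymatic activity.

```python
import re

# S, then K (consensus) or L (variant reported for CCHFV and NSDV/GV),
# then any two residues, then the conserved active-site tyrosine.
TOPO_LIKE = re.compile(r"(?=(S[KL][A-Z]{2}Y))")


def topo_like_motifs(protein: str) -> list[tuple[int, str, str]]:
    """Return (1-based start, pentapeptide, 'consensus' or 'variant') per match."""
    hits = []
    for m in TOPO_LIKE.finditer(protein.upper()):
        pentapeptide = m.group(1)
        label = "consensus" if pentapeptide[1] == "K" else "variant"
        hits.append((m.start() + 1, pentapeptide, label))
    return hits


if __name__ == "__main__":
    toy = "AASKGTYDDSLPQYAA"  # hypothetical toy peptide containing both forms
    for start, pep, label in topo_like_motifs(toy):
        print(start, pep, label)
```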
downstream of the otu domain, the l protein of nairoviruses contains a c2h2-type zinc finger domain and a leucine zipper motif [85, 94], both of which are highly conserved among nairoviruses, but the function of which in the viral replication cycle is still to be determined. interestingly, the zinc-finger domain and leucine zipper motif are located in region 1 and region 2, respectively, of the nairoviral l proteins but do not appear to be present in these regions in the l proteins of other bunyaviruses. the otu-domain protease activity is dispensable for virus genome replication [16], and most recent studies have focussed on the effects of this enzymatic activity on host proteins. mammalian otu-domain proteins are primarily deubiquitinating enzymes (dubs), responsible for cleaving the modified peptide bond that links ubiquitin (ub) to host-cell proteins or to other ub molecules. most mammalian dubs are only able to deubiquitinate and usually have a limited set of targets that they de-conjugate in this way (reviewed in ref. [96]). the otu domains of nairoviruses, in contrast, are capable of de-conjugating not only ub but also other ubiquitin-like peptides, notably the interferon-stimulated gene 15 protein (isg15), removing these peptides from a variety of protein targets [10, 69, 84]. it has been shown that amino acids 1 to 169 of the l proteins of cchfv, nsdv or dugv are sufficient for enzymatic activity, and conserved active site residues that are critical for its catalytic activity have been identified [10, 69, 84]. in the last few years, the crystal structures of the otu domain with and without a ubiquitin molecule have been determined [2, 29, 89]. the cchfv otu showed an overall similar structure to yeast otu1, but with an additional domain formed by two antiparallel β-strands that allow the viral otus to bind both ub and isg15 [2, 89].
specific cleavage targets, other than host ubiquitinated or isgylated proteins, for the viral otu-like protease have not yet been described. as several potential cysteine-protease-like cleavage sites have been identified in the l protein sequence of nairoviruses [94] and some viral proteins containing an otu-like protease domain have also been shown to undergo autoproteolytic cleavage to generate multiple mature proteins, e.g., the replicase of blscv [98], it has been suggested that the l proteins of nairoviruses may also be autoproteolytically cleaved into an active rna polymerase and protein(s) with additional functions [85]. we have raised specific antibodies to the n- and c-termini of the nsdv l protein and shown that such cleavage does not occur in infected cells (lasecka and baron, unpublished).
early studies on nairoviruses showed that, unlike other bunyaviruses, they do not shut off host protein synthesis [33, 176], so the viruses must have a different way of controlling the immediate host cell responses to infection, such as the innate immune response and apoptosis. cchfv infection in cultured cells induced apoptosis, albeit at late stages of infection [91, 141]. one of the ways in which nairoviruses might induce apoptosis has been suggested to be via the induction of er stress [141]. certainly cchfv and dugv have both been shown to induce er stress [141], and we have observed the same in cells infected with nsdv (lasecka, unpublished data). during replication, nairoviruses synthesise large amounts of their glycoproteins, which mature in the er and golgi [17, 21, 142] and are likely to overload the normal er protein synthesis machinery. as long as the apoptosis occurs after the virus has completed the assembly and release of progeny virus, apoptosis is not a major problem, but it is unclear as yet whether nairoviruses take active steps to inhibit apoptotic pathways.
there is the possibility that apoptosis may be delayed or inhibited in cchfv infections by the presence of a highly conserved caspase-3 cleavage site (devd) in the n protein [31, 91], which may act as a decoy substrate for caspase-3. although this motif has been suggested to be involved in control of apoptosis [31, 91], the cleavage of n protein by caspase-3 is not required for replication/transcription of a cchfv minigenome [31]. in addition, this devd motif appears to be inaccessible to caspase-3 in the oligomeric form of cchfv n, suggesting that only n monomers are caspase-3 sensitive [174]. cchfv also appears to be the only nairovirus that contains the devd motif [174]; other nairoviruses, such as nsdv/gv, do not have the motif, yet are still highly pathogenic.
the innate immune response to viral infection has been described in a number of recent reviews [72, 105, 132, 143]. cells detect viral pathogen-associated molecular patterns (pamps) (e.g., double-stranded rna (dsrna) or dna with unmethylated cpg motifs) using pattern recognition receptors (prrs), of which the most well-known are the toll-like receptors (tlrs), which scan extra-cytoplasmic spaces including the interior of endosomes and lysosomes, and the cytosolic proteins melanoma-associated differentiation gene-5 protein (mda-5) and retinoic acid-inducible gene-i protein (rig-i), which detect virus-associated pamps in the cytoplasm. recognition of a pamp leads to activation of the prr, followed by an intracellular signalling cascade leading to the transcription of interferon β (ifnβ) mrna. secreted ifnβ works in an autocrine and paracrine manner by binding to the ifnα/β receptor of both infected and uninfected cells. this in turn activates a signalling pathway, which up-regulates interferon-stimulated genes (isgs), including ifnα, which increases the stimulus to other isgs.
finally, the products of isgs (directly or indirectly) lead to the cell entering the so-called antiviral state, in which many proteins are present that can inhibit different viruses. interferon and its actions have strong inhibitory effects on the replication of nairoviruses. for instance, treatment of human endothelial cells with ifnα significantly reduced the yield of cchfv [6], while the replication of cchfv and dugv is impaired in vero cells stably expressing the human mx (mxa) protein; mxa is the product of an isg and appears to act through sequestration of the n protein of these viruses in a perinuclear region [4, 25]. however, as with other viruses, nairoviruses have developed mechanisms to evade this innate antiviral response. some of these mechanisms appear to be unique to one virus; others appear to be common to all the viruses in the genus that have been studied. for example, cchfv can delay ifnα/β production in infected cells; andersson et al. showed that, in cchfv-infected cells, an increase in ifnβ mrna could only be detected 48 h after infection, which leaves the virus a replication window with no antiviral state [7]. this delay is related to a delay in translocation of the transcription factor interferon regulatory factor-3 (irf-3) to the nucleus [7] and hence a delay in the induction of isg56, one of the genes whose transcription is regulated by irf-3. the isg56 protein is a cytoplasmic protein (p56) that is involved in the global inhibition of translation in response to infection (reviewed in ref. [63]). the delay in isg56 expression correlates with earlier findings that nairoviruses do not appear to shut off cellular protein synthesis [33, 176]. in contrast to the findings with cchfv, ifnβ induction was seen in nsdv/gv-infected cells after about 16 hours; however, during those 16 hours, the virus actively blocked induction of ifnβ [84].
nsdv/gv infection also blocked the signalling pathways activated by external ifnα or ifnγ, blocking the activation of transcription from promoters normally activated by one or the other of these cytokines by directly inhibiting the phosphorylation of the transcription factors stat1 and stat2 [84].
of the cytoplasmic prrs, rig-i is activated by rnas with a 5' triphosphate group, such as viral vrnas and crnas, while mda-5 is activated by binding dsrnas [44, 86, 92, 128]. like other negative-sense rna viruses, nairoviruses do not produce detectable amounts of dsrna during replication [178] and hence avoid activation of a number of dsrna-sensitive prrs (e.g., mda-5, tlr3) and dsrna-dependent enzymes (e.g., dsrna-activated protein kinase r [pkr] and 2'5'-oligoadenylate synthetase [2'5' oas]). the newly synthesised vrna and crna of nairoviruses, as for other bunyaviruses, is cotranscriptionally encapsidated by the n protein, minimising the exposure of vrnas and preventing formation of dsrna intermediates [127, 135, 173]. it is thought that the major sensor for bunyavirus infection is therefore rig-i. 5' triphosphate groups are present on rna molecules generated during viral replication but are absent on cellular rnas, as cellular mrna contains a cap structure at the 5' end, and the transcription products of other cellular rna polymerases (rna polymerases i and iii) contain a monophosphate at the 5' end. by utilising capped, short nucleotide sequences snatched from host mrna to initiate viral mrna transcription [90], nairoviruses prevent recognition of viral mrna by prrs. cchfv further avoids activation of rig-i as its vrna and crna have a monophosphate group rather than the triphosphate group found at the 5' end of most viral rnas [78], though the mechanism of 5' monophosphate generation during replication of cchfv remains unknown.
strategies to generate 5' monophosphates on viral rna have been shown for other viruses: htnv utilises a prime-and-realign mechanism to generate 5' and 3' complementary ends, where a viral endonuclease (which is probably also involved in cap-snatching) is proposed to remove 5'-terminal extensions, leading to a 5' monophosphate on the final rna [73]. interestingly, this mechanism is not common to all bunyaviruses, as rvfv was shown to contain triphosphate groups at the 5' ends of its vrnas, which can activate ifnβ via the rig-i pathway [78]. it will be interesting to see if other nairoviruses adopt the rna processing seen for cchfv. it is possible that bunyaviruses that have developed methods of blocking the rig-i-activated pathway (e.g., by the activities of a nonstructural protein such as the phlebovirus nss protein) can have triphosphates at the 5' end of their vrnas while viruses that do not have an nss activity must generate monophosphates to avoid activating the rig-i pathway in the first place [18, 19, 23, 73].
ubiquitin and ubiquitin-like molecules play important roles in the initiation and maintenance of the immune response. for example, they are essential for the action of cytokines such as ifnα/β and tumour necrosis factor alpha (tnfα) (reviewed in ref. [95]). ubiquitination allows for the activation of nuclear factor kappa b (nf-κb) by targeting the inhibitor of nf-κb (i-κb) for degradation [163]. k63-linked ubiquitination activates several molecules of the ifnβ induction pathway, including rig-i, mitochondrial antiviral signalling protein (mavs), tank-binding kinase-1 (tbk1), iκb kinase-ε (ikk-ε), tumour necrosis factor receptor-associated factor 3 (traf3) and traf6 [70, 125, 126, 166]. in addition to modulation of the innate immune signalling, ubiquitination also plays an important role in antigen presentation by major histocompatibility complex (mhc) class i and ii proteins [152].
isg15 is an ifn-induced 15-kda ubiquitin-like molecule that is composed of two ubiquitin-like domains [20, 117]. although the precise role of isg15 in modulation of protein function is unknown, conjugation by isg15 is known to modify hundreds of cellular proteins, including several of those involved in the antiviral response, e.g., pkr, mxa, stat1, rig-i, janus kinase 1 (jak1), and irf-3 [75, 104, 138, 192]. both ub and isg15 are synthesised as precursors, which are cleaved in order to expose the conjugation sequence (lrlrgg) by which they are attached to other proteins. the conjugation is mediated by a sequence involving activating enzymes (e1), conjugating enzymes (e2) and protein ligases (e3) [157, 180, 191] (reviewed in ref. [169]). removal of these conjugated proteins is carried out by cellular deubiquitinating enzymes and deisgylating enzymes, which have roles, as expected, in negative feedback regulation of ifn induction and action [93, 99].
as described in the discussion of the nairovirus l protein, the n-terminus of these proteins contains an otu protease-like domain [85] similar to that often found in mammalian dubs, and experimental evidence showed that this protease domain in the viral protein indeed deconjugated ub and isg15. global ubiquitination and isgylation levels were greatly reduced in nsdv/gv-infected cells [84], and overexpression of the amino-terminal end of the l proteins of cchfv, nsdv or dugv in cell culture resulted in a similar reduction in ubiquitin- and isg15-conjugated proteins [10, 30, 69, 84]. the l protein otu domains of several nairoviruses have been shown to block ifnβ induction and the actions of type i and type ii ifns [10, 69, 84]. interestingly, at high enough concentrations, even catalytically inactive otus are capable of blocking ifnα-induced transcription [10, 69].
this suggests that catalytically inactive otu domain proteins are still capable of sequestering specific ubiquitinated or isgylated targets by binding to them. the nsp2 proteins of eav and prrsv also contain otu-like domains [69]. recently, it was shown that both the nsp2 of eav and the otu domain of cchfv l protein are able to deubiquitinate rig-i and hence block rig-i-mediated activation of ifnβ [168]. interestingly, the rna of neither cchfv nor eav appears to activate rig-i [78, 168], so the fact that these viruses have evolved specific mechanisms to inhibit the rig-i-mediated induction of ifnβ suggests that rig-i still has a function in the antiviral response to these viruses. however, unlike cchfv, the vrna of nsdv/gv does activate transcription from the ifnβ promoter (lasecka and baron, unpublished observations). the ability to directly block the rig-i pathway would therefore be particularly important for this virus. comparison of the otu domains of different nairoviruses revealed some differences between their affinity for different types of poly-ub and isg15 [30]. for instance, cchfv shows a higher affinity for k63-poly-ub than k48-poly-ub, and the cchfv otu was more active in the deubiquitination of host proteins than the otus of either nsdv or dugv [10, 30, 84]. on the other hand, while the ervev otu appears to bind any poly-ub weakly (when compared to those of cchfv and dugv), it had higher affinity for isg15 [30]. this may indicate that different nairoviruses have adopted slightly different ways of utilising their core deubiquitinating and deisgylating activities, which might reflect the wide range of pathogenicity caused by these viruses, or differences in the requirements imposed on these viruses by their arthropod hosts, about which we know very little.
isg15 is not as strongly conserved as ubiquitin among different species, even mammals, so there can be real effects of species preference in isg15 binding/cleavage, e.g., the cchfv otu appears to show a preference for isg15 of human origin over that of mouse origin, while ervev appears to recognise both human and mouse isg15s equally [30]. the better binding of the murine isg15 by the ervev otu may be associated with the homology between the isg15 of mouse and the white-toothed shrew from which ervev is commonly isolated [34, 184].
nairoviruses share many of their features with other bunyaviruses, e.g., replication in the cytoplasm, budding in the golgi, and their coding and rna replication strategy. from phylogenetic studies, the members of the genus nairovirus appear to be most closely related to those of the genus phlebovirus of all bunyaviruses [108, 131, 139, 177]. however, nairoviruses possess many features not found in other bunyaviruses. nairoviruses appear to have complex processing of their glycoproteins, which involves the actions of cellular proteases such as ski-1/s1p-like proteases and furin. the proteins of nairoviruses also contain domains that have not been observed in other bunyaviruses, such as the l protein otu domain. nairoviruses express a secreted mucin-like domain, which may play an essential role in the pathogenicity of the virus [154, 188]. structural similarity between the nairovirus n protein globular domain or the rdrp region of its l protein and equivalent proteins of arenaviruses has been taken to suggest that the nairoviruses are more closely related to arenaviruses than to members of other genera of the family bunyaviridae [31, 170], with some authors even suggesting that the current classification of the nairoviridae might need re-evaluation in the future [31]. several areas for future study stand out.
given that the nairoviruses are, in general, tick-borne, while most other bunyaviruses are insect-borne, it is to be expected that the interaction of these viruses with their arthropod hosts will be specific to the virus genus and need specific study. fortunately, expertise with handling ixodid ticks and tick cell lines is rapidly increasing, and it is to be hoped that our understanding of the replication of these viruses in their tick hosts will catch up with our knowledge of what is happening in mammals. the nairoviruses have been more resistant to the development of successful reverse genetics than, e.g., the orthobunyaviruses or the phleboviruses, and development of such a system for nairoviruses will be invaluable in helping us understand the roles of various nairovirus-specific domains such as the topoisomerase-like domain, c2h2-zinc finger domain, leucine zipper motif, and otu domain of the l protein, or the ns m and mucin-like domain from the m segment, both in mammalian and arthropod hosts. the ability to create targeted mutations will also enable us to more rapidly develop stably attenuated viruses that could act as vaccines.
presence of broadly reactive and group-specific neutralizing epitopes on newly described isolates of crimean-congo hemorrhagic fever virus molecular basis for ubiquitin and isg15 cross-reactivity in viral ovarian tumor domains identification of a novel c-terminal cleavage of crimean-congo hemorrhagic fever virus pregn that leads to generation of an nsm protein human mxa protein inhibits the replication of crimean-congo hemorrhagic fever virus role of actin filaments in targeting of crimean congo hemorrhagic fever virus nucleocapsid protein to perinuclear regions of mammalian cells type i interferon inhibits crimean-congo hemorrhagic fever virus in human target cells crimean-congo hemorrhagic fever virus delays activation of the innate immune response analysis of oropouche virus l protein amino acid sequence showed the presence of an additional conserved region that could harbour an important role for the polymerase activity nucleocapsid protein structures from orthobunyaviruses reveal insight into ribonucleoprotein architecture and rna polymerization dugbe virus ovarian tumour domain interferes with ubiquitin/isg15-regulated innate immune cell signalling otubains: a new family of cysteine proteases in the ubiquitin pathway 5'-terminal cap structure in eucaryotic messenger ribonucleic acids viral infections in laboratory personnel role of the conserved nucleotide mismatch within 3'-and 5'-terminal regions of bunyamwera virus in signaling transcription crimean-congo hemorrhagic fever virus glycoprotein processing by the endoprotease ski-1/s1p is critical for virus infectivity crimean-congo hemorrhagic fever virus-encoded ovarian tumor protease activity is dispensable for virus rna polymerase function cellular localization and antigenic characterization of crimean-congo hemorrhagic fever virus glycoproteins nss protein of rift valley fever virus blocks interferon production by inhibiting host gene transcription la crosse bunyavirus nonstructural protein nss serves to 
suppress the type i interferon system of mammalian hosts molecular characterization of the interferon-induced 15-kda protein. molecular cloning and nucleotide and amino acid sequence structure and morphogenesis of dugbe virus (bunyaviridae, nairovirus) studied by immunogold electron microscopy of ultrathin cryosections role of nucleolin in human parainfluenza virus type 3 infection of human lung epithelial cells genetic evidence for an interferon-antagonistic function of rift valley fever virus nonstructural protein nss dugbe nairovirus s segment: correction of published sequence and comparison of five isolates inhibition of dugbe nairovirus replication by human mxa protein nosocomial outbreak of viral hemorrhagic fever caused by crimean hemorrhagic fever-congo virus in pakistan investigation of tick-borne viruses as pathogens of humans in south africa and evidence of dugbe virus infection in a patient with prolonged thrombocytopenia immunohistochemical and in situ localization of crimean-congo hemorrhagic fever (cchf) virus in human tissues and implications for cchf pathogenesis structural analysis of a viral ovarian tumor domain protease from the crimean-congo hemorrhagic fever virus in complex with covalently bonded ubiquitin diversity of ubiquitin and isg15 specificity among nairoviruses' viral ovarian tumor domain proteases structure, function, and evolution of the crimean-congo hemorrhagic fever virus nucleocapsid protein the nairovirus genus: serological relationships polypeptide synthesis of dugbe virus, a member of the nairovirus genus of the bunyaviridae erve virus, a probable member of bunyaviridae family isolated from shrews (crocidura russula) in france conservation of structure and mechanism between eukaryotic topoisomerase i and site-specific recombinases nucleolin expressed at the cell surface is a marker of endothelial cells in angiogenic blood vessels the 3' terminal rna sequences of bunyaviruses and nairoviruses (bunyaviridae): evidence of end 
sequence generic differences within the virus family qalyub virus, a member of the newly proposed nairovirus genus (bunyavividae) structural characteristics of nairoviruses (genus nairovirus, bunyaviridae) fusing structure and function: a structural view of the herpesvirus entry machinery pretoria virus: a new african agent in the tickborne dera ghazi khan (dgk) group and antigenic relationships within the dgk group soldado virus (hughes group) from ornithodoros (alectorobius) capensis (ixodoidea: argasidae) infesting sooty tern colonies in the seychelles, indian ocean kupe virus, a new virus in the family bunyaviridae, genus nairovirus, kenya the c-terminal regulatory domain is the rna 5'-triphosphate sensor of rig-i isolation of ganjam virus from a human case of febrile illness: a report of a laboratory infection and serological survey of human sera from three different states of india dugbe virus: a tick-borne arbovirus from nigeria the serological relationships of nairobi sheep disease virus tomato spotted wilt virus l rna encodes a putative rna polymerase characterization of a 100-kilodalton binding protein for the six serotypes of coxsackie b viruses an attempt to unify the structure of polymerases crimean-congo hemorrhagic fever virus genomics and global diversity the cap-snatching endonuclease of influenza virus polymerase resides in the pa subunit crimean-congo hemorrhagic fever in kosovo first documentation of human crimean-congo hemorrhagic fever crimean-congo haemorrhagic fever virus infection in the western province of saudi arabia biosynthesis and cellular trafficking of the convertase ski-1/s1p: ectodomain shedding requires ski-1 activity congo/crimean haemorrhagic fever virus from iraq 1979: i. 
morphology in bhk21 cells crimean-congo haemorrhagic fever n-linked glycosylation of gn (but not gc) is important for crimean congo hemorrhagic fever virus glycoprotein localization and transport the hantavirus glycoprotein g1 tail contains dual cchc-type classical zinc fingers structural characterization of the crimean-congo hemorrhagic fever virus gn tail provides insight into virus assembly a selex-screened aptamer of human hepatitis b virus rna encapsidation signal suppresses viral replication the isg56/ifit1 gene family the hexamer structure of rift valley fever virus nucleoprotein suggests a mechanism for its assembly into ribonucleoprotein complexes reverse genetics system for uukuniemi virus (bunyaviridae): rna polymerase i-catalyzed expression of chimeric viral rnas mutational analysis of the uukuniemi virus (bunyaviridae family) promoter reveals two elements of functional importance reverse genetics for crimean-congo hemorrhagic fever virus structural polypeptides of hazara virus ovarian tumor domain-containing viral proteases evade ubiquitin- and isg15-dependent innate immune responses ring-finger e3 ubiquitin ligase is essential for rig-i-mediated antiviral activity sticky fingers: zinc-fingers as protein-recognition motifs type 1 interferons and the virus-host relationship: a lesson in detente the 5' ends of hantaan virus (bunyaviridae) rnas suggest a prime-and-realign mechanism for the initiation of rna synthesis crimean-congo hemorrhagic fever virus utilizes a clathrin- and early endosome-dependent entry pathway proteomic identification of proteins conjugated to isg15 in mouse and human cells immunofluorescence studies on the antigenic interrelationships of the hughes virus serogroup (genus nairovirus) and identification of a new strain crimean-congo hemorrhagic fever virus nucleoprotein reveals endonuclease activity in bunyaviruses processing of genome 5' termini as a strategy of negative-strand rna viruses to 
avoid rig-i-dependent interferon induction structure of the lassa virus nucleoprotein reveals a dsrna-specific 3' to 5' exonuclease activity essential for immune suppression crystal structure of the lassa virus nucleoprotein-rna complex reveals a gating mechanism for rna binding lectins and traffic in the secretory pathway intracellular functions of n-linked glycans a multifunctional shuttling protein nucleolin is a macrophage receptor for apoptotic cells inhibition of interferon induction and action by the nairovirus nairobi sheep disease virus/ganjam virus crimean-congo hemorrhagic fever virus genome l rna segment and encoded protein 5'-triphosphate rna is the ligand for rig-i the role of motor proteins in endosomal sorting cholesterol is required for endocytosis and endosomal escape of adenovirus type 2 structural basis for the removal of ubiquitin and interferon-stimulated gene 15 by a viral ovarian tumor domain-containing protease non-viral sequences at the 5' ends of dugbe nairovirus s mrnas induction of caspase activation and cleavage of the viral nucleocapsid protein in different cell types during crimean-congo hemorrhagic fever virus infection differential roles of mda5 and rig-i helicases in the recognition of rna viruses duba: a deubiquitinase that regulates type i interferon production sequence determination of the crimean-congo hemorrhagic fever virus l segment role of ubiquitin- and ubl-binding proteins in cell signaling breaking the chains: structure and function of the deubiquitinases three-dimensional structure of the adenine-specific dna methyltransferase m.taq i in complex with the cofactor s-adenosylmethionine autocatalytic processing of the 223-kda protein of blueberry scorch carlavirus by a papain-like proteinase regulation of virus-triggered signaling by otub1- and otub2-mediated deubiquitination of traf3 and traf6 the l protein of rift valley fever virus can rescue viral ribonucleoproteins and transcribe synthetic genome-like rna molecules 
morphogenesis of post-golgi transport carriers ticks associated with macquarie island penguins carry arboviruses from four genera a novel superfamily of predicted cysteine proteases from eukaryotes, viruses and chlamydia pneumoniae high-throughput immunoblotting. ubiquitiinlike protein isg15 modifies key regulators of signal transduction viral activation of macrophages through tlrdependent and -independent pathways nairobi sheep disease virus, an important tick-borne pathogen of sheep and goats in africa, is also present in asia dugbe nairovirus m rna: nucleotide sequence and coding strategy large rna segment of dugbe nairovirus encodes the putative rna polymerase molecular biology of nairoviruses hantavirus n protein exhibits genus-specific recognition of the viral rna panhandle investigating the specificity and stoichiometry of rna binding by the nucleocapsid protein of bunyamwera virus chaperone selection during glycoprotein translocation into the endoplasmic reticulum on a tick-borne gastroenteritis of sheep and goats occurring in british east africa rift valley fever virus l segment: correction of the sequence and possible functional role of newly identified regions conserved in rna-dependent polymerases human crimean-congo hemorrhagic fever furin: a mammalian subtilisin/kex2p-like endoprotease involved in processing of a wide variety of precursor proteins crystal structure of the interferon-induced ubiquitin-like protein isg15 the anti-hiv pentameric pseudopeptide hb-19 binds the c-terminal end of nucleolin and prevents anchorage of virus particles in the plasma membrane of target cells n6-methyl-adenosine (m6a) in rna: an old modification with a novel epigenetic function rna binding properties of bunyamwera virus nucleocapsid protein and selective binding to an element in the 5' terminus of the negative-sense s segment genetic characterization of the m rna segment of crimean congo hemorrhagic fever virus strains crimean-congo hemorrhagic fever in bulgaria 
emergence of crimean-congo haemorrhagic fever in greece ecology of the crimean-congo hemorrhagic fever endemic area in albania tax1bp1 and a20 inhibit antiviral signaling by targeting tbk1-ikki kinases ubiquitin-regulated recruitment of ikappab kinase epsilon to the mavs interferon signaling adapter ribonucleoproteins of uukuniemi virus are circular rig-i-mediated antiviral responses to single-stranded rna bearing 5'-phosphates identification of four conserved motifs among the rna-dependent polymerase encoding elements cap binding and immune evasion revealed by lassa nucleoprotein structure sequence and phylogenetic analysis of the large (l) segment of the tahyna virus genome interferons and viruses: an interplay between induction, signalling, antiviral responses and virus countermeasures laboratory infections with ganjam virus hantavirus nucleocapsid protein is expressed as a membrane-associated protein in the perinuclear region molecular biology of nairoviruses 1263 structure of the rift valley fever virus nucleocapsid protein reveals another architecture for rna encapsidation bunyaviridae rna polymerases (l-protein) have an n-terminal, influenza-like endonuclease domain, essential for viral cap-dependent transcription characterization of the nucleic acid binding properties of tomato spotted wilt virus nucleocapsid protein isg15: the immunological kin of ubiquitin completion of the la crosse virus genome sequence and genetic comparisons of the l proteins of the bunyaviridae extraction of cholesterol with methyl-betacyclodextrin perturbs formation of clathrin-coated endocytic vesicles crimean-congo hemorrhagic fever virus-infected hepatocytes induce er-stress and apoptosis crosstalk ultrastructural studies on the replication and morphogenesis of nairobi sheep disease virus, a nairovirus antiviral actions of interferons characterization of the glycoproteins of crimean-congo hemorrhagic fever virus crimean-congo hemorrhagic fever virus glycoprotein precursor is 
cleaved by furin-like and ski-1 proteases to generate a novel 38-kilodalton glycoprotein tickborne arbovirus surveillance in market livestock bunyaviridae: the viruses and their replication the v3 loop-mimicking pseudopeptide 5[kpsi(ch2n)pr]-tasp inhibits hiv infection in primary macrophage cultures a protein partially expressed on the surface of hepg2 cells that binds lipoproteins specifically is nucleolin essential amino acids of the hantaan virus n protein in its interaction with rna nucleolin is a receptor that mediates antiangiogenic and antitumor activity of endostatin surface expression of mhc class ii in dendritic cells is controlled by regulated ubiquitination dissecting virus entry via endocytosis ebola virus glycoproteins induce global surface protein down-modulation and loss of cell adherence microtubule-dependent and microtubule-independent steps in crimean-congo hemorrhagic fever virus replication cycle crimean-congo hemorrhagic fever virus entry and replication is clathrin-, phand cholesterol-dependent ubiquitylation and isgylation: overlapping enzymatic cascades do the job epidemiology and phylogenetic analysis of crimean-congo hemorrhagic fever viruses in xinjiang dna topoisomerase 1 facilitates the transcription and replication of the ebola virus genome oligomerization paths of the nucleoprotein of influenza a virus molecular dynamics studies of the nucleoprotein of influenza a virus: role of the protein flexibility in rna binding identification of nucleolin as a cellular receptor for human respiratory syncytial virus nfkappab pathway: a good signaling paradigm and therapeutic target sequence motifs characteristic for dna [cytosine-n4] and dna [adenine-n6] methyltransferases. 
classification of all dna methyltransferases thunderclap headache caused by erve virus different modes of ubiquitination of the adaptor traf3 selectively activate the expression of type i interferons and proinflammatory cytokines general review of tick-borne diseases of sheep and goats world-wide arterivirus and nairovirus ovarian tumor domain-containing deubiquitinases target activated rig-i to control innate immune signaling uncovering ubiquitin and ubiquitin-like signaling networks sequence analysis of l rna of lassa virus virus glycosylation: role in virulence and immune interactions crimean-congo hemorrhagic fever virus glycoprotein proteolytic processing by subtilase ski the inner structure of uukuniemi and two bunyamwera supergroup arboviruses structure of crimean-congo hemorrhagic fever virus nucleoprotein: superhelical homooligomers and the role of caspase-3 cleavage the proteins and rnas specified by clo mor virus, a scottish nairovirus synthesis of bunyavirus-specific proteins in a continuous cell line (xtc-2) derived from xenopus laevis tensaw virus genome sequence and its relation to other bunyaviridae double-stranded rna is produced by positive-strand rna viruses and dna viruses but not in detectable amounts by negative-strand rna viruses interferon and cytokine responses to crimean congo hemorrhagic fever virus; an emerging and neglected viral zonoosis ubiquitin and ubiquitin-like proteins as multifunctional signals nairobi sheep disease. 
in: 7th (ed) foreign animal disease crimean-congo hemorrhagic fever crimean-congo haemorrhagic fever: a seroepidemiological and tick survey in the sultanate of oman the erve virus: possible mode of transmission and reservoir identification of a putative crimean-congo hemorrhagic fever virus entry factor origin and evolution of retroelements based upon their reverse transcriptase sequences genomic analysis reveals nairobi sheep disease virus to be highly diverse and present in both africa, and in india in the form of the ganjam virus variant identification of the ebola virus glycoprotein as the main viral determinant of vascular cell cytotoxicity and injury crystal structure of an avian influenza polymerase pa(n) reveals an endonuclease active site electron microscopic and antigenic studies of uncharacterized viruses. ii. evidence suggesting the placement of viruses in the family bunyaviridae the ubch8 ubiquitin e2 enzyme is also the e2 enzyme for isg15, an ifnalpha/beta-induced ubiquitin-like protein human isg15 conjugation targets both ifn-induced and constitutively expressed proteins functioning in diverse cellular pathways rna-binding proteins that inhibit rna virus infection acknowledgments the laboratory of mdb receives funding from the uk biotechnology and biological science research council (bbsrc). ll was the recipient of a bbsrc phd studentship. key: cord-270594-62xotol3 authors: he, lei; hu, xiaolong; zhu, min; liang, zi; chen, fei; zhu, liyuan; kuang, sulan; cao, guangli; xue, renyu; gong, chengliang title: identification and characterization of vp7 gene in bombyx mori cytoplasmic polyhedrosis virus date: 2017-09-05 journal: gene doi: 10.1016/j.gene.2017.06.048 sha: doc_id: 270594 cord_uid: 62xotol3 the genome of bombyx mori cytoplasmic polyhedrosis virus (bmcpv) contains 10 double stranded rna segments (s1–s10). the segment 7 (s7) encodes 50 kda protein which is considered as a structural protein. 
the expression pattern and function of p50 in the virus life cycle are still unclear. in this study, a polyclonal antibody against viral structural protein 7 (vp7) was prepared by immunizing mice, in order to explore the presence of small vp7 gene-encoded proteins in bombyx mori cytoplasmic polyhedrosis virus. the expression pattern of the vp7 gene was investigated by overexpressing it in bmn cells. in addition to vp7, a supplementary band was identified by western blotting. the virion, bmcpv-infected cells and infected midguts were also examined by western blotting; 4, 2 and 5 bands were detected in the corresponding samples, respectively. replication of the bmcpv genome in cultured cells and in the silkworm midgut was decreased when the expression level of the vp7 gene was reduced by rna interference. in immunoprecipitation experiments using a polyclonal antiserum directed against vp7, one additional shorter band was detected in bmcpv-infected midguts. this band was analyzed by mass spectrometry (ms), and the ms results showed that one candidate vp7-interacting protein (voltage-dependent anion-selective channel-like isoform, vdac) was identified from silkworm. we concluded that the novel viral product was generated by a leaky scanning mechanism and that vdac may be a protein that interacts with vp7.
the genome of cytoplasmic polyhedrosis viruses (cpvs) normally consists of 10 dsrna segments (payne & mertens, 1983); however, some cpvs have 11-12 dsrna segments (zhao, liang, hong, & peng, 2003a; zhao, liang, hong, xu, & peng, 2003b). the genomes of trichoplusia ni cypovirus type 15 (tncpv-15) and antheraea mylitta cypovirus type 4 (amcpv-4) are each composed of 11 dsrna segments (qanungo, kundu, mullins, & ghosh, 2002; rao, carner, scott, omura, & hagiwara, 2003). each dsrna segment of bmcpv contains a single complete open reading frame (orf). these orfs code for the structural and non-structural proteins and polyhedrin.
genome analysis of the bmcpv h and i strains revealed that segments s1, s2, s3, s4, s6 and s7 encode the viral structural proteins vp1, vp2, vp3, vp4, vp6 and vp7, while segments s5, s8, s9 and s10 encode the nonstructural proteins p101 (nsp5), p44 (nsp8), ns5 (nsp9) and polyhedrin, respectively (cao et al., 2012). an eleventh dsrna segment, termed the small polyhedron gene segment (sp), was found in bombyx mori cypovirus type 1 (bmcpv-1) (arella, lavallee, belloncik, & furuichi, 1988). a twelfth dsrna segment was also identified in bmcpv-1 using novel rna extraction methods (nagae, miyake, kosaki, & azuma, 2005). the functions of these newly found dsrna segments are still undefined. a distinctive characteristic of dsrna viruses, including bmcpvs, is their ability to carry out transcription of their dsrna segments. the 5′ end of bmcpv rna is processed by the addition of a cap and methylation to form the cap structure m7gpppampgp, which protects mrnas from degradation by exonucleases that digest uncapped mrna in the 5′ to 3′ direction (furuichi, 1978). all of the dsrna segments in the bmcpv genome share similar conserved terminal sequences (guuaa……guuagcc) (kuchino, nishimura, smith, & furuichi, 1982), which have been predicted to act as recognition signals for rna transcriptase and replicase, or to be involved in the interaction between ribosome and mrna, or in assembly of the viral structure (la, 2001). these conserved sequences were also found in the genome of dendrolimus punctatus cytoplasmic polyhedrosis virus (zhao et al., 2003a; zhao et al., 2003b). the complete segment 7 (s7) of bmcpv is 1501 bp long and its orf is located at positions 25-1371, encoding 448 amino acid residues. the prokaryotic expression product is 50 kda (p50). however, anti-p50 antibody detects a single 56 kda band in bmcpv-1-infected bmn4 cells, whereas the same antibody detects bands of 34, 36, 38 and 40 kda in the virion (jin, peng, wang, zhang, & zhang, 2014).
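as a quick consistency check on the s7 figures quoted above, the short python sketch below recomputes the orf length, codon count and residue count from the stated coordinates (25-1371) and gives a rough mass estimate. the average residue mass of 110 da is a generic rule of thumb and an assumption here, not a value from the study; the function and constant names are illustrative only, and the sketch assumes that the final codon of the orf is a stop codon.

```python
# Coordinates of the vp7 ORF on segment S7 as quoted in the text (1-based, inclusive).
ORF_START = 25
ORF_END = 1371

# Rule-of-thumb average mass of one amino acid residue in daltons (assumption).
AVG_RESIDUE_MASS_DA = 110.0


def orf_summary(start, end, avg_residue_mass_da=AVG_RESIDUE_MASS_DA):
    """Return (nucleotides, codons, residues, approx_mass_kda) for an ORF.

    Assumes the last codon is a stop codon, so the protein has one
    residue fewer than the number of codons.
    """
    nucleotides = end - start + 1
    if nucleotides % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = nucleotides // 3
    residues = codons - 1
    approx_mass_kda = residues * avg_residue_mass_da / 1000.0
    return nucleotides, codons, residues, approx_mass_kda


nt, codons, residues, mass_kda = orf_summary(ORF_START, ORF_END)
print(f"{nt} nt, {codons} codons, {residues} residues, ~{mass_kda:.1f} kDa")
```

with the quoted coordinates this yields 1347 nt, 449 codons and 448 residues, with an estimated mass of about 49 kda, in line with the 448-residue, 50 kda product described above.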
the m2 gene of reovirus serotype 1 encodes protein μ1, which can be cleaved into capsid protein μ1c; μ1c interacts with σ3, the protein encoded by the s4 gene, to complete assembly of the virion (wiener & joklik, 1988). this suggests that p50 of bmcpv might play a vital role in virion assembly. in a previous study, the bmcpv-sz genome was cloned, and vp7 was found to be enriched in alanine and threonine and to contain a nuclear localization signal (nls) (cao et al., 2012). analysis of the cpv virion by cryo-electron microscopy and computational three-dimensional reconstruction showed that the outer surface of the capsid carries protuberances of two different sizes (large and small). it was suggested that p50 is processed by posttranslational modification to form the protuberance for attachment of the virion to target cells (cheng et al., 2011). domain analysis of vp7 showed that it may also have a role in nucleic acid binding. this study highlights the potential role of small vp7 gene-encoded proteins in the life cycle of bmcpv. three sirnas targeting the vp7 gene of bmcpv were designed and synthesized to reveal their interference effects on bmcpv replication in infected silkworm larvae and cells. the expression patterns and potential cleavage sites of vp7 were also investigated with a specific antibody raised against the prokaryotically expressed protein.
bmn cells of the domestic silkworm were cultured in tc-100 medium at 27°c under standard conditions. silkworms (strain p50) were reared on mulberry leaves at 25 ± 1°c and 70-85% humidity, with a 12:12 light:dark (ld) photoperiod. based on the vp7 gene sequence (accession number: gq150538), three sirnas were designed and termed vp7-sirna1, vp7-sirna2 and vp7-sirna3 (shanghai integrated biotech solutions co., ltd).
the control sirna was designed from the gfp gene sequence (accession number: nc_011521.1) and named sirna-gfp (table 1). 1 × 10^6 bmn cells were transfected with 2 μg of sirna-gfp (control) or of the three bmcpv vp7 gene-specific sirnas vp7-sirna1, vp7-sirna2 and vp7-sirna3 (test) using 2 μl of liposome. 24 h after transfection, bmcpv virions prepared by our lab (zhang et al., 2017) were added to the medium containing the transfected bmn cells. 48 h after infection, the infected cells of the control and test groups were collected for rna extraction. third instar larvae were randomly divided into two groups (control and test), with four replicates per group. 1 μg of sirna was injected into the haemolymph of each silkworm. injected silkworms were then fed with bmcpv (4.3 × 10^9 polyhedra/ml). 48 h later, midguts of the infected silkworms were taken for rna isolation.
total rna was extracted from the bmcpv-infected third instar larvae and bmn cells according to the standard protocol. the quantity and quality of the rna were determined by the 260/280 absorbance ratio and by gel electrophoresis. rna was stored at −80°c for further experiments. 1 μg of total rna was used as template for first-strand cdna synthesis. real-time quantitative pcr was performed using sybr green dye (bio-rad, hong kong, china) with a c1000 thermal cycler system (bio-rad, hercules, ca, usa). the primers for gene expression analysis are listed in table 2. the 2^−ΔΔCt method was used to determine the relative expression level of the detected genes (livak & schmittgen, 2001). experiments were performed in triplicate for each sample.
to prepare a polyclonal antibody against the vp7 protein of bmcpv, the cdna was amplified by pcr with the primers listed in table 1. the amplified pcr products were ligated into the prokaryotic expression vector pet28(a). the recombinant prokaryotic expression vector pet28(a)-vp7 (supplementary fig.
1a) was transformed into escherichia coli strain bl21 cells and induced with 1 mmol/l isopropyl-β-d-thiogalactopyranoside (iptg). after 4 h of induction, 1.5 ml of the transformed bl21 culture was collected for detection of the recombinant protein. the proteins were detected by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (sds-page) and western blotting (supplementary fig. 1b), using rabbit anti-6 × his (sigma) as the primary antibody and horseradish peroxidase (hrp)-conjugated goat anti-rabbit (sigma) as the secondary antibody. recombinant protein (vp7) was purified from 1 l of bl21 culture with ni-nta (ni2+-nitrilotriacetate) resin (invitrogen, carlsbad, ca) using the standard procedure. mice were immunized with the recombinant protein (vp7) to prepare the vp7 polyclonal antibody. booster injections were administered according to the manufacturer's instructions. the collected serum was stored at −70°c.
5′-ugcgcuccuggacguagccuu-3′
table 2. primers for real-time pcr (columns: gene name, primer name, sequence).
the puc-s7 plasmid, preserved in our lab, was used as the template for amplification of the s7 fragment with primers s7-ei/s7-xi. pcr amplicons were digested with restriction enzymes (ecori and xbai) and inserted into the vector pizt/v5-his to produce the recombinant plasmid pizt/v5-his-s7. bmn cells were transfected by adding 2 μg of the recombinant plasmid (pizt/v5-his-s7) and 5 μl of liposome to a 25 cm^2 cell culture flask containing bmn cells. 3 days after transfection, the cells were examined with an inverted fluorescence microscope to screen for transfection. zeocin (400 μg/ml) was added to select positively transfected cells, and selection was continued for one month. cultures containing 70-80% positively transfected cells were collected for extraction of total protein.
midguts from bmcpv-infected silkworms were used to prepare paraffin sections.
the midgut sections were fixed in 4% (v/v) paraformaldehyde at room temperature for 2 h and then rinsed with 0.01 m pbst (10 mm na2hpo4, 2.7 mm kcl, 140 mm nacl, 1.8 mm kh2po4, ph 7.4) containing 0.05% (v/v) tween-20. the fixed sections were blocked with 3% (w/v) bsa at room temperature for 2 h, washed three times (5 min each) in pbst, and then incubated overnight at 4°c with vp7 polyclonal antibody (diluted 1:100 in blocking buffer). after three washes (5 min each) in pbst, the sections were incubated with anti-rabbit fluorescein isothiocyanate (fitc)-labeled secondary antibody (huaan biotechnology) at a dilution of 1:200. fitc displays green fluorescence under blue light. the nuclei were labeled with 4′,6-diamidino-2-phenylindole (dapi), which exhibits blue fluorescence. the sections were observed under a laser scanning confocal microscope (nikco, japan).
total proteins of the silkworm midgut were extracted with ripa buffer (sangon, shanghai). the total proteins were incubated overnight at 4°c with the specific murine polyclonal anti-vp7 antibody. unimmunized murine serum was used as the corresponding control antibody. immune complexes were precipitated by adding protein a + g (sigma), washed three times in lysis buffer and analyzed by sodium dodecyl sulfate 12% polyacrylamide gel electrophoresis (sds-page) followed by coomassie brilliant blue staining. co-ip was replicated three times and all the bands that appeared were excised for mass spectrometry. protein identification was carried out by shanghai applied protein technology co. ltd.
the peptidecutter online tool (http://web.expasy.org/peptide_cutter/) was used to predict the potential sites in a given protein sequence that are cleaved by proteases or chemicals (wilkins et al., 1999).
all data are presented as mean ± standard deviation (sd). statistical differences were evaluated using student's t-test for unpaired samples.
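the relative quantification and statistics described in the methods above (2^−ΔΔCt, mean ± sd, unpaired student's t-test) can be sketched in python as follows. this is a minimal illustration, not the authors' analysis script: the ct values are hypothetical, the reference gene is not named in the text and is an assumption, and the function names are invented for the example. only the t statistic and its degrees of freedom are computed; a p value would then be read from the t distribution.

```python
from math import sqrt
from statistics import mean, stdev, variance


def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Relative expression of a target gene by the 2^-ddCt method."""
    delta_ct_sample = ct_target - ct_reference
    delta_ct_control = ct_target_ctrl - ct_reference_ctrl
    return 2.0 ** -(delta_ct_sample - delta_ct_control)


def student_t(group_a, group_b):
    """Unpaired Student's t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t_stat = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return t_stat, df


# Hypothetical triplicate Ct values as (target gene, reference gene); not data from the paper.
control_ct = [(20.1, 18.0), (20.0, 18.1), (19.9, 17.9)]
sirna_ct = [(24.0, 18.0), (23.8, 18.1), (24.1, 17.9)]

# Calibrator: mean Ct of the target and reference gene in the control group.
ctrl_target = mean(ct for ct, _ in control_ct)
ctrl_reference = mean(ref for _, ref in control_ct)

control_rel = [relative_expression(t, r, ctrl_target, ctrl_reference) for t, r in control_ct]
sirna_rel = [relative_expression(t, r, ctrl_target, ctrl_reference) for t, r in sirna_ct]

print(f"control: {mean(control_rel):.2f} ± {stdev(control_rel):.2f}")
print(f"siRNA:   {mean(sirna_rel):.2f} ± {stdev(sirna_rel):.2f}")
t_stat, df = student_t(control_rel, sirna_rel)
print(f"t = {t_stat:.2f}, df = {df}")
```

a fold decrease such as those reported for vp1 corresponds to the reciprocal of the relative expression value in the treated group.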
the level of statistical significance was set at * p < 0.05, ** p < 0.01 and *** p < 0.001.
after the successful construction of the pizt/v5-his-s7 expression vector (fig. 1a), bmn cells were transfected with the recombinant vector using liposome. zeocin (400 μg/ml) was added to select positively transfected cells. 3 days after transfection, the bmn cells were examined with an inverted fluorescence microscope. the detection of green fluorescence in transfected cells indicated that the expression vector had been transferred into the bmn cells. transfected cells were kept under zeocin selection for one month. the proportion of green fluorescent cells did not increase after 1 month of zeocin treatment (fig. 1b). total protein was extracted from the transfected cells, and vp7 gene expression was confirmed by western blotting. one band of 70 kda was detected in the transfected cells with the 6 × his antibody, whereas three bands of about 70, 34 and 35 kda were detected with the vp7 antibody (fig. 1c). vp7 recombinant protein was purified from the transfected cells, and the purified product was also examined by sds-page and western blotting. only one band of 70 kda was detected with the 6 × his antibody, while an additional band of 35 kda was detected with the vp7 antibody (fig. 2a and b).
with the vp7 antibody, a novel protein was detected from the overexpression of the vp7 gene in the bmcpv infected cultured cells. to determine whether other proteins could be present in bmcpv, total proteins were extracted from bmcpv-infected cultured cells (bmn cells), midguts and the bmcpv virion. a band of 55 kda was detected in the bmcpv-infected bmn cells by western blotting. this band is similar to the predicted molecular weight of vp7 but smaller than the overexpression product in bmn cells.
an additional band of 20 kda was also identified in the bmcpv-infected bmn cells (fig. 3a). proteins extracted from the bmcpv virion were also probed with the vp7 antibody, and four clear bands (55, 38, 25 and 10 kda) were found (fig. 3b). total proteins from bmcpv-infected silkworm midguts (from the first to the twelfth day) were also extracted and probed with the vp7 antibody. a single positive band was detected on the seventh day of infection. on the eighth day, three positive bands (55, 25 and 20 kda) were detected, and from the ninth to the twelfth day, five bands (55, 28, 25, 20 and 18 kda) were detected (fig. 4).
the expression level of the bmcpv vp7 gene after transfection with specific sirna was evaluated by real-time pcr, and virus replication in transfected cells and silkworms was analyzed. transfection with sirna-vp7 reduced the expression level of the vp7 gene in infected cells and silkworms compared with the control (fig. 5a). the transcript level of the vp1 gene was decreased by 15.4 fold, indicating that silencing of the vp7 gene inhibited the multiplication of bmcpv (fig. 5b). similar results were obtained in bmn cells infected with bmcpv: when expression of the vp7 gene was down-regulated, the vp1 gene transcript level was decreased by 13 fold (fig. 5c).
immunoprecipitation experiments using a polyclonal antiserum directed against the vp7 protein detected vp7 and at least four additional shorter products in infected midguts. these shorter bands were excised for mass spectrometry (fig. 6), and a total of 19 proteins were identified. apart from vp7, the other 18 proteins are encoded by the silkworm genome. all proteins are listed in table 3. among the identified proteins, only viral structural protein 7 and voltage-dependent anion-selective channel-like isoform (vdac) were identified with > 2 unique peptides.
bands of similar molecular weight to these 4 products were identified in the bmcpv virion. this suggests that these products may be translated from alternative in-frame aug codons in the vp7 mrna. vdac was considered a potential vp7-interacting protein that may play an important role in the life cycle of bmcpv.
third instar silkworm larvae were infected with bmcpv. 6 days later, the midguts were dissected and analyzed by immunofluorescence. the results showed that the virions were localized in the cytoplasm of the infected cells (fig. 7).
the bmcpv genome has ten dsrna segments (s1-s10), which encode structural and non-structural proteins and polyhedrin. vp7 is encoded by the s7 segment, and its function is still unclear. in this report, we demonstrated that 4, 2 and 5 bands of vp7 gene products, recognized by a polyclonal anti-vp7 antibody, are present in the bmcpv virion, bmcpv-infected cells and bmcpv-infected silkworm midguts, respectively, and that 3 bands are present in transfected cells overexpressing the vp7 gene. these polypeptides may be encoded by a single open reading frame (orf) and produced by initiation at specific aug codons of the vp7 gene. sequence analysis found that the vp7 orf of bmcpv contains about 17 aug codons. all of the reading frames starting at these aug codons span > 100 codons, and only 3 of them are not in frame with the vp7 orf. however, we cannot exclude the possibility that these short polypeptides were initiated from non-aug codons, or that truncated vp7 was generated by proteolytic cleavage. the coding capacity of viruses is enhanced by the evolution of overlapping reading frames. until now, the segmented dsrna genomes of viruses including the reoviridae, coronaviruses and orbivirus families were thought to be monocistronic (i.e.
ten genome segments encoding ten proteins), but the identification of novel orfs that are translated in the same reading frame and begin at an alternative downstream aug has changed this assumption (gong, chen, chen, kuo, & shih, 2014). genome segments s5, s7 and s9 of rice black-streaked dwarf virus are bicistronic mrnas involved in the formation of tubular structures, the formation of viroplasms, and virus replication and assembly. at least 9 novel orfs have been found in coronavirus genomes; they are thought to encode accessory proteins involved in viral pathogenicity (shukla & hilgenfeld, 2015). a small orf in segment 10, overlapping the ns3 orf in the +1 position, was identified in the genome of bluetongue virus (btv) (stewart et al., 2015). in this study, short vp7 products appeared in the gels, and these might play a significant role in the bmcpv life cycle. further experiments on these short vp7 products will be carried out in future studies.
fig. 6. identification of proteins interacting with bmcpv vp7 by co-immunoprecipitation. co-ip was replicated three times and all the differential bands were excised for mass spectrometry.
voltage-dependent anion channels belong to a class of porin ion channels located on the outer mitochondrial membrane (hoogenboom, suda, engel, & fotiadis, 2007). they act as general diffusion pores for small hydrophilic molecules (benz, 1994), help exchange ions and molecules between mitochondria and the cytosol, and are regulated by interactions with other proteins and small molecules (hiller, abramson, mannella, wagner, & zeth, 2010). vdac has been shown to play a significant role in apoptosis. during apoptosis, increased permeability of vdac allows the release of apoptogenic factors (tsujimoto & shimizu, 2002).
a vdac-like protein in rhipicephalus microplus may play an important role in infection by babesia bigemina (rodriguez-hernandez et al., 2012). previous studies suggested that apoptosis is induced by upregulation of vdac, which enhances the permeability of the outer mitochondrial membrane. the likely reason is a shift in the vdac equilibrium towards an oligomeric form of the protein, which forms large pores and thus allows the release of pro-apoptotic proteins (ghosh, pandey, maitra, brahmachari, & pillai, 2007). furthermore, many viruses encode proteins that act on vdac to control apoptosis in infected cells (boya et al., 2003), for example influenza a and hepatitis b viruses (boya et al., 2004). whether vdac function is important during babesia bigemina invasion, propagation or replication remains to be determined.
in the later period of bmcpv infection of silkworm larvae, about 5 bands were found in the midguts. the alternative hypothesis is that the shorter vp7 products are generated by proteolytic cleavage. in our previous study, analysis of vp5, encoded by the s5 segment of bmcpv, showed that it has a foot-and-mouth disease virus (fmdv) 2apro-like protease domain, a short polypeptide of 16 amino acids. in fmdv, this domain cleaves the polyprotein co-translationally at the 2a/2b junction. the cleavage occurs at its carboxyl terminus, at a glycine-proline amino acid pair (ryan, king, & thomas, 1991; mattion, harnish, crowley, & reilly, 1996). whether this domain takes part in the cleavage of vp7 still needs further investigation. in lentiviruses such as human immunodeficiency virus and simian immunodeficiency virus, approximately 5% of gag gene translation produces other proteins with independent functions as a result of ribosomal frameshifting (rue, roos, tarwater, clements, & barber, 2005). two mechanisms for generating more than one protein from a single mrna have been reported.
one possibility is leaky scanning, as in paramyxoviruses (kozak, 1989) and rabies virus. the other possibility is a cap-independent mechanism, as for the picornavirus polyprotein and sendai virus proteins (pelletier & sonenberg, 1988; curran & kolakofsky, 1989). why does vp7 exhibit different protein expression patterns in the virion, infected cells and larvae? the peptidecutter online tool was used to predict potential sites in vp7 cleaved by proteases or chemicals. bnps-skatole, iodosobenzoic acid, ntcb (2-nitro-5-thiocyanobenzoic acid) and caspase1 were predicted to be potential cleavage agents (table 4). in the future, to better understand the potential cleavage mechanisms of vp7, the low-molecular-weight proteins will be purified from the bmcpv virion and identified by mass spectrometry.
in this study, we also carried out experiments to reveal the function of the vp7 gene in the viral life cycle. down-regulation of the vp7 transcript level decreased the transcript level of the vp1 gene in bmcpv-infected silkworm larvae and bmn cells, indicating that the vp7 gene is associated with bmcpv genome replication. the proposed reason why reduced vp7 expression lowers genome replication is that, during cell entry of the progeny virus, the vp7-vp4 outer layer un-coats, releasing the intact double-layered particle (dlp) into the cytosol. vp4 mediates cell binding (fiore, greenberg, & mackow, 1991) and membrane disruption (denisova et al., 1999; kim, trask, babyonyshev, dormitzer, & harrison, 2010), but the final steps of dlp delivery require un-coating of vp7 (cuadras, arias, & lopez, 1997; liprandi et al., 1997). full un-coating of the outer layer is also required to activate the dlp (lawton, estes, & prasad, 2001), which contains multiple copies of the viral polymerase complex that synthesize and extrude mrna transcribed from each of the 10 genome segments.
when the vp7 transcript was reduced, less of the vp7 layer that anchors the vp4 spikes onto the underlying dlp was available, leading to failure of membrane disruption (chen & ramig, 1993; trask & dormitzer, 2006). consequently, less new mrna was transcribed from each segment, which reduced genome replication. supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.gene.2017.06.048. our work was supported by the national natural science foundation of china (31602007 and 31072085), natural science foundation of jiangsu province (bk20140324) and a project funded by the priority academic program of development of jiangsu higher education institutions. molecular cloning and characterization of cytoplasmic polyhedrosis virus polyhedrin and a viable deletion mutant gene permeation of hydrophilic solutes through mitochondrial outer membranes: review on mitochondrial porins mitochondrion-targeted apoptosis regulators of viral origin viral proteins targeting mitochondria: controlling cell death characterization of the complete genome segments from bmcpv-sz, a novel bombyx mori cypovirus 1 isolate rescue of infectivity by sequential in vitro transcapsidation of rotavirus core particles with inner capsid and outer capsid proteins atomic model of a cypovirus built from cryo-em structure provides insight into the mechanism of mrna capping rotaviruses induce an early membrane permeabilization of ma104 cells and do not require a low intracellular ca2 + concentration to initiate their replication cycle scanning independent ribosomal initiation of the sendai virus y proteins in vitro and in vivo rotavirus capsid protein vp5* permeabilizes membranes the vp8 fragment of vp4 is the rhesus rotavirus hemagglutinin pretranscriptional capping" in the biosynthesis of cytoplasmic polyhedrosis virus mrna a role for voltage-dependent anion channel vdac1 in polyglutamine-mediated neuronal cell death computational analysis and mapping of novel open reading frames in
influenza a viruses the 3d structures of vdac represent a native conformation the supramolecular assemblies of voltage-dependent anion channels in the native membrane advances in the study of infection and transcription mechanism of cytoplasmic polyhedrosis viruses effect of mutations in vp5 hydrophobic loops on rotavirus cell entry the scanning model for translation: an update homologous terminal sequences in the double-stranded rna genome segments of cytoplasmic polyhedrosis virus of the silkworm bombyx mori reoviruses and their replication identification and characterization of a transcription pause site in rotavirus productive penetration of rotavirus in cultured cells induces coentry of the translation inhibitor alpha-sarcin analysis of relative gene expression data using realtime quantitative pcr and the 2(−delta delta c(t)) method foot-and-mouth disease virus 2a protease mediates cleavage in attenuated sabin 3 poliovirus vectors engineered for delivery of foreign antigens identification of novel doublestranded rna produced in midgut epithelial tissue of the silkworm, bombyx mori, during infection by a cypovirus 1 cytoplasmic polyhedrosis viruses internal initiation of translation of eukaryotic mrna directed by a sequence derived from poliovirus rna molecular cloning and characterization of antheraea mylitta cytoplasmic polyhedrosis virus genome segment 9 comparison of the amino acid sequences of rna-dependent rna polymerases of cypoviruses in the family reoviridae the identification of a vdac-like protein involved in the interaction of babesia bigemina sexual stages with rhipicephalus microplus midgut cells phosphorylation and proteolytic cleavage of gag proteins in budded simian immunodeficiency virus cleavage of foot-and-mouth disease virus polyprotein is mediated by residues located within a 19 amino acid sequence acquisition of new protein domains by coronaviruses: analysis of overlapping genes coding for proteins n and 9b in sars coronavirus 
characterization of a second open reading frame in genome segment 10 of bluetongue virus assembly of highly infectious rotavirus particles recoated with recombinant outer capsid proteins the voltage-dependent anion channel: an essential player in apoptosis evolution of reovirus genes: a comparison of serotype 1, 2, and 3 m2 genome segments, which encode the major structural capsid protein mu 1c protein identification and analysis tools in the expasy server integrin beta and receptor for activated protein kinase c are involved in the cell entry of bombyx mori cypovirus genomic sequence analyses of segments 1 to 6 of dendrolimus punctatus cytoplasmic polyhedrosis virus molecular characterization of segments 7-10 of dendrolimus punctatus cytoplasmic polyhedrosis virus provides the complete genome the authors declare that they have no conflicts of interest. this article does not contain any studies with human participants performed by any of the authors. key: cord-103709-86hv27vh authors: zhang, dong yan; wang, jian; dokholyan, nikolay v. title: prefusion spike protein stabilization through computational mutagenesis date: 2020-06-19 journal: biorxiv doi: 10.1101/2020.06.17.157081 sha: doc_id: 103709 cord_uid: 86hv27vh a novel severe acute respiratory syndrome (sars)-like coronavirus (sars-cov-2) has emerged as a human pathogen, causing global pandemic and resulting in over 400,000 deaths worldwide. the surface spike protein of sars-cov-2 mediates the process of coronavirus entry into human cells by binding angiotensin-converting enzyme 2 (ace2). due to the critical role in viral-host interaction and the exposure of spike protein, it has been a focus of most vaccines’ developments. however, the structural and biochemical studies of the spike protein are challenging because it is thermodynamically metastable1. here, we develop a new pipeline that automatically identifies mutants that thermodynamically stabilize the spike protein. 
our pipeline integrates bioinformatics analysis of conserved residues, motion dynamics from molecular dynamics simulations, and other structural analysis to identify residues that significantly contribute to the thermodynamic stability of the spike protein. we then utilize our previously developed protein design tool, eris, to predict thermodynamically stabilizing mutations in proteins. we validate the ability of our pipeline to identify protein stabilization mutants through known prefusion spike protein mutants. we finally utilize the pipeline to identify new prefusion spike protein stabilization mutants. the ongoing outbreak of the novel coronavirus [2] [3] [4] , which causes fever, severe respiratory illness, and pneumonia, poses major public health and governance challenges. the emerging pathogen has been characterized as a new member of the betacoronavirus genus (sars-cov-2) 5,6 , closely related to several bat coronaviruses and to severe acute respiratory syndrome coronavirus (sars-cov) 7, 8 . compared with sars-cov, sars-cov-2 appears to be more readily transmitted from human to human, spreading to multiple continents and leading to the world health organization (who)'s declaration of a pandemic on march 11, 2020 9 . according to the who, as of june 9, 2020, there had been >7,000,000 confirmed cases globally, leading to >400,000 deaths. in the initial step of the infection, the coronavirus enters the host cell by binding to a cellular receptor and fusing the viral membrane with the target cell membrane 10, 11 . for sars-cov-2, this process is mediated by the spike glycoprotein, which is a homo-trimer 12 . in the process of viruses fusing to host cells, the spike protein undergoes structural rearrangement and transits from a metastable prefusion conformational state to a highly stable post-fusion conformational state 13, 14 . the spike protein comprises two functional units, the s1 and s2 subunits; when fused to the host cell, the two subunits are cleaved.
the s1 subunit is responsible for binding to the angiotensin-converting enzyme 2 (ace2) [15] [16] [17] receptor on the host cell membrane and it contains the n-terminal domain (ntd), the receptor-binding domain (rbd) and the c-terminal domain (ctd). ntd in the s1 subunit assists in recognizing sugar receptors. rbd in the s1 subunit is critical for the binding of coronavirus to the ace2 receptor [18] [19] [20] [21] . ctd in the s1 subunit could recognize other receptors 22 . the binding of rbd to ace2 facilitates the cleavage of the spike protein and promotes the dissociation of the s1 subunit from the s2 subunit 23 . s2 contains two heptad repeats (hr1 and hr2), a fusion peptide, and a protease cleavage site (s2'). the dissociation of s1 induces s2 to undergo a dramatic structural change to fuse the host and viral membranes. thus, the spike protein serves as a target for development of antibodies, entry inhibitors and vaccines 24 . coronavirus transits from a metastable prefusion state to a highly stable post-fusion state as part of the spike protein's role in membrane fusion. the instability of the prefusion state presents a significant challenge for the production of protein antigens for antigenic presentation of the prefusion antibody epitopes that are most likely to lead to neutralizing responses. thus, since the prefusion spike protein exists in a thermodynamically metastable state 1 , a stabilized mutant conformation is critical for the development of vaccines and drugs. computational mutagenesis is an effective approach to finding mutations that are able to stabilize proteins. we have previously developed a protein design platform, eris 25, 26 , which utilizes a physical force field 27 for modeling inter-atomic interactions, as well as fast side-chain packing and backbone relaxation algorithms to enable efficient and transferable protein molecular design.
eris was originally validated on 595 mutants from five proteins, corroborating the unbiased force field, side-chain packing and backbone relaxation algorithms. in many later studies, eris has been validated through prediction of thermodynamically stabilizing or destabilizing mutations [28] [29] [30] [31] [32] [33] , and direct protein design efforts [34] [35] [36] [37] . in this work, we propose a pipeline to automatically stabilize spike proteins through computational mutagenesis. within the pipeline, we first analyze the conservation score and solvent accessible surface area (sasa) of residues in the protein. we then perform discrete molecular dynamics (dmd) [38] [39] [40] [41] simulations to calculate the root mean square fluctuation (rmsf) of residues to analyze their flexibility. based on this information, we select appropriate residues (2 < conservation score < 5; sasa > 0.4; rmsf > 3.5 å; see discussion) as mutation sites. we subject the selected residues to computational redesign using eris to find the stabilizing mutations by calculating the change in free energy $\Delta\Delta G = \Delta G_{\mathrm{mut}} - \Delta G_{\mathrm{wt}}$, where $\Delta G_{\mathrm{mut}}$ and $\Delta G_{\mathrm{wt}}$ are the free energies of the mutant and wild-type proteins, respectively. we utilize this pipeline to identify stabilization mutants of the spike protein. next, we describe our methods in detail and provide a list of stabilizing mutations for spike protein. we propose a pipeline to automatically find stabilized mutants of proteins ( figure 1 ). the pipeline can be divided into two stages. in the first stage, users designate the protein of interest, and then the pipeline will analyze the 3d structure of the protein by using different metrics. in the second stage, users designate the mutation sites according to the analysis of the 3d structure of the protein, and finally the pipeline will determine the stabilizing mutations of the protein.
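the selection criteria and the free energy difference described above can be sketched in a few lines of python. this is only an illustrative sketch under stated assumptions, not the authors' implementation: the `Residue` class, the function names, and the three example sites with their values are hypothetical, and the thresholds are the ones quoted in the text (2 < conservation score < 5; sasa > 0.4; rmsf > 3.5 å).

```python
from dataclasses import dataclass


@dataclass
class Residue:
    """Per-residue metrics (hypothetical container, not part of the pipeline)."""
    name: str
    conservation: float  # conservation grade, 1 (variable) to 9 (conserved)
    sasa: float          # relative solvent accessible surface area, 0 to 1
    rmsf: float          # root mean square fluctuation, in angstrom


def is_candidate(res, cons_range=(2, 5), sasa_min=0.4, rmsf_min=3.5):
    """Apply the three thresholds quoted in the text to one residue."""
    low, high = cons_range
    return (low < res.conservation < high
            and res.sasa > sasa_min
            and res.rmsf > rmsf_min)


def ddg(dg_mutant, dg_wild_type):
    """Free energy change of a mutation; negative values mean stabilizing."""
    return dg_mutant - dg_wild_type


# Made-up example values, for illustration only (not data from the paper).
residues = [
    Residue("site_a", conservation=3, sasa=0.60, rmsf=4.1),
    Residue("site_b", conservation=9, sasa=0.05, rmsf=1.2),
    Residue("site_c", conservation=4, sasa=0.30, rmsf=4.0),
]
candidates = [r.name for r in residues if is_candidate(r)]
print(candidates)
```

only the residues that pass all three filters would then be passed on for redesign; a mutation is counted as stabilizing when `ddg` is negative.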
users can either upload the 3d structure of the protein or input the pdb id of the protein to designate the protein of interest. in the first stage, the first step is to remodel the 3d structure of the protein of interest to complete the missing atoms and residues. we integrate modeller into our pipeline to remodel proteins. next, the pipeline utilizes consurf 48 to calculate the conservation score of each residue in the protein of interest. the conservation score indicates the importance of the residue in maintaining protein structure and/or function. subsequently, the pipeline utilizes dmd to analyze the flexibility of each residue in the spike protein through rmsf. the technique has already been used to efficiently study the protein folding thermodynamics and protein oligomerization and allows for a good equilibration of the structures. then, the pipeline will calculate sasa of residues in the protein. in the second stage, users designate the mutation sites according to the conservation score, rmsf, and sasa. a high conservation score (≥ 7) indicates the residue may play important roles in the function or the stability of the structure of the protein; residues of high rmsf (> 3.5 å) are likely the culprit to undermine the stability of the structure of the protein, hence we select residues that have a low conservation score or high rmsf; residues with sasa < 0.5 are considered buried and residues with sasa ≥ 0.5 are considered exposed to solvent. after the designation of the mutation sites, the pipeline utilizes eris to determine the changes in free energies of the mutants. for each residue in the mutation sites, we utilize modeller 42 to re-model the 3d structure of the spike protein by using a structure deposited to protein databank (pdb), pdbid:6vsb 12 , the cryo-em structure of the prefusion state of the spike protein, as the template structure to complete the missing atoms and residues ( figure 2a&b ). 
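for reference, the rmsf used above as a flexibility measure is the square root of the time-averaged squared deviation of a residue's position from its mean position. the following is a minimal sketch, assuming the trajectory frames have already been superimposed on a common reference and that each residue is represented by a single coordinate (for example its cα atom). the function name and the toy two-frame trajectory are hypothetical and are not taken from the dmd software.

```python
import math


def rmsf(frames):
    """Per-residue RMSF from aligned frames.

    frames: list of frames; each frame is a list of (x, y, z) tuples,
    one per residue, in angstrom. Returns one RMSF value per residue.
    """
    n_frames = len(frames)
    n_residues = len(frames[0])
    result = []
    for i in range(n_residues):
        # Mean position of residue i over the trajectory.
        mean_pos = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that mean position.
        msd = sum(
            sum((f[i][k] - mean_pos[k]) ** 2 for k in range(3))
            for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Toy trajectory (made up): residue 0 is static, residue 1 moves by +/- 1 angstrom.
toy_frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]
values = rmsf(toy_frames)
print(values)
```

residues whose value exceeds the 3.5 å threshold mentioned in the text would be flagged as flexible.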
next, we use chiron 44 all following computational mutagenesis study are performed using the structure with the 3 rbds in down conformation. to validate the ability of the pipeline to identify stabilization mutants, we use the pipeline to calculate the free energy changes of several known prefusion spike protein stabilization mutants. the 2p mutation strategy (k986p and v987p) has been proved effective for the stabilization of spike protein of sars-cov-2 and other betacoronavirus 12,20,50 . hsieh and coworkers 51 the spike protein is mainly composed of s1 and s2 subunits ( figure s1 ). we select residues for mutation from the ntd and rbd domains in s1, and we also select residues from the hr1 (heptad repeat 1) and ch (central helix) domains in s2. we don't select residues from hr2 domain in s2 because the structure of the hr2 domain has not been solved in the cryo-em structure 6vsb. at the outset, we calculate the conservation score ( figure 3a&d ) of all residues by using consurf. based on the conservation score, most residues in hr1/ch are conservative, while residues in ntd and rbd are prone to mutation in evolution. next, we use pymol 47 to calculate sasa of all residues in the spike protein ( figure 3b&e) . sasa indicates the level of residues exposed to the solvent in a protein and usually most of the functional residues are located on the protein structure's surface 52 . all four domains have both low sasa residues and high sasa residues. then, we perform 1,000,000 steps dmd simulation for the spike protein to calculate the rmsf of each residue ( figure 3c&f ). the residues in hr1/ch have extremely low rmsf, while the residues in ntd and rbd domains have moderate to high rmsf. we select residues that have different conservation scores in these four domains for mutagenesis. in the ntd, we select 5 residues (table s1 ) with the conservation score ranging from 1 (highly variable) to 9 (highly conservative). 
the sasa of these residues range from 0.01 (buried) to 0.75 (exposed). the rmsf range from 1.61 å (frozen) to 4.88 å (flexible). likewise, in the other three domains (table s23), we select 5 to 7 residues, respectively. these residues also have diverse conservation scores, sasa, and rmsf. of note, to avoid affecting the function of the spike protein, these residues are all not chosen from the functional sites of the spike protein, such as the ace2 binding site in rbd. we utilize eris to calculate the free energy changes of mutants relative to the wild type ( figure 4a , figure s2&3 , and table s4-6). in the ntd, 5 residues, e169, k113, i203, r246, and l270 are selected for mutagenesis. among them, the free energy changes of nearly all mutations of residues i203, r246, and l270 are positive, indicating that they are destabilizing the structure. in contrast, most mutations on residues e169 and k113 have negative free energy changes, suggesting that they are stabilizing the structure. the mutant that has the most negative free energy change is r246c. however, cysteine is prone to forming disulfide bond with other cysteine, which may affect the correct folding of the protein structure, so we recommend k113m to be a better choice as the stabilization mutant. in the rbd, 5 residues (a411, t415, y505, n439, and d428) (table s2) are selected for mutagenesis. the free energy changes calculated by eris ( figure s2 , in the hr1/ch domain, 7 residues (l948, i1018, a1026, y1007, s1003, t961, and v976) are selected for mutagenesis as shown in table s3 . in stark contrast to ntd and rbd, most residues have extremely high free energy changes (> 30 kcal/mol), suggesting that these residues are not very good choices for mutagenesis. the high free energy changes also implicate that they may play important roles in stabilizing the structure so that they are irreplaceable to some extent. this finding is also in concert with the high conservation scores of these residues. 
that said, we can still find stabilization mutants for these residues, such as t961a and s1003m ( figure s3 ). compared to experimental mutagenesis, such as random mutagenesis 53, 54 and site-directed mutagenesis 55, 56 , computational mutagenesis 25 is an efficient alternative that lays the foundation of large-scale mutation screening. however, performing computational mutation screening for all residues in the spike protein trimeric structure, which consists of 1288 residues in each monomer, is still time-consuming and inefficient, so we seek to select critical residues to perform mutagenesis. in this work, to interrogate how to select residues as mutation sites, we select residues that have different conservation scores, sasa, and rmsf. we find that the probability of stabilizing mutations for specific residues is correlated with the conservation score, sasa, and rmsf of these residues ( figure 4b -d). we calculate the average and the minimum free energy change of all 19 mutations of each residue. we find that the average free energy change of mutations of residues with either high (>5) or low (<2) conservation score is typically larger than 0. only mutations in residues that have moderate (2~5) conservation score have negative average free energy change, thus indicating a possibility to find stabilizing mutations. we posit that conserved residues typically play critical roles in structural stability or protein functioning 57 , making them a likely target for finding stabilizing mutations. however, if the conservation score of the residue is too high, the residue may be irreplaceable and any mutation will destabilize the structure. on the other hand, residues with low conservation scores may be less critical to the structural stability, reducing the chances for finding stabilizing mutations at these positions. thus, we select residues that have moderate conservation score (2~5) for mutagenesis.
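the per-residue summary used in this analysis (the average and the minimum free energy change over all substitutions at a site) can be sketched as follows. this is a hedged illustration, not the authors' code: the function name, the dictionary layout, and the toy values are hypothetical, and only three substitutions per site are shown instead of the 19 considered in the text.

```python
from statistics import mean


def summarize_ddg(ddg_by_site):
    """Summarize free energy changes per site.

    ddg_by_site maps a site label to a dict {substituted amino acid: ddG}.
    Returns, for each site, the mean ddG, the minimum ddG, and the
    substitution that achieves the minimum.
    """
    summary = {}
    for site, ddg_by_mutation in ddg_by_site.items():
        best = min(ddg_by_mutation, key=ddg_by_mutation.get)
        summary[site] = {
            "mean": mean(ddg_by_mutation.values()),
            "min": ddg_by_mutation[best],
            "best_mutation": best,
        }
    return summary


# Made-up values in kcal/mol, for illustration only (not data from the paper).
toy = {
    "site_1": {"A": -1.0, "M": -3.0, "P": 4.0},
    "site_2": {"A": 2.0, "M": 1.0, "P": 6.0},
}
summary = summarize_ddg(toy)
# Sites with at least one stabilizing (negative ddG) substitution.
stabilizable = [site for site, s in summary.items() if s["min"] < 0]
print(summary, stabilizable)
```

a negative minimum indicates that at least one stabilizing substitution exists at that site, while the mean indicates how tolerant the site is to mutation in general.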
similarly, we find that residues with large sasa or large rmsf have more stabilization mutants than residues with low sasa or low rmsf, respectively. overall, we select residues that have moderate conservation score (2~5), high sasa (>0.4), and high rmsf (>3.5 å) for mutagenesis. in addition, although we only select residues in ntd, rbd, and hr1/ch domains to perform mutagenesis, residues in other regions can also be used as mutation sites. for example, the known 2p mutation strategy (k986p and v987p) has been proved effective for the stabilization of spike protein of sars-cov-2 and other betacoronavirus 12, 20, 50 . in this work, we propose a pipeline to automatically stabilize proteins through computational mutagenesis. we analyze the conservation score, rmsf, and sasa of residues in the spike protein through the pipeline. we propose criteria based on the conservation score, rmsf, and sasa to identify residues for mutation. finally, we utilize eris to calculate the free energy change and find stabilizing mutants. all source codes are deposited in: https://bitbucket.org/dokhlab/proteinstabilization. the 3d structure of the spike protein colored sasa. the red/blue colors indicate exposed/buried residues. (f) the 3d structure of the spike protein colored by rmsf. red means flexible and blue means frozen. 
characterization of spike glycoprotein of sars-cov-2 on virus entry and its immune cross-reactivity with sars-cov a novel coronavirus from patients with pneumonia in china features, evaluation and treatment coronavirus (covid-19) coronavirus (covid-19) outbreak: what the department of endoscopy should know the proximal origin of sars-cov-2 a novel coronavirus genome identified in a cluster of pneumonia cases-wuhan severe acute respiratory syndrome coronavirus-like virus in chinese horseshoe bats severe acute respiratory syndrome coronavirus 2 (sars-cov-2) and corona virus disease-2019 (covid-19): the epidemic and the challenges world health organization declares global emergency: a review of the 2019 novel coronavirus (covid-19) fusion mechanism of 2019-ncov and fusion inhibitors targeting hr1 domain in spike protein sars-cov-2 infects t lymphocytes through its spike protein-mediated membrane fusion cryo-em structure of the 2019-ncov spike in the prefusion conformation. science (80-. ) structure, function, and evolution of coronavirus spike proteins the coronavirus spike protein is a class i virus fusion protein: structural and functional characterization of the fusion core complex a novel angiotensin-converting enzyme-related carboxypeptidase (ace2) converts angiotensin i to angiotensin 1-9 sars-cov-2 cell entry depends on ace2 and tmprss2 and is blocked by a clinically proven protease inhibitor structural basis for the recognition of sars-cov-2 by fulllength human ace2. science (80-. 
) cryo-em structures of mers-cov and sars-cov spike glycoproteins reveal the dynamic receptor binding domains cryo-electron microscopy structures of the sars-cov spike glycoprotein reveal a prerequisite conformational state for receptor binding immunogenicity and structures of a rationally designed prefusion mers-cov spike antigen unexpected receptor functional mimicry elucidates activation of coronavirus fusion structure of mouse coronavirus spike protein complexed with receptor reveals mechanism for viral entry tectonic conformational changes of a coronavirus spike glycoprotein promote membrane fusion covid-19, an emerging coronavirus infection: advances and prospects in designing and developing vaccines, immunotherapeutics, and therapeutics eris: an automated estimator of protein stability modeling backbone flexibility improves protein stability estimation emergence of protein fold families through rational design g protein mono-ubiquitination by the rsp5 ubiquitin ligase tyrosine phosphorylation switching of a g protein stabilization of μ-opioid receptor facilitates its cellular translocation and signaling large sod1 aggregates, unlike trimeric sod1, do not impact cell viability in a model of amyotrophic lateral sclerosis a phosphomimetic mutation stabilizes sod1 and rescues cell viability in the context of an als-associated mutation nonnative sod1 trimer is toxic to motor neurons in a model of amyotrophic lateral sclerosis computational design of chemogenetic and optogenetic split proteins engineering extrinsic disorder to control protein activity in living cells rational design of a ligand-controlled protein conformational switch rationally designed carbohydrate-occluded epitopes elicit hiv-1 env-specific antibodies discrete molecular dynamics studies of the folding of a protein-like model discrete molecular dynamics applications of discrete molecular dynamics in biology and medicine ab initio folding of proteins with all-atom discrete molecular dynamics 
comparative protein modelling by satisfaction of spatial restraints statistical potential for assessment and prediction of protein structures computational modeling of small molecule ligand binding interactions and affinities solving protein structures using short-distance cross-linking constraints as a guide for discrete molecular dynamics simulations ab initio rna folding by discrete molecular dynamics: from structure prediction to folding mechanisms convergent solutions to binding at a protein-protein interface consurf: identification of functional regions in proteins by surface-mapping of phylogenetic information mdanalysis: a toolkit for the analysis of molecular dynamics simulations stabilized coronavirus spikes are resistant to conformational changes induced by receptor recognition or proteolysis structure-based design of prefusion-stabilized sars-cov-2 spikes improving prediction of secondary structure, local backbone angles and solvent accessible surface area of proteins by iterative deep learning sequence saturation mutagenesis (sesam): a novel method for directed evolution evolving strategies for enzyme engineering directed mutagenesis site-directed mutagenesis: effect of an extracistronic mutation on the in vitro propagation of bacteriophage qbeta rna understanding hierarchical protein evolution from first principles we acknowledge support from the national institutes for health 1r35 gm134864,the huck institutes of the life sciences, and the passan foundation. the project described was also supported by the national center for advancing translational sciences, national institutes of health, through grant ul1 tr002014. the content is solely the responsibility of the authors and does not necessarily represent the official views of the nih. the author declares no potential conflict of interest. 
key: cord-002967-yy3bennu authors: penna, fabio; ballarò, riccardo; beltrá, marc; de lucia, serena; costelli, paola title: modulating metabolism to improve cancer-induced muscle wasting date: 2018-01-29 journal: oxid med cell longev doi: 10.1155/2018/7153610 sha: doc_id: 2967 cord_uid: yy3bennu muscle wasting is one of the main features of cancer cachexia, a multifactorial syndrome frequently occurring in oncologic patients. the onset of cachexia is associated with reduced tolerance and response to antineoplastic treatments, eventually leading to clinical conditions that are not compatible with survival. among the mechanisms underlying cachexia, protein and energy dysmetabolism play a major role. in this regard, several potential treatments have been proposed, mainly on the basis of promising results obtained in preclinical models. however, at present, no treatment yet reached validation to be used in the clinical practice, although several drugs are currently tested in clinical trials for their ability to improve muscle metabolism in cancer patients. along this line, the results obtained in both experimental and clinical studies clearly show that cachexia can be effectively approached by a multidirectional strategy targeting nutrition, inflammation, catabolism, and inactivity at the same time. in the present study, approaches aimed to modulate muscle metabolism in cachexia will be reviewed. cancer-induced muscle wasting is one of the hallmarks of cachexia, a multifactorial syndrome that represents one of the most important comorbidities in oncologic patients. the occurrence of cachexia markedly complicates the management of cancer patients, negatively impinging on the tolerance and response to antineoplastic treatments, worsening the quality of life, and reducing survival. in particular, about 25% of deaths by cancer are due to cachexia, rather than to the tumor itself [1] . 
a few years ago, a classification of cachexia was proposed, defining three different stages: precachexia, cachexia, and refractory cachexia [2] . prognosis progressively worsens going from patients with precachexia to those with refractory cachexia. in this regard, the earlier anticachexia treatments are set up, the better. for this reason, the research on cachexia is focused on two main goals: (i) to find biomarkers useful for the early identification of a condition of still latent cachexia and (ii) to define treatment protocols useful to delay the progression from precachexia to refractory cachexia. skeletal muscle wasting in cancer patients has good prognostic value, being a predictor of reduced tolerance to chemotherapy and/or surgery, decreased ability to perform daily activities, and shortened survival. in addition, recent data report that loss of muscle mass negatively affects quality of life in cancer patients [3, 4] ; such correlation might occur irrespective of survival rates [5] . since poor quality of life is one of the most prominent and disabling consequences of cancer cachexia, investigating the mechanisms underlying cancer-induced muscle wasting is even more relevant for designing therapeutic strategies that also take patient well-being into account. muscle mass depletion may be underestimated, since the first approach to clinically evaluate a patient is to obtain information about body weight and body mass index (bmi). however, even in the absence of body weight loss and/or with a normal bmi, reduced muscle mass might well occur, being masked by fat or water content. another relevant point that is frequently overlooked is that the loss of muscle mass is likely exacerbated by anticancer treatments. at present, different mechanisms have been proposed to lead to muscle wasting in cancer hosts, among which there are altered protein and energy metabolism and impaired myogenesis [1] .
several factors may contribute to these alterations, such as reduced calorie intake, hormonal imbalance, and systemic inflammation. cancer-driven production of proinflammatory cytokines plays a relevant role in tumor progression and markedly contributes to cachexia. indeed, in cancer patients, systemic inflammation correlates with increased resting energy expenditure and with reduced survival rate [6] . along the same line, increased circulating levels of tumor necrosis factor α (tnfα), interleukin-6 (il-6), γ-interferon (ifn), and, more recently, growth and differentiation factor 15 (gdf15) have been reported in cachectic cancer patients [1, 7] . the link existing between cytokines and cachexia has led to the inclusion of anti-inflammatory drugs in treatment protocols [8] . the present review will focus on strategies able to modulate metabolism that may prove useful to prevent/delay cancer-induced muscle wasting. altered protein turnover is a general feature of muscle wasting in cancer cachexia. in particular, protein breakdown rates are persistently increased, while protein synthesis rates can be reduced, unchanged, or even increased, depending on the model system [1] . the different reaction kinetics that characterize protein synthesis and degradation rates, zero and first order, respectively, imply that if degradation is higher than normal, the loss of total protein cannot be antagonized by simply increasing the rate of synthesis. if this assumption holds, any anabolic approach should be associated with anticatabolic strategies in order to achieve a beneficial effect on muscle protein mass. 2.1. protein degradation. several pieces of evidence suggest that the intracellular proteolytic systems in the skeletal muscle of cancer hosts are poised towards activation above the physiological levels ( figure 1 ). particularly relevant in this regard are the pathways dependent on ubiquitin-proteasome and autophagy.
while the former mainly degrades short-lived and regulatory proteins, the latter gets rid of structural proteins and organelles [9]. both experimental and human cancer cachexia were associated with increased activity of the ubiquitin-proteasome pathway [1]. of interest, alterations in molecular and biochemical markers of proteasome activation were observed in gastric cancer patients before any evidence of body weight loss, supporting the need to detect cachexia as early as possible [10]. modulations of the ubiquitin-proteasome proteolytic system, however, are not a general finding in cancer cachexia, as shown by studies reporting that its activation does not differ from controls in the muscle of patients affected by non-small-cell lung cancer (nsclc; [11]) or esophageal cancer [12]. the involvement of autophagic-lysosomal proteolysis in muscle wasting has been recognized only in the last fifteen years. two main reasons account for this delay: (i) autophagy was not considered a typical response of the skeletal muscle to stress conditions, a belief that was definitively abandoned when autophagy was clearly demonstrated in fasted mice overexpressing green fluorescent protein-labeled lc3 [13]; (ii) the study and detection of autophagy were not easy until the atg genes were discovered [14]. a number of studies reported that the autophagic system was overactivated, without reaching complete cargo degradation, in the muscle of both tumor-bearing animals and cancer patients [6, 15-17]. in particular, although autophagic flux was enhanced in mice bearing the c26 tumor, autophagosomes accumulated, likely due to exhaustion of the lysosomal compartment [15]. a similar pattern could also be observed in cancer patients, as suggested by lc3b-ii and p62 accumulation [16, 17]. neither proteasomes nor lysosomes, however, can directly degrade intact myofilaments.
in this regard, a preliminary cleavage was proposed to be carried out by other proteolytic systems, such as those dependent on caspases or calpains. the latter are ca2+-dependent cysteine proteases, normally inactive and localized in the cytosolic compartment. when intracellular ca2+ concentrations increase, inactive calpains translocate to the cell membrane and become activated by autoproteolysis [18]. the system also includes calpastatin, a physiological inhibitor, which is itself a substrate of active calpain. increased calpain expression was reported in the muscle of tumor-bearing animals [19], while rats transplanted with the yoshida ah-130 hepatoma showed a progressive reduction of calpastatin levels and increased in vitro cleavage of specific fluorogenic substrates [20]. more recently, calpain activation was also demonstrated in mice bearing the c26 colon carcinoma [21]. both increased and unchanged muscle calpain expression were reported in cancer patients [12, 22]. several lines of evidence show that proinflammatory cytokines act as triggers of, or at least as contributors to, cancer-induced protein hypercatabolism [23]. briefly, data obtained in both experimental models and human pathology have demonstrated that cytokines such as tnfα and il-6 lead to reduced rates of protein synthesis paralleled by enhanced protein breakdown, both accounting for the loss of muscle mass [24]. such effects depend, at least in part, on activation of the transcription factor nf-κb, as shown in both experimental cancer cachexia and cachectic cancer patients [25, 26]. cancer-induced muscle wasting has also been associated with another proinflammatory cytokine, namely, tnf-like weak inducer of apoptosis (tweak) [27]. the therapeutic approaches mainly pursued by researchers to counteract enhanced muscle protein breakdown have long been those specifically targeting the different proteolytic systems. the results, however, did not really clarify the issue.
since the discovery of muscle-specific ubiquitin ligases, these were considered a good target to interfere with protein breakdown, being involved in determining both substrate specificity and proteasome degradation rate. among the enzymes belonging to this family, the first described were mafbx/atrogin-1 and murf1/trim63, respectively, involved in the degradation of structural proteins and of proteins contributing to cell proliferation, differentiation, and survival [1]. subsequently, other members were identified, such as trim32 and fbxo40. more recently, fbxo30/musa1 was shown to contribute to denervation- and fasting-mediated muscle loss [28], as well as to cancer-induced muscle wasting (unpublished data). genetic approaches specifically targeting these ubiquitin ligases proved effective in protecting the muscle against protein depletion [29]; however, the use of these enzymes as therapeutic targets for muscle wasting has not yet been validated. on the other hand, the inhibition of proteasome activity by means of pharmacological inhibitors was effective in only a few models of muscle atrophy and was totally unable to protect tumor-bearing animals against muscle wasting [30]. in contrast with these findings, a few years ago, a study reported that inhibition of the ubiquitin-proteasome pathway by mg132 was able to improve experimental cancer cachexia [31]. however, mg132 is a rather unspecific inhibitor, being able to block calpains and autophagy as well [31, 32]. finally, carfilzomib, an irreversible selective inhibitor of proteasome chymotrypsin-like activity, was shown to improve cachexia in tumor-bearing mice by inhibiting muscle protein breakdown [31]. such improvement, however, was associated with reduced tumor burden, which could be the real mechanism underlying the beneficial effect of the treatment. several lines of evidence suggested that modulations of autophagy could be useful to improve cancer-associated muscle wasting.
in this regard, muscle-specific gene strategies aimed at silencing beclin-1, one of the proteins involved in autophagosome formation, showed that suppression of autophagy in the c26 hosts was unable to rescue myofiber diameter (unpublished data). in addition, pharmacological inhibition of autophagy in mice hosting the c26 tumor led to death of the animals, suggesting that lysosomal degradation is mandatory to sustain the requirement of both energy and substrates in tumor hosts, at least when they are facing the terminal phase of cancer growth [15]. conversely, excessive stimulation of muscle autophagy, experimentally obtained by the overexpression of tp53inp2/dor, exacerbated muscle atrophy in tumor-bearing mice (unpublished data), while activation of autophagy by means of the mtor inhibitor rapamycin was shown to positively affect the skeletal muscle in the c26 hosts [16]. such discrepancy might depend on the fact that while mtor inhibition affects stress-induced autophagy, tp53inp2 hyperexpression targets basal autophagy. although the literature reports data supporting the involvement of ca2+-dependent proteolysis in the pathogenesis of cancer-induced muscle wasting, protein hypercatabolism was not downregulated in preparations of isolated muscles obtained from tumor-bearing animals and incubated in the presence of calpain inhibitors [19, 33, 34]. more recently, neither pharmacological nor genetic approaches aimed at inhibiting the ca2+-dependent proteolytic system were able to prevent or delay cancer-induced muscle wasting [21], although contrasting results were reported in this regard [35]. while interfering with specific proteolytic systems does not seem to be an appropriate approach to prevent or delay cancer-induced muscle wasting, the modulation of bulk protein turnover appears more promising. in this regard, anti-inflammatory approaches proved able to improve muscle protein turnover in tumor-bearing mice [20].
more recently, administration of formoterol, a β2-adrenergic agonist, to tumor-bearing animals proved able to reverse muscle wasting [36]. such an effect is mainly exerted by stimulating protein synthesis and inhibiting protein degradation. in particular, both the ubiquitin-proteasome and the autophagic-lysosomal proteolytic systems were downregulated in formoterol-receiving animals ([24]; unpublished data). at present, only one study has evaluated the effectiveness of formoterol, combined with megestrol acetate, in cachectic cancer patients [37]. the results suggest that both muscle size and strength can be improved by the treatment, although more trials are needed to draw firm conclusions. 2.2. protein synthesis. as reported above, depending on the situation, reduced, normal, or even increased muscle protein synthesis rates were shown in cancer cachexia. due to the rapid development of cachexia, tumor-bearing animals frequently show reduced protein synthesis rates, although this is not a general finding. indeed, rats bearing the ah-130 hepatoma, which usually die about 10 days after tumor transplantation, showed muscle protein synthesis rates comparable to those of healthy animals [38]. the situation is more complex in human pathology. reduced muscle protein synthesis was reported many years ago in patients affected by different types of tumors [39] and more recently in prostate cancer patients [40]. in contrast, van dijk et al. [41] reported that baseline protein synthesis rates were higher than control values in cachectic patients affected by pancreatic cancer. intermediate results are also available in the literature: myofibrillar protein synthesis rates were analyzed in healthy people and in weight-stable and weight-losing gastrointestinal cancer patients and found to be comparable among the different groups. similarly, no changes in whole body protein synthesis were reported in nsclc patients [42].
the possibility of modulating protein synthesis in order to correct muscle atrophy, or simply to provide an environment permissive for the maintenance of muscle mass, has long been studied. many approaches were tested, most of them consisting of nutritional strategies or molecular modulations aimed at pushing muscle metabolism towards anabolism. most of these approaches proved unsuccessful, giving rise to the idea that anabolic resistance occurs in cancer cachexia. to provide just a few examples, conventional nutritional supplementation or infusion of an amino acid cocktail did not stimulate muscle protein synthesis in advanced cancer patients [42]. along this line, studies aimed at stimulating the igf-1-dependent anabolic pathway, by both pharmacological and genetic means, did not succeed in improving muscle wasting in tumor-bearing animals [43, 44]. recently, however, patients not yet considered refractory cachectic were proposed to display an anabolic window that could be exploited with nutritional interventions [42, 45, 46] or with other anabolism-inducing strategies. as an example, patients with stage iii and iv nsclc showed a normal anabolic response to hyperaminoacidemia but not to isoaminoacidemia, suggesting that high substrate availability is relevant to induce anabolism in cancer hosts [47]. consistently, muscle protein synthesis could be stimulated in advanced cancer patients by a high protein formula as compared with a conventional nutritional supplement [48, 49]. these observations point to the possibility of overcoming the anabolic resistance that occurs in cancer patients by providing specifically enriched nutritional supplements. stimulation of anabolism can be achieved by several means. particularly interesting in this regard is ghrelin, a mediator released by the stomach during fasting or caloric restriction.
modulations of ghrelin levels exert remarkable effects on both energy and protein metabolism, such as the inhibition of autophagy in conditions characterized by systemic inflammation [50]. the administration of ghrelin to tumor-bearing animals improved food intake, body weight, lean body mass, and chemotherapy-induced toxicity [51]. both ghrelin and ghrelin analogues are currently being tested in clinical trials. among these, anamorelin was recently shown to improve lean body mass, total body mass, and hand grip strength in patients affected by nsclc [52]. other studies, however, showed that anamorelin administration to cancer patients increased body weight and improved faact scores but did not enhance hand grip strength [53]. amino acids. in addition to being necessary to synthesize proteins, free amino acids (faa) also act as regulators of protein metabolism. in particular, plasma faa, although representing only a small fraction of the total amino acid pool, are the main source of metabolically active nitrogen compounds. among faa, the essential amino acids were reported to stimulate protein synthesis and to inhibit protein degradation. such a role is mainly played by the three branched chain amino acids (bcaa), leucine in particular. alterations of amino acid metabolism are frequent features in cancer-induced muscle wasting. reduced amino acid uptake was generally observed in cancer patients, mainly due to the occurrence of anorexia, which also leads to decreased insulin secretion. decreases in both amino acid availability and insulin levels inhibit the mtor-dependent anabolic pathway, resulting in downregulation of protein synthesis and stimulation of protein degradation. inhibition of mtor signaling in cancer cachexia is reinforced by proinflammatory cytokines [1]. reduced amino acid uptake in the muscle was also reported to derive from altered amino acid transport.
indeed, in the soleus muscle of tumor-bearing rats, the activity of system a was decreased, while no changes were observed for systems l and asc [54]. of interest, tnfα was shown to impair amino acid transport in tumor-bearing rats [55]. plasma glutamine levels were shown to be significantly reduced in tumor-bearing rats with respect to healthy animals [56]. of interest, reduced glutamine availability could activate the metabolic sensor adenosine monophosphate-activated protein kinase (ampk, see below; [57]). finally, leucine oxidation was markedly increased in the muscle of rats bearing the yoshida ah-130 hepatoma [58]. consistently, enhanced activity of the bcaa dehydrogenase was reported in rats bearing the walker 256 carcinoma [59]. several studies have proposed amino acid supplementation as a means to improve cancer-induced muscle wasting. in experimental models of cancer cachexia, bcaa were shown to attenuate the loss of muscle mass. the mechanisms underlying such an effect are not clear, although downregulation of protein degradation and stimulation of protein synthesis were hypothesized [60]. more recently, metabolomic alterations were proposed to be the basis of the positive effects exerted by a leucine-rich diet on cachexia in rats bearing the walker 256 carcinoma, in the absence of effects on tumor mass [61]. as for clinical studies, bcaa were proposed to improve anorexia [62], thus removing, at least partially, one of the mechanisms accounting for reduced amino acid uptake. other studies supported a beneficial role of bcaa on muscle protein metabolism, although this should be confirmed by larger randomized, blinded, placebo-controlled trials [63]. beta-hydroxy-beta-methylbutyrate (hmb), a metabolite of leucine, was shown to improve muscle wasting in experimental cancer cachexia, mainly by inhibiting protein degradation rather than stimulating protein synthesis [63].
recently, hmb was proposed to be more effective than leucine in preventing body weight loss in tumor-bearing animals [64]. such an effect, however, could depend on the model system chosen, since hmb does not appear able to modulate muscle mass in mice bearing the c26 tumor (costelli et al., unpublished observations). the situation is even less clear in human cachexia. increases in both hemoglobin levels and fat-free mass were reported in advanced cancer patients administered a nutritional supplement containing hmb, arginine, and glutamine [65, 66]. another study, however, was not able to demonstrate a beneficial effect of the same supplement in cancer patients [67], suggesting that the effectiveness of hmb in clinical practice is still unclear and deserves further investigation. glutamine supplementation was reported to attenuate muscle protein wasting in cancer patients [68], as well as to improve the energy balance in rats bearing the walker 256 tumor [69]. finally, promising data are available about the possibility of treating cancer hosts with l-carnitine, an amino acid derivative that plays a role in fatty acid metabolism and energy production [63]. a negative energy balance, generally resulting from both reduced production and increased expenditure, is a frequent occurrence in cancer patients. while resting energy expenditure (ree) frequently increases, likely due to enhanced thermogenesis, reduced physical activity, particularly in advanced cancer patients, paradoxically leads to a net decrease of total energy expenditure. the increased thermogenesis is consistent with the observation that in cachectic tumor-bearing animals, the expression of the brown adipose tissue (bat)-specific uncoupling protein 1 (ucp1) is higher than in controls, while ucp2 (ubiquitous) and ucp3 (expressed in bat and muscle) levels increase in the skeletal muscle only [70].
similarly, muscle ucp3 mrna levels were higher in weight-losing than in non-weight-losing cancer patients or controls [71]. the increase of ree in cancer cachexia is not a new observation; however, the underlying mechanisms have only recently started to be clarified. a central role in this regard is played by the muscle mitochondrial compartment, which is markedly affected in tumor hosts. indeed, both morphological [72, 73] and functional alterations [74, 75] were reported in experimental tumor-bearing animals. in particular, mitochondrial uncoupling and reduced oxidative capacity were associated with a myofiber shift from oxidative to glycolytic fibers [73]. impairment of the mitochondrial compartment results in decreased atp production, leading to an energy deficit that becomes even worse since it is coupled with steadily increased ree. consistently, reduced atp levels and increased activity of the energy sensor ampk were shown in the muscle of tumor-bearing animals [72, 73]. the lack of an appropriate energy supply results in compromised cell function and reduced contractile force generation, leading to loss of muscle mass and strength. several factors can lead to mitochondrial alterations in the skeletal muscle. among these, proinflammatory cytokines play a major role. indeed, the activation of nf-κb induced by tnfα was reported to reduce both muscle oxidative capacity and the expression of factors regulating mitochondrial biogenesis. similar observations were reported when other inflammation-driven pathways such as il-6/stat3 or tgfβ/smad3 are activated above physiological levels [76]. in addition to proinflammatory mediators, oxidative stress, due to reactive oxygen and nitrogen species levels exceeding the compensatory capacity of the intracellular antioxidant systems, also contributes to the impairment of mitochondrial function.
in this regard, several studies support the involvement of oxidative stress in cancer-induced muscle wasting, although clear-cut causative evidence is still lacking [77] (ballarò et al., unpublished data; figure 2). since mitochondria are crucial to the maintenance of muscle oxidative metabolism, emergency routes can be activated in order to avoid mitochondrial dysfunction. in particular, mitochondrial biogenesis and dynamics as well as the disposal of damaged organelles, mainly by mitophagy, are promoted (figure 2). conversely, impaired function of the emergency routes themselves could trigger the accumulation of altered mitochondria, resulting in reduced muscle oxidative metabolism. in this regard, the expression of the peroxisome proliferator-activated receptor-γ (ppar-γ) coactivator-1α (pgc-1α), the master regulator of mitochondrial biogenesis and oxidative metabolism, was shown to be reduced in the skeletal muscle of tumor-bearing mice [78], although this is not a constant finding [73, 79]. mitochondrial dynamics, representing the balance between fission and fusion processes, was shown to be altered both in experimental cancer cachexia and in cancer patients [78, 80]. in this regard, impaired mitochondrial dynamics could drive the hyperactivation of muscle protein breakdown, likely through pathways depending on ampk and foxo, eventually leading to muscle wasting [81, 82]. autophagy (mitophagy) is the main mechanism responsible for the disposal of altered mitochondria. mitophagy was also reported to be impaired in cancer cachexia, as shown by the observation that bnip3l and parkin1 mrna increased in the muscle of cancer patients [17]. similarly, bnip3l protein levels were increased in the muscle of mice bearing the lewis lung carcinoma [73].
on the whole, these observations suggest that, in addition to mitochondrial biogenesis and dynamics, mitochondrial disposal is also perturbed in the skeletal muscle of tumor hosts, thus contributing to mitochondrial dysfunction and reduced muscle oxidative metabolism. several strategies were proposed to improve energy metabolism by acting on mitochondria. the first and perhaps simplest one, in theory at least, is exercise training, in particular a combination of resistance and endurance exercise. these two types of training affect different but complementary targets, being able to improve force production and metabolic adaptations, respectively. of particular relevance is the observation that endurance training was reported to increase the number of mitochondria and to drive a myofiber-type shift from glycolytic to oxidative, thus specifically targeting alterations that characteristically occur in the skeletal muscle of tumor hosts. however, these potentially favorable effects can be obtained only by practicing exercise systematically, even if at a moderate level. this may not be an easy task for cancer patients, who frequently present with chronic fatigue and comorbidities eventually leading to exercise intolerance. this point is supported by the observation that c26-bearing mice did not benefit from exercise training, suggesting that the effort of exercising in already compromised animals was damaging rather than protective [73]. consistently, excessive endurance exercise was associated with increased mitochondrial fission in the absence of mitophagy induction [83]. in recent years, the possibility of mimicking the effects of exercise with drugs has been gaining growing consensus. the advantage is that this strategy would make it possible to overcome both poor patient compliance with exercise training and the possible occurrence of exercise intolerance. the drawback is that exercise-mimicking drugs generally do not fully recapitulate the effects of exercise itself.
in this sense, these drugs do not fully hit the target; however, they could be a good compromise when exercise training cannot be proposed to the patient. at present, several options have been investigated as exercise-mimicking strategies. while most of them are pharmacological, a genetic approach was also proposed. the latter consists of manipulations able to increase the levels of pgc-1α in the skeletal muscle. in this regard, improved exercise capacity and oxidative metabolism were reported in mice specifically overexpressing this factor in the muscle, resembling the phenotype induced by endurance training [84]. pgc-1α overexpression was shown to interfere with muscle atrophy induced by activation of the tweak-fn14 pathway [85] and to improve cancer-induced muscle wasting in tumor-bearing mice [73, 86], although contrasting data were previously reported [87]. several classes of drugs were proposed to modulate energy metabolism, among which are activators of ampk and of sirtuin 1 (sirt1), as well as trimetazidine (tmz). different compounds such as resveratrol, metformin, quercetin, and aicar can activate ampk [88]. in this regard, aicar administration was shown to affect exercise capacity, oxygen consumption, and fatty acid oxidation [89]. muscle atrophy induced by angiotensin ii was prevented by treatment with aicar. this drug also proved able to activate autophagy and to improve muscle phenotype in both dystrophic mdx mice and animals bearing the c26 tumor [16, 90]. metformin administration was shown to improve sarcopenia of aging and muscle wasting in severely burned patients [91, 92] and was proposed to be useful to treat muscle wasting in cancer cachexia [93]. ampk activation can also be induced by resveratrol, as demonstrated by the observation that ampkα1- or ampkα2-deficient mice were refractory to the resveratrol-induced increase of both mitochondrial biogenesis and endurance exercise capacity [94].
consistently, obese men receiving resveratrol showed an improved inflammatory profile, ampk activation, and increased pgc-1α and sirt1 protein levels [95]. finally, an ampk-stabilizing peptide was reported to improve white adipose tissue wasting in tumor hosts [96]. sirt1 belongs to a class of deacetylases deregulated in aging and in different chronic diseases, including cancer. sirt1 is also involved in the regulation of energy homeostasis; its expression is induced in response to caloric restriction [97], and it can be activated in the skeletal muscle by ampk [98]. specific overexpression of sirt1 in the muscle resulted in a fast-to-slow myofiber type transition, producing an oxidative phenotype. consistently, muscle-specific sirt1 transgenic mice exposed to fasting or denervation showed a reduced expression of atrogenes in comparison with wild-type littermates [99]. finally, improved muscle phenotype was reported in mdx/sirt1 double transgenic mice [100]. in addition to ampk, resveratrol also activates sirt1. in this regard, part of the above-described effects exerted by resveratrol derives from sirt1-dependent modulations of the pgc-1α acetylation state [101]. synthetic selective sirt1 activators such as srt2104 are also available [102, 103]. plasma lipid profile and insulin sensitivity were improved in healthy volunteers by administration of srt2104 [102, 104]. very few studies are available about the action of srt2104 on muscle mass and function. in this regard, srt2104 appeared to reduce the depletion of muscle mass due to fasting or inactivity, at least in part by increasing pgc-1α expression [105]. tmz is a metabolic modulator that blocks fatty acid oxidation, shifting atp production to glucose oxidation and improving cell energy metabolism. indeed, atp synthesis through fatty acid β-oxidation requires more oxygen than glucose oxidation [106].
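the oxygen cost argument for tmz can be illustrated with a back-of-the-envelope calculation. the sketch below is not taken from the cited work [106]; it assumes commonly quoted textbook stoichiometries (about 30 atp and 6 o2 per glucose, about 106 atp and 23 o2 per palmitate), and the names `Substrate`, `GLUCOSE`, `PALMITATE`, and `extra_oxygen_fraction` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Substrate:
    """Approximate stoichiometry for complete oxidation of one molecule."""
    name: str
    atp_yield: float    # assumed net ATP per molecule (textbook estimate)
    o2_consumed: float  # O2 molecules consumed per molecule oxidized

    def atp_per_o2(self) -> float:
        """ATP produced per O2 molecule consumed."""
        return self.atp_yield / self.o2_consumed


# Assumed textbook values, not data from the review or its references.
GLUCOSE = Substrate("glucose", atp_yield=30.0, o2_consumed=6.0)
PALMITATE = Substrate("palmitate", atp_yield=106.0, o2_consumed=23.0)


def extra_oxygen_fraction(fat: Substrate, sugar: Substrate) -> float:
    """Fractional extra O2 needed per ATP when oxidizing fat instead of sugar."""
    return sugar.atp_per_o2() / fat.atp_per_o2() - 1.0


if __name__ == "__main__":
    for s in (GLUCOSE, PALMITATE):
        print(f"{s.name}: {s.atp_per_o2():.2f} ATP per O2")
    extra = extra_oxygen_fraction(PALMITATE, GLUCOSE)
    print(f"extra O2 per ATP from palmitate: {extra:.1%}")
```

under these assumed values, palmitate oxidation requires roughly 8 to 9% more oxygen per atp than glucose oxidation; the exact figure depends on the atp yields adopted.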
along this line, the shift to glucose oxidation improves the use of the available oxygen, possibly increasing metabolic efficiency and skeletal muscle function. tmz was shown to increase the size of cultured myotubes [107] and to improve both heart metabolism and exercise capacity in patients suffering from chronic stable angina [108]. when administered to aged animals, tmz resulted in increased muscle strength [109]. finally, treatment of c26-bearing mice with tmz reproduced some of the benefits triggered by exercise, among which are a fast-to-slow myofiber phenotype shift, pgc-1α upregulation, oxidative metabolism enhancement, and grip strength increase (molinari et al., unpublished). the relevance of fatty acid oxidation to cachexia is also supported by a recent study showing that several tumor cell lines are able to release proinflammatory mediators, resulting in enhanced fatty acid oxidation and activation of p38-dependent signaling in the skeletal muscle, well before tissue wasting occurs. in addition, the same study showed that treatment of tumor-bearing animals with etomoxir, an inhibitor of fatty acid oxidation, rescued both muscle mass and body weight [110]. a complex network of metabolic alterations sustained by hypercatabolism, energy deficit, and systemic inflammation is the milieu underlying cancer cachexia. while becoming overtly detectable in advanced cancer patients, such perturbations likely take place very early in the course of the disease, at least at the molecular level. protein and energy dysmetabolism in cachexia are quite well recognized; however, the available therapeutic strategies, although frequently promising from the preclinical point of view, have not yet been validated for use in clinical practice. several drugs identified by experimental studies are currently being tested in clinical trials for their ability to improve muscle metabolism in cancer patients.
other emerging strategies are those aimed at modulating the intestinal microbiota, previously reported to improve cachexia in a preclinical model [111]. the available results of both experimental and clinical studies, however, have clearly indicated that single-targeted therapies are unlikely to be successful in the treatment of cachexia. in this regard, the view of a multidirectional approach, selectively tailored and, whenever possible, personalized, is gaining growing consensus. such an approach should not rely only on nutritional counseling and pharmacologic treatment with anti-inflammatory and anticatabolic drugs but should also include exercise training and/or exercise-mimicking agents. in this regard, exercise mimetics could not only replace exercise training in depleted patients but also improve exercise tolerance and effectiveness in precachectic individuals, thus amplifying the beneficial action of exercise itself. last but not least, treatments aimed at preventing or correcting the metabolic alterations underlying cancer-induced muscle wasting might also impinge on tumor-targeted therapies, improving their effectiveness and/or enhancing patient tolerance to chemotherapy. in addition, metabolic modulators could also directly affect tumor growth. this is the case, for example, of exercise, which was shown to prevent or at least delay tumor growth [112].
cancer cachexia: understanding the molecular basis
definition and classification of cancer cachexia: an international consensus
muscle mass and association to quality of life in non-small cell lung cancer patients
sarcopenia is associated with quality of life and depression in patients with advanced cancer
anamorelin for advanced non-small-cell lung cancer with cachexia: systematic review and meta-analysis
definition of cancer cachexia: effect of weight loss, reduced food intake, and systemic inflammation on functional status and prognosis
targeting obesity and cachexia: identification of the gfral receptor-mic-1/gdf15 pathway
novel targeted therapies for cancer cachexia
coming back: autophagy in cachexia
increased muscle proteasome activity correlates with disease severity in gastric cancer patients
precachexia in patients with stages i-iii non-small cell lung cancer: systemic inflammation and functional impairment without activation of skeletal muscle ubiquitin proteasome system
autophagic-lysosomal pathway is the main proteolytic system modified in the skeletal muscle of esophageal cancer patients
in vivo analysis of autophagy in response to nutrient starvation using transgenic mice expressing a fluorescent autophagosome marker
guidelines for the use and interpretation of assays for monitoring autophagy
autophagic degradation contributes to muscle wasting in cancer cachexia
aerobic exercise and pharmacological treatments counteract cachexia by modulating autophagy in colon cancer
autophagy is induced in the skeletal muscle of cachectic cancer patients
the calpain system
muscle wasting associated with cancer cachexia is linked to an important activation of the atp-dependent ubiquitin-mediated proteolysis
anticytokine treatment prevents the increase in the activity of atp-ubiquitin- and ca2+-dependent proteolytic systems in the muscle of tumour-bearing rats
interference with ca2+-dependent proteolysis does not alter the course of muscle wasting in experimental cancer cachexia
calpain activity is increased in skeletal muscle from gastric cancer patients with no or minimal weight loss
role of inflammation in muscle homeostasis and myogenesis
anti-cytokine strategies for the treatment of cancer-related anorexia and cachexia
nf-κb-mediated pax7 dysregulation in the muscle microenvironment promotes cancer cachexia
nuclear transcription factor κb activation and protein turnover adaptations in skeletal muscle of patients with progressive stages of lung cancer cachexia
targeting of fn14 prevents cancer-induced cachexia and prolongs survival
bmp signaling controls muscle mass
the role of e3 ubiquitin-ligases murf-1 and mafbx in loss of skeletal muscle mass
effect of the specific proteasome inhibitor bortezomib on cancer-related muscle wasting
mg132-mediated inhibition of the ubiquitin-proteasome pathway ameliorates cancer cachexia
severe acute respiratory syndrome coronavirus replication is severely impaired by mg132 due to proteasome-independent inhibition of m-calpain
activation of the atp-ubiquitin-proteasome pathway in skeletal muscle of cachectic rats bearing a hepatoma
increased atp-ubiquitin-dependent proteolysis in skeletal muscles of tumor-bearing rats
calpain inhibitors ameliorate muscle wasting in a cachectic mouse model bearing ct26 colorectal adenocarcinoma
anticachectic effects of formoterol
phase i/ii trial of formoterol fumarate combined with megestrol acetate in cachectic patients with advanced malignancy
tumor necrosis factor-alpha mediates changes in tissue protein turnover in a rat cancer cachexia model
protein synthesis in muscle measured in vivo in cachectic patients with cancer
attenuation of resting but not load-mediated protein synthesis in prostate cancer patients on androgen deprivation
effects of oral meal feeding on whole body protein breakdown and protein synthesis in cachectic pancreatic cancer patients
protein anabolic resistance in cancer
igf-1 is downregulated in experimental cancer cachexia
muscle atrophy in experimental cancer cachexia: is the igf-1 signaling pathway involved?
cancer cachexia and diabetes: similarities in metabolic alterations and possible treatment
central tenet of cancer cachexia therapy: do patients with advanced cancer have exploitable anabolic potential?
normal protein anabolic response to hyperaminoacidemia in insulin-resistant patients with lung cancer cachexia
muscle protein synthesis in cancer patients can be stimulated with a specially formulated medical food
high anabolic potential of essential amino acid mixtures in advanced non-small cell lung cancer
current opinion in clinical nutrition and metabolic care
anamorelin hydrochloride in the treatment of cancer anorexia-cachexia syndrome: design, development, and potential place in therapy
anamorelin (ono-7643) in japanese patients with non-small cell lung cancer and cachexia: results of a randomized phase 2 trial
anamorelin in patients with non-small-cell lung cancer and cachexia (romana 1 and romana 2): results from two randomised, double-blind, phase 3 trials
amino acid uptake in skeletal muscle of rats bearing the yoshida ah-130 ascites hepatoma
tumour growth results in changes in placental amino acid transport in the rat: a tumour necrosis factor α-mediated effect
cancer cachexia, malnutrition, and tissue protein turnover in experimental animals
hypothesis: muscular glutamine deficiency in sepsis - a necessary step for a hibernation-like state?
enhanced leucine oxidation in rats bearing an ascites hepatoma (yoshida ah-130) and its reversal by clenbuterol
the energy state of tumor-bearing rats
effect of branched-chain amino acids on muscle atrophy in cancer cachexia
leucine-rich diet alters the 1h-nmr based metabolomic profile without changing the walker-256 tumour mass in rats
branched-chain amino acids: the best compromise to achieve anabolism?
cancer-induced muscle wasting: latest findings in prevention and treatment
comparison of the anticatabolic effects of leucine and ca-β-hydroxy-β-methylbutyrate in experimental models of cancer cachexia
reversal of cancer-related wasting using oral supplementation with a combination of β-hydroxy-β-methylbutyrate, arginine, and glutamine
supplementation with a combination of beta-hydroxy-beta-methylbutyrate (hmb), arginine, and glutamine is safe and could improve hematological parameters
a randomized, double-blind, placebo-controlled trial of a β-hydroxyl β-methyl butyrate, glutamine, and arginine mixture for the treatment of cancer cachexia (rtog 0122)
glutamine supplementation in cancer patients
l-glutamine supplementation promotes an improved energetic balance in walker-256 tumor-bearing rats
lipid mobilization in cachexia: mechanisms and mediators
muscle ucp-3 mrna levels are elevated in weight loss associated with gastrointestinal adenocarcinoma in humans
mitochondrial and sarcoplasmic reticulum abnormalities in cancer cachexia: altered energetic efficiency?
combination of exercise training and erythropoietin prevents cancer-induced muscle alterations
cancer cachexia is associated with a decrease in skeletal muscle mitochondrial oxidative capacities without alteration of atp production efficiency
skeletal muscle mitochondrial uncoupling in a murine cancer cachexia model
disrupted skeletal muscle mitochondrial dynamics, mitophagy, and biogenesis during cancer cachexia: a role for inflammation
antioxidant supplementation accelerates cachexia development by promoting tumor growth in c26 tumor-bearing mice
il-6 regulation on skeletal muscle mitochondrial remodeling during cancer cachexia in the apc min/+ mouse
combined approach to counteract experimental cancer cachexia: eicosapentaenoic acid and training exercise
altered mitochondrial quality control signaling in muscle of old gastric cancer patients with cachexia
mitochondrial biogenesis and fragmentation as regulators of muscle protein degradation
the opa1-dependent mitochondrial cristae remodeling pathway controls atrophic, apoptotic, and ischemic tissue damage
modulation of autophagy and ubiquitin-proteasome pathways during ultra-endurance running
skeletal muscle pgc-1β signaling is sufficient to drive an endurance exercise phenotype and to counteract components of detraining in mice
regulatory circuitry of tweak-fn14 system and pgc-1α in skeletal muscle atrophy program
a pgc-1α isoform induced by resistance training regulates skeletal muscle hypertrophy
increase in muscle mitochondrial biogenesis does not prevent muscle loss but increased tumor size in a mouse model of acute cancer-induced cachexia
the therapeutic potential of skeletal muscle plasticity in duchenne muscular dystrophy: phenotypic modifiers as pharmacologic targets
ampk and pparδ agonists are exercise mimetics
ampk activation stimulates autophagy and ameliorates muscular dystrophy in the mdx mouse diaphragm
effects of pharmacological interventions on muscle protein synthesis and breakdown in recovery from burns
effects of the antidiabetic drugs on the age-related atrophy and sarcopenia associated with diabetes type ii
metformin treatment modulates the tumour-induced wasting effects in muscle protein metabolism minimising the cachexia in tumour-bearing rats
amp-activated protein kinase-deficient mice are resistant to the metabolic effects of resveratrol
calorie restriction-like effects of 30 days of resveratrol supplementation on energy metabolism and metabolic profile in obese humans
an amp-activated protein kinase-stabilizing peptide ameliorates adipose tissue wasting in cancer cachexia in mice
mammalian sirtuins: biological insights and disease relevance
pgc-1α, sirt1 and ampk, an energy sensing network that controls energy expenditure
sirt1 protein, by blocking the activities of transcription factors foxo1 and foxo3, inhibits muscle atrophy and promotes muscle growth
the multifaceted functions of sirtuins in cancer
metabolic control of muscle mitochondrial function and fatty acid oxidation through sirt1/pgc-1α
a pilot randomized, placebo controlled, double blind phase i trial of the novel sirt1 activator srt2104 in elderly volunteers
a novel sirtuin 2 (sirt2) inhibitor with p53-dependent pro-apoptotic activity in non-small cell lung cancer
cardiovascular effects of a novel sirt1 activator, srt2104, in otherwise healthy cigarette smokers
srt2104 extends survival of male mice on a standard diet and preserves bone and muscle mass
the anti-ischemic effect of trimetazidine in patients with postprandial myocardial ischemia is unrelated to meal composition
the metabolic modulator trimetazidine triggers autophagy and counteracts stress-induced atrophy in skeletal muscle myotubes
trimetazidine improves exercise performance in patients with peripheral arterial disease
improvement of skeletal muscle performance in ageing by the metabolic modulator trimetazidine
excessive fatty acid oxidation induces muscle atrophy in cancer cachexia
synbiotic approach restores intestinal homeostasis and prolongs survival in leukaemic mice with cachexia
molecular mechanisms linking exercise to cancer prevention and treatment

the authors declare that there is no conflict of interest regarding the publication of this paper.

key: cord-273019-hbpfz8rt authors: glingston, r. sahaya; deb, rachayeeta; kumar, sachin; nagotu, shirisha title: organelle dynamics and viral infections: at cross roads date: 2018-06-25 journal: microbes infect doi: 10.1016/j.micinf.2018.06.002 sha: doc_id: 273019 cord_uid: hbpfz8rt

viruses are obligate intracellular parasites of host cells. it is commonly accepted that internal membranous structures are required for various aspects of the viral life cycle. organelles provide a favourable intracellular environment for several viruses. however, studies reporting organelle dynamics upon viral infection are scant. in this review, we aim to summarize and highlight the modulations caused to various organelles upon viral infection or expression of viral proteins.

a unique feature of eukaryotic cells is the presence of distinct membrane-bound structures called organelles. this subcompartmentalization of eukaryotic cells is essential for their optimum function. organelles achieve this through a unique set of proteins and a distinctive lipid composition that determine their function. they are dynamic and interact with surrounding organelles to regulate their biogenesis and function [1, 2]. the association of organelle dysfunction with human diseases and disorders is reported and widely acknowledged [3]. interestingly, organelles are required not only for the proper functioning of cells but also for successful infection by a virus [4, 5]. it is well established that viruses develop alternate strategies to survive in cells. the steps of the viral life cycle include entry, translation, replication, assembly and egress [6].
moreover, viruses have developed remarkable ways to complete their life cycle by targeting specific cell organelles and processes. organelles such as mitochondria, the er and peroxisomes play an important role in innate immunity and host defence [7]. recently, lipid droplets have also been reported to be essential for the innate response against viral infection [8]. the viral life cycle is also a dynamic process that leads to extensive reorganization of the host cell. characterizing this spatiotemporal reorganization is a key step towards understanding the molecular mechanism of viral infection. localization of viral proteins to an appropriate sub-cellular compartment is part of the strategy to hijack the host machinery and the pathways necessary for establishing an infection [9, 10]. with the advent of several modern microscopy and proteomics methods, it is now clear that the localization and function of many host proteins are altered by viral infections. these methods have also been used extensively to study the intracellular localization of viral proteins and their interactions with host proteins [10]. however, our understanding of organelle dynamics during viral infections is still in its infancy. the current review briefly summarizes our knowledge of the changes in various cell organelles/compartments following virus infection. this topic, at the intersection of virology and cell biology, represents an emerging area of molecular investigation in virology.

the presence of a nucleus that houses the genome is what distinguishes eukaryotic cells from prokaryotes (fig. 1). it is a double-membrane organelle and is the site of gene regulation and transcription. the nuclear envelope, which contains the nuclear pores, is the barrier between the nuclear content and the rest of the cell. the nucleolus is the region of the nucleus where ribosomal subunit assembly takes place.
sub-nuclear structures that comprise several proteins regulating cellular processes such as apoptosis and dna damage are called promyelocytic leukaemia nuclear bodies (pml-nbs). many dna and rna viruses depend on host nuclear proteins for their replication [11]. viruses use several strategies to deliver their genomes to host cells [12]. usually, when cells undergo mitosis, the nuclear envelope (ne) is temporarily disassembled, which allows some viruses to enter the nucleus [13]. entry of murine leukemia virus (mlv) into the host nucleus and its integration depend on ne breakdown during mitosis [14]. various proteins of the human immunodeficiency virus (hiv) preintegration complex, such as matrix [15], vpr [16] and integrase [17, 18], were found to contain a nuclear localization signal (nls) in their sequence. the nls interacts with nuclear transport receptors, which transport the viral proteins through the nuclear pore complex (npc). similarly, the influenza virus nucleoprotein (np) present in the viral ribonucleoprotein complex (vrnp) contains nls1, nls2 and nls3, which help it bind to cellular importins, resulting in transport of the vrnp to the nucleus [19-21]. in another strategy, the capsid of certain viruses attaches to the cytoplasmic side of the host npc either directly or with the help of importins. this interaction acts as a signal for capsid disassembly, followed by entry of the viral genome along with its associated proteins through the npc into the nucleus [12]. studies of herpes simplex virus-1 (hsv-1) infection in vero, bhk-21 and ptk2 cells reported dynein-mediated transport of the viral tegument-capsid to the cytoplasmic side of the npc [22, 23]. the hsv-1 capsid binds to the npc with the help of importin β and subsequently releases its dna into the nucleus through the npc [24].
similarly, the capsid protein of adenovirus (ad), on binding the nuclear transport protein nup214, attaches to the cytoplasmic side of the npc and releases the genetic material into the nucleus [25]. some small viruses, such as hepatitis b virus (hbv) and baculovirus, were found to cross the npc and release their genome at the nuclear side of the npc or within the nucleus [12]. the capsid protein of hbv was reported to interact with the npc via the nucleoporin nup153 in xenopus laevis oocytes [26]. furthermore, this binding depends on importin α and β and on the phosphorylated core protein present in the capsid [27]. upon infection, the hbv capsid enters the nucleus through the npc and subsequently releases its genome into the nucleoplasm [28]. similarly, capsid proteins of simian virus 40 (sv40) were found to enter the nucleus through the npc [29]. it has been reported that temporary disruption of the ne and nuclear lamina aids the nuclear entry of viruses such as parvoviruses [30]. electron microscopy revealed disruption of the ne in xenopus oocytes, and of both the ne and the nuclear lamina in mouse fibroblasts, upon infection with parvoviruses such as minute virus of mice (mvm) [30, 31]. presumably, viruses evolved different mechanisms to enter the nucleus in order to facilitate their replication cycle and survival inside host cells.

after successful entry into the nucleus, both dna and rna viruses exploit various nuclear components such as the nucleolus, promyelocytic leukaemia nuclear bodies (pml-nbs) and/or nuclear proteins to facilitate their replication (fig. 2) [11]. nucleolin, fibrillarin and b23 (nucleophosmin) are examples of nucleolar proteins reported to be essential for functions such as post-transcriptional processing and ribosome assembly [32]. these multifunctional host nucleolar proteins were reported to be incorporated into the replication and translation complexes of many viruses [11].
many dna viruses such as hsv and ad were reported to relocalize or disrupt host nucleolar proteins [11]. for instance, expression of recombinant hsv-1 ul24 protein carrying an n-terminal hemagglutinin tag in transfected vero cells resulted in the redistribution of nucleolin and b23 [33, 34]. another nucleolar protein, fibrillarin, was also found to be redistributed in spots throughout the nucleus, but in a ul24-independent manner [33]. the core protein of ad was found to be associated with the nucleolus in hela cells [35]. further studies showed that this association results in the relocation of nucleolin to the cytoplasm of infected cells [36]. the ad-induced relocation of nucleolin was proposed to be a strategy used by the virus to suppress nucleolin activity [36]. additionally, infection of hela cells with ad inhibited the synthesis and maturation of rrna [37]. nucleolar localization of the capsid and rna-binding proteins of many rna viruses has been reported [11]. for example, the encephalomyocarditis virus (emcv) proteins 2a and 3bcd and the human rhinovirus (hrv) 3c protease were found to localize to nucleoli, resulting in inhibition of cellular rna transcription [38, 39]. expression of the tat and rev proteins of hiv-1 in cos7 cells was reported to result in their accumulation in the dense fibrillar component of the nucleolus [40]. moreover, the rev protein was reported to deform the nucleolar architecture [40]. similarly, upon coronavirus infection the viral n protein was found to localize to the nucleolus in addition to the cytoplasm, resulting in a delayed cell cycle that facilitates viral assembly [41]. poliovirus (pv) and rhinoviruses interact with nucleolin and block its nuclear import, leading to accumulation of nucleolin in the cytoplasm of infected cells [42].
the accumulated nucleolin interacts with the internal ribosome entry site (ires) element present at the 5′ end of the viral genome to stimulate its translation [43]. on the other hand, these changes result in the shutdown of cellular transcription, leading to downregulation of the host defence mechanism [42]. infection of a549 cells with human respiratory syncytial virus (hrsv) resulted in depletion of nucleolin [44]. although dna and rna viruses differ in how they acquire the nuclear niche, the common goal is to establish control over cellular transcription and to favour the viral genome over that of the host. the pml nb is another subnuclear component, which contains proteins such as pml and sp100. these proteins are induced by interferons, and hence pml nbs can be targeted by viruses to escape the antiviral signalling response [45]. redistribution of pml nbs has been reported upon infection of bhk-21 cells with hsv-1 [46]. similarly, the ebna-lp protein of epstein-barr virus (ebv) was found to displace sp100 from pml nbs in burkitt's lymphoma and hep2 cells [47]. the viral e4-orf3 protein induced reorganization of pml nbs into elongated track-like structures upon infection of cv1 cells with human ad5 [48]. later, this reorganization was found to be responsible for the downregulation of the host antiviral response [49]. another example is the infection of human foreskin fibroblasts (hffs) with human cytomegalovirus (hcmv), which results in the accumulation of pp71 at pml nbs in infected cells [50]. further studies showed that pp71 is responsible for the proteasomal degradation of the pml-nb protein daxx, which is important for inducing the intrinsic immune response against hcmv infection [50]. some rna viruses also cause the redistribution and degradation of host pml nbs in order to neutralize the antiviral response.
the ring finger protein z of lymphocytic choriomeningitis virus interacts with the pml protein and leads to the redistribution of pml-nbs from the nucleus to the cytoplasm [51]. expression of the 3c protease of emcv in cho cells was reported to target pml-nbs and promote their degradation by the proteasome and by a sumo-dependent mechanism [52]. interestingly, cho cells infected with rabies virus were reported to contain enlarged pml nbs [53]. disruption of nucleocytoplasmic trafficking in host cells is another major alteration caused by viral infections (fig. 2) [54]. two conserved hcmv proteins, pul50 and pul53, were reported to remodel the nuclear lamina of helfs (primary human lung fibroblasts) for the budding of virions [55]. an interaction between human papillomavirus (hpv) and host mitotic chromosomes has been documented [56]. infection of porcine bone marrow cells with african swine fever virus (asfv) resulted in fragmentation of the nucleus [57].

the organelle found in continuity with the nucleus is the endoplasmic reticulum (er) (fig. 1). protein synthesis and folding (rough er), and lipid synthesis and calcium regulation (smooth er), are some of the important functions of the er. multiple structural domains of the er have been reported, which enable it to serve as a site for various important cellular functions. several morphological alterations of the er, such as vesicle formation, membrane invagination and a zippered appearance, have been reported upon viral infection in different cell lines (fig. 2). the large, malleable surface of the er helps viruses form protective compartments in which to set up their replication machinery. these compartments, known as viroplasms, not only concentrate the viral and host proteins required for viral genome replication but also protect the viral genome from cellular nucleases [58].
to construct these compartments, viruses alter the host's fatty acid metabolism, induce rearrangement of membrane constituents and recruit cellular machinery to produce proteins essential for their replication [59, 60]. infection of bs-c-1, hela and rk-13 cells with vaccinia virus (vacv) leads to the formation of vacv membranes derived from the er membrane [61]. in the case of dengue virus (denv-2), the genomic rna was observed to localize over the rough er in c6/36 aedes albopictus cells [62]. denv infection of human hepatoma 7 (huh7) cells leads to invagination of the er membrane, resulting in the formation of spherules or vesicles containing double-stranded rna [63]. infection of hela and cos-1 cells with pv1 causes the formation of single-membrane tubules, which later mature into double-membrane vesicles (dmvs) [64, 65]. similarly, severe acute respiratory syndrome coronavirus (sars-cov) induced the formation of er-derived dmvs upon infection of vero cells [66, 67]. a zippered appearance of the er was observed upon infection of mammalian, avian and tracheal epithelial cells with infectious bronchitis virus (ibv) [68]. studies of plant viruses such as potato virus x (pvx) reported tgbp2 protein-induced reorganization of the er and the appearance of vesicular structures in tobacco plants [69]. apart from alterations in er morphology, effects of viral protein expression on sterol biosynthesis have also been reported [59, 70]. several viral proteins need to be glycosylated at their n-terminus to ensure proper folding and incorporation into virions [71]. hence, a common phenomenon observed upon viral infection is interference with the host post-translational machinery and competition with cellular proteins for these modifications [72]. for example, the porf2 protein of hepatitis e virus (hev) is glycosylated in the er upon its expression in cos-1 and huh7 mammalian cells [73, 74].
this increased biosynthetic load is proposed to increase the accumulation of malformed proteins, resulting in er stress (fig. 2). the er relieves this stress by inducing the unfolded protein response (upr) pathway [72]. the upr serves to enhance the cell's degradation capacity in order to restore er homeostasis [72]. transport of misfolded or misassembled proteins from the er to the cytosol and their clearance by the ubiquitin-proteasome system take place via the er-associated degradation (erad) pathway [72]. however, some viruses use the erad pathway to their advantage, co-opting erad to disassemble and gain access to the cytosol of the host cell [75]. upon sv40 infection of cv-1, hela and 3t6 cells, induction of erad factors in turn induced reorganization of the er membrane into distinct subdomains called foci, in which sv40 was observed to accumulate [76]. later, sv40 particles travel across the er membrane to reach the cytosol [76]. many enveloped and non-enveloped viruses that exploit er functions such as the upr and erad to facilitate their replication have been discussed elsewhere [71]. in addition to the mechanisms listed above, er-mediated n-linked glycosylation plays a very important role in the survival of viruses. it has been very elegantly demonstrated that the biogenesis of influenza could modulate the host immune response [77]. similarly, the role of n-glycans in host immunity against hiv has been documented [77]. these studies converge on a common point: the level of glycosylation of the viral surface glycoproteins can alter their antigenicity.

mitochondria are double-membrane-bound organelles that contain their own genome (fig. 1). they are involved in various functions such as fatty acid metabolism, apoptosis and calcium homeostasis. considered evolutionarily the oldest organelle of the eukaryotic cell, they are indispensable as the powerhouse of the cell.
mitochondrial morphology is maintained by a series of interlinked events, namely mitochondrial fusion, fission and mitophagy [78]. mitochondrial fusion helps exchange matrix metabolites and intact mitochondrial dna, while mitochondrial fission helps sort impaired mitochondria from the healthy population; the impaired mitochondria are then eliminated by mitophagy [79]. all of these dynamic events are altered upon viral infection to facilitate viral replication (fig. 2) [80]. for example, hbv infection was reported to induce mitochondrial fission followed by mitophagy in order to attenuate apoptosis [81]. on the other hand, expression of the orf-9b protein of sars-cov promoted mitochondrial fusion in hek293 cells [82]. measles virus infection of non-small cell lung cancer (nsclc) cells was reported to lead to upregulation of mitophagy and degradation of the mitochondrial antiviral signalling protein (mavs), thereby attenuating the antiviral immune response [83]. expression of the matrix protein (m) of human parainfluenza virus type 3 (hpiv3) in hek293t and hela cells was reported to induce mitophagy, resulting in suppression of the type i interferon response [84]. hbv induces mitophagosome formation, which upon fusion with lysosomes leads to mitophagy and prevents apoptosis, thus facilitating persistent infection of huh7 cells [81]. similarly, newcastle disease virus (ndv) was reported to induce mitophagy, which promotes ndv replication by preventing caspase-dependent apoptosis in human non-small cell lung cancer a549 cells [85]. viruses can alter the intracellular distribution of mitochondria, either to prevent the release of mediators of apoptosis or to meet their energy requirements during replication by concentrating mitochondria near viral factories [86].
the hbv x protein induces perinuclear clustering of mitochondria in huh7 cells [87], and asfv leads to the transport of mitochondria to viral assembly sites in vero cells [88]. mitochondrial dna codes for proteins essential for the respiratory functions of a cell. certain viruses evade the mitochondria-associated antiviral response by damaging host mitochondrial dna, which is essential for synthesizing enzymes required for optimum mitochondrial function [86]. for example, mitochondrial dna degradation is induced in mammalian cells upon expression of ul12.5, an amino-terminally truncated ul12 isoform of hsv-1 [89]. raji cell lines infected with hepatitis c virus (hcv) also exhibit mitochondrial dna depletion [90]. maintenance of mitochondrial and cellular ca2+ homeostasis is vital for various cellular functions. many viruses alter mitochondrial calcium homeostasis in order to meet their needs during replication [86]. hcmv infection causes calcium influx into mitochondria from the er [91]. on the other hand, expression of the 2b protein of coxsackievirus reduced ca2+ signalling between the er-golgi and the mitochondria in hela cells, resulting in suppression of apoptosis [92]. rotavirus was also reported to alter calcium homeostasis in the host cell throughout its life cycle [93]. mitochondria are the major source of reactive oxygen species (ros) in a cell, and a balance between ros production and scavenging is crucial for optimum functioning of cells. upon viral infection, mitochondria undergo oxidative stress, resulting in increased production of ros, which in turn reduces virus replication [86]. interestingly, ros are also involved in activating many host cellular pathways favourable for viral replication and pathogenesis. several viruses, such as hiv, hcv, adv and ebv, cause increased oxidative stress upon infection [86]. on the other hand, both an increase and a decrease in oxidative stress are employed as survival strategies by hbv [94, 95].
many viral proteins reach mitochondria through the mitochondria-associated membrane, a sub-compartment of the er [96], or are targeted directly from the cytosol, and result in an altered mitochondrial permeability transition pore (mptp) [97]. the mptp is also responsible for maintaining the mitochondrial membrane potential (mmp), which provides energy for atp synthesis. altering the mptp leads to passive swelling, outer membrane rupture, osmotic water flux and release of proapoptotic factors, leading to cell death [98]. in general, increased mmp induces apoptosis, while decreased mmp prevents apoptosis [86]. viruses are proposed to decrease mmp to prevent cell death in order to promote their replication. however, in later stages, they may trigger an increase in mmp to release the progeny virions by apoptosis [86]. the m11l protein of myxoma pox virus prevents the loss of mmp in cos-7 monkey kidney cells, hela cells, thp-1 human monocytes and jurkat t lymphocytes [99]. on the other hand, viral protein r (vpr) of hiv-1 induces the loss of mmp in cem-c7 and jurkat cells, resulting in apoptosis [100]. viruses alter host mitochondrial metabolic pathways to maintain the cellular energy homeostasis needed for efficient replication and to avoid the mitochondrial antiviral response, particularly in the case of slowly replicating viruses [101]. some viruses reprogram normal cells to increase aerobic glycolysis and use glucose biosynthetically, which helps enveloped viruses in particular to increase the available pool of fatty acids, lipids and nucleotides during replication [102, 103]. feline leukemia virus infection of lung fibroblasts (flf-3) resulted in a 30-40% increase in glucose uptake and lactic acid production [104]. an increase in lactic acid production upon infection of chick embryo cells with rous sarcoma virus was also reported [105].
upon infection of hffs, human retinal pigment epithelial cells (arpe19), human embryonic lung fibroblasts (mrc5) and vero cells, hcmv enhances glycolytic flux and directs the supply of carbon from glucose to the tca cycle, which helps to facilitate fatty acid biosynthesis, whereas hsv-1 infection of the same cell lines directs central carbon metabolism towards the production of pyrimidine nucleotide components [106]. the hcv core protein suppresses mitochondrial complex i activity and impairs the function of the electron transport chain, leading to ros accumulation in hepatoma cells of transgenic mice [107]. mitochondrial involvement in virus survival is consistent with the fact that viruses require a source of energy for the active processes involved in their life cycle.

peroxisomes are single-membrane-bound dynamic organelles required for β-oxidation of fatty acids and ros metabolism (fig. 1). they have developed diverse functions that are organism and environment dependent. they are unique with respect to proliferation, as they can increase in number both by growth and division of pre-existing organelles and by formation of new organelles from the er. many viruses or viral proteins are reported to localize to peroxisomes and/or exploit their functions to facilitate replication in host cells [108]. the presence of a peroxisome targeting signal (pts) in the protein sequence is reported to be essential for targeting proteins to peroxisomes [109]. however, it was also proposed that viral proteins without a pts may associate with host peroxisomal proteins in the cytosol, which then ferry the viral protein into peroxisomes in a piggyback fashion [108]. for instance, hcmv encodes a protein called vmia that is reported to localize to peroxisomes in hffs and hepg2 cells [110]. the interaction of vmia with the host pex19 protein aids the localization of the viral protein to peroxisomes.
fragmentation of peroxisomes upon vmia expression in hepg2 cells was also reported [110]. in addition, peroxisomal localization of two cymbidium ring spot virus proteins, p33 and p92, was reported in yeast [111]. a role for peroxisomes in the replication of tbsv has also been reported. the viral replication protein p33 interacts with the host peroxisomal protein pex19 for its targeting to the peroxisomal membrane and subsequent replication [112, 113]. the host hsp70 protein was reported to promote the localization of the viral replication proteins to the peroxisomes in yeast cells [114]. the n-terminal protease n(pro) of pestivirus is another example of a viral protein that associates with peroxisomes and facilitates viral survival and replication [115]. some members of the family tombusviridae remodel the peroxisomal membrane, resulting in the formation of vesicular structures called "profoundly modified peroxisomes" or "peroxisome derived multivesicular bodies" [108]. for example, infection with cymbidium ring spot virus resulted in the formation of small vesicles at the periphery of the peroxisomes in plant leaf tissues [116]. at the later stage of infection, the entire peroxisomal matrix is occupied by these vesicles [116]. similarly, the cucumber necrosis virus (cnv) induces the formation of peroxisomal vesicles, in which viral rna replication occurs, in yeast cells [117]. studies revealed upregulation of peroxisomal biogenesis in endothelial cells upon latent infection with kaposi's sarcoma-associated herpesvirus [118]. it was also reported that proteins involved in peroxisomal lipid metabolism were essential for the survival of latently infected human dermal microvascular endothelial cells and lymphatic endothelial cells [118]. interestingly, hiv infection reduces the number of peroxisomes in infected cells owing to upregulation of micrornas that inhibit the production of peroxisome biogenesis factors [119].
it has been reported that infection of a549 and hek293t cells with west nile virus (wnv) and denv results in the degradation of peroxisomes [120]. the capsid proteins of both denv and wnv were reported to interact with the peroxisomal protein pex19, which is required for peroxisome biogenesis. degradation of pex19 and its redistribution from the perinuclear region to the juxtanuclear region were reported in the infected cells. in addition, a reduced level of the antioxidant enzyme catalase was observed in the infected cells. these alterations resulted in an impairment of the early antiviral signalling of the peroxisomes. expression of the pts-containing vp4 protein of rotavirus in ma104 cells resulted in its peroxisomal localization [121]. such localization was speculated to exploit peroxisomal lipid metabolism to supply cholesterol for lipid raft synthesis, which is required for viral replication. peroxisomal β-oxidation produces myristoyl-coa by shortening fatty acid chains; this could be exported to the cytosol for n-myristoylation of the rotavirus proteins vp2 and vp6 [108]. studies on hiv using the yeast two-hybrid system suggested an interaction between the viral nef protein and a peroxisomal thioesterase [122]. further studies reported an increase in the enzymatic activity of human acyl-coa thioesterase 3 on binding to the nef protein [123]. the increased activity of the peroxisomal enzyme was speculated to contribute to alteration of the subcellular morphology and to downregulation of the host antiviral response [108].
extensive modification of proteins for proper functioning and targeting in a cell takes place at the golgi apparatus. other functions of the golgi that are indispensable for the cell are carbohydrate synthesis and lipid transport. the unique stacked structural organization of the golgi is essential for these functions (fig. 1).
the golgi apparatus is composed of three regions, namely the cis golgi network (cgn), the medial golgi and the trans golgi network (tgn). various viruses and viral proteins have been identified to localize to these golgi regions during their life cycle. for example, infection of hela cells with adeno-associated virus type 5 (aav-5) led to an accumulation of the viral particles in the tgn and in the golgi network-associated coated vesicles [124]. in sars-cov, the transmembrane domain of the orf7b protein contains the retention signal required for accumulation of the protein in the cgn and tgn [125]. reports suggest that bunyaviruses assemble in the golgi as a result of a retention signal in the viral glycoprotein (gn) [82, 126]. fragmentation of the golgi body is a common phenomenon that occurs as a result of various viral infections (fig. 2). orf virus (a parapox virus) causes disruption and fragmentation of the golgi in vero cells. this structural modification affects the late vesicular export machinery and results in downregulation of the host immune response [127]. a similar phenomenon was also reported upon expression of the 3c protease of foot-and-mouth disease virus in vero cells [128]. infection of wi-38, 293t, and h1-hela cells with hrv1a was reported to induce fragmentation of the golgi body and rearrangement of the golgi membranes into vesicles that are utilized as sites of rna replication [129]. another instance of virus-induced golgi body fragmentation was observed upon infection with sars-cov, which resulted in the death of vero cells [130]. many rna viruses were also found to alter the integrity of the golgi complex of the host cells. for example, the n-terminal non-structural protein of norwalk virus, when expressed in crandell-rees feline kidney (crfk) and hela cells, co-localizes with the golgi complex and induces its disassembly into discrete aggregates [131].
another example is pv infection of vero cells, which results in complete disruption of the cgn into fragments scattered throughout the cytoplasm. the expression of pv protein 2b in vero, normal rat kidney (nrk) and cos-7 cells also resulted in disassembly of golgi bodies [132]. similarly, expression of protein 3a of avian encephalomyelitis virus (aev) in chick embryo brain (ceb) cells and cos-7 cells resulted in depletion of golgi stacks and in severe disassembly of golgi bodies [133]. an interesting alteration in the morphology of the golgi apparatus was observed upon infection of n. benthamiana plants with turnip mosaic virus (tumv): tumv infection was observed to induce amalgamation of the golgi apparatus, er and chloroplast [134]. poxvirus infection of mccoy mouse fibroblast cells led to accumulation of golgin-97, a tgn resident protein, in the viral factories [135]. similarly, the 3a proteins of some picornaviruses were found to interact with a golgi resident protein, namely the golgi adaptor protein acyl coenzyme a (acyl-coa) binding domain protein 3 (acbd3/gcp60), which acts as an adaptor to recruit phosphatidylinositol 4-kinase class iii beta (pi4kiiiβ) in infected cells [136]. evidence shows that the tomato spotted wilt virus (tswv) glycoprotein gn localizes to golgi membranes and induces deformation of the membranes into pseudo-circular and pleomorphic structures in tobacco plant cells [137]. upon infection of hela cells, rabbit kidney cells and mouse monocyte-macrophage cell lines with vacv, viral progeny become enwrapped in membrane derived from tgn cisternae to form the enveloped virus [138]. several viral proteins localize to the golgi body and mature by undergoing post-translational modifications such as glycosylation and phosphorylation. rubella virus was reported to undergo golgi-dependent maturation upon infection of bhk-21 and vero cells [139].
the inner tegument protein pul37 of the dna virus hsv was identified as responsible for directing the viral capsids to the tgn for secondary envelopment in different cell lines [140]. the glycoproteins of bunyamwera virus undergo primary maturation by modifying their sugar composition in the tgn upon infection of bhk-21 and vero cells [141].
lipid droplets are single membrane-bound organelles with a lipid core that primarily consists of neutral lipids like triacylglycerols (fig. 1). they are dynamic organelles essential for lipid storage and metabolism in a cell. the stored lipids can be used to generate energy and maintain energy homeostasis, and lipid droplets also play a major role in membrane trafficking [142]. many rna viruses exploit this energy storing capacity of lipid droplets to facilitate their replication [142]. upon expression, various viral proteins are reported to localize to the lipid droplets [143]. for example, upon hcv infection, the hcv ns5a protein localizes to the surface of the lipid droplets with the help of the host factor diacylglycerol acyltransferase 1 (dgat1) in huh7 and hek293t cells [143]. another study reported that hcv core 3a protein downregulates the expression of phosphoinositide 3-kinase (pi3k)/phosphatase and tensin homolog (pten), which in turn induces the accumulation of enlarged lipid droplets in human huh7 and hepg2 cells [144]. in another example, denv c protein was reported to bind and interact with perilipin 3, a protein present on the surface of the lipid droplets, in hepg2 cell lines [145]. a similar localization of denv c protein was also observed upon denv2 infection of bhk-21, hepg2, and c6/36 ht mosquito cells of a. albopictus [146].
additionally, localization of the denv c protein on lipid droplets was found to be essential for denv2 replication [146]. denv induces autophagy and results in a reduction in lipid droplet area in huh7.5 cells, a subline derived from huh7 cells [147, 148]. further analysis of the infected cells showed reduced levels of triglycerides in the lipid droplets, suggesting that atp generation by β-oxidation of fatty acids is essential for robust replication of viral rna [147, 148]. another study reports that rotavirus recruits lipid droplets into its viroplasm compartments upon infection of ma104, caco-2, bsc-1, and cos-7 cell lines [149]. confocal microscopy studies showed that two ld-associated proteins, namely perilipin a and adrp, colocalize with rotaviral proteins present in the viroplasm [149]. overexpression of the hbv x (hbx) protein was found to induce lipid accumulation in hepatic cells [150, 151]. na and colleagues found that hbx induces a pathway involving the expression of liver x receptor and its associated genes, resulting in accumulation of lipid droplets in hepg2 cell lines [152].
lysosomes are single membrane-bound organelles which house enzymes involved in the degradation of various extracellular and intracellular materials, such as proteins and pathogens, through the phagocytic or autophagic pathway. plant and fungal vacuoles have degradative and storage functions similar to those of lysosomes (fig. 1). the specialized acidic lumen is the unique feature of these organelles and is needed to keep the enzymes active. viruses utilize lysosomal enzymes to facilitate their replication and release within the host cell [153]. it has been suggested that lysosomal enzymes are involved at different stages of vacv and mouse hepatitis virus replication [153]. the authors proposed that viruses may recruit lysosomal enzymes into phagosomes in order to uncoat and release their genome in the host cell.
a possible role for lysosomal enzymes in the enhancement of glycolysis in virus-infected cells was also proposed. recent studies reported the accumulation of hav progeny in the lysosomes of host hepg2 cells. the maturation of these viral particles was reported to be catalysed by the lysosomal protease [154]. sv40 infection of bsc-1 and 3t3 cell lines resulted in swelling of lysosomes followed by the release of lysosomal enzymes into the cytoplasm [155]. since lysosomes play a major role in the host's antiviral response, they are likely to become a target for certain viruses. the x protein of hbv inhibits lysosomal acidification, leading to loss of lysosomal function [156]. however, the lysosome's ability to fuse with autophagosomes was found to remain unaffected. this virus-induced accumulation of immature lysosomes, which results in the suppression of autophagic degradation, was followed by the development of hbv-associated hepatocellular carcinoma [156]. deglycosylation of the lysosome-associated membrane proteins by the neuraminidase (na) of h5n1 influenza virus resulted in destruction of lysosomes. this was followed by cell death due to the release of hydrolytic lysosomal enzymes into the cytoplasm [157]. shubin and colleagues studied the expression of the 3c protease of hav in a549 and human lung epidermoid carcinoma (calu-1) cell lines and found that it induces the development of non-acidic cytoplasmic vacuoles which originate from several types of lysosomal/endosomal organelles [158]. similarly, several viruses such as hiv, adenovirus and pv have been reported to cause lysosomal rupture [159]. in tobacco plants infected with cucumber mosaic virus (cmv), the viral replicase complex, which comprises the replicase-associated protein cmv1a and the rna-dependent rna polymerase protein cmv2a, was reported to localize on the vacuolar membrane [160].
singapore grouper iridovirus (sgiv) infection of grouper embryonic cells (gecs) from the brown-spotted grouper epinephelus tauvina results in the formation of a large intracellular vacuole for viral accumulation. later, the virus recruits the host cytosolic membrane-bending proteins in order to induce tubulation of the vacuolar membrane [161]. the individual vacuoles fuse together to form a large vacuole, which in turn fuses with the cell membrane. these events aid in the release of virions [161]. semliki forest virus (sfv) targets the vacuolar atpase to cause acidification of vacuoles, resulting in low intraluminal ph [162]. upon infection of bhk-21 cells with sfv, the low ph of the endocytic vacuoles triggers membrane fusion essential for viral pathogenesis [163]. vacuolar proton atpase activity leading to low intra-endosomal ph was reported to be required for the entry of reovirus into the host cells [164]. acidification of vacuoles by cellular v-atpase in human dermal fibroblast cells upon infection with hcmv was reported to be required for the formation of the specialized compartment for virion assembly [165].
apart from the well-defined organelles discussed above, viruses also modify and hijack various vesicular structures and protein complexes of the host cell to ensure efficient infection. we discuss these essential compartments/structures in this section. endosomes are required for internalization of extracellular material into the cell and either deliver it to lysosomes for degradation or recycle it back to the plasma membrane. multivesicular bodies are vesicular compartments of the endocytic pathway and contain intraluminal vesicles. autophagosomes are double-membrane vesicular structures that sequester cytoplasm containing proteins, organelles, etc. and direct them to degradation (fig. 1). most viruses enter cells efficiently via the endocytic pathway, which comprises various endosomes.
several viruses such as influenza, sfv, vsv, sv40 and ebola use different endocytic pathways, such as clathrin-mediated endocytosis, caveolae or macropinocytosis, to gain entry into cells [6,166-168]. viruses modify or induce the formation of vesicles like endosomes in order to facilitate their multiplication inside the host cells [169]. sfv modifies the endosomal and lysosomal membranes of infected bhk-21 cells for the construction of its replication site [170]. further, the endosomes and lysosomes were reported to fuse together, resulting in the formation of cytoplasmic vacuoles [170]. upon infection of cv-1 cells, sv40 was reported to trigger the formation of endocytic vacuoles that migrate to and fuse with the nuclear membrane, leading to the migration of virions into the nucleus of the infected cell [171, 172]. endosomal sorting complexes required for transport (escrt) catalyse the invagination of the endosomal membrane, resulting in the formation of multivesicular bodies (mvb) in eukaryotic cells [173]. mvb act as an intermediate for transporting ubiquitinated or misfolded proteins to lysosomes [173]. mvb are also reported to play an active role in endolysosomal transport and budding of viruses. several rna and dna viruses hijack the escrt machinery to facilitate their release from infected cells [169]. the matrix protein vp40 recruits the tsg101 protein and the escrt-1 complex, consisting of vps28, vps37b and vps4, in order to direct the ebola virus into multivesicular bodies, which assists in its budding [174]. hoffmann and colleagues reported that upon infection of hepatoma-derived cell lines with the dna virus hbv, cellular α-taxilin acts as an adaptor for the binding of the large hbv surface antigen to escrt components. this aids in recruiting the escrt machinery for the release of hbv-dna containing particles [175].
various rna viruses alter autophagosomes and induce or suppress autophagy in order to complete their life cycle or to escape from the host antiviral response [176]. it has been reported that coxsackievirus b3 infection triggers the formation of autophagosomes in hela and hek293 cells [177]. prevention of lysosomal fusion with autophagosomes also enhanced coxsackievirus replication in the host cells. wild-type mouse embryonic fibroblast (mef) and huh7 cells also exhibit enhanced autophagosome formation upon denv2 infection [178]. viruses that modulate cellular autophagy to their advantage have been reviewed elsewhere [176].
the proteasome is a complex of proteases that selectively degrades intracellular proteins. this complex is tightly regulated and identifies polyubiquitinated proteins for degradation. recent studies identified that viruses hijack the ubiquitin-proteasome system (ups) in several ways to enhance their infectivity [179]. this is achieved by degradation of host proteins of the ups, enhancing the function of viral proteins by modifications, hampering the modifications of signalling molecules of innate immunity, etc. [179]. virus-induced ubiquitination and subsequent proteasomal degradation of p53 is a strategy employed by dna viruses such as hpv and adv [180]. viral protein x of hiv-2 was reported to be responsible for the ubiquitination and proteasomal degradation of the host protein samhd1, which inhibits hiv infection [181]. an interesting strategy was reported for denv, in which the viral protein ns5 stimulates proteasomal degradation of stat2, blocks type 1 ifn signalling and thus evades the host immune mechanisms [182]. similarly, selective degradation of ns3 (a non-structural protein) of the zika virus via a proteasome-mediated pathway was recently reported [183]. this proteasome-dependent degradation of the viral protein was proposed to be a host antiviral mechanism. a role for the ups has also been reported in plant viral infections.
components of rna silencing such as argonaute 1 are targeted for proteasomal degradation by viruses such as potato virus x and enamovirus for efficient infection [184, 185].
this review summarizes the modifications that organelles undergo upon viral infection of a cell. understanding organelle dynamics under various conditions is a fundamental question that has attracted researchers for a long time. the importance of organelle dynamics and function is highlighted by many examples of diseases/disorders in which they are affected. viral infections alter the shape, content, dynamics and eventually the function of organelles. not only viruses but also pathogenic bacteria have been reported to alter organelles for their survival and infection. as emphasized in this paper, recent studies have shown that many viruses encode proteins that are targeted to various cellular organelles and control their functions. a close relationship certainly exists between organelle dynamics and viral infections, but thorough characterization is needed to establish its relevance to pathogenesis. advanced methods in microscopy and proteomics have enabled such characterization, and the molecular details of virus-host interactions and viral replication in host cells are now understood in detail for a few viruses. it is important to determine how various organelle proteins are temporally and spatially regulated upon viral infection, leading to altered functions. it is now unequivocally accepted that organelles do not function as independent entities and that inter-organelle communication (cross talk) is required for optimum functioning of a cell. this is another very interesting aspect to be explored to enhance our understanding of virus-host interaction mechanisms and to enable the design of new antiviral strategies. our understanding of organelle-virus interactions has been rapidly increasing with the advent of new molecular biology tools and advanced imaging techniques.
although we know the basic modus operandi of viruses, a novel virus might behave differently from the existing dogma with respect to host cellular architecture.
the authors declare no conflict of interest.
ground control to major tom: mitochondria-nucleus communication the upsides and downsides of organelle interconnectivity an organelle-specific protein landscape identifies novel diseases and molecular mechanisms hiv-1 and the host cell: an intimate association virulence and pathogenesis virus entry by endocytosis signaling organelles of the innate immune system lipid droplet density alters the early innate immune response to viral infection virus factories: associations of cell organelles for viral replication and morphogenesis exploring and exploiting proteome organization during viral infection nuclear remodelling during viral infections how viruses access the nucleus the road to chromatin -nuclear entry of retroviruses integration of murine leukemia virus dna depends on mitosis two nuclear localization signals in the hiv-1 matrix protein regulate nuclear import of the hiv-1 pre-integration complex characterization of hiv-1 vpr nuclear import: analysis of signals and pathways hiv-1 infection of nondividing cells through the recognition of integrase by the importin/karyopherin pathway integrase interacts with nucleoporin nup153 to mediate the nuclear import of human immunodeficiency virus type 1 nuclear import and export of influenza virus nucleoprotein application of bioinformatics-coupled experimental analysis reveals a new transport-competent nuclear localization signal in the nucleoprotein of influenza a virus strain importin alpha nuclear localization signal binding sites for stat1, stat2, and influenza a virus nucleoprotein function of dynein and dynactin in herpes simplex virus capsid transport microtubule-mediated transport of incoming herpes simplex virus 1 capsids to the nucleus herpes simplex virus type 1 entry into host cells: 
reconstitution of capsid binding and uncoating at the nuclear pore complex in vitro import of adenovirus dna involves the nuclear pore complex receptor can/nup214 and histone h1 nucleoporin 153 arrests the nuclear import of hepatitis b virus capsids in the nuclear basket phosphorylation-dependent binding of hepatitis b virus core particles to the nuclear pore complex nuclear import of hepatitis b virus capsids and release of the viral genome role of nuclear pore complex in simian virus 40 nuclear targeting pushing the envelope: microinjection of minute virus of mice into xenopus oocytes causes damage to the nuclear envelope parvoviral nuclear import: bypassing the host nuclear-transport machinery nucleoli: composition, function, and dynamics involvement of ul24 in herpes-simplex-virus-1-induced dispersal of nucleolin involvement of the ul24 protein in herpes simplex virus 1-induced dispersal of b23 and in nuclear egress adenovirus core protein v is delivered by the invading virus to the nucleus of the infected cell and later in infection is associated with nucleoli adenovirus protein v induces redistribution of nucleolin and b23 from nucleolus to cytoplasm effects of adenovirus infection on rrna synthesis and maturation in hela cells encephalomyocarditis virus (emcv) proteins 2a and 3bcd localize to nuclei and inhibit cellular mrna transcription but not rrna transcription rhinovirus 3c protease precursors 3cd and 3cd' localize to the nuclei of infected cells the post-transcriptional regulator rev of hiv: implications for its interaction with the nucleolar protein b23 localization to the nucleolus is a common feature of coronavirus nucleoproteins, and the protein may disrupt host cell division inhibition of nuclear import and alteration of nuclear pore complex composition by rhinovirus nucleolin stimulates viral internal ribosome entry site-mediated translation quantitative proteomic analysis of a549 cells infected with human respiratory syncytial virus ifn enhance 
expression of sp100, an autoantigen in primary biliary cirrhosis hsv-1 ie protein vmw110 causes redistribution of pml mediation of epstein-barr virus ebna-lp transcriptional coactivation by sp100 adenovirus replication is coupled with the dynamic properties of the pml nuclear structure adenovirus e4 orf3 protein inhibits the interferon-mediated antiviral response inactivating a cellular intrinsic immune defense mediated by daxx is the mechanism through which the human cytomegalovirus pp71 protein stimulates viral immediate-early gene expression two ring finger proteins, the oncoprotein pml and the arenavirus z protein, colocalize with the nuclear fraction of the ribosomal p proteins sumoylation promotes pml degradation during encephalomyocarditis virus infection rabies virus p and small p products interact directly with pml and reorganize pml nuclear bodies viral interactions with the nuclear transport machinery: discovering and disrupting pathways remodelling of the nuclear lamina during human cytomegalovirus infection: role of the viral proteins pul50 and pul53 papillomavirus interaction with cellular chromatin phenotypic and cytologic studies of lymphoid cells and monocytes in primary culture of porcine bone marrow during infection of african swine fever virus wrapping membranes around plant virus infection inhibition of sterol biosynthesis reduces tombusvirus replication in yeast and plants endoplasmic reticulum: the favorite intracellular niche for viral replication and assembly direct formation of vaccinia virus membranes from the endoplasmic reticulum in the absence of the newly characterized l2-interacting protein a30.5 intracellular localisation of dengue-2 rna in mosquito cell culture using electron microscopic in situ hybridisation composition and three-dimensional architecture of the dengue virus replication and assembly sites cellular origin and ultrastructure of membranes induced during poliovirus infection remodeling the endoplasmic reticulum by 
poliovirus infection and by individual viral proteins: an autophagy-like origin for virus-induced vesicles ultrastructure and origin of membrane vesicles associated with the severe acute respiratory syndrome coronavirus replication complex sars-coronavirus replication is supported by a reticulovesicular network of modified endoplasmic reticulum infectious bronchitis virus generates spherules from zippered endoplasmic reticulum membranes the potato virus x tgbp2 movement protein associates with endoplasmic reticulum-derived vesicles during virus infection tombusviruses upregulate phospholipid biosynthesis via interaction between p33 replication protein and yeast lipid sensor proteins during virus replication in yeast opportunistic intruders: how viruses orchestrate er functions to infect cells arms race between enveloped viruses and the host erad machinery mutational analysis of glycosylation, membrane translocation, and cell surface expression of the hepatitis e virus orf2 protein cytoplasmic localization of the orf2 protein of hepatitis e virus is dependent on its ability to undergo retrotranslocation from the endoplasmic reticulum the expanding roles of endoplasmic reticulum stress in virus replication and pathogenesis bap31 and bip are essential for dislocation of sv40 from the endoplasmic reticulum to the cytosol global site-specific n-glycosylation analysis of hiv envelope glycoprotein the interplay between mitochondrial dynamics and mitophagy mitochondrial fusion, fission, and mitochondrial toxicity mitochondrial dynamics and viral infections: a close nexus hepatitis b virus disrupts mitochondrial dynamics: induces fission and mitophagy to attenuate apoptosis sars-coronavirus open reading frame-9b suppresses innate immunity by targeting mitochondria and the mavs/traf3/traf6 signalosome mitophagy in viral infections the matrix protein of human parainfluenza virus type 3 induces mitophagy that suppresses interferon responses mitophagy promotes replication of 
oncolytic newcastle disease virus by blocking intrinsic apoptosis in lung cancer cells viruses as modulators of mitochondrial functions hepatitis b virus x protein induces perinuclear mitochondrial clustering in microtubule- and dynein-dependent manners migration of mitochondria to viral assembly sites in african swine fever virus-infected cells herpes simplex virus eliminates host mitochondrial dna hepatitis c virus triggers mitochondrial permeability transition with production of reactive oxygen species, leading to dna damage and stat3 activation human cytomegalovirus pul37x1 induces the release of endoplasmic reticulum calcium stores the coxsackievirus 2b protein suppresses apoptotic host cell responses by manipulating intracellular ca2+ homeostasis role of ca2+ in the replication and pathogenesis of rotavirus and other viral infections human hepatitis b virus-x protein alters mitochondrial function and physiology in human liver cells hbx sensitizes cells to oxidative stress-induced apoptosis by accelerating the loss of mcl-1 protein via caspase-3 cascade influence of cytosolic and mitochondrial ca2+, atp, mitochondrial membrane potential, and calpain activity on the mechanism of neuron death induced by 3-nitropropionic acid viral product trafficking to mitochondria, mechanisms and roles in pathogenesis a pore way to die: the role of mitochondria in reperfusion injury and cardioprotection the myxoma poxvirus protein, m11l, prevents apoptosis by direct interaction with the mitochondrial permeability transition pore the hiv-1 viral protein r induces apoptosis via a direct effect on the mitochondrial permeability transition pore a renewed focus on the interplay between viruses and mitochondrial metabolism systems-level metabolic flux profiling identifies fatty acid synthesis as a target for antiviral therapy dynamics of the cellular metabolome during human cytomegalovirus infection glycolysis during early infection of feline and human cells with feline leukemia 
virus glycolysis in chick embryo cell cultures transformed by rous sarcoma virus divergent effects of human cytomegalovirus and herpes simplex virus-1 on cellular metabolism hepatitis c virus core protein inhibits mitochondrial electron transport and increases reactive oxygen species (ros) production viruses exploiting peroxisomes import of peroxisomal matrix and membrane proteins peroxisomes are platforms for cytomegalovirus' evasion from the cellular immune response expression of the cymbidium ringspot virus 33-kilodalton protein in saccharomyces cerevisiae and molecular dissection of the peroxisomal targeting signal localization of the tomato bushy stunt virus replication protein p33 reveals a peroxisome-to-endoplasmic reticulum sorting pathway the host pex19p plays a role in peroxisomal localization of tombusvirus replication proteins a key role for heat shock protein 70 in the localization and insertion of tombusvirus replication proteins to intracellular membranes the pestivirus n terminal protease n(pro) redistributes to mitochondria and peroxisomes suggesting new sites for regulation of irf3 by n(pro.) the fine structure of cymbidium ringspot virus infections in host tissues. iii. 
role of peroxisomes in the genesis of multivesicular bodies the role of the p33:p33/p92 interaction domain in rna replication and intracellular localization of p33 and p92 proteins of cucumber necrosis tombusvirus integrated systems biology analysis of kshv latent infection reveals viral induction and reliance on peroxisome mediated lipid metabolism micro-rnas upregulated during hiv infection target peroxisome biogenesis factors: implications for virus biology, disease mechanisms and neuropathology flavivirus infection impairs peroxisome biogenesis and early antiviral signaling identification of a type 1 peroxisomal targeting signal in a viral protein and demonstration of its targeting to the organelle binding of hiv-1 nef to a novel thioesterase enzyme correlates with nef-mediated cd4 down-regulation a novel acyl-coa thioesterase enhances its enzymatic activity by direct binding with hiv nef endocytosis of adeno-associated virus type 5 leads to accumulation of virus particles in the golgi compartment the transmembrane domain of the severe acute respiratory syndrome coronavirus orf7b protein is necessary and sufficient for its retention in the golgi complex a signal for golgi retention in the bunyavirus g1 glycoprotein orf virus interferes with mhc class i surface expression by targeting vesicular transport and golgi foot-and-mouth disease virus 3c protease induces fragmentation of the golgi compartment and blocks intra-golgi transport fragmentation of the golgi apparatus provides replication membranes for human rhinovirus 1a the open reading frame 3a protein of severe acute respiratory syndrome-associated coronavirus promotes membrane rearrangement and cell death norwalk virus n-terminal nonstructural protein is associated with disassembly of the golgi complex in transfected cells poliovirus infection and expression of the poliovirus protein 2b provoke the disassembly of the golgi complex, the organelle target for the antipoliovirus drug ro-090179 
membrane-association properties of avian encephalomyelitis virus protein 3a impact on the endoplasmic reticulum and golgi apparatus of turnip mosaic virus infection a trans-golgi network resident protein, golgin-97, accumulates in viral factories and incorporates into virions during poxvirus infection the 3a protein from multiple picornaviruses utilizes the golgi adaptor protein acbd3 to recruit pi4kiiibeta tomato spotted wilt virus glycoproteins induce the formation of endoplasmic reticulum-and golgi-derived pleomorphic membrane structures in plant cells assembly of vaccinia virus: the second wrapping cisterna is derived from the trans golgi network structural maturation of rubella virus in the golgi complex inner tegument protein pul37 of herpes simplex virus type 1 is involved in directing capsids to the trans-golgi network for envelopment key golgi factors for structural and functional maturation of bunyamwera virus emerging role of lipid droplets in host/pathogen interactions lipid droplets and viral infections down-regulation of phosphatase and tensin homolog by hepatitis c virus core 3a in hepatocytes triggers the formation of large lipid droplets dengue virus capsid protein binding to hepatic lipid droplets (ld) is potassium ion dependent and is mediated by ld surface proteins dengue virus capsid protein usurps lipid droplets for viral particle formation dengue virus-induced autophagy regulates lipid metabolism dengue virus and autophagy rotaviruses associate with cellular lipid droplet components to replicate in viroplasms, and compounds disrupting or blocking lipid droplets inhibit viroplasm formation and viral replication hepatitis b virus x protein induces lipogenic transcription factor srebp1 and fatty acid synthase through the activation of nuclear receptor lxralpha hepatitis b virus x protein induces hepatic steatosis via transcriptional activation of srebp1 and ppargamma liver x receptor mediates hepatitis b virus x protein-induced lipogenesis in 
hepatitis b virusassociated hepatocellular carcinoma activation of lysosomal enzymes in virus-infected cells and its possible relationship to cytopathic effects lysosomes serve as a platform for hepatitis a virus particle maturation and nonlytic release lysosomal changes in lytic and nonlytic infections with the simian vacuolating virus (sv40) hepatitis b virus x protein inhibits autophagic degradation by impairing lysosomal maturation neuraminidase of influenza a virus binds lysosome-associated membrane proteins directly and induces lysosome rupture protease 3c of hepatitis a virus induces vacuolization of lysosomal/endosomal organelles and caspase-independent cell death lysosomal cell death at a glance a zinc finger protein tsip1 controls cucumber mosaic virus infection by interacting with the replication complex on vacuolar membranes of the tobacco plant visualization of assembly intermediates and budding vacuoles of singapore grouper iridovirus in grouper embryonic cells involvement of the vacuolar h(ã¾)-atpase in animal virus entry membrane and protein interactions of a soluble form of the semliki forest virus fusion protein the entry of reovirus into l cells is dependent on vacuolar proton-atpase activity cellular v-atpase is required for virion assembly compartment formation in human cytomegalovirus infection endocytosis via caveolae endocytosis of simian virus 40 into the endoplasmic reticulum cellular entry of ebola virus involves uptake by a macropinocytosis-like mechanism and subsequent trafficking through early and late endosomes membrane dynamics associated with viral infection biogenesis of the semliki forest virus rna replication complex fusion of sv40-induced endocytotic vacuoles with the nuclear membrane interaction of endocytotic vacuoles with the inner nuclear membrane in simian virus 40 entry into cv-1 cell nucleus escrt complexes and the biogenesis of multivesicular bodies involvement of vacuolar protein sorting pathway in ebola virus release 
independent of tsg101 interaction identification of alpha-taxilin as an essential factor for the life cycle of hepatitis b virus divergent roles of autophagy in virus infection autophagosome supports coxsackievirus b3 replication in host cells autophagic machinery activated by dengue virus enhances virus replication irhom2 is essential for innate immunity to dna viruses by mediating trafficking and stability of the adaptor sting identification of three functions of the adenovirus e4orf6 protein that mediate p53 degradation by the e4orf6-e1b55k complex molecular determinants for recognition of divergent samhd1 proteins by the lentiviral accessory protein vpx ns5 of dengue virus mediates stat2 binding and degradation viperin restricts zika virus and tick-borne encephalitis virus replication by targeting ns3 for proteasomal degradation the silencing suppressor p25 of potato virus x interacts with argonaute1 and mediates its degradation through the proteasome pathway the enamovirus p0 protein is a silencing suppressor which inhibits local and systemic rna silencing through ago1 degradation we sincerely apologize to our colleagues for any research study or review publication not being cited in this review. this is only due to space limitation. research in the laboratory of sn is supported by grants from science and engineering research board ( key: cord-029957-q7v5gli8 authors: prabhu, d.; rajamanikandan, s.; anusha, s. baby; chowdary, m. sushma; veerapandiyan, m.; jeyakanthan, j. title: in silico functional annotation and characterization of hypothetical proteins from serratia marcescens fgi94 date: 2020-07-31 journal: biol doi: 10.1134/s1062359020300019 sha: doc_id: 29957 cord_uid: q7v5gli8 serratia marcescens, rod-shaped gram-negative bacteria is classified as an opportunistic pathogen in the family enterobacteriaceae. 
it causes a wide variety of infections in humans, including urinary, respiratory, ocular lens and ear infections, osteomyelitis, endocarditis, meningitis and septicemia. unfortunately, over the past decade, antibiotic resistance has become a serious health care issue, and an effective means to control the dissemination of s. marcescens resistance is the need of the hour. the whole genome sequence of s. marcescens strain fgi94 encodes 4434 functional proteins, among which 690 (15.56%) are classified as hypothetical proteins (hps). in the present study, we applied various bioinformatics tools, on the basis of protein family comparison, motifs, functional properties of amino acids and genome context, to assign possible functions to the hps. pseudo sequences (protein sequences that contain ≤100 amino acid residues) were eliminated from the study. although we successfully predicted functions for 483 proteins, we were able to infer them with a high level of confidence for only 108 proteins. the predicted hps were classified into various classes such as enzymes, transporters, binding proteins, cell division, cell regulatory and other proteins. the outcome of the study could help in understanding the molecular mechanisms of bacterial pathogenesis and also provide insight into the identification of potential targets for drug and vaccine development. the enterobacteriaceae family consists of more than 250 published bacterial species, and serratia is one of the most clinically important genera of the family (alnajar and gupta, 2017). bacteria belonging to the enterobacteriaceae are among the most commonly encountered organisms and are isolated from water, soil and clinical specimens. serratia species are frequently found associated with animals and plants, with either detrimental or beneficial effects. among the genus, serratia marcescens (s.
marcescens), a gram-negative bacterium, has been recognized as a crucial cause of healthcare-associated (nosocomial) infections in humans (parente et al., 2016). several strains of s. marcescens have been reported to date. the complete genome of serratia strain fgi94 has been sequenced, and its 16s rrna shows the highest nucleic acid identity (99%) with the pathogenic strain serratia rubidaea jcm1240 (aylward et al., 2013). in recent times, the majority of healthcare-associated urinary, respiratory and eye infections, osteomyelitis, wound infections, endocarditis and pulmonary infections have been attributed to s. marcescens (padmavathi et al., 2014). resistance to antimicrobial agents through intrinsic and acquired processes is a notable feature of s. marcescens. a wide variety of gene cassettes conferring resistance have been identified in the chromosome and plasmids of s. marcescens. certain strains of s. marcescens have shown resistance to antibiotics (gentamicin, cephalosporins, fluoroquinolone, cefotaxime and ceftazidime), which complicates antibiotic therapy (iguchi et al., 2014). according to the world health organization (who) report 2018, serratia spp. were classified under class-1 priority. the categorization was based on the antimicrobial resistance (amr) mechanism of the pathogens (who priority pathogens list for r&d of new antibiotics, 2018). amr has emerged as a serious threat to public health at the global level, particularly in intensive care and surgical units. resistance leads to long-term medication, higher medical costs, prolonged hospitalization and, depending on severity, death (vazirianzadeh et al., 2013). antibiotic resistance is emerging rapidly worldwide: at least 2 million people acquire antibiotic-resistant infections each year, and around 23,000 deaths were documented in the cdc report (cdc-antibiotic resistant threats report, 2013).
according to the who report, amr is one of the three major challenges facing the human race (who antimicrobial resistance, 2018). the global distribution of resistance plasmids among several bacterial species is raising alarm about amr worldwide (finley et al., 2013). recent technological innovations in next generation sequencing (ngs) generate large amounts of genomic data from a wide range of bacterial species. however, complete proteome information is available for very few bacterial species, although sound knowledge of the microbial proteome is essential to understand disease pathogenesis, virulence determinants, and bacterial survival and propagation (singh et al., 2015). for instance, in most bacterial species, nearly 30-40% of the genes in the genome are marked as unknown or hypothetical (hoskeri et al., 2010). proteins with unknown function, or conserved putative proteins that show limited connection with functionally annotated proteins, are termed hypothetical proteins (hps). hps are predicted from translated nucleic acid sequences on the basis of sequence similarity, but their existence still has to be confirmed by functional and biochemical characterization (shahbaaz et al., 2013). moreover, hps also include low-identity proteins and proteins whose function is imprecisely or vaguely described (hoskeri et al., 2010). in general, hps are classified into two major classes: (i) uncharacterized protein families (upfs), and (ii) domains of unknown function (dufs). those whose protein structures are available but which have not been functionally characterized or linked to any known functional gene are referred to as upfs, whereas those for which experimental evidence of the protein is available but which are not related to any known functional or structural domain are referred to as dufs (varma et al., 2015).
annotating the hps has many advantages: it can reveal new structure-function relationships and novel protein structures, and it helps in decoding additional pathways for a better understanding of the pathogens. functionally annotated hps may serve as potential biomarkers for the screening and diagnosis of diseases. they can also be used as pharmacological targets to design and discover novel drugs (singh et al., 2015). the function of a protein can be predicted using various strategies, such as phylogenetic profiling, mass spectrometry identification, sage analysis, lethal analysis, etc. (varma et al., 2015; gasperskaja and kučinskas, 2017). in order to minimize time and investment cost, various computational tools have been developed to aid the functional annotation process (hawkins and kihara, 2007). in the current study, we used computational tools to annotate the possible functions of the hps in s. marcescens. computational functional prediction of hps has been widely and successfully used in proteome annotation for various bacterial genomes (v. cholera, n. gonorrhoeae, c. difficile, s. aureus, m. tuberculosis, h. influenza) (shahbaaz et al., 2013; singh et al., 2015; costa et al., 2018). deciphering the function of all gene coding regions in the genome is essential to bridge the gaps in the proteome and to fully understand the pathogenicity and genome plasticity of s. marcescens. sequence retrieval. the complete set of protein sequences of s. marcescens fgi94 was identified by searching the ncbi (http://www.ncbi.nlm.nih.gov/) database. the s. marcescens genome possesses 4434 protein-coding genes, of which 690 proteins were marked as hps. all the hp sequences were retrieved for biological functional assignment using various functional annotation servers. in order to minimize misinterpretation in the functional annotation pipeline, pseudo genes (proteins having less than 100 amino acid residues) were excluded from this study.
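the length filter described above can be sketched in a few lines of python. this is a minimal illustration, not the authors' pipeline: the function names `read_fasta` and `filter_by_length` are hypothetical, the input is assumed to be plain fasta text, and the cutoff is taken as "keep sequences with more than 100 residues", which follows the abstract (≤100 removed) because the text is not fully consistent on whether exactly 100 residues is kept.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: store the previous one, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def filter_by_length(records, cutoff=100):
    """Keep records whose sequence is longer than `cutoff` residues.

    Assumption: sequences with <= cutoff residues are treated as the
    'pseudo sequences' that the study discards.
    """
    return [(h, s) for h, s in records if len(s) > cutoff]
```

applied to the 690 retrieved hp sequences, a filter of this kind would be expected to leave the 483 proteins carried forward in the study, provided the same cutoff convention is used.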
the various tools used in the functional prediction of hps in s. marcescens are tabulated in table 1. physicochemical characterization of the hps was performed using expasy proteomics tools (gasteiger et al., 2005). parameters such as molecular weight (mw), theoretical isoelectric point (pi), grand average of hydropathicity (gravy), aliphatic index and instability index were computed for the 483 hps in s. marcescens. localization of the hps was predicted using various tools including psortb (yu et al., 2010), pslpred (bhasin et al., 2005) and cello (yu et al., 2004). signal peptides were predicted using the signalp server (petersen et al., 2011). secretomep (bendtsen et al., 2005) was used to identify hps involved in the non-classical secretory pathway. transmembrane information about the hps was predicted using the tmhmm (krogh et al., 2001) and hmmtop (tusnády and simon, 1998) servers. functions of the hps were predicted using web-based tools. ncbi-blast (altschul et al., 1990), smart (letunic et al., 2012), ebi-interproscan (hunter et al., 2011) and motif (kanehisa, 1997) were used to identify the functional domains/motifs present in the hps. family level classifications of the hps were identified using pfam (finn et al., 2014), scop-superfamily (gough et al., 2001) and panther (thomas et al., 2006) families. structural level classification of protein superfamilies was predicted using the cath database (orengo et al., 1997). domain architecture of the hps was predicted using svmprot (cai, 2003), cdart (geer et al., 2002) and protonet tools (rappoport et al., 2012). virulence activity of the hps was predicted using virulentpred (garg and gupta, 2008) and vcim-pred (saha and raghava, 2006). characterization. the 483 hps having more than 100 amino acid residues were selected for physicochemical characterization and functional annotation.
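two of the parameters listed above, gravy and the aliphatic index, are simple functions of amino acid composition and can be sketched directly. the code below is an illustrative sketch and not the expasy implementation: it assumes the standard kyte-doolittle hydropathy scale for gravy and the usual definition of the aliphatic index, x(ala) + 2.9·x(val) + 3.9·(x(ile) + x(leu)), with x in mole percent. the function names are hypothetical, and residues outside the 20 standard amino acids are not handled.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence):
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    seq = sequence.upper()
    n = len(seq)
    mole_pct = {aa: 100.0 * seq.count(aa) / n for aa in "AVIL"}
    return (mole_pct["A"]
            + 2.9 * mole_pct["V"]
            + 3.9 * (mole_pct["I"] + mole_pct["L"]))
```

a positive gravy indicates an overall hydrophobic sequence and a negative value a hydrophilic one, which is how the range reported for the hps can be read.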
the physicochemical properties of the 483 hps, predicted on the basis of their amino acid sequences, are tabulated in supplementary table 1. molecular weight is an important criterion in protein functional characterization. the proteins agb82318.1 and agb83651.1 showed the lowest and highest mw of 10648.04 and 137161.31 da, respectively. the isoelectric point (pi) is the ph at which a protein has no net charge, so that its charge-based mobility is zero. prediction of pi is essential in the development of buffer systems for purification and in isoelectric focusing. the predicted pi values of the hps ranged from 3.53 to 11.83. out of 483 hps, 246 proteins were predicted to be acidic, with pi values ranging from 3.53 to 6.97. similarly, 236 proteins with pi values between 7.01 and 11.83 are considered basic, and agb83651.1 showed neutral charge. the extinction coefficients of the hps were computed for measurement in water at 280 nm, based on the number of cystine, tryptophan and tyrosine residues in the protein sequences. a higher content of these residues in a protein results in a higher extinction coefficient. small proteins of just over 100 amino acid residues containing few cystine, tryptophan and tyrosine residues showed very low extinction coefficient values. in particular, agb81485.1 and agb84113.1 have no extinction coefficient value owing to the absence of cystine, tryptophan and tyrosine residues. computing the extinction coefficient helps in the quantitative analysis of protein-protein and protein-ligand interactions for the drug development process. the instability index of the hps ranged from a minimum of -0.24 to a maximum of 119.78. the instability index estimates the stability of a protein in a test tube environment.
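the dependence of the extinction coefficient on tyrosine, tryptophan and cystine content can be made concrete with a short sketch. it assumes the commonly used molar contributions at 280 nm in water (tyr 1490, trp 5500, cystine 125 m⁻¹ cm⁻¹), which is the scheme protparam documents. the function name is hypothetical, and the option of treating every pair of cysteines as one cystine is an assumption of this illustration.

```python
def extinction_coefficient_280(sequence, cystines_formed=True):
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1).

    Assumed per-residue contributions: Tyr 1490, Trp 5500, cystine 125.
    If `cystines_formed` is True, every pair of Cys residues is counted
    as one cystine; otherwise all Cys are treated as reduced (no term).
    """
    seq = sequence.upper()
    n_tyr = seq.count("Y")
    n_trp = seq.count("W")
    n_cystine = seq.count("C") // 2 if cystines_formed else 0
    return 1490 * n_tyr + 5500 * n_trp + 125 * n_cystine
```

a sequence with no tyr, trp or cys yields zero under this scheme, which corresponds to the two hps reported as lacking an extinction coefficient value.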
the 228 hps having instability index values greater than 40 are classified as unstable, and 225 proteins having an instability index less than 40 are classified as stable. the aliphatic index is directly proportional to the content of aliphatic amino acid residues in the protein. proteins with a higher aliphatic index show higher thermal stability, especially globular proteins. the computed aliphatic index of the 483 hps falls between 38.03 and 157.52. the stability of thermophilic proteins is mainly contributed by the aliphatic amino acids (a, v, i and l), which result in higher aliphatic index values and allow the proteins to withstand a wide range of temperatures. the gravy value represents protein-water interactions and indicates how hydrophilic or hydrophobic the protein is. gravy values are computed as the sum of the hydropathy values of all amino acid side chains divided by the number of residues. the lowest gravy value among the hps is -1.478 and the highest is 1.238. proteins are characterized as drug or vaccine targets mainly on the basis of their subcellular localization. proteins located in the cytoplasmic matrix are potential drug targets, whereas both inner and outer membrane proteins can act as potential vaccine targets. knowledge of localization is therefore a crucial step in determining the function of a protein. using predictors built on trained data sets, the hps were assessed for their presence in any of these locations in the cell: cytoplasm, periplasm, extracellular, inner membrane and outer membrane. we predicted the localization of 356 of the 483 hps subjected to the study, based on concurrent hits. among the 356 hps, 60% (214) were predicted to be present in the cytoplasm. the proportions of hps in the inner membrane and outer membrane are 17.41% (62) and 2.52% (9), respectively. it is predicted that 14% (50) of the hps are present in the periplasm and 5.89% (21) in the extracellular matrix.
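the idea of accepting a localization only when the predictors concur can be sketched as a simple vote. this is an assumption-laden illustration: the text does not state the exact concurrence rule, so the sketch assumes that at least two of the three tools (psortb, pslpred, cello) must agree. the function name and the example labels are hypothetical.

```python
from collections import Counter


def consensus_location(predictions, min_agree=2):
    """Return the location on which at least `min_agree` tools agree.

    `predictions` is a list of location labels, one per tool; empty or
    missing calls are ignored. Returns None when no location reaches
    the agreement threshold (an 'unconfirmed' localization).
    """
    counts = Counter(p.strip().lower() for p in predictions if p)
    if not counts:
        return None
    location, votes = counts.most_common(1)[0]
    return location if votes >= min_agree else None


def tally(per_protein_predictions, min_agree=2):
    """Count proteins per consensus location; None marks no consensus."""
    return Counter(
        consensus_location(preds, min_agree)
        for preds in per_protein_predictions.values()
    )
```

under a rule of this kind, proteins for which all three tools disagree end up without an assigned location, which is one way the unconfirmed cases can arise.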
however, the locations of 127 proteins could not be confirmed owing to the absence of concurrent results. signal peptides are key players in determining the transport of proteins to their target location. hence, prediction of signal peptides is essential to know the transport system of a particular protein and its cleavage site. apart from certain cytoplasmic proteins, all other proteins have signal peptides that facilitate their transport in and out of the membrane to reach the target cellular location or organelle. we predicted the presence of signal peptide sequences in 116 of the 483 hps. similarly, 142 proteins were predicted to be involved in the non-classical secretory pathway. proteins secreted outside the cell make up the secretome, and these proteins are important for cell-cell communication, cell proliferation and pathogenesis. membrane proteins are involved in various processes such as transport, signaling and energy transduction; hence half of the targets used for drug development are membrane proteins. transmembrane helices were predicted in 136 proteins by the tmhmm server and in 235 proteins by the hmmtop server. prediction of membrane proteins is vital for a better understanding of drug targets and for developing potent drug molecules. the details for the 483 hps are shown in supplementary table 2. annotating the function of hps is essential for a better understanding of the entire biological system and for the development of effective drugs. we analyzed 483 hps of s. marcescens to predict their possible functions. further, the hps were analyzed for the presence of functional domains and signature motifs in the context of biological function. the results of the domain and motif analysis are shown in supplementary table 3. the sequence and structural features of the hps are shown in supplementary table 4. we successfully annotated the function of 108 of the 483 hps with high confidence.
the list of 108 proteins with their assigned functions is given in table 2. based on the analysis, functions were predicted for only 22.36% of the hps; the remaining hps did not show concurrent results, which indicates that a suitable experimental strategy has to be coupled with prediction for better functional assignment. the annotated proteins (fig. 1) were classified into six major categories: enzymes, binding proteins, cell division proteins, cell regulatory proteins, transporters and the remaining proteins involved in different biological processes. bacterial enzymes catalyze all metabolic and catabolic processes, supplying essential nutrients for growth, and are also responsible for the pathogenesis of the organism (gurung et al., 2013). we characterized 34 proteins that act as enzymes, of which 9 were shown to be transferases. transferases catalyze the transfer of a functional group (such as a methyl or glycosyl group) from one molecule (the donor) to another molecule (the acceptor). six proteins were identified as endonucleases, which function in destroying invading foreign dna (van den broek et al., 2005). two proteins, agb81206.1 and agb83112.1, were predicted to be members of the exonuclease-endonuclease-phosphatase domain superfamily, which plays a crucial role in intracellular signaling activities in bacteria (dlakic, 2000). disrupting intracellular signaling either arrests its physiological role or even kills the organism; it is therefore widely considered a potential target for drug development (kohanski et al., 2010). the protein agb83394.1 is predicted to contain an smr-like domain. smr domain-like proteins are broadly classified into three sub-families. family-1 closely relates to the c-terminal domain of muts and is presumably found in the deltaproteobacteria, firmicutes, bacteroidetes and epsilonproteobacteria phyla of bacteria and in plants.
the proteins under this family are responsible for the protection of cells from oxidative dna damage. family-2 closely relates to the c-terminal domain of muts (eukaryotes), whereas family-3 indicates the muts found in e. coli. overall, these three families of proteins are involved in endonuclease activity and are also responsible for the formation of branched dna structures (fukui and kuramitsu, 2011). the protein agb81307.1 is predicted to be an acyl carrier protein phosphodiesterase, which catalyzes the hydrolytic cleavage of the 4′-phosphopantetheine residue from acyl carrier protein, generating apo-acyl carrier protein. it also plays a significant role in the regulation of fatty acid synthesis (fischi and kennedy, 1990). regulation of fatty acid metabolism in bacteria is essential to maintain lipid homeostasis in the growth and stationary phases as well as in various physical and nutrient states (fujita et al., 2007). agb82203.1 is predicted to be a 2-methyl aconitate cis/trans isomerase, prpf. these proteins catalyze the inter-conversion of 2-methyl caa and 2-methyl taa in the 2-methylcitric acid cycle (du et al., 2017). we predicted agb83389.1 to be an elongation factor p. factor p is required for the synthesis of proteins containing polyproline motifs and functions in relieving ribosomal stalling (hersch et al., 2013). agb83655.1 is predicted to belong to the p-loop containing nucleoside triphosphate hydrolase superfamily. proteins of this superfamily function as kinases in many pathways and also work as motors that drive reactions through conformational changes (leipe et al., 2004). agb84572.1 is found to contain an oxidoreductase molybdopterin-binding domain, and these proteins are reported to be involved in h2 metabolism and bioenergetic pathways.
the proteins involved in these pathways help in the production of atp molecules through oxidation-reduction reactions, thereby providing the bacteria with energy for normal cellular activities (li et al., 2009). the observed functions of the hps help to explain the crucial role of new proteins in bacterial growth, and these proteins can be considered potential targets for drug discovery. we identified 11 hps as binding proteins. agb80434.1 is classified as a pk beta barrel domain-like protein, which is reported to be involved in chromosome formation and to be a prerequisite for the initiation of chromophore maturation. pk beta barrel domain-like proteins are understood to play a significant role in a wide variety of cellular processes including dna replication, dna repair and horizontal gene transfer (stepanenko et al., 2013). thus, such proteins are important for the survival of pathogens in the environment. we characterized agb82209.1 and agb84196.1 as leptospira immunoglobulin-like protein b; these proteins play a major role in binding fibrinogen, collagen, laminin and elastin and inhibit fibrin clot formation (lin et al., 2011). the molecular mechanism of protein binding, which plays a critical role in the coagulation cascade, platelet aggregation, tissue regeneration and immune responses, proves to be a potential target for many pathogens (choy et al., 2011). cell division in bacteria is closely linked to bacterial multiplication. knowledge of the mechanism of cell division is essential to explore novel targets in drug development. agb81074.1 and agb83861.1 are predicted to be cell division proteins (zapd). the members of the zapd family of proteins are understood to share a common role in the cell division machinery and mechanism (durand-heredia et al., 2012). the protein agb84389.1 is predicted to be a capsule assembly wzi family protein.
this protein is classified as an outer membrane protein that is involved in extracellular capsule formation in many pathogenic bacteria (bushell et al., 2013), and it can be used as a novel drug target. cell regulatory process proteins. gene regulation in prokaryotes and eukaryotes includes a wide range of mechanisms for the production of the desired gene product. this regulatory process is a complex network that controls the expression of the various transcriptional units in bacteria and presumably maintains microbial pathogenesis, growth and survival. the agb80600.1 protein was predicted to be the dna recombination protein rmuc. although the function of this protein is not clearly understood, it shows a high level of sequence similarity with myosins and structural maintenance of chromosome proteins (gaudermann et al., 2006). protein agb80981.1 is a crea (dna binding protein), which functions in the repression and control of alc regulon expression that is necessary for the ethanol utilization pathway (panozzo et al., 1998). protein agb81139.1 is a recombination regulatory (reca) protein and functions in dna pairing, strand exchange and recombinational dna repair (cox, 1991). reca proteins are multifunctional and are found in all forms of living organisms. as a recombinase, the protein exhibits dna-dependent atpase activity and also plays a role in the regulatory system that controls the induction of the sos response. as a nucleoprotein filament, it plays a role in dna strand exchange (gruenig et al., 2008). we found the agb82748.1 protein to be related to stage v sporulation protein r. beall and moran (1994) described the involvement of spovr from bacillus subtilis in spore cortex formation.
proteins agb83410.1 and agb83544.1 are antitoxin systems (reii/pare), which are present in all bacterial genomes and play a significant role in the formation of persister cells, in biofilm formation and in the pathogenesis of the organisms (wen et al., 2014). we identified agb84710.1 as the der gtpase activating protein yihi; proteins of this family are highly conserved in eubacteria and are involved in cell survival and bacterial growth (verstraeten et al., 2011). these findings suggest that inhibiting these proteins would disrupt normal cellular processes and thereby reduce bacterial pathogenicity. bacteria contain various transport proteins for the import and export of substances such as nutrients, ions, metabolites and amino acids through the cell membrane, to exclude unwanted by-products, and to adjust the cytoplasmic content of protons and salts needed for the growth and development of the microorganism. we predicted agb81989.1 and agb83037.1 to be type vi secretion system proteins; these proteins play a major role in virulence and antibacterial activity and also participate in metal ion uptake, conferring an advantage during bacteria-bacteria competition (gallique et al., 2017). agb82622.1 belongs to the multi-drug resistance efflux transporters. emre belongs to the smr family of small multidrug transporters (ninio et al., 2004). recent data suggest that overexpression of smr causes bacteria to become resistant to a wide range of antiseptics and antibiotics such as ethidium bromide, methyl viologen and intercalating dyes (ma and chang, 2004). agb82887.1 is predicted to be the manganese efflux pump mntp. metals such as manganese, copper and zinc are essential for all micro-organisms, and appropriate maintenance of metal homeostasis is important to prevent toxicity and mismetallation in both eukaryotes and bacteria (procheronet et al., 2013). such pumps, identified in various bacteria such as e.
coli (mntp) (martin et al., 2015) , streptococcus pneumoniae (mnte) (turner et al., 2015) and neisseria spp. (mntx) (veyrier et al., 2011) play an important role in removing excess manganese before levels become toxic. the protein agb83371.1 is found to be tripartite tricarboxylate transporter permease. this kind of protein is most abundant in beta-proteobacteria and plays a significant role in the transport of carboxylate (rosa et al., 2018) . we identified agb84290.1 as lipopolysaccharides export abc transporter periplasmic protein lptc. it is found that the lipopolysaccharide acts as hydrophilic barrier to wide range of hydrophobic antibiotics and plays a counterpart in the pathogenicity of the organisms (hicks and jia, 2018) . we categorized 39 hypothetical proteins function in different biological process under one category named other proteins. the protein agb80728.1 was predicted as translocation and assembly module (tam), which plays a major role in outer membrane biogenesis and virulence mechanisms in bacterial kingdom (josts et al., 2017) . agb8163.1 is found to be sh3 domain and it has been identified in various bacteria and plays a critical role in the targeting domains involved in bacterial cell wall recognition and metal binding (kamitori and yoshida, 2015) . we identified agb82063.1 as brct domain, and it is recognized as a c-terminal region of brca1 gene which participates in the signal transduction and in protein targeting motif in the dna damage response system (woods et al., 2012) . further, structural analysis of brct domain reveals that it may appear either in singleton (single brct) or tandem pair (double brct) and it also has a phosphate binding site in which the phosphate molecules are bound either in the dna end or in the phosphopeptide (sheng et al., 2011) . 
we found that the protein agb82090.1 belongs to the ycei protein family, a subgroup of the lipocalin superfamily, and plays a role in isoprenoid quinone metabolism, transport and storage (handa et al., 2005). the proteins agb81427. involving sugar modification and also play a role in oxalate metabolism. we predicted agb82764.1 to be an outer membrane lipoprotein of the slp family, which is encoded by the gene xac1113 (ferreira et al., 2016). bacterial lipoproteins are a class of membrane-anchored proteins and play an important role in bacterial physiology and pathogenesis (zuckert, 2014). de souza et al. (2004) reported the involvement of slp in biofilm formation. the protein agb83581.1 is found to contain a tetratricopeptide repeat-like domain. proteins of this kind are found in a wide variety of prokaryotic and eukaryotic organisms, play a vital role in cell processes and are associated with the virulence mechanisms of bacterial pathogens (cerveny et al., 2013). agb83707.1 is predicted to be the outer membrane lipoprotein rcsf, presumably involved in a signal transduction pathway (shiba et al., 2012). we identified the agb84181.1 protein as an antibiotic biosynthesis monooxygenase. increased production of reactive oxygen species (ros) causes oxidative damage to cells. the primary ros generated within mitochondria are limited to flavoproteins and flavins that activate molecular oxygen. the generated monooxygenases perform a variety of functions in all living organisms and catalyze a wide range of reactions, such as participating in the respiratory chain within the cytoplasmic membrane and protecting the cell against external reactive oxygen species (khan et al., 2017 jer, 2005) . these lipoproteins are universally distributed in the bacterial kingdom and account for about 1-3% of the total genome. these lipoproteins play pivotal roles in many physiological and cellular processes such as cell wall metabolism, antibiotic resistance, nutrient uptake, cell division, signal transduction and virulence. 
thus, the identified hps are predicted to play an important role in the survival and pathogenesis of the pathogens, and the identified functions of the hps may provide clues to therapeutic targets for drug design and development. virulence factors enable bacteria to invade and colonize the host, cause disease and overcome host defenses. therefore, understanding the molecular mechanisms of microbial virulence is central to understanding bacterial pathogenesis. we therefore used vicmpred and virulentpred, which are svm-based methods, to predict bacterial virulence factors among the 483 hps. the predicted results are shown in supplementary table 5. we found 29 hps to be virulence factors, 226 to be involved in cellular processes, 15 in information and storage and 120 in metabolism. all these proteins can be used as potential targets for drug design and development. in general, about 30-50% of the genes in a sequenced genome are referred to as hypothetical proteins, and the rapid accumulation of genome data containing hps is one of the emerging challenges in modern biology. hps hamper the identification of novel drug targets that can act specifically on pathogens to combat pathogenicity. in the case of bacterial pathogens, these hps play a crucial role in the identification of potential drug targets and also enhance our understanding of their virulence capacity and pathogenicity. in this study, we have characterized the functions of 108 hps from s. marcescens with a high level of confidence using various bioinformatics approaches. various types of enzymes, transporters, cell division proteins and binding proteins were characterized, which play an essential role in the growth, survival, virulence and pathogenesis of s. marcescens. characterization of proteins on the basis of physicochemical properties and subcellular localization helps in differentiating vital drug targets from vaccine targets. 
in addition, we also identified 18 virulence proteins that are predicted to play a crucial role in the pathogenesis of the organisms. hence, this study may facilitate future studies on the predicted hps as novel therapeutic targets for the drug and vaccine development. the authors declare that they have no conflict of interest. this article does not contain any studies involving animals or human participants performed by any of the authors. phylogenomics and comparative genomic studies delineate six main clades within the family enterobacteriaceae and support the reclassification of several polyphyletic members of the family basic local alignment search tool complete genome of serratia sp. strain fgi 94, a strain associated with leaf-cutter ant fungus gardens cloning and characterization of spovr, a gene from bacillus subtilis involved in spore cortex formation non-classical protein secretion in bacteria in silico functional annotation of a hypothetical protein from staphylococcus aureus pslpred: prediction of subcellular localization of bacterial proteins the molecular mechanism of bacterial lipoprotein modification-how, when and why wzi is an outer membrane lectin that underpins group 1 capsule assembly in escherichia coli prot: web-based support vector machine software for functional classification of a protein from its primary sequence antibiotic resistant threats report tetratricopeptide repeat motifs in the world of bacterial pathogens: role in virulence mechanisms the multifunctional ligb adhesin binds homeostatic proteins with potential roles in cutaneous infection by pathogenic leptospira interrogans the reca protein as a recombinational repair system functional annotation of hypothetical proteins from the exiguobacterium antarcticum strain b7 reveals proteins involved in adaptation to extreme environments, including high arsenic resistance gene expression profile of the plant pathogen xylella fastidiosa during biofilm formation in vitro functionally 
unrelated signaling proteins contain a fold similar to mg 2+ dependent endoculeases genetic and biochemical characterization of a gene operon for trans-aconitic acid, a novel nematicide from bacillus thuringiensis identification of zapd as a cell division factor that promotes the assembly of ftsz in escherichia coli unravelling potential virulence factor candidates in xanthomonas citri subsp. citri by secretome analysis the scourge of antibiotic resistance: the important role of the environment pfam: the protein families database isolation and properties of acyl carrier protein phosphodiesterase of escherichia coli regulation of fatty acid metabolism in bacteria structure and function of the small muts-related domain the type vi secretion system: a dynamic system for bacterial communication, front. microbiol virulentpred: a svm based prediction method for virulent proteins in bacterial pathogens the most common technologies and tools for functional genome analysis protein identification and analysis tools on the expasy server analysis of and function predictions for previously conserved hypothetical or putative proteins in biochmannia floridanus cdart: protein homology by domain architecture, genome res assignment of homology to genome sequences using a library of hidden markov models that represent all proteins of known structure reca-mediated sos induction requires an extended filament conformation but no atp hydrolysis a broader view: microbial enzymes and their relevance industries, medicine, and beyond crystal structure of a novel polyisoprenoidbinding protein from thermus thermophilus hb8 function prediction of uncharacterized proteins divergent protein motifs direct elongation factor p-mediated translational regulation in salmonella enteric and escherichia coli structural basis for the lipoplysaccharide export activity of the bacterial lipopolysaccharide transport system functional annotation of conserved hypothetical proteins in rickettsia massiliae mtu5 
interpro in 2011: new developments in the family and domain prediction database genome evolution and plasticity of serratia marcescens, an important multidrug-resistant nosocomial pathogen the structure of a conserved domain of tamb reveals a hydrophobic β taco fold structure-function relationship of bacterial sh3 role of nanomaterials in plants under challenging environments how antibiotics kill bacteria: from targets to networks predicting transmembrane protein topology with a hidden markov model: application to complete genomes stand, a class of p-loop ntpases including animal and plant regulators of programmed cell death: multiple, complex domain architectures, unusual phyletic patterns, and evolution by horizontal gene transfer smart 7: recent updates to the protein domain annotation resource a molybdopterin oxidoreductase is involved in h 2 oxidation in desulfovibrio desulfuricans g20 leptospira immunoglobulin-like protein b (ligb) binding to the c-terminal fibrinogen αc domain inhibits fibrin clot formation, platelet adhesion and aggregation structure of the multidrug resistance efflux transporter emre from escherichia coli the escherichia coli small protein mnts and exporter mntp optimize the intracellular concentration of manganese the membrane topology of emre-a small multidrug transporter from escherichia coli cath-a hierarchic classification of protein domain structures 1-dimethylethyl) of marine bacterial origin inhibits quorum sensing mediated biofilm formation in the uropathogen serratia marcescens the crea repressor is the sole dna-binding protein responsible for carbon catabolite repression of the alca in aspergillus nidulans via its binding to a couple of specific sites serratia marcescens resistance profile and its susceptibility to photodynamic antimicrobial chemotherapy, photodiagn. 
photodyn signalp 4.0: discriminating signal peptides from transmembrane regions iron, copper, zinc, and manganese transport and regulation in pathogenic enterobacteria: correlations, between strains, site of infection and the relative importance of the different metals transport systems for virulence, front protonet 6.0: organizing 10 million protein sequences in a compact hierarchical family tree tripartite atp-independent periplasmic (trap) transporters and tripartite tricarboxylate transporters (ttt): from uptake to pathogenicity vicmpred: an svm-based method for the prediction of functional proteins of gramnegative bacteria using amino acid patterns and composition functional annotation of conserved hypothetical proteins from haemophilus influenzae rd kw20 functional evolution of brct domains from binding dna to protein exploring the relationship between lipoprotein mislocalization and activation of the rcs signal transduction system in escherichia coli functional annotation and classification of the hypothetical proteins of neisseria meningitidis h44/76 beta-barrel scaffold of fluorescent proteins: folding, stability and role in chromophore formation applications for protein sequence-function evolution data: mrna/protein expression analysis and coding snp scoring tools manganese homeostasis in group a streptococcus is critical for resistance to oxidative stress and virulence principles governing amino acid composition of integral membrane proteins: application to topology prediction dnatension dependence of restriction enzyme activity reveals mechanochemical properties of the reaction pathway dna binding: a novel function of pseudomonas aeruginosa type iv pili the first report of drug resistant bacteria isolated from the brown-banded cockroach, supella longipalpa the university conserved prokaryotic gtpases, microbiol a novel metal transporter mediating manganese export (mntx) regulates the mn tofe intracellular ration and neisseria meningitidis virulence 
toxin-antitoxin systems: their role in persistence, biofilm formation and pathogenicity charting the landscape of tandem brct domain-mediated protein interactions predicting subcellular localization of proteins for gram-negative bacteria by support vector machines based on n-peptide compositions psortb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes secretion of bacterial lipoproteins: through the cytoplasmic membrane, the periplasm and beyond supplementary materials are available for this article at https://doi.org/10.1134/s1062359020300019 and are accessible for authorized users. key: cord-275993-isff6lp2 authors: han, dong p; kim, hyung g; kim, young b; poon, leo l.m; cho, michael w title: development of a safe neutralization assay for sars-cov and characterization of s-glycoprotein date: 2004-08-15 journal: virology doi: 10.1016/j.virol.2004.05.017 sha: doc_id: 275993 cord_uid: isff6lp2 the etiological agent of severe acute respiratory syndrome (sars) has been identified as a novel coronavirus sars-cov. similar to other coronaviruses, spike (s)-glycoprotein of the virus interacts with a cellular receptor and mediates membrane fusion to allow viral entry into susceptible target cells. accordingly, s-protein plays an important role in virus infection cycle and is the primary target of neutralizing antibodies. to begin to understand its biochemical and immunological properties, we expressed both full-length and ectodomain of the protein in various primate cells. our results show that the protein has an electrophoretic mobility of about 160–170 kda. the protein is glycosylated with high mannose and/or hybrid oligosaccharides, which account for approximately 30 kda of the apparent protein mass. 
the detection of s-protein by immunoassays was difficult using human convalescent sera, suggesting that the protein may not elicit a strong humoral immune response in virus-infected patients. we were able to pseudotype murine leukemia virus particles with s-protein and produce sars pseudoviruses. pseudoviruses infected vero e6 cells in a ph-independent manner and the infection could be specifically inhibited by convalescent sera. consistent with low levels of antibodies against s-protein, neutralizing activity was weak, with 50% neutralization titers ranging from 1:15 to 1:25. to facilitate quantifying pseudovirus-infected cells, which are stained blue with x-gal, we devised an automated procedure using an elispot analyzer. the high-throughput capacity of this procedure and the safety of using sars pseudoviruses should make possible large-scale analyses of neutralizing antibody responses against sars-cov. during the first epidemic of severe acute respiratory syndrome (sars), which began in november of 2002 in guangdong province of the people's republic of china and lasted for about 7 months, close to 8100 people were infected worldwide, among which 774 people died (who, 2003). the etiological agent of this atypical respiratory disease has been identified as a novel coronavirus (designated as sars-cov; fouchier et al., 2003; ksiazek et al., 2003; peiris et al., 2003; poutanen et al., 2003). with a mortality rate of over 9%, sars-cov had a major health and socioeconomic impact. fortunately, there have been very few incidences of sars infections during the winter season of 2003-2004. however, with multiple modes of virus transmission and a wide range of potential nonhuman reservoirs, including wild animals commonly found in markets (e.g., civet cats and raccoon dogs; guan et al., 2003) as well as domestic cats (martina et al., 2003), it is highly likely that a virus of this nature will resurface in the future. 
currently, there are no antiviral drugs, immunotherapeutic agents, or vaccines available against the virus. to better control or prevent future epidemics, anti-sars-cov drugs and/or vaccines need to be developed. sars-cov belongs to the coronaviridae family. the genomic organization of the virus is similar to that of other coronaviruses, with a general order of replicase (rep; orfs-1a and 1b), spike (s)-glycoprotein, envelope (e), membrane protein (m), and nucleocapsid (n) in the 5′ to 3′ direction (marra et al., 2003; rota et al., 2003) (fig. 1). several open reading frames have also been identified, which may encode additional proteins (marra et al., 2003; rota et al., 2003; snijder et al., 2003). their functions, however, are not known at the present time. the protein of major interest as a target of antiviral drug development efforts, as well as for developing vaccines, is s-glycoprotein. s-protein of coronaviruses, which is thought to function as a trimer (delmas and laude, 1990), is responsible for both binding to cellular receptors and inducing membrane fusion for virus entry into target cells (collins et al., 1982; godet et al., 1994; kubo et al., 1994). mutations in the protein have been shown to alter virulence and cellular tropism (fazakerley et al., 1992; leparc-goffart et al., 1998; sanchez et al., 1999). taken together, the s-protein plays a critical role in the biology and pathogenesis of coronaviruses. not surprisingly, it is an important target of virus-neutralizing antibodies (chang et al., 2002; collins et al., 1982; fleming et al., 1983; godet et al., 1994; kant et al., 1992; kubo et al., 1993, 1994; takase-yoden et al., 1991). moreover, mice immunized with a recombinant s-protein, or a peptide derived from it, are protected from lethal challenges with murine hepatitis virus (mhv) (daniel and talbot, 1990; koo et al., 1999). 
s-protein is a type i membrane glycoprotein, which is translated on membrane-bound polysomes, inserted into rough endoplasmic reticulum (rer), cotranslationally glycosylated, and transported to the golgi complex. during the transport, s-proteins are incorporated onto maturing virus particles, which assemble and bud into a compartment that lies between the rer and golgi (lai and holmes, 2001). virions are carried from golgi to plasma membrane in secretory vesicles. virions are released from cells when virion-containing vesicles fuse with plasma membrane. excess s-proteins not incorporated onto virus particles are transported to the surface of plasma membrane (lai and holmes, 2001; tsai et al., 1999; yamada et al., 1998). s-protein of sars-cov is 1255 amino acids long (fig. 1). it is predicted to have a 13 amino acid signal peptide at the amino-terminus, a single ectodomain (1182 amino acids) and a transmembrane region followed by a short cytoplasmic tail (28 residues) at the carboxy-terminus (marra et al., 2003; rota et al., 2003). due to low sequence homology between the s-protein of sars-cov and that of the other coronaviruses (marra et al., 2003; rota et al., 2003), the structural and immunogenic properties of sars-cov s-protein must be ascertained experimentally. the cellular receptor for sars-cov has recently been identified to be angiotensin-converting enzyme 2 (ace2; li et al., 2003). the molecular interactions between the s-protein and ace2 are not yet known. better understanding of the interactions could lead to development of virus entry inhibitors. neutralizing antibodies (nabs) play a critical role in protection against a variety of viral diseases. an accurate assessment of nab responses in virus-infected patients is needed to determine immune correlates of protection. it is also an essential and integral part of a vaccine development process. conventional virus-neutralization assays require the use of replication-competent, infectious viruses. 
evaluating virus-neutralizing activity of a large number of antisera with these assays is undesirable due to safety concerns, especially for a biosafety level 3 (bsl3) pathogen like sars-cov. the same safety concerns have prompted our laboratory to utilize replication-defective pseudoviruses for hiv-1 neutralization assay (kim et al., 2001). in this assay, nonreplicating moloney murine leukemia virus (mulv) particles pseudotyped with hiv-1 envelope glycoproteins are used (schnierle et al., 1997). these pseudoviruses encode a β-galactosidase gene, which allows detection of individual infected cells when stained with x-gal (5-bromo-4-chloro-3-indolyl-β-d-galactopyranoside). in this study, we report development of a sars-cov pseudovirus neutralization assay, which should be particularly valuable for researchers who may not have easy access to a bsl3 containment facility. additionally, we describe a high-throughput system for quantitative analyses of x-gal stained cells. this assay system should facilitate rapid evaluation of antibody responses to vaccine candidates and/or entry inhibitors against sars-cov. to express sars-cov s-glycoprotein, we initially cloned a dna fragment encoding the protein into pcdna-3 vector (pcdna-s; fig. 1b). to detect s-protein, western blot was performed with convalescent sera from sars-cov-infected patients. however, no clear protein band was detected despite a number of attempts. we reasoned that one possible explanation for the inability to detect the protein is a low level of s-protein expressed from pcdna-s. to increase the amount of s-protein expressed, we subcloned the s gene into phcmv-g vector (burns et al., 1993), which expresses a high level of vesicular stomatitis virus (vsv) g glycoprotein. although we were able to express a higher amount of s-protein (see below), this was not sufficient to detect a clear band on western blots. 
an alternative explanation is that antibodies against the protein in convalescent sera cannot recognize s-protein subjected to denaturing conditions of sds-page (viz. linear epitopes). however, because results from radioimmunoprecipitation and indirect immunofluorescence assays were also ambiguous, it is most likely that the antibody titer against s-protein is very low in convalescent sera. to further increase protein expression level, s gene was subcloned into a ptm vector (moss et al., 1990). with this vector, a protein of interest is under the control of a strong t7 rna polymerase (t7rnap) promoter and the protein is expressed when cells transfected with the plasmid are infected with a recombinant vaccinia virus expressing t7rnap (vtf7-3; fuerst et al., 1986). the presence of encephalomyocarditis virus internal ribosome entry site (ires) at the 5′ end of rna transcripts allows efficient translation of mrna transcribed in cytoplasm. using ptm-s, we were able to detect a faint, but distinct band of approximately 160-170 kda by western blot (fig. 2a, lane 3). we also reevaluated pcdna-s as this vector has a dual promoter system (cmv and t7 promoter). using t7 promoter, we were able to detect a protein band of a similar size, albeit less clear than using ptm-s (lane 2). the lower expression of the protein is likely due to the lack of ires in the pcdna vector. because the calculated molecular weight of s-protein without 13 amino acid signal peptide is about 138 kda, the result suggested posttranslational modification (e.g., glycosylation). to better demonstrate this, we generated another clone (ptm-eshis) that expresses the entire ectodomain of s-protein (amino acids 1-1190) with a six-histidine tag at the carboxy terminus. the ectodomain of s-protein migrated with an approximate molecular weight of 163 kda while its calculated molecular weight is only 131.2 kda (fig. 2b, lane 2). 
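the gap between apparent and calculated mass reported above can be tallied in a few lines. the sketch below only restates the arithmetic on the values quoted in the text; the function and variable names are illustrative, and mobility in sds-page is an approximate measure of mass.

```python
# values quoted in the text, in kda; all names here are illustrative
ECTODOMAIN_APPARENT = 163.0                   # mobility of eshis in sds-page
ECTODOMAIN_CALCULATED = 131.2                 # mass computed from the sequence
FULL_LENGTH_APPARENT_RANGE = (160.0, 170.0)   # band seen with ptm-s
FULL_LENGTH_CALCULATED = 138.0                # s-protein without signal peptide


def excess_mass(apparent_kda, calculated_kda):
    """mass not accounted for by the polypeptide chain alone (kda)."""
    return apparent_kda - calculated_kda


ectodomain_excess = excess_mass(ECTODOMAIN_APPARENT, ECTODOMAIN_CALCULATED)
full_length_excess = tuple(
    excess_mass(apparent, FULL_LENGTH_CALCULATED)
    for apparent in FULL_LENGTH_APPARENT_RANGE
)

print(f"ectodomain excess: {ectodomain_excess:.1f} kda")
print(f"full-length excess: {full_length_excess[0]:.0f}-{full_length_excess[1]:.0f} kda")
```

the ectodomain excess of roughly 32 kda is the quantity that the glycosidase treatment is meant to explain.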
to demonstrate that this difference is due to glycosylation, eshis protein was treated with endoglycosidase h (endo-h) or peptide: n-glycosidase f (pngase f). as shown in fig. 2b (lanes 3 and 4), treatment with either glycosidase increased the mobility of the protein to approximately 133 kda. because the mobility of the protein treated with either glycosidase was the same, s-protein is most likely modified with high mannose and/or hybrid, rather than complex, oligosaccharides. while s-glycoprotein of some coronaviruses is cleaved into two subdomains, s1 and s2, the fact that we observed only a single band suggests that sars-cov s-protein functions as a single unit. despite difficulties in detecting s-protein directly by immunoassays, proteins expressed from both pcdna-s and phcmv-s constructs were able to pseudotype mulv particles to produce sars pseudoviruses that could readily infect vero e6 cells (fig. 3a). none of the other cell lines we tested, including hela, a549, 293t, and bs-c-1, were susceptible. this is in contrast to vsv-g pseudotyped viruses, which could infect all cell lines (data not shown). the fact that bs-c-1 cells, which, like vero e6 cells, are african green monkey kidney cells, were not susceptible was somewhat unexpected. however, when we performed infections with a high multiplicity of infection, we were able to detect infected bs-c-1, albeit at a significantly reduced titer (7-8-fold compared to vero e6 cells; data not shown). this result was not too surprising because it has been shown that even 293t cells, which express small amounts of ace2, support some basal level of sars-cov replication. many pseudovirus-infected cells appeared as a doublet, which is the result of a cell division following integration of mulv pseudovirus genome encoding β-galactosidase. these doublets are counted as a single infectious unit. 
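the rule that a doublet counts as one infectious unit can be illustrated with a small grouping routine. this is a minimal sketch under stated assumptions: stained cells are represented by hypothetical (x, y) centroids, and any two centroids closer than a chosen `max_gap` are merged with a union-find structure. it is not the algorithm of any particular counting instrument, and single-linkage merging of this kind would also fuse chains of more than two neighbouring cells.

```python
from math import hypot


def count_infectious_units(centroids, max_gap):
    """count groups of stained cells, treating neighbours as one unit.

    centroids: list of (x, y) positions of stained cells (hypothetical units)
    max_gap:   largest centre-to-centre distance still treated as a doublet
    """
    parent = list(range(len(centroids)))

    def find(i):
        # follow parent links to the group representative, compressing the path
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            (x1, y1), (x2, y2) = centroids[i], centroids[j]
            if hypot(x1 - x2, y1 - y2) <= max_gap:
                parent[find(i)] = find(j)  # merge the two groups

    return len({find(i) for i in range(len(centroids))})


# five stained cells: two doublets and one isolated cell (made-up coordinates)
spots = [(10, 10), (12, 11), (50, 50), (90, 20), (91, 22)]
units = count_infectious_units(spots, max_gap=5)
print(units)  # 3
```

the pairwise loop is quadratic in the number of spots, which is acceptable for the few hundred foci expected per well; a spatial index would be the natural replacement for denser images.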
a typical yield of sars pseudoviruses was about 2 × 10^4 infectious units per milliliter of culture supernatant using phcmv-s, which was about fivefold greater than using pcdna-3. this yield is comparable to what we have been able to achieve for hiv-1 pseudoviruses (between 2 × 10^3 and 2 × 10^4 depending on envelopes; kim et al., 2001), but lower than vsv-g pseudovirus yield (between 3 × 10^4 and 9 × 10^4 depending on target cell lines used). interestingly, sars pseudovirus production was about 20-fold less efficient when plasmid transfection was performed by calcium phosphate method compared with using cationic lipids (lipofection). this difference, however, was not observed for vsv-g pseudovirus production. an additional difference was that a longer incubation time was needed to achieve peak pseudovirus production for sars-s compared to vsv-g (3 vs. 2 days posttransfection, respectively). the reasons for these discordant results are unknown at the present time. cellular entry of coronaviruses can occur either by acidic ph-dependent or -independent pathway (gallagher et al., 1991; lai and holmes, 2001; cavanagh, 1990, 1992; nash and buchmeier, 1997; payne et al., 1990). to investigate whether sars-cov infection requires low ph, we examined sensitivity of pseudovirus infections to the lysosomotropic agents chloroquine and nh4cl. as expected, infectivity of viruses pseudotyped with vsv-g was reduced by chloroquine and nh4cl in a dose-dependent manner (fig. 3b and c, respectively). in contrast, sars pseudovirus infection was virtually unaffected, suggesting that sars-cov infection proceeds in an acidic ph-independent manner.
fig. 2b caption (fragment): (lanes 3 and 4, respectively). the protein was detected by western blot with anti-6×his antibody. no band was detected from cells transfected with an empty vector (lane 1). acrylamide gradient gel (4-12%) was used.
to assess whether sars pseudoviruses we generated could be used to quantify virus-neutralizing antibodies, we examined their susceptibility to convalescent sera from sars-cov-infected patients. as shown in fig. 4a, sera from two patients were able to specifically neutralize sars pseudoviruses; the same convalescent sera could not neutralize hiv-1 or vsv-g pseudoviruses and no neutralizing activity was observed with a normal serum. to determine neutralizing antibody titers in virus-infected patients, we performed the assay with serially diluted sera from seven patients. as shown in fig. 4b, antibody levels were quite similar in all patients with 50% neutralization titer between 1:15 and 1:25. although the pseudovirus neutralization assay is sensitive, quantitative, and safe, it has one disadvantage of having to count individual x-gal-stained cells through a microscope. to overcome this problem, we looked into the possibility of automating the data collection procedure using an elispot reader (immunospot analyzer, cellular technology ltd.). although this instrument is commonly used to quantify antigen-specific t cell cytokine responses by counting chromogenic immunospots (e.g., ifn-γ), we reasoned that it might be able to detect x-gal-stained blue cells. as shown in fig. 5, there was no problem with using the instrument to count spots at a single-cell resolution and the analysis was highly efficient as the entire 96-well plate could be processed in less than 20 min. virus-infected cells appearing as doublets did not pose a problem because parameters on the analysis software could be adjusted to count two stained cells adjacent to each other as one. the number of infectious foci counted was quite linear as a function of virus inoculum (fig. 5c), validating the methodology. this procedure could be used to quantify other assays based on x-gal staining of cells (e.g., recombinant vaccinia viruses that express β-galactosidase). 
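a 50% neutralization titer of the kind reported above can be estimated from a serial dilution series by interpolation. the sketch below is one common way to do this and is an assumption rather than the procedure used in the study: percent neutralization is computed from focus counts relative to a no-serum control, and the 50% crossing is interpolated linearly on a log10 dilution scale. the counts are made up for illustration.

```python
from math import log10


def percent_neutralization(count_with_serum, count_without_serum):
    """reduction in infectious foci relative to the no-serum control (%)."""
    return 100.0 * (1.0 - count_with_serum / count_without_serum)


def nt50(reciprocal_dilutions, neutralization, cutoff=50.0):
    """reciprocal serum dilution at which neutralization crosses `cutoff`.

    returns None when the series never crosses the cutoff.
    """
    points = sorted(zip(reciprocal_dilutions, neutralization))
    for (d_lo, n_lo), (d_hi, n_hi) in zip(points, points[1:]):
        if n_lo >= cutoff > n_hi:
            # fraction of the way from d_lo to d_hi on a log10 scale
            frac = (n_lo - cutoff) / (n_lo - n_hi)
            return 10 ** (log10(d_lo) + frac * (log10(d_hi) - log10(d_lo)))
    return None


# hypothetical focus counts for one serum; control well without serum = 200
control = 200
dilutions = [10, 20, 40, 80]
counts = [70, 110, 160, 190]
neut = [percent_neutralization(c, control) for c in counts]

titer = nt50(dilutions, neut)
print(f"50% neutralization titer is about 1:{titer:.0f}")
```

with these made-up counts the estimate falls between the 1:10 and 1:20 dilutions; fitting a sigmoidal dose-response curve is an alternative when more dilution points are available.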
in this study, we expressed sars-cov s-glycoprotein, which was able to pseudotype mulv particles. sars pseudoviruses were able to efficiently infect vero e6 cells, which have been shown to support sars-cov infection. the infection did not require low ph, suggesting viral entry is mediated by a direct fusion event between viral and plasma membranes. this result is consistent with a previous report that cell-to-cell fusion mediated by s-protein and its cellular receptor ace2 occurred at neutral ph (xiao et al., 2003). however, our result is in direct disagreement with a recently published article by simmons et al. (2004). there are three major differences in experimental procedures between the two studies. first, we pseudotyped mulv particles whereas simmons et al. used hiv-1. second, we used an authentic s-glycoprotein whereas they used a c-terminal fusion protein that included a v5 epitope and polyhistidine tag, which totaled, by our estimation, 27 extra amino acids. whether the discrepant result is due to the use of different s-glycoproteins and/or different virus cores needs to be further investigated. the third difference between the studies is the concentrations of lysosomotropic agents used. while we used nh4cl at 0-50 μm amounts, which are sufficient to inhibit vsv-g-mediated fusion (fig. 3; picard-maureau et al., 2003), they used millimolar (mm) amounts. at these concentrations, nh4cl could have a secondary effect on s-glycoprotein. we were unable to find concentrations of chloroquine used in their study. it is interesting to note that while simmons et al. observed that pseudovirus infections required low ph, s-protein-mediated cell-to-cell fusion did not. the sars pseudoviruses we generated could be specifically inhibited by convalescent sera from sars-cov infected patients, indicating that s-glycoprotein of sars-cov is a target of neutralizing antibodies as it is for other coronaviruses. 
the major purpose of generating sars pseudoviruses was to devise an assay system to assess virus-neutralizing antibodies safely and rapidly without having to use infectious, replication-competent sars-cov. the results of our study indicate that sars pseudoviruses could be used to evaluate the efficacy of various s-glycoprotein-based vaccine candidates in eliciting virus-neutralizing antibodies. they could also be used to perform structure-function analyses of s-glycoprotein. owing to the large size of the sars-cov genome, it would be difficult to perform such analyses directly in the context of the virus, not to mention the potential safety hazards of working with it. in contrast, mutational analyses of the protein could be performed readily using pseudoviruses. our attempt to characterize the biochemical and immunological properties of the s-protein was hampered by the fact that antibody titers against the protein in convalescent sera were extremely low; we were able to identify only a faint band on a western blot (with high background), and attempts to detect the protein by immunofluorescence and radioimmunoprecipitation assays were largely unsuccessful. in contrast, convalescent sera have been successfully used to detect sars-cov-infected cells by an immunofluorescence assay (hsueh et al., 2003; peiris et al., 2003). together, the available data seem to suggest that s-protein might not be very immunogenic, at least compared to other viral proteins. in fact, immunoreactivity analyses of a panel of synthetic peptides derived from s, membrane (m), and nucleocapsid (n) proteins suggested that n protein might be the most immunogenic protein. the poorly immunogenic nature of s-protein might present problems in developing a vaccine that can elicit potent neutralizing antibodies against sars-cov. in this regard, it is interesting to note that s-protein is highly glycosylated, with 23 potential asparagine-linked glycosylation sites.
based on our analyses of the ectodomain of the protein, carbohydrate residues account for approximately 30 kda (based on mobility in sds-page). the glycans were primarily high mannose and/or hybrid type. this, however, needs to be verified using proteins produced from a non-vaccinia virus expression system, because the virus infection could possibly affect the cellular glycosylation machinery. extensive glycosylation of hiv-1 envelope glycoprotein has been one of the major obstacles in eliciting good humoral responses against the protein and in developing an effective vaccine against the virus (cho, 2003). it remains to be seen whether and to what extent glycans on s-protein affect the immunogenic properties of the protein. interestingly, potential glycosylation sites are clustered into three regions of the protein (fig. 1a): n-terminal, middle, and c-terminal. it has been shown that individual glycosylation sites on hiv-1 surface glycoprotein gp120 may have different functions; while some are important for evading immune responses, others are critical for maintaining the proper protein structure necessary to interact with cellular receptors and mediate membrane fusion (ogert et al., 2001; reitter et al., 1998). additional studies are needed to determine whether glycosylation sites in different clusters of s-protein serve different functions. in the absence of an effective vaccine and/or antiviral drugs against sars-cov, early detection of virus-infected patients would be critical for effective containment of future epidemics. quantitative rt-pcr-based diagnostic assays have been described for sars-cov (grant et al., 2003; lau et al., 2003; ng et al., 2003; poon et al., 2003a, 2003b; tang et al., 2004; yam et al., 2003).
despite high sensitivity, their utility has some limitations: (i) the detection rate varies widely between 20% and 80% depending on the clinical samples and protocols used for the assay; (ii) the window of detectability is limited to early stages of infection; and (iii) the assay is not suitable for routine surveillance. antibodies against sars proteins have been shown to appear as early as 9 days after the onset of illness (hsueh et al., 2003). therefore, development of a high-throughput serology-based diagnostic could complement pcr-based assays. in this regard, a virus-neutralization assay could be used as a confirmatory test, which would enhance the accuracy of early diagnosis of sars-cov. because neutralizing antibodies are important for virus clearance, the assay could also be used to assess disease prognosis. in either case, the availability of sars pseudoviruses avoids the use of infectious sars-cov. the overall cloning strategy is shown in fig. 1b. two parental plasmids encoding a sars-cov s gene (urbani strain), pentr-s and pcr-s, were obtained from the u.s. centers for disease control and prevention. two s-protein-expressing plasmids (pcdna-s* and pcdna-s) were generated using pcdna-3 (invitrogen). the s gene in pcdna-s*, which was transferred from pentr-s (bamhi-ecori fragment), lacks the original translation stop codon taa because it was changed to the aat of the ecori restriction site (gaattc). pcdna-s with a stop codon was constructed by replacing a swai-ecori fragment of pcdna-s* with the same fragment from pcr-s. to generate phcmv-s, a bamhi-ecori fragment from pcdna-s was inserted into a bamhi site of phcmv-g following blunting of the ends with klenow. to construct ptm-s, a bamhi-xhoi fragment from pcdna-s was cloned into the corresponding sites of ptm-ndei (cho et al., 1994).
despite the fact that ptm-s has a small open reading frame that encodes eight amino acids between the internal ribosome entry site of the ptm-ndei vector and the s gene, s-protein was efficiently expressed and the plasmid was used as is without further modification. to generate ptm-eshis, the 3′ end of the ectodomain was pcr-amplified using a sense primer 5′-gtc gtc aac att caa aaa gaa-3′ (nts 3472-3492 of the s gene) and an antisense primer 5′-aat gaa gcg gat cccggg tta gtg atg gtg gtg atg atg ttg ctc ata ttt tcc caa-3′. the base-pairing region (nts 3553-3570) is shown in bold and the six histidine residues are italicized. the amplified fragment was digested with swai (nt 3521) and smai (underlined) and subsequently cloned into ptm-s digested with swai and stui. cell culture, protein expression, and western blots. all cell lines, except for vero e6, were maintained in dmem supplemented with 10% fetal bovine serum (fbs), 2 mm l-glutamine, and penicillin-streptomycin antibiotics. vero e6 cells were maintained in emem with the same supplements plus 0.1 mm nonessential amino acids. cells were cultured at 37 °c in 5% co2 incubators. to express s-protein, cells were transfected with plasmids by a calcium phosphate precipitation method. briefly, 0.5 ml of 0.25 m cacl2 solution containing 30 µg of plasmids was slowly mixed with 2× hbs (50 mm hepes, 1.5 mm na2hpo4, 280 mm nacl, ph 7.1) and the mixture was added to cells. after an overnight incubation, the culture medium was replaced and cells were incubated for two additional days. for expression from ptm-s and ptm-eshis, transfected cells were infected with vtf7-3 (fuerst et al., 1986) at a multiplicity of infection of 5. following 2 days of infection, cells were lysed with a hypotonic cell lysis buffer (10 mm tris, ph 8.0, 10 mm nacl, 1.5 mm mgcl2, 1% np-40). insoluble cell debris and nuclei were removed by a brief centrifugation in a microfuge. cell lysates were subjected to sds-page and western blot.
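the design of the antisense primer for ptm-eshis can be checked by reverse-complementing it and translating the result. the sketch below does this with the standard genetic code; it assumes the reading frame starts at the first base of the reverse complement (nt 3553 of the s gene), and the helper names are illustrative only.

```python
# antisense primer as printed in the text (5'->3'), spaces removed
ANTISENSE_PRIMER = (
    "aat gaa gcg gat cccggg tta gtg atg gtg gtg atg atg "
    "ttg ctc ata ttt tcc caa"
).replace(" ", "")

# standard genetic code, codons ordered T, C, A, G at each position
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, in upper case."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq):
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

coding_strand = reverse_complement(ANTISENSE_PRIMER)
print(coding_strand)
print(translate(coding_strand))  # LGKYEQHHHHHH
```

read this way, the 18-nt base-pairing region encodes l-g-k-y-e-q, followed by six histidine codons and a taa stop codon, with the smai site (cccggg) immediately downstream.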
s-proteins were detected with either a pool of convalescent sera (1:100 dilution) or anti-his (c-terminal) monoclonal antibody (invitrogen; 1:3000 dilution) followed by horseradish peroxidase-conjugated goat anti-human or anti-mouse igg antibody (pierce), respectively. protein bands were visualized using the supersignal west pico chemiluminescence detection system (pierce). molecular weights of the protein bands were approximated from the mobility of standard molecular weight markers. pseudoviruses were generated as previously described (kim et al., 2001). briefly, the mulv packaging cell line telceb6 (schnierle et al., 1997) was transfected with pcdna-s, phcmv-s, phcmv-g (burns et al., 1993), or pltr-gp140 (hiv-1 dh12; kim et al., 2001) using either the calcium phosphate precipitation or the lipofection (lipofectin; invitrogen) method. two days posttransfection (3 days for sars pseudovirus), cell culture medium was harvested and subjected to centrifugation (1700 × g, 10 min) to remove cell debris. the supernatant was aliquoted, stored at −80 °c, and used as a virus stock. virus titer was determined in vero e6 cells for sars-s and vsv-g or in hos-cd4-ccr5 cells (cheng-mayer et al., 1997; deng et al., 1996) for hiv-1 gp140 pseudotyped viruses. typically, cells were infected with 60-80 infectious units for 36 h. cells were washed with pbs and incubated with a fixative (1% formaldehyde, 0.05% glutaraldehyde in pbs) for 10 min at room temperature. the cells were washed twice with pbs and incubated with a freshly prepared staining solution (pbs containing 5 mm potassium ferricyanide, 5 mm potassium ferrocyanide, 2 mm magnesium chloride, and 1 mg/ml of x-gal) for >2 h at 37 °c. for routine analyses, x-gal-stained blue cells were manually counted using an inverted microscope. to determine the ph dependency of viral entry, vero e6 cells were incubated in culture medium containing 0-100 µm chloroquine for 1 h at 37 °c before adding viruses.
vsv-g or sars-s pseudoviruses were allowed to adsorb to cells for 1 h at 37 °c in the absence of chloroquine. following adsorption, the virus inoculum was removed, cells were washed, and infection was allowed to proceed for about 36 h in the absence of chloroquine. for nh4cl, cells were incubated with 0-50 µm. because of its minimal cytotoxicity, nh4cl was present throughout the infection period, including the 1 h incubation before virus addition. all infections were done in duplicate. the neutralization assay was performed as previously described (kim et al., 2001) using convalescent sera (13-50 days post-onset of symptoms) obtained from cdc or from patients hospitalized in queen mary hospital, hong kong. approximately 60-80 infectious units of pseudoviruses were incubated with serially diluted, heat-inactivated (56 °c, 30 min) convalescent or normal sera for 1 h at 37 °c. the mixture was subsequently added to vero e6 (for sars-s or vsv-g) or hos-cd4/ccr5 (for hiv-1 gp140) cells. virus infection was allowed to proceed for another 36 h. virus-neutralizing activity was determined relative to a no-serum control. the general pseudovirus infection procedure is the same as described above. the major difference was that 96-well plates with a white membrane bottom normally used for elispot assays (plate m200; bd biosciences) were utilized rather than regular tissue culture plates. the immunospot analyzer from cellular technology ltd. was used as per the manufacturer's recommendations.
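since neutralizing activity is expressed relative to a no-serum control and infections were run in duplicate, the calculation presumably reduces to a ratio of mean foci counts. the following is a minimal sketch under that assumption; the function name and the counts are hypothetical.

```python
from statistics import mean

def percent_neutralization(serum_counts, control_counts):
    """Percent reduction in x-gal-positive foci relative to no-serum wells.

    serum_counts and control_counts are foci counts from replicate wells
    (duplicates in the protocol above); their means are compared.
    """
    control = mean(control_counts)
    if control <= 0:
        raise ValueError("no-serum control must contain infected cells")
    return 100.0 * (1.0 - mean(serum_counts) / control)

# hypothetical duplicate wells: about 70 foci without serum, 35 with serum
print(percent_neutralization([34, 36], [68, 72]))  # 50.0
```

a negative value would simply mean more foci were counted with serum than without, which at these low counts is expected from well-to-well variation.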
vesicular stomatitis virus g glycoprotein pseudotyped retroviral vectors: concentration to very high titer and efficient gene transfer into mammalian and nonmammalian cells identification of the epitope region capable of inducing neutralizing antibodies against the porcine epidemic diarrhea virus macrophage tropism of human immunodeficiency virus type 1 and utilization of the cc-ckr5 coreceptor subunit protein vaccines: theoretical and practical considerations for hiv-1 membrane rearrangement and vesicle induction by recombinant poliovirus 2c and 2bc in human cells monoclonal antibodies to murine hepatitis virus-4 (strain jhm) define the viral glycoprotein responsible for attachment and cell -cell fusion protection from lethal coronavirus infection by affinity-purified spike glycoprotein of murine hepatitis virus, strain a59 assembly of coronavirus spike protein into trimers and its role in epitope expression identification of a major co-receptor for primary isolates of hiv-1 identification of a novel coronavirus in patients with severe acute respiratory syndrome the v5a13.1 envelope glycoprotein deletion mutant of mouse hepatitis virus type-4 is neuroattenuated by its reduced rate of spread in the central nervous system antigenic relationships of murine coronaviruses: analysis using monoclonal antibodies to jhm (mhv-4) virus aetiology: koch's postulates fulfilled for sars virus eukaryotic transient-expression system based on recombinant vaccinia virus that synthesizes bacteriophage t7 rna polymerase alteration of the ph dependence of coronavirus-induced cell fusion: effect of mutations in the spike glycoprotein major receptorbinding and neutralization determinants are located within the same domain of the transmissible gastroenteritis virus (coronavirus) spike protein detection of sars coronavirus in plasma by real-time rt-pcr isolation and characterization of viruses related to the sars coronavirus from animals in southern china microbiologic characteristics, 
serologic responses, and clinical manifestations in severe acute respiratory syndrome location of antigenic sites defined by neutralizing monoclonal antibodies on the s1 avian infectious bronchitis virus glycopolypeptide development of a safe and rapid neutralization assay using murine leukemia virus pseudotyped with hiv type 1 envelope glycoprotein lacking the cytoplasmic domain protective immunity against murine hepatitis virus (mhv) induced by intranasal or subcutaneous administration of hybrids of tobacco mosaic virus that carries an mhv epitope a novel coronavirus associated with severe acute respiratory syndrome neutralization and fusion inhibition activities of monoclonal antibodies specific for the s1 subunit of the spike protein of neurovirulent murine coronavirus jhmv c1-2 variant localization of neutralizing epitopes and the receptor-binding site within the amino-terminal 330 amino acids of the murine coronavirus spike protein coronviridae: the viruses and their replication a real-time pcr for sars-coronavirus incorporating target gene pre-amplification the c12 mutant of mhv-a59 is very weakly demyelinating and has five amino acid substitutions restricted to the spike and replicase genes role of ph in syncytium induction and genome uncoating of avian infectious bronchitis coronavirus (ibv) coronavirus ibv-induced membrane fusion occurs at near-neutral ph angiotensin-converting enzyme 2 is a functional receptor for the sars coronavirus virology: sars virus infection of cats and ferrets new mammalian expression vectors entry of mouse hepatitis virus into cells by endosomal and nonendosomal pathways quantitative analysis and prognostic implication of sars coronavirus rna in the plasma and serum of patients with severe acute respiratory syndrome n-linked glycosylation sites adjacent to and within the v1/ v2 and the v3 loops of dualtropic human immunodeficiency virus type 1 isolate dh12 gp120 affect coreceptor usage and cellular tropism initial events in 
bovine coronavirus infection: analysis through immunogold probes and lysosomotropic inhibitors coronavirus as a possible cause of severe acute respiratory syndrome foamy virus envelope glycoprotein-mediated entry involves a ph-dependent fusion process early diagnosis of sars coronavirus infection by real time rt-pcr rapid diagnosis of a coronavirus associated with severe acute respiratory syndrome (sars) identification of severe acute respiratory syndrome in canada a role for carbohydrates in immune evasion in aids characterization of a novel coronavirus associated with severe acute respiratory syndrome targeted recombination demonstrates that the spike gene of transmissible gastroenteritis coronavirus is a determinant of its enteric tropism and virulence pseudotyping of murine leukemia virus with the envelope glycoproteins of hiv generates a retroviral vector with specificity of infection for cd4-expressing cells characterization of severe acute respiratory syndromeassociated coronavirus (sars-cov) spike glycoprotein-mediated viral entry unique and conserved features of genome and proteome of sarscoronavirus, an early split-off from the coronavirus group 2 lineage localization of major neutralizing epitopes on the s1 polypeptide of the murine coronavirus peplomer glycoprotein interpretation of diagnostic laboratory tests for severe acute respiratory syndrome: the toronto experience a 12-amino acid stretch in the hypervariable region of the spike protein s1 subunit is critical for cell fusion activity of mouse hepatitis virus assessment of immunoreactive synthetic peptides from the structural proteins of severe acute respiratory syndrome coronavirus summary of probable sars cases with onset of illness from 1 the sars-cov s glycoprotein: expression and functional characterization evaluation of reverse transcription-pcr assays for rapid diagnosis of severe acute respiratory syndrome associated with a novel coronavirus requirement of proteolytic cleavage of the murine 
coronavirus mhv-2 spike protein for fusion activity. we are grateful to cdc for providing plasmids encoding sars-cov s gene and convalescent sera, to dr. bernard moss for vtf7-3, to dr. françois-loïc cosset for the telceb6 cell line, and to drs. jonathan silver and mario skiadopoulos for vero e6 cells. the hos-cd4-ccr5 cell line was obtained from dr. nathaniel landau through the aids research and reference reagent program, division of aids, niaid, nih. we specially thank dr. magdalena tary-lehman for providing assistance with using the immunospot analyzer. key: cord-103509-hynnba03 authors: wong, ten-tsao; liou, gunn-guang; kan, ming-chung title: a self-assembled protein nanoparticle serving as a one-shot vaccine carrier date: 2020-09-18 journal: biorxiv doi: 10.1101/2020.09.16.299149 sha: doc_id: 103509 cord_uid: hynnba03 in this paper, we explore the role of an amphipathic helical peptide in mediating the self-assembly of a fusion protein into a protein nanoparticle and the application of the nanoparticle as a one-shot vaccine carrier. out of several candidates, an amphipathic helical peptide derived from the m2 protein of type a influenza virus was found to stimulate high antigenicity when genetically fused to a fluorescent protein. this fusion protein was found to form protein nanoparticles spontaneously when expressed, and the purified protein stimulates long-lasting antibody responses after a single immunization. through modeling of the peptide structure and nanoparticle assembly, we have improved the complex stability of this vaccine carrier. the revised vaccine carrier is able to stimulate a constant antibody titer to a heterologous antigen for at least six months after a single immunization. the immune response against a heterologous antigen can be boosted further by additional immunization in spite of high immune responses to the carrier protein.
a subunit vaccine is a safe alternative to traditional inactivated or attenuated vaccines, but its efficacy is often hindered by the low antigenicity of recombinant protein. different approaches are used to resolve this issue; among them, the virus-like particle (vlp) and the self-assembled protein nanoparticle (sapn) are considered the best platforms for subunit vaccine development (1). a vlp is assembled from a recombinant capsid protein alone, without the genomic nucleic acid and bound nucleocapsid protein, so it is non-infectious (2). vlps range in size from 20 to 200 nm, which facilitates both efficient drainage to lymph nodes and uptake by antigen-presenting cells such as dendritic cells and macrophages (3). another benefit of vlp-based vaccines is the induction of b cell receptor clustering when repetitive antigen is presented to b cells, a function that can activate antibody class switching and somatic hypermutation through a t cell-dependent mechanism (4). not only can the virus-like particle be used as a vaccine directly, but a heterologous antigen can also be presented on the particle surface through genetic fusion (5). the universal flu vaccine candidate, the m2 ectodomain (m2e), was genetically fused with hepatitis b core antigen (hbc) and assembled into a nanoparticle that provides full protection against a homologous flu strain (6). however, the application of hbc-based vlps in human vaccine development is restricted by the pre-existing anti-hbc antibody present in the 450 million chronic hbv carriers and in the population exposed to hbv infection (7). an artificially designed sapn may avoid the effect of existing antibody. a sapn built from a protein consisting of two coiled-coil domains, which form a trimer and a pentamer respectively, can be assembled into nanoparticles of specific sizes (8, 9).
this sapn stimulates strong immune responses to a target protein fused to the terminus of the constituent monomer after 3 immunizations even without adjuvant, but this immunity waned gradually (10). the green fluorescent protein is a member of the fluorescent protein family, whose members are structurally conserved and emit fluorescent light from a chromophore when excited by photons of shorter wavelength (11). the shared features of fluorescent proteins include a sturdy barrel-shaped structure made up of 11 β-strands and an enclosed chromophore that emits fluorescent light when excited (12). the function of the barrel shell is to provide a well-organized chemical environment that ensures the maturation of the chromophore and protects it from hostile elements (13). it is therefore conceivable that the protein sequences of the barrel shell are highly variable among fluorescent protein family members, and fluorescent proteins possessing desirable biophysical properties can be selected using directed evolution (11, 14-17). the applications of fluorescent proteins have expanded into multiple areas beyond live imaging, including service as biological sensors (18, 19) or as detectors for protein-protein interaction or protein folding (20, 21). an amphipathic α-helical peptide (ahp) forms hydrophilic and hydrophobic faces when folded and is often identified in proteins that interact with phospholipid membranes. the n-terminal amphipathic helical peptide is required for membrane anchorage of the hepatitis c virus ns3 protein and for the protease function of the ns3/ns4a complex (22, 23). several anti-microbial peptides also possess amphipathic properties and function by forming membrane pores or causing membrane disruption (24). the amphipathic α-helix of the type a influenza virus m2 protein is required for m2 protein anchorage and induces the membrane curvature required for virus budding (25, 26).
lifelong protection from disease through vaccination is considered a holy grail that has so far been achieved only by some live attenuated viral vaccines given in multiple doses (27). in this report, we describe the identification of a protein nanoparticle based on an amphipathic α-helical peptide (ahp) from the m2 protein of type a influenza strain h5n1 and a green fluorescent protein. we predicted the protein nanoparticle structure from transmission electron microscope images using protein modeling and generated ahp mutants that give the protein nanoparticle higher stability and antigenicity. this modified protein nanoparticle is able to stimulate a long-lasting, constant antibody response to an inserted heterologous antigen after a single immunization without adjuvant, and this antibody response can be further boosted by additional immunization. the identification and application of this nanoparticle as a vaccine carrier were explored in this study. identification of an ahp-gfp protein complex with high antigenicity and stability. as described in our patent application filed in 2015, we have tested the immunogenicity of fusion proteins composed of an ahp and a gfp (28). the results showed an increase in anti-gfp igg titer of 2 to 3 log under a two-immunization regime (figure s1). one of the peptides, ah3, which is derived from the m2 protein of type a influenza strain h5n1, gives extended stability to the gfp fusion protein when compared to another peptide, ah1 (figure s2), as well as to other peptides in our study (data not shown). since a stable protein is essential for a vaccine carrier and may benefit worldwide vaccination efforts, we were interested in the mechanism of ah3-gfp stability and antigenicity. to study the potential mechanisms that contribute to these properties of the ah3-gfp fusion protein, we first checked the composition of ah3-gfp protein after expression and purification.
one clue that led us to study the composition of the ah3-gfp fusion protein was the difficulty encountered during protein purification. unlike the other fusion proteins studied, both ah3-gfp and ah5-gfp fusion proteins are mostly expressed as insoluble inclusion bodies, and the remaining soluble protein did not bind to ni-nta resin under the normal condition of 300 mm nacl. the ah3-gfp and ah5-gfp fusion proteins only started to bind to ni-nta resin after the nacl concentration was lowered from 300 mm to 50 mm, an indication that hydrophobic interaction may induce a protein complex in which the n-terminal his tag is hindered from binding to the ni-nta ligand. also, the resistance of the ah3-gfp fusion protein to hydrolysis suggested that the linker between the ah3 peptide and gfp is kept in a watertight complex. from these two clues, we hypothesized that the ah3-gfp fusion protein forms a stable protein complex through hydrophobic interaction mediated by the n-terminal ah3 peptide. to test the hypothesis that ah3-gfp or ah5-gfp fusion protein forms a protein complex, we first used protein concentration tubes with different molecular weight cut-offs (mwco) to determine the protein complex sizes. as shown in figure 1a, whereas gfp protein with a molecular weight of 27 kda is able to pass freely through membranes with mwcos of 100 kda, 300 kda and 1000 kda, ah3-gfp fusion protein purified from bacterial lysate was prevented from passing through the membrane with an mwco of 1000 kda. with a molecular weight of 33 kda, the purified ah3-gfp protein needs to form a complex of more than 30 monomers to be excluded from passing a membrane with a 1000 kda mwco. to explore the geometric composition of the ah3-gfp protein complex further, we examined the fusion protein under a transmission electron microscope (tem). the tem results showed that the ah3-gfp fusion protein forms a cylinder-like structure with a length of up to ~60 nm and a diameter of around 10 nm (fig. 1b).
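the figure of more than 30 monomers follows from dividing the membrane cut-off by the monomer mass. a short sketch of that arithmetic is given below; it treats the mwco as a sharp mass threshold, which is an idealization (real membranes also discriminate by shape), and the function name is illustrative.

```python
import math

def min_monomers_to_exceed(cutoff_kda, monomer_kda):
    """Smallest whole number of monomers whose combined mass exceeds the cut-off."""
    return math.floor(cutoff_kda / monomer_kda) + 1

# ah3-gfp monomer is 33 kda; the largest membrane tested has a 1000 kda mwco
print(1000 / 33)                          # about 30.3 monomer masses
print(min_monomers_to_exceed(1000, 33))   # 31
```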
the difference in length suggests that the particle may be assembled along the long axis. when scanning along the long axis of the ah3-gfp particle, there is a repetitive two-one-two-one pattern of white dots, with two less visible dots on each side of the single dot. the predicted structure according to the tem images is shown in fig. 1d. we also examined the geometric composition of ah5-gfp under tem, but there was no clear evidence of a higher-order protein complex, suggesting that the ah5-gfp protein complex is not as stable as ah3-gfp and does not withstand the conditions of negative staining. to find the correlation between protein complex formation and antigenicity, we immunized mice with purified ah3-gfp fusion protein and with the recombinant gfp protein that passes through the membrane freely. proteins were prepared from an lps synthesis-defective e. coli strain, clearcoli bl21(de3), to avoid interference from contaminating lps, a known tlr4 ligand. the mice were immunized with purified proteins by a single intramuscular injection, and sera were collected at days 7, 14, 30 and 182 to evaluate anti-gfp igg titer by elisa. deoxycholate was added at a concentration of 0.2% to test whether it affects ah3-gfp antigenicity; that experiment was terminated at 30 days post immunization, when it showed no effect on the antigenicity of either gfp or ah3-gfp. these results suggest that gfp alone is a poor antigen and only gains high antigenicity after fusion with the ah3 peptide (fig. 1c). to understand the potential molecular mechanism leading to the assembly of the ah3-gfp nanoparticle, we tried to build a protein model based on three observations: first, the particle is assembled through hydrophobic interaction; second, the particle is assembled along the long axis; and third, the ah3-gfp protein particle has a repetitive three-two pattern when observed under tem.
we first assembled two ah3 peptides as anti-parallel helices, with the hydrophobic sidechains of f4, f5, i8, l12 and l16 mediating intermolecular contacts. the assembled ah3 dimer creates a hydrophobic core between the two helices, and the helix dimer is surrounded by hydrophilic sidechains from multiple lysines and arginines except for one exposed hydrophobic patch, as predicted using deepview and marked by white mesh (fig. 2a). this hydrophobic patch covers only one face of the dimer (fig. 2b). when the water-accessible surface of the ah3 dimer was calculated, two cavities could be seen within the hydrophobic patch; these provide contact points for the two arg9 sidechains protruding from the opposite face of another ah3 dimer. a second ah3 dimer can make close contact with the first dimer after turning counterclockwise by 36° (looking down onto the hydrophobic patch), forming a tetramer (fig. 2c). the intermolecular energy between the two dimers in this model was calculated to have a δg of -101 kcal/mol (fig. 2c). after gfp protein structures are added onto the ah3 tetramer model, the ah3-gfp fusion protein tetramer forms a cross-shaped assembly unit, and stacking each ah3-gfp tetramer on top of another extends the particle length by 2.8 nm and turns the cross by 72°. since the diameter of the gfp barrel ranges between 2.7 and 3.5 nm, the gfp extending outward from the ah3-gfp tetrameric cross fits spatially within the model (fig. 2d). under this model, the protein nanoparticle is extended continuously, with a hydrophobic patch constitutively present on one end of the assembled particle and serving as a point for polymerization. having shown that the ah3-gfp protein complex possesses high antigenicity, we decided to explore the application of the ah3-gfp protein nanoparticle as a vaccine carrier. for the hepatitis b core antigen, amino acid 144 serves as an insertion site for heterologous antigen fusion (29).
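the stacking model implies simple geometric relations between particle length, number of tetramer layers, and monomer count. the sketch below works these out from the values given in the text (2.8 nm rise and 72° rotation per tetramer, 33 kda per monomer); the variable and function names are illustrative, and the resulting numbers are consequences of the idealized model rather than measurements reported by the authors.

```python
RISE_NM = 2.8            # length added per stacked tetramer (from the model)
TWIST_DEG = 72           # rotation of the cross per stacked tetramer
MONOMERS_PER_LAYER = 4   # one ah3-gfp tetramer per layer
MONOMER_KDA = 33         # ah3-gfp monomer mass

def describe_particle(length_nm):
    """Layer count, monomer count and mass for an idealized rod of given length."""
    layers = round(length_nm / RISE_NM)
    monomers = layers * MONOMERS_PER_LAYER
    return {
        "layers": layers,
        "monomers": monomers,
        "mass_kda": monomers * MONOMER_KDA,
    }

layers_per_turn = 360 // TWIST_DEG          # layers needed for one full rotation
pitch_nm = layers_per_turn * RISE_NM        # axial distance per full rotation

print(layers_per_turn, pitch_nm)
print(describe_particle(60))                # longest particles seen by tem (~60 nm)
```

under these assumptions a ~60 nm particle would hold roughly 21 tetramers, or about 84 monomers, which is consistent with the earlier filtration result that the complex contains more than 30 monomers.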
gfp has a thermostable structure consisting of 11 β-strands and 3 α-helices, and some of the loops between strands have been explored as insertion sites for heterologous proteins for various purposes (18, 19, 30). among those candidates, loop173, which links strand 8 and strand 9, was chosen because it has a high capacity for foreign peptide insertion (fig. 3a) (30). the original ah3-gfp recombinant protein was constructed in the pet28a vector, with the ah3 coding region inserted c-terminal to the his-tag and thrombin cleavage site and followed immediately by gfp cloned from pegfp-c2. this expression vector gave low yields of soluble protein and was unable to express soluble recombinant protein when a peptide was inserted between d173 and g174. to resolve the expression and folding issues, we designed a new expression vector. first, we cloned the ah3 peptide at the very n-terminus, following the initiating methionine, in the pet27 vector; c-terminal to it we then inserted a synthesized sfgfp gene (31) with an engineered antigen insertion site following s175 of sfgfp. the antigen insertion site also included an 8xhis tag for recombinant protein purification. to verify the vaccine carrier function, we inserted two copies of the broad-spectrum flu vaccine candidate, the human m2 ectodomain peptide (hm2e), separated by a 6-amino-acid linker (fig. 3b). the newly constructed vector proved to be efficient for expressing soluble ah3-sfgfp-2xhm2e fusion protein as a protein complex (data not shown). under tem, the ah3-sfgfp-2xhm2e protein complex is not as stable as ah3-gfp unless the protein preparation is first cross-linked with a heterobifunctional protein crosslinker, sulfo-smcc (fig. 3c). following the previously established ah3-gfp protein model, we sought strategies to create a more stable ah3-sfgfp protein complex. first, we found that the mutation of isoleucine 8 to leucine strengthens the intermolecular interaction (δg) from -16 kcal/mol to -37 kcal/mol (fig. 3d).
second, we mutated lysine 13 to glutamic acid, which generated additional electrostatic interactions between the side chain of glu13 and those of arg10 and arg11 (fig. 3e ). to verify the protein modeling results, including the presence of a hydrophobic patch on the ah3 peptide complex and the higher stability of ah3-based protein complexes carrying the i8l and/or k13e mutations, we first generated mutations in the ah3 peptide in the context of the ah3-sfgfp-2xhm2e construct (fig. 4a ). our hypothesis was that the hydrophobic patch of the ah3-gfp complex would bind bacterial membrane and co-sediment with it during ultracentrifugation. the methodology was first verified by centrifuging bacterial lysate prepared from a clearcoli culture in a centrifuge tube preloaded with 15% (w/v), 45% (w/v) and 85% (w/v) sucrose solutions in volumes of 7 ml, 2 ml and 1 ml, respectively ( fig. 4b left panel) . the distribution of bacterial membrane was marked by a lysochromic dye, sudan iii. the control sample contained sudan iii with lysis buffer alone (fig. 4b lane 1) . after ultracentrifugation, the bacterial membrane sedimented to the junction of the 15%/45% sucrose solutions (fig.4b lane 2) . using the same protocol, ah3-gfp was found to co-sediment with the bacterial membrane (fig. s3a) , as was the ah3-sfgfp-hm2e fusion protein, but not free gfp protein (fig s3b) . these results are consistent with our protein structure modeling, in which arg11 serves as a main contact point for dimer stacking and also mediates the electrostatic interaction with the mutated glu13 (fig. 3e ). as shown in the sucrose step gradient result, the presence of the hydrophobic patch enables nonspecific interaction of the ah3-gfp protein complex with phospholipid membrane, which may restrict the free movement of the protein nanoparticle and keep it from reaching the draining lymph node to stimulate immunity (4) . 
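as a rough aid for reproducing the step gradient described above, the following sketch computes how much sucrose each layer contains, using the usual definition of % (w/v) as grams of solute per 100 ml of solution. the concentrations and volumes are taken from the text; the function name and data layout are illustrative assumptions and not part of the original protocol.

```python
# Illustrative sketch: sucrose mass per layer of the 15%/45%/85% (w/v) step gradient.
# Concentrations and volumes come from the text; all names below are hypothetical.

def sucrose_grams(percent_wv: float, volume_ml: float) -> float:
    """Grams of sucrose in volume_ml of a percent_wv % (w/v) solution."""
    return percent_wv * volume_ml / 100.0

# (percent w/v, volume in ml), listed from the lightest (top) to the densest (bottom) layer
GRADIENT_LAYERS = [(15.0, 7.0), (45.0, 2.0), (85.0, 1.0)]

layer_masses = [sucrose_grams(p, v) for p, v in GRADIENT_LAYERS]
total_volume_ml = sum(v for _, v in GRADIENT_LAYERS)

for (p, v), grams in zip(GRADIENT_LAYERS, layer_masses):
    print(f"{p:.0f}% (w/v), {v:.0f} ml: {grams:.2f} g sucrose")
print(f"total gradient volume: {total_volume_ml:.0f} ml")
```

the densest layer is listed last because it sits at the bottom of the tube; the membrane fraction in the text is reported at the boundary between the first two layers.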
to compare the antigenicity of protein complexes derived from either ah3-sfgfp-2xhm2e or lyrrle-sfgfp-2xhm2e, we immunized mice with a single injection of either recombinant protein. after immunization, sera were collected at days 15, 50, 90 and 202 to evaluate the anti-hm2e igg titer by elisa. the geometric mean titer (gmt) of anti-hm2e igg reached its highest point for the ah3-sfgfp-2xhm2e group and then declined afterward. for the lyrrle-sfgfp-2xhm2e group, in contrast, the gmt reached its highest point at day 50 and remained steady up to day 90 (fig. s4) . when the sera of individual mice are examined separately, only one out of 5 mice from the ah3-sfgfp-2xhm2e group has a higher anti-hm2e igg titer at day 202 than at day 15, whereas 4 out of 5 mice from the lyrrle-sfgfp-2xhm2e group show a higher antibody titer at day 202 than at day 15 (fig. 5a) . these results suggest that the two point mutations in ah3, i8l and k13e, enable the formation of a stable, highly antigenic protein complex that stimulates long-lasting immune responses after a single immunization. vaccine carriers such as hbc-based virus-like particles (vlps) often fail to boost humoral immune responses after the prime dose in a multiple-dose protocol because of antigen competition. although gfp is a protein of low antigenicity, fusion with ah3 strongly enhances its antigenicity, as shown in figure 1c . to test whether the sfgfp backbone competes with the inserted hm2e peptide for the immune machinery, we immunized mice in a prime-boost protocol using the same protein preparations. the two consecutive injections were carried out 14 days apart, and sera collected at days 14, 28 and 90 were subjected to elisa using either hm2e peptide or sfgfp protein as the coating antigen. the result shows that the igg titer against hm2e rose continuously after consecutive immunizations for both proteins, as did the anti-sfgfp igg titer. 
the result suggests that although the carrier protein ah3-sfgfp also has high antigenicity, it did not interfere with the immune response against the heterologous protein, hm2e (fig. 5b, 5c) . vaccination is the most cost-effective strategy for preventing infectious disease. attenuated viral vaccines such as vaccinia, mmr or oral polio vaccine, in particular, produce long-lasting, even lifelong, protective immune responses, but these vaccines took decades to develop. evidently, this strategy cannot deliver a vaccine in time to ward off an emerging global pandemic like covid-19. newer vaccine technologies such as dna vaccines, mrna vaccines or adenovirus-based vaccines can quickly yield a subunit vaccine once the genomic information of the pathogen becomes available, but the immune responses they generate often decline to baseline within a year (32) (33) (34) (35) . this short-lived immune response may expose vaccinated people to the risk of antibody-dependent enhancement (ade), which is known to be devastating and leads to vaccine failure (11, 36) . in this study, we have created a self-assembled protein nanoparticle composed of ah3-fp, and a more stable variant that stimulates long-lasting antibody responses. the nature of these long-lasting immune responses is not known, but they may be mediated by long-lived plasma cells generated during ah3-fp immunization, the same mechanism that accounts for the lifelong protection conferred by attenuated viral vaccines (27) . the fluorescent protein family is a group of proteins with a conserved barrel-shaped structure and a chromophore buried inside. since the function of this β-strand barrel is to provide a suitable environment for chromophore maturation, the primary sequence of the fluorescent protein barrel is amenable to mutagenesis through either direct selection (11) or evolution (12) . 
also, fluorescent proteins of desired biophysical properties like thermal stability or folding efficiency can be obtained expert review of vaccines key: cord-265642-7mu530yp authors: syomin, b. v.; ilyin, y. v. title: virus-like particles as an instrument of vaccine production date: 2019-06-17 journal: mol biol doi: 10.1134/s0026893319030154 sha: doc_id: 265642 cord_uid: 7mu530yp the paper discusses the techniques which are currently implemented for vaccine production based on virus-like particles (vlps). the factors which determine the characteristics of vlp monomers assembly are provided in detail. analysis of the literature demonstrates that the development of the techniques of vlp production and immobilization of target antigens on their surface have led to the development of universal platforms which make it possible for virtually any known antigen to be exposed on the particle surface in a highly concentrated form. as a result, the focus of attention has shifted from the approaches to vlp production to the development of a precise interface between the organism’s immune system and the peptides inducing a strong immune response to pathogens or the organism’s own pathological cells. immunome-specified methods for vaccine design and the prospects of immunoprophylaxis are discussed. certain examples of vaccines against viral diseases and cancers are considered. immunoprophylaxis has long been used to control infectious diseases exerting an enormous impact not only on the global health but also on the safety of numerous populations of agricultural animals. in the past decades, the focus on vaccine production technologies has changed from handling intact pathogens to producing recombinant subunit vaccines based on isolated target antigens [1] . intensive development of recombinant vaccines began in the 1980s when it became possible to clone the desired dna sequences into expression plasmids and produce the target proteins. 
the construction of a recombinant vaccine takes advantage of the knowledge of the nucleotide sequences of genes encoded by a pathogen, and involves identification of the antigenic determinant, synthesis of the nucleotide sequence encoding the antigen, its cloning into an expression vector, and production of the target peptide in a certain expression system. expression systems for individual heterologous proteins have been developed based on prokaryotic and eukaryotic cells, which allow us to produce target proteins on both laboratory and industrial scales [2] . the interest in recombinant vaccines is a consequence of the emergence of new infectious diseases, most often zoonotic ones. among them are the outbreaks of human diseases caused by the ebola virus, zika virus, marburg virus, and the middle east respiratory syndrome and severe acute respiratory syndrome coronaviruses [3] . oncology also places great hopes in recombinant vaccines, which may help to overcome immune tolerance in the case of cancer treatment [4] . moreover, there exists a constant threat of the emergence of new highly virulent strains of well-known viruses due to unceasing mutational processes in virus genomes [5] . against this background, a new scientific concept, vaccinomics [6] , is being actively developed; it consists of identifying the minimum subset of antigens that are able to induce a competent immune response to a pathogen or tumor through their specific interaction with the b and t immune cells. using this approach it appears possible to design vaccines based on the minimum subset of antigens which most specifically characterize the pathogen in its interaction with the immunome (the set of all immune receptor sequences which are present in the individual organism) [7] . targeted detection of epitopes, the regions located on the surface of the protein, is possible with the aid of x-ray analysis, which is rather labor-consuming since it requires obtaining protein crystals. 
abbreviations: vlp, virus-like particles. udc 636.2+611:619: 616-006.446
presently, 3-d protein structure modeling has found broad use. this approach is based on identification of the physicochemical and electrostatic characteristics of the polypeptide chain regions and correlation of these characteristics with antigenic properties. two strategies are currently utilized, namely, modeling by homology and de novo modeling. in the first case, the protein data bank (http://www.ncbi.nlm.nih.qov/genbank/) database of 3-d protein structures is used to model the protein. the commonly used modeller [8] , as well as other software such as i-tasser [9] , swiss-model [10] , esypred3d [11] , 3d-jigsaw [12] , phyre [13] , and cphmodels [14] , can be used as homology modeling tools. the limitation of this approach lies in the requirement that the homologous proteins should share more than 30% sequence identity [15] . the de novo modeling strategy is based on the computer simulation of the protein folding process (rosetta [16] and tasser [17] software). in this case, all possible variants of polypeptide chain folding are considered, and the most energetically favorable conformation, i.e., the one with the lowest potential energy, is chosen. predicted antigenic determinants are synthesized using genetic engineering vector techniques and heterologous protein expression systems, and then are experimentally tested using, for example, the antibody neutralization assay. using protein expression systems it is possible to produce virus-like particles (vlps), which are made up of monomers that are able to multimerize into vlps and display the antigenic determinants of target pathogens on their surface. this point will be addressed further in the review; here we will just observe that even complex epitopes represented by trimers may be obtained using heterologous expression systems. 
this has been demonstrated, for instance, for the trimers of the human immunodeficiency virus (hiv-1) envelope glycoprotein [18, 19] and influenza virus haemagglutinin [20] . in order to choose the most specific antigens of the pathological agent of a new infectious disease, it appears inevitable to work with the infected material, if only to isolate and sequence the nucleic acids containing the genetic information of the pathogen. however, in order to produce a vaccine based on epitopes characteristic of the target pathogen, there is no need to work with the pathogen itself. therefore, apart from all other advantages of vaccines of this kind, they appear to be improved in terms of biological safety. vaccines based on the presentation of a subset of antigenic determinants of the infectious agent are characterized by a high level of reproducibility when commercially manufactured and are highly effective [21] , since in this case the immune response is directed exclusively against the most significant antigenic elements of the pathogenic microorganism or tumor. a number of vaccines specifically interacting with certain immune system receptors have been tested to date. for example, a vaccine containing only a single epstein-barr virus epitope specifically recognized by cd8 + t-cells is proposed for the prophylaxis of mononucleosis in humans [22] . this vaccine prevents the development of the disease, although it does not protect the organism from the entry of the virus. another example is classical swine fever virus (csfv). it has been demonstrated that the e2 viral protein produced in the baculovirus expression system induces the synthesis of csfv-neutralizing antibodies in pigs [23] . vaccines are designed to produce immunity to a disease. although a single epitope is able to induce a strong immune response, it usually appears insufficient to induce protective immunity. 
the in vitro synthesized peptide properly representing the main antigenic determinant of the pathogen is highly likely not to be a strong immunogen in itself, primarily because of its small size (<10 nm) and high risk of proteolytic degradation. this problem may be overcome by the use of nanoparticles. it has been shown in a considerable number of works that in order to induce a strong immune response, antigenic determinants should be exposed on the surface of nanoparticles whose shape and size (20-150 nm) mimic those of the virus (fig. 1) . in this respect also, vlps, the nanoparticles composed of viral proteins capable of self-assembly (multimerization) into structures morphologically resembling viruses, are the natural choice for the role of a strong immunogen. antigenic determinants repeated many times on the vlp surface promote complement fixation and b-lymphocyte receptor clustering, leading to the activation of the immune response. additionally, the same antigen multiply displayed on the surface of the particle promotes the multiplication of the pool of autoreactive b-cells, which is the primary objective when designing vaccines against autoimmune diseases and tumors [29] .
natural sources for vlp bioengineering and the specific features of their assembly
structural proteins of various viruses are capable of autonomous self-assembly into vlps. they interact with the formation of globular, icosahedral, or rodlike structures. so far, the structural proteins of several dozen viruses have been obtained in heterologous expression systems, and almost all of them proved able to form vlps. the vlp size varies from 20 to 200 nm and is similar to the size of the corresponding viruses [30] . being similar to viruses allows vlps to penetrate into the lymph and to be efficiently entrapped by antigen-presenting cells [31] . currently, several heterologous expression systems and the corresponding expression vectors which can be used with them are available. 
these systems are based on both bacterial cells (the pet system based on escherichia coli cells is often exploited) and different eukaryotic cells. in the case of eukaryotic cells, the most commonly used systems are yeast expression systems and the ones which rely on drosophila [32] and spodoptera frugiperda insect cell lines [33] . mammalian cells are also utilized, the most popular among them being cho and hek293 [34] . the cost of heterologous protein production in bacterial systems is lower than in eukaryotic cells. however, when studying the functional activity of the protein or vlp associated, for example, with the potential glycosylation sites or interaction with ubiquitin, or other ubiquitin-like peptides [35] , no alternative to eukaryotic expression systems is available. it should be noted that the choice of the expression system is up to the researcher and is driven by the research task. for example, in different laboratories different eukaryotic systems for viral protein expression, including plant cells, are used to produce vlps which are used for vaccination against the hepatitis c virus (hcv) [36] . in most cases, vlps assembled from a virus protein are considered as a candidate vaccine against this very virus, since protein monomers in multimeric configuration (vlp) induce a more potent immune response than protein monomers [29, 30] . the strong immunogenic properties of vlps are determined by several factors. first of all, the dominant epitope of the structural protein is displayed as a part of the particle and is present in a multimeric form as is the case with a native virion [21] . second, vlps stimulate b-lymphocytes and induce t-lymphocytes in the same manner as does the virus infecting the host organism [30] . for example, vlps formed by the main capsid protein l1 of the human papilloma virus of several serotypes are successfully used as vaccines against cervical cancer [37] . 
a vaccine against hepatitis b virus has been produced which contains vlps in the lipid membrane envelope [38] . the vaccine against coxsackievirus a6 contains vlps assembled from the virus capsid proteins produced in the baculovirus expression system [39] . scientists from china [40] have demonstrated that the capsid protein (vp60) of the rabbit hemorrhagic disease virus (rhdv) efficiently multimerizes into vlps, and a single intramuscular injection of the obtained vlp preparation completely protects the immunized rabbits against rhdv infection for at least 180 days. the porcine circovirus capsid protein is also able to efficiently assemble into vlps when synthesized either in the human embryonic kidney cell culture (hek293) [41] , or yeast [42] and baculovirus [43] expression systems, as well as in bacteria [44] . foot-and-mouth disease vlps are assembled from three structural polypeptides vp0, vp1, and vp3 (naturally produced as a result of processing the p1-2a precursor polypeptide), simultaneously produced in e. coli [45] . it has been recently demonstrated that the polyprotein antigen of the duck hepatitis a virus produced in the baculovirus expression system assembles into vlps immediately in the cultured spodoptera frugiperda (sf9) cells, while immunization of ducklings with the obtained vlps induces a high level humoral immune response and protects them from developing the disease [46] .
[fig. 1 caption, partially recovered: (2) liposome-based particles [24] , (3) nondegradable spherical nanoparticles (for example, metal nanoparticles) [25] , (4) polymer nanoparticles [26] , (5) graphene nanosheets [27] and nanotubes [28] .]
the abilities of vlps resulting from multimerization of the cloned virus protein to play the role of an immunogen are not limited just to the presentation of their proper epitopes but may also be taken advantage of to display heterologous proteins. 
multimerization of capsid proteins into a particle requires the presence of nucleic acid, while for in vitro assembly, a short oligonucleotide (7-10 nt) is sufficient [47] . therefore, vlps formed by capsid proteins are free from the infectious virus rna or dna. moreover, the above-mentioned property, the triggering of protein monomer multimerization by nucleic acid, may be taken advantage of to package rna or dna into a particle with two different objectives. first, antisense rna can be used to suppress virus expression. for example, in the case of the vaccine against foot-and-mouth disease, researchers from china [48] not only displayed the vp1 epitope of the foot-and-mouth disease virus on the surface of vlps but also packaged the antisense rna complementary to the fragment of the viral genomic rna into a particle. another goal of nucleic acid packaging into a particle lies in the presentation of viral nucleic acids leading to the activation of specific immune receptors which induce the synthesis of type i interferons (ifn) and other cytokines triggering the antivirus response [49] . it has been demonstrated that long dna and rna molecules may be incorporated into vlps. for example, mrna for the reporter protein, red fluorescent protein, was packaged into particles representing the hepatitis e capsids [50] . the plasmid containing the green fluorescent protein (gfp) gene was encapsidated into the vlps assembled in vitro from the main capsid protein of the hamster polyoma virus [51] . a strategy for the packaging of up to 17 kbp double-stranded circular dna into the particles was developed using the structural protein of the simian virus 40 (sv40) expressed in the baculovirus system [52] . 
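the notion of an antisense rna "complementary to the fragment of the viral genomic rna", mentioned above, can be made concrete with a short sketch that derives the antisense strand as the reverse complement of a sense fragment. the function name and the example fragment are hypothetical illustrations; the sequence is not taken from any virus discussed here.

```python
# Illustrative sketch: derive an antisense RNA (reverse complement) from a sense RNA fragment.
# The example fragment is made up for demonstration and is not a real viral sequence.

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def antisense_rna(sense_fragment: str) -> str:
    """Return the antisense strand, written 5'->3', for a sense RNA fragment written 5'->3'."""
    seq = sense_fragment.upper()
    unknown = set(seq) - set(RNA_COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected characters in RNA sequence: {sorted(unknown)}")
    # Reverse the strand, then complement each base.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))

example_sense = "AUGGCUUCA"  # hypothetical fragment
print(antisense_rna(example_sense))
```

applying the function twice returns the original fragment, which is a quick sanity check that the two strands are mutually complementary.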
there exist two fundamentally different approaches for nucleic acid incorporation into vlps: (1) in vitro vlp assembly from protein monomers in the presence of rna or dna, which is to be encapsidated into the vlp; (2) exposure of the assembled vlps to osmotic shock in the presence of nucleic acids. osmotic shock is produced by using a solution with a low ionic strength; nucleic acid enters into a vlp as a result of the shift in the surface structures [53] . however, the incubation of vlps in the presence of nucleic acids without exposure to osmotic stress results in a certain part of the nucleic acids becoming associated with the particles [54] . a far more predictable approach is the in vitro assembly of vlps from the mixture of protein monomers and nucleic acids which should be encapsidated in them. in this case, after synthesis in a heterologous system, the obtained protein monomers are purified to completely remove the contaminating nucleic acids from the protein preparation; then the proteins are denatured in 7-8 m urea, nucleic acids which should be packaged are added, and the proteins are assembled into the particles by eliminating urea from the solution. the use of this strategy was demonstrated, for instance, in the works [47, 55, 56] . protein association into a particle is a reversible process. vlp dissociation can be easily achieved by the addition of a denaturing agent. further, it may be removed and vlps may be reassembled in vitro [47] with the encapsidation of the target rna or dna into the particle. this approach was implemented for the p21 protein of the hepatitis b virus. virus proteins produced in the heterologous expression system were denatured in 7 m urea solution, and their assembly into vlps in the presence of the nucleic acid was further induced [57] . protective immunity is controlled by specific humoral and cellular mechanisms activated by the antigen. 
posttranslational protein modifications and covalent bonds between the modifying molecules and the functional groups in the polypeptide chains are of great importance for immune homeostasis during the antiviral response. the balance between phosphorylation, ubiquitination, methylation, acetylation, sumoylation, adp-ribosylation, and glutamylation of a certain antigen performs the fine adjustment of the host's antiviral response [49, 58] . the structural proteins for vlp production, when synthesized in eukaryotic expression systems, undergo a number of posttranslational modifications. in particular, the gag gypsy monomer proved to be a natural substrate for type 2 casein kinase [59] . moreover, while produced in the eukaryotic cell, gag gypsy monomers become associated with ubiquitin and sumo, the cellular protein partners of the viral structural protein [33] . generally, ubiquitin and sumo can bind with any lysine residue within the protein molecule, although there are preferable binding sites; it should also be noted that phosphorylation determines which of these two partners binds with the protein [60, 61] . additionally, binding with a limited number of the above-mentioned signal peptides, such as monoubiquitination [62] , usually plays a regulatory role and controls the transport of the protein itself, or the particle formed by it. in the case of polyubiquitination, the protein is destined for proteasome degradation [60, 63] . by constructing recombinant polypeptides based on the viral capsid proteins it is possible to obtain vlps bearing several antigens. for example, dna encoding the vp2 capsid protein of the porcine parvovirus was fused with the dna fragment encoding 35 amino acid residues of the main antigen of porcine circovirus (pcv2) nucleoprotein. the resulting hybrid dna was used to produce protein in heterologous cells which further formed vlps. 
these vlps induced a significantly stronger immune response against pcv2 than the recombinant adenovirus encoding the open reading frame 2 (orf2) of pcv2 [41] . another interesting example is represented by the vlps formed by the influenza virus matrix protein and displaying a toxoplasma gondii antigen on their surface [64] . there exists another approach to designing therapeutic vaccines based on vlps. in this case, the vlp surface displays the variable fragment of an antibody specific to the antigen of the target virus [65] . moreover, knowing which receptor is bound by the virus proteins renders it possible to produce particles possessing tropism to the cells of certain tissues. for example, the pre-s protein of the hepatitis b virus exposed as a ligand on the vlp surface provided for their specific binding with hepatocytes [66] . vlps assembled from the structural proteins of bacteriophages are also considered as carriers for human and animal vaccine production. for example, the dna fragment encoding a foreign protein was inserted into the dna region encoding the n-terminal β-hairpin of the coat protein of the e. coli ms2 phage. it has been demonstrated that the expression of the obtained construct in bacterial cells resulted in the production of the recombinant protein in which a foreign peptide was present in the central part of the hairpin. the obtained chimeric protein monomers were able to self-assemble into particles morphologically similar to the phage capsid both in vivo and in vitro [67] . using the mentioned property of the ms2 coat protein, vlps displaying the ep141-160 epitope of the foot-and-mouth disease vp1 structural protein on their surface were obtained. these chimeric vlps induced a strong immune response in the animals, which allowed regarding them as a promising base for the development of a prophylactic vaccine [68] . 
it is worth noting that the exposure of ep141-160 on the vlp surface resolved a long-standing problem of how to make use of the beneficial properties of this peptide. researchers from several laboratories have demonstrated that this antigenic determinant of the foot-and-mouth disease virus not only induces the production of neutralizing antibodies but also stimulates t-lymphocytes [69] . however, when isolated, ep141-160 is not able to induce the immune response protecting animals against the foot-and-mouth virus infection [70] . many attempts were made to solve this problem, for example, by combining the antigenic determinant with large molecules, such as t cell-specific molecules [71] . however, only integration of ep141-160 into vlps resulted in the possibility of using this antigenic determinant as a strong immunogen for vaccine production [68] . practically all structural proteins of phages infecting various species of bacteria are able to autonomously form vlps. for example, structural proteins of salmonella typhimurium phages are able to autonomously multimerize into vlps on whose surface the epitopes of eukaryotic viruses, including the epitopes of the human influenza virus, may be exposed [72] . at the same time, however, bacteriophages are apparently incapable of presenting large epitopes, whose length exceeds 24 amino acid residues [67] . structural proteins of retroviruses and retrotransposons have a higher capacity compared to bacteriophage proteins. this means that protein monomers forming the particle may be fused with a longer peptide and still retain the ability to multimerize. in particular, it has been shown for the gag gypsy capsid protein [73] that at least 26% of its amino acid sequence (more than 100 amino acid residues) is of little importance for multimerization into vlps [55] . 
therefore, the truncated form of this protein may be fused with a heterologous peptide with the length similar to the length of the deleted fragments and a recombinant protein may be obtained which will retain its ability to self-assemble into vlps. in such a way, the truncated gag gypsy fused with the heterologous peptide formed particles when it was synthesized in bacterial cells. we were also able to assemble particles from the purified protein monomers in vitro [47, 55, 74] . the obtained particles resembled the native gypsy virus in their morphology [75] . the substitution of the deleted region in the dna encoding gag gypsy by the nucleotide sequence encoding the main antigenic determinant of the foot-and-mouth virus vp1 protein (ep141-160) and a his6-tag resulted in the formation of particles which displayed ep141-160 as the target antigen. the advantage of the technique used is that vlps formed by the protein containing the his6-tag can be readily isolated and purified by affinity chromatography [76] . chimeric vlps may be obtained through the construction of recombinant dna molecules encoding both the corresponding virus protein and a foreign peptide or protein. another way is vlp pseudotyping. this approach was elaborated using the hamster polyomavirus. the vp1 capsid protein of this virus first forms pentamers which further assemble into vlps comprised of 72 pentamers. when vp1 is expressed together with the minor capsid protein vp2, the latter binds with the central part of each vp1 pentamer. it has been demonstrated that the n-terminal part of vp2 is not involved in this interaction and therefore may be removed and substituted by the target epitope [77] . another approach is also known. antigen is immobilized on the vlp via covalent binding between the reactive groups of the amino acid residues of the antigen and vlp. 
this approach proved to be ineffective due to the disruption of the native conformation of the antigen attached to the particle surface [77, 78] . this problem was countered in the following way. a gly-gly-lys-gly-gly sequence was inserted into the monomer subunits for vlp assembly. in this environment, the reactive ε-amino group of the lys residue is exposed and ready to interact with the cys-group of any protein in the presence of a cross-linking agent, such as m-maleimidobenzoyl-n-hydroxysulfosuccinimide ester (sulfo-mbs) [79] , or succinimidyl-6-[(β-maleimidopropionamido)hexanoate] (smph) [80] . although the authors claimed that the described system of molecular assembly may be used to induce a strong b-lymphocyte response against most antigens, and prospective vaccine prototypes were suggested, this platform became the prototype only for a single prospective preparation based on vlps, the medication for lowering blood pressure in patients with hypertension [80] , due to the development of the new improved techniques for antigen immobilization on the vlp surface, which will be discussed below. one of them takes advantage of the ability of the his6-tag to bind with multivalent tris-nitrilotriacetic acid (trisnta), which in turn binds with a broad range of molecules, in particular, with biotin. the possibilities of this elegant technique for the functionalization of noninfectious viral nanoparticles were demonstrated in the case of vlps formed by the norovirus (nov) structural proteins. using the baculovirus expression system, the authors obtained nov vlps displaying a his6-tag on their surface, which was first used to purify vlps, and further to bind trisnta molecules conjugated with a fluorescent dye, or biotin, which in turn successfully bound streptavidin [81] . notwithstanding some authors suggesting the described technique to be promising for vlp vaccine production, this approach continues to be only a prototype and has not advanced further than model experiments. 
the search for techniques providing efficient and stable immobilization of the antigen on the vlp surface led to the development of a versatile platform which allows large antigens to be covalently attached to the vlp surface [82]. this molecular assembly system uses the tag-catcher conjugation system, which was derived from the cnab2 domain of the fibronectin-binding protein fbab from streptococcus pyogenes. as a result, a highly reactive spytag peptide (13 amino acid residues) was obtained which efficiently reacts with the spycatcher protein, forming an isopeptide bond in a broad range of buffer solutions [83]. the spytag peptide was inserted into the vlp-forming polypeptide of the acinetobacter-infecting phage (these particles were called ap205), so that each monomer forming ap205 displayed two spytag peptides. a recombinant antigen consisting of the target protein and the spycatcher protein was produced using bacterial or baculovirus expression systems [84]. the antigen obtained bound efficiently to the vlp surface and induced a very strong immune response. almost the same effect was observed for the reverse combination of the conjugating partners, that is, when spycatcher was incorporated into vlps and spytag was fused to the antigen. this system is already utilized to design different types of vaccines, including vaccines against tuberculosis and malaria [84, 85]. a recent report has demonstrated that it was successfully used to develop a vaccine against breast cancer [86]. this work is remarkable in that a high density of the her2 antigen (human epidermal growth factor receptor 2), synthesized in cultured drosophila melanogaster cells, could be obtained on the ap205 vlp surface [32]. the obtained particles induced a strong immune response to the antigen in the patients, which rendered it possible to overcome the immune tolerance to her2 known for patients with the her2-dependent breast cancer [85, 86].
in summary, the analysis of the published data discussed above allows us to conclude that the nature of the vlp-forming protein is less important for vlp functionalization than the technique used to obtain a high antigen density on the vlp surface (table 1).
the product range
a number of vlp vaccines are available on pharmaceutical markets in many countries. the best known are the cervarix® and gardasil® vaccines against cervical cancer, which have been successfully used for prophylaxis of this disease in girls for more than ten years. both vaccines are based on vlps formed by the main capsid protein l1 of the human papilloma virus belonging to several serotypes [37]. the engerix and recombivax hb vaccines against the hepatitis b virus, which represent vlps enveloped in a lipid membrane [38], as well as the hecolin and xiamen innovax vaccines against hepatitis e [88], were developed and commercialized. the anti-malaria vaccine malaria rts has been certified; it represents the c-terminal part of the cs protein of plasmodium falciparum attached to the surface antigen of the hepatitis b virus (hbsag), which is also used in the certified vaccines against hepatitis b. to stabilize the recombinant vlps, the fused protein is expressed together with the hbsag protein in saccharomyces cerevisiae [89]. a vaccine against the porcine circovirus representing vlps assembled from the vp2pcv2 capsid protein synthesized in a heterologous system has also been commercialized [41]. an increasing number of works report on the development of vaccines based on nanoparticles, including vlps and/or replication-inactive viruses. among the apparent advantages of vaccines of this kind are their high specificity, efficiency, and good pharmacokinetic characteristics.
it has been demonstrated that in an organism vlps reach the lymph nodes in less than 10 min, while a particle mixture bearing different antigens may be processed by the same antigen-presenting cells simultaneously [90]. it should be noted that nanoparticle-based vaccines offer new possibilities in the development of immunoprophylaxis strategies, especially injection-free formulations such as intranasal vaccines and inhalers. the administration of these vaccines may not require human involvement, which is especially important in the case of industrial animal breeding in the large agricultural enterprises characteristic of present-day russia [19], since it is extremely difficult to administer identical injections to a large number of animals at the same time. this problem may be resolved by nebulizing aerosol vaccines in the area where the animals are kept. the potential of this approach was demonstrated for the first time in 1947 with mice [92]; then, in the 1950s-1970s, attempts were made to introduce this approach into animal and poultry farms in many countries. starting from the 1960s, attempts were also made in the soviet union [93]. however, at that time, the inhaled administration of vaccines was not introduced into practice, since protective immunity could only be obtained when the tolerable dose was exceeded many times. for example, the inhaled administration of the vaccine against newcastle disease led to 10-20% lethality in birds [94]. however, with the introduction of vaccines based on the exposure of pathogen epitopes on the surface of nanoparticles, including recombinant vlps, together with the development of innovative approaches to aerosol production, inhaled vaccines have again become a topical issue.
for example, biodegradable polymer nanoparticles formed by poly(glyceroladipate-co-ω-pentadecalactone), pga-co-pdl, proved themselves as an effective antigen carrier for inhaled vaccination [95] , while the vaccine based on the adenovirus with the particles ranging from 4 to 10 µm induced stable protective immunity when administered via inhalation to monkeys [96] . lastly, a method for vlp-vaccination via inhalation was proposed as protection against the human papilloma virus [97] . one of the most important properties of vlps is mimicking virus particles and the consequent ability to induce a strong immune response to the antigen which they demonstrate irrespective of the source of the monomers which multimerize into vlps, these being either insect viruses, in particular the gypsy virus [56, 61] , or plant viruses [98] . vlps based on the structural proteins of plant viruses produced in plants [99] make it possible to obtain vaccines with another nonconventional way of administration, edible vaccines [100] . vaccines of this type may be synthesized directly in plant forage, with the oral vaccination of this kind inducing an immune response. expression vectors for foreign protein production in plants have been developed based on plant viruses, which allows obtaining plant-producing recombinant viruses or vlps displaying the target antigen on their surface [101, 102] . vlps assembled from the structural proteins of plant viruses are also used to design functionalization platforms for antigen binding. these platforms are based on the strategies described above. in particular, they take advantage of the reactivity of the lys ε-group to immobilize the antigen on the surface of the plant's vlps [102] [103] [104] [105] . using the tobacco mosaic virus [106] and potato x virus [104] , it has been shown that a considerably large antigen can be immobilized on the surface of the corresponding vlps via the formation of the streptavidin-biotin complex. 
finally, a platform for antigen immobilization using the tag-catcher conjugation system is being developed on the basis of the structural protein of a plant virus [107]. it should be noted that the drawbacks of vlp vaccines manufactured using proteins which are not specific for mammals, for example, plant virus proteins, include the development of an immune response to the structural component of the particles, that is, the plant virus protein. undoubtedly, there is no therapeutic benefit in developing such immunity; whether it has adverse effects remains to be shown by further studies.
table 1 (fragment): model molecules including biotin, alexa 488 (fluorescent dye), and gfp protein conjugated with trisnta [81, 87]; isopeptide bond formation between spytag (integrated into vlp monomers) and spycatcher (fused with antigen): vaccine against breast cancer (her2 antigen presentation), also implemented to develop vaccines against malaria and tuberculosis [82-86].
the use of vlps for vaccination in medicine is becoming increasingly widespread. this point is reinforced by the list of clinical trials of vlp vaccines prepared by the united states national institutes of health. currently, more than a hundred vlp vaccines, which are directed against human and avian influenza viruses, the norwalk virus, norovirus, hiv-1, the chikungunya virus, and the foot-and-mouth disease virus, as well as against melanoma, adenocarcinoma, papillomatoses, and cervical cancer, are undergoing clinical trials (http://clinicaltrials.gov/ct2/search/index). it may be expected that the variety of nanoparticle vaccines against different cancers and viral infections will grow steadily, and that vlps which can be used for immunization against microorganisms belonging to different taxa, as well as against helminths, will also be developed.
we thank the designers oleg vasil'ev and galina podzolkova (institute for statistical studies and economics of knowledge (issek), national research university higher school of economics) for their assistance in preparing the illustrations. the article was prepared within the framework of the basic research program at the national research university higher school of economics (hse) and supported within the framework of a subsidy by the russian academic excellence project 5-100. the authors declare that they have no conflict of interest. this article does not contain any studies involving animals or human participants performed by any of the authors. assessing sequence plasticity of a virus-like nanoparticle by evolution toward a versatile scaffold for vaccines and drug delivery identifying and engineering promoters for high level and sustainable therapeutic recombinant protein production in cultured mammalian cells going to bat(s) for studies of disease tolerance mesothelin virus-like particle immunization controls pancreatic cancer growth through cd8+ t cell induction and reduction in the frequency of cd4+ foxp3+ icos-regulatory t cells quasispecies and virus vaccinomics approach to tick vaccine development immunome-derived vaccines modeller: generation and refinement of homology-based protein structure models template-based modeling and free modeling by i-tasser in casp7 proceedings: basic considerations on the requirements to be met by disinfection procedures: the swiss model esypred3d: prediction of proteins 3d structures enhancement of protein modeling by human intervention in applying the automatic programs 3d-jigsaw and 3d-pssm exploring the extremes of sequence/structure space with ensemble fold recognition in the program phyre protein distance constraints predicted by neural networks and probability density functions a historical perspective of template-based protein structure prediction high-resolution structure prediction and the crystallographic phase 
problem scoring function for automated assessment of protein structure template quality a comparative immunogenicity study of hiv-1 virus-like particles bearing various forms of envelope proteins, particles bearing no envelope and soluble monomeric gp120 stabilized hiv-1 envelope glycoprotein trimers for vaccine use overexpression of a virus-like particle influenza vaccine in eri silkworm pupae, using autographa californica nuclear polyhedrosis virus and host-range expansion viruslike particle engineering: from rational design to versatile applications phase i trial of a cd8+ t cell peptide epitopebased vaccine for infectious mononucleosis pigs immunized with a novel e2 subunit vaccine are protected from subgenotype heterologous classical swine fever virus challenge liposome: composition, characterisation, preparation, and recent innovation in clinical applications vaccine delivery using nanoparticles polymers in vaccine formulation alum-functionalized graphene oxide nanocomplexes for effective anticancer vaccination immunotherapy applications of carbon nanotubes: from design to safe applications moving towards a new class of vaccines for non-infectious chronic diseases vaccine delivery: a matter of size, geometry, kinetics and molecular patterns interaction of viral capsid-derived virus-like particles (vlps) with the innate immune system. vaccines (basel) drosophila melanogaster s2 cells for expression of heterologous genes: from gene cloning to bioprocess development expression of the retrovirus gypsy gag in spodopterafrugiperda cell culture with the recombinant baculovirus activation of human somatostatin receptor type 2 causes inhibition of cell growth in transfected hek293 but not in transfected cho cells regulation of cellular antiviral signaling by modifications of ubiquitin and ubiquitinlike molecules preclinical development and production of virus-like particles as vaccine candidates for hepatitis c. front. microbiol. 
8, 2413 relationship between humoral immune responses against hpv16, hpv18, hpv31 and hpv45 in 12-15 year old girls receiving cervarix ® or gardasil ® vaccine development of imaged capillary isoelectric focusing method and use of capillary zone electrophoresis in hepatitis b vaccine recombivax hb ® virus-like particle-based vaccine against coxsackievirus a6 protects mice against lethal infections development of a vlpbased vaccine in silkworm pupae against rabbit hemorrhagic disease virus development of recombinant porcine parvovirus-like particles as an antigen carrier formed by the hybrid vp2 protein carrying immunoreactive epitope of porcine circovirus type 2 oral application of freeze-dried yeast particles expressing the pcv2b cap protein on their surface induce protection to subsequent pcv2b challenge in vivo virus-like particles of chimeric recombinant porcine circovirus type 2 as antigen vehicle carrying foreign epitopes immunogenicity and immunoprotection of porcine circovirus type 2 (pcv2) cap protein displayed by lactococcus lactis large-scale production of foot-andmouth disease virus (serotype asia1) vlp vaccine in escherichia coli and protection potency evaluation in cattle duck hepatitis a virus structural proteins expressed in insect cells selfassemble into virus-like particles with strong immunogenicity in ducklings effect of nucleocapsid on multimerization of gypsy structural protein gag protection of a novel epitope-rna vlp double-effective vlp vaccine post-translational regulation of antiviral innate signaling recombinant hepatitis e virus like particles can function as rna nanocarriers hamster polyomavirusderived virus-like particles are able to transfer in vitro encapsidated plasmid dna to mammalian cells high cloning capacity of in vitro packaged sv40 vectors with no sv40 virus sequences cell-free assembly of a polyoma-like particlefrom empty capside and dna gene transfer using human polyomavirus bk virus-like particles expressed in insect cells 
the structural protein gag of the gypsy retrovirus forms virus-like particles in the bacterial cell in vitro assembly of retroviruses preparation by alkaline treatment and detailed characterisation of empty hepatitis b virus core particles for vaccine and gene therapy applications sumoylation-disrupting was mutation converts wasp from a transcriptional activator to a repressor of nf-κb response genes in t cells homologous and heterologous type 2 casein kinases have the same effect on the affinity for rna of the gag structural protein of gypsy (mdg4) sumo-nonclassical ubiquitin the functional motifs that are revealed in the gypsy gag amino acid sequence analysis of human immunodeficiency virus type 1 gag ubiquitination global landscape of hiv-human protein complexes viruslike nanoparticle vaccine confers protection against toxoplasma gondii construction of polyomavirus-derived pseudotype virus-like particles displaying a functionally active neutralizing antibody against hepatitis b virus surface antigen construction and immunological evaluation of multivalent hepatitis b virus (hbv. 
core virus-like particles carrying hbv and hcv epitopes multiple presentation of foreign peptides on the surface of an rna-free spherical bacteriophage capsid promising ms2 mediated virus-like particle vaccine against foot-and-mouth disease plasmids encoding foot-and-mouth disease virus vp1 epitopes elicited immune responses in mice and swine and protected swine against viral infection adenovirus-mediated rna interference against foot-andmouth disease virus infection both in vitro and in vivo promising multiple epitope recombinant vaccine against foot-and-mouth disease virus type o in swine symmetry controlled, genetic presentation of bioactive proteins on the p22 virus-like particle using an external decoration protein drosophila melanogaster endogenous retrovirus gypsy can propagate in drosophila hydei cells bacterially expressed polyprotein gag of retroelement gypsy (mdg4) is able to form multimeric complexes presence of gypsy (mdg4) retrotransposon in the extracellular virus like particles a novel strategy of veterinary vaccines production conditioned by the development of vaccinomics cellular and humoral immunogenicity of hamster polyoma virus-derived virus-like particles harboring a mucin 1 cytotoxic t-cell epitope reengineering viruses and virus-like particles through chemical functionalization strategies functionalizing nanoparticles with biological molecules: developing chemistries that facilitate nanotechnology immunodrugs: therapeutic vlp-based vaccines for chronic diseases histagged norovirus-like particles: a versatile platform for cellular delivery and surface display a molecular assembly system that renders antigens of choice highly repetitive for induction of protective b cell responses bacterial superglue enables easy development of efficient virus-like particle based vaccines peptide tag forming a rapid covalent bond to a protein, through engineering a bacterial adhesin improving the malaria transmission-blocking activity of a plasmodium 
falciparum 48/45 based vaccine antigen by spytag/spycatcher mediated viruslike display virus-like particle display of her2 induces potent anti-cancer responses structural analysis and insertion study reveal the ideal sites for surface displaying foreign peptides on a betanodavirus-like particle hepatitis e vaccine development: a 14 year odyssey the rts, s/as01 malaria vaccine in children 5 to 17 months of age at first vaccination delivering adjuvants and antigens in separate nanoparticles eliminates the need of physical linkage for effective vaccination diversity of research publications: relation to agricultural productivity and possible implications for sti policy. scientometrics immunization of mice against pneumococcal pneumonia by inhaled polysaccharide anatomical and physiological features of the human and animal respiratory system in the genesis of immunity after aerosol vaccination: 1. role of airway structure in mechanical retention of inhyaled material dry aerosol vaccination against newcastle disease: 1. safety and activity controls on chickens bovine serum albumin adsorbed pga-co-pdl nanocarriers for vaccine delivery via dry powder inhalation unique cellular and humoral immunogenicity profiles generated by aerosol, intranasal, or parenteral vaccination in rhesus macaques spect/ct study of bronchial deposition of inhaled particles. 
a human aerosol vaccination model against hpv immunology of viruslike particles different applications of virus-like particles in biology and medicine: vaccination and delivery systems a preliminary study of a lettuce-based edible vaccine expressing the cysteine proteinase of fasciola hepatica for fasciolosis control in livestock a launch vector for the production of vaccine antigens in plants rotavirus vp6 expressed by pvx vectors in nicotiana benthamiana coats pvx rods and also assembles into virus-like particles chemical conjugate tmv-peptide bivalent fusion vaccines improve cellular immunity and tumor protection potato virus x as a novel platform for potential biomedical applications incorporation of tetanus-epitope into virus-like particles achieves vaccine responses even in older recipients in models of psoriasis, alzheimer's and cat allergy. npj vaccines. 2, 30 modified tobacco mosaic virus particles as scaffolds for display of protein antigens for vaccine applications use of plant viruses and virus-like particles for the creation of novel vaccines key: cord-270273-a4iu9qg6 authors: ruiz, federico m.; gilles, ulrich; ludwig, anna-kristin; sehad, celia; shiao, tze chieh; garcía caballero, gabriel; kaltner, herbert; lindner, ingo; roy, rené; reusch, dietmar; romero, antonio; gabius, hans-joachim title: chicken grifin: structural characterization in crystals and in solution date: 2017-12-15 journal: biochimie doi: 10.1016/j.biochi.2017.12.003 sha: doc_id: 270273 cord_uid: a4iu9qg6 despite its natural abundance in lenses of vertebrates the physiological function(s) of the galectin-related inter-fiber protein (grifin) is (are) still unclear. the same holds true for the significance of the unique interspecies (fish/birds vs mammals) variability in the capacity to bind lactose. in solution, ultracentrifugation and small angle x-ray scattering (at concentrations up to 9 mg/ml) characterize the protein as compact and stable homodimer without evidence for aggregation. 
the crystal structure of chicken (c-)grifin at seven ph values from 4.2 to 8.5 is reported, revealing compelling stability. binding of lactose despite the arg71val deviation from the sequence signature of galectins matched the otherwise canonical contact pattern with thermodynamics of an enthalpically driven process. upon lactose accommodation, the side chain of arg50 is shifted for hydrogen bonding to the 3-hydroxyl of glucose. no evidence for a further ligand-dependent structural alteration was obtained in solution by measuring hydrogen/deuterium exchange mass spectrometrically in peptic fingerprints. the introduction of the asn48lys mutation, characteristic for mammalian grifins that have lost lectin activity, lets labeled c-grifin maintain capacity to stain tissue sections. binding is no longer inhibitable by lactose, in contrast to the wild-type protein. these results establish the basis for detailed structure-activity considerations and are a step to complete the structural description of all seven members of the galectin network in chicken.
the emerging versatility of physiological functions of animal and human lectins gives ample reason to characterize their structures in detail. in overview, more than 12 folds have been identified that convey ability to bind glycans [1-4]. sequence divergence after duplication events, starting from an ancestral gene, then led to forming families of homologous proteins. the case study on galectins (β-galactoside-binding proteins with β-sandwich fold and a sequence signature responsible for ligand contact [5]) describes such a network with overlapping and distinct expression profiles [6-8]. this emerging evidence poses the challenge of a complete characterization of the galectins of an organism. it would be a step forward towards delineating rules of network design and providing insights into the functional meaning of sequence variations.
in this respect, the galectin fold presents remarkable adaptability for accommodating ligands of different biochemical nature. the crystallographic or nmr-spectroscopic study of such domains of protozoan (toxoplasma gondii) micronemal protein 1 and protein 2-associated protein [9, 10] as well as the n-terminal modules of bovine and murine coronavirus spike proteins [11, 12] has taught the following instructive lesson: conversion of the concave topology of the carbohydrate-binding pocket established by surrounding loops to a predominantly hydrophobic surface shifts specificity from glycans to distinct proteins. looking at the assumedly crucial sequence signature for contact to a β-galactoside, a less dramatic deviation than just described may not necessarily impair glycan binding. for example, the change of a seemingly essential asn (in position 46 in human galectin-1 (gal-1)) to ala at the equivalent position 64 in a fungal (agrocybe cylindracea) galectin is not detrimental. it is neutralized by a five-residue insertion at positions 42-46 (with the inserted asn46 taking the place of the asn lost at position 64) [13, 14]. the structural way to maintain affinity for lactosides in the conger eel (conger myriaster) galectin from peritoneal cells (con-p), although as many as seven of eight conserved amino acids are replaced as reported in [15], has not yet been characterized. alternatively, sequence deviation(s) can alter carbohydrate specificity. the trp81arg change implemented binding to bi-n-acetylated disaccharides (chitobiose, lac-dinac) in the third galectin protein (cgl3) from the inky cap mushroom coprinopsis cinerea [16]. a unique situation, i.e. a species-dependent loss of lectin activity, is encountered for the galectin-related inter-fiber protein, termed grifin. this protein was first described in rat as a lens-specific protein [17, 18].
it was found in the insoluble fraction of nuclear fiber cells and localized at the interface between adjacent fiber cells, representing about 0.5% of total protein in adult lens. its gene with an elaborate promoter region to facilitate lens-specific expression is a common constituent of vertebrate genomes [19]. given this site-specific occurrence and conserved presence among vertebrates, it is exceptional and thus intriguing to see sequence deviations at canonical positions between mammalian and bird/fish grifins. especially, the equivalent of the already noted asn46 (in human gal-1) is turned to lys in mammalian grifins [17]. as a consequence, a bead (lactosylated sepharose beads) assay revealed no lectin activity for rat grifin [17]. in contrast, grifins from zebrafish [20] and chicken [19] were bona fide lectins. two reasons prompted us to initiate structural analysis of grifins by studying chicken (c)-grifin: the mentioned plasticity within the galectin fold and our long-term interest to achieve complete crystallographic documentation of galectin structures in an organism. towards this end, chicken with its (only) seven family members is a favorable model. since c-grifin shares a deviation from the canonical sequence for lactose binding with mammalian grifins, i.e. the arg71val exchange, defining the resulting contact pattern to the ligand will enable a comparison to common features. we here combine crystallographic analysis, reaching atomic resolution, with studies of c-grifin in solution. they were started by determining its quaternary structure, a key feature of proto-type galectin functionality [21]. quaternary structure and tendency for aggregation were first examined in solution, up to a concentration of 9 mg/ml, by ultracentrifugation and small angle x-ray scattering (saxs). crystallographically, respective specimens from solutions at seven ph values ranging from 4.2 to 8.5 could be processed to monitor stability of structural features.
the interaction with lactose was monitored in the crystals and also in solution. here, thermodynamic parameters (by isothermal titration calorimetry (itc)) and profiles of hydrogen/deuterium exchange (hdx) in the absence and presence of the ligand were measured. faced with the conundrum that reestablishing the sequence signature in rat grifin with the lys-to-asn reconstitution did not repair the loss of lectin activity [17], we finally probed into the effects of site-specific mutations on c-grifin's carbohydrate-binding activity by a histochemical assay.
the wild-type protein was obtained after recombinant production directed by a pgemex-1 vector with the respective cdna insert and purified by affinity chromatography using lactose-bearing resin as described [19]. cdnas of the mutants of c-grifin, i.e. the trp66leu and the asn48lys single-site mutants, the asn48lys/arg50val double mutant and the asn48lys/arg50val/trp66leu triple mutant, were prepared by using the quikchange™ site-directed mutagenesis protocol (agilent technologies, munich, germany). the following primer pairs were used: 5′-c ctg gcc aac cac ctg ggg aag gag gag g-3′ and 5′-c ctc ctc ctt ccc cag gtg gtt ggc cag g-3′ (trp66leu), 5′-c gcc ttc cac ttt aag ccc cgc ttt gcc agc-3′ and 5′-gct ggc aaa gcg ggg ctt aaa gtg gaa ggc g-3′ (asn48lys), 5′-gct ggc aaa gac ggg ctt aaa gtg gaa ggc g-3′ and 5′-c gcc ttc cac ttt aag ccc gtc ttt gcc agc-3′ (asn48lys/arg50val) (exchanged base pairs are underlined). the cdna of the triple mutant was generated by further altering the cdna of the double mutant (asn48lys/arg50val) by respective processing with the primers for the trp66leu mutant. successful implementation of the intended changes was checked by dna sequencing (sequiserve, vaterstetten, germany).
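in a quikchange-type protocol the two primers of a pair anneal to opposite strands at the same site, so each should be the reverse complement of its partner. the following minimal python sketch checks this for the three primer pairs quoted above, with the sequences transcribed from the text with spaces removed; the function and variable names are illustrative and not part of the original protocol.

```python
# Primer pairs as printed in the text (5'->3'), spaces removed.
PRIMER_PAIRS = {
    "trp66leu": ("cctggccaaccacctggggaaggaggagg",
                 "cctcctccttccccaggtggttggccagg"),
    "asn48lys": ("cgccttccactttaagccccgctttgccagc",
                 "gctggcaaagcggggcttaaagtggaaggcg"),
    "asn48lys/arg50val": ("gctggcaaagacgggcttaaagtggaaggcg",
                          "cgccttccactttaagcccgtctttgccagc"),
}

# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_complementary_pair(first: str, second: str) -> bool:
    """True if the two primers are exact reverse complements of each other."""
    return reverse_complement(first) == second


if __name__ == "__main__":
    for name, (first, second) in PRIMER_PAIRS.items():
        print(name, len(first), is_complementary_pair(first, second))
```

a check of this kind catches transcription errors in primer sequences before ordering; it says nothing about melting temperature or secondary structure.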
mutant proteins of c-grifin were designed as fusion proteins with a glutathione s-transferase (gst) part in a pgex-6p-2 vector (ge healthcare, münchen, germany). they were purified after recombinant production by affinity chromatography using glutathione-presenting sepharose 4b (ge healthcare). thereafter, the linkage between both proteins was cleaved by gst-tagged human rhinovirus 3c protease (at a ratio of 1:100 (w/w)), then the c-grifin part was separated from released gst and the tagged protease by a second round of affinity chromatography as described [19]. protein was either precipitated by adding (nh4)2so4 or labeled by biotinylation for the histochemical analysis, as described for human gal-1 [22].
protein samples were diluted to final concentrations of 0.5 and 1.0 mg/ml in 5 mm phosphate buffer containing 150 mm nacl and 4 mm β-mercaptoethanol, and solutions were pre-cleared at 16,000 × g. sedimentation-velocity experiments were run at 293 k in an optima xl-i analytical ultracentrifuge (beckman coulter, indianapolis, usa) with an an50-ti rotor and standard double-sector epon-charcoal center pieces (1.2 cm optical path length). measurements were performed at 48,000 rpm, registering the course of protein migration every minute at 280 nm. rayleigh interferometric detection was used to monitor the course of development of the concentration gradient as a function of time and radial position, and the data were analyzed using the sedfit software (version 14.7).
saxs data were collected at the bm29 beamline (esrf synchrotron, grenoble, france) using the biosaxs robot and a pilatus 1m detector (dectris, baden-daettwil, switzerland) with synchrotron radiation at a wavelength of λ = 1.0 å and a sample-detector distance of 2.867 m [23]. each measurement consisted of 10 frames, each of 1 s exposure of a 100 µl sample flowing continuously through a 1 mm diameter capillary during exposure to x-rays.
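the wavelength and sample-detector distance given above fix the relation between a position on the detector and the momentum transfer q = 4π sin(θ)/λ, where 2θ is the scattering angle. the following python sketch illustrates this geometry under the assumption of a flat detector perpendicular to the beam; the radial offset used in the example is a hypothetical value, not one reported in the text.

```python
import math

WAVELENGTH_ANGSTROM = 1.0      # from the text
DETECTOR_DISTANCE_M = 2.867    # from the text


def momentum_transfer(radial_offset_m: float,
                      wavelength_angstrom: float = WAVELENGTH_ANGSTROM,
                      distance_m: float = DETECTOR_DISTANCE_M) -> float:
    """Return q in inverse angstrom for a point on a flat detector.

    radial_offset_m is the distance of the point from the direct-beam
    position, measured on the detector plane (assumed perpendicular
    to the beam).
    """
    if radial_offset_m < 0:
        raise ValueError("radial offset must be non-negative")
    two_theta = math.atan2(radial_offset_m, distance_m)  # scattering angle
    return 4.0 * math.pi * math.sin(two_theta / 2.0) / wavelength_angstrom


if __name__ == "__main__":
    # Hypothetical offset of 10 mm from the beam centre.
    print(f"q = {momentum_transfer(0.010):.5f} 1/angstrom")
```

at these small angles q is nearly proportional to the radial offset, q ≈ 2πr/(λd), which is why the long camera length gives access to the low-q region needed to assess aggregation.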
buffer scattering was measured immediately before each measurement of the corresponding protein sample at 277 k. the obtained scattering profiles were spherically averaged, and the buffer scattering intensities were subtracted using in-house software. protein samples were prepared at concentrations of 1, 2, and 9 mg/ml in 20 mm phosphate buffer (ph 7.0) containing 150 mm nacl and 5 mm lactose. particle envelopes were generated ab initio using the program dammif [24]. multiple runs were performed to generate 30 independent model shapes that were combined and filtered to produce an averaged model using the damaver software package [25].
suspensions of precipitated protein were extensively dialyzed against 5 mm phosphate buffer (ph 7.0) containing 150 mm nacl and 4 mm β-mercaptoethanol. protein was purified by affinity chromatography using a column packed with lactosylated sepharose 4b to remove inactive material. fractions after elution with 200 mm lactose were concentrated using amicon ultra 10,000 mwco centrifugal filter units and then loaded on a hiprep 16/60 sephacryl s-200 column equilibrated with phosphate-buffered saline (ph 7.0) containing 4 mm β-mercaptoethanol and 5 mm lactose. the protein eluted as a single peak, and the respective fractions were concentrated to 15 mg/ml. separately, protein was purified in the absence of lactose by loading sample directly after dialysis on a gel filtration column as above, equilibrated with phosphate-buffered saline (ph 7.0) containing 4 mm β-mercaptoethanol. eluted protein fractions were finally concentrated as described above. crystallization was performed at 295 k using the sitting-drop vapour diffusion method. small crystals appeared in a wide range of crystallization conditions. in detail, crystal size and quality were optimal at the following conditions: 20% peg 8000, 0.1 m phosphate/citrate (ph 4.2); 30% peg 4000, 0.
each crystal was flash-cooled by immersion in liquid nitrogen using the corresponding crystallization medium supplemented with 30% ethylene glycol as cryo-solution. all data collections were done at the xaloc beamline of the alba synchrotron (cerdanyola del vallès, spain) except for the crystal obtained at ph 4.6 that was taken to the proxima 2 beamline of the soleil synchrotron (gif-sur-yvette, france). diffraction data were processed using xds [26] and aimless [27] . a summary of reflection data parameters is presented in table 1 . the molecular replacement method was used to solve c-grifin's structures, using the program phaser from the phenix suite [28] . a theoretical model generated by the swiss-model server (http://swissmodel.expasy.org) was applied as a search probe [29] . the phenix suite [28] was employed for structural refinements, and addition of water molecules and placement of lactose were carried out manually with the coot program [30] , if necessary. the statistical details of final models are given in table 1 . molecular illustrations of the structural models were generated using pymol.
solutions of c-grifin (5 mm or 200 mm (at ph 8.5)) were prepared in phosphate-buffered saline (10 mm, ph 7.2), in tris-hcl (30 mm, ph 8.5) or in sodium acetate (10 mm, ph 5.2). the ligand-containing solutions (100 mm or 60 mm (at ph 8.5) for β-lactosides, 40 mm for the 3′-o-sulfated derivative of the 2-naphthyl β-lactoside) were also freshly prepared just before starting the experiment. titrations were run by injecting first 2 µl, then 10 µl (or 5 µl for the titration at ph 8.5) of ligand-containing solution into the cell of a microcal vp-itc calorimeter (ge healthcare), filled with the protein-containing solution, at 298 k. control experiments using buffer excluded lectin-independent heat generation. resulting data were fitted using the origin software package. 
a solution of lactose and c-grifin at a molar ratio of 2344:1 in equilibration buffer (20 mm potassium phosphate containing 97 mm sodium chloride in h2o, ph 7.5), with a lactose-free solution as control, was incubated for 1 h at room temperature. hdx was started at room temperature by adding 10-times deuteration buffer (20 mm potassium phosphate containing 97 mm sodium chloride in d2o, pd 7.9, ph 7.5) to each sample. after 0.5 min, 1 min, 10 min, 30 min, 60 min and 240 min, the deuterated samples were quenched by adding ice-cold quenching buffer (100 mm potassium phosphate buffer containing 5 m guanidinium chloride and 0.5 m tris(2-carboxyethyl)phosphine hydrochloride in h2o, ph 2.4) at a 1:1 ratio, followed by immediate freezing on dry ice, as previously performed for the n-terminal lectin domain of chicken galectin (cg)-8, referred to as cg-8n [31] . in parallel, protein solutions without lactose were kept in equilibration buffer without deuteration and quenched in the same way; this undeuterated control sample was used for identification of the peptic peptides. experiments were performed in triplicate.
quenched undeuterated and deuterated sample solutions (320 pmol c-grifin) were injected into a nanoacquity uplc system with hdx technology (waters corporation, milford, usa), allowing an online peptic digest on a poroszyme-presenting pepsin cartridge (2.1 mm × 30 mm; applied biosystems, foster city, usa) at 15 °c. peptic peptides were then trapped on an acquity uplc beh c18 vanguard pre-column (1.7 µm, 2.1 mm × 5 mm; waters) and separated with a linear acetonitrile gradient over 14 min at 0 °c on an acquity uplc beh c18 column (1.7 µm, 1 mm × 100 mm; waters). the column outlet was directly connected to a synapt g2 hdms mass spectrometer equipped with a lockspray esi source (waters). mass spectra for the peptic peptides were acquired in the mse mode over the range of m/z 50-2000. 
identified peptic peptides obtained by the digestion of c-grifin were organized as a list using the protein lynx global server software (waters) and the mse data for the undeuterated controls. by applying dynamx software (waters), the relative deuterium uptake for each identified peptide was calculated by subtracting the centroid mass of the peptide in the undeuterated sample from that of the corresponding peptide in the deuterated sample, for both ligand-loaded and ligand-free c-grifin. the ligand-dependent difference in relative deuterium uptake was determined using dynamx, and its significance was evaluated by calculating a two-sided confidence limit with a significance level of 0.02 [32] . the result of this evaluation was visualized by color coding of β-strands in the three-dimensional model of c-grifin.
biotinylated c-grifin and its asn48lys, trp66leu, asn48lys/arg50val and asn48lys/arg50val/trp66leu mutants were tested on paraffin-embedded sections (5 µm) of adult chicken kidney, obtained after fixation of tissue specimens for 24 h in bouin's solution and embedding, and mounted on superfrost ® plus glass slides (menzel, braunschweig, germany). following optimized processing for deparaffinizing sections, blocking of sites for nonspecific binding, incubation of biotinylated probe in the absence and presence of lactose (200 mm) and signal development by bound avidin-alkaline phosphatase (ap) conjugate with vector ® red ap substrate (enzo life sciences, lörrach, germany) as described [33, 34] , documentation was recorded using an axioimager.m1 microscope (carl zeiss microimaging, göttingen, germany) equipped with an axiocam mrc3 and mrc digital camera and axiovision (version 4.6) software. 
processing controls with an incubation step using plain buffer instead of buffer containing the labeled probe excluded probe-independent staining. titrations with probe concentrations of 4 µg/ml, 12 µg/ml, 24 µg/ml and 48 µg/ml were systematically performed for all proteins and for the pair of measurements in the absence and presence of lactose.
at low concentrations in the course of gel filtration, grifins from chicken, mouse and rat eluted at the position of a dimer, with evidence for inter-subunit exchange when fractionating mixtures of tag-free and tagged proteins [17, 19, 35] . considering the high local concentration in the lens, we examined the protein status after further increasing the concentration. in sedimentation-velocity experiments at concentrations up to 1.0 mg/ml, c-grifin migrated as a single sharp peak with a sedimentation coefficient of 2.6 ± 0.1 s. in comparison, homodimeric human gal-1 and cg-1a gave very similar values under these conditions, whereas cg-2 has an s-value of 2.03 ± 0.1 [36] . frictional ratios among homodimeric proto-type cgs can thus differ when measured by ultracentrifugation: the ratio is about 1.3 for c-grifin and cg-1a but 1.63 for cg-2 [36] . of note for cg-2, heterogeneity in size distribution had been observed for this protein at 4 mg/ml, with the acquisition of the quaternary structure as trimer of dimers [37] . we proceeded to determine the quaternary structure of c-grifin at concentrations up to 9.0 mg/ml. when reaching the concentration of 9.0 mg/ml in saxs experiments, the particle distribution function of c-grifin is still in agreement with exclusive presence of a dimer. the ab initio model of the dimer calculated with these data is shown in fig. 1 . its elongated disc shape fits well with the frictional ratio of the dimer as calculated using data obtained by ultracentrifugation. overall, no evidence for aggregation up to this concentration was obtained. 
these results thus define the quaternary structure in solution as homodimer with no indication for higher-order aggregates at concentrations up to 9 mg/ml under the given conditions.
structural analysis was next taken to the level of crystallography. in view of the remarkable long-term stability of c-grifin in lenses, it was attempted to prepare a panel of crystals from solutions over a broad ph range. a wide range of combinations of buffers and additives was tested and structural analysis became possible with crystals obtained at seven conditions. in addition to presence of different types of additives, they covered the ph range from 4.2 to 8.5. as summarized in table 1 , the resolution reached the level of 0.96 å for crystals grown at ph 7.5. space groups differed within this group, p21 seen in three cases, either p4 2 2 2 or p22 1 2 1 in two cases. the asymmetric unit cell content was variable in the seven crystals, being either one, two or four (table 1).
the c-grifin monomer adopts the typical β-sandwich fold with two anti-parallel β-sheets of six (s1-s6) and five (f1-f5) β-strands (fig. 2) . the short 3₁₀ helix in the long loop connecting the s2/f5 strands is also a typical feature. among the proto-type cgs [37-39], the loop between the s3/s4 strands reaches the largest length, as highlighted by rectangles in fig. 3 . loop-length variation also holds true for the s4-s5 section. it is four amino acids shorter than in cg-1a/b, while superimposing with that of cg-2 (fig. 3) . as a consequence, a flat surface is created in this region (fig. 3) . length, conformation and structure of this loop have been described as a discriminatory factor between human gal-1 and -3 when interacting with the tf antigen [40] , the core 1 disaccharide of mucin-type o-glycosylation (cd176) [41] . overall, alignment is at a high level, as reflected by rmsd values at 2.9 å (to cg-1a, cg-2) and 2.1 å (cg-1b). 
for comparison, respective values are 2.4 å for cg-1a/cg-2 and 1.5 å for the paralogue pair cg-1a/b. the availability of crystals of c-grifin exposed to a wide range of ph values enabled a detailed comparison of the seven crystal structures. it revealed no marked structural changes, the common homodimer being shown in fig. 4a . to substantiate this conclusion by a number, the rmsd value was 0.337 å for the pairwise analysis of the crystal structures at the tested ph minimum (4.2) and its maximum (8.5) . since the availability of the crystallographic homodimer structure made its placement into the sphere of the saxs-derived model possible, as done in fig. 1 , the fitting of crystal and solution structures could be tested and was found to be excellent. the unit cell content of the seven crystals differs considerably, as indicated in table 1 . the example of cg-2, which is arranged as a non-crystallographic trimer of dimers (added to fig. s1 ) and occurs in part as hexamer in solution at 4 mg/ml in sedimentation-velocity analysis, underlines the capacity of a galectin homodimer for association [13] . crystal packing of other galectins to a dimer of dimers [42, 43] , also seen for human gal-1 in solution in an aprotic solvent [44] , that can even be stable in solution [16,45-47] as well as the aggregation of a monomer (murine gal-9n) to a dimer [48] and na⁺-mediated association of the n-terminal lectin domain of murine gal-4 to a tetramer around the crystallographic four-fold axis [49] indicate the potential of galectin domains for building higher-order aggregates under special conditions. 
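The rmsd values quoted above summarize how far matched atoms lie apart after two structures have been superimposed. The sketch below illustrates only the final averaging step, under the assumption that the two coordinate sets are already aligned and paired atom by atom; the function name `rmsd` and the toy coordinates are hypothetical and are not taken from the deposited structures or from the software used in this work.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same length unit as the input, e.g. angstrom)
    between two equally long lists of (x, y, z) positions.
    Assumption: the two sets are already superimposed and paired by index."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))

# Toy example with made-up positions (not real c-grifin atoms).
structure_1 = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
structure_2 = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
print(rmsd(structure_1, structure_2))
```

A full comparison of two crystal structures would first need an optimal superposition (for example by a least-squares fit), which this sketch deliberately leaves out.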
in this study, the availability of crystals of the same protein obtained at different conditions documents the very low degree of influence of the ph value on the galectin fold, with some variability in unit cell content.
looking closely at the interface region between the two subunits of c-grifin's homodimer, it is established by the f1/s1 strands from the n- and c-termini of each subunit in the homodimer (residues 4-14/127-136) (fig. 4b) . around the s1 edge, the salt bridge between arg5/glu12 and hydrogen bonds between glu7/leu9 stabilize the assembly. in the case of the anti-parallel f1 strands, hydrogen bonding between ser131/thr135 and also ser130, ile134 and lys136 serve this purpose. typically for cg homodimers, hydrophobic contacts via phe6, ile129, val132 and ile134 come into play, too (fig. 4b) . together, these contacts enable c-grifin to act as cross-linker, reported previously based on haemagglutination assays [19] , or as a kind of molecular glue (please see below).
the susceptibility of cell bridging to be abolished by presence of lactose at 1.0-1.5 mm, together with c-grifin's binding to lactose-bearing beads used in affinity chromatography, was indicative of lectin activity. here, we first confirm this conclusion by itc analysis and then characterize the contact site and its pattern of contact formation with lactose crystallographically. analysis of itc data for the association of c-grifin and the β-methyl derivative of lactose revealed this process to be enthalpically driven. homodimeric cg-1a and cg-2 had similar thermodynamics of binding, whereas ligand association to cg-1b produced a relatively small enthalpy gain [13] . per dimer, the number of binding sites of 1.76 ± 0.42 was reached at ph 7.2. it increased to 1.89 ± 0.45 by exchanging the methyl by a 2′-naphthyl group, together with the considerable enhancement of affinity from 140 µm to 8.6 µm. 
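To put the affinity gain on exchanging the methyl for the naphthyl aglycone into energetic terms, the ratio of the two dissociation constants can be converted into a difference in binding free energy via ΔΔG = RT ln(Kd,new / Kd,ref). The sketch below is an illustration only: the function name `delta_delta_g` is hypothetical, the temperature of 298 k is taken from the itc protocol, and only the ratio of the two reported values (140 and 8.6) enters, so the result does not depend on the concentration unit as long as both values share it.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol*K)

def delta_delta_g(kd_ref, kd_new, temperature_k=298.0):
    """Change in binding free energy (J/mol) when the dissociation constant
    goes from kd_ref to kd_new; negative values mean tighter binding.
    Both constants must be given in the same unit."""
    if kd_ref <= 0 or kd_new <= 0:
        raise ValueError("dissociation constants must be positive")
    return R * temperature_k * math.log(kd_new / kd_ref)

# Reported k d values for the methyl and the naphthyl lactoside (same unit).
ddg = delta_delta_g(140.0, 8.6)
print(f"{ddg / 1000:.1f} kJ/mol")  # about -6.9 kJ/mol
```

This relation holds for a simple one-site equilibrium; it says nothing about how the gain is split between enthalpy and entropy, which is what the itc fit itself provides.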
adjusting the ph to 8.5 (in tris buffer) lowered ligand loading to well below 1 and also the affinity. at the acidic side, no heat production was observed at ph 5.2 (in sodium acetate). this ph profile with rather sharp decreases in binding activity to both sides corresponded well with assays on human gal-1 using lactose-bearing beads or surface-immobilized asialofetuin [50, 51] . fittingly, binding sites for lactose in the homodimer were occupied by the ligand in the crystals obtained at ph values of 6.2, 6.5, 8.0 and 8.5, and they remained free when the ph was set to 4.2/4.6. interestingly, a special circumstance precluded complete loading of both sites of the homodimer at ph 7.5 in the crystal. at this ph value, crystallographic contacts with asn28 of a symmetry-related subunit impede access to one lactose-binding site of the homodimer. this situation makes a comparison of lactose-free with lactose-loaded subunits possible at atomic resolution. the resulting rmsd value of this superimposition is 0.263 å, arguing in favor of a common shape irrespective of ligand loading. the contact site in crystals at each ph value is composed of amino acids from β-strands s4-s6 (fig. 4a) . interestingly, the electron density map of the ligand shows a preference for presence of α-lactose instead of the natural β-anomer at the reducing end, as shown in fig. s2 , likely caused by crystallographic contacts. despite sequence variations among the homodimeric cgs, the canonical contact profile between the protein and lactose is maintained, except for the arg71val substitution (fig. 5) . as illustrated in this figure, the axial 4′-hydroxyl group of the galactose moiety is involved in hydrogen (h-)bonding with the conserved his46, asn48 and arg50 moieties, sequence conservation of asn59/glu69 enabling h-bonding with the 6′-hydroxyl group. 
in addition, the 3′-hydroxyl is in water-mediated contact with glu32, and c-h/π interactions with trp66 complete the binding pattern for the galactose moiety. since trp oxidation to the oxindole impairs ligand binding, as first shown for electrolectin [52] , the substantial reactivity of this residue (up to 18.6% after 24 h when exposed to 0.05% hydrogen peroxide [19] ) is noteworthy. deamidation within peptide 58-70 (after seven days at 25 °c/40 °c reaching 9.2%/13.5% [19] ) may also have an impact on the protein's lectin activity, if not protected by bound ligand. in addition to galactose, the 3-hydroxyl group of the glucose part of lactose contributes to ligand binding, as shown for cg-1a and cg-2 by measuring inhibitory capacity of methyl β-lactosides on lectin binding to asialofibrin [53] . β-methyl derivatives of lactose with 3-deoxy and 3-deoxy-3-methyl glucose have low or no inhibitory potency on cg-1a (and human gal-1). in c-grifin, this 3-hydroxyl connects by h-bonding with arg50 and glu69 (fig. 5) . val71, taking the place of the arg residue conserved in homodimeric cgs and zebrafish grifin, is not able to engage in h-bonding. when for example compared to cg-2 [13] , the arg71val substitution causes a loss in h-bonding to this hydroxyl for c-grifin. this exchange of an amino acid, which is shared by mammalian grifins, nevertheless does not appear to be detrimental for lactose binding. considering the ph dependence, contribution to affinity by glu69 could be reduced by an increasing degree of its protonation. its pka was calculated to be 4.6 so that a loss of enthalpy generation may ensue at this site at low ph. going above neutral ph, his46 (at pka 6.75) may be a factor in the observed affinity decrease at ph 8.5, whose origin appears to be more complex than a single-site alteration. when extending testing to the 3′-o-sulfated derivative of 2-naphthyl β-lactoside in itc, the k d -value decreased from 8.6 to 0.13 µm. 
this marked enhancement of affinity by presence of the sulfate group is in contrast to 3′-sialylation, as observed in cell binding [19] . when comparing the binding mode of the negatively charged lactose derivative to cg-8n, reported previously [31] , with the situation in c-grifin, a lack of interaction with arg57 and also glu45 may have an influence on this grading of affinity to the lactoside and its 3′-modified derivatives. binding-site modeling with the 3′-o-sulfated lactose as ligand provides a set of putative contacts to the sulfate group that can adopt two orientations (fig. s3) . interestingly, these sets include contacts to asn48 and arg50 as well as trp66, albeit not with the same conformer, as is the case for the n-domains of gal-4 and -8 [54, 55] : gal-4c, in contrast, appears to employ a transient contact to trp256 for the slight affinity increase caused by the presence of the 3′-sulfate group [56] . when comparing in detail positions of amino acids of the ligand-free and -loaded subunits of the c-grifin homodimer, one ligand-dependent event of reorganization was revealed (fig. 6a) . the side chain of arg50 moves toward the glucose ring into the direction of trp66 from the position of pointing to asn48, in this place also in crystallographic interactions with ser54 and glu78 of a symmetry-related protein. interestingly, the accommodation of ligand into the n-terminal lectin domain of human gal-9 triggers such a movement of the side chain of arg87 to let its nh1 become hydrogen bonded with o2 of the glucose moiety [57] . the movement in c-grifin was also seen in crystals obtained at ph 6.2 (fig. 6b) . when c-grifin in solution bound to lactose on beads and was tryptically fragmented, peptides covering the amino acid stretches of 26-52 and 58-70 had sufficient interactions to maintain their binding during repeated washes and could be eluted [19] . this result is in line with the contact pattern seen in crystals. 
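The ph-dependence argument made earlier for glu69 (calculated pka 4.6) and his46 (pka 6.75) can be made concrete with the Henderson-Hasselbalch relation, which gives the protonated fraction of a single titratable group as 1 / (1 + 10^(pH - pKa)). The following sketch is an illustration under the simplifying assumption of independent, non-interacting sites; the function name `fraction_protonated` is hypothetical, and the ph values are those used for itc and crystallization in this study.

```python
def fraction_protonated(ph, pka):
    """Fraction of a titratable group carrying its proton at the given pH
    (Henderson-Hasselbalch, single independent site)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

# pKa values as stated in the text; treated here as fixed, isolated sites.
residues = {"glu69": 4.6, "his46": 6.75}

for ph in (4.2, 5.2, 7.2, 8.5):
    row = ", ".join(
        f"{name}: {fraction_protonated(ph, pka):.2f}" for name, pka in residues.items()
    )
    print(f"ph {ph}: {row}")
```

In this simple picture glu69 is mostly protonated at ph 4.2 and his46 is almost fully deprotonated at ph 8.5, which matches the direction of the argument but cannot by itself account for the more complex behaviour noted at ph 8.5.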
taking analysis of c-grifin and lactose binding again to the level of a solution in this report, measuring extent and profile of hdx in the absence and presence of ligand can identify the contact site. beyond furnishing a sensitive means to do so, measuring the impact of ligand presence on amide deuteration can also detect alterations in this parameter at other sites. this has recently been accomplished for cg-8n when introducing application of this method to galectins [31] . as measured for homodimeric human gal-1 and cg-1b/2 by small angle neutron scattering and fluorescence correlation spectroscopy, respectively [44, 58, 59] , ligand binding can lead to shape changes into both directions, depending on the type of protein. we thus performed hdx experiments with c-grifin. peptic fingerprinting of c-grifin and sequence coverage are illustrated in fig. 7 (for complete listing of peptides with sequence assignment, please see table s1 ; the n-terminal peptide without release of the methionine, i.e. malr, is detected in tryptic digests at an overall level of 2.1% together with the predominant alr tripeptide after release). the number of detected peptides was 94. the size of the panel ensured complete sequence coverage at a redundancy of 7.25. this value denotes the average frequency of a distinct amino acid position covered by the panel of identified peptides. presence of lactose reduced deuterium uptake into amides, predominantly in the sequence stretch covering canonical amino acids, with strong impact especially on the peptide 55-61 (fig. 8) . changes seen at position 1-17 and 21-38 should be viewed with caution, because no confirmation by overlapping peptides was possible. the quantitative data on the level of peptides given in fig. 8 can now be introduced by color coding to the crystal structure. their implementation into structural models of the binding site is shown in fig. 6c and d. 
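Sequence coverage and redundancy as defined above (the average number of identified peptides covering an amino acid position) follow directly from the start and end positions of the peptides. The sketch below shows the bookkeeping under stated assumptions: peptides are given as 1-based inclusive (start, end) pairs, the function name `coverage_and_redundancy` is hypothetical, and the three-peptide example is made up rather than taken from table s1.

```python
def coverage_and_redundancy(peptides, seq_len):
    """Return (coverage, redundancy) for a peptide panel.
    coverage   = fraction of sequence positions hit by at least one peptide
    redundancy = mean number of peptides per sequence position
    peptides   = iterable of 1-based inclusive (start, end) positions."""
    counts = [0] * seq_len
    for start, end in peptides:
        if not (1 <= start <= end <= seq_len):
            raise ValueError(f"peptide {start}-{end} lies outside the sequence")
        for pos in range(start - 1, end):
            counts[pos] += 1
    covered = sum(1 for c in counts if c > 0)
    return covered / seq_len, sum(counts) / seq_len

# Made-up panel of three overlapping peptides on a 10-residue sequence.
coverage, redundancy = coverage_and_redundancy([(1, 5), (3, 8), (6, 10)], 10)
print(coverage, redundancy)  # 1.0 1.6
```

Averaging over all positions and averaging over covered positions give the same number when coverage is complete, as reported here for c-grifin.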
accommodation of the binding site by lactose thus reduces extent of deuteration in the region of hydrogen/ch-π bonding, without a marked alteration at other sites. to pursue the delineation of relative importance of distinct amino acids in this region for ligand binding, site-directed mutagenesis was applied. the example of mammalian gal-1 had revealed that already the mutation at a single site was sufficient to abolish reactivity to glycans. in detail, the tested cases are asn46asp, arg48his, asn61asp, glu71gln and arg73his, the trp68leu substitution causing a drastic reduction [60-62]. interestingly, the natural occurrence of the arg71val substitution obviously does not preclude binding of lactose by c-grifin despite the documented reduction in h-bonding to the 6′-hydroxyl group of galactose (fig. 5) . as outlined in the introduction, a mutation can yet have different consequences. using a histochemical assay to probe lactose-inhibitable reactivity with the pattern of natural glycans in tissue sections, wild-type c-grifin and three mutants were tested. the application of tissue sections as assay platform makes it possible to monitor the activity profile of the test proteins with cellular glycoconjugates, covering natural diversity in structural and spatial parameters. in view of the general importance of trp66's indole ring for c-h/π interactions, as shown in fig. 5 , a trp66leu mutant was designed, with the expectation of a substantial loss of signal. next, the structure of mammalian grifin at position 48 was mimicked by the asn48lys mutation. absence of asn at the equivalent site leads to knocking down lactose-dependent binding for human gal-1, as noted above. this alteration was combined with an arg50val exchange, aimed at further harming hydrogen bonding to the 4′-hydroxyl group of the galactose moiety. of note, this pair of two sequence deviations is encountered in chicken galectin-related protein that has no affinity for lactose (c-grp [63] ). 
to complete our series, a triple mutant was engineered. all four variant proteins failed to bind to lactose-bearing beads, precluding their purification by standard affinity chromatography, thus requiring the route over fusion-protein design. in the histochemical assay, the wild-type and the four mutant proteins were tested under identical conditions. titrations from 4 µg/ml to 48 µg/ml were performed in the absence and presence of 200 mm lactose, flanked by controls to exclude probe-independent staining. incubation of sections with the wild-type protein led to strong staining that was completely inhibited by the cognate sugar (fig. 9a) . the presence of the trp66leu mutation decreased signal intensity and capacity of lactose for inhibition (fig. 9b ). in contrast, the substitution at position 48 even increased signal intensity that was not affected by lactose (fig. 9c) . by adding the arg50val to the asn48lys substitution, signal intensity fell back to the level of the trp66leu variant (fig. 9d) . as likewise seen in the case of the triple mutant (not shown), the effect of lactose presence on signal intensity by the double mutant was nearly comparable to that of the trp66leu mutant (fig. 9b,d) . the asn48lys substitution, the central difference between mammalian and bird/fish grifins with respect to direct ligand contact, therefore appeared to fundamentally affect relative signal intensity among mutants and susceptibility to presence of lactose. of course, other alterations present in mammalian grifin, too, could make their presence felt, because reconstitution at three sites of rat grifin (i.e. val46phe, lys47asn, val70arg) did not repair the defect in lectin activity [17] . furthermore, the examples of long-range effects of the r111h/c2s mutations in human gal-1, causing a shift in the positions of his52/trp68 [13] , and of the snp-based phe19tyr change in a natural variant of human gal-8 [64] attest to remarkable plasticity and sensitivity for subtle changes. 
in this respect, it will be very informative to have the crystal structures of cg-3 and also of the c-terminal domain of cg-8 available, which are closest to c-grifin in the phylogenetic family-tree diagram [19] . the same holds true for the identification of binding partners in the lens. "since grifin comprises approximately 0.5% of the water-soluble lens protein [17] , it is rather remarkable that so little is known about a possible physiological role of this protein in the lens" [35] . in transgenic mouse lenses, chemical cross-linking of a tagged version of human αa-crystallin led to detecting grifin in the complexes, supported by co-elution of the two proteins in gel filtration at a peak around 600 kda [35] . the k d -value measured in filtration binding (scatchard) assays with murine αl-crystallin was 13.6 ± 5.3 µm (9.4 µm) with a stoichiometry of 0.4 ± 0.08 (0.34) grifin monomer per crystallin monomer (0.25 ± 0.01 for bovine crystallin) [35] . this stoichiometry intimates the possibility that the grifin homodimer may help give order when packaging crystallin proteins. as a consequence, grifin's presence would contribute to establishing the refractive index of the lens. that galectins of this design have been referred to as "bridging molecules" [65] and can indeed, especially at locally high concentrations, form vesicle aggregates [66-68] or lattices [21] makes them suited to become a molecular "glue" for binding partners [69] .
fig. 8 . ligand-dependent reduction of deuterium uptake in c-grifin. summary of differences in deuterium uptake over a deuteration time course of 0.5 min, 1 min, 10 min, 60 min and 240 min between peptides of ligand-free and lactose-loaded c-grifin is presented in quantitative form. the peptide index given on the x-axis was calculated on the basis of midpoint values that reflect the position of the peptide within the amino acid sequence of the protein; the blue dotted lines represent the two-sided confidence limit (α 0.02) calculated as described [32] .
fig. 9 . histochemical staining profiles of biotinylated wild-type c-grifin and three mutants in paraffin-embedded sections of fixed adult chicken kidney. wild-type c-grifin strongly stained the thick loops (tl), distal tubules (dt) and the apical part of the proximal tubules (pt) (a). binding was also detected in the collecting ducts (cd) and in the peripheral collecting tubules (ct). presence of 200 mm lactose led to complete abolishment of binding of c-grifin (inset to a). the introduction of the trp66leu mutation led to marked decrease of staining, with only weak signals in the thick loops (tl), distal tubules (dt) and the collecting tubules (ct) (b). very weak but significant staining was seen in the proximal tubules (pt). incubation with 200 mm lactose either slightly reduced signal intensity (thick loops (tl), distal tubules (dt), collecting tubules (ct)) or led to complete inhibition of binding (proximal tubules (pt)) (inset to b). in contrast, the asn48lys mutant led to very strong staining (c). no inhibitory effect was seen in the presence of lactose (inset to c). combining this mutation with an arg50val substitution (to obtain the asn48lys/arg50val double mutant) markedly decreased signal intensity in all areas of the section (d), with no effect by presence of 200 mm lactose (inset to d). the concentration of biotinylated proteins was 12 µg/ml. the corresponding category of signal intensity is given in the bottom left part of each microphotograph, according to the following grading system: -, no staining; (+), very weak but significant staining; +, weak staining; ++, medium-level staining; +++, strong staining; ++++, very strong staining. scale bars: 20 µm.
intriguingly, di- or tetrameric (plant) lectins of the β-sandwich fold are known to assist in ordered packaging. this is an intracellular role of leguminous lectins. their association to storage proteins destined to fill protein bodies depends either on protein-glycan (for glycosylated vicilin) or protein-protein interactions (for nonglycosylated vicilin and legumin) [70, 71] . this dual reactivity is not uncommon, a lectin from slime molds even employing two contact sites for a glycan and for a protein to fulfill its role in ordered cell migration [72] . to take the comparison to grifin further, the leguminous lectins can bind to protein body membranes [73] . the finding of an association of grifin and α-crystallin with the plasma membrane of lens fiber cells [17, 74] has led to assuming an influence on "cell elongation and suture formation during lens development" [35] . the homodimeric design as well as high degree of stability and rigidity can thus be molecular means of grifins for helping to build and/or maintain the high-order organization of lens proteins. bridging of counterreceptors, an often encountered theme in translating glycan-encoded messages by lectins [8, 21, 75] , will in this context likely engender stability of aggregates in three dimensions. whether and how grifins and their yes/no variability of lectin activity in proteins from different vertebrates actually play a role in these processes remains to be clarified. equally important, mapping the course of grifin expression and localization in lens development is required to define physiological role(s) of respective proteins with/without lectin activity.
the combined analysis of c-grifin in crystals (at seven ph values from 4.2 to 8.5) and in solution (up to 9.0 mg/ml in saxs) revealed c-grifin to be a stable and compact homodimer with no tendency for aggregation in solution. 
the contact profile to lactose compensated the arg71val deviation and maintained the enthalpically driven nature of binding. ligand association caused a shift in the position of the side chain of arg50, the only observed structural change in crystals. in solution, monitoring hdx in peptic fingerprints at 100% sequence coverage, too, confirmed location of the binding site and did not record any significant ligand-dependent change at other sites. localization of contact site(s) for ligands and monitoring for an impact of the association on solvent accessibility of protein regions are the strengths of this technique [31, 76] . implementing the asn48lys change, a hallmark of mammalian grifins, made binding of this mutant protein to tissue sections rather insensitive to lactose presence, intimating this position to be a key site for interactions. this result and the presented structural information build a solid foundation to proceed in the quest to understand grifin's role as lens protein not present, for example, in the retina [17, 19, 77] . moreover, they are a step toward completing the structural analysis of the galectin network in a model organism, i.e. in chicken with its seven members.
the funding source was not involved in the collection, analysis and interpretation of data, in the writing of the report, and in the decision to submit the article for publication. 
principles of structures of animal and plant lectins a guide into glycosciences: how chemistry, biochemistry and biology cooperate to crack the sugar code lectins: a primer for histochemists and cell biologists evolution and structural dynamics of bacterial glycan binding adhesins recent topics on galectins galectinomics: finding themes in complexity galectins in acute and chronic inflammation galectins: their network and roles in immunity/tumor growth control a novel galectin-like domain from toxoplasma gondii micronemal protein 1 assists the folding, assembly, and transport of a cell adhesion complex structural basis of toxoplasma gondii mic2-associated protein interaction with mic2 crystal structure of mouse coronavirus receptor-binding domain complexed with its murine receptor crystal structure of bovine coronavirus spike protein lectin domain growth-regulatory human galectin-1: crystallographic characterisation of the structural changes induced by single-site mutations and their impact on the thermodynamics of ligand binding structural basis of a fungal galectin from agrocybe cylindracea for recognizing sialoconjugate allosteric regulation of the carbohydrate-binding ability of a novel conger eel galectin by d-mannoside structural basis for chitotetraose coordination by cgl3, a novel galectinrelated protein from coprinopsis cinerea grifin, a novel lens-specific protein related to the galectin family mass measurements of cterminally truncated a-crystallins from two-dimensional gels identify lp82 as a major endopeptidase in rat lens chicken grifin: a homodimeric member of the galectin network with canonical properties and a unique expression profile unlike mammalian grifin, the zebrafish homologue (drgrifin) represents a functional carbohydrate-binding galectin when lectin meets oligosaccharide lectin localization in human nerve by biochemically defined lectin-binding glycoproteins, neoglycoprotein and lectin-specific antibody upgraded esrf bm29 beamline for 
saxs on macromolecules in solution dammif, a program for rapid ab-initio shape determination in small-angle scattering uniqueness of ab initio shape determination in small-angle scattering overview of the ccp4 suite and current developments phenix: a comprehensive python-based system for macromolecular structure solution swiss-model: modelling protein tertiary and quaternary structure using evolutionary information features and development of coot combining crystallography and hydrogen-deuterium exchange to study galectin-ligand complexes the utility of hydrogen/deuterium exchange mass spectrometry in biopharmaceutical comparability studies galectin-related protein: an integral member of the network of chicken galectins. 2. from expression profiling to its immunocyto-and histochemical localization and application as tool for ligand detection teaming up synthetic chemistry and histochemistry for activity screening in galectin-directed inhibitor design interactions between small heat shock protein a-crystallin and galectin-related interfiber protein (grifin) in the ocular lens proto-type chicken galectins revisited: characterization of a third protein with distinctive hydrodynamic behaviour and expression pattern in organs of adult animals fine-tuning of prototype chicken galectins: structure of cg-2 and structure-activity correlations the 2.15 å crystal structure of cg-16, the developmentally regulated homodimeric chicken galectin homodimeric chicken galectin cg-1b (c-14): crystal structure and detection of unique redox-dependent shape changes involving inter-and intrasubunit disulfide bridges by gel filtration, ultracentrifugation, site-directed mutagenesis, and peptide mass fingerprinting structural basis for distinct binding properties of the human galectins to thomsen-friedenreich antigen the glycobiology of the cd system: a dictionary for translating marker designations into glycan/lectin structure and function structure of a tetrameric galectin from 
cinachyrella sp. (ball sponge) crystal structure of a xenopus laevis skin proto-type galectin, close to but distinct from galectin-1 detection of ligand-and solvent-induced shape alterations of cell-growth-regulatory human lectin galectin-1 in solution by small angle neutron and x-ray scattering structure and functional analysis of the fungal galectin cgl2 fungal galectins, sequence and specificity of two isolectins from coprinus cinereus gene design, expression, crystallization and preliminary diffraction analysis of two isolectins from the fungus coprinus cinereus: a model for studying functional diversification of galectins in the same organism and their evolutionary pathways crystal structure of the galectin-9 n-terminal carbohydrate recognition domain from mus musculus reveals the basic mechanism of carbohydrate recognition structure of the mouse galectin-4 n-terminal carbohydrate-recognition domain reveals the mechanism of oligosaccharide recognition human splenic galaptin: carbohydrate-binding specificity and characterization of the combining site binding characteristics of galactosidebinding lectin (galaptin) from human spleen isolation and physicochemical characterization of electrolectin, a b-d-galactoside-binding lectin from the electric organ of electrophorus electricus different architecture of the combining sites of two chicken galectins revealed by chemical-mapping studies with synthetic ligand derivatives galectin-8-n-domain recognition mechanism for sialylated and sulfated glycans structural characterisation of human galectin-4 n-terminal carbohydrate recognition domain in complex with glycerol, lactose, 3'-sulfo-lactose, and 2'-fucosyllactose structural characterization of human galectin-4 c-terminal domain: elucidating the molecular basis for recognition of glycosphingolipids, sulfated saccharides and blood group antigens n-domain of human adhesion/growth-regulatory galectin-9: preference for distinct conformers and non-sialylated n-glycans and 
detection of ligandinduced structural changes in crystal and solution hydrodynamic properties of human adhesion/growth-regulatory galectins studied by fluorescence correlation spectroscopy analysis of homodimeric avian and human galectins by two methods based on fluorescence spectroscopy: different structural alterations upon oxidation and ligand binding soluble 14-kda b-galactoside-specific bovine lectin. evidence from mutagenesis and proteolysis that almost the complete polypeptide chain is necessary for integrity of the carbohydrate recognition domain effect of amino acid substitution by sited-directed mutagenesis on the carbohydrate recognition and stability of human 14-kda b-galactoside-binding lectin further evidence by site-directed mutagenesis that 127e138 conserved hydrophilic residues form a carbohydrate-binding site of human galectin-1 galectin-related protein: an integral member of the network of chicken galectins. 1. from strong sequence conservation of the gene confined to vertebrates to biochemical characteristics of the chicken protein and its crystal structure natural single amino acid polymorphism (f19y) in human galectin-8: detection of structural alterations and increased growthregulatory activity on tumor cells factors mediating cell-cell recognition and adhesion. 
galaptins, a recently discovered class of bridging molecules mimicking biological membranes with programmable glycan ligands selfassembled from amphiphilic janus glycodendrimers dissecting molecular aspects of cell interactions using glycodendrimersomes with programmable glycan presentation and engineered human lectins reaction of a programmable glycan presentation of glycodendrimersomes and cells with engineered human lectins to show the sugar functionality of the cell surface galectin: intelligent glue, non-bureaucratic bureaucrat or almighty supporting actor interactions between lectins and other components of leguminous protein bodies in vivo binding partners of the lens culinaris lectin receptor for the cell binding site of discoidin i interaction of the soybean (glycine max) seed lectin with components of the soybean protein body membrane characterization of a-crystallin plasma membrane binding how to crack the sugar code exploration of the metal coordination region of concanavalin a for its interaction with human norovirus network analysis of adhesion/growth-regulatory galectins and their binding sites in adult chicken retina and choroid rr is grateful to the natural sciences and engineering research council of canada (nserc) for a canadian research chair and financial support. ar thanks the spanish ministry of economy and competitiveness for the financial support through the bfu2016-77835-r project. we are grateful for inspiring discussions with drs. b. friday and a. leddoz, for obtaining access to the itc equipment to prof. n. doucet, to m. l etourneau, j. gagnon and p.t. nguyen for helpful advice as well as for the outstanding technical assistance provided by the staff members of the soleil (france) and alba (spain) synchrotrons and for the thorough manuscript review leading to valuable recommendations. supplementary data related to this article can be found at https://doi.org/10.1016/j.biochi.2017.12.003. 
key: cord-256047-mabrmzd9 authors: jacomin, anne-claire; taillebourg, emmanuel; fauvarque, marie-odile title: deubiquitinating enzymes related to autophagy: new therapeutic opportunities? date: 2018-08-19 journal: cells doi: 10.3390/cells7080112 sha: doc_id: 256047 cord_uid: mabrmzd9 autophagy is an evolutionary conserved catabolic process that allows for the degradation of intracellular components by lysosomes. this process can be triggered by nutrient deprivation, microbial infections or other challenges to promote cell survival under these stressed conditions. however, basal levels of autophagy are also crucial for the maintenance of proper cellular homeostasis by ensuring the selective removal of protein aggregates and dysfunctional organelles. a tight regulation of this process is essential for cellular survival and organismal health. indeed, deregulation of autophagy is associated with a broad range of pathologies such as neuronal degeneration, inflammatory diseases, and cancer progression. ubiquitination and deubiquitination of autophagy substrates, as well as components of the autophagic machinery, are critical regulatory mechanisms of autophagy. here, we review the main evidence implicating deubiquitinating enzymes (dubs) in the regulation of autophagy. we also discuss how they may constitute new therapeutic opportunities in the treatment of pathologies such as cancers, neurodegenerative diseases or infections. autophagy is a lysosomal catabolic process that ensures the degradation and recycling of intra-cytoplasmic components and, therefore, highly contributes to the maintenance of cellular homeostasis. in order to adapt to various stresses and promote its survival under challenging conditions, the cell also uses autophagy to degrade a broad range of endogenous or exogenous substrates [1] . the best-studied endogenous substrates of autophagy are mitochondria and protein aggregates [2] [3] [4] . 
defects in the elimination of such substrates are often associated with human pathologies, including neurodegenerative diseases or cancers [5] [6] [7]. autophagy is a very dynamic process and has to be tightly regulated to provide a timed and efficient response to a multitude of signals. posttranslational modifications play a significant role in the induction and regulation of autophagy [8, 9]. ubiquitination of substrates and proteins of the autophagy machinery has notably emerged as a central regulatory mechanism of autophagy acting at various levels to promote autophagy induction or shutdown [10, 11]. ubiquitination is a reversible protein modification that consists of the covalent attachment of one or several ubiquitin moieties to protein substrates [12] [13] [14]. while ubiquitination is achieved by the sequential action of an e1 ubiquitin-activating enzyme, an e2 ubiquitin-conjugating enzyme, and an e3 ubiquitin ligase, the removal of ubiquitin from a protein is catalyzed by deubiquitinating enzymes (dubs) [15]. ubiquitination can promote or interfere with protein-protein interactions, modifying the
figure 1. the main regulators of autophagy are ampk (amp-activated kinase) and mtorc1 (mtor complex 1), which act as autophagy activator and inhibitor, respectively. ampk and mtorc1 regulate the activation of the ulk1 complex which, together with the pi3k-iii complex, initiates the autophagosome formation. the formation of an autophagosome starts with the generation of a phagophore which elongates into an isolation membrane where the cargos to be degraded are gathered. the enclosure of the isolation membrane forms a double-membraned autophagosome that matures and eventually fuses with the lysosome, forming an autolysosome where lysosomal hydrolases degrade its content. the atg12 and lc3 conjugation systems are essential for the autophagy process and formation of the autophagosome.
the atg12 conjugation system consists in the formation of the atg12-atg5-atg16l1 complex that then promotes the conjugation of lc3; pro-lc3 is cleaved by atg4 (lc3-i) which is then conjugated with phosphatidylethanolamine (pe) (lc3-ii) before being anchored to the nascent autophagosomal membrane.
ubiquitin is a small globular protein with a β-grasp superfold conformation [39, 40]. ubiquitin is covalently conjugated by its terminal glycine (g76) onto a lysine residue of a substrate protein. protein ubiquitination is a complex process requiring the successive activity of three types of enzymes: an e1 ubiquitin-activating enzyme, an e2 ubiquitin-conjugating enzyme, and an e3 ubiquitin-ligase enzyme. the ubiquitination process can be broken down into two main steps: (1) the atp-dependent activation of a ubiquitin molecule by conformational modification of its c-terminal extremity by an e1 enzyme followed by its transfer onto an e2 enzyme, and (2) the conjugation of the activated ubiquitin onto a substrate protein, mediated by an e3 enzyme that bridges the e2 to a specific substrate allowing for the subsequent transfer of ubiquitin from the e2 enzyme to the substrate through the formation of an isopeptide bond [41, 42]. alternatively, e3 ligases of the hect family possess e2 and e3 activities [43, 44]. there is also evidence for the addition of ubiquitin moieties onto non-lysine residues. the ubiquitination of cysteine or of serine and threonine residues requires the formation of thiol- or oxy-ester bonds, respectively [45]. ubiquitination was discovered in 1980 and first described for its essential role in targeting proteins to the proteasome for degradation [46]. it is now widely recognized that ubiquitination also acts in other cellular processes. indeed, a broad range of types of ubiquitination has been reported.
substrates can be modified by the attachment of a single ubiquitin molecule (mono-ubiquitination) or several single ubiquitin molecules on different lysine residues of the substrate (multi-mono-ubiquitination). mono-ubiquitination is primarily described for its function in endocytosis of plasma membrane receptors [47]. alternatively, several ubiquitin molecules can be ligated to one another using ubiquitin internal lysine residues, forming chains of ubiquitin moieties that can elongate on the substrate or directly be attached to a target protein (poly-ubiquitination). because ubiquitin has seven lysine residues, at least as many types of ubiquitin chains can be generated (k6-, k11-, k27-, k29-, k33-, k48- and k63-linked ubiquitin chains) [48, 49]. poly-ubiquitin chains can also be assembled through n- to c-terminal interaction to form linear chains (m1) [50, 51]. the three-dimensional structures of ubiquitin chains vary depending on the lysine in the ubiquitin used to generate the chain and affect the function or stability of the substrate. for instance, k48-linked ubiquitin chains have a compact conformation and target substrates for proteasomal degradation, while k63-linked ubiquitin chains display an open conformation and are mostly described to promote signal transduction through the assembly of large protein complexes [52, 53]. protein ubiquitination is a reversible posttranslational modification. the hydrolysis of ubiquitin linkages is conducted by a specific family of proteases: the dubs.
these enzymes can act at different stages of the protein ubiquitination process: (1) at the "initial" stage, by cleaving the ubiquitin precursors to supply ubiquitin monomers to the ubiquitination enzymes; (2) at the "intermediate" stage, by the regulated removal of ubiquitin moieties from proteins to alter their fate (stabilization, conformational change); and (3) at the "final" stage, by the removal of ubiquitin chains from substrates addressed to the proteasome to facilitate their degradation and processing into ubiquitin monomers, free to enter a new ubiquitination cycle (figure 2) [54] [55] [56]. the hydrolysis of k48-linked ubiquitin chains, which are best known for inducing proteasomal degradation, can affect the fate of the modified protein either by protecting it from degradation or by supporting its proteasomal degradation, as the removal of k48-linked ubiquitin chains, mostly by proteasomal dubs, is also required for protein entry into the proteasome [57, 58]. there are approximately 100 dubs encoded by the human genome [15, 59], which are divided into two main families: the cysteine proteases and the metalloproteases. there are 12 dubs from the metalloprotease family characterized by a jamm (jab1/pab1/mpn domain-containing metalloenzymes) domain that catalyzes the hydrolysis of isopeptide bonds in the presence of zn2+. cysteine proteases are divided into five sub-families according to the sequence and structure of their catalytic domain: ubiquitin-specific proteases (usps), ubiquitin c-hydrolases (uchs), otubain proteases (otus), machado joseph disease proteases (mjds), and the most recently identified sub-family, the miu-containing novel dub family (mindys). the most abundant sub-family of dubs is the usps with over 50 members, followed by the otus (18 members) and the uchs, mjds and mindys (each with four members) [15]. protein posttranslational modifications are crucial in the dynamic regulation of the autophagy process.
modifications of core components of the autophagic machinery are notably essential for the induction of autophagy. autophagy proteins that are ubiquitinated constitute substrates for dubs, therefore regulating their function and/or stability [60] . the mechanistic target of rapamycin complex 1 (mtorc1) plays a central role in the integration of various environmental signals to regulate cell metabolism, growth, proliferation, and survival. in nutrient-rich conditions, mtorc1 is active and promotes cell growth while down-regulating autophagy. conversely, down-regulation of mtorc1 activity during nutrient deprivation activates autophagy. 
the protease otub1 was recently reported to inhibit mtorc1 activity by deubiquitinating and stabilizing the inhibitor deptor in response to amino acid deprivation [61] (figure 3a). deptor stabilization by otub1 depends on its catalytic activity, as the catalytically inactive mutant otub1-c91a fails both to remove ubiquitin moieties from deptor and to protect it from proteasomal degradation. consistent with these observations, otub1 overexpression induces autophagy while its knockdown represses autophagy induction [61]. the serine/threonine protein kinase ulk1 (unc51-like kinase 1) is a critical inducer of autophagy (see above and figure 1). dynamic phosphorylation and polyubiquitination regulate ulk1 activity. notably, the ubiquitination of ulk1 with k63-linked ubiquitin chains contributes to its stabilization and activity [38]. in contrast, the attachment of k48-linked ubiquitin chains leads to proteasomal degradation of ulk1, resulting in a blockade of starvation-induced autophagy [62, 63]. a loss-of-function screen of dubs in hela cells identified usp20 as the first dub to be involved in regulating ulk1 ubiquitination and stability. usp20 interacts with and deubiquitinates ulk1, thus protecting it from degradation. the maintenance of basal levels of ulk1 by usp20 contributes to rapid induction of autophagy under stress conditions. indeed, silencing of the usp20-encoding gene inhibits autophagosome and autolysosome formation in response to starvation (figure 3b). however, the interaction between ulk1 and usp20 is weakened upon prolonged induction of autophagy (4 to 8 h), leading to a reduction of the level of ulk1 protein while its ubiquitinated form accumulates [64].
the molecular mechanisms regulating the dissociation of usp20 and ulk1 are not yet known but could depend on posttranslational or allosteric modifications of usp20, as its stability is not affected by starvation; alternatively, the e3 ubiquitin ligase nedd4l may compete with usp20 to interact with ulk1 and promote its proteasomal degradation [63]. beclin1 is a multivalent adaptor protein and forms, with vps34 and vps15, the core components of the pi3k-iii signaling complex, which is essential for the maturation of the autophagosome (see above and figure 1). beclin1 also interacts transiently with accessory factors, such as atg14l, ambra1, uvrag or bcl-2 [65]. moreover, the versatile ubiquitination of beclin1 is tightly linked to its function and activation [37]. different types of ubiquitin chains, including k63- and k48-linked chains, were found on beclin1, and several dubs control beclin1 ubiquitination status and activity (figure 3c). usp14 is a ubiquitin-specific protease tightly associated with the proteasome which has been shown to cleave k48-linked ubiquitin chains [66, 67]; however, other studies have shown that usp14 is also able to cleave k63-linked ubiquitin chains [68, 69]. knockdown of usp14 or its inhibition with the inhibitor iu1 (see below section 4.1) induces the activation of autophagy, indicating that usp14 is a negative regulator of autophagy in h4 (neuroglioma) cells. because silencing the usp14-encoding gene does not affect the stability of beclin1 or other components of the complex, its ability to regulate autophagy appears to be independent of its proteasomal function in these cells. consistent with this observation, it was shown that usp14 suppresses the activity of the beclin1 complex and the induction of autophagy by interacting with and controlling k63- rather than k48-linked ubiquitin chains of beclin1 [70]. autophagy is highly interconnected with immune processes and can be triggered by activated tlr4 (toll-like receptor 4).
activation of tlr4 by microbial components contributes to the recruitment of adaptor proteins and enzymes required for the signal transduction and induction of the immune response by nf-κb (nuclear factor kappa b) transcription factors, such as the e3 ubiquitin ligase and scaffold protein traf6 (tnfr-associated factor 6). traf6 is then responsible for the induction of autophagy following the activation of tlr4 through the ubiquitination of beclin1 with k63-linked ubiquitin chains, as shown in the murine macrophage cell line raw 264.7 [71]. indeed, ubiquitination of the lysine residue 117 within the bh3 domain of beclin1 prevents its interaction with the inhibitor bcl-2 [71, 72]. in addition, min and colleagues showed that traf6 and usp14 compete for the interaction with beclin1 in hek293t (human embryonic kidney 293t) cells and that usp14 negatively regulates autophagy in thp-1 monocyte cells [73]. traf6 and beclin1 interact through their coiled-coil domains in the absence of usp14, whereas the interaction is gradually attenuated when the cells are co-transfected with increasing quantities of usp14. however, the study does not provide any evidence for a catalytic role of usp14 and instead suggests that the interaction of beclin1 with usp14 inhibits traf6-mediated ubiquitination of beclin1 [73]. additionally, the dub a20, a downstream target of nf-κb, is responsible for the catalytic removal of k63-linked ubiquitin chains on the lysine 117 of beclin1 to limit the induction of autophagy in the murine macrophage cell line raw 264.7 [71]. interestingly, beclin1 and the bcl-2 family member mcl-1 compete for their interaction with usp9x, which contributes to their stabilization by protecting them from proteasomal degradation in hek293t cells [74]. in the same study, it was observed that mcl-1 levels were increased in malignant tissues from melanoma patients while beclin1 was destabilized, suggesting that usp9x promotes tumor progression.
however, usp9x function in tumorigenesis appears to be more complex and context-dependent as independent studies have shown that usp9x can be either a tumor promoter or suppressor depending on the origin of the cells [75, 76] . the addition of k11-linked ubiquitin chains to beclin1 by the e3 ligase nedd4 triggers its degradation by the proteasome [77] . the enzymes usp13 and usp19 are known to process k11-linked ubiquitin chains resulting in the stabilization of beclin1 in hek293t cells (for both usp13 and usp19 roles), as well as in mef, hela and bcap-37 cells (role of usp13) [78, 79] . although several lysine residues in beclin1 may be ubiquitinated with k11-linked chains, usp19 seems to mainly target ubiquitin moieties bound to the lysine 437 of beclin1 in hek293t cells, suggesting that its function may not be redundant, but instead complementary to usp13 [78] . finally, the dub usp33 was found to be a regulator of early steps of starvation-induced autophagy activation by promoting the interaction between beclin1 and the ras-like gtpase ralb. indeed, the assembly of the complex ralb-exo84-beclin1 is made possible by the deubiquitination of ralb at its lysine residue 47 in hek293t and hela cells [80] . the identification of usp33 as indirectly regulating beclin1 activity through the deubiquitination of one of its partners reinforces the hypothesis that the modification and activation of autophagy machinery components are context-specific. the formation of protein aggregates is a continuous process in the cell, and their degradation by autophagy, referred to as aggrephagy, is one of the first types of selective autophagy that has been described. aggrephagy involves the autophagy receptors p62/sqstm1 and nbr1 which are recruited to ubiquitinated protein aggregates [81] [82] [83] [84] . the accumulation of protein aggregates that fail to be degraded in neuronal or glial cells is a hallmark of various neurodegenerative diseases. 
aggregation of α-synuclein is characteristic of the pathogenesis of parkinson's disease, in which mono- and poly-ubiquitinated α-synuclein are major constituents of the lewy bodies [85, 86]. mono-ubiquitination of α-synuclein seems to be required for its targeting to proteasomal degradation and thus negatively regulates its autophagic degradation. usp9x interacts with α-synuclein in vitro and in vivo, contributes to the removal of mono-ubiquitin moieties, and favors its targeting for degradation by autophagy rather than the proteasome [87]. conversely, uch-l1, a dub associated with parkinson's disease and whose gene is frequently mutated in familial forms of the pathology [88], promotes the accumulation of α-synuclein aggregates in oligodendrocytes [89, 90]. another independent study also reported that membrane-associated uch-l1 contributes to α-synuclein neurotoxicity, possibly by negatively regulating its lysosomal degradation [91] (figure 3d). finally, a recent study demonstrates the prevalence of k63-linked ubiquitin chain conjugates in lewy bodies, suggesting that their elimination can be primarily ensured by the lysosomal route rather than the 26s proteasome. in addition, this study identifies usp8 as one of the best markers of lewy bodies in human pigmented neurons in sporadic cases of parkinson's disease and demonstrates the ability of usp8 to hydrolyze k63-linked ubiquitin chains from α-synuclein in vitro [92]. moreover, usp8 gene extinction significantly reduces the total level of α-synuclein both in a drosophila model of parkinson's disease and in human embryonic kidney (hek293t) cultured cells [92]. thus, the presence of usp8 on endosomal membranes (see below) and in lewy bodies, as well as its ability to deubiquitinate α-synuclein, makes it a preferred therapeutic target, on the assumption that inhibition of usp8 could directly promote the elimination of amyloid fibers by the endocytic and lysosomal pathways.
it was recently shown that the selective autophagy receptor p62 binds to protein aggregates modified not only with k63-linked ubiquitin chains but also with k33-linked ubiquitin chains [93, 94]. zranb1/trabid is a k29- and k33-specific dub [95]. knockdown of zranb1 enhances the recruitment of p62 to k33-associated protein aggregates, suggesting that zranb1 is a negative regulator of aggrephagy; yet, the physiological function of k33-linked ubiquitin chains and the role of zranb1 are not entirely understood [94]. in drosophila, the p62 homolog ref(2)p is also implicated in the clearance of ubiquitinated protein aggregates [96, 97]. however, little is known about the regulation of autophagy-associated ubiquitination processes in flies. the only dub known to negatively regulate ubiquitin-dependent autophagy in flies is dusp36 [98]. this protein negatively regulates the formation of ubiquitinated nuclear aggregates, while promoting cell growth. indeed, deletion of the dusp36-encoding gene results in the robust accumulation of ubiquitinated protein aggregates, which include the histone protein h2b, and in the activation of autophagy, independently of the tor pathway [98] (figure 3d). the clearance of protein aggregates requires the action of selective receptors to adequately target them for autophagic degradation [3]. ndp52 is an autophagy receptor mostly described for its role in addressing ubiquitin-decorated bacteria for degradation by autophagy [99, 100]. however, a new role for ndp52 in the formation of traf6 aggregates was unveiled [101]. indeed, ndp52 mediates the aggregation and selective autophagic degradation of the tlr adaptor molecule trif and the signaling molecule traf6 in response to tlr4 stimulation. ubiquitination of ndp52, mediated by traf6, is necessary for its activity and is counteracted by a20 [101] (figure 3d).
with the variety of proteins prone to aggregation, it is not surprising that different dubs are involved in the regulation of the formation and degradation of protein aggregates. protein aggregation is a hallmark of various pathologies, notably neurodegeneration and infection, and a better understanding of the regulatory mechanisms associated with each pathology will greatly benefit the development of appropriate and targeted treatments. mitophagy refers to the clearance of exhausted mitochondria by autophagy. the mitochondrial kinase pink1 and the e3 ubiquitin ligase parkin play a central role in mitochondrial quality control. upon mitochondrial damage and loss of membrane potential, parkin is translocated to the outer mitochondrial membrane (omm) and activated by stabilized pink1 [102]. active parkin ubiquitinates a myriad of substrates on the omm that can be recognized by ubiquitin-binding selective autophagy receptors [103, 104]. so far, three dubs (usp30, s-usp35, and usp15) have been reported to counteract parkin activity following acute mitochondrial depolarization, thus acting as negative regulators of mitophagy. usp8 is the only dub identified so far as a positive regulator of mitophagy (figure 3e). usp30 is a mitochondrial enzyme, tethered to the outer membrane of the mitochondria with its catalytic domain facing the cytoplasm [105]. several independent studies indicate that usp30 is one of the major dubs regulating mitophagy. overexpression of usp30 reverses the ubiquitination of parkin substrates, such as tom20, and impairs mitophagy [106] [107] [108] [109]. usp30 function in autophagy is dependent on its catalytic activity, as the catalytically inactive mutant usp30-c77a is ineffective at inhibiting mitophagy [106]. the depletion of usp30 was shown to enhance the degradation of mitochondria in neuronal and hela cell cultures [106, 109].
usp30 knockdown also increases the ubiquitination level of multiple parkin substrates, thus confirming that usp30 antagonizes parkin function. usp30 proteolytic activity is more efficient on k6-linked ubiquitin chains, even though it can also process k11, k48 and k63 chains [108, 110]. the ubiquitin chains targeted by usp30 are similar to the ones parkin adds to its substrates, suggesting that these two enzymes act as antagonists on shared substrates. the majority of the work carried out on the regulation of mitophagy has relied on cells overexpressing parkin along with the use of mitochondrial-depolarizing agents. such experiments simulate an extreme scenario of mitochondrial stress, and their interpretation may not be relevant to basal conditions of mitochondrial clearance [111]. however, recently published work by marcassa and colleagues investigates the role of usp30 under more physiological conditions [112]. the authors propose a new model in which usp30 acts upstream of pink1, as the depletion of pink1 in cells lacking usp30 abrogated the increased mitophagy induced by usp30 knockdown in u2os cells [112]. in the same study, usp30 is revealed to regulate the degradation of peroxisomes by autophagy (pexophagy) in a similar way to its role in mitophagy. like mitochondria, peroxisomes are a main source of reactive oxygen species (ros), which can damage cells if produced in high quantities [113]. moreover, contact sites between mitochondria and peroxisomes exist, and mitochondria were shown to play a role in the generation of peroxisomes [114], reinforcing the hypothesis of an intrinsic relationship between both organelles. the depletion of usp30 increases both pexophagy and mitophagy. however, the localization of usp30 on mitochondria and peroxisomes relies on distinct sequences, suggesting that the role of usp30 in pexophagy is independent of its mitochondrial function [112].
it is possible that usp30 acts at different levels during the mitophagy or pexophagy processes depending on the conditions. under basal conditions, usp30 could serve as a safety checkpoint that prevents mitophagy or pexophagy from being triggered inappropriately. other studies have identified additional dubs that may also contribute to the regulation of mitophagy. the short form of usp35 (s-usp35) is another dub that is localized at the mitochondria [109]. in a similar manner to usp30, overexpression of s-usp35 impairs mitophagy while s-usp35 knockdown enhances mitochondrial degradation [109]. usp15 is the third dub identified to antagonize parkin-mediated mitophagy. overexpression of usp15 inhibits mitophagy in a manner dependent on its catalytic activity, while depletion of usp15 enhances mitophagy. unlike usp30 and s-usp35, usp15 is only rarely localized at the mitochondria [115]. only one dub, usp8, may act as a positive regulator of mitophagy through parkin regulation. usp8 knockdown impairs parkin-mediated mitophagy by preventing parkin recruitment to depolarized mitochondria. in this process, usp8 selectively removes k6-linked ubiquitin chains from parkin and counteracts parkin auto-ubiquitination and auto-catalytic activation [116]. whether and how these dubs act in concert or within different organs or situations remains to be determined in order to fully understand their specific requirements in physiological or stressed conditions.
proteins can be degraded by autophagy independently of their aggregation, in a way that can be either dependent on or independent of their ubiquitination state. to date, only a few substrates known to be directly targeted for degradation by autophagy in response to ubiquitination have been shown to be regulated by a specific dub (figure 3f). the hypoxia-inducible factor 1, α subunit (hif-1α) is a transcription factor essential for cells to adapt rapidly to low oxygen levels (hypoxia).
when oxygen is available, hif-1α is polyubiquitinated and rapidly degraded either by the proteasome or directly by the lysosome through chaperone-mediated autophagy (cma) [117, 118]. the dub cezanne/otud7b is itself induced by oxygen deprivation in cultured endothelial cells. moreover, loss of cezanne reduces the amount of hif-1α protein, while cezanne overexpression stabilizes hif-1α and protects it from autophagic degradation in a manner dependent on its catalytic activity, by specifically processing k11-linked ubiquitin chains [119]. mutation of the cma-targeting motif (kferq motif) of hif-1α makes it insensitive to cezanne knockdown, suggesting that cezanne specifically regulates the cma-mediated degradation of hif-1α. cezanne is not the only dub to regulate the ubiquitination status of hif-1α. indeed, usp8 is essential for the removal of ubiquitin moieties from hif-1α in normoxia, contributing to the maintenance of a basal level of hif-1α. in this case, however, usp8 appears to protect hif-1α from proteasomal degradation [120]. these two studies show that different dubs can regulate the fate of a shared substrate depending on the physiology of the cell. connexin-43/cx43, a member of the connexin family that forms the gap junction channels enabling direct exchanges between adjacent cells, is another example of a substrate for at least two different degradative pathways. indeed, cx43 polyubiquitination triggers its degradation through either the proteasome or the lysosome via endocytosis and autophagy [121] [122] [123] [124]. usp8 interacts with and deubiquitinates cx43, removing monoubiquitin moieties as well as k63- and k48-linked ubiquitin chains. cx43 ubiquitination and degradation by autophagy are increased in usp8 knockdown cells [125].
even though usp8 regulates autophagic degradation of cx43 under basal conditions, one cannot exclude that usp8 may also affect cx43 through the endolysosomal pathways in other conditions, as usp8 is well described for its implication in the endocytosis of various plasma membrane receptors [126] [127] [128]. thus, there are several cases of proteins being degraded through different processes, notably as a result of the linkage of different kinds of ubiquitin moieties, and undergoing tight regulation by specific dubs.
autophagy and endocytosis are two interconnected degradative pathways conserved among eukaryotes. moreover, fusion events between autophagosomes and endocytic compartments have been observed and investigated [129, 130]. endocytosis and autophagy converge not only at the level of lysosomes but also at the level of early and late endosomes, forming another type of vesicle called the amphisome. several dubs known for their role in endocytosis also impact the autophagic flux, directly or indirectly (figure 3g). amsh is a metalloprotease of the jamm type involved in the sorting of cell-surface receptors at endosomes [131] [132] [133] [134]. amsh localizes at the endosomes and promotes the recycling of internalized receptors [135, 136]. disruption of amsh in mice results in the loss of neurons in the hippocampus and severe atrophy of the cerebral cortex [137]. an independent study observed that the loss of amsh in neurons results in the accumulation of ubiquitinated protein aggregates associated with the autophagy receptor p62, indicating that the autophagy flux is impaired [138] (figure 3d). however, at this stage, it is not possible to discriminate whether the blockade of the autophagy flux results from an impairment in the endocytic process or from a lack of targeting of cargoes to autophagosomes, independently of endocytosis.
in the plant model arabidopsis thaliana, the ortholog amsh1 interacts with the escrt-iii protein vps2.1 and contributes to autophagic degradation [139] (figure 3g). usp8 is a second dub playing a major role in endocytosis by regulating the ubiquitination status of both cargoes and members of the escrt machinery, which regulates membrane deformation and scission events [126] [127] [128] [140] [141] [142]. usp8 has been extensively studied for its role in the regulation of the trafficking and lysosomal degradation of receptors, such as egfr, through the endocytic process [135, 143]. in addition to its role in endocytosis, loss of dusp8 in drosophila blocks the progression of autophagy, resulting in the accumulation of ref(2)p/p62 and ubiquitinated proteins [144]. as with dusp8, dusp12 depletion affects both the autophagic flux and the endocytic process. indeed, silencing the dusp12-encoding gene results in the accumulation of autophagosomes in drosophila [144], and usp12 negatively regulates the endocytosis and translocation of the notch receptor to the lysosomes in both drosophila and mammalian cells [145]. although these studies cannot exclude a direct role of these dubs in the regulation of autophagy, they support a close intertwining of endocytosis and autophagy and suggest interdependence of the two processes. this hypothesis is reinforced by the fact that disruption of the endocytic process using dominant-negative rab proteins or rab knockdown also results in impaired autophagy [146]. the particular case of the β2-adrenergic receptors (β2-ars) also illustrates the interconnection between the endocytic and autophagic processes and their tight regulation by ubiquitination. the availability of β2-ars at the plasma membrane is tightly regulated by balancing their internalization and recycling rates. misregulation of β2-ar trafficking has been associated with various pathologies, including heart failure and asthma [147].
β2-ars take an unconventional route to the lysosomes; indeed, after their endocytic internalization, ubiquitinated β2-ars are directed to the autophagosomes rather than the lysosomes [148]. the post-endocytic sorting of the receptor from the endosomes to the autophagosomes is modulated by the proteases usp20 and usp33 [149]. however, only usp20 was shown to promote the deubiquitination of β2-ars and their post-endocytic trafficking to autophagosomes. in this process, phosphorylation of usp20 at serine residue 333 is required for its activity, providing an additional level of regulation [148]. recently, the protein chmp2a of the escrt-iii complex was identified as crucial for the closure of the autophagosome [150]. therefore, endosome-associated dubs such as usp8 and amsh, or other dubs that remain to be identified, may also play a direct regulatory role in this process by regulating the activity of escrt-iii components, as recently shown for chmp1b during endocytosis [140].
the expression of a number of genes related to autophagy is activated upon starvation in mammalian cell culture. in this process, histone protein h2b monoubiquitination (h2bub1) is an essential modification for the regulation of gene transcription. the level of h2bub1 is controlled by the protease usp44, which is upregulated after starvation. knockdown of usp44 results in the maintenance of h2bub1 upon starvation and abolishes the change in expression of starvation-induced autophagy-related genes. moreover, downregulation of the usp44-encoding gene blocks the induction of autophagy in mescs (mouse embryonic stem cells) (figure 3h). this study thus unveils a new role for dubs in the transcriptional regulation of autophagy through the modulation of h2b monoubiquitination [151].
ubiquitination of microbial molecular patterns is used by eukaryotic cells to tag invasive pathogens and target them for autophagic degradation.
this reaction leads to the accumulation of ubiquitinated protein aggregates known as ubiquitinated aggresome-like induced structures (alis). such aggregates contribute to the upregulation of autophagy and the removal of intracellular pathogens. in response to this host defense mechanism, intracellular pathogens, such as bacteria or viruses, have developed strategies to hijack the host ubiquitin pathway by expressing dub-like enzymes that counteract ubiquitination and allow them to escape elimination by autophagy. for instance, the intracellular pathogenic bacterium salmonella enterica serovar typhimurium (s. typhimurium) counteracts alis-induced autophagy by translocating a dub-like enzyme, ssel (salmonella-secreted factor l), into the cytosol. lysates from mouse macrophages infected with ∆ssel mutant bacteria are enriched in ubiquitinated proteins, and immunofluorescence experiments revealed that these bacteria are more prone to ubiquitination and recognition by autophagy markers such as lc3 or p62 [152, 153]. secreted ssel deubiquitinates alis and the salmonella-containing vacuoles (scvs), which reduces the induction of autophagy and promotes the survival and replication of s. typhimurium [152] (figure 3d). like salmonella, legionella is an intracellular bacterium that can establish niches in cytoplasmic vacuoles, which allow for the survival and replication of the bacteria. the legionella pneumophila effector protein ravz is a secreted cysteine protease that interferes with the autophagy machinery by irreversibly deconjugating lc3 from the autophagosome membrane [154]. lc3 is an autophagy-related ubiquitin-like protein anchored to the autophagosome membrane. the process leading to lc3 lipidation and association with the membrane is similar to the ubiquitination process and requires a ubiquitin-like conjugation system.
ravz deconjugates lc3 from the autophagosome membrane by hydrolyzing the amide bond between the c-terminal glycine residue and an adjacent aromatic residue; the lack of a terminal glycine residue prevents the conjugation of lc3 to the membrane [154, 155]. ravz can also process conventional ubiquitin chains and prevent the targeting of intracellular bacteria for autophagic degradation [156]. indeed, using a co-infection system with salmonella and legionella, kubori and colleagues showed that the legionella ravz protease prevents the recruitment of the autophagy receptors p62 and ndp52 to the salmonella-containing vacuoles (scvs). the lack of autophagy receptors at the scvs is due to the removal of ubiquitin moieties from the scvs by ravz [156]. it was recently shown that ravz specificity toward lc3 anchored to the autophagosomal membrane depends on its interaction with pi3p [157]; this observation suggests that ravz activity as a dub on ubiquitin coats depends on the lipid structure of the nearby vacuole.
autophagy also targets viruses; yet, many viruses exploit autophagy for their replication [158]. coronaviruses induce the formation of double-membrane vesicles that allow for their replication and are often decorated with lc3, and cell infection with coronaviruses is often accompanied by the induction of autophagy [159]. the non-structural protein plp2-tm, which is a transmembrane papain-like protease with deubiquitinating activity [160], is sufficient for the accumulation of autophagosomes in different cell lines. however, its role in the regulation of autophagy is independent of its protease activity [161]. plp2-tm interacts with lc3 and promotes the accumulation of autophagosomes by blocking their fusion with the lysosomes. in their study, chen and colleagues suggest that plp2-tm blocks autophagosome-lysosome fusion through its interaction with beclin1, a prime target for viruses that manipulate the autophagy pathway [162].
plp2-tm also promotes the interaction of sting (stimulator of interferon genes) with beclin1, possibly to impede the activation of downstream antiviral responses, an effect accentuated by the deubiquitination of components of the signaling cascade such as rig-i or traf3 [161, 163]. these studies provide fascinating examples of possible coevolution and adaptation of pathogens with their host, in which pathogens managed to bypass the host's defense to their own benefit.
ubiquitination regulates major cellular functions by controlling protein stability and activity, and defects in this process contribute to the development of many diseases. in some cases, ubiquitin-dependent autophagic processes constitute entry points for the design of new treatments. depending on the context, however, autophagy can either be beneficial and contribute to survival and recovery or have adverse effects. as such, there is a need for an in-depth understanding of autophagy function and regulation in pathological or physiological situations to define in which situation the inhibition of a particular dub will be beneficial or detrimental. interestingly, the design of chemical tools is also a powerful strategy to probe the effects of dub inhibition, helping both to understand the role of dubs in the regulation of autophagy and to design future treatments that modulate autophagy in the corresponding pathologies. the efficiency and usability of an inhibitor depend on its specificity. as such, the discovery of dub-focused drugs has been challenging [20]. indeed, although dubs have a catalytic pocket that is suitable for drug development, their sequences and structures are very similar to one another. moreover, dubs are flexible enzymes, and the regulation of their activity can involve allosteric effects, as described for several dubs that alternate between active and inactive conformations [164] [165] [166] [167] [168] [169].
for instance, the free catalytic domain of usp7 undergoes significant structural modifications when it is complexed to ubal (ubiquitin aldehyde, an irreversible dub inhibitor) [164, 170]. recent publications on the dynamic interaction of usp7 with specific small-molecule inhibitors demonstrated that the binding of the molecules in the active site of usp7 modifies the catalytic residue c223. this modification of the active site of usp7 results in its inability to change conformation and perform the cleavage of ubiquitin chains. these studies show that the development of specific inhibitors binding to the active site of dubs is a realistic approach, opening new avenues in the field [171, 172]. in addition to intrinsic modulation of their activity and substrate specificity, some dubs require cofactors. for instance, the full activation of usp19 requires its interaction with hsp90, which promotes the binding of ubiquitin to the catalytic domain of usp19 [173]. another example is the proteasome-associated enzyme usp14, whose activity is strongly enhanced when in association with the proteasome [66]. despite the complexity of the regulation of dub activity, much effort has been devoted to the identification and development of small molecules regulating dub catalytic activity. some of the most successful small molecules affecting the autophagy process through the inhibition of autophagy-associated dubs are briefly introduced hereafter.
screens for inhibitors of dubs sought to identify new small molecules with potential in two main therapeutic fields, cancer and neurodegeneration (reviewed in [20]). in both fields, some of the inhibitors' targets play a role in the regulation of autophagy that possibly contributes to pathogenesis. usp14 is an enzyme associated with the proteasome, which plays an essential role in the regulation of protein turnover.
usp14 is particularly important in neurons for maintaining synaptic functions and constitutes an appealing target for drug development aimed at modulating the activity of the proteasome [174]. a screen of 63,052 compounds using proteasome reconstituted with usp14 led to the identification of the first inhibitor of usp14. the small molecule iu1 specifically inhibits usp14 with an ic50 of 4-5 µm [66]. the inhibitor iu1 blocks the activity of usp14 only in the presence of the proteasome, suggesting that it binds only to the activated enzyme. moreover, the compound iu1 abrogates the catalytic activity of usp14 without affecting its noncatalytic regulatory function [66]. however, with the growing number of usp14 substrates identified, there was a need for the development of iu1 analogs with improved selectivity over the usp14-substrate complexes. a curated screen of 87 variants of iu1 led to the identification of iu1-47 as a new potent inhibitor of usp14. iu1-47 treatment of murine primary neuron cultures and of neurons derived from human induced pluripotent stem cells (ipscs) accelerates the degradation of the microtubule-associated protein tau, which is implicated in many neurodegenerative diseases. in addition, the inhibition of usp14 by iu1-47 induced an increase in the autophagy flux, consistent with the increased degradation rate of tau [175]. as mentioned above, uch-l1 is a negative regulator of autophagy widely studied for its implication in parkinson's disease and its contribution to the aggregation of α-synuclein as a result of autophagy blockade [86, 89]. uch-l1 is also expressed in various primary lung tumors while not detectable in normal, healthy lung tissue, suggesting a possible contribution to cancer [176]. because of the correlation between uch-l1 and tumor progression, as well as its implication in neurodegenerative disease, uch-l1 is a recognized target for the development of therapeutic inhibitors.
as such, the compound ldn-57444 was identified in a high-throughput drug screen as a specific uch-l1 inhibitor. treating the h1299 lung cancer cell line with this compound significantly reduces the cell proliferation rate [177]. nsc632839 is another inhibitor that affects uch-l1 activity. however, nsc632839 activity is not specific to uch-l1, and it is already known to inhibit usp2 and usp7 [178, 179]. the amount of p62 in cells is reduced after treatment with both ldn-57444 and nsc632839, suggesting that these drugs could prevent the accumulation of protein aggregates [178]. therefore, these inhibitors may constitute new tools to investigate further the implication of uch-l1 in parkinson's disease and to evaluate whether uch-l1 inhibition favors the clearance of α-synuclein aggregates in neurons. inhibition of early regulators is another strategy to inhibit autophagy in some situations. the inhibitor wp1130, which targets usp9x, was reported to increase ulk1 ubiquitination, inducing its transfer to the aggresomes and its inhibition, and resulting in the blockade of autophagy in several cultured cell lines, including the bone osteosarcoma u2os cell line [180]. it was speculated that the inhibition of usp9x could be responsible for the accumulation and subsequent aggregation of ubiquitinated ulk1. however, silencing usp9x did not result in changes in ulk1 expression level when cells were treated with wp1130. this could be because wp1130 is only partially specific and could have other targets in vivo that remain to be discovered [180, 181]. in order to screen and select new small molecules interfering with autophagy in mammalian cells, liu and colleagues optimized an imaging-based assay that makes use of cells expressing the autophagy marker gfp-lc3 to quantify the accumulation of autophagosomes [79].
using this assay, they screened the iccb known bioactives library, a collection of 472 compounds, and identified the inhibitor spautin-1 (specific and potent autophagy inhibitor 1). spautin-1 inhibits the catalytic activity of both usp10 and usp13 with an ic50 of ~0.6-0.7 µm. these two dubs are involved in the regulation of beclin1 ubiquitination in the vps34 complex and, therefore, constitute an entry point to modulate the initiation of autophagy. cancer cell lines treated with spautin-1 showed an increased cell death rate under starvation conditions. as such, spautin-1 constitutes a potential lead for the development of autophagy inhibitors for anti-cancer therapies [79]. because of the essential function of usp30 in mitophagy, which is crucial to clear damaged mitochondria, notably in neuronal cells, several small-molecule inhibitors of usp30 have been developed in the past few years. for example, based on a phenotypic screen, it was shown that the inhibition of usp30 by the compound 15-oxospiramilactone enhances the activity of usp30's targets mfn1 and mfn2 (two gtpases anchored at the omm and essential for tethering adjacent mitochondria) and promotes mitochondrial fusion, thus contributing to the restoration of the mitochondrial network [182]. more recently, an in vitro study identified a new small molecule, mf-094, as a potent and selective inhibitor of usp30. this compound has the opposite effect of 15-oxospiramilactone, as mf-094-mediated inhibition of usp30 accelerates mitophagy [183]. these two studies highlight the fact that the same dub can be involved in different processes, depending on the signals it receives and its interactions within different protein complexes. new dub inhibitors will undoubtedly raise problems related to the existence of several substrates or to poor selectivity, requiring in-depth analysis of the selected compounds in different cell types and stress situations before any preclinical assays.
interestingly, these investigations may reveal much about how dubs regulate autophagy and other cell processes, and the inhibitors themselves may be used as molecular tools to unveil regulatory mechanisms.
protein ubiquitination is an essential, reversible, posttranslational modification involved in virtually every cellular process. the past decades have seen remarkable progress in the understanding of the function of dubs, their mechanism of action and their regulation. recently, there has been an increasing body of evidence that ubiquitination plays a crucial role in regulating autophagy and that dubs intervene at multiple steps in autophagy. deregulation of both autophagy and ubiquitination/deubiquitination processes has been linked to many pathologies, such as neurodegenerative diseases, cancer onset and progression, and different kinds of viral or bacterial infections. considerable effort has also been devoted to the development and optimization of small molecules acting as dub inhibitors. such molecules can serve not only as leads for the development of drug-like molecules but also as tremendously useful tools to investigate the molecular mechanism of autophagy and its regulation by the ubiquitin system. by developing molecules that target protein-protein interactions rather than catalytic activity, it could be possible to direct the function of a dub precisely towards a given process and/or target and thus avoid pleiotropic effects.
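as a rough illustration of what the ic50 values quoted above imply (4-5 µm for iu1 on usp14; ~0.6-0.7 µm for spautin-1 on usp10 and usp13), the following python sketch computes the fraction of enzyme activity inhibited at a given compound concentration. it assumes a simple hill-type dose-response curve with a hill coefficient of 1, which is not stated in the cited studies; the function name, the midpoint ic50 values (4.5 and 0.65 µm) and the test concentrations are illustrative choices, not measured data.

```python
def fraction_inhibited(conc_um: float, ic50_um: float, hill: float = 1.0) -> float:
    """Fraction of activity inhibited under a Hill-type dose-response model.

    Assumption (not from the cited studies): inhibition follows
    c^h / (c^h + IC50^h), with the Hill coefficient h defaulting to 1.
    """
    if conc_um < 0 or ic50_um <= 0 or hill <= 0:
        raise ValueError("concentration must be >= 0; IC50 and Hill coefficient must be > 0")
    if conc_um == 0:
        return 0.0
    ratio = (conc_um / ic50_um) ** hill
    return ratio / (1.0 + ratio)


# Illustrative midpoints of the IC50 ranges quoted in the text (assumed values, in µM).
ASSUMED_IC50_UM = {"iu1 (usp14)": 4.5, "spautin-1 (usp10/usp13)": 0.65}

if __name__ == "__main__":
    for name, ic50 in ASSUMED_IC50_UM.items():
        for conc in (0.1, 1.0, 10.0):  # hypothetical test concentrations, in µM
            pct = 100.0 * fraction_inhibited(conc, ic50)
            print(f"{name}: {conc:g} µM -> {pct:.0f}% inhibited")
```

by construction, the model returns 50% inhibition when the concentration equals the ic50. these are in vitro potencies under an idealized model; effective inhibition in cells may differ because of compound uptake, off-target binding and the cofactor dependence discussed above.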
autophagy at the crossroads of catabolism and anabolism cargo recognition and degradation by selective autophagy aggrephagy: selective disposal of protein aggregates by macroautophagy mitophagy and quality control mechanisms in mitochondrial maintenance clearance of misfolded and aggregated proteins by aggrephagy and implications for aggregation diseases the role for autophagy in cancer compromised autophagy and neurodegenerative diseases regulation of autophagy by protein post-translational modification posttranslational modification of autophagy-related proteins in macroautophagy the role of ubiquitin system in autophagy ubiquitin signaling and autophagy ubiquitin: structures, functions, mechanisms the demographics of the ubiquitin system the emerging complexity of protein ubiquitination mechanisms of deubiquitinase specificity and regulation proteasomal and autophagic degradation systems ubiquitin modifications the many roles of ubiquitin in nf-κb signaling ubiquitination: friend and foe in cancer deubiquitylating enzymes and drug discovery: emerging opportunities functions of lysosomes autophagy as a stress-response and quality-control mechanism: implications for cell injury and human disease the machinery of macroautophagy autophagosome formation from membrane compartments enriched in phosphatidylinositol 3-phosphate and dynamically connected to the endoplasmic reticulum the chaperone-mediated autophagy receptor organizes in dynamic protein complexes at the lysosomal membrane the coming of age of chaperone-mediated autophagy an intralysosomal hsp70 is required for a selective pathway of lysosomal protein degradation selective endosomal microautophagy is starvation-inducible in drosophila microautophagy of cytosolic proteins by late endosomes hsc70-4 deforms membranes to promote synaptic protein turnover by endosomal microautophagy chaperone-mediated autophagy and endosomal microautophagy: joint by a chaperone making new contacts: the mtor network in metabolism 
and signalling crosstalk two distinct vps34 phosphatidylinositol 3-kinase complexes function in autophagy and carboxypeptidase y sorting in saccharomyces cerevisiae a ubiquitin-like system mediates protein lipidation atg8, a ubiquitin-like protein required for autophagosome formation, mediates membrane tethering and hemifusion gabarap and gate16 localize to autophagosomal membrane depending on form-ii formation regulation of the tumor-suppressor beclin 1 by distinct ubiquitination cascades mtor inhibits autophagy by controlling ulk1 ubiquitylation, self-association and function through ambra1 and traf6 structure of ubiquitin refined at 1.8 å resolution the ubiquitin domain superfold: structure-based sequence alignments and characterization of binding epitopes the ubiquitin system getting into position: the catalytic mechanisms of protein ubiquitylation ubiquitin ligases: structure, function, and regulation hect and ring finger families of e3 ubiquitin ligases at a glance ubiquitination of substrates by esterification ubiquitin dependence of selective protein degradation demonstrated in the mammalian cell cycle mutant ts85 distinct monoubiquitin signals in receptor endocytosis the ubiquitin code a proteomics approach to understanding protein ubiquitination a ubiquitin ligase complex assembles linear polyubiquitin chains linear ubiquitin chains: enzymes, mechanisms and biology crystal structure and solution nmr studies of lys48-linked tetraubiquitin at neutral ph solution conformation of lys63-linked di-ubiquitin chain provides clues to functional diversity of polyubiquitin signaling deubiquitylation and regulation of the immune response regulation and cellular roles of ubiquitin-specific deubiquitinating enzymes deubiquitylases from genes to organism role of rpn11 metalloprotease in deubiquitination and degradation by the 26s proteasome identifying usps regulating immune signals in drosophila: usp2 deubiquitinates imd and promotes its degradation by interacting with 
key: cord-000575-g1ob16b9 authors: xie, xiao-li; zheng, li-fei; yu, ying; liang, li-ping; guo, man-cai; song, john; yuan, zhi-fa title: protein 
sequence analysis based on hydropathy profile of amino acids date: 2012-01-27 journal: journal of zhejiang university science b doi: 10.1631/jzus.b1100052 sha: doc_id: 575 cord_uid: g1ob16b9 biological sequence comparison is a fundamental task in computational biology. according to the hydropathy profile of amino acids, a protein sequence is taken as a string with three letters. three curves of the new protein sequence were defined to describe the protein sequence. a new method to analyze the similarity/dissimilarity of protein sequences was proposed based on the conditional probability of the protein sequence. finally, the protein sequences of nd6 (nadh dehydrogenase subunit 6) protein of eight species were taken as an example to illustrate the new approach. the results demonstrated that the method is convenient and efficient. comparing biological sequences is one of the central problems in bioinformatics when analyzing similarities of function and properties of different sequences. similarly, evolutionary homology is analyzed by comparing dna and protein sequences. in general, there are two types of methodologies to conduct the comparison. one is an alignment-based method, and the other is an alignment-free method. sequence alignment is based on computer-oriented and computer-intensive comparisons of sequences, from which a distance function or a score function is obtained. using the distance function, one can compare biological sequences. however, multiple sequence alignment of several hundred sequences always produces a bottleneck, firstly due to long computational time, and secondly due to possible bias of multiple sequence alignments for multiple occurrences of highly similar sequences (pham and zuegg, 2004). this has motivated the study of alignment-free sequence analysis, which is still in its early development. 
for most alignment-free methods, a biological sequence should be transformed into an object for which linear algebra and statistical theory already have useful analytical tools. since 1983, dna sequences have been represented in spaces of different dimensions (hamori and ruskin, 1983; hamori, 1985; nandy, 1994; 1996; nandy and basak, 2000; randić et al., 2001; randić, 2003; randić and balaban, 2003; zhang et al., 2003; liao and wang, 2004; liao et al., 2005; nandy et al., 2006; bai et al., 2007; feng and wang, 2008). each nucleotide of a given dna sequence is a point in such a space, and these graphical representations allow us to qualitatively analyze dna sequences, and provide a way of viewing, sorting and comparing various genomic sequences. based on the graphical representation, it is possible to numerically characterize a dna sequence and further quantitatively measure the similarity of different dna sequences. although protein sequences and dna sequences are both symbolic sequences, there are fewer methods for the graphical representation of protein sequences than of dna sequences. this is mainly because extension of dna graphical representation to protein sequences would enormously increase the number of possible alternative assignments for the 20 amino acids. the amino acid sequence is the key to understanding protein structure and function in the cell, so analysis of amino acid sequence is an important part of post-genomic studies. recently, several schemes have been proposed in protein graphical representation (randić and krilov, 1997; vinga and almeida, 2003; bai and wang, 2005; li j. et al., 2006; li c. et al., 2008; munteanu et al., 2008; yau et al., 2008; yao et al., 2008; wen and zhang, 2009). in order to plot an amino acid sequence, the 20 amino acids are divided into different types, so that a protein sequence is regarded as a word with three, four, or five different letters. 
since ordering amino acids based on their physicochemical properties may offer better insights into the comparative study of proteins than a representation based on a random ordering of amino acids, randić (2007) and yao et al. (2008) outlined different 2d graphical representations of protein sequences based on different physicochemical properties. the graphical representation of a protein sequence can not only describe the amino acid sequence, but also measure the similarity/dissimilarity of different protein sequences. however, these methods only consider the information of individual letters in the protein string, and do not consider the information carried by adjacent letters of the amino acid sequence. here, we choose conditional probability to measure the information of adjacent letters. in this paper, we converted a protein sequence into a three-letter sequence based on the hydropathy profile of amino acids and defined three curves to represent different hydropathy features. we then selected conditional probability as a new invariant for the protein sequences. to illustrate the proposed method, we made a comparison of the sequences belonging to eight nd6 (nadh dehydrogenase subunit 6) proteins from http://www.ncbi.nlm.nih.gov/: human , rat (ap_004903), and mouse (np_904339). according to the hydropathy profile of amino acids, the amino acids can be classified into three groups (nei and kumar, 2002; liu and wang, 2006): internal group (f, i, l, m, v), external group (d, e, h, k, n, q, r), and ambivalent group (s, t, y, c, w, g, p, a). amino acids of the internal group tend to occur in the inner side of the protein's spatial structure, while amino acids of the external group tend to appear at the surface. in order to characterize the hydropathicity of a protein primary structure, we defined a primary protein sequence as a symbolic sequence of three letters according to the following rule: $$f(s(i)) = \begin{cases} \mathrm{i}, & \text{if } s(i) \in \{\mathrm{f, i, l, m, v}\}, \\ \mathrm{e}, & \text{if } s(i) \in \{\mathrm{d, e, h, k, n, q, r}\}, \\ \mathrm{a}, & \text{if } s(i) \in \{\mathrm{s, t, y, c, w, g, p, a}\}, \end{cases}$$ where s(i) is the letter in the ith position in the protein primary sequence, and f(s(i)) is the substitution for s(i). 
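the substitution rule can be sketched in a few lines of python. this is a minimal illustration rather than the authors' implementation: the function name `to_three_letter` and the choice to reject unclassified symbols are assumptions made here, while the three groups are taken directly from the text.

```python
# Hydropathy groups as listed in the text (one-letter amino acid codes).
INTERNAL = set("FILMV")
EXTERNAL = set("DEHKNQR")
AMBIVALENT = set("STYCWGPA")


def to_three_letter(sequence: str) -> str:
    """Map each residue to I (internal), E (external) or A (ambivalent)."""
    out = []
    for residue in sequence.upper():
        if residue in INTERNAL:
            out.append("I")
        elif residue in EXTERNAL:
            out.append("E")
        elif residue in AMBIVALENT:
            out.append("A")
        else:
            # Assumption: symbols outside the 20 standard residues
            # (e.g. X, B, Z) are rejected rather than silently skipped.
            raise ValueError(f"unclassified residue: {residue!r}")
    return "".join(out)
```

for instance, `to_three_letter("MKV")` returns `"IEI"`, since m and v belong to the internal group and k to the external group.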
since the hydropathy profile can detect more evolutionary relationships, in the next section, we analyzed the new three-letter protein sequence through different mathematical methods. given a protein primary sequence of length n, we transformed it into a new sequence according to the above definition. for example, for the protein sequence s=mmyalfllsvglvmgfvgfs, f(s)=iiaaiiiiaiaiiiaiiaia. to obtain more information, we defined three curves of the sequence. firstly, we let $$u_i^{ie} = \begin{cases} 1, & \text{if } f(s(i)) = \mathrm{i}, \\ -1, & \text{if } f(s(i)) = \mathrm{e}, \\ 0, & \text{otherwise}, \end{cases} \quad u_i^{ia} = \begin{cases} 1, & \text{if } f(s(i)) = \mathrm{i}, \\ -1, & \text{if } f(s(i)) = \mathrm{a}, \\ 0, & \text{otherwise}, \end{cases} \quad u_i^{ea} = \begin{cases} 1, & \text{if } f(s(i)) = \mathrm{e}, \\ -1, & \text{if } f(s(i)) = \mathrm{a}, \\ 0, & \text{otherwise}, \end{cases}$$ where i ranges from 1 to n. then, let $$y_n = \sum_{i=1}^{n} u_i.$$ taking y_n and n as the y axis and x axis, respectively, we can draw three different curves, which are named the ie, ia, and ea curves of the protein sequence. the three curves give us some information about the protein sequence. according to the ie curve, we can compare the numbers of amino acids belonging to the internal group and the external group at different positions. the ia curve can then be used to compare the numbers of amino acids belonging to the internal group and the ambivalent group at different positions. finally, the ea curve can compare the numbers of amino acids of the external group and the ambivalent group at different positions. according to the above definitions of the three curves, we drew three curves of nd6 proteins for the eight species (fig. 1). fig. 1 shows that nd6 protein sequences contain more amino acids of the internal group than of the external group, and more amino acids of the ambivalent group than of the external group. furthermore, it is evident that g. seal and h. seal have similar curves, the curves of rat and mouse are almost identical, and the three curves of human, gorilla, and chimpanzee are similar, but wallaroo's curves differ from those of the other species. 
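the three curves can be sketched as running sums over the three-letter sequence. the step values used below (+1 for the first group named in the curve, −1 for the second, 0 for the remaining group) are an assumption chosen to agree with the state space {+1, 0, −1} used in the text, and the names `STEPS` and `curve` are illustrative only.

```python
from itertools import accumulate

# Assumed step values per curve: +1 for the first group in the curve's
# name, -1 for the second group, 0 for the group not named.
STEPS = {
    "IE": {"I": 1, "E": -1, "A": 0},
    "IA": {"I": 1, "A": -1, "E": 0},
    "EA": {"E": 1, "A": -1, "I": 0},
}


def curve(three_letter: str, kind: str) -> list[int]:
    """Return the running sum y_1..y_n for the requested curve."""
    steps = STEPS[kind]
    return list(accumulate(steps[c] for c in three_letter.upper()))


example = "IIAAIIIIAIAIIIAIIAIA"  # f(s) for the 20-residue example in the text
ie, ia, ea = (curve(example, k) for k in ("IE", "IA", "EA"))
```

plotting each list against the position index 1..n gives the corresponding curve; a curve that rises indicates that the first group dominates up to that position, and one that falls indicates that the second group dominates.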
a protein sequence is composed of three parts, the internal, external and ambivalent groups, so we regard it as a random numerical sequence over three values (+1, 0, −1). we calculated the conditional probability, which served as an invariant to quantify protein sequences. for example, let x_i^{ie} represent the state at the ith position (i=1, 2, ..., n), with state space s={+1, 0, −1}. there are nine conditional probabilities as follows: $$p\left(x_{i+1}^{ie} = b \mid x_i^{ie} = a\right), \quad a, b \in \{+1, 0, -1\}.$$ according to the above definition, we can obtain these conditional probabilities for a given protein sequence. the conditional probabilities of each of the nd6 proteins are listed in table 1. given two protein sequences, we can obtain two nine-component vectors whose elements are the conditional probabilities for each protein sequence. based on the vectors, we can compare different protein sequences. in general, the similarity of the two vectors can be obtained by calculating the euclidean distance. the smaller the euclidean distance between two vectors is, the more similar are the protein sequences. the euclidean distance between two vectors u and v is as follows: $$d(u, v) = \sqrt{\sum_{i=1}^{k} (u_i - v_i)^2},$$ where u_i and v_i denote the components of vectors u and v, respectively, and k is the dimension of vectors u and v. yao et al. (2009) proposed a new similarity measure of sequences, the coefficient of determination (r^2), which is defined as: $$r^2 = \frac{\left[\sum_{i=1}^{k} (u_i - \bar{u})(v_i - \bar{v})\right]^2}{\sum_{i=1}^{k} (u_i - \bar{u})^2 \sum_{i=1}^{k} (v_i - \bar{v})^2},$$ where \bar{u} and \bar{v} are the means of the components of u and v. r^2 varies from 0 to 1 and represents the proportion of the variation in the data that is explained by the line of best fit. the larger the coefficient of determination of two vectors is, the more similar are the protein sequences. in tables 2 and 3, we give the similarity/dissimilarity matrices for the eight nd6 sequences based on the euclidean distance and the coefficient of determination between the nine-component vectors. as shown in tables 2 and 3, the nd6 proteins of human, gorilla, and chimpanzee are more similar to each other. in addition, nd6 proteins are more similar within the pairs (g. seal, h. seal) and (mouse, rat). 
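the comparison described above can be sketched as follows. this is a minimal illustration under stated assumptions: the state sequence is taken to be the per-position values in {+1, 0, −1}, the conditional probabilities are estimated as transition frequencies between adjacent positions, a state that never occurs as a predecessor contributes zeros, and r^2 is computed as the squared correlation of the two vectors. the function names are hypothetical.

```python
import math
from itertools import product

STATES = (1, 0, -1)


def transition_vector(states):
    """Nine estimates of P(next = b | current = a), ordered as
    product(STATES, repeat=2): (1,1), (1,0), (1,-1), (0,1), ..."""
    pair_counts = {pair: 0 for pair in product(STATES, repeat=2)}
    from_counts = {a: 0 for a in STATES}
    for a, b in zip(states, states[1:]):
        pair_counts[(a, b)] += 1
        from_counts[a] += 1
    # Assumption: if state a is never followed by anything, its three
    # conditional probabilities are reported as 0.0.
    return [
        pair_counts[(a, b)] / from_counts[a] if from_counts[a] else 0.0
        for a, b in product(STATES, repeat=2)
    ]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def r_squared(u, v):
    """Squared correlation of u and v (undefined for constant vectors)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    var_u = sum((x - mu) ** 2 for x in u)
    var_v = sum((y - mv) ** 2 for y in v)
    return cov * cov / (var_u * var_v)
```

two sequences are then compared by computing `transition_vector` for each and passing the two nine-component vectors to `euclidean` (smaller means more similar) or `r_squared` (larger means more similar).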
however, the nd6 protein of wallaroo is very dissimilar to the others amongst the eight species. the results are consistent with the known facts of evolution (yao et al., 2009). biological sequence analysis is a fundamental task in computational biology, whose aim is to detect similarity/dissimilarity relationships between molecular sequences. some alignment-free methods to analyze similarities/dissimilarities of dna sequences have been proposed. however, there are few alignment-free methods to analyze protein sequences. the amino acid sequence of a protein is the key to understanding its structure and function in the cell, so we present a new method to analyze protein primary sequence in this paper. the method is based on the graphical representation, with conditional probability taken as the numerical characterization of the protein sequence. the significance of the new method is that it can not only analyze the similarity/dissimilarity of protein sequences, but also provide more biological information about the protein sequences. according to the ie curve, we can compare the numbers of amino acids of the internal and external groups at different positions. also the ia curve can be used to compare the numbers of amino acids of the internal and ambivalent groups at different positions. the ea curve can be used to compare the numbers of amino acids in the external and ambivalent groups at different positions. therefore the three curves show the distribution of the three types of amino acids. furthermore, the conditional probability reflects the distribution of pairs of adjacent amino acids. the new approach was applied to nd6 protein sequences of several species, and the results show that introducing the hydropathy profile of amino acids into protein sequence analysis is effective and feasible. 
a 2-d graphical representation of protein sequences based on nucleotide triplet codons
a representation of dna primary sequences by random walk
a 3d graphical representation of rna secondary structures based on chaos game representation
novel dna sequence representation
h curves, a novel method of representation of nucleotide series especially suited for long dna sequences
2-d graphical representation of protein sequences and its application to coronavirus phylogeny
simplification of protein sequence and alignment-free sequence analysis
analysis of similarity of dna sequences based on 3d graphical representation
application of 2d graphical representation of dna sequence
protein-based phylogenetic analysis by using hydropathy profile of amino acids
enzymes/non-enzymes classification model complexity based on composition, sequence, 3d and topological indices
a new graphical representation and analysis of dna sequence structure: i. methodology and application to globin genes
two-dimensional graphical representation of dna sequences and intron-exon discrimination in intron-rich sequences
simple numerical descriptor for quantifying effect of toxic substances on dna sequences
mathematical descriptors of dna sequences: development and applications
molecular evolution and phylogenetics
a probabilistic measure for alignment-free sequence comparison
condensed representation of dna primary sequences
2-d graphical representation of proteins based on physico-chemical properties of amino acids
characterization of 3-d sequences of proteins
on a four-dimensional representation of dna primary sequences
on the characterization of dna primary sequences by triplet of nucleic acid bases
alignment-free sequence comparison-a review
a 2d graphical representation of protein sequence and its numerical characterization
analysis of similarity/dissimilarity of protein sequences
similarity/dissimilarity studies of protein sequences based on a new 2d graphical representation
a protein map and its application
the z curve database: a graphic representation of genome sequences
we would like to thank dr. jian-gang wang (college of animal science and technology, northwest a&f university, china), jian-zhong luo (department of foreign languages, northwest a&f university, china), feng an and dr. jun-li du (college of sciences, northwest a&f university, china) for their helpful suggestions.
key: cord-261961-u4d0vvmq authors: st-germain, jonathan r.; astori, audrey; samavarchi-tehrani, payman; abdouni, hala; macwan, vinitha; kim, dae-kyum; knapp, jennifer j.; roth, frederick p.; gingras, anne-claude; raught, brian title: a sars-cov-2 bioid-based virus-host membrane protein interactome and virus peptide compendium: new proteomics resources for covid-19 research date: 2020-08-28 journal: biorxiv doi: 10.1101/2020.08.28.269175 sha: doc_id: 261961 cord_uid: u4d0vvmq key steps of viral replication take place at host cell membranes, but the detection of membrane-associated protein-protein interactions using standard affinity-based approaches (e.g. immunoprecipitation coupled with mass spectrometry, ip-ms) is challenging. to learn more about sars-cov-2 host protein interactions that take place at membranes, we utilized a complementary technique, proximity-dependent biotin labeling (bioid). this approach uncovered a virus-host topology network comprising 3566 proximity interactions amongst 1010 host proteins, highlighting extensive virus protein crosstalk with: (i) host protein folding and modification machinery; (ii) membrane-bound vesicles and organelles, and; (iii) lipid trafficking pathways and er-organelle membrane contact sites. the design and implementation of sensitive mass spectrometric approaches for the analysis of complex biological samples is also important for both clinical and basic research proteomics focused on the study of covid-19. 
to this end, we conducted a mass spectrometry-based characterization of the sars-cov-2 virion and infected cell lysates, identifying 189 unique high-confidence virus tryptic peptides derived from 17 different virus proteins, to create a high quality resource for use in targeted proteomics approaches. together, these datasets comprise a valuable resource for ms-based sars-cov-2 research, and identify novel virus-host protein interactions that could be targeted in covid-19 therapeutics. the sars-cov-2 ~30 kb positive strand rna genome (genbank mn908947.3 1, 2 ) contains two large open reading frames (orf1a and orf1ab) encoding polyproteins that are cleaved by viral proteases into ~16 non-structural proteins (nsps). 13 smaller 3' orfs encode the primary structural proteins of the virus, spike (s), nucleocapsid (n), membrane (m) and envelope (e), along with nine additional polypeptides of poorly understood function (fig 1a) . to enable proteomics-based approaches for the analysis of complex biological samples, we analyzed both sars-cov-2-infected cell lysates and mature virions, generating a high confidence virus peptide spectrum compendium. this dataset can be used e.g. for the selection of virus peptides for use in targeted proteomics approaches (e.g. the identification of viral peptides in human clinical samples), or for the generation of peptide spectral libraries for increased sensitivity of detection. two sars-cov-2 -host protein-protein interaction (ppi) mapping efforts have utilized immunoprecipitation coupled with mass spectrometry (ip-ms) of epitope-tagged viral proteins to identify >1000 putative virus-host ppis in hek293t 3 and a549 4 cells. while extremely powerful for identifying stable, soluble protein complexes, ip-ms approaches are not optimal for the capture of weak or transient ppis, or for the detection of ppis that take place at poorly soluble intracellular locations such as membranes, where key steps in viral replication occur. 
to better understand how host cell functions are hijacked and subverted by sars-cov-2 proteins, we used proximity-dependent biotinylation (bioid 5 ) as a complementary approach to map virus-host protein proximity interactions in live human cells. this dataset provides a valuable resource for better understanding sars-cov-2 pathogenesis, and identifies numerous previously undescribed virus-host ppis that could represent attractive targets for therapeutic intervention. a sars-cov-2 peptide compendium. to identify high quality tryptic virus peptides for use in targeted proteomics analyses, data-dependent acquisition (dda) mass spectrometry was conducted on the mature sars-cov-2 virion. the toronto sb3 virus strain 6 was cultured in veroe6 cells (moi 0.1), and culture media was collected 48 hrs post-infection. virus was concentrated by centrifugation, inactivated by detergent and subjected to tryptic proteolysis. the resulting viral tryptic peptides were identified using nanoflow liquid chromatography-tandem mass spectrometry (lc-ms/ms; fig 1a). together, these data confirm and expand upon previous proteomic analyses of sars-cov-2 virions, infected cells 4, 7-11 and patient samples [12] [13] [14] , and provide a library of high quality virus peptide spectra covering 17 virus proteins that can be used for the creation of peptide spectral libraries and targeted proteomics approaches. a sars-cov-2 -host protein proximity interactome. based on standard transcript mapping algorithms and conservation with orfs in other coronaviruses 1, 2 , we created a sars-cov-2 open reading frame (orf) vector set 15 (fig 1a) . nine sars-cov-2 proteins are predicted to have one or more transmembrane domains (s, e, m, nsp3, nsp4, nsp6, orf3a, orf7a and orf7b). 
to better characterize sars-cov-2 -host membrane-associated ppis, these virus orfs (along with the remaining poorly understood open reading frames orf3b, orf6, orf8 and orf9b) were fused in-frame with an n-terminal bira* (r118g) coding sequence, and the resulting fusion proteins individually expressed in hek 293 flp-in t-rex cells. using these cells, a virus-host ppi landscape was characterized using bioid (as in 16 ; supp table 2 ). applying a bayesian false discovery rate of ≤1%, 3566 high confidence proximity interactions were identified with 1010 unique human proteins (all raw data available at massive.ucsd.edu, accession #msv000086006). 412 prey polypeptides were detected as high confidence interactors for a single sars-cov-2 bait protein in this analysis, underscoring the high degree of specificity in this virus-host proximity interaction map. bait-bait correlation analysis (fig 1b) , based on similarity between interactomes (jaccard index analysis conducted in prohits-viz 17 ) revealed high levels of correspondence between the s (spike), e, m, nsp4, nsp6, orf7a, orf7b, and orf8 bait proteins. the nsp2, nsp3, orf3a, orf3b, orf6, and orf9b interactomes shared a lower degree of similarity with the other bait proteins in this set. bait proteins with one or more predicted transmembrane domains thus largely clustered together, with the exception of orf3a (which clusters outside the main group of putative membrane baits, even though it is predicted to possess three transmembrane helices), and orf8 (which clusters with the putative membrane baits, but has no predicted transmembrane domain itself). a self-organized force-directed bait-prey topology map was next generated, in which map location is determined by the number and abundance (i.e. total peptide counts) of host cell interactors (fig 1c) . 
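the bait-bait comparison rests on the jaccard index, the size of the intersection of two interactomes divided by the size of their union. a minimal sketch is shown below; it is not the prohits-viz implementation, and the bait names, prey identifiers and function names are hypothetical placeholders rather than values from the dataset.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """|a & b| / |a | b|; defined here as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def bait_similarity(interactomes: dict) -> dict:
    """Jaccard index for every unordered pair of baits."""
    return {
        (x, y): jaccard(interactomes[x], interactomes[y])
        for x, y in combinations(sorted(interactomes), 2)
    }


# Hypothetical toy interactomes (prey identifiers are placeholders).
toy = {
    "bait_A": {"P1", "P2", "P3"},
    "bait_B": {"P2", "P3", "P4"},
    "bait_C": {"P9"},
}
scores = bait_similarity(toy)
```

in this toy input, bait_A and bait_B share two of four distinct preys, so their index is 0.5, while bait_C shares none with either and scores 0.0. baits with high pairwise indices would cluster together in a heat map such as the one described for fig 1b.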
this approach similarly clustered all of the baits with one or more predicted transmembrane helices, along with orf8, in a dense "core" region of the map, indicating that these bait proteins share a large proportion of common interactors. nsp3 and orf3a occupy regions at the edge of this dense region of the map, indicating a lower number of shared interactors with the other membrane proteins. nsp2, orf3b and orf9b occupy peripheral regions of the map, indicating that they share far fewer interactors with the rest of the baits analyzed here. interestingly, orf6 occupies a region of the map near orf3a (these two baits were also clustered near each other in the bait-bait analysis, above). consistent with this location, more than half of the orf3a interactome (39 of 73 proteins) is also present in the larger orf6 interactome (217 proteins). this overlapping group of interactors is enriched in plasma membrane (pm) and er proteins. based on this observation, it will be interesting to explore similarities in orf3a and orf6 function. as a whole, the virus-host interactome is significantly enriched in proteins associated with the endoplasmic reticulum (er)/nuclear, golgi and plasma membranes, and er-golgi trafficking vesicles. even amongst those viral proteins that appear to localize exclusively to the er-golgi-pm endomembrane system, specificity in virus-host interactomes was observed, likely reflecting preferences for interactions with different subsets of membrane proteins and/or localization to unique membrane lipid nanodomains. for example, both the orf7a and orf7b interactomes are enriched in pm solute channels, but orf7a appears to interact uniquely with the anion exchanger slc4a2, the taurine transporter slc6a6, and the glycine transporter slc6a9, while orf7b interacts specifically with the amino acid transporter slc1a5, the sulfate transporter slc26a11, and the divalent metal transporter slc39a14. 
autophagy is an important part of the innate immune response, effecting the elimination of intracellular pathogens such as viruses (virophagy), and delivering them to the lysosome, which processes pathogen components for antigen presentation 26, 27 . many viruses have thus evolved strategies to inhibit the host autophagic machinery. notably, however, (+)rna viruses appear to be dependent on autophagic function for efficient replication, and hijack components of the autophagic machinery for use in membrane re-organization and the creation of ros 28 . consistent with these observations, our data highlight multiple interactions amongst sars-cov-2 proteins and the er-phagy receptors fam134b, tex2644 and sec24c. a number of virus protein interactions were also detected with components of the ufmylation system (ddrgk1, cdk5rap3, ufl1 and ufsp2), which was recently shown to play a key role in er-phagy 10 , highlighting interesting links between specific autophagy pathways and sars-cov-2. the orf9b interactome is significantly enriched in mitochondrial proteins (supp table 2, supp table 3 ). amongst the high confidence orf9b interactors is the mitochondrial antiviral signaling protein (mavs), which acts as a hub for cell-based innate immune signaling. the cellular pattern recognition receptors (prrs) detect pathogen-associated molecular patterns (pamps, e.g. viral rnas). pamp-bound prrs interact with mavs, which activates the nf-kb and type i interferon signaling pathways 29 . many different viruses block the host antiviral response by interfering with mavs signaling. this may be accomplished by e.g. direct mavs cleavage by viral proteases (a strategy used by hav, hcv and coxsackievirus b3), or via 26s proteasome-mediated degradation, a strategy used by sars-cov orf9b 30 , which recruits the hect e3 ligase itch/aip4 to effect mavs ubiquitylation. 
we did not detect any ubiquitin e3 ligases in the orf9b bioid interactome, but consistent with a recent report indicating that sars-cov-2 orf9b binds directly to tomm70 31 to block mavs-mediated ifn signaling, we detected tomm70 as a major component of the orf9b interactome. the sars-cov-2 orf6 interactome is uniquely enriched in nuclear pore complex (npc) components (supp table 2, supp table 3 ). sars-cov orf6 was shown to inhibit npc-mediated transport by tethering the importin proteins kpna2 and kpnb2/tnpo1 to er/golgi membranes 32 , which effectively blocks the import of immune signaling proteins such as stat1 into the nucleus. sars-cov-2 orf6 shares 69% identity at the amino acid level with its sars-cov counterpart, and similarly displays potent immune repressor function 33 . it will be interesting to determine if the sars-cov-2 orf6 interaction with npc components leads to a similar disruption of immune signaling and nuclear transport. orf3b interacts specifically with lamtor1 and lamtor2, components of the ragulator complex, which is localized to the lysosomal membrane and regulates the mechanistic target of rapamycin complex 1 (mtorc1). mtor signaling is inactivated by amino acid starvation and other types of stress, to inhibit cap-dependent translation and upregulate autophagy 34 . many viruses have thus evolved mechanisms to maintain mtorc1 activity during infection 35 . recent work has shown that the lamtor1/2 proteins may also play important roles in xenophagy 36 . based on these observations, sars-cov-2 orf3b could play an important role in regulating mtorc1 activity and/or in the disruption of antiviral immune function. in addition to er/golgi proteins, sars-cov-2 nsp3 interacts with the cytoplasmic rna binding proteins fxr1 and fxr2 37 . the fxr proteins were identified as host cell components of (+)rna equine encephalitis virus (eev) rna replication complexes (rc) [38] [39] [40] . 
fxrs are recruited to the viral rc by eev nsp3 proteins, and the fxrs are required for rc assembly. it will be interesting to determine whether the fxrs play similar roles in sars-cov-2 rna replication. we and others have previously reported that orthogonal ppi discovery approaches such as proximity-dependent biotinylation (bioid) can provide information that is highly complementary to ip-ms datasets 16, [41] [42] [43] [44] . to this end, we applied bioid to identify proximity partners for sars-cov-2 proteins in the proteomics "workhorse" 293 cell system. this mapping project significantly expands upon the sars-cov-2 virus-host interactome, providing a rich resource that can be mined by the scientific community for better understanding sars-cov-2 pathobiology, and identifying virus-host membrane protein interactions that could be targeted in covid-19 therapeutics. the design and implementation of sensitive mass spectrometric approaches for the analysis of complex biological samples will be important for clinical and basic research proteomics focused on covid-19. to this end, we also undertook an analysis of sars-cov-2 virions and infected vero cell lysates using data-dependent acquisition tandem mass spectrometry, and identified 189 unique tryptic peptides, assigned to 17 different virus proteins. this work provides a significantly expanded sars-cov-2 tryptic peptide compendium for use in targeted proteomics approaches such as parallel or selected reaction monitoring (prm/srm), or for use in spectral library building. bioid, mass spectrometry and data analysis were conducted exactly as in 16 . supp table 1 . virus peptide identification dataset, complete raw data. data for viral preparation and infected cells is presented in individual tabs. 
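the tryptic peptides mentioned above are the fragments expected when trypsin cleaves a protein on the c-terminal side of lysine (k) or arginine (r), a cleavage that is commonly modelled as blocked when the next residue is proline (p). the sketch below illustrates such an in silico digest under those assumptions. the input sequence is invented for illustration and is not a viral protein, missed cleavages are ignored, and the function name and minimum-length filter are hypothetical rather than settings taken from this study.

```python
import re


def tryptic_digest(sequence: str, min_length: int = 1) -> list:
    """Split a protein sequence after K or R, except when followed by P.

    Fully cleaved peptides only (no missed cleavages); peptides shorter
    than min_length are discarded.
    """
    # Zero-width split point: preceded by K/R and not followed by P.
    pieces = re.split(r"(?<=[KR])(?!P)", sequence.upper())
    return [p for p in pieces if len(p) >= min_length]


# Invented example sequence (not taken from any real protein).
example = "MKAAARPLLKGGR"
peptides = tryptic_digest(example)
print(peptides)

# A length filter mimics the lower size limit often applied to peptides
# that are useful for targeted assays.
print(tryptic_digest(example, min_length=6))
```

in the example the arginine followed by proline is not cleaved, so the second peptide spans that site.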
table 2 genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after visiting wuhan a new coronavirus associated with human respiratory disease in china multi-level proteomics reveals host-perturbation strategies of a promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells sequence, infectivity, and replication kinetics of severe acute respiratory syndrome coronavirus 2. emerg infect dis data, reagents, assays and merits of proteomics for sars-cov-2 research and testing shortlisting sars-cov-2 peptides for targeted studies from experimental data-dependent acquisition tandem mass spectrometry data shotgun proteomics analysis of sars-cov-2-infected cells and how it can optimize whole viral particle antigen production for vaccines a genome-wide er-phagy screen highlights key roles of mitochondrial metabolism and er-resident ufmylation proteomics of sars-cov-2-infected host cells reveals therapy targets mass spectrometric identification of sars-cov-2 proteins from gargle solution samples of covid-19 patients mass-spectrometric detection of sars-cov-2 virus in scrapings of the epithelium of the nasopharynx of infected patients via nucleocapsid n protein proteotyping sars-cov-2 virus from nasopharyngeal swabs: a proof-of-concept focused on a 3 min mass spectrometry window global interactomics uncovers extensive organellar targeting by zika virus prohits-viz: a suite of web tools for visualizing interaction proteomics data an intramembrane chaperone complex facilitates membrane protein biogenesis here, there, and everywhere: the importance of er membrane contact sites +)rna viruses rewire cellular pathways to build replication organelles rhinovirus uses a phosphatidylinositol 4-phosphate/cholesterol countercurrent for the formation of replication compartments at the er-golgi interface fat(al) attraction: picornaviruses usurp lipid transfer at membrane contact 
sites to create replication organelles host lipids in positive-strand rna virus genome replication the oxysterol-binding protein cycle: burning off pi(4)p to transport cholesterol hepatitis c virus replication depends on endosomal cholesterol homeostasis digesting the crisis: autophagy and coronaviruses autophagy enhances the presentation of endogenous viral antigens on mhc class i molecules during hsv-1 infection manipulation of autophagy by (+) rna viruses regulation of mavs expression and signaling function in the antiviral innate immune response sars-coronavirus open reading frame-9b suppresses innate immunity by targeting mitochondria and the mavs/traf3/traf6 signalosome sars-cov-2 orf9b suppresses type i interferon responses by targeting tom70 viral subversion of nucleocytoplasmic trafficking sars-cov-2 nsp13, nsp14, nsp15 and orf6 function as potent interferon antagonists nutrient regulation of mtorc1 at a glance adapting the stress response: viral subversion of the mtor signaling pathway lamtor2/lamtor1 complex is required for tax1bp1-mediated xenophagy concise review: fragile x proteins in stem cell maintenance and differentiation mutations in hypervariable domain of venezuelan equine encephalitis virus nsp3 protein differentially affect viral replication hypervariable domain of eastern equine encephalitis virus nsp3 redundantly utilizes multiple cellular proteins for replication complex assembly new world and old world alphaviruses have evolved to exploit different components of stress granules, fxr and g3bp proteins, for assembly of viral replication complexes bioid-based identification of skp cullin f-box (scf)beta-trcp1/2 e3 ligase substrates getting to know the neighborhood: using proximity-dependent biotinylation to characterize protein complexes and map organelles parallel exploration of interaction space by bioid and affinity purification coupled to mass spectrometry proximity biotinylation and affinity purification are complementary approaches for 
the interactome mapping of chromatin-associated protein complexes saint: probabilistic scoring of affinity purification-mass spectrometry data key: cord-000674-36jzzy77 authors: reggiori, fulvio; komatsu, masaaki; finley, kim; simonsen, anne title: autophagy: more than a nonselective pathway date: 2012-05-15 journal: int j cell biol doi: 10.1155/2012/219625 sha: doc_id: 674 cord_uid: 36jzzy77 autophagy is a catabolic pathway conserved among eukaryotes that allows cells to rapidly eliminate large unwanted structures such as aberrant protein aggregates, superfluous or damaged organelles, and invading pathogens. the hallmark of this transport pathway is the sequestration of the cargoes that have to be degraded in the lysosomes by double-membrane vesicles called autophagosomes. the key actors mediating the biogenesis of these carriers are the autophagy-related genes (atgs). for a long time, it was assumed that autophagy is a bulk process. recent studies, however, have highlighted the capacity of this pathway to exclusively eliminate specific structures and thus better fulfil the catabolic necessities of the cell. we are just starting to unveil the regulation and mechanism of these selective types of autophagy, but what it is already clearly emerging is that structures targeted to destruction are accurately enwrapped by autophagosomes through the action of specific receptors and adaptors. in this paper, we will briefly discuss the impact that the selective types of autophagy have had on our understanding of autophagy. three different pathways can deliver cytoplasmic components into the lumen of the lysosome for degradation. they are commonly referred to as autophagy (cell "self-eating") and include chaperone-mediated autophagy (cma), microautophagy, and macroautophagy. cma involves the direct translocation of specific proteins containing the kferq pentapeptide sequence across the lysosome membrane [1, 2] . 
microautophagy, on the other hand, entails the invagination and pinching off of the lysosomal limiting membrane, which allows the sequestration and elimination of cytoplasmic components. the molecular mechanism underlying this pathway remains largely unknown. the only cellular function that so far has been indisputably assigned to microautophagy is the turnover of peroxisomes under specific conditions in fungi [3] . recently, a microautophagy-like process has been reported at late endosomes, where proteins are selectively incorporated into the vesicles that bud inward at the limiting membrane of these organelles during the biogenesis of multivesicular bodies [4] . in contrast to cma and microautophagy, macroautophagy (hereafter referred to as autophagy) entails the formation of a new organelle, the autophagosome, which allows the delivery of a large number of different cargo molecules into the lysosome. autophagy is a primordial and highly conserved intracellular process that occurs in most eukaryotic cells and participates in stress management. this pathway involves the de novo formation of vesicles called autophagosomes, which can engulf entire regions of the cytoplasm, individual organelles, protein aggregates, and invading pathogens ( figure 1 ). the autophagosomes fuse with endosomal compartments to form amphisomes prior to fusion with the lysosome, where their contents are degraded and the resulting metabolites are recycled back to the cytoplasm (figure 1 ). unique features of the pathway include the double-membrane structure of the autophagosomes, which were originally characterized over 50 years ago from detailed electron microscopy studies [5] . 
figure 1 : in response to inactivation of mtorc1 (but also other cellular and environmental cues), the ulk1 complex is activated and translocates in proximity of the endoplasmic reticulum (er). thereafter, the ulk1 complex regulates the class iii pi3k complex. atg9l, a multimembrane spanning protein, is also involved in an early stage of autophagosome formation, probably by supplying part of the membranes necessary for the formation and/or expansion. local formation of pi3p at sites called omegasomes promotes the formation of the phagophore, from which autophagosomes appear to be generated. the pi3p-binding wipi proteins (yeast atg18 homolog), as well as the atg12-atg5-atg16l1 complex and the lc3-phosphatidylethanolamine (pe) conjugate play important roles in the elongation and closure of the isolation membrane. finally, the complete autophagosome fuses with endosomes or endosome-derived vesicles forming the amphisome, which subsequently fuses with lysosomes to form autolysosomes. in the lysosomes, the cytoplasmic materials engulfed by the autophagosomes are degraded by resident hydrolases. the resulting amino acids and other basic cellular constituents are reused by the cell; when in high levels they also reactivate mtorc1 and then suppress autophagy.
starting in the 1990s yeast mutational studies began the genetic and molecular characterization of the key components required to initiate and build an autophagosome [6] . subsequently, genetic and transgenic studies in plants, worms, fruit flies, mice, and humans have underscored the pathway's conservation and have begun to unveil the intricate vital role that autophagy plays in the physiology of cells and multicellular organisms. for a long time, autophagy was considered a nonselective pathway induced as a survival mechanism in response to cellular stresses. 
over the past several years, however, it has become increasingly evident that autophagy also is a highly selective process involved in clearance of excess or dysfunctional organelles, protein aggregates and intracellular pathogens. in this introductory piece, we will briefly discuss the molecular mechanisms of selective types of autophagy and their emerging importance as a quality control to maintain cellular and organismal health, aspects that will be presented in depth in the reviews of this special issue of the international journal of cell biology and highlighted by the research papers.
2.1. the function of the atg proteins.
autophagosomes are formed by expansion and sealing of a small cistern known as the phagophore or isolation membrane ( figure 1 ). once complete, they deliver their cargo into the hydrolytic lumen of lysosomes for degradation. a diverse set of components is involved in the biogenesis of autophagosomes, primarily the proteins encoded by the autophagy-related genes (atg). most atg genes have initially been identified and characterized in yeast. subsequent studies in higher eukaryotes have revealed that these key factors are highly conserved. to date, 36 atg proteins have been identified and 16 are part of the core atg machinery essential for all autophagy-related pathways [7] . upon autophagy induction, these proteins associate following a hierarchical order [8, 9] to first mediate the formation of the phagophore and then to expand it into an autophagosome [10, 11] . while their molecular functions and their precise contribution during the biogenesis of double-membrane vesicles remain largely unknown, they have been classified in 4 functional groups of genes: (1) the atg1/ulk complex, (2) the phosphatidylinositol 3-kinase (pi3k) complex, (3) the atg9 trafficking system, and (4) the two parallel ubiquitin-like conjugation systems ( figure 1 ). 
the atg1/ulk complex consists of atg1, atg13, and atg17 in yeast, and ulk1/2, atg13, fip200 and atg101 in mammals [12] [13] [14] [15] . this complex is central in mediating the induction of autophagosome biogenesis and as a result it is the terminal target of various signaling cascades regulating autophagy, such as the tor, insulin, pka, and ampk pathways [16] (figure 1 ). increased activity of the atg1/ulk kinase is the primary event that determines the acute induction and upregulation of autophagy. it is important to note that ulk1 is part of a protein family and two other members, ulk2 and ulk3, have been shown to play a role in autophagy induction as well [14, 17] . the expansion of this gene family may reflect the complex regulation and requirements of the pathway in multicellular long-lived organisms. stimulation of the ulk kinases is achieved through an intricate network of phosphorylation and dephosphorylation modifications of the various subunits of the atg1/ulk complex. for example, atg13 is directly phosphorylated by tor and the phosphorylation state of atg13 modulates its binding to atg1 and atg17. inactivation of tor leads to a rapid dephosphorylation of atg13, which increases atg1-atg13-atg17 complex formation, stimulates the atg1 kinase activity and induces autophagy [18, 19] . mammalian atg13 (matg13) is also essential for autophagy, but seems to directly interact with ulk1, ulk2 and fip200 independently of its phosphorylation state [13, 14] . in addition, there are several phosphorylation events within this complex, including phosphorylation of matg13 by ulk1, ulk2, and tor; phosphorylation of fip200 by ulk1 and ulk2; and phosphorylation of ulk1 and ulk2 by tor [13, 14] . additional studies are required to fully characterize the functional significance of these posttranslational modifications. autophagy is also regulated by the activity of pi3k complexes. 
yeast contains a single pi3k, vps34, which is present in two different tetrameric complexes that share 3 common subunits, vps34, vps15, and atg6 [20] . complex i is required for the induction of autophagy and through its fourth component, atg14, associates with the autophagosomal membranes where the lipid kinase activity of vps34 is essential for generating the phosphatidylinositol-3-phosphate (pi3p) that permits the recruitment of other atg proteins [9, 21] (figure 1 ). complex ii contains vps38 as the fourth subunit and it is involved in endosomal trafficking and vacuole biogenesis [20] . there are three types of pi3k in mammals: class i, ii, and iii. the functions of class ii pi3k remain largely unknown, but both classes i and iii pi3ks are involved in autophagy. while class i pi3k is principally implicated in the modulation of signalling cascades, class iii pi3k complexes regulate organelle biogenesis and, like their yeast counterparts, contain three common components: hvps34, p150 (vps15 ortholog), and beclin 1 (atg6 ortholog). the counterparts of atg14 and vps38 are called atg14l/barkor and uvrag, respectively [22] [23] [24] . the atg14l-containing complex plays a central role in autophagy and functions very similarly to the yeast complex i by directing the class iii pi3k complex i to the phagophore to produce pi3p and initiate the recruitment of the atg machinery ( figure 1 ). atg14l is thought to be present on the er irrespective of autophagy induction [25] . upon starvation, atg14l localizes to autophagosomal membranes [8] . importantly, depletion of atg14l reduces pi3p production, impairs the formation of autophagosomal precursor structures, and inhibits autophagy [8, 24, 26, 27] . the uvrag-containing class iii pi3k complex also regulates autophagy but it appears to act at the intersection between autophagy and the endosomal transport pathways. 
uvrag initially associates with the bar-domain protein bif-1, which may regulate matg9 trafficking from the trans-golgi network (tgn) [28, 29] . uvrag then interacts with the class c vps/hops protein complex, promoting the fusion of autophagosomes with late endosomes and/or lysosomes [30] . finally, the uvrag-containing class iii protein complex binds to rubicon, a late endosomal and lysosomal protein that suppresses autophagosome maturation by reducing hvps34 activity [26, 31] . importantly, both the atg14l- and uvrag-containing complexes interact through beclin 1 with ambra1, which in turn tethers these protein complexes to the cytoskeleton via an interaction with dynein [32, 33] . following the induction of autophagy, ulk1 phosphorylates ambra1, thus releasing the class iii pi3k complexes from dynein, and their subsequent relocalization triggers autophagosome formation. therefore, ambra1 constitutes a direct regulatory link between the atg1/ulk1 and the pi3k complexes [32] . together with the atg1/ulk and the pi3k complexes, atg9 is one of the first factors localizing to the preautophagosomal structure or phagophore assembly site (pas), the structure believed to be the precursor of the phagophore [9, 34] (figure 1 ). atg9 is the only conserved transmembrane protein that is essential for autophagy. it is distributed to the pas and multiple additional cytoplasmic tubulovesicular compartments derived from the golgi [35] [36] [37] . atg9 cycles between these two locations and consequently it is thought to serve as a membrane carrier providing the lipid building blocks for the expanding phagophore [37] . one of the established functions of atg9 is that it leads to the formation of the yeast pas when at least one of the cytoplasmic tubulovesicular compartments translocates near the vacuole [34] . atg9 is also essential to recruit the pi3k complex i to the pas [9] . 
retrieval transport of yeast atg9 from the pas and/or complete autophagosome is mediated by the atg2-atg18 complex [38] and appears to be regulated by the atg1/ulk and pi3k complexes [37] . mammalian atg9 (matg9) has similar characteristics to its yeast counterpart. matg9 localizes to the tgn and late endosomes and redistributes to autophagosomal structures upon the induction of autophagy (figure 1 ) [39] , further promoting pathway activity [29, [40] [41] [42] . as in yeast, cycling of matg9 between locations also requires the atg1/ulk complex and the kinase activity of hvps34 [39, 43] . the core atg machinery also entails two ubiquitin-like proteins, atg12 and atg8/microtubule-associated protein 1 (map1)-light chain 3 (lc3), and their respective, partially overlapping, conjugation systems [44] [45] [46] (figure 1 ). atg12 is conjugated to atg5 through the activity of the atg7 (e1-like) and the atg10 (e2-like) enzymes. the atg12-atg5 conjugate then interacts with atg16, which oligomerizes to form a large multimeric complex. atg8/lc3 is cleaved at its c terminus by the atg4 protease to generate the cytosolic lc3-i with a c-terminal glycine residue, which is then conjugated to phosphatidylethanolamine (pe) in a reaction that requires atg7 and the e2-like enzyme atg3. this lipidated form of lc3 (lc3-ii) is attached to both faces of the phagophore membrane. once the autophagosome is completed, atg4 removes lc3-ii from the outer autophagosome surface. these two ubiquitination-like systems appear to be closely interconnected. on one hand, the multimeric atg12-atg5-atg16 complex localizes to the phagophore and acts as an e3-like enzyme, determining the site of atg8/lc3 lipidation [47, 48] . on the other hand, the atg8/lc3 conjugation machinery seems to be essential for the optimal functioning of the atg12 conjugation system. 
in atg3-deficient mice, atg12-atg5 conjugation is markedly reduced, and normal dissociation of the atg12-atg5-atg16 complex from the phagophore is delayed [49] . some evidence suggests that these two conjugation systems also function together during the expansion and closure of the phagophore. for example, overexpression of an inactive mutant of atg4 inhibits the lipidation of lc3 and leads to the accumulation of a number of nearly complete autophagosomes [47] . while controversial [50] , it has been postulated that atg8/lc3 also possesses fusogenic properties, thus mediating the assembly of the autophagic membrane [51, 52] . it should be noted that mammals possess at least 7 genes coding for lc3/atg8 proteins that can be grouped into three subfamilies: (1) the lc3 subfamily containing lc3a, lc3b, lc3b2 and lc3c; (2) the gamma-aminobutyrate receptor-associated protein (gabarap) subfamily comprising gabarap and gabarapl1 (also called gec-1); (3) the golgi-associated atpase enhancer of 16 kda (gate-16) protein (also called gabarap-l2/gef2) [53] . although in vivo studies show that they are all conjugated to pe, they appear to have evolved complex nonredundant functions [54] . membranes. the origin of the membranes composing autophagosomes is a long-standing mystery in the field of autophagy. a major difficulty in addressing this question has been that phagophores as well as autophagosomes do not contain marker proteins of other subcellular compartments [55, 56] . a series of new studies has implicated several cellular organelles as the possible source for the autophagosomal lipid bilayers. the plasma membrane and elements of the trafficking machinery to the cell surface have been linked to the formation of an early autophagosomal intermediate, perhaps the phagophore [57] [58] [59] [60] [61] . it is possible that early endosomal- and/or golgi-derived membranes are also key factors in the initial steps of autophagy [34, 36, 39] . 
the golgi, moreover, also appears important for autophagy by supplying, at least in part, the extra lipids required for the phagophore expansion [29, [62] [63] [64] [65] . the endoplasmic reticulum (er) is also central in this latter event. while the relevance of the er in autophagosome biogenesis was already pointed out a long time ago [5, 55, 66, 67] , recently two electron tomography studies have demonstrated the existence of a physical connection between the er and the forming autophagosomes [68, 69] . these analyses have revealed that the er is connected to the outer as well as the inner membrane of the phagophore through points of contact, supporting the notion that lipids could be supplied via direct transfer at the sites of membrane contact. in line with this view, it has been found that atg14l is associated with the er and pi3p is generated on specific subdomains of this organelle from where autophagosomes emerge under autophagy-inducing conditions [25, 70] (figure 1 ). it has also been proposed that the outer membrane of the mitochondria is the main source of the autophagosomal lipid bilayers, but while the experimental evidence appears to show that mitochondria are essential for the phagophore expansion, it remains unclear whether these organelles play a key role in the phagophore biogenesis [71] . the discrepancies between the conclusions of the various studies have not yet allowed a model of the membrane dynamics during autophagosome biogenesis to be drawn. the different results could be due to the different experimental conditions and model systems used by the various laboratories. alternatively, the lipids forming the autophagosomes could have different sources depending on the cell and the conditions inducing autophagy [72, 73] . a third possibility is that the source of phagophore membrane could depend on the nature of the double-membrane vesicle cargo. additional investigations are required to shed light on these issues. 
despite the potential of curing a substantial range of specific pathological conditions by inducing autophagy, there are currently no small molecules that exclusively stimulate this pathway [74] . nevertheless, a variety of chemicals trigger this degradative process by acting on signaling cascades that also regulate autophagy. these agents fall into two distinct categories based on their mechanism of action: those that work through an mtor-dependent pathway (rapamycin or torin) and those that work through an mtor-independent pathway (e.g., lithium or resveratrol) [74] . in addition to these compounds, there are biological molecules such as interferon γ (ifnγ) and vitamin d that can be used to stimulate autophagy, especially in experimental setups [75, 76] . inhibition of autophagy can also be beneficial in specific diseases but, as for the inducers, there are no compounds that exclusively block this pathway without affecting other cellular processes. the small molecules inhibiting autophagy include wortmannin and 3-methyladenine, which hamper the activity of the pi3k, and bafilomycin a and chloroquine, which impair the degradative activity of lysosomes [77] . they are currently used solely in basic research on autophagy. it is becoming increasingly evident that autophagy is a highly selective quality control mechanism whose basal levels are important to maintain cellular homeostasis (see below). a number of organelles have been found to be selectively turned over by autophagy and cargo-specific names have been given to distinguish the various selective pathways, including the er (reticulophagy or er-phagy), peroxisomes (pexophagy), mitochondria (mitophagy), lipid droplets (lipophagy), secretory granules (zymophagy), and even parts of the nucleus (nucleophagy). moreover, pathogens (xenophagy), ribosomes (ribophagy), and aggregate-prone proteins (aggrephagy) are specifically targeted for degradation by autophagy [78] . 
selective types of autophagy perform a cellular quality control function and therefore they must be able to distinguish their substrates, such as protein aggregates or dysfunctional mitochondria, from their functional counterparts. the molecular mechanisms underlying cargo selection and regulation of selective types of autophagy are still largely unknown. this has been an area of intense research in recent years and the various selective types of autophagy are starting to be understood. a recent genome-wide small interfering rna screen aimed at identifying mammalian genes required for selective autophagy found 141 candidate genes to be required for viral autophagy and 96 of those genes were also required for parkin-mediated mitophagy [79] . in general, these pathways appear to rely upon specific cargo-recognizing autophagy receptors, which connect the cargo to the autophagic membranes. the autophagy receptors might also interact with specificity adaptors, which function as scaffolding proteins that bring the cargo-receptor complex in contact with the core atg machinery to allow the specific sequestration of the substrate. the selective types of autophagy appear to rely on the same molecular core machinery as non-selective (starvation-induced) bulk autophagy. in contrast, the autophagy receptors and specificity adaptors do not seem to be required for nonselective autophagy. autophagy receptors are defined as proteins being able to interact directly with both the autophagosome cargo and the atg8/lc3 family members through a specific (wxxl) sequence [80] , commonly referred to as the lc3-interacting region (lir) motif [81] or the lc3 recognition sequences (lrs) [82] . based on comparison of lir domains from more than 20 autophagy receptors it was found that the lir consensus motif is an eight amino acids long sequence that can be written d/e-d/e-d/e-w/f/y-x-x-l/i/v. 
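the consensus written above, d/e-d/e-d/e-w/f/y-x-x-l/i/v, lends itself to a simple sequence scan. the following is a minimal sketch that looks for the aromatic-x-x-hydrophobic core and counts acidic residues in the three positions upstream of it. the function name, the default of requiring at least one acidic residue, and the test sequence are assumptions made for illustration (the sequence is invented, not a real receptor), and a match is only a candidate, not evidence of lc3 binding.

```python
import re

# Core of the motif: W/F/Y, two arbitrary residues, then L/I/V.
# The lookahead wrapper lets overlapping candidates be reported.
LIR_CORE = re.compile(r"(?=([WFY]..[LIV]))")


def find_lir_candidates(sequence: str, require_acidic: bool = True) -> list:
    """Return (1-based start of core, matched window, acidic count) tuples.

    The window is the core plus up to three upstream residues; the acidic
    count is the number of D/E among those upstream residues.
    """
    seq = sequence.upper()
    hits = []
    for match in LIR_CORE.finditer(seq):
        start = match.start()
        upstream = seq[max(0, start - 3):start]
        n_acidic = sum(residue in "DE" for residue in upstream)
        if require_acidic and n_acidic == 0:
            continue  # assumption of this sketch: skip cores with no acidic residue upstream
        hits.append((start + 1, upstream + match.group(1), n_acidic))
    return hits


# Invented test sequence containing one acidic and one non-acidic candidate.
toy_sequence = "MSDDEWTELAAAKFGGVPP"
print(find_lir_candidates(toy_sequence))
print(find_lir_candidates(toy_sequence, require_acidic=False))
```

relaxing `require_acidic` returns every occurrence of the core, which is useful for seeing how much the upstream acidic residues narrow the candidate list.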
although not an absolute requirement, there is usually at least one acidic residue upstream of the w-site. the terminal l-site is occupied by a hydrophobic residue, either l, i, or v [83]. the lir motifs of several autophagy receptors have been found to interact with both lc3 and gabarap family members in vitro, but whether this reflects a physiological interaction remains to be clarified in most cases. it should be pointed out that not all lir-containing proteins are autophagy cargo receptors. some lir-containing proteins, like atg3 and atg4b, are recruited to autophagic membranes to perform their function in autophagosome formation [84, 85], whereas others, like fyve and coiled-coil domain-containing protein 1 (fyco1), interact with lc3 to facilitate autophagosome transport and maturation [86]. still others, like dishevelled, an adaptor protein in the wnt signalling pathway, might use an lir motif to become degraded [87]. the adaptor proteins are less well described, but seem to interact with autophagy receptors and work as scaffold proteins recruiting and assembling the atg machinery required to generate autophagosomes around the cargo targeted for degradation. examples of autophagy adaptors are atg11 and alfy [88, 89]. the list of specific autophagy receptors is rapidly growing and the role of several of them in different types of selective autophagy will be described in detail in the reviews of this special issue. here we will briefly discuss the best studied form of selective autophagy, the yeast cytoplasm to vacuole targeting (cvt) pathway, as well as the best studied mammalian autophagy receptor, p62/sequestosome 1 (sqstm1) (figure 2). the cvt pathway is a biosynthetic process mediating the transport of the three vacuolar hydrolases, aminopeptidase 1 (ape1), aminopeptidase 4 (ape4), and α-mannosidase (ams1), and the ty1 transposome into the vacuole [90, 91].
ape1 is synthesized as a cytosolic precursor (prape1), which multimerizes into the higher order ape1 oligomer, to which ape4, ams1, and ty1 associate to form the so-called cvt complex, prior to being sequestered into a small autophagosome-like cvt vesicle. sequestration of the cvt complex into cvt vesicles is a multistep process that requires the autophagy receptor atg19, which facilitates binding to atg8 at the pas, as well as the adaptor protein atg11 (figure 2(a)) [92]. atg11 acts as a scaffold protein by directing the translocation of the cvt complex and atg9 reservoirs to the pas in an actin-dependent way and then recruiting the atg1/ulk complex [40, 93]. the pi3p-binding proteins atg20, atg21, and atg24 are also required for the cvt pathway [94, 95], but their precise function remains to be elucidated. interestingly, atg11 overexpression was found to recruit more atg8 and atg9 to the pas, resulting in more cvt vesicles. this observation indicates that atg11 levels could regulate the rate of selective autophagy, and maybe also the size of the cargo-containing autophagosomes in yeast [90, 96]. indeed, a series of studies has revealed that atg11 is also involved in other types of selective autophagy such as mitophagy and pexophagy. however, the autophagy receptors involved in the different atg11-dependent types of selective autophagy are different, as atg32 is required for mitophagy [97, 98], whereas atg30 is essential for pexophagy [99]. like atg19, these two proteins have an atg8-binding lir motif and directly interact with atg11. mammalian cells appear not to possess an atg11 homologue, and further studies are necessary to delineate the molecular machinery involved in sequestration and targeting of different cargoes for degradation by autophagy in higher eukaryotes.

figure 2: representative selective autophagy. (a) the cytoplasm-to-vacuole targeting (cvt) pathway. ape1 is synthesized as a cytoplasmic precursor protein with a propeptide and rapidly oligomerizes into dodecamers that subsequently associate with each other to form a higher order complex. the autophagy receptor atg19 directly binds to the complex and mediates the recruitment of another cvt pathway cargo, ams1, leading to the formation of the so-called cvt complex. atg19 also interacts with the autophagy adaptor atg11 and this protein allows the transport of the cvt complex to the site where the double-membrane vesicle will be generated. at this location, atg11 tethers the atg proteins essential for cvt vesicle formation, and the direct binding of atg19 to atg8 permits the exclusive sequestration of the cvt complex into the vesicle. (b) a model for p62 and nbr1 as autophagy receptors for ubiquitinated cargos. p62 and nbr1 bind to ubiquitinated cargos via their ubiquitin-associated (uba) domain and this interaction triggers aggregate formation through the oligomerization of p62 via its phox and bem1p (pb1) domain. furthermore, p62 interacts both with autophagy-linked fyve protein (alfy), which serves to recruit atg5 and to bind pi3p, and directly with lc3. this latter event appears to organize and activate the atg machinery in close proximity to the ubiquitinated cargos, which allows them to be selectively sequestered into autophagosomes, analogous to the cvt pathway.

the mechanism of the cvt pathway is reminiscent of the selective form of mammalian autophagy called aggrephagy, which involves degradation of misfolded and unwanted proteins by packing them into ubiquitinated aggregates. in both cases aggregation of the substrate (prape1 or misfolded proteins) is required prior to sequestration into cvt vesicles or autophagosomes, respectively [100] [101] [102]. similar to cvt vesicles, aggregate-containing autophagosomes appear to be largely devoid of cytosolic components, suggesting that the vesicle membrane expands tightly around its cargo [88].
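the yeast receptor and adaptor relationships described above for the cvt pathway, mitophagy, and pexophagy can be summarised as a small lookup table. this is a bookkeeping sketch only: it restates the pairings given in the text, and the dictionary layout and the helper names `receptor_for` and `pathways_sharing_adaptor` are illustrative assumptions.

```python
# yeast selective pathways as described in the text:
# each uses its own receptor but the same adaptor, atg11.
YEAST_SELECTIVE_PATHWAYS = {
    "cvt": {"cargo": "cvt complex (prape1 oligomer)", "receptor": "atg19", "adaptor": "atg11"},
    "mitophagy": {"cargo": "mitochondria", "receptor": "atg32", "adaptor": "atg11"},
    "pexophagy": {"cargo": "peroxisomes", "receptor": "atg30", "adaptor": "atg11"},
}


def receptor_for(pathway):
    """Return the receptor listed for a pathway, or None if it is not in the table."""
    entry = YEAST_SELECTIVE_PATHWAYS.get(pathway)
    return entry["receptor"] if entry else None


def pathways_sharing_adaptor(adaptor):
    """Return the sorted names of all pathways that use the given adaptor."""
    return sorted(
        name for name, entry in YEAST_SELECTIVE_PATHWAYS.items()
        if entry["adaptor"] == adaptor
    )


print(receptor_for("mitophagy"))
print(pathways_sharing_adaptor("atg11"))
```

keeping the receptor and the adaptor as separate fields mirrors the distinction drawn in the text: cargo specificity resides in the receptor, whereas the adaptor is shared between pathways.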
aggrephagy also depends on proteins with exclusive functions in substrate selection and targeting [81, 88, 100, 103]. the autophagy receptors p62 and neighbour of brca1 gene (nbr1) bind both to ubiquitinated protein aggregates through a ubiquitin-associated (uba) domain and to lc3 via their lir motifs and thereby promote the specific autophagic degradation of ubiquitinated proteins (figure 2(b)) [81, 82, 100, 103, 104]. nbr1 and p62 also contain an n-terminal phox and bem1p (pb1) domain through which they can oligomerize or interact with other pb1-containing binding partners [83]. in addition to being a cargo receptor for protein aggregates, p62 has been implicated in autophagic degradation of other ubiquitinated substrates such as intracellular bacteria [105], viral capsid proteins [106], the midbody remnant formed after cytokinesis [107], peroxisomes [108, 109], damaged mitochondria [110, 111], and bactericidal precursor proteins [112]. the pb1 domain was recently found to be required for p62 to localize to the autophagosome formation site adjacent to the er [113], suggesting that it could target ubiquitinated cargo to the site of autophagosome formation or alternatively promote the assembly of the atg machinery at this location. the large scaffolding protein autophagy-linked fyve (alfy) appears to have a function similar to that of the specificity adaptor atg11. alfy is recruited to aggregate-prone proteins through its interaction with p62 [101], and through a direct interaction with atg5 and pi3p it serves to recruit the core atg machinery and allow formation of autophagic membranes around the protein aggregate [88] (figure 2(b)). interestingly, alfy is recruited from the nucleus to cytoplasmic ubiquitin-positive structures upon cell stress, suggesting that it might regulate the level of aggrephagy [114].
in line with this, it was found that overexpression of alfy in mouse and fly models of huntington's disease reduced the number of protein inclusions [88]. it will be interesting to determine whether alfy, like p62, is involved in other selective types of autophagy such as the ones eliminating midbody ring structures or mitochondria. it is well known that posttranslational modifications like phosphorylation and ubiquitination are involved in the regulation of the activity of proteins involved in autophagy and in the degradation of autophagic cargo proteins, respectively. however, little is known about how these modifications may regulate selective autophagy. the fact that the core atg machinery is required for both nonselective and selective types of autophagy gives rise to the question of whether these two types of autophagy may compete for the same molecular machinery. such a competition could be detrimental for cells undergoing starvation and, to avoid this, there might be a tight regulation of the expression level and/or activity of the proteins specifically involved in selective autophagy. it has recently been proposed that phosphorylation of autophagy receptors might be a general mechanism for the regulation of selective autophagy. dikic and coworkers noted that several autophagy receptors contain conserved serine residues adjacent to their lir motifs and, indeed, the tank binding kinase 1 (tbk1) was found to phosphorylate a serine residue close to the lir motif of the autophagy receptor optineurin. this modification enhances the lc3 binding affinity of optineurin and promotes selective autophagy of ubiquitinated cytosolic salmonella enterica [115]. in yeast, phosphorylation of atg32, the autophagy receptor for mitophagy, by mitogen-activated protein kinases was found to be required for mitophagy [116, 117].
the atg8/lc3 proteins themselves have also been found to become phosphorylated, and recent work has identified specific phosphorylation sites for protein kinase a (pka) [118] and protein kinase c (pkc) [119] in the n-terminal region of lc3. interestingly, the n-terminus of lc3 is involved in the binding of lc3 to lir-containing proteins [120]. it is therefore tempting to speculate that phosphorylation of the pka and pkc sites might facilitate or prevent the interaction of lc3 with lir-containing proteins such as p62. it has been found that phosphorylation of the pka site, which is conserved in all mammalian lc3 isoforms but not in gabarap, inhibits recruitment of lc3 into autophagosomes [118]. the role of ubiquitin in autophagy has so far been described as that of a signal for cargo degradation. ubiquitination of aggregate-prone proteins, as well as bacteria and mitochondria, has been found to serve as a signal for recognition by autophagy receptors like p62 and nbr1, which are themselves also degraded together with the cargo that they associate with [83]. the in vivo specificity of p62 and nbr1 toward ubiquitin signals remains to be established under the different physiological conditions. interestingly, it was recently found that casein kinase 2 (ck2)-mediated phosphorylation of the p62 uba domain increases the binding affinity of this domain for polyubiquitin chains, leading to more efficient targeting of polyubiquitinated proteins to autophagy [121]. ck2 overexpression or phosphatase inhibition reduced the formation of aggregates containing the polyglutamine-expanded huntingtin exon1 fragments in a p62-dependent manner. the e3 ligases involved in ubiquitination of different autophagic cargo largely remain to be identified. however, it is known that the e3 ligases parkin and rnf185 both regulate mitophagy [122, 123].
smurf1 (smad-specific e3 ubiquitin protein ligase 1) was recently also implicated in mitophagy, as well as in autophagic targeting of viral particles [79]. interestingly, the role of smurf1 in selective autophagy seems to be independent of its e3 ligase activity and rather depends on its membrane-targeting c2 domain, although the exact mechanism involved remains to be elucidated. it is also not clear whether ubiquitination could serve as a signal to regulate the activity or binding selectivity of proteins directly involved in autophagy, and whether this in some way could regulate selective autophagy. the role of ubiquitin-like proteins such as sumo and nedd in autophagy is also unexplored. acetylation is another posttranslational modification that has only recently been implicated in selective autophagy. the histone de-acetylase 6 (hdac6), initially found to mediate transport of misfolded proteins to the aggresome [124], was later implicated in maturation of ubiquitin-positive autophagosomes [125]. the fact that hdac6 overproduction in fly eyes expressing expanded polyq proteins is neuroprotective further indicates that hdac6 activity stimulates aggrephagy [126]. furthermore, the acetylation of an aggrephagy cargo protein, mutant huntingtin, the protein causing huntington's disease, is important for its degradation by autophagy [127]. hdac6 has also been implicated in parkin-mediated clearance of damaged mitochondria [128]. the acetyl transferase(s) involved in these forms of selective autophagy are currently unknown, but understanding the role of acetylation in relation to various aspects of autophagy is an emerging field and it will very likely provide more mechanistic insights into these pathways. basal autophagy acts as the quality control pathway for cytoplasmic components and it is crucial to maintain the homeostasis of various postmitotic cells [129].
while this quality control could be partially achieved by nonselective autophagy, growing lines of evidence have demonstrated that specific proteins, organelles, and invading bacteria are specifically degraded by autophagy (figure 3). mice deficient in autophagy die either in utero (e.g., beclin 1 and fip200 knockout mice) [130] [131] [132] or within 24 hours after birth due, at least in part, to a deficiency in the mobilization of amino acids from various tissues (e.g., atg3, atg5, atg7, atg9, and atg16l knockout mice) [49, [133] [134] [135] [136]. as a result, to investigate the physiological roles of autophagy, conditional knockout mice for atg5, atg7, or fip200 and various tissue-specific atg knockout mice have been established and analyzed [133, 137, 138]. for example, the liver-specific atg7-deficient mouse displayed severe hepatomegaly accompanied by hepatocyte hypertrophy, resulting in severe liver injuries [133]. mice lacking atg5, atg7, or fip200 in the central nervous system exhibited behavioral deficits, such as abnormal limb-clasping reflexes and reduction of coordinated movement, as well as massive neuronal loss in the cerebral and cerebellar cortices [137] [138] [139]. loss of atg5 in cardiac muscle caused cardiac hypertrophy, left ventricular dilatation, and systolic dysfunction [140]. skeletal muscle-specific atg5 or atg7 knockout mice showed age-dependent muscle atrophy [141, 142]. pancreatic β cell-specific atg7 knockout animals exhibited degeneration of islets and impaired glucose tolerance with reduced insulin secretion [143, 144]. podocyte-specific deletion of atg5 caused glomerulosclerosis in aging mice and these animals displayed increased susceptibility to proteinuric diseases caused by puromycin aminonucleoside and adriamycin [145]. proximal tubule-specific atg5 knockout mice were susceptible to ischemia-reperfusion injury [146].
finally, deletion of atg7 in bronchial epithelial cells resulted in hyperresponsiveness to cholinergic stimuli [147]. taken together, these results indicate that basal autophagy prevents numerous life-threatening diseases. how does impairment of autophagy lead to diseases? ultrastructural analyses of the mutant mice revealed a marked accumulation of swollen and deformed mitochondria in the mutant hepatocytes [133], pancreatic β cells [143, 144], cardiac and skeletal myocytes [140, 141], and neurons [138], but also the appearance of concentric membranous structures consisting of er or sarcoplasmic reticulum in hepatocytes [133], neuronal axons [137, 139], and skeletal myocytes [141], as well as an increased number of peroxisomes and lipid droplets in hepatocytes [133, 148]. in addition to the accumulation of aberrant organelles, histological analyses of tissues with defective autophagy showed the amassment of polyubiquitylated proteins in almost all tissues (although the level varied from one region to another), forming inclusion bodies whose size and number increased with aging [149]. consequently, basal autophagy also acts as the quality control machinery for cytoplasmic organelles (figure 3(a)). although this could be partially achieved by bulk autophagy, these observations point to the existence of selective types of autophagy, a notion that is now supported by experimental data. p62/sqstm1 is the best-characterized disease-related autophagy receptor and a ubiquitously expressed cellular protein conserved among metazoans but not in plants and fungi [83]. besides its role as a receptor, p62 is itself a specific substrate for autophagy. suppression of autophagy is usually accompanied by an accumulation of p62, mostly in large aggregates that are also positive for ubiquitin (figure 3(a)) [104, 150].
ubiquitin- and p62-positive inclusion bodies have been detected in numerous neurodegenerative diseases (e.g., alzheimer's disease, parkinson's disease, and amyotrophic lateral sclerosis), liver disorders (e.g., alcoholic hepatitis and steatohepatitis), and cancers (e.g., malignant glioma and hepatocellular carcinoma) [151]. very interestingly, the p62-positive aggregates observed in hepatocytes and neurons of liver- and brain-specific atg7-deficient mice, respectively, as well as in human hepatocellular carcinoma cells, are completely dispersed by the additional loss of p62, strongly implicating p62 in the formation of disease-related inclusion bodies [104, 152]. through its self-oligomerization, p62 is involved in several signal transduction pathways. for example, this protein functions as a signaling hub that may determine whether cells survive by activating the traf6-nf-κb pathway, or die by facilitating the aggregation of caspase 8 and the downstream effector caspases [153, 154]. on the other hand, p62 interacts with the nrf2-binding site on keap1, a component of the cullin 3-type ubiquitin ligase for nrf2, resulting in stabilization of nrf2 and transcriptional activation of nrf2 target genes including a battery of antioxidant proteins [155] [156] [157] [158] [159]. it is thus plausible that excess accumulation or mutation of p62 leads to hyperactivation of these signaling pathways, resulting in disease onset (figure 3(b)). paget's disease of bone is a chronic and metabolic bone disorder that is characterized by an increased bone turnover within discrete lesions throughout the skeleton. mutations in the p62 gene, in particular in its uba domain, can cause this illness [160].
a proposed model explaining how p62 mutations lead to paget's disease of bone is the following: mutations of the uba domain cause an impairment in the interaction between p62 and ubiquitinated traf6 and/or cyld, an enzyme deubiquitinating traf6, which in turn enhances the activation of the nf-κb signaling pathway and the resulting increased osteoclastogenesis (figure 3(b)) [160]. if proven, this molecular scenario could open the possibility of using autophagy enhancers as a therapy to cure paget's disease of bone. it is established that autophagy has a tumor-suppressor role and several autophagy gene products including beclin1 and uvrag are known to function as tumor suppressor proteins [161]. the tumor-suppressor role of autophagy appears to be particularly important in the liver. spontaneous tumorigenesis is observed in the livers of mice with either a systemic mosaic deletion of atg5 or a hepatocyte-specific atg7 deletion [152, 162]. importantly, no tumors are formed in other organs in atg5 mosaically deleted mice. enlarged mitochondria, whose functions are at least partially impaired, accumulate in atg5- or atg7-deficient hepatocytes [152, 162]. this observation is in line with the previous data obtained in ibmk cell lines showing that both the oxidative stress and genomic damage responses are activated by loss of autophagy [163, 164]. again, it is clear that accumulation of p62 contributes, at least partially, to tumor growth because the size of the atg7 -/- liver tumors is reduced by the additional deletion of p62 [162]; p62 accumulation may cause a dysregulation of nf-κb signaling [165] and/or a persistent activation of nrf2 [166]. almost all tissues with defective autophagy usually display an accumulation of polyubiquitinated proteins [149]. loss of autophagy is considered to lead to a delay in the global turnover of cytoplasmic components [137] and/or to an impaired degradation of substrates destined for the proteasome [167].
both observations could partially explain the accumulation of misfolded and/or unfolded proteins that is followed by the formation of inclusion bodies. as discussed above, p62 and nbr1 act as autophagy receptors for ubiquitinated cargos such as protein aggregates, mitochondria, midbody rings, bacteria, ribosomal proteins, and virus capsids [83, 168] (figure 3). although these studies suggest a role for p62 as a ubiquitin receptor, it remains to be established whether soluble ubiquitinated proteins are also degraded one by one by p62 and possibly nbr1. a mass spectrometric analysis has clearly demonstrated the accumulation of all detectable topologies of ubiquitin chain in atg-deficient livers and brains, indicating that a specific polyubiquitin chain linkage is not the decisive signal for autophagic degradation [169]. because the increase in ubiquitin conjugates in the atg7-deficient liver and brain is completely suppressed by additional knockout of either p62 or nrf2 [169], accumulation of ubiquitinated proteins in tissues defective in autophagy might be attributed to p62-mediated activation of nrf2, resulting in global transcriptional changes to ubiquitin-associated genes. further studies are needed to precisely elucidate the degradation mechanism of soluble ubiquitinated proteins by autophagy. concomitant with energy production through oxidative phosphorylation, mitochondria also generate reactive oxygen species (ros), which cause damage through the oxidation of proteins, lipids, and dna, often inducing cell death. therefore, the quality control of mitochondria is essential to maintain cellular homeostasis and this process appears to be achieved via autophagy. it has been postulated that mitophagy contributes to differentiation and development by participating in the intracellular remodelling that occurs, for example, during haematopoiesis and adipogenesis.
in mammalian red blood cells, the expulsion of the nucleus followed by the removal of other organelles, such as mitochondria, are necessary differentiation steps. nix/bnip3l, an autophagy receptor whose structure resembles that of atg32, is also an outer mitochondrial membrane protein that interacts with gabarap [170, 171] and plays an important role in mitophagy during erythroid differentiation [172, 173] (figure 3(c)). although autophagosome formation probably still occurs in nix/bnip3l-deficient reticulocytes, mitochondrial elimination is severely impaired. consequently, mutant reticulocytes are exposed to increased levels of ros and die, and nix/bnip3l knockout mice suffer severe anemia. depolarization of the mitochondrial membrane potential of mutant reticulocytes by treatment with an uncoupling agent results in restoration of mitophagy [172], emphasizing the importance of nix/bnip3l for the mitochondrial depolarization and implying that mitophagy targets uncoupled mitochondria. haematopoietic-specific atg7 knockout mice also exhibited severe anaemia as well as lymphopenia, and the mutant erythrocytes markedly accumulated degenerated mitochondria but not other organelles [174]. the mitochondrial content is regulated during the development of t cells as well; that is, the high mitochondrial content in thymocytes is shifted to a low mitochondrial content in mature t cells. atg5- or atg7-deleted t cells fail to reduce their mitochondrial content, resulting in increased ros production as well as an imbalance in pro- and antiapoptotic protein expression [175] [176] [177]. taken together, this evidence demonstrates the essential role of mitophagy in haematopoiesis. recent studies have described the molecular mechanism by which damaged mitochondria are selectively targeted for autophagy, and have suggested that a defect in this mechanism is implicated in familial parkinson's disease (pd) [178] (figure 3(c)).
pink1, a mitochondrial kinase, and parkin, an e3 ubiquitin ligase, have been genetically linked to both pd and a pathway that prevents progressive mitochondrial damage and dysfunction. when mitochondria are damaged and depolarized, pink1 becomes stabilized and recruits parkin to the damaged mitochondria [122, [179] [180] [181]. various mitochondrial outer membrane proteins are ubiquitinated by parkin and mitophagy is then induced. of note, pd-related mutations in pink1 and parkin impair mitophagy [122, [179] [180] [181], suggesting that there is a link between defective mitophagy and pd. how these ubiquitinated mitochondria are recognized by the autophagosome remains unknown. although p62 has been implicated in the recognition of ubiquitinated mitochondria, elimination of the mitochondria occurs normally in p62-deficient cells [182, 183]. when specific bacteria invade host cells through endocytosis/phagocytosis, a selective type of autophagy termed xenophagy engulfs them to restrict their growth [184] (figure 3(d)). although neither the target proteins nor the e3 ligases have yet been identified, invading bacteria such as salmonella enterica, listeria monocytogenes, or shigella flexneri become positive for ubiquitin when they access the cytosol by rupturing the endosome/phagosome limiting membrane [185, 186]. these findings raise the possibility that ubiquitin also serves as a tag during xenophagy. in fact, to date, three proteins, p62 [105, 185, 187], ndp52 [188], and optineurin [115], have been proposed to be autophagy receptors linking ubiquitinated bacteria and lc3. a ubiquitin-independent mechanism has recently been revealed; recognition of a shigella mutant that lacks the icsb gene requires the tectonin domain-containing protein 1 (tecpr1), which appears to be a new type of autophagy adaptor targeting shigella to atg5- and wipi-2-positive membranes [189].
interestingly, the shigella icsb normally prevents autophagic sequestration of this bacterium by inhibiting the interaction of shigella virg with atg5, indicating that some bacteria have developed mechanisms to inhibit or subvert autophagy to their advantage [190]. this latter category of pathogens also includes viruses such as herpes simplex virus-1 (hsv-1), which expresses an inhibitor (icp34.5) of atg6/beclin1 [106]. however, it was recently shown that a mutant hsv-1 strain lacking icp34.5 becomes degraded by selective autophagy in a smurf1-dependent manner [79], suggesting that selective autophagy plays an important role in our immune system. recently, a different antimicrobial function has been assigned to autophagy and this function appears to be selective. during infection, ribosomal protein precursors are transported by autophagy in a p62-dependent manner into lysosomes [112]. these ribosomal protein precursors are subsequently processed by lysosomal proteases into small antimicrobial peptides. importantly, it has been shown that induction of autophagy during a mycobacterium tuberculosis infection leads to the fusion between phagosomes containing this bacterium and autophagosomes, and the production of the antimicrobial peptides in this compartment kills m. tuberculosis [112]. while the molecular mechanism is largely unknown, autophagy contributes at least partially to the supply of free fatty acids in response to fasting (figure 3(e)). fasting provokes an increase in the levels of free fatty acids circulating in the blood, which are mobilized from adipose tissues. these free fatty acids are rapidly captured by various organs, including the liver, where hepatocytes transform them into triglycerides by esterification within lipid droplets. these lipid droplets appear to be turned over by a selective type of autophagy that has been named lipophagy in order to provide endogenous free fatty acids for energy production through β-oxidation [148].
indeed, liver-specific atg7-deficient mice display massive accumulation of triglycerides and cholesterol in the form of lipid droplets [191]. agouti-related peptide (agrp)-expressing neurons also respond to increased circulating levels of free fatty acids after fasting and then induce autophagy to degrade the lipid droplets [192]. similar to the case in hepatocytes, autophagy in the neurons supplies endogenous free fatty acids for energy production and seems to be necessary for gene expression of agrp, which is a neuropeptide that increases appetite and decreases metabolism and energy expenditure [192]. originally, it was assumed that autophagy was exclusively a bulk process. recent experimental evidence has demonstrated that, through the use of autophagy receptors and adaptors, this pathway can be selective by exclusively degrading specific cellular constituents. the list of physiological and pathological situations where autophagy is selective is constantly growing, and this fact challenges the earliest concept and raises the question of whether autophagy can be nonselective. it is believed that under starvation, cytoplasmic structures are randomly engulfed by autophagosomes and delivered into the lysosome to be degraded and thus generate an internal pool of nutrients. in the yeast saccharomyces cerevisiae, however, the degradation of ribosomes (that is, ribophagy), as well as mitophagy and pexophagy, and the transport of the prape1 oligomer into the vacuole under the same conditions require the presence of autophagy receptors [97, [193] [194] [195]. these observations suggest that autophagy could potentially always operate selectively. this is a conceivable hypothesis because this process allows the cell to survive stress conditions, and the random elimination of cytoplasmic structures in the same scenario could lead to the lethal depletion of an organelle crucial for cell survival.
future studies will certainly provide more molecular insights into the regulation and mechanism of the selective types of autophagy, and this information will also be important to determine whether bulk autophagy indeed exists.
abbreviations: agrp: agouti-related peptide; ampk: amp-activated protein kinase; alfy: autophagy-linked fyve protein; ams1: α-mannosidase 1; ape1: aminopeptidase 1; ape4: aminopeptidase 4; atg: autophagy-related gene; bnip3l: b-cell leukemia/lymphoma 2 (bcl-2)/adenovirus e1b interacting protein 3; ck2: casein kinase 2; cma: chaperone-mediated autophagy; cvt: cytoplasm to vacuole targeting; er: endoplasmic reticulum; fip200: focal adhesion kinase family interacting protein of 200 kd; fyco1: fyve and coiled-coil domain-containing protein 1; gabarap: gamma-aminobutyrate receptor-associated protein; gate-16: golgi-associated atpase enhancer of 16 kda; hdac6: histone de-acetylase 6; hops: homotypic fusion and protein sorting; hsv-1: herpes simplex virus-1; keap1: kelch-like ech-associated protein 1; lc3: microtubule-associated protein 1 (map1)-light chain 3; lir: lc3-interacting region; lrs: lc3 recognition sequences; nbr1: neighbour of brca1 gene; ndp52: nuclear dot protein (ndp) 52; nf-κb: nuclear factor κb; nix: nip-like protein x; nrf2: nf-e2 related factor 2; pas: phagophore assembly site; pb1: phox and bem1p; pe: phosphatidylethanolamine; pd: parkinson's disease; pi3k: phosphatidylinositol 3-kinase; pi3p: phosphatidylinositol 3-phosphate; pka: protein kinase a; pkc: protein kinase c; ros: reactive oxygen species; rubicon: run domain and cysteine-rich domain containing beclin 1-interacting protein; smurf1: smad-specific e3 ubiquitin protein ligase 1; sumo: small ubiquitin-like modifier; sqstm1: p62/sequestosome 1; tbk1: tank binding kinase 1; tecpr1: tectonin domain-containing protein 1; traf6: tumour necrosis factor receptor-associated factor 6; tor: target of rapamycin; tgn: trans-golgi network; uba: ubiquitin associated; ulk1: unc-51-like kinase 1; uvrag: uv-resistance associated gene; vps: vacuolar protein sorting.
chaperone-mediated autophagy in protein quality control chaperone-mediated autophagy: molecular mechanisms and physiological relevance microautophagy in mammalian cells: revisiting a 40-year-old conundrum microautophagy of cytosolic proteins by late endosomes seeing is believing: the impact of electron microscopy on autophagy research eaten alive: a history of macroautophagy dynamics and diversity in autophagy mechanisms: lessons from yeast characterization of autophagosome formation site by a hierarchical analysis of mammalian atg proteins hierarchy of atg proteins in pre-autophagosomal structure organization autophagosome formation: core machinery and adaptations toward unraveling membrane biogenesis in mammalian autophagy ulk1·atg13·fip200 complex mediates mtor signaling and is essential for autophagy nutrient-dependent mtorc1 association with the ulk1-atg13-fip200 complex required for autophagy ulk-atg13-fip200 complexes mediate mtor signaling to the autophagy machinery a novel, human atg13 binding protein, atg101, interacts with ulk1 and is essential for macroautophagy regulation mechanisms and signaling pathways of autophagy autophagy mediates the mitotic senescence transition tor-mediated induction of autophagy via an apg1 protein kinase complex an overview of the molecular mechanism of autophagy two distinct vps34 phosphatidylinositol 3-kinase complexes function in autophagy and carboxypeptidase y sorting in saccharomyces cerevisiae assortment of phosphatidylinositol 3-kinase complexes-atg14p directs association of complex i to the pre-autophagosomal structure in saccharomyces cerevisiae beclin 1 forms two distinct phosphatidylinositol 3-kinase complexes with mammalian atg14 and uvrag autophagic and tumour suppressor activity of a novel beclin1-binding protein uvrag identification of barkor as a mammalian autophagy-specific factor for beclin 1 and class iii phosphatidylinositol 3-kinase autophagy
requires endoplasmic reticulum targeting of the pi3-kinase complex via atg14l two beclin 1-binding proteins, atg14l and rubicon, reciprocally regulate autophagy at different stages amino terminus of the sars coronavirus protein 3a elicits strong, potentially protective humoral responses in infected patients bif-1 interacts with beclin 1 through uvrag and regulates autophagy and tumorigenesis bif-1 regulates atg9 trafficking by mediating the fission of golgi membranes during autophagy beclin1-binding uvrag targets the class c vps complex to coordinate autophagosome maturation and endocytic trafficking distinct regulation of autophagic activity by atg14l and rubicon associated with beclin 1-phosphatidylinositol-3-kinase complex the dynamic interaction of ambra1 with the dynein motor complex regulates mammalian autophagy ambra1 regulates autophagy and development of the nervous system an atg9-containing compartment that functions in the early steps of autophagosome biogenesis apg9p/cvt7p is an integral membrane protein required for transport vesicle formation in the cvt and autophagy pathways membrane delivery to the yeast autophagosome from the golgi-endosomal system the atg1-atg13 complex regulates atg9 and atg23 retrieval transport from the pre-autophagosomal structure the atg18-atg2 complex is recruited to autophagic membranes via phosphatidylinositol 3-phosphate and exerts an essential function starvation and ulk1-dependent cycling of mammalian atg9 between the tgn and endosomes recruitment of atg9 to the preautophagosomal structure by atg11 is essential for selective autophagy in budding yeast atg17 recruits atg9 to organize the pre-autophagosomal structure coordinated regulation of autophagy by p38alpha mapk through matg9 and p38ip evolution of atg1 function and regulation a ubiquitin-like system mediates protein lipidation a protein conjugation system essential for autophagy mammalian autophagy: core molecular machinery and signaling regulation the atg16l 
complex specifies the site of lc3 lipidation for membrane biogenesis in autophagy the atg12-atg5 conjugate has a novel e3-like activity for protein lipidation in autophagy the atg8 conjugation system is indispensable for proper development of autophagic isolation membranes in mice snare proteins are required for macroautophagy atg8, a ubiquitin-like protein required for autophagosome formation, mediates membrane tethering and hemifusion lc3 and gate-16 n termini mediate membrane fusion processes required for autophagosome biogenesis atg8: an autophagy-related ubiquitin-like protein family lc3 and gate-16/gabarap subfamilies are both essential yet act differently in autophagosome biogenesis studies on cellular autophagocytosis. the formation of autophagic vacuoles in the liver after glucagon administration purification and characterization of autophagosomes from rat hepatocytes ralb and the exocyst mediate the cellular starvation response by direct activation of autophagosome assembly quantitative analysis of autophagy-related protein stoichiometry by fluorescence microscopy autophagosome precursor maturation requires homotypic fusion molecular mechanisms and regulation of specific and nonspecific autophagy pathways in yeast plasma membrane contributes to the formation of pre-autophagosomal structures exit from the golgi is required for the expansion of the autophagosomal phagophore in yeast saccharomyces cerevisiae the golgi complex as a source for yeast autophagosomal membranes characterization of the isolation membranes and the limiting membranes of autophagosomes in rat hepatocytes by lectin cytochemistry the conserved oligomeric golgi complex is involved in double-membrane vesicle formation during autophagy studies on the mechanisms of autophagy: maturation of the autophagic vacuole studies on induced cellular autophagy. i.
electron microscopy of cells with in vivo labelled lysosomes a subdomain of the endoplasmic reticulum forms a cradle for autophagosome formation 3d tomography reveals connections between the phagophore and endoplasmic reticulum autophagosome formation from membrane compartments enriched in phosphatidylinositol 3-phosphate and dynamically connected to the endoplasmic reticulum mitochondria supply membranes for autophagosome biogenesis during starvation the emergence of autophagosomes the origin of the autophagosomal membrane chemical modulators of autophagy as biological probes and potential therapeutics autophagy and cytokines vitamin d, vitamin d receptor, and macroautophagy in inflammation and infection guidelines for the use and interpretation of assays for monitoring autophagy in higher eukaryotes how shall i eat thee image-based genome-wide sirna screen identifies selective autophagy factors structural basis of target recognition by atg8/lc3 during selective autophagy p62/sqstm1 binds directly to atg8/lc3 to facilitate degradation of ubiquitinated protein aggregates by autophagy structural basis for sorting mechanism of p62 in selective autophagy selective autophagy mediated by autophagic adapter proteins the structure of atg4b-lc3 complex reveals the mechanism of lc3 processing and delipidation during autophagy autophagy-related protein 8 (atg8) family interacting motif in atg3 mediates the atg3-atg8 interaction and is crucial for the cytoplasm-to-vacuole targeting pathway fyco1 is a rab7 effector that binds to lc3 and pi3p to mediate microtubule plus end-directed vesicle transport autophagy negatively regulates wnt signalling by promoting dishevelled degradation the selective macroautophagic degradation of aggregated proteins requires the pi3p-binding protein alfy mechanism of cargo selection in the cytoplasm to vacuole targeting pathway the cvt pathway as a model for selective autophagy selective autophagy regulates insertional mutagenesis by the ty1
retrotransposon in saccharomyces cerevisiae autophagy in organelle homeostasis: peroxisome turnover autophagy: molecular machinery for self-eating cooperative binding of the cytoplasm to vacuole targeting pathway proteins, cvt13 and cvt20, to phosphatidylinositol 3-phosphate at the pre-autophagosomal structure is required for selective autophagy atg21 is a phosphoinositide binding protein required for efficient lipidation and localization of atg8 during uptake of aminopeptidase i by selective autophagy peroxisome size provides insights into the function of autophagy-related proteins atg32 is a mitochondrial protein that confers selectivity during mitophagy mitochondria-anchored receptor atg32 mediates degradation of mitochondria via selective autophagy ppatg30 tags peroxisomes for turnover by selective autophagy p62/sqstm1 forms protein aggregates degraded by autophagy and has a protective effect on huntingtin-induced cell death p62/sqstm1 and alfy interact to facilitate the formation of p62 bodies/alis and their degradation by autophagy the propeptide of aminopeptidase 1 mediates aggregation and vesicle formation in the cytoplasm-to-vacuole targeting pathway a role for nbr1 in autophagosomal degradation of ubiquitinated substrates homeostatic levels of p62 control cytoplasmic inclusion body formation in autophagy-deficient mice the adaptor protein p62/sqstm1 targets invading bacteria to the autophagy pathway autophagy protects against sindbis virus infection of the central nervous system midbody ring disposal by autophagy is a post-abscission event of cytokinesis ubiquitin signals autophagic degradation of cytosolic proteins and peroxisomes peroxisomal dynamics nix is critical to two distinct phases of mitophagy, reactive oxygen species-mediated autophagy induction and parkin-ubiquitin-p62-mediated mitochondrial priming pink1/parkin-mediated mitophagy is dependent on vdac1 and p62/sqstm1 delivery of cytosolic components by autophagic adaptor protein p62 endows
autophagosomes with unique antimicrobial properties p62 targeting to the autophagosome formation site requires self-oligomerization but not lc3 binding alfy, a novel fyve-domain-containing protein associated with protein granules and autophagic membranes phosphorylation of the autophagy receptor optineurin restricts salmonella growth phosphorylation of serine 114 on atg32 mediates mitophagy two mapk-signaling pathways are required for mitophagy in saccharomyces cerevisiae regulation of the autophagy protein lc3 by phosphorylation protein kinase c inhibits autophagy and phosphorylates lc3 the n-terminus and phe52 residue of lc3 recruit p62/sqstm1 into autophagosomes serine 403 phosphorylation of p62/sqstm1 regulates selective autophagic clearance of ubiquitinated proteins parkin is recruited selectively to impaired mitochondria and promotes their autophagy rnf185, a novel mitochondrial ubiquitin e3 ligase, regulates autophagy through interaction with bnip1 the deacetylase hdac6 regulates aggresome formation and cell viability in response to misfolded protein stress hdac6 controls autophagosome maturation essential for ubiquitin-selective quality-control autophagy hdac6 rescues neurodegeneration and provides an essential link between autophagy and the ups acetylation targets mutant huntingtin to autophagosomes for degradation disease-causing mutations in parkin impair mitochondrial ubiquitination, aggregation, and hdac6-dependent mitophagy autophagy: renovation of cells and tissues role of fip200 in cardiac and liver development and its regulation of tnfα and tsc-mtor signaling pathways promotion of tumorigenesis by heterozygous disruption of the beclin 1 autophagy gene beclin 1, an autophagy gene essential for early embryonic development, is a haploinsufficient tumor suppressor impairment of starvation-induced and constitutive autophagy in atg7-deficient mice the role of autophagy during the early neonatal starvation period atg9a controls dsdna-driven dynamic
translocation of sting and the innate immune response loss of the autophagy protein atg16l1 enhances endotoxin-induced il-1β production suppression of basal autophagy in neural cells causes neurodegenerative disease in mice neural-specific deletion of fip200 leads to cerebellar degeneration caused by increased neuronal death and axon degeneration loss of autophagy in the central nervous system causes neurodegeneration in mice the role of autophagy in cardiomyocytes in the basal state and in response to hemodynamic stress autophagy is required to maintain muscle mass suppression of autophagy in skeletal muscle uncovers the accumulation of ubiquitinated proteins and their potential role in muscle damage in pompe disease autophagy is important in islet homeostasis and compensatory increase of beta cell mass in response to high-fat diet loss of autophagy diminishes pancreatic β cell mass and function with resultant hyperglycemia autophagy influences glomerular disease susceptibility and maintains podocyte homeostasis in aging mice autophagy protects the proximal tubule from degeneration and acute ischemic injury inducible disruption of autophagy in the lung causes airway hyper-responsiveness autophagy regulates lipid metabolism autophagy in mammalian development and differentiation ref(2)p, the drosophila melanogaster homologue of mammalian p62, is required for the formation of protein aggregates in adult brain p62 is a common component of cytoplasmic inclusions in protein aggregation diseases persistent activation of nrf2 through p62 in hepatocellular carcinoma cells cullin3-based polyubiquitination and p62-dependent aggregation of caspase-8 mediate extrinsic apoptosis signaling p62 at the crossroads of autophagy, apoptosis, and cancer physical and functional interaction of sequestosome 1 with keap1 regulates the keap1-nrf2 cell defense pathway keap1 facilitates p62-mediated ubiquitin aggregate clearance via autophagy p62/sqstm1 is a target gene for transcription
factor nrf2 and creates a positive feedback loop by inducing antioxidant response element-driven gene transcription the selective autophagy substrate p62 activates the stress responsive transcription factor nrf2 through inactivation of keap1 a noncanonical mechanism of nrf2 activation by autophagy deficiency: direct interaction between keap1 and p62 recent advances in understanding the molecular basis of paget disease of bone autophagy regulation in cancer development and therapy autophagy-deficient mice develop multiple liver tumors autophagy mitigates metabolic stress and genome damage in mammary tumorigenesis autophagy suppresses tumor progression by limiting chromosomal instability autophagy suppresses tumorigenesis through elimination of p62 oncogene-induced nrf2 transcription promotes ros detoxification and tumorigenesis autophagy inhibition compromises degradation of ubiquitin-proteasome pathway substrates biogenesis and cargo selectivity of autophagosomes ubiquitin accumulation in autophagy-deficient mice is dependent on the nrf2-mediated stress response pathway: a potential role for protein aggregation in autophagic substrate selection nix is a selective autophagy receptor for mitochondrial clearance nix directly binds to gabarap: a possible crosstalk between apoptosis and autophagy essential role for nix in autophagic maturation of erythroid cells nix is required for programmed mitochondrial clearance during reticulocyte maturation the autophagy protein atg7 is essential for hematopoietic stem cell maintenance a critical role for the autophagy gene atg5 in t cell survival and proliferation autophagy is essential for mitochondrial clearance in mature t lymphocytes identification of atg5-dependent transcriptional changes and increases in mitochondrial mass in atg5-deficient t lymphocytes mechanisms of mitophagy pink1 stabilized by mitochondrial depolarization recruits parkin to damaged mitochondria and activates latent parkin for mitophagy pink1 is
selectively stabilized on impaired mitochondria to activate parkin pink1-dependent recruitment of parkin to mitochondria in mitophagy p62/sqstm1 is required for parkin-induced mitochondrial clustering but not mitophagy; vdac1 is dispensable for both p62/sqstm1 cooperates with parkin for perinuclear clustering of depolarized mitochondria ubiquitination-mediated autophagy against invading bacteria shigella phagocytic vacuolar membrane remnants participate in the cellular response to pathogen invasion and are regulated by autophagy recognition of bacteria in the cytosol of mammalian cells by the ubiquitin system listeria monocytogenes acta-mediated escape from autophagic recognition the tbk1 adaptor and autophagy receptor ndp52 restricts the proliferation of ubiquitin-coated bacteria a tecpr1-dependent selective autophagy pathway targets bacterial pathogens escape of intracellular shigella from autophagy chronic oxidative stress sensitizes hepatocytes to death from 4-hydroxynonenal by jnk/c-jun overactivation autophagy in hypothalamic agrp neurons regulates food intake and energy balance cvt9/gsa9 functions in sequestering selective cytosolic cargo destined for the vacuole mature ribosomes are selectively degraded upon starvation by an autophagy pathway requiring the ubp3p/bre5p ubiquitin protease cvt19 is a receptor for the cytoplasm-to-vacuole targeting pathway the authors thank shun kageyama (tokyo metropolitan institute of medical science) for helping in the creation of the figures used in the paper. f. 
reggiori key: cord-258784-9bdd9krr authors: wei, chiming; lyubchenko, yuri l.; ghandehari, hamid; hanes, justin; stebe, kathleen j.; mao, hai-quan; haynie, donald t.; tomalia, donald a.; foldvari, marianna; monteiro-riviere, nancy; simeonova, petia; nie, shuming; mori, hidezo; gilbert, susan p.; needham, david title: new technology and clinical applications of nanomedicine: highlights of the second annual meeting of the american academy of nanomedicine (part i) date: 2006-12-31 journal: nanomedicine: nanotechnology, biology and medicine doi: 10.1016/j.nano.2006.11.001 sha: doc_id: 258784 cord_uid: 9bdd9krr abstract the second annual meeting of the american academy of nanomedicine (aanm) was held at the national academy of science building in washington, dc, september 9–10, 2006. the program included two nobel prize laureate lectures, two keynote lectures, and 123 invited outstanding state-of-the-art lectures presented in 23 special concurrent symposia. in addition, there were 22 poster presentations at the meeting addressing different areas of nanomedicine research. all of the presenters at the meeting are outstanding investigators and researchers in the field. the second annual meeting of the aanm was a great success. the meeting provides investigators from around the world a forum and an opportunity for discussion. we believe that nanomedicine research will develop rapidly in the future. the aanm invites basic and clinical researchers from around the world to join this exciting research. the second annual meeting of the american academy of nanomedicine (aanm) was held at the national academy of science building in washington, dc, september 9-10, 2006. the opening ceremony was held on the morning of september 9, 2006. aanm president chiming wei, md, phd, welcomed all the attendees and extended his deep appreciation and thanks to all presenters and attendees for contributing to the meeting (figure 1). mr.
joe mok, chair of the governor's committee of maryland, gave a warm message welcoming everybody who came to maryland to attend this very important meeting. tachung yih, phd, the vice president of aanm, gave a thoughtful address encouraging attendees to continue their efforts in advancing nanomedicine research. donald a. tomalia, phd, the vice chair of the 2006 aanm organizing and program committee, summarized the aanm program and provided attendees with information about the presentations in the aanm symposia. james castracane, phd, member of the board of directors of aanm, summarized the aanm membership and fellowship development and future strategy. marianna foldvari, phd, the co-chair of the aanm international committee, summarized the aanm international collaborative research projects and international research activities and developments. during the presidential lecture, chiming wei, md, phd, of johns hopkins university, summarized newly published and unpublished findings in each nanomedicine research area, and also reviewed the new nanotechnologies and clinical applications in nanomedicine development. during the past year many important research results were published in nanomedicine research areas. the significance of these investigations lies in the development of platform technologies including nanoscale molecular imaging, drug delivery, gene delivery, and diagnostic approaches. dendrimer-based nanomedicine was developed for protein mimicry research, nanopharmaceuticals, diagnostic imaging with contrast agents, and targeted drug delivery in cancer cells. biosensors with nanoscale materials provide unique and powerful technology for detection of biological and chemical species to aid in disease diagnosis and in the discovery and screening of new drug molecules.
the development of cancer therapy and monitoring with diagnostic nanotechnology includes cancer-related genotyping detection, gene expression, and immunochemical analysis, as well as nearly real-time monitoring of patient blood for cancer cells. nanotechnologies were developed for clinical applications in cardiovascular, neurological, pulmonary, skin, and renal diseases. in the first nobel prize laureate lecture, "aquaporin water channels: from atomic structure to clinical medicine", peter agre, md (2003 nobel prize laureate for chemistry), from duke university school of medicine, summarized the aquaporin (aqp) water channel structure and applications to clinical medicine (figure 2, a). aqp water channel proteins permit high water permeability of certain biological membranes. in people with no aqp1, lack of water causes defective urine concentration and reduced fluid exchange between capillary and interstitium in lung. mutations in the gene encoding aqp0, expressed in lens fiber cells, result in familial cataracts. mutations in the gene encoding aqp2, expressed in renal collecting duct principal cells, result in nephrogenic diabetes insipidus. mistargeting of aqp5, normally expressed in the apical membranes of salivary and lacrimal gland acini, can occur in sjogren's syndrome. aquaporins also are implicated in brain edema and muscular dystrophy (aqp4), anhidrosis (aqp5), renal tubular acidosis (aqp6), conversion of glycerol to glucose during starvation (aqp7 and aqp9), and cystic fibrosis (several). dr. agre received the honor fellowship from the aanm (figure 2, b). in the second nobel prize laureate lecture, "nanotechnology, biology and business", ivar giaever, phd (1973 nobel prize laureate for physics), from applied biophysics, inc., indicated that nanotechnology holds much future promise to make things both cheaper and better (figure 3, a). in particular, dr.
giaever discussed a general immunology detector that utilizes small indium particles to detect antibodies. he also described a whole-cell biosensor using electrical fields to obtain information about the morphology of cells in tissue culture. finally, dr. giaever discussed means to bring nanotechnology to the market with many important examples and experiments. dr. giaever received the honor fellowship from the aanm (figure 3, b). in the keynote lecture, "electron cryomicroscopy of biological nanomachines", wah chiu, phd, from baylor medical college, demonstrated the applications of electron cryomicroscopy in determination of multiple molecular components and performance of specific biological functions (figure 4, a). image reconstruction methods can be used to reconstruct three-dimensional (3-d) structures from single nanomachine images at sub-nanometer resolution. mining of salient structural features within the molecular components of a large nanomachine is a daunting task. computational and visualization tools have been developed to extract features such as α-helices and β-sheets of protein subunits with a high degree of reliability. 3-d structure at sub-nanometer resolution can be combined with sequence-based structure prediction methods to derive pseudo-atomic models of molecular components of a nanomachine. dr. chiu received the presidential award from aanm (figure 4, b). there were 23 concurrent symposia in the second aanm annual meeting. each symposium focused on a different nanomedicine research area, covering topics from basic nanomedicine to clinical nanomedicine. following are the summaries for each symposium. because of the journal's page limitations we can only publish summaries for several symposia in this issue of nanomedicine: nanotechnology, biology, and medicine. the summaries for the remaining symposia will appear in the next issue.
this symposium focused on the nanoscale analysis of the protein misfolding and aggregation phenomena critically associated with such neurodegenerative disorders as alzheimer, parkinson, and huntington diseases, to mention a few [1, 2]. the field of medicine can be dramatically advanced by establishing a fundamental understanding of key events leading to the misfolding and self-aggregation of proteins involved in the various protein-folding pathologies. there were four speakers at the symposium. dr. yuri lyubchenko outlined approaches for detection and analysis of protein misfolded states. protein misfolding is a complex phenomenon that can be facilitated, impeded, or prevented by interactions with various intracellular metabolites and by interprotein interactions with the intracellular nanomachines that control protein folding. a fundamental understanding of molecular processes leading to misfolding and self-aggregation of proteins will provide critical information to help identify appropriate therapeutic routes to control these processes. protein misfolding is the very first link in this long chain of events eventually leading to neurodegeneration. therefore, the availability of methods capable of detecting the pathological protein conformations facilitates the development of novel tools for diagnosing and treating the diseases at very early stages of development. single-molecule force spectroscopy with the atomic force microscope was proposed to measure the strength of the interprotein interactions before aggregation [3, 4]. the capability of the atomic force microscope to manipulate its tip at the nanometer scale was used to develop the nanotweezers approach, which is capable of single-molecule selection of antibodies for a certain type of protein aggregate. the role of oxidative stress in protein misfolding and aggregation of α-synuclein was the topic of the talk by dr. jean-christophe rochet.
to investigate the link between complex i impairment and sequence-specific, post-translational modifications of α-synuclein, the protein was isolated from rotenone-treated pc12 cells and analyzed by tandem mass spectrometry [5]. it was found that rotenone induced various modifications in the c-terminal region of α-synuclein, including oxidation of methionine, nitration and amination of tyrosine, and phosphorylation of tyrosine and serine. these modifications correlated with an increase in the levels of membrane-bound, oligomeric α-synuclein detectable by fluorescence lifetime imaging microscopy and lipid flotation. in parallel, it was found that α-synuclein aggregation and toxicity were suppressed by dj-1, an antioxidant protein that is dysfunctional in some cases of familial parkinson disease. a current goal in nanomedicine is to determine how the conformational behavior of α-synuclein is influenced by sequence-specific modifications using single-molecule approaches. the presentation of dr. boris akhremitchev focused on analysis of thermodynamics of α-synuclein fragments critically involved in protein misfolding and aggregation. interactions between individual molecules were studied using a scanning force microscopy-based technique [6, 7]. the reported energy landscape parameters of interactions, including the barrier width and dissociation rate, were obtained, illustrating the possibility for direct measurements of the molecular-level parameters for investigating the molecular origin of the misfolding-based protein aggregation and effects of chemical agents that prevent or hinder the aggregation. dr. robert tycko reviewed the advances in high-resolution studies of fibrillated aggregates achieved with the use of solid-state nuclear magnetic resonance.
this technique is capable of providing site-specific constraints on the secondary, tertiary, and quaternary structures of amyloid fibrils, as demonstrated by recent works on fibrils formed by the full-length β-amyloid peptide associated with alzheimer disease [8] and fragments thereof [9]. the recent results focused on the following areas [10]: (1) development of structural models for β-amyloid fibrils grown under either agitated or quiescent conditions; (2) development of a structural model for fibrils formed by the 37-residue amylin peptide, which is associated with type 2 diabetes; (3) constraints on the molecular structures of fibrils formed by the ure2p and sup35 yeast prion proteins. dr. justin hanes from the johns hopkins university started the first session by describing a new biomaterial platform for targeted and intracellular drug delivery, the poly(ether-anhydrides). these polymers incorporate poly(ethylene glycol) (peg) into their backbone structure and are polymerized with one to several other monomers, such as sebacic acid (sa), that are also known to be safe in humans. the addition of peg to the polymer backbone allows classical poly(anhydrides) to be formulated into nanoparticles that are readily resuspended in liquids for injection or inhalation, or into dry powder microparticles that are easily aerosolized for inhalation. peg coats the surface of the particles naturally during particle preparation, allowing the linkage of targeting agents to the free end of peg (i.e., the end not attached to the rest of the polymer). highly selective targeting of poly(peg-sa) particles to inflamed blood vessels was demonstrated following the attachment of antibodies against molecules that are upregulated on the luminal side of inflamed endothelial cells in a collaboration with prof. doug goetz from the ohio university.
poly(peg-sa) nanoparticles (with transferrin attached to the free end of peg) were also able to penetrate targeted cancer cells and rapidly transit within the cell to the perinuclear region; these particles were more toxic to cancer cells at relevant doses than the free drug. dr. richard price from the university of virginia next discussed the use of ultrasound-assisted local permeation of blood vessels by microbubble destruction as a method to enhance the targeting of drugs, including those contained within nano- and microparticles. microbubbles (hollow microspheres with a shell composed of albumin) are currently used clinically as a contrast agent in diagnostic ultrasound. dr. price showed that ultrasound can be used at low frequencies to cause bubbles in targeted regions to explode (by cavitation), which causes a significant increase in the local permeability of the endothelium to drugs and even large particles. highly targeted delivery of polymer nanoparticles to muscle tissue was clearly demonstrated in an animal model. control over particle penetration depth was achieved by simply applying ultrasound without microbubbles, in which case particles penetrated only the endothelium and not into the interstitium, whereas the application of ultrasound with microbubbles led to particle delivery to both the endothelium and the interstitium. the results suggest that this technique may provide an alternative to molecular targeting for a variety of diseases. dr. lawrence tamarkin from cytimmune sciences, inc., next spoke about his company's colloidal gold-based system for targeted drug delivery to tumors (aurimune). tumor necrosis factor α (tnf-α) is bound to the surface of colloidal gold nanoparticles with an average diameter of 30 nm. the nanoparticles are pegylated to reduce clearance by the reticuloendothelial system; evidence was provided that liver and spleen uptake of the nanoparticles was minimal as compared with uptake by tumors in mice.
aurimune was administered to two dogs with naturally occurring tumors and to rabbits; in each case its administration caused fever but did not cause hypotension. hypotension has historically been dose limiting in the treatment of cancer with tnf-α. based on promising preclinical results, aurimune production was scaled up and certified under gmp; it is currently being studied in a national cancer institute-sponsored phase i trial in patients with advanced-stage cancer. dr. esther chang from georgetown university next provided provocative results with targeted cationic liposomes for both the treatment and improved imaging of various tumor types. noting that tumors need iron to grow and that they therefore typically express high levels of transferrin receptor on their surfaces, dr. chang's group uses an anti-transferrin receptor single-chain antibody fragment as the targeting ligand for their liposomal delivery platform. dr. chang gave an overview of her laboratory's extensive experience with these liposomal drug carriers in a variety of tumor types, including head and neck, breast, prostate, kidney, and pancreatic, demonstrating antitumor effectiveness of several drugs and drug types, including dna, small interfering rna, and small molecules. dr. chang reported that the technology is now entering clinical trials in the united states. finally, dr. chang also showed that contrast agents for magnetic resonance imaging could be encapsulated into the liposomes, leading to improved sensitivity and resolution in detecting metastatic lesions. dr. hamid ghandehari from the university of maryland, baltimore, capped the first drug delivery session with an overview of ways that nanoscale medicines might be improved by controlling particle structure using specialized chemistries and fabrication methods.
he noted that carriers used for the controlled temporal or spatial delivery of bioactive agents are typically made of polymers, liposomes, or inorganic particles with polydisperse size distributions that can make prediction of biodistribution or subcellular fate challenging. he also pointed out that most of the polymers being studied for drug delivery are made by random copolymerization techniques, which limit control over monomer sequence, polymer molecular weight, and molecular weight distribution, all of which may affect the performance of a given nanomedicine platform technology. dr. ghandehari then described several means by which polymers and nanosystems could be made with better definition of structure and dimensions, including his laboratory's experience with n-(2-hydroxypropyl)methacrylamide copolymers, poly(amido amine) dendrimers, recombinant polymeric gene carriers, and silica nanotubes, and discussed the relationship between carrier structure at the nanoscale and function in targeted drug delivery. the second session of the drug delivery nanomedicine symposium began with a talk by dr. kathleen j. stebe from the johns hopkins university on the importance of controlling interfacial phenomena in the production of nanoparticles to enhance particle stability and functionality. dr. stebe discussed how the interfacial properties of nanoparticles could be controlled during fabrication to influence the properties and performance of nanomedicine products. the final surface properties of nanoparticles dictate everything, from whether they will aggregate in vivo to how effectively they will circulate and find their target. understanding the complex kinetics and thermodynamics of nanoparticle systems during their synthesis and formulation is a critical and yet underappreciated science that can improve the performance of nanomedicine platforms. dr.
jimmy yun from singapore nanomaterials technology next discussed his company's efforts in developing an effective process for the large-scale production of pharmaceutical nanoparticles with excellent control over size, size distribution, and morphology by a high-gravity precipitation method. these formulation methods may be valuable in enhancing the performance of drugs used in aerosol and solid dosage forms. dr. tsuneya ikezu from the university of nebraska medical center discussed his group's findings that amyloid-β aggregation and tau-tubulin kinase-1 induce tau phosphorylation and reduce microtubule extension. this study has important implications for understanding a potential molecular mechanism underlying impaired neuronal plasticity in the brains of alzheimer patients and may provide a potential therapeutic target for treating alzheimer disease. dr. hai-quan mao from the johns hopkins university capped the second drug delivery session by discussing a new polyphosphoester block copolymer system that self-assembles into micelles with an average diameter of 80 to 150 nm. improved gene expression in the liver using this new polymer nanoparticle platform technology was demonstrated. polyphosphoester micelles have potential in treating liver-associated diseases by gene medicine approaches. the cell is the basic unit of animate matter, but the cell consists of an astonishing collection of different individual inanimate nanostructures and ordered aggregates of nanostructures (table 1). three of the five talks in the symposium had a basic biological bent; the others were more technological in character. one of the technological talks was from a start-up company that has spun out of a university; the others were from academia. dr. anja nohe (chemical and biological engineering, university of maine) spoke on mouse genetic models of bone stem cells and nanoscale receptor dynamics.
methods in imaging and microscopy, in particular fourier imaging correlation spectroscopy and fluorescence resonance energy transfer, allow assessment of receptor protein dynamics within microdomains of the plasma membrane. dr ari requicha (laboratory for molecular robotics, university of southern california) described how "instrumented" cellular systems could be developed with sensor/actuator networks and distributed robotics to open new directions in biomedical research. the eventual realization of multiple nanosensors providing continuous streams of real-time, multimodal, and complementary data on the behavior of groups of cells or single cells will increase human understanding of life processes at the tissue, cellular, and molecular levels. dr you han bae (pharmaceutics and pharmaceutical chemistry, university of utah) spoke on the use of ph-sensitive polymeric mixed micelles constructed from the block copolymers poly(l-histidine)-b-peg and poly(l-lactide)-b-peg for targeting tumors and overcoming multiple-drug resistance. the approach could result in a paradigm shift for treatment of solid tumors from monoclonal antibody/receptor-based targeting to ph shift-based targeting. dr don haynie (bionanosystems engineering laboratory, artificial cell technologies, inc.) explained how the same polypeptide multilayer nanofilm platform technology is being adapted to the development of different biomedical applications: artificial extracellular matrixes for tissue regeneration, artificial red blood cells for oxygen therapeutics, and artificial viruses for synthetic vaccine development. in each case peptide design and nanofilm architecture are tailored to the specific purpose, mimicking key properties of cells, viruses, and biomacromolecules at the microscale and the nanoscale. dr denis wirtz (chemical and biomolecular engineering, johns hopkins university) showed how to probe cytoskeletal dynamics in ovarian cancer progression with multiple particle tracking "nanorheology".
cells corresponding to the two known pathways of ovarian cancer progression vary distinctly in their cytoplasmic viscosity. further progress in medicine will require a deeper understanding of how cell, tissue, and system behavior depends on the properties of subcellular nanostructures and nanostructured components of molecular assemblies. in the pharmacological nanomedicine symposium, four talks presented the latest advancements in drug discovery and delivery. in the first talk, dr. jeanne hardy (university of massachusetts-amherst) presented a site-directed approach called "tethering" in which allosteric sites on key proteins are exploited for drug discovery. these sites are on the surface of proteins and can be redesigned to be controlled by a small molecule of choice. recent advances with the caspase family of cysteine proteases were presented, which revealed the pathway of inhibition for caspase regulation of apoptosis. dr. wafik s. el-deiry (university of pennsylvania school of medicine) presented novel optical imaging approaches to monitor the molecular events relevant to tumor progression and therapeutic response. the approach is to introduce genetic alterations into human cells along with bioluminescent reporter genes to determine the role of specific oncogenes and tumor suppressor genes in controlling tumor progression and therapeutic response. these molecular beacons are then used in high-throughput screens for small molecules that can modulate signaling leading to tumor cell death. other approaches included evaluation of bioluminescent tumors in animals for drug-target validation and therapeutic efficacy. the overall goal presented by dr. el-deiry is to develop new approaches in drug screening and target validation that accelerate the preclinical phase of anticancer drug development and bring new agents into clinical trials. dr. scott l.
diamond (university of pennsylvania), director of the penn center for molecular discovery, reported on their breakthrough in microarray design and screening where nanoliter-scale biochemical reactions can be analyzed by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry with 10-fmol sensitivity. dr. diamond described a multiplex screen across numerous proteases that identified a nanomolar inhibitor of human cathepsin l that was a potent inhibitor of severe acute respiratory syndrome coronavirus entry. dr. david needham (duke university) described the temperature-sensitive liposome approach for drug delivery that his group introduced in 1996 and is now in phase 1 the aanm is rapidly expanding into an international organization with membership from all over the world. symposium xvi was an exciting representation of the international scope of nanomedicine and bionanotechnology research. in addition to the united states, eight other countries' leading researchers gave an overview of the current state of research in nanomedicine in their respective countries. the following countries were represented: canada nanomedicine presents an explosive number of opportunities to improve people's lives. a great diversity of research areas is being developed worldwide, with the united states being the leader. in canada some of the important nanomedicine research areas include intelligent drug delivery systems; vaccine and gene delivery nanosystems; polymeric systems and devices; biochips; lab-on-a-chip; artificial cells; anti-cancer, immune-, neural-, and musculoskeletal systems; nanoscale targeting; tissue and cellular engineering; novel biomaterials; and molecular imaging. in hong kong dna electronics, microfluidic devices, metallic nanoparticles, nanofibers, and biopolymeric materials are being investigated for diagnostic and tissue engineering applications.
in india nanomedicine research includes drug delivery with dendrimer-based nanoarchitecture for the controlled and targeted delivery of anticancer bioactives; mitochondrial drug targeting; gene therapy of genetic disorders like diabetes and cancer using dendrimer-based nanocarriers; development of a dna vaccine against leishmania donovani; and stem cell research. korea is active in the design of intelligent microsystems and biomedical devices, whereas in japan imaging, drug delivery, and the application of nanoscience in medical treatments are already strong. the second international conference of america-japan nanomedicine society will be held in japan during april 2007. the nanotechnology program in russia has recently been launched and includes nanomedicine research across the country. in singapore the biomedical sciences are a leading area of nanotechnology and are supported by a strong economy. the panel discussion after the presentations concluded with the future vision of a true international effort in the development of nanomedicine-related technologies and discoveries, with regular updates to be presented at the annual aanm meetings. this symposium dealt with highlights relating to the toxicology of skin, eye, respiratory, and cardiovascular effects as well as the interpretation of polymer toxicology. dr. nancy monteiro-riviere discussed the effects of multiwalled carbon nanotubes, fullerenes, derivatized fullerenes, and quantum dots in human epidermal keratinocytes and their penetration through skin. some of these nanomaterials showed inflammatory effects, altered gene and protein expression, ultrastructural localization within skin cells, and irritation profiles. quantum dots of different sizes and surface chemistries showed unique profiles in penetration through the skin. dr. gerard lutty talked about the use of different types of injected nanoparticles to deliver genes in rabbit eyes.
chitosan, pcep, and magnetic nanoparticles did not show toxicity in vitro but demonstrated toxicity in vivo. all of these nanoparticles were capable of transfecting cells, with chitosan causing inflammation in the eyes. magnetic nanoparticles were the least toxic and can be targeted to transfect specific cell types in the eye. dr. randall schneider discussed the considerations that must be taken into account when using polymer nanoparticles for biomedical applications. he emphasized the complexity of nanoparticles and that the predictive models currently used may not be suitable for nanoparticles. therefore, it is important to study the behavior of different types of nanoparticles in the biological environment and to fully characterize these materials before full interpretations can be made. dr. petia simeonova discussed her research with single-walled carbon nanotubes instilled into the lungs of mice. she demonstrated a dose-dependent increase in oxidative vascular damage. her studies also demonstrated that single-walled carbon nanotubes did not modify the lipid profiles but did accelerate plaque formation in a mouse model of arteriosclerosis. these findings not only demonstrated lung toxicity but also showed cardiovascular effects. all of the studies presented in the toxicological nanomedicine symposium showed how nanomaterials may be toxic under specific conditions and that extensive studies on interactions of these materials in the body are essential before full interpretations can be made. nanotechnology product development cycles should incorporate an evaluation of potential risk reduction from the earliest stages. hidezo mori, md, phd summarized his group's national project in japan named nanolevel imaging of molecular structure and function. in this project mori and his group analyzed structures of fundamental proteins in human diseases.
they selected complexes of cardiac troponin and a calcium sensitizer, vascular apoptosis-inducing protein 1, and calcineurin homologous protein 2 as targets (figure 5). proteins expressed in escherichia coli or isolated from crude snake venom are purified and finally crystallized by the hanging-drop method. all the x-ray diffraction data were collected at spring-8 for the structural determination. such sub-nanolevel structural imaging will be useful for structure-based drug design to treat cardiovascular disease or cancer. shuming nie, phd talked about semiconductor quantum dots. when linked with biotargeting or biorecognition ligands such as monoclonal antibodies, peptides, or small molecules, these nanoparticles can be used to target tumor antigens (biomarkers) as well as tumor vasculatures with high affinity and specificity. in the "mesoscopic" size range of 10 to 100 nm (diameter), quantum dots and polymeric nanoparticles also have more surface area and functional groups that can be linked to multiple diagnostic (e.g., optical, radioisotopic, or magnetic) and therapeutic (e.g., anticancer) agents. kenneth wong, phd reported that topical delivery of silver nanoparticles reduces systemic inflammation of burns and promotes wound healing. his team investigated the wound healing properties of silver nanoparticles in an animal model and found rapid healing and better cosmetic appearance in a dose-dependent manner. furthermore, through quantitative polymerase chain reaction, immunohistochemistry, and proteomic studies the team showed that silver nanoparticles exerted positive effects through their antimicrobial properties, reduction in wound inflammation, and modulation of fibrogenic cytokines. these results have given an insight into the actions of silver and provided a novel therapeutic direction for wound treatment in clinical practice.
yoshinobu bana, phd described nanofabrication, molecular nanotechnology, and nanomaterials for bioanalysis and biomedical application. arrays of nanopillars, which are 200 nm wide and 4,000 nm tall, are successfully applicable to fast separation of dna within 10 to 25 seconds. nanoball, with a diameter of 50 nm, has been developed and successfully applied for fast separation of dna fragments from 100 base pairs to 15 kilobase pairs. bacterial cellulose, a biofiber with a diameter of 50 nm, has been found to be useful in highly sensitive detection of biomolecules based on the optical confined effect. the nanobiodevice is useful for fast separation of protein samples from several kilodaltons to 200 kda within 15 seconds. joe kao, phd, reported the possibility of controlling biology with light. a "phototrigger" or "caged molecule" is a biologically inactive but photosensitive precursor that is rapidly transformed into a fully bioactive molecule upon exposure to a flash of light. the bioactive molecule that is thus generated acts as a biochemical trigger; it can be a hormone, a neurotransmitter, a second messenger, or an enzyme modulator. therefore, in combination with focused light pulses, caged molecules represent a simple methodology for manipulating biology in living cells or tissues with excellent temporal and spatial resolution. specific applications of the technology to manipulate neurotransmission at single synapses and to stimulate single nerve terminals were discussed. this year there were nine finalists for the young investigator award (yia). the nine finalists each gave a 10-minute platform presentation on topics representing the most novel and exciting areas of nanomedicine. awards were presented in two categories: senior scientists, and junior scientists who were graduate students, postdoctoral fellows, or new investigators.
for the junior yia scientists, jean-christophe rochet of purdue university received first place for his work to develop nanoimaging-based approaches for detection and analysis of protein misfolding states. gengfeng zheng of harvard university received second place for his development of nanowire transistor arrays for large-scale, label-free, real-time, parallel electrical detection of a variety of biomolecules ranging from small molecules to proteins and viruses. third place was awarded to winston timp of the massachusetts institute of technology for his development of living-cell microarrays in which arrays of optical traps are able to manipulate hundreds of bacteria or tens of mammalian cells into 3-d arrays. these living-cell microarrays can be observed over time to assay gene expression and cell viability. in the senior yia group, the first place award went to justin hanes of johns hopkins university. his group is developing a new family of polymers, poly(ether-anhydride) nanoparticles, designed to be selectively adhesive to desired cell types and nonadhesive to obstacles in the body that might prevent their targeted delivery. second place went to xiaohua huang of the university of california-san diego for his efforts in developing a technology that is capable of rapidly sequencing a human genome for as little as $1,000. the third place recipient was susan l. beamis rempe of sandia national laboratories for her contributions in understanding ion transport between aqueous and biological environments for the purpose of designing new transporter devices relevant to medical and environmental applications. a poster session was held at the second annual meeting of aanm. the posters covered basic, biosensors, cellular, dendrimer-based, diagnostics, engineering, experimental, genetics, neurology, oncology, policy, and toxicology nanomedicine.
the presentations in each poster were excellent, and the discussions were a positive step in building future pathways in various research areas. the first place best poster award went to rutledge g. ellis-behnke of the massachusetts institute of technology. the second place best poster award went to lajos p. balogh of nanobiotechnology center, roswell park cancer institute. the third place best poster award went to winston timp of the massachusetts institute of technology. during the welcome reception, researchers and investigators had the opportunity to discuss the day's lectures and presentations (figure 6). wenchi wei (chiming wei's daughter) performed on the chinese musical instrument gu-zheng during the reception (figure 7). this year the aanm volunteer staff team worked very hard and provided wonderful service for the conference. we greatly appreciate the aanm staff team's excellent work. dr. ling xu, the senior coordinator of aanm staff team, received the aanm service award during the reception (figure 8). because the aanm will guide academic activities, provide leadership, and lead research in the field, the aanm fellowship committee approved and elected 48 new fellows to join the academy. we would like to invite more qualified scientists and investigators to apply for the aanm fellowship and join us in the development of nanomedicine research. in conclusion, the second annual meeting of the aanm was a great success (figure 9). the meeting provides investigators from different parts of the world a forum and an opportunity for discussion. we believe that nanomedicine research will develop rapidly in the future. the aanm invites basic and clinical researchers from around the world to join this exciting research. we sincerely appreciate everyone who contributed to this conference, and we are looking forward to seeing everyone next year at the third annual meeting of the aanm, september 8-9, 2007, in san diego, california.
nanoimaging for protein misfolding and related diseases nanotools for megaproblems: probing protein misfolding diseases using nanomedicine modus operandi nanomedicine and protein misfolding diseases protein interactions and misfolding analyzed by afm force spectroscopy identification of rotenone-induced modifications in α-synuclein using affinity pull-down and tandem mass spectrometry packing density and structural heterogeneity of insulin amyloid fibrils measured by afm nanoindentation single-molecule force spectroscopy measurements of "hydrophobic bond" between tethered hexadecane molecules experimental constraints on quaternary structure in alzheimer's β-amyloid fibrils polymorphic fibril formation by residues 10-40 of the alzheimer's β-amyloid peptide molecular structure of amyloid fibrils: insights from solid-state nmr key: cord-260057-2m6jdvtc authors: pandey, preeti; prasad, kartikay; prakash, amresh; kumar, vijay title: insights into the biased activity of dextromethorphan and haloperidol towards sars-cov-2 nsp6: in silico binding mechanistic analysis date: 2020-09-23 journal: j mol med (berl) doi: 10.1007/s00109-020-01980-1 sha: doc_id: 260057 cord_uid: 2m6jdvtc abstract: the outbreak of novel coronavirus disease 2019 (covid-19), caused by the severe acute respiratory syndrome coronavirus-2 (sars-cov-2), continues to infect a large population worldwide. sars-cov-2 utilizes its nsp6 and orf9c proteins to interact with sigma receptors, which are implicated in lipid remodeling and er stress response, to infect cells. the drugs targeting the sigma receptors, sigma-1 and sigma-2, have emerged as effective candidates to reduce viral infectivity, and some of them are in clinical trials against covid-19. the antipsychotic drug, haloperidol, exerts remarkable antiviral activity, but, at the same time, the sigma-1 benzomorphan agonist, dextromethorphan, showed pro-viral activity.
to explore the potential mechanisms of biased binding and activity of the two drugs, haloperidol and dextromethorphan, towards nsp6, we herein utilized molecular docking-based molecular dynamics simulation studies. our extensive analysis of the protein-drug interactions, structural and conformational dynamics, residual frustrations, and molecular switches of nsp6-drug complexes indicates that dextromethorphan binding leads to structural destabilization and an increase in conformational dynamics and energetic frustrations. on the other hand, the strong binding of haloperidol leads to minimal structural and dynamical perturbations to nsp6. thus, the structural insights into the stronger binding affinity and favorable molecular interactions of haloperidol towards viral nsp6 suggest that haloperidol can be potentially explored as a candidate drug against covid-19. key messages: •inhibitors of sigma receptors are considered as potent drugs against covid-19. •antipsychotic drug, haloperidol, binds strongly to nsp6 and induces minimal changes in the structure and dynamics of nsp6. •dextromethorphan, agonist of sigma receptors, binding leads to overall destabilization of nsp6. •these two drugs bind with nsp6 differently and also induce differences in the structural and conformational changes that explain their different mechanisms of action. •haloperidol can be explored as a candidate drug against covid-19. electronic supplementary material: the online version of this article (10.1007/s00109-020-01980-1) contains supplementary material, which is available to authorized users. the current outbreak of coronavirus disease 2019 (covid-19) caused by a novel coronavirus sars-cov-2 was first reported from wuhan, china, in late december 2019 [1], and has subsequently affected the entire world, with nearly 26 million confirmed cases of covid-19 along with 9.0-lakh deaths as per data recorded in the first week of september 2020, posing a global threat to human health and the economy.
although many novel studies and findings have surfaced since its inception, we are still lagging behind in the development of an effective treatment strategy to control the virus spread and prevent the disease [2] [3] [4] [5] [6] [7]. sars-cov-2 is an enveloped, non-segmented, large positive-sense, single-stranded rna virus (~30 kb) with 5′-cap structure and 3′-poly-a tail belonging to the β-cov category [8, 9]. its rna genome contains 29,891 nucleotides encoding ~9860 amino acids [9]. the genome codes for both structural proteins like spike (s), envelope (e), membrane (m), and nucleocapsid (n), along with many nonstructural proteins (nsps 1-16) [10]. while these nsps are linked to rna replication and processing of subgenomic rnas, the functions of some of the nsps are not known. a key component, nsp6, is a membrane protein of approximately 34 kda with eight transmembrane helices and a highly conserved c-terminus. together with nsp3 and nsp4, nsp6 is involved in the formation of replication-transcription complexes (rtcs) or replication organelles (ro) by stimulating the rearrangement of host cell membranes [11]. these replication complexes serve many important functions during the virus life cycle and play an important role in infection [12, 13]. the expression of these three nsps has been associated with the formation of different membranous structures characteristic of cov-infected cells, including the double-membrane vesicles (dmvs), large virion-containing vacuoles (lvcvs), cubic membrane structures (cmss), and zippered endoplasmic reticulum (er) spherules [14] [15] [16]. nsp6 protein is also involved in blocking er-induced autophagosome/autolysosome vesicle formation that plays a protective role in checking viral production inside host cells. nsp6 induces autophagy through the activation of the omegasome pathway [17]. the autophagosomes produced by nsp6 are higher in number but smaller in size as compared with those induced by starvation [18].
this may favor coronavirus infection by limiting the ability of autophagosomes to deliver viral components to lysosomal degradation. recently, gordon et al. [19] presented a blueprint of the sars-cov-2-human interactome to reveal potential drug targets against covid-19. the authors identified 332 interactions between viral and host proteins, largely targeting the innate immune signaling pathway. using this blueprint, they identified a series of drugs and compounds with high potential to fight covid-19, some of which are now entering clinical trials. the authors found that sars-cov-2 nsp6 protein interacts with the sigma receptor, which regulates er stress response [19] and blocks er-induced autophagosome/autolysosome vesicle formation that restricts viral production. they found that drugs or molecules targeting sigma-1 and sigma-2 receptors effectively inhibited virus replication and growth. these identified drugs or molecules include the antipsychotics haloperidol and melperone, which are used to treat schizophrenia; antihistamines like clemastine and cloperastine; compound pb28; and the female hormone progesterone. recently, serkan tulgar and co-authors [20] advocated the use of haloperidol, an anti-inflammatory agent, in preventing the progression and reducing the severity of covid-19. gordon et al. [19] in the same study also identified the sigma-1 benzomorphan agonist, dextromethorphan, which actually has pro-viral activity and helped the growth of the virus in cells. the authors suggested that dextromethorphan might activate sigma-1, which could help the activation of the stress response that the sars-cov-2 hijacks. dextromethorphan is a common cough suppressant found in several over-the-counter cough medicines. it binds to nmda receptors and can modulate glutamatergic signaling [21].
it also binds to serotonin transporters [22] and several other protein targets, including sigma-1 receptors [21, 22], which have been considered therapeutic targets for antidepressant drugs [23] [24] [25] [26]. however, the consumer healthcare products association (chpa) in their statement says that "there is currently no clinical data indicating that the cough suppressant dextromethorphan has a pro-viral effect in people with covid-19 infection. the study results published by gordon et al. [19] are experimental, preliminary, and not conclusive." more importantly, theresa enkirch et al. [27] evaluated the anti-influenza activity of dextromethorphan in vitro, and in mice as well as in animal models. the authors demonstrated that dextromethorphan was able to inhibit viral replication both in vitro and in vivo. all these findings suggest that haloperidol and dextromethorphan are potential candidates that bind to sars-cov-2 nsp6 and exert different viral activities. haloperidol inhibits sars-cov-2 with a ki of 2-12 nm, while dextromethorphan activates the virus with an activity constant, ki, of 118 nm [19]. however, the theoretical basis of this contradictory and biased activity is still not clear. to explore the molecular mechanism of the antiviral activity of haloperidol and the pro-viral activity of dextromethorphan, a number of computational methods, namely molecular docking, all-atom molecular dynamics (md) simulation, and the molecular mechanics/poisson-boltzmann surface area (mm-pbsa) method, were employed in this study. these mechanism-relevant studies explore the binding modes, examine the key residues in the binding process, and elucidate the detailed interaction mechanisms. the results are expected to provide insights into the binding mechanism of the nsp6-drug complexes, which will be useful for future exploration of efficient drug targets in covid-19.
the three-dimensional structure of nsp6 was predicted by the deepmind algorithm alphafold, which is based on deep neural network learning [28] . the nsp6 model structure was energy-minimized to remove steric clashes, and the lowest energy structure was selected based on the molprobity scores. in addition, the zhang group also predicted models for the sars-cov-2 proteins [29] using the novel c-i-tasser platform [30] . however, these models have poor local geometries, numerous atomic clashes, poor side-chain conformations, and bad backbone dihedral angles, as measured by the molprobity score. the stereochemical quality of the modeled protein was evaluated using the structural analysis and verification server version 5.0 (saves) (https://servicesn.mbi.ucla.edu/saves/). this meta server runs six programs for checking the stereochemical quality of a protein structure by analyzing residue-by-residue geometry and overall structural geometry. the chemical structures of the drugs were downloaded from the zinc database [31] as mol2 files and then converted into three-dimensional pdb files using the open babel 2.4.1 suite. the swiss-pdb tool was used to energy-minimize the protein structure to obtain a stable, low-energy conformation of the protein. all the docking studies were performed with autodock vina [32] . the autodock tools package [33] was used to prepare the protein structures for docking by adding hydrogens to the polar groups along with kollman charges. both the protein and ligand files were converted into pdbqt format using the autodock vina plugin with scripts from the autodock tools package. autodock vina generates up to 9 poses for each drug, containing the molecular coordinates as well as the gibbs free energy variation (δg, kcal/mol) for each pose. finally, the top-ranked docking conformations were subjected to post-docking energy minimization in discovery studio (ds 3.531). 
after docking, the resultant receptor-ligand complexes were visualized and studied with the discovery studio visualizer (biovia), pymol [34] , ucsf chimera 1.9 [35] , and ligplot+ [36] . the all-atom md simulations were performed using gromacs v5.1.4 on the atomic coordinates of sars-cov-2 nsp6, and of nsp6 complexed with dextromethorphan and haloperidol. the charmm27 force field was selected, the tip3p water model was used to solvate the systems [37] , and the ligand parameters were taken as described elsewhere [38] . each system was placed in the center of an octahedral simulation box with a buffer distance of 10 å and water molecule padding around the system. to neutralize the system, counter ions (na + and cl − ) were added at a concentration of 0.15 m [39] . all the md simulations were performed at physiological temperature, 300 k. the energy minimization of all three systems was performed using steepest descent first, then conjugate gradient (50,000 steps for each). bonds involving hydrogens were treated with the shake algorithm, pme (particle mesh ewald) was used to treat long-range electrostatic forces, and pbc (periodic boundary conditions) were applied in the x, y, and z directions [40] . the nvt and npt ensembles were applied for the equilibration of the system for a period of 500 ps. during the simulation, the berendsen thermostat [41] and parrinello-rahman barostat [42] were used to maintain the temperature and pressure, respectively. the lincs algorithm was used to constrain the bonds and angles [43] . the van der waals interactions were handled by the lj potential with a cutoff of 0.10 nm. using the npt ensemble, production runs were performed for a period of 100 ns. the energy, velocity, and trajectory were saved at a time interval of 10 ps. all production runs were done on a cuda-enabled tesla gpu machine (dell t640 with v100 gpu) running centos 7 [44, 45] . 
gromacs utilities and python scripts with mdtraj [46] were used to analyze the global structural order parameters: rmsd (root mean square deviation), rg (radius of gyration), sasa (solvent accessible surface area), rmsf (root mean square fluctuation), pca (principal component analysis), and free energy landscape (fel). dssp was used to examine the conformational dynamics of the secondary structure during the simulation [47] . we calculated the molecular mechanics/poisson-boltzmann surface area (mm-pbsa) binding free energy from the md trajectories of the protein-ligand complexes to describe the structural stability of the complexes, the spatial orientation of the ligands, and the molecular interactions at the binding site of nsp6. the mm-pbsa method provides a robust estimation of the binding free energy, contacts, and the effect of solvent underlying the binding affinity of ligand molecules [48, 49] . the binding free energy is expressed as:

$$\Delta G_{\mathrm{binding}} = G_{\mathrm{complex}} - \left(G_{\mathrm{protein}} + G_{\mathrm{ligand}}\right)$$

where g complex represents the total free energy of the protein-ligand complex, g protein the free energy of the protein, and g ligand the free energy of the ligand. neglecting entropy terms, the free energy of each entity x (complex, protein, or ligand) can be represented as:

$$G_{x} = E_{\mathrm{MM}} + G_{\mathrm{solv}}, \qquad \text{so that} \qquad \Delta G_{\mathrm{binding}} = \Delta E_{\mathrm{MM}} + \Delta G_{\mathrm{solv}}$$

where δe mm is the average molecular mechanics interaction energy change (gas-phase) upon ligand binding and δg solv is the solvation free energy change on ligand binding. e mm includes the bonded, electrostatic, and van der waals energies, and g solv includes polar and non-polar solvation terms. we applied the ferreiro and wolynes [50] algorithm for the calculation of residual frustration in the nsp6 structures. the residual frustration index was calculated with the frustratometer server (http://www.frustratometer.tk) [51] . the results were explained based on the "single residual frustration index." a residue with a frustration value (denoted as z-score) greater than + 0.78 is defined as minimally frustrated or stabilizing, while a residue with a frustration value less than − 0.78 is defined as highly frustrated or destabilizing [52] . 
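the threshold rule for the single residual frustration index can be sketched in a few lines of python. this is only an illustrative sketch: the function names and the example z-scores are hypothetical and are not taken from the frustratometer server or from the data of this study; only the ± 0.78 cutoffs and the three classes (minimally frustrated, highly frustrated, neutral) come from the text.

```python
# Cutoffs quoted in the text for the single-residue frustration index (z-score).
MINIMAL_CUTOFF = 0.78   # above this value: minimally frustrated (stabilizing)
HIGH_CUTOFF = -0.78     # below this value: highly frustrated (destabilizing)


def classify_frustration(z_score: float) -> str:
    """Map one frustration index to 'minimal', 'high', or 'neutral'."""
    if z_score > MINIMAL_CUTOFF:
        return "minimal"
    if z_score < HIGH_CUTOFF:
        return "high"
    return "neutral"


def class_fractions(z_scores):
    """Return the fraction of residues falling in each frustration class."""
    counts = {"minimal": 0, "high": 0, "neutral": 0}
    for z in z_scores:
        counts[classify_frustration(z)] += 1
    total = len(z_scores)
    return {label: count / total for label, count in counts.items()}


# Hypothetical indices for illustration only (not values from the study).
example = [1.4, 0.9, 0.2, -0.5, -1.1, -2.9]
print(class_fractions(example))
```

applied to a per-residue index list from any source, `class_fractions` gives the kind of percentage breakdown (highly versus minimally frustrated residues) that is reported for nsp6 in the results.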
if the z-score lies in between these two limits, the residue will be defined as neutral. the secondary structure of nsp6 was predicted using alphafold, a deep learning algorithm, and shown in fig. 1 . the constructed model of sars-cov-2 nsp6 has been verified through the saves v5.0 server. the ramachandran plot showed that 91.7% and 8.3% of residues are located in the most favored, and additionally allowed, regions, respectively. the environmental profile for nsp6 was computed with verify3d, which showed that almost 80% of the amino acids have scored more than zero. the non-bonded interactions between the atoms were also predicted with the errat server, which showed the overall quality factor of 92.19 (supplementary figure s1 ). the overall three-dimensional structure of nsp6 ( fig. 1) consists of fourteen α-helices, a c-terminal, two antiparallel β-strands, and sixteen turns. the structure of nsp6 has eight transmembrane helices (tm1-tm8). to predict the binding site for performing the molecular docking, a ligand-independent binding site search was done on nsp6 using castp server [53] . this program provides comprehensive and detailed quantitative characterization of geometric and topological features of protein structures. the server identifies a putative binding site with a solvent accessible surface area of 1076 å 2 and volume of 1442 å 3 . this binding site of nsp6 is encompassed by helices h3 (residues 61-69), turn and coil region (residues 126-133) joining h6 and h7, h8-h9 (residues 170-179), and c-terminal h11-h12-β1 (residues 221-244) (supplementary figure s2) . we have used the other two web-servers, prankweb [54] and coach [55] , for predicting the binding sites. these two servers also predict similar residues in the binding pocket, with certain changes. for the molecular docking study, we utilized the autodock vina, which is an open-source program and the most commonly used docking software with an effective scoring function [32] . 
haloperidol is an antipsychotic drug and is an inhibitor of sigma-1, dopamine d2, and histamine h1 receptors; it inhibits sars-cov-2 with a ki of 2-12 nm [19] . dextromethorphan, an antidepressant and sigma-1 agonist, activates the virus with an activity constant, ki, of 118 nm [19] . the molecular docking results showed that both haloperidol and dextromethorphan form a good complex with nsp6, with docking (affinity) scores of − 7.7 kcal/mol and − 6.5 kcal/mol, respectively, which correspond to k d values (k d = exp(δg/rt)) in the micromolar range. the binding mode for dextromethorphan based on the docking studies is presented in fig. 2a . dextromethorphan forms an h-bond between its methoxy group and lys61 of nsp6. both cys229 and tyr242 form a lipophilic binding pocket through π-sulfur and π-alkyl interactions with the aromatic rings. the morphinan group binds with several hydrophobic residues, including his62, arg233, arg236, and leu245, and forms additional van der waals interactions with residues asn232, thr238, asp243, and pro282. this suggests that the dextromethorphan-binding pocket lies mainly in the c-terminal α-helix h12 and the β1 strand. the docking results show a different binding pose for haloperidol compared with dextromethorphan (fig. 2b) . haloperidol is predicted to bind in the hydrophobic cavity formed by h7 (tm5), h9 (tm7), and c-terminal positively charged residues. the carbonyl oxygen of the phenone group forms an h-bond with ser176, while the hydroxy group of the piperidine ring forms an h-bond with ser265. owing to their close proximity, leu231, leu237, and leu239 show cation-π interactions with the chlorophenyl group of the drug, and ala136 forms π-stacked interactions with the aromatic ketone of the phenone group. additional van der waals interactions with the residues arg137, trp140, asn174, tyr175, val178, thr238, and gln290 are observed. 
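the conversion from a docking score to a dissociation constant, k d = exp(δg/rt), can be checked with a short python sketch. the temperature of 298.15 k and the function name are assumptions made for this illustration (the docking temperature is not stated in the text); the two δg values are the autodock vina scores quoted above. under these assumptions both scores map to k d values in the micromolar range.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_dg(dg_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Dissociation constant in mol/L from a binding free energy: K_d = exp(dG / RT).

    The default temperature is an assumption for illustration.
    """
    return math.exp(dg_kcal_per_mol / (R_KCAL * temperature_k))


# Docking scores quoted in the text (kcal/mol).
for name, score in [("haloperidol", -7.7), ("dextromethorphan", -6.5)]:
    kd = kd_from_dg(score)
    print(f"{name}: dG = {score} kcal/mol, K_d ~ {kd * 1e6:.1f} uM")
```

because k d depends exponentially on δg, a difference of about 1.2 kcal/mol between the two scores corresponds to a difference of less than one order of magnitude in k d.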
thus, a higher number of hydrogen bonds and hydrophobic interactions are observed for haloperidol than for dextromethorphan, indicating that the former is a good inhibitor molecule.

evaluating the stability and conformational dynamics of nsp6-systems through md simulation

to further characterize the binding-induced structural and conformational changes, all-atom md simulations for 100 ns were performed to obtain the conformational sampling of the two systems. the selected docking conformations of nsp6 in complex with haloperidol and dextromethorphan were sampled by 100-ns md simulation, and the dynamic stability of the complexes was elucidated by calculating the cα-rmsd values of the protein as a function of simulation time ( figure s3a ). the rmsd trajectory indicated that all the systems are stable and well equilibrated after ~50 ns and well converged for further analysis. the rmsd plots showed that nsp6 and nsp6-haloperidol remained mostly stable with an rmsd of ~0.5 nm, while nsp6-dextromethorphan showed a sharp increase in rmsd at ~45 ns, reaching up to 1.25 nm, and then remained stable throughout the rest of the simulation. the result thus indicated that the nsp6-haloperidol complex showed almost the same rmsd as unliganded nsp6 and formed a much more stable complex than the nsp6-dextromethorphan complex. the analysis of the one-dimensional probability distribution function of nsp6 without ligand showed a narrower distribution with a higher probability of 0.10 around rmsd ≤ 5.0 å (0.5 nm), and thus a stable helical conformation (fig. 3a) . in the presence of dextromethorphan, by contrast, the complex showed a broader rmsd distribution with the appearance of new populations, with a maximum probability of 0.12 at an rmsd of 1.25 nm, indicating a change in conformation leading to destabilization of the α-helix h12 (fig. 3a) . the cα-rmsd beyond 0.12 nm indicates a transition from an α-helical conformation towards the extended configuration (fig. 3a) . 
however, the drug haloperidol showed a lower probability (p = 0.10) with a slightly broader distribution and a rightward shift of the rmsd population compared with nsp6. at this rmsd, intrahelical h-bonds are still present to stabilize a helical conformation. thus, dextromethorphan binds to nsp6 in a highly unstable manner, as is evident from the rmsd plot, while haloperidol binds strongly. rg is an effective parameter to evaluate the structural integrity and compactness of the studied systems. the rg is defined as the mass-weighted root mean square distance of a collection of atoms from their common center of mass. the time evolution plot of rg showed that all systems were compact, with nsp6-dextromethorphan having the lowest rg ( figure s3b ). the rg plot clearly suggests that the nsp6-dextromethorphan complex, having the lowest rg value, forms a more compact complex than the nsp6-haloperidol complex. it also indicates that after binding of the drug, the nsp6 protein attains a compact conformation and the drug dextromethorphan could stay stably at the binding site, which would favor a stronger interaction between dextromethorphan and nsp6. the nsp6-haloperidol complex showed a similar rg to nsp6. the rg differences between the nsp6 systems were minimal for haloperidol, indicating that haloperidol binding only marginally changes the spatial packing of the residues. in the case of the nsp6-dextromethorphan system, the rg distribution is a mixture of two normal distributions that correctly describes the sampled configurations (rg1 = 2.12 nm, p = 0.05 nm; rg2 = 2.25 nm, p = 0.04 nm) (fig. 3b) . interestingly, the first peak lies at a slightly lower value than those of the nsp6 and nsp6-haloperidol systems, suggesting a more compact globular state. the second small peak falls near the nsp6-haloperidol system. a visual inspection revealed the more compact globular shape (rg = ~1.65 nm) in the haloperidol system as a consequence of the lesser fluctuations (see rmsf results).

fig. 3 probability distributions of structural parameters of nsp6 systems. a cα-rmsd. b radius of gyration (rg). c sasa. d rmsf for nsp6 (green), nsp6-haloperidol (blue), and nsp6-dextromethorphan (red)

the sasa results showed stable trajectories without any fluctuations throughout the simulation, thus suggesting the structural stability of nsp6 in the presence of the drugs ( figure s3c ). the marginal increment of sasa in the nsp6-dextromethorphan system indicates increased exposure of the residues to the surface of nsp6 (fig. 3c) . the rmsf displays the flexibility/mobility of each amino acid residue in the nsp6 and nsp6-drug complexes (fig. 3d) . it is worth noting that nsp6 alone showed lower rmsf (higher rigidity) in comparison with the complexes, maintaining lower flexibility in the α-helices, except for the helix h5 (s89-d99). the nsp6-haloperidol complex does not show increased fluctuation and had rmsf values similar to unliganded nsp6, suggesting a favorable protein-inhibitor association. from the figure, it can be observed that the rmsf values of residues 130-140 (corresponding to h7) in nsp6-haloperidol were lower than in unliganded nsp6, indicating the stabilization of these regions upon haloperidol binding. the residues 85-92 (corresponding to h5), 96-110 (corresponding to the hinge between h5 and h6), and 245-260 (h13) showed a milder increase in fluctuations. however, dextromethorphan caused much higher fluctuations along the whole protein when bound to nsp6, indicating that dextromethorphan binding results in significant structural perturbation of nsp6 (fig. 3d) . dextromethorphan binding, however, decreases the flexibility of residues 93, 94, 97, and 98 (corresponding to h5), indicating the dominant role of these residues in drug binding. 
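the two order parameters discussed above, the mass-weighted rg of a frame and the per-atom rmsf over a trajectory, can be written down compactly with numpy. this is a minimal sketch under stated assumptions: the array layout (frames × atoms × 3), the function names, and the synthetic coordinates are hypothetical, the frames are assumed to be already superimposed on a reference, and the study itself used gromacs utilities and mdtraj rather than this code.

```python
import numpy as np


def radius_of_gyration(coords: np.ndarray, masses: np.ndarray) -> float:
    """Mass-weighted Rg of one frame; coords has shape (n_atoms, 3)."""
    center = np.average(coords, axis=0, weights=masses)   # center of mass
    sq_dist = np.sum((coords - center) ** 2, axis=1)      # squared distances to it
    return float(np.sqrt(np.average(sq_dist, weights=masses)))


def rmsf(traj: np.ndarray) -> np.ndarray:
    """Per-atom RMSF; traj has shape (n_frames, n_atoms, 3) and is pre-aligned."""
    mean_pos = traj.mean(axis=0)                          # average structure
    sq_dev = np.sum((traj - mean_pos) ** 2, axis=2)       # (n_frames, n_atoms)
    return np.sqrt(sq_dev.mean(axis=0))


# Synthetic coordinates for illustration only (not simulation data).
rng = np.random.default_rng(0)
traj = rng.normal(size=(50, 20, 3))
masses = np.full(20, 12.011)
rg_series = np.array([radius_of_gyration(frame, masses) for frame in traj])
print(rg_series.mean(), rmsf(traj).shape)
```

a histogram of `rg_series` (for example with `np.histogram(..., density=True)`) gives a probability distribution of the kind shown in fig. 3b.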
from the rmsf analysis, we also observed that the nsp6-dextromethorphan complex showed maximum fluctuation in the c-terminal domain (residues 240-270, corresponding to β1-h13-h14), while unliganded nsp6 and nsp6-haloperidol do not show this fluctuation. from the rmsf result, we thus conclude that the nsp6-haloperidol complex is more stable than the nsp6-dextromethorphan complex. this has been further substantiated by the h-bonding analysis. the protein-drug interaction is transient, and h-bonds play a critical role in the stability of the protein-drug complex. we analyzed the number of h-bonds for the last 60 ns of the trajectories ( figure s4) . the time evolution of h-bonds for the three systems was monitored during the simulation and is represented in figure s4 . the number of h-bonds for the nsp6-haloperidol and nsp6-dextromethorphan complexes was 0-2 and 0-1, respectively. as shown in the figure, hydrogen bonding patterns in the nsp6-haloperidol system remained constant throughout the entire simulation, whereas hydrogen bond formation in nsp6-dextromethorphan was unstable and absent most of the time during the simulation. for the last 20 ns of simulation, the number of intermolecular h-bonds is greater in the case of the nsp6-haloperidol system, indicating the most stable binding. to further understand the drug binding-induced changes in secondary structure, the time evolution of the secondary structure profiles is shown in fig. 4 . as mentioned earlier, nsp6 adopts a compact fold consisting of fourteen α-helices and two β-strands. notable changes in the secondary structure profile were observed in the nsp6-drug complexes. the most significant changes were an increase in helicity of residues 89-99 (i.e., h5) and residues 200-205 (i.e., residues between h10 and h11). in the nsp6 and nsp6-drug complexes, the hinge region covering residues 100-110 can interconvert between helix and turn during simulation. 
in the case of the nsp6-haloperidol complex, the helix covering residues 226-236 (i.e., h12) was converted to turn during the simulation. the dssp analysis indicates that dextromethorphan binding to nsp6 decreases the helicity of residues 150-160 (h7 and h8) while the h12 helix remains stable, as in unliganded nsp6. here, we use principal component analysis (pca) to examine the conformational sampling of the nsp6 systems by examining their dominant modes of motion. the covariance matrix of atomic fluctuations was diagonalized to obtain the eigenvalues. the first few eigenvectors play a key role in the motions of the protein. it is apparent from the result that the first 5 eigenvectors have larger eigenvalues for the unliganded than for the drug-bound nsp6, reflecting larger collective atomic fluctuations of the unliganded nsp6 ( figure s5 ). in addition, the first five principal components (pcs) account for more than 70% of the motions observed for the last 40 ns of the trajectories of the nsp6 and nsp6-haloperidol systems, whereas only the first three pcs account for ~100% of the motions of the nsp6-dextromethorphan system ( figure s5 ). this indicates that the nsp6 and nsp6-haloperidol complex showed lesser motions than the nsp6-dextromethorphan complex, and that the first few pcs are not the same in the three studied systems. the conformational sampling of the nsp6 systems in the essential subspace is shown in figure 5a , which illustrates the global motions along pc1 and pc2 projected by the cα atoms. the figure clearly indicates that modeled nsp6 showed a more stable cluster than the nsp6-drug complexes. at the same time, we found that the nsp6-dextromethorphan complex occupied a wide and different conformational subspace and showed fewer stable clusters than nsp6-haloperidol. the pca thus supports the finding that the nsp6-dextromethorphan complex is less stable, with increased conformational dynamics. 
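the pca procedure summarized above (build the covariance matrix of atomic fluctuations, diagonalize it, then ask how much of the motion the first few components capture) can be sketched with numpy as follows. the function names, array layout, and synthetic trajectory are assumptions for illustration; the study used gromacs utilities, and the frames are assumed to be pre-aligned cα coordinates.

```python
import numpy as np


def pca_modes(traj: np.ndarray):
    """PCA of an aligned trajectory with shape (n_frames, n_atoms, 3).

    Returns eigenvalues (descending), eigenvectors (as columns), and the
    projection of every frame onto every eigenvector.
    """
    flat = traj.reshape(traj.shape[0], -1)        # (n_frames, 3 * n_atoms)
    centered = flat - flat.mean(axis=0)           # atomic fluctuations
    cov = np.cov(centered, rowvar=False)          # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)        # symmetric diagonalization
    order = np.argsort(eigvals)[::-1]             # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals, eigvecs, centered @ eigvecs


def cumulative_variance(eigvals: np.ndarray, n_modes: int) -> float:
    """Fraction of the total fluctuation captured by the first n_modes PCs."""
    eigvals = np.clip(eigvals, 0.0, None)         # guard against tiny negatives
    return float(eigvals[:n_modes].sum() / eigvals.sum())


# Synthetic trajectory dominated by one collective motion (illustration only).
rng = np.random.default_rng(1)
n_frames, n_atoms = 200, 10
direction = rng.normal(size=n_atoms * 3)
direction /= np.linalg.norm(direction)
amplitude = rng.normal(scale=5.0, size=n_frames)
noise = rng.normal(scale=0.1, size=(n_frames, n_atoms * 3))
traj = (amplitude[:, None] * direction + noise).reshape(n_frames, n_atoms, 3)

eigvals, eigvecs, proj = pca_modes(traj)
print(cumulative_variance(eigvals, 1), proj[:, :2].shape)
```

the first two columns of `proj` are the pc1 and pc2 coordinates of each frame, which is the kind of two-dimensional projection plotted in figure 5a.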
to understand how drug binding affects the motions described by pc1 and pc2, the displacements along the pcs for the nsp6 systems were calculated and are shown in fig. 5b . the average fluctuations for nsp6-dextromethorphan in both pcs were much higher than for nsp6-haloperidol and unliganded nsp6. this indicated that dextromethorphan might influence the protein motions during binding, whereas the nsp6-haloperidol complex showed fewer motions during binding, which is consistent with the rmsf analysis (fig. 3d) . to demonstrate the effects of drug binding on conformational redistributions, free energy landscapes (fel) for the three systems are shown in fig. 6 . each protein-drug complex has a different fel pattern. the fel of the nsp6 protein had multiple minima with small energy barriers in a single broad valley (basin). most of the minima with the lowest energy had a flat end, showing the clustering of low-energy conformations (fig. 6a) . after haloperidol binding, the basins segregate to different coordinates and show clustered lowest energy minima close to each other. some of the energy minima had a conical end, suggesting the presence of a stable conformation, while the minima with a flat end indicate the absence of low-energy conformations (fig. 6b) . however, in the case of the dextromethorphan-bound nsp6, there exists only one deeper energy minimum in the global free energy minimum region, indicating only one stable conformational state residing within this valley (basin) (fig. 6c) . the free energy valley is much deeper than in the unliganded nsp6 and nsp6-haloperidol complex, indicating that the nsp6-dextromethorphan complex has a lower free energy value. besides this, the nsp6-dextromethorphan complex also showed a reversed population shift relative to the unliganded nsp6. a comparison of the fels for these two drug complexes of nsp6 reveals that the fel of the haloperidol-bound nsp6 exhibits a more rugged free energy surface than that of the dextromethorphan-bound nsp6. 
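a free energy landscape of this kind is commonly obtained by boltzmann inversion of the two-dimensional probability distribution of the pc1 and pc2 projections, g = − k b t ln(p/p max). the text does not spell out the formula or the binning that were used, so the sketch below is an assumption: the function name, the bin count, and the synthetic projections are hypothetical, and only the temperature of 300 k comes from the simulation protocol.

```python
import numpy as np

KB_KJ = 0.0083144626  # Boltzmann constant times Avogadro number, kJ/(mol*K)


def free_energy_landscape(pc1, pc2, bins=30, temperature_k=300.0):
    """Boltzmann-inverted 2D landscape, G = -kT ln(P / P_max), in kJ/mol.

    The most populated bin is the zero of free energy; bins that were never
    visited come out as +inf.
    """
    hist, xedges, yedges = np.histogram2d(pc1, pc2, bins=bins)
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):            # log(0) for empty bins
        fel = -KB_KJ * temperature_k * np.log(prob / prob.max())
    return fel, xedges, yedges


# Synthetic projections with two populated basins (illustration only).
rng = np.random.default_rng(2)
pc1 = np.concatenate([rng.normal(-2.0, 0.5, 3000), rng.normal(2.0, 0.5, 1000)])
pc2 = rng.normal(0.0, 0.5, 4000)
fel, xedges, yedges = free_energy_landscape(pc1, pc2)
print(fel.shape, np.isfinite(fel).sum())
```

in such a map, a single deep well corresponds to one dominant conformational state, while many shallow wells separated by small barriers give the rugged surface described for the haloperidol-bound protein.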
furthermore, the fel of the nsp6-haloperidol complex contains a greater number of local free energy minima, either in the global free energy minimum region (i.e., the funnel bottom) or in the region outside the global free energy minimum (i.e., the funnel wall), resulting in a more rugged and complex fel. evaluation of the binding free energy is one of the important aspects of drug discovery, as it describes the strength of non-bonded molecular associations during binding. a more negative free energy value signifies stronger binding between the ligand and the protein. using the mm-pbsa method [56, 57] , the binding free energy (δg binding ) was calculated as the difference between the free energy of the protein-ligand complex and the free energies of the unbound protein and ligand ( figure s6) . the average binding free energies and the detailed energetic contributions for the last 20 ns of the md trajectories were calculated and are shown in table 1 . it can be seen that haloperidol binds more strongly to nsp6 than dextromethorphan by − 12.74 kcal/mol, indicating enhanced binding affinity. while the van der waals energy (evdw) largely contributed to the binding of haloperidol to nsp6, the polar energy is unfavorable, with a positive value. besides, hydrophobic interactions played a crucial role in haloperidol binding to nsp6. for dextromethorphan binding to nsp6, both the polar energy and van der waals interactions were favorable, while the electrostatic energy is highly unfavorable, with a positive value of 52.1 ± 3.39 kcal/mol (table 1 ). according to table 1 , it is apparent that the polar (charged interactions) component was the key contributor to the binding free energy of the nsp6-dextromethorphan complex. the results indicated that the large unfavorable electrostatic term (δeele) was compensated by the large contributions of the polar and van der waals components in the binding of dextromethorphan to nsp6. therefore, the polar contribution was considered the main driving force in the binding mechanism. 
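the bookkeeping behind such an energy table, in which van der waals, electrostatic, polar solvation, and non-polar solvation terms add up to δg binding, can be made explicit with a small python sketch. the class name, field names, and numerical values below are hypothetical and chosen only for illustration; they are not the values of table 1. the sketch also assumes that bonded terms cancel between the complex and its parts and that entropy is neglected.

```python
from dataclasses import dataclass


@dataclass
class MMPBSATerms:
    """Average energy changes on binding, all in kcal/mol (hypothetical container)."""
    vdw: float                 # van der Waals part of dE_MM
    electrostatic: float       # electrostatic part of dE_MM
    polar_solvation: float     # polar part of dG_solv
    nonpolar_solvation: float  # non-polar (surface area) part of dG_solv

    def delta_e_mm(self) -> float:
        """Gas-phase molecular mechanics interaction energy change."""
        return self.vdw + self.electrostatic

    def delta_g_solv(self) -> float:
        """Solvation free energy change on binding."""
        return self.polar_solvation + self.nonpolar_solvation

    def delta_g_binding(self) -> float:
        """dG_binding = dE_MM + dG_solv (entropy neglected)."""
        return self.delta_e_mm() + self.delta_g_solv()


# Hypothetical numbers for illustration only (not taken from table 1).
terms = MMPBSATerms(vdw=-40.0, electrostatic=-10.0,
                    polar_solvation=30.0, nonpolar_solvation=-4.0)
print(terms.delta_e_mm(), terms.delta_g_solv(), terms.delta_g_binding())
```

written this way, it is easy to see how one large unfavorable term can be offset by the others, which is the compensation argument made above for dextromethorphan.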
to identify the key residues that contributed to the binding energy, we decomposed the binding energy on a per-residue basis, and the important residues with high contributions (≥ 0.1 kcal/mol) were identified and are shown in fig. 7 . as represented in fig. 7 , the energies of the highly contributing amino acids varied significantly between the drug complexes. in the decomposition of the binding energy of the nsp6-haloperidol complex, the most favorable interactions were from trp140, trp165, ala166, and gly177, which comprise both charged and hydrophobic amino acids (fig. 7) . residues within the dextromethorphan-binding sites of nsp6 include mainly charged amino acids like w31, e39, d112, d133, d134, d159, e195, d243, e250, and d267 (fig. 7) . of note, in the dextromethorphan interactions with nsp6, many charged residues like lysine (k61, 63, 109, 111, 151, 263, 270, 274, 281, and 285) and arginine (r93, 187, 233, 236, and 252) produced large positive, unfavorable binding free energy values on average. thus, we observed a large spatial shift of the binding residue clusters between the two drugs. overall, haloperidol binds to the residue clusters near trp165-gly177 (which made no energy contribution in the nsp6-dextromethorphan system), whereas dextromethorphan binds to the c-terminal residue clusters arg236-asp267 (which made no energy contribution in the nsp6-haloperidol system). calculation of the correlation matrix is frequently utilized to illustrate dynamical information of proteins in two dimensions [52, 58, 59] . to observe the correlation in the dynamics, correlation matrices for each of the nsp6 systems were calculated through the dynamut web server [60] and are plotted in fig. 8 . the red regions in the correlation maps represent strongly correlated motion of the residues, while the blue regions are associated with strongly anticorrelated motion of the residues. 
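for orientation, a cross-correlation map of this kind has entries c ij = ⟨δr i · δr j⟩ / (⟨δr i ²⟩ ⟨δr j ²⟩)^(1/2), ranging from − 1 (anticorrelated) to + 1 (correlated). the sketch below computes this quantity directly from an aligned trajectory with numpy. it is an illustration of the definition only: the dynamut server used in this study derives its correlations by its own procedure, and the function name, array layout, and synthetic data here are assumptions.

```python
import numpy as np


def dccm(traj: np.ndarray) -> np.ndarray:
    """Dynamic cross-correlation matrix from an aligned trajectory.

    traj has shape (n_frames, n_atoms, 3); the result has shape
    (n_atoms, n_atoms) with values in [-1, 1] and ones on the diagonal.
    """
    disp = traj - traj.mean(axis=0)                        # displacement vectors
    # <dr_i . dr_j>: dot products averaged over frames
    cov = np.einsum("fik,fjk->ij", disp, disp) / traj.shape[0]
    norm = np.sqrt(np.diag(cov))                           # sqrt(<dr_i^2>)
    return cov / np.outer(norm, norm)


# Synthetic example: atoms 0 and 1 move together, atom 2 moves oppositely.
rng = np.random.default_rng(3)
step = rng.normal(size=(100, 3))
base = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
traj = base + np.stack([step, step, -step], axis=1)
print(np.round(dccm(traj), 2))
```

plotting the resulting matrix as a heat map with a diverging red-blue color scale gives the kind of map described above.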
from the comparison of the correlation maps of the nsp6 systems, a substantial loss of motions in the correlation maps of the nsp6-drug complexes was identified and labeled as shown in fig. 8 .

fig. 7 the per-residue binding free energy decomposition for the simulated nsp6-haloperidol (blue) and nsp6-dextromethorphan (red). residues with free energy values ≥ 0.1 kcal/mol contribute more to the binding interaction

in the presence of the drugs, the helices of the hydrophobic core were significantly destabilized. this is most apparent in the nsp6-dextromethorphan complex, in which most of the helices showed weaker correlation in motions than in the nsp6-haloperidol complex. as can be seen, dextromethorphan binding leads to the complete loss of correlated motions of helices h9 and h11 in nsp6. a significant loss of correlated motions between h1-h6 and h1-h8, along with disruption of long-range motions between h5-h13, h6/h7-h11, and h5-β2, and of short-range motions between h8/h9-h11, was observed. interestingly, correlated motions between h9 and β2 were observed in nsp6-dextromethorphan that are not present in the unliganded nsp6 protein. moreover, the long-range anti-correlated motions between the n- and c-terminal regions were also lost in the nsp6-dextromethorphan complex. in contrast, the nsp6-haloperidol complex displayed a somewhat better correlation in the motion of residues of h8 and h11 than the nsp6-dextromethorphan system, indicating enhanced stability of the hydrophobic core (fig. 8) . however, similar to dextromethorphan, haloperidol binding also leads to a complete loss of correlated motions between h1-h6, h1-h8, and h6/h7-h11, while retaining the motion between h8/h9-h11. there is also a gain in long-range correlated motions between h2 and h13. 
thus, the lack of correlation in the motion of the helical domains in the nsp6-dextromethorphan complex suggests a loss of contacts among these helices, which might cause greater accessibility of the hydrophobic core to solvent molecules and, subsequently, destabilization of the protein. to examine the role of frustration in binding, we computed the residual frustration in the protein by using the protein frustratometer server [51] . the frustration indices calculated through single residual frustration analysis indicate that sars-cov-2 nsp6 is more frustrated than a typical globular protein. using the cutoff values of the frustration indices, we found that the nsp6 protein is a highly frustrated molecule, with almost 27% of its residues highly frustrated (compared with 10% observed in a typical protein) and 36% of its residues minimally frustrated (40% in general). the frustration values range between − 2.9 and + 1.4. here, the minimally frustrated residues were found in the helices h3, h5, h6, h10, and h11, whereas the highly frustrated residues were largely scattered and seen mainly in h5 and the c-terminal h13 and h14 ( figure s7a-c) . furthermore, we also monitored the frustration changes in nsp6 upon drug binding, and the frustration profile showed some significant differences in the protein-drug complex structures, as shown in the supporting information table s1 . as can be seen, dextromethorphan binding increased the local destabilization of the protein by increasing the fraction of highly frustrated residues (29.3%), while the fraction of minimally frustrated residues (35.8%) remained similar to that of apo nsp6. haloperidol binding also increased the frustration to a small extent (28%) while keeping the minimal frustration unchanged (table s1 ). drug binding thus increases the number of highly frustrated residues and thereby the local frustration and flexibility of the protein, especially in the case of dextromethorphan binding. 
moreover, the residues involved in drug binding were predominantly neutral to minimally frustrated. however, some of the dextromethorphan-binding residues, like ser32 and leu276, gain frustration upon binding, whereas the frustration of tyr132, tyr136, gly177, arg236, and lys270 decreased upon binding. also, haloperidol binding increases the frustration of leu231, leu237, and thr238, and at the same time decreases the frustration of residues like tyr136, phe184, and phe269 (table s1) . moreover, to find out how drug binding induced the local frustration changes, we performed a comparative analysis of the spatial distribution of local frustrations mapped onto the secondary structure of the protein (fig. 9 ).

fig. 9 the changes in high-frustration values in the nsp6-drug complexes. the secondary structural regions are represented as h (α-helix) and s (β-strand)

fig. 10 the structural snapshots of the nsp6-drug complexes observed during the md simulation (0 ns, 25 ns, 50 ns, 100 ns) for the most abundant structure of a-d nsp6-dextromethorphan and e-h nsp6-haloperidol

in particular, highly frustrated residues in the nsp6-dextromethorphan complex increased in the α-helices h1, h7, and h13, while the increase is seen in h11 for the nsp6-haloperidol complex (fig. 9a) . a decrease in frustration was also seen for both complexes: mainly in the c-terminal h11, h12, s1, and h14 for the nsp6-dextromethorphan complex, and in h5, h8, and s1 for nsp6-haloperidol (fig. 9a) . additionally, drug binding decreased the minimal frustration in the h5, h7, h9, and h11 helices and increased the minimal frustration in the h2 and h12 helices (fig. 9b) . thus, the coupling between the structurally rigid c-terminal helices h10 and h11 and the conformationally flexible helices h5 and h13 is important for drug binding, and the frustration index of the regions close to the binding site changes upon association. 
the all-atom md simulation shows significant differences in the tertiary structure of the nsp6-drug complexes. the structural snapshots of the protein-drug complexes were used to analyze the differences in tertiary structure caused by each drug ( figure s8 ). the nsp6-drug complexes show significant differences in h2, h5, h7, and the c-terminal regions comprising h12, h13, β1, and β2 ( figure s8 ). the haloperidol and dextromethorphan complexes have a kink in h2, h5, and h7 midway through the simulation (50 ns) that causes a change in orientation when the drug binds ( figure s8 a, c). dextromethorphan binding induces a much larger kink at h2 and h5, with a much larger deviation (rmsd = 3.59) ( figure s8c ). at the end of the simulation, the haloperidol and dextromethorphan complexes showed an increase in the twist of h12 and h13 and of both strands of the c-terminus, along with the kinks present in h2, h5, and h7 ( figure s8b, d) . similar to what was observed at 50 ns, dextromethorphan binding induces much larger disruption (rmsd = 3.24). thus, although both drugs lead to bending of the helices, dextromethorphan binding induces larger disruption of the helices and thus more destabilization of the protein. the protein-drug binding analysis shows that dextromethorphan and haloperidol interact with many key residues, some of which are shared between the two drugs. the residues interacting with each drug during the simulation were compared, and visual 2d representations are shown in figures s9 and s10. figure 2 a shows the initial pose of the dextromethorphan molecule in the md simulation, which is also the best pose from the docking study. as can be seen, the dextromethorphan molecule formed a hydrogen bond with residue lys61 and several van der waals interactions with his62, asn232, arg233, arg236, thr238, asp243, leu245, and pro282. 
figure 2b shows the haloperidol binding site in nsp6, which comprises helix h7 and h9 residues (i.e., h7: tyr132, asp134, ala136, arg137, trp140; h9: asn174, tyr175, ser176, and val178) and the c-terminal residues leu231, leu237, leu239, and ser265. the structural snapshots of the drug-protein complexes are shown in fig. 10. during the simulation, dextromethorphan drifted to different locations inside nsp6 over ~20-100 ns. in the first ~20 ns, dextromethorphan stayed in its initial location (fig. 10a). after that, it drifted away from its initial location and moved into the water above the binding pocket, where its interactions became minimal, involving only tyr234, phe235, and leu276 (fig. 10b). without drifting further into the water, the dextromethorphan molecule re-entered the enlarged binding pocket, where it interacts mainly with h12 residues (phe225, leu230, arg233, tyr234, and arg236) through hydrophobic interactions (fig. 10c). during the second half of the md simulation (~50-100 ns), the dextromethorphan molecule remained stably at a new binding pose with much stronger interactions. at this new location, dextromethorphan interacts with ser32 through h-bonds and forms hydrophobic interactions with trp31, ile189, val190, met192, cys193, phe200, and phe201 (h9-h10) (fig. 10d). in contrast, haloperidol was stably bound inside the nsp6 pocket and explored two binding poses during the simulation. for the first 50 ns, it remained at its initial docking pose, where it forms h-bonds to ser176 and ser265 and several hydrophobic interactions mediated by arg137, trp140, asn174, tyr175, val178, and thr238 (fig. 10e, f). after that, it drifted to another binding site and remained stable there until the end of the simulation.
at this new site, haloperidol interacts mainly with h7-h9 residues (trp140, asn144, trp165, ala166, ile169, ser170, ser176, val178, and phe184), leu237, leu239, and ile266 and phe269 (h14) (fig. 10g, h). to better understand the mechanism underlying the biased activity of sars-cov-2 in the presence of two sigma-r1 drugs that bind nsp6, molecular dynamics simulation studies were employed. our data suggest that dextromethorphan binds to the c-terminal helices, while the haloperidol binding sites are in the middle of the protein domain, in helices h7 and h9. the disruption of α-helices h7 and h8 is significant for the dextromethorphan complex, along with a large kink in h5 and h7. the nsp6-haloperidol complex showed a less significant change in tertiary structure despite major disruption of the h12 helix. furthermore, the analyses of rmsd, rmsf, pca, fel, and the dynamic cross-correlation matrix (dccm) indicated that dextromethorphan binding leads to destabilization of the protein with the loss of correlated motions and residual frustrations. in contrast, haloperidol binding brings milder alterations in these order parameters and thus showed minimal changes in stability compared with the dextromethorphan system. in addition, intermolecular hydrogen bonds were consistently formed with haloperidol at high occupancy, indicating greater stabilization of the nsp6-haloperidol system. in conclusion, the study elucidated the detailed interaction mechanism of dextromethorphan and haloperidol with the nsp6 protein and the associated structural and dynamical changes upon drug binding. these results will significantly enhance our understanding of the working mode of these drugs at the molecular and structural level and will contribute to future rational drug design for covid-19.
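hydrogen-bond occupancy, mentioned above for the haloperidol system, is usually reported as the fraction of trajectory frames in which a given bond satisfies a geometric criterion. the sketch below illustrates that bookkeeping under stated assumptions: the cutoffs (donor-acceptor distance of at most 0.35 nm and an angle of at most 30 degrees) are commonly used defaults rather than values given in this section, and the per-frame measurements are invented for illustration.

```python
def is_hbond(distance_nm, angle_deg, max_distance_nm=0.35, max_angle_deg=30.0):
    """Geometric hydrogen-bond test; the cutoff values are assumed defaults."""
    return distance_nm <= max_distance_nm and angle_deg <= max_angle_deg


def hbond_occupancy(frames, max_distance_nm=0.35, max_angle_deg=30.0):
    """Fraction of frames in which one donor-acceptor pair forms a hydrogen bond.

    `frames` is an iterable of (distance_nm, angle_deg) pairs, one per frame.
    """
    frames = list(frames)
    if not frames:
        raise ValueError("no frames supplied")
    present = sum(
        1 for distance, angle in frames
        if is_hbond(distance, angle, max_distance_nm, max_angle_deg)
    )
    return present / len(frames)


# Hypothetical measurements for a single ligand-residue pair over four frames.
toy_frames = [(0.30, 20.0), (0.40, 10.0), (0.33, 35.0), (0.29, 5.0)]
print(hbond_occupancy(toy_frames))  # two of the four frames pass both cutoffs
```

a bond with an occupancy close to 1 is present in nearly every frame, which is the sense in which a high occupancy is read as a sign of a stably bound ligand.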
author contributions v.k. designed the study. p.p. and k.p.
performed the experiments and calculations. v.k. and a.p. analyzed the data. p.p. and k.p. prepared figures of the results. v.k. wrote the manuscript with the contributions of k.p. and a.p. conflict of interest the authors declare that they have no conflict of interest. key: cord-269531-7gy4epzo authors: kumar, pankaj; van den hurk, jan; ayalew, lisanework e.; gaba, amit; tikoo, suresh k. title: proteomic analysis of purified turkey adenovirus 3 virions date: 2015-07-09 journal: vet res doi: 10.1186/s13567-015-0214-z sha: doc_id: 269531 cord_uid: 7gy4epzo turkey adenovirus 3 (tadv-3) causes high mortality and significant economic losses to the turkey industry. however, little is known about the molecular determinants required for viral replication and pathogenesis. moreover, tadv-3 does not grow well in cell culture, thus detailed structural studies of the infectious particle are particularly challenging. to develop a better understanding of virus-host interactions, we performed a comprehensive proteomic analysis of proteinase k treated purified tadv-3 virions isolated from spleens of infected turkeys, by utilizing one-dimensional liquid chromatography mass spectrometry. our analysis resulted in the identification of 13 viral proteins associated with tadv-3 virions including a novel uncharacterized tav3gp04 protein. further, we detected 18 host proteins in purified virions, many of which are involved in cell-to-cell spread, cytoskeleton dynamics and virus replication. notably, seven of these host proteins have not yet been reported to be present in any other purified virus. in addition, five of these proteins are known antiviral host restriction factors. the availability of reagents allowed us to identify two cellular proteins (collagen alpha-1 (vi) chain and haemoglobin) in the purified tadv-3 preparations. these results represent the first comprehensive proteomic profile of tadv-3 and may provide information for illustrating tadv-3 replication and pathogenesis.
electronic supplementary material: the online version of this article (doi:10.1186/s13567-015-0214-z) contains supplementary material, which is available to authorized users. hemorrhagic enteritis (he) is an economically important disease of turkeys characterized by depression, splenic enlargement, intestinal haemorrhages and sudden death [1]. the disease is caused by turkey adenovirus 3 (tadv-3), also known as hemorrhagic enteritis virus (hev), a member of genus siadenovirus a [2]. oral infection of susceptible turkeys with pathogenic tadv-3 strains results in well-characterized splenomegaly and intestinal bleeding in 4 to 6 days, causing subclinical infections and mortality [3]. although tadv-3 remains one of the most important causes of economic loss to the turkey industry, critical molecular determinants of virulence and factors affecting virus replication are not well understood. this may be in part because of the unavailability of an efficient in vitro tissue culture system for propagation of tadv-3 [4-6]. the genome of tadv-3 is 26,263 bp [7]. although the genomic organization of the central block of genus-common genes [8] in tadv-3 appears similar to that of other adenovirus genomes [7], the left (e1) and right (e4) terminal regions appear absent. interestingly, tadv-3 encodes a genus specific protein, which shows similarity to bacterial sialidase protein [8]. although western blot analysis of purified tadv-3 particles isolated from crude spleen extract revealed the presence of eleven structural polypeptides with apparent molecular weights ranging from 9.5 to 96 kda [9], no systematic study has been performed to identify the precise protein composition of purified tadv-3 particles. in recent years, mass spectrometry (ms) based proteomic characterization has revealed important insights into viral replication, tropism and virulence for a number of different enveloped viruses [10-14].
in contrast, only a few proteomic studies have been reported for nonenveloped viruses [15-18]. additionally, there is now compelling evidence suggesting that host cellular proteins incorporated in the virions play an important role in viral replication and pathogenesis [10, 13, 19, 20]. using ms based approaches, a number of host proteins have been reported to be incorporated into rna viruses (human immunodeficiency virus-1 [10, 13]; simian immunodeficiency virus [21]; respiratory syncytial virus [22]; hepatitis c virus [23]; swine hepatitis e virus [24]; coronavirus [25] and influenza [20, 26]) or dna viruses (herpes simplex virus 1 [27]; african swine fever virus [28]; kshv [29]; marek's disease virus (mdv) [30], and mimivirus [31]). however, to the best of our knowledge, characterization of the host cellular factors integrated into virions for any member of the adenoviridae family, including tadv-3, has not been reported so far. here, we report the protein composition of the purified tadv-3 particles by performing a comprehensive proteomic analysis utilizing liquid chromatography-mass spectrometry (lc-ms/ms). our analysis resulted in successful identification of 13 viral structural proteins and 18 host-incorporated proteins. moreover, incorporation of two host proteins in purified virions was verified by western blot analysis using available immunological reagents. all turkey procedures were approved by university committee of animal care and supply (protocol # 19940211) at the university of saskatchewan, saskatoon, canada according to guidelines set by the canadian council of animal care. day-old hybrid poults obtained from chinook belt hatcheries, calgary, canada were housed in isolation rooms throughout the experiments. the avirulent tadv-3 isolate (pheasant origin) was passaged in seronegative turkeys by oral inoculation and purified from crude spleen extracts, as described earlier [32].
the tadv-3 virions were purified as previously described [9]. the proteinase k (pk) treatment of purified tadv-3 virions was performed as described previously [33]. briefly, double cscl-purified virions were incubated in 1 ml of mnt buffer (30 mm morpholineethanesulfonic acid [mes], 10 mm nacl, and 20 mm tris-hcl [ph 7.4]) containing proteinase k [0 to 20 μg] (roche, mannheim, germany) for 45 min at room temperature and subsequently treated with 2 mm phenylmethylsulfonyl fluoride (roche) prior to purification by cscl density gradient centrifugation. purified virions were resuspended in 10% glycerol and stored at −80°c until further use. the experiments were performed in triplicate employing three independent virus preparations. electron microscopy was performed on cscl gradient purified tadv-3 virions (proteinase k treated or untreated) at the em facility of the biology department, university of victoria, bc, canada, as described [34]. briefly, for negatively stained preparations, cscl gradient purified virus was first applied onto carbon and formvar coated grids, washed with h2o and stained with 2% aqueous phosphotungstic acid. the specimens were photographed using a charge-coupled device camera (advanced microscopy techniques, amt ccd camera equipped hitachi h7000 tem operating at 75 kv). production and characterization of anti-tadv-3 serum and monoclonal antibodies (mabs) recognizing tadv-3 hexon (15g4) and fiber (87-03) proteins have been described earlier [4, 9]. chicken polyclonal anti-human hemoglobin serum (ab28961) was purchased from abcam (cambridge, ma, usa). rabbit polyclonal anti-human collagen type vi alpha-1 serum (col6a1) was purchased from antibodies-online inc. (atlanta, ga, usa). alkaline phosphatase conjugated goat anti-rabbit (sigma aldrich) and peroxidase-conjugated goat anti-turkey igg (kpl, maryland, usa) were used as described [4, 9].
proteins from purified tadv-3 were separated by sodium dodecyl sulphate (sds) polyacrylamide gel electrophoresis (page) on 10-15% or 4-15% precast gradient gels (bio-rad), transferred to nitrocellulose membrane and probed with protein specific antibodies as described previously [9]. proteins from cscl gradient purified virion-enriched (proteinase k treated or untreated) samples were diluted with 200 mm ammonium bicarbonate prior to reduction with 200 mm dithiothreitol and incubated for 30 min at 37°c. cysteine sulfhydryl groups were alkylated with 20 μl of 100 mm iodoacetamide (30 min at 37°c in darkness). each sample was digested with 5 μg of trypsin (promega) at 37°c for 16 h [33, 35]. finally, the samples were de-salted on a waters hlb oasis column, speed vac concentrated and stored at −80°c prior to lc-ms analysis. the peptide mixtures were separated by on-line reverse phase chromatography using an easy-nlc ii system (thermo scientific) with a reversed-phase magic c-18aq pre-column (100 μm i.d., 2 cm length, 5 μm, 100 å, michrom bio resources inc, auburn, ca, usa) and a reversed-phase nano-analytical column magic c-18aq (75 μm i.d., 15 cm length, 5 μm, 100 å, michrom bio resources inc, auburn, ca, usa) at a flow rate of 300 nl/min. the resulting peptides were analyzed by the chromatography system, which was coupled on-line with an ltq orbitrap velos mass spectrometer (thermo fisher scientific, bremen, germany) equipped with a nano-spray flex source (thermo fisher scientific) as described previously [33, 35]. the data were acquired with keratin and trypsin peptide mass exclusion lists. raw files were analysed with the proteome discoverer 1.4 software suite (thermo scientific). parameters for the spectrum selection to generate peak lists of the collision-induced dissociation (cid) spectra were: activation type, cid; s/n cut-off, 1.5; total intensity threshold, 0; minimum peak count, 1; precursor mass, 350-5000 da.
the peak lists were submitted to an in-house mascot 2.3 server and searched against the following databases: uniprot_trembl 20111103 (17 651 715 sequences; 5 747 683 275 residues) and uniprot-swissprot 20110104 (523 151 sequences; 184 678 199 residues), all species taxonomy. database search parameters were as follows: precursor tolerance, 8 ppm; ms/ms tolerance, 0.6 da; enzyme, trypsin with 1 missed cleavage; instrument type, fourier transform ion cyclotron resonance (ft-icr); fixed modification, carbamidomethylation (c); variable modifications, deamidation (n,q) and oxidation (m). the decoy database percolator settings were: max delta cn 0.05; target fdr strict 0.01; target fdr relaxed 0.05, with validation based on q-value. additional virus only species searches were also performed with the tolerances previously mentioned. all data were also searched against the ncbi (gallus gallus (chicken)) database to detect viral and host proteins. only sequences identified with a mascot score value greater than 30 were considered significant. protein identifications were accepted when the peptide probability was greater than 95.0% [33, 35], the protein probability was greater than 99.0%, and the protein contained at least 2 identified peptides. peptide identifications were systematically evaluated manually. due to the difficulty in propagating turkey adenovirus 3 in a cell culture system, tadv-3 was propagated in 6 to 8 week old turkeys. tadv-3 virions were purified from spleens of turkeys inoculated orally with an avirulent vaccine strain of tadv-3 [4, 9] (figure 1a). following cscl density gradient purification, two distinct bands were observed, the upper band (present at lower density) containing capsid and the lower band (at higher density, between 1.25 and 1.35) containing complete infectious viruses (figure 1b, left panel). the lower band was subjected to a second round of cscl density gradient purification, resulting in a single band containing purified virions (figure 1b, right panel).
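the search tolerance and acceptance criteria described above (8 ppm precursor tolerance, mascot score above 30, peptide probability above 95%, protein probability above 99%, at least 2 identified peptides) can be expressed as simple checks. the sketch below is one possible reading of those criteria and not the authors' pipeline: the function names are hypothetical, the mascot score is treated as a single protein-level value, and a peptide is counted only if its own probability exceeds 0.95.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Signed mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_precursor_tolerance(observed_mass, theoretical_mass, tol_ppm=8.0):
    """True if the precursor mass error lies within the 8 ppm search tolerance."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tol_ppm


def accept_protein(mascot_score, peptide_probabilities, protein_probability):
    """Apply the stated thresholds to one candidate protein identification.

    Assumption: peptides at or below 0.95 probability are ignored rather than
    disqualifying the whole protein.
    """
    confident_peptides = [p for p in peptide_probabilities if p > 0.95]
    return (
        mascot_score > 30
        and protein_probability > 0.99
        and len(confident_peptides) >= 2
    )


# Hypothetical values, for illustration only.
print(within_precursor_tolerance(1000.005, 1000.0))  # 5 ppm error
print(accept_protein(45.0, [0.99, 0.97, 0.60], 0.995))
```

the same two-step pattern (mass-window filter, then score and probability thresholds) applies regardless of the search engine; only the threshold values are taken from the text.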
virion-enriched preparations were checked for quality by negative stain transmission electron microscopy (tem) (figures 1c and d). as seen, the preparations contained uniform, intact tadv-3 virus particles of 100 nm diameter. these tem results were consistent with the quality and apparent purity reported earlier [33, 35]. the purity of the virion preparation was also determined by western blot analysis using turkey anti-tadv-3 sera. as seen in figure 1e, polypeptides of 96 k (hexon), 57 k (iiia), 52 k (penton base), 29 k (fiber) and 24 k (pvi) were detected in cscl purified tadv-3 virions. these findings suggest that our enrichment procedure yielded a highly purified preparation of tadv-3 virions. the protein composition of tadv-3 virions was analyzed by in-solution trypsin digestion, a gel-free approach to ms that subjects the entire sample to sequential one-dimensional reversed-phase chromatography coupled on-line to ms/ms analysis (1d-nanospray-lc-ms/ms). this method eliminates the problems reported with proteins that either enter the gel poorly or are extracted inefficiently from the gel slices. our lc-ms/ms analysis revealed a total of 15 virus-encoded proteins packaged in the purified tadv-3 virions. these included 13 proteins that have been detected in human adenovirus 5 (hadv-5) virions [16] (table 1), a novel uncharacterized hypothetical viral protein designated tav3gp04 (table 1, additional file 1) and a non-structural viral protein (22 k) found to be associated with tadv-3 virions. in addition to tadv-3 encoded viral proteins, 26 cellular proteins appeared to be associated with purified tadv-3 virions (table 2). to determine whether the host proteins are actually incorporated into the virions, the purified tadv-3 virions were treated with proteinase k (20 μg/ml) and subjected to another (third) round of cscl purification. the proteinase k treated and untreated purified virions were then analysed by western blotting.
proteinase k treatment degrades the fiber protein, which protrudes from the capsid, but does not degrade the hexon protein, which does not protrude. as seen in figure 2a, hexon protein could be detected in both proteinase k treated and untreated tadv-3 virions. in contrast, fiber protein could be detected only in untreated virions and not in proteinase k treated virions. moreover, tem analysis suggested that the virions remained intact after proteinase k treatment and cscl density gradient purification (figure 2b). the lc-ms/ms analysis of proteinase k treated, cscl density gradient purified tadv-3 virions identified eleven virus-encoded proteins (hexon, pvi, pvii, penton base, pviii, sialidase, iiia, adenain, px, iva2 and dbp) previously reported to be in other adenoviruses (table 3 and figure 3a) [16]. in addition, the novel viral protein tav3gp04 remains an integral part of proteinase k treated tadv-3 virions (table 3, additional file 1). as expected, peptides representing fiber protein were not detected in proteinase k treated tadv-3 virions. in addition, ptp and 22 k virion proteins were not detected in proteinase k treated tadv-3 (table 3). the high mascot scores and number of peptides observed for hexon, pvi and pvii presumably reflect the fact that they are the most abundant proteins in the tadv-3 particles. interestingly, only 18 host proteins were detected in proteinase k treated tadv-3 virions (table 4 and figure 3b). notably, thirteen of these host proteins were the same as those detected in the untreated tadv-3 virions (table 4, figure 3b), indicating that these proteins are part of the tadv-3 virions. among these proteins, promyelocytic leukemia protein (pml) isoform x6 (additional file 2), collagen alpha-1(vi) chain (additional file 3), haemoglobin subunit alpha (additional file 4) and haemoglobin subunit beta (additional file 5) appeared abundant.
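the host-protein counts reported in the text (26 in untreated virions, 18 in proteinase k treated virions, 13 found in both) imply how many proteins are unique to each preparation. the short sketch below makes that bookkeeping explicit with set operations. the protein labels are placeholders generated to match the reported counts, not the identifiers listed in tables 2 and 4, and the function name is hypothetical.

```python
def partition_detections(untreated, treated):
    """Split two detection lists into shared and preparation-specific sets."""
    untreated, treated = set(untreated), set(treated)
    return {
        "both": untreated & treated,
        "treated_only": treated - untreated,
        "untreated_only": untreated - treated,
    }


# Placeholder labels sized to the counts given in the text.
shared = {f"shared_{i:02d}" for i in range(1, 14)}                        # 13
untreated = shared | {f"untreated_only_{i:02d}" for i in range(1, 14)}     # 26
treated = shared | {f"treated_only_{i:02d}" for i in range(1, 6)}          # 18

groups = partition_detections(untreated, treated)
for name, members in groups.items():
    print(name, len(members))
```

under these counts, 5 proteins are specific to the treated preparation and 13 to the untreated one, which is the arithmetic behind the comparison of the two preparations.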
the pml protein appears as abundant as the viral structural protein pviii or the penton base peptide. in addition, five host proteins, namely vitronectin, collagen alpha-3 (vi) chain, collagen alpha-2 (vi) chain, tyrosine protein phosphatase and turkey heterophil peptide 2 (thp-2), were detected only in proteinase k treated tadv-3 virions. functional classification of the identified proteins revealed that many of these proteins participate in a common molecular pathway (table 4 and figure 3c) and are involved in innate immunity, cell adhesion, cytoskeleton organization and virus replication. the non-availability of turkey host protein specific antisera made it difficult to verify the packaging of host proteins in tadv-3 virions. however, human collagen alpha-1(vi) peptides showed 70% identity to turkey collagen alpha-1(vi) and chicken collagen alpha-1(vi) (additional file 6). in addition, human haemoglobin peptides demonstrated 75% identity to turkey haemoglobin alpha and chicken haemoglobin alpha, 50% identity to turkey haemoglobin beta and 66% identity to chicken haemoglobin beta proteins (additional file 7). therefore, we attempted to determine the incorporation of collagen alpha-1(vi) and haemoglobin in purified tadv-3 using western blot assays. as shown in figure 4, anti-collagen alpha-1 (vi) serum detected a collagen alpha-1 (vi) chain specific band in proteinase k untreated tadv-3 (panel a, lane 1). a similar protein band could be detected in proteinase k treated purified tadv-3 (panel a, lane 2). anti-haemoglobin serum detected a haemoglobin specific band in proteinase k untreated tadv-3 (panel b, lane 1). a similar protein band could be detected in proteinase k treated purified tadv-3 (panel b, lane 2). viruses exploit multiple host proteins for successful entry, establishment of infection, replication, and immune evasion.
for a better understanding of the tadv-3-host interactions, we performed a comprehensive analysis of the protein content of tadv-3 virions, using an lc-ms/ms based proteomic approach. to the best of our knowledge, incorporation of host proteins in adenovirus has not been reported so far. the proteomic analysis of cscl purified tadv-3 identified a total of 13 virion proteins and 18 host proteins. earlier proteomic analyses have not reported the detection of host proteins in purified hadv-5 virions [15, 16]. it is possible that the host proteins identified by proteomic analysis of cscl purified tadv-3 virions are not actually incorporated in the purified virions but are loosely associated with the outside of the tadv-3 virion capsids. since proteinase k treatment has been traditionally used to remove any contaminating protein from the surface of enveloped viruses [33, 35], we used protease treatment of non-enveloped tadv-3 to remove the potential contaminating proteins. several lines of evidence validate the approach and suggest that proteinase k treatment of tadv-3 was successful in removing contaminating proteins. 1) intact virions could be detected by tem after proteinase k treatment of tadv-3. 2) western blot analysis of proteinase k treated tadv-3 detected hexon protein but not fiber protein (protruding from the capsid). 3) the fiber and 22 k (non-structural protein) could not be detected by ms analysis of proteinase k treated tadv-3. 4) only 18 of the 26 host proteins could be identified in proteinase k treated tadv-3. interestingly, all major viral proteins were identified in proteinase k treated virions (table 3) except viral ptp, possibly due to its low abundance and the lowest mascot score values observed (table 1). overall sequence coverage observed for different viral peptides ranged from 2 to 70%, with the majority between 10 and 35%.
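sequence coverage, as reported above, is the percentage of residues in a protein that are covered by at least one identified peptide. a minimal sketch of that calculation is given below, assuming exact string matching of peptides against the protein sequence. the toy sequence and peptides are invented for illustration and do not correspond to any tadv-3 protein.

```python
def sequence_coverage(protein_sequence, peptides):
    """Percentage of residues covered by at least one matching peptide."""
    if not protein_sequence:
        raise ValueError("protein sequence must not be empty")
    covered = [False] * len(protein_sequence)
    for peptide in peptides:
        if not peptide:
            continue
        # Mark every occurrence of the peptide, including overlapping ones.
        start = protein_sequence.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein_sequence.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein_sequence)


# Hypothetical 22-residue protein with two non-overlapping peptides (6 + 5 residues).
toy_protein = "MKTAYIAKQRQISFVKSHFSRQ"
toy_peptides = ["TAYIAK", "ISFVK"]
print(sequence_coverage(toy_protein, toy_peptides))
```

because overlapping peptides mark the same residues only once, coverage cannot exceed 100% however many peptides are identified for a protein.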
earlier, sequence analysis of turkey adenovirus-3 identified a hypothetical protein orf4 (named tav3gp04) [7], which appears to be conserved in raptor adenovirus-1 [36] and south polar skua adenovirus [37]. in contrast, a hypothetical hydrophobic protein identified in frog adenovirus 1 [8] shows no similarity to the proteins identified in turkey adenovirus 3 [7] and raptor adenovirus 1 [36]. our results suggest that orf4 of tadv-3 encodes a structural protein, tav3gp04, which is incorporated into the virion capsid (additional file 1). in addition, this is the first report to suggest the existence of tav3gp04 as a structural protein in siadenoviruses, particularly those of avian origin. the proteomic analysis of proteinase k treated purified virions identified eleven cellular proteins incorporated in tadv-3 that have been identified in other viruses (table 4). in addition, proteomic analysis identified seven host proteins incorporated in tadv-3 virions (table 4) that have not been identified so far in any other virus. interestingly, of the 18 detected host proteins, five were detected only in proteinase k treated tadv-3. it is possible that, in virions not treated with proteinase k, highly abundant non-specific proteins masked the detection of these proteins, which are truly virion associated but present in low copy numbers. though earlier reports have demonstrated the packaging of viral [38] or non-viral rnas [39] into purified adenovirus, recent reports have not described the detection of any cellular protein in purified lizard adenovirus-2 [40], a member of the atadenovirus genus, or in purified hadv-5, a prototype of the mastadenovirus genus [15]. the absence of cellular proteins packaged in purified adenovirus virions could be due to a variety of reasons. as stated, the difference could be due to the technique used for analysis [15].
alternatively, it is possible that packaging of the cellular proteins may be dependent on the type of adenovirus (tadv-3, a prototype of the siadenovirus genus) and the origin of the cells used for virus cultivation [15]. the host proteins packaged in tadv-3 are known to play important roles in enhancing the cell-to-cell spread of virus, transcription and virus replication (table 4, figure 3). for example, extracellular matrix (collagen) has been shown to increase infectious sindbis virus titers from bhk cells by enhancing post-infection cell survival [41]. in another study, rotavirus-induced pi3k activation resulted in prolonged adherence of infected cells to collagen and increased virus production [42]. similarly, the extracellular matrix protein vitronectin has been reported to enhance the growth of human adenovirus 19 (hadv-19) [43]. however, the incorporation of antiviral host defense factors, including protein pml, haemoglobin and the antimicrobial peptide thp-2, into tadv-3 virions is particularly intriguing. all of these host defence factors have been implicated in establishing antiviral environments. recent studies have implicated pml in maintaining host antiviral defence and revealed different strategies developed by viruses to disrupt pml nuclear bodies [44-46]. in addition, protein pml has been shown to be important for the inhibition of adenovirus replication [47]. similarly, the avian antimicrobial peptide thp-2, a member of the beta-defensin family, is an effector of the innate defence system and plays key roles during host defence by generating a vigorous cytokine response [48, 49]. on the other hand, a novel role of haemoglobin in innate immunity has recently been reported for classical swine fever virus (csfv) [50]: silencing of haemoglobin expression using sirna promoted csfv growth and replication, whereas overexpression of haemoglobin antagonized csfv replication and growth by triggering ifn signalling [50].
although tadv-3 grows efficiently in the spleen of infected turkeys, the virus grows poorly in primary or established cell lines. it is tempting to speculate that integration of certain established antiviral host restriction factors into viral particles may play a role in determining tadv-3 replication "in vitro". additional studies need to be performed in order to investigate whether these proteins are functionally required for virus entry, replication and pathogenesis. future availability of reagents and a reliable cell culture system to grow tadv-3 should make it possible to determine the role of individual host restriction factors in tadv-3 replication.
[1] comparison of 12 turkey hemorrhagic enteritis virus isolates allows prediction of genetic factors affecting virulence
[2] ictv at paris icv: results of the plenary session and the binomial ballot
[3] avian adenoviruses
[4] characterization of group ii avian adenoviruses with a panel of monoclonal antibodies
[5] evaluation of cell culture propagated and in vivo propagated hemorrhagic enteritis vaccines in turkeys
[6] development and application of quantitative real-time pcr for the rapid detection of hemorrhagic enteritis virus in tissue samples
[7] the complete dna sequence and genome organization of the avian adenovirus, hemorrhagic enteritis virus
[8] genetic content and evolution of adenoviruses
[9] characterization of the structural proteins of hemorrhagic enteritis virus
[10] proteomic and biochemical analysis of purified human immunodeficiency virus type 1 produced from infected monocyte-derived macrophages
[11] alteration of protein levels during influenza virus h1n1 infection in host cells: a proteomic survey of host and virus reveals differential dynamics
[12] proteomic changes in hek-293 cells induced by hepatitis delta virus replication
[13] proteomic analysis of human immunodeficiency virus using liquid chromatography/tandem mass spectrometry effectively distinguishes specific incorporated host proteins
[14] proteomics of herpes simplex virus replication compartments: association of cellular dna replication, repair, recombination, and chromatin remodeling proteins with icp8
[15] analysis of purified wild type and mutant adenovirus particles by silac based quantitative proteomics
[16] analysis of the adenovirus type 5 proteome by liquid chromatography and tandem mass spectrometry methods
[17] proteomics strategies to analyze hpv-transformed cells: relevance to cervical cancer
[18] proteome analysis of adenovirus using mass spectrometry
[19] plunder and stowaways: incorporation of cellular proteins by enveloped viruses
[20] cellular proteins in influenza virus particles
[21] plasma proteomic analysis of simian immunodeficiency virus infection of rhesus macaques
[22] quantitative proteomic analysis of a549 cells infected with human respiratory syncytial virus
[23] ferritin heavy chain is the host factor responsible for hcv-induced inhibition of apob-100 production and is required for efficient viral infection
[24] proteomic analysis of swine hepatitis e virus (shev)-infected livers reveals upregulation of apolipoprotein and down-regulation of ferritin heavy chain
[25] proteomics analysis unravels the functional repertoire of coronavirus nonstructural protein 3
[26] quantitative proteomic analyses of influenza virus-infected cultured human lung cells
[27] hsv-1 cgal+ infection promotes quaking rna binding protein production and induces nuclear-cytoplasmic shuttling of quaking i-5 isoform in human hepatoma cells
[28] two-dimensional analysis of african swine fever virus proteins and proteins induced in infected cells
[29] proteomic analysis of the kaposi's sarcoma-associated herpesvirus terminal repeat element binding proteins
[30] a mass spectrometry-based proteomic approach to study marek's disease virus gene expression
[31] mimivirus giant particles incorporate a large fraction of anonymous and unique gene products
[32] propagation of group ii avian adenoviruses in turkey and chicken leukocytes
[33] proteomic characterization of pseudorabies virus extracellular virions
[34] embedding in epoxy resins for ultrathin sectioning in electron microscopy
[35] proteomic characterization of bovine herpesvirus 4 extracellular virions
[36] complete sequence of raptor adenovirus 1 confirms the characteristic genome organization of siadenovirus
[37] full genome analysis of a novel adenovirus from the south polar skua (catharacta maccormicki) in antarctica
[38] viral rnas detected in virions of porcine adenovirus type 3
[39] presence of prepackaged mrna in virions of dna adenovirus
[40] molecular characterization of a lizard adenovirus reveals the first atadenovirus with two fiber genes and the first adenovirus with either one short or three long fibers per penton
[41] effects of collagen matrix on sindbis virus infection of bhk cells
[42] rotavirus replication in intestinal cells differentially regulates integrin expression by a phosphatidylinositol 3-kinase-dependent pathway, resulting in increased cell adhesion and virus yield
[43] vitronectin: a possible determinant of adenovirus type 19 tropism for human corneal epithelium
[44] effects of promyelocytic leukemia protein on virus-host balance
[45] rabies virus p and small p products interact directly with pml and reorganize pml nuclear bodies
[46] serum-dependent expression of promyelocytic leukemia protein suppresses propagation of influenza virus
[47] adenovirus e4 orf3 protein inhibits the interferon-mediated antiviral response
[48] human beta-defensin 2 induces a vigorous cytokine response in peripheral blood mononuclear cells
[49] antimicrobial activity of chicken and turkey heterophil peptides chp1, chp2, thp1, and thp3
[50] hemoglobin subunit beta interacts with the capsid protein and antagonizes the growth of classical swine fever virus
the authors thank other
members of tikoo's laboratory for helpful suggestions. published as vido-intervac article # 725. the work was supported by grants from saskatchewan agriculture development fund, alberta livestock and meat agency, saskatchewan chicken industry development fund, canadian poultry research council and agriculture and agri-food canada. the authors declare that they have no competing interests. conceived and designed the experiments: pk, skt. performed the experiments: pk, jv, ag, lea. analyzed the data: pk, jv, skt, ag. wrote the manuscript: pk, lea, skt. all authors read and approved the final manuscript. key: cord-274080-884x48on authors: rumlová, michaela; ruml, tomáš title: in vitro methods for testing antiviral drugs date: 2018-06-30 journal: biotechnology advances doi: 10.1016/j.biotechadv.2017.12.016 sha: doc_id: 274080 cord_uid: 884x48on abstract despite successful vaccination programs and effective treatments for some viral infections, humans are still losing the battle with viruses. persisting human pandemics, emerging and re-emerging viruses, and the evolution of drug-resistant strains necessitate a continuous search for new antiviral drugs. a combination of detailed information about the molecular organization of viruses and progress in molecular biology and computer technologies has enabled rational antiviral design. the initial step in establishing the efficacy of new antivirals is based on simple methods assessing inhibition of the intended target. we provide here an overview of biochemical and cell-based assays evaluating the activity of inhibitors of clinically important viruses. viral pandemics remain serious threats to humankind. every year, known viruses such as hiv-1 and hepatitis b virus newly infect millions of people across the globe. in addition, recent outbreaks of ebola virus, influenza virus, severe acute respiratory syndrome (sars) coronavirus and middle east respiratory syndrome-coronavirus (mers-cov) serve as a reminder of the silent danger.
the world health organization reported in 2015 that over 1.34 million casualties per year are connected with hepatitis b and almost 500,000 with hepatitis c. occasional epidemic outbreaks of other viruses are also striking. these include outbreaks of noroviruses, flaviviruses (zika and dengue viruses), and new strains of influenza viruses, as well as the re-emergence of west nile virus in italy and the united states. despite relatively long pauses, viruses such as influenza virus can re-emerge and cause global pandemic health problems. unlike cellular genomes, which consist of double-stranded (ds) dna, viral genomes can be formed by a broad variety of nucleic acids (nas), including ds or single-stranded (ss) circular or linear dna or positive-, negative- or sometimes ambi-sense rna. as viral life cycles are dependent on cellular factors and cellular metabolic and signaling pathways, the number of possible antiviral drug targets is limited. however, almost all viruses encode unique proteins and enzymes that may serve as specific targets for antiviral therapy. the overarching goal of modern drug development efforts is to design compounds that specifically inhibit viral targets or cellular targets essential for virus replication. the purpose of this review is to provide insight into the broad variety of cell-based and biochemical assays used for identification and evaluation of antivirals, including high-throughput screening (hts) methods. an overview of available inhibitors and vaccines, which has been reviewed elsewhere [e.g. in (de clercq and li, 2016)], is beyond the scope of this paper. to package their genomes into particles of very limited size, viruses encode few genes. virions consist of an rna or dna genome that is protected by an outer shell called a capsid (also nucleocapsid) formed by a lattice of capsid proteins. in enveloped viruses, the capsid is additionally surrounded by a lipid bilayer spiked with viral proteins.
the size of animal viruses ranges from approximately 25 nm to over 300 nm (cohen et al., 2011). the capsids and virions of rna viruses adopt various shapes. viral capsids may be icosahedral (e.g. picornaviridae, astroviridae, reoviridae, togaviridae), bullet-shaped (rhabdoviridae), helical (coronaviridae), helical filamentous (filoviridae, orthomyxoviridae, paramyxoviridae), or filamentous (arenaviridae, bunyaviridae). the retroviral capsid core may be conical, spherical, or rod-shaped. the morphology of dna viruses is similarly diverse, ranging from icosahedral (enveloped: hepadnaviridae, herpesviridae; nonenveloped: adenoviridae, parvoviridae, polyomaviridae and papillomaviridae) to rod-shaped (baculoviridae) or pleomorphic (poxviridae). the viral life cycle is the process of viral replication in a host cell. first, viruses enter the host cell and replicate their genomes. following translation of viral proteins by the host cell machinery, viruses package their genomic material into protective proteinaceous capsids and exit the cell to infect another host cell. nonenveloped viruses consist only of a protein capsid shell enclosing the viral genome and enzymes, while the capsid shell of enveloped viruses is enclosed in a lipid envelope derived from the host cell membrane. regardless of whether the virus is enveloped, its surface must display (glyco)proteins suitable for specific interactions with host cell receptors. in contrast to the surface proteins of nonenveloped viruses, those of enveloped viruses usually serve another function in addition to host cell recognition and attachment. for example, they may enable fusion of the viral and cellular membranes, usually through an interaction with a secondary receptor(s) or co-receptor(s). the fusion domains of viral surface glycoproteins thus can lower the kinetic barrier for the energetically demanding membrane fusion.
the viral particle must be sufficiently stable to protect the genome until its delivery into the host cell. simultaneously, it must be metastable, to allow its disassembly and release of the genome for replication in the infected cell at the appropriate time. the energetic barrier that prevents viral disassembly outside the cell is lowered upon infection by structural changes in the viral components. these changes may be induced by binding of a cellular ligand or by changes in the environment, such as ph change upon entering a specific cellular compartment. numerous viruses preassemble immature particles that undergo irreversible (usually proteolytic) transitions into mature structures of fully infectious virions. mutual interactions of viral capsid proteins are typically different from those of cellular proteins, which predominantly create binary interactions. viral structural proteins interact with multiple neighboring partners to form multicomponent macromolecular structures [reviewed in (cheng and brooks, 2015; jayaraman et al., 2016) ]. the economy of the packaged viral genomes due to the limited capsid size implies formation of only a few types of structural proteins, which are usually symmetrically organized [reviewed in (prasad and schmid, 2012; raguram et al., 2017) ]. in the initial stage of infection, viruses must overcome several obstacles. the first is the cellular plasma membrane with an actin cortex. then, they must traffic through dense cytoplasm to reach their final destination for replication and assembly. these pathways are specific for different viruses and often are dictated by the size of the particle and its structure. the viral life cycle can be divided into several common stages, including entry, uncoating, genome replication, genome packaging and assembly, release, and maturation ( fig. 1) . 
in general, the first phase of viral infection is specific recognition of the target host cell and binding to a surface receptor displayed on the cell membrane. this process is common to both enveloped and nonenveloped viruses. in enveloped viruses, the binding is mediated by viral surface components, typically oligomers of integral glycoproteins. nonenveloped viruses bind receptors through sites or projections on the capsid surface. viruses can use either a single receptor (e.g. tim-1 for hepatitis a virus, gm1 for sv40, cd155 for poliovirus, low-density lipoprotein receptor for human rhinovirus) or multiple receptors with equivalent roles (e.g. nectin-1/2 or hvem for herpes simplex virus 1/2, ace2 or l-sign for sars coronavirus). for other viruses (e.g. hiv, hcv, adenoviruses, rotaviruses, picornaviruses, and some herpesviruses), the presence of at least two cytoplasmic membrane components is required. for example, hiv-1 binds to the primary receptor cd4 and one of two co-receptors (ccr5 or cxcr4) [reviewed in (cossart and helenius, 2014; grove and marsh, 2011)]. to infect a cell, viruses must overcome the plasma membrane to deliver their genetic material into the cytosol. viruses enter cells by two main mechanisms. a majority of animal viruses, both enveloped and nonenveloped, enter cells by one or two types of endocytosis, such as clathrin-mediated endocytosis (e.g. vsv, influenza a virus, rhinovirus), caveolin-mediated uptake (echovirus, polyoma virus), clathrin/caveolin-independent endocytosis such as caveolar or lipid raft-mediated uptake (e.g. sv40, mouse polyomavirus), or macropinocytosis (e.g. vaccinia virus, respiratory syncytial virus, ebola virus, hiv-1) (blaas, 2016; cossart and helenius, 2014; fields and knipe, 2013; kirkham and parton, 2005; marechal et al., 2001; mayor and pagano, 2007; mercer and helenius, 2009; parton and simons, 2007; rasmussen and vilhardt, 2015; saeed et al., 2010).
these endocytic mechanisms enable the virus to be transported by the cell's machinery through the plasma membrane and to pass through the dense actin cortex. cellular entry of some viruses is coupled with receptor-mediated signaling, resulting in activation of molecules that facilitate virus entry by cytoskeleton reorganization, induction of long-distance transport of the virus-containing vesicles, or actin cortex disassembly. the second type of entry is used by some enveloped viruses (paramyxo-, herpes-, and retroviruses), which upon cell surface receptor binding, infect the cell by direct fusion of viral and plasma membranes at neutral ph. upon internalization, most viruses become trapped and enclosed in vesicles that pinch off the inner side of the plasma membrane and transport their cargo into the cytoplasm. on their way through the cell, these transport vesicles undergo maturation by fusing with other vesicles. to fulfill their task of genome replication, viruses have to escape from these endosomes. the "great escape" is triggered by activation of a fusion or penetration mechanism, such as changes in conditions in the endosomal interior [e.g. ph, ionic environment (e.g. calcium ion concentration), oxido-reducing conditions], changes in membrane composition, and other physico-chemical cues (blaas, 2016; cossart and helenius, 2014; inoue and tsai, 2013) . depending on the requirements for a particular virus, these events can occur in early endosomes (ph 6.5-6.0; e.g. hepatitis c virus, vesicular stomatitis virus), late endosomes (ph 6.0-5.0; e.g. influenza a virus, dengue virus, sars coronavirus), recycling endosomes, macropinosomes, the endoplasmic reticulum, the golgi apparatus, or lysosomes (ph 5.0-4.5) (blaas, 2016; cossart and helenius, 2014; grove and marsh, 2011; inoue and tsai, 2013) . 
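the ph windows quoted above for the different endosomal compartments can be written down as a small lookup. the following is a minimal python sketch, not part of the original review: the function name `compartment_for_ph` and the tie-break at shared boundary values are assumptions made for illustration, and only the three compartments for which the text gives a ph range are covered.

```python
# pH windows as quoted in the text: (compartment, upper pH, lower pH).
ENDOSOMAL_PH_WINDOWS = [
    ("early endosome", 6.5, 6.0),
    ("late endosome", 6.0, 5.0),
    ("lysosome", 5.0, 4.5),
]


def compartment_for_ph(ph):
    """Return the first listed compartment whose pH window contains `ph`.

    Returns None when `ph` lies outside every window (for example the
    neutral pH at the cell surface). Boundary values such as 6.0 go to the
    less acidic compartment; that tie-break is an assumption of this sketch.
    """
    for name, upper, lower in ENDOSOMAL_PH_WINDOWS:
        if lower <= ph <= upper:
            return name
    return None


if __name__ == "__main__":
    for ph in (7.4, 6.2, 5.5, 4.8):
        print(ph, compartment_for_ph(ph))
```

such a table only indicates where a ph-triggered fusion or penetration step could occur; as the text notes, other cues (ionic environment, oxido-reducing conditions, membrane composition) also contribute.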
for the majority of animal viruses, the activation of these fusion or penetration mechanisms occurs through conformational changes and structural rearrangements in viral surface proteins and/or the whole virion shell that may destabilize the capsid core. structural rearrangements in enveloped viruses usually mediate fusion of viral and endosomal membranes. in nonenveloped viruses, the structural changes uncover amphipathic or hydrophobic domains that may induce pore formation or disruption of endosomal membranes. to deliver genetic material to the replication site, these mechanisms ultimately release viral capsid structures from endosomal vesicles into the cytosol either by fusing with the endosomal membrane (enveloped viruses) or by penetrating the endosomal membrane (nonenveloped viruses). uncoating is the partial or complete disassembly of the protective capsid shell and/or lipid envelope to liberate the viral genome. for many viruses, this process is closely connected with conformational changes induced by the virus binding to the cell surface receptor; a low ph environment; or changes in oxido-reducing conditions, ion concentrations, or other factors. for enveloped viruses, uncoating involves a loss of the viral membrane by fusion either with the plasma membrane or with intracellular vesicles, followed by stepwise uncoating of the protective capsid shell. in nonenveloped viruses, the uncoating process typically involves conformational changes that result in the weakening of intermolecular interactions, loss of structural proteins, proteolytic cleavage, and so on (fields and knipe, 2013) . depending on the virus, uncoating can take place at the plasma membrane, in the cytoplasm, during endocytosis in early or late endosomes, in lysosomes, in the nucleus, or at the nuclear pore complex (npc). the ssrna genome of retroviruses, which is fully enclosed inside a protective capsid shell, must be reverse transcribed into dsdna and released from the capsid. 
although it is accepted that hiv-1 uncoating is linked to reverse transcription and nuclear import (ambrose and aiken, 2014) and is controlled by host factors (e.g. cyclophilin a, trim5α), the precise molecular mechanisms that trigger the uncoating remain unknown. several hypotheses have been proposed, including breakage of the capsid shell due to increased inner pressure caused by accumulation of the reverse transcription product (rankovic et al., 2017), the requirement of intact microtubules and dynein and kinesin motors (lukic et al., 2014), and phosphorylation of capsid shell protein by the host cell kinase melk (takeuchi et al., 2017). in contrast to the majority of rna viruses, which replicate in the cytoplasm, most dna viruses (with the exception of large dna viruses including poxviridae, asfarviridae, and mimiviridae) and several negative-stranded rna viruses enter the nucleus to replicate their genomes (kobiler et al., 2012; koonin and yutin, 2010). passive diffusion into the nucleus is suitable for molecules smaller than 9 nm in diameter, but larger structures must enter through the nuclear pore complex (npc), which can accommodate molecules of up to 39 nm (pante and kann, 2002). translocation through the npc is tightly regulated and requires the presence of a nuclear localization signal (nls) on the passing molecule and nuclear import receptors (importins or karyopherins) (cautain et al., 2015). due to the diversity of viral particle sizes (from 25 nm to over 300 nm) and structures, viruses have evolved several strategies to deliver their dna or rna genome into the nucleoplasm. with regard to the npc, these pathways can be divided into npc-dependent and npc-independent mechanisms.
fig. 1. simplified scheme of common stages of the viral life cycle targeted by antiviral drugs. these stages include: 1) attachment and entry, 2) uncoating, 3) genome replication, 4) genome packaging and assembly of the viral particle and 5) virus release and maturation.
m. rumlová, t. ruml, biotechnology advances 36 (2018) 557-576
due to size limitations, only a few, very small viral capsid particles, such as hepatitis b virus (hbv; 32-36 nm diameter), can pass through the npc (cohen et al., 2011; rabe et al., 2003) in an importin-dependent manner. hbv then disassembles at the nuclear side of the npc (the nuclear basket), releasing its genome into the nucleoplasm (fay and panté, 2015). the capsid shells or ribonucleoprotein complexes (rnps) of some larger viruses, such as influenza a virus and hiv-1, disassemble within the cytoplasm. the viral genome, released from the shell and complexed with nls-containing components, is then translocated through the npc (ambrose and aiken, 2014; campbell and hope, 2015; cohen et al., 2011; fay and panté, 2015; hutchinson and fodor, 2013). another mechanism, used by herpes simplex virus (hsv) and adenoviruses, involves cellular (importins, nup) or viral protein-mediated attachment (docking) of the capsid shell at the cytoplasmic side of the npc. this facilitates the passage of genomic dna into the npc either by ejection from an almost intact particle (through the capsid portal in hsv-1) or upon complete disassembly of the capsid shell (in adenoviruses) (kremer and nemerow, 2015; ojala et al., 2000; pasdeloup et al., 2009). npc-independent mechanisms are used by some retroviruses (e.g. mlv) that enter the nucleus during the mitotic phase of the cell division cycle when the nuclear envelope is dissolved (matreyek and engelman, 2013; roe et al., 1993). another mechanism, described for parvoviruses, involves partial disruption of the nuclear envelope (cohen and pante, 2005; fay and panté, 2015). viral genomes can be encoded by various types of nas, as summarized in table 1 for dna viruses and table 2 for rna viruses.
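the size limits quoted in the preceding discussion of nuclear entry (passive diffusion below 9 nm, npc transport up to 39 nm) amount to a simple threshold rule. the sketch below is illustrative only and assumes diameter is the sole criterion: the function name `nuclear_entry_route` and the returned labels are hypothetical, and the requirement for an nls and import receptors is noted in a comment rather than modelled.

```python
# Size limits quoted in the text (pante and kann, 2002).
PASSIVE_DIFFUSION_LIMIT_NM = 9.0   # molecules smaller than 9 nm diffuse passively
NPC_TRANSPORT_LIMIT_NM = 39.0      # the NPC accommodates molecules of up to 39 nm


def nuclear_entry_route(diameter_nm):
    """Classify a particle by the size-based route available for nuclear entry.

    Only the diameter is considered. Active passage through the NPC also
    needs a nuclear localization signal and import receptors, which this
    sketch does not model.
    """
    if diameter_nm <= 0:
        raise ValueError("diameter must be positive")
    if diameter_nm < PASSIVE_DIFFUSION_LIMIT_NM:
        return "passive diffusion"
    if diameter_nm <= NPC_TRANSPORT_LIMIT_NM:
        return "intact passage through the npc"
    return "too large for intact passage"


if __name__ == "__main__":
    # The HBV capsid (32-36 nm) fits; a 300 nm virion does not.
    for d in (5, 32, 36, 300):
        print(d, nuclear_entry_route(d))
```

particles in the last category correspond to the viruses described in the text that disassemble in the cytoplasm, dock at the npc and eject their genome, or use npc-independent routes.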
by convention, ssdna of equivalent polarity to mrna is designated as the positive (+) strand. the complementary ssnas are of minus polarity (−). the majority of dna viruses replicate in the nucleus, where cellular dna replication and transcription also occur. in contrast, rna viruses usually replicate in the cytoplasm. rna viruses are the only "organisms" that store their genetic information in the form of rna. replication of their genomes is accomplished either by rna-dependent rna synthesis (fig. 2a, b) [reviewed in (ferron et al., 2017; lu and gong, 2017; menéndez-arias and andino, 2017; pietilä et al., 2017; tao and ye, 2010)] or through rna-dependent dna synthesis (reverse transcription) followed by dna integration, replication, and transcription (fig. 2c). as some of these enzymatic activities are not commonly found in uninfected host cells, rna viruses must encode enzymes to aid replication (table 2). rna viruses are divided according to the baltimore classification into dsrna viruses (birna- and reoviridae); positive-sense ssrna viruses (corona-, flavi-, and picornaviridae); negative-sense ssrna viruses (filo-, rhabdo-, paramyxo-, and orthomyxoviridae); and ambisense rna viruses (arena- and some members of bunyaviridae) with both positive- and negative-sense rnas (nguyen and haenni, 2003) (see table 2). the nature of the genome not only dictates the mechanism of replication, but also has other important consequences. the genome of (+)rna viruses may serve directly as mrna for production of viral proteins. therefore, the mere introduction of genetic material (e.g. in exocytic vesicles) may result in productive infection. the rna polymerases that copy the genetic material of rna viruses are error-prone, which provides considerable genetic flexibility and the propensity to evolve drug-resistant mutants. these features are amplified in viruses with segmented genomes that undergo reassortment.
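the grouping of rna virus families given above can be captured as a small lookup table. this is a minimal python sketch restricted to the families named in the text; the dictionary and function names are hypothetical, and the single "some members" label for bunyaviridae is a simplification of the statement that only some bunyaviruses are ambisense.

```python
# Family-to-genome-class mapping, restricted to the families named in the text.
RNA_GENOME_CLASS = {
    "birnaviridae": "dsrna",
    "reoviridae": "dsrna",
    "coronaviridae": "(+)ssrna",
    "flaviviridae": "(+)ssrna",
    "picornaviridae": "(+)ssrna",
    "filoviridae": "(-)ssrna",
    "rhabdoviridae": "(-)ssrna",
    "paramyxoviridae": "(-)ssrna",
    "orthomyxoviridae": "(-)ssrna",
    "arenaviridae": "ambisense rna",
    # According to the text, only some bunyaviruses are ambisense.
    "bunyaviridae": "ambisense rna (some members)",
}


def genome_class(family):
    """Look up the genome class of a family (case-insensitive).

    Raises KeyError for families that the text does not list.
    """
    return RNA_GENOME_CLASS[family.strip().lower()]


def genome_can_serve_as_mrna(family):
    """True only for (+)ssRNA families, whose genome may act directly as mRNA."""
    return genome_class(family) == "(+)ssrna"


if __name__ == "__main__":
    for fam in ("picornaviridae", "reoviridae", "orthomyxoviridae"):
        print(fam, genome_class(fam), genome_can_serve_as_mrna(fam))
```

the second function encodes the one consequence stated explicitly in the text: only a positive-sense genome may be translated directly after delivery into the cell.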
in viruses with segmented genomes (orthomyxoviridae, arenaviridae and bunyaviridae), each segment is transcribed in an autonomous transcription-replication unit by viral rna polymerase that binds to the 5′ end cap structure. of note, the genomic rna (grna) is capped by a unique mechanism called cap-snatching (ferron et al., 2017) , in which the cap is cleaved from cellular mrna and transferred onto the viral grna by a subunit of viral rna polymerase (pflug et al., 2017) . some dsrna viruses (birnaviridae and reoviridae) also contain segmented genomes. upon replication, these viruses must ensure stoichiometric incorporation of single copies of each grna segment into new particles. this is guided by specific packaging signals on each segment of grna that interact with positively charged domains in the capsid proteins [reviewed in (isel et al., 2016; pohl et al., 2016) ]. although different models of packaging have been proposed for various segmented genome viruses, a common feature is co-assembly of the capsid proteins with the grna and rna polymerase. the genomes of dna viruses come in a considerable variety of sizes and shapes, from small ss to large ds molecules that may be linear or circular. the size range of these genomes (from 1.8 kb to 1200 kb) reflects the necessity for some viruses to encode specific proteins required for viral replication. small-genome dna viruses (polyoma-, papilloma-, and parvoviruses) use only host cell enzymes for replication and transcription. the only exceptions are some hepadnaviruses (e.g. hepatitis b virus) that despite having small genomes (approximately 3 kb), encode their own specific dna polymerase/reverse transcriptase that reverse transcribes pregenomic rna (pgrna) into genomic viral dna (fig. 2d , ii) (beck and nassal, 2007) . viruses with intermediate-size genomes (up to 35 kb) (e.g. 
adenoviruses) encode their own dna polymerase for genome replication, but they usually utilize cellular rna polymerases ii and iii for transcription (fields and knipe, 2013). viruses with large genomes (larger than 100 kb) (e.g. herpesviruses) encode proteins for replication, including dna polymerase and dna helicase-primase, as well as some enzymes necessary for biosynthesis of deoxyribonucleotide triphosphates (dntps) and several transcription factors (boehmer and nimonkar, 2003). poxviruses (e.g. vaccinia virus) are another type of virus with large dsdna genomes, and their replication, transcription, and translation take place entirely in the cytoplasm within discrete juxtanuclear sites called virus factories (moss, 2013) (fig. 2d, iii). these viruses encode all enzymes and specific factors necessary for genomic replication and transcription. with their own replicative machinery, large-genome dna viruses can replicate in nondividing cells. in contrast, the replication of small-genome dna viruses, which depends on cellular dna polymerases, must either occur simultaneously with the s-phase of the cell cycle (e.g. parvoviruses) or rely on expression of a viral protein/oncogene that re-programs the host cell-cycle regulatory proteins p53 or retinoblastoma protein (prb), triggering entry into the s-phase (e.g. polyomaviruses and papillomaviruses). by affecting the g1/s checkpoint (controlled by p53 and prb), the viruses ensure the production of host enzymes required for viral replication (fields and knipe, 2013). production of viral proteins of dna viruses with intermediate and large size genomes is divided into early and late phases. in the early (prereplicative) phase, nonstructural proteins required for dna replication are translated. the late phase, during which the late structural proteins needed for assembly are translated, begins after viral dna replication (fig. 2d, i).
depending on the species, viruses assemble either in contact with the cellular membrane or independent of the membrane in either the nucleus or cytoplasm. the membrane-independent route is used by nonenveloped viruses and a few enveloped viruses (e.g. orthomyxoviruses, herpesviruses, and some retroviruses) that acquire the membrane envelope after intracellular assembly during budding from the cell. the limiting step for nuclear assembly is the size of npcs, which are large enough to transport rna and import proteins required for assembly into the nucleus; however, npc size limits the transport of larger assemblies. therefore, some viruses form assembly intermediates in the nucleus. these structures are then exported to the cytoplasm, where they come together to form viral particles. nuclear export is specific and depends on the presence of nuclear export signals (nes) in the transported proteins. one example of a nonenveloped icosahedral virus that assembles in the nucleus is adenovirus, the assembly of which has been studied intensely due to its potential use in gene therapies [reviewed in (ahi and mittal, 2016) ]. recent data suggest that upon accumulation of multiple copies of adenoviral dsdna genomes, coordinated assembly and packaging occur by two interlinked mechanisms that involve both capsid proteins and core components (condezo and san martín, 2017) . the assembly occurs in the so-called peripheral replicative zone with the assistance of scaffolding proteins that facilitate formation of adenoviral particles but are excluded from mature viruses. adenoviruses are finally released upon lysis of the infected host cell. orthomyxoviruses and herpesviruses are enveloped viruses that assemble their nucleoproteins in the nucleus. herpesviruses package their dsdna genome as head-to-tail concatemers and assemble icosahedral procapsids on scaffold proteins in the nucleus [reviewed in (heming et al., 2017) ]. 
however, subsequent steps of herpesvirus assembly proceed in the cytoplasm. the preassembled procapsid is too large to pass through the npc, but it exits the nucleus by viral protein-driven vesicular transport across the nuclear inner membrane leaflet. thus, the herpesvirus acquires an initial envelope from the inner nuclear membrane [for review see (fields and knipe, 2013; hellberg et al., 2016)]. next, the herpesviral membrane fuses with the outer nuclear membrane, and the naked particle is released from the nucleus into the cytosol. here, the virus acquires tegument (a protein layer between the capsid and the envelope) and other proteins. final herpesvirus envelopment occurs at the golgi membrane containing the viral glycoproteins (fields and knipe, 2013). preassembled orthomyxoviral ribonucleoproteins, upon their export from the nucleus, are driven to the plasma membrane, to which they attach through electrostatic interactions of the matrix protein m1 with membrane phosphatidylserine. the virions then assemble simultaneously with budding, during which they also acquire the ha, na and m2 membrane proteins. poxviruses undergo an even more complicated pathway. they are enveloped with multiple membranes acquired from er/intermediate compartments and golgi or early endosomes (moss, 2015, 2016). these membranes also provide the poxviruses with their membrane proteins. most viruses assemble upon interaction of their structural proteins with cellular membranes. the target membrane for assembly differs according to the virus type. flaviviruses assemble at the surface of the er; upon budding into the er lumen, the immature particles are transported into the tgn. some viruses, such as coronaviruses, assemble at the er-golgi intermediate compartment [reviewed in (ujike and taguchi, 2015)].
the assembly of bunyaviruses occurs concurrently with replication of grna segments in virus factories located along the golgi (guu et al., 2012; strandin et al., 2013) . the presence of membrane glycoproteins at the golgi membrane determines the sites where assembling bunyavirus particles bud into the golgi lumen, similar to other enveloped viruses. numerous viruses, including paramyxoviruses (cox and plemper, 2017) , orthomyxoviruses (pohl et al., 2016) , alphaviruses (jose et al., 2009) , rhabdoviruses (okumura and harty, 2011) , and most retroviruses (freed, 2015) , assemble underneath the cytoplasmic membrane, which facilitates assembly by providing a scaffolding function. the virus then acquires a lipid envelope through budding across the plasma membrane. during this process, it also gains the envelope glycoproteins (env) that are anchored in the plasma membrane by hydrophobic transmembrane domains. env reaches the plasma membrane by a cellular secretory pathway upon synthesis in the er and subsequent glycosylation in the golgi. usually, a specific interaction between viral structural proteins and glycoproteins is required. this may be either direct or mediated through the interaction with matrix protein (e.g. in (-)rna viruses such as ortho-and paramyxoviruses or rhabdoviruses). most retroviruses, including hiv (so-called morphogenetic c-type), also assemble at the plasma membrane of the host cell. the interaction of the structural polyprotein precursor (gag) with the plasma membrane is usually facilitated by a bipartite signal in the n-terminal domain of gag (i.e. matrix protein) comprising both the basic patch and n-terminally linked myristoyl residue (added co-translationally to gag). in contrast to c-type assembly at the membrane, morphogenetic b/d-type retroviruses assemble at pericentriolar sites where gag polyproteins are transported along microtubules by dynein molecular motors (sfakianos et al., 2003; vlach et al., 2008) . 
for both morphogenetic pathways, the packaging of grna facilitates assembly. (+)rna viruses have adopted a mechanism of extensive rearrangement of intracellular membranes to provide a milieu for virus replication and assembly. this mechanism effectively protects dsrna intermediates from degradation by the host cell machinery (delgui and colombo, 2017; jackson, 2014). this process has been well-documented for poliovirus, in which the newly formed membranous structures exclusively serve for virus production. various types of vesicles that are formed upon viral infection have different roles in viral replication (rossignol et al., 2015). some nonenveloped mammalian viruses, such as reoviruses, assemble in the cytosol in so-called viroplasms or viral factories into icosahedral particles consisting of three concentric layers encircling segments of genomic dsrna (benavente and martinez-costas, 2006; jones, 2000; shah et al., 2017). rotavirus seems to be the only exception in the reovirus family, as it enters the endoplasmic reticulum to gain its outer protein shell (trask et al., 2012). numerous viruses assemble from polyprotein precursors that must be specifically cleaved by a viral protease to generate infectious particles. this mechanism, which is an irreversible step in the virus life cycle, ensures equimolar packaging of structural proteins and proportional co-assembly of viral enzymes in the form of precursors. upon proteolytic release, the liberated proteins may undergo different trafficking pathways or fulfill various functions in the virus. maturation changes the interaction energies at protein interfaces from those required for intracellular assembly of immature particles to those suitable for viral stability in the environment and for disassembly and uncoating of genetic material for replication [reviewed in (veesler and johnson, 2012)]. 
in poliovirus, the autocatalytic and subsequent viral protease-mediated cleavage of p1 precursor protein allows formation of pentamers that interact with grna. additional cleavage of vp0, yielding vp2 and vp4, is required to form infectious poliovirus particles. in retroviruses, proteolytic processing is initiated by the autocatalytic liberation of viral protease, which subsequently cleaves the polyproteins to trigger major structural rearrangements in the virus and release of other viral enzymes (reverse transcriptase and integrase) and structural proteins. in herpesviruses, maturation involves proteolytic processing of the scaffolding protein and recruitment of tegument proteins that stabilize the particle and mediate interactions with the membrane during envelopment (gibson, 2008; tandon and mocarski, 2012; tandon et al., 2015). adenoviruses undergo a maturation process that involves processing of six structural components of the core and one non-structural precursor that initiates replication of gdna (gaba et al., 2017). one interesting feature of adenoviral maturation is that dna is required as a co-factor for protease activity. this means that maturation occurs only in particles that have packaged gdna (mangel and san martín, 2014). flaviviruses form icosahedral particles upon budding into the neutral milieu of the er. the particles then translocate to the more acidic environment of the trans-golgi network. this ph shift results in disassembly of the immature lattice and extensive rearrangement of the flaviviral particle. this is connected with the exposure of a viral structural glycoprotein (prm) that is specifically processed by the cellular protease furin in the trans-golgi network. the liberated globular heads of prm remain attached at the low ph, but are released when the virus enters neutral body fluids (rey et al., 2017). 
[figure legend; opening words missing in the source] ..., retroviruses (c) and dna viruses (d) (schemes a-c were modified from ahlquist, 2006). a: following endocytosis, the genome of (+)rna viruses may directly serve as mrna for translation of virus-encoded proteins. among them are proteins of the rna-replication machinery that recruit (+)rna into a membrane-associated replication complex. the genomic rna is replicated using a (-)rna template, which is transcribed in a low copy number. amplified (+)rna is then packaged into newly assembled virions that exit the host cell either through secretion or cell lysis. b: upon virus attachment, a core containing both grna and virus-encoded rna polymerase is delivered into the cytoplasm by endocytosis. cytoplasmic transcription of the (-)rna template provides (+)rna that serves as mrna for translation of viral proteins. in dsrna viruses, the (+)rna is packaged into new cores, which undergo maturation by synthesizing (-)rna (dotted strand) and acquiring surface proteins. in (-)rna viruses, the (+)rna strand is transcribed into genomic (-)rna in the cytoplasm and then packaged. the new virions egress the cell either through secretion or cell lysis. c: following fusion of the viral and cell plasma membranes, the retroviral core is released into the cytosol; the rna genome is reverse-transcribed into dsdna by viral reverse transcriptase concomitantly with uncoating of the viral core; the dsdna is transferred into the nucleus, where it is integrated into the host cell genome by viral integrase; following translation of retroviral structural and enzymatic polyproteins, the unspliced grna is packaged into immature particles that usually assemble at the plasma membrane, and the viral particles bud from the cell. d: three mechanisms (i-iii) of dna virus replication are shown. (i) following entry and uncoating, the dna genome is transported to the nucleus; products of early genes (regulatory proteins, transcription factors) regulate the synthesis of viral enzymes (e.g. dna polymerase) required for genome replication; late genes encoding structural capsid proteins are expressed in the cytosol, and the proteins are then transported into the nucleus, where packaging and pre-assembly take place; preassembled procapsids exit the nucleus and leave the cell (e.g. herpesviruses). (ii) unique replication cycle of hepatitis b virus (hbv): following entry, the viral particle is internalized by endocytosis and the nucleocapsid is released into the cytoplasm; the genome (relaxed circular dna, rcdna) is transported into the nucleus, where it is converted to a covalently closed circular form (cccdna), which serves as a template for transcription of pregenomic rna (pgrna), used for translation of the core protein and the viral polymerase, and of three subgenomic mrnas used for translation of regulatory and envelope proteins; following viral transcription and translation, the hbv core proteins self-assemble in the cytoplasm into the viral nucleocapsid with concurrent incorporation of pgrna and hbv polymerase; pgrna is reverse-transcribed into rcdna within the nucleocapsid; the nucleocapsid containing rcdna can either re-enter the nucleus to amplify cccdna or be enveloped by hbv envelope proteins in the endoplasmic reticulum. the particles are then secreted from the infected hepatocyte by exocytosis. (iii) upon entry into the cell, replication, transcription and translation take place entirely in the cytoplasm, within discrete juxtanuclear sites called virus factories (e.g. poxviruses). adapted from: ahlquist, p., 2006. parallels among positive-strand rna viruses, reverse-transcribing viruses and double-stranded rna viruses. nat rev microbiol 4(5), 371-382. 
one frequently used mechanism of release of nonenveloped and some enveloped viruses is lysis of the infected cell. 
this is the simplest release mechanism, although the transition to lysis from latent infection is delicately regulated (aneja and yuan, 2017; levings and roth, 2013; schmiedel et al., 2016) . however, viruses that are usually lysogenic may also use alternative release pathways (bird and kirkegaard, 2015a) . these include non-lytic spread of viruses mediated by vesicles, which has been observed for poliovirus (bird and kirkegaard, 2015b; jackson et al., 2005) , coxsackievirus (alirezaei et al., 2012) , and hepatitis a virus . another possibility is that vesicles released from a cell infected with (+)rna viruses contain naked viral grna. this (+)grna-containing vesicle functions itself as an infectious agent as it is transferred to another cell (bird and kirkegaard, 2015a) . the standard way for enveloped viruses to leave the host cell is budding, which includes membrane extrusion and separation of the viral and cellular membranes (so-called pinching off). the lipid envelope layer acquired during viral particle budding through the plasma membrane protects the virus particle. there are three basic mechanisms of budding: i) mediated by envelope glycoproteins, such as the e protein of coronaviruses; ii) independent of glycoproteins, in which the viral structural proteins trigger the extrusion of the plasma membrane, such as retroviral budding in which strong interactions between the gag n-terminal domain (matrix protein) and the plasma membrane induce bulging of the membrane; and iii) requiring interaction between the viral glycoproteins and the capsid for membrane extrusion, as in alphaviruses. the final step that results in the separation of virus from the cell surface is pinching off the particle. this is a controlled process that involves both viral protein domains and cellular factors. 
during this process, viruses apparently use the cellular machinery that also operates during the last step of cell division (the release of the daughter cell), called the endosomal sorting complex required for transport (escrt) [reviewed in (campsteijn et al., 2016; hurley, 2015; olmos and carlton, 2016; scourfield and martin-serrano, 2017)]. direct interactions of numerous viral domains with escrt complex subunits have been identified (bieniasz, 2006). in retroviruses, short specific amino acid sequences (ptap or psap) interact with the escrt components, and the interaction of hiv with the escrt proteins nedd4 and alix is well known (freed, 2002; fujii et al., 2007; gomez and hope, 2005). however, this mechanism has also been adopted by other viruses that interact with the same escrt components, such as filoviruses (han et al., 2015; jasenosky and kawaoka, 2004; liu and harty, 2010). interactions with escrt proteins have also been reported for vesicular stomatitis virus (luyet et al., 2008), rhabdovirus (taylor et al., 2007), arenaviruses (ziegler et al., 2016), picornaviruses, paramyxoviruses (park et al., 2016), and hepatitis c virus, a representative of the flavivirus family (falcon et al., 2017). the typical release pathways used by viruses may vary under some conditions. for example, during chronic infection, numerous viruses use alternative cell-to-cell transmission that may help the virus avoid host neutralization (hulo et al., 2017). syncytium formation, an hiv-induced cell fusion, was recognized in the early 1990s (callahan, 1994). another type of cell-to-cell transmission, through tight junctions, was shown for hiv (hübner et al., 2009; wang et al., 2017) and murine leukemia virus (sherer et al., 2010). receptors on tight junctions that specifically recognize hepatitis c virus (carloni et al., 2012; ploss et al., 2009) and reovirus (barton et al., 2001) have been identified. 
some viruses are able to modify cellular pathways to reprogram both the synthesis and metabolism of lipids and membrane compartmentalization for their transmission and to evade cellular defense mechanisms (mazzon and mercer, 2014). despite a general understanding that poliovirus spreads through cellular lysis, it was recently found that it may also be transferred between cells by an autophagy-dependent mechanism, called autophagosome-mediated exit without lysis (bird and kirkegaard, 2015b; bird et al., 2014; lai et al., 2016; richards and jackson, 2012). similar mechanisms have been described for varicella-zoster virus and human cytomegalovirus (grose et al., 2016; meier and grose, 2017). poxviruses encode several proteins that block the apoptotic cellular response to the presence of their dsdna in the cytoplasm. this allows cell-to-cell passage of poxviruses by a mechanism known as apoptotic mimicry (amara and mercer, 2015; nichols et al., 2017). in this process, enveloped viruses expose surface phosphatidylserine to mimic apoptotic bodies. these cells are then macropinocytosed by either dendritic cells or macrophages (mercer and helenius, 2010). among the plethora of compounds designed to inhibit infectious viruses, only a few (<100) have been approved for clinical use (de clercq, 2004; de clercq and li, 2016). nevertheless, some effective antiviral drugs have been on the market for several decades, such as the anti-influenza a virus drug amantadine, marketed under the trade name symmetrel (by dupont), which was approved in 1966. in 1982, burroughs-wellcome introduced acyclovir as an inhibitor of herpesviruses. its remarkable specificity is connected with its activation by viral thymidine kinase-catalyzed phosphorylation. however, due to development of drug resistance in a number of viruses, especially rna viruses, there is a continuous need to design and test new inhibitors, preferably targeting different steps of viral life cycles. 
here, we provide insights into the world of biochemical and cell-based assays that were developed to test antivirotics targeting various steps in the viral life cycle. different types of assays, including cell-cell fusion assays, cell-virus fusion assays with pseudotyped viral particles, and in vitro biochemical assays have been developed to screen inhibitors of viral entry. in enveloped viruses, entry is initiated by fusion of the viral envelope with the target cellular membrane. the entry mechanism has been well-described for hiv-1, in which it is mediated by env glycoprotein, consisting of transmembrane gp41 and surface gp120 subunits. binding of gp120 to its cellular receptor, cd4, and to one of two co-receptors, cxcr4 or ccr5, triggers a refolding of gp41 that promotes fusion of the viral and cellular membranes. the refolding involves oligomerization of the extracellular n-and c-terminal heptad repeat (hr) domains of gp41, which leads to the formation of a 6-helical bundle [reviewed in (melikyan, 2008) ]. jiang et al. established an in vitro system to quantify formation of the hiv-1 gp41 6-helical bundle (jiang et al., 1999 ). in their system, the bundle is formed by mixing equimolar concentrations of peptides derived from the n-and c-hr regions of gp41. elisa using the monoclonal antibody nc-1, which specifically recognizes and binds an epitope formed on the gp41 6-helix bundle but not the individual peptides, enables screening for compounds that prevent formation of fusion-active gp41. for hts of hiv-1 fusion inhibitors, this method was modified to use a fluorescence-linked immunosorbent assay (flisa), in which the c-hr peptide was replaced with fitc-conjugated c-hr peptide (boyer-chatenet et al., 2003) . cell-based assays are routinely used to screen viral entry inhibitors. high throughput formats have been developed for screening of hiv-1 fusion inhibitors. 
cell-cell fusion assays involve two types of cells: effector cells that stably (bradley et al., 2004) or inducibly (herschhorn et al., 2011; ji et al., 2006) express hiv-env glycoprotein and target cells that express cd4 and either cxcr4 or ccr5. co-cultivation of these cells leads to hiv-1 env-mediated cell membrane fusion, resulting in formation of multinucleated syncytia. membrane fusion induces expression of a reporter protein such as luciferase (herschhorn et al., 2011; ji et al., 2006) or β-galactosidase (bradley et al., 2004). one such assay enables determination of both the efficiency and specificity of fusion inhibitors (herschhorn et al., 2011). this approach uses effector cells that express both hiv-1 env and the renilla luciferase (r-luc) reporter protein under the control of an inducible tetracycline-controlled transactivator (tta), and target cells that express the hiv-1 receptor (cd4) and coreceptor (ccr5) and contain the firefly luciferase (f-luc) reporter gene under the control of a tta-responsive promoter. upon fusion of the effector and target cells, tta enters the target cells and activates the expression of the f-luc reporter. the inhibition of fusion of cellular membranes is determined as a decrease in f-luc luminescence, and the inhibitor specificity is measured as the r-luc activity (herschhorn et al., 2011). based on an hiv-1 cell-cell fusion method that uses a computer-controlled digital image analysis system for automatic quantitation, a modified method was developed to screen inhibitors targeting mers-cov s protein (lu et al., 2014). to test potential fusion inhibitors, effector cells stably expressing the mers-cov spike protein s2s and egfp were used to mediate fusion with target cells expressing the dpp4 receptor (lu et al., 2014). virion-based fusion assays are another category of cell-based fusion assays. one such approach is based on production of chimeric hiv-1 virions carrying β-lactamase-vpr chimeric protein (blam-vpr). 
chimeric hiv released into the cell culture media is isolated by ultracentrifugation and used to infect target cells. entry of the virions into the cytoplasm is detected by cleavage of a fluorescent substrate by β-lactamase. the fluorescence shift corresponds to the fusion efficacy and is measurable by fluorescence microscopy, flow cytometry, or fluorometric plate reader (cavrois et al., 2002, 2004, 2014). modification of this assay by constructing pseudotyped hiv-1 virions in which hiv-1 env was replaced with ebola virus glycoprotein (gp) has also been described (yonezawa et al., 2005). tscherne et al. developed an approach to monitor viral entry using the blam reporter (tscherne et al., 2010). their assay uses influenza virus-like particles (vlps) bearing the influenza membrane proteins hemagglutinin and neuraminidase. blam tagged with influenza matrix protein (m1) is incorporated into the vlps and delivered into target cells. upon release, blam can be detected by flow cytometry, microscopically, or fluorometrically. a rapid cell-based hts method was developed to assess sars-cov entry inhibitors (zhou et al., 2011). this dual envelope pseudovirion (dep) assay employs two hiv pseudoviruses: one encodes an envelope protein from the target virus and firefly luciferase, and the second encodes a control, unrelated viral envelope protein and renilla luciferase. reporter expression is determined with the dual-glo luciferase assay system (promega). inclusion of the unrelated envelope protein greatly reduced false positive hits (zhou et al., 2011). the method was further used to screen compounds that inhibit entry of filoviruses, including ebola virus. 
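the reasoning behind the dual envelope pseudovirion readout can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the published analysis of zhou et al.: the names `WellReading`, `percent_inhibition` and `classify_compound`, the normalization to a vehicle-only well, and the 50% and 20% cutoffs are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class WellReading:
    """Luminescence from one well infected with both pseudoviruses (hypothetical layout)."""
    firefly: float  # reporter of the pseudovirus carrying the target-virus envelope
    renilla: float  # reporter of the pseudovirus carrying the unrelated control envelope


def percent_inhibition(signal: float, vehicle_signal: float) -> float:
    """Percent loss of reporter signal relative to the vehicle-only control well."""
    if vehicle_signal <= 0:
        raise ValueError("vehicle control signal must be positive")
    return 100.0 * (1.0 - signal / vehicle_signal)


def classify_compound(treated: WellReading, vehicle: WellReading,
                      hit_cutoff: float = 50.0, control_cutoff: float = 20.0):
    """Return (target inhibition, control inhibition, label).

    The cutoffs are illustrative assumptions, not values taken from the source.
    """
    target_inh = percent_inhibition(treated.firefly, vehicle.firefly)
    control_inh = percent_inhibition(treated.renilla, vehicle.renilla)
    if target_inh >= hit_cutoff and control_inh < control_cutoff:
        label = "specific entry inhibitor candidate"
    elif target_inh >= hit_cutoff:
        # Both envelopes are affected: probably toxicity or a post-entry effect.
        label = "non-specific, likely false positive"
    else:
        label = "inactive"
    return target_inh, control_inh, label


vehicle = WellReading(firefly=1000.0, renilla=800.0)
print(classify_compound(WellReading(firefly=200.0, renilla=760.0), vehicle))
print(classify_compound(WellReading(firefly=200.0, renilla=160.0), vehicle))
```

a compound that suppresses only the firefly signal is kept as a candidate, whereas one that suppresses both reporters is discarded, which is how the unrelated envelope helps to remove false positives.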
a similar approach employing four pseudotyped hiv viruses, carrying marburg virus glycoprotein, hemagglutinin and neuraminidase isolated from a/goose/qinghai/59/05 (h5n1) influenza virus, ebolavirus zaire envelope glycoprotein, and lassa virus envelope glycoprotein, has been used for entry inhibitor screening. this screening identified multiple compounds with potent inhibitory activity against entry of both marburg and ebola viruses in human cancer cell lines, and confirmed their anti-ebola activity in human primary cells. other pseudotyped viral assays have been used to screen entry inhibitors of sars-cov, ebola, hendra, and nipah viruses. to establish infection, the glycoproteins of these viruses must be processed by the host intracellular lysosomal protease cathepsin l (catl). hts resulted in identification of several inhibitors that block the cleavage of viral glycoprotein but not catl itself (elshabrawy et al., 2014). until recently, the development of anti-hbv therapeutics had been limited by the lack of an in vitro infection system. several aspects of the hbv life cycle have been elucidated using in vitro production of hbv particles after transfection of human hepatoma cell lines (hepg2) with recombinant hbv dna and by establishment of hepatoma cell lines with the entire hbv genome integrated, such as the hepg2.2.15 cell line (ladner et al., 1997; sells et al., 1987; sureau et al., 1986). in addition to differentiated primary human hepatocytes (phhs) (gripon et al., 1988) and tupaia belangeri hepatocytes (glebe et al., 2003), the heparg cell line, which exhibits hepatocyte-like characteristics, also supports hbv replication (gripon et al., 2002). the identification of sodium taurocholate cotransporting polypeptide (ntcp) as an hbv receptor by yan et al. opened possibilities to use ntcp-complemented hepg2 cells not only for studies of hbv replication mechanisms but also for hts of inhibitors (yan et al., 2012). 
in an infection competition assay, hbv particles were isolated and used to infect heparg cells and phhs that had been pre-incubated with hbv envelope protein-derived peptides to test their potential to block hbv infection. twelve days after infection, viral rnas were quantified by northern blot (gripon et al., 2002, 2005). using this assay, researchers identified a peptide that specifically prevents hbv and hdv entry into heparg cells and phhs (gripon et al., 2005), primary cultures of tupaia hepatocytes (glebe et al., 2005), and cells in vivo (lütgehetmann et al., 2012; petersen et al., 2008; volz et al., 2013). recently, a phase iia clinical trial of a first-in-class entry inhibitor (myrcludex-b) that functions as an ntcp inhibitor was completed with promising results (bogomolov et al., 2016). development of cell culture systems producing hcv pseudoparticles (hcvpp) and infectious hcv particles (hcvcc) has shed light on hcv interactions and enabled discovery of antiviral drugs and vaccines (colpitts et al., 2016). hcvpp consist of hcv e1 and e2 glycoproteins enveloping a retroviral particle that packages gfp mrna during assembly (bartosch et al., 2003). the entry efficiency of hcvpp can be quantified by facs analysis as the number of gfp-positive cells. use of this screening system led to discovery of a triazine derivative that blocks the entry of hcv (baldick et al., 2010). production of hcvcc is a robust system to generate infectious hcv in naïve cells (kato et al., 2006; zhong et al., 2005). the anti-hcv activity of hundreds of compounds approved for a wide variety of indications was determined immunochemically with anti-hcv e2 antibody in 96-well plate format. over thirty compounds displayed anti-hcv activities, most of which were directed against hcv entry (gastaminza et al., 2010). the majority of viruses enter cells by endocytosis. 
unfortunately, there are no suitable experimental techniques for endosome handling, making it difficult to study early steps in the viral life cycle such as uncoating and capsid disassembly. viruses that enter cells by direct membrane fusion, such as hiv-1, are an exception. there are several methods available to monitor and quantify hiv uncoating. as some of these were reviewed recently by campbell and hope (campbell and hope, 2015) , we will discuss them very briefly. two main techniques are used in these assays: ultracentrifugation or utilization of hiv-1 specific cellular restriction factors. the "in vitro core-stability assay" is based on ultracentrifugation of released hiv-1 virions through a detergent layer, where the viral membrane dissolves, into a sucrose gradient, where the viral cores are concentrated (shah and aiken, 2011) . the "fate of capsid" assay uses ultracentrifugation through a sucrose cushion to separate the hiv-1 core from a whole cell lysate prepared shortly after infection (stremlau et al., 2006) . the "csa-washout assay" involves specific cellular factors (trim-cyclophilin) that restrict hiv-1 infection by binding to the capsid core and cyclosporine a (csa), which can effectively turn off restriction (hulme et al., 2011) . recently, a novel entry/uncoating assay (eurt), an alternative to blam-vpr (described in section 3.1), was reported (da silva santos et al., 2016). it quantifies the protein product of a virion-packaged mrna reporter upon uncoating. a method to monitor the uncoating/disassembly of the capsid of influenza a virus, which enters the cell by endocytosis, also is based on ultracentrifugation (stauffer et al., 2016) . purified virions are separated using velocity gradient centrifugation through a two-layer glycerol gradient. similar to the "in vitro core stability" assay for hiv, the sedimenting viruses migrate through the detergent-containing layer of the gradient, which dissolves the membrane to release the viral core. 
moreover, modification of the detergent-containing glycerol layer (for example, by lowering ph, changing ionic strength, or adding putative viral uncoating factors) enables study of the factors or compounds that affect viral uncoating in vitro. depending on the virus type, either dna or rna polymerases replicate the viral genome. thus, these enzymes play a key role in viral life cycles. reverse transcriptase, an rna-dependent dna polymerase of retro- and hepadnaviruses, is unique among nucleic acid polymerases. despite the different mechanisms of viral replication, polymerases, which are essential for all viruses, are excellent targets for antiviral therapies. polymerase inhibitors represent the vast majority of clinically approved antiviral drugs, followed by protease inhibitors, immunostimulators, entry inhibitors, and integrase inhibitors [reviewed in (de clercq and li, 2016)]. polymerases continue to be a preferred target for newly designed inhibitors (clercq, 2008; de clercq and li, 2016). polymerase inhibitors may be nucleoside and nucleotide analogs, pyrophosphate analogs, or nonnucleosides, such as allosteric inhibitors (de clercq and li, 2016; öberg, 2006). non-specific approaches such as plaque assays were initially used to monitor the effectiveness of polymerase inhibitors (tino et al., 1993; tisdale et al., 1993). the activity of these inhibitors also can be evaluated based on their ability to prevent the cytopathic effects of the virus (zhang et al., 2017). more straightforward assays involve measurement of incorporated radio-labeled nucleotides, which directly reflects the polymerase activity (coates et al., 1992; hirashima et al., 2006; joyce, 2010); these include hts methods using homopolymeric polycytidylic acid and [3h]-gtp (amraiz et al., 2016). alternatively, the pyrophosphate released during the polymerase reaction can be measured luminometrically by combining the primer extension with the commercial piper assay (malvezzi et al., 2015). 
the pyrophosphate anion also can be detected with dna-attached magnetic nanoparticles (tong et al., 2015) or quantum dots (chai et al., 2015). frequently used fluorescence methods avoid the use of radiochemicals. these methods exploit a fluorescent label, such as dsdna-binding dyes like sybr green or picogreen (driscoll et al., 2014; holden et al., 2009; zipper et al., 2004), or fret between two nucleotides (schwartz and quake, 2009; weiss, 2000). numerous kits for quantification of products of both dna and rna polymerases, including reverse transcriptase, are commercially available (bustin, 2000). quantitative real-time pcr is a preferred method to monitor the activity of dna polymerases (cellular as well as viral) in the presence of inhibitors (beadle et al., 2016; zweitzig et al., 2012), and quantitative real-time reverse-transcription pcr has become standard for screening inhibitors of viral rna polymerases (okon et al., 2017; pelliccia et al., 2017; zhang et al., 2017). white et al. described a hts method using a microfluidic apparatus combined with digital pcr for single-cell analysis (white et al., 2013). in a fluorescence polarization-based method, a fluorescently labeled template also serves as a primer, due to stem formation. it emits a measurable polarization signal both upon binding of the polymerase and upon extension (mestas et al., 2007). the assay has been used for hts of inhibitors of poliovirus rna polymerase (campagnola et al., 2011). when available, viral genomes modified with reporter genes can be used for cell-based luminescent or fluorescent screening of viral inhibitors (beadle et al., 2016; edwards et al., 2015; fenaux et al., 2016; feng et al., 2014; lo et al., 2017; madhvi et al., 2017; wang et al., 2015c). this approach can be extended to screen inhibitors of other enzymes. in the aptamer-based approach, a dna template encodes rna of interest joined to a fluorescence module and ribozyme sequence. 
when transcribed, the fluorescence module is released and detected (höfer et al., 2013). hcv uses an rna-dependent rna polymerase (rdrp). a unique cell-based assay for monitoring hcv rna polymerase (ns5b) activity, based on the innate immune signaling molecule retinoic acid-inducible gene i product (rig-i), has been developed (ranjith-kumar et al., 2011). the rig-i-like receptors are cytosolic proteins that recognize viral rna and induce production of proinflammatory cytokines and interferon-activated genes (jensen and thomsen, 2012). rig-i triggers cytokine production stimulated by various viral rnas from different families, including flaviviridae, paramyxoviridae, rhabdoviridae, orthomyxoviridae, and arenaviridae, as well as ebola virus rna (jensen and thomsen, 2012). the assay is based on recognition by rig-i of hcv rna produced by active ns5b, followed by rig-i-mediated activation of firefly luciferase expression controlled by the interferon β promoter. renilla luciferase expression is used for normalization of transfection efficacy. the viral grnas of some rna viruses are modified at the 5′ ends with cap structures, which may be either acquired from the host cell mrna (e.g. in influenza virus) or newly synthesized (e.g. flaviviruses). the methylation of viral grna mimics that of cellular mrna cap structures to enhance the chances of the virus to escape from cellular defense mechanisms and also to increase the efficiency of translation of viral proteins. virus-specific methyltransferases are thus a promising therapeutic target. virus-encoded methyltransferases have been identified and characterized in flaviviruses such as zika virus (coutard et al., 2017; duan et al., 2017; munjal et al., 2017; zhao et al., 2017), west nile virus, and dengue virus (dong et al., 2012); rhabdoviruses such as vesicular stomatitis virus (vsv) (rahmeh et al., 2009); coronaviruses such as sars (wang et al., 2015b); and roniviruses (zeng et al., 2016). 
in flaviviruses, the n-terminal part of ns5 methyltransferase catalyzes cap formation via both guanine n-7 and ribose 2′-oh methylations at the 5′ end of grna, and the c-terminus of the enzyme is responsible for the rna polymerase activity (ray et al., 2006; zhao et al., 2015). the c-terminal part of the sars nsp14 protein exhibits guanine n7-methyltransferase activity, forming the grna cap, while the 3′-5′ exoribonuclease activity of the n-terminus enhances the fidelity of viral replication (case et al., 2016; minskaia et al., 2006). in vsv, the methyltransferase activity resides in the conserved region vi of the multifunctional large polymerase protein (li et al., 2005; ma et al., 2014). some cellular methyltransferases regulate viral infections, as shown for herpes simplex virus, for which epigenetic control is involved in viral latency (cliffe and wilson, 2017). inhibition of human histone-modifying enzymes such as the histone-lysine n-methyltransferases ezh2/1 (arbuckle et al., 2017) or lysine-specific demethylase-1 (lsd1) induces antiviral signaling pathways (liang et al., 2009). the antiviral effect of inhibiting histone demethylase activity has been demonstrated for human cytomegalovirus (gan et al., 2017) and herpes simplex virus (hill et al., 2014; liang et al., 2013). the activity of purified recombinant methyltransferases can be determined by measuring the methylation by-product s-adenosyl homocysteine (sah) using commercially available kits. in one such assay, conversion of s-adenosyl methionine (sam) to sah is monitored luminometrically: a sequence of reactions with mtase-glo™ reagent (promega) generates atp, which is then measured via a luciferase reaction. the epigeneous™ methyltransferase kit (cisbio bioassays) is based on competition of produced sah with fluorescently labeled sah (sah-d2 tracer) for binding to a terbium cryptate-labeled anti-sah antibody. the decrease in fret between the tracer and antibody is then evaluated. 
elisa using an anti-5-methylcytosine antibody can be used to quantify methylation of an immobilized cytosine-rich dna substrate (e.g. epiquik™ dna methyltransferase activity/inhibition assay kit; epigentek). this method was modified with homogeneous time-resolved fret (degorce et al., 2009) to screen a library of inhibitors against sars-cov nsp14. flaviviral and human cap n7-mtases have been screened with radioactive assays using 3h-labeled sam and gpppac4 or 7mgpppac4 synthetic rnas. the 3h-methyl group transferred onto rna bound to deae filters can be measured by scintillation counting after multiple washes to remove unincorporated 3h-sam. alternatively, in vitro transcription can be carried out using 3h-sam. the radioactively labeled products are then resolved by thin-layer chromatography and visualized using a phosphorimager (li et al., 2007). a yeast cell-based method was established based on the finding that coronavirus methyltransferase can functionally replace a methyltransferase essential for yeast viability (sun et al., 2014). in this method, sun et al. constructed recombinant yeasts producing the viral methyltransferase instead of the yeast one (sun et al., 2014). this strain was used for hts of methyltransferase inhibitors, whose activity correlated negatively with cell density after 20 h of incubation. virus-encoded atpase-driven helicases have been identified in numerous human pathogens. helicases from several (+)rna viruses have been characterized, including ns3 helicases from flaviviruses such as dengue virus, hcv, west nile virus, yellow fever virus, and japanese encephalitis virus (cao et al., 2016; gu and rice, 2016; jain et al., 2016; lin et al., 2017; nedjadi et al., 2015; wu et al., 2005). sars and mers coronaviruses encode nsp13 with helicase activity (adedeji and lazarus, 2016; hao et al., 2017; seybert et al., 2000). 
in semliki forest virus, a representative member of the togavirus family, helicase activity is encoded by the n-terminal domain of the nsp2 protein. helicases are also common in some dna viruses, including poxviruses. examples include the vaccinia virus helicase-primase d5 (bayliss and smith, 1996; hutin et al., 2016), the e1 protein of bovine papilloma virus 1 (yang et al., 1993), and the large tumor antigen of sv40 (stahl et al., 1986). helicases appear to be attractive targets for antiviral drugs (briguglio et al., 2011; frick, 2003; reynolds et al., 2015), but development of such compounds is challenged by cytotoxicity and bioavailability issues (kwong et al., 2005). traditional methods for monitoring the activity of rna helicases use radioactively labeled dsrna substrates and follow the unwinding reaction by electrophoretic separation (nondenaturing page) of the ss reaction products, which are detected autoradiographically (adedeji et al., 2012; utama et al., 2000). to determine whether the inhibitor affects binding of helicase to nucleic acid, a standard gel mobility shift assay is usually used (adedeji et al., 2012). helicase activity is fueled by atp hydrolysis; thus, inhibition of atpase activity has become another possible antiviral strategy. atpase activity can be determined by the decrease in atp, by the formation of adp (using commercially available fluorescent anti-adp antibodies), or by the formation of inorganic phosphate. phosphates released by atp hydrolysis can form a molybdophosphate complex that can be measured colorimetrically using malachite green, quinaldine red, or rhodamine b (baykov et al., 1988; debruyne, 1983; miyata et al., 2010) or by light scattering (oshima et al., 1996). the colorimetric methods can be miniaturized for hts (zuck et al., 2005). the absorbance-based assay was also converted into an hts method based on fluorescence quenching by a colored quinaldine red complex. 
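colorimetric phosphate readouts such as the malachite green method mentioned above are usually converted to phosphate amounts through a standard curve. the following is a minimal sketch of that conversion, assuming a linear response over the range of the standards; the function names and the numerical values are hypothetical and serve only as an illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired standards")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("standards must span more than one concentration")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def phosphate_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to estimate phosphate in a sample."""
    if slope == 0:
        raise ValueError("standard curve has zero slope")
    return (absorbance - intercept) / slope


if __name__ == "__main__":
    # Hypothetical phosphate standards (micromolar) and their absorbances.
    standards_um = [0.0, 10.0, 20.0, 40.0]
    absorbances = [0.05, 0.15, 0.25, 0.45]
    slope, intercept = fit_line(standards_um, absorbances)
    print(phosphate_from_absorbance(0.30, slope, intercept))
```

samples whose absorbance lies outside the range of the standards should be diluted and re-measured rather than extrapolated.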
an alternative method employs a europium-tetracycline complex for luminescent determination of inorganic phosphate (schäferling and wolfbeis, 2007) . a more complex coupled enzyme colorimetric assay (with maltose phosphorylase, glucose oxidase, and horseradish peroxidase) was used for hts of atpase inhibitors (avila et al., 2006) . assays for luminescent and fluorescent screening of atpase activity, including an immunochemical method using fluorescently labeled anti-adp antibodies, have been reviewed by shadrick and colleagues (shadrick et al., 2013) . in a radioactive method using [γ-32 p]atp, the amp product was separated from unreacted atp by thin-layer chromatography on polyethyleneimine-cellulose and visualized autoradiographically (adedeji et al., 2012) . several research groups have described fluorescence assays to identify inhibitors targeting sars-cov helicase (nsp13) (adedeji et al., 2012; cho et al., 2015; özeş et al., 2014) . the substrate is usually a dsdna oligonucleotide consisting of a fluorescently labeled strand annealed to the complementary strand carrying a quencher. this approach was also adapted for real-time determination of the rna helicase activity of hcv (tani et al., 2010) . in this assay, an ssdna capture strand complementary to the strand carrying the quencher was used to prevent reannealing of the unwound duplex. recently, a colorimetric assay for monitoring helicase activity using dna-conjugated gold nanoparticles was developed (deka et al., 2017) . this method is based on shifts in the optical properties of nanoparticles due to dna unwinding and allows simple screening of inhibitor activity. the dsdna melting curves can be determined spectrophotometrically (at 524 nm and 260 nm) or even by the naked eye. another fluorescence hts of dengue virus ns3 helicase inhibitors measures the unwinding of a double-labeled molecular beacon (basavannacharya and vasudevan, 2014) . 
other approaches involve graphene oxide-based fluorescence monitoring of viral helicase activity [reviewed in (jang et al., 2013)]. a g-quadruplex-based method for label-free determination of hcv helicase ns3 activity measures changes in the luminescence of transition metal complexes with dna upon helicase-mediated quadruplex melting (leung et al., 2015). both protein tyrosine kinases and protein serine/threonine (s/t) kinases have been found in viruses. tyrosine kinase function is well understood in connection with oncogenic retroviruses [for review on retroviral oncogenes, see (vogt, 2012)]. in contrast to tyrosine kinases from oncogenic viruses, such as rous sarcoma virus src tyrosine kinase, viral s/t kinases share little homology with cellular enzymes. they are exclusively encoded by large dna viruses (e.g. herpesviruses), in which they play important roles in viral virulence, helping the virus to escape defense mechanisms such as those regulated by cytokine signaling pathways (jacob et al., 2011; sato et al., 2017). their autophosphorylation and transphosphorylation activities mimic those of cellular cyclin-dependent kinases such as cdc2. for example, viral kinases can phosphorylate translation elongation factor 1 delta (ef-1δ) (jacob et al., 2011; kawaguchi and kato, 2003). all herpesviruses encode the s/t protein kinase ul13, and us3 s/t kinases have been described in the alphaherpesvirus subfamily (kato, 2016; kawaguchi and kato, 2003). in addition to protein kinases, hsv encodes a thymidine kinase. unlike cellular thymidine kinase, hsv thymidine kinase has a wide substrate specificity that includes pyrimidine and purine phosphonate analogs (e.g. acyclovir, ganciclovir, penciclovir) (de clercq and li, 2016). 
in the body, ganciclovir is phosphorylated by cellular kinases and penciclovir and acyclovir by virus-encoded thymidine kinases to the active nucleoside triphosphate forms of the drugs that inhibit viral dna synthesis (de clercq and li, 2016; kokoris and black, 2002). some cellular protein kinases appear to support viral replication. for example, polo-like kinases induce early stages in the influenza virus life cycle (pohl et al., 2017), and human protein kinase c regulates the assembly of the ribonucleoprotein complexes in influenza virus (mondal et al., 2017). inhibitors of abelson tyrosine-protein kinase 2 are active against sars and mers coronaviruses (coleman et al., 2016). several in vitro approaches can be used to determine kinase activity as well as the activity of kinase inhibitors. analyses of the cellular phosphoproteome, sometimes accompanied by phosphopeptide enrichment, have become standard to determine kinase activity (lea and simeonov, 2011; meyer et al., 2017; vyse et al., 2017). these assays can be used to assess the impact of inhibitors on the overall phosphoproteome in mammalian cells. although these methods provide complex information about the overall array of kinases and phosphatases in the cell (olsen et al., 2006), they are not applicable to screening inhibitors of a single viral kinase. for these purposes, in vitro assays with recombinant kinases have been developed, including some hts fluorescence methods (zegzouti et al., 2009) that can replace radioactive techniques using [γ-32p]atp (sanghera et al., 2009). mass spectrometric analysis can also be used to identify in vitro kinase inhibitors. for these analyses, synthetic peptides, proteins, or phosphatase- and heat-treated tissue samples (to dephosphorylate the proteins and inactivate all enzymes, respectively) are subjected to kinase treatment in the presence of inhibitors (huang et al., 2007; meyer et al., 2017; xue et al., 2012). 
recombinant kinase activity in the presence of inhibitors can be quantified as atp consumption or adp production in the phosphorylation reaction using numerous commercial kits. other methods monitor binding of inhibitors to phage-displayed kinases in ligand competition assays (fabian et al., 2005). fluorescence methods including fret, fluorescence polarization or intensity endpoint measurement, and lifetime imaging of fluorescence including fluorescence biosensors (zhang and allen, 2007) have been reviewed elsewhere. sulfonamido-oxine-labeled peptides can be used as chromophores that bind mg2+ upon phosphorylation and emit chelation-enhanced fluorescence (devkota et al., 2013; luković et al., 2008). kinase-catalyzed phosphorylation of fluorescent peptides promotes their binding to metal-coated nanoparticles, which decreases their mobility and enhances measurable fluorescence polarization (lea and simeonov, 2011; sportsman et al., 2004). tb(iii) complexes, in which phosphotyrosine induces fluorescence emission (wang et al., 2015a), may be used to evaluate protein tyrosine kinase activity (akiba et al., 2015; sumaoka et al., 2016). fluorescence polarization methods can also be useful for drug screening [reviewed in (hall et al., 2016)]. other alternatives are immunochemical methods that use antibodies specific to the phosphorylated amino acids, such as phosphotyrosine (li et al., 2001; youngren et al., 1997) or phosphoserine/phosphothreonine. these antibodies can be used to detect protein/peptide phosphorylation by western blotting, elisa, or immunoprecipitation of phosphorylated proteins for further mass spectrometry-based analysis (grønborg et al., 2002; zhang et al., 2002). an elegant approach that limits the false-positive hits in screening of specific kinase inhibitors is based on an in situ proximity ligation assay using both an antibody against the target protein and an anti-phosphotyrosine antibody (leuchowius et al., 2010). 
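the fluorescence polarization readouts mentioned above are computed from the emission intensities measured parallel and perpendicular to the excitation plane. the sketch below uses the standard textbook definitions of polarization and anisotropy; the g-factor default of 1.0 and the example intensities are assumptions, since the instrument correction has to be determined for each reader.

```python
def polarization_mp(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence polarization in millipolarization units (mP).

    P = (I_par - G * I_perp) / (I_par + G * I_perp), reported as 1000 * P.
    g_factor is the instrument correction (assumed 1.0 here).
    """
    denominator = i_parallel + g_factor * i_perpendicular
    if denominator <= 0:
        raise ValueError("total intensity must be positive")
    return 1000.0 * (i_parallel - g_factor * i_perpendicular) / denominator


def anisotropy(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence anisotropy r = (I_par - G * I_perp) / (I_par + 2 * G * I_perp)."""
    denominator = i_parallel + 2.0 * g_factor * i_perpendicular
    if denominator <= 0:
        raise ValueError("total intensity must be positive")
    return (i_parallel - g_factor * i_perpendicular) / denominator


if __name__ == "__main__":
    # Hypothetical intensities for a labeled peptide bound to a large particle.
    print(polarization_mp(300.0, 100.0))
    print(anisotropy(300.0, 100.0))
```

a slowly tumbling (bound) fluorophore gives a higher value than a free one, which is why binding of a phosphorylated peptide to a nanoparticle raises the signal.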
both antibodies are coupled with oligonucleotides, which, when brought together by antibody binding, can be enzymatically ligated and replicated through rolling circle amplification to form a long linear tandem repeat of sequences detectable by a complementary fluorescent oligonucleotide. in addition to protein kinases, some lipid kinases have been targeted by antivirals. one example is sphingosine kinase 1, which affects replication of dengue virus (aloia et al., 2017). its activity can be determined by measuring the production of 32p-labeled sphingosine-1-phosphate from sphingosine and [γ-32p]atp (clarke et al., 2016; pitman et al., 2012). retroviral integrase inhibitors are a newly approved type of inhibitor, whose development was prompted by the emergence of drug-resistant mutants. hiv integrase activities, integrase inhibitors, and drug resistance have been discussed in detail elsewhere (andrake and skalka, 2015; anstett et al., 2017; hajimahdi and zarghi, 2016; liao et al., 2010; podany et al., 2017; thierry et al., 2016). methods to assess the two major activities of integrase (end processing of the reverse transcription product and its joining to target chromosomal dna) have been reviewed in detail by several groups (engelman and cherepanov, 2014; marchand et al., 2001; merkel et al., 2009). initial methods used radioactively labeled dna oligonucleotides comprising the terminal cis-acting sequences of linear viral dna required for integration. the joining of the processed strand to the other strand (self-integration) or to supplemented target dnas can be analyzed by page (katz et al., 1990; katzman et al., 1989). a less time-consuming, non-radioactive method involves time-resolved fluorescence anisotropy measurement using a 21-meric oligonucleotide fluorescently labeled on the terminal gt dinucleotide. this assay monitors the binding of integrase to the substrate as well as the subsequent 3′-processing reaction, both of which change the anisotropy (guiot et al., 2006). 
alternatively, the yields of both the processing and joining reactions can be measured upon separation of the radioactively labeled product from the rest of the dna molecule using adsorption to pei-cellulose (muller et al., 1993). a real-time hts method measures fluorescence emission resulting from removal of the 3′-terminal dinucleotide, labeled with a quencher, by integrase (he et al., 2007). han et al. described a fluorescence method to screen molecules that inhibit binding of integrase to viral dna. methods evaluating the integrase strand transfer reaction have been modified to a high-throughput format using magnetic beads (he et al., 2008) or streptavidin-coated microplates (john et al., 2005). a method to assess strand transfer by time-resolved fret with a europium-streptavidin-labeled substrate has been optimized for 384- and 1536-well plate formats (wang et al., 2005). the hbv capsid protein is the building block of the viral core, surrounding the viral nucleic acid (pre-genomic rna, pgrna) and reverse transcriptase. the hbv core is icosahedral, formed by 240 copies of the capsid protein arranged as 120 dimers. in vitro hts of hbv core assembly inhibitors using a modified hbv capsid protein has been described (stray et al., 2006). the capsid protein was modified by deleting the nucleic acid binding domain, which is dispensable for capsid assembly, and the n-terminal assembly domain alone was used in the assay. to fluorescently label the hbv capsid protein, all cysteine residues dispensable for assembly were replaced with alanines. a unique cysteine residue (c150) was c-terminally joined to the assembly domain and labeled with fluorescent bodipy-fl dye (c150bo). during assembly, capsid proteins dimerize, bringing c150bo residues close together and resulting in c150bo fluorescence self-quenching. following incubation of the labeled hbv protein with inhibitors, fluorescence was measured in black 96-well microtiter plates. 
development of in vitro assembly systems has contributed greatly to current understanding of the structure of retroviral particles and mechanisms of virion formation. these systems also became the basis for several high-throughput assays for screening assembly inhibitors. during the last 20 years, a number of in vitro assembly assays have been established, mainly for hiv-1 (campbell et al., 2001; campbell and rein, 1999; ehrlich et al., 1992; gross et al., 2000; lanman et al., 2002), mason-pfizer monkey virus (bohmova et al., 2010; klikova et al., 1995; rumlova-klikova et al., 2000; rumlova-klikova et al., 1999; ulbrich et al., 2006), rous sarcoma virus vogt, 1995, 1997; purdy et al., 2008, 2009), and murine leukemia virus (dolezal et al., 2016; hadravova et al., 2012; cheslock et al., 2003). hts assays for inhibitors of hiv-1 assembly include several methods using purified hiv-1 ca or ca-nc proteins. one of them, the turbidimetric assay, is based on the observation that direct dilution of the hiv-1 capsid protein (ca) into a high-salt solution (1.6-2.2 m nacl) leads to the formation of tubular structures. as the tube formation is accompanied by an increase in light scattering, assembly can be detected as an increase in turbidity, and the rate of turbidity change is proportional to the rate of the assembly (lanman et al., 2002). another method, published by lemke et al., exploits the affinity of nucleocapsid (nc) for a short tg-rich deoxyribooligonucleotide, d(tg25), which is used as a scaffold (lemke et al., 2012). this arrangement enables ca to assemble at much lower protein and salt concentrations than in the turbidimetric assay (lanman et al., 2002). biotin-labeled d(tg25) bound on the surface of neutravidin-coated microtiter well plates nucleates assembly of complexes of ca-nc and soluble fluorescein-labeled d(tg25). 
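because the rate of turbidity change in the turbidimetric assay described above is taken as proportional to the assembly rate, a simple way to compare wells is to extract the steepest rise of the absorbance trace. the following is a minimal sketch under that assumption; the function name and the example trace are hypothetical, and real traces would normally be smoothed before differentiation.

```python
def max_assembly_rate(times, absorbances):
    """Return the largest absorbance increase per unit time between adjacent readings.

    times       -- strictly increasing measurement times (e.g. minutes)
    absorbances -- turbidity readings taken at those times
    """
    if len(times) != len(absorbances) or len(times) < 2:
        raise ValueError("need at least two paired readings")
    best = None
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        rate = (absorbances[i] - absorbances[i - 1]) / dt
        if best is None or rate > best:
            best = rate
    return best


if __name__ == "__main__":
    # Hypothetical traces with and without an assembly inhibitor.
    minutes = [0.0, 1.0, 2.0, 3.0]
    control = [0.0, 0.1, 0.4, 0.5]
    inhibited = [0.0, 0.02, 0.08, 0.10]
    print(max_assembly_rate(minutes, control))
    print(max_assembly_rate(minutes, inhibited))
```

the ratio of the inhibited rate to the control rate then gives a relative assembly activity for each compound.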
fluorescence is measured after washing to remove the unbound and unassembled material from the captured assembly products (lemke et al., 2012). similarly, the faith assay uses a dually labeled oligonucleotide (tqon). however, in this case, the ssdna oligonucleotide tqon is labeled with the reporter dye fluorescein (fam) as well as black hole quencher (bhq); thus, it does not emit any fluorescence. the assembly reaction is triggered by mixing hiv-1 ca-nc or a gag-truncated assembly-competent version with tqon. following incubation, during which tqon is incorporated into the particles, exonuclease is added to degrade unbound tqon, while co-assembled tqon is protected from cleavage. degradation of free unbound tqon with exoi results in separation of fam from its quencher, and the emitted fluorescence is measured (hadravova et al., 2015). phage display has been employed to screen peptide inhibitors of hiv-1 assembly (sticht et al., 2005). a commercial library of m13-derived phages presenting random 12-amino-acid peptides was analyzed for specific binding to purified ca or ca-nc proteins. the specifically bound phages were sequenced, and corresponding peptides were chemically synthesized and re-tested in an in vitro assembly assay (gross et al., 2000). late in the retroviral life cycle, grna is incorporated into the nascent particle during assembly at the plasma membrane. nc contains two zinc-finger domains that are responsible for specific binding of grna. to screen for compounds that would prevent nc-rna/dna interactions, an hts system consisting of two sequential screens was developed (breuer et al., 2012). the primary screen uses fluorescence polarization (fp), while the secondary one uses differential scanning fluorimetry (dsf). the combination and order of these two techniques were selected to first identify compounds that disrupt interactions between dna and nc, and then identify the compounds binding to nc during the secondary screen. 
a rapid and simple turbidimetric method was developed to screen inhibitors of assembly of hcv core protein (fromentin et al., 2007). for the in vitro assembly reaction, two components were used: an n-terminal part of the hcv core protein corresponding to the minimal assembly-competent domain, and rna corresponding to the full-length 5′utr of hcv. assembly of hcv nucleocapsid-like particles was initiated by mixing the purified protein and nucleic acid, and the assembly process was monitored by measuring turbidity at 350 nm. numerous viruses, including picornaviruses, retroviruses, alphaviruses, and flaviviruses, encode proteases that are essential for their virulence. the majority of viral proteases specifically cleave viral polyprotein precursors to liberate the functional proteins of the virion. some viral proteases, such as the papain-like proteases of coronaviruses, also reprogram cellular signaling pathways, including ubiquitination mechanisms and interferon-controlled responses, to prevent degradation of viral components (clementz et al., 2010; frieman et al., 2009; randow and lehner, 2009; xing et al., 2013). numerous viruses, including papillomaviruses (bronnimann et al., 2016; buck et al., 2005) and retroviruses (hallenberger et al., 1992), also use host cell proteases, mainly furin, to trim their envelope and surface proteins. this triggers conformational changes required for interaction with cellular receptors and membrane fusion in enveloped viruses. some viruses use proteases other than furin to modulate their infectivity, as shown for viruses entering airway epithelial cells [reviewed in (laporte and naesens, 2017)] such as influenza virus (böttcher-friebertshäuser et al., 2010; kühn et al., 2016), newcastle disease virus (gotoh et al., 1992), and respiratory syncytial virus (sugrue et al., 2001). 
extensive research efforts have yielded detailed information about hiv-1 protease and its inhibitors [reviewed in (de clercq, 2004; konvalinka et al., 2015; midde et al., 2016) ]. large amounts of data also are available for hcv (foote et al., 2011; pawlotsky et al., 2007; razonable, 2011) and sars-cov 3cl protease inhibitors (pillaiyar et al., 2016) , although there is not yet an inhibitor of the latter target approved for clinical use. numerous approved drugs are synthetic peptides derived from the natural proteolytic substrates of target viruses modified to improve the in vivo effects related to bioavailability, stability, and so on. numerous in vitro assays to monitor the activity of proteases and their inhibitors, including commercial kits, have been developed. classical methods use synthetic peptides that mimic the target sites of the protease. although some of the methods described here were not originally designed to screen the activity of viral proteases in the presence of inhibitors, they can be adapted for this purpose by simply changing the peptide sequence to the target site of the protease of interest. the cleavage yield is usually monitored either colorimetrically (ding and yang, 2015; zhou et al., 2014) or as a change in fluorescence triggered by the release of fluorescent labels such as 7-amido-4-methylcoumarin (amc) or rhodamine (grant et al., 2002) . fluorogenic substrates have been used to determine the activity of coronavirus proteases and screen inhibitors (kuo et al., 2004; lee et al., 2014; park et al., 2017; song et al., 2014; tomar et al., 2015; wang et al., 2016; yang et al., 2005; zhao et al., 2008) . some recently described arrangements employ nanoparticles (feltrup and singh, 2012; khalilzadeh et al., 2016; udukala et al., 2016; wang et al., 2014; zeng et al., 2015) or quantum dot bioconjugates (lee and kim, 2015; li et al., 2014; medintz et al., 2006) with immobilized fluorescently or luminescently labeled peptide substrates. 
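when protease activity is measured with a fluorogenic substrate at several inhibitor concentrations, the potency of the inhibitor is commonly summarized as an ic50. the sketch below estimates it by log-linear interpolation between the two concentrations that bracket 50% residual activity. this simple interpolation is an assumption chosen for brevity and is not a method taken from the cited studies; a full dose-response fit would normally be preferred, and all names and values are hypothetical.

```python
import math


def ic50_by_interpolation(concentrations, percent_activity):
    """Estimate IC50 by interpolating on a log10 concentration scale.

    concentrations   -- positive inhibitor concentrations (any order)
    percent_activity -- residual activity (% of uninhibited control) at each one
    Returns None when 50% activity is not bracketed by the data.
    """
    if len(concentrations) != len(percent_activity):
        raise ValueError("inputs must have the same length")
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive")
    pairs = sorted(zip(concentrations, percent_activity))
    for (c_lo, a_lo), (c_hi, a_hi) in zip(pairs, pairs[1:]):
        if a_lo >= 50.0 >= a_hi and a_lo != a_hi:
            fraction = (a_lo - 50.0) / (a_lo - a_hi)
            log_ic50 = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_ic50
    return None


if __name__ == "__main__":
    # Hypothetical dose-response data (micromolar, % activity).
    doses = [0.1, 1.0, 10.0, 100.0]
    activity = [90.0, 70.0, 30.0, 10.0]
    print(ic50_by_interpolation(doses, activity))
```

returning None when the data never cross 50% avoids reporting an extrapolated value for compounds that were tested over too narrow a range.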
alternatively, cleavage products may be analyzed by mass spectrometric methods (hu et al., 2015; joshi et al., 2017; lathia et al., 2011; rumlová et al., 2003), analytical hplc (teruya et al., 2016), or electrochemical methods based on the difference in penetration of substrate and cleavage products through the membrane of a polyion-selective sensor (gemene and meyerhoff, 2011; han et al., 1996). to study the specificity of inhibitor binding and to extend the research to rational design of inhibitors, x-ray or nmr structures of proteases in complex with the inhibitor may be determined, as reported in numerous cases for the proteases of hiv-1 [reviewed in (ghosh et al., 2016)], hcv (yilmaz et al., 2016), and mers. cell-based assays can provide additional information, including the capability of the inhibitor to pass through the cell membrane and its stability in the cytoplasm. a general determination of infectivity, such as a plaque cytotoxicity assay in the presence of protease inhibitor, may be used for confirmation. one elegant example exploits the cytotoxicity of hiv protease. cells are transfected with a protease precursor fused to gfp. in the absence of inhibitor, hiv-1 protease is autocatalytically activated and cleaves a broad variety of cellular proteins, resulting in activation of apoptosis and cell death (cummins and badley, 2010; rumlova et al., 2014). this toxic effect, when suppressed by active inhibitors, results in production of a gfp signal in surviving cells (lindsten et al., 2001). another elegant approach for hiv and coxsackievirus b3 proteases, which both undergo autocatalytic cleavage, employs constructs in which the protease gene is inserted between sequences encoding the dna-binding domain and the domain that activates transcription of the gal1-lacz reporter gene (dasmahapatra et al., 1992; murray et al., 1993). 
the protease-mediated cleavage separates the dna-binding domain from the trans-activating domain and results in failure of reporter gene transcription. this approach has also been modified with gfp as a reporter gene (hilton and wolkowicz, 2010). co-expression of cleavage-activated luciferase substrate and mers-cov protease permits both live-cell imaging and quantification of the enzyme activity (kilianski et al., 2013). the need to develop new antiviral compounds will likely persist over the long term, although there has been enormous progress in molecular biology methods, especially rna silencing, bioinformatics, imaging, and structural biology techniques. viruses present challenging targets for drug development due to their flexibility and adaptability caused by the error-prone copying of their genomes, which can result in the emergence of drug-resistant mutants. viral integration into the host genome and inhibitor toxicity are other obstacles. here, we provide an overview of in vitro methods, including cell-based assays, that may be suitable for screening of antivirals that interfere with the key steps of viral life cycles and target either virus- or cell-encoded proteins required for infectivity. 
quantitative and universal detection of viable microbes this work was supported by ga čr (cz) ga17-25602s to mr, ga17-24281s to tr and the ministry of education, youth and sport of the czech republic(oppc cz.2.16/3.1.00/24503), through its "national program of sustainability i" npu lo 1601. key: cord-254957-jqp1gto6 authors: klann, kevin; bojkova, denisa; tascher, georg; ciesek, sandra; münch, christian; cinatl, jindrich title: growth factor receptor signaling inhibition prevents sars-cov-2 replication date: 2020-08-11 journal: mol cell doi: 10.1016/j.molcel.2020.08.006 sha: doc_id: 254957 cord_uid: jqp1gto6 summary sars-cov-2 infections are rapidly spreading around the globe. the rapid development of therapies is of major importance. however, our lack of understanding of the molecular processes and host cell signaling events underlying sars-cov-2 infection hinder therapy development. we employed a sars-cov-2 infection system in permissible human cells to study signaling changes by phospho-proteomics. we identified viral protein phosphorylation and defined phosphorylation-driven host cell signaling changes upon infection. growth factor receptor (gfr) signaling and downstream pathways were activated. drug-protein network analyses revealed gfr signaling as key pathway targetable by approved drugs. inhibition of gfr downstream signaling by five compounds prevented sars-cov-2 replication in cells, assessed by cytopathic effect, viral dsrna production, and viral rna release into the supernatant. this study describes host cell signaling events upon sars-cov-2 infection and reveals gfr signaling as central pathway essential for sars-cov-2 replication. it provides with novel strategies for covid-19 treatment. co-expression clustering severe acute respiratory syndrome coronavirus 2 (sars-cov-2), a novel coronavirus, has been rapidly spreading around the globe since the beginning of 2020. 
in people, it causes coronavirus disease 2019 (covid-19), often accompanied by severe respiratory syndrome (chen et al., 2020). to conquer the global health crisis triggered by covid-19, drugs must be established rapidly to dampen the disease course and relieve healthcare institutions. thus, repurposing of already available and (ideally) approved drugs might be essential to rapidly treat covid-19. many studies proposing the repurposing of specific drugs have been conducted in the last months, but most remain computational without tests in infection models (smith and smith, 2020; wang, 2020). in addition, they are hindered by the lack of knowledge about the molecular mechanisms of sars-cov-2 infection and the resulting host-cell responses required to allow viral replication. to rationally repurpose drugs, a molecular understanding of the infection and the changes within the host cell pathways is essential. experimentally identifying viral targets in the cell allows candidate drugs to be selected with high confidence for further testing in the clinics to reduce the risks for patients resulting from tests with drugs lacking in vitro validation. growth factor receptor (gfr) signaling plays important roles in cancer pathogenesis and has also been reported to be crucial for infection with some viruses (beerli et al., 2019; kung et al., 2011; zhu et al., 2009). gfr activation leads to the modulation of a wide range of cellular processes, including proliferation, adhesion, or differentiation (yarden, 2001). various viruses, such as epstein-barr virus, influenza, or hepatitis c, have been shown to use the epidermal growth factor receptor (egfr) as an entry receptor (eierhoff et al., 2010; kung et al., 2011; lupberger et al., 2011). in addition, egfr activation can suppress interferon signaling and thus the antiviral response elicited in respiratory virus diseases, for instance influenza a and rhinovirus (ueki et al., 2013). 
activation of gfr signaling might also play an important role in other respiratory viruses, such as sars-cov-2. in the last years, it has been shown for many viruses that modulation of host cell signaling is crucial for viral replication and might exhibit strong therapeutic potential (beerli et al., 2019; pleschka et al., 2001). however, how sars-cov-2 infection changes host cell signaling has remained unclear. we recently established an in vitro cell culture model of sars-cov-2 infection using the colon epithelial cell line caco-2, which is highly permissive for the virus and commonly used for the study of coronaviruses (herzog et al., 2008; ren et al., 2006). here, we determine changes in the cellular phospho-protein networks upon infection with sars-cov-2 to gain insight into infection-induced signaling events. we found extensive rearrangements of cellular signaling pathways, particularly of gfr signaling. strikingly, inhibiting gfr signaling using prominent (anti-cancer) drugs, namely pictilisib, omipalisib, ro5126766, lonafarnib, and sorafenib, prevented sars-cov-2 replication in vitro, assessed by cytopathic effect and viral rna replication and release. these compounds prevented replication at clinically achievable concentrations. due to their clinical availability, these drugs could be rapidly transitioned towards clinical trials to test their feasibility as a covid-19 treatment option.
in a previous study, we analyzed the effect of sars-cov-2 infection on the host cell translatome and proteome (bojkova et al., 2020). this study found the effects 24 hours after sars-cov-2 infection especially useful for identifying druggable host pathways. to evaluate changes in intracellular signaling networks brought about by sars-cov-2 infection, we quantified phosphoproteome changes 24 hours after infection (figure 1a). 
caco-2 cells were mock-infected or infected with sars-cov-2 patient isolates (in five biological replicates at an moi of 1) for one hour, washed, and incubated for 24 hours before cell harvest. extracted proteins were digested and split to 1) carry out whole-cell proteomics of tandem mass tag (tmt) 10-plex samples using liquid chromatography synchronous precursor selection mass spectrometry (lc-sps-ms3), or 2) use fe-nta phosphopeptide enrichment (achieving 98% enrichment) for phospho-proteome analyses of a tmt 10-plex analyzed by lc-ms2, due to the higher precision and identification rates of ms2-based methods during phosphopeptide measurements (hogrebe et al., 2018). we identified and quantified 7,150 proteins and 16,715 different phosphopeptides for a total of 15,093 different modification sites (figure 1b, c, s1, and tables s1, s2). the main fraction of phosphopeptides were modified serines (86.4%), followed by threonine (13.4%), and tyrosine (0.2%) (figure 1d). upon infection, 2,197 and 799 phosphopeptides significantly increased or decreased, respectively (log2 fc > 1, p value < 0.05).
viral proteins are produced in the host cell and are subject to (and often require) post-translational modification (ptm) by host cell enzymes (wu et al., 2009). accordingly, we assessed viral proteins phosphorylated in the host cell. we identified 33 modification sites on 6 different viral proteins (figure 1e-j). possible functions of the observed modifications largely remain unclear due to a lack of understanding of their molecular function and regulation. sars-cov-2 protein 3a was phosphorylated on the luminal side of this transmembrane protein (figure 1e). membrane protein m was phosphorylated at three serines in close proximity, at the c-terminal, cytoplasmic region of the protein (figure 1f), suggesting a high-activity modification surface. sars-cov-1 protein 6 was described to accelerate infections in murine systems (tangudu et al., 2007). 
we found a single phosphorylation of the sars-cov-2 protein homologue non-structural protein 6 in host cells (figure 1g). protein 9b was modified at two sites (figure 1h). however, its function in sars-cov-1 or sars-cov-2 remains unknown. polyprotein 1b is a large 7,096 amino acid protein heavily processed to generate distinct proteins in sars-cov-1 (tangudu et al., 2007). we found polyprotein 1b to be modified at three residues, two in a region of unknown function and one in the non-structural protein 11 (nsp11) part of the protein (figure 1i). our data cannot distinguish whether phosphorylation occurred before or after cleavage and whether phosphorylation may affect processing. sars-cov-2 nucleoprotein was heavily phosphorylated (figure 1j). mapping phosphosites to the structure (residues 47 to 173, pdb: 6vyo) revealed a small surface region, suggesting specific regulation and interaction changes (figure 1k). to reveal host kinases potentially phosphorylating viral proteins, we bioinformatically assessed identified phosphorylation motifs using netphos 3.1 and gps5 (blom et al., 2004; wang et al., 2020) (table s3). some motifs present in nucleoprotein were predicted to be modified by cmgc kinases. among several others, casein kinase ii (ck2) kinases are part of the cmgc family and have been independently identified as interaction partners of the nucleoprotein when expressed in cells (gordon et al., 2020). inhibition of ck2 kinases could be employed to study possible functional interactions between kinase and viral protein. taken together, we identified extensive changes in phosphorylation of host and viral proteins after sars-cov-2 infection. the roles of viral protein modifications remain unclear. however, targeting the corresponding host kinases may offer new treatment strategies. 
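the significance criteria quoted earlier in this section (log2 fc > 1, p value < 0.05), together with the benjamini hochberg adjustment named in the volcano plot legends, can be sketched in a few lines of python. this is a minimal illustration and not the authors' pipeline: the function names `benjamini_hochberg` and `classify` are hypothetical, the symmetric cutoff for decreased phosphopeptides (log2 fc < -1) is an assumption, the example numbers are made up, and the p values are taken as given instead of being computed with a t-test.

```python
def benjamini_hochberg(p_values):
    """return benjamini-hochberg adjusted p values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # walk from the largest p value down so adjusted values stay monotone
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def classify(log2_fc, p_value, fc_cutoff=1.0, alpha=0.05):
    """label one phosphopeptide as increased, decreased, or unchanged."""
    if p_value < alpha and log2_fc > fc_cutoff:
        return "increased"
    if p_value < alpha and log2_fc < -fc_cutoff:  # assumed symmetric cutoff
        return "decreased"
    return "unchanged"


# made-up example values, not data from the study
log2_fcs = [2.3, -1.8, 0.4, 1.5]
raw_p = [0.001, 0.004, 0.03, 0.20]
labels = [classify(fc, p) for fc, p in zip(log2_fcs, benjamini_hochberg(raw_p))]
```

whether the 0.05 threshold is applied to raw or adjusted p values is a choice left to the caller here; the example passes adjusted values.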
to identify the key host signaling pathway networks modulated by infection, we carried out protein-protein co-regulation analysis on all proteins quantified in phosphorylation and total protein level. we first standardized phosphorylation and total protein levels by individual z-scoring to compare the different datasets. subsequently, to merge phosphorylation and proteome data, we collapsed all phospho-sites for each protein into one average profile and calculated combined z-scores. patterns of co-regulation were identified using protein-protein correlation and hierarchical clustering (figure 2a). this generalized approach allows the study of large-scale patterns of dependencies of protein and phosphorylation levels, which can then be dissected into individual phosphorylation sites and protein levels for downstream analysis. the dynamic landscape of the proteome revealed three main clusters of co-regulated proteins, each one representing different sets of pathways (discussed in detail below):
the first cluster was mainly comprised of receptor signaling and endocytic pathways (figure 2b). prominent among these pathways were platelet derived growth factor receptor (pdgfr), erbb1 (egfr) signaling, metabolism, and various pathways associated with vesicle trafficking (table s4). as changes in phospho-peptide abundance can represent different ratios in phosphorylated versus non-phosphorylated peptide or a change in protein abundance (with the same ratio of protein being phosphorylated), we integrated our phospho-proteome dataset with total proteome data (figure 2c). when comparing abundances of individual phosphopeptides and their protein levels, extensive changes were observed in the phospho-proteome; however, no general changes were seen for the total proteome (figure 2c, table s2). 
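the co-regulation analysis described in this section (individual z-scoring, averaging all phospho-sites of a protein into one profile, protein-protein correlation, hierarchical clustering) can be sketched as follows. this is a hedged, standard-library illustration and not the authors' code: all function names are hypothetical, the population standard deviation is used for z-scoring, pearson correlation stands in for the unspecified correlation measure, and average linkage on the distance 1 - r is an assumed clustering choice. how the phospho and total-protein z-scores were combined is not stated in the text, so that step is left out.

```python
from statistics import mean, pstdev


def zscore(profile):
    """standardize one abundance profile (population sd); flat profiles map to zeros."""
    mu, sd = mean(profile), pstdev(profile)
    if sd == 0:
        return [0.0] * len(profile)
    return [(v - mu) / sd for v in profile]


def collapse_sites(site_profiles):
    """average the z-scored profiles of all phospho-sites of one protein."""
    z_profiles = [zscore(p) for p in site_profiles]
    return [mean(column) for column in zip(*z_profiles)]


def pearson(x, y):
    """pearson correlation as the mean product of population z-scores."""
    return mean([a * b for a, b in zip(zscore(x), zscore(y))])


def cluster(profiles, k):
    """average-linkage agglomerative clustering on distance 1 - r, stopped at k clusters."""
    clusters = [[name] for name in profiles]

    def distance(c1, c2):
        return mean([1 - pearson(profiles[a], profiles[b]) for a in c1 for b in c2])

    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: distance(clusters[ij[0]], clusters[ij[1]]))
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return clusters


# made-up profiles across four samples, not data from the study
profiles = {
    "protein_a": collapse_sites([[1, 2, 3, 4], [2, 4, 6, 8]]),
    "protein_b": [10, 20, 30, 40],
    "protein_c": [4, 3, 2, 1],
    "protein_d": [8, 6, 4, 2],
}
groups = cluster(profiles, 2)
```

in this toy input the two rising profiles end up in one group and the two falling profiles in the other, which mirrors the idea of clusters of co-regulated proteins.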
because phosphopeptide abundances changed while total protein levels did not, the phosphorylation changes were induced by altered signaling activity resulting in increased phosphorylation and not by protein abundance differences. the second cluster was mainly comprised of proteins decreased in phosphorylation and was highly connected to cell cycle and translation initiation (figure 2d and table s4). we reported recently that inhibition of cellular translation prevented sars-cov-2 replication in cells (bojkova et al., 2020), consistent with regulation of translation by altering phosphorylation patterns. to further distinguish the regulations within this cluster, we correlated protein levels with differential phosphorylation abundance (figure 2e) and found two groups of proteins: the first contained translation-related pathways (identified in figure 2e) and was predominantly regulated by decreased modification. the second set of proteins was decreased in phosphorylation and on total protein level. the majority of proteins found in the second cluster belonged to diverse cell cycle pathways. consistent with these findings, cell cycle pathways were also enriched in the set of proteins significantly decreased on protein level (figure 2f, s2, and tables s4 and s5). translation pathways were not regulated on protein level to this extent. analysis of the third cluster revealed signaling events of the splicing machinery (table s4), possibly explaining previously observed changes in splicing machinery abundance upon sars-cov-2 infection (bojkova et al., 2020). consistent with previous literature (grimmler et al., 2005; ilan et al., 2017; mathew et al., 2008; mermoud et al., 1994), we therefore hypothesized that the host splicing machinery is extensively reshaped during viral infection. this finding further supports splicing as a potential therapeutic target, in agreement with decreased sars-cov-2 pathogenic effects when inhibiting splicing by pladienolide b. 
additionally, we found carbon metabolism among the pathways showing significantly increased phosphorylation upon sars-cov-2 infection (table s4), in addition to previously described changes of total protein levels of enzymes that are part of glycolysis and carbon metabolism (bojkova et al., 2020) (figure s3). taken together, we showed that, during sars-cov-2 infection, specific rearrangements of signaling pathways were elicited in the cellular proteome. regulation was mainly comprised of cellular signaling and translational pathways as well as proteins regulated not only by phosphorylation, but also in total protein abundance. proteins exhibiting decreased protein levels were significantly enriched in cell cycle proteins.
we observed over 2,000 phospho-peptides to be increased in abundance while their protein levels stayed constant upon infection (figure 2c and s4). this reveals differential modification activity (e.g. signaling events) for these phospho-proteins. for many kinases in cellular signaling pathways there are already approved drugs available. hence, we investigated the potential to repurpose drugs to treat covid-19 by mapping already available drugs via reactomefi to the set of proteins increased in phosphorylation. we filtered the network for drugs and direct targets and found egfr as one of the central hits, including a number of regulated proteins in the downstream signaling pathway of egfr (figure 3a). these downstream targets are also regulated by other gfrs and could thus also be explained by their observed activation upon sars-cov-2 infection (figure 2). 28 clinically approved drugs (largely used in cancer therapy) are already available to target egfr or downstream targets. indeed, we found a subnetwork of gfr signaling components remodeled (figure 3b). 
we mapped identified members of gfr signaling and their respective phosphorylation differences upon sars-cov-2 infection (figure 3c), revealing an extensive overall increase in phosphorylation of the whole pathway, including related components for cytoskeleton remodeling and receptor endocytosis. how gfr signaling might regulate sars-cov-2 infection is still a matter of speculation. however, gfr signaling inhibition might provide a useful approach already implicated in sars-cov induced fibrosis therapy and might be a viable strategy to treat covid-19.
since gfr signaling seemed to be central during sars-cov-2 infection, we examined the use of inhibitors as antiviral agents. since there are several gfrs integrating their signaling and regulating a number of processes inside the cell, directly targeting downstream signaling components is likely to be more successful to prevent signaling of different gfrs and to avoid mixed effects of multiple pathways. gfr signaling, amongst others, 1) results in activation of the raf/mek/erk mapk signaling cascade and 2) integrates (via phosphoinositide 3 kinase [pi3k] and protein kinase b [akt]) into mtorc1 signaling to regulate proliferation (figure 4a). to explore the antiviral efficiency of targeting proteins downstream of gfrs, we first tested the pi3k inhibitors pictilisib and omipalisib (ippolito et al., 2016; sarker et al., 2015; schmid et al., 2016). both compounds inhibited viral replication, based on their propensity to prevent cytopathogenic effect (cpe) and viral rna production in cells (figure 4b-d, s5, and s6). our drug-target analyses identified mitogen activated protein kinase kinase (map2k2, better known as mek) and the raf inhibitor sorafenib (wilhelm et al., 2006) as promising targets inhibiting downstream signaling of gfrs (figure 4a). thus, we tested sorafenib and the dual raf/mek inhibitor ro5126766 in our viral replication assays. 
both compounds inhibited cytopathic effects during infection and viral replication (figure 4b-d, s5, and s6). to validate our findings in another cell line, we repeated the treatments in ukf-rc-2 cells infected with sars-cov-2. quantifying the viral rna copies in the supernatant, we observed that the compounds efficiently inhibited virus replication (figure 4e and s7a) at non-toxic conditions (figure s7b, except for sorafenib). overall, five compounds, inhibiting downstream signaling of gfrs, prevented sars-cov-2 replication at clinically achievable concentrations (figure 4b and 5) (eskens et al., 2001; fucile et al., 2014; martinez-garcia et al., 2012; munster et al., 2016; sarker et al., 2015), emphasizing the importance of gfr signaling during sars-cov-2 infection and revealing clinically available treatment options as drug candidates for covid-19.
with the rapid spreading of the covid-19 pandemic, investigating the molecular mechanisms underlying sars-cov-2 infection is of high importance. in particular, the processes underlying infection and host-cell response remain unclear. these would offer potential avenues for pharmacological treatment of covid-19. here, we report global, differential phosphorylation analysis of host cells after infection with intact sars-cov-2 virus. we could identify phosphorylation sites on numerous viral proteins in cells, showing that they can undergo efficient modification in infected cells. for now, we can only speculate about the host kinases involved and the functions driven by ptms, which will be an important topic for follow-up studies. for sars-cov-1, it was shown that modification of viral proteins can lead to regulation of rna binding of the nucleoprotein (wu et al., 2009) and is needed for viral replication. although similar effects in sars-cov-2 are likely, this remains to be studied in this novel virus. 
a recent paper analyzed the interaction profile of sars-cov-2 proteins expressed in hek293t cells (gordon et al., 2020). for the heavily phosphorylated nucleoprotein, they could identify interactions with the host casein kinases, which might indicate possible modification events by the latter. also for the orf9b/protein 9b that we found modified in cells, interaction mapping identified mark kinases as interaction partners. by exploring the signaling changes inside the host cell, we could gain important insights into host cell signaling during infection. we found essential gfr signaling pathways activated, such as egfr or pdgfr, together with a plethora of rhogtpase associated signaling molecules. we could furthermore show modulation of the splicing machinery, in line with previous results indicating dependency of viral in vitro pathology on the host spliceosome (bojkova et al., 2020). the same is true for metabolic reprogramming, for which we found differential post-translational modification of most members of the carbon metabolic pathways, namely glycolysis, pentose phosphate and tca cycle. these metabolic pathways were significantly up-regulated on total protein levels in the presented dataset, consistent with our previous study (bojkova et al., 2020), suggesting that these key pathways are regulated on multiple levels. a number of drugs to treat covid-19 have been suggested, largely based on bioinformatics analyses of genetics or cellular data (gordon et al., 2020; li et al., 2020; wang, 2020). however, for many of these compounds, studies explaining their working mechanisms in the context of sars-cov-2, or viral assays to determine their efficacy in blocking viral replication in cell models of sars-cov-2 infection, are missing. 
while monitoring signaling changes in host cells, we observed activation of gfr signaling cascades after infection, consistent with other viruses relying on the receptors themselves or elicited signal transduction (eierhoff et al., 2010; kung et al., 2011; lupberger et al., 2011; ueki et al., 2013; wu et al., 2017; zhu et al., 2009). from our data we could not clearly conclude which gfr might be activated and thus tested whether gfr downstream signaling inhibition can prevent sars-cov-2 replication, as reported for some other viruses (baturcam et al., 2019; pleschka et al., 2001). previously, temporal kinome analysis identified antiviral potential of ras/raf/mek and pi3k/akt/ for mers-cov (kindrachuk et al., 2015). by targeting the ras/raf/mek and pi3k/akt/mtor downstream axes of gfr signaling, we found efficient inhibition of viral replication in two different cell lines derived from different tissues (figure 4). gfr signaling was shown to play a role in diverse virus infections as well as in fibrosis induction by sars-cov-1 (beerli et al., 2019; kung et al., 2011; lupberger et al., 2011; pleschka et al., 2001; ueki et al., 2013). thus, our results in cytopathic effects might indeed indicate cytoprotective roles for gfr signaling axes during sars-cov-2 infection and possible development of fibrosis (luo et al., 2020). notably, some inhibitors used in our study, such as omipalisib, were shown to suppress fibrosis progression in patients with idiopathic pulmonary fibrosis, which may share deregulation of signaling pathways involved in lung fibrosis of coronavirus patients. these findings suggest that inhibitors of gfr downstream signaling may bring benefit to covid-19 patients independently of their antiviral activity. taken together, this study provides new insights into molecular mechanisms elicited by sars-cov-2 infection. 
proteomic analyses revealed several pathways that are rearranged during infection and showed that targeting of those pathways is a valid strategy to inhibit cytopathic effects triggered by infection.
in this study, cancer cell lines were used to assess the effect of sars-cov-2 on host cells during infections. we chose the experimental time-point based on previous analysis of the infection course in these cells (bojkova et al., 2020). notably, the kinetics of infection are likely to be different in other cell lines or primary material, since we also observed a different moi needed for ukf-rc-2 cells. in addition, we tested the efficiency of the presented drugs only in the context of in vitro cell line experiments. thus, the results do not represent direct evidence for the use of these therapeutics in patients, as the effects might differ in primary tissue. the results presented indicate potential anti-viral effects that have to be further validated in other models and clinical trials to assess their usefulness for the treatment of covid-19.
(a) experimental scheme. caco-2 cells were infected with sars-cov-2 for one hour (moi: 1), washed and incubated for additional 24 hours. proteins were extracted and prepared for bottom-up proteomics. all ten conditions were multiplexed using tmt10 reagents. 250 µg of pooled samples were used for whole cell proteomics (24 fractions) and the remainder (~1 mg) enriched for phosphopeptides by fe-nta. phosphopeptides were fractionated into 8 fractions and concatenated into 4 fractions. all samples were measured on an orbitrap fusion lumos.
(b) volcano plot showing fold changes of infected versus mock cells for all quantified phosphopeptides. p values were calculated using an unpaired, two-sided student's t-test with equal variance assumed and adjusted using the benjamini hochberg fdr method (n = 5 biological replicates). 
orange or blue points indicate significantly increased or decreased phosphopeptides, respectively.
(c) volcano plot showing differences between sars-cov-2 and mock infected cells in total protein levels for all quantified proteins. p values were calculated using an unpaired, two-sided student's t-test with equal variance assumed and adjusted using the benjamini hochberg fdr method (n = 5). orange or blue points indicate significantly increased or decreased phosphopeptides, respectively.
(d) distribution of phosphorylation sites identified across modified amino acids. see also figure s1 and tables s1 and s2. see also tables s1, s2, s3 and figure s1.
(e) scatter plot showing correlation between fold changes of phosphopeptides compared to fold changes of total protein levels. two subsets of phosphopeptides were detected: one was mainly regulated by differential modification (indicated in yellow), the second by changes in protein abundance.
(f) string network analysis of proteins decreased in total protein levels (figure 1c). inserts indicate pathways found in the network. see also tables s1, s2, s4, s7, figures s2 and s3. see also figure s4. see also figure s5.
resource availability further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact: christian münch: ch.muench@em.uni-frankfurt.de (c.m.)
ukf-rc-2 human kidney cells derived from renal carcinoma are available upon request.
the mass spectrometry proteomics data have been deposited to the proteomexchange consortium via the pride (perez-riverol et al., 2019).
human caco-2 cells, derived from colon carcinoma, were obtained from the deutsche sammlung von mikroorganismen und zellkulturen (dsmz; braunschweig, germany). cells were grown at 37°c in minimal essential medium (mem) supplemented with 10% fetal bovine serum (fbs) and containing 100 iu/ml penicillin and 100 µg/ml streptomycin. 
all culture reagents were purchased from sigma.
the cell line designated ukf-rc-2 was established from a tumor sample of a patient with a diagnosis of renal carcinoma hospitalized at the department of urology, university hospital frankfurt. tumor tissue was cut in pieces and dissociated using 0.2% trypsin solution. culture of primary tumor cells and passaging of the cell line were performed using imdm medium supplemented with 10% fcs and antibiotics. ukf-rc-2 cells between passages 15 and 20 were used for antiviral experiments. all culture reagents were purchased from sigma.
sars-cov-2 was isolated from samples of travelers returning from wuhan (china) to frankfurt (germany) using the human colon carcinoma cell line caco-2 as described previously (12). sars-cov-2 stocks used in the experiments had undergone one passage on caco-2 cells and were stored at -80°c. virus titers were determined as tcid50/ml in confluent cells in 96-well microtiter plates.
confluent layers of caco-2 cells in 96-well plates were infected with sars-cov-2 at moi 0.01. virus was added together with drugs and incubated in mem supplemented with 2% fbs with different drug dilutions. cytopathogenic effect (cpe) was assessed visually 48 h after infection. to assess effects of drugs on caco-2 cell viability, confluent cell layers were treated with different drug concentrations in 96-well plates. the viability was measured using the rotitest vital (roth) according to the manufacturer's instructions. data for each condition was collected for at least three biological replicates. for dose response curves, data was fitted with all replicates using originpro 2020 with the following equation: [equation missing from source]. ic50 values were generated by originpro 2020 together with metrics for curve fits. for ukf-rc-2 cells, the assay was performed as described above, except that an moi of 0.1 was used, as staining experiments for sars-cov-2 infection in ukf-rc-2 cells revealed the need for a higher moi to achieve effects comparable to caco-2 cells. 
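the dose response fitting mentioned above can be illustrated with a small sketch. the equation used by the authors is not reproduced in this text, so the hill-type model below is only a commonly used stand-in and an assumption, and the coarse least-squares grid search is not the algorithm of originpro 2020. the function names `inhibition` and `fit_ic50` are hypothetical, responses are assumed to be percent inhibition between 0 and 100, and the example data are synthetic.

```python
import math


def inhibition(dose, ic50, hill):
    """hill-type model: percent inhibition rising from 0 to 100 with dose (dose > 0)."""
    return 100.0 / (1.0 + (ic50 / dose) ** hill)


def fit_ic50(doses, responses):
    """least-squares grid search over log10(ic50) and the hill slope.

    replicates are passed as repeated doses, so all replicates enter the fit.
    """
    lo, hi = math.log10(min(doses)), math.log10(max(doses))
    best = None
    for step in range(401):  # ic50 candidates spaced evenly on a log scale
        ic50 = 10 ** (lo + (hi - lo) * step / 400)
        for tenth in range(5, 41):  # hill slopes 0.5, 0.6, ..., 4.0
            hill = tenth / 10
            sse = sum((r - inhibition(d, ic50, hill)) ** 2
                      for d, r in zip(doses, responses))
            if best is None or sse < best[0]:
                best = (sse, ic50, hill)
    return best[1], best[2]


# synthetic, noise-free example (not data from the study)
doses = [0.01, 0.1, 1.0, 10.0, 100.0]
responses = [inhibition(d, 1.0, 1.0) for d in doses]
ic50, hill = fit_ic50(doses, responses)
```

a grid search keeps the sketch free of external dependencies; a dedicated curve-fitting routine would normally be preferred and would also return fit metrics.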
sars-cov-2 rna from cell culture supernatant samples was isolated using avl buffer and the qiaamp viral rna kit (qiagen) according to the manufacturer's instructions. absorbance-based quantification of the rna yield was performed using the genesys 10s uv-vis spectrophotometer (thermo scientific). rna was subjected to one-step qrt-pcr analysis using the luna universal one-step rt-qpcr kit (new england biolabs) and a cfx96 real-time system, c1000 touch thermal cycler. primers were adapted from the who protocol (corman et al., 2020) targeting the open reading frame for rna-dependent rna polymerase (rdrp): rdrp_sarsr-f2 (gtg ara tgg tca tgt gtg gcg g) and rdrp_sarsr-r1 (car atg tta aas aca cta tta gca ta) using 0.4 µm per reaction. standard curves were created using plasmid dna (pex-a128-rdrp) harboring the corresponding amplicon regions for the rdrp target sequence according to genbank accession number nc_045512. all quantification experiments have been carried out with biological replicates.
the effect of selected compounds on viral replication was assessed by staining of double-stranded rna, which has been shown to be sufficient for measurement of sars-cov-1 replication (weber et al., 2006). briefly, cells were fixed with acetone/methanol (40:60) solution 48 h post infection. immunostaining was performed using a monoclonal antibody directed against dsrna (1:150 dilution, scicons j2, mouse, igg2a, kappa chain, english & scientific consulting kft., szirák, hungary), which was detected with a biotin-conjugated secondary antibody (1:1000 dilution, jackson immunoresearch) followed by application of streptavidin-peroxidase conjugate (1:3000 dilution, sigma aldrich). lastly, the dsrna positive cells were visualized by addition of aec substrate. wells were imaged at different areas to visualize a larger area (presented in supplementary figures). 
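for the qrt-pcr quantification described earlier in this section, the use of a plasmid standard curve can be sketched as below. this is a minimal illustration under stated assumptions, not the authors' analysis: it assumes the usual linear relationship between threshold cycle (ct) and log10 copy number, the function names `fit_standard_curve` and `copies_from_ct` are hypothetical, and the dilution series and ct values are made up.

```python
import math


def fit_standard_curve(copies, ct_values):
    """least-squares line ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copies_from_ct(ct, slope, intercept):
    """invert the standard curve to estimate template copies for a sample."""
    return 10 ** ((ct - intercept) / slope)


# made-up plasmid dilution series, not data from the study
standard_copies = [1e2, 1e3, 1e4, 1e5]
standard_ct = [33.0, 29.5, 26.0, 22.5]
slope, intercept = fit_standard_curve(standard_copies, standard_ct)
sample_copies = copies_from_ct(27.75, slope, intercept)
```

with these toy standards the fitted line has a slope of about -3.5 cycles per tenfold dilution, and a sample ct is converted to copies by reading the line backwards.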
for all proteomics analysis, caco-2 cells were infected at an moi of 1 and the sample preparation was performed as described previously. briefly, lysates were precipitated by methanol/chloroform and proteins resuspended in 8 m urea/10 mm epps ph 8.2. concentration of proteins was determined by bradford assay and 300 µg of protein per sample was used for digestion. for digestion, the samples were diluted to 1 m urea with 10 mm epps ph 8.2 and incubated overnight with 1:50 lysc (wako chemicals) and 1:100 sequencing grade trypsin (promega). digests were acidified using tfa and tryptic peptides were purified by tc18 seppak (50 mg, waters). 125 µg peptides per sample were tmt labelled and the mixing was normalized after a single injection measurement by lc-ms/ms to equimolar ratios for each channel. 250 µg of pooled peptides were dried for offline high ph reverse phase fractionation by hplc (whole cell proteome) and the remaining 1 mg of multiplexed peptides was used for phospho-peptide enrichment by high-select fe-nta phosphopeptide enrichment kit (thermo fisher) according to the manufacturer's instructions. after enrichment, peptides were dried and resuspended in 70% acetonitrile/0.1% tfa and filtered through a c8 stage tip to remove contaminating fe-nta particles. dried phospho-peptides were then fractionated on c18 (empore) stage-tips. for fractionation, c18 stage-tips were washed with 100% acetonitrile twice, followed by equilibration with 0.1% tfa solution. peptides were loaded in 0.1% tfa solution and washed with water. elution was performed stepwise with different acetonitrile concentrations in 0.1% triethylamine solution (5%, 7.5%, 10%, 12.5%, 15%, 17.5%, 20%, 50%). the eight fractions were concatenated into four fractions and dried for lc-ms.
peptides were fractionated using a dionex ultimate 3000 analytical hplc. 
250 µg of pooled and purified tmt-labeled samples were resuspended in 10 mm ammonium-bicarbonate (abc), 5% acn, and separated on a 250 mm long c18 column (x-bridge, 4.6 mm id, 3.5 µm particle size; waters) using a multistep gradient from 100% solvent a (5% acn, 10 mm abc in water) to 60% solvent b (90% acn, 10 mm abc in water) over 70 min. eluting peptides were collected every 45 s into a total of 96 fractions, which were cross-concatenated into 24 fractions and dried for further processing.
all mass spectrometry data was acquired in centroid mode on an orbitrap fusion lumos mass spectrometer hyphenated to an easy-nlc 1200 nano hplc system using a nanoflex ion source (thermofisher scientific) applying a spray voltage of 2.6 kv with the transfer tube heated to 300 °c and a funnel rf of 30%. internal mass calibration was enabled (lock mass 445.12003 m/z). peptides were separated on a self-made, 32 cm long, 75 µm id fused-silica column, packed in house with 1.9 µm c18 particles (reprosil-pur, dr. maisch) and heated to 50 °c using an integrated column oven (sonation). hplc solvents consisted of 0.1% formic acid in water (buffer a) and 0.1% formic acid, 80% acetonitrile in water (buffer b).
for total proteome analysis, a synchronous precursor selection (sps) multi-notch ms3 method was used in order to minimize ratio compression as previously described (mcalister et al., 2014). individual peptide fractions were eluted by a non-linear gradient from 7 to 40% b over 90 minutes followed by a step-wise increase to 95% b in 6 minutes which was held for another 9 minutes. full scan ms spectra (350-1400 m/z) were acquired with a resolution of 120,000 at m/z 200, maximum injection time of 100 ms and agc target value of 4 x 10^5. the 20 most intense precursors with a charge state between 2 and 6 per full scan were selected for fragmentation ("top 20") and isolated with a quadrupole isolation window of 0.7 th.
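the cross-concatenation of 96 high ph fractions into 24 pools mentioned above can be sketched as follows. the source does not state which fractions were combined, so this minimal python example assumes the common scheme in which every 24th fraction is pooled (fraction 1 with 25, 49 and 73, and so on); the function name and defaults are hypothetical.

```python
def cross_concatenate(n_fractions=96, n_pools=24):
    """Assign 1-based fraction numbers to pools by taking every n_pools-th fraction.

    Assumed scheme: pool p receives fractions p, p + n_pools, p + 2 * n_pools, ...
    so that early- and late-eluting peptides are mixed within each pool.
    """
    pools = {pool: [] for pool in range(1, n_pools + 1)}
    for fraction in range(1, n_fractions + 1):
        pool = (fraction - 1) % n_pools + 1
        pools[pool].append(fraction)
    return pools


pools = cross_concatenate()
```

pooling distant fractions in this way keeps each pool spread across the second-dimension gradient, which is the usual motivation for concatenation rather than merging adjacent fractions.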
ms2 scans were performed in the ion trap (turbo) using a maximum injection time of 50 ms, agc target value of 1.5 x 10^4 and fragmented using cid with a normalized collision energy (nce) of 35%. sps-ms3 scans for quantification were performed on the 10 most intense ms2 fragment ions with an isolation window of 0.7 th (ms) and 2 m/z (ms2). ions were fragmented using hcd with an nce of 65% and analyzed in the orbitrap with a resolution of 50,000 at m/z 200, scan range of 110-500 m/z, agc target value of 1.5 x 10^5 and a maximum injection time of 120 ms. repeated sequencing of already acquired precursors was limited by setting a dynamic exclusion of 45 seconds and 7 ppm and advanced peak determination was deactivated.
for phosphopeptide analysis, each peptide fraction was eluted by a linear gradient from 5 to 32% b over 120 minutes followed by a step-wise increase to 95% b in 8 minutes which was held for another 7 minutes. full scan ms spectra (350-1400 m/z) were acquired with a resolution of 120,000 at m/z 200, maximum injection time of 100 ms and agc target value of 4 x 10^5. the 20 most intense precursors per full scan with a charge state between 2 and 5 were selected for fragmentation ("top 20"), isolated with a quadrupole isolation window of 0.7 th and fragmented via hcd applying an nce of 38%. ms2 scans were performed in the orbitrap using a resolution of 50,000 at m/z 200, maximum injection time of 86 ms and agc target value of 1 x 10^5. repeated sequencing of already acquired precursors was limited by setting a dynamic exclusion of 60 seconds and 7 ppm and advanced peak determination was deactivated. an ms2-based method was chosen because of higher precision and identification rates (hogrebe et al., 2018). phospho-peptide fractions intrinsically exhibit lower complexity, rendering them less prone to ratio compression by isolation interference.
raw files were analyzed using proteome discoverer (pd) 2.4 software (thermofisher scientific).
spectra were selected using default settings and database searches were performed using the sequest ht node in pd. database searches were performed against a trypsin-digested homo sapiens swissprot database and a sars-cov-2 database (uniprot pre-release). static modifications were set as tmt6 at the n-terminus and lysines and carbamidomethyl at cysteine residues. the search was performed using sequest ht taking the following dynamic modifications into account: oxidation (m), phospho (s, t, y), met-loss (protein n-terminus), acetyl (protein n-terminus) and met-loss acetyl (protein n-terminus). for whole cell proteomics, the same settings were used except that phosphorylation was not allowed as a dynamic modification. for phosphoproteomics all peptide groups were normalized by summed intensity normalization and then analyzed on peptide level. for whole cell proteomics normalized psms were summed for each accession and data exported for further use. peptide and protein identifications were validated using a concatenated target-decoy strategy and fdr was estimated using q-values calculated by percolator applying 1% and 5% cut-offs for high and medium confidence hits, while only high-confidence proteins are reported. phosphosite localization probabilities were calculated using the ptmrs node working in "phosphors mode" and using default settings.
unless otherwise stated, significance was tested by unpaired, two-sided student's t-tests with equal variance assumed. resulting p values were corrected using the benjamini-hochberg fdr procedure. adjusted p values less than or equal to 0.05 were considered significant. for phosphoproteomics an additional fold change cutoff was applied (|log2 fold change| > 1), while for total protein levels, due to the different dynamic range, a fold change cutoff of |log2 fold change| > 0.5 was applied.
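the significance filter described above (benjamini-hochberg adjustment followed by a fold change cutoff) can be sketched in python as follows. this is a minimal illustration and not the authors' code; the function names and the example values are hypothetical, and the p values are assumed to come from the t-tests described in the text.

```python
import numpy as np


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p value by n / rank.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


def is_regulated(p_values, log2_fold_changes, fc_cutoff, alpha=0.05):
    """Flag entries with adjusted p <= alpha and |log2 fold change| > fc_cutoff."""
    adjusted = benjamini_hochberg(p_values)
    fold = np.abs(np.asarray(log2_fold_changes, dtype=float))
    return (adjusted <= alpha) & (fold > fc_cutoff)


# Hypothetical example values (not study data).
example_p = [0.01, 0.04, 0.03, 0.005]
example_fc = [1.5, 0.3, -1.2, 0.8]
phospho_hits = is_regulated(example_p, example_fc, fc_cutoff=1.0)
protein_hits = is_regulated(example_p, example_fc, fc_cutoff=0.5)
```

the two calls differ only in the cutoff, mirroring the stricter threshold stated for phosphopeptides and the looser one stated for total protein levels.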
kinase motifs of phosphopeptides from sars-cov-2 proteins were predicted using netphos 3.1 (blom et al., 1999) and gps 5.0 (stand-alone version) using the fasta file of the uniprot pre-release which was also used for the proteomics data analysis (blom et al., 2004; wang et al., 2020). for netphos, only kinases with a score above 0.5 were considered as positive hits. for gps 5.0, sequences were submitted separately for s/t- and y-kinases and the score threshold was set to "high". for the final list in supplementary table 3, only the top hits with the highest scores were considered.
z-scores were calculated for each phosphosite and for the total protein levels individually. phosphosites were collapsed by average. for merging phosphorylation and total protein levels, z-scores for collapsed phosphorylation and protein level were added for each condition and replicate. thus, two negative z-scores (downregulation) will produce a lower combined z-score and, vice versa, two positive z-scores will produce a larger combined z-score. next, the euclidean distance correlation for all possible protein-protein pairs was calculated, taking all conditions and replicates individually into account. a heatmap was then built by euclidean distance hierarchical clustering of the correlation matrix.
pathway enrichment analysis was performed by the reactomefi cytoscape plugin or by string functional enrichment analysis. both analyses used the reactome database for pathway annotations.
all proteins were loaded into the reactomefi cytoscape plugin to visualize the protein-protein functional interaction network. next, drugs were overlaid by reactomefi and the network was filtered for the drugs and their first interacting partners. the layout was calculated by the yfileslayout algorithm. all proteins showing significant regulation were loaded by the omicsvisualizer cytoscape plugin and the string interaction network was retrieved with a confidence cutoff of 0.9.
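the merging of phosphorylation and total protein z-scores described above can be sketched as follows. this is a minimal python illustration under stated assumptions: z-scores are computed per row across samples (the source does not state the axis), every protein has at least one phosphosite, and the array layout and function names are hypothetical. hierarchical clustering of the resulting distance matrix is left out.

```python
import numpy as np


def zscore_rows(values):
    """Z-score each row across samples (conditions and replicates)."""
    x = np.asarray(values, dtype=float)
    mean = x.mean(axis=1, keepdims=True)
    std = x.std(axis=1, ddof=1, keepdims=True)
    return (x - mean) / std


def combined_scores(site_values, site_to_protein, protein_values):
    """Average site z-scores per protein and add the protein-level z-score.

    site_values: (n_sites, n_samples) phosphosite intensities
    site_to_protein: protein row index for each site
    protein_values: (n_proteins, n_samples) total protein intensities
    """
    site_z = zscore_rows(site_values)
    protein_z = zscore_rows(protein_values)
    owner = np.asarray(site_to_protein)
    combined = np.empty_like(protein_z)
    for protein in range(protein_z.shape[0]):
        collapsed = site_z[owner == protein].mean(axis=0)  # collapse sites by average
        combined[protein] = collapsed + protein_z[protein]
    return combined


def pairwise_euclidean(matrix):
    """Euclidean distance between every pair of rows."""
    diff = matrix[:, None, :] - matrix[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Hypothetical toy data: three sites on two proteins, three samples.
toy_sites = [[1, 2, 3], [3, 2, 1], [1, 2, 3]]
toy_proteins = [[1, 2, 3], [1, 2, 3]]
toy_combined = combined_scores(toy_sites, [0, 0, 1], toy_proteins)
toy_distances = pairwise_euclidean(toy_combined)
```

in the toy data the two opposing sites of the first protein cancel after averaging, so its combined score equals the protein-level z-score alone, whereas the second protein, whose site and protein level move together, receives a larger combined score.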
for egfr subnetwork, egfr was selected with first interacting neighbors.
highlights:
• phosphoproteomics of sars-cov-2 infected cells reveal the signaling landscape
• sars-cov-2 proteins are extensively phosphorylated in host cells
• infection leads to activation of growth factor receptor signaling
• drugs inhibiting growth factor receptor signaling prevent viral replication
in this study, klann et al. dissected the host cell signaling landscape upon infection with sars-cov-2. mapping differential signaling networks identified a number of pathways activated during infection. drug-target network analysis revealed potential therapeutic targets. growth factor receptor signaling was highly activated upon infection and its inhibition prevented sars-cov-2 replication in cells.
mek inhibition drives anti-viral defence in rv but not rsv challenged human airway epithelial cells through akt/p70s6k/4e-bp1 signalling
vaccinia virus hijacks egfr signalling to enhance virus spread through rapid and directed infected cell motility
sequence and structure-based prediction of eukaryotic protein phosphorylation sites
prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence
proteomics of sars-cov-2-infected host cells reveals therapy targets
epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in wuhan, china: a descriptive study
detection of 2019 novel coronavirus (2019-ncov) by real-time rt-pcr
the epidermal growth factor receptor (egfr) promotes uptake of influenza a viruses (iav) into host cells
phase i and pharmacokinetic study of the oral farnesyl transferase inhibitor sch 66336 given twice daily to patients with advanced solid tumors
measurement of sorafenib plasma concentration by high-performance liquid chromatography in patients with advanced hepatocellular
carcinoma: is it useful the application in clinical practice? a pilot study
a sars-cov-2-human protein-protein interaction map reveals drug targets and potential drug-repurposing
phosphorylation regulates the activity of the smn complex during assembly of spliceosomal u snrnps
plaque assay for human coronavirus nl63 using human colon carcinoma cells
benchmarking common quantification strategies for large-scale phosphoproteomics
pkr activation and eif2α phosphorylation mediate human globin mrna splicing at spliceosome assembly
omipalisib (gsk458), a novel pan-pi3k/mtor inhibitor, exhibits in vitro anti-lymphoma activity in chemotherapy-sensitive and -resistant models of burkitt lymphoma
antiviral potential of erk/mapk and pi3k/akt/mtor signaling modulation for middle east respiratory syndrome coronavirus infection as identified by temporal kinome analysis
functional translatome proteomics reveal converging and dose-dependent regulation by mtorc1 and eif2α
epstein-barr virus lmp1 activates egfr, stat3, and erk through effects on pkcδ
network bioinformatics analysis provides insight into drug repurposing for
clinical pathology of critical patient with novel coronavirus pneumonia
egfr and epha2 are host factors for hepatitis c virus entry and possible targets for antiviral therapy
first-in-human, phase i dose-escalation study of the safety, pharmacokinetics, and pharmacodynamics of ro5126766, a first-in-class dual mek/raf inhibitor in patients with solid tumors
phosphorylation of human prp28 by srpk2 is required for integration of the u4/u6-u5 tri-snrnp into the spliceosome
multinotch ms3 enables accurate, sensitive, and multiplexed detection of differential expression across cancer cell line proteomes
regulation of mammalian spliceosome assembly by a protein phosphorylation mechanism
first-in-human phase i study of gsk2126458, an oral pan-class i phosphatidylinositol-3-kinase inhibitor, in patients with advanced solid tumor malignancies
the pride
database and related tools and resources in 2019: improving support for quantification data
influenza virus propagation is impaired by inhibition of the raf/mek/erk signalling cascade
analysis of ace2 in polarized epithelial cells: surface expression and function as receptor for severe acute respiratory syndrome-associated coronavirus
first-in-human phase i study of pictilisib (gdc-0941), a potent pan-class i phosphatidylinositol-3-kinase (pi3k) inhibitor, in patients with advanced solid tumors
phase ii randomized preoperative window-of-opportunity study of the pi3k inhibitor pictilisib plus anastrozole compared with anastrozole alone in patients with estrogen receptor-positive breast cancer
repurposing therapeutics for covid-19: supercomputer-based docking to the sars-cov-2 viral spike protein and viral spike protein-human ace2 interface
severe acute respiratory syndrome coronavirus protein 6 accelerates murine coronavirus infections
respiratory virus-induced egfr activation suppresses irf1-dependent interferon λ and antiviral defense in airway epithelium
the role of epidermal growth factor receptor (egfr) signaling in sars coronavirus-induced pulmonary fibrosis
overactive egfr signaling leads to increased fibrosis after sars-cov infection
fast identification of possible drug treatment of coronavirus disease-19 (covid-19) through computational drug repurposing study
gps 5.0: an update on the prediction of kinase-specific phosphorylation sites in proteins
double-stranded rna is produced by positive-strand rna viruses and dna viruses but not in detectable amounts by negative-strand rna viruses
discovery and development of sorafenib: a multikinase inhibitor for treating cancer
glycogen synthase kinase-3 regulates the phosphorylation of severe acute respiratory syndrome coronavirus nucleocapsid protein and viral replication
human cytomegalovirus glycoprotein complex gh/gl/go uses pdgfr-α as a key for entry
the egfr family and its ligands in human
cancer: signalling mechanisms and therapeutic opportunities
airway mucin production involves a novel tlr3-egfr-dependent pathway
we thank the quantitative proteomics unit (ibc2, goethe university frankfurt) for support and expertise on lc-ms instrumentation and data analysis and christiane pallas and lena stegmann for support with experimental work.
key: cord-266977-5swwc6kr authors: secker, thomas.j.; leighton, timothy.g.; offin, douglas.g.; birkin, peter.r.; hervé, rodolphe.c.; keevil, charles.w. title: journal of hospital infection a cold water, ultrasonic activated stream efficiently removes proteins and prion-associated amyloid from surgical stainless steel date: 2020-09-19 journal: j hosp infect doi: 10.1016/j.jhin.2020.09.021 sha: doc_id: 266977 cord_uid: 5swwc6kr background: sterile service department decontamination procedures for surgical instruments struggle to demonstrate efficient removal of the hardiest infectious contaminants, such as prion proteins. a recently designed novel system, which utilises a low pressure ultrasonic activated, cold water stream, has previously demonstrated efficient hard surface cleaning of several biological contaminants. aim: to test the efficacy of an ultrasonically activated stream for the removal of tissue proteins, including prion-associated amyloid, from surgical stainless steel (ss) surfaces. methods: test surfaces were contaminated with 22l, me7 or 263k prion infected brain homogenates. the surfaces were treated with the ultrasonically activated water stream for contact times of 5 and 10 seconds. residual proteinaceous and amyloid contamination were quantified using sensitive microscopic analysis, and immunoblotting was used to characterize the eluted prion residues before and after treatment with the ultrasonically activated stream.
findings: efficient removal of the different prion strains from the surgical ss surfaces was observed, and reduced levels of protease sensitive and resistant prion protein were detected in recovered supernatant. conclusions: this study demonstrated that an ultrasonically activated stream has the potential to be a cost-effective solution to improve current decontamination practices and has the potential to reduce hospital acquired infections.
at present, the reprocessing of surgical instruments utilises a pre-wash, a washer/disinfector cycle (run at elevated temperature with detergents) and sterilisation in high heat/pressure autoclaves [1]. decontamination protocols for reusable surgical instruments are very efficient against microbiological contaminants.
however, highly hydrophobic proteins such as prions, responsible for the transmission of variant creutzfeldt-jakob disease (vcjd), are readily adsorbed to surgical stainless-steel surfaces and poorly removed or inactivated by current decontamination methods. this results in an impending risk of iatrogenic transmission of vcjd [2-5], a risk that has been experimentally demonstrated in both animal and cell-based bioassays [6-9].
the latest estimated prevalence of asymptomatic carriers of the causative protein of vcjd (prpsc) in the uk is approximately 1/2000 [10]. while the full impact of the genetic susceptibility of the host remains unclear, the ostensibly long incubation periods and the potential for disease transmission via infected blood [11-13] imply that all surgical procedures pose a risk of vcjd transmission.
improvements in the methodologies used for reprocessing surgical instruments potentially contaminated with prions are required to diminish the risk of iatrogenic vcjd transmission. novel, specialised prion decontamination protocols have been developed and in some cases marketed for sterile service departments (ssds) [7, 14-22]. however, some of these protocols are very aggressive and can be damaging to instrument surfaces and/or the washer/disinfectors themselves [14]. simple methods that can be adopted in ssds have been researched and have demonstrated improved efficiency over current practices, such as preventing instruments from drying once contaminated, i.e. keeping them in a moist environment prior to cleaning [23-27].
bjerknes forces [43] aid the scrubbing bubbles in efficiently removing contaminants from microscopic crevices, such as those found on worn surgical instruments [44], that are traditionally difficult to clean by brushes, wiping, or by chemical means that rely on passive diffusion for reagents to penetrate deep into the crevice [45].
the efficient removal of contamination from crevices using a uas system has been demonstrated previously [40]. furthermore, the microstreaming that radiates from the resonating bubbles can penetrate into crevices present on the surfaces of the contaminant, as shown in the insert in figure 1 [46]. the fact that such results can be obtained in cold water without chemical additives warrants investigation of uas for the removal of infectious prion proteins from surgical surfaces.
high-temperature decontamination methods using aggressive enzymatic or alkaline solutions, which are currently adopted to clean expensive surgical items (such as intricate neurosurgical tools), are ineffective at protein and prion removal and can shorten the lifetime of the surgical item [47]. it is not the purpose of this study to explore the replacement of such standard cleaning practices. however, given the above properties of uas, it is important to explore the possible benefits of including an innovative cold-water uas pre-wash (at the stage where ssds conduct hand brushing of instruments under a stream of water) that can be introduced with minimal operator training. this would be particularly beneficial if it could be conducted immediately after instrument use (e.g. before contaminated tissue dries on the instrument and becomes harder to remove), although in this trial the contaminant is tested in a dried-on state. the question is whether such a uas pre-wash could remove a substantial proportion of the contaminant, especially from microscopic crevices of the type associated with worn surgical instrument surfaces, and break up aggregates in which the inner portion of biological contaminant is partially protected from subsequent enzymatic cleaning chemistries.
a previous study demonstrated efficient tissue protein removal from surgical stainless steel using the uas [39].
however, due to the globular nature of the predominantly β-sheet structured infectious prion protein, it adheres to surgical stainless steel far more strongly than do normal brain tissue proteins, and therefore the ability of uas to remove brain tissue protein cannot be taken as an indicator of any efficacy in reducing the iatrogenic transmission risk of vcjd. therefore, this study involved the contamination of surgical stainless-steel surfaces with several amyloid-rich brain homogenates from prion infected rodents. normal tissue proteins and more hazardous prion-associated amyloid were differentially stained and analysed using sensitive in situ microscopy, to compare the ability of uas to remove both during the same cleaning operation.
prior to inoculating, tokens were decontaminated and analysed to be deemed free of any contamination following a previously described protocol [49].
murine scrapie me7-infected brain homogenate produced from c57bl mice (tse resource centre, roslin institute, university of edinburgh, scotland, uk); murine scrapie 22l-infected brain homogenate produced from c57bl/6j mice (kindly donated from the neuroscience department, school of biological sciences, university of southampton) and syrian hamster scrapie 263k-infected brain homogenates (tse resource centre, roslin institute, university of edinburgh, scotland, uk) were standardized to 1 mg/ml (bsa equivalent) in phosphate buffered saline (pbs, gibco) with 0.1 % (v/v) tween 20 (sigma-aldrich) as previously described [50].
pristine tokens were spiked with 1 µl (1 µg bsa equivalent) drops of 22l, me7 or 263k infected brain homogenate, and dried at 37 °c for 2 hours or room temperature for 24 hours.
tokens were subjected to decontamination using a prototype recirculating uas device (the mark i starstream ® system (f0030001)) using fresh dh2o for each sample, running at 2.32 ± 0.02 l/min at room temperature with the ultrasound on for 5 and 10 s contact times, with the sample being 10 mm from the nozzle (figure 1). once processed, the tokens were dried at 37 °c for 1 hour prior to staining and analysis.
residual tissue protein and prion-associated amyloid on the control and processed surfaces was quantified, in situ, using the total protein blot stain sypro ruby (sr; invitrogen, uk) and the amyloid specific stain thioflavin t (tht [0.2% (w/v) in 0.01 m hcl]; sigma-aldrich), as described elsewhere [50, 51]. fluorescent signal was visualised using episcopic differential interference contrast (edic) microscopy coupled with epifluorescence (ef; best scientific, wroughton, uk) [50, 52]. full x/y scans of the contaminated areas were acquired at x100 magnification showing the sypro ruby (excitation: 470 nm; emission: 618 nm) and tht signals. the captured images were analysed using imagej software (national institutes of health).
to analyse the effects of the uas treatment on infectious prion proteins, immunoblot analysis was used to determine the presence of prpc and proteinase k (pk) resistant prpsc in both 22l-spiked distilled water, as an untreated control, and the effluent taken from the uas system post cleaning of 22l-spiked stainless-steel tokens. controls were prepared by spiking 1 l of sterile distilled water with 15 µg of 22l-infected brain homogenate. uas positive samples were prepared by capturing the 1 l uas effluent post cleaning of 15 surgical stainless-steel tokens contaminated with 1 µg 22l-infected brain homogenate each (dried for 24 hours at room temperature) as described above.
the control and effluent solutions were filtered through nitrocellulose membranes to capture the suspended protein aggregates.
after 24 h drying at room temperature the 22l brain homogenate again demonstrated the highest affinity for the stainless steel, with the highest attachment of protein and prion-associated amyloid observed. when compared with 2 hours drying, the 263k-infected brain homogenate resulted in higher protein attachment after 24 h drying and the me7-infected brain homogenates demonstrated similar protein attachment but higher prion-associated amyloid attachment (figure 3). the removal of 22l and me7 tissues was slightly more difficult using a 5 s uas treatment, with 91 and 90 % protein and 97 and 99 % amyloid removal, respectively (figure 3). after 10 s uas treatment the removal was improved, with 98 and 99 % protein and 99-100 % amyloid removal, respectively. the 263k was harder to remove after 24 hours drying, with only 56 % protein and 90 % amyloid removal after the 5 s uas treatment; however, after the 10 s uas treatment the cleaning was improved, with 74 % protein and 87 % amyloid removal (figure 3). the percentage of amyloid within the total residual contamination was again very low, with 4-8 % amyloid remaining for all the samples after 10 s uas contact time (figure 3).
the effluent from the uas system after decontaminating the 22l spiked surfaces was filtered and labelled for residual prion protein (both non-resistant and pk-resistant) and compared to control samples of distilled water spiked with the equivalent amount of 22l brain homogenate. a clear reduction of both the pk-sensitive and pk-resistant prion protein from the tokens was observed (as demonstrated by the protein capture on nitrocellulose membranes following the previously demonstrated 98-99 % protein and 99-100 % amyloid removal, described above) after 10 s uas treatment (figure 4).
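the removal percentages reported above express the residual signal on a treated surface relative to the untreated positive control. a minimal python sketch of that arithmetic is given below; the function names and the input numbers are hypothetical, and the source does not state the units of the image-analysis signal, so arbitrary units are assumed.

```python
def percent_removal(control_signal, residual_signal):
    """Percentage of contaminant removed relative to the untreated control."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (1.0 - residual_signal / control_signal)


def amyloid_share(residual_amyloid, residual_total):
    """Percentage of the total residual contamination that is amyloid-positive."""
    if residual_total <= 0:
        raise ValueError("residual total must be positive")
    return 100.0 * residual_amyloid / residual_total


# Hypothetical signals in arbitrary units (not study data).
protein_removed = percent_removal(control_signal=200.0, residual_signal=4.0)
amyloid_in_residue = amyloid_share(residual_amyloid=0.2, residual_total=4.0)
```

with these example inputs the protein removal is 98 % and amyloid makes up 5 % of what remains, which is the form in which the results above are quoted.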
the reduction in immuno-labelled prion proteins post uas treatment could indicate that the uas treatment is destructive to the antibody-specific epitopes of the prion protein, therefore reducing the immunochemical detection post uas treatment. furthermore, small protein aggregates could be observed in the control samples but not in the samples post uas treatment, suggesting that the uas may degrade and/or solubilize these aggregates.
current practices for the decontamination and sterilisation of surgical instruments within ssds are not entirely efficient at removing all potentially infectious material, especially hardy prion proteins. therefore, surgical instruments which may have come in contact with cjd-infected tissues cannot be deemed safe post cleaning [3, 7, 16, 53] and are subsequently quarantined. simple, cost effective methods to prevent the initial attachment of bioburden to surgical surfaces have been demonstrated [25-27]. ultrasonic baths provide efficient cleaning using water alone; however, the limitations associated with water baths were described earlier in this manuscript. this study has tested the efficacy of uas technology for the removal of total protein and prion-amyloid from stainless steel, which is considered the most difficult contaminant to remove in the surgical field.
the uas technology demonstrated significant removal of the three prion strains tested after differing drying and uas treatment times; however, increased uas treatment times are required to further improve the efficacy of the uas treatment. the efficient removal of me7 and 22l, both murine adapted scrapie strains, was very similar following both drying and uas treatment times. however, 263k, a hamster adapted scrapie strain, was harder to remove and would require a longer uas treatment to reduce to the levels observed with the two murine strains.
this observation suggests that the hamster brain constituents and prpsc conformation differ from those of the mouse brains and show increased affinity to stainless steel. this highlights the importance of studying different prion strains, from different hosts, when determining the efficacy of hospital decontamination tools. for comparison of the efficacy of the uas system to that of cleaning chemistries used in ssds, the removal of me7-infected brain homogenate from stainless steel tokens using the same methodology as this study has been previously published [3, 25]. hervé et al. (2010) tested four different cleaning chemistries marketed for proteinaceous decontamination, which demonstrated total protein removals of 39%, 97.9%, 98.9% and 99.85%, respectively [3]. secker et al. (2012) tested two cleaning chemistries, also marketed for proteinaceous decontamination, which demonstrated total protein removal of 0% and 90.1%, respectively [25]. all the cleaning chemistries tested in these studies required heating of the cleaning solution, whereas the uas system tested here removed 97% total protein with cold water and only a 10 s contact time. a recent nihr health technology assessment (hta) has extensively compared studies quantifying the efficacy of interventions to reduce the surgical transmission of vcjd [54]. the other important observation was that the uas system favourably removed the prion-associated amyloid (infectious prion proteins in the aggregated form) from the surfaces, as demonstrated by the low percentages of the total residual proteinaceous contamination being tht-positive amyloid, compared with treatment using commercially available cleaning chemistries [3, 25].
immunoblot analysis of both pk-sensitive and -resistant residues of prp was carried out to determine the presence and state of prion aggregates post uas decontamination.
following the predetermined 98-99 % protein and 99-100 % prion-amyloid removal, described above, the supernatant from the uas treatment was filtered and the prion proteins were labelled. the pk resistant and sensitive aggregates observed in the control immunoblots were not present in the uas treated samples, suggesting that the uas mechanism of action causes the breakdown of the prp aggregates, reducing the available epitopes for antibody binding and therefore the antibody-positive prp residues. furthermore, this would explain why an increase in the removal of prion-amyloid using the uas system was observed, as described earlier. further work is required to confirm and determine if the breakdown of prp caused by the action of uas correlates with a reduction in prion
the results from this study demonstrated efficient removal of tissue proteins, and more importantly prion-associated amyloid, from surgical stainless steel harnessing the power of water at ambient temperature. while the cleaning efficacy demonstrated by this system is improved compared to that of the best currently available cleaning chemistries tested on the same contaminants, interestingly the uas appeared more effective at removing prion-amyloid as well as the total proteinaceous contamination.
this study has demonstrated the ability of the uas to clean with just cold water. however, the uas system could also work with chemical cleaners, giving a synergistic effect of mechanical (acoustically activated bubbles) and chemical cleaning. furthermore, previous studies have demonstrated that the uas efficiently removes microbial contamination from rough, etched surfaces [40]. this demonstrates that a uas has the ability to clean items, such as surgical instruments, that contain dynamic differences in surface topography [40].
in its current form, the uas system is designed as a hand-held device, and the plan is to include this in a pilot to test it as a pre-clean before the surgical instruments proceed on to washer-disinfectors (i.e. at the stage where ssds currently conduct washing by hand, brushing and pre-cleaning of surgical instruments). the mechanical removal by uas of prion-associated amyloid embedded in dried-on brain homogenate demonstrates an interesting parallel with the problem of removing the sars-cov-2 virus, responsible for the current covid-19 pandemic, from touch-surfaces. lacking an appropriate attachment mechanism, the virus relies on the stickiness of the respiratory secretions in which it resides (that are composed mainly of mucin glycoproteins, surfactant and intercellular fluid) to attach to abiotic surfaces. therefore, the efficient ability of the uas system to remove prion-associated amyloid by cleaning away the biological material, in the case of this study brain homogenate, as well as bacteria and lubricant contamination, previously published 40, 46 , highlights the importance of testing this system against viruses. if viruses can also be removed by uas, then incorporation of uas in society to clean these surfaces with just water could aid infection prevention,
the datasets generated and analysed during this study will be openly available from the university of southampton repository at http://dx.doi.org/10.5258/soton/[ note to editors: the policy of the university of southampton is that they will grant a link for insertion once the paper is accepted to avoid their repository referring to papers that were not published].
ts carried out laboratory experimentation, data analysis and interpretation, participated in the design of the study and drafted the manuscript; tl conceived the study, coordinated across disciplines, and helped draft the manuscript; do and pb helped conceive the study and provided support for the set-up and running of the uas; rh helped with experimental design and drafting the manuscript; ck oversaw the microbiological components and helped draft the manuscript. all authors gave final approval for publication.
one of the authors (t.g.l.) is director and inventor-in-chief of the company (sloan water technology, ltd.) that holds the patent to this technology but has drawn no salary from this.
tissue protein (dark grey bars) and prion-associated amyloid (light grey bars) attachment from different prion-infected brain homogenates (22l, me7 and 263k) to surgical stainless steel pre and post treatment with an ultrasonically activated stream (uas) (graph a). brain homogenate was initially dried for 2 h at 37 °c prior to cleaning (pos). the orange dashes represent percentage protein removal and the blue dashes represent percentage prion-associated amyloid removal (graph a). graph b has an expanded y-axis scale to distinguish the lower levels of contamination. data shows mean ± sem (n=9); however, in decontamination and other research areas 55 , outliers are also important to assessing outcomes, whether it be risk of infection, or the response of the most sensitive individuals to some stimulus; ***: p ≤ 0.001 for total proteins; ††: p ≤ 0.01 for amyloid, when compared to the corresponding positive controls, respectively.
tissue protein (dark grey bars) and prion-associated amyloid (light grey bars) attachment from different prion-infected brain homogenates (22l, me7 and 263k) to surgical stainless steel pre and post treatment (5 and 10 s contact times) with an ultrasonically activated stream (uas) (graph a). brain homogenate was initially dried for 24 h at room temperature prior to cleaning (pos). the orange dashes represent percentage protein removal and the blue dashes represent percentage prion-associated amyloid removal (graph a). graph b has an expanded y-axis scale to highlight the lower levels of contamination. data shows mean ± sem (n=9); however, in decontamination and other research areas 55 , outliers are also important to assessing outcomes, whether it be risk of infection, or the response of the most sensitive individuals to some stimulus; *: p ≤ 0.05 and ***: p ≤ 0.001 for total proteins; ††: p ≤ 0.01 and †††: p ≤ 0.001 for amyloid, when compared to the corresponding positive controls, respectively.
immunoblot films showing captured proteins from 1 l of 22l-spiked solution containing 15 µg of 22l homogenate in distilled water (a and b) and from the uas system effluent after treating surfaces contaminated with the equivalent amount of 22l homogenate (c and d). proteins were detected using the primary antibody 6h4, without (a and c) or with pk digestion (b and d).
minimise transmission risk of cjd and vcjd in healthcare settings cjd): guidance, data and analysis reports by the uk government department of health (22 october 2015) crown copyright gajdusek dc transmission of creutzfeldt-jakob disease to a chimpanzee by electrodes contaminated during neurosurgery keevil cw current risk of iatrogenic creutzfeldt-jakob disease in the uk: efficacy of available cleaning chemistries and reusability of neurosurgical instruments keevil cw diathermy forceps and pencils: reservoirs for protein and prion contamination? surface decontamination of surgical instruments: an ongoing dilemma highly sensitive, quantitative cell-based assay for prions adsorbed to solid surfaces investigations of a prion infectivity assay to evaluate methods of decontamination weissmann c transmission of scrapie by steel-surface-bound prions infectivity of prion protein bound to stainless steel wires: a model for testing decontamination procedures for transmissible spongiform encephalopathies. infection control and hospital epidemiology : the official journal of the society of detection of prion infection in variant creutzfeldt-jakob disease: a blood-based assay. the will rg creutzfeldt-jakob disease and blood transfusion: results of the uk transfusion medicine epidemiological review study busick dn effects on instruments of the world health organization-recommended protocols for decontamination after possible exposure to transmissible spongiform encephalopathy-contaminated tissue decontamination of prion protein (bse301v) using a genetically engineered protease collinge j a standardized comparison of commercially available prion decontamination reagents using the standard steel-binding assay prion inactivation using a new gaseous hydrogen peroxide sterilisation process decontamination of surgical instruments from prions. ii. in vivo findings with a model system for testing the removal of scrapie infectivity from steel surfaces the challenge of prion decontamination sklaviadis t photocatalytic degradation of prions using the photo-fenton reagent keevil cw doped diamond-like carbon coatings for surgical instruments reduce protein and prion-amyloid biofouling and improve subsequent cleaning halting the spread of human prion disease--exceptional measures for an exceptional problem keevil cw application of a fluorescent dual stain to assess decontamination of tissue protein and prion amyloid from surgical stainless steel during simulated washer-disinfector cycles keevil cw effect of drying time, ambient temperature and pre-soaks on prion-infected tissue contamination levels on surgical stainless steel: concerns over prolonged transportation of instruments from theatre to central sterile service departments adsorption of prion and tissue proteins to surgical stainless steel surfaces and the efficacy of decontamination following dry and wet storage conditions keevil cw efficacy of humidity retention bags for the reduced adsorption and improved cleaning of tissue proteins including prion-associated amyloid to surgical stainless steel surfaces quantitative measurement of the efficacy of protein removal by cleaning formulations; comparative evaluation of prion-directed cleaning chemistries offin d a new approach to ultrasonic cleaning the collapse of single bubbles and approximation of the far-field acoustic emissions for cavitation induced by shock wave lithotripsy white pr prediction of far-field acoustic emissions from cavitation clouds during shock wave lithotripsy for development of a clinical device crawford ae the measurement of cavitation characterisation of measures of reference acoustic cavitation (comorac): an experimental feasibility trial acoustic fields: modern trends and applications ultrasonic cleaning: an historical perspective measurement of cavitation activity in ultrasonic cleaners acoustics: from whales to other worlds. proceedings of the institute of acoustics transient processes near the threshold of acoustically driven bubble shape oscillations the acoustic bubble: oceanic bubble acoustics and ultrasonic cleaning leighton tg electrochemical 'bubble swarm' enhancement of ultrasonic surface cleaning cold water cleaning of brain proteins, biofilm and bone - harnessing an ultrasonically activated stream dental biofilms with an ultrasonically activated water stream bubbles vs biofilms: a novel method for the removal of marine biofilms attached on antifouling coatings using an ultrasonically activated water stream acoustic radiation force on a parametrically distorted bubble evaluation of stainless steel surgical instruments subjected to multiple use/processing leighton tg an electrochemical and high-speed imaging study of micropore decontamination by acoustic bubble entrapment industrial lubricant removal using an ultrasonically activated water stream, with potential application for coronavirus decontamination and infection prevention for sars- current limitations about the cleaning of luminal endoscopes sidiropoulou t a simple method for blocking the deep cervical nerve plexus using an ultrasound-guided technique keevil cw amyloid-specific fluorophores for the rapid, sensitive in situ detection of prion contamination on surgical instruments keevil cw a rapid dual staining procedure for the quantitative discrimination of prion amyloid from tissues reveals how interactions between amyloid and lipids in tissue homogenates may hinder the detection of prions keevil cw rapid method for the sensitive detection of protein contamination on surgical instruments keevil cw rapid detection of biofilms and adherent pathogens using scanning confocal laser microscopy and episcopic
differential interference contrast microscopy flan b in vitro infectivity assay for prion titration for application to the evaluation of the prion removal capacity of biological products manufacturing processes wong r interventions to reduce the risk of surgically transmitted creutzfeldt-jakob disease: a cost-effective modelling review analogies in contextualizing human response to airborne ultrasound and fish response to acoustic noise and deterrents
key: cord-257392-u6jy6w1m authors: zhao, yanfeng; ben, haijing; qu, su; zhou, xinwen; yan, liang; xu, bin; zhou, shuangcheng; lou, qiang; ye, rong; zhou, tianlun; yang, pengyuan; qu, di title: proteomic analysis of primary duck hepatocytes infected with duck hepatitis b virus date: 2010-06-07 journal: proteome sci doi: 10.1186/1477-5956-8-28 sha: doc_id: 257392 cord_uid: u6jy6w1m background: hepatitis b virus (hbv) is a major cause of liver infection in humans. because of the lack of an appropriate cell culture system for supporting hbv infection efficiently, the cellular and molecular mechanisms of hepadnavirus infection remain incompletely understood. duck hepatitis b virus (dhbv) can naturally infect primary duck hepatocytes (pdhs), which provide a valuable model system for studying hepadnavirus infection in vitro. in this report, we explored global changes in cellular protein expression in dhbv infected pdhs by two-dimensional gel electrophoresis (2-de) combined with maldi-tof/tof tandem mass spectrometry (ms/ms). results: the effects of hepadnavirus infection on hepatocytes were investigated in dhbv infected pdhs by 2-de analysis. proteomic profiles of pdhs infected with dhbv were analyzed at 24, 72 and 120 h post-infection by comparison with uninfected pdhs, and 75 differentially expressed protein spots were revealed by 2-de analysis.
among the selected protein spots, 51 spots were identified corresponding to 42 proteins by ms/ms analysis; most of them were matched to orthologous proteins of gallus gallus, anas platyrhynchos or other avian species, including alpha-enolase, lamin a, aconitase 2, cofilin-2 and annexin a2. the down-regulated expression of beta-actin and annexin a2 was confirmed by western blot analysis, and potential roles of some differentially expressed proteins in the virus-infected cells are discussed. conclusions: differentially expressed proteins of dhbv infected pdhs revealed by 2-de are involved in carbohydrate metabolism, amino acid metabolism, stress responses and cytoskeleton processes, providing insight into the interactions between hepadnavirus and hepatocytes and the molecular mechanisms of hepadnavirus pathogenesis.
hbv, the prototype of the hepadnaviridae family, is a noncytopathic hepatotropic dna virus replicating via reverse transcription [1] . more than 350 million individuals are hbv carriers worldwide and over one-third of them develop serious liver diseases such as chronic hepatitis, cirrhosis and primary hepatocellular carcinoma [2] . major obstacles in hbv research have been the inability of the virus to infect cells in vitro and the lack of adequate animal models for hbv infection, though primary human hepatocytes and the heparg cell line have been used to study hbv infection [3] . human primary hepatocytes and heparg cells can support the hbv life cycle, but have limitations in accessibility, reproducibility and level of hbv replication, and a large amount of input virus is needed to infect a low proportion of cells [4] [5] [6] . dhbv and woodchuck hepatitis b virus (whbv) are classified into the family hepadnaviridae. thus, for hepadnavirus infection, primary hepatocytes of ducks (dhbv) and woodchucks (whbv) are still considered suitable models for investigating viral replication and pathogenesis [7, 8] .
the development of proteomic methods has enabled us to investigate the changes of cellular protein expression at a global scale to reveal virus-host interactions [9] [10] [11] [12] . the effect of hepadnavirus replication on host cells, such as carcinoma-derived hepatocyte lines transfected with the hbv genome, heparg cell lines or hbv transgenic mice, has been investigated by 2-de analysis [13] [14] [15] . in the present study, we utilized the dhbv-pdhs system to explore global protein expression changes during hepadnavirus infection by 2-de. a total of 75 differentially expressed protein spots were revealed by 2-de between dhbv infected and uninfected pdhs, and 51 protein spots have been identified by ms/ms analysis. differential expression of beta-actin and annexin a2 was confirmed by western blot analysis, and potential roles of some differentially expressed proteins in the viral infection are discussed.
pdhs isolated from the same liver of a dhbv-negative cherry valley duckling were infected with dhbv at a multiplicity of infection (moi, based on dhbv dna genome equivalents) of 30 and cultured for 12, 24, 72 and 120 h in l15 medium supplemented with 5% fetal bovine serum (fbs). the efficiency of dhbv infection in pdhs was determined by indirect immunofluorescence using anti-dhbv pres monoclonal antibody. pdhs inoculated with phosphate-buffered saline (pbs, ph 7.2) served as a control. at 12 h and 24 h after infection, only a few cells showed fluorescence (data not shown), and at 72 h post-infection about 30% of cells expressed viral large surface antigen, indicating that cells were infected with dhbv, as shown in figure 1 . dhbv dna in pdhs was analyzed by southern blot hybridization using an alpha-32p-dctp labeled dhbv-specific probe, and dhbv in the supernatant was detected by real time polymerase chain reaction (pcr).
single-stranded forms of intracellular viral dna and the increasing dhbv copy number in the supernatant indicated replication of dhbv at 72 h and 120 h post-infection (additional file 1 and 2). differentially expressed proteins between dhbv infected and uninfected pdhs at 24, 72 and 120 h post-infection were analyzed using 2-de. the gels were stained by a modified silver staining method compatible with mass spectrometry (ms) analysis and processed for image analysis. on the 2-de gels (ph 3-10 nl, 24 cm), about 1150 to 1350 protein spots were detected. compared with the parallel uninfected pdhs, 91 differentially expressed protein spots were revealed by 2-de (p-values less than 0.05 with at least a 1.5-fold difference in percentage of the volume), shown in figure 2 and additional file 3 (see also additional file 4), and a total of 75 differentially expressed non-redundant protein spots were analyzed by ms/ms. differentially expressed protein spots between dhbv infected and uninfected pdhs were excised, digested in-gel with trypsin and analyzed by ms/ms. among 75 differentially expressed protein spots, 51 protein spots were identified corresponding to 42 proteins (table 1 and additional file 5). with a mascot cutoff score of 72 (p-value less than 0.05), 51 spots were identified, and 37 spots were matched to orthologous proteins of avian species (26 protein spots to gallus gallus, 4 spots to anas platyrhynchos and 7 spots to other avian species), listed in table 1 . some of the differentially expressed protein spots, such as annexin a2, beta-actin, lamin a, destrin, aconitase 2 and mn superoxide dismutase, are illustrated in enlarged format (figure 3 ), and a representative mass spectrum of annexin a2 (spot 49) analyzed by maldi-tof/tof ms is shown in figure 4 . isoforms of annexin a2, alpha-enolase, lamin a, glyceraldehyde-3-phosphate dehydrogenase (gapdh), heat shock protein 70 (hsp70) and elongation factor 2 have been identified.
for example, protein spot 49 (mw 31 kda and pi of 5.45) and spot 50 (mw 10 kda and pi of 5.78), down-regulated in dhbv infected pdhs, were both identified as annexin a2, and up-regulated protein spot 26 (mw 65 kda and pi of 6.61) and spot 27 (mw 66 kda and pi of 6.4) were matched to lamin a (theoretical mw 73.1 kda and pi of 6.5), as shown in figure 2 . biological functions of the differentially expressed proteins in the dhbv-infected pdhs were analyzed according to the gene ontology criteria and classified into carbohydrate metabolism (29%), amino acid metabolism (14%), cytoskeletal/structural protein (24%), stress response (18%) and other functions (16%), as shown in table 1 . the roles of selected differentially expressed proteins reported in viral infections are shown in table 2 . expression levels of annexin a2, beta-actin, hsp70, destrin, and lamin a were examined by western blot analysis to confirm the dynamic alterations of protein expression during dhbv infection. equal amounts (30 μg) of cell lysates of dhbv-infected and uninfected pdhs at 12, 24, 72 and 120 h post-infection were separated by sds-page. duck beta-actin and annexin a2 were found to be down-regulated in the dhbv-infected pdhs at 12-120 h post-infection with mouse anti-beta-actin and anti-duck-annexin a2 as primary antibodies ( figure 5a ), which was consistent with the protein expression pattern revealed by the 2-de analysis. duck hsp70, destrin, and lamin a were not detected by western blot analysis with the rabbit anti-human or anti-mouse hsp70 polyclonal antibodies, rabbit anti-human destrin polyclonal antibody and rabbit anti-human lamin a polyclonal antibody. the same amount of protein from each sample was applied to a parallel sds-page gel and stained with coomassie brilliant blue ( figure 5b ).
hbv infection remains a public health problem worldwide.
because of the lack of appropriate cell lines that can support hbv infection efficiently, the cellular and molecular mechanisms of hepadnavirus infection remain incompletely understood. hepadnavirus animal infection models such as ducks (dhbv) and woodchucks (whbv) have been used to investigate viral replication, pathogenesis or hepadnavirus-associated hepatocellular carcinoma. the dhbv-pdhs model is a valuable model of hepadnavirus infection with high reproducibility and efficiency [16] . in the present study, global changes in cellular protein expression in dhbv-infected pdhs were explored by 2-de combined with ms/ms. among the 75 differentially expressed protein spots, 51 spots have been identified by ms/ms corresponding to 42 proteins, of which 30 spots were matched to orthologous proteins of gallus gallus or anas platyrhynchos, 7 spots to other avian species, and 14 spots to non-avian species, while mass spectra of the other 24 protein spots did not match any proteins in the current databases, possibly due to the incomplete genome sequence of anas platyrhynchos or the low abundance of those protein spots. in previous studies, tong performed a proteomic analysis comparing hepg2 with hepg2.2.15, in which the hbv genome is integrated into the cellular chromosome [13] , and narayan revealed 19 differentially regulated features in heparg cells by 2-de [14] . hepg2.2.15 is an hbv replication cell model but not an infection model, while the human hepatoma heparg cells are susceptible to hbv, but only 10~20% of cells can be infected regardless of the amount of virus used (moi > 200) [4, 6] . previous studies have shown that at an moi of 30, about 50%~60% of pdhs can be reproducibly infected with dhbv [17] . some of the differentially expressed proteins identified in the present study, such as alpha-enolase, lamin a, gapdh and cofilin-2, have not yet been reported in hepadnavirus proteomic analysis. viruses depend on host cell metabolism for their replication.
elucidation of the pathways/processes involved in the viral life cycle will help to understand the mechanisms of viral infection. in the 2-de analysis, the identified differentially expressed proteins were classified into carbohydrate metabolism, amino acid metabolism, cytoskeletal/structural protein, stress response and other functions according to the gene ontology criteria. some of the differentially expressed proteins identified in the present study have been reported to play roles in viral infections, as shown in table 2 . in dhbv infected pdhs, the expression of some carbohydrate metabolic enzymes, such as phosphoglycerate kinase 1, triosephosphate isomerase and phosphoglycerate mutase 1, was up-regulated. the differential expression of proteins involved in carbohydrate metabolism suggests perturbed energy metabolism in dhbv infection. hepatitis c virus (hcv) infection reprograms cellular metabolism to favor glucose fermentation and directs glycolytic intermediates toward the metabolite synthesis that supports the viral life cycle [18] . in lymphocytic choriomeningitis virus infection, there was a significant increase in transcripts promoting gluconeogenesis for viral mediate synthesis, and a decrease in transcripts promoting glycogenolysis in the early stage of infection [19] . however, gapdh and alpha-enolase, key enzymes involved in glycolysis and gluconeogenesis, are decreased in dhbv infected pdhs. gapdh and alpha-enolase have been found associated with the cell membrane and in secreted viral particles of influenza virus, lentiviral vector etc [20, 21] . gapdh may phosphorylate the hbv core protein, and binds to the pres1 region of the hbv envelope antigen and to the posttranscriptional regulatory element in regulating expression of surface antigen, suggesting that gapdh plays an important role in the life cycle of hbv [22] [23] [24] . the changes in host cellular carbohydrate metabolism caused by dhbv infection may benefit viral replication.
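the up- and down-regulation calls discussed here rest on the selection rule stated in the results: a spot counts as differentially expressed when its p-value is below 0.05 and its percentage volume differs by at least 1.5-fold between infected and uninfected pdhs. as a minimal sketch of that rule, assuming that a mean percentage volume per condition and a p-value are already available for each spot (the `Spot` class, its field names and the example numbers are hypothetical, and the statistical test used by the image-analysis software is not reproduced), one could write:

```python
from dataclasses import dataclass


@dataclass
class Spot:
    """One 2-DE spot. All field names and values are hypothetical."""
    spot_id: int
    infected_pct_vol: float  # mean % volume across replicate gels, infected
    control_pct_vol: float   # mean % volume across replicate gels, uninfected
    p_value: float           # taken as given; the test itself is not modelled


def fold_change(spot: Spot) -> float:
    """Signed fold change: positive if higher in infected cells, negative if lower."""
    if spot.infected_pct_vol <= 0 or spot.control_pct_vol <= 0:
        raise ValueError("percentage volumes must be positive")
    ratio = spot.infected_pct_vol / spot.control_pct_vol
    return ratio if ratio >= 1.0 else -1.0 / ratio


def classify(spot: Spot, min_fold: float = 1.5, alpha: float = 0.05) -> str:
    """Apply the thresholds quoted in the text (p < 0.05, at least 1.5-fold)."""
    change = fold_change(spot)
    if spot.p_value >= alpha or abs(change) < min_fold:
        return "unchanged"
    return "up" if change > 0 else "down"


# Made-up example values, for illustration only.
spots = [
    Spot(1, 0.40, 0.10, 0.01),  # large increase, significant
    Spot(2, 0.10, 0.40, 0.01),  # large decrease, significant
    Spot(3, 0.12, 0.10, 0.01),  # significant but below 1.5-fold
    Spot(4, 0.40, 0.10, 0.20),  # large change but not significant
]
calls = {s.spot_id: classify(s) for s in spots}
print(calls)
```

a ratio-based rule of this kind cannot handle spots that are undetectable in one condition, so such spots would need to be flagged separately rather than passed to `fold_change`.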
alterations of cytoskeleton networks have been found in many viral infections [25] [26] [27] . like many viruses, hepadnavirus needs to manipulate and utilize the host cytoskeleton to promote viral infection, although the mechanism is still unclear [28] . in dhbv infected pdhs, the microfilament-associated proteins beta-actin and cofilin-2 were down-regulated, and three microfilament-associated proteins, transgelin, destrin, and collapsin response mediator protein-2b, were up-regulated. actin plays an active role in maturation of the viruses [29, 30] . many viruses require actin for viral entry and establishment of [31] [32] [33] [34] . however, the actin cortex beneath the plasma membrane can also be an obstacle for virus entry or budding [35] . it has been reported that dhbv entry depends on both intact microtubules and their dynamic turnover but not on the actin cytoskeleton [28] . therefore the role of actin in dhbv replication requires further investigation.
figure 2 .
b) accession no. is the mascot result of maldi-tof/tof searched from the ncbinr database.
c) protein score was from maldi-tof/tof identification. the proteins that had a statistically significant score greater than 72 (p < 0.05) were considered identified.
d) theoretical/experimental molecular mass.
e) theoretical/experimental pi.
f) not applicable, because the spots on the gels were too weak or non-detectable.
g) a represents that the spot on one of the gels was detectable, n represents that the spot on one of the gels was too weak to detect.
lamin a is a key structural component of the nuclear lamina; lamins are involved in dna replication and gene expression, and present a natural barrier against most dna viruses such as human cytomegalovirus (hcmv), kaposi's sarcoma-associated herpesvirus, herpes simplex virus (hsv) 1 and epstein-barr virus [36] . lamin a/c is phosphorylated in hsv-infected cells, supporting a role in regulating virus capsid nuclear egress [37, 38] .
infection with epstein-barr virus induced disassembly of the nuclear lamina and redistribution of nuclear lamin for nuclear egress [39] . the expression of different lamin a isoforms was up-regulated in dhbv infected pdhs, suggesting that lamin a may play a role in dhbv replication. in dhbv infected pdhs, up-regulated expression of amino acid metabolism enzymes catalyzing interconversion of glutamate, histidine, and proline (glutamate dehydrogenase 1, urocanate hydratase, delta-1-pyrroline-5-carboxylate dehydrogenase, the orthologs in human referred to protein 16, 19 and 20, 22 in table 1 ) indicates that glutamine metabolism is enhanced. switching the anaplerotic substrate from glucose to glutamine to accommodate the biosynthetic and energetic needs of the viral infection and to allow glucose to be used biosynthetically was reported in hcmv infection [40] . hcv-infected cells exhibit increased levels of the enzymes catalyzing glutamine flux to replenish metabolic intermediates through the latter half of the citric acid cycle, providing substrates for atp production [18] . thus a similar mechanism of glutamine metabolism may be at work in dhbv infection. stress response associated proteins, including endoplasmic reticulum stress associated proteins such as hsp70 and chaperonin containing t-complex polypeptide 1 (tcp1), and oxidative stress associated proteins such as the antioxidant enzymes mn superoxide dismutase and peroxiredoxin-3 (similar to antioxidant protein isoform 2), were found to be up-regulated after dhbv infection. hsp70 assists folding of many newly synthesized polypeptides and refolding of misfolded proteins [41, 42] . hsp70 can enhance flock house virus replication [43, 44] . hsp70 and hsp90 participate in dengue virus entry as a receptor complex [45] . moreover, hbv p protein activation in vitro is fundamentally dependent on the heat shock protein 70 family hsc70/hsp40 [46] .
in hbv-replicating hepad38 cells, expression of heat shock proteins (hsp70 and hsp90) and mn superoxide dismutase increases after hbv replication is induced by tetracycline [47] . in humanized transgenic mice, inhibition of hbv replication results in suppression of mn superoxide dismutase expression in hepatocytes [47, 48] . this suggests that oxidative stress can be induced by hepadnavirus replication, as with epstein-barr virus [49] . annexin a2, which belongs to a family of calcium-dependent, phospholipid-binding proteins, is involved in many biological processes, such as ca2+-dependent exocytosis, calcium transport and cell proliferation. it participates in viral infection, including assisting in the assembly of hiv in monocyte-derived macrophages [50] , acting as a cellular cofactor supporting hiv-1 infection [51] , enhancing cytomegalovirus binding and membrane fusion [52] and supporting the replication of influenza viruses by mediating activation of plasminogen [53] . it has been reported that hbv polymerase activity is inhibited by interaction with s100a10, a protein that binds to annexin a2 [54] . in hepg2.2.15 compared with hepg2, annexin a2 was found to be down-regulated [55] , which is consistent with our observation in the dhbv-pdhs model, confirmed by western blot analysis. this indicates that annexin a2 may be involved in hepadnavirus infection and warrants further investigation. beta-actin and gapdh are commonly used as internal standards for detection of rna transcription and protein expression of genes. however, those proteins were found to be down-regulated after dhbv infection by 2-de analysis. recently, accumulated evidence has shown that in hbv-related hepatocellular carcinoma or viral infections, beta-actin and gapdh are unsuitable controls in quantitative mrna expression or western blot analysis due to variations in expression [56] [57] [58] [59] [60] , though there are conflicting observations [61] .
these findings therefore highlight the importance of re-evaluating the housekeeping genes whose expression may be affected by hepadnavirus infection. in summary, the present study explored global changes in cellular protein expression during hepadnavirus infection by 2-de analysis, using a natural dhbv-pdhs infection system. forty-two differentially expressed proteins in dhbv infected pdhs have been identified by ms/ms. most of them, including alpha-enolase, beta-actin, lamin a and annexin a2, are involved in carbohydrate metabolism, amino acid metabolism, stress responses and cytoskeleton processes. this suggests that those proteins may play important roles in hepadnavirus infection. differential expression of annexin a2 and beta-actin was confirmed by western blot analysis. further investigation of the roles of the differentially expressed cellular proteins will help to understand the cellular and molecular mechanisms of hepadnavirus infection.
cherry valley ducks (anas platyrhynchos) were purchased from the breeding center of shanghai institute of veterinary medical sciences, china. animal protocols were approved by the ethics committee of fudan university. three-day-old ducklings with no congenital dhbv infection detected by pcr (sense: 5'-ctcactttgtggatctcattg-3', antisense: 5'-atcggatagtcgggttgg-3') were used for pdhs cultures. duck hepatocytes were isolated by an in situ liver perfusion method with modifications. after the duck was anesthetized with approximately 0.3 ml of 0.75% pentobarbital sodium, the liver was perfused via the portal vein with prewarmed liver perfusion medium (gibco laboratories), then the inferior vena cava was cut to let the buffer flow out when the liver was engorged. at first, perfusion was maintained at 15 to 20 ml per minute with 100 ml of liver perfusion medium until the liver blanched, followed by 50 ml of digestion buffer with 1 μg/ml of collagenase type iv (sigma) in l15 medium (gibco) until the liver became soft.
after the perfusion, the gallbladder was removed and hepatocytes were dispersed in l-15 medium. hepatocytes were filtered through sterilized gauze, centrifuged at 40 g for 4 min, and washed three times with hepatocyte wash medium (gibco). then 4 × 10 6 hepatocytes were seeded onto 100-mm-diameter dish and incubated in l-15 medium containing 5% fbs (gibco laboratories), 15 mm hepes, 100 u penicillin per liter, 100 mg of streptomycin per liter, 1 mg of insulin (sigma) per liter and 10 -5 m hydrocortisone-hemisuccinate (sigma) at 37°c. the medium was changed every day. infectious dhbv were produced by lmh-d2 cell line which carries a stably integrated dhbv dimer and constitutively secretes dhbv virions (generous gift of william mason, fox chase cancer center, usa) [62] . dhbv particles were obtained from lmh-d2 cells by ultracentrifugation, and the virus pellet was suspended in pbs with 10% glycerol. dhbv was quantified using real time pcr as a titer of 2 × 10 9 copies per milliliter. pdhs cultured 16 h after plating were infected with purified dhbv viral particles at moi of 30, and incubated at 37°c overnight. then, they were washed with pbs three times and cultured for 12, 24, 72 and 120 h in l15 medium supplemented with 5% fbs. the efficiency of dhbv infection in pdhs was determined by indirect immunofluorescence and southern blot hybridization. monolayers of pdhs grown on glass coverslips were fixed directly by adding 4% polyoxymethylene at room temperature for 20 min, washed twice with pbs and preincubated with 3% bovine serum albumin for 30 min. after incubation with a 1:100 dilution of monoclonal mouse anti-dhbv pres (generous gift of john c. pugh and william mason, fox chase cancer center, usa) at 37°c for 60 min, the cells were washed three times with pbs, subsequently incubated with a 1:200 dilution of fitc-conjugated sheep anti-mouse igg (gghl-90f, immunology consultants laboratory) at 37°c for 30 min. 
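the infection step above combines three numbers given in the text: 4 × 10^6 hepatocytes seeded per dish, a stock titer of 2 × 10^9 copies per milliliter and an moi of 30. the short python sketch below shows how these translate into an inoculum volume. it assumes that the moi is counted in genome copies per cell (the titer was measured by real time pcr) and that all seeded cells are still present at the time of infection; neither assumption is stated explicitly in the text, and the function name is hypothetical.

```python
def inoculum_volume_ml(cells: float, moi: float, titer_per_ml: float) -> float:
    """volume of virus stock (ml) that delivers `moi` genome copies per cell."""
    if titer_per_ml <= 0:
        raise ValueError("titer must be positive")
    return cells * moi / titer_per_ml


# values taken from the methods text; the cell number at infection is assumed
# to equal the number seeded per 100-mm dish
cells_per_dish = 4e6
moi = 30
titer = 2e9  # copies per ml

volume_ml = inoculum_volume_ml(cells_per_dish, moi, titer)
print(f"{volume_ml * 1000:.0f} µl of virus stock per dish")
```

under these assumptions, each dish would receive about 60 µl of the virus stock.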
cell nuclei were stained with 1 μg/ml 4',6'-diamidino-2-phenylindole (dapi, sigma) and mounted in 50% glycerol in pbs. the efficiency of dhbv infection was observed with a confocal laser scanning microscope (leica). to detect dhbv replication, intracellular dna was extracted from dhbv-infected or uninfected pdhs. forty micrograms of dna from each sample was separated on a 1.5% agarose gel and analyzed by southern blot hybridization with an alpha-32p-dctp labeled dhbv-specific probe as described previously [63] . the dhbv-infected and control (uninfected) pdhs at 24, 72, and 120 h post-infection were washed three times with ice-cold pbs before harvesting and stored at -80°c. infected pdhs and their uninfected controls were derived from the same duck in order to avoid individual differences. approximately 2 × 10^7 cells were lysed in 1 ml lysis buffer (7 m urea, 2 m thiourea, 2% (w/v) chaps, 50 mm dithiothreitol (dtt), 2% (v/v) ph 3-10 nonlinear immobilized ph gradient (ipg) buffer (amersham biosciences) containing 1% protease inhibitor cocktail (roche) and 1 mm pmsf (sigma)), then sonicated on ice for 12 cycles, each consisting of a 5 s pulse and a 10 s pause. after centrifugation at 20,000 g at 4°c for 1 h, the supernatants of the lysates were divided into aliquots and the protein concentrations were determined by the bradford assay. the aliquots were then stored at -80°c for further analysis. 2-de was performed using 24-cm ipg strips (ph 3-10, nonlinear, ge healthcare) in an ettan ipgphor isoelectric focusing system (amersham biosciences) plus an ettan-dalt six system (amersham biosciences) according to the manufacturer's instructions. to compensate for the variability of gel electrophoresis, at least three replicate gels were run for each group. 
in the first-dimension isoelectric focusing (ief), 120 μg of protein from each sample was diluted to 450 μl with rehydration buffer containing 8 m urea, 2% (w/v) chaps, 50 mm dtt and 0.5% (v/v) ampholyte (ph 3-10, nonlinear, amersham biosciences), and ipg strips were allowed to rehydrate in this solution under mineral oil. ief was performed as follows: 30 v for 6 h (active rehydration); 60 v for 6 h (active rehydration); 500 v for 2 h, rapid; 1,000 v for 2 h, rapid; 4,000 v for 2 h, linear; linear ramping to 8,000 v for 2 h, and finally 8,000 v for about 7 h, with a total of 64 kvh at 20°c. the ipg strips were then incubated in equilibration buffer (75 mm tris-hcl (ph 8.8), 6 m urea, 29.3% (v/v) glycerol, 2% (w/v) sds and 0.002% (w/v) bromophenol blue) containing 2% (w/v) dtt for 15 min with gentle agitation, followed by incubation in the same equilibration buffer supplemented with 2.5% (w/v) iodoacetamide for 15 min at room temperature. the second-dimension sds-page was performed on 1 mm thick 12.5% polyacrylamide vertical gels in the ettan-dalt six system using 5 w/gel for 30 min, followed by 12 w/gel at 10°c until the bromophenol blue dye front reached the end of the gels. the gels were stained by a modified silver staining method compatible with ms analysis [64] and scanned at 300 dpi (dots/inch) using imagescanner (umax, amersham biosciences). images were captured and analyzed with imagemaster 2d platinum 6.0 software (amersham biosciences). the volume of each spot representing a given protein was expressed as a percentage of the total protein volume present in the 2-de gel. to select differentially expressed protein spots, quantitative analysis was performed using student's t-test to compare the percentage volumes of spots between dhbv-infected and uninfected groups at three time points. 
differences in spot volume with p-values below 0.05 were considered significant, and a difference of at least 1.5-fold in percentage volume was set as the threshold for each spot. the protein spots meeting both criteria were selected and subjected to in-gel tryptic digestion and identification by ms. the differentially expressed protein spots were manually excised from the silver-stained gels (each gel loaded with 120 μg protein) and placed into a 96-well microplate. the gel pieces were destained with a solution of 15 mm potassium ferricyanide and 50 mm sodium thiosulfate (1:1) at room temperature for 10 min, then washed twice with deionized water, each time for 30 min, and dehydrated twice in 80 μl of acetonitrile (acn). the gel pieces were then swollen in digestion buffer containing 25 mm nh4hco3 and 12.5 ng/μl trypsin (promega) at 4°c for 30 min and incubated at 37°c for more than 12 h. the peptide mixtures were extracted from the gel twice using 0.1% trifluoroacetic acid/50% acn at room temperature, resuspended in 0.7 μl matrix solution (α-cyano-4-hydroxy-cinnamic acid (sigma) in 0.1% trifluoroacetic acid, 50% acn) and allowed to dry under the protection of n2. the peptide mixtures were analyzed with a 4700 maldi-tof/tof proteomics analyzer (applied biosystems). the uv laser was operated at a 200 hz repetition rate with a wavelength of 355 nm. the accelerating voltage was 20 kv. myoglobin digested by trypsin was used to calibrate the mass instrument in internal calibration mode. all acquired spectra of samples were processed using 4700 series explore software (applied biosystems) in default mode. the parent mass peaks with a mass range of 700-3200 da and a minimum s/n of 20 were picked out for tandem tof/tof analysis. 
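the spot-selection rule described above (student's t-test at p < 0.05 combined with a fold difference of at least 1.5 in percentage volume) can be illustrated with a minimal python sketch. it is not the imagemaster procedure used in the study. the spot names and percentage volumes are invented for illustration, exactly three replicate gels per group are assumed, and instead of computing a p-value the sketch compares the t statistic against the two-tailed critical value for 4 degrees of freedom (2.776 at alpha = 0.05), which is equivalent only under that replicate assumption.

```python
import math
from statistics import mean, variance

# two-tailed critical t at alpha = 0.05 for df = 4 (3 + 3 replicates, assumed)
T_CRIT_DF4 = 2.776


def student_t(a, b):
    """pooled-variance (student's) t statistic for two independent groups."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def fold_change(a, b):
    """ratio of the group means, oriented so that the result is >= 1."""
    ma, mb = mean(a), mean(b)
    return max(ma, mb) / min(ma, mb)


def is_differential(infected, control, min_fold=1.5, t_crit=T_CRIT_DF4):
    """apply both criteria: significant t statistic and sufficient fold change."""
    significant = abs(student_t(infected, control)) > t_crit
    return significant and fold_change(infected, control) >= min_fold


# hypothetical percentage volumes (infected gels, uninfected gels)
spots = {
    "spot_a": ([0.30, 0.32, 0.31], [0.15, 0.16, 0.14]),  # significant, ~2-fold
    "spot_b": ([0.20, 0.21, 0.19], [0.19, 0.20, 0.21]),  # no difference
    "spot_c": ([0.24, 0.25, 0.26], [0.20, 0.21, 0.19]),  # significant, 1.25-fold
}

selected = [name for name, (inf, ctl) in spots.items() if is_differential(inf, ctl)]
print(selected)
```

in this toy data set only spot_a passes both filters; spot_c shows why the fold threshold matters, since a small but reproducible difference can be statistically significant without reaching 1.5-fold.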
combined ms and ms/ms spectra were submitted to mascot (v2.1, matrix science) by gps explorer software (v3.6, applied biosystems) and searched with the following parameters: ncbinr database (release date: 2009.11), taxonomy of bony vertebrates or viruses, trypsin digest with one missed cleavage, no fixed modifications, ms tolerance of 100 ppm, ms/ms tolerance of 0.6 da and possible oxidation of methionine. known contaminant ions (human keratin and tryptic autodigest peptides, etc.) were excluded. mascot protein scores (based on ms and ms/ms spectra) greater than 72 were considered statistically significant (p < 0.05). for each protein, the individual ms/ms spectrum that was statistically significant (confidence interval >95%) and had the best ion score (based on ms/ms spectra) was accepted. to eliminate the redundancy of proteins that appeared in the database under different names and accession numbers, the protein belonging to the species anas platyrhynchos or with the highest protein score (top rank) was singled out. duck cdna of annexin a2 was amplified by reverse transcription-pcr with primers designed according to the sequence of chicken annexin a2 (genbank accession no. gi|45382533, sense: 5'-ccgctcgaggtcctctccaccacacaggt-3' and antisense: 5'-ccgctcgaggtcctctccaccacacaggt-3'). the full-length duck annexin a2, amplified from pdhs, was cloned into the prokaryotic expression plasmid pet-28a (novagen). recombinant annexin a2 with a c-terminal his-tag fusion was induced by iptg and purified by ni-nta affinity chromatography (qiagen). balb/c mice were immunized with the purified recombinant duck annexin a2 in freund's complete adjuvant (sigma). the serum was collected 2 weeks after the final injection and the anti-duck-annexin a2 antibody titers of the immunized mice were determined by western blot. differential expression of duck beta-actin, annexin a2, hsp70, destrin and lamin a was confirmed by western blot analysis. 
the primary antibodies for detection were as follows: a monoclonal antibody against beta-actin (sigma), rabbit anti-human lamin a polyclonal antibody (santa cruz biotechnology and proteintech), rabbit anti-human hsp70 polyclonal antibody (santa cruz biotechnology), rabbit hsp70 polyclonal antibody (bios), rabbit anti-human destrin polyclonal antibody (proteintech) and mouse anti-duck-annexin a2 polyclonal antibody prepared in our laboratory as described above. thirty micrograms of protein from each sample were separated in 12% sds-page gels and transferred to pvdf membranes using the transfer system (biorad). the blots were blocked with 5% nonfat milk for 2 h at room temperature and incubated at 4°c overnight with a 1:200-1:500 dilution of primary antibody. the blots were then washed four times with pbs containing 0.1% tween-20 and incubated with the appropriate horseradish peroxidase-conjugated secondary antibody (santa cruz biotechnology) for 1 h at room temperature. after four further washes with pbs containing 0.1% tween-20, the bands were developed with ecl detection reagent (pierce). the same amount of protein from each sample was applied to a parallel sds-page gel and stained with coomassie brilliant blue. 
hepatitis b virus immunopathogenesis hepatitis b virus infection--natural history and clinical consequences in vitro experimental infection of primary human hepatocytes with hepatitis b virus infection of a human hepatoma cell line by hepatitis b virus initiation of hepatitis b virus genome replication and production of infectious virus following delivery in hepg2 cells by novel recombinant baculovirus vector persistence of the hepatitis b virus covalently closed circular dna in heparg human hepatocyte-like cells age-related differences in amplification of covalently closed circular dna at early times after duck hepatitis b virus infection of ducks treatment of chronic viral hepatitis in woodchucks by prolonged intrahepatic expression of interleukin-12 proteome analysis of cultivar-specific deregulations of oryza sativa indica and o. sativa japonica cellular suspensions undergoing rice yellow mottle virus infection identification of epstein-barr virus (ebv) nuclear antigen 2 (ebna2) target proteins by proteome analysis: activation of ebna2 in conditionally immortalized b cells reflects early events after infection of primary b cells by ebv proteomic analysis of tobacco mosaic virus-infected tomato (lycopersicon esculentum m.) 
fruits and detection of viral coat protein identification of cellular proteins modified in response to african swine fever virus infection by proteomics proteomic analysis of cellular protein alterations using a hepatitis b virus-producing cellular model proteomic analysis of heparg cells: a novel cell line that supports hepatitis b virus infection proteomic analysis of hepatitis b surface antigen positive transgenic mouse liver and decrease of cyclophilin a duck hepatitis b virus: an invaluable model system for hbv infection infection and uptake of duck hepatitis b virus by duck hepatocytes maintained in the presence of dimethyl sulfoxide temporal proteome and lipidome profiles reveal hepatitis c virus-associated reprogramming of hepatocellular metabolism and bioenergetics gene expression in primate liver during viral hemorrhagic fever cellular proteins in influenza virus particles quantitative proteomic analysis of lentiviral vectors using 2-de identification of glyceraldehyde-3-phosphate dehydrogenase as a cellular protein that binds to the hepatitis b virus posttranscriptional regulatory element phosphorylation of the hepatitis b virus core protein by glyceraldehyde-3-phosphate dehydrogenase protein kinase activity protein kinase and nostimulated adp-ribosyltransferase activities associated with glyceraldehyde-3-phosphate dehydrogenase isolated from human liver almeras l: identification of cellular proteome modifications in response to west nile virus infection proteomics analysis of host cells infected with infectious bursal disease virus quantitative analysis of severe acute respiratory syndrome (sars)-associated coronavirus-infected cells using proteomic approaches: implications for cellular responses to virus infection itinerary of hepatitis b viruses: delineation of restriction points critical for infectious entry actin in transcription and transcription regulation actin-based motility of vaccinia virus local actin polymerization and dynamin recruitment in 
sv40-induced internalization of caveolae cellular motility driven by assembly and disassembly of actin filaments adenovirus endocytosis requires actin cytoskeleton reorganization mediated by rho family gtpases establishment of a functional human immunodeficiency virus type 1 (hiv-1) reverse transcription complex involves the cytoskeleton sfv infection in cho cells: cell-type specific restrictions to productive virus entry at the cell surface escape of herpesviruses from the nucleus us3 of herpes simplex virus type 1 encodes a promiscuous protein kinase that phosphorylates and alters localization of lamin a/c in infected cells effects of lamin a/c, lamin b1, and viral us3 kinase activity on viral infectivity, virion egress, and the targeting of herpes simplex virus u(l)34-encoded protein to the inner nuclear membrane epstein-barr virus bglf4 kinase induces disassembly of the nuclear lamina to facilitate virion production glutamine metabolism is essential for human cytomegalovirus infection molecular chaperones in the cytosol: from nascent chain to folded protein hsp70 chaperones: cellular functions and molecular mechanism proteomics analysis of the tombusvirus replicase: hsp70 molecular chaperone is associated with the replicase and enhances viral rna replication the cellular chaperone heat shock protein 90 facilitates flock house virus rna replication in drosophila cells heat shock protein 90 and heat shock protein 70 are components of dengue virus receptor complex in human cells efficient hsp90-independent in vitro activation by hsc70 and hsp40 of duck hepatitis b virus reverse transcriptase, an assumed hsp90 client protein hepatitis b virus replication causes oxidative stress in hepad38 liver cells betulinic acid-mediated inhibitory effect on hepatitis b virus by suppression of manganese superoxide dismutase expression epstein-barr virus induces an oxidative stress during the early stages of infection in b lymphocytes, epithelial, and lymphoblastoid cell lines 
annexin 2: a novel human immunodeficiency virus type 1 gag binding protein involved in replication in monocyte-derived macrophages secretory leukocyte protease inhibitor binds to annexin ii, a cofactor for macrophage hiv-1 infection annexin ii enhances cytomegalovirus binding and fusion to phospholipid membranes annexin ii incorporated into influenza virus particles supports virus replication by converting plasminogen into plasmin association of hepatitis b virus polymerase with promyelocytic leukemia nuclear bodies mediated by the s100 family protein p11 proteome responses to stable hepatitis b virus transfection and following interferon alpha treatment in human liver cell line hepg2 suitable reference genes for real-time pcr in human hbv-related hepatocellular carcinoma with different clinical prognoses selection of reference genes for real-time pcr in human hepatocellular carcinoma tissues validation of putative reference genes for gene expression studies in human hepatocellular carcinoma using real-time quantitative rt-pcr determination of suitable housekeeping genes for normalisation of quantitative real time pcr analysis of cells infected with human immunodeficiency virus and herpes viruses reference gene selection for quantitative real-time pcr analysis in virus infected cells: sars corona virus, yellow fever virus, human herpesvirus-6, camelpox virus and cytomegalovirus infections a protein-based set of reference markers for liver tissues and hepatocellular carcinoma efficient duck hepatitis b virus production by an avian liver tumor cell line rapid resolution of duck hepatitis b virus infections occurs after massive hepatocellular involvement pharmacia a: a modified silver staining protocol for visualization of proteins compatible with matrix-assisted laser desorption/ionization and electrospray ionization-mass spectrometry proteomic analysis of primary duck hepatocytes infected with duck hepatitis b virus proteome science we would like to thank professor 
alastair i. h. murchie for careful reading and correcting the english of the revised manuscript. this work was supported by national natural science foundation of china (30670092, j0730860) and the program of ministry of science and technology of china (2010dfa32100, the authors declare that they have no competing interests. authors' contributions: dq was responsible for the conception and design of the study. yz was responsible for pdhs culture, virus infection and sample preparation. hb and sq carried out the 2-de gel experiments and image analysis and excised the protein spots. xz performed the mass spectrometric analyses. yz and ly confirmed the differential expression by western blot. yz and hb carried out the analysis and interpretation of data. dq, yz and hb wrote the manuscript. bx, sz, ql, ry, tz, py have been involved in drafting the manuscript or revising it critically for important content. all authors read and approved the final manuscript. key: cord-259237-aty0vrat authors: frabutt, dylan a.; zheng, yong-hui title: arms race between enveloped viruses and the host erad machinery date: 2016-09-19 journal: viruses doi: 10.3390/v8090255 sha: doc_id: 259237 cord_uid: aty0vrat enveloped viruses represent a significant category of pathogens that cause serious diseases in animals. these viruses express envelope glycoproteins that are singularly important during the infection of host cells by mediating fusion between the viral envelope and host cell membranes. despite low homology at protein levels, three classes of viral fusion proteins have, as of yet, been identified based on structural similarities. their incorporation into viral particles is dependent upon their proper sub-cellular localization after being expressed and folded properly in the endoplasmic reticulum (er). 
however, viral protein expression can cause stress in the er, and host cells respond to alleviate the er stress in the form of the unfolded protein response (upr); the effects of which have been observed to potentiate or inhibit viral infection. one important arm of upr is to elevate the capacity of the er-associated protein degradation (erad) pathway, which is comprised of host quality control machinery that ensures proper protein folding. in this review, we provide relevant details regarding viral envelope glycoproteins, upr, erad, and their interactions in host cells. despite their vast diversity, animal viruses can be simply divided into two categories: non-enveloped viruses and enveloped viruses [1] . while non-enveloped viruses are wrapped with naked shells made of viral capsid proteins, enveloped viruses are covered with a lipid-bilayer, which is called a viral envelope. the viral envelope is obtained from progenitor host cells during the budding process, which can be a portion of plasma membrane or intracellular membrane. on the surface of the enveloped viruses, there are peplomers that project from the viral envelope, and play a critical role in viral infection. these peplomers are also described as spikes, which are made of viral envelope glycoproteins. envelope spikes serve to identify and bind to viral receptors on the host cell surface, allowing viral entry into cells and the initiation of infection by mediating virus-cell fusion. thus, the infectivity of enveloped viruses is absolutely dependent on the integrity of the viral envelope, and the functionality of the viral glycoproteins found therein. enveloped viruses are more stable than non-enveloped viruses under physiological conditions, at the expense of their sensitivity to high-temperature, low-ph, desiccation, or detergent-treatment, which limits their ability to withstand severe environments [2] . 
the entry of enveloped viruses requires the formation of a fusion pore between the viral envelope and the cell membrane, through which the viral genome is released into the cell. this fusion process is triggered by interactions between viral glycoproteins on the viral envelope and viral receptors on the cell surface, which can occur directly at the plasma membrane at neutral ph or in endocytic compartments at either low or neutral ph [3] . in addition, enveloped viruses can also enter cells through direct cell-to-cell contacts via virological synapses. glycosylation, which is one of the most common post-translational modifications in eukaryotic cells, is required for protein folding and maintaining protein structure. viruses have taken advantage of this benefit at nearly every step of the viral life cycle [12] . n-glycosylation significantly promotes the folding and solubility of viral envelope glycoproteins, enhances subsequent trafficking of these viral proteins to their destinations, and ensures that they are properly processed and incorporated into virions. nevertheless, glycosylation can have distinct effects that are both advantageous and detrimental to viral fitness. for example, if glycosylation occurs close to the glycoprotein processing sites, it may block the precursor cleavage by proteases and inhibit viral infection [13] ; if glycosylation occurs adjacent to the receptor-binding site, it may enhance the binding affinity and promote viral infection [14, 15] . in addition, the high density of glycans on virions may form a shield to impede antibody attack and promote immune evasion. however, these glycans can also become epitopes for stimulating neutralizing antibodies and the innate immune response, making viruses more vulnerable to immune clearance [16] . thus, there are multiple selective pressures on viral envelope glycosylation that can influence the pattern of glycosylation in order to achieve the optimal fitness in their hosts [17] . 
viruses are obligate intracellular parasites, and their glycoprotein biosynthesis and modification rely entirely on host cell machinery in the secretory pathway. therefore, viral and host proteins are glycosylated in a similar manner by the same mechanism. although glycans can be attached to polypeptide structures via several different mechanisms, asparagine n-linked glycosylation represents a fundamental and well characterized post-translational modification in eukaryotic organisms [18] . n-linked glycosylation starts from the membrane of the endoplasmic reticulum (er), where the tetradecasaccharide precursor is assembled. this precursor consists of two n-acetylglucosamine (glcnac), nine mannose (man, 4 are α1,2-linked), and three terminal glucose (glc) residues distributed on three extended man branches: a, b, and c (glc 3 man 9 glcnac 2 ) ( figure 1a ) [19, 20] . when nascent polypeptides enter the er lumen, the precursor is en bloc attached to asn residues of a nascent polypeptide in a consensus asn-x-(ser/thr) motif. after the attachment, these precursors are processed by a series of enzymes in both the er and the golgi apparatus to remold the core oligosaccharide into diverse n-linked glycan structures ( figure 1b ). the first step in this process is the sequential removal of the two outermost glc residues on branch a. the first glc residue is removed by glucosidase i (gi), resulting in the di-glycosylated oligosaccharide glc 2 man 9 glcnac 2 , which is recognized by an er transmembrane lectin malectin [21] . the second glc residue is then removed by glucosidase ii (gii), resulting in the mono-glucosylated oligosaccharide glc 1 man 9 glcnac 2 , which is recognized by two other er lectins, the membrane-bound calnexin (cnx) and/or soluble calreticulin (crt). 
interaction with these two chaperones segregates the newly formed glycoprotein and provides access to protein disulfide isomerases (pdis) such as erp57, which promotes disulfide bond formation, resulting in protein folding into a native conformation. once a protein is properly folded, gii cleaves the last glc residue on branch a, which releases the protein from the cnx/crt cycle. the er class i α-mannosidase (ermani) then cleaves the outermost man residue on branch b on native proteins, resulting in the oligosaccharide man 8 glcnac 2 . these high-man glycans are then recognized by lectins including er-golgi intermediate compartment-53 (ergic-53), vesicular integral membrane protein of 36kda (vip36), and vip36-like (vipl), which promote trafficking from the er to the golgi [22] . the remaining man residues are cleaved by the golgi mannosidases, and the glycan remolding process is continued through the remainder of the n-glycosylation pathway, which generates functional glycoproteins that are delivered to the cell surface ( figure 1b ). in addition to these chaperones and enzymes that promote protein folding, the er is also equipped with a unique quality control mechanism that extracts and degrades proteins that are not correctly folded or assembled into their native conformation, which is called er-associated protein degradation (erad) [23] . in fact, the folding efficiency of glycoproteins in the er is very low, which requires cycles of association and dissociation from cnx/crt to ensure proper glycoprotein maturation. if glycoproteins with the man 9 glcnac 2 oligosaccharide display non-native conformations, they are reglucosylated by the udp-glc:unfolded glycoprotein glucosyltransferase (ugt1 or uggt), and are subject to additional rounds of re-engagement with the cnx/crt machinery until folding is achieved. however, if a certain time frame for the folding is exceeded, proteins may never fold properly. 
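the consensus asn-x-(ser/thr) motif mentioned above, to which the glc 3 man 9 glcnac 2 precursor is attached, is simple enough to locate in a protein sequence with a regular expression. the python sketch below is only an illustration: the demo sequence is invented, the function name is hypothetical, and the exclusion of proline at the x position is a widely used convention that is not stated in the text. a matching sequon marks a candidate site only; whether it is actually glycosylated depends on the folding and processing steps described here.

```python
import re

# n-x-s/t sequon in one-letter code; the lookahead allows overlapping motifs.
# excluding proline at x is an assumption (common convention), not from the text.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str):
    """return (1-based asn position, motif) for each candidate n-glycosylation site."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]


# invented demo sequence, not a real viral glycoprotein
demo = "MKNGSAANPTLLNNTSQ"
sites = find_sequons(demo)
print(sites)
```

in the demo sequence the asparagines at positions 3, 13 and 14 are reported, while the one at position 8 is skipped because it is followed by proline.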
misfolded proteins are sequestered into coat protein complex ii (cop-ii)-dependent, highly mobile er-derived quality control vesicles (qcvs), where ermani is enriched (figure 1b) [24] . because ermani is able to excise all α1,2-man residues when it is expressed at much higher levels in vitro [25] , the enzyme may catalyze extensive demannosylation, resulting in the production of low-man oligosaccharide man 5 glcnac 2 -containing glycoprotein precursors. the removal of the a branch terminal man residue, which is the acceptor for glc transferred by uggt, disables these proteins from reengagement with cnx/crt and re-entering into the folding cycle. importantly, the low-man n-glycans represent a tag for defective glycoproteins, targeting them to erad [26] . 
figure 1 . (a) schematic presentation of the n-linked core oligosaccharide structure. the core is composed of two n-acetylglucosamine (glcnac, blue), nine mannose (man, red), and three glucose (glc, yellow) residues. a, b, and c are three oligosaccharide branches. (b) schematic description of n-glycosylation, endoplasmic reticulum-associated protein degradation (erad), and endoplasmic reticulum (er) stress pathways. nascent polypeptides are translocated through sec61 into the rough er, where the core oligosaccharide is transferred from a dolichol phosphate onto asparagine residues in asparagine-x-serine/threonine (nxs/t) motifs (i). the two terminal glucose residues on the core oligosaccharide are trimmed by glucosidase i (gi) (ii) and gii (iii), respectively, allowing for the association with the chaperones, membrane-bound calnexin (cnx) and/or soluble calreticulin (crt), which promote folding to a native conformation. eventually, the last terminal glucose residue will be trimmed by gii, and the glycoprotein will attain a native conformation (iv), or misfold (vii). glycoproteins that reach a native conformation will have the terminal α1,2-man residue on the b branch removed by er class i α-mannosidase (ermani) (v), as a signal to allow it to traverse the canonical secretory pathway for surface presentation or secretion (vi). polypeptides unable to reach a native conformation (vii) will engage in multiple rounds of the cnx/crt cycle, facilitated by reglucosylation of the terminal glucose by udp-glc:unfolded glycoprotein glucosyltransferase (uggt) (viii), and trafficking between quality control vesicles (qcv) (ix) and the er-derived quality compartments (erqc) (x) under er stress. terminally misfolded glycoproteins will be demannosylated to remove all α1,2-man residues (xi), followed by association with lectins osteosarcoma amplified 9 (os9) and xtp3-transactivated gene b protein (xtp3-b) for erad (xii). ermani-containing qcvs are rapidly recycled through autophagy/lysosome pathways (xiii). without interactions with client glycoproteins, edemosome components are degraded through an autophagy-like mechanism (xiv). viruses can hijack edemosomes to form double membrane vesicles (dmvs) that serve as platforms for their replication (xv). 
with only one-tenth of the total cell volume, the er is responsible for the synthesis of the vast majority of the secreted or membrane proteins, which account for one-third of total cellular proteins. therefore, the er has extremely high protein concentrations (100 mg/ml), which renders this organelle very susceptible to protein aggregation [27] . in addition, protein folding is error prone, and this process can be further compromised by physiological and pathological perturbations. moreover, genetic mutations may prohibit proteins from being folded properly. all these factors may cause the accumulation of unfolded or misfolded proteins. 
when the level of these aberrant proteins exceeds the folding and clearance capacity of the er, known as er homeostasis, it leads to a cellular stress response termed "er stress", which in turn activates the unfolded protein response (upr) to restore the er homeostasis [28] . er stress is sensed by three er transmembrane receptors: double-stranded rna (dsrna)-activated protein kinase (pkr)-like er kinase (perk), inositol-requiring enzyme 1 (ire1), and activating transcription factor 6 (atf6). perk and atf6 are in association with another er chaperone, the binding immunoglobulin protein (bip, or grp78), when the cell is not under stress. bip preferentially binds to misfolded proteins and dissociates from perk and atf6 under er stress, resulting in their activation and upr to mitigate this stress [29, 30] . ire1 is activated by the direct binding of unfolded proteins [31] . ire1 then activates the transcription factor x-box binding protein 1 (xbp-1), which in turn up-regulates er chaperones to assist in the folding capacity of the er as well as erad components to boost protein degradation. perk phosphorylates the eukaryotic initiation factor (eif)-2α and halts protein translation, and atf6 up-regulates protein expression to boost the er protein folding capacity and erad. however, if these objectives are not achieved within a certain time span or if the disruption is prolonged, upr also activates pathways leading to cell death. although perk activation causes global inhibition of protein translation by blocking eif-2α activity, it paradoxically enhances translation of the transcription factor atf4. atf4 then trans-activates the ccaat/enhancer-binding protein-homologous protein (chop), which is a pro-apoptotic transcription factor, resulting in cell death by apoptosis [32] . erad is a protein quality control mechanism conserved in all eukaryotic cells, which is an important arm of upr, necessary to alleviate er stress [33] . 
erad results in the selective dislocation of unfolded and misfolded proteins from the er to the cytosol via specific membrane machinery. erad targets are subsequently degraded by the cytosolic ubiquitin proteasome system (ups) [34]. quality control of functional proteins produced from the er is also critical for maintenance of the er homeostasis by eliminating unfolded and misfolded proteins. thus, erad is a central element of both the secretory pathway and upr, which targets a number of physiological and pathological substrates such as the t cell antigen receptor (tcr) [35], 3-hydroxy-3-methylglutaryl coenzyme-a (hmg-coa) reductase (hmgcr) [35], squalene monooxygenase (sqle) [36], inositol 1,4,5-trisphosphate (ip3) receptor [37], diacylglycerol acyltransferase 2 (dgat2) [38], heme oxygenase-1 (ho-1) [39], alpha-1 antitrypsin [35], and cystic fibrosis transmembrane conductance regulator (cftr) [40]. so far, more than 60 human diseases have been attributed to this pathway [41]. although the vast majority of secreted proteins are glycosylated, the er is responsible for the folding and assembly of both glycosylated and non-glycosylated proteins into functional complexes, which are subjected to erad quality control if they are misfolded. the process of erad can be divided into three steps: substrate recognition, retrotranslocation, and ubiquitylation/proteasomal degradation. in the substrate recognition step, extensive excision of α1,2-man residues from n-glycans sends an important signal to trigger misfolded glycoprotein degradation, which is dependent on class i mannosidases [42]. class i mannosidases belong to the glycoside hydrolase family 47 (gh47), which are exo-acting α1,2-mannosidases that are divided into three subfamilies [43]. the first subfamily consists of ermani, which is thought to cleave the outermost α1,2-man residue on the b branch from n-linked glycans in the er. 
the second subfamily consists of three golgi α-mannosidase i, including golgimania, golgimanib, and golgimanic, which cleave the remaining three α1,2-man residues in the golgi complex for n-glycan maturation. the third subfamily consists of the er degradation-enhancing α-mannosidase-like proteins (edem) 1, 2, and 3. although some edem orthologs in lower eukaryotes have detectable α1,2-mannosidase activity, such activity has not been reported for any mammalian edem proteins in vitro. nevertheless, there is evidence suggesting that these edem proteins should have enzymatic activity in vivo [44, 45] . indeed, the extent of man excision determines the fate of a glycoprotein, which could be either targeted to erad for degradation or sent to the golgi for normal trafficking. ermani exhibits a slow rate of enzymatic activity, which allows nascent proteins to perform multiple rounds of reglucosylation and achieve proper folding [46] . properly folded glycoproteins should have one man residue trimmed off from n-glycans by ermani. these glycoproteins then interact with the high-man binding lectin, er-golgi intermediate compartment 53 kda protein (ergic-53) [47] , for trafficking from the er to the golgi ( figure 1b) . however, if glycoproteins are misfolded terminally, the remaining three α1,2-man residues are excised from these molecules, which targets misfolded proteins for degradation [48] . it is still not completely understood how misfolded glycoproteins are subjected to such extensive demannosylation in the er and then targeted for erad. although ermani alone may be able to complete this task, there is evidence suggesting that additional gh47 enzymes are involved. elevation of the golgi mannosidases has been shown to accelerate erad, so these enzymes may possibly be responsible for such extensive excision, likely by trafficking back to the er via an unknown mechanism [49] . 
in fact, golgimania has recently been observed to localize in qcv together with other canonical erad machinery such as ermani, and its overexpression and knockdown can, respectively, increase or retard trimming of misfolded glycoproteins from man9glcnac2 to man5glcnac2 in vitro [50]. in addition, upon er stress, qcv converge to form the er-derived quality compartments (erqc), where edem proteins are also sequestered (figure 1b) [48]. edem1 and edem3 boost mannose trimming when overexpressed [45, 51, 52]. in addition, using a genomic knockout approach, it has been recently proposed that edem2 plays a central role in the trimming of the outermost man residue on the b branch, whereas edem1 and edem3 should be responsible for trimming of the remaining α1,2-man residues. accordingly, a "double check" model for misfolded glycoproteins has been proposed, which suggests that edem2 catalyzes the first step of man trimming, and edem1 and edem3 contribute the second step [44]. under these joint actions, all four α1,2-man residues are removed from the oligosaccharide, which is then recognized by the lectins osteosarcoma amplified 9 (os9) and xtp3-transactivated gene b protein (xtp3-b) via the mannose 6-phosphate receptor homology (mrh) domain (figure 1b) [53, 54]. misfolded proteins are targeted to specific translocation channels (retrotranslocons) for retrotranslocation in an energy-dependent manner. this process is facilitated by p97, a member of the atpases associated with diverse cellular activities (aaa) family, by catalyzing atp hydrolysis [55]. the p97 atpase is recruited by the ubiquitin-like (ubx)-domain-containing protein ubxd8, an er-membrane protein that plays a role in erad [56]. it remains unclear how these retrotranslocons are formed and how integral membrane and lumenal erad substrates are exported across the er membrane through them. 
the first candidate channel is composed of the sec61 complex, which is comprised of α, β, and γ subunits. the α subunit crosses the membrane 10 times, and forms a channel with the other subunits. the sec61 heterotrimeric channel is the main translocon involved in co-translational protein transport into the er [57] . however, there is evidence suggesting that the sec61 translocon is also involved in retrotranslocation of erad substrates, implying the non-specificity and bi-directional property of this channel [58] . the second candidate is a member of the derlin family (derlin-1, -2, and -3) [59, 60] . derlins are integral membrane proteins that likely span the er membrane four times and contain a rhomboid-like domain [61] . the third candidate includes the erad-specific e3 ligases. they have a large number of transmembrane domains, which are not only responsible for polyubiquitylation, but could act as potential exit channels for erad substrates [62] . in saccharomyces cerevisiae, there are two major types of really interesting new gene (ring)-finger e3 ligase complexes, hrd1 and doa10, which mediate erad by targeting discrete substrates [63] . hrd1 was the first e3 enzyme identified in the erad pathway during the study of hmg-coa reductase degradation (hrd) [64] . hrd1 has 6 transmembrane helices in its n-terminal transmembrane domain and a catalytic ring domain in the soluble c-terminal region extended to the cytosol. using an elegant erad assay reconstituted in vitro, the hrd1-mediated formation of ubiquitin-gated protein-conducting membrane channels has been demonstrated [65, 66] . hrd1 has two mammalian orthologs named hrd1 and gp78, and its functioning e2 enzyme is known as ubc7, which also has two mammalian orthologs, ube2g2 and ube2g1 [67] . hrd1 is unstable, and must be stabilized by its co-factor hrd3 in an equimolar ratio [68] . the mammalian ortholog of hrd3 is sel1l, which is required for erad substrate retrotranslocation [69] . 
hrd1 interacts with der1, and the der1-hrd1 interaction is bridged by another integral membrane protein usa1, which is also required for hrd1 oligomerization [70]. the usa1 mammalian ortholog is called herp, which also interacts with hrd1 and derlin-1 and plays an important role in erad [71]. doa10 was identified in degradation studies of mating type (mat)-α2-10 (doa), which is a yeast transcription factor. doa10 is a ~150 kda protein that has 14 transmembrane domains and requires both ubc6 and ubc7 as e2 enzymes. the doa10 mammalian ortholog is teb4 (march6), which functions in the erad pathway with similar subcellular distribution and topology [72]. the ubc6 e2 enzyme has two mammalian orthologs, ube2j1 and ube2j2; both are involved in erad [73, 74]. in saccharomyces cerevisiae, erad is executed synergistically by hrd1 and doa10 with minimal redundancy because they exhibit different substrate specificity. doa10 mainly triggers ubiquitylation of specific soluble proteins and membrane proteins with degrons exposed to the cytosol, a process referred to as erad-c [75, 76]. hrd1 interacts with two other types of substrates, whose degradation is termed erad-l and erad-m. erad-l includes soluble lumenal proteins in the er and transmembrane proteins with degrons exposed to the er lumen; erad-m includes transmembrane proteins with degrons embedded in the er membrane [63]. simultaneous inactivation of both genes has been shown to increase the sensitivity to heavy metal-induced cellular stress and to elevate the upr. the regulation of protein folding and the functional relation between erad and the upr are much more complex in mammalian cells. in saccharomyces cerevisiae, the cnx cycle does not exist due to the lack of uggt. in addition, protein synthesis is tightly controlled at the translational level by determination of the stoichiometry to avoid surplus production, resulting in minimal dependence on the post-translational regulation of protein expression. 
moreover, although the yeast ermani ortholog mns1p and edem ortholog htm1p are indispensable for erad, only one edem ortholog is present in yeast [77, 78] . because overly active erad may interfere with the regular protein folding process in the er, mammalian cells have evolved mechanisms to tightly regulate this quality control device by a combination of compartmentalization and tuning. edem1 is segregated into er-derived, lc3-i-associated vesicles, which are called edemosomes, where edem1, os-9, and sel1l are concentrated when they lack client glycoproteins to dislocate ( figure 1b) [79] . notably, unlike chaperones and the other enzymes, many erad regulators including ermani, edem1, os-9, herp, and sel1l are short-lived proteins, and ermani, edem1, sel1l, and os-9 are targeted to the lysosomal pathway for degradation [79, 80] . thus, edemosomes are called erad tuning vesicles, which deliver their content to lysosomes for disposal via an autophagy-like pathway to reduce the erad capacity under natural conditions [81] . additionally, lysosomal inhibitors are able to cause the accumulation of an aggregating mutant of dysferlin in the er when compared to the wild-type, which was used as evidence to propose that large protein aggregates are disposed of via an autophagy/lysosomal pathway, dubbed erad ii [82] . however, under er stress, most of these factors are highly induced, including the edem proteins, but not ermani [45, 83, 84] . under stress, qcv are recruited to the erqc, resulting in the accumulation of ermani and its glycoprotein substrates [85] . moreover, many other erad components, including edem1, hrd1, derlin-1, sec61β, and herp, are also concentrated in erqc. importantly, it has been found that edem1 stabilizes ermani and increases its protein expression at steady-state levels [86] . 
such enrichment of these critical components accelerates efficient assembly of the erad machinery, potentiating the degradation of misfolded glycoproteins and alleviating er stress. during infection, viruses are able to hijack the host translational machinery and saturate the er with viral proteins. not only do viruses use the er to generate their glycoproteins, but some even utilize the er as their site to assemble progeny particles [87]. such accumulation of viral proteins in the er places a heavy demand on the protein folding machinery, which may cause er stress, and in turn, activate the upr, resulting in restoration of the er homeostasis or apoptosis. so far, at least 36 viruses have been found to be able to induce er stress, and activate the three upr stress signaling pathways [88]. enveloped viruses may bud through the plasma membrane or an intracellular compartment. in addition, their envelope glycoproteins are targeted to the er for post-translational modifications and folding. not surprisingly, many viral envelope glycoproteins are significant inducers of the upr, including those of hcv [89], hepatitis b virus (hbv) [90], coronaviruses [91], chikungunya virus (chikv) [92], and retroviruses [93]. as introduced earlier, the upr utilizes three different mechanisms to alleviate er stress: reducing global protein translation, increasing the er folding capacity, and enhancing erad by activating the perk, atf6, or ire1-xbp1 pathways, respectively. viral infections may activate these pathways, resulting in the inhibition or enhancement of viral replication (table 1). for example, the perk-mediated global translation shutdown is a very effective antiviral mechanism, and a similar shutdown by pkr is used in the interferon pathway to defend against viral infection [94]. conceivably, viruses have evolved a number of strategies to circumvent the detrimental effect of upr to establish productive infection. 
hcv is still able to produce viral proteins even when the cellular translational machinery is shut down, because it has its own internal ribosome entry site (ires) to recruit and assemble the ribosomal initiation complex for protein expression [95]. epstein-barr virus (ebv), herpes simplex virus (hsv), and african swine fever virus (asfv) can counteract the perk-mediated eif2α phosphorylation by activating an eif2α phosphatase pp1 [96] [97] [98]. in another example, the hcv e2 protein directly interacts with perk to prevent er stress sensing by acting as a pseudo-substrate to block perk activity [99]. in addition to combating the upr, viruses also take advantage of the upr pathways to benefit their replication. for example, influenza a virus (iav) replication is promoted by activation of the ire1-xbp1 pathway [100]; atf6 activation promotes asfv, lymphocytic choriomeningitis virus (lcmv), denv, human cytomegalovirus (hcmv), and japanese encephalitis virus (jev) replication [101, 102], and atf4 activation enhances hiv-1 replication [103]. thus, despite the detrimental effects, viruses have evolved to manipulate host upr signaling pathways to promote viral infections. below, we will focus on the roles that erad plays in virus replication, which is the main subject of this review. as introduced earlier, erad transports unfolded/misfolded proteins from the er into the cytosol for proteasomal degradation. conceivably, viruses can manipulate and exploit this cellular machinery to degrade several important host factors to promote their propagation. herpesviruses have evolved multiple mechanisms to suppress the host immune response via erad. major histocompatibility complex (mhc) molecules play an indispensable role in triggering an immediate immune response to inhibit virus infections. herpesviruses inhibit mhc class i (mhc-i) expression by targeting these molecules to erad for degradation. 
for example, hcmv produces two transmembrane proteins, us2 and us11, and each is sufficient to bind to mhc-i heavy chains, causing their dislocation from the er to the cytosol for degradation [108]. notably, us2 and us11 use different mechanisms to degrade mhc-i. us2-dependent mhc-i degradation is mediated through an interaction with the e3 ligase, trc8. this us2/trc8 complex has been implicated in the degradation of other membrane proteins including multiple alpha-integrins, the interleukin 12 receptor (il-12r), thrombomodulin (thbd), protein tyrosine phosphatase receptor type j (ptprj), and cd112 [109]. although the signal peptide peptidase (spp) has been shown to bind to trc8, the us2/trc8 complex maintains its mhc-i degradation activity in spp−/− knockout cells, suggesting that spp binding is not related to mhc-i degradation [110, 111]. recent reports now regard the us2/trc8 complex as a multifunctional hub that is able to degrade a multitude of targets in order to further hcmv immune evasion [109, 112]. a complex formed between us11, derlin-1, and the e3 ligase, tmem129, mediates mhc-i degradation via us11 [113]. initial reports concerning us11 found an association with sel1l and assumed that us11-mediated mhc-i degradation could be sel1l/hrd1 dependent. recent literature has confirmed that while the us11/tmem129 complex degrades mhc-i, us11 itself is degraded through a sel1l/hrd1 axis in the absence of the client mhc-i [113, 114]. recruitment of p97 by ubxd8 is also crucial for us11-mediated mhc-i degradation [115]. thus, in the case of us11, hcmv utilizes erad to dispose of both mhc-i and its own effector protein, using discrete axes for ubiquitination. mouse gammaherpesvirus 68 (mhv68) uses another mechanism to inhibit mhc-i. mhv68 produces a protein termed mk3, which is a ring-finger e3 ligase anchored on the er membrane. 
mk3 interacts with mhc-i heavy chain molecules, and it also associates with the transporter associated with antigen processing (tap), p97, derlin-1, and the e2 ube2j2. association with ube2j2 results in an interesting pattern of ubiquitination of non-lysine residues (the mk3/ube2j2 complex can ubiquitinate serines as well as lysines) that leads to rapid degradation of the mhc-i by proteasomes [73, 116] . thus, herpesviruses have evolved numerous strategies to block the mhc antigen presentation and evade the host immune response to establish a persistent infection. primate lentiviruses also harness the erad pathway to promote their replication via downregulation of their receptor cd4. cd4 downregulation prevents superinfection and promotes viral release by interrupting viral receptor-envelope interactions on the plasma membrane, leading to a controlled and productive viral infection and immunodeficiency [117] . these viruses produce two accessory proteins, nef and vpu, to trigger cd4 degradation via two distinctive mechanisms [118] . nef uses the endocytic pathway to redirect cd4 from the cell surface, or to interfere with the transport of newly synthesized cd4 from the trans-golgi network (tgn) to the cell surface, resulting in cd4 dislocation to endosomes and degradation by lysosomes [119] . however, vpu interacts with cd4 in the er and induces cd4 proteasomal degradation via erad [120] . vpu is a small transmembrane protein encoded by hiv-1 and some simian immunodeficiency virus (siv) isolates. vpu forms ion conductive membrane pores; it also interacts with β-transducin repeat-containing proteins (βtrcp), which are f-box/wd repeat-containing proteins that are part of the skp1-cul1-f-box (scf) e3 ubiquitin ligase complex [121] . the vpu-induced cd4 degradation is strictly dependent on the scf-β-trcp complex [122] . notably, this e3 ligase complex is not associated with the er membrane, and therefore does not normally function in erad. 
however, the degradation also requires the cytosolic atpase p97 and its cofactors ufd1l and npl4, which are key components of the erad machinery, suggesting that cd4 is degraded via erad [122]. nevertheless, the degradation is not dependent on hrd1, sel1l, and ubc7. in addition to degradation, viruses may harness erad components to benefit their replication. first, erad can promote viral protein expression. mouse mammary tumor virus (mmtv) is a betaretrovirus, which expresses the rem protein in the er. rem has an n-terminal 98-amino acid signal peptide (sp), which is cleaved off by signal peptidase and retrotranslocated in a p97-dependent manner [123]. rem sp then promotes the nuclear export of viral unspliced rnas to the cytosol for protein expression. similarly, hepatitis e virus (hev) orf2 is an n-linked glycoprotein, but functions as the major capsid protein. although orf2 is expressed in the er, it depends on erad components to exit from the er to the cytoplasm without being polyubiquitylated [124]. second, erad can promote virus entry. polyomaviruses (pyv) enter cells through the er and then replicate in the nuclei [125]. to get from the er to the nucleus, these viruses can cross the er membrane into the cytosol via the erad retrotranslocons [126]. an example of this is mouse pyv, which uses derlin-2, whereas simian virus 40 (sv40) uses derlin-1 and the sel1l complex for dislocation [126, 127]. in addition, the proteasome machinery is also required for the human bk pyv exit from the er [127]. third, erad can promote virus replication. the replication of positive-strand rna viruses normally involves the formation of double-membrane vesicles (dmvs) and convoluted membranes (cms) by rearrangement of cellular membranes, which segregates and protects viral proteins and genomes from the host's innate immune response. 
as introduced earlier, the erad activity can be adjusted by erad tuning vesicles termed edemosomes (figure 1b), which display non-lipidated lc3 and segregate the erad factors edem1, os-9, and sel1l from the er lumen [81]. by comparing the similarity between dmvs and edemosomes, it has been discovered that mouse hepatitis virus (mhv), equine arteritis virus (eav), and jev indeed replicate in these erad tuning vesicles [128]. thus, these viruses can subvert edemosomes as their replication vesicles to promote infection [129]. although erad has been frequently manipulated by a number of viruses to promote infection or attenuate immune responses, it may also function directly as an antiviral device to protect host cells from infection. because viral envelope glycoprotein production and folding take place in the er, these viral proteins may become the primary targets for erad, resulting in the inhibition of viral infection. primate lentiviruses, including hiv and siv, have low levels of envelope glycoproteins on their surface, and the average copy number is ~14 env trimers per virion [130, 131]. in contrast, iav, sendai virus, hsv, and moloney murine leukemia virus (momulv) have many more envelope glycoproteins on their surfaces [132] [133] [134] [135]. the exceptionally low number of env spikes may protect hiv-1 from host immune responses [136]; almost 85% of env proteins are retained in the er and are degraded [137] [138] [139]. this degradation mechanism was not clear until we recently reported that hiv-1 env glycoproteins are targeted for erad. from completely unrelated studies, we isolated hiv-1 non-permissive (np) and permissive (p) t cell clones n2-np and n5-p from the original cem.nkr human t cell line [140]. our initial analysis uncovered that hiv-1 replication is restricted from the second round of the viral life cycle in n2-np cells, resulting in ~1000-fold inhibition when compared to n5-p. 
further transcriptome analysis by microarrays revealed that n2-np cells overexpress the mitochondrial translocator protein (tspo), which strongly inhibits hiv-1 env expression [141]. tspo interacts with the mitochondrial permeability transition pore (mptp) complex, which includes the outer membrane protein voltage-dependent anion channel (vdac), the inner membrane protein adenine nucleotide translocase (ant), and the mitochondrial matrix protein cyclophilin d (cypd) [142]. tspo binds to vdac and contributes to the regulation of the mitochondrial membrane permeability by the mptp complex [143]. our results suggested that tspo overexpression could reduce the oxidative redox status in the er, which interferes with the env oxidative folding process, resulting in env degradation. consistently, the rapid env degradation in n2-np cells was rescued by kifunensine, an effective inhibitor of glycoside hydrolase family 47 (gh47) enzymes [144], suggesting that hiv-1 env is degraded via erad in n2-np cells. to further explore the env degradation mechanism, we investigated which of the four er-associated gh47 enzymes was responsible for the env degradation. notably, when ermani, edem1, edem2, and edem3 were ectopically expressed in 293t cells, only ermani strongly inhibited env expression in a dose-dependent manner. in addition, when the endogenous ermani was knocked out by crispr/cas9, tspo was no longer able to suppress the env expression [145]. these results demonstrated that ermani should be responsible for the initiation of hiv-1 env degradation via erad. human ermani is a 699-amino-acid, 79.5-kda, type ii membrane protein, which is divided into an n-terminal cytoplasmic domain (cd), transmembrane (tm) helix, lumenal 'stem' region, and a catalytic domain [146, 147]. using an immunoprecipitation assay, we found that hiv-1 env interacts with the catalytic domain of ermani [145]. 
the structure of this catalytic domain shows an (αα)7-barrel composed of 14 consecutive helices [148]. in the catalytic domain, there are seven residues that are critical for ermani function. c527 and c556 form a highly conserved disulfide bond and were reportedly critical for protein folding [149], whereas e330, d463, and e599 were proposed as catalytic residues [148]. r334c and e397k mutations are found in nonsyndromic autosomal-recessive intellectual disability (ns-arid) disease [150], and the r334c mutation is also found in the congenital disorders of glycosylation [151]. all these residues are required for hiv-1 env degradation, suggesting that the mannosidase activity of ermani is important for this degradation. ermani also targets the terminally misfolded human alpha1-antitrypsin variant null (hong kong) (nhk) for degradation via erad, but neither its catalytic activity nor its catalytic domain is required for this degradation, suggesting that different mechanisms are involved in hiv-1 env and nhk degradation [152]. we have also found that the viral protein r (vpr) of hiv-1 enhances viral replication in monocyte-derived macrophages (mdms) and dendritic cells (mddcs) by rescuing env from degradation through the erad ii autophagy pathway. compounds known to facilitate glycoprotein folding (pk11195 and as2o3) or to inhibit er α-mannosidases crucial for erad (kifunensine), and those that block lysosomal proteases (bafilomycin), rescued envelope expression and infectivity in a ∆vpr background to the level of wild-type virus [153]. as mentioned above, unlike ermani, whose expression is not responsive to upr, the expression of the edems is induced upon upr via the ire1/xbp1 activation pathway, which boosts erad and alleviates er stress. although ectopic expression of edems did not inhibit hiv-1 env expression [145], these proteins inhibit the expression of some other envelope glycoproteins. 
hbv expresses three surface glycoproteins, the large (l), middle (m), and small (s), which are translated from different initiation codons within the same open reading frame (orf) and share the tetra-spanning transmembrane domains in the s protein. the n-termini of the m and l proteins contain additional pres2 and pres1-pres2 domains, respectively. the common s domain has an n-glycosylation site, and the m pres2 domain has another site. overexpression of the surface proteins is sufficient to activate the ire1/xbp1 pathway and elevate edem1, edem2, and edem3 expression. importantly, edem1 overexpression destabilizes s, m, and l, and edem1 silencing stabilizes their expression [154]. in addition, the autophagy/lysosomal pathway, but not the proteasomal pathway, is involved in the degradation of hbv surface glycoproteins, further complicating our understanding of the viral protein degradation process via erad [154]. hcv has two n-glycosylated envelope proteins e1 and e2 on the surface of virions, which are type i transmembrane proteins expressed from a common viral polyprotein precursor. hcv infection strongly induces the activation of the ire1 stress sensor, resulting in elevation of edem1, edem2, and edem3, but not ermani expression. both edem1 and edem3, but not edem2, interact with e2, and overexpression of these two proteins induces e2 polyubiquitylation and degradation. conversely, knockdown of edem1 expression or treatment with kifunensine increases e2 expression, and also reduces the interaction of edem1 and edem3 with sel1l [155]. taken together, these results strongly suggest that edem proteins are able to extract viral polypeptides from the er quality control cycle, and degrade them via erad. however, since none of these proteins can target the jev e protein to erad for degradation, not every viral glycoprotein is recognizable by these proteins [155]. 
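the n-glycosylation sites mentioned above for the hbv and hcv envelope proteins sit in asparagine-x-serine/threonine (nxs/t) motifs, as noted in the legend of figure 1. a minimal python sketch of how candidate motifs could be located in a protein sequence is given below. the function name and the toy sequence are hypothetical and are not taken from any viral protein; excluding proline at position x is a widely used convention that is assumed here rather than stated in the text, and a matching motif only marks a potential site, not a confirmed glycan.

```python
import re

# N-X-S/T motif. The lookahead allows overlapping motifs (e.g. "NNSS") to be found.
# Assumption: proline is excluded at position X (common convention, not from the text).
SEQUON = re.compile(r"(?=(N[^P][ST]))")

def find_sequons(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position of the Asn, motif) for each candidate NxS/T motif."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]

# Hypothetical toy sequence for illustration only.
toy_sequence = "MKNGSAANPTLLNVTQ"
print(find_sequons(toy_sequence))
```

in the toy sequence, the asparagine followed by proline is skipped because of the assumed proline exclusion, while the other two motifs are reported.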
in vivo studies of patients with chronic liver injury did not detect up-regulation of upr and erad elements in diseased versus control patients [156]. erad has also been implicated in the degradation of hcmv glycoproteins, gh and gl, via the 26s proteasome. hcmv produces at least 65 unique glycoproteins, with four homologues to the hsv glycoproteins, gh, gb, gl, and gm [157]. the glycoproteins, gh and gl, are constituents of the gcii type complexes found on the surface of hcmv virions. the gcii trimeric complex between gh, gl, and go can initiate ph-independent fusion [158]. in addition, a pentameric complex between gh, gl, and the gene products ul128, ul130, and ul131 is able to mediate entry into different cell types via ph-dependent receptor-mediated endocytosis, a process that requires the trimeric gh/gl/go complex [159]. although previous studies have shown that the glycoprotein gl stabilizes the expression of gh and potentiates its surface localization [160], recent work revealed that gh is degraded via erad in the absence of gl [161]. replacement of the cytoplasmic tail of gh with that of the human cd4 protein subverted gh degradation via erad, potentiating surface expression. current studies describe two paradigms for erad to target viral glycoproteins for degradation: ermani-mediated, which targets hiv-1 env, and edem-mediated, which can target hcv and hbv surface glycoproteins. gh47 family members share a common catalytic mannosidase homology domain of ~440 residues [52], and the three catalytic residues e330, d463, and e599 found in ermani are all conserved in these proteins [43]. nevertheless, there is little protein sequence homology beyond this domain among these proteins. unlike ermani, all three edems are er-lumenal proteins, although the signal sequence of edem1 is resistant to cleavage [162]. 
edem3 has two novel features: an additional protease-associated domain of unknown function and a kdel signal for er retention [45]. whether or how the coordination between the edems and ermani facilitates erad is still unresolved. due to lysosomal degradation mediated by the n-terminal cytoplasmic tail, ermani is expressed at very low basal levels in cells, and its expression is not induced by upr [86]. such proteolytically driven checkpoint control of ermani expression may help to establish glycoprotein quality control at a baseline level, which maintains er homeostasis without activation of ire1/xbp1. however, if this basic mechanism fails to restore er homeostasis, ire1/xbp1 is induced to elevate expression of the edems, which will increase erad. unlike hcv and hbv, hiv-1 induces upr, but barely activates the ire1/xbp1 pathway, which may explain why hiv-1 env is not directly targeted by edem proteins [93]. nevertheless, these two different arms of erad do not exclude a role of the edems in ermani-mediated degradation. edems may accelerate the release of terminally misfolded glycoproteins from the cnx/crt cycle, and thereby help ermani to conduct more extensive demannosylation [163]; and the association of edem with sel1l may further accelerate the cytosolic delivery of misfolded proteins [164]. moreover, edem1 may form a complex with ermani, which stabilizes ermani by suppressing its proteolytic degradation [86]. the localization of ermani remains disputed, with various labs reporting colocalization with the er, the golgi, er-golgi intermediate compartments, or quality control vesicles. these discrepancies lend credence to both current theories: that ermani is a golgi checkpoint in quality control that returns misfolded proteins to the er for further processing, or that it resides in quality control vesicles with glycoprotein substrates as part of the cnx/crt cycle [24, 165].
it is well established that viruses have evolved to manipulate host upr and erad to optimize their replication, whether they are 'tuning' host quality control to ensure the proper folding of their envelope glycoproteins, circumventing erad in order to prevent degradation of their viral envelope glycoproteins, or hijacking erad to dispose of host proteins. there are still many questions left to be answered, including the identities of the dislocons that each envelope glycoprotein is targeted to, the motifs or patterns that allow α1,2-mannosidases to differentiate between native and misfolded glycoproteins, why some viral proteins are disproportionately targeted (hcmv gh), and the roles that the upr and erad play in vivo during viral infections. these exciting areas merit more extensive studies.

virus entry: molecular mechanisms and biomedical applications the cell biology of receptor-mediated virus entry recruitment of hiv and its receptors to dendritic cell-t cell junctions penetration of nonenveloped viruses into the cytoplasm virus and cell fusion mechanisms both e protein glycans adversely affect dengue virus infectivity but are beneficial for virion release a systematic study of the n-glycosylation sites of hiv-1 envelope protein on infectivity and antibody-mediated neutralization playing hide and seek: how glycosylation of the influenza virus hemagglutinin can modulate the immune response to infection the hepatitis c virus glycan shield and evasion of the humoral immune response comprehensive functional analysis of n-linked glycans on ebola virus gp1 tracking global patterns of n-linked glycosylation site variation in highly variable viral glycoproteins: hiv, siv, and hcv envelopes and influenza hemagglutinin glycosylation affects cleavage of an h5n2 influenza virus hemagglutinin and regulates virulence importance of hemagglutinin glycosylation for the biological functions of influenza virus interdependence of hemagglutinin glycosylation and neuraminidase as
regulators of influenza virus growth: a study by reverse genetics the hiv glycan shield as a target for broadly neutralizing antibodies virus glycosylation: role in virulence and immune interactions vertebrate protein glycosylation: diversity, synthesis and function roles of n-linked glycans in the endoplasmic reticulum n-glycan-based er molecular chaperone and protein quality control system: the calnexin binding cycle malectin: a novel carbohydrate-binding protein of the endoplasmic reticulum and a candidate player in the early steps of protein n-glycosylation the role of lectin-carbohydrate interactions in the regulation of er-associated protein degradation the long road to destruction mammalian er mannosidase i resides in quality control vesicles, where it encounters its glycoprotein substrates the specificity of the yeast and human class i er alpha 1,2-mannosidases involved in er quality control is not as strict previously reported endoplasmic reticulum-associated degradation of mammalian glycoproteins involves sugar chain trimming to man6-5glcnac2 the unfolded protein response: integrating stress signals through the stress sensor ire1alpha endoplasmic reticulum stress sensing in the unfolded protein response dynamic interaction of bip and er stress transducers in the unfolded-protein response the unfolded protein response: from stress pathway to homeostatic regulation unfolded proteins are ire1-activating ligands that directly induce the unfolded protein response er-stress-induced transcriptional regulation increases protein synthesis leading to cell death endoplasmic reticulum-associated degradation regulation of endoplasmic reticulum-associated protein degradation (erad) by ubiquitin. cells how early studies on secreted and membrane protein quality control gave rise to the er associated degradation (erad) pathway: the early history of erad sterol homeostasis requires regulated degradation of squalene monooxygenase by the ubiquitin ligase doa10/teb4. 
elife 2013, 2, e00953 when worlds collide: ip(3) receptors and the erad pathway regulation of diacylglycerol acyltransferase 2 protein stability by gp78-associated endoplasmicreticulum-associated degradation ubiquitin-proteasome system mediates heme oxygenase-1 degradation through endoplasmic reticulum-associated degradation pathway selective inhibition of endoplasmic reticulum-associated degradation rescues deltaf508-cystic fibrosis transmembrane regulator and suppresses interleukin-8 levels: therapeutic implications the delicate balance between secreted protein folding and endoplasmic reticulum-associated degradation in human physiology flagging and docking: dual roles for n-glycans in protein quality control and cellular proteostasis family 47 alpha-mannosidases in n-glycan processing edem2 initiates mammalian glycoprotein erad by catalyzing the first mannose trimming step edem3, a soluble edem homolog, enhances glycoprotein endoplasmic reticulumassociated degradation and mannose trimming in vitro mannose trimming property of human er alpha-1,2 mannosidase i ergic-53 and traffic in the secretory pathway endoplasmic reticulum (er) mannosidase i is compartmentalized and required for n-glycan trimming to man5-6glcnac2 in glycoprotein er-associated degradation stimulation of erad of misfolded null hong kong alpha1-antitrypsin by golgi alpha1,2-mannosidases mannosidase ia is in quality control vesicles and participates in glycoprotein targeting to erad edem1 regulates er-associated degradation by accelerating de-mannosylation of folding-defective polypeptides and by inhibiting their covalent aggregation glycoprotein folding and the role of edem1, edem2 and edem3 in degradation of folding-defective glycoproteins os-9 and grp94 deliver mutant alpha1-antitrypsin to the hrd1-sel1l ubiquitin ligase complex for erad the mrh domain suggests a shared ancestry for the mannose 6-phosphate receptors and other n-glycan-recognising proteins in vitro analysis of hrd1p-mediated 
retrotranslocation of its multispanning membrane substrate 3-hydroxy-3-methylglutaryl (hmg)-coa reductase derlin-1 and ubxd8 are engaged in dislocation and degradation of lipidated apob-100 at lipid droplets structure of the native sec61 protein-conducting channel proteasome 19s rp binding to the sec61 channel plays a key role in erad a membrane protein required for dislocation of misfolded proteins from the er a membrane protein complex mediates retro-translocation from the er lumen into the cytosol derlin-1 is a rhomboid pseudoprotease required for the dislocation of mutant alpha-1 antitrypsin from the endoplasmic reticulum protein quality control in the er: balancing the ubiquitin checkbook the ubiquitylation machinery of the endoplasmic reticulum role of 26s proteasome and hrd genes in the degradation of 3-hydroxy-3-methylglutaryl-coa reductase, an integral endoplasmic reticulum membrane protein autoubiquitination of the hrd1 ligase triggers protein retrotranslocation in erad key steps in erad of luminal er proteins reconstituted with purified components selective ubiquitylation of p21 and cdt1 by ubch8 and ube2g ubiquitin-conjugating enzymes via the crl4cdt2 ubiquitin ligase complex endoplasmic reticulum degradation requires lumen to cytosol signaling. 
transmembrane control of hrd1p by hrd3p association of the sel1l protein transmembrane domain with hrd1 ubiquitin ligase regulates erad-l usa1 functions as a scaffold of the hrd-ubiquitin ligase the ubiquitin-domain protein herp forms a complex with components of the endoplasmic reticulum associated degradation pathway membrane topology of the yeast endoplasmic reticulum-localized ubiquitin ligase doa10 and comparison with its human ortholog teb4 (march-iv) ube2j2 ubiquitinates hydroxylated amino acids on er-associated degradation substrates hrd1 and ube2j1 target misfolded mhc class i heavy chains for endoplasmic reticulum-associated degradation distinct ubiquitin-ligase complexes define convergent pathways for the degradation of er proteins the yeast erad-c ubiquitin ligase doa10 recognizes an intramembrane degron htm1p, a mannosidase-like protein, is involved in glycoprotein degradation in yeast the saccharomyces cerevisiae processing alpha 1,2-mannosidase is localized in the endoplasmic reticulum, independently of known retrieval motifs segregation and rapid turnover of edem1 by an autophagy-like mechanism modulates standard erad and folding activities human endoplasmic reticulum mannosidase i is subject to regulated proteolysis disposal of cargo and of erad regulators from the mammalian er two endoplasmic reticulum-associated degradation (erad) systems for the novel variant of the mutant dysferlin: ubiquitin/proteasome erad(i) and autophagy/lysosome erad(ii) a novel er alpha-mannosidase-like protein accelerates er-associated degradation a novel stress-induced edem variant regulating endoplasmic reticulum-associated glycoprotein degradation glycan regulation of er-associated degradation through compartmentalization the mammalian upr boosts glycoprotein erad by suppressing the proteolytic downregulation of er mannosidase i how viruses use the endoplasmic reticulum for entry, replication, and assembly the expanding roles of endoplasmic reticulum stress in virus 
replication and pathogenesis unfolded protein response in hepatitis c virus infection modulation of the unfolded protein response by the human hepatitis b virus coronavirus infection, er stress, apoptosis and innate immunity differential unfolded protein response during chikungunya and sindbis virus infection: chikv nsp4 suppresses eif2alpha phosphorylation hiv infection and antiretroviral therapy lead to unfolded protein response activation signal integration via pkr the pathway of hcv ires-mediated translation initiation the lmp1 oncogene of ebv activates perk and the unfolded protein response to drive its own synthesis herpes simplex virus 1 infection activates the endoplasmic reticulum resident kinase perk and mediates eif-2alpha dephosphorylation by the gamma(1)34.5 protein the african swine fever virus dp71l protein recruits the protein phosphatase 1 catalytic subunit to dephosphorylate eif2alpha and inhibits chop induction but is dispensable for these activities during virus infection protein synthesis and endoplasmic reticulum stress can be modulated by the hepatitis c virus envelope protein e2 through the eukaryotic initiation factor 2alpha kinase perk influenza a viral replication is blocked by inhibition of the inositol-requiring enzyme 1 (ire1) stress pathway the atf6 branch of unfolded protein response and apoptosis are activated to promote african swine fever virus infection er stress, autophagy, and rna viruses short communication: activating transcription factor 4 (atf4) promotes hiv type 1 activation pb1-f2 attenuates virulence of highly pathogenic avian h5n1 influenza virus in chickens dengue virus modulates the unfolded protein response in a time-dependent manner cytomegalovirus downregulates ire1 to repress the unfolded protein response the sars coronavirus 3a protein causes endoplasmic reticulum stress and induces ligand-independent downregulation of the type 1 interferon receptor the hcmv gene products us2 and us11 target mhc class i molecules
for degradation in the cytosol plasma membrane profiling defines an expanded class of cell surface proteins selectively targeted for degradation by hcmv us2 in cooperation with ul141 cleavage by signal peptide peptidase is required for the degradation of selected tail-anchored proteins signal peptide peptidase is required for dislocation from the endoplasmic reticulum the trc8 e3 ligase ubiquitinates mhc class i molecules before dislocation from the er tmem129 is a derlin-1 associated erad e3 ligase essential for virus-induced degradation of mhc-i identifying the erad ubiquitin e3 ligases for viral and cellular targeting of mhc class i sel1l nucleates a protein complex required for dislocation of misfolded glycoproteins mhc class i ubiquitination by a viral phd/lap finger protein role of cd4 receptor down-regulation during hiv-1 infection mechanisms of cd4 downregulation by the nef and vpu proteins of primate immunodeficiency viruses a novel human wd protein, h-beta trcp, that interacts with hiv-1 vpu connects cd4 to the er degradation pathway through an f-box motif hiv-1 vpu -an ion channel in search of a job multilayered mechanism of cd4 downregulation by hiv-1 vpu involving distinct er retention and erad targeting steps retroviral rem protein requires processing by signal peptidase and retrotranslocation for nuclear function cytoplasmic localization of the orf2 protein of hepatitis e virus is dependent on its ability to undergo retrotranslocation from the endoplasmic reticulum the polyomaviridae: contributions of virus structure to our understanding of virus receptors and infectious entry murine polyomavirus requires the endoplasmic reticulum protein derlin-2 to initiate infection simian virus 40 depends on er protein folding and quality control factors for entry into host cells coronaviruses hijack the lc3-i-positive edemosomes, er-derived vesicles exporting short-lived erad regulators, for replication how viruses hijack the erad tuning machinery electron 
tomography analysis of envelope glycoprotein trimers on hiv and simian immunodeficiency virus virions distribution and three-dimensional structure of aids virus envelope spikes retrovirus envelope protein complex structure in situ studied by cryo-electron tomography three-dimensional structure of herpes simplex virus from cryo-electron tomography paramyxovirus ultrastructure and genome packaging: cryo-electron tomography of sendai virus zernike phase contrast electron microscopy of ice-embedded influenza a virus why hiv virions have low numbers of envelope spikes: implications for vaccine development model for intracellular folding of the human immunodeficiency virus type 1 gp120 secretion of a truncated form of the human immunodeficiency virus type 1 envelope glycoprotein biosynthesis, cleavage, and degradation of the human immunodeficiency virus 1 envelope glycoprotein gp160 a novel hiv-1 restriction factor that is biologically distinct from apobec3 cytidine deaminases in a human t cell line cem the mitochondrial translocator protein, tspo, inhibits hiv-1 envelope glycoprotein biosynthesis via the endoplasmic reticulum-associated protein degradation pathway the mitochondrion in apoptosis: how pandora's box opens tspo interacts with vdac1 and triggers a ros-mediated inhibition of mitochondrial quality control elucidation of the molecular logic by which misfolded alpha 1-antitrypsin is preferentially selected for degradation ermani (endoplasmic reticulum class i alphamannosidase) is required for hiv-1 envelope glycoprotein degradation via endoplasmic reticulum-associated protein degradation pathway identification, expression, and characterization of a cdna encoding human endoplasmic reticulum mannosidase i, the enzyme that catalyzes the first mannose trimming step in mammalian asn-linked oligosaccharide biosynthesis cloning and expression of a specific human alpha 1,2-mannosidase that trims man(9)glcnac(2) to man(8)glcnac(2) isomer b during n-glycan biosynthesis 
structural basis for catalysis and inhibition of n-glycan processing class i alpha 1,2-mannosidases role of the cysteine residues in the alpha1,2-mannosidase involved in n-glycan biosynthesis in saccharomyces cerevisiae. the conserved cys340 and cys385 residues form an essential disulfide bond mutations in the alpha 1,2-mannosidase gene, man1b1, cause autosomal-recessive intellectual disability somatic overgrowth associated with homozygous mutations in both man1b1 and sec23a. cold spring harb. mol. case stud a golgi-localized mannosidase (man1b1) plays a non-enzymatic gatekeeper role in protein biosynthetic quality control hiv-1 vpr increases env expression by preventing env from endoplasmic reticulum-associated protein degradation (erad) activation of erad pathway by human hepatitis b virus modulates viral and subviral particle production role of the endoplasmic reticulum-associated degradation (erad) pathway in degradation of hepatitis c virus envelope proteins and production of virus particles no evidence of the unfolded protein response in patients with chronic hepatitis c virus infection human cytomegalovirus glycoproteins human cytomegalovirus penetrates host cells by ph-independent fusion at the cell surface human cytomegalovirus gh/gl/go promotes the fusion step of entry into all cell types, whereas gh/gl/ul128-131 broadens virus tropism through a distinct mechanism a novel herpes simplex virus glycoprotein, gl, forms a complex with glycoprotein h (gh) and affects normal folding and surface expression of gh human cytomegalovirus gh stability and trafficking are regulated by er-associated degradation and transmembrane architecture characterization of early edem1 protein maturation events and their functional implications role of edem in the release of misfolded glycoproteins from the calnexin cycle edem1 recognition and delivery of misfolded proteins to the sel1l-containing erad complex golgi localization of ermani defines spatial separation of the mammalian
glycoprotein quality control system

the authors declare no conflict of interest.

key: cord-260869-rym2ik0o authors: lemmermeyer, tanja; lamp, benjamin; schneider, rainer; ziebuhr, john; tekes, gergely; thiel, heinz-jürgen title: characterization of monoclonal antibodies against feline coronavirus accessory protein 7b date: 2016-02-29 journal: vet microbiol doi: 10.1016/j.vetmic.2015.12.009 sha: doc_id: 260869 cord_uid: rym2ik0o feline coronaviruses (fcovs) encode five accessory proteins of unknown function, termed 3a, 3b, 3c, 7a and 7b. these proteins are dispensable for viral replication in vitro but are thought to play a role in virulence. in the current study, we produced and characterized 7b-specific monoclonal antibodies (mabs). a recombinant form of the 7b protein was expressed as a fusion protein in escherichia coli, purified by immobilized metal affinity chromatography and used as immunogen. two hybridoma lines, 5b6 and 14d8, were isolated that expressed mabs recognizing the 7b proteins of both fcov serotypes. using an extensive set of n- and c-terminally truncated 7b proteins expressed in e. coli and a synthetic peptide, the binding sites of mabs 5b6 and 14d8 were mapped to an 18-residue region that comprises the only potential n-glycosylation site of the fcov 7b protein. the two mabs were suitable to detect a 24-kda protein, which represents the nonglycosylated form of 7b in fcov-infected cells. we speculate that glycosylation of 7b is part of the viral evasion strategy to prevent an immune response against this antigenic site. feline coronaviruses (fcovs) are enveloped viruses with a large (approximately 30 kb) single-stranded rna genome of positive polarity. they belong to the family coronaviridae within the order nidovirales (de groot et al., 2012). based on phylogenetic analyses, coronaviruses are divided into four genera termed alpha-, beta-, gamma- and deltacoronavirus.
feline coronaviruses together with the closely related canine coronavirus (ccov) and porcine transmissible gastroenteritis virus (tgev) have been classified to the species alphacoronavirus 1 in the genus alphacoronavirus. the latter contains a large number of more distantly related alphacoronaviruses, including a number of important medical and veterinary pathogens, such as porcine epidemic diarrhea virus (pedv), human coronavirus 229e (hcov-229e), and human coronavirus nl63 (hcov-nl63) (de groot et al., 2012). the 5'-proximal two-thirds of the coronavirus genome comprise orf 1a and orf 1b that together encode up to 16 nonstructural proteins (nsps) that form the viral replication/transcription complex but may also be involved in interactions with host cell factors and functions. the remaining part of the genome codes for the structural proteins s (spike), e (envelope), m (membrane) and n (nucleocapsid) (masters, 2006) and a varying number of so-called accessory proteins with poorly characterized functions. fcovs infect cats and other members of the felidae worldwide. based on serological properties, fcovs are separated into serotypes i and ii (heeney et al., 1990; kennedy et al., 2003). serotype i viruses dominate in the field and account for up to 95% of fcov infections (hohdatsu et al., 1992; addie et al., 2003), whereas the prevalence of type ii fcovs is much lower. serotype ii fcovs emerged through homologous recombination between serotype i fcovs and ccovs (herrewegh et al., 1998; terada et al., 2014). in addition to a serology-based classification, fcovs may be divided into different biotypes based on their pathogenic potential. the avirulent biotype is generally referred to as feline enteric coronavirus (fecv) and causes inapparent persistent infections of the gut, while the virulent biotype, feline infectious peritonitis virus (fipv), causes a fatal disease called feline infectious peritonitis (fip) (pedersen, 2009).
according to the "internal mutation theory", fipv evolves from fecv through mutations in approximately 5-10% of the persistently infected cats (vennema et al., 1998). so far, the genetic changes responsible for the biotype switch have not been identified, but there is increasing evidence that mutations in the accessory genes and the s gene are involved in the development of fip (vennema et al., 1998; kennedy et al., 2001; pedersen, 2009; chang et al., 2010, 2011; licitra et al., 2013; bank-wolf et al., 2014). coronavirus genomes comprise a variable number of accessory genes at different positions in the 3'-proximal genome region. with only very few exceptions, homologs of specific accessory genes are conserved only in very closely related viruses (of the same species or sublineage) but not in more distantly related viruses (lai and cavanagh, 1997). there is increasing evidence that accessory gene products are important for virulence in the natural host, but the precise functions of the vast majority of accessory proteins remain to be investigated. alphacoronaviruses harbor accessory genes at two different positions in their genomes. between the s and e genes, fcovs and the very closely related ccovs possess three orfs (3a-3c), while tgev contains only two orfs (3a and 3b) in this genome region. recently, an additional orf, named orf3, was found between the s and e genes in ccov serotype i (lorusso et al., 2008). other alphacoronaviruses, such as pedv, hcov-229e and hcov-nl63, have only one orf 3. sequence analyses suggest that fcov orf 3a is homologous to ccov orf 3a and tgev orf 3a, while fcov orf 3c is a homolog of ccov orf 3c, tgev orf 3b and orf 3 of all other alphacoronaviruses. downstream of the n gene, the members of the species alphacoronavirus 1 contain different numbers of additional accessory genes.
thus, tgev contains only one orf (called orf 7), which is homologous to orf 7a of fcovs and ccovs, while the latter two coronaviruses contain yet another orf called 7b in the 3'-terminal genome region. altogether, the fcov genome encodes five accessory proteins termed 3a, 3b, 3c, 7a and 7b (dye and siddell, 2005; tekes et al., 2008). using a reverse genetics approach, haijema et al. (2004) showed that the accessory genes of the fipv strain 79-1146 are dispensable for viral growth in vitro and that recombinant viruses lacking orfs 3a-3c or 7a and 7b were unable to induce fip in vivo. so far, there is no evidence that the 3a-3c accessory proteins are produced in infected cells. nevertheless, it has been proposed that 3c is essential for viral replication in the gut (as is the case for fecv) but dispensable for systemic infections (chang et al., 2010). the functions of the fcov accessory proteins 3a-3c remain to be determined. recently, it was suggested that the accessory protein 7a represents an interferon antagonist (dedeurwaerder et al., 2014), although its expression in infected cells has not been confirmed. among the fcov accessory proteins, the 7b protein has been studied most extensively. the 7b protein has a molecular mass of ~26 kda, is secreted from the cell and contains (i) a c-terminal kdel-like endoplasmic reticulum (er) retention signal, (ii) an n-terminal signal sequence of 17 amino acids and (iii) a potential n-glycosylation site at aa position 68 (vennema et al., 1992a). the presence of 7b-specific antibodies in both fecv- and fipv-infected cats indicates that this protein is produced during natural infections (vennema et al., 1992a,b; herrewegh et al., 1995; kennedy et al., 2008). it was also reported that cell culture adaptation often leads to mutations in the 7b gene and loss of expression of this protein (herrewegh et al., 1995).
taken together, there is growing evidence that the accessory genes and their products are involved in fcov persistence and virulence but, up to now, their function is not known. one reason for our limited knowledge of fcov accessory proteins is the lack of appropriate tools to study these proteins. to our knowledge, specific antibodies against the individual accessory proteins are not available. in this manuscript, we describe the production of fcov 7b protein-specific mabs. our data show that the mabs recognize the 7b protein of both serotype i and ii fcovs. moreover, using a panel of n- and c-terminally truncated 7b proteins and a synthetic peptide, we determined the binding sites of the mabs in 7b. the data further indicate that, in fcov-infected cells, the mabs recognize only the nonglycosylated form of 7b. all animal procedures were performed in strict accordance with the legal regulations of the german animal welfare jurisdiction (tierschutzgesetz). the animal experiment was approved by the giessen regional council (regierungspräsidium) and recorded after approval under reference number a15/2013. crandell rees feline kidney cells (crfk) were purchased from the american type culture collection (atcc ccl-94). felis catus whole fetus 4 cells (fcwf-4) were kindly provided by r. j. de groot, university of utrecht, the netherlands. both cell lines were maintained in dulbecco's modified eagle's medium (dmem) supplemented with 10% fetal calf serum (fcs), penicillin (100 iu ml⁻¹) and streptomycin (100 iu ml⁻¹) in 5% co2 at 37 °c. the type-ii strain fcov 79-1146 was kindly provided by r. j. de groot, university of utrecht, the netherlands (genbank nc_002306). the recombinant type-i fcov strain recfcov-dstop-7b was generated by reverse genetics as described previously (tekes et al., 2012). it comprises the type-i fcov strain black with a restored accessory gene 7b (genbank eu186072).
the type-ii fcov 79-1146 and the type-i recfcov-dstop-7b were propagated on crfk cells and fcwf-4 cells, respectively. crfk or fcwf-4 cells were incubated for 1 h with fcov 79-1146 (multiplicity of infection [moi] of 1) or recfcov-dstop-7b (moi of 0.001), respectively. to inhibit n-glycosylation of viral proteins, 2 µg ml⁻¹ tunicamycin in dimethyl sulfoxide (dmso) was added 4 h post infection (p.i.). at 24 h p.i., the cells were harvested in ripa buffer (150 mm nacl, 50 mm tris, 1% np-40, 0.5% na-deoxycholate, 0.5 mm pefabloc sc [merck], ph 8.0) with 0.2% sds for sds-page and western blot analysis. following treatment with tunicamycin, cells were either fixed for immunofluorescence or harvested in ripa buffer for sds-page and western blot analysis at 15 h p.i. deglycosylation of cell lysates was carried out for 3 h with n-glycosidase f (pngase f; new england biolabs, ma) according to the manufacturer's instructions. the full-length 7b genes of fcov 79-1146 (nucleotides (nts) 28462-29082) and recfcov-dstop-7b (nts 28337-28954) were amplified by reverse transcription polymerase chain reaction (rt-pcr). the amplified product of fcov 79-1146 was cloned with primers c32/c33 into the ncoi/xhoi sites of a modified pet-11a vector (novagen). the resulting plasmid was named pc18.2 and contained a c-terminal 12× histidine-tag (7b-his). the amplified product of recfcov-dstop-7b was cloned with primers c83/c82 into a modified pgex-6p-1 vector (ge healthcare) by digestion with bamhi and xhoi. the resulting plasmid with an n-terminal gst and a c-terminal 13× his-tag was named pc80.1 (gst-7bdstop-his). for the generation of gst-7bdss-his (pc52.1; c50/c47), a deletion mutant of 7b (amino acids 17-206, without the n-terminal signal sequence [dss]) was cloned into the same modified pgex-6p-1 vector. to determine the antigenic site of the monoclonal antibodies, n- and c-terminally truncated 7b proteins of fcov 79-1146 were generated as gst-fusion proteins.
to this end, the sequences were amplified by pcr with specific primers and cloned into the modified pgex-6p-1 vector. the resulting plasmids were named pc71.2 (aa 112-206; c72/c47), pc72.1 (aa 1-111; c46/c73), pc82.1 (aa 65-206; c84/c47), pc88.1 (aa 26-206; c86/c47), pc89.2 (aa 34-206; c87/c47), pc90.1 (aa 42-206; c88/c47), pc91.5 (aa 50-206; c89/c47), pc92.9 (aa 58-206; c90/c47), pc97.4 (aa 1-60; c46/c101), pc98.1 (aa 1-63; c46/c102), pc102.3 (aa 1-67; c46/c103), pc108.2 (aa 1-70; c46/c107) and pc114.1 (aa 1-75; c46/c109). primer sequences are shown in table 1. the pet-11a vector was modified by pcr with primers c17 and c18 to insert an ncoi and an xhoi restriction site upstream of a bamhi restriction site. furthermore, a 7× his-tag followed by a stop codon was inserted behind the xhoi site and the existing bamhi site was deleted. in a second pcr with primers c12 and c18, the his-tag was extended by 5 histidines to generate a 12× his-tag. the pgex-6p-1 vector was modified by pcr with primers c36 and c37 to insert a 13× his-tag downstream of the existing xhoi site. plasmid constructs were verified by sequence analysis. the recombinant proteins were expressed after transformation of the different plasmids into escherichia coli (e. coli) strain rosetta™ (de3) plyss (novagen). an overnight culture was diluted 1:10 in fresh lb medium supplemented with ampicillin (100 µg ml⁻¹), chloramphenicol (34 µg ml⁻¹) and glucose (1%) to a final volume of 500 ml and grown for 2-3 h at 37 °c to an od600 of 1.0. protein expression was induced by the addition of isopropyl-β-d-thiogalactopyranoside (iptg) to a final concentration of 1 mm. after 3 h of growth at 28 °c, the cells were harvested by centrifugation and resuspended in 50 mm na2hpo4, 300 mm nacl and 1% triton x-100. cells were lysed by freeze/thaw treatment (3×) and sonication on ice. the insoluble protein fraction was separated from soluble proteins by ultracentrifugation (100,000 × g for 1 h).
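the logic behind a truncation panel such as the one listed above can be sketched as an interval intersection: every fragment that is still bound by a mab must contain the epitope, so the minimal binding region is the overlap of all reactive fragments. the python sketch below is illustrative only. the function name `minimal_binding_region` and the reactivity pattern in `example` are assumptions made for this illustration and are not results reported in the text; the pattern was merely chosen to be compatible with an 18-residue region around the glycosylation site at aa 68.

```python
from typing import List, Optional, Tuple

# (first residue, last residue, bound by the mAb?)
Fragment = Tuple[int, int, bool]


def minimal_binding_region(fragments: List[Fragment]) -> Optional[Tuple[int, int]]:
    """Return the residue range shared by all reactive fragments.

    Returns None if no fragment is reactive or if the reactive
    fragments do not overlap.
    """
    reactive = [(start, end) for start, end, bound in fragments if bound]
    if not reactive:
        return None
    region_start = max(start for start, _ in reactive)
    region_end = min(end for _, end in reactive)
    if region_start > region_end:
        return None
    return (region_start, region_end)


# Hypothetical reactivity pattern (assumption, NOT data from the study).
example: List[Fragment] = [
    (1, 206, True),    # full-length protein
    (50, 206, True),   # N-terminal truncation, still bound
    (58, 206, True),   # N-terminal truncation, still bound
    (65, 206, False),  # N-terminal truncation, binding lost
    (1, 75, True),     # C-terminal truncation, still bound
    (1, 70, False),    # C-terminal truncation, binding lost
]

region = minimal_binding_region(example)
if region is not None:
    first, last = region
    print(f"aa {first}-{last} ({last - first + 1} residues)")
```

non-reactive fragments are not used in the intersection itself; they only confirm that residues outside the shared region cannot be removed without losing binding.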
to solubilize inclusion bodies, the pellet was resuspended in 8 m urea, 300 mm nacl and 50 mm na2hpo4, ph 8.0, or in 6 m guanidinium hydrochloride (guhcl), 50 mm na2hpo4, 20% ethanol, 1% triton x-100 and 100 mm nacl, ph 8.0, and treated with ultrasonication. after an additional ultracentrifugation at 100,000 × g for 30 min, the solubilized proteins were applied to histrap hp columns (ge healthcare). the recombinant proteins were purified by immobilized metal ion affinity chromatography (imac) with ni2+ sepharose as recommended in the supplier's protocol. the purified recombinant proteins were analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (sds-page) and coomassie staining. following electrophoresis the proteins were identified by western blot using anti-his mab. after dialysis overnight against pbs (gst-7bdss-his) or acetone precipitation (7b-his), the proteins were used for the immunization of mice. protein concentrations were determined by bradford assay (bradford, 1976).

four 12-week-old female balb/c mice (charles river breeding laboratories) were injected subcutaneously with 50 µg of purified protein gst-7bdss-his in freund's incomplete adjuvant (sigma) on days 0, 22 and 54. antibodies binding the recombinant protein were detected in serum samples by western blotting after the third injection. on days 70, 71 and 72 the mouse with the best immune response was immunized with 40 µg of purified protein 7b-his without adjuvant. one day after the final antigen application, splenocytes were fused with sp2/0-ag14 myeloma cells (atcc crl-1581) at a ratio of 3:1 using polyethylene glycol 1500 (roche) according to standard protocols (köhler et al., 1976).
the fused cells were cultured in 96-well plates and hybridoma cells were selected in dmem supplemented with 20% f10 (gibco), 20% f12 (gibco), 15% fcs, 2 mm l-glutamine, 0.35% 2-mercaptoethanol, 50 mm hepes, oxaloacetate-pyruvate-insulin supplement (opi; sigma), and hypoxanthine-aminopterin-thymidine supplement (hat; sigma). for the detection of specific antibodies against the 7b protein, hybridoma culture supernatants were screened by elisa (engvall and perlmann, 1971) 7-10 days after fusion. to avoid screening for antibodies that react with gst, the fusion protein 7b-his was used for coating. however, coating with 7b-his resulted in od450nm values <0.35. as an alternative, flat-bottomed elisa 96-well plates (nunc maxisorp) were coated overnight at 4 °c with 35 ng purified gst-7bdss-his protein in 0.1 m na2co3/nahco3 (ph 9.5) per well. the supernatants were also tested against an irrelevant control protein (gst-yfp-his; 75 ng/well) in order to exclude gst- and his-reactive clones. this control protein was expressed in the same vector and purified as described for the gst-7bdss-his construct. after washing three times with pbst (pbs containing 0.1% tween-20), the plates were blocked with 10% fcs in pbst for 1 h at 37 °c. then, the plates were washed again and incubated with 100 µl/well of hybridoma culture supernatants. specific antibodies were detected using goat anti-mouse igg conjugated with horseradish peroxidase (dianova) at a dilution of 1:2500. the elisa was developed with 3,3′,5,5′-tetramethylbenzidine (tmb; sigma) and enzyme reactions were terminated with 25% (v/v) h2so4 solution. the optical density (od) at 450 nm was measured in a microtiter plate reader (spectra ii, slt). hybridomas with elisa values >0.4 were cloned twice. reactivity was confirmed by elisa and specificity was determined by western blot analysis.
table 1. primer sequences used to amplify defined fragments of fcov orf 7b.

to confirm the antigenic sites recognized by the mabs 5b6 and 14d8, elisa plates were coated with 460 ng/well of bsa-7b-pep (c-rvecegiegfnctwpgfq [centic biotec, germany]). due to problems during the synthesis of the peptide, the internal cysteines were linked by a disulfide bond. therefore, the peptide was incubated with the reducing agent dtt (5 mm) for 1 h at room temperature before coating. as negative control, an irrelevant peptide coupled to bsa, bsa-7b-clvg (1 µg/well; c-lvgavpkqkrlnvg, biogenes, germany), and, as positive control, the purified protein gst-7bdss-his (36 ng/well) were coated. the mabs 5b6 and 14d8 were used at a 1:3 dilution; anti-his mab was used at a 1:10 dilution. the elisa was performed as described above. the immunoglobulin subclass of both mabs was determined using the commercially available iso-gold rapid mouse-monoclonal isotyping kit (bioassay works) according to the manufacturer's instructions.

sds-page was carried out using a 10% tricine-polyacrylamide gel system (schagger and von jagow, 1987). for western blot analysis, the proteins were transferred to a nitrocellulose membrane (pall corporation). residual binding sites were blocked with 4% skimmed milk in pbst. the membrane was then incubated with monoclonal antibodies of the hybridoma cell culture supernatants (primary antibodies) at a 1:3 dilution or sera in pbst (1:1500 dilution). bound antibodies were visualized with hrp-conjugated goat anti-mouse or goat anti-cat igg (dianova) and chemiluminescence reagent (western lightning plus-ecl; perkinelmer). goat anti-cat igg was additionally diluted in pbst containing 1× roti®-block (roth).

crfk cells were cultured on sterile cover slips in 24-well plates pretreated with rat tail collagen (sigma). cells were mock or virus-infected and treated with 2 µg ml⁻¹ tunicamycin 4 h p.i. at 15 h p.i.
the cells were fixed with pbs containing 4% paraformaldehyde for 20 min at 4 °c, followed by three pbs washes. cells were permeabilized with triton x-100 (1% in pbs) for 5 min at room temperature. the cells were rinsed twice with pbs and residual binding sites were blocked with 1× roti®-immunoblock (roth) for 10 min at room temperature. the cells were then incubated with primary antibodies (1:3 dilution) or sera (1:1500 dilution) in pbs for 1 h at 37 °c, followed by three pbs washes. subsequently the cells were incubated with secondary antibodies conjugated to cyanogen-3 (1:500 dilution, dianova) in pbs for an additional hour at 37 °c. the dna was counterstained with 66 µg ml⁻¹ 4′,6-diamidino-2-phenylindole (dapi) for 5 min at room temperature. after three washes the cover slips were mounted with mowiol (sigma) containing the anti-fading reagent dabco (1,4-diazabicyclo(2.2.2)octane, roth). the immunofluorescent staining was visualized by fluorescence microscopy (axiovert 10, zeiss).

the full-length 7b protein of fcov 79-1146, comprising 206 amino acid (aa) residues, was expressed in e. coli with a c-terminal his-tag (7b-his). induction of expression of 7b-his resulted in insoluble protein aggregates that could be solubilized using 6 m guhcl-containing buffer and purified by immobilized metal ion affinity chromatography (imac). the identity of the purified protein was confirmed by sds-page and western blotting using anti-his mab. the protein had an apparent molecular mass of 24 kda as judged by sds-page analysis, which corresponds to the mass predicted for this protein, and was recognized by the his-tag-specific antibody (fig. 1a). western blot analysis further revealed a protein of approximately 48 kda, suggesting dimerization of the full-length protein. the overall yield for this protein was low. we therefore decided to express 7b as part of a gst fusion protein.
to do this, the 5′ region encoding a putative n-terminal signal sequence (ss, residues 1 to 17) was removed from the 7b coding sequence and replaced by the gst coding sequence. the resulting construct was termed gst-7bdss-his. expression of gst-7bdss-his in e. coli resulted again in insoluble protein aggregates. the protein was purified under denaturing conditions using 8 m urea-containing buffer and analyzed by sds-page and western blotting using an anti-his mab. consistent with its calculated molecular mass, the major fraction of the purified protein migrated as a 51-kda protein in sds polyacrylamide gels. two additional minor bands in the sds-page were specifically recognized by western blotting using the his-tag-specific mab. these bands are consistent with a dimer and a 37-kda degradation product, respectively, of the gst-7bdss-his protein (fig. 1b). the total yield of gst-7bdss-his was estimated to be 25 times higher than that of 7b-his. the elution fractions of 7b-his and gst-7bdss-his were pooled, concentrated, desalted and used for immunization (fig. 1c and d, respectively).

the fusion protein 7b-his was used for the immunization of balb/c mice. however, even after four injections over a time period of 12 weeks, no immune reaction against 7b-his could be detected in sera obtained from immunized animals. we therefore decided to immunize another four balb/c mice using the gst-7bdss-his protein. two weeks after the third injection a specific seroconversion was detected. the mouse with the best immune response received an additional boost with 7b-his. splenocytes isolated from immunized mice were fused with myeloma cells and cell culture supernatants were screened by elisa using the gst-7bdss-his as capture antigen. gst-yfp-his was used as a control antigen to identify and exclude clones producing antibodies specific for one of the tags. two supernatants were confirmed to contain antibodies that bind to gst-7bdss-his but not the gst-yfp-his control protein.
the respective hybridoma cells were subcloned twice and continued to produce 7b-specific antibodies as determined by elisa. antibodies produced by the two stable hybridoma cell lines (5b6 [igg2a], 14d8 [igg1]) were subsequently tested by western blotting. both mabs recognized the recombinant proteins 7b-his (fig. 2a) and gst-7bdss-his (data not shown). the data also suggest that the formation of the putative 7b dimer discussed above ( fig. 1c and text) involves one or more intermolecular disulfide bonds since the higher molecular mass band of >40 kda was not detectable if 2-mercaptoethanol was included in the protein sample buffer (fig. 2b) . to determine whether the two fcov serotype ii (strain 79-1146) 7b protein-specific mabs recognize epitopes that are conserved among serotype i and ii strains, the mab reactivities against the well-characterized serotype i strain black were analyzed. the 7b proteins of strains black and 79-1146 share 90% amino acid sequence identity. however, strain black contains an internal translation stop codon in orf 7b, resulting in a c-terminally truncated 7b protein. we therefore restored the orf 7b of fcov strain black and expressed the strain black 7b peptide sequence as part of a recombinant fusion protein (gst-7bdstop-his) in e. coli. western blot analysis of lysates obtained from iptg-induced e. coli cells transformed with the appropriate expression plasmid confirmed that the two mabs recognized the recombinant 7b protein of strain black (data not shown), confirming that the two mabs obtained in this study recognize epitopes that are conserved among the 7b proteins of representative fcov serotype i and ii strains. to further characterize the newly generated mabs, we sought to determine the binding sites within the 7b protein. as shown above, both mabs recognized 7b-his in western blot experiments after sds-page under non-reducing and reducing conditions, suggesting that they recognize a linear epitope on the 7b protein. 
first, we produced a set of bacterially expressed segments of 7b. appropriate coding sequences were cloned into the pc23.6 plasmid and expressed with n-terminal gst and c-terminal histidine tags. mab reactivities against these proteins were tested by western blotting (fig. 3). all proteins containing aa residues 58-111 of 7b (pc72.1, pc88.1, pc89.2, pc90.1, pc91.5, pc92.9) were detected by the mabs. in contrast, proteins containing residues 112-206 (pc71.2) or 65-206 (pc82.1) were not recognized. furthermore, a protein containing residues 1-75 of 7b (pc114.1) was recognized while none of the slightly smaller proteins that lacked a few more residues at the carboxyl terminus (pc97.4, pc98.1, pc102.3, pc108.2) were recognized. the combined results summarized in fig. 3 led us to conclude that both mabs recognize a peptide segment that includes residues 58 to 75 of the 7b protein. to further corroborate this hypothesis, a synthetic peptide coupled to bsa (bsa-7b-pep, c-58rvecegiegfnctwpgfq75) was used in an additional set of experiments. to resolve problems that occurred during peptide synthesis, an intramolecular disulfide bond was introduced.

fig. 2. (a) specificities of 5b6 and 14d8 were characterized using total lysates of noninduced (−) and iptg-induced (+) e. coli rosetta cells transformed with the appropriate 7b-his expression plasmid (see materials and methods). proteins were separated by sds-page (10%) under nonreducing conditions and antibody reactivities were analyzed by western blotting using anti-7b mabs (5b6 and 14d8, respectively) and anti-his mab (control) as indicated. (b) the purified protein 7b-his/e100 was separated by sds-page (10%) in the absence (−2-me) or presence (+2-me) of the reducing agent 2-mercaptoethanol. western blot analysis was performed using anti-7b mab 5b6. arrowheads indicate the recombinant protein 7b-his. e100: eluate fraction obtained with 100 mm imidazole.
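the reasoning behind the truncation mapping can be sketched in a few lines of python. this is a minimal illustration and not the authors' procedure: the fragment boundaries and western blot results are taken from the text, whereas the function name map_epitope and the simplifying assumption that a single linear epitope must lie entirely within every reactive fragment are ours.

```python
# (first aa, last aa, recognized by the mabs in western blot), as reported in the text
FRAGMENTS = {
    "pc71.2":  (112, 206, False),
    "pc72.1":  (1, 111, True),
    "pc82.1":  (65, 206, False),
    "pc88.1":  (26, 206, True),
    "pc89.2":  (34, 206, True),
    "pc90.1":  (42, 206, True),
    "pc91.5":  (50, 206, True),
    "pc92.9":  (58, 206, True),
    "pc97.4":  (1, 60, False),
    "pc98.1":  (1, 63, False),
    "pc102.3": (1, 67, False),
    "pc108.2": (1, 70, False),
    "pc114.1": (1, 75, True),
}


def map_epitope(fragments):
    """Return the (start, end) segment shared by all reactive fragments.

    Assumes one linear epitope that must be fully present for binding.
    """
    reactive = [(s, e) for s, e, bound in fragments.values() if bound]
    start = max(s for s, _ in reactive)   # latest N-terminal start still bound
    end = min(e for _, e in reactive)     # earliest C-terminal end still bound
    if start > end:
        raise ValueError("reactive fragments share no common segment")
    # consistency check: no non-reactive fragment may contain the whole segment
    for name, (s, e, bound) in fragments.items():
        if not bound and s <= start and e >= end:
            raise ValueError(f"{name} contains the segment but was not recognized")
    return start, end


print(map_epitope(FRAGMENTS))
```

the segment obtained this way is an upper bound on the epitope: residues inside it may still be dispensable for binding, which finer truncations or substitutions would have to resolve.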
to further characterize the mab binding properties, bsa-7b-pep was used as antigen in an elisa that was performed under non-reducing and reducing conditions, respectively. bsa-7b-pep was incubated with or without dtt before the elisa plates were coated with the peptide. bsa-7b-clvg and purified gst-7bdss-his fusion protein were used as negative and positive control, respectively. as shown in fig. 4, mabs 5b6 and 14d8 were confirmed to bind specifically to bsa-7b-pep in the absence and presence of dtt. the data provide direct evidence that both mabs recognize a region in the 7b protein that spans residues 58 to 75.

3.5. comparative sequence analysis of the peptide region recognized by the mabs

as shown above, the epitope(s) recognized by both mabs are located within the amino acid sequence 58 to 75 of the fcov 7b protein. this part of the 7b sequence is highly conserved within fcov serotypes i and ii with sequence identities of 78-94%. interestingly, this part of the sequence comprises the only n-glycosylation motif (68nct70) within the 7b sequence (fig. 5). this motif is conserved in almost all sequenced fcov strains.

fig. 3. determination of the 5b6 and 14d8 binding sites in the 7b protein. the scheme shows the full-length 7b construct and the truncated versions derived from this construct. all proteins were expressed as gst- and his-tag fusion proteins and mab reactivities were tested by western blotting under non-reducing conditions. +: specific binding detected; −: not detected.

fig. 4. reactivities of mabs 5b6 and 14d8 against bsa-7b-pep as determined by elisa. elisa plates were coated with bsa-7b-pep in the presence (+) or absence (−) of dtt. as controls, bsa-7b-clvg and gst-7bdss-his were used. the data represent three independent experiments; standard deviations are indicated.

in order to detect the authentic 7b protein with our newly generated anti-7b mabs, crfk cells were mock-infected or infected with fcov 79-1146 and analyzed by indirect immunofluorescence assay (ifa) 24 h post infection (p.i.). surprisingly, neither of the anti-7b mabs produced a positive signal, although there was a clear reactivity with an antiserum against fcov (data not shown). as the two mabs were shown previously to recognize fcov 7b fusion proteins in western blot experiments (fig. 2), we used this method to analyze cell lysates obtained at 24 h p.i. both mabs detected a protein with an apparent molecular mass of approximately 24 kda in fcov-infected cells, whereas no signal was detected in mock-infected cells (fig. 6a). it has been shown by others that orf 7b encodes a glycoprotein of 26 kda (vennema et al., 1992a). in order to investigate whether our mabs detect a glycosylated form of the 7b protein, pngase f treatment of cell lysates was performed. surprisingly, incubation with pngase f did not affect the migration behavior of the 7b protein (data not shown), indicating that the protein detected by the mabs represents a nonglycosylated form of the 7b protein. since the only n-glycosylation motif in 7b is located at aa position 68-70 (nct), we speculated that the n-glycosylation site might be part of the epitope(s) recognized by the mabs. accordingly, the epitope(s) may be masked in the glycosylated protein and thereby inaccessible for our antibodies. in order to test this hypothesis, the effect of an inhibitor of n-linked glycosylation was studied. crfk cells were infected with fcov 79-1146 or mock-infected and treated with tunicamycin at 4 h p.i. cell lysates were prepared at 15 h p.i. and probed by western blotting. the mabs recognized a single protein with an apparent molecular mass of 24 kda in the absence and presence of tunicamycin (fig. 6b). interestingly, the signal intensity obtained with the mabs was much stronger after inhibition of n-glycosylation. moreover, after treatment with tunicamycin, a dimer of 7b was recognized by both mabs which was not detectable in the absence of tunicamycin (fig.
6b) and disappeared after incubation with 2-mercaptoethanol (data not shown). these results support the idea that both mabs do not recognize the glycosylated form of fcov 7b. an anti-fcov serum, which served as positive control, detected the virus-specific proteins n and m (fig. 6b) . in addition, a protein band with an apparent molecular mass of 24 kda was detected by the anti-fcov serum. this protein may represent the nonglycosylated form of the m protein and/or the 7b protein. the results outlined above show that the anti-7b mabs recognize exclusively the nonglycosylated form of the viral protein in fcov-infected cells. this was particularly striking after treatment of the cells with tunicamycin. in light of this finding, another round of immunofluorescence experiments was performed with the mabs after infection with fcov 79-1146 and incubation with tunicamycin. as already mentioned, there was no staining after incubation with the mabs in the absence of tunicamycin. however, after tunicamycin treatment, clear signals were obtained in infected cells (fig. 7) . at this point, the observed cytoplasmic localization of 7b cannot be further narrowed down. an anti-fcov serum detected viral proteins in the cytoplasm of infected cells regardless of whether or not they were treated with tunicamycin ( fig. 7) . the genome of feline coronavirus (fcov) encodes five accessory proteins termed 3a-3c, 7a and 7b. among these, only 7b has been identified in virus-infected cells (vennema et al., 1992a) . the 7b protein was also expressed from a plasmid under the control of the t7 promoter in t7 polymerase-expressing vaccinia virus-infected cells. the produced 7b protein was identified with ascites from a fipv-infected cat. fcov 7b was shown to be a glycoprotein that to some extent may be secreted from infected cells. interestingly, a kdel-like sequence at its c-terminus has been shown to slow down the export of the protein (vennema et al., 1992a) . 
we are interested in accessory proteins encoded by fcov, in particular with respect to potential roles in (i) establishing persistent infections by enteric feline coronavirus in the gut and (ii) the pathogenesis of feline infectious peritonitis (pedersen, 2009). since there was already some information on fcov 7b, we first concentrated on this protein. for further characterization, including its unambiguous identification in fcov-infected cells, monoclonal antibodies against the 7b protein were generated. using n- and c-terminally truncated parts of the 7b protein expressed in bacteria as well as a synthetic peptide conjugated to bsa, the binding sites of two mabs were determined. according to this analysis, the mabs recognize a stretch of 18 amino acids (aa) between aa positions 58 and 75. interestingly, the only putative n-linked glycosylation site as well as two of the seven cysteines of the fcov 7b protein are contained within this sequence. this finding led to the question whether the mabs recognize only the nonglycosylated form of 7b. in fact, in fcov-infected cells both mabs recognized only a protein with an apparent molecular weight of 24 kda and not the previously described glycosylated form of 26 kda (vennema et al., 1992a). the exclusive recognition of the nonglycosylated form was supported by results obtained after tunicamycin treatment of fcov-infected cells. notably, immunofluorescence studies using the mabs led to negative results unless the fcov-infected cells were treated with tunicamycin. the difference from the western blot experiments, in which the nonglycosylated form could be readily detected without inhibition of glycosylation, most likely results from different sensitivities of the two methods. masking of epitopes by glycosylation has been shown for many other viral proteins including the hemagglutinin of influenza virus (skehel et al., 1984). the authors also used tunicamycin to demonstrate recognition by mabs in the absence of n-linked glycosylation.
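the observation that the 18-residue stretch (aa 58-75) contains an n-glycosylation motif and two cysteines can be checked directly on the peptide sequence given for bsa-7b-pep. the following is a minimal sketch: the sequence and its numbering are taken from the text, while the helper names and the use of the basic sequon definition n-x-s/t (x being any residue except proline) are our assumptions.

```python
import re

PEPTIDE = "RVECEGIEGFNCTWPGFQ"   # aa 58-75 of 7b, without the added n-terminal cysteine
FIRST_RESIDUE = 58               # position of the first residue in full-length 7b


def find_sequons(seq, first_residue=1):
    """Return (position, motif) for every N-X-[S/T] sequon with X != P.

    A lookahead is used so that overlapping motifs would also be reported.
    """
    return [(m.start() + first_residue, m.group(1))
            for m in re.finditer(r"(?=(N[^P][ST]))", seq)]


def cysteine_positions(seq, first_residue=1):
    """Return the positions of all cysteine residues."""
    return [i + first_residue for i, aa in enumerate(seq) if aa == "C"]


print(find_sequons(PEPTIDE, FIRST_RESIDUE))
print(cysteine_positions(PEPTIDE, FIRST_RESIDUE))
```

the scan only identifies a potential site; whether asparagine 68 is actually glycosylated has to be shown experimentally, as done here with pngase f and tunicamycin.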
accordingly, the antigenicity of influenza virus hemagglutinin and of other proteins can be modified by addition of carbohydrate chains. furthermore, since both mabs recognize the same or a nearby epitope in 7b, it can be assumed that the identified 18 aa stretch represents a highly antigenic site. masking of this region through glycosylation of asparagine 68 may prevent an immune response and could be relevant for establishment of persistent infection by fcov. this could be directly tested by the use of recombinant fcovs, where the n-linked glycosylation site of 7b is missing. it remains to be seen whether this finding has implications for vaccine development against fcov-induced diseases. another interesting aspect of our studies with fcov 7b concerns putative intra-and intermolecular disulfide bonds. in this context, it is noteworthy that expression of 7b in bacteria allowed the detection of disulfide linked dimers, which disappeared after treatment of extracts with a reducing agent. such dimeric forms were also detected in fcov-infected cells, but apparently only after treatment with tunicamycin. thus, it is tempting to speculate that intermolecular disulfide bridges represent an artifact only observed in the absence of glycosylation at asparagine 68 of fcov 7b. it will be interesting to determine whether one or both cysteines present in our synthetic peptide is/ are involved in intramolecular disulfide bonds. recognition of 7b by our mabs in the absence and presence of reducing agent may give some indirect evidence in this regard. pilot experiments actually indicate that our mabs recognize the nonreduced form of 7b from fcov-infected cells better than the reduced one; this result suggests that the native protein contains one or more intramolecular disulfide bridges, which are important for the antigenicity of 7b. 
another way to study the relevance of disulfide bonds for the structure of 7b is the exchange of cysteine codons in different expression systems together with western blots under nonreducing and reducing conditions. however, final proof for the presence of intramolecular disulfide bridges within fcov 7b will require the determination of its three-dimensional crystal structure. unfortunately, our mabs are not well suited to detect the secreted glycosylated form of 7b. to further study the synthesis, intracellular localization and secretion of 7b, we will use our reverse genetics system (tekes et al., 2008, 2012) to obtain recombinant viruses containing the desired changes in the 7b coding sequence. ultimately, these studies together with the three-dimensional structure of 7b will give more insight into the biological role of the 7b protein.

persistence and transmission of natural type i feline coronavirus infection
mutations of 3c and spike protein genes correlate with the occurrence of feline infectious peritonitis
a rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding
feline infectious peritonitis: insights into feline coronavirus pathobiogenesis and epidemiology based on genetic analysis of the viral 3c gene
sequence analysis of feline coronaviruses and the circulating virulent/avirulent theory
family coronaviridae
orf7-encoded accessory protein 7a of feline infectious peritonitis virus as a counteragent against ifn-alpha-induced antiviral response
genomic rna sequence of feline coronavirus strain fipv wsu-79/1146
enzyme-linked immunosorbent assay (elisa): quantitative assay of immunoglobulin g
live, attenuated coronavirus vaccines through the directed deletion of group-specific genes provide protection against feline infectious peritonitis
prevalence and implications of feline coronavirus infections of captive and free-ranging cheetahs (acinonyx jubatus)
the molecular genetics of feline coronaviruses: comparative sequence analysis of the orf7a/7b transcription unit of different biotypes
feline coronavirus type ii strains 79-1683 and 79-1146 originate from a double recombination between feline coronavirus type i and canine coronavirus
the prevalence of types i and ii feline coronavirus infections in cats
deletions in the 7a orf of feline coronavirus associated with an epidemic of feline infectious peritonitis
detection of feline coronavirus infection in southern african nondomestic felids
evaluation of antibodies against feline coronavirus 7b protein for diagnosis of feline infectious peritonitis in cats
fusion between immunoglobulin-secreting and nonsecreting myeloma cell lines
the molecular biology of coronaviruses
mutation in spike protein cleavage site and pathogenesis of feline coronavirus
gain, preservation, and loss of a group 1a coronavirus accessory glycoprotein
the molecular biology of coronaviruses
a review of feline infectious peritonitis virus infection: 1963-2008
tricine-sodium dodecyl sulfate-polyacrylamide gel electrophoresis for the separation of proteins in the range from 1 to 100 kda
a carbohydrate side chain on hemagglutinins of hong kong influenza viruses inhibits recognition by a monoclonal antibody
genome organization and reverse genetic analysis of a type i feline coronavirus
a reverse genetics approach to study feline infectious peritonitis
emergence of pathogenic coronaviruses in cats by homologous recombination between feline and canine coronaviruses
a novel glycoprotein of feline infectious peritonitis coronavirus contains a kdel-like endoplasmic reticulum retention signal
genomic organization and expression of the 3′ end of the canine and feline enteric coronaviruses
feline infectious peritonitis viruses arise by mutation from endemic feline enteric coronaviruses

this study was supported by the "bundesministerium für bildung und forschung" of the german government (zoonosis network, consortium on
ecology and pathogenesis of sars, project code 01ki1005a-f) as well as by the collaborative research centre 1021: "rna viruses: rna metabolism, host response and pathogenesis". key: cord-261472-qcu73sdu authors: yao, yong xiu; ren, junyuan; heinen, paul; zambon, maria; jones, ian m. title: cleavage and serum reactivity of the severe acute respiratory syndrome coronavirus spike protein date: 2004-07-01 journal: j infect dis doi: 10.1086/421280 sha: doc_id: 261472 cord_uid: qcu73sdu severe acute respiratory syndrome (sars) coronavirus (scov) spike (s) protein is the major surface antigen of the virus and is responsible for receptor binding and the generation of neutralizing antibody. to investigate scov s protein, full-length and individual domains of s protein were expressed on the surface of insect cells and were characterized for cleavability and reactivity with serum samples obtained from patients during the convalescent phase of sars. s protein could be cleaved by exogenous trypsin but not by coexpressed furin, suggesting that the protein is not normally processed during infection. reactivity was evident by both flow cytometry and western blot assays, but the pattern of reactivity varied according to assay and sequence of the antigen. the antibody response to scov s protein involves antibodies to both linear and conformational epitopes, with linear epitopes associated with the carboxyl domain and conformational epitopes associated with the amino terminal domain. recombinant scov s protein appears to be a suitable antigen for the development of an efficient and sensitive diagnostic test for sars, but our data suggest that assay format and choice of s antigen are important considerations. and is responsible for virus attachment to receptors, entry by fusion, and the development of neutralizing antibody [9] . 
of interest, most of the residue changes identified within s protein lie in a region that, in other covs, causes significant alteration in tropism [10], suggesting that drift toward a virus more capable of using a human cell-surface receptor could occur. the structure and function of scov s protein are, therefore, an integral part of studies of viral tropism and of the development of a receptor-blocking antibody response. during the convalescent phase of sars, antibody against the whole virus can be detected at day 10 and increases to a peak titer by day 28 [11]. however, the spectrum of the antibody response, its role in viral clearance, and its use as a correlate of protection have not been described. because scov s protein is such a pivotal viral protein, we have applied a high-throughput expression strategy to the production of recombinant s protein for biochemical and antigenic characterization. here, we describe high-level expression of 6 overlapping fragments of s protein on the surface of insect cells, a context that allows high-level production and maintains sensitive conformation [12, 13]. recombinant s protein showed strong serum reactivity in convalescent-phase serum samples by use of both flow cytometry and western blotting (wb), suggesting that it could be used in the development of sars-specific diagnostic techniques. reactivity was not equivalent between fluorescence-activated cell sorting (facs) and wb, however, and was not equal on all fragments, providing evidence of a variety of antibodies in the host response.

cell culture. spodoptera frugiperda (sf9) cells were routinely cultured in sf900-ii medium (life sciences) at 28 °c in suspension culture. for plaque assays, cells were allowed to settle onto polystyrene dishes for 1 h before virus inoculation.
after virus adsorption, cells were sequentially overlaid with 1.6% low-melting-temperature agarose and then with sf900-ii supplemented with 5% fetal calf serum and were cultured for 4 more days before staining with neutral red.

vector construction. a cdna encoding the full-length s protein minus the signal peptide and transmembrane (tm) domain (aa 18-1193) of the hong kong isolate of scov [7] was supplied by andrew davidson and stuart siddell (university of bristol, bristol, uk) as a reverse-transcription polymerase chain reaction-amplified fragment and was first cloned into ptopo (invitrogen). several overlapping fragments of s protein, with end points chosen after bioinformatics analysis, were amplified from the original clone. all fragments generated were flanked by dissimilar restriction sites for the enzyme sfii. each fragment was cloned into a set of 3 vectors (18 variants in all), to provide his-tagged, maltose binding protein-tagged, and cell surface-displayed forms of s protein. only the cell surface-displayed cassette is described in detail here. the baculovirus display transfer vector pacvsv gtm [12], modified to include sfii sites at the junction of the signal peptide and the vesicular stomatitis virus (vsv) g protein tm domain, was used to provide expression at the cell surface.

expression in insect cells. for expression in insect cells, recombinant baculoviruses were constructed by use of a new, rapid recombination technique after cotransfection of each transfer vector with a linearized, modified form of bacmid dna capable of growth only after recombination [14]. coexpression of mouse furin or human calnexin was done by coinfection of viruses at mois 15 [15].

wb. protein samples to be analyzed were separated on precast 10% tris-hcl sds-polyacrylamide gels (bio-rad) and transferred onto immobilon-p transfer membranes (millipore).
wb was performed with human convalescent-phase serum or rabbit serum to the vsv g protein tm domain (research diagnostics), followed by peroxidase-coupled secondary antibodies (sigma). the membrane was finally developed by use of bm chemiluminescence (roche). facs analysis (flow cytometry). cells were plated in a 6-well plate (1 × 10^6 cells/well) and overlaid with 2 ml of medium. after virus inoculation, cells were cultured for a further 2 days at 28°c, harvested, washed with pbs, fixed with cellfix (becton dickinson), and stained with serum and conjugate. data acquisition and analysis were performed by use of a facscan flow cytometer and cellquest software (both from becton dickinson). human serum. serum samples from patients with sars and human cov 229e were obtained from the respiratory disease reference laboratory of the health protection agency (colindale, uk). nine probable sars cases occurred in the united kingdom, but only 4 conformed to the clinical definition of sars in use at the time. serum samples from patients known to be positive for sars were derived from the patients with clinically confirmed cases, whereas serum samples used as probable but unconfirmed sars cases were derived from the remaining 5 patients. all patients recovered from sars. no human cov oc43 serum samples were available for the present study, so the reactivity of cov oc43 with sars s protein could not be assessed. similarly, only 2 serum samples from cov 229e-infected patients were available. serum samples were used in facs and wb analyses at 1:50 and 1:400 dilutions, respectively, unless otherwise stated. the time-course serum samples, available from 1 of the patients with confirmed sars, were assayed by use of elisa using lentil lectin-captured, full-length s protein, and the 50% end-point titer was compared. the class of antibody bound was ascertained by use of class-specific, secondary anti-human antibodies (sigma). 
the scov s protein is ∼1200 residues and has 23 glycosylation triplets clustered toward the amino and carboxyl termini. because glycosylation and protein folding are often linked, we focused our expression studies on the use of a high-yielding but eukaryotic-based expression system: baculovirus-mediated expression in insect cells. we examined a number of biochemical features, to characterize the protein expression before use for antibody binding. the s protein signal peptide and tm domain were replaced with regions known to function well in this expression system: a signal peptide from the major baculovirus glycoprotein gp64 and a tm domain from the vsv g protein already present in pacvsv gtm [12]. this vector efficiently displays proteins on the surface of insect cells and is a method of expression that maintains the conformational integrity of even problematic proteins [13]. in addition to the full-length s protein, fragments representing the predominantly β-sheet and α-helical domains (residues 18-713 and 714-1193, respectively), a fragment from the n terminus (residues 18-410) predicted to form a distinct folded domain when analyzed by folding prediction software [16], and fragments representing residues 411-1193 and 411-713 were expressed in the same way (figure 1a). abundant expression of all s protein fragments was achieved by use of this strategy, as shown by wb with a vsv g tm domain antibody that showed 2 predominant bands for each construct representing nonglycosylated and glycosylated proteins, with apparent molecular weights consistent with those predicted (figure 1b). interaction with calnexin and s protein cleavage. the s proteins of all covs are heavily glycosylated, and improper glycan trimming can result in poor surface-expression levels through retention of the viral glycoprotein by the glycoprotein chaperone calnexin [17, 18]. 
to assess whether the calnexin pathway was relevant to expression of s protein, we coinfected recombinant full-length s protein with a virus-expressing calnexin and re-examined the level of glycosylated and nonglycosylated protein at 3 days after infection. compared with expression of s protein only, coexpression of s protein with calnexin led to an increase in the ratio of glycosylated to nonglycosylated product, calculated by use of gel scan to be ∼20% ( figure 2a) , showing that s protein interacts with calnexin during glycosylation. however, no overall increase in s protein yield was observed, despite improved glycoprotein processing. some, but not all, cov s proteins are cleaved around the center of the molecule to form the receptor-binding outer domain s1 and the inner-membrane fusion domain s2 [19, 20] . the full-length s protein was not cleaved in insect cells, with the fully glycosylated form having an apparent molecular weight of ∼180 kda ( figure 1b) . we assessed the potential for s protein to be cleaved by 2 proteases, the subtilisin-like furin (provided by coinfection, as described elsewhere [15] ) and by trypsin (added exogenously). furin is a classic viral glycoprotein maturation enzyme [21] . the consensus furin cleavage site does not occur in the s protein sequence, but several dibasic sites, which also can be recognized by the enzyme [15] , are present, allowing the possibility of furin cleavage. recombinant s protein produced in the presence of coexpressed furin showed no evidence of cleavage (figure 2b), suggesting that furin is not associated with scov envelope maturation. s protein on the surface of infected insect cells was also treated with trypsin under conditions found to cleave murine cov s [19] and was reanalyzed after the cleavage reaction, by wb using the vsv g tm domain and patients' antibody. 
treatment with trypsin caused the loss of the ∼180-kda glycosylated s protein band and concomitant appearance of 2 new bands, the most intense of which was at ∼70 kda and whose mobility was indistinguishable from that shown by s-f3r1 (figures 1b and 2c). the f3/r3 junction, which was chosen wholly on the basis of predicted secondary structure, is at residue 713, and a single lys residue occurs at position 714. on the basis of these data, we suggest that trypsin can access the s protein under nondenaturing conditions, to cleave at lys 714 and produce a β-sheet-rich s1 domain and an α-helix-rich s2 domain. interestingly, molecular modeling of the scov s protein has also suggested that the s1-s2 junction lies within the amino acid sequence 680-727 [22]. no s protein was found in the nonpellet fraction after the addition of trypsin, suggesting that the 2 s protein domains remain associated even after cleavage (data not shown). when probed with patients' serum samples, the presumed s2 domain at ∼70 kda was also highlighted, but there were also a number of smaller breakdown products. no distinct band at the predicted size of the s1 domain was visible, suggesting that it is degraded by trypsin or is not detected efficiently by the available human serum samples (see below). whether s protein is cleaved before or during scov cell entry remains undetermined. reactivity with human serum samples. the abundant, stable, and characterized surface expression of a suite of s protein-related fragments prompted us to use these reagents to investigate the antibody response to s protein in the serum samples from infected individuals. serum samples used were obtained from several uk patients (for patients' details, see materials and methods); in addition, serum samples representing a time course from disease onset were obtained from 1 reference patient. 
we assessed the reactivity of each serum sample with each fragment by use of flow cytometry and wb, to represent native and denatured sources of antigen, respectively. cytometry profiles showed the greatest reactivity of patients' serum samples with the proteins s-f1r2 and s-f1r1, but all other fragments also reacted, with the exception of s-f2r3 (aa 411-713), which represents the middle section of the s coding region and failed to react significantly with most patients' serum samples, despite high levels of expression (figure 1b). the same general pattern of binding was apparent for all the patients' serum samples, although the exact degree of reactivity with each fragment varied somewhat (of the 2 serum samples shown in figure 3, the data for serum sample 1 were the more typical pattern). when serum reactivity was assessed by use of wb, however, the pattern of binding was substantially different. reactivity was seen with the full-length protein s-f1r1, but strong binding was also seen with s-f2r1 and s-f3r1 (figure 3, bottom). one serum sample (2908, the most broadly reactive of all the samples) showed some reactivity with s-f1r2, s-f1r3, and s-f2r3, but this reactivity was very weak, compared with that for fragments spanning the c-terminal half of the s protein, and was wholly absent in other serum samples (typified by the wb using serum sample 1 in figure 3, bottom). quantitation of the serum response to s protein by densitometry (wb) and relative fluorescence (facs) gave the reactivity orders f3r1 > f2r1 > f1r1 >>> others and f1r1 = f1r2 > f1r3 = f2r1 = f3r1, respectively (figure 4). thus, there was a clear shift in reactivity with the same serum samples, depending on the assay format. specificity of serum reactivity with s protein fragments. 
the possible use of insect cell-displayed s protein for diagnostic application was assessed by examining fragment reactivity with serum samples from patients infected with human cov 229e and also with serum from a patient with suspected but clinically unconfirmed sars (serum sample 3118). preliminary data have indicated that serum samples from some cov 229e-infected individuals can cross-react with purified scov nucleocapsid protein (p.h. and m.z., unpublished data), so the use of a more specific test for seroconversion, based on s protein, may be valuable. serum samples from 2 patients with confirmed cov 229e infection did not react with s-f1r2 by use of facs (figure 5), whereas the serum sample from a patient with suspected sars showed a weak but clear shift in fluorescence. wb could also distinguish the serum samples, although the discrimination between the samples was not as good as that achieved by use of facs (data not shown), suggesting that the most discriminatory test in uncertain cases should use optimized s protein fragments presented in a nondenaturing assay format. time course of antibody response. a set of serum samples representing a time course from 6 to 40 days after the onset of sars for 1 patient was obtained and used in an s protein-specific elisa, to determine the increase in s protein titer over time. the titer of s protein antibodies was significant from day 10, similar to the serum responses reported for whole infected cells [11] and isolated n protein [23] , and increased to the latest time point (40 days after onset) ( figure 6 ). serum samples were also used in wb assays to examine the changes of reactivity with each individual s protein fragment over time. the earliest wb-positive serum sample (10 days after onset) gave a pattern of binding similar to that of the other serum samples tested (figures 3 and 6). the 40-day time point provided a much stronger signal, but the pattern of binding was not altered ( figure 6 ). 
blots were also probed with isotype-specific conjugates, and the results were compared with blots probed with an igg-specific conjugate. we found evidence for the development of some igm response late in the recovery period, but it was weak, compared with igg (data not shown). only serum antibodies were available for the present study, and it remains possible that secreted antibodies are also present in respiratory secretions early during the infection. our data provide the first profiles of antibody binding to the scov s protein. efficient expression of the s protein as a specific source of antigen was achieved by use of a highly productive expression system, recombinant baculoviruses, combined with the use of signal and tm sequences proven to efficiently direct proteins to the surface of the expressing cell [12] . this format offered assays based on antibody binding to the cell surface, mimicking infected cells but with singularly high levels of expression of s protein, as well as denatured antigen prepared by cell lysis. characterization of the expressed products provided some evidence for potential interaction with a known glycoprotein chaperone (calnexin), although the stimulation in glycan processing was marginal (20% of the total), and for cleavage of s protein by trypsin-like, but not furin, proteases. cleavage of surface glycoproteins is a key factor in pathogenesis for some viruses [24, 25] , and it will be interesting to evaluate the role of s protein cleavage, if any, in scov infectivity in both animal and human hosts. in the present study, full-length s protein on the surface of insect cells did not bind a variety of species of red blood cells, suggesting that it does not hemagglutinate. s protein released by detergent lysis also failed to bind to any discrete proteins in far-western blot assay of vero cell membranes (data not shown). 
accordingly, our data do not address the nature of the receptor for scov, in particular the early suggestion that it may be cd13 [14] . recent data show angiotensin-converting enzyme 2 to be at least 1 functional scov receptor [26] . efficient sources of several recombinant s protein fragments allowed us to evaluate the predominant antibodies present in convalescent serum samples. interestingly, the patterns of s reactivity with available serum samples was similar within each assay format, suggesting that the immune response to s protein varies only quantitatively between individuals. this result would be in keeping with the apparently low immunological pressure suggested from the sequence variation observed in several isolates [7] . of the 2 assay formats we used, nondenatured s protein present on the cell surface provided the most sensitive detection of antibodies, with clear shifts in fluorescence for serum samples from patients with suspected but clinically unconfirmed sars. no shift was apparent with human cov 229e serum samples, suggesting that this format is highly specific. when reactivity of positive serum samples to individual s protein fragments was compared, we observed a strong differential binding, depending on the assay used. thus, although facs showed the highest reactivity with fragments, including the amino terminus of s protein, wb with the same serum samples showed preferential reactivity to the carboxyl terminal half of the molecule. differential antibody binding was not influenced by the high-mannose glycans present on insect-derived glycoproteins, because s protein prepared in mimic cells (invitrogen), which add complex glycans to the polypeptide [27] , showed the same pattern of reactivity with the serum samples used (data not shown). 
our data, which were obtained with only the few serum samples from patients with sars available in the united kingdom, clearly require confirmation with larger sets of serum samples but would be consistent with a level of conformational antibody present in the serum samples preferentially directed toward the globular n terminus of the protein and detected by use of facs but not by use of wb. this class of antibody includes those most useful for diagnosis (see above), suggesting that the most suitable assay format for a final diagnostic will be best configured with authentically folded protein, rather than, for example, peptides. our preliminary work with purified soluble full-length s protein suggests that sensitivity is maintained with the soluble protein, indicating that assay formats simpler than flow cytometry will be feasible. confirmation that the class of conformational antibody directed at the globular s1 domain includes neutralizing antibody, and whether it has any bearing on the outcome of infection, will be important if recombinant s protein is also to be considered as part of a possible vaccine for scov infection. 
[1] identification of a novel coronavirus in patients with severe acute respiratory syndrome
[2] koch's postulates fulfilled for sars virus
[3] newly discovered coronavirus as the primary cause of severe acute respiratory syndrome
[4] the genome sequence of the sars-associated coronavirus
[5] characterization of a novel coronavirus associated with severe acute respiratory syndrome
[6] unique and conserved features of genome and proteome of sars-coronavirus, an early split-off from the coronavirus group 2 lineage
[7] comparative full-length genome sequence analysis of 14 sars coronavirus isolates and common mutations associated with putative origins of infection
[8] isolation and characterization of viruses related to the sars coronavirus from animals in southern china
[9] sars coronavirus: a new challenge for prevention and therapy
[10] pathogenesis of chimeric mhv4/mhv-a59 recombinant viruses: the murine coronavirus spike protein is a major determinant of neurovirulence
[11] clinical progression and viral load in a community outbreak of coronavirus-associated sars pneumonia: a prospective study
[12] non-polar distribution of green fluorescent protein on the surface of autographa californica nucleopolyhedrovirus using a heterologous membrane anchor
[13] baculovirus surface display of theileria parva p67 antigen preserves the conformation of sporozoite-neutralizing epitopes
[14] putative hapn receptor binding sites in sars-cov spike protein
[15] legitimate and illegitimate cleavage of human immunodeficiency virus glycoproteins by furin
[16] why are "natively unfolded" proteins unstructured under physiologic conditions?
[17] role of n-linked oligosaccharide recognition, glucose trimming, and calnexin in glycoprotein folding and quality control
[18] folding and oligomerization of influenza hemagglutinin in the er and the intermediate compartment
[19] conformational changes in the spike glycoprotein of murine coronavirus are induced at 37°c either by soluble murine ceacam1 receptors or by ph 8
[20] the coronavirus spike protein is a class i virus fusion protein: structural and functional characterization of the fusion core complex
[21] furin at the cutting edge: from protein traffic to embryogenesis and disease
[22] molecular modelling of s1 and s2 subunits of sars coronavirus spike glycoprotein
[23] antibody response of patients with severe acute respiratory syndrome (sars) to nucleocapsid antigen of sars-associated coronavirus
[24] furin: a mammalian subtilisin/kex2p-like endoprotease involved in processing of a wide variety of precursor proteins
[25] the pathogenesis of influenza in humans
[26] angiotensin-converting enzyme 2 is a functional receptor for the sars coronavirus
[27] engineering the protein n-glycosylation pathway in insect cells for production of biantennary, complex n-glycans
we thank robin gopal (health protection agency, colindale, uk), for providing the original viral rna; andrew davidson and stuart siddell (university of bristol, bristol, uk), for providing spike protein cdna; malik peiris (university of hong kong, hong kong), for distribution of source materials; christian drosten (bernhard nocht institute, hamburg, germany); and members of the virology group, for constructive criticism. 
key: cord-264031-0y7xbgun authors: wierbowski, shayne d.; liang, siqi; chen, you; andre, nicole m.; lipkin, steven m.; whittaker, gary r.; yu, haiyuan title: a 3d structural interactome to explore the impact of evolutionary divergence, population variation, and small-molecule drugs on sars-cov-2-human protein-protein interactions date: 2020-10-13 journal: biorxiv doi: 10.1101/2020.10.13.308676 sha: doc_id: 264031 cord_uid: 0y7xbgun the recent covid-19 pandemic has sparked a global public health crisis. vital to the development of informed treatments for this disease is a comprehensive understanding of the molecular interactions involved in disease pathology. one lens through which we can better understand this pathology is through the network of protein-protein interactions between its viral agent, sars-cov-2, and its human host. for instance, increased infectivity of sars-cov-2 compared to sars-cov can be explained by rapid evolution along the interface between the spike protein and its human receptor (ace2) leading to increased binding affinity. sequence divergences that modulate other protein-protein interactions may further explain differences in transmission and virulence in this novel coronavirus. to facilitate these comparisons, we combined homology-based structural modeling with the eclair pipeline for interface prediction at residue resolution, and molecular docking with pyrosetta. this enabled us to compile a novel 3d structural interactome meta-analysis for the published interactome network between sars-cov-2 and human. this resource includes docked structures for all interactions with protein structures, enrichment analysis of variation along interfaces, predicted δδg between sars-cov and sars-cov-2 variants for each interaction, predicted impact of natural human population variation on binding affinity, and a further prioritized set of drug repurposing candidates predicted to overlap with protein interfaces†. 
all predictions are available online† for easy access and are continually updated when new interactions are published. † some sections of this pre-print have been redacted to comply with current biorxiv policy restricting the dissemination of purely in silico results predicting potential therapies for sars-cov-2 that have not undergone thorough peer-review. the results section titled “prioritization of candidate inhibitors of sars-cov-2-human interactions through binding site comparison,” figure 4, supplemental table 9, and all links to our web resource have been removed. blank headers are left in place to preserve structure and item numbering. our full manuscript will be published in an appropriate journal following peer-review.
the ongoing global covid-19 pandemic caused by the infection of sars-cov-2 has to date infected
impact of mutations on interaction binding affinity and performed a comparison of protein-protein and protein-drug binding sites. we compile all results from our structural interactome into a user-friendly web server allowing for quick exploration of individual interactions or bulk download and analysis of the whole dataset. further, we explore the utility of our interactome modeling approach in identifying key interactions undergoing evolution along viral protein interfaces, highlighting population variants on human interfaces that could modulate the strength of viral-host interactions to confer protection from or susceptibility to covid-19, and prioritizing drug candidates predicted to bind competitively at viral-human interaction interfaces.
enrichment of divergence between sars-cov and sars-cov-2 at spike-ace2 binding interface
to highlight the utility of computational and structural approaches to model the sars-cov-2-human interactome, we first examined the interaction between the sars-cov-2 spike protein (s) and human angiotensin-converting enzyme 2 (ace2) (fig 1.a). 
this interaction is key for viral entry into human cells [3] and is the only viral-human interaction with solved crystal structures available in both sars-cov [47] and sars-cov-2 [48] [49] [50]. comparison between sars-cov and sars-cov-2 revealed that sequence divergence of the s protein was highly enriched at the s-ace2 interaction interface (fig 1.a; log2 odds ratio = 2.82, p = 1.97e-5), indicating functional evolution around this interaction. to explore the functional impact of these mutations on this interaction, we leveraged the rosetta energy function [51] to estimate the change in binding affinity (δδg) between the sars-cov and sars-cov-2 versions of the s-ace2 interaction (fig 1.b and 1.c). the predicted negative δδg value of -14.66 rosetta energy units (reu) indicates an increased binding affinity using the sars-cov-2 s protein driven by better optimized solvation and hydrogen bonding potential fulfillment along the ace2 interface. our result is consistent with the hypothesis that increased stability of the s-ace2 interaction is one of the key reasons for elevated transmission of sars-cov-2 [52]. moreover, recent experimental energy kinetics assays have shown that sars-cov-2 s protein binds ace2 with 10-20-fold higher affinity than that of sars-cov s protein [53], supporting the conclusions from our computational modeling.
among individuals [6, 7, 54]. several hypotheses for genetic predisposition models have been proposed, including that expression quantitative trait loci (eqtls) may up- or down-regulate host response genes and that functional coding variants may alter viral-human interactions [55, 56]. 
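the interface enrichment reported above for the s-ace2 interaction (a log2 odds ratio with an associated p-value) can be illustrated with a 2x2 table of residues split by divergent versus conserved and on-interface versus off-interface. the sketch below is only an illustration: the counts are hypothetical, the function names are ours, and the use of a one-sided fisher exact test is an assumption, since the statistical test is not named in the text.

```python
import math

def log2_odds_ratio(div_on, div_off, cons_on, cons_off):
    """log2 odds ratio for divergent residues falling on the interface."""
    return math.log2((div_on * cons_off) / (div_off * cons_on))

def fisher_enrichment_p(div_on, div_off, cons_on, cons_off):
    """One-sided Fisher exact p-value (assumed test, not stated in the source):
    probability of seeing at least `div_on` divergent residues on the interface
    if divergent residues were placed at random."""
    n_div = div_on + div_off            # all divergent residues
    n_on = div_on + cons_on             # all interface residues
    total = div_on + div_off + cons_on + cons_off
    denom = math.comb(total, n_on)
    upper = min(n_div, n_on)
    tail = sum(
        math.comb(n_div, k) * math.comb(total - n_div, n_on - k)
        for k in range(div_on, upper + 1)
    )
    return tail / denom

# hypothetical counts, chosen only to make the arithmetic easy to follow
lor = log2_odds_ratio(div_on=8, div_off=2, cons_on=2, cons_off=8)
p = fisher_enrichment_p(div_on=8, div_off=2, cons_on=2, cons_off=8)
print(lor, p)
```

a positive log2 odds ratio means divergent residues are over-represented on the interface, and a negative value means they are depleted, which is the direction reported interactome-wide in the text.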
for instance, a recent rna-
in order to add a structural component to our interactome map, and thereby enable modeling of the binding affinity for these interactions, we additionally performed docking in pyrosetta using our eclair interface likelihood predictions to refine the search space (supplemental
after constructing the 3d interactome between sars-cov-2 and human, we first looked for evidence of interface-specific variation by mapping both gnomad [58] reported human population variants (supplemental table 3) and sequence divergences between sars-cov and sars-cov-2 (supplemental table 4) onto the predicted interfaces. in general, conserved residues have been shown to cluster at protein-protein interfaces [63], and a recent analysis of sars-cov-2 structure and evolution likewise concluded that highly conserved surface residues were likely to drive protein-protein interactions [64]. consistent with these prior findings at an interactome-wide level, we observed significant depletion for both viral and human variation along the predicted interfaces comparable to that observed on solved human-human interfaces (fig 2.a). nonetheless, considering each interaction individually, our analysis uncovered 13 interaction interfaces enriched for human population variants (fig 2.b), and 7 enriched for recent viral sequence divergences (fig 2.c). a breakdown of variant enrichment on each interface is provided in supplemental table 5. the individual viral interfaces showing an unexpected degree of variation may, like the previously discussed s-ace2 interface, be indicative of recent functional evolution around the viral-human interaction. considering the slower rate of evolution in humans, enrichment of population variants along the human interfaces is unlikely to be a selective response to the virus. 
rather, these interfaces with high population variation along the interfaces may represent edges in the interactome whose strength may fluctuate among individuals or between populations. alternatively, enrichment and depletion of variation along the human-viral interfaces could help distinguish viral proteins that bind along existing (and therefore conserved) human-human interfaces from those that bind using novel interfaces that would be less likely to be under selective pressure. to further explore the functional impact of naturally occurring variants on the human interactors of sars-cov-2, we considered variants with phenotypic associations as reported in hgmd [65], clinvar [66], or the nhgri-ebi gwas catalog [67]. interactors of sars-cov-2 were significantly more likely than the rest of the human proteome to harbor phenotypic variants in each of these databases (fig 2.d). notably, among the individual disease categories enriched in this gene set, several were consistent with reported comorbidities including heart disease, respiratory tract disease, and metabolic disease [68, 69] (fig 2.e; supplemental table 6). disruption of native protein-protein interactions is one mechanism of disease pathology, and disease mutations are known to be enriched along protein interfaces [70, 71]. human population variants on the predicted human-viral interface were more likely to be annotated as deleterious by sift [72] and polyphen [73] but showed identical allele frequency distributions compared to those off the interfaces (supplemental figure 4). however, mapping annotated disease mutations onto the protein interfaces only revealed significant enrichment along known human-human interfaces; no such enrichment was found on human-viral interfaces (fig 2.f). 
this is likely because, unlike with human-human interactions, mutations disrupting human-viral interactions would not disrupt natural cell function, and therefore would be unlikely to be pathogenic. our finding that disease mutations and viral proteins affect human proteins at distinct sites is consistent with a two-hit hypothesis of comorbidities whereby proteins whose function is already affected by genetic background may be further compromised by viral infection.
we next sought to explore the impact of sequence divergence in sars-cov-2 relative to sars-cov on viral-human interactions. mutations between the two viruses were identified by pairwise alignment and the impacts of these mutations on the binding energy (δδg) for 250 interactions amenable to docking were predicted using a pyrosetta pipeline [46, 59, 60]. although the binding energy for most interactions was unchanged, either because no mutations occurred near the interface or because the mutations that did had marginal effect, we observed an increased likelihood of the divergence from sars-cov to sars-cov-2 resulting in decreased binding energy (i.e. more stable interaction) (fig 3.a; supplemental table 7). the significant outliers in these δδg predictions can help pinpoint key differences between the viral-human interactomes of sars-cov and sars-cov-2. we further note a wide range of affinity impacts among various human interactors of a single viral protein (fig 3.d) and hypothesize that these differences may help identify the most important interactions. to further explore the significance of these changes in interaction affinity, we considered those interactions with the largest decrease in binding energy, corresponding to the largest predicted increase in affinity. specifically, we highlight the interaction between coronavirus orf9c and human mitochondrial nadh
scanning mutagenesis along all docked interfaces in pyrosetta. 
we identified as binding energy hotspot mutations all mutations with a predicted δδg at least one standard deviation away from the mean for identical amino acid substitutions across the rest of the interface. in total, out of 2,241 population variants on eligible interfaces, 161 (7.2%) were identified as hotspots predicted to disrupt interaction stability, and 116 (5.2%) were identified as hotspots predicted to contribute to interaction stability (fig 3.b). most of the hotspot mutations were predicted to be driven by solvation or repulsive forces, with disruptive hotspots primarily being driven by repulsive forces and stabilizing hotspots primarily being driven by solvation forces (fig 3.c). results summarizing the predicted impact of all 2,241 population variants on the docked interfaces are provided in supplemental table 8.
the current version of the sars-cov-2 human structural interactome web server describes 332 viral-human interactions reported by gordon et al. [20]. we will continue support for the web server with periodic updates as additional interactome screens between sars-cov-2 and human are published. as we update, a navigation option to select between the current or previous stable releases of the web server will be provided.
overall, we present a comprehensive resource to explore the sars-cov-2-human protein-protein interactome map in a structural context. analysis through this framework allows us to consider the recent evolution of sars-cov-2 in the context of its interactome map and to prioritize for further functional characterization key interactions. likewise, our consideration of underlying variation in the human proteins that interact with sars-cov-2 may be valuable in explaining differences in response to infection. 
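the binding energy hotspot criterion described above (a predicted δδg at least one standard deviation from the mean for the same amino acid substitution elsewhere on the interface) can be sketched as a simple threshold test. this is a minimal illustration under stated assumptions: the function name, the example background values, and the use of the sample standard deviation are ours, and the mapping of positive δδg to "disruptive" follows the convention in the text that decreased binding energy means a more stable interaction.

```python
from statistics import mean, stdev

def classify_hotspot(ddg, background, n_sd=1.0):
    """Label one variant by comparing its predicted ddG with the ddG values
    obtained for the identical substitution at other interface positions.
    `background` is a hypothetical list of those reference ddG values."""
    mu = mean(background)
    sd = stdev(background)              # sample standard deviation (assumption)
    if ddg >= mu + n_sd * sd:
        return "disruptive"             # higher energy: predicted to weaken binding
    if ddg <= mu - n_sd * sd:
        return "stabilizing"            # lower energy: predicted to strengthen binding
    return "neutral"

# hypothetical reference ddG values (rosetta energy units) for one substitution type
background = [0.0, 1.0, -1.0, 0.5, -0.5]
labels = [classify_hotspot(x, background) for x in (2.0, -2.0, 0.1)]

# the reported hotspot fractions follow directly from the counts in the text
pct_disruptive = round(100 * 161 / 2241, 1)
pct_stabilizing = round(100 * 116 / 2241, 1)
print(labels, pct_disruptive, pct_stabilizing)
```

comparing each variant only against the same substitution type keeps the threshold from being dominated by substitutions that are energetically costly at nearly any position.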
we particularly note that our observation that perturbations from underlying disease mutations and viral protein binding occur at distinct sites on the protein is of clinical interest. further investigation into the combined role of these two sources of perturbation to better understand the mechanisms linked to comorbidities is warranted. however, our work is not without limitation. firstly, we note that although structural coverage from our homology modelling of sars-cov-2 proteins was robust (supplemental figure 1), the same done to orient the most likely interface residues on each structure towards each other, protein-protein docking using incomplete protein models introduces some bias and low coverage may exclude some true interface residues. for this reason, the initial eclair interface annotations (which are less subject to structural coverage limitations) may provide orthogonal value. we additionally note that direct perhaps most importantly, we emphasize the importance of further experimental characterization to confirm the predictions made here. nonetheless we believe our 3d structural sars-cov-2-human interactome web server will prove to be a key resource in informing hypothesis-driven exploration of the mechanisms of sars-cov-2 pathology and host response. the scope and potential impacts of our web server will continue to grow as we incorporate the results of ongoing and future interactome screens between sars-cov-2 and human. additionally, we note that our 3d structural interactome framework can be rapidly deployed to analyze future viruses. homology-based modeling of all 29 sars-cov-2 proteins was performed in modeller 90 using a multiple template modeling procedure. in brief, a list of candidate template structures for each protein to be modelled was obtained by running blast 91 against a reference containing all sequences in the protein data bank (pdb) 92 . 
templates were filtered to only retain those with at least 30% identity to the protein to be modelled, and the remaining templates were ranked using a weighted combination of percent identity and coverage as described previously 93 . to compile the final set of overlapping templates for modeling, first the top ranked template was selected as a seed. overlapping templates were iteratively added to the set so long as 1) the new template increased the overall coverage by at least 10%, and 2) the new template retained a total percent identity no more than 25% worse than the initial seed template. pairwise alignments between the protein to be modelled and the template set were generated using a regions with large gaps (at least 5 gaps in the alignment in a 10 residue window). finally, alignment was to accommodate predictions between sars-cov-2 and human, slight alterations were made. using the predicted interface probabilities reported by eclair, we set up the initial docking conformation to explore a restricted search space for each docking simulation. in cases where multiple structures were available for the human protein, all structures were weighted based on the eclair scores for the covered residues in each structure to maximize both coverage and inclusion of likely interface residues. for each protein in the interaction, we performed a linear regression classification in scikit-learn 102 to optimally separate the likely interface residues from likely non-interface residues. the plane defined by this linear regression served as a reference to orient the structures along the y-axis the x-z plane and separated a distance of 5 å along the y-axis. for each docking attempt, a series of random perturbations from these initial conformations were made to search the nearby space. 
first, the human protein was rotated up to 360° along the y-axis to allow full exploration of different rotations of the two interfaces relative to each other. second, to apply some flexibility to the plane predicting the interface vs. non-interface sides of each protein, up to 30° of rotation along the x- and z-axes were allowed for both the viral and human proteins. finally, a random translation up to 5 å in magnitude was applied to the human protein along the x-z plane so that the docking could explore contact points other than the centers of mass along these axes. after initializing these guided starting conformations, docking was simulated in pyrosetta 46 using a modified version of the protein-protein docking methodology provided by gray 2006 103 . the initial demo (https://graylab.jhu.edu/pyrosetta/downloads/scripts/demo/d100 docking.py) takes two chains from a co-crystal structure, applies a random perturbation, and re-docks them. because randomized initial orientation was already handled as described previously, these steps were removed from our docking runs. in brief, the protein models were converted to centroid representation, slid into contact using the "interchain_cen" scoring function, and converted back to full atom representation, before having their side-chains optimized using the predefined "docking" and "docking_min" scoring table 2 . to annotate interface residues from atomic resolution docked models, we used a previously described and established definition for interface residues 45 . in brief, the solvent accessible surface area (sasa) for both bound and unbound docked structures was calculated using naccess. 100 we define as accessibility) and 2) in contact with the interacting chain (defined as any residue whose absolute accessibility decreased by ≥ 1.0 å²). the full list of sars-cov-2 mutations is reported in supplemental table 4 . 
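The guided random perturbations described for each docking attempt (a rotation of the human protein about the y-axis, small tilts of both proteins about the x- and z-axes, and a shift of the human protein in the x-z plane) can be sketched as a sampling routine. This is only an illustrative sketch and not the authors' pipeline: the names `Perturbation` and `sample_perturbation` are hypothetical, the tilts are assumed to be symmetric about zero, and the shift magnitude is assumed to be drawn uniformly between 0 and 5 Å, none of which is stated in the text.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Perturbation:
    """One random starting pose, expressed as offsets from the guided orientation."""
    human_rot_y_deg: float   # rotation of the human protein about the y-axis
    human_tilt_x_deg: float  # tilt of the human protein about the x-axis
    human_tilt_z_deg: float  # tilt of the human protein about the z-axis
    viral_tilt_x_deg: float  # tilt of the viral protein about the x-axis
    viral_tilt_z_deg: float  # tilt of the viral protein about the z-axis
    human_shift_x: float     # translation of the human protein along x (angstroms)
    human_shift_z: float     # translation of the human protein along z (angstroms)


def sample_perturbation(rng, max_tilt_deg=30.0, max_shift=5.0):
    """Draw one perturbation within the limits quoted in the text.

    Assumptions (not from the source): tilts are uniform in
    [-max_tilt_deg, +max_tilt_deg]; the shift has a uniform random
    direction in the x-z plane and a uniform magnitude in [0, max_shift].
    """
    direction = rng.uniform(0.0, 2.0 * math.pi)
    magnitude = rng.uniform(0.0, max_shift)
    return Perturbation(
        human_rot_y_deg=rng.uniform(0.0, 360.0),
        human_tilt_x_deg=rng.uniform(-max_tilt_deg, max_tilt_deg),
        human_tilt_z_deg=rng.uniform(-max_tilt_deg, max_tilt_deg),
        viral_tilt_x_deg=rng.uniform(-max_tilt_deg, max_tilt_deg),
        viral_tilt_z_deg=rng.uniform(-max_tilt_deg, max_tilt_deg),
        human_shift_x=magnitude * math.cos(direction),
        human_shift_z=magnitude * math.sin(direction),
    )


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the draws are reproducible
    for trial in range(3):
        print(trial, sample_perturbation(rng))
```

Each sampled pose would then be applied to the oriented structures before the docking run itself; how the rotations are composed and applied to coordinates is left out here because the text does not specify it.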
human population variants in all 332 human proteins shown to interact with sars-cov-2 430 proteins were obtained from gnomad 58 reported as the most general term with no more significant ancestor term (supplemental table 6 , sheet 465 1). raw enrichment values for all terms are also predicted (supplemental table 6 , sheet 2). for curation of disease and trait associations from nhgri-ebi gwas catalog (http://www.ebi.ac.uk/gwas/) 67 the scoring function used for these calculations is as described previously 59 using the following weights; protein-ligand docking using smina the previous viral-human interactome screen by gordon et al. 20 supplemental table 9 . list of all predicted drug-target binding sites . comparison of the percentage of human genes that interact with (green) or do not interact with (orange) coronaviruses: an overview of their replication and pathogenesis a pneumonia outbreak associated with a new coronavirus of probable bat origin including severe acute respiratory syndrome (sars) and middle east respiratory syndrome (mers). 
mandell, douglas, and bennett's principles and practice of infectious diseases a novel bat coronavirus closely related to sars-cov-2 contains natural insertions at the s1/s2 cleavage site of the spike protein extrapulmonary manifestations of covid-19 clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in clinical course and outcomes of critically ill patients with sars-cov-2 pneumonia in wuhan, china: a single-centered, retrospective, observational study clinical course and risk factors for mortality of adult inpatients with covid-19 in wuhan, china: a retrospective cohort study severe obesity, increasing age and male sex are independently associated with worse in-hospital outcomes, and higher in-hospital mortality african-american covid-19 mortality: a sentinel event characteristics associated with hospitalization among patients with covid-19 greater risk of severe covid-19 in black, asian and minority ethnic populations is not explained by cardiometabolic, socioeconomic or behavioural factors, or by 25(oh)-vitamin d status: study of 1326 cases from the uk biobank disparities in incidence of covid-19 among underrepresented racial/ethnic groups in counties identified as hotspots during racial demographics and covid-19 confirmed cases and deaths: a correlational analysis of 2886 us counties the sars-coronavirus-host interactome: identification of cyclophilins as target for pan-coronavirus inhibitors global landscape of hiv-human protein complexes protein interaction mapping identifies rbbp6 as a negative regulator of ebola virus replication comparative flavivirus-host protein interaction mapping reveals mechanisms of dengue and zika virus pathogenesis a sars-cov-2 protein interaction map reveals targets for drug repurposing structure of the human receptor tyrosine kinase met in complex with the listeria invasion protein inlb. 
cell sars-cov-2 cell entry depends on ace2 and tmprss2 and is blocked by a clinically proven protease inhibitor chemokine receptor ccr5 antagonist maraviroc: medicinal chemistry and clinical applications inhibiting hiv-1 integrase by shifting its oligomerization equilibrium small molecule inhibitors of the ledgf site of human immunodeficiency virus integrase identified by fragment screening and structure based design virus-receptor interactions: the key to cellular invasion structural insights into the interaction of coronavirus papain-like proteases and interferon-stimulated gene product 15 from different species mechanism of inhibition of retromer transport by the bacterial effector ridl solution structure of the complex between poxvirus-encoded cc chemokine inhibitor vcci and human mip-1beta structural properties of the promiscuous vp16 activation domain crystal structure of a gamma-herpesvirus cyclin-cdk complex metabolic syndrome and viral pathogenesis: lessons from influenza and coronaviruses a unifying view of 21st century systems biology the molecular sociology of the cell network medicine: a network-based approach to human disease small molecules, big targets: drug discovery faces the protein-protein interaction challenge small-molecule inhibitors of protein-protein interactions: progressing toward the reality alphaspace: fragment-centric topographical mapping to target protein-protein interaction interfaces the development and current use of bcl-2 inhibitors for the treatment of chronic lymphocytic leukemia identification of protein-protein interaction inhibitors targeting vaccinia virus processivity factor for development of antiviral agents inhibition of human papillomavirus dna replication by small molecule antagonists of the e1-e2 protein interaction optimization and determination of the absolute configuration of a series of potent inhibitors of human papillomavirus type-11 e1-e2 protein-protein interaction: a combined medicinal chemistry, nmr and 
computational chemistry approach protein-protein interactions in virus-host systems. front microbiol interactome insider: a structural interactome browser for genomic studies pyrosetta: a script-based interface for implementing molecular modeling algorithms using rosetta stabilized coronavirus spikes are resistant to conformational changes induced by receptor recognition or proteolysis structural and functional basis of sars-cov-2 entry by using human ace2 sars-cov-2 and bat ratg13 spike glycoprotein structures inform on virus evolution and furin-cleavage effects structure, function, and antigenicity of the sars-cov-2 spike glycoprotein the rosetta all-atom energy function for macromolecular modeling and design structural basis of receptor recognition by sars-cov-2 cryo-em structure of the 2019-ncov spike in the prefusion conformation who is most likely to be infected with sars-cov-2? the lancet infectious diseases comparative genetic analysis of the novel coronavirus (2019-ncov/sars-cov-2) receptor ace2 in different populations genetic predisposition models to covid-19 infection the mutational constraint spectrum quantified from variation in 141,456 humans a simple physical model for binding energy hot spots in protein-protein complexes spatial chemical conservation of hot spot interactions in protein-protein complexes tests of concrete strength across the thickness of industrial floor using the ultrasonic method with exponential spot heads the sequence of human ace2 is suboptimal for binding the s spike protein of sars coronavirus 2. biorxiv conserved residue clusters at protein-protein interfaces and their use in binding site identification sars-cov2 (covid-19) structural/evolution dynamicome: insights into functional evolution and human genomics. 
biorxiv human gene mutation database (hgmd): 2003 update clinvar: improving access to variant interpretations and supporting evidence the nhgri-ebi gwas catalog of published genome-wide association studies, targeted arrays and summary statistics 2019 characteristics associated with hospitalization among patients with covid-19 prevalence of comorbidities and its effects in patients infected with sars-cov-2: a systematic review and meta-analysis widespread macromolecular interaction perturbations in human genetic disorders three-dimensional reconstruction of protein networks provides insight into human genetic disease sift web server: predicting effects of amino acid substitutions on proteins a method and server for predicting damaging missense mutations analysis of intraviral protein-protein interactions of the sars coronavirus orfeome human mitochondrial complex i assembly is mediated by ndufaf1 cia30 complex i assembly factor: a candidate for human complex i deficiency? hum genet human genome-wide rnai screen reveals a role for nuclear pore proteins in poxvirus morphogenesis mitochondrial reactive oxygen species control t cell activation by regulating il-2 and il-4 expression: mechanism of ciprofloxacin-mediated immunosuppression the landscape of human cancer proteins targeted by sars-cov-2 genome-wide sirna screen identifies the retromer as a cellular entry factor for human papillomavirus the master regulator of the cellular stress response (hsf1) is critical for orthopoxvirus infection a genome-wide small interfering rna (sirna) screen reveals nuclear factor-kappab (nf-kappab)-independent regulators of nod2-induced interleukin-8 (il-8) secretion architecture of the human interactome defines protein communities and disease networks tmed2 potentiates cellular ifn responses to dna viruses by reinforcing mita dimerization and facilitating its trafficking role of the early secretory pathway in sars-cov-2 infection tom70 mediates activation of interferon regulatory 
factor 3 on mitochondria a whole-genome association study of major determinants for host control of hiv-1. science extensive disruption of protein interactions by genetic variants across the allele frequency spectrum in human populations comparative protein structure modeling using modeller basic local alignment search tool the protein data bank interactome3d: adding structural details to protein networks protein identification and analysis tools on the expasy server the proteomics protocols handbook divergence measures based on the shannon entropy predicting functionally important residues from sequence conservation direct coupling analysis for protein contact prediction evolutionarily conserved pathways of energetic connectivity in protein families modbase, a database of annotated comparative protein structure models and associated resources the interpretation of protein structures: estimation of static accessibility accelerating protein docking in zdock using an advanced 3d convolution library scikit-learn: machine learning in python high-resolution protein-protein docking uniprot: a worldwide hub of protein knowledge analysis of multimerization of the sars coronavirus nucleocapsid protein a new coronavirus associated with human respiratory disease in china genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after visiting wuhan a general method applicable to the search for similarities in the amino acid sequence of two proteins amino acid substitution matrices from protein blocks the ensembl variant effect predictor explaining odds ratios. 
j can acad child adolesc psychiatry ldlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants open babel: an open chemical toolbox lessons learned in empirical scoring with smina from the csar 2011 benchmarking exercise sars-cov-2 that contain disease annotations in hgmd (log2or=0.57, p=1.70e-4) sars-cov-2 proteins were significantly more likely to harbor disease mutations than non-interactors error bars indicate ± se. e, a sample of individual disease terms enriched in human genes targeted by comparison of the enrichment of hgmd, clinvar, and gwas annotated mutations on human-viral interfaces or human-human interfaces for the same gene set. although disease mutations were enriched on human-human interfaces (hgmd, log2or=0 24), no enrichment was observed on human-viral interfaces (hgmd, log2or=0.21, p=0.13; clinvar the gwas category was removed from this analysis because most lead gwas snps occurred in non-coding regions. error bars indicate ± se predicted changes in binding affinity from sequence divergences between sars-cov and sars-cov-2. an overall representation of these δδg predictions is reported (mean=-1.40 reu, std=6.16 reu) with interactions sorted from those with the largest decrease in binding energy (most stabilized relative to sars-cov) to those with the largest increase in binding energy (most destabilized relative to sars-cov) < z-score ≤ -1, n=85), or strongly stabilizing (z-score ≤ -2, n=31) score < 1, n=1,964) showed minimal impact on binding affinity. c, breakdown of the contribution of each term in the pyrosetta energy function used for in-silico scanning mutagenesis for all population variants a breakdown of which term contributed most heavily to the classification of all 277 interface hotspot population variants is shown on the right. 
d, individual sars-cov-2-human interactions involving the same viral protein can have distinct interfaces with distinct predicted changes in binding affinity between sars-cov and sars-cov-2 versions of the protein. an example involving orf9b is highlighted where some interactions (e.g. tomm70 and ptbp2) are predicted to be more stabilized in sars-cov-2 whereas others (e.g. bag5, slc9a3r1, and mark2) are predicted to be unaffected. e, docked structure for the interaction between sars-cov-2 orf9c and human ndufaf1 sars-cov-2 (bottom) orf9c. interface residues are colored by their predicted energy contribution from blue (stabilizing) to white (no impact) to red (destabilizing) -2 are labeled in red, while other residues with a major contribution to the binding affinity are labeled in green (ndufaf1) or blue (orf9c). the overall predicted change in binding energy (δδg=-21.7 reu) suggests the interaction is more stable (lower energy) in the sars-cov-2 version of the interaction supplemental figure 4. summary of human population variant frequency and deleteriousness summary of allele frequency for human population variants either on or off the predicted human-viral interface presented as either a raw distribution or a cumulative density respectively. variants in either category had roughly identical allele frequency distributions. c, d, summary of the sift deleteriousness score for human population variants either on or off the predicted human-viral interface presented as either a raw distribution or a cumulative density respectively population variants on the interface were significantly more likely to be classified deleterious. f, g, summary of the polyphen deleteriousness score for human population variants either on or off the predicted human-viral interface presented as either a raw distribution or a cumulative density respectively. 
plots are colored based on the split between polyphen benign, possibly damaging, and probably damaging categories. e, pie chart breakdown of these categories. pie chart outlines distinguish interface (green) from non-interface (orange). for each sars-cov-2-human interaction with 3d structure available for both proteins, 50 independent guided docking trials were used to select a final docked configuration. the structure for the viral protein is colored from white to blue with darker blue corresponding to higher eclair prediction. the structure for the human protein is colored similarly using a green to white gradient. initial semi-random docked configurations were generated using five steps. first, a plane separating eclair predicted likely interface from likely non-interface residues was drawn to divide each protein. second, the two protein chains were separated 5 å apart on the y-axis using the previously defined plane to orient the likely interface sides of each protein towards each other. third, the human protein was randomly rotated up to 360° along the y-axis to sample different orientations of the two interfaces relative to each other. key: cord-281005-6gi18vka authors: singh, praveen kumar; kulsum, umay; rufai, syed beenish; mudliar, s. rashmi; singh, sarman title: mutations in sars-cov-2 leading to antigenic variations in spike protein: a challenge in vaccine development date: 2020-09-01 journal: j lab physicians doi: 10.1055/s-0040-1715790 sha: doc_id: 281005 cord_uid: 6gi18vka objectives the spread of severe acute respiratory syndrome coronavirus-2 (sars-cov-2) virus has been unprecedentedly fast, spreading to more than 180 countries within 3 months with variable severity. one of the major reasons attributed to this variation is genetic mutation. therefore, we aimed to predict the mutations in the spike protein (s) of the sars-cov-2 genomes available worldwide and analyze their impact on the antigenicity. 
materials and methods several research groups have generated whole genome sequencing data which are available in the public repositories. a total of 1,604 spike proteins were extracted from 1,325 complete genome and 279 partial spike coding sequences of sars-cov-2 available in ncbi up to may 1, 2020 and subjected to multiple sequence alignment to find the mutations corresponding to the reported single nucleotide polymorphisms (snps) in the genomic study. further, the antigenicity of the predicted mutations was inferred, and the epitopes were superimposed on the structure of the spike protein. results the sequence analysis revealed a high snp frequency. the significant variations in the predicted epitopes showing high antigenicity were a348v, v367f and a419s in the receptor binding domain (rbd). other mutations observed within the rbd exhibiting low antigenicity were t323i, a344s, r408i, g476s, v483a, h519q, a520s, a522s and k529e. the rbd mutations t323i, a344s, v367f, a419s, a522s and k529e are novel mutations reported for the first time in this study. moreover, a930v and d936y mutations were observed in heptad repeat domain 1 and one mutation, d1168h, was noted in heptad repeat domain 2. conclusion s protein is the major target for vaccine development, but several mutations were predicted in the antigenic epitopes of s protein across all genomes available globally. the emergence of various mutations within a short period might result in conformational changes of the protein structure, which suggests that developing a universal vaccine may be a challenging task. since the rapid outbreak of 2019 novel coronavirus (2019-ncov, later named sars-cov-2 or severe acute respiratory syndrome coronavirus 2) in wuhan, china, the world health organization on january 30, 2020, declared the sars-cov-2 epidemic a public health emergency of international concern. 
the ongoing pandemic has caused nearly 5 million detected cases of coronavirus disease 2019 illness and claimed over 325,563 lives worldwide as of may 20, 2020 according to the johns hopkins covid-19 resource center. 1 however, so far, no proven therapeutic or effective vaccine candidate has been found. 2 for developing a drug or vaccine, the protein profiling and/or genomic information of the pathogen is crucial. to understand the genetic landscape of the sars-cov-2 virus, scientists have worked tirelessly and the complete genome sequences of virus isolates have been published. now, many isolates have been sequenced completely or partially and are available in the database for the scientific community. 3 it is found that the genome size of sars-cov-2 varies from 29.8 kb to 29.9 kb. specific genetic characteristics in its genome have also been found. 4 the genome encodes four structural proteins: spike (s), envelope (e), membrane (m), and nucleocapsid (n) proteins. 3 of these four proteins, the m protein is reported to have a role in determining the shape of the virus envelope and stabilizing the nucleocapsids, and the n protein is involved in processes related to the viral genome, the viral replication cycle, and the cellular response of host cells to viral infections. the e protein, which is the smallest protein in the sars-cov-2 structure, plays a role in the production and maturation of this virus. 4 however, the glycoprotein s is the core transmembrane monomer of approximately 180 kda size with two subunits s1 and s2. this glycoprotein mediates membrane fusion and finally facilitates virus entry (receptor-binding and entry of virion into the target cells). 5 the receptor binding domain (rbd) (residues 319-541) of the subunit s1 is known to interact with angiotensin-converting enzyme 2 (ace-2), which provides tight binding to the peptidase domain of ace-2. 
6 this suggests that the rbd is an important element of virus-receptor interaction and has an essential role in virus-host range, tropism, and infectivity. the rbd sequences of different sars-cov-2 strains that are circulating globally were initially thought to be conserved; however, with the availability of sequencing data several mutations have been reported. 7, 8 these findings support the argument that there may be a correlation between the mutated strains circulating in a particular geographic setting and higher mortality rates and transmission patterns, besides other contributing factors. 9 the mutation rate in ribonucleic acid (rna) viruses is extremely high, up to a million times higher than that of their hosts, and this results in virulence modulation and evolutionary capability for better viral adaptation. 7 genetic characterization of virus mutations can thus offer valuable insights for assessing drug resistance, immune escape, and pathogenesis. due to its receptor binding property, the s protein is also supposed to be immunogenic and a putative target for developing neutralizing antibodies and vaccines. it is reported that single-point mutations in the conserved amino acid residues in the rbd region completely abolish the capacity of full-length s protein to induce neutralizing antibodies. 10 thus, virus mutation studies can be crucial for designing new vaccines and antiviral drugs. in this study, we aimed to predict the mutations in the spike protein (s) of sars-cov-2 genomes available in the database (whole genome sequences as well as partial coding sequences of spike protein) and analyze the effect of each mutation on the antigenicity of the predicted epitopes. this information may be helpful in predicting the transmission and infectivity of various sars-cov-2 strains circulating worldwide. 
entrez direct (edirect) utilities were used to access the ncbi nucleotide database through in-house bash scripts that batch downloaded the data. we used the query phrases "severe acute respiratory syndrome coronavirus 2" and "spike coding sequences (cds)" with the esearch and efetch utilities, implemented in bash scripts, to download the genome and spike protein datasets that were available up to may 1, 2020. a total of 1,325 complete draft genomic sequences of sars-cov-2 and an additional 279 cds from partial genomes coding the spike protein (total 1,604 cds of spike protein), available globally, were downloaded in fasta format from the ncbi database (►fig. 1). multiple sequence alignment (msa) of all the 1,325 complete genome sequences as well as the 1,604 cds was performed using clustalw-mpi with default parameters. 11 snps were identified by in-house bash scripts that batch processed the data on a blade server (dell poweredge fc640 server) with 256 gb ram and a 40 core processor at 2.30 ghz. after msa, each genome and spike protein was marked based on the location, and clustering was done based on 100% similarity for ease of visualization and analysis. visualization was performed using jalview 2.11.1.0. 12 the output snp alignment generated from msa was used to assemble a maximum likelihood phylogenetic tree using raxml (randomized axelerated maximum likelihood, raxml 8.2.12). 13 phylogenetic trees were visualized using the interactive tree of life (itol) v5 with their respective metadata. 14 emboss antigenic software was used to predict the antigenic regions and the epitopes in the 88 unique spike proteins based on antigenic scores using the formula: antigenic propensity = f(ag)/f(s), where f(ag) = antigenic frequency; f(s) = surface frequency and antigenic score ≥ 1.0 is considered potentially antigenic. [15] [16] [17] the data for different epitopes were analyzed and the epitopes with high antigenicity were superimposed on the structure of spike protein. 
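The step of reading SNPs off a multiple sequence alignment can be illustrated with a short sketch. The study used in-house bash scripts whose details are not given, so the Python function below is a stand-in rather than the authors' method: `call_snps` is a hypothetical name, the toy sequences are invented for illustration, and the decision to skip gaps and ambiguous `N` bases is an assumption.

```python
def call_snps(ref_aln, sample_aln, ref_start=1):
    """List substitutions between two rows of the same alignment.

    ref_aln, sample_aln: aligned sequences of equal length ('-' marks a gap).
    ref_start: coordinate assigned to the first reference base.
    Returns strings such as '21564T>C', numbered on the ungapped reference.
    Gaps and ambiguous 'N' bases in the sample are skipped (an assumption;
    indels would need separate handling).
    """
    if len(ref_aln) != len(sample_aln):
        raise ValueError("aligned sequences must have the same length")

    snps = []
    pos = ref_start - 1
    for ref_base, alt_base in zip(ref_aln.upper(), sample_aln.upper()):
        if ref_base == "-":
            # Insertion relative to the reference: no reference coordinate.
            continue
        pos += 1
        if alt_base in "-N":
            continue
        if ref_base != alt_base:
            snps.append(f"{pos}{ref_base}>{alt_base}")
    return snps


if __name__ == "__main__":
    # Toy alignment rows, not real SARS-CoV-2 sequence.
    print(call_snps("ACG-TA", "ATGCT-"))
    print(call_snps("ATG", "ACG", ref_start=21563))
```

Numbering on the ungapped reference keeps the reported positions comparable across samples even when other sequences in the alignment carry insertions.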
overall, 1,197 snps were found in 1,325 complete genome datasets. on the basis of similarity these were classified into 782 clusters. however, among cds of spike proteins (1,325 complete genomes and 279 partial cds) a total of 140 snps in 88 clusters were found. further snp analysis resulted in identification of snps in a gene stretch of 21,563-25,384 bp in the s gene, encoding the spike (s) protein. the most predominant snp predicted in the gene encoding s protein was 23403a>g in 48.2% of overall genomes under study. in addition to several single mutations in the s gene of all available genomes, we also predicted multiple mutations such as 22436g>t, 22439c>g, 22444c>a, 22445c>t (corresponding to four amino acid antigenic drift aldp -> sves at position 292-295) and 21723t>g (l54w); 21726t>a (f55i) in two different genomes from the united states. a list of all the snps in the rbd, antigenic sites, and double mutations among strains is shown in ►table 1. one deletion (21994-21996deltta) was also found in an indian strain (mt012098.1). no copy number variants were observed in this virus. the alignment of 1,604 spike proteins extracted from 1,325 complete genomes and 279 partial cds was performed and after clustering based on 100% identity, we identified 88 unique sequences with 88 hypervariable sites within these protein sequences. based on the variable sites the phylogeny was inferred showing two major clades a and b with many subclades in the s protein of sars-cov-2 circulating worldwide (►fig. 2). furthermore, the evaluation of the antigenicity of spike protein predicted 14 highest scoring antigenic epitopes (antigenic scores ≥ 1.0) due to variations in each (at positions l54f, l54w, f55i, s71f, d111n, f157l, l293m, l293v, d294e, d294i, a419s, v367f, a348v, and a653v). out of these, amino acid changes were noted at positions a348v, v367f and a419s in the rbd with v367f and a419s being novel. 
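The correspondence between the genomic SNP coordinates and the spike residue numbers quoted here can be checked with simple arithmetic, assuming the S coding region starts at nucleotide 21,563 as stated in the text. The helper name `spike_codon` below is hypothetical and is only meant as a worked example of that arithmetic.

```python
from typing import Tuple

S_GENE_START = 21563  # first nucleotide of the S coding region, as given in the text


def spike_codon(genome_pos: int, gene_start: int = S_GENE_START) -> Tuple[int, int]:
    """Map a genome coordinate to (codon number, position within codon).

    Both returned values are 1-based: codon 1 is the start codon, and the
    position within the codon is 1, 2, or 3.
    """
    offset = genome_pos - gene_start
    if offset < 0:
        raise ValueError("position lies upstream of the gene start")
    return offset // 3 + 1, offset % 3 + 1


if __name__ == "__main__":
    # Coordinates taken from the SNPs discussed in the text.
    for pos in (21723, 22436, 22445, 23403):
        codon, base = spike_codon(pos)
        print(f"{pos} -> codon {codon}, base {base}")
```

With this numbering, 21723 falls in codon 54, 22436 and 22445 fall in codons 292 and 295, and 23403 falls in codon 614, in agreement with the L54W, 292-295, and D614G annotations.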
other predicted mutations in the putative epitopes lying within the rbd and showing lower antigenicity were t323i, a344s, r408i, g476s, v483a, h519q, a520s, a522s, and k529e, out of which t323i, a344s, a522s, and k529e are also novel. in addition, regions outside the rbd (4-19, 1,215-1,256, 1,039-1,071, 123-146, 607-629, 1,123-1,136, 857-867, 535-541) also showed high predicted antigenicity (antigenic scores in the range 1.167-1.261) with variations in the s protein of different genomes from similar locations as shown in ►fig. 3. the antigenic epitopes are depicted in the protein structure of spike protein (►fig. 4). two mutations, a930v and d936y, were observed in heptad repeat domain 1 (hr1) and one mutation, d1168h, in heptad repeat domain 2 (hr2). several studies have shown that mutations within the spike protein influence virus-host interaction. 18 among the four proteins, viz., m, s, n and e, the m protein is known to play a significant role in virus assembly, the e protein is involved in the production and maturation of this virus, the n protein is involved in processes related to the viral replication cycle and the cellular response of host cells to viral infections, and the s protein is the major target for neutralizing antibodies as it mediates fusion and facilitates viral entry into the host. 4, 5 in the present study, we found that although multiple genetic variants were identified in the same country, there were some unique mutations found in a particular country, which suggests that the diversity of s protein mutations might have a significant role in the pathogenicity of this virus in countries with high or low mortality rates, as also proposed by others. 19 we predicted 140 snps in the s protein with a>g and vice versa in sars-cov-2 genomes submitted from india, t>a and c>t from china and the interchange of all four nucleotides (c>t, t>a, a>g, g>t, c>g, c>a, g>c, t>c, a>t, g>a) in genomes submitted from the united states. the data from the united states is significant. 
however, it might be because maximum genomes were submitted from the united states only. the snp profile revealed that the s protein mutations were predominant at specific positions only. these mutations are expected to make the virus more capable to escape from the host immune and might help in natural selection and evolution of the sars-cov-2, as reported by andersen et al. 20 it is important to mention that double mutations in the s protein were found only in the strains from the united states but not in genomes from other regions. these double mutations probably could have helped in the increased virulence of the virus. 21 it has also been noticed that the death toll is comparatively higher in the united states than in other regions included in the study. 1 this might probably indicate that the prevalence of several mutated strains within the provinces would have either reduced or increased its severity. it may also help in understanding the antigenic and immunogenic changes but the correlation of mutation with regional virulence could not be established due to statistical imbalance of the available genomes in the database at the time the study was done. moreover, extensive research is required to correlate the mutations with the severity of the disease and mortality. out of 88 snp clusters, d614g was found in 34 (38.6%) sars-cov-2 genomes. the amino acid change in 23403a>g variant (p.d614g) involves a change of large acidic residue d (aspartic acid) into small hydrophobic residue g (glycine). this observation is important, as this large difference in both size and charge may help compromise the binding affinity of antibodies against s protein, due to electrostatic interactions in the tertiary structure of protein group. this may hinder the developments of vaccines and might potentiate the virus for antigenic drifts. 22 the effect of deletion variant (figured in one sars-cov-2 genome from india) on the viral phenotypes needs further investigation. 
the high frequency of genetic mutations in rna viruses is well known but in the genomes of sars-cov-2, we found a series of single amino acid variations. this can affect the virus evolution and emergence of the new strains. 21 the mutations in the rbd found in our study predicted conformational changes in the s1 domain of spike protein. the mutations in rbd play an important role while designing new drugs, as suggested in a recent study. 23 these mutations might affect the interaction of viral rbd with the host receptor. our study revealed 12 mutations of which six were novel mutations (►table 1). out of the six novel mutations two were exhibiting high antigenicity while others were in the less antigenic region. the amino acid change observed in the antigenic epitopes were from positively charged to uncharged amino acids (r->i, h->q), and negatively charged to uncharged (d->n, d->g, d->y) amino acids. we also found mutations in negative to positively charged (d->h) amino acids. these replacements might influence the tertiary structure of the proteins and facilitate the increased virulence by escaping host immune response. 24 the sequences in the hr1 (residues 902-952) and hr2 (residues 1,145-1,184) regions tend to form dimeric or trimeric helix bundles. 25 as the s protein of coronavirus are homodimers or homotrimers, these hr regions may undergo oligomerization and result in the conformational change of s protein during virus-host cell fusion. 26 these regions show different conformations in different fusion states and are known to be the most conserved among other regions in s protein of sars-cov. 25, 27 however, a previous study shows variations in hr1 domain which forms helical bundles with hr2 to facilitate fusion and entry of virus into the host and hypothesizes that the mutation a1168v in hr2 domain along with a930v mutation in hr1 domain confers peptide entry inhibitor resistance in mouse hepatitis coronaviruses. 
28 hence the mutation a930v in the hr1 domain and d1168h in the hr2 domain found in our study might be relevant in explaining the pathogenesis of sars-cov-2. with the rapid spread of this virus and the limitation of specific therapy, studies are being focused on exploring the potential of neutralizing antibodies (as in plasma therapy) against vulnerable epitopes of s protein. 29 our study predicts
an interactive web-based dashboard to track covid-19 in real time
a review of sars-cov-2 and the ongoing clinical trials
genomic characterization of a novel sars-cov-2
coronavirus envelope protein: current knowledge
cryo-em structure of the 2019-ncov spike in the prefusion conformation
structure of mouse coronavirus spike protein complexed with receptor reveals mechanism for viral entry
emerging sars-cov-2 mutation hot spots include a novel rna-dependent-rna polymerase variant
preliminary identification of potential vaccine targets for the covid-19 coronavirus (sars-cov-2) based on sars-cov immunological studies
real estimates of mortality following covid-19 infection
single amino acid substitutions in the severe acute respiratory syndrome coronavirus spike glycoprotein determine viral entry and immunogenicity of a major neutralizing domain
clustalw-mpi: clustalw analysis using distributed and parallel computing
jalview version 2-a multiple sequence alignment editor and analysis workbench
raxml-vi-hpc: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models
interactive tree of life (itol) v4: recent updates and new developments
emboss: the european molecular biology open software suite
a semi-empirical method for prediction of antigenic determinants on protein antigens
new hydrophilicity scale derived from high-performance liquid chromatography peptide retention data: correlation of predicted surface residues with antigenicity and x-ray-derived accessible sites
potential therapeutic targeting of coronavirus spike glycoprotein priming
sars-cov-2 viral spike g614 mutation exhibits higher case fatality rate
the proximal origin of sars-cov-2
why are rna virus mutation rates so damn high?
electrostatic interactions in protein structure, folding, binding, and condensation
exploring the genomic and proteomic variations of sars-cov-2 spike glycoprotein: a computational biology approach
lineage-specific differences in the amino acid substitution process
interaction between heptad repeat 1 and 2 regions in spike protein of sars-associated coronavirus: implications for virus fusogenic mechanism and identification of fusion inhibitors
tectonic conformational changes of a coronavirus spike glycoprotein promote membrane fusion
mechanisms of viral membrane fusion and its inhibition
coronavirus escape from heptad repeat 2 (hr2)-derived peptide entry inhibition as a result of mutations in the hr1 domain of the spike fusion protein
a highly conserved cryptic epitope in the receptor binding domains of sars-cov-2 and sars-cov
most of the vaccine strategies against covid-19 are focusing on the predicted epitopes of sars-cov-2 spike protein. this protein is also proposed to be the most potent and specific target for drugs and for designing neutralizing antibodies. our findings indicate that vaccine design against sars-cov-2 could be a challenging task. even though both rna-based and peptide-based vaccines are being developed in more than seven laboratories, our observations may be useful in the efficacy analysis of these vaccine candidates. none. conflict of interest p.k.s. and u.k. are research officers in a department of biotechnology funded unrelated project (bt/pr23016/ ner/95/581/2017). 
key: cord-259112-tkj5de7b authors: mandal, santi m; panda, souvik title: inhaler with electrostatic sterilizer and use of cationic amphiphilic peptides may accelerate recovery from covid-19 date: 2020-06-17 journal: biotechniques doi: 10.2144/btn-2020-0042 sha: doc_id: 259112 cord_uid: tkj5de7b we explore the design of a smart inhaler with electrostatic sterilizer and propose the utilization of cationic amphiphilic peptides, independently or in conjunction with a bronchodilator, for covid-19 patients to quickly improve wellbeing while maintaining a strategic distance to protect healthcare personnel from virus-containing aerosol or droplets during the process of inhalation. cationic amphiphilic peptides (caps) are known effector molecules synthesized in the host cell in response to immunological challenge by pathogens, and can eliminate microorganisms (bacteria, fungi and protozoan parasites) and acellular entities (viruses and viroids) [10] . the epithelial surfaces of the respiratory tract and lungs are shielded from pathogens by secreting protective molecules, referred to as intrinsic mucosal resistance. this intrinsic mucosal resistance framework eliminates pathogens by forestalling their colonization in the epithelial layer. synthesis and discharge of a few caps (for example, collectin, β-defensins, cathelicidins and hydrophilic surfactant proteins) and other molecules play a significant role in counteracting pathogen attacks [11] . caps have an expansive range of action against microorganisms and infections, neutralize endotoxins and exhibit other in vivo activity [12] . the utility of recombinant lung surfactant proteins has been demonstrated for the treatment of respiratory distress syndrome in neonates, chronic obstructive pulmonary disease, emphysema, cystic fibrosis and asthma [13] . both in vitro and in vivo data reveal the significant impact of caps in epithelial host defense [14] . 
under inflammatory disease conditions, illness and morbidity are enhanced because of dysfunction in mucosal resistance [15] ; hence inhalation of caps or hydrophilic surfactant proteins should be an effective way to deal with receptor-binding proteins of viruses, interrupting or disrupting the membrane lipid bilayer or eliminating colonized virus particles from the epithelial surface. cap-induced disturbance/disruption of the membrane lipid bilayer could happen in various ways, as envisaged in various studies (e.g., barrel-stave model, toroidal model, carpet model and detergent model) [16, 17] . a few caps are notable for causing fusion of cell membranes and can control viral infections by intervening in the fusion process between the host cell membrane and the enveloped virus [18] . the fusogenic tat protein transduction domain has been utilized to deliver a wide range of biologically active components and drugs by direct penetration across the lipid membrane [19] . membrane permeabilization is viewed as a significant trait of antiviral action. in poxviruses, rifampin is an effective inhibitor of viral envelope formation; lipid layer-bound viral proteins might be targeted immunologically to expand its antiviral efficacy. for a successful and effective interaction, both electrostatic charge and hydrophobicity are significant. a positive charge is required initially to attract negatively charged membranes, and hydrophobic bulk is then required to disturb the membrane as it makes contact with the hydrophobic sites of the hr1 and hr2 regions of the viral fusion protein and the receptor-binding domain of the s protein; this may lessen viral entry into the cell. the significant point here is that higher hydrophobicity increases hemolytic activity, which may hinder the utilization of caps. in any case, hemolytic action might be diminished by the alteration of residues [20] . 
a few lipopeptides, including magainin and gomesin, are known to be successful caps for antimicrobial action. here, however, caps are administered through an inhaler, allowing them to reach the upper respiratory tract and lungs, which are the hotspots for sars-cov-2 because of their high expression of its main receptor molecule, ace2. their cationic property is apt to disrupt the viral
the inhaler is an extremely common and valuable device for conveying medicine into the body via the lungs, and is commonly utilized for the treatment of asthma, chronic obstructive pulmonary disease and viral diseases. covid-19 patients often display symptoms of extreme respiratory complication and inhalers may be used to provide immediate relief. healthcare workers may risk infection from patients using inhalers, and the possibility of spreading contamination in the room during exhalation cannot be ruled out. here we have designed a unit to stop the spread of virus particles from patients. the inhaler is commonly utilized for quick alleviation of blocked airways. although it delivers short-acting medications, during its use aerosolized viral particles may be released into the environment and may affect health workers. for this reason, another device is proposed to kill viral particles inside the aerosol of covid-19 patients while inhaling air (figure 2). to assemble the gadget, the two halves of the spacer need to be firmly pushed together to rotate the mouthpiece top. the spacer includes two locks which guarantee proper assembly of the two parts. a canister is put into the face of the inhalation chamber with one push. there is a one-way gate valve (owgv) which allows entry of the medicine during inhalation but does not let it back out through this entryway. all of the inhaled air flows through this gadget and on to the mouthpiece. there is another owgv in the guard of the face of the inhalation chamber. 
the aerosols from exhalation do not return through this valve, which is shut by the owgv working head. the patient exhales completely, closing the lips firmly around the mouthpiece to make a good seal without biting on it, breathes in deeply through the mouth, and in this manner takes in the medicine through the spacer. at that point, the patient removes the inhaler spacer from the mouth, holds the breath for about 10 s (or as long as possible) and breathes out gradually. a separate parabolic chamber is attached to this gadget, in which one cathode plate and one anode plate are placed independently. a high-efficiency particulate air (hepa) filter, battery, switch and other devices creating a high voltage circuit are additionally attached. a push switch is arranged outside of the parabolic chamber. an ordinary 4 v direct current (dc) battery is utilized in our device to create a high voltage of 1000 v, enabling the positive anode and negative cathode to liberate charged particles separately. a negative voltage of 1000 v is applied to the cathode wire or plate. an electric discharge from the anode plate ionizes the air; aerosol concentrates around the electrode inside the parabolic chamber during the application of high voltage. this is then ionized, causing sparking-induced burning of virus particles due to the large electrostatic force. viruses in the inhaled air or aerosols within the gadget are killed instantaneously and burned right away inside the device. when patients breathe out slowly, the owgv will open and air goes into the parabolic chamber. this incoming air is charged by the cathode and anode electrodes, leaving virus particles to be burned. at that point the charged air flows through the filter, which can trap 99% of remaining virus particles. virus-free air will be discharged from the chamber, preventing dissemination of viruses. two strategies are proposed here for covid-19 patients. 
first, caps might be brought into the inhaler or medicine independently, alongside bronchodilators and may assist with diminishing the interaction propensity between ace2 receptors and the s1 protein of sars-cov-2. the virus is mainly abundant in the lung and respiratory framework. after inhalation of caps, viral membranes rupture and the resulting new molecular membrane arrangement should be disordered; this should loosen binding with ace2 receptors and diminish the viral burden. second, another plan of smart inhaler has been proposed for covid-19 patients. we have appended an extra parabolic chamber with battery-operated terminals to kill viral pathogens within a second by electrical contact. in this way, the risk of contaminating healthcare personnel with virus-loaded aerosols released from the exhaled air is diminished. structure, function, and evolution of coronavirus spike proteins the coronavirus nucleocapsid is a multifunctional protein structural insights into coronavirus entry host cell proteases: critical determinants of coronavirus tropism and pathogenesis coronavirus envelope (e) protein remains at the site of assembly analysis of sars-cov e protein ion channel activity by tuning the protein and lipid charge the coronavirus spike protein is a class i virus fusion protein: structural and functional characterization of the fusion core complex sars-cov fusion peptides induce membrane surface ordering and curvature coronavirus envelope protein: current knowledge antimicrobial and host-defense peptides as new anti-infective therapeutic strategies collectins and cationic antimicrobial peptides of the respiratory epithelia cationic antimicrobial peptides and their multifunctional role in the immune system the potential of recombinant surfactant protein d therapy to reduce inflammation in neonatal chronic lung disease, cystic fibrosis, and emphysema antimicrobial peptides as mediators of epithelial host defense humoral immunodeficiency in recurrent upper 
respiratory tract infections. some basic, clinical and therapeutic features biomedical exploitation of self assembled peptide based nanostructures functional and structural insights on self-assembled nanofiber-based novel antibacterial ointment from antimicrobial peptides, bacitracin and gramicidin s cationic host defence peptides: potential as antiviral therapeutics transducible tat-ha fusogenic peptide enhances escape of tat-fusion proteins after lipid raft macropinocytosis cationic amphiphiles, a new generation of antimicrobials inspired by the natural antimicrobial peptide scaffold the authors are grateful to prof. ranadhir chakraborty (department of biotechnology, north bengal university, india) for his kind help in linguistic corrections and reframe the manuscript. the authors have no relevant financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. this includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.no writing assistance was utilized in the production of this manuscript. this work is licensed under the attribution-noncommercial-noderivatives 4.0 unported license. to view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ key: cord-276456-oa6hh7ky authors: collins, r.n.; holz, r.w.; zimmerberg, j. title: 5.14 the biophysics of membrane fusion date: 2012-05-03 journal: comprehensive biophysics doi: 10.1016/b978-0-12-374920-8.00523-3 sha: doc_id: 276456 cord_uid: oa6hh7ky a crucial interplay between protein conformations and lipid membrane energetics emerges as the guiding principle for the regulation and mechanism of membrane fusion in biological systems. as some of the basics of fusion become clear, a myriad of compelling questions come to the fore. is the interior of the fusion pore protein or lipid? 
why is synaptic release so fast? why is pip(2) needed for exocytosis? how does fusion peptide insertion lead to fusion of viruses to cell membranes? what role does the tmd play? how can studies on membrane fission contribute to our understanding of membrane fusion? what exactly are snare proteins doing? the biological membrane is ancient, and crucial to the emergence of life itself. while a single topology of membrane was sufficient for bacteria and archebacteria, eukaryotic cells contain specialized subcellular systems of more topologies, enclosing many discontinuous volumes. in order to mix these intracellular volumes, and to secrete such volumes to the extracellular space, or to add a volume to the cytoplasmic space, the membranes enclosing one of these volumes must either rupture or merge with another membrane, a process termed membrane fusion. topological membrane rearrangements, then, such as fusion, its inverse -fission, or formation of a membrane pore, are the most essential ingredients of the complex membrane dynamics of living cells. these membrane transformations are key elements of dynamic intracellular trafficking networks; they are also intimately linked to important pathological processes including cellular entry and egression of enveloped viruses and various parasites, membrane rupture and apoptosis. the topological membrane remodeling generally converges to highly bent intermediates. the characteristic length scales associated with key structural intermediates, such as fusion or fission pores, are typically of the order of tens of nanometers. it has become increasingly clear that critical properties of biological material at this nanoscale cannot be readily extrapolated from bulk measurements. 
neither can they be obtained from experiments with individual molecules, as in many cases, especially in most membrane processes, the functional unit is not a single protein (e.g., channel), but a selectively self-assembled cluster of proteins and lipids acting in a highly co-operative manner. therefore, for intuition and information about the nature of these transformations, we must create and study them where possible. since the biological membrane is thin (5 nm in thickness, that of two lipid molecules) we cannot see the process of membrane fusion under the microscope; we can only see the sequelae of fusion as organellar or cellular contents mix or are secreted. the simplest naturally occurring instance of fusion is the coalescence of two miscible liquid droplets in air, whose two outer molecular layers merge into one. but these layers can dive into the interior; any given molecule is an ephemeral rather than integral part of the droplet's surface. films made of soap or protein (e.g., bubbles made by children from detergent or blowing bubbles in their milk) provide a better model, and we have all seen them fuse and lyse before our eyes. however, they also have surfaces that can exchange with a bulky interior, and their width can vary. these films are unlike the biological membrane in that their apolar moieties face the low dielectric air rather than the high dielectric watery interior of the film. the biological membrane is composed of lipids, proteins, and carbohydrates of varying chemical structure. it exists within the context of an aqueous cellular environment that prefers to avoid the interior of the membrane. this entropically-driven hydrophobic effect leads to two important constraints on topological transformations, (1) a tension at the interface of the polar head groups of the lipids to resist any stretching, and (2) a uniform thickness which is primarily determined by the lipid constituents. 
bilayers made from phospholipids have been used extensively to study the fusion process, and more recently the fission process. since the number of phospholipids molecules far outnumber the other constituents, membrane entropy is dominated by the thermodynamics of the lipids themselves, and thus it is likely that whatever is learned from investigations of lipid bilayer fusion will inform us about biological membrane fusion. indeed, it seems that many of the key intermediates of membrane fusion are the same in the fusion of synthetic and natural membranes. [1] [2] [3] we will not cover this ground, as many excellent reviews serve this purpose. [4] [5] [6] [7] [8] [9] [10] rather, in this chapter we will introduce a number of controversies that the reader may be stimulated to solve. in general there are a number of energy barriers that prevent fusion from proceeding spontaneously. first and foremost, there is the fact that in distilled water and dilute solutions of monovalent salt, all lipid bilayers resist close approach with a force that rises exponentially from an equilibrium distance of about 2 nm. with divalent cations in the solution bathing the membranes, close approach is possible, depending upon the lipid composition. this is readily seen as exchange of components in the contacting leaflets. 11, 12 presumably, this occurs through minute and transient contact sites in which the contacting leaflets are joined (hemifusion). however, this is not generally sufficient either for membrane fusion or the formation of an extended diaphragm composed of the outer, or non-contacting leaflets of the joining bilayers (termed hemifusion diaphragm), 13 for there are energetic restrictions to the widening of the hemifusion intermediate that joins the two leaflets (termed a stalk, [14] [15] [16] . 
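The exponentially rising resistance to close approach described above can be written compactly. This is only a sketch: the decay length $\lambda$ and the prefactor $F_0$ are assumed fit parameters that the text does not give, and only the equilibrium separation of about 2 nm comes from the passage.

```latex
% Repulsive force between two bilayers at separation d <= d_eq.
% d_eq ~ 2 nm is taken from the text; F_0 and \lambda are assumed parameters.
\[
  F_{\mathrm{rep}}(d) \;\approx\; F_0 \,\exp\!\left(\frac{d_{\mathrm{eq}} - d}{\lambda}\right),
  \qquad d \le d_{\mathrm{eq}} \approx 2\ \mathrm{nm}
\]
% Consequence: each further reduction of the gap by \lambda multiplies the force by e.
\[
  \frac{F_{\mathrm{rep}}(d - \lambda)}{F_{\mathrm{rep}}(d)} = e
\]
```

In this form $F_0$ is the repulsive force at the equilibrium separation, and the second relation shows why bringing the bilayers into molecular contact becomes rapidly more costly as the gap narrows.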
the main way to facilitate the formation of a stable hemifusion diaphragm is to add hydrocarbon solvent to the membrane, so as to lower the energy of the three-way junction between the two bilayers and the joined diaphragm. 17, 18 once there is a sufficient junction, another way to complete fusion is via the formation of a lipidic pore within the hemifusion diaphragm. 19 this can be facilitated by either increasing membrane tension, or by adding lipids whose composition favors spontaneous pore formation. 20,21 (formally, such lipids would be defined as having positive monolayer spontaneous curvature. 22 ) thus it is possible to set up ionic and membrane compositional situations which allow the demonstration of membrane fusion and its intermediates without adding energy, thus spontaneous fusion is possible under a set of restricted conditions. what is not at all clear is the favorable effect of tension on the formation kinetics of hemifusion contacts, and how the huge hydration repulsion between contacting monolayers is breached during the tension-driven fusion of liposomes to bilayers. 20 the fusion pore links the interior of a vesicle to the external space, or the interior of a virus to the cytoplasm. a relatively long-lived fusion pore is universal in biological fusion, regardless of whether exocytosis, viral fusion, or cell-cell fusion is studied. since its discovery as a sub-ultrastructural entity, its architecture and composition has been in debate. in part, this is because of two very different views of the mechanism of membrane fusion: lipid centric and protein centric. 
in the lipid centric view, proteins surround a fusion site, and the conformational energy of the protein (as it transitions from a pre-fusion to a post-fusion form) is harnessed into stressing the encircled lipids to the point of a spontaneous transition in topology along a well-studied set of molecular intermediates, first a hemifusion intermediate -the stalk -and then the fusion pore (figure 1). 21, 24, 25 the prediction of this hypothesis is clear: the proteins should be outside of an hourglass-shaped pore scaffolded by proteins. in the protein-centric view, proteins link up the two membranes that are to fuse and make a solid proteinaceous connection, a potential channel that is first closed, then opens to form the initial fusion pore. 3 the prediction of this hypothesis is also clear: the proteins should be at the center of the pore, surrounded and later infiltrated by lipids as the proteins dissociate to guide the topological change of the lipids. a direct method is needed to determine which of these predictions is met during different instances of biological fusion. one of the abiding mysteries in biology is the great speed that synaptic transmission is capable of, with fusion of some vesicles beginning some tens of microseconds after ca2+ floods the presynaptic intracellular release site. 26, 27 this finding leads one to wonder if there is a physical state to a small population of vesicles whose fusion machinery is beyond the stages of priming and conformational change. 3, 28 there are two main proposals: 1. that synaptotagmin provides positive curvature stress that translates into hemifusion at the center of a ring of protein at the base of a dimple, 29 and 2. that ring assemblies of snare (soluble n-ethylmaleimide-sensitive factor attachment protein (snap) receptor) and synaptotagmin complexes form to appropriately concentrate and orient c2b domains of synaptotagmin. 
2 this ring of ordered domains effectively creates a tube-like scaffold of positively charged protein residues that span the two membranes that are to fuse, a favorable location for dimples of membrane to approach each other. in other words, it would be an electrostatic tunnel for membrane fusion that is extended by the polybasic linker regions of syntaxin and synaptobrevin. 3 variations of this model can account for many physiological pathways, including a small fraction of the vesicles that may already be interacting at the level of hemifusion prior to the entry of calcium. what is the role of calcium once it enters? first, ca2+ turns on an 'electrostatic switch' initially proposed for synaptotagmin-syntaxin interaction, but better suited to instantaneously stressing the phospholipid bilayers of the presynaptic membrane and the synaptic vesicle for the ultra-rapid exocytosis seen in the nervous system. second, even without synaptotagmin, ca2+ speeds up fusion of snare-reconstituted membranes considerably. perhaps divalent ions play a direct role, electrostatically complexing ps headgroups to promote fusion between negatively charged phospholipid bilayers. is this effect specific for calcium over magnesium? there are indications that the spontaneous curvature of ps is significantly more negative in the presence of calcium than in the presence of magnesium. 2 ultimately, synaptotagmin, snares, and the other proteins that comprise the exocytotic fusion machine must cajole lipids to move through a pathway that culminates in fusion pore opening. the snare proteins and synaptotagmin are the guides that walk and pull the membrane through a bumpy stalk-pore path, with electrostatic interactions playing a larger role than hitherto realized. 20
5.14.5 why is ptdins-4,5-p 2 needed for exocytotic fusion?
experiments on the permeabilized chromaffin cell established a requirement for atp in membrane fusion in exocytotic secretion. 
30, 31 further work revealed the product of atp in exocytosis to be ptdins-4,5-p 2 , which stands out as a key player amongst bilayer lipids. it has been almost 20 years since this minor plasma membrane constituent was directly implicated in exocytosis. the early studies used biochemical approaches in permeabilized cells that directly implicated the polyphosphoinositides 30 and subsequently pip kinase and ptdins-4,5-p 2 5 as key components late in the fusion pathway. imaging of ptdins-4,5-p 2 in secreting cells demonstrates that the lipid is located on the plasma membrane and not on the secretory granule. 32, 33 ptdins-4,5-p 2 associates with syntaxin puncta in plasma membrane lawns from pc12 cells. 7 the concentration of ptdins-4,5-p 2 in the cytoplasmic leaflet of puncta is surprisingly high, ~6 mole%. 8 an initial fusion pore of 3 nm diameter and 10 nm length 9,10 would have room for approximately 140 phospholipid molecules (area = 70 Å^2) facing the cytosol including nine ptdins-4,5-p 2 molecules at 6 mole%. clearly, ptdins-4,5-p 2 is absolutely required for exocytosis. ptdins-4,5-p 2 can have two, potentially interrelated functions:
1. as a scaffold for proteins of the exocytotic and endocytotic pathways, such as the exocyst and indirect effects through the actin cytoskeleton. 34-36
2. as a lipid involved in the topological rearrangement of the plasma membrane during fusion and fission.
a distinguishing feature of ptdins-4,5-p 2 is its high negative charge of approximately -4 at ph 7. 11 the highly charged lipid creates a highly dynamic electrostatic scaffold that interacts with unstructured (e.g., in marcks, gap43) and structured (e.g., in ph and c2 domains) basic moieties on a variety of proteins (for a review see ref. 12). unstructured basic peptides cause lateral sequestration of ptdins-4,5-p 2 on membranes in vitro containing 1 mole% ptdins-4,5-p 2 with as much as 30 mole% ps, 13 an effect that is explained by electrostatic considerations. 
14 several proteins that play important roles in exocytosis have structured basic moieties in c2 domains that interact with ptdins-4,5-p 2 , including synaptotagmin 15 and rabphilin. 16 stopped-flow techniques demonstrate that ptdins-4,5-p 2 greatly speeds the ca 2+ -dependent interaction of synaptotagmin with membranes in vitro, 17 suggesting that this interaction may have important physiological consequences. in addition, fret has been used to show that ptdins-4,5-p 2 directly interacts with syntaxin in vitro. 37 numerous proteins involved in exocytosis and endocytosis contain structured basic sequences in pleckstrin homology (ph) domains that interact with ptdins-4,5-p 2 . these include caps in the exocytosis pathway 19 and dynamin in the endocytosis pathway. ptdins-4,5-p 2 also plays an important regulatory role in conjunction with small gtpases and proteins that regulate the actin cytoskeleton. 20 these pathways also influence fusion and fission. while there is strong evidence for ptdins-4,5-p 2 interacting with proteins that are important for fusion and fission, there is at present little evidence for a direct role of the lipid in the fusion or fission reactions. for example, both long-term and acute modulation of ptdins-4,5-p 2 in chromaffin cells alter the size of the releasable granule pools but not the fusion kinetics, 21 consistent with a role prior to but not during fusion. nevertheless, it seems likely that ptdins-4,5-p 2 molecules, with their high charge and relatively high concentrations at fusion sites (as many as nine molecules in the cytoplasmic leaflet of the fusion pore), would directly influence lipid rearrangements. this is an important area for future investigation. it will be challenging to distinguish between the lipid simply being a scaffold for a multitude of proteins involved with trafficking at the plasma membrane, and having a direct function in the bilayer rearrangements of fusion and fission.
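the pore estimate quoted in this section (roughly 140 phospholipids facing the cytosol, about nine of them ptdins-4,5-p 2 at 6 mole%) can be checked with a short calculation. the sketch below is illustrative only: it assumes the pore is a cylinder whose lateral wall is tiled by lipid headgroups, and it takes the 10 nm figure to be the pore length. the function name and these geometric choices are assumptions, not taken from the source.

```python
import math

ANGSTROM2_PER_NM2 = 100.0  # unit conversion: 1 nm^2 = 100 Å^2


def lipids_lining_pore(diameter_nm, length_nm, area_per_lipid_A2=70.0):
    """Lipids needed to tile the lateral wall of a cylindrical pore.

    Assumption (not from the source): the pore is an open cylinder and
    only its side wall (area = pi * d * L) is counted.
    """
    wall_area_A2 = math.pi * diameter_nm * length_nm * ANGSTROM2_PER_NM2
    return wall_area_A2 / area_per_lipid_A2


# Values quoted in the text: 3 nm diameter, 10 nm (taken here as length),
# 70 Å^2 per phospholipid, 6 mole% ptdins-4,5-p2.
n_lipids = lipids_lining_pore(diameter_nm=3.0, length_nm=10.0)
n_pip2_geometric = 0.06 * n_lipids  # using the cylinder estimate
n_pip2_quoted = 0.06 * 140          # using the ~140 lipids quoted in the text

print(f"lipids lining the pore (cylinder estimate): {n_lipids:.0f}")
print(f"ptdins-4,5-p2 at 6 mole% (cylinder estimate): {n_pip2_geometric:.1f}")
print(f"ptdins-4,5-p2 at 6 mole% of 140 lipids: {n_pip2_quoted:.1f}")
```

under these assumptions the cylinder holds about 135 lipids, close to the quoted figure of approximately 140, and 6 mole% corresponds to between eight and nine ptdins-4,5-p 2 molecules, in line with the "as many as nine" cited in the text.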
in fact, these two roles may sometimes be indistinguishable, since one or more of the interacting proteins may have as its primary task the regulation of ptdins-4,5-p 2 function in fusion or fission.
the superficially shared structural features of ha with the snare proteins essential for internal cellular fusion have stimulated the hope that there is a universal mechanism for protein-mediated fusion. as the prototype of class i fusion proteins and the epitope for flu serotyping, the influenza virus ha has been studied extensively. while lipid mixing in ha-mediated fusion is rapid in vitro relative to infection speeds (figure 2), fusion pore formation has not been measured, and thus we do not know the kinetics of complete fusion for ha in an intact virus. however, the accessibility of ha-mediated cell-cell fusion and the large body of investigation into ha make this the best studied fusion protein that is sufficient for fusion. from it we learn the importance of guided conformational transitions together with protein-protein interactions, in conjunction with clear phenotypic discrimination between the intermediates of hemifusion, fusion pore opening, and subsequent widening. clearly there is much more work to be done, because we do not know why changing a single amino acid residue at the tip of the fusion peptide gives hemifusion instead of fusion. 38 ha is the first viral glycoprotein for which the structures of both pre- and post-fusion forms were solved at atomic resolution. 39, 40 ha is synthesized as a single-chain precursor protein, ha0, which then oligomerizes into a trimer during transport through the secretory pathway. 41 the precursor ha0 (549 amino acids in a/hong kong/68/h3n2) then needs to be cleaved into the ha1 and ha2 polypeptides after the conserved arginine at residue 329 to be primed for the subsequent low ph-triggered conformational changes (figure 3).
42 although the crystal structures of both pre- and post-fusion forms of ha have been available for more than a decade (figure 4), we still do not know exactly how the conformational rearrangements occur step by step, how the low ph environment triggers the rearrangements of the ha ectodomain, and which structural elements are crucial for ha fusion activity. it has been suggested that low ph might lead to enhanced protonation of the ha1 domain and generate enough electrostatic repulsion to partially dissociate the ha1 domain from ha2, allowing the loop region in ha2 to contact water; this would cause the loop to transition into a helix and extend the pre-existing central coiled coil (figure 5). 44-46 in this model, the only step requiring a low ph trigger is the exposure to water of the loop region connecting the two alpha helices in ha2, by partial dissociation of ha1 from ha2. consistent with this idea, spontaneous formation of extended coiled coils is observed in bacterially expressed ha2 polypeptide at neutral ph, indicating that the low ph trigger is not necessary for the extension of coiled coils. 47 the prevailing hypothesis for how conformational changes lead to the fusion of two membranes is that extension of coiled coils triggered by low ph directs the fusion peptide to insert into the target membrane, and then a helix-to-loop conformational change reorients the protein to pull the fusion peptide toward the transmembrane domain. these molecular transitions result in a tight packing of the cooh-terminus of ha2 against grooves of the nh 2 -terminal coiled coil, proceeding to the fusion of the two bilayers. 48-50 this hypothesis emphasizes that both the extension of the coiled coil and the bending of the protein resulting from the helix-to-loop conformational change are important for fusion.
the experimental results favoring this hypothesis demonstrate that a double proline substitution mutant (f63p/f70p) in the region thought to undergo the loop-to-helix conformational change at low ph failed to induce fusion, although the mutant still presented the fusion peptide and inserted it into the target membrane. 50 the block to fusion was shown to occur in the tight packing of the cooh-terminal extended regions into the grooves between the helices of the nh 2 -terminal half of the coiled coil, owing to the splayed nh 2 -terminal half of ha2. this suggests that insertion of the fusion peptide into the target membrane alone is not sufficient for fusion, and that the tension caused by the packing of the cooh-terminus against the nh 2 -terminus of ha2 may be the driving force for membrane merging. another piece of evidence for the role of this packing in viral fusion is that alanine substitution mutants at five apolar residues after the short cooh-terminus of ha2 fail to cause either lipid or content mixing. 51 as a result of the conformational changes of ha upon low ph exposure, the fusion peptide located at the tip of the ha2 molecule is brought into close proximity with the target membrane. when the target membrane is available, the fusion peptide inserts into the lipid bilayer to induce lipid mixing between the target cell membrane and the viral membrane. it has been experimentally determined that the free energy associated with the insertion of a full-length fusion peptide into the lipid membrane is 4.5 k b t, and as many as 18 fusion peptides binding to the target membrane might generate enough energy to stabilize a stalk-like fusion intermediate, which has high membrane curvature. 52 while necessary for fusion, the packing of fusion peptides does not appear to be sufficient for fusion. lipids present in the membrane act together with ha to cause fusion.
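the energy budget implied by the figures quoted above (4.5 k b t per inserted fusion peptide, up to 18 peptides) can be made concrete with a few lines of arithmetic. this is a minimal sketch: it assumes the per-peptide insertion energies simply add, and the conversion to kj/mol assumes a temperature of 310.15 k (37 °c); the function names and the temperature are illustrative choices, not taken from the source.

```python
BOLTZMANN_J_PER_K = 1.380649e-23   # exact SI value
AVOGADRO_PER_MOL = 6.02214076e23   # exact SI value


def total_insertion_energy_kbt(n_peptides, energy_per_peptide_kbt=4.5):
    """Summed insertion free energy in units of kBT, assuming additivity."""
    return n_peptides * energy_per_peptide_kbt


def kbt_to_kj_per_mol(energy_kbt, temperature_k=310.15):
    """Convert an energy in kBT to kJ/mol at an assumed temperature."""
    joules_per_event = energy_kbt * BOLTZMANN_J_PER_K * temperature_k
    return joules_per_event * AVOGADRO_PER_MOL / 1000.0


total_kbt = total_insertion_energy_kbt(18)
total_kj_per_mol = kbt_to_kj_per_mol(total_kbt)

print(f"18 peptides x 4.5 kBT = {total_kbt:.0f} kBT")
print(f"equivalent at 310.15 K: {total_kj_per_mol:.0f} kJ/mol")
```

with these inputs the upper bound is 81 k b t, roughly 209 kj/mol at the assumed temperature.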
changing membrane lipid physical properties or composition in ways that are well defined also blocks fusion, despite conformation-specific antibody binding indicating that ha is in the post-fusion state. 53 in other words the protein conformational change, while essential, is not sufficient, as there are changes in lipid conformation after the protein conformational change that are needed for fusion to continue along its path. thus the pathway of ha-mediated membrane fusion involves conformational changes that induce lipids to undergo the more general pathway outlined above for lipid membrane fusion. 53 the role of the fusion peptide domain in ha-mediated membrane fusion has been the subject of many biophysical studies. 54 the fusion peptide of ha is rich in glycine; for example, the fusion peptide of influenza a/x31/h3n2 contains 6 glycine residues in 20 residues (glfgaiagfiengwegmidg). extensive mutagenesis studies of fusion peptides have revealed that both the primary sequence and the length of the fusion peptide are crucial for ha fusogenic activity. 55 for example, the ha2 g1e substitution abolishes cell-cell and rbc-cell fusion activity of expressed ha, while the g4e substitution decreases fusion efficiency and elevates the ph threshold for activation. 56 alanine can substitute for glycine at positions 1 and 4 without impacting ha-induced cell-cell fusion; however, substitution of the polar amino acid serine for glycine at position 1 causes a hemifusion phenotype. 38 the requirements for specific amino acids at certain positions and for a defined length in the fusion peptide have been further supported by the nmr-solved structure of the fusion peptide in detergent micelles and in model lipid membranes 43,52 (figure 6).
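the glycine content of the a/x31/h3n2 fusion peptide and the substitution shorthand used above (g1e, g4e, g1s) can be verified directly from the quoted sequence. the sketch below is illustrative; the helper names are hypothetical, and the substitution codes are read as wild-type residue, 1-based position within the fusion peptide, then replacement residue.

```python
# Fusion peptide sequence as quoted in the text (one-letter code, upper case).
FUSION_PEPTIDE = "GLFGAIAGFIENGWEGMIDG"


def glycine_positions(sequence):
    """1-based positions of glycine residues in a one-letter sequence."""
    return [i for i, residue in enumerate(sequence, start=1) if residue == "G"]


def apply_substitution(sequence, code):
    """Apply a point substitution such as 'G1E' after checking the wild-type residue."""
    wild_type, position, replacement = code[0], int(code[1:-1]), code[-1]
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"{code}: expected {wild_type} at position {position}, "
            f"found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + replacement + sequence[position:]


positions = glycine_positions(FUSION_PEPTIDE)
print(f"length: {len(FUSION_PEPTIDE)}, glycines: {len(positions)} at {positions}")

for code in ("G1E", "G4E", "G1S"):
    print(code, apply_substitution(FUSION_PEPTIDE, code))
```

the check confirms 6 glycines in 20 residues, with glycine at positions 1 and 4 as the substitution names require.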
at acidic ph, the 20 residues of the fusion peptide adopt a v-shaped 'boomerang' structure with an oblique nh 2 -terminal amphipathic helix spanning residues 2-10 and a turn formed by residues 11, 12 and 13, followed by a short cooh-terminal helix 43, 57 stabilized by a charge-dipole interaction between the n-terminal gly and the dipole moment of helix 2 58 (figure 6). the bent fusion peptide may then insert ~16-17 Å into the outer leaflet of the target membrane, almost to the mid-plane of the lipid bilayer, with residues leu 2 and phe 3 penetrating deepest into the membrane. 52, 57 in the solution structure, the hemifusion phenotype mutant g1s has a structure similar to that of the wild type, but the glycine ridge on the outer surface of the nh 2 -terminal helical arm is disrupted. in contrast, the mutant g1v, in which fusion is completely abolished, has a very irregular linear amphipathic helix instead of the fixed-angle boomerang structure (figure 6). 59,60 another fusion-defective mutant, w14a, has a more flexible kink than the wild type. in contrast, the alanine substitution at phenylalanine residue 9 (f9a) has a structure similar to that of the wild type and has no defect in fusion (figure 6). 61 these studies suggest that the angled and deeply inserted structure and the glycine ridge both contribute to fusion activity. although we have gained a large amount of information on the structure of the influenza fusion peptide over the past several years, there are still open questions about how insertion of the fusion peptide leads to fusion of the viral and target membranes, and about the thermodynamic profile during the mixing of two bilayers. it has been generally accepted that the transmembrane domain (tmd) of ha2 and the cooperation of multiple ha molecules are required for fusion pore initiation and fusion pore enlargement.
62 replacing the tmd of ha with a glycosylphosphatidylinositol (gpi) anchor, or shortening it by deletion to fewer than 17 residues, causes a hemifusion phenotype, but a tmd with polar amino acids at the cooh-terminus still allows full fusion. 12,63 a synthetic peptide representing the transmembrane segment of x31/ha spans an artificial dmpc/dmpg bilayer as an α-helix that aligns roughly perpendicular to the bilayer membrane. the consequences of ha fusion peptide insertion are not well understood, although it is clear that the proposed boomerang structure avoids placing the polar amino acids within the hydrophobic phase, which would be disruptive or structure-forming. insertion of a charged amphipathic helix per se would tend to promote positive curvature, and might aid in ''nippling'' the membranes towards each other to make contact prior to fusion, as in the proposed mechanism for synaptotagmin. 29 one suggestion is that the membrane helices of the fusion peptide and the tmd could interact with each other, thereby providing a driving force for the mixing of two membranes. 64 other open questions include the behavior of ha fusion (and that of other enveloped viruses) in the context of the negative membrane curvatures found in the multi-vesicular bodies of endosomal compartments, 65 and of lipids known to be enriched in endosomes. another open question is whether the ha conformational changes are strictly irreversible in the absence of a target membrane. the prevailing model that spring-loaded conformational changes are uni-directional is based in part on the fact that pre-treatment of virus particles from typical laboratory strains (e.g., the h3 strain x-31) with low ph effectively neutralizes infection. analysis of other viral subtypes (e.g., h2) shows features consistent with reversible conformational changes, 66 and many natural isolates of influenza do not show the irreversible conformational changes associated with x-31 in the absence of a target membrane (g.
whittaker, personal communication). tatulian and tamm 67 have demonstrated that the conformational change of the entire ha is reversible in the absence of bound target membranes.
the formation of a six-helix bundle (6hb) is the structural feature that characterizes class i viral fusion proteins. because fusion peptides insert into target membranes and tmds span the viral envelope, the folding of fusion proteins into hairpins brings the viral envelope and host cell membrane into proximity. even for the two other classes of viral fusion proteins, which do not exhibit 6hbs and contain much β-sheet, it is likely that fusion peptides and tmds are proximal in the final, post-fusion state. it is not experimentally certain that this proximity occurs, because the hydrophobic tmds and fusion peptides themselves are not present in the crystallized proteins. this apposition of the membrane-embedded tmds and fusion peptides of viral fusion proteins is similar to the proximity between the tmds of v- and t-snares in their tetrameric coiled coil. when the ubiquity of 6hbs was first demonstrated, it was assumed that bundle formation merely brought membranes into close contact, and fusion then occurred. we now know this is incorrect. the correlations between bundle formation and steps in the fusion process have been investigated for hiv-1 env, influenza ha, and aslv env. the precise steps of bundle formation depend on the particular protein. it appears that the longer the amino acid sequence that intervenes between tmds and the coiled coil, the earlier the bundle can form. for hiv-1 env, which has a relatively short intervening sequence, not only does hemifusion occur prior to completion of bundle formation, but so does the creation of the initial fusion pore. bundle formation releases considerable energy, and its late occurrence indicates that considerable energy is required for pore enlargement, the last step in fusion.
in fact, several lines of evidence lead to the picture that hemifusion is energetically easy to achieve, pore formation more difficult, and pore enlargement even more difficult. from the theoretical point of view of membrane mechanics, considerable work must be expended to enlarge a fusion pore because additional membrane must be bent from its relaxed state as the pore enlarges. the requirements for hemifusion are reasonably well understood: the standard helfrich concepts of spontaneous membrane curvature and bending energy are sufficient to account for experimental observation. the mechanism by which fusion proteins create the pore is less well understood, and how fusion proteins contribute to pore enlargement remains a mystery. for topological reasons, tmds cannot span hemifusion diaphragms and can only enter the diaphragm from the junction between the diaphragm and the two original membranes. common sense suggests that this entry of tmds should destabilize the junction and thereby generate pores. this is consistent with demonstrations that tmds are directly involved in pore formation 68 and with theory that predicts that pores form at the junction. but why tmds are forced through the junction has not been addressed, and the determinants of pore properties are virtually uncharacterized. for example, some viral fusion proteins, such as hiv-1 env, generate large initial pores that readily open, whereas others, such as influenza ha, create small pores that generally flicker open and closed before they enlarge. it seems reasonable that differences in pores are conferred by the proteins (rather than by the lipids), but the field has not formulated guiding principles as to which structural features of a protein control pore properties. because viral nucleocapsids can only enter the cytosol if a pore enlarges, these questions are potentially of practical, in addition to biophysical, importance.
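for reference, the helfrich bending energy invoked above for hemifusion is commonly written in the following form. the notation is the conventional one and is not taken from the source: \kappa is the bending modulus, \bar{\kappa} the gaussian (saddle-splay) modulus, c_1 and c_2 the principal curvatures, c_0 the spontaneous curvature, and the integral runs over the membrane area A.

```latex
E_{\mathrm{bend}} \;=\; \int_A \left[ \frac{\kappa}{2}\,\bigl(c_1 + c_2 - c_0\bigr)^{2} \;+\; \bar{\kappa}\, c_1 c_2 \right] \mathrm{d}A
```

in this picture, the first term penalizes any departure of the total curvature c_1 + c_2 from the spontaneous curvature c_0, which is one way to express the statement above that bending additional membrane away from its relaxed state costs work as a pore enlarges.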
snare complexes formed between two fusing membranes are the principal fusogens of the eukaryotic molecular machinery that mediates membrane fusion in intracellular trafficking pathways. 69 the snare complex is a coiled bundle of four parallel helices provided by three or four individual snare protein molecules (figure 7). perhaps the best studied is the snare complex mediating neuronal exocytosis, which contains four helices provided by three snare proteins: snap-25 (synaptosome-associated protein of 25 kd), synaptobrevin-2, and syntaxin-1. 70 the free energy generated during the assembly of a single ternary snare complex is estimated to be 20-35 k b t. 71-73 a key question is what proportion of the energy of snare complex formation is directed at membrane fusion and what, if anything, the energetics of this reaction contribute to membrane docking and tethering. several of the proteins engaged in and/or responsible for fusion have been studied at atomic resolution with biophysical structural approaches. these studies have greatly illuminated our understanding of the protein machines driving membrane fusion (reviewed in refs. 75 and 76). the minimal domains of snare proteins that can spontaneously engage in a stable 4-helix snare complex revealed a characteristic packing at the core of the snare complex. the majority of the packing interactions are hydrophobic; however, there is an ionic layer typically consisting of a single arginine and three glutamine residues. 70, 74 this ionic layer is found at the midpoint of the coiled-coil bundle and hence is referred to as the zero ionic layer. the zero ionic layer is an evolutionarily conserved feature of all snare complexes examined to date; however, the functional role of this feature is not known. biophysical characterization of snare complexes in which the zero ionic layer is perturbed suggests that it may be important for the stability of snare complexes.
77 one idea was that this layer could provide an intervention point for snare complex disassembly; however, perturbation of the layer in an in vitro disassembly assay 78 had no discernible impact on nsf/snap-catalyzed snare disassembly. current hypotheses for how polar zero layer residues might impact snare function favor potential role(s) in snare complex assembly, such as the suggestion that a polar zero layer helps align the snare helices in register during assembly, or in other downstream functions of the snare complex. mutagenesis studies of the zero ionic layer in different systems suggest different effects, but they are all relatively subtle. 79, 80 clearly, the influence of particular residues may vary according to the individual snare complex in question and the local parameters governing the assembly of a particular cognate snare complex. structures of the individual snare proteins have been tremendously stimulating in posing novel questions. the coiled-coil α-helix of synaptobrevin extends to the most membrane-proximal residue, lysine 87, and this residue is also part of the extended transmembrane helix. the energetics and topology of snare complex formation may influence local bending of the α-helix at the interfacial region, which in turn could generate local membrane destabilization to aid fusion (figure 8). it is not known how the membrane itself may locally influence the structure of cytoplasmic portions of synaptobrevin or other proteins involved in fusion. a recent structure of lipid-bound synaptobrevin suggests that the amphipathic helix 1 of synaptobrevin may lie on the surface of the membrane, 81 providing a molecular explanation for the observation that the membrane may influence the cytoplasmic portion of synaptobrevin to adopt conformations not observed in the absence of lipid. 82 some snares contain independently folded nh 2 -terminal domains together with additional unstructured linker regions of significant length.
the presence of such domains and their ability to interact inter- and intra-molecularly significantly increase the complexity of snare complex formation and the ability of the snare proteins to drive membrane fusion. syntaxin contains a linker region connecting the snare domain with the h abc nh 2 -terminal domain. fully extended, this region may be up to approximately 120 Å in length (figure 9). it is currently unknown whether there are proteins that selectively bind to or regulate this region. in contrast to synaptobrevin, syntaxin has a slightly extended membrane-proximal region that may not be part of the initial core snare complex structure (residues 260-265). the post-fusion snare complex shows that this region adopts an α-helical structure that directly links the snare complex α-helix to the transmembrane α-helix. the local secondary structure adopted by these residues prior to fusion is unknown; secondary structure predictions suggest the region is unstructured and could conceivably act as a hinge, allowing the molecule to sample up to a 290 Å radius of the membrane-proximal area (figure 10). there is evidence for an initial nh 2 -terminal interaction between the snare proteins, but the final low energy 4-helical bundle may not be an intermediate in fusion, but rather either a dead-end conformation or a post-fusion conformation. an experiment missing from the field is a demonstration of helical bundle formation before, or simultaneous with, fusion pore formation, as described above for viral fusion. there is strong evidence that prior interaction between the plasma membrane snares syntaxin and snap25 increases the rate of interaction with vamp. 83 the relationship of the post-fusion snare complex (figure 11) to the lipid rearrangements that occur during fusion and content mixing is currently not known. one study has placed syntaxin as a pore-forming molecule, with 5-8 syntaxin molecules making up a fusion pore.
84 assembly of 5-8 snare complexes as the minimum number required to drive fusion events is in good agreement with theoretical considerations of the energetics of snare complex assembly and membrane fusion. approximately 8 snare complexes were required to promote fast fusion in supported bilayer experiments, 85 and fewer in other lipid mixtures including pe, presumably because pe promotes fusion by curvature. 86 what is enormously fascinating is how the protein fusogens form such a pore: how the pore forms initially and how it dilates as fusion proceeds to completion. these pathways may have different kinetic and thermodynamic parameters for different types of biological fusion, depending on the physiological requirements of the particular fusion event. the snare machinery certainly appears to have the potential for adaptability, mediating types of fusion as diverse as ''kiss and run'', complete fusion during constitutive events of exocytosis, and homotypic fusion events such as those between endosomes. 87 the snare complexes for which the structures have been solved all contain the four-helix bundle in a parallel orientation. perhaps not surprisingly, given the versatility of coiled-coil structures, the snare domains are also capable of associating in anti-parallel bundles, which are also stable, although not as stable as the parallel bundles. 88 this is superficially reminiscent of the anti-parallel and parallel coiled-coil transitions experienced by different conformations of ha (figures 4 and 5), although caution should be exercised in extending these analogies given the topological constraints of the fusion machinery. it is not known what such associations may represent physiologically; they could possibly represent a means of tethering membranes independently of fusion, or may be unproductive molecules requiring reconditioning by accessory factors in order to participate in multiple rounds of membrane fusion.
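the energetic argument for 5-8 snare complexes can be made explicit by combining that range with the 20-35 k b t per complex quoted earlier in this section. the sketch below simply multiplies the bounds; it assumes the assembly energies of individual complexes add and that all of it is available for fusion, which the text itself flags as an open question. the function name is hypothetical.

```python
def snare_energy_range_kbt(n_complexes=(5, 8), energy_per_complex_kbt=(20.0, 35.0)):
    """Lower and upper bounds on total SNARE assembly energy, in kBT.

    Assumes simple additivity across complexes (an idealization).
    """
    n_min, n_max = n_complexes
    e_min, e_max = energy_per_complex_kbt
    return n_min * e_min, n_max * e_max


low_kbt, high_kbt = snare_energy_range_kbt()
print(f"total assembly energy for 5-8 complexes: {low_kbt:.0f}-{high_kbt:.0f} kBT")
```

under these assumptions, 5-8 complexes would release between 100 and 280 k b t in total.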
the profile of the energy landscape 89 traversed by the fusion reaction will be influenced by a multitude of exogenous factors. known factors include the membrane lipid composition, the availability of snare proteins in suitable pre-fusion states, and the specific activity of snare accessory factors (table 1). these factors can have multiple influences; for example, membrane composition can play a role in providing molecular determinants for protein assembly and will also determine membrane elasticity. how conformational changes amongst snare proteins and their accessory factors control the thermodynamics and kinetics of docking, lipid mixing, and content mixing after membrane fusion remains an open question. our understanding of the conformational plasticity of protein machineries and the hysteresis properties of biological fusion machines leads to an appreciation of the energetic complexities of the fusion reaction. the use of single molecule studies to unravel the complex energy landscape of membrane fusion will be an important biophysical approach with tremendous potential to relate the topography of the energy landscape to the mechanism and regulation of fusion. although the application of this approach to studying biological fusion machines is relatively new, 72, 125-130 such experiments have some additional advantages, such as being able to distinguish fusion events from vesicle aggregation or vesicle rupture. single molecule approaches using fluorescently-labeled virus in living cells have allowed visualization of influenza fusion with intracellular compartments/endosomes, 131 and the use of solid supported lipid bilayers 132 facilitates a more detailed analysis of the single-particle kinetics of ha-mediated fusion as well as snare-mediated fusion.
85 in combination with assays that reflect lipid and content mixing, together with pore formation and expansion, such approaches are expected to contribute substantially towards providing missing information regarding the intermediates and pathways involved in fusion.
membrane fusion?
recent work has shown that two rungs of a dynamin spiral are the minimal structural unit responsible for the formation of the fission neck and hemifission intermediate in model membrane nanotubes (figure 12). 133 fission requires the same hemifusion event, but in a cylindrically symmetrical way, as the inner leaflet of the tube must touch itself in the center to allow for the hemifission intermediate. once again, it is relatively straightforward to calculate the ring conditions that will lead to constriction and narrowing of the tube towards the center, as long as the distance between the two rings is not too short. but once again, we are faced with the hydration force resisting any further constriction of the neck, since the surfaces of the inner leaflet lipids would have to come closer than the 2 nm equilibrium distance between attractive dispersion forces and repulsive hydration forces. but lo, there is a new modality for minimizing energy in this system; it is the tilt-like movement of lipid head groups away from each other at the very center of the hourglass constriction of the neck (figure 12), resulting in formation of a narrow separation of heads exposing the hydrophobic interior of the bilayers, termed a hydrophobic ''belt'' to indicate its presence as a ring (you can visualize this ring by rotating the panels of figure 12(a) around the axis of the horizontal dashed line under each bilayer). now the repulsive hydration force stabilizing inner aqueous diameters of 2 nm gives way to a newly developed hydrophobic attractive force, which is effectively the desire of water to desolvate the space between the tilting headgroups in the center.
this water ejection leads to the close approximation and finally merger of the neck with itself at the midpoint; that is, its closure. figure 12 (d) shows the calculated energy of a 9-mm-long segment of neck, depending upon the width (h) and radius (rmid) of the hydrophobic belt discussed above. like the stalk, the belt width becomes that of a single monolayer at the point of merger. the energy barrier of 35 k b t is also similar to those calculated for membrane fusion, 24 so it is energetically feasible. the lipid bending and tilting needed to catalyze membrane merger are mainly motivated by high curvature stress in the neck inner leaflet. accordingly, the energy barrier depends sharply on the minimal radius of the thinned neck, rmin, which should approach 1-2 nm for the fission to occur, as seen in other estimations of hemifission. 135 thus the rings of protein acting on the outside of a 12 nm tube lead to boundary conditions whose effects propagate through the bilayer to influence and stress lipids facing each other across water, leading to their head groups parting and their oils merging in energetically feasible hemifission. the proteins act at length scales consistent with what we know about protein structural lengths, and the lipids respond fluidly to the protein's influence. mechanistic studies concerning dynamin offer not only a static view of the energetics of non-leaky fission/fusion reactions but also insights into the dynamics of the transition structures. dynamin assembly shapes bilayers into highly curved structures ( figure 12 ). assembly also greatly enhances the rate of gtp hydrolysis (100-fold), 136 which in turn leads to dynamin disassembly. thus, assembly is self-limited in the presence of gtp. recent experiments suggest that assembly-induced gtpase activity reduces the interaction of dynamin with the highly curved lipid membrane even before disassembly. 
137 the result is the unstable, highly curved lipid neck described above, which resolves either in membrane fission (endocytosis, described above) or in reversal of the high membrane curvature. from these studies, performed in model membranes, membrane fission may therefore be considered to be a stochastic result of gtp hydrolysis. 133, 139 studies in adrenal chromaffin cells suggest that dynamin functions in a similar way to control the fate of a recently fused secretory granule. increasing the dynamin gtpase activity increases the rate of fusion pore expansion and the likelihood of rapid endocytosis. 138 the results are consistent with a function for dynamin in restricting fusion pore expansion. increased gtpase activity catalyzes a more rapid stochastic decision that results in either fusion pore expansion or (less frequently) membrane fission and endocytosis. recent theoretical and experimental studies of membrane fission by dynamin and viral matrix protein reveal how protein complexes are arranged to effectively apply localized curvature stress to membranes without perturbing lateral membrane integrity; that is, without leakage. one conceptual approach is that sites of membrane remodeling are organized as membrane domains, both through membrane composition and membrane curvature; thus membrane remodeling is a collaborative effort accomplished by the entire domain, involving protein complexes and multiple lipids. we emphasize that despite decades of studies we still know only a little about the fundamental physical principles underlying the spatial and temporal organization of membrane domains specialized in membrane remodeling. for example, the structure and composition of the fusion pore are unknown. new synergistic experimental and theoretical approaches are needed to resolve how proteins merge and separate membranes.
for example, the ability to detect submicron deformations of the plasma membrane with the combination of polarization and tirfm techniques 138, 139 permits detection of the expanding fusion pore and may enable investigations of the molecular basis for membrane curvature changes in living cells. the key is to study membrane remodeling in the context of concrete biological processes, taking into account corresponding length scales for the key membrane intermediates, dynamic cooperation between protein machineries and lipids, component segregation and sorting and, importantly, long-range interactions which are, ultimately, of critical importance for highly localized membrane rearrangements leading to membrane fusion or fission.

functional determinants of a synthetic vesicle fusion system influence of lipid composition on physical properties and peg-mediated fusion of curved and uncurved model membrane vesicles: ''nature's own'' fusogenic lipid bilayer fusion pores and fusion machines in ca 2+ -triggered exocytosis the molecular mechanism of mitochondrial fusion mitochondrial fusion and fission in mammals protein-driven membrane stresses in fusion and fission protein-lipid interplay in fusion and fission of biological membranes mechanisms of membrane fusion: disparate players and common principles conflicting views on the membrane fusion machinery and the fusion pore how does synaptotagmin trigger neurotransmitter release? flickering fusion pores comparable with initial exocytotic pores occur in protein-free phospholipid bilayers gpi-anchored influenza hemagglutinin induces hemifusion to both red blood cell and planar bilayer membranes the pathway of membrane fusion catalyzed by influenza hemagglutinin: restriction of lipids, hemifusion, and lipidic fusion pore formation possible mechanism of membrane fusion phospholipid surface bilayers at the air-water interface. ii. 
water permeability of dimyristoylphosphatidylcholine surface bilayers on the theory of membrane fusion. the stalk mechanism asymmetric membranes resulting from the fusion of two black lipid bilayers short-chain alcohols promote an early stage of membrane hemifusion the mechanisms of lipid-protein rearrangements during viral infection how proteins produce cellular membrane curvature lipids in biological membrane fusion intrinsic bending force in anisotropic membranes made of chiral molecules parameters affecting the fusion of unilamellar phospholipid vesicles with planar bilayer membranes a quantitative model for membrane fusion based on low-energy intermediates the exocytotic fusion pore modeled as a lipidic pore calcium in synaptic transmission timing of neurotransmission at fast synapses in the mammalian brain emerging roles of presynaptic proteins in ca þ þtriggered exocytosis how synaptotagmin promotes membrane fusion evidence that the inositol phospholipids are necessary for exocytosis. loss of inositol phospholipids and inhibition of secretion in permeabilized cells caused by a bacterial phospholipase c and removal of atp catecholamine secretion from digitonin-treated pc12 cells. 
effects of ca 2þ , atp, and protein kinase c activators arf6 regulates a plasma membrane pool of phosphatidylinositol(4,5)bisphosphate required for regulated exocytosis a pleckstrin homology domain specific for ptdins-4-5-p 2 and fused to green fluorescent protein identifies plasma membrane ptdins-4-5-p 2 as being important in exocytosis nonclassical pitps activate pld via the stt4p ptdins-4-kinase and modulate function of late stages of exocytosis in vegetative yeast exo70 interacts with phospholipids and mediates the targeting of the exocyst to the plasma membrane phosphoinositides in cell regulation and membrane dynamics clustering of syntaxin-1a in model membranes is modulated by phosphatidylinositol 4,5-bisphosphate and cholesterol a specific point mutant at position 1 of the influenza hemagglutinin fusion peptide displays a hemifusion phenotype structure of the haemagglutinin membrane glycoprotein of influenza virus at 3 a resolution structure of influenza haemagglutinin at the ph of membrane fusion helenius, a. 
folding, trimerization, and transport are sequential events in the biogenesis of influenza virus hemagglutinin receptor binding and membrane fusion in virus entry: the influenza hemagglutinin membrane structure and fusion-triggering conformational change of the fusion domain from influenza hemagglutinin protonation and stability of the globular domain of influenza virus hemagglutinin early steps of the conformational change of influenza virus hemagglutinin to a fusion active state: stability and energetics of the hemagglutinin the relevance of salt bridges for the stability of the influenza virus hemagglutinin a soluble domain of the membrane-anchoring chain of influenza virus hemagglutinin (ha2) folds in escherichia coli into the low-ph-induced conformation cdna cloning of component a of rab geranylgeranyl transferase and demonstration of its role as a rab escort protein specific single or double proline substitutions in the ''spring-loaded'' coiled-coil region of the influenza hemagglutinin impair or abolish membrane fusion activity new insight into the spring-loaded conformational change of influenza virus hemagglutinin leash in the groove mechanism of membrane fusion a host-guest system to study structure-function relationships of membrane fusion peptides structural intermediates in influenza haemagglutinin-mediated fusion composition and functions of the influenza fusion peptide membrane fusion by influenza hemagglutinin studies on the mechanism of membrane fusion: site-specific mutagenesis of the hemagglutinin of influenza virus hypothesis: spring-loaded boomerang mechanism of influenza hemagglutinin-mediated membrane fusion early endosomal snares form a structurally conserved snare complex and fuse liposomes with multiple topologies the cytoplasmic tail slows the folding of human immunodeficiency virus type 1 env from a late prebundle configuration into the six-helix bundle fusion peptide of influenza hemagglutinin requires a fixed angle boomerang structure 
for activity locking the kink in the influenza hemagglutinin fusion domain structure membrane fusion mediated by the influenza virus hemagglutinin requires the concerted action of at least three hemagglutinin trimers multiple local contact sites are induced by gpi-linked influenza hemagglutinin during hemifusion and flickering pore formation secondary structure, orientation, oligomerization, and lipid interactions of the transmembrane domain of influenza hemagglutinin endosome-to-cytosol transport of viral nucleocapsids conformational changes and fusion activity of influenza virus hemagglutinin of the h2 and h3 subtypes: effects of acid pretreatment reversible ph-dependent conformational change of reconstituted influenza hemagglutinin amino acid sequence requirements of the transmembrane and cytoplasmic domains of influenza virus hemagglutinin for viable membrane fusion snares-energies for membrane fusion crystal structure of a snare complex involved in synaptic exocytosis at 2.4 a resolution energetics and dynamics of snarepin folding across lipid bilayers single molecule probing of snare proteins by atomic force microscopy is assembly of the snare complex enough to fuel membrane fusion crystal structure of the endosomal snare complex reveals common structural principles of all snares structure and function of snare and snare-interacting proteins structure of proteins involved in synaptic vesicle fusion in neurons exocytosis requires asymmetry in the central layer of the snare complex snare complex zero layer residues are not critical for n-ethylmaleimide-sensitive factor-mediated disassembly testing the 3q:1r ''rule'': mutational analysis of the ionic ''zero'' layer in the yeast exocytic snare complex reveals no requirement for arginine exocytotic mechanism studied by truncated and zero layer mutants of the c-terminus of snap-25 dynamic structure of lipid-bound synaptobrevin suggests a nucleationpropagation mechanism for trans-snare complex formation the snare 
complex from yeast is partially unstructured on the membrane structural basis for the inhibitory role of tomosyn in exocytosis transmembrane segments of syntaxin line the fusion pore of ca 2 þ -triggered exocytosis single vesicle millisecond fusion kinetics reveals number of snare complexes optimal for fast snare-mediated membrane fusion docking and fast fusion of synaptobrevin vesicles depends on the lipid compositions of the vesicle and the acceptor snare complex-containing target membrane reconstitution of rab-and snare-dependent membrane fusion by synthetic endosomes single-molecule studies of snare complex assembly reveal parallel and antiparallel configurations structural transitions in the synaptic snare complex during ca 2 þ -triggered exocytosis novel targets and catalytic activities of bacterial protein toxins receptor and substrate interactions of clostridial neurotoxins x-ray structure of a neuronal complexin-snare complex from squid three-dimensional structure of the complexin/snare complex the synaptic vesicle protein csp alpha prevents presynaptic degeneration ddi1, a eukaryotic protein with the retroviral protease fold different domains of the ubl-uba ubiquitin receptor, ddi1/vsm1, are involved in its multiple cellular roles a structure-based mechanism for vesicle capture by the multisubunit tethering complex dsl1 the rab5 effector eea1 interacts directly with syntaxin-6 involvement of lma1 and gate-16 family members in intracellular membrane dynamics structure of gate-16, membrane transport modulator and mammalian ortholog of autophagocytosis factor aut7p intracellular bacteria encode inhibitory snare-like proteins a phorbol ester/diacylglycerol-binding protein encoded by the unc-13 gene of caenorhabditis elegans an open form of syntaxin bypasses the requirement for unc-13 in vesicle priming modular architecture of munc 13/calmodulin complexes: dual regulation by ca 2 þ and possibility function in short-term synaptic plasticity structural basis for 
a munc13-1 homodimer to munc13-1/rim heterodimer switch cellular functions of nsf: not just snaps and snares isolation and characterization of a dual prenylated rab and vamp2 receptor structural basis of rab effector specificity: crystal structure of the small g protein rab3a complexed with the effector domain of rabphilin-3a a novel syntaxin 6-interacting protein, ship164, regulates syntaxin 6-dependent sorting from early endosomes synaptic vesicle fusion complex contains unc-18 homologue bound to syntaxin specificity and regulation of a synaptic vesicle docking complex sec1p binds to snare complexes and concentrates at sites of secretion tomosyn: a syntaxin-1-binding protein that forms a novel complex in the neurotransmitter release process structure of the yeast polarity protein sro7 reveals a snare regulatory mechanism amisyn, a novel syntaxin-binding protein that may regulate snare complex assembly snap family of nsf attachment proteins includes a brain-specific isoform the yeast sec17 gene product is functionally equivalent to mammalian alpha-snap protein doc2b is a high affinity ca 2 þ sensor for spontaneous neurotransmitter release synaptotagmin-ca 2+ triggers two sequential steps in regulated exocytosis in rat pc12 cells: fusion pore opening and fusion pore dilation synaptotagmin modulation of fusion pore kinetics in regulated exocytosis of dense-core vesicles synaptobrevin binding to synaptophysin: a potential mechanism for controlling the exocytotic fusion machine vesicle-associated membrane protein and synaptophysin are associated on the synaptic vesicle alpha-synuclein promotes snare-complex assembly in vivo and in vitro alpha-synuclein cooperates with cspalpha in preventing neurodegeneration a single-vesicle content mixing assay for snare-mediated membrane fusion discrimination between docking and fusion of liposomes reconstituted with neuronal snare-proteins using fcs single-molecule studies of the neuronal snare fusion machinery single molecule 
measurements of mechanical interactions within ternary snare complexes and dynamics of their disassembly: snap25 vs. snap23 single molecule mechanical probing of the snare protein interactions single molecule observation of liposome-bilayer fusion thermally induced by soluble n-ethyl maleimide sensitive-factor attachment protein receptors (snares) visualizing infection of individual influenza viruses single-particle kinetics of influenza virus membrane fusion gtpase cycle of dynamin is coupled to membrane squeeze and release, leading to spontaneous fission real-time visualization of dynamin-catalyzed membrane fission and vesicle release membrane fission: model for intermediate structures dynamin self-assembly stimulates its gtpase activity real-time detection reveals that effectors couple dynamin's gtp-dependent conformational changes to the membrane a new role for the dynamin gtpase in the regulation of fusion pore expansion localized topological changes of the plasma membrane upon exocytosis visualized by polarized tirfm

this work was supported by nih grant no. 5r01gm069596 to r. collins, r56-ns38129 to rwh, and the intramural program of the nichd, nih. many thanks to fabio rinaldi for help with figure preparation, fred cohen for his discussion of the viral helical bundle timing experiments, and gary whittaker for helpful discussions.

key: cord-265087-g4k6pc82 authors: munteanu, cristian robert; gonzález-díaz, humberto; borges, fernanda; de magalhães, alexandre lopes title: natural/random protein classification models based on star network topological indices date: 2008-10-21 journal: journal of theoretical biology doi: 10.1016/j.jtbi.2008.07.018 sha: doc_id: 265087 cord_uid: g4k6pc82

abstract the development of the complex network graphs permits us to describe any real system such as social, neural, computer or genetic networks by transforming real properties in topological indices (tis). 
this work uses randic's star networks to convert protein primary structure data into specific topological indices, which are then used to construct a natural/random protein classification model. the set of natural proteins contains 1046 protein chains selected from the pre-compiled culledpdb list from pisces dunbrack's web lab. this set is characterized by a protein homology of 20%, a structure resolution of 1.6 Å and an r-factor lower than 25%. the set of random amino acid chains contains 1046 sequences, generated by a python script with the same residue types and the same average chain length as the natural set. a new sequence to star networks (s2snet) wxpython gui application (with a graphviz graphics back-end) was designed by our group to transform any character sequence into the following star network topological indices: shannon entropy of markov matrices, trace of connectivity matrices, harary number, wiener index, gutman index, schultz index, moreau-broto indices, balaban distance connectivity index, kier-hall connectivity indices and randic connectivity index. the model was constructed with the general discriminant analysis methods from the statistica package and gave training/predicting set accuracies of 90.77% for the forward stepwise model type. in conclusion, this study extends for the first time the classical tis to protein star network tis by proposing a model that can predict whether a protein or protein fragment is natural or random using only the amino acid sequence data. this classification can be used in studies of protein function, by replacing some fragments with random amino acid sequences, or to detect fake amino acid sequences or errors in proteins. these results promote the use of the s2snet application not only for protein structure analysis but also for mass spectroscopy, clinical proteomics and imaging, or dna/rna structure analysis. 
one of the most widely used methods for predicting protein properties is the quantitative structure-activity relationship (qsar) (devillers and balaban, 1999). graph theory can be used to obtain macromolecular descriptors named topological indices (tis). the branch of mathematical chemistry dedicated to encoding dna/protein information in graph representations by means of tis has become an intense research area, with interesting works by liao (liao and wang, 2004a, b; liao and ding, 2005; liao et al., 2006), randic, nandy, balaban, basak, and vracko (randic, 2000; randic et al., 2000; randic and basak, 2001; randic and balaban, 2003), the bielinska-waz team (bielinska-waz et al., 2007) and our group (perez et al., 2004; aguero-chapin et al., 2006). using graphic approaches to study biological systems can provide useful insights, as indicated by many previous studies on a series of important biological topics, such as enzyme-catalyzed reactions (andraos, 2008; chou, 1989; forsen, 1980, 1981; chou and liu, 1981; chou et al., 1979; king and altman, 1956; kuzmic et al., 1992; myers and palmer, 1985; zhou and deng, 1984), protein folding kinetics (chou, 1990), inhibition kinetics of processive nucleic acid polymerases and nucleases (althaus et al., 1993a, b, c, 1994a, b, 1996; chou et al., 1994), analysis of codon usage (chou and zhang, 1992; chou, 1993, 1994), base frequencies in the anti-sense strands, and analysis of dna sequence (qi et al., 2007). moreover, graphical methods have been introduced for qsar study (prado-prado et al., 2008) as well as utilized to deal with complicated network systems (diao et al., 2007; gonzalez-diaz et al., 2007a). 
recently, the ''cellular automaton image'' (wolfram, 1984, 2002) has also been applied to study hepatitis b viral infections (xiao et al., 2006a), hbv virus gene missense mutation (xiao et al., 2005b), and visual analysis of sars-cov (gao et al., 2006; wang et al., 2005), as well as to represent complicated biological sequences (xiao et al., 2005a) and to help identify protein attributes (xiao and chou, 2007; xiao et al., 2006b). the present work introduces for the first time a natural/random protein classification that uses only the chain sequence and the amino acid connectivity as protein structural data. the data are transformed into sequence and connectivity star graph tis, which are then used as input for a statistical linear method in the construction of a simple classification model. two sets of proteins are compared in the new classification model: a set (nat) of 1046 natural protein chains as defined in the pre-compiled culledpdb list from pisces dunbrack's web lab (wang and dunbrack, 2003) and a second set (rnd) of the same size formed by random amino acid sequences generated with python scripts (rossum, 2006). the natural set is characterized by a homology of 20%, a structure resolution of 1.6 Å and an r-factor lower than 25%. the random set is composed of the same standard amino acid types, and the average length of its chains is the same as that of the natural set. python scripts are used to download pdb files from the pdb data bank (berman et al., 2000) and to create the corresponding dssp files with the dssp application (kabsch and sander, 1983). the chain sequences were extracted from these dssp files with a python script and were filtered with our prot-2s web tool (http://www.requimte.pt:8080/prot-2s/) by removing the chains that contain non-standard amino acids (usually labelled x). each protein can be considered as a real network where the amino acids are the vertices (nodes), connected in a specific sequence by the peptide bonds. 
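the construction of the random set described above can be sketched in a few lines of python. this is only an illustration, not the authors' script: the function names, the uniform sampling of residues and the reuse of the natural chain lengths (which keeps the average length identical) are assumptions, and the three input sequences are made up.

```python
import random

# The 20 standard amino acids in one-letter code.
STANDARD_AA = "ACDEFGHIKLMNPQRSTVWY"

def has_only_standard(seq):
    """True when every residue is one of the 20 standard amino acids."""
    return all(res in STANDARD_AA for res in seq)

def random_chains(natural, seed=0):
    """Return one random chain per natural chain.

    Assumption: residues are drawn uniformly and the natural chain lengths
    are reused in shuffled order, so both sets share the same average length.
    """
    rng = random.Random(seed)
    lengths = [len(s) for s in natural]
    rng.shuffle(lengths)
    return ["".join(rng.choice(STANDARD_AA) for _ in range(n)) for n in lengths]

# Hypothetical input: the third chain is dropped because it contains "X".
natural = [s for s in ["ACDEFGHIK", "MKTAYIAKQR", "GSHMXLE"] if has_only_standard(s)]
rnd = random_chains(natural)
```

a fixed seed keeps the random set reproducible, which matters when the same chains are later split into training and validation cases.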
the graph is the abstract representation of the network and is a collection of n vertices and the connections between them. the star graph is a special case of a tree with n vertices in which one vertex has degree n-1 and the remaining n-1 vertices have degree 1 (harary, 1969). in addition, as a general property of trees, there is a unique path between any pair of vertices. for proteins, each of the 20 possible branches (''rays'') of the star contains the same amino acid type and the star centre is a non-amino acid vertex. the same protein can be represented by different forms which are associated with distinct distance matrices (randic et al., 2007). if the vertices do not carry a label, the sequence information is lost; for that reason, the best method is to construct a standard star graph where each amino acid/vertex holds its position in the original sequence and the branches are labelled in alphabetical order of the three-letter amino acid code (randic et al., 2007). in the present study we use the alphabetical order of the one-letter amino acid code. the standard star graph for a random virtual decapeptide (acadcefdgh) is illustrated in fig. 1. if the initial connectivity in the protein chain is included, the graph is embedded (fig. 2). in order to compare the graphs, it is necessary to transform the graphical representation into a connectivity matrix, a distance matrix and a degree matrix. in the case of the embedded graph, the matrices of the connectivity in the sequence and in the star graph are combined. these matrices and the normalized ones are the basis for the tis calculation. the protein chain sequences are transformed into star graph representations and then characterized by several tis using our new sequence to star networks (s2snet) application. s2snet is a wxpython (noel rappin, 2006) gui application with graphviz (koutsofios, 1993) as a graphics back-end. 
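the star graph construction described above can be illustrated with a short python sketch for the decapeptide of fig. 1. it is not the s2snet code: the function name, the numbering of the centre as vertex 0 and the chaining of same-type residues along a ray in sequence order are assumptions made for the example.

```python
def star_graph(seq, embedded=False):
    """Build the star graph of a sequence.

    Vertex 0 is the non-amino-acid centre; vertex i (1-based) is residue i.
    Residues of the same type form one ray and are chained in sequence order.
    With embedded=True the peptide-bond links of the chain are added as well.
    Returns (sorted edge list, connectivity matrix).
    """
    n = len(seq)
    edges = set()
    last_in_ray = {}  # residue type -> last vertex placed on that ray
    for pos, res in enumerate(seq, start=1):
        prev = last_in_ray.get(res, 0)  # first residue of a ray hangs on the centre
        edges.add((prev, pos))
        last_in_ray[res] = pos
    if embedded:
        for pos in range(1, n):
            edges.add((pos, pos + 1))  # sequence connectivity
    m = [[0] * (n + 1) for _ in range(n + 1)]
    for a, b in edges:
        m[a][b] = m[b][a] = 1
    return sorted(edges), m

edges, conn = star_graph("ACADCEFDGH")
edges_e, conn_e = star_graph("ACADCEFDGH", embedded=True)
```

the alphabetical order of the rays only affects how the graph is drawn, so it does not appear in the connectivity matrix; the degree of each vertex is the corresponding row sum.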
the user of this interactive tool is able to choose the level of calculations, such as: embedded graph, additional weights for each amino acid, markov normalization, power of the matrix connectivity, the input files (files with sequences, groups and weights), the output files, the level of details (files for summary and detailed results) and the type of graph visualization (dot, neato, fdp, twopi, circo). in particular, the calculations presented in this work are characterized by embedded and non-embedded tis, no weights, markov normalization and power of matrices/indices (n) up to 5. the summary file contains the following tis (todeschini and consonni, 2002): shannon entropy of the n powered markov matrices ($Sh_n$), where $p_i$ are the $n_i$ elements of the p vector, resulting from the multiplication of the powered markov normalized matrix ($n_i \times n_i$) by a vector ($n_i \times 1$) with each element equal to $1/n_i$; the trace of the n connectivity matrices ($Tr_n$); harary number ($H$), where $d_{ij}$ are the elements of the distance matrix, $m_{ij}$ are the elements of the m connectivity matrix, $w_j$ are the weight elements and nw is a switch to select (1) or not select (0) weights calculations; wiener index ($W$); gutman topological index ($S_6$), where $deg_i$ are the elements of the degree matrix; schultz topological index (non-trivial part) ($S$); moreau-broto autocorrelation of topological structure ($ATS_n$, n = 1 to power limit), only with weights included, where $dp_{ij}^{n}$ are the elements of the pair distance matrix when the distance is n; balaban distance connectivity index ($J$), where nodes+1 = aa numbers/node number in the star graph + origin and $\sum_k d_{ik}$ is the node distance degree; kier-hall connectivity indices ($^{n}X$); randic connectivity index ($^{1}X$). all these tis will be used to construct a natural/random classification model by statistical methods. 
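to make some of these indices concrete, the sketch below computes the wiener index, the harary number, the trace of a powered connectivity matrix and the randic connectivity index for the non-embedded star graph of the decapeptide acadcefdgh. the formulas are the usual textbook definitions without weights or markov normalization; they are assumptions for the example and may differ in detail from the s2snet implementation. the edge list is written out by hand (vertex 0 is the centre).

```python
from collections import deque
from math import sqrt

# Non-embedded star graph of ACADCEFDGH: vertex 0 = centre, vertex i = residue i.
EDGES = [(0, 1), (0, 2), (1, 3), (0, 4), (2, 5),
         (0, 6), (0, 7), (4, 8), (0, 9), (0, 10)]
N = 11

def adjacency(edges, n):
    m = [[0] * n for _ in range(n)]
    for a, b in edges:
        m[a][b] = m[b][a] = 1
    return m

def distance_matrix(m):
    """Topological distances by breadth-first search from every vertex."""
    n = len(m)
    d = [[0] * n for _ in range(n)]
    for s in range(n):
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if m[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for v, k in dist.items():
            d[s][v] = k
    return d

def wiener(d):
    """Sum of distances over all vertex pairs."""
    n = len(d)
    return sum(d[i][j] for i in range(n) for j in range(i + 1, n))

def harary(d):
    """Sum of reciprocal distances over all vertex pairs (connected graph)."""
    n = len(d)
    return sum(1.0 / d[i][j] for i in range(n) for j in range(i + 1, n))

def trace_of_power(m, k):
    """Trace of the k-th power of the connectivity matrix."""
    n = len(m)
    p = [[int(i == j) for j in range(n)] for i in range(n)]  # identity matrix
    for _ in range(k):
        p = [[sum(p[i][t] * m[t][j] for t in range(n)) for j in range(n)]
             for i in range(n)]
    return sum(p[i][i] for i in range(n))

def randic(m):
    """Sum over edges of 1/sqrt(deg_i * deg_j)."""
    n = len(m)
    deg = [sum(row) for row in m]
    return sum(1.0 / sqrt(deg[i] * deg[j])
               for i in range(n) for j in range(i + 1, n) if m[i][j])

M = adjacency(EDGES, N)
D = distance_matrix(M)
print(wiener(D), harary(D), trace_of_power(M, 2), randic(M))
```

because the non-embedded star graph is a tree, odd powers of its connectivity matrix have zero trace; the embedded graph contains cycles, which is one way the two families of indices carry different information.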
general discriminant analysis (gda) (kowalski and wold, 1982; van waterbeemd, 1995) (statsoft.inc., 2002) has been chosen as the simplest and fastest method. in order to decide whether a protein chain is classified as natural (i.e. it exists in the pdb database) or random, we added an extra dummy variable named nat/rnd (binary values of 0/1) and a cross-validation variable (cv). three cross-validation methods are often used to examine the effectiveness of a predictor in practical application: the independent dataset test, the subsampling test, and the jackknife test (chou and zhang, 1995). through a crystal-clear analysis, shen (2007, 2008) have shown that only the jackknife test has the least arbitrariness. therefore, the jackknife test has been increasingly used by investigators to examine the accuracy of various predictors (chen and li, 2007a, b; diao et al., 2007; ding et al., 2007; jiang et al., 2008; jin et al., 2008; li and li, 2008; lin, 2008; lin et al., 2008; niu et al., 2006, 2008; wang et al., 2008; xiao and chou, 2007; zhou et al., 2007; zhang et al., 2008). in the present work, the independent data test is used: the data are split at random into a training series (train, 75%) used for model construction and a prediction series (val, 25%) used for model validation (the cv column is filled by repeating 3 train and 1 val). all independent variables are standardized prior to model construction. using the s2snet methodology, as defined previously, we can attempt to develop a simple linear qsar with the general formula

$$\text{nat/rnd-score} = c_0 + c_1 t_1 + c_2 t_2 + \dots + c_n t_n$$

where nat/rnd-score is the continuous score value for the nat/rnd classification, $t_i$ are the tis described above, $c_1$ to $c_n$ are the ti coefficients, n is the number of indices and $c_0$ is the independent term. the quality of the gda models was determined by examining wilk's u statistics, the fisher ratio (f), the p-level (p), and the canonical regression coefficient ($R_c$). 
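the data preparation steps described above (a cv column that repeats three train cases and one val case, standardized variables and a linear score) can be sketched as follows. this is not the statistica workflow: the function names are hypothetical, the cases are assumed to have been shuffled beforehand, and the use of the sample standard deviation for standardization is an assumption.

```python
from statistics import mean, stdev

def cv_labels(n_cases):
    """Fill the cv column by repeating 3 'train' and 1 'val' (75% / 25%)."""
    pattern = ["train", "train", "train", "val"]
    return [pattern[i % 4] for i in range(n_cases)]

def standardize(column):
    """Centre a variable on its mean and scale it by its sample standard deviation."""
    mu, sd = mean(column), stdev(column)
    return [(x - mu) / sd if sd else 0.0 for x in column]

def linear_score(c0, coeffs, tis):
    """General linear form: c0 + c1*t1 + ... + cn*tn."""
    return c0 + sum(c * t for c, t in zip(coeffs, tis))

labels = cv_labels(8)
z = standardize([1.0, 2.0, 3.0])
score = linear_score(0.5, [2.0, 3.0], [1.0, 1.0])
```

in a real run the mean and standard deviation would normally be estimated on the training cases only and then applied to the validation cases, so that the validation set does not leak into the model.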
we also inspected the percentage of good classification, cases/variables ratios, and number of variables to be explored in order to avoid over-fitting or chance correlation. the forward, backward and best subset model types are tested for the embedded, non-embedded and both data. eight variable selection methods were applied in order to find the best gda equation which is able to discriminate between natural and random chain proteins. eight models were constructed using embedded/non-embedded star graph tis obtained with the s2snet application and forward, backward and best subset model types. the values obtained for the training/predicting accuracies are presented in table 1. the forward stepwise variable selection method combined with the ne and e tis provides the best results for our data set, with 91.01%, 90.06% and 90.77% of cases correctly classified for the training, cross-validation and full sets, respectively, using a minimum number of 12 parameters (eq. (15)). the embedded tis have the name of the non-embedded ones plus ''e'' as suffix:

$$\text{nat/rnd-score} = 0.1 + 4.8\,\mathrm{Sh0} + 254.9\,\mathrm{H} + 1860.2\,\mathrm{W} - 1931.0\,\mathrm{S} + 39.4\,\mathrm{J} - 139.2\,\mathrm{X0} - 73.0\,\mathrm{X3} + 146.7\,\mathrm{X4} - 159.3\,\mathrm{X5} - 6.6\,\mathrm{Tr4e} + 7.1\,\mathrm{X2e} \quad (15)$$

where n is the number of studied protein sequences (nat+rnd), $R_c$ is the canonical regression coefficient, u is the wilk's statistics, f is the fisher's statistics and p is the p-level (probability of error). the present $R_c$ value shows a high level of correlation between the input variables and the classification of proteins. wilk's u is used to measure the statistical significance of the discriminatory power of the model and has values from 1.0 (no discriminatory power) to 0.0 (perfect discriminatory power). the f value shows the statistical significance in the discrimination between groups, a measure of the extent to which a variable makes a unique contribution to a prediction of group membership. 
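as an illustration of how eq. (15) would be applied to a new chain, the sketch below evaluates the score from a dictionary of ti values. the coefficients are those reported in eq. (15); the function name and the dictionary layout are hypothetical, the inputs are assumed to be standardized in the same way as the training variables, and no decision threshold is applied because the text does not state one.

```python
# Coefficients of eq. (15); keys follow the variable names used in the equation.
EQ15_INTERCEPT = 0.1
EQ15_COEFFS = {
    "Sh0": 4.8, "H": 254.9, "W": 1860.2, "S": -1931.0, "J": 39.4,
    "X0": -139.2, "X3": -73.0, "X4": 146.7, "X5": -159.3,
    "Tr4e": -6.6, "X2e": 7.1,
}

def nat_rnd_score(tis):
    """Evaluate the linear score of eq. (15) for one chain.

    `tis` maps each variable name to its (assumed standardized) value.
    """
    missing = set(EQ15_COEFFS) - set(tis)
    if missing:
        raise KeyError("missing topological indices: %s" % sorted(missing))
    return EQ15_INTERCEPT + sum(c * tis[name] for name, c in EQ15_COEFFS.items())

# Hypothetical input: every index at its mean (0 after standardization).
baseline = nat_rnd_score({name: 0.0 for name in EQ15_COEFFS})
```

with eleven variables plus the independent term, the equation has the 12 parameters mentioned in the text.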
the p-level of fisher's test for the gda is less than 0.05, which shows that the hypothesis of group overlapping can be rejected with a 5% error (hua and sun, 2001) [...] considered as excellent in the literature for lda-qsar models (garcia-garcia et al., 2004; marrero-ponce et al., 2004). parametrical assumptions such as normality, homoscedasticity (homogeneity of variances) and non-colinearity are as important in the application of multivariate statistical techniques to qspr (bisquerra alzina, 1989; stewart, 1998) as the correct specification of the mathematical form. the validity and statistical significance of any model is conditioned by the above-mentioned factors. in our study, a simple linear mathematical form of the model has been chosen in the absence of prior information. figs. 3 and 4 show that the plots of the residuals against the training cases do not present any characteristic pattern (dillon and goldstein, 1984). proteins no. 632 and 864 are the only two cases not shown in fig. 4 because the corresponding raw residuals are clearly distinct from those of the whole set, ca. -7. they correspond to 1qwn, chain a (1014 aas) and 1jz8, chain a (1011 aas). one possible reason for the apparently different statistical behaviour could be a limitation of the model when the length of the chains is greater than 1000 amino acids. it is possible that the star net tis for large proteins become similar to the tis of the random proteins. a different and better threshold for the a priori classification probability can be estimated by means of the receiver operating characteristics (roc) curve (james and hanley, 1982). as fig. 5 shows, the model is not a random but a statistically significant classifier, since the area under the roc curve (training = 0.98, validation = 0.96) is significantly higher than the area under the random classifier curve (0.5, the diagonal line) (morales helguera et al., 2007). 
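the area under the roc curve quoted above can be computed directly from the classifier scores without drawing the curve, because it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. the sketch below uses this rank-based formulation; the function name and the toy scores are hypothetical and are not taken from the paper's data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from scores and binary labels (1 = positive).

    Counts, over all positive/negative pairs, how often the positive case
    has the higher score; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy example: one positive case is ranked below one negative case.
auc = roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
```

a value of 0.5 corresponds to the diagonal line of a random classifier and 1.0 to perfect separation of the two classes.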
the validity of the gda models depends on the normal distribution of the sample used as well as on the homogeneity of its variances. thus, we carried out two significance tests for normality, the chi-square and kolmogorov-smirnov tests, and we found significant statistical differences (p < 0.01) on the respective values (chi-square, d). these results allow us to reject the hypothesis of normal distribution of the sample under study (fig. 6) (stewart, 1998). the heteroscedasticity of a large set can be detected with a simple graphical method based on the examination of the residuals of the variables included in the model. fig. 7(a and b) shows that the plots of the nat/rnd gda model variables against the residuals do not present any pattern, which indicates that the homoscedasticity assumption is fulfilled (stewart, 1998). due to the robustness of the gda multivariate statistical techniques, the predictive ability and inference reached by using the proposed model should not be affected (see fig. 8). this study extends for the first time the classical tis to protein star network tis by proposing a model that can predict whether a protein chain is natural or random. the results show for the first time the excellent predictive ability (90.77%) of the simple and fast star network tis and gda statistical linear models in the case of the natural/random protein model. this classification can help the study of protein function, by replacing some fragments with random amino acid sequences, or can detect fake amino acid sequences or errors in proteins. the s2snet application can be very useful to calculate the protein star network tis, which can be the basis of a model for any other protein property. s2snet can also be used for mass spectroscopy, clinical proteomics and imaging or dna/rna structure analysis.

novel 2d maps and coupling numbers for protein sequences. 
the first qsar study of polygalacturonases; isolation and prediction of a novel sequence from psidium guajava l steady-state kinetic studies with the non-nucleoside hiv-1 reverse transcriptase inhibitor u-87201e the quinoline u-78036 is a potent inhibitor of hiv-1 reverse transcriptase kinetic studies with the non-nucleoside hiv-1 reverse transcriptase inhibitor u-88204e steady-state kinetic studies with the polysulfonate u-9843, an hiv reverse transcriptase inhibitor kinetic studies with the non-nucleoside hiv-1 reverse transcriptase inhibitor u-90152e the benzylthiopyrididine u-31, 355 is a potent inhibitor of hiv-1 reverse transcriptase kinetic plasticity and the determination of product ratios for kinetic schemes leading to multiple products without rate laws: new methods based on directed graphs the protein data bank distribution moments of 2d-graphs as descriptors of dna sequences introducció n conceptual al aná lisis multivariante: un enfoque informá tico con los paquetes spss prediction of apoptosis protein subcellular location using improved hybrid approach and pseudo amino acid composition prediction of the subcellular location of apoptosis proteins graphical rules in steady and non-steady enzyme kinetics review: applications of graph theory to enzyme kinetics and protein folding kinetics. steady and non-steady state systems graphical rules for enzyme-catalyzed rate laws graphical rules of steady-state reaction systems graphical rules for non-steady state enzyme kinetics review: recent progresses in protein subcellular location prediction cell-ploc: a package of web-servers for predicting subcellular localization of proteins in various organisms diagrammatization of codon usage in 339 hiv proteins and its biological implication review: prediction of protein structural classes graph theory of enzyme kinetics: 1. 
steady-state reaction system review: steady-state inhibition kinetics of processive nucleic acid polymerases and nucleases do antisense proteins exist? topological indices and related descriptors in qsar and qspr. gordon and breach the community structure of human cellular signaling network multivariate analysis: methods and applications prediction of protein structure classes with pseudo amino acid composition and fuzzy support vector machine network a novel fingerprint map for detecting sars-cov new agents active against mycobacterium avium complex selected by molecular topology: a virtual screening method 3d-qsar study for dna cleavage proteins with a potential anti-tumor atcun-like motif medicinal chemistry and bioinformatics-current trends in drugs discovery with networks topological indices ann-qsar model for selection of anticancer leads from structurally heterogeneous series of compounds proteomics, networks, and connectivity indices graph theory support vector machine approach for protein subcellular localization prediction the meaning and use of the area under a receiver operating characteristic (roc) curve using the concept of chou's pseudo amino acid composition to predict apoptosis proteins subcellular location: an approach by approximate entropy predicting subcellular localization with adaboost learner dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features a schematic method of deriving the rate laws for enzyme-catalyzed reactions drawing graphs with dot pattern recognition in chemistry kinetic analysis by a recursive rate equation predicting protein subcellular location using chou's pseudo amino acid composition and improved hybrid approach graphical approach to analyzing dna sequences analysis of similarity/dissimilarity of dna sequences based on nonoverlapping triplets of nucleotide bases new 2d graphical representation of dna sequences coronavirus phylogeny based on 2d graphical representation of 
dna sequence the modified mahalanobis discriminant for predicting outer membrane proteins by using chou's pseudo amino acid composition predicting subcellular localization of mycobacterial proteins by using chou's pseudo amino acid composition 3d-chiral quadratic indices of the 'molecular pseudograph's atom adjacency matrix' and their application to central chirality codification: classification of ace inhibitors and prediction of sigma-receptor antagonist activities atom, atom-type and total molecular linear indices as a promising approach for bioorganic and medicinal chemistry: theoretical and experimental assessment of a novel method for virtual screening and rational design of new lead anthelmintic probing the anticancer activity of nucleoside analogues: a qsar model approach using an internally consistent training set microcomputer tools for steady-state enzyme kinetics predicting protein structural class with adaboost learner predicting membrane protein types with bagging learner a topological sub-structural approach for predicting human intestinal absorption of drugs unified qsar approach to antimicrobials. 
part 3: first multi-tasking qsar model for input-coded prediction, structural back-projection, and complex networks clustering of antiprotozoal compounds new 3d graphical representation of dna sequence based on dual nucleotides condensed representation of dna primary sequences on a four-dimensional representation of dna primary sequences characterization of dna primary sequences based on the average distances between bases on 3-d graphical representation of dna primary sequences and their numerical characterization on representation of proteins by starlike graphs python reference manual statistica (data analysis software system), version 6.0, /www handbook of molecular descriptors discriminant analysis for activity prediction pisces: a protein sequence culling server a new nucleotide-composition based fingerprint of sars-cov with visualization analysis predicting membrane protein types by the llda algorithm cellular automation as models of complexity digital coding of amino acids based on hydrophobic index using cellular automata to generate image representation for biological sequences an application of gene comparative image for predicting the effect on replication ratio by hbv virus gene missense mutation a probability cellular automaton model for hepatitis b viral infections using cellular automata images and pseudo amino acid composition to predict protein subcellular location graphic analysis of codon usage strategy in 1490 human proteins analysis of codon usage in 1562 e. coli protein coding sequences prediction protein structural classes with pseudo amino acid composition: approximate entropy and hydrophobicity pattern an extension of chou's graphical rules for deriving enzyme kinetic equations to system involving parallel reaction pathways using chou's amphiphilic pseudoamino acid composition and support vector machine for prediction of enzyme subfamily classes cristian r. munteanu thanks the fct (portugal) for support from grant sfrh/bpd/24997/2005. 
gonzález-díaz humberto acknowledges an isidro parga pondal research contract supported by xunta de galicia, university of santiago de compostela (spain).
key: cord-262043-66qle52a authors: basit, abdul; ali, tanveer; rehman, shafiq ur title: truncated human angiotensin converting enzyme 2; a potential inhibitor of sars-cov-2 spike glycoprotein and potent covid-19 therapeutic agent date: 2020-05-20 journal: j biomol struct dyn doi: 10.1080/07391102.2020.1768150 sha: doc_id: 262043 cord_uid: 66qle52a
the current pandemic of covid-19, caused by sars-cov-2, continues to spread globally, and no drug or vaccine against it is available. the spike (s) glycoprotein is a structural protein of sars-cov-2 located on the envelope surface; it interacts with angiotensin converting enzyme 2 (ace2), a cell surface receptor, after which the virus enters the host cell. blocking the s glycoprotein with a suitable inhibitor may therefore interfere with its interaction with ace2 and impede viral entry into the host cell. here, we present a truncated version of human ace2 (tace2), comprising the n-terminal region of intact ace2 from amino acid position 21-119, which is involved in binding the receptor binding domain (rbd) of sars-cov-2. we analyzed in silico the potential of tace2 to compete with intact ace2 for binding to the rbd. protein-protein docking and molecular dynamics simulation showed that tace2 has a higher binding affinity for the rbd and forms a more stable complex with it than intact ace2. furthermore, the predicted soluble expression of tace2 in e. coli makes it a suitable candidate for covid-19 therapeutics. this is the first md simulation based finding to provide a high affinity protein inhibitor for the sars-cov-2 s glycoprotein, an important target for drug design against this unprecedented challenge. communicated by ramaswamy h. 
sarma the rapid spread of sars coronavirus 2 (sars-cov-2) constitutes a public health emergency that demands an immediate response, and no fda approved treatments or vaccines are currently available. the sars-cov-2 spike (s) protein (1273 amino acids) is essential for virus entry through binding to the host receptor angiotensin converting enzyme 2 (ace2) and mediating virus-host membrane fusion (boopathi et al., 2020; sarma et al., 2020). the s protein contains two functional domains (s1 and s2). the s1 domain (residues 14-685) attaches the virion to the human ace2 receptor on the epithelial cell membrane surface, which is followed by internalization and hence initiates infection (hasan et al., 2020). this binding induces conformational changes in the s protein that enable the s2 domain (residues 686-1273) to mediate fusion with the cellular membrane. the receptor-binding domain (rbd) of the sars-cov-2 s protein is highly conserved and is directly involved in binding to human ace2 (yuan et al., 2020). since ace2 has not evolved to recognize the s protein of sars-cov-2, an alternative to ace2 with a higher binding affinity for the s protein than the wild type receptor may inhibit entry of sars-cov-1 and sars-cov-2 into human cells. this strategy can play an important role in devising therapeutics against sars-cov-2. several studies have proposed small compound based inhibitors as therapeutic agents for covid-19 (aanouz et al., 2020; elmezayen et al., 2020; gupta et al., 2020; khan et al., 2020; wahedi et al., 2020). small compound based drugs may not efficiently block the entire binding patch of the s protein. peptide based therapeutics, on the other hand, can block the entire binding interface (rbd) of the s protein (wan et al., 2020b), as reported for the hiv peptide based drug fuzeon (jenny-avital, 2003; wójcik & berlicki, 2016). 
there is growing interest in peptide based therapeutics for covid-19 treatment (pant et al., 2020), and approximately 140 peptide based drugs have been evaluated in clinical trials (fosgerau & hoffmann, 2015). peptide based drugs have few side effects and low drug tolerance compared with chemical drugs (bruno et al., 2013). to block the fusion of the sars-cov-2 s protein with human cells, a recent study reported a neck and transmembrane deficient ace2, called soluble ace2 (sace2), that can block the entry of sars-cov-2 into the host cell (procko, 2020); sace2 has also been found safe in healthy human subjects (haschke et al., 2013) and in patients with lung disease (khan et al., 2017). recombinant sace2 is in clinical trials for covid-19 treatment in guangdong province of china (clinicaltrials.gov #nct04287686). procko (2020) also proposed that mutations in the ace2 receptor interface may increase the s/ace2 interaction. another study proposed a 23 amino acid peptide derived from ace2 (amino acid position 21-43), which binds the sars-cov-2 s protein with low nanomolar affinity and can block the attachment of sars-cov-2 to human ace2. since the ace2 residues involved in the interaction with the rbd are located at amino acid position 21-119 (wan et al., 2020a; yan et al., 2020), we hypothesized that a fragment carrying all of these binding residues would have a better binding affinity for the rbd and could hinder the interaction of sars-cov-2 with human ace2, hence blocking its entry into epithelial cells. we designed a truncated version (tace2) of the ace2 receptor covering the binding residues and performed protein-protein docking and molecular dynamics simulations to analyze its binding affinity for the rbd and the stability of the complex. tace2 is expected to compete with wild type human ace2 receptors for binding to sars-cov-2, as it is predicted to have a higher binding affinity for the s protein. 
in this scenario, sars-cov-2 viral particles would bind strongly to tace2, occupying the binding sites otherwise available to host ace2 receptors and thus inhibiting entry of the virus into the cell; the bound virus would then be eliminated through the body's defense mechanisms. we further predicted the soluble expression of tace2 in e. coli, a suitable host for bulk production of tace2. the structure of ace2 in complex with the rbd of the sars-cov-2 s glycoprotein (pdb id: 6m17) was obtained from the pdb database. to determine the variation in sars-cov-2 s glycoprotein sequences reported from different regions of the globe, 61 s glycoprotein sequences of sars-cov-2, including the reference sequence (nc_045512, reported from wuhan, china), were retrieved from ncbi. multiple sequence alignment was performed with mega-x, and the aligned sequences were then analyzed for amino acid variations. missing loops in the pdb structures of ace2 and rbd were repaired, and the structures were energy minimized and optimized to remove amino acid side chain clashes with foldx (schymkowitz et al., 2005). side chains were optimized with foldx to remove van der waals clashes by mutating residues with bad energy values into new rotamers with energy minimization (van durme et al., 2011). the optimized three dimensional (3d) structures of ace2 and rbd were used to design the truncated ace2 and to study protein-protein interactions. based on the protein-protein interactions between ace2 and rbd in the ace2-rbd complex (pdb id 6m17), a truncated version of ace2 was produced by removing the c-terminal residues that follow amino acid position 119 (up to residue 768), leaving a truncated n-terminal fragment, tace2, spanning amino acid positions 21-119. the first 20 residues of ace2 form the signal peptide (huang et al., 2003; turner & hooper, 2004) and were therefore also removed. 
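At the sequence level, the truncation described above amounts to a 1-based, inclusive slice of the receptor sequence. The following is a minimal sketch under stated assumptions: the function name `truncate_fragment` and the dummy sequence are hypothetical and introduced only for illustration, and the original work operated on the 3D structure rather than on a plain string.

```python
def truncate_fragment(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("residue range lies outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return sequence[start - 1:end]


# Placeholder only: a dummy 805-residue string standing in for full-length
# human ACE2. Replace it with the real sequence before drawing conclusions.
dummy_ace2 = "ACDEFGHIKLMNPQRSTVWY" * 40 + "ACDEF"

# Keep positions 21-119: drops the 20-residue signal peptide and everything
# after position 119.
tace2 = truncate_fragment(dummy_ace2, 21, 119)
print(len(tace2))  # 119 - 21 + 1 = 99 residues
```

Using 1-based inclusive coordinates in the function signature keeps the call aligned with the residue numbering used in the text.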
the structure of tace2 was modeled with i-tasser, which builds the model by assembling continuous fragments excised from multiple threading templates using replica exchange monte carlo (remc) simulations (yang et al., 2015). to determine the binding affinity of both intact and truncated ace2 for the sars-cov-2 s glycoprotein, the rigid body protein-protein docking tools zdock (pierce et al., 2014), cluspro (kozakov et al., 2017) and patchdock (schneidman-duhovny et al., 2005) and a flexible protein-protein docking tool, haddock (van zundert et al., 2016), were used. the energy function used by zdock is the z score, which combines a pairwise shape complementarity function with desolvation and electrostatics terms. zdock ranks the top 10 predicted docking poses on the basis of the z score (chen et al., 2003). cluspro uses the piper scoring function, which contains shape complementarity, electrostatics and pairwise potential terms; the top 1000 conformations are clustered and ranked on the basis of cluster size. patchdock uses the patchdock score as its energy function, which ranks the docked models based on desolvation energy, interface area size and geometric score (zhang et al., 1997). haddock is a flexible docking method for protein-protein complexes that drives the docking process with information from experimentally identified protein complex interfaces. the haddock scoring function consists of a combination of various energies and buried surface area, and the models were scored according to the haddock score. all generated docking poses of ace2 and the spike protein were visualized with pymol (schrodinger, 2010). based on the haddock score and the docking rmsd value, the docked complexes of ace2 and tace2 with rbd were analyzed for binding affinity, ΔG (kcal mol⁻¹), and stability using the protein binding energy prediction (prodigy) server (xue et al., 2016). 
the server predicts the binding affinity and stability on the basis of structural properties of the protein-protein complexes. the stability of a protein-protein complex is measured through the dissociation constant Kd (M). the runs were performed at different temperatures ranging from 25 to 36 °C. the docked protein-protein complex with the minimum rmsd and the highest binding affinity was selected for md simulation to further confirm the stability of the complex. md simulation of the rbd in complex with intact ace2 and with tace2 was performed with gromacs 5.0.4 (abraham et al., 2015). the simulation used the charmm36 force field and the tip3p water model in a cubic box. the protein complex in the cubic box was solvated with water molecules to provide an aqueous environment. the system was then neutralized by adding 3 na⁺ ions, followed by energy minimization to remove steric clashes between atoms. the system was then equilibrated in the nvt and npt ensembles at constant temperature (300 K) and pressure (1 bar), respectively. a langevin thermostat was applied to regulate the temperature of the system. the md simulation was then run for 20 ns. to determine post-translational modifications (ptms) in ace2, the protein sequence was submitted to the ptm-ssmp server, which combines the submitted sequence with site specific modification profiles to predict ptm sites in mammalian proteins (liu et al., 2018). since glycosylation is the most abundant and diverse post-translational modification of proteins, we further determined the o-glycosylation sites in ace2 using the netoglyc 4.0 server, which specifically predicts galnac-type o-glycosylation sites, unique to ser and thr (steentoft et al., 2013). we further determined the n-glycosylation sites using the netnglyc 1.0 server with a threshold value of 0.5 (gupta et al., 2004). in order to express tace2 in e. 
coli, its soluble expression at 37 °C was predicted with the camsol intrinsic and camsol structurally corrected online solubility prediction tools (sormanni & vendruscolo, 2019). camsol intrinsic determines solubility on the basis of the amino acid sequence, while the camsol structurally corrected tool determines the solubility profile on the basis of the structure, accounting for the distribution of amino acids in the structure and their solvent exposure. both runs were performed at ph 7.0. in both methods, solubility profile scores higher than 1.0 denote highly soluble regions, while scores lower than −1 indicate poor solubility in e. coli. in the current study, we propose a truncated version of ace2 that comprises the binding interface for the receptor binding domain (rbd) of the sars-cov-2 spike protein. recently, in vitro binding assays confirmed that the rbd is mainly responsible for initial binding to ace2, which in turn mediates virus entry into the host cell (lan et al., 2020). variation in the rbd sequence was analyzed in the sars-cov-2 genomes reported so far from various regions of the globe (shu et al., 2020). the sequence alignment showed more than 99.99% sequence identity for the rbd, with only a single variation, r408i, in a sars-cov-2 genome reported from india (figure s1). the rest of the sars-cov-2 genome sequences submitted from around the globe have identical rbd sequences, which indicates that the sars-cov-2 rbd is highly conserved globally. structural elucidation has also found the rbd to be highly conserved (lan et al., 2020). to block attachment of the spike protein to the cell, the ace2/rbd binding interface comprising residues 21-119 of ace2 was selected as the truncated version of ace2. the structure of tace2 was built with i-tasser with a c-score of 1.22. a c-score in the range −5 to 2 reflects the confidence in the fold, with higher values indicating greater confidence. the high c-score for tace2 suggests that the predicted structure is highly likely to be correct. 
the tace2 fragment contains almost all of the residues involved in binding the rbd of sars-cov-2 (yan et al., 2020), covering two complete helices (lan et al., 2020). this suggests that the rational design of a binder based on this interface, with enhanced affinity for the rbd, may play a vital role by blocking the interaction of the sars-cov-2 spike protein with ace2, thus inhibiting viral entry into the host cell. previously, peptide based strategies were employed successfully to inhibit fusion between the sars-cov-1 s protein and the membrane receptor (du et al., 2009). another recent study reported a 23 amino acid peptide, a homologue of the ace2 binding interface, which binds the s protein with low nanomolar affinity. since the residues that bind ace2 are located at distant positions on the rbd, they form a large protein binding site, and it is difficult for a small therapeutic peptide to cover the entire binding site on the rbd. our proposed tace2 fragment, however, carries almost all the binding residues and can therefore block the attachment of the rbd to intact ace2. the rbd was docked with intact and truncated ace2 using haddock, a flexible protein-protein docking tool. the method allows the side chains and backbone atoms of both the protein and the receptor to remain flexible during the docking run. the haddock scoring function (haddock score) is a linear combination of non-bonded intermolecular van der waals (vdw) and coulomb electrostatic energies and an empirically derived desolvation energy term (vangone et al., 2017). the haddock scores of ace2 and tace2 were −111 and −126.6, respectively (the more negative the better). similarly, the vdw and electrostatic energies of the tace2-rbd complex were also greater than those of the ace2-rbd complex, which indicates a higher binding affinity of tace2 for the rbd than that of intact ace2 (table 1). 
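As a rough illustration of how such a linear combination works, the sketch below applies the weights that the paper gives in its definition of the haddock score (1.0, 0.2, 1.0 and 0.1 for the van der Waals, electrostatic, desolvation and restraint terms). The function name `haddock_score` and the example energies are hypothetical; the numbers are made up and are not taken from table 1.

```python
# Weights as stated in the paper's definition of the HADDOCK score:
# 1.0 Evdw + 0.2 Eelec + 1.0 Edesol + 0.1 Eair
WEIGHTS = {"vdw": 1.0, "elec": 0.2, "desol": 1.0, "air": 0.1}


def haddock_score(e_vdw: float, e_elec: float, e_desol: float, e_air: float) -> float:
    """Weighted sum of the four energy terms; more negative means better."""
    return (
        WEIGHTS["vdw"] * e_vdw
        + WEIGHTS["elec"] * e_elec
        + WEIGHTS["desol"] * e_desol
        + WEIGHTS["air"] * e_air
    )


# Made-up energies, for illustration only.
example = haddock_score(e_vdw=-50.0, e_elec=-200.0, e_desol=10.0, e_air=100.0)
print(example)
```

Because the electrostatic term is weighted by only 0.2, a large electrostatic energy contributes less to the final score than a van der Waals or desolvation energy of the same size.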
the rmsd values of ace2 and tace2 in complex with the rbd were 0.7 and 0.8, respectively, indicating that the docked complexes closely resemble the native one (vangone et al., 2017). to further confirm these docking scores, rigid docking was also performed with the patchdock, zdock and cluspro protein-protein docking tools. the docking results obtained for ace2 were compared with those for tace2 in terms of the energy function of each docking tool (table 2). all three docking scores are higher for tace2 than for intact ace2, indicating a high affinity of tace2 for the rbd. our docking results showed that seven residues of ace2 (glu23, thr27, asp30, glu35, tyr83, asn330 and lys353) interact with rbd residues lys417, lys458, asn487, tyr489, gln493, tyr495, gly496, thr500 and gly502, which is almost identical to the binding residue profile of the ace2 interface reported previously (yan et al., 2020), with thr27 and glu35 additionally identified by our docking results (figure 1(a, b)). however, tace2 forms a different network of binding residues than intact ace2. our docking results showed that ser23, asn31, tyr30, glu36, gln40, gln76 and arg95 of tace2 are involved in binding the rbd (figure 1(c, d)). it appears that the truncation produced conformational changes in the tace2-rbd complex that expose buried binding residues and thus facilitate stronger binding of tace2 to the rbd compared with native ace2; this is in agreement with previously reported peptides that inhibit viral attachment to the host cell (koehler et al., 2013). docking methods, however, are not reliable for predicting the binding affinity of protein-protein complexes because of their simple scoring functions (ramírez & caballero, 2016). 
the binding affinity of a protein-protein complex also depends on the dissociation constant (Kd), ph and temperature (kastritis & bonvin, 2010), and these parameters are not included in the benchmarks of docking scoring functions. therefore, we determined the binding affinity of the ace2 variants for the rbd with the prodigy server, which predicts binding affinity based on structural properties of the protein-protein complexes (vangone & bonvin, 2015). the ace2 and tace2 complexes showed binding affinities (ΔG) of −10.7 and −12.7 kcal mol⁻¹ for the rbd, respectively, at temperatures ranging from 20 to 37 °C, showing a higher binding affinity of tace2 for the rbd than that of intact ace2 (table 3). similarly, the dissociation constant (Kd) of the tace2-rbd complex was more than three-fold lower than that of the intact ace2-rbd complex, showing that tace2 is bound more tightly to the rbd. a smaller Kd value indicates higher stability and stronger binding affinity of a protein-protein complex (johnson et al., 2007). the ace2 variants showed a significant decline in Kd when the temperature was increased from 20 °C to 36 °C, leading to a lower Kd for tace2 (9.8 × 10⁻¹⁰ M; higher affinity) than for intact ace2 (2.6 × 10⁻⁸ M) at 36 °C. this Kd value of tace2 is lower than the previously reported Kd (47 nM) of sbp1 (an ace2 derived peptide of 23 amino acids) for the rbd. the optimum stability of the complexes was found at 36 °C (table 3). the dramatic changes in binding kinetics might be caused by reduced stability of the ace2 complex below the optimum temperature of 36 °C (zhao et al., 2018). to determine the structural stability and dynamic behavior of the intact ace2-rbd and tace2-rbd complexes, we performed md simulations for 20 ns using gromacs 5.0.4. for each complex, the lowest-energy docking pose obtained from haddock was selected for the md simulation run. to investigate the structural stability of the complex, an rmsd plot of the complex backbone was produced. 
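The link between the ΔG and Kd values quoted above can be sketched with the standard thermodynamic relation ΔG = RT ln Kd. This is an illustrative sketch only: whether the prodigy server applies exactly this conversion is an assumption here, and the function name `kd_from_dg` is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_dg(dg_kcal_per_mol: float, temp_celsius: float) -> float:
    """Dissociation constant (M) from binding free energy via dG = RT ln(Kd)."""
    temp_kelvin = temp_celsius + 273.15
    return math.exp(dg_kcal_per_mol / (R_KCAL * temp_kelvin))


# dG values as quoted in the text, evaluated at 36 degrees Celsius.
for label, dg in (("ace2", -10.7), ("tace2", -12.7)):
    print(label, f"{kd_from_dg(dg, 36.0):.2e} M")
```

With the ΔG values quoted in the text and a temperature of 36 °C, this relation gives Kd values of roughly 3 × 10⁻⁸ M and 1 × 10⁻⁹ M, close to the values the text reports for intact ace2 and tace2.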
a uniform rmsd plot signifies the structural stability of the tace2-rbd complex. the rmsd value for the tace2 complex was 0.2-0.25 nm, while the intact ace2 complex showed an rmsd of 0.25-3.0 nm (figure 2). the rmsd value of the tace2-rbd complex is lower than that of the previously reported sbp1-rbd complex, which is almost 0.8 nm, showing the higher stability of the tace2-rbd complex. root mean square fluctuation (rmsf) was determined to evaluate the flexibility of the residues of both ace2 and rbd in the docked complexes. high rmsf values indicate mobility of residue side chains relative to their average position (kumar et al., 2014). the rmsf plot shows that the residues of the rbd in the tace2 complex are stable, with only a few peaks above 0.2 nm (figure 3(b)), while the rbd in the ace2 complex shows many residues with rmsf above 0.35 nm (figure 3(d)). the residues of tace2 at positions 24, 30, 40, 76 and 95 showed a reduction in rmsf owing to the binding interactions they form with the rbd (figure 3(a)). residues involved in binding another protein present lower rmsf values and reveal the most stable regions of the complex (ardalan et al., 2018). similarly, the residue window 470-480 of the rbd showed higher fluctuation, up to 0.25 nm, while fluctuation decreased at the binding residue positions (figure 3(b)). the largest fluctuation in intact ace2 was observed at the c-terminus, where it exceeded 0.7 nm (figure 3(c)). the overall rmsf values of both tace2 and rbd are below 0.2 nm, which indicates that the tace2 complex with the rbd is stable; this is consistent with a previously reported rmsf value of 0.4 nm that indicated complex stability (maqsood et al., 2020). in the trajectories sampled every 100 ps during the 20 ns md simulation run, very small backbone deviations were observed for both the intact ace2 and tace2 complexes (figure 4). however, the amino acid region 470-489 of the rbd showed backbone fluctuation, highlighted in yellow (figure 4(c)), and we suggest that this region is the binding site for ace2. 
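The two quantities used above can be written down compactly. The sketch below is a minimal pure-Python illustration, not the gromacs implementation: it assumes that all frames have already been superimposed on a common reference (no fitting is done here), and the function names and toy coordinates are hypothetical.

```python
import math


def rmsd(frame, reference):
    """RMSD between two frames with identical atom ordering (no fitting)."""
    squared = [
        sum((a - b) ** 2 for a, b in zip(atom, ref_atom))
        for atom, ref_atom in zip(frame, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))


def rmsf(trajectory):
    """Per-atom RMSF about each atom's mean position over all frames."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Mean position of atom i over the trajectory.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from that mean position.
        msf = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msf))
    return result


# Toy trajectory: 2 frames, 2 atoms, coordinates in nm (made-up numbers).
toy = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(0.2, 0.0, 0.0), (1.0, 1.0, 1.0)],
]
print(rmsd(toy[1], toy[0]))
print(rmsf(toy))
```

RMSD yields one value per frame and tracks the drift of the whole structure over time, whereas RMSF yields one value per atom (or residue) and shows which parts of the structure are mobile, which is why the text reports them in separate figures.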
previously, the amino acid region 480-488 of the sars-cov-2 spike protein was also reported as a binding region for ace2 (ibrahim et al., 2020).
[table 1 fragment: −185.2] * the haddock score is defined as: 1.0 E_vdw + 0.2 E_elec + 1.0 E_desol + 0.1 E_air. ** the z-score produced by haddock indicates standard deviations from the average cluster (the more negative the better).
the radius of gyration (rg) of both ace2 complexes describes the overall spread of the molecule during the 20 ns md run. a low rg value indicates better structural integrity and folding behavior (erva et al., 2016). a slight increase in the rg value of the intact ace2-rbd complex was observed during the first 5 ns of the run, with no further drift until the end (figure 5, red line), whereas the tace2-rbd complex remained stable throughout the md run (figure 5, violet line), which indicates its structural integrity. overall, the md simulation results confirm that tace2 forms a more stable complex with the rbd and suggest its inhibitory potential against the sars-cov-2 spike glycoprotein. post-translational modifications (ptms) play an important role in protein-protein interactions (su et al., 2017). since experimental methods are costly and time-consuming, it is useful to predict theoretically the ptm sites on a protein that is to be expressed heterologously. ptm-ssmp predicts ptm sites on human proteins based on the local sequence and site specific modification profiles (liu et al., 2018). analysis of ace2 with the ptm-ssmp server predicted ubiquitination at positions 74 and 304, phosphorylation at position 606 and o-glycosylation at position 720. the ptm site at position 74 is important for protein degradation and has no role in ppis (lecker et al., 2006). in transmembrane proteins, only the extracellular domains may be n-glycosylated (gupta et al., 2004). however, no n-glycosylation or o-glycosylation site was predicted for tace2. 
these results indicate that no ptm site important for protein-protein interactions is predicted on tace2. interestingly, an experimental study reported that the lack of glycosylation does not affect the binding of the sars-cov-1 rbd to human ace2 (chakraborti et al., 2005), which supports the expectation that our designed tace2 fragment, if expressed in e. coli, may bind efficiently to the rbd of the sars-cov-2 s glycoprotein.
figure 2. rmsd plot of the backbone atoms of the ace2-rbd (red) and tace2-rbd (violet) complexes. the tace2 complex shows a lower rmsd value than the intact ace2 complex, indicating its higher stability.
since no ptm site was predicted in tace2, e. coli would be an ideal host for its large scale expression. e. coli is the easiest, quickest and cheapest expression host, has a fully known genome, and is the most widely used host for heterologous expression of recombinant proteins (basit et al., 2019). since ace2 is a eukaryotic protein, its expression in native form in e. coli is uncertain, as most eukaryotic proteins are expressed in insoluble form in e. coli and need to be refolded in vitro, which is costly and time consuming. for this reason, proteins that are expressed in soluble form in e. coli are referred to as "low hanging fruit", as their bulk production is cost effective and they are easy to recover (maqsood et al., 2020). both the sequence based and the structure based camsol solubility prediction tools predicted expression of intact ace2 in a completely insoluble form in e. coli, with an intrinsic solubility score of −1.027, and completely soluble expression of tace2, with a solubility score of 1.23. the software generates a solubility profile with one score per residue, where scores higher than 1 denote highly soluble regions and scores lower than −1 denote poorly soluble ones (figure 6(a, b)). these results propose e. 
coli as a suitable host for soluble expression of tace2 using pet28a(+) as the expression vector, which favors single step purification. structure-based rational design of inhibitory proteins with enhanced affinity for the sars-cov-2 spike glycoprotein may facilitate the development of potential therapeutics. in this study, we designed a truncated version of human angiotensin converting enzyme 2 as a potential inhibitor of the spike glycoprotein. the truncated protein tace2 was studied extensively through protein-protein docking and md simulation for binding to the rbd of the sars-cov-2 spike glycoprotein. we found that tace2 can bind the rbd with a higher binding affinity and forms a more stable complex than intact ace2. in addition, the tace2 sequence was predicted to be expressed in soluble form in e. coli, which makes it an easy target for rapid large scale production for sars-cov-2 prevention. we believe that this study narrows down the region of interaction between the sars-cov-2 s glycoprotein and human ace2 and paves the way to further enhance the binding affinity between tace2 and the sars-cov-2 s glycoprotein through rational design. this may open a new path to covid-19 treatment. the authors declare no conflict of interest. 
key: cord-147565-mtdhdkc1 authors: harmalkar, ameya; gray, jeffrey j. title: advances to tackle backbone flexibility in protein docking date: 2020-10-15 journal: nan doi: nan sha: doc_id: 147565 cord_uid: mtdhdkc1 computational docking methods can provide structural models of protein-protein complexes, but protein backbone flexibility upon association often thwarts accurate predictions. in recent blind challenges, medium or high accuracy models were submitted in less than 20% of the "difficult" targets (with significant backbone change or uncertainty). here, we describe recent developments in protein-protein docking and highlight advances that tackle backbone flexibility. in molecular dynamics and monte carlo approaches, enhanced sampling techniques have reduced time-scale limitations. internal coordinate formulations can now capture realistic motions of monomers and complexes using harmonic dynamics. machine learning approaches adaptively guide docking trajectories or generate novel binding site predictions from deep neural networks trained on protein interfaces. these tools poise the field to break through the longstanding challenge of correctly predicting complex structures with significant conformational change. protein-protein interactions are involved in nearly all of the biological processes in human health and disease. understanding the dynamics of binding and the structure of protein complexes at the molecular level can be instrumental in delineating biological mechanisms and developing intervention strategies. computational protein-protein docking provides a route to predict the three-dimensional structures of protein assemblies or complexes from known structures of individual monomeric proteins [1].
docking methods are tested in the blind prediction challenge known as the critical assessment of prediction of interactions (capri) [2], which in recent rounds pushed the field by including a wide array of target types such as transport proteins, higher order assemblies and host-virus interactions [3, 4]. out of the 28 protein-protein targets evaluated in capri over the past four years [4, 5], predictors achieved high quality structures for 11 "easy" targets, defined as those with little backbone motion (unbound to bound cα root mean square deviation (rmsd_bu) of less than 1.2å [6, 7]; figure 1). the remaining 17 targets were categorized as "difficult" (rmsd_bu over 2.2å and/or poor monomer template availability). for these targets, predictors only achieved acceptable quality in 8 of 17 targets (47%) and high quality in only 2 (12%) [4, 5]. thus, the intrinsic flexibility of biomolecules still confounds the protein docking community at large. in this review, we focus on the central docking challenge of capturing larger binding-induced conformational changes.
figure 1: performance of protein docking approaches on blind targets in capri rounds 38-46 [4, 5]. distribution of dockq scores for the best model submitted by each predictor group (points) for each individual target (x-axis). dockq measures a combination of intermolecular residue-residue contacts, interface rmsd, and ligand rmsd on a scale of 0 (incorrect) to 1 (matching the experimental structure) [5]. targets are labelled by their capri target number and, when needed, interface number (after the decimal). the targets are classified into rigid (easy) targets (high-homology monomer templates and under 1.2å unbound-bound backbone motion) and flexible targets (poor template availability and/or over 1.2å rmsd_bu). dockq scores are color-coded by capri model quality ranking: blue, high; green, medium; yellow, acceptable; gray, incorrect. data graciously provided by marc lensink [4, 5].
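the dockq score used in figure 1 can be sketched in a few lines. the block below is an illustration rather than the capri assessment code: it assumes the published dockq definition (the mean of the fraction of native contacts and two scaled rmsd terms, with scaling distances of 1.5å for interface rmsd and 8.5å for ligand rmsd) and the commonly quoted quality cutoffs of 0.23, 0.49 and 0.80. the function names are hypothetical.

```python
def scaled_rmsd(rmsd, d):
    """Map an RMSD (in angstroms) onto (0, 1]; 1 means a perfect match."""
    return 1.0 / (1.0 + (rmsd / d) ** 2)


def dockq(fnat, irmsd, lrmsd):
    """DockQ as the mean of Fnat, scaled interface RMSD and scaled ligand RMSD.

    Assumed scaling distances: 1.5 A (interface) and 8.5 A (ligand).
    """
    return (fnat + scaled_rmsd(irmsd, 1.5) + scaled_rmsd(lrmsd, 8.5)) / 3.0


def capri_class(score):
    """Translate a DockQ score into a CAPRI-style label (assumed cutoffs)."""
    if score >= 0.80:
        return "high"
    if score >= 0.49:
        return "medium"
    if score >= 0.23:
        return "acceptable"
    return "incorrect"


if __name__ == "__main__":
    # Hypothetical model: 60% native contacts, 1.8 A interface RMSD, 4 A ligand RMSD.
    s = dockq(fnat=0.6, irmsd=1.8, lrmsd=4.0)
    print(round(s, 3), capri_class(s))
```

because all three terms are bounded by 1, a single poor term (for example a large ligand rmsd) lowers the score smoothly instead of failing the model outright.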
we summarize progress by recent algorithms and frameworks, additionally augmented by growth in databases and computational power (cpu- and gpu-based). these new methods have achieved greater accuracy on more challenging targets and additionally yielded insight into binding mechanisms. we first present progress in binding site identification and then docking methods including molecular dynamics (md) and monte carlo (mc) approaches, normal modes, and machine learning. together, these techniques have helped better explore broader regions of conformational space and more thoroughly evaluate the energy landscape to improve protein-protein docking. to reduce the complexity of the immense conformation space of flexible proteins, coarse-grained models are frequently used to reduce the degrees of freedom (figure 2). in the extreme, global docking approaches typically first treat protein partners as rigid bodies by restricting to six degrees of freedom (three rotational and three translational). a prime method to exhaustively sample the global 6d space is enumerating and scoring different rigid-body orientations on a dense grid. approaches such as cluspro [15, 16], zdock [17, 18], piper [19] and hexserver [20] use fast fourier transforms (fft) to accelerate this search, and a generalized five-dimensional fourier transform (fmft) has also been introduced [21]. relative to traditional fft-based docking, fmft accelerates calculations ten-fold [13]*. another shape-based approach is geometric hashing, which indexes point sets or curves to match geometric features under arbitrary transformations like translations, rotations or even scaling [22]. local 3d zernike descriptor-based docking (lzerd), one of the top methods in capri, projects 3d surfaces onto spheres to efficiently capture complementarity of protein surfaces [23]. some rigid-body approaches exploit data from chemical cross-linking experiments [24] or small-angle x-ray scattering (saxs) [25] to further improve discrimination of generated structures.
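the exhaustive rigid-body scan described above can be illustrated on a toy 2d lattice. the sketch below is a brute-force translational search with a hypothetical shape-complementarity score (surface contacts rewarded, core overlaps penalized). fft-based programs evaluate the same kind of correlation for all translations at once and also scan rotations, which this example omits. the grids, weights and function names are assumptions made for illustration.

```python
from itertools import product


def score(receptor, ligand, dx, dy, clash_penalty=10):
    """Score one translation: +1 per surface contact, -penalty per core overlap.

    receptor: dict mapping (x, y) -> "core" or "surface".
    ligand: set of (x, y) cells occupied by the ligand.
    """
    total = 0
    for (x, y) in ligand:
        cell = receptor.get((x + dx, y + dy))
        if cell == "core":
            total -= clash_penalty
        elif cell == "surface":
            total += 1
    return total


def grid_scan(receptor, ligand, span):
    """Enumerate every translation in [-span, span]^2 and return the best one."""
    shifts = product(range(-span, span + 1), repeat=2)
    return max(shifts, key=lambda s: score(receptor, ligand, *s))


# Toy receptor: a slab of core cells with a one-cell pocket at (2, 1)
# and a favourable surface layer lining the pocket and the top of the slab.
core = {(x, 0) for x in range(5)} | {(0, 1), (1, 1), (3, 1), (4, 1)}
surface = {(2, 1)} | {(x, 2) for x in range(5)}
receptor = {cell: "core" for cell in core}
receptor.update({cell: "surface" for cell in surface})

# Toy ligand: a T-shaped piece whose stem fits the pocket.
ligand = {(0, 1), (1, 1), (2, 1), (1, 0)}

best_shift = grid_scan(receptor, ligand, span=3)
```

in this toy case the only clash-free placement with four contacts puts the stem of the ligand into the pocket, so the scan recovers that translation.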
these approaches provide fast, global exploration of the energy landscape. in recent capri rounds [4, 5], many predictors used them as the first step to identify putative binding patches and then supplemented them with other refinement tools to capture backbone flexibility. molecular dynamics (md) is one strategy that is often used after grid-search or template-based approaches for refinement (figure 3) [26, 27]. unbiased, all-atom md simulations can provide a high-resolution, time-resolved microscopic model of protein-protein interactions. md calculates newtonian trajectories using physics-based energy functions to simulate protein association and dissociation events. the use of md for protein docking has been limited because non-native local minima trap proteins, and dissociation is too slow [28]. over the past decade, two modifications have been introduced to capture conformational changes: steered molecular dynamics (smd) [29], which utilizes external force constraints, and markov sampling, which breaks a long md simulation into multiple short trajectories [30]. to accelerate dissociation of protein partners at sub-optimal binding regions, ostermeir et al. developed a hamiltonian replica exchange md protocol (h-remd) for protein docking [31]*. in h-remd, biasing potentials are based on the shortest distance between protein partner atoms (defined as "ambiguity restraints"). as the biasing potential and associated ambiguity restraints vary across replicas, associated protein partners in one replica are forced to dissociate in another. pan et al. simulated long timescales in a global search space for a benchmark set of five targets on the special purpose machine anton [32, 33].
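the exchange step behind hamiltonian replica exchange can be written compactly. the sketch below assumes the standard metropolis criterion for swapping configurations between two replicas that share a temperature but differ in their biasing potential. the toy bias function, its parameters and all names are hypothetical and are not taken from ostermeir et al.

```python
import math
import random

KT = 2.494  # kJ/mol at roughly 300 K (assumed temperature)


def bias(distance, strength, onset=0.5):
    """Toy repulsive bias (kJ/mol) that penalizes partner distances below `onset` nm."""
    return strength * max(0.0, onset - distance) ** 2


def exchange_probability(d_i, d_j, k_i, k_j, kt=KT):
    """Metropolis probability of swapping configurations between replicas i and j.

    Only the bias differs between replicas, so the unbiased energy cancels.
    """
    before = bias(d_i, k_i) + bias(d_j, k_j)
    after = bias(d_j, k_i) + bias(d_i, k_j)
    delta = after - before
    if delta <= 0.0:
        return 1.0
    return math.exp(-delta / kt)


def attempt_exchange(d_i, d_j, k_i, k_j, rng=random):
    """Return the (possibly swapped) partner distances held by replicas i and j."""
    if rng.random() < exchange_probability(d_i, d_j, k_i, k_j):
        return d_j, d_i
    return d_i, d_j
```

with an unbiased replica (strength 0) and a strongly biased one, a bound configuration held by the biased replica is always handed to the unbiased replica in exchange for a dissociated one, while the reverse swap is almost never accepted. this is how the bias pushes bound partners apart in some replicas without distorting the unbiased ensemble.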
the "tempered binding" protocol of pan et al. updates energy function parameters throughout the simulation: a soft-core van der waals intermolecular potential is scaled so that long-lived states are dissociated more frequently.
in contrast to md approaches that target flexibility with newtonian dynamics, monte carlo (mc) methods sample by random moves, often followed by minimization (mcm) [40, 41]. mc allows a wide variety of conformational move types to sample diverse conformations. mc algorithms have emulated the kinetic binding models, namely key-lock, conformer selection (cs) and induced-fit (if) mechanisms [42, 43, 44]. the cs model chooses protein backbones from a pre-generated ensemble; this approach has the advantage of docking one partner's conformations at a time. however, cs docking can fail if the ensemble is devoid of native-like backbone conformations [45]. docked from an ensemble of 10 structures). to diversify backbone conformations, the protocol generates monomer structures by three methods: (1) normal modes [46], (2) backrub motions [47] and (3) all-atom backbone refinement [48]. further, to discriminate between near-native and non-native structures, they developed a more accurate coarse-grained energy function with 6-dimensional residue-pair data obtained
since intrinsic fluctuations in proteins contribute to conformational change, some docking approaches utilize harmonic dynamics to capture protein backbone motions [49, 50, 51]. normal modes of vibration represent internal motions of a protein based on a hookean potential between close residues. normal mode analysis (nma) is incorporated in docking approaches such as attract [52], fiberdock [53], swarmdock [54] and eigenhex [55]. to mimic induced-fit, schindler et al. developed iattract [56] by moving interface residues in cartesian coordinate space subject to nma-generated harmonic potentials.
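the hookean picture underlying normal mode analysis can be sketched as an elastic network. the block below is a minimal illustration, not the model used by any of the cited programs: it assumes one bead per residue, a single spring constant and a distance cutoff (both hypothetical values), and evaluates the harmonic energy of a displaced conformation relative to the reference structure. normal modes would be obtained by diagonalizing the hessian of this energy, which is omitted here.

```python
import math
from itertools import combinations


def build_springs(reference, cutoff=8.0):
    """Connect residue pairs closer than `cutoff` (angstroms) in the reference."""
    springs = []
    for i, j in combinations(range(len(reference)), 2):
        d0 = math.dist(reference[i], reference[j])
        if d0 < cutoff:
            springs.append((i, j, d0))
    return springs


def hookean_energy(coords, springs, k=1.0):
    """E = 1/2 * k * sum over springs of (d_ij - d0_ij)^2."""
    return 0.5 * k * sum(
        (math.dist(coords[i], coords[j]) - d0) ** 2 for i, j, d0 in springs
    )


# Toy "protein": four beads on a line, 3.8 angstroms apart.
reference = [(3.8 * n, 0.0, 0.0) for n in range(4)]
springs = build_springs(reference)

# Uniformly stretched copy: neighbours are now 4.8 angstroms apart.
stretched = [(4.8 * n, 0.0, 0.0) for n in range(4)]
energy = hookean_energy(stretched, springs)
```

the reference structure has zero energy by construction, so low-energy displacements are collective motions that change few spring lengths, which is the intuition behind using low-frequency modes for backbone flexibility.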
iattract served as a refinement stage and improved the fraction of native contacts predicted by 70%. for targets with unbound to bound interface rmsd over 4å, iattract can achieve acceptable quality models [56]. population-based methods such as particle swarm optimization (pso) have also been used for flexible docking [57, 4]. extending the swarm intelligence methods, the lightdock algorithm uses a "glowworm" swarm optimization to sample different backbone conformations in local regions of the protein surface with an anisotropic network model [58]. lightdock additionally uses multiscale modeling to combine all-atom and coarse-grained scoring functions. while normal modes have typically been used on individual protein partners prior to docking, oliwa and shen introduced complex nma (cnma) in docking to also sample molecular complex fluctuations [60]. by calculating modes of an encounter complex, this approach focuses on the binding region as it reduces the dimensionality of the search space [61]. one of the problems of nma is that higher frequency modes often distort protein bonds. to overcome this limitation, frezza and lavery developed the internal coordinate nma (inma) approach to move in the torsion angle space, that is, with fixed bond lengths and angles (figure 4) [62]. with a reduced protein model in an internal coordinate space, they captured larger conformational changes from eigenvectors of low-frequency modes [59]**. inma can generate structures within 3å of the bound state when starting from the unbound state for 39% of single-domain and 45% of multi-domain proteins in their benchmark. although protein folding has been one prime focus of deep learning methods in biology (e.g., alphafold [63, 64] and raptorx [65]), in recent years, a few studies have explicitly addressed challenges relevant to protein docking [66]. protein binding sites can be thought of as an information-rich molecular space that can be mined for elucidating protein interactions [67, 68, 69].
one approach is to use this information to create score functions for use with traditional docking approaches. for example, geng et al. used graph representations to train a support vector machine (svm) on native and non-native protein complex structures to develop a scoring potential (graphrank) to rank docked poses [70]. iscore, which combines the graphrank and haddock [71] scores, achieved top performance in capri scoring rounds (medium or high quality structures for nine out of 13 targets). other teams have used deep learning techniques to identify protein interfaces by extrapolating image recognition tools to protein structures. raptorx-complexcontact [69] uses a deep residual neural network trained on single-chain proteins to predict contacts between binding partners, achieving the top contact prediction scores in casp [72]. another approach is to characterize interaction environments. townshend et al. created "voxels," i.e., volumetric pixels with local atomic information for every protein surface residue, and with this 3d representation, they trained a deep 3d convolutional neural network (sasnet) on a curated database of bound protein complex structures [73]. pittala et al. employed graph convolutions with the nodes representing the amino acid residues and edges connecting residues with a cβ-cβ distance under 10å [74]. they placed geometric and chemical features on both nodes and edges and used a graph neural network to predict epitopes and paratopes in antigen-antibody interfaces. in a unique approach by gainza et al., a geometric deep learning model (masif) used molecular interaction "fingerprints" calculated using geometric and chemical features of protein surfaces [14]**. their deep network was composed of geodesic convolutional layers, and they used it to predict binding sites, evaluate alternate docked interfaces, and assess the likelihood of a given protein-protein interaction.
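the residue graph that serves as input to graph-convolution models such as the one described above for pittala et al. is easy to sketch. the example below assumes that cβ coordinates are already available as a list of (x, y, z) tuples and connects residues whose cβ-cβ distance is under 10å. node and edge features, and the network itself, are omitted, and the function name is hypothetical.

```python
import math


def residue_graph(cb_coords, cutoff=10.0):
    """Return an adjacency list linking residues with C-beta distance < cutoff (A)."""
    n = len(cb_coords)
    graph = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(cb_coords[i], cb_coords[j]) < cutoff:
                graph[i].append(j)
                graph[j].append(i)
    return graph


# Three toy residues: the first two are 6 A apart, the third is far away.
coords = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
graph = residue_graph(coords)
```

the pairwise loop is quadratic in the number of residues, which is acceptable for a sketch. a spatial index would be the usual choice for large complexes.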
relative to conventional rigid docking methods on protein targets, masif-search can perform ultra-fast scanning to identify true binders with similar accuracy but significantly faster (4 cpu-minutes vs. 45 hours for patchdock and 93 days for zdock to evaluate a benchmark of 100 bound protein complexes). in a study to explore how neural networks might be used to generate structures with considerable backbone motion, degiacomi trained an autoencoder with conformations from md simulations, compressing the protein motion into a low-dimensional latent space [75]*. by training with simulations of both closed (bound) and apo conformations of a target protein, the autoencoder generated an intermediate closed-apo conformation at 0.8å rmsd [75] from the native state. however, when the autoencoder was trained only with open conformations, the generator could only create structures far from the closed state (over 4.2å), limiting the utility of this approach for blind docking. in an approach suitable for blind cases, cao and shen developed a bayesian active learning (bal) model to quantify uncertainty in protein structure quality, and then they extended their model to flexible protein docking [76]*. the bayesian framework determines the posterior probability as it samples backbone conformations [60]. flexibility is captured with low-frequency complex-nma modes, and in principle it can be extended to higher frequencies that capture loop and hinge motions. compared to zdock [17] and pso, bal improves the interface rmsd of the near-native predictions by 0.5å. in conjunction with experimental data, docking has advanced a range of biological and health applications (e.g., alzheimer's disease [77], celiac disease [78], sars-cov-2 [79], influenza [80], cancer [81], and heart disease [82], to name just a few).
over the past few years, docking success rates have improved on "difficult" blind prediction targets, but rates need to be higher for docking to be a reliable stand-alone tool in all cases. clearly, a diverse and impressive array of tools has steadily advanced toward reliably capturing large conformational changes in protein docking. docking will be even more impactful when the field finally overcomes this challenge. recent progress and future directions in protein-protein docking capri: a critical assessment of predicted interactions the challenge of modeling protein assemblies: the casp12-capri experiment blind prediction of homo-and hetero-protein complexes: the casp13-capri experiment modeling protein-protein, protein-peptide, and protein-oligosaccharide complexes: capri updates to the integrated protein-protein interaction benchmarks: docking benchmark version 5 and affinity benchmark version 2 dockground: a comprehensive data resource for modeling of protein complexes cluspro: a fully automated algorithm for protein-protein docking the cluspro web server for protein-protein docking zdock: an initial-stage protein-docking algorithm interactive docking prediction of protein-protein complexes and symmetric multimers piper: an fft-based protein docking program with pairwise potentials hexserver: an fftbased protein docking server powered by graphics processors efficient search for the possible mutual arrangements of two rigid bodies with the use of the generalized five-dimensional fourier transform prediction of protein-protein interactions by docking methods protein-protein docking using region-based 3d zernike descriptors integrating cross-linking experiments with ab initio protein-protein docking ultra-fast filtering using small-angle x-ray scattering data in protein docking modeling of protein complexes in capri round 37 using template-based approach combined with model selection performance and enhancement of the lzerd protein assembly pipeline in capri 
38-46 atomic-level characterization of the structural dynamics of proteins implicit flexibility in protein docking: crossdocking and local refinement complete protein-protein association kinetics in atomic detail revealed by molecular dynamics simulations and markov modelling accelerated flexible protein-ligand docking using hamiltonian replica exchange with a repulsive biasing potential hamiltonian replica exchange (h-remd) modifies parts of the force field across different replicas. in this paper, a repulsive potential between receptor and ligand surface residues promotes transient dissociation on switching replicas, accelerating exploration of the protein surface to identify possible binding sites raising the bar for performance and programmability in a special-purpose molecular dynamics supercomputer atomic-level characterization of protein-protein association with long timescale md simulations using a "tempered binding" protocol that scales a soft-core energy across replicas to promote dissociation of longlived states, this work found that protein binding occurs through repeated association-dissociation events rather than prolonged in-contact exploration prediction of protein-protein complexes using replica exchange with repulsive scaling using a novel replica exchange scheme with variable van der waals radii for interface residue atoms, the rs-remd approach promotes dissociation in some replicas, which improves sampling for both global searches and refinement replica exchange with solute tempering: a method for sampling biological systems in explicit water replica exchange improves sampling in low-resolution docking stage of rosettadock umbrella sampling funnel metadynamics as accurate binding free-energy method holo-like and druggable protein conformations from enhanced sampling of binding pocket volume and shape high-resolution protein-protein docking sampling and scoring: a marriage made in heaven protein-protein docking with backbone flexibility conformer 
selection and induced fit in flexible backbone protein-protein docking using computational and nmr ensembles monte carlo replica-exchange based ensemble docking of protein conformations pushing the backbone in protein-protein docking anisotropy of fluctuation dynamics of proteins with an elastic network model backrub-like backbone simulation recapitulates natural protein conformational variability and improves mutant side-chain prediction alternate states of proteins revealed by detailed energy landscape mapping harmonic modes as variables to approximately account for receptor flexibility in ligand-receptor docking simulations: application to dna minor groove ligand complex flexibility and conformational entropy in protein-protein binding accounting for conformational changes during protein-protein docking flexible docking and refinement with a coarse-grained protein model using attract fiberdock: flexible induced-fit backbone refinement in molecular docking swarmdock and the use of normal modes in protein-protein docking flexible protein docking refinement using pose-dependent normal mode analysis iattract: simultaneous global and local interface optimization for protein-protein docking refinement enhanced sampling of protein conformational states for dynamic cross-docking within the protein-protein docking server swarmdock * a hybrid conformational-selection/induced-fit approach for dynamic crossdocking in swarmdock, a particle swarm optimization algorithm. 
ensembles are pre-generated with nma and undergo cross-docking while sampling alternate protein conformations using low frequency normal modes lightdock: a new multi-scale approach to protein-protein docking internal coordinate normal mode analysis: a strategy to predict protein conformational transitions this work employs nma in the internal coordinate space with a reduced protein model to capture large conformational changes of proteins with a faster compute time and no distortion of protein bonds cnma: a framework of encounter complex-based normal mode analysis to model conformational changes in protein interactions predicting protein conformational changes for unbound and homology docking: learning from intrinsic and induced flexibility internal normal mode analysis (inma) applied to protein conformational flexibility protein structure prediction using multiple deep neural networks in the 13th critical assessment of protein structure prediction (casp13) improved protein structure prediction using potentials from deep learning accurate de novo prediction of protein contact map by ultra-deep learning model deep learning in protein structural modeling and design (2020) recognition of functional sites in protein structures protein interface prediction using graph convolutional networks complexcontact: a web server for interprotein contact prediction using deep learning iscore: a novel graph kernel-based function for scoring protein-protein docking models haddock: a protein-protein docking approach based on biochemical or biophysical information critical assessment of methods of protein structure prediction (casp)-round xiii end-to-end learning on 3d protein structure for interface prediction learning context-aware structural representations to predict antigen and antibody binding interfaces coupling molecular dynamics and deep learning to mine protein conformational space this paper describes a unique method of generating plausible motions of a protein using a 
generative neural network (autoencoder) bayesian active learning for optimization and uncertainty quantification in protein docking * with a framework to quantify uncertainty in docked models, the bayesian approach uses a posterior distribution to guide sampling to likely low-energy conformations from monomer to fibril: abeta-amyloid binding to aducanumab antibody studied by molecular dynamics simulation plasma cells are the most abundant gluten peptide mhc-expressing cells in inflamed intestinal tissues from patients with celiac disease dna aptamers block the receptor binding domain at the spike protein of sars-cov-2, chemrxiv characterizing receptor flexibility to predict mutations that lead to human adaptation of influenza hemagglutinin targeting the corest complex with dual histone deacetylase and demethylase inhibitors protein docking and steered molecular dynamics suggest alternative phospholamban-binding sites on the serca calcium transporter this work was supported by the national institutes of health through grant r01-gm078221. we thank marc lensink for generously providing us with data from capri and sai pooja mahajan and sudhanshu shanker for helpful comments on the manuscript. dr. jeffrey j. gray is an unpaid board member of the rosetta commons. under institutional participation agreements between the university of washington, acting on behalf of the rosetta commons, johns hopkins university may be entitled to a portion of revenue received on licensing rosetta software including applications mentioned in this review. as a member of the scientific advisory board, dr. gray has a financial interest in cyrus biotechnology. cyrus biotechnology distributes the rosetta software, which may include methods mentioned in this review. 
key: cord-275023-0z219rcy authors: cerofolini, linda; fragai, marco; luchinat, claudio; ravera, enrico title: orientation of immobilized antigens on common surfaces by a simple computational model: exposition of sars-cov-2 spike protein rbd epitopes date: 2020-07-29 journal: biophys chem doi: 10.1016/j.bpc.2020.106441 sha: doc_id: 275023 cord_uid: 0z219rcy the possibility of immobilizing a protein with antigenic properties on a solid support offers significant possibilities in the development of immunosensors and vaccine formulations. for both applications, the orientation of the antigen should ensure ready accessibility of the antibodies to the epitope. however, an experimental assessment of the orientational preferences necessarily proceeds through the preparation/isolation of the antigen, the immobilization on different surfaces and one or more biophysical characterization steps. to predict a priori whether favorable orientations can be achieved or not would allow one to select the most promising experimental routes, partly mitigating the time cost towards the final product. in this manuscript, we apply a simple computational model, based on united-residue modelling, to the prediction of the orientation of the receptor binding domain of the sars-cov-2 spike protein on surfaces commonly used in lateral-flow devices. these calculations can account for the experimental observation that direct immobilization on gold gives sufficient exposure of the epitope to obtain a response in immunochemical assays. the activity and reactivity of an immobilized protein strongly depend on its orientation with respect to the surface of the support in -or on -which it is immobilized. this holds true for enzymes, as well as for antibodies and antigens. therefore, the possibility to control and manipulate the exposition of the relevant residues and protein surfaces plays an important role in the rational design of devices based on immobilized proteins. 
among these devices, immunosensors represent an expanding space for research and market opportunities. the path to a technologically relevant product must rely upon a strong experimental characterization, [1, 2] possibly based on atomic-level methodologies. [3-9] however, the preparation/isolation of the protein of interest, its immobilization, and the characterization of the resulting composite are complex and time-consuming; guidelines for achieving optimal orientations could therefore improve the efficiency of the r&d connected to protein immobilization. [10, 11] simplified simulation models that would allow for a rapid prediction of the most plausible orientations are, however, not particularly common. in this manuscript we apply a very simple method, based on united-residue modelling of protein-surface interactions, to specifically address the problem of determining the orientation of the sars-cov-2 spike protein receptor binding domain (rbd) on a few prototypical surfaces for biomedical use. united-residue modelling of protein-surface interactions is a rather effective model to screen the poses of protein molecules with respect to surfaces. [12, 13] the method we apply is based on the works by jiang, zhou and del monte-martinez, [10, 12-18] and encompasses van der waals and electrostatic interactions, as well as covalent immobilization. the choice of the target protein is motivated by the recent emergence of a new infectious disease (covid-19) caused by a coronavirus (sars-cov-2). [19] this infectious disease has spread significantly throughout the world, counting 13,841,890 infected people and a death toll of 590,845 as of july 2020. [20] models suggest that it will remain circulating and active for several months, [21, 22] and there is a marked possibility of reinfection, [23, 24] thus increasing the time of the circulation of the virus.
this pandemic outbreak has had a major impact on world economics, with a very long outlook. [25] a capillary control of the diffusion of the infection has proven crucial, [26] and serological tests are expected to have a key role in mass screening. [27, 28]
methods
the structures of the proteins were downloaded from the protein data bank (pdb), [29] the pka values of reactive groups were calculated using propka, [30, 31] and the interfaces were calculated using the pdbe pisa server. [32] the non-bonded interaction of a residue of type i with the surface is represented with a lennard-jones (lj) potential, [12, 13] which depends on the nearest distance r between the residue and the surface and is parametrized by the energy at the minimum position, the equivalent van der waals radius of each residue and a size parameter taken from the literature (see tables s2-s7; parameters are taken from [14, 16, 18, 33-37], as indicated in the table captions). the electrostatic interaction is represented through the gouy-chapman potential, [12, 13, 38] which depends on the nearest distance r between the i-th residue and the surface, on the charge of the residue, on the surface charge density, and on the inverse debye length, which is calculated from the ionic strength i. the relative permittivity of the medium is assumed to be distance-dependent. [13, 38] a 1:1 buffer salt concentration of 0.15 mol dm⁻³ is assumed. for silica, the surface charge density is estimated to be -0.3 c m⁻². [37] for self-assembled charged monolayers (sam), the charge density is set to +0.02 c m⁻² for the amino-capped monolayer (sam-nh2) and to -0.02 c m⁻² for the carboxyl-capped monolayer (sam-co2h). [14, 18] the formation of a covalent bond is treated with a dedicated potential term, in which the bond energy is set to 600 kj mol⁻¹ for imino bonds [39] and 100 kj mol⁻¹ for gold-thiol bonds, [40-42] regardless of the starting oxidation state of the thiol.
[43] the desolvation energy is already accounted for in the vdw term. the lj parameters for the epoxide-glyoxyl functionalization are assumed to be equal to those of sam-co2h, whereas for gold the parameters have been adapted from reference [36]. the sampling of the relative protein-surface orientations is performed by rotating a plane around the center of mass of the protein. the plane is initially parallel to the z=0 plane. only two rotations are necessary, as all the rotations around the normal to the plane will yield the same energy. the first rotation is applied around the y-axis, followed by a second rotation around the z-axis, and then a translation is applied to optimize the position, similarly to what is done in the popular pales software. [44-46] the sampling of the angle pairs is made uniform by using repulsion angular sampling. [47-49] the distance of the plane to the protein is then set by minimizing the energy terms described above. the most important feature of a composite intended for immunochemical applications is that the orientation of the antigen with respect to the surface must ensure the accessibility of the epitope to the antibodies, to guarantee the recognition. therefore, we have selected the crystallographic structure of rbd in complex with a fragment (fab) of the human antibody cr3022 (pdb id: 6w41), [50] and identified the interface residues relevant for the interaction (figure 1 and table s1). in the analysis of the orientations, we assume that the full-length antibody will have the same accessibility as the fab because of the high flexibility of the linkers of the heavy chains (see figure s1).
[51-54] it is also important to note that, while the spike protein is highly glycosylated at n- and o-positions, [55, 56] the structured part of the rbd which is recognized by the antibody only carries one glycosylation site, at position 343 [55] (pink in figure 1), and this site faces away from the antibody binding site. on these grounds, we have not considered glycosylation (experimentally, this would correspond to expressing recombinant rbd in prokaryotic cells, whereas glycosylation would be obtained in human cells [55]). the fragment of the human antibody cr3022 is shown in blue. hydrophobic adsorption occurs selectively on hydrophobic carriers at low ionic strength. [57] it is a rather common immobilization protocol, because of its simplicity. adsorption based on charge interactions is also rather common because of its simplicity. it is slightly less general, because the outcome strongly depends on the nature of the protein and of the surface. the electrostatic interaction of the i-th residue with the uniformly charged surface with a given charge density is estimated by the gouy-chapman potential, [12, 13] which is added to a lj term. the parameters defining each system are listed in tables s3-s6. we have considered the following surfaces: 1) silica, a common chromatographic support with high negative surface charge; 2) positively charged self-assembled monolayer (sam), with amino capping of the chains; [14, 18] 3) negatively charged sam, with carboxylic capping of the chains. [14, 18] supports #2 and #3 imply the possibility of colorimetric detection through gold, [58] vide infra. the most probable orientations are shown in figure 3. degrees from the one with highest relative population, except one that is more tilted, yielding a larger accessibility, with a relative population around 4% (figure s2). it is apparent that only negatively charged surfaces allow for the exposition of the epitope, and even then only marginally.
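the electrostatic term and the two-angle orientation scan described in the methods can be illustrated with a short sketch. it is not the authors' implementation: it assumes the linearized gouy-chapman form u = q·σ·exp(-κr)/(κ·ε_r·ε_0), the standard 1:1-electrolyte expression for the inverse debye length, a constant relative permittivity (78.5) instead of the distance-dependent one used in the paper, a fixed contact distance instead of an energy-minimized one, a regular angular grid instead of repulsion sampling, and it omits the lj and covalent terms. the function names and the two-bead toy "protein" are hypothetical.

```python
import math

# physical constants (SI units)
E_CHARGE = 1.602176634e-19   # elementary charge, C
K_B = 1.380649e-23           # Boltzmann constant, J/K
N_A = 6.02214076e23          # Avogadro constant, 1/mol
EPS_0 = 8.8541878128e-12     # vacuum permittivity, F/m
EPS_R = 78.5                 # ASSUMPTION: constant relative permittivity of water


def inverse_debye_length(ionic_strength_molar, temperature=298.15):
    """Return kappa (1/m) for a 1:1 electrolyte; ionic strength in mol dm^-3."""
    ionic_si = ionic_strength_molar * 1000.0  # convert to mol m^-3
    return math.sqrt(2.0 * N_A * E_CHARGE**2 * ionic_si
                     / (EPS_0 * EPS_R * K_B * temperature))


def gouy_chapman_energy(charge, distance, sigma_s, kappa):
    """Energy (J) of a bead with `charge` (units of e) at `distance` (m)
    from a plane with charge density `sigma_s` (C m^-2), linearized form."""
    potential = sigma_s / (kappa * EPS_R * EPS_0) * math.exp(-kappa * distance)
    return charge * E_CHARGE * potential


def scan_orientations(beads, sigma_s, kappa, contact=0.3e-9, n_theta=19, n_phi=36):
    """Scan surface normals n(theta, phi) = Rz(phi) Ry(theta) z on a regular grid.

    beads: list of ((x, y, z) in m, charge in e).
    The bead closest to the surface is placed at the ASSUMED `contact` distance.
    Returns a list of (theta, phi, electrostatic energy in J).
    """
    results = []
    for i in range(n_theta):
        theta = math.pi * i / (n_theta - 1)
        for j in range(n_phi):
            phi = 2.0 * math.pi * j / n_phi
            normal = (math.sin(theta) * math.cos(phi),
                      math.sin(theta) * math.sin(phi),
                      math.cos(theta))
            # height of every bead along the surface normal
            heights = [sum(c * n for c, n in zip(pos, normal)) for pos, _ in beads]
            lowest = min(heights)
            energy = sum(gouy_chapman_energy(q, h - lowest + contact, sigma_s, kappa)
                         for h, (_, q) in zip(heights, beads))
            results.append((theta, phi, energy))
    return results


kappa = inverse_debye_length(0.15)  # 0.15 mol dm^-3, as in the methods
# toy dipole: one positive and one negative bead, 2 nm apart along z
toy_beads = [((0.0, 0.0, -1.0e-9), +1.0), ((0.0, 0.0, 1.0e-9), -1.0)]
scan = scan_orientations(toy_beads, sigma_s=-0.02, kappa=kappa)
best_theta, best_phi, best_energy = min(scan, key=lambda item: item[2])
print(f"debye length: {1e9 / kappa:.2f} nm")
print(f"best theta: {best_theta:.2f} rad, energy: {best_energy * N_A / 1000:.2f} kJ/mol")
```

on the negatively charged toy surface the scan selects the orientation that brings the positive bead closest to the plane, which is the qualitative behaviour expected from the electrostatic term alone. a real calculation would use the united-residue coordinates and charges of the rbd together with the parameters of tables s2-s7.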
these results suggest that it would be nontrivial to achieve a good orientation relying upon adsorption, whether based on hydrophobic or on charge interactions. therefore, we have considered directed approaches based on stronger interactions. in particular, we have considered epoxide-glyoxyl (directed at primary amine moieties) [59] and gold (thiols and disulfide bridges). [58] the glyoxyl-based approach is quite popular for multipoint orientation-selective immobilization of proteins on surfaces. it involves a two-step mechanism: in the first step, the primary amine groups of the protein are allowed to react with the aldehyde groups to form schiff base bonds; in the second step, the bonds are reduced with sodium borohydride. this kind of immobilization has been simulated in a similar way as described by del monte-martínez et al., [11] assuming a working ph = 7.5, to maximize the reactivity of the n-terminus while limiting the reactivity of lysine residues (see table s7). the choice of gold is also extremely popular, for two reasons: the strong plasmonic response of gold, which causes a purple coloring of the bioconjugate, and the relatively easy manipulation required. current sars-cov-2 serological tests are indeed based on gold conjugates. [60] the conjugation to the surface is simulated in the same way as the amine-glyoxyl reaction, assuming that all cysteines are equally reactive towards gold (disulfide bridges can interact with gold to a comparable extent as thiols). [43] the resulting orientation has 100% relative population. colloidal gold has a net negative surface charge, [61] but including the electrostatic term has no impact on the recovered orientation. in the epoxide-glyoxyl strategy, the conjugation appears to be mostly directed at the n-terminus, which is facing away from the recognition interface but is not topologically very remote.
therefore, the epitope will only be partially exposed, whereas for the gold conjugation, ample access to the epitope is possible in the most probable orientation. finally, a completely different strategy could be applied for conjugation to (e.g.) gold nanoparticles: the use of an avidin-biotin affinity system. [62] biotinylation can be achieved through amine-specific reagents, [63] and improvement in the selectivity can be achieved with minimal engineering of the sequence. [64] given that there is a rather substantial difference in the calculated pka values of the different amine sites (see table s7), it can be expected that, for ph values lower than 7, all lysine residues will be protonated, and thus less reactive, with probability higher than 99%. the n-terminus is not facing the interaction site (see figure 1). therefore, selective biotinylation at the n-terminus is expected to be possible. in this case, the accessibility of the epitope is ensured if the interaction between the antigen and streptavidin, if at all possible, is sufficiently weak. to explore this possibility we have performed an initial-stage docking using zdock, [65] and inspected the first two elements, which had a significantly higher zdock score (figure s3). the possible interaction between the rbd and streptavidin was also investigated using haddock2.4. [66] the protein-protein interface residues were predicted with cport, [67] and then used as "active" and "passive" residues in the haddock calculation. about 10 sparsely populated clusters with weak interaction energies were obtained; the three most significant ones, with the lowest haddock scores, are reported in figure s3 and their energies in table s8. both dockings indicate that, should the interaction occur, it would occur in a position that does not interfere with the antigen-antibody recognition.
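the selectivity argument based on the calculated pka values can be made quantitative with the henderson-hasselbalch relation: the protonated fraction of an amine is 1/(1 + 10^(ph - pka)), so a site is more than 99% protonated whenever its pka exceeds the ph by about 2 units (log10(99) ≈ 2.0). the sketch below is only an illustration of this relation; the pka values in it are hypothetical, typical textbook numbers and are not the propka values of table s7, and the function name is an assumption.

```python
def protonated_fraction(pka, ph):
    """Fraction of an amine in its protonated (less reactive) form,
    from the Henderson-Hasselbalch relation."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# HYPOTHETICAL pKa values for illustration only (not the values of table S7)
example_sites = {"lysine side chain": 10.5, "n-terminal amine": 8.0}

for name, pka in example_sites.items():
    for ph in (7.0, 7.5):
        frac = protonated_fraction(pka, ph)
        print(f"{name}: pH {ph:.1f}, protonated fraction {frac:.4f}")
```

with these illustrative numbers the lysine side chain stays almost fully protonated at ph 7, whereas a sizeable fraction of the n-terminal amine is deprotonated and therefore available for reaction, which is the basis of the n-terminal selectivity discussed above.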
in this work, we describe the use of united-residue modelling for the prediction of the orientation of the receptor binding domain of the spike protein of the novel coronavirus sars-cov-2, a protein of high immunological relevance, at the surfaces most commonly used for the preparation of lateral-flow immunochemical devices. with this simple, yet very flexible approach, we find that immobilization on silica, or through glyoxyl reaction of amine residues, or on gold yields orientations compatible with antibody recognition, with gold granting the highest exposition. in this way, we can explain why random conjugation of the rbd to a gold surface yields responsive immunosensors, which are now routinely used. a more detailed experimental verification of the predictions of protein orientation at surfaces represents a significant challenge for the current biophysical methodologies. [1] one can expect that cryo-electron transmission microscopy will be limited by the fact that, in most cases, the surface has higher electron density than that of the protein. confocal laser scanning microscopy can be used to assess the positioning of the protein with respect to the support, and super-resolution microscopic techniques, such as total internal reflection fluorescence microscopy, also allow for the detection of discrete molecular events (e.g., desorption, unfolding, lateral diffusion, …), [2] but the orientation remains out of reach of these methodologies. conversely, the interaction between the protein and the interface can be probed at the atomic level through the application of solid-state nmr, [4, 7-9, 68-70] an effort which is being started in our lab.
our results suggest that very simple modelling approaches can provide significant hints towards rationally orienting antigens in a way that maximizes the exposition of epitopes, and can therefore help in the initial moments of the design of conjugates for immunologic applications, when a rapid response to emergency is vital. this is also testified by the emergence of theoretical modelling of several molecular aspects of viral infection and inhibition mechanisms. [71-76] overall, the expected short-time impact of our work is to provide guidelines to avoid the experimental exploration of immobilization pathways that are less promising.
advanced characterization of immobilized enzymes as heterogeneous biocatalysts on the relationship between structure and catalytic effectiveness in solid surface-immobilized enzymes: advances in methodology and the quest for a single-molecule perspective trapping rnase a on mcm41 pores: effects on structure stability, product inhibition and overall enzymatic activity proton-detected solid-state nmr spectroscopy of bone with ultrafast magic angle spinning high-resolution structural characterization of a heterogeneous biocatalyst using solid-state nmr conformations and intermolecular interactions in cellulose/silk fibroin blend films: a solid-state nmr perspective ubiquitin immobilized on mesoporous mcm41 silica surfaces - analysis by solid-state nmr with biophysical and surface characterization engineering l-asparaginase for spontaneous formation of calcium phosphate bioinspired microreactors structural characterization of a protein adsorbed on aluminum hydroxide adjuvant in vaccine formulation, npj vaccines computer-aided design of bromelain and papain covalent immobilization rational design strategy as a novel immobilization methodology applied to lipases and phospholipases orientation of adsorbed antibodies on charged surfaces by computer simulation based on a united-residue model parallel
tempering monte carlo simulations of lysozyme orientation on charged surfaces molecular simulation studies of the orientation and conformation of cytochrome c adsorbed on self-assembled monolayers multiscale simulations of protein g b1 adsorbed on charged self-assembled monolayers computer simulations of fibronectin adsorption on hydroxyapatite surfaces lipase adsorption on different nanomaterials: a multi-scale simulation study ribonuclease a adsorption onto charged self-assembled monolayers: a multiscale simulation study who statement regarding cluster of pneumonia cases in an interactive web-based dashboard to track covid-19 in real time analysis and forecast of covid-19 spreading in china covid-19: the unreasonable effectiveness of simple models the dynamics of humoral immune responses following sars-cov-2 infection and the potential for reinfection coronavirus protective immunity is short-lasting ea measures to mitigate the economic, financial and social effects of coronavirus suppression of covid-19 outbreak in the municipality of the important role of serology for covid-19 control poolkeh finds the optimal pooling strategy for a populationwide covid-19 testing (israel, uk, and us as test cases the protein data bank very fast empirical prediction and rationalization of protein pka values improved treatment of ligands and coupling effects in empirical calculation and rationalization of pka values inference of macromolecular assemblies from crystalline state residue-residue potentials with a favorable contact pair term and an unfavorable high packing density term, for simulation and threading coarse-grained models for simulations of multiprotein complexes: application to ubiquitin binding generic coarse-grained model for protein folding and aggregation low-resolution models for the interaction dynamics of coated gold nanoparticles with β2-microglobulin understanding the curvature effect of silica nanoparticles on lysozyme adsorption orientation and 
conformation: a mesoscopic coarse-grained simulation study electrostatic interactions of a string-like particle with a charged plate atkins' physical chemistry use of electroactive thiols to study the formation and exchange of alkanethiol monolayers on gold the gold-sulfur interface at the nanoscale quantifying thiol-gold interactions towards the efficient strength control thiols and disulfides on the au(111) surface: the headgroup−gold interaction prediction of sterically induced alignment in a dilute liquid crystalline phase: aid to protein structure determination by nmr prediction of charge-induced molecular alignment of biomolecules dissolved in dilute liquid-crystalline phases nmr: prediction of molecular alignment from structure using the pales software repulsion, a novel approach to efficient powder averaging in solid-state nmr simpson: a general simulation program for solid-state nmr spectroscopy computerintensive simulation of solid-state nmr experiments using simpson a highly conserved cryptic epitope in the receptor-binding domains of sars-cov-2 and sars-cov refined structure of an intact igg2a monoclonal antibody crystallographic structure of an intact igg1 monoclonal antibody11edited by crystal structure of a neutralizing human igg against hiv-1: a template for vaccine design structure of full-length human anti-pd1 therapeutic igg4 antibody pembrolizumab deducing the n-and o-glycosylation profile of the spike protein of novel coronavirus sars-cov-2, glycobiology structure, function, and antigenicity of the sars-cov-2 spike glycoprotein hydrolysis of edible oils by lipases immobilized on hydrophobic supports: effects of internal support structure conjugation of nanoparticles to proteins stabilization of enzymes by multipoint covalent immobilization on supports activated with glyoxyl groups development and clinical application of a rapid igm-igg combined antibody test for sars-cov-2 infection diagnosis determination of the surface charge density of 
colloidal gold nanoparticles using second harmonic generation colloidal gold/streptavidin methods preferential labeling of α-amino nterminal groups in peptides by biotin: application to the detection of specific anti-peptide antibodies by enzyme immunoassays selective n-terminal acylation of peptides and proteins with a gly-his tag sequence zdock server: interactive docking prediction of protein-protein complexes and symmetric multimers the haddock2.2 web server: user-friendly integrative modeling of biomolecular complexes cport: a consensus interface predictor and its performance in prediction-driven docking with haddock atomic structural details of a protein grafted onto gold nanoparticles 1h-detected solid-state nmr of proteins entrapped in bioinspired silica: a new tool for biomaterials characterization probing the transmembrane structure and dynamics of microsomal nadphcytochrome p450 oxidoreductase by solid-state nmr reckoning a fungal metabolite, pyranonigrin a as a potential main protease (mpro) inhibitor of novel sars-cov-2 virus identified using docking and molecular dynamics simulation identification of a novel dual-target scaffold for 3clpro and rdrp proteins of sars-cov-2 using 3d-similarity search, molecular docking, molecular dynamics and admet evaluation inhibition of the main protease 3cl-pro of the coronavirus disease 19 via structure-based ligand design and molecular modeling identification of potential binders of the main protease 3clpro of the covid-19 via structure-based ligand design and molecular modeling interaction of hydroxychloroquine with sars-cov2 functional proteins using all-atoms non-equilibrium alchemical simulations delving deep into the structural aspects of a furin cleavage site inserted into the spike protein of sars-cov-2: a structural biophysical perspective key: cord-254100-u6x5zd4i authors: taliansky, m.e.; brown, j.w.s.; rajamäki, m.l.; valkonen, j.p.t.; kalinina, n.o. 
title: involvement of the plant nucleolus in virus and viroid infections: parallels with animal pathosystems date: 2010-10-15 journal: adv virus res doi: 10.1016/b978-0-12-385034-8.00005-3 sha: doc_id: 254100 cord_uid: u6x5zd4i the nucleolus is a dynamic subnuclear body with roles in ribosome subunit biogenesis, mediation of cell-stress responses, and regulation of cell growth. an increasing number of reports reveal that, similar to the proteins of animal viruses, many plant virus proteins localize in the nucleolus to divert host nucleolar proteins from their natural functions in order to exert novel role(s) in the virus infection cycle. this chapter will highlight studies showing how plant viruses recruit nucleolar functions to facilitate virus translation and replication, virus movement and assembly of virus-specific ribonucleoprotein (rnp) particles, and to counteract plant host defense responses. plant viruses also provide a valuable tool to gain new insights into novel nucleolar functions and processes. investigating the interactions between plant viruses and the nucleolus will facilitate the design of novel strategies to control plant virus infections.
the interactions between a virus and its host cell play a central role in the viral infection cycle. the analysis of virus-host interactions is critical for understanding the mechanisms of viral infections and for the development of novel antiviral strategies. viruses are obligate intracellular pathogens with small genomes and, therefore, are reliant on subverting cellular functions and machineries to facilitate their own replication. the cell nucleus is one of the key features of eukaryotic cells. it is a highly dynamic, membrane-bound organelle that hosts major cellular events, including dna replication, messenger rna synthesis and processing, and ribosome subunit biogenesis (trinkle-mulcahy and lamond, 2008). given the pivotal role of the nucleus in host cell function, it is not surprising that viruses interact with this organelle and its compartments, and that such interactions play a crucial role in the virus infection cycle. indeed, certain viruses including plant dna-containing begomoviruses (rojas et al., 2001; sharma and ikegami, 2009), rna reverse-transcribing caulimoviruses (camvs) (haas et al., 2005), and negative-strand rna rhabdoviruses belonging to the genus nucleorhabdovirus (tsai et al., 2005) replicate in the nucleus, and therefore it makes sense that these viruses modify nuclear structures to usurp some of the nuclear functions and provide an appropriate environment for their own replication. in contrast, single-stranded positive-sense rna [(+)ssrna] viruses replicate in the cytoplasm. therefore, the rationale for rna viruses targeting the nucleus and its compartments is not so evident.
however, an increasing number of reports clearly show that cytoplasmic rna viruses, including plant viruses, can also target nuclear compartments and nucleoli in particular (reviewed by hiscox, 2002, 2007; greco, 2009). thus, investigating why and how plant rna viruses interact with the nucleolus to meet their own needs will contribute to a better understanding of the molecular biology of plant viruses and facilitate the design of novel strategies for virus control. these studies may also teach us about the fundamental principles of plant nucleolar structure and functions. the nucleolus is a prominent subnuclear compartment formed around the clusters of genes (rdna) coding for ribosomal rna (rrna), and is the site of rdna transcription, rrna processing, and ribosome assembly (olson, 2004; rubbi and milner, 2003; sirri et al., 2008). indeed, for many years the exclusive function of the nucleolus was thought to be rrna synthesis and ribosome biogenesis. however, several insights from the past decade have dramatically changed this traditional view and implicated the nucleolus in many other aspects of cell function such as cell-cycle regulation, gene silencing, telomerase activity, senescence, stress responses, and biogenesis of multiple ribonucleoprotein (rnp) particles (boisvert et al., 2007; olson and dundr, 2005; sirri et al., 2008). this chapter briefly reviews the structure and composition of the nucleolus and, subsequently, data implicating the nucleolus in the infection cycles of animal and plant viruses and also viroids. most of the current knowledge on nucleolar localization and functions of viral proteins has been gained in studies on animal viruses and has also been reviewed comprehensively (greco, 2009; hiscox, 2002, 2007).
therefore, the main emphasis of this review will be on what is known about the different aspects of interactions between plant virus proteins and the nucleolus, whose functional significance in the control of virus movement and in interference with host antiviral defense has only recently begun to emerge. we also discuss recent findings on the potential role of cajal bodies (cbs), another type of small subnuclear body structurally and functionally associated with the nucleolus. the nucleus is a highly structured organelle responsible for chromosome organization, replication and division, for gene expression and regulation, and for the integration of a vast array of activities required for cellular function. the nucleus contains chromatin-rich regions, made up of condensed heterochromatin, more dispersed euchromatin, and interchromatin regions, and chromatin organization is responsible for regulating gene expression, dna replication, and chromosome segregation (schneider and grosschedl, 2007; trinkle-mulcahy and lamond, 2008). in addition, it contains many other structures or bodies of different numbers and sizes which vary between cell types, at different stages in the cell cycle and under different conditions. the most prominent of these structures is the nucleolus, but many other bodies [e.g., cbs, splicing speckles, paraspeckles, pml (promyelocytic leukemia) bodies, etc.] have now been identified and are being characterized on the basis of their protein and rna components and functions (boisvert et al., 2007; cioce and lamond, 2005; lamond and spector, 2003; matera et al., 2009; rippe, 2007). the nucleolus is where rrna genes are transcribed, pre-rrnas are processed, and rrnas are assembled, along with ribosomal proteins, into the small and large subunits of the ribosome (boisvert et al., 2007; fatica and tollervey, 2002; granneman and baserga, 2004).
the nucleolus contains three different regions, distinguished on the basis of their appearance in the transmission electron microscope: fibrillar centers (fc), the dense fibrillar component (dfc), and the granular component (gc). in the mammalian nucleolus, the fcs are small, lightly staining structures surrounded by the more densely stained dfc. the dfc in turn is surrounded by a particulate region, the gc. the plant nucleolus tends to be more regular in its structure (usually near to spherical) than animal nucleoli. the organization of the nucleolar regions is different in plant nucleoli in that the dfc is not as densely stained as in animal nucleoli and occupies much more of the nucleolar volume (up to 70%). transcription of precursor rrnas (pre-rrnas) occurs at multiple sites (200-400) within the dfc regions, and pre-rrnas undergo processing in the dfc and gc. indeed, the localization of pre-rrnas and of different small nucleolar rnas (snornas) and proteins has allowed early and late pre-rrna cleavage events to be correlated to the dfc and gc, respectively, suggesting a vectorial model for the production and maturation of rrnas. plant nucleoli also often contain a prominent central region called the nucleolar cavity (brown and shaw, 1998; shaw and brown, 2004). the function of the nucleolar cavity is currently unknown, and it remains to be seen whether the differences in organization of plant and animal nucleoli influence the type and range of other functions which nucleoli carry out and their interactions with different viruses. the major function of the nucleolus is the transcription and processing of precursor rrnas and ribosomal subunit assembly. the dynamic assembly pathway involves a series of intermediate pre-ribosomal complexes of the 40s and 60s ribosomal subunits and, in addition to ribosomal proteins, requires up to 200 accessory proteins (granneman and baserga, 2004).
processing of the pre-rrnas involves both cleavage of the precursor transcript and nucleotide modifications, the majority of which are 2'-o-ribose methylation and pseudouridylation. some cleavage reactions and the modifications require snornps. snornps consist of a snorna and associated proteins (kiss, 2002). because of its major function in rrna production and ribosomal subunit assembly, there is a huge flux of proteins and rna complexes into and out of the nucleolus. the highly dynamic movement of proteins and complexes has been illustrated in animal cells by quantitative nucleolar proteomic analyses (lam et al., 2007). in particular, ribosomal proteins are highly expressed and rapidly accumulate in the nucleolus, where they are incorporated into ribosomal subunits or rapidly degraded (lam et al., 2007). three of the most abundant and well-studied nucleolar proteins are fibrillarin, nucleolin, and b23. fibrillarin, one of the major proteins of the nucleolus, is a core component of box c/d snornps and is required for rrna processing (venema and tollervey, 1999). fibrillarin has methyltransferase activity directing 2'-o-ribose methylation of rrna (barneche et al., 2000; cioce and lamond, 2005; matera and shpargel, 2006). fibrillarin is highly conserved in sequence, structure, and function in eukaryotes. the n-terminal region of fibrillarin comprises a glycine- and arginine-rich (gar) domain (barneche et al., 2000). the gar domain is methylated at arginine residues (liu and dreyfuss, 1995) and is responsible for interactions with various proteins including the survival motor neuron (smn) gene product (jones et al., 2001) and the nuclear dead (asp-glu-ala-asp) box protein p68 (nicol et al., 2000).
fibrillarin contains a centrally located rna-binding domain which, together with the c-terminal helix domain and the intervening spacer, constitutes a methyltransferase-like domain that contains an s-adenosyl methionine binding motif and is responsible for fibrillarin methyltransferase activity (wang et al., 2000a). the c-terminal helix domain appears to target fibrillarin to cbs (snaar et al., 2000). although it is well established that fibrillarin plays a role in ribosome biogenesis within the nucleolus, its role in cbs is not well understood, but it is presumably responsible for the 2'-o-ribose methylation of small nuclear rnas (snrnas). nucleolin is an abundant, ubiquitously expressed protein, which is highly phosphorylated and methylated, and can also be adp-ribosylated (ginisty et al., 1999). nucleolin is found in various cell compartments, and it is especially abundant in the nucleolus. nucleolin has three well-defined domains. the n-terminal domain, with alternating acidic and basic stretches, is involved in rdna transcription, by interacting with rdna repeats and histone h1, and in nuclear localization. the central portion is the rna-binding domain, whereas the c-terminal part contains a gar domain involved in interaction with the ribosomal proteins (tuteja and tuteja, 1998). nucleolin is involved in ribosome biogenesis (mongelard and bouvet, 2007), and affects transcription, processing and modification of rrna as well as nuclear-cytosolic transport of ribosomal proteins and ribosomal subunits by shuttling between the nucleus and the cytoplasm (tuteja and tuteja, 1998). b23 is an abundant, multifunctional nucleolar phosphoprotein whose activities are proposed to play a role in ribosome assembly, binding to other nucleolar proteins, nucleocytoplasmic shuttling (li et al., 1996), and possibly regulating transcription of rdna by mediating structural changes in chromatin (okuwaki et al., 2001).
two isoforms of b23 have been identified: the major form (b23.1) is predominantly located in the nucleolus and the minor form (b23.2) resides in the cytoplasm (reviewed by hiscox, 2002). small proteins of less than 40-60 kda can enter the nucleus through nuclear pore complexes by passive diffusion (hiscox, 2007; nigg, 1997). rna-binding proteins that diffuse into the nucleus may therefore nonspecifically target the nucleolus, as it contains a large amount of rrna. the nuclear import of larger proteins is mediated by nuclear localization signals (nlss), composed of one (monopartite) or two (bipartite) stretches of basic amino acid (arginine and/or lysine) residues of a given size (hiscox, 2007; macara, 2001). nucleocytoplasmic shuttling proteins also contain specific nuclear export signals (ness), usually leucine-rich amino acid motifs (hiscox, 2007; macara, 2001). how proteins may be further delivered to the nucleolus is poorly understood. the nucleolus does not have an apparent membrane or other barriers, and entry into it does not require energy, unlike entry into the nucleus. it seems conceivable that viral proteins localize to the nucleolus by first targeting the nucleus via a classical nls, followed by direct or indirect interactions between the viral molecules (via various nucleolar localization signals, nolss) and components that make up the nucleolus (hiscox, 2002, 2007). the structure of nolss is not well defined and depends on different factors, including whether the protein associates with another nucleolar-bound protein, traffics to the nucleolus on its own, or associates with rna transcripts that are being transcribed in the nucleolus. at least in some cases, nolss are rich in arginine and lysine residues and can overlap with nlss (hiscox, 2002, 2007). the second most studied nuclear bodies are cbs, which are frequently associated with the nucleolus and are found in animal and plant nuclei.
cbs are involved in snrnp and snornp maturation and transport, with snrnps and snornps accumulating in cbs before appearing in speckles or the nucleolus, respectively (cioce and lamond, 2005). as mentioned above, spliceosomal snrnas are also modified and contain 2'-o-ribose methylations and pseudouridines. modification of nucleotides in snrnas is guided by small cb-specific rnas (scarnas) (darzacq et al., 2002; jady et al., 2003). a major component of cbs is the protein coilin, which is required for their formation. in plants, mutants are available which knock out cbs or alter their size and number, two of which are due to mutations in coilin (collier et al., 2006). the number of cbs per nucleus can vary in different cell types and is under developmental control (cioce and lamond, 2005). besides rrna transcription and processing and the production of ribosomal subunits, the nucleolus is also involved in many other aspects of rna processing and rnp assembly as well as cellular functions (boisvert et al., 2007; olson, 2004; pedersen, 1998; raška et al., 2006; rubbi and milner, 2003). for example, the nucleolus has a role in the maturation, assembly, and export of rnp particles. the signal recognition particle (srp) has a nucleolar phase in its assembly, with particular protein constituents of the srp being localized to the nucleolus. similarly, telomerase rnp, required for chromosome replication, may be assembled in the nucleolus, and the nucleolus may also have a role in sequestering telomerase rnp to avoid inappropriate nucleation of telomere structures. spliceosomal snrnps may also undergo part of their assembly in the nucleolus, and processing of pre-trnas and u6 snrna occurs in the nucleolus.
in addition, the nucleolus is a site of sequestration of particular proteins to regulate, for example, the cell cycle or cell death, and acts as a sensor of cellular stress (boisvert et al., 2007; olson, 2004; pedersen, 1998; raška et al., 2006; rubbi and milner, 2003). characterization of the protein composition of the nucleolus under different conditions has provided support for or suggested new functions for the nucleolus. initial proteomic analyses of human cells identified a few hundred proteins, which included not only well-known nucleolar proteins such as ribosomal proteins, fibrillarin, nucleolin, and b23, but also, for example, splicing and translation factors (andersen et al., 2002). advances in proteomics have now allowed the identification of around 4500 proteins in the human nucleolar proteome (ahmad et al., 2008), and the dynamic behavior of nucleolar components and complexes can now be determined (lam et al., 2007). in plants, a partial proteomic analysis of the nucleolus identified many expected ribosomal and nucleolar proteins but also found splicing and translation factors (pendle et al., 2005). in particular, exon junction complex proteins (associated with mrnas following splicing) were identified in the nucleolar proteome and in the nucleolus by fluorescence microscopy (pendle et al., 2005). one of these proteins, eif4a-iii, a core protein of the exon junction complex, was shown to redistribute from the nucleoplasm to the nucleolus and finally to splicing speckles under stress conditions of hypoxia. similar dynamic distribution, involving the nucleolus, of proteins that interact with mrnas has been demonstrated, and the nucleolus and cbs have been shown to be involved in u1 snrnp production in plants (lorković and barta, 2008; tillemans et al., 2006). in plants, novel functions for the nucleolus, cbs, and another largely nucleolar-associated body, the d-body, have been described.
firstly, the production of heterochromatic sirnas, which are involved in transcriptional silencing, is thought to occur in a region of the nucleolus or in d-bodies due to the localization of protein components of the machinery and of sirnas (pontes and pikaard, 2008). secondly, maturation of micrornas (mirnas) may occur in d-bodies, as precursor mirnas as well as dicer-like ribonuclease 1 (dcl1) have been located to these structures (pontes and pikaard, 2008). similarly, in mammalian cells, some precursor and mature mirnas have recently been found in the nucleus, with some being enriched in the nucleolus (politz et al., 2009; scott et al., 2009). of particular interest was the enrichment of some precursors in the gc, suggesting that mirna processing could occur in the nucleolus, or that these mirnas may be involved in ribosome synthesis or other nucleolar functions (politz et al., 2009). the link between the nucleolus and mirna production is illustrated by the evolutionary relationship between some snornas and mirna precursors. for example, a human snorna was shown to be processed by dicer to generate small rnas which were associated with argonaute proteins (agos) and caused reduced expression of gene targets (ender et al., 2008). in addition, numerous snorna-derived small rnas from different organisms (including arabidopsis) were associated with components of rna silencing pathways (taft et al., 2009), and many mirna precursors have retained snorna features (scott et al., 2009). finally, a third novel function for the plant nucleolus is in mrna biogenesis, surveillance, or nonsense-mediated decay (nmd). biochemical fractionation of nucleoplasm and nucleoli and subsequent sequencing of isolated cdnas have shown that the plant nucleolus not only contains mrnas but that aberrant mrnas are enriched in the nucleolus.
the aberrant mrnas show splicing defects, the majority of which would introduce premature termination codons and therefore be expected to be substrates for nmd. using upf mutants (affecting upf proteins, key components of the nmd mechanism), the correlation between enrichment of aberrant mrnas in the nucleolus and turnover by nmd was corroborated and was further supported by the localization of the nmd proteins, upf2 and upf3, to the nucleolus. before the observation that the plant nucleolus contained mrnas and aberrant mrnas, some spliced mrnas (e.g., c-myc) were localized to the nucleolus while their unspliced versions were found in the nucleoplasm in mammalian cells (pedersen, 1998; olson, 2004). the nucleolus has also been associated with mrna export on the basis of accumulation of poly(a)+ rna upon disruption of export factors, nucleolar structure, or stress conditions in yeast and animal cells, although this could reflect increased polyadenylation prior to degradation (ideue et al., 2004; pederson, 1998; schneiter et al., 1995). more recently, the nucleolus has been proposed to be involved in the formation of mrnps which are localized to specific regions of the cytoplasm for translation. in yeast, ash1 mrna enters the nucleolus bound to specific rna-binding proteins, at which time translation repressor proteins are loaded onto the mrna. the mrnp is then exported to the cytoplasm and transported to its final destination, where translation is activated (du et al., 2008; jellbauer and jansen, 2008). a similar trafficking pathway may also operate in mammals for mrnps associated with the nucleolar protein, staufen, which is involved in transport of mrnas in neurons. thus, the nucleolus has numerous functions related to rna biogenesis and different rna processing and rnp maturation pathways.
it contains many rna-interacting proteins involved in these processes, including highly abundant rna-binding proteins (such as fibrillarin and nucleolin) involved in rrna and ribosomal subunit production. it is therefore a dynamic hub of rna processing activity, rna:protein interaction and complex formation. these characteristics, in addition to the potential involvement of the nucleolus in mrna biogenesis and, particularly, the transport of mrnas and mrnps to and from the nucleolus to other parts of the cell, make the nucleolus a prime target for exploitation by viruses. it is therefore not surprising that viruses have taken advantage of the nucleolus and nucleolar components for production and distribution of viral rnas and rnps. several excellent reviews have documented various aspects of the involvement of the nucleolus in the infection cycle of animal and human viruses (greco, 2009; hiscox, 2002, 2007; matthews and olson, 2006; stark and taliansky, 2009). this section therefore briefly summarizes the main findings in this research area to illuminate the nucleolar functions involved in virus infections that are conserved between plant and animal kingdoms. certain animal viruses, such as dna-containing adenoviruses that replicate in the nucleus, interact with and disrupt the nucleolus. as a result, synthesis of rrna is disrupted in adenovirus-infected cells. adenovirus infection also causes the redistribution of nucleolin and b23 (matthews, 2001). interestingly, b23 has been shown to stimulate adenovirus replication. rna virus replication may also be facilitated by nucleolar proteins. indeed, the hdag protein encoded by hepatitis delta virus (a (−)ssrna virus which is a subviral satellite of the dna virus, hepatitis b virus) also binds to nucleolin and b23. it has been proposed that hdag interacts with both these proteins in a complex that promotes viral rna replication, presumably as a result of the helicase activity of nucleolin (huang et al., 2001).
nucleolar proteins may also be involved in virus assembly and egress. for example, accumulation and assembly of structural (cap) proteins of adeno-associated virus take place in nucleoli (bevington et al., 2007). further export of the assembled cap proteins from the nucleolus is mediated by virus-encoded packaging (rep) proteins, which must form a complex with cap proteins. remarkably, formation of this complex is mediated by b23. after export of complexes containing rep, b23, and cap proteins from the nucleolus to the nucleoplasm, viral dna encapsidation occurs (bevington et al., 2007). another animal virus, the (−)ssrna-containing borna disease virus, uses the nucleolus as a site of genome replication (pyper et al., 1998). an rna-binding protein encoded by this virus has the appropriate trafficking signals for import and export to and from the nucleus (cros and palese, 2003). many proteins encoded by animal viruses that replicate mainly or exclusively in the cytoplasm have also been shown to localize to the nucleolus and to cause the relocalization of nucleolar proteins and disruption of nucleolar architecture and function. these include (+)ssrna virus proteins such as the nucleocapsid proteins encoded by coronavirus and arterivirus, the dengue virus core protein, and the alphavirus capsid protein and nonstructural nsp2 protein; (−)ssrna virus proteins, including the influenza virus nucleoprotein (np) and nonstructural protein 1 (ns1) and the newcastle disease virus matrix protein; and retrovirus proteins such as the human immunodeficiency virus (hiv) rev and transactivator tat proteins (reviewed by greco, 2009; hiscox, 2002, 2007). although in many cases it is largely unknown why these proteins localize to the nucleus or nucleolus and how this relates to their functions in virus replication, several reports reveal significant progress in this area, as exemplified below.
infection of cells with poliovirus (picornavirus) results in the inactivation of the nucleolar protein, rna polymerase i upstream binding (transcription) factor, which inhibits transcription of rrnas in the host cell (banerjee et al., 2005). poliovirus is also able to interact with nucleolin, causing its selective redistribution from the nucleolus to the cytoplasm (waggoner and sarnow, 1998). after relocalization, nucleolin binds to the internal ribosome entry sites (iress) at the 5' untranslated region of poliovirus genomic rna, and this interaction stimulates ires-dependent translation (hellen and sarnow, 2001). this also occurs in hepatitis c virus (izumi et al., 2001) and represents an alternative translation initiation strategy as compared to the classical eukaryotic cap-dependent translation (hellen and sarnow, 2001). nucleolin is also able to interact with the 3' untranslated region of poliovirus rna, which controls synthesis of negative-strand rna (waggoner and sarnow, 1998). interaction of picornaviruses with the nucleolus could also down-regulate host gene expression. for example, the human rhinovirus 3c protease precursors that are localized in the nucleolus at early stages of the infection inhibit cellular rna transcription by cleavage of vital transcription factors (amineva et al., 2004). the nucleocapsid proteins encoded by porcine arterivirus and avian coronavirus (infectious bronchitis virus, ibv) interact with nucleolin and fibrillarin (chen et al., 2002; yoo et al., 2003), and as a result may disrupt the normal functions of these proteins. for instance, by altering the distribution of fibrillarin, viruses might reduce pol i transcription, that is, the synthesis of rrna, as blocking fibrillarin with antibody prevented its translocation to nucleoli and resulted in the reduction or inhibition of pol i transcription (fomproix et al., 1998).
the ibv infection leads to disruption of the nucleolus (dove et al., 2006a) and arrest of the cell cycle and cytokinesis (chen et al., 2002; dove et al., 2006b; wurm et al., 2001) . therefore, disruption of nucleolar architecture and function might be common in cells infected with viruses interacting with the nucleolus. the loss of essential nucleolar function in its turn may play an important role during virus infection toward an active production of the virus. one of the most studied viruses in terms of viral interactions with the nucleolus is hiv, a retrovirus. hiv rnas are reverse transcribed in the cytoplasm of infected cells and trafficked to the nucleus. after transcription in the nucleus, progeny rna molecules are transported back to the cytoplasm. one of the functions of the rev protein is to export unspliced or partially spliced viral mrna from the nucleus (reviewed by greco, 2009; hiscox, 2007) . before nuclear export, hiv rna passes through the nucleolus. rev binds to a cis-acting rna element (rev-response element), which is found in all unspliced and incompletely spliced viral mrnas, and this promotes the translocation of these rnas from the nucleus (cantó-nogués et al., 2001; michienzi et al., 2000) . the nucleolus plays a central role in this process, and the nucleolar trafficking of rev and viral rna is critical for the outcome of infection. thus, many animal viruses, whether they replicate or not in the nucleus, have evolved a nucleolar phase for part of their infection cycle to prevent unwanted interference from the cell. alternatively, they use nucleolar functions for their own benefit. recruitment of nucleolar proteins is especially beneficial for viruses, and in particular for rna containing viruses, as these proteins possess many crucial functions in cellular rna biosynthesis, processing, translation, and trafficking. 
indeed, during virus infections of mammalian cells, various viral components traffic to and from the nucleolus, where they interact with different host factors. certain nucleolar proteins are redistributed into other cell compartments or are modified, and some cellular proteins are relocalized in the nucleolus of infected cells (reviewed by greco, 2009; hiscox, 2002, 2007). well-documented studies have established that several of these nucleolar modifications play a role in some steps of the viral infection cycle, such as viral attachment and entry, intracellular trafficking, transcription, translation, replication, virus assembly, and egress (reviewed by greco, 2009). the virally induced nucleolar modifications could also affect fundamental cellular pathways, including the initiation of transcription from the dna promoter of the rrna genes, cell-cycle regulation, and apoptosis (reviewed by greco, 2009). although some steps (replication, translation) of the infection cycles of plant and animal/human viruses are essentially similar, the mechanisms used to enter host cells and to spread from cell to cell differ fundamentally between viruses infecting animal cells and viruses infecting plants. this is because animal cells are separated by barriers far less formidable than the thick, rigid, and impermeable cell walls consisting of cellulose and pectin that separate plant cells from one another. other differences relate to defense strategies employed by humans/animals and plants against viral infections. for example, in mammals the interferon (ifn) pathway plays a key role in the innate antiviral immune response, whereas plants do not display such an activity. instead, plants primarily rely on other natural defense strategies such as rna silencing. interestingly, functional links between the plant virus infection cycle and the nucleolus have been described for both common and plant-specific virus infection steps.
certain plant virus proteins localize to the nucleolus, with examples from single-stranded dna viruses, para-retroviruses, and negative-strand and positive-strand rna viruses (table i). the most common technique for studying the nucleolar targeting of plant virus proteins is based on confocal microscopy localization of proteins that have been tagged with a fluorescent fusion protein (such as green fluorescent protein, gfp). such fusion proteins are usually larger than the size-exclusion limit (~40-60 kda) and are hence prevented from nonspecific diffusion into the nucleus. moreover, in many cases the specific nucleolar localization of plant virus proteins has been supported by identification of nolss (table ii). although no overall conserved nucleolar trafficking motif has been identified in these nolss, they presumably resemble host nolss. thus plant viral nucleolar trafficking might use a form of molecular mimicry, as has earlier been proposed for animal viruses (table ii). however, in the case of grv orf3, in addition to the arginine-rich domain (nls), a leucine residue at position 149 (l149) residing inside the leucine-rich domain (nes) is also essential for nucleolar targeting of the orf3 protein (table ii; kim et al., 2007a). the fact that viral proteins contain nolss is a strong indication that viruses have evolved specific nucleolar functions. viral proteins might also traffic to the nucleolus through association with host proteins. one example of such an association may be the interaction of various plant virus proteins with the major nucleolar protein, fibrillarin. the first description of this interaction was the demonstration that the umbravirus grv orf3 long-distance movement protein (mp) binds to fibrillarin in vivo and in vitro (kim et al., 2007a,b). specifically, the leucine-rich domain (and l149, in particular) of orf3 is involved in direct interaction with fibrillarin (kim et al., 2007b).
mutations in the leucine-rich domain prevent fibrillarin from binding to orf3 and nucleolar trafficking (kim et al., 2007b). by implication, this may relate fibrillarin binding to nucleolar trafficking. other plant virus mps which interact with fibrillarin are the pomovirus pmtv triple gene block protein 1 (tgb1) and hordeivirus pslv tgb1 (n. o. kalinina and d. rakitina, unpublished results). as their name suggests, mps are involved in virus spread in infected plants, and the potential role of fibrillarin in this process will be discussed in section iv.b. the multifunctional pva (potato virus a)-encoded viral genome-linked protein (vpg) is also able to interact with fibrillarin (rajamäki and valkonen, 2009). however, the role of this interaction is likely to be different from that in virus movement, as this process is not compromised by fibrillarin depletion (rajamäki and valkonen, 2009; section iv.c). another example of a viral protein that interacts with fibrillarin in the nucleolus is a silencing suppressor of cmv (cucumber mosaic virus), the 2b protein (gonzález et al., 2010; section iv.c). collectively, these data indicate that interaction with fibrillarin is a general property of various plant virus proteins that is not restricted to one or two virus taxonomic groups. however, such interactions may have quite diverse molecular implications for different viruses, being required at various phases of the virus infection cycle. this may also suggest novel, unexpected natural functions for fibrillarin that are hijacked or affected by plant viruses at different stages of infection for the needs of the viruses. viroids are small, circular, self-replicating, and non-coding rna molecules that cause plant diseases (reviewed by tabler and tsagris, 2004). viroids do not replicate in the cytoplasm like conventional plant rna viruses but replicate in either the nucleus or the chloroplast.
nuclear viroids such as pstvd and cccvd have a nucleolar phase in their replication cycle (table i; qi and ding, 2003; schumacher et al., 1983).
[table footnotes: virus acronyms and other abbreviations are as in table i. (a) nlss are also required for the nucleolar localization. (b) nuclear and nucleolar localization is controlled independently by the same nls regions. (c) l149 (italic) residing inside the orf3 nes is also essential for nucleolar localization of the orf3 protein.]
in early experiments using in situ hybridization, pstvd rna [including both (+) and (−) rna strands] was localized in nucleoli of nuclei isolated from infected plants (harders et al., 1989), suggesting that the nucleolus is the replication site of the viroid. however, using improved sample preparation and in situ hybridization protocols, qi and ding (2003) later found that the (−) strand of pstvd localizes in the nucleoplasm but not in the nucleolus. by contrast, the (+) strand of pstvd localizes in the nucleolus as well as in the nucleoplasm, with various distinct spatial patterns. these experiments are suggestive of successive stages of nuclear involvement in viroid replication: (1) synthesis of the (−) and (+) strands of pstvd occurs in the nucleoplasm; (2) the (−) strand rna is anchored in the nucleoplasm; (3) the (+) strand rna is transported selectively into the nucleolus; and (4) some (+) strand rna traffics from the nucleolus back into the nucleoplasm and further into the cytoplasm for spreading into neighboring cells. the significance of (+) strand pstvd rna trafficking into the nucleolus remains to be determined. however, nucleolar machineries for processing of rrnas, snornas, and trnas make the nucleolus an attractive candidate site for pstvd processing. the specific intranuclear localization of (−) and (+) strands of pstvd may have some implication for pathogenicity.
assuming that the (−) and (+) strands each can interact with specific cellular factors to disrupt normal cell functions and cause symptoms, the differential subnuclear localization of the (+) and (−) strands of pstvd may suggest different cellular targets for these rnas. the (+) strand of pstvd rna has been shown to interact with a bromodomain-containing protein (a member of a family of transcriptional regulators associated with chromatin remodeling), termed viroid rna binding protein 1 (virp1) (martínez de alba et al., 2003; tabler and tsagris, 2004). virp1 contains an nls and therefore might transfer the viroid rna to the nucleus and bring it into contact with transcription units associated with chromatin (martínez de alba et al., 2003; tabler and tsagris, 2004). however, mechanisms for selective nucleolar trafficking of (+) strand pstvd are still unknown. the nucleolus and its factors may be used not only by viruses and viroids for their own needs in replication, but also by plant host defense systems. for example, one of the major nucleolar proteins, nucleolin, binds to the 3' non-coding region of the rna of tomato bushy stunt virus, a tombusvirus (jiang et al., 2010). this leads to significant inhibition of tombusviral rna replication and may thus represent one of the innate immunity systems of plant hosts. plant viruses enter cells either through sites of mechanical injury to plant tissues or during feeding by a specific vector organism (insect, nematode, or soil microbes belonging to the protoctists). to induce disease, after replication in the initially infected cells, plant viruses must spread to the rest of the plant. the systemic spread of plant viruses proceeds in two stages: (i) cell-to-cell movement through plasmodesmata (pd) and (ii) long-distance movement through vascular tissues.
first, the virus (in the form of virions or nucleic acid-protein complexes) moves intracellularly from the sites of replication to pd, which are unique intercellular membranous channels that span cell walls, linking the cytoplasm of contiguous cells. the virus then traverses the pd to spread intercellularly. it is generally accepted that viral cell-to-cell movement involves virus-encoded mps as well as host-encoded components (reviewed by lucas, 2006). virus systemic movement between organs (long-distance movement) occurs through the phloem, the specialized vascular system used by plants for the transport of assimilates and macromolecules (reviewed by lucas, 2006; oparka, 2004). viruses enter, move through, and exit from the vascular system, which is usually surrounded by bundle sheath cells and contains various cell types including vascular parenchyma cells, companion cells, and enucleate sieve elements. thus, transport of a virus into and within vascular tissue implies movement from mesophyll cells to bundle sheath cells, from bundle sheath cells to vascular parenchyma and/or companion cells, and into sieve elements. virus exit from vascular tissue presumably involves the same steps in reverse order. coat protein (cp) is essential for efficient long-distance transport of plant viruses, with only a few exceptions (reviewed by lucas, 2006). one of the most well-studied plant viruses in terms of viral interactions with the nucleolus is grv, an umbravirus [(+)ssrna virus]. umbraviruses differ from most other viruses in that they do not encode a cp, such that conventional virus particles are not formed in infected plants. umbraviral genomes encode at least three proteins. in grv, two orfs at the 5'-end of the rna are expressed by a frameshift mechanism as a single protein that appears to be an rna replicase. the other orfs (orf3 and orf4) overlap each other. orf4 encodes the mp that mediates the cell-to-cell movement of viral rna via pd (ryabov et al., 1998).
orf3 protein is the long-distance movement factor that facilitates trafficking of viral rna through the phloem (ryabov et al., 1999). umbraviral orf3 proteins (26-29 kda) show no significant similarity with any other recorded or predicted proteins. the grv orf3 protein interacts with viral rna in vivo to form filamentous rnp particles, which have elements of regular helical structure, but not the uniformity typical of virus particles. the rnps accumulate in cytoplasmic inclusions which have been detected in all cell types and were abundant in phloem cells. they serve to protect viral rna and move it through the phloem to cause systemic infection. remarkably, in addition to its presence in the cytoplasm, the orf3 protein was also found in nuclei and predominantly in nucleoli (ryabov et al., 1998, 2004). studies of the biology of grv infection have provided molecular insights into how and why viruses may target the nucleolus (canetta et al., 2008; kim et al., 2007a,b). it has been demonstrated that the grv orf3 protein traffics to the nucleolus via a mechanism involving the reorganization of cbs into multiple cajal body-like structures (cbls) and their fusion with the nucleolus. nucleolar localization and further trafficking of orf3 protein from the nucleolus back to the cytoplasm is essential for umbravirus infection. the integral connection between nucleolar targeting of the orf3 protein and its biological function in virus long-distance spread is demonstrated by mutations in the arginine- and leucine-rich domains that block nucleolar localization or nuclear export of the orf3 protein and thereby prevent the formation of cytoplasmic viral rnps and their long-distance movement (kim et al., 2007a). although the mechanisms by which the orf3 protein targets cbs and produces cbls are unknown, it could be suggested that targeting of cbs by the orf3 protein may utilize elements of existing cb-trafficking pathways.
for example, part of the maturation pathway of snrnps in mammalian cells occurs in the cytoplasm and involves a complex containing the smn protein (smn complex), which together with the snrnp is reimported into the nucleus and targeted to cbs (matera and shpargel, 2006; navascues et al., 2004; sleeman and lamond, 1999). particular snrnp proteins contain modified symmetric di-methyl arginines (sdmas) which enhance the formation of snrnps and interaction with smn (paushkin et al., 2002). preliminary experiments have identified sdmas in the orf3 protein (m. taliansky, unpublished results), suggesting that targeting of the orf3 protein to cbs could involve interactions with the smn protein. although smn protein has yet to be identified in plants, the existence of an orthologue has been suggested (collier et al., 2006). the formation of cbls may involve either fragmentation of cbs into multiple bodies by the orf3 protein or the redistribution of cb components into new structures containing the orf3 protein. interestingly, the multiple cbl phenotype described here is similar to that of the poly cajal bodies (pcb) mutant of arabidopsis (collier et al., 2006). as the protein normally encoded by a gene controlling the pcb phenotype appears to regulate cb formation, orf3 protein may interfere with the function of this or other proteins to affect the integrity and number of cbs in nuclei. the second possibility is that the orf3 protein causes the redistribution of cb components, such as coilin, u2b'' and fibrillarin, to form cbls with the orf3 protein. orf3 protein trafficking to the nucleolus uses a novel pathway of nucleolar import by causing the fusion of cbls with the nucleolus. the physical and functional association of the nucleolus and cbs is well documented and is controlled by complex molecular interactions among cb and nucleolar proteins such as coilin, smn, fibrillarin, and nopp140 (cioce and lamond, 2005; ogg and lamond, 2002).
expression of mutant versions of some of these proteins has profound effects on cb structure and function causing disruption or dispersal and compositional changes ( jones et al., 2001; pellizzoni et al., 2001; tucker et al., 2001) . moreover, phosphorylation of coilin is an important factor determining physical interactions and trafficking of cbs (cioce and lamond, 2005; ogg and lamond, 2002) . for example, cbs form within the nucleolus of hela cells upon treatment with okadaic acid (an inhibitor of protein phosphatase) and with transient expression of coilin mutated at a single serine residue (lyon et al., 1997; sleeman et al., 1998) . cbs have also been observed within nucleoli in human breast carcinomas (ochs et al., 1994) and in liver cells of hibernating dormice (malatesta et al., 1994) . therefore, the orf3 protein may interfere with normal protein-protein interactions or posttranslational modifications causing the reorganization and fusion of cbs with the nucleolus. the last stage of the nuclear voyage of the orf3 protein is its nuclear export leading to the redistribution of some fibrillarin from the nucleolus to cytoplasm (fibrillarin normally does not accumulate in cytoplasm) (kim et al., 2007a) . this redistribution is mediated by the direct interaction between the orf3 protein and fibrillarin (kim et al., 2007b) . taking into account the long-distance movement function of grv orf3, it could be expected that fibrillarin is directly involved in this particular viral function. further support of a role for fibrillarin in umbravirus systemic infection has been provided by the fibrillarin knock-down experiments (kim et al., 2007b) . silencing of the fibrillarin gene has indeed prevented formation of rnp particles and long-distance movement of grv but does not affect viral replication or cell-to-cell movement. 
thus, it has been concluded that fibrillarin is needed for formation of cytoplasmic rnp particles that are capable of long-distance movement and of causing systemic viral infection, such that the redistribution of fibrillarin with orf3 protein is a prerequisite for forming rnps (kim et al., 2007a,b). further experiments have shown that fibrillarin, in combination with orf3 protein and viral rna in vitro, produced filamentous rnp structures with structural properties similar to in vivo rnps (as discussed in detail in section iv.e). fibrillarin, an rna-binding protein, may bind the viral rna or act as a chaperone to permit or catalyze the regular assembly of proteins around viral rna. although fibrillarin and orf3 protein are sufficient to form functional viral rnp particles in vitro, additional proteins may also be associated with the in vivo particles. when formed in phloem companion cells, the viral rnps are able to enter sieve elements and move through the plant to cause systemic infection. hence formation of the fibrillarin-orf3 protein complexes appears to be the key prerequisite for formation of grv rnp particles and their long-distance movement through the phloem. poleroviruses are (+) ssrna viruses that are mainly restricted to cells in the vascular system. besides a major cp, plrv encodes a minor cp, an extended version of the major cp produced by occasional translational ''readthrough'' of the cp gene (cp-rt) (bahner et al., 1990). both cp and cp-rt contain the same nols, and are targeted to the nucleolus when they are individually expressed in plant tissues (haupt et al., 2005). however, cp-rt but not plrv cp loses its nucleolar localization in the presence of replicating plrv. it has been suggested that the cp-rt protein does not accumulate in the nucleolus in the presence of plrv infection because plrv rna or plrv-encoded or -induced factors retain this protein outside the nucleolus (haupt et al., 2005). 
like grv, plrv was unable to cause systemic infection in fibrillarin-silenced plants, although accumulation of plrv in the inoculated leaves was not affected (kim et al., 2007b), suggesting that fibrillarin is also involved in plrv long-distance movement. a role of nucleolar functions in systemic infection has also been suggested for plant viruses that require tgb for movement. the genomes of some viruses, such as potexviruses, hordeiviruses, and pomoviruses, contain a so-called triple gene block (tgb) that encodes three proteins required for cell-to-cell and long-distance movement (reviewed by morozov and solovyev, 2003). the tgb1 protein encoded by the tgb-containing virus pmtv (pomovirus) localizes to the nucleus and nucleolus. deletion of 84 n-terminal amino acids abrogates its nucleolar localization. northern blots of rna from inoculated and upper non-inoculated leaves of plants infected with clones carrying the tgb1 n-terminal deletion mutant reveal that long-distance movement of viral rnas is also abolished, but this mutant is still competent for cell-to-cell movement (l. torrance, personal communication). moreover, the pmtv tgb1 protein has been shown to interact with fibrillarin in vitro. interestingly, the tgb1 protein encoded by the hordeivirus pslv can also interact with fibrillarin both in vitro and in vivo (n. o. kalinina, unpublished results). thus it appears that some tgb-encoding viruses such as pmtv and pslv may represent another example of plant viruses that require association with the nucleolus to control long-distance movement of their genomes. potyviruses form the largest group of plant-infecting rna viruses (rajamäki et al., 2004). they have a polyadenylated (+) ssrna genome of ca. 9500-10,000 nucleotides that is encapsidated into a 680-900 nm long, filamentous particle. 
the genome is translated into a large polyprotein of about 3000-3350 amino acids, which is subsequently processed into up to ten mature proteins by three virus-encoded proteinases (rajamäki et al., 2004). additionally, a 25-kda protein produced from the p3 cistron by frameshifting was recently identified (chung et al., 2008). potyviruses replicate in membranous structures in the cytoplasm (cotton et al., 2009; schaad et al., 1997a). however, two potyviral replication-associated proteins, the nuclear inclusion protein a (nia) and nuclear inclusion protein b (nib), accumulate in the nucleus of virus-infected cells (baunoch et al., 1991; dougherty and hiebert, 1980; hajimorad et al., 1996; knuhtsen et al., 1974; restrepo et al., 1990). in addition, nia localizes in the nucleolus and cbs (baunoch et al., 1991; rajamäki and valkonen, 2009; restrepo et al., 1990). the nib of tev (tobacco etch virus) has also been detected in the nucleolus (baunoch et al., 1991; restrepo et al., 1990). potyviruses may also induce nuclear inclusions, which consist of nia and nib (baunoch et al., 1991; dougherty and hiebert, 1980; edwardson and christie, 1983; knuhtsen et al., 1974; martin et al., 1992). the significance of these nuclear inclusions is unknown but they may simply represent inactivated protein complexes, because they seem to form only at late stages of virus infection (baunoch et al., 1991). nia is multifunctional and consists of two domains. the n-proximal half of nia is the vpg that is covalently linked to the 5′-end of viral genomic rna (oruetxebarria et al., 2001; siaw et al., 1985). the c-proximal part (nia-pro) is the main viral proteinase (dougherty et al., 1989). the two domains are separated by a suboptimal proteolytic cleavage site, the slow processing of which is essential for efficient viral replication (schaad et al., 1996). the majority of full-length nia is found in the nucleus of virus-infected cells. 
however, nia can also exist as a polyprotein with the 6k2 protein located upstream of nia in the viral polyprotein. the 6k2 protein impedes nuclear localization of the 6k2-nia polyprotein (restrepo-hartwig and carrington, 1992) and directs nia to cytoplasmic membrane vesicles, the sites of viral replication (cotton et al., 2009; restrepo-hartwig and carrington, 1994; schaad et al., 1997a). many lines of evidence suggest that nia is part of the viral replication complex involving several viral and host proteins (fellers et al., 1998; li et al., 1997; murphy et al., 1996; puustinen and mäkinen, 2004; schaad et al., 1996). the vpg domain has ntp binding activity, is uridylylated by nib, and may act as a primer for synthesis of viral rna (anindya et al., 2005; murphy et al., 1996; puustinen and mäkinen, 2004; schaad et al., 1996). in addition, vpg is involved in viral cell-to-cell and long-distance movement (nicolas et al., 1997; valkonen, 1999, 2002; schaad et al., 1997b). the nlss and nols of potyviral nia map to the n-proximal part of vpg (carrington et al., 1991; rajamäki and valkonen, 2009; schaad et al., 1996). the signal controlling nuclear localization of nia is bipartite (carrington et al., 1991; rajamäki and valkonen, 2009; schaad et al., 1996). the regions of nia controlling its nucleolar targeting have been studied in pva and found to map to the same regions as the nlss. however, mutations in both nls regions are required to abolish nuclear targeting of pva nia, whereas mutations in either nls prevent nucleolar localization of nia (rajamäki and valkonen, 2009). the most n-terminal nls also controls localization of pva nia to cbs (rajamäki and valkonen, 2009). mutations that affect nuclear localization of nia are detrimental for genome amplification of tev and pva (rajamäki and valkonen, 2009; schaad et al., 1996), suggesting that nuclear/nucleolar localization of nia has an important role in virus infection. 
potyviral nia (or vpg) has been shown to interact with several host proteins (dunoyer et al., 2004; huang et al., 2010; léonard et al., 2004; rajamäki and valkonen, 2009; schaad et al., 2000; thivierge et al., 2008; wittmann et al., 1997). the structure of vpg is intrinsically disordered (grzela et al., 2008; rantalainen et al., 2008), which may provide flexibility for many types of interactions. one of the interaction partners of vpg and/or nia is the eukaryotic translation initiation factor eif4e or its isoform (robaglia and caranta, 2006; schaad et al., 2000; wittmann et al., 1997), and the interaction appears important for virus infectivity (léonard et al., 2000; robaglia and caranta, 2006). remarkably, interaction of tumv nia with eif(iso)4e and the poly(a) binding protein 2 (pabp2) has been detected in the nucleus and nucleolus using bimolecular fluorescence complementation (bifc) and colocalization experiments. by contrast, interaction with the 6k2-nia polyprotein targets eif(iso)4e and pabp2 to cytoplasmic membrane vesicles. the data suggest that these host proteins are needed in different viral processes. in healthy plants, most pabp is cytoplasmic, but during infection some of the protein is redistributed to the nucleolus, probably by nia. nia also appears to mediate eif(iso)4e localization to the nucleolus and could interact simultaneously with eif(iso)4e and pabp2, as shown by in vitro binding assays and by bifc and colocalization experiments. hence, all three proteins may be part of the same complex. although the role of interaction between nia and eif(iso)4e and/or pabp2 in the nucleus is currently unclear, some possibilities can be suggested. the nuclear pool of eif4e takes part in mrna export from the nucleus but may also be involved in nuclear translation and nmd of rna (strudwick and borden, 2002). pabp regulates the initiation of protein synthesis, nuclear export of mature mrnas, and mrna stability and decay (mangus et al., 2003). 
interaction with nia might disrupt some of these functions. as mentioned above, some animal viruses are known to modulate and inhibit activities of translation initiation factors in order to favor viral replication and translation (reviewed by thompson and sarnow, 2000) . recent data connect nuclear/nucleolar targeting of plant viral proteins to interference with host antiviral defense. for example, the potyviral vpg protein has been observed to accumulate in the companion cells of upper leaves of a wild potato species ahead of virus infection, which suggested that vpg might interfere with host defense and hence facilitate infection of the cells that receive the virus via systemic transport (rajamäki and valkonen, 2003) . indeed, this hypothesis gained support in the experiments in which overexpression of vpg in leaf tissues temporarily interfered with cosuppression of gene silencing (i.e., rna silencing), whereas nls mutants of vpg, which exhibited reduced nuclear and nucleolar localization, were not able to suppress rna silencing (rajamäki and valkonen, 2009) . rna silencing is a sophisticated, sequence-specific rna degradation mechanism operating in the cytoplasm (ruiz-ferrer and voinnet, 2009) . a key feature of the rna silencing pathway is the generation of dsrna that corresponds in sequence to the target (virus) rna. this dsrna is cleaved into sirnas by dcls and these mediate the target specificity for rna degradation (for reviews, see carrington, 2000; ruiz-ferrer and voinnet, 2009; vance and vaucheret, 2001; voinnet, 2001) . rna silencing is a natural anti-viral defense system of plants but is also involved in gene regulation in a wide range of developmental and pathogen defense processes (ruiz-ferrer and voinnet, 2009) . to combat host defense rna silencing, most plant viruses encode silencing suppressor proteins. 
the potyviral helper-component proteinase (hc-pro) was the first detected viral suppressor of rna silencing (anandalakshmi et al., 1998; brigneti et al., 1998; kasschau and carrington, 1998) and the potyviral p1 protein may enhance its activity (rajamäki et al., 2005). hc-pro acts in the cytoplasm and, therefore, the association of vpg with rna silencing suppression when localized in the nucleolus is intriguing, because many host proteins involved in rna silencing as well as the processing centers for small rnas are located in the nucleus, nucleolus, and cbs as discussed in section ii (pontes and pikaard, 2008). in contrast, nuclear/nucleolar localization of the p19 protein of the tombusvirus tbsv, mediated by plant aly proteins (mrna-processing and -export factors), may be a defense mechanism of the plant to down-regulate the silencing suppressor activity of p19 (canto et al., 2006). previously, suppression of rna silencing has been found to be connected to nuclear localization of another viral protein: the 2b protein, the silencing suppressor of cmv. cmv 2b localizes in the nucleus and the nucleolus, where it interacts with argonaute 1 (ago1), the core component of the rna-induced silencing complex (gonzález et al., 2010; lucy et al., 2000; zhang et al., 2006). more recently, it has also been shown that cmv 2b interacts with ago4 (gonzález et al., 2010). however, these interactions are not sufficient for suppression of rna silencing, and hence their biological relevance remains so far unclear (gonzález et al., 2010). the vpg of pva interacts with fibrillarin in the nucleolus and cbs, as detected by bifc experiments (rajamäki and valkonen, 2009). depletion of fibrillarin reduces pva accumulation in nicotiana benthamiana, suggesting a role for fibrillarin in virus infection (rajamäki and valkonen, 2009). 
as mentioned above, in grv, fibrillarin is recruited for viral long-distance transport (canetta et al., 2008; kim et al., 2007a,b), but in potyviruses the role is likely to be different, as long-distance transport of pva is not compromised by depletion of fibrillarin (rajamäki and valkonen, 2009). taking into account the silencing suppression activity of vpg, it could be suggested that the vpg-fibrillarin interaction might contribute to such an activity. alternatively, the fibrillarin-vpg interaction may disrupt certain nucleolar functions (e.g., host transcription or pre-mrna processing), which might explain the observed shutdown of host gene expression during potyvirus infection (wang and maule, 1995). in mammals, the ifn pathway plays a key role in the innate antiviral immune response, whereas involvement of rna silencing is still controversial (reviewed by hale et al., 2008). however, emerging evidence suggests some parallels in how plant and animal viruses could use the nucleolus to counteract host defense. for example, the ns1 protein of influenza a virus is an important factor in counteracting ifn-based cellular antiviral mechanisms (hale et al., 2008). in addition, ns1 is also able to suppress rna silencing in plant, insect, and mammalian cells (bucher et al., 2004; de vries et al., 2009; delgadillo et al., 2004; haasnoot et al., 2007; li et al., 2004). remarkably, ns1 localizes to the nucleolus and interacts with nucleolin (murayama et al., 2007). nib is the viral rna-dependent rna polymerase of potyviruses and hence involved in the replication complex (hong and hunt, 1996; schaad et al., 1997a). while viral replication takes place in the cytoplasm on cellular membranes, nib is also targeted to the nucleus and the nucleolus (baunoch et al., 1991; restrepo et al., 1990). 
nuclear targeting of nib appears to be highly sensitive to alterations in protein conformation and is eliminated by deletions, dipeptide insertions and amino acid substitutions, including those introduced into parts of nib other than the nlss (li and carrington, 1993). nuclear and nucleolar localization of nib may be important for the viral infection cycle, because mutations in the nlss abolish infectivity of tev (li and carrington, 1995; li et al., 1997). three host proteins, pabp2, the heat shock cognate 70 protein (hsc70-3), and the eukaryotic translation elongation factor 1a (eef1a), interact with nib (thivierge et al., 2008; wang et al., 2000b). however, none of these protein interactions was found in the nucleolus, so they shed no light on the role of the nucleolar localization of nib. besides the nuclear inclusion proteins, the p3 protein of tev is targeted to the nucleus and nucleolus of virus-infected tobacco cells, as detected by immunogold labeling (langenberg and zhang, 1997). p3 is a nonstructural protein with no well-characterized function. however, it is involved in virus multiplication (kekarainen et al., 2002) and virus-host interactions (chu et al., 1997; eggenberger et al., 2008; jenner et al., 2003; johansen et al., 2001). a role for nuclear/nucleolar localization of p3 is currently unknown. the multifunctional nucleocytoplasmic shuttling p6 protein encoded by the plant pararetrovirus camv has also been found to localize to the nucleolus (haas et al., 2005). although a nols has not been identified for this protein, its nucleolar import might be facilitated by its interaction with ribosomal proteins of the 60s ribosomal subunit (haas et al., 2008). the presence of p6 in the nucleolus, where assembly of ribosomal subunits occurs, raises the possibility that p6 might interact directly with ribosomes before their export to render them competent for translation of the camv polycistronic mrna. 
indeed, p6 interacts with the ribosomal proteins l18 (leh et al., 2000), l24 (park et al., 2001), and l13 (bureau et al., 2004), which hence may be the potential targets because they participate in the formation of the 60s subunit in the nucleolus (andersen et al., 2002). another functional activity of camv p6 is suppression of rna silencing. this protein interferes with rna-directed rna polymerase 6 (rdr6)-dependent rna silencing via inhibition of the dsrna-binding protein drb4, a protein that normally enhances dcl4-mediated dicing. however, unlike nuclear targeting, the nucleolar localization of p6 is completely dispensable for its silencing suppression function (haas et al., 2008). transport of the viral genome into the nucleus is an obligatory step in the replication cycle of plant dna viruses such as the begomoviruses. cps of monopartite begomoviruses (such as tylcv and tolcjav) are nucleocytoplasmic shuttling proteins thought to be involved in this process (rojas et al., 2001; sharma and ikegami, 2009). interestingly, these proteins also contain nolss targeting them to the nucleolus (sharma and ikegami, 2009). however, the biological significance of nucleolar localization of begomoviral cp is unclear. nlss have been identified in three of the seven proteins encoded by the plant nucleorhabdovirus mfsv. remarkably, two of them, the nucleocapsid (n) protein and phosphoprotein (p), localize to the nucleolus but only when they are coexpressed in plant tissues (as fusions with gfp) using the agrobacterium system; when expressed individually these proteins do not target nucleoli (tsai et al., 2005). this clearly indicates the interdependent character of nucleolar targeting for the n and p proteins. however, the molecular mechanisms responsible for this effect and its biological relevance remain to be explored. another plant virus protein with nucleolar localization is the cp of spmv (satellite panicum mosaic virus) (qi et al., 2008). 
some of the functional and biochemical properties of the spmv cp, including its nucleolar association, its rna binding activity (desvoyes and scholthof, 2000), and its involvement in systemic movement (omarov et al., 2005), are similar to those of grv orf3. by analogy, therefore, the authors have suggested that nucleolar localization of spmv cp may be essential for its role in the systemic movement of spmv, as in the case of grv orf3. collectively, all of these findings support an active role of the nucleolus and fibrillarin in various aspects of the virus infection cycle and in interactions with host cells that promote systemic infections with plant viruses belonging to various taxonomic groups. the nucleolus contains a complex machinery for rrna modification and rrnp assembly and may provide an environment which allows other forms of functional rnps, such as srp, telomerase, splicing snrnps, and viral rnps, to exploit the nucleolus or nucleolar components in their biogenesis pathways (reviewed by boisvert et al., 2007). for example, fibrillarin is one of the four core protein components of a box c/d snornp complex (reviewed by tran et al., 2004). being a methyltransferase, fibrillarin (as a component of box c/d snornps) functions in the 2′-o-methylation and processing of pre-rrna. in addition, human fibrillarin forms a subcomplex with splicing factor 2-associated p32, protein arginine methyltransferases, and tubulins a3 and b1 that is independent of its association with snornps, suggesting that fibrillarin may also coordinate protein methylation during the processes of ribosome biogenesis. furthermore, fibrillarin interacts with some other cellular proteins such as smn protein (jones et al., 2001) and the nuclear dead box protein p68, an rna-dependent atpase and rna helicase (nicol et al., 2000). 
however, the physiological role of these interactions is unclear and may be based on novel natural functions of fibrillarin that remain to be established. the most studied viral rnp complex containing fibrillarin is the grv orf3 complex with elements of regular helical structure that is capable of long-distance movement via the phloem. assembly of grv rnp particles occurs in the cytoplasm and requires fibrillarin relocalized from the nucleolus (kim et al., 2007a,b; also see above). rnp particles similar in architecture and infectivity to the viral rnps formed in vivo have been reconstituted in vitro from fibrillarin, the orf3 protein and viral rna (kim et al., 2007a,b). taking the study further, the in vitro experiments have shown that the orf3-fibrillarin interaction occurs between the leucine-rich region (l149 in particular) of the orf3 protein and the gar domain of fibrillarin (kim et al., 2007b), which is known to be responsible for interaction with other proteins such as smn (jones et al., 2001). this interaction leads to formation of single-layered ring-like complexes of orf3 with fibrillarin, as was shown by atomic force microscopy (canetta et al., 2008). the diameter of these orf3 protein-fibrillarin rings is 18-22 nm, which correlates with that of the filamentous rnp particles (canetta et al., 2008). it thus appears that the orf3 protein-fibrillarin rings interact with viral rna, encapsidating it and reorganizing it into helical structures, and thereby play a key role in the assembly of umbraviral rnp complexes. these results demonstrate that, in addition to traditional functions in rrna processing and modification, fibrillarin possesses completely novel functions in mediating assembly of umbraviral rnps. these functions are presumably based on the ability of the orf3 protein to interact and form ring-like complexes with fibrillarin such that the virus alters and exploits the properties of fibrillarin for successful virus propagation. 
other viral proteins, such as the nucleocapsid protein encoded by porcine arterivirus, also interact with fibrillarin in an rna-dependent manner. however, the structure and architecture of these complexes and how they impact the viral infection cycle remain unknown. the nucleolus is a highly conserved feature of eukaryotic cells that has a key role as the site of ribosome subunit production. however, recent multiple lines of investigation have confirmed and characterized additional roles for nucleoli in other important cellular processes including cell-cycle control, stress responses, and coordination of the biogenesis of a number of other functional rnps. the nucleus has also been shown to play a crucial role in the infection cycle of various viruses, and the nucleolar localization of viral proteins has recently been described as a pan-virus phenomenon (hiscox, 2002, 2007). in this regard, plant viruses are not different from other eukaryotic viruses. the past few years have brought remarkable progress in our understanding of why and how some plant viruses (in particular, umbraviruses and potyviruses) target the nucleolus and of the functional role of the interaction between viral and nucleolar proteins in the plant virus infection cycle. for many other plant virus proteins, nucleolar localization and interaction with nucleolar components have also been demonstrated, and elucidating the functional implications of these findings is a challenge for future research. we anticipate that more information will emerge about the mechanisms involved in regulating nucleolar function and structure in response to plant virus infections and in hijacking nucleolar functions for the virus infection cycle. there are now several examples in which plant viruses also target other subnuclear bodies, such as cbs. in particular, for umbraviruses the role of cbs in nucleolar trafficking of the orf3 protein has been established. 
the potential role of sub-nuclear structures in other plant virus infections will be addressed in the future. the study of viral interactions with the nucleolus also provides unique and valuable tools to gain new insights into novel nucleolar functions and processes. for example, as previously discussed, the major nucleolar protein fibrillarin is involved in formation and long-distance movement of umbraviral rnp particles. these functions, as well as other potential yet unrecognized natural functions of nucleolar proteins, will be the focus of much future research. on a practical level, both the plant cell and viral biology of the nucleolus can, and hopefully will, be exploited for the design of novel strategies to control plant virus infections.
nopdb: nucleolar proteome database - 2008 update rhinovirus 3c protease precursors 3cd and 3cd' localize to the nuclei of infected cells a viral suppressor of gene silencing in plants directed proteomic analysis of the human nucleolus nucleolar proteome dynamics tyrosine 66 of pepper vein banding virus genome-linked protein is uridylylated by rna-dependent rna polymerase expression of the genome of potato leafroll virus: readthrough of the coat protein termination codon modifications of both selectivity factor and upstream binding factor contribute to poliovirus-mediated inhibition of rna polymerase i transcription fibrillarin genes encode both a conserved nucleolar protein and a novel small nucleolar rna involved in ribosomal rna methylation in arabidopsis thaliana a temporal study of the expression of the capsid, cytoplasmic inclusion and nuclear inclusion proteins of tobacco etch potyvirus in infected cells the poly(a) binding protein is internalized in virus-induced vesicles or redistributed to the nucleolus during turnip mosaic virus infection visualization of the interaction between the precursors of vpg, the viral protein linked to the genome of turnip mosaic virus, and the translation 
eukaryotic initiation factor iso 4e in planta adeno-associated virus interactions with b23/nucleophosmin: identification of sub-nucleolar regions the multifunctional nucleolus tissue and intra-cellular distribution of coconut cadang cadang viroid and citrus exocortis viroid determined by in situ hybridization and confocal laser scanning and transmission electron microscopy viral pathogenicity determinants are suppressors of transgene silencing in nicotiana benthamiana small nucleolar rnas and pre-rrna processing in plants the influenza a virus ns1 protein binds small interfering rnas and suppresses rna silencing in plants p6 protein of cauliflower mosaic virus, a translation reinitiator, interacts with ribosomal protein l13 from arabidopsis thaliana a plant virus movement protein forms ringlike complexes with the major nucleolar protein, fibrillarin, in vitro translocation of tomato bushy stunt virus p19 protein into the nucleus by aly proteins compromises its silencing suppressor activity ultrastructural localization of the rna of immunodeficiency viruses using electron microscopy in situ hybridization and in vitro-infected lymphocytes rna silencing. 
moving targets bipartite signal sequence mediates nuclear translocation of the plant potyviral nia protein internal cleavage and trans-proteolytic activities of the vpg-proteinase (nia) of tobacco etch potyvirus in vivo interaction of the coronavirus nucleoprotein with nucleolar antigens and the host cell two separate regions in the genome of the tobacco etch virus contain determinants of the wilting response of tabasco pepper an overlapping essential gene in the potyviridae cajal bodies: a long history of discovery a distant coilin homologue is required for the formation of cajal bodies in arabidopsis turnip mosaic virus rna replication complex vesicles are mobile, align with microfilaments and are each derived from a single viral genome trafficking of viral genomic rna into and out of the nucleus: influenza, thogoto and borna disease viruses cajal body-specific small nuclear rnas: a novel class of 2′-o-methylation and pseudouridylation guide rnas differential rna silencing suppression activity of ns1 proteins from different influenza a virus strains human influenza virus ns1 protein enhances viral pathogenicity and acts as an rna silencing suppressor in plants rna: protein interactions associated with satellites of panicum mosaic virus translation of potyvirus rna in a rabbit reticulocyte lysate: identification of nuclear inclusion proteins as products of tobacco etch virus rna translation and cylindrical inclusion protein as a product of the potyvirus genome characterization of catalytic residues of the tobacco etch virus 49-kda proteinase changes in nucleolar morphology and proteins during infection with the coronavirus infectious bronchitis virus cell cycle perturbations induced by infection with the coronavirus infectious bronchitis virus and their effect on virus replication nuclear transit of the rna-binding protein she2 is required for translational control of localized ash1 mrna heat shock 70 protein interaction with turnip mosaic virus rna-dependent 
rna polymerase within virus-induced membrane vesicles a cysteine-rich plant protein potentiates potyvirus movement through an interaction with the virus genome-linked protein vpg cytoplasmic cylindrical and nucleolar inclusions induced by potato virus-a gain of virulence on rsv1-genotype soybean by an avirulent soybean mosaic virus requires concurrent mutations in both p3 and hc-pro a human snorna with microrna-like functions making ribosomes in vitro interactions between a potyvirus-encoded, genome-linked protein and rna-dependent rna polymerase effect of anti-fibrillarin antibodies on building of functional nucleoli at the end of mitosis structure and functions of nucleolin cucumber mosaic virus 2b protein subcellular targets and interactions: their significance to its rna silencing suppressor activity ribosome biogenesis: of knobs and rna processing involvement of the nucleolus in replication of human viruses virulence factor of potato virus y, genome-attached terminal protein vpg, is a highly disordered protein the open reading frame vi product of cauliflower mosaic virus is a nucleocytoplasmic protein: its n terminus mediates its nuclear export and formation of electron-dense viroplasms nuclear import of camv p6 is required for infection and suppression of the rna silencing factor drb4 rna interference against viruses: strike and counterstrike nia and nib of peanut stripe potyvirus are present in the nucleus of infected cells, but do not form inclusions the multifunctional ns1 protein of influenza a viruses imaging of viroids in nuclei from tomato leaf tissue by in situ hybridization and confocal laser scanning microscopy nucleolar localization of potato leafroll virus capsid proteins internal ribosome entry sites in eukaryotic mrna molecules brief review: the nucleolus -a gateway to viral infection? 
rna viruses: hijacking the dynamic nucleolus rna polymerase activity catalyzed by a potyvirus-encoded rna-dependent rna polymerase the nucleolar phosphoprotein b23 interacts with hepatitis delta antigens and modulates the hepatitis delta virus rna replication a host rna helicase-like protein, atrh8, interacts with the potyviral genome-linked protein, vpg, associates with the virus accumulation complex, and is essential for infection the nucleolus is involved in mrna export from the nucleus in fission yeast nucleolin stimulates viral internal ribosome entry site-mediated translation modification of sm small nuclear rnas occurs in the nucleoplasmic cajal body following import from the cytoplasm a putative function of the nucleolus in the assembly or maturation of specialised messenger ribonucleoprotein complexes the dual role of the potyvirus p3 protein of turnip mosaic virus as a symptoms and avirulence determinant in brassicas nucleolin/nsr1p binds to the 3′ noncoding region of the tombusvirus rna and inhibits replication recessive resistance in pisum sativum and potyvirus pathotype resolved in a gene-for-cistron correspondence between host and virus direct interaction of the spinal muscular atrophy disease protein smn with the small nucleolar rna-associated protein fibrillarin a counterdefensive strategy of plant viruses: suppression of posttranscriptional gene silencing functional genomics on potato virus a: virus genome-wide map of sites essential for virus propagation cajal bodies and the nucleolus are required for a plant virus systemic infection interaction of a plant virus-encoded protein with the major nucleolar protein fibrillarin is required for systemic virus infection aberrant mrna transcripts and the nonsense-mediated decay proteins upf2 and upf3 are enriched in the arabidopsis nucleolus small nucleolar rnas: an abundant group of non-coding rnas with diverse cellular functions partial purification and some properties of tobacco etch virus induced 
intranuclear inclusions dynamic behaviour of the eif4a-iii putative core protein of the exon junction complex: fast relocation to nucleolus and speckles under hypoxia analysis of nucleolar protein dynamics reveals the nuclear degradation of ribosomal proteins nuclear speckles: a model for nuclear organelles immunocytology shows the presence of tobacco etch virus p3 protein in nuclear inclusions the cauliflower mosaic virus translational transactivator interacts with the 60s ribosomal subunit protein l18 of arabidopsis thaliana complex formation between potyvirus vpg and translation eukaryotic initiation factor 4e correlates with virus infectivity interaction of vpg-pro of turnip mosaic virus with the translation initiation factor 4e and the poly(a)-binding protein in planta nuclear transport of tobacco etch potyviral rnadependent rna polymerase is highly sensitive to sequence alterations complementation of tobacco etch potyvirus mutants by active rna polymerase expressed in transgenic cells c23 interacts with b23, a putative nucleolar-localization-signal-binding protein functions of the tobacco etch virus rna polymerase (nib): subcellular transport and protein-protein interaction with vpg/proteinase (nia) interferon antagonist proteins of influenza and vaccinia viruses are suppressors of rna silencing in vivo and in vitro arginine methylation of rna-binding proteins role of cajal bodies and nucleolus in the maturation of the u1 snrnp in arabidopsis plant viral movement proteins: agents for cell-to-cell trafficking of viral genomes suppression of post-transcriptional gene silencing by a plant viral protein localized in the nucleus inhibition of protein dephosphorylation results in the accumulation of splicing snrnps and coiled bodies within the nucleolus transport into and out of the nucleus. 
microbiol ultrastructural location of non-structural protein 3a of cucumber mosaic virus in infected tissue using monoclonal antibodies to a cloned chimeric fusion protein cytochemical and immunocytochemical characterization of nuclear bodies during hibernation poly(a)-binding proteins: multifunctional scaffolds for the post-transcriptional control of gene expression intracellular localization of three non-structural plum pox potyvirus proteins by immunogold labelling a bromodomaincontaining protein from tomato specifically binds potato spindle tuber viroid rna in vitro and in vivo pumpimg rna: nuclear bodybuilding along the rnp pipeline nuclear bodies: random aggregates of sticky proteins or crucibles of macromolecular assembly? adenovirus protein v induces redistribution of nucleolin and b23 from nucleolus to cytoplasm what's new in the nucleolus? rybozyme-mediated inhibition of hiv 1 suggests nucleolar trafficking of hiv-1 rna nucleolin: a multifaceted protein triple gene block: modular design of a multifunctional machine for plant virus movement influenza a virus non-structural protein 1 (ns1) interacts with cellular multifunctional protein nucleolin during infection replacement of the tyrosine residue that links a potyviral vpg to the viral rna is lethal targeting smn to cajal bodies and nuclear gems during neuritogenesis the nuclear dead box rna helicase p68 interacts with the nucleolar protein fibrillarin and colocalizes specifically in nascent nucleoli during telophase variations in the vpg protein allow a potyvirus to overcome va gene resistance in tobacco nucleocytoplasmic transport: signals, mechanisms and regulation coiled bodies in the nucleolus of breast cancer cells cajal bodies and coilin-moving towards function identification of nucleophosmin/b23, an acidic nucleolar protein, as a stimulatory factor for in vitro replication of adenovirus dna complexed with viral basic core proteins nontraditional roles of the nucleolus the moving parts of the 
nucleolus the capsid protein of satellite panicum mosaic virus contributes to systemic invasion and interacts with its helper virus getting the message across: how do plant cells exchange macromolecular complexes? identification of the genome-linked protein in virions of potato virus a, with comparison to other members in genus potyvirus a plant viral 'reinitiation' factor interacts with the host translational machinery the smn complex, an assemblysome of ribonucleoproteins the plurifunctional nucleolus the survival of motor neurons (smn) protein interacts with the snornp proteins fibrillarin and gar1 proteomic analysis of the arabidopsis nucleolus suggests novel nucleolar functions micrornas with a nucleolar location sirna and mirna processing: new functions for cajal bodies uridylylation of the potyvirus vpg by viral replicase nib correlates with the nucleotide binding capacity of vpg the nucleolus is the site of borna disease virus rna transcription and replication differential subnuclear localization of rna strands of opposite polarity derived from an autonomously replicating viroid the complex subcellular distribution of satellite panicum mosaic virus capsid protein reflects its multifunctional role during infection the 6k2 protein and the vpg of potato virus a are determinants of systemic infection in nicandra physaloides viral genome-linked protein (vpg) controls accumulation and phloem-loading of a potyvirus in inoculated potato leaves localization of a potyvirus and the viral genome-linked protein in wild potato leaves at an early stage of systemic infection control of nuclear and nucleolar localization of nuclear inclusion protein a of picorna-like potato virus a in nicotiana species infection with potyviruses a novel insertion site inside the potyvirus p1 cistron allows expression of heterologous proteins and suggests some p1 functions potato virus a genome-linked protein vpg is an intrinsically disordered molten globule-like protein with a hydrophobic 
core structure and function of the nucleolus in the spotlight nuclear transport of plant potyviral proteins regulation of nuclear transport of a plant potyvirus protein by autoproteolysis the tobacco etch potyvirus 6-kilodalton protein is membrane associated and involved in viral replication dynamic organisation of the cell nucleus translation initiation factors: a weak link in plant rna virus infection functional analysis of proteins involved in movement of the monopartite begomovirus, tomato yellow leaf curl virus nucleolar-cytoplasmic shuttling of prrsv nucleocapsid protein: a simple case of molecular mimicry or the complex regulation by nuclear import, nucleolar localization and nuclear export signal sequences disruption of the nucleolus mediates stabilization of p53 in response to dna damage and other stresses roles of plant small rnas in biotic stress responses intracellular location of two groundnut rosette umbravirus proteins delivered by pvx and tmv vectors a plant virus-encoded protein facilitates long-distance movement of heterologous viral rna identification of a nuclear localization signal and nuclear export signal of the umbraviral long-distance rna movement protein analysis of the vpg-proteinase (nia) encoded by tobacco etch potyvirus: effects of mutations on subcellular transport, proteolytic processing and genome amplification formation of plant rna virus replication complexes on membranes: role of an endoplasmid reticulum-targeted viral protein vpg of tobacco etch potyvirus is a host genotype-specific determinant for long-distance movement strain-specific interaction of the tobacco etch virus nia protein with the translation initiation factor eif4e in the yeast twohybrid system dynamics and interplay of nuclear architecture, genome organisation, and gene expression mrna transport in yeast: time to reinvestigate the functions of the nucleolus subcellular localization of viroids in highly purified nuclei from tomato leaf tissue human mirna 
precursors with box h/aca snorna features characterization of signals that dictate nuclear/nucleolar and cytoplasmic shuttling of the capsid protein of tomato leaf curl java virus associated with dnab satellite plant nuclear bodies identification of a protein covalently linked to the 5 0 -terminus of tobacco vein mottling virus rna nucleolus: the fascinating nuclear body newly assembled snrnps associate with coiled bodies before speckles, suggesting a nuclear snrnp maturation pathway dynamic interactions between slicing snrnps, coiled bodies and nucleoli revealed using snrnp protein fusions to the green fluorescent protein mutational analysis of fibrillarin and its mobility in living human cells old and new faces of the nucleolus. workshop on the nucleolus and disease the emerging roles of translation factor eif4e in the nucleus viroids: petite rna pathogens with distinguished talents small rnas derived from snornas molecular biology of umbraviruses: phantom warriors an umbraviral protein, involved in long-distance rna movement, binds viral rna and forms unique, protective ribonucleoprotein complexes eukaryotic elongation factor 1a interacts with turnip mosaic virus rna-dependent rna polymerase and vpg-pro in virus-induced vesicles regulation of host cell translation by viruses and effects on cell function insights into nuclear organisation in plants as revealed by the dynamic distribution of arabidopsis sr splicing factors evolutionary origins of the rna-guided nucleotide-modification complexes: from the primitive translation apparatus? 
toward a high-resolution dynamics view of nuclear dynamics nuclear functions in space and time: gene expression in a dynamic complete genome sequence and in planta subcellular localization of maize fine streak virus proteins residual cajal bodies in coilin knockout mice fail to recruit sm snrnps and smn, the spinal muscular atrophy gene product nucleolin: a multifunctional major nucleolar phosphoprotein rna silencing in plants -defense and counterdefense ribosome synthesis in saccharomyces cerevisiae rna silencing as a plant immune system against viruses viral ribonucleoprotein complex formation and nucleolar-cytoplasmic relocalization of nucleolin in poliovirus-infected cells inhibition of host gene expression associated with plant virus replication crystal structure of a fibrillarin homologue from methanococcus jannaschii, a hyperthermophile, at 1.6 å resolution interaction between zucchini yellow mosaic potyvirus rna-dependent rna polymerase and host poly-(a) binding protein interaction of the viral protein genome linked of turnip mosaic potyvirus with the translational eukaryotic initiation factor (iso) 4e of arabidopsis thaliana using the yeast two-hybrid system localisation to the nucleolus is a common feature of coronavirus nucleoproteins and the protein may disrupt host cell division human fibrillarin forms a sub-complex with splicing factor 2-associated p32, protein arginine methyltransferases, and tubulins alpha 3 and beta 1 that is independent of its association with preribosomal ribonucleoprotein complexes colocalization and interaction of the porcine arterivirus nucleocapsid protein with the small nucleolar rnaassociated protein fibrillarin cucumber mosaic virus-encoded 2b suppressor inhibits arabidopsis argonaute1 cleavage activity to counter plant defense key: cord-271642-i71g2tmd authors: mullen, lisa m.; nair, sean p.; ward, john m.; rycroft, andrew n.; henderson, brian title: phage display in the study of infectious diseases date: 2006-02-07 
journal: trends microbiol doi: 10.1016/j.tim.2006.01.006 sha: doc_id: 271642 cord_uid: i71g2tmd microbial infections are dependent on the panoply of interactions between pathogen and host and identifying the molecular basis of such interactions is necessary to understand and control infection. phage display is a simple functional genomic methodology for screening and identifying protein–ligand interactions and is widely used in epitope mapping, antibody engineering and screening for receptor agonists or antagonists. phage display is also used widely in various forms, including the use of fragment libraries of whole microbial genomes, to identify peptide–ligand and protein–ligand interactions that are of importance in infection. in particular, this technique has proved successful in identifying microbial adhesins that are vital for colonization.

phage display technology

display on filamentous phage

one of the first genetic techniques developed to study protein-ligand interactions was phage display, described by smith in 1985 [1].
this method displays recombinant peptides or proteins on the surface of phage particles, which can then be selected for ('panned' for; figure 1) by enabling the phage to interact with selected immobilized ligands. the power of phage display lies in its ability to: (i) maintain a physical link between the displayed protein and the dna sequence encoding it; and (ii) screen libraries containing billions of unique peptides and proteins. escherichia coli filamentous phage of the ff class, including strains m13, fd and f1, have been extensively used to develop and exploit this technology. these phage are composed of a circular single-stranded dna genome that is encased in a long tube composed of thousands of copies of a single major coat protein, with four additional minor capsid proteins at the tips (figure 2). phage display involves the fusion of foreign dna sequences to the phage genome such that the resulting foreign proteins are expressed in fusion with one of the coat proteins. although all five coat proteins have been used to display proteins or peptides, gene viii protein (pviii) and gene-iii-encoded adsorption protein (piii) are by far the most commonly used [2]. a viable wild-type phage expresses ~2700 copies of pviii and 3-5 copies of piii (figure 2) [3], although this does depend on the size of the phage genome. phage display libraries can be constructed using vectors based on the natural ff phage sequence (i.e. phage vectors) or by using 'phagemids', which are hybrids of phage and plasmid vectors [2, 3]. such phagemids are designed with the origin of replication (ori) from the ff phage, a plasmid origin of replication from e. coli, gene iii and/or gene viii for fusion formation, a multiple cloning site and an antibiotic resistance gene [2]. however, they lack all other phage genes that encode the structural and non-structural proteins that are required to produce a complete phage. phagemids can be grown as plasmids in e.
coli and packaged as recombinant ff phage dna with the aid of helper phage, which provide all of the necessary components for phage assembly. the key feature of filamentous phage (as applied to phage display) is that, in contrast to the lytic bacteriophages, filamentous phage are assembled in the cytoplasmic membrane and secreted from infected bacteria without cell lysis [2] (figure 3). however, the filamentous phage life cycle limits the display of proteins whose properties prevent the correct transfer of the hybrid capsid protein across the lipid bilayer of the inner membrane of e. coli [4]. alternative bacteriophage display systems have been developed using bacteriophage such as t4, t7, λ [4] and p4 [5], which have lytic life cycles so that the proteins of the phage capsids are assembled and folded in the cytoplasm rather than being secreted through the membrane [4]. because the most common approaches to phage display (described earlier) involve n-terminal fusion to the gene iii or gene viii products of filamentous phage, they are unsuitable for surface expression of proteins coded by intact cdna inserts that have stop codons [6, 7]. hence, most phage libraries of cdna fragments are constructed in alternative display systems. however, a modified filamentous phage display system based on the high-affinity interactions between the jun and fos leucine zipper proteins was developed by crameri and suter [6]. in this system, the jun leucine zipper protein is fused to the n terminus of the piii coat protein so that the jun protein is displayed on the surface of the phage, and the cdna libraries are cloned as a c-terminal fusion to the fos leucine zipper protein [8]. the piii-anchored jun protein is bound by the soluble fos fusion in the periplasm; thus, the cdna-encoded protein is bound indirectly to piii. there are two types of phage display library: random peptide libraries (rpls) and natural peptide libraries (npls).
the repertoire of peptides displayed in rpls is encoded by synthetic random degenerate oligonucleotide inserts [9–11] and these libraries have been extensively used to identify linear antigenic epitopes (see later). the advantage of this type of library is its universal nature, which enables many applications of each library. the main disadvantage of rpls is that, because of the way in which they are constructed, peptide sequences that are not found within the antigen or intact pathogen can be displayed [12]. by contrast, npls are constructed from randomly fragmented dna from the genomes of selected organisms such as pathogenic microbes. thus, the phage particles in these libraries display fragments of natural proteins. although the majority of clones in npls are nonfunctional (only one in 18 clones will be correctly in frame with the vector sequences: one clone in three will start correctly, one clone in three will end correctly and one clone in two will be in the correct orientation), peptides or proteins selected from npls are more successful in eliciting an antibody response that crossreacts with the native intact pathogen than those selected from rpls [13]. therefore, npls provide important alternatives to rpls for applications such as the identification of vaccine components and, recently, have been used effectively in the identification of bacterial adhesins (see later).

phage display technology has been used in a wide variety of applications including: the identification of peptide agonists and antagonists for receptors [14], the identification of targets for the inhibition of tumour-specific angiogenesis [15], the identification of peptide drug candidates [16], vaccine development [17] and the isolation and engineering of recombinant antibodies [18]. however, in recent years, numerous studies have used phage display to address specific aspects of infectious disease.
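as a check on the one-in-18 estimate for npls given above, the short python sketch below enumerates the possible ways a random fragment can join the vector. the names used, and the assumption that all combinations of reading frame and orientation are equally likely and independent, are ours for illustration and are not taken from the cited studies.

```python
from fractions import Fraction
from itertools import product

# Each randomly fragmented insert can join the vector in one of three
# reading frames at its 5' end, one of three at its 3' end, and in one
# of two orientations. We assume all combinations are equally likely.
START_FRAMES = (0, 1, 2)
END_FRAMES = (0, 1, 2)
ORIENTATIONS = ("forward", "reverse")


def is_displayed(start, end, orientation):
    """A clone is counted as functional only if both junctions are in
    frame (frame 0) and the insert is in the forward orientation."""
    return start == 0 and end == 0 and orientation == "forward"


def functional_fraction():
    """Fraction of all frame/orientation combinations that are functional."""
    combos = list(product(START_FRAMES, END_FRAMES, ORIENTATIONS))
    good = sum(is_displayed(*combo) for combo in combos)
    return Fraction(good, len(combos))


if __name__ == "__main__":
    print(functional_fraction())
```

under these assumptions, exactly one of the 18 combinations gives a displayed fusion, which matches the product 1/3 × 1/3 × 1/2 quoted in the text.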
the purpose of this review is to focus on the growing number of ways in which phage display technology can be applied to the study of infectious diseases and to evaluate the impact of the use of this technology.

infectious diseases are absolutely dependent upon host-pathogen interactions, the extent of which determines the infectious process both for the pathogen and the host. as discussed later, epitope mapping of infectious agents has been used by numerous research groups, particularly to focus on the bacterial adhesins that enable host colonization. phage display technology has been used to good effect in malaria research to investigate host-pathogen interactions in both hosts of the malaria parasite: homo sapiens and the mosquito anopheles. panning of a plasmodium falciparum (the causative agent of malaria) cdna phage display library against immobilized human erythrocyte membrane proteins identified seven parasite proteins that are potentially involved in the entry into or exit from human erythrocytes by p. falciparum [19]. such proteins could become vaccine targets. the search for the mechanisms used by plasmodium species to cause infection has also prompted novel uses for phage display libraries. the plasmodium parasite completes its life cycle in the mosquito and it is this interaction, rather than that with the human host, that has been investigated by ghosh et al. [20]. because the development of the plasmodium parasite in the mosquito involves the crossing of both midgut and salivary gland epithelia, this study investigated the hypothesis that such trafficking requires specific host-pathogen interaction. a phage display library of random dodecapeptides fused to the n terminus of phage coat protein viii was injected into anopheles, followed by dissection of the organs of interest and elution of bound phage. a 12-residue peptide from the eluted phage was identified and designated 'salivary gland and midgut peptide 1 (sm1)' [20].
this peptide strongly inhibited plasmodium invasion of salivary-gland and midgut epithelia, thereby hindering the development of the parasite in anopheles. this study illustrates another variation of phage display technology: the use of phage display libraries to investigate specific phage-binding targets within the whole organism. in vivo panning was first described by pasqualini and ruoslahti [21], who isolated phage-displayed peptides that bound to vascular beds in vivo. to date, there are limited reports of such in vivo panning but this could have an important future role for the identification of specific tissue receptors for microbes or their products.

a variation on this theme is the panning of phage display libraries against ex vivo biomaterials that have been used in a range of clinical applications including prosthetic heart valves and intravenous catheters. bacteria such as staphylococcus aureus and staphylococcus epidermidis are frequently isolated from biomaterial infections [22]. a phage display library constructed from the genomic dna of s. aureus was panned against a central intravenous catheter that had been removed from a patient [23]. numerous clones from the s. aureus library were recovered through panning and, after the second and third rounds, the enriched phage encoded fragments of bacterial proteins known to bind to fibrinogen and β2-glycoprotein i, which, unsurprisingly, were the most abundant host proteins deposited on the catheter.

laminin-binding motifs in the surface virulence factor [plasminogen activator (pla)] of yersinia pestis were mapped by benedek et al. [24] using an rpl. this approach involved identification of phage-displayed amino acid sequences that bound to laminin, then comparison of these sequences to the sequence of pla to identify the specific binding site of pla to laminin. this study demonstrates how phage display can be used to map specific binding sites within a protein of interest.
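the comparison step described for binding-site mapping (matching phage-selected peptides against the sequence of the protein of interest) can be sketched in python as a simple ungapped sliding-window search. this is a minimal illustration only: the identity-count scoring, the function names and the sequences are hypothetical and are not the method or data of the cited study.

```python
def best_ungapped_match(peptide, protein):
    """Slide `peptide` along `protein` and return (identities, start)
    for the window with the most identical residues (first on ties)."""
    if len(peptide) > len(protein):
        raise ValueError("peptide is longer than protein")
    best = (-1, -1)
    for start in range(len(protein) - len(peptide) + 1):
        window = protein[start:start + len(peptide)]
        identities = sum(a == b for a, b in zip(peptide, window))
        if identities > best[0]:
            best = (identities, start)
    return best


def map_peptides(peptides, protein, min_identities):
    """Return {peptide: (identities, start)} for peptides whose best
    window has at least `min_identities` identical residues."""
    hits = {}
    for pep in peptides:
        identities, start = best_ungapped_match(pep, protein)
        if identities >= min_identities:
            hits[pep] = (identities, start)
    return hits


if __name__ == "__main__":
    # Hypothetical sequences, invented for illustration only.
    protein = "MKTLLAGSWQPRYEVNDKHGFTSAC"
    selected = ["GSWQPKY", "EVNDRHG", "WWWWWWW"]
    for pep, (n, start) in map_peptides(selected, protein, 5).items():
        print(pep, n, start)
```

in practice a substitution matrix and a significance threshold would probably be preferred over raw identity counts; the sketch only shows how clusters of selected peptides can point to a candidate region of the target protein.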
epitope mapping and identification of potential vaccine candidate antigens

the term epitope can be used to describe the contacting points of any molecular interaction but it is more often used to describe the region on an antigen that elicits an immune response. the identification of epitopes from microbial pathogens has obvious importance in the study of infectious disease, particularly for the development of novel vaccines. phage display technology is especially suited to epitope mapping and has been widely used in infectious-disease research (for review, see ref. [17]). much of the work on epitope mapping and identification of mimotopes of infectious agents has used rpls. mimotopes are peptide sequences that are capable of inducing immune responses by mimicking the structural features of a non-linear protein epitope or of a non-protein (e.g. carbohydrate) epitope structure. rpls have been panned against immune sera to identify disease-specific epitopes. selected examples of these applications are listed in table 1. in addition, phage display has successfully identified peptide mimics of polysaccharides. this approach has important implications for the development of novel vaccines against encapsulated bacteria because the polysaccharide capsules of these bacteria are often antigenic. peptide mimics of the capsular polysaccharide of three of the neisseria meningitidis serotypes a [25], b [26] and c [27] have been identified using rpls. the results of these studies are encouraging in that the mimotopes identified in this way did elicit immune responses in murine models and could be useful in the development of novel vaccines against this bacterium.

the pathogenesis of s. aureus has been investigated by panning an rpl against the rnaiii-activating protein (rap), which autoinduces toxin production in this bacterium [28, 29]. initial work demonstrated that selected peptides could attenuate infection by s. aureus in murine models [28].
more recent studies have identified a peptide mimic for the rap protein that, when displayed on e. coli and used to vaccinate mice, prevented mortality caused by s. aureus infection [29].

infection with pseudomonas aeruginosa is a major problem in patients with cystic fibrosis (cf). the development of vaccines against this organism must take into account that antibiotic treatment of chronic p. aeruginosa infections rarely results in clearance of the bacterium. this is in contrast to treatment of early p. aeruginosa infections in which the bacteria can be eradicated [30]. antigens expressed early in p. aeruginosa infections of cf patients were investigated by panning an rpl against sera from non-infected and infected patients [31]. in conjunction with gene-array data and bioinformatic analysis, this study identified several genes that encode secreted proteins and outer-membrane proteins that are potential targets for vaccine development against p. aeruginosa infection [31].

alterations in the panning technique used, such as subjecting the library to initial negative selection, further increase the potential uses of these libraries. gnanasekar et al. [32] constructed a t7 phage display library of the cdna of the parasite brugia malayi (a causative agent of lymphatic filariasis). to ensure that the clones of interest (i.e. those binding to infection-specific antibodies) were enriched, the phage library was first subjected to three negative-selection steps using non-infected human sera followed by a positive-selection step using sera from patients infected with b. malayi. further experiments revealed that one of the five antigens identified using this strategy, the b. malayi nip3-like protein, conferred protection against challenge infections in animal models [32].

table 1 (fragment; only the final rows survive here). each row lists the library, the antibody or serum used for selection, the outcome and the reference.
... | ... | ... | [50]
12mer rpl displayed on piii | monoclonal iga1 specific for the capsule of streptococcus pneumoniae | identification of mimotopes of the capsular polysaccharide of type 8 s. pneumoniae. when conjugated to tetanus toxoid, mimotopes induced a type 8 capsular-polysaccharide-specific antibody response in mice | [51]
7mer rpl displayed on piii | sera from swine infected with nipah virus | identification of several putative epitopes within the nucleocapsid protein of nipah virus | [52]
12mer rpl displayed on piii | polyclonal igg specific for neutral polysaccharides of mycobacterium tuberculosis | isolation of phage clones that were antigenic mimotopes of b-cell epitopes of m. tuberculosis sugars. one clone invoked immune responses in rabbits | [53]

review, trends in microbiology vol. 14 no. 3, march 2006

single chain variable fragment (scfv) libraries have been used in several ways to study infectious diseases. identification of specific scfvs that bind to microbial antigens forms the basis for the development of novel vaccines and diagnostic reagents [3]. panning of a scfv library against purified severe acute respiratory syndrome-associated coronavirus (sars-cov) identified two antibody fragments that bind to sars-cov with high specificity [33]. these could potentially be used in the development of detection assays for the virus or to elucidate potential vaccine candidate antigens. phage-display scfv libraries have also been used successfully to develop immunodiagnostic and detection methods for microbial products or spores. one of the most impressive aspects of these applications is the high specificity of selected clones for the microbe. for example, panning of a scfv library against spores of bacillus subtilis resulted in the selection of clones that bound to only one of 11 spore strains [34]. similarly, clones selected from a phage display scfv library screened against clostridium difficile toxin b were highly specific and showed no crossreactivity with strains of c. difficile that are toxin-b negative.
the single-chain antibody could also be used to detect as little as 10 ng of toxin b when used in elisas [35].

microbial infections are initiated by molecular interactions between the pathogen (through cell surface adhesins) and receptor molecules on host cells and the extracellular matrix (ecm), resulting in microbial adhesion, which can be followed by internalization [36, 37]. it is well established that the adhesion of enteric, oral and respiratory bacteria is required for colonization [37]. furthermore, when bacteria adhere to surfaces, they are substantially more resistant to host antimicrobial defences [38]. adherence to structures such as the ecm is, therefore, a key step in the development of disease. the ecm consists of a complex mixture of macromolecules including collagens, fibronectin, fibrinogen, vitronectin, laminin and heparan sulfate [38], all of which function as ligands for bacterial adhesion. with the continued rise in antibiotic resistance in bacteria, there is an urgent need to find alternative ways to combat microbial and, specifically, bacterial infection. inhibition of bacterial adhesion is one potential therapeutic approach but usually requires an intimate knowledge of the adhesins involved in infection. phage display lends itself perfectly to the investigation of possible binding partners for a particular ligand and this application of phage display was first exploited by jacobsson and frykberg [39, 40] to identify genes encoding bacterial proteins that interact with host proteins. by constructing libraries from random fragments of bacterial genomic dna (i.e. shotgun phage display) and subsequent panning against components of the host, such as proteins of the ecm, host serum or plasma, it is possible to identify bacterial genes that encode cell surface adhesins. microbial proteins (including cell-bound and soluble proteins) that bind to mammalian target proteins were termed 'receptins' by kronvall and jonsson [41].
following the initial description of gene-iii-based phage display to identify genes coding for ligand-binding domains of bacterial receptins by jacobsson and frykberg [39], many genes involved in host-pathogen interactions (from a variety of bacterial species) have been discovered using phage-display systems (table 2). the majority of these studies have been carried out using gene-viii-based vectors such as the pg8h6 or pg8saet phagemid vectors developed by jacobsson and frykberg [40, 42]. panning of piii-based libraries tends to select for fusion proteins with the strongest interaction with the ligand of choice but only a small fraction of piii libraries contain functional inserts [43]. use of pviii generates phage with multiple copies of recombinant proteins but the selection process tends to select for lower affinity interactions.

shotgun phage display has also been used in the mapping of binding domains within proteins coded by genes of interest. instead of random fragmentation of genomic dna from a particular species, a previously identified gene of interest is fragmented by sonication and used to generate a single-gene phage display library. this approach has been used to map fibronectin-binding activity to two c-terminal domains of the fibronectin-binding protein fnz of streptococcus equi [44], the igg and β2-glycoprotein-i-binding domains of the sbi protein of s. aureus [42, 45] and the igg and albumin-binding domains in two cell-surface receptors, mag and zag, from group c streptococci [46]. the application of shotgun phage display to the identification of bacterial virulence factors has been further extended by frykberg and colleagues, with the development of phagemid vectors that only enable the display of bacterial exported proteins [47, 48]. many exported microbial proteins are adhesins, enzymes or toxins, all of which might have a role in pathogenicity.
the libraries constructed by the shotgun method might not achieve full coverage of bacterial genomes for two reasons: (i) the low transformation efficiency of e. coli limits the number of primary clones (i.e. those containing unique inserts); and (ii) only one in 18 clones displays fusion proteins (see earlier). however, despite these limitations, the success of this method is obvious considering the numbers of shotgun phage display libraries that have been used to identify novel genes that encode putative bacterial receptins and protein binding domains ( table 2) . use of this technique identifies multiple microbial genes encoding proteins that potentially interact with the host in a single panning experiment and such results provide a starting point to determine the importance of various bacterial genes in pathogenesis. it is likely that phage display npls will become even more widely used in the study of infectious diseases, particularly because the growing number of available bacterial genomes means that the dna fragments that encode phage-displayed peptides and proteins can be easily mapped to a particular bacterial gene. the study of infectious diseases has benefited greatly from phage display technology. this technique has many advantages: the link between phenotype and genotype, the enormous diversity of variant proteins displayed within a single library and the flexibility of selection that can be performed in vitro or in vivo. earlier limitations on this technique (such as the unsuitability of piii-or pviii-based filamentous phage display systems for the display of cdna coding for peptides or problems with protein conformation) have been circumvented by continual development of the technology. the versatility of phage display means that it can be adapted easily to many different areas of research, as highlighted by the variety of biological questions to which phage display has been applied. 
it has yielded interesting and important results in the study of infectious diseases, not least in epitope mapping, the identification of potential vaccine candidates and bacterial adhesins. however, the vast potential of phage display is such that there are many more ways of applying the technology to these areas of research. there is certainly scope for developing new strategies to identify host receptors for microbial products by panning in vivo in animal models or against ex vivo tissues or organs. there are numerous examples of synergistic relationships between opportunistic bacteria that could be examined by panning phage display libraries against each other to investigate interactions between microbes. perhaps the most exciting potential use of phage display in the study of infectious diseases is in comparative genomics. the number of sequenced microbial genomes has increased greatly in the past ten years and continues to do so as more genomes are sequenced. this will enable detailed bioinformatic analysis of genes identified by panning of npls and a comparison of the genes implicated in pathogenicity between groups of closely related bacteria. 
filamentous fusion phage -novel expression vectors that display cloned antigens on the virion surface introduction to phage display and phage biology phage display technology: clinical applications and recent innovations alternative bacteriophage display systems peptide presentation by bacteriophage p4 display of biologically-active proteins on the surface of filamentous phages -a cdna cloning system for selection of functional gene-products linked to the genetic information responsible for their production pjufo: a phage surface display system for cloning genes based on protein-ligand interaction display of expression products of cdna libraries on phage surfaces -a versatile screening system for selective isolation of genes by specific gene-product ligand interaction peptides on phage -a vast library of peptides for identifying ligands searching for peptide ligands with an epitope library random peptide libraries -a source of specific protein-binding molecules peptides isolated from random peptide libraries on phage elicit a neutralizing anti-hiv-1 response: analysis of immunological mimicry immunogenically fit subunit vaccine components via epitope discovery from natural peptide ligands peptides identify the critical hotspots involved in the biological activation of the insulin receptor in vivo phage display and vascular heterogeneity: implications for targeted medicine phage display-derived peptides as therapeutic alternatives to antibodies epitope identification and discovery using phage display libraries: applications in vaccine development and diagnostics biotechnological applications of phage and cell display construction and use of plasmodium falciparum phage display libraries to identify host parasite interactions targeting plasmodium ligands on mosquito salivary glands and midgut with a phage display peptide library organ targeting in vivo using phage display libraries pathogenesis of infections due to coagulasenegative staphylococci sorting a 
staphylococcus aureus phage display library against ex vivo biomaterial identification of laminin-binding motifs of yersinia pestis plasminogen activator by phage display selection of an immunogenic peptide mimic of the capsular polysaccharide of neisseria meningitidis serogroup a using a peptide display library peptide mimotopes of neisseria meningitidis group b capsular polysaccharide two different methods result in the selection of peptides that induce a protective antibody response to neisseria meningitidis serogroup c inhibition of staphylococcus aureus pathogenesis in vitro and in vivo by rap-binding peptides a novel peptide screened by phage display can mimic trap antigen epitope against staphylococcus aureus infections significant microbiological effect of inhaled tobramycin in young children with cystic fibrosis use of phage display to identify potential pseudomonas aeruginosa gene products relevant to early cystic fibrosis airway infections novel phage display-based subtractive screening to identify vaccine candidates of brugia malayi identification of single-chain antibody fragments specific against sars-associated coronavirus from phage-displayed antibody library peptide ligands that bind selectively to spores of bacillus subtilis and closely related species recombinant single-chain variable fragment antibodies directed against clostridium difficile toxin b produced by use of an optimized phage display system bacterial adhesins: common themes and variations in architecture and assembly bacterial adhesion to animal cells and tissues mscramm-mediated adherence of microorganisms to host tissues cloning of ligand-binding domains of bacterial receptors by phage display phage display shot-gun cloning of ligand-binding domains of prokaryotic receptors approaches 100% correct clones receptins: a novel term for an expanding spectrum of natural and engineered microbial proteins with binding properties for mammalian proteins a second igg-binding protein in 
staphylococcus aureus shotgun phage display -selection for bacterial receptins or other exported proteins fibronectin-binding protein of streptococcus equi subsp zooepidemicus staphylococcus aureus expresses a cell surface protein that binds both igg and b2-glycoprotein i shot-gun phage display mapping of two streptococcal cell-surface proteins phage display as a novel screening method to identify extracellular proteins identification of a novel collagen-like protein, sclc, in streptococcus equi using signal sequence phage display epitope mapping of burkholderia pseudomallei serine metalloprotease: identification of serine protease epitope mimics epitope mapping of mycoplasma hyopneumoniae using phage displayed peptide libraries and the immune responses of the selected phagotopes a peptide mimotope of type 8 pneumococcal capsular polysaccharide induces a protective immune response in mice identification of epitopes in the nucleocapsid protein of nipah virus using a linear phage-displayed random peptide library peptide mimotopes of mycobacterium tuberculosis carbohydrate immunodeterminants molecular cloning and characterization of two helicobacter pylori genes coding for plasminogen-binding proteins a novel von willebrand factor binding protein expressed by staphylococcus aureus staphylococcus aureus fibronectin-binding protein (fnbp)-mediated adherence to platelets, and aggregation of platelets induced by fnbpa but not by fnbpb interaction of staphylococcus aureus fibronectin-binding protein with fibronectin -affinity, stoichiometry, and modular requirements identification of a fibronectin-binding protein from staphylococcus epidermidis staphylococcus aureus fibronectin binding proteins a and b possess a second fibronectin binding region that may have biological relevance to bone tissues a fibrinogen-binding protein of staphylococcus epidermidis identification of novel adhesins from group b streptococci by use of phage display reveals that c5a peptidase mediates 
fibronectin binding a novel family of fibrinogen-binding proteins in streptococcus agalactiae m-like proteins of streptococcus dysgalactiae sfs, a novel fibronectin-binding protein from streptococcus equi, inhibits the binding between fibronectin and collagen a von willebrand factor-binding protein from staphylococcus lugdunensis a fibrinogen-binding protein of staphylococcus lugdunensis key: cord-262119-s6hc7fxs authors: ostaszewski, marek; niarakis, anna; mazein, alexander; kuperstein, inna; phair, robert; orta-resendiz, aurelio; singh, vidisha; aghamiri, sara sadat; acencio, marcio luis; glaab, enrico; ruepp, andreas; fobo, gisela; montrone, corinna; brauner, barbara; frischman, goar; monraz gómez, luis cristóbal; somers, julia; hoch, matti; gupta, shailendra kumar; scheel, julia; borlinghaus, hanna; czauderna, tobias; schreiber, falk; montagud, arnau; de leon, miguel ponce; funahashi, akira; hiki, yusuke; hiroi, noriko; yamada, takahiro g.; dräger, andreas; renz, alina; naveez, muhammad; bocskei, zsolt; messina, francesco; börnigen, daniela; fergusson, liam; conti, marta; rameil, marius; nakonecnij, vanessa; vanhoefer, jakob; schmiester, leonard; wang, muying; ackerman, emily e.; shoemaker, jason; zucker, jeremy; oxford, kristie; teuton, jeremy; kocakaya, ebru; summak, gökçe yağmur; hanspers, kristina; kutmon, martina; coort, susan; eijssen, lars; ehrhart, friederike; rex, d. a. 
b.; slenter, denise; martens, marvin; haw, robin; jassal, bijay; matthews, lisa; orlic-milacic, marija; senff ribeiro, andrea; rothfels, karen; shamovsky, veronica; stephan, ralf; sevilla, cristoffer; varusai, thawfeek; ravel, jean-marie; fraser, rupsha; ortseifen, vera; marchesi, silvia; gawron, piotr; smula, ewa; heirendt, laurent; satagopam, venkata; wu, guanming; riutta, anders; golebiewski, martin; owen, stuart; goble, carole; hu, xiaoming; overall, rupert w.; maier, dieter; bauch, angela; gyori, benjamin m.; bachman, john a.; vega, carlos; grouès, valentin; vazquez, miguel; porras, pablo; licata, luana; iannuccelli, marta; sacco, francesca; nesterova, anastasia; yuryev, anton; de waard, anita; turei, denes; luna, augustin; babur, ozgun; soliman, sylvain; valdeolivas, alberto; esteban-medina, marina; peña-chilet, maria; helikar, tomáš; puniya, bhanwar lal; modos, dezso; treveil, agatha; olbei, marton; de meulder, bertrand; dugourd, aurélien; naldi, aurelien; noel, vincent; calzone, laurence; sander, chris; demir, emek; korcsmaros, tamas; freeman, tom c.; augé, franck; beckmann, jacques s.; hasenauer, jan; wolkenhauer, olaf; wilighagen, egon l.; pico, alexander r.; evelo, chris t.; gillespie, marc e.; stein, lincoln d.; hermjakob, henning; d’eustachio, peter; saez-rodriguez, julio; dopazo, joaquin; valencia, alfonso; kitano, hiroaki; barillot, emmanuel; auffray, charles; balling, rudi; schneider, reinhard title: covid-19 disease map, a computational knowledge repository of sars-cov-2 virus-host interaction mechanisms date: 2020-10-27 journal: biorxiv doi: 10.1101/2020.10.26.356014 sha: doc_id: 262119 cord_uid: s6hc7fxs we hereby describe a large-scale community effort to build an open-access, interoperable, and computable repository of covid-19 molecular mechanisms the covid-19 disease map. 
we discuss the tools, platforms, and guidelines necessary for the distributed development of its contents by a multi-faceted community of biocurators, domain experts, bioinformaticians, and computational biologists. we highlight the role of relevant databases and text mining approaches in enrichment and validation of the curated mechanisms. we describe the contents of the map and their relevance to the molecular pathophysiology of covid-19 and the analytical and computational modelling approaches that can be applied to the contents of the covid-19 disease map for mechanistic data interpretation and predictions. we conclude by demonstrating concrete applications of our work through several use cases. the coronavirus disease 2019 (covid-19) pandemic due to severe acute respiratory syndrome coronavirus 2 (sars-cov-2) [1] has already resulted in the infection of over 40 million people worldwide, of whom one million have died 1 . the molecular pathophysiology that links sars-cov-2 infection to the clinical manifestations and course of covid-19 is complex and spans multiple biological pathways, cell types and organs [2, 3] . to gain the insights into this complex network, the biomedical research community needs to approach it from a systems perspective, collecting the mechanistic knowledge scattered across the scientific literature and bioinformatic databases, and integrating it using formal systems biology standards. with this goal in mind, we initiated a collaborative effort involving over 230 biocurators, domain experts, modelers and data analysts from 120 institutions in 30 countries to develop the covid-19 disease map, an open-access collection of curated computational diagrams and models of molecular mechanisms implicated in the disease [4] . 
to this end, we aligned the biocuration efforts of the disease maps community [5, 6] , reactome [7] , and wikipathways [8] and developed common guidelines utilising standardised encoding and annotation schemes, based on community-developed systems biology standards [9] [10] [11] , and persistent identifier repositories [12] . moreover, we integrated relevant knowledge from public repositories [13] [14] [15] [16] and text mining resources, providing a means to update and refine contents of the map. the fruit of these efforts was a series of pathway diagrams describing key events in the covid-19 infectious cycle and host response. we ensured that this comprehensive diagrammatic description of disease mechanisms is machine-readable and computable. this allows us to develop novel bioinformatics workflows, creating executable networks for analysis and prediction. in this way, the map is both human and machine-readable, lowering the communication barrier between biocurators, domain experts, and computational biologists significantly. computational modelling, data analysis, and their informed interpretation using the contents of the map have the potential to identify molecular signatures of disease predisposition and development, and to suggest drug repositioning for improving current treatments. covid-19 disease map is a collection of 41 diagrams containing 1836 interactions between 5499 elements, supported by 617 publications and preprints. the summary of diagrams available in the covid-19 disease map can be found online 2 in supplementary material 1. the map is a constantly evolving resource, refined and updated by ongoing efforts of biocuration, sharing and analysis. here, we report its current status. in section 2 we explain the set up of our community effort to construct the interoperable content of the resource, involving biocurators, domain experts and data analysts. 
in section 3 we demonstrate that the scope of the biological maps in the resource reflects the state of the art in the molecular biology of covid-19. next, we outline analytical workflows that can be used on the contents of the map, including initial, preliminary outcomes of two such workflows, discussed in detail as use cases in section 4. we conclude in section 5 with an outlook to further development of the covid-19 map and the utility of the entire resource in future efforts towards building and applying disease-relevant computational repositories. the covid-19 disease map project involves three main groups: (i) biocurators, (ii) domain experts, and (iii) analysts and modellers:
i. biocurators develop a collection of systems biology diagrams focused on the molecular mechanisms of sars-cov-2.
ii. domain experts refine the contents of the diagrams, supported by interactive visualisation and annotations.
iii. analysts and modellers develop computational workflows to generate hypotheses and predictions about the mechanisms encoded in the diagrams.
all three groups have an important role in the process of building the map, by providing content, refining it, and defining the downstream computational use of the map. figure 1 illustrates the ecosystem of the covid-19 disease map community, highlighting the roles of different participants, available format conversions, interoperable tools, and downstream uses. the information about the community members and their contributions is disseminated via the fairdomhub [17], so that content distributed across different collections can be uniformly referenced. the biocurators of the covid-19 disease map diagrams follow the guidelines developed by the community, and specific workflows of wikipathways [8] and reactome [7]. the biocurators build literature-based systems biology diagrams, representing the molecular processes implicated in the covid-19 pathophysiology, their complex regulation and the phenotypic outcomes.
these diagrams are main building blocks of the map, and are composed of biochemical reactions and interactions (further called altogether interactions) taking place between different types of molecular entities in various cellular compartments. as there are multiple teams working on related topics, biocurators can provide an expert review across pathways and across platforms. this is possible, as all platforms offer intuitive visualisation, interpretation, and analysis of pathway knowledge to support basic and clinical research, genome analysis, modelling, systems biology, and education. table 1 lists information about the created content. for more details see supplementary material 1. communicating to refine, interpret and apply covid-19 disease map diagrams. these diagrams are created and maintained by biocurators, following pathway database workflows or standalone diagram editors, and reviewed by domain experts. the content is shared via pathway databases or a gitlab repository; all can be enriched by integrated resources of text mining and interaction databases. the covid-19 disease map diagrams, available in layout-aware systems biology formats and integrated with external repositories, are available in several formats allowing a range of computational analyses, including network analysis and boolean, kinetic or multiscale simulations. both interactions and interacting entities are annotated following a uniform, persistent identification scheme, using either miriam or identifiers.org [18] , and the guidelines for annotations of computational models [19] . viral protein interactions are explicitly annotated with their taxonomy identifiers to highlight findings from strains other than sars-cov-2. moreover, tools like modelpolisher [20] , sbmlsqueezer [21] or memote 3 help to automatically complement the annotations in the sbml format and validate the model (see also supplementary material 2). 
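the uniform annotation scheme described above can be illustrated with a short python sketch. it assumes that annotations are written as compact identifiers of the form prefix:accession and resolved through identifiers.org; the helper names (to_identifiers_url, annotate) and the example entity are hypothetical and are not taken from the map or from any of the tools cited here.

```python
import re

# Base URL of the identifiers.org resolver for compact identifiers.
IDENTIFIERS_ORG = "https://identifiers.org/"

# A compact identifier is "prefix:accession", e.g. "uniprot:P0DTC2".
_COMPACT_ID = re.compile(r"^([A-Za-z0-9_.]+):(\S+)$")


def to_identifiers_url(compact_id: str) -> str:
    """Turn a compact identifier into a resolvable identifiers.org URL."""
    match = _COMPACT_ID.match(compact_id.strip())
    if match is None:
        raise ValueError(f"not a prefix:accession identifier: {compact_id!r}")
    prefix, accession = match.groups()
    # Prefixes are written in lowercase; accessions keep their case.
    return f"{IDENTIFIERS_ORG}{prefix.lower()}:{accession}"


def annotate(name, compact_ids, taxon_id=None):
    """Hypothetical helper: collect annotation URLs for one entity.

    If a taxonomy identifier is given, it is added as an extra annotation,
    mirroring the explicit taxonomy annotation of viral proteins.
    """
    urls = [to_identifiers_url(c) for c in compact_ids]
    if taxon_id is not None:
        urls.append(to_identifiers_url(f"taxonomy:{taxon_id}"))
    return {"name": name, "annotations": urls}


# Illustrative entity only: spike protein annotated with a UniProt
# accession and the NCBI taxonomy identifier of SARS-CoV-2.
example = annotate("S", ["uniprot:P0DTC2"], taxon_id=2697049)
```

keeping the taxonomy identifier as a separate annotation makes it straightforward to filter interactions by the strain in which they were observed.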
the knowledge on covid-19 mechanisms is rapidly evolving, as demonstrated by the rapid growth of the covid-19 open research dataset (cord-19), a source of scientific manuscript text and metadata on covid-19 and related coronavirus research [28]. cord-19 currently contains over 130,000 articles and preprints, over four times more than when it was introduced 10 . in such a quickly evolving environment, biocuration efforts need to be supported by other repositories of structured knowledge about molecular mechanisms relevant for covid-19, like molecular interaction databases, or text mining resources. contents of such repositories may suggest improvements in the existing covid-19 disease map diagrams, or establish a starting point for developing new pathways (see section "biocuration of database and text mining content"). interaction and pathway databases contain structured and annotated information on protein interactions or causal relationships. interaction databases focus on pairs of molecules, offering broad coverage of literature-reported findings, whereas pathway databases provide detailed descriptions of biochemical processes and their regulation, supported by diagrams. both types of resources can be a valuable input for covid-19 disease map biocurators, given the comparability of identifiers used for molecular annotations, and the reference to publications used for defining an interaction or building a pathway (table 2). text-mining approaches can help to sieve through such rapidly expanding literature with natural language processing (nlp) algorithms based on semantic modelling, ontologies, and linguistic analysis to automatically extract and annotate relevant sentences, biomolecules, and their interactions. this scope was recently extended to pathway figure mining: decoding pathway figures into their computable representations [30].
altogether, these automated workflows lead to the construction of knowledge graphs: semantic networks incorporating ontology concepts, unique biomolecule references, and their interactions extracted from abstracts or full-text documents [31] . the covid-19 disease map project integrates open-access text mining resources, indra [32] , biokb 15 , ailani covid-19 16 , and pathwaystudio 10 . all platforms offer keyword-based search allowing interactive exploration. additionally, the map benefits from an extensive protein-protein interaction network (ppi) 17 generated with a custom text-mining pipeline using opennlp 18 and gnormplus [33] . this pipeline was applied to the cord-19 dataset and the collection of medline abstracts associated with the genes in the sars-cov-2 ppi network [34] using the entrez gene reference-into-function (generif). for detailed descriptions of the resources, see supplementary material 3. molecular interactions from databases and knowledge graphs from text mining resources discussed above (from now on called altogether 'knowledge graphs') have a broad coverage at the cost of depth of mechanistic representation. this content can be used by the biocurators in the process of building and updating the systems biology focused diagrams. biocurators can use this content in three main ways: by visual exploration, by programmatic comparison, and by direct incorporation of the content. first, the biocurators can visually explore the contents of the knowledge graphs using available search interfaces to locate new knowledge and encode it in the diagrams. moreover, solutions like covidminer project 19 , pathwaystudio and ailani offer a visual representation of a group of interactions for a better understanding of their biological context, allowing search by interactions, rather than just isolated keywords. finally, indra and ailani offer assistant bots that respond to natural language queries and return meaningful answers extracted from knowledge graphs. 
second, programmatic access and reproducible exploration of the knowledge graphs is possible via data endpoints: sparql for biokb and application programming interfaces for indra, ailani, and pathway studio. users can programmatically submit keyword queries and retrieve functions, interactions, pathways, or drugs associated with submitted gene lists. this way, otherwise time-consuming tasks like an assessment of completeness of a given diagram, or search for new literature evidence, can be automated to a large extent. finally, biocurators can directly incorporate the content of knowledge graphs into sbml format using biokc [35] . additionally, the contents of the elsevier covid-19 pathway collection can be translated to sbgnml 20 preserving the layout of the diagrams. the sbgnml content can then be converted into other diagram formats used by biocurators (see section 2.3 below). the biocuration of the covid-19 disease map is distributed across multiple teams, using varying tools and associated systems biology representations. this requires a common approach to annotations of evidence, biochemical reactions, molecular entities and their interactions. moreover, the interoperability of layout-aware formats is needed for comparison and integration of the diagrams in the map. the covid-19 disease map diagrams are encoded in one of three layout-aware formats for standardised representation of molecular interactions: sbml 21 [36] [37] [38] , sbgnml [27] , and gpml [24] . these xml-based formats focus to a varying degree on user-friendly graphical representation, standardised visualisation, and support of computational workflows. for the detailed description of the formats, see supplementary material 1. 
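the programmatic comparison described above, in which a diagram is checked for completeness against a knowledge graph, can be sketched as a set comparison in python. this is a minimal illustration under stated assumptions: interactions are reduced to undirected pairs of entity names, the function names are hypothetical, and the toy edges are made up for the example rather than retrieved from indra, biokb, ailani or pathway studio.

```python
def normalise(edges):
    """Represent each interaction as an unordered pair, dropping self-loops."""
    return {frozenset(pair) for pair in edges if len(set(pair)) == 2}


def compare(diagram_edges, graph_edges):
    """Split knowledge-graph interactions absent from a diagram into two groups.

    Returns (within, new):
      within - both partners are already drawn in the diagram, so the
               interaction may simply be missing from it;
      new    - at least one partner is not in the diagram, so the
               interaction would extend its scope.
    """
    diagram = normalise(diagram_edges)
    graph = normalise(graph_edges)
    entities = set().union(*diagram) if diagram else set()
    absent = graph - diagram
    within = sorted(tuple(sorted(e)) for e in absent if e <= entities)
    new = sorted(tuple(sorted(e)) for e in absent if not e <= entities)
    return within, new


# Toy data, invented for illustration only.
diagram_edges = [("S", "ACE2"), ("CTSL", "S")]
graph_edges = [("ACE2", "S"), ("S", "TMPRSS2"), ("ACE2", "CTSL")]

within, new = compare(diagram_edges, graph_edges)
```

in practice the entity names would be replaced by the stable identifiers used for annotation, so that the same molecule is recognised across resources; the two result lists then give a biocurator a ranked starting point rather than an automatic update.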
each of these three languages has a different focus: sbml emphasizes standardised representation of the data model underlying molecular interactions, sbgnml provides standardised graphical representation of molecular processes, while gpml allows for a partially standardised representation of uncertain biological knowledge. nevertheless, all three formats are centered around molecular interactions, provide a constrained vocabulary to encode element and interaction types, encode layout of their diagrams and support stable identifiers for diagram components. these shared properties, supported by a common ontology 22 [39] , allow cross-format mapping and enable translation of key properties between the formats. therefore, when developing the contents of the map, biocurators use the tools they are familiar with, facilitating this distributed task. the covid-19 disease map community ecosystem of tools and resources (see figure 1 ) ensures interoperability between the three layout-aware formats for molecular mechanisms: sbml, sbgnml, and gpml. essential elements of this setup are tools capable of providing cross-format translation functionality [40, 41] and supporting harmonised visualisation processing. another essential translation interface is a representation of reactome pathways in wikipathways gpml [42] and sbml. the sbml export of reactome content has been optimised in the context of this project and facilitates integration with the other covid-19 disease map software components. the contents of the covid-19 disease map diagrams can be directly transformed into inputs of computational pipelines and data repositories. besides the direct use of sbml format in kinetic simulations, celldesigner sbml files can be transformed into sbml qual [43] using casq [44] , enabling boolean modelling-based simulations (see also supplementary material 3). 
in parallel, casq converts the diagrams to the sif format 23 , supporting pathway modelling workflows using simplified interaction networks. notably, the gitlab repository features an automated translation of stable versions of diagrams into sbml qual. finally, translation of the diagrams into xgmml format (the extensible graph markup and modelling language) using cytoscape [45] or ginsim [46] allows for network analysis and interoperability with molecular interaction repositories [47]. thanks to the community effort discussed above supported by a rich bioinformatics framework, we constructed the covid-19 disease map, focussing on the mechanisms known from other coronaviruses [48] and suggested by early experimental investigations [pmid:32511329]. then, we applied the analytical and modelling workflows to the contributed diagrams and associated interaction databases to propose initial map-based insights into covid-19 molecular mechanisms. the covid-19 disease map is an evolving repository of pathways affected by sars-cov-2 (figure 2). it is currently centred on molecular processes involved in sars-cov-2 entry, replication, and host-pathogen interactions. as mechanisms of host susceptibility, immune response, cell and organ specificity emerge, these will be incorporated into the next versions of the map. the covid-19 map represents the mechanisms in a "host cell". this follows literature reports on cell specificity of sars-cov-2 [3, [49] [50] [51] [52] [53]. some pathways included in the covid-19 map may be shared among different cell types, as for example the ifn-1 pathway found in dendritic cells, epithelial cells, and alveolar macrophages [54] [55] [56] [57] [58]. while at this stage, we do not address cell specificity explicitly in our diagrams, extensive annotations may allow identification of pathways relevant to the cell type of interest.
22 http://www.ebi.ac.uk/sbo/main/
23 http://www.cbmc.it/fastcent/doc/sifformat.htm
the sars-cov-2 infection process and covid-19 progression follow a sequence of steps (figure 3), starting from viral attachment and entry, which involve various dynamic processes on different time scales that are not captured in static representations of pathways. correlation of symptoms and potential drugs suggested to date helps downstream data exploration and drug target interpretation in the context of therapeutic interventions.

[figure labels recovered from extraction: human host; nasal mucosa; golgi; er; ace2; tmprss2; renin-angiotensin-aldosterone system (raas); integrative stress response; ilc1, ilc2, ilc3; natural killer cells; granulocytes; dendritic cells; cd4+ and cd8+ t cells.]

[figure 3 labels recovered from extraction: pathophysiology, from asymptomatic/pre-symptomatic to critical (nasal and respiratory epithelium, alveoli, vascular endothelium; lung, heart, kidney); anosmia, ageusia, cough, fever, diarrhea; shortness of breath; ards and complications; sirs, shock, multiple organ dysfunction; vaccine, pre-exposure prophylaxis, antivirals; oxygen therapy, systemic and ventilation support; virus-host cell interactions and host response covered by the disease map (cellular stress, apoptosis, host raas; dendritic cells; nk cells; monocytes and macrophages; t cells, th1 and th2 response; b cells, antibody production). abbreviations: ards, acute respiratory distress syndrome; raas, renin-angiotensin-aldosterone system; sirs, systemic inflammatory response syndrome.]

transmission of sars-cov-2 primarily occurs through contact with respiratory droplets, airborne transmission, and contact with contaminated surfaces [66] [67] [68]. upon contact with the respiratory epithelium, the virus infects cells mostly by binding the spike surface glycoprotein (s) to angiotensin-converting enzyme 2 (ace2) with the help of serine protease tmprss2 [69] [70] [71] [72]. importantly, recent results suggest viral entry using other receptors of lungs and the immune system [73, 74]. once attached, sars-cov-2 can enter cells either by direct fusion of the virion and cell membranes in the presence of proteases tmprss2 and furin or by endocytosis in their absence. regardless of the entry mechanism, the s protein has to be activated to initiate the plasma or endosome membrane fusion process. at the cell membrane, s protein is activated by tmprss2 and furin, whereas in the endosome it is activated by cathepsin b (ctsb) and cathepsin l (ctsl) [71, 75]. activated s promotes fusion of the cell or endosome membrane with the virion membrane [76], and then the nucleocapsid is injected into the cytoplasm. these mechanisms are represented in the corresponding diagrams of the map 25 . within the host cell, sars-cov-2 hijacks the rough endoplasmic reticulum (rer)-linked host translational machinery. it then synthesises viral proteins replicase polyprotein 1a (pp1a) and replicase polyprotein 1ab (pp1ab) directly from the virus (+)genomic rna (grna) [48, 77]. through a complex cascade of proteolytic cleavages, pp1a and pp1ab give rise to 16 non-structural proteins (nsps) [78] [79] [80]. most of these nsps collectively form the replication transcription complex (rtc) that is anchored to the membrane of the double-membrane vesicle [78, 81].

endoplasmic reticulum stress and unfolded protein response

as discussed above, the virus hijacks the er to replicate. production of large amounts of viral proteins exceeds the protein folding capacity of the er, creating an overload of unfolded proteins. as a result, the unfolded protein response (upr) pathways are triggered to ensure er homeostasis, using three main signalling routes of upr via perk, ire1, and atf6 [87]. their role is to mitigate the misfolded protein load and reduce oxidative stress.
the resulting protein degradation is coordinated with a decrease in protein synthesis via eif2alpha phosphorylation and induction of protein folding genes via the transcription factor xbp1 [88] . when the er is unable to restore its function, it can trigger cell apoptosis [89, 90] . the results are er stress and activation of the upr. the expression of some human coronavirus (hcov) proteins during infection, in particular the s glycoprotein, may induce activation of the er stress in the host cells [91] . based on sars-cov results, this may lead to activation of the perk [92], ire1 and in an indirect manner, of the atf6 pathways [93] . processes of degrading malfunctioning proteins and damaged organelles, including the ubiquitin-proteasome system (ups) and autophagy [94] are essential to maintain energy homeostasis and prevent cellular stress [95, 96] . autophagy is also involved in cell defence, including direct destruction of the viruses via virophagy, presentation of viral antigens, and inhibition of excessive inflammatory reactions [97, 98] . sars-cov-2 directly affects the process of ups-based protein degradation, as indicated by the host-virus interactome dataset published recently [34] . this mechanism may be a defence against viral protein degradation [99] . the map describes in detail the nature of this interaction, namely the impact of orf10 virus protein on the cul2 ubiquitin ligase complex and its potential substrates. interactions between sars-cov-2 and host autophagy pathways are inferred based on results from other covs. a finding that covs use double-membrane vesicles and lc3-i for replication [100] may suggest that the virus induces autophagy, possibly in atg5-dependent manner [101] , although some evidence points to the contrary [102] . also, the cov nsp6 restricts autophagosome expansion, compromising the degradation of viral components [103] . 
recently revealed mutations in nsp6 [104] indicate its importance, although the exact effect of the mutations remains unknown. based on the connection between autophagy and the endocytic pathway of the virus replication cycle [105] , autophagy modulation was suggested as a potential therapy strategy, either pharmacologically [96, [105] [106] [107] , or via fasting [108] . apoptosis, a synonym for programmed cell death, is triggered by virus-host interaction upon infection, as the early death of the virus-infected cells may prevent viral replication. many viruses block or delay cell death by expressing anti-apoptotic proteins to maximize the production of viral progeny [109] . in turn, apoptosis induction at the end of the viral replication cycle might assist in viral dissemination while reducing an inflammatory response. for instance, sars-cov-2 [110] and mers [111] are able to invoke apoptosis in lymphocytes, compromising the immune system. apoptosis follows two major pathways [112] , called extrinsic and intrinsic. extrinsic signals are transmitted by death ligands and their receptors (e.g., fasl and tnf-alpha). activated death receptors recruit adaptors like fadd and tradd, and initiator procaspases like caspase-8, leading to cell death with the help of effector caspases-3 and 7 [113, 114] . in turn, the intrinsic pathway involves mitochondria-related members of the bcl-2 protein family. cellular stress causes bcl-2 proteins-mediated release of cytochrome c from the mitochondria into the cytoplasm. cytochrome c then forms a complex with apaf1 and recruits initiator procaspase-9 to form the apoptosome, leading to the proteolytic activation of caspase-9. activated caspase-9 can now initiate the caspase cascade by activating effector caspases 3 and 7 [114] . the intrinsic pathway is modulated by sars-cov molecules [115, 116] . 
as intrinsic apoptosis involves mitochondria, its activity may also be exacerbated by sars-cov-2 disruptions of the electron transport chain, mitochondrial translation, and transmembrane transport [34] . the resulting mitochondrial dysfunction may lead to increased release of reactive oxygen species and pro-apoptotic factors. another vital crosstalk is that of the intrinsic pathway with the pi3k-akt pro-survival pathway. activated akt can phosphorylate and inactivate various pro-apoptotic proteins, including bad and caspase-9 [117] . sars-cov uses pi3k-akt signalling cascade to enhance infection [118] . moreover, sars-cov could affect apoptosis in a cell-type-specific manner [119, 120] . sars-cov structural proteins s, e, m, n, and accessory proteins 3a, 3b, 6, 7a, 8a, and 9b have been shown to act as crucial effectors of apoptosis in vitro. structural proteins seem to affect mainly the intrinsic apoptotic pathway, with p38 mapk and pi3k/akt pathways regulating cell death. accessory proteins can induce apoptosis via different cascades and in a cell-specific manner [114] . sars-cov e and 7a protein were shown to activate the intrinsic pathway by blocking anti-apoptotic bcl-xl localized to the er [121] . sars-cov m protein and the ion channel activity of e and 3a were shown to interfere with pro-survival signalling cascades [114, 122] . the viral replication and the consequent immune and inflammatory responses cause damage to the epithelium and pulmonary capillary vascular endothelium and activate the main intracellular defence mechanisms, as well as the humoral and cellular immune responses. resulting cellular stress and tissue damage [123, 124] impair respiratory capacity and lead to acute respiratory distress syndrome (ards) [61, 125, 126] . 
hyperinflammation is a known complication, causing widespread damage, organ failure, and death, followed by a not yet completely understood rapid increase of cytokine levels (cytokine storm) [127] [128] [129] , and acute ards [130] . other reported complications, such as coagulation disturbances and thrombosis are associated with severe cases, but the specific mechanisms are still unknown [64, [131] [132] [133] , although some reports suggest that covid-19 coagulopathy has a distinct profile [134] . the sars-cov-2 infection disrupts the coagulation cascade and is frequently associated with hyperinflammation, renin-angiotensin system (ras) imbalance and intravascular coagulopathy [132, [135] [136] [137] . hyperinflammation leads in turn to detrimental hypercoagulability and immunothrombosis, leading to microvascular thrombosis with further organ damage [138] . importantly, ras is influenced by risk factors of developing severe forms of covid-19 [139] [140] [141] . ace2, used by sars-cov-2 for host cell entry, is a regulator of ras and is widely expressed in the affected organs [142] . the main function of ace2 is the conversion of angii to angiotensin 1-7 (ang1-7), and these two angiotensins trigger the counter-regulatory arms of ras [143] . the signalling via angii and its receptor agtr1, elevated in the infected [142, 144] , induces the coagulation cascade leading to microvascular thrombosis [145] , while ang1-7 and its receptor mas1 attenuate these effects [143] . the innate immune system detects specific pathogen-associated molecular patterns (pamps), through pattern recognition receptors (prrs). detection of sars-cov-2 is mediated through receptors that recognise double-stranded and single-stranded rna in the endosome during endocytosis of the virus particle, or in the cytoplasm during the viral replication. 
these receptors mediate the activation of transcription factors such as ap1, nfkappab, irf3, and irf7, responsible for the transcription of antiviral proteins, in particular, interferon-alpha and beta [146, 147] . sars-cov-2 reduces the production of type i interferons to evade the immune response [49] . the detailed mechanism is not clear yet; however, sars-cov m protein inhibits the irf3 activation [148] and suppresses nfkappab and cox2 transcription. at the same time, sars-cov n protein activates nfkappab [149] , so the overall impact is unclear. these pathways are also negatively regulated by sars-cov nsp3 papain-like protease domain (plpro) [150] . the map contains the initial recognition process of the viral particle by the innate immune system and the viral mechanisms to evade the immune response. it provides the connection between virus entry (detecting the viral endosomal patterns), its replication cycle (detecting cytoplasmic viral patterns), and the effector pathways of pro-inflammatory cytokines, especially of the interferon type i class. the latter seems to play a crucial but complex role in covid-19 pathology: both negative [151, 152] and positive effects [3, 153] of interferons on virus replication have been reported.
interferon type i signalling
interferons (ifns) are central players in the antiviral immune response of the host cell [55] , specifically affected by sars-cov-2 [154] [155] [156] [157] . type i ifns are induced upon recognition of viral pamps by various host prrs [48] as discussed earlier. the ifn-i pathway diagram represents the activation of tlr7 and ifnar and the subsequent recruiting of adaptor proteins and the downstream signalling cascades regulating key transcription factors including irf3/7, nf-kappab, ap-1, and isre [48, 158] . further, the map shows irf3-mediated induction of ifn-i, affected by the sars-cov-2 proteins. 
sars-cov nsp3 and orf6 interfere with irf3 signalling [159, 160] and sars-cov m, n, nsp1 and nsp3 act as interferon antagonists [48, 150, 158, 161, 162] . moreover, coronaviruses orf3a, orf6 and nsp1 proteins can repress interferon expression and stimulate the degradation of ifnar1 and stat1 during the unfolded protein response (upr) [163, 164] . another mechanism of viral rna recognition is rig-like receptor signalling [58] , leading to sting activation [165] , and via the recruitment of traf3, tbk1 and ikkepsilon to phosphorylation of irf3 [56] . this in turn induces the transcription of ifns alpha, beta and lambda [166] . sars-cov viral papain-like-proteases, contained within the nsp3 and nsp 16 proteins, inhibit sting and the downstream ifn secretion [167] . in line with this hypothesis, sars-cov-2 infection results in a unique inflammatory response defined by low levels of ifn-i and high expression of cytokines [58, 168] . the ifnlambda diagram describes the ifnl receptor signaling cascade [169] , including jak-stat signaling and the induction of interferon stimulated genes, which encode antiviral proteins [170] . the interactions of sars-cov-2 proteins with the ifnl pathway are based on the literature [171] or sars-cov homology [58] . metabolic pathways govern the immune microenvironment by modulating the availability of nutrients and critical metabolites [172] . infectious entities reprogram host metabolism to create favourable conditions for their reproduction [173] . sars-cov-2 proteins interact with a variety of immunometabolic pathways, several of which are described below. heme catabolism is a well-known anti-inflammatory system in the context of infectious and autoimmune diseases [174, 175] . the main effector of this pathway, heme oxygenase-1 (hmox1) was found to interact with sars-cov-2 orf3a, although the nature of this interaction remains ambiguous [34, 176] . 
hmox1 cleaves heme into carbon monoxide, biliverdin (then reduced to bilirubin), and ferrous iron [pmid:31396090]. biliverdin, bilirubin, and carbon monoxide possess cytoprotective properties, and have shown promise as immunomodulatory therapeutics [177, 178] . importantly, activation of hmox1 also inhibits the nlrp3 inflammasome [178] [179] [180] , which is a pro-inflammatory and prothrombotic multiprotein system [181] highly active in covid-19 [182] [183] [184] . it mediates production of the pro-inflammatory cytokines il-1b and il-18 via caspase-1 [181] . the sars-cov orf3a, e, and orf8a incite the nlrp3 inflammasome [185] [186] [187] [188] . still, the potential of the hmox1 pathway to fight covid-19 inflammation remains to be tested [176, 189, 190] despite promising results in other models of inflammation [176, 178, [191] [192] [193] . the tryptophan-kynurenine pathway is closely related to heme metabolism. the rate-limiting step of this pathway is catalysed by the indoleamine 2,3 dioxygenase enzymes (ido1 and ido2) in dendritic cells, macrophages, and epithelial cells in response to inflammatory cytokines like ifn-gamma, ifn-1, tgf-beta, tnf-alpha, and il-6 [194] [195] [196] . crosstalk with the hmox1 pathway also increases the expression of ido1 and hmox1 in a feed-forward manner. metabolomics analyses from severe covid-19 patients revealed enrichment of kynurenines and depletion of tryptophan, indicating robust activation of ido enzymes [197, 198] . depletion of tryptophan [173, 199, 200] and kynurenines and their derivatives affect the proliferation and immune response of a range of t cells [176, [201] [202] [203] [204] [205] . however, despite high levels of kynurenines in covid-19, cd8+ t-cells and th17 cells are enriched in lung tissue, and t-regulatory cells are diminished [206] . this raises the question of whether and how the immune response elicited in covid-19 evades suppression by the kynurenine pathway. 
the sars-cov-2 protein nsp14 interacts with three human proteins: gla, sirt5, and impdh2 [34] . the galactose metabolism pathway, including the gla enzyme [207] , is interconnected with amino sugar and nucleotide sugar metabolism. sirt5 is a nad-dependent desuccinylase and demalonylase regulating serine catabolism, oxidative metabolism and apoptosis initiation [208] [209] [210] . moreover, nicotinamide metabolism regulated by sirt5 occurs downstream of the tryptophan metabolism, linking it to the pathways discussed above. finally, impdh2 is the rate-limiting enzyme in the de novo synthesis of gtp, allowing regulation of purine metabolism and downstream potential antiviral targets [211, 212] . the pyrimidine synthesis pathway, tightly linked to purine metabolism, affects viral dna and rna synthesis. pyrimidine deprivation is a host-targeted antiviral defence mechanism, which blocks viral replication in infected cells and can be regulated pharmacologically [213] [214] [215] . it appears that components of the dna damage response connect the inhibition of pyrimidine biosynthesis to the interferon signalling pathway, probably via sting-induced tbk1 activation that amplifies interferon response to viral infection, discussed above. inhibition of de novo pyrimidine synthesis may have beneficial effects on the recovery from covid-19 [215] ; however, this may happen only in a small group of patients. covid-19 pathways featured in the previous section cover mechanisms reported so far. still, certain aspects of the disease were not represented in detail because of their complexity, namely cell-type-specific immune response, and susceptibility features. their mechanistic description is of great importance, as suggested by clinical reports on the involvement of these pathways in the molecular pathophysiology of the disease. the mechanisms outlined below will be the next targets in our curation roadmap. 
cell type-specific immune response
covid-19 causes a serious imbalance in multiple populations of immune cells. some studies report that covid-19 patients have a significant decrease of peripheral cd4+ and cd8+ cytotoxic t lymphocytes (ctls), b cells, nk cells, as well as higher levels of a broad range of cytokines and chemokines [128, [216] [217] [218] [219] . the disease causes functional exhaustion of cd8+ ctls and nk cells, induced by sars-cov-2 s protein and by excessive pro-inflammatory cytokine response [217, 220] . moreover, the ratio of naïve-to-memory helper t-cells, as well as the decrease of t regulatory cells, correlate with covid-19 severity [206] . conversely, high levels of th17 and cytotoxic cd8+ t-cells have been found in the lung tissue [221] . pulmonary recruitment of lymphocytes into the airways may explain the lymphopenia and the increased neutrophil-lymphocyte ratio in peripheral blood found in covid-19 patients [216, 222, 223] . in this regard, an abnormal increase of the th17:treg cell ratio may promote the release of pro-inflammatory cytokines and chemokines, increasing disease severity [224] . sars-cov-2 infection is associated with increased morbidity and mortality in individuals with underlying chronic diseases or a compromised immune system [225] [226] [227] [228] . groups of increased risk are men, pregnant and postpartum women, and individuals with high occupational viral exposure [229] [230] [231] . other susceptibility factors include the abo blood groups [232] [233] [234] [235] [236] [237] [238] [239] [240] and respiratory conditions [241] [242] [243] [244] [245] [246] . importantly, age is one of the key aspects contributing to the severity of the disease. the elderly are at high risk of developing severe or critical disease [227, 247] . 
age-related elevated levels of pro-inflammatory cytokines (inflammation) [247] [248] [249] [250] , immunosenescence and cellular stress of ageing cells [125, 227, 247, 251, 252] may contribute to the risk. in contrast, children are generally less likely to develop severe disease [253, 254] , with the exception of infants [125, [255] [256] [257] . however, some previously healthy children and adolescents can develop a multisystem inflammatory syndrome following sars-cov-2 infection [258] [259] [260] [261] [262] . several genetic factors have been proposed and identified to influence susceptibility and severity, including the ace2 gene, hla locus, errors influencing type i ifn production, tlr pathways, myeloid compartments, as well as cytokine polymorphisms [156, 235, [263] [264] [265] [266] [267] [268] [269] . we aim to connect the susceptibility features to specific molecular mechanisms and better understand the contributing factors. this can lead to a series of testable hypotheses, including the role of vitamin d counteracting pro-inflammatory cytokine secretion [270] [271] [272] in an age-dependent manner [247, 273] , and modifying the severity of the disease. another example of a testable hypothesis may be that the immune phenotype associated with asthma inhibits pro-inflammatory cytokine production and modifies gene expression in the airway epithelium, protecting against severe covid-19 [245, 246, 274] . in order to understand complex and often indirect dependencies between different pathways and molecules, we need to combine computational and data-driven analyses. standardised representation and programmatic access to the contents of the covid-19 disease map enable the development of reproducible analytical and modelling workflows. here, we discuss the range of possible approaches and demonstrate preliminary results, focusing on interoperability, reproducibility, and applicability of the methods and tools. 
our goal is to work on the computational challenges as a community, involving the biocurators and domain experts in the analysis of the covid-19 disease map and rely on their feedback to evaluate the outcomes. in this way, we aim to identify approaches to tackle the complexity and the size of the map, proposing a state-of-the-art framework for robust analysis, reliable models, and useful predictions. visualisation of omics data can help contextualise the map with experimental data creating data-specific blueprints. these blueprints could be used to highlight parts of the map that are active in one condition versus another (treatment versus control, patient versus healthy, normal versus infected cell, etc.). combining information contained in multiple omics platforms can make patient stratification more powerful, by reducing the number of samples needed or by augmenting the precision of the patient groups [275, 276] . approaches that integrate multiple data types without the accompanying mechanistic diagrams [277] [278] [279] produce patient groupings that are difficult to interpret. in turn, classical pathway analyses often produce long lists mixing generic and cell-specific pathways, making it challenging to pinpoint relevant information. using disease maps to interpret omics-based clusters addresses the issues related to contextualised visual data analytics. footprints are signatures of a molecular regulator determined by the expression levels of its targets [280] . for example, a footprint can contain targets of a transcription factor (tf) or peptides phosphorylated by a kinase. combining multiple omics readouts and multiple measurements can increase the robustness of such signatures. nevertheless, an essential component is the mechanistic description of the targets of a given regulator, allowing computation of its footprint. 
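the footprint idea described above can be illustrated with a minimal sketch. it is not the viper or dorothea implementation: it assumes a simplified score, the mean of target expression changes signed by the mode of regulation, and the function name, gene names and numbers are hypothetical.

```python
def footprint_score(expression, regulon):
    """Estimate regulator activity from the expression of its targets.

    expression: dict mapping gene -> expression change (e.g. a z-score).
    regulon: dict mapping target gene -> mode of regulation
             (+1 if the regulator activates it, -1 if it represses it).
    Returns the signed mean over measured targets, or None if no target
    of the regulon was measured. This is a simplification, not VIPER.
    """
    hits = [mode * expression[gene]
            for gene, mode in regulon.items()
            if gene in expression]
    if not hits:
        return None
    return sum(hits) / len(hits)


# Hypothetical regulon of a transcription factor with three targets.
example_regulon = {"target_a": 1, "target_b": 1, "target_c": -1}
# Hypothetical expression changes after infection.
example_expression = {"target_a": 2.0, "target_b": 1.0,
                      "target_c": -3.0, "unrelated": 5.0}

example_score = footprint_score(example_expression, example_regulon)
```

a positive score suggests that the regulator is more active in the condition of interest, because activated targets go up and repressed targets go down; genes outside the regulon do not contribute.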
with available sars-cov-2 related omics and interaction datasets [281] , it is possible to infer which tfs and signalling pathways are affected upon infection [282] . combining the covid-19 disease map regulatory interactions with curated collections of tf-target interactions like dorothea [283] will provide a contextualised evaluation of the effect of sars-cov-2 infection at the tf level. the virus-host interactome is a network of virus-human protein-protein interactions (ppis) that can help understanding the mechanisms of disease [34, [284] [285] [286] . it can be expanded by merging virus-host ppi data with human ppi and protein data [287] to discover clusters of interactions indicating human mechanisms and pathways affected by the virus [288] . these clusters first of all can be interpreted at the mechanistic level by visual exploration of covid-19 disease map diagrams. in addition, these clusters can potentially reveal additional pathways to add to the covid-19 disease map (e.g., e protein interactions or tgfbeta diagrams) or suggest new interactions to introduce into the existing diagrams. computational modelling is a powerful approach that enables in silico experiments, produces testable hypotheses, helps elucidate regulation and, finally, can suggest via predictions novel therapeutic targets and candidates for drug repurposing. mechanistic models of pathways allow bridging variations at the scale of molecular activity to variations at the level of cell behaviour. this can be achieved by coupling the molecular interactions of a given pathway with its endpoint and by contextualising the molecular activity using omics datasets. hipathia is such a method, processing transcriptomic or genomic data to estimate the functional profiles of a pathway conditioned by the data studied and linkable to phenotypes such as disease symptoms or other endpoints of interest [289, 290] . 
moreover, such mechanistic modelling can be used to predict the effect of interventions as, for example, the effect of targeted drugs [291] . hipathia integrates directly with the diagrams of the covid-19 map using the sif format provided by casq (see section 2.3), as well as with the associated interaction databases (see section 2.2). the drawback of approaches like hipathia is their computational complexity, limiting the size of the diagrams they can process. an approach to large-scale mechanistic pathway modelling is to transform them into causal networks. carnival [292] combines the causal representation of networks [13] with transcriptomics, phosphoproteomics, or metabolomics data [280] to contextualise cellular networks and extract mechanistic hypotheses. the algorithm identifies a set of coherent causal links connecting upstream drivers such as stimulations or mutations to downstream changes in transcription factor activities. analysis of the dynamics of molecular networks is necessary to understand their behaviour and deepen our understanding of crucial regulators behind disease-related pathophysiology. the discrete modelling framework provides this possibility. covid-19 disease map diagrams, translated to sbml qual (see section 2.3), can be directly imported by tools like cell collective [293] or ginsim [46] for analysis. preserving annotations and layout information ensures transparency and reusability of the models. importantly, cell collective is an online user-friendly modelling platform that provides features for real-time in silico simulations and analysis of complex signalling networks. the platform allows users without computational background to simulate or analyse models to generate and prioritise new hypotheses. references and layout are used for model visualisation, supporting the interpretation of the results. the mathematics and code behind each model, however, remain accessible to all users. 
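as an illustration of the discrete modelling framework, the sketch below simulates a toy boolean network with synchronous updates until it reaches a stable state or a cycle. the rules are a hypothetical simplification of the intrinsic apoptosis pathway described earlier (apaf1 is omitted) and are not taken from a covid-19 disease map diagram; the function names are likewise assumptions.

```python
# Hypothetical logical rules, loosely following the text: AKT inactivates
# BAD and caspase-9, BAD inhibits BCL2, BCL2 blocks cytochrome c release.
RULES = {
    "stress": lambda s: s["stress"],                  # input, kept fixed
    "akt":    lambda s: s["akt"],                     # input, kept fixed
    "bad":    lambda s: not s["akt"],
    "bcl2":   lambda s: not s["bad"],
    "cytc":   lambda s: s["stress"] and not s["bcl2"],
    "casp9":  lambda s: s["cytc"] and not s["akt"],
    "casp3":  lambda s: s["casp9"],
}


def step(state):
    """Synchronous update: every node is recomputed from the old state."""
    return {node: bool(rule(state)) for node, rule in RULES.items()}


def run_to_attractor(state, max_steps=50):
    """Iterate until a state repeats; return the repeating states."""
    seen = []
    for _ in range(max_steps):
        if state in seen:
            return seen[seen.index(state):]
        seen.append(state)
        state = step(state)
    raise RuntimeError("no attractor found within max_steps")


def initial_state(stress, akt):
    """All internal nodes start inactive; only the inputs are set."""
    state = {node: False for node in RULES}
    state["stress"], state["akt"] = stress, akt
    return state


stressed = run_to_attractor(initial_state(stress=True, akt=False))
rescued = run_to_attractor(initial_state(stress=True, akt=True))
```

in this toy model, cellular stress without akt activity settles into a single stable state with the effector caspase active, while active akt keeps it switched off; dedicated tools perform the same kind of analysis on much larger models.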
in turn, ginsim is a tool providing a wide range of analysis methods, including efficient identification of the states of convergence of a given model (attractors). model reduction functionality can also be employed to facilitate the analysis of large-scale models. viral infection and immune response are complex processes that span many different scales, from molecular interactions to multicellular behaviour. the modelling and simulation of such complex scenarios require a dedicated multiscale computational architecture, where multiple models run in parallel and communicate with each other to capture cellular behaviour and intercellular communications. multiscale agent-based models simulate processes taking place at different time scales, e.g., diffusion, cell mechanics, cell cycle, or signal transduction [294] , proposed also for covid-19 [295] . physiboss [296] allows such simulation of intracellular processes by combining the computational framework of physicell [297] with the maboss [298] tool for stochastic simulation of logical models to study transient effects and perturbations [299] . implementation of detailed covid-19 signalling models in the physiboss framework may help to better understand complex dynamics of multi-scale processes such as interactions and crosstalk between immune system components and the host cell in covid-19. in this case study, we combine computational approaches discussed above and present results derived from omics data analysis on the covid-19 disease map diagrams. we measured the effect of covid-19 at the transcription factor (tf) activity level by applying viper [300] combined with dorothea regulons [283] on rna-seq datasets of the sars-cov-2 infected cell line [168] . then, we mapped the tfs normalised enrichment score (nes) on the interferon type i signalling pathway diagram of the covid-19 disease map using the sif files generated by casq (see section 2.3). 
as highlighted in figure 4 , our manually curated pathway included some of the most active tfs after sars-cov-2 infection, such as stat1, stat2 , irf9 and nfkb1. these genes are well known to be involved in cytokine signalling and first antiviral response [301, 302] . interestingly, they are located downstream of various viral proteins (e, s, nsp1, orf7a and orf3a) and members of the mapk pathway (mapk8, mapk14 and map3k7). sars-cov-2 infection is known to promote mapk activation, which mediates the cellular response to pathogenic infection and promotes the production of proinflammatory cytokines [281] . altogether, we identified signaling events that may capture the mechanistic response of the human cells to the viral infection. in this use case, the hipathia [289] algorithm was used to calculate the level of activity of the subpathways from the covid-19 apoptosis diagram, with the aim to evaluate whether covid-19 disease map diagrams can be used for pathway modelling approach. to this end, a public rna-seq dataset from human sars-cov-2 infected lung cells (geo gse147507) was used. first, the rna-seq gene expression data was normalized with the trimmed mean of m values (tmm) normalization [303] , then rescaled to range [0;1] for the calculation of the signal and normalised using quantile normalisation [304] . the normalised gene expression values were used to calculate the level of activation of the subpathways, then a case/control contrast with a wilcoxon test was used to assess differences in signaling activity between the two conditions. the activation levels have been calculated using transcriptional data from gse147507 and hipathia mechanistic pathway analysis algorithm. each node represents a gene (ellipse), a metabolite (circle) or a function (square). the pathway is composed of circuits from a receptor gene/metabolite to an effector gene/function, with interactions simplified to inhibitions or activations (see section 2.3, sif format). 
significantly deregulated circuits are highlighted by color arrows (red: activated in infected cells). the color of the node corresponds to the level of differential expression of each node in sars-cov-2 infected cells vs normal lung cells. blue: down-regulated elements, red: up-regulated elements, white: elements with not statistically significant differential expression. hipathia calculates the overall circuit activation, and can indicate deregulated interaction even if interacting elements are not individually differentially expressed. results of the apoptosis pathway analysis can be seen in figure 5 and supplementary material 5. importantly, hipathia calculates the overall activation of circuits (series of causally connected elements), and can indicate deregulated interactions resulting from a cumulative effect, even if interacting elements are not individually differentially expressed. when discussing differential activation, we refer to the circuits, while individual elements are mentioned as differentially expressed. the analysis shows an overactivation of several circuits, specifically the one ending in the effector protein bax. this overactivation seems to be led by the overexpression of the bad protein, inhibiting bcl2-mcl1-bcl2l1 complex, which in turn inhibits bax. indeed, sars-cov-2 infection can invoke caspase8-induced apoptosis [305] , where bax together with the ripoptosome/caspase-8 complex, may act as a pro-inflammatory checkpoint [306] . this result is supported by studies in sars-cov, showing bax overexpression following infection [121, 307] . overall, our findings recapitulate reported outcomes. with evolving contents of the covid-19 disease map and new omics data becoming available, new mechanism-based hypotheses can be formulated. in the covid19 disease map community we strive to produce interoperable content and seamless downstream analyses, translating the graphic representations of the molecular mechanisms to executable models. 
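the preprocessing and case/control contrast described for the hipathia use case can be sketched in a few lines. this is not the hipathia code: tmm normalisation is assumed to have been done upstream, the order of quantile normalisation and rescaling is an assumption, ties are ignored in the quantile step, and the p-value uses a plain normal approximation without tie or continuity correction. all function names and numbers are hypothetical.

```python
import math


def quantile_normalise(matrix):
    """matrix[g][s] holds the value of gene g in sample s.

    Each sample is given the same distribution: the value at rank r is
    replaced by the mean of the rank-r values across all samples.
    """
    n_genes, n_samples = len(matrix), len(matrix[0])
    columns = [[row[s] for row in matrix] for s in range(n_samples)]
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(col[r] for col in sorted_cols) / n_samples
                  for r in range(n_genes)]
    result = [[0.0] * n_samples for _ in range(n_genes)]
    for s, col in enumerate(columns):
        order = sorted(range(n_genes), key=lambda g: col[g])
        for rank, g in enumerate(order):
            result[g][s] = rank_means[rank]
    return result


def rescale_01(matrix):
    """Min-max rescaling of the whole matrix to the range [0, 1]."""
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in matrix]
    return [[(v - lo) / (hi - lo) for v in row] for row in matrix]


def rank_sum_test(case, control):
    """Wilcoxon rank-sum (Mann-Whitney U) with a normal approximation."""
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0
            for a in case for b in control)
    n1, n2 = len(case), len(control)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return u, p_value


# Hypothetical expression matrix: three genes (rows), two samples (columns).
normalised = rescale_01(quantile_normalise([[1, 4], [2, 6], [3, 5]]))
# Hypothetical circuit activity values in infected and control samples.
u_stat, p_val = rank_sum_test([0.9, 0.8, 0.7], [0.1, 0.2, 0.3])
```

in a real analysis the contrast would be applied to the activity computed for each circuit rather than to raw expression values, and p-values would be corrected for multiple testing.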
we are also aware of parallel efforts towards modelling of covid-19 mechanisms, which we plan to include as a part of our ecosystem. these efforts are not yet directly interoperable with the covid-19 disease map content, as they either use different notation schemes or require parameters not covered by our biocuration guidelines. at the same time, they provide a complementary source of information and the opportunity to create an even broader toolset to tackle the pandemic. the modified edinburgh pathway notation (mepn) scheme [308] allows for the detailed visual encoding of molecular processes using the yed platform, and its diagrams are constructed in such a way that they also function as petri nets. these can then be used directly for activity simulations using the biolayout network analysis tool [309] . the current mepn covid-19 model details the replication cycle of sars-cov-2, integrated with a range of host defence systems, e.g. type 1 interferon signalling, tlr receptors, oas systems, etc. simulations of altered gene expression, interactions with drug targets or changes to interaction kinetics can be represented by introducing relevant transitions or nodes directly in the diagram. currently, models constructed in mepn can be saved as sbgn.ml files; however, there is a loss of information, and the associated computational features are not compatible with other covid-19 disease map diagrams (which are not modelled as petri nets). the covid-19 disease map can support dynamic kinetic modelling to quantify the behaviour of different pathways and evaluate the dynamic effects of perturbations. however, it is necessary to assign a kinetic equation or a rate law to every reaction in the diagram to be analysed. this process is challenging because any given reaction depends on its cellular and physiological context, which makes it difficult to parameterise. software support of tools like sbmlsqueezer [21] and reaction kinetics databases like sabio-rk [310] are indispensable in this effort. 
nevertheless, the most critical factor is the availability of experimentally validated parameters that can be reliably applied in sars-cov-2 modelling scenarios. the covid-19 disease map is both a knowledgebase and a computational repository. on the one hand, it is a graphical, interactive representation of disease-relevant molecular mechanisms linking many knowledge sources. on the other hand, it is a computational resource of curated content for graph-based analyses and disease modelling. it offers a shared mental map for understanding the dynamic nature of the disease at the molecular level and also its dynamic propagation at a systemic level. thus, it provides a platform for a precise formulation of models, accurate data interpretation, monitoring of therapy, and potential for drug repositioning. the covid-19 disease map spans three platforms and assembles diagrams describing molecular mechanisms of covid-19. these diagrams are grounded in the relevant published sars-cov-2 research, completed where necessary by mechanisms discovered in related beta-coronaviruses. this unprecedented effort of community-driven biocuration resulted in over forty diagrams with molecular resolution constructed since march 2020. it demonstrates that expertise in biocuration, clear guidelines and text mining solutions can accelerate the passage from the published findings to a meaningful mechanistic representation of knowledge. the covid-19 disease map can provide the tipping point to shortcut research data generation and knowledge accumulation, creating a formalized and standardized streamline of well-defined tasks. this approach to an emerging pandemic leveraged the capacity and expertise of an entire swath of the bioinformatics community, bringing them together to improve the way we build and share knowledge.
by aligning our efforts, we strive to provide covid-19 specific pathway models, synchronize content with similar resources and encourage discussion and feedback at every stage of the curation process. with new results published every day, and with the active engagement of the research community, we envision the covid-19 disease map as an evolving and continuously updated knowledge base whose utility spans the entire research and development spectrum from basic science to pharmaceutical development and personalized medicine. moreover, our approach includes a large-scale effort to create interoperable tools and seamless downstream analysis pipelines to boost the applicability of established methodologies to the covid-19 disease map content. this includes harmonisation of formats, support of standards, and transparency in all steps to ensure wide use and content reusability. preliminary results of such efforts are presented in the case studies. the covid-19 disease map community is open and expanding as more people with complementary expertise join forces. in the longer run, the map's content will help to find robust signatures related to sars-cov-2 predisposition or response to various treatments, along with the prioritization of new potential drug targets or drug candidates. we aim to provide the tools to deepen our understanding of the mechanisms driving the infection and help boost drug development supported with testable suggestions. we aim at building armor for new treatments to prevent new waves of covid-19 or similar pandemics. 
a pneumonia outbreak associated with a new coronavirus of probable bat origin covid-19 and the kidney: from epidemiology to clinical practice receptor ace2 is an interferon-stimulated gene in human airway epithelial cells and is detected in specific cell subsets across tissues covid-19 disease map, building a computational repository of sars-cov-2 virus-host interaction mechanisms systems medicine disease maps: community-driven comprehensive representation of disease mechanisms communitydriven roadmap for integrated disease maps the reactome pathway knowledgebase wikipathways: a multifaceted pathway database bridging metabolomics to other omics research the systems biology graphical notation sbml level 3: an extensible format for the exchange and reuse of biological models the biopax community standard for pathway data sharing uniform resolution of compact identifiers for biomedical data omnipath: guidelines and gateway for literature-curated signaling pathway resources the imex coronavirus interactome: an evolving map of coronaviridae-host molecular interactions signor 2.0, the signaling network open resource 2.0: 2019 update update: integration, analysis and exploration of pathway data fairdomhub: a repository and collaboration environment for sharing systems biology research identifiers.org and miriam registry: community resources to provide persistent identification setting the basis of best practices and standards for curation and annotation of logical models in biology-highlights of the zbit bioinformatics toolbox: a web-platform for systems biology and expression data analysis sbmlsqueezer 2: context-sensitive creation of kinetic equations in biochemical networks minerva-a platform for visualization and curation of molecular interaction networks editing, validating and translating of sbgn maps pathvisio 3: an extendable pathway analysis toolbox modeling and simulation using celldesigner software support for sbgn maps: sbgn-ml and libsbgn systems biology 
graphical notation markup language (sbgnml) version 0.3 cord-19: the covid-19 open research dataset protein interaction data curation: the international molecular exchange (imex) consortium 25 years of pathway figures informing epidemic (research) responses in a timely fashion by knowledge management -a zika virus use case [internet]. pathology from word models to executable models of signaling networks using automated assembly gnormplus: an integrative approach for tagging genes, gene families, and protein domains a sars-cov-2 protein interaction map reveals targets for drug repurposing biokc: a collaborative platform for systems biology model curation and annotation systems biology markup language (sbml) level 3 package: multistate, multicomponent and multicompartment species, version 1, release 2 the systems biology markup language (sbml) level 3 package: layout, version 1 core sbml level 3 package: render, version 1, release 1 controlled vocabularies and semantics in systems biology closing the gap between formats for storing layout information in systems biology cd2sbgnml: bidirectional conversion between celldesigner and sbgn formats reactome from a wikipathways perspective sbml qualitative models: a model representation format and infrastructure to foster interactions between qualitative modelling formalisms and tools automated inference of boolean models from molecular interaction maps using casq cytoscape: a software environment for integrated models of biomolecular interaction networks logical modeling and analysis of cellular regulatory networks with ginsim 3.0 ndex: a community resource for sharing and publishing of biological networks human coronavirus: host-pathogen interaction comparative replication and immune activation profiles of sars-cov-2 and sars-cov in human lungs: an ex vivo study with implications for the pathogenesis of covid-19 tropism, replication competence, and innate immune responses of the coronavirus sars-cov-2 in human respiratory 
tract and conjunctiva: an analysis in ex-vivo and in-vitro cultures pathogenesis of covid-19 from a cell biology perspective pulmonary postmortem findings in a series of covid-19 cases from northern italy: a two-centre descriptive study interaction of sars-cov-2 and other coronavirus with ace (angiotensin-converting enzyme)-2 as their main receptor: therapeutic implications. hypertens dallas tex type i interferons: diversity of sources, production pathways and effects on immune responses impaired type i interferon activity and inflammatory responses in severe covid-19 patients interplay between sars-cov-2 and the type i interferon response the type i interferon response in covid-19: implications for treatment type i and type iii interferons -induction, signaling, evasion, and application to combat covid-19 the incubation period of coronavirus disease 2019 (covid-19) from publicly reported confirmed cases: estimation and application temporal dynamics in viral shedding and transmissibility of covid-19 clinical features of patients infected with 2019 novel coronavirus in wuhan, china persons evaluated for 2019 novel coronavirus -united states epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in wuhan, china: a descriptive study clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in wuhan the prevalence of olfactory and gustatory dysfunction in covid-19 patients: a systematic review and meta-analysis identifying airborne transmission as the dominant route for the spread of covid-19 aerosol and surface stability of sars-cov-2 as compared with sars-cov-1 airborne transmission of sars-cov-2: theoretical considerations and available evidence cell entry mechanisms of sars-cov-2 structure of the sars-cov-2 spike receptorbinding domain bound to the ace2 receptor sars-cov-2 cell entry depends on ace2 and tmprss2 and is blocked by a clinically proven protease inhibitor functional 
assessment of cell entry and receptor usage for sars-cov-2 and other lineage b betacoronaviruses cd209l/l-sign and cd209/dc-sign act as receptors for sars-cov-2 and are differentially expressed in lung and kidney epithelial and endothelial cells sars-cov-2 spike protein interacts with multiple innate immune receptors a multibasic cleavage site in the spike protein of sars-cov-2 is essential for infection of human lung cells fusion mechanism of 2019-ncov and fusion inhibitors targeting hr1 domain in spike protein viral and cellular mrna translation in coronavirus-infected cells rna replication of mouse hepatitis virus takes place at double-membrane vesicles identification of severe acute respiratory syndrome coronavirus replicase products and characterization of papain-like protease activity liberation of sars-cov main protease from the viral polyprotein: n-terminal autocleavage does not depend on the mature dimerization mode severe acute respiratory syndrome coronavirus envelope protein regulates cell stress response and apoptosis autophagy during viral infection -a double-edged sword autophagy and energy metabolism canonical and noncanonical autophagy as potential targets for covid-19 digesting the crisis: autophagy and coronaviruses understanding sars-cov-2-mediated inflammatory responses: from mechanisms to potential therapeutic tools proteasome activator pa28γ-dependent degradation of coronavirus disease (covid-19) nucleocapsid protein coronaviruses hijack the lc3-i-positive edemosomes, er-derived vesicles exporting short-lived erad regulators, for replication coronavirus replication complex formation utilizes components of cellular autophagy coronavirus replication does not require the autophagy gene atg5 coronavirus nsp6 restricts autophagosome expansion evolutionary analysis of sars-cov-2: how mutation of non-structural protein 6 (nsp6) could affect viral autophagy targeting the endocytic pathway and autophagy process as a novel therapeutic strategy in 
covid-19 autophagy and sars-cov-2 infection: apossible smart targeting of the autophagy pathway open questions for harnessing autophagy-modulating drugs in the sars-cov-2 war: hope or hype? intermittent fasting, a possible priming tool for host defense against sars-cov-2 infection: crosstalk among calorie restriction, autophagy and immune response murine coronavirus-induced apoptosis in 17cl-1 cells involves a mitochondria-mediated pathway and its downstream caspase-8 activation and bid cleavage the novel severe acute respiratory syndrome coronavirus 2 (sars-cov-2) directly decimates human spleens and lymph nodes middle east respiratory syndrome coronavirus efficiently infects human primary t lymphocytes and activates the extrinsic and intrinsic apoptosis pathways apoptotic pathways: paper wraps stone blunts scissors apoptosis: a review of programmed cell death modulation of host cell death by sars coronavirus proteins spike protein of sars-cov stimulates cyclooxygenase-2 expression via both calcium-dependent and calcium-independent protein kinase c pathways augmentation of chemokine production by severe acute respiratory syndrome coronavirus 3a/x1 and 7a/x4 proteins through nf-kappab activation antiapoptotic signalling by the insulin-like growth factor i receptor, phosphatidylinositol 3-kinase, and akt jnk and pi3k/akt signaling pathways are required for establishing persistent sars-cov infection in vero e6 cells phosphatidylinositol 3-kinase-dependent pathways oppose fas-induced apoptosis and limit chloride secretion in human intestinal epithelial cells. 
implications for inflammatory diarrheal states human intestinal epithelial cell survival: differentiation state-specific control mechanisms induction of apoptosis by the severe acute respiratory syndrome coronavirus 7a protein is dependent on its interaction with the bcl-xl protein the sars-coronavirus membrane protein induces apoptosis via interfering with pdk1-pkb/akt signalling frequency and distribution of chest radiographic findings in patients positive for covid-19 coronavirus disease 2019 (covid-19) ct findings: a systematic review and meta-analysis covid-19 pathophysiology: a review acute respiratory distress syndrome clinical and immunological features of severe and moderate coronavirus disease 2019 longitudinal analyses reveal immunological misfiring in severe covid-19 is a "cytokine storm" relevant to covid-19? urgent avenues in the treatment of covid-19: targeting downstream inflammation to prevent catastrophic syndrome incidence of thrombotic complications in critically ill icu patients with covid-19 abnormal coagulation parameters are associated with poor prognosis in patients with novel coronavirus pneumonia clinical course and outcome of 107 patients infected with the novel coronavirus, sars-cov-2, discharged from two hospitals in wuhan, china the unique characteristics of covid-19 coagulopathy treatment of covid-19 with conestat alfa, a regulator of the complement complement associated microvascular injury and thrombosis in the pathogenesis of severe covid-19 infection: a report of five cases hematologic, biochemical and immune biomarker abnormalities associated with severe illness and mortality in coronavirus disease 2019 (covid-19): a meta-analysis hyperinflammation and derangement of renin-angiotensin-aldosterone system in covid-19: a novel hypothesis for clinically suspected hypercoagulopathy and microvascular immunothrombosis angiotensin ii up-regulates angiotensin i-converting enzyme (ace), but down-regulates ace2 via the at1-erk/p38 map 
kinase pathway sex hormones promote opposite effects on ace and ace2 activity, hypertrophy and cardiac contractility in spontaneously hypertensive rats sex differences in the aging pattern of renin-angiotensin system serum peptidases sars-cov-2 receptor and regulator of the renin-angiotensin system: celebrating the 20th anniversary of the discovery of ace2 counterregulatory renin-angiotensin system in cardiovascular disease clinical and biochemical indexes from 2019-ncov infected patients linked to viral loads and lung injury the emerging threat of (micro)thrombosis in covid-19 and its therapeutic implications pathogen recognition and inflammatory signaling in innate immune defenses pattern recognition receptors and inflammation severe acute respiratory syndrome coronavirus m protein inhibits type i interferon production by impeding the formation of traf3.tank.tbk1/ikkepsilon complex activation of nf-kappab by the full-length nucleocapsid protein of the sars coronavirus sars coronavirus papain-like protease inhibits the tlr7 signaling pathway through removing lys63-linked polyubiquitination of traf3 and traf6 antiviral activities of type i interferons to sars-cov-2 infection interferon priming enables cells to partially overturn the sars coronavirus-induced block in innate immune activation a suspicious role of interferon in the pathogenesis of sars-cov-2 by enhancing expression of ace2 sars-cov-2 is sensitive to type i interferon pretreatment severe acute respiratory syndrome coronavirus orf6 antagonizes stat1 function by sequestering nuclear import factors on the rough endoplasmic reticulum/golgi membrane inborn errors of type i ifn immunity in patients with life-threatening covid-19 autoantibodies against type i ifns in patients with life-threatening covid-19 human coronaviruses: a review of virus-host interactions regulation of irf-3-dependent innate immunity by the papain-like protease domain of the severe acute respiratory syndrome coronavirus severe acute 
respiratory syndrome coronavirus open reading frame (orf) 3b, orf 6, and nucleocapsid proteins function as interferon antagonists post-translational modifications of coronavirus proteins: roles and function accessory proteins of sars-cov and other coronaviruses the sars coronavirus 3a protein causes endoplasmic reticulum stress and induces ligand-independent downregulation of the type 1 interferon receptor accessory proteins 8b and 8ab of severe acute respiratory syndrome coronavirus suppress the interferon signaling pathway by mediating ubiquitin-dependent rapid degradation of interferon regulatory factor 3 message in a bottle: lessons learned from antagonism of sting signalling during rna virus infection peroxisomal mavs activates irf1-mediated ifn-λ production covid-19 as a sting disorder with delayed over-secretion of interferon-beta imbalanced host response to sars-cov-2 drives development of covid-19 interferon-λ: immune functions at barrier surfaces and beyond decoding type i and iii interferon signalling during viral infection structural basis for translational shutdown and immune evasion by the nsp1 protein of sars-cov-2 immunometabolism and pulmonary infections: implications for protective immune responses and host-directed therapies competition for nutrients and its role in controlling immune responses new insights into the nrf-2/ho signaling axis and its application in pediatric respiratory diseases heme catabolic pathway in inflammation and immune disorders the hmox1 pathway as a promising target for the treatment and prevention of sars-cov-2 of 2019 (covid-19) bile pigments in pulmonary and vascular disease heme oxygenase-1 dampens the macrophage sterile inflammasome response and regulates its components in the hypoxic lung carbon monoxide negatively regulates nlrp3 inflammasome activation in macrophages heme oxygenase-1 protects airway epithelium against apoptosis by targeting the proinflammatory nlrp3-rxr axis in asthma negative regulators and their 
mechanisms in nlrp3 inflammasome activation and signaling targeting the nlrp3 inflammasome in severe covid-19 sars-cov-2 infection and overactivation of nlrp3 inflammasome as a trigger of cytokine "storm" and risk factor for damage of hematopoietic stem cells severe acute respiratory syndrome coronavirus orf3a protein activates the nlrp3 inflammasome by promoting traf3-dependent ubiquitination of asc severe acute respiratory syndrome coronavirus e protein transports calcium ions and activates the nlrp3 inflammasome role of severe acute respiratory syndrome coronavirus viroporins e, 3a, and 8a in replication and pathogenesis. denison mr, editor. mbio coronavirus e protein forms ion channels with functionally and structurally-involved membrane lipids targeting the heme-heme oxygenase system to prevent severe complications following covid-19 infections genetic polymorphisms complicate covid-19 therapy: pivotal role of ho-1 in cytokine storm heme oxygenase-1 induction contributes to renoprotection by g-csf during rhabdomyolysis-associated acute kidney injury targeting the nrf2-heme oxygenase-1 axis after intracerebral hemorrhage hemin and cobalt protoporphyrin inhibit nlrp3 inflammasome activation by enhancing autophagy: a novel mechanism of inflammasome regulation remarkable role of indoleamine 2,3-dioxygenase and tryptophan metabolites in infectious diseases: potential role in macrophage-mediated inflammatory diseases inhibition of acute lethal pulmonary inflammation by the ido-ahr pathway 3-hydroxyanthranilic acid, one of metabolites of tryptophan via indoleamine 2,3-dioxygenase pathway, suppresses inducible nitric oxide synthase expression by enhancing heme oxygenase-1 expression nitric oxide inhibits indoleamine 2,3-dioxygenase activity in interferon-gamma primed mononuclear phagocytes multiomic immunophenotyping of covid-19 patients reveals early infection trajectories [internet]. 
immunology gcn2 kinase in t cells mediates proliferative arrest and anergy induction in response to indoleamine 2,3-dioxygenase indoleamine 2,3 dioxygenase and metabolic control of immune responses an expanding range of targets for kynurenine metabolites of tryptophan ido expands human cd4+cd25high regulatory t cells by promoting maturation of lps-treated dendritic cells inhibition of allogeneic t cell proliferation by indoleamine 2,3-dioxygenase-expressing dendritic cells: mediation of suppression by tryptophan metabolites cinnabarinic acid generated from 3-hydroxyanthranilic acid strongly induces apoptosis in thymocytes through the generation of reactive oxygen species and the induction of caspase aryl hydrocarbon receptor negatively regulates dendritic cell immunogenicity via a kynureninedependent mechanism dysregulation of immune response in patients with coronavirus the molecular defect leading to fabry disease: structure of human alpha-galactosidase sirt5 is a nad-dependent protein lysine demalonylase and desuccinylase shmt2 desuccinylation by sirt5 drives cancer cell proliferation substrates and regulation mechanisms for the human mitochondrial sirtuins sirt3 and sirt5 imp/gtp balance modulates cytoophidium assembly and impdh activity fba reveals guanylate kinase as a potential target for antiviral therapies against sars-cov-2. 
zenodo teriflunomide in the treatment of multiple sclerosis: current evidence and future prospects cerpegin-derived furo[3,4-c]pyridine-3,4(1h,5h)-diones enhance cellular response to interferons by de novo pyrimidine biosynthesis inhibition novel and potent inhibitors targeting dhodh are broad-spectrum antivirals against rna viruses including newly-emerged coronavirus sars-cov-2 longitudinal characteristics of lymphocyte responses and cytokine profiles in the peripheral blood of sars-cov-2 infected patients functional exhaustion of antiviral lymphocytes in covid-19 patients immunopathological characteristics of coronavirus disease 2019 cases in guangzhou detection of sars-cov-2-specific humoral and cellular immunity in covid-19 convalescent individuals sars-cov-2 spike 1 protein controls natural killer cell activation via the hla-e/nkg2a pathway pathological findings of covid-19 associated with acute respiratory distress syndrome neutrophil-to-lymphocyte ratio and clinical outcome in covid-19: a report from the italian front line higher level of neutrophil-to-lymphocyte is associated with severe covid-19 the comparative immunological characteristics of sars-cov, mers-cov, and sars-cov-2 coronavirus infections comorbidity and its impact on 1590 patients with covid-19 in china: a nationwide analysis risk factors for developing into critical covid-19 patients in wuhan, china: a multicenter, retrospective, cohort study covid-19 and crosstalk with the hallmarks of aging covid-19 in immunocompromised hosts: what we know so far considering how biological sex impacts immune responses and covid-19 outcomes public health agency of sweden's brief report: pregnant and postpartum women with severe acute respiratory syndrome coronavirus 2 infection in intensive care in sweden risk of covid-19 among front-line health-care workers and the general community: a prospective cohort study relationship between the abo blood group and the covid-19 susceptibility association between abo 
blood groups and risk of sars-cov-2 pneumonia relationship between abo blood group distribution and clinical characteristics in patients with covid-19 genomewide association study of severe covid-19 with respiratory failure covid-19 and the abo blood group connection more on "association between abo blood groups and risk of sars-cov-2 pneumonia abo blood group predisposes to covid-19 severity and cardiovascular diseases inhibition of the interaction between the sars-cov spike protein and its cellular receptor by anti-histoblood group antibodies covid-19 and abo blood group: another viewpoint asthma and covid-19 association of respiratory allergy, asthma, and expression of the sars-cov-2 receptor ace2 distinct effects of asthma and copd comorbidity on disease expression and outcome in patients with covid-19 eleven faces of coronavirus disease covid-19 and asthma: reflection during the pandemic type 2 inflammation modulates ace2 and tmprss2 in airway epithelial cells the possible pathophysiology mechanism of cytokine storm in elderly adults with covid-19 infection: the contribution of "inflameaging immunosenescence and inflamm-aging as two sides of the same coin: friends or foes? front immunol inflammaging: a new immunemetabolic viewpoint for age-related diseases an update and a model coronavirus disease 2019 (sars-cov-2) and colonization of ocular tissues and secretions: a systematic review immunosenescence in aging: between immune cells depletion and cytokines up-regulation clinical characteristics of children and young people admitted to hospital with covid-19 in united kingdom: prospective multicentre observational cohort study immune responses to sars-cov-2 infection in hospitalized pediatric and adult patients pathophysiology of covid-19: why children fare better than adults? sars-cov-2 (covid-19): what do we know about children? 
a systematic review sars-cov-2 infection in children and newborns: a systematic review an outbreak of severe kawasaki-like disease at the italian epicentre of the sars-cov-2 epidemic: an observational cohort study multisystem inflammatory syndrome in children during the coronavirus 2019 pandemic: a case series hyperinflammatory shock in children during covid-19 pandemic childhood multisystem inflammatory syndrome -a new challenge in the pandemic genetic variability of human angiotensin-converting enzyme 2 (hace2) among various ethnic populations covid-19 and individual genetic susceptibility/receptivity: role of ace1/ace2 genes, immunity, inflammation and coagulation. might the double x-chromosome in females be protective against sars-cov-2 compared to the single x-chromosome in males? ace2 receptor polymorphism: susceptibility to sars-cov-2, hypertension, multi-organ failure, and covid-19 disease outcome genetic gateways to covid-19 infection: implications for risk, severity, and outcomes a theory on sars-cov-2 susceptibility: reduced tlr7-activity as a mechanistic link between men, obese and elderly severe covid-19 is marked by a dysregulated myeloid cell compartment deciphering the role of host genetics in susceptibility to severe covid-19 amelioration of non-alcoholic fatty liver disease with npc1l1-targeted igy or n-3 polyunsaturated fatty acids in mice evidence that vitamin d supplementation could reduce risk of influenza and covid-19 infections and deaths effect of single-dose injection of vitamin d on immune cytokines in ulcerative colitis patients: a randomized placebo-controlled trial prevalence of vitamin d deficiency among healthy infants and toddlers type 2 and interferon inflammation strongly regulate sars-cov-2 related gene expression in the airway epithelium a computational framework for complex disease stratification from multiple large-scale datasets integration of multi-omics datasets enables molecular classification of copd similarity network 
fusion for aggregating data types on a genomic scale integrative clustering of multiple genomic data types using a joint latent variable model with application to breast and lung cancer subtype analysis diablo: an integrative approach for identifying key molecular drivers from multi-omics assays footprint-based functional analysis of multiomic data the global phosphorylation landscape of sars-cov-2 infection regulatory network analysis of paneth cell and goblet cell enriched gut organoids using transcriptomics approaches benchmark and integration of resources for the estimation of human transcription factor activities network-based drug repurposing for novel coronavirus 2019-ncov/sars-cov-2 virus-host interactome and proteomic survey reveal potential virulence factors influencing sars-cov-2 multi-level proteomics reveals host-perturbation strategies of sars-cov-2 and sars-cov psicquic and psiscore: accessing and scoring molecular interactions covid-19: viral-host interactome analyzed by network based-approach model to study pathogenesis of sars-cov-2 infection high throughput estimation of functional cell activities reveals disease mechanisms and predicts relevant clinical outcomes assessing the impact of mutations found in next generation sequencing data over human signaling pathways actionable pathways: interactive discovery of therapeutic targets using signaling pathway models from expression footprints to causal pathways: contextualizing large signaling networks with carnival the cell collective: toward an open and collaborative approach to systems biology comparing individualbased approaches to modelling the self-organization of multicellular tissues rapid community-driven development of a sars-cov-2 tissue simulator physiboss: a multiscale agent-based modelling framework integrating physical dimension and cell signalling physicell: an open source physics-based cell simulator for 3-d multicellular systems maboss 2.0: an environment for stochastic boolean 
modeling conceptual and computational framework for logical modelling of biological networks deregulated in diseases functional characterization of somatic mutations in cancer using network-based inference of protein activity stat2 and irf9: beyond isgf3 ifnβ-dependent increases in stat1, stat2, and irf9 mediate resistance to viruses and dna damage edger: a bioconductor package for differential expression analysis of digital gene expression data a comparison of normalization methods for high density oligonucleotide array data based on variance and bias sars-cov-2 triggers inflammatory responses and cell death through caspase-8 activation bax/bak-induced apoptosis results in caspase-8-dependent il-1β maturation in macrophages over-expression of severe acute respiratory syndrome coronavirus 3b protein induces both apoptosis and necrosis in vero e6 cells the mepn scheme: an intuitive and flexible graphical system for rendering biological pathways a graphical and computational modeling platform for biological pathways sabio-rk: an updated resource for manually curated biochemical reaction kinetics key: cord-014462-11ggaqf1 authors: nan title: abstracts of the papers presented in the xix national conference of indian virological society, “recent trends in viral disease problems and management”, on 18–20 march, 2010, at s.v. university, tirupati, andhra pradesh date: 2011-04-21 journal: indian j virol doi: 10.1007/s13337-011-0027-2 sha: doc_id: 14462 cord_uid: 11ggaqf1 nan patients showed rashes on face, hand and foot. ev detection was carried out in vesicular fluid, stool, serum and throat swab specimens by rt-pcr of the 5′ ncr gene. serotyping was carried out by rt-pcr of the vp1/2a junction region followed by sequencing and phylogenetic analysis using the neighbor-joining algorithm and the kimura 2-parameter model in mega-4 software.
the overall ev positivity in hfmd patients from kerala, tamil nadu, west bengal and orissa states was 51.6%, 66.6%, 62.5% and 71.4%, respectively. typing of vp1 gene sequences indicated the presence of ca-6, ev-71 and echo-9 strains in kerala and ca-16 in west bengal, orissa and tamil nadu. phylogenetic analysis indicated that the ca-6, ev-71 and echo-9 strains showed 94.8-95.7% and 95-94.4% homology with japanese, australian and french strains. however, ca-16 strains were closer to malaysian strains, with 91.2-95.6% nucleotide homology. the present study documents the association of multiple types of evs, i.e., ca-6, ev-71, echo-9 and ca-16 strains, as prime viral pathogens in hfmd epidemics in the reported regions, with the new emergence of a circulating ca-6 strain in kerala, india.
tasgaon september 2010. sera were collected from 162 suspected hepatitis cases and their contacts and tested for anti hev igm/igg antibodies (elisa) and liver enzymes such as alanine aminotransferase (alt). anti hev igm antibodies were detected in 45.7% (74/162) of the suspected cases. the overall attack rate was 0.7%. the male to female ratio was 2:1. the majority (60.4%) of the cases were in the age group 20-40 years and recovered without any clinical complications. the weekly distribution of cases showed that the majority (79.4%, 116/146) of cases occurred between the 2nd and 3rd week of june. dark urine (97.5%), jaundice (93.5%), fatigue (35.9%), abdominal pain (32.6%), anorexia (29.4%), vomiting (26.5%), fever (22.8%), giddiness (14.3%), diarrhoea (12.6%) and arthralgia (3.7%) were the prominent symptoms. sera collected from 73 antenatal cases (ancs) showed anti hev igm antibody in 3. affected pregnant women had a normal outcome. one death was reported during the outbreak period: a 32-year-old male hepatitis e case who had cirrhosis of the liver with oesophageal varices. a sanitary survey revealed that water pipelines were laid in close proximity to the sewerage system, and water posts were without taps.
these are the likely sources of faecal contamination of water supplies. among 17 water samples collected from various places, 5 were found to be unfit for drinking based on the routine bacteriological tests conducted at the state public health laboratory, pune. no case occurred after the pipelines were repaired. this typical outbreak of hepatitis e re-emphasizes the need for proper water supply/sewage disposal pipelines and adequate maintenance measures.
jayanthi shastri, nilima vaidya, sandhya sawant, umesh aigal department of molecular biology, kasturba hospital for infectious diseases, mumbai, india
dengue and dengue haemorrhagic fever are amongst the most important challenges in tropical diseases due to their expanding geographic distribution, increasing outbreak frequency, hyperendemicity and evolution of virulence. the global prevalence of dengue has grown dramatically in recent decades. who estimates 50-100 million cases of dengue virus infections worldwide every year resulting in 250,000 to 500,000 cases of dhf and 24,000 deaths each year. public health laboratories require rapid diagnosis of dengue outbreaks for application of measures such as vector control. laboratory diagnosis of dengue virus infection can be made by the detection of specific virus, viral antigen, genomic sequence and/or antibodies. currently, the 3 basic methods used by laboratories for diagnosis of dengue virus infection are virus isolation and characterisation, detection of genomic sequence by nucleic acid amplification technology assay and detection of dengue virus specific antibodies/antigen. molecular diagnosis based on reverse transcription (rt)-pcr, such as one-step or nested pcr, nucleic acid sequence based amplification (nasba), or real time rt-pcr, has gradually replaced the virus isolation method as the new standard for the detection of dengue virus in acute phase serum samples.
several pcr protocols for detection have been described that vary in the extraction method, genomic location of primers, specificity, sensitivity and the methods to determine the products and the serotype. pcr-based dengue tests, due to the specificity of amplification, enable a definitive diagnosis and serotyping of the virus. in addition, dna sequencing of the amplification product enables the virus to be genotyped, providing important information on the sources of infection. more recently, tests have incorporated fluorogenic probes, the so-called taqman technology, for the specific real time detection of dengue 1-4 amplicons. product is detected by a specific oligodeoxynucleotide probe that is labelled with 6-carboxyfluorescein (fam). first, this technology offers the advantage of being both rapid and potentially quantitative. second, the detection of product by hybridisation of fluorochrome labelled probes increases specificity. third, as the product is detected without the need to open the reaction tube, the risk of contamination by product carry over is minimised. the advantages of speed, contamination minimisation and reduced turnaround time justify application of this assay over the currently used nested pcr assay. during the period january 2007 to october 2009, the molecular laboratory received 900 samples from patients presenting with acute onset fever for dengue .6%) samples were tested positive by this method. the disease peaks in the monsoon season with a percentage of 17.5%. rapid tests and igm and igg capture elisa are popularly used tests for diagnosis of dengue infection. their utility is limited to diagnosing dengue in convalescence (8-14 days). specificity is also compromised due to infections with other arboviruses: japanese encephalitis and chikungunya. dengue ns1 ag elisa, with its cost effectiveness, specificity and sensitivity, should be considered as the test of choice for diagnosing dengue in the acute phase of illness in developing countries.
molecular diagnosis enables confirmatory diagnosis of dengue in the acute phase of the illness and is suitable for further typing methods.
indian j. virol. (september 2010) 21(suppl. 1):a1-a58
assistant general manager and r&d coordinator, division of quality control and r&d, bharat immunologicals and biologicals corporation ltd., village chola, bulandshahr, up
vaccine development in india, though slow to start, has progressed by leaps and bounds in the past 60 years. the country was once dependent on imported vaccines, but it is now not only self-sufficient in the production of vaccines conforming to international standards but also a major supplier of them to unicef. the role of drug authorities is to enhance public health by assuring the availability of safe and effective vaccines, allergenic extracts, and other related products. vaccine development is tightly regulated by a hierarchy of regulatory bodies. guidelines provided by the indian council of medical research (icmr) set the rules of conduct for clinical trials from phase i to iv studies as well as studies on combination vaccines. these guidelines address ethical issues that arise during a vaccine study. a network of adverse drug reaction (adr) monitoring centers along with the adverse events following immunization (aefi) monitoring program provides the machinery for vaccine pharmacovigilance. genetic modification has been used to develop effective and cheaper vaccines through recombinant technology. to ensure the safety of consumers, producers, experimental animals and the environment, governments all over the world are following regulatory mechanisms and guidelines for genetically modified products. as with other industrializing countries undergoing rapid shifts, india clearly recognizes the need to restructure its regulatory system so that its biopharmaceutical industry can compete in international markets.
the genetic engineering approval committee (geac), the recombinant dna advisory committee (rdac), the review committee on genetic manipulation (rcgm) and the institutional biosafety committees (ibsc) are responsible for overseeing the development, the commitment to parameters and the commercialization of recombinant vaccines. to centralize and coordinate the whole system, the government has moved to form two agencies to administer the laws regulating the development of recombinant pharmaceutical products, including vaccines. the first is the creation of the national biotechnology regulatory authority (nbra), under the department of biotechnology (dbt), as part of india's long-term biotech sector development strategy. the second major initiative will affect the entire indian pharmaceutical industry. this is the replacement of most state, district, and central drug regulatory agencies with a single, central, fda-style agency, the central drug authority (cda). the cda is expected to have separate, semi-autonomous departments for regulation, enforcement, legal, and consumer affairs; biotechnology products; pharmacovigilance and drug safety; medical devices and diagnostics; imports; quality control; and traditional indian medicines. it will set up offices throughout india and will be funded by inspection, registration, and license fees. its enforcement powers will be strengthened by a new law increasing the criminal penalties for illegal clinical trials. in the manufacturing area, though, the country has been tightening the rules and enforcement. an amendment to the regulations, ''schedule m'' of the drugs and cosmetics act, now specifies the good manufacturing practice (gmp) requirements for factory premises and materials. these requirements were modeled after us fda regulations, to improve regulatory coordination between indian and us regulators. india has realized the importance of regulation in the pharmaceutical field, especially for vaccines, but it will take several years to implement these measures.
india has coordinated some of its regulatory functions with western organizations. the us pharmacopoeia established an office in hyderabad in 2007. a representative of the indian pharmaceutical lobby has also recently expressed openness to an expansion of the fda's oversight of indian manufacturing. as india expands its global drug and biologicals production, the us and europe, as the world's largest drug importers, will likely expand their regulatory support in the development of the country's regulatory systems.
rapid diagnosis of japanese encephalitis virus (jev) infections is important for timely clinical management and epidemiological control in areas where multiple flaviviruses are endemic. however, the speed and accuracy of diagnosis must be balanced against test cost and availability, especially in developing countries. an antigen capture enzyme-linked immunosorbent assay (elisa) for detection of circulating jev specific nonstructural protein 1 (ns1) was developed by using monoclonal antibodies (mabs) specific to recombinant ns1. the applicability of this jev ns1 antigen capture elisa for early clinical diagnosis was evaluated with 200 acute phase serum/cerebrospinal fluid (csf) specimens collected from different epidemics during 2007-2009. jev ns1 antigen was detected in circulation from day 1 to 18. the sensitivity and specificity of jev ns1 detection in serum/csf specimens with reference to reverse transcriptase pcr were 82% and 98.9%, respectively. no cross-reactions with any of the other closely related members of the genus flavivirus (dengue, west nile, yellow fever and saint louis encephalitis (sle) viruses) were observed when tested with either clinical specimens or virus cultures. these findings suggested that the reported jev specific mab-based ns1 antigen capture elisa will be a rapid and reliable tool for early confirmatory diagnosis as well as surveillance of je infections in developing countries.
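the sensitivity and specificity quoted for the jev ns1 elisa are ratios taken from a 2x2 comparison against the reference test (here rt-pcr). the following is a minimal sketch of that calculation. the abstract does not report the underlying counts, so the counts below are hypothetical and were chosen only so that the output matches the reported 82% and 98.9%; the function name is likewise illustrative.

```python
def diagnostic_accuracy(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) of an index test against a reference test.

    tp, fn: reference-positive samples that the index test called positive / negative
    tn, fp: reference-negative samples that the index test called negative / positive
    """
    sensitivity = tp / (tp + fn)  # share of reference positives that were detected
    specificity = tn / (tn + fp)  # share of reference negatives correctly called negative
    return sensitivity, specificity


# Hypothetical counts, NOT taken from the abstract: 50 rt-pcr positives and
# 90 rt-pcr negatives, picked so the ratios reproduce the reported figures.
sens, spec = diagnostic_accuracy(tp=41, fn=9, tn=89, fp=1)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

with real data, the four counts would come from cross-tabulating the elisa result against the rt-pcr result for each specimen.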
manmohan parida
the recent emergence of a novel human influenza a virus (h1n1) poses a serious global health threat. the h1n1 virus has caused a considerable number of deaths within a short duration since its emergence. a two-step single tube accelerated rapid real-time and quantitative swine flu virus specific h1 rtlamp assay is reported, targeting the h1 gene of the novel h1n1 hybrid virus. the feasibility of swine flu h1 rtlamp for clinical diagnosis was validated with a panel of 239 suspected throat wash samples comprising 116 confirmed positive and 123 confirmed negative cases from the ongoing epidemic. the comparative evaluation of the h1 specific rtlamp assay with real-time rt-pcr demonstrated exceptionally higher sensitivity, picking up all 116 h1n1 positive cases and 36 additional positive cases amongst the negatives that were sequence confirmed as h1n1. none of the real-time rt-pcr positive samples were missed by the rtlamp system. the comparative study revealed that rtlamp was 100-fold more sensitive than rt-pcr with a detection limit of 1 copy. these findings suggested that the rtlamp assay is a valuable tool for rapid, real-time detection as well as quantification of h1n1 virus in acute phase throat swab samples without requiring any sophisticated equipment.
because of its recurrent nature. despite considerable progress in understanding of the virus at cellular and molecular levels, the proper management of the disease in its different stages is still a dilemma, particularly whether to use antivirals or steroids or both. the risk of using steroids, with the attendant complications, has to be weighed against the risk of progression of the disease if steroids are avoided. this dilemma can be reduced to a considerable extent if basic principles of virology and pathogenesis are kept in mind. this article reviews current concepts of virological and clinical aspects of hsv keratitis to enable a broad understanding of the disease process.
several influential host factors are recognized, including the fact that hsk is more common in men than in women. the ability of hsv to establish latent infection in sensory neurons, and possibly the cornea, has been observed, but this knowledge has not yet been used to prevent the disease. acknowledging these limitations may further stimulate the application of laboratory knowledge in coping with hsk, which continues to present a major challenge in terms of management.
mvo-10 study on effect of human bhsp90 in immunity of hcv core protein and hbv hbsag
there are more than 500 million individuals with hepatitis b and c in the world. despite vaccination in different areas, there are several reports about patients who had previously been vaccinated. there is also no efficient vaccine against hepatitis c, and one of the important problems in vaccine projects is the development of effective and suitable adjuvants for human vaccines. in the present research we applied human bhsp90 protein as adjuvant and chaperone. this protein was injected into balb/c mice as adjuvant together with recombinant proteins of hcv core and hbv hbsag. the humoral and cellular immune responses of the mice were then studied. core and hbsag genes were cloned into the petduet-1 vector, and the thermal vector pgp1-2 was used for expression of human heat shock protein 90. different combinations of these three proteins were injected into mice, and we evaluated the total igg and igg2a of mouse sera after a week. two weeks after booster injection, we studied the proliferation and cytokine secretion of lymphocytes from the spleen and the inguinal and popliteal lymph nodes under in vitro and ex vivo conditions. the core/hbsag + hsp and core + hbsag + hsp complexes induced total igg and igg2a secretion. spleen lymphocyte proliferation increased in parallel with the serum igg2a level, which remained constant at the second bleeding and differed significantly from that obtained with complexes in freund's adjuvant.
il-4 and il-5 cytokines increased at first, and the subsequent decrease in il-4 indicated no hypersensitivity. the chaperone effect of hsp90 on the structure of core and hbsag proteins was studied by circular dichroism (cd) and fluorometry. hsp90 could refold the proteins after heat-induced unfolding.
hepatitis b virus (hbv) infection is a vaccine-preventable global public health problem. all commercially available vaccines contain one or more of the recombinant hepatitis b envelope proteins or surface antigen (hbsag). measurement of the antigen responsible for the immunogenicity of a vaccine is central to quality assessment. the problems associated with the use of a polyclonal antibody in an assay, namely its poorly defined nature and batch-to-batch variation, have been mitigated by the use of mabs as described in this paper. the initial capture of hbsag by the mab could orientate it such that the same antibody, after labeling, could bind to it as a detection antibody without steric hindrance. the development of an immuno-capture elisa (ic-elisa) to measure the hbsag content in experimental vaccine formulations, using a monoclonal antibody (mab) specific to determinant ''a'' of hbsag, is discussed. murine mabs developed against hbsag subtype adw2 were found to cross-react with the other subtypes, viz. ad and ay. the mabs were characterized, following which one mab, hbs06, was chosen for developing an ic-elisa format for the quantification of hbsag in the final algel adsorbed vaccines. the unadsorbed hbsag was used to establish the standard curve of hbsag/a. the elisa had a sensitivity of 10 ng/ml of hbsag. the recovery rate of hbsag/a was found to be around 70% in the vaccines treated to desorb the antigen from algel. twenty-seven experimental batches of monovalent hepatitis b vaccines were analyzed for hbsag content, both by ic-elisa and by a commercial kit (axsym kit, abbott laboratories, usa).
the statistical analysis of the ic-elisa results indicated that an experimental equation, f(x) = 0.0062x + 0.184, could precisely estimate the amount of hbsag in the adsorbed vaccines. the amounts of hbsag recovered from the adsorbed vaccines as estimated by the ic-elisa format correlated well with the estimates derived from the commercial kit, which is used by several vaccine manufacturers in india for the quality control of vaccine antigen. the varying amounts of vaccine antigen that could be recovered seemed to depend on the quality of the hbsag and the methods of hbsag adsorption to the alum gel during vaccine manufacture.
epidemiology of the spread of h1n1 virus. children of school going age have become victims of this deadly virus, as is evident from the reporting data generated in the past few weeks. the mortality rate has also increased slightly. the disease spreads in a wave pattern, and presently the world is passing through the second wave of the pandemic, with more severity in young and otherwise healthy people and a predilection for the lungs leading to viral pneumonia and respiratory failure. the pandemic has now gained hold in the developing world, affecting it more severely, as millions of people live under deprived conditions with multiple health problems and little access to basic health care. current data about the pandemic from developed countries need to be very closely watched in relation to shift in virus subtype, shift of the highest death rate to younger populations, successive pandemic waves, higher transmissibility than seasonal influenza, demographic differences, etc. presently the world appears to be better prepared. vaccine is available on the market in many countries. vaccine trials are also actively under way in the indian population. effective antivirals are available.
although until now h1n1 diagnostic centers have worked with the cdc/who recommended h1n1 specific primers and probes with taqman chemistry by real time pcr, efforts to develop indigenous diagnostics, vaccines and chemoprophylaxis are under way to better combat this highly infectious virus.
were positive for rotavirus infection by either page or elisa methods. the available data highlight the importance of rotavirus as a cause of diarrhea in children, which is severe enough to deserve specialized care. the observed proportion of 25.5% of all diarrhea cases being associated with rotavirus falls within the range of values reported by other workers. the reported positivity varies from 10.5 to 70.7%. in our study, complete concordance of elisa and page results was observed in 194 (97%) of the 200 tested specimens. this finding closely correlates with the findings of other authors, who found 96.7-97.14% concordance between elisa and page methods. some authors found the rna-page method to be as sensitive and rapid as elisa for detecting rotavirus in stool samples from cases of diarrhea, while others proposed that elisa is more sensitive than the page method, which was found to be 100% specific. the remaining 6 (3%) samples showed conflicting results. in one sample the od value of the elisa test was 0.195; as this value was almost at the cutoff level, the positivity of this sample by elisa is doubtful. the negative result for the same sample by the page method is difficult to explain; a large number of empty virus particles, a low concentration of viral rna in the fecal specimen or insufficient extraction of viral rna are possible causes. on the other hand, 5 of the samples which gave positive results by the page method were negative by the elisa test. these 5 samples had a typical 4-2-3-2 rna pattern.
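the elisa/page comparison above can be summarized as a 2x2 agreement table. the sketch below computes the percent agreement and, as an addition that the abstract itself does not use, cohen's kappa. the total of 200 specimens and the discordant counts (1 elisa-positive/page-negative, 5 page-positive/elisa-negative) come from the text; the split of the 194 concordant specimens into 45 double positives and 149 double negatives is an assumption inferred from the 25.5% positivity by either method, and the function name is illustrative.

```python
def agreement_stats(both_pos: int, only_a: int, only_b: int, both_neg: int) -> tuple[float, float]:
    """Return (observed agreement, Cohen's kappa) for two binary tests A and B."""
    n = both_pos + only_a + only_b + both_neg
    observed = (both_pos + both_neg) / n          # fraction of samples where A and B agree
    a_pos = (both_pos + only_a) / n               # overall positivity rate of test A
    b_pos = (both_pos + only_b) / n               # overall positivity rate of test B
    # agreement expected by chance if A and B were independent
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# A = elisa, B = page. Discordant counts are from the abstract; the concordant
# split (45 / 149) is an inferred assumption, not a reported figure.
observed, kappa = agreement_stats(both_pos=45, only_a=1, only_b=5, both_neg=149)
print(f"agreement = {observed:.1%}, kappa = {kappa:.3f}")
```

kappa corrects the raw 97% agreement for the agreement expected by chance, which matters here because most specimens are negative by both methods.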
the reason for their being elisa negative thus remains unexplained; however, blocking factors or the presence of inhibitory substances in stools might have been responsible. samples containing predominantly complete particles can also give false negative results, since the group antigen is not exposed. earlier studies have also reported page to be the most sensitive technique, although some are of the view that it is a laborious procedure. however, the page system used in this study was very simple to perform and the results were available on the same day. the main requirements were trained personnel and proper standardization of the technique. most reports state that the greatest advantages of the page and silver stain method are its lack of ambiguity and the fact that it provides information about viral electropherotypes. the modified page system was thus found to be reliable, simple and rapid, and no expensive reagents were required. locally available reagents from himedia were used. the cost of the chemicals for page per specimen was approximately rs. 24, as compared to rs. 110 per test by confirmatory elisa. a locally produced slab gel electrophoresis system with power pack was the only equipment required. this method could be used for the routine diagnosis of rotavirus infection in the laboratory.
vaccine, rapid diagnosis plays an important role in early management of patients. in this study a qc-rt-pcr assay was developed to quantify chikungunya virus rna by targeting the conserved region of the e1 gene. a competitor molecule containing an internal insertion was generated, which provided a stringent control of the quantification process. the introduction of 10-fold serially diluted competitor in each reaction was further used to determine sensitivity. the applicability of this assay for quantification of chikungunya virus rna was evaluated with human clinical samples and the results were compared with real-time quantitative rt-pcr.
the sensitivity of this assay was estimated to be 100 rna copies per reaction with a dynamic detection range of 10^2 to 10^10 copies. specificity was confirmed using closely related alphaviruses and flaviviruses. the comparison of qc-rt-pcr results with real-time rt-pcr revealed 100% concordance. these findings demonstrated that the reported assay is a convenient, sensitive and accurate method and is potentially useful for clinical diagnosis owing to simultaneous detection and quantification of chikungunya virus in acute-phase serum samples.
in india, measles vaccine was introduced as part of the expanded programme of immunization in 1985. measles, mumps and rubella (mmr) vaccine is still not part of the national immunization schedule of india. the indian academy of pediatrics (iap) recommends measles vaccine at 9 months of age and mmr vaccine at 15-18 months. however, in a recent policy update, the iap committee on immunisation opined that there is a need for a second dose of mmr vaccine to provide adequate immunity against mmr. the aim of the present study was to assess the extent of sero-protection against mmr at 4-6 years of age in children who had received one dose of mmr between 12 and 24 months of age. an attempt has also been made to assess the sero-response to the second dose of mmr vaccine in 4-6 year old children. a total of 106 consecutive children between the ages of 4-6 years who had received mmr vaccine between 12 and 24 months of age and were attending the immunization clinic of gtb hospital, delhi were enrolled. the vaccination status, anthropometry and physical examination findings were recorded. three ml of venous blood was again withdrawn for estimation of post vaccination antibody titre. it was observed that 20.39%, 87.38% and 75.73% of children were seroprotected against measles, mumps and rubella, respectively, 2.5-4.5 years after receiving the first dose of mmr vaccine.
seroprotection rose to 72.62%, 100% and 100% for measles, mumps and rubella, respectively, 4-6 weeks after receiving the second dose of mmr vaccine. the geometric mean concentration of antibody also rose significantly for all three diseases. in view of the low seroprevalence of mmr, and hence the high susceptibility to infection, at 4-6 years of age in children who have already received mmr vaccine, there is a need to boost the immune responses against these three diseases by giving a second dose of mmr vaccine.
baseline information on the epidemiology of viral agents causing stis and the types of risk behaviour of affected persons is essential for any meaningful targeted intervention. the present study documents the pattern of viral stis in patients attending a tertiary care hospital, correlating the syndromic approach and the laboratory investigations to determine the aetiology. three hundred consecutive patients attending the sti clinic were diagnosed and categorized according to the syndromic approach of the who, along with detailed history and demographic data. the majority of the patients were men (53.12%) with a mean age of 24 years. men had received education up to middle school. half of the female subjects were illiterate. sixty percent of the patients were married and among these, 19% were regular condom users. first sexual contact at or before 18 years of age was more frequent in men (31% vs. 22.22% in women). promiscuity was more common among male patients who had contact with commercial sex workers (csw). genital herpes was the commonest viral sti (86/300) followed by genital wart (60/300). concomitant infection with more than one virus was seen in 35% of patients. hiv was prevalent in 10.3% of sti patients. hepatitis b, hepatitis c, herpes simplex type 1 and molluscum contagiosum were the other viral agents seen in sti clinic attendees at our centre.
this disease is currently prevalent in more than 100 countries worldwide. annually 50-100 million people are infected with dengue virus, of which 2.5-5 lakh cases are dengue hemorrhagic fever (dhf) and dengue shock syndrome (dss), the serious forms of dengue virus infection, which may cause 25,000 deaths annually worldwide; approximately 3 million children have been hospitalized over the past 3 decades. the disease is characterized by sudden onset of high fever with severe headache, pain in the back and limbs, lymphadenopathy, maculopapular rash over the skin and retro-bulbar pain. early diagnosis can be established by simple and rapid detection of igg/igm antibodies in patient blood samples using a bi-directional immunoassay system, supporting management and control to reduce morbidity and mortality. details will be presented.
myocarditis and dilated cardiomyopathy (dcm) are common causes of morbidity and mortality both in children and adults. the most common viruses involved in myocarditis are coxsackievirus b and adenovirus. recently, the coxsackievirus and adenovirus receptor (car), a common receptor for coxsackieviruses b3 and b4 and adenoviruses 2 and 5, has been identified. increased expression of car has been reported in patients with dcm, suggesting utilization of car by these viruses for cell entry. the present study was designed to study the expression of car in myocardial tissue of patients with dcm. formalin fixed myocardial tissues were obtained from autopsy cases. a total of 26 cases of dcm and 20 controls, comprising non-cardiac disease (group a) and cardiac disease other than dcm (group b), were included in the study. expression of car was studied by immunohistochemical staining of myocardial tissue using car specific rabbit polyclonal antibody and biotin conjugated secondary antibody. the tissue sections were considered positive when >25% of the cells showed brown color staining by immunohistochemistry (ihc).
the car positivity in dcm cases was found to be 96% (25/26) as compared to 30% in control group a and 40% in control group b. the car positivity was significantly higher in the test group as compared to both the control groups. further, car positivity in all the cellular types (myocytes, endothelial cells and interstitial cells) was found to be significantly higher in the test group as compared to both the control groups. the expression of car was significantly higher in myocytes as compared to both endothelial and interstitial cells in all the groups. however, no significant difference was observed in car positivity between endothelial and interstitial cells. the present study highlights the increased expression of car in dcm cases, with further significance of car expression in myocytes and endothelial cells. this may help further in understanding the tropism of viruses or cellular susceptibility, which in turn will help in developing appropriate diagnostic and therapeutic approaches to the management of viral myocarditis and dcm cases.
food security and safety vary widely around the world, and reaching these goals is one of the major challenges, raising public concern for the wellbeing of mankind in particular. industrialized production and processing as well as improper environmental protection have clearly shown severe limitations, such as worldwide contamination of the food chain and water. water and food contaminated during the processes of production, processing and handling are essentially responsible for food and water borne viral infections/diseases. the cases of viral food borne outbreaks are on the rise, creating a threat to human health. recent research indicates that epidemiological studies focusing on frequently contaminated foods and food borne viral diseases are meager.
the current paper presents the etiology of select food borne viral diseases, probable reasons for the non-availability of appropriate methods to detect the viruses responsible for the diseases, routes of water and food borne transmission of enteric viral infections, currently available methods of detection of select viruses and biosafety measures to prevent food borne viral infections. dietary/nutritional management in food borne viral diseases is crucial to control weakness and gastroenteric intolerance due to the disease condition and antibiotic therapy. it will principally improve food intake, resulting in better nutritional status leading to optimum immune response. food borne viruses mainly belong to the rotaviruses, enteropathogenic viruses, astroviruses, adenoviruses and caliciviruses, and cause acute gastroenteritis (ag), which is an important health problem. the frequency of rotavirus as a cause of sporadic cases of ag ranges between 17.3% and 37.4%. astroviruses cause ag with a frequency ranging between 2 and 26%: outbreaks have been described in schools and kindergartens, but also in adults and the elderly. the frequency of identification of adenoviruses 40 and 41 as causes of sporadic ag in non-immunosuppressed children ranges between 0.7% and 31.5%, although there is probably underreporting because the sensitivity of conventional techniques is low. caliciviruses are separated phylogenetically into two genera: norovirus and sapovirus. norovirus is frequently associated with food- and water-borne outbreaks of ag. it is estimated that 40% of cases of ag due to norovirus are food borne. in sweden and some regions of the united states, norovirus is the leading cause of outbreaks of food borne diseases. sapovirus outbreaks due to person-to-person and food borne transmission affecting both children and adults have recently been reported in countries such as canada and japan.
it has been predicted that the importance of diarrhoeal disease, mainly due to contaminated food and water, as a cause of death will decline worldwide. evidence for such a downward trend is limited. this prediction presumes that improvements in the production and retail of microbiologically safe food will be sustained in the developed world and, moreover, will be rolled out to those countries of the developing world increasingly producing food for a global market. sustaining food safety standards will depend on constant vigilance maintained by monitoring and surveillance but, with the rising importance of other food-related issues, such as food security, obesity and climate change, competition for resources in the future to enable this may be fierce. in addition the pathogen populations relevant to food safety are not static. food is an excellent vehicle by which many pathogens (bacteria, viruses/prions and parasites) can reach an appropriate colonization site in a new host. although food production practices change, the well-recognized food-borne pathogens, such as salmonella spp. and escherichia coli, seem able to evolve to exploit novel opportunities, for example fresh produce and even generate new public health challenges, for example antimicrobial resistance. in addition, previously unknown food-borne pathogens, many of which are zoonotic, are constantly emerging. awareness and surveillance of viral food-borne pathogens is generally poor but emphasis is placed on norovirus, hepatitis a, rotaviruses and newly emerging viruses such as sars. it is clear that one overall challenge is the generation and maintenance of constructive dialogue and collaboration between public health, veterinary and food safety experts, bringing together multidisciplinary skills and multi-pathogen expertise. such collaboration is essential to monitor changing trends in the well-recognized diseases and detect emerging pathogens. 
it is also necessary to understand the multiple interactions between these pathogens and their environments during transmission along the food chain in order to develop effective prevention and control strategies. to analyse the effectiveness of these sirnas targeting the rabies virus l gene, bhk-21 cells expressing the sirnas in shrna form were produced by transduction of cells with radv-l. the transduced bhk-21 cells expressing sirna were infected with rabies virus pv-11 strain. there was a reduction in rabies virus multiplication, as analysed by a reduction in the fluorescent foci forming unit (ffu) count by 51.85% (70 ffu in bhk-21 cells expressing sirna-l compared to 135 ffu in bhk-21 cells expressing negative sirna). the expression of l gene mrna was reduced 16.11-fold in rabies virus infected radv-l transduced cells compared to radv-neg transduced cells (negative control), as detected using real-time pcr. after analysing the effectiveness of radv-l in vitro, its effectiveness was also evaluated in vivo in mice after virulent rabies challenge. the mice were inoculated with 10^7 plaque forming units (pfu) of radv-l in the masseter muscle (i/m route) and challenged with 15 ld50 of rabies virus challenge virus standard (cvs) strain. the results indicated 50% protection with improved median survival from 7 to 11 days compared with the group of mice treated with radv-neg. the results of this study indicated that sirnas targeting the rabies virus polymerase (l) gene delivered through an adenoviral vector inhibited rabies virus multiplication in vitro and in vivo. and 4 were successfully produced and purified from the infected spodoptera frugiperda (sf-9) cells using these recombinant baculoviruses. the morphology of the vlps was validated by electron microscopy in comparison to the authentic bt virions. the vlps produced here were stable and highly immunogenic, with an intact outer layer, which is rapidly lost during normal infection of btv. 
these btv-vlps elicited long lasting protective immunity in vaccinated sheep against virulent virus challenge. with the use of btv-vlps it was also possible to differentiate the infected and vaccinated animals (diva). a vlp-based btv vaccine has potential advantages with regard to controlling the spread of btv with multiple serotypes. it is possible to produce milligram quantities of correctly folded and processed protein complexes using this baculovirus expression system, and hence it is a promising system for large scale production of new generation vaccines, such as vlp subunit vaccines, against viral diseases. peste des petits ruminants (ppr), goatpox and orf are oie notifiable diseases of small ruminants, especially goats and sheep. these diseases are economically important in enzootic countries like india, where they cause significant losses and are major constraints on productivity. considering the geographical distribution of ppr, goatpox and orf infections and the prevalence of mixed infection, in the present study the safety and potency of an experimental triple vaccine comprising attenuated strains of thermostable ppr virus (pprv jhansi, p-50) grown at 40°c, high passaged goatpox virus (gtpv uttarkashi, p100) and attenuated orf virus (orfv mukteswar, p51) were evaluated in sub-himalayan local hill goats. goats simultaneously immunized with 1 ml of vaccine consisting of either 10^3 tcid50 or 10^5 tcid50 of each of pprv, gtpv and orfv were monitored for clinical and serological responses for a period of 3-4 weeks post-immunization (pi) and post challenge (pc). specific immune responses, i.e., antibodies directed to pprv, gtpv and orfv, could be demonstrated by ppr competitive elisa kit, capripox indirect elisa and snt, respectively, following immunization. all the immunized animals resisted infection when challenged with virulent strains of either gtpv or pprv or orfv on day 28 post-immunization, while in-contact control animals developed characteristic signs of the respective disease. 
further, ppr viral antigen could be detected by using the ppr sandwich elisa kit in the excretions (nasal, ocular and oral swab materials) of unvaccinated control animals after challenge but not in those of any of the immunized goats. the triple vaccine was found safe at a dose as high as 10^5 tcid50 and induced a protective immune response even at a lower dose (10^2 tcid50) in goats, which was evident from seroconversion as well as challenge studies. the study indicated that these viruses are compatible and did not interfere with each other in eliciting an immune response, supporting the feasibility of using this triple vaccine to combat these infections simultaneously. toll like receptors (tlrs), primary sensors of microbial origin, play a crucial role in innate immunity. till now 13 mammalian tlrs have been identified, while there is no information available on the tlrs of yak. this study is part of a world bank funded icar project. the yak, named bos grunniens for its distinctive vocalization and relationship with cattle, is a natural inhabitant of extremely cold environments. when these animals come to lower altitude grazing land adjacent to villages, they become susceptible to the diseases of cattle, buffalo etc. thus, the present study was undertaken for genetic characterization and evolutionary lineage analysis of yak tlrs. we worked on the tlr7 gene, which plays an important role in recognition of ssrna viruses. total rna was extracted from mitogen stimulated pbmcs of yak. the rt-pcr conditions were standardized for full length amplification of the tlr7 gene using specific self designed primers. the expected amplicon of 3559 bp was obtained. it was cloned in the pgemt-easy vector followed by transformation into e. coli top10 strain. the recombinant clones were screened and picked up for plasmid isolation, and release of tlr7 was confirmed by restriction digestion. the cloned tlr7 product was sequenced and analyzed for the nucleotide and deduced amino acid sequences, and for 3d structure. 
the results revealed that yak shows more than 98% sequence homology with other bos indicus breeds and bos taurus breeds. however, identity was less than 88% with other animal species (equine, murine, feline, canine etc.). the evolutionary lineage findings cluster yak more closely with bovine species. point mutations revealed changes at 25 nucleotide positions with corresponding amino acid changes at 15 positions. smart analysis of the yak protein domain architecture revealed toll/interleukin-1 receptor (tir), leucine rich repeat (lrr) and signal peptide regions. the variations in yak mainly lie in the lrr region. homology modeling revealed a horseshoe shaped structure with 5 alpha helices. the additional alpha helix present in bos indicus was not detected in yak. the present study shows the existence of genetic variability in the tlr7 gene of yak, in particular in the lrr region, which plays an important role in pathogen recognition, and the evolutionary lineage analyses show its closeness to other bovine species. a.p. aquaculture and fisheries, tirupati in this new millennium, aquatic animal health management strategies in asia expanded and adjusted to the current disease problems faced by the aquaculture sector. this presentation will briefly discuss some of the most serious trans-boundary pathogens affecting asian aquaculture, including a newly emerging disease, and highlight recent regional and national efforts on responsible health management for mitigating the risks associated with aquatic animal movement. a regional approach is fundamental since many countries share common social, economic, industrial, environmental, biological and geographical characteristics. 
capacity and awareness building on aquatic animal epidemiology, science-based risk analysis for aquatic animal transfers, surveillance and disease reporting, disease zoning and establishment of aquatic animal health information systems to support development of national disease control programs and emergency response to disease outbreaks are needed. molecular diagnostics with emphasis towards standardization and harmonization, inter-calibration exercises and quality assurance in laboratories, accreditation program and utilization of regional resource centres on aquatic animal health will also be needed. whilst most of these strategies are directed in support of government policies, implementation will require pro-active involvement, effective cooperation and strategic networking between governments, farmers, researchers, scientists, development and aid agencies, and relevant private sector stakeholders at all levels. their contributions are essential to the health management process. generally, aquaculture plays an important role in economy as harvests from natural waters have declined or, at best, remained static in most countries. fish and shrimp, the main aquaculture product sources, have gained the most attention. many factors can cause losses in yields of fish products and infectious disease in fish and shrimp is the biggest threat to the fishery industry. shrimp and fish aquaculture has grown rapidly over several decades to become a major global industry that serves the increasing consumer demand for seafood and has contributed significantly to socio-economic development in many poor coastal communities. however, the ecological disturbances and changes in patterns of trade associated with the development of shrimp and fish farming have presented many of the pre-conditions for the emergence and spread of disease. 
shrimp and fish are displaced from their natural environments, provided artificial or alternative feeds, stocked in high density, exposed to stress through changes in water quality and are transported nationally and internationally, either live or as frozen product. these practices have provided opportunities for increased pathogenicity of existing infections, exposure to new pathogens, and the rapid transmission and trans boundary spread of disease. not surprisingly, a succession of new viral diseases has devastated the production and livelihoods of farmers and their sustaining communities. this review examines the major viral pathogens of farmed shrimp and fish, the likely reasons for their emergence and spread, and the consequences for the structure and operation of the shrimp farming industry. in addition, this review discusses the health management strategies that have been introduced to combat the major pathogens and the reasons that disease continues to have an impact, particularly on poor, smallholder farmers in asia. btv isolates from the same geographic region have been termed as 'topotypes' and initial observation on segment 3 nucleotide sequences identified a correlation between topotypes and genetic information. later topotyping was proposed based on segment 10, on the premise that the encoding protein ns3, which is involved in virus egress from insect cells, would lead to evolutionary fitness in parallel with the geographic distribution of the different culicoides species. further studies attempted to extend this to nucleotide sequence homology in segments 7 and 10, but failed to identify clear cut correlations or any evidence for positive selection. for example, south african isolates were found not to cluster into separate african lineage. in this study, we carried out a more extensive analysis of segment 10 sequences. our analysis showed no segregation of isolates into topographically distinct groups. 
instead we observed topological clustering of the clades, and we attribute this to a genetic bottleneck resulting in genetic drift and a founder effect, leading to a homogenous gene pool in a geographical area. we hypothesize that when a new virus enters a geographical area where local btv strains are already circulating, the new genes/segments would enter into a bigger gene pool. consequently, newer incursions into a heavily endemic area tend to get diluted and disappear from the population, unless they are positively selected, because the rate of drift is inversely proportional to the population size. use of live attenuated vaccine in israel, europe, south africa and usa also led to a more homogenous population similar to the vaccine strains due to continuous infusion of the vaccine type genes into the gene pool. we conclude that restriction of specific strains to certain geographical areas could generate uniquely imprinted genotypes which would not only indicate origin but also predict movement of viral strains to new areas. vvo-10 viral diseases of zoonotic importance: indian context k. prabhudas pd-admas, ivri, campus, hebbal, bangalore 24 zoonoses are generally defined as animal diseases that are transmissible to humans. they continue to represent an important health hazard in most parts of the world, where they cause considerable expenditure and losses for the health and agricultural sectors. the emergence of each of these zoonotic diseases is very distinct, hence their prevention and control will require unique strategies, apart from traditional approaches. such strategies require rebuilding a cadre of trained professionals from several medical and biological sciences. the article discusses virus infections that have significant zoonotic implications for india. buffalopox is a contagious viral disease affecting milch buffaloes and, rarely, cows, with a morbidity rate of up to 80% in the affected herd. 
although the disease is not responsible for high mortality, it adversely affects the productivity of the animals, resulting in large economic losses. furthermore, the disease has zoonotic implications, as outbreaks are frequently associated with human infections, particularly in milkers. the causative agent, buffalopox virus (bpxv), is closely related to vaccinia virus. outbreaks of febrile rash illness among humans and buffaloes were investigated in villages of the solapur and kolhapur districts of western maharashtra. clinico-epidemiological investigations of humans and buffaloes were carried out and representative clinical samples were collected. the samples included vesicular fluid, scab and blood. laboratory investigations for buffalopox virus (bpxv) were done by pcr on blood samples, scabs and vesicular fluid. in vitro virus isolation attempts were carried out using vero e-6 cells. negative staining electron microscopy was also employed for detection of virus particles. a total of 166 human cases with pox lesions on the hands and other body parts from kasegaon village, solapur district, and 185 cases from 20 different villages of kolhapur district were reported. besides pox lesions, patients had fever, malaise, pain at the site of the lesion and axillary and inguinal lymphadenopathy. in kasegaon village, the attack rate was 6.6% in humans and 41.9% (231/551) in buffaloes, whereas in the kolhapur area the attack rate in buffaloes was 11.75% (2633/22398). bpxv was confirmed in blood, vesicular fluid and scab specimens from human cases and in scab specimens from buffaloes by the polymerase chain reaction (pcr) method. bpxv was also isolated from 3 different clinical specimens and further identified by pcr and electron microscopy. 
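the buffalo attack rates quoted above follow directly from the bracketed counts (cases divided by animals at risk). a minimal sketch of that arithmetic is given here; the function name `attack_rate` is an illustrative assumption and not part of the original report, and only the two count pairs printed in the abstract are used.

```python
def attack_rate(cases: int, population_at_risk: int) -> float:
    """Return the attack rate as a percentage of the population at risk."""
    if population_at_risk <= 0:
        raise ValueError("population_at_risk must be positive")
    return 100.0 * cases / population_at_risk


# Counts taken from the abstract: affected buffaloes / buffaloes at risk.
kasegaon_buffaloes = attack_rate(231, 551)
kolhapur_buffaloes = attack_rate(2633, 22398)

print(f"kasegaon buffaloes: {kasegaon_buffaloes:.2f}%")
print(f"kolhapur buffaloes: {kolhapur_buffaloes:.2f}%")
```

the human attack rate of 6.6% cannot be recomputed in the same way, because the abstract gives the number of cases but not the human population at risk.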
the clinical manifestation of the disease in buffaloes from solapur district was as reported earlier, with common pox lesions on teats and udders, whereas the buffaloes from kolhapur district had lesions on hairless parts of the ears and on the eyelids with purulent discharge. bpxv from human and buffalo cases showed similarity. vaccines have been made against several diseases and used for controlling them. however, a few of them were not effective in controlling the disease. the reasons for the failure are many, the major ones being that either the pathogen is not completely cleared from the vaccinated animal or it re-emerges after changing its antigenic structure, thus making the vaccination programme less effective. in addition, with the emergence of newer diseases such as hiv, the development of suitable vaccines has become a challenging task. this is especially true in the case of viral diseases. these challenges have warned researchers that "protection by vaccination is not a simple and straightforward approach", and that a lot needs to be understood in terms of host-virus interaction and the role of the environment in perpetuating the disease. so the immediate step that was considered was environmental safety, by way of using non-infectious materials as vaccines. with the understanding that has been developed in molecular immunology and molecular biology, and with the availability of molecular tools developed through recombinant dna technology, the field of vaccinology has changed dramatically to emerge as modern vaccinology. this presentation deals with the modern approaches that are being used to produce effective vaccines in the case of foot and mouth disease of cloven footed animals. a similar approach may be worked out for other viral diseases also. 
despite the availability of an inactivated vaccine that is noted to provide solid immunity against the disease over a short period of time, the search is on for an ideal vaccine, the criteria for which are: safety for the environment, ease of preparation, no requirement for a cold chain for storage, longer lasting immunity, economic viability and the ability to clear the virus in case of persistent infection. the advent of recombinant dna technology, together with the information available on the molecular biology of viruses, has enabled the design and development of newer vaccines that can induce strong cellular and humoral responses. the underlying principle in the present vaccine development strategy the world over is that the virus antigen gene has to be expressed in the tissue and the vaccine backbone has to trigger the immune system to elicit the desired immune response. the bangalore campus of ivri has been vigorously pursuing research to develop ideal vaccines for foot and mouth disease, keeping the above principle in mind to achieve the previously mentioned criteria. the approaches selected are intended to ensure that the virus antigen/s replicate transiently in the host. the self replicating vaccines that have been developed are pox virus vectored vaccines, alpha virus replicase based vaccines and fmdv vectored vaccines. the approach and the results obtained so far will be discussed. the silkworm, bombyx mori, is affected by various diseases caused by viruses, viz., nuclear polyhedrosis (bmnpv), densonucleosis (bmdnv) and infectious flacherie (bmifv). silkworm viral diseases are major constraints on silk cocoon production in all the sericultural countries. the losses due to silkworm diseases are estimated at about 20-40%, and among them viral diseases are the most common. in sericulture, prophylactic measures play a vital role in the management of silkworm diseases. 
these include disinfection of the silkworm rearing house and appliances, rearing area, rearing surroundings, silkworm egg and body, and rearing bed, together with maintenance of general hygiene and personal hygiene. all these activities are generally carried out as rituals using general disinfectants, often with partial success. recent trends in complete management of silkworm diseases include development of silkworm hybrids evolved from disease resistant/tolerant breeds, effective eco- and user-friendly disinfectants, anti-microbial feed supplements and use of transgenic silkworms. a biotechnological breakthrough in this regard is the rna interference (rnai) approach involving dsrna mediated nuclear polyhedrosis management, and this is presently pursued by apssrdi, hindupur in collaboration with the centre for dna fingerprinting and diagnostics (cdfd), hyderabad. nadu and karnataka. the disease appears to be more severe in rural flocks than in organized farms. our investigations revealed morbidity, mortality and case fatality rates of 9.34%, 2.69% and 28.84% in rural flocks and 6.22%, 0.47% and 7.63% in organised farms, respectively. higher morbidity and mortality in rural areas may be due to stress factors like poor nutrition, parasitic burden, fatigue due to long walks and non-availability of veterinary aid. kulkarni et al. (1992) also reported severe bt outbreaks in rural areas of maharashtra with overall morbidity, mortality and case fatality of 32%, 8% and 25% respectively. all the south indian sheep breeds were found to be susceptible and the clinical form of the disease is evident in all of them, though saravanabava (1992) reported variations in susceptibility among the indigenous sheep. trichy black and ramnad white sheep were found to be more susceptible than the vambur and mecheri sheep of tamil nadu. prevalence of bluetongue in sheep, goats and cattle appears to be high in the region. 
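the three rates reported above for the bluetongue outbreaks are linked: case fatality is the share of affected animals that die, so it can be approximated as mortality divided by morbidity when both are expressed against the same flock. a minimal sketch of that check follows; the function name is an illustrative assumption, and because the abstract gives only rounded percentages rather than raw counts, the recomputed values agree with the reported ones only approximately.

```python
def case_fatality_from_rates(morbidity_pct: float, mortality_pct: float) -> float:
    """Approximate case fatality (%) from morbidity and mortality percentages.

    Both inputs are assumed to share the same denominator (animals at risk).
    """
    if morbidity_pct <= 0:
        raise ValueError("morbidity_pct must be positive")
    return 100.0 * mortality_pct / morbidity_pct


# Percentages quoted in the abstract.
rural = case_fatality_from_rates(9.34, 2.69)       # reported case fatality: 28.84%
organised = case_fatality_from_rates(6.22, 0.47)   # reported case fatality: 7.63%
kulkarni = case_fatality_from_rates(32.0, 8.0)     # reported case fatality: 25%

for label, value in [("rural", rural), ("organised", organised), ("kulkarni et al.", kulkarni)]:
    print(f"{label}: {value:.2f}%")
```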
serological surveys conducted in andhra pradesh during 1991 revealed the prevalence of btv antibodies in sheep (47.5%), goats (43.56%), cattle (33%) and buffaloes (20%). a similarly high prevalence of btv antibodies in sheep and goats was also reported from the other states in the region. clinical disease has not been recorded in kerala, though btv antibodies were recorded in sheep (13.76%) and goats (7.10%) (ravi sankar 2003). culicoides are the known biological vectors of btv. not all culicoides species are capable of transmitting btv. the occurrence of the disease is related to the presence of competent vectors in the area. jain et al. (1988) established the involvement of culicoides in transmitting btv by isolating the virus from culicoides in haryana, a north indian state. c. imicola and c. oxystoma were found to be prevalent in andhra pradesh and tamil nadu. narladakar et al. (1993) reported the presence of c. schultzei, c. perigrinus and c. octoni in the marathwada region of maharashtra. culicoid vectors are significantly affected by the climate, and annual variations in the climate are reflected in the outcome of the disease. the monsoon season (june to dec), with the temperature ranging from 21.2 to 35.6°c, appears to be a favourable period for the multiplication of culicoides. the maximum number of outbreaks was recorded during the north east monsoon period (oct-dec), followed by the south west monsoon period (june to sep) in the region. however, details on the distribution of the competent vectors, their feeding habits and their dynamics in the region are lacking. multiple btv serotypes were found to be circulating in the region (kulkarni and kulkarni 1984; janakiraman et al. 1991; mehrotra et al. 1996). a total of 10 serotypes, viz. 1-4, 8, 9, 15, 16, 18 and 23, were identified based on virus isolations. sreenivasulu et al. (1999) isolated btv serotype 2 from an outbreak of bt in native sheep of andhra pradesh. 
btv serotypes 9, 15 and 21 were also isolated from outbreaks that occurred in andhra pradesh. some of the isolates need to be serotyped. deshmukh and gujar (1999) isolated btv type 1 from maharashtra. following is the summary of the distribution of btv serotypes in this region. the clinical picture of bt in native sheep appears to be slightly different, the major difference being that swelling of the lips and face was less conspicuous. mucocutaneous borders appeared to be very sensitive to touch and bled easily upon handling. the classical signs of cyanosis of the tongue and reddening of the coronary band are not common features of the disease in native sheep. the disease was also confirmed by virus isolation and identification. clinical disease has not been reported in cattle, buffaloes and goats in spite of high seroprevalence. in conclusion, bt is established in native sheep and causes severe economic losses to farmers. the disease is concentrated in the southern peninsula of the country. the disease is seasonal and is associated with rainfall. multiple serotypes appear to be circulating in this region. the btv serotypes were virulent in nature, as evident from the severe outbreaks. s. janardana reddy*, d. c. reddy department of fishery science and aquaculture, sri venkateswara university, tirupati 517 502 in less than three decades, the penaeid shrimp culture industries of the world developed from their experimental beginnings into major industries providing hundreds of thousands of jobs, billions of u.s. dollars in revenue, and augmentation of the world's food supply with a high value crop. concomitant with the growth of the shrimp culture industry has been the recognition of the ever increasing importance of disease, especially that caused by infectious agents. in india, viral diseases have become an important limiting factor for the growth of the shrimp aquaculture industry. 
although more than 30 different viral pathogens have been identified in different species of shrimp worldwide, only a few viruses have been identified as causing disease problems in cultured tiger shrimp in india, and on the east coast of andhra pradesh in particular. diagnostic methods for these pathogens include the traditional methods of morphological pathology (direct light microscopy, histopathology, and transmission electron microscopy), enhancement and bioassay methods, traditional microbiology, and the application of serological methods. while tissue culture is considered to be a standard tool in medical and veterinary diagnostic labs, it has never been developed as a usable, routine diagnostic tool for shrimp pathogens. the need for rapid, sensitive diagnostic methods led to the application of modern biotechnology to penaeid shrimp disease. the industry now has modern diagnostic genomic probes with nonradioactive labels for viral pathogens like infectious hypodermal and hematopoietic necrosis virus (ihhnv), hepatopancreatic virus (hpv), taura syndrome virus (tsv), white spot syndrome virus (wssv), monodon baculovirus (mbv), and bp. highly sensitive detection methods for some pathogens that employ dna amplification based on the polymerase chain reaction (pcr) now exist, and more pcr methods are being developed for additional agents. these advanced molecular methods promise to provide badly needed diagnostic and research tools to an industry that is reeling from catastrophic epizootics and that must become better able to understand and manage disease in the next phase of its development. within this field, shrimp immunology is a key element in establishing strategies for the control of diseases in shrimp aquaculture. research needs to be directed towards the development of assays to evaluate and monitor the immune state of shrimp. 
the establishment of regular immune checkups will not only permit the detection of shrimp immunodeficiencies but also help to monitor and improve environmental quality. for this, immune effectors must first be identified and characterised. in the end, however, the assumption may be made that the sustainability of aquaculture will depend on the selection of disease-resistant shrimp, i.e. on developing research in immunology and genetics at the same time. the development of strategies for prophylaxis and control of shrimp diseases could be aided by the establishment of a collaborative network to contribute to progress in basic knowledge of penaeid immunity. however, to improve efficiency, it appears essential also to open this network to complementary research areas related to shrimp pathology, physiology, genetics and environment. bluetongue is an important viral disease of sheep causing severe economic losses to farmers. the lack of an effective vaccine is the major impediment to controlling the disease. multiple serotypes were found to be circulating in the state. attempts are being made to develop a vaccine employing the available serotypes to control the disease. hence, it is essential to establish the antigenic relationship among the serotypes in order to identify the candidate vaccine strains to be incorporated in the preparation of the vaccine. the reciprocal cross neutralization test was employed to find the r% values between btv-2, -9 and -15, which indicate the extent of antigenic relationship between the serotypes. the r% value between btv-2 and btv-9 was recorded as 2.8. r% values of 3.53 and 2.8 were observed between btv-2 and -15 and between btv-9 and -15, respectively. the r% values recorded in the present study revealed a weak antigenic relationship between the btv serotypes. the extent of antigenic relationship between the btv serotypes was also determined by multiple sequence alignment of the nucleotide and amino acid sequences of the reference btv serotypes 2, 9 and 15. 
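the abstract does not state how the r% values were calculated from the reciprocal cross neutralization titres. a common convention for such tests is the archetti and horsfall relationship, r = 100 x sqrt(r1 x r2), where r1 and r2 are the ratios of heterologous to homologous titre for each antiserum. the sketch below assumes that formula; the function name and the titres in the example are hypothetical and are not data from the study.

```python
import math


def r_percent(homologous_a: float, heterologous_a: float,
              homologous_b: float, heterologous_b: float) -> float:
    """Antigenic relatedness (r%) between two viruses A and B.

    Assumes the Archetti and Horsfall convention:
      r1 = titre of antiserum A against virus B / titre of antiserum A against virus A
      r2 = titre of antiserum B against virus A / titre of antiserum B against virus B
      r% = 100 * sqrt(r1 * r2)
    """
    if min(homologous_a, homologous_b) <= 0:
        raise ValueError("homologous titres must be positive")
    if min(heterologous_a, heterologous_b) < 0:
        raise ValueError("heterologous titres must not be negative")
    r1 = heterologous_a / homologous_a
    r2 = heterologous_b / homologous_b
    return 100.0 * math.sqrt(r1 * r2)


# Hypothetical reciprocal neutralization titres, for illustration only.
example = r_percent(homologous_a=1280, heterologous_a=40,
                    homologous_b=640, heterologous_b=20)
print(f"r% = {example:.3f}")
```

under this convention, values of only a few percent, such as those reported for btv-2, -9 and -15, mean that each antiserum neutralizes the heterologous serotype far less efficiently than its own, which is consistent with the weak antigenic relationship described in the abstract.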
the sequence analysis of the vp2 gene revealed a homology of 47-53% and 29-41% at the nucleotide and amino acid levels respectively. r% values obtained using reciprocal cross neutralization test with the btv-2, 9 and 15 serotypes isolated in native sheep of andhra pradesh and the genomic analysis of the reference serotypes of btv-2, 9 and 15 revealed very weak antigenic relationship and were highly divergent. diseases especially those by viral pathogens cause greater economic losses in most horticultural crop species throughout the world as compared to agricultural crops. non-genetic methods of management of these diseases include quarantine measures, eradication of infected plants and weed hosts, crop rotation, use of certified virus-free seed or planting stock and use of pesticides to control insect vector populations implicated in transmission of viruses. however, none of these measures is likely to provide an enduring solution against these diseases especially those caused by viruses due sometimes to the huge expenditure involved, but mostly to the questionable effectiveness and reliability of those methods. as key control pesticides are getting increasingly abandoned, development of alternative methods to control diseases has been a felt-need in the recent past. though breeding for disease resistance generally provides a reliable security in a long run, introgression of host plant resistance did not materialise in most important crops. non-availability of an appropriate source of resistance in inter-fertile relatives, linkage to undesirable traits, or often times polygenic nature of such sources of resistance are the stumbling blocks in breeding programs. the limitations of conventional breeding and routine cultural practices prompted the need for the development of other approaches of virus control that could be fully incorporated into traditional methods. 
from this perspective, the concept of pathogen-derived resistance offers an attractive strategy to evolve newer methods of virus management, by transforming crop plants with nucleotide sequences derived from the pathogen's genome. the increasing number of molecular characterisations of plant virus genomes and the stable transformation of a number of horticultural crop species have in fact opened an avenue for molecular breeding against virus pathogens. successful field-testing of genetically modified crop cultivars provides proof of their superiority over existing cultivars. it also helps to demonstrate their environmental safety, with a view to overcoming public concern and scepticism. in general, the eventual commercialisation of transgenic lines expressing virus resistance will rely upon a host of factors including their field performance, genetic stability, public acceptance and the resolution of environmental concerns and patent related issues. as such, elaborate field trials and allied studies are now required to adapt genetically engineered horticultural crops expressing virus resistance for their implementation into practical agriculture. a few examples from current research at tnau, in india or elsewhere will be discussed in this presentation. virology unit, division of plant pathology, iari, new delhi 12 in recent times there has been greater emphasis on vegetatively propagated crops in india to help diversify indian agriculture. fruit, flower, spice and plantation crops are important vegetatively propagated horticultural crops, which have become a driving force for economic development in several parts of india. however, most of the vegetatively propagated crops are threatened by biotic stress caused by plant pathogens in general and plant viruses in particular. plant viruses produce specific and non-specific symptoms, and in some cases no symptoms are produced. 
correct identification and diagnosis is the first step in the management of any disease, including viral diseases. there have been two major breakthroughs in virus diagnostics during the last four decades. the first was the serological assay using monoclonal or polyclonal antibodies in enzyme-linked immunosorbent assay (elisa) and the other was the in vitro amplification of dna by polymerase chain reaction (pcr). a significant development in serological assays has been their simplification into user-friendly quick strip/dip stick methods. one-step lateral-flow (lf) tests have been developed for the on-site detection and identification of several plant viruses. rapid advancement in virus genome characterization has led to the development of novel approaches to nucleic acid based diagnostics, which include conventional pcr, real-time pcr, multiplex pcr, micro/macro arrays and biochips. pcr protocols already exist for many plant viruses of citrus, banana, apple, papaya, vegetables, ornamental and spice crops. a further advancement has led to the development of the real-time pcr assay, which is relatively easy but requires training for diagnosticians. in real-time pcr assays, results can be available within 20 min. the nucleic acid template preparation in pcr has been simplified. membrane based dna template protocols and co-isolation of nucleic acid templates are novel approaches in pcr detection of virus and virus-like pathogens. since many horticultural crops are often infected by more than one virus, their individual detection by pcr is not only expensive but also time consuming. therefore, multiplex pcr has been developed, in which the genomes of more than one virus can be amplified and detected in the same reaction mixture. development of nucleic acid based chips is now one of the most recent and fastest growing areas in the field of pathogen detection.
these nucleic acid based chips have been named dna/rna chips, biochips, genechips, biosensors or dna arrays. when it comes to applications of microarray technology for plant viruses, it is not difficult to see the value of a method that could potentially detect a whole range of viruses using a single test. however, microarrays are unlikely to become the only method in use in a diagnostic laboratory.
processing of germplasm including transgenic planting material imported for research purposes into the country. during the last two decades, a total of 49,923 samples of wheat including transgenics were imported from cimmyt (mexico), icarda (syria) and many other countries. these were grown in the post-entry quarantine nursery each year at nbpgr, new delhi, and the transgenic samples were grown in the national containment facility of level-4 (cl-4) since its inception to ensure that no viable biological material/pollen/pathogen enters or leaves the facility during quarantine processing of transgenics. in addition, post-entry quarantine inspections of the transgenic wheat grown by indenters are also undertaken by nbpgr quarantine scientists.
virus-induced gene silencing (vigs) is a technique in which viral genomes are used, usually after appropriate modifications, for transient gene silencing in plants. the mechanism behind vigs is the phenomenon called rna interference (rnai), which is widespread in many organisms and is believed to be a form of inherent defence system against intracellular pathogens, such as viruses and transposons. double-stranded rna or rna containing strong secondary structures, commonly produced during viral infections, are believed to trigger rnai, which employs a battery of proteins and nucleoprotein complexes to identify and degrade specific viral transcripts.
in vigs, viral genomes that do not cause severe symptoms but can accumulate and spread efficiently in the host plant are used as vectors in which a host gene is cloned and introduced into the plant. upon replication, the viral vector triggers an rnai response in the host plant, which also targets the host gene, leading to its silencing and, subsequently, to a silenced phenotype that reveals gene function in vivo. vigs has been used extensively to study gene functions in dicot plants, such as tobacco, tomato, pea, soybean, etc., using vectors derived from
reference genes are commonly used as an endogenous normalisation measure for the relative quantification of target genes. the expression of seven potential reference genes was evaluated in tissues of 180 healthy, physiologically stressed and barley yellow dwarf virus (bydv) infected cereal plants. these genes were tested by rt-qpcr and ranked according to the stability of their expression using three different methods (two-way anova, genorm and normfinder tools). in most cases, the expression of all genes did not depend on the abiotic stress conditions or on the virus infections. all the genes showed significant differences in expression among plant species. glyceraldehyde-3-phosphate dehydrogenase (gapdh), beta-tubulin (tubb) and 18s ribosomal rna (18s rrna) always ranked as the three most stable genes. on the other hand, elongation factor-1 alpha (ef1a), eukaryotic initiation factor 4a (eif4a), and 28s ribosomal rna (28s rrna) for barley and oat samples, and beta-tubulin (tubb) for wheat samples, were consistently ranked as the least reliable controls. the bydv titre was determined in two oat varieties by rt-qpcr using three different quantification approaches.
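the normalisation idea described here, dividing a target quantity by the geometric average of several reference genes, can be sketched as follows. this is a minimal illustration and not the authors' pipeline: the function names, the assumed amplification efficiency of 2.0 and the ct values in the usage lines are all hypothetical.

```python
from statistics import geometric_mean  # available from Python 3.8


def relative_quantity(ct, calibrator_ct, efficiency=2.0):
    """Relative quantity of one gene versus a calibrator sample.

    Assumes a fixed amplification efficiency (2.0 = perfect doubling);
    real assays would use an efficiency measured per primer pair.
    """
    return efficiency ** (calibrator_ct - ct)


def normalised_expression(target_ct, target_cal_ct, ref_cts, ref_cal_cts,
                          efficiency=2.0):
    """Target quantity divided by the geometric mean of the reference genes."""
    target_rq = relative_quantity(target_ct, target_cal_ct, efficiency)
    ref_rqs = [
        relative_quantity(ct, cal, efficiency)
        for ct, cal in zip(ref_cts, ref_cal_cts)
    ]
    return target_rq / geometric_mean(ref_rqs)


# hypothetical Ct values: one target (e.g. the virus) and two reference genes
value = normalised_expression(
    target_ct=20.0, target_cal_ct=22.0,
    ref_cts=[17.0, 24.0], ref_cal_cts=[18.0, 25.0],
)
print(round(value, 3))
```

the geometric rather than arithmetic mean is used so that a single highly abundant reference gene (such as a ribosomal rna) does not dominate the normalisation factor.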
statistically, there were no significant differences between the absolute and the relative quantification, or between quantification using gapdh + tubb + tuba + 18s rrna and ef1a + eif4a + 28s rrna. the geometric average of gapdh, 18s rrna, tuba and tubb is suitable for normalisation of bydv quantification in barley and oat tissues. for wheat samples, a combination of gapdh, 18s rrna, tubb, eif4a and ef1a is recommended.
department of microbiology, yogi vemana university, vemanapuram, kadapa 516 003
large scale production and import of propagative material poses a potential risk of introducing several destructive pathogens, particularly viruses and mycoplasma-like organisms, into our country. this demands adequate quarantine safeguards, such as growing the material in an approved post-entry quarantine facility for a specific period so as to facilitate virus detection, thereby curtailing risk. such facilities, when coupled with propagation by tissue culture, will ensure virus-free propagative plant material. the requirement for a nationwide network of post-entry quarantine facilities working in close collaboration with crop institutions is strongly emphasized when considering import of high risk plant genera for agricultural development. the present paper discusses virus diseases of quarantine importance affecting ornamental and fruit plants such as chrysanthemum, dahlia, dianthus, rosa bengalensis, cattleya, cymbidium, dendrobium, lilium, citrus, vitis etc. the paper also discusses immunodiagnostic methods of detection and methods of obtaining virus-free propagative material.
rice tungro occurs as epidemics in regular cycles and has been reported in the last 50 years from all the major rice growing regions of india, being especially prevalent in the southern and eastern states. development of durable resistant varieties is crucial for the management of the disease.
molecular breeding, involving the use of dna markers linked to the resistance gene(s) for selection, can overcome the difficulties encountered in conventional resistance breeding programs. for successful marker-assisted selection (mas), the identification of closely linked markers through the process of gene tagging and mapping is a prerequisite. attempts have been initiated at drr, hyderabad to identify tungro resistance genes through molecular mapping and to introgress them into the target varieties using marker-assisted selection. the inheritance of resistance to rice tungro virus disease was studied in seven resistant rice cultivars with field evaluation at hot spot locations. the microsatellite markers linked to rice tungro resistance in utri merah were studied, and the resistance genes were found to be linked to rm 336 on chromosome 7. through molecular mapping, two qtl controlling rtv resistance were identified on chromosomes 7 and 2 in 'utri rajapan', explaining 40.8% and 21.6% of the phenotypic variance. in variety 'vikramarya', another two qtl for rtv resistance were detected on chromosomes 7 and 1, explaining 18.7% and 16.4% of the phenotypic variance. the closely linked markers identified in this study, flanking the gene of interest, will improve the efficiency and precision of introgression programs in marker-assisted breeding for rtv resistance. functional characterization of these qtl for rtv resistance is in progress.
there is only a limited pool of natural virus resistance in cassava against cassava mosaic geminiviruses and cassava brown streak ipomovirus; hence the development of transgenic resistance in this significant crop might present an option.
rna mediated resistance through the expression of inverted-repeat dsrna sequences derived from the virus genome, and the modification of plant microrna to produce antiviral artificial microrna (amirna), are strategies that have recently been proven very effective for induction of virus resistance (immunity) against a number of rna viruses. rna interference strategies against geminiviruses have never resulted in immunity of transgenic plants. however, the results suggest that viral mrnas are targets of rna silencing and that the success of the strategy depends on the relevance of the target gene in the systemic spread of the virus. we have generated a number of rna silencing constructs to induce resistance against cbsv and the indian cassava mosaic viruses icmv and slcmv. due to the serious problems inherent in the transformation of cassava and subsequent resistance screening, these constructs were tested for efficiency by either transient or transgenic expression in n. benthamiana. complete immunity was reached in transgenic n. benthamiana against cbsv using inverted repeat or amirna constructs. using different species of cbsv for resistance screening, immunity was broken, to show the minimum context for broad spectrum resistance. similarly, highly specific resistance was reached in expression of amirna. in contrast, virus resistance against icmv/slcmv using single amirna constructs was not successful. results from the experiments to generate virus resistance against cbsv and icmv/slcmv will be shown; methods to evaluate efficiency of rnai gene constructs by transient gene expression in n. benthamiana and strategies to develop efficient resistance against rna and dna viruses in cassava will be discussed.
bitter gourd (momordica charantia l.), which is also called bitter melon, balsam apple and balsam pear, belongs to the family cucurbitaceae.
it is an important traditional vegetable of nutritive and medicinal value that is cultivated in tropical and sub-tropical asia, but is considered as a weed host reservoir for viruses in jamaica. viral disease-like symptoms were observed occurring naturally on the crops of bitter gourd grown in the fields of northern india during 2007-2009. an incidence of 78.5% of diseased plants was recorded which showed chlorotic spots and mosaic ranging from mild mottling to green blisters along with leaf smalling, leaf and fruit deformations, bud necrosis and stunted growth whereas 20.2% plants exhibited leaf curling alone or in combination with mosaic-type disease. a reduction of 34.5% in fruit yield was recorded in mosaic-like disease which could be attributed to lesser fruit setting due to bud necrosis, smaller fruit size and stunted plant growth. such plants produced deformed, notched, irregularly shaped fruits wherein pre-mature yellowing and necrosis on the anterior and posteriors ends made 22.4% fruits unfit for marketability. the dwindling yield and production of unmarketable fruits posed a major constraint for profitable cultivation of this economically important crop, thus warranting for studies on etiology and management of these diseases. the mosaic-like disease was transmitted to healthy seedlings of bitter gourd at 2-leaves stage by sap inoculation as well as by aphid viz., myzus persicae sulz. and aphis gossypii glov. initially studies were carried out to optimize protocols for efficient plant regeneration and agrobacterium-mediated transformation for nagpur sweet orange, which is a popular and elite citrus cultivar in india. organogenesis was induced in etiolated epicotyl explants of one-month-old axenically raised polyembryonic seedlings by culturing them in mt medium supplemented with 30 g/l sucrose with varying concentrations of plant hormones. 
it was found that bap at 1 mg/l without auxin was best for efficient shoot regeneration in citrus using epicotyl explants. a 100% regeneration frequency was obtained, with multiple shoots forming from both cut ends of all the explants. an average of 8.24 well-differentiated shoots per explant was obtained, all of which rooted normally under the influence of 1 mg/l iba. this improved regeneration protocol was utilized in standardizing agrobacterium-mediated transformation of citrus using a. tumefaciens strain eha 105, containing binary plasmid pcambia 2301 that harbors the gus reporter gene and the npt-ii plant selection marker gene. infection of one-month-old epicotyl explants with overnight-grown agrobacterium (a600 0.6-0.8) for 15 min, followed by co-culture for 3 days, was found to be optimum for transformation, as assessed on the basis of pcr analysis and gus activity displayed by the stem and leaf sections of putative transgenics. overall transformation frequency ranged from 38 to 48%. the current study focuses on the generation of citrus transgenics for ctv resistance using a. tumefaciens strain eha 105 containing binary plasmid pbinar harboring a portion of the coat protein gene of ctv and the npt-ii gene, employing the standardized protocols. several putative transgenic shoots were recovered on selection medium and they are being used for molecular analyses and tests of resistance against ctv. work is also in progress on the generation of citrus transformants using rnai constructs harboring the ctv cp and p23 genes, singly and in conjunction. our lab was also involved in developing rice transgenics for resistance against rice tungro disease, which is one of the most important and widespread virus diseases of rice in south and southeast asia, causing an annual estimated loss in crop yield of
economic losses worth millions of rupees are caused due to these diseases annually.
virus diseases are frequently less conspicuous than those caused by other plant pathogens and last for much longer. this is especially true for perennial crops and those that are vegetatively propagated. one further problem with attempting to assess losses due to various diseases on a global basis is that most of the data are from small comparative trials rather than wide scale comprehensive surveys, and even the small trials do not necessarily give data that can be used for more global estimates of losses. this is for several reasons, including: (1) variation in losses by a particular crop from year to year; (2) variation from region to region and climatic zone to climatic zone; (3) differences in loss assessment methodologies; (4) identification of the viral etiology of the disease; (5) variation in the definition of the term 'losses' and (6)
chilli is the major vegetable and spice crop grown in thar desert areas of rajasthan. leaf curl disease (chlcd) is one of the major constraints in chilli cultivation faced by farmers and causes yield loss of up to 100%. a survey was conducted in the major chilli growing areas of the thar desert (bikaner, nagur, jodhpur and jalore districts of rajasthan) during november 2009 to understand the present status of leaf curl disease in chilli. among the four districts surveyed for chlcd, the maximum disease incidence was recorded in jodhpur district (up to 98%), followed by jalore district (up to 88%). no relation was found between disease incidence and variety. the major varieties grown in these areas are mehsana, rch (mandoria), haripur raipur, mathania and local cultivars. the number of whiteflies was also counted on top, middle and bottom leaves of chilli grown in these areas. the average number of whiteflies per plant ranged from 0.0 to 4.0. the highest number of whiteflies (4.0) was recorded in jodhpur district and the lowest (1.8) in jalore district.
total dna was extracted from three leaf curl infected samples from each district and tested for the presence of begomovirus using coat protein (cp) and dna-b specific primers. all the samples were positive for cp and dna-b amplifications by pcr. the cloning and sequencing of selected cp gene and dna-b fragments are in progress. the preliminary investigations shows that the leaf curl disease of chilli is widespread in the arid region of rajasthan and may be caused by begomovirus associated with satellite dna-b. bittergourd (momordica charantia) is an important vegetable crop of kerala. the crop is affected by several diseases of which mosaic is a prominent one. a field experiment was conducted to evaluate the efficacy of potentised resistance inducing substances (ris) viz., mosaic affected bittergourd plant tissue, ash of mosaic affected bittergourd plant tissue, plumbago and salicylic acid for control of bittergourd mosaic in march 2008. ris were applied as drench and foliar spray at three potency levels twice, before flowering of the crop. the experimental crop was grown as per the package of practice recommendations in split plot design with five replications per treatment. the disease incidence, disease severity and yield of the crop were recorded. the result of the experiment shows that spraying was more effective than drenching of treatments for reducing mosaic incidence and severity. among treatments, infected plant extract at 19 potency was the most effective one for reducing mosaic incidence and it showed the maximum incubation period and minimum disease severity. the spray application of treatments produced significantly higher yield than drenching. among the treatments, ash of infected plant at 19 and 309 potency and infected plant extract at 69 potency were on par and produced comparatively higher yield. 
elephant foot yam (amorphophallus paeoniifolius), colocasia (colocasia esculenta) and tannia (xanthosoma sagittifolium) are the major edible aroids cultivated in india. elephant foot yam cultivation is gaining importance due to its high production potential, nutritional and medicinal values and good economic returns. all these aroids are vegetatively propagated and viral diseases spread through planting materials. ctcri has the mandate of producing healthy planting materials of these edible aroids. accurate diagnosis and identification of the virus is essential for production of healthy planting material and effective management of the disease. though occurrences of viral diseases on edible aroids in india were known in the 1960s, not much attention was given to detection and identification of the viruses involved. in elephant foot yam, 5-30% mosaic incidence was observed with varying symptoms of mosaic, puckering, filiformy etc. in colocasia and tannia, 5-10% incidence was noticed. rt-pcr amplification with potyvirus group specific primers and subsequent cloning and sequencing of the amplified product has confirmed the association of dasheen mosaic virus (dsmv) with all three edible aroids cultivated in india. the complete coat protein gene of dsmv infecting elephant foot yam was cloned in pgem-t vector and sequenced. further sequence analysis revealed that the cp of dsmv consisted of 942 nucleotides and the 3′ utr comprised 260 nucleotides. blast and phylogenetic analysis showed highest similarity of 89% with that of dsmv isolate af048981, reported from usa. the deduced amino acid sequence of cp had 92.0-98.0% identity with other dsmv isolates. blast analysis of the partial cp gene sequences from colocasia and tannia also confirmed that the virus involved is dsmv.
rt-pcr analysis of large number of samples from all the three crops confirmed that the potyvirus group specific primers (mj1 and mj2) are good for rapid detection of dsmv in these crops. dsmv specific biotinylated cdna and digoxigenin labelled crna probes were also prepared and dsmv in elephant foot yam was detected through nucleic acid spot hybridization. yellow leaf disease (yld) caused by sugarcane yellow leaf virus (scylv) is a recently recorded disease in india and is found wide spread throughout country. in popular varieties, the disease incidence varied from 0 to 75.0% and attained epidemic levels under field conditions. detailed studies on the impact of yld on sugarcane revealed that the virus infection significantly reduces various cane growth parameters, cane yield and juice quality. sequence comparisons of the coat protein (cp) and movement protein (mp) of 22 scylv isolates from india and database sequences showed a significant variation between indian isolates and the database sequences both at nt and aa level in the cp/mp coding regions. the significant variation in our isolates with the database isolates, even in the least variable region of the scylv genome showed that the population existing in india is different from rest of the world. further, comparison of partial sequences encoding for orf 1 and 2 revealed that yld in sugarcane in india is caused at least by three genotypes viz., cub, ind and bra-per, of which a majority of the samples were found infected with cuban genotype (cub). the genotype ind was identified as a new genotype and this was found to have significant variation with the reported genotypes. we have identified specific primers from cp region of the virus and optimized rt-pcr conditions to diagnose the virus. this assay has been found efficient in detecting the virus in asymptomatic plants and tissue culture derived seedlings. 
meristem culture has been demonstrated to eliminate the virus from infected planting materials, and this technique needs to be adopted to supply disease-free planting materials for effective management of the disease. studies are also in progress to identify yld-resistant sources in sugarcane germplasm to initiate breeding for yld resistance in sugarcane.
mycoviruses are viruses that infect fungi. they have been identified in all major fungal families. in the present scenario, mycoviruses are an important means of biocontrol of plant fungal pathogens. most identified fungal viruses have double stranded rna genomes, often with more than one dsrna present per virus particle, and are spherical in shape. these viruses are mostly vesicle bound, whereas other viruses have protein coats. to be a true mycovirus, they must demonstrate an ability to be transmitted, in other words be able to infect other healthy fungi through anastomosis and spores. mycoviruses lead 'secret lives' and can reduce the ability of their fungal hosts to cause disease in plants. this property is known as hypovirulence, a term used to describe reduced virulence found in strains of pathogens. the phenomenon was first observed in cryphonectria (endothia) parasitica (chestnut blight fungus) on european castanea sativa in italy, where naturally occurring hypovirulent strains were able to reduce the effect of virulent ones. these slower growing hypovirulent strains of c. parasitica contain a single cytoplasmic element of double-stranded rna (dsrna), similar to that found in mycoviruses, that was transmitted by anastomosis in compatible strains through natural virulent populations of c. parasitica. hypovirulence has also been reported in many other fungal plant pathogens, including rhizoctonia solani, gaeumannomyces graminis var. tritici, ophiostoma ulmi, sclerotinia homoeocarpa, diaporthe ambigua, alternaria alternata, and fusarium sp. etc.
hypovirulence has attracted attention owing to the importance of fungal diseases in agriculture and the limited strategies that are available for the control of these diseases. it reduces the use of toxic fungicides which also affect the plant growth. the symptoms resulted by the mycoviruses are reduction in growth, reduction in pigmentation and sporulation, excessive sectoring and aerial mycelial collapse. these are the consequences of alteration in complex physiological and biochemical processes involving interaction between host and virus. cassava (manihot esculenta crantz.) is the major tuber crop in peninsular india, it is grown in an area of 2.4 lakh hectares with the annual production of 6.7 million tonnes both for direct consumption and the starch grain (sago) producing industries, mainly in the southern states of tamil nadu, kerala and andhra pradesh (fao 2005) . in tamil nadu, cassava primarily produced for sago producing industries where it is considered as an industrial crop rather than food crop, so the resource rich farmers are cultivating the cassava as irrigated crop in their fertile land and the poor farmers are raising the crop under rainfed conditions. in south india in addition to cassava there is a practice of intercropping important vegetable crops like, tomato, brinjal, legumes and gourds in cassava fields since all the above mentioned crops are short duration and are money spinners for the farmers. unfortunately, the major production constraint in these vegetable crops including cassava is the geminiviruses belonging to the family of in recent years there has been growing concern regarding the standard of scientific researches in india. the strengths, weaknesses, opportunities and threats (swot) analysis on indian scientific research reviewed the progress of science during the last six decades. 
although the 'strengths' were highlighted in good measure, it was the list of 'weaknesses' that called for attention to upgrade the standard of research and the 'opportunities' that provide scope for overall scientific growth. a comparison between india and other countries in terms of research papers published revealed that india's contribution to science has come down enormously. what ails indian science? should we compare the growth of indian science with other developed countries? what criteria should be adopted to judge the quality and standard of scientific research? how to motivate the scientists to improve their scientific output? how do indian journals perform in maintaining quality? this paper critically analyses the scientific journals around the world, based on the scores allotted by the national academy of agriculture sciences (naas) in 2003 and 2007 for 1460 and 1608 journals respectively. in general, the indian journals performed poorly irrespective of discipline, with only 25-30% in the high standard category. the paper deals with the reasons for low impact factors, the anomalies in the allotment of scores across the wide spectrum of journals and the disadvantages scientists face with the scoring system. a case study is presented of an institute with over 50 scientists whose publications were analyzed to discuss the merits and demerits of the system. the performance of the journals published by prestigious academies, societies and councils is also projected. the paper concludes with the need for enhancing the image of the country through research publications in high standard journals and the role of various scientific bodies, with short and long term measures.
poster session
herpes simplex virus (hsv) keratitis is a leading cause of corneal blindness throughout the world.
the infection can be diagnosed by clinical manifestations, but in atypical ocular cases laboratory diagnosis is more helpful in timely management of the disease. collection of corneal scrapings in all cases of stromal and epithelial keratitis may not be possible, but collecting tear fluid is a convenient procedure causing less discomfort to the patients. therefore, the present study was intended to evaluate the suitability of tear specimens for detecting hsv by polymerase chain reaction (pcr) and immunofluorescence (ifa). tear fluid and corneal scrapings were collected from 134 patients with suspected herpetic keratitis. hsv-1 antigen was detected by ifa using rabbit anti-hsv antibodies. pcr was performed to amplify a 111 bp region of the thymidine kinase (tk) coding gene and a 144 bp region of the dna polymerase coding gene of hsv. out of 134 patients, hsv antigen was detected in 25 (18.65%) of corneal scrapings and 15 (11.19%) of tear specimens, and in 12 (8.95%) patients from both specimens. the hsv gene could be amplified in 44 (32.83%) of corneal scrapings and 16 (11.94%) of tear fluids, and in 13 (9.71%) patients from both specimens. although corneal scraping seemed to be a marginally superior material for detection of hsv, tear fluid may also serve as an appropriate alternative clinical specimen, due to ease of collection and minimal discomfort to the patients. in either case, pcr detected a higher number of hsv cases than ifa. therefore, if and when feasible, both ifa and pcr should be used simultaneously on each specimen to obtain the best results.
cytokines play a key role in the regulation of immune responses. in hepatitis c virus (hcv) infection, the production of inappropriate cytokine levels appears to contribute to viral persistence and to affect response to therapy. il-6 is produced by a variety of cells including t cells, phagocytes and fibroblasts.
cytokine genes are polymorphic at specific sites, and certain mutations located within coding/regulatory regions have been shown to affect the overall expression and secretion of cytokines in patients with hcv infection. to correlate the serum levels and polymorphism of the il-6 gene in chronic hepatitis c patients and healthy controls. forty patients positive for hcv rna attending the medicine outpatient department and wards of lok nayak hospital, new delhi, as well as forty healthy controls, were enrolled for the study. the serum level of il-6 was measured by elisa. genomic dna was extracted from whole blood of hcv infected patients and healthy controls using the accuprep genomic dna extraction kit according to the manufacturer's instructions. the genotyping of the il-6 promoter (-174 variant) was carried out by pcr and direct sequencing using the method of patricia woo et al. 1998. the serum level of il-6 was significantly down regulated in chronic hcv infected patients as compared to the healthy controls. the distribution of the g/g, g/c and c/c genotypes did not differ significantly between hcv patients and healthy controls. the il-6 serum levels differed significantly between hcv infected patients and healthy controls. the polymorphism in the promoter region of il-6 (-174) was not significantly associated with hcv infection when patients were compared to healthy controls. in conclusion, the present study suggests that the host il-6 polymorphism alone may not play a significant role in the outcome of hcv infection.
acute gastroenteritis (age) is a global health problem and has been associated with multiple etiological agents, which include bacteria, protozoa and viruses. viral gastroenteritis is considered the second most common illness in children after upper respiratory tract infection.
among enteric viruses, rota, noro, enteric adeno, astro and enterovirus are found to be associated with gastroenteritis. although the association of enteric viruses has been established in children hospitalized for age, no such data are available from children hospitalized for reasons other than enteric infections. to determine the prevalence of enteric viruses circulating in hospitalized children. fecal samples, n = 292 (177 symptomatic and 115 asymptomatic for age), were collected from children <5 years of age from three different hospitals across the city of pune from june 2008 to feb. 2009. detection of group a rotavirus was carried out using antigen capture elisa. rt-pcr and pcr were carried out for the detection of norovirus, enterovirus, astrovirus and enteric adenovirus using primers targeting the rdrp gene, the 5′ ncr, the conserved serine protease gene and the hexon gene respectively. out of 177 fecal samples tested for enteric viruses in age cases, the prevalence of rota, entero, noro, enteric adeno and astrovirus was 33.3% (59), 14.7% (26), 6.2% (11), 2.8% (5) and 1.1% (2) respectively. however, the presence of these viruses in the asymptomatic cases (n = 115) was detected at 7.8% (9), 5.2% (6), 7.8% (9), 0.86% (1) and 1.7% (2) levels respectively. mixed infections of enterovirus and rotavirus were found in both symptomatic 1.6% (3) and asymptomatic cases 0.8% (1). however, mixed infection of enterovirus with adenovirus was found only in asymptomatic cases 0.8% (1). no marked difference was observed in the seasonal pattern of all viruses in the patients with or without gastroenteritis. the findings of this study document highest circulation of rotaviruses in patients symptomatic and asymptomatic for age. the entero and noroviruses remain the second most important enteric viruses in these patients.
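the prevalence figures above are simple proportions of positive samples in each group. as a minimal sketch, the calculation can be reproduced from the reported counts; the function and dictionary names are illustrative only, and rounding to one decimal place is an assumption (the abstract appears to truncate some values, e.g. 0.86% for 1 of 115).

```python
def prevalence(positives, total):
    """Percentage of positive samples, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * positives / total, 1)


# counts of positive samples as reported in the abstract
symptomatic = {"rota": 59, "entero": 26, "noro": 11, "adeno": 5, "astro": 2}
asymptomatic = {"rota": 9, "entero": 6, "noro": 9, "adeno": 1, "astro": 2}

for virus in symptomatic:
    print(
        virus,
        prevalence(symptomatic[virus], 177),   # symptomatic group, n = 177
        prevalence(asymptomatic[virus], 115),  # asymptomatic group, n = 115
    )
```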
influenza in humans is a major public health concern, and understanding its evolution in the light of its ''antigenic drift'' helps in the prediction of epidemics and the yearly update of the influenza vaccine. to antigenically characterize influenza a (h3n2) isolates and study antigenic drift during 1990 to 2009 in pune city. patients with influenza like illness were identified using a strict case definition from dispensaries located in different areas in pune and clinical samples (ns/ts) were collected after obtaining informed consent. these clinical samples were processed in vivo (in fertile eggs) and in vitro ( overall, an additional 35 (39.7%) positive cases of dengue could be detected when ns1 antigen assay was also used in the study. the highest ns1 antigen positivity was encountered among the samples collected on the 3rd day of fever, whereas mac elisa for igm antibody was positive after the 4th day and positivity gradually increased towards the convalescent phase of the disease. the results of this study indicate that the ns1 antigen based elisa test can be a useful tool to detect dengue virus infection in patients during the early acute phase of disease, since the appearance of igm antibodies usually occurs after the fifth day of the infection. concurrent use of both diagnostic assays, namely ns1 antigen as well as mac elisa, will improve the overall detection of dengue infection. early detection of acute dengue virus infection is crucial to provide timely information for the management of patients. human parvovirus b19, a member of the parvoviridae family, is a pathogen associated with a wide variety of diseases. most commonly, it causes the childhood rash erythema infectiosum, but in some cases it causes more serious symptoms such as persistent arthropathy and critical failures of red cell production causing transient aplastic crisis; infection in pregnancy causes hydrops fetalis and myocarditis. 
traditional immunosuppressive therapy being unsuccessful, anti-viral therapy might be worthy of consideration. functional annotation would reveal the role of the viral proteome in its survival and pathogenic mechanisms. svmprot functional family annotations of the vp2 protein indicated its zinc-binding, coat protein, outer membrane, chlorophyll biosynthesis, dna repair and calcium-binding nature. the vp2 protein has a key role in viral assembly of b19 virus and, being non-homologous to the human proteome, it was identified as an attractive molecular target for structure based drug discovery. the vp2 protein crystal structure was energy minimized using charmm. a structure based virtual screening method was applied using ligandfit to identify potential inhibitors of the vp2 protein from the chembank database, and ten potential human parvovirus b19 vp2 inhibitors were proposed. prism 310 genetic analyzer. the editing of the sequences was performed using bioedit software and the sequences were submitted to genbank. for phylogenetic interpretation, denv sequences representing the full extent of genetic diversity in denv-1, denv-2 and denv-3 were collected from genbank. the neighbor joining algorithm was implemented with 10,000 bootstrap replicates for phylogenetic inference using mega 4.0.2. the genomic region 134 to 644 (c-prm gene junction) of denv was amplified directly from patient serum. twelve of 72 samples were positive for dengue viral rna. of these, 4 were dengue type 1, 1 was dengue type 2 and 7 were dengue type 3. for the molecular epidemiological survey and genotyping of the sequences, more than 100 sequences from different geographical areas, including sequences from previously reported north indian isolates, were compared with our present data set. the critical analysis of the sequences revealed that the 4 dengue type 1 sequences clustered within sub-type 2 of genotype iii and all the 7 sequences of den-3 clustered along with genotype iii. 
thus, among the dengue types 1, 2 and 3 currently circulating in north india, dengue type 3, genotype iii, is the predominant one, followed by genotype iii of dengue type 1. although there is no specific treatment or vaccine available currently, confirmative rapid diagnosis based on detection of viral nucleic acid or igm antibodies in serum, an indication of recent infection, helps in epidemiological monitoring, symptomatic treatment of patients and determining prognosis. serological detection of anti-cgv igm antibodies was performed using rapid immuno-chromatographic assay (rica) and igm-antibody capture enzyme linked immunosorbent assay (mac-elisa). eighty convalescent sera were tested by rica and 60 of them were found positive for anti-cgv igm antibodies. twenty-five anti-cgv igm antibody rica positive sera were further assayed using mac-elisa. more sera from the patients are currently being tested to compare the sensitivity of these two serological assays in anti-cgv igm antibody based early serological diagnosis of cgv infection and the findings will be presented. thus the present study was designed to evaluate the utility of multiplex pcr (mpcr) for simultaneous and rapid detection of dengue and chikungunya viral infections. seventy-two acute phase blood samples from clinically suspected dengue cases were subjected to dengue and chikungunya uniplex pcr using dengue genus specific primers and e gene specific primers for chikungunya virus, and a multiplex pcr was developed for simultaneous detection of dengue and chikungunya infection. standard strains of dengue and chikungunya virus were used as controls. 13 of the 72 clinically suspected dengue samples were found to be positive for dengue viral rna by dengue uniplex pcr as well as dengue chikungunya mpcr, whereas none of the samples were positive for chikungunya virus infection by either uniplex chikungunya pcr or dengue chikungunya mpcr. 
the results of dengue and chikungunya uniplex pcr were found to be 100% concordant with dengue chikungunya multiplex pcr. dengue chikungunya multiplex pcr was found to be a potential rapid test to detect dengue and chikungunya viral infections simultaneously in clinical samples. sheetal malhotra, neelam marwaha, karan saluja, ratti ram sharma department of transfusion medicine, pgimer, chandigarh 160012 transmission through blood and blood products can be reduced to a great extent by efficient and reliable testing of the blood. the newer fourth generation elisa assays simultaneously detect antibodies against hiv-1 and 2 and the presence of p24 antigen and thus shorten the window period to about 14 days, as compared to 22 days with third generation elisa. to compare the hiv seroprevalence among blood donors using fourth generation elisa (antigen-antibody) versus third generation elisa (antibody) assay. this was a prospective study involving 5100 blood donors of which 3400 were voluntary donors (1700 being students and 1700 being non students) and 1700 were replacement donors. sex workers are one of the core groups for transmission of sti/hiv and act as a ''bridge group'' to the general population. accordingly, highest priority is given to this group in targeted intervention for prevention of hiv/aids. here we describe one such female sex worker who was harbouring 5 concomitant sti including 4 viral sti. a 25 year old female sex worker was brought to the sti clinic of a tertiary care hospital by an ngo with a complaint of genital discharge for 3 days. on per speculum examination, the cervix was slightly erythematous and tender, with mucopurulent discharge. there was no vaginal discharge or ulcer in the anogenital area. however, there was a wart at the lateral wall of the vagina. as per the naco syndromic management guideline, treatment was given for n. gonorrhoeae, c. trachomatis and hpv. 
a cervical swab was taken and subjected to various microbiological investigations for the detection of sti viz n. gonorrhoeae, c. trachomatis, t. pallidum, candida spp., t. vaginalis, hsv-1, hsv-2, hiv, hbv, hcv, hpv and m. contagiosum. saline wet mount showed pus cells, but no yeast cells or trophozoites of trichomonas vaginalis. the gram stained smear showed more than four polymorphonuclear leucocytes in the absence of gram-negative intracellular diplococci, and a presumptive diagnosis of non-gonococcal urethritis was made. no organism was isolated on any culture media after appropriate incubation. the cervical swab was negative for antigen of c. trachomatis. serum tested positive for hbv, hcv, hsv-2 and t. pallidum, though it was seronegative for hiv. in the present case, the female sex worker was harbouring four viral sti viz hsv-2, hbv, hcv and hpv along with t. pallidum. however, clinically she was diagnosed and treated accurately only for genital wart, while cervical discharge due to hsv-2 was misdiagnosed. it is necessary to test alternative approaches such as periodic presumptive therapy of viral sti, because this will not only boost the efforts of sti control in the target group but also help in hiv control. alternatively, regular clinical and laboratory screening for viral sti may be tried. densonucleosis viruses (dnv) belong to the parvoviridae family. they are the etiological agents of the insect disease known as densonucleosis, which leads to death or loss of vital functions of the infected insect. densonucleosis virus of mosquitoes has generated a lot of scientific interest because of its tremendous potential in biological control and its application as a transducing vector. earlier, we have reported the isolation and characterization of a dnv from aedes aegypti mosquitoes and its prevalence among different ae. aegypti populations from india. 
there are reports suggesting that when aedes albopictus mosquitoes are co-infected with dengue-2 and dnv, the multiplication of den-2 is suppressed. the present study focuses on the effect of co-infection of ae. aegypti mosquitoes with dnv and chikungunya virus (chikv). the first instar mosquito larvae were infected with dnv and the emerging dnv infected females were then infected with chikv by oral feeding. the chikv infected female mosquitoes thus obtained were analyzed by real time pcr for both dnv and chikv on alternate days post-infection, up to the 14th day. the data showed no significant difference in the multiplication of either of the viruses after co-infection. results suggest that chikv neither stimulates the replication of dnv nor is its own replication suppressed due to co-infection. this study forms an initial step in understanding the role played by such endogenous viruses in vector dynamics. chandipura virus pathogenesis is manifested as encephalitis in young children with a very high mortality rate. this damage could be due to direct replication of the virus in brain parenchymal tissue or could be immune system mediated. this study aims at elucidating the role of brain infiltrating lymphocytes in pathogenesis using mice as the model system. mice were inoculated intracerebrally with the virus and the perfused brain tissue was used to isolate the lymphocytes. control mice were inoculated with an equal amount of media. in order to standardize the procedure for isolation of lymphocytes from brain tissue, splenocytes were processed to isolate the lymphocytes using the histopaque density gradient method. methods to isolate lymphocytes from brain tissue as described by earlier workers were tested for ease and efficiency of procedure using a known suspension of lymphocytes from spleen. the percoll density gradient method provided optimum yield of lymphocytes with ease of handling. 
in this, brain cell suspension used to prepare 30% percoll is layered over 70% percoll prepared using media in 1:2 ratio. density gradient centrifugation is carried out at 900 × g for 20 min at 15°c to obtain the lymphocyte layer at the interface. leishman staining was performed to analyze the morphological characteristics of isolated lymphocytes. normal lymphocytes showed a dark blue stained nucleus. some bigger sized cells with a diffused nucleus characteristic of atypical lymphocytes were observed and some of the cells were surrounded by hair like structures. phenotypic characterization was carried out using flow cytometry. the presence of cd4+, cd8+ and cd19+ cells was observed. the percentages of cd8+, cd4+ and cd19+ cells were found to be 7.60%, 35.14% and 34.32% respectively in the lymphocytes isolated from the infected animal and 5.65%, 30.27% and 3.13% respectively from the control animal. hence, cd19+ cells showed maximum infiltration after infection. (santosh et al. 2008; pradeep et al. 2008). in the present study chikv suspected blood samples were collected and the acute phase samples were subjected to rt-pcr for the presence of virus specific rna by using the primer pair dvrchk-f/dvrchk-r as described by us earlier (naresh kumar et al. 2007). the convalescent phase samples were screened for chikv specific antibodies by using the sd bioline chikungunya igm rapid test. six sets of primers were designed to amplify the complete nsp4 and complete structural genes of chikungunya virus. the products were further gel purified and cloned in ptz57r/t vector, and the recombinant clones were sequenced and submitted to genbank. the complete nsp4 gene and structural genes were compared with other available sequences in genbank. sequence analysis results will be presented. the present study discusses these aspects in detail. some of these phages (viz. v953, v954) showed plaques at 42°c but not at 37°c. thus they seem to be lysogenic. 
for propagating and increasing the titre of all the above isolates, various previously described methods were attempted, but none of these methods were satisfactory. when siliconized glassware and plastic-ware were used, however, propagation was successful. we showed that siliconization of glassware and plastic-ware was essential for the propagation of our mycobacteriophage isolates v951, v952, v953, v954 and v955. also, phage dilution medium (pdm) as described by chaterjee et al. (2000) was found to be effective for picking out the plaques made by the phages. in this way, the phage isolates were propagated up to p3. the various passages of the phage isolates v951, v952, v953, v954 and v955 (i.e. original, p1, p2 and p3) were stored at -80°c. the four major routes of transmission are unsafe sex, contaminated needles, transmission from an infected mother to her baby at birth (vertical transmission) and breast milk. screening of blood products for hiv has largely eliminated transmission through blood transfusions or infected blood products in the developed world. in 2008, globally, about 2 million people died of aids, 33.4 million were living with hiv and 2.7 million people were newly infected with the virus. hiv infections and aids deaths are unevenly distributed geographically and the nature of the epidemics varies by region. more than 90% of people with hiv are living in the developing world. there is growing recognition that the virus does not discriminate by age, race, gender, ethnicity or socioeconomic status; everyone is susceptible. however, certain groups are at particular risk of hiv, including men who have sex with men (msm), injecting drug users (idus), and commercial sex workers (csws). the present study indicates the prevalence of hiv infection among the people residing in the northern region of india, predominantly among the foothills of the himalayas. 
the study was carried out on the patients visiting herbertpur christian hospital (a unit of emmanuel hospital association) under the integrated counselling and testing centre scheme at the hospital during 2009-2010. the study covers the screening of people groups residing in the area through community health schemes. the diagnosis of hiv infection is done by three types of assays: the tridot method, which is the rapid method of diagnosis, followed by the hiv coombs test, which involves the dot immunoassay principle; the third assay is the enzyme linked immunosorbent assay (elisa). the number of patients screened during the period of september 2009 to march 2010 is 635, which includes patients coming from four different states, namely haryana, uttarakhand, uttar pradesh and himachal pradesh. the number of people who tested positive is 8 and the number of people who tested negative is 627. the people who tested positive are sent to the higher centre for other confirmatory tests such as pcr and western blot analysis. these patients are sent for treatment and prophylaxis at a recognised centre in dehradun. the present study determines a consistent community hiv screening and treatment approach through diagnostics, counselling and awareness programmes. classical swine fever (csf), also known as hog cholera, is a highly contagious and fatal disease of swine. csf rapidly became a major issue for pig industries. it still causes important economic losses worldwide. it is considered a major health problem of swine in india. during the months of august to october 2009 there was an outbreak of classical swine fever in bihar. from three districts, darbhanga, patna and supol, a total of 36 different infected tissue samples such as kidney, spleen and lymph node were collected from dead/morbid pigs. 
total rna was isolated from a 20% homogenate of infected tissues in sterile pbs by tri-reagent (sigma, usa) according to the manufacturer's instructions and cdna was prepared using a commercially available kit. the cdna was stored frozen at -20°c until used. for the molecular detection of classical swine fever virus, specific nested pcr amplification of e2 and 5′ ntr was done along with ns5b and erns amplification. initially, these samples were found positive with these primers. further confirmation by sequencing was done by cloning these pcr products in pgem-t easy vector. e2 and 5′ ntr sequences were considered for phylogenetic analysis along with 20 complete available sequences of csfv. nucleotide sequence alignments were carried out using the clustalw program (dnastar), and phylogenetic tree analysis (dnastar) showed that the 5′ ntr has close proximity with a taiwan strain (accession no. ay568569) and e2 shows close proximity with chinese isolate csfv-39 (accession no. af407339). peste des petits ruminants (ppr) and sheeppox are oie notifiable diseases of small ruminants, especially sheep and goats. both diseases are economically important in enzootic countries like india and are major constraints on the productivity of animals. considering the geographical distribution of both ppr and sheeppox infections and the prevalence of mixed infection, in the present study the safety and potency of an experimental dual vaccine comprising attenuated strains of thermostable ppr virus (pprv-revati, p-50) grown at 40°c and attenuated sheeppox virus (sppv-srinagar, p40) were evaluated in local non-descript sheep. experimental animals were divided into four groups of six animals each, which received 100 doses (10^5 tcid50), 1 dose (10^3 tcid50) and 1/10th dose of vaccine, and normal saline as control, in 1 ml volume subcutaneously, respectively. serum samples were collected on days 0, 7, 14, 21 and 28 post vaccination. 
sheep simultaneously immunized with 1 ml of vaccine consisting of either 100 or 1 doses of each of pprv and sppv were monitored for clinical and serological responses for a period of 3-4 weeks post-immunization (pi) and post challenge (pc). specific immune responses, i.e., antibodies directed to both pprv and sppv, could be demonstrated by ppr competitive elisa kit and capripox indirect elisa, respectively, following immunization. all the immunized animals resisted infection when challenged with a virulent strain of sppv (srinagar isolate at p-6) on day 28 post-immunization, while in-contact control animals developed characteristic signs of sheeppox. the challenge of the sheep against ppr was not carried out; however, the antibody titre after immunization, determined by snt and elisa, indicated a protective titre, as per an earlier report on goats. the dual vaccine was found safe at the higher dose and induced a protective immune response even at the lower dose (10^2 tcid50) in sheep, which was evident from sero-conversion as well as the challenge study with sppv. the study indicated that both the viruses are compatible and did not interfere with each other in eliciting an immune response, supporting the feasibility of using this dual vaccine to combat both infections simultaneously. goatpox is one of the highly contagious, oie notifiable and economically important viral diseases of goats. the disease is caused by goatpox virus (gtpv), which is classified in the genus capripoxvirus in the family poxviridae. the disease incurs severe economic losses in terms of high morbidity in adults and heavy mortality in young kids and is a major constraint in goat farming in india. considering the enzootic nature and economic impact of the disease, it is important to control the infection by developing an effective vaccine. recently, a vero cell based live attenuated goatpox vaccine using the gtpv uttarkashi isolate (p60) has been developed in the authors' laboratory and evaluated in goats. 
the vaccine was found safe, potent and immunogenic experimentally and also in field trials. the vaccine has been evaluated at large scale in different regions of the country and found suitable for mass vaccination. however, the longevity of potency was not evaluated. therefore, long term potency trials were carried out for a period of 4 years with annual challenge using virulent goatpox virus and sero-monitoring. a sufficient number of hill goats were vaccinated with 1 dose of vaccine (10^2.5 tcid50/ml) and monitored for clinical and serological response. every year, vaccinated (n = 5) and control (n = 2) animals were challenged with a virulent strain (2 × 10^7.0 srd50/ml, gtpv mukteswar). sera of pre- and post-challenge (14 dpc) animals, including controls, were collected and monitored for serological response in the form of specific antibody production by snt and indirect elisa. all the vaccinated animals were protected on challenge, whereas all unvaccinated controls developed infection. the same was reflected in sero-monitoring of the collected sera. the developed live attenuated goatpox vaccine was thus found safe, immunogenic and potent for a period of 4 years after immunization and suitable for mass scale vaccination in the control and eradication of goatpox, along with suitable diagnostic tool(s), in a goatpox enzootic country like india. rotavirus infection in avian species varies from subclinical infections to outbreaks of diarrhea. the economic significance of rotaviral enteritis to the poultry industry has not yet been defined, but by analogy to the situation in mammals, it is likely to be significant. unlike the extensive studies performed on rotavirus infection in humans and animals, limited studies have been carried out to determine the extent of exposure of poultry birds to rotaviruses. to determine the prevalence of avian rotavirus antibodies in commercial broiler chickens. 
a total of 120 chicken serum samples were collected from the lairage of a poultry slaughter house to which birds from four different broiler farms in and around pune city were supplied. the serum samples were tested by an igg antibody capture elisa wherein purified chicken rotavirus ch2 was used as coating antigen. sera from specific pathogen free (spf) chicks (n = 20) served as negative controls in the test. the cut-off was calculated as the mean of the negative controls + 3 sd (standard deviation). s/co (mean sample od450/cut-off) values above 1 (1.113-4.445) in 60% (72/120) of serum samples indicated positivity for rotavirus antibodies. the result of the study indicates exposure of the birds to avian rotavirus or a similar agent that is circulating in pune city. bluetongue has become established in south india, causing regular outbreaks in sheep. btv serotypes 2, 9, 15 and 21 were isolated from native sheep of andhra pradesh. the other serotypes circulating in the state need to be identified. however, the major constraint is serotype identification. to overcome the difficulties of traditional serotyping methods (neutralization tests), nucleic acid based tests are being tried. rt-pcr for serotyping was standardized using primers specific to the vp2 gene of btv-2, 9 and 15 serotypes. rt-pcr yielded a 653 bp product for btv-2 and a 1241 bp product for btv-9, as defined by the specific primers. however, non-specific amplification at two different sites, i.e. 700 bp and 1500 bp, was noticed for btv-15. the specificity of rt-pcr was evaluated. btv-2 and btv-9 specific primers could amplify only btv-2 and btv-9 respectively, whereas btv-15 type specific primers amplified not only btv-15 but also btv-2 and btv-9. nucleic acid sequence data obtained from the btv-2 pcr product and btv-9 cloned products were specific to the vp2 gene of btv-2 and btv-9 respectively. 
however, the 700 and 1500 bp products of btv-15 were identical to the vp4 gene of btv-2, 8, 10, 11, 13 and 18 and the vp1 gene of btv-2, 8 and 10 respectively, indicating non-specific amplification with the btv-15 primers. foot and mouth disease is the most contagious and economically important disease of cloven footed animals. the disease is controlled by regular vaccination using vaccine produced from virus grown in cell culture. the strain used for vaccine production is selected from the field isolates based on adaptability and growth kinetics in bhk21 cells and antigen coverage. however, the field viruses need to be passaged several times to adapt to tissue culture. passage of field viruses in tissue culture may result in the development of mutants whose genetic makeup may differ from the field samples; also, some of the field strains may fail to adapt or may grow poorly in tissue culture, and thus the efficiency of the vaccine is affected. structural proteins of fmdv carry the sequences which determine serotype specificity and immunogenicity. thus one may replace the gene coding for structural proteins in the full length cdna copy of the vaccine strain that has been adapted to tissue culture with the poly-structural protein gene (p1) of a field strain, so that the chimeric virus gets the serotype specificity of the field strain besides retaining the other characteristics that are needed for a vaccine virus. we have made a replication competent fmdv asia 1 full length genome and cloned it under t7 and cmv promoters separately in plasmid vectors. bamhi sites were created for inserting the p1-2a gene of other field strains. the p1-2a of the type 'o' vaccine strain was amplified directly from cattle tongue material and cloned in a plasmid vector, and its specificity was studied by sequence analysis and gene expression. we have introduced the 'o' p1-2a gene into the full length construct devoid of the asia 1 structural protein gene, p1-2a. 
the in vitro transcribed rna in the case of the t7 promoter construct and plasmid dna in the case of the cmv promoter construct were transfected into bhk21 cells. after passaging, the virus obtained was studied for specificity. this approach may be used not only for rapid selection of a vaccine strain but also as a repository of the cdna copy of the virus. the p1 is composed of 1a, 1b, 1c and 1d (vp4, vp2, vp3, and vp1 respectively), of which vp1 is the most immunogenic, and a subunit vaccine produced with vp1 alone was able to induce a high level of neutralising antibodies. to control the disease in india, polyvalent vaccine consisting of the inactivated virus of all three serotypes is in use. however, the conventional vaccines have several drawbacks, which include safety and temperature sensitivity. hence, alternatively, sub-unit vaccines consisting of vp1 protein have been tried. however, this showed limited success due to the antigenic variations occurring in the field viruses, which thus escape neutralization by the antibodies generated from a single cloned protein. hence the present study was undertaken with the objective of including all the neutralizing epitopes present in the three serotypes by linking vp1 (1d) genes to produce a polyvalent protein for use as a poly-subunit vaccine. in this study we have constructed a cassette by linking the genes of three serotypes, 'o' (622 bp), 'a' (640 bp) and 'asia 1' (622 bp). these genes were cloned individually in the commercial pbsk vector and confirmed by sequence analysis before linking in pcdna vector. the linked gene construct was sub-cloned in pet32 expression vector. the expression of the protein gene from the pet vector was induced with iptg and analysed by sodium dodecyl sulphate polyacrylamide gel electrophoresis (sds-page). a fusion protein of size 72 kda was observed in page gels. 
since the protein contains 6 his residues from the vector at the n-terminal end, affinity purification was carried out using nickel nitrilo-tri-acetic-acid (ni-nta) agarose matrix. the immunoreactivity of the purified protein was assayed by western blot with anti fmdv type 'o' and 'asia 1' specific sera. the protein may be used as a subunit vaccine. silkworm diseases caused by viruses, bacteria, fungi and protozoans form major constraints on silk cocoon production in all the sericultural countries, and among these the silkworm viral diseases, viz. nuclear polyhedrosis and infectious flacherie caused by bmnpv and bmifv, cause severe crop loss. the traditional disease management strategies include prophylactic measures and the use of disease free silkworm eggs. the prophylactic measures include disinfection of the silkworm rearing house and appliances, the egg surface, the silkworm bed and the rearing surroundings. the disinfectants used presently in sericulture are either formaldehyde or chlorine based products, but these chemicals are neither eco- nor user-friendly. the awareness about health hazards caused by formaldehyde and environmental pollution caused by cl2 necessitated the development of eco- and user-friendly disinfectant products for use in sericulture. these include alternative disinfectant products developed using biodegradable chemicals and plant based ingredients by apssrdi, hindupur and central silk board for the management of silkworm diseases in india. the ideal disinfectant for sericulture would be one which can inactivate silkworm pathogens of diverse origin and is economical for sericulture. the paper discusses the disadvantages of hcho and cl2 based disinfectants and the advantages of eco- and user-friendly disinfectants for the management of silkworm diseases, especially the ones caused by viruses. 
the baculovirus expression vector system (bevs) is widely used for the production of high levels of properly post-translationally modified, biologically active and functional recombinant proteins and has facilitated basic biomedical research on protein structure, function, drug discovery and the roles of various proteins in disease. bevs is based on the introduction of a foreign gene into a genome region nonessential for viral replication via homologous recombination with a transfer vector containing the target gene. the resulting recombinant baculovirus lacks one of the nonessential genes (polh, v-cath, chia etc.), which is replaced with a foreign gene encoding a heterologous protein that can be expressed in cultured insect cells and insect larvae. the insect cell-bev system is widely used to produce recombinant proteins. bevs also eliminates concerns regarding pathogens that could potentially be transmitted to humans, as it is non-infectious to vertebrates. these features make the silkworm system an ideal expression and delivery package for producing proteins of medicinal importance. the efficiency, low cost and large-scale production of proteins using bevs represents a breakthrough technology that is facilitating high-throughput proteomic studies. the bevs has become a core technology for cloning and expression of genes for the study of protein structure, processing and function; production of biochemical reagents; study of regulation of gene expression; commercial exploration, development and production of vaccines, therapeutics and diagnostics; drug discovery research; and exploration and development of safer, more selective and environmentally compatible biopesticides. utilization of silkworm larvae and pupae as bioreactors with recombinant bmnpv producing foreign proteins extends the uses of silkworms. due to its large size and high protein synthesis ability as well as the expediency of mass culture, the silkworm is considered a good candidate for producing recombinant proteins. 
wssbv is the causative agent of a disease which has recently caused high shrimp mortalities and severe damage to shrimp culture. wssbv has been found across different penaeid shrimp species. in order to develop an effective diagnostic tool, a wssbv genomic library was constructed by cloning wssbv genomic dna extracted from purified virions. in the present study wssv disease free (confirmed by pcr analysis) were collected from hatcheries from different areas of guntur and prakasam districts and analysed to study the effect of various physical parameters like temperature, ph, salinity and turbidity on the prevalence of the disease. the studies on surface water temperature revealed fluctuations in the ponds ranging from 19 to 30.2°c in diseased ponds and from 25.2 to 34.5°c in healthy ponds. these results show a definite influence of temperature on the prevalence of wssv. the present-day strategy in vaccine development is to include a marker facility that helps in distinguishing the antibody response due to vaccination vis-à-vis infection in vaccinated animals. such information becomes relevant for effective disease control programmes, especially when using inactivated virus vaccines like those for foot and mouth disease (fmd). the antibodies generated in the animals through vaccination alone are the measure of vaccine efficacy and safety. presently, inactivated fmd virus (fmdv) vaccines are used to control the disease in endemic countries like india. the quality assurance of the vaccine depends on the efficacy of the vaccine in generating protective antibody without causing subclinical disease due to improper inactivation. since the protective antibody response in vaccinated animals cannot be distinguished from that of infected animals, one needs to assay the antibody response against non-structural proteins (nsps), and the vaccine must be free of contaminating nsps. 
production of vaccine free of nsps requires the cumbersome method of virus purification, which adds to the cost of the vaccine. alternatively, one may develop a positive marker vaccine by including a foreign protein or epitope which is not expected to be present in the vaccine; the antibodies generated against it help in detecting the vaccine-related response. here we report a molecular approach by which we introduced an immunodominant epitope of green fluorescent protein (gfp) into the structural protein gene of foot and mouth disease virus vaccine strain asia 1 (63/72). our laboratory has produced a mini-genome of fmdv asia 1 that lacks the structural protein gene (p1-2a) coding for all the structural proteins (vp1-4) of fmdv asia 1, for use as a vector (pcfl dasia 1). the p1-2a of the asia 1 vaccine strain was cloned separately into a plasmid vector, and by successive pcr mutagenesis and cloning we introduced a nucleotide sequence corresponding to a 9 amino acid epitope of gfp into the p1-2a gene. the gfp epitope was inserted by replacement at the n-terminal region of vp-2, which is not immunogenic. the modified p1-2a was expressed in e. coli and studied. the modified p1-2a gene with the gfp epitope was inserted into pcfl dasia 1 to obtain a full-length, replication-competent cdna cloned under the cmv promoter in pcdna (pcflasiagfp). this can be used to produce synthetic virus with the gfp epitope that can generate antibodies not only against neutralizing epitopes but also against the gfp epitope. the presence of antibody against the gfp epitope in the vaccinated animal will reveal vaccine efficacy. elisa against gfp can be used as a companion test not only for safety evaluation but also for quick evaluation of efficacy. further, the absence of nsp antibodies in the serum may reveal the quality of the vaccine in respect of safety. self replicating dna vaccines are developed to achieve a robust immune response through enhanced antigen production and gamma interferon expression in vaccinated animals. 
since self replicating dna vaccines induce gamma interferon expression, which helps in viral clearance, such vaccines are expected to be useful for curing even carrier and persistently infected animals. understanding the events that help in the elicitation of both arms of the immune response in vaccinated animals is necessary to understand the effectiveness of the vaccine. the work presented here deals with the immunological evaluation of a sindbis virus replicase based dna vaccine carrying linked fmdv vp1 genes in vaccinated guinea pigs. we constructed a self replicating dna vaccine vector and, downstream of a subgenomic promoter, inserted a secretory signal followed by linked vp1 genes of 3 fmdv serotypes (o-a-asia 1) with glycine and proline bridges in between. guinea pigs were vaccinated with the construct and, at 28 days post vaccination, the cellular response was evaluated by studying cd8 levels, by mtt assay and by cytokine profiles in real-time assays. the humoral response was evaluated by studying cd4 levels in whole blood by facs analysis and serum antibody levels by snt and elisa. the animals were challenged with 100 gp infective doses of fmdv type 'o' virus and lesions were scored. further, the replicative efficiency of the challenge virus was studied by 3ab elisa. the results showed that all the assays except antibodies against 3ab protein have a positive correlation with protection. as expected, the titre of antibodies against 3ab protein was lower, indicating that challenge virus replication was inhibited in the vaccinated animals. the limited studies conducted by us showed that the self replicating vaccine has the potential to emerge as a potent vaccine for fmd. ganjam virus (ganjv) belongs to the genus nairovirus (family bunyaviridae). these viruses cause diseases in livestock. ganjv has been isolated from different animal hosts and tick vectors in india. 
the genus nairovirus includes a total of 34 tick-borne viruses, classified into 7 serogroups. the important serogroups are crimean congo hemorrhagic fever (cchf) and nairobi sheep disease (nsd). the main members of the nsd group are nsd and dugbe viruses. their genome consists of three segments of single-stranded rna, viz. s, m and l, that encode the viral nucleocapsid protein, the viral glycoproteins g1 and g2, and the viral polymerase, respectively. ganjv is very closely related to nsd virus (nsdv). nsdv is found in east and central africa and causes very high morbidity and mortality in livestock. the present study involves phylogenetic comparison of ganjv isolates from india with other nairoviruses based on the complete n gene. it will help to understand the kind of nucleotide (nt) and amino acid (aa) changes that have occurred in ganjv strains from different geographical areas. eight strains of ganjv isolated at niv during 1954-2002 from different parts of india were used in this study. virus stocks were prepared in the vero e6 cell line; these were used as the source of viral rna. the n gene was amplified either as a complete gene in one reaction or in fragments whenever necessary. the sequences thus obtained were analyzed, annotated to get a consensus sequence, and aligned against the sequence of the prototype strain of ganjv and other representative nairoviruses. the nt sequences were converted to aa sequences and analysis was done at both the nucleotide and amino acid levels. phylogenetic trees were constructed from the nt and aa sequences and compared with other nairoviruses (cchf, dugv, hazv, kupv and nsdv) for which complete s segment sequences were available in the genbank database (ncbi). the phylogenetic data at both the nt and aa levels showed that all the strains of ganjv form a monophyletic lineage with nsdv. cchfv and hazv together formed another clade, whereas dugv and kupv made a separate branch in the tree. 
the different ganjv strains showed 9-10% difference from nsdv at the nucleotide level and 3-4% difference at the amino acid level. hazv showed 37-38% difference at the nt level and 37% difference at the aa level from ganjv as well as nsdv. the data obtained suggest that ganjv and nsdv are minor variants of the same virus. diarrhoeal syndrome is one of the major concerns of the livestock industry. most diarrheic cases in animals go unnoticed and limited attention is paid to viral etiology. the presence of large amounts of fecal matter in the animal shed acts as a source of infection for calves via drinking water, feed or contaminated soil. keeping this in view, an investigation was planned to detect the association of rotaviruses with diarrhea in dairy calves and to observe the genomic diversity among the circulating viruses in the tarai area of uttarakhand. a total of 63 diarrheic fecal samples collected from the instructional dairy farm, nagla, pantnagar, uttarakhand were screened during the study. samples were collected from both cow calves and buffalo calves of 0-3 months of age. for the diagnosis of rotavirus, all the fecal samples were subjected to rna electrophoresis after nucleic acid extraction. viral genome segments were visualized by silver staining. of the 63 samples tested, seven were found positive in rna-page, showing the typical 11-segment genome migration pattern of bovine rotavirus. in the given samples, the prevalence of bovine rotavirus was 11.32% and 10% in cow and buffalo calves, respectively. on the basis of the migration patterns in rna-page, group a rotaviruses were identified by their typical 4:2:3:2 pattern. variation in the mobility of various genome segments among isolates of bovine rotavirus was observed during the study, which may be indicative of the emergence of mutants among the circulating isolates. a vp6 gene based, group a specific rt-pcr was standardized and all the isolates in this area were confirmed to be of group a type. 
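the reported prevalences can be cross-checked against the totals given above (63 samples, 7 positives). the python sketch below searches for whole-number splits between cow and buffalo calves that reproduce 11.32% and 10%. the per-species counts it returns are an inference from the reported percentages, not figures stated in the abstract, and the function name `consistent_splits` is hypothetical.

```python
def consistent_splits(total, positives, pct_cow, pct_buffalo, tol=0.005):
    """Return (n_cow, pos_cow, n_buffalo, pos_buffalo) tuples whose
    percentages match the reported values within `tol` percentage points."""
    matches = []
    for n_cow in range(1, total):
        n_buf = total - n_cow
        for pos_cow in range(positives + 1):
            pos_buf = positives - pos_cow
            # Positives can never exceed the number of samples in a group.
            if pos_cow > n_cow or pos_buf > n_buf:
                continue
            cow_ok = abs(100.0 * pos_cow / n_cow - pct_cow) < tol
            buf_ok = abs(100.0 * pos_buf / n_buf - pct_buffalo) < tol
            if cow_ok and buf_ok:
                matches.append((n_cow, pos_cow, n_buf, pos_buf))
    return matches


# Totals and percentages as reported; the split itself is inferred.
print(consistent_splits(63, 7, 11.32, 10.0))
```

under these assumptions the only split that fits is 6 positives among 53 cow calves and 1 positive among 10 buffalo calves, which also explains why the pooled figure (7 of 63, about 11.1%) differs slightly from both reported values.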
work is in progress to genotype the bovine rotaviruses of this region based on the vp7 and vp4 genes. this study emphasizes the need to explore the prevalence of bovine group a rotaviruses in different places of uttarakhand and to characterize them genetically, which could help in the selection of control strategies for rotavirus infections. foot-and-mouth disease (fmd) is endemic in india, causing enormous economic loss to animal keepers and a trade embargo with fmd-free countries on livestock and animal products. rapid diagnosis of fmd is of immense importance in prevention and control of the disease. fmd is initially diagnosed clinically and confirmed by laboratory tests. virus isolation in cell culture and sandwich elisa for antigen detection are commonly practiced in laboratories. virus isolation is very sensitive but can be slow, whereas the elisa has lower analytical sensitivity and cannot be used with certain sample types. the use of molecular techniques in the diagnostic laboratory has greatly increased the speed, specificity and sensitivity of fmd diagnostic tests. molecular techniques like rt-pcr, pcr-elisa and dot hybridization can be used with more success for detecting carrier animals and animals harboring sub-clinical infection, and can be applied to a wide range of clinical sample types. these techniques can be used as genus- and serotype-specific tests, including detection of particular lineages/genotypes within a serotype. multiplex pcr has been used to differentiate serotypes of fmdv, and the technique is sensitive, experimentally simpler, cost-effective and less time-consuming. the assay can be used for serotyping elisa-negative samples. the molecular techniques not only help in diagnosis but are also useful for epidemiological studies. lineage-differentiating rt-pcr has been useful in identifying different lineages of serotype asia 1 (lineages b, c and d) before proceeding with sequencing of the 1d region. 
similarly, genotype-differentiating rt-pcr has been developed and used to differentiate two genotypes of serotype a (genotypes vi and vii). these assays have the potential to be applied to clinical samples directly, thereby saving much of the time needed for sample processing and nucleotide sequencing. the recent development of real-time rt-pcr methodology has allowed the diagnostic potential of molecular assays to be realised. advances in real-time pcr technology have made it possible to combine several assays within a single tube, work on which is in progress in our laboratory. integration of these assays onto automated high-throughput platforms provides diagnostic laboratories with the capability to test large numbers of samples. microarray technology has provided greater screening capabilities for pathogen detection. the microarray allows the addition of a large number of oligonucleotide probes for identification of mutant pathogens and also for subtype determination. the combined properties of high sensitivity and specificity, low contamination risk and speed have made real-time pcr and microarray technology a highly attractive alternative to conventional methods for increasing the percentage of outbreaks confirmed and analyzed and for tracing the origin of the fmd virus responsible for outbreaks. dna vaccines are expected to elicit both humoral and cellular responses, the cellular response being long lasting. however, the approach has several limitations, like poor stability of dna, poor expression and risk of integration. poor expression becomes the major limitation in the case of fmd, as fmdv proteins are poor immunogens. also, dna vaccine vectors carrying only eukaryotic promoters elicit a strong cmi response and a weak humoral response. the methodology to achieve a humoral response involves the expression and secretion of the expressed protein so that antigen-presenting cells will be able to process the antigen and produce a humoral response. 
in the case of fmd, the humoral response is as important as the cellular response. the present project aims at addressing these issues, achieving higher expression and getting the protein secreted, by constructing self replicating gene vaccines for fmd and studying their efficacy. the vector for humoral immune response contains the eef1 promoter, the sindbis virus polymerase gene, and secretory and anchoring signals. the integrity of the vectors was confirmed by sequence analysis. the linked polyvalent protein genes of fmdv serotypes a, o and asia 1 were cloned into the vectors and the presence of the insert was confirmed by restriction enzyme digestion. the functionality of the constructed dna vaccine vector (pvac self rep 990) was assayed by transfecting the dna into bhk 21 cell monolayers and studying the 35s-labeled proteins in immunoprecipitation assays. the studies showed a high level of expression of the specific protein with the constructed vector as compared to virus infection. the secretion of the expressed protein was assayed by immunofluorescence assay and found to be positive. encouraged by these results, preliminary vaccine efficacy studies were conducted in the guinea pig model. the immunized guinea pigs showed high antibody titres by snt and elisa as compared to conventional dna vaccines (pup3cd), even at 1/10th of the dose. this is the first report of the construction of a self replicating dna vaccine for humoral response. genetically engineered microorganisms are important sources of industrial and medicinal proteins. over the past decade, plants have been investigated as potential host systems for expressing proteins of therapeutic and diagnostic use. however, concerns regarding stability and environmental safety need to be addressed. chloroplast engineering is expected to resolve some of these issues, since plastids/chloroplasts are inherited maternally and are not disseminated through pollen. 
this makes plastid transformation a valuable tool for creating transgenics, besides offering biological containment. foot and mouth disease (fmd) is a major concern the world over. it is the most feared viral disease of cloven-footed animals, causing heavy losses to the livestock industry. the disease is enzootic in many parts of the world, including asia. the conventional vaccines for fmd have several limitations, which include safety, temperature sensitivity and duration of immunity. attempts have been made to overcome these limitations using recombinant dna technology. amongst the newer vaccines, edible vaccines are cost-effective and easy to administer. since the stability of the gene of interest is the major concern in the case of plant transgenics, marker genes are used for regular selection. the detection methods based on the available marker proteins, like β-glucuronidase (gus) protein or antibiotic selection, are cumbersome and cost-intensive. however, selection based on herbicide resistance is much simpler and easier. hence, in the present study, the 5-enolpyruvylshikimate-3-phosphate synthase (epsp) gene was used as a marker along with the immunogen gene of fmdv. epsp synthase is the key enzyme in the shikimate biosynthesis pathway necessary for the production of aromatic amino acids. in order to investigate the mechanism of long-term immunity and the protective immunity induced by cationic plg microparticle-coated dna vaccination, we constructed an expression plasmid containing the 1d gene of foot-and-mouth disease virus (fmdv) serotype a. intramuscular vaccination of guinea pigs with the microparticle-coated plasmid dna induced a strong antibody response with neutralizing antibodies and a cell-mediated immune response, which lasted 1 year. we further analyzed the persistence and expression of the 1d gene by polymerase chain reaction, reverse transcriptase polymerase chain reaction and quantitative pcr. 
the results showed that the 1d gene was present and expressed in muscle cells up to 1 year post vaccination. furthermore, guinea pigs vaccinated with microparticle-coated plasmid dna were protected against challenge with fmdv. therefore, microparticle-coated plasmid dna vaccination does induce protective immunity and long-term humoral and cellular immune responses against fmdv, which could be maintained by persistent expression of the 1d gene in muscle cells. foot and mouth disease virus (fmdv) causes a highly contagious viral disease of cloven-hoofed animals, which has a considerable socioeconomic impact on the countries affected. interleukin-18 (il-18) enhances the il-12 driven th1 immune response that is important in immunity against intracellular pathogens. the multiple roles of il-18 in many physiological and pathological processes have generated a great deal of interest in recent years. antiviral effects of il-18 have been reported. we evaluated the effects of il-18 on the replication of fmdv in vitro in bhk-21 cells. the bovine il-18 mature protein coding sequence was amplified from bovine pbmc and cloned into the prokaryotic expression vector pet32a. the expressed protein was purified and its specificity was confirmed by immunoblotting. bhk-21 cells were treated with purified expressed il-18 protein (2 µg/ml) 4 h prior to fmdv infection. cell culture supernatants were collected at 24 h post infection and subjected to elisa and virus titration assay. rna extracted from the cells was subjected to real-time pcr for viral rna quantification. a 2 log reduction in fmd virus titer was observed in il-18 treated cells compared to untreated cells, whereas virus antigen quantified by elisa showed a 60-fold reduction. a 69-fold reduction in fmd viral rna copy number, measured by qpcr, was observed in il-18 treated cells compared to untreated cells. 
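the three reductions above are reported on different scales (log10 titer versus fold change), so putting them on a common scale helps when comparing them. the python sketch below does only that arithmetic; the helper names are hypothetical and nothing beyond the reported figures (2 log, 60-fold, 69-fold) is assumed.

```python
import math


def log10_reduction_to_fold(log_reduction):
    """Convert a log10 reduction (e.g. in virus titer) to a fold reduction."""
    return 10 ** log_reduction


def fold_to_log10_reduction(fold):
    """Convert a fold reduction to the equivalent log10 reduction."""
    return math.log10(fold)


# Reported values: 2 log (titer), 60-fold (ELISA antigen), 69-fold (qPCR RNA).
print(log10_reduction_to_fold(2))                 # titer reduction as fold change
print(round(fold_to_log10_reduction(60), 2))      # ELISA reduction in log10 units
print(round(fold_to_log10_reduction(69), 2))      # qPCR reduction in log10 units
```

on a common scale, the 2 log titer reduction corresponds to a 100-fold drop, while the elisa and qpcr reductions correspond to roughly 1.8 log10 each, so the three assays agree to within about a quarter of a log.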
the current study demonstrated the potent antiviral activity of il-18 against fmdv through inhibition of viral replication. these results further suggest a potential role for il-18 as a molecular adjuvant in fmd vaccine development and in the development of therapeutics for fmd. foot and mouth disease is the most contagious viral disease of farm animals. control of the disease in animals is by vaccination and slaughter of infected animals. the conventional oil adjuvant vaccine has its own limitations. as an alternative, genetic vaccines, in which dna encoding the viral antigen is delivered, may be a promising approach. naked dna vaccines have limitations like poor uptake of dna by cells and, more importantly, by the nucleus. therefore, delivery of dna through calcium phosphate nanoparticles was attempted. the calcium phosphate nanoparticle is a potential delivery agent which has been shown to enhance the immune response. fmdv p1-3cd 'o' vaccine gene constructs in pcdna3.1+ entrapped in nanoparticles were prepared using different molarities of calcium chloride and disodium hydrogen orthophosphate. the nanoparticles entrapping fmdv p1-3cd 'o' and naked dna were administered to guinea pigs by intramuscular injection to study the mrna expression of the antigen by rt-pcr. animals were sacrificed at defined times to collect different organs, and total rna was extracted. each time, blood was collected to analyse fmdv-specific serum antibodies. dna vaccines delivered through calcium phosphate produced transcripts in the injected muscle for up to 240 days, whereas naked dna did so for up to 120 days. animals given naked dna vaccine showed serum antibody titres till 60 days, whereas nanoparticle-injected animals showed serum antibody till 120 days. serum neutralization titres of 1.5 were observed with calcium phosphate dna vaccines at about 28-150 days, whereas naked dna sn titres were observed for a shorter period of 30-90 days. 
the study clearly showed that calcium phosphate nanoparticles entrapping fmdv vaccine dna may be a better delivery system for dna vaccines, as they ensure availability of the antigen and persistence of antibody for a longer duration than naked dna. capripox is a highly infectious, contagious, oie-notifiable disease of small ruminants caused by sheeppox and goatpox viruses, which are members of the capripoxvirus genus of the family poxviridae. in the present study, we analyzed the partial gene sequences of the p32 protein, an immunogenic envelope protein of capripox viruses (capv), to assess the genetic relationship among different sheeppox and goatpox virus isolates from different geographical areas of the country. the product of this gene has been shown to be important in attachment of capv to host cell surface receptors during viral entry and in the host immune response. the following virus isolates were used in the analysis: gtpv-uttarkashi, p60, vaccine virus; gtpv mukteswar, p10, challenge virus; gtpv (akola), gtpv bareilly/00, gtpv ladakh/01 and gtpv sambalpur/82, field isolates; sppv srinagar, p40, sppv ranipet, p50 and sppv-rf, p50, vaccine viruses; and sppv makdhoom/07, sppv cirg/08, sppv pune/08, sppv bareilly, sppv 183/03 and sppv 125/02, field isolates. in this study, all virus isolates were confirmed by pcr amplification and analysed by pcr-restriction fragment length polymorphism (pcr-rflp) using ecori enzyme to confirm their specificity. further, the amplicons were cloned and sequenced commercially. nucleotide and deduced amino acid (aa) sequences were compared with published sequences of members of the genus capripoxvirus. sequence analysis of the partial 172 bp sequence showed high sequence identity among all indian sppv and gtpv isolates at both the nt and aa levels. it revealed 99.4-100% and 98.2% identity among gtpv field isolates, whereas sppv field isolates showed 100% identity, at the nt and aa levels. 
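percent identity over a short aligned fragment moves in coarse steps, which helps in reading the figures above. the python sketch below computes identity between two equal-length, gap-free aligned sequences. the 172 nt sequences are made up for illustration and are not the p32 sequences from the study, and treating the fragment as 57 complete codons is an assumption about the reading frame.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical 172 nt reference and a variant with one substitution at position 120.
reference = "acgt" * 43
variant = reference[:119] + "a" + reference[120:]

print(round(percent_identity(reference, variant), 1))  # one mismatch in 172 nt
print(round(100.0 * 56 / 57, 1))                       # one mismatch in 57 codons
```

a single nucleotide difference over 172 positions gives 99.4% identity, and a single residue difference over 57 codons gives 98.2%, so the reported lower bounds are consistent with isolates that differ at only one position in this fragment; that reading is an inference from the arithmetic, not a statement made in the abstract.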
in general, capv isolates in this study showed 98.3-98.8% and 96.5% homology between gtpv and sppv at the nt and aa levels, respectively, as reported earlier. further, the analysis revealed a unique g120a change in all gtpv isolates, resulting in the formation of a drai site in place of ecori and the possible development of a restriction enzyme specific pcr-rflp for differentiation of sppv and gtpv among field isolates. orf or contagious ecthyma is considered as non-contagious, proliferative disease and is caused by orf virus of the genus parapoxvirus of the family poxviridae. it is reported most commonly in sheep and goats and is also a zoonotic agent. camels are also infected by orf virus, and infection has been reported in camel-rearing countries as a mixed infection with camel pox, the latter being caused by an orthopoxvirus. in india, there are few reports of orf virus infection in camels, identified by clinical signs and pcr. in this study, we identified the presence of orf virus in clinical samples from a suspected case of sporadic infection in camels by serological and molecular techniques. viral dna isolated from processed scabs was used initially in a nested polymerase chain reaction as a diagnostic pcr, which successfully amplified 235 bp fragments; these were also sequenced to check the fidelity of the product. after confirming the infection by pcr, some of the structural and non-structural genes were amplified for sequence analysis. of the five genes characterized, the most important one selected for sequence and phylogenetic analysis was the b2l gene, which is homologous to the major envelope protein p37k of vaccinia virus. the full open reading frame of 1137 bp of orf b2l was amplified by pcr, cloned and sequenced commercially. nucleotide and deduced amino acid sequences of b2l were compared with other published sequences of members of the genus parapoxvirus. 
sequence analysis shows a maximum percent identity of 94.8 and 95 (indian orf virus isolates); 94.7 and 94.5 (other orf isolates); 98.8 and 98.7 (orf-camel/jodhpur/08); 85 and 82.8 (bovine papular stomatitis virus); and finally 97.4 and 97.6 (pseudo cowpox virus) at the nt and aa levels, respectively. phylogenetic analysis of the isolate was also performed using the neighbour joining method in the mega 4 program to determine the phylogenetic relatedness of the virus, which revealed that the isolate groups well with the jodhpur isolate and is closely related to pseudo cowpox virus. this warrants further analysis of other potential genes to confirm pseudo cowpox virus as the causative agent of contagious ecthyma in camels. chikungunya, an arboviral disease, is transmitted through the bite of an infected aedes mosquito. it causes a self-limited febrile illness along with arthralgia and myalgia. in some cases neurological and severe hemorrhagic manifestations have been observed. chikv epidemics have been reported in africa, india and south east asian countries, and during the current outbreak imported cases of chikv have been encountered in most european countries. the causative agent belongs to the genus alphavirus, family togaviridae. human beings serve as the chikungunya virus reservoir host during epidemic periods. outside these periods the main reservoirs are monkeys, rodents, birds and other unidentified vertebrates. antibodies to chikv have been detected in domestic animals. in the present study we surveyed madanapalli, palamaner, b. kotta kota and tirupati and collected a total of 67 rodent samples, 75 bovine samples, 20 sheep samples and 15 canine samples. total rna was isolated from all these samples and subjected to rt-pcr using the primer pair dvrchk-f/dvrchk-r, which amplifies a 330 bp e1 gene product specific to chikungunya virus (naresh kumar et al. 2007). 
all the serum samples were further screened for chikv-specific igm antibodies using commercially available ctk biotech strips. none of the samples was found positive for either chikv-specific rna or chikv-specific igm antibodies. more samples from domestic animals as well as rodents are being screened to study their possible role, if any, in the maintenance of chikv in nature during inter-epidemic periods. the present study discusses these aspects in detail. petunia hybrida is widely used as an experimental host plant for begomovirus identification and characterization. hitherto, natural infection of petunia by begomovirus has not been reported in india. recently, petunia hybrida plants grown in and around ludhiana were found to show typical symptoms caused by begomovirus. the symptoms include severe reduction in leaf size, downward curling and distorted leaves. severely infected plants became bushy and stunted and produced no flowers. total genomic dna was extracted by the ctab method from plants showing symptoms of begomovirus. the presence of virus was confirmed using degenerate primers designed to identify all the begomoviruses prevailing in the world. to identify the strain associated with the disease, the positive samples along with a healthy control were tested against strain-specific primers for the tomato leaf curl viruses so far reported in india, i.e. tomato leaf curl new delhi virus, tomato leaf curl palampur virus, tomato leaf curl bangalore virus, tomato leaf curl karnataka virus and tomato leaf curl gujarat virus. among these, only the tomato leaf curl new delhi virus specific primer gave the desired amplicon of ~1180 bp. hence, it is confirmed that the leaf curl disease of petunia hybrida is associated with tomato leaf curl new delhi virus. this disease of petunia can become a severe production constraint in coming years. 
over the last 2 years (2008 and 2009) it was observed that some varieties of brinjal grown in the rainy season showed typical leaf curl symptoms. the symptoms include upward curling of the leaves, cupping, vein thickening, reduction in leaf size and distortion of leaves. severely infected plants remain stunted and bushy and become unproductive or produce only a few fruits. the disease was experimentally transmitted from naturally infected brinjal to healthy seedlings by whiteflies (bemisia tabaci) and grafting, but not by mechanical or aphid transmission. to detect the associated begomovirus, total genomic dna was extracted from plants showing disease symptoms. the presence of virus was confirmed using pcr-based begomovirus genus-specific primers designed by deng et al., wyatt and brown, and rojas et al. these degenerate primers gave the expected product sizes of ~530, ~575 and ~1280 bp, respectively. the core coat protein (cp) gene and dna-b were also amplified from the samples using specific primers. to identify the strain associated with the leaf curl disease, dna was tested by pcr against primers for different indian tomato leaf curl virus strains, i.e. tomato leaf curl new delhi virus, tomato leaf curl palampur virus, tomato leaf curl bangalore virus, tomato leaf curl karnataka virus and tomato leaf curl gujarat virus. among these, only the tomato leaf curl new delhi virus primer gave the desired product size of ~1180 bp. therefore, it was confirmed that leaf curl disease of brinjal is caused by tomato leaf curl new delhi virus in association with satellite b-dna. to identify the strain associated with the disease, all samples were further tested by pcr with specific primers designed to amplify all the tomato leaf curl virus strains so far reported from india, i.e. tomato leaf curl new delhi virus, tomato leaf curl palampur virus, tomato leaf curl bangalore virus, tomato leaf curl karnataka virus and tomato leaf curl gujarat virus. 
among these, only the tomato leaf curl palampur virus specific primer gave the expected product size of ~900 bp. this shows the association of tomato leaf curl palampur virus with leaf curl disease of calendula and marigold. thus, calendula and marigold can act as a reservoir for tomato leaf curl palampur virus, which may become a severe constraint on the production of these important ornamental plants. groundnut bud necrosis virus (gbnv) belongs to serogroup iv of the genus tospovirus in the family bunyaviridae and infects several economically important crops all over india. the nucleocapsid protein (np) encoded by the small rna of gbnv encapsidates the viral rnas. apart from this structural role, the np has also been implicated in replication, transcription, maturation and cell-to-cell movement. with a view to studying its structure and function, the np of a gbnv tomato isolate from karnataka was overexpressed in e. coli and purified by ni-nta chromatography. the purified np was present as a ribonucleoprotein complex and as a heterogeneous mixture containing monomers, tetramers and higher-order multimers. in order to determine the regions involved in oligomerization and nucleic acid binding, a mutational approach was taken. n- and c-terminal deletion clones were generated (n20np, n40np, c15np and c37np), overexpressed in e. coli and purified by a procedure identical to that used for the wild-type protein. initial studies on oligomeric status suggested that, in addition to the n- and c-terminal regions, there may be additional regions or residues which contribute to multimerization of np. the amount of rna bound to the truncated proteins was reduced in the case of n20np, n40np and c15np. interestingly, removal of 37 amino acid residues (a natively unfolded region) from the c terminus resulted in complete loss of nucleic acid binding, suggesting that the rna binding domain is located in the c-terminal region of np. 
further, np was observed to be phosphorylated in in vitro kinase assays by a kinase present in the soluble fraction of tobacco plant sap. both atp and gtp were utilized as phosphoryl donors and mn²⁺ was the preferred metal ion, which suggests that np might be phosphorylated by a ck2-like protein kinase. phosphorylation studies with n- and c-terminal truncated proteins revealed that the site of phosphorylation lies within amino acid residues 40-239. by mass spectrometric analysis of the protein, threonine-84 and serine-202 were identified as possible phosphorylation sites. a naturally occurring isolate of a virus infecting gherkin (cucumis anguria l.), causing symptoms of mosaic, leaf distortion and dark green islands in the lamina, was identified in export cultivars of gherkin grown in commercial fields of kuppam rural, chittoor district, andhra pradesh. the virus infection was highly prevalent in the fields and caused considerable economic damage to the crop through yield losses and reduced quality of fruits meant for export. symptoms on infected fruit included blistering and malformation. virus-infected leaf samples were collected, and initial host range tests conducted with different cucurbit species showed that the host range includes propagation hosts such as cucumis anguria (gherkin), cucumis sativus, cucurbita pepo, cucumis melo, lagenaria vulgaris and momordica charantia, and a local assay host, chenopodium amaranticolor. the host range of the virus was restricted to cucurbit species and chenopodium. the virus was maintained for further studies on cucurbita pepo by mechanical sap inoculation. the virus induced mosaic and vein clearing symptoms on pumpkin. electron microscopy of leaf dip preparations from symptomatic pumpkin leaves, stained with 2% uranyl acetate, revealed the presence of long flexuous filamentous particles measuring 750 × 12 nm. 
the virus reacted positively with polyclonal antisera to papaya ringspot virus-w, potato virus y and tobacco etch virus, and reacted strongly with the polyclonal antiserum to zucchini yellow mosaic virus in direct antigen coated-enzyme linked immunosorbent assay (dac-elisa). because of the very strong reaction with the zucchini yellow mosaic virus antiserum, we amplified the partial nib and cp genes of the virus along with the 3′ utr using two primers, zy2 (5′-gctccatacatagctgagacagc-3′) and zy3 (5′-taggctttttgcaaacggagtctaatc-3′). total rna from infected gherkin leaves was isolated using trizol ls reagent (sigma). rt-pcr was performed to obtain an amplicon of ~1.2 kbp, which was cloned into the fermentas ptz57r/t vector and sequenced at mwg biotech, bangalore. sequence analysis revealed that the virus is an isolate of zucchini yellow mosaic virus showing 98% sequence identity with the zucchini yellow mosaic virus strain b genome (ay188994) and the zucchini yellow mosaic virus nat genome (ef062582), both strains reported from israel. the sequence from the present study was submitted to genbank (gq482976). the results raise the suspicion that the virus could have been introduced into india through infected material brought by the commercial israel-based companies, owing to poor quarantine regulations, as gherkin cultivation in these regions is chiefly supported, purchased, exported and marketed by these private companies. this is the first report on molecular characterisation of zucchini yellow mosaic virus infecting cucumis anguria (gherkin) from india. they also exhibited synergism with other viruses, which was region specific. fifty percent of the total symptomatic plant population was found to be positive only for carlavirus, while the remainder showed mixed infection of carlavirus with tospovirus in some regions and with cmv in others. 
infection by carlavirus alone, with an incidence of up to 10-20% and without association of tospo, cmv, poty or tobamo viruses, was also observed in some fields. avijit tarafdar, raju ghosh, k. k. biswas plant virology unit, division of plant pathology, indian agricultural research institute, new delhi 110012 citrus tristeza virus (ctv), a closterovirus of the family closteroviridae transmitted by the brown citrus aphid (toxoptera citricidus), is one of the major limiting factors in cultivation of citrus worldwide. ctv is one of the longest known plant viruses, having a flexuous particle of 2000 × 11 nm in size. the ctv genome is a positive sense ssrna of about 20 kb containing 13 open reading frames (orfs) encoding 17 proteins. several biological as well as genetic variants of ctv are reported in all the citrus growing countries of the world. ctv causes decline and death of millions of citrus trees worldwide. in india, ctv is a century old problem, and has killed an estimated one million citrus trees to date. at the molecular and genetic level, ctv isolates from india have not been fully characterized. genetic diversity and sequence divergence in indian ctv isolates are not fully established. further, evidence of recombination and the causes of evolution of ctv variants in india have not been studied to date. therefore, in the present study, an effort has been made to characterize several indian ctv isolates at the genetic level, examine their genetic diversity, identify recombination events and analyze the evolution of divergent ctv. a total of 73 ctv isolates from different regions of india (35 from darjeeling hills, five from bangalore, 15 from delhi and 18 from vidarbha) were taken up for genetic study. two genomic regions of ctv, i.e., the entire cp gene (cpg) (672 nt) and a 5′ fragment of orf1a (orf1a) (404 nt), were amplified, cloned and sequenced, and the nucleotide sequences were analyzed. 
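as a rough illustration of how pairwise nucleotide identity between two aligned sequences, such as cpg or orf1a fragments, can be computed, the following python sketch counts matching positions over ungapped alignment columns. the function name and the toy sequences are hypothetical and are not data from the study; the alignment and identity software actually used by the authors is not stated.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are skipped, so the
    denominator is the number of ungapped aligned columns. This is one of
    several common conventions and is an assumption of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# toy example (hypothetical sequences): 9 of 10 positions match
example = percent_identity("ATGCATGCAT", "ATGCATGCAA")
```

other conventions (for example, counting gaps as mismatches) give slightly different percentages, so identity ranges are only comparable when the same convention is used throughout.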
based on cpg, the indian isolates shared 88-99% nucleotide identity among themselves, and based on orf1a they shared 82-99% identity. the phylogenetic relationships were incongruent: sequence analysis generated five phylogenetic clades based on cpg and eight clades based on orf1a, suggesting that recombination events have occurred between the sequences of indian ctv isolates. thus, to identify potential recombination events and determine the parental sequences in ctv isolates, six recombination detection algorithms, namely rdp, geneconv, bootscan, maxchi, chimera and siscan, were used. out of the 73 indian ctv isolates, the cpg of 18 and the orf1a of 47 isolates showed recombination events, suggesting that orf1a is more prone to rna recombination than cpg. these findings indicate that the high degree of genetic diversity and the incongruent relationships of indian ctv isolates are due to genetic recombination, which may be an important factor driving the evolution of ctv variants in india; this was also supported by a splitstree decomposition analysis. b. v. bhaskara reddy, y. sivaprasad, k. rekha rani, k. raja reddy department of plant pathology, regional agricultural research station, acharya n.g. ranga agricultural university, tirupati, andhra pradesh sunflower (helianthus annuus l.) is one of the most important oilseed crops in the world, ranking third in area after soybean and groundnut. the sunflower necrosis disease (snd) is characterized by necrosis of leaves, necrotic streaks on petioles, stem and floral parts, and stunted growth. the causal agent of the disease has been identified as tobacco streak virus (tsv), which belongs to the genus ilarvirus of the family bromoviridae. suspected tsv-infected sunflower samples collected from chittoor district in andhra pradesh were found positive for tsv by dac-elisa. total rna was extracted from sunflower using the rneasy isolation kit (qiagen). 
the tsv coat protein (cp) gene, movement protein (mp) gene and replicase (rep) gene were amplified by rt-pcr with specific primers, cloned in the ptz57r/t vector, sequenced and deposited in genbank (gu355899, gu355900 and gu371445). the cloned cp gene was 717 bp in size and codes for 239 amino acids. the cp gene sequence analysis revealed that the tsv-tpt isolate infecting sunflower has 98-100% identity at the nucleotide level with soybean, tagietus-tpt and okra-tn isolates and 93-99% identity at the amino acid level. the movement protein gene was 615 bp and codes for 205 amino acids. the mp gene sequence analysis showed that it has 94-97% identity at the nucleotide level and 92-95% at the amino acid level. chilli (capsicum annuum), an important commercial vegetable and spice crop of himachal pradesh, is affected by several viral diseases; of these, cucumo, tospo, poty and gemini viruses are the most common genera. however, these viruses have not been clearly identified or fully characterized, which is needed first in order to formulate a management strategy. therefore, in the present study, an effort has been made to identify and characterize the important viruses causing diseases in chilli. several farms in the major chilli growing areas of bilaspur and kangra districts in himachal pradesh were surveyed and infected plant samples were collected randomly. virus infection in these samples was detected by das-elisa using antisera to cucumber mosaic virus (cmv) and potyvirus (group specific), and through slot-blot hybridization (sbh) using probes for cmv, iris severe mosaic potyvirus (ismv), tomato spotted wilt tospovirus (tswv) and chilli leaf curl geminivirus (clcuv). based on das-elisa and sbh, the disease incidence was estimated to range from 18.2 to 21.8% for cmv and 3.5 to 5.4% for potyvirus. to detect tospovirus and geminivirus in the infected chilli, the sbh test was carried out. infected samples that showed the maximum virus titer in both das-elisa and sbh tests were further confirmed by pcr using specific primers. 
amplicons of the desired sizes, ~540 bp, ~800 bp, ~570 bp and ~460 bp for cmv, poty, gemini and tospo viruses, respectively, were obtained. as the present study clearly indicated that cmv is the major virus infecting chilli in the hilly region of himachal pradesh, two isolates of cmv were characterized at the genetic level. the amplified products (~540 bp) of the cmv isolates palampur 1 and palampur 2 were cloned in the pgem-t cloning vector, sequenced and submitted to the ncbi database (palampur 1: acc-fm209497 and palampur 2: acc-fm209498). the sequences were then analyzed and compared with other sequences available in the database. based on sequence analysis, the present cmv isolates shared 99% nucleotide identity with each other and were closely related to the australian cmv isolate cmv-ly (acc-af198103), with 98% nucleotide identity. in phylogenetic tree analysis, the indian cmv isolates clustered together with cmv-ly. as cmv-ly is known to belong to cmv subgroup ii, it is concluded that the cmv isolates of this hilly region of himachal pradesh belong to subgroup ii. chilli is essentially a crop of the tropics and grows better in hotter regions. chilli (capsicum annuum), a member of the family solanaceae, is an important vegetable and spice crop of immense commercial importance. the pungency in pepper is due to an alkaloid known as capsaicin, and peppers are characterized as sweet, hot or mild depending on capsaicin content. the present investigation was conducted to identify highly resistant cultivars of capsicum annuum against cmv and tylcv among ten cultivars of chilli under the agroclimatic conditions of aligarh. the highest percentages of infection (70 and 80) were observed in hc-201 and kalyanpur type-1, which showed a positive reaction to cmv in the elisa test. no symptoms were recorded in bc-16, lca-235 and jca-154, which showed a negative reaction to cmv by elisa. 
bc-16 and lca-235 also showed a negative reaction to tylcv by elisa and were symptomless. maximum infection (70 and 80) was registered in the hc-201 and c 8 cultivars. thus, bc-16, lca-235 and jca-154 proved to be highly resistant varieties against cmv and tylcv and may be used in breeding programmes for virus resistance. cotton leaf curl virus belongs to the family geminiviridae, genus begomovirus. the members of this family contain circular single stranded dna molecules as their genomes. there are two kinds of begomoviruses: bipartite viruses, with genomes consisting of two dna molecules designated dna-a and dna-b, and monopartite viruses, which contain only dna-a but not dna-b. in monopartite viruses, the dna-a is accompanied by a small circular dna molecule called dna-β, which is essential for the development of typical disease symptoms. cotton leaf curl virus is a monopartite virus and causes the cotton leaf curl disease, which has emerged as a major disease of cotton in the indian subcontinent. the non-structural protein ac4 of cotton leaf curl kokhran virus-dabawali isolate (clcukv-dab) was cloned into the pgex5x2 vector and overexpressed in bl21(de3)plyss e. coli cells. the overexpressed gst-ac4 protein was purified by glutathione sepharose chromatography. the purified gst-ac4 protein was found to possess atpase activity. the optimum temperature and ph for the activity were 37°c and 7.4, respectively. the atpase activity was inhibited in the presence of edta, showing that it is dependent on divalent metal ions. the activity was supported by magnesium, manganese and zinc ions but inhibited in the presence of calcium ions. it was also inhibited by the non-hydrolyzable atp analogue adenosine-β,γ-imido triphosphate and in the presence of other nucleotides such as ctp and gtp. the km and vmax of the reaction with atp as the substrate are 1.54 mm and 95.2 nmol/min/mg of protein, respectively. the enzyme could also utilize gtp as the substrate. 
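the reported kinetic constants can be read through the michaelis-menten equation, v = vmax·[s]/(km + [s]). the sketch below assumes that the ac4 atpase follows simple michaelis-menten kinetics (an assumption; the abstract reports only km and vmax), and the function and constant names are hypothetical.

```python
# kinetic constants for ATP as reported in the abstract
KM_ATP_MM = 1.54   # Km, in mM
VMAX_ATP = 95.2    # Vmax, in nmol/min/mg protein


def michaelis_menten_rate(substrate_mm: float,
                          km_mm: float = KM_ATP_MM,
                          vmax: float = VMAX_ATP) -> float:
    """Initial rate (same units as vmax) at a given substrate concentration in mM.

    Assumes simple Michaelis-Menten kinetics with no inhibition or cooperativity.
    """
    if substrate_mm < 0:
        raise ValueError("substrate concentration cannot be negative")
    return vmax * substrate_mm / (km_mm + substrate_mm)


# at [ATP] equal to Km the predicted rate is half of Vmax
rate_at_km = michaelis_menten_rate(KM_ATP_MM)
```

under this model the rate approaches vmax only at atp concentrations well above km, so an assay run near 1.5 mm atp would be expected to proceed at roughly half the maximal rate.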
that ac4 is specifically an ntpase and not a general phosphatase is shown by the finding that it does not hydrolyze p-nitrophenyl phosphate to yield a yellow colour, while a similar reaction carried out in parallel with alkaline phosphatase readily yields the colour. it has been suggested earlier that ac4 may be involved in cell-to-cell movement of the virus (rojas et al. 2001). it is possible that, through its ability to hydrolyze atp, ac4 serves to power viral movement in the plant. thirteen sugarcane yellow leaf virus isolates causing yellow midrib and irregular yellow spot patterns, from six states of india, were characterized by rt-pcr assays. scylv-615f and scylv-615r were used as the forward and reverse primers, and the amplified products were cloned and sequenced. comparative coat protein sequence analysis showed that all the indian scylv isolates clustered into two major groups, indicating the existence of two strains of scylv affecting sugarcane crops in india. in a separate experiment, members of both phylogenetic groups were found to be transmitted by the sugarcane aphid, melanaphis sacchari, from infected to healthy sugarcane, suggesting secondary spread in nature. the symptoms produced by the virus causing cotton mosaic disease differed slightly between sap inoculation and natural field conditions. under natural field conditions, plants showed clear chlorosis on the major leaves, whereas in sap-inoculated plants veinal chlorosis and mosaic symptoms were common. under field conditions, infected plants grow erect and form fewer bolls. no effect was found on seed shape or seed size. the initial symptoms produced on cotton leaves after inoculation were striking. local lesions were observed in the second week after inoculation; these then changed to chlorotic symptoms, and some to necrotic symptoms. 
plants affected at an early stage have less lateral branch development and hence reduced yield. naturally infected field plants, even those showing clear symptoms, are difficult to identify at a later stage of plant growth because the symptoms disappear with time. the virus is very easily sap transmissible. the virus was found to be transmitted by thrips palmi and thrips tabaci in a persistent manner. no seed transmission was observed. the virus showed the same physical properties as in stem necrosis of peanut or sunflower necrosis disease: thermal inactivation point (tip) 55-60°c, dilution end point (dep) 10⁻² to 10⁻³ and longevity in vitro (liv) 5 h. nineteen different host plants of the virus were identified, belonging to five families, viz. malvaceae, chenopodiaceae, compositae, leguminosae and solanaceae. however, they were found to produce the same types of symptoms as in most of the hosts that have been tested before. in elisa, the virus reacted positively only with antisera to tsv from cowpea and cotton, and negatively with antisera to pbnv from cowpea and cotton, which excludes the presence of pbnv in cotton plants showing these symptoms. the elisa results clearly show that the tsv antiserum of cowpea gives a positive reaction with samples showing clear chlorotic symptoms. a powerful approach to functional genomics, and an alternative to the massive generation of transgenic plants, is the use of the recently described virus induced gene silencing (vigs) process, which allows viral vectors to knock out the function of a gene of interest. vigs is based on a silencing mechanism that regulates gene expression by the specific degradation of rna. as a tool for reverse genetics, vigs has many advantages over other common ways to study gene function because of the ability of viruses to replicate and move systemically within a plant. 
vigs can generate a phenocopy of a mutant without the difficulties of traditional methods of mutagenesis. geminiviruses, with their small dna genomes and ease of inoculation through agrobacterium, are excellent candidates for vigs vector development. as a first step, the geminivirus bhendi yellow vein mosaic virus, characterized in our lab (jose and usha, virology 305: 310-317, 2003), has been chosen. the satellite β dna associated with this virus has a single open reading frame (βc1). βc1 is essential for symptom development but not for replication. therefore, βc1 has been replaced by a multiple cloning site harbouring sali, xbai, bamhi, bsrgi and xhoi sites, initially in a cloning vector and then in the binary vector containing the partial tandem repeat of the β dna. in place of the βc1 orf, the plant phytoene desaturase gene has been cloned, and the resulting construct was used for agroinfiltration along with the partial tandem repeat clone of the begomovirus (dna a component chilli (capsicum annuum l.) plants exhibiting prominent begomovirus-like symptoms, such as leaf curl, vein swelling, shortening of petioles, crowding of leaves and stunting, were collected from rorkee, uttarakhand and dhaulpur, rajasthan, india. total genomic dna was isolated from naturally infected chilli samples and pcr was carried out with primers specific to the coat protein gene (located in dna-a). as expected for these primers, ~800 bp dna fragments were amplified from the infected chilli samples. to test the bipartite nature of the virus isolates, primers specific to the nuclear shuttle protein gene (located in dna-b) were employed, which also resulted in positive amplification of ~850 bp dna bands from all the samples that tested positive for the coat protein. 
to ascertain the association of a dna-β component with the virus isolates, a set of dna-β specific primers was used, which resulted in positive amplification of full length (~1.3 kb) dna bands in the chilli samples collected from rorkee, uttarakhand; however, bands of multiple sizes were obtained with the samples collected from dhaulpur, rajasthan. these findings indicate that both the virus isolates under study are bipartite begomoviruses associated with a dna-β satellite. sequencing of the pcr products is in progress and the analysis will be discussed. groundnut bud necrosis virus (gbnv), belonging to the genus tospovirus, which is a unique member of the family bunyaviridae, infects several economically important crops. the virus has three genomic ssrna segments, namely s (ambisense), m (ambisense) and l (negative sense). the s rna codes for the nucleoprotein (np) and the non-structural protein (nss) from the viral complementary and viral strands, respectively. many viral nonstructural proteins, such as ns3 of hepatitis c virus, yellow fever virus and dengue virus, sv40 large t antigen and the cytoplasmic inclusion protein of tamarillo mosaic potyvirus, are known to exhibit rna/dna stimulated ntpase, dntpase and helicase activity. nss of gbnv does not have sequence similarity with any of the above mentioned viral rna/dna helicases but has an ntp binding domain. however, it has been implicated as a suppressor of gene silencing in vivo. with a view to elucidating the mechanism by which nss could act as a suppressor of gene silencing and examining the other potential roles of nss in the life cycle of the virus, the gbnv (to) nss was overexpressed in e. coli and purified by ni-nta chromatography. in vitro studies with the purified rnss suggest that it exhibits an rna stimulated ntpase activity. many of the proteins that possess rna/dna stimulated ntpase and datpase activity have also been shown to have atp dependent nucleic acid unwinding activity. 
it was therefore of interest to examine whether nss has nucleic acid unwinding activity. the helicase assays revealed that nss has dna/rna helicase activity. helicase activity of nss was absolutely dependent on atp and mg²⁺ ions. nss could unwind dsdna substrates with either a 5′ overhang or a 3′ overhang. mutation of the crucial lysine in walker motif a (k189) severely affected the unwinding activity, whereas mutation of the aspartate residue in walker motif b (d159) resulted in only a 20% loss of activity. in this regard, rnss is a unique enzyme which does not have the canonical helicase motifs but can catalyze dsdna/dsrna unwinding in an atp and mg²⁺ dependent manner. rnss might act as a suppressor of gene silencing by unwinding dsrna, the substrate for dicer. in addition to being a suppressor of ptgs, nss may also regulate viral replication and transcription by modulating the secondary structure of the viral genome. this finding on nss might pave the way for further studies on its role in viral replication and transcription. yellow vein mosaic disease of pumpkin (cucurbita moschata) poses a serious threat to the cultivation of this crop in india. the disease was found to be associated with whitefly-transmitted bipartite begomoviruses, which were detected in varanasi fields using polymerase chain reaction (pcr) with primers designed from the conserved coat protein region of begomoviruses in the ncbi database. all plant samples showing symptoms were infected with begomovirus. the virus species were provisionally identified by sequencing ~750 bp of the viral coat protein gene (av1 ageratum conyzoides is commonly known as billygoat-weed, chick weed, goatweed and whiteweed. in india it is popularly known as bill goat weed. it is an annual herbaceous plant with a long history of traditional medicinal use in several countries and is reputed to possess varied medicinal properties, including the treatment of wounds and burns. 
in cameroon and congo, it is used traditionally to treat fever, rheumatism, headache and colic. during a survey in and around gorakhpur in 2009, ageratum plants were found affected with symptoms of leaf curling, mosaic mottling and leaf yellows. the infected leaf samples were processed for virus identification by pcr assays. total dna was extracted and pcr was performed with begomovirus specific primers (tlcv-cp). a ~800 bp band was consistently amplified and resolved on a 1% agarose gel. the pcr products were directly sequenced and the sequence was submitted to genbank under accession no. gq412352. blast search analysis showed the highest similarity, 98%, with ageratum enation virus. vernonia cinerea leaves with yellow vein symptoms were collected around crop fields in madurai. a 550 bp product, amplified from total dna extracted from symptomatic leaves with degenerate primers designed to amplify a part of the av1 gene of the begomoviral dna a component, was cloned and sequenced. based on this sequence, specific primers were designed, and the full length dna a of 2745 nucleotides, with the typical genome organization of begomoviral dna a, was obtained and submitted to the embl database (acc no: am182232). sequence comparison with other begomoviruses revealed the closest identity (83%) with emilia yellow vein virus from china and less than 80% with all other known begomoviruses. the international committee on taxonomy of viruses (ictv) has therefore recognized vernonia yellow vein virus (vyvv) as a distinct begomovirus species. conventional pcr could not amplify dna b or dna β from the infected tissue. however, the β dna (1364 bp) associated with the disease was obtained (acc no: fn435836) by the rolling circle amplification-restriction fragment length polymorphism method (rca-rflp) using phi29 dna polymerase. 
sequence analysis shows that dna β of vyvv has the highest identity (81%) with the dna β of ageratum leaf curl disease and 58-77% with the β dna associated with other begomoviruses. infectious clones of vyvv dna a and dna β as dimers were made using the products of rca-rflp. these infectious clones will be used for agroinfection of vernonia and the results will be discussed. this is the first report of the molecular characterization of vernonia yellow vein virus (vyvv) from vernonia cinerea in india. production of the bulb and seed crops of onion (allium cepa l.) is hampered by onion yellow dwarf virus (oydv) and iris yellow spot virus (iysv), with incidences of 83.22% and 89.97% in the bulb crop and 90.65% and 89.58% in the seed crop, respectively, in the popularly grown cv. hisar-2. four symptom-based variants of oydv, designated grades a, b, c and d, produced varied types of symptoms in the onion crop, incurring heavy losses in bulb and seed production. iysv caused tiny hay coloured spots of different shapes and sizes on leaves and scapes, which later coalesced and led to drying and lodging of scapes. the plant height, bulb weight and bulb size were 37.7 cm, 75.5 g and 24.2 cm² in plants infected with oydv; 39.6 cm, 79.7 g and 25.5 cm² with iysv infection; and 35.1 cm, 68.4 g and 22.1 cm² with their combined infection, as compared to 40.6 cm, 88.4 g and 27.6 cm², respectively, in healthy plants of the bulb crop. in plants infected with oydv grade a the plant height was minimum (90.33 cm) whereas the number of umbels was maximum (9.20 umbels/plant), but other yield parameters, viz. weight/umbel (2.32 g), number of seeds/umbel (209), seed weight/umbel (0.64 g) and seed yield/plant (5.88 g), were the lowest recorded. the minimum reduction in plant height (100.26 cm), weight/umbel (6.72 g), number of seeds/umbel (633), seed weight/umbel (2.36 g) and seed yield/plant (11.90 g) was recorded in oydv grade d. 
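the bulb-crop figures above can be expressed as percent reductions relative to healthy plants. the following python sketch does this for bulb weight only; the values are those quoted in the abstract, while the function and variable names are hypothetical and the percentages are derived here rather than reported by the authors.

```python
def percent_reduction(healthy: float, infected: float) -> float:
    """Reduction of an infected-plant value relative to the healthy control, in percent."""
    if healthy <= 0:
        raise ValueError("healthy control value must be positive")
    return 100.0 * (healthy - infected) / healthy


# bulb weight (g) per treatment, as quoted in the abstract
BULB_WEIGHT_G = {
    "healthy": 88.4,
    "oydv": 75.5,
    "iysv": 79.7,
    "oydv+iysv": 68.4,
}

# percent loss in bulb weight for each infection relative to healthy plants
bulb_weight_loss = {
    treatment: percent_reduction(BULB_WEIGHT_G["healthy"], weight)
    for treatment, weight in BULB_WEIGHT_G.items()
    if treatment != "healthy"
}

for treatment, loss in bulb_weight_loss.items():
    print(f"{treatment}: {loss:.1f}% lower bulb weight than healthy plants")
```

the same function can be applied to plant height or bulb size by swapping in the corresponding values.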
the plant height was 98.84 cm with 5.10 umbels per plant, 4.24 g weight/umbel, 428 seeds/umbel, 1.25 g seed weight/umbel and 6.37 g seed yield/plant in iysv infected plants. the plant height (96.26 cm), umbels/plant (5.97), weight/umbel (4.60 g), number of seeds/umbel (432), seed weight/umbel (1.42 g) and seed yield/plant (7.82 g) were found to be the lowest in combined infection of oydv and iysv in comparison to higher values in healthy controls (104.50 cm, 4.90, 7.84 g, 677, 2.60 g and 12.74 g, respectively). a minimum reduction in the test weight, germination and seed vigour index (3.06 g, 75.68% and 926) was found with oydv grade a infection, whereas these were 2.92 g, 70.42% and 788 in iysv infected plants and 2.62 g, 70.4% and 776 in combined infection of oydv and iysv, in comparison to 3.84 g, 88.67% and 1276 in healthy plants. the maximum hampering of seed vigour parameters was recorded due to iysv infection. lodging of scapes caused by this disease was responsible for heavy losses in seed production and seed quality. cotton leaf curl disease is one of the major threats to cotton cultivation in northern india. a survey conducted during 2009 found disease incidence ranging from 70 to 90% in bhatinda, abohar, fazilka, sri ganganagar and hanumanghar. in order to study genetic variability in the virus, twelve clcuv isolates were partially characterized (700 bp common region, full length av2 gene and partial sequences of the ac1 and av1 genes). full length characterization of representative isolates from bhatinda, abohar, fazilka, sri ganganagar and hanumanghar is in progress. partial sequence analysis of the clcuv isolates revealed that the virus isolates collected during the 2009 cropping season are closely related to cotton leaf curl burewala virus from pakistan; the results are discussed. pratibha singh, h. s. 
savithri department of biochemistry, indian institute of science, bangalore tospoviruses, belonging to the family bunyaviridae, infect economically important plants such as groundnut, tomato and watermelon. they have a tripartite genome, with l, m and s segments of rna, in pseudo-circular (panhandle) form. the viral genomes encode four structural proteins (l, n, g1 and g2) in the antisense orientation, and two non-structural proteins, nss and nsm, in the sense orientation. nsm is the only protein unique to the plant-infecting tospoviruses within the family bunyaviridae and hence is proposed to be important for cell-to-cell movement. groundnut bud necrosis virus (gbnv), a member of the genus tospovirus, is the most prevalent virus infecting several species of leguminosae and solanaceae plants in india. total rna was isolated from gbnv infected tomato leaves and rt-pcr was performed using appropriate primers to amplify the nsm gene. the pcr product was cloned in the pgex5x2 vector. the recombinant nsm clone was transformed into bl21 (de3) e. coli cells and overexpressed by induction with 0.3 mm iptg. sds-page analysis of induced and uninduced fractions revealed the presence of an overexpressed protein of the expected size. the soluble gst-nsm was purified by gsh sepharose affinity chromatography. purified gst-nsm was shown to interact with an in vitro transcribed rna by electrophoretic mobility shift assay. further, nsm was shown to interact with the virus-encoded proteins np and nss using elisa and the yeast two hybrid system. nsm was also shown to be phosphorylated in vitro by the pellet fraction of plant sap. thus the recombinant gbnv nsm possesses the characteristic features of a movement protein, such as nucleic acid binding, interaction with the nucleocapsid protein, and the ability to undergo posttranslational modification. solanum melongena, commonly called egg plant, is one of the most important vegetable crops in the world. 
it is cultivated widely in the tropical and subtropical regions. several viruses, such as cucumber mosaic cucumovirus (cmv), potato virus y (pvy), potato virus x (pvx) and tobacco ring spot virus (trsv), infect egg plant under natural conditions. in india, the major crop loss due to cmv infection in brinjal is 57% (fao stat-2008). in the present study, infected leaf samples collected from local fields of ramapuram, chandamama palli, chandragiri, madanapalli, yadhamari and durgasamudram villages in and around tirupati were tested for cmv infection by dac-elisa with cmv antisera. the positive samples were then inoculated onto raised brinjal seedlings of selected varieties by mechanical sap inoculation. different varieties of brinjal, namely mullabadhine, ankhur, ravya, mattigulla, casper and easter egg, were used for monitoring susceptibility to cmv infection. mosaic symptoms were observed 2 weeks after inoculation in all varieties of brinjal except mullabadhine. among the susceptible varieties, ankhur was selected to study induced biochemical changes in chlorophylls, carbohydrates, proteins, nucleic acids and polyphenol oxidases in cmv infected brinjal leaves. in the infected leaves, a considerable reduction in chlorophyll and starch and an increase in total proteins, sugars, rna and polyphenol oxidases were observed when compared to healthy leaves. the amount of total starch, protein and dna decreased to about 25, 136 and 645 µg/g, respectively, in infected leaves, whereas sugars (75 µg/g), rna content (754 µg/g) and polyphenol oxidase activity increased as compared to healthy leaves. the above results suggest that the concentrations of chlorophyll, proteins, nucleic acids and carbohydrates and the polyphenol oxidase activity in brinjal leaves are altered by cucumber mosaic cucumovirus infection. 
leaf analysis was found to be a widely accepted diagnostic tool for assessing the nutritional status of vegetables. the present study deals with these aspects in detail.

total rna and dna were isolated from infected leaf samples. rt-pcr assays were performed using sugarcane yellow leaf virus (scylv) specific primers (scylv-615f and scylv-615r). scylv infection was detected in all the collected samples, which yielded an amplicon of the expected size (~610 bp) in rt-pcr. in another experiment, nested pcr with the phytoplasma universal primer pairs p1/p7 and fu3/ru5 amplified a characteristic 1.2 kb phytoplasma rdna product from the dna of all infected samples but not from the healthy sugarcane plants tested. dna extracts from plants with yellow midrib and leaf yellows produced products of 1250 bp, which gave typical phytoplasma profiles when digested with hae iii and hha i. no pcr amplification was obtained using dna from symptomless plants. our results suggest that the midrib yellowing and leaf yellows symptoms on sugarcane varieties in the uttar pradesh and uttarakhand states of india are mainly caused by mixed infection with scylv and scylp. the affected clumps showed reduced stalk height compared with healthy fields.

thirty-one sugarcane mosaic isolates belonging to sugarcane mosaic virus (scmv) and sugarcane streak mosaic virus (scsmv) were collected from china and india and confirmed by indirect elisa and by rt-pcr amplification with scmv- and scsmv-specific primers. the amplicons (0.8 kb) from the coat protein (cp) coding region were cloned, sequenced and compared with each other as well as with the sequences in genbank of 15 scmv isolates from sugarcane (australia, usa, china, brazil, mexico and south africa) and maize (australia, china, iran) and of one scsmv isolate from sugarcane (india).
maximum likelihood and maximum parsimony analyses robustly supported two major monophyletic groups that were correlated with the host of origin: the scmv subgroup, which included 18 isolates from china and only 13 isolates from india, and the scsmv subgroup, which contained all isolates from india. maize dwarf mosaic virus (mdmv) and johnsongrass mosaic virus (jgmv) were not detected in any of the samples tested. a strong correlation was observed between the sugarcane groups and the geographical origin of the scmv isolates. the 11 millable sugarcane samples from china contained a virus tentatively described as sorghum mosaic virus (srmv). three isolates from nine chewing canes in the fujian, yunnan and guizhou provinces of china also contained srmv, and the other 12 samples, including five isolates from india, were found to be infected with scmv. no srmv infection has been detected in sugarcane mosaic samples from india. sequence comparisons and phylogenetic analysis indicated that srmv can be considered the most common and prevalent potyvirus infecting sugarcane in china, whereas in india sugarcane streak mosaic virus is dominant in causing mosaic symptoms on sugarcane.

a dig-labeled dna probe complementary to the coat protein (cp) region of a sunflower isolate of tobacco streak virus (tsv), the most devastating virus in india, was designed for the sensitive and broad-spectrum detection of tsv isolates. dot-blot and tissue-print hybridizations with the digoxigenin-labeled probe were performed for tsv detection at field level. here, dot-blot hybridization was used to check a wide range of tsv isolates with a single probe and to compare sensitivity across different sample extraction methods. the probe, prepared from the conserved cp region of a sunflower pcr amplicon, was hybridized with tsv field isolates from gherkin, pumpkin, sunflower, marigold and globe amaranth samples, because the cp region is highly conserved with little variability.
sensitivity decreased from total nucleic acid preparations to partially purified and crude extract preparations. in particular, tissue-blot hybridization offers a procedure as simple and reliable as dot-blot but requires no sample processing. because sample preparation is minimal, tissue-print hybridization could be an important component of tsv management programs. thus, the above non-radioactive labeled probe techniques can facilitate the screening of samples during tsv outbreaks and in quarantine services.

savita patil, rupali sawant*, k. banerjee virology group, agharkar research institute, macs, g.g. agarkar road, pune 411 004

two mycobacterium smegmatis strains (ari lab nos. v842 and v946) were employed for the isolation of mycobacteriophages from soil and sewage samples. mycobacteriophages against m. smegmatis strain v842 were isolated from soil samples collected from an area surrounding the tuberculosis (tb) ward, naidu hospital, pune. these were numbered v942, v943 and v944 and were isolated using the washed-cell preparation method. the bacteriophages against the other m. smegmatis strain, v946, were isolated from soil samples collected from around the tb ward, sassoon hospital, pune. some of these phages (viz. v953, v954) showed plaques at 42°c but not at 37°c; thus they seem to be lysogenic. to propagate all the above isolates and increase their titre, various previously described methods were attempted, but none was satisfactory. however, when siliconized glassware and plastic-ware were used, propagation was successful. we showed that siliconization of glassware and plastic-ware was essential for the propagation of our mycobacteriophage isolates v951, v952, v953, v954 and v955. also, phage dilution medium (pdm) as described by chaterjee et al. (2000) was found to be effective for picking plaques made by the phages. in this way, the phage isolates were propagated up to p3.
the various passages of the phage isolates v951, v952, v953, v954 and v955 (i.e. original, p1, p2 and p3) were stored at -80°c.

pvp-29 effect on pigments due to geminivirus infection on cowpea (vigna unguiculata)

shail pande*, naveen pandey, k. shukla mahatma gandhi p. g. college gorakhpur, d.d.u. gorakhpur university, gorakhpur

geminiviruses are one of the most important groups of viruses causing economic losses in the tropics. the symptoms produced include yellowing of leaves, which directly affects the pigments of diseased plants and in turn affects their productivity and yield. cowpea (vigna unguiculata) is an important crop cultivated throughout india for its green pods, which are used as vegetables, and its seeds, which are used as pulses. cowpea is affected by many viruses, among which geminiviruses are some of the most important. in the present study, total chlorophyll content was measured in leaves of diseased and healthy cowpea plants using arnon's method. carotenoids were also measured using ikan's method. chlorophyll content was lower in diseased plants than in healthy plants, and similar results were found for carotenoids. geminivirus infection therefore lowers the chlorophyll and carotenoid content of diseased plants, which reduces the yield of diseased cowpea plants.

shweta sharma 1 , amrita banerjee 2 , j. tarafdar 2 , r. rabindran 3 , indranil dasgupta 1 * 1 department of plant molecular biology, university of delhi, south campus, new delhi; 2 bidhan chandra krishi vishwavidayalaya, kalyani, nadia, west bengal 741235; 3 tamil nadu agricultural university, coimbatore, tamil nadu 641003

rice tungro disease is an important disease of rice in south and southeast asia, caused by joint infection with two viruses: rice tungro spherical virus (rtsv) and rice tungro bacilliform virus (rtbv). the complex of rtbv and rtsv is transmitted by an insect vector, the green leafhopper (glh).
previously we reported the complete genomic sequences of two geographically distinct isolates of rtbv, rtbv-wb (west bengal) and rtbv-ap (andhra pradesh), collected from the field in the mid-1990s. both sequences showed high homology all along the genome but diverged from the previously reported southeast asian isolate, rtbv-phil (philippines). to check whether a period of a decade has resulted in variability in the genomic sequences of different rtbv isolates in india, we cloned and sequenced the complete genome of rtbv from two geographically distinct regions of india, west bengal and kanyakumari, collected from the field in 2008. the complete nucleotide sequence of the dna fragments covering the whole rtbv genome was determined using the universal primers m13f and m13r and by primer walking, without any ambiguities remaining. the nucleotide sequences of overlapping clones were assembled and analyzed using the dna analysis software generunner and the blastn program of ncbi. homology searches at the nucleotide and amino acid levels were performed using the blastn and blastp programs of ncbi, respectively. multiple sequence alignments were performed using clustal-w software. the sequence analysis showed that both of the recently obtained complete genomic sequences of rtbv, from west bengal and kanyakumari, showed very high homology (at both the nucleotide and amino acid levels) all along the genome with the two previously reported rtbv isolates from india, rtbv-wb and rtbv-ap. as observed earlier, both sequences diverged significantly from the southeast asian isolates. this suggests that, despite the spatial and temporal differences (a time gap of approximately 10 years) between the two previously reported rtbv isolates and the recently sequenced ones, there is very little sequence variability between them.
this further strengthens the earlier reports that the rtbv genomes in india are highly conserved. homology search at the nucleotide level using the blastn program against the previously existing rtbv isolates revealed a very high identity of 99% with the rtbv west bengal isolate and 95% with the rtbv andhra pradesh isolate. this further strengthens the earlier reports that there is not much genetic variability in the rtbv genomes in the indian subcontinent.

complete genomic rna sequences of two geographically distinct isolates from india of rice tungro spherical virus (rtsv), a member of the genus waikavirus, family sequiviridae, were determined. of the two previously reported sequences, the indian isolates were closer to the resistance-breaking strain rtsv-[vt6] than to rtsv[phila]. the two indian sequences showed nucleotide as well as amino acid identities of 96% with each other. a moderate homology was observed between the leader peptide and a putative helper component protein involved in insect transmission of maize chlorotic dwarf virus, a closely related waikavirus, indicating a possible transmission-related function. unlike rice tungro bacilliform virus, which causes rice tungro disease jointly with rtsv and differs significantly between isolates from india and the philippines, rtsv genomes were observed to be much more conserved between isolates from the two countries. rice tungro bacilliform virus (rtbv) and rice tungro spherical virus (rtsv) are believed to be the joint causative agents of the devastating tungro disease of rice prevalent in south and southeast asia [11] . rice tungro disease has become the major cause of production losses in rice during the last three decades in several rice-growing states of india. here we report, for the first time, the complete sequence analysis of two geographically distinct indian isolates of rtsv.
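the percent identities quoted in these abstracts (for example 99%, 95% and 96%) are derived from pairwise alignments. as a minimal sketch only, the python function below shows one common way to compute such a figure from two sequences that have already been aligned. the function name `percent_identity`, the toy sequences and the choice to skip gapped columns are all assumptions made for illustration; they are not taken from the abstracts, and blastn reports its own percentage over the full alignment length, so its values may differ.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment, so they have
    equal length and use '-' as the gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns without a gap in either sequence
    matches = 0   # columns where both residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are skipped (illustrative choice)
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 8 ungapped columns, 7 of them identical.
print(f"{percent_identity('ACGT-ACGT', 'ACGTTACGA'):.1f}% identity")  # 87.5% identity
```

counting gapped columns in the denominator instead would lower the value, so the gap convention should be stated whenever identities from different tools are compared.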
we analyze the deduced protein sequences and their phylogenetic relationship with the two complete rtsv sequences from the philippines as well as with other members of the family sequiviridae. we provide molecular evidence that the indian isolates of rtsv are closely related to those from the philippines. we had earlier reported that rtbv isolates from india and the philippines differ significantly from each other [18] . this study was undertaken to see whether rtsv isolates from india show a similar difference from those reported from the philippines.

frequent outbreaks of tungro were reported near kanyakumari in the last 2-3 years. the present work was undertaken to clone and sequence the full-length rtbv and rtsv genomes from infected rice plants collected from this region and to analyze the similarity of their genetic material with the existing indian isolates of rtbv and rtsv. a 1.1 kb dna fragment encoding the reverse transcriptase gene of the rtbv genome was amplified, cloned in a t/a vector and sequenced commercially. homology search at the nucleotide level using the blastn program against the previously existing rtbv isolates revealed a very high identity of 99% with the rtbv west bengal isolate and 95% with the rtbv andhra pradesh isolate. this further strengthens the earlier reports that there is not much genetic variability in the rtbv genomes in the indian subcontinent. similarly, the cp3 region of rtsv was amplified by rt-pcr and cloned in a t/a vector. recently, rice tungro disease has been reported from the kanyakumari district of tamil nadu. it is important to determine the genetic nature of this isolate in order to develop resistance strategies. it is thus necessary to clone and characterize the viruses from kanyakumari and to determine the mechanism of virus resistance in transgenic lines.

rice tungro disease is an important viral disease of rice.
rice tungro is caused by infection with two viruses: rice tungro bacilliform virus (rtbv) and rice tungro spherical virus (rtsv). rtsv is a plant picornavirus with a 12 kb single-stranded rna genome. it belongs to the genus waikavirus in the family sequiviridae and is necessary for transmission of the two viruses by the leafhopper vector nephotettix virescens. rtsv rna is translated to form a large polyprotein, which is then self-cleaved to form the viral proteins, including the three coat proteins, the replicase and the protease. studies have been conducted on rtsv from the philippines. information on rtsv from india is essential, both to check whether different geographical conditions, like those present in india, select for genotypically variable strains and to design a transgenic resistance strategy. the objective of this study was to clone rtsv isolates from india, to compare the genetic diversity of the indian isolates with other southeast asian isolates and with each other, and to develop a strategy to impair the attack of the virus complex on rice. to achieve this, the complete genomes of two isolates from india were obtained by amplifying different genes by rt-pcr, cloning them in ta vectors, and sequencing. subsequently, constructs containing cp1-3, antisense replicase, sense replicase and double-stranded replicase were cloned in a plant transformation vector. these constructs were used to transform the aromatic indian rice variety pusa basmati (pb1). pcr analysis of the resulting plants was performed to check for stable insertion of the insert in the transgenics.

jatropha (jatropha curcas) of the family euphorbiaceae is being grown in india as a major commercial fuel (bio-diesel) crop. jatropha is cultivated in 200 districts of 19 potential states of india. unfortunately, the cultivation of jatropha is limited by a severe mosaic disease.
recently, a severe mosaic disease with significant incidence was observed in 2006-2009 on j. curcas grown in experimental plots of nbri and on j. gossypifolia, a weed growing along roadsides around lucknow and kathaupahadi, madhya pradesh. symptoms included severe mosaic, blistering, leaf distortion and stunting of the whole plant, with no fruit or seed production in severely affected plants. the symptomatology and the whitefly population observed on the plants suggested begomovirus infection. to detect begomovirus infection, total dna was extracted from leaf samples of infected jatropha plants and polymerase chain reaction (pcr) was performed using three sets of begomovirus genus-specific primers (cpit-i/cpit-t, paliv 1978/paric 496 and paliv 722/palic 1960); amplicons of the expected sizes (~800 bp, 1.2 kb and 1.2 kb) were obtained, which confirmed begomovirus infection. further, to identify the begomovirus(es) and investigate any genetic diversity among them, the ~1.2 kb amplicons were cloned and sequenced. the sequence data were deposited in the genbank database under accession nos. gq847545 and fj346232 (from j. curcas) and eu727086 and fj177030 (from j. gossypifolia). in blast analysis, gq847545 and fj346232 shared the highest sequence identity (95%) with each other and 84-88% with sri lankan cassava mosaic virus (aj579307, aj607394, aj890225, aj890229 and aj890224) and indian cassava mosaic virus from india (ay738105), and were therefore designated as two strains of jatropha mosaic india virus-lucknow. blast analysis of eu727086 showed a maximum similarity of 93% with croton yellow vein mosaic virus (aj507777), 82% with tomato leaf curl new delhi virus (dq629102) and 80-79% with papaya leaf curl virus (aj436992 and y15934); it was therefore identified as a strain of croton yellow vein mosaic virus.
blast analysis of the virus isolate fj177030 showed the highest identity (83%) with tomato leaf curl virus-bangalore ii (tolcv-b ii-u38239) and 82-81% with tomato leaf curl karnataka virus (tolckv, ay754812, fj514798); it was therefore considered a new begomovirus species, "jatropha yellow mosaic india virus". phylogenetic analysis of gq847545 and fj346232 (from j. curcas) and eu727086 and fj177030 (from j. gossypifolia) was performed along with selected begomovirus isolates that showed >90% sequence identity in blast analysis. the isolate eu727086 showed the closest relationship with croton yellow vein mosaic virus, while fj177030 clustered separately from the other begomoviruses from jatropha species. in the phylogenetic analysis these isolates formed three separate clusters and were therefore considered three distinct begomoviruses. these data clearly show that some genetic diversity exists among the begomoviruses infecting jatropha species in india.

bitter gourd (momordica charantia l.) of the family cucurbitaceae, also known as bitter melon, is extensively cultivated in the north-eastern region of uttar pradesh, india. it is regarded as one of the world's major vegetable crops and has great economic importance. a severe yellow mosaic disease on bitter gourd with significant incidence was observed during a survey of different locations of eastern up, india, in 2007. a whitefly (bemisia tabaci) population was also observed in the vicinity. the characteristic disease symptoms and the whitefly population indicated the possibility of begomovirus infection. total dna was isolated from infected as well as healthy leaf samples. two primer pairs (tlcv-cp and roja's primer) were used, yielding ~800 bp amplicons with tlcv-cp in 3/3 samples and ~1.3 kb amplicons with roja's primer in 3/4 samples. for further identification of the begomovirus, the pcr amplicons were cloned and sequenced (genbank accession no.
eu439260 and eu888908, respectively). blastn search analysis of eu439260 indicated 99-95% identity with several isolates of tomato leaf curl new delhi virus (tolcndv). phylogenetic analysis also showed the closest relationship of the isolate (eu439260) with tolcndv isolates. based on the highest sequence identity and close relationship with tolcndv, the virus isolated from bitter gourd was considered an isolate of tomato leaf curl new delhi virus. in contrast, blastn search analysis of eu888908 showed the highest identity (99-97%) with pepper leaf curl bangladesh virus (peplcbv) isolates. phylogenetic analysis of this isolate with selected begomovirus isolates revealed the closest relationship with peplcbv. these results confirmed the association of peplcbv with bitter gourd. the study revealed the variability of viruses on bitter gourd in eastern up, india.

a groundnut isolate of tobacco streak virus was characterized biologically using six cultivars (jl24, tmv2, k6, k7, k9) and one pre-release culture (k1271), with seedlings 7-84 days old, under glasshouse conditions. clear differences were observed among the cultivars tested in incubation period, percent seedling wilt and time to death of seedlings. k-7 was the least susceptible of all the cultivars tested and supported the lowest virus titer (a 405 nm: 0.11-1.23). both localized symptoms (necrotic lesions on leaves, veinal necrosis, leaf yellowing, wilting) and systemic symptoms (petiole necrosis, necrotic lesions on young leaves, death of top growing buds not only on the main stem but also on all primaries (side shoots), followed by stem necrosis, stunted growth, axillary shoot proliferation with small leaves showing general chlorosis, peg necrosis, pod necrosis, pod size reduction and wilting of plants) were observed in all cultivars tested. biological differentiation of tsv and gbnv was made by sap inoculation of the two viruses separately on the susceptible groundnut cultivar jl24 under glasshouse conditions.
certain similarities and differences were observed between these viruses infecting groundnut. seed infection by tsv ranged from 18.9 to 28.9% in seeds collected from naturally infected and sap-inoculated groundnut cultivars/pre-releases (jl24, tmv2, k-6, k-7, k-9 and k-1271) belonging to the spanish and virginia types. tsv was detected in both pod shell and seed testa from pod samples produced by sap inoculation under glasshouse conditions. however, seed transmission of tsv was not observed in groundnut. the coat protein (cp) genes of three groundnut tsv isolates (gn-ap-1-00; gn-ap2-04; gn-ap3-07) were sequenced; all three isolates contained a single open reading frame (orf) of 717 nucleotides that could potentially code for 238 amino acids (aa). the cp genes of tsv isolates originating from different hosts shared a high degree of sequence identity at both the nucleotide (97.6-100%) and amino acid (95.7-100%) levels.

tonnes grown in an area of 3.83.430 ha (fao stat2007). in india papaya is grown on nearly 80,000 ha with an annual production of 7,00,000 tonnes (fao stat 2007) and occupies fourth place in the world. the crop is severely affected by a number of viruses, of which papaya ringspot virus (prsv-p) is the most important. the detection of virus infection in plants has traditionally involved bioassay on indexing plants and/or immunological methods (hill 1981, torrence and jones 1981) . the use of nucleic acid probes has improved the sensitivity of virus detection. the most common non-radioactive probes are biotinylated probes, which are very specific and sensitive. prsv-p is a positive-sense ssrna virus belonging to the genus potyvirus, family potyviridae, and is transmitted by aphids. the prsv-p coat protein gene region was used as template cdna for probe preparation. dot-blot hybridization with the biotin-labeled probe was performed for prsv-p detection.
the clarified sap of healthy and infected plants was serially diluted, spotted onto nitrocellulose membrane and hybridized to the biotin-labeled probe. biotin-labeled rnas are employed as probes, with subsequent detection based on streptavidin-alkaline phosphatase conjugates. the biotin-labeled probe was found to be more sensitive for viral detection than enzyme-linked immunosorbent assay (elisa).

in recent years tospoviruses have caused devastating damage to vegetable yields in india. they infect economically important crops, viz. tomato, chilli, peppers, groundnut, watermelon and various legumes, and are now emerging as a severe disease in brinjal as well. to monitor the natural occurrence and distribution of tospovirus in vegetables, surveys were conducted in the predominant brinjal-growing areas of gujarat, karnataka, maharashtra and andhra pradesh during 2008-2010, with incidence ranging from 5 to 10%, 0 to 80%, 1 to 40% and 0 to 55.78%, respectively. samples collected from different places in india were found positive for pbnv in direct antigen coating-enzyme linked immunosorbent assay (dac-elisa). pbnv-infected brinjal plants showed mosaic mottling of leaves with leaf distortion, longitudinal streaks on the stem and necrotic rings on leaves and fruits. early infection led to severe stunting and abnormal fruiting. the biological and molecular characteristics of pbnv-brinjal isolates were compared with those of other isolates and the results are discussed.

to identify the virus causing mosaic symptoms on soybean, various host plants were tested. plant species belonging to different families, viz. caricaceae, gramineae, leguminosae, malvaceae and solanaceae, were tested. the virus produced symptoms on diagnostic plant species such as chenopodium album, c. quinoa, helianthus annuus, phaseolus vulgaris and vigna unguiculata.
among the tested families, the leguminosae hosts of the virus included arachis hypogaea. the virus causing mosaic symptoms in soybean is inactivated between 50 and 55°c and between dilutions of 10⁻⁴ and 10⁻⁵. all the inoculated assay host plants showed symptoms at 50°c but not at 55°c. similarly, local lesions were produced at 10⁻⁴ but not at 10⁻⁵. the virus in crude sap was infectious up to 72 h but not at 96 h at room temperature. however, the percentage infectivity decreased progressively as the sap aged at room temperature. on the basis of reactions on diagnostic hosts

pvp-38 identification and characterization of potyvirus infected chilli (capsicum annuum l.)

the virus under study caused mild mosaic and severe mottling symptoms in leaves of infected plants. the dilution end point (dep) of the virus was found to be 10⁻³ to 10⁻⁴, longevity in vitro (liv) 1-3 days at room temperature (25°c), and thermal inactivation point (tip) 50-55°c. electron microscopy of the purified virus preparation revealed flexuous particles 780 nm long and 14 nm wide, with characteristic cytoplasmic inclusions: pinwheels and scrolls. the virus was transmitted by sap and by the aphid myzus persicae. the host range study revealed that the host species were restricted to the families chenopodiaceae and solanaceae. on the basis of the above characteristics, the virus under study was identified as a potyvirus associated with mild mosaic and severe mottling symptoms in capsicum.

phytoplasma causing grassy shoot disease and sugarcane yellow leaf virus are important pathogens of sugarcane. these pathogens cause severe losses in sugarcane productivity. with a view to producing virus- and phytoplasma-free planting material of sugarcane, experiments were undertaken using infected varieties of sugarcane growing at the farms of sugarcane research institute.
apical meristems, measuring about 2 mm in length, were dissected out, surface-sterilized and cultured on agar-gelled murashige and skoog's (ms) medium containing growth regulators for shoot induction. the established shoot cultures were multiplied through repeated subculture on fresh medium at 10-12 day intervals. elimination of gsd and scylv was confirmed through molecular analysis of regenerated plants using primers specific to scylv and gsd. the results revealed that the apical meristem culture technique is effective in eliminating pathogens such as scylv and phytoplasma (gsd) from infected clones. this is probably the first report of the elimination of grassy shoot disease in sugarcane through meristem culture.

isolates of papaya ringspot virus (prsv), which causes the most widespread and devastating disease of papaya, originating from different geographical regions of south india were collected and maintained on the natural host, papaya. the entire coat protein (cp) gene of papaya ringspot virus-p biotype (prsv-p) was amplified by reverse transcription-polymerase chain reaction (rt-pcr). the amplicon was inserted into the pgem-t vector by the t-a cloning method, sequenced and subcloned into the bacterial expression vector prset-a using a directional cloning strategy. the prsv coat protein was over-expressed as a fusion protein in e. coli. sds-page revealed that the cp was expressed as a ~40 kda protein. the recombinant coat protein (rcp), fused with a 6× his-tag, was purified from e. coli using ni-nta resin. the antigenicity of the fusion protein was determined by western blot analysis using antibodies raised against purified prsv. the purified rcp was used as an antigen to produce high-titer prsv-specific polyclonal antiserum. the resulting antiserum was used to develop an immunocapture reverse transcription-polymerase chain reaction (ic-rt-pcr) assay, whose sensitivity was compared with that of elisa-based assays for the detection of prsv isolates.
ic-rt-pcr was shown to be the most sensitive test, followed by dot-blot immunobinding assay (dbia) and plate-trapped elisa.

key: cord-270587-k56fze59 authors: scherbinina, sofya i.; toukach, philip v. title: three-dimensional structures of carbohydrates and where to find them date: 2020-10-18 journal: int j mol sci doi: 10.3390/ijms21207702 sha: doc_id: 270587 cord_uid: k56fze59

analysis and systematization of accumulated data on carbohydrate structural diversity is a subject of great interest for structural glycobiology. despite being a challenging task, the development of computational methods for efficient treatment and management of spatial (3d) structural features of carbohydrates breaks new ground in modern glycoscience. this review is dedicated to chemo- and glyco-informatics approaches to 3d structural data generation, deposition and processing for carbohydrates and their derivatives. databases, molecular modeling and experimental data validation services, and structure visualization facilities developed in the last five years are reviewed. knowledge of carbohydrate spatial (3d) structure is crucial for investigation of glycoconjugate biological activity [1, 2] , vaccine development [3, 4] , estimation of ligand-receptor interaction energy [5] [6] [7] , studies of conformational mobility of macromolecules [8] , drug design [9] , studies of cell wall construction aspects [10] , glycosylation processes [11] , and many other aspects of carbohydrate chemistry and biology. therefore, providing information support for carbohydrate 3d structure is vital for the development of modern glycomics and glycoproteomics.
as a result of growing interest in glycoprofiling, glycan microarrays, carbohydrate active enzymes (cazy) and glycan-binding proteins (gbp) involved in biological processes, several major international projects (e.g., glyspace [12] , glycosmos [13] , glycomics@expasy [14] , glygen [15] , jcggdb [16] , glytoucan [17] , mirage [18] , cfg [19] , rings [20] , glic (https://glic.glycoinfo.org/), sysglyco (https://sysglyco.org/)) were launched to integrate the variety of data produced by glycobiological research. the main goal of existing glycoinformatics projects is to provide versatile resources with user-friendly access that are helpful for disease diagnostics [21, 22] , glycobioinformatics studies [23] , glycosylation site prediction [24] , cazy activity prognosis [25, 26] and other applications. supplementing structural repositories with 3d structural data opens the way for computational glycobiology and modeling of carbohydrate structures at atomic resolution. the design of novel workflows and techniques that connect carbohydrate spatial structure modes and experimental data with verification, processing, analysis and deposition of associated data has gained popularity in the glycoscience community [27] . the carbohydrate structure database (csdb, [28] ) module for carbohydrate 3d structure modeling is a demonstrative example of 3d structural data integration facilities (as a database) combined with a dedicated interface (as a glycoinformatics project). further details on csdb 3d facilities are discussed below. herein we focus on the important aspects of carbohydrate 3d structure availability to researchers: structural repositories; glycoinformatics tools and workflows to assist structure building, modeling and erroneous molecular geometry data detection and remediation; carbohydrate 3d structure presentation and visualization methods. structural databases make a significant contribution to bringing information technologies to glycoscience [29] .
with no focus on spatial structure, glycan databases and online tools have been recently reviewed [30] [31] [32]. depositing a huge number of carbohydrates with detailed data for each entry, databases are valuable sources of structural information, biological assignments, references and external links. structural data are often accompanied by original and sometimes assigned experimental observables: nmr spectra, hplc and ms profiles, etc. the services built on top of the databases can include 3d structure simulation, validation, and storage. the authors' view of the ideal integration of data resources and services in glycoinformatics is summarized in figure 2. the subject of this review is databases providing theoretical or empirical 3d structures of carbohydrates and related data-mining tools.
figure 2. networking between glycoinformatics projects and related services that promotes achievement of data integration in glycomics. reproduced with permission from [29], © 2020 wiley-vch verlag gmbh & co. kgaa, weinheim.
the majority of existing repositories for carbohydrate 3d structures offer open-access data via web interface. deposited datasets can be represented by glycoproteins, protein-carbohydrate complexes, poly- and oligosaccharides with 3d structure experimentally resolved or specified by means of nmr, x-ray crystallography, cryoem, small angle x-ray scattering, etc. [27]. several databases such as glycam-web, ek3d, 3dsdscar, glycomapsdb contain data from molecular dynamics simulations. we have also mentioned databases featuring information on protein structures involving carbohydrate moiety in terms of glycosylation (as post-translational modification, dbptm), carbohydrate active enzymes (cazy) and homology modeling (swiss-model). table 1 displays currently active structural databases maintaining three-dimensional data on carbohydrates. for table 1, we have selected carbohydrate and related databases using the following criteria:
• database can be freely accessed through web user interface;
• database must contain experimentally confirmed and/or predicted 3d structures (preprocessed and/or generated on-the-fly from a primary structure input) of glycans, glycoproteins, or protein-carbohydrate complexes;
• stored 3d structures must be deposited as atomic coordinates in pdb, mol, or other format, and the structures must contain a saccharide moiety;
• databases with records linked to other large 3d data collections (e.g., rcsb pdb, pdbe, pdbj, pdbsum, uniprotkb etc.) are included in table 1 (as long as database entries contain carbohydrate moiety, e.g., as a part of a lectin or an antibody);
• databases with derived carbohydrate 3d structural data (conformational maps, conformer energy minima, etc.) are included in table 1 even if they provide no atomic coordinates (e.g., glycomapsdb and gfdb).
although they do not fit the criteria above, large structure repositories offering only glycan primary structures (e.g., glytoucan [17] (https://glytoucan.org/), unicarbkb [33] (http://www.unicarbkb.org/)) can be useful for cross-referencing of existing carbohydrate resources and can supplement 3d modeling pipelines. some out-of-date projects, such as complex carbohydrate structural database (ccsd) [34, 35], eurocarbdb [33, 36], glycomedb [36] [37] [38], glycoconjugate data bank [39], glycosuite [40, 41] are noteworthy as they shaped the modern vision of structural glycoinformatics.
[table 1, fragment: mammalian glycans; pre-built libraries of predicted 3d structures of common bioglycans; 3d structure models *; 3d atomic coordinates generation (http://glycam.org/pre-builtlibraries.jsp)]
a where unknown, the year of the first publication is given.
b database is marked as curated if manual verification of data was reported in the original publication or at the database web site. c published coverage data can be outdated; database interface provides no statistics on current coverage. * database provides no search facilities for indicated carbohydrate 3d structural data.
methods to probe the 3d structure of carbohydrate-containing biomolecules have been developed for decades. nmr techniques (interatomic distances derived from noe, and torsion angles derived from coupling constants), x-ray crystallography, and electron cryo-microscopy (the latter two yielding atomic models built on the basis of an electron density map) are among the most demanded methods for 3d structural elucidation. these methods have been reviewed [93] [94] [95] [96] and are beyond the scope of this review, which is focused on information technologies. for use of instrumental methods for the validation of a simulated structure, please refer to section 5 "experimental data validation". structural investigation of large biological systems involving protein-glycan interactions requires leveraging more resources and employing more complex experimental techniques compared to studies of oligo- and polysaccharides alone. advances in nmr methods hold great potential for direct spatial structure determination of carbohydrate-protein complexes in solution based on intermolecular noes, which afford estimation of atomic contacts between a protein and a carbohydrate ligand [97, 98]. further extraction of noe-derived distance restraints for a saccharide molecule results in generation of representative conformational ensembles [99] [100] [101]. support of experimental data with computer simulations can significantly improve the quality of 3d structures. quantum mechanics [100] [102] [103] [104] [105] [106] and molecular dynamics modeling [107] [108] [109] [110] [111] are commonly applied to conformation search and nmr signal prediction.
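as an illustration of how torsion angles relate to coupling constants, the sketch below evaluates a karplus-type relation, J(θ) = A cos²θ + B cosθ + C, and scans for all torsions compatible with one observed coupling. the coefficients a, b, c and both function names are hypothetical placeholders chosen for illustration; they are not a published parameterization for any glycosidic linkage.

```python
import math

def karplus_j(theta_deg, a=7.0, b=-1.0, c=1.0):
    """Karplus-type coupling constant (Hz) for a torsion angle in degrees.

    a, b, c are illustrative placeholder coefficients (assumption),
    not fitted values for any real spin system.
    """
    t = math.radians(theta_deg)
    return a * math.cos(t) ** 2 + b * math.cos(t) + c

def compatible_torsions(j_obs, tol=0.2, step_deg=1):
    """Return every scanned torsion (degrees) whose predicted J lies within tol of j_obs."""
    return [float(th) for th in range(0, 360, step_deg)
            if abs(karplus_j(th) - j_obs) <= tol]

if __name__ == "__main__":
    # one observed coupling usually maps to several torsion ranges
    print(compatible_torsions(karplus_j(60.0)))
```

a single coupling constant generally maps to several torsion values, which is one reason why such observables are combined with molecular modeling rather than inverted directly.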
to date, a range of theoretical models and methods is applied for in silico design of carbohydrate three-dimensional structure [112] [113] [114] [115] [116]. based on scopus [135] article count, we estimated the application rate of quantum mechanics (10759 publications) and molecular mechanics (14871 publications) methods applied to carbohydrate structure modeling over the recent five years (2015-2020). search queries included abundant carbohydrate terms, typical glycan moieties, and common modeling approaches (query details are given in supplementary table s1). in spite of growing interest in qm approaches in carbohydrate structure simulation, the major contribution to the statistics for such resource-intensive calculations is application of qm to relatively simple model compounds. for complex bioglycans in solution, predominance of mm methods is more pronounced [6, 8]. molecular dynamics methods have achieved a broad scope of application owing to reasonable computer resource consumption. they offer an advantageous compromise between calculation accuracy and performance when applied to glycan molecules with their structural complexity (variety of known monomeric elements, presence of ionogenic groups), high bridge flexibility and stereo-electronic effects [112, 113, 136, 137]. in molecular mechanics simulations, newtonian mechanics principles are applied to calculate the potential energy of a system using a parameter set specific for the class of compounds under study (force field). particular features of the carbohydrate moiety, e.g., ring puckering, rotational barriers, hydrogen bonds, must be taken into account to perform precise analysis of molecular behavior in vacuo or in solution [138]. molecular dynamics simulations solve newtonian equations of motion to observe the evolution of a system during a certain timespan. a conformational ensemble is generated by calculating the molecular trajectory at a given temperature.
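the propagation of newtonian motion mentioned above can be sketched with a velocity verlet integrator for a single particle in one dimension. the choice of velocity verlet, the harmonic test force and all names below are assumptions made for illustration; production md packages add thermostats, constraints and full force fields on top of this basic scheme.

```python
import math

def harmonic_force(x, k=1.0, x0=0.0):
    """Toy force F = -k (x - x0); stands in for a real force field (assumption)."""
    return -k * (x - x0)

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate m * d2x/dt2 = force(x) and return (trajectory, final velocity)."""
    trajectory = [x]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append(x)
    return trajectory, v

if __name__ == "__main__":
    traj, v_end = velocity_verlet(1.0, 0.0, harmonic_force, 1.0, 0.01, 1000)
    # for this toy oscillator the analytic position at t = 10 is cos(10)
    print(traj[-1], math.cos(10.0))
```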
accuracy of calculation depends on the employed force field and sufficient conformational sampling. md simulations are commonly used for interpretation and analysis of the nmr and x-ray observables in the context of carbohydrate 3d structure [139]. enhanced molecular dynamics sampling technologies, such as replica-exchange md (remd) [140, 141], hamiltonian replica-exchange md (hrex) [142] [143] [144], multidimensional swarm-enhanced sampling md (msesmd) [145, 146], and gaussian accelerated md (gamd) [147, 148] have been reported. density maps or energy maps built for a set of the glycosidic torsion angles (ϕ, ψ, ω) are a typical way to report conformational preferences of a glycan provided by population analysis of its md trajectory. as a representative example, conformational characteristics of the highly flexible branched oligosaccharide glc1man9glcnac2 (gm9) were investigated by an explicit-water remd study and validated using paramagnetism-assisted nmr spectroscopy [149] (figure 3a,b). due to the structural complexity of gm9, adequate exploration of conformational space requires long-timescale simulations. regular md simulations of similar manno-oligosaccharides were reported to fail to reproduce experimental data [150]. the replica-exchange approach implies running parallel replicas of the system at different temperatures and periodically swapping them. the ensemble of gm9 conformers sampled by this method was consistent with the nmr observables. populated areas of density maps built for glycosidic linkages of the glc1man3 branch of gm9 (figure 3c) were close to crystallographic conformations of a linear glc1man3 tetrasaccharide (a gm9 determinant recognized by lectins) from pdb.
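the periodic swapping of replicas in temperature replica-exchange md is commonly decided by a metropolis criterion on the potential energies of two neighbouring replicas. the sketch below shows that standard acceptance rule only; it is not taken from the cited gm9 study, and the function names, energy units (kcal/mol) and example values are assumptions.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def swap_probability(e_i, t_i, e_j, t_j):
    """Metropolis acceptance probability for exchanging two replicas.

    Replica i has potential energy e_i at temperature t_i, replica j has e_j at t_j.
    p = min(1, exp[(beta_i - beta_j) * (e_i - e_j)]) with beta = 1 / (k_B * T).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)

def attempt_swap(e_i, t_i, e_j, t_j, rng=random):
    """Return True if the exchange is accepted for one random draw."""
    return rng.random() < swap_probability(e_i, t_i, e_j, t_j)

if __name__ == "__main__":
    # the colder replica holds the lower energy, so the swap is accepted only sometimes
    print(swap_probability(-100.0, 300.0, -90.0, 320.0))
```

when the colder replica currently holds the higher energy, the exchange is always accepted, which is how conformations found at high temperature reach the low-temperature ensemble.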
force field (or potential energy function) is represented by an atomistic parameter set obtained for a considered compound class. potential energy value can be calculated as a sum of interaction potentials for bonded (covalent bond stretching, angle bending, proper torsions) and non-bonded (electrostatic and van der waals interactions) terms, and can include other terms (e.g., improper torsions, solvation, hydrogen bonds [151], nonconventional hydrogen bonds [101], and, for protein-carbohydrate complexes, ch-π stacking interactions [152] [153] [154] [155], carbohydrate intrinsic (chi) energy contribution [156, 157]). several force fields developed for general representation of a wide range of organic compounds (e.g., allinger's mm2, mm3, mm4) can be applied to carbohydrate 3d modeling [151, 158, 159]. of them, despite being a universal force field, mm3 [160, 161] still exhibits good performance on glycans [162] [163] [164] (reviews), [165, 166] (exemplary articles). however, a number of force fields specially tuned for carbohydrates have been developed (figure 4). in supplementary table s2, we provide citation metrics of articles reporting carbohydrate-dedicated and selected general force fields that could be applied to carbohydrate structure modeling. unfortunately, usage of general force fields could not be adequately estimated via number of citations. automated full-text analysis and retrieval of data, needed to confirm employment of force fields for carbohydrate molecules, is beyond the scope of this review. nevertheless, statistical data obtained for general force fields supported in popular md software packages (e.g., amber, charmm, gromacs, tinker) point to obsolescence of allinger's force fields, and mm3 in particular, relative to modern ones (see more detailed data, references to original publications and absolute values in supplementary table s2).
detailed comparisons of all-chemical and dedicated force fields in the context of glycan modeling have been published [114, 139, 151, 167].
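the sum of bonded and non-bonded terms described above can be sketched as follows. the functional forms (harmonic bonds and angles, a cosine torsion series term, a 12-6 lennard-jones term and a coulomb term) are a generic textbook choice assumed for illustration; individual force fields differ in their exact forms, and every parameter value used here is a placeholder rather than a value from any named force field.

```python
import math

COULOMB_K = 332.0637  # kcal*angstrom/(mol*e^2), electrostatic conversion factor

def bond_energy(r, r0, k):
    """Harmonic bond stretching: k (r - r0)^2."""
    return k * (r - r0) ** 2

def angle_energy(theta, theta0, k):
    """Harmonic angle bending (angles in radians): k (theta - theta0)^2."""
    return k * (theta - theta0) ** 2

def torsion_energy(phi, v_n, n, gamma):
    """Single cosine term of a proper torsion: (v_n / 2) [1 + cos(n phi - gamma)]."""
    return 0.5 * v_n * (1.0 + math.cos(n * phi - gamma))

def lj_energy(r, epsilon, r_min):
    """12-6 Lennard-Jones term with well depth epsilon at distance r_min."""
    ratio6 = (r_min / r) ** 6
    return epsilon * (ratio6 ** 2 - 2.0 * ratio6)

def coulomb_energy(r, q1, q2):
    """Point-charge electrostatics in vacuum."""
    return COULOMB_K * q1 * q2 / r

def total_energy(bonds, angles, torsions, pairs):
    """Sum all terms; each argument is a list of tuples of precomputed internal coordinates.

    bonds: (r, r0, k); angles: (theta, theta0, k); torsions: (phi, v_n, n, gamma);
    pairs: (r, epsilon, r_min, q1, q2) for non-bonded atom pairs.
    """
    energy = sum(bond_energy(*b) for b in bonds)
    energy += sum(angle_energy(*a) for a in angles)
    energy += sum(torsion_energy(*t) for t in torsions)
    for r, epsilon, r_min, q1, q2 in pairs:
        energy += lj_energy(r, epsilon, r_min) + coulomb_energy(r, q1, q2)
    return energy
```

carbohydrate-specific force fields differ mainly in how the parameters entering such terms were derived, which is why the parameter set and the functional form have to be used together.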
charmm36, glycam06, gromos and opls-aa-sei were reported as commonly used force fields for handling carbohydrate or glycoconjugate molecules. more details are provided in figure 5. charmm36 force field with modern carbohydrate parameter table (c36 [168]) was derived from charmm all-atom biomolecular force field [169, 170]. currently, charmm36 parameterization features include monosaccharides in furanose [171] and pyranose [172] forms, glycosidic linkages between monosaccharides [171, 173], complex carbohydrates and glycoproteins [174], monosaccharide-linked sulfate and phosphate groups [175], acyclic carbohydrates and alditols [171], as well as carbohydrate simulations in aqueous solution [176].
glycam06 force field is compatible with carbohydrates of all ring sizes and conformations for both mono- and oligosaccharides built of residues common for mammalian glycans, such as widespread aldoses, n-acetylated amino-sugars, sialic, glucuronic and galacturonic acids [177].
parameter set was extended to non-carbohydrate moieties such as lipids [178], glycolipids [179, 180], lipopolysaccharides [181], proteins and nucleic acids. parameterization of glycam06 for glycosaminoglycans was reported [182]. gromos represents a broad family of carbohydrate force fields. having been a classic one since 2005, the gromos 45a4 [183] parameter set is used for explicit-solvent simulation of hexopyranose-based saccharides. in the recent decade, several parameters of 45a4 were optimized in gromos 56acarbo [184], including parameters for lipopolysaccharides [185]. gromos 53a6glyc was improved for explicit-solvent simulations [186] and extended for glycoproteins [187]. gromos 56acarbo_r [188] was designed to improve description of ring conformational equilibria in hexopyranose-based saccharide chains as compared to the previous 56acarbo version. another modification of 56acarbo named 56acarbo_cht [189] was developed for chitosan and its derivatives. recently, extensions of the gromos 56acarbo/carbo_r parameter set were adapted towards charged, protonated and esterified uronates [190] and furanose-based carbohydrates [191]. gromos96 43a1 was reported to have good performance on glycan structure simulation in glycoproteins [192, 193]. opls-aa scaling of electrostatic interactions (sei) force field [194] consists of improved parameters for conformational changes associated with φ-ψ dihedrals combined with enhanced accuracy of qm relative energy calculation in carbohydrate molecules refined for opls-aa biomolecular force field [195, 196]. additionally, the opls force field was improved for explicit-water simulations [197]. the rapidly developing charmm drude polarizable force field for carbohydrates, based on the classical drude oscillator, has to be mentioned.
parameter sets obtained for hexopyranoses [198] and their aqueous solutions [199], aldopentofuranoses and methyl-aldopentofuranosides [200], carboxylate and n-acetylamine saccharide derivatives [201], alditols [202] and glycosidic linkages [203] demonstrated significant improvement of qm data reproduction compared to the charmm additive force field.
martini coarse-grained (cg) force field [204] can be used as an alternative to all-atom (aa) level simulations, with the advantage of modeling large carbohydrate systems (solutions of oligo- and polysaccharides, glycolipids [205] [206] [207]) on a long time scale at reasonable computational cost. blocked ring puckering (only the 4c1 conformation is allowed) and restrictions on the anomeric effect and glycosidic bond flexibility cumulatively reduce the available degrees of freedom. another cg model, pitomba [208], was developed for carbohydrate simulations based on the gromos 53a6glyc force field.
docking methods for carbohydrate ligands utilize molecular modeling approaches for protein-carbohydrate complexes for initial geometry generation, conformational sampling, grafting, active site mapping and binding affinity estimation [129, 137] [209] [210] [211]. accurate reproduction of experimental data requires application of particular scoring function parameterization (empirical, force fields or knowledge-based [212]) and docking protocols, which depend on the interaction types present in a system (ch-π interactions, chi-energy, hydrogen bonding, solvent model, effects of including solvent molecules, charged moieties, etc.) [8] [213] [214] [215] [216] [217] [218] [219].
extension of several docking software packages to handle carbohydrate molecules was reported to improve modeling of biologically relevant systems such as lectin-glycan [220, 221], gag-protein [222] [223] [224], or antibody-carbohydrate [225] complexes.
currently available web-based tools along with standalone software packages were developed to facilitate work with carbohydrate 3d structure. versatile online services for in silico molecular modeling allow users to start from a user-friendly structure input and to automate further procedures (see table 2 for references). glycam-web provides tools for glycan structure prediction, glycosylated protein 3d model generation, grafting and docking. charmm-gui modeler offers options for 3d structure generation and modeling of glycans including n-/o-glycoproteins and glycolipids [226, 227]. biological membranes can be simulated with the assistance of charmm-gui membrane builder (by combining features of lps and glycolipid charmm-gui modelers) and gnomm (a tool for building lipopolysaccharide-rich membranes). noteworthy standalone programming frameworks for structure modeling are glycosylator (modeling of glycans, glycoproteins and glycosylation) and rosetta carbohydrate (loop modeling [228], glycan-to-protein docking, and glycosylation modeling). to build diverse saccharide 3d models online, one can use such tools as restless and sweet-ii. the doglycans standalone framework can be used for preparation of atomistic models of glycopolymers, glycolipids and glycoproteins. complex polysaccharide 3d models can be generated via polys and carbbuilder. another special class of polysaccharide builders is dedicated to glycosaminoglycans (gags), which can be built using polys gag-builder and glycam-web gag-builder. recently, another approach for building gag molecules was reported [229] (exemplary data pipeline only).
unfortunately, application scope of the majority of the existing structure building and modeling services is limited to rigidly defined set of supported sugar residues, and lacks non-carbohydrate moiety support. tools for locating and identification of a carbohydrate moiety (e.g., pdb2linucs, glyfinder, glycan reader) are useful for the atomic coordinate analysis and extraction of glycoproteins and protein-carbohydrate complexes deposited in protein data bank (pdb). automated molecular geometry processing facilities can be accessed via glycoinformatics tools designed for conformational data analysis (cat, bfmp), nuclear overhauser effect (noe) calculation (md2noe, distance mapping) and 3d structural data analysis related to glycan moieties from pdb (glytorsion, glyvicinity, gs-align). in table 2 , we summarized freely available tools for generation and processing carbohydrate 3d structural data and divided them into eight categories of application. a web-service implies an automated pipeline for running a specific software (e.g., molecular modeling, structure building, carbohydrate coordinate extraction, format conversion). it results in 3d structural data output starting from primary structure input or atomic coordinate file upload. web-tool is employed for 3d structural data processing and analysis without 3d structural data output; it is a simpler application designed primarily for statistics and visualization. other types are self-explanatory. vast variety of methods provide information about 3d structure of individual glycans and glycan moieties of glycoproteins and protein-carbohydrate complexes ( figure 6 ) [285, 286] . 
the following approaches are most utilized for 3d structural data validation [287] [288] [289] : • ccombination of carbohydrate simulated geometry data with x-ray crystallographic data analysis [225, 290] ; • analysis of inter-glycosidic nmr spin couplings, which depend on glycosidic bond torsions [114, 291, 292] vast variety of methods provide information about 3d structure of individual glycans and glycan moieties of glycoproteins and protein-carbohydrate complexes ( figure 6 ) [285, 286] . the following approaches are most utilized for 3d structural data validation [287] [288] [289] : • ccombination of carbohydrate simulated geometry data with x-ray crystallographic data analysis [225, 290] ; • analysis of inter-glycosidic nmr spin couplings, which depend on glycosidic bond torsions [114, 291, 292] ; • deriving nuclear overhauser effects (noes) from relative populations of the interatomic distances, with subsequent comparison to the experimental noes in solution [99, 293, 294] ; • purely informatic detection of errors, such as incompatible atomic coordinates originating from incorrect processing or simulation [295] [296] [297] [298] ; • simulation by other computational methods at higher levels of theory [102, 103, 105, 108] . unfortunately, most of the data obtained on the basis of crystallographic experiments can dramatically differ from glycan conformations in solution or have poor resolution which needs further adjustment [299, 300] . moreover, not all of the objects of interest can be obtained as a single crystal. electron cryo-microscopy gains popularity for carbohydrate 3d structural research [301] , however, this method requires additional refinement procedures due to resolution restrictions of the obtained density unfortunately, most of the data obtained on the basis of crystallographic experiments can dramatically differ from glycan conformations in solution or have poor resolution which needs further adjustment [299, 300] . 
moreover, not all objects of interest can be obtained as single crystals. electron cryo-microscopy is gaining popularity for carbohydrate 3d structural research [301]; however, this method requires additional refinement procedures because of the limited resolution of the obtained density maps [302] [303] [304]. recently, cryo-em data were used to refine the sars-cov-2 spike glycoprotein structure using the privateer software (see table 3 for references) [305, 306]. van beusekom et al. illustrated [295] quality improvement of a pdb glycan structure model with an incorrect (1-6)-linked fucose annotation, a poor fit to the electron density, and a missing (1-3)-linked fucose (figure 7a) with the help of the pdb-redo (figure 7b) and carp (figure 7d) tools (see table 3 for references). the structure model obtained by pdb-redo treatment was further inspected manually (figure 7c): corrections were made to the acetylamino group geometry and to the distorted (1-6)-linked fucose ring conformation, and the (1-3)-linked fucose residue was added.
despite successful automated resolution of the residue annotation problem and of the poor fit to the electron density, complete revision could not be achieved without manual intervention.
nmr techniques are a powerful approach to investigating the conformational and dynamic behavior of carbohydrate moieties in biomolecules [307] [308] [309] [310]. however, the nature of the noe enhancement factor has hampered obtaining a sufficient number of distance restraints [99]. for saccharides, with their multiple rotatable bonds, a stable 3d structure was difficult to define, making molecular modeling essential for this class of compounds. adjustment of experimental conditions helped to overcome this limitation and to reproduce crystal structures of oligosaccharides by modeling with noe-derived distance restraints [100, 101]. since there is no direct way to derive a detailed three-dimensional representation from the observed noe intensities, additional molecular modeling protocols are required to establish a comprehensive view of conformational space at the atomic level [311] [312] [313]. frank et al. demonstrated conformation filtering, based on the observed noes, of structures obtained by molecular dynamics in explicit solvent [314].
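the idea of checking simulated conformations against noe data can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the protocol of ref. [314]: it assumes that proton-proton distances (in Å) have already been extracted from md frames, uses the common r^-6 ensemble average as the effective noe distance, and compares it with experimental upper bounds. the function names, the dictionary layout, the example values and the tolerance parameter are hypothetical.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def effective_noe_distance(distances: List[float]) -> float:
    """Return the r^-6-averaged distance <r^-6>^(-1/6) over an ensemble (in Å)."""
    if not distances:
        raise ValueError("at least one distance is required")
    mean_inv6 = sum(r ** -6 for r in distances) / len(distances)
    return mean_inv6 ** (-1.0 / 6.0)


def violated_restraints(
    ensemble: Dict[Pair, List[float]],
    upper_bounds: Dict[Pair, float],
    tolerance: float = 0.5,
) -> Dict[Pair, float]:
    """Return proton pairs whose effective distance exceeds bound + tolerance."""
    violations = {}
    for pair, bound in upper_bounds.items():
        r_eff = effective_noe_distance(ensemble[pair])
        if r_eff > bound + tolerance:
            violations[pair] = r_eff
    return violations


# Hypothetical per-frame distances (Å) for two inter-residue proton pairs.
ensemble = {
    ("H1", "H3'"): [2.4, 2.6, 2.5, 4.8],
    ("H1", "H5'"): [5.5, 5.8, 6.1, 5.9],
}
# Hypothetical experimental upper bounds (Å) derived from NOE intensities.
upper_bounds = {("H1", "H3'"): 3.0, ("H1", "H5'"): 4.0}

for pair, r_eff in violated_restraints(ensemble, upper_bounds).items():
    print(pair, round(r_eff, 2))
```

because of the r^-6 weighting, short contacts dominate the average: a pair that is close in only some frames can still satisfy a restraint, which is why the effective distance rather than the arithmetic mean is compared with the bound.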
as a representative example, figure 8 depicts 1h-1h spatial contacts and conformation selection criteria for the heptasaccharide of the moraxella catarrhalis lgt2Δ bacterium, which adopts an unusual conformation.
protein data bank (pdb) [315] and cambridge structural database (csd) [316] are historically considered the main repositories of experimentally determined carbohydrate three-dimensional structures. csd is reported to contain over 4000 crystal structures of oligosaccharides [93]. unlike the cambridge structural database, the protein data bank provides open access to its entire structural archive. carbohydrate moieties deposited in pdb are usually covalently bound to a protein or form non-covalent protein-carbohydrate complexes [302]. according to recent reports, as of november 18, 2019 the protein data bank contained ~13500 carbohydrate structures, representing ~9.4% of total database records [317]. despite being a valuable source of 3d structural data for glycoscientists, pdb lacks convenient search facilities for glycan structures.
some projects have developed data-mining tools capable of retrieving bioglycan molecular geometry data from pdb: glycan reader (glycanstructure.org) [260, 261] (http://www.glycanstructure.org/), pdb2linucs (glycosciences.de) [47, 259, 318] (http://www.glycosciences.de/database/start.php?action=form_pdb_data), glyconavi tcarp [61] (https://glyconavi.org/tcarp/) (https://gitlab.com/glyconavi/pdb2glycan) and glyfinder (glycam-web) [257, 258] (https://dev.glycam.org/portal/gf_home/).
another issue of concern related to the protein data bank is the large proportion of errors in deposited coordinates, which calls for thorough checking and the development of data remediation services [319]. commonly occurring problems associated with nomenclature, poor glycan geometry, linkage errors, and missing or surplus atoms can seriously degrade the quality of the obtained 3d structures [300, 320, 321]. using the privateer software, it was discovered [299], [301] that pdb contains a significant number of erroneous n-glycosylated structures with pyranose ring distortions, given the preferred 4c1 conformation for d-sugars and 1c4 conformation for l-sugars (figure 9). in most cases, poor electron density of the carbohydrate moiety results in anomalous high-energy pyranose ring conformations (envelopes, half-chairs, boats, skew boats, etc.). to obtain a reasonable structure model, experimental data refinement programs should be applied to derive geometric restraints for sugar monomers.
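pyranose ring conformations of the kind discussed above are commonly quantified with cremer-pople puckering parameters. the sketch below is a plain-python illustration of that calculation for a six-membered ring and is not the implementation of privateer or any other tool named here. the function names, the 30° classification window, the minimum amplitude and the idealized model rings are assumptions made for the example. which pole of the puckering sphere corresponds to 4c1 depends on the atom ordering; with the usual o5, c1, ..., c5 order, θ near 0° is commonly assigned to 4c1 and θ near 180° to 1c4.

```python
import math
from typing import List, Sequence, Tuple

Vec = Tuple[float, float, float]


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Sequence[float], b: Sequence[float]) -> List[float]:
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def cremer_pople_6(ring: Sequence[Vec]) -> Tuple[float, float]:
    """Return (Q, theta in degrees) for six ring atoms given in bonded order."""
    if len(ring) != 6:
        raise ValueError("exactly six ring atoms are required")
    # Shift the ring so that its geometric centre is at the origin.
    centre = [sum(p[k] for p in ring) / 6.0 for k in range(3)]
    pts = [[p[k] - centre[k] for k in range(3)] for p in ring]
    # Mean-plane normal from the two Cremer-Pople reference vectors.
    r1 = [sum(p[k] * math.sin(2 * math.pi * j / 6) for j, p in enumerate(pts)) for k in range(3)]
    r2 = [sum(p[k] * math.cos(2 * math.pi * j / 6) for j, p in enumerate(pts)) for k in range(3)]
    normal = _cross(r1, r2)
    length = math.sqrt(_dot(normal, normal))
    normal = [c / length for c in normal]
    # Out-of-plane displacement of every ring atom.
    z = [_dot(p, normal) for p in pts]
    q2_cos = math.sqrt(2.0 / 6.0) * sum(zj * math.cos(4 * math.pi * j / 6) for j, zj in enumerate(z))
    q2_sin = -math.sqrt(2.0 / 6.0) * sum(zj * math.sin(4 * math.pi * j / 6) for j, zj in enumerate(z))
    q3 = math.sqrt(1.0 / 6.0) * sum((-1) ** j * zj for j, zj in enumerate(z))
    q2 = math.hypot(q2_cos, q2_sin)
    return math.hypot(q2, q3), math.degrees(math.atan2(q2, q3))


def classify(total_q: float, theta_deg: float, window: float = 30.0, min_q: float = 0.1) -> str:
    """Coarse label; the window and minimum amplitude are arbitrary illustrative choices."""
    if total_q < min_q:
        return "planar"
    if theta_deg <= window or theta_deg >= 180.0 - window:
        return "chair"
    if abs(theta_deg - 90.0) <= window:
        return "boat/skew-boat"
    return "intermediate (half-chair/envelope region)"


def model_ring(z_offsets: Sequence[float], radius: float = 1.4) -> List[Vec]:
    """Idealized hexagon with prescribed out-of-plane offsets (not real sugar geometry)."""
    return [(radius * math.cos(2 * math.pi * j / 6), radius * math.sin(2 * math.pi * j / 6), z)
            for j, z in enumerate(z_offsets)]


chair = model_ring([0.25, -0.25, 0.25, -0.25, 0.25, -0.25])
boat = model_ring([0.30, -0.15, -0.15, 0.30, -0.15, -0.15])
for label, ring in (("chair model", chair), ("boat model", boat)):
    q, theta = cremer_pople_6(ring)
    print(label, classify(q, theta))
```

in practice the six coordinates would be taken from the ring atoms of a sugar residue in a deposited model, and residues falling outside the chair region would be flagged for inspection rather than rejected automatically.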
notably, although cryo-em has the disadvantage of limited resolution, the observed results indicate a larger content of atypical conformations in structures solved by x-ray crystallography than in cryo-em data.
exceptions, in which high-energy conformations are relevant, were found in complexes involving carbohydrate-active enzymes, which force pyranose ring distortion to enable catalytic transformation of a carbohydrate substrate via transition states (e.g., glycosidic bond hydrolysis) [322]. fushinobu performed glycosidic torsion analysis for a set of pdb entries of crystal structures of complexes with ligands bearing the lacto-n-biose i (lnb, both α- and β-anomers) disaccharide unit present in type-1 antigens. the study was supported by glycomaps db (see table 1 for references) [323]. the obtained φ-ψ data for lnbs bound to various proteins were plotted against the corresponding free energy maps. distortion of the energetically favored ring conformation strongly depended on substrate catalytic and recognition mechanisms.
to date, existing tools for carbohydrate structural error detection and correction in pdb files (table 3) cannot be used directly as an integral part of the protein data bank. nevertheless, an initiative aimed at quality improvement at wwpdb was carried out in collaboration with the glycoscience community in july 2020 [324] (https://www.wwpdb.org/documentation/carbohydrate-remediation). it includes data annotation and validation of carbohydrate-containing records.
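glycosidic torsion analysis of the kind mentioned above reduces to computing dihedral angles from four atomic positions. the following python sketch shows one standard way to do this; it is not taken from any of the cited tools. the function name is hypothetical, and the choice of the four atoms defining φ and ψ is left to the caller, because several definitions are in use (for example o5-c1-o-cx in crystallographic work and h1-c1-o-cx in nmr studies).

```python
import math
from typing import Sequence


def dihedral_deg(p0: Sequence[float], p1: Sequence[float],
                 p2: Sequence[float], p3: Sequence[float]) -> float:
    """Signed dihedral angle p0-p1-p2-p3 in degrees, in the range (-180, 180]."""
    def sub(a, b):
        return [a[i] - b[i] for i in range(3)]

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return [a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0]]

    b0 = sub(p0, p1)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)
    # Unit vector along the central bond.
    norm = math.sqrt(dot(b1, b1))
    b1 = [c / norm for c in b1]
    # Components of the outer bonds perpendicular to the central bond.
    v = [b0[i] - dot(b0, b1) * b1[i] for i in range(3)]
    w = [b2[i] - dot(b2, b1) * b1[i] for i in range(3)]
    x = dot(v, w)
    y = dot(cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Toy coordinates: the central bond lies along the z axis.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

applied to every glycosidic linkage of interest, the resulting (φ, ψ) pairs can be overlaid on a precomputed free energy map to see whether a bound ligand sits in a low-energy region.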
the proportion of carbohydrate-containing structures in pdb has recently been reported [302]. figure 10 presents our analysis of data published in the framework of the protein data bank carbohydrate remediation project. the 14117 pdb entries from the carbohydrate remediation list (https://cdn.rcsb.org/wwpdb/docs/documentation/carbohydrateremediation/pdb_carbohydrate_list.list) were sorted by release year and plotted against the growth of pdb structures released annually (https://www.rcsb.org/stats/growth/growth-released-structures) (as of august 10, 2020, 167,327 pdb entries were available). the results indicated that ~8.4% of pdb records contained a carbohydrate moiety. additionally, each pdbx/mmcif file corresponding to a pdb id from the carbohydrate remediation list was parsed to reveal the presence of n- or o-glycosylation site annotations, which resulted in ~4.2% (7076 n-glycosylated entries) and 0.2% (362 o-glycosylated entries) of total database records.
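the percentages quoted above follow directly from the entry counts given in the text, as the short python check below shows. only the counts come from the text; the variable and function names are illustrative.

```python
# Entry counts quoted in the text (PDB snapshot of August 10, 2020).
TOTAL_PDB_ENTRIES = 167_327
counts = {
    "carbohydrate-containing": 14_117,
    "n-glycosylated": 7_076,
    "o-glycosylated": 362,
}


def percent_of_total(count: int, total: int) -> float:
    """Share of the archive, in percent."""
    return 100.0 * count / total


shares = {label: round(percent_of_total(n, TOTAL_PDB_ENTRIES), 1)
          for label, n in counts.items()}
for label, share in shares.items():
    print(f"{label}: ~{share}% of all entries")
```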
a few s- and c-glycans (24 entries in total) were neglected. statistics on glycans in the protein data bank have been reported [259, 302, 317, 325], as have tools that could facilitate the collection of statistical data (glycan reader [70, 260, 261], glyfinder [258], pdb2linucs and pdb-care [326]).
carbohydrate structure visualization in publications and computer interfaces is extremely important in terms of universality of perception, unambiguity, and machine-readability. hence, carbohydrate input [335] [336] [337] and visualization [338, 339] tools are being actively developed. a feature comparison of glycan sketchers, builders and viewers (occasionally including 3d ones) was reported in a recently published review [340].
in our review, we give more emphasis to 3d visualization approaches. although informative for representing glycan primary structure, most graphical input tools, such as glycanbuilder [341], drawrings [342], sugarsketcher [343], drawglycan-snfg [344, 345] and glycoglyph [337], are unsuitable for obtaining and visualizing 3d structural models because they lack underlying modeling and sufficient data conversion functionality. at present, glycan 3d molecular models can be built in user-friendly software that allows glycans to be constructed from individual saccharide components. free web-tools such as glycam-web, charmm-gui, polys glycan builder, gag-builder and sweet-ii should be noted (more references are listed in table 2). a few commercial molecular modeling packages are equipped with special plugins for glycan 3d structure building based on a list of predefined monosaccharide templates, e.g., the sugar builder tool in the hyperchem software (http://www.hyper.com/?tabid=360) [346] or the azahar [235] plugin for the pymol package (schrödinger software) (https://pymol.org/2/) [347].
to render a 3d glycan structure and its conformational features, the structure should be recorded in a notation that includes atomic coordinates, such as mol [348] or pdb [349]. all-atom visualization based on atomic coordinates is supported by the majority of existing molecular modeling software. several carbohydrate structure databases provide interactive 3d visualization using open-source software engines. as one of the pioneers, the glycosciences.de portal developed the pdb2multigif [350] (http://www.glycosciences.de/modeling/pdb2mgif/) visualization pipeline, which generates an animated image of a 3d model from a pdb file using rasmol [351] (http://www.openrasmol.org/). rasmol visualization was included in the w3-sweet [263] (ancestor of sweet-ii) pipeline developed by the same project. nowadays, more advanced interactive visualization applications have been developed for presenting carbohydrate 3d molecules.
the jmol/jsmol [352] (http://www.openrasmol.org/) visualization applet is used to display 3d models of carbohydrate molecules in numerous projects, such as csdb, glycosciences.de, glycam-web and ek3d (see references in table 1). the ngl [353, 354] (http://nglviewer.org/), litemol [355] (https://www.litemol.org/) and mol* [356] (https://www.rcsb.org/news?year=2020&article=5efe0f606378d876901146f8) (https://molstar.org/) 3d viewers are handy for processing macromolecular pdb data stored in glycoproteomics databases (unilectin3d, glycan binding site db, procarbdb, glyconavi, procaff, etc.; see references in table 1) and general proteomics repositories such as pdb [315] (http://www.wwpdb.org/), uniprotkb [357] (https://www.uniprot.org/) or swiss-model [90] (https://swissmodel.expasy.org/repository). the ngl viewer was developed mainly for convenient processing of protein macromolecule structures. it allows only ball-and-stick representation for small molecules or non-peptide fragments, such as saccharide residues. the litemol viewer (and its successor, mol*) can be applied to visualize an arbitrary glycan, with the ability to highlight carbohydrate fragments or display specific interactions in a protein-carbohydrate complex structure. owing to these features, it was implemented in multiple carbohydrate structure databases (e.g., csdb, glyco3d, matrixdb, and eps-db).
despite the absence of experimental 3d structural data, a number of carbohydrate databases can simulate 3d atomic coordinates for deposited or user-supplied compounds from their primary structure, owing to tools developed by the glycoinformatics community. the csdb (restless api [265]), glycosciences.de (sweet-ii [264, 350]) and glycam-web (http://glycam.org/) portals make it possible to generate 3d atomic coordinates in the pdb (all) and mol (csdb) file formats.
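coordinate files in the legacy pdb format are fixed-column text, so the atoms of sugar residues can be pulled out with simple string slicing before being passed to a viewer or an analysis script. the python sketch below illustrates this under stated assumptions: the set of residue codes is a small illustrative subset of carbohydrate component identifiers, the function and class names are hypothetical, and the demo record is synthetic. it is not the method used by any of the tools listed in this review.

```python
from typing import Dict, List, NamedTuple, Tuple

# Illustrative subset of three-letter codes used for common monosaccharides.
SUGAR_CODES = {"NAG", "BMA", "MAN", "FUC", "GAL"}


class Atom(NamedTuple):
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Slice the fixed columns of an ATOM/HETATM record of the legacy PDB format."""
    return Atom(
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )


def sugar_residues(pdb_text: str, codes=SUGAR_CODES) -> Dict[Tuple[str, int, str], List[Atom]]:
    """Group atoms of residues whose name is in `codes` by (chain, number, name)."""
    residues: Dict[Tuple[str, int, str], List[Atom]] = {}
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        atom = parse_atom_line(line)
        if atom.res_name in codes:
            key = (atom.chain, atom.res_seq, atom.res_name)
            residues.setdefault(key, []).append(atom)
    return residues


def format_hetatm(serial, name, res_name, chain, res_seq, x, y, z) -> str:
    """Demo helper: build a synthetic HETATM record with correct column positions."""
    return "HETATM%5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, res_name, chain, res_seq, x, y, z)


demo = "\n".join([
    format_hetatm(1, " C1 ", "NAG", "A", 501, 1.0, 2.0, 3.0),
    format_hetatm(2, " O5 ", "NAG", "A", 501, 1.5, 2.5, 3.5),
    format_hetatm(3, " O  ", "HOH", "A", 601, 9.0, 9.0, 9.0),
])
for key, atoms in sugar_residues(demo).items():
    print(key, [a.name for a in atoms])
```

for pdbx/mmcif files, a dedicated parser is the safer choice, since that format is not column-based.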
polys, developed by the glyco3d project, enables the construction of polysaccharides in pdb format; it was introduced in the matrixdb and eps-db databases. more details are provided in table 2.
atomic coordinates and all-atom molecular models have not been popular in publications because of their poor human readability. the first attempts [358, 359] by prof. kuttel et al. to visualize carbohydrate molecules in an efficient and simple way were made by developing the paperchain and twister graphic algorithms as part of the carbohydra [360] and visual molecular dynamics [361] software packages. later, the group of prof. pérez suggested restricting the visualized molecule to skeletal atoms, with ring planes colored according to the color code adopted in the snfg [338] visualization scheme (sweetunitymol software [362], figure 11a). another unitymol visualization approach, called umbrella visualization [363, 364], was tailored for n-glycan structures. the azahar plugin for pymol [235] affords cartoon models with polygons and rods. several solutions for convenient visualization came with the development of the snfg notation [339]. thus, the group of prof. woods proposed combining molecular structure elements with 3d snfg icons (figure 12a). this convenient visualization technique was integrated in litemol (figure 12b) [365] and mol* (figure 12c) [324, 356]. 3d snfg visualization plugins are available for the visual molecular dynamics platform [366] (http://glycam.org/docs/othertoolsservice/2016/06/03/3d-symbol-nomenclature-for-glycans-3d-snfg/) and, via the tangram plugin (https://github.com/insilichem/tangram_snfg), for the ucsf chimera [367] visualization software. designed as part of the ccp4mg [368] molecular-graphics software, the glycoblocks [369] representation of monosaccharides uses shapes and colors identical to those in snfg (figure 12d).
available as a pymol plugin developed by the widmalm group (http://www.organ.su.se/gw/doku.php?id=3dcfg), the 3d-cfg representation [370], based on the cfg notation [371] (often referred to as a predecessor of snfg), should also be noted as an earlier approach to interpreting carbohydrate 3d structures based on a symbol library. considering the efficiency and usability of 3d representation based on the snfg concept, which is growing popular among glycoscientists, the development of alternative solutions for carbohydrate 3d structure representation has potential for application in glycoinformatics projects. support for colored residues in 3d structures, implemented via jsmol on the glycosciences.de portal, was reported [47] (figure 11b). similarly, the csdb project has developed a 3d viewer (http://csdb.glycoscience.ru/database/core/show_3d.php?csdb=-3)admanp(1-3)[ac(1-2)?dglcpn(1-6)]bdgal?(1-) with carbohydrate residue coloring according to the snfg notation, in the framework of a modeling module based on restless api. in this tool, the user can visualize the input structure as sticks, balls and sticks, or van der waals spheres (figure 11c). options for the aglycone moiety (white) and pseudo-atoms (polymeric repeats, blue caps) are supported (figure 11d).
the development of glycoinformatics resources has a great impact on the handling of the enormous data sets produced by glyco-related research. tools for carbohydrate 3d structural information retrieval provide a framework for validating the quality of experimental and computational data. data sources based on conformational ensemble generation and analysis assist the prediction of structure-function and structure-activity relationships of biologically relevant bioglycans and glycoconjugates.
in this review, we have summarized existing facilities for working with glycan spatial features, which can provide a harmonious network of structural databases, web-services, tools and standalone software for modeling and processing structural data. further advances in this field will help build a better understanding of glycan participation in biological processes and supply the glycoscience community with user-friendly access to voluminous data collections.
funding: the work with carbohydrate molecular modeling and pdb data was funded by russian foundation for basic research grant 18-04-00094. the work with structural databases, glycoinformatic tools and visualization was funded by russian science foundation grant 18-14-00098.
the authors declare no conflict of interest.
three structural aspects of carbohydrates and the relation with their biological properties biological roles of glycans applications of molecular dynamics simulations in immunology: a useful computational method in aiding vaccine design the role of molecular modeling in predicting carbohydrate antigen conformation and understanding vaccine immunogenicity.
in carbohydrate-based vaccines: from concept to clinic evaluation of selected classical force fields for alchemical binding free energy calculations of protein-carbohydrate complexes computational glycoscience: characterizing the spatial and temporal properties of glycans and glycan-protein complexes calculating binding free energies for protein-carbohydrate complexes predicting the structures of glycans, glycoproteins, and their complexes computational docking as a tool for the rational design of carbohydrate-based drugs synthetic glycoscapes: addressing the structural and functional complexity of the glycocalyx function and 3d structure of the n-glycans on glycoproteins the glyspace alliance: toward a collaborative global glycoinformatics community the glycosmos portal: a unified and comprehensive web resource for the glycosciences glygen data model and processing workflow japan consortium for glycobiology and glycotechnology database an accessible glycan structure repository mirage: the minimum information required for a glycomics experiment a focused microarray approach to functional glycomics: transcriptional regulation of the glycome rings: a web resource of tools for analyzing glycomics data. 
in a practical guide to using glycomics databases representing glycophenotypes: semantic unification of glycobiology resources for disease discovery understanding the glycome: an interactive view of glycosylation from glycocompositions to glycoepitopes the glycome analytics platform: an integrative framework for glycobioinformatics glycosylation in health and disease glycoenzymes in glycan analysis and synthesis computational glycobiology: mechanistic studies of carbohydrate-active enzymes and implication for inhibitor design the current structural glycome landscape and emerging technologies carbohydrate structure database merged from bacterial, archaeal, plant and fungal parts bridging isolated islands in the sea of data databases and associated tools for glycomics and glycoproteomics recent advances in glycoinformatic platforms for glycomics and glycoproteomics databases and bioinformatic tools for glycobiology and glycoproteomics unicarbkb: building a knowledge platform for glycoproteomics the complex carbohydrate structure database eurocarbdb: an open-access platform for glycoinformatics glycomedb-integration of open-access carbohydrate structure databases a portal for querying across the digital world of carbohydrate sequences glycoconjugate data bank: structures-an annotated glycan structure database and n-glycan primary structure verification service a new curated relational database of glycoprotein glycan structures and their biological sources a curated relational database of glycoprotein glycan structures and their biological sources. 2003 update bacterial carbohydrate structure database 3: principles and realization a new curated database on glycosyltransferases. glycobiology new database of bacterial carbohydrate structures an internet portal to support glycomics and glycobiology research glycan data retrieval and analysis using glycosciences. 
de applications db: an annotated data collection linking glycomics and proteomics data glyco3d: a portal for structural glycosciences glyco3d: a suite of interlinked databases of 3d structures of complex carbohydrates, lectins, antibodies, and glycosyltransferases an annotated data base of 3 dimensional structures of polysaccharides ek3d: an e. coli k antigen 3-dimensional structure database 3dsdscar-a three dimensional structural database for sialic acid-containing carbohydrates through molecular dynamics simulation hema thanka christlet, t. three dimensional structures of carbohydrates and glycoinformatics: an overview the extracellular matrix interaction database the extracellular matrix interaction database: updated content, a new navigator and expanded functionalities integration of new data with a focus on glycosaminoglycan interactions the exopolysaccharide properties and structures database: eps-db. application to bacterial exopolysaccharides glycan microarray database and analysis toolset advancing glycomics: implementation strategies at the consortium for functional glycomics glycan array data management at consortium for functional glycomics databases for glycoconjugates (glycosmos glycoproteins and glycolipids, glycoprotdb, glyconavi:tcarp, glycopost). glycoforum 2020 integration of glycoscience data in glycosmos using semantic web technologies the glycosmos web portal: glycan structures, glycogenes, glycoproteins, pathways, diseases and more! 
key: cord-266147-s8rxzm0t authors: burnouf, thierry title: modern plasma fractionation date: 2007-03-28 journal: transfus med rev doi: 10.1016/j.tmrv.2006.11.001 sha: doc_id: 266147 cord_uid: s8rxzm0t protein products fractionated from human plasma are an essential class of therapeutics used, often as the only available option, in the prevention, management, and treatment of life-threatening conditions resulting from trauma, congenital deficiencies, immunologic disorders, or infections. modern plasma product production technology remains largely based on the ethanol fractionation process, but much has evolved in the last few years to improve product purity, to enhance the recovery of immunoglobulin g, and to isolate new plasma proteins, such as α1-protease inhibitor, von willebrand factor, and protein c. because of the human origin of the starting material and the pooling of 10 000 to 50 000 donations required for industrial processing, the major risk associated with plasma products is the transmission of blood-borne infectious agents. a complete set of measures (most particularly, the use of dedicated viral inactivation and removal treatments) has been implemented throughout the production chain of fractionated plasma products over the last 20 years to ensure optimal safety, in particular, but not exclusively, against hiv, hepatitis b virus, and hepatitis c virus. in this review, we summarize the practices of the modern plasma fractionation industry from the collection of the raw plasma material to the industrial manufacture of fractionated products. we describe the quality requirements of plasma for fractionation and the various treatments applied for the inactivation and removal of blood-borne infectious agents and provide examples of methods used for the purification of the various classes of plasma protein therapies. 
we also highlight aspects of the good manufacturing practices and the regulatory environment that govern the whole chain of production. in a regulated and professional environment, fractionated plasma products manufactured by modern processes are certainly among the lowest-risk therapeutic biological products in use today. © 2007 elsevier inc. all rights reserved.
collected human plasma may be used as a therapeutic product (known as "clinical plasma" or "fresh frozen plasma") or as source material for the production of pharmaceutical fractionated products (also called "plasma products" or "plasma derivatives"). this complex biologic material contains hundreds of proteins covering a myriad of physiological functions. many components still have undiscovered roles. the most abundant proteins, albumin and immunoglobulin (ig) g, are present at about 35 and 10 g/l, respectively, representing about 80% of all plasma proteins. less abundant proteins include the protease inhibitors, like α1-antitrypsin (aat) (1.5 g/l) and antithrombin (at) (300 mg/l), and the coagulation factors such as factor viii (fviii) (a few ng/l), which exhibit potent physiologic activity. currently, about 20 different plasma protein therapeutics are used for treating life-threatening diseases or injuries associated with bleeding and thrombotic disorders, immunological diseases, infectious conditions, as well as tissue degenerating diseases, thus addressing the clinical needs of countless patients. an updated list of the major therapeutic applications of plasma protein products can be found elsewhere. 1 the industrial process used to isolate therapeutic plasma proteins is known as "fractionation." between 23 and 28 million liters of human plasma are fractionated each year in the world, in batches of several thousand liters, in about 70 factories. 
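the figures quoted above lend themselves to a quick back-of-the-envelope check. the sketch below is only illustrative arithmetic on the approximate values given in the text; the function and constant names are hypothetical, and the per-factory result is a simple average that says nothing about the actual size distribution of fractionation plants.

```python
# approximate figures quoted in the text
ALBUMIN_G_PER_L = 35.0
IGG_G_PER_L = 10.0
ALBUMIN_PLUS_IGG_SHARE = 0.80          # about 80% of all plasma proteins
ANNUAL_VOLUME_RANGE_L = (23e6, 28e6)   # liters fractionated per year
FACTORY_COUNT = 70                     # "about 70 factories"


def implied_total_protein(albumin_g_per_l, igg_g_per_l, combined_share):
    """total plasma protein (g/l) implied by the two main proteins and their share."""
    if not 0 < combined_share <= 1:
        raise ValueError("combined_share must be in (0, 1]")
    return (albumin_g_per_l + igg_g_per_l) / combined_share


def mean_volume_per_factory(annual_volume_l, factory_count):
    """average annual plasma volume (l) handled per factory."""
    if factory_count <= 0:
        raise ValueError("factory_count must be positive")
    return annual_volume_l / factory_count


if __name__ == "__main__":
    total = implied_total_protein(ALBUMIN_G_PER_L, IGG_G_PER_L, ALBUMIN_PLUS_IGG_SHARE)
    print(f"implied total protein: about {total:.0f} g/l")
    for volume in ANNUAL_VOLUME_RANGE_L:
        per_factory = mean_volume_per_factory(volume, FACTORY_COUNT)
        print(f"{volume:.0f} l/year -> about {per_factory:.0f} l per factory")
```

with the quoted values, albumin and igg together (45 g/l) imply a total protein content in the mid-50s g/l, and the average factory would handle a few hundred thousand liters per year.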
modern plasma fractionation combines manufacturing steps to isolate, in a sequential and integrated manner, the crude fractions that are further purified into individual therapeutic products. validated dedicated steps inactivate and/or remove infectious agents potentially present in the starting plasma pool. this sophisticated industrial process is performed under highly hygienic conditions in licensed facilities (plasma fractionation plants) that are operated in compliance with good manufacturing practices and following quality assurance principles. over the years, plasma fractionation has evolved from a medical service activity mostly oriented toward the needs of local communities into a global manufacturing industry conforming to high regulatory standards. these strict requirements start from the collection of plasma for fractionation and include product manufacture and distribution steps. in this article, we review the most current practices encompassing the collection of plasma for fractionation, the core industrial plasma fractionation process, and the purification and pathogen reduction technologies of individual plasma products. the practices used for the collection of plasma for fractionation have a direct influence on the safety profile of protein products, since individual donations contribute to large plasma pools used to manufacture therapeutic preparations intended for hundreds or even thousands of patients. it is therefore logical that the production of plasma is regarded as an integral part of the manufacture of modern fractionated products. collection requirements of plasma for fractionation may differ from those relevant to fresh frozen plasma. in a regulated environment, plasma for fractionation is collected by licensed/registered blood establishments (blood centers and apheresis collection centers) that are inspected by the relevant national regulatory authorities (nras). 
compelled by the same safety concerns, the plasma fractionators conduct audits to verify that the contractual plasma collection, quality, and safety measures agreed upon with the plasma supplier are met. areas of specific relevance include (a) procedures for donor screening and donation testing; (b) labeling, documentation, and traceability requirements; and (c) the handling of blood and plasma. such information is part of the marketing license of plasma products and, in europe, is assembled into a document called the "plasma master file." requirements for the collection of plasma for fractionation have been described in various guides, eg, from the pharmaceutical inspection convention and pharmaceutical inspection cooperation scheme (jointly referred to as pic/s) and the us food and drug administration (fda), and in recent world health organization recommendations. 1 they are summarized below. candidate donors are provided with educational materials and undergo a medical interview to establish the absence of risks or signs of infections and to confirm their eligibility to donate plasma for fractionation (table 1). potential donors presenting a health hazard are asked to exclude themselves. medical information of donors is acquired and archived. continuous epidemiologic surveillance of the donor population is required in some jurisdictions. 2 it helps to establish the background level (prevalence and incidence) and trends of known infectious markers (eg, hiv 1 and 2 antibodies, hepatitis c virus [hcv] antibodies, and hepatitis b surface antigen [hbsag]) in the population. this is also of interest for the early detection of emerging diseases, allowing early implementation of countermeasures (such as more stringent donor screening processes or requirements for additional testing procedures).
donors eligible to donate plasma for fractionation are individuals who meet donation criteria (such as age and donation frequency), do not present risk factors for blood-borne infectious agents, and comply with requirements defined by the plasma fractionator and the nras of the country of plasma collection and of use of the products. in most situations, eligibility criteria for whole blood donors and apheresis donors overlap, apart from donation frequency, which is higher for plasmapheresis donors. eligibility criteria take into account scientific information about the risks of transmission of infectious agents by pooled plasma products (which may differ from those by blood components). special criteria may exist for the collection of hyperimmune plasma (used to make hyperimmune igg preparations), such as procedures for donors' immunization and minimal antibody titer. 1 currently, about 35% of the plasma fractionated in the world is obtained by centrifugation of whole blood ("recovered" plasma), and 65% is obtained by apheresis. blood/plasma collection, processing, and storage may affect plasma quality and the recovery of the most labile proteins such as fviii. in particular, activation of the coagulation, complement, and fibrinolytic systems, which may lead to generation of plasma proteases, should be avoided. to better preserve the integrity of recovered plasma and limit risks of activation of the coagulation cascade and of cellular components, (a) good mixing of the blood with the anticoagulant solution (a sodium citrate-based solution 1 ) should be ensured from the start to the end of the collection process; (b) the duration of the collection should not exceed 15 minutes; and (c) temperature variations of the blood should be avoided. a few hours after donation, whole blood is subjected to a centrifugation that separates the cellular elements (most specifically red cells) from plasma.
the mean plasma volume obtained from one whole blood donation is about 220 ml but varies depending upon the volume of collected whole blood (most often 400-450 ml) and the donor's hematocrit. apheresis plasma (also called "source plasma") is collected from donors through a process where blood is removed from the donor, anticoagulated (generally with a 4% sodium citrate solution), 1 and immediately separated by physical means (centrifugation or filtration, or a combination of both) into components. 3 at minimum, the red cells are returned to the donor while plasma is retained and collected in a container (bag or plastic bottle). the duration of a typical plasmapheresis procedure depends on the number of cycles (and, hence, the volume of plasma collected) and is generally 35 to 70 minutes. apheresis plasma volume may range from 450 to 880 ml, depending upon the country's regulations and collection protocol. apheresis plasma can also be prepared as a byproduct of plateletpheresis ("concurrent plasma"), a procedure used primarily for the collection of platelets. both recovered and apheresis plasmas are suitable for the manufacture of the whole range of fractionated plasma products. the mean content in coagulation factors, more particularly fviii, is lower in recovered than in apheresis plasma because of (a) the longer processing time before freezing (whole blood must be further processed to separate cellular components and plasma), (b) the higher ratio of anticoagulant, and, possibly, (c) the higher level of cellular contamination that may release proteolytic enzymes affecting the stability of coagulation factors. apheresis plasma contains less igg when collected from frequent donors. protein content and quality of fractionated proteins are apparently not affected by the apheresis system used, although residual cell content differs based upon the type and configuration of the cell separation device.
4 plasma from membrane apheresis procedures, like recovered plasma prepared from whole blood leukoreduced on positively charged filters, may contain more activated complement component 3 and 5 (c3a, c5a) anaphylatoxins, 5-7 but the impact on the quality or yield of fractionated products is unknown. [table footnotes: performed by most fractionators. mandatory in europe for hcv. may contribute to viral clearance but does not necessarily result in robust and consistent removal. for small viruses, robust removal is achieved by narrow pore size membranes (less than 20 nm). expected contribution based on experimental studies using spiked tse agents, in the absence of information on the biological nature of the tse agent associated with human plasma.] various infectious agents have been identified as potential contaminants of human blood. 8 bacteria, parasites, and intracellular viruses are not transmitted by plasma products because they are destroyed by freeze-thaw steps or removed by the 0.2- to 1-µm filtration steps used during the processing of fractionated products. pathogenic plasma-borne viruses include hiv, hcv, hepatitis b virus (hbv), west nile virus (wnv), hepatitis a virus (hav), and parvovirus b19 (b19). the various complementary safety nets in place during the production chain of fractionated products, from donor selection to industrial product extraction, to optimize safety against these agents are summarized in table 1. the importance of viral testing on the safety of plasma products has been reviewed. 1, 9, 10 the extent of viral testing of plasma for fractionation takes into account the ability of validated fractionation processes to eliminate viral risks. 11 some testing is performed by blood establishments, and some by plasma fractionators (table 1). individual plasma donations must be negative for anti-hiv 1 and 2, anti-hcv, and hbsag. genomic assays of plasma minipools for nonenveloped hav and b19 may be performed.
11,12 the relevance of testing for hiv p24 antigen or of wnv nucleic acid testing (nat), which may be justified for the safety of non-virally inactivated blood components, is arguable for plasma for fractionation subjected to robust reduction steps for enveloped viruses. the industrial manufacturing pool (usually the cryo-poor plasma that is the first homogeneous pooled plasma fraction) is also tested to confirm the absence of serologic and/or genomic viral markers of hiv, hbv, hcv, hav, and b19. in spite of the most rigorous donor screening and donation testing, infectious viruses may still be present in plasma fractionation pools. therefore, the viral inactivation-removal steps that have been deliberately introduced during plasma product manufacture, and that are described below, play a most critical role in ensuring safety. altogether, these overlapping tests should ensure that the viral load of the manufacturing pool is both minimal and significantly below the viral reduction capacity of the manufacturing processes used. preserving fviii during blood/plasma collection and preparation is important for most fractionators preparing coagulation factor concentrates. apheresis plasma can generally be frozen quickly, ensuring optimal fviii preservation. by contrast, whole blood has to be centrifuged to separate the various components. when blood is cooled to 4°c after collection, plasma should be separated and frozen within 6 to 8 hours to preserve fviii, but when blood is cooled rapidly to a constant 20°c using devices like butanediol plates, coagulation factors are stable for up to 18 to 24 hours. plasma frozen within 72 hours is suitable for igg and albumin production. 1 after separation from cellular elements, plasma for fractionation should be frozen rapidly below -20°c, -25°c, or -30°c, depending upon local regulations. 13 plasma used to manufacture only albumin and igg may be frozen below -20°c within 72 hours of collection.
13 in the us code of federal regulations, apheresis plasma should be stored at -20°c or colder immediately after collection. rapid plasma freezing, to ensure rapid ice front velocity and a core temperature of -20°c, preserves fviii and appears more important than the actual freezing temperature itself. 14 plasma for fractionation is stored at -20°c or colder, typically for several months or more. storage temperature should be as constant as possible, including during transportation to the fractionation facilities. cross-continent or intercontinent shipment of plasma for fractionation is frequent. production steps taking place at fractionation plants are summarized in figure 1, and typical plasma protein downstream methods are presented in table 2. physical compliance with shipping requirements is verified at delivery of the plasma. frozen plasmas are expelled from the plastic containers and pooled for cryoprecipitation and further manufacturing steps, as described below. intermediate fractions generated during production may be stored for subsequent pooling and processing. purified sterile-filtered products are aseptically dispensed into final containers (glass vials or bottles). albumin bottles undergo terminal pasteurization. many products, with the exception of albumin and some igg preparations, are freeze-dried, typically for a duration of 3 to 6 days, depending upon physicochemical characteristics and filled volumes. batches are quarantined while quality controls and checks of production files take place. batches meeting specifications are labeled and packaged and subsequently boxed and shipped for distribution. in a few countries, product batches may be released by regulatory authorities. the production cycle of fractionated products takes a few weeks to several months.
current core fractionation technology largely relies on a backbone process encompassing cryoprecipitation and cold ethanol precipitation steps, as developed in the 1940s by cohn et al 15 in the united states, or modified by kistler and nitschmann 16 in europe. this process involves successive processing steps at defined ethanol concentrations, associated with shifts in ph, temperature, and osmolality that result in selective precipitation of proteins, most notably igg and albumin. precipitates are separated by centrifugation or filtration. in the last few years, the complexity of the fractionation process has increased through (a) the introduction of chromatography to isolate new proteins from existing fractions such as cryoprecipitate, cryo-poor plasma, and cohn fractions; (b) the integration of chromatography into the ethanol fractionation process to increase igg recovery; and (c) the implementation of dedicated viral inactivation or removal steps. chromatography was introduced in the 1960s; however, its application developed mostly in the mid/late 1980s. anion-exchange chromatography and affinity chromatography are frequently used to capture proteins at physiological ph and ionic strength, therefore best preserving functional activity. 17, 18 immobilized heparin and monoclonal antibodies are common affinity chromatography ligands. chromatography is used for 4 specific goals: (a) improvement of product purity, (b) extraction of trace labile proteins, (c) optimization of protein recovery, and (d) removal of viral inactivation agents. figure 2 illustrates a typical industrial fractionation scheme of standard plasma. plasma packs (typically for a batch of 2000-4000 l) are opened under hygienic conditions, and plasma is expelled from the containers and thawed at 1°c to 4°c. cryoprecipitate is isolated using refrigerated continuous centrifuges, recovered from the centrifugation bowls, and frozen at -30°c or colder for storage until further pooling and processing.
the cryo-poor plasma is immediately processed for primary chromatographic capture of labile coagulation factors (such as the factor ix [fix] complex and its components) and protease inhibitors (such as at and c1 esterase inhibitor [c1-inh]). the prepurified intermediates may be stored frozen until further processing. the coagulation factor/anticoagulant-depleted plasma undergoes sequential ethanol precipitation steps. this leads to successive precipitations of fibrinogen, igg and albumin fractions, and intermediates for extraction of other therapeutic proteins, such as aat (fraction iv-1), or igm (fraction iii). depth filtration is preferred to centrifugation to separate precipitates and improve protein recovery. the fractionation of hyperimmune plasma (eg, anti-rhesus) is usually performed on small plasma batch sizes, increasingly using full chromatographic processes to optimize the recovery of igg. no transmission of hiv, hbv, or hcv by products subjected to dedicated viral inactivation treatments has been documented since the end of the 1980s. 19 viral reduction treatments include inactivation steps (where viruses are "killed") and removal steps (where viruses and proteins partition into distinct fractions). use of one, or preferably two, distinct dedicated viral reduction treatments is the current "gold standard" for all plasma products. the first treatment is performed primarily to inactivate the most pathogenic viruses (hiv, hbv, and hcv), whereas the second reduction step targets nonenveloped viruses but also contributes to added safety against all agents. most viral reduction treatments are integrated with the protein fractionation process ("in-process" treatments), but some, currently based on heat inactivation procedures, are applied to products filled in their final container (terminal treatment).
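the contribution of successive viral reduction treatments is commonly expressed as a log10 reduction factor, and the article later quotes removals in "log" units. the python sketch below shows how such factors could be computed and combined. it is only an illustration: all titers are hypothetical, the function names are invented here, and adding the factors of two steps assumes that the steps act by independent mechanisms (for example, an inactivation step followed by a removal step).

```python
import math


def log_reduction_factor(input_load, output_load):
    """log10 of the ratio of infectious units before and after one step.

    both loads are total infectious units (titer x volume).
    """
    if not (input_load > 0 and output_load > 0):
        raise ValueError("loads must be positive")
    return math.log10(input_load / output_load)


def cumulative_reduction(step_factors):
    """sum of per-step factors; assumes independent mechanisms of action."""
    return sum(step_factors)


def residual_load(starting_load, total_log_reduction):
    """infectious units expected to remain after the combined steps."""
    return starting_load / 10 ** total_log_reduction


# hypothetical spiking results for two dedicated steps
inactivation_lrf = log_reduction_factor(1e8, 1e3)  # first step
removal_lrf = log_reduction_factor(1e7, 1e3)       # second step
total_lrf = cumulative_reduction([inactivation_lrf, removal_lrf])

if __name__ == "__main__":
    print(f"step 1: {inactivation_lrf:.1f} log, step 2: {removal_lrf:.1f} log")
    print(f"combined: {total_lrf:.1f} log")
    # hypothetical pool load of 1e4 infectious units
    print(f"residual load: {residual_load(1e4, total_lrf):.1e} units")
```

a residual load far below one infectious unit per pool is the kind of margin meant when the viral load is described as being significantly below the reduction capacity of the process.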
fractionators are required by regulatory authorities to conduct down-scale experimental validation studies using relevant model viruses to establish the efficacy and robustness of viral reduction procedures. the robustness of the viral reduction procedures in place is exemplified by the absence of transmission of wnv, an emerging virus. 20 similarly, the lipid-enveloped severe acute respiratory syndrome coronavirus has been shown to be inactivated by core viral inactivation treatments of plasma products. 21,22 it is also likely, but not yet proven by validation experiments, that avian flu and simian foamy viruses, which also have a lipid envelope, would be inactivated by the processes currently in place if present in plasma. details on viral validation procedures can be found elsewhere. 23, 24 the major characteristics of current viral reduction treatments are summarized in table 3 9,23 and discussed briefly here. in-process viral inactivation treatments. the solvent-detergent (sd) treatment, developed in the mid 1980s, 25 remains the most frequent core viral inactivation procedure of plasma products. protein solutions are incubated for 4 to 6 hours at 24°c to 37°c in the presence of 0.3% to 1% tri-n-butyl phosphate (tnbp) and 1% tween-80 or triton x-100. typically, lipid-enveloped viruses are inactivated in a matter of minutes, and the functional activity of even the most labile plasma proteins, with the possible exception of some serine protease inhibitors, is well preserved, but nonenveloped viruses (less pathogenic in most individuals) are not inactivated. the sd agents are removed down to a level of a few parts per million, usually by chromatographic adsorption or specific precipitation of proteins, or by selective adsorption on a hydrophobic chromatographic support. pasteurization, another common viral inactivation procedure, is a heat treatment of protein solutions for 10 hours at 60°c, a treatment that denatures viral proteins and inhibits virus replication.
26 pasteurization can inactivate both enveloped and nonenveloped viruses, but stabilizers, needed to limit loss of protein functionality, may decrease the rate and extent of viral inactivation. 27 stabilizers may be removed by ultrafiltration, protein precipitation, or chromatography. vapor heat has also been used by one company; the extent of virus inactivation is influenced by the temperature, duration, and pressure during treatment. the risk of neoantigen formation, which can enhance protein immunogenicity, should be considered when using heat-based inactivation processes. low ph incubation, usually at ph 4, at 30°c to 37°c for more than 20 hours, was introduced in the early 1980s to allow the intravenous infusion of igg. this form of treatment was subsequently found to inactivate most lipid-enveloped viruses. caprylic (octanoic) acid precipitation/incubation at ph below 6 is a recently introduced treatment of human igg that can inactivate lipid-enveloped viruses. 28, 29 terminal viral inactivation treatments. because of its limitations, heat treatment of lyophilized products (dry heat) is used in modern plasma fractionation mostly as a secondary viral inactivation step rather than as the core inactivation treatment. the treatment is applied to some coagulation factor concentrates. performed at 80°c for 72 hours or at 100°c for 30 minutes, generally in the presence of protein stabilizers, it provides added safety against hav and other heat-sensitive viruses but may not be sufficient to exclude b19 transmission. 23 terminal (liquid) pasteurization at 60°c for 10 hours is the "gold standard" treatment of albumin preparations. the stabilizers caprylate and tryptophanate, which protect albumin from heat denaturation, are added at doses compatible with therapeutic use and, therefore, are not removed before product infusion. viral removal treatment.
nanofiltration is a specific viral filtration process applied to protein solutions using 15- to 75-nm multilayer membranes, or equivalent systems, to remove viruses mostly by a sieving mechanism. 30,31 introduced in the early to mid 1990s, it has reached wide acceptance as a robust viral removal step for essentially all products, apart from albumin. nanofiltration is used to complement the core viral inactivation treatment and to provide enhanced safety against nonenveloped viruses or other resistant infectious agents. virus removal can also incidentally take place during protein precipitation, chromatography, or filtration steps. these steps contribute to lowering the virus load in the protein production stream, but they are difficult to monitor and therefore do not guarantee, as standalone procedures, a sufficient safety margin. prion removal methods. variant creutzfeldt-jakob disease (vcjd) can be transmitted by red blood cell concentrates, but to date, transmission has not been identified from plasma or plasma products. because of its biological nature, the prion agent is thought to be resistant to current viral inactivation procedures used during plasma fractionation. the methods known to inactivate abnormal misfolded prion proteins associated with transmissible spongiform encephalopathies (prp tse) (such as oxidation, treatment with strong base, chaotropic agents, and extreme heat) destroy plasma proteins and, therefore, cannot be used. still, modern processes for fractionated products would seem to ensure significant removal of prp tse, as suggested by experimental spiking studies. 32 as scale-down and experimental transmissible spongiform encephalopathy (tse) spiking models are developed, knowledge on the capacity of manufacturing processes of plasma proteins to remove prions is growing rapidly, although much uncertainty remains because of the unknown biological nature of the infectious agent associated with human plasma.
33 several processes for manufacturing fviii, fibrinogen, von willebrand factor (vwf), fix, igg, and albumin products have been shown to be capable of removing prions in a consistent and reproducible manner. 34-36 generally, multistep fractionation processes are thought to be a contributing factor to prp tse elimination. removals of 2 to more than 5 log of spiked prions occur during standard protein purification steps such as precipitation with ethanol or polyethylene glycol (peg), anion-exchange chromatography, and depth filtration. 34, 35, 37 the mechanism of removal, which may encompass adsorption, is still not fully understood and appears to be influenced by ph and the concentration of the precipitating agent. the partitioning process may reflect some prion aggregation because of prion hydrophobicity and insolubility. the removal capacity of nanofiltration membranes with pore sizes less than 75 or 35 nm has been extensively investigated, 34, 38 and prion removal is likely due in part to molecular sieving. the extent of removal is related to the pore size of the filters and, presumably, to the aggregation state of prp tse. removal is superior with membranes with a pore size of 15 nm, compared to 35 nm. the nature and origin of the spiking agent used for prp tse clearance studies are of critical importance because various spikes differ in size and characteristics. brain homogenate or brain-derived microsomal fractions from infected animals have usually been used as the source of prp tse spiking material. the most reliable and quantitative detection method for prp tse is based on animal bioassays, which require many animals and a time frame generally longer than 9 months. in vitro immunochemical methods, such as a western blot assay, are being used, at least as a first marker of the presence of prp tse and associated infectivity, by detecting the protease-resistant fragment.
33 thus, the risk of transmission of vcjd by human plasma products appears remote, but caution should prevail since the biochemical nature of the infectious agent in human blood is not known. technologies to extract coagulation factors, protease inhibitors, and igg have evolved considerably in the last 20 years, leading to the development of products with improved safety and purity profiles. a description of major manufacturing techniques is given below. factor viii. several generations of fviii preparations have been developed since the mid-1980s, providing (a) safety from hiv and then hcv and hbv, (b) improved purity, and (c) enhanced safety from hav and b19. current development efforts focus on establishing prion removal. 36 all currently licensed plasma-derived fviii concentrates are purified from cryoprecipitate. in a typical process, cryoprecipitate is subjected to a combination of aluminium hydroxide adsorption and precipitation, or precipitation only (eg, using glycine), to reduce the level of trace vitamin k-dependent coagulation factors (as they may activate fviii during the downstream purification steps) or load proteins such as fibrinogen. the purified cryoprecipitate extract usually undergoes viral inactivation, typically by sd or pasteurization. many processes include subsequent chromatography by anion exchange, monoclonal antibody affinity (using anti-fviii or anti-vwf murine antibodies), or immobilized heparin affinity to remove protein contaminants (such as fibrinogen or fibronectin), most or part of the vwf, and the sd agents. 39 immunopurified fviii eluate is further purified by chromatography to remove murine igg ligands that may have leached. prior to formulation and sterile filtration, some fviii products are nanofiltered using membranes with a pore size of 35, 20, or even 15 nm if a partial dissociation of fviii and high-molecular-weight vwf multimers is initiated.
alternatively, some freeze-dried preparations are subjected to heat treatment at 80°c or 100°c to inactivate nonenveloped viruses like hav. recovery of fviii, as expressed per liter of plasma, is usually between 100 and 200 iu (1 iu is defined as the physiological activity present in 1 ml of plasma). factors lowering fviii yield include cryoprecipitation (ca 30% loss), chromatographic purification (ca 20% to 30%), and viral heat inactivation (15%-30%). current fviii concentrates have a specific activity between 10 and 250 iu/mg. some products are formulated with human plasma-derived albumin, whereas others contain copurified vwf that helps stabilize fviii. purity of fviii concentrates has not been convincingly demonstrated to enhance immunological safety. long-term clinical experience indicates that the alleged reduced immunosuppressive effects of immunopurified preparations compared to lower-purity plasma-derived preparations, as claimed in the late 1990s, were probably unfounded. manufacturing processes, which in part determine the residual vwf content, may have an impact on the immunogenicity of plasma-derived fviii products. retrospective studies of previously untreated patients show that an sd-treated, nanofiltered, ion-exchange purified fviii product containing vwf appears to be 2 times less likely to induce anti-fviii inhibitors in hemophilia a patients than 2 full-length recombinant fviii concentrates. 40 von willebrand factor. because fviii chromatographic purification removes all, or part, of vwf, fviii products effective in treating von willebrand disease (vwd) are generally low-purity ("intermediate purity") preparations prepared from cryoprecipitate by precipitation steps coextracting vwf and fviii in a ratio higher than 1. the low purity and the high protein content of these products technically restrict the choice of viral inactivation treatments to pasteurization or terminal dry heat at 80°c for 72 hours.
one highly purified vwf concentrate, largely devoid of fviii and, therefore, specific for vwd treatment, is prepared from cryoprecipitate by a 3-step chromatographic procedure (integrated with fviii and fibrinogen purification processes) using 2 anion exchangers and immobilized gelatin polishing (to remove fibronectin). 41 viral reduction is by sd, 35-nm nanofiltration, and terminal dry heat at 80°c for 72 hours. 42 fibrinogen. there are 5 registered fibrinogen preparations available for treating afibrinogenemia or hypofibrinogenemia. 43 traditional preparations are obtained by multiple precipitation steps of plasma or cryoprecipitate using ethanol and glycine, whereas other modern products are purified by chromatography. viral reduction is achieved by sd treatment, often complemented by 35-nm nanofiltration or terminal dry heat treatment. single-step pasteurization at 60°c for 20 hours is used for 1 product. fibrin sealants. fibrin sealants (fibrin glues) comprise fibrinogen-rich and purified thrombin concentrates. when the 2 components are mixed, a strong adhesive clot exhibiting hemostatic, sealing, and healing properties is formed almost instantaneously or within a few seconds, offering multiple topical surgical applications. fibrinogen is prepared by precipitation methods from cryoprecipitate, or from the cohn fraction i; the fraction may also contain fibronectin, vwf, or factor xiii (fxiii), which may confer other physiological functions. 44 fibrinogen fractions are virally inactivated by sd, pasteurization, vapor-heat treatment, and/or nanofiltration. the fibrinogen concentration is typically above 80 g/l, and the product may be formulated in the presence of an antifibrinolytic agent. prothrombin complex. prothrombin complex concentrate (pcc) is a mixture of vitamin k-dependent coagulation factors in which fix, factor ii, and factor x and proteins c and s have a low specific activity between 0.5 and 2 iu/mg.
45 a few products also contain factor vii (fvii), but usually at levels lower than that of fix. the manufacture is usually based on a 1960s method that involves diethylaminoethyl (deae) sephadex or deae cellulose adsorption of cryo-poor plasma, but downstream ethanol fractions can also be used as starting materials. 46 anion exchangers coextract proteins sharing the presence of gamma-carboxyglutamic acid residues, whereas bulk plasma proteins, such as albumin and igg or at and aat, remain in the unbound fraction. precipitation with tricalcium phosphate is used for one product. viral reduction is most often achieved by sd (tnbp-tween 80) treatment, complemented by 35- or 15-nm nanofiltration or by terminal dry heat. pasteurization and vapor heat are applied to 2 products. viral inactivation treatment by sd requires one subsequent ion-exchange chromatographic step for removal of the virus-inactivating agents. recovery is in the range of 250 to 380 iu of fix per liter of plasma. single fix. high-purity fix products were developed in late 1989, 47 leading to reduced risks of thromboembolism, compared to pcc, in hemophilia b patients. fix is isolated by chromatographic purification of the pcc using anion exchange combined with either immobilized heparin, metal chelate affinity, or monoclonal antibody. these processes yield fix concentrates with a mean specific activity in the range of 100 to 150 iu/mg and a yield between 200 and 300 iu/l of plasma. factor vii. three specific concentrates rich in fvii and with reduced amounts of other vitamin k-dependent clotting factors are currently licensed to control bleeding in deficient patients. the manufacturing process includes ion-exchange chromatography or aluminium hydroxide adsorption, following a downstream procedure similar to that of pcc and fix. viral inactivation is achieved by sd treatment, vapor heat, or dry heat. factor xi. two factor xi (fxi) concentrates are currently available for deficient patients.
one, of low purity, is purified by chromatography of cryo-poor plasma on deae cellulose and immobilized heparin. after freeze-drying, the product is virally inactivated by dry heat. in the other product, fxi is captured by adsorptive filtration then highly purified by cation-exchange chromatography. 48 this product undergoes dual viral reduction processing by sd and 15-nm nanofiltration. factor xiii. factor xiii is a transglutaminase that catalyzes the final step in the coagulation cascade, cross-linking the loose fibrin polymer into a highly organized structure. the early generation of fxiii concentrates for the treatment of fxiii-deficient patients was extracted from placenta, but 2 plasma-derived products have subsequently been developed. one is obtained from a cold-ethanol fraction of cryoprecipitate supernatant and purified by precipitation with sodium citrate and removal of fibrinogen by heating. the product is pasteurized in sorbitol solution, ultrafiltered to remove sorbitol, adsorbed with bentonite, and freeze-dried. the other product is obtained by precipitation steps and is also pasteurized. factor xiii is also a component of some fibrin sealants. although still subject to discussion, the presence of fxiii is claimed to contribute to fibrin γ-chain cross-linking and tensile strength, possibly improving the hemostatic effect. 49 activated coagulation factors. human thrombin concentrates are available so far only as components of fibrin sealant. thrombin is prepared by activation of the pcc, usually in the presence of calcium chloride, followed by viral inactivation treatment by sd, purification by cation-exchange chromatography, and viral removal by 15-nm nanofiltration. final thrombin concentration is usually between 300 and 1000 iu/ml, but preparations with lower potency are available when slower clot formation of the sealant is preferred for surgical applications requiring a longer time for tissue gluing.
a procedure for large-scale production of a purified plasma-derived fviia has been developed in japan. 50 fvii is purified by anion exchange and immunoaffinity chromatography and converted to fviia by autoactivation on an anion-exchange resin and incubation in the presence of ca2+ for 18 hours at 10 °c. this preparation is virally reduced by nanofiltration and dry-heating and is intended for the treatment of hemophiliacs with antibodies against fviii or fix. antithrombin. antithrombin concentrates were the first plasma products extracted by affinity chromatography. production from cryo-poor plasma usually comprises ion-exchange chromatography to remove the pcc components, followed by capture of at on immobilized heparin. viral inactivation is traditionally achieved by pasteurization in the presence of sodium citrate or a combination of sucrose and glycine, although sd treatment is used as well. because heat treatment may partially denature at, a second adsorption step on immobilized heparin can be used to remove altered molecules. recovery is between 250 and 350 u/l of plasma. fraction iv-1 is an alternative starting material, but yield is significantly lower. α1-antitrypsin (α1-protease inhibitor). there are now several licensed aat concentrates, including 3 in the united states. α1-antitrypsin augmentation therapy is indicated for the treatment of patients with lung emphysema secondary to congenital aat deficiency. because aat shares many physicochemical properties with albumin, in particular molecular weight and isoelectric point, it has been difficult to design production methods for aat that do not affect the existing production process for albumin. most preparations are recovered from fraction iv [1] [2] [3] [4] . purification from this waste fraction is rather cumbersome and involves peg precipitation and ion-exchange chromatography, considerably compromising recovery (~0.2 g/l). 
the first preparations developed in the 1990s were virally inactivated by pasteurization, but dual viral reduction steps using sd and nanofiltration are also now used. recent isoelectrofocusing studies have provided evidence indicating that new anodal aat variants are present in at least one of the fda-licensed products, suggesting alteration of some isoforms during purification. loss of a c-terminal positively charged lysine, secondary to carboxypeptidase activity, was proposed as an explanation for these isoelectrofocusing migration shifts. under circumstances when clinical demand for albumin decreases, extracting aat from upstream fractions, such as the supernatant ii+iii, 51 appears a logical trend, offering the possibility of a more effective production scheme, characterized by a recovery of 0.6 to 1 g/l. 51 protein c. there are 2 protein c concentrates manufactured in europe and 1 in japan. in one process, the pcc undergoes cascade purification on 3 ion exchangers, 52 whereas in the other, immunoaffinity and affinity chromatographic processes are combined with ion exchange. viral reduction is achieved by sd treatment, which can be combined with 15-nm nanofiltration or vapor-heat treatment. 53 c1-esterase inhibitor. c1-esterase inhibitor concentrates are used for the treatment of acute phases of angioedema, primarily in the oropharyngeal region and gastrointestinal tract in patients with congenital or acquired c1-inh deficiency. there are 3 products licensed in europe. products are generally purified by chromatography from the cryo-poor plasma after extraction of the pcc and, potentially, at. viral inactivation is achieved by pasteurization, vapor heat, or sd, possibly combined with nanofiltration. essentially all therapeutic albumin preparations are prepared by fractionation of cryo-poor (or pcc-poor and/or at-poor and/or c1-inh-poor) plasma by ethanol fractionation. a critical upstream process step (precipitation ii+iii) separates the igg fraction. 
to optimize recovery, the precipitates generated during the ethanol fractionation process are separated by depth filtration. albumin recovery of 75% to 85% (25-28 g/l) and purity of 96% to 98% are typically obtained. some processes combine ethanol fractionation with a polishing ion-exchange chromatography, which generally improves product purity to ca 99%, whereas in one production method, albumin is purified mostly by anion-exchange, cation-exchange, and size-exclusion chromatography. the adjustment of the concentration of the purified fraction, typically from 4% to 25%, is achieved by ultrafiltration. the standard viral inactivation method is pasteurization, which, according to most pharmacopeias, should be performed in the final container rather than on the albumin batch before aseptic filling. current mean albumin yield is 24 to 26 g/l of plasma. polyvalent igg preparations, either for intramuscular or intravenous uses, are traditionally prepared from the fraction ii that is obtained by stepwise fractionation of cryo-poor plasma using cold ethanol at concentrations up to 25%. increasingly, igg products are extracted from upstream ethanol-precipitated fractions, such as supernatant iii or precipitate ii+iii, to optimize recovery. intermediate igg fractions are subjected to ion-exchange chromatography, caprylic acid, or peg precipitations to remove protein contaminants, proteolytic enzymes, and/or aggregates. most current viral inactivation procedures are low ph incubation, pasteurization, or sd; the caprylic acid treatment, recently introduced in the manufacture of human igg products, is also a robust viral inactivation process for igg. dedicated viral removal by 15- to 35-nm nanofiltration is commonly used to increase the safety against nonenveloped viruses, especially when the core viral inactivation treatments target only lipid-enveloped viruses. 
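the albumin figures quoted above pair a percentage recovery (75% to 85%) with an absolute yield (25-28 g/l). a minimal sketch of that arithmetic follows. the starting albumin content of 33 g/l of plasma is not stated in the text; it is an assumption back-calculated from the quoted numbers, and the function name is purely illustrative.

```python
def recovery_percent(recovered_g_per_l: float, starting_g_per_l: float) -> float:
    """Return recovery as a percentage of the protein initially present per litre of plasma."""
    if starting_g_per_l <= 0:
        raise ValueError("starting content must be positive")
    return 100.0 * recovered_g_per_l / starting_g_per_l


# Assumption (not from the source): albumin content of the starting plasma,
# inferred from 25 g/l corresponding to roughly 75% recovery.
ASSUMED_PLASMA_ALBUMIN_G_PER_L = 33.0

low = recovery_percent(25.0, ASSUMED_PLASMA_ALBUMIN_G_PER_L)
high = recovery_percent(28.0, ASSUMED_PLASMA_ALBUMIN_G_PER_L)
print(f"albumin recovery: {low:.1f}% to {high:.1f}%")
```

under this assumed starting content, both quoted yields fall inside the stated 75% to 85% window, which suggests the two ways of reporting recovery are mutually consistent.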
the igg recovery has long been in the 2.7- to 3.2-g/l range when combining traditional ethanol fractionation processes and centrifugation. depth-filtration and/or chromatographic purification from upstream fractions have improved the mean recovery to the 3.5- to 4.5-g/l range, or more. total chromatographic procedures are increasingly used for the production of hyperimmune igg products because such processes are amenable to the fractionation of smaller plasma volumes and can optimize recovery. the manufacturing process includes at least 2 dedicated viral reduction treatments. other protein components have been fractionated at pilot-scale and subjected to experimental or human trials. inter-α-trypsin inhibitor (iti) is a kunitz-type serine proteinase inhibitor. its inhibitory capacity is carried by bikunin, a chondroitin 4-sulfate proteoglycan, which is covalently linked to its heavy chains h1 and h2, but can be released by proteolytic cleavage. an iti concentrate has been obtained by fractionation of the prothrombin complex on an anion exchanger followed by immobilized heparin and viral inactivation by sd. iti, as a reservoir of bikunin, may be involved in control of inflammatory processes. in a porcine model of endotoxin shock, iti improved the hemodynamic, oxygenation, and coagulation parameters. 54 administration of iti very early after the onset of sepsis or repeated injections at later time points (10 and 20 hours) maintains cardiovascular stability and significantly reduces mortality in a rat model. 55 transferrin is the major iron-binding plasma protein, which may prevent cytotoxic effects or predisposition to septic infection due to accumulated free non-transferrin-bound iron when normal iron use is hampered and/or apotransferrin production is decreased. a liquid apotransferrin concentrate has been obtained from cohn fraction iv by 2 ion-exchange chromatographic steps and ultrafiltration. 
viral safety was ensured by sd treatment, nanofiltration, and peg precipitation. the product had intact iron-binding capacity and maintained the bacterial growth inhibitory effect in serum. in hematological stem cell transplant patients, the product prevented the appearance of non-transferrin-bound iron. apolipoprotein a-i is the principal protein component of the plasma high-density lipoproteins. it prevents the accumulation of cholesterol-loaded macrophages which deposit on the arterial wall as foam cells. apolipoprotein a-i inhibits hepatic lipase and lipoprotein lipase in vitro. a concentrate has been isolated by precipitation from cohn fraction iii. 56 intravenous injections in men were well tolerated in an early clinical trial; clinical observations were consistent with combined inhibition of hepatic lipase and lipoprotein lipase activities. possible clinical applications include the treatment of hypercholesterolemic patients and atherosclerosis. recombination with lecithin forms a high-density lipoprotein complex that could help limit inflammation, endotoxin-induced activation of coagulation, and fibrinolysis in septic conditions. mannan-binding lectin (mbl) is a component of the innate (aspecific) immune system that can bind repetitive structures of mannan groups, such as those on the surface of micro-organisms, activating the complement system and leading to the destruction of a large variety of micro-organisms. the relatively frequent congenital deficiency of mbl is associated with recurrent infections, especially in infants when the specific immune system has not yet matured. an mbl concentrate has been produced from cohn fraction iii. 57 plasmin is the major fibrinolytic enzyme in plasma. encouraging results as a new fibrinolytic agent have been obtained in animal models where plasmin was applied directly to the clot through a catheter to treat peripheral arterial occlusion. 
von willebrand factor cleaving protease (vwf-cp, adamts13) cleaves ultra-large multimers of vwf that enter the blood stream directly after biosynthesis by endothelial cells. if developed, a purified vwf-cp could be useful to treat, in place of plasma, patients with thrombotic thrombocytopenic purpura with congenital or acquired deficiency of vwf-cp. activated protein c can be of value for the treatment of sepsis, as demonstrated through the clinical use of a recombinant preparation. no plasma-derived activated protein c product is yet available for therapeutic use, although a process for a highly purified preparation, in which cryo-poor plasma is purified by immunoaffinity and anion-exchange chromatographic steps and subjected to sd viral inactivation, has been described. 58 there is also research needed to investigate alternative indications for currently available products. potential clinical uses of c1-inh, aimed at benefiting from its role as inhibitor or attenuator of the activation of complement and contact systems, include septicemia, myocardial infarction, capillary leak syndrome, pancreatitis, and organ transplantation. intravenous use of a fxiii concentrate was found to increase epidermal growth factor and transforming growth factor-β, suggesting that it may accelerate wound healing of anastomotic leaks and nonhealing fistulas. factor xiii was also found to have osteoinductive properties, suggesting use in bone tissue engineering. a review on plasma fractionation should also cover the role played by national and international regulatory authorities. over the last few years, assuring the safety of large-pool plasma products has posed formidable challenges to regulatory authorities and fractionators alike. the complexity of the field, encompassing the diversity in blood and plasma product types and manufacturing processes, made it difficult to enact balanced decisions toward ensuring both product safety and guarantee of supply. 
since the 1980s, the agencies regulating the plasma fractionation industry have developed a comprehensive set of measures to ensure the viral safety of plasma products. multiple layers of regulatory oversight of the plasma industry have been established to ensure overlapping safeguards against the risks of the transmission of bloodborne infectious agents. several regulations, guidances, and position statements have been issued by agencies like the us fda and the european medicines evaluation agency, and these are updated as needed. they cover important safety aspects required at all stages of the manufacturing chain, from activities at blood establishments preparing plasma for fractionation to the manufacturing and distribution of plasma products. regulatory oversight includes epidemiologic surveillance of the donor population; donor deferral policies and screening practices; mandatory donation testing; testing of manufacturing plasma pools; validation of viral reduction procedures and other production steps; as well as the assessment of product quality, safety, and efficacy for marketing authorization. in europe, most of the relevant information on plasma collection is assembled in the plasma master file, which establishes key levels of information regarding the quality and safety of the plasma raw material. post marketing, reports on adverse reactions associated with plasma products should be transmitted to nras and could prompt emergency response procedures and product recalls. 10 some harmonization of the requirements for manufacture and supply of plasma products in the united states, europe, and japan is taking place under the auspices of the international conference on harmonisation, but much work remains to be done. such measures, nonetheless, provide the framework through which modern plasma products now exhibit a very high level of quality, safety, and efficacy. 
however, the rigidity of the regulatory system has been an impediment to more significant technological evolution of the plasma fractionation process because process changes are currently associated with major regulatory work. the cohn plasma fractionation method initially designed to obtain albumin has, over the years, developed rather successfully into a well-established industrial procedure isolating a wide range of clinically useful products. today, more than 20 different protein products, and more if one considers the variety of hyperimmune igg preparations, can be extracted through large-scale fractionation of human plasma. the soundness and large-scale adaptability of the technology, on the one hand, and the rigidity of the current regulatory framework, on the other, explain why this technology remains the main core method in use at industrial scale, although it implies suboptimal yield for most proteins, apart from albumin. the technology has increased in complexity over the years, with the greatest progress in purification being, without doubt, associated with the use of chromatographic methods that have made possible the development of new protein therapeutics and impressive improvements in product purity and quality. the fractionation scheme has also changed dramatically through the introduction of in-process viral reduction treatments, which have required the addition of downstream techniques such as chromatography and ultrafiltration. the change in protein drivers, with the prominent clinical role now played by igg, and the requirements to increase protein recovery and optimize the fractionation process may crystallize the incentive for most fractionators to abandon relative technical conservatism and introduce significant technological changes in processing technology. 
the production of igg from precipitate (i)+ ii+iii, already implemented by some manufacturers and, possibly, that of aat from supernatant (i)+ ii+iii, are signs of a gradual evolution in the direction of total chromatographic processing. with so much gained, over the last few years, in the understanding of the key parameters building plasma product quality, safety, and efficacy, one can hope that the regulatory paths, in particular with regard to clinical studies, to license known protein therapeutics prepared by improved, more efficient technology will be simplified. this would, in turn, contribute to an improved supply. as the plasma fractionation industry has so far been targeting mostly proteins that were obvious candidates for replacement therapy, it is also hoped that it will invest more into developing new products because plasma remains a unique source of potential therapeutic proteins. proactive research and development work should be encouraged to isolate and evaluate new therapies among the well characterized plasma proteins with still unknown function. finally, one should not forget that plasma protein therapies are expensive and largely inaccessible to the developing world. the new market drivers in rich countries are likely to diminish economical interest in the manufacture of fviii, which remains a major protein therapy in need in the developing world. the belief that decreased use of plasma-derived fviii or fix in developed economies (as hemophiliacs are switching to recombinant therapies) would increase the supply of products to developing countries may not be economically viable. rather, this switch to recombinant products in rich countries can make poor countries unable to afford the increased cost of products no longer subsidized by the premium price paid in rich countries. 
it therefore remains important to develop affordable viral inactivation and processing technologies gradually allowing developing countries to make use of local plasma resources in a safe manner. 59 guideline on epidemiological data on blood transmissible infections. for inclusion in the guideline on the scientific data requirements for a plasma master file current instrumentation for apheresis. apheresis: principles and practice assessment of complement activation during membrane-based plasmapheresis procedures the effect of leukocyte depletion on the quality of fresh-frozen plasma current safety of the blood supply in the united states farrugia a: guide for the assessment of clotting factor concentrates for the treatment of hemophilia. www.wfh.org. montreal, world federation of hemophilia preparation and properties of serum and plasma proteins. iv. a system for the separation into fractions of the protein and lipoprotein components of biological tissues and fluids eight years experience with the alcohol fractionation procedure of nitschmann, kistler and lergier tabor e: the epidemiology of virus transmission by plasma derivatives: clinical studies verifying the lack of transmission of hepatitis b and c viruses and hiv type 1 contribution and interpretation of studies validating the inactivation and removal of viruses (revised) inactivation of viruses in labile blood derivatives. i. 
disruption of lipidenveloped viruses by tri(n-butyl)phosphate detergent combinations pasteurization of antihemophilic factor and model virus inactivation studies enveloped virus inactivation by caprylate: a robust alternative to solventdetergent treatment in plasma derived intermediates evaluation de l'efficacité des procédés de purification des proteins plasmatiques à éliminer les agents transmissibles non conventionnels distribution of a bovine spongiform encephalopathy-derived agent over ionexchange chromatography used in the preparation of concentrates of fibrinogen and factor viii al: influence of the type of factor viii concentrate on the incidence of factor viii inhibitors in previously untreated patients with severe hemophilia a in vitro study of a triple-secured von willebrand factor concentrate fibrin sealant: scientific rationale, production methods, properties, and current clinical use properties of a highly purified human plasma factor ix:c therapeutic concentrate prepared by conventional chromatography large-scale production and properties of human plasma-derived activated factor vii concentrate effects of inter-alpha-inhibitor in experimental endotoxic shock and disseminated intravascular coagulation delayed administration of human inter-alpha inhibitor proteins reduces mortality in sepsis a minipool process for solvent-detergent treatment of cryoprecipitate at blood centres using a disposable bag system key: cord-270514-36k9xo7f authors: van der woude, roosmarijn; turner, hannah l.; tomris, ilhan; bouwman, kim m.; ward, andrew b.; de vries, robert p. title: drivers of recombinant soluble influenza a virus hemagglutinin and neuraminidase expression in mammalian cells date: 2020-08-14 journal: protein sci doi: 10.1002/pro.3918 sha: doc_id: 270514 cord_uid: 36k9xo7f recombinant soluble trimeric influenza a virus hemagglutinins (ha) and tetrameric neuraminidases (nas) have proven to be excellent tools to decipher biological properties. 
receptor binding and sialic acid cleavage by recombinant proteins correlate satisfactorily with those of whole viruses. expression of ha and na can be achieved in a plethora of different laboratory hosts. for immunological and receptor interaction studies, however, insect and mammalian cell expressed proteins are preferred due to the presence of n‐linked glycosylation and disulfide bond formation. because mammalian‐cell expression is widely applied, an increased expression yield is an important goal. here we report that using codon‐optimized genes and sfgfp fusions, the expression yield of ha can be significantly improved. sfgfp also significantly increased expression yields when fused to the n‐terminus of na. in this study, a suite of different hemagglutinin and neuraminidase constructs is described, which can be valuable tools to study a wide array of different has, nas and their mutants. influenza a virus (iav) is a continuous burden for human and animal health, and its eradication is near impossible given the wild waterfowl reservoir. iav contains a negative-sense segmented rna genome that allows for rapid nucleotide changes and exchange of whole segments, both of which contribute to high variability. iav hxnx subtypes are determined by antigenicity; however, several subtypes are under immune pressure, from which they can escape, resulting in drifted viruses. the two surface envelope proteins of iav have opposing functions: the trimeric hemagglutinin (ha) binds to sialic acid containing glycans to enable the virus to enter cells, 1,2 whereas the tetrameric neuraminidase (na) cleaves sialic acids to release new viral particles from the membrane. 3, 4 na is also important for the cell entry process as it removes decoy receptors. 5, 6 both envelope proteins are therefore of great importance for the viral lifecycle, and elicited antibodies that impede ha and na biological functions are therefore protective. 
[7] [8] [9] elucidating antigenicity, receptor specificity and other biological phenotypes of these two envelope proteins has been aided by means of recombinant soluble multimeric proteins. also, in vaccine development and antiviral discovery, these proteins have proven to be excellent tools. [10] [11] [12] [13] the use of recombinant proteins eliminates the lengthy process of virus generation either by reverse genetics or growth of wild type viruses that in turn are prone to adaptation in eggs and/or cell culture. 14 lab adaptation is especially problematic for older strains of influenza due to multiple rounds of infection in eggs, vero and mdck cells. 15 in addition, contemporary h3n2 viruses adapt quickly to laboratory hosts. 16, 17 moreover, with recombinant proteins there is no need to work in bsl-ii or -iii environments. individually expressed ha and na proteins enable their functions, such as receptor specificity for ha or sialidase activity for na, to be analyzed in great detail. here we report our observations gleaned over a decade of recombinant ha and na protein expression in mammalian cells. 16 we demonstrate increased expression yields using codon-optimized sequences and genetic fusions of super folder gfp (sfgfp). [17] [18] [19] although codon-optimization might not sound surprising, sfgfp fusions are generally utilized to facilitate routine expression and purification techniques. however, we observed a significant increase in expression yields and determined that the fusion reduced the use of expensive antibodies and provided an excellent handle, as well as an internal readout, for a glycan-binding protein. for example, we used has of contemporary h1 and h3 vaccine strains; the latter have been increasingly difficult to express and crystallize, most likely due to an increased number of potential n-linked glycosylation sites that may result in an elongated retention time in the er and golgi. 
20 furthermore, we applied the same principles to several na subtypes, n1, n2, and n9. the n-terminal sfgfp increases yields, maintains biological activity, structure and antigenicity, and aids protein quantitation during expression and purification. our results should be valuable for other labs interested in the use of recombinant ha, na, and perhaps other viral envelope proteins. recombinant soluble ha was created with the use of an expression plasmid in which the open reading frame (orf) is preceded by a human cytomegalovirus (cmv) promoter and a cd5-derived signal peptide for efficient translation and transport to the cell culture medium (figure 1a). the sfgfp is cleavable at a tobacco etch virus (tev) protease recognition and cleavage site. all codon optimizations (ha, na, and sfgfp) were performed with genscript's proprietary software, and standard cloning sites in the open reading frame were removed. to demonstrate the utility of codon-optimization for protein expression we created plasmids with and without codon-optimized genes of contemporary h3n2 and h1n1 influenza a virus vaccine strains. we chose two h3n2 vaccine antigens, a/texas/2012 [a/tx/12] and a/switzerland/2013 [a/ch/13], and the corresponding h1n1 a/michigan/15 [a/mi/15]. codon-optimization is known to increase expression yields, whereas sfgfp fusions are not, so both sequence versions were cloned into plasmids with and without sfgfp fusions. we determined the yields by western blot and measurement of fluorescence (figure 1b,c). all orfs were efficiently expressed when fused to a c-terminal sfgfp and with codon optimization. while non-codon optimized and non-sfgfp fused variants are also observed in the cell culture supernatants, yields are less than 1 μg/ml. using three examples, it is clear that in some cases only the addition of sfgfp or codon optimization can be sufficient to increase yields. 
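the yield comparisons reported here rest on ratios of band intensity or fluorescence between constructs. a minimal sketch of such a fold-change calculation follows; the readings, the background subtraction step, and the function name are hypothetical and are not taken from the study.

```python
def fold_change(signal: float, baseline: float, background: float = 0.0) -> float:
    """Fold increase of a construct's signal over a baseline construct.

    Both readings are background-corrected first; the baseline must remain
    above background for the ratio to be meaningful.
    """
    corrected_baseline = baseline - background
    if corrected_baseline <= 0:
        raise ValueError("baseline must exceed background")
    return (signal - background) / corrected_baseline


# Hypothetical fluorescence readings (arbitrary units), for illustration only.
background = 100.0            # medium from mock-transfected cells
non_optimized = 200.0         # baseline construct
codon_optimized = 1100.0      # improved construct

print(fold_change(codon_optimized, non_optimized, background))
```

subtracting a mock-transfected background before taking the ratio avoids understating the fold increase when the baseline signal is close to background.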
for the a/tx/13 h3 the sfgfp increases the band intensity and incorporating a codon optimized gene further improved expression efficiency. a/ch/13 h3 ha requires both a sfgfp and codon optimization to express at high yields, whereas for the 2015 h1 vaccine component, a/mi/15 h1n1, codon optimization by itself increased expression 10-fold, but the addition of a sfgfp fusion did not further increase expression. we observed these expression patterns for several ha orfs and decided to show these examples, as it is increasingly difficult to express heavily glycosylated h3 has. as another example we cloned the receptor binding domain of an avian coronavirus into plasmids with and without a c-terminal sfgfp. we again observed an increase of expression yield by two- to four-fold as determined by band intensity (figure 1d). whereas ha is a type i membrane protein, na is type ii with its n-terminus proximal to the membrane. thus, we introduced the sfgfp fusion at the n-terminus. in many expression platforms the original stalk domain is omitted 21 as it appears to be unstable; therefore several different tetramerization domains are routinely included. 4 to properly compare the effect of different multimerization motifs, gcn4 versus tetrabrachion (tb), and sfgfp fusion on expression, folding, and enzymatic activity, we created three different constructs (figure 2a). 4, 22, 23 we tested the gcn4, tb, and sfgfp-tb-na proteins from two h3n2 viruses, nl/16, and nl/03. the resulting n2 proteins were analyzed by western blot where it was evident that the constructs fused to a tb domain maintain oligomerization on gel (figure 2b). to observe monomers, we reduced the samples prior to gel loading. the gcn4 construct appeared as monomers and dimers on sds-page. the reduced monomers of sfgfp-tb, tb, and gcn4 migrated to different positions on gel that reflect their difference in molecular weight. 
in contrast to ha, we did not observe a large increase in expression yield of na using the sfgfp fusion. we analyzed the structural arrangements of our n2 proteins to ensure they were folded into the correct tetrameric native conformation using negative stain em, similar to our previously described sfgfp-ha proteins. 24, 25 the em data demonstrate that the n2 na assembles into a stable tetramer that resembles known na structures (figure 2c). initially, 57,505 individual particles were picked, placed into a stack, and submitted to reference-free two-dimensional (2d) classification. from the initial 2d classes, particles that did not resemble na were removed, resulting in final particle stacks of 32,672 particles, which were then subjected to relion 2d classification. all resultant classes demonstrated evident tetramerization and distinct na, gcn4, and tb motifs, and four sfgfp protein structures could be identified in the em images. to determine that sfgfp-na fusions are enzymatically, antigenically and structurally similar to their non-fused counterparts, we analyzed the gcn4, tb, and sfgfp-tb-n2 proteins with munana and na specific antibodies (figure 3). in the munana assay we determined enzymatic activity by measuring methylumbelliferone, which is released from sialic acid upon digestion, as previously described. 21, 26 the sfgfp-tb-n2 and gcn4-n2 had the highest enzymatic activity (figure 3a), and although tb-n2 also displayed sialic acid cleavage, it was significantly less active. we also tested enzymatic activity of additional sfgfp-tb nas, two additional n2s (nl91 and nl19), an n1 from the 2009 pandemic (ca04) and the 2013 h7n9 na (figure 3b). all nas tested displayed efficient sialic acid cleavage that can be inhibited by oseltamivir. to analyze antigenicity, we requested three monoclonal mouse antibodies to n2 raised against perth09 at the international reagent resource program (https://www.internationalreagentresource.org/). 
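munana readouts of the kind described above are commonly summarized as percent inhibition by oseltamivir relative to an uninhibited control. a minimal sketch under that assumption follows; the rfu values, the blank correction, and the function name are hypothetical and do not reproduce the analysis used in the study.

```python
def percent_inhibition(rfu_inhibited: float, rfu_uninhibited: float,
                       rfu_blank: float = 0.0) -> float:
    """Percent loss of NA activity in the presence of inhibitor.

    Readings are relative fluorescence units (RFU); the substrate-only blank
    is subtracted from both wells before taking the ratio.
    """
    span = rfu_uninhibited - rfu_blank
    if span <= 0:
        raise ValueError("uninhibited signal must exceed the blank")
    return 100.0 * (1.0 - (rfu_inhibited - rfu_blank) / span)


# Hypothetical RFU values, for illustration only.
blank = 50.0              # substrate without enzyme
na_only = 1050.0          # NA with MUNANA substrate
na_plus_drug = 150.0      # NA with MUNANA substrate and oseltamivir

print(f"{percent_inhibition(na_plus_drug, na_only, blank):.1f}% inhibition")
```

the same function can be applied well by well across an na dilution series to check that inhibition is consistent at different enzyme amounts.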
the monoclonal antibodies were conveniently designated as #56 through #58 and binding to coated n2 proteins was analyzed using elisa. we observed efficient binding of antibodies #56 and #57 for all sfgfp-n2s tested, with minimal differences for n2 protein derived from viruses isolated from 1991 to 2019 (figure 4a). antibody #56 also bound to ca0409 n1 whereas antibody #57 did not. ab #58 failed to bind any na that we coated; perhaps this ab is restricted to the homologous perth09 n2, from which a/nl/16 differs by 12 amino acids. however, 2 of the 3 monoclonal antibodies had a considerable breadth across n2 strains and even n1 recognition. finally, we tested a recently published broadly protective na antibody, 1g01, a kind gift of florian krammer, 23,27 both as a capturing as well as a detection antibody (figure 4b). we hypothesized that direct coating of na could potentially block the enzymatic site, which can be overcome with the bigger tb tetramerization domain in which the sfgfp is helpful to present the enzymatic sites. we therefore also coated 1g01 as a capture ab, and applied the recombinant n2 proteins in a concentration-dependent manner; these were subsequently detected using the streptag. in both elisa variants 1g01 efficiently recognized all n2 proteins, either as detecting or as capture antibody. we observed hardly any differences between the differentially expressed n2 proteins. both as a detecting and as a capture antibody, 1g01 was significantly inhibited by oseltamivir. oseltamivir efficiently inhibited capture of n2 proteins, whereas the detection of coated n2 proteins was reduced for n2 with a sfgfp but less efficiently for gcn4 and tb only n2 proteins, confirming our initial hypothesis that direct coating of na can shield the enzymatic site. in this report we describe several optimization techniques that can substantially increase expression yields of recombinant soluble multimeric ha and na proteins. 
these techniques can be extremely useful when large amounts of protein are needed, or when a large number of mutants need to be expressed. we hope that our data will further enable the laboratories creating recombinant ha and na proteins to do so in a cost-efficient manner.

figure 3 enzymatic analysis of sfgfp-fused neuraminidases. (a) enzymatic activity of sfgfp-tb, tb, and gcn4 neuraminidases: na enzymatic activity of preparations containing different amounts of the sfgfp-tb, tb, or gcn4-n2 was determined using the munana fluorometric assay (rfu, relative fluorescence units). oseltamivir was added to inhibit enzymatic activity. the results shown are a representative example of three independent assays performed in triplicate. (b) enzymatic activity of sfgfp-tb neuraminidases of different strains and subtypes: na enzymatic activity of sfgfp-tb n2, n9, and n1 proteins was determined using the munana fluorometric assay (rfu, relative fluorescence units). oseltamivir was added to inhibit enzymatic activity. the results shown are a representative example of three independent assays performed in triplicate.

we want to highlight that one of our main objectives was to minimize the expense of expressing these complex proteins so that they can be used by labs with limited means. our optimization techniques individually gave two- to five-fold increases in protein expression, and combinations thereof could further increase yields 10-fold. although our methods are generally applicable, we note that, as with any approach for recombinant protein expression, results are ultimately protein dependent and can vary based on the ha or na subtype. for example, for efficient expression of h1, the fold increases were remarkably lower compared to h3 as described in this report. especially interesting for us was the increase in expression yield when ha or na was fused to an sfgfp, which is in accordance with observations for glycosyltransferases.
19 nevertheless, the sfgfp-induced increase in expression yield is protein dependent: for the coronavirus spike receptor binding domain the increase is only two- to fourfold, while no increase in yield was observed for hiv env. 28

figure 4 antigenic analysis of sfgfp-fused neuraminidases. (a) antigenicity of na: purified n2 proteins were analyzed with three titrated mouse monoclonal antibodies. n2 proteins were coated on maxisorp 96-well plates and serially diluted antibodies were detected using a rabbit anti-mouse antibody. the results shown are a representative example of three independent assays performed in triplicate. (b) pan-na antibody 1g01 efficiently recognizes recombinant nas and is inhibited by oseltamivir: purified na proteins were detected using 1g01 (left). 1g01 was also used to capture titrated purified n2, which was detected using a mouse anti-streptag hrp-labelled antibody. no significant differences were observed between the differently produced na proteins. oseltamivir was used to inhibit 1g01-na binding. the results shown are a representative example of three independent assays performed in triplicate.

we have not changed a206v in the sfgfp, a change that would result in a monomeric fluorescent protein; 25 apparently our trimerization and tetramerization domains overrule the tendency of sfgfp to dimerize. furthermore, we have not yet used the tev cleavage sites in these constructs, as we found that all biological properties were equal when comparing non-fused and sfgfp-fused proteins. 26 another observation we made is that in buffers with a high salt content and after prolonged storage at 4 °c the sfgfp dissociates, as indicated by a separate sfgfp band on gel. both ha and na structures have been available for decades and now all different subtypes have been crystallized with constructs that include the tb motif, whereas multimerization motifs are normally lacking.
5, 27-30 for example, some labs routinely use a vasp, which is perfectly amenable to crystallization after cleavage. 31 crystallization is, however, not feasible for many labs, yet structural information is in many cases vital. low resolution structural information sufficient for mapping antibody epitopes can, however, be obtained from a relatively small amount of protein (<10 micrograms) using negative stain electron microscopy (em). high resolution cryo-em structure determination requires more protein (100 micrograms), an amount that is sufficient to solve atomic structures such as the 3.1 angstrom resolution structure obtained for an n9 protein fused to gcn4, 29,32 which closely resembled the negative stain image presented here. finally, the fusion of the sfgfp facilitates the determination of transfection efficiency and protein production, and has been extremely useful in analyzing biological activity for ha. 24, 25 we now show similar results for na. however, even with the improved yields, na can still necessitate expression in suspension cultures when milligram amounts are desirable. suspension cultures, however, need expensive shaking incubators and medium. ha and na expression in this manuscript was all done in adherent cells, with a routine milligram yield from 100 ml of supernatant for ha, making it amenable to labs with minimal means. for na, however, 100 ml of supernatant yields 250 μg. in conclusion, we demonstrate several ways to increase the expression of multimeric soluble proteins in mammalian cells, which would help to increase workflow and decrease costs.

ha-encoding cdna of a/tx/50/12 (a/tx/12, genbank kc892248.1), a/ch/9715293/13 (a/ch/13) and a/mi/45/15 was synthesized by genscript (dna sequences available upon request). both original (a kind gift of erhard van der vries) and codon-optimized sequences were cloned into the pcd5 expression vector as described previously. 16 the pcd5 expression vector was adapted so that the ha-encoding cdnas are cloned in frame with dna sequences coding for a signal sequence, a gcn4 trimerization motif, a tev cleavage site, an sfgfp if indicated, 24, 26 and the twinstrep tag (iba, germany). the codon-optimized na genes were synthesized at genscript; after conventional restriction enzyme cloning, 30 the open reading frame was preceded by sequences successively coding for a strep-tag ii and a gcn4-pli tetramerization domain. 34 additionally, we cloned the na genes in a vector adapted so that they are preceded by an sfgfp and/or a tetrabrachion tetramerization motif. 27 endotoxin-free ha and na expression plasmids were transfected into 80% confluent hek293s gnti(−) cells using polyethyleneimine i (pei; linear, 25 kda, polysciences, inc, warrington, pa) at a ratio of 1:9 g/g. before applying the dna-pei mix, which was buffered for 30 minutes in dulbecco's modified eagle's medium (dmem), 30% of the medium was removed to increase surface tension.
at 6 hr post transfection, the transfection mixture was replaced with 293 sfm ii expression medium (gibco), supplemented with sodium bicarbonate (3.7 g/l), glucose (2.0 g/l), primatone rl-uf (3.0 g/l), glutamax (gibco), valproic acid (2 mm), and dmso (1.5%). tissue culture supernatants were harvested 5-6 days post transfection. ha and na protein expression and purification were confirmed by western blotting using a strepmab-hrp classic antibody. whereas ha proteins dissociate during an sds-page run, na proteins need to be reduced to observe the monomeric fraction. additionally, we measured fluorescence in the cell culture supernatant when applicable using a polarstar fluorescent reader with excitation and emission wavelengths of 480 nm and 520 nm, respectively. proteins were purified in a single step using streptactin sepharose in batch format. sfgfp-tb, tb and gcn4-pli na proteins in 10 mm tris, 150 mm nacl at 4 °c were deposited on 400 mesh copper negative stain grids and stained with 2% uranyl formate. the grid was imaged on a 120 kev tecnai spirit electron microscope with a lab6 filament and a 4k × 4k temcam f416 camera. micrographs were collected using leginon 33 and then uploaded to appion. 34 particles were picked using dogpicker, 35 stacked, and aligned using msa/mra. 36 further 2d and 3d processing was undertaken using relion.

4.6 | biological activity and antigenicity of na proteins

na enzymatic activities were measured in 100 mm tris (ph 6.15), 150 mm nacl, 10 mm cacl2 buffer, using the fluorescent substrate 2'-(4-methylumbelliferyl)-α-d-n-acetylneuraminic acid (4-mu-nana) [79] with excitation and emission wavelengths of 365 and 450 nm, respectively. the reaction was conducted for 1 hr at 37 °c in a total volume of 80 μl. the reactions were all performed in triplicate and were stopped by adding 80 μl of 1 m na2co3. elisa plates (maxisorp 96-well) were coated with 1 μg/ml n2 protein in pbs overnight at 4 °c.
plates were blocked for 1 hr at rt with 1% bsa in pbs supplemented with 0.1% tween20. mouse monoclonal antibodies were serially diluted and incubated for 1 hr at rt. primary antibody binding was detected with a secondary rabbit anti-mouse hrp antibody (novus) at 1:2,000 for 1 hr at rt and as an hrp substrate, sigma fast odp tablets were used. a similar procedure was used when 1g01 abs were coated, after blocking, the recombinant na proteins were serially diluted and detected with a mouse anti-streptag hrp labelled antibody at a 1:2,000 dilution. the hrp reactions were stopped after 3 min using 2,5 m h 2 so 4 . optical density was measured at 490 nm, all assays were performed in triplicates and a representative result is shown from three independent biological repeats. virus recognition of glycan receptors the biology of influenza viruses. vaccine the specific enzyme of influenza virus and vibrio cholerae a generic system for the expression and purification of soluble and stable influenza neuraminidase influenza virus neuraminidase structure and functions a new role of neuraminidase (na) in the influenza virus life cycle: implication for developing na inhibitors with novel mechanism of action hemagglutinin stalk-reactive antibodies interfere with influenza virus neuraminidase activity by steric hindrance neuraminidase inhibition contributes to influenza a virus neutralization by antihemagglutinin stem antibodies influenza hemagglutinin and neuraminidase: yin(−)yang proteins coevolving to thwart immunity vaccination with a soluble recombinant hemagglutinin trimer protects pigs against a challenge with pandemic (h1n1) 2009 influenza virus a site of vulnerability on the influenza virus hemagglutinin head domain trimer interface universal influenza virus vaccines that target the conserved hemagglutinin stalk and conserved sites in the head domain drug susceptibility evaluation of an influenza a(h7n9) virus by analyzing recombinant neuraminidase proteins changes in 
influenza virus associated with adaptation to passage in chick embryos influenza passaging annotations: what they tell us and why we should listen the influenza a virus hemagglutinin glycosylation state affects receptor-binding specificity synonymous codon usage bias and the expression of human glucocerebrosidase in the methylotrophic yeast, pichia pastoris better and faster: improvements and optimization for mammalian recombinant protein production expression system for structural and functional studies of human glycosylation enzymes mechanisms of protein retention in the golgi n-glycolylneuraminic acid as a receptor for influenza a viruses expression and characterization of recombinant, tetrameric and enzymatically active influenza neuraminidase for the setup of an enzyme-linked lectin-based assay broadly protective human antibodies that target the active site of influenza virus neuraminidase fluorescent trimeric hemagglutinins reveal multivalent receptor binding properties assessing the tendency of fluorescent proteins to oligomerize under physiologic conditions engineering and characterization of a fluorescent native-like hiv-1 envelope glycoprotein trimer structure of an influenza a virus n9 neuraminidase with a tetrabrachiondomain stalk structure of influenza virus n7: the last piece of the neuraminidase "jigsaw" puzzle structural basis of protection against h7n9 influenza virus by human anti-n9 neuraminidase antibodies recombinant soluble, multimeric ha and na exhibit distinctive types of protection against pandemic swine-origin 2009 a(h1n1) influenza virus infection in ferrets structural characterization of the 1918 influenza virus h1n1 neuraminidase leginon: a system for fully automated acquisition of 1000 electron micrographs a day appion: an integrated, database-driven pipeline to facilitate em image processing a switch between two-, three-, and four-stranded coiled coils in gcn4 leucine zipper mutants dog picker and tiltpicker: software tools to 
facilitate particle selection in single particle electron microscopy topology representing network enables highly accurate classification of protein images taken by cryo electron-microscope without masking drivers of recombinant soluble influenza a virus hemagglutinin and neuraminidase expression in mammalian cells
key: cord-279463-bli8hwda authors: lipp, joachim; dobberstein, bernhard title: the membrane-spanning segment of invariant chain (iγ) contains a potentially cleavable signal sequence date: 1986-09-26 journal: cell doi: 10.1016/0092-8674(86)90710-5 sha: doc_id: 279463 cord_uid: bli8hwda abstract the human invariant chain (iγ) of class ii histocompatibility antigens spans the membrane of the endoplasmic reticulum once. it exposes a small amino-terminal domain on the cytoplasmic side and a carboxyterminal, glycosylated domain on the exoplasmic side of the membrane. when the exoplasmic domain of iγ is replaced by the cytoplasmic protein chloramphenicol acetyltransferase (cat), cat becomes the exoplasmic, glycosylated domain of the resulting membrane protein iγcat∗. deletion of the hydrophilic cytoplasmic domain from iγcat gives rise to a secreted protein from which an amino-terminal segment is cleaved, most likely by signal peptidase. we conclude that the membrane-spanning region of iγ contains a signal sequence in its amino-terminal half and that hydrophilic residues at the amino-terminal end of a signal sequence can determine cleavage by signal peptidase.
translocation of proteins across the membrane of the endoplasmic reticulum (er) requires signal sequences and specific receptors that recognize them (see recent reviews by hortsch and meyer, 1984; walter et al., 1984; rapoport and wiedmann, 1985; wickner and lodish, 1985) . signal sequences have been found at the amino-terminal end of precursors for secretory and transmembrane proteins. in many cases they are cleaved during their translocation across the membrane by a specific protease (signal peptidase). signal sequences are quite variable in length, ranging from 16 to more than 50 amino acid residues (von heijne, 1983) . they all have a central core of hydrophobic amino acid residues, and most of them have a positively charged amino-terminal segment (von heijne, 1985) . signal sequences on nascent polypeptides are recognized by the signal recognition particle (srp), a ribonucleoprotein complex that mediates the interaction with the membrane by the selective binding to docking protein (or srp receptor) (walter et al., 1981b; meyer et al., 1982; gilmore et al., 1982) . membrane proteins are also inserted into the er membrane by an srp-mediated mechanism (anderson et al., 1983; rottier et al., 1985; spiess and lodish, 1986; lipp and dobberstein, 1986) . those spanning the membrane once have either the carboxyl terminus (type i membrane proteins) or the amino terminus (type ii membrane proteins) exposed on the cytoplasmic side. membrane insertion of type i membrane proteins most likely proceeds in a manner very similar to that of secretory proteins (lingappa et al., 1978) .
type i membrane proteins are usually synthesized with a cleavable signal sequence and, in contrast to secretory proteins, are held in the membrane by a "stop transfer" sequence. examples of type i membrane proteins are the vesicular stomatitis virus g protein and class i and class ii histocompatibility antigens (lingappa et al., 1978; dobberstein et al., 1979) . of the type ii membrane proteins so far investigated, all are synthesized without a cleavable signal sequence. the neuraminidase of influenza virus (bos et al., 1984) , the invariant chain (ii or iγ) of class ii histocompatibility antigens (claesson et al., 1983; strubin et al., 1984; long, 1985; lipp and dobberstein, 1986) , the transferrin receptor (schneider et al., 1984) , and the asialoglycoprotein receptor (chiacchia and drickamer, 1984; holland et al., 1984; spiess and lodish, 1986 ) all belong to this class of membrane proteins. some steps in their membrane insertion must be similar to that of secretory and type i membrane proteins, as an srp- and docking protein-dependent membrane insertion has been demonstrated for some of them (spiess and lodish, 1986; lipp and dobberstein, 1986) . membrane insertion might occur in a loop-like fashion as this scheme can most easily explain how the different membrane topologies of membrane proteins are achieved (engelman and steitz, 1981) . as type ii membrane proteins contain only a single stretch of hydrophobic amino acid residues, this might function as a signal for membrane insertion as well as a membrane anchor (markoff et al., 1984; spiess and lodish, 1986) . to identify and characterize this sequence, we tested membrane insertion of the human invariant chain (iγ) and several deletion and fusion proteins derived from it in a cell-free membrane insertion system. iγ is a typical type ii membrane protein (claesson et al., 1983; strubin et al., 1984; lipp and dobberstein, 1986) .
it exposes 30 amino-terminal residues on the cytoplasmic side, spans the membrane between residues 30 and 60, and exposes a large carboxy-terminal domain on the exoplasmic side. this domain has two sites for the addition of n-linked carbohydrate units. membrane insertion of iγ requires srp and docking protein (lipp and dobberstein, 1986) . as the amino-terminal, cytoplasmic domain is hydrophilic and shows no resemblance to a signal sequence, it has been proposed that the membrane-spanning region, or part of it, functions as an internal, uncleavable signal sequence (dobberstein et al., 1983; claesson et al., 1983; lipp and dobberstein, 1986) . we demonstrate here that the membrane-spanning region of iγ is composed of a potentially cleavable signal sequence fused to part of a membrane anchor, which together with the cytoplasmic domain determine the orientation of iγ in the er membrane. deletion of the cytoplasmic domain exposes the signal sequence at the amino terminus of the membrane-spanning region, resulting in cleavage of this otherwise uncleaved signal.

claesson et al., 1993) . piγ: the complete iγ coding sequence and all of its 3' noncoding sequence were cloned behind the t5 promoter (p) in the pds5 expression vector. piγcat: the portion downstream of the psti site in piγ was replaced by the chloramphenicol acetyltransferase (cat) gene, resulting in an in-frame fusion protein. pδn-iγcat: the segment between the sau3a and sstii sites of piγcat coding for the cytoplasmic domain was deleted. a new atg initiation codon right in front of the membrane-spanning segment is provided by the vector (see figure 4a ). the regions coding for protein are boxed. the membrane-spanning region of iγ is indicated by loops; the hydrophilic domains by dots. cat-derived sequences are indicated by slanted lines. the positions of n-linked glycosylation sites in iγ and the potential n-linked glycosylation site in cat protein are indicated by an asterisk.
relevant cleavage sites for restriction endonucleases are also indicated.

protein segments that perform a particular function can be identified by their deletion or addition to unrelated proteins. we used this approach to localize and characterize the region in iγ that is responsible for membrane insertion. deletions and fusions were made at the dna level after cloning of iγ cdna into an expression vector. messenger rna was transcribed from these plasmids and translated in a cell-free system. the resulting proteins were tested for their ability to insert into microsomal membranes (blobel and dobberstein, 1975; stueber et al., 1984) .

plasmids piγ, piγcat, and pδn-iγcat

we have shown previously that cdna sequences cloned behind the strong t5 promoter in pds5 can be transcribed very efficiently by e. coli rna polymerase (stueber et al., 1985) . when transcription is performed in the presence of the cap analog 7mgpppa, the resulting mrna can be translated efficiently in eukaryotic cell-free systems. we have observed, however, that a stretch of gc residues at the 5' end of a cdna negatively affects expression of the resulting rna (unpublished observation). the iγ cdna construct (pγ-2) had been gc tailed and was inserted into the psti site of pbr322 (claesson et al., 1983) . we deleted the 5' gc tail and cloned iγ cdna, or part of it, into the polylinker site of pds5 or p6/5r (see experimental procedures for details). piγ contains the entire iγ coding region behind the t5 promoter ( figure 1 ). piγcat is an in-frame fusion between the 5' region of iγ, encoding the cytoplasmic and membrane-spanning segments plus 12 amino acids of the exoplasmic portion of iγ, and the gene encoding the cytoplasmic protein chloramphenicol acetyltransferase (cat). the cat protein contains one potential site for the addition of n-linked oligosaccharide 36 amino acid residues downstream of its original initiator methionine.
in δn-iγcat, the entire hydrophilic, cytoplasmic segment from iγ was deleted. the new initiator methionine is provided by the vector and is located in front of the hydrophobic segment.

in vitro translation and membrane insertion of iγ

when piγ was transcribed by e. coli rna polymerase and the resulting mrna translated in the wheat germ cell-free system, a single polypeptide species of 27 kd was obtained ( figure 2 , lane 1). this is the expected molecular weight for nonglycosylated iγ (claesson et al., 1983) . when rough microsomes (rm), derived from dog pancreas, were added to the translation system, a higher molecular weight species of 33 kd appeared. this increase of 6 kd in molecular weight is consistent with the addition of two oligosaccharides to the two n-glycosylation sites. the 33 kd form iγ* was reduced in molecular weight by about 2 kd when proteinase k was used to remove the cytoplasmically exposed domain (figure 2 , lanes 2 and 3). when protease digestion was performed in the presence of the detergent np40, iγ* was digested. these data suggest that iγ* is integrated into the membrane and exposes 20-30 amino acid residues on the cytoplasmic side and a 30 kd domain on the exoplasmic side of the membrane. the identity of iγ and its glycosylated form was confirmed by immunoprecipitation with antibodies raised against the amino-terminal 72 (anti-iγn) or against the carboxy-terminal 144 (anti-iγc) residues of iγ. as shown in figure 2 , lanes 5, 6, 8, and 9, these antibodies recognize glycosylated and nonglycosylated forms of iγ. no protein could be precipitated with anti-iγn antibody when the cytoplasmic domain was removed from membrane-integrated iγ* by protease digestion (figure 2, lane 7) . as the antibody is directed against the amino-terminal portion of iγ, the data directly demonstrate that the amino terminus is located on the cytoplasmic side and is accessible to the protease.
with anti-iγc antibody, the processed form of iγ is readily detectable, demonstrating an exoplasmic location of the carboxy-terminal portion of iγ ( figure 2 , lane 10).

membrane insertion of iγcat

an analysis of membrane insertion was performed for iγcat and cat as described above for iγ. cat was expressed from pds5. iγcat was synthesized in the absence of microsomal membranes as a 34 kd protein ( figure 3 , lane 1) and in the presence of microsomal membranes as a 37 kd protein called iγcat* ( figure 3 , lane 2).

in vitro translation and membrane insertion of iγ. piγ was transcribed in the presence of the cap analog 7mgpppa by e. coli rna polymerase. the resulting mrna was translated in the wheat germ cell-free system in the absence (lanes 1, 5, and 8) or presence (lanes 2, 3, 4, 6, 7, 9, and 10) of rm. the membrane topology of iγ was determined by treatment with proteinase k (pk) (lanes 3, 7, and 10) or pk and the detergent np40 (lane 4). proteins were separated by sds-page and visualized by autoradiography. lanes 1-4 show total protein synthesized. samples characterized in lanes 5-7 were immunoprecipitated with an antibody raised against the amino-terminal 72 amino acid residues of iγ (anti-iγn); in lanes 8-10, with an antibody against the carboxy-terminal portion of iγ (anti-iγc).

the increase in molecular weight is consistent with the addition of one n-linked oligosaccharide to the cat-derived portion. there is one potential site for n-linked glycosylation in the cat protein. after protease digestion in the presence of microsomal membranes, iγcat* is reduced in molecular weight by about 2 kd, suggesting that it exposes 20-30 amino acid residues on the cytoplasmic side ( figure 3 , lanes 2 and 3). cat protein obtained after transcription-translation from pds5 is not modified by the added microsomes. as expected, no shift in molecular weight can be seen (figure 3 , lanes 5 and 6).
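the band shifts reported above can be turned into a small worked calculation. the following python sketch is only an illustration: it assumes an apparent shift of about 3 kd per n-linked oligosaccharide on sds-page (the value implied by the 6 kd shift for two sites), and the function name and dictionary are hypothetical helpers, not part of the original study.

```python
def estimate_n_glycans(unmodified_kd, modified_kd, shift_per_glycan_kd=3.0):
    """Estimate how many N-linked oligosaccharides explain a gel band shift.

    shift_per_glycan_kd is an assumption (about 3 kd per oligosaccharide),
    taken from the 6 kd shift reported for the two sites of the invariant chain.
    """
    shift = modified_kd - unmodified_kd
    if shift < 0:
        raise ValueError("modified band must not be smaller than unmodified band")
    return round(shift / shift_per_glycan_kd)


# Apparent molecular weights (kd) without and with rough microsomes,
# as reported in the text.
bands = {
    "invariant chain": (27, 33),
    "invariant chain-CAT fusion": (34, 37),
}

for name, (without_rm, with_rm) in bands.items():
    n = estimate_n_glycans(without_rm, with_rm)
    print(f"{name}: {with_rm - without_rm} kd shift, about {n} oligosaccharide(s)")
```

under this assumption the invariant chain carries two oligosaccharides and the cat fusion one, in line with the number of glycosylation sites stated in the text.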
cat protein was very resistant to protease digestion even in the presence of np40 (figure 3, lanes 7 and 8). iγcat, in contrast, was very sensitive to added protease. this might reflect a difference in conformation between the free cat protein and the cat-derived portion in iγcat. the location of cat outside of the membrane vesicles can be demonstrated by sedimenting the membranes by centrifugation. cat protein is then found in the supernatant (data not shown). we conclude from the data obtained with iγcat and cat that the signal for membrane insertion must be located within the first 72 amino acid residues of iγ. to localize this signal more precisely, we deleted the first 30 residues of iγcat.

membrane insertion of iγcat and cat protein. rna derived from piγcat or pds5 was translated in the wheat germ cell-free system in the absence or the presence of rm. membrane insertion was tested by treatment with proteinase k (pk) and np40. addition of rm, pk, and np40 is indicated at the bottom of each lane.

of δn-iγcat

in all secretory proteins the cleavable signal for membrane translocation is located at the amino-terminal end of the precursor polypeptide. the main feature of this signal appears to be its hydrophobicity. in iγ the only hydrophobic stretch of amino acid residues that resembles a signal sequence is located in the membrane-spanning region about 30 amino acid residues away from the amino-terminal initiator methionine. we asked whether removal of the 30 amino-terminal residues in iγcat would affect its membrane insertion and topology. the cytoplasmic domain of iγcat was deleted and the initiator methionine was placed in front of the membrane-spanning segment. the amino-terminal sequences of iγcat and δn-iγcat as deduced from the dna sequences are shown in figure 4a . when rna derived from pδn-iγcat was translated in the wheat germ cell-free system, a single polypeptide of 29 kd was synthesized, δn-iγcat ( figure 4b , lane 1).
this was, as expected, about 3 kd smaller than the iγcat protein ( figure 4b , lane 1). in the presence of microsomes, two new protein bands appeared, one about 1 kd smaller and one 2 kd larger than δn-iγcat. both of these forms were resistant to proteinase k, indicating that they were inserted into or translocated across microsomal membranes (figure 4b , lanes 3 and 4). we suspected that the smaller molecular weight form was generated by signal peptidase cleavage without concomitant glycosylation and that the larger molecular weight form was glycosylated and cleaved by signal peptidase. these possibilities were tested.

rna derived from pδn-iγcat was translated in the wheat germ cell-free system in the absence or presence of rm. membrane insertion and topology were tested by treatment with proteinase k (pk) and np40. components were added as indicated below the lanes. iγcat translated in the wheat germ cell-free system is shown for comparison.

processed and glycosylated

to detect the signal peptide cleavage of a glycosylated protein on a polyacrylamide gel it is necessary to block its glycosylation, but still allow membrane insertion to occur. addition of n-linked oligosaccharides onto nascent polypeptides can be blocked by including synthetic acceptor peptides in an in vitro membrane insertion assay (bause, 1983; lau et al., 1983) . iγcat and δn-iγcat were translated in the presence of microsomes with and without the acceptor peptide asn-leu-thr. the size of iγcat synthesized in the presence of rm and acceptor peptide was indistinguishable from that made in the absence of rm. when proteinase k was used to digest its cytoplasmically exposed domain, the size was reduced by about 2-3 kd ( figure 5a ). we can conclude that nonglycosylated iγcat synthesized in the presence of rm and acceptor peptide is inserted into the membrane in the same way as its glycosylated form and that no signal sequence is cleaved during membrane translocation ( figure 5a figure 5b, lanes 2 and 3) .
δn-iγcat' was also found to be protected against exogenous proteinase k ( figure 5b, lane 4) . this suggested to us that the larger form was glycosylated and proteolytically processed and that δn-iγcat' was generated by a proteolytic cleavage, most likely by signal peptidase. to determine the site of cleavage in the proteolytically processed forms of δn-iγcat, the positions of leucine in the amino-terminal regions of δn-iγcat and membrane-inserted δn-iγcat*' were determined. δn-iγcat was translated in the absence or presence of rm with [3h]leucine as label. as δn-iγcat is essentially the only protein synthesized from pδn-iγcat-derived mrna, the complete translation mixture was subjected to automated edman degradation. as seen in figure 6a , leucine residues are found at the positions 3, 10, 13, 14, and 15, as predicted from the sequence deduced from pγ-2 cdna (claesson et al., 1983) . the initiator methionine is probably removed during or shortly after translation (kozak, 1983) . the positions of leucine residues in the membrane-translocated forms of δn-iγcat were similarly determined. as rm in the in vitro assay do not translocate all chains, some cytoplasmic forms remained (see inserts in figures 6a and 6b ). leucine residues were found at positions 1, 2, 3, and 13 ( figure 6b) . larger peaks at positions 3 and 10 are consistent with the presence of some unprocessed δn-iγcat (see insert in figure 6b ). taking into account the size reduction of about 1 kd by the processing and the leucine residues at positions 13, 14, and 15 in authentic δn-iγcat, we conclude that processing has occurred between amino acid residues 12 and 13 ( figures 6b and 6c ).
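the reasoning behind this assignment can be sketched in a few lines of python. the sketch assumes only the leucine positions quoted above; the function name and the scoring rule (every shifted precursor leucine must reappear in the processed chain, and the shift explaining the most leucines wins) are illustrative choices, not the authors' procedure.

```python
def infer_cleavage_offset(precursor_leu, processed_leu):
    """Return the number of amino-terminal residues removed by processing.

    For each candidate offset k, precursor leucines at positions > k are
    shifted to position - k. An offset is accepted only if all shifted
    positions are seen in the processed chain; the accepted offset that
    explains the most leucines is returned (None if no offset fits).
    """
    precursor = set(precursor_leu)
    processed = set(processed_leu)
    best_offset, best_matches = None, 0
    for k in range(0, max(precursor)):
        shifted = {p - k for p in precursor if p > k}
        if shifted and shifted <= processed and len(shifted) > best_matches:
            best_offset, best_matches = k, len(shifted)
    return best_offset


# Leucine positions from the Edman degradation data quoted in the text.
unprocessed = [3, 10, 13, 14, 15]   # precursor, initiator methionine removed
translocated = [1, 2, 3, 13]        # membrane-translocated forms

offset = infer_cleavage_offset(unprocessed, translocated)
print(f"residues removed: {offset}; cleavage between residues {offset} and {offset + 1}")
```

with these inputs only a shift of 12 maps leucines 13, 14, and 15 onto positions 1, 2, and 3, which corresponds to cleavage between residues 12 and 13. the leucine at position 13 of the processed chain would lie beyond the sequenced part of the precursor and is therefore not used by the sketch.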
proteolytically processed Δn-iγcat is translocated into the lumen of microsomal vesicles

with the proteolytic removal of 12 of the 30 hydrophobic amino acid residues in the membrane-spanning region of Δn-iγcat, the question arose as to whether the processed protein was still anchored in the membrane or whether it was now released into the lumen of the microsomal vesicles, as is the case for secretory proteins. we used extractability with carbonate as a criterion for membrane integration. treatment of rm with carbonate at ph 11 releases proteins that are not integrated into the lipid bilayer as well as proteins present in the lumen of microsomal vesicles. Δn-iγcat was translated in the presence of rm. membranes were isolated by centrifugation through a sucrose cushion and resuspended in carbonate buffer. solubilized components were then separated from membranes by centrifugation. proteins in the membrane pellet and supernatant were analyzed by sds-page and autoradiography. the membrane-spanning proteins iγ and iγcat and the secretory protein mouse granulocyte-macrophage colony stimulating factor (gm-csf) were used as controls (gough et al., 1985). as shown in figure 7, iγ* and iγcat*, as expected for membrane-spanning proteins, were found in the membrane fraction. both Δn-iγcat' and gm-csf' were found essentially in the soluble, carbonate-released fraction. thus Δn-iγcat' is released into the lumen of the microsomal vesicles after the proteolytic processing. proteolytic processing, as described above for Δn-iγcat, was also obtained for Δn-iγ, a protein that lacks the amino-terminal 30 residues of iγ (data not shown).

our results show that the membrane-spanning segment of the type ii membrane protein iγ contains a potentially cleavable signal sequence. this signal sequence is located in the amino-terminal half of the membrane-spanning segment, and it is cleaved when the preceding cytoplasmic domain is removed.
all properties known to identify a signal sequence and a cleavage by signal peptidase can be demonstrated.

[figure 8 legend, fragment: to restrict vertical mobility of the membrane-spanning segment. (b) Δn-iγcat, during its initial stage of membrane insertion, also spans the membrane with its hydrophobic segment. however, as no charged amino acid residues are present at the extreme amino-terminal end, the hydrophobic segment has some freedom to change its topology across the membrane. part of the hydrophobic segment might now be pulled into the lumen of the er, and a formerly cryptic site for signal peptidase cleavage might become accessible to the active center of signal peptidase.]

first, the cleavage occurs concomitant with insertion into the er membrane, as is typical for cleavable signal sequences of presecretory proteins (blobel and dobberstein, 1975). second, the cleaved segment is located at the amino-terminal end of the deletion protein Δn-iγcat. it is 13 amino acid residues long and composed entirely of hydrophobic or uncharged residues. signal sequences can vary in length from about 15 to over 60 residues. the only structural element identified so far for a signal sequence is its hydrophobic core, usually 8-12 residues long. it is followed by a more polar region 5-7 residues long, which is thought to define the cleavage site for signal peptidase. thus, a "minimal" signal sequence would be composed of an 8 residue hydrophobic core followed by a 5 residue region conferring cleavage specificity (von heijne, 1983, 1985). the segment cleaved from Δn-iγcat would be consistent with such a minimal length signal sequence. finally, the amino acid residues around the cleavage site in membrane-translocated Δn-iγcat*' are consistent with cleavage by signal peptidase.
based on a sequence comparison of 78 eukaryotic signal sequences, von heijne found that only small neutral residues are found at the site of cleavage (-1 position) and that only small neutral and uncharged ones are found at the -3 position, that is, 3 amino acid residues in front of the signal peptidase cleavage site (von heijne, 1983). in the segment cleaved from Δn-iγcat, threonine, a small neutral amino acid, is found at the -1 position, and leucine, an uncharged amino acid, at the -3 position. both of these residues fulfill the above-described criteria for a signal peptidase cleavage site. thus, place (rm) and time of cleavage (cotranslational), hydrophobic character of the cleaved segment, and properties of the cleavage site demonstrate that Δn-iγcat contains a signal sequence at its amino terminus which is cleaved upon membrane insertion by signal peptidase. how can we explain that deletion of the cytoplasmic, hydrophilic segment from iγcat reveals a cleavable signal sequence in a formerly membrane-spanning region? to us the most plausible explanation is that the position of the hydrophobic segment in the membrane is different in iγcat and Δn-iγcat. signal peptidase is known to be an integral membrane protein not exposed on the cytoplasmic side of rm (jackson and blobel, 1977; lively and walsh, 1983; evans et al., 1986). as in many secretory proteins, the cleavage site for signal peptidase is surrounded on either side by 1 or even 2 charged amino acid residues. it is reasonable to assume that the active center of this enzyme is located close to the exoplasmic side of the er membrane, not within the membrane. we propose that the removal of the cytoplasmic, hydrophilic segment from iγcat allows the hydrophobic segment to shift its position within the er membrane. most likely it positions itself more toward the exoplasmic side. hence, a potential signal peptidase cleavage site becomes accessible to the active center of signal peptidase (see figure 8b).
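the (-3, -1) criterion described above can be written as a short check. the following is a minimal sketch under stated assumptions: the residue classes `SMALL_NEUTRAL` and `CHARGED` are illustrative choices for this example and are not von heijne's published tables, and the 15 residue string is a placeholder in which only the positions reported in the text are filled in (leucine at 3, 10, 13, 14, and 15, threonine at 12); all other positions are marked `X` because the actual sequence is not given here.

```python
# Assumed residue classes (one-letter codes); illustrative only.
SMALL_NEUTRAL = set("AGSTC")
CHARGED = set("DEKRH")


def satisfies_minus3_minus1(seq: str, cleave_after: int) -> bool:
    """Check the (-3, -1) criterion for cleavage after residue `cleave_after`.

    Positions are 1-based; cleavage is assumed to occur between residues
    `cleave_after` and `cleave_after + 1`.
    """
    if cleave_after < 3 or cleave_after >= len(seq):
        return False
    minus1 = seq[cleave_after - 1]  # residue at position -1
    minus3 = seq[cleave_after - 3]  # residue at position -3
    return minus1 in SMALL_NEUTRAL and minus3 not in CHARGED


# Placeholder N-terminus: only residues stated in the text are specified.
PLACEHOLDER = "XXLXXXXXXLXTLLL"

print(satisfies_minus3_minus1(PLACEHOLDER, 12))  # True: T at -1, L at -3
print(satisfies_minus3_minus1(PLACEHOLDER, 13))  # False: L at -1
```

the check only encodes the two positions discussed in the text; it does not model the hydrophobic core or the membrane positioning that the following argument relies on.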
it has been noted previously in type i membrane proteins that a deletion of the charged amino acid residues flanking the membrane-spanning region does not affect the overall topology (zuniga and hood, 1986; cutler et al., 1986). in the case of the e2 glycoprotein of semliki forest virus, it has been shown that mutation of the basic amino acid residues at the cytoplasmic side of the membrane-spanning segment reduces the stability of the mutant protein in the membrane (cutler et al., 1986). when the membrane-spanning regions of type i and type ii membrane proteins are compared, no obvious structural difference can be found. in both types of membrane proteins these regions comprise a stretch of 20 to 30 hydrophobic amino acid residues that is flanked on the cytoplasmic side by positively charged amino acid residues. in type i membrane proteins the segment spanning the membrane does not appear to participate in the initial stage of membrane insertion. type i membrane proteins usually have cleavable signal sequences that initiate the membrane translocation of the amino-terminal half of the protein. the membrane-spanning region, in its position close to the carboxy-terminal end, seems only to function in anchoring the protein in the membrane. yost et al. placed the membrane-spanning segment of the murine surface immunoglobulin heavy chain close to the amino-terminal end of a fusion protein (yost et al., 1983). in this position the segment did not provide the signal function for membrane insertion. however, as a hydrophilic segment of about 40 amino acid residues precedes the membrane-spanning segment, the question remains whether a membrane-spanning region from a type i membrane protein, when placed in the appropriate surroundings, can also initiate translocation across the er membrane. it is quite conceivable that certain hydrophilic sequences preceding a hydrophobic segment play a crucial role in exposing a potential signal for membrane insertion.
up to now no special structural features, besides hydrophobicity, are known to be crucial for the function of a signal sequence. a common step has been proposed for the early stage of membrane insertion of secretory and membrane proteins (dobberstein et al., 1983; spiess and lodish, 1986; lipp and dobberstein, 1986). this was based largely on the finding that both of these types of proteins require srp and docking protein for their membrane insertion. here, we show that a type ii membrane protein can be converted into a secretory protein by removal of the cytoplasmic segment. this directly demonstrates that the signal for membrane insertion of these two types of proteins can be the same. further deletion into the carboxy-terminal half of the iγ hydrophobic segment is required to elucidate whether the cleaved signal sequence contains all the information for membrane insertion. it is conceivable that the functional signal sequence extends beyond the cleaved signal sequence into the adjacent hydrophobic part. for some secretory proteins it has been observed that the cleavable signal sequence is not sufficient for membrane insertion. in the case of staphylococcal protein a, sequences of the amino-terminal part of the mature protein are required for membrane insertion and correct processing (abrahmsen et al., 1985). srp can arrest elongation of presecretory and type ii membrane proteins after 70 or even more amino acid residues have been polymerized (walter and blobel, 1981a; meyer et al., 1982; lipp and dobberstein, 1986; lipp et al., unpublished data). these domains are then inserted into the er membrane by an as yet unknown mechanism. as the amino terminus of a type ii membrane protein has to remain on the cytoplasmic side, the formation of a loop during membrane insertion has been proposed. in the case of a secretory protein, signal peptidase would be able to act as soon as the loop appears on the exoplasmic side.
an initial interaction of basic residues in a signal sequence with the phosphates of the membrane lipids was originally proposed by inouye for the lipoprotein of e. coli (inouye et al., 1977). our results rule out an essential role of these basic residues in er membrane insertion. the Δn-iγcat protein does not contain any charged amino acid residues preceding the hydrophobic segment. it is nevertheless translocated across the er membrane and processed. the rules that define the cleavage site for signal peptidase in presecretory proteins are not yet fully understood. von heijne points out that the type of amino acids at the -1 and -3 positions in front of the site of cleavage are important in assigning a cleavage site. here we show that sequences at the very beginning of a signal sequence can also influence cleavage by signal peptidase. in the case of iγ, these charged residues can prevent cleavage by signal peptidase. the variability in the length and in the number of charged amino acid residues at the amino terminus of a signal sequence has not as yet been explained. mutation and deletion experiments have clearly shown that charged residues are not essential for membrane insertion. in the light of our findings, we propose that the charged amino acids at the amino terminus of signal sequences function in the alignment of signal sequences in the er membrane such that signal peptidase can cleave at a very specific site with high fidelity. our prediction is that removal of charged residues from the amino-terminal end of signal sequences can lead to an altered or less specific signal peptidase cleavage.

wheat germ was obtained from general mills, california. the acceptor peptide benzoyl-asn-leu-thr-n-methylamide was a generous gift from e. bause, cologne. standard molecular cloning techniques, as described by maniatis et al. (1982), were used.
the cdna clone py-2, containing the entire coding region of the human invariant chain cloned into the psti site of pbr322, was obtained from p. a. peterson's laboratory, uppsala, sweden (claesson et al., 1983). the expression plasmids pds5, pds6, and pds5/3 have been described previously (stueber et al., 1984). they allow efficient transcription by e. coli rna polymerase of cdnas cloned behind the strong t5 promoter p25. figures 9a and 9b summarize the construction of the fusion and deletion plasmids described below.

piγcat

py-2 was digested with psti, and the 317 bp fragment containing the 5' end and the 660 bp fragment containing the 3' end of the iγ coding region were isolated. the 317 bp fragment, coding for the iγ cytoplasmic domain, the membrane-spanning segment, and 12 amino acid residues of the exoplasmic domain, was cleaved by sau3a to remove the g-c tail. the 234 bp sau3a-psti fragment was isolated and cloned into bamhi/psti-cut pds5. this results in an in-frame fusion of the 5' end of iγ to the cat gene.

piγ

initial attempts to clone the complete coding region into pds5 failed. when expressed, this region is probably lethal to the bacterium. to repress transcription from the t5 promoter/operator (p/o) in bacteria, we cloned the lac i repressor between the bla gene and the t5 p/o. this plasmid is called priγcat. for the construction of piγ, priγcat was linearized by psti, and the 660 bp psti fragment, coding for the carboxy-terminal domain of iγ, was ligated into this site. transformants containing the 660 bp fragment were screened for expression of immunoprecipitable iγ chain after in vitro transcription-translation.

to delete the cytoplasmic domain from iγcat, the 950 bp sstii-xbai fragment from piγcat was isolated and ligated at the xbai site of bamhi/xbai-cut p6/5r. the protruding ends at the bamhi and the sstii sites were blunted with s1 nuclease and ligated.
as a result, a new atg initiation codon is placed just in front of the membrane-spanning segment of iγ. the construction was confirmed by dna and amino acid sequence analyses (see figure 4a).

p6/5r

to repress transcription from the t5 promoter, the lac i gene was inserted between the bla gene and the t5 p/o region of pds5/3 (stueber et al., 1984).

antibodies against iγ domains

to raise antibodies against the amino- and the carboxy-terminal domains of iγ, fusion proteins of β-galactosidase and parts of iγ were produced in bacteria and used as antigens to raise antibodies in rabbits. from a psti digest of py-2, the 317 bp fragment coding for the amino-terminal 72 amino acids of iγ and the 660 bp fragment coding for the exoplasmic carboxy-terminal domain of iγ were isolated. each of the fragments was inserted into the psti site of the bacterial expression vector pex1 (stanley and luzio, 1984). fusion proteins expressed in nfl bacteria were separated on preparative sds-polyacrylamide gels (7% acrylamide; laemmli, 1970). protein bands were visualized by koac precipitation, and fusion proteins were eluted from gel slices. two rabbits were immunized with each of the two fusion proteins. antibodies against the amino terminus of iγ (anti-iγn) and its carboxyl terminus (anti-iγc) were obtained. they reacted with authentic iγ chains synthesized by human raji cells (data not shown).

immunoprecipitations

after translation and posttranslational assays, antigens in a 25 µl aliquot were solubilized by adding nonidet p-40 (np40) to 0.5%. then 1 µl of either anti-iγn or anti-iγc antiserum was added, and the mixtures were incubated for 15 min at 4°c. forty microliters of a 1:1 slurry of protein a-sepharose (equilibrated in 0.2% np40, 10 mm tris-hcl [ph 7.5], 150 mm nacl, and 2 mm edta) was added to each sample, and incubation continued for 60 min at 4°c.
beads were sedimented by centrifugation and washed three times with 0.2% np40, 10 mm tris-hcl (ph 7.5), 150 mm nacl, and 2 mm edta, twice with 0.2% np40, 10 mm tris-hcl (ph 7.5), 500 mm nacl, and 2 mm edta, and once with 10 mm tris-hcl (ph 7.5). sample buffer for sds-page was added to the sedimented beads, and antigens were analyzed by sds-page and fluorography.

in vitro transcription and translation

plasmids were transcribed in vitro by e. coli rna polymerase, and the resulting mrna was translated in a wheat germ cell-free system as described by stueber et al. (1984). to test for membrane translocation, rough microsomes from dog pancreas were included in the translation (blobel and dobberstein, 1975). glycosylation of asparagine residues was blocked by the addition of the acceptor peptide benzoyl-asn-leu-thr-n-methylamide to a final concentration of 30 µm (lau et al., 1983; bause, 1983).

assays

to test translocation of in vitro-synthesized proteins across, or their insertion into, the er membrane, accessibility to proteinase k was used. a 10 µl aliquot of a translation mixture containing rough microsomes was incubated for 10 min at 25°c with either 0.3 mg/ml of proteinase k or 0.3 mg/ml of proteinase k and 0.5% np40. further proteolysis was stopped by the addition of phenylmethylsulfonyl fluoride (pmsf) to 0.1 mg/ml, and the sample was further characterized by sds-page (laemmli, 1970) and fluorography or, where indicated in the figure, by immunoprecipitation. to remove secretory and peripheral membrane proteins, rough microsomes were subjected to a carbonate wash with 0.1 m na2co3, ph 11 (fujiki et al., 1982).

we thank p. a. peterson and l. claesson, uppsala, for plasmid py-2; e. bause, cologne, for the acceptor peptide; h. gausepohl for performing automated amino acid analysis; m. t. haeuptle, i. ibrahimi, and d. meyer for critical reading of the manuscript; and annie steiner for expert typing. this work was supported by grant do 199/4-z from the deutsche forschungsgemeinschaft.

received april 23, 1986; revised july 1, 1986.

abrahmsen, l., moks, t., nilsson, b., hellman, u., and uhlen, m. (1985). analysis of signals for secretion in the staphylococcal protein a gene. embo j. 4, 3901-3906.
adams, g. a., and rose, j. k. (1985). structural requirements for a membrane-spanning domain for protein anchoring and cell surface transport. cell 41, 1007-1015.
anderson, d. j., mostov, k. e., and blobel, g. (1983). mechanisms of integration of de novo-synthesized polypeptides into membranes: signal recognition particle is required for integration into microsomal membranes of calcium atpase and of lens mp26 but not of cytochrome b5. proc. natl. acad. sci. usa 80, 7249-7253.
bause, e. (1983). structural requirements of n-glycosylation of proteins. biochem. j. 209, 331-336.
blobel, g., and dobberstein, b. (1975).
lingappa, v. r., katz, f. n., lodish, h. f., and blobel, g. (1978). a signal sequence for the insertion of a transmembrane glycoprotein. similarities to the signals of secretory proteins in primary structure and function. j. biol. chem. 253, 8667-8670.
lipp, j., and dobberstein, b. (1986). signal recognition particle-dependent membrane insertion of mouse invariant chain: a membrane spanning protein with a cytoplasmically exposed amino-terminus. j. cell biol. 102, 2169-2175.
long, e. o. (1985). in search of a function for the invariant chain associated with ia antigens. surv. immunol. res. 4, 27-34.
lively, m. o., and walsh, k. a. (1983).
spiess, m., and lodish, h. f. (1986). an internal signal sequence: the asialoglycoprotein receptor membrane anchor. cell 44, 177-185.
stanley, k. k., and luzio, j. p. (1984). construction of a new family of high efficiency bacterial expression vectors: identification of cdna clones coding for human liver proteins. embo j. 3, 1429-1434.
strubin, m., mach, b., and long, e. o. (1984). the complete sequence of the mrna for the hla-dr associated invariant chain reveals a polypeptide with an unusual transmembrane polarity. embo j. 3, 869-872.
multiple mechanisms of protein insertion into and across membranes.
a stop transfer sequence confers predictable transmembrane orientation to a previously secreted protein in cell-free systems.
clonal variation in cell surface display of an h-2 protein lacking a cytoplasmic tail.

key: cord-276988-bvsz5q6d authors: neu, carolin t.; gutschner, tony; haemmerle, monika title: post-transcriptional expression control in platelet biogenesis and function date: 2020-10-15 journal: int j mol sci doi: 10.3390/ijms21207614 sha: doc_id: 276988 cord_uid: bvsz5q6d platelets are highly abundant cell fragments of the peripheral blood that originate from megakaryocytes. besides their well-known role in wound healing and hemostasis, they are emerging mediators of the immune response and are implicated in a variety of pathophysiological conditions including cancer. despite their anucleate nature, they harbor a diverse set of rnas, which are subject to an active sorting mechanism from megakaryocytes into proplatelets and affect platelet biogenesis and function. the sorting mechanisms are poorly understood, but rna-binding proteins (rbps) have been suggested to play a crucial role. moreover, rbps may regulate rna translation and decay following platelet activation. in concert with other regulators, including micrornas, long non-coding and circular rnas, rbps control multiple steps of the platelet life cycle. in this review, we will highlight the different rna species within platelets and their impact on megakaryopoiesis, platelet biogenesis and platelet function.
additionally, we will focus on the currently known concepts of post-transcriptional control mechanisms important for rna fate within platelets with a special emphasis on rbps. blood platelets-the major players in hemostasis-are small anucleate cell fragments with a characteristic discoid shape and a diameter of 1 to 3 µm that originate from megakaryocytes (mks). platelets are the second most abundant "cell" type in the peripheral blood and constitutive renewal ensures a normal platelet count in the blood, despite their relatively short life span of 7-10 days [1, 2] . platelets are also interesting from a clinical point of view. platelet counts, which range from 150,000 to 350,000/µl of whole blood in healthy subjects, can drastically increase and decrease in pathophysiological conditions, resulting in thrombocytosis or thrombocytopenia, respectively [3] . thus, it is not surprising that a wide range of human pathologies are associated with abnormal platelet counts and/or functions like inflammatory diseases [4] , rheumatoid arthritis [5] , diabetes [6] , pulmonary hypertension [7] , alzheimer's disease [8] , cardiovascular disease [9] , and cancer [10, 11] . the unique life cycle of platelets starts with proplatelet biogenesis from mks in response to certain stimuli. mks develop from hematopoietic stem cells (hscs) present in the bone marrow, yolk sac, fetal liver, and spleen [12] . as a prerequisite, mks initially have to grow and multiply their genetic information via endomitosis. these processes are highly dependent on thrombopoietin (tpo), a cytokine that promotes the growth and development of mks from their precursors and that associates with its mk-specific receptor, i.e., cluster of differentiation 110 (cd110), encoded by the cellular myeloproliferative leukemia virus (c-mpl) oncogene [13] [14] [15] . both c-mpl −/− and tpo −/− mice are thrombocytopenic and have reduced numbers of mks in the bone marrow [16] [17] [18] . 
however, although platelet and mk numbers were significantly reduced, the remaining mks and platelets were structurally normal and platelets were functional as they were still able to form clots and responded to agonists [19] . this suggests that other regulatory factors exist [20] . furthermore, endomitosis, a process during which mks accumulate dna content ranging from 2n to 128n in their big polylobulated nucleus, also contributes to platelet formation and the degree of mk ploidy was shown to influence platelet quantity and quality [21, 22] . mechanistically, endomitosis is a process by which mks become polyploid through repeated rounds of dna replication without cell division, which requires both coordination and restriction of cell cycle components [23] [24] [25] . importantly, endomitosis is triggered by tpo and is necessary for mks to proceed with their final maturation and subsequent proplatelet formation [22] . another purpose of endomitosis is to produce large amounts of proteins and lipids necessary for the mk demarcation membrane system (dms) that serves as a membrane reservoir for proplatelet formation. the origin of the dms likely involves invaginations of the mk plasma membrane, de novo membrane synthesis, as well as contributions from the golgi-derived membranes and close endoplasmic reticulum (er) contacts [26] . in the final step of platelet biogenesis, terminally differentiated mks remodel their cytoplasm to form long protrusions, so-called proplatelets, which extend into bone marrow sinusoids and are the progenitors of mature platelets [27] . however, terminal platelet formation might not end in the bone marrow sinusoids, but likely continues in the circulation and was shown to be substantial in the lung [28, 29] . of note, in an attempt to delineate the final steps of proplatelet maturation and platelet release, a large (2-10 µm) intermediate discoid stage in platelet production, the preplatelet, was identified [30] . 
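the ploidy range quoted above (2n to 128n) can be related to the number of endomitotic rounds by a one-line calculation, since each round of dna replication without cell division doubles the dna content. the sketch below is illustrative; the function name `endomitotic_rounds` and the assumption that every round is a complete doubling starting from 2n are ours.

```python
def endomitotic_rounds(ploidy_n: int, start_n: int = 2) -> int:
    """Number of complete genome doublings needed to go from start_n to ploidy_n.

    Assumes each endomitotic round exactly doubles the DNA content.
    Raises ValueError if ploidy_n is not start_n times a power of two.
    """
    if ploidy_n < start_n or ploidy_n % start_n != 0:
        raise ValueError("ploidy must be a multiple of the starting ploidy")
    ratio = ploidy_n // start_n
    if ratio & (ratio - 1) != 0:  # a power of two has exactly one bit set
        raise ValueError("ploidy is not reachable by complete doublings")
    return ratio.bit_length() - 1


for n in (2, 4, 8, 16, 32, 64, 128):
    print(f"{n}n: {endomitotic_rounds(n)} rounds")
```

under these assumptions a 128n megakaryocyte has completed six rounds of endomitosis.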
preplatelets are anucleate discs that have a thick cortical microtubule coil and are able to convert back into barbell-shaped proplatelets. final platelet release is achieved through continued bidirectional polymerization of microtubules at each end of the proplatelet followed by a fission event. this multi-step biogenesis yields approximately 100 billion platelets each day in an adult human, but these rates can increase by a factor of 10 or more when the demand increases [31] . interestingly, in recent years ex vivo protocols have been developed to generate functional platelets from megakaryocytes, which are differentiated from progenitor/stem cells. these models are generally based on bioreactors with biocompatible porous barriers that support mk function and platelet release is supported by media flow obtained by electronic pumps. these tissue models are especially important because platelet transfusions are common, but supply does not always match demand [32, 33] . however, if cultured platelets are to become a considerable clinical alternative, they must match the quality and functionality of donor-derived platelets. therefore, a better understanding of the mechanisms responsible for mk maturation, platelet generation and release is necessary [34] . mature platelets were first discovered by bizzozero in the late 19th century and have since been ascribed a primarily hemostatic role. while circulating in an inactive state in the blood stream, platelets become activated upon exposure to subendothelial collagen to support thrombus formation, highlighting their outstanding role as first responders in wound healing and hemostasis [35] . normally, intact endothelium expresses and releases factors that prevent platelets from activation and aggregation and block fibrin clot formation. after endothelial injury, subendothelial collagen gets exposed to the blood and initiates activation and adhesion of platelets via von willebrand factor. 
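the production rate quoted above (approximately 100 billion platelets per day) can be combined with the life span of 7-10 days given earlier to obtain a rough steady-state estimate of the circulating platelet count. the sketch below is a back-of-the-envelope calculation only: the blood volume of 5 l is an assumption introduced for this example, and effects such as splenic pooling are ignored.

```python
def steady_state_count_per_ul(production_per_day: float,
                              lifespan_days: float,
                              blood_volume_l: float = 5.0) -> float:
    """Estimate platelets per microliter at steady state.

    At steady state the circulating pool equals production rate times
    mean life span. blood_volume_l is an assumed adult blood volume.
    """
    pool = production_per_day * lifespan_days
    microliters = blood_volume_l * 1e6  # 1 l = 1e6 microliters
    return pool / microliters


for days in (7, 10):
    estimate = steady_state_count_per_ul(1e11, days)
    print(f"life span {days} d: about {estimate:,.0f} platelets/µl")
```

with these inputs the estimate is about 140,000 to 200,000 platelets/µl, which is of the same order as the 150,000 to 350,000/µl range stated for healthy subjects.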
expression of tissue factor (tf) finally activates a sequence of events to establish a stable clot that seals the injury site and prevents excessive blood loss [36]. in recent decades, it has become evident that platelets are not only implicated in hemostasis and thrombosis, but also act as key players in infectious and sterile inflammatory responses as well as cancer [11, 37]. many of the non-hemostatic functions of platelets result from the storage and release of bioactive molecules. these molecules, be they platelet proteins, lipids, growth factors, or cytokines, are mainly found in alpha (α) and dense (δ) granules [38]. the biogenesis of these granules begins in the mk, but maturation continues in circulating platelets [39, 40]. initially, it was hypothesized that granule packaging and release is a random process; however, recent studies suggested that platelets contain heterogeneous populations of granules that differentially release their content upon different physiological stimuli [41, 42]. besides their protein content, platelets contain different nucleic acids including small and long non-coding rnas and protein-coding messenger rnas (mrnas). platelets were first thought to be incapable of regulated gene expression, but researchers have since begun to shed light on the translational machinery and its regulation in platelets in the absence of a nucleus. subsequent studies revealed that platelets possess a functional spliceosome, allowing for rapid adaptations to changing environmental cues and a change of phenotype [43]. in recent years, in-depth transcriptional analyses of platelets have been performed, suggesting that as many as 3000-6000 mrnas are contained in platelets [44-46]. however, the most intriguing question is: how and where do platelets get their content from? while the precise mechanisms are still largely unknown, it is becoming evident that the inheritance of proteins and mrnas from the megakaryocytes is not a random but a highly regulated process.
this review will highlight the different classes of rna species enriched in platelets, will focus on the role of non-coding rnas in platelet biogenesis and function and will feature the role of post-transcriptional control mechanisms with a special emphasis on the role of rna-binding proteins. a dynamic transcriptome is a prerequisite for platelet function, enabling rapid responses to external stimuli. however, rna abundance in platelets is~1000 times less than in leukocytes, leading to challenges when it comes to transcriptome analysis of platelets, as high purity samples are crucial. despite these challenges, independent studies have found a diverse set of nucleic acids within platelets such as different classes of rnas [43, [47] [48] [49] [50] as well as mitochondrial dna (mtdna) [45] . interestingly, it has been observed that the content of total rna decreases over time and young, reticulated platelets show 20-40 times higher rna levels compared to older ones, which indicates either decay and/or release of rna during the lifespan of platelets [51] . the bulk of platelet rna is derived from megakaryocytes, probably involving active transport mechanisms for sorting into platelets. in addition, other cellular sources might exist as well given the intimate interaction of platelets with other cell-types in the circulation or within tissue microenvironments. figure 1 summarizes the plethora of rna molecules in platelets including their source and additionally highlights the importance of intercellular communication via granule release and microvesicle shedding. besides their protein content, platelets contain different nucleic acids including small and long non-coding rnas and protein-coding messenger rnas (mrnas). first thought to be incapable of regulated gene expression, researchers started to shed light on the translational machinery and its regulation in platelets in the absence of a nucleus. 
subsequent studies revealed that platelets possess a functional spliceosome, allowing for rapid adaptations to changing environmental cues and a change of phenotype [43] . in recent years, in-depth transcriptional analyses of platelets have been performed suggesting that as many as 3000-6000 mrnas are contained in platelets [44] [45] [46] . however, the most intriguing question is: how and where do platelets get their content from? while the precise mechanisms are still largely unknown, it is becoming evident that the inheritance of proteins and mrnas from the megakaryocytes is not a random but highly regulated process. this review will highlight the different classes of rna species enriched in platelets, will focus on the role of non-coding rnas in platelet biogenesis and function and will feature the role of posttranscriptional control mechanisms with a special emphasis on the role of rna-binding proteins. a dynamic transcriptome is a prerequisite for platelet function, enabling rapid responses to external stimuli. however, rna abundance in platelets is ~1000 times less than in leukocytes, leading to challenges when it comes to transcriptome analysis of platelets, as high purity samples are crucial. despite these challenges, independent studies have found a diverse set of nucleic acids within platelets such as different classes of rnas [43, [47] [48] [49] [50] as well as mitochondrial dna (mtdna) [45] . interestingly, it has been observed that the content of total rna decreases over time and young, reticulated platelets show 20-40 times higher rna levels compared to older ones, which indicates either decay and/or release of rna during the lifespan of platelets [51] . the bulk of platelet rna is derived from megakaryocytes, probably involving active transport mechanisms for sorting into platelets. 
in addition, other cellular sources might exist as well, given the intimate interaction of platelets with other cell types in the circulation or within tissue microenvironments. figure 1 summarizes the plethora of rna molecules in platelets, including their source, and additionally highlights the importance of intercellular communication via granule release and microvesicle shedding.

figure 1. platelets contain diverse rna species, which are inherited from megakaryocytes. additionally, communication with surrounding cells might influence the rna expression landscape. transport of rna from megakaryocytes to platelets is not well understood but is suggested to be mediated, at least in part, by rna-binding proteins (rbps). pre-mrna and pre-mirna are processed by a functional spliceosome and by the cytoplasmic mirna processing machinery into translatable mrna and mature mirna, respectively, which can be signal-dependent. platelets may also host infectious virus particles and mediate dissemination of viruses in infected individuals. by secreting rna and protein contents via platelet granules or microvesicles, platelets may regulate signaling pathways in various interacting cells, including tumor or endothelial cells.

among the different types of rna, mrnas are the best investigated class in platelets. they present typical features of eukaryotic mrnas, such as a 7-methylguanosine (m7g) 5′ cap followed by a 5′-untranslated region (utr), as well as a 3′-utr and a poly(a) tail at their 3′-end. the m7g cap and poly(a) tail promote mrna translation and stability, while the utrs expose sequences for rna-binding proteins (rbps) and regulatory sites for microrna (mirna)-mediated translational and degradation control [45, 52-54]. getting a complete picture of the (dynamic) nature and composition of the platelet transcriptome is of great interest, and next generation sequencing (ngs) has proven to be one of the preferred techniques for such studies [55, 56]. 
bulk platelet analyses demonstrated that platelets harbor a diverse set of mrnas and showed an enrichment of genes functionally linked to coagulation, platelet degranulation, cytoskeletal dynamics, receptor binding, secretion, and g-protein signaling, all of which are prerequisites for a plethora of signaling pathways. along these lines, it is worth mentioning that platelets were also shown to contain pre-mrnas as well as small nuclear rnas (snrnas), which are functionally linked to gene expression and involved in pre-mrna splicing. in fact, denis et al. showed that platelets contain a complete set of snrnas as well as other essential splicing factors and are capable of signal-dependent splicing. here, in response to integrin engagement and surface receptor activation, platelets precisely excised introns from interleukin (il)-1β pre-mrna, thereby generating a mature mrna transcript that was translated into protein [43]. until then, splicing had been thought to occur only within nuclear boundaries, a view challenged by these findings. since the initial discovery of a functional spliceosome in platelets, several other studies have confirmed the existence and functional importance of splicing for platelet activation. for example, splicing upon platelet stimulation has also been shown for platelet factor xi (fxi) pre-mrna, and significantly higher amounts of pre-mrna were found in resting compared to thrombin-activated platelets [57]. moreover, activation of platelets by fibrinogen and thrombin induced splicing of tissue factor (tf) pre-mrna, leading to increased tf protein expression, procoagulant activity, and accelerated formation of clots. mechanistically, tf pre-mrna splicing was dependent on cdc2-like kinase (clk)1-mediated splicing factor 2 (sf2) phosphorylation [58]. furthermore, lipopolysaccharide (lps) was shown to induce splicing, translation, and secretion of mature il-1β through toll-like receptor 4 (tlr4) [59]. 
similarly, lps stimulation of platelets initiated splicing of cyclooxygenase-2 (cox-2) pre-mrna and ultimately production of the corresponding protein. however, this effect was extremely variable and it appeared that lps stimulated platelets either produced very little or non-active cox-2 protein [60] . of note, signal-dependent splicing, but also mrna degradation and translation might explain controversial findings regarding correlation between mrna and protein expression in platelets with some studies suggesting a weak correlation of the platelet transcriptome and proteome [61] , while others report a rather strong correlation, opening up the question on regulatory mechanisms of mrna metabolism in platelets under physiological and pathological conditions [62] . another intriguing aspect of the platelet transcriptome is the fact that the rna content of platelets is significantly influenced by interactions with surrounding cells including cancer cells. for example, best et al. recently discovered that tumor cells are able to educate platelets and induce an rna expression signature within platelets that could be used for non-invasive cancer diagnostics [63] [64] [65] [66] . moreover, a standardized protocol was established to enable isolation and analysis of spliced platelet mrna, which allowed detection of cancer with high accuracy [63] . this method has the potential to become clinically relevant, as platelet isolation is a rather simple standard procedure, already established in hematology laboratories. importantly, small amounts of only 100-500 pg of total platelet rna were sufficient for diagnostics and allowed separation of cancer patients from healthy individuals with 96% accuracy using mrna sequencing [64] . 
additionally, platelet rna signatures helped to distinguish patients with kirsten rat sarcoma viral oncogene homolog (kras), epidermal growth factor receptor (egfr) and phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (pik3ca) mutant and wild-type tumors, as well as patients with human epidermal growth factor receptor 2 (her2)-amplified versus her2-negative breast cancer and mesenchymal epithelial transition tyrosine kinase (met) overexpressing versus met non-overexpressing lung cancers [64]. although these studies seem promising, there are still some obstacles to be overcome, such as the distinction between primary and metastatic tumors and the evaluation of cancer progression. one should also be aware that non-cancerous systemic factors like inflammation and cardiovascular events could influence platelet rna profiles as well [67]. nevertheless, using platelet-based mrna expression signatures in various diseases including cancer could open up new diagnostic avenues. taken together, rna expression profiling of platelets increased our understanding of the quantity and quality of their rna content. however, future studies are needed to clarify the functional roles of these mrnas and their encoded proteins, and to determine whether these mrnas are simply the result of an inflammatory and/or pro-oncogenic environment. another class of rnas present in platelets is micrornas (mirnas). these 21-24 nucleotide (nt) rnas are known key regulators of eukaryotic gene expression [68]. alteration of mirna expression signatures has been observed in cancer and might predict prognosis and aid in diagnosis [69]. additionally, mirnas are important regulators of megakaryocyte (mk) differentiation from hematopoietic progenitor cells [70]. microarray and rna-sequencing (rna-seq) analysis identified up to 532 different mirnas in human platelets [71]. 
interestingly, post-transcriptional modifications like adenylation and uridylation that are known for their regulatory impact on mirna stability and decay, also seem to play a role in mirna biology in platelets [72] . the relevance of mirnas in platelets has been recognized more and more in recent years and was particularly fueled by the discovery of a functional mirna processing machinery in platelets [49] . in general, mirna biogenesis starts with the transcription of primary mirnas (pri-mirnas), which are subsequently processed into precursor mirnas (pre-mirnas). a prerequisite for these initial steps is the micro-processor complex, consisting of nuclear rnase iii drosha and the digeorge syndrome critical region 8 (dgcr8), which is localized in the nucleus of cells [73] . following nuclear export, the cytoplasmic rnase iii enzyme dicer and trans-activation responsive rna-binding protein 2 (tarbp2) cooperatively cleave the stem of pre-mirna substrates yielding mature mirnas [74] [75] [76] [77] . landry et al. were the first to describe the cytoplasmic pre-mirna processing machinery in human platelets, including dicer and argonaute 2 (ago2) proteins [49] . in contrast, the nuclear microprocessor components drosha and dgcr8 could not be detected, consistent with the anucleate nature of platelets. using rnase activity assays, it was demonstrated that platelet dicer was functional and able to convert pre-let-7a-3 into mature mirna. additionally, platelet mirnas were able to mediate rna silencing as exemplified by the regulation of the purinergic receptor p2y12, which is involved in platelet aggregation, through an ago2/mir-223 effector complex contained in platelets [49] . 
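mirna-mediated silencing of the kind described for the ago2/mir-223 complex generally relies on base pairing between the mirna seed (commonly taken as nucleotides 2-8 from the 5′ end) and a complementary site in the target transcript. the following python sketch illustrates that idea only; the sequences are invented for illustration (they are not mir-223 or p2y12), the function names are hypothetical, and real target prediction uses additional criteria beyond a perfect seed match.

```python
# Toy seed-match scan. Sequences below are invented, not real miRNA/UTR data.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str, start: int = 2, end: int = 8) -> str:
    """Return the mRNA motif (5'->3') that pairs with miRNA seed
    nucleotides start..end (1-based, inclusive)."""
    seed = mirna[start - 1:end]
    # Reverse complement: the target pairs antiparallel to the miRNA.
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_sites(mirna: str, utr: str) -> list[int]:
    """Return 0-based start positions of perfect seed matches in the UTR."""
    site = seed_site(mirna)
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]


TOY_MIRNA = "UACGGAUCCAUGCAAGUCUGAA"    # hypothetical 22-nt miRNA
TOY_UTR = "AAAGAUCCGUAAACCCGAUCCGUGG"   # hypothetical 3'-UTR fragment

if __name__ == "__main__":
    print(seed_site(TOY_MIRNA))
    print(find_seed_sites(TOY_MIRNA, TOY_UTR))
```

a transcript with more such sites in its 3′-utr would, under this simplified view, offer more opportunities for mirna binding.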
moreover, functional ago2/mir-223 effector complexes were packaged into microparticles and released by platelets upon thrombin stimulation and subsequently internalized by endothelial cells, where they regulated f-box and wd repeat domain containing 7 (fbxw7) and ephrin a1 (efna1) mrna and protein levels [78]. additionally, platelet-secreted mir-223 was shown to promote endothelial cell apoptosis induced by advanced glycation end products via targeting the insulin-like growth factor 1 receptor (igf1r) [79]. moreover, rna-seq analysis of platelet mirnas in patients with myocardial infarction revealed nine differentially expressed platelet mirnas compared to healthy controls, which were released upon platelet aggregation and taken up by endothelial cells via a vesicle-dependent mechanism [80]. importantly, microvesicle-mediated mirna transfer to endothelial cells altered endothelial cell behavior: it not only influenced endothelial cell survival but also inhibited tube formation, as has been shown for the platelet-derived mirna let-7a, which targeted the 3′-utr of thrombospondin-1 (thbs-1) and inhibited thbs-1 release from endothelial cells [81]. next to endothelial cells, cancer cells are also recipients of platelet mirnas. for example, mir-223 expression was found to be significantly increased in platelets and platelet-derived microvesicles from non-small cell lung cancer (nsclc) patients as well as from mice intravenously injected with lewis lung carcinoma cells. of note, platelet microvesicles effectively delivered mir-223 to lung cancer cells and promoted cancer cell invasion by reducing erythrocyte membrane protein band 4.1-like 3 (epb41l3) levels [82]. these examples highlight the broad range of cell-platelet communication via mirnas, not only in the context of cancer. 
in fact, the majority of circulating mirnas are derived from platelets, and altered signatures of circulating mirnas were identified in patients with type 2 diabetes mellitus and with future myocardial infarction [83-85]. antiplatelet therapy using aspirin and prasugrel significantly decreased the level of platelet mirnas, suggesting that plasma mirna expression could be used as a surrogate marker of platelet activation and of the efficacy of antiplatelet therapy [83]. the potential of mirnas as biomarkers of platelet activity in antiplatelet therapy monitoring has recently been reviewed elsewhere [86]. in summary, platelets inherit a functional mirna pathway from their mother cells, which enables a complex crosstalk and post-transcriptional gene expression control based on the regulatory nature of mirnas [49]. thus, platelet-secreted mirnas are important modulators of signaling pathways in target cells and could serve as disease biomarkers, or as activation and therapy response markers. long non-coding rnas (lncrnas), some of which were identified in platelets [52], constitute an additional group of regulatory non-protein-coding rnas. lncrnas can be subdivided into sense, antisense, intronic, bidirectional, and long intergenic ncrnas and are characterized by a length of more than 200 nt [87-89]. importantly, lncrnas have been shown to play important roles in several human diseases, especially in cancer [90-93]. in nsclc, the expression of lncrnas was analyzed in platelets in order to identify novel biomarkers of lung cancer. a selection of five lncrnas was identified, and further measurements revealed a differential expression of two lncrnas, namely, membrane-associated guanylate kinase inverted 2-antisense rna 3 (magi2-as3) and zinc finger nfx1-type containing 1 antisense rna 1 (zfas1), in platelets of nsclc patients compared to healthy subjects [94]. 
both lncrnas have been assigned tumor suppressive functions in human breast cancer [95-97], whereas zfas1 seems to be a potent oncogenic lncrna in other tumor types [98]. in platelets and plasma of nsclc patients, magi2-as3 and zfas1 were downregulated and their expression significantly correlated with tumor stage. in addition, magi2-as3 expression significantly correlated with lymph-node and distant metastasis, and the authors suggested that both platelet-derived lncrnas could be used as potential diagnostic biomarkers in nsclc [94]. however, the molecular function of these and other lncrnas in platelets is currently not well understood, and it is tempting to speculate about a potential transfer of lncrnas from platelets to diverse other cell types to modulate expression programs and phenotypes. circular rnas (circrnas) belong to a group of highly stable non-coding rna species with largely unknown functional relevance in platelets. in general, circrnas can act as sponges to sequester mirnas or rbps. moreover, they serve as scaffolds to mediate complex formation between specific enzymes and substrates and recruit proteins to specific locations (reviewed in [99]). circrna biogenesis involves the action of the spliceosome, which connects the 5′ end and the downstream 3′ end of exons within the same transcript, resulting in a circular conformation that comes with increased stability [100]. interestingly, circrnas were found to be abundant and highly enriched in human platelets, as demonstrated by an analysis of publicly available rna-seq datasets of human platelets [50]. here, the authors analyzed three total rna-seq datasets using a back-splice exon junction discovery pipeline and identified 33,829 distinct structures consistent with circrnas. this number is about 15 times higher than reported for other cells, and 24,632 (~73%) of the structures were also present in at least one of the encyclopedia of dna elements (encode) rna-seq datasets previously mined [101]. 
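the back-splice junction that such discovery pipelines search for can be illustrated with a toy example. the python sketch below is not the pipeline used in [50]; it assumes a single-exon circle, uses an invented exon sequence, and its function names are hypothetical. it also rechecks the ~73% overlap figure from the numbers quoted above.

```python
def backsplice_junction(exon: str, flank: int = 5) -> str:
    """Sequence spanning the junction formed when the 3' end of an exon
    is joined back to its own 5' end (single-exon circle, toy assumption)."""
    if not 0 < flank <= len(exon):
        raise ValueError("flank must be between 1 and the exon length")
    return exon[-flank:] + exon[:flank]


def supports_backsplice(read: str, exon: str, flank: int = 5) -> bool:
    """True if the read contains the back-splice junction sequence."""
    return backsplice_junction(exon, flank) in read


# Invented exon (cDNA alphabet); not a real platelet transcript.
EXON = "ATGGCCAAGTTCGGATAA"
circular_read = EXON[10:] + EXON[:7]  # read that crosses the junction
linear_read = EXON[2:14]              # read from inside the exon

# Share of platelet circRNA structures also seen in ENCODE data,
# computed from the counts quoted in the text.
overlap_pct = 100 * 24632 / 33829

if __name__ == "__main__":
    print(supports_backsplice(circular_read, EXON))
    print(supports_backsplice(linear_read, EXON))
    print(round(overlap_pct, 1))
```

in real data a junction sequence can also arise by chance in repetitive regions, which is one reason actual pipelines rely on genome alignment rather than plain substring search.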
in order to identify genes significantly enriched for circrnas in platelets, the authors analyzed 8041 circrna producing genes and identified 3162 significantly enriched. for the vast majority of enriched genes, the contribution of circrna exons to total transcription was >60% in nucleated tissues but >80% in platelets, and for 15 genes it was 100% in all three samples. thus, for some genes, only rna-seq reads derived from circrna-producing exons could be detected within platelet samples. importantly, circrnas were also enriched in other anucleate cells, but not in cultured megakaryocytes and the authors suggested that this was mainly due to a higher degradation rate of linear rnas compared to circrnas in mature platelets leading to the strong enrichment of the latter over time. intriguingly, extensive mrna degradation could explain some conflicting results concerning the correlation between the platelet transcriptome and proteome [50] . nevertheless, the biological relevance of platelet circrnas, if any, remains unclear: do they modulate platelet functions or are they simply byproducts of megakaryocyte transcription and platelet mrna degradation? compelling evidence against a rather random role of circrnas in platelets comes from the finding that they are differentially sorted into platelet-derived microvesicles via a specific, yet unknown mechanism [102] . the selective release of circrnas might represent a novel way of transferring information as vesicles from platelets to recipient cells, e.g., endothelial cells thereby inducing inflammatory responses. moreover, the same study could show that platelet circrnas associated with protein complexes of distinct sizes, so called circrnps, as demonstrated for six highly abundant circrnas, including a platelet-specific circrna, namely, plt-circr4 [102] . however, the circrna-specific binding partners and the functional significance of these interactions remain to be clarified. 
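the proportions quoted in this passage can be reproduced with a few lines of python. the sketch below is a hedged illustration: the definition of "contribution of circrna exons to total transcription" as a simple read fraction is an assumption made here, not necessarily the metric used in [50], and the read counts in the usage lines are invented.

```python
def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


def circ_exon_contribution(circ_exon_reads: int, other_exon_reads: int) -> float:
    """Percentage of a gene's reads that map to circRNA-producing exons
    (assumed definition, for illustration only)."""
    return share(circ_exon_reads, circ_exon_reads + other_exon_reads)


# Genes significantly enriched for circRNAs among all circRNA-producing
# genes analyzed (counts quoted in the text).
enriched_pct = share(3162, 8041)

if __name__ == "__main__":
    print(f"{enriched_pct:.1f}% of circRNA-producing genes were enriched")
    # Invented read counts for two hypothetical genes:
    print(circ_exon_contribution(80, 20))  # mixed linear and circular signal
    print(circ_exon_contribution(50, 0))   # only circRNA-exon reads detected
```

under this definition, a gene for which only reads from circrna-producing exons are detected has a contribution of 100%, matching the situation described for 15 genes.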
in summary, circrnas are highly enriched in platelets, associate with proteins, and can be released from platelets. however, not much is known about their regulatory roles in or outside platelets. in addition, their clinical implications remain unclear, and it would be interesting to know if the abundance and/or composition of platelet circrnas varies depending on the age and health status of individual subjects. a fascinating aspect of platelets is the appearance of viral rna in platelets and their ability to aid in viral replication and the spread of virus in infected individuals [103, 104]. although both dna- and rna-containing viruses associate with platelets, only rna viruses are able to replicate, as platelets are incapable of dna transcription due to their anucleate nature. a study on platelets isolated from human immunodeficiency virus (hiv)-infected patients on antiretroviral drug therapy (art) revealed the occurrence of cytosolic hiv rna and intact viral particles [105]. notably, replication-competent viruses were only observed within closed vacuoles and not on the platelet surface. during the course of platelet elimination by tissue macrophages, hiv was transmitted from platelets to macrophages via phagocytosis. this process was prevented by blocking platelet-macrophage interactions with the platelet-specific anti-integrin αiibβ3 antibody abciximab, underpinning the fact that viral spread is supported by intercellular interactions. most interestingly, a low cd4+ t-cell count was predictive for the presence of hiv-containing platelets in patients receiving art, which correlated with immunological failure in these individuals [105]. in contrast, a recent study found that the number of hiv+ platelets correlated not with cd4+ t-cell count but with viral load, and that the number of hiv+ platelets was significantly reduced after art. 
this study additionally confirmed the ability of platelets to promote hiv viral spread, which was dependent on platelet activation and subsequent platelet-cd4+ t-cell complex formation [106]. importantly, platelet-virus interactions are not unique to hiv; many other viruses have been shown to infect platelets. platelets have been shown to be permissive to dengue virus entry, further enabling translation of viral rna, replication of the viral genome, and assembly of infectious virus. the same study further identified heparan sulfate proteoglycan (hsp) and dendritic cell-specific intercellular adhesion molecule-3-grabbing nonintegrin (dc-sign) as the major receptors for virus-platelet interactions [107]. other viruses that have been shown to interact with platelets, mostly via surface integrins, and to enter them include different adenoviruses, influenza virus and encephalomyocarditis virus (reviewed in [103, 104]). starting in december 2019, severe acute respiratory syndrome coronavirus 2 (sars-cov-2) infections became a global problem with unforeseen consequences. some coronavirus disease 2019 (covid-19) patients show hypercoagulation and thrombosis, which contribute to organ failure and worsened outcomes [108]. these clinical observations led to the hypothesis that platelets may play a role in covid-19 progression. protein or rna expression of the main receptor for sars-cov-2 binding, angiotensin-converting enzyme 2 (ace2), was not detected in platelets. nevertheless, mrna from the sars-cov-2 n1 gene was found in platelets of some covid-19 patients, suggesting an ace2-independent entry of the virus into platelets [109, 110]. in addition, platelets might be attracted by the loss of endothelial integrity caused by sars-cov-2 entry into endothelial cells, which ultimately results in platelet activation and degranulation, thereby leading to a cytokine storm. 
thus, platelets might not only facilitate the dissemination of sars-cov-2 in infected individuals but might also contribute to hazardous thromboinflammation by the release of pro-inflammatory cytokines. moreover, megakaryocytes from the lungs might be susceptible to sars-cov-2, thereby generating sars-cov-2 rna-containing platelets with an altered transcriptome. rna-seq analysis of platelets isolated from severely ill covid-19 patients compared to healthy donors displayed significant differences in rna expression patterns associated with changes in platelet activation and aggregation [110] . whether platelets constitute a target for covid-19 therapy to alleviate the course of disease in severe cases will be an interesting topic for future research. as described above, platelets contain a diverse collection of coding and non-coding as well as linear and circular rnas, posing the question of the origins of these transcripts and the underlying molecular mechanisms, if any, that are responsible for the sorting of these rnas into platelets. intuitively, mks seem to be the prime source of platelet rnas and, indeed, mks transcribe a wide set of mrnas, which they pass on to their progeny in a highly regulated manner [111, 112] . as proven by transcriptome analysis, platelets contain thousands of mk-derived mrnas [45, [113] [114] [115] [116] [117] . careful analysis of rna expression data on matrix metalloproteinase (mmp) and tissue inhibitor of metalloproteinase (timp) mrnas uncovered different proportions of expression profiles in platelets compared to their abundancy in mks [118] . excluding mrna degradation or instability, cecchetti et al. hypothesized that mks differentially sort mrnas into platelets. this hypothesis is supported by the observation that mks expressed 10 out of 24 mmps and three out of four timps, yet several of these mrnas were missing or barely detectable in platelets, including transcripts encoding mmps 2, 9, 14, and 17 and timp-3. 
moreover, the abundance of a transcript in mks did not uniformly predict its expression level in platelets. for example, timp-1 mrna is expressed at high levels in both mks and platelets while timp-3 mrna is abundantly expressed in mks only, suggesting a regulated transfer of individual mrnas [118]. the exact sorting mechanisms remain elusive, but it was suggested that mks, similar to neurons, leverage rbps that bind to certain nascent transcripts in the nucleus, thereby forming a complex that translocates to the cytoplasm. upon association with a kinesin motor, this complex is transported along microtubules through the proplatelet and is ultimately delivered to the nascent platelet [119]. in fact, cecchetti et al. detected the mrnas of a selected group of rbps, which were previously shown to be involved in rna localization, both in mks and platelets [118]. however, the expression and localization of the corresponding proteins await further investigation. in addition to mks, platelets may also acquire rna species by either direct or indirect interaction with surrounding cells in the circulation or in tissue microenvironments. in glioblastoma patients, platelets sequestered tumor-specific egfr variant iii (egfrviii) mrna, which was also detected in corresponding plasma samples [94]. in platelets from nsclc patients, echinoderm microtubule associated protein like 4-anaplastic lymphoma kinase (eml4-alk) rearrangements were detected at the transcript level. the transfer of this rna was likely mediated via exosome shedding of cancer cells, supporting cross talk between platelets and tumor cells. most interestingly, serial monitoring of eml4-alk rearrangements in platelets showed a loss of detectable eml4-alk rearrangements after clinical response to crizotinib treatment [120]. 
another intriguing, although likely indirect, form of transfer of genetic information from cells to platelets might be emperipolesis, a process in which intact cells are present within the cytoplasm of another cell. histological examination of bone marrow routinely identifies mks that have engulfed neutrophils or other hematopoietic cells. hence, emperipolesis is a physiological process that can be increased in pathological conditions, e.g., in gray platelet syndrome or essential thrombocythemia [121, 122]. recent studies have shown that emperipolesis of intact neutrophils in mks is mediated in part through the β2-integrin/intercellular adhesion molecule 1 (icam-1)/ezrin pathway. neutrophil membranes merged with the demarcation membrane system (dms) and thereby transferred membrane material to daughter platelets, which accelerated platelet production [123]. intriguingly, some circulating platelets thus represent hybrids carrying both mk and neutrophil components. however, the mechanisms and functions of this conserved cellular phenomenon are largely unknown, and whether emperipolesis is relevant for the transfer of rna into platelets should be investigated in the future. megakaryopoiesis, megakaryocyte maturation, and platelet formation are highly complex processes that are regulated on multiple levels, including epigenetic, transcriptional and post-transcriptional gene expression control mechanisms. indeed, several mirnas, which fine-tune gene expression at the post-transcriptional level, were shown to affect mk differentiation and platelet formation. for example, an inhibitory role for mir-155 during megakaryopoiesis was described, which is in line with the observed downregulation of this mirna during the differentiation of cd34+ hematopoietic stem cells along the megakaryocyte lineage [124]. in contrast, mir-150 was shown to support normal megakaryocyte differentiation by targeting the transcription factor myb [125, 126]. 
additionally, mir-130 targets the maf bzip transcription factor b (mafb), which increases with terminal mk differentiation and induces activation of the integrin alpha 2b (itga2b) gene [127]. very recently, mir-22 has been shown to promote megakaryocyte differentiation through repression of its target growth factor independent 1 transcriptional repressor (gfi1) [128]. ex vivo, megakaryocyte differentiation from hematopoietic stem/progenitor cells (hspcs) and generation of platelets were also shown to be regulated by mirnas. specifically, mir-125b and mir-660 increased the percentage of megakaryocytes with high ploidy and upregulated platelet biogenesis compared to controls. in contrast, overexpression of the mir-23a/27a/24-2 cluster in hspcs reduced colony formation associated with diminished megakaryocyte maturation and platelet production [129]. the importance of mirnas in the regulation of development and function of mks and platelets also becomes apparent upon mk-specific deletion of the ribonuclease dicer1, which is required for mirna biogenesis. deletion of dicer1 in an in vivo mouse model led to reduced levels of mirnas in platelets and hence to an altered platelet mrna expression profile. the resulting phenotype was characterized by enhanced platelet reactivity and hemostatic function, presumably due to the upregulation of the fibrinogen receptor subunits integrin αiib and β3 on the platelet surface in the absence of a proper mirna machinery [130]. furthermore, a global approach to identify mirnas and their respective targets in platelets detected protein kinase camp-dependent type ii regulatory subunit beta (prkar2b) as a mir-200b target that functionally controlled platelet activation; altered levels of prkar2b led to impaired platelet reactivity [131]. moreover, hyperreactive platelets were found to have elevated levels of vesicle-associated membrane protein 8 (vamp8), a critical protein for granule release, which is a target of mir-96. 
overexpression of mir-96 restored normal vamp8 protein levels and thus confirmed the mir-96-vamp8 interaction [132]. importantly, alterations of mirna expression patterns have been linked to malignancies related to the megakaryocytic lineage, including essential thrombocythemia (et) [133]. in detail, it was shown that the expression of mir-9 and mir-490 was significantly increased in platelets of et patients, while others, including mir-409 and mir-126, were downregulated. mirna expression was correlated with platelet aggregation and expression of platelet surface receptors including cd63 and p-selectin, suggesting that mirnas may play a role in et, platelet maturation and function [133]. however, controversy exists as to what extent platelet-specific mirnas are important for platelet biology itself. some of the skepticism comes from genetic knockout studies in mice in which mir-223, a highly abundant, well-studied mirna in human platelets [134], was dispensable for platelet production, both in terms of number and size, as well as for platelet functionality, including overall life-span and surface expression of platelet adhesion receptors important for platelet activation like the adenosine diphosphate (adp) receptor p2y12 [135]. however, the lack of an overt phenotype might be due to compensatory mechanisms as well as redundant targeting and functional rescue by other mirnas. moreover, lack of regulation of well-known mir-223 target transcripts might be explained by the lack of binding sites in the 3′-utr of murine as compared to human genes, as evidenced in the case of p2y12 [136]. next to mirnas, lncrnas have also been implicated in the control of megakaryopoiesis. in general, lncrnas can regulate several cellular processes, including differentiation, through a broad range of epigenetic, transcriptional and post-transcriptional mechanisms [90, 137]. 
for example, the expression of the human rna binding motif protein 15 (rbm15) was shown to be positively regulated by its antisense lncrna, as-rbm15, during megakaryopoiesis [138]. importantly, the rna-binding protein rbm15 is known for its role in promoting terminal megakaryocyte differentiation through its impact on the alternative splicing of key transcription factors [139]. intriguingly, rbm15 protein synthesis is itself subject to a complex regulatory mechanism involving as-rbm15 and the transcription factor runx1 [138]. specifically, as-rbm15, which is transcribed in the opposite direction within exon 1 of rbm15, was shown to enhance 5′ cap-dependent translation of rbm15. this might be achieved via putative complementary binding of exon 1 of as-rbm15 deep within the 5′-utr of rbm15 and its incorporation into the rbm15 mrna-containing polysomes. transcription of both as-rbm15 and rbm15 was activated by the transcription factor runx1 and repressed by the leukemic fusion protein runx1-eto, suggesting that as-rbm15 is not only important for megakaryocyte differentiation, but might also be implicated in leukemogenesis [138]. additional, although indirect, evidence for a functional role of lncrnas in mks is provided by a transcriptome analysis, which identified 1109 polyadenylated lncrnas expressed in erythroblasts, megakaryocytes, and megakaryocyte-erythroid precursors of mice, and 594 lncrnas expressed in human erythroblasts [140]. importantly, it was recently shown that megakaryocyte-enriched protein-coding genes, but not erythroid ones, are "primed" in hspcs, and that these genes are occupied by a group of seven transcription factors (tfs) that facilitate low-level expression in hspcs [141]. lineage-specific alterations in tf occupancy subsequently led to an activation of these genes during megakaryopoiesis and repression during erythropoiesis. 
intriguingly, similar observations could be made for lncrnas, and the model that some megakaryocyte-enriched genes are specifically primed in hspcs could be extended to lncrna loci, emphasizing common regulatory mechanisms with coding genes [140] . however, the relevance of these lncrnas for mk differentiation and function besides their regulation remains unclear and requires further investigation. in contrast, more direct evidence for the relevance of lncrnas for mk and platelet function was recently presented in the context of type 2 diabetes (t2d). platelet hyperaggregation and hypercoagulation are often observed in t2d patients, leading to an increased thrombogenic risk [142] . however, t2d patients frequently fail to respond to adp receptor blockers commonly used as antiplatelet therapy, such as clopidogrel and cangrelor, thus leaving residual platelet reactivity [143] . moreover, high activity of the adp receptor p2y12 is found in t2d patients, exposing such patients to a prothrombotic condition [144, 145] . in this context, the non-coding metallothionein 1 pseudogene 3 (mt1p3) was found to be significantly upregulated in mks from t2d patients compared to healthy controls and, most importantly, had an impact on platelet activation and expression of p2y12 by sponging mir-126. of note, depletion of mt1p3 by small interfering rna (sirna) reduced the expression of the adp receptor, thereby inhibiting platelet activation and aggregation in a murine diabetes model [146] . in summary, small and long non-coding rnas are crucial regulators of megakaryocyte and platelet biology, and changes in ncrna expression are associated with alterations in megakaryopoiesis, platelet biogenesis and platelet function. however, our current understanding of the underlying molecular mechanisms is still very limited and much more research, especially on the role of lncrnas, is needed.
post-transcriptional control mechanisms are crucial for platelets to dynamically alter their proteome, phenotype and function (summarized in figure 2). newly formed, anucleate platelets seem to have a higher biosynthetic potential than older ones, indicating post-transcriptional regulatory circuits acting during the life cycle of platelets [147] . post-transcriptional gene expression control is mainly achieved through mirnas via binding to the 3′-utr of their target transcript, thereby regulating rna degradation and protein translation as mentioned above. in addition to mirnas, a second major class of post-transcriptional regulators consists of rbps. rbps recognize specific sequence or structure elements often present in the utrs of their target rnas. interestingly, platelet transcripts were found to have, on average, much longer 3′-utrs than those of nucleated cells (1047 nt versus 492 nt, respectively), indicating an elongated binding region for mirnas and rbps, thereby allowing potentially more complex regulation of translation and rna stability [44] . intriguingly, the key regulators of eukaryotic protein synthesis, i.e., eukaryotic translation initiation factor (eif) 4e and eif2α, are highly abundant in platelets [148] . furthermore, key ribosomal components, including ribosomal protein s6 as well as 18s and 28s ribosomal rnas (rrnas), are present in platelets [112, 115] . however, translation in resting platelets is generally prevented or reduced via binding of eif4e to the inhibitory protein 4e-binding protein 1 (4e-bp1), which prevents the assembly of the eif4f initiation complex. activation of the mechanistic target of rapamycin (mtor) signaling pathway leads to phosphorylation of 4e-bp1 and to its dissociation from eif4e, thereby enabling formation of the initiation complex and subsequent translation in platelets [114, 149] . of note, pharmacological inhibition of mtor via rapamycin has conflicting effects on platelet activity.
while collagen-induced platelet aggregation was significantly reduced [150] , platelet adhesion to endothelial cells was facilitated by rapamycin through the remodeling of the endothelial membrane [151] . furthermore, a study that investigated the long-term effects of mtor inhibitors in patients who received kidney transplants observed altered platelet functions. the side effects included reduced granule secretion and impaired platelet aggregation, because of biphasic time-dependent alterations in calcium homeostasis and function in platelets [152] . these findings can be, at least partially, explained by the observation that inhibition of mtor activity using rapamycin reduced translation of certain, but not all, transcripts, indicating specificity of mtor signaling-dependent protein synthesis [153] . importantly, these examples underscore the relevance of translational control in platelets. this is further supported by the observation that some proteins are constitutively translated in platelets, whereas the translation of others is regulated in a signal-dependent manner, as has been shown for bcl3 transcription coactivator (bcl-3) and il-1β [114, 115, 149, 154] . mechanistically, platelet integrins have been proposed to transmit outside-in signaling cascades, thereby modulating signal-dependent translation in platelets. for example, while quiescent platelets harbor mrna for the precursor il-1β cytokine (pre-il-1β) at the site of polysomes, its translation is induced upon β3 integrin signaling [115] . moreover, integrin signaling can lead to a redistribution of eif4e to mrna-rich areas in aggregated platelets, thereby initiating protein translation upon platelet activation [114] . similarly, translation of bcl-3 mrna is constitutively repressed in resting platelets.
however, upon thrombin receptor-mediated activation of platelets, bcl-3 protein is synthesized within minutes and the translation may be modulated via engagement of integrin αiibβ3 [155] . rapamycin was shown to abrogate bcl-3 translation, indicating a regulatory role of mtor in signal-dependent protein translation in platelets. importantly, signal-dependent translation in platelets is not exclusively regulated via integrins. early studies investigated the impact of highly polyunsaturated fatty acids of the n-3 family on the platelet antioxidant status and observed elevated enzyme activity of glutathione-dependent peroxidase (gpx), which counteracts oxidative stress. the enhanced enzyme activity could be explained by increased protein translation, since concomitant treatment with the protein synthesis inhibitor cycloheximide abolished the additional enzyme activity [156] .
figure 2. receptor-mediated signaling may lead to splicing and translational events that are silenced in resting platelets. lipopolysaccharide (lps) binding to toll-like receptor 4 (tlr4) and integrin signaling induce the proteolytic processing of pre-il-1β to mature interleukin (il)-1β, resulting in pro-inflammatory signaling. integrin- and thrombin receptor-mediated platelet activation induces reorganization of the translation initiation complex and translation of certain mrnas, including serpin family e member 1 (serpine1) and bcl3 transcription coactivator (bcl-3), partially in a mechanistic target of rapamycin (mtor)-dependent manner. posttranscriptional gene expression control is also guided by mirna binding to the 3′-untranslated region (utr) of target genes, regulating rna degradation or translation. an additional layer of posttranscriptional gene expression control is represented by rbps responsible for megakaryocyte (mk) maturation, platelet biogenesis, platelet responsiveness and platelet interaction with immune cells as well as rna transport from mks to proplatelets. lastly, posttranscriptional regulation is mediated by l1 reverse transcriptase activity, which induces formation of rna-dna hybrids, thereby regulating global protein synthesis.
besides these examples of general mechanisms that contribute to translational control in platelets, more specific regulation of individual mrnas by certain rbps has also been identified.
of note, more than 1500 human proteins have been annotated as rbps based on experimental evidence and the presence of canonical rna-binding domains (rbds) [157] . rbps typically use rbds, such as the rna recognition motif (rrm), hnrnp k homology domain (kh) or dead box helicase (ddx) domain, in order to bind to their target rna and subsequently form ribonucleoprotein (rnp) complexes. however, additional rbps have been identified lacking these characteristic rbds [158] . importantly, rbps are crucial post-transcriptional regulators that are able to execute a plethora of control mechanisms, including regulation of rna translation, splicing, transport, decay and editing [157] . hence, rbps could potentially contribute to multiple aspects of mk differentiation, platelet formation as well as platelet function. in line with this, a recent study revealed that the rbp ataxin2 (atxn2) modulates the mk transcriptome and proteome and affects expression of platelet surface proteins [159] . atxn2 has been implicated in regulating global mrna stability and translation and directly binds to over 4000 transcripts as evaluated by photoactivatable-ribonucleoside-enhanced crosslinking and immunoprecipitation (par-clip) [160, 161] . in human megakaryoblasts, atxn2 was shown to associate with the poly(a)-binding protein (pabp) and the rna helicase ddx6, and knockdown of atxn2 in meg-01 cells yielded 454 differentially expressed rnas and 20 differentially expressed proteins [159] . however, the authors of this study did not identify direct rna targets of atxn2 in megakaryoblasts. intriguingly, atxn2 knockout mice exhibited a decrease in type iv megakaryoid cells, and αiibβ3 integrin-mediated platelet aggregation was impaired upon stimulation with phorbol myristate acetate (pma) or with aggretin a. 
this could be due to deregulated total expression of itgb3 and aberrant surface receptor expression of cd31 in conjunction with c6orf25, coagulation factor ii thrombin receptor (f2r) and vav1 dysregulation in atxn2 knockout mice [159] . in addition to atxn2, other rbps have been implicated in megakaryopoiesis and mk maturation. as mentioned earlier, rbm15 was shown to play a role in megakaryocytic differentiation of hematopoietic stem cells, acting in concert with protein arginine methyltransferase 1 (prmt1), which methylates rbm15 and thereby regulates its stability via the e3 ubiquitin ligase cnot4 [139, 162] . rbm15 was shown to bind to intronic sequences and interact with the splicing factor 3b subunit 1 (sf3b1), thereby regulating alternative splicing of genes that are important for megakaryopoiesis such as gata1, runx1, t-cell acute lymphocytic leukemia 1 (tal1) and c-mpl [139] . another rbp that may influence megakaryocyte differentiation is insulin-like growth factor 2 mrna-binding protein 1 (igf2bp1). igf2bp1 has been found to target the ets variant transcription factor 6 (etv6)/runx1 fusion transcript and potentially regulates its stability in acute lymphoblastic leukemia (all) [163] . however, it is currently unknown whether igf2bp1 modulates runx1 expression in hscs or megakaryocytes. nevertheless, another member of the igf2bp family was shown to regulate the human fetal-adult megakaryocyte transition [164] . in detail, the oncofetal igf2bp3 is expressed at significantly higher levels in neonatal hematopoietic cells, including neonatal megakaryocytes, but is completely absent in their adult counterparts. high expression of igf2bp3 restricted megakaryocyte morphogenesis and polyploidy by inhibiting the positive transcription elongation factor b (p-tefb) kinase complex through binding and stabilization of the 7sk snrna.
consequently, high igf2bp3 expression in neonatal megakaryocytes affected platelet function, leading to hyporesponsiveness in full-term and premature neonates, which could contribute to the thrombocytopenia and bleeding commonly observed in premature neonates [164] . another example of an oncofetal rbp with a role in regulating mk and platelet function is lin28b, which is highly expressed in fetal, but not in adult, megakaryocytes. high lin28b expression in fetal megakaryocytes negatively regulated surface p-selectin expression in fetal platelets, which influenced their interaction with neutrophils in vitro and in vivo [165] . hence, differential rbp expression in developing mks as well as old and young platelets may play a crucial role in regulating platelet function. in addition, post-transcriptional regulation by rbps might also be vital in regulating platelet production and release from mks, which includes cytoskeletal rearrangements. myosin heavy chain 9 (myh9), a major non-muscle myosin expressed in mks and platelets, associates with the actin cytoskeleton and enables morphogenesis. mutations in or knockout of myh9 are linked to a group of platelet disorders leading to macro-thrombocytopenia, prolonged bleeding and clot retraction deficiency [166] . in erythropoiesis, myh9 is crucial for erythroid cell enucleation. this process was shown to depend on the activity of heterogeneous nuclear ribonucleoprotein k (hnrnp k), an rbp that inhibits translation of myh9 mrna [167] . it would be interesting to evaluate the effects of hnrnp k on myh9-dependent megakaryocyte maturation and platelet biogenesis, which have not been investigated so far. in addition to regulating stability and translation of mrnas, rbps can also be involved in the localization and transport of their target rnas from megakaryocytes to platelets, which is suggested to occur in a controlled rather than random manner as mentioned above [118] .
while the exact sorting mechanisms remain elusive, mks and platelets were shown to contain mrnas encoding rbps that were previously implicated in mrna transport processes, including cancer susceptibility candidate gene 3 (casc3), staufen double-stranded rna binding protein 1 and 2 (stau1 and stau2) as well as eif4a3 [118, 168] . hence, these rbps might be expressed in mks and platelets and may play a role in the regulated transport of coding and non-coding transcripts into platelets [169] . rbps have also been detected within platelets themselves, where they modulate transcript stability and translation efficiency. among the identified rbps are t-cell internal antigen-1 (tia-1), tia-1-related protein (tia-r), and elav like rna binding protein 1 (elavl1, hur) [112] . tia-1 has been implicated in alternative splicing regulation of various pre-mrnas and was shown to suppress translation [170] . upon platelet activation by thrombin, tia-1 dissociates from the serpin family e member 1 (serpine1) mrna, thereby lifting repressive effects and allowing translation and de novo synthesis of serpine1 protein, also known as plasminogen activator inhibitor 1 (pai-1). similar effects were found for ago2: upon thrombin stimulation of platelets, ago2 protein dissociated from serpine1 mrna, whereas ago2/mirna complexes and associations of ago2 with other mrnas known to be translated upon platelet activation, including prostaglandin-endoperoxide synthase 1 (ptgs1) and itgb3 mrnas, were not affected. on the other hand, association of serpine1 mrna with the rbp hur did not change upon platelet activation, which might indicate that hur acts as a stabilizing factor rather than a translational regulator under these conditions [171] .
overall, these findings imply some degree of specificity in the regulation of individual platelet mrnas during platelet stimulation, which is likely to differ following platelet activation by platelet-activating factors other than thrombin. finally, it is worth mentioning that protein synthesis in mks and platelets could be regulated not only by mirnas and rbps, but also by other post-transcriptional regulators. indeed, schwertz et al. demonstrated that translational events in platelets could be controlled by endogenous long interspersed nuclear element-1 (line-1) reverse transcriptase activity (ert) [172] . intriguingly, inhibition of ert in vitro in isolated platelets from healthy individuals or in people with hiv treated with rt inhibitors enhanced global protein synthesis. moreover, platelet activation was induced, promoting a pro-thrombotic functional response, which was likely mediated by the generation of rna-dna hybrids. these results present a novel, previously unrecognized translational regulatory mechanism that could have clinical implications for hiv patients, whose increased risk of thrombosis might be related to rt inhibitor-based antiretroviral therapies [173] . in summary, post-transcriptional regulatory pathways are crucial for rapid adaptations to changing environmental conditions upon platelet activation. rbps expressed in megakaryocytes and platelets are critical for megakaryocyte differentiation, maturation, and platelet biogenesis. moreover, rbps might be important transport factors mediating the sorting of rnas from megakaryocytes into platelets and thereby influence signaling pathways in circulating platelets. however, several of the underlying mechanistic details are not yet well understood, which warrants further in-depth investigation. platelets contain a plethora of rnas that are inherited from megakaryocytes or taken up from interacting cells or microorganisms.
the functions of these transcripts, whether they are crucial for platelet function per se independent of their role in megakaryocytes and the mechanisms that control their sorting and uptake as well as their constitutive or signal-dependent translation and decay, are just beginning to be resolved. studying the contribution of rbps to these processes will allow us to gain a deeper understanding of the complex networks underlying megakaryopoiesis, platelet biogenesis and function. we apologize to all scientists whose important work could not be cited in this review due to space constraints. the figures were generated with biorender.com. the authors declare no conflict of interest. programmed anuclear cell death delimits platelet life span determination of the life of human blood platelets using labelled diisopropylfluorophosphanate platelets at the interface of thrombosis, inflammation, and cancer platelets amplify inflammation in arthritis via collagen-dependent microparticle production platelets and diabetes mellitus platelets in pulmonary vascular physiology and pathology alzheimer disease and platelets: hows that relevant platelets and cardiovascular disease felding-habermann, b. contribution of platelets to tumour metastasis the platelet lifeline to cancer: challenges and opportunities the incredible journey: from megakaryocyte development to platelet formation identification and cloning of a megakaryocyte growth and development factor that is a ligand for the cytokine receptor mpl promotion of megakaryocyte progenitor expansion and differentiation by the c-mpl ligand thrombopoietin vainchenker, w. 
cmpl ligand is a humoral regulator of megakaryocytopoiesis deficiencies in progenitor cells of multiple hematopoietic lineages and defective megakaryocytopoiesis in mice lacking the thrombopoietic receptor c-mpl physiological regulation of early and late stages of megakaryocytopoiesis by thrombopoietin thrombocytopenia in c-mpl-deficient mice normal platelets and megakaryocytes are produced in vivo in the absence of thrombopoietin low levels of erythroid and myeloid progenitors in thrombopoietin-and c-mpl-deficient mice the biogenesis of platelets from megakaryocyte proplatelets polyploidy: occurrence in nature, mechanisms, and significance for the megakaryocyte-platelet system megakaryocyte endomitosis is a failure of late cytokinesis related to defects in the contractile ring and rho/rock signaling ubiquitin-dependent degradation of cyclin b is accelerated in polyploid megakaryocytes the cell cycle in polyploid megakaryocytes is associated with reduced activity of cyclin b1-dependent cdc2 kinase biogenesis of the demarcation membrane system (dms) in megakaryocytes aspects of platelet formation and release anucleate platelets generate progeny the lung is a site of platelet biogenesis and a reservoir for haematopoietic progenitors cytoskeletal mechanics of proplatelet maturation and platelet release lineage-specific hematopoietic growth factors in vitro generation of platelets: where do we stand? 
in vitro-derived platelets: the challenges we will have to face to assess quality and safety on the way to in vitro platelet production protein synthesis by platelets: historical and new perspectives new fundamentals in hemostasis platelets and the immune continuum the life cycle of platelet granules coated pits and vesicles transfer plasma components to platelet granules platelet dense granules begin to selectively accumulate mepacrine during proplatelet formation angiogenesis is regulated by a novel mechanism: pro-and antiangiogenic proteins are organized into separate platelet alpha granules and differentially released evidence that differential packaging of the major platelet granule proteins von willebrand factor and fibrinogen can support their differential release escaping the nuclear confines: signal-dependent pre-mrna splicing in anucleate platelets analysis of sage data in human platelets: features of the transcriptome in an anucleate cell transcript profiling of human platelets using microarray and serial analysis of gene expression message in the platelet"-more than just vestigial mrna! 
platelets genome-wide rna-seq analysis of human and mouse platelet transcriptomes a tour through the transcriptional landscape of platelets existence of a microrna pathway in anucleate platelets circular rna enrichment in platelets is a signature of transcriptome degradation time-dependent decay of mrna and ribosomal rna during platelet aging and its correlation with translation activity the complex transcriptional landscape of the anucleate human platelet integration of proteomics and genomics in platelets: a profile of platelet proteins and platelet-specific genes next generation sequencing analysis of human platelet polya+ mrnas and rrna-depleted total rna mapping and quantifying mammalian transcriptomes by rna-seq uncovering the complexity of transcriptomes with rna-seq platelet factor xi: intracellular localization and mrna splicing following platelet activation signal-dependent splicing of tissue factor pre-mrna modulates the thrombogenicity of human platelets lipopolysaccharide signaling without a nucleus: kinase cascades stimulate platelet shedding of proinflammatory il-1beta-rich microparticles lipopolysaccharide is a direct agonist for platelet rna splicing rigoutsos, i. 
the human platelet: strong transcriptome correlations among individuals associate weakly with the platelet proteome coordinate expression of transcripts and proteins in platelets rna sequencing and swarm intelligence-enhanced classification algorithm development for blood-based disease diagnostics using spliced blood platelet rna rna-seq of tumor-educated platelets enables blood-based pan-cancer, multiclass, and molecular pathway cancer diagnostics platelet rna as a circulating biomarker trove for cancer diagnostics tumor-educated platelets as a noninvasive biomarker source for cancer detection and progression monitoring cancer genetics: rna-seq for blood-based pan-cancer diagnostics micrornas: genomics, biogenesis, mechanism, and function micrornas in cancer: from developmental genes in worms to their clinical application in patients microrna fingerprints during human megakaryocytopoiesis the repertoire and features of human platelet micrornas the clinical significance of platelet microparticle-associated micrornas the nuclear rnase iii drosha initiates microrna processing processing of primary micrornas by the microprocessor complex the microprocessor complex mediates the genesis of micrornas role for a bidentate ribonuclease in the initiation step of rna interference many roads to maturity: microrna biogenesis pathways and their regulation activated platelets can deliver mrna regulatory ago2*microrna complexes to endothelial cells via microparticles platelet-secreted microrna-223 promotes endothelial cell apoptosis induced by advanced glycation end products via targeting the insulin-like growth factor 1 receptor platelets activated during myocardial infarction release functional mirna, which can be taken up by endothelial cells and regulate icam1 expression platelet microparticle delivered microrna-let-7a promotes the angiogenic switch microrna-223 delivered by platelet-derived microvesicles promotes lung cancer cell invasion via targeting tumor suppressor 
epb41l3 circulating micrornas as novel biomarkers for platelet activation plasma microrna profiling reveals loss of endothelial mir-126 and other micrornas in type 2 diabetes prospective study on circulating micrornas and risk of myocardial infarction micrornas as promising biomarkers of platelet activity in antiplatelet therapy monitoring long non-coding rnas: insights into functions unique features of long non-coding rna biogenesis and function challenges and strategies in ascribing functions to long noncoding rnas. cancers (basel) 2020, 12, 1458 the hallmarks of cancer: a long non-coding rna point of view linc00261 is differentially expressed in pancreatic cancer subtypes and regulates a pro-epithelial cell identity. cancers (basel) 2020, 12, 1227 the noncoding rna malat1 is a critical regulator of the metastasis phenotype of lung cancer cells the role of long non-coding rnas in the pathogenesis of hereditary diseases lncrnas and egfrviii sequestered in teps enable blood-based nsclc diagnosis downregulation of the long non-coding rna zfas1 is associated with cell proliferation, migration and invasion in breast cancer magi2-as3 inhibits breast cancer by downregulating dna methylation of magi2 long non-coding rna (lncrna) magi2-as3 inhibits breast cancer cell growth by targeting the fas/fasl signalling pathway zfas1: a novel tumor-related long non-coding rna the biogenesis, biology and characterization of circular rnas mis-splicing yields circular rna molecules cell-type specific features of circular rna expression selective release of circrnas in platelet-derived extracellular vesicles human platelets and their capacity of binding viruses: meaning and challenges? 
platelets in immune response to virus and immunopathology of viral infections platelets from hiv-infected individuals on antiretroviral drug therapy with poor cd4(+) t cell recovery can harbor replication-competent hiv despite viral suppression platelets function as an acute viral reservoir during hiv-1 infection by harboring virus and t-cell complex formation dengue virus binding and replication by platelets abnormal coagulation parameters are associated with poor prognosis in patients with novel coronavirus pneumonia platelets can contain sars-cov-2 rna and are hyperactivated in covid-19 genomics and transcriptomics of megakaryocytes and platelets: implications for health and disease change in protein phenotype without a nucleus: translational control in platelets evaluating the relevance of the platelet transcriptome integrins regulate the intracellular distribution of eukaryotic initiation factor 4e in platelets. a checkpoint for translational control activated platelets mediate inflammatory signaling by regulated interleukin 1beta synthesis gene expression profile of primary human cd34+cd38lo cells differentiating along the megakaryocyte lineage comparison of oligonucleotide-microarray and serial analysis of gene expression (sage) in transcript profiling analysis of megakaryocytes derived from cd34+ cells megakaryocytes differentially sort mrnas for matrix metalloproteinases and their inhibitors into platelets: a mechanism for regulating synthetic events platelets get the message rearranged eml4-alk fusion transcripts sequester in circulating blood platelets and enable blood-based crizotinib response monitoring in non-small-cell lung cancer the frequency and significance of megakaryocytic emperipolesis in myeloproliferative and reactive states striking emperipolesis in megakaryocytes of gray platelet syndrome megakaryocyte emperipolesis mediates membrane transfer from intracytoplasmic neutrophils to platelets cd34+ hematopoietic stem-progenitor cell microrna 
key: cord-266617-z8uecyl6 authors: pavesi, angelo title: asymmetric evolution in viral overlapping genes is a source of selective protein adaptation date: 2019-04-03 journal: virology doi: 
10.1016/j.virol.2019.03.017 sha: doc_id: 266617 cord_uid: z8uecyl6 overlapping genes represent an intriguing puzzle, as they encode two proteins whose ability to evolve is constrained by each other. overlapping genes can undergo “symmetric evolution” (similar selection pressures on the two proteins) or “asymmetric evolution” (significantly different selection pressures on the two proteins). by sequence analysis of 75 pairs of homologous viral overlapping genes, i evaluated their accordance with one or the other model. analysis of nucleotide and amino acid sequences revealed that half of the overlaps undergo asymmetric evolution, as the protein from one frame shows a number of substitutions significantly higher than that of the protein from the other frame. interestingly, the most variable protein (often known to interact with the host proteins) appeared to be encoded by the de novo frame in all cases examined. these findings suggest that overlapping genes, besides increasing the coding ability of viruses, are also a source of selective protein adaptation. many viruses produce novel genes inside pre-existing genes by overprinting of a de novo frame onto an ancestral frame (atkins et al., 1979; keese and gibbs, 1992; rancurel et al., 2009; sabath et al., 2012). the high prevalence of overlapping genes in viruses has been attributed to the advantage of maximizing the gene information content of small viral genomes (miyata and yasunaga, 1978; lamb and horvath, 1991; pavesi et al., 1997). in detail, the gene-compression hypothesis states that the size of the viral capsid imposes a biophysical limit on the size of the viral genome, thus making overprinting the most adequate strategy to gain new function (chirico et al., 2010). alternatively, the gene novelty hypothesis argues that the birth of overlapping genes is driven by selection pressures favoring evolutionary innovation (brandes and linial, 2016). 
this hypothesis is supported by the finding that overlaps, thought for a long time to be restricted to viruses, also occur in the large genomes of prokaryotic (delaye et al., 2008; fellner et al., 2015) and eukaryotic organisms (szklarczyk et al., 2007; bergeron et al., 2013; vanderperre et al., 2013) . a particularly interesting feature of overlapping genes is that they represent an intriguing example of adaptive conflict. indeed, they simultaneously encode two proteins whose freedom to change is constrained by each other (sander and schulz, 1979; krakauer, 2000; peleg et al., 2004; allison et al., 2016) , which would be expected to reduce the adaptive ability of the virus (simon-loriere et al., 2013) . we would expect, in principle, that overlapping genes are subjected to strong evolutionary constraints, as a single nucleotide substitution can impair two proteins (see the codon position "21" in fig. 1) . a typical example of "constrained evolution" is that occurring in hepatitis b virus (hbv), whose short genome (3.2 kb) contains a high percentage (50%) of overlapping coding regions (mizokami et al., 1997; zhang et al., 2010) . however, overlapping genes can also show a less conservative pattern of change, because of a high rate of non-synonymous substitutions in one frame (positive adaptive selection) with concurrent dominance of synonymous substitutions in the other (negative purifying selection). examples of positive selection concern the overlapping genes that encode the tat and vpr proteins of simian immunodeficiency virus (hughes et al., 2001) , the p19 and p22 proteins of the tombusvirus family of plant viruses (allison et al., 2016) , and the orf2 and orf5 proteins of trichodysplasia spinulosa-associated polyomavirus (kazem et al., 2016) . we can hypothesize for overlapping genes a first evolutionary model in which the two proteins they encode are subjected to similar selection pressures. 
when selection is strong both proteins (or protein regions) are highly conserved (e.g. the rnase domain of polymerase and the amino-terminal half of the x protein in hbv; see fig. 4 in mizokami et al., 1997) . when selection is not too strong both proteins can vary considerably (e.g. the spacer domain of polymerase and the pres1/s2 domain of the surface protein in hbv; see fig. 4 in mizokami et al., 1997) . this model is named "symmetric evolution", because the number of amino acid substitutions of one protein is expected to be not significantly different from that of the other. it corresponds to the "shared model" by fernandes et al. (2016) . in alternative, we can hypothesize for overlapping genes an evolutionary model in which the two proteins they encode are subjected to significantly different selection pressures. support for this model, which implies adaptive selection on one frame and purifying selection on the other, was provided both by viral (hughes et al., 2001; fujii et al., 2001; guyader and ducray, 2002; stamenković et al., 2016) and mammalian overlapping genes (szklarczyk et al., 2007) . this model is named "asymmetric evolution", because the number of amino acid substitutions of one protein is expected to be significantly different from that of the other. it corresponds to the "segregated model" by fernandes et al. (2016) . we recently assembled a dataset of 80 viral overlapping genes whose expression is experimentally proven (pavesi et al., 2018) , with the aim to provide a useful benchmark for systematic studies. a first analysis of the dataset revealed that overlapping genes differ significantly from non-overlapping genes in their nucleotide and amino acid composition (pavesi et al., 2018) . we also found that the vast majority of the 80 overlaps of the dataset have one or more homologs, suggesting further comparative studies. 
in the present study, i investigated the evolution of viral overlapping genes by sequence analysis of 75 pairs of homologs. the first aim of the study was to determine which of the two evolutionary models described above is the prevailing one. the second aim was to identify the type of nucleotide substitution that significantly affects the pattern of symmetric/asymmetric evolution. finally, the third aim was to assess whether the most variable protein (in the case of asymmetric evolution) is that encoded by the ancestral or the de novo frame. 2.1. selection criteria for homologous overlapping genes i first extracted from the dataset of 80 overlapping genes experimentally proven (s1 dataset from pavesi et al., 2018) the amino acid sequence of the two proteins encoded by each overlap. for each protein, i searched for homologs against the non-redundant protein sequences ncbi database using blastp (altschul et al., 1997) . when blastp did not detect any homolog i used tblastn, which compared the protein query sequence against the nucleotide collection ncbi database translated in all reading frames. i used tblastn because the amino acid sequence of the protein encoded by one of the two overlapping frames (usually that discovered more recently) may not be reported in many viral genomes present in the ncbi database (pavesi et al., 2018) . the selection of homologous overlapping genes was based on three criteria. the first was an equal length of the homolog. it was met in the great majority of cases (72 out of 80). in the remaining cases, the homolog was only slightly shorter than the query sequence. the exception was the overlap capsid protein/assembly activating protein (aap) of adeno-associated virus-2, whose homolog encodes an aap 9 amino acids shorter in the amino-terminal region and 26 amino acids shorter in the carboxy-terminal region. 
the second criterion was a homolog yielding, for both the encoded proteins, an alignment with no insertion/deletion (indel) or with a minimal number of indels. in the latter case, i imposed the rule that indel(s) must be located at the same amino acid position in the alignments of the two pairs of proteins (see for example the overlap polymerase/2b protein of spinach latent virus, which is the first overlap in supplementary file s1). by imposing this rule, i could align the two homologous nucleotide sequences in full accordance with the corresponding protein sequences. the alignment of protein sequences was carried out with clustal omega (sievers and higgins, 2014) . the third criterion concerned the cases in which i found multiple homologs meeting the two criteria described above. in these cases, i selected the most distantly related homolog, with the aim to cover the largest evolutionary space. the choice to select only one homolog for each overlapping gene was due to the fact that collection of a larger sample of homologs is limited to a few overlaps, mainly those occurring in virus species that are human pathogens (e.g. influenza and hepatitis viruses or sars and ebola viruses). the search for homologs yielded a dataset of 80 pairs of homologous overlapping genes (supplementary file s1). thirty-seven homologs came from a different virus species, in accordance with the ictv taxonomy (king et al., 2018) (https://talk.ictvonline.org/taxonomy/). the mean nucleotide identity between overlaps and homologs was 70.7%, with a standard deviation (sd) of 9.4%. the remaining 43 homologs came from isolates belonging to the same virus species. in this case, the mean nucleotide identity between overlaps and homologs was 89.6% (sd = 7.1%). 
for each pair of homologous overlapping genes, the supplementary file s1 contains the following information: i) the nucleotide sequence of the upstream frame and that of the homolog; ii) the amino acid sequence of the protein encoded by the upstream frame (up1) and that of the protein encoded by the homolog (up2); iii) the nucleotide sequence of the downstream frame (shifted by one nucleotide 3' with respect to the upstream frame) and that of the homolog; iv) the amino acid sequence of the protein encoded by the downstream frame (down1) and that of the protein encoded by the homolog (down2); v) the alignment of up1 with up2 and the percent amino acid identity; vi) the alignment of down1 with down2 and the percent amino acid identity; vii) the chi-square analysis, which compared by a 2 x 2 contingency table the number of amino acid identities and differences in the up1-up2 alignment with that in the down1-down2 alignment (cut-off of significance = 3.84; 1 degree of freedom; p < 0.05).
fig. 1. orientation of overlapping genes, with the downstream frame having a shift of one nucleotide 3′ with respect to the upstream frame. there are 3 types of codon position (cp): cp13 (bold character), in which the first position of the upstream frame overlaps the third position of the downstream frame; cp21 (underlined character), in which the second position of the upstream frame overlaps the first position of the downstream frame; cp32 (italic character), in which the third position of the upstream frame overlaps the second position of the downstream frame. based on the genetic code, a nucleotide substitution at first codon position causes an amino acid change in 95.4% of cases, at second codon position in 100% of cases, and at third codon position in 28.4% of cases. thus, nucleotide substitutions at the codon positions "13" and "32" are usually non-synonymous in one frame and synonymous in the other. nucleotide substitutions at the codon position "21" are almost all non-synonymous in both frames.
virology 532 (2019) 39-47
3.2. half of overlapping genes evolve in accordance with the asymmetric model i carried out a preliminary analysis using student's t-test for paired data. for each pair of homologous overlaps, i counted the number of amino acid identities between up1 and up2 and that between down1 and down2. i then calculated the absolute value of the difference between them. the null hypothesis was a mean difference between paired observations close to zero, indicating that overlapping genes evolve in accordance with the symmetric model. the null hypothesis was rejected (student's t = 5.91; 79 degrees of freedom; p = 10^-5), indicating that overlapping genes can also evolve in accordance with the alternative asymmetric model. in order to identify which and how many overlapping genes undergo symmetric or asymmetric evolution, i then compared the amino acid diversity between up1 and up2 to that between down1 and down2. i used the contingency-table chi-square test (snedecor and cochran, 1967) with a cut-off value of 3.84 for 1 degree of freedom (p < 0.05). i classified a pair of homologous overlaps as a case of symmetric evolution, if the number of amino acid substitutions in the up1-up2 alignment did not significantly differ from that in the down1-down2 alignment (chi-square < 3.84). an example is given by the overlap ns1 protein/ns2 protein from dendrolimus punctatus densovirus. for the ns1 protein, i found 73 identities and 86 differences when compared to the homolog from hordeum marinum itera-like densovirus. for the ns2 protein, i found 71 identities and 88 differences, yielding a chi-square value (0.01) largely below the cut-off of significance. 
alternatively, i classified a pair of homologous overlaps as a case of asymmetric evolution, if the number of amino acid substitutions in the up1-up2 alignment was significantly different from that in the down1-down2 alignment (chi-square > 3.84). an example is given by the overlap movement protein/replicase from turnip yellow mosaic virus. for the movement protein, i found 302 identities and 323 differences when compared to the homolog from watercress white vein virus. for replicase, i found 454 identities and 171 differences, yielding a chi-square value (76.3) largely above the cut-off of significance. the chi-square test was highly sensitive. for example, i found that the overlap capsid protein/p31 protein from maize chlorotic mottle virus undergoes asymmetric evolution, in spite of an extremely high nucleotide identity with the homolog (96.7%). indeed, the number of amino acid differences between p31 and homolog (12 out of 149 sites) was significantly higher than that between capsid and homolog (2 out of 149 sites) (chi-square = 6.07; p = 0.01). based on this finding, i set the upper limit of sensitivity of the chi-square test to a nucleotide identity between overlap and homolog of 97%. this filter limited the analysis to 75 (out of 80) pairs of homologous overlaps. overall, i found that 38 overlapping genes evolve in accordance with the asymmetric model (significantly different selection pressures on the two proteins). the highest chi-square value (113.8) concerned the overlap from apple stem grooving virus, which encodes the 36kd movement protein and the polyprotein linker-domain. indeed, the amino acid diversity between linker-domain and homolog (39%; 125 differences and 195 identities) was ten-fold higher than that between movement protein and homolog (4%; 13 differences and 307 identities). i found that the remaining 37 overlapping genes evolve in accordance with the symmetric model (similar selection pressures on the two proteins). 
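the classification rule described above can be sketched in a few lines of python. this is a minimal illustration, not the author's implementation: the function names are hypothetical, the skipping of gap columns is an assumption, and the use of yates' continuity correction is also an assumption, adopted here because it reproduces the chi-square values quoted in the text (0.01, 76.3, 6.07 and 113.8) from the reported counts of identities and differences.

```python
def count_identities(aln1, aln2):
    """Count identical and different residues in two aligned protein strings.

    Columns with a gap ('-') in either sequence are skipped. This is an
    assumption; the text does not say how indel columns are scored.
    """
    if len(aln1) != len(aln2):
        raise ValueError("aligned sequences must have equal length")
    identities = differences = 0
    for a, b in zip(aln1, aln2):
        if a == "-" or b == "-":
            continue
        if a == b:
            identities += 1
        else:
            differences += 1
    return identities, differences


def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square statistic (1 degree of freedom) for the table [[a, b], [c, d]].

    Yates' continuity correction is applied by default (assumption, see lead-in).
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * diff ** 2 / denom


def classify_overlap(counts_protein_1, counts_protein_2, cutoff=3.84):
    """Label a pair of homologous overlaps as 'symmetric' or 'asymmetric'.

    Each argument is an (identities, differences) tuple for one protein pair.
    """
    chi2 = chi_square_2x2(*counts_protein_1, *counts_protein_2)
    label = "asymmetric" if chi2 > cutoff else "symmetric"
    return label, chi2


# Counts of (identities, differences) quoted in the text.
examples = {
    "ns1/ns2, dendrolimus punctatus densovirus": ((73, 86), (71, 88)),
    "movement protein/replicase, turnip yellow mosaic virus": ((302, 323), (454, 171)),
    "capsid/p31, maize chlorotic mottle virus": ((147, 2), (137, 12)),
}
for name, (first, second) in examples.items():
    label, chi2 = classify_overlap(first, second)
    print(f"{name}: chi-square = {chi2:.2f} -> {label}")
```

without the continuity correction the same counts give somewhat larger values (for instance about 77.3 instead of 76.3 for the turnip yellow mosaic virus example), which is why the correction is assumed here.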
the occurrence of similar selection pressures can yield two highly conserved proteins. for example, analysis of the overlap 3a protein/3b protein from human sars coronavirus revealed that the amino acid diversity between 3a and homolog is remarkably low (5.3%; 6 differences and 108 identities), as well as that between 3b and homolog (8.8%; 10 differences and 104 identities). however, the occurrence of similar selection pressure can also yield two proteins with a remarkably less conserved pattern of change. this is the case of the overlap from spinach latent virus, which encodes the zincfinger domain of polymerase and the 2b protein. sequence analysis revealed that the amino acid diversity between zinc-finger domain and homolog is considerably high (47%; 47 differences and 54 identities), as well as that between 2b and homolog (44%; 44 differences and 57 identities). the analysis of amino acid diversity in the 75 pairs of homologous overlapping genes is summarized in fig. 2 . it shows, for each overlap, the percent amino acid (aa) identity of the two encoded proteins with those encoded by the homolog. the subset of the 37 overlapping genes under symmetric evolution ( fig. 2a) contains 31 overlaps in which both proteins have high conservation (aa identity > 50%), 5 overlaps in which both proteins have poor conservation (aa identity < 50%) and 1 overlap with a protein having an aa identity above 50% and the other below 50%. the subset of the 38 overlapping genes under asymmetric evolution (fig. 2b ) contains 24 overlaps in which both proteins have high conservation (aa identity > 50%), 1 overlap in which both proteins have poor conservation (aa identity < 50%) and 13 overlaps with a protein having an aa identity above 50% and the other below 50%. finally, a list of the 75 overlapping genes, classified in accordance with the symmetric or asymmetric model (37 and 38 cases, respectively), is given in supplementary table s1. 3.3. 
validation of the model of symmetric/asymmetric evolution by analysis of the pattern of nucleotide substitutions in homologous overlapping genes in accordance with wei and zhang (2014) , i first classified the nucleotide sites of each overlapping gene into four categories depending on the impact of potential mutations on the two encoded proteins. the four categories are referred as nn, sn, ns, and ss sites, respectively, where n stands for non-synonymous change and s stands for synonymous change. that is, if all potential mutations at a site cause nonsynonymous change in both proteins, it is a nn site, and so on. i then classified the nucleotide substitutions occurring in the homolog into four categories: nn, sn, ns, and ss. using the contingency-table chisquare test, i compared the number of sn and ns sites in each overlapping gene with the number of sn and ns substitutions in the homolog. under symmetric evolution, i would expect a chi-square value below the cut-off of significance (3.84; 1 degree of freedom), that is a full concordance between the number of sn and ns sites and that of sn and ns substitutions. for example, in the overlap orf4/orf5 from barley yellow striate mosaic virus i counted 49 sn sites and 51 ns sites. in the homolog from maize yellow striate virus, i classified 23 nucleotide substitutions into the sn category and 28 substitutions into the ns category. the chi-square test yielded a value (0.08) largely below the cut-off of significance. under asymmetric evolution, i would expect a chi-square above the cut-off of significance, that is a significant discordance between the number of sn and ns sites and that of sn and ns substitutions. for example, the overlap capsid protein/ns4 protein from bluetongue virus (serotype 10) has 56 sn sites and 46 ns sites. the homolog from bluetongue virus (serotype 16) has 5 nucleotide substitutions belonging to the sn category and 29 substitutions to the ns category. 
the chi-square test yielded a value (15.07) largely above the cut-off of significance. the analysis of the pattern of nucleotide substitutions in the 75 pairs of homologous overlaps revealed 39 and 36 cases of symmetric and asymmetric evolution, respectively (supplementary table s2). this result was in accordance with that obtained previously (from analysis of the amino acid diversity, see supplementary table s1) in 87% of cases (65 out of 75). overall, i found a total of 33 overlaps under symmetric evolution (they are marked with a single asterisk in supplementary table s2a) and a total of 32 overlaps under asymmetric evolution (they are marked with a double asterisk in supplementary table s2b). a list of the 32 overlapping genes under asymmetric evolution is given in table 1. these findings were not affected by the fact that some homologs came from a different virus species, while others came from an isolate within the same virus species. under symmetric evolution, i found 14 and 19 overlaps with the homolog within and between species, respectively. under asymmetric evolution, i found 18 and 14 overlaps with the homolog within and between species, respectively. finally, a further validation of the model of symmetric/asymmetric evolution was provided by a correlation test between the chi-square value from analysis of amino acid substitutions and the distribution of nucleotide substitutions at the codon positions "32" and "13" (fig. 1). given the orientation of overlapping genes in our dataset (fig. 1), a substitution at the codon position "32" (cp32) is usually synonymous in the upstream frame and always non-synonymous in the downstream frame, while a substitution at the codon position "13" (cp13) is almost always non-synonymous in the upstream frame and usually synonymous in the downstream frame. 
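the site categories (nn, sn, ns, ss) and the codon positions (cp13, cp21, cp32) used above can be illustrated with a short python sketch. it is not taken from the paper or from wei and zhang (2014): the function names are hypothetical, the standard genetic code is assumed, the first letter of each label is assumed to refer to the upstream frame and the second to the downstream frame, and sites whose three possible mutations do not all fall in the same category are simply reported as "mixed", which is a simplification. the nine-nucleotide sequence is a toy example, not data from the study.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def cp_label(i, shift=1):
    """Codon-position label of the 0-based site i.

    The upstream frame starts at index 0 and the downstream frame at index
    `shift`, so with shift=1 the sites cycle through cp13, cp21, cp32.
    """
    up_pos = i % 3 + 1
    down_pos = (i - shift) % 3 + 1
    return f"cp{up_pos}{down_pos}"


def mutation_effect(seq, i, new_base, frame_start):
    """Return 'S' (synonymous) or 'N' (non-synonymous) for replacing seq[i]
    with new_base in the frame starting at frame_start, or None if the site
    is not covered by a complete codon of that frame."""
    start = frame_start + 3 * ((i - frame_start) // 3)
    if start < 0 or start + 3 > len(seq):
        return None
    codon = seq[start:start + 3]
    offset = i - start
    mutant = codon[:offset] + new_base + codon[offset + 1:]
    return "S" if CODON_TABLE[codon] == CODON_TABLE[mutant] else "N"


def classify_site(seq, i, shift=1):
    """Classify site i as 'NN', 'SN', 'NS' or 'SS' when all three possible
    mutations agree, 'mixed' when they do not, None if a frame does not cover it."""
    effects = set()
    for base in BASES:
        if base == seq[i]:
            continue
        up = mutation_effect(seq, i, base, 0)
        down = mutation_effect(seq, i, base, shift)
        if up is None or down is None:
            return None
        effects.add(up + down)
    return effects.pop() if len(effects) == 1 else "mixed"


toy = "ATGGCTAAA"  # upstream codons ATG GCT AAA; downstream codons TGG CTA
for i in range(len(toy)):
    print(i, toy[i], cp_label(i), classify_site(toy, i))
```

in this toy sequence the cp32 site of the codon gct is synonymous upstream and non-synonymous downstream, in line with the general behaviour of cp32 substitutions described above.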
under symmetric evolution, the number of substitutions at the codon position "32" is expected to be close to that at the codon position "13", yielding a similar distribution of the amino acid substitutions in the two pairs of homologous proteins. under asymmetric evolution, the number of substitutions at the codon position "32" is expected to be significantly higher (or lower) than that at the codon position "13", yielding a different distribution of the amino acid substitutions in the two pairs of homologous proteins. by comparing the upstream frame of each overlap with that of the homolog, i calculated the absolute value (abs) of the difference between the percent frequency (%f) of substitutions at the codon position "32" (%f.cp32) and that at the codon position "13" (%f.cp13). i then carried out a correlation test between abs (%f.cp32 - %f.cp13) and the chi-square value from analysis of amino acid substitutions. as the chi-square test depends on the size of the sample (here the length of the protein encoded by the overlap), i normalized the chi-square value in accordance with cohen's rule (cohen, 1988). the normalized value was the square root of the ratio between the chi-square value and the overall length of the two proteins encoded by the overlap (e.g. the highest chi-square value, 113.83, was converted into the highest normalized chi-square value, 0.42). i found a significantly positive correlation between abs (%f.cp32 - %f.cp13) and the normalized chi-square value (r = 0.88; student's t = 14.36; one-tailed p < 0.00001; 63 degrees of freedom) (fig. 3). as expected, this result indicates that asymmetric evolution is significantly affected by an unbalanced distribution of the nucleotide substitutions at the codon positions "32" and "13". to answer the question, i investigated the genealogy of the 32 overlapping genes under asymmetric evolution. 
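the quantities entering the correlation test described above can be written out as a small python sketch. the function names are hypothetical, and two points are assumptions rather than statements from the text: the percent frequencies are taken relative to all substitutions observed in the overlap, and the overall length of the two proteins is taken as the sum of their aligned lengths. with 320 + 320 residues (the identities plus differences reported for the two apple stem grooving virus proteins), the normalization turns 113.83 into about 0.42, as quoted.

```python
import math


def cp_frequency_gap(n_cp13, n_cp21, n_cp32):
    """abs(%f.cp32 - %f.cp13), with each frequency expressed as a percentage
    of all substitutions in the overlap (assumption, see lead-in)."""
    total = n_cp13 + n_cp21 + n_cp32
    if total == 0:
        raise ValueError("no substitutions to count")
    return abs(100.0 * n_cp32 / total - 100.0 * n_cp13 / total)


def normalized_chi_square(chi2, len_protein_1, len_protein_2):
    """Square root of chi-square divided by the overall length of the two proteins."""
    return math.sqrt(chi2 / (len_protein_1 + len_protein_2))


def t_from_r(r, n_pairs):
    """Student's t statistic and degrees of freedom for a Pearson correlation r
    computed over n_pairs observations."""
    df = n_pairs - 2
    return r * math.sqrt(df / (1.0 - r * r)), df


# Highest chi-square value in the dataset, two proteins of 320 aligned residues each.
print(round(normalized_chi_square(113.83, 320, 320), 2))

# 65 overlaps (33 symmetric + 32 asymmetric) and the rounded r = 0.88 from the text.
t_value, df = t_from_r(0.88, 65)
print(round(t_value, 2), df)
```

with the rounded r = 0.88 this formula gives a t of roughly 14.7 for 63 degrees of freedom, close to the reported 14.36; the small gap is consistent with r having been rounded to two decimals.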
identifying which gene is ancestral and which one is de novo (the genealogy of the overlap) can be done by examining their phylogenetic distribution, under the assumption that the gene with the most restricted distribution is the de novo one (rancurel et al., 2009). this approach yielded a set of 34 overlapping genes with a reliably predicted genealogy (see table 1 in sabath et al., 2012 and table 1 in pavesi et al., 2013). this set included 16 out of the 32 overlaps under asymmetric evolution. another approach to infer the genealogy of overlapping genes is the codon-usage method. it is based on the assumption that the ancestral gene, which has co-evolved over a long period of time with the other viral genes, has a distribution of synonymous codons significantly closer to that of the viral genome than the de novo gene (keese and gibbs, 1992; sabath et al., 2012; pavesi et al., 2013; willis and masel, 2018). due to the shortness of most overlapping genes, the method has been improved, with the aim to evaluate the correlation between the codon-usage patterns of overlapping and non-overlapping genes with a minimal loss of information (pavesi, 2015). using the improved version of the codon-usage method (pavesi, 2015), i could predict the genealogy of 18 out of the 32 overlapping genes under asymmetric evolution. in 11 cases, the prediction by codon-usage was concordant with that established by the phylogenetic method. in the remaining 7 cases, the prediction was provided only by the codon-usage method (supplementary table s3). the overlap p130/p104 of providence virus is notable, as the ancestral frame p104 was acquired from another viral genome by distant horizontal gene transfer (pavesi et al., 2013), which makes the codon usage an unreliable predictor of the genealogy. the prediction yielded by phylogenetics is supported by the finding that p104, unlike p130, has a wide phylogenetic distribution (pavesi et al., 2013). overall, i collected a set of 23 overlapping genes, all under asymmetric evolution and with known genealogy (15 overlaps with a shift of the de novo frame of one nucleotide 3′ with respect to the ancestral frame and 8 overlaps with a shift of two nucleotides 3'). interestingly, i found that in all cases the most variable protein is that encoded by the de novo gene (table 2).
fig. 2. analysis of the amino acid diversity in the 75 pairs of homologous overlapping genes. each pair of columns shows: i) the percent amino acid identity between the protein encoded by the upstream frame of the overlap and that encoded by the homolog (dark column); ii) the percent amino acid identity between the protein encoded by the downstream frame of the overlap (shifted by one nucleotide 3′ with respect to the upstream frame) and that encoded by the homolog (gray column). the horizontal line separates well-conserved homologous pairs (aa identity > 50%) from not well-conserved homologous pairs (aa identity < 50%). (a) subset of the 37 overlapping genes under symmetric evolution. (b) subset of the 38 overlapping genes under asymmetric evolution. the numbering of overlapping genes is in accordance with that given in supplementary table s1. the underlined numbers indicate the overlaps in which the pattern of symmetric evolution (4 cases out of 37) or that of asymmetric evolution (6 cases out of 38) was not confirmed by chi-square analysis of the nucleotide diversity.
table 1. list of the 32 overlapping genes evolving in accordance with the asymmetric model.
3.5. symmetric and asymmetric evolution in the same overlap: the case of the overlap polymerase/large envelope protein of hepatitis b virus (hbv) chi-square analysis indicated that the overlap polymerase/large envelope protein of hbv evolves in accordance with the symmetric model (supplementary tables s1 and s2). 
on the other hand, theoretical and experimental studies (pavesi, 2015; lauber et al., 2017) demonstrated that this long overlap (1167 nt) is subjected to modular evolution, as the spacer domain of polymerase and the s domain of the large envelope protein originated de novo by overprinting. thus, the overlap can be subdivided into two regions: a 5′ region (480 nt), in which the spacer domain of polymerase (de novo gene product) overlaps the pre-s domain of envelope (ancestral gene product), and a 3' region (687 nt), in which the reverse transcriptase domain of polymerase (ancestral gene product) overlaps the s domain of envelope (de novo gene product). i carried out a chi-square analysis of the 2 regions of the overlap independently, under the hypothesis that they may have been subject to different evolutionary pressures. this analysis revealed that the 5' region of the overlap undergoes asymmetric evolution, because the amino acid diversity of the spacer domain (33.7%; 54 differences and 106 identities) is significantly higher than that of the pre-s domain (19.4%; 31 differences and 129 identities) (chi-square = 7.75; p = 0.005). asymmetric evolution was confirmed by analysis of the pattern of nucleotide substitutions (chi-square = 10.13; p = 0.001). in addition, chi-square analysis revealed that the 3' region of the overlap undergoes symmetric evolution, as the amino acid diversity of the reverse transcriptase domain (7.4%; 17 differences and 212 identities) does not significantly differ from that of the s domain (11.8%; 27 differences and 202 identities) (chi-square = 2.04; p = 0.15). symmetric evolution was confirmed by analysis of the pattern of nucleotide substitutions (chi-square = 1.61; p = 0.20). with the aim to further validate these findings, i carried out a further analysis using, as homolog, the most distantly related overlap of woolly monkey hbv (79.9% of nucleotide identity). 
again, chi-square analysis of the amino acid and nucleotide diversity revealed asymmetric evolution in the 5′ region and symmetric evolution in the 3' region. details of both analyses are reported in the supplementary file s2. finally, the finding that the spacer domain of polymerase (de novo gene product) is significantly more variable than the pre-s domain (ancestral gene product) confirms that the most variable protein, under asymmetric evolution, is usually that encoded by the de novo gene. several researchers have developed methods for estimating the strength of selection pressure on overlapping genes (pedersen and jensen, 2001; sabath et al., 2008; de groot et al., 2008; mir and schober, 2014; wei and zhang, 2014) . all methods evaluate, in both overlapping frames, the ratio of non-synonymous nucleotide substitutions to synonymous nucleotide substitutions (dn/ds) by correctly taking into account the problem of the interdependence between sequences imposed by the overlap. the aim is to assess if there is neutral evolution or positive selection in one frame (dn/ds higher than 1) and purifying selection (strong constraints) in the other frame (dn/ds lower than 1). however, the only method having an accessible implementation is that by sabath et al. (2008) . yet, the method has some limitations, as it restricts the analysis to the homologous overlaps in which the two encoded proteins have both an amino acid diversity smaller than 50% or greater than 5%. in the dataset examined here (see the first 75 pairs of homologous overlaps in supplementary file s1), these limitations would have considerably reduced the size of the sample from 75 to 43 pairs of homologous overlaps. i thus chose an approach focused, at first instance, on the evaluation of the amino acid diversity of homologous overlapping proteins, which is the final result of the complex pattern of the interdependent nucleotide substitutions that occur in dual-coding regions. 
unlike previous studies, limited to a few virus species (sabath et al., 2012; zaaijer et al., 2007; liang et al., 2010; shukla and hilgenfeld, 2015; brayne et al., 2017), i examined a large dataset of 75 overlaps from 59 virus species. a possible limitation of the study concerns the selection criteria for homologous overlapping genes. in particular, the first two stringent criteria (an equal length of the homolog and an alignment with a minimal number of indels) led to the exclusion, for some overlaps, of highly divergent homologs. an example is given by the overlap p3n-pipo/polyprotein of turnip mosaic virus, in which the length of the p3n-pipo protein is quite variable among the different potyvirus species, ranging from 60 to 115 amino acids (hillung et al., 2013). thus, the dataset used in this study likely underestimates the sequence diversity of overlapping genes, as it was created mainly to ensure a high quality in the homologous relationship. the finding that 32 out of 65 overlapping genes (table 1) undergo asymmetric evolution is striking, as is the finding that the most variable protein is encoded by the de novo gene in all cases examined (table 3). in particular, i would point out the overlap orf3/orf4 from tobacco bushy top virus, which encodes two proteins, one entirely nested within the other. this peculiar arrangement is similar to that of the overlap p19/p22 from tomato bushy stunt virus, in which the de novo p19 protein shows a previously unknown structural fold and a previously unknown mechanism of binding to small interfering rnas (vargason et al., 2003; baulcombe and molnár, 2004; scholthof, 2006). i believe that structural or functional studies on the de novo orf3 protein from tobacco bushy top virus could reveal new interesting features.
in addition, i would point out the overlap polymerase (pb1 subunit)/pb1-f2 protein of human influenza a virus. it shows, when compared to the homolog from duck, a sixteen-fold increase of substitutions at the codon position "32" (89.2%) with respect to the codon position "13" (5.4%). this yields only 3 amino acid differences between the two pb1 subunits and as many as 35 differences between the two pb1-f2 proteins. interestingly, the de novo pb1-f2 protein has been shown to largely contribute to viral pathogenicity by a pleiotropic effect (chen et al., 2001; varga et al., 2011; yoshizumi et al., 2014). several other de novo proteins under asymmetric evolution are known to play a role in viral pathogenicity. eight de novo proteins (arfp, vp5, l*, x, vf1, pb1-f2, p6, and nss) act as suppressors or antagonists of the interferon response by the host (park et al., 2016; lauksund et al., 2015; sorgeelos et al., 2013; wensman et al., 2013; mcfadden et al., 2011; varga et al., 2011; garcía-rosado et al., 2008; jääskeläinen et al., 2007). four de novo proteins (p19, p69, ac4, and movement protein) act as suppressors of rna silencing (silhavy et al., 2002; chen et al., 2004; chellappan et al., 2005; yaegashi et al., 2008). two de novo proteins (apoptin and pb1-f2) act as apoptosis factors (noteborn et al., 1994; chen et al., 2001). finally, the de novo protein pa-x has the ability to selectively degrade the host rna-polymerase ii transcripts (khaperskyy et al., 2016).
fig. 3. correlation between the normalized chi-square value (from analysis of amino acid substitutions) and the absolute value (abs) of the difference between the percent frequency (%f) of nucleotide substitutions at the codon position "32" (%f.cp32) and that at the codon position "13" (%f.cp13). empty circles indicate the 33 overlapping genes under symmetric evolution. black circles indicate the 32 overlapping genes under asymmetric evolution.
table 2. list of the 23 overlapping genes with known genealogy and evolving in accordance with the asymmetric model.
virology 532 (2019) 39-47
another possible limitation of the study is that the subset of overlapping genes evolving asymmetrically and with known genealogy (23 overlaps) is too small to conclude that the de novo protein is always the preferred target of selection. furthermore, overlapping genes are subjected to a variety of selection pressures that are independent of the orientation of the overlapping frames relative to one another. thus, it is hypothetically possible that an ancestral protein may be significantly more variable than a de novo protein under peculiar selective constraints. despite this limitation, our findings suggest that the birth of new overlapping genes, besides increasing the coding ability of small viral genomes (chirico et al., 2010), is also a valuable source of selective protein adaptation.
none.
positive selection or free to vary? assessing the functional significance of sequence change using molecular dynamics
gapped blast and psi-blast: a new generation of protein database search programs
binding of mammalian ribosomes to ms2 phage rna reveals an overlapping gene encoding a lysis function
crystal structure of p19-a universal suppressor of rna silencing
an out-of-frame overlapping reading frame in the ataxin-1 coding sequence encodes a novel ataxin-1 interacting protein
gene overlapping and size constraints in the viral world
genotype specific evolution of hepatitis e virus
microrna-binding viral protein interferes with arabidopsis development
a novel influenza a virus mitochondrial protein that induces cell death
viral virulence protein suppresses rna silencing-mediated defense but upregulates the role of microrna in host gene expression
why genes overlap in viruses
statistical power and analysis for the behavioral sciences
investigating selection in viruses: a statistical alignment approach
the origin of a novel gene through overprinting in escherichia coli
evidence for the recent origin of a bacterial protein-coding, overlapping orphan gene by evolutionary overprinting
functional segregation of overlapping genes in hiv
conserved and non-conserved regions in the sendai virus genome: evolution of a gene possessing overlapping reading frames
molecular and functional characterization of two infectious salmon anaemia virus (isav) proteins with type i interferon antagonizing activity
sequence analysis of potato leafroll virus isolates reveals genetic stability, major evolutionary events and differential selection pressure between overlapping reading frame products
intra-specific variability and biological relevance of p3n-pipo protein length in potyviruses
simultaneous positive and purifying selection on overlapping reading frames of the tat and vpr genes of simian immunodeficiency virus
tula and puumala hantavirus nss orfs are functional and the products inhibit activation of the interferon-beta promoter
limited variation during circulation of a polyomavirus in the human population involves the coco-va toggling site of middle and alternative t-antigen(s)
origins of genes: "big bang" or continuous creation?
selective degradation of host rna polymerase ii transcripts by influenza a virus pa-x host shutoff protein
changes to taxonomy and the international code of virus classification and nomenclature ratified by the international committee on taxonomy of viruses
stability and evolution of overlapping genes
diversity of coding strategies in influenza viruses
deciphering the origin and evolution of hepatitis b viruses by means of a family of non-enveloped fish viruses
infectious pancreatic necrosis virus proteins vp2, vp3, vp4 and vp5 antagonize ifna1 promoter activation while vp1 induces ifna1
selection characterization on overlapping reading frame of multiple-protein-encoding p gene in newcastle disease virus
norovirus regulation of the innate immune response and apoptosis occurs via the product of the alternative open reading frame 4
selection pressure in alternative reading frames
evolution of overlapping genes
constrained evolution with respect to gene overlap of hepatitis b virus
a single chicken anemia virus protein induces apoptosis
hepatitis c virus frameshift/alternate reading frame protein suppresses interferon responses mediated by pattern recognition receptor retinoic-acid-inducible gene-i
on the informational content of overlapping genes in prokaryotic and eukaryotic viruses
viral proteins originated de novo by overprinting can be identified by codon usage: application to the "gene nursery" of deltaretroviruses
different patterns of codon usage in the overlapping polymerase and surface genes of hepatitis b virus suggest a de novo origin by modular evolution
overlapping genes and the proteins they encode differ significantly in their sequence composition from non-overlapping genes
a dependent-rates model and an mcmc-based methodology for the maximum-likelihood analysis of sequences with overlapping reading frames
overlapping messages and survivability
overlapping genes produce proteins with unusual sequence properties and offer insight into de novo protein creation
a method for the simultaneous estimation of selection intensities in overlapping genes
evolution of viral proteins originated de novo by overprinting
degeneracy of the information contained in amino acid sequences: evidence from overlaid genes
the tombusvirus-encoded p19: from irrelevance to elegance
acquisition of new protein domains by coronaviruses: analysis of overlapping genes coding for proteins n and 9b in sars coronavirus
clustal omega, accurate alignment of very large numbers of sequences
a viral protein suppresses rna silencing and binds silencing-generated, 21- to 25-nucleotide double-stranded rnas
the effect of gene overlapping on the rate of rna virus evolution
statistical methods. iowa state university press
evasion of antiviral innate immunity by theiler's virus l* protein through direct inhibition of rnase l
substitution rate and natural selection in parvovirus b19
rapid asymmetric evolution of a dual-coding tumor suppressor ink4a/arf locus contradicts its function
direct detection of alternative open reading frames translation products in human significantly expands the proteome
the influenza virus protein pb1-f2 inhibits the induction of type i interferon at the level of the mavs adaptor protein
size selective recognition of sirna by an rna silencing suppressor
a simple method for estimating the strength of natural selection on overlapping genes
the x proteins of bornaviruses interfere with type i interferon signalling
gene birth contributes to structural disorder encoded by overlapping genes
inhibition of long-distance movement of rna silencing signals in nicotiana benthamiana by apple chlorotic leaf spot virus 50 kda movement protein
influenza a virus protein pb1-f2 translocates into mitochondria via tom40 channels and impairs innate immunity
independent evolution of overlapping polymerase and surface protein genes of hepatitis b virus
evolutionary selection associated with the multi-function of overlapping genes in the hepatitis b virus
the author is grateful to alessio peracchi (university of parma) and alberto vianelli (university of insubria) for helpful suggestions. special thanks to xinzhu wei (university of michigan) for valuable comments and suggestions and to gianmarco del vecchio for preparing the figures. the author thanks the anonymous referees and the editor alexander e. gorbalenya for their helpful feedback and suggestions. the study was financed by the miur (ministero dell'università e della ricerca). supplementary data to this article can be found online at https://doi.org/10.1016/j.virol.2019.03.017.
key: cord-260949-w2xuf15h authors: galluzzi, lorenzo; brenner, catherine; morselli, eugenia; touat, zahia; kroemer, guido title: viral control of mitochondrial apoptosis date: 2008-05-30 journal: plos pathog doi: 10.1371/journal.ppat.1000018 sha: doc_id: 260949 cord_uid: w2xuf15h throughout the process of pathogen–host co-evolution, viruses have developed a battery of distinct strategies to overcome biochemical and immunological defenses of the host. thus, viruses have acquired the capacity to subvert host cell apoptosis, control inflammatory responses, and evade immune reactions. since the elimination of infected cells via programmed cell death is one of the most ancestral defense mechanisms against infection, disabling host cell apoptosis might represent an almost obligate step in the viral life cycle. conversely, viruses may take advantage of stimulating apoptosis, either to kill uninfected cells from the immune system, or to induce the breakdown of infected cells, thereby favoring viral dissemination. several viral polypeptides are homologs of host-derived apoptosis-regulatory proteins, such as members of the bcl-2 family. moreover, viral factors with no homology to host proteins specifically target key components of the apoptotic machinery.
here, we summarize the current knowledge on the viral modulation of mitochondrial apoptosis, by focusing in particular on the mechanisms by which viral proteins control the host cell death apparatus. the sacrifice, via programmed cell death (pcd), of infected cells represents the most primordial response of multicellular organisms to viruses. this response is common to all metazoan phyla, including plants (which lack an immune system based on mobile cells) (text s2, [s1]). in mammals, microbial invasion not only triggers pcd of infected cells but also elicits an immune reaction. this is hierarchically organized in a first-line response provided by innate immune effectors (e.g., infiltrating phagocytes and natural killer cells) [s2], followed by the activation of adaptive immunity, mediated by t and b lymphocytes [s3]. importantly, other layers of defense exist to prevent viral replication and spread [s2]. for instance, in invertebrates like drosophila melanogaster (as well as in plants), a prominent antiviral mechanism is provided by rna interference (rnai) [s4]. although the rnai pathway is preserved in mammals, it has presumably been superseded in its antiviral role by the extremely potent interferon system, as well as by a number of additional mechanisms [s5]. such a multivariate antiviral response is designed to recognize virions, virus-infected cells, and virus-induced signals of stress (including cell death) to eliminate the pathogen (together with the host cell) and to elicit immunological memory [s6]. thus, the co-evolution between host and virus has forced the latter to develop strategies for modulating host cell pcd and/or for avoiding immunogenic cell death. apoptosis is an active mode of pcd exhibiting a series of morphological and biochemical changes by which it can be distinguished from other cell death subroutines [s7].
at a morphological level, these modifications include a dramatic reduction in cell volume (cell shrinkage), nuclear pyknosis (chromatin condensation), and karyorrhexis (nuclear fragmentation). eventually, dying cells break down into small, discrete bodies known as apoptotic bodies [s7]. the morphological appearance of apoptosis is accounted for by an ensemble of biochemical events that include, but are not limited to: (1) loss of the structural integrity and bioenergetic functions of mitochondria, (2) cascade activation of a specific set of catabolic enzymes (e.g., proteases of the caspase family, nucleases), (3) exposure of phosphatidylserine on the outer leaflet of the plasma membrane and, finally, (4) loss of the barrier function of the plasma membrane [s7]. apoptosis can be triggered by two fundamentally distinct signaling cascades, namely the extrinsic and intrinsic (or mitochondrial) pathways (for a recent review, see [1]) (figure 1). the extrinsic pathway is started by the ligand-induced oligomerization of specific cell surface receptors, such as fas/cd95 and the tumor necrosis factor receptor (tnfr). this induces the intracellular assembly of the death-inducing signaling complex (disc), a molecular platform for the activation of the caspase cascade that emanates from caspase-8 and results in the activation of effector caspases and nucleases (e.g., caspase-3, -6, and -7, caspase-activated dnase) (figure 1) [s8,s9]. in contrast, the intrinsic pathway is controlled by mitochondria, which collect and integrate pro- and antiapoptotic signals incoming from other organelles as well as from the extracellular microenvironment. notably, proapoptotic stimuli as diverse as dna damage, endoplasmic reticulum (er) stress, lysosomal stress, reactive oxygen species (ros), and calcium (ca2+) overload are able to activate the intrinsic pathway of apoptosis by favoring mitochondrial membrane permeabilization (mmp) (figure 1) [s10,s11].
in some cells, mitochondrial apoptosis may follow the activation of death receptors, due to the mmp-promoting activity of the bh3-only protein bid, which can be proteolytically activated by caspase-8 [s9,s12]. mmp culminates in the loss of mitochondrial transmembrane potential (Δψm), an arrest of mitochondrial bioenergetic and biosynthetic functions, and in the release of mitochondrial intermembrane space (ims) proteins into the cytosol. these proteins include caspase activators like cytochrome c (cyt c) [s13,s14] and smac/diablo [s15], as well as caspase-independent death effectors such as apoptosis-inducing factor (aif) [s16,s17] and endonuclease g (endog) [s18]. thus, mmp activates caspase-dependent and/or -independent mechanisms to execute cell death [1]. impaired mmp is associated with multiple pathological conditions, including autoimmune diseases and cancer [s19]. conversely, unscheduled mmp contributes to the development of diseases characterized by an excess of cell death, such as ischemia/reperfusion injuries, trauma, toxic/metabolic syndromes as well as chronic neurodegenerative conditions like amyotrophic lateral sclerosis or alzheimer, parkinson, and huntington diseases [s20]. mmp is regulated by a complex network of signaling pathways that involves both endogenous (e.g., pro- and antiapoptotic bcl
during the last decade, major efforts have been dedicated to the elucidation of the mechanisms underlying mmp in health and disease. according to current beliefs, mmp is executed via either of two distinct, yet partially overlapping and non-mutually exclusive, mechanisms. these two routes to mmp are initiated at different mitochondrial subcompartments (notably at the mitochondrial outer or inner membrane, i.e., om and im) and each relies on a specific set of factors. nonetheless, they both lead to a functional and structural collapse of mitochondria that commits the cell to death [1].
notably, mmp is associated with dramatic changes in the mitochondrial network as well as in mitochondrial ultrastructure, concerning both matrix volume and cristae organization [s50-s52]. how these structural modifications of mitochondria might impact on viral infection, however, remains to be elucidated.
figure 1. the extrinsic and the intrinsic (mitochondrial) pathways of apoptosis. the extrinsic apoptotic pathway involves the activation of death receptors at the cell surface, followed by a caspase cascade that eventually leads to the execution of cell death. in contrast, different proapoptotic stimuli initiate the intrinsic pathway by triggering mitochondrial membrane permeabilization (mmp). following mmp, intermembrane space proteins are released into the cytosol, the mitochondrial transmembrane potential (Δψm) is dissipated, and the bioenergetic and redox-detoxifying functions of mitochondria are compromised. the resulting bioenergetic and redox crises, associated with the activation of both caspase-dependent and -independent executioner mechanisms, commit the cell to death. the two pathways are interconnected by the bh3-only protein bid, whose truncated form (tbid) is generated by caspase-8 and can target mitochondria to trigger mmp. for a more detailed description of the intrinsic and extrinsic pathways of apoptosis please refer to the introduction and to [1]. disc, death-inducing signaling complex; er, endoplasmic reticulum. doi:10.1371/journal.ppat.1000018.g001
the abundant presence of the voltage-dependent anion channel (vdac) renders the om freely permeable to solutes and small metabolites up to approximately 5 kda. this cutoff ensures that soluble proteins are retained in the ims under normal circumstances. the apoptosis-associated drastic increase in om permeability may originate at the om itself by means of multiple mechanisms, including (1) the assembly of large homo- or heteromultimeric channels, allowing for the release of ims proteins, by proapoptotic pore-forming proteins of the bcl-2 family (e.g., bax, bak) [10,11;s53,s54]; (2) the destabilization of the lipid bilayer mediated by proapoptotic bcl-2 family members (e.g., bax, truncated bid, i.e., tbid), which results in the priming of mitochondria for the release of ims proteins [s55-s58]; and (3) the induction of the so-called mitochondrial permeability transition (mpt) at the im, following the interaction between bax (or tbid) and components of the permeability transition pore complex (ptpc) at the om [12;s59-s63]. in this latter case, mmp begins and ends at the om, yet is mediated by an event taking place mainly at the im, i.e., mpt (see the section ''mmp regulation by the ptpc'' for further details). independently from the specific mechanisms that activate mmp, the bcl-2 family of proteins exerts a major regulation of this process [s64]. the bcl-2 family is composed of antiapoptotic multidomain members (e.g., bcl-2, bcl-xl, mcl-1), which contain four bcl-2 homology (bh) domains (bh1-4) [s65], proapoptotic multidomain proteins (e.g., bax, bak) [s66], which contain three bh domains (bh1-3), and proapoptotic bh3-only proteins (e.g., bid, bad) [13]. due to an additional c-terminal domain, some members of all the subgroups share the ability to insert into the om and other intracellular membranes (e.g., er) [1]. the specific set of bh domains contained in each bcl-2 family member determines its profile of activity [s67-s69]. in this context, early structure-function studies identified bh1, bh2, and bh4 as the major antiapoptotic determinants of bcl-2 [s67-s70]. conversely, the presence of the bh3 domain was found to suffice for apoptosis induction by bax (as well as for heterodimerization with bcl-2/bcl-xl) [s71-s72].
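the domain-based grouping of the bcl-2 family described above can be summarized as a simple lookup. this is an illustrative sketch only: it encodes just the three subgroups named in the text (bh1-4, bh1-3, and bh3-only), ignores the c-terminal membrane-insertion domain and any protein-specific exceptions, and the function name is hypothetical.

```python
def classify_bcl2_member(bh_domains):
    """Map a set of BH domains to the subgroup named in the text.

    Simplifying assumption: subgroup membership is decided by the BH
    domain set alone, which real proteins do not always respect.
    """
    domains = frozenset(d.lower() for d in bh_domains)
    if domains == {"bh1", "bh2", "bh3", "bh4"}:
        return "antiapoptotic multidomain"   # e.g., bcl-2, bcl-xl, mcl-1
    if domains == {"bh1", "bh2", "bh3"}:
        return "proapoptotic multidomain"    # e.g., bax, bak
    if domains == {"bh3"}:
        return "proapoptotic bh3-only"       # e.g., bid, bad
    return "unclassified"


print(classify_bcl2_member(["BH1", "BH2", "BH3", "BH4"]))
print(classify_bcl2_member(["BH1", "BH2", "BH3"]))
print(classify_bcl2_member(["BH3"]))
```

the lookup mirrors the statement that the set of bh domains determines the activity profile. it is not a predictor of function for an arbitrary sequence.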
later, numerous reports showed that the conserved transmembrane (tm) domain and less conserved, unstructured loops between bh domains also contribute to define the functional profile of bcl-2 proteins, either by acting as targeting signals for subcellular compartments (e.g., mitochondria, er) or by modulating the overall tertiary structure [s73-s77]. bcl-2/bcl-xl stabilizes mitochondrial membranes via multiple mechanisms, including (1) the sequestration into inactive complexes of its proapoptotic counterparts, bax, bak, and bh3-only proteins (e.g., bid) (for review: [s65,s78]), (2) inhibitory interactions with ptpc constituents, in particular with vdac and the adenine nucleotide translocase (ant) [12,14], (3) an enhancement of cyt c oxidase activity and mitochondrial respiration [s79], and/or (4) indirect effects on intracellular ca2+ stores of the er [s80,s81]. while bax/bak execute mmp by one (or more) of the aforementioned mechanisms, bh3-only proteins exhibit indirect proapoptotic effects [1;s66,s82]. thus, ''activator'' bh3-only proteins (e.g., bid) would directly interact with and activate bax/bak, whereas the ''derepressors'' (e.g., bad) would rather disrupt bcl-2/bcl-xl inhibitory complexes, thus allowing for the release of bax/bak [15,s82]. in healthy cells, inactive bak is constitutively associated with mitochondria, while bax is found as a cytosolic monomer [s66]. upon apoptosis induction, both undergo a conformational modification to become activated, and bax translocates to the om in the form of an active dimer [s83,s84]. when bax or bak induce mmp, the release of specific ims proteins occurs via large pores in the om and may precede Δψm loss. in some instances of bax-mediated apoptosis, indeed, mitochondrial outer membrane permeabilization (momp) occurs without discernible Δψm alterations [1] (figure 2).
figure 2. large pores formed by the oligomerization of proapoptotic bcl-2 proteins (e.g., bax, bak) and/or the voltage-dependent anion channel (vdac) may selectively promote mitochondrial outer membrane permeabilization (momp). in this case, specific intermembrane space (ims) proteins are liberated in the cytosol, but the mitochondrial transmembrane potential (Δψm) is (at least initially) retained (a,b). on the contrary, some proapoptotic stimuli, such as calcium (ca2+) overload, reactive oxygen species (ros), and the lipid second messenger ceramide, favor mmp by inducing the permeabilization of the inner mitochondrial membrane (im) via the activation of the permeability transition pore complex (ptpc). when the ptpc opens, Δψm is immediately lost and an unregulated entry of solutes and water into the mitochondrial matrix occurs. this results in the osmotic swelling of mitochondria, followed by rupture of both mitochondrial membranes and the unspecific release into the cytosol of ims proteins (a,c) (please refer to the sections ''mmp regulation by bcl-2 family proteins'' and ''mmp regulation by the ptpc'' for additional details). notably, antiapoptotic proteins from the bcl-2 family play a role in both models. aif, apoptosis-inducing factor; ant, adenine nucleotide translocase; ck, creatine kinase; cypd, cyclophilin d; cyt c, cytochrome c; hk, hexokinase; oxphos, oxidative phosphorylation complexes; pbr, peripheral-type benzodiazepine receptor. doi:10.1371/journal.ppat.1000018.g002
despite their structural similarity, each bh3-only protein presents a specific mechanism of activation (either at a transcriptional level or mediated by post-translational modifications), and acts as a sensor of a particular type of cell stress [15,s82].
for instance, the bh3-only proteins puma and noxa are activated by dna damage via p53-dependent transactivation [16;s85], bim and bmf are released from cytoskeletal structures upon c-jun n-terminal kinase (jnk)-mediated phosphorylation [s86-s88], and bid is proteolytically processed by caspase-8 following the activation of the extrinsic pathway of apoptosis [s9,s12]. following caspase-8-mediated cleavage, a glycine residue of tbid is exposed, allowing for post-translational (rather than the usual cotranslational) n-myristoylation. this modification has been shown to act as an activating switch and to enhance tbid-induced cyt c release and cell death [s89]. interestingly, tbid mitochondrial targeting [s90] and proapoptotic activity [s91] have been associated with cardiolipin, a mitochondrial lipid particularly abundant in the im. tbid-cardiolipin interaction requires three α-helical domains (α4-α6) of tbid and occurs prominently at the contact sites between the om and the im [s92]. cardiolipin might also be implicated in the dissociation of bid fragments (tbid and nbid), which would rather occur during the targeting of tbid to mitochondria than immediately after caspase-8-mediated cleavage [s93]. although bcl-2 family proteins exert their apoptosis-modulatory functions mainly at mitochondria, extra-mitochondrial activities contribute to their effects. for instance, bcl-2/bcl-xl localize at the er and decrease luminal ca2+ concentration, thus protecting against ca2+-dependent death stimuli [s80,s94,s95]. conversely, bax/bak favor the transfer of ca2+ from the er to mitochondria and cell death [s96,s97]. moreover, bax-/-/bak-/- mefs show an impaired mobilization of er ca2+ following numerous proapoptotic stimuli, which can be partially restored by overexpressing the sarco/endoplasmic reticulum ca2+ atpase (serca) [17].
taken together, these observations point to the bcl-2 system as a prominent pharmacological target for the modulation of mitochondrial apoptosis (for a review, see [18]).
mmp may originate at the im due to the activation of the ptpc, a large multiprotein structure assembled at the contact sites between om and im. this applies in particular to cell death models characterized by enhanced ca2+ fluxes and disproportionate ros generation [s98]. ptpc activation provokes a sudden increase in the im permeability to solutes of low molecular weight (i.e., mpt), which leads to the unregulated entry of water and osmotic swelling of the mitochondrial matrix. in turn, this may result in the physical rupture of the om, because the surface area of the im (with its folded cristae) largely exceeds that of the om [s63,s98,s99]. in the context of mpt-derived mmp, Δψm dissipates before the om is permeabilized and ims proteins are released (figure 2) [s63]. although its exact molecular composition remains elusive, numerous independent studies suggest that the ptpc might result from the association of multiple proteins, including ant (in the im) and vdac (in the om), in the context of a dynamic interaction with mitochondrial matrix proteins (e.g., cyclophilin d [cypd]), ims proteins (e.g., creatine kinase [ck]), om proteins (e.g., peripheral-type benzodiazepine receptor [pbr]), as well as with cytosolic factors (e.g., hexokinase isoforms) (for recent reviews, see [s63,s100,s101]). nevertheless, genetic studies performed in the murine system suggest that all the aforementioned components of the ptpc, most of which exist in multiple isoforms, are either dispensable for cell death or preferentially participate in necrotic pathways (rather than in apoptosis) [19-21;s102]. in addition, controversial views remain about the mechanisms by which the ptpc promotes mpt and therefore mmp.
some authors have proposed that in physiological conditions vdac would be found within the ptpc in a state of low conductance, rapidly switching between the open and closed conformations [s103]. in this configuration, the ptpc would ensure the normal exchange of metabolites between the mitochondrial matrix and the cytosol. following proapoptotic stimuli, a state of high conductance for vdac would be favored, resulting in long-lasting openings of the ptpc and mpt [12;s59]. alternatively, it has been suggested that the high conductance state of vdac would serve its physiological functions, whereas cell death would result from a closed conformation, favoring a transient hyperpolarization of the mitochondrial matrix, followed by osmotic imbalance, swelling, and eventually mmp [s104-s105]. several reports indicate ptpc components as targets for the apoptosis-modulatory activity of both pro- and antiapoptotic bcl-2 family members. in this context, it has been demonstrated that bax and bak accelerate the opening of vdac in reconstituted proteoliposomes, and that vdac-deficient mitochondria do not exhibit the bax/bak-induced cyt c release and Δψm dissipation occurring in vdac-proficient control mitochondria [12;s106]. in the same model, recombinant bcl-2/bcl-xl as well as synthetic peptides corresponding to their bh4 domains were shown to prevent vdac opening, cyt c release, and Δψm dissipation [s107,s108]. in addition, the bh3-only proteins bid and bim have been reported to interact directly with vdac, the latter interaction being remarkably enhanced during apoptosis [s62,s109]. bax and bcl-2/bcl-xl also modulate ptpc activity by binding to ant. as demonstrated by the yeast two-hybrid system, coimmunoprecipitation assays, and in artificial lipid bilayers, ant and bax directly interact and cooperate to form a channel with distinct electrophysiological properties as compared to the channels formed by bax or ant alone [22;s110].
in artificial membranes, the presence of bcl-2 inhibited cooperative channel formation by bax and ant as well as the atractyloside-induced assembly of channels by ant alone, thus pointing to a direct interaction between ant and bcl-2 [s46,s110]. furthermore, bcl-2 promotes (and bax inhibits) adp/atp exchange in ant-containing proteoliposomes, isolated mitochondria, and mitoplasts [s111]. interestingly, in this system the bax-mediated inhibition of ant translocase activity could be separated from the formation of cooperative channels by bax and ant [s111]. as determined by co-immunoprecipitation and proteomics analysis, the interactome of ant undergoes major rearrangements in the course of chemotherapy-induced apoptosis. thus, soon after the treatment of a human tumor cell line (ht29 cells) with etoposide, the amount of bax contained within the ant interactome significantly increased, whereas the quantity of bcl-2 decreased [s112]. as previously mentioned in the section ''mmp regulation by bcl-2 family proteins'', bcl-2 family members are known to modulate luminal ca2+ concentration and ca2+ release at the er [17;s94,s95]. in doing so, they exert an additional indirect control on the ptpc, since cytosolic ca2+ liberated from er stores (for instance upon the induction of the unfolded protein response) can accumulate in mitochondria and promote ptpc opening, mpt-dependent mmp, and cell death [s40]. thus, it appears that an intricate crosstalk for the modulation of mmp exists between mitochondria and the er, in which proteins from the bcl-2 family participate at the level of both organelles [s80,s81].
during the last decade, numerous viral proteins have been reported to modulate (either positively or negatively, either in a direct or indirect fashion) the apoptotic response of host cells to infection (figure 3, tables 1 and 2) [7;s42].
in this regard, viral factors can be classified into four subgroups: proapoptotic proteins (1) that insert into mitochondrial membranes and hence trigger mmp through the action of amphipathic α-helical domains or (2) that promote mmp indirectly, through the activation of host-encoded factors (table 1), and antiapoptotic modulators (3) that exhibit sequence and/or structural similarity to multidomain bh1-4 members of the bcl-2 family (so-called viral bcl-2 proteins [vbcl-2s]) or (4) that inhibit apoptosis via other mechanisms (table 2). notably, some viral proteins exhibit mixed apoptosis-modulatory functions, and hence cannot be unambiguously classified into one of the aforementioned groups.
several proteins encoded by the human immunodeficiency virus 1 (hiv-1) exert a proapoptotic activity, thereby contributing to the hiv-induced depletion of cd4+ lymphocytes (for a review, see [23]). among these, the viral protein r (vpr) has direct mitochondrial effects in numerous cell types, independently of its mode of delivery (viral infection, transfection of a vpr gene, or exogenous administration of recombinant vpr protein) [s113,s114]. the c-terminal moiety (aa 52-96) of vpr directly interacts with ant and vdac, thereby triggering mmp associated with ΔΨm loss, release of ims proteins, and caspase cascade activation [24;s41,s115]. when added in vitro to purified mouse liver mitochondria, a synthetic vpr-derived peptide (vpr 52-96) induced large amplitude swelling (an indicator of im permeabilization and ptpc pore opening) in less than 15 minutes. this effect could be prevented by bcl-2 as well as by pharmacological agents targeting ant (such as bongkrekic acid [ba]) or vdac (such as 4,4'-diisothiocyanatostilbene-2,2'-disulfonic acid [dids]) [24]. in lymphoid cells, vpr-mediated mmp and apoptosis are facilitated by bax, yet inhibited by overexpression of bcl-2 or addition of the ptpc inhibitor cyclosporine a (csa) [s41,s115].
recently, it has been proposed that two distinct domains of vpr (namely aa 27-51 and aa 71-82) bind to a region encompassing the first ant ims loop and part of its second and third tm helices [s114]. this model may explain why vpr is able to convert ant into a non-specific pore, as has been observed experimentally when adding vpr-derived peptides to ant-containing proteoliposomes [24]. importantly, one of the vpr arginine residues that is required for the interaction with ant (r77) [s41] is frequently mutated (r77q) in long-term non-progressors (i.e., hiv-1 carriers who fail to develop signs of cd4+ t cell depletion within 15 years after infection) [s116]. this points to the functional relevance of the ant-vpr interaction to the development of aids. nonetheless, vpr has been reported to modulate viral replication independently of its proapoptotic function, presumably via an interaction with host proteins other than ant [s117]. the minimal cytotoxic domain of vpr that binds to ant (aa 67-82) is structured as an amphipathic α-helix [24]. this short cytotoxic domain (vpr 67-82) has recently been conjugated with a tumor blood vessel rgd-like ''homing'' motif. this hybrid peptide is highly efficient in killing human endothelial cells, presumably because it binds to the cell surface, is internalized, and finally interacts with mitochondrial membranes to trigger mmp [s118]. in contrast with vpr 52-96, vpr 67-82 induces cell death independently of bax and bak, and efficiently overcomes bcl-2-mediated apoptosis resistance [s118]. the reasons for the differences in the modes of action of vpr 52-96 and vpr 67-82 remain elusive. nonetheless, the design of cell type-targeted mitochondriotoxic peptides, such as those derived from vpr, has opened tantalizing possibilities for the therapeutic induction of apoptosis (reviewed in [9;s119,s120]).
hepatitis b virus (hbv) is one of the most prominent etiological agents of chronic liver disease, and persistent infection is associated with hepatocellular carcinoma [25;s121]. hbv x protein (hbx) is implicated in viral replication and exhibits an oncogenic potential in animal models [25;s121]. moreover, hbx sensitizes cells to tumor necrosis factor α (tnfα)-induced apoptosis [s122]. heterologous expression of hbx in cultured human hepatoma cells results in its mitochondrial accumulation, interaction with vdac3, and ΔΨm dissipation, coupled to a perinuclear redistribution of mitochondria [s123,s124]. mutagenesis studies revealed that hbx localization to mitochondria and ΔΨm dissipation do not depend on its transactivation domain [s125,s126], but require a putative tm region (aa 54-70) of the protein, while two additional amphipathic helical domains (aa 75-88 and 109-131) provide only marginal contributions [s127]. moreover, hbx binds to the heat shock protein of 60 kda (hsp60), a mitochondrial matrix chaperone [s126]. interestingly, ultrastructural and functional impairment of mitochondria as well as vdac3 overexpression have been associated with chronic liver disease, further supporting an etiological role for hbx in this context [s128].
poliovirus (plv) infection causes paralytic poliomyelitis, an acute disease resulting in flaccid paralysis associated with caspase-dependent apoptosis of motor neurons [s129]. similar to hbx, the plv viroporin 2b localizes to mitochondria, induces a perinuclear redistribution of these organelles, and alters their morphology, suggesting that 2b might exert proapoptotic effects by directly promoting mmp [s130]. as discussed below, this is not the sole mechanism accounting for plv-induced neurodegeneration (see the section ''indirect mmp facilitators'').
similar perinuclear clustering of mitochondria and ΔΨm loss, followed by partial cyt c release and signs of apoptosis (e.g., phosphatidylserine exposure on the outer leaflet of the plasma membrane and chromatin condensation), is observed upon the overexpression of the orfc protein from the walleye dermal sarcoma virus (wdsv), a retrovirus causing benign tumors in fish that are characterized by seasonal regression [s131,s132]. orfc is a basic protein of 120 aa that localizes to mitochondria, although it shows no homology to any known mitochondrial targeting sequence (mts) [s132]. interestingly, regressing tumors express orfc at high levels, pointing to the involvement of orfc-mediated apoptosis in this phenomenon [s131,s132].
human t lymphotropic virus 1 (htlv-1) infection is linked with a diverse range of neurodegenerative and lymphoproliferative disorders, notably acute t cell leukemia [26]. the genome of this complex retrovirus codes for typical structural and enzymatic proteins but also for unique regulatory and accessory factors that are involved in both the htlv-1 viral cycle and pathogenesis [s133,s134]. among these, p13(ii) is an 87-aa protein that is targeted to mitochondria, where it promotes a rapid flux of k+ and ca2+ across the im together with swelling, ΔΨm dissipation, and fragmentation [27;s135]. the n-terminus of p13(ii) includes a short hydrophobic leader peptide, followed by an amphipathic α-helical mts. within this region, ten residues are sufficient to target a green fluorescent protein (gfp)-p13(ii) fusion to mitochondria [s136], and four arginines form a positively charged patch on one side of the α-helix, which is responsible for its amphipathic properties [s137]. p13(ii) expression has been shown to enhance caspase-dependent fas- and c2 ceramide-induced apoptosis [s138], and to suppress the proliferative and tumorigenic potential of cells transformed by the myc and ras oncogenes [s135].
the bovine leukemia virus (blv) is an htlv-1 homolog causing lymphoproliferative disorders in multiple species [s139]. similarly to p13(ii), the blv accessory protein g4 is localized to both the nucleus and mitochondria, due to an mts consisting of a hydrophobic region and an amphipathic α-helix [s140]. while g4 is known to alter mitochondrial morphology [s140], its exact role in blv pathogenesis remains poorly understood. indeed, whereas g4 exhibits an oncogenic potential both in vitro and in vivo [s141,s142] and g4-deleted blv variants are characterized by reduced in vivo propagation [s143], blv-infected peripheral blood mononuclear cells are protected from apoptosis independently of g4 expression [s144]. one possibility is that the effects of g4 on cellular transformation result from its interaction with farnesyl pyrophosphate synthetase (fpps) rather than from the direct modulation of mmp [s145].
pb1-f2 is an 87-aa protein encoded by influenza a virus (iav) that determines virulence in murine models of infection [28;s146]. while iavs genetically engineered to lack pb1-f2 replicated normally both in tissue cultures and in mouse lungs, their pathogenicity and associated mortality were greatly diminished as compared to wild-type strains [s146]. pb1-f2 inserts into mitochondrial membranes via a positively charged amphipathic α-helix located in its c-terminus [29]. there, pb1-f2 acquires pore-forming properties similar to those displayed by proapoptotic bcl-2 family members (e.g., bax) [s147]. the domain necessary and sufficient for mitochondrial targeting has been mapped to residues 46 to 75 of pb1-f2 [s148]. this domain comprises a short hydrophobic region including several basic residues followed by an amphipathic α-helix, and closely resembles the mts found in vpr, p13(ii), and g4 [29].
as assessed by glutathione-s-transferase pulldown assays followed by mass spectrometry, the sole mitochondrial proteins interacting with pb1-f2 are the isoform 1 of vdac (vdac1) and the isoform 3 of ant (ant3) [s149]. accordingly, ptpc blockers such as ba and csa prevent pb1-f2-induced mmp and apoptosis [s149]. however, pb1-f2 is capable of destabilizing lipid bilayers in the electric field, implying that the protein may promote mmp by acting directly on mitochondrial membranes [s147].
human papillomaviruses (hpvs) are responsible for a broad range of infectious diseases, ranging from anogenital warts (e.g., hpv-6 and -11) to progressive dysplastic-neoplastic lesions of the genital mucosa (e.g., hpv-16 and -18) [s150]. the hpv genome codes for a 90-aa protein encompassing part of the e1 and e4 open reading frames (orfs), called e1e4 [s151]. in human mature keratinocytes (the natural host for hpv infection), e1e4 binds to cytokeratins, thereby inducing the total collapse of the keratin (but not of the tubulin and actin) cytoskeleton [30]. thereafter (or in cells lacking cytokeratins), e1e4 localizes to mitochondria, due to an n-terminal leucine-rich region [s152]. via an as yet unidentified mechanism, e1e4 displaces mitochondria from microtubules, promoting the aggregation of the organelles in a large perinuclear cluster, followed by ΔΨm dissipation and apoptosis [s152].
both the structural protein vp3 and the non-structural protein 2c encoded by the avian encephalomyelitis virus (aev) promote cell death [s153,s154]. in different cell lines, vp3 localizes to mitochondria and sets off the caspase cascade leading to apoptosis [s153]. comparable effects result from the expression of 2c, which is a conserved protein of picornaviruses with a role in several steps of the viral life cycle [s154].
the proapoptotic function of 2c is associated with an n-terminal domain of 35 aa (from residues 46 to 80), which has a putative α-helical structure and lies in the proximity of a coiled-coil domain [s154].
as a last example, the non-structural protein 4a (ns4a) from hepatitis c virus (hcv) localizes (at least in part) at mitochondria, thereby causing mitochondrial damage (associated with ΔΨm dissipation and cyt c release) and impairing the intracellular distribution of these organelles [s155]. notably, acute hcv infection often progresses to chronic hepatitis, cirrhosis, and hepatocellular carcinoma [s156], and hence it is not surprising that the genome of hcv encodes several other modulators of apoptosis (see also the sections ''indirect mmp facilitators'' and ''other antiapoptotic viral strategies'') [s156,s157].
multiple proteins encoded by the hiv-1 genome initiate mitochondrial apoptosis in an indirect fashion (table 1), without a physical interaction with mitochondrial membrane proteins. thus, nef stimulates lysosomal membrane permeabilization, resulting in the release of cathepsin d from the lysosomal lumen into the cytosol. this triggers the activation of bax (but not that of bak) and mmp-dependent cell death [s158,s159]. the hiv-1 envelope glycoprotein complex (env, constituted by gp120 and gp41) is expressed by infected cells and promotes cell-to-cell fusion by interacting with its receptor/co-receptor complex (cd4/cxcr4 or cd4/ccr5) on the surface of uninfected cells. env-elicited syncytium formation is followed by mmp after a latency of at least 12 hours [31]. env triggers mmp through a complex signal transduction pathway that involves the sequential activation of cyclin-dependent kinase 1 (cdk1), mammalian target of rapamycin (mtor), and p38 map kinase, followed by phosphorylation of p53 and p53-dependent transactivation of puma and bax [32,33;s160].
tat, a powerful activator of hiv-1 gene expression, triggers apoptosis of infected and uninfected cells, thereby contributing to hiv-1-induced neurodegeneration [34;s161,s162]. transfection with tat results in its accumulation in mitochondria followed by ΔΨm loss, ros overproduction, and caspase activation [s163]. however, recombinant tat protein added to cultured cells fails to localize at mitochondria and primarily accumulates in the endosomal compartment, presumably due to its uptake via the endocytic pathway [s164]. tat associates with the tubulin network through a 4-aa subdomain of its conserved core region, thereby altering microtubule dynamics, promoting the proteasomal degradation of the microtubule-associated protein 2 (map2), and activating a mitochondrion-dependent apoptotic pathway [35;s162]. tat cytotoxicity relies (at least partially) on the proapoptotic activity of the bh3-only protein bim. in healthy cells, bim is inactivated through its association with the microtubule-associated dynein motor complex, and tat liberates bim from this inhibition [35;s86]. accordingly, bim−/− cells exhibit increased resistance against tat-induced cell death [35]. nevertheless, tat may influence mitochondrial apoptosis through other mechanisms, and contradictory reports suggest that tat might regulate p53 activity as well as the expression levels of bax and bcl-2 [s165-s167].
the hiv-1-encoded protease is required for the viral life cycle because it processes large polypeptide precursors into mature viral proteins. due to its degenerate substrate specificity, the viral protease also promotes the proteolytic activation of caspase-8, as assessed both in vitro and in t cells, which leads to bid cleavage and mitochondrial apoptosis [36].
reportedly, the hiv-1 protease would also favor apoptosis and viral replication via the cleavage and inactivation of bcl-2, which would result in the oxidative stress-dependent activation of nf-κb, a transcription factor required for hiv-1 enhancer activation [s168,s169]. conversely, high bcl-2 levels protect cells in vitro and in vivo from the viral protease and prevent apoptosis induced by hiv-1 infection of human lymphocytes, while reducing the yields of viral structural proteins, infectivity, and the secretion of tumor necrosis factor β (tnfβ) [s169].
additional viral proteases have been implicated in the induction of apoptosis. thus, the hcv non-structural protein 3 (ns3) participates in a protease complex [s170], and has been reported to induce caspase-8-mediated apoptosis independently of its enzymatic activity [s171]. upon expression either as single proteins or in combination, the plv proteases 2a (2apro) and 3c (3cpro) activate caspase-dependent apoptosis [s172-s174]. however, other mechanisms are involved in the induction of apoptosis by plv, including (1) modulation of antiapoptotic proteins of the bcl-2 family (e.g., bcl-xl) [s175]; (2) jnk-mediated activation of bax [s176]; and (3) mmp promoted by viroporins 2b and 3a (see also the section ''direct inducers of mmp'') [s130].
following infection with human adenoviruses (advs), cells exhibit an apoptotic response mediated by the expression of the viral e1a protein. this lethal response can be counteracted by the vbcl-2 e1b-19k (see also the section ''viral bcl-2 homologs'') [37-39]. notably, e1a was the first viral protein found to promote apoptosis [37-39], via both p53-dependent and -independent mechanisms, the latter involving additional viral factors and in particular products of the e4 gene that are expressed upon e1a-mediated transactivation [s177-s182].
moreover, e1a sensitizes cells to apoptosis induced by multiple stimuli including death receptor agonists (e.g., fasl, tnfα, and tnf-related apoptosis inducing ligand [trail]) [s183-s185] and nitric oxide (no) [s186]. recently, a prominent role for bh3-only proteins (and in particular for bik) has been reported in adv-induced apoptosis [40;s187]. bik is upregulated at the transcriptional level after adv infection, in a p53-dependent fashion [s188]. moreover, during the viral life cycle, the proapoptotic activity of bik is enhanced as a result of an activating phosphorylation. accordingly, sirna-mediated depletion of bik has been shown to dramatically diminish adv-dependent cell death [s187]. e4orf6, a 34-kda protein encoded by the adenoviral gene e4, can promote apoptosis by inhibiting the protein phosphatase 2a (pp2a) [41]. pp2a inhibition prolongs the signal of dna damage emanating from phosphorylated histone h2ax (γh2ax), thereby leading to poly(adp-ribose) polymerase (parp) hyperactivation, aif nuclear translocation, and ultimately cell death [41]. moreover, pp2a is known to maintain mitochondrial bcl-2 in a hypophosphorylated form, which allows for its antiapoptotic function [s30]. in this context, pp2a inhibition would also favor the phosphorylation-dependent inactivation of bcl-2.
e6 and e7 contribute to the transforming properties of high-risk hpvs by targeting p53 to ubiquitin-mediated degradation [42;s189] and by inactivating the retinoblastoma (rb) protein [43;s190], respectively. e6 (alone or together with e7) has been reported to sensitize cells to different apoptotic stimuli [s191-s193], via mechanisms that may [s191,s193] or may not [s192] depend on p53. similar to e6, e7 has been implicated in the sensitization of cells to cell death induced by growth factor withdrawal [s194], chemotherapeutic agents [s191,s195], and ultraviolet (uv) irradiation [s196].
the vesicular stomatitis virus (vsv) belongs to the family of rhabdoviruses, and its infection is associated with the development of neurological disorders characterized by enhanced neuronal apoptosis [s197,s198]. thus, vsv-infected cells exhibit an early activation of the mitochondrial pathway of apoptosis [s199], which does not depend on de novo viral protein synthesis nor on viral replication [s199,s200]. multiple pathways are involved in vsv-induced apoptosis, including (1)
while the induction of host cell apoptosis may favor viral dissemination at late stages of infection, it is vital for viruses to inhibit pcd at early steps of the infectious cycle, thereby avoiding premature cell death and allowing the virus to replicate. thus, viruses have developed a battery of bcl-2 homologs by which they mimic the major antiapoptotic system of host cells (for a recent review, see [s215]). in some instances, such vbcl-2s fail to show significant sequence similarity with their mammalian counterparts, yet exhibit striking structural resemblance. finally, a number of viral factors inhibit apoptosis via other mechanisms, which do not directly involve the bcl-2 system (table 2).
the ''founder'' of the viral bcl-2 homolog (vbcl-2) family is the 19-kda protein encoded by the adenoviral e1b gene (e1b-19k) [39;s216]. during adv infection, e1b-19k blocks host cell apoptosis, thereby sustaining viral replication [47]
have been originally identified thanks to their interaction with e1b-19k. while e1b-19k and bcl-2 exhibit limited overall sequence similarity [51], they share short homologous domains that can be exchanged between the two proteins without a significant loss in their antiapoptotic functions [s226,s227]. accordingly, bcl-2 and e1b-19k functionally complement each other, and in hela cells, bcl-2 overexpression is able to substitute for the absence of e1b-19k, thereby allowing for productive adv infection and inhibiting tnfr- and fas-mediated apoptosis [51].
human cytomegalovirus (cmv) encodes several proteins that subvert host cell functions in order to favor viral propagation [52]. one of the best characterized among these factors is the product of the cmv gene ul37 (pul37x1), also known as viral mitochondria-localized inhibitor of apoptosis (vmia). vmia, which is required for viral replication, has been shown to inhibit apoptosis triggered by different stimuli, including ligation of death receptors and exposure to cytotoxic agents, as well as infection with a mutant adv strain lacking the antiapoptotic modulator e1b-19k [s228]. although vmia does not share any obvious structural similarity with bcl-2 or with its viral homologs (vbcl-2s), vmia exerts its antiapoptotic activity predominantly by inhibiting mmp at the level of mitochondria [52]. deletion mutagenesis studies revealed that an n-terminal mts is necessary and sufficient for vmia mitochondrial localization, but not for its antiapoptotic activity, which requires an additional c-terminal region of the protein [52;s46,s228]. vmia can physically interact with bax, recruit it to mitochondria, and neutralize it [53]. since vmia effects on apoptosis are lost in bax-deficient cells, it appears that vmia exerts its antiapoptotic functions solely by neutralizing bax [53]. the structure-function relationship of the bax/vmia interaction has been addressed by mutational and computational analyses, suggesting a model in which the overall fold of vmia closely resembles that of bcl-xl [s229]. in contrast to bcl-xl, however, it seems that vmia does not bind to the bh3 domain of bax but rather engages in electrostatic interactions involving a region of bax between its bh2 and bh3 domains [s229]. this region is not conserved in bak, explaining why vmia (in contrast to bcl-xl) fails to interact with bak [53;s229]. besides its ability to neutralize bax, vmia exerts multiple functions, not all of which are directly linked to its mmp-modulatory role.
in particular, vmia (1) interacts with ant and enhances its antiporter activity [54]; (2) induces the fragmentation of the mitochondrial network, which might hamper the propagation of ca2+-dependent proapoptotic signals [s230]; (3) inhibits the atp synthasome via an interaction with the mitochondrial inorganic phosphate carrier (pic, an im protein), thus reducing mitochondrial atp generation [54]; and (4) induces the release of er ca2+ stores into the cytosol [s231]. this mobilization of er ca2+ might have several consequences, including activation of the unfolded protein response, modulation of mitochondrial functions, induction of mitochondrial fission, and protection against proapoptotic signals through an inhibition of the propagation of ca2+ waves [s231].
poxviruses cause several diseases of humans and domestic animals, including smallpox, cowpox, sheeppox, fowlpox, and goatpox [s232]. the genome of many poxviruses codes for proteins that, despite the lack of sequence similarity, fold like bcl-2 and exert similar antiapoptotic properties, thereby belonging de facto to the vbcl-2 family. f1l from vaccinia virus (vacv) was first characterized following the observation that viral strains lacking the serpin crma (which acts as a direct inhibitor of caspases [55]) maintained the ability to protect cells from apoptosis [56]. the c-terminus of f1l is composed of a 12-aa tm domain flanked by positively charged lysines and followed by an 8-aa hydrophilic tail. as assessed by mutagenesis studies, this domain (which exhibits slight homology to the c-terminus of bcl-2) is required for both the mitochondrial targeting and the antiapoptotic function of f1l [s233]. f1l directly interacts with the bh3 domain of proapoptotic bcl-2 family members, including bak [57;s234] and bim [s235], thereby inhibiting ΔΨm dissipation and cyt c release induced by diverse stimuli (e.g., staurosporine, fas crosslinking) [56].
the region of f1l involved in this interaction encompasses aa 64-84 and has limited sequence similarity to known bh3-binding domains [s234]. while f1l has been shown to inhibit mitochondrial translocation and activation of bax, a direct interaction between f1l and bax has never been detected, suggesting that f1l acts upstream of bax activation [s235]. other poxviruses, including variola virus, monkeypox virus, and ectromelia virus, encode functional f1l orthologs, all of which display 95% sequence homology in the c-terminus of the protein [56].
myxoma virus (mxv) is a leporipoxvirus causing myxomatosis (a highly lethal disease) in the european rabbit. productive infection by mxv requires the expression of m11l [s236], a 166-aa protein with no defined structural motifs except a putative c-terminal tm domain [s237]. within this region, six positively charged residues flanking a hydrophobic stretch followed by a short positively charged tail constitute an mts, which is responsible for the targeting of m11l to the om during infection [s237]. this kind of mts resembles that of some members of the bcl-2 family [s237]. from its mitochondrial localization, m11l suppresses apoptosis induced by treatment with staurosporine [s237] and the pbr ligand protoporphyrin ix [58], overexpression of bak [s238], and viral infection [59]. these effects derive from the ability of m11l to constitutively interact with pbr [58], bak [s238], and bax [59], and hence to inhibit mpt-dependent mmp as well as bak/bax-mediated momp. bax-mediated apoptosis is blocked in mxv-infected cells lacking bak expression, suggesting that m11l interacts with bax independently of bak [59]. m11l has no obvious sequence homology with bcl-2 or bcl-xl, yet adopts a virtually identical three-dimensional fold [60]. this high level of structural homology allows m11l to associate with a peptide derived from the bh3 domain of bak with an affinity comparable to that of bcl-2 and bcl-xl for the same peptide [60].
thus, m11l represents a structural mimic of bcl-2 and hence a bona fide vbcl-2 [61].
n1l is the most potent virulence factor of vacv and is required for productive replication in vivo, as assessed in murine models of mucosal and brain infection [s239,s240]. preliminary studies indicated that n1l associates with several components of the iκb kinase (ikk) complex, and hence can interfere with innate immunity signalling mediated by nf-κb and toll-like receptors (tlrs) [s241,s242]. however, cells infected with recombinant viruses with or without the n1l gene exhibited no difference in nf-κb-dependent gene expression [62], suggesting that another function of n1l underlies its important role in vivo. indeed, during viral infection, n1l binds to endogenous proapoptotic bcl-2 family members such as bid, bak, and bax as well as to exogenous bad and bax (following transfection-enforced overexpression) [62]. similar to m11l, n1l shares no sequence homology with cellular proteins yet exhibits a three-dimensional structure that closely mimics that of bcl-2 and bcl-xl [s243]. thus, the surface of n1l possesses a constitutively open groove common to other antiapoptotic members of the bcl-2 family, which mediates the interaction with bh3-containing proteins [62]. the examples provided by m11l and n1l illustrate the importance of the conservation of structure, rather than of primary amino acid sequence, for the maintenance of protein function throughout evolution.
recently, two novel bcl-2-like inhibitors of apoptosis encoded by poxviruses have been discovered: fpv039 from fowlpox virus (fpv) [63] and orfv125 from parapoxvirus orf virus (ppvo) [64]. in human and chicken cells, fpv039 localizes at mitochondria and constitutively interacts with bak, thereby suppressing apoptosis induced by tnfα and bak overexpression.
moreover, fpv039 is able to substitute for f1l in inhibiting bak activation and apoptosis triggered by staurosporine and vacv infection, confirming that fpv039 is a functional homolog of f1l [63]. in uv-irradiated hela cells, orfv125 fully inhibits dna fragmentation, caspase activation, and cyt c release, but not jnk activation, an event occurring upstream of mitochondria. the mitochondrial localization of orfv125 is determined by a distinctive c-terminal domain, and is required for its antiapoptotic function. as assessed by bioinformatic analyses, orfv125 shares sequence and structure similarities with antiapoptotic members of the bcl-2 family (i.e., bcl-2, bcl-xl, and bcl-w). accordingly, orfv125 inhibited the uv-induced activation of bax and bak, strongly corroborating the idea that orfv125 represents a novel bona fide member of the vbcl-2 family [64].
epstein-barr virus (ebv) is a prominent member of the γ-herpesviruses, invading primary resting b lymphocytes to establish a latent infection that eventually culminates in cell transformation [s215]. the potent mitogenic effect of ebv is mediated by the coordinated expression of several gene products, including the apoptotic modulators balf1 and bhrf1 [s215,s244]. these two factors share sequence and structure homology with bcl-2 and belong de facto to the growing list of vbcl-2s [65;s245]. while bhrf1 clearly resembles bcl-2 in its antiapoptotic function [66], the role of balf1 (which is able to interact with bax and bak, and has been proposed as a bhrf1 antagonist) is controversial [65;67]. balf1 is actively expressed in ebv-positive burkitt's lymphoma cell lines and nasopharyngeal carcinoma biopsies, and renders cells independent of serum [s246]. this points to a prominent antiapoptotic (rather than proapoptotic) function for balf1 during ebv-driven tumorigenesis [s246]. as is true for other dna viruses, antiapoptotic proteins appear to be essential for the early phases of the herpesvirus life cycle.
however, neither protein is expressed or required once latent infection is established [s215], nor are they essential for in vitro viral replication and transformation of b cells [68;s247]. as assessed by immunoelectron microscopy studies, bhrf1 colocalizes with bcl-2 at the om [66,69]. the c-terminal hydrophobic portion of bhrf1 (which is responsible for bhrf1 targeting to intracellular membranes) exhibits high levels of homology with several members of the bcl-2 family, in particular with bcl-2 (38%), bcl-xl (32%), and bax (34%) [69;s245]. moreover, the three-dimensional structure of bhrf1 closely resembles that of bcl-2 insofar as it contains two central hydrophobic α-helices that are surrounded by several amphipathic α-helices. in contrast to bcl-2/bcl-xl, however, bhrf1 is not able to sequester and inhibit proapoptotic bh3-only proteins, presumably because it lacks an exposed, pre-formed bh3 binding groove [69;s245] [72;s262].
hhv-8 orf16 encodes the so-called kaposi sarcoma-associated bcl-2 (ksbcl-2), a polypeptide of 175 residues that shares limited (15%-20%) overall sequence identity with other bcl-2 homologs (including bcl-2, bcl-xl, bax, bak, and bhrf1) [71]. interestingly, significant amino acid identity is concentrated in the bh1 and bh2 (but not in the bh3) domains [71]. moreover, although ksbcl-2 exhibits an overall fold almost identical to that of bcl-2/bcl-xl, key differences exist in the lengths of helices and loops [73]. presumably, structure and sequence dissimilarity with bcl-2 accounts for the fact that ksbcl-2 neither homodimerizes nor heterodimerizes with other bcl-2 family members, suggesting that it may have evolved to escape any negative regulatory effects mediated by proapoptotic host proteins from the bcl-2 family [71,73].
mutation of a conserved arginine and the two adjacent residues within the bh3 binding groove resulted in a correctly folded protein that failed to bind bax bh3 peptide and to inhibit bax toxicity in yeast. moreover, viruses harboring the same mutation exhibited impaired persistent replication and reactivation from latency in vivo [s267]. altogether, these studies point to a major role of m11 in vivo for the establishment of a persistent viral pool and chronic infection, rather than for viral replication and virulence during acute infection [74]. in contrast to ksbcl-2, hvs orf16 has been shown to interact with bax and bak to inhibit virus-induced apoptosis. while orf16 exhibits highly conserved bh1 and bh2 domains, it lacks the core sequence of the conserved bh3 domain, suggesting that this portion might be dispensable (at least in some proteins) for antiapoptotic functions [72]. similar to other vbcl-2s (i.e., bhrf1 and ksbcl-2), orf16 contains a stretch of conserved hydrophobic residues at its c-terminus, ending with basic amino acids, that may direct its (not yet demonstrated) targeting to intracellular membranes [71]. interestingly, several herpesvirus-encoded vbcl-2s cannot be converted into proapoptotic factors by activated caspases during pcd (as occurs with their mammalian counterparts), and hence fail to display any latent proapoptotic activity [75]. in addition to ksbcl-2, hhv-8 codes for yet another protein with limited homology to bcl-2 [76]. thus, k7 is a 16-kda glycoprotein structurally related to a splice variant of human survivin (survivin-Δex3), a mammalian antiapoptotic factor belonging to the inhibitor of apoptosis protein (iap) family [s268,s269]. both proteins contain an mts, an n-terminal region of a baculovirus iap repeat (bir) domain, and a putative bh2 domain [76]. the mts of k7 consists of a single tm hydrophobic region flanked by positively charged residues, and resembles that of m11l from mxv [s270]. 
k7 efficiently represses apoptosis induced by activation of death receptors (e.g., fas, tnfr), bax overexpression and thapsigargin-mediated er stress [76;s271]. similarly to other iaps, k7 binds to and hence inhibits caspase-3 via its bir domain. however, k7 antiapoptotic effects depend on its bh2 domain, which mediates the interaction of k7 with bcl-2. thus, it seems that k7 exerts its functions by bridging effector caspases and bcl-2, thereby enabling the latter to inhibit caspase activity [76]. interestingly, k7 has also been shown to modulate intracellular ca2+ concentration and protein stability, by interacting with the cellular ca2+-modulating cyclophilin ligand (caml) and with a regulator of the ubiquitin system, respectively [s270,s271]. whereas mutational analyses showed that the interaction between caml and k7 is required for its antiapoptotic activity [s270], the significance of k7-mediated proteasome regulation remains to be established [s271]. due to its molecular structure, k7 can be considered either as a viral iap (viap) or as a vbcl-2 [76]. to avoid premature apoptosis of the host cell, asfv encodes multiple antiapoptotic proteins, including the vbcl-2 family member a179l (also known as 5-hl) [77;s272,s273]. a179l codes for a polypeptide of 21 kda that contains all known domains associated with bcl-2 structural and functional features, including those mediating protein binding (i.e., homo- and heterodimerization) and regulating cell death [s272]. thus, a179l has been shown to suppress apoptosis induced by multiple stimuli, including growth factor deprivation [s272] and exposure to cytotoxic agents [s274]. viruses inhibit host cell apoptosis via a plethora of mechanisms other than vbcl-2s. for instance, the ul36 gene of cmv encodes the viral inhibitor of caspase-8 activation (vica), which has been shown to inhibit fas/cd95-mediated apoptosis by binding to the pro-domain of caspase-8 and preventing its autoproteolytic processing [78]. 
although this effect primarily concerns the pathway of apoptosis emanating from death receptors, it also prevents the activation of the intrinsic pathway occurring along the caspase-8 → bid axis [78;s275]. recently, another mechanism of cmv-mediated inhibition of mitochondrial apoptosis has been discovered [79]. during cmv infection, a 2.7-kilobase virally encoded rna (b2.7) interacts with the mitochondrial respiratory chain complex i (reduced nicotinamide adenine dinucleotide-ubiquinone oxidoreductase) and prevents the mitochondrial release (that normally would be induced in response to apoptotic stimuli) of the complex i subunit grim-19. this stabilizes Δψm and results in continued atp production, hence improving the viability of infected cells and favoring the successful completion of the viral life cycle [79]. thus, cmv employs two distinct strategies to influence mitochondrial respiration of infected cells, namely vmia-mediated inhibition of the atp synthasome and stabilization of complex i by b2.7 rna. as a net result, this combined modulation should not lead to an increase in Δψm, but to decreased mitochondrial atp generation, correlating with the documented glycolytic switch of cmv-infected cells [s276]. ebv-induced transformation of primary b lymphocytes into continuously proliferating lymphoblastoid cell lines involves proteins other than vbcl-2s [s277]. among these, the epstein-barr nuclear antigens (ebna) 3a and 3c (ebna3a and ebna3c) but not ebna3b may downregulate the bh3-only protein bim, and hence reduce the propensity of host cells to undergo apoptosis [s277]. moreover, the ebna leader protein (ebna-lp) has been shown to interfere not only with host cell transcription but also with the apoptotic machinery [80;s278]. interestingly, it has been suggested that ebna-lp would interact with bcl-2 through the hs1-associated protein x-1 (hax-1) [ , and k13 is required for the long-term survival of infected cells [85]. 
thus, k13 promotes the activation of nf-κb via multiple, partially distinct molecular pathways [s289-s292], which in turn lead to (1) reduced sensitivity to death receptor-mediated [s293] and intrinsic apoptosis [86] ; in addition, rid has been shown to target tyrosine kinase receptors, such as the epidermal growth factor receptor (egfr) [s304]. both rid subunits are tm proteins oriented with their c-termini in the cytoplasm [s305,s306]. ridb contains a c-terminal tyrosine residue that is required for receptor internalization and inhibition of fas- and trail-induced apoptosis [s307]. rida has been reported to undergo o-glycosylation [s308] and phosphorylation on serine residues [s309]. how rida posttranslational modifications might affect rid antiapoptotic function remains to be established. mutagenesis studies revealed that the extracellular domain of rida is important for the clearance of egfr from the cell surface, but not for the internalization of death receptors like fas [s310]. interestingly, e3-6.7k, another protein encoded by the e3 unit, is specifically implicated in rid-mediated clearance of trail-r2 [s299]. this suggests that additional viral (or cellular) factors might cooperate with rid to determine its target specificity. baculoviruses infect insect cells, and possess at least two different classes of proteins by which they control the host apoptotic response [s311]. one of these is represented by p35, a potent inhibitor of metazoan caspases acting via a cleavage-dependent mechanism [89;s312]. p35 mechanism-based inhibition of caspases is the most broadly acting antiapoptotic system known. recently, the viral mitochondrial antiapoptotic protein (vmap), encoded by the m8 gene of γhv-68, has been shown to suppress intrinsic apoptosis via a completely novel, dual mechanism [99]. 
via its n-terminus, vmap is able to augment the recruitment of bcl-2 to mitochondria and to enhance its affinity for bh3-only proapoptotic proteins, thereby suppressing bax activation. moreover, vmap interacts with vdac1 via two leucine-rich motifs located in the central and c-terminal parts of the protein, thus repressing staurosporine-induced cyt c release and apoptosis. interestingly, vmap is necessary for efficient γhv-68 lytic replication in normal cells (with an intact apoptotic apparatus), but not in bax−/−/bak−/− cells, pointing to a crucial role for apoptosis inhibition during the early steps of the viral life cycle [99]. we have reviewed the impact of viral infection on cell fate via modulation of mitochondrial apoptosis. while specific cellular and molecular mechanisms have been elucidated for a number of individual proteins (e.g., vpr, vmia), a clear scheme of the integrated effects resulting from the expression of whole virus genomes has only recently begun to emerge from transcriptomics and proteomics analyses (for a review, see [100]). future studies will have to take into account the variability of the host cell and its microenvironmental context (e.g., local inflammation, oxidative stress) as key factors that can modulate the response to specific pathogens. this will undoubtedly be instrumental for the prediction of the general consequences of viral infections, as well as for a more accurate identification of novel therapeutic targets designed to eradicate infectious diseases. a complete list of accession numbers (uniprotkb/swiss-prot knowledgebase, http://www.expasy.org/sprot/) for the proteins discussed in this manuscript can be found online in text s1. text s1 list of the accession numbers (uniprotkb/swiss-prot knowledgebase) of all the proteins described in this article. 
mitochondrial membrane permeabilization in cell death direct activation of bax by p53 mediates mitochondrial membrane permeabilization and apoptosis ca2+-induced apoptosis through calcineurin dephosphorylation of bad requirement for gd3 ganglioside in cd95-and ceramide-induced apoptosis disruption of mitochondrial function during apoptosis is mediated by caspase cleavage of the p75 subunit of complex i of the electron transport chain cell biology mitochondrion-targeted apoptosis regulators of viral origin mitochondrial apoptosis induced by the hiv-1 envelope mitochondria as therapeutic targets for cancer chemotherapy proapoptotic bax and bak: a requisite gateway to mitochondrial dysfunction and death bid, bax, and lipids cooperate to form supramolecular openings in the outer mitochondrial membrane bcl-2 family proteins regulate the release of apoptogenic cytochrome c by the mitochondrial channel vdac distinct bh3 domains either sensitize or activate mitochondrial apoptosis, serving as prototype cancer therapeutics the permeability transition pore complex: a target for apoptosis regulation by caspases and bcl-2-related proteins differential targeting of prosurvival bcl-2 proteins by their bh3-only ligands allows complementary apoptotic function and drug-induced apoptotic responses mediated by bh3-only proteins puma and noxa bax and bak regulation of endoplasmic reticulum ca2+: a control point for apoptosis pharmacological manipulation of bcl-2 family members to control cell death the adp/ atp translocator is not essential for the mitochondrial permeability transition pore cyclophilin d-dependent mitochondrial permeability transition regulates some necrotic but not apoptotic cell death mitochondrial apoptosis without vdac bax and adenine nucleotide translocator cooperate in the mitochondrial control of apoptosis apoptosis as an hiv strategy to escape immune attack control of mitochondrial membrane permeabilization by adenine nucleotide translocator interacting with 
hiv-1 viral protein rr and bcl-2 hbx gene of hepatitis b virus induces liver cancer in transgenic mice multiple viral strategies of htlv-1 for dysregulation of cell growth control the human t-cell leukemia virus type 1 p13ii protein: effects on mitochondrial function and cell growth a novel influenza a virus mitochondrial protein that induces cell death the influenza a virus pb1-f2 protein targets the inner mitochondrial membrane via a predicted basic amphipathic helix that disrupts mitochondrial function specific interaction between hpv-16 e1-e4 and cytokeratins results in collapse of the epithelial cell intermediate filament network apoptosis control in syncytia induced by the hiv type 1-envelope glycoprotein complex: role of mitochondria and caspases human immunodeficiency virus 1 envelope glycoprotein complex-induced apoptosis involves mammalian target of rapamycin/fkbp12-rapamycin-associated protein-mediated p53 phosphorylation nf-kappab and p53 are the dominant apoptosis-inducing transcription factors elicited by the hiv-1 envelope induction of apoptosis in uninfected lymphocytes by hiv-1 tat protein hiv-1 tat targets microtubules to induce apoptosis, a process promoted by the pro-apoptotic bcl-2 relative bim hiv-1 protease processes procaspase 8 to cause mitochondrial release of cytochrome c, caspase cleavage and nuclear fragmentation the 19-kilodalton adenovirus e1b transforming protein inhibits programmed cell death and prevents cytolysis by tumor necrosis factor alpha the adenovirus e1a proteins induce apoptosis, which is inhibited by the e1b 19-kda and bcl-2 proteins adenovirus e1b 19-kilodalton protein overcomes the cytotoxicity of e1a proteins nbk/ bik antagonizes mcl-1 and bcl-xl and activates bak-mediated apoptosis in response to protein synthesis inhibition the adenoviral e4orf6 protein induces atypical apoptosis in response to dna damage the e6 oncoprotein encoded by human papillomavirus types 16 and 18 promotes the degradation of p53 complex 
formation of human papillomavirus e7 proteins with the retinoblastoma tumor suppressor gene product vesicular stomatitis viruses expressing wild-type or mutant m proteins activate apoptosis through distinct pathways west nile virus infection activates the unfolded protein response, leading to chop induction and apoptosis induction of apoptosis by the severe acute respiratory syndrome coronavirus 7a protein is dependent on its interaction with the bcl-xl protein viral homologs of bcl-2: role of apoptosis in the regulation of virus infection bcl-2, bcl-xl and adenovirus protein e1b19kd are functionally equivalent in their ability to inhibit cell death the e1b 19k protein blocks apoptosis by interacting with and inhibiting the p53-inducible and death-promoting bax protein cloning of a bcl-2 homologue by interaction with adenovirus e1b 19k functional complementation of the adenovirus e1b 19-kilodalton protein with bcl-2 in the inhibition of apoptosis in infected cells a cytomegalovirus-encoded mitochondria-localized inhibitor of apoptosis structurally unrelated to bcl-2 cytomegalovirus cell death suppressor vmia blocks bax-but not bak-mediated apoptosis by binding and sequestering bax at mitochondria cytopathic effects of the cytomegalovirus-encoded apoptosis inhibitory protein vmia viral inhibition of inflammation: cowpox virus encodes an inhibitor of the interleukin-1 beta converting enzyme vaccinia virus encodes a previously uncharacterized mitochondrial-associated inhibitor of apoptosis the vaccinia virus f1l protein interacts with the proapoptotic protein bak and inhibits bak activation the myxoma poxvirus protein, m11l, prevents apoptosis by direct interaction with the mitochondrial permeability transition pore myxoma virus m11l blocks apoptosis through inhibition of conformational activation of bax at the mitochondria structure of m11l: a myxoma virus structural homolog of the apoptosis inhibitor, bcl-2 a structural viral mimic of prosurvival bcl-2: a pivotal 
role for sequestering proapoptotic bax and bak functional and structural studies of the vaccinia virus virulence factor n1 reveal a bcl-2-like anti-apoptotic protein fowlpox virus encodes a bcl-2 homologue that protects cells from apoptotic death through interaction with the proapoptotic protein bak a novel bcl-2-like inhibitor of apoptosis is encoded by the parapoxvirus orf virus epstein-barr virus encodes a novel homolog of the bcl-2 oncogene that inhibits apoptosis and associates with bax and bak epstein-barr virus-coded bhrf1 protein, a viral homologue of bcl-2, protects human b cells from programmed cell death epstein-barr virus balf1 is a bcl-2-like antagonist of the herpesvirus antiapoptotic bcl-2 proteins bhrf1 of epstein-barr virus, which is homologous to human proto-oncogene bcl2, is not essential for transformation of b cells or for virus replication in vitro ultrastructural localization of bhrf1: an epstein-barr virus gene product which has homology with bcl-2 epstein-barr virus bhrf1 protein protects against cell death induced by dna-damaging agents and heterologous viral infection a bcl-2 homolog encoded by kaposi sarcoma-associated virus, human herpesvirus 8, inhibits apoptosis but does not heterodimerize with bax or bak herpesvirus saimiri encodes a functional homolog of the human bcl-2 oncogene solution structure of a bcl-2 homolog from kaposi sarcoma virus identification of the in vivo role of a viral bcl-2 antiapoptotic herpesvirus bcl-2 homologs escape caspase-mediated conversion to proapoptotic proteins characterization of an anti-apoptotic glycoprotein encoded by kaposi's sarcomaassociated herpesvirus which resembles a spliced variant of human survivin african swine fever virus gene a179l, a viral homologue of bcl-2, protects cells from programmed cell death a cytomegalovirus-encoded inhibitor of apoptosis that suppresses caspase-8 activation complex i binding by a virally encoded rna regulates mitochondria-induced cell death epstein-barr 
virus (ebv) nuclear antigen leader protein (ebna-lp) forms complexes with a cellular anti-apoptosis protein bcl-2 or its ebv counterpart bhrf1 through hs1-associated protein x-1 the hepatitis c virus ns2 protein is an inhibitor of cide-b-induced apoptosis e2 of hepatitis c virus inhibits apoptosis viral flice-inhibitory proteins (flips) prevent apoptosis induced by death receptors modulation of host gene expression by the k15 protein of kaposi's sarcomaassociated herpesvirus kshv vflip is essential for the survival of infected lymphoma cells the human herpes virus 8-encoded viral flice inhibitory protein protects against growth factor withdrawalinduced apoptosis via nf-kappa b activation mechanism for removal of tumor necrosis factor receptor 1 from the cell surface by the adenovirus ridalpha/beta complex the adenovirus e3/10.4k-14.5k proteins downmodulate the apoptosis receptor fas/apo-1 by inducing its internalization prevention of apoptosis by a baculovirus gene during infection of insect cells control of programmed cell death by the baculovirus genes p35 and iap baculovirus p35 prevents developmentally programmed cell death and rescues a ced-9 mutant in the nematode caenorhabditis elegans inhibition of the caenorhabditis elegans cell-death protease ced-3 by a ced-3 cleavage site in baculovirus p35 protein an apoptosis-inhibiting baculovirus gene with a zinc finger-like motif cloning and expression of apoptosis inhibitory protein homologs that function to inhibit apoptosis and/or bind tumor necrosis factor receptor-associated factors drosophila homologs of baculovirus inhibitor of apoptosis proteins function to block cell death iaps block apoptotic events induced by caspase-8 and cytochrome c by direct inhibition of distinct caspases african swine fever virus iap-like protein induces the activation of nuclear factor kappa b nuclear and cytoplasmic survivin: molecular mechanism, prognostic, and therapeutic potential a novel inhibitory mechanism of 
mitochondrion-dependent apoptosis by a herpesviral protein we apologize to our colleagues for not having been able to cite their work. key: cord-259505-7hiss0j3 authors: kong, qingming; xue, chunyi; ren, xiangpeng; zhang, chengwen; li, linlin; shu, dingming; bi, yingzuo; cao, yongchang title: proteomic analysis of purified coronavirus infectious bronchitis virus particles date: 2010-06-09 journal: proteome sci doi: 10.1186/1477-5956-8-29 sha: doc_id: 259505 cord_uid: 7hiss0j3 background: infectious bronchitis virus (ibv) is the coronavirus of domestic chickens causing major economic losses to the poultry industry. because of the complexity of the ibv life cycle and the small number of viral structural proteins, important virus-host relationships likely remain to be discovered. toward this goal, we used two-dimensional gel electrophoresis fractionation coupled to mass spectrometry identification to perform a comprehensive proteomic analysis of purified ibv particles. results: apart from the virus-encoded structural proteins, we detected 60 host proteins in the purified virions which can be grouped into several functional categories including intracellular trafficking proteins (20%), molecular chaperones (18%), macromolecular biosynthesis proteins (17%), cytoskeletal proteins (15%), signal transport proteins (15%), protein degradation (8%), chromosome associated proteins (2%), ribosomal proteins (2%), and other function proteins (3%). interestingly, 21 of the total host proteins have not been reported to be present in virions of other virus families, such as major vault protein, tenp protein, ovalbumin, and scavenger receptor protein. following identification of the host proteins by proteomic methods, the presence of 4 proteins in the purified ibv preparation was verified by western blotting and immunogold labeling detection. 
conclusions: the results present the first standard proteomic profile of ibv and may facilitate the understanding of the pathogenic mechanisms. infectious bronchitis virus (ibv), the coronavirus of domestic chickens that causes acute, highly contagious respiratory disease, is one of the most important causes of economic loss in the poultry industry. ibv is an enveloped virus with a continuous, positive-sense, single-stranded rna genome, which is the largest of any rna virus characterized [1] and encodes four types of structural proteins in the invariable gene order 5'-s-e-m-n-3'. the spike (s) glycoprotein, together with the small envelope (e) protein and the matrix (m) glycoprotein, constitutes the viral envelope, whereas the nucleocapsid (n) protein interacts with the genomic rna of the virus to form the viral nucleocapsid. proteins s, e, and m have been studied for their important roles in receptor binding and virus budding. s mediates attachment to cellular receptors and entry by fusion with cell membranes, whereas m, which interacts with the s and n proteins, is an essential component of the virion and plays pivotal roles in virion assembly, budding and maturation [2, 3]. in addition, s protein can inhibit host cell translation by interacting with eif3f [4] and the interaction between m and actin facilitates virion assembly and budding [5]. e is a poorly characterized small envelope protein present at low levels in the virions. the e protein appears to be critical for viral budding. another role for protein e is that it can promote apoptosis [6, 7]. viruses constantly adapt to and modulate the host environment during replication and propagation. to govern egress from the host cell and initiation of replication in the target cell, viruses carry some of the host proteins when released from infected cells. enveloped viruses, particularly those encoding only a small number of proteins, have the capability of incorporating numerous host proteins into or onto the newly formed viruses. 
knowing the protein composition of purified viral particles is an important prerequisite for functional studies, as it allows the analysis of specific proteins and their roles during the virus life cycle, resulting in a better understanding of the infection process and the pathogenesis of viruses. as a large number of complete virus genomes have been sequenced since the 1980s, more and more host proteins in different enveloped viruses have been studied using viral proteomic approaches. herpesviruses have been the most extensively studied in this respect, such as kaposi's sarcoma-associated herpesvirus (kshv) [8, 9], marek's disease virus (mdv) [10], epstein-barr virus (ebv) [11], human cytomegalovirus (hcmv) [12] and murine cytomegalovirus (mcmv) [13]. other double-stranded dna (dsdna) viruses including vaccinia virus have also contributed to a better understanding of this intriguing phenomenon [14] [15] [16]. furthermore, recent studies on identification of the incorporated host proteins in rna viruses have also been undertaken. for retroviruses, various studies in this research area have been performed on human immunodeficiency virus type 1 (hiv-1) [17] [18] [19] [20] and moloney murine leukemia virus (mmlv) [21]. for orthomyxoviruses and paramyxoviruses, numerous host proteins have been found incorporated into avian influenza virus (aiv) particles and respiratory syncytial virus (rsv) particles, respectively [22] [23] [24]. to date, no study of the host proteins in the virions of coronaviruses has been performed. in this study, we used two-dimensional gel electrophoresis fractionation coupled to mass spectrometry identification to perform a comprehensive proteomic analysis of purified ibv particles. our analysis resulted in the identification of 2 virus-encoded structural proteins and 60 incorporated host proteins. in addition, we discuss the functional implications of some host proteins in ibv infection and pathogenesis. 
viral proteomic analysis requires a highly purified preparation of virions. no permissive cell line capable of supporting productive replication of ibv was available. although primary chick embryo kidney cells (cek) and chick kidney cells (ck) are capable of supporting productive replication of ibv, their poor yields prohibit them from being used for producing large quantities of ibv. in order to obtain large quantities of ibv virions, this study selected 10-day-old spf embryonated chicken eggs for the growth of ibv strain h52. the allantoic fluid (af) enriched in h52 was clarified by differential centrifugation in order to remove contaminating nuclei, mitochondria, lysosomes, peroxisomes and other material from the chicken embryo. the virus was concentrated through a 20% (wt/vol) sucrose cushion before being purified over a non-linear 20%-50% sucrose-tne (tris-buffered saline including 50 mm tris, 100 mm nacl, 1 mm edta, ph 7.4) gradient. two distinct types of ibv particles were isolated by sucrose density gradients. the higher-density particles banded at 30%-40% sucrose-tne while the lower-density particles banded at 20%-30% sucrose-tne. the purity of ibv was confirmed by electron microscopy analysis following negative staining to ensure that the virions had normal viral morphology and to exclude the possible inclusion of vesicles, other cellular organelles and debris (fig. 1a). an abundance of intact virions was observed without obvious contamination from host cellular material. proteins in purified virions were separated by 12% sodium dodecyl sulfate polyacrylamide gel electrophoresis (sds-page) and stained with coomassie brilliant blue (fig. 1b). in addition to the presumed major virus-encoded structural proteins, some lighter bands were visible that may represent cellular proteins. furthermore, the four virus-encoded structural proteins were confirmed by immunoblotting (fig. 1c). 
taken together, the best purification of the ibv was obtained after differential centrifugation to remove the cellular contamination and concentration through a 20% (wt/vol) sucrose cushion followed by a non-linear sucrose gradient.
figure 1 analysis of purified avian infectious bronchitis virus preparations. a: specific pathogen free (spf) chick embryo-grown major particles h52 from 30%-40% sucrose density gradients, negatively stained with 2% potassium phosphotungstate, ph 6.5. b: sds-page separation of proteins in a purified h52 preparation. 8 μg of proteins were separated on a 5-17.5% polyacrylamide gel and stained with coomassie blue. c: western blotting of the purified h52 virions. viral proteins were separated on 12% polyacrylamide gel and analyzed by western blot with chicken polyclonal antibody against infectious bronchitis virus (massachusetts). the identified viral proteins are indicated. s: spike, n: nucleocapsid, m: membrane, e: envelope.
to obtain a detailed protein composition profile associated with the ibv particles, the proteins in purified ibv particles were extracted for 2-de experiments. to authenticate the results and to compensate for the variability of gel electrophoresis, three independent experiments were performed with three replicate gels for each experiment. the viral protein profiles were analyzed by 2-de with 250 μg of protein. after the electrophoresis separation, gels were stained with silver and processed for image analysis. for ibv particle-associated proteins separated in the ph 3-10 range, 88 protein spots were detected (fig. 2). to identify the proteins associated with ibv particles, all protein spots detected in the gels were excised and in-gel digested with trypsin followed by maldi-tof/tof (matrix-assisted laser desorption/ionization-time of flight mass spectrometry) analysis. database search analysis revealed that 2 virus-encoded structural proteins and 60 host proteins were successfully identified. 
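the in-gel trypsin digestion step above produces the peptides whose masses are matched against a database. as a rough illustration only, the sketch below performs an in silico tryptic digest using the commonly cited rule that trypsin cleaves on the c-terminal side of lysine (k) or arginine (r) unless the next residue is proline (p). the function name and the toy sequence are hypothetical; missed cleavages and modifications are ignored, and the search software and settings actually used in the study are not described here.

```python
def tryptic_digest(sequence: str) -> list:
    """Split a protein sequence into fully cleaved tryptic peptides.

    Assumed rule: cleave after K or R, but not when the following residue is P.
    Missed cleavages are not modelled in this sketch.
    """
    seq = sequence.upper()
    peptides = []
    start = 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # c-terminal peptide without K/R at its end
    return peptides


# toy, made-up sequence (not an ibv or chicken protein)
print(tryptic_digest("MAKRPLLKGGR"))
```

a protein with few k/r residues, or one that is small and hydrophobic, yields few peptides suitable for detection, which is one reason some proteins are harder to identify by this approach.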
detailed information on the full set of identified proteins is listed in table 1 ; additional file 1. to better understand the host proteins incorporated into the ibv virion and the roles they play in ibv infection, these proteins were categorized by biological process according to the uniprot knowledge database (swiss-prot/trembl) and the gene ontology database. the 60 identified host proteins comprised cytoskeleton proteins, molecular chaperones, macromolecular biosynthesis proteins, signal transport proteins and glycolytic enzymes (table s1 ; additional file 1). these host proteins were located mainly in the cytoplasm, including cytoskeleton, cytosol and mitochondrion (fig. 3). to confirm the presence of host proteins in the purified ibv particles after their identification by the proteomic method, we performed immunoblotting experiments. the ibv preparation purified from af was analyzed for the presence of n protein, actin, hsp90, annexin a2 and tubulin (fig. 4). extracts from 10-day-old specific pathogen free (spf) embryonated chicken eggs were included as a positive control. when analyzing the results of virion proteomic studies, the challenge is to prove that the host proteins are really a part of the virion and that they are not just attached non-specifically to the outside of the virus. to address this question, the af from uninfected 10-day-old spf embryonated chicken eggs was subjected in parallel to our standard density centrifugation procedure and the protein extract from the 30%-40% sucrose gradient fraction was used as a negative control. gradient-purified virions and the control were separated on 12% sds-page gels, transferred to pvdf membranes, and probed with the appropriate antibodies. as shown in fig. 4, actin, tubulin and annexin a2 were found in both the purified virions and the positive control but not in the af extracts from uninfected 10-day-old spf embryonated chicken eggs. 
hsp90 is a member of the heat shock protein family which is upregulated in response to stress and has low abundance in unstressed cells. in the present study, we detected it only in the purified virions but not in normal cells. it is an expected result that we also detected actin and tubulin in the af extracts from uninfected 10-day-old spf embryonated chicken eggs, which resulted from their high concentrations in all eukaryotic cells and subcellular fractions. to provide additional evidence that the host proteins are not just derived from a microvesicle or exosome that co-purified with the virus, we used the bromelain protease protection assay, which has been shown to efficiently remove microvesicles from ibv virion preparations [25]. protease treatment of the purified virus preparation strips proteins off any contaminating microvesicles and off the outside of virus particles, such as s protein. in doing so, the microvesicles become lighter than the virions and therefore the virions can be isolated by density centrifugation. proteins that are inside the virion are protected by the lipid envelope and therefore will remain after the protease treatment. immunogold labeling of the purified virions was then performed (fig. 5). virions were either mock treated or subjected to digestion with bromelain and then were incubated with antibodies against actin, annexin a2, hsp90, and ibv massachusetts strain, and with secondary gold-conjugated antibodies, followed by negative staining. one or two gold particles located on the surface of a virion could be seen for hsp90. this was markedly less than the degree of labeling seen for the other proteins, which is consistent with the fact that there is most likely far more actin and annexin a2 present on the virions than hsp90. in addition, the abundance of actin detected in the 2-de gels is much higher than that of hsp90. viruses exploit multiple host proteins during infection for successful entry, replication, egress, and evasion. 
this is especially true for rna viruses because they encode only a few proteins. knowing the protein composition of the infectious viral particle is a prerequisite for studying the role of host proteins during infection. to our knowledge, incorporation of host proteins in the enveloped virus family coronaviridae has not been investigated so far. in this study, we revealed the presence of virus-encoded proteins in infectious bronchitis particles and, for the first time, confirmed the incorporation of host proteins. a total of 2 viral and 60 host proteins associated with purified ibv particles were identified. in the present study, we failed to detect the m and e proteins, while the other two structural proteins, n and s, were identified successfully. n protein is easy to identify because it is the most abundant virus-derived protein produced throughout the process of virus infection, whereas s protein, a major structural protein of ibv located on the surface of viral particles, is also easy to identify because it is large (about 175 kda) and yields many tryptic peptide peaks in the ms analysis. the identification of the m and e proteins of coronaviruses by ms is considered difficult because of their properties, especially in the case of e protein, for the following reasons [26]. first, e protein is very hydrophobic. second, it has low abundance in the virions. third, e is a low molecular weight protein with a mass of about 12 kda. fourth, e protein contains two cysteines, which may form disulfide bonds within itself or with other proteins and make e protein difficult to reduce and subsequently digest. of the host proteins, 39 have also been described in virions of quite diverse virus families, such as herpesviruses, poxviruses, paramyxoviruses and retroviruses [27, 28]. there are several possible explanations for why the incorporated host proteins are shared with other virus families. first, they are all enveloped viruses. 
enveloped viruses contain the viral genome and core proteins wrapped within one or more membranes, which are acquired from the host cell during virus assembly and budding. these viruses all share some fundamental features at particular stages of their life cycles, and these host proteins are involved in the common processes. second, these host proteins are either highly abundant cytosolic proteins or enriched at the virus budding sites. the highly abundant cytosolic proteins found within both ibv and other viral particles include beta actin, tubulin, annexins and enolase. proteins enriched at the virus budding sites, including hsp70, hsp90 and gapdh, were also identified in ibv and other viruses [9, 11, 12, 14, 19, 24]. some host proteins may be specifically incorporated into the virions. in this study, 21 of the host proteins are reported in virions for the first time. the identified host proteins function in diverse biological processes, and some functional groups are analyzed here. these proteins participate in a broad array of cellular functions and are involved in many processes in the viral life cycle. the potential roles of some of these proteins are discussed below in relation to ibv infection, pathogenesis and the early host antiviral response. numerous viral proteins interact with cytoskeletal elements. many viruses, such as retroviruses, herpesviruses and picornaviruses, even contain the main cytoskeletal element actin in their infectious particles. the actin transport machinery has been shown to be critical at almost every step of the infectious cycle [29]. actin has been found in preparations from several types of retroviruses and paramyxoviruses. for coronaviruses, an association of m with cytoskeletal elements has been reported [5], which suggests an essential function of actin in the replication cycle of the coronavirus ibv. 
in our studies, actin and tubulin were both present in the interior of infectious bronchitis particles, and this observation most likely reflects their active participation in moving the viral components to assembly sites. actin and tubulin have been characterized as the major folding substrates for cct (chaperonin containing tcp-1, also termed tric). both cytoskeletal proteins require interaction with cct, in vivo and in vitro, to fold to their native states [30]. cct, the eukaryotic cytosolic chaperonin, is the most divergent and complex of the group ii chaperonins and is thought to assist the folding of a small set of proteins. in addition to actin and tubulin, cct has been found to interact, either in vitro or in vivo, with other cytoskeletal proteins, cell division control protein 20, protein phosphatase type 2a, and guanine nucleotide-binding protein (g protein) beta subunit [31], all of which were found to be associated with infectious bronchitis particles in the present study. notably, certain viral proteins, such as the epstein-barr virus-encoded nuclear protein (ebna-3), the hepatitis b virus capsid and the type d retrovirus gag polyprotein, are also folded by cct [32-34]. thus, cct may have an important role in the assembly of infectious bronchitis virus proteins. other cytoskeletal proteins found to be associated with infectious bronchitis particles are actin-related proteins, wd repeat-containing protein, destrin and annexin. several annexin family members (a2, a5 and a11) were identified in purified infectious bronchitis particles. annexins are a well-known multigene family of ca2+-regulated phospholipid-binding and membrane-binding proteins with diverse functions. the presence of annexin a2 is thought to support viral binding, fusion and replication [35-39]. 
annexin a5, which interacts with annexin a2, has the opposite effect and prevents fusion, which may indicate a regulatory role [38]. annexin a2 binds tightly to a member of the s100 family of calcium-binding proteins, s100a10 (p11). upon binding, annexin a2 and p11 form a heterotetramer that is capable of binding two membrane surfaces simultaneously, which potentially promotes fusion events and also plays a role in exocytosis [40]. the p11 protein was also detected in our analysis, suggesting that ibv also incorporates this complex. other s100 family members, such as s100a6 and s100a11, were also detected in viral samples and could play various roles in fusion and membrane organization [41, 42]. heat-shock proteins (hsps) are known to be multifunctional proteins. they facilitate the folding and unfolding of proteins, participate in vesicular transport processes, prevent protein aggregation in the densely packed cytosol and are involved in signaling processes. most, but not all, hsps are molecular chaperones. several viruses require host molecular chaperones for entry, replication, and assembly, as well as other steps in viral production [43, 44]. hsp70 and hsp90 were found to be incorporated into ibv. hsp70 interacts with various viral proteins and may be involved in the assembly of adenovirus [45], enterovirus [46], vaccinia virus [47] and hantaan virus [48]. alternatively, upon entry into susceptible target cells, virion-associated hsp70 might participate in early events of infection. for example, hsp70 might actively uncoat the viral capsid in a manner similar to its role in the uncoating of clathrin cages [49]. hsp70 and hsp90 have been shown to interact with hepatitis b virus reverse transcriptase and to facilitate the initiation of viral dna synthesis from hepatitis b virus pregenomic rna [50, 51]. for sendai virus (sv), viral protein synthesis is inhibited as long as hsp70 synthesis occurs [52]. 
thus, hsp70 in ibv virions might serve a similar function in the virus life cycle. the chaperone hsp90 has been identified as an essential factor in the folding and maturation of picornavirus capsid proteins [53]. the involvement of hsp90 in viral replication has also been reported for many viruses, and it has been demonstrated that hsp90 inhibition blocks viral replication [54]. recently, a role for hsp90 in the control of hepatitis c, flock house and influenza virus polymerase function has been shown [55-61], and it has been proposed that hsp90 is a major host factor of central importance for the replication of a wide spectrum of rna viruses [56], which suggests that hsp90 may also play a crucial role in ibv replication. the importance of hsp90 for the replication of multiple viruses opens up an interesting possibility for developing new antiviral therapies that have not yielded drug-resistant viruses [62]. some proteins involved in the glycolytic pathway were identified, such as aldehyde dehydrogenase 9 family, member a1 (aldh9a1), glyceraldehyde-3-phosphate dehydrogenase (gapdh) and alpha-enolase, which have also been identified in other viral particles, such as hiv-1, mmlv, hcmv, kshv and aiv [8, 12, 19, 21, 24]. some studies have suggested that several glycolytic enzymes interact with microtubules and tubulin [63-65] and may also contribute to transcription of rna virus genomes. in higher eukaryotes, enolase is found as a dimer of α, β or γ subunits. all mammalian enolase isoforms have been reported to be capable of stimulating transcription of the sv genome [66]. 
gapdh is a well-characterized key enzyme in glycolysis, but recent evidence suggests that it also has rna-binding properties and binds to the untranslated rna sequences of several different viruses, including human parainfluenza virus type 3 (hpiv3), japanese encephalitis virus (jev), hepatitis a virus (hav), hepatitis b virus (hbv) and hepatitis c virus (hcv) [67-71]. in the case of hpiv3, gapdh has been reported to inhibit actin-dependent in vitro transcription and is also present in purified virions [67, 72]. in vitro data indicate that gapdh serves a negative regulatory role in hpiv3 transcription in a phosphorylation-dependent manner [72]. in addition to these host proteins associated with enveloped viruses, whose roles in the virus life cycle have been well studied, we also identified 21 host proteins in purified infectious bronchitis particles that have not been described in virions of other virus families, such as apolipoprotein a-i (apoa-i), fatty acid-binding protein 3, ovalbumin, tenp protein, tumor protein, translationally-controlled 1 and transthyretin. apoa-i, a major constituent of high-density lipoproteins, alters plasma membrane morphology by participating in the reverse transport of cholesterol through binding to atp-binding cassette transporter a1 [73], and activates small gtp-binding protein cdc42-associated signaling, including apoa-i-induced cholesterol efflux, protein kinases, and actin polymerization [74]. importantly, apoa-i can inhibit herpes simplex virus (hsv)-induced cell fusion at physiological concentrations. this function may be related to the structure of apoa-i, and its amphipathic peptide analogue was subsequently also found to inhibit cell fusion, both in hiv-1-infected t cells and in recombinant vaccinia virus-infected cd4+ hela cells expressing hiv envelope protein on their surfaces [75]. 
these results indicate that amphipathic helices may be useful in designing novel antiviral agents that inhibit penetration and spread of enveloped viruses. ovalbumin is the main protein found in egg white, making up 60-65% of the total protein. the chicken ovalbumin upstream promoter transcription factors (coup-tfs), members of the steroid/thyroid hormone receptor superfamily, bind to a negative regulatory region in the human immunodeficiency virus type 1 long terminal repeat (ltr). the ltr contains a negative regulatory element which downregulates the rate of ltr-directed transcription and hiv-1 replication [76]. the interaction between ovalbumin and np from influenza a virus, as well as glycoprotein c from herpes simplex virus type i, was reported long ago [76]. tenp from g. gallus, by contrast, was identified as the product of a gene transiently expressed in neural precursor cells of the retina and brain, and has been proposed to function in the transition to cell differentiation in neurogenesis. when expressed in chicken embryonic fibroblast cells, tenp was immunodetected in membrane fractions, implying that tenp might be a membrane protein, as predicted by a computer analysis of its primary sequence [77]. to date, there have been no reports of tenp being associated with viruses, but it is an abundant protein in purified infectious bronchitis particles, which suggests that it may be a requisite host protein in the ibv life cycle. the present study 1) provides the first proteomic analysis of infectious bronchitis particles, 2) establishes the most comprehensive proteomic index of ibv and 3) shows that most of the virion-incorporated host proteins have central roles in the virus life cycle. although some of these proteins may be associated with virus biology, further investigation of their functions is needed and may facilitate the understanding of the pathogenic mechanisms. 
the ibv strain h52 was obtained from qianyuanhao biological corporation limited (beijing, china). virus was propagated in 10-day-old specific pathogen free (spf) embryonated chicken eggs (beijing merial vital laboratory animal technology co, ltd, beijing, china) for 48 h at 37°c. the allantoic fluid (af) containing ibv h52 was clarified by differential centrifugation. af was first centrifuged at 3,000 × g for 30 min and then the supernatant was centrifuged at 12,000 × g for 30 min. clarification and all subsequent centrifugations were performed at 4°c. the virus was sedimented through 5.5 ml of 20% (wt/vol) sucrose in tne buffer (50 mm tris, 100 mm nacl, 1 mm edta, ph 7.4) by centrifugation in a 70ti rotor (beckman coulter, optima™ l-100xp preparative ultracentrifuge) at 75,000 × g for 1.5 h. concentrated virions were then diluted with 1.0 ml tne buffer and centrifuged to equilibrium in 11.5 ml non-linear 20%-50% sucrose-tne gradients at 75,000 × g for 2.5 h in a sw41 rotor (beckman coulter, optima™ l-100xp preparative ultracentrifuge). purified virions were diluted with tne buffer and pelleted by sedimentation at 75,000 × g for 1.5 h in a sw41 rotor to remove the sucrose. the purified ibv pellets were stored at -80°c until use. the purified ibv particles were dissolved in about 300 μl lysis buffer (7 m urea, 2 m thiourea, 2% triton x-100, 65 mm dtt, 2% biolyte ph 3-10) and incubated for 60 min at 4°c. the lysate was then sonicated for 4 min (pulses of 2 s on and 3 s off) in an ice bath sonicator. insoluble material was removed by centrifugation at 12,000 × g for 30 min. the supernatant was collected and the protein concentration was determined with the bio-rad protein assay kit ii according to the manufacturer's instructions. the samples were then aliquoted and stored at -80°c until used for further analysis. 
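the centrifugation steps in the purification protocol above are given as relative centrifugal force (× g), which has to be converted to a rotor speed for whichever rotor is used. the following python sketch applies the standard relation rcf = 1.118 × 10^-5 × r × rpm^2 (r in cm). the radii in the example are illustrative assumptions, not values reported in this study; the real radius should be taken from the rotor documentation.

```python
import math

# Standard relation: RCF = 1.118e-5 * radius_cm * rpm**2
RCF_CONSTANT = 1.118e-5


def rcf_to_rpm(rcf: float, radius_cm: float) -> float:
    """Return the rotor speed (rpm) that produces `rcf` (x g) at `radius_cm`."""
    if rcf <= 0 or radius_cm <= 0:
        raise ValueError("rcf and radius_cm must be positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


def rpm_to_rcf(rpm: float, radius_cm: float) -> float:
    """Return the relative centrifugal force (x g) at `rpm` and `radius_cm`."""
    if rpm < 0 or radius_cm <= 0:
        raise ValueError("rpm must be non-negative and radius_cm positive")
    return RCF_CONSTANT * radius_cm * rpm ** 2


if __name__ == "__main__":
    # Radii below are hypothetical placeholders, NOT rotor specifications.
    steps = [
        ("clarification, 12,000 x g", 12_000, 10.0),
        ("sucrose cushion, 75,000 x g", 75_000, 9.0),
        ("sucrose gradient, 75,000 x g", 75_000, 11.0),
    ]
    for label, rcf, radius_cm in steps:
        rpm = rcf_to_rpm(rcf, radius_cm)
        print(f"{label}: r = {radius_cm} cm -> about {rpm:.0f} rpm")
```

because rcf scales with the radius, the same 75,000 × g corresponds to different speeds in the fixed-angle and swinging-bucket rotors, which is why the force rather than the speed is the quantity worth reporting.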
purified virus particles were treated with bromelain (bb0243, bbi) at 0.2 mg/ml in 50 mm dtt (ph 7.2) in dulbecco's phosphate buffered saline (pbs) at 37°c for 15 min. after incubation, the treated virus was directly centrifuged to equilibrium in 11.5 ml non-linear 20%-50% sucrose-tne gradients at 75,000 × g for 2.5 h in a sw41 rotor (beckman coulter, optima™ l-100xp preparative ultracentrifuge). purified virions were diluted with tne buffer and pelleted by sedimentation at 75,000 × g for 1.5 h in a sw41 rotor to remove the sucrose and then subjected to immunogold labeling and electron microscopy analysis. two-dimensional gel electrophoresis was performed using 18 cm immobiline drystrip gels (ipg strips, ph 3-10 non-linear, ge healthcare). first, 100 μl samples containing 250 μg protein were added to 400 μl sample rehydration buffer (7 m urea, 2 m thiourea, 2% (w/v) chaps, 65 mm dtt, 0.2% bio-lyte ph 3-10) and incubated for 30 min at 37°c prior to their separation by isoelectric focusing (ief) in the first dimension. the ipg strips were rehydrated at 20°c for 12 h by a passive rehydration method. ief was carried out for a total of 45 kvh at 20°c on an ettan ipgphor iii electrophoresis unit (ge healthcare). second, the ipg strips were transferred to the second-dimension gel. before this step, the ipg strips were reduced and alkylated in an equilibration buffer containing 50 mm tris-hcl, ph 8.6, 6 m urea, 2% sds and 30% glycerol, supplemented with 1% (w/v) dl-dithiothreitol (dtt) for reduction or, in place of dtt, 2.5% iodoacetamide (iaa) for alkylation, for 15 min. subsequently, the viral protein samples were separated at 140 v on linear 5%-17.5% gradient sodium dodecyl sulfate polyacrylamide gels (sds-page) in tris-glycine buffer (192 mm glycine, 25 mm tris, 0.1% sds, ph 8.3) for about 10 h. 
third, proteins in the gel were stained by a modified silver staining method compatible with ms [78], and the gels were scanned at a resolution of 600 dpi using an imagescanner™ iii (ge healthcare). gel pieces (1.0 mm³) containing whole protein spots from the 2d gel were excised and washed three times with 50 mm ammonium bicarbonate (nh4hco3, amresco). the gel pieces were destained with 15 mm potassium ferricyanide (k3fe(cn)6, amresco) and 50 mm sodium thiosulfate (na2s2o3, amresco) in 50 mm nh4hco3 and dehydrated in 100% acetonitrile (acn, wako) until they turned white. after drying in a speedvac concentrator (thermo savant, usa) for about 100 min, the gel pieces were incubated at 37°c overnight with enough 12.5 ng/μl trypsin (sequencing grade, promega) to cover them completely. the gel pieces were then extracted three times in 50% acn in water containing 5% trifluoroacetic acid (tfa, wako). the supernatants were pooled and dried thoroughly in the speedvac. the protein digests were resuspended in 5 μl of 0.1% tfa, and the peptide samples were then mixed (1:1) with a matrix consisting of a saturated solution of α-cyano-4-hydroxycinnamic acid (α-cca, sigma) in 50% acn containing 0.1% tfa. a 0.8 μl aliquot was spotted onto stainless steel target plates. peptide mass spectra were obtained on an applied biosystems/mds sciex 4800 maldi tof/tof plus mass spectrometer. data were acquired in positive ms reflector mode, using a calmix5 standard to calibrate the instrument (abi4800 calibration mixture). mass spectra were obtained from each sample spot by accumulation of 900 laser shots in a mass range of 800-3500. for ms/ms spectra, the 5-10 most abundant precursor ions per sample were selected for subsequent fragmentation, and 1200 laser shots were accumulated per precursor ion. 
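the trypsin digestion and the 800-3500 acquisition window described above can be mimicked in silico to see which peptides of a protein would be expected in a peptide mass fingerprint. the python sketch below cleaves after k or r except when the next residue is p (a common simplification of trypsin specificity), allows a configurable number of missed cleavages, and computes monoisotopic [m+h]+ values for unmodified peptides. the function names and the example sequence are hypothetical and serve only as an illustration; this is not the software used in the study.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (N-terminal H + C-terminal OH)
PROTON = 1.00728   # charge carrier for the singly protonated ion


def cleave(sequence):
    """Split a sequence after K or R, unless the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        nxt = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def digest(sequence, missed_cleavages=1):
    """Return all peptides with up to `missed_cleavages` missed sites."""
    pieces = cleave(sequence)
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(pieces[i:j + 1]))
    return peptides


def mh_plus(peptide):
    """Monoisotopic [M+H]+ of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


def peptides_in_window(sequence, low=800.0, high=3500.0, missed_cleavages=1):
    """Peptides whose [M+H]+ falls inside the acquisition window."""
    hits = []
    for pep in digest(sequence, missed_cleavages):
        mass = mh_plus(pep)
        if low <= mass <= high:
            hits.append((pep, round(mass, 4)))
    return hits


if __name__ == "__main__":
    example = "MASLTKGDEVRPLNNQWFAYKHCEIR"  # arbitrary, made-up sequence
    for pep, mass in peptides_in_window(example):
        print(pep, mass)
```

short peptides fall below the lower mass limit and are never observed, which is one reason small, hydrophobic proteins with few suitable cleavage sites are hard to identify by this approach.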
both the ms and ms/ms data were interpreted and processed with gps explorer software (v3.6, applied biosystems); the ms and ms/ms spectra obtained for each spot were then combined and submitted to the mascot search engine (v2.1, matrix science, london, u.k.) through gps explorer and searched with the following parameters: trypsin as the digestion enzyme, one missed cleavage site, partial (variable) modifications of cysteine carbamidomethylation and methionine oxidation, no fixed modifications, ms tolerance of 60 ppm, ms/ms tolerance of 0.25 da. a mascot protein score (based on combined ms and ms/ms spectra) of greater than 57 (p ≤ 0.05) in the ipi_chicken (v3.49) database or greater than 67 (p ≤ 0.05) in the ncbinr database was accepted. mouse monoclonal antibodies against actin (mab1501) and hsp90 (05-594) were purchased from millipore. rabbit polyclonal antibodies against annexin a2 (ab40943) and tubulin alpha-1 (ab4074), and chicken polyclonal antibody against ibv (massachusetts) (ab31671) were purchased from abcam. mouse monoclonal antibody against nucleoprotein of ibv (3bn1) was purchased from hytest ltd. as controls, af from 10-day-old spf embryonated chicken eggs processed with the same protocol used for the purification of ibv particles, and protein extracted from normal 10-day-old spf embryonated chicken eggs, were included in the western blot analysis. samples were separated at 120 v on linear 5%-17.5% sds-page gels with 5% stacking gels in tris-glycine buffer for about 3 h. for purified virus, 10 μg of total protein was loaded per lane. for the controls, a total of 15 μg of protein was loaded. after separation by sds-page, the proteins were transferred to a polyvinylidene fluoride membrane (pvdf, p/n 66485, biotrace, pall corporation). the membrane was blocked in freshly prepared 5% bovine serum albumin (bsa) with 0.05% tween-20 for 2 h at room temperature with constant agitation. 
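the mass tolerances quoted in the mascot search parameters above (60 ppm for ms, 0.25 da for ms/ms) define how far an observed mass may deviate from a theoretical one and still count as a match. a minimal python sketch of such a tolerance check is given below. the peak values in the example are made-up numbers, the function names are illustrative, and the probabilistic scoring that mascot applies on top of the matching is not reproduced here.

```python
def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def within_ppm(observed, theoretical, tol_ppm=60.0):
    """True if the precursor mass error is within a relative (ppm) tolerance."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


def within_da(observed, theoretical, tol_da=0.25):
    """True if the fragment mass error is within an absolute (Da) tolerance."""
    return abs(observed - theoretical) <= tol_da


def match_peaks(observed_peaks, theoretical_masses, tol_ppm=60.0):
    """Pair each observed peak with every theoretical mass inside the tolerance."""
    matches = []
    for obs in observed_peaks:
        for theo in theoretical_masses:
            if within_ppm(obs, theo, tol_ppm):
                matches.append((obs, theo, round(ppm_error(obs, theo), 1)))
    return matches


if __name__ == "__main__":
    # Hypothetical observed peaks and theoretical peptide masses.
    observed = [1000.05, 1500.20, 2200.10]
    theoretical = [1000.00, 1500.00, 2200.00]
    for obs, theo, err in match_peaks(observed, theoretical):
        print(f"observed {obs} matches {theo} ({err} ppm)")
```

a ppm tolerance scales with mass: 60 ppm corresponds to 0.06 da at a mass of 1000 and to 0.18 da at 3000, whereas the 0.25 da ms/ms tolerance is the same for every fragment.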
the pvdf membrane was washed three times with tris buffered saline plus 0.2% tween 20 (tbst) and then incubated with appropriately diluted primary antibodies for 2 h at room temperature or overnight at 4°c with agitation. anti-rabbit or anti-mouse immunoglobulin g antibody conjugated to horseradish peroxidase (hrp) (00001-14, proteintech group, inc) was used as the secondary antibody, and the pvdf membrane was incubated with it for 1 h at room temperature. a chemiluminescence system (ar1022, boster bio-technology co. ltd) was used for detection of antibody-antigen complexes. rabbit polyclonal antibody against chicken igg (15 nm gold) (ab41500), goat polyclonal antibody against rabbit igg (5 nm gold) (ab27235) and goat polyclonal antibody against mouse igg (10 nm gold) (ab27241) were purchased from abcam. purified ibv particles were suspended in pbs (ph 7.4), collected onto 230-mesh formvar-coated nickel grids and adsorbed on the grids for 5 min. the viruses were fixed in 2% paraformaldehyde for 5 min at room temperature (rt), treated with triton x-100 (0.2%) in pbs (ph 7.4) for 5 min and then blocked with 5% bsa in pbs-tween 20 (ph 7.4) for 30 min at rt. all grids were then blocked with blocking buffer (5% bsa, 5% normal serum, 0.1% cold water skin gelatin, 10 mm phosphate buffer, 150 mm nacl, ph 7.4) for 30 min. after washing with pbs, immobilized virions were incubated for 1.5 h with 50 μg/ml primary antibody (in 1% bsa) and washed three times for 5 min in pbs/1% bsa. anti-rabbit or anti-mouse immunoglobulin g coupled to 10 nm colloidal gold particles was used as the secondary antibody, and virions were incubated with it for 40 min at room temperature. the grids were then washed extensively with pbs, washed twice more with distilled water to remove excess salt and negatively stained with 2% sodium phosphotungstate for 1 min. negatively stained virions were examined on a scan and transmission electron microscope. 
abbreviations 2d: two-dimensional; 2-de: two-dimensional electrophoresis; sds-page: sodium dodecyl sulfate polyacrylamide gel electrophoresis; ms: mass spectrometry; maldi-tof: matrix-assisted laser desorption/ionization time of flight mass spectrometry; spf: specific pathogen free; af: allantoic fluid; bsa: bovine serum albumin; dtt: dithiothreitol; iaa: iodoacetamide; acn: acetonitrile; tfa: trifluoroacetic acid; α-cca: α-cyano-4-hydroxycinnamic acid; tne: buffer containing 50 mm tris, 100 mm nacl and 1 mm edta, ph 7.4; pbs: phosphate-buffered saline; tbs: tris-buffered saline; tbst: tris buffered saline plus 0.2% tween 20; hrp: horseradish peroxidase; pi: isoelectric point; mw: molecular weight.
coronavirus avian infectious bronchitis virus envelope glycoprotein interactions in coronavirus assembly characterization of the coronavirus m protein and nucleocapsid interaction in infected cells coronavirus spike protein inhibits host cell translation by interaction with eif3f interaction of the coronavirus infectious bronchitis virus membrane protein with beta-actin and its implication in virion assembly and budding induction of apoptosis in murine coronavirus-infected 17cl-1 cells induction of apoptosis in murine coronavirus-infected cultured cells and demonstration of e protein as an apoptosis inducer host and viral proteins in the virion of kaposi's sarcoma-associated herpesvirus virion proteins of kaposi's sarcoma-associated herpesvirus a mass spectrometry-based proteomic approach to study marek's disease virus gene expression proteins of purified epstein-barr virus identification of proteins in human cytomegalovirus (hcmv) particles: the hcmv proteome identification of proteins associated with murine cytomegalovirus virions vaccinia virus proteome: identification of proteins in vaccinia virus intracellular mature virion particles krijnse locker j: identification of the major membrane and core proteins of vaccinia virus by two-dimensional electrophoresis 
protein composition of the vaccinia virus mature virion specific incorporation of heat shock protein 70 family members into primate lentiviral virions cellular proteins bound to immunodeficiency viruses: implications for pathogenesis and vaccines proteomic and biochemical analysis of purified human immunodeficiency virus type 1 produced from infected monocyte-derived macrophages proteomic analysis of human immunodeficiency virus using liquid chromatography/tandem mass spectrometry effectively distinguishes specific incorporated host proteins identification of host proteins associated with retroviral vector particles by proteomic analysis of highly purified vector preparations immunochemical identification of viral and nonviral proteins of the respiratory syncytial virus virion identification of cellular interaction partners of the influenza virus ribonucleoprotein complex and polymerase complex using proteomic-based approaches cellular proteins in influenza virus particles polypeptides of the surface projections and the ribonucleoprotein of avian infectious bronchitis virus proteomic analysis of sars associated coronavirus using two-dimensional liquid chromatography mass spectrometry and one-dimensional sodium dodecyl sulfate-polyacrylamide gel electrophoresis followed by mass spectrometric analysis plunder and stowaways: incorporation of cellular proteins by enveloped viruses viral proteomics interaction of epithelial ion channels with the actin-based cytoskeleton the t-complex polypeptide 1 complex is a chaperonin for tubulin and actin in vivo systematic identification of protein complexes in saccharomyces cerevisiae by mass spectrometry epstein-barr virus-encoded nuclear protein ebna-3 interacts with the epsilon-subunit of the t-complex protein 1 chaperonin complex a eukaryotic cytosolic chaperonin is associated with a high molecular weight intermediate in the assembly of hepatitis b virus capsid, a multimeric particle type d retrovirus gag polyprotein interacts 
with the cytosolic chaperonin tric annexin ii enhances cytomegalovirus binding and fusion to phospholipid membranes secretory leukocyte protease inhibitor binds to annexin ii, a cofactor for macrophage hiv-1 infection annexin ii incorporated into influenza virus particles supports virus replication by converting plasminogen into plasmin annexin 2-mediated enhancement of cytomegalovirus infection opposes inhibition by annexin 1 or annexin 5 annexin 2: a novel human immunodeficiency virus type 1 gag binding protein involved in replication in monocyte-derived macrophages annexins: linking ca2+ signalling to membrane dynamics intracellular and extracellular roles of s100 proteins s100-annexin complexes: some insights from structural studies recruitment of hsp70 chaperones: a crucial part of viral survival strategies synthesis and quality control of viral membrane proteins association of hsp70 with the adenovirus type 5 fiber protein in infected hep-2 cells association of heat shock protein 70 with enterovirus capsid precursor p1 in infected human cells vaccinia virus infection induces a stress response that leads to association of hsp70 with viral proteins increased expression of hsp70 and co-localization with nuclear protein in cells infected with the hantaan virus uncoating atpase is a member of the 70 kilodalton family of stress proteins hsp90 is required for the activity of a hepatitis b virus reverse transcriptase two-dimensional blue native/sds-page analysis reveals heat shock protein chaperone machinery involved in hepatitis b virus production in hepg2.2.15 cells selective inhibition of virus protein synthesis by prostaglandin a1: a translational block associated with hsp70 synthesis evolutionary constraints on chaperone-mediated folding provide an antiviral approach refractory to development of drug resistance molecular chaperone hsp90 is important for vaccinia virus growth in cells hsp90 inhibitors suppress hcv replication in replicon cells and humanized liver 
mice hepatitis c virus rna replication is regulated by fkbp8 and hsp90 the cellular chaperone heat shock protein 90 facilitates flock house virus rna replication in drosophila cells identification of hsp90 as a stimulatory host factor involved in influenza virus rna synthesis involvement of hsp90 in assembly and nuclear import of influenza virus rna polymerase subunits antiviral activity and rna polymerase degradation following hsp90 inhibition in a range of negative strand viruses herpes simplex virus type 1 dna polymerase requires the mammalian chaperone hsp90 for proper localization to the nucleus development and application of hsp90 inhibitors enhanced association of mutant triosephosphate isomerase to red cell membranes and to brain microtubules a glycolytic enzyme binding domain on tubulin glycolytic enzyme-tubulin interactions: role of tubulin carboxy terminals enolase, a cellular glycolytic enzyme, is required for efficient transcription of sendai virus genome specific interaction in vitro and in vivo of glyceraldehyde-3-phosphate dehydrogenase and la protein with cis-acting rnas of human parainfluenza virus type 3 human hepatic glyceraldehyde-3-phosphate dehydrogenase binds to the poly(u) tract of the 3' noncoding region of hepatitis c virus genomic rna identification of glyceraldehyde-3-phosphate dehydrogenase as a cellular protein that binds to the hepatitis b virus posttranscriptional regulatory element glyceraldehyde-3-phosphate dehydrogenase (gapdh) interaction with 3' ends of japanese encephalitis virus rna and colocalization with the viral ns5 protein functional significance of the interaction of hepatitis a virus rna with glyceraldehyde 3-phosphate dehydrogenase (gapdh): opposing effects of gapdh and polypyrimidine tract binding protein on internal ribosome entry site function specific phosphorylated forms of glyceraldehyde 3-phosphate dehydrogenase associate with human parainfluenza virus type 3 and inhibit viral transcription in vitro specific 
binding of apoa-i, enhanced cholesterol efflux, and altered plasma membrane morphology in cells expressing abc1 apolipoprotein a-i activates cdc42 signaling through the abca1 transporter apolipoprotein a-i and its amphipathic helix peptide analogues inhibit human immunodeficiency virus-induced syncytium formation immunoprecipitation, with an antiserum to ovalbumin, of protein np from influenza a virus and of glycoprotein c from the herpes simplex type i virus identification and characterization of tenp, a gene transiently expressed before overt cell differentiation during neurogenesis a modified silver staining protocol for visualization of proteins compatible with matrix-assisted laser desorption/ionization and electrospray ionization-mass spectrometry proteomic analysis of purified coronavirus infectious bronchitis virus particles proteome science
the authors declare that they have no competing interests. qk performed the main proteomic experiments and data analysis and drafted the manuscript. cx created the detailed experimental design. xr and cz contributed to the initial phase of the proteomic experiments. ll and ds assisted in the propagation and purification of ibv. yb and yc helped conceive the research. all authors read and approved the final manuscript.
key: cord-260345-ugd8kkor authors: giles, ian g. title: a compendium of reviews in biochemistry and molecular biology published in the first half of 1992 date: 1992-12-31 journal: international journal of biochemistry doi: 10.1016/0020-711x(92)90283-7 sha: doc_id: 260345 cord_uid: ugd8kkor abstract 1. a compendium of reviews and mini-reviews in biochemistry and molecular biology published in the first half of 1992 is presented. in all 499 titles are listed from 95 different publications. 2. this compendium presents the references by journal name. keywords have been included with each reference to increase the value of the collection. 
keyword and author cross-reference indexes are not included but are available in the electronic database from which this version was constructed. should anyone wish to have this information in electronic form it can be distributed on ms-dos formatted floppy disks in either reference manager or medline format. the author should be contacted for details of the number of preformatted floppy disks required. krasikov n., thompson k. and sekhon g.s. (1992) brief clinical report-monosomy 18q12.1+21.1-a recognizable aneuploidy syndrome-report of a patient and review of the literature. am. j. med. genet. 43, 531-534. verloes a., mulliez n., gonzales m., laloux f., hermanns-le t., pierard g.e. and koulischer l. (1992) restrictive dermopathy, a lethal form of arthrogryposis multiplex with skin and bone dysplasia-3 new cases and review of the literature. am. j. med. genet. 43, 539-547. aplasia cutis congenita; pyloric atresia; newborn; sibs. leonard c., huret j.l., imbert m.c., lebouc y., selva j. and boulley a.m. (1992) trisomy-16p in a liveborn offspring due to maternal translocation t(16-21)(q11-p11) and review of the literature. am. j. med. genet. 43, 621-625. spontaneous abortions; banding techniques; duplication 16p; infant; segregation. xie l.q., markides k.e. and lee m.l. (1992) biomedical applications of analytical supercritical fluid separation techniques. anal. biochem. 200, 7-19. chromatography-mass spectrometry; amino-acid derivatives; gas-chromatograph stationary phase; charge-exchange; silica-gel; extraction; glycosphingolipids; resolution; interface. hozier j.c. and davis l.m. (1992) review-cytogenetic approaches to genome mapping. anal. biochem. 208, 205-217. chaiken i., rose s. and karlsson r. (1992) quantitative analysis of protein interaction with ligands. (2) analysis of macromolecular interactions using immobilized ligands. anal. biochem. 
201, 197-210. neurophysin self-association; affinity-chromatography; subunit-exchange chromatography; biosynthetic precursor; equilibrium-constants; peptide recognition; sense peptides; binding; purification; elution. ichikawa y., look g.c. and wong c.h. (1992) review-enzyme-catalyzed oligosaccharide synthesis. anal. biochem. 202, 215-238. gal-β-1→3(4)glcnac α-2→3 sialyltransferase; acetylneuraminic acid synthetase; immobilized β-galactosidase; gdp-l-fucose; sialic acids; escherichia coli; rat-liver; glycoprotein oligosaccharides; carbohydrate antigens; determines expression. gabriel o. and gersten d.m. (1992) staining for enzymatic activity after gel electrophoresis-review. anal. biochem. 203. sodium dodecyl-sulfate; red blood-cells; phosphoenolpyruvate carboxylase activity; nucleotide-linked dehydrogenases; alkaline-phosphatase isoenzymes; pathogenesis-related proteins; α-l-fucosidase; polyacrylamide-gel; produce phosphate; general-method. wood p.j. (1992) the measurement of parathyroid hormone. ann. clin. biochem. 29, 11-21. cyclic amp; immunoradiometric assay; primary hyperparathyroidism; humoral hypercalcemia; calcium homeostasis; intact parathyrin; clinical utility; circadian-rhythm; chromogranin a; lung-cancer. newman d.j., henneberry h. and price c.p. (1992) particle enhanced light scattering immunoassay. ann. clin. biochem. 29. c-reactive protein; human chorionic-gonadotropin; latex agglutination-test; cell-labeled antibodies; shell core particles; counting immunoassay; turbidimetric immunoassay; spectroscopic immunoassay; passive hemagglutination; luteinizing-hormone. thompson d., milford-ward a. and whicher j.t. (1992) the value of acute phase protein measurements in clinical practice. ann. clin. 
c-reactive protein; erythrocyte sedimentation-rate; inflammatory bowel-disease; tumor-necrosis factor; amyloid a protein; rheumatoid arthritis; plasma-proteins; polymyalgia rheumatica; acute-leukemia; tissue-injury; acute phase. soldi s.j. (1992) drug receptor assays-quo vadis. ann. . allen j.f. (1992) protein phosphorylation in regulation of photosynthesis. biochim. biophys. acta 1098, 275-335. light harvesting complex; chlorophyll a; b protein; cyanobacterium synechococcus 6301; excitation-energy distribution; absorption cross-section; ii reaction center; thylakoid membrane polypeptides; state-1 state-2 transitions; amino-acid-sequence; randomized chloroplast lamellae. anthony c. (1992) the c-type cytochromes of methylotrophic bacteria. biochim. biophys. acta 1099, 1-15. methylobacterium extorquens am1; electron-transport chain; blue copper proteins; oxidation mutant classes; amino-acid sequence; sp strain am1; paracoccus denitrificans; methylophilus methylotrophus; obligate methylotroph; methanol dehydrogenase. hoch f.l. (1992) cardiolipins and biomembrane function. biochim. biophys. acta 1113, 71-133. rat-liver-mitochondria; fatty-acid composition; brown-adipose-tissue; cytochrome-c-oxidase; membrane lipid-composition; adenine-nucleotide translocase; age-related-changes; skeletal-muscle mitochondria; chronic ethanol-consumption; lateral proton conduction. bandekar j. (1992) amide modes and protein conformation. biochim. biophys. acta 1120, 123-143. transform-infrared-spectroscopy; laser raman-spectroscopy; secondary-structure-analysis; liver alcohol-dehydrogenase; hydrogen-deuterium exchange; a transmembrane channel; valyl-glycyl-glycine; iii spectral region; vibrational analysis; gramicidin-a. biochim. biophys. acta 1123, 231-238. acetyl-coa carboxylase; coenzyme-a reductase; hormone-sensitive lipase; dependent multiprotein kinase; rat-liver; 3-hydroxy-3-methylglutaryl coenzyme; enzymic activity; phosphorylation-dephosphorylation; hydroxymethylglutaryl coenzyme; reversible phosphorylation. waku k. 
(1992) origins and fates of fatty acyl-coa esters. biochim. biophys. acta 1124, 101-111. coenzyme a synthetase; performance liquid-chromatography; rat-liver microsomes; acyltransferase-catalyzed cleavage; dependent transacylation system; rabbit alveolar macrophages; peroxisomal β-oxidation; arachidonic-acid; brain microsomes; short-chain. low-density-lipoprotein; high-performance liquid; chromatography mass spectrometry; chicken vitellogenin gene; thin-layer chromatography; apolipoprotein-vldl-ii; fatty-acid composition; laying turkey hens; egg-yolk; plasma-lipoproteins. erlanson-albertsson c. (1992) pancreatic colipase-structural and physiological aspects. biochim. biophys. acta 1125, 1-7. gastric-inhibitory polypeptide; messenger-rna levels; diabetic rats; pro-colipase; co-lipase; tyrosine residues; porcine colipase; sequence; taurodeoxycholate; triglyceride. coleman r. and rahman k. (1992) wehle e. (1992) reiter r.j. (1992) the ageing pineal gland and its physiological consequences. bioessays 14, 169-175. melatonin receptors; adrenergic-receptors; circadian variations; n-acetylserotonin; plasma melatonin; serum melatonin; hamsters; gerbil; brain; reduction. downward j. (1992) regulatory mechanisms for ras proteins. bioessays 14, 177-184. gtpase-activating protein; neurofibromatosis type-1 gene; nucleotide exchange-reaction; gap-associated proteins; growth-factor; tyrosine phosphorylation; ras-p21 gtpase; stimulation; p21; receptors. rusciano d. and burger m.m. (1992) adamo m., roberts c.t. and leroith d. (1992) how distinct are the insulin and insulin-like growth factor-i signalling systems. biofactors 3, 151-157. human-skin fibroblasts; igf binding-protein; messenger ribonucleic-acid; cooh-terminal truncation; monoclonal-antibody; endothelial-cells; factor receptor; amniotic-fluid; dna-synthesis; rat-heart. helmreich e.j.m. (1992) how pyridoxal 5'-phosphate could function in glycogen phosphorylase catalysis. biofactors 3, 159-172. aron d.c. 
(1992) insulin-like growth factor-i and erythropoiesis. biofactors 3, 211-216. factor-binding-protein; erythroid colony formation; cultured human-fibroblasts; factor messenger-rna; n-terminal sequence; fetal bovine serum; igf-i; somatomedin c; clinical-applications; stimulating factors. silver b.j. (1992) platelet-derived growth factor in human malignancy. biofactors 3, 217-227. terminal coding region; c-sis; b-chain. bmis w.d. and durst r.w. (1992) bajpai p. and bajpai p.k. (1992) arachidonic acid production by microorganisms. biotechnol. appl. biochem. 15, 1-10. bellomo m.j., parlier v., muhlematter d., grob j.p. and beris p. (1992) three new cases of chromosome-3 rearrangement in band-q21 and band-q26 with abnormal thrombopoiesis bring further evidence to the existence of a 3q21q26-syndrome. cancer genet. cytogenet. 59, 138-160. acute nonlymphocytic leukemia; chronic myelogenous leukemia; chronic myeloid-leukemia; acute megakaryoblastic leukemia; chronic myelocytic-leukemia; acute myeloblastic-leukemia; british cooperative group; myelodysplastic syndromes; blast crisis; translocation t(1-3). pedersen b. (1992) survival of patients with t(1-7)(p11-p11)-report of 2 cases and review of the literature. cancer genet. cytogenet. 60, 53-59. acute nonlymphocytic leukemia; translocation 1-7; myelodysplastic syndromes; chromosome analysis; myeloid disorders; secondary. nossal g.j.v. (1992) the molecular and cellular basis of affinity maturation in the antibody response. cell 68, 1-2. mutation. thomas g. (1992) map kinase by any other name smells just as sweet. cell 68, 3-6. protein-kinase; insulin; identification; muscle. teach r.e. (1992) type-2 astrocyte development; ciliary neurotrophic factor; fibroblast growth-factor; retinoic acid receptor; chick limb bud; tyrosine kinase; early xenopus embryos; proto-oncogene int-1; activin-a; w-locus. greenwald i. and rubin g.m. (1992) mellman i. and simons k. (1992) the golgi complex--in vitro veritas. cell 68, 829-840. 
asparagine-linked oligosaccharides; cell-free system; rough endoplasmic-reticulum; vesicular stomatitis-virus; plasma-membrane proteins; brefeldin a; cis-golgi; successive compartments; intracellular-transport; n-acetylglucosamine. chao m.v. (1992) hall a. (1992) signal transduction through small gtpases-a tale of 2 gaps. cell 69, 389-391. proteins. wen d.z., peles e., cupples r., suggs s.v., bacus s.s., luo y., trail g., hu s., silbiger s.m., ben-levy r., koski r.a., lu h.s. and yarden y. (1992) neu differentiation factor-a transmembrane glycoprotein containing an egf domain and an immunoglobulin homology unit. cell 69, 559-572. epidermal growth-factor; human interleukin-2 receptor; human-breast-carcinoma; human mammary-tumors; factor-a; nucleotide sequence; molecular cloning; proto-oncogene; tyrosine phosphorylation; signal transduction. travers a.a. (1992) the reprogramming of transcriptional competence. cell 69, 573-575. position effect variegation; drosophila; protein. helenius a. (1992) unpacking the incoming influenza virus. cell 69, 577-578. amantadine; protein; virions. jan l.y., jan y.n. and hughes h. (1992) tracing the roots of ion channels. cell 69, 715-718. protein. gruss p. and walther c. (1992) pax in development. cell 69, 719-722. genes; conservation; drosophila; notochord; proteins; domain; box. varshavsky a. (1992) the n-end rule. cell 69, 725-735. short-lived protein; dependent proteolytic system; ubiquitin-conjugating enzyme; repair gene rad6; escherichia coli; transfer rna; saccharomyces cerevisiae; endoplasmic-reticulum; cell-cycle; amino-acid. cell signalling iwashita s. and kobayashi m. (1992) signal transduction system for growth factor receptors associated with tyrosine kinase activity-epidermal growth factor receptor signalling and its regulation. cell signal. 4, 123-132. vogt w. and nagel d. (1992) eriksson l.c. and andersson g.n. (1992) nucleotide-sequence; femo-cofactor. iritani n. 
(1992) nutritional and hormonal regulation of lipogenic-enzyme gene expression in rat liver. eur. j. fatty-acid synthase; acetyl-coa carboxylase; post-transcriptional regulation; chick-embryo hepatocytes; messenger rna levels; malic enzyme; thyroid hormone; posttranscriptional regulation; molecular-cloning. gavel y. and vonheijne g. (1992) the distribution of charged amino acids in mitochondrial inner-membrane proteins suggests different modes of membrane integration for nuclearly and mitochondrially encoded proteins. eur. j. biochem. 205, 1207-1215. cytochrome-c-oxidase; adp-atp carrier; beef-heart mitochondria; nicotinamide nucleotide transhydrogenase; brown fat mitochondria; different genes; cdna; iron-sulfur protein; saccharomyces cerevisiae; uncoupling protein; escherichia coli. francklyn c., musier-forsyth k. and schimmel p. (1992) youn y.k., lalonde c. and demling r. (1992) haas a. and goebel w. (1992) defromentel c.c. and soussi t. (1992) mackay t.f.c., lyman r.f. and jackson m.s. (1992) hpitan n.l.v. (1992) sobell j.l., heston l.l. and sommer s.s. (1992) delineation of genetic predisposition to multifactorial disease-a general approach on the threshold of feasibility. genomics 12, 1-6. polymerase chain-reaction; fragment length polymorphisms; dependent diabetes-mellitus; sickle-cell anemia; factor-m gene; enzymatic amplification; genomic dna; mutations; sequence; diagnosis; predisposition; genetics. troy f.a. (1992) polysialylation-from bacteria to brains. glycobiology 2, 5-23. cell-adhesion molecule; escherichia coli k1; rainbow-trout eggs; deaminated neuraminic acid; endo-n-acetylneuraminidase; group b meningococci; long oligosaccharide segment; neuro-blastoma cells; polysialic acid; sialic-acid. varki a. (1992) diversity in the sialic acids. glycobiology 2, 25-40. 
n-acetylneuraminic acid; influenza c virus; de-ortho-acetylation; performance liquid-chromatography; hanganutziu-deicher antigen; liver golgi vesicles; melanoma-associated ganglioside; bombardment mass-spectrometry; human gastrointestinal-tract; deaminated neuraminic acid. stanley p. (1992) glycosylation engineering. glycobiology 2, 99-107. hamster ovary cells; recombinant human erythropoietin; tissue plasminogen-activator; n-linked oligosaccharide; protein glycosylation; biological-activity; lysosomal-enzymes; insect cells; animal-cells; sugar chains. harvey d.j. (1992) the role of mass spectrometry in glycobiology. glycoconjugate j. 9, 1-12. fast-atom-bombardment; 6-o-methylglucose polysaccharide; collisional activation; gas-chromatography; laser desorption; molecular mass; oligosaccharides; ionization; proteins. aquit d.a. and b~~~ii~ii c.f. (1992) heat-shock proteins and immunopathology-an overview. heat-shock t-cell receptor; messenger-rna degradation; gamma-delta; stress proteins; antigen-receptor; lymphocytes-t; mycobacterium tuberculosis; mammalian-cells; cyto-toxicity; dna-binding. fenick d.a. and gemmellhori l. (1992) potential developmental role for self-reactive t cells bearing gamma-delta t cell receptors specific for heat-shock proteins. dendritic epidermal-cells; toxic lymphocytes-t; antigen receptor; intraepithelial lymphocytes; rheumatoid arthritis; athymic mice; mycobacterium tuberculosis; immune-response; thymic ontogeny; a-chain; cell receptor. christmas s.e. (1992) cytokine production by lymphocytes-t bearing the gamma-delta t cell antigen receptor. antigen; antigen receptor; receptor; tumor-necrosis-factor; bone-marrow transplantation; a-p+ lymphocytes; interferon-gamma; monoclonal-antibodies; peripheral-blood; cytotoxic activity; fetal thymocytes; different forms; human intestine. winfield j., jarjour w. and minota s. 
(1992) stress protein autoantibodies and the expression of stress proteins on the surface of human gamma-delta cells and other cells of the immune system. heat-shock proteins and gamma-delta t cells 53, 47-60. stress; autoantibodies; immune-system; heat-shock protein; rous-sarcoma virus; juvenile rheumatoid-arthritis; t-cells; synovial-fluid; lymphocytes-t; mycobacterium tuberculosis; borrelia burgdorferi; lupus erythematosus; transforming protein; stress protein. modlin r.l., lewis j., uyemura k. and tigelaar r.e. (1992) lymphocytes-t bearing gamma-delta antigen receptors in skin. dendritic epidermal-cells; heat-shock protein; mycobacterium-tuberculosis; intraepithelial lymphocytes; limited diversity; murine epidermis; concanavalin a; thy-1 antigen; nude-mice; expression. hohlfeld r. and engel a.g. (1992) the role of gamma-delta lymphocytes-t in inflammatory muscle disease. monoclonal-antibody analysis; natural-killer cells; mononuclear-cells; cyto-toxicity; receptor; myopathies; recognition; expression; proteins; complex. aquino d.a. and selmaj k. (1992) heat-shock proteins and gamma-delta t cell responses in the central nervous system. heat-shock proteins and gamma-delta t cells 53, 86-101. experimental autoimmune encephalomyelitis; experimental allergic encephalomyelitis; fibrillary acidic protein; myelin basic-protein; multiple sclerosis; stress proteins; spinal-cord; in situ hybridization; alexander's disease; prazosin treatment. mario t., nagasawa m. and yata j. (1992) gamma-delta t cells in patients with primary immunodeficiency syndrome-their function and a possible role in the pathogenesis. heat-shock proteins and gamma-delta t cells 53, 102-120. wiskott-aldrich syndrome; heat-shock proteins; receptor-delta; ataxia telangiectasia; transgenic mice; lymphocytes t; bearing cells; recognition; expression; increase. reardon c.l., born w. and obrien r.l. (1992) murine gamma-delta lymphocyte-t recognition of hsp60-a possible source for bacterial immunity or autoimmunity. 
hesketh j.e. and whitelaw p.f. (1992) the role of cellular oncogenes in myogenesis and muscle cell hypertrophy. int. j. biochem. 24, 193-203. muscle; all; rous sarcoma virus; fibroblast growth-factor; c-fos expression; embryonal carcinoma-cells; chicken skeletal-muscle; myc messenger-rna; proto-oncogene; gene-expression; dna-binding; src gene. colacicchi s., ferrari m. and sotgiu a. (1992) in vivo electron paramagnetic resonance spectroscopy imaging-first experiences, problems, and perspectives. int. j. biochem. 24, 205-214. nitroxide spin labels; loop-gap resonator; free-radicals; soluble nitroxides; metabolism; pharmacokinetics; oxygen; cells; specttusn~, sensitivity. seyer r., richoux j.p. and aumelas a. (1992) probing angiotensin receptors. int. j. biochem. 24, 369-377. message-address concept; complementary rna; paraventricular nucleus; biological-activities; subfornical organ; binding-sites; amino-acid; rat-brain; antagonists; analogs. tuck m.t. (1992) the formation of internal 6-methyladenine residues in eucaryotic messenger rna. int. j. biochem. 24, 379-386. rekharsky m.v., nemykina e.v. and erokhin a.s. (1992) thermochemistry of n-c bonds hydrolysis in amides, peptides, n-acetyl amino acids and high-energy n-c bonds hydrolysis in n-acetyl imidazole and urea. int. konopinska d., rosinski g. and sobotka w. (1992) insect peptide hormones, an overview of the present literature. amino-acid-sequence; bombardment mass-spectrometry; pigment-concentrating hormone; locust adipokinetic hormone; periplaneta americana l; akh-rpch family; corpora cardiaca; leucophaea maderae; corpus cardiacum; manduca sexta. dinarello c.a. (1992) the biology of interleukin-1. interleukins-molecular biology and immunology 51, 1-32. tumor necrosis factor; colony-stimulating factor; smooth-muscle cells; human-immunodeficiency-virus; vascular endothelial-cells; blood mononuclear-cells; recombinant human interleukin-1; human monocyte interleukin-1; hepatic protein-synthesis; autocrine growth-factor. dower s.k., 
sims j.e., cerretti d.p. and bird t.a. (1992) the interleukin-1 system--receptors, ligands and signals. interleukins-molecular biology and immunology 51, 33-64. tumor necrosis factor; protein kinase c; growth-factor-receptor; nf-kappa-b; factor increase phosphorylation; prostaglandin e production; rheumatoid synovial-cells; high-affinity receptors; gtp-binding protein; human t-cells. ihle j.n. (1992) interleukin-3 and hematopoiesis. interleukins-molecular biology and immunology 51, 65-106. colony-stimulating factor; human granulocyte-macrophage; protein kinase-c; recombinant human interleukin-3; cell growth-factor; murine bone-marrow; express functional receptors; acute lymphocytic-leukemia; gtpase-activating protein; factor-independent growth. ascorbic-acid deficiency; ischemic-heart-disease; low-density-lipoprotein; eastern finnish men; diabetes mellitus; guinea-pigs; blood-pressure; oral-contraceptives; experimental atherosclerosis; plasma-cholesterol. mannella c.a., forte m. and colombini m. (1992) toward the molecular structure of the mitochondrial channel, vdac. j. bioenerg. biomembrane 24, 7-19. outer-membrane channel; neurospora crassa mitochondria; hexokinase-binding protein; voltage-dependent channel; rat-liver mitochondria; synthetic polyanion; electron-microscopy; pore protein; sequence; arrays; molecular-structure. depinto v. and palmieri f. (1992) benz r. and brdiczka d. (1992) the cation-selective substate of the mitochondrial outer membrane pore-single-channel conductance and influence on intermembrane and peripheral kinases. j. bioenerg. biomembrane 24, 33-39. rat-liver mitochondria; hexokinase-binding protein; contact sites; synthetic polyanion; creatine-kinase; inhibition; transport; state; brain. arora k.k., parry d.m. and pedersen p.l. (1992) khorana h.g. (1992) rhodopsin, photoreceptor of the rod cell-an emerging pattern for structure and function. j. biol. chem. 267, 1-4. 
schiff-base counterion; bovine rhodopsin; cysteine residue-110; molecular mechanism; visual excitation; outer segment; protein; light; transducin; binding; rods; rhodopsin; photoreceptors. pugh b.f. and tjian r. (1992) diverse transcriptional functions of the multisubunit eukaryotic tfiid complex. j. α-a-crystallin; tissue-specific expression; chicken δ-1 crystallin gene; vertebrate lens crystallins; non-lenticular tissues; x ray-analysis; γ-crystallin; transgenic mice; β-crystallin; eye lens; gene-regulation. fibroblast growth-factor; endothelial-cell mitogen; vascular-permeability factor; tumor necrosis factor; bovine brain; extracellular-matrix; factor-a; neovascularization in vivo; dna-synthesis; acidic fgf. sardesai v.m. (1992) nutritional role of polyunsaturated fatty acids. j. nutr. biochem. 3, 154-166. α-linolenic acid; coronary heart-disease; arachidonic-acid; cholesterol levels; lipid-metabolism; young rats; deficiency; plasma; hr& requirements. ronto g., toth k., gaspar s. and csik g. (1992) transmission; somatic-cell hybrids; @mine-rich region; maternal inheritance; nucleotide-sequence; gene organization; d-loop; heteroplasmy; melanogaster; evolution; genome. schaich k.m. (1992) metals and lipid oxidation--contemporary issues. lipids 27, 209-218. oxidation; unsaturated fatty-acids; electron-spin resonance; hydrogen-peroxide; linoleic-acid; free-radicals; pulse-radiolysis; amino-acids; catalyzed autoxidation; butyl hydroperoxide; transition-metals. microbiological reviews zitomer r.s. and lowry c.v. (1992) regulation of gene expression by oxygen in saccharomyces cerevisiae. microbiol. rev. 56, 1-11. cytochrome-c oxidase; upstream activation site; mitochondrial messenger-rna; yeast hap1 activator; ribosomal-protein genes; sex-determining region; dna-binding motif; nuclear gene; glucose repression; cyc1 gene. winans s.c. (1992) two-way chemical signaling in agrobacterium-plant interactions. microbiol. rev. 56, 12-31. 
tumefaciens ti-plasmid; vir gene-expression; single-stranded-dna; crown-gall tumors; octopine synthase enhancer; transgenic tobacco plants; mc&tducing plasmid; rhizogenes strain a4; opine-like compound; virulence genes. ac 24/l%-c dowling j.n., saha a.k. and glew r.h. (1992) microbiol. rev. 56, 80-99. coli; a neurotoxin; light-chains; spasmodic torticollis; monoclonal-antibodies; medicine. botsford j.l. and harman j.g. (1992) cyclic amp in prokaryotes. microbiol. rev. 56. escherichia coli k12; pertussis adenylate-cyclase; camp receptor protein; cat&d&e-activaterprcteinadn; l-m-l& sugar phosphotransferase system; site-directed mutagenesis; calmodulin-like activity; gene-regulatory protein; calcium-binding protein. matthews k.s. (1992) dna looping. microbiol. rev. 56, 123-136. galactose operon; escherichia coli; lac repressor; arac protein; operator interaction; lactose repressor; arabinose operon; gene-regulation; rna-polymerase; binding-sites. nishimura a., akiyama k., kohara y. and horiuchi k. (1992) glycosyl-phosphatidylinositol; ascosporogenous yeasts; molecular-cloning; division arrest; surface protein. silver s. and walderhaug m. (1992) gene regulation of plasmid-determined and chromosome-determined inorganic ion transport in bacteria. microbiol. rev. 56, 195-228. escherichia coli k12; arsenical resistance operon; gram-negative bacteria; pnol2g-encoded nickel resistance; iron dicitrate transport; syringae pv tomato; nucleotide-sequence; alcaligenes eutrophus; staphylococcus aureus; phosphate regulon. osawa s., jukes t.h., watanabe k. and muto a. (1992) recent evidence for evolution of the genetic code. microbiol. complete nucleotide-sequence; transfer-rna genes; mitochondrial transfer-rna; neurospora crassa mitochondria; directional mutation pressure; transfer ribonucleic-acid; isoleucine transfer-rna; coli transfer-rnas; escherichia coli; mycoplasma capricolum. vanrens g.l.m., dejong w.w. and bloemendal h. 
(1992) a superfamily in the mammalian eye lens-the βγ-crystallins. mol. biol. rep. 16, 1-10. x-ray-analysis; γ-crystallin; β-crystallin; gene family; differential synthesis; messenger-rna; evolutionary relationships; structural variation; nucleotide-sequence. volkenstein m.v. (1991) structure and dynamics of proteins. mol. pancreatic trypsin-inhibitor; nuclear magnetic-resonance; amino-acid-sequence; globular proteins; crystal-structure; helical protein; enzyme-activity; evolution; conformation; principles. barker p.a. and murphy r.a. (1992) the nerve growth factor receptor-a multicomponent system that mediates the actions of the neurotrophin family of proteins. mol. cell. biochem. 110, 1-15. nerve growth; factor receptor; wheat-germ-agglutinin; rat spinal-cord; sympathetic-ganglia membranes; high-affinity receptors; human-melanoma cells; human ngf receptor; pc12 cells; messenger-rna; sensory neurons; pheochromocytoma cells. dicker i.b. and seetharam s. (1992) what is known about the structure and function of the escherichia coli protein oncogene cooper c.s. (1992) mini review-the met oncogene-from detection by transfection to transmembrane receptor for hepatocyte growth factor. oncogene 7, 3-7. protein tyrosine kinase; c@4bkks locus; human dhine; molecular-cloning; s6ster f-mnnga activa gene; rearrangement; expression. verma d.p.s. (1992) signals in root nodule organogenesis and endocytosis of rhizobium. plant cell 4, 373-382. peribacteroid membrane; glycine-max; soybean nodules; cell-division; m-phase; gene; protein; cunpa~ nodulation; expression. henderson b.w. and dougherty t.j. (1992) how does photodynamic therapy work? photochem. photobiol. 55, 145-157. silver s. (1992) plasmid-determined metal resistance mechanisms-range and overview. plasmid 27, 1-3. 
folate supplements prevent recurrence of neural tube defects anonymous (1992) fda dietary supplement task force folate supplements and neural tube defects gastric acidity, atrophic gastritis, and calcium absorption nutrition mission to iraq for unicef a retrovirus uses a cationic amino acid transporter as a cell surface receptor anonymous (1992) position of double bond of trans isomers of linoleic acid affects fatty acid desaturation and elongation retinoic acid, bound to its nuclear receptor, enhances the expression of the gene for phosphoenolpyruvate carboxykinase 50, 52-54. seasonal-variations; parathyroid-hormone; mineral content. anonymous (1992) transplacental nutrient transfer and intrauterine growth retardation 65-70. procollagen messenger-rna; chick-embryo chondrocytes; alkaline phosphatase; extracellular-matrix nutrition and health communication-the message and the media over 1/2 a century ceramide-a new lipid 2nd messenger plasma-membranes; sphingomyelinase; brain amines; metabolism; polyamines; methionine; protein. anonymous (1992) sweet foods and calorie consumption at meals vitamin-e supplementation enhances immune response in the elderly nutritional requirements and dietary guidelines-the globe shrinks excerpts from dietary reference values for food energy and nutrients for the united kingdom-introduction to the guide and summary tables 95-101. oral glucose-load; body-composition; dietary-fat; thermogenic response; basal metabolism health and nutritional consequences of the 1991 bangladesh cyclone unproven nutritional remedies and cancer buthionine sulfoximine, an experimental tool to induce glutathione deficiency-elucidation of glutathione and ascorbate in their role as antioxidants choline-a conditionally essential nutrient for humans anonymous (1992) gender differences in immune competence during copper deficiency 116-118. 
folinic acid; plasma; conjugase; intestine relevance of physiology of nutrient absorption to formulation of enteral diets vi&u; dissimilatory sulfate reduction; border membrane-vesicles; peptide-chain length; human gut bacteria; free amino-acids; small-intestine dietary fiber, phytoestrogens, and breast cancer post-menopausal women; hormone-binding globulin; serum estrogen-levels; bound estradiol; american whites; mammary-tumors first lines of physiology-prospective overview relationship between malnutrition and falls in the elderly nutritional management of patients with short-bowel syndrome high-fat; resection 181-223. calcium-release channel; sarcoplasmic-reticulum membranes; junctional foot protein; fast-twitch muscle; intramembrane charge movement; extensor digitorum longus; cardiac purkinje-fibers the varieties of ribonuclease p flavin-linked peroxide reductases-protein-sulfenic acids and the oxidative stress response flavoprotein disulfide oxidoreductases; alkyl hydroperoxide reductase; streptococcal nadh peroxidase; salmonella typhimurium; glutathione-reductase; thioredoxin reductase; escherichia coli; sequence; peroxide chromatin dynamics and the modulation of genetic activity neutralizing antibodies; envelope glycoprotein; monoclonal-antibody complex regulation of simple sugar transport in insulin-responsive cells new insights into genetic eye disease. trends genet. 8, 85-91. dominant retinitis pigmentosa; retinal degeneration; photoreceptor cells; molecular-genetics; rhodopsin gene a new bind for myc dna-binding; activates transcription; protein; dimerization; motif; my& ~nsfo~~~ ptefetenees; contains; dimer the bmp proteins in bone formation and repair growth-factor-β; tgf-β; xenopus; identification; induction; family; bovine; gene element; arabidopsis thaliana; lilium henryi; dna-sequence 109-114. dominant morphological mutant; transcription factors; maize; gene; protein hormones, puffs and flies-the molecular control of metamorphosis by ecdysone. trends genet. 8, 132-138. 
imaginal disk morphogenesis; sequential gene activation; larval salivary-gland; drosophila melanogaster; alcohol-dehydrogenase; polytene chromosomes the molecular basis of glucose 6-phosphate dehydrogenase deficiency. trends genet. 8, 138-143. plasmodium falciparum; n~~&-~u~~~ point mutations; red-cells; gene; expression; cdna 144-149. skeletal-muscle; neural control; mouse embryo molecular genetics of sex determination in c. elegans. trends genet. 8, 164-168. post-embryonic development; cycle control proteins; dosage compensation; hermaphrodite; mutations; tra-1 trends genet. 8, 169-174. position-effect variegation; dna methylation; gene-expression; cpg island; mplmatia~ 174-180. conjugation tube formation; saccharomyces cerevisiae; tremella mesenterica; dna; gene the retinoblastoma protein and cell cycle regulation patterning of the drosophila nervous system-the achaete-scute gene complex. trends genet. 8, 202-208. aensay m development mating-type; yeast; intexunpsicm; fusion; fission yeast; schizosaccharomyces pombe; cell lineage; dna arank ho gene; cassettes; asymmetry ustilago violacea; nst !itlt@ts; t'&unce; pcpi&ition cti~ dynamics; ctisciw; @ttc; reproduction integrins and metastases-an overview. tumor biol. 12, 309-320. murine melanoma-cells; reconstituted basement-membrane; fibronectin-receptor complex; tumor-cells; surface galactosyltransferase; adhesion molecules; breast-cancer; carbohydrate chains; leukocyte integrins rat brain hexokinase; amino-acid-sequence; intracellular-localization; terminal sequence; tumor-cells; membrane; glucose; particulate; form; phosphorylation. mcenery m.w. (1992) the mitochondrial benzodiazepine receptor-evidence for association with the voltage-dependent anion channel (vdac). j. bioenerg. biomembrane 24, 63-69. membrane contact sites; rat-liver mitochondria; central nervous-system; binding-sites; heart-mitochondria; endogenous ligands; calcium-channel; creatine-kinase; cyclosporin-a; localization. thinnes f.p. 
(1992) evidence for extra-mitochondrial localization of the vdac/porin channel in eucaryotic cells. j. bioenerg. biomembrane 24, 71-75. dependent anion channel; outer-membrane; chloride channels; escherichia coli; cystic-fibrosis; conductance; protein; identification; hexokinase; state.beavis a.d. (1992) properties of the inner membrane anion channel in intact mitochondria. rat-liver mitochondria; heart-mitochcndria; plant-mitochondria; k+ flux; dibutylchlonnnethyltin chloride; triphenyltin compounds; conducting channel; transport pathway; light-scattering;binding-sites. moran 0. and sorgato m.c. (1992) high-conductance pathways in mitochondrial membranes. j. bioenerg. biomembrane 24. 91-98. contact sites; outer-membrane; brain mitochondria; eukaryotic cells; lipid bilayers; pot-in pores; ion channel; transport; phosphorylation;reconstitution.kirmally k.w., antonenko y.n. and zorov d.b. (1992) modulation of inner mitochondrial membrane channel activity. j. bioenerg. biomembrane 24, 99-110. anion channel; contact sites; selective channels; brain mitochondria; lot] channci; conductance; protein; ca*'; mitoplasts;catiolls. l. and herzfeld j. (1992) nmr studies of retinal proteins. j. bioenerg. biomembrane 24. 139-146.solid-state 'le. dark-adapted bacteriorhodopsin; schiff-base; lab&d hncteriorhodopsin; membrane-proteins; chromophore; dynamics; conformation; spectroscopy. rothschild k.j. (1992) ftir difference spectroscopy ol bacteriorhodopsin-toward a molecular model. resonance raman-spectroscopy; carboxyl protonation changes; hydrogen-dcutcrium exchange; oriented purple membrane; retinal schiff-base; to-blue transition; vibrational spectroscopy; low-temperature; neutron-diffraction; differcncc spea*scopy.lanyi j.k. (1992) proton transfer and energy coupling in the bacteriorhodopsin photocycle. j. bioenerg. biomembrane 24, 169-179. resonance raman-spectra; solid-state "c. 
purple membrane; halobaclerium halobium: chromophore structure; photochanical cycle; schiff-base; retinal cimmophom: nbsor@ion-spectra; aspanic acid-96.oesterhelt d., tittor j. and bamberg e. (1992) segrest j.p.. jones m.k.. deloof h.. brouiiiette c.g., venkatachalapathi y.v. and anamharamaiah g.m. (1992) the amphipathic helix in the exchangeable apoiipoproteins-a review of secondary structure and function j. lipid res 33, 141-166. high-density-lipcpmtein; synthetic pa&de analogs; lccithiu-choleste.ml acyltransferaae; gradient gel&ctmphcresis; lipid-pmtaiu iute~; a-i; plasma-hpopmteins; monoclonal-antibodies; electron-miaorcopy; netstrut-scattering. hide wa., chw l. and li w.h. (1992) daliongeviile j., lussiercacan s. and davignon j. (1992) modulation of plasma triglyceride levels by apoe phenotype-a meta-analysis-review. j. lipid res 33, 447-454.levek apoiipopmmht-e poiymotphism; coronaty-anery disease; density-lipoprotein chdestemk measured genuype infonnaticn; amein-e polymorphism; amino-acid-sequence; iii hyparlipoproteinemia; e isoforms; quantitative phenotypes; v hyperlipoproteinemia. bjorkhem i. (1992) mechanism of degradation of the steroid side chain in the formation of bile acids-review. j.lipid res 33, 455-471. rat-liver peroxisomes; h-tic mitochondrial cytochrome-p-450; cerebmtendinous xanthomatosis; cholic-acid; 3-a,7-a.12-a-trihydroxy-s-@cholestanoic acid; 26-hydroxylase system; chenodeoxycholic acid; 12-a-trihydroxy-5-~-cholestanoic acid; cholesterol 7-a-hydroxylase; vitamin-d3 25hydroxylase.hofmann a.f. and mysels k.j. (1992) bile acid soiubiiity and precipitation in vitro and in v&-the role of conjugation, ph, and ca*+ ions. j. lipid res 33.617-626. guo z.s. and depamphilis m.l. (1992) a-protein; bacteriophage-~, purificatiat. trumbly r.j. (1992) glucose repression in the yeast saccharomyces cerevisiae. mol. microbiol. 6, 15-21. 
carbcn catabolite repression; snfl protein-kinase; invettase synthesis; molecular analysis; maltose fermentation; cytocbrome genes; nuclear-pmtcim gal genes; suc2 gene; mutants.bischoff d.s. and ordal g.w. (1992) okane d.j. and prasher d.c. (1992) evolutionary origins of bacterial bioluminescence. mol. microbial. 6,443-449.amino-acid-sequence; yellow fluorescent protein; fischeri strain y-l; photobacterium phosphorewn; vibrio fischeri; nucleotide-sequence; lumaxine protein; luminous bacterium; a-subunit; g-subunit.ditita v.j. (1992) ohe g., johannes c. and schultefiohlinde d. (1992) faisst s. and meyer s. (1992) compilation of vertebrate-encoded transcription factors. nucleic acids. nf-kappa-b, major histocompatibility complex; enhancer-binding-protein; nuclear factor-i; long terminal repeat amp response elanenu heavy-chain promoter. growth-hormone gene; admovilus dna-mplication; mouse estrogm-reapor. strohl w.r. (1992) compilation and analysis of dna sequences associated with apparent sfrepfomycete promoters.nucleic acids. res 20. %l-974.rna-polymerase; a-amylase gene; aminoglycoside phosphotransferase gene; fi-lactamase gene; escherichia coli; nucleotide-sequence; transcriptional analysis; coelicolor a3(2); sigma-factor. acetyltmnsferasc gme. perigaud c., gosselin g. and imbach j.l. (1992) nucleoside analogues as chemotherapeutic agents-a review.nucleos. nucleot. 11, 903-945. welch g.r. (1992) an analogical field construct in cellular biophysics-history and present status. prog. biophys.mol. biol57,71-128. nonstatiauy electric-fields; sequattial metabolic enzymes; free-energy; biological-systems; protein fluctuations; aqueous cytoplasm: diekttic theory; litig systems; thermodynamics; mechanisms.mart4 p. (1992) biophysicai aspects of neutron scattering from vibrational modes of proteins. prog. biophys. mot. bid 57, 129-179. 
pancm& tqprin-inhibitor, low-frequency vibrations; time-of-flight liquid glass-transition; hinge-bending mode; biological functions; globular pmtein; temperaturedependence; allosteric transition; supercooled liquids. cinti d.l., cook l., nagi m.n. and suneja sk. (1992) the fatty acid chain elongation system of memmalirm endoplaamic reticulum. prog. lipid res 31, 1-51. rat-liver nli crosomu; swine cembral mictosanes; acyl-coenzyme a; very-latg-chaim mouse-brain miaoscnnu; enoyl-coa hydratase; pemxisomal bifunctioual protein; cultured skin fib&lasts; lipid-lowering agentr; ~-oxidationheth2ea.m. lysolecithitt-lysolecithin acyltmttsferue; rabbit alveolar macmphages; myocatdial lysophos+&mse-tran~~, aortic endothelial-cells: cultumd netuo-blastoma; heart-muscle mictosomes; precursor fatty-acids; rat-brain microsomes; guinea-pig heart; fish oil diet kaya k. (1992) chemistry and biochemistry of taurolipids. prog. lipid res 31, 87-108. phosphoenolpymvate carboxykinase gtp; diet-induced hypercholesterolemia promoter-regulatory region; tissue-specific expression; enhancer-binding-pn&tu pymvate-kinase gene; rat-liver, messenger-rna, 6phosphofmcto-2kinasefructose 2,6-bispbosphatase; transcriptional regulation. barber j. and andersson b. (1992) lermrd j. (1992) moms d. (1992) smtctura 1 and fum&mal relationships between ~~yl-~f~ rna syt&tases. biodem. sci 17, 159-164. 3dimatsiand structure; atp, mechanisms; resohstion; homology; ~ueteim tymsyl; site; ammoacyl-transfer-rna. key: cord-262748-v4xue7ha authors: xu, yongtao; yu, shui; zou, jian-wei; hu, guixiang; rahman, noorsaadah a. b. 
d.; othman, rozana binti; tao, xia; huang, meilan title: identification of peptide inhibitors of enveloped viruses using support vector machine date: 2015-12-04 journal: plos one doi: 10.1371/journal.pone.0144171 sha: doc_id: 262748 cord_uid: v4xue7ha peptides derived from envelope proteins have been shown to inhibit the protein-protein interactions in the virus membrane fusion process and thus have great potential to be developed into effective antiviral therapies. there are three classes of envelope proteins, each exhibiting a distinct structural fold. although the exact fusion mechanism remains elusive, it has been suggested that the three classes of viral fusion proteins share a similar mechanism of membrane fusion. this common mechanism of action makes it possible to correlate the properties of self-derived peptide inhibitors with their activities. here we developed a support vector machine model using sequence-based statistical scores of self-derived peptide inhibitors as input features to correlate with their activities. the model displayed 92% prediction accuracy with a matthews correlation coefficient of 0.84, clearly superior to models using physicochemical properties or amino acid composition as input. the predictive support vector machine model for self-derived peptides of envelope proteins would be useful in the development of antiviral peptide inhibitors targeting the virus fusion process. the fusion process is the initial step of viral infection; targeting it therefore represents a promising strategy in the design of antiviral therapy [1]. the entry step involves fusion of the viral and cellular membranes, which is mediated by the viral envelope (e) proteins.
there are three classes of envelope proteins [2]: class i e proteins include influenza virus (ifv) hemagglutinin and gp41 of the retrovirus human immunodeficiency virus 1 (hiv-1); class ii e proteins include those of a number of important human pathogens of the flaviviridae family, such as dengue virus (denv), japanese encephalitis virus (jev), yellow fever virus (yfv), west nile virus (wnv) and hepatitis c virus (hcv), and of the togaviridae family, such as the alphavirus semliki forest virus (sfv); class iii e proteins include those of vesicular stomatitis virus (vsv), herpes simplex virus-1 (hsv-1) and human cytomegalovirus (hcmv). although the exact fusion mechanism remains elusive and the three classes of viral fusion proteins exhibit distinct structural folds, they may share a similar mechanism of membrane fusion [3]. a peptide derived from a protein-protein interface can inhibit the formation of that interface by mimicking the interactions with its partner proteins, and may therefore serve as a promising lead in drug discovery [4]. enfuvirtide (t20), a peptide that mimics the hr2 region of the class i hiv-1 gp41, is the first fda-approved hiv-1 fusion inhibitor; it blocks the entry step of virus infection [5] [6] [7]. subsequently, peptides mimicking extended regions of hiv-1 gp41 were also shown to be effective entry inhibitors [8, 9]. furthermore, peptides derived from a distinct region of the gb virus c e2 protein were found to interfere with the very early events of the hiv-1 replication cycle [10]. other successful examples of class i peptide inhibitors include those derived from the sars-cov spike glycoprotein [11] [12] [13] and from the pichinde virus (picv) envelope protein [14]. recently, flufirvitide-3 (ff-3), a peptide derived from the fusion initiation region of the ifv glycoprotein hemagglutinin (ha), has progressed into clinical trials [15].
the success in developing class i peptide inhibitors for clinical use has triggered interest in the design of inhibitors of class ii and class iii e proteins. for example, several hydrophobic peptides derived from the class ii denv and wnv e proteins exhibited potent inhibitory activities [16] [17] [18] [19] [20]. in addition, a potent peptide inhibitor derived from domain iii of the jev glycoprotein and a peptide inhibitor derived from the stem region of the rift valley fever virus (rvfv) glycoprotein were reported [21, 22]. examples of class ii peptide inhibitors of enveloped viruses also include those derived from the hcv e2 protein [23, 24] and from claudin-1, a critical host factor in hcv entry [25]. moreover, peptides derived from the class iii hsv-1 gb [26] [27] [28] [29] [30] [31] and hcmv gb [32] also exhibited antiviral activities. computational informatics plays an important role in predicting the activities of peptides generated from combinatorial libraries. in silico methods such as data mining, genetic algorithms and vector-like analysis have been reported to predict the antimicrobial activities of peptides [33] [34] [35]. in addition, quantitative structure-activity relationships (qsar) [36] [37] [38] [39] [40] and artificial neural networks (ann) [41, 42] have been applied to predict the activities of peptides. recently, a support vector machine (svm) algorithm was employed to predict antiviral activities from the physicochemical properties of general antiviral peptides [43]. however, the mechanisms of action of antiviral peptides differ from those of antimicrobial peptides, and various protein targets are involved in virus infection. for example, hiv-1 infection involves virus fusion, integration, reverse transcription and maturation. it is therefore difficult to extract common features from general antiviral peptides that represent their antiviral activities. virus fusion is mediated by e proteins.
although e proteins are highly divergent in sequence and structure, they share a common pathway of membrane fusion dynamics: e proteins undergo a significant conformational change to form a trimer-of-hairpins, which drives the fusion of the viral and host membranes [44]. antiviral peptides derived from envelope proteins function by binding in situ to their respective accessory proteins, disrupting formation of the trimer-of-hairpins and hence membrane fusion, thereby inhibiting virus infection. in view of the important role of e proteins in the virus fusion process and the common mechanism of action of self-derived peptides, we developed an svm model to predict the antiviral activities of self-derived peptides using sequence-based statistical scores as input features. the sequence-based properties were calculated by a conditional probability discriminatory function which indicates the propensity of each amino acid for being active at a specific position. our model exhibited remarkably higher accuracy in predicting the activities of self-derived peptides than the previous models developed for general antiviral peptides using classical physicochemical properties as descriptors [43]. the method would be useful in the identification of entry inhibitors as a new generation of antiviral therapies. 202 peptides targeting entry of enveloped viruses were collected, of which 101 are active and 101 are non-active. of these, 75 active and 75 non-active peptides (75p+75n) comprised the training set of the svm models; the remaining 26 active and 26 non-active peptides (26p+26n) were used as the test set. amino acid composition. amino acid composition is the fraction of each amino acid in a peptide. the fractions of the 20 amino acids were calculated using the following equation: $$\text{fraction of amino acid } x = \frac{\text{total number of } x}{\text{peptide length}}$$ five physicochemical properties were used in the svm models.
isoelectric point (pi), molecular weight (mw) and grand average of hydropathicity (gravy) [45] were calculated using the protparam tool on the expasy web server. solvent accessibility and secondary structure features were calculated using the accpro and sspro packages, respectively, implemented in the scratch protein predictor server [46]. sequence-based statistical scoring function. the knowledge-based statistical function is developed from the concept of the residue-specific all-atom probability discriminatory function (rapdf) [47]. rapdf is a structure-based statistical scoring function. it is based on the assumption that averaging over different atom types in experimental conformations is an adequate representation of the random arrangements of these atom types in any compact conformation. here we developed a sequence-based statistical scoring function, in which we presume that averaging over different amino acid sequences with experimentally validated inhibitory activities is an adequate representation of random amino acid sequences with any inhibitory activity. the basis of this assumption is that the peptides share a common mechanism of action, i.e. the peptides derived from e proteins bind competitively to their partner proteins, disrupt the formation of the trimer-of-hairpins, and therefore inhibit virus membrane fusion. the sequence-based scoring function takes the following form: $$S(\{q_a^i\}) = -\ln \frac{P(q_a^i \mid C)}{P(q_a^i)}$$ here, $q_a^i \in \{\text{active}\}$. $P(q_a^i \mid C)$ is the probability of observing amino acid $i$ in an active peptide sequence; $P(q_a^i)$ is the probability of observing amino acid $i$ in any peptide sequence, active or non-active. they are approximately estimated using the following forms: [equations not recovered]. similarly, we employed a dataset of experimentally verified non-active peptides in developing the statistical function, where $q_a^i \in \{\text{inactive}\}$. for a given amino acid sequence, 20 columns of input are generated, corresponding to the occurrence of the twenty natural amino acids at each position.
each column is assigned a value of $n \times (-\text{log-likelihood})$, where $n$ is the number of occurrences of the corresponding amino acid and the −log-likelihood is derived from the statistical function score. each feature thus combines the propensity of the amino acid for being active or non-active with the corresponding amino acid composition. below is an example of calculating the statistical scores for a given peptide sequence: the amino acid order for svm input features is set as: acdefghiklmnpqrstvwy. if the amino acid sequence of an active peptide inhibitor is: [example not recovered]. svm models with a radial basis function (rbf) kernel were developed using the c-svc module in libsvm (version 3.1) [48, 49] and executed under the matlab interface. the performance of an svm depends on two parameters, gamma (-g) and cost (-c) [50]. the default value is 1 for -c and 1/k for -g, where k is the number of input features. various pairs of (c, g) values on an exponential grid (i.e. $2^x$, $2^y$) were evaluated using cross-validation, and the pair with the best cross-validation accuracy was selected. 5-fold cross validation was performed to evaluate the performance of the svm models. in the evaluation process, the dataset was partitioned randomly into five equally sized subsets. training and testing were carried out five times, each time with four distinct subsets used as the training set and the remaining subset as the test set. the results were averaged over all five rounds of validation. the following equations were used to evaluate the prediction quality of the svm models [48, 51]: [equations not recovered]. in the above equations, tp is the number of true positives, tn is the number of true negatives, fp is the number of false positives and fn is the number of false negatives. the matthews correlation coefficient (mcc) reflects the performance of the model. it ranges from -1 to 1, and a larger mcc value indicates a better prediction.
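the evaluation measures named above can be sketched in python as follows. this is a minimal illustration that assumes the standard textbook definitions of sensitivity, specificity, accuracy and mcc in terms of tp, tn, fp and fn; whether the authors reported exactly this set of measures is an assumption. the function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Standard binary-classification measures from confusion-matrix counts."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    # Sensitivity: fraction of true actives that were predicted active.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true non-actives that were predicted non-active.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for a balanced test set of 26 positives and 26 negatives
# (illustrative only; the paper's confusion matrix is not given in the text).
example = classification_metrics(tp=24, tn=24, fp=2, fn=2)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

because mcc uses all four cells of the confusion matrix, it stays informative even when one class dominates, which is one reason it is commonly reported alongside accuracy.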
the svm learning algorithm is a powerful machine learning method that has been widely used in pattern recognition and classification. an svm is trained on a dataset of experimentally validated positive and negative samples and generates a classifier that assigns unknown samples to one of two distinct categories (positive or negative). we performed an exhaustive literature search on self-derived peptide inhibitors of envelope proteins and collected experimentally validated peptides derived from the three classes of e proteins. for peptides with overlapping segments, only one peptide sequence was kept. 202 peptides were found, of which 101 are active and 101 are non-active (table 1). 75 active and 75 non-active peptides (75p+75n) of e proteins were used as the training dataset in svm learning; the remaining 26 active and 26 non-active peptides (26p+26n) were used as the test set. svm input features. three svm models were developed using different features as input descriptors, namely physicochemical properties (denoted eapphysico), amino acid composition (eapcompo) and amino acid composition weighted by the statistical scoring function (eapscoring). knowledge-based statistical functions are rooted in the bayesian (conditional) probability formalism and derived directly from properties observed in known folded proteins [52] [53] [54]. in a knowledge-based scoring function, it is presumed that averaging over different atom types in experimental conformations is an adequate representation of the random arrangements of these atom types in any compact conformation [55]. because the three classes of e proteins have different structural folds, it is difficult to extract a structure-based feature that is relevant to their antiviral activities. generally speaking, any property associated with folded proteins can be converted into an energy function [56].
since the amino acid sequence determines the structural folds and properties of proteins and peptides, we presumed that a sequence-based statistical scoring function averaging over different amino acid sequences exhibiting inhibitory activities is an adequate representation of the random combinations of all twenty amino acids exhibiting any activity. in this approach, a peptide sequence derived from an e protein is represented by twenty features, each corresponding to the propensity of one of the twenty natural amino acids to be either active or non-active. a vector of twenty sequence-based statistical scores was used as the eapscoring input in svm learning. we also built an svm model using physicochemical properties as input features. given the nature of the membrane fusion process, it has been suggested that functional regions in glycoproteins need to be solvent accessible, hydrophobic and flexible [57]. indeed, the majority of known peptide entry inhibitors share the physicochemical property of being hydrophobic and amphipathic, with a propensity for binding to lipid membranes [58]. therefore, the properties of e peptide inhibitors were described here by five physicochemical parameters: pi, mw, gravy index (positive and negative gravy values indicate hydrophobic and hydrophilic peptides, respectively), solvent accessibility (exposed or buried) and secondary structure features (propensity for adopting α-helix, β-sheet or turn structure). these physicochemical features were calculated for each peptide and used as the eapphysico input in svm learning. a third svm model, eapcompo, was also built, in which the fractions of amino acids in a peptide were used as input features. svm training. the svm models were trained using the experimentally validated 75p+75n data set.
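as an illustration of the two sequence-derived inputs, the python sketch below computes the amino acid fractions (an eapcompo-style vector) and, for each residue type, a score $-\ln[P(a \mid \text{active})/P(a)]$ multiplied by the residue count (an eapscoring-style vector). the probability estimates are simple pooled frequencies with a pseudocount, which is an assumption: the authors' exact estimation formulas are not available in this text. the function names and toy peptides are hypothetical, and only the active-conditioned score is shown; the non-active analogue would condition on the non-active set instead.

```python
import math
from collections import Counter

# Feature order stated in the text.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(peptide):
    """Fraction of each amino acid: count of x divided by peptide length."""
    if not peptide:
        raise ValueError("peptide must be non-empty")
    counts = Counter(peptide.upper())
    return [counts[a] / len(peptide) for a in AMINO_ACIDS]


def residue_scores(active, inactive, pseudocount=1.0):
    """Per-residue score -ln(P(a | active) / P(a)).

    Assumed estimates: P(a | active) is the frequency of residue a pooled
    over the active peptides, and P(a) the frequency pooled over all
    peptides. The pseudocount avoids taking the logarithm of zero.
    """
    act = Counter("".join(active).upper())
    pooled = act + Counter("".join(inactive).upper())
    n_act = sum(act[a] for a in AMINO_ACIDS)
    n_all = sum(pooled[a] for a in AMINO_ACIDS)
    k = len(AMINO_ACIDS)
    scores = {}
    for a in AMINO_ACIDS:
        p_cond = (act[a] + pseudocount) / (n_act + k * pseudocount)
        p_any = (pooled[a] + pseudocount) / (n_all + k * pseudocount)
        scores[a] = -math.log(p_cond / p_any)
    return scores


def scoring_features(peptide, scores):
    """Twenty features: residue count n times the per-residue score."""
    counts = Counter(peptide.upper())
    return [counts[a] * scores[a] for a in AMINO_ACIDS]


# Toy sequences, invented for illustration only.
toy_active = ["WWLL", "WLLP"]
toy_inactive = ["KKGG", "KGGA"]
scores = residue_scores(toy_active, toy_inactive)
features = scoring_features("WWKA", scores)
print(dict(zip(AMINO_ACIDS, (round(f, 3) for f in features))))
```

with this sign convention, residues enriched among the active peptides receive negative scores, in line with reading the logarithmic form as a pseudo-energy.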
during 5-fold cross validation, the training set was randomly partitioned into five equally sized subsets (15p+15n each); in each round, four subsets were used for training and the remaining subset for validation. three svm models were built using sequence-based statistical scores, physicochemical properties and amino acid composition, respectively. the performances of the three models are shown in table 2. the eapscoring model performed best among the three models during 5-fold cross validation. a "grid search" combined with cross-validation was adopted to find the optimal parameters -c and -g of the svm models [49]. the result of the grid search is shown in the supporting information (s1 file). the performances of the three eap models during 5-fold cross validation improved significantly with the optimized parameters (table 2). the performance of the svm models was then evaluated using an independent dataset of experimentally validated peptides that were not contained in the training dataset (table 1). in the eapphysico model, where physicochemical properties of peptides were used as input features, an accuracy of 65% with an mcc value of 0.31 was observed (table 3). in the eapcompo model, where amino acid composition features were used, the predictive accuracy and the mcc value were slightly higher. when the sequence-based statistical function scores were used as input in the eapscoring model, an accuracy of 92% was achieved with an mcc value of 0.84. thus the sequence-based statistical scores developed in the present work are clearly superior to the conventional physicochemical properties or amino acid composition features in identifying active peptides derived from envelope proteins. avppred is a web server for predicting the activities of general antiviral peptides (avps) based on a number of experimentally validated positive and negative data sets [43].
the peptide inhibitors employed in avppred act on a variety of biological targets involved in virus infection. in contrast, the self-derived peptides of envelope proteins studied in the present work bind competitively to e proteins and thereby interfere with the virus fusion process. because the self-derived peptides share a similar mechanism of action, it is feasible to extract common features from them to build predictive svm models. to evaluate the performance in predicting peptide inhibitors of enveloped viruses, we compared the avppred models with our eappred models using the independent 26p+26n dataset as the test set. the results are shown in table 3. four different features were employed in the avppred models, namely conserved motif search using meme/mast, amino acid composition, sequence alignment using blast, and physicochemical parameters including secondary structure, charge, size, hydrophobicity and amphiphilic character [43]. when the avpmotif model was used to predict the activities of the self-derived peptide inhibitors, it performed rather poorly, with an accuracy of 52% and an mcc of 0.14. this is not surprising because avpmotif was developed based on 20 general antiviral peptide motifs, and the self-derived peptide inhibitors may not share a conserved motif with the general antiviral peptides, which interact with various biological targets through different mechanisms of action. in the avpalign model, the peptide sequences were classified into active and non-active databases and the query peptide sequences were matched against both databases using the blast program. compared with avpcompo and avpphysico, avpalign performed better, with a predictive accuracy of 73% and an mcc value of 0.52. the fusion mechanism is highly conserved among related viruses, and entry of viruses into host cells has been inhibited by peptides derived from various regions of envelope glycoproteins [59].
self-derived peptides would inhibit interactions of their original domain by mimicking its mode of binding to partner proteins [4]. because similar sequences are often associated with similar structure and function, the sequence-based avpalign feature can account for the activities of the self-derived peptide inhibitors, which regulate virus fusion by mimicking the binding to e proteins. in the avpphysico model, the 25 best-performing physicochemical properties were selected out of 544 properties to build the svm model [43]. antiviral peptide inhibitors are generally amphiphilic [60], and the activities of peptide entry inhibitors depend on their interfacial hydrophobicity [58]. we therefore employed only five physicochemical properties, reflecting hydrophobicity, solvent accessibility and secondary structure features, as svm input features. the accuracy and mcc of eapphysico are comparable to those of the avpphysico model, indicating that the five properties used in the current model building are critical for activity. the mcc value of the avpcompo model is 0.20, indicating that the antiviral activities of the peptides are related to amino acid composition. when amino acid composition was used as input, the predictive accuracy of the eapcompo model was higher than that of the avpcompo model, indicating that the peptide inhibitors of e proteins in the training set are sufficient to represent the contribution of amino acid composition to their inhibitory activities. in the eapcompo model, the preference of the amino acid composition was ranked as: p, r, q, d, f, w, e, l, t, i, n, h, y, c, a, s, m, v, k, g (fig 1). the role of arginine-arginine pairing and its contribution to protein-protein interactions has been investigated by computational approaches [61].
the higher abundance of r than k at protein-protein interfaces may be attributed to the formation of cation-π interactions and the greater capacity of the guanidinium group of r to form hydrogen bonds [62] [63] [64]. furthermore, it has been suggested that interface regions are enriched in aliphatic (l, v, i, m) and aromatic (h, f, y, w) residues and depleted in charged residues (d, e, k), with the exception of arginine [62] [65] [66] [67] [68] [69]. this is in agreement with our amino acid composition analysis, in which a higher occurrence of the aliphatic residue leu and of the aromatic residues trp and phe was observed, whereas the positively charged lys was hardly observed. the predominance of proline and glutamine residues is characteristic of the particular protein-protein interactions of e proteins. for example, a conserved proline-rich motif was suggested to be engaged in monomer-monomer interactions in dengue e proteins [70], and a conserved glutamine-rich layer is involved in the extensive hydrogen-bond network in hiv-1 gp41 [71]. thus the amino acid preference identified from the eapcompo model is generally in accordance with the predominant residues involved in protein-protein interactions, indicating that the amino acid composition of the self-derived peptide inhibitors is closely related to their potential activity in mediating the protein-protein interactions of the virus fusion process. because the antiviral activities of peptides depend on amino acid composition, we presume that amino acid composition weighted by the propensity for activity is an intrinsic feature of the self-derived peptide inhibitors, which share a common mechanism of action. when statistical function scores were employed in the svm model (eapscoring), a predictive accuracy of 92% with an mcc value of 0.84 was achieved, significantly better than any of the avp models.
the logarithmic form of the discriminatory function (eq 1) can be regarded as the pseudo-energy of the system. in our previous study, we suggested that the stability of proteins is related to their in situ binding potential to the partner regions [72]. the strong performance of the eapscoring model indicates that the sequence-based stability feature of self-derived peptides may reflect their potential for binding to e proteins so as to regulate the virus entry process. we developed three svm models using physicochemical properties, amino acid composition and the statistical discriminatory function as input features. the prediction accuracy and mcc value of the eapphysico model, in which five physicochemical properties were employed, are comparable with those of the previous avpphysico model, in which 25 physicochemical properties were used. the avpcompo and eapcompo models demonstrated that the activities of antiviral peptides depend on amino acid composition. a sequence-based scoring function was developed for the self-derived peptide inhibitors of e proteins. the superior performance of the eapscoring model supports our hypothesis that an intrinsic feature, represented by the propensity of each amino acid for being active in self-derived peptides, is responsible for the ability of the peptides to regulate virus fusion by mimicking the binding to their accessory proteins. the sequence-based statistical scoring function would be useful in the development of novel antiviral therapies targeting the initial step of viral infection. supporting information s1 file. parameter optimization by grid search combined with 5-fold cross validation. the x-axis is $\log_2 g$, the y-axis is $\log_2 c$ and the z-axis represents accuracy (%) ( figure a targeting cell entry of enveloped viruses as an antiviral strategy.
molecules
class iii viral membrane fusion proteins
virus membrane-fusion proteins: more than one way to make a hairpin
can self-inhibitory peptides be derived from the interfaces of globular protein-protein interactions?
characterization of a putative cellular receptor for hiv-1 transmembrane glycoprotein using synthetic peptides
propensity for a leucine zipper-like domain of human immunodeficiency virus type 1 gp41 to form oligomers correlates with a role in virus-induced fusion rather than assembly of the glycoprotein complex
a synthetic peptide from hiv-1 gp41 is a potent inhibitor of virus-mediated cell-cell fusion
hiv gp41 c-terminal heptad repeat contains multifunctional domains. relation to mechanisms of action of anti-hiv peptides
inhibition of human immunodeficiency virus type 1 entry in cells expressing gp41-derived peptides
peptides derived from a distinct region of gb virus c glycoprotein e2 mediate strain-specific hiv-1 entry inhibition
inhibition of severe acute respiratory syndrome-associated coronavirus (sars-cov) infectivity by peptides analogous to the viral spike protein
synthetic peptides outside the spike protein heptad repeat regions as potent inhibitors of sars-associated coronavirus
suppression of sars-cov entry by peptides corresponding to heptad regions on spike glycoprotein
design and characterization of glycoprotein-derived peptide inhibitors of arena virus infection
autoimmune technologies, safety, tolerability, and pk of escalating doses of flufirvitide-3 dry powder for inhalation in healthy subjects
peptide inhibitors of dengue virus and west nile virus infectivity
structural optimization and de novo design of dengue virus entry inhibitory peptides.
plos neglected tropical diseases antiviral peptides targeting the west nile virus envelope protein peptide inhibitors of dengue virus entry target a late-stage fusion intermediate inhibition of dengue virus entry into target cells using synthetic antiviral peptides inhibition of japanese encephalitis virus entry into the cells by the envelope glycoprotein domain iii (ediii) and the loop3 peptide derived from ediii a fusion-inhibiting peptide against rift valley fever virus inhibits multiple, diverse viruses a peptide derived from hepatitis c virus e2 envelope protein inhibits a post-binding step in hcv entry early events in hepatitis c virus infection: an interplay of viral entry human claudin-1-derived peptide inhibits hepatitis c virus entry peptides containing membraneinteracting motifs inhibit herpes simplex virus type 1 infectivity evidence for a role of the membrane-proximal region of herpes simplex virus type 1 glycoprotein h in membrane fusion and virus inhibition the identification and characterization of fusogenic domains in herpes virus glycoprotein b molecules analysis of synthetic peptides from heptad-repeat domains of herpes simplex virus type 1 glycoproteins h and b conformational modifications of gb from herpes simplex virus type 1 analyzed by synthetic peptides multiple peptides homologous to herpes simplex virus type 1 glycoprotein b inhibit viral infection peptide inhibition of human cytomegalovirus infection a theoretical approach to spot active regions in antimicrobial proteins optimization of antibacterial peptides by genetic algorithms and cheminformatics antibp2: improved version of antibacterial peptide prediction application of 'inductive' qsar descriptors forquantification of antibacterial activity of cationic polypeptides qsar analysis of antimicrobial and haemolytic effects of cyclic cationic antimicrobial peptides derived from protegrin-1 design of novispirin antimicrobial peptides by quantitative structure-activity relationship qsar 
modeling and computer-aided design of antimicrobial peptides identification of novel antibacterial peptides by chemoinformatics and machine learning de novo design of potent antimicrobial peptides evaluating different descriptors for model design of antimicrobial peptides with enhanced activity toward p. aeruginosa avppred: collection and prediction of highly effective antiviral peptides structures and mechanisms of viral membrane fusion proteins: multiple variations on a common theme a simple method for displaying the hydropathic character of a protein scratch: a protein structure and structural feature prediction server an all-atom distance-dependent conditional probability discriminatory function for protein structure prediction support-vector networks working set selection using second order information for training svm a practical guide to support vector classification. initial version comparison of the predicted and observed secondary structure of t4 phage lysozyme distance-scaled, finite ideal-gas reference state improves structure-derived potentials of mean force for structure selection and stability prediction a distance-dependent atomic knowledge-based potential for improved protein structure selection statistical potential for assessment and prediction of protein structures comparison of database potentials and molecular mechanics force fields scoring functions for de novo protein structure prediction revisited peptide inhibitors against herpes simplex virus infections peptide entry inhibitors of enveloped viruses: the importance of interfacial hydrophobicity structure-based design of inhibitors of protein-protein interactions: mimicking peptide binding epitopes broad-spectrum antivirals against viral fusion the molecular origin of like-charge arginine -arginine pairing in water residue frequencies and pairing preferences at proteinprotein interfaces dissection of specific and non-specific protein-protein interfaces dissecting protein-protein recognition 
sites principles of protein-protein interactions studies of protein-protein interfaces: a statistical analysis of the hydrophobic effect the atomic structure of protein-protein recognition sites genome-wide studies of protein-protein interaction the interface of protein-protein complexes: analysis of contacts and prediction of interactions prediction of protein-protein interactions in dengue virus coat proteins guided by low resolution cryoem structures the fusion activity of hiv-1 gp41 depends on interhelical interactions computational identification of self-inhibitory peptides from envelope proteins the authors are grateful for the computing resources from qub high performance computing centre. the authors declare no conflict of interest. key: cord-006860-a3b8hyyr authors: nan title: 40th annual meeting of the gth (gesellschaft für thrombose- und hämostaseforschung) date: 1996 journal: ann hematol doi: 10.1007/bf00641048 sha: doc_id: 6860 cord_uid: a3b8hyyr nan the variable molecular weight (mw) of vwf is due to differences in the number of subunits comprising the protein. it is assumed that endothelial cells secrete large polymeric forms of vwf and that smaller species arise from proteolytic cleavage. vwf has two main properties: it stabilizes factor viii, protecting it from inactivation by activated protein c or factor xa, and it mediates platelet adhesion to the subendothelium of the damaged blood vessel wall. each vwf subunit contains binding sites for collagen and for platelet glycoproteins gp ib and gp iib/iiia. multiple interactions of the multivalent vwf lead to extremely strong binding of platelets to the subendothelial surface, which is capable of resisting the high wall shear rate in the circulating blood. only the largest multimers are hemostatically active. lack of the largest vwf multimers was observed in patients with von willebrand disease type 2a. unusually large molecular forms of vwf were found in patients with thrombotic thrombocytopenic purpura.
proteolytic enzyme(s) may be involved in the physiologic regulation of the polymeric size of vwf and thus play an important role in the pathogenesis of vwf abnormalities in some patients with congenital or acquired disorders of hemostasis. we have purified (~10,000-fold) from human plasma a vwf-degrading protease using affinity chromatography and gel filtration. the proteolytic activity was associated with a high mw protein (mr ~300 kd). vwf was resistant to the protease in a physiologic buffer but became degraded at low salt concentration or in the presence of 1 m urea. proteolytic activity had a ph optimum at 8-9 and was not inhibited by serine protease inhibitors or sulfhydryl reagents. inhibition by chelating agents was best reversed by barium ions. the observed properties of the vwf-degrading enzyme differ from those of all hitherto described proteases. analysis of cleaved vwf showed that the peptide bond 842tyr-843met had been cleaved, the same bond that has been proposed to be cleaved in vivo. the endothelium releases the vasodilator nitric oxide (no) and the vasoconstrictor endothelin (et)-1. no is formed from l-arginine via the activity of constitutive nitric oxide synthase (cnos or enos). an inducible form of nos (inos) is activated by cytokines. no activates guanylyl cyclase in vascular smooth muscle and platelets, leading to the formation of cgmp which induces relaxation or platelet inhibition, respectively. in vessels, no is responsible for endothelium-dependent relaxations; in vivo it exerts a vasodilator tone which can be enhanced by shear forces and receptor-operated agonists such as acetylcholine, bradykinin, thrombin, atp and adp. infusion of no-inhibitors in vivo leads to vasoconstriction and increases in blood pressure and oral administration to hypertension in the rat. within the endothelium, no inhibits et gene expression and release of the peptide via cgmp. hence, no-induced hypertension is associated with increased plasma et levels.
et, a 21-amino acid peptide, has potent vasoconstrictor properties via eta- and in part etb-receptors on vascular smooth muscle. in endothelial cells, et activates etb-receptors linked to no and prostacyclin formation. under basal conditions, little et is formed, but its formation is increased by thrombin, angiotensin ii, arginine vasopressin, cytokines and ox-ldl. et antagonists have been developed and allow the effects of et to be studied in vivo. et and no most likely play an important role in disease states such as hypertension, atherosclerosis, coronary artery disease, heart failure, pulmonary hypertension and subarachnoid hemorrhage. clinical trials to further define their role in these disease states are now under way. in summary, the endothelium is an important regulator of vascular tone and structure in vitro and in vivo. in disease states, their interaction is imbalanced, leading to enhanced vasoconstriction, thrombus formation and structural changes of the blood vessel wall. pharmacological tools aiming to inhibit those changes are now being developed. j.m. harlan, r.k. winn, s. sharar, and n. vedder, university of washington, seattle, washington. ischemia-reperfusion injury has been implicated in the pathogenesis of a wide variety of clinical disorders. in preclinical models, tissue damage clearly occurs during ischemia, but, paradoxically, may be exacerbated during reperfusion. this reperfusion injury appears to involve activation of the inflammatory cascade with generation of complement components, lipoxygenase products, and chemokines as proximal mediators and neutrophils as final effectors of vascular and tissue damage. we have examined the role of leukocyte adhesion in reperfusion injury in two models: the rabbit ear as a model of isolated organ injury, and hemorrhagic shock and resuscitation in the rabbit and primate as a model of traumatic shock and multiple organ failure.
data regarding the efficacy, timing, and safety of leukocyte adhesion blockade using selectin- or integrin-directed reagents in these models will be presented. the current status of anti-adhesion therapy in other pre-clinical models and early clinical trials will be reviewed. an amidolytic assay for the determination of activated protein c (apc)-resistant factor va (fva) has been developed. this assay measures the cofactor activity of fva in diluted plasma samples via the rate of thrombin formation. the apc response is calculated from two fv determinations: one performed in the presence (apc-fv) and one in the absence of recombinant apc. the apc-fv activity is expressed as a percentage of the initial fv activity and indicates the sensitivity of fva to apc. normal ranges were established by analysing plasma samples of 100 healthy individuals, and an apc-fv activity above 60% was found to be indicative of apc-resistance (apc-r). in a control group of 34 patients the apc-r assay gave abnormal results in 15 patients. dna analysis confirmed a heterozygous fv r506q mutation in all 15 patients and confirmed the non-carrier status in all of the 19 patients yielding normal results. an aptt-based apc-r assay performed on the same group of patients showed abnormal results in two of the non-carrier patients. one of these patients was diagnosed as positive for lupus anticoagulant, whereas the reason for the false-positive result in the second patient remains unclear. eleven patients were analyzed before the start of oral anticoagulation and during oral anticoagulant treatment. comparison of the assay results demonstrates a correlation of 96%, indicating that the assay is independent of the activities of vitamin k-dependent clotting factors. the apc-r amidolytic assay allows specific and sensitive detection of fva resistant to apc. the assay can be performed on plasma samples of all persons in whom the diagnosis of apc-r may be indicated.
in patients treated with oral anticoagulants or showing other clotting abnormalities affecting the aptt, the apc-r amidolytic assay is helpful to establish the diagnosis of apc-r. dept of pediatrics, university hospitals kiel and münster, germany. resistance to activated protein c (apcr), in the majority of cases associated with the arg 506 gln point mutation in the factor v gene, is present in more than 50 % of patients < 60 years of age with unexplained thrombophilia. to determine to what extent this relatively common gene mutation affects the risk of thromboembolic events in infants and children, its occurrence was investigated in a population of children with unexplained venous or arterial thromboembolism: thrombosis of the central nervous system (cns, n=4), vena portae (n=4), deep vein thrombosis (n=4), vena caval occlusion (n=4), neonatal renal venous thrombosis (rvt; n=3), neonatal stroke (n=14), stroke (n=3), arteria femoralis occlusion (n=2). four out of these 38 patients showed a positive history of unexplained familial thrombophilia. apcr was measured in an activated partial thromboplastin time (aptt) assay according to dahlbäck. the results were expressed as apc-ratios: clotting time obtained using the apc/cacl2 solution divided by clotting time obtained with cacl2. given the special properties of the childhood hemostatic system, infants and children with apcr < 2 were considered to be apc-resistant only when the results were confirmed in a 1:11 dilution with factor v deficient plasma (instrumentation laboratory munich, germany). plasma of 268 healthy children served as controls. the arg 506 gln mutation of the factor v gene was assayed by amplification of the dna samples by pcr followed by digestion of the amplified products with the restriction enzyme mnl i. results were confirmed by sscp analysis or by direct sequencing of dna from patients with apcr.
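the apc-ratio rule described in the methods above can be sketched in a few lines. this is only an illustration under stated assumptions: the function and parameter names are hypothetical, and the use of the same cutoff of 2 for the confirmatory measurement in factor v deficient plasma is an assumption, since the abstract does not state the cutoff applied to the diluted sample.

```python
def apc_ratio(clot_time_with_apc_s: float, clot_time_without_apc_s: float) -> float:
    """APC ratio: clotting time with APC/CaCl2 divided by clotting time with CaCl2 alone."""
    if clot_time_without_apc_s <= 0:
        raise ValueError("clotting time without APC must be positive")
    return clot_time_with_apc_s / clot_time_without_apc_s


def is_apc_resistant(ratio_undiluted: float, ratio_diluted: float, cutoff: float = 2.0) -> bool:
    """Classify a child as APC-resistant only if a low ratio is confirmed after dilution.

    Assumption: the same cutoff (2.0) is applied to the sample diluted in
    factor V deficient plasma; the abstract does not give that value.
    """
    return ratio_undiluted < cutoff and ratio_diluted < cutoff


# Hypothetical clotting times in seconds, for illustration only.
ratio = apc_ratio(60.0, 40.0)
print(ratio, is_apc_resistant(ratio, 1.6))
```

requiring both measurements to fall below the cutoff mirrors the two-step confirmation in the text: a low ratio in undiluted plasma alone is not sufficient in this age group.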
consistent with the aptt-based method, 10 out of 19 children with venous (v) thrombosis and eight out of 19 patients with arterial (a) vascular insults showed the common factor v mutation. additional coagulation defects (antithrombin, protein c type i, enhanced antiphospholipid igg, enhanced lipoprotein (a)) were found in 30% (v) and 28% (a). furthermore, we diagnosed exogenic reasons (septicemia, postpartal asphyxia, fetopathia diabetica, central line and steroid/asparaginase administration) in six out of 10 (v) and three out of 8 (a) children with thrombosis and apcr. all four patients with a positive family history of thrombophilia (mothers only!) showed the common factor v mutation arg 506 gln. in the control group the prevalence of apcr was 5.1%. the high incidence of additional exogenic factors in children with apcr confirms literature data on previously described inherited coagulation disorders during infancy and childhood: an acquired risk of thromboembolic disorders masks the coagulation deficiency in the majority of patients with an inherited prethrombotic state. furthermore, the incidence of 42 % apc-resistant children with arterial insults in this study challenges the view that apcr is associated with venous but not with arterial thrombosis. 12 activated protein c resistance and plasminogen deficiency in a family with thrombophilia m. züger 1, f. demarmels biasiutti 1, ch. mannhalter 2, m. furlan 1, b.~e 1 1 hämatologisches zentrallabor der universität, inselspital, ch-3010 bern 2 klinisches institut für medizinische und chemische labordiagnostik, universität wien, a-1090 wien several hereditary defects of the proteins regulating blood coagulation have been associated with familial thrombophilia.
since the recent discovery of activated protein c (apc) resistance due to the factor v r506q mutation as a highly prevalent hereditary risk factor for venous thromboembolism (te), evidence is accumulating that familial thrombophilia may be due to a combination of genetic defects. thus, protein c- or protein s-deficient patients having suffered from te seem to be more likely to carry the factor v r506q mutation than expected from its allelic frequency in the population. we report a family (see figure) in which plasminogen deficiency (0.60 u/ml) had been found in the propositus, who had twice had postoperative deep vein thrombosis (dvt), at ages of 29 and 31 yrs, respectively, as well as in 5 family members (0.53-0.69 u/ml). out of these 5 plasminogen-deficient individuals, only the propositus' daughter had suffered from recurrent dvt at age <20 yrs. reinvestigation of this family in 1995 showed the factor v r506q mutation in the propositus, his daughter, an asymptomatic sister and a brother with postoperative pulmonary embolism (pe). his father had had postoperative pe; he is deceased and could not be examined. [figure: family pedigree; legend symbols: plasminogen deficiency, factor v r506q mutation, propositus, history of dvt and/or pe, superficial phlebitis, not investigated, deceased] even though this family is small for establishing an unequivocal association of te with known defects, the two most severely affected individuals with recurrent te at ages <30 yrs had combined plasminogen deficiency and apc resistance whereas those with isolated plasminogen deficiency were asymptomatic. these data support the concept of multigenic interactions leading to familial thrombophilia. resistance to activated protein c (apc resistance) is the most common risk factor for venous thrombosis (vt). in most cases apc resistance is caused by a single point mutation at position arg 506 in the factor v gene (factor v leiden).
while ample data in heterozygous patients have been published, reports in homozygous patients are limited. we studied 29 patients (12 males [m], 17 females [f]) in whom a homozygous mutation had been verified by dna analysis. the median age at the time of the study was 38.8 years (y) (range 9-83 y). twenty-five patients had experienced vt (10 m, 15 f). four patients were discovered during family studies and were asymptomatic; three were children (between 9 and 13 y) and one patient was a 62 y old man. in males the first thrombosis occurred at a median age of 38 y (range 21-82 y); in females this was at a significantly younger median age of 26 y (range 17-49 y). twelve of the 15 symptomatic females had taken oral contraceptives (oc, estrogen content 0.02-0.1 mg) for 6 to 150 months (median 71 months) prior to thrombosis. in 2 women vt occurred during pregnancy, in 1 female it was precipitated by hormone replacement therapy. in contrast, in 8 of 10 males the thrombosis happened spontaneously, in 2 males it followed surgery. the sites of thrombosis were dvt in 9 males and 14 females, dvt and pulmonary embolism (pe) in 4 females and 1 male, dvt and caval vein thrombosis in 1 female and superficial thrombophlebitis in 4 males and 1 female. eight females had at least one pregnancy, in total 11 children and 5 abortions. two had thrombotic events during pregnancy and 2 after delivery. all homozygous patients showed apc ratios between 1.12 and 1.68 (mean 1.39 ± 0.19). conclusion: patients with homozygous fv leiden have clinical symptoms similar to those of patients with antithrombin, protein c or protein s deficiency. however, in contrast to these defects, a very high risk during oral contraceptive medication, leading to an earlier manifestation in females, can be observed.
several synthetic (efegatran, argatroban, inogatran and napsagatran) and recombinant (hirudin, peg-hirudin and hirulog) antithrombin agents are in different stages of clinical development for cardiovascular and thrombotic indications. while the specificity of these agents for thrombin is a concern, little has been done to study the effects of these agents on other serine proteases involved in coagulation and fibrinolytic processes. fibrinolytic compromise by site-directed thrombin inhibitors has been reported recently (thromb res 74(3): 193-205, 1994). while these agents have been shown to inhibit plasmin and related enzymes, little or no information on their effects on the generation and functionality of apc is available. since apc plays an increasingly important role both as an anticoagulant enzyme, by inhibiting factors v and viii, and as a pro-fibrinolytic enzyme, by stimulating the release of t-pa from endothelial sites, an inhibition of apc may result in both a pro-coagulant state and a fibrinolytic deficit. representative thrombin inhibitors (dup 714, a prototype boronic acid peptide derivative, efegatran, argatroban, hirulog, hirudin and peg-hirudin) have been compared for their ability to inhibit apc (american red cross). these biochemically defined studies, in which the remaining activity of apc after incubation with a thrombin inhibitor was determined spectrophotometrically with a chromogenic substrate (s-2288, pharmacia, franklin, oh), demonstrated that dup 714 and efegatran inhibit apc in a concentration-dependent manner (ic50 = 0.75 and 8.4 µm, respectively), hirulog inhibits apc weakly (60 µm produced only 25% inhibition), while argatroban and hirudin have no anti-apc activity. while hirulog, hirudin and argatroban produced no direct anti-apc activity, it is conceivable that they may inhibit thrombomodulin-bound thrombin and thus prevent activation of protein c, resulting in a functional apc deficit and failure to improve clinical outcomes despite higher dosage.
while initially it was thought that sole targeting of thrombin would provide monospecific anticoagulant agents devoid of some of the adverse effects observed with heparin, the recent clinical trials clearly suggest that thrombin is not the only determinant of thrombogenesis. furthermore, potent antithrombin agents such as hirudin, hirulog and peptides indirectly inhibit the generation of apc by compromising thrombomodulin-bound thrombin, and such agents as efegatran and dup 714 also produce direct apc inhibition. endogenous inhibition of formed apc by thrombin inhibitors may therefore compromise the feedback regulatory functions of apc and may lead to thrombotic amplification in fully anticoagulated patients. these studies warrant preclinical assessment of thrombin inhibitors to evaluate their relative inhibitory effects on apc. poor anticoagulant response to activated protein c (apc resistance) causes a significant portion of deep vein thrombosis (dvt) whereas its association with coronary artery disease (cad) and myocardial infarction (mi) is still controversial. therefore, we investigated 346 recently hospitalised patients suffering from cad with or without previous mi. the cad was proven by coronary angiography. apc resistance was analysed by using the aptt-based assay coatest apc resistance (chromogenix). eleven patients showed an apc sensitivity index below 2.0, viewed as apc resistance. using pcr technology, the factor v mutation causing apc resistance (1691 g → a) was shown in nine of these eleven patients. this represents 2.6 % (9/346), compared to 8.5 % found in healthy blood donors (8/94). one homozygous carrier (male, age 76) was identified (apc sensitivity index 1.4) who suffered from dvt at age 59. recent angiography demonstrated diffuse cad; no thrombotic events were reported in his family. in contrast, multiple thrombotic manifestations (dvt, mi, stroke) occurred in the relatives of four heterozygous patients.
we conclude that the prevalence of apc resistance is rather low in patients with cad. nevertheless, the natural history of coronary manifestation of apc resistance seems to vary, probably depending on the presence and severity of cardiovascular risk factors. resistance to activated protein c (apc resistance) is the most common hereditary cause of thrombophilia and is significantly linked to factor v leiden. pcr-based methods are used to identify the crucial point mutation in the factor v gene. we designed primers in order to identify factor v leiden by allele-specific pcr amplification. amplification specificity for factor v was ensured by the 3' primer fv1001, located at the intron 9/intron 10 border of the gene. one sense and two antisense primers were used in two separate primer mixes specific for factor v arg506 (wild type) or factor v gln506 (factor v leiden), yielding 235 bp products each. in each pcr reaction a pair of primers amplifying a fragment of the human growth hormone gene was included, functioning as an internal positive amplification control (429 bp pcr fragment). after an initial denaturation step, 10 µl samples (100 ng genomic dna) were subjected to 10 two-temperature cycles followed by 20 three-temperature cycles. for visualisation, 8 µl of the amplification product were run on a 2% agarose gel prestained with ethidium bromide. the presence or absence of specific pcr amplification allowed definite allele assignment without the need for any postamplification specificity step. the internal positive control primers indicate a successful pcr amplification, allowing the assignment of homozygosity. in a prospective study 126 patients with thromboembolic events were analysed using this technique and compared with pcr-rflp according to bertina et al. the concordance between these techniques was 100 %. in 27 patients a heterozygous factor v gln506 mutation was detected, whereas one patient with recurrent thromboembolism was homozygous.
no false-positive or false-negative results were observed in the homozygous or heterozygous samples. in addition, in 15 samples identified as carrying the point mutation by allele-specific pcr amplification, automatic sequencing confirmed the heterozygous or homozygous point mutation. due to its time- and cost-saving features, allele-specific amplification should be considered for screening of factor v leiden. background: an initial intravenous course of unfractionated heparin adjusted on the basis of the activated partial thromboplastin time is the current standard treatment for most patients with venous thrombosis. low-molecular-weight heparin preparations can be administered subcutaneously, once or twice daily, without laboratory monitoring. we compared the relative efficacy and safety of low-molecular-weight heparin versus unfractionated heparin for the initial treatment of deep venous thrombosis. methods: english-language reports of randomized trials were identified through a medline search (1984 through 1995) and a complementary extensive manual search. reasons for exclusion from the analysis were no heparin dosage adjustments, the lack of use of objective tests for deep venous thrombosis, dose-ranging studies that used higher doses of low-molecular-weight heparin than are currently in use, and the failure to provide blind endpoint assessment. we assessed the incidence of symptomatic recurrent venous thromboembolic disease, the incidence of clinically important bleeding, and mortality. results: twelve of the 21 identified trials satisfied the predetermined criteria. the relative risk reductions for symptomatic thromboembolic complications, clinically important bleeding, and mortality varied from 20-60% and were all statistically significantly in favor of low-molecular-weight heparin.
conclusions: low-molecular-weight heparins administered subcutaneously in fixed doses adjusted for body weight and without laboratory monitoring are more effective and safer than adjusted-dose standard heparin. since low-molecular-weight heparins vary in composition and pharmacological profile, the benefits of each should be established separately. unfractionated heparin (uh) and low molecular weight heparin (lmwh) are widely used for the prevention and treatment of thrombotic disorders. uh and lmwh induce platelet aggregation in vitro. rgd peptides compete with fibrinogen for binding to the glycoprotein receptor (gp iib-iiia) of platelets and inhibit platelet aggregation. to inhibit the heparin-induced platelet aggregation and prolong the half-life of rgd peptides in blood, we linked ac-rgdv-ssggs-ahx-yk covalently to lmwh in a ratio of 1:1. the peptide is composed of three regions: a. rgd- gives the specificity for the receptor gp iib-iiia; b. -ssggs-ahx- is the spacer between carrier and ligand, which should facilitate the interaction between the conjugate and the gp iib-iiia receptor; c. -yk are functional amino acids for iodination (y) and for covalent attachment (k) to the carrier lmwh. the aggregation achieved with different concentrations of lmwh, lmwh-conjugate and lmwh/rgd-peptide mixture in a ratio of 1:1 was measured after 20 min; maximum aggregation after platelet activation with 10 µm adp was set equal to 100%. platelet aggregation in normal human platelet-rich citrated plasma (prp; 220 000/µl) was induced by lmwh in a dose-dependent manner. heparin can induce antibodies which interact with platelets and endothelial cells. this causes thrombocytopenia and thromboembolic complications. hit patients need effective parenteral anticoagulation. we treated 82 patients (30 m, 52 f), median age 61 years (18-84), with laboratory-proven hit (hipa test) with recombinant (r-)hirudin.
as these patients had been preselected by their immunological response during heparin treatment and the treatment duration of the study was longer than in any other study using hirudin, all patient samples were investigated for anti-r-hirudin antibodies. hirudin antibodies were screened by a sandwich elisa using r-hirudin fixed to the solid phase as antigen. all plasma samples were screened for anti-hirudin antibodies of the igg class, but so far only a subset of samples for ige anti-r-hirudin antibodies. 38 of 82 patients (46.3%) developed anti-hirudin antibodies of the igg class. anti-hirudin antibodies were not detectable before 6 days of r-hirudin administration. so far no ige anti-hirudin antibodies have been found. none of the patients developed thrombocytopenia or allergic symptoms. however, in a subset of patients the anti-hirudin antibodies enhanced the anticoagulatory effect of r-hirudin. in 5 patients the hirudin dosage had to be decreased 2-3 fold to maintain a stable aptt level; in 3 patients, despite a stable r-hirudin maintenance dose, the aptt increased to values >100 sec. during the study 4 patients with anti-hirudin antibodies had to be re-exposed to a second course of r-hirudin for parenteral anticoagulation. none of these patients developed any allergic reaction. in conclusion, we found a high proportion of anti-hirudin antibodies in hit patients treated with r-hirudin for more than 7 days. these antibodies seem to have minor clinical relevance in regard to allergic reactions. however, one has to consider that these antibodies may influence the pharmacokinetics of r-hirudin and thereby enhance its anticoagulatory potency.
therefore, aptt must be monitored closely in patients receiving r-hirudin for more than 5 days. a major concern in the use of hirudin, the most potent and specific thrombin inhibitor, is the risk of bleeding associated with the potential effect of this drug on hemostasis, particularly when the antithrombotic therapy is combined with invasive procedures, fibrinolytic treatment, or a patient's predisposition to abnormal bleeding. thus, availability of an antagonist to hirudin would be essential for instant neutralization of the antithrombotic action. however, such a hirudin antagonist is unknown in nature. to prepare an antagonist to hirudin, a mutant derivative of human prothrombin, in which the active site aspartate at position 419 is replaced by an asparagine, has been designed, expressed in recombinant chinese hamster ovary cells, and purified to homogeneity. d419n-prothrombin was converted to the related molecules d419n-meizothrombin and d419n-thrombin by limited proteolysis by e. carinatus venom and o. scutellatus venom, respectively. both d419n-thrombin and d419n-meizothrombin exhibited no thrombin activity, and titration did not detect the active site. however, binding to solid phase immobilized hirudin and fluorescence studies confirmed that the binding to the most specific thrombin inhibitor, hirudin, was conserved in both proteins. in vitro examinations showed that d419n-thrombin and d419n-meizothrombin bind to immobilized hirudin, neutralize hirudin in the purified system as well as in human blood plasma, and re-activate the thrombin-hirudin complex. animal model studies confirmed that d419n-thrombin and d419n-meizothrombin act as hirudin antagonists in the blood circulation without detectable effects on the coagulation system. while i.v.
injections of hirudin in mice resulted in an increase in partial thromboplastin time, thrombin time and anti-thrombin potential, additional injections of d419n-thrombin and d419n-meizothrombin resulted in a normalization of these coagulation parameters. elevation of plasma homocysteine is a hereditary disorder of methionine metabolism associated with a high risk of arterial vascular disease. however, as yet relatively little attention has been directed towards the association between hyperhomocysteinemia and juvenile venous thromboembolism (vte). consequently the aim of our study was to evaluate the prevalence of hyperhomocysteinemia (hyper-hcys) in patients with juvenile vte. patients: 85 patients (29 men, median age 42 ys; 56 women, median age 35 ys) who had at least one verified episode of vte before the age of 45 ys were investigated in regard to their total plasma hcys levels. none of the patients had renal or liver dysfunction or evidence of any autoimmune or neoplastic disease. methods: plasma total homocysteine levels were determined by hplc with fluorescence detection. hyperhomocysteinemia was defined as hcys levels exceeding the upper limit of the normal range obtained in our laboratory from 80 healthy control subjects (40 males, median age 25 ys, hcys 95% ci: 2.02-5.67 µmol/l; 40 females, median age 27.5 ys, hcys 95% ci: 2.99-5.40 µmol/l). results: 13 out of 85 patients had hyper-hcys, giving a prevalence of 15.3%. of these 13 patients, 9 were male and 4 female, indicating that the relation between elevated plasma hcys levels and vte may not be as strong in women as in men. discussion: in accordance with previous reports, our study shows that there is a high prevalence of hyper-hcys in patients with juvenile vte. however, the mechanisms by which hyper-hcys can provoke vte are not yet known, nor is it known whether hcys is an exclusive risk factor or contributes to other existing predispositions, possibly working as a trigger factor. 
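the prevalence figure reported in the results above (13 of 85 patients) can be re-derived with a few lines of python. the sketch below is illustrative only: the function name is hypothetical, and the wilson score interval is an assumption added here to convey the uncertainty of a proportion from a sample of this size; the abstract itself reports only the point estimate.

```python
import math

def prevalence_with_wilson_ci(cases, total, z=1.96):
    """Return (prevalence, lower, upper) using the Wilson score interval.

    The interval is an illustrative addition; the abstract gives only
    the point estimate (13/85).
    """
    p = cases / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return p, centre - half, centre + half

# counts reported in the abstract: 13 of 85 patients with hyper-hcys
p, lo, hi = prevalence_with_wilson_ci(13, 85)
print(f"prevalence: {p:.1%}, approx. 95% interval: {lo:.1%} to {hi:.1%}")
```

the same function can be applied to the sex-specific counts if the corresponding denominators (29 men, 56 women) are used.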
some authors suggest an hcys-induced effect on factor v activation or inhibition of thrombomodulin-dependent protein c activation. in addition an influence on thrombocyte aggregation has been postulated. conclusion: measurement of hcys levels may be useful in the evaluation of patients with a history of juvenile venous thromboembolism and could be clinically important as hyper-hcys is easily corrected by vitamin supplementation. detailed determination of the pathogenesis of vte in patients with hyper-hcys should be the aim of further investigations. a deficiency of one of the coagulation inhibitors antithrombin (at), protein c (pc) or protein s (ps) and resistance to activated protein c (apc resistance) are established risk factors for venous thromboembolism (vte). in the majority of patients with apc resistance, the arg 506 gln mutation (factor v leiden) is present. whereas deficiencies of one of the coagulation inhibitors are rare in the normal population, the allele frequency of factor v leiden is 2-7% in western europe. heterozygous individuals have a 3- to 7-fold, homozygous an 80-fold increased risk for vte. the typical clinical features of all abnormalities are deep vein thrombosis, pulmonary embolism, superficial vein thrombosis and thrombosis at unusual sites, like mesenteric vein thrombosis or cerebral vein thrombosis. the thrombotic risk is low during childhood, but increases considerably after the 13th year of age. a retrospective study in adult patients out of families with a symptomatic deficiency of at, pc or ps revealed that around 30% of surgical interventions and traumas of the lower extremities were complicated by vte. therefore, these patients should receive thrombosis prophylaxis after surgery and trauma if they are older than 13 years. pregnancy is associated with a very high risk for vte in individuals with at deficiency and prophylaxis should be initiated already in the first trimester. 
after delivery, thrombosis prophylaxis is advised for all females known to have an abnormality. oc increase the risk, especially in at deficient and in homozygous factor v leiden females and are therefore contraindicated in these individuals. oc also increase the risk for vte in patients heterozygous for factor v leiden, and females known to have this abnormality should be discouraged from taking oc or should at least be informed of their increased risk. university hospital, jerusalem, israel, hospital beaujon, paris, france. an increased frequency of thromboembolic events has been observed in patients with b-thalassemia. our findings of shortened platelet survival and enhanced urinary excretion of thromboxane a2 metabolites (blood 77:1749, 1991) suggested an increased platelet activation in these patients. we also found that isolated thalassemic rbc enhance prothrombin activation, suggesting an increased membrane exposure of procoagulant phospholipids, i.e. phosphatidylserine (am. j. hematol. 44:33, 1993). we now show that annexin v, which has a high specificity and affinity for anionic phospholipids, inhibits prothrombin activation by factor xa by binding to thalassemic rbc (ic50 = 0.32 nm). kerckhoff-klinik, bad nauheim; medizinische poliklinik, bonn; institut für immunologie und transfusionsmedizin, universität greifswald. antibody-mediated intravascular platelet activation is believed to be the basis for both arterial and venous thrombosis in patients with hat. while the development of arterial thrombosis can be explained sufficiently by intravascular platelet activation, it is a matter of discussion whether additional risk factors are involved in the pathogenesis of hat-related venous thrombosis. since resistance to activated protein c (apc) is the most common inherited risk factor for venous thrombosis described, the frequency of apc resistance among a population of 68 hat patients has been studied. 
hat was diagnosed using the heparin-induced platelet aggregation assay and confirmed by the 14c-serotonin release test. the diagnosis of apc resistance was established by two functional assays and genetic analysis. at the time of diagnosis of hat, 38 patients showed venous thromboembolic complications. among these, 24 were found positive for apc resistance. pulmonary embolism was diagnosed in 18 hat patients, 14 of whom were apc resistance positive. none of the 18 hat patients who showed exclusively thrombocytopenia were apc resistance positive. early oral anticoagulation (oa) was initiated in 7 patients after the diagnosis of hat had been established. six of these patients developed serious thrombotic complications including skin necrosis. these results demonstrate that apc resistance is an additional and common risk factor for the development of hat-related venous thrombosis. early initiation of oa during an acute episode of hat dramatically increases the risk of thrombosis. therefore, oa in hat patients should be initiated only after platelet counts have returned to baseline levels and effective parenteral anticoagulation is achieved. controlled trials for primary and secondary prevention of stroke. g. de gaetano, c. cerletti and v. bertelé, consorzio mario negri sud, santa maria imbaro, italy. this presentation will review the antithrombotic treatments to prevent ischemic stroke that have been evaluated in controlled clinical trials. in two studies of aspirin therapy for primary prevention in male physicians there was no reduction in the incidence of stroke, while that of first myocardial infarction was significantly lowered. similar results were obtained in a prospective study in a large cohort of women taking aspirin daily. the incidence of vascular death was not modified by aspirin in any of these trials. 
this is possibly due to an excess of strokes associated with aspirin treatment: indeed the four vascular events avoided in 1000 us physicians under aspirin prevention for five years would result from five myocardial infarctions and one vascular death avoided, offset by two additional strokes. oral anticoagulant therapy decreases the relative risk of stroke in patients with non-valvular atrial fibrillation. warfarin appears to be superior to aspirin, but the latter drug is a useful alternative when long-term anticoagulant therapy cannot be administered. a meta-analysis of about 150 trials and over 100,000 patients with different vascular diseases treated with aspirin (at different doses) and/or other platelet inhibitors showed a 25% overall reduction of vascular events including stroke. the optimal dose of aspirin for secondary stroke prevention could not be established. in patients with previous minor strokes or tia there was a 22% reduction of vascular events and a 23% reduction of non-fatal strokes. the avoidance of nine strokes of any cause among the 38 expected in 1000 patients at risk would result from the sum of 10 ischemic events avoided and one haemorrhagic event occurring in excess. ticlopidine was reported to reduce the risk of stroke in two large trials (one in patients with major stroke), but there is no evidence that it is better or safer than aspirin. we compared the effect of the direct specific thrombin inhibitors napsagatran (na) and rec. hirudin (rh) with unfractionated heparin (uh) on the further growth of preformed thrombi. as a model of thrombogenesis, an annular perfusion chamber exposing rabbit aortic subendothelium was perfused with native rabbit blood at an arterial wall shear rate (650/s). fibrin and platelet thrombi were allowed to form during a 10 min perfusion period, after which the test agents were given iv as a bolus and a continuous infusion (3 and 10 µg/kg/min, n=6) and the perfusion continued for 20 min. 
the control groups were perfused for 10 or 30 min (n=6). fibrin deposited and platelet thrombi formed on subendothelium were evaluated by microscopic morphometry. the % surface coverage with fibrin was not reduced in the drug-treated groups since fibrin deposition was similar in the 10 and 30 min control groups (39±8% and 33±6%, respectively, mean±sem). platelet thrombus area (ta) in the control groups increased from 24±7 µm²/µm after 10 min to 97±32 µm²/µm after 30 min perfusion. na at 10 µg/kg/min reduced ta by 94% to values (6±2 µm²/µm) lower than those of the 10 min control group, whereas rh at this dose reduced ta by 70% (30±14 µm²/µm). uh at both doses was ineffective. these findings show that in contrast to uh the direct thrombin inhibitors na and rh inhibit the growth of preexisting thrombi. these results could be explained by the higher potency of na and rh as compared to uh for inhibiting clot-bound thrombin (gast et al., blood coagul fibrinol 1994, 5: 879-87) and suggest that thrombus-bound thrombin is an important modulator of platelet thrombus growth and/or stability in this thrombosis model. platelet adhesion, the initial event of thrombosis, is believed to be completely prevented by intact endothelium. we challenged this theory by superfusing intact human umbilical vein endothelial monolayers with activated human platelet rich plasma utilizing the stagnation point flow adhesio-aggregometer (spaa). the spaa provides flow mediated contact of platelets with the superfused surface. heparinized (3.5-4.0 u/ml) platelet rich plasma (prp) was obtained from healthy volunteers and activated by addition of adenosine diphosphate (adp, 2 × 10^-6 m). platelet deposition was recorded on-line by video as well as by measuring scattered light. 
fixed samples were examined by phase contrast and electron microscopy. inhibition experiments were performed with either the tetrapeptide rgds, the non-peptide gpiib/iiia inhibitor ro-43-8857 or a monoclonal antibody directed against the gpiib/iiia complex. stimulation with adp prompted platelets to adhere to intact endothelium singly or as microaggregates of a diameter of up to 100 micrometer. adhesion was dependent upon convective transport resulting in platelet collision with the endothelial monolayer. infusion of rgds or ro-43-8857 into the flowing, adp-stimulated prp completely prevented platelet adhesion to the endothelium as well as subsequent aggregation. when the inhibitor inflow was stopped while adp stimulation persisted, adhesion and aggregation occurred immediately. re-establishing the inflow of the inhibitors, with adp stimulation still continued, led to disintegration of the adhering aggregates. when prp preincubated with the monoclonal antibody against gpiib/iiia was superfused, platelet adhesion to the endothelium and aggregation were irreversibly blocked. our results suggest that convective transport and stimulation of platelets are prerequisites to overcome endothelial thromboresistance and that subsequent platelet adhesion to the endothelium is mediated via the platelet gpiib/iiia receptor complex. prevent thrombus formation after ptca. i.p. stepanova 1, g.v. bashkov 2, l.p. kapralova 2, s.p. domogatsky 1, cardiology research center 1 and national haematology scientific center 2, russian academy of medical sciences, moscow, russia. percutaneous transluminal coronary angioplasty (ptca) results in atherosclerotic plaque rupture, vascular wall damage and thrombogenic collagen exposure. subendothelial collagen type i-iii is a very strong agonist of platelet-dependent thrombus formation in arteries. 
the antithrombotic action of rabbit polyclonal antibodies to rat collagen type i-iii and their chemically synthesized conjugate with monoclonals to human recombinant two-chain/one-chain urokinase type plasminogen activator (rtcu-pa/rscu-pa), cross reacting with rat tcu-pa/scu-pa, was studied both in vitro and in vivo. anticollagen antibodies and the bispecific conjugate inhibited human platelet adhesion, aggregation and formation of thrombi-like structures induced by rat collagen immobilized on a polystyrene surface under conditions mimicking the high shear rate in large elastic-type arteries. short-term treatment of a collagen-soaked silk thread with the collagen antibodies suppressed platelet-dependent thrombus formation in the arterio-venous shunt in rats by 56±4% (p<0.001), as did the bispecific conjugate (44±4%, p<0.001). treatment of the collagen-adsorbed conjugate with rtcu-pa did not increase the antithrombotic effect of the bifunctional antibodies. the present data suggest that local administration of the anticollagen antibodies at the site of atherosclerotic plaque rupture may be an efficient tool for prophylaxis of platelet-dependent thrombus formation in arteries after ptca. increased levels of certain hemostatic factors have been shown to be related to an increased risk of cardiovascular events. hypercoagulability is suggested to predispose to arterial thrombosis and thereby to participate in atherogenesis. we therefore assessed fibrinogen, prothrombin fragment 1+2 (f1+2) and von willebrand factor (vwf) antigen in 80 consecutive patients (aged 59±5 years) with known coronary artery disease (cad) who all underwent coronary angiography. the extent of coronary artery disease was quantified according to modified criteria of the american heart association (total, proximal and distal "score"). 
furthermore the intima-media thickness (imt) was determined in the carotid and femoral arteries by standardized ultrasonographic measurement. vwf antigen was found to correlate positively with the total and proximal coronary score (r=0.29, p<0.05 and r=0.36, p<0.01). while f1+2 showed no correlation with the coronary scores, it was significantly correlated with the imt values in the carotid arteries (r=0.27, p<0.05). after differentiating tertiles of the parameters, patients belonging to the upper tertile of f1+2 concentrations had significantly higher imt values of the carotid and femoral arteries (0.81±0.11 mm vs. 0.72±0.13 mm in the lower tertile, p<0.05; 1.38±0.44 mm vs. 1.15±0.25 mm, p=0.05), whereas in patients belonging to the upper tertile of vwf antigen concentrations the proximal coronary artery score was significantly higher (1.71±0.59 vs. 1.31±0.92 in the lower tertile, p<0.01). no correlation of fibrinogen concentrations and extent of cad or imt values of the carotid and femoral arteries could be demonstrated. in conclusion, procoagulatory mechanisms as indicated by elevated concentrations of von willebrand factor antigen and f1+2 may be contributing factors in atherogenesis. we have previously shown that pge1 is a potent inhibitor of pdgf-induced proliferation of vascular smooth muscle cells (vsmc) and inhibits dna replication by a camp-related mechanism (grosser et al., 1994). the present study investigates whether or not this antimitogenic activity of pge1 can be amplified by trapidil, a compound that has been shown recently to inhibit the incidence of restenosis of human coronary arteries subsequent to ptca (maresta et al., 1994). vsmc were prepared from coronary arteries of adult bovine hearts, passaged and kept under standard tissue culture conditions. cells of passage 4-9 were incubated in serum-free medium for 24 h in the presence of indomethacin (3 µm). 
addition of pdgf-bb (10 ng/ml) under these conditions stimulated dna replication, as assessed from 3h-thymidine incorporation, by 3- to 4-fold above control level. trapidil at 10 µm caused a minor reduction of pdgf-induced mitogenesis whereas 100 µm of the compound resulted in a marked reduction of dna replication by 69% (p < 0.05, n = 4). pge1 at 0.3 nm diminished the incorporation rate by 11% while the simultaneous administration of both pge1 and trapidil (100 µm) caused a significantly stronger response as seen from a reduction of 3h-thymidine incorporation rate by 82% (p < 0.05, n = 4). as a possible mechanism of action, trapidil might have inhibited phosphodiesterases. to establish this, we measured the camp-dependent protein kinase (pk) a activity in cell homogenates. trapidil increased the basal pka activity from 19% to 31% of the maximum response while the response to pge1 (10 nm) amounted to 55%. coincubation of pge1 with trapidil caused a 65% stimulation of pka activity, suggesting a small though detectable inhibition of vsmc phosphodiesterases by trapidil at antimitogenic concentrations. essentially similar results were obtained when thrombin was used as the mitogenic agent. the data demonstrate a significant antimitogenic effect of trapidil at µmolar concentrations that are in the range of plasma levels after therapeutic administration of the compound in vivo. at these concentrations, pge1-induced inhibition of mitogenesis is markedly enhanced by trapidil. inc., vienna, and central hematology laboratory, university hospital of bern. fibrinogen (fg), von willebrand factor antigen (vwf) and tissue-type plasminogen activator antigen (t-pa) have recently been shown to be independent risk factors for subsequent coronary events in patients with angina pectoris (nejm 1995; 332:635). although pai-1 antigen has also been proposed as a risk factor, conclusive data showing its predictive value are still lacking. 
furthermore, we have recently shown in a study investigating 200 survivors of myocardial infarction that not only are fg, t-pa and pai-1 significantly increased in these patients when compared to a healthy control group, but pci activity is also elevated (thromb. haemost. 1995;73:1137 abst.). in order to obtain cut-off points for the individual parameters, frequency histogram plots were transformed into straight line cumulative frequency (probit) plots (thromb. haemost. 1995;74:718). the cut-off values for the four parameters were determined as follows: fg at 2.7 g/l, t-pa at 8.7 ng/ml, pai-1 at 25 ng/ml and pci at 126% of a normal pooled plasma. utilising these cut-off points it was then possible to determine the cumulative discriminatory effectiveness of the parameters. when fg was employed alone as the discriminatory factor, it was observed that 47% (94/200) of the coronary heart disease (chd) group either had the cut-off value or were below it and 29% (29/99) of the normal group were above the cut-off value, thus resulting in 47% false negatives and 29% false positives. when a second additional risk factor, t-pa, was introduced, the number of false negatives dropped to 21% [i.e. 79% (158/200) had two risk factors elevated] and the number of false positives to 13%. to investigate whether a third parameter could discriminate further, pai-1 antigen was used to analyse the remaining false positives and negatives. an additional 4% could be detected, resulting in 83% of the chd group having three risk factors elevated. similarly, the number of normal subjects with three parameters elevated dropped by 4% to 9%. furthermore, when a fourth parameter was introduced, namely pci, it was found to discriminate a further 8% in the chd group, thereby increasing the discrimination to 91%. 
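as an illustration of how the four cut-off values above could be applied to an individual set of measurements, the following python sketch lists which parameters lie above their cut-off. it is a minimal sketch under stated assumptions: the names `CUT_OFFS` and `elevated_markers` and the example subject are hypothetical, and the abstract does not specify how combinations of elevated parameters were turned into a final classification, so the sketch stops at flagging the individual parameters.

```python
# cut-off values taken from the abstract
# (units: fg in g/l, t-pa and pai-1 in ng/ml, pci in % of normal pooled plasma)
CUT_OFFS = {"fg": 2.7, "t_pa": 8.7, "pai_1": 25.0, "pci": 126.0}

def elevated_markers(values, cut_offs=CUT_OFFS):
    """Return the names of markers whose value lies strictly above its cut-off.

    Values equal to the cut-off are treated as not elevated, following the
    abstract's wording ("either had the cut-off value or were below it").
    """
    return [name for name, limit in cut_offs.items()
            if name in values and values[name] > limit]

# hypothetical subject, not data from the study
subject = {"fg": 3.1, "t_pa": 8.7, "pai_1": 30.0, "pci": 110.0}
print(elevated_markers(subject))
```

counting the length of the returned list gives the number of elevated risk factors for that subject.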
the number of false positives dropped to 4%. additionally, determination of pci increased the discrimination of patients having had multiple infarctions from 88% when three parameters were measured to 100%. from these results it can be concluded that determination of fibrinogen levels alone is not sufficient to separate patients from controls, as t-pa adds significant discrimination. pai-1 antigen, which correlated strongly with t-pa, did not significantly increase the discriminatory potential of both fg and t-pa. however, by employing pci as a fourth parameter, virtually complete separation between the chd and normal groups as well as further recognition of patients having had multiple infarctions could be obtained. to test the hypothesis that oral contraceptives (oc) enhance exercise-induced activation of blood coagulation we examined 11 women (27 ± 3 (sd) years, bmi 20.4 ± 2.0 kg/m², vo2max 49 ± 7 ml/kg/min) without oc between day 18 and 22 of the menstrual cycle and 10 women (24 ± 2 (sd) years, bmi 20.6 ± 1.6 kg/m², vo2max 47 ± 7 ml/kg/min) taking oc (150 µg desogestrel and 30 µg ethinylestradiol) between day 7 and 21 of drug intake. prothrombin fragment 1+2 (ptf1+2) and fibrinopeptide a (fpa) were measured before and after running for one hour on a treadmill at a speed corresponding to the anaerobic threshold. mean heart rate (191 ± 10 vs. 196 ± 12 min^-1) and mean plasma lactate (3.3 ± 1.6 vs. 3.1 ± 1.2 mmol/l) were comparable during exercise between control and oc group, respectively. results for markers of thrombin and fibrin formation were:
group    time    ptf1+2 (nmol/l)   fpa (ng/ml)
control  before  0.66 ± 0.19       1.0 ± 0.2
control  after   0.73 ± 0.23       2.2 ± 1.2 *
oc       before  0.82 ± 0.31       1.0 ± 0.2
oc       after   1.11 ± 0.48 * +   5.8 ± 6.0 * +
* p < 0.05 vs. baseline, + p < 0.05 between groups. 
we conclude that oral contraception with 150 µg desogestrel and 30 µg ethinylestradiol enhances exercise-induced thrombin and fibrin formation. our data suggest that exercise testing might be useful for evaluating the risk of thrombosis associated with different compositions of oc. a. haushofer +, wm. halbmayer +, j. radek +, m. dittel *, r. spiel *, h. prachar *, j. mlczoch *, m. fischer +. + zentrallaboratorium mit thrombose- und gerinnungsambulanz, krankenhaus lainz; * 4. medizinische abteilung mit kardiologie, krankenhaus lainz und ludwig boltzmann-institut für herzinfarktforschung, wien. fifty-one patients (age 61.6 ± 9.2 a; 34 m / 17 f) implanted with 74 coronary stents (33 palmaz-schatz, 27 gianturco-roubin, 14 micro stents) received a new antithrombotic treatment using a combination of ticlopidine (tic) 2 × 250 mg/d for 28 days and acetyl salicylic acid (asa) 200 mg/d for long-term treatment. patients (pat) only received 15000 iu standard heparin as i.v. bolus immediately before stent implantation (day 1). side effects and changes in hematological (day 1 to 7, 14, 28 and 35 [= without tic]), liver and kidney parameters (day 7, 14, 28, 35) were monitored. thirty-eight pat (75%) came for the controls to our department and were additionally monitored by thromboelastography (teg) and bleeding time (bt) (day 28 and 35). the other pat were monitored externally; side effects were reported. thrombin generation after stenting was monitored from day 1 to 7 by prothrombin fragment 1+2 (f 1+2) and thrombin-antithrombin-iii-complex (tat). "k" of the teg decreased (day 28 vs 35; p< 0.01). bt prolongation was negatively correlated with the body surface area (tic+asa: p< 0.05, asa: p< 0.01) and showed a reduction after withdrawal of tic (210 sec, 180/300 sec [median, quartiles] vs. 120 sec, 88/157 sec; p< 0.0001). 
f 1+2 and tat of day 1 (blood collection: 0, 2, 4, 6 h after intervention; f 1+2: 0.84 nmol/l, 0.68/1.07 nmol/l; tat: 2.6 µg/l, 2.0/4.6 µg/l) were lower compared to day 2 to 7 (f 1+2: 1.31 nmol/l, 1.08/1.62 nmol/l; tat: 4.8 µg/l, 3.2/7.2 µg/l; p< 0.0001). tic seems not to be a strong thrombin generation inhibitor. during stenting one pat (1.9%) sustained a non-penetrating mci and one (1.9%) an ischaemic stroke. tic+asa were very effective; only in one pat (1.9%) did stent thrombosis (acute) occur. side effects: 8/15.7% gastrointestinal (one led to hospitalization), 5/9.8% hematomas at the needle site in the groin (one surgical intervention), 5/9.8% leucopenias (one agranulocytosis with hospitalization), 3/5.9% allergic skin reactions and 2/3.9% increased liver enzymes (got, gpt, ggt, alkaline phosphatase; > 2 × of the j. ). in one pat with gastrointestinal disturbances and skin reactions tic had to be withdrawn and treatment was changed to oral anticoagulation + asa. one pat showed a combination of skin reactions, gastrointestinal disturbances and on day 28 a heavy reaction of the liver enzymes ( j. after 5 weeks). a decrease of the white blood count (day 1: 7.19 g/l, 6.03/8.2 g/l; day 28: 5.46 g/l, 4.63/6.47 g/l; p< 0.0001) could be observed. the safety of the therapy with tic+asa should be elucidated and extensively discussed. the serpins c1 esterase inhibitor (c1inh), antithrombin iii (atiii), α1-antitrypsin (a1at), and α2-antiplasmin (a2ap) are known inhibitors of coagulation factor xia (fxia). although initial studies suggested a1at to be the main inhibitor of fxia, we recently demonstrated c1inh to be a predominant inhibitor of fxia in vitro in human plasma (wuillemin et al., blood 1995; 85:1517). the present study was performed to investigate the plasma elimination kinetics of human fxia-fxia inhibitor complexes injected in rats. the amounts of complexes remaining in circulation were measured using elisas. 
the plasma t1/2 of clearance was 98 min for fxia-a1at complexes, whereas it was 19, 18, and 15 min for fxia-c1inh, fxia-a2ap, and fxia-atiii complexes, respectively. thus, due to this different plasma t1/2, preferentially fxia-a1at complexes may be detected in clinical samples. this was indeed shown in plasma samples from thirteen children with meningococcal septic shock (mss), a clinical syndrome which is complicated by activation of coagulation, fibrinolytic, and complement systems. fxia-fxia inhibitor complexes were assessed upon admittance to the intensive care unit. fxia-a1at complexes were elevated in all patients, fxia-c1inh complexes in nine, fxia-atiii complexes in one patient, and no elevated fxia-a2ap complexes were found. we conclude from this study that (1) although c1inh is the predominant fxia inhibitor, fxia-a1at complexes may be the best parameter to assess activation of fxi in clinical samples, (2) measuring fxia-fxia inhibitor complexes in patient samples may not help to clarify the relative contribution of the individual serpins to inactivation of fxia in vivo, and (3) fxi is activated in patients with meningococcal septic shock. during the coagulation of plasma about 20% of the a2ap present is covalently crosslinked to fibrin by factor xiiia (aoki and sakata 1980, thromb. res. 19: 149-155). we investigated the binding of a2ap by factor xiiia to soluble fibrin (desaabb-fibrino) whose polymerization was inhibited by an isolated fibrin d-domain named d=,,, (haverkate and tiemann 1977, thromb. res. 10: 803-812). d==. is known to have an intact fibrin-polymerization site and is able to block the prolongation of the fibrin protofibrils at an early stage depending on its concentration. lateral association to fibrin fibers does not take place, since the inhibited protofibrils formed at the conditions used here do not reach a sufficient length (williams et al. 1981, biochem. j. 197: 661-668; hantgan et al. 1983, ann. n. y. acad. sci. 408: 344-365). 
material and method: soluble desaabb-fibrino was prepared by incubation of (125i)-fibrinogen (0.48 mg/ml), d=1o (1.91 mg/ml; molar ratio d==o to fibrin 14:1) and 0.4 u/ml thrombin for 45 min. then (125i)-a2ap (16 µg/ml), factor xiii (2 u/ml) and ca2+ (5 mmol/l) were added. the crosslinking reaction was stopped at different times of factor xiiia incubation by adding urea/edta solution. the suspension was analysed by ultracentrifugation on gradients containing saccharose, urea and sds. results: the elution profiles of the ultracentrifugation gradients show the formation of crosslinked fibrin oligomers of increasing size depending on the time of factor xiiia action. the crosslinked fibrin polymers contained about 80% of the fibrin initially added. although factor xiiia acted well, crosslinking of a2ap in the fibrin oligomers could not be observed. conclusion: as we already demonstrated (kelach et al. 1994, ann. hematol. 70 (suppl 1): a 60) the crosslinking of a2ap to fibrin clots depends on the structure of the fibrin network, especially on the degree of lateral association of the fibrin protofibrils. in desaabb-fibrino no lateral association of fibrin protofibrils takes place under the conditions chosen here. thus it is consistent with our theory that we did not observe any binding of a2ap to the fibrin oligomers of desaabb-fibrino. human pci is a non-specific serpin that inhibits several proteases of the coagulation and fibrinolytic systems as well as tissue kallikrein and the sperm protease acrosin. it is synthesized in many organs including liver, pancreas, and testis. the physiological role of pci has not been defined yet. recently, we have cloned and sequenced the mouse pci gene (zechmeister-machhart et al., manuscript in prep.). this enabled us to study pci gene expression in murine tissues using mouse pci cdna and crna probes. 
by northern blot analysis, mouse pci mrna was exclusively found in the reproductive tract (testis, seminal vesicle, ovary); all other organs analyzed, including the liver, were negative for pci mrna, indicating that in the mouse pci is not a plasma protein. to determine which cells of the reproductive tract synthesize pci, cellular localization was assessed by in situ hybridization of mouse testis and ovary sections. in testis, pci mrna was present in the spermatogonia layer and in leydig cells, while sertoli cells and peritubular myoid cells were negative. these results are consistent with the immunohistological localization of human pci (laurell et al., 1992). in the mouse ovary, stroma cells of the medulla and around the follicles were positive for pci mrna. no pci expression was detected in theca or granulosa cells. we also studied the regulation of mouse pci gene expression by steroid hormones in vivo. in mature male mice castration caused an increase in pci mrna in seminal vesicles, which was reversible upon the administration of testosterone. in tissues of intact adult male and female mice, pci mrna levels decreased after injection of human chorionic gonadotropin (hcg), while in castrated male mice, hcg had no effect on seminal vesicle pci mrna. progesterone and 17-β estradiol decreased ovarian pci mrna levels in immature female mice. these data suggest direct down-regulation of mouse pci by sex steroids. the different tissue specific pci gene expression in men and mice furthermore indicates a different biological role of this serpin in the two species. ctr. transgene technology, leuven. tissue factor (tf) is a 47 kda glycoprotein mainly known as the primary cellular initiator of blood coagulation. whether tf expression may also play a role in development is unknown, but the lack of spontaneous viable mutations of the tf gene in vivo leads to the speculation that its absence may not be compatible with normal embryonic development. 
to determine the significance of tf in ontogenesis, the pattern of tf expression in mouse development was examined and compared to the tf distribution in human postimplantation embryos and fetuses of corresponding gestational age. in the early embryonic period of both murine (6.5 and 7.5 pc) and human (stage 5) development there is strong tf expression in both ectodermal and entodermal cells. tf decoration was seen during ontogenetic development in tissues such as epidermis, myocardium, bronchial epithelium, and hepatocytes, which express tf in the adult organism. surprisingly, during renal development and in the adult organism tf expression differs between men and mice. in humans, maturing-stage glomeruli were tf positive, whereas in mice glomeruli were negative and epithelia of tubular segments were tf positive instead. in neuroepithelial cells there was a striking tf expression, indicating a possible role of tf in neurulation. moreover, there was a robust tf expression in tissues such as skeletal muscle and pancreas, which do not express tf in the adult. in contrast to tf, its physiologic ligand factor vii was not expressed in early stages of human embryogenesis, but was detectable in fetal liver. the temporal and spatial pattern of tf expression during murine and human development supports the hypothesis that tf serves as an important morphogenic factor during embryogenesis. to serve as an anticoagulant, protein c (pc) must be activated by a complex formed between the enzyme thrombin (t) and its cofactor thrombomodulin (tm). therefore, downregulation of tm expressed on the endothelial cell surface, triggered for example by an inflammatory stimulus, could become a critical factor in effective pc activation. in order to develop a recombinant (r) pc mutant which can be activated independently of the t/tm complex, a peptide sequence including p1-7 in the activation peptide of pc was modified to be identical to the factor xa (fxa) cleavage site in prothrombin.
the mutant was expressed in hu 293 cells, purified, and its anticoagulant properties were characterized. using purified fxa, the mutant showed activation rates between 0.02 and 0.13 nm/min at pc concentrations between 97 and 770 nm, while the rpc wild type was insensitive to fxa activation. the activation reaction is calcium-dependent, reaching maximal activation rates at a calcium concentration of 3 mm, and was enhanced 3.8-fold by addition of anionic phospholipids (pl). in contrast to wild-type pc, the rpc mutant was insensitive to activation by the t/tm complex. addition of the mutant to normal human plasma induced a prolongation of tissue-factor-based and ptt-based clotting assays. using normal human plasma as a source of fxa, the activation rates of the mutant were found to be 5-fold higher than in the purified system if tissue factor was used to generate fxa. in conclusion, our data demonstrate that the rpc mutant is effectively activated by fxa in a purified as well as in a plasma system. interestingly, the activation rates are enhanced in the presence of pl and normal human plasma. further studies should clarify the potential use of this mutant as a novel anticoagulant. thrombin plays a pivotal role in thrombotic events. the time course of thrombin concentration in blood or plasma after activation is of special interest for answering a variety of questions. with a chromogenic assay developed by hemker et al. [thromb. haemostas. 70, 617, 1993] it became possible to measure the generation of thrombin in activated plasma continuously. inhibitors of clotting enzymes which are to be developed as anticoagulants should be able to inhibit thrombin generation or to block generated thrombin immediately. we have used a test based on hemker's thrombin generation assay to elucidate which potency and specificity an inhibitor of factor xa needs in order to block thrombin generation in human plasma efficiently.
thrombin generation after extrinsic (tissue factor) or intrinsic (ellagic acid) activation was followed using the chromogenic substrate h-β-ala-gly-arg-pna (pentapharm ltd.). a series of synthetic low molecular weight inhibitors as well as naturally occurring inhibitors of factor xa with different potencies were investigated. because of the inhibition of activated factor x, the generation of thrombin in plasma is delayed and the amount of generated thrombin is reduced. the concentrations which cause a 50% inhibition of thrombin generation (ic50) correlate with the ki values of the inhibitors. low molecular weight inhibitors with ki values of about 20 nmol/l inhibit the generation of thrombin after extrinsic activation with ic50 values in the micromolar range. after activation of the intrinsic pathway, tenfold lower concentrations are effective. the strongest inhibitory activity after extrinsic as well as intrinsic activation is shown by recombinant tick anticoagulant peptide (r-tap), with ic50 values of 0.049 µmol/l (extrinsic) and 0.037 µmol/l (intrinsic). in the comparison of synthetic low molecular weight inhibitors of thrombin and factor xa which have similar ki values for the inhibition of the respective enzyme (lowest ki 20 nmol/l), factor xa inhibitors are less effective in the thrombin generation assay. in contrast, the highly potent xa inhibitor r-tap shows a stronger inhibition of thrombin generation than the tight-binding thrombin inhibitor hirudin. background: resistance to degradation of coagulation factor v by activated protein c is associated with a point mutation in which adenine is substituted for guanine at nucleotide 1691 in the gene coding for factor v. to date this specific mutation appears to be the most common inherited abnormality which predisposes patients to venous thrombosis. for this reason a reliable, fast and automatable system for the diagnosis of the described point mutation is required.
the conventional methods used to identify the mutation are based on allele-specific restriction enzyme site analysis or direct sequencing. these methods have disadvantages for large-scale dna diagnosis, which include the need for electrophoresis or high cost and time consumption. methods: an alternative strategy of dna diagnosis, the allele-specific oligonucleotide ligation assay, was adapted for the diagnosis of the point mutation of factor v. following pcr amplification of the target dna, the procedure was performed completely automatically on a robotic workstation with an integrated elisa reader using a 96-well microtiter plate. allele-specific restriction enzyme site analysis was performed to confirm the genotypes. results: in 10 patients with the mutation and in 20 individuals without the mutation, the genotypes determined with the conventional allele-specific restriction enzyme site analysis were in 100% concordance with the elisa-based oligonucleotide ligation assay. discussion: the pcr-oligonucleotide ligation assay, applied as an automated detection system for the identification of the coagulation factor v point mutation, allows the rapid, reliable, and large-scale analysis of patients at risk for thrombosis. resistance to the anticoagulant activity of activated protein c (apc resistance) has emerged as the most common inherited thrombophilic state. patients heterozygous for factor v leiden are more likely to suffer from thromboembolic events than controls. this risk is even more pronounced in homozygotes. due to the low sensitivity and specificity of most coagulation tests, some investigators suggest examining patients for the presence of the factor v leiden mutation by pcr-based methods. recently we presented an aptt-based functional test (accelerin inactivation test, ait): 1:20 diluted plasma (50 µl) is mixed with factor v deficient plasma (50 µl) and aptt reagent (50 µl) and incubated at 37°c, and then coagulation is induced by cacl2 and apc (50 µl).
using a standard curve, the clotting time (sec) is converted into per cent accelerin inactivation (% ai). using this test, the widely used apc ratio, and pcr-based factor v leiden detection (confirmed by direct sequencing), we prospectively studied 172 consecutive patients with thromboembolic events. patients without the factor v mutation consistently showed more than 50% ai. with the exception of one patient with severe factor deficiencies (including factor v) due to hepatic failure, who was heterozygous for factor v leiden and showed 62% ai, there was complete concordance between the pcr-based method and dysaccelerinemia detected by ait. on the basis of these results, a specificity and sensitivity of ait above 95% was calculated. furthermore, a clear discrimination could be observed between 34 heterozygotes (20%0,5 to <10 years; >10 to <18 years) with a normal population of 159 children. the mutation g1691a was found with an unexpectedly high prevalence of 12% in our normal controls. however, the prevalence was significantly higher in the age groups 0 to <0.5 years (25%) and >10 to <18 years (30%). in patients between >0.5 and <10 years the overall prevalence was similar to that of the controls (13%). however, in patients of this age with spontaneous thrombosis, apcr was also a significant risk factor (29%). our results emphasize the impact of apcr on thrombogenesis in children. however, the significance is age-dependent and possibly reflects the different physiology of haemostasis in our three age groups. activated protein c (apc) resistance is a newly recognised risk factor for thrombosis. in at least 90% of cases it is caused by a single point mutation in the factor v gene (g->a at nucleotide 1691), which predicts replacement of arginine 506 with glutamine. one of the apc cleavage sites in factor va is located c-terminal of arginine 506, and mutated factor va (factor v leiden) is resistant to apc-mediated inactivation.
from epidemiologic studies it is known that this abnormality can be found in about one third of patients with thrombosis. apc-resistance is a major basis for venous thromboembolism and is prevalent in about 5-10% of the general caucasian population. recurrent spontaneous abortion (rsa) affects 1-5% of couples and represents a major concern for reproductive medicine. in spite of extensive endocrine, genetic, serologic and anatomic evaluation, some 30-40% of rsa cases remain unexplained. a frequent morphologic finding in placentae of aborted pregnancies is an increase of fibrin deposition within the intervillous space. because of these findings we studied the prevalence of apc-resistance in women with rsa (more than 3 miscarriages) of unknown origin. in 2 of 35 cases we found a pathologic apc-resistance; both patients had a history of recurrent thrombosis and were heterozygous for factor v leiden. the prevalence of apc-resistance is 5.7% and thus equals the prevalence in the general population. our data do not support the hypothesis that apc-resistance is a risk factor for recurrent spontaneous abortion. hämatologisches zentrallabor der universität, inselspital, 3010 bern resistance to activated protein c (apc) due to the mutation 506 arg → gln of factor v (factor v leiden mutation) is the most frequent hereditary thrombophilic defect known today, with a prevalence of 20-50% in patients with idiopathic venous thromboembolism and of about 3-5% in the general population. with an allele frequency of 2%, the expected number of homozygous individuals is about 4 in 10000. homozygous and heterozygous individuals differ considerably with respect to the relative risk of thrombosis (80-fold versus 7-fold) as well as the age at the first thrombotic event (31 versus 44 years).
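the figure of about 4 homozygotes in 10000 quoted above follows from hardy-weinberg proportions. the sketch below reproduces that arithmetic; it assumes random mating and no selection, which the abstract does not state explicitly, and the function name `hardy_weinberg` is purely illustrative.

```python
def hardy_weinberg(q):
    """Expected genotype frequencies for a mutant allele frequency q.

    Assumes Hardy-Weinberg equilibrium (an assumption of this sketch).
    Returns (wild type, heterozygous, homozygous mutant).
    """
    p = 1.0 - q
    return p * p, 2.0 * p * q, q * q


# Allele frequency of 2 %, as quoted in the abstract.
wild_type, heterozygous, homozygous = hardy_weinberg(0.02)
print(f"heterozygous carriers: {heterozygous:.2%}")
print(f"homozygous per 10000:  {homozygous * 10000:.1f}")
```

under the same assumption the expected share of heterozygous carriers is just under 4%, which lies within the 3-5% population prevalence cited in the abstract.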
deficiency of the vitamin k dependent protein s (ps), an important cofactor of apc, is another hereditary thrombophilia which is, however, much rarer than apc resistance, with a prevalence of 5 to 8% in patients with venous thromboembolism. factor v leiden mutation as well as ps deficiency are associated with impaired anticoagulatory activity of apc, which is most pronounced when the two defects are combined. the combination of ps deficiency (with an assumed prevalence similar to that of pc deficiency) with heterozygous or homozygous apc resistance can be expected with a probability of about 1:5000 or about 1:500000, respectively. it is well known that ps levels decrease towards the low normal or even subnormal range during pregnancy. moreover, there is increasing evidence that the sensitivity of plasma to the anticoagulatory effect of apc decreases during pregnancy, resulting in an acquired apc resistance. these pregnancy-associated effects are obviously much more relevant in the case of preexisting ps deficiency or apc resistance and should contribute to the elevated thrombotic risk during pregnancy in a subject with either of the two defects, and even more so in a woman who suffers from both defects. we describe a young woman with a combination of homozygous apc resistance (apc ratio 1.10, normal range: 2.02 - 3.73), pronounced ps deficiency (free ps 0.11 u/l, total ps 0.30 u/l, normal range: 0.55 - 1.20 u/l and 0.65 - lag u/l, respectively) and, moreover, impaired fibrinolysis (no change of euglobulin lysis time after 10 min venous occlusion) who developed deep vein thrombosis after cesarean section in her first pregnancy. examination of her family showed heterozygous apc resistance in her asymptomatic father (apc ratio 1.99), a combination of heterozygous apc resistance (apc ratio 1.60) and ps deficiency (free ps 0.30 u/l, total ps 0.58 u/l) in her asymptomatic mother, and no defect in her sister.
considering the fact that the mother was still thrombosis-free at the age of 49, one may assume that the thrombosis risk in the proposita was mainly influenced by the homozygosity for apc resistance. s. ehrenforth, m. adam, b. zwinge, i. scharrer university hospital, dept. of angiology, frankfurt a.m., germany introduction: apc resistance has been shown to be the most commonly inherited defect which constitutes a risk factor for venous thrombosis (vt). however, most of the present epidemiological studies concerning apc-r prevalence in thrombophilia were derived from results of tests conducted on plasmas collected under various conditions. this may influence the great differences reported in the prevalence of apc-r among these patients. for example, it has been shown that freezing of plasma specimens prior to analysis of apc-r causes a significant decrease in the assay results. the aim of our study was to evaluate the influence of centrifugation conditions on the results obtained with the chromogenic apc-r assay. patients and methods: blood was collected from 31 patients (19 women, 12 men; fv genotype: 13 r/r 506, 14 r/q 506, 4 q/q 506) through venipuncture into trisodium citrate (1:9). platelet-rich and platelet-poor plasma was obtained by immediate centrifugation at 4°c for 10, 20, 30, 40, 50 or 60 min at 1000, 2000, 3000, 4000, 5000 and 6000 rpm. additionally, pnp obtained from 14 healthy individuals (7 male, 7 female without hormonal treatment) was prepared in the same way. apc response was determined within one hour after centrifugation using the coatest apc resistance kit from chromogenix. results: for both pnp and single plasma samples, we observed continuously higher apc ratios with increasing centrifugation intensity. for example, an increase from 1000 to 6000 rpm resulted in an increase of the apc ratio from 2.7 to 3.26 (20 min) and from 2.88 to 3.326 (60 min), respectively.
even though less distinctive, similar results were observed concerning the duration of centrifugation: when the duration was increased from 10 to 60 minutes we observed a continuous increase in apc ratio, for example from 2.74 to 3.12 when using 2000 rpm and from 3.09 to 3.36 when using 4000 rpm. the decrease of the ratio after low centrifugation is the consequence of the shortening of the aptt in the presence of apc, without a significant influence on the basal aptt without apc. conclusion: our results demonstrate that centrifugation conditions are important to consider for the interpretation of apc-r results. supporting our observations, recent studies from sidelmann et al. have shown that an increase in plasma platelet concentration, or low centrifugation respectively, causes a significant decrease in the apc response. however, so far the mechanism responsible for the significant effect of both on apc-r assay results is unknown. although technically simple, the biochemical complexity inherent in the chromogenic apc-r assay necessitates a standardized plasma handling procedure to secure a reproducible determination of apc-r. comparison of different assays for determination of apc-resistance with the genotyping of factor v (arg -> gln) g. siegert*, s. gehrisch*, e. runge**, r. naumann**, r. knöfler*** *institute of clinical chemistry, **clinic of internal medicine, ***childrens hospital resistance to apc, diagnosed on the basis of prolonged clotting time in the aptt assay, is now considered a major cause of thrombophilia. in the majority of cases apc resistance is associated with a point mutation in the factor v molecule (arg506gln), but the two are not synonymous. a prolonged baseline aptt is a limitation of the assay. consequently, the determination is not possible in risk groups of patients (factor)ill deficiency, lupus anticoagulant) and in patients under anticoagulant therapy. in these cases a dilution of plasma in factor v deficient plasma is recommended.
the immunochrom assay is based on the inactivation of factor viiia by apc. the aim of the study was to compare different functional apc response assays with the result of the dna analysis. apc response was tested in 100 healthy probands, 81 thrombosis patients and 16 family members using the immunochrom assay, the coatest (chromogenix) and the coatest with 1+4 dilution of the plasma in native factor v deficient plasma (immuno). the dna analysis was performed as described by bertina. one patient was homozygous for the factor v mutation; a heterozygous result was obtained in 4 members of the control group, in 28 patients and in 6 family members. in all cases with factor v mutation the ratio of the immunochrom assay was lower than the laboratory's own reference value, independent of anticoagulant therapy. pathological ratios in this assay were also obtained in one member of a family with high thrombotic incidence (dna arg/arg) and in 17 patients under anticoagulant therapy (two of these patients are monozygotic twins). in the coatest an abnormal response was diagnosed in all cases with factor v mutation without anticoagulant therapy and in 40% of heterozygous patients under anticoagulant therapy. results of the test using the dilution in factor v deficient plasma showed good agreement with the results of the dna analysis, but the method is obviously sensitive only to the factor v mutation. the reason for pathological ratios in the immunochrom assay in wild-type patients is unclear. the majority of these patients are treated with anticoagulants, so a comparison with the coatest is not possible. interestingly, in one patient under heparin with a low ratio in the immunochrom assay, the ratio of the coatest was also low after reduction of heparin. it seems necessary to investigate at what interval from the thrombotic event apc resistance should be tested.
the following causes of pathological ratios in functional apc assays must be discussed: high levels of factor viii and/or von willebrand antigen (acute phase reaction), and other mutations in factor v and viii. the factor v dilution assay should be replaced by the dna analysis. due to their differing compositions, the "sensitivities" of various aptt reagents differ not only with respect to factor depletions, heparin and fibrin-fibrinogen degradation products, but also with regard to pathological inhibitors. for lupus anticoagulants this means that "lupus-sensitive" reagents can be delineated from "lupus-insensitive" reagents. with a "lupus-insensitive" aptt reagent there is no or only slight prolongation of the aptt in the plasma under investigation, whereas with a "lupus-sensitive" reagent marked prolongation is observed. for the meaningful use of aptt reagents it is necessary to know the extent to which they are influenced by lupus anticoagulants. the following 14 aptt reagents were tested:
• ptt-reagenz, ptta, ptta liquid, ptt-la, ptt-lt (boehringer/stago)
• pathromtin, pathromtin sl, neothromtin (behring)
• platelin s, platelin excel ls (organon teknika)
• actin-fs, actin-fsl (dade)
• aptt silica lye, aptt silica liquid (instrumentation laboratory)
the material for investigation consisted of 20 plasmas from patients with lupus anticoagulants. a confirmatory test (lupus anticoagulant test, immuno) was positive for all of the patients. measurements were made using the sta coagulation analyser (boehringer/stago). it can be seen from the results that in some instances very different prolongations were obtained in identical plasmas by using differing aptt reagents. low susceptibility to lupus anticoagulants was shown by actin fs (dade), ptt-reagenz (boehringer) and neothromtin (behring). high susceptibility was shown by platelin excel ls (organon teknika), ptt-la and ptt-lt (boehringer/stago).
lupus anticoagulant screening with the aptt reaction is promising when two aptt reagents differing as greatly as possible in their lupus anticoagulant sensitivity are used. the resistance to the anticoagulant response of activated protein c (apc) is a major cause of venous thrombosis. apc resistance is due to a single mutation in the factor v gene, which predicts replacement of arg 506 in the apc cleavage site with gln (factor v leiden mutation). in contrast to other known genetic risk factors for thrombosis, this factor v 1691 g-a mutation has a high prevalence in the general population of western europe (average 4-5%). we have determined the prevalence of the factor v 1691 g-a mutation in a population of 700 probands from the north-eastern part of germany. the mutation was found in 7% (heterozygosity was found in 48 subjects; 1 person was homozygous). the results are compared with our studies of populations from argentina and poland. we analysed the factor v 1691 g-a mutation in 71 patients with thrombosis from germany and hungary. this mutation was found in about 60% of these patients. in contrast, the frequency of this mutation was strongly reduced in a group of 47 patients with thrombosis and pulmonary embolism from argentina (3 heterozygotes in 47 patients; 6%). the results for these different populations will be described and discussed. past medical history: venous thromboembolic events (te) at 18, 19 and 21 years; intermittent oral anticoagulation (oac) without te's. diagnosis of autoimmune disorder with elevated antinuclear antibody titers and positive lupus anticoagulant test. no other relevant illnesses; family history uneventful.
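the prevalence figures reported above for the factor v 1691 g-a mutation (48 heterozygotes and 1 homozygote among 700 probands; 3 heterozygotes among 47 argentine patients) can be checked with a few lines of arithmetic. the following is a minimal sketch; the helper names are illustrative only, and the allele frequency is a derived quantity that the abstract itself does not report.

```python
def carrier_prevalence(heterozygotes, homozygotes, n):
    """Fraction of individuals carrying at least one mutant allele."""
    return (heterozygotes + homozygotes) / n


def allele_frequency(heterozygotes, homozygotes, n):
    """Mutant allele frequency: each person contributes two alleles."""
    return (heterozygotes + 2 * homozygotes) / (2 * n)


# North-eastern German cohort as described in the abstract.
german_prevalence = carrier_prevalence(48, 1, 700)
german_allele_freq = allele_frequency(48, 1, 700)

# Argentine thrombosis patients as described in the abstract.
argentine_prevalence = carrier_prevalence(3, 0, 47)

print(f"german carriers:    {german_prevalence:.1%}")
print(f"german allele freq: {german_allele_freq:.2%}")
print(f"argentine carriers: {argentine_prevalence:.1%}")
```

the carrier counts give the 7% and roughly 6% quoted in the text; the corresponding allele frequency in the german cohort would be about half the carrier prevalence.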
two weeks prior to the referral to us: acute febrile illness with nausea, diarrhea, abdominal pain; hospitalisation, treatment with iv antibiotics and anticoagulation with fractionated heparin; development of extensive deep vein thrombosis (dvt) of the right leg; initiation of full-dose unfractionated heparin; decline of platelet count from 121 to a nadir of 21 g/l; referral to our department. on admission an extensive coagulation screen yielded the following results (n/normal, ↑/elevated, ↓/reduced, +/positive, -/negative): pt ↑, aptt ↑, tr n, factor ii, v, viii n, factor vii, ix, xi, xii ↓, fibrinogen ↑, atiii n, protein c, s *, activated protein c sensitivity ratio 1.92 (↓), fv leiden mutation pcr -, fibrinolytic system n, tat ↑, f1+2 ↑, lupus anticoagulant +, heparin induced platelet antibodies +; no diagnosis of a specific autoimmune disorder could be made. an immunosuppressive therapy with corticosteroids and anticoagulation with recombinant hirudin were initiated; no progression of the dvt occurred and normalisation of the platelet count was observed. during follow-up under oac and low-dose corticosteroids, the patient was well, and the pathologic coagulation results, including lupus anticoagulant and activated protein c resistance, returned to normal; no further te's have been observed. in summary, we present a case of a complex coagulation disorder as part of an autoimmune process, resulting in a clinically manifest prothrombotic dysbalance including lupus anticoagulant, acquired resistance against activated protein c and heparin induced thrombocytopenia (type ii), entering complete remission under combined immunosuppressive and anticoagulant therapy. in the last 30 years, a vast number of simplified analytical procedures have been developed for the diagnosis of haemostatic disorders.
today the detection methods have evolved from the mechanical hooking method or ball coagulometry to optical systems, which can additionally utilise chromogenic substrates or immunological methods. in these systems the clotting time is derived from algorithms (e.g. threshold or maximum of the first or second integral). we studied 50 healthy subjects, aged 21 to 65 years, and 118 patients, aged 9 to 93 years, using a new aptt reagent (pathromtin sl). the results were compared with those obtained with a routinely used reagent (pathromtin). the reference range and the factor, heparin and lupus anticoagulant sensitivity were determined. analysis was performed using the behring fibrintimer a (bfa) with optomechanical clot detection, the behring coagulation timer (bct) with optical clot detection by threshold, and the dvv test and dvv confirm for lupus anticoagulant diagnosis. our results showed that the new pathromtin sl reagent met the demands for a higher factor and lupus anticoagulant sensitivity. it is highly suitable for monitoring heparin therapy and gave comparable results with the optical and the optomechanical analyser systems; hence the reagent can be used for both systems. restenosis following percutaneous transluminal angioplasty (pta) continues to be a major clinical problem. neointimal hyperplasia, being the major underlying cause, cannot be sufficiently avoided. various plasmatic coagulation and fibrinolytic factors have been associated with arterial restenosis. anticardiolipin antibodies (acl) have been established as risk factors for venous or arterial thrombosis. methods: in a cohort of 75 patients (53 men and 22 women, age 68±13 years) undergoing pta of a peripheral artery we prospectively evaluated whether acl could influence the 6-month restenosis rate. patients were clinically examined before, and 3 and 6 months after, pta. noninvasive grading of arterial stenosis was done by duplex scanning of jet peak velocities.
restenosis was arbitrarily defined as more than 50% occlusion of the lumen at the site of dilatation 6 months after successful intervention. laboratory investigation at the same time points included acl and other known atherosclerosis risk markers, such as fibrinogen (fbg), von willebrand factor (vwf), homocysteine (hcy) and c-reactive protein (crp). thrombin generation markers, such as thrombin-antithrombin iii complexes and prothrombin fragments 1+2, as well as thrombomodulin (tm) as an endothelial activation marker, were also measured. results: 31/75 (41.3%) patients were considered to have developed restenosis after 6 months. 9/75 (12%) patients were found to have positive igg-acl (19-35 gpl) and/or igm-acl (14-103 mpl) at all three measurements. 1/75 was negative before but seroconverted (igm) 3 months after pta. 2/10 (20%) acl-positive and 29/65 (44.6%) acl-negative patients developed restenosis at 6 months (chi-square p-value=0.14). none of the above-mentioned coagulation parameters differed between acl-positive and acl-negative patients, whether measured before or 6 months after pta. some of them are shown below (values before pta): fbg ( basilar artery stenosis is a rare event in young children. risk factors are head or neck trauma with consecutive dissection of the vertebral artery, cardiac diseases or hypercoagulability. elevated lipoprotein (a) (lp(a)) serum levels in adults can mediate atherosclerosis. in addition, lp(a) might interfere with fibrinolysis. here we report on a 3-year-old boy who presented with acute brain stem symptoms. history revealed neither trauma nor infectious disease. conventional and mr angiography showed stenosis of the basilar artery without ischemic lesions. laboratory findings were normal in routine blood and csf tests. global coagulation parameters as well as procoagulant and anticoagulant factors were normal. cardiac and autoimmune disease could be ruled out. lp(a) serum levels were significantly elevated to 104 mg/dl (normal range <30 mg/dl).
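the chi-square comparison reported above for restenosis in acl-positive versus acl-negative patients (2/10 versus 29/65) can be reproduced from the 2x2 table. the sketch below assumes an uncorrected pearson chi-square test with one degree of freedom; the abstract does not say which variant was used, so this is only one plausible reading, and the function name is hypothetical.

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the table [[a, b], [c, d]].

    Returns (chi2, p). For one degree of freedom the upper-tail
    probability equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# acl-positive: 2 restenosis, 8 none; acl-negative: 29 restenosis, 36 none.
chi2, p = pearson_chi_square_2x2(2, 8, 29, 36)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

with these counts the uncorrected test yields a p-value close to the reported 0.14. one expected cell count is below 5, so an exact test might be preferred in practice; that is a general statistical remark rather than something stated in the abstract.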
analysis of other family members revealed a hereditary hyperlipoproteinemia (a), which might explain the family history of an increased incidence of myocardial infarction and cve in elderly family members. clinically the patient recovered completely from brain stem symptoms after heparinization and subsequent oral anticoagulation with phenprocoumon. however, radiological signs of basilar artery stenosis were progressive. in a recently developed specific test, an elevated anti-phosphatidylserine antibody titer was detected one year after primary diagnosis. in conclusion, this is the first report on a child with stenosis of the basilar artery and elevated levels of lp(a). it is unclear whether apa contributed to the onset of basilar artery stenosis or developed secondarily due to endothelial defects after thrombosis and anticoagulation. apa, however, might increase the risk of further thrombotic events in this patient. in 110 patients with thrombotic events or with systemic lupus erythematosus, anticardiolipin antibodies (aca) and lupus anticoagulant (la) were measured. for aca detection we use the assays from elias for igg and igm antibodies. as sensitive methods for detecting la in our laboratory we use the test kits from diagnostica stago (staclot la with hexagonal array of phospholipids, ptt-la, a very sensitive ptt method, and staclot pnp, a platelet neutralization procedure) and the ptt from organon teknika (platelin excel ls with two incubation times, 1 and 10 minutes). the results of these tests were compared with three new ones on the german market: specktin apct (activated plasma clotting time), specktin aptt (aptt with purified soy extract) and specktin la (phospholipid preparation in concentrations between 10 and 500 µg/ml); all wak chemie. traditional aptt reagents were developed for the sensitive detection of factor viii and ix deficiency as a cause of hemorrhage.
high sensitivity to lupus anticoagulants, which also prolong the aptt, was not required for this purpose. with increasing recognition of the importance of antiphospholipid antibodies as a risk factor for thromboembolism, more sensitive reagents were designed, which now reliably detect this condition. using such reagents as a screening test in a general hospital makes it necessary to distinguish the two conditions quickly. we here report an algorithm in which we use an inhibitor (lupus anticoagulant) sensitive reagent (sta aptt, boehringer) and an inhibitor insensitive reagent (actin fs, dade) to distinguish anticoagulants and factor deficiencies as a cause of prolonged aptt. citrate plasma from 33 patients with various diseases showed an unexpectedly abnormal inhibitor-sensitive aptt (>40 s). 13 plasmas with factor deficiencies remained abnormal with the insensitive aptt reagent. a regular correction of their defect occurred on mixing with normal plasma. by measurement of single coagulation factors, five patients with contact factor xii deficiency were found. this condition is associated with thrombosis and very rarely with bleeding. three patients with factor xi deficiency and two patients with factor ix deficiency were also identified. antiplatelet agents of any kind permit secondary prevention of myocardial ischemic lesions. there is no general consensus regarding secondary prevention of cerebral ischemic lesions. aspirin remains the most common substance; ticlopidine also brings about prevention, but with important secondary effects. the european stroke prevention study 1 demonstrated that the combination of antiplatelet agents, in particular aspirin/dipyridamole (persantin), is also very active. to collect more information, esps 2 was organized, and 6602 patients receiving either a placebo, 50 mg aspirin, 400 mg of a sustained-release form of dipyridamole (persantin (r)), or the combination aspirin/dipyridamole were recruited.
it ended march 31st 1995 with the following conclusions: 1) aspirin, 50 mg a day, brings about a significant secondary reduction of stroke (18.z %) after a two year follow-up. notwithstanding the low dose of aspirin, haemorrhages remain important. 2) dipyridamole, at 400 mg a day, brings about a significant reduction of stroke (16.3 %), similar to that of aspirin. one could thus substitute 400 mg dipyridamole for 50 mg aspirin. 3) the combination of 50 mg aspirin and 400 mg dipyridamole brings about a significantly greater reduction of stroke (37.0 %). esps 2 revealed that a low dosage of aspirin is active, that dipyridamole alone is also active, but that the combination of both gives far better results. the study of the primary end-points, the study of the survival curves, the factorial statistical analysis and the pairwise comparison analysis led to these conclusions. the conclusions drawn from esps 2 underline that the combination aspirin/dipyridamole is a privileged choice for cerebral ischemia.
the state of activation of circulating platelets in acute cerebral ischemia is controversial. activation of platelets at the single cell level can be assessed by determining the shape change or the expression of antigens such as p-selectin (cd62). shape change is an early and rapidly reversible event in platelet activation, whereas p-selectin is irreversibly expressed on the platelet surface upon stimulation. methods: we investigated 20 untreated patients within one day after cerebral ischemia, 20 patients months after stroke treated with warfarin, and 21 age and sex matched control subjects without vascular risk factors. venous blood was collected into a fixation solution blocking the metabolic processes in platelets within milliseconds. we determined the fraction of resting discoid platelets by phase contrast microscopy. the expression of p-selectin was measured by flow cytometry.
results: the fraction of platelets expressing p-selectin was higher in patients with acute cerebral ischemia (7.8 ± 4.7 %) than in control subjects (1.9 ± 1.1 %; p<0.001, u-test). patients with stroke (n=15, 7.8 ± 4.5 %) and patients with transient ischemic attack (tia; n=5, 7.6 ± 6.1 %) had similar values. patients months after stroke still had higher values (4.3 ± 9.3 %, p<0.05) than control subjects. the rate of discoid platelets was not different between patients with acute ischemia (n=17, 85.6 ± 8.8 %), patients months after stroke (n=19, 85.7 ± 5.1 %) and control subjects (n=21, 86.7 ± 5.8 %). platelet count was not significantly different between groups. conclusion: the elevated proportion of platelets expressing p-selectin indicates strong platelet activation in acute cerebral ischemia and in a majority of patients months after stroke. assessment of p-selectin revealed a higher sensitivity for platelet activation after stroke or tia than analysing the reversible shape change. further studies have to clarify whether monitoring of platelet activation by flow cytometry is helpful as a prognostic tool and for evaluating therapeutic strategies after stroke.
vascular smooth muscle cell (smc) proliferation and migration into the neointima are the hallmarks of atherogenesis. the complexity of these processes and the concerted action and interaction of the molecules involved are yet to be fully elucidated. one crucial molecule seems to be the urokinase-type plasminogen activator receptor (upar), recently also designated the cd87 antigen. upar serves a dual function: (1) it directs upa proteolytic activity to a special location on the cell surface and (2) it induces cellular signals leading to various phenotypic changes. we have investigated the signal-transducing capacity of upar in human smcs and provide here a molecular explanation for upar-related cellular events.
activation of these cells with upa (even with an inactivated catalytic center) results in the induction of tyrosine phosphorylation, suggesting modulation of upar-associated protein tyrosine kinases (ptks) upon ligand binding. we obtained patterns of tyrosine-phosphorylated proteins with molecular masses of approximately 55-65 and 35-40 kd. using antibodies against different types of ptks as well as immunoprecipitation and immunoblotting techniques, the ptks involved in the upar-signalling complex were identified as members of the src-ptk family. the colocalization of upar and ptks at the cell surface of smcs was further confirmed by confocal microscopy studies. we conclude that the upar-ptk complex is most likely involved in this signal transduction pathway, which provides the coordinated action of extracellular proteolysis, adhesion, and cell activation that is required for cell migration. this mechanism may be crucial for the progression of atherosclerotic plaques.
activation markers of haemostasis have been found elevated in relation to diabetic vascular lesions. simultaneous pancreas and kidney transplantation (pkt) in type i diabetes has been shown to improve diabetic complications and long term survival. we measured haemostatic vascular risk factors and activation markers in plasma of 18 patients after successful pkt, 17 patients after pkt and rejection of the pancreas graft and 5 patients after pkt and rejection of the renal graft. blood samples were taken during routine ambulatory visits; patients were free of any ongoing acute disorder or transplant rejection and under continuous immunosuppressive medication. despite individually adjusted insulin therapy, hba1 plasma levels increased after pancreas rejection (5.37 vs 7.12, p<0.001).
platelet counts and plasma levels of fibrinogen, f1+2 fragment, tat and app complexes and fibrin monomer were found significantly elevated as compared to diabetic controls, but not significantly different with respect to completely or partially successful pkt. one major reason for the increased activation state of haemostasis may be the cyclosporin treatment given to all patients. t-pa and pai-1 plasma levels were within the normal range and significantly correlated with plasma triglycerides (r = 0.049; p<0.005). d-dimer plasma levels were significantly lowered after pancreas rejection (772(375) vs 324(232) ng/ml; mean(sem), p<0.005), which might reflect impaired fibrin degradation related to increased glycosylation of fibrinolytic factors. in conclusion, despite the marked improvement of glucose and lipid metabolism, plasma markers of activation of coagulation and fibrinolysis are not decreased to normal after simultaneous pancreas and kidney transplantation.
according to the investigations of fowler et al. and pepe et al., the probability of an ards occurring with one risk factor is 5-8 %, and in the presence of several risk factors, 25 %. goris et al. and johnson et al. determined the level of severity with the aid of a fixed scale: the injury severity score. all these investigations, however, are not to be interpreted as typical following coronary surgery. these investigations demonstrated that the kallikrein and factor xii systems are of great importance as intraoperative risk factors. here the factor xii system plays a major role, with direct or indirect activation of the kallikrein-kinin system by the splitting products alpha-factor xiia and beta-factor xiia respectively. all ards scores take the pmn-elastase into account. if the pmn-elastase values (1000 µg/l) are constantly high postoperatively, then lung complications are to be expected. patients developing an ards displayed significantly lower alpha2-macroglobulin values.
patients who developed a highly significantly raised kallikrein-like activity (>60 u/l) after the beginning of bypass and showed constantly high values during ecc are difficult to keep under control with respect to blood pressure behaviour. the platelet pai also shows a significant rise and intraoperatively runs analogous to platelet factor 4, only antiparallel, since it attacks the endothelium. we were able to show that pai-1 is suitable as an indirect marker for a possibly developing restenosis. 85 % of the patients investigated with lowered pai-1 values in the postoperative phase did not develop a restenosis. however, among patients showing significantly rising pai-1 values from the 1st to the 3rd postoperative day, 50 % of all cases had a restenosis. a further risk factor in this respect is a significantly raised fibrinogen level, which lay over 800 % at the end of surgery. if these fibrinogen values do not fall from the 1st postoperative day onwards, a raised risk of thrombosis must be reckoned with in the absence of therapeutic intervention. the following parameters represented haemostaseological risk parameters with significant behaviour within the framework of this study: 1) with regard to blood pressure behaviour, the kallikrein-like activity (>60 u/l); 2) with regard to lung complications, alpha2-macroglobulin and pmn-elastase (>900 µg/l); 3) and finally, as a possible marker for a developing restenosis, pai-1 and fibrinogen (>800 %).
numerous clinical studies have found homocysteinemia to be an almost independent risk factor for atherosclerosis, including thrombotic complications, as well as for venous thromboembolism. experimental investigations on the underlying mechanisms suggest endothelial cell damage accompanied by the development of an atherogenic and thrombogenic potential, increased platelet reactivity, oxidative modification of ldl, and enhanced affinity of lp(a) for fibrin.
to our knowledge, no results have been published on the influence of homocysteine on leukocytes, although these cells are deeply involved in pathological events within the vasculature. therefore, as a first approach, different functional parameters of human polymorphonuclear leukocytes (pmnl) were followed under incubation with 60, 300, and 600 µmol/l (final concentration) dl-homocysteine (hc) in isolated fractions or whole blood, respectively: 1) spontaneous mobility of pmnl, measured as migration distance into micropore filters in a modified boyden chamber, is found to be significantly enhanced by the two smaller hc concentrations. 2) chemotaxis induced by 0.1 µmol/l formylmethionylleucylphenylalanine (fmlp) shows no significant differences. 3) monitoring of chemiluminescence signals (autolumat lb953, berthold) is complicated, as hc influences the luminol-mediated indicator reaction. after adjusting appropriate conditions, the following results are obtained: spontaneous chemiluminescence and that induced by zymosan, fmlp, and the ca2+-ionophore a23187 are enhanced by the two higher hc concentrations. there are, however, differences between the blood donors, as a minority does not respond to hc in repeated measurements. with phorbol 12-myristate 13-acetate the signal is diminished by hc in all cases and with all concentrations. 4) phagocytosis induced by zymosan (microscopic evaluation) as well as by opsonized e. coli (flow cytometric evaluation) is significantly increased by the two higher hc concentrations. conclusion: the activation of human pmnl is enhanced with respect to the majority of investigated stimuli by hc in concentrations reached under pathophysiological conditions.
the effect of physical exercise on hemostatic parameters was studied in 12 patients (male, mean age: 55 [range 36-68] yrs) with angiographically documented coronary artery disease (cad) and in 11 controls (male, 53 [43-62] yrs), both participating in a 1 hour group exercise session for cardiac rehabilitation. in each group, relevant arteriosclerotic lesions in carotid, abdominal and leg arteries were excluded by doppler ultrasound examinations. patients were all under β-blocking agents and aspirin. plasma levels of prothrombin fragment 1+2 (ptf1+2) and fibrinopeptide a (fpa), reflecting formation of thrombin and fibrin, respectively, were measured at rest and immediately after 1 hour of exercise consisting of jogging, light gymnastics and ball games. training intensity in both groups was comparable, as indicated by the mean heart rate during exercise, corresponding in patients to 80 ± 6 % (mean ± sd) and in controls to 79 ± 5 % of the maximal heart rate previously determined on a bicycle ergometer. baseline values for ptf1+2 were significantly lower in patients (0.67 ± 0.15 nmol/l; mean ± sd) than in controls (1.01 ± 0.21; p<0.001). after exercise we found an increase of ptf1+2 in controls to 1.14 ± 0.24 nmol/l (p<0.001), while in patients ptf1+2 remained unchanged (0.67 ± 0.14 after). accordingly, the exercise induced rise of fpa was more pronounced in controls (from 0.62 ± 0.24 to 1.60 ± 0.62 ng/ml; p<0.001) than in patients (from 0.63 ± 0.26 to 1.20 ± 0.46 ng/ml; p<0.001). we conclude that, in terms of thrombin and fibrin generation, exercise training does not exert detrimental effects on hemostasis in patients with cad. lower baseline values and the lack of exercise induced increases of ptf1+2 in patients with cad might be attributed to medication with aspirin and/or β-blocking agents.
periodontitis marginalis (pm) is an inflammatory oral disease that is caused by gram-negative bacteria and that has a high incidence in the second half of life.
clinical signs of pm are gingival bleeding, periodontal pockets, alveolar bone destruction and loss of teeth. recent epidemiological studies have provided some evidence for an association between pm and atherosclerosis. in the present paper we summarise some of the results that we have obtained in studies on patients with pm as well as on patients with hypercholesterolaemia (hc) and atherosclerosis. pm was frequently found to be associated with hc (90 % in rapidly progressive pm) and with increased reactivity of peripheral blood neutrophils and platelets (e.g. generation of oxygen radicals and paf-induced aggregation). patients with hc and atherosclerosis had a higher frequency of severe pm when compared with data on community periodontal health. the severity of pm was higher in patients with plasma cholesterol levels ≥ 6.5 mmol/l than in those with plasma cholesterol < 6.5 mmol/l. in patients with coronary atherosclerosis, the severity of pm was significantly correlated with plasma cholesterol level, systolic blood pressure and the number of diseased coronary arteries. these results provide further evidence for an association between pm, hc and atherosclerosis. it can be speculated that hc is not only a risk factor for atherosclerosis but also a risk factor for pm, and that it acts by increasing the reactivity of neutrophils and platelets. on the other hand, pm as a mild chronic inflammation could promote the development of atherosclerosis through effects of endotoxins on the vessel wall, blood cells and haemostatic factors. it has also been speculated that phagocytosing leukocytes in the inflamed periodontal tissues could contribute to oxidative modification of ldl. so far, there is no evidence that atherosclerosis may contribute to the pathogenesis of pm.
protein z (pz) is a vitamin k dependent plasma protein synthesized in the liver. it promotes the association of thrombin with phospholipid surfaces.
recently it has been shown that a deficiency of pz may lead to a bleeding tendency. in patients undergoing chronic hemodialysis, disorders of hemostasis are common. to examine whether plasma levels of pz are altered in patients with end stage renal disease, we determined pz in plasma of patients at the beginning of hemodialysis treatment. the results were compared with a group of 50 healthy controls. the difference in plasma pz levels between patients with end stage renal disease and the control group was not significant: pz was 2900 ± 487 ng/ml in the control group and 3133 ± 1394 ng/ml in the patient group. in one patient with a marked bleeding tendency after hemodialysis, pz was 1387 ng/ml. we conclude that pz determination may be helpful in patients with bleeding disorders.
the normal range of actin fs was reinvestigated in a multicentric approach. a protocol was developed which requests each center to assess the aptt with one common and one variable lot of actin fs in 50 samples of suspected normals. inclusion and exclusion criteria based upon the results of clotting assays, liver enzymes and clinical data were defined. results: a total of 1056 results was obtained. the majority of centers in this study used the electra 1000 or 1600 c (mla). results for the electra group (n = 6) showed a precision for the common lot of actin fs with a common lot of a three level control from 1.5 % (level 1) to 1.1 % (level 3), with an excellent accuracy between the 6 centers. clotting times with the variable lots of actin fs were very similar. the results from normals, however, showed a somewhat higher dispersion using the common lot of actin fs. 4 of 6 centers had almost identical mean values (range 26.6 to 27.2 sec), whereas one reported shorter and one longer clotting times (25.2 and 29.7 sec). the variable lots gave almost identical results to the common one. a total of 556 results of all lots gave a normal range of 23.5 to 31.7 sec (5th to 95th percentiles) on electra.
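a percentile-based normal range of the kind quoted for actin fs can be sketched in a few lines of python. this is only an illustration under stated assumptions: the abstract does not say which percentile estimator was used, so linear interpolation between order statistics is assumed here, the function names are hypothetical, and the example clotting times are invented values, not study data.

```python
import math


def percentile(sorted_values, p):
    """Percentile p (0..100) of pre-sorted data, using linear interpolation
    between neighbouring order statistics (an assumed estimator)."""
    if not sorted_values:
        raise ValueError("at least one value is required")
    k = (len(sorted_values) - 1) * p / 100.0
    lower = math.floor(k)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = k - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def reference_interval(clotting_times, low_pct=5, high_pct=95):
    """Return (lower limit, upper limit) of the central percentile range."""
    ordered = sorted(clotting_times)
    return percentile(ordered, low_pct), percentile(ordered, high_pct)


# hypothetical aptt results in seconds, for illustration only
example_times = [24.8, 25.6, 26.1, 26.6, 26.9, 27.2, 27.8, 28.4, 29.7, 31.0]
low, high = reference_interval(example_times)
print(f"normal range: {low:.1f} to {high:.1f} sec")
```

each laboratory would apply such a calculation to its own results from suspected normals, since instrument and local conditions shift the range.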
mean values on acl (n = 200) were 26.3 sec, on bct, 27.65 sec, on amga coagulometric, 29.4 sec, and on amga turbidimetric, 26.6 sec (n = 100 each). all centers used sarstedt monovettes with 3.2 % sodium citrate. discussion: the results of this study demonstrate the lot to lot consistency of all lots of reagents included in this study, since the common and variable lots showed very consistent results. interestingly, in the large group of electra users the normal ranges showed some differences, though the controls in all centers were almost identical. this confirms the recommendation that a normal range as stated by the manufacturer should be used for orientation only and that each laboratory should assess its own range.
direct acting anticoagulant agents such as hirudin (r-h), argatroban (arg), efegatran (efe) and peg-hirudin (ph) represent specific and potent inhibitors of thrombin. blood samples collected in r-h (10 µg/ml), arg (50 µg/ml), efe (25 µg/ml) and ph (10 µg/ml) do not clot for extended periods (>24 hours), thus allowing for the collection of plasma for analytical purposes. unlike heparin, these agents do not require any plasma cofactor for their anticoagulant effect. in contrast to citrate, oxalate, edta and heparin, these antithrombin agents do not alter the electrolyte or protein composition of blood. thus, blood collected in these agents may provide a physiologically intact (native) sample for clinical laboratory profiling. we have used all of these agents to prepare whole blood and plasma samples for various clinical laboratory measurements. plasma samples collected with these agents are obviously not suitable for global clotting tests (pt, aptt, thrombin time, fibrinogen); however, these agents are optimal anticoagulants for the collection of samples for the molecular markers of hemostatic activation, such as fibrinogen/fibrin related degradation products, prothrombin fragment, protease cleavage products, tfpi, tnf and other protein mediators.
electrolytes, blood gases, enzymes and protein profiling can also be satisfactorily measured on blood samples collected with these agents. antithrombin anticoagulated blood used for hematologic analysis showed blood count and differential results equivalent to those obtained with edta blood. unlike other anticoagulants, these agents do not interfere in the cell staining process. washed blood cells can also be prepared using buffers supplemented with antithrombin agents for morphologic and functional studies. thrombin inhibitors such as hirudin have also been used for flow cytometry and image analysis of blood cells and tissue exudates. our observations suggest that these agents can be used as suitable anticoagulants for clinical laboratory blood sampling. these agents can also be used as a flush anticoagulant for most automated instruments, as they exhibit anticoagulant properties superior to heparin. furthermore, the hematologic parameters obtained in antithrombin anticoagulated blood may be physiologically more relevant than those determined on blood collected in edta, citrate or heparin.
antithrombin iii determination is one of the most popular methods for the in vitro diagnosis of a number of different disorders. human thrombin, affinity purified on heparin-modified silica-based sorbents, was used for determination of the antithrombin iii level by the abildgaard method in blood of 40 patients with pregnancy pathology, acute leukemia, thrombocytopenia and anemia. it was found that the antithrombin level is decreased to 50-60 % of normal values in pregnancy pathology, to <50 % in acute leukemia and thrombocytopenia, and to s0% in anemia. the results obtained show a strong relationship between the named disorders and the patient's antithrombin iii level. therefore, antithrombin iii estimation may be used as a simple and quick method for preliminary diagnosis of the above named disorders.
bm coasys 110 is a complete automated analyzer system for coagulation tests.
it is well suited for routine coagulation testing in random access in a medium throughput laboratory environment. analytical performance and practicability were tested by a common evaluation program in five hospital laboratories. within run and day to day cvs were below 5 % in different samples (controls, patients). comparison in different therapeutic ranges confirms the declaration of the isi value for calculating inr values. normal values for coagulation tests with results in primary units were checked in 390 samples and confirmed. due to the optical measuring principle of the bm coasys 110, there was a slight tendency to shorter times with the thrombin reagent. in conclusion, the performance of coagulation tests with the bm coasys 110 was rated as good as or better than that of existing systems in the laboratories, with advantages due to a short familiarization time and easy handling. flexibility and stability of the system permit optimal integration and innovation into the workflow of the routine laboratory.
purified thrombin and antithrombin iii (at iii) are of great interest in clinical diagnostic and treatment practice, so their isolation methods are very important. molecules of these proteins have some fragments responsible for interaction with the native glycosaminoglycan, heparin. this interaction is used for isolation and purification of thrombin and at iii from native materials, blood plasma or its fractional products. we have carried out a comparative study of the purification of these proteins on heparin sorbents, which contain heparin immobilized on silica gel modified by glycidoxypropyl, gamma-aminopropyl and tosyl chloride groups, or on cellulose: heparin-epoxy-silica (1), heparin-gamma-aminopropyl-silica (2), heparin-tosyl-silica (3), and heparin-cellulose (4). we found that thrombin binds to all sorbents, while at iii does not bind to sorbents 2 and 3. there was no difference between silica and cellulose sorbents in thrombin desorption by 1 m nacl.
at iii binds more strongly to heparin-cellulose than to silica sorbents, but specific activity and degree of purity were approximately the same on both kinds of sorbents. thrombin specific activity and degree of purity were approximately twice as high on sorbents 2 and 3 as on sorbents 1 and 4 (2250-3000 nih units/mg versus 1260-1350 nih units/mg). therefore, sorbents 2 and 3 can be used for isolation and purification of thrombin, and sorbents 1 and 4 can be used for isolation and purification of at iii. we used these sorbents for large scale purification of the named proteins. purified thrombin was used for production of diagnostic kits for antithrombin iii, fibrinogen, fibrin/fibrinogen degradation products and thrombin time determination.
after aerobic or anaerobic physical exercise, various alterations of the hemostatic system have been detected. numerous investigations of the hemostatic system exist for running and for bicycle ergometer exercise, but not for swimming. young volunteers (n=54; median age 25 years) were investigated. there was an aerobic exercise (achieved heart rate 130 ± 10/min, lactate < 4 mmol/l; n=27) and an anaerobic exercise (achieved heart rate 150 ± 10/min, lactate > 4 mmol/l; n=27). in both groups there was a significant shortening of the ptt. under anaerobic conditions, hematocrit and quick increased significantly. factor viii activity rose significantly in both groups. indicating plasmatic clotting activation, there was a significant increase in the molecular markers tat and f1+2 only under anaerobic conditions (tat from 2.5 to 5.4 µg/l; f1+2 from 0.87 to 0.93 nmol/l). indicating activation of fibrinolysis, t-pa activity increased significantly in the anaerobic group (from 0.1 to 0.4 iu/ml) but not in the other group. these findings indicate that there is a balance in the hemostatic system, through activation of the clotting as well as of the fibrinolytic system, in young volunteers during exercise by swimming, depending on the degree of exercise load.
membranes as well as compact, porous disks are successfully used for fast analytical separations of biopolymers. as far as capacity, speed and performance of separation are concerned, the supports are as effective as other recently developed fast media for the separation of biopolymers. so far, technical difficulties have prevented the proper scaling-up of the processes and the use of membranes and compact disks for preparative separations. in this report, the use of a compact tube made of poly(glycidyl methacrylate) for fast preparative separations of proteins is shown as a possible solution to these problems. the units have yielded excellent results regarding performance and speed of separation as well as capacity. the application of compact tubes made of poly(glycidyl methacrylate) for the preparative isolation of the coagulation factors viii and ix from human plasma shows that this method can even be used for the separation of very sensitive biopolymers. in terms of yield and purity of the isolated proteins, this method was comparable to preparative column chromatography. the period of time required for separation was five times shorter than with corresponding column chromatographic methods.
our measurements showed an excellent correlation of the two systems (r=0.99). the maximum amplitudes on the roteg were on average 2.2 % higher than on the hteg, corresponding to a slightly lower reverse momentum of the measuring system in comparison to the hteg.
we report first results from the evaluation of sta compact (boehringer mannheim/diagnostica stago). sta compact is designed for automated analyses of routine and special coagulation (chronometric, photometric [405 nm] and turbidimetric [540 nm]) tests. in addition, it measures "derived" fibrinogen.
the following tests were evaluated: prothrombin time (pt), partial thromboplastin time (aptt), fibrinogen (clauss method), thrombin time, at iii (chromogenic), hepato quick, as well as the factors ii, v, vii, x, and viii. results: within run cvs of the clotting tests were below 2 % (calculated on the basis of seconds) in most cases, day to day cvs below 4 % (not yet measured for factors). at iii yielded within run cvs below 3 % in the decision range. measuring ranges: at iii: 20-140 %; fibrinogen: 1.3-9.0 g/l (plasma dilution 1/20), after rerun with other dilutions: from 0.3 g/l (dilution: 1/5) to 18 g/l (dilution: 1/40). method comparisons, using sta as reference, yielded slopes close to 1.00 and negligible intercepts. throughput: with routine clotting tests about 100 tests/h, in a sample selective access mode. we conclude that sta compact allows precise measurement of routine and special coagulation tests. it is also a reliable system for photometric tests and well suited for intermediate workloads as well as stat analyses.
we evaluated ptt lt, a new liquid, silica based ptt reagent. special attention was given to reference interval and heparin sensitivity. the new reagent is well suited for the measurement of intrinsic clotting factors and is reported to have high sensitivity for lupus anticoagulants (higher sensitivity than sta aptt [boehringer mannheim = bm]). it is stable for 4 days in the cooled compartment of the sta analyzer. methods: all experiments were made on sta. for comparison, we used three other ptt reagents (a laboratory routine, silica based aptt, as well as sta aptt and sta ptt kaolin from bm). in addition, thrombin time (3 u/ml thrombin, sta thrombin reagent) and heparin (chromogenic xa test, rotachrom heparin) were measured. results: within run imprecision (n=21) was below 0.7 % cv in the normal range and in two controls (mean values: 35 s, 76 s), and 1.4 % in a heparin plasma (mean: 81 s).
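the within run and between day imprecision figures quoted in these evaluations are coefficients of variation (cv). as a minimal python sketch, assuming the usual definition (sample standard deviation divided by the mean, expressed in percent), the calculation looks as follows; the function name and the replicate clotting times are hypothetical and are not taken from the studies.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100.

    Needs at least two replicate measurements and a non-zero mean.
    """
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("cv is undefined for a mean of zero")
    return statistics.stdev(values) / mean * 100.0


# hypothetical replicate aptt results (seconds) from a single run
replicates = [34.8, 35.1, 35.0, 34.9, 35.2, 35.0, 34.7]
print(f"within run cv: {coefficient_of_variation(replicates):.2f} %")
```

the same function applies to between day imprecision when it is given one result per day instead of replicates from one run.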
between day imprecision (d=10) was below 3 % in two controls (mean values: 34 s and 58 s). the upper limit of the reference range is 40 s (97.5th perc., median: 32 s; 90 patients with normal coagulation status [routine aptt, fib., pt], median age: 54 years); almost identical reference ranges were obtained with sta aptt and the routine ptt reagent, while sta ptt kaolin showed significantly lower values (97.5th perc.: 33 s, median 28 s). method comparison study: good agreement using plasmas from patients without heparin (y = 0a + 0.98 x, n = 198, range of x from 25 to 79 s, r = 0.97; x = sta aptt). the median values from 54 patients under high dose heparin were: routine ptt: 81 s, sta aptt: 75 s, ptt lt: 82 s, sta ptt kaolin: 54 s, thrombin time: 37 s and heparin: 0.5 iu/ml. in conclusion, results of the new reagent compare well to our routine ptt and to the sta aptt system reagent. it allows sensitive monitoring of high dose heparin therapy and is well suited for detecting abnormalities of the intrinsic clotting factor pathway.
thrombelastography (teg) has been a standard technique for many years. the interpretation of the thrombelastograms has been widely based on phenomenologic observations, while there is a lack of exact information concerning the coagulation mechanisms leading to the teg amplitude (ateg). the ateg is a measure of the mechanical stiffness of the clot and depends on:
a) fibrin formation and adequate polymerisation of a 3-dimensional network: measurements with nonrecalcified citrated blood activated with adp or epinephrine (both n=10) did not show any clot formation in the teg. this relies on the need for a mechanical coupling between the teg pin and cup over a distance of 1 mm, which is accomplished by the fibrin network. therefore, teg can only be performed under thrombin formation and thus under thrombin-activation of the platelets in the sample. factors which inhibit platelet aggregation but do not limit thrombin-activation of platelets cannot be monitored by teg.
b) the attachment of the clot to the surface of the teg pin and cup. according to recent literature, we suggest that the attachment of the clot in the teg relies exclusively on fibrinogen/fibrin adsorption to the surfaces of the pin and cup. interruption of this attachment can result in lower amplitudes or the so-called "stairway" phenomenon. we could show a complete interruption of the clot attachment by dipping the pin for one second in 30 % albumin solution (n=10).
c) the fibrinogen concentration (fg) and platelet count (pc) of the sample. in 50 volunteers we found only a poor correlation of the maximum amplitude (ma) with fg alone (r=0.40) or pc alone (r=0.50), while there was a very good nonlinear correlation with the product of fg and pc. we suggest that the fibrin network forms the main structure of the clot, while the thrombocytes enhance its stiffness in a concentration-dependent manner. this effect of the platelets can be completely reversed by gpiib/iiia antagonists.
d) adequate coagulation activation: in nonactivated teg even small amounts of inhibitors can lead to a significant reduction of the ateg.
conclusion: alterations in teg measurements can be judged more properly when the underlying mechanisms are understood. consideration of the limitations of the method allows a more specific interpretation of the results.
in response to a customer request, we investigated the stability of blood samples for the aptt. the study was set up in a way that simulated the conditions of a large private laboratory in which the samples arrive several hours after blood collection. blood was drawn from 10 donors into 3.2 % sodium citrate and mixed well before it was divided into several aliquots, which were kept at room temperature. the aliquots were centrifuged approximately 0.5, 11, 23 and 29 h after venipuncture, and the plasma was analyzed immediately with 3 different reagents on electra 1000.
results: there was a clear difference in the change of the aptt over time with these reagents. f viii (determined with a chromogenic assay with complete and standardized activation) also changed considerably. reagent a: ellagic acid, plant phospholipid; reagent b: sulfatide/kaolin, phospholipids; reagent c: ellagic acid, plant and rabbit brain phospholipids. the increase of the aptt was apparently not a function of the decrease of f viii, because the in vitro f viii sensitivity of reagent b was inferior to that of reagent a although reagent b showed more prolongation of the aptt than reagent a. reagent c, however, showed only minor changes in the aptt. discussion: these data show that the sample stability of the aptt is reagent dependent and that it is not simply a function of f viii sensitivity. other factors such as the buffer system, but also the sensitivity towards factors other than f viii, seem to contribute. a comparison of the technical principle of the roteg coagulation analyser and conventional thrombelastographic systems an. calatzis, p. fritzsche, al. calatzis, +m. kling, +r. hipp, a. stemberger institute for experimental surgery and +institute of anesthesiology, technische universität münchen thrombelastography (teg) was introduced by hartert in 1948 as a method for continuous registration of the coagulation process. in 1995 we presented the roteg coagulation analyser, using a newly developed technical method. in teg systems according to hartert the sample (blood or plasma) is placed in a cup which is alternately rotated to the right and left by 4.75°. a cylindrical pin, which is suspended freely on a torsion wire, is lowered into the blood. when coagulation starts, the clot begins to transfer the rotation of the cup to the pin against the reverse momentum of the torsion wire. the angle of the pin is electromagnetically detected, transformed to the teg amplitude and continuously recorded. in the roteg the pin is attached to a short axis, which is guided by a ball bearing. 
thus all possible movement is limited to rotation (roteg). the cup is stationary, and the pin is rotated alternately by 5° to the right and left by a spring system. when a clot is formed, it attaches to the surfaces of the pin and cup and starts preventing their relative movements against the reverse momentum of the spring. here the reduction of the rotation of the pin, which is detected optically, is transformed to the teg amplitude. as can be shown by theoretical analysis and by control measurements, the roteg provides the same measuring capabilities as conventional teg systems. the main advantage is the solid guiding of the measuring system, which makes the roteg easily transportable and less susceptible to shock or vibration during measurement. thrombelastography (teg) is a standard monitoring procedure for evaluation of coagulation. usually only nonactivated native blood teg measurements (nateg) are performed, which leads to a) a long time interval until coagulation and fibrinolysis parameters are available, b) very high susceptibility of the measurement to inhibitors like heparin, which disturbs the judgement of other components of coagulation, c) unspecific results. our aim was to develop a coagulation monitoring system based on teg providing fast and specific information on the different components of coagulation. methods: the following measurements are performed in parallel using disposable pins/cups (haemoscope): a) extrinsic activated teg (exteg): 355 µl whole blood (wb) + 5 µl innovin (recombinant thromboplastin reagent, dade). b) intrinsic activated teg (integ): 355 µl wb + 5 µl kaolin (suspension 5 g/l, behring). c) aprotinin teg (apteg): exteg + 20 kie aprotinin (trasylol, bayer). d) heparinase teg (hepteg) as described in (1). results: exteg and integ provide information on the extrinsic/intrinsic system within 5-10 min and information on the platelet/fibrinogen status within 10-20 min. 
because of the addition of potent activators, integ and exteg can be performed when inhibitors like heparin are present in the circulation. fibrinolysis effects can be seen on exteg and integ and by comparison of exteg and apteg (apteg: in vitro fibrinolysis inhibition by aprotinin). if fibrinolysis is detected by exteg or integ and aprotinin-susceptibility is verified by apteg, aprotinin therapy will be initiated. heparin effects are revealed by hepteg. discussion: by the comparison of parallel teg measurements which have been differently activated, specific and fast information on the different aspects of the clinical coagulation status is provided. the presented tests can be easily performed bedside and only a small specimen of whole blood is needed (0.4-1.8 ml). introduction: a severely prolonged aptt (333s; normal: 40~os) was observed during preoperative screening for a planned splenectomy in a 71-year-old man with an 8 year history of osteomyelofibrosis. following near-normalization (71 s) of the aptt after 10 min preincubation in a kaolin based aptt assay, pk deficiency was suspected and studies were performed to further investigate the nature of the pk deficiency as well as the mechanism underlying the normalization of the prolonged aptt by increasing the preincubation time. methods: the aptt assay was performed using kaolin/inosithin. high molecular weight kininogen clotting activity (hk:c), fxii:c and pk:c were measured by an aptt based assay using neothromtin® (behring) and 1 min (pk:c) or 4 min (hk:c, fxii:c) preincubation. pk amidolytic activity (pk:am) was assayed using cosset pk ~ (chromogenix) and pk antigen (pk:ag) by quantitative immunoblotting. fxii and hk proteolysis during activation of plasma by kaolin (10 mg/ml at 37°c) or ds (12.5 µg/ml at 4°c) was demonstrated by immunoblotting assays of fxii and hk following sds-page. 
the propositus had pk:c<5%, pk:am=5% and pk:ag<2.5% as compared to normal pooled human plasma (nhp). his son and two daughters had pk:c of approximately 50% and normal aptt values. incubation of the propositus' plasma with ds did not result in fxii or hk cleavage within 180 min, whereas in nhp detectable fxii and hk proteolysis occurred after 5 min and complete proteolysis was observed after approximately 120 min. in contrast, kaolin activation of the propositus' plasma led to slow activation of fxii after 10 min, presumably by autoactivation, and to fxiia-induced hk proteolysis. near-normalization of the propositus' aptt by prolongation of the preincubation time paralleled fxii autoactivation as evidenced by immunoblotting. we describe a propositus with severely prolonged aptt due to hereditary, crm negative pk deficiency suffering from omf. activation with a particulate suspension of kaolin led to slow fxii autoactivation and hk proteolysis, whereas ds in solution did not induce fxii or hk cleavage. fxii autoactivation seems to be responsible for the normalization of the prolonged aptt in pk deficiency after prolonged preincubation times. in our study we compared a conventional bag with silicone tubing (a) for blood donation with 2 new ones (b from biotrans and c from baxter) with a newly developed y-shaped adapter. this adapter is integrated into the tubing and therefore provides the advantage of drawing blood samples in a closed system. the 3 systems were identical in amount and content of anticoagulant, i.e. 63 ml of cpd per bag resulting in approximately 14% of the final whole blood volume. the purpose of the study was to determine whether the different tubings can influence the quality of plasma products concerning the blood coagulation system. in plasma samples we measured several factors of the procoagulatory and fibrinolytic systems. 
intraindividual control citrated (0.135 m) blood samples were initially drawn from the contralateral cubital vein of the same male donor (34 in each group). in all bag samples we found small but significantly higher levels of the global test parameters aptt and tt compared to controls, indicating a higher amount of anticoagulant. pt, however, revealed no differences, thus suggesting that factor activities were not altered (statistics according to mann-whitney). procoagulatory activity measured as tat complexes showed elevated levels in bags a and c, whereas prothrombin fragments f1+2 decreased only in a. concerning the fibrinolytic system, plasminogen activators and pai-1 values were diminished in all three systems (b < a < c) compared to controls. d-dimers were lowest in a followed by slightly higher values in c, controls and b. fibrin monomers did not reveal any significant differences: a < c < controls < b. in summary, the quality of the 3 different blood sampling devices was comparable to the intraindividual controls as to factor activities measured by global tests. the activation of the procoagulatory and fibrinolytic systems was slightly but in most cases significantly higher in the two new devices than in the conventional one. all values obtained from the plasma samples, however, did not exceed the normal range of healthy blood donors. therefore we concluded that the two new closed blood drawing systems are favorable for blood donating procedures. 20 patients with acute myocardial infarction (ami) and thrombolytic therapy (13 patients with rt-pa, 6 patients with streptokinase and one with heparin) were divided into two groups on the basis of ck, myoglobin and ekg criteria (reperfusion/no reflow two hours after starting the thrombolytic therapy). blood samples were taken before, 30 min, 1 h, 2 h, 4 h, 8 h and 12 h after lysis and then every day until day 10. 
because of the central role of factor xii in activation of coagulation, fibrinolysis, the kallikrein-kinin system and the complement cascade, we investigated the role of factor xiia initiated by ami and the relation of factor xiia to the thrombolytic agent and reocclusion rate. for the investigations we used kits from shield diagnostics (xiia), behring diagnostica (c1-inactivator, plasminogen, α2-antiplasmin, pap), chromogenix ab (prekallikrein) and diagnostica stago (viia). results: there is an increase of factor xiia immediately after starting the fibrinolysis (max. 30 min after starting); the increase is independent of the thrombolytic agent. parallel to factor xiia, factor viia rises without significant changes of c1-inactivator and prekallikrein. that means: activation of xiia and the fibrinolytic pathway leads to relatively mild changes in the kallikrein system, but to significant activation of the extrinsic system by viia-tissue factor. in some patients there is an additional rise in the xiia-viia system when the fibrinolytic system is already in the normal range. further investigations are needed to define the risk of reocclusion as a result of activation of factor viia by factor xiia. autoimmune thrombocytopenic purpura (aitp) is a frequent complication of chronic lymphocytic leukemia (cll) which develops at different stages of the disease and needs special treatment measures. the mechanism of autoimmune disorders in cll remains unclear. we investigated the immunologic phenotype of blood lymphoid cells in 22 patients suffering from cll with aitp. in these patients we did not observe disorders in expression of b-lineage markers as compared with cll patients without immune complications (13 patients). but in the 1st group of patients a greater number of b-cells expressed markers of activation. according to ig heavy chain expression, the lymphocytes in most cases of cll complicated by aitp had a more mature phenotype. 
in all patients with k phenotype of cll lymphocytes we found immune disorders. the development of aitp was accompanied by a lowered level of t cells and a changed distribution of their immunoregulatory subsets: a diminished number of cdz~cells and an increased number of cd~'÷lymphocytes. the results of our investigations indirectly show that malignant b-cells in cll are involved in production of autoantibodies against blood cells. imbalance in the t-cell system with functional disturbances of immunoregulation is significant in the development of autoimmune complications in cll. in women with severe fvii deficiency (<10%) hypermenorrhagia may cause life threatening blood loss. therefore, hysterectomy at a young age is reported frequently in the literature. a 12 year old girl without a history of a bleeding disorder was transferred with hypermenorrhagia. the initial laboratory data revealed an abnormal quick-test of 30% due to fvii of 9.0%, a normal platelet count and a hemoglobin level of 7.2 g/dl. antifibrinolytic therapy (tranexamic acid 4x15 mg/kg bw/d) and lynestrenol substitution were started to reduce the hemorrhage. despite treatment the daily blood loss increased to a maximum of 290 ml. therefore, substitution therapy with recombinant fviia (rfviia) (novonordisk) was started at a dose of 15 µg/kg bw q 6 h. subsequently blood loss decreased to 30 ml/d, but even with an increasing dose of rfviia up to 35 µg/kg bw q 4 h (fvii activity max. 7400% 10 min after injection) and additional hormonal support with a lh-fsh-antagonist some hemorrhage remained. a short course of methergin was stopped due to severe pain. ultrasound of the uterus revealed a hypertrophic endometrium causing the persistent bleeding. it decreased slowly over several weeks and hemorrhage stopped completely after 40 d. the total rfviia dose administered was 118 mg. no side effects were observed. no transfusions of blood products were necessary. currently, the menstrual cycle is suppressed by estriosuccinate. 
conclusion: due to close cooperation with a specialised gynecologist, hypermenorrhagia was controlled and in this woman with severe fvii deficiency hysterectomy was avoided. in three male members aged between 27 and 52 years of a family suffering from inherited bleeding disorders the diagnosis of protein z deficiency was established. plasma protein z evaluated by elisa (asserachrom protein z, diagnostica stago, france) ranged between 200 and 300 ng/ml. the patients mostly suffered from moderate bleeding complications like prolonged bleeding secondary to trauma or invasive measures and also spontaneous hematuria. previous laboratory investigations revealed variable platelet function deficiencies and a transitory borderline decrease of von willebrand factor. spontaneous bleedings were rarely recognized; however, they occurred more frequently when analgesics were taken. bleeding complications showed good response to hemostyptic measures and antifibrinolytic therapy. the use of pcc containing a high level of protein z in these patients is restricted to severe bleeding disorders or major surgery. defibrotide is a mammalian polydeoxyribonucleotide derived anti-ischemic and antithrombotic drug (crinos s.p.a., villa guardia, italy). while the drug is known to produce polytherapeutic effects owing to its multicomponent nature, the exact mechanisms of its anti-ischemic effects remain unknown at this time. since defibrotide is found to be effective in ischemic disorders such as paod, vod related occlusive disorders and related microangiopathic conditions, we studied the effect of this drug on the contraction of dog and pig arterial strips/rings obtained from various sites. 
in vitro supplementation of defibrotide to the organ bath containing control dog and pig arterial rings did not modulate the serotonin and thromboxane (generated) contraction; however, tissues obtained from dogs treated with 10 mg/kg defibrotide iv exhibited a profound desensitization to the agonist induced contractile process. the time course of these effects was found to be much longer than the plasma half-life of defibrotide. this presentation will provide additional data on the effect of defibrotide on the contraction of vascular smooth muscles as a possible explanation for the anti-ischemic effects of defibrotide. a. wehmeier, a. popescu, w. schneider klinik für hämatologie, onkologie und klinische immunologie der heinrich-heine-universität düsseldorf in chronic myeloid leukemia (cml), evolution of blast crisis is the limiting factor of survival. however, as in other chronic myeloproliferative disorders, bleeding and thrombotic complications are a major source of morbidity but their incidence has rarely been analysed in larger patient groups. we retrospectively evaluated 182 patients with cml during chronic phase (170 cases), accelerated disease (58 cases), and blast crisis (72 cases), and determined the incidence of thrombohemorrhagic complications in relation to the stage of the disease. in chronic phase, 28 patients had bleeding complications (8.4%/patient year) and 15 patients thrombotic episodes (4%/patient year). the incidence of bleeding increased significantly in accelerated disease (18 patients, 51.2%/patient year) and blast crisis (37 patients, 347%/patient year), and many patients had repeated complications. contrary to our expectations, the incidence of thrombotic complications also increased to 10.2%/patient year in accelerated phase and 39.8%/patient year in blast crisis. in chronic phase, 3 patients died because of bleeding events. in accelerated phase, 5 patients died due to bleeding and 1 patient due to thrombotic complications. 
in blast crisis, bleeding was associated with 21 deaths, and pulmonary embolism with 2 deaths. analysis of the cause of thrombohemorrhagic complications revealed that in chronic phase, bleeding was often associated with uncontrolled busulfan therapy, whereas in blast crisis, severe bleeding occurred mainly when platelet counts were low and peripheral blasts increased. however, there was no obvious explanation for thrombotic complications. we conclude that bleeding and thrombotic complications are a major source of morbidity and mortality also in cml, and that the incidence of such complications increases in advanced stages of the disease. klinik für innere medizin, klinikum schwerin patients suffering from primary or secondary amyloidosis may occasionally acquire a coagulation disorder characterised by isolated factor x deficiency. we report on a 60-year-old man who presented with lower gastrointestinal bleeding and prolonged prothrombin time (quick 50%). amyloidosis was suspected and proven using biopsy of the rectum and histological analysis. in addition, a monoclonal gammopathy of undetermined significance was diagnosed by immunofixation (light chain, type x). detailed investigation of the prolonged prothrombin time led to the discovery of a pronounced factor x deficiency (residual activity 4%). inhibitors of coagulation factors could not be demonstrated. the treatment of the patient consisted of red blood cell transfusion and infusion of prothrombin complex concentrates. due to the extremely rapid clearance of infused factor x, no increase of its activity was observed. chemotherapy of the monoclonal gammopathy was initiated (melphalan/prednisone). over the following six months the frequency of major bleeding episodes gradually decreased. however, subclinical occult bleeding continued. the factor x activity was repeatedly found between 10 and 12%. 
we support the suggestion from literature data that clinically relevant bleeding episodes are likely to occur in patients with amyloidosis-associated factor x deficiency if the residual activity is below 10%. sepsis and septic shock is a disease entity which is characterized by inflammatory reactions (sirs), coagulation abnormalities (dic), organ failure (mof) and severe hemodynamic alteration frequently leading to death in shock. the aim of our studies was to investigate the efficacy of antithrombin iii (kybernin®) on the outcome of septic shock in a pig endotoxemic model. pigs in this model respond to lps with elevated tnf levels, decreased leukocyte and platelet counts, increased tat and fibrin monomer levels, hypotension and an increase of the pulmonary arterial pressure (pap), indicating impaired lung function. a total number of 13 male castrated juvenile domestic pigs (25-30 kg) were anaesthetized, ventilated mechanically and infused with salmonella abortus equi lipopolysaccharide (s. equ-lps) over three hours (0.5 µg/kg × h). a swan-ganz catheter was inserted into the pulmonary artery to measure the pap. animals were allocated to two groups. the treatment group (n = 7) received antithrombin iii (at iii) according to the following regimen: 250 u/kg (t = 60 -30, i.v. infusion), 125 u/kg (i.v. bolus, t = 0) and 250 u/kg (t = 180-240 min, i.v. infusion). the placebo group (n = 6) received the appropriate amount of human serum albumin: 50 - 25 - 50 mg/kg (same schedule as with at iii). the main objective was defined as the mortality rate at six hours after s. equ-lps infusion. whereas in the placebo group 4 out of 6 animals died (mortality rate: 66%), all at iii-treated pigs survived the observation period of 6 hours (p < 0.05, χ² test). the at iii group was shown to have a lower pap than the control group; in particular, the second peak of hypertension was abolished by at iii. 
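the reported comparison (4 of 6 placebo animals died, 0 of 7 at iii-treated animals died, p < 0.05 by χ² test) can be rechecked with a minimal sketch. the abstract does not say whether a continuity correction was used, so both variants are computed here; the function name `chi_square_2x2` is a hypothetical helper for illustration, not the authors' procedure. with counts this small, an exact test would normally be preferred.

```python
import math

def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square test for a 2x2 table [[a, b], [c, d]] with 1 degree of freedom.

    Returns (chi2, p). With yates=True the continuity correction is applied.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2, 0.0)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * diff ** 2 / denom
    # For 1 degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Counts taken from the abstract: placebo 4 died / 2 survived, AT III 0 died / 7 survived.
chi2_plain, p_plain = chi_square_2x2(4, 2, 0, 7)
chi2_yates, p_yates = chi_square_2x2(4, 2, 0, 7, yates=True)

print(f"uncorrected: chi2 = {chi2_plain:.2f}, p = {p_plain:.4f}")
print(f"yates:       chi2 = {chi2_yates:.2f}, p = {p_yates:.4f}")
```

under either variant the p value stays below 0.05, which is consistent with the significance level stated in the abstract.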
it is therefore concluded that at iii should be a useful tool for the treatment of severe sepsis and septic shock. in a nationwide monthly survey all children's hospitals in germany (esped) were asked for clinical and therapeutic information about children suffering from pmi. from july 94 to june 95, 299 children were registered. of these, 87 had ecchymoses and/or necroses, related to an increased morbidity and mortality (20%), whereas 212 showed no bleeding signs except for petechiae. of these children one died. the therapeutic interventions concerning hemostasis are listed according to the two defined risk groups. of the patients with ecchymoses or necroses, 13/87 received combination therapy (compared to 5/212 with petechiae or no bleeding sign) of at iii, heparin and/or plasma. only 1 child received protein c concentrate. the data show that children with low risk did in part receive higher doses of heparin and/or at iii concentrate than did high risk patients, whereas plasma therapy was adjusted to severity of coagulopathy. furthermore, the wide range of given therapeutics allows no conclusions about the different medications. therefore, controlled studies with respect to the different therapeutic interventions in children with high risk pmi are desirable. a fully automated procedure for the reptilase time assay y. schmitt (1) and h.j. kolde (2) (1) institute for laboratory medicine, städtisches klinikum, darmstadt, frg, (2) dade diagnostics, unterschleißheim the reptilase time assay is a relatively simple technique for the detection of fibrinogen degradation products and fibrinogen deficiency or abnormality. the procedure is performed with citrated plasma and batroxobin reagent, a snake venom enzyme from bothrops atrox. this enzyme cleaves fibrinogen by releasing fibrinopeptide a only but not fibrinopeptide b. 
in contrast to the physiological enzyme thrombin, which is readily neutralized by antithrombin iii and heparin, batroxobin is not inactivated by physiological inhibitors. at present this assay is mainly performed manually or on mechanical instruments. we have adapted this assay to the electra 1000 fully automated coagulation analyzer (medical laboratory automation, pleasantville, n.y.) using the thrombin clotting time procedure in the instrument software with batroxobin reagent (dade diagnostics). the clot formation is registered turbidimetrically and the clotting time is printed. the within-run precision (n = 10) of this procedure was tested with two plasmas from the daily routine and was between 2.8 and 3.4%. in 25 normal samples we found clotting times from 10.5 to 12.8 sec. in 30 samples from patients with liver disease (confirmed by pseudocholinesterase < 2000 u/ml) or on thrombolysis therapy with streptokinase or urokinase, the fully automated assay on the electra was compared to the semiautomatic method using a kc 10 coagulometer (amelung, lemgo, germany) based on a rolling metal ball principle and magnetic endpoint detection. the two assays agreed very well, with a correlation coefficient of r = 0.948 and a regression line according to passing and bablok of y = 1.0 x + 1.7. these data show that the reptilase time can be performed on the electra 1000 with good precision and with good correlation to the manual technique on mechanical instruments. introduction: disseminated intravascular coagulation (dic), due to a massive activation of the coagulation system, is frequently observed in intensive care patients suffering from severe underlying diseases. 
laboratory diagnosis of dic is based on different coagulation tests, but unfortunately the routine haemostaseological parameters react with latency in the course of acute dic. objective: in four cases from a cohort of 43 patients with severe sepsis and dic we analysed special haemostaseological parameters (tat, f1-f2, d-dimers, human leucocyte elastase (hle), cathepsin g and heparin cofactor ii (hc ii)) and correlated them with a mof-score in order to test their predictive value for the prognosis of these patients. results: all patients were substituted with at iii concentrate. in the investigated patients the median time of treatment with at iii concentrate was 8 (6-9) days and the median time of dic duration was 6 (4-8) days. none of the presented patients died during the observation period. all analysed parameters, except d-dimers, showed a sufficient correlation with the evaluated mof-score (tat: r = 0.78; f1-f2: r = 0.84; hle: r = 0.71; cathepsin g: r = -0.75; hc ii: r = -0.88). the d-dimers did not correlate with the mof-score, which is probably due to the delayed reactive hyperfibrinolysis in the course of dic. furthermore, the decrease of the tat-complexes, f1-f2, hle and cathepsin g levels was followed by an increase of at iii and hc ii activity. conclusion: in general the analysed activation markers and coagulation parameters are sufficient to describe the ongoing process of dic. the hyperfibrinolytic activity of dic is sufficiently represented by the d-dimer test, but is of deferred reactivity in the course of dic. unfortunately these parameters are not established in the routine monitoring of dic on intensive care units and therefore further studies are needed to investigate their practicability and reliability in daily routine monitoring. 
we have previously reported that notoginsenoside r1 (ng-r1) counteracts lipopolysaccharide (lps) induced upregulation of plasminogen activator inhibitor-1 and tissue factor expression in cultured human umbilical vein endothelial cells in vitro and in mice in vivo [fibrinolysis 1994;8:(suppl 1)119]. in this study we investigated the effect of ng-r1 on prevention of lps induced lethal toxicity in mice. because mice are relatively resistant to lps when applied as a single agent, we sensitized them by simultaneous treatment with d-galactosamine. the 80% lethality induced by lps (1.5 mg/mouse) plus d-galactosamine (8 mg/mouse) in c3hs-ie mice was reduced to 16% by simultaneous administration of ng-r1 (1.5 mg/mouse) with lps/galactosamine (p<0.05 by χ² test). ng-r1 also significantly delayed lps/galactosamine induced lethal toxicity from 12 hours to 30 hours, with all animals surviving beyond 30 hours. because lethality induced by lps involves the synergistic effect of multiple effector molecules such as tumor necrosis factor (tnf)-α, interleukin (il)-1, interferon γ etc., we also investigated the effect of ng-r1 on lps induced tnf-α production from leukocytes in cultured human whole blood cells (hwbcs) ex vivo. the production of tnf-α induced by lps (for 24 hours) in the supernatant of hwbcs was inhibited by 46% and 22% when the cells were incubated with 1 ng/ml or 10 ng/ml lps, respectively, together with 100 µg/ml ng-r1 (tnf-α concentration, 1 ng/ml lps treated cells: 297±192 pg/ml, 1 ng/ml lps plus 100 µg/ml ng-r1 treated cells: 162±137 pg/ml, p<0.01; 10 ng/ml lps treated cells: 3094±487 pg/ml, 10 ng/ml lps plus 100 µg/ml ng-r1 treated cells: 2423±713 pg/ml, p=0.02). the present results suggest that ng-r1 can prevent the onset of lps toxicity as well as the lps induction of cytokines. therefore ng-r1 may be effective in preventing the effects of septic shock in gram-negative infections. 
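the inhibition percentages quoted for tnf-α can be approximately recomputed from the mean concentrations given in the abstract. the following is a minimal sketch, assuming that inhibition is defined as the relative decrease of the mean tnf-α concentration; the helper name `percent_inhibition` is hypothetical, and the abstract does not state how the authors derived their figures.

```python
def percent_inhibition(control, treated):
    """Relative decrease of `treated` versus `control`, in percent."""
    return 100.0 * (control - treated) / control

# Mean TNF-alpha concentrations (pg/ml) as quoted in the abstract.
low_lps = percent_inhibition(297, 162)     # 1 ng/ml LPS, without vs. with NG-R1
high_lps = percent_inhibition(3094, 2423)  # 10 ng/ml LPS, without vs. with NG-R1

print(f"1 ng/ml lps:  {low_lps:.1f}% inhibition")
print(f"10 ng/ml lps: {high_lps:.1f}% inhibition")
```

the recomputed values fall within about one percentage point of the reported 46% and 22%; the small difference at 1 ng/ml lps may reflect rounding of the published means.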
to elucidate the mechanisms by which coagulation is initiated in septic patients in vivo, coagulation measurements were prospectively evaluated in patients with severe chemotherapy-induced neutropenia. this group of patients was chosen because of their high risk of developing severe septic complications, thus allowing serial prospective coagulation testing prior to and during evolving sepsis or septic shock. 62 patients with febrile infectious events were accrued to the study. of these, 13 patients progressed to severe sepsis and an additional 13 patients to septic shock. at onset of fever, factor (f) viia activity, f vii antigen and antithrombin iii (at iii) activity decreased from normal baseline levels and were significantly lower in the group of patients who progressed to septic shock compared to those who developed severe sepsis (medians: 0.3 versus 1.4 ng/ml, 21 versus 86 u/dl and 45 versus 95%; p < 0.001). the decrease of these variables in septic shock was accompanied by an increase in prothrombin fragment 1 + 2, a marker of thrombin generation (medians: 3.6 versus 1.4 nm; p=0.05). these differences were sustained throughout the septic episode (p < 0.0001). f viia and at iii levels of <0.8 ng/ml and <70%, respectively, at onset of fever predicted a lethal outcome with a sensitivity of 100 and 85%, and a specificity of 75 and 85%, respectively. in contrast, fxiia-alpha antigen levels were not different between both groups at onset of fever and were only marginally higher later in the course of septic shock (p=0.001). thus, septic shock in neutropenia is associated with significant coagulation activation, presumably driven by the tissue factor pathway rather than the contact system. furthermore, in septicemia both f viia and at iii measurements are sensitive markers of an unfavourable prognosis. hemostatic parameters in sepsis patients treated with anti-tnfα monoclonal antibodies c. salat 1, p. boekstegers 2, e. holler 1,3, b. reinhardt 1, r. 
pihusch 1, k. werdan 2, m. kaul 4, t. beinert 2, e. hiller 1 med. klinik iii 1 und i 2, klinikum grosshadern der ludwig-maximilians-universität münchen, hämatologikum der gsf 3, knoll ag ludwigshafen 4 tumor necrosis factor α (tnfα) is a central mediator in the pathogenesis of sepsis and septic shock. as administration of anti-tnfα monoclonal antibodies was able to protect animals from an otherwise lethal endotoxin challenge, clinical studies were initiated in patients with sepsis. tnfα exerts a procoagulant effect, e.g. by enhancing pai-1 and activating thrombin as indicated by an increase in tat and pf 1/2 levels. therefore it may be involved in disseminated intravascular coagulation in sepsis. we determined tat, pf 1/2, d-dimers, tpa, upa, pai-1 and vwf levels in 30 patients with sepsis or septic shock. 14 patients received the anti-tnfα monoclonal antibody mak 195f (knoll ag, ludwigshafen), whereas 16 patients served as controls. we found a significantly lower level of upa in anti-tnfα treated patients. since the difference existed before onset of treatment it cannot be attributed to tnfα antagonisation. all other parameters investigated did not differ significantly between the two groups throughout the study period. failure to detect modulation of hemostasis by anti-tnfα might be explained by delayed initiation of treatment in clinical sepsis. in animal experiments it has been observed that the antibody prevented lethal endotoxin effects when given prophylactically or 30 minutes after endotoxin challenge, but not when it was administered 2.5 hours later. in addition, beneficial clinical and hemostatic effects of tnfα antagonisation might be observed only in subgroups of patients with hyperinflammatory sepsis. larger studies addressing this point are under way. protease receptors for thrombin and trypsin have been described for different cell lines. we investigated the ability of trypsin to activate human umbilical vein endothelial cells (huvec). 
cell activation was measured by the increase of intracellular free ca2+ ([ca2+]i) using microscope fluorometry (fura-2) and by von willebrand factor release measured by a sandwich elisa. incubation of huvec with thrombin (1 u/ml) or trypsin (10 nM) produced a 2-10 fold increase of [ca2+]i. a subsequent homologous stimulation after 80 s led to a 2-5 fold lower [ca2+]i concentration compared to the first stimulation, indicating that the cells had been desensitised by the first stimulation. when the proteolytic activity of trypsin was inhibited by soybean trypsin inhibitor, trypsin failed to induce an increase of [ca2+]i concentration. in cross-stimulation experiments with thrombin and trypsin, we could demonstrate that cells first stimulated with thrombin showed a second maximal response on subsequent stimulation with trypsin. the same effect was measured with trypsin as first stimulus and thrombin as second stimulus. trypsin and thrombin induced a release of von willebrand factor (2-5 fold in comparison to unstimulated cells). vwf release depended on the concentration of trypsin in a manner similar to thrombin. an electrophoretic analysis of the released von willebrand factor showed a different multimeric composition of vwf after trypsin and thrombin stimulation. these results indicate that there might be a protease receptor for trypsin on huvec that is different from the thrombin receptor.

clinical and laboratory findings of coagulopathy were investigated by a 1-year survey of 320 children's hospitals. 291 meningococcal infections were evaluable. severe disease (characterized by need for mechanical ventilation, dialysis and/or catecholamines) was seen in 42 of these children; 29 of those survived and 13 died. clinical signs of severe coagulopathy were seen in 83 children: ecchymoses (n = 73) and skin necrosis (n = 36) were associated with increased mortality (16% and 20%, resp., compared to 4.5% overall mortality).
five of 29 surviving children with skin necroses required surgical interventions (skin transplantation and/or amputations). petechiae were frequent (n = 156) and, as an isolated finding, not related to severe disease or fatal outcome (6% mortality). platelet counts at admission were lower in non-survivors (10th-90th percentile: 30-450.000/µl, median: 139.000/µl) than in survivors (10th-90th percentile: 140-480.000/µl, median: 242.000/µl). at iii values showed no difference between survivors and non-survivors. protein c values were available in only a few patients (n = 14): in this subgroup, protein c was lowered in patients with limited disease (10th-90th percentile: 20-105%, median: 48%) as well as severe disease (10th-90th percentile: 30-75%, median: 60%). in conclusion, the findings "ecchymoses" and "skin necroses" were related to fatal outcome and were therefore included in a prognostic score for severity of meningococcal disease.

the influence of irradiation on pai-1 and vwf levels in human umbilical vein endothelial cell cultures k. fragiadaki, c. salat, r. pihusch, b. reinhardt, m penovici, e. hiller med. klinik iii, klinikum grosshadern der ludwig-maximilians-universität münchen

an elevation of pai-1 in bone marrow transplant recipients developing veno-occlusive disease (vod) of the liver has been described earlier. endothelial cell damage due to the preparative myeloablative radiochemotherapy is supposed to be an important step in the pathogenesis of the disease, which is characterized by an obstruction of small intrahepatic venules. in order to investigate a possible role of irradiation we studied the influence of several doses (0, 5, 15, 30 gy) on pai-1 and vwf levels in the supernatant of human umbilical vein endothelial cell cultures (huvec). pai-1 antigen and vwf were determined by enzyme immunoassays.
whereas pai-1 and vwf levels remained unchanged after irradiation with 5 gy and in control cultures, a rise was observed one day after irradiation with 15 gy (mean, day 0 → day +1) in pai-1 (100.0% → 171.2%) and vwf (100% → 159.7%) levels. the increase was more pronounced and reached statistical significance after a dose of 30 gy (pai-1 100% → 278.7% and vwf 100% → 168%). both pai-1 and vwf levels decreased on day 2 after irradiation with 15 and 30 gy. our results indicate that irradiation induces an increase of pai-1 and vwf in endothelial cells. nevertheless, this effect was observed only at doses above those used during conditioning, when patients receive 3 x 4 gy. additional factors seem to be of significance. cytokines like tnfα enhance pai-1 and vwf in endothelial cell cultures and are known to be elevated in bmt-associated complications. it can be speculated that irradiation in concert with these factors may contribute to the development of veno-occlusive disease.

disseminated intravascular coagulation is characterized by high consumption of coagulation factors, systemic elevation of fibrinolysis by tpa and concomitant elevation of pai-1 secreted from inflamed endothelial cells. in an attempt to investigate the contribution of inflammatory cytokines, endothelial cell lines of microvascular origin were stimulated in vitro and pai-1 antigen was measured 2h, 4h and 24h after stimulation. in contrast to results published from experiments performed with macrovascular human umbilical vein cells (huve), our results obtained with 3 different microvascular endothelia isolated from skin, solid tumor tissue and bone marrow revealed that inflammatory cytokines reduced pai-1 antigen levels.
in addition to tnf-α (25 ng/ml) and lps (10 pg/ml), we found that il-10 (100 u/ml) and gm-csf (100 u/ml) also reduced pai-1 levels within the first 2h of incubation (from 120 ng/ml to 80-110 ng/ml), and the effect was even more pronounced after 4h and 24h (from 380 ng/ml to 250 ng/ml). il-1 (10 u/ml) and lps (10 pg/l) also reduced constitutive levels of pai-1 but the effect occurred later than 4h after addition of the stimulator. the strongest synergistic effect was demonstrated with gm-csf plus il-1, resulting in pai-1 suppression of 50% after 2h and 30% after 24h. in contrast, g-csf (300 u/ml) induced an immediate upregulation of pai-1 antigen (120 to 140 ng/ml after 2h and 380 to 420 ng/ml after 24h). stimulation of pai-1 levels was also observed with tgf-β (10 pg/ml), however not earlier than 18h of incubation. interestingly, both stimulatory cytokines, i.e. g-csf and tgf-β, alone were able to counteract the decrease of pai-1 antigen by tnf-α, but only a combination of g-csf plus tgf-β neutralized the effect of il-1. the results indicate that inflammatory cytokines regulate pai-1 in a synergistic and antagonistic fashion.

we established the culture of human brain microvascular endothelial cells (hbmec) in order to investigate the pathophysiology of human cerebral malaria, which is still associated with a high mortality rate. it is widely accepted that among the reasons for the fatal outcome of cerebral malaria, the interaction of endothelial cells with cytokines and parasites with subsequent changes in haemostaseological parameters is involved. the human microvascular endothelium may therefore play a decisive role in the pathophysiology of cerebral malaria. erythrocytes containing later stages of p. falciparum specifically bind to capillary ec in vivo (sequestration). tnf-α, il-1 and il-6 are considerably elevated in severe malaria.
coagulation factors such as tissue factor and von willebrand factor are affected by malaria, suggesting the involvement of the hbmec in cerebral malaria. so far, research on the involvement of the hbmec has been performed on ec cultured from human umbilical veins (huvec). the relevance of this model may be questioned on the grounds that the capillary endothelium probably plays a greater role than the endothelium of the large vessels. besides, some properties of the endothelium seem to vary depending on the organ of origin. for these reasons, our laboratory has established the hbmec as a model to study the pathophysiology of human cerebral malaria. to demonstrate the relevance of this model in the context of malaria, hbmec were challenged with sera from different patients with severe p. falciparum malaria and with serum from a healthy donor. we can demonstrate that in cells challenged with malaria patient sera icam-1 and substance p were upregulated. on the other hand, cells challenged with serum from a healthy donor expressed neither icam-1 nor substance p. these results strongly suggest the relevance of this model for vessel involvement in malaria.

both histamine and serotonin have been described as potent stimulators of von willebrand factor (vwf) release from human umbilical vein endothelial cells (huvec). we performed experiments to differentiate the receptors for histamine- and serotonin-induced vwf release. quite unexpectedly, we did not find any significant vwf release after the addition of serotonin to huvec or human artery endothelial cells (huaec) in concentrations from 0.1 µM to 50 µM. in the case of histamine (0.1 µM - 50 µM) we measured a 2-5 fold vwf release compared to unstimulated cells. this release was of the same order of magnitude as the release induced with 1 iu thrombin. to verify these results we measured the effect of histamine and serotonin on the intracellular ca2+ concentration ([ca2+]i) in huvec and huaec.
cells were labelled with fura-2 and the change in fluorescence after agonist addition was measured with a microscope fluorometer. using the same agonist concentrations as above we found a 5-10 fold increase of [ca2+]i with histamine or thrombin but no effect on addition of serotonin. these results indicate a similar activation of human endothelial cells by histamine and thrombin and that serotonin does not stimulate endothelial vwf release or an increase of [ca2+]i.

activation and/or dysfunction of the endothelium can be triggered by cytokines (e.g. interleukin-2, tumor necrosis factor-alpha) or bacterial substances (e.g. endotoxins) and may contribute to shock and multi organ failure. pai-1 and tm were assessed as parameters of activated endothelium following bsct at three to four day intervals from start of conditioning therapy through day +35 (days after stem cell transplantation). data were compared to the occurrence of sepsis, veno-occlusive disease (vod), capillary leakage syndrome (cls) and graft-versus-host disease (gvhd). patients with none of these complications served as controls. pai-1 and tm were increased in all patients with sepsis, cls, vod and/or gvhd. pai-1 peaked at days 14 to 18 and the increase was highest in sepsis and lowest in cls. the increase in tm values was somewhat delayed (day +24) and was highest in vod and cls and lowest in gvhd. pai-1 and tm are sensitive markers of endothelial activation in sepsis, vod, cls, and/or gvhd, but they do not allow a differentiation between these complications.

endothelin (et) is the most potent vasoconstrictor. it is known that et plasma concentration is correlated with a poor prognosis in patients with non-ischemic cardiomyopathy (cm). the contribution of the heart to the production of et is still unknown. to investigate the pathogenetic mechanism in patients without coronary artery disease (cad), we examined 13 patients with hypertension ( . pulmonary capillary wedge pressure (pcwp) was measured in all patients.
et and its precursor big-endothelin (bet) were determined at rest and after pharmacological stimulation with dipyridamole (0.5 mg/kg body weight), which increases coronary blood flow by a factor of 2-4 via a non-endothelial pathway. cardiac coronary et and bet concentrations were determined from arterial blood samples, obtained from the aorta, and simultaneously from the coronary sinus (venous blood). blood samples were collected into ice-chilled vacutainer tubes and stored after centrifugation at -70 °c. et and bet were analysed by radioimmunoassay (immundiagnostik) after extraction on a sep-pak c18 cartridge. it is concluded that et is increased with elevated filling pressures of the heart in patients with cm. it is not produced in considerable quantity by the heart, either at rest or at increased blood flow. therefore the lung has to be considered as the major organ for the production of et and bet in patients without cad.

to characterize the incompatibility of blood with foreign surfaces, valid in vitro methods are necessary, especially for testing platelet function. it seems effective to use test systems which can also be helpful later on in the clinic when foreign surfaces (e.g. venous catheters) are used and evaluated in so-called phase-4 studies. we studied the influence of 21 reference polymers under standardized and controlled flow conditions on platelets in citrated blood specimens of healthy blood donors. the following tests were performed pre and post platelet-polymer contact: decrease of platelet count, platelet aggregation (wu-grotemeyer index), and analysis of platelet spreading capacity on standardized plastic surfaces, using a visual microscopic evaluation according to breddin and bürck (1963) and an interactive computer-aided system (ibas, kontron gmbh, münchen, frg) that digitizes the morphological picture of the platelet slides and performs area detection with a resolution of 512x512 pixels.
results: platelet counts showed significant differences pre and post polymer contact. the wu-grotemeyer index demonstrated platelet activation only after blood contact with large volumes of polymeric material, whereas both visual and computer-assisted evaluation of platelet spreading ability revealed a marked shift between the different classes of platelets: platelet activation results in a decrease of large structural elements and an increase of elements with spider threads (pre contact (n=1000): 27±6 large forms of platelets, 700±39 small forms and 275±41 spider forms; post contact (n=1000): 6±5 large forms, 510±56 small forms and 484±58 platelets with spider threads). in some series there were significant differences between visual and computer-aided evaluation in the detection of small and spider forms. however, the relative increase of these nonspread spider forms could be established with both methods (wilcoxon test). we therefore conclude that platelet morphometry with both methods is a sensitive and reliable ex vivo method to evaluate platelet interactions with artificial surfaces and can also be used later on in phase-4 studies in patients. however, the ibas system requires further improvement in hardware and software to reduce the high expenditure of this method.

despite largely standardised methods such as hypothermia and cardioplegia, the perioperative myocardial infarction rate is still high at approx. 6%. in cardiovascular surgery it is well known that various cardioplegic solutions are employed for myocardial protection during the ischemic phase. in order to evaluate the possible influence of these solutions we selected two of the most commonly used cardioplegic solutions for investigation in a randomised double-blind study: htk (group 1) and st. thomas (group 2). after randomisation each group consisted of 20 patients who had to undergo aortocoronary bypass surgery.
the aim of the investigation was to identify differences in cellular changes during the reperfusion phase or in the early operative phase so that supportive clinical measures can be applied more effectively. in the context of this study the classical enzyme-diagnostic methods (ck, ck-mb and ldh) proved useful, though not fully convincing. we have meanwhile been able to show that cardiac muscle troponin t is a particularly sensitive parameter for differentiated ischemic damage to the myocardium. this we were able to confirm in extensive preliminary trials. cardiac troponin t was measured with a one-step immunoassay using two highly specific monoclonal antibodies directed against two different epitopes of cardiac troponin t. simultaneously the corresponding pre- and postoperative ecg was recorded. further, within this context we investigated parameters that indicate cellular damage, such as platelet factor 4 (pf4), t-pa, interleukin-6 and pmn-elastase. in the reperfusion phase there is a significant rise in troponin t in group 2, while in group 1 these values remain practically unchanged up to the 1st postoperative day. of special importance is interleukin-6, since according to most recent studies the release of this substance leads to platelet activation via the arachidonic acid metabolism. this pathway must, further, be regarded within the context of free radical formation. on the 1st postoperative day the il-6 values in group 2 are significantly higher. the effects of membrane damage, as reflected by pf4 and pmn-elastase, also differed between the two groups. on the basis of this study we arrive at the conclusion that htk cardioplegia is substantially less damaging than the st. thomas solution.

(2) r. hetzer (2) (1) department of hematology and oncology, virchow klinikum, humboldt university, berlin, germany (2)

we investigated the influence of two different vad systems on these hemostatic changes.
vads were implanted in 18 patients [11 bi-vad (berlin heart), 7 left vad (novacor n 100)] with end-stage heart disease who were awaiting heart transplantation. the following hemostatic parameters were measured during the first 51 days of bridging or until heart transplantation: thrombin-antithrombin iii (tat) complexes, prekallikrein, factor (f) xii, plasminogen, α2-antiplasmin, and β-thromboglobulin. results: during the first week of bridging, significantly higher tat levels were observed in novacor patients compared to berlin heart patients. prekallikrein activity levels were significantly lower in the berlin heart patients in the early bridging period. all other parameters were comparable in both groups throughout the entire observation period. differences in hemostatic parameters became apparent only in the early bridging period, with more enhanced prothrombin activation in the novacor group and more prominent contact activation in the berlin heart group.

avoidance of the transmission of viral infections and savings in the use of blood products have encouraged the use of device-based intraoperative autotransfusion techniques. patients and methods: after randomization, device-based intraoperative autotransfusion was performed in 5x7 patients during elective hip surgery using haemonetics cell saver iii, haemonetics cell saver v, electromedics elmd, haemolite 3 and fresenius continuous autotransfusion system (cats). at defined times we determined a lab panel (clinical chemistry, lipids, proteolytic capacity, hemolysis, coagulation panel) at 9 determination points in the reservoir, the retransfused blood and in the patient. results: no significant differences concerning proteolytic capacity, prothrombin time, platelets, lipids, electrolytes. increased hemolysis (p<0.01) in the hcs iii group vs. the other groups (10 min after application of the retransfused blood). low heparin concentrations of retransfused blood in the hcs iii group (0.32±0.3 u/ml) vs.
high concentrations in the cats group (0.47±0.3; p=0.01). parameters of thrombin generation were elevated in the hcs iii group vs. the other groups (p=0.02). conclusions: the use of 5 different device-based autotransfusion systems during elective hip surgery results in disturbances of hemocompatibility. the activation of the coagulation system during collection and filtering is partly influenced by the elimination kinetics and the dose regime of heparin. however, intraoperative autotransfusion must be managed very carefully and possible adverse effects of perioperative heparin peak levels have to be considered.

little information is available on the management of patients with factor viii deficiency who require cardiac surgery. we report the case of a 54 year old man with factor viii deficiency and combined severe aortic stenosis and incompetence and mitral incompetence who underwent a double valve replacement at our institution. he had a history of several bleeding episodes following minor surgery. previous factor viii levels were between 8 and 26%. using standard cardiopulmonary bypass, a double valve replacement with a 23 and 29 mm bileaflet prosthesis in aortic and mitral position, respectively, was performed. a high dose aprotinin regime was used (5.5 x 10^6 iu). three doses of factor viii concentrate were given in the perioperative period, totalling 7000 iu until the 1st postoperative day. repeated measurements of the factor viii level were performed. the postoperative chest tube drainage was 350 ml. until the 4th postoperative day an additional dose of 3000 iu of factor viii was given to maintain a level of at least 30%. the obligatory anticoagulation was achieved initially with heparin i.v. in therapeutic dosage. due to a persistent 3rd degree av block a permanent pacemaker was inserted with an additional 2000 iu of factor viii. on the 17th postoperative day warfarin was commenced aiming for an inr of 3.0-3.5. the patient was discharged home thereafter.
he was trained to monitor his inr with a coaguchek device. no bleeding episode occurred during the first 3 months of follow up. open heart surgery can be performed safely in patients with factor viii deficiency with the use of factor viii concentrates and monitoring of factor viii levels.

coating of biomaterials was developed using synthetic polymers with incorporated anticoagulants. stents were coated with a thin layer consisting of a polylactide polymer containing peg-hirudin and a stable prostacyclin analogue. these materials were tested with a "human shunt model" using non-anticoagulated blood of healthy volunteers. within minutes uncoated stents were covered by fibrin and aggregated platelets, which could be seen macroscopically and by scanning electron microscopy; coated stents were free from coagulation plugs. these observations were supported by analysis of coagulation activation markers. unlike coated stents, uncoated stents revealed high levels (>detection limit) of tat complexes and prothrombin fragments (f1-2). in a series of experiments stents were tested in sheep. in 16 sheep, stents (coated/uncoated palmaz-schatz stents) were placed by conventional techniques in the left anterior descending artery. anticoagulant therapy consisted of a heparin bolus and intravenously given aspirin before stent implantation. no anticoagulation was given thereafter. existing data show hyperplasia in the area of uncoated stents which was reduced around coated stents (this study will be finished in january 1996). this coating technique with incorporated anticoagulants reduces thrombogenicity during the early and late phase of biomaterial implantation. studies concerning catheters, vascular prostheses and oxygenators are in progress.

mechanical circulatory support (mcs) is a therapy for patients (pts) with end-stage cardiac insufficiency. during mcs, thromboembolic events due to the surface thrombogenicity of the implanted device are feared complications.
activated blood platelets play a major role in this context. therefore, patients' platelet morphology was investigated. during the period of mcs, using the novacor left ventricular assist system n100, blood samples of 8 pts were observed by means of scanning electron microscopy (sem). blood was collected preoperatively and after implantation daily during the first week as well as weekly for the first 3 months. samples were drawn via an 18-gauge cannula into cacodylic acid-buffered glutaraldehyde and platelets were prepared for morphological investigations. platelet alterations were classified as non-activated, activated and aggregated, based on "shape change" morphology. additionally, the common blood coagulation parameters were evaluated. preoperatively, 15.0 ± 4.6% of activated platelets were found. within the first postoperative week, the mean level of activated platelets rose to 32.8 ± 8.0% (p<0.05). comparing short-term (<30 days) vs. long-term (>30 days) mcs, a significant difference in activated platelets (overall mean values) could be seen (24.3 ± 3.3% vs. 34.8 ± 3.4%, p=0.004). during mcs, correlations were observed between hemolysis and platelet aggregates, as well as between the values of activated clotting time and activated platelets. also, specific platelet deformations and damage appeared during mcs which could not be found preoperatively. all pts with mcs showed alterations of their platelet morphology induced by the activation of the implanted synthetic material. with regard to the postoperative antithrombotic therapy, these observations should be taken into consideration.

during extracorporeal circulation (ecc) the blood and its components are exposed to artificial surfaces and inflammatory responses are activated, especially the complement, coagulation, fibrinolytic and kallikrein systems. furthermore, leukocyte activation occurs and platelet function is impaired.
these humoral and cellular systemic responses are known as the "postperfusion syndrome", with clinical symptoms like leukocytosis, increased capillary permeability, accumulation of interstitial fluid and organ dysfunction. the importance and perhaps even the existence of the damaging effects of cpb have been widely debated in the literature over the past 30 years. many efforts have been made to reduce traumatizing factors, e.g. the use of membrane instead of bubble oxygenators. recently, heparin-coated equipment and tubing have been proposed to avoid excessive contact activation during cpb. the present study was designed to assess changes in coagulation and fibrinolytic activity in 20 patients undergoing cpb. in this regard we investigated coagulation parameters like fibrinogen, antithrombin, prothrombin fragments f1+2, thrombin-antithrombin complex, tissue factor and fibrin monomers, and parameters of the fibrinolytic system like tissue plasminogen activator, plasmin-antiplasmin complex, d-dimers and plasminogen activator inhibitor, before, during and after cpb. the activation of the complement cascade was followed by measuring the concentration of c5a, c4 and c3c. the results demonstrate distinct alterations in the above-mentioned parameters. in spite of high dose heparinization (act>450s) combined with an antifibrinolytic treatment, an activation of the coagulation system was observed immediately after the onset of cpb, followed by an activation of the fibrinolytic system. therefore further efforts should be made to develop new anticoagulatory regimens and improve the biocompatibility of materials used for cpb.

during cardiopulmonary bypass blood is exposed to nonphysiologic conditions.
the contact with artificial surfaces and mechanical stress result in a perioperative response which includes activation of the complement, coagulation, fibrinolytic and kallikrein systems, activation of neutrophils with degranulation and protease enzyme release, oxygen radical production and the synthesis of various proinflammatory cytokines. this so-called "post-pump inflammatory response" has been linked to respiratory distress syndrome, renal failure and neurologic injury. our goal was to investigate the time course of cytokine levels and the activation of leukocytes and platelets and to quantitate leukocyte subpopulations in 20 patients undergoing cpb. at different time points, pre, during and post cpb, we determined the levels of interleukin (il) 1β, il-2, il-4, il-6, il-8, il-10, tumor necrosis factor α (tnf-α) and interferon γ (ifn-γ) using elisa techniques. lymphocyte subpopulations were characterized by flow cytometry and specific monoclonal antibodies against cd3 (pan t-cell marker), cd4 (surface antigen on t-helper cells) and cd19 (surface antigen on b-cells); monocytes were determined by cd14 and platelets by cd41 (act. gpiib/iiia) and cd42b (gp ib). single cell activation was analyzed using markers against cd25 (il-2 receptor), cd126 (il-6 receptor), hla-dr (mhc class ii), cd71 (transferrin receptor) and cd69 (activation inducer molecule); platelet activation was monitored with an antibody against cd62 (gmp-140). preliminary results revealed distinct increases in il-6, il-8, and il-10 following cpb, whereas tnf-α and ifn-γ levels were not significantly influenced. furthermore, activation of particular cell populations was observed. finally, our investigations should contribute to a better understanding of the complex humoral and cellular responses induced by cpb and thus might help to develop new strategies to circumvent the negative impacts of cpb.
optimal adjustment of anticoagulation in machine plasmapheresis is important for the quality of the prepared fresh frozen plasma (ffp) as well as for the safety of the donation. in the present study the suitability of prothrombin fragment (f1+2) in the assessment of anticoagulation during plasmapheresis was investigated. material and methods: 75 plasmapheresis procedures were performed on 25 donors (10 female, 15 male) using 3 different plasmapheresis machines (a 200, baxter; mcs 3p, haemonetics; pph 900, electromedics/medtronic). acid citrate dextrose formula a (acd-a) in a ratio to whole blood of 8 : 92 was used for anticoagulation. the concentration of f1+2 in the donor's blood was measured before and after plasmapheresis and in the prepared ffp. the actual acd-a volume used was also recorded. results: there was a significant rise of the f1+2 concentration in the donor's blood after plasmapheresis with each of the three machines: a 200: 1.32 vs 1.14, p < 0.05; mcs 3p: 1.26 vs 0.98, p < 0.05; pph 900: 1.20 vs 1.05, p < 0.05. the ffp prepared with each machine showed the following f1+2 concentrations: 0.91 ± 0.18, 1.02 ± 0.17 and 0.93 ± 0.11, respectively. the difference between the groups was not significant. the elevation of the f1+2 concentration in the donor's blood showed a negative correlation with the volume of acd-a used. during 6 of the 75 procedures technical problems occurred (inadequate venous access, occlusion of the citrate tube, reduced whole blood flow). after these procedures there was a marked elevation of f1+2 in the donor's blood (2.74 ± 0.53), accompanied by an elevated f1+2 concentration in the prepared ffp. conclusion: these data show that f1+2 is a suitable parameter for the assessment of anticoagulation during plasmapheresis.

several epidemiologic studies demonstrated that fibrinogen is an independent cardiovascular risk factor and should be considered for screening programs.
prothrombin time derived fibrinogen (df) measurement combines the advantages of an established, highly reproducible automated method with no need for additional reagents, except for calibration. several studies showed that the df values correspond well with the clauss method except in cases such as thrombolytic therapy, in which the df results are higher. however, no data exist on whether, in patients with coronary heart disease in whom fibrinogen is a risk factor, the df values are also comparable to those of the established clauss method. the aim of our study was to compare df values to clauss method results in cardiac patients, especially in patients before and after coronary bypass grafting (cabg). measurements of df were performed on an acl 3000 (il) using the pt-fibrinogen-hs reagent. the fibrinogen clauss method was done on the acl using fibrinogen c reagent (il) and on a kc4 (amelung) with fibrinogen a reagent (boehringer mannheim). for calibration we used the calibration plasma half volume (il) with the fibrinogen concentration proposed by the manufacturer. plasma samples were obtained from 24 patients at admission before cabg and postoperatively up to 1 week, and from 23 healthy persons (staff). within-assay imprecisions using normal and abnormal controls (il) were comparable with both methods, showing cvs between 1.99 and 4.22%. in normal healthy persons the medians of the df and the clauss method run on the acl were very similar (296 vs 302 mg/dl), whereas kc4 values were about 10% lower (268 mg/dl). in cabg patients at admission we found the same differences as in normals with the clauss method (acl: 363 vs kc4: 337 mg/dl); however, the df values were significantly higher (median 418 mg/dl). if we took a cutoff value of 320 mg/dl, as suggested by the results from the northwick park heart study, we would categorize into the high risk group 21 out of 24 patients using the df method, 20 with the clauss-acl method and 16 with the clauss-kc4 method, i.e.
nearly 30% more patients were classified in the high risk group using the df method. postoperative samples showed the expected increases due to the acute phase response with the same magnitude of differences. because of its rapidity and reproducibility the df method is well suited for routine measurements; however, standardization remains an urgent task in order to avoid misinterpretation of results. for fibrinogen measurements in clinical laboratories, the two most widely used methods are the clotting time method according to clauss (cfib) and the so-called "derived" fibrinogen method (dfib) implemented in optical coagulometers, with the fibrinogen concentration being derived from the optical density of the fibrin clot in a standard prothrombin time (pt) assay. it is well known that under certain circumstances, e.g. in the presence of fibrin(ogen) degradation products (fdp), there is a discrepancy between the two methods with higher values for dfib than for cfib. yet the opposite discrepancy, i.e. fibrinogen values derived from the optical density of the clot grossly lower than values from clotting time assays, seems to be very rare and is poorly understood so far. the patient (male, 26 years) had ingested the esterase inhibitor parathion (e605) in a suicide attempt and was treated with high doses of atropine. he had no clinical signs or history or family history of bleeding or thrombotic disorders. except for a very low pseudocholinesterase activity, all laboratory results were normal, including pt, aptt, thrombin time, and factor xiii. pt and aptt did not differ between an optical coagulometer (electra 1000c, mla) and a mechanical one (kc4, amelung). there was no evidence of disorders known to interfere with hemostasis, such as paraproteinemia or dyslipidemia.
however, in all 7 blood samples received for clotting tests during a period of 7 days the macroscopic appearance of the fibrin clot was quite unusual (only slightly turbid/almost transparent) and there was a striking discrepancy between a very low or low dfib on the electra (pt reagent: thromboplastin is, dade) and a normal or high cfib (kc4; thrombin reagent, dade). on admission, values were 57 mg/dl (derived) vs. 275 mg/dl (clauss). cfib rose to s42 mg/dl with dfib at 155 mg/dl in the last sample on day 7. in all samples dfib was about 20 % (15-23) of cfib. when the patient's plasma was added to normal pooled plasma it caused, in a dose-dependent manner, values lower than predicted for dfib and values slightly higher than predicted for cfib. in the absence of data from additional (e.g. immunologic) methods the following principal possibilities (and combinations) have to be considered: 1) normal fibrinogen concentration and clot formation rate, but abnormal optical properties of the clot (cfib correct, dfib falsely low); 2) normal optical properties of the clot, but accelerated clot formation and very low fibrinogen concentration (dfib correct, cfib falsely high). in either case, the molecular basis could be: a) a genetic or acquired molecular abnormality of fibrin/fibrinogen; b) an interfering substance. direct effects of the toxic agent parathion and/or the antidote atropine are not likely to be the cause, since other patients, often with more severe parathion intoxication requiring higher doses of atropine, showed normal optical density of the clot. we hope to perform a more in-depth investigation of this abnormality in the future, including various methods, reagents, and instruments for fibrinogen measurement, a survey of the patient's family, and studies of the molecular nature of the phenomenon. increased fibrinogen is known to be an independent predictor of subsequent acute coronary syndromes. however,
a multitude of methods for fibrinogen determination is available. there is a lack of standardisation among fibrinogen assays. in a family cohort study (patients with combined hyperlipidaemia and/or hyperuricaemia) fibrinogen was determined in plasma samples from 340 family members using a functional and an immunochemical assay. the functional assay according to clauss was performed on the analyser ca 5000 using the test fibrinogen a from boehringer. the immunonephelometric assay was performed on the behring nephelometer system using the reagent and standard from behring. good agreement between both assays was obtained at low and high fibrinogen levels as well as in samples with increased c-reactive protein (crp). values obtained by both assays correlated similarly with total cholesterol, ldl-cholesterol and apolipoprotein b. the ratio functional fibrinogen / immunochemical fibrinogen showed no dependence on cholesterol, t-pa, von willebrand factor and crp. release of two fibrinopeptides a from fibrinogen generates desaa-fibrin monomer, which rapidly aggregates, forming fibrin complexes. fibrin monomers can be detected in plasma samples after chemical disaggregation of fibrin complexes using thiocyanate, by monoclonal antibody binding to the alpha-chain neo-n-termini generated by fibrinopeptide release. although postulated, an intermediate of fibrin formation carrying one fibrinopeptide a and one fibrin alpha-chain neo-n-terminus has so far escaped analytical procedures. we have employed a monoclonal antibody specific for the fibrin alpha-chain neo-n-terminus, mab 2b5, attached to magnetic microparticles, for isolation of fibrin-related material from plasma samples of patients with elevated soluble fibrin. the material was desorbed by sds-urea buffer and subjected to sds-page and immunoblotting.
immunostaining with panspecific anti-fibrinogen and anti-fdp-e antisera showed a range of bands corresponding to fibrin monomers and fibrin derivatives containing the fibrin e-domain. immunostaining with monoclonal anti-fibrinopeptide a antibody resulted in a doublet band corresponding in size to fibrin monomer. similar results were obtained with polyclonal antisera against fibrinopeptide a. for a more quantitative approach, desa-fibrin monomer was detected by an elisa procedure using mab 2b5 as capture and monoclonal anti-fibrinopeptide a antibody as tag. a sample with an extremely high level of desaa-fibrin monomer, determined by elisa (enzymun®-test fm), was used for calibration, since reference material is not available. a correlation of r=0.94 was found between desaa-fibrin monomer and relative desa-fibrin monomer levels. detection of desa-fibrin monomer required sample pretreatment with thiocyanate for disaggregation of fibrin complexes. from these preliminary data it appears that desa-fibrin monomer accounts for a fairly constant proportion of soluble fibrin and is a polymerizing species. fibrinogen has been shown to be a major cardiovascular risk factor. especially for epidemiological studies, exact quantitation of fibrinogen in clinical plasma samples is of great importance. fibrinogen levels are generally measured by clotting assay according to clauss, or by determination of derived fibrinogen values upon photometric measurement of prothrombin time (derfbg). the clotting assay has been shown to be influenced by high levels of soluble fibrin derivatives. the pt-derived fibrinogen levels appear rather convenient in clinical routine, since no additional reagents are needed. we have compared the clauss assay and derfbg with a turbidimetric fibrinogen assay using snake venom protease for fibrinopeptide release, performed in photometric autoanalyzers. d-dimer antigen was measured in parallel using tinaquant d-dimer lpia.
results were correlated with total fibrinopeptide a release by thrombin, measured by elisa. a total of 484 samples were included, of which 29 samples (6 %) were recorded as above the measuring range by derfbg. these samples encompassed a range of 5.90-10.40 g/l and 5.21-12.37 g/l in the clauss and turbidimetric assay, respectively. the range of values measured by the derfbg assay was 0.72-9.14 g/l, corresponding to 0.26-11.00 g/l and 0.24-10.48 g/l in the clauss and turbidimetric assay, respectively. the correlation of derfbg with the clauss assay was r=0.91, and the correlation with the turbidimetric assay was r=0.92, for the values actually detected. the correlation between clauss and turbidimetric assay was r=0.93 for all values. there was no dependency of test results or inter-test variation upon d-dimer. correlation graphs displayed a decreased test response of the clauss assay in the high concentration range, resulting in an underestimation of fibrinogen concentration. the derfbg assay, in contrast, showed normal range values in samples from patients with fibrinolytic treatment and low fibrinogen levels in the other assays. correlation with fibrinopeptide a release was r=0.88 for the clauss assay, r=0.89 for the turbidimetric assay, and r=0.82 for derfbg. for clinical routine, derfbg appears to be applicable for all samples between 1.00 and 5.00 g/l, with exclusion of samples from patients with fibrinolytic treatment or endogenous hyperfibrinolysis. other samples may be analyzed by clotting assay or turbidimetric assay, although the latter appears to be better suited for measurement of high range samples. for inhibition of pk is 0.067 pmol/l the antifibrinolytic activity of the inhibitors was determined by measuring the lysis of radiolabelled human plasma clots. the compounds which inhibit plasmin and pk markedly influence the streptokinase-induced clot lysis but not lysis induced by uk and tpa. surprisingly, inhibitors of uk and tpa do not influence clot lysis induced by uk or tpa.
the structure-activity relationships for inhibition of plasmin, uk, tpa and pk could help in the design of more potent inhibitors of fibrinolytic enzymes. uk inhibitors are of interest for the development of anti-invasiveness drugs, while plasmin/pk inhibitors could be prototypes of a "synthetic aprotinin". in the ecat angina pectoris study t-pa antigen was an independent risk factor of subsequent acute coronary syndromes. pai indicates the risk but depends on other known risk factors. the aim was to test, in 183 members of a family cohort study (patients with combined hyperlipidaemia and/or hyperuricaemia), whether the active pai antigen or the whole pai antigen showed a stronger relation to t-pa and metabolic variables. the active pai-1 antigen was determined using the elisa actibind pai-1 (technoclone / immuno), and the whole pai-1 antigen was measured using the elisa pai-1 (technoclone / immuno). t-pa activity was determined with the coaset t-pa from chromogenix; the tintelize tpa from biopool was used for determination of t-pa antigen. the active pai antigen showed a stronger correlation to t-pa activity and t-pa antigen than the whole pai antigen. circulating t-pa activity was influenced predominantly by the active pai antigen. both pai antigens were correlated in a similar manner with metabolic variables, lipoproteins and bmi.
table: correlations of active and whole pai antigen (** p < 0.001)
variable              active pai antigen    whole pai antigen
active pai antigen     1.000                 0.851 **
whole pai antigen      0.851 **              1.000
t-pa activity         -0.594 **             -0.492 **
t-pa antigen           0.604 **              0.497 **
body mass index        0.502 **              0.462 **
triglycerides          0.452 **              0.441 **
total cholesterol      0.252 **              0.255 **
ldl-cholesterol        0.263 **              0.264 **
hdl-cholesterol       -0.357 **             -0.355 **
apolipoprotein b       0.428 **              0.402 **
apolipoprotein a i    -0.233 **             -0.211 **
the weaker relationship of the whole pai antigen to t-pa is evidently caused by patient samples with high levels of whole pai antigen in contrast to normal values of active pai as well as of t-pa. possibly, a high ratio of whole pai antigen / active pai antigen is caused by a rise of latent pai, the main form of pai in the platelets. the clinical importance of an increased ratio of whole pai antigen / active pai antigen remains under investigation. the cyclic polypeptide antibiotics bacitracin a and bacilliquin from bacillus licheniformis and gramicidin s from bacillus brevis var. g. b. were used for investigation. we studied their influence on fibrinolytic and coagulation activity in vitro. methods. to a solution of human plasmin (thrombin) containing 0.2 mg of protein (1 nih unit)/ml, the test solution of antibiotics (0.1-8.0 mg) was added. then we determined the fibrinolytic activity of the mixtures using azofibrin lysis, and thrombin activity was determined from the rate of fibrin clot formation from fibrinogen solution. results. the following table presents the results obtained in our laboratory (we also give results on the influence of the antibiotics on urokinase activity): ki, mm. ki: the constant of inhibition. n. d.: within the limits studied, inhibitory activity was not observed. ---: inhibitory activity was not determined. i.: inhibitory activity was observed but ki was not determined. +, ++, +++: degree of inhibition (in relative units).
conclusion. our results point to the need for a cautious approach to the use of polypeptide antibiotics for various sorts of therapy, in view of their possible influence on the fibrinolytic and coagulation activity of the organism. these results were used for the preparation in our laboratory of biospecific sorbents containing c-ramicidin a, bacilliquin and gramicidin s as ligands; they can reversibly bind thrombin, plasmin (plasminogen) and urokinase directly from crude extracts. the enzymes are selectively eluted without substantial losses of specific activity in a yield of 60-90%. there is a large body of rather contradictory information on fibrinolysis in liver cirrhosis, which can be accelerated, normal or reduced, depending on the type of cirrhosis and the investigation technique (clot lysis, fibrinolytic component measurements). our previous finding was that in vitro plasma clot lysis induced by exogenously added tpa or streptokinase was reduced, and this correlated well with the severity of the disease and the elevation of plasma von willebrand factor levels. in vitro clot lysis tests induced by tpa were performed in 41 patients with alcoholic liver cirrhosis, utilising a microplate light-scattering assessment method. the tests were repeated using the same plasma samples from each patient with a microplate which was covered by a cultured endothelial cell monolayer (umbilical vein, huvec). clot lysis speed proved to be 1.5-2 times slower with the huvec milieu in the control group, while in the cirrhotic patients this inhibition was stronger and resulted in a 5-fold reduction of lysis speed. our results suggest that cirrhotic plasma is able to accelerate the release of fibrinolytic inhibitors from cultured endothelial cells, a phenomenon which may also contribute to the complex alterations of in vivo fibrinolysis in cirrhotic patients. deep vein thrombosis (dvt) is a systemic disease with prolonged clinical manifestation.
anticoagulation therapy in dvt is not completely effective. thrombolytic therapy may give rise to a systemic lytic state, and the fibrin-specific agents (scu-pa and t-pa) have short half-lives in the circulation. we investigated the potency of the acylated plasminogen streptokinase activator complex (gbpg-sk) for deep vein clot dissolution as compared to the well-known sk and apsac, both in vitro and in vivo in a model of venous thrombosis in an arterio-venous shunt in rats. the in vitro study showed that the fibrinolytic activity of plasminogen activators mainly depends on their stability in plasma. stability studies were carried out by incubating sk and pg-sk activator complexes in plasma, followed by euglobulin precipitation. total fibrinolytic activity was measured by the fibrin plate method. gbpg-sk possessed greater stability in human plasma than apsac or sk because of its prolonged inactivation period (the deacylation half-life for gbpg-sk was 230 ± 21 min in contrast with 73 ± 6 min for apsac). the stability of the two acylated thrombolytics (gbpg-sk and apsac) was inversely proportional to their first-order deacylation rate constants (2.9 × 10^-4 and 6.0 × 10^-5 s^-1, respectively). the fibrinolytic potency of sk, apsac and gbpg-sk was measured by 125i-labeled fibrin clot lysis in plasma and in vivo by lysis of a preformed 125i-labeled fibrin clot inserted into the jugular vein. the fibrinolytic activity of the acylated plasminogen activators gradually increased with time. under sk administration, clot lysis came to an end by 2 hours, while apsac and gbpg-sk did not lose their activity for 5-6 hours. gbpg-sk possessed significantly more prolonged fibrinolytic activity than apsac. the acyl-enzymes did not significantly influence plasminogen, α2-antiplasmin and fibrinogen levels in plasma, in accordance with their activity being specific to fibrin-bound plasminogen.
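the relation between a first-order deacylation half-life and its rate constant, k = ln 2 / t½, can be illustrated with a minimal sketch. it uses only the two half-lives quoted above (230 min and 73 min); the function names are hypothetical, and the constants it derives need not coincide with the rate constants quoted in the abstract, which may refer to different conditions.

```python
import math

# Half-lives quoted in the abstract (minutes); everything else here is illustrative.
HALF_LIVES_MIN = {"gbpg-sk": 230.0, "apsac": 73.0}


def rate_constant_from_half_life(half_life_min: float) -> float:
    """First-order rate constant in s^-1 for a half-life given in minutes."""
    return math.log(2) / (half_life_min * 60.0)


def fraction_remaining(k_per_s: float, t_min: float) -> float:
    """Fraction of acyl-enzyme still acylated after t_min minutes (first-order decay)."""
    return math.exp(-k_per_s * t_min * 60.0)


if __name__ == "__main__":
    for name, t_half in HALF_LIVES_MIN.items():
        k = rate_constant_from_half_life(t_half)
        left = fraction_remaining(k, 120.0)
        print(f"{name}: k = {k:.2e} s^-1, fraction acylated after 2 h = {left:.2f}")
```

under this first-order assumption a longer half-life always means a smaller rate constant, which is the inverse relation the abstract describes.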
in contrast, sk produced a significant depletion of plasminogen, α2-antiplasmin and fibrinogen levels in plasma. it seems, on the basis of in vitro and animal experiments, that apsac with its moderately fast deacylation rate is more suitable for a rapid thrombolytic effect, but gbpg-sk with its slow deacylation rate is suitable for deep vein thrombosis, when rapid thrombolysis is less critical. it is well known that complete lysis of thrombi is usually not observed during thrombolytic therapy. in the present study we attempted to quantify the possible mechanism of fibrinolysis inhibition during thrombolysis. 125i-labelled partially cross-linked fibrin clots of different volumes (0.1-0.35 ml) were immersed in tris-hcl buffer (3 ml) containing plasmin (5-100 nmol/l) at 37 °c. the lysis rate was determined by counting soluble fibrin degradation products (fdp). in all cases lysis slowed down and stopped within 3 h although the clots had dissolved only up to 60-85%. no irreversible inhibition of plasmin caused by denaturation occurred, as judged by measurement of fibrinolytic activity in diluted samples. however, the increase of fdp concentration in the surrounding buffer led to reversible inhibition of the fibrinolytic activity of plasmin down to 5% of baseline. sds-page analysis under non-reducing conditions showed the accumulation of high-molecular-weight fdp in the surrounding buffer. the inhibition phenomenon could be connected with the specific binding of plasmin to soluble fdp having exposed lysine residues and the subsequent removal of the enzyme from the fibrin surface. unexpectedly, given the heterogeneous character of the reactions, the change of the clot surface area during lysis did not affect the fibrinolysis kinetics over the whole concentration range. to estimate the kinetic parameters, the kinetic curves were linearized in the coordinates $[P]/t$ versus $(1/t)\ln\left([S]_0/([S]_0-[P])\right)$. the parameters obtained were as follows: kcat = 1.36 min^-1, km = 1.33 µmol/l, kp = 0.12 µmol/l.
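why progress curves become straight lines in these coordinates can be shown with a short sketch. it assumes a michaelis-menten rate law with competitive inhibition by the product, v = vmax·[S] / (km·(1 + [P]/kp) + [S]); this model is an assumption made here for illustration, since the abstract does not state its rate law. the values of vmax and s0 are hypothetical, and only km and kp are taken from the text.

```python
import math


def time_to_reach(p, s0, vmax, km, kp=None):
    """Time needed to form product p, from the integrated rate equation.

    Assumed model: Michaelis-Menten with optional competitive product inhibition.
    Vmax*t = (1 - Km/Kp)*P + Km*(1 + S0/Kp)*ln(S0/(S0 - P)); kp=None means no inhibition.
    """
    log_term = math.log(s0 / (s0 - p))
    if kp is None:
        return (p + km * log_term) / vmax
    return ((1.0 - km / kp) * p + km * (1.0 + s0 / kp) * log_term) / vmax


def linear_coordinates(p, t, s0):
    """Return (x, y) = ((1/t)*ln(S0/(S0-P)), P/t), the coordinates named in the abstract."""
    return math.log(s0 / (s0 - p)) / t, p / t


def fit_line(points):
    """Ordinary least-squares slope and intercept for a list of (x, y) pairs."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return slope, (sy - slope * sx) / n


def progress_curve_fit(s0, vmax, km, kp=None):
    """Simulate a progress curve and fit it in the linear coordinates."""
    products = [s0 * f for f in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)]
    points = [linear_coordinates(p, time_to_reach(p, s0, vmax, km, kp), s0) for p in products]
    return fit_line(points)


if __name__ == "__main__":
    # s0 and vmax are hypothetical; km and kp (in µmol/l) are the values from the abstract.
    print(progress_curve_fit(s0=5.0, vmax=1.0, km=1.33))
    print(progress_curve_fit(s0=5.0, vmax=1.0, km=1.33, kp=0.12))
```

without inhibition the fitted slope is -km and the intercept is vmax; with product inhibition the line stays straight, but slope and intercept become the apparent values -km(1 + s0/kp)/(1 - km/kp) and vmax/(1 - km/kp), so curves at several s0 are needed to separate km from kp.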
clinical trials have shown that fdp concentrations during thrombolytic therapy of deep venous thrombosis and acute myocardial infarction were usually in the range of approximately 0.05-0.2 µmol/l. therefore the described phenomenon of fibrinolysis inhibition by the fdp formed may take place during thrombolytic therapy. al. calatzis, an. calatzis, +m. klmg, +l. mielke, +r. hipp, a. stemberger institute for experimental surgery and +institute of anesthesiology technische universität münchen thrombelastography (teg) is an established method for the detection of fibrinolysis. fibrinolysis is usually diagnosed when the teg amplitude decreases by more than 15% after the maximum amplitude is reached. this takes a considerable amount of time (more than 30 minutes). our approach is based on the understanding of fibrinolysis as a process which runs in parallel to coagulation and is not exclusively subsidiary to it. the effect of fibrinolysis on the growing clot in the teg is shown by the comparison of two teg measurements performed in parallel: exteg: teg measurement with standardised activation of the extrinsic system. apteg: exteg with in vitro fibrinolysis inhibition via aprotinin. exteg reagent (ex): 1:2 dilution of innovin (recombinant thromboplastin reagent, dade) with aqua dest. apteg reagent (ap): 5 parts innovin, 2 parts trasylol (aprotinin, bayer, 10,000 kie/ml), 3 parts aqua dest. test procedure: 10 µl ex or ap + 300 µl citrated blood (cb) + 50 µl cacl2 solution 0.15 mol/l. the only difference between the two reagents is the addition of 20 kie aprotinin in the apteg, leading to in vitro fibrinolysis inhibition. the use of disposable pins and cups (haemoscope, illinois, usa/e.m.s., vienna) is recommended to ensure standardised conditions for both measurements. results and discussion: when there is better clot formation in the apteg (corresponding to a lower so-called k-value) than in the exteg, fibrinolysis can be suspected.
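the reagent arithmetic in the test procedure above can be checked with a few lines. the sketch assumes that "parts" are parts by volume and that the volumes mix additively; the variable and function names are hypothetical.

```python
# Quantities taken from the test procedure described above.
APROTININ_STOCK_KIE_PER_ML = 10_000.0  # trasylol stock
AP_PARTS = {"innovin": 5, "trasylol": 2, "aqua_dest": 3}  # apteg reagent, parts by volume (assumed)
REAGENT_UL = 10.0   # ex or ap reagent per test
BLOOD_UL = 300.0    # citrated blood per test
CACL2_UL = 50.0     # calcium chloride solution per test
CACL2_STOCK_MOL_PER_L = 0.15


def aprotinin_dose_kie(reagent_ul, parts, stock_kie_per_ml):
    """Aprotinin (KIE) delivered by one aliquot of apteg reagent."""
    trasylol_ul = reagent_ul * parts["trasylol"] / sum(parts.values())
    return trasylol_ul / 1000.0 * stock_kie_per_ml


def final_cacl2_mmol_per_l(cacl2_ul, stock_mol_per_l, total_ul):
    """Calcium chloride concentration in the cup, assuming additive volumes."""
    return cacl2_ul * stock_mol_per_l / total_ul * 1000.0


if __name__ == "__main__":
    total_ul = REAGENT_UL + BLOOD_UL + CACL2_UL
    dose = aprotinin_dose_kie(REAGENT_UL, AP_PARTS, APROTININ_STOCK_KIE_PER_ML)
    cacl2 = final_cacl2_mmol_per_l(CACL2_UL, CACL2_STOCK_MOL_PER_L, total_ul)
    print(f"aprotinin per test: {dose:.1f} KIE")
    print(f"final CaCl2: {cacl2:.1f} mmol/l in {total_ul:.0f} µl")
```

two tenths of a 10 µl aliquot is 2 µl of trasylol, which at 10,000 kie/ml corresponds to the 20 kie stated in the text.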
this technique requires only commercially available reagents and is easy to perform on conventional teg systems. due to the standardised coagulation activation with a thromboplastin reagent, fibrinolysis can be detected also when inhibitors like heparin are present in the circulation. according to our experience using this technique during liver transplantation, clinically relevant fibrinolysis can be detected as described in less than 10 minutes. many thromboembolic (massive pulmonary embolism, proximal deep vein thrombosis, etc.) and coronary diseases (infarction, acute phase, etc.) require fibrinolytic therapy for early recanalization. the application of well-known or new thrombolytic agents requires specific, simple and reproducible methods for the determination of fibrinolytic activity. we suggest new methods for measuring the blood plasma concentrations of plasmin, plasminogen and antiplasmins, and urine urokinase activity. these methods employ the chromogenic substrate azofibrin (human fibrin, covalently labeled with p-diazobenzenesulfonic acid). methods. 0.2 ml of the solution under study was added to 0.8 ml of azofibrin suspension in a given buffer (5-10 mg/ml) and the mixture was incubated at 37 °c for 10-60 min. at the end of incubation the mixture was filtered, the volume of the solution was brought up to 4 ml with 0.02 m naoh and the optical density was determined at 440 nm. results. azofibrin can be used for quantitative determination of proteinase activity in the search for new fibrinolytic agents. for comparison, the results of our studies of the fibrinolytic activity of some proteinases with the use of azofibrin are presented: activity. with an increase of pai and ldl- and a decrease of hdl-cholesterol concentrations it is concluded that the increased cardiovascular risk in diabetes mellitus was partly caused by a down-regulation of the fibrinolytic system and an increase of erythrocyte aggregation and plasma viscosity.
in addition to disturbances of lipid metabolism, an abnormal whr seems to be an additional atherogenic factor in dm. plasma concentrations of thrombin-antithrombin-iii (tat) and alpha-2-antiplasmin-plasmin (app) complexes and d-dimer were investigated in 50 patients treated with thrombolytic therapy for acute myocardial infarction (ami) with either streptokinase (n=24), urokinase (n=16) or recombinant t-pa (rt-pa, n=10). all patients received an intravenous heparin bolus of 5,000 iu on admission, which was followed at once by an infusion of 1,000 iu/hr for the next three days, titrated to maintain the partial thromboplastin time at twice the control value. tat, app and d-dimer were measured by enzyme immunoassay on admission, at 1, 2, 4, 6, 8, 12 and 24 hours, and on days 3 and 7 after admission. groups did not differ significantly in regard to age, sex, delay and infarct location. on admission, no marker differed significantly between groups. thereafter, tat levels increased significantly only in the rt-pa treated group. from 2 to 6 hours after admission, tat levels were significantly higher in rt-pa treated patients than in the streptokinase and urokinase treated groups (p<0.02). however, during continuous heparin infusion, which was started immediately after the end of thrombolytic therapy, tat concentrations decreased below admission values in each group. app levels were significantly higher only 1 hour after admission in the rt-pa group (p=0.03). d-dimer did not differ significantly between groups. our results demonstrate that rt-pa induces a hypercoagulable state, which may contribute to reocclusion after successful reopening of the infarct-related coronary artery. the significant tat decrease during continuous heparin infusion supports the concomitant use of thrombin inhibitors as adjunctive therapy with thrombolytic treatment for ami. thus, in acute myocardial infarction patients, thrombin generation is markedly influenced by the thrombolytic agent used and by concomitant heparin therapy.
endothelium derived relaxing factor-no (edrf-no) plays a major role in the regulation of vascular tone and also exerts platelet inhibitory action. however, due to the chemical nature of edrf-no, little is known about its production and activity as a general index or marker of vascular function in human diseases. one way to achieve this can be measurement of nitrate/nitrite excretion in the urine, which seems to reflect vascular edrf-no production. in this report a self-developed elisa method is described, which was used for this purpose. nitrate/nitrite urinary excretion proved to be significantly decreased in insulin dependent and in non-insulin dependent diabetes mellitus as well. after a comparison of the excretion values with other markers of angiopathy (von willebrand factor, soluble thrombomodulin, beta-thromboglobulin) it seems acceptable to conclude that urinary nitrate/nitrite excretion can be a useful indicator of diabetic vascular disorders. two major concerns still accompany the application of prothrombin complex concentrates (pcc). viral safety has to be guaranteed and therefore several measures for virus inactivation or elimination are taken during the manufacturing process. the inherent risk of thrombo-embolic side effects has to be considered. to minimize these risks and to achieve good clinical efficiency, the quality criteria for pccs are under ongoing discussion. it is generally accepted that a modern pcc preparation should contain all four coagulation factors in a well balanced proportion and that it should also contain protein c and protein s. additionally, the concentration of activated coagulation factors should be kept at a minimum. a present pcc production process mainly consists of a qae-sephadex extraction of cryopoor plasma followed by a solvent/detergent virus inactivation step. further purification is achieved by subsequent chromatography on deae-sepharose.
the aim of this study was to improve product quality by avoiding f vii activation without implementing major changes to the production process. at the same time, a second virus eliminating step was added to the production process. it could be shown that speeding up the chromatographic process by switching the deae-sepharose chromatography from a classical axial column to radial chromatography resulted in a significant reduction of f viia generation. it is mainly the reduction of contact time, resulting from the highest possible flow rates, that leads to the desired effect. the ratio of f vii to f viia was 10 : 1 or more. in order to investigate the feasibility of virus filtration, the eluate of the deae-sepharose column was filtered through a virus removing ultipor vf filter. the analysis of the solution before and after filtration showed that the filtration had no influence on coagulation factor activity, protein content, proteolytic activity etc. preliminary studies showed significant virus reduction values. in the past few years the expediency of treatment aimed at developing immunological tolerance in hemophilia patients by way of complete removal of the inhibitor with high doses of factor viii has been discussed in the literature. we observed 121 patients with hemophilia. inhibitors to factor viii:c were found in 32.7 % of patients with hemophilia a and to factor ix in 1.6 % of patients with hemophilia b. the level of inhibitor was not higher than 87 bethesda u/ml, that is, those patients were not regarded as "high responders". a high incidence of inhibitors in young patients (from 7 to 26 years of age, 51.9 %) compared with older patients (from 27 to 40 years of age, 11.2 %) points to the probability of inhibitor development during treatment with modern concentrated preparations of factor viii and ix.
inhibitor development in patients (40.5 %) in the course of antihemophilic concentrate transfusions is evidence of alloimmunization of patients with proteins. the investigations show that in the course of transfusion therapy patients develop secondary immunodeficiency due to chronic antigenic stimulation of the immune system with high doses of allogenic proteins. against the background of immunodeficiency, patients with hemophilia develop complications of an immune character: infectious complications, 53.9 %; autoimmune processes, 44.9 %; secondary tumours, 1.2 %. plasmapheresis is the most rational method of removing inhibitor in patients with a low level of inhibitor ("low responders", < 10 bu/ml) and in patients with a mean response. thus it should be noted that the treatment of patients aimed at developing immunological tolerance is not only expensive and economically unprofitable but also not without consequences for the organism. in a recent multicenter study 73 previously untreated patients (pups) with severe hemophilia a were treated with a recombinant factor viii concentrate (rfviii, recombinate®). during fviii treatment 21 (29%) developed inhibitors: 6 high titer (>5 bethesda units (bu)/ml), 4 low titer (<5 bu/ml) and 11 transient inhibitors. plasma samples from before treatment and during treatment but before inhibitor occurrence were available in 12 inhibitor patients. these plasma samples were analyzed by a highly sensitive immunoprecipitation (ip) assay for the presence of anti-fviii antibodies. in 9 (66%) a significant increase of anti-fviii antibodies was seen, indicating the development of a clinically relevant inhibitor titer. this immune response occurred after 2 to 17 (median 5) exposure days (ed). in the same period only 3 out of 15 inhibitor patients showed a decreased in vivo recovery. in 16 pups who developed no inhibitors, plasma samples from the entire treatment period were available.
an immune response to rfviii treatment was seen in 7 pups after 2 to 43 ed (median 24 ed). the immune response was later and less pronounced in comparison to inhibitor pups before inhibitor occurrence. with the ip method the detection of an early immune response is possible, which might be predictive of later inhibitor development. the inclusion of the ip method should be considered for future multicenter pup studies. in the past, anaphylactic reactions to plasma and plasma components have been a common complication of replacement therapy in patients with hemophilia a and b. we report on 3 severe bleeding episodes in 2 patients with hemophilia a and b, respectively. both patients had a history of life-threatening anaphylactic reactions after exposure to different plasma derived clotting factor concentrates, including intermediate purity factor viii and factor ix concentrate, respectively. high purity factor concentrates were tolerated well without any allergic side effects. a 67-year-old patient with a moderate form of hemophilia a (f viii 4 %) had a history of severe immediate reactions with skin manifestations and bronchospasm after exposure to fresh frozen plasma, cryoprecipitate and 3 different plasma derived factor viii concentrates of intermediate purity. in all episodes pretreatment with corticosteroids and antihistamines was unsuccessful in avoiding severe bronchospasm. replacement therapy with two different recombinant factor viii concentrates was tolerated well without any side effects. a 12-year-old haemophilia b patient developed hypersensitivity reactions to prophylactic factor ix substitution, which could be overcome by using a factor ix concentrate with improved purity. a recent recurrence of hypersensitivity under this treatment was finally overcome by the use of highly purified (monoclonal antibodies) factor ix concentrate.
we conclude from these findings that high purity of factor concentrates, possibly due to the absence of soluble hla antigens, is advantageous in patients disposed to allergic reactions. introduction: antibody formation against factor (f) viii remains one of the most severe complications in repeatedly transfused patients with haemophilia a. as reported previously in our study about the incidence of fviii inhibitors, we have observed a high incidence of fviii inhibitors among our haemophilia a patients. it is still not clear why certain haemophiliacs develop antibodies and others do not. a number of previous studies suggest that there is a genetic predisposition for fviii inhibitor development. thus, the purpose of our study was to examine whether there is a correlation between fviii antibody formation and genetically determined histocompatibility antigen (hla) patterns in our haemophiliacs. patients and methods: hla class i (a, b, c) and hla class ii (dr, dq) typing was carried out for 51 and 44 multi-transfused paediatric haemophilia a patients, respectively (fviii:c activity < 5%), including 22 who had developed an antibody to fviii: 19 were high responders (> 5 bu), 3 were low responders (< 5 bu). hla typing was performed by a standard two-stage microlymphocytotoxicity procedure (drk frankfurt) using antisera with defined hla specificity (biotest diagnostica). results: we found an under-representation of hla-a2 in fviii inhibitor patients when compared with the subgroup without inhibitor. with regard to the hla-b and hla-c antigen frequencies there are no apparent differences between the groups. among the class ii antigens there were higher frequencies of dr1, drw52 and dqw1 in the non-inhibitor group. however, the reduction in hla-a1, hla-cw5, hla-dqw3 and hla-dr4 frequency for inhibitor patients as reported previously could not be confirmed in our study. 
conclusion: so far it remains unclear whether there is a significant association of certain hla alleles with the development of fviii antibodies. recombinant factor viii sq (r-viii sq, pharmacia) is a b-domain-deleted recombinant factor viii. it is formulated without albumin (hsa). the product has been shown to have in vitro and in vivo biochemical characteristics similar to a plasma derived full-length protein (p-viii). the international clinical trial programme was initiated in march 1993. pharmacokinetic studies have shown that the b-deleted r-viii sq should be given according to the same dosage principles as a full-length p-viii. at present, the product is being tested in previously treated patients (ptps) and untreated patients (pups) with severe haemophilia a (viii:c < 2 %), both during long-term treatment (on demand therapy or prophylaxis) and during surgery. the long-term study in previously treated patients in germany was started in january 1994. thirteen patients have been included in 8 centers. all patients are still on treatment with r-viii sq, most of them receiving prophylactic treatment. global treatment efficacy has in general been considered excellent or good. no serious clinical adverse events related to the study product have been reported, nor have any inhibiting antibodies to factor viii or antibodies to mouse igg or cho-cell components developed in the patients. further results such as data on efficacy, half-life, recovery and safety will be presented in detail at the meeting. nowadays it is not sufficient to regard hemophilia only as a hemorrhagic diathesis of coagulation genesis, caused by deficiency or molecular anomalies of a coagulation factor, without taking into account the immunity state. on examination of 125 patients (pts) (hemophilia a: 110 pts, hemophilia b: 11 pts, von willebrand's disease: 4 pts) the development of immune complications was revealed in 34.4 %. 
chronic persistent hepatitis (3.2 %), chronic active hepatitis (3.2 %), herpes simplex (1.2 %), chlamydiosis (1.2 %) and bacterial infection (7.2 %) were regarded as infectious complications. bacterial infections have a routine course due to the preserved phagocytic function of neutrophils, whereas viral infections, resistance to which depends on the t-cell link of immunity, take on a chronic persistent course. the mechanism of the development of autoimmune processes (autoimmune thrombocytopenic purpura in 2.4 % of pts, immunocomplex disease in 4.9 % of pts, the appearance of immune inhibitors in 34.4 % of pts) is connected with the impairment of immunological surveillance over autoimmune b-cell clones as a result of dysbalance in the system of immunoregulatory t-lymphocyte subpopulations. lymphadenopathy and splenomegaly (4.9%) develop due to benign proliferation of lymphoid tissue as a result of impairment of the regulatory function of the t-lymphocyte system, or they may be evidence of virus infection. we observed one episode of acute leukemia. immune complications in hemophilia patients develop against the background of secondary immunodeficiency caused by chronic antigenic stimulation of the patients' immune system with high doses of allogenic proteins, which plasma preparations contain. in immune complications hemophilia patients develop hemorrhages whose pathogenesis is quite different from that caused by coagulation factor deficiency, so it should be taken into account in the course of treatment. control of hemophilia therapy classically was based on four parameters: life span expectancy of patients, orthopedic status (normal zero), pettersson score and social integration. often, however, these parameters described an irreversible status with permanent damage, particularly of the joints, especially when patients were grown-up. 
in order to establish risk-adapted therapy protocols to prevent hemophilic osteoarthropathies, quality control programs have to be set up that allow for early adjustment of dosage and substitution frequency. here bleeding frequency is one of the main parameters, being a clear hint for the possible development of a target joint. since 1988 we have established a computer database (haemopat) that contains data from all patients treated in our center. tables and graphs allow for early detection of increased bleeding tendency in a given joint, and accordingly for adjustment of therapy. the results of 8 years of measuring the causes of joint damage, rather than documenting the arthropathies as such, will be demonstrated. in parallel a new program (haemopat win 1.0) will be introduced allowing for easier handling of data and their evaluation. this program will be used as of december 1995. in combination with a substitution calendar to be filled in by all patients, in which factor concentrates, lot numbers, dosage, and date of administration will be constantly recorded, this program will extend our existing database in order to follow closely the clinical and orthopedic parameters of each patient, and consequently acts as a strict control of therapy quality. additionally, it provides sufficient data to fulfil any documentation needs requested by medical authorities. the program will be available free of charge for all those interested. 2) kinderklinik der westf. wilhelms univ. münster 3-6) biotest pharma gmbh, dreieich haemoctin® sdh, the fviii sdh (sdh = solvent detergent and dry heat: 100 °c, 30 min) from biotest pharma, is a high purity (specific activity ~ 100) fviii concentrate manufactured from large human plasma pools. 
virus validation studies have shown virus inactivation/reduction (log10) during the manufacturing process for lipid coated viruses such as: hiv-1 > 16.2; psr > 16.8; vsv > 14.5; bvdv > 15.7; hcv > 4.5* and non-enveloped viruses such as: parvo** = 2.7; reo > 5.3*** and hav > 13.9. more than 50 hemophilia a patients (ptps = previously treated patients), baseline fviii activity < 1%, were included in an international drug monitoring study to follow their fviii inhibitor status. the hemophilia centers included were three centers from hungary (heim pál children's hospital and the national inst. of haematology, budapest, and regional blood transfusion center, debrecen) and four centers from germany (two from berlin, one from frankfurt/main and one from münster). patients were enrolled in the drug monitoring beginning aug. 1993. at entry none of the patients had a detectable inhibitor. at the end of sept. 1995 there were no side effects or adverse events in connection with the use of haemoctin®. before the haemoctin drug monitoring study, the patients were treated with cryoprecipitate or purified fviii products. inhibitor testing was done on patients' plasma samples using the bethesda method. repeated fviii recovery determination at one time (between 12 and 24 hrs) after haemoctin® application demonstrated the expected recovery and normal half-life. none of the hemophilia a patients treated with haemoctin® sdh developed a clinically relevant inhibitor. at the beginning of the study, the clinical efficacy of haemoctin® was studied in 16 hemophilia a patients and shown to give an in vivo recovery of 71 ± 15 % by one-stage assay and 77 ± 17 % by a chromogenic assay. t½ values were 13 ± 2.8 and 12.7 ± 3.2 hrs respectively. the study for the clinical efficacy of haemoctin® sdh was repeated in a group of 6 patients approximately two years later. 
although cd4 lymphocyte counts are known as reasonable predictors of prognosis in hiv infection, the cd4 count is not in all cases an infallible indicator of prognosis. therefore several serological markers are used to predict disease outcome, including beta-2 microglobulin (β2m), immunoglobulin a (iga), lymphocyte counts (lymph) and others. in this study we followed a cohort of 23 haemophiliacs (19 with haemophilia a, 4 with haemophilia b) and 2 patients with severe von willebrand's disease over a period of 28 months (mean, range: 22-34). testing for β2m, igg, iga, igm, cd4 and cd8 cell counts (abs. and relat.), cd4/cd8 ratio, and absolute and relative leucocyte and lymphocyte counts was performed at least 3 times a year. at the same time clinical examinations and review of history were undertaken. means of laboratory tests for every quarter of a year and significant changes during the time of observation were calculated and correlated with clinical data.

          1-4        5-8        9-12       13-16      17-20      21-24
cd4 (1)   440±956    344±924    278±925    302±240    273±220    166±125
cd8 (1)   1165±474   1171±523   1236±1187  1017±439   930±412    873±478
β2m (2)   2.0±0.6    3.0±1.0    3.0±1.2    3.0±1.1    3.5±1.0    3.5±1.3
lymph     1.98±0.6   1.83±0.6   1.72±0.7   1.67±0.6   1.46±0.6   1.31±0.5
(1) means/µl ± standard deviation; (2) means mg/l ± standard deviation

during the time of observation we found significant changes of cd4 (abs. and relat.), abs. cd8 counts, cd4/cd8 ratio, β2m, leucocytes and lymphocytes. the abs. cd4 and cd8 counts correlated clearly with lymphocyte and leucocyte counts but not with β2m. the prognostic value of the tested parameters is discussed by calculation of correlations with clinical data, anti-retroviral treatment and treatment of haemophilia. the availability of high purity factor concentrates has recently encouraged clinicians to use perioperative continuous infusion of fviii or fix to prevent or reduce bleeding in patients with haemophilia. 
in contrast to repeated high-dose bolus injections, the continuous infusion treatment regime maintains constant coagulation factor activity at a level necessary for hemostasis, reducing the total cost of treatment by about 20% and preventing possible side effects of bolus doses. the new application mode, however, requires stable products which tolerate slow passage through an infusion device. our objective was to test in vitro the fviii concentrate immunate (stim plus) and the fix concentrate immunine (stim plus) at room temperature, under conditions of long-term contact with polypropylene tubing in an infusion pump. infusion rates were chosen to mimic the clinical situation. the control samples were not infused through the pump but were otherwise treated identically. test samples were drawn before and at 1, 4, 8, 24 and 48 hours after the onset of each infusion run. fviii (one-stage, two-stage and chromogenic assay) and fix (one-stage) activity were measured using immuno reagents. the presence of activated factors was measured by naptt, while fiia, fxa, plasmin and pre-kallikrein activator were detected with specific chromogenic substrates. the data showed equivalent results between test and control samples with no loss of fviii or fix activity. the potencies of both immunate (stim plus) and immunine (stim plus) remained within 100 ± 20% of labelled values within 48 hours after onset of infusion. in conclusion, immunate (stim plus) and immunine (stim plus) are suitable for continuous infusion when using an automatic infusion device within the applied test criteria. in humans, circulating half-lives of asparaginase enzymes from e. coli and erwinia chrysanthemi vary within a wide range. moreover, half-lives differ not only among different e. coli strains but also among commercial e. coli preparations. to investigate the possible influence of two different sources of e. 
coli asparaginase (asn) preparations on the fibrinolytic system of leukemic children, a prospective randomized study was performed correlating asn pharmacokinetics (asn activity, asparagine depletion) with fibrinolytic parameters (plasminogen (plas), α2-antiplasmin (α2ap), tissue-type plasminogen activator (t-pa), tissue-type plasminogen activator inhibitor 1 (pai 1), d-dimer (d-d)). together with prednisone, vincristine and an anthracycline, 20 children received 10000 iu/m² asn medac® (originally purchased: kyowa hakko kogyo, japan) and 20 children 10000 iu/m² crasnitin® (bayer, leverkusen, germany). blood samples for pharmacokinetic and coagulation analysis were drawn before the first asn administration and every third day whilst on medication. the results are shown in the table [table not recovered; surviving fragment: 0.05]. asn activity shows a negative correlation (spearman: rho/p) to plas (-0.637/0.0003) and α2ap (-0.751/0.0001). a positive correlation was found between asn activity and d-dimer formation (0.475/0.01). t-pa and pai 1 showed no relationship to asn activity. all children showed complete asparagine depletion at a detection limit of 0.1 µmol/l during the course of asn administration. two thrombotic events occurred in the kyowa group. one of the distinctions between the two e. coli asn preparations administered in this study is the absence of cystine in the kyowa asn, which also has a lower isoelectric point and a longer half-life than the bayer type a asn. with respect to these observations this may lead to longer inhibition of protein synthesis, which then may be the cause of a higher rate of side effects. along with studies on asn pharmacokinetics, dose recommendations need to be tailored to the specific asn preparation employed to ensure optimal antineoplastic efficacy while minimizing the hazard of complications. different types of coagulopathy in hepatic veno-occlusive disease (vod) and capillary leakage syndrome (cls) after bone marrow transplantation w. nürnberger, s. 
eckhof-donovan, st. burdach and u. göbel department for pediatric hematology and oncology, heinrich heine university medical center, düsseldorf, germany it is generally accepted that cls, coagulation activation and refractoriness to platelet transfusions are part of the syndrome of hepatic vod. we assessed patients with either vod or cls or both vod and cls, in order to analyze the influence of either syndrome on different aspects of hemostasis. vod was diagnosed according to jones et al. [transplantation 44 (1987) 778]. diagnosis of cls was a ≥3% increase of body weight in the past 24 hours and non-responsiveness to furosemide [nürnberger et al., ann hematol 17 (1993) 67]. patients with vod, cls or both were compared to control patients without either diagnosis. eight patients suffered from both vod and cls, 5 patients only from vod, and 8 only from cls. 61 patients had neither syndrome and served as control population. activation of the coagulation system was assessed by increase of tat-complexes and/or increased consumption of at iii. the hemostasis patterns were as follows: [table not recovered]. introduction: lung cancer goes along with coagulation activation and increased thromboembolic risk. acute phase reaction in cancer patients leads to elevated levels of c4b-binding protein (c4b-bp) followed by a shift from free to c4b-bp-bound protein s. we tried to find out whether there is a correlation between alterations of c4b-bp, the protein c/protein s system and interleukin 6 (il-6), which is one of the most potent inducers of hepatic acute phase reaction. patients: 1. 25 patients with lung cancer; 2. control group: 11 patients in complete remission after lung cancer. methods: clotting methods: protein c and s activity; elisa tests: protein c antigen, tat-complexes, prothrombin fragments f 1+2, il-6. electroimmunodiffusion (laurell): free and total protein s, c4b-bp. results: tat-complexes and f 1+2 were elevated in cancer patients. 
c4b-bp levels were slightly increased (129±19 % of n.), protein s activity was 89±33 % of n. (control group: 108±23 % of n.). il-6 in lung cancer patients was 37.2±52.9 pg/l (control: 27.2±8.2 pg/l). conclusion: one source of the hypercoagulable state in lung cancer patients is decreased protein s activity due to elevated c4b-bp levels. this is probably caused by hepatic acute phase reaction which is triggered by increased il-6 levels. these plasma levels correlate with levels of the tumor marker ca 125 and with the stage of the disease but correlations with patient outcome (disease recurrence and overall survival) have not previously been shown. plasma levels of d-dimer and ca 125 (determined by sandwich elisa assays) were measured prior to treatment in 36 women with figo stage i to iii ovarian cancer and correlated with tumor stage, relapse and overall survival over a mean follow-up period of 28 months (range 16 to 40 months). levels in 71 healthy women and 27 patients with benign ovarian disease served as controls. the occurrence of deep vein thrombosis in the cancer patients was also determined by impedance plethysmography that, when positive, was confirmed by contrast venography. preoperative d-dimer and ca 125 levels in ovarian cancer patients were statistically significantly higher than in controls. preoperative cut-off values were calculated for the prediction of cancer relapse and survival for both measurements. d-dimer levels above a cut-off level of 2060 ng/ml were statistically significantly associated with the rate of relapse but ca 125 levels were not. deep venous thrombosis occurred in 33 % of cases but there was no difference between preoperative levels of d-dimer in patients who subsequently did versus did not develop deep vein thrombosis. high levels of d-dimer are associated with more advanced disease and with poor prognosis in patients with ovarian cancer. 
the high levels of d-dimer are a biologic feature of the malignancy itself that may be attributable, at least in part, to increased conversion of fibrinogen to fibrin in the tumor bed with subsequent degradation of fibrin by the fibrinolytic mechanism. thus d-dimer levels may serve as a marker for overall tumor burden as well as "disease activity". a high incidence of deep vein thrombosis exists in the course of the disease in ovarian cancer patients but preoperative levels of d-dimer are not predictive of this occurrence. von tempelhoff georg-friedrich, michael dietrich, dirk schneider, lothar heilmann. dept. obstet. gynecol., city hospital of ruesselsheim, germany. an increase of plasminogen activator inhibitor activity (pai act.) in the plasma of cancer patients has been recently described. we have longitudinally investigated pai act. in 136 patients with primary breast cancer and compared the results with the outcome of malignancy. patients with untreated primary breast cancer and without proof of metastasis (t 1-4 n0-2 m0) were eligible for this study. in all patients coagulation tests including fibrinogen (method according to clauss), d-dimer (elisa) and pai act. (upa dependent inhibition test) were performed prior to primary operation, 6 months thereafter and at the time of cancer relapse. seventy-two healthy women and 43 patients with benign breast disease served as controls. during a mean follow-up of 32 ± 16 months 34 patients (25 %) developed cancer recurrence and 13 (9.6 %) patients died. in all cancer patients preoperative levels of fibrinogen and pai act. were significantly higher compared to healthy women and to patients with benign breast disease. preoperatively only pai act. was significantly higher in patients with vs. without cancer recurrence (4.52 ± 1.67 u/ml vs. 3.41 ± 1.55 u/ml; p = 0.002). in patients with later recurrence pai act. 
significantly dropped 6 months after operation (p = 0.02) and was again significantly increased at the time of cancer recurrence (4.90 ± 2.89; p = 0.001). a preoperative cut-off value (calculated via cox model) of pai act. above 3.52 u/ml was significantly associated with the rate of relapse (log rank: p = 0.0005), and in 70 % of patients who died of cancer preoperative pai act. was also above this cut-off. impaired fibrinolysis in patients with breast cancer is significantly associated with the outcome of cancer. a monoclonal heparin antibody (mab) has been raised against native heparin using a heparin-bovine serum albumin conjugate prepared by reductive amination. for further analyses tyramine, which was covalently bound to low molecular mass heparin by endpoint attachment (malsch r et al: anal biochem 1994; 217: 255-264), was labeled with 125-iodine at the aryl residue. the tracer-antibody complex was immunoprecipitated by goat anti-mouse immunoglobulin igg. the mab specifically recognized intact heparin and heparin fractions. the lower detection limit of heparin preparations was 100 ng/ml. no cross-reactivity of the mab occurred with other glycosaminoglycans such as heparan sulfate, dermatan sulfate, chondroitin sulfate a and c. oversulfated heparin showed lower affinity to the antibody hl.18 than 2-o- and 6-o-desulfated heparin. the method established for the purification of the mab was ammonium sulfate precipitation followed by dialysis. sds-page and high pressure capillary electrophoresis proved the high purity of the obtained antibody. the biological activity of the mab was tested by the chromogenic assay s2222 and remained stable during purification. in conclusion, the present abstract describes a purified igg 1 monoclonal antibody directed against heparin and heparin fractions, which can be used for biological measurements. the concentration of heparin and dermatan sulfate in biological fluids is usually measured using radiolabeling. 
for this purpose aromatic compounds are usually used to insert radioactive iodine labeling at the saccharide backbone of the glycosaminoglycan. we developed methods for the specific labeling of heparin and dermatan sulfate at the terminal residue. tyramine was bound by reductive amination to the 2,5-anhydromannitosyl end of heparin, produced by nitrous acid degradation and confirmed by 13c-nmr spectroscopy (anal biochem 217: 254-264, 1994). this method was also used to produce a low molecular mass dermatan sulfate (lmmd) derivative after partial deacetylation. in order to choose the proper method for evaluating the specific anticoagulant activity in the series of chitosan polysulphate (cp) samples with different degrees of polymerization and sulphation, we applied the pharmacopoeia article (a1), which assesses the ability of direct anticoagulants to depress the coagulability of recalcified sheep blood (using the 3rd international heparin standard), and measured such activity as per a pharmacokinetic model (a2). the model assumes that the kinetics of cp elimination is linear in the case of intravenous injection to rabbits, as is observed for heparin: $c_t = c_0 \exp(-k_{el} \times t)$, where $c_t$ is the cp concentration at the time moment $t$; $c_0$ is the cp concentration at the moment of injection; $k_{el}$ is the elimination constant. besides, it is assumed that the anticoagulant effect depends linearly on the dose, which finally makes it possible to calculate the specific activity a2: $T = k_T c_t + T_0$, where $T$ is the time of clot formation at different time intervals after cp injection; $T_0$ is the time of clot formation prior to cp injection. the $T$ value was assessed in two tests: blood coagulation time (bct) and activated partial thromboplastin time (aptt). no correlation was observed between a1 and a2. 
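the two model assumptions above can be combined into a short numerical sketch. it is only an illustration under stated assumptions: substituting the elimination law into the linear effect-dose relation gives $T - T_0 = k_T c_0 \exp(-k_{el} t)$, so $\ln(T - T_0)$ falls linearly with time with slope $-k_{el}$, and the half-elimination period follows from the standard first-order relation $t_{1/2} = \ln 2 / k_{el}$. the function names and the least-squares fitting step are hypothetical and are not taken from the abstract.

```python
import math

def concentration(c0, k_el, t):
    """First-order elimination: c_t = c0 * exp(-k_el * t)."""
    return c0 * math.exp(-k_el * t)

def clot_time(c_t, k_T, T0):
    """Linear effect-dose relation: T = k_T * c_t + T0."""
    return k_T * c_t + T0

def half_life(k_el):
    """Half-elimination period for first-order kinetics (standard relation)."""
    return math.log(2) / k_el

def fit_k_el(times, clot_times, T0):
    """Estimate k_el from clotting times measured after injection.

    Assumes both model relations hold, so ln(T - T0) is linear in t
    with slope -k_el; uses an ordinary least-squares slope.
    """
    ys = [math.log(T - T0) for T in clot_times]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return -sxy / sxx
```

as a usage note, clotting times generated with `clot_time(concentration(c0, k_el, t), k_T, T0)` for made-up parameter values are returned to the same `k_el` by `fit_k_el`, which is a quick way to check that the two assumptions are mutually consistent before applying them to measured data.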
at the same time the values of $k_{el}$ and the half-elimination period ($t_{1/2}$) obtained with the original method, i.e. by quantitative determination of cp in rabbit blood taken at different time intervals after injection, showed a close correlation (r = 0.94, p < 0.05) with the same parameters obtained with the help of the pharmacokinetic model in the bct test. thus it was proved experimentally that the assumptions of linear elimination and of the effect-dose dependence, which are necessary for the a2 calculation, were true. we recommend intravenous injection of the samples to animals with further assessment of the results according to the pharmacokinetic model to calculate the specific anticoagulant activity in a series of chemically related potential direct anticoagulants. in this investigation we compared the biological activity of a low molecular weight heparin (lmw-heparin, mono-embolex®) after intravenous, subcutaneous and oral application in rats. sprague-dawley rats were anaesthetized with ketamine/diazepam and the blood samples were taken from the retro-orbital sinus. 150 axa u/kg body weight of the lmw-heparin were injected intravenously and subcutaneously to 10 rats each. between 3 minutes and 10 hours after injection serial blood samples were taken. 200 mg/kg (20,000 axa u/kg) body weight of the lmw-heparin were applied orally using a stomach tube. blood samples were taken between 1 and 24 hours after oral application. the antifactor xa and antithrombin activities of the plasma samples were measured using chromogenic assays and the substrates s 2222 and s 2238 (kabi vitrum). after i.v. injection the maximum axa and aiia activities were 2.8 axa u/ml and 0.8 aiia u/ml respectively. after s.c. application the antifactor xa activity of the lmw-heparin showed a maximum of 0.5 axa u/ml after 120 minutes. the antithrombin activity exhibited an earlier maximum activity of 0.2 aiia u/ml 60 minutes after injection. 
after the oral application no increase of the axa or aiia activities was measured. the lmw-heparin has a high antifactor xa and antithrombin activity after i.v. and s.c. injection. after oral application no activity of the lmw-heparin was measurable. these results imply that fractionated heparin is not absorbed after oral application or is inactivated in the gastrointestinal tract. to improve the activity after oral application modified heparins have to be synthesized. in an in vitro study the effect of various heparin derivatives (calciparin, fraxiparin, cy 222, cy 231, astenose, hexasaccharide, ssh 14) on thrombin- and adp-induced platelet aggregation as well as on adp-mediated platelet activation in whole blood was investigated. all heparin derivatives caused a concentration-dependent inhibition of thrombin-induced aggregation of washed platelets. calciparin and astenose were found to be the most effective compounds with ic50 values of 0.67 and 1.2 µmol/l, resp.; higher concentrations (5-30 times) were required for the other compounds. furthermore, the heparin derivatives were studied with regard to their potentiating effect on adp-induced platelet aggregation. in a concentration range from 1 to 10 u/ml calciparin, fraxiparin, cy 222 and astenose led to a potentiation of the adp-induced aggregation whereas cy 231, hexasaccharide and ssh 14 did not show this effect. the increase in aggregation was associated with an increase in thromboxane a2 formation. in addition, the effect of calciparin, fraxiparin, cy 222 and astenose on adp-induced platelet activation in whole blood was investigated by flow cytometric analysis using monoclonal antibodies to the platelet surface receptors gpiiia (cd-61) and p-selectin (cd-62). at concentrations that caused a maximum potentiation of adp-induced platelet aggregation these substances led to a strong increase of adp-mediated activation of platelets in whole blood. 
the effect was most pronounced when the blood was anticoagulated with calciparin and astenose, resp. in conclusion, the results suggest that the aggregation-promoting effect of the heparin derivatives included in this study is dependent on the molecular weight and the degree of sulfation and is in part due to the generation of thromboxane. heparins are negatively charged polysaccharides and bind protamine, forming a stable complex. here we report on the properties of microbeads (4.5 µm) coated with protamine. protamine chloride (0.16 µm) was covalently bound to 0.5 mg paramagnetic tosyl-activated microbeads m-450 (dynal). the covalent binding of protamine was from 1.0 to 13.0 mg/g beads. protamine-dynabeads were produced in a phosphate buffer at different ph (7.0; 7.5; 8.0 and 8.5). the protamine-dynabeads produced at ph 7.5 showed the best properties for flow cytometry analysis. in saline solution they bound lmm-heparin-tyramine-fitc (lmmh-tyr-fitc) dose-dependently from 0.001 to 2 u/ml, whereas in plasma and blood they bound lmmh-tyr-fitc from 0.05 to 2 u/ml. depending on the binding protocol, the microbeads also bind proteins unspecifically, i.e. bovine serum albumin and, to a lower extent, protamine. the adsorbed proteins, however, do not bind lmmh-tyr-fitc dose-dependently. the saturation of the proteins on the beads was determined as their relative fluorescence intensity (rfi). in saline solution the saturation was measured at 380 rfi, in human plasma at 325 rfi and in whole blood at 252 rfi. using flow cytometry, erythrocytes, lymphocytes, monocytes and granulocytes were not bound to protamine-dynabeads. these data demonstrate that protamine-dynabeads can be used to measure the concentration of lmmh-tyr-fitc in saline solution, plasma and blood because they do not bind to human blood cells. the present study was designed to investigate the anticoagulant action of inhaled low molecular weight (lmw)-heparin in healthy volunteers. 
3,000 iu (group 1), 9,000 iu (group 2), 27,000 iu (group 3) or 54,000 iu (group 4) lmw-heparin were given to 20 healthy volunteers each at 4-week intervals. in group 1 tissue factor pathway inhibitor (tfpi) antigen and activity, s2222 chromogenic factor xa assay, heptest, aptt and thrombin clotting time (tct) remained unchanged during the 10-day observation period. in group 2 tfpi antigen and activity, aptt, tct and the s2222 method remained unaffected. heptest coagulation times were 18.7 ± 2.0 sec before, 26.1 ± 5.2 sec 6 hrs after and 20.5 ± 1.9 sec 24 hrs after inhalation. in group 3 tfpi antigen increased from 74.1 ± 13.9 to 80.5 ± 14.2 ng/ml 3 hrs after inhalation. tfpi activity remained unchanged. the s2222 method increased from 0.01 to 0.08 ± 0.06 iu/ml 6 hrs after inhalation. heptest coagulation values were prolonged up to 42 ± 7.6 sec after 6 hrs and returned to normal within 72 hrs after inhalation. aptt and tct remained unchanged. after inhalation of 54,000 iu lmw-heparin, the following changes were observed: tfpi antigen increased to 103 ± 17.9 ng/ml and normalized within 24 hrs. tfpi activity increased to 1.14 ± 0.23 u 3 hrs after inhalation and was normal after 24 hrs. antifactor xa activity, as measured by the s2222 method, increased to 0.343 ± 0.196 u/ml after 6 hrs and was normal after 72 hrs. heptest coagulation values increased to 77.5 ± 11.8 sec 6 hrs after inhalation and normalized after 144 hrs. aptt and tct did not change throughout the observation period. the data demonstrate a resorption of lmw-heparin by the intrapulmonary route in man. no side effects were observed. recently we developed a tritium-labelled arachidonic acid ([3h]aa) release test with high sensitivity to membrane-toxic agents. the assay performed in u937 cells is intended to evaluate chemicals, drugs and biomaterials with regard to their cytomembrane toxicity [kloeking et al. (1994), toxicology in vitro 8, 775-777]. 
local irritation reactions are described in patients receiving therapeutic dosages of lmw heparin. this fact prompted us to examine the following lmw heparins and heparinoids for their membrane toxicity in u937 cells: reviparin-sodium, enoxaparine-sodium, mucopolysaccharide polysulphate (mps), pentosan polysulfate sodium (pps), and the polysulfated bis-lactobionic acid amide derivatives lw10082 (aprosulate) and lw10086. for this purpose, [3h]aa-labelled u937 cells were incubated with different concentrations of lmw heparins and heparinoids at 37°c for 1 hour. compared with untreated cells, the [3h]aa release of cells treated with 5 mg of the drugs was two times higher with reviparin-sodium, three times higher with bis-lactobionic acid amide lw10086, five times higher with pentosan polysulfate and 20 times higher with enoxaparine-sodium, but it was equal to the control with mucopolysaccharide polysulphate. the rate of arachidonic acid release in response to a test chemical may therefore be used to assess the membrane-toxic effect of this substance and to predict its inflammatory potential in the skin.

semi-synthetic glycosaminoglycans (gags) with antithrombotic properties can be prepared from the e. coli k5 polysaccharide by coupled chemical and enzymatic methods. the molecular weight of these semi-synthetic gags can be adjusted to obtain products mimicking the molecular profile of a low molecular weight heparin. in order to compare the biochemical and pharmacologic properties of a semi-synthetic gag (sr 80486a, sanofi/choay) with a commercially available low molecular weight heparin, fraxiparine (sanofi, paris, france), valid biochemical and pharmacologic methods were used. the molecular profile of this agent as determined by hplc exhibited a distribution profile (mr=5.7 kda) comparable to that of fraxiparine (mr=5.1 kda). the anticoagulant properties of sr 80486a were comparable to fraxiparine in the aptt and heptest.
however, in the usp assay, this agent showed slightly weaker activity. sr 80486a also exhibited comparable affinity to atiii and hcii. in comparison to fraxiparine, it produced a much weaker response in the hit screening system. in in vivo studies, sr 80486a produced strong dose-dependent antithrombotic actions in both the iv and sc studies in the rabbit jugular vein stasis thrombosis model (ed50=15-60 µg/kg). additionally, it also produced antithrombotic actions in a rat jugular vein clamping model. the hemorrhagic effects of this agent were comparable to those of fraxiparine as measured in a rabbit ear blood loss model. intravenous administration of sr 80486a also revealed a pharmacokinetic behavior comparable to fraxiparine. no abnormalities of the clinical chemistry (change in liver enzymes) and hematology profile (thrombocytopenia, leukocytosis, etc.) were noted in primates. at a dosage of 1 and 2.5 mg/kg iv, this agent also caused a release of functional tfpi which was comparable to the observed responses of other low molecular weight heparins. these studies suggest that sr 80486a is capable of producing pharmacologic effects similar to those of other low molecular weight heparins; however, additional optimization studies are required for demonstrating product equivalence.

limited information on the comparative pharmacokinetics of low molecular weight heparin (lmwh) is available from data obtained with aptt, heptest, anti-xa and anti-iia assays. since these drugs are currently used for therapeutic indications using relatively high dosages and intravenous administration, aptt, heptest and anti-iia tests may be valuable in the assessment of their effects. in order to investigate the relative pharmacokinetics of lmwh using aptt, heptest, anti-xa and anti-iia methods, certoparin (sandoz, basel, switzerland) was administered to individual groups of healthy male volunteers (52-70 kg) via intravenous (30 mg) and subcutaneous (50 mg) routes in a crossover study.
blood samples were drawn at 0, 5, 15, 30, 60, 90, 180, 360, 540 and 720 minutes. using a baseline plasma pool obtained from the same volunteers, calibration curves for each of the individual tests were constructed to extrapolate circulating levels of certoparin. a non-compartmental model using the trapezoidal technique was used to obtain pharmacokinetic parameters such as t 1/2, vd and clsys. in the intravenous studies, the t 1/2 was found to be dose-dependent for aptt, heptest, anti-xa and anti-iia. the auc, however, was significantly different for each test and was dose-dependent, following the order anti-xa>heptest>aptt>anti-iia. the clsys of the anti-iia was much faster in comparison to the other tests. the clsys of the aptt and heptest was independent of dose. however, anti-xa clsys by this route was lower than that of the other tests. the apparent vd followed the order aptt>anti-iia>heptest>anti-xa. the bioavailability of certoparin as measured by the various tests ranged from 81 to 119%. these studies suggest that besides providing pharmacokinetic data, aptt, heptest and anti-iia assays may provide useful data on the safety and efficacy of these drugs at high dosages.

the immunological type of heparin associated thrombocytopenia (hat ii) is a severe complication of heparin treatment and is associated with arterial and venous thrombosis. in clinical practice, only patients with absolute thrombocytopenia have prompted suspicion of hat. we report on a 44 year old male who developed thromboembolic episodes after coronary angiography, such as reinfarction and thrombotic episodes of the a. brachialis. fibrinolytic therapy combined with i.v. unfractionated heparin treatment was the therapy of choice and was followed by severe further thromboembolic adverse effects. besides an impaired fibrinolytic response and elevated antiphospholipid antibodies, we diagnosed hat type ii in hipa and elisa (stago-boehringer, mannheim).
this particular patient had platelet counts within the normal range when developing the thromboembolic episodes. it appears that the normal platelet count during the thromboembolic episodes reflects a relative thrombocytopenia. from a clinical point of view we recommend the use of a lab panel to exclude hat type ii in patients with thromboembolic episodes under therapy with fractionated or unfractionated heparin. platelet counts within the normal range are no absolute exclusion criterion for hat ii.

low molecular weight heparins (lmwhs) are now commonly used for the prophylaxis of post-surgical thromboembolic complications. in this indication, lmwhs are administered as a single or twice-daily subcutaneous regimen. usually these agents are administered at a total dose of 30-40 mg, which is equal to 3000-4000 anti-xa (axa) iu. newer methods such as chromogenic substrate based axa methods and the heptest clotting time can be used to determine the effects of lmwhs during the initial phases of prophylactic therapy. this may be useful in elderly and weight-compromised patients, where a fixed dosage may not be optimal and may produce bleeding effects. similarly, in overweight patients a fixed dose may not be efficacious. thus, monitoring of lmwhs in these patients may be useful in the optimization of their therapy. lmwhs are also used in the treatment of deep vein thrombosis using both intravenous and subcutaneous protocols. high dosages of up to 200 mg sc/day and infusions of up to 30 axa iu/kg/hr have been administered. in these conditions, the monitoring of circulating lmwh levels may be useful in optimizing the dosage. we have modified the aca heparin (du pont merck, wilmington, de) method to measure the lmwh levels in the plasma of patients treated with both the prophylactic and the therapeutic dosage. owing to its turnaround time, simple operation and reliable results, this method was found to be of value in the monitoring of these agents.
this presentation provides an overview of the clinical application of various lmwhs, with particular reference to the need to monitor their effects in order to optimize the clinical outcome.

a double-blind, multicentric, controlled trial was performed in order to compare the antithrombotic efficacy and safety of single daily doses of 5000 iu anti-xa of low molecular weight heparin (lmwh) sandoz (certoparin) and 5000 iu unfractionated heparin (ufh) tid in 288 patients undergoing elective total hip replacement. blood samples were drawn before the first subcutaneous injection of lmwh or ufh, respectively, two hours after administration on the first and 7th postoperative day, and on the last day of prophylaxis (day 10-14). anti-xa activity was measured by chromogenic substrate assay, heptest and aptt by clotting assays, and tissue factor pathway inhibitor (tfpi) and heparin-pf4 antibodies by elisa techniques. as expected, the anti-xa activity and the heptest values were significantly higher in the lmwh group at all time points after administration of the drugs; the mean values of heptest were 35 sec in the ufh group and 75 sec in the lmwh group, respectively. the aptt did not differ between the groups. at the end of prophylaxis, positive antibodies to heparin-pf4 complexes were detected in both groups; this, however, was not correlated with clinical thrombocytopenia. a detailed correlation between patients with deep vein thrombosis (dvt) and positive antibodies has still to be done (all patients were screened for asymptomatic dvt between day 10-14 by bilateral phlebography). tfpi was markedly increased in the lmwh group and only slightly elevated in the ufh group; the differences are statistically significant. in summary, it can be concluded that antibodies to heparin-pf4 complexes may occur without clinical symptoms of heparin-induced thrombocytopenia type ii and that tfpi may play a significant role in the antithrombotic efficacy of ufh and lmwh.
unfractionated heparin represents one of the most severe and frequent causes of drug-induced thrombocytopenia. heparin-induced thrombocytopenia (hit) occurring early in therapy is often mild and self-limited, appearing to be caused by a direct aggregant effect of heparin on platelets (hit type i). hit type ii, however, is immune-related and may result in absolute thrombocytopenia (platelet count [...]

[...] 5 bu) hemophiliacs with high titers usually have serious problems. they are resistant to regular replacement therapy. the goal in the treatment is to control severe acute bleeding, to eradicate the inhibitor permanently and to induce tolerance. in the treatment of acute bleedings in patients with inhibitors, factor viii inhibitor bypassing agents like activated prothrombin complex concentrates (feiba) or prothrombin complex concentrates (pcc) are mostly used. the mechanism of action of these concentrates is not fully investigated. their effect is usually related to the high content of activated clotting factors and phospholipids. for some years, activated recombinant factor vii (f viia) has been used successfully to treat patients with inhibitors in several clinical situations, including surgery. in addition, porcine factor viii is widely used, in particular in the uk, for the treatment of factor viii inhibitor patients and has demonstrated good clinical results. in case of life-threatening bleedings, a temporary reduction of inhibitors can be achieved by using extensive plasma exchange (protein a adsorption) and immune suppression with cyclophosphamide (malmö protocol). following the first description by h. brackmann, some modifications for the induction of immune tolerance in hemophilia a patients have been proposed. these schedules can be divided into high, intermediate and low dosage regimens differing in the dosage of factor viii infused. success rates of about 70 to 90% can be obtained with [...] and high dose regimens.
but it has to be considered that these expensive treatment regimens have a great physical and psychosocial impact on the hemophiliacs and their families. the different immune tolerance regimens are predominantly used in high responder inhibitor patients. most of the patients with low concentrations of inhibitors can be managed with factor viii in increased dosage. this is in agreement with the consensus recommendations for the treatment of hemophiliacs in germany from 1994.

before vitamin k (vk) prophylaxis was generally accepted in japan, the incidence of infantile vk deficiency was 1:4000, including both idiopathic and secondary types. since 1981, 4 nationwide surveys have been conducted. the current incidence rate is now about one-tenth of that in the early 1980s. however, in a small number of cases, vk deficiency occurred despite prophylactic administration during the neonatal period. in order to clarify the absorption, excretion and transplacental transport of vk in the perinatal period, the following studies were carried out. 1) hepaplastin tests (normotest) were performed on 65 women in the last stage of pregnancy, and each coagulation factor was estimated as well. 2) correlations were made between mothers' and babies' hepaplastin test values. 3) transplacental transport of vk2 was studied. the general activity of vk-dependent factors in pregnant women was much higher than in non-pregnant women. as far as the correlation between mothers' venous blood during delivery and cord venous blood is concerned, in the group of mothers with a hepaplastin test value of less than 120% of the normal adult value, the value of the hepaplastin test was less than 30% of the normal adult value in the cord venous blood. we also demonstrated that vk passed through the placenta, but only in small quantities.

61 hiv-negative patients (median age 41 yrs, range 13-70), formerly treated with non-virus-inactivated coagulation products, underwent hepatologic examination, including afp screening and sonography.
31 suffered from severe, 20 from moderate or mild haemophilia a or b, and 10 from other severe coagulation factor deficiencies. 52 had been treated with products of the swiss red cross (src) only (28 with small pool cryoprecipitate), 4 with foreign products only, and 5 with both src and foreign products. treatment intensity was variable, with > 20'000 iu/yr in 26, < 20'000 iu/yr in 16, < 1 treatment episode/yr in 10, and a total of only 1-3 treatment courses in 6 patients. 3 afibrinogenemic patients had prophylactic replacement therapy. hcv serology was positive in 56/61 patients (92%), in 47 with detectable hcv rna (77%). the 5 persons who escaped hcv infection, with normal alt levels and without sonographic alterations, had low intensity treatment with small pool src preparations only. alt levels were elevated in 33/56 anti-hcv positive patients (59%). 26/56 had abnormal sonographic findings (46%). there was a clear correlation between elevated alt levels and abnormal sonographies: of 33 patients with elevated alt, 23 had abnormal sonography; of 23 with normal alt, 3 had abnormal sonography. 8 patients had liver cirrhosis (6 with clinically overt hepatopathy), 4 (4/56 = 7%!) with hepatocellular carcinoma (hcc) with elevated afp levels. of these 4 patients, 2 had intraarterial embolization with lipiodol-epirubicin; in 2 patients the hcc diagnosis was made at a late stage. 1 patient with advanced liver cirrhosis underwent successful liver transplantation. 2 of the 6 patients with hepatopathy had severe haemophilia with temporary high alcohol intake, 4 had a mild coagulation disorder with few treatment episodes. possible precipitating factors were coinfection with hbv, high alcohol consumption and first exposure to hcv-contaminated blood products at an advanced age, but not intensive replacement therapy.

very similar results for f viii and vwf.
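the proportions in the hcv survey above can be recomputed directly from the counts it states. the short python sketch below does this; the variable and function names are illustrative only, and the odds ratio is an added descriptive measure that the abstract itself does not report. note that 4/56 works out to about 7%.

```python
# counts taken from the hcv survey abstract; names are illustrative
def pct(numerator, denominator):
    """return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)

total_patients = 61
anti_hcv_positive = 56
hcv_rna_positive = 47
alt_elevated = 33
sono_abnormal = 26
hcc_cases = 4

print("anti-hcv positive:", pct(anti_hcv_positive, total_patients), "%")
print("hcv rna positive: ", pct(hcv_rna_positive, total_patients), "%")
print("alt elevated:     ", pct(alt_elevated, anti_hcv_positive), "%")
print("abnormal sono:    ", pct(sono_abnormal, anti_hcv_positive), "%")
print("hcc:              ", pct(hcc_cases, anti_hcv_positive), "%")

# 2x2 table among the 56 anti-hcv positive patients:
# rows = alt elevated / alt normal, columns = abnormal / normal sonography
a, b = 23, alt_elevated - 23                       # elevated alt
c, d = 3, (anti_hcv_positive - alt_elevated) - 3   # normal alt

# the marginal total of abnormal sonographies must match the stated 26
assert a + c == sono_abnormal

# odds ratio as a simple descriptive measure (not reported in the abstract)
odds_ratio = (a * d) / (b * c)
print("odds ratio, elevated alt vs abnormal sonography:", round(odds_ratio, 1))
```

the internal consistency check (23 + 3 = 26) confirms that the reported subgroup counts add up to the stated number of abnormal sonographies.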
since the factor viii level is kept steady above the level where there is an increased risk of haemorrhage, continuous infusion is haemostatically safer and more efficacious than bolus injections. another advantage is a progressive decrease of clearance during the first days after surgery, which leads to a substantial reduction of factor concentrate consumption by avoiding the unnecessary peaks of bolus injections. 22 children with the severe form of haemophilia a undergoing elective surgery received continuous infusions with different plasma-derived and recombinant f viii concentrates. before surgery, patients got bolus injections to raise the factor viii levels to more than 80%. during continuous infusion, factor viii levels were measured two to three times a day, and the infusion rate of 4 to 5 iu/kg/h could be reduced on the second or third day to 2-3 iu/kg/h. the clinical efficacy was excellent, with no bleeding events. in 5 children with vwd also undergoing elective surgery, continuous infusions with humate p were performed in the same way. no bleeding events were observed in these patients. none of the patients developed postoperative wound infections. the overall doses of f viii concentrate were about 20-30% lower than those required during replacement therapy with bolus doses.

factor x frankfurt i: molecular and functional characterisation of a hereditary factor x defect (gla +25 → lys) huhmann i., holler b., krinninger b., turecek p.l., richter g., scharrer i., forberg e., watzke h. univ. klinik für innere med. i, abteilung für hämatologie und hämostaseologie, wien; immuno-ag, wien; klinikum der j.w. goethe-univ. frankfurt am main, abt. f. angiologie.

factor x (fx) is a vitamin k-dependent plasma protein which is activated either by fviia/tissue factor or ixa/viiia. fxa is the main enzyme for conversion of prothrombin to thrombin.
the congenital fx deficiency (stuart-prower defect), being inherited as an autosomal recessive trait, leads to a bleeding diathesis of varying severity. our propositus is a 24 year old patient presenting a mild bleeding tendency. his ptt (36 sec) is within the normal range, the pt (73% of normal) is slightly reduced. the factor x antigen level is reduced to 55% of normal. molecular characterisation of the genetic defect was performed by amplification of the eight exons and exon-intron junctions by pcr and subsequent direct sequencing of the products. in comparison to the normal sequence we could determine a single mismatch within exon ii resulting in the substitution of +25 gla (gaa) by lys (aaa). the mutation abolishes a naturally occurring mboii site in the dna sequence of exon ii. the status of the fx encoding alleles was determined in the propositus, his mother and one of his brothers by amplification of exon ii and restriction digest with mboii. these family members were heterozygous with respect to the mutation in exon ii. fx was isolated from plasma of the propositus by monoq ion exchange chromatography. performing clotting assays with purified fx frankfurt i, we determined an activity of 89% of normal fx upon activation with rvv, 77% upon intrinsic activation (aptt) and 81% upon extrinsic activation (pt). this compares well with the results obtained from the patient plasma (pt 56%, ptt 55% and rvv 57% of normal) when the reduced fx-ag level of the plasma (55%) is taken into account. we therefore conclude that the substitution of gla +25 to lys results in a fx molecule which is severely defective in both the intrinsic and extrinsic pathway of blood coagulation.

bleeding after cardiothoracic surgery is still a frequent, important and sometimes life-threatening complication. thus, the aim of this study was to examine routine parameters of hemostasis and their predictive values for severe bleedings.
this prospective study included patients undergoing cardiopulmonary bypass surgery. blood samples were drawn preoperatively as well as 0, 6 and 18 hours and 2, 3, 4, 5, 6, 7 and 8 days after surgery. blood loss from drains, transfusion of blood products and other important clinical data were monitored, apart from platelet count, hematocrit, thrombin time, thromboplastin time, aptt and levels of fibrinogen, atiii and c-reactive protein; soluble fibrin (sf) was measured via protamine sulfate aggregability and total fibrin(ogen) degradation products (ftdp) by an elisa from organon teknika. n = 109 patients were examined (age: 64 ± 9 y). they lost 750 ± 460 ml blood (mean ± sd) into the drains within the first 18 hours after the end of surgery. a severe bleeding was defined to exist if the blood loss exceeded this range (> 1200 ml within 18 h). fibrin(ogen) split products proved to be a useful parameter in predicting the risk of severe bleedings: ftdp levels exceeding 12 mg/l at the end of surgery (n = 105) had a negative predictive value of 94%, a positive predictive value of 60%, a specificity of 96% and a diagnostic efficacy of 91%. in contrast, soluble fibrin, which correlated well with fibrinopeptide a (r > 0.91, n = 20), correlated neither with degradation products nor with bleeding complications (n = 109). this observation does not match the correspondence of sf with organ dysfunction during dic: sf reached a negative predictive value near 95% and a diagnostic efficacy of > 70% (patients without antifibrinolytic drugs), which agrees with findings from bredbacka (1994). other parameters were less predictive than ftdp and sf. therefore, further examinations are necessary to determine the value of soluble fibrin for a risk prediction of bleeding complications or dic. a differentiation of split products deriving from either fibrinogen, fibrin or xl-fibrin will provide further insights into fibrin(ogen) metabolism.
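the predictive values quoted in the abstract above all derive from a 2x2 table of test result against outcome. as a minimal sketch, assuming that "diagnostic efficacy" means overall accuracy (the abstract does not define the term), these measures can be computed as follows. the class name, function name and example counts are hypothetical; the counts were chosen only to give values of a similar magnitude and are not the study's data.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """2x2 table of test result (e.g. ftdp above cut-off) vs outcome (severe bleeding)."""
    tp: int  # test positive, outcome present
    fp: int  # test positive, outcome absent
    tn: int  # test negative, outcome absent
    fn: int  # test negative, outcome present


def diagnostic_summary(c: ConfusionCounts) -> dict:
    """return the standard diagnostic measures as fractions between 0 and 1."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),          # positive predictive value
        "npv": c.tn / (c.tn + c.fn),          # negative predictive value
        # assumption: "diagnostic efficacy" = overall accuracy
        "efficacy": (c.tp + c.tn) / total,
    }


# hypothetical counts for 105 patients, not taken from the study
example = ConfusionCounts(tp=6, fp=4, tn=90, fn=5)
summary = diagnostic_summary(example)
for name, value in summary.items():
    print(f"{name}: {100 * value:.1f}%")
```

with a rare outcome such as severe bleeding, a high negative predictive value and specificity can coexist with a modest positive predictive value, which is the pattern the abstract reports for ftdp.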
heparin-induced thrombocytopenia represents a multicomponent syndrome associated with the use of heparin and related drugs, resulting not only in thrombocytopenia but also in arterial thrombosis of varying magnitude. the initial diagnosis of this syndrome is usually made by clinical observation and a drop in platelet count. conventional diagnostic methods include platelet aggregation responses to the patient's serum, 14c-serotonin release in response to the patient's serum, aggregation/agglutination of the patient's platelets in response to heparins, and the detection of the patient's anti-heparin-platelet factor 4 antibodies (hpf4-ab) by elisa methodology. several other individualized methods are also used to demonstrate platelet activation. to test the diagnostic validity of platelet aggregation (pa) and 14c-serotonin release (sr) and the relevance of hpf4-ab, 340 serum samples collected from patients with clinically confirmed cases of hit syndrome were compared in parallel in various assay systems. the diagnostic efficacy of these tests varied from 60 to 74%, with the pa test providing better results than the others. when the pa test was compared with serotonin release, a poor correlation was noted (r=0.47). in contrast, the correlation between pa and hpf4-ab was somewhat better (r=0.66). in another study, blood samples collected from 50 patients treated with a high dose of low molecular weight heparin for two weeks (80 mg o.d.) were tested. 20 of these patients showed a high titre of hpf4-ab without any decrease of platelet count. none of these patients were found to be positive in the 14c-serotonin release assay. a third study included blood samples from dvt patients receiving iv heparin infusion, high dose sc lmw heparin (certoparin) or iv lmw heparin for the management of dvt.
none of these patient groups (n = 20-34) exhibited any hit responses; however, the incidence of a high hpf4-ab titre was found to be 53% in the heparin group, 36% in patients with lmw heparin iv and 26% in the lmw heparin sc group. pa and sr studies revealed 8% and 12% false positive results, respectively. these studies clearly suggest that the currently available tests for laboratory diagnosis of hit syndrome are of limited value, and caution should be exercised in the interpretation of the results obtained with these tests.

heparin-induced thrombocytopenia (hit) is one of the major severe side effects during treatment with heparin. in postoperative medicine, clinical studies demonstrated a higher prevalence of hit with unfractionated than with fractionated heparins. few data are available from non-operative medicine and from patients without thromboembolism before heparinization. in a controlled prospective randomized study, the safety and efficacy of low-dose heparin was compared with a low-molecular-weight (lmw) heparin over 10 days in bedridden medical inpatients (haemostasis, in press). 1968 patients were randomized and monitored for the development of thrombocytopenia. thrombocytopenia was defined as a platelet count below 40,000/µl at day 10. 4 patients developed thrombocytopenia in the heparin group and no patient in the lmw-heparin group (p<0.05). none of the patients with thrombocytopenia developed a thromboembolic complication. in a second prospective case control study, 90 patients with side effects on anticoagulants were treated with lmw-heparin once daily subcutaneously for a period of 1 month to 9 years. platelet counts were performed every 1 to 3 months. none of these patients developed thrombocytopenia during heparinization with lmw-heparin. it is concluded that hit is a very rare complication in non-operated bedridden medical patients. a decrease of platelet count may occur in about 0.5% of patients receiving low-dose heparin.
the incidence of hit with thrombosis during low-dose heparin and of hit during lmw-heparin in non-operated patients is manyfold lower and remains to be determined.

terminology: instead of the term "hemorrhagic disease of the newborn (hdn)" the term vkdb should be used, since neonatal bleeding is often not due to vk deficiency and vkdb may occur after the neonatal period (i.e. after 4 weeks). definition: vkdb is a bleeding disorder caused by reduced activity of vk-dependent coagulation factors which responds to vk. diagnosis: in a bleeding infant, a prolonged pt (inr > 3.5) together with normal fibrinogen and platelet count is almost diagnostic of vkdb. the diagnosis is proven if vk shortens the pt (after only 30-60 minutes) and/or stops bleeding. classification: classification by age of onset into early (< 24 h), classic (day 2-7) and late form (> 1 week, < 6 months), and by etiology into idiopathic and secondary. in secondary vkdb, in addition to breast feeding, other factors can be demonstrated, such as poor intake or absorption of vk and increased consumption of vk. vk prophylaxis: benefits: oral and intramuscular (i.m.) vk (one dose of 1 mg) prevent the classic form of vkdb equally well. i.m. vk appears to be more effective in preventing the late form (times 15 to > 50). the protection achieved by single oral prophylaxis (times 3-5) is improved by triple oral vk (times 15-30). risks: because of potential risks associated with extremely high levels of vk and the possibility of injection injury, i.m. vk has been questioned as the prophylaxis of choice for normal neonates. since vk is involved not only in coagulation but also in carboxylation with multiple effects, excessive deviations from the low physiologic concentrations which prevail in the fully breast-fed healthy mature infant should be avoided. proposal: repeated (daily or weekly) small oral doses of vk are closer to physiologic conditions than single i.m.
bolus doses, which expose neonates to excessively high vk levels. the incidence of intracranial vkdb can be reduced if the grave significance of warning signs is recognized (i.e. icterus, failure to thrive, feeding problems, minor bleeding, disease with cholestasis). whether or not the more reliable absorption of the new mixed micellar (mm) preparation of vk can reduce the protective oral dose of vk prophylaxis has to be evaluated.

the point mutation g to a at nt 449 in exon v of the factor x gene (glu 102 to lys) has previously been found in two independent kindreds with fx deficiency.
it occurred in both families in a heterozygous state and was associated with two other genetic defects in the fx gene. we have identified another family in which this mutation occurs in a homozygous state. in this family the mutation is associated with the previously reported mutation gla 14 to lys, which also occurs in a homozygous state. the pt and ptt of the proposita and her sister are markedly prolonged. the fx activity is reduced to < 1% in the extrinsic system, to 30% in the intrinsic system and to 18% after activation with rvv. the fx antigen is reduced to 20%. the coagulation profile of this family thus is identical with that of fx vorarlberg, despite the fact that the fx vorarlberg kindred is only heterozygous for the mutation glu 102 to lys. haplotype analysis could not rule out consanguinity with the fx vorarlberg kindred. these data suggest that the mutation at nt 449, which leads to a fairly dramatic amino acid change from glu to lys, would indeed represent a polymorphism. to further address this question we cloned the fx gene in an expression vector (pcep 4) for transient expression in the human embryonic kidney cell line 293 and introduced the mutation at nt 449 by site-directed mutagenesis.

hereditary deficiency of factor ixa, a key enzyme in blood coagulation, causes hemophilia b, a severe x-chromosome-linked bleeding disorder; clinical studies have identified nearly 500 deleterious variants. the x-ray structure of porcine factor ixa shows the atomic origins of the disease, while the spatial distribution of mutation sites suggests a structural model for fx activation by phospholipid-bound fixa and cofactor viiia. the 3.0 å resolution diffraction data clearly show the structures of the serine proteinase module and the two preceding epidermal growth factor (egf)-like modules; the n-terminal gla module is partially disordered.
the catalytic module, with covalent inhibitor d-phe-pro-arg chloromethyl ketone, most closely resembles fxa but differs significantly at several positions. particularly noteworthy is the strained conformation of glu-388, a residue strictly conserved in known fixa sequences but conserved as gly among other trypsin-like serine proteinase. flexibility apparent in electron density together with modelling studies suggests that this may cause incomplete active site formation, even after zymogen activation, and hence the low catalytic activity of fixa. most hemophilic mutation sites of surface fix residues occur on the concave surface of the bent molecule and suggest a plausible model for the membrane-bound ternary flxa-fvilla-fx complex structure: the stabilizing fvilla interactions force the catalytic modules together, completing flxa active site formation and catalytic enhancement. factor x frankfurt i molecular and functional characterisation of a hereditary factor x defect (gla +25 ---, lys) huhmann i., holler b., krinninger b., turecek pi., richter g., scharrer i., forberg e., watzke h.. univ. klinik ftlr innere medi, abteilung for h~matoiogie und hamostaseologie, w~en; immuno-ag, wien ; klinikum der j.w. goethe-univ. frankfurt am main, abt. f. angiologie. factor x (fx) is a vitamin k-dependent plasma protein which is activated either by fvila/tissue factor or ixaniila. fxa is the main enzyme for conversion of prothrembin to thrombin. the congenital f×-deficiency (stuart -prower-defect) being inherited as an autosomal recessive trait subsequently leads to bleeding diasthesis of varying severity. our propositus is a 24 year old patient presenting a mild bleeding tendency. his ptt (36 sec) is within the normal range, the pt ( 73% of normal) is slightly reduced. the factor x antigen level is reduced to 55% of normal. 
molecular characterisation of the genetic defect was performed by amplification of the eight exons and exon-intron junctions by pcr and subsequent direct sequencing of the products. in comparison to the normal sequence we could determine a single mismatch within exon ii resulting in the substitution of +25 gla (gaa) by lys (aaa). the mutation abolishes a naturally occurring mboii site in the dna sequence of exon ii. the status of the fx encoding alleles was determined in the propositus, his mother and one of his brothers by amplification of exon ii and restriction digest with mboii. these family members were heterozygous with respect to the mutation in exon ii. fx was isolated from plasma of the propositus by monoq ion exchange chromatography. performing clotting assays with purified fx frankfurt i we determined an activity of 89% of normal fx upon activation with rvv, 77% upon intrinsic activation (aptt) and 81% upon extrinsic activation (pt). this compares well with the results obtained from the patient plasma (pt 56%, ptt 55% and rvv 57% of normal) when the reduced fx-ag level of the plasma (55%) is taken into account. we therefore conclude that the substitution of gla +25 to lys results in a fx molecule which is severely defective in both the intrinsic and extrinsic pathway of blood coagulation.

bleeding after cardiothoracic surgery is still a frequent, important and sometimes life-threatening complication. thus, the aim of this study was to examine routine parameters of hemostasis and their predictive values for severe bleeding. this prospective study included patients undergoing cardiopulmonary bypass surgery. blood samples were drawn preoperatively as well as 0, 6, 18 hours and 2, 3, 4, 5, 6, 7, 8 days after surgery. blood loss from drains, transfusion of blood products and other important clinical data were monitored apart from platelet count, hematocrit, thrombin time, thromboplastin time, aptt and levels of fibrinogen, atiii and c-reactive protein; soluble fibrin (sf) was measured via protamine sulfate aggregability and total fibrin(ogen) degradation products (ftdp) by an elisa from organon teknika. n = 109 patients were examined (age: 64 ± 9 y). they lost 750 ± 460 ml blood (mean ± sd) into the drains within the first 18 hours after end of surgery. severe bleeding was defined as blood loss exceeding this range (> 1200 ml within 18 h). fibrin(ogen) split products proved to be a useful parameter in predicting the risk of severe bleeding: ftdp levels exceeding 12 mg/l at end of surgery (n = 105) had a negative predictive value of 94%, positive predictive value of 60%, specificity of 96% and a diagnostic efficacy of 91%. in contrast, soluble fibrin, which correlated well with fibrinopeptide a (r > 0.91, n = 20), correlated neither with degradation products nor with bleeding complications (n = 109). this observation contrasts with the correspondence of sf with organ dysfunction during dic: sf reached a negative predictive value near 95% and a diagnostic efficacy of >70% (patients without antifibrinolytic drugs), which agrees with findings from bredbacka (1994). other parameters were less predictive than ftdp and sf. therefore, further examinations are necessary to determine the value of soluble fibrin for a risk prediction of bleeding complications or dic. a differentiation of split products deriving from either fibrinogen, fibrin or xl-fibrin will provide further insights into fibrin(ogen) metabolism.

this study was conducted as a randomized parallel-group clinical trial comparing the safety and efficacy of a low molecular weight heparin (lmwh), monoembolex sandoz, and unfractionated standard heparin (ufh) for the perioperative prevention of venous thromboembolic disease (dvt) following major surgery in patients with gynecologic malignancy.
three hundred and twenty-four women (six dropouts) were randomized and received either 5000 iu s.c. ufh three times daily (sandoz, nuernberg, germany) (n = 164) or once a day t5000 i~'v'. units s.c. monoembolex (n = 160) plus two placebo injections. heparin therapy was started the morning before operation and continued until the 7th postoperative day. up to the 10th postoperative day the incidence of dvt was 6.25% (n = 10; incl. 7 pulmonary embolisms, pe) in the lmwh group and 6.10% (n = 10; incl. 4 pe) in the ufh group. the overall incidence of clinically hemorrhagic wound complications was significantly decreased in the lmwh group, 16.3% (n = 26), compared to the ufh group, 26.8% (n = 44; p < 0.005). the incidence of major hemorrhagic episodes was 9.4% (n = 15) in the lmwh group and 14.0% (n = 23) in the ufh group. this difference was not statistically significant. one case of fatal pe was observed in the lmwh-treated group. five deaths were observed in the lmwh group during the study and 3 in the ufh group. this study demonstrates that perioperative treatment with low molecular weight heparin is safer than standard heparin in gynecologic-oncologic patients undergoing major surgery. however, the incidence of thromboembolic complications is similar in both treatment regimens.

to explore the effect of targeting an antithrombin to the surface of a thrombus, recombinant hirudin (hir) was covalently linked to the fab' fragment of fibrin-specific monoclonal antibody 59d8 (fab), resulting in a stable conjugate (hir-fab). in vitro, hir-fab was 9 times more efficient than hir alone in inhibiting fibrin deposition on experimental clot surfaces in human or baboon plasma (p < 0.01). to validate these results in vivo, hir-fab was compared to hir in a baboon model. the deposition of 111in-labeled platelets onto a segment of dacron vascular graft present in an extracorporeal arteriovenous shunt was measured. blood flow rate was 40 ml/min.
one-hour local infusions of 4500 atu of either hir-fab or hir resulted in deposition of 0.16 x 10^9 and 2.17 x 10^9 platelets, respectively. equieffective dosages were 2000 atu hir-fab and 9000 atu hir, resulting in deposition of 1.06 x 10^9 and 0.93 x 10^9 platelets, respectively. based on full dose response curves (n = 14), hir-fab was found to be > 4.5-fold more potent (based on activity) than hir. because of the small total amounts of antithrombins used and the short duration of these experiments, no significant systemic effects were observed. thus, fibrin-targeted recombinant hirudin prevents platelet deposition and thrombus formation more effectively than uncoupled hirudin in vitro and in an in vivo primate model.

triabin, a 17 kda protein from the saliva of the assassin bug triatoma pallidipennis, is a new specific thrombin inhibitor (1). it does not block the catalytic center but interferes with the anion-binding exosite of thrombin. the recombinant protein was produced with the baculovirus/insect cell system and used to study the inhibitory effect of triabin on thrombin-induced responses of human blood platelets and blood vessels. aggregation of platelets in tyrode's solution was measured turbidimetrically at 37°c. for the studies on blood vessels, rings (2-3 mm) from small porcine pulmonary arteries were placed in organ baths for isometric tension recording. the integrity of the endothelium was assessed by the relaxant response to bradykinin. like hirudin, triabin inhibited the thrombin (0.1 u/ml)-induced aggregation of washed human platelets at nanomolar concentrations (ec50 = 2.6 nmol/l), whereas the adp- and collagen-induced aggregation were not suppressed. in pgf2α-precontracted porcine pulmonary arteries, the thrombin (0.2 u/ml)-induced endothelium-dependent relaxation was inhibited by triabin in the same concentration range as found for inhibition of platelet aggregation.
higher concentrations of triabin were required to affect the contractile response of endothelium-denuded porcine pulmonary arteries to thrombin (1 u/ml). in all these assays, the inhibitory potency of triabin was dependent on the thrombin concentration used. these studies suggest that the new anion-binding exosite thrombin inhibitor triabin is one of the most potent inhibitors of the thrombin-mediated cellular effects.

dept. of medicine, university hospital benjamin franklin, free university of berlin, dept. of medicine and dept. of surgery, heinrich-heine-university düsseldorf.

after standardized training in home prothrombin estimation using the coaguchek system, 150 consecutive patients (p) who had st. jude medical aortic or mitral valve implantation were randomly allocated to two arms; 75 p were asked to check the inr themselves every third day. in the remaining 75 p anticoagulation was managed by the home physician without recommending an interval for these controls. all 150 p were monitored during the education period to a target therapeutic range of inr 3.5-4.0. p were asked to contact their home physician immediately if the inr was measured 0.5 below or above the target range (inr corridor 3.0-4.5). all p had out-patient re-examinations every three months. thrombotic, thromboembolic and hemorrhagic complications were documented by the p using special documentation cards. the following findings were documented during the follow-up period: [table not recoverable from the source; surviving values: 4.49, 0.9, 0]. the results of this randomized study demonstrate a significant improvement in the management of oral anticoagulation by home prothrombin estimation. significantly (p < 0.001) more inr measurements were found inside the target therapeutic range. moreover, bleeding and thromboembolic complications could be reduced (p = 0.038) in the study group with home prothrombin estimation.
life-threatening thromboembolic and hemorrhagic complications were not observed in p who were on home prothrombin estimation, while three such events (2.72%/year) were documented in group a.

local vascular injury following ptca exposes circulating platelets to prothrombogenic stimuli. by binding to platelet gp iib/iiia, fibrinogen crosslinks platelets, which represents the final common pathway of platelet aggregation. fradafiban (bibu52zw) is a non-peptide compound with effective, reversible inhibitory effects on fibrinogen binding to gp iib/iiia on human platelets. in the first double-blinded, prospective phase ii study, three escalating doses of bibu52zw as a continuous 24-h i.v. infusion were tested in comparison to placebo in 65 patients with stable angina pectoris undergoing elective ptca. the mean receptor occupancy with 20 mg, 40 mg and 60 mg per hour was 71.9%, 84.5% and 87.9% at 24 hours, respectively. as compared to placebo, bleeding time was significantly prolonged (7 vs 20 min) during fradafiban infusion with a weak dose-dependency. platelet aggregation in platelet-rich plasma ex vivo with collagen (2.0 and 4.0 µg/ml), adp (2.5 and 5.0 µmol/ml) or ca-ionophor a 23187 (2.5 and 5.0 µg/ml) was significantly and dose-dependently inhibited as compared to placebo. using the two upper doses of fradafiban, we observed major bleeding complications in 8 patients requiring blood transfusions or vascular surgical repair. in these patients, too, maximal antiplatelet effects could be documented. these data suggest that bibu52zw is an effective fibrinogen receptor antagonist in patients. the requirement of ad hoc receptor occupancy determination or platelet function monitoring for safe and effective clinical use should be evaluated.
in a placebo-controlled interaction study 9 healthy volunteers were randomized to receive either a 24-hour infusion of peg-hirudin (0.02 mg/kg/h) after an i.v. bolus of 0.2 mg/kg + placebo, or 325 mg/day acetylsalicylic acid (asa) for three days followed by a placebo infusion, or the peg-hirudin infusion + asa. each volunteer received all three treatments. there was a washout period of at least 14 days between the infusions. at short intervals aptt, activated clotting time (act), ecarin time (ect), anti-iia activity using the chromogenic substrate 2238, collagen-induced aggregation, platelet adhesion and platelet-induced thrombin generation time (pitt) were measured. bleeding time (simplate) was studied before drug administration, on day three before the infusion and 6 hours after start of the infusion. the infusion of peg-hirudin after 4 and 8 hours led to a mean hirudin plasma level of 1.8 µg/ml. asa markedly inhibited collagen-induced aggregation as expected. the mean bleeding time was prolonged under the influence of peg-hirudin from 5.2 to 6.22 min, after asa from 5.8 to 18.2 min and after the combination of peg-hirudin + asa from 5.4 to 33.7 min. in each volunteer the bleeding time was longer under the combination than after asa alone. in two volunteers receiving peg-hirudin + asa the bleeding time measurement was stopped after 60 min. none of the coagulation parameters or platelet function tests correlated with the prolongation of the bleeding time. however, the bleeding time was excessively prolonged in those volunteers who had a marked prolongation under asa alone. the combination of hirudin at a higher dosage with asa probably is associated with a relatively high risk of bleeding. either the hirudin dosage should be reduced if the combination seems feasible or asa should be given after the end of hirudin treatment.

fibrinogen with the sta/stago and the mla/dade systems correlated well, but neither system correlated well with the acl/il system.
at iii, protein c, protein s, and anti-xa heparin assays using stago reagents performed as expected for normals and low abnormals on the sta. factor levels on the sta/stago system were less sensitive than factor levels obtained with the dade reagents on the mla or fibrometer. using the sta/stago system, thrombin time results correlated well with the aptt and heparin levels. the thrombin time was not associated with additional manipulation for assay preparation, nor any cross-contamination of reagent or sample, since on the sta reagents do not come into contact with tubing. the sta was not sensitive to hemolytic, icteric or lipemic samples for clotting assays and showed the same sensitivity as the mla for chromogenic assays. the overall data comparisons, high throughput, minimal operator intervention for reagent/assay change and ease of operation warrant further evaluation of the sta hemostasis analyzer.

a. wehmeier, d. söhngen, c. rieth. klinik für hämatologie, onkologie und klinische immunologie der heinrich-heine-universität düsseldorf.

hirudin selectively inhibits thrombin by direct interaction. because the effect of hirudin is independent of antithrombin iii and other factors, it seems an attractive alternative to current anticoagulants. however, it is uncertain whether hirudin influences platelet-associated thrombotic disorders and how it compares with conventional and lmw heparin. we investigated the effect of 2 recombinant hirudin preparations (rhein biotech, düsseldorf) on platelet function tests: in vitro bleeding time, adhesion to glass beads, aggregation in platelet-rich plasma and whole blood. hirudin was used in concentrations of 0.1-100 µg/ml, and was compared to trisodium citrate (0.38%), conventional heparin (50 iu/ml) or lmw heparin (fraxiparin, 500 iu/ml). both recombinant hirudins showed normal activity in thrombin neutralization tests, and prolongation of thrombin time and aptt.
however, in vitro bleeding time was not prolonged by hirudin, but was more than doubled by addition of conventional and lmw heparins. platelet retention to glass bead columns was reduced by hirudin in a dose-dependent manner to about 40% but was more effectively reduced by both heparin preparations and citrate. hirudin had an inhibitory effect on platelet aggregation in prp induced by thrombin, collagen, and predominantly epinephrine but not adp and ristocetin. in whole blood, a small effect could only be observed with hirudin concentrations of >25 µg/ml as compared to citrate-anticoagulated blood. in summary, thrombin inhibition by recombinant hirudin has little effect on in vitro platelet function tests in comparison to heparins and calcium depletion.

the role of endothelin (et), prostaglandins and the coagulation system in the pathogenesis of acute renal failure is still to be defined. in anaesthetized pigs the effects of i.v. infusion of et (3 µg/kg) alone (group 1, n = 6) and after pretreatment with the potent thrombin inhibitor hirudin (0.5 mg/kg) (group 2, n = 6) on haemodynamics, coagulation parameters (factor viii, antithrombin iii, prekallikrein, fibrin monomers, aptt) and prostaglandins were investigated. plasma renin activity (pra), creatinine clearance and urine volume measurements and blood gas analysis were performed hourly. et infusion caused an initial bp reduction and marked hr reduction followed by a transient bp elevation and hr reduction.

activation of platelets can be directly measured by flow cytometry using monoclonal antibodies. in an in vitro study the effect of the thrombin inhibitors argatroban, efegatran, dup 714, recombinant hirudin and peg-hirudin on platelet activation induced by various agonists was studied in whole blood. blood was drawn from normal human volunteers using the double syringe technique without use of a tourniquet to avoid autoaggregation of platelets.
for anticoagulation of blood the thrombin inhibitors mentioned above were used at a final concentration of 10 µg/ml each. blood samples were then incubated at 37°c either with saline, r-tissue factor (rtf), arachidonic acid (aa), adenosine diphosphate (adp) or collagen. at defined times (1, 2.5, 5, 10 min) aliquots were taken and, after various steps of a fixation procedure, the percentage of platelet activation was measured by means of fluorescent monoclonal antibodies to platelet surface receptors gpiiia (cd-61) and p-selectin (cd-62). the agonists used induced a platelet activation of 37.4 ± 15.2% (rtf), 65.1 ± 12.1% (aa), 19.3 ± 7.4% (adp) and 27.1 ± 12.8% (collagen). flow cytometric analysis showed that all thrombin inhibitors studied caused a nearly complete inhibition of r-tissue factor-mediated platelet activation. in contrast, after induction of platelet activation with the other agonists an increased percent cd-62 expression was found, showing a strong platelet activation with a maximum at the same times as in non-anticoagulated blood. in conclusion, the results show that in whole blood thrombin inhibitors are effective in preventing platelet activation induced by r-tissue factor. the formation of active serine proteases including thrombin may be effectively inhibited by these agents. the observations further suggest that, while thrombin inhibitors may control serine proteases, these agents do not inhibit the activation of platelets mediated by other agonists. this work was supported by the grant bmft 07nbl01.

animal experimental studies on the pharmacokinetics of peg-hirudin. e. bucha, a. kossmehl, g. nowak. max-planck-gesellschaft e. v., arbeitsgruppe "pharmakologische hämostaseologie", jena.

hirudin, when complexed with polyethylene glycol (peg), increases its molecular weight from 7 to 17 kda, thereby preventing extravasation of this drug. peg-hirudin is distributed almost only in the intravascular blood space.
in addition, its increased molecular weight retards the renal elimination. the elimination half-life of hirudin in rats (58 ± 12 min, as determined) is increased five-fold (244 ± 18 min). with the same hirudin dose applied, the blood level of hirudin is increased 19-fold, measured in the β-elimination phase. in the urine of rats, 2-3% of the hirudin activity was recovered following hirudin administration, but 48% could be detected after peg-hirudin had been applied. after subcutaneous administration of peg-hirudin, the tmax value is reached at 380 min (r-hirudin: 65 min); the cmax value is increased 3-fold, compared to that of r-hirudin (2.5 µg/ml). 24 hours later, still one fifth of the maximum concentration (cmax) is present in the blood, and the renal elimination is still retarded. in the urine of rats, 12% of the hirudin activity applied was recovered in the 24-h urine sample. with intact renal function, following subcutaneous administration, peg-hirudin is able to produce a constant blood level of hirudin over a long period.

thrombin inhibitors such as r-hirudin (rh), argatroban (a), efegatran (e), and peg-hirudin (ph) are currently undergoing extensive clinical trials in such cardiovascular indications as ptca, ami, and treatment of unstable angina. a rapid assessment of the anticoagulant actions of these agents is, therefore, crucial to assure their efficacy and safety. currently, act and aptt are used to measure the anticoagulant effect of these agents. we have utilized a dry reagent technology based on the motion of paramagnetic iron oxide particles (piop) to measure the antithrombin effects of various thrombin inhibitors (cv diagnostics, raleigh, nc). the heparin monitoring card has been modified to measure antithrombin agents in various anticoagulant ranges for (a), (e), (rh), and (ph).
blood samples drawn from patients treated with (a) and (rh) have been evaluated and concentrations of these agents have been calculated using an external calibration curve. in the in vitro setting, citrated whole blood or citrated frozen plasma can be used to evaluate the anticoagulant effects of these agents. the results obtained are comparable to the act which is conventionally used for the monitoring of these agents. both (rh) and ( period.

we would like to present a case of heparin-induced thrombocytopenia (hit) in a 47-year-old woman who underwent open heart surgery. she suffered from a combined aortic valve disease with predominant stenosis. laboratory analysis showed constantly low platelet counts (50/nl) without heparin application, so that an idiopathic thrombocytopenic purpura was suspected. but platelets also decreased after heparin application. heparin antibodies were found in the heparin-induced platelet activation assay (hipaa). treatment with corticosteroids and immunoglobulins, respectively, showed no improvement, but the patient unfortunately developed a pneumonia with legionella pneumophila. therefore, the only suitable anticoagulant for the necessary aortic valve replacement was hirudin: a bolus injection of r-hirudin of 0.75 mg/kg b.w. was administered 10 min before start of the extracorporeal circulation (ecc), the heart-lung machine (hlm) was primed with 6 mg r-hirudin and another bolus of 5 mg of r-hirudin was administered. additionally, 10 mg of r-hirudin was added to the cell-saver reservoir. during the period of ecc, ecarin clotting time and aptt values were taken every ten minutes for monitoring r-hirudin concentration. the postoperative anticoagulation was performed with a constant infusion of r-hirudin starting eight hours after the end of ecc and monitored by aptt. due to the mechanical aortic valve the further anticoagulation was performed with phenprocoumon, starting 3 days postop. the therapy with hirudin showed no side-effects.
hirudin, therefore, seems to be a suitable anticoagulant in patients with high risk for bleeding complications like this.

doses from 2-30 mg/kg gave similar post-op blood loss measurements without a dose response (4-15 cc/kg) (less blood oozing than a historical heparin control but equivalent post-op blood loss; 10 ± 3 cc/kg). doses >30 mg/kg showed more intra-op blood loss than the lower doses, but equal post-op blood loss. the bleeding time test was less elevated than for heparin. platelet counts and hematocrit did not vary except for hemodilution on pump. liver enzymes did not vary significantly pre-op to post-op. act values showed arg was eliminated (dose-dependently) by 1 hour post-op. dogs were hemodynamically stable during the peri-operative period, and overall gave predictable responses to arg (as opposed to variable responses to heparin). in a substudy it was demonstrated that hypothermia did not affect the activity of arg, nor did various formulations. this dose-finding study strongly suggests that arg may be a safe and effective alternative to heparin for patients undergoing cpb. this is particularly important for the growing population of patients with hit who require cardiac surgery, for which no anticoagulant alternative is presently available.

three recent clinical trials with r-hirudin (timi 9, gusto 2 and hit) have shown that the risk of severe haemorrhagic side effects was strongly associated with high aptt levels. the large interindividual variability of the aptt and the lack of a linear dose-effect ratio, however, limit its value for reliable monitoring of the anticoagulant effect of hirudin, since even severe overdosage due to impaired renal elimination may not be detected with this assay. we have therefore evaluated the ecarin clotting time (ect) as described by nowak and bucha (thromb. haemost.
1993; 69: 1306) under conditions which allow conclusions on its reliability in the clinical situation. for this, citrated venous blood obtained from healthy volunteers, patients with unstable angina pectoris, and patients treated with marcumar was supplemented with different concentrations of peg-hirudin. measurements of aptt and ect were made in duplicate. in contrast to the aptt, the ect showed a close, linear relationship with peg-hirudin plasma concentrations in the range of 100 to 10 000 ng/ml. the linearity of this relationship was not affected by the presence of unfractionated or low molecular weight heparins in concentrations of up to 10 µg/ml. the ect was not affected by fibrinogen concentrations 60% below normal. a somewhat higher slope but no change in linearity was found in plasma from marcumar patients with quick values between 20 and 32%. no significant differences were found between values measured in citrated blood or plasma or using different coagulation timers.

the most potent thrombin inhibitor containing a benzamidine moiety is napap (ki = 6 nmol/l). unfortunately, the pharmacokinetic properties (fast elimination by hepatic uptake and biliary excretion, poor enteral absorption) are unsuitable for the use of napap as an oral anticoagulant. the preferred route of administration for a synthetic thrombin inhibitor would be oral; therefore, we looked for other lead structures. with the nα-arylsulfonylated piperazides of 3-amidinophenylalanine we found a new group of derivatives which inhibit thrombin with ki values in the nanomolar range. the piperazides exert anticoagulant activities with high selectivity, leaving activated protein c and components of the fibrinolytic system unaffected. in rats, the piperazides are rapidly eliminated from the circulation (t1/2 ≈ 10 min) upon i.v. administration, too. after oral administration, the systemic bioavailability is low.
upon intraduodenal administration of high doses, widely varying blood levels were seen, depending on the mode of administration. to clarify the importance of a possible hepatic first pass effect we studied in more detail the pharmacokinetics of the nα-(2-naphthylsulfonyl)-3-amidinophenylalanine n'-acetylpiperazide in rats using hplc analysis. like other benzamidines, the piperazide is excreted via the bile to a high extent. enteral absorption rates of about 20% are found after blocking the hepatic uptake and biliary excretion. hence, a hepatic first pass effect appears to be the main reason for low systemic bioavailability after oral/enteral administration. at the same time, fast elimination from the circulation by hepatic uptake is the main problem for maintaining effective blood levels with benzamidines. therefore, the elucidation of the structural elements influencing the absorption and elimination processes of these types of inhibitors is necessary. the piperazides of 3-amidinophenylalanine offer the possibility to easily introduce a wide variety of substituents on the second nitrogen of the piperazine moiety.

a 69-year-old female patient with diabetic nephropathy increasingly developed signs of allergisation combined with dyspnea, erythema, pruritus, and circulatory insufficiency two months after start of heparin-anticoagulated haemodialysis and initial surgical application of a double lumen venous catheter. in addition, growing thrombocytopenia was observed involving a drop in platelets by 50%, compared to the initial values. the haemodialytic efficiency was reduced by massive thrombosis of the dialyzer and subsequent repeated interruption of treatment. at the end of may 1995 heparin antibodies were detected and the hat diagnosis was confirmed. immediately afterwards, haemodialysis treatment was continued, applying hirudin as anticoagulant.
using steam-sterilised haemophan dialyzers and 0.14 mg/kg r-hirudin (iketon, italy), the minimum therapeutic blood level of hirudin (0.4 µg/ml whole blood) was reached. this provided therapeutically relevant blood level conditions during a 4.5 h haemodialysis. more than 80 regular haemodialyses were run without problems. in all hirudin-anticoagulated haemodialysis treatments the ecarin clotting time was used as the method of choice for bedside blood level and dosage control. after the 34th haemodialysis, the frequency was reduced from 3(4) to 2 haemodialyses a week. accordingly, the hirudin dose was increased to 0.2 mg/kg. the creatinine clearance increased continuously from initially 2.6 to 10.4 ml/min after the 13th week of hirudin-anticoagulated haemodialysis. platelet count and haemodialytic efficiency normalized. we could demonstrate that the regular use of hirudin as anticoagulant along with dialyzers impermeable to hirudin enables very good results in haemodialysis treatment in heparin-associated thrombocytopenia. hirudin is suited for use as anticoagulant in problem patients with heparin-induced allergy when combined with a drug monitoring method fit for bedside use.

capillary electrophoresis methods provide a fast measurement of proteins. thus we developed capillary electrophoresis methods for pharmacokinetic measurements of r-hirudin and peg-hirudin. for the measurement of r-hirudin we used a fused silica capillary and a borate buffer. this buffer was used to detect r-hirudin, but could not be used to measure peg-hirudin. for simultaneous measurement we used a neutral capillary to prevent protein adsorption to the capillary wall. the buffer was a 20 mm tricine buffer (ph = 8.0, field strength 500 v/cm). it resolved r-hirudin from peg-hirudin at 214 nm using reverse polarity.
a linear correlation between the peak area and the concentration was found between 80 µg/ml and 10 mg/ml for hirudin (r² = 0.99) and between 2.5 and 10 mg/ml for peg-hirudin (r² = 0.99). by co-spiking of human plasma and urine with r-hirudin and peg-hirudin the two proteins were completely resolved. the method separates r-hirudin from peg-hirudin and may be applied to biological systems to measure the concentration of r-hirudin.

triabin is a thrombin inhibitor from the saliva of t. pallidipennis which is structurally unrelated to any known protease inhibitor and which probably functions by an interaction with the anion-binding exosite of thrombin. we used sf9 insect cells infected with recombinant baculovirus to produce sufficient triabin for a detailed biochemical characterization. the activity of the protein purified from cell lysates was assessed in a fibrinogen clotting assay and was found to be similar to that of the natural protein. a 4-fold prolongation of thrombin clotting time and aptt was achieved with 22 nm and 600 nm triabin, respectively. a kinetic analysis of the thrombin-catalyzed fibrinopeptide a release from fibrinogen showed that triabin is a tight-binding inhibitor. using the graphical method of dixon, the ki was determined to be 3 pm.

introduction: thrombocytopenia is a common adverse effect of heparin therapy; in type ii hit the platelet decrease induces severe complications. we here present two special cases of type ii hit. case report 1: a 66-year-old male patient with dvt of the left leg was treated with therapeutic doses of heparin. from the first to the 12th day of therapy, platelet count decreased from 122000 to 54000/µl. hit was confirmed by hipa test, heparin therapy was stopped and treatment with the heparinoid orgaran was started. during the following days, arterial thromboses in the right a. femoralis occurred.
several thrombectomies were not successful and, although orgaran® was stopped because of suspected cross-reactivity, amputation of the right leg could not be avoided. during the following days under hirudin treatment platelet count normalized and no further complications occurred. case report 2: a 74 year old female patient suffering from hip fracture was treated by surgery with tep-operation and received prophylactic heparin treatment. after 6 days, platelet count decreased from initially 170000 to 8000/µl and dvt of the right leg was diagnosed. on the same day, severe bleeding into the left leg was observed and hemoglobin concentration fell to 7.8 g% (before surgery 16.0 g%). hit was confirmed by hipa test, heparin was stopped and treatment with orgaran® started. thrombocyte count normalized and no further complications occurred. conclusion: hit type ii can cause severe bleeding as well as thromboembolic complications. because of possible cross-reactivity between heparin and orgaran®, hirudin should be given in hit patients.
currently thrombin time (tt), aptt, activated clotting time (act) or anti-iia activity (aiia), measured by a chromogenic substrate test, are used to monitor hirudin treatment or prophylaxis. the tt responds very sensitively to hirudin plasma levels and thus requires variable thrombin concentrations. aptt appears to be more adequate; however, it shows large interindividual variations and does not respond sensitively enough to higher hirudin concentrations. act is a simple whole blood clotting assay, but it is strongly influenced by the blood collection technique. the ecarin clotting time (ect) is a new clotting assay, recently described by nowak and bucha (thromb haemost 1993; 69: 1306). it measures the clotting time of citrated blood or plasma after prothrombin activation by ecarin, a snake venom of echis carinatus. ect shows a linear dependence on different hirudin concentrations over a wide concentration range (e.g. 0.1-5 µg/ml).
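the linear dependence of ect on hirudin concentration suggests a simple calibration approach, sketched below under stated assumptions: a straight line is fitted to calibrator samples by ordinary least squares and then inverted to estimate the concentration in an unknown sample. the function names and all calibrator values are hypothetical and chosen only for illustration; they are not taken from the abstract, and a real assay would need validated calibrators and a defined measuring range.

```python
def fit_line(concentrations, clotting_times):
    """Ordinary least-squares fit: clotting_time = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(clotting_times) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, clotting_times))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_ect(ect_seconds, slope, intercept):
    """Invert the calibration line to estimate a concentration from a clotting time."""
    return (ect_seconds - intercept) / slope


# Hypothetical calibrators (illustrative numbers only, not measured data).
calibrator_conc = [0.0, 1.0, 2.0, 4.0]       # hirudin, µg/ml
calibrator_ect = [40.0, 90.0, 140.0, 240.0]  # ecarin clotting time, s

slope, intercept = fit_line(calibrator_conc, calibrator_ect)
estimate = concentration_from_ect(165.0, slope, intercept)
print(f"slope={slope:.1f} s per µg/ml, intercept={intercept:.1f} s, "
      f"estimate={estimate:.2f} µg/ml")
```

the inversion is only meaningful inside the range covered by the calibrators, which is why the sketch does not extrapolate beyond them.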
in a clinical interaction study healthy volunteers were administered hirudin, asa or both. 15 male volunteers received an i.v. infusion of peg-hirudin (0.02 mg/kg/h) for 24 hours after an initial i.v. bolus of 0.2 mg/kg to compare the sensitivity and reliability of ect with aptt, tt and act. the act was measured on the hemochron 801, usa, ect on a fibrin timer, aptt using the aptt lyophilized silica reagent by il and aiia on an acl (il-milan) with the chromogenic substrate 2238. all tests were performed in duplicate. ect was more sensitive to different hirudin concentrations than aptt or act. the ect results were better correlated with the aiia activity than aptt and act. the lower detection limit for ect is 0.05 µg/ml hirudin. ect is a very sensitive, simple and reliable test for the monitoring of hirudin treatment and prophylaxis.
recombinant and synthetic inhibitors of thrombin such as hirudin, efegatran and argatroban are currently in various phases of clinical trials in several surgical and medical indications. the therapeutic effects of these agents are usually monitored by aptt, whereas in cardiovascular indications celite act and hemotech® act are used. the reliability of both aptt and the act tests in predicting the safety of various thrombin inhibitors has been heavily debated. furthermore, some of these inhibitors are administered simultaneously to heparinized or coumadinized patients, and the obtained aptt and act results do not fully reflect the effects of these agents. ecarin is a snake venom derived from echis carinatus which converts prothrombin into mesothrombin, targeting the arg-ile bond between the a and b chains of prothrombin. while thrombin inhibitors are capable of inhibiting mesothrombin, the atiii/heparin complex does not have any effect. using purified ecarin, nowak and bucha (thromb haemost 1993; 69: 1306) proposed to assay hirudin.
since thrombin inhibitors exhibit similar mechanisms of thrombin inhibition, ecarin clotting time (ect) was evaluated to test its diagnostic efficacy in various experimental and clinical settings. lyophilized ecarin was obtained from knoll ag (ludwigshafen, germany). concentration dependent clotting times for hirudin, efegatran and argatroban were obtained in a range of 0-15 µg/ml. all of the antithrombin agents produced a concentration dependent prolongation of ect and showed varying potencies in the order of efegatran > argatroban > hirudin on a gravimetric basis. on a molar basis, the anticoagulant order of potency was found to be hirudin > efegatran > argatroban. utilizing the ect, the effect of these inhibitors on patients undergoing bolus or infusion therapy, resulting in a concentration level of ~10 µg/ml, has been measured. unlike such global tests as pt and aptt, patients receiving simultaneous heparin or oral anticoagulants can be monitored for antithrombin specific prolongation of the ect. plasma samples from heparinized (aptt 45-90 sec) or coumadinized (pt 15-25 sec) patients, supplemented with argatroban or hirudin, did not show any differences in the ect. a modified ecarin act comparable to the celite act has also been developed. initial results demonstrate that this test is not affected by aprotinin, heparin and reduction of the prothrombin complexes in the inr range of 1.5-3.0. these results indicate that ecarin based clotting times provide specific results for circulating levels of thrombin inhibitors, which can provide reliable information to optimize their safety and efficacy.
r-hirudin is a highly potent and selective inhibitor of the serine proteinase thrombin. after intravenous administration, r-hirudin is eliminated exclusively with the urine. its plasma half-life is very short, 1-2 h. peg-hirudin is a derivative produced by coupling polyethylene glycol (peg) to a specially designed recombinant hirudin mutein.
peg-coupling results in a considerable prolongation of the plasma half-life of peg-hirudin, compared to r-hirudin. after intravenous administration of r-hirudin into rats, a very small amount of "hirudin-like" activity (2-4% of applied activity) was recovered in the urine. in contrast, after peg-hirudin had been administered, more than 30% of the applied activity could be recovered in rat urine. these results suggest differences in the renal metabolism of peg-hirudin and r-hirudin. within the scope of pharmacokinetic studies in rats we investigated the appearance of biologically active metabolites of peg-hirudin in urine after kidney passage. affinity chromatography on immobilised thrombin was used as a quick and gentle method in searching for biologically active hirudin metabolites in rat urine, but it had to be complemented by anion-exchange and/or reversed-phase chromatography to ensure that all active metabolites were detected. the isolated biologically active metabolites were purified by reversed-phase hplc and were biochemically characterized. in previously reported studies we found a hirudin derivative consisting of the amino acids 1-50 as the main metabolite in rat urine following intravenous administration of r-hirudin. this metabolite was not detected in the urine after administration of peg-hirudin, confirming the suggestion of a different renal metabolism.
carrageenans are high molecular weight sulfated polygalactans of plant origin (derived from red algae) with anticoagulant properties. in previous studies we investigated the anticoagulant activity of lambda-carrageenan, a highly sulfated type of carrageenan. unlike heparin, lambda-carrageenan exerts its anticoagulant activity primarily through direct inhibition of the serine proteinase thrombin. only a part of its antithrombin activity is indirectly mediated through antithrombin iii.
to investigate relations between molecular weight and biological activities, lambda-carrageenan has been hydrolysed and fractionated. the molecular weight has been determined with the aid of size exclusion hplc using dextrans as molecular weight standards. the degree of sulfation has been determined by anion-exchange hplc. we have obtained low molecular weight lambda-carrageenans ranging from 10,000 dalton to 100,000 dalton with degrees of sulfation of 13-17% and 33-38%. the anticoagulant and antithrombin activity of low molecular weight carrageenans have been determined using coagulation assays and purified systems, and we have compared their activities with those of heparin and other sulfated polysaccharides. further, we have investigated the ability of lambda-carrageenan and its low molecular weight derivatives to inhibit the activity of human blood phagocytes. the activity has been determined by measuring the cellular chemiluminescence in a microplate luminometer using a luminol-dependent assay and zymosan as phagocytosis activating agent. we have used an assay in human whole blood and assays with isolated human mononuclear and polymorphonuclear cells. the anticoagulant activity and also the ability of carrageenans to inhibit the activity of human macrophages decrease with decreasing molecular weight and decreasing degree of sulfation.
the naturally occurring yellow pigment curcumin is the major component of turmeric and is commonly used as a spice and food-coloring agent. since curcumin has been reported to have anti-tumor-promoting, antithrombotic and anti-inflammatory properties, we studied whether curcumin acts on the transcription factors ap-1 (jun/fos) and nf-κb in cultured endothelial cells (ec). when ec were cultured in the presence of curcumin, electrophoretic mobility shift assays (emsa) demonstrated that binding of endogenous ap-1 to its dna recognition motif was suppressed.
inhibition was due to direct interactions of curcumin with the dna-binding motif for ap-1. enhanced ap-1 binding, induced after tnfα stimulation of ec, was decreased in cells pretreated with curcumin. this resulted in reduced transcription and expression of tissue factor, known to be controlled by ap-1 and nf-κb. nuclear run-on assays proved that curcumin directly reduced the tnfα-mediated transcription of genes regulated by ap-1, such as tf, endothelin-1 and c-jun. thus, curcumin not only suppressed ap-1 (jun/fos) binding, but also inhibited tnfα-induced jun transcription. transient transfections with tissue factor promoter plasmids confirmed that inhibition by curcumin was dependent on intact ap-1 sites. besides its effect on ap-1 binding, curcumin reduced the radical-dependent activation of nf-κb due to its antioxidant properties; however, this inhibition was indirect and less prominent. the relevance of the in vitro data was confirmed in vivo in mice bearing meth-a-sarcoma. when mice received curcumin before tnfα was injected, tumors showed reduced ap-1 activation. simultaneously, fibrin/fibrinogen deposition decreased, most probably due to reduced tissue factor expression. thus, curcumin inhibits ap-1 activation and expression of endothelial genes controlled by ap-1 in vitro and in vivo.
(jung, 1991). additionally, haemorheological parameters (plasma viscosity, erythrocyte aggregation) were measured. in all patients aptt, bleeding time, platelet adhesiveness, von willebrand factor and factor viii concentration and activity were determined. the patients with von willebrand disease showed characteristic morphological changes of capillary geometry. tortuosity of nailfold capillaries was markedly increased, as was the diameter of capillaries on the arterial and venous side. plasma viscosity was significantly low.
multiple parameter analysis according to galen and gambino (1983), using the parameters "plasma viscosity below 1.25 mPa·s", "torquation index higher than 5" and "erythrocyte column diameter bigger than 16.9 µm", showed a positive predictive value of 100%. capillary diameter and capillary tortuosity have a positive predictive value of 91.2%. additionally, a reduction of the vasomotor reserve and/or a decreased erythrocyte velocity in the capillaries below the reference range was found in most of the von willebrand patients. it was quite remarkable that 16 of 40 of the von willebrand patients showed significant capillary bleedings. these findings confirm some former observations (e.g. o'brian 1950) and preliminary reports of our group (koscielny 1994).
polymerase chain reaction (pcr)-based quantitation of mrna transcripts is an important tool in the investigation of the underlying molecular defects in inherited platelet disorders, such as the bernard-soulier syndrome. however, for the exact quantitation of mrna a number of methodological requirements have to be met. first, a standard (s) mrna must be synthesized which is able to undergo the same processing as the target wild type (wt) mrna. secondly, the quantitation step following the pcr must differentially recognize standard and target dna, and thirdly, the assay must be precise with respect to both inter- and intraassay variability. in order to satisfy these requirements we constructed an s-gpib mrna which is identical to the wt-gpib mrna except for a 13 bp long primer recognition site at its 5' end, allowing differentiation between the pcr amplified wt- or s-gpib cdna through incorporation of a fluorescein or biotin labelled 5' primer. both standard and wt-gpib mrna showed identical amplification kinetics in the pcr reaction. the amplified dna was quantified using a dna binding assay. in this assay binding of amplified dna to gcn4 fusion protein-coated microtiter plates is measured.
since the gcn4 binding motif is incorporated into the wt- and s-gpib cdna through an identical 3' primer, competition between s- and wt-cdna during amplification has been analyzed. at a given concentration of 250 nM of gcn4 primer, no competition between the s-dna and wt-dna for the primer was observed during 25 pcr cycles. the sensitivity limit of the assay performed in this way was 250 amol wt-gpibα dna, and intraassay variability ranged from 1.66% to 6.72%, calculated for 100 fmol and 5 fmol dna, respectively. to sum up, combination of rt-pcr with the amplified dna binding assay and usage of an internal standard mrna allows sensitive and accurate quantitation of gpibα mrna in human platelets.
since upa and thrombin are main contributors to the process of proliferation and migration of vascular smooth muscle cells (vsmc), which is part of the pathogenesis of atherosclerosis, we are currently assessing the role of spatial expression of upa and thrombin receptor (tr) on cells within human carotid artery plaques (n=10). we have used a double immunolabeling approach, combining anti-upa and anti-tr antibodies. to identify the different cell types, we used the following antibodies: anti α-smooth muscle actin (α-sma) for smooth muscle cells, ulex europaeus agglutinin i (uea i) for endothelial cells, inflammation cell cocktail (cd68+cd45) for monocytes/macrophages and lymphocytes, and an anti-proliferating cell nuclear antigen antibody (pcna) to stain proliferating cells. in the carotid atherosclerotic plaques, upa immunostaining was distributed focally, preferentially in the fibrous cap and some cells of the foam cell rich region (fcrr). it was present in distinct patterns: cytoplasmic staining. tr staining was distributed similarly to upa staining. with double staining combining anti-upa antibodies with anti-tr antibodies, cellular co-localisation of both upa and tr was demonstrated. these cells were identified as smooth muscle cells by α-sma.
inflammatory cells were mainly localized within the fcrr; they only stained for upa. in conclusion, our data demonstrate that upa and tr are coexpressed in vsmcs in human carotid artery atherosclerotic plaque tissue. we therefore conclude that the mitogenic activity of upa is associated with the thrombin signalling pathway.
in the proficiency test 1/95 of the "deutsche gesellschaft für klinische chemie" (dgkc), 5 lyophilised plasma samples (immuno ag) were sent to participants: a normal plasma and 4 plasmas from persons under oral anticoagulation (oac-plasmas, inr 1.9 to 3.7). the participants (n=552) returned the pt times obtained and in most cases (n=355) also the isi value for the thromboplastin used (isi of pack insert). the inr was calculated using the pt of normal plasma and the isi of pack insert (method i). two additional methods for inr calculation were compared with method i. according to the concept of calibrated plasmas (houbouyan et al., 1993), a calibration curve was constructed using the normal plasma and the 3 oac-plasmas. the inr was then calculated using the pt of normal plasma and the laboratory-specific isi value, given as 1/slope of the calibration curve (method ii), or was read off directly (method iii). for inr values calculated by the 3 methods from the participants' data (n=355), outlier elimination (2sd, iterative) was performed. the inr mean values for all 3 calculation models remain in a narrow range. using calibrated plasmas (methods ii and iii), fewer outliers were eliminated and the cvs obtained were smaller than with the conventional procedure (i). evidently, problems inherent to the inr, such as an accurate isi value, the pt value of normal plasma and instrument/laboratory influences on isi, can be reduced using calibrated oac-plasmas.
practical approach and educational considerations of home prothrombin time estimation
a. bernardo, a. bernardo, c. halhuber
herz-kreislauf-klinik, bad berleburg, germany
specific training is necessary for the patient to achieve reliable and reproducible results in prothrombin time measurement. the training scheme is based in many respects on experience with similar training courses for home control and management of diabetes and asthma. the education program is divided into a theoretical and a practical part. the theory part has group sessions of twenty patients at a time. the practical course is reduced to a maximum of five patients. the sessions are conducted by a medical doctor and by specialized medical/technical assistants. on average eight hours of theoretical education and two hours of practical training are sufficient. the contents of the theoretical lessons are:
• need for anticoagulation after heart valve replacement,
• potential interaction between anticoagulants and other medication,
• accurate recording of the measured prothrombin time results,
• techniques of prospective determination of the necessary amount of anticoagulant,
• calculation of the individual doses,
• potential pitfalls and mistakes,
• corrections in case of over- and under-dosage,
• early recognition of thromboembolic and/or bleeding complications.
an alternative is a full-day intensive course which can be held during the weekend.
our recently reported (1) observation that oral anticoagulant treatment causes an increase of heparin cofactor ii (hc ii) activity in plasma is now confirmed by a more extensive study. in 43 thrombophilic patients who were on vitamin k antagonist therapy (marcumar®) we found a median hc ii level of 142% as compared to 119% for 72 thrombophilic patients without any therapy (p < 0.002) and 104% for 59 healthy controls (p < 0.001). moreover, we observed that the increase of hc ii level was significantly correlated with increasing inr values (r = 0.63, p < 0.001).
follow-up observations on some patients showed, however, clear differences in the levels of hc ii activity after onset of vitamin k antagonist therapy. thus, some patients responded rapidly with a significant increase in activity ("strong responders") while others showed only slight changes ("weak responders"). in conclusion, the determination of hc ii activity may result in an improved estimation of the risk of bleeding, especially in high intensity treated patients (inr > 3.5).
after intracoronary stent implantation an aggressive oral anticoagulation (oac) therapy is mandatory. to find out whether coagulation activation occurs after coronary stent implantation during high dose oac therapy, markers of plasmatic coagulation and d-dimer were measured. patients: 5 male patients (average age 57 years) were examined. blood samples were taken before and right after stent implantation and during the following week. patients received 30 mg phenprocoumon during the first three days, and additionally heparin and acetylsalicylic acid (asa) were given. methods: ptz, aptt, tz, protein c, tat-complexes, f1+2 and d-dimer were measured. results: d-dimer levels increased steadily between day 0 and day 7. tat-complexes showed a slight increase from day 0 (2.4 µg/l) to day 3 (15.3 µg/l). on day 7 tat levels were down again (2.0 µg/l). f1+2 (day 0: 1.0 ng/ml) also showed a slight increase on day 3 (1.3 ng/ml). protein c decreased steadily from day 0 (108%) to day 7 (15%). conclusion: during the initial phase of oac therapy a coagulation activation has been reported, but no significant elevation of tat or f1+2 was found. this result shows that additional heparin and asa therapy was sufficient to avoid systemic coagulation activation. the increase of d-dimer should be interpreted as a sign of local fibrinolytic reaction due to stent implantation.
three methods for the determination of prothrombin time from capillary blood in patients under oral anticoagulation have been investigated.
two methods were run on coaguchek® monitors (boehringer mannheim) from capillary whole blood. after finger puncture the first drop of blood was applied to the well of a coaguchek® test strip directly from the fingertip, whereas the second drop was sucked into a non-anticoagulated plastic capillary (hirschmann) and immediately applied to the test strip, and vice versa to eliminate any influence of first and second drop of blood. the third method was hepato quick (boehringer mannheim), which was determined from citrated capillary blood from an earlobe puncture. 66 specimens of patients under oral anticoagulation were investigated. the method comparisons between each of the coaguchek® methods and the laboratory method show good results, and the correlation between the coaguchek® methods is excellent. mean differences to the lab method are -0.1 inr in both cases. no mean deviation was detectable between the coaguchek® methods. scattering of coaguchek® versus hepato quick was +/-0.6 inr in the range 1 to 4 inr, except for three outliers and one patient with fluctuating results in the lab method which could not be resolved.
introduction: haemorrhagic coumarin skin necrosis is a severe complication during the initial phase of oral anticoagulant therapy. histological examination shows thrombotic occlusion of small vessels, but little is known concerning the pathophysiologic background of the bleeding component. recently, we described protein z deficiency in patients with bleeding complications of otherwise unknown origin. thus, we were prompted to measure protein z in patients with coumarin skin necrosis. patients: 5 patients (1 man, 4 women; age: 35±10 years) suffering from haemorrhagic coumarin skin necrosis were examined. all patients had normal liver protein synthesis function; none was under oral anticoagulant treatment during this study. method: protein z antigen test, diagnostica stago, france.
results: 4 out of the 5 patients examined had diminished protein z levels (700, 820, 1080, 1700 µg/l) in comparison to normals (2900 µg/l). in one of our patients, protein z was normal (3020 µg/l). conclusion: low protein z levels are additional risk factors for haemorrhagic coumarin skin necrosis.
oral anticoagulant therapy is the treatment of choice in patients with need for long-term anticoagulation. since oral anticoagulants interfere with the function of vitamin k, it is not clear whether stable oral anticoagulation can be achieved in patients who need continuous substitution of fat-soluble vitamins including vitamin k. we report on a 59-year-old man who had experienced progressive hypertrophic obstructive cardiomyopathy over the preceding 21 years. atrial fibrillation was first diagnosed 18 years ago. later on, recurrent ischemic attacks and embolism of the right arteria iliaca occurred. in 1993 the patient underwent extirpation of the ileum and subtotal resection of the jejunum because of mesenteric infarction. the resulting short bowel syndrome requires continuous substitution of fat-soluble vitamins. since vitamin k free preparations of fat-soluble vitamins for parenteral use are not available, prophylaxis of thrombosis was performed with unfractionated heparin. as a consequence of the long-term treatment with heparin the patient developed severe osteoporosis. therefore, the decision was made to discontinue heparin therapy and initiate oral anticoagulation. because of its shorter half-life, warfarin (coumadin) was used instead of dicoumarol. over a 4-week induction phase inr values were controlled daily. a dosage regime starting with 10 mg warfarin on the day of vitamin application (day 1), followed by 3.75 mg on day 2 and 1.25 mg on days 3, 5, and 6, respectively, was found to be optimal to maintain inr values within the target range (inr: 2.0-3.0).
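for readers less familiar with the inr values monitored in this case, the sketch below shows the conventional definition, inr = (patient pt / normal pt) raised to the power isi, together with a check against the target range quoted above. the function names and the example numbers are hypothetical and for illustration only; the abstract does not state which thromboplastin, isi or normal pt was used.

```python
def inr(pt_patient_s, pt_normal_s, isi):
    """Conventional INR: (patient PT / normal PT) raised to the ISI of the thromboplastin."""
    if pt_patient_s <= 0 or pt_normal_s <= 0 or isi <= 0:
        raise ValueError("PT values and ISI must be positive")
    return (pt_patient_s / pt_normal_s) ** isi


def in_target_range(inr_value, low=2.0, high=3.0):
    """True if the INR lies within the target range (default 2.0-3.0, as quoted in the text)."""
    return low <= inr_value <= high


# Hypothetical example values (not taken from the case report).
example = inr(pt_patient_s=30.0, pt_normal_s=12.0, isi=1.0)
print(round(example, 2), in_target_range(example))
```

because the isi appears as an exponent, a small error in the isi changes the result more at high pt ratios than at low ones, which is one reason laboratory-specific calibration is discussed elsewhere in this collection.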
in order to minimize the risk of hemorrhage the vitamin administration was changed to the subcutaneous route. during an observation period of 6 months neither bleeding or thrombotic complications nor a vitamin deficiency occurred. these data indicate that stable oral anticoagulation can be achieved despite extreme variation of vitamin k plasma levels.
portable monitors for home monitoring of inr are well established for adults on oral anticoagulants. patient compliance is improved, as is long-term outcome. experience concerning accuracy of the procedure in children is limited. 32 inr determinations were performed in parallel from venous and capillary blood samples of an infant on phenprocoumon, starting at the age of 4 months. the coaguchek® monitor from boehringer mannheim was used. choosing an arbitrary range of agreement of inr 0.5 for both determinations, 81% of the measurements were within the defined range. 5/6 outliers were due to low inr resulting from difficulties in capillary blood sampling. the degree of agreement increased when the procedure was performed at least once a week. in conclusion, inr determination with a portable monitor may be helpful in home monitoring of oral anticoagulant therapy in young children. a dose adjustment should be made only on the basis of inr determination from venous blood, if it is considered the gold standard, to avoid over-anticoagulation.
stable anticoagulation is one of the most difficult tasks in attending patients with heart valve prostheses. if prothrombin times are out of the therapeutic range, the risk of bleeding or thromboembolism increases disproportionately. for this reason any improvement in anticoagulant control and/or management can have far-reaching consequences in decreasing complications, in extending longevity and in improving quality of life.
for the first time a clinical trial was started in 1986, and continues until today, at the cardiac rehabilitation center bad berleburg, germany, with patients mainly after heart valve replacement. the patients were trained to measure their own prothrombin time and to adjust their own dosage of the oral anticoagulant. within six years 600 patients were trained; 216 patients could be followed up with regard to their self-determined prothrombin times. the results were within the therapeutic range in 83.1% of the measurements (n = 14,812) taken by the patients themselves. on average, the patients who determined their prothrombin time themselves did so at a weekly interval. neither major bleeding nor thromboembolic complications were observed in the 205 patient-years of home prothrombin estimation. it is to be hoped that the usual rate of complications can be reduced when patients determine their prothrombin time themselves at a close interval, resulting in more constant values in the therapeutic range and slight corrections of the anticoagulant dose. home prothrombin estimation promises better quality of life and has a considerable potential to achieve this goal.
circulating plasma thrombomodulin (tm) is a novel endothelial cell marker which may reflect endothelial injury. tm acts as a thrombin receptor which neutralises the fibrin-forming effect of thrombin and also accelerates the activation of the anticoagulant protein c/s pathway. tm therefore belongs to the anticoagulant defence system against thrombosis. increased tm levels have been described in various diseases such as ards, thromboembolic diseases, ttp, diabetes, le and cml, reflecting alterations of the vascular system at the endothelial level. to find out to what extent cardiac catheterisation irritates vascular endothelium, tm concentrations (stago, asnieres, france: x 10³ iu/ml) were investigated prospectively in 58 infants and children (three days - 16 years).
blood samples were drawn before the intervention, immediately at the end and 24 h later, snap frozen (-70 °c) and investigated serially in duplicate six weeks to 3 months later. the results (median and range values) are shown in the table. the enhanced tm concentrations immediately after the operative intervention, followed by normalisation within 24 h, indicate that cardiac catheterisation in pediatric patients leads to a short-lasting irritation of the vascular endothelium rather than to severe irreversible endothelial damage.
recently, using an aptt-based method, dahlbäck et al. described in vitro resistance to the anticoagulant effect of activated protein c (apc) in thrombophilic adult patients. apcr is in the majority of cases associated with the arg 506 gln point mutation in the factor v gene. considering the special properties of the neonatal hemostatic system (low vitamin k dependent coagulation factors, physiological prolongation of the pt and aptt), we adjusted this aptt-based method (chromogenix, mölndal, sweden) to neonatal requirements: apcr was measured in 120 healthy infants according to dahlbäck. the results were expressed as apc ratios: clotting time obtained in a 1:1, 1:5 and 1:11 dilution with factor v deficient plasma (instrumentation laboratory, munich, germany) using the apc/cacl2 solution, divided by clotting time obtained with cacl2 in the same 1:1, 1:5 and 1:11 dilution. in addition, plasma samples of 24 neonates with septicaemia were investigated, and data of 18 infants aged birth to three months with arg 506 gln +/- were shown. the arg 506 gln mutation of the factor v gene was assayed by amplification of the dna samples by pcr followed by digestion of the amplified products with the restriction enzyme mnl i. results were confirmed by sscp analysis or by direct sequencing of dna from patients with apcr. results are shown in the table: 1.6 (1.4-1.95). neonates and infants were considered to be apcr when the aptt ratio was ≤ 2.
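the apc ratio described above is a simple quotient, and the classification rule is a threshold on it. a minimal sketch follows; the function names are hypothetical, the example clotting times are invented for illustration, and the cutoff of 2 is the one stated in the text rather than a general reference value.

```python
def apc_ratio(clot_time_with_apc_s, clot_time_without_apc_s):
    """Clotting time with APC/CaCl2 divided by clotting time with CaCl2 alone (same dilution)."""
    if clot_time_with_apc_s <= 0 or clot_time_without_apc_s <= 0:
        raise ValueError("clotting times must be positive")
    return clot_time_with_apc_s / clot_time_without_apc_s


def is_apc_resistant(ratio, cutoff=2.0):
    """Classify as APC resistant when the ratio is at or below the cutoff (<= 2 in the text)."""
    return ratio <= cutoff


# Hypothetical clotting times in seconds (illustrative only, not study data).
ratio = apc_ratio(clot_time_with_apc_s=72.0, clot_time_without_apc_s=45.0)
print(round(ratio, 2), is_apc_resistant(ratio))
```

both clotting times must come from the same plasma dilution, since the abstract reports that agreement with the pcr result depended on the dilution used.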
considering the special properties of the neonatal hemostatic system, our data show concordance with the pcr method in neonates and infants only when the aptt-based method was performed in the 1:11 plasma dilution.
case report: we report on an 8-year-old boy with severe hemophilia b and frequent screaming at night. eeg showed spike wave activity, starting from the temporal lobe but generalizing within seconds. complex partial seizures were diagnosed and therapy with carbamazepine was initiated. as no improvement was seen, nmr was performed. this revealed lesions within the right frontal cortex. higher doses of carbamazepine were not successful, nor was therapy with phenytoin or primidone. the patient is now treated with carbamazepine and valproate. he still suffers from one short seizure per day. because of his seizures we started prophylactic replacement therapy with 600 iu factor ix twice per week. discussion: in 1992 wilson et al. first detected brain abnormalities in 25 of 124 children and adolescents with hemophilia a or b who were negative for immunodeficiency virus (1). the most common findings (14/25 patients) were small, focal, nonhemorrhagic white matter lesions of high signal intensity on t2-weighted images. similar lesions have been reported in children with sickle cell cerebral infarction (2). only three of these 14 patients had seizures, all of them having a documented history of intracranial hemorrhage. our patient has lesions similar to those described by wilson et al., but no history of intracranial hemorrhage is documented. even though tuberous sclerosis might be a differential diagnosis, we think that the abnormalities are related to hemophilia or its treatment, because the patient has no further signs of tuberous sclerosis. conclusions: 1. in patients with hemophilia and seizures nmr might be useful as a highly sensitive method for the detection of gray and white matter changes. 2.
further studies should be initiated to determine the prevalence of pathological conditions in the brain of hemophiliac patients. disseminated intravascular coagulation (dic) is a rare but foudroyant disease occurring in gram-negative sepsis such as meningococcal septicemia. despite the availability of potent antibiotics, mortality in meningococcal disease remains high (about 10%), rising to 40% in patients presenting with severe shock and consecutive dic. as the clinical course and the severity of manifestations of systemic meningococcal infections vary, there is a need for early diagnosis of the infection and of the stage of coagulopathy in order to reduce the high mortality rate. few and rapidly available parameters are needed to classify the wide spectrum of clinical and laboratory findings in patients with dic. the parameters include partial thromboplastin time, prothrombin time, plasma levels of fibrinogen, fibrin monomers and dimers, fibrin degradation products and the thrombocyte count. monitoring the course of hemostaseological findings in 26 pediatric patients with systemic meningococcal infections, we observed a change of coagulation parameters as early as the first stages of the infection: a prolongation of partial thromboplastin time to an average of 69.1 sec (range 22-150 sec, normal 30-45 sec), a decrease of prothrombin time to 45.7% (range 13-71%, normal 70-100%) and of antithrombin iii to an average level of 16.8 u/ml (normal 20-29 u/ml) were found 1 to 4 (-6) hours after admission. the subsequent development of the hemostaseological parameters mentioned above made it possible to define the stage of coagulopathy and thus to start a stage-related therapy. primary treatment consisted of control of shock by fluid substitution, compensation of metabolic acidosis, correction of clotting disorders (at iii and heparin in the stage of pre-dic; at iii and fresh frozen plasma in case of advanced dic) and treatment with β-lactam antibiotics (e.g.
cefotaxime or ceftriaxone). an early assessment of the coagulation disorders in meningococcal disease can be based on few coagulation parameters; thus an appropriate treatment may be arranged to protect the patient from a fatal outcome of meningococcal septicemia and from the development of a waterhouse-friderichsen syndrome. this study was designed to prospectively evaluate coagulation and fibrinolytic activation in 60 children (neonate to 16 years) during cardiac catheterisation with low dose flush heparin (10 iu/ml saline). aptt (instrumentation laboratory: sec), anti xa activity (xa; chromogenix: iu/ml), prothrombin fragment f1.2 (f1.2; behring werke marburg: nmol/l) and d-dimer formation (d-d; behring werke marburg: µg/l) were investigated before (t1), at the end (t2) and 24 h after cardiac catheterisation (t3). in addition, to evaluate the influence of inherited thrombophilia, resistance to activated protein c (apcr), protein c, protein s and antithrombin were investigated in all patients. during catheterisation heparin was administered in a median (range) total dose of 60 (17-206) iu/kg bw. in addition, infants < 6 months of age (arterial catheterisation only) or patients with known thrombophilia received 300-400 iu/kg heparin for a further 24 hours. the results (median and range) are shown in the table. f1.2 was significantly elevated above the pediatric boundary immediately after the intervention and nearly reached baseline values 24 h later. in contrast, no clinically relevant fibrinolytic activation was seen: d-dimer formation increased within the pediatric boundary immediately after the catheterisation and returned to baseline levels 24 h later. three children showed resistance to apc. in one child stroke had occurred before. because the apcr result was not known at the time, only one neonate of the remaining two patients received further prophylactic heparin.
the third neonate, without heparin prophylaxis, suffered from venous occlusion within two days after the intervention. in addition, no protein c, protein s or antithrombin deficiencies were found. although administration of low dose flush heparinisation during cardiac catheterisation could not prevent short-term coagulation activation, no thrombotic events occurred in children without inherited thrombophilia. whether further prophylactic heparinisation in children with apcr, protein c, protein s or antithrombin deficiencies may prevent vascular occlusion requires a more intensive study. a.sandvoss, w.eberl, m.b0rchert introduction: capillary leakage, edema and hypovolemia are common complications in preterm infants, especially if birth weight is below 1500 g. septicemia, asphyxia and immaturity seem to be the most important risk factors. to determine the influence of c1-esterase inhibitor (c1-ina) in preventing contact phase and complement activation we investigated c1-ina concentrations in normal and symptomatic preterm infants. methods: activity of c1-ina was measured by a chromogenic substrate method (behringwerke), c1-ina concentration by radial immunodiffusion (behringwerke, germany). results: c1-ina activity in asymptomatic preterm infants (n=14) was 65 +/- 15% of normal at birth. healthy newborns showed activities of 80 +/- 20%. c1-ina reached normal adult values 2-4 days after birth. preterm infants with respiratory distress syndrome (n=14) showed lower activity on day 2-5; patients with additional septicemia (n=15) had decreasing c1-ina activities in the first three days of life. the individual course of c1-ina activity and thrombocyte count correlated in the group with irds with and without septicemia. in children with capillary leakage, onset of diuresis paralleled rising c1-ina activity. markers of contact phase (f xiia) and complement activation (c5a) were investigated in single cases and evidence for involvement of both systems was found.
conclusion: contact activation and the complement system play an important role in capillary leakage in preterm infants. c1-ina regulates both systems. activity of c1-ina correlates with the clinical course; substitution therapy is possible and may improve the outcome of these critically ill patients. antiphospholipid antibodies (apa) interfere with hemostasis, probably by inhibition of protein c or the prothrombinase complex. thereby, apa might lead to thrombosis or increased bleeding. however, the incidence and clinical importance of apa have not yet been investigated in children. therefore, we assayed plasma samples of 220 children, aged 0.1 to 19 years (mean 7 years), by elisa detecting igg and igm antibodies directed against cardiolipin, phosphatidyl serine and phosphatidic acid. in patients with increased bleeding, thrombophilia or prolonged clotting tests a detailed coagulation analysis was performed. according to their diagnosis children were divided into 5 groups: i. autoimmune diseases, ii. infections, iii. metabolic diseases, iv. other diseases, v. healthy children. results: apa were found in 69/220 patients. in the respective groups we demonstrated apa in the following proportions: 1. igg isotype: activity of c1 esterase inhibitor (c1-ina) is reduced in preterm infants, especially if birth weight is below 1500 g and respiratory distress syndrome and/or septicemia is present. capillary leakage with generalized edema, hypovolemia and hypotension results in an imbalance between inhibition and activation of contact phase and complement system. in four patients we investigated seven courses of substitution with a commercial c1 esterase inhibitor preparation (berinert, behringwerke); case reports are given. all patients had clinical symptoms of capillary leakage; all had septicemia accompanied by either respiratory distress, disseminated intravascular coagulation or multiple organ failure.
efficacy of substitution therapy is dose related; supranormal activities of c1-ina are necessary, reflecting raised consumption of inhibitor in ongoing disease. clinical effects on diuresis, catecholamine need and especially on thrombocyte counts are demonstrated. or arterial thromboembolic event in children e. lenz, c. heller, w. schröter*, w. kreuz johann w. goethe-universitätskinderklinik, frankfurt a. main, germany * georg-august-universitätskinderklinik, göttingen, germany venous thrombosis as well as arterial thrombo-occlusive events are rarely observed in childhood, but can lead to life-threatening situations and long-term sequelae in these patients. after the initial stage of treatment (thrombolysis or thrombectomy) the pediatrician has to decide how to efficiently prevent re-thrombosis in the individual patient. anticoagulation after venous thrombosis is generally recommended for 6 months after the event; if an underlying thrombophilic condition has been detected in the patient, lifelong anticoagulation has to be considered. when evaluating antithrombotic therapies for children it is important to consider whether the anticoagulatory effect is mainly necessary in the venous or the arterial vessel system. the hemorrhagic risk and side effects of the different anticoagulatory preparations have to be taken into account, especially when treating small children. only limited experience exists concerning the suitability of the preparations for long-term anticoagulation in children, and general recommendations on the ideal dosage in pediatric patients are still missing. we want to discuss different types of anticoagulants (such as coumarins, unfractionated heparin, low molecular weight heparin (lmwh) and inhibitors of platelet aggregation), their mode of action, their suitability for pediatric patients, and their side effects and the relevance of these side effects especially in children.
from the experience in our own pediatric patients, we would like to report on the indications for administering these different preparations, the dosage regimen we recommend and the laboratory tests to monitor safe and efficient re-occlusion prophylaxis in our patients. in this context we would like to present our data on 8 patients with either thrombosis or arterial infarction due to a thrombophilic condition, all of whom had contraindications to oral anticoagulation by coumarins. because prophylaxis for re-thrombosis was mandatory in these patients, lmwh was given for long-term anticoagulation in a daily subcutaneous dosage of 100-150 anti-xa u/kg bw. monitoring was done by anti-xa test (0.4-0.8 anti-xa u/ml). under this regimen none of the patients developed re-thrombosis or bleeding complications. alopecia was seen as a side effect. this study was designed to prospectively evaluate coagulation and fibrinolytic activation after cardiopulmonary bypass with aprotinin (2 x 17000 u/kg bw) in 42 infants and children aged 0.1-15 years, and to correlate these findings with the clinical outcome. prothrombin fragment f1.2 (f1.2; behring werke marburg: nmol/l), antithrombin-serinesterase complex (atm; stago: ng/ml), d-dimer formation (d-d; behring werke marburg: µg/l), tissue-type plasminogen activator antigen (t-pa; chromogenix: ng/ml), plasminogen activator inhibitor 1 antigen (pai; chromogenix: ng/ml) and c1-inhibitor (c1; behring werke marburg: x 10^-3 g/l) were investigated before the operation (t1), at the end of the operation (t2), and on postoperative days 1 (t3), 4-6 (t4) and 7-9 (t5), respectively. the results are shown in the table (median and median absolute deviation): columns t1, t2, t3, t4, t5, nv; row f1.2: 0.9 +/- 0.5, 1.7 +/- 0.9, 1.4 +/- 1, 1.8 +/- 0.8, 1.6 +/- 0. (remainder of table not recovered). the platelet (pl) function defect induced by thrombolytic agents has been attributed either to the degradation of pl surface receptors or to the anti-aggregatory effect of fgdps.
in contrast to other plasminogen activators scu-pa is intimately linked with pl: they can rapidly incorporate exogenous scu-pa, release it upon stimulation and bind the proenzyme. recently we have reported that exposure of prp to recombinant scu-pa (2.5-100 um) for timed intervals of 1-30 min resulted in dose-dependent inhibition of pl aggregation. the time course of the process followed biexponential kinetics: a rapid initial inhibition during the first 3-5 min, followed by moderate suppression of pl aggregation over the 30 min period. when prp was exposed to tcu-pa (25-100 nm) under the same conditions, dose- and time-dependent inhibition of pl aggregation was also observed. however, the effect was obtained no earlier than 10 min after exposure of prp to tcu-pa, and the threshold dose was higher. comparable inhibition of pl aggregation was obtained with 25 nm of scu-pa versus 100 nm of tcu-pa, and the fibrinogen depletion by the end of the 30 min period was 2% and 30% respectively. it is likely that tcu-pa and its precursor have different mechanisms of action on the pl aggregatory function. in a recent study we have shown that recombinant scu-pa (rscu-pa) inhibits platelet (pl) aggregation in prp. to exclude a possible influence of rscu-pa/plasma interference on this process, the aggregation of washed pls was investigated. pls were washed according to a modified mustard method, suspended in buffer and adjusted to 250 x 10^9/l. the resuspended pls were exposed to 5-100 nm of rscu-pa for 30 min at 37°c. at time points 3, 5, 15 and 30 min the aggregation with 0.6 iu/ml of thrombin was measured. it was found that the exposure of pls to rscu-pa (20-100 nm) for 3 min resulted in marked inhibition of their aggregation. after 15-30 min of incubation with 20-50 nm of rscu-pa, however, the inhibitory effect on pl aggregation became less pronounced or even disappeared.
when 5 nm of rscu-pa was used, the inhibition of pl aggregation became significant only by 15 min of the exposure period and did not change up to 30 min of investigation. the observed results may be connected with uptake of rscu-pa by pls from the surrounding buffer as well as with individual variations of pl response to the same concentration of rscu-pa. loss of glycosylation may result in a reduced platelet (p) survival and perhaps altered function. we analyzed the structural and functional effect of specific deglycosylation (combinations of n/o-glycosidase and neuraminidase treatment) of p and isolated p gpib. washed and formaldehyde-fixed p were digested as follows: 1) with neuraminidase (0.125 u/ml) + o-glycosidase (3.1 mu/ml) + n-glycosidase (1.25 u/ml), 2) with neuraminidase alone (0.2 u/ml), 3) with n-glycosidase (2 u/ml) and 4) with neuraminidase (0.2 u/ml) + o-glycosidase (5 mu/ml). all reactions were performed in the presence of protease inhibitors (pmsf, leupeptin, sbti). after washing twice, the p and identically treated controls were analyzed by flow cytometry with the antibodies 6d1 (mab: a-gpib) and 7h2 (mab: a-gpiiia), and the lectins wheat germ agglutinin (wga, for neunac) and peanut agglutinin (pna, for β-d-gal(1-3)-galnac), which confirmed effective and specific deglycosylation by the respective enzymes (but gave only minor differences with 6d1 and 7h2). the botrocetin (b)- and ristocetin (r)-induced agglutinations showed, after treatment 1) (all enzymes), a full inhibition of r-induced agglutination but only a mildly reduced b-induced agglutination (70% of normal). treatments 2) and 3) (neuraminidase alone, and n-glycosidase alone) affected both agglutinations only mildly (70-80% of normal). treatment 4) (o-deglycosylation), however, showed a major inhibition of r-agglutination down to 30%, while b-agglutination interestingly was almost fully retained.
the results of the rotary shadowing electron microscopy of purified gpib suggested a collapse of the normally stretched, glycosylated gpib, not only after the treatment with all three glycosidases, but also after o-deglycosylation alone. we conclude that o-glycosylation is most important for the ristocetin-induced platelet-von willebrand factor interaction and is responsible for the typical stretched shape. the phenomenon of in vitro platelet aggregation and consequent pseudothrombocytopenia (ptcp) in the presence of calcium chelation by na-edta and sodium citrate was studied in blood samples of a patient. initial platelet counts measured electronically were 20000/µl in blood anticoagulated with na-edta and sodium citrate. normal platelet counts were found in heparin-anticoagulated blood and in capillary blood. immunoglobulins of the igg and igm classes were identified in the patient's plasma. on incubation of the patient's serum with platelets of healthy individuals, platelet clumping occurred in the presence of na-edta and sodium citrate but not in the presence of heparin. the platelet membrane glycoproteins (gp) iib/iiia, ix and iiia/vnr β-chain were involved in the antigen-antibody reaction, as demonstrated by specific antibodies and flow cytometry. on the platelet surface, permanent calcium exchange and replacement depend on the external calcium concentration. calcium depletion induced by calcium chelators such as na-edta and sodium citrate might conformationally change platelet surfaces and induce formation of neoantigens. the decrease of gp iib/iiia platelet surface antigen to 10% (normal >75%) indicated the important role of the gp iib/iiia receptor in ptcp. the saliva of triatoma pallidipennis, a triatomine bug, was found to contain a protein called "pallidipin" that specifically inhibits collagen-induced platelet aggregation but not adhesion or shape change.
to investigate the mechanism of action of recombinant pallidipin, its influence on platelet fibrinogen binding after activation by collagen type i in different concentrations was measured by flow cytometry. the same concentrations of pallidipin that completely inhibited collagen-induced platelet aggregation did not cause any inhibitory effect on fibrinogen binding in the prp from the same donor measured at the same time. collagen type i-induced platelet aggregation of cd36-deficient platelets from two unrelated blood donors was inhibited by the same concentration of pallidipin that inhibited aggregation of control platelets. there was no inhibition of collagen-induced fibrinogen binding in the cd36-deficient platelets either. pallidipin did not cause inhibition of collagen-induced membrane expression of cd62 and cd63 in control and cd36-deficient platelets as measured by flow cytometry. however, earlier studies had shown an inhibition of collagen-induced atp and β-tg secretion by pallidipin. therefore we compared the effect of pallidipin in unstirred and stirred prp samples. while pallidipin had no effect in unstirred samples, it showed strong inhibition of β-tg secretion in stirred samples. we therefore conclude that pallidipin does not act on collagen-induced aggregation through cd36 and that the inhibition is a post-fibrinogen-binding event. pallidipin does not influence the first steps in secretion, which are independent of cytoskeleton and platelet-platelet contact, but inhibits the following steps. 17-hydroxy-wortmannin does not inhibit the transport of 1nm-gold labelled fibrinogen in resting platelets. e. morgenstern, b. kehrel and k.j. clemetson medical biology, saarland univ., homburg, germany, haemostasis research, univ. muenster, germany and theodor-kocher-institut, univ. bern, switzerland. wortmannin, an inhibitor of phosphoinositide 3-kinase and of myosin light chain kinase, blocks reactions of the activated platelet.
to obtain information about the role of the contractile cytoskeleton in receptor-mediated transport in resting platelets, the effect of 17-hydroxy-wortmannin (hw) on the endocytosis of fibrinogen from the surface of resting platelets was studied. gel filtered platelets (gfp) were incubated for 10 min at 37°c with hw (3 x 10^-6 m) or with iloprost. controls and gfp preincubated with hw or iloprost were incubated with 1.4nm-gold labelled fibrinogen molecules (fg-au; final concentration 40 µg/ml) at 37°c. the experiments were stopped after 5 or 30 min by rapid freezing. after freeze substitution in acetone with 4% osmium tetroxide, serial sections were prepared. the sections were examined after incubation with ascorbic acid (5% in h2o) for 30 min at 20°c (to reduce metallic osmium) and silver enhancement using danscher's (1981) method (to visualize the fg-au). examination of adp-stimulated platelets in the presence of 40 µg/ml fg-au shows that the ligand is able to mediate aggregation. the examination reveals that fg-au was present in low density on the platelet surface and in higher density in the surface connected system (scs), in coated pits and vesicles and separated smooth vesicles (representing endosomes?), as well as in the matrix of alpha-granules. after 30 min, the number of labelled granules had increased. labels on the surface and on the mentioned cytoplasmic membranes were observed during the whole period of incubation. hw or iloprost did not alter the resting gfp, and the mentioned qualitative ultrastructural findings in both preparations did not differ from the controls. we conclude from the results with hw that the regular contractile function of the cytoskeleton is not necessary to transport the fg-au in resting platelets. methods: edta-anticoagulated whole blood was incubated with thiazole orange and analyzed with a flow cytometer. young platelets were defined by having a high fluorescence from thiazole orange (normalized to platelet size).
platelets were also incubated with fluorescent antibodies to gpib, gp iib/iiia and gmp-140 (two colour method). results: surface expression of gpib was the same in young and older platelets. results for gp iib/iiia and gmp-140 (in resting and activated platelets) will be presented. conclusion: young platelets can easily be detected using thiazole orange and flow cytometry. there is no differential expression for gpib. further results will be presented. the influence of erythrocyte and thrombocyte content on the release of atp by different agents in whole blood specimens was tested. the measurement was performed in the lumi-aggregometer using the principle of the luciferin-luciferase reaction. altogether 39 blood samples were diluted gradually before induction of the release reaction by arachidonic acid (1.25 mmol/l final concentration), adp (30 µmol/l) and collagen (1.0 and 5.0 µg/ml). the peak of the obtained curves was transformed into percent values of the maximal deflection by the undiluted sample (= peak in relation) and into atp concentrations (= absolute peak) after testing the atp standard in parallel for each dilution step separately. the peak in relation increases with increasing dilution for all inducers. it was identical with the atp standard for collagen, somewhat lower with arachidonic acid and much higher with adp. a luminescence-optical effect may influence all these results. the absolute peak decreases with dilution under arachidonic acid and collagen, as expected from the decreasing thrombocyte content of the samples. under induction by adp no decrease of the absolute peaks with increasing dilution of the samples was observed. this can be explained only by liberation of atp from the erythrocytes. the atp standard is essential for the quantification of the release reaction. adp is not suitable for it. collagen at a final concentration of 1 µg/ml proved to be the best inducer.
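the two read-outs used in the atp release abstract above can be written out as a short sketch. it is illustrative only: the function names and numbers are hypothetical, and the single-point linear calibration against the atp standard measured at the same dilution step is an assumption, since the abstract states only that the standard was tested in parallel for each dilution.

```python
def peak_in_relation(peak_diluted: float, peak_undiluted: float) -> float:
    """Peak of a diluted sample as percent of the maximal deflection
    obtained with the undiluted sample."""
    if peak_undiluted <= 0:
        raise ValueError("undiluted peak must be positive")
    return 100.0 * peak_diluted / peak_undiluted


def absolute_peak(peak_sample: float, peak_standard: float, standard_conc: float) -> float:
    """Convert a sample peak into an ATP concentration.

    Assumption: luminescence is proportional to ATP, so the standard
    measured at the same dilution step serves as a one-point calibration.
    The result has the same concentration unit as standard_conc.
    """
    if peak_standard <= 0:
        raise ValueError("standard peak must be positive")
    return peak_sample / peak_standard * standard_conc


# Hypothetical deflections in arbitrary units (not data from the study):
relative = peak_in_relation(30.0, 60.0)
absolute = absolute_peak(30.0, 15.0, 2.0)
```

calibrating each dilution step against its own standard is what separates a true change in released atp from the luminescence-optical effect mentioned in the abstract.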
platelet aggregation induced by several agents has been investigated photometrically in disc-shaped rotating cuvettes coated with vessel wall tissues obtained from human umbilical cord: either endothelium, smooth muscle cells, extracellular matrix or combinations of them. in addition, effects of endothelium incubated with several cytokines on platelet aggregation have been studied. endothelial cells strongly inhibited aggregation depending on their cell count and the concentration of the inducer. smooth muscle cells showed the same effect but much less markedly. in the presence of extracellular matrix spontaneous aggregation occurred. endothelium could inhibit this spontaneous aggregation when present in the same cuvette; smooth muscle cells could not. incubation of endothelium with several cytokines increased its anti-thrombotic properties. for example, at a platelet count of 3 x 10^5/µl in the prp, 10^-6 m adp led to maximal aggregation in uncoated cuvettes; in the presence of 5.5 x 10^6 endothelial cells aggregation was completely abolished, and in the presence of 2.75 x 10^6 cells aggregation was decreased to 40%. smooth muscle cells diminished the aggregation effect of 0.1 nih thrombin to 67% when only one side of the cuvette was coated and to 63% when both sides were coated. endothelium could not inhibit aggregation induced by 2.5 x 10^-6 m adp, but endothelium incubated with 500 u/ml tnf-α or 30 u/ml interleukin-1β or 1 mm l-nitro-arginine for 24 h did completely inhibit aggregation. platelets become sticky and adhere to surfaces or to one another without contracting and secreting. during maturation of megakaryocytes platelets finally lose their genomic nuclear message. only mitochondrial dna of platelets can be identified. we focused our attention on the impact of mitochondrial dna and the mitochondrial transcriptional mechanisms during platelet activation in normals.
materials and methods: leucocyte-free (nageotte chamber, flow cytometric analysis) platelet rich plasma or platelet concentrates after hemapheresis were filtered through pall 100 leucocyte filters. the influence of different anticoagulants (commercially available sarstedt tubes containing citrate, heparin, edta and 500 atu/ml hirudin wacker) was examined. activation was due to a 60 min hemapheresis procedure (3-5fold increase of cd 62, cd 63) and ex vivo stimulation with 4 nih u/ml thrombin, 0.025 m cacl2 or combinations. the guanidinium method for total rna preparation was used according to t. brown: current protocols in molecular biology 4.21-4.9.14, 1991. different primers of the mitochondrial genome (e.g. cytochrome b and atpase) were prepared using pcr, and mitochondrial transcription was examined using the northern blot technique. results: 1. there is less activation of mitochondria using hirudin anticoagulation, but a 2fold increase of mitochondrial rna content in heparinized samples. 2. stimulation with thrombin leads to an increase to 5.5 e-10 rna µg/platelet, compared to 4.7-4.8 e-10 rna µg/platelet under unstimulated conditions. conclusion: there is evidence for the importance of platelet mitochondrial dna and mitochondrial transcription in the regulation of cytoskeleton and platelet activation. thrombospondin-1 (tsp-1) is a large homotrimeric glycoprotein originally identified as a platelet alpha-granule component. the investigation of its putative role in a variety of pathophysiologies like haemostatic disturbance, malignancy and wound healing requires specific laboratory reagents. monoclonal antibodies are one of the most powerful of these reagents. therefore, we purified human tsp-1 from thrombin-stimulated platelets using affinity chromatography to generate monoclonal antibodies in mice. a subclass igg 1 monoclonal antibody designated 48.42 was purified from ascitic fluid and further characterised.
western blot experiments demonstrated that this antibody reacted only with the unreduced molecule whereas the tsp-1 subunit chain was not recognised. no cross-reactivities with human fibrinogen, fibronectin, vitronectin and von willebrand factor were found. preliminary results indicate that the monoclonal antibody 48.42 can be used to investigate tsp-1 function in several assays including immunocytochemistry and cell adhesion, as has been demonstrated for hl-60 cells. in addition, a sandwich enzyme immunoassay was developed using goat anti-human tsp-1 igg and derivatised monoclonal antibody 48.42 (peroxidase, biotin) as a sensitive method for detection of tsp-1 in human body fluids. in the following study the expression of the platelet antigen (cd62p) and the leukocyte antigen (cd11b) were measured in whole blood, in addition to platelet-leukocyte adhesion (rosette formation), by means of multicolour fluorescent labelling (cd45, cd14, cd42a). the measurements were carried out both in freshly drawn whole blood which had been anticoagulated with different agents, and in stirred samples of whole blood under controlled conditions (37°c, 1000 rpm, different stirring times). the results are presented as the percent positive events in each gate (platelets, leukocytes: pmnl, monocytes, lymphocytes; and rosettes: platelet-positive events in the pmnl, monocyte and lymphocyte gates), whose mean fluorescence is given in addition to an index comprising the product of the percent positive events and their mean fluorescence. stirring (max 15 min) induced an increase of cd62p on the platelet surface of ca. 10%, without any change in the mean fluorescence. under these conditions increased cd11b on pmnl and monocytes could be detected.
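the index mentioned above is simply the product of two gate statistics. a minimal sketch follows; the function name and the example numbers are hypothetical, and only the definition (percent positive events times their mean fluorescence) comes from the abstract.

```python
def fluorescence_index(percent_positive: float, mean_fluorescence: float) -> float:
    """Index for one gate: percent positive events multiplied by the
    mean fluorescence of those positive events."""
    if not 0.0 <= percent_positive <= 100.0:
        raise ValueError("percent_positive must lie between 0 and 100")
    if mean_fluorescence < 0:
        raise ValueError("mean_fluorescence must not be negative")
    return percent_positive * mean_fluorescence


# Hypothetical gate statistics (not data from the study):
# more positive events at unchanged mean fluorescence raise the index,
# and so does a brighter signal at an unchanged percentage.
index_before = fluorescence_index(10.0, 5.0)
index_more_events = fluorescence_index(20.0, 5.0)
index_brighter = fluorescence_index(10.0, 10.0)
```

because the index rises in both situations, the percentage and the mean fluorescence still have to be reported separately to tell the two apart.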
an increase in the rosette formation could also be measured (greater index), in that the percent of monocytes which were platelet-positive increased with no change in the mean fluorescence of the positive events, whereas pmnl showed an increased mean fluorescence, but not an increased number, of platelet-positive events. the time-dependent changes in rosette formation on stirring could be further increased by addition of adp. these results show that it is possible to measure rosette formation, and also the influence of effector agents (inhibitors or activators of platelets or leukocytes) on rosette formation, in whole blood using flow cytometry. 17 itp patients undergoing splenectomy were observed 1-30 years after the operation and divided into 2 groups. the first group consisted of 8 patients with normal platelet counts and absence of haemorrhagic syndrome. the second group was formed of 9 itp patients with episodes of thrombocytopenia recovery following a certain time period after splenectomy. in order to study cellular immunity, immunophenotypical investigations of blood samples were carried out using the immunofluorescence method with monoclonal antibodies. an increase of b-cells expressing cd22, cd37 and hla-dr antigen was revealed in the 2nd group. the quantity of srfc, cd3+ and cd5+ cells in the blood of recovered patients was lower than in patients of the first group. this group was also characterized by a statistically significantly increased level of cd4+ cells, while the cd4/cd8 ratio was equal to 1.0 ± 0.3% (0.5 ± 0.1% in patients of the second group, respectively, p>0.05). a relatively high expression of activating antigens in patients with thrombocytopenia recovery after splenectomy was also noted. among infectious complications in all patients observed, various types of throat infection were predominantly found, mainly with unsatisfactory treatment possibilities.
we have observed the opsi syndrome in 2 patients, featuring marked tiredness, breathlessness, intolerance of hard physical work and diminished ability to maintain physical activity. extracellular matrix (ecm) produced by human endothelial cells closely resembles the vascular subendothelial basal lamina in its organization and chemical composition. thus it contains collagens, fibronectin, von willebrand factor, thrombospondin, fibrinogen, vitronectin, laminin and heparan sulphate. platelets carry different receptors on their membrane surface with specific binding capacities for one or more of these extracellular matrix proteins, such as glycoprotein (gp) iibiiia, gp ib/ix and gpiiib. incubation of platelets with ecm results in platelet adhesion, degranulation, prostaglandin synthesis and aggregation. we studied patients whose platelets showed either a receptor defect in gpiibiiia or gpiiib or a storage pool disease. adhesion experiments were performed using siliconised glass, collagen coated surfaces, immobilized fibrinogen as well as human subendothelial matrix. platelets of patients with glanzmann thrombasthenia (receptor defect of gpiibiiia) showed a total lack of binding to siliconised glass and immobilized fibrinogen. adhesion to collagen was almost normal, although only single platelets stuck to the surface and no microaggregates were observed. the adhesion to ecm was diminished and again no aggregates were detected. patients with a receptor defect in gpiiib showed normal platelet adhesion to siliconised glass and immobilized fibrinogen, but binding to collagen and ecm was markedly reduced, while platelets with a storage pool defect stuck to siliconised glass but failed to adhere to ecm. on centrifugation of citrate blood (250 x g, 10 min) erythrocytes and leucocytes go to the bottom, whereas plasma and thrombocytes rise into the upper part of the sample.
the thrombocyte count in the platelet-rich plasma (prp) is therefore about double the platelet count in the whole blood volume. if the thrombocytes are more or less activated, they adhere to erythrocytes and leucocytes or aggregate, and are not able to rise. the quotient of the thrombocyte counts in prp and whole blood is therefore a measure of thrombocyte activation. we checked the value of this screening test in different groups of patients with arterial occlusive disease (aod), chronic venous disease (cvd) and diabetes mellitus (dm), and in healthy control persons (control). the coefficient of variation of the method is 3.7 (prp) and 4.4 (tc), respectively (coulter counter). differences from the control group are significant. changes in the patient groups over 5 years of dispensary follow-up are also significant.
nicardipin-induced immune thrombocytopenia p. eichler 1, c. hinrichs 2, g. greinacher 1; 1. institut für immunologie und transfusionsmedizin, ernst-moritz-arndt-universität greifswald, 2. deister-süntel-klinik, bad münder
drug-dependent immune thrombocytopenias are a rare but clinically important variant of immune thrombocytopenia. patients are at risk of severe bleeding complications. especially in patients receiving multiple drugs, the diagnosis of drug-dependent immune thrombocytopenia is often difficult. we report the case of a 71-year-old male patient who received allopurinol, captopril, digitoxin, furosemid and nicardipin. the patient presented with hematomas (plt. count < 10 g/l) and later developed bone marrow dysplasia. in an elisa using whole platelets and patient serum, a weak reactivity in the presence of furosemid, but a stronger reactivity in the presence of nicardipin (antagonil, ciba-geigy) could be demonstrated. the reaction pattern is given in the
the enzyme-immunological determination of soluble fibrin (sf) proved to be highly sensitive and specific.
this sf-elisa detected fibrin lacking fibrinopeptide a (fpa) via the monoclonal antibody 2t35, specific for the neoepitope generated on the aα-chain after cleavage of fpa. lill et al. recently introduced a new assay modification which uses the same antibody as the old one but takes advantage of a pretreatment of plasma specimens with kscn. this strong chaotropic ion is used to dissociate the various fibrin complexes that may hide fibrin epitopes. the aim of this study, therefore, was to compare the two sf-elisa modifications (with and without kscn pretreatment of specimens). in order to examine the dynamics of thrombin-induced fibrin(ogen) metabolism we made course observations in patients with a certain form of septicemia. the two assay modifications detected fibrin(ogen) derivatives which differed considerably in kinetics (n = 160 samples from 10 courses). the former sf-elisa (no kscn) correlated well with prothrombin fragments, thrombin-antithrombin iii complexes and the release of fibrinopeptide a (r > 0.96, n = 151). results of the new sf-elisa with kscn pretreatment of patients' plasma, however, correlated conspicuously well with d-dimer levels (r > 0.94) but distinctly less with the markers of thrombin generation (-0.12 < r < 0.29). this good correlation with d-dimer levels could not be accounted for, since the d-dimer maximum occurred significantly later than the peak of the markers of thrombin generation (p < 0.05). kscn pretreatment of fibrin specimens therefore seems to change the specificity of the fibrin assay even though the same catching antibody is used. different half-lives of differently composed fibrin complexes should be considered in trying to explain the findings. nevertheless, the results of the former assay without kscn treatment correlated much better with the well-known dynamics of thrombin-induced fibrin generation during hemostasis activation than the data from the new assay modification.
consequently, further examinations are necessary to specify the effect of kscn on soluble fibrin complexes and the resulting assay specificity.
a rapid assay for the determination of the primary hemostasis potential (php) of whole blood has been developed (kundu et al, 1995) from the original method of kratzer and born. the new system employs a disposable test cartridge which holds the sample (citrated whole blood) and all components for the test at the same time. the test procedure is very simple. the cartridge is loaded with ~500 µl citrated whole blood and is inserted into the platelet function analyzer (pfa-100™). the test is started automatically after a preincubation phase of 2.5 min. the reaction starts when the whole blood comes into contact with the capillary, which is connected to a collagen/epinephrine-coated membrane with a small aperture inside the test cartridge. the sample is aspirated under constant negative pressure, and adherence and aggregation begin through the contact of platelets and vwf with collagen. the adhesion and aggregation process leads to the formation of a platelet plug which obstructs the flow through the aperture. the result of the php is reported as closure time (ct). additional parameters such as bleeding volumes are possible as well. first results show good reproducibility, normal values in the range of up to 150 sec, and a good discrimination of healthy donors from patients with congenital or acquired platelet dysfunctions. the system detects aspirin-induced thrombocyte function defects and von willebrand disease. in case of an abnormal result in the collagen/epinephrine system, a second type of cartridge with a collagen/adp coating can be employed. with this cartridge aspirin-induced dysfunctions are normalized in the majority of cases, so that the combination could detect aspirin use. the proposed system may be a valuable tool for routine assessment of the primary hemostasis potential in a laboratory working with routine citrated blood samples.
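the two-cartridge procedure described above for the platelet function analyzer can be sketched as a small decision routine. this is only an illustrative sketch under stated assumptions: the function name and the returned labels are hypothetical, the 150 sec limit for the collagen/epinephrine cartridge is taken from the abstract, and no cut-off for the collagen/adp cartridge is given in the abstract, so it has to be supplied by the user.

```python
from typing import Optional

# Upper limit of normal for the collagen/epinephrine cartridge, taken from the
# abstract ("normal values in the range of up to 150 sec").
COL_EPI_UPPER_S = 150.0


def interpret_closure_times(
    col_epi_ct_s: float,
    col_adp_ct_s: Optional[float] = None,
    col_adp_upper_s: Optional[float] = None,
) -> str:
    """Label one sample from its closure times (hypothetical helper).

    col_epi_ct_s    -- closure time with the collagen/epinephrine cartridge
    col_adp_ct_s    -- closure time with the collagen/ADP cartridge, if run
    col_adp_upper_s -- laboratory cut-off for collagen/ADP (an assumption:
                       the abstract does not state one)
    """
    if col_epi_ct_s <= COL_EPI_UPPER_S:
        return "normal"
    if col_adp_ct_s is None:
        # Abnormal first result: the abstract suggests the second cartridge.
        return "abnormal: run collagen/adp cartridge"
    if col_adp_upper_s is None:
        raise ValueError("col_adp_upper_s must be supplied by the laboratory")
    if col_adp_ct_s <= col_adp_upper_s:
        # Prolonged with epinephrine, normal with ADP.
        return "pattern compatible with aspirin-induced dysfunction"
    return "abnormal in both cartridges"
```

for example, `interpret_closure_times(200.0)` asks for the second cartridge, and a subsequent call with a collagen/adp closure time below the laboratory's own cut-off returns the aspirin-compatible label. the labels describe screening patterns only and are not taken from the original study.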
mental stress was induced in 20 young healthy male volunteers aged 20 to 40 years, with no previous history of thrombophilia or a hemorrhagic diathesis, by a first-time parachute descent from an altitude of 1000 meters. the purpose of this investigation was to find out whether there are any changes in the corpuscular and plasmatic fractions of peripheral blood. we were especially interested in elucidating changes in the procoagulatory and/or fibrinolysis systems. venous blood samples were obtained directly before and directly after the jump. flight time from the departure of the airplane to the landing of the parachutists was approximately 20 minutes. the maximum time that elapsed between the two blood withdrawals was 45 minutes. in a preliminary study with different volunteers, certain fluid imbalances had been observed. absolute numbers of leukocytes (6.9 vs. 9.1/nl), erythrocytes (4.6 vs. 5.1/pl) and platelets (246 vs. 276/nl) increased significantly (p < .001), as did the hemoglobin concentration, from 145 to 156 g/l (p < .018). even though fluid imbalances before and after the jump had practically been excluded by measuring nearly identical hematocrit values (.41 vs. .42), we noticed a marked drop in aptt (27 vs. 23 sec) and a significant increase in factor viii activity. as a direct stress response, we found a rise in the concentration of fibrinogen (2.4 vs. 2.8 g/l), which is one of the shortest acting acute phase proteins. concerning reactive fibrinolysis, d-dimers showed an increase in concentration from 115 µg/l to still normal values of 192 µg/l, which was not significant due to the low number of values (p = .086). we observed similar changes in fibrin monomers and prothrombin fragments f1+2.
from other investigations on the kinetics of the activation of the procoagulatory system we know that maximum activity is not reached until 24 hours after initiation of activation. these investigations studied perioperative changes in different kinds of operations, which served as a control group concerning the degrees of tissue damage and resulting coagulation disturbances. to better understand these phenomena we plan to induce mental stress in a laboratory environment to further exclude unknown influences on the mechanisms which can activate the procoagulatory and fibrinolytic systems.
triodena (t) 30/40/30 µg ee, 50/70/100 µg gestodene) were tested for their effect on hemostatic parameters. three groups (n=20) of healthy female volunteers were treated for 6 months with one of these oc. blood was taken before treatment (day 24-28 of the pretreatment cycle, 0) and on days 18-22 of the 3rd (i) and 6th (ii) treatment cycle. indications of an activation of blood coagulation and fibrinolysis were detected, as the plasma levels of prothrombin fragment f1+2, of the fibrin split product d-dimer and of plasmin-antiplasmin complexes were found to be elevated during treatment. the following main regulatory components of blood coagulation, activators and inhibitors, were investigated: factor vii antigen (fviiag), fvii clotting activity (fviic), circulating activated factor vii (cfviia), antithrombin 3 (at3) activity, total protein s antigen (tps-ag), free protein s antigen (fps-ag), protein s activity (psact) and circulating thrombomodulin (ctm). fviiag, fviic and cfviia increased significantly during treatment; cfviia: 0: c 32.4 mu/ml
a prethrombotic condition characterized by elevated levels of circulating soluble fibrin has been claimed to be a predisposing factor for accumulation of coronary thrombotic material in acute myocardial infarction. the present study includes 161 patients with clinical suspicion of myocardial infarction.
blood samples were drawn by the primary care physician, upon arrival in the hospital, and after 2, 6, 12 and 24 hours of hospital stay. patients with myocardial infarction were identified by the typical course in the 12-lead ecg and by sequential determination of troponin t, myoglobin, ck and ck-mb. patients with primary cpr were excluded from evaluation. soluble fibrin was measured by enzymun®-test fm (boehringer mannheim). patients with acute myocardial infarction display soluble fibrin levels within the normal range (< 5 µg/ml) during the initial two hours after onset of symptoms. there was no significant difference between patients with myocardial infarction and patients with coronary heart disease without myocardial infarction. slightly elevated levels were found in patients with atrial fibrillation, reflecting intracardiac fibrin formation. in patients without fibrinolytic treatment, a slight increase in soluble fibrin levels with a maximum after approximately 8 hours is observed. most patients with fibrinolytic treatment display a considerable increase in soluble fibrin, with maximum levels immediately after infusion of the fibrinolytic agent. four patients with pulmonary embolism showed soluble fibrin levels in the range of 40-300 µg/ml, which remained in the same range during the entire observation period. in conclusion, circulating soluble fibrin is not increased in patients with acute myocardial infarction and does not appear to be a predictor of acute coronary events. high levels of soluble fibrin in patients with fibrinolytic therapy may reflect release of fibrin from thrombotic material, but also de novo generation of fibrin due to release of active thrombin from thrombi not necessarily located in the coronary vessels. detection of elevated levels of soluble fibrin in patients with acute chest pain should result in careful examination for signs of pulmonary embolism or aortic aneurysm.
the possibility of determining activated coagulation factors raises the question whether the data provide evidence of an activated coagulation or fibrinolysis and whether this has a prospective value. we investigated patients with confirmed thrombosis, with postsurgical septicaemia and after liver transplantation. in all patients factor viia, xii, xiia and also the fibrinolytic parameters t-pa, pai-1, pap, plasminogen and α2-ap were determined. in addition, f1+2 and apc resistance with heterozygous factor v leiden mutation and confirmed thrombosis. we found increased factor viia, which was partly accompanied by an increased f1+2. patients with other pathological results, such as a reduced t-pa and/or increased pai-1, showed a low incidence of elevations in factor vii or f1+2. the activation of factor xii seems to be of minor importance in patients with thrombosis. a different picture is found in septic and transplanted patients. obviously factor xii activation is of major importance in this group. a deterioration of the clinical symptoms is correlated with an increased factor xiia, which is paralleled by a decrease in factor xii activity. the investigation of fibrinolysis parameters such as pai-1 and pap demonstrates a disturbance of the fibrinolytic balance. the differences in septicaemic patients, both in the surgical and in the internal medicine group, are statistically significant in contrast to polytrauma patients. in patients with liver transplantation, significant changes are apparently related to rejection of the transplanted organ together with a deterioration of the clinical picture. the possibility of detecting activated coagulation factors may be a tool to detect changes in the hemostasis system at an early stage and to use this for improved therapy.
control of long-term oral anticoagulation is usually performed by serial determinations of the prothrombin time. however, the assessment of effective anticoagulation versus the potential risk of bleeding complications is difficult to achieve.
molecular markers of blood coagulation activation might add valuable information in individual cases. we investigated 48 patients with thromboembolic manifestations (deep vein thrombosis n=22, pulmonary embolism n=13, myocardial infarction n=13) for one year beginning with admission to the hospital. tat, prothrombin fragments f1+2, d-dimer and fibrin monomer concentrations were analysed. all markers were significantly increased at the time of initiation of anticoagulant therapy, thus reflecting a prethrombotic situation. patients suffering from venous thromboembolism demonstrated higher concentrations of tat and f1+2 in comparison to myocardial infarction (34.6 vs 12.3 µg/l, p=0.009; 2.8 vs 1.3 nmol/l, p=0.0025). f1+2, tat and d-dimer concentrations decreased gradually over the first 14 days of anticoagulant therapy, reaching values within the established normal ranges in all cases. f1+2 and tat concentrations reflect the activity of the coagulation system during long-term anticoagulation, whereas analysis of fibrin monomer yielded partly controversial results. we conclude that f1+2 and tat appear to be superior to fibrin monomer for the individual control of oral anticoagulant therapy.
the influence of thyroid failure on haemostasis is controversial. mainly hypocoagulable states have been described in clinically overt hypothyroidism. since hypothyroidism has been associated with an increased risk of atherosclerosis, we studied a wide range of haemostatic factors in untreated female patients with subclinical (b, n=42, age 59±13) or overt (c, n=8, age 55-zcj) hypothyroidism, as well as in hypothyroid women under t4 treatment (d, n=8, age 57±9) and euthyroid controls (a, n=80, age 50±14).
simple screening tests (prothrombin time, activated partial thromboplastin time, fibrinogen), procoagulant factors (fvii, fviii, von willebrand factor), coagulation inhibitors (antithrombin iii, heparin cofactor ii, protein c, protein s) and fibrinolytic factors (plasminogen, antiplasmin, plasminogen activator inhibitor, tissue plasminogen activator) were measured. results: factor vii activity (vii:c), factor vii antigen (vii:ag) and their ratio were found to be increased in hypothyroid patients. factor viii activity showed the same tendency, whereas von willebrand factor remained unchanged, as did all other parameters with the exception of free protein s, which declined in overt hypothyroidism and in t4-treated subjects. these differences tended to diminish after exclusion of 26 women with estrogen replacement therapy for menopause, but the ratio vii:c/vii:ag, as well as fvii:c, still remained significantly higher in hypothyroid patients. conclusions: subclinical and overt hypothyroidism are associated with significantly higher levels of factor vii:c and vii:ag. the disproportionate increase in vii:c compared to vii:ag, as shown by their ratio, might reflect the presence of activated factor vii (viia), which in turn indicates a hypercoagulable state. this pattern becomes more pronounced with concomitant estrogen replacement after menopause.
exocytosis following platelet activation leads to translocation of cd62p (p-selectin), cd63 and thrombospondin from cytoplasmic granules to the cell surface membrane, where these molecules, serving as activation markers, can be detected by flow cytometry. we here report the detectability of these molecules, preformed prior to platelet activation, inside the cytoplasm of resting platelets. two different methods are compared, i.e. using either methanol or the fix&perm kit (an der grub) for cell membrane permeabilization.
in addition, interleukin(il)-ice is shown to be present in platelet cytoplasm after methanol treatment, but not after permeabilization using fix&perm. whenever cell surface positivity for a specific marker coincides with intracellular presence, blocking of the surface membrane sites prior to membrane permeabilization is required in order to obtain fluorescence intensity attributable to cytoplasmic staining. our data demonstrate the feasibility of the methods presented for the detection of intracellular platelet molecules. this technique should also provide a means of estimating the relative quantity of intracellular platelet antigens, provided the permeabilization procedure does not lead to antigen leakage or destruction.
physical exercise activates the clotting as well as the fibrinolytic system, as indicated in numerous investigations of exercise by running and by bicycle ergometer but not by swimming. the positive effect of endurance training in coronary sport groups is also brought about by influences on the hemostatic system. these influences are suppression of the clotting activation caused by acute exercise and an increased fibrinolysis response. different hemostatic parameters were therefore analyzed before and after swimming in male coronary patients (n=33; median age 61 years, achieved heart rate: 68/min). indicating plasmatic clotting activation, there was a significant increase in the molecular markers tat and f1+2 among the coronary patients (tat from 2,1 to 3,4 µg/l; f1+2 from 0,92 to 1,1 nmol/l). the degree of clotting activation among the coronary patients was less than that observed in a group of young volunteers in a former investigation. this must be explained by the existence of coronary heart disease or by the higher age in the patient group. indicating an activation of fibrinolysis, t-pa activity increased significantly in coronary patients (from 0,14 to 0,5 iu/ml), resulting in an unchanged balance between coagulation and fibrinolysis.
from these findings on the hemostatic system, no increased risk to coronary patients from swimming can be derived. prerequisites, however, are precautions such as avoiding exercise in the anaerobic range and excluding major heart failure and cardiac arrhythmias before the beginning of swim training.
the principle of the fontan operation consists in anastomosing the right atrium to the pulmonary artery, thus bypassing the right ventricle and using the only functional single ventricle as a pump for the systemic circulation. there are only few data about the influence of the changes in hemodynamics on coagulation and fibrinolysis. we investigated the coagulation system in 20 children and young adults aged 4 to 21 years in a general examination 4 to 61 months after the fontan procedure. besides other abnormalities of the coagulation system, there were significantly increased values for the thrombin-antithrombin iii complex (tat) in 12 patients (60%). as a marker of an activation of the fibrinolytic system we found elevated plasmin-alpha2-antiplasmin (pap) levels in 14 patients (70%). less frequently, the concentrations of the prothrombin fragments 1 and 2 (f1 and 2) (7 patients, 35%) or of d-dimer (2 patients, 10%) were increased. we did not find significant differences in a clot lysis assay between fontan-operated patients and an age-matched control group. there was no significant correlation between activation of coagulation and the clinical situation or the diameter of the pulmonary artery. whether the present data can help to estimate the risk of a thromboembolic complication following the fontan procedure still has to be investigated. the results of the clot lysis assay suggest that for lysis of thrombi the same dose of rt-pa should be used as for other patients.
a 2nd generation functional protein s assay p. van dreden* and e.
adema** * serbio, gennevilliers france, ** boehringer mannheim, tutzing germany
a second generation protein s test was developed with improved sensitivity to protein s and better reagent stability. the test result was found to be unaffected by apc resistance (10 patients, heterozygous for the mutation, with an aptt + apc ratio between 1.4 and 1.9), heparin up to 2 iu/ml and f viii activity between 1 and 250%. in the test, diluted sample is mixed with protein s deficient plasma, activated factor v, activated protein c, phospholipids and an intrinsic pathway activator. this mixture is incubated for 3 minutes. during this time, the activated protein c inactivates part of the f va. the extent of f va inactivation depends on the protein s concentration. after 3 minutes cacl2 is added and the time until clot formation is measured. the clotting time is a linear function of the protein s concentration between 10 and 140% protein s. for the three preproduction lots the difference in clotting time between 10 and 100% protein s was 43-54 seconds. this compares to 30-40 seconds typically obtained with the old test. within-run precision (n=10 on sta) is cv = 2-7% on the basis of protein s. day-to-day precision (n=10 on sta) was found to be cv = 4-11%, again calculated on the basis of protein s concentration. the cv of 11% was obtained for an avk plasma with 13% protein s; it corresponds to a standard deviation of only 1.5% in protein s. the insensitivity to interferences, in particular apc resistance, and the better precision and stability are expected to improve the quality and reliability of a protein s determination.
in this study we evaluated the effect of hormonal contraception on the parameters protein c, protein s and pai.
samples from 71 women with or without hormonal contraception and in menopause were assayed by coagulometric (protein s clotting test, behringwerke, marburg, frg) or chromogenic methods (protein c activity test and pai reagent from behringwerke, marburg, frg) in double determination and were compared with the reference ranges. in addition, thromboplastin time (thromborel s reagent) and fibrinogen (multifibrin) from behringwerke, marburg, frg, and aptt (actin fs reagent from dade corp., unterschleißheim, frg) were determined. in women using hormonal contraceptives (p<0,01) and in menopause (p<0,05) protein s activity was significantly reduced compared to other women (<45 years), while protein c activity did not change. in menopausal women a higher susceptibility to thrombosis was supported by an increase in aptt (p<0,05) and fibrinogen (p<0,01). while there was no change in pai, plasminogen was significantly lower in women using hormonal contraceptives and in menopause (p<0,05). we could not observe a higher turnover of the coagulation and fibrinolytic systems with hormonal contraception. noteworthy was the occurrence of low (<200 mg/dl) and borderline fibrinogen (max. 220 mg/dl) in 40,9% of women, and in 22,8% of women together with borderline aptt, who had an individual risk for arterial disease.
              protein s     protein c    fibrinogen    aptt         plasminogen
without hcc   109,1±13,6    78,3±14,1    255,0±14,0    36,2±5,4     24,0±12,2
with hcc      85,8±17,2     78,5±12,0    253,8±24,1    35,2±4,8     14,9
menopause     90,3±97,6     87,4±41,4    307,0±57,6    39,8±10,0    12,6
hcc = hormonal contraception
hemostatic parameters in a patient undergoing bone marrow and subsequent liver transplantation due to veno-occlusive disease c. salat 1, e. holler 1,3, h.j. kolb 1,3, b. reinhardt 1, r. pihusch 1, p. göhring 2, s. poley 2, e. hiller 1; 1 = med. klinik iii, 2 = institut für klin.
chemie, klinikum grosshadern der ludwig-maximilians-universität münchen, 3 = hämatologikum der gsf
a 40-year-old patient suffering from all received allogeneic bone marrow transplantation (bmt). after an uncomplicated early posttransplant period the patient was discharged after 4 weeks. a bilirubin rise with subsequent liver failure was observed during the following weeks. because of biopsy-proven hepatic veno-occlusive disease (vod), liver transplantation was performed on day 79. unfortunately the patient died on day 140 due to aspergillosis. we monitored levels of protein c (pc) and s (ps) as well as pai-1 during the pre- and posttransplant period. the pai-1 level was normal (<43 ng/ml) during the first 4 weeks after bmt but increased with the manifestation of vod (317.5 ng/ml on day 47). it reached its peak immediately before liver transplantation (547.6 ng/ml) and returned to normal levels within the next few days. pc levels, which were normal before bmt, decreased prior to the clinical diagnosis of vod and were normal after liver transplantation. ps levels lay within the normal range at all timepoints. vwf was elevated before bmt (240%) and remained relatively stable during the whole investigational period, ranging from 170 to 260%. it is assumed that vod is initiated by an endothelial cell injury, possibly due to radiochemotherapy, and subsequent hypercoagulability. our results indicate that the "endothelial cell marker" vwf is not helpful in predicting vod. the kinetics of the investigated parameters underline the significance of pc and pai-1 as described earlier by others and by our group, whereas ps does not seem to play a role in the pathogenesis of vod.
the budd-chiari syndrome (bcs) is characterized by hepatic venous outflow obstruction that may be caused by the precipitation of a thrombus. it frequently cosegregates with other major diseases such as myeloproliferative diseases or defects in the haemostatic system (e.g. antiprotein c and protein s deficiencies).
only recently, the factor v leiden mutation (fvlm) has also been associated with bcs. we hypothesized that defects in the thrombomodulin-associated anticoagulant pathways (tmaap) are a major risk factor for the precipitation of bcs. we screened our cohort of 27 patients (pts) with bcs for the presence of defects in the tmaap and identified 3 pts with protein s deficiency (psd). these pts were screened for the three point mutations in exon 1 (codon -25; ins t), exon 15 (codon 636; a-->t) and intron 10 (g-->a + 5) of the ps alpha-gene that have been demonstrated by bertina et al to cosegregate with psd. restriction enzyme analysis and conformation-sensitive gel electrophoresis for the detection of single-base differences in double-stranded pcr products were employed. all living family members of the index pts were also screened for heterogeneities in the three point mutations as described. no single abnormality in these genes was found, despite the presence of psd in those family members. in addition, pts and family members were also screened for fvlm. one pt and two of his family members carried fvlm in addition to psd. the other two pts and their family members did not carry fvlm. in contrast to the first family, those two pts suffered, besides psd, from morbus crohn and acute myeloid leukaemia as risk factors for bcs. we conclude: psd is one major risk factor for the precipitation of bcs. to precipitate this disease, one additional risk factor is required. psd may be caused by genomic defects in the protein s gene other than those described by bertina.
only a few publications describe thromboembolic disease due to dramatically reduced protein s levels associated with viral or bacterial infections. autoimmune mechanisms are suspected, but the aetiopathogenesis is still under discussion. we report on a 5-year-old boy who developed purpura fulminans of the left leg during varicella infection.
on the fourth day of infection the disease started with pain and haemorrhagic efflorescence localized at the left calf. on admission the boy suffered from purpura fulminans with central necrosis measuring 15 x 8 cm. suspecting a hereditary thrombophilic disease, we started therapy with protein c concentrate and recombinant tissue-type plasminogen activator. the subsequent coagulation investigation showed a severe deficiency of protein s (total protein s antigen < 5 u/ml, free antigen not measurable) in combination with the factor v leiden mutation. other thrombophilic and coagulation parameters did not deviate from the normal range. after 4 weeks we saw a slight improvement in the total protein s antigen, up to 50 u/ml. the free protein s antigen was still undetectable. during the following weeks the patient recovered slowly and the protein s activity and antigen normalized. because of skin necrosis, thromboembolic prophylaxis was initiated with low molecular weight heparin (fragmin®, 100 iu/kg bw/day) and continued for 6 months. under this therapy there were no further thromboembolic events. these results suggested an autoimmune protein s deficiency in a patient suffering from chickenpox. an analysis of autoantibodies at the time of diagnosis showed a slight increase in anticardiolipin antibodies (igg 16,1 iu/ml, igm 15,1 iu/ml), which normalized during hospitalisation. we suspect an antibody to protein s, probably caused by similarly presented viral antigens. we suppose that autoimmune mechanisms during different infections, in combination with a heterozygous apc resistance, may be a potential risk factor for developing thrombotic disease.
in the central nervous system mrna encoding prothrombin and the thrombin receptor is present, and astroglial cells in culture process and secrete thrombin. moreover, effects of thrombin on brain cells, including changes in neurite outgrowth and astrocyte shape, have been described, but the molecular mechanisms are unclear.
we investigated the effects of human
10 g/l). when compared with conventional elisa techniques (asserachrom ddi), the assay demonstrated a correlation coefficient of 0.97 on 131 samples from normal individuals and hospitalised patients with elevated d-dimer concentrations. the slope was 0.97 and the intercept -0.07. this new assay offers full flexibility for individual testing, as the calibration curve is stable for at least one week on the instrument. it is therefore well adapted to all applications of d-dimer measurement in coagulation laboratories.
16 children aged between 3 days and 11 months (median 6 weeks) with thrombotic or embolic occlusion of major vessels were treated with rt-pa for thrombolysis. the affected vessels were both renal veins, or one renal vein and the v. cava inf., in 8 cases, the v. cava superior in 3, the v. cava inf. plus renal veins plus aorta in 1, the left ventricle in 1, the aorta in 1, the a. femoralis in 1 and the v. portae in 1 case. 10 out of 16 occlusions were associated with an indwelling catheter. underlying diseases were sepsis (4), prematurity (3), vitium (2), asphyxia (1), short bowel syndrome (1), hus (1), diabetes (1), cmv (1), exsiccosis (1) and m. hirschsprung (1). thrombolysis was performed with a bolus of rt-pa (0.1-0.2 mg/kg) followed by continuous infusion (0.8-2.4 (-9) mg/kg/24h, median 1.8 mg/kg/24h). low-dose heparin (100 iu/kg/24h) was given during, and full-dose heparin (aptt 1,5-2 times normal) after, the thrombolysis. in 5 pts. rt-pa was administered locally through the catheter and in 11 cases systemically. in 13 patients the vessels could be recanalised completely, in 2 partially; in 1 patient the therapy had to be discontinued. in 2 vessels a reocclusion occurred. bleeding was noted in three patients, in all cases from recent venous puncture sites. the results encouraged us to start a multi-center trial which has been approved by the ethical committee and is open for recruitment.
the aim is to compare the efficacy and safety of rt-pa with those of urokinase, the only recommended standard in the management of critical major vessel obstruction in newborns and infants. the design is a randomised, non-blinded trial with a cross-over option after three days in cases without success. study end points are recanalisations, major bleedings and number of cross-overs. inclusion criteria are age under 1 year, life-threatening vessel obstruction, age of thrombus up to 10 days, and no preceding fibrinolytic therapy. exclusion criteria are cerebral haemorrhage, periventricular leukomalacia, surgery during the last 7 days and cns injuries during the last 2 months.
although our knowledge of inherited thrombotic coagulation disorders has greatly expanded within the last years, there are still many patients with recurrent venous thrombosis in whom no obvious predisposition can be identified. thus we decided to include so-called rare defects associated with thrombosis, such as fxii deficiency, in our routine thrombophilia screening programme. fxii is an important element in the intrinsic pathway of fibrinolysis and there is evidence for insufficient fibrinolytic activity in fxii-deficient pts. to date only few and controversial data exist on the frequency of fxii deficiency in pts. with thrombophilia. consequently, the aim of our study was to evaluate the association between fxii deficiency and juvenile venous thrombosis in a large population. patients and methods: 1554 pts. (851 female, 703 male, aged 1 to 61 ys, median age 38.2 ys) with venous thromboembolism before the age of 45 ys were studied. a one-stage clotting activity assay of fxii (fxii:c) was performed on an acl using fxii-deficient plasma from instrumentation laboratory. fxii antigen concentration (fxii:ag) was measured by electroimmunodiffusion using reagents from behringwerke and enzyme research, respectively. the normal ranges are the
routine reference values obtained in our laboratory from 80 healthy subjects (40 males, 40 females, median age 26.2 ys; 95% range: fxii:c 53-135%, fxii:ag 57-132%). results: 122/1554 pts. were classified as fxii deficient (f 60, m 62), giving a prevalence of 7.8%. severe fxii deficiency with fxii:c below 1% was observed in 7 pts. 115 pts. proved to have moderate fxii deficiency, with fxii:c ranging from 2 to 51% and fxii:ag ranging from 1 to 53%. in none of them could inherited deficiencies of other well established thrombophilia risk factors be detected. none of the fxii-deficient pts. had positive lupus anticoagulant tests. familial fxii deficiency was found in 9 cases. discussion and conclusion: the prevalence of fxii deficiency among pts. with venous thromboembolism was previously described to be 7.5-10%. supporting these data, we have shown a prevalence of fxii deficiency of 7.8%. in comparison with the frequency of other well established thrombophilia risk factors, we have thus observed a relatively high prevalence of fxii deficiency in our study group. these data, from the largest such study reported, strongly indicate that fxii deficiency may not be a rare deficiency and may be more frequently associated with thrombosis than currently suspected.
we describe a family with an exceptionally rare deficiency, i.e. plasminogen deficiency, combined with subnormal activities of coagulation factor xii (hageman factor). the first thromboembolic event, pulmonary embolism, was diagnosed in the proposita at age 35. since that time, 'spontaneous' venous thromboembolic events verified by phlebography and perfusion/ventilation lung scan recurred once every year despite oral coumarin therapy, whose intensity varied over an exceptionally wide range despite tight control. the patient was repeatedly given successful thrombolytic therapy with streptokinase or recombinant tissue plasminogen activator.
her plasma plasminogen chromogenic activity was 51-59 % compared to a normal plasma pool (reference range 70-130 %); plasminogen antigen was diminished to the same extent. the patient's factor xii exhibited only 28-55 % activity in a factor-deficient plasma assay as compared to a normal plasma pool. other known risk factors for recurrent venous thromboembolism were not present: no evidence of malignancy, no obvious precipitating events, and normal values of antithrombin iii, protein c, protein s, fibrinogen, thrombin time, platelets, lupus-like anticoagulant, and aptt prolongation after addition of activated protein c. the proposita's mother had died at age 60 from pulmonary embolism; no coagulation studies are available. the proposita's sister was first diagnosed with deep leg vein thrombosis at age 17; since that time recurrent episodes of venous thromboembolism have been diagnosed, also in another hospital. this sister's plasminogen activity was 120 %, but factor xii activity was reduced to 55 %. three brothers of the proposita, all in their 3rd decade of life, were examined too. none of them recalled symptoms of or treatment for thromboembolic disease. in one brother, factor xii activity was normal (100-105 %), but plasminogen was only about 50 %. in the 2nd brother, factor xii was very variable (64, 42 and 94 %) and plasminogen was in the lower normal range; in the 3rd brother, factor xii was about 50 % (repeatedly) and plasminogen was normal. current knowledge about the risk of thromboembolism with deficiencies of both enzymes is limited, and the optimal management remains controversial.
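the plasminogen activities quoted for this family can be set against the reference range stated in the abstract (70-130 % of a normal plasma pool). the following is a minimal sketch for that comparison and is not part of the original report: the dictionary layout and the function name `classify` are illustrative, only family members with a numeric plasminogen value in the text are included, and "about 50 %" for the first brother is entered as 50 as an assumption.

```python
# Plasminogen activity (% of a normal plasma pool) as quoted in the abstract.
# The reference range 70-130 % is the one stated in the text.
REFERENCE_RANGE = (70.0, 130.0)

plasminogen_activity = {
    "proposita (lowest value)": 51.0,
    "proposita (highest value)": 59.0,
    "sister": 120.0,
    "brother 1": 50.0,  # "about 50 %" in the text; exact value is an assumption
}


def classify(value: float, reference: tuple = REFERENCE_RANGE) -> str:
    """Label a value as below, within, or above the reference range."""
    low, high = reference
    if value < low:
        return "below reference range"
    if value > high:
        return "above reference range"
    return "within reference range"


classification = {name: classify(v) for name, v in plasminogen_activity.items()}

for name, label in classification.items():
    print(f"{name}: {plasminogen_activity[name]:.0f} % -> {label}")
```

factor xii is left out on purpose, because this abstract does not state a reference range for it.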
margit serban, maria cucuruz, dan madras, carmen petrescu, natalie rosiu, rodica costa. iiird paediatric clinic, university of medicine,
the unsatisfactory efficiency of anti-hepatitis b vaccination in our haemophiliacs prompted us to assess the immune status of 52 hiv-negative patients by determining the lymphocyte subsets (cd3, cd4, cd8, cd4/cd8 ratio and cd19) through flow cytometry with monoclonal antibodies and by measuring serum immunoglobulin levels; the immunological parameters were correlated with the serological markers of hepatitis infections (hav, hbv, hcv and hdv) as well as with the type of treatment (blood, plasma, cryoprecipitate, factor viii/ix concentrate) and the quantity consumed (iu/kg weight/year). the interpretation of the results pointed out a significantly lower level of cd3, cd4 (p20 years (group 3) duration. anticoagulated whole blood was incubated with fluorescent antibodies to gpib and gmp-140 (two-colour method) and analyzed with a flow cytometer. thrombomodulin, f1+2, protein s and β-thromboglobulin were measured according to standard procedures. results: surface expression of gmp-140 was not different in groups 1 to 3; however, there was a tendency to higher activation in group 1 (<10 years iddm). results for thrombomodulin, f1+2, protein s and β-thromboglobulin will also be presented. conclusion: though it did not reach statistical significance, platelet activation seems to be more important during early diabetes. this will be correlated with endothelial and plasmatic activation markers.
in our clinic four patients with hiv-related thrombocytopenia were treated with a lot of gammagard (93f21abllf) which later turned out to be hcv contaminated. before infusion all patients were negative for hcv antibodies and hcv rna.
2 to 8 months after infusion 2/4 patients, who suffered from arc at the time of hcv infection with cd4 counts >100/µl, seroconverted, whereas in the two other patients, who suffered from aids with cd4 counts below 100/µl, there was no seroconversion. in all cases hcv rna was found. genotyping with inno-lipa (innogenetics) showed hcv genotype 1(b) in all patients. liver enzymes and hcv rna copies were measured repeatedly over a period of one year after infection. the 2 patients with arc showed a strong increase of hcv rna titre during the first 3 to 4 months after infection, followed by a rapid decrease within the next months. in the patients with aids, hcv rna copies increased moderately within the first 4 to 6 months, followed by a slow decrease. elevation of liver enzymes was mild in the aids patients and seems to be independent of the hcv rna titre. in the arc patients liver enzymes changed in parallel with hcv rna titres, with a delay of 2 to 3 months. the course of hiv infection was only slightly influenced by the acute hepatitis c, as measured by cd4 counts, β2-microglobulin and hiv rna copies.
introduction: mechanisms underlying ischemia/reperfusion injury have been thoroughly investigated in experimental models. leucocytes appear to play a main role through production of cytokines and overexpression of adhesion molecules. in experimental animals, administration of monoclonal antibodies (mab) recognizing cd18 can reduce organ injury following ischemia/reperfusion. no data, however, have been reported concerning clinical ischemia situations. patients and methods: we investigated expression of cd18, cd11a, cd11b and cd11c in granulocytes, monocytes and lymphocytes from peripheral blood of five patients undergoing elective hand surgery. the tourniquet was applied on the upper arm and heparinized samples from cubital veins were obtained before and at the end of ischemia.
control samples were drawn from the nonischemic contralateral arm with the same timing. duration of ischemia ranged between sixty and one hundred minutes (80±16). whole blood samples were incubated with specific, fluorochrome-labelled antibodies and analyzed by fluorocytometry (facscan, becton dickinson, san jose, ca). mean fluorescence intensity (mfi), quantitatively reflecting surface expression of the indicated markers, was evaluated for the individual cell populations. data were compared by the paired student's t-test; p<0.05 was considered significant. results: mfi for all markers was comparable in all cell populations in samples obtained before ischemia from both arms. in contrast, at the end of ischemia, expression of cd18 was significantly enhanced in granulocytes (321±50 vs. 189±38), monocytes (653±54 vs. 426±122) and lymphocytes (299±45 vs. 228±36) from samples derived from the ischemic arm, as compared with the nonischemic arm. at the same time, an increase of cd11b on granulocytes (500±342 vs. 213±150) and monocytes (533±359 vs. 237±206), but not on lymphocytes, was found. no modifications of cd11a and cd11c expression could be observed. there was no correlation between duration of ischemia and quantitative expression of these markers. conclusions: our data indicate that relatively short ischemia periods induce an increased expression of β2 integrin adhesion molecules on leucocytes. these results suggest, in close similarity with findings from experimental models, that overexpression of adhesion molecules might play an important role in the induction of ischemia/reperfusion injury in humans.
in patients suffering from chronic inflammatory bowel diseases, such as crohn's disease and ulcerative colitis, we observe massive, sometimes barely staunchable bleeding. in these patients a deficiency of coagulation factors in plasma, especially of factor xiii, is established.
however, the influence of factor xiii on the pathomechanism of the underlying disease is still under discussion. therefore we studied the f xiii content in the intestinal mucosa. an immunohistochemical method was developed using commercially available antibodies against f xiii subunit a; the detection of mucosal factor xiii depends on the amount of chromogen bound to the antibody-horseradish-peroxidase complex. with this method, it is possible to locate but not to quantify f xiii in the intestinal tissue. therefore we developed an elisa method for homogenized intestinal tissue, using commercially available antibodies. its precision was validated using a standard curve with a commercially available factor xiii preparation (fibrogammin®). the detection limit of this method is > 0.05 i.u. f xiii/ml of tissue solution. freeze-dried intestinal tissue (1 mg) was homogenized in 1 ml buffer using a potter. specimens of the large bowel revealed f xiii values of 0.21 ± 0.0038 i.u. (x ± sd) in the tissue solution. with this method it is possible to quantify tissue-bound factor xiii. studies are in progress to elucidate the content of f xiii in the intestine of patients suffering from inflammatory bowel diseases in order to contribute data on the pathomechanisms of f xiii deficiency.
in a previous double-blind, controlled trial we were able to show that aprotinin administration contributed significantly to reducing peri- and postoperative bleeding complications without increasing the risk of thromboembolic complications. the question arises whether this beneficial effect may be associated with its effects on intraoperative fibrinolysis.
therefore, 20 patients were treated with or without aprotinin (2 million kiu loading dose over 15 minutes followed by 500,000 kiu per hour), and citrated blood samples were obtained at the following time points: before operation, after induction of anesthesia, at the beginning of operation, intraoperatively when the femur shaft was implanted, and 24 hours postoperatively. the determinations of plasmin/antiplasmin complexes (pap), d-dimers, thrombin/antithrombin iii complexes (tat), and prothrombin fragments 1+2 were performed by means of test kits from behring, germany (enzygnost® pap micro, enzygnost® d-dimer test kit, enzygnost® tat micro and enzygnost® f1+2, respectively). all markers of activated fibrinolysis and blood coagulation were significantly increased in the groups with and without aprotinin treatment, the highest activities being seen when the femur shaft was implanted. however, the values of pap and d-dimers in the aprotinin group were below the values of the control group until the end of operation. the markers of activated coagulation showed the opposite effect; however, the differences between the two groups were not significant. as expected, the aptt was significantly prolonged in the aprotinin group. the aprotinin treatment was also associated with a significantly lower blood loss in these patients. in conclusion, it is not clear whether the blood-saving effect of aprotinin may be attributed exclusively to its antiplasmin activity, since the differences in the fibrinolysis parameters were not statistically significant. further blood samples should be analysed between the implantation of the femur shaft and the end of operation.
in our laboratory large amounts of human prothrombin are required (30-50 mg/week). as we try to produce meizothrombin and meizothrombin-des-fragment-1 from human prothrombin and to apply it as an antidote for hirudin, the classical adsorption to barium sulphate or aluminum hydroxide from human plasma cannot be used.
commercially available human prothrombin is expensive and of an unacceptable quality for our applications. in most of these batches we found small amounts of factor x and prothrombin activation products. we have now developed a procedure to isolate prothrombin from "prothrombin complex concentrates" (ppsb-250-bulk, drk-blutspendedienst nds.). the concentrate also contains factor vii, factor ix, and factor x, from which the prothrombin had to be separated. the concentrate we used contained amounts of other proteins and activation products of prothrombin (e.g. prethrombin-1) as well. for the preparation of prothrombin from ppsb we used anion exchange chromatography (resource-q®) on an fplc®. we applied dissolved ppsb, directly or after buffer exchange on sephadex g-25, onto the column at room temperature. the prothrombin was eluted with an nacl gradient in trisodium citrate buffer, ph 7.0. the buffer conditions are similar to the conditions used in the preparation of ppsb. the quality of the prothrombin so obtained was sufficient for most of our experiments. a second purification step on ion exchange resulted in a 99% pure product devoid of contaminating factor activities and activation intermediates, as examined with coomassie- and silver-stained sds-page electrophoresis and assays for factor x. this prothrombin retained full enzymatic activity, and its activation by specific snake venom prothrombin activators yielded the known activation products. we are now able to isolate the amounts of pure prothrombin required for preclinical investigations.
most of the commercially available lmwhs, such as enoxaparin, fraxiparin, and fragmin, are prepared by chemical methods which can result in desulfation and other chemical modifications of the internal structure, leading to differences in the pharmacologic effects. on the other hand, fractionated lmwhs retain their native characteristics and are structurally similar to heparin.
in addition, the oligosaccharide sequence responsible for atiii binding is not modified. physical methods such as gamma irradiation (60co) have been used to fragment sulfated glycosaminoglycans, yielding fragments without chemical modifications (deambrosi et al., in: biomedical and biotechnological advances in industrial polysaccharides, pp. 45-53). utilizing this technique, depolymerized heparins exhibiting different molecular weights can be obtained. this communication reports on the biochemical and pharmacologic effects of several such depolymerized heparins to demonstrate the molecular weight dependence of biologic activity. fragments exhibiting molecular weights of 5, 7, 8, and 9 kda were prepared by exposing concentrated heparin solutions to a rectilinear gamma ray beam at intermittent doses of 2.5 to 25 mrad under controlled temperatures. unlike the chemically depolymerized heparins, these fractions did not exhibit any decrease in charge density or atiii affinity. in routine assays for heparin, a clear-cut molecular weight dependence of the anticoagulant and antiprotease actions was observed. on a gravimetric basis, these agents produce superior antithrombotic actions in comparison to chemically depolymerized derivatives. these studies suggest that gamma irradiation can be used to prepare lmwhs which retain their molecular integrity and therefore may prove to exhibit a biologic profile more comparable to that of heparin. furthermore, lmwhs produced by gamma irradiation lack the usual double bond formation, which requires the use of additives that can alter the product profile.
university hospital, dept. of angiology, frankfurt a.m., germany
introduction: thromboembolic disease constitutes a major clinical problem and, among others, a defective fibrinolytic system has been suggested as a predisposing factor for the development of thrombosis.
the plasma fibrinolytic system can be impaired by inherited deficiencies of plasminogen, by defective release of tissue plasminogen activator (t-pa) from the vessel wall, or by high plasma levels of regulatory proteins, such as plasminogen activator inhibitors (pai). the aim of the present study was to estimate the prevalence of decreased fibrinolytic activity in young pts. with thrombophilia. patients: a large population of 884 pts. (female 478, male 406; age 21-61 ys, median 39.8 ys) with venous thromboembolism before the age of 45 years was investigated with regard to the plasma fibrinolytic system. in none of them had well established thrombophilia risk factors been identified previously. methods: plasminogen (behringwerke), pai-1 activity (chromogenic assay, biopool), pai-1 antigen concentration (elisa, biopool), and t-pa activity (chromogenic assay, biopool) and antigen concentration (elisa, biopool) were measured before and after venous occlusion (vo). vo was performed ≥ 12 months after the last thromboembolic episode. 24 healthy subjects (median age 24.7 ys) served as controls. results: 24 pts. (2.7%) were classified as plasminogen deficient (activity and antigen). 142 pts. (16%) had significantly elevated levels of pai activity (up to 120 u/ml) and pai antigen (up to 90 ng/ml). none of the pts. with high pai levels had laboratory signs of an acute phase reaction. low t-pa activity could be demonstrated and confirmed in 121 pts., corresponding to a prevalence of 13.6% (range: 0-2.7 u/ml; reference limits: 2.8-21.8 u/ml). however, there was a significant negative correlation between t-pa activity and pai values. in 67 pts. (55.4%) the low t-pa activity was associated with increased pai levels, whereas the t-pa antigen concentration was normal. a parallel reduction of t-pa activity and t-pa antigen (range: 0.35-3.5 ng/ml; reference limits: 3.6-21.0 ng/ml) was determined repeatedly in 54 pts. (f 23, m 31, median age 39 ys).
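as an arithmetic check, the percentages in these results can be recomputed from the patient counts given in the abstract (884 patients in total, 121 of them with low t-pa activity). the sketch below is illustrative only: the variable names are ours, and rounding to one decimal place is an assumption, so small differences from the printed values (for example 13.7% computed versus 13.6% reported) may simply reflect rounding or truncation in the original.

```python
# Patient counts as stated in the abstract; names are illustrative only.
TOTAL_PATIENTS = 884
LOW_TPA_ACTIVITY = 121

# Subgroups expressed as a share of the whole study population.
counts_of_total = {
    "plasminogen deficiency": 24,
    "elevated pai activity and antigen": 142,
    "low t-pa activity": LOW_TPA_ACTIVITY,
    "low t-pa activity and antigen": 54,
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


prevalence = {name: percent(n, TOTAL_PATIENTS) for name, n in counts_of_total.items()}

# Share of the low t-pa activity group that also had increased pai levels.
pai_share_of_low_tpa = percent(67, LOW_TPA_ACTIVITY)

for name, value in prevalence.items():
    print(f"{name}: {value}% of {TOTAL_PATIENTS} pts.")
print(f"increased pai among pts. with low t-pa activity: {pai_share_of_low_tpa}%")
```

the last subgroup (parallel reduction of t-pa activity and antigen in 54 of 884 patients) gives the 6.1% quoted for defective t-pa release.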
thus, the prevalence of a defective t-pa release was 6.1% in our study group. conclusion: in comparison with the frequency of inherited deficiencies of other well established thrombophilia risk factors, we have observed a relatively high prevalence of diminished t-pa activity and of elevated pai, respectively, in our study group. our data strongly indicate that, besides t-pa and pai activity, the antigen concentration of both parameters should be determined in pts. with thrombophilia.
the antithrombotic and anticoagulant effect of the supersulfated low molecular weight heparin ssh 14 was studied after i.v. and s.c. administration in rats. thrombus formation in the jugular vein was induced by i.v. injection of activated human serum followed by stasis for 20 min and was assessed by a thrombus score ranging from 0 (no thrombus formation) to 3 (complete thrombus formation). ssh 14 injected either 10 min (i.v.) or 30 min (s.c.) before thrombus induction caused a dose-dependent antithrombotic effect in a range from 0.25 to 2 mg/kg i.v. and 1 to 4 mg/kg s.c. there were clear differences in the antithrombotic effectiveness between female and male animals, i.e. in female rats antithrombotically effective doses were lower than in male rats (ed50 after i.v. injection in females 0.35 mg/kg, in males 0.9 mg/kg). the sex differences were confirmed in studies on the time course of the antithrombotic effect. after injection of fully effective doses (2 mg/kg i.v. and 4 mg/kg s.c., resp.) the antithrombotic effect disappeared after 8 h in female and after 4 h in male rats. for studies on the anticoagulant action, blood was drawn from the femoral artery and, after centrifugation, global clotting assays were performed in plasma. similar to its antithrombotic action, ssh 14 also caused dose- and sex-dependent anticoagulant effects. the most sensitive assays were the aptt and the heptest; thrombin time and prothrombin time were less or not influenced by ssh 14.
in conclusion, ssh 14 was found to be an effective anticoagulant and antithrombotic agent in experimental studies in rats. at present there is no explanation for the clear sex differences found in this species.
venous thromboembolic disease is the most frequent complication in patients undergoing total knee replacement therapy. patients and methods: after informed consent, 3x30 patients were included in an open randomized clinical study and the incidence of venous thromboembolism was examined using different regimens for heparin prophylaxis (30 patients received fraxiparin 36 mg once daily, 30 patients clexane 40 once daily and 30 patients 7500 u calciparin twice daily). there were no differences between the groups concerning age, sex, body weight, risk factors, surgeons, decrease in hemoglobin, and requirements for blood products. before surgery, on day 1 and on day 5-7 phlebograms were performed, and tat, dimers and f1+2 prothrombin fragments were also examined. results: 1. dvt occurred in 26 patients (28.9%): in 5/30 patients under calciparin prophylaxis, 8/30 patients under fraxiparin and 13/30 patients under clexane treatment. 2. dimers (3.4%) and tat (24%) had low specificity for detecting a dvt in these special patients undergoing knee replacement therapy; f1+2 fragments were elevated in the dvt group at t1 and t2 vs. the patients without dvt (t1 dvt: 3.24 ± 1.8 vs. 1.6 ± 0.3; p = 0.0042). 3. only 8/26 patients (31%) with dvt had clinical signs of thrombosis. conclusions: 1. there is an increase of thrombin generation, measured by tat and dimers, after knee replacement therapy. further studies with more patients are necessary to confirm that f1+2 prothrombin fragments can discriminate between patients with and without dvt from a clinician's point of view. 2. phlebographically confirmed dvt in almost 30% of our patients demonstrates the high thromboembolic risk in these patients.
von willebrand's disease (vwd) type 2 is characterized by absence of high molecular weight multimers.
qualitative changes in the structure of the molecule might be associated with enhanced binding of von willebrand factor (vwf) to platelet glycoprotein ib. therefore in some patients vwd type 2 is associated with severe thrombocytopenia. here, we report on a 9-year-old boy who presented with severe purpura and platelet counts of about 20000/µl at the age of 2 years. thrombocytopenia did not respond to corticosteroids. a normalized platelet count of short duration was observed after high-dose immunoglobulins. in addition, an increase of platelets was seen after anti-d treatment. thus, although platelet-associated antibodies were not detected, thrombocytopenia seemed to be caused by an autoimmune mechanism. despite platelet counts above 50000/µl, the patient experienced severe bleeding with a significant decrease of hemoglobin levels. therefore, he needed several transfusions. coagulation analysis revealed vwd. application of ddavp led to a normalization of partial thromboplastin time (ptt) and an increase of factor viii with subsequent cessation of bleeding symptoms. recently, vwd was typed as type 2 by lack of high molecular weight multimers. in conclusion, we report a case of vwd type 2 responding to ddavp. however, it is unclear whether thrombocytopenia is part of the vwd type 2 or of autoimmune origin. since autoimmune antibodies have not been detected, the effect of immunoglobulin treatment might be explained by blockade of enhanced binding of vwf to glycoprotein ib.
von willebrand disease (vwd), with a prevalence of 0.8% (ruggeri 1994, rodeghiero 1987), seems to be the most frequent inherited hemostatic disorder.
the diagnostic criteria for vwd are clinical picture, family history and laboratory findings: bleeding time, partial thromboplastin time (ptt), level of factor viii:c, vwf, vwf:ag, ristocetin-induced platelet aggregation (ripa) and multimer analysis. the diagnosis of vwd is occasionally difficult, especially in early childhood, because the laboratory data may vary with the time of investigation and abnormalities may not be present in all sub-types. the aim of this study was the evaluation of the diagnostic approach to vwd in childhood and of the diagnostic reliability of all available laboratory tests. all previously mentioned laboratory tests were done on our own material (51 children who satisfied all criteria for vwd, 23 boys and 28 girls, 1-9 years old) except multimer analysis, which was unavailable in some cases. the majority of laboratory tests proved to be highly specific and necessary for diagnosis. however, the diagnostic reliability of fviii:c and adhesion of platelets is much lower in mild cases in comparison with the total sample, while ptt is not a valid test. the most specific screening test for vwd is vwf, whose diagnostic reliability is almost 1.00. the optimal strategy to establish a general diagnosis of mild forms of vwd is the use of vwf and vwf:ag, plus ripa if necessary, and multimer analysis to classify variant types.
we report on a new multimeric structural defect of vwf detected in a german family (two sisters and their three children): all members of the family who presented to our outpatient clinic had an increased spontaneous bleeding tendency (moderate or strong hematoma, epistaxis, menorrhagia). prolonged bleeding could be observed after surgical procedures (adenotomy, tooth extraction) and after trauma (laceration). wound healing was impaired in two cases. clotting assays showed a slightly prolonged aptt and a mild decrease of f viii:c, vwf:ag and vwf:rcof levels. collagen binding activity was within normal ranges.
bleeding time (simplate i) was slightly prolonged. the analysis of the multimeric structure in plasma showed quantitative and qualitative abnormalities: all multimers were detectable; the structure of vwf was reproducibly abnormal in all family members, so that the defect must be of genetic origin. the platelet vwf showed neither qualitative nor quantitative alterations. minirin® (ddavp) was administered as a test dose of 0.3 µg/kg bw in 100 ml 0.9% nacl solution i.v. to evaluate efficacy and tolerance: clotting assays showed normalization of aptt, f viii:c, vwf:ag and vwf:rcof in plasma and shortening of bleeding time in three cases. an insufficient rise of vwf:ag and vwf:rcof levels was observed in one case. one patient had no rise of f viii:c but a corrected bleeding time. multimeric analysis showed no structural change. the administration of ddavp was well tolerated in all cases. the existence of all multimers in plasma and the normal collagen binding activity suggest that the structural abnormalities of vwf in this family do not cause functional defects, so that the defect could be classified as a type i vwd. the response to ddavp was only partially effective.
mild von willebrand disease (vwd) is by far the most frequent congenital bleeding tendency. its diagnosis is very helpful in the pre-operative check-up in order to avoid bleeding complications during surgery. following post-operative periods or monitoring the management of haemorrhagic episodes in vwd patients is also strongly recommended. current methods involve complex technologies, are time consuming and require large series. these assays lack the expected flexibility for rapid individual testing in patients. a new and flexible assay which works on the fully automatic walk-away coagulation instrument, sta, has been developed for these applications (liatest vwf). the technology is an immuno-turbidimetric method using microlatex particles coated with rabbit polyclonal antibodies specific for vwf.
the assay has a dynamic range from 2 to 420% von willebrand factor (vwf) concentration, works with a 2-fold dilution of tested plasma (50 µl) and offers a calibration established with the nibsc international standard. the total assay time is less than 10 minutes and the detection threshold is 2%. there is no prozone effect up to concentrations higher than 1,000% vwf. intra-assay reproducibility is < 4% and inter-assay reproducibility < 5%. in dilution studies a mean recovery of 98% was obtained. in a study on 55 plasma samples from normal individuals, patients with high vwf concentrations, and patients with vwd, comparison with the elisa technique demonstrated a correlation coefficient of 0.997 with a slope of 0.978 and an intercept of 3.30. in the low assay range, too, good agreement with the elisa was obtained. we conclude that liatest vwf is a reliable, flexible, sensitive, and rapid automated assay which fits well the vwf assay applications in coagulation laboratories.
fibrinolysis, the process during which the active enzyme plasmin is generated in a regulated and localised way, is, in a classical understanding, responsible for the dissolution of blood clots formed in a vessel. for this activity, t-pa is generally assumed to be the most important plasminogen activator, and its activity is regulated by enzyme kinetic mechanisms dependent on the presence of fibrin. against this background t-pa is used for thrombolytic therapy with great success. however, data from t-pa knockout mice indicate that t-pa might not be responsible for inhibiting the spontaneous development of intravascular thrombi but only for dissolution of fibrin formed upon a coagulation challenge. in contrast, u-pa, generally assumed to be important for extravascular proteolytic activity on activated or tumour cells, seems to lead to the development of spontaneous fibrin formation in a mouse knockout model.
on the other hand, the major plasminogen activator inhibitor pai-1 seems not only to regulate intravascular fibrinolysis but also to be important for the progression of vascular diseases (neointima formation is, e.g., increased in a pai-1 knockout model, but increased levels of pai-1 seem to predict reocclusion after angioplasty). in addition to their functioning as enzymes and inhibitors, components of the fibrinolytic system seem also to be involved in signalling processes in tumour and other cells. the u-pa/u-pa-receptor system could be shown to function as a chemotactic system and to elicit a migratory and mitogenic response in monocytes and tumour cells as well as in vascular cells. for such a response, activation of tyrosine kinases of the src family might be responsible in some cell lines, but other signal transduction pathways, e.g. involving caveolae and stat proteins, cannot be excluded. there seems to be a further important role of components of the fibrinolytic system which involves serine protease inhibitors (serpins): serpins have homologies to hormone binding proteins, and cleavage of serpins by their target enzymes leads not only to inactivation of the enzyme but also to a possible release of bound hormones from the serpins. from these data it is clear that the relevance of any regulation of the fibrinolytic system depends on the specific function of the system to be dealt with. in addition to "fibrin binding", "receptor mediated" and "genetic control" (e.g. 4g vs. 5g in the pai-1 promoter), "signal transduction" and "hormone delivery" are also distinct functions of the system with specific regulation.
for both healthy persons and patients with angina pectoris, it could be shown that increased values of plasma fibrinogen, factor viic and vwf:ag are significantly associated with the risk of suffering an acute myocardial infarction or sudden cardiac death. the same holds for tpa:ag.
however, a group analysis in quintiles reveals that particularly low tpa:ag values are connected with a particularly low coronary risk. unexpectedly, the acute phase protein crp is also positively associated with increased coronary risk. for clinical purposes these factors have already been included in coronary risk scores in order to improve individual risk prediction in combination with lipids and other risk factors. the pathophysiological significance of these observations remains in dispute. four pathways are discussed:
1. the assumption that increased plasma values of those factors indicate increased coagulation activity could so far not be established in prospective studies.
2. both vwf:ag and tpa:ag are produced in endothelial cells. an increase in their plasma level could therefore indicate increased endothelial cell function, which accompanies progressive atheromatosis. the risk association of the two acute phase proteins crp and fibrinogen could be interpreted analogously.
3. first prospective studies favour the assumption of a genetic determination of an increased production of coagulation proteins in persons at particular coronary risk. it could also be shown that there is a certain association between the gene polymorphism for the α- and β-fibrinogen chains and coronary risk.
4. even slightly elevated concentrations of fibrinogen and/or vwf:ag may influence the quality of a coronary thrombus, both by increased physical stability and by reduced susceptibility to fibrinolysis. this could mean that under these conditions an early coronary clot could more readily develop into a stable, occlusive thrombus.
a newborn with pronounced bleeding tendency had a prothrombin (prth) deficiency, with activity below 2.8% in a clotting assay. the parents had activities of 71% and 69%, respectively. however, the immunological determination of prth by elisa revealed normal concentrations in all family members (93%-101%).
furthermore, thrombin generation as investigated by a chromogenic assay using ecarin for activation of prth was normal as well. activation of prth by fxa was investigated by recalcification of the plasma samples, which were further analyzed for prth and the derivatives produced. although clotting times were still different, normal levels of f1+2 and tat were finally generated, as determined by elisa. western blot analysis using polyclonal (rabbit) antibodies to prth and a monoclonal antibody specific to human thrombin revealed different patterns of prth degradation products. tat was only weakly visible in the serum of the mother and nearly absent in the child. the mobility of prothrombin and thrombin was different compared to normals, indicating a lower molecular weight. after reduction of disulfide bridges a higher molecular weight of thrombin was observed compared to normals, indicating an insufficient cleavage of prth and formation of prethrombin 2. these observations suggest that prothrombin marburg is a deletion mutant lacking the cleavage region arg320-ile321. upon cleavage by factor xa only prethrombin 2 is formed, with liberation of f1+2. this prethrombin 2 is able to cleave chromogenic substrates in the ecarin assay. prethrombin 2 probably forms a complex with atiii which is detected by elisa but is unstable under denaturing conditions such as in the western blot.
as a major complication of haemophilia a treatment, up to 30% of severely affected patients develop antibodies to substituted factor viii. investigating 133 patients and considering the data of a further 231 patients from the haemophilia database, we could show that the risk of inhibitor development depends on the patient's mutation type. patients with more severe gene defects, such as intron 22 inversions, stop mutations or large deletions, had a risk of about 35% for inhibitor development, which was about 7 times higher than for missense mutations or small deletions.
besides the influence of mutation type, we investigated other parameters, e.g. immune response genes (hla genotype) and clinical aspects (treatment onset and frequency, type of concentrate), that might also affect inhibitor formation. to exclude any effect of mutation type, we focussed on 72 patients with an intron 22 inversion. hla typing showed that some hla alleles (dqb0602, bt) occurred more often and others (dqa103, dqb603, dr13, c2) less frequently in inhibitor patients. treatment onset, frequency and type of concentrate apparently do not affect inhibitor incidence. the results presented here show that inhibitor development is considerably influenced by the mutation type. this supports the hypothesis that patients with severe molecular defects have no endogenous factor viii protein and that substituted factor viii represents a foreign protein, leading to an immune response, e.g. the production of alloantibodies. in addition, the immune response seems to be modified by the hla genotype. however, our findings (in terms of genotype and treatment parameters) can explain only part of the inhibitor pathogenesis. it is still unresolved why substituted factor viii does not lead to a recognizable immune response in 2/3 of the patients with severe molecular factor viii gene defects. consequently other factors, probably concerning the antenatal phase, must be involved.
viia in the treatment of patients with inhibitors against factor viii or ix: a german/swiss/austrian multicenter trial
d. ellbrück*, i. scharrer**, j. dethling***, and the rfviia study group
*section haemostaseology, university ulm **dept. of angiology, jwg-university hospital frankfurt a.m. ***novo nordisk, mainz
administration of activated recombinant factor vii (rfviia) can by-pass the fviii/fix pathway and offers an alternative treatment for patients with antibodies (inhibitors) against these factors.
from november 1994 to october 1995, a total of 25 bleeding episodes and 10 surgical interventions in 18 patients were treated with rfviia in a phase iiib multicenter trial. diagnosis was hemophilia a (n=15) or b (n=1) with inhibitor, and acquired inhibitor against factor viii (n=2). various serious bleeds, from complicated joint and gingival bleeds to life-threatening psoas bleeds, were treated. operations were tooth extractions, radiosynovectomy, implantation and explantation of port-a-caths and one adenotomy. the dose regimen was 90-120 µg/kg bw every two to three hours until clinical improvement, with subsequent dose reduction. results: for bleeding episodes, response to rfviia after 24 hours was effective in 72%, partially effective in 12%, ineffective in 12% and not evaluable in 1 (4%) of the patients. two of the three treatment failures were associated with very long dosage intervals of rfviia. the third patient was in a critical situation with artificial high pressure respiration and polytransfusion because of a hematothorax, and suffered a terminal intracerebral bleed. the efficacy of rfviia for surgery was very good. response to treatment was independent of antibody titer. no signs of dic or activation of coagulation were noted. conclusion: in our experience, rfviia is an efficient and safe treatment for inhibitor patients with acute bleeding episodes. it should be investigated whether rfviia can also be an alternative treatment in the home treatment situation.
successful immune tolerance therapy of f viii inhibitor in children after changing from high to intermediate purity f viii concentrate
w. kreuz, j. joseph-steiner, d. mentzer, g. auerswald*, t. beeg, s. becker
zentrum der kinderheilkunde, j. w. goethe-universität, frankfurt am main *professor hess kinderklinik bremen
introduction: inhibitor to f viii is the most severe complication in the treatment of patients with haemophilia a.
the incidence of f viii inhibitors is estimated to range between 15 and 33%. several authors have reported that immune tolerance to f viii inhibitors can be induced by immune tolerance therapy (itr) with high dose f viii concentrate. objective: this presentation shows data from four children with haemophilia a and f viii inhibitor (high responders) who had an unsuccessful itr with high dose f viii concentrate (high purity) in the first step. the f viii concentrate was changed to an intermediate purity product (haemate hs®) in the subsequent course of itr. all patients received bleeding prophylaxis with an activated prothrombin complex concentrate (feiba®). results: median age was 13 (9-18) months when the inhibitor was first detected. in all four patients the f viii inhibitor titre increased under immune tolerance treatment with f viii concentrate (high purity) in the first step of therapy. after changing to the f viii concentrate of intermediate purity, the inhibitor titres decreased continuously, after a rebooster effect, to 0 be within months. median duration of f viii inhibitor elimination (until first testing of 0 be) was 3 (2-5) months. in all patients the f viii inhibitor was successfully eliminated. to date all patients are under prophylactic treatment with f viii concentrate and have had no positive inhibitor test since. median observation time since the first testing of 0 be is 14 (4-60) months. conclusion: different studies of immune tolerance treatment have been successful with f viii concentrates of different purity. according to our experience in the four patients presented, we assume that it is probably not the purity of the f viii concentrate that is important for the induction of immune tolerance, but rather the type of f viii presentation in the concentrate used. the preparation used (haemate hs®) is an f viii concentrate with a high concentration of vwf, which is known to be important for the protection of f viii against degradation by proteases.
this may be a mechanism for prolonged antigen presentation to the immune system and thus may have a positive impact on the outcome of itr. large-scale trials are needed to prove the above assumptions.
glanzmann thrombasthenia is a disease affecting platelet function because of a partial or total lack of glycoprotein (gp) iib-iiia expression or a modification of this complex. since the receptor dysfunction is accompanied by reduced or absent platelet aggregation and adhesion, it causes bleeding complications in case of injury. here we report on a 60-year-old woman who had suffered since early childhood from a severe bleeding disorder. life-threatening bleeding complications occurred after tooth extraction and after abdominal surgery. analysis of the patient's platelets revealed a normal platelet count, whereas platelet volume was increased (11 fl). clot retraction was diminished to 17%. platelet adhesion to siliconised glass and human subendothelial matrix was reduced, as was the spreading of the platelets. adp (1 µm) induced platelet aggregation was inhibited, while collagen-, ristocetin- and thrombin-induced aggregation was normal. crossed immunoelectrophoresis resulted in an atypical peak of gpiib-iiia with reduced electrophoretic mobility. in the electroimmunoassay according to laurell, 14% of gpiib-iiia was detected. moreover, we observed markedly diminished 125i-fibrinogen binding. sequence analysis of the gpiib and gpiiia cdna after pcr amplification revealed a g2508 → a transition in gpiib, substituting gly805 → glu. the structure/function relationship of this mutation has still to be investigated.
we report two new abnormal fibrinogen variants, denoted as bern iv and milano xi, both having an exchange of arginine to histidine in position 16 of the aα-chain.
routine coagulation studies revealed prolonged thrombin and reptilase clotting times, and low plasma fibrinogen concentrations determined by a functional assay but normal fibrinogen levels measured by the immunological assay. the onset of turbidity increase following addition of thrombin to purified fibrinogen was markedly delayed in both variants. release of fibrinopeptide b by thrombin, measured by reversed phase hplc, was normal, whereas only half the normal amount of fibrinopeptide a was released. in addition to normal fibrinopeptide a, an abnormal fibrinopeptide a* was cleaved from both dysfunctional fibrinogens. the structural defect was determined by asymmetric pcr and direct sequencing of a gene fragment coding for the nh2-terminus of the aα-chain. both variants were found to be heterozygous for the transition g to a at nucleotide position 1203, leading to the substitution aα16 arg → his, resulting in delayed fibrin polymerization. the simple assay permits detection of the most common amino acid substitutions occurring in the nh2-terminus of the aα-chain of functionally abnormal fibrinogen variants.
protein c inhibitor (pci), a member of the serpin family, is also known as plasminogen activator inhibitor-3 (pai-3). pci was first described as a component of human plasma, regulating the activity of activated protein c and other serine proteases of the human coagulation and fibrinolysis system. since then pci has also been found to be present in extra-plasmatic systems. high concentrations of pci were detected in human seminal plasma, suggesting a role for pci in human fertility. significant concentrations of pci mrna and antigen were located in lysosomes of proximal tubular kidney cells, suggesting an intracellular function for pci in this environment. in this study we present evidence that pci is also present in human pancreas. rna from human pancreas was reverse transcribed and pcr amplified. the resulting pci cdna was identical with pci cdna from human liver.
~p labeled antisense rna probes used in in situ hybridization experiments with human pancreas tissue sections showed that pci rna was located in the acinar cells. pancreatic fluid was analyzed by sds-page and immunoblotting. using monospecific antibodies directed against human plasma pci, a 57 000 mw protein band was observed which comigrated with purified human plasma pci. our results show that pancreas cells contain a significant concentration of pci mrna. this message is localized in the secretory acinar cells. we therefore conclude that pci antigen found in pancreatic fluid is likely to originate in the pancreas. the role of pancreatic pci is unknown at present. however, since thrombosis and systemic hypercoagulable states are known complications of pancreatic diseases, our results, together with in vitro experiments by others showing that pci can inhibit pancreatic enzymes such as chymotrypsin and trypsin, indicate that pci may be part of the inhibitor potential which protects pancreatic tissue from autodegradation. these inhibitors normally prevent the release of active pancreatic proteases into the vasculature or microcirculation, where destabilization of the coagulation balance and subsequent thrombus formation could occur.
institute for clinical chemistry and laboratory diagnostics and *clinic for cardiology, university of duesseldorf
p-selectin (cd 62p, formerly granule membrane protein 140 or gmp 140) is an integral membrane protein of platelets and endothelial cells. under non-activated conditions it is stored in the alpha granules of platelets and in the weibel-palade bodies of endothelial cells. endothelial cells covering atherosclerotic plaques show an increased expression of p-selectin. β-thromboglobulin (β-tg), which is also released from the alpha granules of platelets during adhesion or aggregation, is regarded as a marker of platelet activation in vivo. coronary thrombosis plays a central role in the pathogenesis of acute coronary syndromes.
we therefore analysed cd 62p and β-tg in healthy subjects (hs, n=11) and in patients with stable angina pectoris (sap, n=20), unstable angina pectoris (uap, n=12) and acute myocardial infarction (ami, n=12). plasma samples were obtained using ctad vacutainer tubes (0.109 m na3-citrate, theophylline, adenosine, dipyridamole). patients with cad showed significantly increased plasma concentrations of cd 62p (hs: 98±20 versus sap: 133±38 ng/ml, p<0.05; versus uap: 128±28 ng/ml, p<0.01; versus ami: 144±72 ng/ml, p<0.05), independent of the severity of clinical symptoms. in comparison, only patients with ami showed significantly higher β-tg concentrations than hs (hs: 30±20 versus ami: 39±14 ng/ml, p<0.05). although the cd 62p plasma concentrations showed no relationship to clinical severity, there was a positive correlation between cd 62p and the severity of cad classified as 1-, 2- or 3-vessel disease (r=0.47; p<0.001; n=55). it is concluded that elevated cd 62p concentrations are correlated with the severity of cardiovascular disease. cd 62p is not suitable for the differential diagnosis of acute coronary syndromes, because it is elevated independently of the clinical status of the patients. the involvement of platelets in the pathogenesis of acute myocardial infarction may be indicated by the increased β-tg concentrations.
1klinik für herz-, thorax- und herznahe gefäßchirurgie und 2institut für klinische chemie und laboratoriumsmedizin der universität regensburg
increased blood loss following surgery with extracorporeal circulation (ecc) contributes to morbidity and mortality. postoperative haemorrhage following ecc has been related to a platelet function defect and to activation of the blood clotting and fibrinolytic systems.
we investigated platelet surface antigen expression and parameters indicating activation of the clotting and fibrinolytic cascades to assess the predictive potential of these variables for increased blood loss after ecc. g0 patients referred for coronary bypass grafting with no history of a bleeding disorder and normal routine clotting tests were included. blood samples were drawn on the day prior to surgery and immediately upon arrival on the intensive care unit. the surface expression of glycoprotein (gp) iib-iiia, gp ib, and p-selectin was measured with and without in vitro stimulation with adenosine diphosphate (adp) using whole blood flow cytometry. platelet counts, platelet factor 4 (pf4), and routine clotting tests were also determined. activation of the clotting and fibrinolytic systems was judged from thrombin-antithrombin-iii complex (tat), fibrinogen (fg), d-dimers (dd), α2-antiplasmin (α2a), prothrombin fragment 1+2 (f1+2), and tissue plasminogen activator (t-pa). blood loss from chest tubes was measured hourly until removal of drains. following ecc the levels of pf4, tat, dd, α2a, f1+2, and dd were significantly increased (p<0.0001) compared to baseline values. gp iib-iiia, gp ib, p-selectin, platelet count, and fg were significantly reduced (p<0.0001). analysis of variance (anova) revealed that postoperative values of gp ib (p<0.0001), dd (p